While rebuilding parity on a new parity drive, a data drive failed. What now?


Recommended Posts

Hello. I replaced the parity drive with a new one and started a parity rebuild. When it reached 2.3%, disk2 started to show thousands of errors on the Main page under "Device status". I believe the hard drive assigned to disk2 had become unavailable.

After a reboot, the "Device status" box now shows disk2 as "Not installed". On the same line, it shows the correct size of the underlying hard drive (4TB) but then it says "Unformatted".

If I stop the array, I have the option to re-assign the hard drive to disk2, but when I do so I get a blue ball, which I believe means that Unraid would treat the hard drive as a new one.

I've rebooted a few times and double-checked the cables, but no luck.

 

So currently I have one data drive missing and a parity drive that is only 2.3% built.

 

I still have the old parity hard drive and have made no changes to the data drives. How can I use the old parity hard drive to rebuild disk2? (I would obviously use a new hard drive for disk2).

I'm using Unraid version 5.0.6, and I'm afraid I did not make a copy of the flash drive before replacing the parity drive.

 

Thank you!

Link to comment
10 hours ago, riccume said:

I'm using Unraid version 5.0.6

This is a problem. You should be able to use the invalid slot command, but I don't remember if it was working correctly with that release, nor the steps needed for that particular release, since it's been many years. Maybe you can find something in the forum; if you don't, post back and I'll see if I can get 5.0.6 to boot on my test server.

Link to comment
36 minutes ago, JorgeB said:

This is a problem. You should be able to use the invalid slot command, but I don't remember if it was working correctly with that release, nor the steps needed for that particular release, since it's been many years. Maybe you can find something in the forum; if you don't, post back and I'll see if I can get 5.0.6 to boot on my test server.

Thanks JorgeB. I've done some additional searching and I was wondering whether I can use the "Parity Swap" ("Swap Disable") procedure (https://wiki.unraid.net/The_parity_swap_procedure).

Are you saying that the commands needed for the Parity Swap might not work properly on 5.0.6? It seems to have been around for a long time, although some people seem to have struggled with it... The steps would be:

- disable both the parity drive and disk2

- install the old parity hard drive as disk2

- copy the old parity drive to the new parity drive using the built-in Copy function

- install a new hard drive on disk2 and rebuild parity

Link to comment
11 minutes ago, JorgeB said:

Parity swap isn't for this case, search for "invalidslot" and "5.0.6"

Double-checking: why isn't parity swap a possible solution here? The old parity drive is still available, unchanged (it was kept offline) and working, and I made no changes to the data in the array since taking it out. All I did was mount the array with the new parity drive, start the parity rebuild, then stop it when the errors from disk2 started to pop up, so the other data drives should also be unchanged.

Given this premise, can't I simply 'rewind' to the setup before the failed parity rebuild and use the standard parity swap procedure to replace disk2?

 

In the meantime, as you suggest, I'm trying to figure out how invalidslot works; I'm not familiar with it.

 

Link to comment

Luckily v5.0.6 still boots on my test server; the procedure is this:

 

-Stop the array and take note of all the current assignments

-Utils -> New Config -> Yes I want to do this -> Apply

-Back on the Main page, assign all the disks as they were, as well as the old parity and the new disk2; double-check all assignments are correct

-Important - after checking the assignments, leave the browser on that page, the "Main" page

-Open an SSH session/use the console and type (don't copy/paste directly from the forum, as sometimes it can insert extra characters):

mdcmd set invalidslot 2

-Back in the GUI, and without refreshing the page, just start the array. Do not check the "parity is already valid" box. Disk2 will start rebuilding; the disk should mount immediately, but if it's unmountable don't format it. Wait for the rebuild to finish and then run a filesystem check.
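A minimal sketch of the console step, assuming the stock mdcmd wrapper on v5.x (using mdcmd status to dump the driver state is an assumption here):

# run from the console/SSH after New Config, with the array still stopped
mdcmd set invalidslot 2
# optional sanity check; output format varies by release
mdcmd status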

Link to comment
10 minutes ago, JorgeB said:

Luckily v5.0.6 still boots on my test server; the procedure is this:

 

-Stop the array and take note of all the current assignments

-Utils -> New Config -> Yes I want to do this -> Apply

-Back on the Main page, assign all the disks as they were, as well as the old parity and the new disk2; double-check all assignments are correct

-Important - after checking the assignments, leave the browser on that page, the "Main" page

-Open an SSH session/use the console and type (don't copy/paste directly from the forum, as sometimes it can insert extra characters):

mdcmd set invalidslot 2

-Back in the GUI, and without refreshing the page, just start the array. Do not check the "parity is already valid" box. Disk2 will start rebuilding; the disk should mount immediately, but if it's unmountable don't format it. Wait for the rebuild to finish and then run a filesystem check.

So, so grateful for your help here, JorgeB! Double-checking the obvious to make sure I don't end up deleting data for good:

- I physically replace the new hard drive for the parity disk with the old one that I took out before the failed parity rebuild

- I physically replace the failed hard drive for disk2 with a new one

- The new hard drive for disk2 cannot be bigger than the old parity hard drive. I'm asking because I already have two spare 12TB hard drives (I am in the process of upgrading my rig) but I don't think they work for the process you suggest given that the old parity hard drive is 4TB.

 

I'm always eager to learn more about Unraid, so I would love to hear why the parity swap procedure doesn't work here, but I completely understand if you don't have time. Thanks!!

Link to comment
11 minutes ago, riccume said:

- I physically replace the new hard drive for the parity disk with the old one that I took out before the failed parity rebuild

- I physically replace the failed hard drive for disk2 with a new one

Yes.

 

11 minutes ago, riccume said:

- The new hard drive for disk2 cannot be bigger than the old parity hard drive.

Correct, same size or larger than old disk2, but not larger than current parity.

Link to comment
8 minutes ago, JorgeB said:

You could possibly run the invalid slot command with an unassigned disk2, then run the parity swap, but invalid slot doesn't always work without a disk assigned. I can't test if it does on v5.0.6 now, since I'm about to go out for the day, but I can test tomorrow if you want.

I understand now, it makes sense, thanks. To play it safe, I will purchase a 4TB drive, try the process you suggest, and report back. Enjoy your day!

Link to comment
8 minutes ago, trurl said:

Please upgrade after your array is stable again. It is very difficult to support that very old version, and it's even possible that any support you get will be incorrect.

Thanks. You bet! This is the first stage of a full rebuild; the current rig was built in 2010 and I've been living on borrowed time for a while.

 

Off subject, so please feel free to disregard, but for full info, this is my plan for the rig rebuild:

- upgrade the parity drive to a new 14TB one and rebuild parity

- upgrade one of the old data drives (there are 6 in total) to a new 14TB one and rebuild it from parity. I cannot simply add the 14TB drive because I've run out of power connectors

- copy data from some of the old data drives to the new 14TB drive using the rsync -avX /mnt/disk[number]/ /mnt/disk[number] command (see the sketch after this list)

- remove all the old data drives copied in the step above, add a new 12TB drive, and copy data from the remaining old data drives using the same rsync command

- rebuild parity with the new configuration (one parity drive, two data drives)

- move the hard drives and flash drive to the new rig (details here)

- upgrade the flash drive to a new USB stick

- upgrade unRAID to the latest version
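A minimal sketch of one of those copy steps, with hypothetical disk numbers; the trailing slash on the source matters, since it copies the contents of disk3 into disk1 rather than creating a nested disk3 directory:

# -a preserves permissions/ownership/times, -v is verbose,
# -X preserves extended attributes; disk numbers are examples only
rsync -avX /mnt/disk3/ /mnt/disk1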

Wish me luck!

 

Link to comment

You should upgrade to the latest version before making any disk changes, because you need to be able to format new disks as XFS instead of ReiserFS. Rebuilding that first disk will result in ReiserFS and can't be avoided, but the other disks should use the new format, and then you can move the data off the first disk and reformat it.

Link to comment
3 hours ago, trurl said:

You should upgrade to the latest version before making any disk changes, because you need to be able to format new disks as XFS instead of ReiserFS. Rebuilding that first disk will result in ReiserFS and can't be avoided, but the other disks should use the new format, and then you can move the data off the first disk and reformat it.

Thanks @trurl. I believe, though, that my current CPU/motherboard is 32-bit (Gigabyte GA-D510UD) and unRAID 6 won't work on it. If so, should I stay with ReiserFS for the moment?
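A quick way to check, assuming console access; the lm flag in /proc/cpuinfo indicates a 64-bit capable CPU:

# prints the flags line only if the CPU supports 64-bit long mode
grep -w lm /proc/cpuinfo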

Link to comment

@JorgeB I'm not sure it is working; I've stopped the parity rebuild. I followed your instructions to a T, but when I restarted the array it showed disk2 with a green dot and the parity disk with an orange dot, and it started writing to the parity disk! I quickly stopped the parity rebuild to minimise 'damage' to the old parity drive (the rebuild was stopped after 6109 writes to the parity disk, so hopefully nothing major).

 

See below:

- screen grab of the Main page before I stopped the parity rebuild

- screen grab of the invalidslot command in Telnet

 

I'm struggling to imagine what could have gone wrong. I am sure I did not refresh the Main page before starting the array, but could it be that I waited too long before doing so? After sending the invalidslot command, I went through your instructions again before clicking the Start button; let's say 1 minute between the invalidslot command and starting the array...?

 

... or any other suggestions? Thank you!

[screenshots attached: Telnet invalidslot command; Main page during the parity rebuild]

Link to comment

Looks like it was done correctly, but apparently something wasn't. Make sure you're on 5.0.6 as mentioned; it appears you are, since the kernel version is the same.

 

11 hours ago, riccume said:

but could it be that I waited too long before doing so?

Just did a quick test, waited for about 3 minutes after the invalid slot command and it still worked correctly.

 

Array must be new after the new config (blue icon for all devices):

[screenshot: Main page after New Config, all devices showing blue icons]

 

Type the invalid slot command and start array:

[screenshot: console with the invalid slot command typed, then the array started]

 

Link to comment

No idea why it didn't work then. You can try the other way; more steps, but at least it won't damage parity any more:

 

-Stop the array and take note of all the current assignments

-Utils -> New Config -> Yes I want to do this -> Apply

-Back on the Main page, assign all the disks as they were, as well as the old parity and the new disk2; double-check all assignments are correct

-Check both "parity is already valid" and "maintenance mode" and start the array

-Stop the array

-Unassign disk2

-Start the array

-Stop the array

-Re-assign disk2

-Start the array to begin rebuilding. Disk2 will most likely be unmountable; you'll need to run a filesystem check, and probably rebuild the superblock and then use --rebuild-tree (see the sketch after this list). Still, reiserfs can usually recover pretty well from a situation like this.
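A rough sketch of that repair sequence, assuming disk2 maps to /dev/md2 and the array is started in maintenance mode (the device name is an assumption; follow whatever the check output actually recommends):

# read-only check first; it reports which repair option is needed
reiserfsck --check /dev/md2
# only if the check reports a damaged superblock
reiserfsck --rebuild-sb /dev/md2
# last resort: rebuilds the entire filesystem tree
reiserfsck --rebuild-tree /dev/md2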

 

 

 

Link to comment
