Troubleshooting steps for missing drive?



Whatever you do DON'T format the drive!

 

The stuff on the console is telling you why it is unmountable.

 

Do you have another (new or old) drive that you could use to rebuild disk 4 onto? That would allow you to set the original disk 4 aside and not alter it in any way and use another disk to try to recover what you can from a rebuild, and then if anything is still missing you could try to recover that from the original disk 4.

 

Thanks for the warning trurl.  I did not format the drive.  Let me make sure I understand your advice though.  I have an RMA underway and a replacement 6TB drive due from WD later this week.  If I install that drive (perhaps preclearing and formatting it as XFS in my other unRAID server first) and then do a new config assigning the new drive as Disk 4, then I can attempt rebuilding the old Disk 4 on to the new one?  If that's my best bet then I don't mind at all waiting for the replacement drive.  I just need to make sure that's the route to go.

Link to comment

... If I install that drive (perhaps preclearing and formatting it as XFS in my other unRAID server first) and then do a new config assigning the new drive as Disk 4, then I can attempt rebuilding the old Disk 4 on to the new one?

 

NO !!  For one thing, even if you can "fool" UnRAID into rebuilding, then there's no reason to format the drive => the rebuild will include all formatting information, so it will be identical to the original drive.

 

To have any chance of success, you need to either install the actual old (failed) disk #4 ... or at least a drive that's the same size, but a different serial # than your new (replacement) drive.    But as I've noted several times, this will only work if NOTHING HAS CHANGED ... and it certainly seems like you've done some reformatting of drives that you've then put into the array.    The implication of this is that EVERY BIT THAT HAS CHANGED on ANY drive will result in an invalid parity -- so any data reconstructed based on those bits will be WRONG.

 

To be explicitly clear about what you actually have, provide the following details:

 

For every disk in the array ...

 

Disk #  --  Function  -- EXACTLY what has happened to it since you started this process

 

For example:

 

Disk #1  -- Parity -- Hasn't been touched ... not installed in any other system;  NO writes to the system it's installed in;  etc.

 

Note if you've installed the disk in your other system;  if so what was done with that system while the disk was installed;  if you did ANY writes to the disk (e.g. reformatting it);  etc.

 

Link to comment


Disk 1 - Parity -- physically removed from Server2 to install in Server1 but not formatted or written to, then reinstalled exactly as it was to Server2

 

Disk 2 - 6TB data -- same as Disk1

 

Disk 3 - 6TB data -- same as Disk1

 

Disk 4 - 6TB data -- failed during the attempted move to Server1, then went from "missing" to "not installed" in Server2 when I attempted to start the array without it physically installed and with Disk 7 also "unmountable".  Now reinstalled to the Server2 array but still showing as "Not installed / Unmountable".

 

Disk 5 - 6TB data -- same as Disk 1

 

Disk 6 - 4TB data -- empty disk, removed from Server2 and installed to be the parity drive (w/no volume) in new Windows FlexRAID array, then moved to Server1 to be formatted back to XFS for unRAID prior to re-installation in Server2.  (Trying to restore previous Server2 config after failure of Disk4 killed the move to the 2 smaller servers.)

 

Disk 7 - 4TB data -- empty disk, removed from Server2 and installed and formatted as a data drive (NTFS) in new Windows FlexRAID array, then moved to Server1 to be formatted back to XFS prior to re-installation to Server2 (except apparently the re-format to XFS didn't finish, because when I started the Server2 array it came up as "unmountable").  Removed and reinstalled back to Server1 to try again to format as XFS, which apparently worked, since back in Server2 it now mounts when I start the array.

 

Disk 8 - 4TB data -- same as Disk 7 except no problem w/reformat back to XFS and mounting the first time Server2 array was started after reinstall.

 

Disk 9 - 4TB data -- same as Disk 8

 

As for write activity to Disks 6-9 in Windows, I did start the new FlexRAID array and created a single folder called "TV", which I intended to fill w/all my TV shows on the 6TB data drives once I successfully completed their move from Server2 to the smaller Server1.  After Disk 4 failed during the move, I deleted that folder prior to deleting the FlexRAID (tRAID) config and removing those drives to be reinstalled in unRAID.  That single "TV" folder was never written to and this was the entire extent of the write activity to those drives outside of unRAID.

 

Now that that's clearer (I hope), what do you recommend that I do next?

Link to comment

So if I understand this correctly, disks 6, 7, 8, and 9 were all MODIFIED after being removed from server #2 and then replaced there.    i.e. they were installed in FlexRAID and then put in a different server and formatted with XFS !!!

 

Clearly that completely invalidates your parity drive ==> so at this stage I'd say there's NO CHANCE of restoring the data for disk #4.

 

You'll either have to simply restore everything from your backups or, in the absence of backups, attempt data recovery from the failed disk #4.    You could try attaching it to a Windows PC and seeing if you can read any of the data using a free IFS like Linux Reader [ http://www.diskinternals.com/linux-reader/ ];  try reading it directly in another Linux box [perhaps just boot the trial version of UnRAID and assign ONLY that disk as a single data disk];  or, if the data's sufficiently important, send it off for professional data recovery.

 

 

Link to comment

Earlier you said you are supposed to be able to recover from a single drive failure. This is true as long as ALL other drives are OK and still part of the array.

 

There is no magic that allows the parity drive by itself to reconstruct a disk, it doesn't have the capacity. The parity drive plus all the other disks allow the data of the failed disk to be calculated. It is really very simple arithmetic, but all of the drives with all of their bits exactly right are required for the calculation to succeed.

 

If you are interested see here for an explanation of how this works. Understanding this might have prevented you from making these mistakes.
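For anyone reading along, the "very simple arithmetic" is a bitwise XOR across all the disks. Here is a minimal sketch in Python, assuming single XOR parity and using toy 4-byte "disks"; the function names `xor_parity` and `rebuild` are made up for illustration and are not part of unRAID. It shows a rebuild succeeding when nothing else has changed, and going wrong when another disk was modified outside the array.

```python
from functools import reduce

def xor_parity(disks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))

def rebuild(missing_index, disks, parity):
    """Recompute one disk from the parity plus every OTHER disk."""
    others = [d for i, d in enumerate(disks) if i != missing_index]
    return xor_parity(others + [parity])

# Toy 4-byte "disks" (hypothetical contents); real disks differ only in size.
disks = [b"\x10\x20\x30\x40", b"\x0a\x0b\x0c\x0d", b"\xff\x00\xff\x00"]
parity = xor_parity(disks)

# Case 1: nothing else changed, so the disk at index 1 is recovered exactly.
assert rebuild(1, disks, parity) == disks[1]

# Case 2: the disk at index 2 had its first byte rewritten in another
# machine, so the parity no longer matches the array contents.
modified = [disks[0], disks[1], b"\x00\x00\xff\x00"]
recovered = rebuild(1, modified, parity)
print(recovered == disks[1])  # False: the changed byte corrupts the rebuild
```

Only the byte position that changed on the other disk comes back wrong, but after a reformat a very large number of positions have changed, which is why the rebuilt disk cannot be trusted.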

Link to comment


Thanks trurl.  Clearly you're right that I didn't understand well enough how parity works.  Next time I'll study up on the correct way to empty data drives and reconfigure parity beforehand if I'm going to remove drives from an array.  Thanks for the link and for your help on this.

Link to comment


Thanks Gary.  I've tried attaching Disk 4 to my Windows 8 PC and to my other unRAID server and neither one even recognizes there is a drive installed.  So at this point I'll just assume the data is unrecoverable (at least by me, and professional data recovery probably doesn't make sense, since it sounds like the cost wouldn't be any cheaper than just reacquiring the content I no longer have backed up somewhere).

 

Last questions:  since I understand now that my parity data has been completely invalidated by all this moving around and reformatting of empty data drives, is there anything special I need to do to get parity valid again?  The current config of Server2 is the one I want to keep, just with the RMA replacement drive substituting the failed Disk 4.  Since I won't have the replacement for another 2-3 days and then I'll still want to run it through my usual 3 preclear cycles, should I just go ahead with a new config without it now and then a normal parity check to rebuild parity?  If there's anything else that should be done please let me know.

Link to comment

When you do New Config and Start, it will rebuild parity from scratch. Then run a parity check to make sure the parity rebuild was successful.

 

Parity will remain valid when you add a precleared drive and format it. Here is why this works:

 

A clear (all zeros) drive has no effect on parity values. Formatting actually writes to the new disk when it creates the empty file system, but those writes will also update parity.
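The two points above can be checked with a few lines of Python. This is only a sketch of the idea, assuming single XOR parity; the helper names and the 2-byte "disks" are hypothetical, and it is not how unRAID is actually implemented. XOR with zero changes nothing, and a write that goes through the array can update parity using the difference between the old and new data.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Two existing toy data "disks" (hypothetical contents) and their parity.
data = [b"\x10\x20", b"\x0a\x0b"]
parity = xor_bytes(data[0], data[1])

# Adding a cleared (all-zero) disk: XOR with zero leaves parity unchanged.
new_disk = bytes(2)
assert xor_bytes(parity, new_disk) == parity

def write_through_parity(old_block, new_block, parity):
    """Update parity for a write made through the array:
    new parity = old parity XOR old data XOR new data."""
    return xor_bytes(parity, xor_bytes(old_block, new_block))

# Formatting writes filesystem metadata to the new disk (stand-in bytes here),
# and because the write goes through the array, parity is updated with it.
formatted = b"\x58\x46"
parity = write_through_parity(new_disk, formatted, parity)

# The updated parity matches a full recomputation over all three disks.
full = xor_bytes(xor_bytes(data[0], data[1]), formatted)
assert parity == full
```

The same reasoning explains the earlier failure: formatting a disk in a different machine makes the same kind of writes, but nothing updates the parity disk to match.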

 

Link to comment

Agree with trurl => since it's going to be a few days before you get the new drive, I'd go ahead and do a New Config and let the parity sync run.    After the parity sync finishes, run a parity check to confirm all is well ... and then you're ready to go.

 

When you pre-clear the new drive after it arrives, you'll be able to add it to the array very quickly ... it will just require a format, which is very quick.

 

Link to comment
