Failed Array - All 3 - 8 TB drives as well as Parity Drive


I apologize if this has been discussed before. I did try a search, but couldn't find an exactly similar situation.  I had four (4) 8 TB WD drives that all experienced what I believe to be a power spike / failure.  No disks would spin up when power was applied.  I've since sent them to a repair shop to be looked at, and only two (2) of the drives are able to be recovered.  My setup included three (3) drives for data with one (1) for parity.  With two (2) drives failing, I believe I have a total loss of data at this point.  I'm trying to come to grips with this, as it truly is what it is, 2020 has struck... This configuration also included an SSD cache drive, which still shows as available / good in unRAID and in BIOS.  I have purchased one (1) new 2TB drive for now, simply to get things back operational (I was running HomeAssistant, MQTT, Plex, Unifi, etc. as Docker containers and would prefer them to be working sooner rather than later).

 

Also, I'm hopeful, but honestly cannot remember if I had it set up to back up my appdata folder to the cache drive, the array, or both.  The more I think about it, I think it was only the array, which sucks for obvious reasons...  I haven't restarted the array to check, because it is showing all of the drive failures and I didn't want to go down the wrong path.

 

Is there any reason to wait until the two (2) drives are recovered to do anything with the current unRAID array setup?

 

I've come to grips with the loss of data and have a better plan moving forward, but would really love some coaching as to the 'easiest' way to get things back to 'normal'.  Ideally, I'm able to start a new array and 'apply' my appdata from the existing cache drive to the new 2TB drive?

 

Thank you all in advance!

 

image.png.40a0c876b7726c9433febe05d6df7a3a.png

Link to comment

Is the repair shop aware that you need bit-perfect copies of the drives and not just the files? If not, they may be writing off your parity drive as failed since it has no files. If you can get three out of the four drives back bit-perfect, you are golden, as you can emulate the remaining drive from the other three.
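
To illustrate why three bit-perfect drives out of four are enough, here is a minimal sketch of single-parity reconstruction using XOR. It is a toy model, not Unraid's actual implementation: the "drives" are short byte strings, and the helper name `xor_blocks` is made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy "drives": three data drives of identical size (hypothetical contents).
disk1 = b"photos__"
disk2 = b"movies__"
disk3 = b"backups_"

# Single parity is the XOR of every data drive at the same position.
parity = xor_blocks([disk1, disk2, disk3])

# If disk2 is missing, XOR-ing the three surviving drives gives it back.
rebuilt_disk2 = xor_blocks([disk1, disk3, parity])

# If a surviving drive is NOT bit-perfect (one byte differs), the rebuild
# is silently wrong at that position.
damaged_disk1 = b"photos_X"
bad_rebuild = xor_blocks([damaged_disk1, disk3, parity])

print(rebuilt_disk2 == disk2)  # rebuild from bit-perfect copies
print(bad_rebuild == disk2)    # rebuild from a damaged copy
```

The same arithmetic shows why a file-level recovery is not enough: parity is computed over raw drive contents, so a drive that comes back with only its files copied elsewhere cannot take part in a rebuild.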

 

The good news is each data drive is independent, so if they recover 2 data drives, you will have those files.

 

Do you have any diagnostic files from before the failure? They would contain helpful information.

Link to comment

I really appreciate the prompt response!  I sent the HD recovery place an e-mail this morning and their response was below...

 

Any chance you can point me to the diagnostic file location?  I can turn on unRAID, but I fear that the info would be on the array, which is toast?

 

I've asked the recovery company to replace the PCBs on the other units, just in case one of the drives they consider broken is actually the parity drive, fingers crossed.

 

Thanks again!

 

image.png.a646145384ddca8fc7f4bc68b3d8594c.png

Link to comment
20 minutes ago, Oakley707 said:

Any chance you can point me to the diagnostic file location?

Unfortunately, anything you generate now probably wouldn't be very helpful. I was hoping that maybe you had downloaded the diagnostics zip file prior to everything falling apart. If you boot the server and don't attempt to start the array, you should still be able to go to Tools, Diagnostics, and download the diagnostics zip file as things currently stand.

 

Do you at least have a full copy of the screenshot you posted in the first post? You truncated the serial numbers, which are the important part for identifying which drives were assigned to which slots.

Link to comment

So, the error happened overnight with no warning signs. Everything worked, woke up to the unraid box off. Still doesn't make much sense, but the gpu doesn't work now either? So, I'm thinking some sort of power blip from the PSU? The SSD and the nvme drive were unharmed. I can turn it on and grab the diagnostic zip in a bit, not currently home. 

 

I do have a full copy of the screenshot, I truncated for some reason as I thought giving the serials would impact something, 🤷‍♂️🤷‍♂️🤷‍♂️

 

The HD recovery company has agreed to replace the PCB in all drives, so 🤞🤞🤞that one is the parity and why they think it is "empty"... They didn't respond to my comment around bit perfect at all. 

 

Thanks again. 

Link to comment
2 minutes ago, Oakley707 said:

So, the error happened overnight with no warning signs. Everything worked, woke up to the unraid box off. Still doesn't make much sense, but the gpu doesn't work now either? So, I'm thinking some sort of power blip from the PSU?

That doesn't give me the warm'n'fuzzies about your PSU.

 

If it were me I'd replace the PSU and all power cabling before powering up the system. Hate for whatever happened to repeat itself with finality.

Link to comment

I agree and it was replaced as part of the troubleshooting. Good point though. 

Link to comment
17 hours ago, Oakley707 said:

simply to get things back operational (I was utilizing, HomeAssistant, MQTT, Plex, Unifi, etc dockers and would prefer them to be working sooner than later).

A simple answer: turn on Unraid and use the built-in USB backup feature to generate a USB image; then you can set up a new array with the 2TB disk.

 

Once all the disks come back, you can back up the USB again, restore the old USB image, and decide on the next step ...

Link to comment
On 12/24/2020 at 11:34 AM, jonathanm said:

Did you keep any of the power cabling in place or was it all swapped?

Alright jonathanm, I'm looking to you for coaching.  The drives came back and, unfortunately, they were right: only 2 drives came back usable...

 

image.png.2b1e0df41759508da6a3f5c0b0f696aa.png

 

The failed ones are the parity drive and Disk 2.  It appears unRAID will NOT let me start the array because of 'too many missing disks'. What is the recommended strategy from here?  Any use in trying to access the drives somehow to recover files / things?  Or just blow it all up?  Again, I'm hopeful that I have good info on the cache drive, but don't know that either yet because I haven't started the array.  Much appreciated in advance!

Link to comment

Well, the good news is one of the bad drives is parity, so no direct data loss there. However, anything on disk2 is likely gone forever. Disk 1 and 3 may have data corruption, but you won't know that until you start the array.

 

Go to Tools, New Config, and keep all assignments. Go back to Main and remove the faulty drive assignments, parity and disk2. You should then be able to start the array. Hopefully disk1 and disk3 will mount properly. UNDER NO CIRCUMSTANCES should you format anything if asked.
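
Once the array starts and the disks mount, one rough way to look for damage is to read every file end to end and note which ones fail. The sketch below assumes the surviving disks are reachable at paths such as `/mnt/disk1` and `/mnt/disk3` (adjust to wherever they are actually mounted or shared) and that Python is available on the machine doing the reading. The function name `find_unreadable_files` is made up for this example. It only opens files for reading and never writes to the disk.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Read every file under root and return (path, error) for each failure."""
    unreadable = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read-only, in chunks, so large media files do not fill RAM.
                with open(path, "rb") as handle:
                    while handle.read(chunk_size):
                        pass
            except OSError as error:
                unreadable.append((path, str(error)))
    return unreadable
```

A call such as `find_unreadable_files("/mnt/disk1")` returns an empty list when every file could be read. This only catches files the system cannot read at all; it cannot tell whether a readable file has silently changed contents, which would need checksums taken before the failure.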

Link to comment

Well, you are TOTALLY right.  I may kiss someone, I had an additional 8 TB drive and in reality don't think I had used it much at all.  Started the new array, no errors, mount issues, or asking about formatting... I've already opened up some personal photos and SHOCKINGLY, they all seem to be there.  Apparently I was lucky this time around. Thanks again so much!  Maybe 2020 wasn't such a bust.  Happy New Year!

Link to comment

This characteristic of Unraid is one of the extreme differentiators from other RAID systems. With traditional RAID, when you exceed the redundancy level, you lose everything. With Unraid, each parity-protected array drive is a separate filesystem, so normally you only lose the content actually on the failed drives; all the intact drives remain readable.

 

Just be very glad one of the failures was the parity drive; otherwise you would have lost the contents of 2 data drives instead of one. This was a very hands-on lesson in why parity is arguably the LEAST important drive in the array, as it holds no data by itself. Only if all the remaining data drives are healthy can the parity drive actually rebuild a missing drive.
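
The counting behind that lesson can be written down as a small model. This is a simplified sketch for a single-parity array like the one in this thread, not a description of Unraid internals; the function name `lost_data_disks` and the disk labels are made up for illustration.

```python
def lost_data_disks(failed, data_disks, parity_disks=("parity",)):
    """Return the data disks whose contents cannot be recovered.

    Simplified model: failed data disks can be rebuilt only while they are
    no more numerous than the parity disks that survived. Otherwise the
    failed data disks are lost, but every intact data disk stays readable
    because each one carries its own filesystem.
    """
    failed = set(failed)
    failed_data = failed & set(data_disks)
    surviving_parity = set(parity_disks) - failed
    if len(failed_data) <= len(surviving_parity):
        return set()  # parity can rebuild what failed
    return failed_data


data = ["disk1", "disk2", "disk3"]

# This thread: parity and disk2 both died, so only disk2's contents are gone.
print(lost_data_disks({"parity", "disk2"}, data))

# Had two data drives died instead, both would have been lost.
print(lost_data_disks({"disk1", "disk2"}, data))

# A single failed data drive with healthy parity loses nothing.
print(lost_data_disks({"disk2"}, data))
```

In a striped RAID level with the same redundancy, any of the double-failure cases above would take the whole array with it, which is the contrast drawn in the post.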

Link to comment
