Help! Second drive failed during rebuild



I have a dilemma I need some help resolving. I have an array with 16 drives. About 3 times in 4 months, drive 15 has dropped out and needed a rebuild. Generally this is not a problem, but the drive tests good, so I have been waiting for further symptoms to diagnose. It may or may not be related, but I was about 3% into the rebuild when I lost drive 11. Now my challenge is how to get the array online without compromising data. Obviously a second parity drive would resolve the problem, but unfortunately I only have a single parity drive at the moment. Screenshot and diags attached if anyone has any thoughts.
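For anyone reading along who wonders why a single parity drive cannot cover two missing drives, here is a minimal sketch. It assumes the parity drive holds a plain bytewise XOR of the data drives; the tiny byte strings and the names `xor_parity` and `rebuild_one` are made up for illustration and are not taken from Unraid itself.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_one(survivors, parity):
    """XOR the surviving blocks with parity.

    With exactly one block missing, the result is that missing block.
    """
    return xor_parity(list(survivors) + [parity])


# Three hypothetical 4-byte "drives" standing in for real disks.
data = [b"\x10\x20\x30\x40", b"\x0f\x0e\x0d\x0c", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(data)

# One drive missing: the survivors plus parity give it back exactly.
assert rebuild_one([data[0], data[2]], parity) == data[1]

# Two drives missing: the same calculation only yields the XOR of the
# two lost drives, which cannot be split back into the originals.
mixed = rebuild_one([data[2]], parity)
assert mixed == xor_parity([data[0], data[1]])
assert mixed != data[0] and mixed != data[1]
```

With one unknown there is one equation to solve; with two unknowns and a single parity block there is still only one equation, which is why a second, independently computed parity drive is needed to survive a double failure.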

Screen Shot 2018-01-04 at 6.45.51 PM.png

nasvm-diagnostics-20180104-1848.zip


There are no pending sectors, but SMART shows a triple-digit raw read error rate, which is not good on a WD disk; anything above single digits is bad news.

 

You can try re-enabling that disk and rebuilding disk15 again to see how it holds up, but only if the array is 100% unchanged (this includes docker and/or VMs using the array) since before the second disk got disabled.
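To illustrate why the "100% unchanged" condition matters, here is a rough sketch. It assumes XOR parity and assumes that parity kept being updated after a disk dropped out while the physical disk did not; the two-disk layout, the byte values, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(source_disks, parity):
    """Reconstruct the one missing disk from the other disks plus parity."""
    return xor_blocks(list(source_disks) + [parity])


disk_a = b"\x01\x02\x03\x04"  # stands in for the disk being rebuilt
disk_b = b"\x10\x20\x30\x40"  # stands in for the disk that dropped and was re-enabled
parity = xor_blocks([disk_a, disk_b])

# Nothing changed since the drop: the re-enabled disk still matches
# parity, so the rebuild reproduces disk_a exactly.
assert rebuild([disk_b], parity) == disk_a

# Something changed: parity now reflects new contents for disk_b,
# but the physical disk_b still holds the old bytes.
new_b = b"\x11\x20\x30\x40"
parity_after_write = xor_blocks([disk_a, new_b])
corrupted = rebuild([disk_b], parity_after_write)
assert corrupted != disk_a  # the rebuilt disk silently contains wrong data
```

The rebuild itself reports no error in the second case; the mismatch only shows up later as file system damage on the rebuilt disk.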


OK, I was able to re-enable the disk in the JBOD controller, and I disabled SMART monitoring for now. I disabled docker and VM support in UNRAID. All the drives are online again and I'm rebuilding drive 15. It will take a while, but by tomorrow my two new 8TB drives should arrive: one to replace drive 11 and one as a second parity drive to reduce the chances of this happening again. I love UNRAID, but when things go wrong it's nerve-racking. I should have looked a little deeper instead of panicking; I could have done this yesterday.

 

2% so far..... 17 hours to go. Thanks for the guidance. 


OK... well, I rebuilt drive 15. By then my 2 new 8TB drives had arrived, so I replaced the bad drive 11 and added a second parity drive, then rebuilt those. Now the array is all in sync, but I fear I have corruption. I cannot add, change, or remove shares. My appdata share gives an I/O error when I try to add or change anything. Diagnostics attached. I'm not sure what my viable options are at this point. I'm guessing drive 11 and possibly drive 15 are corrupted.

nasvm-diagnostics-20180107-1959.zip


Yes.... it seems I have a bunch of files in Lost and Found on drive 15, but hopefully nothing too important that I cannot recover. Thanks for the help. Hopefully I have enough redundancy now for a bit. I have to watch these older drives more carefully and swap them before they puke. I have 94TB online right now. It's painful when things go wrong.

