Drive dropped out during parity check, now "Not installed"



I'm running 6.3.5.

 

Tonight a drive dropped out during a parity check. Unraid stopped the parity check and put a red X by the drive, listing something like 1000 errors. I thought this would be no big deal: nature's way of telling me which drive needed an upgrade, and I've replaced drives before. Things went south, however, because the GUI wasn't responding. I suspect the system was tied up by the failing drive, but I'm not really sure. (My experience with the WD click-of-death may not be applicable to Unraid.)

 

Long story short:

 

Did an unclean power-down.  :(

Replaced the drive

Restarted

Drive didn't show "missing"

Drive showed "Not installed"

 

This leads me to believe it's NOT going to rebuild the drive if I start the array.  

If I click the HISTORY button, it tells me the last parity check was cancelled with zero errors.  And parity is listed as valid on the main screen.

 

I wanna be very careful how I proceed here. My backup for the data on that drive is a month old, but is likely substantially correct. (I'd recently installed a new/larger drive in the array, so it'd been taking most of the write activity.) So, I'm not panicking, but I don't want my ignorance to make things worse.

 

I'd much rather have the system rebuild the drive's contents if at all possible.

 

What's my best/safest course of action? 

 

 


Something's still squirrelly. The HTTP interface is unresponsive when I issue the command to take the array down, power down, or even click the HISTORY button. It just sits there. The "Uptime" counter continues to advance, despite the fact that it's ignoring the commands. I'd attributed this to the failing drive yesterday, but that drive's disconnected. The entire time I've been typing this, I've had a blank 'Parity/Read-Check History' window atop my system's normal main screen, which I now cannot access. On a whim, I opened another browser and went to my server's IP (just in case Chrome's the real problem), and the IP address never answered.

 

I was able to telnet into the server and issue a reboot command, but the reboot never happens.

 

Going to my server's IPMI screen, I see it's stopped at the same point where I've seen it stop before:

-the last 3 lines onscreen for reference-

 

Sending all processes the SIGKILL signal.

Saving random seed from /dev/urandom in /boot/config/random-seed.

Turning off swap.

 

And that's where it will apparently sit until I hit the server's RESET button.  (At least that's where it's been for the last 10 minutes.)

 

UPDATE: 

 

I tried twice to boot into my normal config and begin a rebuild, but I never could take the array offline; it hung every time.

Restarted into safe mode and started the rebuild.  So I'm on a road with no turns for the next 2 days while it writes. 

I suspect I'll have other issues to address once the rebuild finishes.  One fire at a time...

Thanks for the response, johnnieblack

 

Edited by johnny121b