Maktai Posted November 5, 2012

Background: I followed the upgrade instructions in the release notes. After upgrading to 5.0-rc8a I ran preclear on a new 3 TB drive. The drive tested good, so I swapped it in for my parity drive. The drive successfully completed the parity sync. I then ran a parity check without correction and ended up with a lot of errors on my disk 4 (ST3750640NS_3QD14BYM) (sdg). I checked the syslog but am really not sure what I am looking at.

Questions:
1. Did the drive die as a result of the upgrade to 5.0-rc8a, or was it just a coincidence?
2. Is my data still intact? I am too scared to start the array for fear of causing irreparable harm. Please tell me I got lucky and the parity sync completed before the drive died, so all my data is safe. I am having a panic attack here.

The syslog.txt file is attached. Thank you for your help.

syslog_2.txt
dgaschk Posted November 5, 2012

Post a SMART report for disk 4.
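(For anyone reading along: the reason that report is the first ask is that a handful of SMART attributes are the usual failure tells. The helper below is a hypothetical sketch of that triage, not anything from unRAID itself; the attribute names are standard SMART names, and the "any non-zero raw value deserves a look" thresholds are a common rule of thumb, not a hard rule.)

```python
# Hypothetical triage of parsed SMART raw values. Attribute names are
# standard SMART names; treating any non-zero raw value as a warning is
# a rule of thumb for illustration, not an unRAID-specific check.
WATCH_ATTRS = (
    "Reallocated_Sector_Ct",   # sectors the drive has already remapped
    "Current_Pending_Sector",  # sectors waiting to be remapped
    "Offline_Uncorrectable",   # sectors that failed offline reads
    "UDMA_CRC_Error_Count",    # link/CRC errors, usually cabling
)

def triage(raw_values: dict) -> list:
    """Return a warning string for each watched attribute with a non-zero raw value."""
    return [f"{name}={raw_values[name]}"
            for name in WATCH_ATTRS
            if raw_values.get(name, 0) > 0]

print(triage({"Reallocated_Sector_Ct": 0, "UDMA_CRC_Error_Count": 130}))
# -> ['UDMA_CRC_Error_Count=130']
```

A clean triage (empty list) on the mechanical attributes with only the CRC count raised is the pattern that points at cabling rather than the platters, which is where this thread ends up.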
Bizarroterl Posted November 5, 2012

I had the same thing happen. I upgraded, installed a 4 TB drive for parity, ran the parity rebuild, and then a data drive failed. I swapped in another drive, precleared it, and then did a rebuild.
Maktai Posted November 6, 2012 (Author)

I got the SMART report. I apologize for how long this took. I had already started preclear on the old parity drive and was scared to try getting a SMART report at the same time.

smart.txt
jbartlett Posted November 6, 2012

199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 130

I'd watch this value. If it's climbing, it could be an issue with its SATA cable going (or already gone) bad. On a Windows box, when I see this one climbing, it's typically after the computer becomes 100% unresponsive and the drive activity LED stays full ON for a couple of minutes.
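(The raw value jbartlett quotes, 130, is the last column of the attribute table that `smartctl -a` prints. A minimal sketch of scraping it so you can compare runs over time; assuming the classic whitespace-separated ATA attribute table layout, with the raw value in the final column:)

```python
# Sketch: pull the raw value of SMART attribute 199 (UDMA CRC errors)
# out of smartctl's attribute table. Assumes the whitespace-separated
# layout shown in the quoted line; the sample is copied from this thread.
SAMPLE = "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 130"

def crc_error_count(smart_output: str) -> int:
    """Return the raw value of attribute 199, the last column of its row."""
    for line in smart_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "199":
            return int(fields[-1])  # raw value is the final column
    raise ValueError("attribute 199 not found in SMART output")

print(crc_error_count(SAMPLE))  # -> 130
```

Record the number, run the array for a while, and read it again: a static 130 means old errors from a cable that has since been fixed (or was jostled once); a climbing count means the link is still flaky.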
Maktai Posted November 6, 2012 (Author)

I will definitely watch that and maybe proactively change the cables out. Could the source of the write errors have been entirely bad cabling? Am I good to swap another drive in and let the system rebuild the lost drive?
jbartlett Posted November 7, 2012

Can't hurt to replace the SATA cable. Other things can cause this issue too, such as a failing or underpowered power supply and certain multi-SATA cards. What is your hardware layout?
Maktai Posted November 7, 2012 (Author)

Hardware in the server:
- Motherboard: A8N-SLI Premium with 8 onboard SATA I connectors
- Power supply: Kingwin 80+ Gold 550 W
- 6 hard drives

I only have two locking cables, so I will have to order more. I blew all the dust out of the case; that can't hurt connection reliability. I can't wait for the new cables, so I checked the connections and am rebuilding the failed drive. Thank you for helping me determine the likely cause.
Maktai Posted November 7, 2012 (Author)

Seems I should have been more patient. unRAID red-balled the replacement drive that passed the preclear check.
Maktai Posted November 10, 2012 (Author)

I installed locking SATA cables, reran preclear on the drive, and rebuilt the array. There were no errors. Thank you, unRAID, for saving me from bad hardware and from my own impatience. You are a program truly smarter than your user. May your creator be blessed all the days of his life.