[SOLVED] Failure of 2 drives in a 15 drive array - can I recover the array?

nottlv · November 14, 2011

It appears I may have had two drives fail simultaneously in a 15-drive array. My unRAID server is a LimeTech MD-1510/LI running unRAID 4.7. The system has a mix of WD, Hitachi, and Seagate drives, including several ST31500341AS 1.5TB drives.

I recently had an issue with one of these Seagate drives where the disk was showing as unavailable. I did an advance-replacement RMA with Seagate, got the replacement drive in two days, ran preclear on it (it came back with a clean bill of health), and rebuilt the array. Everything was then fine.

Now, four days later, it appears that two of the other Seagate drives have failed. I first noticed that a directory had far fewer files in it than it should, so I checked the unMenu page and saw I/O errors listed. I rebooted the server, and now two drives are listed as missing (see unMenu screenshot) and the array won't start. The syslog is also attached. Because the two drives are listed as missing, I haven't been able to figure out how to run a SMART test on them (if there is a way, please let me know and I'll certainly do that).

Is there any hope of rebuilding the array or am I out of luck when it comes to the data on the two affected drives?

syslog.txt

Joe L. · November 14, 2011

For two drives to be missing, so soon after you replaced a different drive, it would be highly likely that you accidentally dislodged a power or data cable.

Power down, re-seat the cables, then power up once more.

nottlv · November 15, 2011

Thanks Joe,

I checked the cables and everything seemed fine. I then re-slotted the drives (my server uses the IcyDock enclosures) just to make sure that wasn't the issue. When I powered up the system, I heard the clicking noise indicative of a bad drive. The unRAID web console was showing an odd message: disk 3 had a red dot and was marked as unavailable, but the error was that the parity disk was smaller than the largest disk in the array. Disk 3 was showing a total size of over 2.1TB, even though it's a 1.5TB drive (my parity is a 2TB Hitachi). I unassigned disk 3 and was able to start the array successfully. I then tried to run preclear on disk 3 and got the message "Sorry: Device /dev/sdn is not responding to an fdisk -l /dev/sdn command", so I have to assume the drive is beyond recovery at this point. I guess it's another RMA for Seagate.

I was still a little concerned about disk 1, so I ran a SMART check on it. Attached is the output; based on your wiki post about interpreting the SMART parameters, this drive looks like it may be on the road to failure, correct?

smart_sdm.txt

dgaschk · November 15, 2011

The drive has 3 reallocated sectors. As long as this number does not rise the drive should be ok. Monitor this value and the Current_Pending_Sector values.
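
A minimal sketch of pulling those two counters out of a SMART attribute table, assuming the text comes from `smartctl -A` in its usual ten-column layout. The sample table below is illustrative only (it is not the attached smart_sdm.txt), and the function name is made up for this example.

```python
# Illustrative attribute table in the usual `smartctl -A` column layout.
# These rows are sample data, not the report attached to this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       3
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")


def watched_raw_counts(text):
    """Return {attribute_name: raw_count} for the watched attributes."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() test.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            # Column 10 is RAW_VALUE (the actual count);
            # column 4 is the normalized VALUE, which is not a count.
            counts[fields[1]] = int(fields[9])
    return counts


print(watched_raw_counts(SAMPLE))
```

The count to watch is in the last column (RAW_VALUE), not the normalized VALUE column.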

Joe L. · November 15, 2011

"this drive looks like it may be on the road to failure, correct?"

Actually, EVERY drive, regardless of its SMART report, is on the road to failure. It is not a question of "if" any given drive will fail, but "when". It is just a matter of time :(

Remember, there are only two kinds of hard-drives. No, not IDE and SATA, but those that have already failed, and those that have not yet failed. Just give it more time, and it will.

As mentioned, 3 re-allocated sectors is not an issue, as most modern drives have several thousand in the pool of spare sectors. But... keep an eye on the drive. If you see the number creeping upward, it is time to replace it.
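
One way to "keep an eye on" the drive is to note the raw counts each time you check and compare them with the previous reading. This is only a sketch: the function name is hypothetical, and the later reading in the example is invented to show an increase (only the count of 3 comes from this thread).

```python
def rising_counters(previous, current):
    """Return the names of counters whose raw count grew since the last check.

    Both arguments map attribute names to raw counts. A counter missing from
    the previous reading is treated as having been 0.
    """
    return sorted(
        name for name, raw in current.items() if raw > previous.get(name, 0)
    )


# Hypothetical readings taken a week apart.
last_week = {"Reallocated_Sector_Ct": 3, "Current_Pending_Sector": 0}
today = {"Reallocated_Sector_Ct": 5, "Current_Pending_Sector": 0}

for name in rising_counters(last_week, today):
    print(f"{name} rose from {last_week.get(name, 0)} to {today[name]}")
```

A stable count produces no output; a count that keeps climbing between checks is the sign that it is time to replace the drive.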

nottlv · November 16, 2011

Duh... I was looking at the VALUE and not the RAW_VALUE, so I thought the drive was on its last legs. Anyway, thanks for all the help, guys.
