Bad Drive! Or something else?

August 5, 2010

:'( I'm not sure what happened. I was off the computer all day yesterday and most of the day today. I checked with unMENU tonight and got DISK_DSBL on disk 3! It's got a red dot on the regular unRAID menu. The syslog is huge, and if I'm reading it correctly, the server restarted on its own at 4:40 AM. I didn't do that, so something happened. Could this be a bad cable or card? I'm including the whole syslog, but had to break it into several parts.

syslog_2010_8_4_pt1.txt.zip

syslog_2010_8_4_pt2.txt.zip

August 5, 2010

Author

Two more...

syslog_2010_8_4_pt3.txt.zip

syslog_2010_8_4_pt4.txt.zip

August 5, 2010

Author

Last one...

syslog_2010_8_4_pt5.txt.zip

August 5, 2010

Author

Alright, to answer myself: I just read the troubleshooting section (should have done that first, eh?). Anyway, I'll run a SMART test and see what comes up. One question: I'm getting the red ball next to disk 3, but it is still available in the array. I thought unRAID would take it offline.

August 5, 2010

The red ball means the physical disk is not available. unRAID is working as expected. It is simulating the failed drive from the other drives in the array and parity.

What did you expect would happen?

August 5, 2010

alright, to answer myself, I just read the troubleshooting section...should have done that first, eh?...anyway, I'll do a smart test and see what comes up. One question, I'm getting the red ball next to disk 3, but it is still available in the array. I thought unRAID would take it offline.

A red ball indicates that a "write" to the drive failed so it was disabled.

The remaining drives in combination with parity are simulating the failed drive. If you read files from the failed drive you are actually reading the corresponding blocks of data from ALL the drives and calculating what was on the failed drive. It is exactly why you have an unRAID array.
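To make the "calculating what was on the failed drive" part concrete, here is a minimal sketch, assuming single parity computed as a byte-wise XOR across the data disks. The disk contents and the `xor_blocks` helper are made up for illustration; this is not unRAID's actual driver code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, each holding one 4-byte block.
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0x0F, 0x0E, 0x0D, 0x0C])
disk3 = bytes([0xAA, 0xBB, 0xCC, 0xDD])

# Parity is the XOR of the same block on every data disk.
parity = xor_blocks([disk1, disk2, disk3])

# Disk 3 is disabled: a read of disk 3 is answered by XOR-ing
# the surviving data disks with parity.
simulated_disk3 = xor_blocks([disk1, disk2, parity])

print(simulated_disk3 == disk3)
```

Every read of the disabled disk touches all the other disks, which is why the array keeps working but with every remaining drive spinning.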

If you had not looked at the management console you might never have noticed the failed indicator.

You can even write to the simulated drive. It too is written to the parity drive as if the drive were actually there. Do not be fooled though, if you have a second disk fail you'll lose the contents of BOTH the failed drives.
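A rough illustration of both points, again assuming XOR-style single parity and using one made-up byte per disk: a write to the simulated disk only changes parity, and a second failure leaves one equation with two unknowns.

```python
# One byte per disk for brevity; all values are hypothetical.
disk1, disk2 = 0x10, 0x0F
old_disk3 = 0xAA
parity = disk1 ^ disk2 ^ old_disk3

# Disk 3 is disabled. "Writing" 0x55 to it stores nothing on disk 3;
# parity is recomputed as if the new byte had been written there.
new_disk3 = 0x55
parity = disk1 ^ disk2 ^ new_disk3

# Reading the simulated disk gives back the newly written value.
read_back = disk1 ^ disk2 ^ parity
print(hex(read_back))

# Now suppose disk 2 fails as well. Parity and disk 1 only reveal the
# XOR of the two missing bytes, not either byte on its own.
xor_of_missing = disk1 ^ parity
print(xor_of_missing == (disk2 ^ new_disk3))
```

Many different pairs of bytes share the same XOR, so with two disks gone there is no way to tell which pair was really stored.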

You will want to get the drive back on-line as soon as possible. If you think it was a cabling issue, here is how to fix the cable and rebuild the disk:

Stop the array

Un-assign the failed drive

Power down

Fix/re-seat the cable

Power Up

Start the array with the disk un-assigned. You'll still be able to get to the contents of it, as an un-assigned drive is treated exactly as a failed drive. You'll still access the drive "simulated" by parity and all the other drives. Starting the array with the drive un-assigned causes the array to forget the serial number of the failed disk.

Stop the array once more

Re-assign the failed disk. (unRAID will think it is a new disk, since it forgot the serial number in the prior step)

Start the array once more. It will then begin the process of re-constructing the old simulated contents onto the new physical drive.

If the drive really is physically bad, follow the exact same steps, but install the new disk while the server is powered down.

DO NOT press the button labeled "Restore", as it has nothing to do with re-constructing data. It immediately invalidates parity and would prevent re-construction of a replacement disk.

Joe L.

August 6, 2010

Author

Right, of course. Thanks, Brit and Joe. One more question: I'm guessing I can still try to run a SMART test via telnet if the physical disk isn't completely dead, right? How would you determine whether the disk has failed or it's a bad cable or something else?

EDIT: Ah, never mind, I'll follow the steps in the troubleshooting guide. THANKS!!!

August 6, 2010

Author

UPDATE: OK, I ran smartctl and am including the output. I think it looks good except for:

199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 309

According to the wiki, that should be 0, and a nonzero count is probably due to a bad cable. I followed your directions, Joe, swapped out the cable, and now it's rebuilding the disk. It's estimating 638 min., which seems about right for a 2TB drive.
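For anyone who wants to check that attribute automatically, here is a small sketch that parses the row quoted above. It assumes the usual ten-column layout of the smartctl attribute table (ID, name, flag, value, worst, thresh, type, updated, when failed, raw value); the `parse_smart_attribute` helper is made up for illustration.

```python
def parse_smart_attribute(line):
    """Split one row of a smartctl attribute table into named fields."""
    keys = ["id", "name", "flag", "value", "worst", "thresh",
            "type", "updated", "when_failed", "raw_value"]
    # Split on whitespace into at most ten fields, so a raw value
    # containing spaces stays in one piece.
    return dict(zip(keys, line.split(None, 9)))

# The row from the SMART report in this post.
line = "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 309"
attr = parse_smart_attribute(line)
crc_errors = int(attr["raw_value"])

if attr["name"] == "UDMA_CRC_Error_Count" and crc_errors > 0:
    print(f"{crc_errors} CRC errors logged: suspect the cable or connection first")
```

The raw value is normally a lifetime counter, so it will probably stay at 309 even with the new cable. What matters is whether it keeps climbing.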

It's really amazing that the "simulated" disk 3 is still there as if nothing happened. Great work Tom!

Smart_2010-8-2010.txt

August 7, 2010

Author

Well it worked! Everything's back to normal. This is an amazing system!
