June 22, 2016
My UPS battery died this weekend. We hadn't had a power outage in 7 months, so of course when I plugged in my server without the UPS (so I could use it over the weekend before my new batteries arrived), the power flickered just enough to reboot it. Now my newest drive (a 6-month-old 3TB WD Red) has a red X. The SMART report (long version) came back clean, so I decided to try rebuilding the drive. 27 hours later the message said the Data Rebuild completed with 0 errors, and the next entry is "Disk 5 in error state (disk dsbl)". I'm attaching the last test I have and am currently running a new long test. Is this drive likely dead?
tower-smart-20160621-2146.zip
June 22, 2016 Community Expert
There are no pending sectors, but from the SMART errors logged it does look like a bad disk. Only the last 5 errors are shown:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   231   185   021    Pre-fail  Always       -       3441
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       56
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4209
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       56
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       43
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       5288
194 Temperature_Celsius     0x0022   114   109   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 96 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 96 occurred at disk power-on lifetime: 4162 hours (173 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 02 00 00 00 a0  Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ef 10 02 00 00 00 a0 08      00:10:45.713  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      00:10:45.712  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 08      00:10:45.712  SET FEATURES [set transfer mode]
  ef 10 02 00 00 00 a0 08      00:10:45.712  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      00:10:45.711  IDENTIFY DEVICE

Error 95 occurred at disk power-on lifetime: 4162 hours (173 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 45 00 00 00 a0  Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ef 03 45 00 00 00 a0 08      00:10:45.712  SET FEATURES [set transfer mode]
  ef 10 02 00 00 00 a0 08      00:10:45.712  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      00:10:45.711  IDENTIFY DEVICE
  ef 10 02 00 00 00 a0 08      00:10:40.258  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      00:10:40.258  IDENTIFY DEVICE

Error 94 occurred at disk power-on lifetime: 4162 hours (173 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 02 00 00 00 a0  Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ef 10 02 00 00 00 a0 08      00:10:45.712  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      00:10:45.711  IDENTIFY DEVICE
  ef 10 02 00 00 00 a0 08      00:10:40.258  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      00:10:40.258  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 08      00:10:40.257  SET FEATURES [set transfer mode]

Error 93 occurred at disk power-on lifetime: 4162 hours (173 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 02 00 00 00 a0  Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ef 10 02 00 00 00 a0 08      00:10:40.258  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      00:10:40.258  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 08      00:10:40.257  SET FEATURES [set transfer mode]
  ef 10 02 00 00 00 a0 08      00:10:40.257  SET FEATURES [Enable SATA feature]

Error 92 occurred at disk power-on lifetime: 4162 hours (173 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 61 45 00 00 00 a0  Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ef 03 45 00 00 00 a0 08      00:10:40.257  SET FEATURES [set transfer mode]
  ef 10 02 00 00 00 a0 08      00:10:40.257  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      00:10:40.256  IDENTIFY DEVICE
  ec 00 00 00 00 00 a0 08      00:10:21.095  IDENTIFY DEVICE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      4169         -
# 2  Short offline       Completed without error       00%      4162         -
# 3  Extended offline    Completed without error       00%      1867         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
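For anyone who wants to check a report like this without reading every row, here is a minimal Python sketch that parses attribute lines and flags the ones usually watched for media trouble. The attribute rows are copied from the report above; the function names and the choice of watched IDs (5, 196, 197, 198, 199) are assumptions made for illustration, not something unRAID or smartctl does for you.

```python
# A few attribute rows copied from the SMART report above.
# Columns: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw
REPORT = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4209
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""

# Assumption: these IDs are the ones worth flagging on any non-zero raw value.
WATCHED = {5, 196, 197, 198, 199}


def parse_attributes(text):
    """Turn attribute rows into a dict keyed by attribute ID."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip headers and anything that is not a 10-column attribute row.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows[int(parts[0])] = {
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "when_failed": parts[8],
            "raw": int(parts[9]),
        }
    return rows


def flag_attributes(rows, watched=WATCHED):
    """Return (id, name) pairs that look suspicious under the rules below."""
    flagged = []
    for attr_id, row in sorted(rows.items()):
        at_or_below_thresh = row["thresh"] > 0 and row["value"] <= row["thresh"]
        nonzero_watched_raw = attr_id in watched and row["raw"] > 0
        has_failed = row["when_failed"] != "-"
        if at_or_below_thresh or nonzero_watched_raw or has_failed:
            flagged.append((attr_id, row["name"]))
    return flagged


rows = parse_attributes(REPORT)
print(flag_attributes(rows))
```

With the rows from this drive nothing is flagged, which matches the observation that there are no pending or reallocated sectors. The attribute table alone says nothing about the 96 logged ATA errors, so those still have to be read separately.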
June 22, 2016 Author
Finished a new Long Test overnight. Haven't had a chance to see if it has anything new but wanted to post this before I left for work.
tower-smart-20160621-2155.zip
June 22, 2016 Community Expert
The extended SMART test completed without error, but I still believe this disk is not very healthy. You could try using it in another server (if you have one), or in the same server but trading cables and controller/enclosure with another disk. If the same disk fails again in the future, you can be sure it's bad.
June 22, 2016 Author
I paid the taxes to buy a new disk locally to replace this one, with the intent of running a pre-clear on the old one after I get the new one pre-cleared and rebuilt. If that comes back with issues I will start the warranty process. If not, I do have a Pro license and I'm currently only at 9 disks, so I can try it. Thanks for the help so far.
June 22, 2016
Just my opinion, but I think the drive is fine. All of the visible errors are just device faults from 47 hours ago, rather unusual, but probably just issues with the bad power incident. There's no other evidence of anything physically wrong with the drive. And the faults haven't occurred since, so probably the drive was fine once rebooted.
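To show where "device faults from 47 hours ago" comes from, here is a small Python sketch using the numbers in the SMART report: the ER/ST register values (04 and 61) and the two power-on-hour figures. The bit-name tables follow the usual ATA status and error register layout as I understand it; they and the `decode` helper are assumptions for illustration and are not taken from the thread.

```python
# Assumed ATA register bit names (bit position -> flag name).
STATUS_BITS = {7: "BSY", 6: "DRDY", 5: "DF", 3: "DRQ", 0: "ERR"}
ERROR_BITS = {7: "ICRC", 6: "UNC", 4: "IDNF", 2: "ABRT"}


def decode(value, names):
    """List the named flags set in a register value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True)
            if value & (1 << bit)]


# Register values from every logged error: ER=04, ST=61 (hex).
print(decode(0x61, STATUS_BITS))  # DF is the "Device Fault" smartctl reports
print(decode(0x04, ERROR_BITS))   # ABRT is the "Error: ABRT" part

# Hours from the report: attribute 9 raw value and the error timestamp.
POWER_ON_HOURS = 4209
ERROR_AT_HOURS = 4162

hours_since_error = POWER_ON_HOURS - ERROR_AT_HOURS
days, hours = divmod(ERROR_AT_HOURS, 24)
print(hours_since_error)  # 47
print(days, hours)        # 173 10, matching "173 days + 10 hours"
```

All five logged errors share the same power-on hour (4162) and were raised by SET FEATURES / IDENTIFY DEVICE commands about ten minutes after power-up, which is consistent with a single incident rather than errors spread over the drive's life.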
June 23, 2016 Author
The issue is that after two reboots and a rebuild I still can't get unraid to not report it as faulty and enable it.
June 23, 2016
So the drive is unmountable? Rebuilding an unmountable drive won't make it mountable if e.g. there's a file system corruption. Wait for one of our experts to tell you what to do next.
June 23, 2016 Community Expert
So the rebuild completed successfully but the disk was disabled right after? If you still have it, post the syslog that includes the rebuild.
June 25, 2016 Author
I bought a new drive, pre-cleared it, and rebuilt Disk 5 onto it. I then pre-cleared the offending drive with the full test and it completed with no errors. I ran the SMART short test, which came back without errors. Before I start a SMART extended test, is there anything else I should do to check this drive so I can feel comfortable adding it back in? By the way, thanks again to everyone for the help.
June 25, 2016 Author
"So the drive is unmountable? Rebuilding an unmountable drive won't make it mountable if e.g. there's a file system corruption. Wait for one of our experts to tell you what to do next."
Didn't see your post in time, but I imagine pre-clearing a drive and adding it in as a new drive would probably take care of this issue, correct?