November 21, 2013

Hi guys,

I kicked off a parity check today (a correcting one, I believe, as the box was ticked in the webGUI) because I hadn't done one this month. I just went back to the webGUI to check progress and it had seemingly stopped and disabled disk 7 (red ball). I checked the syslog and there are lots of lines for read errors, failed to read identity, etc. related to disk 7. I tried to run a SMART report but got the "Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)" message.

Should I power down, check connections, power up and try to run another SMART report?

PS: Running 5.0 Final with most disks (including this failed one) running off an AOC-SASLP-MV8, and the webGUI shows 768 errors on the drive.

syslog-2013-11-20.txt.zip
November 21, 2013

"Should I power down, check connections, power up and try to run another SMART report?"

Yes.
November 21, 2013 Author

Here are the results of a SMART test on the disk after rebooting:

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green family
Device Model:     WDC WD15EADS-00P8B0
Serial Number:    WD-WMAVU0757932
Firmware Version: 01.00A01
User Capacity:    1,500,301,910,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Nov 21 01:02:13 2013 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x85) Offline data collection activity
                                        was aborted by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 247) Self-test routine in progress...
                                        70% of test remaining.
Total time to complete Offline
data collection:                 (33600) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3037) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   177   175   021    Pre-fail  Always       -       6125
  4 Start_Stop_Count        0x0032   094   094   000    Old_age   Always       -       6256
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   074   074   000    Old_age   Always       -       19275
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1070
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       146
193 Load_Cycle_Count        0x0032   183   183   000    Old_age   Always       -       52712
194 Temperature_Celsius     0x0022   133   097   000    Old_age   Always       -       17
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   199   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     16745         1330310082
# 2  Short offline       Completed: read failure       90%     16744         1330310082
# 3  Short offline       Completed: read failure       90%     16744         1330310082
# 4  Short offline       Completed: read failure       90%     16744         1330310082
# 5  Short offline       Completed: read failure       90%     16744         1330310082
# 6  Short offline       Aborted by host               90%     16744         -
# 7  Short offline       Aborted by host               90%     16744         -
# 8  Short offline       Aborted by host               90%     16744         -
# 9  Short offline       Aborted by host               90%     16744         -
#10  Short offline       Aborted by host               90%     16744         -
#11  Short offline       Aborted by host               90%     16744         -
#12  Short offline       Aborted by host               90%     16744         -
#13  Short offline       Aborted by host               90%     16744         -
#14  Short offline       Completed: read failure       90%     16744         1330310082
#15  Short offline       Completed: read failure       90%     16744         1330310082
#16  Short offline       Completed: read failure       90%     15168         1999110227
#17  Short offline       Completed without error       00%     15168         -
#18  Short offline       Completed without error       00%     15168         -
#19  Short offline       Completed without error       00%     15168         -
#20  Short offline       Completed: read failure       90%     15168         1999110227
#21  Short offline       Completed without error       00%     15167         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

It lists lots of read failures and 1 current pending sector. Is this drive on its way out, i.e. do I need to replace it? Or will I be able to trust parity to rebuild it, even though this disk failed during the last corrective parity check? Maybe I should have done a non-corrective one instead?

Cheers
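To put a number on "lots of read failures", the self-test log in the report above can be tallied by status and by failing LBA. This is only a minimal sketch: the function name `summarize_self_tests` and the regular expression are illustrative assumptions, not part of smartctl or unRAID, and the pattern is written only against the log layout shown in this report.

```python
import re
from collections import Counter

# Self-test log entries copied from the SMART report above.
SELF_TEST_LOG = """\
# 1 Short offline Completed: read failure 90% 16745 1330310082
# 2 Short offline Completed: read failure 90% 16744 1330310082
# 3 Short offline Completed: read failure 90% 16744 1330310082
# 4 Short offline Completed: read failure 90% 16744 1330310082
# 5 Short offline Completed: read failure 90% 16744 1330310082
# 6 Short offline Aborted by host 90% 16744 -
# 7 Short offline Aborted by host 90% 16744 -
# 8 Short offline Aborted by host 90% 16744 -
# 9 Short offline Aborted by host 90% 16744 -
#10 Short offline Aborted by host 90% 16744 -
#11 Short offline Aborted by host 90% 16744 -
#12 Short offline Aborted by host 90% 16744 -
#13 Short offline Aborted by host 90% 16744 -
#14 Short offline Completed: read failure 90% 16744 1330310082
#15 Short offline Completed: read failure 90% 16744 1330310082
#16 Short offline Completed: read failure 90% 15168 1999110227
#17 Short offline Completed without error 00% 15168 -
#18 Short offline Completed without error 00% 15168 -
#19 Short offline Completed without error 00% 15168 -
#20 Short offline Completed: read failure 90% 15168 1999110227
#21 Short offline Completed without error 00% 15167 -
"""

# Hypothetical pattern: entry number, test type, status, remaining %,
# lifetime hours, and LBA of first error ("-" when there is none).
ENTRY = re.compile(
    r"#\s*(\d+)\s+(Short|Extended|Conveyance) offline\s+"
    r"(.*?)\s+(\d+)%\s+(\d+)\s+(\d+|-)"
)


def summarize_self_tests(text):
    """Count log entries per status and read failures per LBA."""
    statuses = Counter()
    failing_lbas = Counter()
    for match in ENTRY.finditer(text):
        status, lba = match.group(3), match.group(6)
        statuses[status] += 1
        if "read failure" in status and lba != "-":
            failing_lbas[int(lba)] += 1
    return statuses, failing_lbas


statuses, failing_lbas = summarize_self_tests(SELF_TEST_LOG)
for status, count in statuses.most_common():
    print(f"{count:2d}  {status}")
for lba, count in failing_lbas.most_common():
    print(f"LBA {lba}: {count} read failure(s)")
```

For this log, the failures cluster at just two LBAs, and the LifeTime column places all of them thousands of power-on hours before the current reading of 19275.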
November 21, 2013

The pending sector must be cleared. Here are several options, from least to most preferable:

1. Rebuild the disk onto itself.
2. Pre-clear the disk and then rebuild.
3. Rebuild onto a spare and then pre-clear the disk.
December 3, 2013 Author

I rebuilt the data onto a spare disk from my PC and that seemed to go just fine. I then pre-cleared the failed disk, which got rid of that pending sector. Thanks.
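For anyone wanting to confirm the same thing after a pre-clear, one way is to read attribute 197 (Current_Pending_Sector) from `smartctl -A` and check that its raw value is back to 0. The sketch below is only an illustration: the helper names `pending_sectors` and `read_attributes` are made up here, the device path is whatever the disk is on your system, and the column positions are assumed to match the attribute table posted earlier in this thread.

```python
import subprocess


def pending_sectors(attribute_table):
    """Return the raw value of attribute 197, or None if it is not listed.

    Assumes the 10-column layout shown earlier in the thread:
    ID#, name, flag, value, worst, thresh, type, updated, when_failed, raw.
    """
    for line in attribute_table.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "197":
            return int(fields[9])
    return None


def read_attributes(device):
    """Run `smartctl -A` on a device (path supplied by the caller)."""
    # smartctl's exit status is a bitmask and can be non-zero even when
    # the table prints, so the return code is not treated as an error here.
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Attribute line as it appeared in the report before the pre-clear.
SAMPLE = (
    "197 Current_Pending_Sector  0x0032   200   200   000    "
    "Old_age   Always       -       1"
)
print(pending_sectors(SAMPLE))  # prints 1
```

On a live system, something like `pending_sectors(read_attributes("/dev/sdX"))` (with the placeholder replaced by the real device) should return 0 once the pending sector has been cleared.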