February 4, 2011

I don't really understand whether this is a problem that means I need to swap my parity drive for a new one, or whether it is still OK. unRAID should make this stuff easier to understand, like telling you what to expect: whether the drive is about to fail or is still good.

Disk 0: WARNING - Current_Pending_Sector it is now 4 (warning threshold is 1)
Disk 0: WARNING - Offline_Uncorrectable it is now 3 (warning threshold is 1)

I ran a parity check from unRAID (the correcting one, not the nocorrect one), and now I get:

Disk 0: WARNING - Current_Pending_Sector it is now 2 (warning threshold is 1)
Disk 0: WARNING - Offline_Uncorrectable it is now 3 (warning threshold is 1)

My SMART report:

smartctl -a -d ata /dev/sdc
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green family
Device Model:     WDC WD20EADS-00R6B0
Serial Number:    WD-WCAVY2269670
Firmware Version: 01.00A01
User Capacity:    2,000,398,934,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Feb 4 08:08:21 2011 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                 (43200) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303f) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   179   154   021    Pre-fail  Always       -       8025
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1225
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       7671
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       39
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       29
193 Load_Cycle_Count        0x0032   196   196   000    Old_age   Always       -       14712
194 Temperature_Celsius     0x0022   134   117   000    Old_age   Always       -       18
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       3
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      7661         27516853
# 2  Short offline       Completed: read failure       30%      2454         3328229
# 3  Short offline       Aborted by host               10%      2446         -
# 4  Short offline       Aborted by host               90%      2446         -
# 5  Short offline       Aborted by host               70%      2446         -
# 6  Short offline       Aborted by host               60%      2446         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
February 4, 2011

> I don't really understand whether this is a problem that means I need to swap my parity drive for a new one, or whether it is still OK. unRAID should make this stuff easier to understand, like telling you what to expect: whether the drive is about to fail or is still good.
>
> Disk 0: WARNING - Current_Pending_Sector it is now 4 (warning threshold is 1)
> Disk 0: WARNING - Offline_Uncorrectable it is now 3 (warning threshold is 1)
>
> I ran a parity check from unRAID (the correcting one, not the nocorrect one), and now I get:
>
> Disk 0: WARNING - Current_Pending_Sector it is now 2 (warning threshold is 1)
> Disk 0: WARNING - Offline_Uncorrectable it is now 3 (warning threshold is 1)

Before you blame unRAID: you are reading that information from a user-developed add-on, the myMain screen in unMENU.

The "Smart" reporting screen in myMain indicates that some of the parameters in the SMART report for that drive have changed. It tells me that there are still two sectors on the disk that the drive has marked as un-readable, and that they have not yet been written to since then. A write is what lets the drive either re-use the sector or re-allocate it.

You should keep an eye on the drive over the next months/years.

Since the un-readable sector count did not increase, it appears the drive was able to successfully re-write 2 of those sectors in their original locations on the disk and did not need to re-allocate them. The other 2 have probably not been written to since being identified as un-readable.

Joe L
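For anyone who wants to "keep an eye on the drive" without comparing reports by hand, here is a minimal Python sketch of one way to do it. It is not part of unRAID, unMENU, or myMain. The names `WATCHED`, `raw_values`, and `compare` are made up for illustration, and the "before" rows are reconstructed from the warning messages in this thread (pending count 4), not taken from a real report. It assumes the attribute table layout that `smartctl -a` printed above.

```python
import re

# Attributes most relevant to un-readable sectors (IDs as shown in the report above).
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

# One row of the attribute table:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def raw_values(report):
    """Return {attribute_id: raw_value} for the watched attributes in a report."""
    values = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match and int(match.group(1)) in WATCHED:
            values[int(match.group(1))] = int(match.group(3))
    return values


def compare(before, after):
    """Describe every watched attribute whose raw value changed between two reports."""
    notes = []
    for attr_id, name in WATCHED.items():
        old, new = before.get(attr_id, 0), after.get(attr_id, 0)
        if new > old:
            notes.append(f"{name} rose from {old} to {new}")
        elif new < old:
            notes.append(f"{name} fell from {old} to {new}")
    return notes


# Illustrative rows only: BEFORE is reconstructed from the first pair of warnings.
BEFORE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       3
"""
AFTER = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       3
"""

if __name__ == "__main__":
    for note in compare(raw_values(BEFORE), raw_values(AFTER)):
        print(note)
```

A rising raw value on any of the three is the thing to watch for. A falling pending count with no rise in reallocated sectors matches the case described above, where sectors were re-written in place.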
February 4, 2011 (Author)

I was not blaming unRAID; I just think unRAID should have some way to explain the state of the hard drives to users. OK, I will watch it. Is there something I should be watching for that indicates the drive is going to fail, or has failed? And what does this mean?

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      7661         27516853

Sorry for so many questions, but this stuff is confusing and I'm scared about whether the parity is good or not. I don't want to end up having another drive fail while the parity is bad.

Edit: when my parity check finished it said:

Last checked on 2/4/2011 5:36:10 AM, finding 0 errors.

In my error count I see 54 errors.

The only errors I could find in my syslog:

Feb 3 21:47:55 Tower kernel: mdcmd (92): check CORRECT (unRAID engine)
Feb 3 21:47:55 Tower kernel: md: recovery thread woken up ... (unRAID engine)
Feb 3 21:47:55 Tower kernel: md: recovery thread checking parity... (unRAID engine)
Feb 3 21:47:55 Tower kernel: md: using 1152k window, over a total of 1953514552 blocks. (unRAID engine)
Feb 3 21:55:31 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 (Errors)
Feb 3 21:55:31 Tower kernel: ata3.00: irq_stat 0x40000001 (Drive related)
Feb 3 21:55:31 Tower kernel: ata3.00: failed command: READ DMA EXT (Minor Issues)
Feb 3 21:55:31 Tower kernel: ata3.00: cmd 25/00:d0:4f:5f:b8/00:02:04:00:00/e0 tag 0 dma 368640 in (Drive related)
Feb 3 21:55:31 Tower kernel: res 51/40:9f:70:60:b8/00:01:04:00:00/e0 Emask 0x9 (media error) (Errors)
Feb 3 21:55:31 Tower kernel: ata3.00: status: { DRDY ERR } (Drive related)
Feb 3 21:55:31 Tower kernel: ata3.00: error: { UNC } (Errors)
Feb 3 21:55:31 Tower kernel: ata3.00: configured for UDMA/133 (Drive related)
Feb 3 21:55:31 Tower kernel: ata3: EH complete (Drive related)
Feb 3 21:55:33 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 (Errors)
Feb 3 21:55:33 Tower kernel: ata3.00: irq_stat 0x40000001 (Drive related)
Feb 3 21:55:33 Tower kernel: ata3.00: failed command: READ DMA EXT (Minor Issues)
Feb 3 21:55:33 Tower kernel: ata3.00: cmd 25/00:d0:4f:5f:b8/00:02:04:00:00/e0 tag 0 dma 368640 in (Drive related)
Feb 3 21:55:33 Tower kernel: res 51/40:9f:70:60:b8/00:01:04:00:00/e0 Emask 0x9 (media error) (Errors)
Feb 3 21:55:33 Tower kernel: ata3.00: status: { DRDY ERR } (Drive related)
Feb 3 21:55:33 Tower kernel: ata3.00: error: { UNC } (Errors)
Feb 3 21:55:33 Tower kernel: ata3.00: configured for UDMA/133 (Drive related)
Feb 3 21:55:33 Tower kernel: ata3: EH complete (Drive related)
Feb 3 21:55:36 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 (Errors)
Feb 3 21:55:36 Tower kernel: ata3.00: irq_stat 0x40000001 (Drive related)
Feb 3 21:55:36 Tower kernel: ata3.00: failed command: READ DMA EXT (Minor Issues)
Feb 3 21:55:36 Tower kernel: ata3.00: cmd 25/00:d0:4f:5f:b8/00:02:04:00:00/e0 tag 0 dma 368640 in (Drive related)
Feb 3 21:55:36 Tower kernel: res 51/40:9f:70:60:b8/00:01:04:00:00/e0 Emask 0x9 (media error) (Errors)
Feb 3 21:55:36 Tower kernel: ata3.00: status: { DRDY ERR } (Drive related)
Feb 3 21:55:36 Tower kernel: ata3.00: error: { UNC } (Errors)
Feb 3 21:55:36 Tower kernel: ata3.00: configured for UDMA/133 (Drive related)
Feb 3 21:55:36 Tower kernel: ata3: EH complete (Drive related)
Feb 3 21:55:38 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 (Errors)
Feb 3 21:55:38 Tower kernel: ata3.00: irq_stat 0x40000001 (Drive related)
Feb 3 21:55:38 Tower kernel: ata3.00: failed command: READ DMA EXT (Minor Issues)
Feb 3 21:55:38 Tower kernel: ata3.00: cmd 25/00:d0:4f:5f:b8/00:02:04:00:00/e0 tag 0 dma 368640 in (Drive related)
Feb 3 21:55:38 Tower kernel: res 51/40:9f:70:60:b8/00:01:04:00:00/e0 Emask 0x9 (media error) (Errors)
Feb 3 21:55:38 Tower kernel: ata3.00: status: { DRDY ERR } (Drive related)
Feb 3 21:55:38 Tower kernel: ata3.00: error: { UNC } (Errors)
Feb 3 21:55:38 Tower kernel: ata3.00: configured for UDMA/133 (Drive related)
Feb 3 21:55:38 Tower kernel: ata3: EH complete (Drive related)
Feb 3 21:55:41 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 (Errors)
Feb 3 21:55:41 Tower kernel: ata3.00: irq_stat 0x40000001 (Drive related)
Feb 3 21:55:41 Tower kernel: ata3.00: failed command: READ DMA EXT (Minor Issues)
Feb 3 21:55:41 Tower kernel: ata3.00: cmd 25/00:d0:4f:5f:b8/00:02:04:00:00/e0 tag 0 dma 368640 in (Drive related)
Feb 3 21:55:41 Tower kernel: res 51/40:9f:70:60:b8/00:01:04:00:00/e0 Emask 0x9 (media error) (Errors)
Feb 3 21:55:41 Tower kernel: ata3.00: status: { DRDY ERR } (Drive related)
Feb 3 21:55:41 Tower kernel: ata3.00: error: { UNC } (Errors)
Feb 3 21:55:41 Tower kernel: ata3.00: configured for UDMA/133 (Drive related)
Feb 3 21:55:41 Tower kernel: ata3: EH complete (Drive related)
Feb 3 21:55:43 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 (Errors)
Feb 3 21:55:43 Tower kernel: ata3.00: irq_stat 0x40000001 (Drive related)
Feb 3 21:55:43 Tower kernel: ata3.00: failed command: READ DMA EXT (Minor Issues)
Feb 3 21:55:43 Tower kernel: ata3.00: cmd 25/00:d0:4f:5f:b8/00:02:04:00:00/e0 tag 0 dma 368640 in (Drive related)
Feb 3 21:55:43 Tower kernel: res 51/40:9f:70:60:b8/00:01:04:00:00/e0 Emask 0x9 (media error) (Errors)
Feb 3 21:55:43 Tower kernel: ata3.00: status: { DRDY ERR } (Drive related)
Feb 3 21:55:43 Tower kernel: ata3.00: error: { UNC } (Errors)
Feb 3 21:55:43 Tower kernel: ata3.00: configured for UDMA/133 (Drive related)
Feb 3 21:55:43 Tower kernel: sd 2:0:0:0: [sdc] Unhandled sense code (Drive related)
Feb 3 21:55:43 Tower kernel: sd 2:0:0:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08 (System)
Feb 3 21:55:43 Tower kernel: sd 2:0:0:0: [sdc] Sense Key : 0x3 [current] [descriptor] (Drive related)
Feb 3 21:55:43 Tower kernel: Descriptor sense data with sense descriptors (in hex):
Feb 3 21:55:43 Tower kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 3 21:55:43 Tower kernel: 04 b8 60 70
Feb 3 21:55:43 Tower kernel: sd 2:0:0:0: [sdc] ASC=0x11 ASCQ=0x4 (Drive related)
Feb 3 21:55:43 Tower kernel: sd 2:0:0:0: [sdc] CDB: cdb[0]=0x28: 28 00 04 b8 5f 4f 00 02 d0 00 (Drive related)
Feb 3 21:55:43 Tower kernel: end_request: I/O error, dev sdc, sector 79192176 (Errors)
Feb 3 21:55:43 Tower kernel: ata3: EH complete (Drive related)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192112/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192120/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192128/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192136/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192144/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192152/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192160/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192168/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192176/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192184/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192192/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192200/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192208/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192216/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192224/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192232/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192240/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192248/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192256/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192264/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192272/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192280/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192288/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192296/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192304/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192312/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192320/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192328/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192336/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192344/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192352/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192360/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192368/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192376/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192384/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192392/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192400/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192408/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192416/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192424/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192432/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192440/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192448/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192456/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192464/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192472/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192480/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192488/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192496/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192504/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192512/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192520/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192528/0, count: 1 (Errors)
Feb 3 21:55:43 Tower kernel: md: disk0 read error (Errors)
Feb 3 21:55:43 Tower kernel: handle_stripe read error: 79192536/0, count: 1 (Errors)
February 4, 2011

> I was not blaming unRAID; I just think unRAID should have some way to explain the state of the hard drives to users. OK, I will watch it. Is there something I should be watching for that indicates the drive is going to fail, or has failed? And what does this mean?
>
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed: read failure       90%      7661         27516853

It indicates that the short test you requested stopped when it reached the first un-readable sector. Those sectors show up as "Offline Uncorrectable" (attempts to re-read them could not get the contents):

197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       3

> Sorry for so many questions, but this stuff is confusing and I'm scared about whether the parity is good or not. I don't want to end up having another drive fail while the parity is bad.
>
> Edit: when my parity check finished it said:
>
> Last checked on 2/4/2011 5:36:10 AM, finding 0 errors.

That is good.

> In my error count I see 54 errors.

Those are the "un-readable sectors" reported back from the OS as a result of the un-readable (media error) sectors or, perhaps more accurately, the number of "read" attempts that were not successful.

> The only errors I could find in my syslog:
>
> [syslog excerpt snipped; it is the same log posted in full above]

Yes, those are the read errors.

If you do not trust the drive, or want to put it through a good test:

1. Un-assign it from the array.
2. Run the preclear_disk.sh script on it, then run preclear_disk.sh on it a second time.
3. Re-assign it to the array and let parity be re-constructed onto it (as long as the results do not show additional un-readable sectors).

Your array will be without parity protection while you clear and test the drive, but it sounds as if it will give you a lot of peace of mind.

Joe L
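To connect the "54 errors" counter to the syslog, here is a minimal Python sketch that counts the `handle_stripe read error` lines and decodes the sector number from the sense data. It is only an illustration: `summarize` and `lba_from_sense` are hypothetical names, the sample text is rebuilt from the values posted in this thread rather than read from a live syslog, and treating the trailing sense bytes `04 b8 60 70` as the failing LBA is an assumption that happens to agree with the `end_request` line.

```python
import re

# Patterns for the two kinds of lines that carry sector numbers in the posted log.
STRIPE = re.compile(r"handle_stripe read error: (\d+)/(\d+), count: (\d+)")
IO_ERR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize(syslog_text):
    """Count stripe read errors and list the device-level I/O errors."""
    sectors = [int(m.group(1)) for m in STRIPE.finditer(syslog_text)]
    io_errors = [(m.group(1), int(m.group(2))) for m in IO_ERR.finditer(syslog_text)]
    return {
        "read_errors": len(sectors),
        "first": min(sectors) if sectors else None,
        "last": max(sectors) if sectors else None,
        "io_errors": io_errors,
    }


def lba_from_sense(hex_bytes):
    """Interpret space-separated hex bytes as one big-endian integer."""
    return int(hex_bytes.replace(" ", ""), 16)


# Sample rebuilt from the thread: one I/O error line, then one stripe error
# every 8 sectors from 79192112 through 79192536.
SAMPLE = "\n".join(
    ["Feb 3 21:55:43 Tower kernel: end_request: I/O error, dev sdc, sector 79192176"]
    + [
        f"Feb 3 21:55:43 Tower kernel: handle_stripe read error: {s}/0, count: 1"
        for s in range(79192112, 79192537, 8)
    ]
)

if __name__ == "__main__":
    print(summarize(SAMPLE))
    print(lba_from_sense("04 b8 60 70"))
```

On the posted log this gives 54 stripe read errors from a single failed read request, which lines up with the explanation above that the counter reflects unsuccessful read attempts and not 54 separate bad sectors on the disk.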