August 22, 2013

I installed a new 2TB hard drive and rebuilt using the basic Unraid 4.7 menu. When the rebuild finished it reported 116 errors, with all green balls. I ran the parity check again and still got 116 errors. My syslog, viewed in UnMenu, showed errors; the following is a sample of the red lines:

Aug 20 17:16:49 Tower kernel: res 51/40:8f:f8:5e:0b/00:03:22:00:00/e0 Emask 0x9 (media error)
Aug 20 17:16:49 Tower kernel: ata1.00: error: { UNC }
Aug 20 17:16:49 Tower kernel: end_request: I/O error, dev sda, sector 571170552
Aug 20 17:16:49 Tower kernel: md: disk0 read error
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571170488/0, count: 1
Aug 20 17:16:49 Tower kernel: md: parity incorrect: 571170488
Aug 20 17:16:49 Tower kernel: md: disk0 read error
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571170496/0, count: 1
Aug 20 17:16:49 Tower kernel: md: parity incorrect: 571170496
Aug 20 17:16:49 Tower kernel: md: disk0 read error
...
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571170632/0, count: 1
Aug 20 17:16:49 Tower kernel: md: parity incorrect: 571170632
Aug 20 17:16:49 Tower kernel: md: disk0 read error
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571170640/0, count: 1
Aug 20 17:16:49 Tower kernel: md: parity incorrect: 571170640
Aug 20 17:16:49 Tower kernel: md: disk0 read error
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571170648/0, count: 1
Aug 20 17:16:49 Tower kernel: md: disk0 read error
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571170656/0, count: 1
Aug 20 17:16:49 Tower kernel: md: disk0 read error
...
Aug 20 17:16:49 Tower kernel: md: disk0 read error
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571171400/0, count: 1
Aug 20 17:16:49 Tower kernel: md: disk0 read error
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571171408/0, count: 1
Aug 21 11:30:29 Tower kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 21 11:30:29 Tower kernel: res 51/40:87:f8:5e:0b/00:00:22:00:00/e0 Emask 0x9 (media error)
Aug 21 11:30:29 Tower kernel: ata1.00: error: { UNC }
Aug 21 11:30:31 Tower kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 21 11:30:31 Tower kernel: res 51/40:87:f8:5e:0b/00:00:22:00:00/e0 Emask 0x9 (media error)
Aug 21 11:30:31 Tower kernel: ata1.00: error: { UNC }
Aug 21 11:30:31 Tower kernel: md: parity incorrect: 571170488
Aug 21 11:30:31 Tower kernel: md: parity incorrect: 571170496
..
Aug 21 11:30:31 Tower kernel: md: parity incorrect: 571170632
Aug 21 11:30:31 Tower kernel: md: parity incorrect: 571170640

Below is the smartctl output for the parity drive after a short self-test:

root@Tower:/boot/smarthistory# smartctl -a -A /dev/sda
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD2001FASS-00W2B0
Serial Number:    WD-WMAY00086118
Firmware Version: 01.00101
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Aug 22 04:10:02 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x84)
    Offline data collection activity was suspended by an interrupting command from host.
    Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0)
    The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (30180) seconds.
Offline data collection capabilities: (0x7b)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x3037)
    SCT Status supported.
    SCT Feature Control supported.
    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   064   047   021    Pre-fail  Always       -       13825
  4 Start_Stop_Count        0x0032   097   097   000    Old_age   Always       -       3654
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   065   065   000    Old_age   Always       -       25918
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       53
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       17
193 Load_Cycle_Count        0x0032   179   179   000    Old_age   Always       -       65821
194 Temperature_Celsius     0x0022   110   095   000    Old_age   Always       -       42
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       2
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       3

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     25918         -
# 2  Short offline       Completed without error       00%     24001         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
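
For a quick read of an attribute table like the one above, the rows that usually matter for media or cabling trouble can be filtered out with a short script. This is only a minimal sketch: the `WATCHED` set and the `nonzero_watched` helper are hypothetical names chosen for illustration, the list of watched attributes is an assumption rather than an official one, and the sample rows are copied from the parity drive's table.

```python
import re

# Attribute names to watch. This selection is an assumption made for
# illustration; it is not an official or complete list.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
    "Multi_Zone_Error_Rate",
}

# One row of the "Vendor Specific SMART Attributes" table:
# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def nonzero_watched(smart_text):
    """Return {attribute_name: raw_value} for watched attributes with a nonzero raw value."""
    found = {}
    for line in smart_text.splitlines():
        m = ROW.match(line)
        if m and m.group(2) in WATCHED and int(m.group(9)) != 0:
            found[m.group(2)] = int(m.group(9))
    return found


# Rows copied from the parity drive (/dev/sda) output in this post.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       2
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       3
"""

print(nonzero_watched(SAMPLE))
```

On the parity drive's rows this leaves only Offline_Uncorrectable (raw 2) and Multi_Zone_Error_Rate (raw 3). Those counters alone do not say whether the drive or the cable is at fault.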

Below is a smartctl short result on the new drive:

root@Tower:/boot/smarthistory# smartctl -a -A /dev/sdf
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD20EFRX-68AX9N0
Serial Number:    WD-WCC300359112
Firmware Version: 80.00A80
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   9
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Aug 22 04:01:24 2013 Local time zone must be set--see zic m
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00)
    Offline data collection activity was never started.
    Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0)
    The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (25980) seconds.
Offline data collection capabilities: (0x7b)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x70bd)
    SCT Status supported.
    SCT Feature Control supported.
    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       2
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       37
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1
194 Temperature_Celsius     0x0022   121   118   000    Old_age   Always       -       26
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        37         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I've attached the full syslog. What looks to be wrong?

Update: I swapped out the SATA cable for the parity drive and re-checked parity. It reported 0 errors this time, and no errors showed up in the syslog. I haven't determined whether this is a coincidence, but for now I'll assume it was a bad cable unless the errors come back.

syslog-2013-08-22.zip
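
To compare one parity check against the next, the affected stripe addresses can be pulled out of the syslog. The following is a minimal sketch, not a tool from Unraid or UnMenu: `summarize` and `stripes_in_range` are hypothetical helpers, and the stride of 8 sectors is inferred from the consecutive addresses visible in the sample (571170488, 571170496, and so on). The elided lines are assumed to follow the same stride, which the sample alone cannot confirm.

```python
import re

PARITY = re.compile(r"md: parity incorrect: (\d+)")
READ_ERR = re.compile(r"handle_stripe read error: (\d+)/\d+")


def summarize(log_text):
    """Return (parity_incorrect, read_error) as sorted lists of distinct stripe addresses."""
    parity = sorted({int(s) for s in PARITY.findall(log_text)})
    reads = sorted({int(s) for s in READ_ERR.findall(log_text)})
    return parity, reads


def stripes_in_range(first, last, step=8):
    """Count stripe addresses from first to last inclusive, assuming a fixed stride."""
    return (last - first) // step + 1


# A few lines copied from the syslog sample in this post.
SAMPLE = """\
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571170488/0, count: 1
Aug 20 17:16:49 Tower kernel: md: parity incorrect: 571170488
Aug 20 17:16:49 Tower kernel: handle_stripe read error: 571170496/0, count: 1
Aug 20 17:16:49 Tower kernel: md: parity incorrect: 571170496
Aug 21 11:30:31 Tower kernel: md: parity incorrect: 571170488
Aug 21 11:30:31 Tower kernel: md: parity incorrect: 571170496
"""

parity, reads = summarize(SAMPLE)
print(parity, reads)

# First and last read-error addresses shown in the sample.
print(stripes_in_range(571170488, 571171408))
```

If the stride assumption holds, the read errors from 571170488 through 571171408 cover 116 stripes, which would match the 116 errors reported, and the same addresses recurring on both days would point at one fixed region on the parity drive's path rather than random errors.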