riccume Posted March 1, 2011
My motherboard (Gigabyte GA-D510UD) has only 4 SATA ports, so when I ran out of storage space and had to add a 5th hard drive I purchased a SATA PCI card (VIA VT6421A chipset; there is no PCI-Express slot on the motherboard). In this configuration the parity check ran at around 45 Mb/s. A couple of weeks ago I had to add a 6th hard drive, again attached to the SATA PCI card. Now the parity check runs at around 3 Mb/s, i.e. 15x slower! The only explanation I can think of is that the two hard drives attached to the SATA PCI card are being accessed simultaneously, creating a "traffic jam" on the PCI card, but I don't understand why: neither of them is the parity drive, which is connected directly to the motherboard. Shouldn't the parity check be done sequentially, first between the parity drive and one drive, then between the parity drive and another drive, and so on? So the questions: Is this the reason? Is it "normal"? Any solution? Thank you!
mcs Posted March 1, 2011
No. When performing a parity check, all drives are accessed simultaneously. Still, that drop seems excessive.
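To illustrate why every drive is read at once: with single (XOR) parity, the parity block at a given position is the XOR of the blocks at that same position on all data drives, so checking one stripe needs one block from each drive. A minimal Python sketch follows; the block contents and function names are made up for illustration, and this is not unRAID's actual code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_ok(data_blocks, parity_block):
    """A stripe is consistent when the XOR of all data blocks equals parity."""
    return xor_blocks(data_blocks) == parity_block

# Hypothetical 4-byte blocks read from the same offset on three data drives.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]
parity = xor_blocks(stripe)

print(parity_ok(stripe, parity))
```

Because each stripe needs a block from every drive, the check can only advance as fast as the slowest drive or link, so a single misbehaving port is enough to slow the whole check.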
SSD Posted March 1, 2011
Two disks on the PCI bus would not slow you down nearly that much. Post a syslog and SMART reports from your two PCI drives.
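A rough back-of-the-envelope check of that claim, as a Python sketch. It assumes a conventional 32-bit, 33 MHz PCI slot (the thread does not state the slot type) and an even split of the theoretical peak between the drives on the card; real-world throughput is lower than this ceiling.

```python
PCI_CLOCK_MHZ = 33.33   # assumption: conventional 33 MHz PCI slot
PCI_WIDTH_BYTES = 4     # assumption: 32-bit wide bus

def pci_peak_mb_per_s(clock_mhz=PCI_CLOCK_MHZ, width_bytes=PCI_WIDTH_BYTES):
    """Theoretical peak of the shared bus in MB/s (one transfer per clock)."""
    return clock_mhz * width_bytes

def per_drive_share(drives_on_card):
    """Even split of the theoretical peak between drives on the same card."""
    return pci_peak_mb_per_s() / drives_on_card

for drives in (1, 2):
    print(drives, "drive(s):", round(per_drive_share(drives), 1), "MB/s each")
```

Even halved, the theoretical ceiling per drive is an order of magnitude above the speed reported with two drives on the card, which is consistent with the bus itself not being the explanation.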
riccume Posted March 1, 2011
Thanks, good to know it isn't normal. I'll provide the additional info as soon as I get back home. Cheers.
SSD Posted March 1, 2011
This likely has nothing to do with this issue, but once you get this solved you might want to avoid assigning the two PCI-attached drives to consecutive disk slots. There were reports that interleaving the disks helped performance (by a small amount). So instead of
disk1 - motherboard
disk2 - motherboard
disk3 - motherboard
disk4 - motherboard
disk5 - PCI card
disk6 - PCI card
you could have
disk1 - motherboard
disk2 - motherboard
disk3 - motherboard
disk4 - PCI card
disk5 - motherboard
disk6 - PCI card
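The suggested layout can be generated mechanically. The sketch below is only an illustration in Python with a hypothetical helper name: working backwards from the last slot, it alternates PCI-attached and motherboard-attached drives until the PCI drives run out, which reproduces the second layout above for 4 motherboard drives and 2 PCI drives.

```python
def interleave_from_end(motherboard_count, pci_count):
    """Assign disk slots so PCI-attached drives are not adjacent when avoidable."""
    slots = []
    mb, pci = motherboard_count, pci_count
    while mb or pci:
        # Place a PCI drive unless the previous slot already holds one
        # (and a motherboard drive is still available to separate them).
        if pci and (not slots or slots[-1] != "PCI card" or not mb):
            slots.append("PCI card")
            pci -= 1
        else:
            slots.append("motherboard")
            mb -= 1
    slots.reverse()
    return {f"disk{i}": where for i, where in enumerate(slots, start=1)}

layout = interleave_from_end(4, 2)
for disk, where in layout.items():
    print(disk, "-", where)
```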
riccume Posted March 1, 2011
OK, first the syslog. I downloaded it using unMenu and it is huge, so here are the most recent rows. Clearly something is going wrong here...
Mar 1 19:46:18 Server kernel: ata4: hard resetting link Mar 1 19:46:19 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:19 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:19 Server kernel: ata4: EH complete Mar 1 19:46:19 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:19 Server kernel: ata3.00: BMDMA stat 0x5 Mar 1 19:46:19 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:19 Server kernel: ata3.00: failed command: READ DMA EXT Mar 1 19:46:19 Server kernel: ata3.00: cmd 25/00:c8:9f:66:58/00:03:15:00:00/e0 tag 0 dma 495616 in Mar 1 19:46:19 Server kernel: res 51/84:b7:9f:66:58/84:03:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:19 Server kernel: ata3.00: status: { DRDY ERR } Mar 1 19:46:19 Server kernel: ata3.00: error: { ICRC ABRT } Mar 1 19:46:19 Server kernel: ata3: hard resetting link Mar 1 19:46:19 Server kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:19 Server kernel: ata3.00: configured for UDMA/33 Mar 1 19:46:19 Server kernel: ata3: EH complete Mar 1 19:46:19 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:19 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:19 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:19 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:19 Server kernel: ata4.00: cmd 25/00:e0:9f:6f:58/00:00:15:00:00/e0 tag 0 dma 114688 in Mar 1 19:46:19 Server kernel: res 51/84:cf:9f:6f:58/84:00:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:19 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:19 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:19 Server kernel: ata4: hard resetting link Mar 1 19:46:19 
Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:19 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:19 Server kernel: ata4: EH complete Mar 1 19:46:19 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:19 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:19 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:19 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:19 Server kernel: ata4.00: cmd 25/00:00:7f:74:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:19 Server kernel: res 51/84:5f:7f:74:58/84:01:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:19 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:19 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:19 Server kernel: ata4: hard resetting link Mar 1 19:46:20 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:20 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:20 Server kernel: ata4: EH complete Mar 1 19:46:20 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:20 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:20 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:20 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:20 Server kernel: ata4.00: cmd 25/00:00:87:7e:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:20 Server kernel: res 51/84:1f:87:7e:58/84:03:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:20 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:20 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:20 Server kernel: ata4: hard resetting link Mar 1 19:46:20 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:20 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:20 Server kernel: ata4: EH complete Mar 1 19:46:20 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 
1 19:46:20 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:20 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:20 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:20 Server kernel: ata4.00: cmd 25/00:18:97:87:58/00:01:15:00:00/e0 tag 0 dma 143360 in Mar 1 19:46:20 Server kernel: res 51/84:d7:97:87:58/84:00:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:20 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:20 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:20 Server kernel: ata4: hard resetting link Mar 1 19:46:21 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:21 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:21 Server kernel: ata4: EH complete Mar 1 19:46:21 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:21 Server kernel: ata3.00: BMDMA stat 0x5 Mar 1 19:46:21 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:21 Server kernel: ata3.00: failed command: READ DMA EXT Mar 1 19:46:21 Server kernel: ata3.00: cmd 25/00:00:97:90:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:21 Server kernel: res 51/84:af:97:90:58/84:03:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:21 Server kernel: ata3.00: status: { DRDY ERR } Mar 1 19:46:21 Server kernel: ata3.00: error: { ICRC ABRT } Mar 1 19:46:21 Server kernel: ata3: hard resetting link Mar 1 19:46:21 Server kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:21 Server kernel: ata3.00: configured for UDMA/33 Mar 1 19:46:21 Server kernel: ata3: EH complete Mar 1 19:46:21 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:21 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:21 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:21 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:21 Server kernel: ata4.00: cmd 25/00:00:97:99:58/00:04:15:00:00/e0 tag 0 
dma 524288 in Mar 1 19:46:21 Server kernel: res 51/84:ef:97:99:58/84:03:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:21 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:21 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:21 Server kernel: ata4: hard resetting link Mar 1 19:46:22 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:22 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:22 Server kernel: ata4: EH complete Mar 1 19:46:22 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:22 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:22 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:22 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:22 Server kernel: ata4.00: cmd 25/00:90:07:a8:58/00:03:15:00:00/e0 tag 0 dma 466944 in Mar 1 19:46:22 Server kernel: res 51/84:6f:07:a8:58/84:03:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:22 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:22 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:22 Server kernel: ata4: hard resetting link Mar 1 19:46:22 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:22 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:22 Server kernel: ata4: EH complete Mar 1 19:46:22 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:22 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:22 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:22 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:22 Server kernel: ata4.00: cmd 25/00:00:97:ab:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:22 Server kernel: res 51/84:8f:97:ab:58/84:01:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:22 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:22 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:22 Server kernel: ata4: hard 
resetting link Mar 1 19:46:22 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:22 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:22 Server kernel: ata4: EH complete Mar 1 19:46:22 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:22 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:22 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:22 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:22 Server kernel: ata4.00: cmd 25/00:00:97:bd:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:22 Server kernel: res 51/84:ef:97:bd:58/84:03:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:22 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:22 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:22 Server kernel: ata4: hard resetting link Mar 1 19:46:23 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:23 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:23 Server kernel: ata4: EH complete Mar 1 19:46:23 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:23 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:23 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:23 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:23 Server kernel: ata4.00: cmd 25/00:00:0f:c8:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:23 Server kernel: res 51/84:4f:0f:c8:58/84:01:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:23 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:23 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:23 Server kernel: ata4: hard resetting link Mar 1 19:46:23 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:23 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:23 Server kernel: ata4: EH complete Mar 1 19:46:23 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 
SErr 0x1000500 action 0x6 Mar 1 19:46:23 Server kernel: ata4.00: BMDMA stat 0x5 Mar 1 19:46:23 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:23 Server kernel: ata4.00: failed command: READ DMA EXT Mar 1 19:46:23 Server kernel: ata4.00: cmd 25/00:00:67:d1:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:23 Server kernel: res 51/84:7f:67:d1:58/84:01:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:23 Server kernel: ata4.00: status: { DRDY ERR } Mar 1 19:46:23 Server kernel: ata4.00: error: { ICRC ABRT } Mar 1 19:46:23 Server kernel: ata4: hard resetting link Mar 1 19:46:24 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:24 Server kernel: ata4.00: configured for UDMA/33 Mar 1 19:46:24 Server kernel: ata4: EH complete Mar 1 19:46:24 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:24 Server kernel: ata3.00: BMDMA stat 0x5 Mar 1 19:46:24 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:24 Server kernel: ata3.00: failed command: READ DMA EXT Mar 1 19:46:24 Server kernel: ata3.00: cmd 25/00:00:67:da:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:24 Server kernel: res 51/84:4f:67:da:58/84:03:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:24 Server kernel: ata3.00: status: { DRDY ERR } Mar 1 19:46:24 Server kernel: ata3.00: error: { ICRC ABRT } Mar 1 19:46:24 Server kernel: ata3: hard resetting link Mar 1 19:46:24 Server kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:24 Server kernel: ata3.00: configured for UDMA/33 Mar 1 19:46:24 Server kernel: ata3: EH complete Mar 1 19:46:24 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:24 Server kernel: ata3.00: BMDMA stat 0x5 Mar 1 19:46:24 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:24 Server kernel: ata3.00: failed command: READ DMA EXT Mar 1 19:46:24 Server kernel: ata3.00: cmd 
25/00:f8:6f:df:58/00:03:15:00:00/e0 tag 0 dma 520192 in Mar 1 19:46:24 Server kernel: res 51/84:77:6f:df:58/84:03:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:24 Server kernel: ata3.00: status: { DRDY ERR } Mar 1 19:46:24 Server kernel: ata3.00: error: { ICRC ABRT } Mar 1 19:46:24 Server kernel: ata3: hard resetting link Mar 1 19:46:25 Server kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 1 19:46:25 Server kernel: ata3.00: configured for UDMA/33 Mar 1 19:46:25 Server kernel: ata3: EH complete Mar 1 19:46:25 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 1 19:46:25 Server kernel: ata3.00: BMDMA stat 0x5 Mar 1 19:46:25 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns } Mar 1 19:46:25 Server kernel: ata3.00: failed command: READ DMA EXT Mar 1 19:46:25 Server kernel: ata3.00: cmd 25/00:00:6f:e8:58/00:04:15:00:00/e0 tag 0 dma 524288 in Mar 1 19:46:25 Server kernel: res 51/84:0f:6f:e8:58/84:00:15:00:00/e0 Emask 0x12 (ATA bus error) Mar 1 19:46:25 Server kernel: ata3.00: status: { DRDY ERR } Mar 1 19:46:25 Server kernel: ata3.00: error: { ICRC ABRT } Mar 1 19:46:25 Server kernel: ata3: hard resetting link
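A syslog this repetitive is easier to read as a tally. The Python sketch below counts libata "exception" entries per ATA port; the sample lines are copied from the excerpt above, and the regular expression is an assumption that matches only this message shape. On a real system the text would come from the full syslog file rather than an embedded string.

```python
import re
from collections import Counter

# A few entries copied from the posted syslog excerpt.
SAMPLE_SYSLOG = """\
Mar 1 19:46:19 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 1 19:46:19 Server kernel: ata3.00: error: { ICRC ABRT }
Mar 1 19:46:19 Server kernel: ata3: hard resetting link
Mar 1 19:46:19 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 1 19:46:19 Server kernel: ata4.00: error: { ICRC ABRT }
Mar 1 19:46:19 Server kernel: ata4: hard resetting link
Mar 1 19:46:19 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
"""

# Matches e.g. "kernel: ata4.00: exception Emask" and captures the port name.
EXCEPTION_RE = re.compile(r"kernel: (ata\d+)\.\d+: exception Emask")

def count_exceptions(log_text):
    """Return a Counter mapping ATA port (e.g. 'ata4') to exception count."""
    return Counter(m.group(1) for m in EXCEPTION_RE.finditer(log_text))

print(count_exceptions(SAMPLE_SYSLOG))
```

In the excerpt both ata3 and ata4 report the same ICRC (interface CRC) error and both links are repeatedly reset and renegotiated at UDMA/33. An interface CRC error means data was corrupted in transit between drive and controller, so errors on two ports at once may point at something they share (card, slot, power) rather than at a single drive, though the log alone does not prove that.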
riccume Posted March 1, 2011 Author Share Posted March 1, 2011 Smart Status Report for one of the hard drives: Statistics for /dev/sdg 00S_WD-WCAVY3453118 smartctl -a -d ata /dev/sdg smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Device Model: WDC WD20EADS-00S2B0 Serial Number: WD-WCAVY3453118 Firmware Version: 01.00A01 User Capacity: 2,000,398,934,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Tue Mar 1 19:48:42 2011 GMT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (41100) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x3037) SCT Status supported. 
SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 8 3 Spin_Up_Time 0x0027 186 151 021 Pre-fail Always - 7700 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 968 5 Reallocated_Sector_Ct 0x0033 200 199 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0 9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 5322 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 21 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 7 193 Load_Cycle_Count 0x0032 190 190 000 Old_age Always - 31072 194 Temperature_Celsius 0x0022 107 102 000 Old_age Always - 45 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 1 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 198 198 000 Old_age Offline - 405 SMART Error Log Version: 1 Warning: ATA error count 36 inconsistent with error log pointer 4 ATA Error Count: 36 (device log contains only the most recent five errors) CR = Command Register [HEX] FR = Features Register [HEX] SC = Sector Count Register [HEX] SN = Sector Number Register [HEX] CL = Cylinder Low Register [HEX] CH = Cylinder High Register [HEX] DH = Device/Head Register [HEX] DC = Device Command Register [HEX] ER = Error register [HEX] ST = Status register [HEX] Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days. 
Error 36 occurred at disk power-on lifetime: 1102 hours (45 days + 22 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 08 2f 9b 16 e0 Error: UNC 8 sectors at LBA = 0x00169b2f = 1481519 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- c8 00 08 2f 9b 16 e0 08 45d+12:26:21.512 READ DMA ef 10 02 00 00 00 a0 08 45d+12:26:21.512 SET FEATURES [Reserved for Serial ATA] ec 00 00 00 00 00 a0 08 45d+12:26:21.509 IDENTIFY DEVICE ef 03 46 00 00 00 a0 08 45d+12:26:21.509 SET FEATURES [set transfer mode] Error 35 occurred at disk power-on lifetime: 1102 hours (45 days + 22 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 08 31 9b 16 e0 Error: UNC 8 sectors at LBA = 0x00169b31 = 1481521 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- c8 00 08 2f 9b 16 e0 08 45d+12:26:18.049 READ DMA ef 10 02 00 00 00 a0 08 45d+12:26:18.049 SET FEATURES [Reserved for Serial ATA] ec 00 00 00 00 00 a0 08 45d+12:26:18.046 IDENTIFY DEVICE ef 03 46 00 00 00 a0 08 45d+12:26:18.046 SET FEATURES [set transfer mode] Error 34 occurred at disk power-on lifetime: 1102 hours (45 days + 22 hours) When the command that caused the error occurred, the device was active or idle. 
After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 08 2f 9b 16 e0 Error: UNC 8 sectors at LBA = 0x00169b2f = 1481519 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- c8 00 08 2f 9b 16 e0 08 45d+12:26:15.551 READ DMA ef 10 02 00 00 00 a0 08 45d+12:26:15.551 SET FEATURES [Reserved for Serial ATA] ec 00 00 00 00 00 a0 08 45d+12:26:15.548 IDENTIFY DEVICE ef 03 46 00 00 00 a0 08 45d+12:26:15.548 SET FEATURES [set transfer mode] SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed: read failure 90% 1190 1481530 SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. Quote Link to comment
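The register dumps in the SMART error log can be decoded by hand. The Python sketch below (function names are hypothetical) rebuilds the 28-bit LBA from the SN, CL, CH and DH registers and names the bits of the ATA error register; it reproduces the "LBA = 0x00169b2f = 1481519" figure printed by smartctl for error 36.

```python
# ATA error register bits, most significant first.
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
    (0x08, "MCR"), (0x04, "ABRT"), (0x02, "NM"), (0x01, "AMNF"),
]

def lba28_from_registers(sn, cl, ch, dh):
    """Combine the task-file registers into a 28-bit LBA.

    Only the low nibble of the Device/Head register carries LBA bits.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

def decode_error_register(er):
    """List the names of the bits set in the ATA error register."""
    return [name for mask, name in ERROR_BITS if er & mask]

# Error 36 in the report: ER=40 ST=51 SC=08 SN=2f CL=9b CH=16 DH=e0
print(lba28_from_registers(0x2F, 0x9B, 0x16, 0xE0))   # LBA of the failed read
print(decode_error_register(0x40))                    # SMART log entry
print(decode_error_register(0x84))                    # value seen in the syslog
```

Note that the errors in this SMART log are UNC (uncorrectable read) errors recorded at 1102 power-on hours, while the drive now reports 5322 hours. They appear to be an older and different kind of problem from the ICRC errors currently flooding the syslog, although the failed short self-test near the same LBA suggests that area of the disk is still worth watching.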
riccume Posted March 1, 2011 Author Share Posted March 1, 2011 And for the other: Statistics for /dev/sde 00M_WD-WCAZA0670624 smartctl -a -d ata /dev/sde smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Device Model: WDC WD20EARS-00MVWB0 Serial Number: WD-WCAZA0670624 Firmware Version: 51.0AB51 User Capacity: 2,000,398,934,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Tue Mar 1 19:49:32 2011 GMT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (39000) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. 
SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 173 171 021 Pre-fail Always - 6350 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 517 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 096 096 000 Old_age Always - 3507 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 13 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 9 193 Load_Cycle_Count 0x0032 196 196 000 Old_age Always - 13753 194 Temperature_Celsius 0x0022 114 107 000 Old_age Always - 36 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 No self-tests have been logged. [To run self-tests, use: smartctl -t] SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. Quote Link to comment
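To compare the two reports quickly, the attribute tables can be parsed and filtered for non-zero raw values. This Python sketch uses a handful of rows copied from the two reports above; the parser assumes the ten-column layout shown by smartctl 5.38 here and a plain integer in the RAW_VALUE column, which does not hold for every attribute on every drive.

```python
# Rows copied from the two posted reports (ID, name, flag, value, worst,
# thresh, type, updated, when_failed, raw).
SDG_ROWS = """\
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 8
5 Reallocated_Sector_Ct 0x0033 200 199 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 1
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 198 198 000 Old_age Offline - 405
"""

SDE_ROWS = """\
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
"""

def parse_attributes(text):
    """Map attribute name to its integer raw value."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs

def nonzero(attrs):
    """Keep only attributes whose raw value is not zero."""
    return {name: raw for name, raw in attrs.items() if raw}

print("sdg:", nonzero(parse_attributes(SDG_ROWS)))
print("sde:", nonzero(parse_attributes(SDE_ROWS)))
```

For these error-related counters, only /dev/sdg (serial ending in 118) shows non-zero raw values, and UDMA_CRC_Error_Count is 0 on both drives despite the ICRC errors in the syslog.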
SSD Posted March 1, 2011
Try replacing the SATA cable connecting to the drive with the serial number ending in 118.
riccume Posted March 1, 2011
Thanks. I will have to do it tomorrow and report back; it takes a few minutes to play around with the Lian Li PC-Q08 case.
riccume Posted March 2, 2011
Unfortunately no luck: I checked all connections, changed the SATA cable on hard drive 118, and relaunched the parity check, and the speed is still 3 Mb/s. See the new syslog below: clearly something is still wrong, but it is too cryptic for me to understand. Any further suggestions? The strange thing is that the unRAID server otherwise works perfectly and reports no errors...
Mar 2 20:25:07 Server kernel: ata4: hard resetting link Mar 2 20:25:07 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 2 20:25:07 Server kernel: ata4.00: configured for UDMA/33 Mar 2 20:25:07 Server kernel: ata4: EH complete Mar 2 20:25:07 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 2 20:25:07 Server kernel: ata3.00: BMDMA stat 0x5 Mar 2 20:25:07 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns } Mar 2 20:25:07 Server kernel: ata3.00: failed command: READ DMA EXT Mar 2 20:25:07 Server kernel: ata3.00: cmd 25/00:00:9f:26:08/00:04:00:00:00/e0 tag 0 dma 524288 in Mar 2 20:25:07 Server kernel: res 51/84:ef:9f:26:08/84:03:00:00:00/e0 Emask 0x12 (ATA bus error) Mar 2 20:25:07 Server kernel: ata3.00: status: { DRDY ERR } Mar 2 20:25:07 Server kernel: ata3.00: error: { ICRC ABRT } Mar 2 20:25:07 Server kernel: ata3: hard resetting link Mar 2 20:25:07 Server kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310) Mar 2 20:25:07 Server kernel: ata3.00: configured for UDMA/33 Mar 2 20:25:07 Server kernel: ata3: EH complete Mar 2 20:25:08 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6 Mar 2 20:25:08 Server kernel: ata4.00: BMDMA stat 0x5 Mar 2 20:25:08 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns } Mar 2 20:25:08 Server kernel: ata4.00: failed command: READ DMA EXT Mar 2 20:25:08 Server kernel: ata4.00: cmd 25/00:00:9f:2f:08/00:04:00:00:00/e0 tag 0 dma 524288 in Mar 2 20:25:08 Server kernel: res 51/84:df:9f:2f:08/84:03:00:00:00/e0 Emask 0x12 
(ATA bus error)
Mar 2 20:25:08 Server kernel: ata4.00: status: { DRDY ERR }
Mar 2 20:25:08 Server kernel: ata4.00: error: { ICRC ABRT }
Mar 2 20:25:08 Server kernel: ata4: hard resetting link
Mar 2 20:25:08 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:08 Server kernel: ata3.00: BMDMA stat 0x5
Mar 2 20:25:08 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns }
Mar 2 20:25:08 Server kernel: ata3.00: failed command: READ DMA EXT
Mar 2 20:25:08 Server kernel: ata3.00: cmd 25/00:00:9f:2f:08/00:04:00:00:00/e0 tag 0 dma 524288 in
Mar 2 20:25:08 Server kernel: res 51/84:5f:9f:2f:08/84:02:00:00:00/e0 Emask 0x12 (ATA bus error)
Mar 2 20:25:08 Server kernel: ata3.00: status: { DRDY ERR }
Mar 2 20:25:08 Server kernel: ata3.00: error: { ICRC ABRT }
Mar 2 20:25:08 Server kernel: ata3: hard resetting link
Mar 2 20:25:08 Server kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 20:25:08 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 20:25:08 Server kernel: ata3.00: configured for UDMA/33
Mar 2 20:25:08 Server kernel: ata3: EH complete
Mar 2 20:25:08 Server kernel: ata4.00: configured for UDMA/33
Mar 2 20:25:08 Server kernel: ata4: EH complete
Mar 2 20:25:08 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:08 Server kernel: ata4.00: BMDMA stat 0x5
Mar 2 20:25:08 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns }
Mar 2 20:25:08 Server kernel: ata4.00: failed command: READ DMA EXT
Mar 2 20:25:08 Server kernel: ata4.00: cmd 25/00:78:9f:38:08/00:01:00:00:00/e0 tag 0 dma 192512 in
Mar 2 20:25:08 Server kernel: res 51/84:67:9f:38:08/84:00:00:00:00/e0 Emask 0x12 (ATA bus error)
Mar 2 20:25:08 Server kernel: ata4.00: status: { DRDY ERR }
Mar 2 20:25:08 Server kernel: ata4.00: error: { ICRC ABRT }
Mar 2 20:25:08 Server kernel: ata4: hard resetting link
Mar 2 20:25:08 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 20:25:08 Server kernel: ata4.00: configured for UDMA/33
Mar 2 20:25:08 Server kernel: ata4: EH complete
Mar 2 20:25:08 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:08 Server kernel: ata3.00: BMDMA stat 0x5
Mar 2 20:25:08 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns }
Mar 2 20:25:08 Server kernel: ata3.00: failed command: READ DMA
Mar 2 20:25:08 Server kernel: ata3.00: cmd c8/00:60:9f:41:08/00:00:00:00:00/e0 tag 0 dma 49152 in
Mar 2 20:25:08 Server kernel: res 51/84:1f:9f:41:08/84:02:00:00:00/e0 Emask 0x12 (ATA bus error)
Mar 2 20:25:08 Server kernel: ata3.00: status: { DRDY ERR }
Mar 2 20:25:08 Server kernel: ata3.00: error: { ICRC ABRT }
Mar 2 20:25:08 Server kernel: ata3: hard resetting link
Mar 2 20:25:09 Server kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 20:25:09 Server kernel: ata3.00: configured for UDMA/33
Mar 2 20:25:09 Server kernel: ata3: EH complete
Mar 2 20:25:09 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:09 Server kernel: ata4.00: BMDMA stat 0x5
Mar 2 20:25:09 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns }
Mar 2 20:25:09 Server kernel: ata4.00: failed command: READ DMA EXT
Mar 2 20:25:09 Server kernel: ata4.00: cmd 25/00:00:9f:4a:08/00:04:00:00:00/e0 tag 0 dma 524288 in
Mar 2 20:25:09 Server kernel: res 51/84:ef:9f:4a:08/84:03:00:00:00/e0 Emask 0x12 (ATA bus error)
Mar 2 20:25:09 Server kernel: ata4.00: status: { DRDY ERR }
Mar 2 20:25:09 Server kernel: ata4.00: error: { ICRC ABRT }
Mar 2 20:25:09 Server kernel: ata4: hard resetting link
Mar 2 20:25:09 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 20:25:09 Server kernel: ata4.00: configured for UDMA/33
Mar 2 20:25:09 Server kernel: ata4: EH complete
Mar 2 20:25:09 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:09 Server kernel: ata4.00: BMDMA stat 0x5
Mar 2 20:25:09 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns }
Mar 2 20:25:09 Server kernel: ata4.00: failed command: READ DMA EXT
Mar 2 20:25:09 Server kernel: ata4.00: cmd 25/00:00:9f:52:08/00:04:00:00:00/e0 tag 0 dma 524288 in
Mar 2 20:25:09 Server kernel: res 51/84:5f:9f:52:08/84:03:00:00:00/e0 Emask 0x12 (ATA bus error)
Mar 2 20:25:09 Server kernel: ata4.00: status: { DRDY ERR }
Mar 2 20:25:09 Server kernel: ata4.00: error: { ICRC ABRT }
Mar 2 20:25:09 Server kernel: ata4: hard resetting link
Mar 2 20:25:10 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 20:25:10 Server kernel: ata4.00: configured for UDMA/33
Mar 2 20:25:10 Server kernel: ata4: EH complete
Mar 2 20:25:10 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:10 Server kernel: ata4.00: BMDMA stat 0x5
Mar 2 20:25:10 Server kernel: ata4: SError: { UnrecovData Proto TrStaTrns }
Mar 2 20:25:10 Server kernel: ata4.00: failed command: READ DMA EXT
Mar 2 20:25:10 Server kernel: ata4.00: cmd 25/00:00:9f:5a:08/00:04:00:00:00/e0 tag 0 dma 524288 in
Mar 2 20:25:10 Server kernel: res 51/84:00:9f:5a:08/84:00:00:00:00/e0 Emask 0x12 (ATA bus error)
Mar 2 20:25:10 Server kernel: ata4.00: status: { DRDY ERR }
Mar 2 20:25:10 Server kernel: ata4.00: error: { ICRC ABRT }
Mar 2 20:25:10 Server kernel: ata4: hard resetting link
Mar 2 20:25:10 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 20:25:10 Server kernel: ata4.00: configured for UDMA/33
Mar 2 20:25:10 Server kernel: ata4: EH complete
Mar 2 20:25:10 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:10 Server kernel: ata3.00: BMDMA stat 0x5
Mar 2 20:25:10 Server kernel: ata3: SError: { UnrecovData Proto TrStaTrns }
Mar 2 20:25:10 Server kernel: ata3.00: failed command: READ DMA EXT
Mar 2 20:25:10 Server kernel: ata3.00: cmd 25/00:58:9f:75:08/00:03:00:00:00/e0 tag 0 dma 438272 in
Mar 2 20:25:10 Server kernel: res 51/84:37:9f:75:08/84:03:00:00:00/e0 Emask 0x12 (ATA bus error)
Mar 2 20:25:10 Server kernel: ata3.00: status: { DRDY ERR }
Mar 2 20:25:10 Server kernel: ata3.00: error: { ICRC ABRT }
Mar 2 20:25:10 Server kernel: ata3: hard resetting link
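For anyone reading a log like the one above, a quick tally of exceptions per port makes it easier to see whether one port or several are misbehaving. The following is only a minimal sketch: the function name `count_exceptions` and the sample text are illustrative, and it assumes the kernel message format shown in this thread.

```python
import re
from collections import Counter

# Matches kernel messages such as:
#   "ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6"
# and captures the port name ("ata4"). Assumes the format seen in this thread.
EXCEPTION_RE = re.compile(r"\b(ata\d+)\.\d+: exception Emask")


def count_exceptions(text):
    """Return a Counter mapping port name (e.g. 'ata3') to exception count."""
    return Counter(EXCEPTION_RE.findall(text))


# A few lines copied from the syslog in this thread, used as sample input.
SAMPLE = """\
Mar 2 20:25:08 Server kernel: ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:08 Server kernel: ata3: hard resetting link
Mar 2 20:25:08 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Mar 2 20:25:09 Server kernel: ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
"""

if __name__ == "__main__":
    for port, count in sorted(count_exceptions(SAMPLE).items()):
        print(f"{port}: {count} exception(s)")
```

Because the pattern is applied to the whole text rather than line by line, it also works on a log whose line breaks were lost when it was pasted.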
riccume Posted March 2, 2011 Author Share Posted March 2, 2011 SMART status report for the hard drive ending in 118 (I re-ran a short SMART test). Nothing new here:

Statistics for /dev/sdg 00S_WD-WCAVY3453118
smartctl -a -d ata /dev/sdg
smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: WDC WD20EADS-00S2B0
Serial Number: WD-WCAVY3453118
Firmware Version: 01.00A01
User Capacity: 2,000,398,934,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Wed Mar 2 20:33:18 2011 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x85) Offline data collection activity was aborted by an interrupting command from host. Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (41100) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x3037) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           8
  3 Spin_Up_Time            0x0027 163   151   021    Pre-fail Always  -           8841
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           971
  5 Reallocated_Sector_Ct   0x0033 200   199   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 093   093   000    Old_age  Always  -           5346
 10 Spin_Retry_Count        0x0032 100   100   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           23
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           8
193 Load_Cycle_Count        0x0032 190   190   000    Old_age  Always  -           32114
194 Temperature_Celsius     0x0022 115   102   000    Old_age  Always  -           37
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 200   200   000    Old_age  Offline -           1
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 198   198   000    Old_age  Offline -           405

SMART Error Log Version: 1
Warning: ATA error count 36 inconsistent with error log pointer 4
ATA Error Count: 36 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 36 occurred at disk power-on lifetime: 1102 hours (45 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 2f 9b 16 e0 Error: UNC 8 sectors at LBA = 0x00169b2f = 1481519
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 2f 9b 16 e0 08 45d+12:26:21.512 READ DMA
ef 10 02 00 00 00 a0 08 45d+12:26:21.512 SET FEATURES [Reserved for Serial ATA]
ec 00 00 00 00 00 a0 08 45d+12:26:21.509 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 45d+12:26:21.509 SET FEATURES [set transfer mode]

Error 35 occurred at disk power-on lifetime: 1102 hours (45 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 31 9b 16 e0 Error: UNC 8 sectors at LBA = 0x00169b31 = 1481521
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 2f 9b 16 e0 08 45d+12:26:18.049 READ DMA
ef 10 02 00 00 00 a0 08 45d+12:26:18.049 SET FEATURES [Reserved for Serial ATA]
ec 00 00 00 00 00 a0 08 45d+12:26:18.046 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 45d+12:26:18.046 SET FEATURES [set transfer mode]

Error 34 occurred at disk power-on lifetime: 1102 hours (45 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 2f 9b 16 e0 Error: UNC 8 sectors at LBA = 0x00169b2f = 1481519
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 2f 9b 16 e0 08 45d+12:26:15.551 READ DMA
ef 10 02 00 00 00 a0 08 45d+12:26:15.551 SET FEATURES [Reserved for Serial ATA]
ec 00 00 00 00 00 a0 08 45d+12:26:15.548 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 45d+12:26:15.548 SET FEATURES [set transfer mode]

SMART Self-test log structure revision number 1
Num Test_Description Status                  Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline    Completed without error 00%       5346            -
# 2 Short offline    Aborted by host         90%       5346            -
# 3 Short offline    Completed: read failure 90%       1190            1481530

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
   1       0       0 Not_testing
   2       0       0 Not_testing
   3       0       0 Not_testing
   4       0       0 Not_testing
   5       0       0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
riccume Posted March 2, 2011 Author Share Posted March 2, 2011 And the one for the other hard drive, which is connected via the PCI SATA card (again, I re-ran a short SMART test):

Statistics for /dev/sde 00M_WD-WCAZA0670624
smartctl -a -d ata /dev/sde
smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: WDC WD20EARS-00MVWB0
Serial Number: WD-WCAZA0670624
Firmware Version: 51.0AB51
User Capacity: 2,000,398,934,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Wed Mar 2 20:39:37 2011 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (39000) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0027 175   171   021    Pre-fail Always  -           6241
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           520
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 096   096   000    Old_age  Always  -           3531
 10 Spin_Retry_Count        0x0032 100   100   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           15
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           10
193 Load_Cycle_Count        0x0032 196   196   000    Old_age  Always  -           14281
194 Temperature_Celsius     0x0022 120   107   000    Old_age  Always  -           30
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 200   200   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   200   000    Old_age  Offline -           0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status                  Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline    Completed without error 00%       3531            -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
   1       0       0 Not_testing
   2       0       0 Not_testing
   3       0       0 Not_testing
   4       0       0 Not_testing
   5       0       0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
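When comparing reports like the two above, it can help to pull out a handful of attributes and check whether their raw values are non-zero. The sketch below is one way to do that, not an official tool: the watch list (IDs 5, 197, 198, 199) is my own assumption about which attributes are worth a first look, the function name `nonzero_watched` is made up for illustration, and a non-zero raw value is only a hint, not a verdict on the drive.

```python
import re

# One row of the smartctl attribute table, as printed by smartctl 5.38 in this
# thread: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ROW_RE = re.compile(
    r"\b(\d{1,3})\s+([A-Za-z][\w-]*)\s+0x[0-9a-f]{4}\s+\d{3}\s+\d{3}\s+\d{3}\s+"
    r"(?:Pre-fail|Old_age)\s+(?:Always|Offline)\s+\S+\s+(\d+)"
)

# Assumption: these IDs are a reasonable first-look watch list.
WATCHED_IDS = {5, 197, 198, 199}


def nonzero_watched(report):
    """Return (id, name, raw_value) for watched attributes with a non-zero raw value."""
    rows = ((int(i), name, int(raw)) for i, name, raw in ROW_RE.findall(report))
    return [row for row in rows if row[0] in WATCHED_IDS and row[2] > 0]


# Rows copied from the first report in this thread (the drive ending in 118).
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033 200 199 140 Pre-fail Always  - 0
197 Current_Pending_Sector  0x0032 200 200 000 Old_age  Always  - 0
198 Offline_Uncorrectable   0x0030 200 200 000 Old_age  Offline - 1
199 UDMA_CRC_Error_Count    0x0032 200 200 000 Old_age  Always  - 0
200 Multi_Zone_Error_Rate   0x0008 198 198 000 Old_age  Offline - 405
"""

if __name__ == "__main__":
    for attr_id, name, raw in nonzero_watched(SAMPLE):
        print(f"{attr_id} {name}: raw value {raw}")
```

The pattern tolerates any amount of whitespace between columns, so it should cope with both aligned and flattened copies of the table.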
SSD Posted March 2, 2011 Share Posted March 2, 2011 Try swapping the cables between the two drives on the PCI card. Do it at the disk end. See whether the problems follow the disk to the other port, or stay with the port and move to the other disk.
riccume Posted March 2, 2011 Author Share Posted March 2, 2011 Thanks bjp999, will try right away. Question - is there a way to tell from the syslog which disk is experiencing the problem? I cannot figure it out.
SSD Posted March 2, 2011 Share Posted March 2, 2011 You should post the full syslog. It actually looks like two disks, ata3 and ata4, are acting up. If they are both on that PCI card, it could mean the controller is failing or not securely plugged into its slot.
riccume Posted March 2, 2011 Author Share Posted March 2, 2011 So I swapped the cables; syslog attached (I had to split it into two files). Question - how do I make the link between "ata3 and ata4" and the drives themselves? I'm not familiar with it, sorry. I have also attached SMART reports for all drives. I stand corrected on one point - the two drives connected via the PCI card are the ones ending in 073 and 381. Thanks! syslog-2011-03-02a.txt syslog-2011-03-02b.txt smarts.txt
SSD Posted March 3, 2011 Share Posted March 3, 2011 Look at the section below. No doubt that ata3 and ata4 are both on your PCI I/O card - it looks like there may be a PATA port on the card too that you are not using. They are both WD 2 TB EARS drives. Not sure which is which, but they are both generating errors. (Sorry, I missed this earlier - I thought one disk was fine and the other was generating errors.) If it works flawlessly with one disk and not with two, the controller might be broken. You might want to hook up each drive to the controller on its own (leaving the second one disconnected) and see if it comes up fine. If so, you will have verified your disks are good. The only thing left would be the controller (which could be bad, incompatible with unRAID, or incompatible with your motherboard).
Mar 2 21:49:01 Server kernel: ata3: SATA max UDMA/133 port i16@0xd000 bmdma 0xd400 irq 20
Mar 2 21:49:01 Server kernel: ata4: SATA max UDMA/133 port i16@0xd100 bmdma 0xd408 irq 20
Mar 2 21:49:01 Server kernel: ata5: PATA max UDMA/133 port i16@0xd200 bmdma 0xd410 irq 20
Mar 2 21:49:01 Server kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 21:49:01 Server kernel: ata3.00: ATA-8: WDC WD20EARS-00J2GB0, 80.00A80, max UDMA/133
Mar 2 21:49:01 Server kernel: ata3.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Mar 2 21:49:01 Server kernel: ata3.00: configured for UDMA/133
Mar 2 21:49:01 Server kernel: scsi 3:0:0:0: Direct-Access ATA WDC WD20EARS-00J 80.0 PQ: 0 ANSI: 5
Mar 2 21:49:01 Server kernel: sd 3:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Mar 2 21:49:01 Server kernel: sd 3:0:0:0: [sdc] Write Protect is off
Mar 2 21:49:01 Server kernel: sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
Mar 2 21:49:01 Server kernel: sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 2 21:49:01 Server kernel: sdc:
Mar 2 21:49:01 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 21:49:01 Server kernel: ata4.00: ATA-8: WDC WD20EARS-00MVWB0, 51.0AB51, max UDMA/133
Mar 2 21:49:01 Server kernel: ata4.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Mar 2 21:49:01 Server kernel: ata4.00: configured for UDMA/133
Mar 2 21:49:01 Server kernel: scsi 4:0:0:0: Direct-Access ATA WDC WD20EARS-00M 51.0 PQ: 0 ANSI: 5
Mar 2 21:49:01 Server kernel: sd 4:0:0:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Mar 2 21:49:01 Server kernel: sd 4:0:0:0: [sdd] Write Protect is off
Mar 2 21:49:01 Server kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Mar 2 21:49:01 Server kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
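On the question of linking "ata3 and ata4" to actual drives: the boot-time excerpt above contains both the model string reported on each port and the `sdX` name assigned to each SCSI host, so the mapping can be read off by hand or extracted with a small script. This is only a sketch under a loud assumption: it pairs `ataN` with SCSI host `N` because that is how the numbers line up in this particular excerpt, which is not guaranteed on other systems. The function name `map_ports` is illustrative.

```python
import re

# "ata3.00: ATA-8: WDC WD20EARS-00J2GB0, 80.00A80, max UDMA/133"
MODEL_RE = re.compile(r"\b(ata(\d+))\.00: ATA-\d+: ([^,]+),")
# "sd 3:0:0:0: [sdc] 3907029168 512-byte logical blocks: ..."
DISK_RE = re.compile(r"\bsd (\d+):0:0:0: \[(sd[a-z]+)\] \d+ 512-byte logical blocks")


def map_ports(text):
    """Map port name -> (model, device name or None).

    ASSUMPTION: SCSI host number N belongs to port ataN, as in this thread's
    excerpt. Verify against your own log before relying on it.
    """
    disks = {int(host): dev for host, dev in DISK_RE.findall(text)}
    return {
        port: (model.strip(), disks.get(int(num)))
        for port, num, model in MODEL_RE.findall(text)
    }


# Lines copied from the excerpt posted in this thread.
SAMPLE = """\
Mar 2 21:49:01 Server kernel: ata3.00: ATA-8: WDC WD20EARS-00J2GB0, 80.00A80, max UDMA/133
Mar 2 21:49:01 Server kernel: sd 3:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Mar 2 21:49:01 Server kernel: ata4.00: ATA-8: WDC WD20EARS-00MVWB0, 51.0AB51, max UDMA/133
Mar 2 21:49:01 Server kernel: sd 4:0:0:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
"""

if __name__ == "__main__":
    for port, (model, dev) in sorted(map_ports(SAMPLE).items()):
        print(f"{port}: {model} -> {dev}")
```

The model string can then be matched against the "Device Model" line of each SMART report to find the serial number of the drive on that port.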
S80_UK Posted March 3, 2011 Share Posted March 3, 2011 May I chip in...? I have seen reports on the web (but not confirmed myself) that the VT6421 does not play well with the WD20EARS. I have used these drives on a controller using the SIL3114 (not yet with unRAID but I shall soon) and the drives seem quite happy. I picked this thread up because I am also planning to use the GA-D510UD motherboard, but with the SIL controller, since that's what I already have.
riccume Posted March 3, 2011 Author Share Posted March 3, 2011 Thanks bjp999 and S80_UK. bjp999, when you say "see if it comes up fine" what are you referring to exactly, i.e. what should I look for?
SSD Posted March 3, 2011 Share Posted March 3, 2011 If there are known compatibility issues, you might just want to get a different controller card.
riccume Posted March 3, 2011 Author Share Posted March 3, 2011 Wise advice. Just ordered a SATA PCI card with a SIL3114 controller, will report back once tested. Thanks.
S80_UK Posted March 4, 2011 Share Posted March 4, 2011 With luck, I'll be assembling my new rig this weekend - will also let you know how it goes.
riccume Posted March 9, 2011 Author Share Posted March 9, 2011 All good guys. I replaced the PCI SATA card with the VT6421 controller with one with a SIL3114 controller (right out of the box, no flashing a new BIOS or anything like that), launched a parity check and it is now running at about 50 MB/sec - as fast as it was before adding the second hard drive on the PCI card. So the VT6421 controller does indeed seem to have been the problem, at least with my WD20EARS. Thanks!
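To put the two speeds in this thread into perspective, here is a rough estimate of how long a parity check over a 2 TB drive would take at each rate. It is back-of-the-envelope arithmetic only, and it assumes that the check reads the full 2,000,398,934,016 bytes reported by smartctl, that the rate stays constant, that "MB" means 10^6 bytes, and that the speeds quoted at the start of the thread were megabytes per second. The function name `parity_check_hours` is illustrative.

```python
# Capacity reported by smartctl for the 2 TB drives in this thread.
DRIVE_BYTES = 2_000_398_934_016


def parity_check_hours(rate_mb_per_s, size_bytes=DRIVE_BYTES):
    """Estimated duration in hours at a constant rate (1 MB = 10**6 bytes)."""
    seconds = size_bytes / (rate_mb_per_s * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    # 3 MB/s with the VT6421 card and two drives; about 50 MB/s with the SIL3114 card.
    for rate in (3, 50):
        print(f"{rate} MB/s -> about {parity_check_hours(rate):.1f} hours")
```

Under those assumptions the slow case works out to roughly a week of continuous running, against about half a day at 50 MB/s.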