March 19, 2013

Hi

I had a drive redballed yesterday. The syslog contains the following lines (the redballed drive is sdb):

Mar 18 15:51:41 shortie kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 18 15:51:41 shortie kernel: ata2.00: failed command: READ DMA EXT
Mar 18 15:51:41 shortie kernel: ata2.00: cmd 25/00:c0:c8:e7:f3/00:00:5a:01:00/e0 tag 0 dma 98304 in
Mar 18 15:51:41 shortie kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Mar 18 15:51:41 shortie kernel: ata2.00: status: { DRDY }
Mar 18 15:51:41 shortie kernel: ata2: hard resetting link
Mar 18 15:51:51 shortie kernel: ata2: softreset failed (device not ready)
Mar 18 15:51:51 shortie kernel: ata2: hard resetting link
Mar 18 15:52:01 shortie kernel: ata2: softreset failed (device not ready)
Mar 18 15:52:01 shortie kernel: ata2: hard resetting link
Mar 18 15:52:12 shortie kernel: ata2: link is slow to respond, please be patient (ready=0)
Mar 18 15:52:36 shortie kernel: ata2: softreset failed (device not ready)
Mar 18 15:52:36 shortie kernel: ata2: limiting SATA link speed to 1.5 Gbps
Mar 18 15:52:36 shortie kernel: ata2: hard resetting link
Mar 18 15:52:41 shortie kernel: ata2: softreset failed (device not ready)
Mar 18 15:52:41 shortie kernel: ata2: reset failed, giving up
Mar 18 15:52:41 shortie kernel: ata2.00: disabled
Mar 18 15:52:41 shortie kernel: ata2.00: device reported invalid CHS sector 0
Mar 18 15:52:41 shortie kernel: ata2: EH complete
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] CDB: cdb[0]=0x88: 88 00 00 00 00 01 5a f3 e7 c8 00 00 00 c0 00 00
Mar 18 15:52:41 shortie kernel: end_request: I/O error, dev sdb, sector 5820901320
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] CDB: cdb[0]=0x88: 88 00 00 00 00 01 5a f3 e8 88 00 00 00 48 00 00
Mar 18 15:52:41 shortie kernel: end_request: I/O error, dev sdb, sector 5820901512
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] CDB: cdb[0]=0x8a: 8a 00 00 00 00 01 5a f3 e7 b8 00 00 00 10 00 00
Mar 18 15:52:41 shortie kernel: end_request: I/O error, dev sdb, sector 5820901304
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901256/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901264/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901272/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901280/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901288/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901296/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901304/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901312/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901320/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901328/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901336/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901344/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901352/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901360/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901368/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901376/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901384/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901392/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901400/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901408/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901416/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901424/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901432/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901440/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901448/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901456/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901464/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901472/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901480/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901488/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901496/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901504/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 read error
Mar 18 15:52:41 shortie kernel: handle_stripe read error: 5820901512/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901240/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901248/1, count: 1
Mar 18 15:52:41 shortie kernel: md: recovery thread woken up ...
Mar 18 15:52:41 shortie kernel: md: recovery thread has nothing to resync
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] Unhandled error code
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00
Mar 18 15:52:41 shortie kernel: sd 1:0:0:0: [sdb] CDB: cdb[0]=0x8a: 8a 00 00 00 00 01 5a f3 e7 c8 00 00 01 08 00 00
Mar 18 15:52:41 shortie kernel: end_request: I/O error, dev sdb, sector 5820901320
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901256/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901264/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901272/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901280/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901288/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901296/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901304/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901312/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901320/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901328/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901336/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901344/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901352/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901360/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901368/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901376/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901384/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901392/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901400/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901408/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901416/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901424/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901432/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901440/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901448/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901456/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901464/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901472/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901480/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901488/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901496/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901504/1, count: 1
Mar 18 15:52:41 shortie kernel: md: disk1 write error
Mar 18 15:52:41 shortie kernel: handle_stripe write error: 5820901512/1, count: 1

When I tried to run a SMART test on the drive it failed with ioctl error: -5

After a clean powerdown and a quick reseat of the drives, the drive is accessible but obviously still redballed.
The SMART results look clean to me (but I'm a rookie):

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   135   135   054    Pre-fail  Offline      -       86
  3 Spin_Up_Time            0x0007   126   126   024    Pre-fail  Always       -       615 (Average 615)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       967
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   135   135   020    Pre-fail  Offline      -       26
  9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       14099
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       63
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1085
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1085
194 Temperature_Celsius     0x0002   176   176   000    Old_age   Always       -       34 (Min/Max 22/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

I've bought a couple of replacement drives (one to rebuild onto tonight; one to preclear and keep as a hot spare). I'll obviously stress test the redballed drive once the array is protected again, but given the above, is this likely a drive issue or a problem with the PC/controller?

The server is built in an HP ProLiant MicroServer (36L); the disks are Hitachi 7K3000s. Unraid is 5.0 beta 8. Full syslog attached as a zipfile.

Thanks
Eric

syslog-20130318-203007.zip
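For anyone reading the log later: the CDB lines can be cross-checked against the reported sectors. This is only a rough sketch, assuming the standard 16-byte SCSI READ(16)/WRITE(16) layout (opcode in byte 0, big-endian LBA in bytes 2-9, transfer length in bytes 10-13) and 512-byte logical sectors. The function name `decode_rw16` is made up for illustration.

```python
def decode_rw16(cdb_hex, sector_size=512):
    """Decode a READ(16)/WRITE(16) CDB given as space-separated hex bytes.

    sector_size=512 is an assumption about the drive's logical sector size.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16:
        raise ValueError("expected a 16-byte CDB")
    opcodes = {0x88: "READ(16)", 0x8A: "WRITE(16)"}
    if cdb[0] not in opcodes:
        raise ValueError("not a READ(16)/WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")       # starting logical block address
    blocks = int.from_bytes(cdb[10:14], "big")   # number of blocks requested
    return {
        "op": opcodes[cdb[0]],
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * sector_size,
    }


# First failed command from the syslog above.
print(decode_rw16("88 00 00 00 00 01 5a f3 e7 c8 00 00 00 c0 00 00"))
```

The decoded LBA for that first CDB is 5820901320, the same sector the kernel reports in the following end_request line, and 192 blocks of 512 bytes is the 98304 shown in the "dma 98304 in" line.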
March 19, 2013

Run a long SMART test on the drive. Based on the Power-Off_Retract_Count it could be a power issue.
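To put numbers on that suggestion, here is a small sketch that parses attribute rows in the layout posted above and pulls out the raw counters. The rows in `SAMPLE` are copied from the first post; the function names are hypothetical, and the parser assumes the ten-column smartctl attribute layout with an integer as the first token of RAW_VALUE. What a given ratio actually means for this drive model is not established in this thread.

```python
SAMPLE = """\
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       967
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       63
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1085
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1085
"""


def parse_smart(text):
    """Return {attribute_name: {value, worst, thresh, raw}} for each attribute row."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split(None, 9)  # RAW_VALUE may itself contain spaces
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip the header and anything that is not an attribute row
        attrs[parts[1]] = {
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "raw": int(parts[9].split()[0]),
        }
    return attrs


def at_or_below_threshold(attrs):
    """Names of attributes whose normalized value has reached a nonzero threshold."""
    return [n for n, a in attrs.items() if a["thresh"] > 0 and a["value"] <= a["thresh"]]


attrs = parse_smart(SAMPLE)
retracts = attrs["Power-Off_Retract_Count"]["raw"]
cycles = attrs["Power_Cycle_Count"]["raw"]
print(f"power-off retracts: {retracts}, power cycles: {cycles}")
print("at or below threshold:", at_or_below_threshold(attrs))
```

Note that Power-Off_Retract_Count and Load_Cycle_Count carry the same raw value (1085) in the posted output, so the high retract count on its own may not be conclusive.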
March 20, 2013 (Author)

Thanks,

I ran a couple of long SMART tests and the results looked identical to me.

I've swapped out the drive (for a Toshiba 3TB equivalent) and the array is fault tolerant again. I'm now preclearing a second new 3TB disk to keep in a box as a spare, and the original is on a 4x preclear stress test in my backup server.

As an aside, I'm noticing how much quicker the new 1TB-platter drives are at sequential reads and writes than my Hitachi 7K3000s; it's really visible on the preclears, which I started at about the same time.

What is an ioctl error? Google doesn't seem to help much.

Eric
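On the ioctl question: one plausible reading, not confirmed anywhere in this thread, is that the -5 is a negated Linux errno returned when the ioctl call to the disk failed. Under that assumption the code can be looked up with Python's standard library; `describe_errno` is a hypothetical helper name.

```python
import errno
import os


def describe_errno(code):
    """Map a (possibly negated) errno value to its symbolic name and message."""
    number = abs(code)
    return errno.errorcode.get(number, "UNKNOWN"), os.strerror(number)


name, message = describe_errno(-5)
print(name, "-", message)
```

On Linux, errno 5 is EIO, a generic input/output error, which would be consistent with the kernel having disabled ata2.00 in the syslog by the time the SMART test was attempted.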