Kandinsky Posted September 18, 2013

Dear Experts,

Please see below an extract from the log file for a specific HDD that keeps coming up with the same errors over and over. Is the drive faulty or should I investigate something else please?

Sep 18 18:27:09 GOOGOLPLEX kernel: ata1.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen (Errors)
Sep 18 18:27:09 GOOGOLPLEX kernel: ata1.00: irq_stat 0x08000000, interface fatal error (Errors)
Sep 18 18:27:09 GOOGOLPLEX kernel: ata1: SError: { UnrecovData HostInt 10B8B BadCRC } (Errors)
Sep 18 18:27:09 GOOGOLPLEX kernel: ata1.00: failed command: IDENTIFY DEVICE (Minor Issues)
Sep 18 18:27:09 GOOGOLPLEX kernel: ata1.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in (Drive related)
Sep 18 18:27:09 GOOGOLPLEX kernel: res 50/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x50 (ATA bus error) (Errors)
Sep 18 18:27:09 GOOGOLPLEX kernel: ata1.00: status: { DRDY } (Drive related)
Sep 18 18:27:09 GOOGOLPLEX kernel: ata1: hard resetting link (Minor Issues)
Sep 18 18:27:10 GOOGOLPLEX kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) (Drive related)
Sep 18 18:27:10 GOOGOLPLEX kernel: ata1.00: configured for UDMA/33 (Drive related)
Sep 18 18:27:10 GOOGOLPLEX kernel: ata1: EH complete (Drive related)
Sep 18 18:31:54 GOOGOLPLEX kernel: ata1.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen (Errors)
Sep 18 18:31:54 GOOGOLPLEX kernel: ata1.00: irq_stat 0x08000000, interface fatal error (Errors)
Sep 18 18:31:54 GOOGOLPLEX kernel: ata1: SError: { UnrecovData HostInt 10B8B BadCRC } (Errors)
Sep 18 18:31:54 GOOGOLPLEX kernel: ata1.00: failed command: IDENTIFY DEVICE (Minor Issues)
Sep 18 18:31:54 GOOGOLPLEX kernel: ata1.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in (Drive related)
Sep 18 18:31:54 GOOGOLPLEX kernel: res 50/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x50 (ATA bus error) (Errors)
Sep 18 18:31:54 GOOGOLPLEX kernel: ata1.00: status: { DRDY } (Drive related)
Sep 18 18:31:54 GOOGOLPLEX kernel: ata1: hard resetting link (Minor Issues)
Sep 18 18:31:55 GOOGOLPLEX kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) (Drive related)
Sep 18 18:31:55 GOOGOLPLEX kernel: ata1.00: configured for UDMA/33 (Drive related)
Sep 18 18:31:55 GOOGOLPLEX kernel: ata1: EH complete (Drive related)

SMART Report:

ATA Error Count: 171 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 171 occurred at disk power-on lifetime: 1859 hours (77 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 01 00 00 00 00 08      00:04:43.993  IDENTIFY DEVICE
  b0 da 00 00 4f c2 00 08      00:04:43.992  SMART RETURN STATUS
  b0 d1 01 01 4f c2 00 08      00:04:43.992  SMART READ ATTRIBUTE THRESHOLDS [OBS-4]
  b0 d0 01 00 4f c2 00 08      00:04:43.992  SMART READ DATA
  ec 00 01 00 00 00 00 08      00:04:43.992  IDENTIFY DEVICE

Error 170 occurred at disk power-on lifetime: 1859 hours (77 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 4f c2 00  Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  b0 d1 01 01 4f c2 00 08      00:04:43.965  SMART READ ATTRIBUTE THRESHOLDS [OBS-4]
  b0 d0 01 00 4f c2 00 08      00:04:43.965  SMART READ DATA
  ec 00 01 00 00 00 00 08      00:04:43.965  IDENTIFY DEVICE
  e5 00 00 00 00 00 00 08      00:04:43.965  CHECK POWER MODE
  ec 00 01 00 00 00 00 08      00:04:43.965  IDENTIFY DEVICE

Error 169 occurred at disk power-on lifetime: 1859 hours (77 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 4f c2 00  Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  b0 d1 01 01 4f c2 00 08      00:04:43.932  SMART READ ATTRIBUTE THRESHOLDS [OBS-4]
  b0 d0 01 00 4f c2 00 08      00:04:43.931  SMART READ DATA
  ec 00 01 00 00 00 00 08      00:04:43.931  IDENTIFY DEVICE
  e5 00 00 00 00 00 00 08      00:04:43.931  CHECK POWER MODE
  ec 00 01 00 00 00 00 08      00:04:43.931  IDENTIFY DEVICE

Error 168 occurred at disk power-on lifetime: 1859 hours (77 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 4f c2 00  Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  b0 d1 01 01 4f c2 00 08      00:04:43.872  SMART READ ATTRIBUTE THRESHOLDS [OBS-4]
  b0 d0 01 00 4f c2 00 08      00:04:43.871  SMART READ DATA
  ec 00 01 00 00 00 00 08      00:04:43.871  IDENTIFY DEVICE
  e5 00 00 00 00 00 00 08      00:04:43.871  CHECK POWER MODE
  ec 00 01 00 00 00 00 08      00:04:43.871  IDENTIFY DEVICE

Error 167 occurred at disk power-on lifetime: 1859 hours (77 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 01 00 00 00 00 08      00:04:43.871  IDENTIFY DEVICE
  e5 00 00 00 00 00 00 08      00:04:43.871  CHECK POWER MODE
  ec 00 01 00 00 00 00 08      00:04:43.871  IDENTIFY DEVICE
  b0 d1 01 01 4f c2 00 08      00:04:43.869  SMART READ ATTRIBUTE THRESHOLDS [OBS-4]
  b0 d0 01 00 4f c2 00 08      00:04:43.868  SMART READ DATA

Many thanks!
garycase Posted September 18, 2013

When SMART reports errors repeatedly, it's time to replace the drive.
Kandinsky Posted September 27, 2013 (Author)

Interestingly, I have resolved the issue. It transpires there is a problem with the backplane on my 5-drive caddy, which is evidently causing SATA errors that then seem to screw up the drives, or at least unRAID thinks so. With the backplane taken out of the equation and the drive linked directly to the M1015, all is well. Very strange!
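
For anyone who lands here with the same symptoms: the SError value in the kernel log in the first post already leans towards a link problem rather than a failing platter. BadCRC and 10B8B are raised at the SATA link layer (a CRC mismatch and a 10b-to-8b decode error), which is the kind of thing a marginal backplane, connector or cable tends to produce. Below is a minimal Python sketch that decodes the SErr hex value into the flag names the kernel prints. The function name decode_serr is made up for illustration, and the bit table is my recollection of the names used by the Linux libata driver; only the four bits that appear in the log above are confirmed by this thread, so treat the rest as an assumption and check it against the kernel source.

```python
# Bit position -> short name, as printed by the kernel in "SError: { ... }".
# Assumption: this table follows the SATA SError register layout as named
# by libata. Only bits 8, 11, 19 and 21 are confirmed by the log in this thread.
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serr(value):
    """Return the names of all SError flags set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # SErr value copied from the log lines in the first post.
    print(decode_serr(0x280900))
    # ['UnrecovData', 'HostInt', '10B8B', 'BadCRC']
```

The decoded list matches the "SError: { UnrecovData HostInt 10B8B BadCRC }" line in the log, so the table is at least consistent for those four bits. It is a hint, not proof: swapping the cable or bypassing the backplane, as done here, is still the real test.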
garycase Posted September 27, 2013

Indeed interesting, but not all that unusual. I've long recommended using the CoolerMaster 4-in-3s instead of hot-swap cages: they have much better cooling and eliminate the additional connectors of a hot-swap cage. [And they're significantly less expensive as well.]

http://www.newegg.com/Product/Product.aspx?Item=N82E16817993002