Moussa Posted August 1, 2022
For the last couple of months I've had a disk in my array that sometimes produces errors like the ones below:

Aug 1 08:16:50 Blackbox kernel: ata6: hard resetting link
Aug 1 08:16:56 Blackbox kernel: ata6: link is slow to respond, please be patient (ready=0)
Aug 1 08:17:00 Blackbox kernel: ata6: COMRESET failed (errno=-16)
Aug 1 08:17:00 Blackbox kernel: ata6: hard resetting link
Aug 1 08:17:01 Blackbox kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 1 08:17:01 Blackbox kernel: ata6.00: configured for UDMA/133
Aug 1 08:17:01 Blackbox kernel: ata6: EH complete
Aug 1 08:54:37 Blackbox kernel: ata6.00: exception Emask 0x50 SAct 0x40000 SErr 0x4890800 action 0xe frozen
Aug 1 08:54:37 Blackbox kernel: ata6.00: irq_stat 0x0c400040, interface fatal error, connection status changed
Aug 1 08:54:37 Blackbox kernel: ata6: SError: { HostInt PHYRdyChg 10B8B LinkSeq DevExch }
Aug 1 08:54:37 Blackbox kernel: ata6.00: failed command: READ FPDMA QUEUED
Aug 1 08:54:37 Blackbox kernel: ata6.00: cmd 60/00:90:70:bc:bb/04:00:8c:00:00/40 tag 18 ncq dma 524288 in
Aug 1 08:54:37 Blackbox kernel: ata6.00: status: { DRDY }
Aug 1 08:54:37 Blackbox kernel: ata6: hard resetting link
Aug 1 08:54:43 Blackbox kernel: ata6: link is slow to respond, please be patient (ready=0)
Aug 1 08:54:47 Blackbox kernel: ata6: COMRESET failed (errno=-16)
Aug 1 08:54:47 Blackbox kernel: ata6: hard resetting link

Initially I put it down to an issue with the motherboard, cables, or power supply, but I've since swapped cables and moved the drive to another motherboard SATA port without any success. This is now causing some parity check errors, and I think the drive is at fault even though SMART says the drive is fine. Should I replace this drive now, or is there something else I can try? Thanks in advance for any help.

blackbox-diagnostics-20220801-1600.zip
blackbox-smart-20220801-1613.zip
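
For anyone trying to judge how often a port is resetting, a log like the one above can be summarized with a short script. The following is only a minimal sketch in Python, not anything posted in this thread: the function names, the event labels, and the choice of which messages to count are assumptions made for illustration. The sample lines are copied from the log above. The `decode_sstatus` helper assumes the usual SATA SStatus layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and that the kernel prints the register in hexadecimal.

```python
import re
from collections import Counter

# A few lines copied from the log in the first post.
SAMPLE_LOG = """\
Aug 1 08:16:50 Blackbox kernel: ata6: hard resetting link
Aug 1 08:17:00 Blackbox kernel: ata6: COMRESET failed (errno=-16)
Aug 1 08:17:01 Blackbox kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 1 08:54:37 Blackbox kernel: ata6.00: exception Emask 0x50 SAct 0x40000 SErr 0x4890800 action 0xe frozen
Aug 1 08:54:37 Blackbox kernel: ata6: SError: { HostInt PHYRdyChg 10B8B LinkSeq DevExch }
Aug 1 08:54:37 Blackbox kernel: ata6.00: failed command: READ FPDMA QUEUED
Aug 1 08:54:37 Blackbox kernel: ata6: hard resetting link
"""

# Message fragments to count. The labels on the left are hypothetical
# names chosen for this sketch; the fragments come from the log above.
PATTERNS = {
    "hard_reset": "hard resetting link",
    "comreset_failed": "COMRESET failed",
    "exception": "exception Emask",
    "failed_command": "failed command",
}

# Captures the port (e.g. "ata6") and the message text; an optional
# device suffix such as ".00" is skipped so events group by port.
LINE_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")


def summarize(log_text):
    """Return {port: Counter of event labels} for the given log text."""
    counts = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        port, message = match.groups()
        for label, fragment in PATTERNS.items():
            if fragment in message:
                counts.setdefault(port, Counter())[label] += 1
    return counts


def decode_sstatus(hex_text):
    """Split an SStatus value (hex string, as logged) into its fields.

    Assumed layout: DET = bits 0-3, SPD = bits 4-7, IPM = bits 8-11.
    """
    value = int(hex_text, 16)
    return {
        "det": value & 0xF,
        "spd": (value >> 4) & 0xF,
        "ipm": (value >> 8) & 0xF,
    }


if __name__ == "__main__":
    for port, events in summarize(SAMPLE_LOG).items():
        print(port, dict(events))
    print(decode_sstatus("113"))
```

Under that assumed layout, an SPD field of 1 corresponds to the 1.5 Gbps link speed reported on the same log line. A count like this does not say whether the drive, the cable, or the power supply is at fault; it only makes it easier to see whether the resets become rarer after a hardware change.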
ChatNoir Posted August 1, 2022
You should run an extended SMART test (deactivate spin down for the drive) and post your diagnostics when it's finished. It will take some time on a 10TB drive.
JorgeB Posted August 1, 2022
That looks more like a power/connection problem, and the disk looks healthy, but if the cables were replaced/swapped and the problem persists, it might be a disk problem. As mentioned above, you should run an extended SMART test, since the ones that were started never finished, likely due to spin down.
ChatNoir Posted August 1, 2022
6 minutes ago, JorgeB said: "the ones started never finished, likely due to spin down."
If memory serves, it was a long time ago anyway.
Moussa Posted August 1, 2022 (Author)
Thanks both, I'll run the extended SMART test and see what comes out.
Moussa Posted August 2, 2022 (Author)
Unfortunately I cannot get an extended SMART test to complete. It gets aborted early with an `Aborted by host` message, and I'm not sure why. I have the spin down delay set to `Never` on that drive, so it's not being spun down.
Moussa Posted August 3, 2022 (Author)
After several attempts I managed to get an extended SMART test to complete, and it did so without any errors. So now I'm at a loss as to what the issue is. Faulty SATA connector on the drive?

blackbox-smart-20220803-1226.zip
JorgeB Posted August 3, 2022
SMART cannot always identify issues. I would suggest swapping cables again with a different disk to make sure, and if it happens again, replace the disk.
Moussa Posted August 3, 2022 (Author)
Thanks, I'll try to change out the cable again at some point.
Moussa Posted October 21, 2022 (Author)
As an update to this: after much hair-tearing, swapping of cables, and waiting for the issue to reappear, I finally decided to replace the PSU on a suspicion after reading some other forum threads. I upgraded to a newer, beefier PSU and it looks like the errors have disappeared now, but I still have my fingers crossed that they won't reappear in the near future. I'm not sure whether my PSU was faulty or I had too many drives on each cable, but it turned out to be the culprit.