Faulty disk or something else?



For the last couple of months I've had a disk in my array that sometimes produces errors like the ones below:

 

Aug  1 08:16:50 Blackbox kernel: ata6: hard resetting link
Aug  1 08:16:56 Blackbox kernel: ata6: link is slow to respond, please be patient (ready=0)
Aug  1 08:17:00 Blackbox kernel: ata6: COMRESET failed (errno=-16)
Aug  1 08:17:00 Blackbox kernel: ata6: hard resetting link
Aug  1 08:17:01 Blackbox kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug  1 08:17:01 Blackbox kernel: ata6.00: configured for UDMA/133
Aug  1 08:17:01 Blackbox kernel: ata6: EH complete
Aug  1 08:54:37 Blackbox kernel: ata6.00: exception Emask 0x50 SAct 0x40000 SErr 0x4890800 action 0xe frozen
Aug  1 08:54:37 Blackbox kernel: ata6.00: irq_stat 0x0c400040, interface fatal error, connection status changed
Aug  1 08:54:37 Blackbox kernel: ata6: SError: { HostInt PHYRdyChg 10B8B LinkSeq DevExch }
Aug  1 08:54:37 Blackbox kernel: ata6.00: failed command: READ FPDMA QUEUED
Aug  1 08:54:37 Blackbox kernel: ata6.00: cmd 60/00:90:70:bc:bb/04:00:8c:00:00/40 tag 18 ncq dma 524288 in
Aug  1 08:54:37 Blackbox kernel: ata6.00: status: { DRDY }
Aug  1 08:54:37 Blackbox kernel: ata6: hard resetting link
Aug  1 08:54:43 Blackbox kernel: ata6: link is slow to respond, please be patient (ready=0)
Aug  1 08:54:47 Blackbox kernel: ata6: COMRESET failed (errno=-16)
Aug  1 08:54:47 Blackbox kernel: ata6: hard resetting link

 

Initially I put it down to the motherboard, cables, or power supply, but I've since swapped cables and moved the drive to another motherboard SATA port, and the errors persist. They are now causing some parity check errors, and I think the drive is at fault even though SMART says the drive is fine. Should I replace this drive now, or is there something else I can try?
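For anyone wanting to see how often a port is resetting, here is a minimal Python sketch that tallies link events per ATA port from syslog lines like the excerpt above. The function name `tally_ata_events` and the labels in `EVENTS` are made up for illustration; the matched substrings are taken only from the log lines posted here, so other kernels may word things differently.

```python
import re
from collections import Counter

# Matches the kernel ATA prefix, e.g. "kernel: ata6: ..." or "kernel: ata6.00: ...".
# Group 1 is the port (without the ".00" device suffix), group 2 is the message.
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")

# Substrings copied from the excerpt in this thread, mapped to
# hypothetical short labels (assumption: the labels are illustrative only).
EVENTS = {
    "hard resetting link": "hard_reset",
    "COMRESET failed": "comreset_failed",
    "interface fatal error": "interface_fatal",
    "failed command": "failed_command",
}


def tally_ata_events(lines):
    """Count known link events per (port, label) pair."""
    counts = Counter()
    for line in lines:
        match = ATA_LINE.search(line.rstrip("\n"))
        if not match:
            continue  # not an ATA kernel message
        port, message = match.groups()
        for needle, label in EVENTS.items():
            if needle in message:
                counts[(port, label)] += 1
    return counts
```

Passing it the lines of a syslog file (for example `tally_ata_events(open(path))`, with `path` pointing at wherever the log lives on your system) returns a `Counter` keyed by port and event. If the same port keeps racking up resets after a cable and port swap, that points away from the cable and towards the drive or its power.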

 

Thanks in advance for any help.

 

 

blackbox-diagnostics-20220801-1600.zip blackbox-smart-20220801-1613.zip


That looks more like a power or connection problem, and the disk looks healthy. But if the cables were replaced/swapped and the problem persists, it might be a disk problem. Like mentioned above, you should run an extended SMART test, since the ones that were started never finished, likely due to spin down.
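A rough sketch of how the extended test could be started and checked from a script, assuming smartmontools is installed and the script runs as root. The device path `/dev/sdX` is a placeholder, and the helper names `smartctl_cmd` and `run_smartctl` are made up for illustration. The drive also needs to stay spun up for the duration of the test, otherwise it may be aborted again.

```python
import subprocess

# smartctl options used here:
#   -t long      start an extended (long) self-test
#   -l selftest  print the self-test log (progress and results)
#   -a           print all SMART information
ACTIONS = {
    "start_extended": ["-t", "long"],
    "selftest_log": ["-l", "selftest"],
    "all": ["-a"],
}


def smartctl_cmd(device, action):
    """Build the smartctl argument list for one of the actions above."""
    return ["smartctl", *ACTIONS[action], device]


def run_smartctl(device, action):
    """Run smartctl and return its text output (needs root and smartmontools)."""
    result = subprocess.run(
        smartctl_cmd(device, action),
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes as status bit flags
    )
    return result.stdout
```

For example, `run_smartctl("/dev/sdX", "start_extended")` kicks off the test, and calling it later with `"selftest_log"` shows whether the test completed or was interrupted.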

  • 2 months later...

As an update to this: after much hair-tearing, swapping of cables, and waiting for the issue to reappear, I finally decided to replace the PSU on a suspicion after reading some other forum threads. I upgraded to a newer, beefier PSU and it looks like the errors have disappeared, though I still have my fingers crossed they won't reappear in the near future. Not sure whether my PSU was faulty or I had too many drives on each cable, but it turned out to be the culprit.
