Disk error state help



So I was getting some read errors on a hard drive, and it finally went into an error state in Unraid. I figured the drive was bad, so I replaced it. Rebuilt the array without issues. A few hours later though, that new hard drive went into an error state. Is the new hard drive really a dud? I replaced the SATA cable, same thing. I also just replaced this SATA controller a few months ago. Any ideas before I send the new hard drive back?

 

 

syslog.txt ST8000DM004-2CX188_ZCT2P6ND-20200802-1041.txt tower-diagnostics-20200802-0936.zip

  • 2 weeks later...

Still looks like a connection issue. There was probably nothing even wrong with the original drive you replaced.

 

Check all connections, SATA and power, both ends, including any splitters. Make sure you don't bundle your SATA cables. Make sure there is enough slack in the cables so the connector can sit square on the connection with nothing pulling on it.


Thanks, I will check the connections again. Unfortunately, while I was in the middle of troubleshooting, the mover decided to start. It's taking forever, I think because I'm getting thousands of read errors.


Still getting read errors on multiple drives (the problem seems to jump from one drive to another). I took the drives completely out of the case (to bypass the backplane on the HDD cage), used fresh SATA cables, made sure the cables were firmly connected without bends, and used different SATA power connectors on the PSU. I'm still seeing the UDMA CRC error counts go up, though only by a few now instead of thousands. It doesn't seem to be a matter of whether I get these errors, but how many. So annoying. By the time I replace various parts on the server to troubleshoot the issue further, I might as well have just built an entirely new server.
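For anyone wanting to track whether the CRC counts are still climbing after each cable or port change, here is a rough Python sketch. It assumes `smartctl -A /dev/sdX` prints the usual ten-column attribute table with attribute 199 (`UDMA_CRC_Error_Count`) in it. The function names and the sample text are made up for illustration and are not taken from the diagnostics posted in this thread.

```python
import subprocess

def crc_error_count(smartctl_output):
    """Return the raw value of SMART attribute 199 from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Assumed columns: ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE,
        # UPDATED, WHEN_FAILED, RAW_VALUE (so the raw value is field 9).
        if len(fields) >= 10 and fields[0] == "199" and fields[9].isdigit():
            return int(fields[9])
    return None

def read_crc(device):
    """Run smartctl on one device (needs root and smartmontools installed)."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return crc_error_count(result.stdout)

def crc_deltas(before, after):
    """Given two {device: count} snapshots, return only the devices whose count rose."""
    return {dev: after[dev] - before[dev]
            for dev in after
            if dev in before and after[dev] > before[dev]}

# Hypothetical sample output, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       37
"""
```

Take one snapshot before a change and one after some disk activity, then pass both to `crc_deltas`. The raw CRC count never resets, so only an increase between snapshots means the link is still throwing errors.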

5 hours ago, johnnie.black said:

In the diags posted I only see issues with disk6. It could be that the Seagate doesn't like that Jmicron controller; try swapping it with one of the disks on the onboard SATA ports.

Thanks, I will give that a try. Unfortunately this will be the third SATA controller I've been through in a short amount of time 😠
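Before swapping, it can help to confirm which disks actually sit behind the add-on card and which are on the onboard ports. The sketch below is one way to do that on Linux, assuming each `/sys/block/sdX` entry resolves to a path that contains the PCI address of its controller. The helper names and the example path are hypothetical.

```python
import os
import re

# PCI addresses look like 0000:03:00.0 (domain:bus:device.function).
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")

def controller_of(sysfs_path):
    """Return the PCI address closest to the disk in a resolved sysfs path, or None."""
    matches = PCI_ADDR.findall(sysfs_path)
    return matches[-1] if matches else None

def disks_by_controller(block_dir="/sys/block"):
    """Group sdX device names by the PCI address of the controller they hang off."""
    groups = {}
    for name in sorted(os.listdir(block_dir)):
        if not name.startswith("sd"):
            continue  # skip loop, md, nvme and other non-SCSI/SATA block devices
        resolved = os.path.realpath(os.path.join(block_dir, name))
        groups.setdefault(controller_of(resolved), []).append(name)
    return groups

# Hypothetical resolved path for a disk behind a PCIe SATA card.
EXAMPLE_PATH = ("/sys/devices/pci0000:00/0000:00:1c.0/0000:03:00.0/"
                "ata7/host6/target6:0:0/6:0:0:0/block/sdg")
```

Each address returned by `disks_by_controller()` can be looked up with `lspci -s <address>` to see whether it is the onboard controller or the expansion card.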


I am fairly certain at this point that the Jmicron controller is the issue. None of the hard drives seem to like that port. I also had an issue with a hard drive that wasn't on the controller, but that seems to have resolved itself. I think it just confused the picture.

 

Pretty annoying, since this is the third PCI-E SATA controller I have tried. I suppose the only other possibility is a problem with the PCI-E bus itself. I was using a PCI-E riser cable and tried switching that out, but that didn't fix it. This is the last time I build a server that depends on SATA expansion cards.

