lockdown571 Posted August 2, 2020: I was getting some read errors on a hard drive, and it finally went into an error state in Unraid. I figured the drive was bad, so I replaced it and rebuilt the array without issues. A few hours later, though, the new hard drive also went into an error state. Is the new hard drive really a dud? I replaced the SATA cable and got the same result. I also replaced this SATA controller just a few months ago. Any ideas before I send the new hard drive back? Attachments: syslog.txt, ST8000DM004-2CX188_ZCT2P6ND-20200802-1041.txt, tower-diagnostics-20200802-0936.zip
JorgeB Posted August 3, 2020: Looks more like a connection issue. Replace or swap both cables (power + SATA) and try again.
lockdown571 Posted August 15, 2020: I have now replaced the hard drive, the SATA cable, the SATA controller, and the power supply, and I am still getting UDMA CRC error counts. What a headache. Any more ideas before I nuke the server from orbit? 😠 Attachment: tower-diagnostics-20200815-0854.zip
trurl Posted August 15, 2020: Still looks like a connection issue. There was probably nothing even wrong with the original drive you replaced. Check all connections, SATA and power, both ends, including any splitters. Make sure you don't bundle your SATA cables. Make sure there is enough slack in the cables so the connector can sit square on the connection with nothing pulling on it.
lockdown571 Posted August 15, 2020: Thanks, I will check the connections again. Unfortunately, the mover decided to start while I was in the middle of troubleshooting. It's taking forever, I think because I'm getting thousands of read errors.
lockdown571 Posted August 15, 2020 (edited): Still getting read errors on multiple drives (the problem seems to jump from one drive to another). I took the drives completely out of the case (to bypass the backplane on the HDD cage), used fresh SATA cables, made sure the cables were firmly connected without bends, and used different SATA power connectors on the PSU. Still getting UDMA CRC error counts. I do seem to be getting just a few now, though, instead of thousands. It doesn't seem to be a matter of whether I get these errors but how many. So annoying. By the time I replace various parts on the server to troubleshoot the issue further, I might as well have just built an entirely new server.
JorgeB Posted August 16, 2020: On the diags posted I only see issues with disk6. It could be that the Seagate doesn't like that JMicron controller; try swapping it with one of the disks on the onboard SATA ports.
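Editor's note: since the question in this thread is which disk keeps accumulating UDMA CRC errors, one way to check is to save the output of `smartctl -A` for each disk before and after a test run and compare SMART attribute 199. The sketch below is only an illustration, not something taken from the posted diagnostics. It assumes the raw value of attribute 199 is a plain integer in the last column, and the function names and sample rows are made up for the example.

```python
import re

# Matches the attribute 199 row of `smartctl -A` text and captures the last
# column (the raw value). Assumes that raw value is a plain integer.
_CRC_ROW = re.compile(
    r"^\s*199\s+UDMA_CRC_Error_Count\s+.*?(\d+)[ \t]*$", re.MULTILINE
)


def crc_error_count(smart_text):
    """Return the raw UDMA CRC error count from `smartctl -A` text, or None."""
    match = _CRC_ROW.search(smart_text)
    return int(match.group(1)) if match else None


def crc_deltas(before, after):
    """Map each disk name to the CRC errors added between two snapshots."""
    deltas = {}
    for disk, new_text in after.items():
        old = crc_error_count(before.get(disk, ""))
        new = crc_error_count(new_text)
        # Skip disks that are missing the attribute in either snapshot.
        if old is not None and new is not None:
            deltas[disk] = new - old
    return deltas


if __name__ == "__main__":
    # Made-up sample rows for illustration only; not from the thread's diags.
    row = ("199 UDMA_CRC_Error_Count    0x003e   200   200   000    "
           "Old_age   Always       -       {}\n")
    before = {"disk5": row.format(0), "disk6": row.format(3)}
    after = {"disk5": row.format(0), "disk6": row.format(41)}
    print(crc_deltas(before, after))  # {'disk5': 0, 'disk6': 38}
```

The raw CRC count is cumulative and normally does not reset when a cable or controller is replaced, so the change between snapshots is more telling than the absolute number.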
lockdown571 Posted August 16, 2020: Thanks, I will give that a try. Unfortunately, this will be the third SATA controller I've been through in a short amount of time 😠
lockdown571 Posted August 18, 2020 (edited): I am fairly certain at this point that the JMicron controller is the issue. None of the hard drives seem to like that port. I had an issue with a hard drive that wasn't using the controller, but that seems to have resolved itself. I think that just confused the picture. Pretty annoying, since this is the third PCI-E SATA controller I have tried. I suppose the only other possibility is a problem with the PCI-E bus itself. I was using a PCI-E riser cable and tried switching that out, but that didn't fix it. This is the last time I build a server that depends on SATA expansion cards.
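Editor's note: to confirm that the failing disks all hang off the same add-in controller, it can help to group block devices by the PCI address of the controller they sit behind. The sketch below is a rough illustration under a few assumptions: it expects Linux sysfs paths of the kind `os.path.realpath("/sys/block/sdX")` typically returns, it treats the last PCI address in the path as the controller, and the helper names and sample paths are hypothetical.

```python
import re
from collections import defaultdict

# A PCI address looks like "0000:03:00.0" (domain:bus:device.function).
_PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(sysfs_path):
    """Return the last PCI address in a resolved sysfs path, or None."""
    found = _PCI_ADDR.findall(sysfs_path)
    return found[-1] if found else None


def group_by_controller(paths):
    """Group device names by the PCI address of their controller."""
    groups = defaultdict(list)
    for dev, path in sorted(paths.items()):
        groups[controller_of(path)].append(dev)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical paths for illustration; on a live system they could come
    # from os.path.realpath("/sys/block/<dev>") for each device.
    sample = {
        "sda": "/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/"
               "target0:0:0/0:0:0:0/block/sda",
        "sdg": "/sys/devices/pci0000:00/0000:00:1c.0/0000:03:00.0/ata7/"
               "host6/target6:0:0/6:0:0:0/block/sdg",
    }
    print(group_by_controller(sample))
```

If every disk showing CRC errors lands in the same group, that points at the controller (or its slot or riser) rather than at the individual drives.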
lockdown571 Posted August 22, 2020: I just installed this SATA controller: https://www.amazon.com/gp/product/B07ST9CPND/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1. I ran a parity check without a single error. This is the fourth SATA controller I have tried. Hopefully the fourth time is the charm.
JorgeB Posted August 23, 2020: I have a couple of those; they work fine with Unraid.