FQs19 Posted April 15, 2022 (edited)

Here are my previous posts about several of my disks throwing UDMA CRC errors, and then a disk actually being disabled: First post with issues, Second post with issues.

I was running a non-correcting parity check to verify that my correcting parity check had fixed the sync errors. I woke up today to find that the network connection to my server had dropped. I logged in locally and there were a ton of warning popups. Now yet another disk has received UDMA CRC errors and is showing errors on the Main tab. The errors just kept coming, so I stopped the parity check.

There's no way these are disk problems, and I really doubt it's related to the backplane in my case. Are we positive that it's not the LSI card?

I've been holding off on switching cases (to get rid of the backplane) until all the sync errors were corrected, but at this point I think I need to switch. Is that a good idea? I would just shut down my server, remove everything, and put it all in my new case, which doesn't have a backplane, so the disks would be cabled straight to my LSI card and motherboard.

I can't afford to keep getting disk errors, and I have too much going on in my life to deal with this. I'm at a loss and confused, and I'm wondering what you all would do in this situation.

I grabbed all this information before I shut down the server, because I don't want it running at this point:
threadripper19-diagnostics-20220415-1028.zip
threadripper19-smart-20220415-1025.zip

Edited September 29, 2022 by FQs19 (Topic Solved)
JorgeB Posted April 15, 2022 (Solution)

3 minutes ago, FQs19 said: Are we positive that it's not the LSI card?

It can be. It could even be neither; a PSU problem, for example. But there are also CRC errors, and those can basically only come from the controller, the cables, or the backplane, and the other disk issues are likely related, so it should be one of those. The backplane is more likely, but the HBA is also a possibility, especially if it's a Chinese fake. Unfortunately, there's no way of knowing for sure until you rule one of them out, either by using a different HBA or by connecting the disks directly, bypassing the backplane (or using a different one).
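When ruling out one component at a time, it helps to record each disk's UDMA CRC error count (SMART attribute 199) before and after the change. The raw value is a running total that normally does not reset, so only an increase after the swap points to a remaining problem. Below is a minimal sketch, assuming the text comes from `smartctl -A /dev/sdX` and uses the usual ten-column attribute table. The function names, device names, and sample values are illustrative and are not taken from the diagnostics attached to this thread.

```python
def parse_crc_count(smart_output):
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; RAW_VALUE is the 10th column.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def crc_increases(before, after):
    """Map each disk to how much its CRC count grew between two snapshots.

    Disks whose count did not grow, or that appear in only one snapshot,
    are left out.
    """
    return {
        disk: after[disk] - before[disk]
        for disk in after
        if disk in before and after[disk] > before[disk]
    }


# Hypothetical sample in the style of `smartctl -A` output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4521
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       37
"""

if __name__ == "__main__":
    print(parse_crc_count(SAMPLE))
    # Snapshots taken before and after a hardware change (made-up numbers).
    print(crc_increases({"sdb": 37, "sdc": 0}, {"sdb": 52, "sdc": 0}))
```

If the counts stop climbing once the disks are cabled directly, the backplane is the likely cause; if they keep climbing, the HBA or cables remain suspects.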
FQs19 (Author) Posted April 15, 2022

10 minutes ago, JorgeB said: It can be. It could even be neither; a PSU problem, for example. But there are also CRC errors, and those can basically only come from the controller, the cables, or the backplane, and the other disk issues are likely related, so it should be one of those. The backplane is more likely, but the HBA is also a possibility, especially if it's a Chinese fake. Unfortunately, there's no way of knowing for sure until you rule one of them out, either by using a different HBA or by connecting the disks directly, bypassing the backplane (or using a different one).

Thanks for responding to all of my problems. I really appreciate it. So moving my server into the new case without the backplane should be my first step?
JorgeB Posted April 15, 2022

It's what I would do, unless you have another HBA you can try.
FQs19 (Author) Posted April 15, 2022

57 minutes ago, JorgeB said: It's what I would do, unless you have another HBA you can try with.

Thanks. I'll switch cases and report back. It might be a week or two before I have time for that.
FQs19 (Author) Posted April 17, 2022

Status update: Since my server was already down and I don't have time yet to move it to the new case, I ran Memtest86+ for over two days. It completed 7 passes with no errors, so at least I've confirmed my memory is good.