
(Solved) UDMA CRC errors / Kernel Panic Error write-up



Hello all,

 

I did some troubleshooting over the past couple of months and figured I would do a write-up to help others who run into these issues in the future:

 

1.) I bought a *used* drive from Amazon to replace an old 4TB drive. Immediately after installing it, I started getting UDMA CRC errors, and the count kept rising. This was odd because the previous drive had no such issues, and I didn't think I had been that rough installing the new one. The forum posts I read say these errors usually point to faulty cables or connectors, but I doubted that because the cables weren't touched during the swap. My drives connect to a backplane, which I heard could also be the cause, so I started looking into how to troubleshoot it. The forums suggested that UDMA CRC errors aren't a huge deal since they usually get corrected, but the count was increasing every couple of minutes, which made it an issue. I swapped cables and connections around (make a note of how your drives are listed in the array first) to see if I could clear the error or move it to a different drive, but as I figured, they weren't the issue. Eventually, I just replaced the drive and *poof*, no more errors! My guess is that the connector on the used HDD had gone bad. So much for saving a buck lol.
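For anyone who wants to watch the counter instead of refreshing the SMART page, here is a rough Python sketch that pulls the raw value of attribute 199 (UDMA_CRC_Error_Count) out of `smartctl -A` output. It assumes smartmontools is installed and that your drive reports the counter under ID 199 in the usual ten-column table. The sample text and its numbers are made up for illustration, and the function names are my own, not from any tool.

```python
import subprocess
from typing import List, Optional

# Illustrative `smartctl -A` excerpt; the values are made up for this sketch.
SAMPLE_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       8123
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12
"""


def parse_crc_count(smart_output: str) -> Optional[int]:
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; the raw value is the 10th.
        if len(fields) >= 10 and fields[0] == "199":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_crc_count(device: str) -> Optional[int]:
    """Run `smartctl -A` on a device (usually needs root) and parse the counter."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return parse_crc_count(result.stdout)


def crc_is_rising(samples: List[int]) -> bool:
    """True if the newest sample is higher than the oldest one."""
    return len(samples) >= 2 and samples[-1] > samples[0]
```

Calling `read_crc_count("/dev/sdX")` every few minutes (with `sdX` replaced by your actual device) and feeding the results to `crc_is_rising` tells you whether the count is still climbing after you swap a cable or a drive. A count that is nonzero but flat is history; one that keeps going up is a live problem.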

 

2.) Around this time the server also started crashing randomly. It was a super stressful period, since I'm trying to run a bunch of my own services now. The power would stay on, but the system would drop off the network and was unreachable from Windows and over SSH. I tried to get the syslog server running but wasn't successful. I'm still not sure what I was doing wrong there, so I ended up just mirroring the log to the USB since the crashes were happening fairly frequently. Unfortunately, the logs didn't give me much to go on. It wasn't until I connected a monitor to the server that I saw the system was throwing a Kernel Panic error! Reviewing the forums, I saw that using macvlan can cause issues, so I changed the Docker network type to ipvlan, but that did not resolve the crashes. After that I started troubleshooting the hardware. I ran MemTest86 and the RAM passed with flying colors, so I figured it was probably good. I have an LSI card but ruled it out because I had changed the cables around previously and the drives were working fine. I tested the GPU by disconnecting it from the server entirely, but still got crashes. I then unplugged the ASUS 2.5G Ethernet USB adapter I had bought to upgrade from the integrated 1 Gb port and plugged the network cable straight into that 1 Gb port. Voila! The system has been stable again for six and a half days so far! (*up seven and a half at time of edit*)
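If you want to confirm whether any Docker networks are still on macvlan before or after making that change, something like the sketch below might help. It assumes the `docker` CLI is available on the server and uses `docker network ls --format` to print each network's name and driver. The sample listing (including the network name `br0`) and the function names are hypothetical, just there to show the parsing.

```python
import subprocess
from typing import List

# Hypothetical "name driver" listing, one network per line.
SAMPLE_LISTING = """\
bridge bridge
host host
br0 macvlan
none null
"""


def find_macvlan_networks(listing: str) -> List[str]:
    """Return the names of networks whose driver column is 'macvlan'."""
    names = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] == "macvlan":
            names.append(parts[0])
    return names


def list_docker_networks() -> str:
    """Ask the Docker CLI for every network as 'name driver' lines."""
    result = subprocess.run(
        ["docker", "network", "ls", "--format", "{{.Name}} {{.Driver}}"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Running `find_macvlan_networks(list_docker_networks())` on the server should come back empty once everything is on ipvlan. In my case that change didn't stop the crashes, but it rules out one suspect.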

 

 

Just want to thank the community and try to contribute a little! Hopefully, it helps someone in the future!

-Edited for clarity

Edited by franktowers1