
Troubleshooting UDMA CRC errors

Hello everyone,

 

Sorry for the long post, it's a fairly complicated problem to describe.

 

I have been having problems with UDMA CRC errors on and off since I started using Unraid back in 2016. My setup was finally stable for a few years, until I decided to replace my 8TB (shucked WD My Book) parity drives with recertified 12TB HGST drives. Before doing so, I loaded the drives in another machine, ran a long SMART test and made sure no errors were reported in SMART. The drives seemed fine.

 

After replacing the first parity drive, I kicked off a parity rebuild and a few UDMA CRC errors popped up on two data drives in the array. I acknowledged those and moved forward, because I was used to seeing such errors. After the rebuild completed, I took the machine offline again, replaced the second parity drive and attempted to rebuild parity again. This time, one of the data drives reported many UDMA CRC errors and dropped from the array. I rebooted a few times, checking and reseating the SATA cables between each reboot, and every time I started the array, a bunch of UDMA CRC errors would appear on one or more drives, seemingly at random.

 

I started seeing UDMA CRC errors a few years ago when I increased the number of drives I was using, from 4 initially all the way to 8 (the ASRock A88X motherboard I was using had 8 ports). At the time, I replaced the PSU and all the SATA data cables. This didn't help. I downgraded to 6 drives only and that seemed to help. Feeling that something was wrong with the integrated SATA controller on that motherboard, I changed the motherboard for a different model from a different brand (Asus). The setup with 6 drives was stable for a couple of years; I think I only saw the UDMA CRC error count increase occasionally on the same couple of drives.

 

But at present, the whole thing has become completely unusable, and worse, I was one data drive down (emulated). For sanity purposes, I loaded the offending drive in another machine and was able to access the data just fine. The drive didn't give any UDMA CRC error in that other machine.

 

At a loss as to what to do, I gutted this other machine and transferred all the drives and the SATA cables to it, then used it to stabilize my NAS and rebuild the array. This machine has been running for 5 days without issues at this point, so I know that my drives and SATA cables are good.

 

Back to my Unraid chassis, I am not sure how to proceed. The motherboard was replaced a few years ago, the SATA cables were all replaced and they seem to be working fine in the temporary machine. I also replaced the PSU a while back just in case it was power related, so I know it's good. I did a long Memtest and the RAM didn't show any problem. The only thing I haven't replaced is the CPU, but I don't see how this could be the problem since on FM2+ A88X platforms, SATA links are handled by the chipset.

 

I don't understand why I am having these issues. Originally, I had assumed a bad chipset or a bad chipset architecture, which would cause these issues when you have too many drives. That, and the fact that it seemed more stable with fewer drives, is why I limited myself to 6 drives despite having an Unraid Pro license. However, last week I didn't try to increase the drive count, but only swapped an existing drive for a new one, and that caused UDMA CRC issues again.

 

I have attached the diagnostic zip to my post.

 

Could you please guide me through troubleshooting this problem?

 

To recap, I have verified:

 

  • drives are very likely good
  • SATA data cables are fine
  • RAM is fine
  • PSU should be fine, it was replaced
  • motherboard was changed, but exhibits similar issues to the previous one from a different make

 

Is the A88X platform just not good for this number of drives despite offering 8 SATA ports on many motherboard models?

Is the CPU defective, despite not handling the SATA links directly?

Have I been unlucky with two bad motherboards?

 

EDIT: I should probably mention that I am using a 3-bay ICY-Dock backplane to expand how many drives can fit in the case, but only 2 bays are used. The drives showing errors were not limited to the ones in that backplane; on the contrary, the errors seemed to affect any drive and any motherboard SATA port.

 

Thank you again for your time and your help!

 

EDIT2: I forgot to post details about my system. Here goes:

 

  • Case: Cooler Master Elite 342
  • CPU: AMD A6-7400K
  • Motherboard: Asus A88XM-PLUS (formerly ASRock FM2A88M PRO3+)
  • RAM: Kingston 8GB DDR3-1333 CL9
  • PSU: Corsair CV550 (formerly Corsair VS SERIES 350)
  • Backplane: ICY-Dock MB153SP-B
  • Cooling: Default AMD Aircooler + 1 Cooler Master 120mm + 2 Noctua NF-P 12 PWM + 1 Noctua NF-A8 PWM
  • Unraid OS USB key: SanDisk Ultra Fit USB 3.0 16GB
  • Drives:
    • Currently in the array
      • 3x TOSHIBA DT01ACA300 (3TB)
      • 2x HGST HUH721212ALE601 (12TB)
      • 1x Western Digital WD80EZAZ (8TB)
    • Waiting to be added to the array (or to replace smaller drives)
      • 1x HGST HUH721212ALE601 (12TB)
      • 2x Western Digital WD80EZAZ (8TB)

 

 

pinky-diagnostics-20241003-2020.zip


Solved by Frank1940

  • Community Expert
1 hour ago, asktoomuch said:

Is the A88X platform just not good for this number of drives despite offering 8 SATA ports on many motherboard models?

I don't remember seeing other users with issues with those chipsets, but if it were me, I would get a 4 to 6 port controller from the recommended list, then retest and see if the errors still happen with it. If they do, the board is likely not the problem.
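To make that retest easier to read, one option is to note each drive's UDMA CRC error count before and after a test run and add up the increase per controller. The snippet below is only a rough sketch of that bookkeeping: the function name, the drive names and the controller labels are all made up for illustration and are not taken from the diagnostics in this thread.

```python
from collections import defaultdict

def crc_increase_by_controller(before, after, controller_of):
    """Sum the growth of the CRC error counter per controller.

    before / after: dicts mapping a drive name to its raw CRC error count.
    controller_of:  dict mapping a drive name to a controller label.
    All names here are hypothetical, chosen for illustration.
    """
    totals = defaultdict(int)
    for drive, new_count in after.items():
        # A drive with no earlier reading contributes no increase.
        old_count = before.get(drive, new_count)
        totals[controller_of[drive]] += new_count - old_count
    return dict(totals)

# Hypothetical readings taken before and after a parity check.
before = {"sdb": 10, "sdc": 0, "sdd": 3}
after = {"sdb": 10, "sdc": 0, "sdd": 45}
controller_of = {"sdb": "add-on card", "sdc": "add-on card", "sdd": "onboard"}

print(crc_increase_by_controller(before, after, controller_of))
```

If the increase stays on the onboard ports only, that points toward the board (or the cabling to those ports); if it follows the drives onto the add-on card as well, the board is a less likely culprit.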

  • Author

Thanks for the suggestion! I ordered this JMB585 controller from Aliexpress (Amazon didn't seem to have any in my region). I'm hoping that having fewer drives on the chipset's SATA controller might help. It's going to take a while to get here, however.

 

If the issue is not coming from the motherboard, any idea what could be the cause? The drives seem fine and so do the cables. Before moving the array into another machine, I even tested powering the drives with a dedicated 750W gaming power supply, but I noticed the same UDMA CRC errors then, so I don't think the PSU is to blame either.

 

 

  • 2 weeks later...
  • Author

I have been running this JMB585-based card for a few days now. I'm still very confused about the root cause of my issues.

 

[Attached image: image.png]

 

2 of the drives are connected directly to the motherboard; I'm planning to have 3 in the long run. The mobo is handling the ICY-Dock I mentioned above.

 

The IOCREST card is handling the other 4 drives for now. I will possibly add a 5th one to it in the long run. I have not noticed any UDMA CRC errors since I added the card and shifted 4 drives to it.

 

So is it that my second motherboard (I got rid of the original one for similar issues if you remember) is having an issue with handling too many drives, even though it's supposed to support up to 8? If that's really the problem, then I must have been very unlucky to have the same exact problem with two distinct motherboards from different brands.

 

When I changed the mobo, the A88X that handles the drives was changed at the same time:

 

[Attached image: image.png]

 

So yeah, still confused.

 

I will keep running tests, fingers crossed that no more issues crop up.


  • Community Expert

You don't have all of the SATA data cables dressed and wrapped up tight to make for a neat appearance...

  • Author

I don't understand what you mean, I'm sorry. Are you talking about the JMB585 card and its 5 ports?

  • Community Expert
8 minutes ago, asktoomuch said:

I don't understand what you mean, I'm sorry. Are you talking about the JMB585 card and its 5 ports?

I am talking about the SATA data cables between the card/MB SATA ports and the SATA ports on the hard drives.  Most of these cables are unshielded, and if you tie them all together in a neat bundle, you can have crosstalk problems.  Crosstalk problems can cause CRC errors.  (In case you did not know, CRC errors occur when the CRC code generated by the hard drive fails to match the CRC code calculated by the card/MB.  A small number of failures is not a big issue, as the data is simply re-transmitted until it passes.  Large numbers of failures can slow data transfer rates.)  The only things in this path are the SATA data cable and the two SATA connectors.  So they are always the first suspects...
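For anyone who wants to see the check-and-retransmit idea spelled out, here is a minimal sketch. It uses Python's `zlib.crc32` purely as a stand-in (it is not the CRC that the SATA link layer actually computes), and the function names and the `corrupt_first_n` parameter are invented for illustration to simulate a noisy cable.

```python
import zlib

def send_frame(payload: bytes):
    """Sender side: return the payload together with its CRC."""
    return payload, zlib.crc32(payload)

def receive_frame(payload: bytes, crc: int) -> bool:
    """Receiver side: recompute the CRC and compare it with the one sent."""
    return zlib.crc32(payload) == crc

def transfer(payload: bytes, corrupt_first_n: int = 0, max_tries: int = 5) -> int:
    """Resend until the CRC matches; return how many CRC errors occurred.

    corrupt_first_n is a hypothetical knob: it flips one bit in transit
    during the first N attempts, as interference on the cable might.
    """
    errors = 0
    for attempt in range(max_tries):
        data, crc = send_frame(payload)
        if attempt < corrupt_first_n:
            data = bytes([data[0] ^ 0x01]) + data[1:]  # one bit flipped in transit
        if receive_frame(data, crc):
            return errors  # data arrived intact
        errors += 1  # this is the kind of event an error counter records
    raise IOError("link too noisy: gave up after %d tries" % max_tries)

print(transfer(b"sector data"))                     # clean link
print(transfer(b"sector data", corrupt_first_n=2))  # two bad attempts, then success
```

The data still arrives correctly in the second case; the cost is the extra attempts, which is why a handful of errors is harmless while a steady stream of them slows transfers down.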

  • Author

Oh that's actually a very good point, thanks for explaining!


I knew about crosstalk as a concept, but I never thought it would be an issue with SATA cables, so I never considered it. My NAS case is pretty small, so it's definitely possible that some of these cables were routed too close to each other, especially in the corners. Now that I think of it, I tried to shove the cables away from the middle of the case to limit how much they restrict the airflow. I will definitely take a look and see what I can do about that.

 

What would you reckon is a safe distance between 2 cables? Should 1/2 inch (~1 cm) be good enough?

  • Community Expert
  • Solution
Just now, asktoomuch said:

What would you reckon is a safe distance between 2 cables? Should 1/2 inch (~1 cm) be good enough?

Loose routing should be fine.  Tying each cable separately with a single tie should be fine.  (Excessive obsession with neatness is the problem.)  Just leave ample slack, as the SATA connector is a friction connection, and if there is any tangential force on the cable, vibration (HD spinning!) can cause the connector to work itself loose!  (I have always said that the SATA connector design is the poster child for how NOT to design a connector system!)

 

Another precaution is to double-check that all SATA connectors (power and data) are fully seated after you do any maintenance where the cables are disturbed.

  • 4 weeks later...
  • Author

Quick update after almost a month: the new setup has been rock solid.

 

I am not sure whether to attribute it to the new JMB585 card from Aliexpress, or the fact that in making this change, I rerouted the SATA data cables more loosely. Honestly, at this point I am happy with the setup, so I won't be touching it.

 

Just for future reference (future me, or you dear reader), if you also struggle with UDMA CRC errors:

- Try routing the SATA cables loosely (do not bundle them together)

- If that doesn't help and you are confident about the cables and the drives themselves, try a JMB585-based card
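To check whether a change like this actually helped, it can be handy to read the counter from a script rather than by eye. The sketch below assumes smartmontools is installed and that the drive reports the counter as SMART attribute 199 in the `smartctl -A` table, with the raw value in the last column; some drives may label or format it differently. The function names are my own, and the sample text is illustrative, not taken from the diagnostics in this thread.

```python
import subprocess
from typing import Optional

def crc_error_count(smart_output: str) -> Optional[int]:
    """Return the raw value of SMART attribute 199 from `smartctl -A` text."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None  # attribute not reported by this drive

def read_crc_error_count(device: str) -> Optional[int]:
    """Run smartctl on a device (e.g. /dev/sdb) and extract the counter."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,  # exit code is a status bitmask
    )
    return crc_error_count(result.stdout)

# Illustrative sample of what the relevant rows can look like.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
194 Temperature_Celsius     0x0022   120   100   000    Old_age   Always       -       30
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12
"""
print(crc_error_count(SAMPLE))
```

The counter never resets, so what matters is whether the number keeps growing after a change, not its absolute value.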

 

Best!
