Hard Drive Error Whack-A-Mole


Jon4got2


Hello.

I've been trying to find the source of these disk read errors and UDMA CRC errors for a while now.  I've replaced the drive cables, the drive enclosure, and the drive itself (more than once).  SMART tests come back fine.  I thought the cause might be that 2 of the drives were not on the HBA, so I added an identical LSI 9300-8i, updated the firmware, rearranged which cables were connected to which drives, and ran a new config.  I have a backup Unraid server that is running just fine with (I believe) all the same settings except for the Nerd Pack plugin, SABnzbd, and Sonarr.  Could it be a motherboard/CPU issue?
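Since the CRC counter is cumulative and does not reset when a cable or enclosure is swapped, what matters is whether it keeps climbing after each change. Below is a minimal sketch for tracking that, assuming the attribute table comes from `smartctl -A` in its usual ten-column layout; the function names `crc_error_count` and `crc_errors_added` are hypothetical helpers, not part of any tool.

```python
from typing import Optional


def crc_error_count(smartctl_output: str) -> Optional[int]:
    """Return the raw UDMA_CRC_Error_Count from `smartctl -A` text, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Assumed row layout: ID, name, flag, value, worst, thresh,
        # type, updated, when_failed, raw_value (10 columns).
        if len(fields) >= 10 and fields[1] == "UDMA_CRC_Error_Count":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def crc_errors_added(before: str, after: str) -> Optional[int]:
    """Return how many CRC errors were logged between two smartctl snapshots."""
    first = crc_error_count(before)
    second = crc_error_count(after)
    if first is None or second is None:
        return None
    return second - first
```

Capture the output for a drive before and after a parity check or a large copy and pass both snapshots to `crc_errors_added`. A result of 0 across several runs suggests the last change helped; a positive result means the link is still dropping data somewhere between the drive and the controller.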

 

I was using Resilio Sync on both servers, but I've had it disabled for a while now until I get this figured out.  I also have an SMB share from server #1 mounted on server #2 so the Plex server on #2 can read media that isn't backed up on #2 (in case this is relevant).

 

Thanks in advance for your help!

tower-diagnostics-20201216-1055.zip


Could be the PSU itself, a low-quality splitter, or too much load on a single line.

14 hours ago, Jon4got2 said:

Looks like it was on 2 separate wires, but there were some unnecessary adapters and plugs in between.  I removed those now, and I'll try another run at a new config and see how it holds up.  

So you now have 5 drives per PSU line?


With those enclosures the SATA power connector can be the weak point. If you can, connect both enclosures using 1 connector from each PSU lead, so the two power inputs on each enclosure are fed from different leads. Ideally you would want 4 separate leads, but your PSU may not have that many. Another option is to use one high-quality 4-pin to SATA power adapter on each enclosure to give more wire capacity from the PSU.


UPDATE:

New power supply in, and my drive errors have been cured!  Thanks again for your help.  

 

However, I still have one remaining issue that I thought might have been related but apparently is not: I seem to be having network issues.  I have an identical network card in my other Unraid server, set up the same way with LACP (802.3ad) on the same Luxul managed switch.  I get a "bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond" error on boot.  The other server has no network issues, but this system doesn't seem to negotiate LACP properly with the switch.  It works fine for a while, then the log floods with "br0: received packet on bond0 with own address as source" and "Call Trace" errors.  Eventually it locks up and I need to force a hard reboot, which also means I can't export a proper diagnostic report.  I did keep a log window open and copied that to a text file, however.  I have tried resetting the network without the bond, and then resetting it again with the bond.  Same with the switch settings.
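To see when the flood starts relative to the 802.3ad warning, it can help to tally the messages quoted above in the saved log. This is only a rough sketch: the `PATTERNS` labels and the `summarize_log` name are hypothetical, and it assumes the log has been saved as plain text rather than RTF.

```python
from collections import Counter

# Substrings taken from the messages quoted in this post.
PATTERNS = {
    "no_lacp_partner": "No 802.3ad response from the link partner",
    "own_address_source": "received packet on bond0 with own address as source",
    "call_trace": "Call Trace",
}


def summarize_log(text: str) -> Counter:
    """Count how many log lines contain each known message."""
    counts = Counter()
    for line in text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


def first_line_with(text: str, label: str) -> int:
    """Return the 1-based line number of the first match for a label, or -1 if none."""
    needle = PATTERNS[label]
    for number, line in enumerate(text.splitlines(), start=1):
        if needle in line:
            return number
    return -1
```

Reading the log file into a string and calling `summarize_log` gives a quick count per message, and `first_line_with` shows where each one first appears, which makes it easier to tell whether the "own address as source" messages only begin after LACP negotiation has failed.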

 

Would it be more proper to start a new thread for this?

unraid log.rtf
