Bad cable, RAID card or backplane?



Hello

 

So I know I have at least one slot in my backplane that doesn't work properly, because I've had multiple "disk disabled" errors in it. I moved the disks around and didn't have any errors for over a year, until today.

 

I installed a new disk on the 18th (4 days ago) into a previously unused slot, and just a few minutes ago it reported errors and was disabled. It is disk 10 (sdp), a Seagate IronWolf (and from what I know, Seagate disks report their SMART values in a rather odd way). The only thing I really see there is a UDMA CRC Error Count of 6, which again would point to a bad cable or connection at least. The diagnostics are attached below. I am not sure how to read all of this, so any help is highly appreciated.
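For anyone reading along who wants to pull that counter out of the SMART report, here is a minimal Python sketch. It assumes the usual column layout of `smartctl -A` output, where the raw value is the tenth column. The sample text is made up for illustration and is not taken from my diagnostics; only the CRC count of 6 matches what I described. The function name `parse_smart_attributes` is my own.

```python
# Minimal sketch: extract raw SMART attribute values from `smartctl -A` text.
# SAMPLE is illustrative only (not from the attached diagnostics).

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       96
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       6
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for rows that look like attribute lines."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit():
            name, raw = parts[1], parts[9]
            # Some raw values carry extra text; keep only plain integers.
            if raw.isdigit():
                attrs[name] = int(raw)
    return attrs


if __name__ == "__main__":
    attrs = parse_smart_attributes(SAMPLE)
    print("UDMA_CRC_Error_Count:", attrs.get("UDMA_CRC_Error_Count"))
```

The raw CRC count is cumulative and does not reset, so the absolute number matters less than whether it keeps rising.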

 

Now for my setup: I run 3 Icy Box IB-565SSK backplanes. I know that one slot on backplane 2 produces errors, and now there is one on backplane 3 as well. The backplanes are connected to 2 Dell PERC H310 cards (in LSI 9211-8i IT mode); only the parity drives are connected directly to the motherboard. I already swapped the cables and everything else on that one slot on backplane 2, and now I'm thinking of connecting that disk directly to the motherboard too, just to see if it helps. The motherboard is an ASUS P10S-M and the power supply is a Corsair RM750 (with a single 12V rail, so I doubt it has anything to do with a power shortage).

 

So can anyone tell from the SMART data or the syslog what the culprit could be? Thank you very much for reading :)

azeroth-diagnostics-20190422-1613.zip


Okay, thank you. I am now in the process of rebuilding the disk onto itself (a shame that it can't just be "trusted" again with only the changes written to it). I switched out the connection as mentioned, and the disk is now connected directly to the motherboard. The rebuild is 60% complete with no errors and no further CRC errors so far, but then again, last time they only occurred after 4 days of being powered on, so I'll see. Hopefully that was the last of it.
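Since the errors took days to show up last time, one way to keep an eye on it is to poll the counter and log whenever it goes up. This is only a rough sketch under a few assumptions: `smartctl` is installed and run with enough privileges, attribute ID 199 is the UDMA CRC counter on this drive, and `/dev/sdp` is just the device name from my first post (it may well change after moving the disk to a motherboard port). The names `read_crc_count` and `watch_counter` are my own.

```python
import subprocess
import time


def read_crc_count(device):
    """Return the raw value of SMART attribute 199 for `device`, or None if not found."""
    out = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    ).stdout
    for line in out.splitlines():
        parts = line.split()
        if len(parts) >= 10 and parts[0] == "199" and parts[9].isdigit():
            return int(parts[9])
    return None


def watch_counter(read, samples, interval_s=60.0, sleep=time.sleep):
    """Take `samples` readings and return (index, previous, current) for each increase."""
    increases = []
    last = read()
    for i in range(1, samples):
        sleep(interval_s)
        current = read()
        if current is not None and last is not None and current > last:
            increases.append((i, last, current))
        if current is not None:
            last = current
    return increases


if __name__ == "__main__":
    # Hypothetical usage: one reading per minute for an hour.
    for idx, before, after in watch_counter(lambda: read_crc_count("/dev/sdp"), 60):
        print(f"reading {idx}: CRC count rose from {before} to {after}")
```

Passing the reader in as a function keeps the polling logic separate from the `smartctl` call, so it can be tried out with fake readings first.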


Fair point. I don't know whether it would be possible to just track all the changes made to the emulated disk after the physical one was disabled and then restore only those. But anyway, back to my original question: if the error was caused by a connection, cable or RAID card problem, is it more likely to occur during high usage (for example during the current data rebuild), or is it entirely random, so that it could in theory just occur again after the rebuild is complete?

