Random HDD errors (tried everything I know): can bad memory/CPU trigger this?



Hello All, 

 
 

I've been trying to troubleshoot an issue that started after I switched to a 1U HP server: random HDDs start reporting errors. If I leave it long enough, the affected drive gets disabled because it can no longer be written to. However, if I restart the server as soon as I see errors, they clear and I am good for a week or two. I was using an internal LSI SAS controller with two SAS-to-SATA cables for the drives. I've since switched to an external LSI controller with a new external SAS-to-SATA cable and still have the same issue. I thought the external HDD case was the problem, so I took it out of the picture and powered the drives directly from a separate PC power supply (with a jumper), but that doesn't help either. (Note: a friend of mine runs this same setup with zero issues.)

 

I read that memory might be the cause, but because the server has dual CPUs I don't know whether I can remove sticks (I have a lot of them). Thoughts? Can memory trigger these types of issues? Before anyone suggests that the external drive setup is the cause: I also have four internal bays connected directly with onboard SATA cables, and those drives have issues too.

 

Which drive starts reporting errors seems to be completely random. If you search my name, you will see a few other posts about problems rebuilding what I thought were bad drives, but I now know that replacing drives with newer ones does not fix anything either. When I get the email error alert, sometimes I see one drive with a single error, and sometimes when I check, 5 of 7 drives have errors. Again, after a reboot it is fixed for some time.

 

I have attached a new diag report taken after a restart, and I can start grabbing one whenever I see errors again, before I reboot. Some of my old posts about errors also have diag reports attached. I assume they all show the same issue rather than the bad drives I suspected at the time.
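To compare the syslogs captured before each reboot, one option is to tally I/O error lines per device. This is a minimal sketch, assuming the kernel logs messages of the form `I/O error, dev sdX, sector N` (the exact wording varies by kernel version). The function name `count_io_errors` and the sample lines are made up for illustration and are not taken from the attached diagnostics.

```python
import re
from collections import Counter

# Assumed message shape: "... I/O error, dev sdd, sector 2048".
# Adjust the pattern if your kernel words the message differently.
IO_ERROR = re.compile(r"I/O error, dev (\w+)")


def count_io_errors(lines):
    """Return a Counter mapping device name to number of I/O error lines."""
    counts = Counter()
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative lines only; replace with the lines of a real syslog file.
    sample = [
        "Aug 19 06:01:02 server kernel: blk_update_request: I/O error, dev sdd, sector 2048",
        "Aug 19 06:01:03 server kernel: blk_update_request: I/O error, dev sdd, sector 2056",
        "Aug 19 06:01:09 server kernel: blk_update_request: I/O error, dev sdf, sector 4096",
        "Aug 19 06:02:00 server kernel: unrelated message",
    ]
    for device, n in count_io_errors(sample).most_common():
        print(f"{device}: {n}")
```

If the counts are spread across drives on both the LSI controller and the onboard SATA ports, that would be consistent with the problem not being any single drive or cable.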

 


server-diagnostics-20180819-0613.zip

Edited by rcmpayne
9 hours ago, johnnie.black said:

If I understand correctly, you already replaced the HBA and are using a different PSU for the drives. My next suspect would be the motherboard; I would consider it very unlikely that the CPU or RAM is causing these errors.

I have a Dell R415 1U that I can try. Can I just move the LSI controller, the USB boot drive, and the HDDs over and boot in the new system?
