Drives keep disabling - part 2: more disabling



Drives have been disabling randomly, for no apparent reason, 2-3 times a week. I posted in the forum previously, received advice, and have been running copious tests to rule out variables. Here is where I've landed.

 

Previously, I had a GPU (3070) in the PCIe x8 slot and the HBA in the PCIe x16 slot. To eliminate variables, I removed the GPU altogether, relocated the HBA to the x8 slot, and ran the server for a full week. Not a single drive was disabled and everything was great. I even threw a parity check at it with no disabled drives.

 

Tonight, I reinstalled the GPU, changing nothing else. The server booted up and ran fine for an hour, then a drive was disabled. Either the GPU is the issue or the motherboard does not like having two cards installed. I checked the manual and it states that if cards are installed in both the x16 and x8 slots, both slots will run at x8. I've also set both slots to run at PCIe 3.0. According to the motherboard manual (Z390), my current arrangement should work just fine. I'd really like to keep my GPU installed for transcoding, but it seems I may not have a choice.
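One way to sanity-check what the HBA actually negotiated with the GPU installed is to read the link attributes from sysfs. This is a minimal Python sketch under assumptions: Linux sysfs, and a placeholder PCI address (0000:01:00.0) that you'd replace with the HBA's real address from lspci.

# Minimal sketch: print the PCIe link the HBA negotiated vs. what it supports.
# HBA_ADDR is a placeholder; substitute the real address reported by lspci.
from pathlib import Path

HBA_ADDR = "0000:01:00.0"  # placeholder PCI address

dev = Path("/sys/bus/pci/devices") / HBA_ADDR
for attr in ("current_link_speed", "current_link_width",
             "max_link_speed", "max_link_width"):
    f = dev / attr
    print(f"{attr}: {f.read_text().strip() if f.exists() else 'n/a'}")

If current_link_width drops below x8 (or the speed falls back to a lower generation) only when the GPU is installed, that would point at the slot/lane arrangement rather than the HBA itself.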

 

Diagnostics attached. 

 

Hoping someone can point me to a BIOS setting or something else that will solve this.

theark-diagnostics-20240312-1915.zip

6 hours ago, disposable-alleviation3423 said:

indicate the HBA goes offline and causes the issue.

Yep:

Mar 12 17:10:10 TheArk kernel: mpt3sas_cm0: SAS host is non-operational !!!!

Make sure the HBA is well seated and sufficiently cooled; you can also try a different PCIe slot.
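To line the drop-offs up against what else was happening at the time, here is a minimal Python sketch, assuming the syslog lives at /var/log/syslog (adjust the path for your setup), that pulls out the mpt3sas messages with their timestamps:

# Minimal sketch: print mpt3sas-related kernel messages so HBA drop-offs
# can be matched against timestamps (e.g. when the GPU came under load).
# Assumes the log lives at /var/log/syslog; adjust the path if yours differs.
LOG_PATH = "/var/log/syslog"

with open(LOG_PATH, errors="replace") as log:
    for line in log:
        if "mpt3sas" in line or "SAS host is non-operational" in line:
            print(line.rstrip())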


I have already tackled both of those suggestions as part of a previous thread.

 

This new thread is a result of:

 

1. Connecting a Noctua fan to the heatsink on the HBA.

2. Trying all three PCIe slots for the HBA.

3. Finding that the x8 slot works flawlessly (for 2 weeks) UNTIL I install the GPU in the x16 slot.

 

My issue seems like a PCIe lane issue, but according to the mobo manual, my configuration should work as installed. For added context, I also have three M.2 drives installed, but again, the manual says those share lanes with the SATA drives, not the PCIe slots.

 

I've added the relevant snips from the manual below. Am I reading this wrong? 

 

The configuration that causes the issue is as follows (if I remove the GPU, everything runs without issue):

 

PCIE1 - 2.5Gb/s network card

PCIE2 - 3070 GPU

PCIE3 - Empty

PCIE4 - LSI 9300-8i HBA

PCIE5 - Empty

 

M2_1 - 1TB M.2

M2_2 - 1TB M.2

M2_3 - 1TB M.2

 

[Manual excerpts attached: PCIe slot configuration and M.2/SATA lane sharing]

 

