disposable-alleviation3423 Posted March 13

Drives have been disabling randomly, 2-3 times a week, for no apparent reason. I posted in the forum previously, received advice, and have been testing since; I've run copious tests to rule out variables, and here is where I've landed.

Previously I had a GPU (3070) in the PCIe x8 slot and the HBA in the PCIe x16 slot. To eliminate variables, I removed the GPU altogether, relocated the HBA to the x8 slot, and ran the server for a full week. Not a single drive disabled and everything was great; I even threw a parity check at it with no disabled drives.

Tonight I reinstalled the GPU, changing nothing else. The server booted and ran fine for an hour, then a drive was disabled. Either the GPU is the issue or the motherboard does not like having two cards installed. I checked the manual, and it states that if cards are installed in both the x16 and x8 slots, both slots will run at x8. I've set both slots to run at PCIe 3.0. According to the motherboard manual (Z390), my current arrangement should work just fine.

I'd really like to keep my GPU installed for transcoding, but it seems I may not have a choice. Diagnostics attached. Hoping someone can point me to a BIOS setting or something that will solve this.

theark-diagnostics-20240312-1915.zip
BRiT Posted March 13

Are you certain it's not power-supply related? With the GPU plugged in, maybe it's pulling too much power and the drives are dropping offline because of that.
disposable-alleviation3423 (Author) Posted March 13

I have not tried a different PSU, but it's a 750 W unit, so I can't imagine it's running up against the limit. The diagnostics indicate the HBA goes offline and causes the issue.
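For what it's worth, a back-of-the-envelope power budget is easy to sanity-check. Every number in the sketch below is an assumption (check your PSU label and drive datasheets); the point is that drive spin-up, not steady-state draw, is where a 750 W unit gets stressed:

```python
# Rough worst-case budget at boot. All numbers are assumptions --
# substitute real values from your PSU label and drive datasheets.
gpu_w = 220        # RTX 3070 board power, ballpark
hba_w = 13         # LSI 9300-8i typical draw
cpu_platform_w = 150   # CPU + motherboard + fans, ballpark
drive_spinup_w = 25    # ~2 A at 12 V per 3.5" drive spinning up
n_drives = 8           # adjust to the size of your array

total = gpu_w + hba_w + cpu_platform_w + n_drives * drive_spinup_w
print(f"Worst-case draw ~{total} W of a 750 W supply")  # ~583 W with these numbers
```

With these (assumed) numbers a 750 W supply has headroom, but a weak or aging 12 V rail can still sag momentarily when everything spins up at once.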
JorgeB Posted March 13

6 hours ago, disposable-alleviation3423 said:
"indicate the HBA goes offline and causes the issue."

Yep:

Mar 12 17:10:10 TheArk kernel: mpt3sas_cm0: SAS host is non-operational !!!!

Make sure the HBA is well seated and sufficiently cooled. You can also try a different PCIe slot.
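For anyone digging through their own diagnostics, here is a minimal sketch for pulling mpt3sas fault lines like that one out of an extracted syslog (the path and keyword list are assumptions; adjust to taste):

```python
#!/usr/bin/env python3
"""Scan an extracted Unraid syslog for mpt3sas (HBA) fault messages."""
import re

SYSLOG_PATH = "syslog"  # hypothetical path; point this at the syslog from your diagnostics zip

# Matches lines like:
# "Mar 12 17:10:10 TheArk kernel: mpt3sas_cm0: SAS host is non-operational !!!!"
PATTERN = re.compile(r"mpt3sas.*(non-operational|fault|reset)", re.IGNORECASE)

with open(SYSLOG_PATH, errors="replace") as log:
    for line in log:
        if PATTERN.search(line):
            print(line.rstrip())
```

If the timestamps cluster right after GPU activity (a transcode kicking off, for example), that would point toward power or lane contention rather than a bad card.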
disposable-alleviation3423 (Author) Posted March 13

I have tackled both of those suggestions as part of a previous thread. This new thread is the result of:

1. Connecting a Noctua fan to the heatsink on the HBA.
2. Trying all three PCIe slots for the HBA.
3. Finding that the 8-lane slot works flawlessly (for two weeks) UNTIL I install the GPU in the 16-lane slot.

My issue seems like a PCIe lane issue, but according to the mobo manual, my configuration should work as installed. For added context, I also have three M.2 drives installed, but again, the manual says those share lanes with the SATA ports, not PCIe. I've added the relevant snips from the manual below. Am I reading this wrong?

The configuration that causes the issue is as follows; if I remove the GPU, it works without issue.

PCIE1 - 2.5 Gb/s network card
PCIE2 - 3070 GPU
PCIE3 - Empty
PCIE4 - LSI 9300-8i HBA
PCIE5 - Empty
M2_1 - 1TB M.2
M2_2 - 1TB M.2
M2_3 - 1TB M.2
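One way to test the lane theory directly: lspci reports both what a slot is capable of (LnkCap) and what was actually negotiated (LnkSta). Below is a minimal sketch for pulling those two lines for the HBA, assuming lspci is available and the 9300-8i identifies itself with "SAS3008" in the device header (adjust DEVICE_MATCH if yours reads differently); run it as root so the capability blocks are readable:

```python
#!/usr/bin/env python3
"""Print the negotiated PCIe link speed/width for the HBA."""
import subprocess

DEVICE_MATCH = "SAS3008"  # assumption: how a 9300-8i shows up in lspci output

# lspci -vv needs root to expose the capability blocks containing LnkCap/LnkSta
out = subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout

in_device = False
for line in out.splitlines():
    if line and not line[0].isspace():   # unindented line = new device header
        in_device = DEVICE_MATCH in line
        if in_device:
            print(line)
    elif in_device and ("LnkCap:" in line or "LnkSta:" in line):
        print(line.strip())              # capability vs. negotiated link
```

If LnkSta shows a lower width or speed than LnkCap once the GPU is installed, the board really is re-allocating lanes away from the HBA despite what the manual says.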
JorgeB Posted March 13

23 minutes ago, disposable-alleviation3423 said:
"I have tackled both of those suggestions as part of a previous thread."

Then I suggest trying a different HBA or a different board.
disposable-alleviation3423 (Author) Posted March 13

This is my second HBA, so I guess I'll buy a board and just throw money at the problem. Thank you for your help; it helps me feel like I'm not losing my mind.