
6.12.4 - what the hell is wrong with my unraid



Hi fellow users, I've been using Unraid for a couple of years now, and for the past few days I've been struggling with disk errors.

My setup is fairly simple: Ryzen 5600, 16 GB RAM, Dell H310 in IT mode, Quadro P400, an AliExpress SATA card, and a bunch of different disks (2 WD 14 TB, 5 Seagate 16 TB, some SSDs for cache).

 

Everything worked great until I started seeing parity errors (and my parity drive going offline). After some digging and testing I found that this happens because I get errors on the array disks connected to the HBA, but the errors only start to appear when I do write operations to those disks.

 

1. Swapped the HBA for a new one (from AliExpress, IT mode), but that one crashes Unraid and I can't even start the array.

2. Changed the HBA cables (same problem).

3. Changed the card slot (same problem).

 

I'm able to start the array and use it in read-only mode (Unraid shows some writes, 300-500), but when I upload something the errors appear.

I was also able to run a full extended SMART test on all connected disks without any errors.

 

When the write error occurs, my LSI card is dropped from Unraid and I can no longer see it in System Devices. Those errors occurred only on disks connected to my LSI card (from what I saw). I can try buying another one, but maybe it's something else.
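
One way to confirm that the card has really disappeared from the PCI bus after an error is to check the `lspci` output. Below is a minimal Python sketch. It assumes the H310 in IT mode shows up with an LSI SAS2008 / Fusion-MPT description, which is typical for this card but should be checked against your own `lspci` output. The names `find_hba_lines`, `hba_present` and the keyword list are illustrative and not part of any Unraid tool.

```python
import subprocess

# Assumption: substrings expected in the lspci description of an LSI-based HBA.
HBA_KEYWORDS = ("sas2008", "lsi", "fusion-mpt")


def find_hba_lines(lspci_output, keywords=HBA_KEYWORDS):
    """Return the lspci lines that mention any of the given keywords."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if any(k in line.lower() for k in keywords)
    ]


def hba_present():
    """Run lspci and report whether an HBA-looking device is listed."""
    result = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    )
    return bool(find_hba_lines(result.stdout))
```

Calling `hba_present()` once right after boot and again after the errors show up would tell whether the controller is still enumerated or has dropped off the bus entirely.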

 

Diagnostics uploaded.

 

If you have any idea what I can do, I'm more than happy to try it.

 

 

poseidon-diagnostics-20231005-2051.zip

36 minutes ago, JorgeB said:
Oct  5 20:50:22 Poseidon kernel: mpt2sas_cm0: SAS host is non-operational !!!!

 

HBA stops responding, if you've already tried a different slot it would be good to try a different controller.

 

I already tried a second one (also a PERC H310) and it behaves the same: I can start the array, but as soon as there are writes, errors appear and the HBA goes offline.

I tried a different slot (same result) and 3 different cables (same result). The slots seem to be working, because I use the Quadro for transcoding and it works as expected.

I will try to buy another (last) HBA, and if that doesn't work I'll go with something like a PCIe to SATA adapter.
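
To see how often the controller drops out, and whether the dropouts line up with particular activity, the syslog can be scanned for the message quoted above. This is a rough Python sketch. The path `/var/log/syslog` and the helper names `find_hba_dropouts` and `scan_syslog` are assumptions for illustration; the same function also works on the syslog file inside a diagnostics zip.

```python
import re

# Matches the kernel message quoted above. The controller index (cm0, cm1, ...)
# is matched loosely in case more than one HBA is installed.
NON_OPERATIONAL = re.compile(r"mpt[23]sas_cm\d+: SAS host is non-operational")


def find_hba_dropouts(log_lines):
    """Yield (timestamp, line) for each 'non-operational' kernel message."""
    for line in log_lines:
        if NON_OPERATIONAL.search(line):
            # Classic syslog lines start with 'Mon DD HH:MM:SS'.
            timestamp = " ".join(line.split()[:3])
            yield timestamp, line.rstrip()


def scan_syslog(path="/var/log/syslog"):
    """Read a syslog file and return all HBA dropout events found in it."""
    with open(path, errors="replace") as fh:
        return list(find_hba_dropouts(fh))
```

Comparing the returned timestamps with what was running at the time (uploads, downloads, parity checks) can help narrow down what triggers the dropout.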

 

12 minutes ago, JorgeB said:

Could be some compatibility issue with the HBA and the board.

 

It was working before; my motherboard is a Gigabyte B550M DS3H. I'm thinking it may be some BIOS setting, but I also reset to defaults and only enabled UEFI + SVM + IOMMU. I got the second card only because one port on my old HBA is damaged and I'm getting more disks soon.

Edited by ragingOgr
43 minutes ago, ragingOgr said:

 

It was working before; my motherboard is a Gigabyte B550M DS3H. I'm thinking it may be some BIOS setting, but I also reset to defaults and only enabled UEFI + SVM + IOMMU. I got the second card only because one port on my old HBA is damaged and I'm getting more disks soon.

There have been cases of this sort of symptom occurring if the HBA is not squarely seated in the motherboard slot.

32 minutes ago, itimpi said:

There have been cases of this sort of symptom occurring if the HBA is not squarely seated in the motherboard slot.

 

I have inserted it into multiple PCIe slots, so it should have been seated properly at least once, but I will clean the slot and double-check that. Thanks.

  • 3 weeks later...

OK, an update.

 

I got a "new" PERC H310 card flashed to IT firmware (as usual). What I found and confirmed is that everything works fine unless my disks are used heavily by Docker.

I was using unassigned NVMe drives as fast storage for nzbget (1gb download, unpack and so on), so I thought it was because those drives were not in the array (although that worked before). Now both drives are in separate cache pools, but the problem persists.

 

I can move terabytes of data using unbalance and everything is fine, but when I use e.g. Radarr/Sonarr and nzbget I get disk errors shortly after (I can watch movies on Plex and everything is fine).

 

I read that these cards have an overheating problem, but adding a fan that blows onto the card did not solve it (I will add a 40 mm fan directly on the card this week).

 

I have also run xfs_repair on all disks (in maintenance mode). 

 

 

I think the problem occurs only when transferring from NVMe to disks attached to the HBA. Any idea why? Are the NVMe drives too fast, or something?
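
One way to probe the "too fast" idea would be to copy a large file from the NVMe pool to an HBA-attached disk at a capped rate and see whether the errors still appear. Here is a rough Python sketch. The function name `throttled_copy` and its parameters are made up for illustration, and since writes go through the page cache the cap is only approximate.

```python
import time


def throttled_copy(src, dst, max_bytes_per_sec, chunk_size=1024 * 1024):
    """Copy src to dst, sleeping as needed to stay under max_bytes_per_sec."""
    copied = 0
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
            # Time the copy should have taken so far at the target rate.
            expected = copied / max_bytes_per_sec
            elapsed = time.monotonic() - start
            if expected > elapsed:
                time.sleep(expected - elapsed)
    return copied
```

If a copy capped at, say, 50 MB/s (`max_bytes_per_sec=50 * 1024 * 1024`) runs cleanly while an unthrottled copy of the same file triggers the errors, that would point toward load on the controller rather than the data itself. It would not say whether the cause is heat, power, or the PCIe link.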

 

 

 

 

 

Edited by ragingOgr