
Hello.
I have a problem:

 

I have a hard disk array formatted as encrypted XFS and a cache pool in encrypted Btrfs RAID 1.

I have SMB sharing enabled on Unraid, and the CIFS client (version 2:7.0-2) mounts those shares on several Debian 12 machines.

My usage consists of writing several tens or even hundreds of GB to the cache, then emptying it onto the array once an hour.

The problem is that during this migration, the SMB and NFS shares crash. I have to restart the service on Unraid once the mover has finished.

 

This is very annoying, because the clients get errors.

 

Unraid is installed on a dedicated machine.
The network card is a PCIe x1 card with an RTL8125B controller, and its drivers are up to date.
The hard disks and SSDs are attached to an LSI 9300-8i controller (SAS 3008 chipset).

Thanks for your help.
Sorry for my English.


It probably doesn't crash, but when the mover runs it kinda monopolizes the resources.

 

If you write so much that you have to empty the cache every hour, it's generally recommended not to use a cache in the first place. It no longer helps with throughput at that point, because the array's write speed becomes the limit when emptying the cache anyway.
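To put a rough number on that, here is a minimal sketch that estimates what fraction of each hour the mover would need just to drain the data written during that hour. The function name and the example figures (250 GB per hour, 100 MB/s sustained array write speed) are illustrative assumptions, not measurements from this system.

```python
def mover_busy_fraction(gb_per_hour: float, array_write_mb_s: float) -> float:
    """Fraction of each hour the mover needs to drain one hour of cached writes.

    Uses decimal units (1 GB = 1000 MB). A result of 1.0 or more means the
    mover can never catch up at that array write speed.
    """
    seconds_needed = (gb_per_hour * 1000) / array_write_mb_s
    return seconds_needed / 3600


if __name__ == "__main__":
    # Illustrative values only, not measured on the poster's hardware.
    print(f"{mover_busy_fraction(250, 100):.2f}")
```

The closer the result gets to 1.0, the less headroom the array has left for SMB clients while the mover is running.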


By the time a new burst of data arrives on the cache, the cache is already empty: it is transferring nothing and has no I/O in progress.

I've just done a test:

 

Add 250 GB of files to the cache.
Empty the cache onto the array via the mover.
While the mover is running, launch another 250 GB burst of writes to the cache.
At the same time, read from and write to the array via the SMB shares.
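For anyone who wants to repeat the write-burst part of this test, a minimal sketch could look like the following. It is not the procedure actually used above. The function name, the file naming scheme and the target path in the usage comment are hypothetical, and starting the mover is left to the Unraid UI.

```python
import os
import time
from pathlib import Path


def write_burst(target_dir, total_bytes, file_bytes=64 * 1024 * 1024,
                chunk_bytes=1024 * 1024):
    """Write total_bytes of random data into target_dir.

    Returns (bytes_written, throughput_in_MB_per_s).
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    chunk = os.urandom(chunk_bytes)  # random data, so compression cannot help
    written = 0
    index = 0
    start = time.monotonic()
    while written < total_bytes:
        size = min(file_bytes, total_bytes - written)
        with open(target / f"burst_{index:05d}.bin", "wb") as fh:
            remaining = size
            while remaining > 0:
                n = min(len(chunk), remaining)
                fh.write(chunk[:n])
                remaining -= n
            fh.flush()
            os.fsync(fh.fileno())  # make sure the data really reaches the disk
        written += size
        index += 1
    elapsed = max(time.monotonic() - start, 1e-9)
    return written, written / 1_000_000 / elapsed


# Hypothetical usage from a client, with the share mounted at an assumed path:
# written, mb_s = write_burst("/mnt/unraid-share/burst-test", 250 * 1000**3)
```

Running it from one of the Debian clients against the mounted share, rather than locally on the server, also exercises the network path and the SMB service.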

 

My aim is to push it to its limits.

 

Result: no crashes ...

 

I'm going to run some more tests, some lighter and some heavier, and I'll keep you informed.

Edited by wary-disruption4336

This sounds like a classic RTL8125B load issue: they are known for random disconnects and packet corruption. Otherwise I see no reason why SMB/NFS should really crash. That's just a wild guess though, but it's nothing unusual with these NICs (which also come with C-state sleep issues on various kernels).

 

I'd personally try to rule that out first by using another NIC that's less problematic, especially if you have lots and lots of transfers coming in. Otherwise, you've left out some important specs that would let one really judge what might be at fault here. For example, the cache drive(s) might also be a bottleneck, depending on what you use, or the RAM. It could also be an issue with the Btrfs storage itself ...
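Before swapping hardware, one cheap check is to compare the kernel's per-interface error counters before and after a mover run. This is only a sketch: it reads the standard Linux sysfs statistics files, the interface name `eth0` in the usage comment is an assumption, and rising counters would only be a hint, not proof that the NIC is at fault.

```python
from pathlib import Path

# Counter files the Linux kernel exposes under /sys/class/net/<iface>/statistics
COUNTERS = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped", "rx_crc_errors")


def read_nic_counters(iface, sysfs_root="/sys/class/net"):
    """Return the error/drop counters that exist for iface as a dict of ints."""
    stats_dir = Path(sysfs_root) / iface / "statistics"
    result = {}
    for name in COUNTERS:
        path = stats_dir / name
        if path.is_file():
            result[name] = int(path.read_text().strip())
    return result


def counter_deltas(before, after):
    """Difference between two snapshots, for counters present in both."""
    return {name: after[name] - before[name] for name in after if name in before}


# Hypothetical usage (the interface name is an assumption):
# before = read_nic_counters("eth0")
# ... run the mover and the SMB transfers ...
# print(counter_deltas(before, read_nic_counters("eth0")))
```

If the deltas stay at zero while the shares drop out, the NIC becomes a less likely suspect and the other candidates above deserve a closer look.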

 

Or are you perhaps using something like an Odroid H3 here with an onboard NIC without PCIe ports? I know why I'm staying away from these really tempting boards: they go as low as 1.3 W in standby, but when you put them under load there are lots of issues to discover. I had the Odroid H2 in the past and had power issues which fried two expensive enterprise SSDs and the 5 V rail on the board.

 

Go straight for a 10GbE SFP Intel card or an old 1GbE RJ45 Intel one. 2.5GbE NICs are all potential issue-monsters, both from Intel and Realtek.

Edited by jit-010101
