SMB Performance Tuning


mgutt


Here are some things I've noticed, purely anecdotal and without any in-depth research. When things are "clean", meaning nothing has locked up the Samba connection, I get 100-200 MB/s between my Windows system and Unraid (2.5G onboard Ethernet on both, connected to the same 2.5G switch), and that's great. What sucks is when Samba locks up, which happens often enough to send me back to Google yet again, and everything stalls for what feels like an eternity.

 

Just minutes ago I tried to move one folder into another inside the same share (/mnt/user/foo), on the same mapped drive, and not even that much data (~5 GB). My entire Windows Explorer process locked up for well over 5 minutes. It never timed out or gave up, it just sat there. I can't find a discernible pattern so far, other than that the shfs processes do seem to be busier at the moment. (I'm usually doing some other stuff on the array, but nothing that should completely freeze simple Samba operations.)

 

Link to comment

Disk shares are really night and day. That shfs overhead is a killer. It seems my system gets bogged down by I/O having to pass through the shfs layer, and that stalls the SMB server reading from it. Right now, a directly mounted disk share feels like it's attached straight to my Windows system.

 

For now, so I don't have to worry about weird data corruption, I'm only going to work inside the specific disk share itself and not move things in and out of it. That at least lets me do a lot of cleanup on that disk, which was sometimes super painful through the user share.

Link to comment
  • 4 weeks later...
  • 1 month later...
On 3/13/2023 at 9:47 AM, meganie said:

The non-pro variants support RDMA just fine: https://network.nvidia.com/pdf/user_manuals/ConnectX-3_VPI_Single_and_Dual_QSFP_Port_Adapter_Card_User_Manual.pdf

 

"Client RDMA Capable: False" is just listed because Windows 10 Pro doesn't support RDMA. That's why I would have to upgrade my client to Windows 10 Pro for Workstations to support it.

But as far as I know Unraid/Samba don't support it, still listed as prototype: https://wiki.samba.org/index.php/Roadmap#SMB2.2FSMB3

From what I understand, RDMA is no longer prototype/experimental, since it is a server multi channel feature (which has been out of experimental status since the Samba 4.15 release), and it is described to some extent under the interfaces option in their documentation: https://www.samba.org/samba/docs/current/man-html/smb.conf.5.html . This gives me some confidence that I won't run into bugs when using RDMA.

 

I tried this on Unraid 6.12.2 with a Windows 11 Education client, both machines using MCX314A-BCCT NICs. The two machines are directly connected since I don't have a 40 Gbps switch, and I don't think I can get multi-NIC SMB to work when the two ports on each machine are on different networks (my home network and the direct connection between the two machines).

 

On the Windows side, the NIC reports as both RSS and RDMA capable:

PS C:\Windows\system32> Get-SmbClientNetworkInterface

Interface Index RSS Capable RDMA Capable Speed   IpAddresses                               Friendly Name
--------------- ----------- ------------ -----   -----------                               -------------
16              True        True         40 Gbps {10.6.13.18}                              mlx_direct_40g

For the SMB Extras configuration, I have this:

interfaces = "10.6.13.17;capability=RSS,capability=RDMA,speed=40000000000" "10.32.0.46;capability=RSS,speed=10000000000"
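As a side note on the format: each quoted entry is an address followed by a semicolon and comma-separated options, and the speed values here are written in bits per second (40 Gbps as 40000000000). A minimal Python sketch that rebuilds the same line from link speeds in Gbps is shown here. The helper names `interfaces_entry` and `interfaces_line` are made up for illustration and are not part of Samba. Also note that listing capability=RDMA only advertises the capability to clients; nothing in this thread confirms that smbd then actually transfers data over RDMA.

```python
def interfaces_entry(address, gbps, capabilities):
    """Format one quoted entry for the `interfaces` line (hypothetical helper)."""
    options = [f"capability={cap}" for cap in capabilities]
    # Speed is written out in bits per second, as in the line above.
    options.append(f"speed={gbps * 1_000_000_000}")
    return f'"{address};{",".join(options)}"'


def interfaces_line(entries):
    """Join the quoted entries into a single `interfaces = ...` line."""
    return "interfaces = " + " ".join(entries)


line = interfaces_line([
    interfaces_entry("10.6.13.17", 40, ["RSS", "RDMA"]),  # 40 Gbps direct link
    interfaces_entry("10.32.0.46", 10, ["RSS"]),          # 10 Gbps home network
])
print(line)
```

Generating the line this way mostly guards against miscounting zeros in the speed value.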

I only set up RDMA on the 40 Gbps direct connection since no other client on my home network can use it. Maybe someday: I am considering building a test bench PC, so I'd just get another Mellanox card to pair with it.

 

The Unraid machine has a 6-core i5-8600K and the Windows machine a 12-core i7-12700K, so I am not sure why there are only 4-5 TCP connections between the two machines...

root@Alagaesia:~# netstat -tnp | grep smb
tcp      234      0 10.6.13.17:445          10.6.13.18:50885        ESTABLISHED 3983/smbd           
tcp        0      0 10.6.13.17:445          10.6.13.18:58921        ESTABLISHED 28874/smbd          
tcp        0      0 10.32.0.46:445          10.32.1.32:42472        ESTABLISHED 26795/smbd          
tcp      117   1528 10.6.13.17:445          10.6.13.18:50883        ESTABLISHED 3983/smbd           
tcp      234      0 10.6.13.17:445          10.6.13.18:50577        ESTABLISHED 3983/smbd           
tcp      234      0 10.6.13.17:445          10.6.13.18:50884        ESTABLISHED 3983/smbd 

I'm assuming 4 of those connections are for RSS and one of them is for RDMA.
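To make that split easier to see, here is a small Python sketch that tallies the established smbd connections from the netstat output above by server address, client address, and smbd PID. The function name `count_smb_connections` is made up for this example. The tally only shows how the sockets are grouped; it does not tell you which connection is used for RSS or RDMA.

```python
from collections import Counter

# The netstat lines posted above, pasted in unchanged.
NETSTAT_OUTPUT = """\
tcp      234      0 10.6.13.17:445          10.6.13.18:50885        ESTABLISHED 3983/smbd
tcp        0      0 10.6.13.17:445          10.6.13.18:58921        ESTABLISHED 28874/smbd
tcp        0      0 10.32.0.46:445          10.32.1.32:42472        ESTABLISHED 26795/smbd
tcp      117   1528 10.6.13.17:445          10.6.13.18:50883        ESTABLISHED 3983/smbd
tcp      234      0 10.6.13.17:445          10.6.13.18:50577        ESTABLISHED 3983/smbd
tcp      234      0 10.6.13.17:445          10.6.13.18:50884        ESTABLISHED 3983/smbd
"""


def count_smb_connections(netstat_text):
    """Count ESTABLISHED smbd sockets per (server IP, client IP, smbd PID)."""
    counts = Counter()
    for line in netstat_text.splitlines():
        # Columns: proto, recv-q, send-q, local, remote, state, pid/program
        fields = line.split()
        if len(fields) < 7 or fields[5] != "ESTABLISHED":
            continue
        pid, _, program = fields[6].partition("/")
        if program != "smbd":
            continue
        server_ip = fields[3].rsplit(":", 1)[0]
        client_ip = fields[4].rsplit(":", 1)[0]
        counts[(server_ip, client_ip, pid)] += 1
    return counts


counts = count_smb_connections(NETSTAT_OUTPUT)
for (server_ip, client_ip, pid), n in sorted(counts.items()):
    print(f"{client_ip} -> {server_ip} (smbd PID {pid}): {n}")
```

On the direct link this gives four connections in smbd PID 3983 and one in PID 28874, plus a single connection from 10.32.1.32 on the other network.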

 

 

This next command on the Windows machine is a little interesting though:

PS C:\Windows\system32> Get-SmbMultichannelConnection -IncludeNotSelected

Server Name Selected Client IP   Server IP  Client Interface Index Server Interface Index Client RSS Capable Client RDMA Capable
----------- -------- ---------   ---------  ---------------------- ---------------------- ------------------ -------------------
10.6.13.17  False    10.6.13.18  10.6.13.17 16                     7                      False              True
10.6.13.17  True     10.6.13.18  10.6.13.17 16                     7                      True               False
10.6.13.17  False    172.20.0.46 10.32.0.46 4                      10                     True               False
10.6.13.17  False    172.20.0.46 10.6.13.17 4                      7                      False              True
10.6.13.17  False    172.20.0.46 10.6.13.17 4                      7                      True               False

The connections that are probably useful here are those between 10.6.13.18 and 10.6.13.17. For some reason there are two of them: one listed as RSS capable but not RDMA capable, and one the other way around. I don't know if this is by design or whether I missed some setting that would make a single connection both RSS and RDMA capable.
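The same observation can be checked mechanically. The Python sketch below copies the rows of the table above into a list by hand (the `Channel` type and its field names are made up for this example) and filters them: which rows belong to the direct link, whether any row is both RSS and RDMA capable, and which row is selected.

```python
from typing import NamedTuple


class Channel(NamedTuple):
    """One row of the Get-SmbMultichannelConnection table (hypothetical type)."""
    selected: bool
    client_ip: str
    server_ip: str
    client_rss: bool
    client_rdma: bool


# Rows copied by hand from the table above.
ROWS = [
    Channel(False, "10.6.13.18", "10.6.13.17", False, True),
    Channel(True, "10.6.13.18", "10.6.13.17", True, False),
    Channel(False, "172.20.0.46", "10.32.0.46", True, False),
    Channel(False, "172.20.0.46", "10.6.13.17", False, True),
    Channel(False, "172.20.0.46", "10.6.13.17", True, False),
]

# Rows on the direct 40 Gbps link between the two machines.
direct = [r for r in ROWS if r.client_ip == "10.6.13.18" and r.server_ip == "10.6.13.17"]
# Rows that are both RSS and RDMA capable (none in this table).
both = [r for r in ROWS if r.client_rss and r.client_rdma]
# The row Windows actually selected.
selected = [r for r in ROWS if r.selected]

print(len(direct), "direct-link rows")
print(len(both), "rows with both RSS and RDMA")
print("selected:", selected)
```

In this snapshot the single selected row is the RSS-capable one on the direct link.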

 

I don't have the time to do further testing with this for a while, but it seems like the RSS capable connection is selected most of the time, while I only saw the RDMA capable connection get selected once in a random test. 

Edited by Percutio
Link to comment

I could be wrong, but my understanding is that only Windows 11 Pro for Workstations (and the relevant server SKUs) supports RDMA.

 

That’s a pretty hefty overhead in terms of licensing cost. They don’t make it even remotely easy to find for purchase.

Link to comment

From some of the charts I looked at, SMB Direct is available in the Pro for Workstations, Enterprise, and Education editions (leaving the server editions aside). I am using Windows 11 Education in this test, but Education features have been hard to keep track of over the years. When I first got Windows 10 Education it was basically Enterprise without Cortana plus a little extra, but Windows 11 Education has a lot more differences...

 

EDIT:

This is what I see in the Windows Features window, so hopefully it means I can get RDMA working someday when I have more time again :D

 

[Screenshot: Windows Features window, 2023-07-10]

Edited by Percutio
Link to comment
  • 2 months later...

 

On 2/8/2023 at 5:37 PM, meganie said:

But in unraid I get zero lines with "egrep 'CPU|eth1' /proc/interrupts":

[Screenshot: UnraidRSS]

(eth0 is the non RSS capable onboard NIC)

 


I'm using the same card, and I'm pretty sure my configuration is working for RSS. The "this command must show..." check just doesn't account for how the Mellanox driver reports its interrupts (maybe only in certain configurations of the card/driver/firmware?). Try:

# egrep 'CPU|eth|mlx' /proc/interrupts

[Screenshot]
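If you want to count the queues rather than eyeball them, a rough Python sketch along these lines lists the interrupt lines whose device name contains eth or mlx. The sample text is made up for illustration (it is not captured from a real system, and the interrupt names on your card may look different), and `nic_irq_lines` is a hypothetical helper. On a real machine you would pass in the contents of /proc/interrupts instead.

```python
import re

# Made-up sample in the general shape of /proc/interrupts (illustration only).
SAMPLE = """\
           CPU0       CPU1
 16:          5          0   IO-APIC   16-fasteoi   eth0
 40:       1200         34   PCI-MSI   mlx4-comp-0@pci:0000:01:00.0
 41:         18       1502   PCI-MSI   mlx4-comp-1@pci:0000:01:00.0
 42:          0          0   PCI-MSI   nvme0q1
"""


def nic_irq_lines(text, pattern=r"eth|mlx"):
    """Return (irq, device name) pairs whose device name matches the pattern."""
    matcher = re.compile(pattern)
    rows = []
    for line in text.splitlines():
        # Split at the first colon; the CPU header line has none and is skipped.
        irq, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue
        name = fields[-1]  # the device/queue name is the last column
        if matcher.search(name):
            rows.append((irq.strip(), name))
    return rows


rows = nic_irq_lines(SAMPLE)
for irq, name in rows:
    print(irq, name)
print(len(rows), "matching interrupt lines")
```

With several matching lines for the same card, each one usually corresponds to a separate queue, which is what the RSS check is looking for.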

 

On the Windows side, you may also want to run something like the following in an admin PowerShell:

> Set-NetAdapterRss  -Name "Ethernet_10Gbe" -MaxProcessors 6 -NumberOfReceiveQueues 6 -BaseProcessorNumber 4 -MaxProcessorNumber 16 -Profile Closest

[Screenshot]

 

I'm not sure why Windows is reporting 'Client RDMA Capable = False' for me. I thought it showed 'True' at one point during my setup/config of the RSS feature, but I might be misremembering. The docs seem to indicate that you need the Windows Workstation license for RDMA as a *server*, but not as an RDMA *client*, so I'd hoped to be able to get that flipped on as well.

 

Though OTOH, I think the ConnectX-3 (non-Pro) requires correctly configured PFC or Global Pause in order to function over Ethernet/fiber (i.e., non-InfiniBand). This requires support in your switch, too.

 

RoCEv2 (in the ConnectX-3 Pro and beyond) might be more easily managed in that regard.

Edited by nick5429
Link to comment

@nick5429 Thanks for the help, even though nothing really changed in Windows after I executed:

Set-NetAdapterRss  -Name "Ethernet 2" -MaxProcessors 6 -NumberOfReceiveQueues 6 -BaseProcessorNumber 4 -MaxProcessorNumber 16 -Profile Closest 

[Screenshot: Windows]

I've also upgraded to Windows 11 Pro for Workstations in the meantime.

[Screenshot: Unraid]

Link to comment
