Poor connection speed to Unraid server despite 10G NIC.



I'm slowly despairing over Unraid and my poor network speed. 🙁

 

Hardware:

ASRock X570M Pro4

AMD Ryzen 5 PRO 4650G with Radeon IGPU

4x Kingston 9965745-026.A00G, 16 GiB DDR4 @ 3200 MT/s

2x Seagate Exos X16 12 TB

1x Seagate Exos X18 16 TB (Parity)

2x Samsung SSD 970 EVO Plus 2TB (organized separately as cache)

1x 10GbE PCIe Intel X520-DA1 (82599EN)

 

I've always had poor SMB transfer speeds with this hardware in combination with Unraid. Although I received a lot of advice and support in the German subforum, for which I'm very grateful, I gave up on the SMB issue since I could achieve 950 MiB/s over SFTP. Unfortunately, even that deteriorated over time for reasons unknown. Initially, I thought it was related to the switch to RC6, as I could only reach 60 MiB/s with SFTP, 20 MiB/s with SMB, and 40 MiB/s with NFS after the update. However, this seems to be only partially correct, as I have now reverted to 6.11.5 and still only achieve about 100 MiB/s with NFS.

 

Interestingly, this problem only occurs in connection with this specific hardware and Unraid.

I have since installed EndeavourOS, Ubuntu Server, openSUSE Tumbleweed, Fedora Workstation, Fedora Server, and Rocky Linux on the same hardware. There were only slight deviations between the operating systems, but all of them delivered the network performance expected from a 10G NIC.

 

I have turned the forum upside down and tried various things. Unfortunately, none of them made even the slightest difference. I have also wiped the USB stick several times to start completely fresh.

 

I even swapped out the cache SSDs and tried the NIC in all available slots. I experimented with various BIOS settings and changed the MTU from 9000 to 1500 and back.

 

It's not an issue with the clients either. I tried two Linux clients with 10G NICs, and they achieved nearly identical results. I also tested a Windows 11 client with a 2.5G NIC and another Linux client, with the same outcome: interestingly, they reached the same speed as the 10G clients. When the clients communicate with each other, the transfer rate is as it should be.

 

The issue isn't the switch either, since a direct connection between the client and the Unraid server achieves the same speed of only 100 MiB/s. Routing the connection from the Unraid server through the router doesn't make any difference either. The other clients communicate with each other without any problems over the same infrastructure. I have also tried rearranging the ports in every possible configuration.

 

It's not a cable issue either, as I have already tried several cables and even switched from DAC to AOC.

 

The NIC itself in the Unraid server is also not the problem, as it works perfectly fine in my desktop PC.

 

I tried adjusting various sysctl settings, but none of the changes made any difference.
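For reference, these are roughly the buffer-related tunables I experimented with. The values below are just the generic 10GbE suggestions that circulate in tuning guides, not anything specific to this board; the snippet is a minimal sketch that prints the current values and can optionally apply new ones (applying requires root and does not persist across reboots):

```python
#!/usr/bin/env python3
"""Print (and optionally apply) common TCP buffer sysctls for 10GbE testing.
Sketch only: example values, not a recommendation for this specific setup."""
from pathlib import Path

TUNABLES = {
    "net.core.rmem_max": "67108864",
    "net.core.wmem_max": "67108864",
    "net.ipv4.tcp_rmem": "4096 87380 67108864",
    "net.ipv4.tcp_wmem": "4096 65536 67108864",
    "net.ipv4.tcp_mtu_probing": "1",
}

def show_and_set(key: str, value: str, apply: bool = False) -> None:
    path = Path("/proc/sys") / key.replace(".", "/")
    print(f"{key} = {path.read_text().strip()}")
    if apply:  # needs root; not persistent across reboots
        path.write_text(value)
        print(f"  -> set to {value}")

if __name__ == "__main__":
    for key, value in TUNABLES.items():
        show_and_set(key, value, apply=False)  # flip to True to actually apply
```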

 

Could it be possible that there is a driver issue? As far as I know, Unraid uses kernel 5.19, while the other operating systems I tested, if I remember correctly, use version 6.x.
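A quick way to compare which kernel and driver each system is actually using is to read it straight from /sys (a small sketch, assuming the 10G interface shows up as eth0):

```python
#!/usr/bin/env python3
"""Show the running kernel version and the driver bound to the NIC.
Sketch only: assumes the 10G interface is named eth0."""
import os
import platform

IFACE = "eth0"  # assumption: adjust to the actual interface name

print("kernel:", platform.release())
driver_link = f"/sys/class/net/{IFACE}/device/driver"
if os.path.islink(driver_link):
    # the symlink target ends in the driver name, e.g. .../drivers/ixgbe
    print("driver:", os.path.basename(os.readlink(driver_link)))
else:
    print(f"no driver symlink for {IFACE} - wrong name or virtual interface?")
```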

 

interface_eth0.png

 

iperf.png

gundula-diagnostics-20230530-2340.zip

nic.png

Edited by Exord
1 hour ago, Exord said:

Could it be possible that there is a driver issue?

Not likely; I use the 82599 in various Unraid v6 builds and have no issues.

 

You have many retransmits (Retr) in the iperf output; that's abnormal.
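If you want to pull those numbers out programmatically for repeated runs, something like this works. It is only a sketch: it assumes iperf3 is installed on the client, `iperf3 -s` is already running on the Unraid server, and the server address is a placeholder:

```python
#!/usr/bin/env python3
"""Run a 10-second iperf3 test and report throughput plus TCP retransmits.
Sketch: assumes iperf3 is installed and `iperf3 -s` runs on the server."""
import json
import subprocess

SERVER = "192.168.1.10"  # placeholder: replace with the Unraid server's IP

result = subprocess.run(
    ["iperf3", "-c", SERVER, "-t", "10", "-J"],  # -J = JSON output
    capture_output=True, text=True, check=True,
)
sent = json.loads(result.stdout)["end"]["sum_sent"]

print(f"throughput:  {sent['bits_per_second'] / 1e9:.2f} Gbit/s")
print(f"retransmits: {sent['retransmits']}")  # should stay near zero on a healthy link
```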

 

You have four 16 GB sticks running at 3200 MT/s; please try setting them to 2666 MT/s or even 2400 MT/s to see if it makes any difference, and don't overclock BCLK / PCIe.

 

According to your previous post, if the problem was that writes to the SSD were limited to ~300 MB/s, then that is a different issue (not a network issue).

 

Edited by Vr2Io
7 hours ago, Vr2Io said:

 

You have four 16 GB sticks running at 3200 MT/s; please try setting them to 2666 MT/s or even 2400 MT/s to see if it makes any difference, and don't overclock BCLK / PCIe.

 

 

 

OK, I will try after work.

 

7 hours ago, Vr2Io said:

 

According to your previous post, if the problem was that writes to the SSD were limited to ~300 MB/s, then that is a different issue (not a network issue).

 

 

I think the old post I wrote back then no longer matches the current problem, which is why I wanted to start a new thread here. Of course, I could be wrong. At least the poor SSD-to-SSD performance was solved at the time. Today I don't even reach ~300 MB/s anymore, no matter whether I use NFS, SMB, or SFTP. Back then only the SMB connection was bad; today all connections are bad. It also doesn't matter whether I write to the SSDs or the HDDs.

 

Edit: To rule out compatibility problems with the card, I have ordered a Mellanox ConnectX-3. Thanks to Amazon Prime, it should arrive tomorrow.

Edited by Exord
12 hours ago, Exord said:

Just switched to 6.12.0-rc6. The CPU usage now goes much higher during a transfer. Previously the load was around 10%; now it goes up to 23% at times, and the data transfer is now at around 1.1 Gbps.

CPU usage usually isn't very important; sometimes those figures also vary depending on whether multi-threading is enabled or not.

 

Below are some captures FYR, SMB read and write on 6.12.0-rc6 (82599 NIC and NVMe SSD, MTU 1500). If I test on a slow SATA SSD, the throughput is less than half.

The 2nd picture is with the RAM cache flushed, i.e. real throughput to the physical SATA SSD.
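One way to flush the cache before a test like this is to drop the Linux page cache on the server, roughly as below (a sketch; needs to run as root on the server):

```python
#!/usr/bin/env python3
"""Flush the Linux page cache so the next test hits the physical disk.
Sketch only: must run as root on the server."""
import os

os.sync()  # write out dirty pages first
with open("/proc/sys/vm/drop_caches", "w") as f:
    f.write("3\n")  # 3 = drop page cache plus dentries and inodes
print("page cache dropped; the next test reads from disk instead of RAM")
```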

 

[screenshot 1: SMB transfer]

 

[screenshot 2: SMB transfer with RAM cache flushed]

 

12 hours ago, JorgeB said:

Can you post a screenshot of a transfer graph using Windows Explorer? And new diags saved during the transfer.

You should run those tests; JorgeB may have an idea.

Edited by Vr2Io
17 hours ago, JorgeB said:

Can you post a screenshot of a transfer graph using Windows Explorer? And new diags saved during the transfer.

The only Windows PC in the house is unfortunately not mine, so I don't always have access to it. But if Linux is enough for you, I have taken a couple of screenshots during a transfer. In the screenshots you can also see the statistics from the Dynamix System Statistics plugin during the copy.

 

The diagnostics file was created directly after the transfer.

 

gscreenshot_2023-06-02-102109.png

gscreenshot_2023-06-02-102130.png

gscreenshot_2023-06-02-102213.png

gscreenshot_2023-06-02-102311.png

gundula-diagnostics-20230602-1026.zip

  • 3 weeks later...

Unfortunately, the problem still exists. In desperation, I have changed some hardware again, since the dealer took back a few items.

The Unraid server now has a dual-port 10G Mellanox ConnectX-3 flashed with current firmware. Both ports are connected to a MikroTik CRS305-1G-4S+IN via 802.3ad aggregation, which is enabled in both Unraid and the MikroTik. Compared to a single port, the aggregation only added about 30 MiB/s to the transfer. :-\
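As far as I understand, 802.3ad hashes traffic per flow, so a single transfer would never exceed one 10G link anyway. Still, for completeness, here is a quick way to confirm that the bond actually came up in 802.3ad mode with both ports active (a sketch, assuming the bond interface is called bond0):

```python
#!/usr/bin/env python3
"""Print the relevant parts of the bonding driver's status report.
Sketch: assumes the aggregated interface is named bond0."""
from pathlib import Path

bond = Path("/proc/net/bonding/bond0")  # assumption: bond0 is the LACP bond
if bond.exists():
    for line in bond.read_text().splitlines():
        # keep only the lines needed for a quick sanity check
        if any(key in line for key in ("Bonding Mode", "Transmit Hash Policy",
                                       "Slave Interface", "MII Status", "Speed")):
            print(line.strip())
else:
    print("no bond0 found - is bonding enabled?")
```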

 

The desktop now has a 10G Aquantia NIC.

 

I would be super grateful for any further ideas.

 

