UPDATE 3/3/2021: I have definitively determined my performance issues are caused by WireGuard. I do not yet know if or when I'll find a solution.

UPDATE 8/1/2022: This is still very much broken. I try a file transfer every couple of months and it continues to be horribly slow. Using rsync over SSH outside the WireGuard tunnel works great and is what I will continue to use until I can figure this sh*t out.

FINAL UPDATE 11/24/2022: See my last post here for the solution and TL;DR.

Let me preface all of this by saying I'm not sure where my issue lies, so I'm going to lay out what I know and hopefully get some ideas on where to look for my performance woes.

The before times: Before setting up WireGuard I had SSH open to the world (with security and precautions in place) on my main server so that once a month my backup server could connect and push and pull content as defined in my backup script. This all worked splendidly for years and I always got my full speeds up to the bandwidth limit I set in my rsync parameters.

Now: With the release of WireGuard for UnRAID I quickly shut down my SSH port forward and set up WireGuard. I have one tunnel for my administrative devices and a second tunnel which serves as server2server access between NODE and VOID.

NODE is my main server, and runs 6.8.3 stable. It is located on a 100Mbps/100Mbps fiber line. UPDATE: As a last-ditch effort I upgraded NODE to 6.9.0-RC2 as well, no change in the issue.

VOID is my backup, runs 6.9.0-RC2 and lives in my home on a 400Mbps/20Mbps cable line.

In this setup, my initial rsync session will go full speed for anywhere from 5-30 minutes, then suddenly and dramatically drop in speed, down to 10Mbps or less and stay there until I cancel the transfer. I can restart the transfer immediately and regain full speed for a time, but it always eventually falls again.
Here is my rsync call:

rsync -avu --stats --numeric-ids --progress --delete -e "ssh -i /mnt/cache/.watch/id_rsa -T -o Compression=no -x -o StrictHostKeyChecking=no" root@NODE:/mnt/user/TV/Popeye/ /mnt/user/TV/Popeye/

Here is a small sample of the rsync transfer log to illustrate the sudden and sharp drop in speed:

Season 1938/Popeye - S1938E09 - Mutiny Ain't Nice DVD [BTN].mkv 112,422,538 100% 10.80MB/s 0:00:09 (xfr#24, to-chk=58/135)
Season 1938/Popeye - S1938E10 - Goonland DVD [BTN].avi 72,034,304 100% 9.76MB/s 0:00:07 (xfr#25, to-chk=57/135)
Season 1938/Popeye - S1938E11 - A Date to Skate DVD [BTN].mkv 138,619,127 100% 10.44MB/s 0:00:12 (xfr#26, to-chk=56/135)
Season 1938/Popeye - S1938E12 - Cops Is Always Right DVD [BTN].mkv 127,109,972 100% 11.02MB/s 0:00:10 (xfr#27, to-chk=55/135)
Season 1939/Popeye - S1939E01 - Customers Wanted DVD [BTN].mkv 114,673,044 100% 10.50MB/s 0:00:10 (xfr#28, to-chk=54/135)
Season 1939/Popeye - S1939E02 - Aladdin and His Wonderful Lamp DVD [BTN].mkv 325,996,501 100% 11.69MB/s 0:00:26 (xfr#29, to-chk=53/135)
Season 1939/Popeye - S1939E03 - Leave Well Enough Alone DVD [BTN].mkv 105,089,182 100% 11.30MB/s 0:00:08 (xfr#30, to-chk=52/135)
Season 1939/Popeye - S1939E04 - Wotta Nitemare DVD [BTN].mkv 149,742,115 100% 754.78kB/s 0:03:13 (xfr#31, to-chk=51/135)
Season 1939/Popeye - S1939E05 - Ghosks Is The Bunk DVD [BTN].mkv 114,536,257 100% 675.53kB/s 0:02:45 (xfr#32, to-chk=50/135)
Season 1939/Popeye - S1939E06 - Hello, How Am I DVD [BTN].mkv 92,083,730 100% 700.03kB/s 0:02:08 (xfr#33, to-chk=49/135)
Season 1939/Popeye - S1939E07 - It's The Natural Thing to Do DVD [BTN].mkv 110,484,799 100% 715.66kB/s 0:02:30 (xfr#34, to-chk=48/135)
Season 1939/Popeye - S1939E08 - Never Sock a Baby DVD [BTN].mkv 97,660,132 100% 716.88kB/s 0:02:13 (xfr#35, to-chk=47/135)
Season 1940/Popeye - S1940E01 - Shakespearian Spinach DVD [BTN].mkv 102,543,357 100% 632.64kB/s 0:02:38 (xfr#36, to-chk=46/135)
Season 1940/Popeye - S1940E02 - Females is Fickle DVD [BTN].mkv 102,363,188 100% 674.34kB/s 0:02:28 (xfr#37, to-chk=45/135)
Season 1940/Popeye - S1940E03 - Stealin' Ain't Honest DVD [BTN].mkv 100,702,236 100% 732.80kB/s 0:02:14 (xfr#38, to-chk=44/135)
Season 1940/Popeye - S1940E04 - Me Feelins is Hurt DVD [BTN].mkv 111,018,052 100% 672.35kB/s 0:02:41 (xfr#39, to-chk=43/135)
Season 1940/Popeye - S1940E05 - Onion Pacific DVD [BTN].mkv 103,088,015 100% 650.18kB/s 0:02:34 (xfr#40, to-chk=42/135)
Season 1940/Popeye - S1940E06 - Wimmin is a Myskery DVD [BTN].mkv 61,440,000 59% 757.02kB/s 0:00:56 ^C
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(701) [generator=3.2.3]

And here is my accompanying stats page during the same transfer. You can see the sudden decline around 11:46, which coincides with the sudden drop in transfer speed above.

I don't see anything telling in the system logs on either server when this speed drop happens. It almost seems like a buffer is filling up and not being emptied quickly enough, causing the speed to tank.

What I don't think it is: I don't think my issue is with WireGuard or my ISP speeds on either end. While the transfer is crawling along over SSH at sub-par speeds I can easily browse to NODE over WireGuard from my Windows or Mac computer, pick any file to copy over the tunnel, and fully saturate the sending server's upload with no issues while SSH is choking in the background.

Could it have something to do with the SSH changes that took place between 6.8.3 and 6.9.0? None of the changes I'm aware of sound like the culprit but I could be wrong. So besides that I'm pretty much out of ideas on what it could be without just playing with random ssh and rsync options.
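
For anyone who wants to find the exact point of the drop in a longer log, here is a rough Python sketch that pulls the per-file rate out of rsync's --progress output and reports the first file whose rate fell to less than a quarter of the previous one. The function names, the 0.25 threshold and the reading of rsync's kB/MB as 1024-based units are assumptions for illustration (the 1024-based reading does line up with the sizes and times in the log above); none of this is part of the backup script.

```python
import re

# Matches the completed-file summary printed by rsync --progress, e.g.
#   105,089,182 100% 11.30MB/s 0:00:08 (xfr#30, to-chk=52/135)
RATE = re.compile(r"[\d,]+\s+100%\s+([\d.]+)([kMG])B/s")

# Assumption: rsync's rate units are powers of 1024.
UNIT = {"k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_rates(log_text):
    """Return the rate of every completed file, in bytes per second."""
    return [float(value) * UNIT[unit] for value, unit in RATE.findall(log_text)]


def first_collapse(rates, ratio=0.25):
    """Index of the first file whose rate is below `ratio` times the
    previous file's rate, or None if no such drop is found."""
    for i in range(1, len(rates)):
        if rates[i] < rates[i - 1] * ratio:
            return i
    return None


sample = """
Season 1939/Popeye - S1939E03 - Leave Well Enough Alone DVD [BTN].mkv 105,089,182 100% 11.30MB/s 0:00:08 (xfr#30, to-chk=52/135)
Season 1939/Popeye - S1939E04 - Wotta Nitemare DVD [BTN].mkv 149,742,115 100% 754.78kB/s 0:03:13 (xfr#31, to-chk=51/135)
"""

rates = parse_rates(sample)
print(first_collapse(rates))
```

Fed the whole log, the reported index points at the first slow file, which can then be matched against timestamps in the system logs or the stats page.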
Let me know if there is some other info I can provide. Below are both servers' diagnostic files:

node-diagnostics-20210204-0751.zip
void-diagnostics-20210204-0752.zip

EDIT: I just realized LimeTech has a guide about this published: https://unraid.net/blog/unraid-server-to-server-backups-with-rsync-and-wireguard

I looked it over and I'm not really doing anything different except not passing -z (compression) to rsync and disabling compression for the SSH connection. A lot of what is transferred for me is video and doesn't compress well, so why waste the CPU cycles on it?

Poor file transfer performance over WireGuard

May 10, 2021

Author

On 5/5/2021 at 4:41 PM, Torben said:

I'm kind of in the same boat. 2 unRAID-Servers connected thru Wireguard, accessing SMB shares mounted thru Unassigned Devices (but it also happens when mounted manually). unRAID and all plugins are updated.

Here's what happens: The download starts and most of the time - not always though - it drops from 250 MBit/s to 10 MBit/s at some point in time. Sometimes after a minute, sometimes after hours, while copying a file and sometimes while starting to copy a new one, with rsync or Midnight Commander. I mostly can fix it by unmounting the share and remounting it and the game starts again. I tried troubleshooting for hours, using Google and the unRAID forum (and just trying out stuff) and more than once thought I made it...but it always comes back.

What I think I should mention is CPU load/multi core behavior. When WireGuard and SMB are working, multiple cores are used and the load jumps all over them, sometimes spiking up to 98% on a core for a split second and then using multiple cores again. But when the problem occurs, only one core is used. It shoots up to about 90% and some milliseconds later to 100% and it stays there for maybe two seconds. Then it jumps to another core, 90% 100% 2s, next core. But it doesn't seem to make a difference in "top", the CPU load per process seems to stay the same as if it's still using multiple cores.

Unfortunately I have no idea what info could be useful to continue troubleshooting, so please let me know if you have any suggestions.

I have not seen any correlation between CPU usage spikes and my wireguard performance woes. My processor can be entirely idle with nothing running and I'll still have a speed drop.

Anecdotally, I have had some marginal improvements in how long I can maintain a transfer at speed through the tunnel by adjusting the MTU for the tunnel down to 1420. However, this is not foolproof: it will still lose speed at some point and require me to restart the transfer.

At 1420 I was able to move about 300GB of data at speed over the course of 12-14 hours. In total this month I was able to move about 700GB and only had to restart the transfer 4-5 times.

May 10, 2021

Hm, it's weird that you don't have this CPU correlation.

Since the default MTU of WireGuard is 1420, I'd be surprised if this really made a difference, unless you didn't have it set to "auto" before.

In my experience it doesn't matter how much data or how many files you transfer; it seems to happen randomly, and that's what makes troubleshooting so fricking annoying. And when you lose speed, the usual cause is an MTU that's too high, so you need two packet fragments instead of one packet.

I didn't lose speed with MTU 1420, but I tried to find out the max MTU of my server's internet connection rather than WireGuard's. Besides getting weird results on unRAID (1600 working with a max of 1500...what?!), I tried it on my Windows machine connecting with the Windows WG client. The result was that the internet connection maxed out at MTU 1492 (and later on this was what unRAID also said), which means a WireGuard MTU of 1412 (subtracting WireGuard's default overhead of 80 bytes). I'm now trying it with MTU 1412, have max speed and so far have been copying for 45 minutes without a problem...but as we know this doesn't mean a thing.
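
The arithmetic behind those numbers can be sketched as follows. This assumes the usual WireGuard encapsulation overhead: an outer IPv4 (20 bytes) or IPv6 (40 bytes) header, an 8-byte UDP header and 32 bytes of WireGuard framing, so the 80-byte figure corresponds to the IPv6 worst case. The function name and constants are illustrative only, not something taken from the unRAID plugin.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
# WireGuard data message: 4 (type) + 4 (receiver index) + 8 (counter) + 16 (auth tag)
WIREGUARD_FRAMING = 32


def wireguard_mtu(link_mtu, outer_ipv6=True):
    """Largest tunnel MTU that fits in one packet of size `link_mtu`.

    outer_ipv6=True reserves room for an IPv6 outer header, which gives
    the 80-byte total overhead mentioned in the post.
    """
    ip_header = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return link_mtu - ip_header - UDP_HEADER - WIREGUARD_FRAMING


print(wireguard_mtu(1500))                     # standard Ethernet link
print(wireguard_mtu(1492))                     # the 1492-byte link measured above
print(wireguard_mtu(1492, outer_ipv6=False))   # same link, IPv4-only endpoint
```

With a 1500-byte link this gives the 1420 default, and with the measured 1492-byte link it gives 1412. If the tunnel endpoint is only ever reached over IPv4, there would be 20 bytes of headroom to spare.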

Edited May 10, 2021 by Torben

May 11, 2021

Author

Yeah I've read that the default is 1420 but I figured it couldn't hurt to set it manually in case it wasn't auto detecting correctly. I had also tried going all the way down to 1380 and it didn't seem to make much of any improvement.

I'm not sure if taking my MTU too far down could be making things worse, so maybe 1380 was too far the other way and 1412 is the sweet spot?

Let me know if you can get consistent results with several large transfers.

It would be great to get some sort of debug output from wireguard like I mentioned a few posts back with the kernel debug options. I don't know if we'll ever nail this down without some logs or WG dev help.

Edited May 11, 2021 by weirdcrap

May 11, 2021

Don't know if you already knew, but you can get the current MTU for all interfaces with: ifconfig | grep -i MTU

I'm by far no expert, but the MTU size depends on your server's internet connection. For LAN it's 1500, on the internet it mostly isn't. You can check it by using the "ping" command.

Working:

root@tower:/mnt/user# ping -c 1 -M do -s 1464 9.9.9.9
PING 9.9.9.9 (9.9.9.9) 1464(1492) bytes of data.
1472 bytes from 9.9.9.9: icmp_seq=1 ttl=60 time=10.6 ms

--- 9.9.9.9 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 10.645/10.645/10.645/0.000 ms



Not working:

root@tower:/mnt/user# ping -c 1 -M do -s 1465 9.9.9.9
PING 9.9.9.9 (9.9.9.9) 1465(1493) bytes of data.
ping: local error: message too long, mtu=1492

--- 9.9.9.9 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms

There's overhead to account for on the network: 28 bytes for a plain internet connection (the IP and ICMP headers of the ping) and 80 bytes for WireGuard. That's why I'm pinging with a packet size of 1464 instead of 1492 on the internet connection, and for WireGuard it would mean an MTU of 1412. For you it could mean something different, since every internet connection can have a different MTU size.
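
Rather than trying payload sizes by hand, the same ping test can be wrapped in a binary search. The following Python sketch is only an illustration: the function names and the 1200-1472 search range are assumptions, ping_fits relies on the Linux iputils flags used in the commands above (-M do, -s), and a single lost packet would be misread as "too big", so results are worth double-checking.

```python
import subprocess


def ping_fits(payload, host="9.9.9.9"):
    """True if one ping with `payload` data bytes and the don't-fragment
    flag set gets a reply (same flags as the manual test above)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-M", "do", "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_payload(fits, low=1200, high=1472):
    """Binary-search the largest payload in [low, high] accepted by fits().

    Assumes fits(low) is True and that every size below a working size
    also works.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if fits(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Simulated link with a 1492-byte MTU, so the example runs without a network.
ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header
simulated = lambda payload: payload + ICMP_OVERHEAD <= 1492
print(largest_payload(simulated) + ICMP_OVERHEAD)
```

On a real server, largest_payload(ping_fits) would run the actual pings; adding 28 to the result gives the path MTU of the internet connection.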

The networking guy at work said the MTU size of WG is independent of the internet connection, since it's a different protocol, but my logic tells me that WireGuard is still running over the internet connection and if the WG packet is too large, the internet connection can't handle it correctly. OK, I didn't have any problems before unRAID 6.9 (or I just didn't notice), but I've tried so much other stuff that I'm grasping at any straw I can to get consistently high transfer speeds again. 🙂

May 11, 2021

Author

My LAN MTUs are all 1500 and running traceroute --mtu myendpoint shows my MTU is not reduced at all before it reaches its destination out on the internet, so the default WireGuard setting of 1420 should be correct for me and my network.

I was simply saying I'd be willing to try tuning it lower just to see if it makes any difference if you had promising results. In my experience it doesn't seem to have any effect.

May 11, 2021

Alright. 🙂

I did 2 transfers yesterday, 200 GB in total, without a problem. Will try running another 200GB tonight and let you know if it worked.

May 13, 2021

I almost wanted to say: "Works", since I did several large transfers, but when I did the last test I ran into the same problem...CPU spikes up to 100% on one core and switches like that to all the others without ever going back to multi core. There are 2 differences: The speed first drops from 90 MBit/s to 80 MBit/s after some time and then shortly after to 70-80 MBit/s and stays there until everything goes wrong and the speed instantly drops. But now it drops to 13 MBit/s. Before it was instantly going to 10 MBit/s without the steps in between. And it was the same with a 250 MBit/s connection: an instant drop to 10 MBit/s.

I don't know, I'll try MTU 1350 next, found some stuff on some forums. And if that doesn't make a difference I'll probably search for a different solution, since this problem annoys the crap out of me.

Edited May 13, 2021 by Torben

May 13, 2021

Author

I feel your pain. Once a month when I sync my servers I get frustrated and do a day of furious research while I babysit the transfer. So far I haven't found anything to resolve the issue, but please do report back if you figure it out first.

May 13, 2021

I just wanted to let you know about my testing today: With MTU 1350 I reached speeds a bit higher and more stable than before and it worked very well until it stopped working. Same CPU sh*t as before.

Then I had the idea to see what happens if I don't max out the connection, using "rsync --bwlimit". I set it a bit below max speed, about 8 MBit/s (8%, better a bit less I guess) and holy shit...the CPU load used by WireGuard/the copy process reduced drastically overall. Every couple of seconds it uses one or 2 cores up to 45-70% total, the rest of the time the CPU is almost as idle as it is without a copy process running.

May 14, 2021

Author

Ah I didn't realize you weren't already using bwlimit. That might explain why I haven't seen the CPU spikes you have. My remote server shares bandwidth with a friend's business so I always make sure to only use roughly half of the available bandwidth.

So with the bandwidth limit can your WireGuard transfer consistently maintain its speed? I'm still seeing my speeds drop to 1MBps or less at random points during my tests with my bwlimit of 5MBps.
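
To turn "roughly half of the available bandwidth" into a number for rsync, a small sketch like the following can help. It assumes that a plain number passed to --bwlimit is read in units of 1024 bytes per second, as the rsync man page describes; the function name and the 100 Mbit/s example link are only illustrative.

```python
def bwlimit_kib(link_mbit_per_s, fraction=0.5):
    """Value for rsync --bwlimit (in KiB/s) that uses `fraction` of a
    link whose speed is given in Mbit/s."""
    bytes_per_s = link_mbit_per_s * 1_000_000 / 8 * fraction
    return int(bytes_per_s // 1024)


limit = bwlimit_kib(100)          # half of a 100 Mbit/s upload
print(f"--bwlimit={limit}")
```

The result is rounded down so the limit never exceeds the intended share of the link.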

Edited May 14, 2021 by weirdcrap

May 15, 2021

The first time I tried, it failed again after probably some hours, but I had to run the mover at the same time, since my cache drive filled up because of all the testing.

I had to do a large file transfer last night and made two changes at once...which you shouldn't do when troubleshooting, but...yeah. I uninstalled the Unassigned Devices plugin (just to make sure it doesn't do...something) and disabled settings in Dockers that react to new files, and the transfer went for 10h at max speed, just dropping from 8.5 MB/s to 7.5 MB/s for one file. I'll continue testing and see what happens.

I mounted with a simple "mount -t cifs -o username=USERNAME,iocharset=utf8 SOURCE TARGET" and "rsync --bwlimit..."

May 16, 2021

And here's the news:

Some hours after the successful 10h transfer I was browsing the remote share a bit, without reconnecting it, to check that everything was alright and wanted to start a new transfer. This transfer instantly started at 12 MBit/s, so the SMB connection was broken and I had to reconnect.

Now I got one of my colleagues to join the "WG-SMB-Test-Club", connecting to my server. She's got a way slower internet connection (--bwlimit set to half the internet connection, maybe I have to go lower for consistency) and especially a way less powerful CPU (J5040 vs 10900T).

We started with MTU 1350 and after 2 minutes the CPU stopped using multiple cores like it did before and just used one core, jumping in 2 steps to 100% and then switching to the next at 100%...the old story. Additional info: We tried to run "clear" in bash and "top" while this happened, but it took several seconds (5+) for the commands to be executed.

I then realized that we hadn't made a change I made on my server as one of the first steps of troubleshooting: Disable Settings -> Global Share Settings -> Tunable (support Hard Links)

So we killed all rsync processes, disconnected SMB, stopped the array, changed the setting and we've now been transferring for over 1.5h without a problem so far. Let's see when or if it fails.

I have no idea how all this correlates, or if it really does or everything is just a fluke, but I'll try to get everything as close as possible to "the same setup" and see if I can find more things in common.

Edited May 16, 2021 by Torben

May 16, 2021

Author

In my testing hardware doesn't seem to make a difference. I have two i7 Haswells each with 32GB of RAM and they struggle just as badly as my old AMD FX-6300 did.

I would be surprised if Hardlinks being on or off makes a difference. I have that setting on and never considered turning it off. If you think you see improvements with it off I'll give it a shot.

Like right now I just started a transfer no more than a few minutes ago and the speed has already tanked to barely 1MB/s. My CPU usage on both servers is <5%.

May 16, 2021

It's hard to say if it makes a difference or not, since we now had the problem again that the transfer was staying at 10 MBit/s almost immediately after a fresh SMB mount. But my colleague was surfing the web, so it could be the overall speed drops that caused it. As far as I can tell, I had fewer problems with hardlinks turned off and since I don't experience any other problems because of that, I'll leave it off for now.

What I wanted to say about the difference in hardware was that it doesn't matter how much power your cores have, it's always one core at 100%. Did you check overall CPU usage or single cores? The overall usage stays the same, it's just not multi core with e.g. 1 core at x% and one core at y% (with bwlimit it just ramps up every couple of seconds), it's one core at 100% and the rest is handling the background noise. We now have 2 unRAID servers showing this behavior. Also both servers show used/free space of the remote share in Unassigned Devices as 0/0, not the real numbers as they do when everything's fine. But that's an already known thing, as far as I could tell from googling through the last three weeks of trying to work around this issue.

I bet we're both talking about the same speeds, but you're talking about 1 MB/s and I'm talking about 10 MBit/s (I'm using the network section of the unRAID Dashboard to check the speed). To me it looks like a fallback speed, since it's 10 MBit/s no matter what...1 GBit or 50 MBit download connection, 100 or 250 or 500 MBit upload connection: always 10 MBit/s in this "fail state" scenario.
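
Since the two sets of numbers in this thread use different units, here is a tiny conversion sketch. It assumes plain decimal units (1 byte = 8 bits, no 1024-based prefixes), which is close enough for comparing the dashboard's MBit/s with rsync's MB/s; the function names are made up for the example.

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / 8


def mbyte_to_mbit(mbyte_per_s):
    """Convert megabytes per second to megabits per second."""
    return mbyte_per_s * 8


print(mbit_to_mbyte(10))   # the 10 MBit/s "fallback" expressed in MB/s
print(mbyte_to_mbit(1))    # a 1 MB/s rsync rate expressed in MBit/s
```

So a 10 MBit/s floor on the dashboard is 1.25 MB/s in rsync's terms, and 1 MB/s is 8 MBit/s, which is the same ballpark.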

May 17, 2021

Author

I checked overall CPU usage on the stats page as well as individual core utilization on the WebUI dashboard. With nothing else running except a file transfer I don't see utilization over 5% on any single core. There may be an occasional spike but it always quickly drops back down to almost nothing.
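
For checking per-core load outside the dashboard, a rough sketch along these lines could be used. It assumes the standard Linux /proc/stat layout (per-core lines starting with cpu0, cpu1, ... followed by user, nice, system, idle, iowait, irq, softirq and steal tick counters); the function names and the sample numbers are invented for the example.

```python
def parse_proc_stat(text):
    """Map 'cpuN' -> (busy_ticks, total_ticks) from /proc/stat contents."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate "cpu" line and everything that is not a core.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            ticks = [int(x) for x in fields[1:9]]
            idle = ticks[3] + ticks[4]            # idle + iowait
            total = sum(ticks)
            cores[fields[0]] = (total - idle, total)
    return cores


def per_core_usage(before, after):
    """Percent busy per core between two parse_proc_stat() snapshots."""
    usage = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before[core]
        elapsed = total1 - total0
        usage[core] = 100.0 * (busy1 - busy0) / elapsed if elapsed else 0.0
    return usage


# Two made-up snapshots: cpu0 is pegged, cpu1 is nearly idle.
snap1 = "cpu 200 0 200 1600 0 0 0 0\ncpu0 100 0 100 800 0 0 0 0\ncpu1 100 0 100 800 0 0 0 0"
snap2 = "cpu 310 0 200 1690 0 0 0 0\ncpu0 200 0 100 800 0 0 0 0\ncpu1 110 0 100 890 0 0 0 0"
print(per_core_usage(parse_proc_stat(snap1), parse_proc_stat(snap2)))
```

On a live system the two snapshots would come from reading /proc/stat a second or two apart while the transfer is in its slow state; a single core near 100% with the rest idle would match the behavior Torben describes.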

Yes, we are referring to roughly the same speeds; I use the big B for bytes. I agree, I have also noticed that in the last few months it does seem to linger around 1MB/s more so than it used to. When this issue first started for me, when the speed dropped it would drop down into the kilobytes and stay there until I cancelled it. 1MB/s is an improvement over the dial-up-like speeds I was getting before.

June 30, 2021

Author

This is still very much broken for me. Right now no matter how many times I restart the transfer or the servers I can't even top 100KB/s.

My current file is running at 1.5KB/s, it's disgusting how badly this runs.

If I find time this weekend I'm going to go bother the wireguard devs on IRC and see if they have any ideas. I really really want this to work but in its current state it is unusable.

Edited June 30, 2021 by weirdcrap

June 30, 2021

Funny that you posted today...after some frustrating time I tried some stuff over the weekend until now. I reset everything I changed before to defaults and gave it a try. It sucks more than with changing MTU and what not, but...yeah.

1. When running with the "Userscripts" plugin in the background (or "manually" in bash) and high speeds, it fails within 1-2 minutes dropping to the speed we already found as some kind of fallback speed.

2. When running with the "Userscripts" plugin in the foreground and high speeds, it runs between 15 and 60 minutes, then drops.

3. I limited rsync speed to 10 MB/s (80-85 MBit/s) and it's behaving like 2.

4. We tried using NFS to see what happens. Running without limiting rsync it runs full speed for the first file (250 MBit/s) and drops to 85-100 MBit/s afterwards. I limited rsync to have less CPU usage and transferred almost 2TB at 85-100 MBit/s without a problem. So it's not working correctly, but it sucks way less than SMB. Unfortunately NFS can't be secured the way I'd like, so I'm still hoping someone finds a solution.

The only thing I haven't tried so far is switching to SMBv1 for testing purposes (found that on Reddit or somewhere similar). And I still see the same behavior of one CPU core going up to 100% and jumping to the next when failing (with SMB). I'm almost thinking about a driver issue, kernel issue with newer CPUs, C-State issue, I have no fricking idea.

July 20, 2022

Author

Still having the same problems with this. I'm going to rebuild the server from scratch with all new hardware and we'll see if that makes any difference. I'm not holding out any hope though.

July 20, 2022

Me too. And even NFS speed over WireGuard got worse, running at about 60% of the speed it had before. So I tried to find a workaround and installed an Ubuntu VM with the folders mounted via mount tags on the receiving unRAID server, which is the side I found out causes the problem (even a freshly installed test server on different hardware works on 6.8.3, while the same server on 6.9+ shows the problem). So far it's been working great for a couple of weeks now. In the beginning I wanted to investigate some more why it works in the VM, or whether I can replicate the problem in the VM, since I think having a VM just for doing what the host was/should be capable of is "a bit" unnecessary, but well, a lot of time spent and no real idea left where to continue.

I'm curious what happens when you have new hardware.

August 1, 2022

Author

I may have to go the VM route as well. The new hardware made no difference in the file transfer speeds. Not that I honestly expected it to.😥

November 25, 2022

Author
Solution

A final update to this.

Unfortunately I had to abandon using UnRAID as the WireGuard server as I couldn't resolve this. This issue coupled with my other issue drove me to invest in putting my own router in front of the remote server.

I've now built a site to site WireGuard tunnel between my two routers and everything is working exactly as I would expect it to over the WireGuard tunnel. I'm getting my full speed up to the bandwidth limit I set in the rsync command.

So TL;DR is don't expect UnRAID's implementation of WireGuard to be able to move large amounts of data without choking. At least not as of this post, hopefully it can be improved in the future.

November 25, 2022

That's also my plan, the Pi4 with OpenWRT is already prepared.

The sad thing about this is that it was working flawlessly with 6.8.3 and something somewhere somehow broke the functionality starting with 6.9. But well, because of that I started looking at OpenWRT and got something new to tinker with.

November 25, 2022

Author

Oh yeah it never worked right for me. From the day I set it up it ran like crap.

pfSense's implementation works great. Got a Protectli FW4B at home and a J4125 at the remote site, both running pfSense 2.6.0.

January 24, 2023

Unfortunately, no one is working to fix the problem?

January 25, 2023

Author

The problem is no one seems to be able to definitively pinpoint the cause of the issue.

LimeTech is apparently unable to reproduce this in their testing and it seems to be limited to only a small subset of users so it just isn't garnering much attention I think. The more people who find this thread and share their experiences the more likely someone will start to take a more serious look at the problem.

The only solution at this time is to just not use UnRAID as a WireGuard server if you want to be able to move large amounts of data quickly.

Edited January 25, 2023 by weirdcrap

Solved by weirdcrap
