Very slow performance - TCP out of memory

SP67 · February 20, 2022

Hi,

For the past few days, performance has been very slow (close to unusable). Access via SMB (over VPN) has dropped from a couple of MB/s to 50-100 kB/s, the web interface is painful to use, etc.

I've checked the system log and found the following error repeated: TCP: out of memory -- consider tuning tcp_mem.

Searching online turned up nothing helpful. I'm attaching more lines from the log in case they help.

Thanks!

Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS kernel: veth31a3898: renamed from eth0
Feb 20 16:48:37 NAS avahi-daemon[1742]: Interface veth7126499.IPv6 no longer relevant for mDNS.
Feb 20 16:48:37 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth7126499.IPv6 with address fe80::5036:b8ff:fe52:19d.
Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS kernel: device veth7126499 left promiscuous mode
Feb 20 16:48:37 NAS kernel: docker0: port 11(veth7126499) entered disabled state
Feb 20 16:48:37 NAS avahi-daemon[1742]: Withdrawing address record for fe80::5036:b8ff:fe52:19d on veth7126499.
Feb 20 16:50:23 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:23 NAS kernel: veth75ec41e: renamed from eth0
Feb 20 16:50:24 NAS avahi-daemon[1742]: Interface veth1cd4cd0.IPv6 no longer relevant for mDNS.
Feb 20 16:50:24 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth1cd4cd0.IPv6 with address fe80::4470:62ff:fe0c:e210.
Feb 20 16:50:24 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:24 NAS kernel: device veth1cd4cd0 left promiscuous mode
Feb 20 16:50:24 NAS kernel: br-926959b88699: port 2(veth1cd4cd0) entered disabled state
Feb 20 16:50:24 NAS avahi-daemon[1742]: Withdrawing address record for fe80::4470:62ff:fe0c:e210 on veth1cd4cd0.
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS kernel: veth945447c: renamed from eth0
Feb 20 16:53:46 NAS avahi-daemon[1742]: Interface veth9105c65.IPv6 no longer relevant for mDNS.
Feb 20 16:53:46 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth9105c65.IPv6 with address fe80::64e4:56ff:fe1f:baa6.
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS kernel: device veth9105c65 left promiscuous mode
Feb 20 16:53:46 NAS kernel: br-926959b88699: port 1(veth9105c65) entered disabled state
Feb 20 16:53:46 NAS avahi-daemon[1742]: Withdrawing address record for fe80::64e4:56ff:fe1f:baa6 on veth9105c65.
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS kernel: veth5922185: renamed from eth0
Feb 20 16:54:14 NAS avahi-daemon[1742]: Interface veth33f6111.IPv6 no longer relevant for mDNS.
Feb 20 16:54:14 NAS avahi-daemon[1742]: Leaving mDNS multicast group on interface veth33f6111.IPv6 with address fe80::40ba:fbff:fef8:f50d.
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS kernel: device veth33f6111 left promiscuous mode
Feb 20 16:54:14 NAS kernel: docker0: port 12(veth33f6111) entered disabled state
Feb 20 16:54:14 NAS avahi-daemon[1742]: Withdrawing address record for fe80::40ba:fbff:fef8:f50d on veth33f6111.
Feb 20 16:54:32 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:55:15 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:56:03 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 16:59:27 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 17:00:13 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
Feb 20 17:00:13 NAS kernel: TCP: out of memory -- consider tuning tcp_mem
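
For anyone landing here with the same kernel message: a minimal sketch, assuming a Linux host with Python 3, for comparing the TCP memory currently in use with the tcp_mem thresholds the message refers to. The function names are made up for illustration. The two files it reads are the standard procfs entries (/proc/sys/net/ipv4/tcp_mem and /proc/net/sockstat), both counted in pages. The classification is only a rough guide, not the kernel's exact logic.

```python
#!/usr/bin/env python3
"""Compare TCP memory in use with the kernel's tcp_mem thresholds (Linux)."""
from pathlib import Path


def parse_tcp_mem(text):
    """Return (low, pressure, high) page counts from tcp_mem contents."""
    low, pressure, high = (int(v) for v in text.split())
    return low, pressure, high


def parse_sockstat_tcp_pages(text):
    """Return the 'mem' field (in pages) from the TCP line of sockstat."""
    for line in text.splitlines():
        if line.startswith("TCP:"):
            fields = line.split()[1:]  # alternating name/value pairs
            stats = dict(zip(fields[0::2], map(int, fields[1::2])))
            return stats["mem"]
    raise ValueError("no TCP line found in sockstat text")


def describe(pages_in_use, thresholds):
    """Rough classification of usage against the three thresholds."""
    _low, pressure, high = thresholds
    if pages_in_use >= high:
        return "at or over the high threshold"
    if pages_in_use >= pressure:
        return "under memory pressure"
    return "ok"


def main():
    tcp_mem_path = Path("/proc/sys/net/ipv4/tcp_mem")
    sockstat_path = Path("/proc/net/sockstat")
    if not (tcp_mem_path.exists() and sockstat_path.exists()):
        print("tcp_mem or sockstat not available here")
        return
    thresholds = parse_tcp_mem(tcp_mem_path.read_text())
    in_use = parse_sockstat_tcp_pages(sockstat_path.read_text())
    print(f"tcp_mem (low, pressure, high) pages: {thresholds}")
    print(f"TCP pages in use: {in_use} -> {describe(in_use, thresholds)}")


if __name__ == "__main__":
    main()
```

If the pages in use sit near the high value, the next question is what is holding all those sockets open, rather than simply raising the limit.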

Squid · February 20, 2022

8 minutes ago, SP67 said:

Access via SMB (using VPN) has become really slow

Is everything running normally when accessing locally?

SP67 · February 20, 2022

I'm abroad, but from what my girlfriend tells me, it's also very slow locally. Jellyfin is also unusable over local access.

SP67 · February 20, 2022

The "Fix Common Problems" plugin has found the following error:

"Your server has run out of memory, and processes (potentially required) are being killed off. You should post your diagnostics and ask for assistance on the unRaid forums". I'm attaching the diagnostics.

nas-diagnostics-20220220-1137.zip

Squid · February 20, 2022

That out of memory that FCP is detecting is the TCP thing (never seen that message before, and it's not what FCP is designed to catch). I would start by rebooting and then go from there.

SP67 · February 20, 2022

OK, I'll try rebooting.

I can't stop the array because the mover is running, which I find strange: it's scheduled to start at 3 AM, it's currently 6 PM, and the server hasn't seen much use lately because of its performance. I'll give it a couple more hours, but any advice in case it's stuck?

SP67 · February 20, 2022

OK, so I think my USB drive just died. Here's what I did, following this guide: https://wiki.unraid.net/Console#To_cleanly_Stop_the_array_from_the_command_line

Stopped mover

Stopped docker

Stopped Samba

umount /dev/md1

umount /dev/md2 (only 2 disks + parity in the array)

/root/mdcmd stop

reboot

Now the server refuses to boot. I do have a backup of the USB drive made with the My Servers plugin, but I’m not sure how to proceed.

Edited February 20, 2022 by SP67

itimpi · February 20, 2022

From the screenshot it looks like you may have problems with your flash drive.

You should try plugging it into a PC/Mac and running a check.

SP67 · February 20, 2022

OK, so I’ve run chkdsk /f and /r, and it fixed an error on the USB drive. Now the server boots again. I’ll check the performance and report back.

thanks

Squid · February 20, 2022

1 hour ago, SP67 said:

https://wiki.unraid.net/Console#To_cleanly_Stop_the_array_from_the_command_line

Absolutely ancient (circa 2011) instructions that don't come close to taking into account anything else running on the server (VMs, Docker, other services, etc.).

You want to do

powerdown

SP67 · February 20, 2022

Last time I did that I had issues with the array not stopping, so I thought about finding another way.

Squid · February 20, 2022

This is how you actually do it via the command line (a pure stop). It ain't pretty, and is even more aggravating nowadays due to csrf_token requirements

uek2wooF · December 21, 2022

I also had these TCP out of memory errors. It turns out there were zillions of processes that looked like "[wget]". Not sure if they were in a docker container or what. Restarting docker did not fix anything, though. I killed them off and then everything was fine. I've had these runaway wgets a couple of times and I'm not sure what is causing them.

I was about to post just that, but then:

Haha I just went and looked and guess what, zillions of these:

/usr/bin/wget -q -O - -T 20 -U MakeMKV/v1.15.2/linux(x64-release) -o /dev/null http://hkdata.fairuse.org/svq/sdf.dat.gz

The weird thing is, I haven't had that MakeMKV container in a while. What could be spawning these???

the parent pid of the wgets is 32120, and look:

root     32120 32101  0 Dec18 ?        00:00:00 /usr/bin/python3 -u /sbin/my_init
root     32274 32120  0 Dec18 ?        00:00:00 /bin/bash /etc/my_init.d/ripper.sh
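
The lookup done by hand here (count the stray processes, then see which parent they hang off) can be sketched in a few lines. This is a rough sketch, assuming Python 3 and a procps-style `ps` that accepts `-eo pid,ppid,comm`. The function names and the threshold of 20 are arbitrary choices for illustration.

```python
#!/usr/bin/env python3
"""Group processes with a given command name by parent PID."""
import subprocess
from collections import Counter


def parse_ps(text):
    """Parse `ps -eo pid,ppid,comm` output into (pid, ppid, comm) tuples."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) == 3:
            rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return rows


def runaway_parents(rows, name, threshold=20):
    """Count processes called `name` per parent PID.

    Only the first word of the command column is compared, so entries
    that ps marks with a trailing "<defunct>" are counted as well.
    Parents with fewer than `threshold` matches are dropped.
    """
    counts = Counter(ppid for _, ppid, comm in rows if comm.split()[0] == name)
    return {ppid: n for ppid, n in counts.items() if n >= threshold}


def main():
    try:
        out = subprocess.run(
            ["ps", "-eo", "pid,ppid,comm"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not run ps: {exc}")
        return
    rows = parse_ps(out)
    commands = {pid: comm for pid, _, comm in rows}
    for ppid, n in runaway_parents(rows, "wget").items():
        print(f"{n} wget processes under PID {ppid} ({commands.get(ppid, '?')})")


if __name__ == "__main__":
    main()
```

On a healthy box this prints nothing. A parent with dozens of wget children is the one to look at.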

aha, from docker ps (why so many spaces in there after uptime???):

a41e2947de6c   rix1337/docker-ripper   "/sbin/my_init"          2 years ago     Up 45 hours                                                                                                    Ripper

WTF???

Oh ok:

image.png.76de84e53f802f0376c45338b34397cb.png

I must have installed this broken Ripper app at some point maybe after I got rid of MakeMKV. I wonder what crap that wget is trying to do. Anyway I will remove this container and hopefully my unraid flakiness goes away.
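
A possible shortcut for the "is this in a docker container or what" question: on a Linux host, /proc/<pid>/cgroup usually contains the full container ID for a process started by Docker, and its first 12 characters are what `docker ps` shows in the CONTAINER ID column. A minimal sketch, assuming Python 3 and that the cgroup path embeds the 64-character ID (the exact path layout varies with the cgroup setup, so treat a miss as "unknown" rather than "not in a container"). The function names are hypothetical.

```python
#!/usr/bin/env python3
"""Guess which Docker container a PID belongs to from its cgroup file."""
import re
import sys
from pathlib import Path

# Full Docker container IDs are 64 lowercase hex characters.
_CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def container_id_from_cgroup(text):
    """Return the first 64-hex ID found in /proc/<pid>/cgroup text, or None."""
    match = _CONTAINER_ID.search(text)
    return match.group(0) if match else None


def container_for_pid(pid):
    """Read /proc/<pid>/cgroup and return the container ID, or None."""
    try:
        return container_id_from_cgroup(Path(f"/proc/{pid}/cgroup").read_text())
    except OSError:  # process is gone, or /proc is not available
        return None


if __name__ == "__main__":
    # Usage: pass one or more PIDs on the command line.
    for arg in sys.argv[1:]:
        cid = container_for_pid(int(arg))
        print(arg, cid[:12] if cid else "no container ID found")
```

Run against the parent PID of the stray processes, the short ID it prints can be matched against the first column of `docker ps`.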
