• [6.12.1] Docker RAM usage increase


    Shagon
    • Minor

    It seems that, compared to 6.11.5, Docker is using more RAM overall.

    Looking at the last 7 days, usage hovered around 2GB. After installing 6.12.1 and rebooting it jumped to 6GB; after manually restarting each container it now hovers around 3GB.

     

    110515940_Screenshot2023-06-23at11_54_40AM.thumb.png.7adc2c38316b6ed3298318c5a1f57863.png

     

     

    Taking a closer look at the last 12 hours:

    410629850_Screenshot2023-06-23at11_54_56AM.thumb.png.0d23c42d62c2ba56e40380ab34162a3e.png

     

    We can clearly see the jump in RAM usage and the point where I restarted the containers. Overall usage is simply higher than with the Docker version shipped in 6.11.5.
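    For anyone wanting a single number to compare before/after an upgrade, here is a rough sketch that totals per-container memory from `docker stats` output. The `sum_mem` helper name and the sample input are my own, not from this thread:

```shell
# Sum the "MEM USAGE" column of `docker stats` output into one total.
# On a live system, pipe in:
#   docker stats --no-stream --format '{{.Name}}\t{{.MemUsage}}'
sum_mem() {
  awk -F'\t' '{
    split($2, a, " / ")                     # "107.1MiB / 15.49GiB" -> usage part
    v = a[1]
    if (v ~ /GiB/)      { sub(/GiB/, "", v); total += v * 1024 }
    else if (v ~ /MiB/) { sub(/MiB/, "", v); total += v }
  } END { printf "%.1f MiB\n", total }'
}

# Demo with made-up sample values:
printf 'adguard\t107.1MiB / 15.49GiB\nplex\t1.2GiB / 15.49GiB\n' | sum_mem
# -> 1335.9 MiB
```

    Running this before and after an OS upgrade gives a comparable baseline, independent of what the dashboard graphs report.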

    flareon-diagnostics-20230623-1157.zip




    Recommended Comments

    Additional issue: the machine crashed for no apparent reason. Rolling back to 6.11.5, as I had zero issues on that version for months.

    Link to comment

    I would agree that Docker RAM seems to have increased a lot since 6.12. I have no issues with the server though.

    (I made no change to my containers)

     

    image.thumb.png.0f06e1fc32b723c8140d8f2034fd1164.png

    Link to comment

    I don't have before/after graphs, but I have 32GB RAM and 60 containers running and am at ~50% RAM usage, which is about the same as before.

    Edited by Kilrah
    Link to comment

    RAM usage with Docker 20.10.21 is fine. My initial thought was that a large change in Docker introduced this; it might be a memory leak introduced in Docker version 23, but I can't pinpoint the cause. Either way, it's a significant side effect: usage seems to jump 50% compared to the previous version, even though no changes were made to the containers.

     

    The main issue I have now is that 6.12.1 is unstable, crashing the OS after several hours.

    I've reverted to 6.11.5 as it is much more stable; I ran it for months without issues.

     

    Looking at posts on Reddit as well as here, it seems I'm not the only one. Narrowing down the issue, my guess is that the newer kernel's support for Intel CPUs (I have an i3-10100) isn't as solid (10th or 11th gen issues?).

     

    Either way, I'm happy being on the previous version for a few more months until everything is resolved :)

     

     

    Link to comment

    The question is:

    • is it an Unraid problem?
    • or a Linux + Docker problem (i.e. new kernel + new Docker version)?

     

    If it's the second, there is not much that can be done here.

    Link to comment

    I'm not sure I see a significant improvement with 6.12.2.

    It seems that some containers are more affected than others (Plex and Jellyfin, for example, in my case).

     

    image.thumb.png.8f6098bc3aad63a232c5fc76fcbd0cd0.png

     

    On the other hand, I am wondering whether it is an actual RAM usage problem or a reporting issue from Docker and/or the kernel's interpretation (my table above uses 'docker_container_mem').

     

    Over the same time period as the image above (15 days), my system RAM usage does not seem to fluctuate much, while Docker is supposed to have gone from an average of 8-9GB on 6.11 to ~27GB.

    My total RAM is 64GB, so going from 9GB to 27GB should be clearly visible on the system.

     

    image.png.21c65430f97b396c21170885e394dcf3.png

     

    The Unraid Dashboard is consistent with my Grafana dashboard.

    image.png.b4e8a9d22bb232cc9a86b3d1507198f5.png

    Link to comment

    Getting back to the crashes, I narrowed them down to ipvlan; for some reason it crashes more often than macvlan for me. The macvlan setting is more stable, at least on my 16GB DDR4, B460M-K, i3-10100 system: with it I managed 6+ months without a single crash, while with ipvlan the system crashed 3 times in 48 hours (twice on 6.12.1 and once on 6.11.5). My only conclusion is that ipvlan doesn't play well with this system.

     

    That being said, I am hesitant to upgrade to 6.12.2 given the RAM usage and the crashes above. I can live with the RAM usage, since I can just buy more RAM, but crashes are a big issue: I run DNS for the household as well as Plex, which _requires_ 100% uptime as family members do not use Netflix or other subscription services 😄

     

    Is there any way for me to troubleshoot why ipvlan crashes the system? Or better yet, are there any plans to deprecate macvlan and remove it from future versions of Unraid?

    I'm basically looking for a stable solution whatever it might be.

    Link to comment

    In my setup I narrowed the Docker crash down to a container that interacts with the network layer. I'm using 6.12.1 with ipvlan.

     

    https://github.com/thrnz/docker-wireguard-pia

     

    The idea is to tell other containers to use this one for their networking. The container acts as a VPN router that redirects traffic to Private Internet Access.

     

    As soon as I removed it, my server was fully stable again, so my guess is there are some stability issues in the networking stack.

    To be clear, I already had these issues on 6.11.5, so I think there is an underlying issue in Docker networking.

     

    Weirdly enough, if I use a container with built-in WireGuard capabilities, it just works and the server stays stable.

    Link to comment

    Have you tried switching the network driver (Settings > Docker) from macvlan to ipvlan?

    Link to comment

    I tried ipvlan, but the system seems to crash with it sometimes. Ever since I rolled back to 6.11.5 I've had only one issue (which I already reported, though it seems nothing can be done about it).

    Considering the increased RAM usage and everything else, I am tempted to leave the system alone until 6.12.5 or a later version, as daily reboots are not something my family will tolerate (it's a media server for 4 people).

     

    I really need to make sure the system is stable before updating. macvlan seems stable in my usage: I've had zero reboots because of it until today. I changed to ipvlan and had issues on 6.12.2, with at least 3 reboots in 2 days.

     

    If there's anything I need to do to make 6.12.x work properly on my machine, please let me know. Otherwise, I can't have daily crashes that might corrupt the data on disk, or have the server be down, since multiple people in my household use it (several services, mostly Plex and book reading).

    Link to comment

    Even with the above, I can take the increased RAM usage; it's the system stability that's critical. Trying out ipvlan vs macvlan and tweaking things is something I could do if I were alone, but with a family that _depends_ on the media on the server, I can't change things on the fly without knowing whether it will crash while I'm away :(

     

    Again, I am willing to do anything to ensure system stability. I don't even run VMs on this thing, just Docker containers. Unraid, for me at least, is a better alternative than other software offering the same thing: a stable OS with upgrade support and a friendly, helpful community.

     

    As a family person, I am done with the days when I could crimp my own networking cables and fine-tune the system to my liking; now I just need things to work :)

    Link to comment

    I've performed the following changes today:

     

    - Updated OS to 6.12.3 (`6.11.5` => `6.12.3`)

    - Updated BIOS to latest version (was a few versions behind)

    - Removed 5 containers

    - Added `--log-driver none --no-healthcheck` to all containers

     

    Docker custom network type is still set to `macvlan`; I'll monitor for any crashes.
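    For context, these flags go into each container's "Extra Parameters" field (Advanced View) in Unraid, or directly on a `docker run` line. A minimal sketch; the container name and image below are hypothetical, not the ones from my setup:

```shell
# Hypothetical example of the same flags on a plain `docker run`:
#   --log-driver none  -> discard container stdout/stderr instead of storing it
#   --no-healthcheck   -> skip the image's HEALTHCHECK (each check forks a process)
docker run -d --name yarr \
  --log-driver none \
  --no-healthcheck \
  ghcr.io/nkanaev/yarr:latest
```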

     

    Link to comment

    Not an hour later I see the following via `dmesg`:

     


    [Mon Jul 24 14:40:18 2023] ------------[ cut here ]------------
    [Mon Jul 24 14:40:18 2023] WARNING: CPU: 2 PID: 80 at net/netfilter/nf_conntrack_core.c:1210 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    [Mon Jul 24 14:40:18 2023] Modules linked in: af_packet xt_mark veth xt_nat xt_tcpudp macvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat xt_addrtype br_netfilter xfs ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) nct6775 nct6775_core hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc i915 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp iosf_mbi drm_buddy i2c_algo_bit coretemp ttm crct10dif_pclmul crc32_pclmul drm_display_helper crc32c_intel joydev ghash_clmulni_intel input_leds sha512_ssse3 mei_hdcp mei_pxp drm_kms_helper wmi_bmof mxm_wmi aesni_intel drm crypto_simd cryptd intel_gtt rapl intel_cstate r8169 i2c_i801 ahci mei_me agpgart i2c_smbus syscopyarea sysfillrect hid_apple sysimgblt tpm_crb intel_uncore i2c_core mei libahci realtek led_class fb_sys_fops thermal fan tpm_tis
    [Mon Jul 24 14:40:18 2023]  video tpm_tis_core wmi tpm backlight intel_pmc_core acpi_pad acpi_tad button unix
    [Mon Jul 24 14:40:18 2023] CPU: 2 PID: 80 Comm: kworker/u16:3 Tainted: P           O       6.1.38-Unraid #2
    [Mon Jul 24 14:40:18 2023] Hardware name: ASUS System Product Name/PRIME B460M-K, BIOS 1620 07/09/2021
    [Mon Jul 24 14:40:18 2023] Workqueue: events_unbound macvlan_process_broadcast [macvlan]
    [Mon Jul 24 14:40:18 2023] RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    [Mon Jul 24 14:40:18 2023] Code: 44 24 10 e8 e2 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 7e e6 ff ff 84 c0 75 a2 48 89 df e8 9b e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 18 dd ff ff e8 93 e3 ff ff e9 72 01
    [Mon Jul 24 14:40:18 2023] RSP: 0018:ffffc900001a8d98 EFLAGS: 00010202
    [Mon Jul 24 14:40:18 2023] RAX: 0000000000000001 RBX: ffff888327e34600 RCX: 09c939e45870cc48
    [Mon Jul 24 14:40:18 2023] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888327e34600
    [Mon Jul 24 14:40:18 2023] RBP: 0000000000000001 R08: a1f12dc52477d268 R09: 0f8457e5fca31fda
    [Mon Jul 24 14:40:18 2023] R10: fa2f08dcb877b495 R11: ffffc900001a8d60 R12: ffffffff82a11d00
    [Mon Jul 24 14:40:18 2023] R13: 000000000002667b R14: ffff8883205b8e00 R15: 0000000000000000
    [Mon Jul 24 14:40:18 2023] FS:  0000000000000000(0000) GS:ffff88845f480000(0000) knlGS:0000000000000000
    [Mon Jul 24 14:40:18 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [Mon Jul 24 14:40:18 2023] CR2: 00001480549716da CR3: 0000000161f58001 CR4: 00000000003706e0
    [Mon Jul 24 14:40:18 2023] Call Trace:
    [Mon Jul 24 14:40:18 2023]  <IRQ>
    [Mon Jul 24 14:40:18 2023]  ? __warn+0xab/0x122
    [Mon Jul 24 14:40:18 2023]  ? report_bug+0x109/0x17e
    [Mon Jul 24 14:40:18 2023]  ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    [Mon Jul 24 14:40:18 2023]  ? handle_bug+0x41/0x6f
    [Mon Jul 24 14:40:18 2023]  ? exc_invalid_op+0x13/0x60
    [Mon Jul 24 14:40:18 2023]  ? asm_exc_invalid_op+0x16/0x20
    [Mon Jul 24 14:40:18 2023]  ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    [Mon Jul 24 14:40:18 2023]  ? __nf_conntrack_confirm+0x9e/0x2b0 [nf_conntrack]
    [Mon Jul 24 14:40:18 2023]  ? nf_nat_inet_fn+0x60/0x1a8 [nf_nat]
    [Mon Jul 24 14:40:18 2023]  nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
    [Mon Jul 24 14:40:18 2023]  nf_hook_slow+0x3a/0x96
    [Mon Jul 24 14:40:18 2023]  ? ip_protocol_deliver_rcu+0x164/0x164
    [Mon Jul 24 14:40:18 2023]  NF_HOOK.constprop.0+0x79/0xd9
    [Mon Jul 24 14:40:18 2023]  ? ip_protocol_deliver_rcu+0x164/0x164
    [Mon Jul 24 14:40:18 2023]  __netif_receive_skb_one_core+0x77/0x9c
    [Mon Jul 24 14:40:18 2023]  process_backlog+0x8c/0x116
    [Mon Jul 24 14:40:18 2023]  __napi_poll.constprop.0+0x28/0x124
    [Mon Jul 24 14:40:18 2023]  net_rx_action+0x159/0x24f
    [Mon Jul 24 14:40:18 2023]  __do_softirq+0x126/0x288
    [Mon Jul 24 14:40:18 2023]  do_softirq+0x7f/0xab
    [Mon Jul 24 14:40:18 2023]  </IRQ>
    [Mon Jul 24 14:40:18 2023]  <TASK>
    [Mon Jul 24 14:40:18 2023]  __local_bh_enable_ip+0x4c/0x6b
    [Mon Jul 24 14:40:18 2023]  netif_rx+0x52/0x5a
    [Mon Jul 24 14:40:18 2023]  macvlan_broadcast+0x10a/0x150 [macvlan]
    [Mon Jul 24 14:40:18 2023]  ? _raw_spin_unlock+0x14/0x29
    [Mon Jul 24 14:40:18 2023]  macvlan_process_broadcast+0xbc/0x12f [macvlan]
    [Mon Jul 24 14:40:18 2023]  process_one_work+0x1a8/0x295
    [Mon Jul 24 14:40:18 2023]  worker_thread+0x18b/0x244
    [Mon Jul 24 14:40:18 2023]  ? rescuer_thread+0x281/0x281
    [Mon Jul 24 14:40:18 2023]  kthread+0xe4/0xef
    [Mon Jul 24 14:40:18 2023]  ? kthread_complete_and_exit+0x1b/0x1b
    [Mon Jul 24 14:40:18 2023]  ret_from_fork+0x1f/0x30
    [Mon Jul 24 14:40:18 2023]  </TASK>
    [Mon Jul 24 14:40:18 2023] ---[ end trace 0000000000000000 ]---

     

    Changing "Docker custom network type" to "ipvlan" and monitoring for issues.

     

    Decided to change "Host access to custom networks" to "disabled" as well; I wasn't sure why it was enabled.

    Edit: figured that one out. I route my containers through AdGuard for ad blocking/DNS, so without this setting it doesn't work :(

    I then figured out how to get containers to resolve DNS from AdGuard anyway: I put them on a custom Docker network, and in Unraid's own network settings I just use a static IP in the 172.18.0.0/16 range.

    Edited by Shagon
    Link to comment

    Hello @ChatNoir 👋 Just wanted to keep you updated regarding this case and have everything condensed into one post.

     

    6 hours ago I made the following changes on Unraid:

     

    - Updated OS to 6.12.3 (`6.11.5` => `6.12.3`)

    - Updated BIOS to latest version (was a few versions behind)

    - Removed 5 containers

    - Added `--log-driver none --no-healthcheck` to all containers

     

    With that, "Docker custom network type" was set to "macvlan". 30 minutes later I got the nasty kernel bug - see comment - comment-with-bug.

     

    I've changed the Docker custom network type to "ipvlan" and disabled "Host access to custom networks", since the latter uses a macvlan method to expose routes.

     

    After a while I noticed something wasn't working: it turns out "Host access to custom networks" is rather useful if you want containers communicating with other containers, particularly the AdGuard container I use. Normal communication between containers, e.g. sending a request with curl, worked, but DNS did not.

    Adding every container to a custom network and using Settings > Network Settings > IPv4 default gateway allowed me to set a default DNS for both Unraid and Docker, because I gave AdGuard a static IP.

    For bonus points, I added a static IP for every container and then created a client in AdGuard to track which containers send which requests.
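    The setup described above could be sketched roughly like this; the network name, image tags, and addresses are illustrative examples, not the exact values used here:

```shell
# Hypothetical sketch: a user-defined Docker network with a fixed subnet,
# AdGuard pinned to a static IP on it, and other containers pointed at
# that IP for DNS.
docker network create --subnet=172.18.0.0/16 dockernet

docker run -d --name adguard \
  --network dockernet --ip 172.18.0.53 \
  adguard/adguardhome:latest

# Any other container joins the same network, gets its own static IP,
# and resolves DNS through AdGuard:
docker run -d --name sonarr \
  --network dockernet --ip 172.18.0.10 \
  --dns 172.18.0.53 \
  lscr.io/linuxserver/sonarr:latest
```

    Giving each container its own `--ip` is what makes the per-client tracking in AdGuard possible.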

     

    Getting back to the issue at hand: to test stability, I watched a movie, had my family use the server normally, and observed logs via `dmesg -T --follow`. I am happy to report that after 6 hours I see no errors in the dmesg log. However, working in IT, I know it might take more time to manifest, so I will monitor for the next week or so and report my findings back.
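    A small filter makes that monitoring easier by showing only the relevant kernel messages. The `watch_macvlan` helper name is my own for this sketch:

```shell
# Filter a kernel log stream down to the macvlan/conntrack lines from the
# trace above. On a live system you would pipe `dmesg -T --follow` into it.
watch_macvlan() {
  grep -iE 'macvlan|nf_conntrack'
}

# Demo on two sample lines (only the first should match):
printf '%s\n' \
  'WARNING: ... __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]' \
  'usb 1-1: new high-speed USB device' | watch_macvlan
```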

     

    The reason I tagged you is that I think people using "Host access to custom networks" may be hitting the macvlan errors in dmesg, and it might be a good idea to add that information to the documentation.

     

    Hopefully this is the end of the stability issues I reported. I'll also continue monitoring RAM usage, but that is a minor issue compared to crashes caused by the Docker network driver.

    Link to comment
    2 minutes ago, Shagon said:

    Hello @ChatNoir 👋 Just wanted to keep you updated regarding this case and have everything condensed into one post.

    Thanks for that, but I do not see any macvlan or other issues; I was just responding to your thread title about the RAM increase.

     

    And on that topic, it is still higher (though not problematic for my use case). Here is the view from the last 90 days.

    image.png.ca9943bf997832368c8d806920847d8c.png

    The main culprit seems to be Plex, going from around 300MB previously to 1-11GB now. Maybe the --no-healthcheck option could help; I'll look into it (maybe) when I have some vacation time at home.

    Link to comment

    24 hours later, the system has no errors in the dmesg log. I believe the root cause was macvlan combined with host access to custom networks.

     

    In terms of Docker usage:

     

    # docker stats --no-stream --format='table {{.Name}}\t{{.MemUsage}}' | (sed -u 1q; sort)
    NAME             MEM USAGE / LIMIT
    adguard          107.1MiB / 15.49GiB
    calibre-web      142.7MiB / 15.49GiB
    filebrowser      19.52MiB / 15.49GiB
    home-assistant   274.4MiB / 15.49GiB
    homepage         115.2MiB / 15.49GiB
    plex             507.2MiB / 15.49GiB
    prowlarr         156MiB / 15.49GiB
    qbittorrent      45.89MiB / 15.49GiB
    radarr           275.7MiB / 15.49GiB
    sonarr           355.4MiB / 15.49GiB
    tautulli         77.74MiB / 15.49GiB
    yarr             29.61MiB / 15.49GiB

     

    Mine went back to somewhat normal usage. Every container has "--no-healthcheck --log-driver none", except Plex, which has "--device=/dev/dri --log-driver none --no-healthcheck" as I need Quick Sync 😄

     

    Overall, after 24 hours I'm satisfied with the system stability. I will continue to monitor until the end of the week (July 30th) and report back then, or earlier if I see any issues.

    Link to comment

    Almost a week later: no issues. macvlan was to blame for the freezes and reboots. As for Docker RAM usage, I've had no time to troubleshoot it (20.10.24 on 6.12.3 seems about the same as on 6.11.5, I think). If a future Docker upgrade makes usage jump 50% again, I'll just buy a 16GB stick to replace the 8GB one.

    Link to comment



