6.9.0/6.9.1 - Kernel Panic due to netfilter (nf_nat_setup_info) - Docker Static IP (macvlan)

anethema · April 8, 2021

I'm running 6.9.2 now. Let's see if it freezes.

kaiguy · April 8, 2021

13 minutes ago, bonienl said:

Your case is very different, you don't have any custom (macvlan) network defined and containers with their own (fixed) IP address configured. Instead you have a user defined bridge network (proxynet) and your containers reside in this network (172.18.0.X).

Right. As I mentioned, I originally did have containers (unifi, adguard) with fixed IP addresses, but in the process of troubleshooting I changed the network settings. Through further troubleshooting I then realized that the host access to custom networks caused the trace even without those fixed IP addresses (and easily/quickly repeatable), so that's what I've been focusing on.

13 minutes ago, bonienl said:

Q: when host access is enabled, can you show the routing table again, I like to see which shim interfaces are defined in this case.

Here you go! Thanks.

Edited April 8, 2021 by kaiguy
additional info

bonienl · April 8, 2021

Thanks for the quick reply.

A side note: host access is only applicable to custom (macvlan) networks, not bridge networks. In your case enabling it has no use and it can stay disabled, it should however not cause the call traces (still investigating)!

vagrantprodigy · April 10, 2021

I had the crashes on 6.9.0 and 6.9.1, and updated to 6.9.2 shortly after it was released. Mine had a kernel panic overnight, so I can confirm this issue is NOT resolved. My original support thread with logs is here:

Edited April 10, 2021 by vagrantprodigy

bonienl · April 10, 2021

Need diagnostics with 6.9.2 to verify.

Chilipants · April 11, 2021

Just installed 6.9.2 today, macvlan kernel panic within 4 hours. I used to get frequent macvlan call traces and hard lockups when I had an intel 10gbit card installed on 6.8.3 and up, until I replaced it with a dual 1 gbit intel card. 6.9.1 had no macvlan traces or lockups.

Diagnostics attached...

tower-diagnostics-20210410-2213.zip

sirkuz · April 16, 2021

I waited until 6.9.2 to make the jump from 6.8.3. Unfortunately I ran into this problem as well. 6.8.3 was stable for months for me so trying to downgrade now to keep the wife and kids happy. Will follow along to see when this one gets sorted out. Cheers to those putting in the effort to get this fixed! Thanks

Edited April 16, 2021 by sirkuz

danieland · April 16, 2021

same problems here, 6.8.3 is fine.

whoopn · April 16, 2021

4 minutes ago, danieland said:

same problems here, 6.8.3 is fine.

Yes, I’ve reverted to 6.8.3 and have been up and running with no issues for 4 days, 6.9.1 would have crashed by now.

Perhaps the UNRAID devs could compare what in the network stack changed and revert that change as it’s a regression.

I think I speak for most unraid users that reliability is far more important than feature set. Reliability is the reason I’ve stayed on unraid.

Capt_Rekt · April 16, 2021

5 minutes ago, danieland said:

same problems here, 6.8.3 is fine.

Is there a way to go back all the way to 6.8.3? I had already upgraded to 6.9.x and 6.9.2 and do not have an option to downgrade lower than 6.9.x

whoopn · April 16, 2021

Just now, Capt_Rekt said:

Is there a way to go back all the way to 6.8.3? I had already upgraded to 6.9.x and 6.9.2 and do not have an option to downgrade lower than 6.9.x

I just went into the tools >> upgrade os and it had the option to restore. It may only allow one version worth of regression.

danieland · April 16, 2021

2 minutes ago, Capt_Rekt said:

Is there a way to go back all the way to 6.8.3? I had already upgraded to 6.9.x and 6.9.2 and do not have an option to downgrade lower than 6.9.x

sorry, i don't know how to downgrade.

hope limetech found a way to resolve this bug,

Capt_Rekt · April 16, 2021

1 minute ago, whoopn said:

I just went into the tools >> upgrade os and it had the option to restore. It may only allow one version worth of regression.

Which is my situation, I downgraded one level but it isn't far enough back. So now I am stuck manually restarting my server every day if I want to use it.

whoopn · April 16, 2021

Just now, Capt_Rekt said:

Which is my situation, I downgraded one level but it isn't far enough back. So now I am stuck manually restarting my server every day if I want to use it.

That stinks. Perhaps this works?

1. make a CA backup, separately backup or protect VMs

2. save those files somewhere other than the array, keep a copy on the array for convenience

3. reinstall 6.8.3

4. Restore backup

5. cross fingers

6. profit?!?

jsiemon · April 17, 2021

After upgrading from 6.9.1 to 6.9.2 I took a chance and added back br0 with a static IP on one of my Docker Containers. Less than 24 hrs later I have a hard crash. My syslog for the past 2 days is included with what I believe to be the crash is highlighted in yellow.Apr 16 -17 Crash Log.docx

danieland · April 20, 2021

Any updates?

K1ng0011 · April 21, 2021

Looks like others are now having the issue that I have struggled with. I have had the issue with two different motherboards and on versions of unRAID since 6.8.3. I swapped to a motherboard with IPMI and now I have the macvlan issue return. I resolved the issue on my MSI Pro Carbon x370 by removing my 10 gig PCIE NIC and using the onboard NIC. All of my dockers are currently in host mode or br0.2 with a static IP. I had six days of uptime and then just this morning I had another macvlan call trace come up. Since I installed my new motherboard in I check my logs every morning before I go to work and usually after I get off work. It seems like this has been an issue for a long time but something changed and caused more users to be affected than before.

Current Unraid Version: 6.9.2

Original Motherboard: MSI Pro Carbon X370

Current Motherboard: ASRockRack X470D4U2-2T

tower-syslog-20210421-1728.zip

whoopn · April 21, 2021

I think the TL;DR is 6.9.X is borked...9 days uptime on 6.8.3 after only getting like 2-4 on 6.9.1

my best is around 140 days before I had to bounce it for something.

DuzAwe · April 21, 2021

Same issues as K1ng0011 also with an ASRockRack Board, Same X470 Base. Have had a great number of system lock ups since December. Started for me with RC2. Not in a position to be able to create new vlans. Have instead removed all my dockers that used BR0 as even stopped I was having issues.

1 hour ago, K1ng0011 said:

Current Unraid Version: 6.9.2

Original Motherboard: MSI Pro Carbon X370

Current Motherboard: ASRockRack X470D4U2-2T

thelibrary-diagnostics-20210418-2211 (1).zip

Edited April 21, 2021 by DuzAwe

danieland · April 23, 2021

Today my unraid crash again, is there any way we can fix this without buying a switch?

Edited April 23, 2021 by danieland

K1ng0011 · April 23, 2021

If you look at hoopsters post he reported macvlan call traces in unRAID version 6.5.0 which was released in 2018. This has been an issue for quite a long time. I have seen references that this issue might be related to docker itself and not unRAID but it seems like the issue has not been widespread enough to warrant unRAID reaching out to the people who maintain the docker software. I spent from December 2020 to February 2021 diagnosing this issue on my original motherboard. When I ended up removing my 10 Gig PCIE NIC ASUS XG-C100C from my motherboard the issue stopped. So somehow specific hardware was causing this issue. A the time I was on 6.8.3. Unless there is a wide number of users affected I am not hopeful that this issue will get resolved. Currently I have the "host access to custom networks" setting under my docker container settings page disabled to hopefully prevent any hard lockups to see if unRAID can come up with a solution. If I were you I would install the "CA Backup / Restore Appdata" plugin if you do not have it already. I have had these hard lockups corrupt my app data for my dockers and this allowed me to restore my appdata folder.

jnk22 · April 24, 2021

I am having similar issues that seem to be related to docker/macvlan:

My kernel panics exists since 6.8.3 (my initial Unraid version) and have not been solved by upgrading to the latest version as well.

I am not using the network "br0" with static IP addresses, only the following:

- host

- bridge

- proxynet (custom docker network)

The error logs seem to be pretty identical though.

Could this be related?

Chilipants · April 26, 2021

On 4/24/2021 at 4:20 AM, jnk22 said:

I am having similar issues that seem to be related to docker/macvlan:

My kernel panics exists since 6.8.3 (my initial Unraid version) and have not been solved by upgrading to the latest version as well.

I am not using the network "br0" with static IP addresses, only the following:

- host

- bridge

- proxynet (custom docker network)

The error logs seem to be pretty identical though.

Could this be related?

Disabling Host access to custom networks on docker settings should suffice... it's not necessary for proxynet. I had mine enabled for some reason or another ( testing at some point ) and forgot to disable and it was the sole cause of my macvlan call traces and lockups. Disabling it has ceased all macvlan problems for me. This was addressed earlier in this thread by kaiguy and bonienl.

Seems nic driver related since different cards have different stabilities when host access is enabled.

Lilarcor · April 27, 2021

I got the same issue with 6.9.0/1/2.

unraid-diagnostics-20210427-2221.zip

gilladur · April 27, 2021

I think you can put me on the list 😤

Since the Upgrade to 6.9 my former rock stable server is crashing 1-2 a month in all the recent versions.

Apr 12 23:33:06 Server kernel: ------------[ cut here ]------------
Apr 12 23:33:06 Server kernel: WARNING: CPU: 1 PID: 13942 at net/netfilter/nf_conntrack_core.c:1120 __nf_conntrack_confirm+0x9b/0x1e6 [nf_conntrack]
Apr 12 23:33:06 Server kernel: Modules linked in: macvlan xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle nf_tables xt_nat xt_tcpudp vhost_net tun vhost vhost_iotlb tap veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod ipmi_devintf ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding igb i2c_algo_bit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm wmi_bmof crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate intel_uncore nvme ipmi_ssif nvme_core i2c_i801 i2c_smbus input_leds i2c_core led_class ahci libahci ie31200_edac intel_pch_thermal wmi fan thermal acpi_ipmi video ipmi_si backlight button [last unloaded: i2c_algo_bit]
Apr 12 23:33:06 Server kernel: CPU: 1 PID: 13942 Comm: kworker/1:1 Not tainted 5.10.28-Unraid #1
Apr 12 23:33:06 Server kernel: Hardware name: Supermicro Super Server/X11SCL-F, BIOS 1.5 10/05/2020
Apr 12 23:33:06 Server kernel: Workqueue: events macvlan_process_broadcast [macvlan]
Apr 12 23:33:06 Server kernel: RIP: 0010:__nf_conntrack_confirm+0x9b/0x1e6 [nf_conntrack]
Apr 12 23:33:06 Server kernel: Code: e8 dc f8 ff ff 44 89 fa 89 c6 41 89 c4 48 c1 eb 20 89 df 41 89 de e8 36 f6 ff ff 84 c0 75 bb 48 8b 85 80 00 00 00 a8 08 74 18 <0f> 0b 89 df 44 89 e6 31 db e8 6d f3 ff ff e8 35 f5 ff ff e9 22 01
Apr 12 23:33:06 Server kernel: RSP: 0018:ffffc90000110dd8 EFLAGS: 00010202
Apr 12 23:33:06 Server kernel: RAX: 0000000000000188 RBX: 00000000000082f4 RCX: 000000006bbd9a8a
Apr 12 23:33:06 Server kernel: RDX: 0000000000000000 RSI: 000000000000019a RDI: ffffffffa03b1dd0
Apr 12 23:33:06 Server kernel: RBP: ffff88835c40b540 R08: 00000000b4ddb4fe R09: ffff888163ec87c0
Apr 12 23:33:06 Server kernel: R10: 0000000000000000 R11: ffff888172bda000 R12: 000000000000519a
Apr 12 23:33:06 Server kernel: R13: ffffffff8210b440 R14: 00000000000082f4 R15: 0000000000000000
Apr 12 23:33:06 Server kernel: FS:  0000000000000000(0000) GS:ffff88884ec80000(0000) knlGS:0000000000000000
Apr 12 23:33:06 Server kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 12 23:33:06 Server kernel: CR2: 00002a91561fadf4 CR3: 000000000200a003 CR4: 00000000003706e0
Apr 12 23:33:06 Server kernel: Call Trace:

I don't want to / can't use vlans (my router isn't supporting this properly).

After I find out about the macvlan issue I tried to switch every docker away from br0. Hopefully this helps.
Just one docker I wasn't able to move from br0 (diyHue) - maybe someone can tell me how?

Host access to custom networks is disabled (but I don't know if I did change it after the last stall I had, as I moved the appdata to a different drive)

Just one addition - I quite often read that people with this issue are running piHole or the Unifi-controller. I'm also using the Unifi-controller docker and had it on fixed ip.

Did the Unraid team reply to the issue so far - only the community seems to care. Very disappointing.

server-diagnostics-20210427-2125.zip

Edited April 27, 2021 by gilladur

6.9.0/6.9.1 - Kernel Panic due to netfilter (nf_nat_setup_info) - Docker Static IP (macvlan)

User Feedback

Recommended Comments

anethema 0

Link to comment

kaiguy 28

Link to comment

bonienl 1764

Link to comment

vagrantprodigy 12

Link to comment

bonienl 1764

Link to comment

Chilipants 0

Link to comment

sirkuz 6

Link to comment

danieland 0

Link to comment

whoopn 2

Link to comment

Capt_Rekt 0

Link to comment

whoopn 2

Link to comment

danieland 0

Link to comment

Capt_Rekt 0

Link to comment

whoopn 2

Link to comment

jsiemon 5

Link to comment

danieland 0

Link to comment

K1ng0011 4

Link to comment

whoopn 2

Link to comment

DuzAwe 11

Link to comment

danieland 0

Link to comment

K1ng0011 4

Link to comment

jnk22 8

Link to comment

Chilipants 0

Link to comment

Lilarcor 0

Link to comment

gilladur 11

Link to comment

Join the conversation