• 6.9.0/6.9.1 - Kernel Panic due to netfilter (nf_nat_setup_info) - Docker Static IP (macvlan)


    CorneliousJD
    • Urgent

    So I had posted another thread about how, after a kernel panic, Docker host access to custom networks doesn't work until Docker is stopped/restarted on 6.9.0.

     

     

    After further investigation and setting up syslogging, it appears that it may actually be that host access that's CAUSING the kernel panic?

    EDIT: 3/16 - I guess I needed to create a VLAN for my dockers with static IPs; so far that's working, so it's probably not HOST access causing the issue, but rather br0 static IPs being set. See the posts below.

     

    Here's my last kernel panic that thankfully got logged to syslog. It references macvlan and netfilter. I don't know enough to be super useful here, but this is my docker setup.

     

    [Screenshot: Docker network settings]

     

    Mar 12 03:57:07 Server kernel: ------------[ cut here ]------------
    Mar 12 03:57:07 Server kernel: WARNING: CPU: 17 PID: 626 at net/netfilter/nf_nat_core.c:614 nf_nat_setup_info+0x6c/0x652 [nf_nat]
    Mar 12 03:57:07 Server kernel: Modules linked in: ccp macvlan xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_nat xt_MASQUERADE iptable_nat nf_nat xfs md_mod ip6table_filter ip6_tables iptable_filter ip_tables bonding igb i2c_algo_bit cp210x usbserial sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd ipmi_ssif isci glue_helper mpt3sas i2c_i801 rapl libsas i2c_smbus input_leds i2c_core ahci intel_cstate raid_class led_class acpi_ipmi intel_uncore libahci scsi_transport_sas wmi ipmi_si button [last unloaded: ipmi_devintf]
    Mar 12 03:57:07 Server kernel: CPU: 17 PID: 626 Comm: kworker/17:2 Tainted: G        W         5.10.19-Unraid #1
    Mar 12 03:57:07 Server kernel: Hardware name: Supermicro PIO-617R-TLN4F+-ST031/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
    Mar 12 03:57:07 Server kernel: Workqueue: events macvlan_process_broadcast [macvlan]
    Mar 12 03:57:07 Server kernel: RIP: 0010:nf_nat_setup_info+0x6c/0x652 [nf_nat]
    Mar 12 03:57:07 Server kernel: Code: 89 fb 49 89 f6 41 89 d4 76 02 0f 0b 48 8b 93 80 00 00 00 89 d0 25 00 01 00 00 45 85 e4 75 07 89 d0 25 80 00 00 00 85 c0 74 07 <0f> 0b e9 1f 05 00 00 48 8b 83 90 00 00 00 4c 8d 6c 24 20 48 8d 73
    Mar 12 03:57:07 Server kernel: RSP: 0018:ffffc90006778c38 EFLAGS: 00010202
    Mar 12 03:57:07 Server kernel: RAX: 0000000000000080 RBX: ffff88837c8303c0 RCX: ffff88811e834880
    Mar 12 03:57:07 Server kernel: RDX: 0000000000000180 RSI: ffffc90006778d14 RDI: ffff88837c8303c0
    Mar 12 03:57:07 Server kernel: RBP: ffffc90006778d00 R08: 0000000000000000 R09: ffff889083c68160
    Mar 12 03:57:07 Server kernel: R10: 0000000000000158 R11: ffff8881e79c1400 R12: 0000000000000000
    Mar 12 03:57:07 Server kernel: R13: 0000000000000000 R14: ffffc90006778d14 R15: 0000000000000001
    Mar 12 03:57:07 Server kernel: FS:  0000000000000000(0000) GS:ffff88903fc40000(0000) knlGS:0000000000000000
    Mar 12 03:57:07 Server kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 12 03:57:07 Server kernel: CR2: 000000c000b040b8 CR3: 000000000200c005 CR4: 00000000001706e0
    Mar 12 03:57:07 Server kernel: Call Trace:
    Mar 12 03:57:07 Server kernel: <IRQ>
    Mar 12 03:57:07 Server kernel: ? activate_task+0x9/0x12
    Mar 12 03:57:07 Server kernel: ? resched_curr+0x3f/0x4c
    Mar 12 03:57:07 Server kernel: ? ipt_do_table+0x49b/0x5c0 [ip_tables]
    Mar 12 03:57:07 Server kernel: ? try_to_wake_up+0x1b0/0x1e5
    Mar 12 03:57:07 Server kernel: nf_nat_alloc_null_binding+0x71/0x88 [nf_nat]
    Mar 12 03:57:07 Server kernel: nf_nat_inet_fn+0x91/0x182 [nf_nat]
    Mar 12 03:57:07 Server kernel: nf_hook_slow+0x39/0x8e
    Mar 12 03:57:07 Server kernel: nf_hook.constprop.0+0xb1/0xd8
    Mar 12 03:57:07 Server kernel: ? ip_protocol_deliver_rcu+0xfe/0xfe
    Mar 12 03:57:07 Server kernel: ip_local_deliver+0x49/0x75
    Mar 12 03:57:07 Server kernel: ip_sabotage_in+0x43/0x4d
    Mar 12 03:57:07 Server kernel: nf_hook_slow+0x39/0x8e
    Mar 12 03:57:07 Server kernel: nf_hook.constprop.0+0xb1/0xd8
    Mar 12 03:57:07 Server kernel: ? l3mdev_l3_rcv.constprop.0+0x50/0x50
    Mar 12 03:57:07 Server kernel: ip_rcv+0x41/0x61
    Mar 12 03:57:07 Server kernel: __netif_receive_skb_one_core+0x74/0x95
    Mar 12 03:57:07 Server kernel: process_backlog+0xa3/0x13b
    Mar 12 03:57:07 Server kernel: net_rx_action+0xf4/0x29d
    Mar 12 03:57:07 Server kernel: __do_softirq+0xc4/0x1c2
    Mar 12 03:57:07 Server kernel: asm_call_irq_on_stack+0x12/0x20
    Mar 12 03:57:07 Server kernel: </IRQ>
    Mar 12 03:57:07 Server kernel: do_softirq_own_stack+0x2c/0x39
    Mar 12 03:57:07 Server kernel: do_softirq+0x3a/0x44
    Mar 12 03:57:07 Server kernel: netif_rx_ni+0x1c/0x22
    Mar 12 03:57:07 Server kernel: macvlan_broadcast+0x10e/0x13c [macvlan]
    Mar 12 03:57:07 Server kernel: macvlan_process_broadcast+0xf8/0x143 [macvlan]
    Mar 12 03:57:07 Server kernel: process_one_work+0x13c/0x1d5
    Mar 12 03:57:07 Server kernel: worker_thread+0x18b/0x22f
    Mar 12 03:57:07 Server kernel: ? process_scheduled_works+0x27/0x27
    Mar 12 03:57:07 Server kernel: kthread+0xe5/0xea
    Mar 12 03:57:07 Server kernel: ? __kthread_bind_mask+0x57/0x57
    Mar 12 03:57:07 Server kernel: ret_from_fork+0x22/0x30
    Mar 12 03:57:07 Server kernel: ---[ end trace b3ca21ac5f2c2720 ]---
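For context, the setup implicated above -- Docker containers with static IPs on br0 -- corresponds to a Docker macvlan network. A minimal sketch of how such a network is created from the CLI follows; every name and address below is a placeholder for illustration, not taken from this report:

```shell
# Placeholders throughout -- substitute your own NIC, subnet, and addresses.
NET=my_macvlan
PARENT=eth0              # physical interface the containers will share
SUBNET=192.168.1.0/24
GATEWAY=192.168.1.1
IP=192.168.1.50          # static IP for one example container

if command -v docker >/dev/null 2>&1; then
  # Create the macvlan network bound to the parent NIC...
  docker network create -d macvlan \
    --subnet="$SUBNET" --gateway="$GATEWAY" \
    -o parent="$PARENT" "$NET" || echo "create failed (needs a real NIC)"
  # ...and attach a container with a fixed address on it.
  docker run -d --name web --network "$NET" --ip "$IP" nginx \
    || echo "run failed (needs a working docker host)"
else
  echo "docker not available; commands shown for reference only"
fi
```

Note that macvlan deliberately blocks traffic between the host and its own containers' macvlan addresses, which is why Unraid offers the separate "host access to custom networks" option discussed in this thread.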

     




    User Feedback

    Recommended Comments



    4 hours ago, bonienl said:

    Perfectly normal when you enable VLANs for an interface.

    VLAN 0 refers to the untagged communication.

    Yes, I am well aware of the significance of VLAN 0. I'm also well aware that it happens when VLANs are enabled for an interface.

     

    However, upon reading my message, the following things stand out as unusual -- which may be why I wrote them in detail, and provided screenshots --

     

    1 -- There are no currently enabled interfaces with VLANs enabled so..."perfectly normal when" goes right out the window, yeah?

    2 -- Before (when I was crashing) I did not have VLAN 0 messages in my log

    3 -- After (now that I'm NOT crashing) I DO have VLAN 0 messages in my log

     

    The first point I was trying to raise was that I'm not using VLANs on any of my enabled network interfaces, and none of my disabled interfaces have VLANs configured, although VLAN support is enabled on them -- yet the system still seems to be activating VLAN 0 as if preparing for VLAN operation. That seems abnormal, which is why I commented on it; if it were normal I wouldn't have taken the time.

     

    The second point I was trying to raise, specifically, was that I seem to have stumbled upon a potential solution for this crash problem while Docker/Kernel/Limetech find a solution.

     

    What I'd like to see is if anyone experiencing these crashes could do the following:

    A) Enable VLAN but do not configure it on an unused network interface

    B) DISABLE that interface (IPv4/IPv6 Address Assignment None, not a member of bridge or bond)

    C) Reboot, start your array, check if you see "VLAN 0" related log messages

    D) Report if your system becomes stable

     

    I've got uptime around a month now, perfectly stable, with no changes to my system beyond what's mentioned above and in greater detail in my previous message.
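If it helps, step C boils down to grepping the log for the kernel's VLAN 0 registration message. Here's a small self-contained demo with made-up sample lines (on a live server you would grep /var/log/syslog, or wherever your syslog target writes):

```shell
# Two made-up sample lines standing in for real syslog content.
printf '%s\n' \
  "kernel: igb 0000:04:00.0 eth0: adding VLAN 0 to HW filter on device eth0" \
  "kernel: igb 0000:04:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps" \
  > /tmp/sample_syslog

# Step C is essentially this (counts matching lines, case-insensitively):
grep -ci "vlan 0" /tmp/sample_syslog    # -> 1
```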

    Edited by codefaux
    On 7/6/2021 at 1:14 PM, codefaux said:

    What I'd like to see is if anyone experiencing these crashes could do the following [...]


    Well, knock on wood, but my system seems to be stable after making these changes. I had trouble even keeping Unraid up for more than 2 days, but it has now been running without issues for about a week.


    For what it's worth, I was having this issue as well, and simply disabling host access to custom networks seems to have resolved it for me. No VLAN changes or anything else. I couldn't keep it running for longer than a day or two, and now it's been running for about a week and a half without an issue.

    On 8/3/2021 at 10:15 AM, VlarpNL said:

    Well, knock on wood, but my system seems to be stable after making these changes. I had trouble even keeping Unraid up for more than 2 days, but it has now been running without issues for about a week.

    Well, there you have it. I jinxed it. My server just froze up. 

    I'm disabling host access to custom networks to see if that improves anything. Fingers crossed.

    On 6/12/2021 at 10:56 AM, bonienl said:

    For Unraid version 6.10 I have replaced the Docker macvlan driver for the Docker ipvlan driver.

     

    IPvlan is a new twist on the tried and true network virtualization technique. The Linux implementations are extremely lightweight because rather than using the traditional Linux bridge for isolation, they are associated to a Linux Ethernet interface or sub-interface to enforce separation between networks and connectivity to the physical network.

     

    The end-user doesn't have to do anything special. At startup legacy networks are automatically removed and replaced by the new network approach. Please test once 6.10 becomes available. Internal testing looks very good so far.

     

    17 hours on 6.10.0-rc1 with the switch over to ipvlan and no panics yet....knock on wood  :) If I make it to 48 hours I will be convinced.
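For anyone who wants to recreate a network by hand rather than wait for the automatic migration, an ipvlan network is created much like a macvlan one -- only the driver changes. A sketch with placeholder names and addresses (none of these are from the posts above):

```shell
# Placeholders -- substitute your own interface and LAN addressing.
NET=my_ipvlan
PARENT=eth0
SUBNET=192.168.1.0/24
GATEWAY=192.168.1.1

if command -v docker >/dev/null 2>&1; then
  # ipvlan in its default L2 mode: containers share the parent's MAC but
  # get their own IPs, avoiding the macvlan broadcast path entirely.
  docker network create -d ipvlan \
    --subnet="$SUBNET" --gateway="$GATEWAY" \
    -o parent="$PARENT" "$NET" || echo "create failed (needs a real NIC)"
else
  echo "docker not available; command shown for reference only"
fi
```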

    On 8/4/2021 at 7:33 AM, VlarpNL said:

    Well, there you have it. I jinxed it. My server just froze up. 

    I'm disabling host access to custom networks to see if that improves anything. Fingers crossed.

    Sorry to hear it -- I'm still running the configuration posted and still stable save the recent power bumps in our area, but I had nearly a month of uptime at one point and even that power cycle was scheduled. I've also moved additional Docker containers onto the system so I could shut down the other for power/heat savings during the heat wave. I haven't had the spoons to convince myself to change anything since it became stable, so I'm not on the RC yet.

     

    I might suggest that if there is a panic, it could be unrelated -- post its details just to be sure. Various hardware have errata that could cause panics, including both Intel and AMD C-state bugs on various generations, and even filesystem corruption from previous panics -- something Unraid doesn't forward to the UI from the kernel logs, by default.

     

    Good luck all.

    1 hour ago, codefaux said:

    I might suggest that if there is a panic, it could be unrelated -- post its details just to be sure. [...]


    I finally enabled syslog etc. and it seems my server crashes are no longer related to this issue at all. I had the macvlan problem earlier, so I made the assumption (yes, I know! Assumptions are the mother of all f*ups) that it was still the same issue.

     

    That probably means that the macvlan issue is gone on my machine. 


    So, bit of a curveball: I ran macvlan br0 on the new RC for 24 hours without issue.

    I hadn't realised that I hadn't enabled ipvlan. I have now enabled it and it's still all good.

    But I had not been able to run for 24 hours on macvlan with 6.9.

    Edited by DuzAwe

    ipvlan seems like an upgrade from macvlan anyway, so I didn't even bother testing my config on macvlan with 6.10.0-rc1. I went straight to ipvlan as soon as the update was complete, and I'm now at 45 hours with no issues.

    Generally I couldn't make it half that long on 6.9.x with my configuration before locking up.


    This morning I had another crash, again related to br_netfilter/nf_nat. So I have upgraded to 6.10.0-rc1 and switched to ipvlan for Docker. Hopefully this solves the issue.


    Unraid 6.9.2 here. Experiencing the same problem. Happens every once in a long while. The kernel panic screen shows something related to IPv6. The only changes to my server recently are upgrading from 6.8.3 and reassigning a bunch of dockers to br0,br1.


    I'm stable running on 6.10.0-rc1 with ipvlan for over a week now. An uptime of more than 7 days is a record in the past couple of months. 


    Been getting kernel panics about once a week. I figured out a temporary workaround to deal with it:

     

    echo 60 >/proc/sys/kernel/panic

     

    Put this in your go file and Unraid will reboot 60 seconds after a panic instead of being stuck in an infinite panic loop. With this you can at least manage it remotely without having to be there to press the reset button.
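To make that persistent, the line just needs to end up in the go file (on stock Unraid that's /boot/config/go, but verify the path on your system). Here's a sketch that appends it idempotently, run against a scratch copy so it's safe to try anywhere; the same setting can also be applied immediately with `sysctl -w kernel.panic=60`:

```shell
# Demo against a scratch file; on a real server use GO=/boot/config/go
GO=/tmp/go_demo
printf '#!/bin/bash\n' > "$GO"

# Append the panic timeout only if it isn't already present.
grep -qF 'kernel/panic' "$GO" || echo 'echo 60 >/proc/sys/kernel/panic' >> "$GO"
# Running it a second time is a no-op thanks to the guard:
grep -qF 'kernel/panic' "$GO" || echo 'echo 60 >/proc/sys/kernel/panic' >> "$GO"

grep -c 'kernel/panic' "$GO"    # -> 1 (no duplicate lines)
```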





