6.9.0/6.9.1 - Kernel Panic due to netfilter (nf_nat_setup_info) - Docker Static IP (macvlan)

CorneliousJD · March 23, 2021

User @kaiguy actually noted that it seems to be br0 + "host access" being enabled, and just disabling host access worked for him.

Work exploring maybe?

I originally had br0 + host access as well. Now I just have it on br0.10 (VLAN 10)

whoopn · March 25, 2021

To UNRAID staff...can we get an official response?

it happened again after only being up for one day.

0F1FCC4A-4318-4761-9E18-A00CC550CF9E.bmp

bonienl · March 25, 2021

Please attach complete diagnostics, a screenshot doesn't show the relevant information.

In case your server doesn't respond anymore, consider to activate the Mirror syslog to flash function, see Tools -> Syslog server. This would save a copy of your syslog to flash and can be retrieved afterwards.

Edited March 25, 2021 by bonienl

CorneliousJD · March 25, 2021

6 hours ago, bonienl said:

Please attach complete diagnostics, a screenshot doesn't show the relevant information.

In case your server doesn't respond anymore, consider to activate the Mirror syslog to flash function, see Tools -> Syslog server. This would save a copy of your syslog to flash and can be retrieved afterwards.

I setup an actual syslog server to capture it myself but yes, this is indeed the way.

I should note my first panic after doing this did NOT get captured but the 2nd one did and that's what's posted in the first post here.

9 hours ago, whoopn said:

To UNRAID staff...can we get an official response?

it happened again after only being up for one day.

0F1FCC4A-4318-4761-9E18-A00CC550CF9E.bmp 606.65 kB · 2 downloads

They did mention it already saying that they're looking into this A few posts back Limetech popped in on this.

whoopn · March 26, 2021

23 hours ago, CorneliousJD said:

They did mention it already saying that they're looking into this A few posts back Limetech popped in on this.

Ah thank you! I'm going to setup syslog as well as disable any host attached network settings for docker...I'm seriously considering moving all of my docker containers to a VM...

trott · March 26, 2021

this is the reason I asked if we can attached a VF to container in the thread, it can help to solve the issue and meet the requirement

Edited March 26, 2021 by trott

CorneliousJD · March 26, 2021

27 minutes ago, whoopn said:

Ah thank you! I'm going to setup syslog as well as disable any host attached network settings for docker...I'm seriously considering moving all of my docker containers to a VM...

If you have anything on br0 custom network, change that, or put it on a VLAN, and possibly disable host access as well. I think moving things to a VM is way overkill. I have nearly 60 containers running, and had just 2 on br0, I moved one to br0.10 (VLAN 10) and just changed the other to bridge mode instead.

Been running solid now ever since.

Jimmy · March 26, 2021

My web interface crash randomly and I think I did not have a static IP for my dockers. Here my logs

bugz-diagnostics-20210325-2252.zip

bonienl · March 26, 2021

3 hours ago, Jimmy said:

My web interface crash randomly and I think I did not have a static IP for my dockers. Here my logs

You have a different problem.

Your system is running out of memory and I also see disk full statuses.

Better create a report under General Support

Start your system in safe mode and post your diagnostics under that report.

Jimmy · March 26, 2021

4 minutes ago, bonienl said:

You have a different problem.

Your system is running out of memory and I also see disk full statuses.

Better create a report under General Support

Start your system in safe mode and post your diagnostics under that report.

But that’s not the case....sometimes in the logs I have a Kernel Panic that appears.

image.png.89e8ea6e7fec4b72f2b10df29b040e82.png image.png.087d0505f2fc8cb795a591060c274cdc.png

bonienl · March 26, 2021

16 minutes ago, Jimmy said:

.sometimes in the logs I have a Kernel Panic that appears.

I don't see the call trace specific to this issue in your log files.

KoNeko · March 26, 2021

I run all my Dockers on br0. all have a custom IP that i assigned.

i read vlan this and vlan that. Why would i want to do that? if it fix the locks that isnt a good reason in my Book because then something is wrong. and the setting this up without the Vlan should not be possible.

i just trying to find out why this happens. I dont like answers like So you have put it on Vlan than it works. That sounds to me like when you go with your car to the garage and say my car isnt driving anymore but backwards/reverse still works and the Garage says so then you just drive it in reverse.

i had a Medusa docker which was the only one not on br0 but that one seems to be gone after i tried to safe settings. So i have to recreate that one now

Its runnin for 5+ days now.

DarkMan83 · March 27, 2021

This should explain, why the panics doesn't occur after a crash/restart of the server, because setting the "Host Access" to yes does not persist a reboot for me. After reboot it shows still yes, but the host can't access the container.

So i have to disable and than enable host access in the settings again.

After that it works again but the kernel panic will occur at some time.

Mybe i'm alone with this setting persitant bug, but if more ppl on the same boat and it' unknown untill now, than there could possibly much more ppl with kernel panics upcoming

Could be easily verified, just ping the container from host after reboot.

P.S. Sometimes i'm running for weeks until a panic occurs.

Edited March 27, 2021 by DarkMan83

jsiemon · March 27, 2021

Just a quick follow-up to my previous post. Its been about a week now since I changed my 2 Docker containers from br0 + static ip to Host and so far no crashes. All of my other Docker containers were already on Host or Bridge. I realize that a week isn't a long time, but prior to this change I had 3 KP in the span of a week so this change appears to have made a significant improvement and may point to the root cause of the issue. I'll report back again in another week good or bad.

UPDATE:

url_redirect.cgi.bmp

I spoke too soon. Just had a KP. Here is a screen shot. I know it's not the same as a log, but it is all that I have.

Edited March 27, 2021 by jsiemon

DarkMan83 · March 28, 2021

22 hours ago, jsiemon said:

Just a quick follow-up to my previous post. Its been about a week now since I changed my 2 Docker containers from br0 + static ip to Host and so far no crashes. All of my other Docker containers were already on Host or Bridge. I realize that a week isn't a long time, but prior to this change I had 3 KP in the span of a week so this change appears to have made a significant improvement and may point to the root cause of the issue. I'll report back again in another week good or bad.

UPDATE:

url_redirect.cgi.bmp 606.65 kB · 12 downloads

I spoke too soon. Just had a KP. Here is a screen shot. I know it's not the same as a log, but it is all that I have.

Same kernel panic as always

manuel_calavera · March 28, 2021

I'm getting this kernel panic too, and unRAID freezes and usually has a black empty screen when I plug my monitor in.

Managed to snap a pic of the error on my phone on 10th March, pic attached. After this I set up a server syslog.

Today, I had another crash/freeze, but no error message, just a blank screen. Attached are my diags and syslog (I hard reset my system at Mar 28 11:24:54)

Overall it's frozen/crashed like this about 4 or 5 times in the past couple of months or so.

syslog-192.168.0.21.log rocinante-diagnostics-20210328-1202.zip

jbeazies · March 28, 2021

Following up - Just got a kernel panic sometime early this morning despite reconfiguring my only static ip'd docker to a br0.4 vlan.

Tuftuf · March 29, 2021

I've gone to the effort of building an 11th gen NUC system to replace my main unraid server party due to needing to troubleshoot this bug! The 11th gen was a bit of a failure as it seems driver support for 11th gen cpu quick sync is shocking right now.

So I found a cheap used 10400T mini pc! Everything is running from there as of last night.

Now it's time to unplug some hard drives and start testing this bug again! Good (well bad) but good to see others are still seeing the issue.

manuel_calavera · March 29, 2021

It's just happened again during the partity build form the last crash. This is killing me!

bonienl · March 29, 2021

People, please post diagnostics.

manuel_calavera · March 29, 2021

8 minutes ago, bonienl said:

People, please post diagnostics.

Sorry about that, should have added I posted a couple of posts up with all that attached.

kaiguy · March 29, 2021

17 minutes ago, bonienl said:

People, please post diagnostics.

Disabling host access to custom networks has helped eliminate my kernel panics. Is this something you’d like me to re-enable for the cause?

bonienl · March 29, 2021

If you have a way to create these call traces on demand that would be helpful (of course we need diagnostics to further investigate). I have host access enabled but don't have any of these call traces, and as such it is hard to reproduce the issue for me.

In the next Unraid version some more conntrack modules will be loaded, it would hopefully help to tackle the problem in more detail.

CorneliousJD · March 29, 2021

20 minutes ago, bonienl said:

If you have a way to create these call traces on demand that would be helpful (of course we need diagnostics to further investigate). I have host access enabled but don't have any of these call traces, and as such it is hard to reproduce the issue for me.

In the next Unraid version some more conntrack modules will be loaded, it would hopefully help to tackle the problem in more detail.

FWIW I was never able to create them on demand despite trying to, although for me it seemed to happen every ~60 hours or so (roughly) I would never hit 3 days of uptime before I made VLANs.

I also never had call traces "build up" - it was one and done, resulting in a kernel panic for me.

I still have host access enabled with VLANs and I've eliminated my call traces and kernel panics so far.

Uptime 17 days 4 hours 8 minutes since making those changes.

If anyone finds a way to force this to happen I would consider purposely recreating the issue to help provide whatever logs are needed beyond what I've already supplied (diags and syslog of the call trace and panic).

bonienl · March 29, 2021

9 minutes ago, CorneliousJD said:

I also never had call traces "build up" - it was one and done

In this case, is it the connection to the server not working anymore, or the complete server halted?

In other words local console is still working in this case?

6.9.0/6.9.1 - Kernel Panic due to netfilter (nf_nat_setup_info) - Docker Static IP (macvlan)

User Feedback

Recommended Comments

CorneliousJD 117

Link to comment

whoopn 2

Link to comment

bonienl 1768

Link to comment

CorneliousJD 117

Link to comment

whoopn 2

Link to comment

trott 14

Link to comment

CorneliousJD 117

Link to comment

Jimmy 0

Link to comment

bonienl 1768

Link to comment

Jimmy 0

Link to comment

bonienl 1768

Link to comment

KoNeko 6

Link to comment

DarkMan83 17

Link to comment

jsiemon 5

Link to comment

DarkMan83 17

Link to comment

manuel_calavera 0

Link to comment

jbeazies 0

Link to comment

Tuftuf 18

Link to comment

manuel_calavera 0

Link to comment

bonienl 1768

Link to comment

manuel_calavera 0

Link to comment

kaiguy 28

Link to comment

bonienl 1768

Link to comment

CorneliousJD 117

Link to comment

bonienl 1768

Link to comment

Join the conversation