
Report Comments posted by CorneliousJD

  1. Just now, jzawacki said:

    Good luck with that. When I attempted to use ipvlan, the entire Unraid server had issues talking to the network.

     

    I was hoping to no longer be a part of this topic as my server had been running fine for over 6 months without issues.  Then, being super dumb, I updated to 6.12.4 and the server kernel panic'd literally the next day.  Hopefully, after it finishes a parity check, it'll go back to normal.  Fingers crossed.

    I solved my issues before by just making a dedicated NIC for Docker with macvlan, and I haven't had any crashes since.

     

    I don't plan on switching back even if they get this fixed. I don't see a reason to change from a known-good, working config. You may want to try it? Worth a shot!

  2. 4 minutes ago, sage2050 said:

    Knocking on every piece of wood and tree in sight, but I crossed over 1 week of uptime last night after changing docker from macvlan to ipvlan. I also disabled some less-than-crucial dockers, so I'll start enabling them one by one.

     

    I was running fine at first, but after that I would randomly see all sorts of network issues on my server: not being able to update containers, some containers not being able to reach the internet, etc. It seems the way ipvlan networks those containers together didn't play nice with my UniFi router.

  3. 8 hours ago, macmanluke said:

    Think I've run into a similar issue.
    Seems my EdgeRouter X does not like the same MAC being used across multiple devices when in ipvlan mode.

    But macvlan is potentially causing me crashes, ugh!

     

    For what it's worth, because I kept having nothing but problems, I eventually took one NIC (my server had 4) and dedicated it to a Docker network. No VLANs or anything, but ALL dockers use that one NIC interface now, it's all on macvlan, and I haven't had any problems since.
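    For anyone curious what the dedicated-NIC approach looks like under the hood, here's a rough sketch. The interface name (eth1), subnet, and IP range below are examples, not the poster's actual values, and on Unraid you'd normally do this through Settings -> Docker rather than the CLI:

```shell
# Create a macvlan Docker network bound only to the spare NIC (eth1 here),
# so container MACs never appear on the same port as the host's management NIC.
docker network create -d macvlan \
  --subnet=192.168.1.0/24 \
  --gateway=192.168.1.1 \
  --ip-range=192.168.1.192/27 \
  -o parent=eth1 \
  docker_lan

# Attach a container to that network with its own LAN IP.
docker run -d --network docker_lan --ip 192.168.1.200 --name pihole pihole/pihole
```

    Because container traffic now enters the switch on a physically separate port, the host and the containers never share a NIC, which seems to be what sidesteps the macvlan call traces.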

  4. 8 minutes ago, jzawacki said:

    Well, I had to switch back to macvlan due to some network weirdness. Unraid had been running for 20 days without issues, but I had some dockers that wouldn't stay connected, and it wasn't until I realized that Unraid couldn't check whether a version update was available that I blamed the docker ipvlan setting. Switched it back to macvlan, all the network weirdness went away, and the server hasn't kernel panic'd yet, so I'm keeping my fingers crossed.

     

    I ran into the same issues with ipvlan.

    The ipvlan issues have been less drastic but more sporadic, and IMO it's the worse of the two.

  5. 7 hours ago, bonienl said:

     

    I use ipvlan extensively without any problems.

     

    Losing internet access seems to point to something happening in your network setup outside Unraid.

     

     

    So this only affects this server. All other devices are OK, and it also *only* happens when doing a CA appdata backup. If I turn off the backup schedule, it doesn't happen at all.

     

    Also changing back to macvlan prevents the issue. 

     

    If it's something network-related outside of the server, then it makes no sense to me, haha.

  6. 11 minutes ago, Squid said:

    Does this have anything to do with the backup plugin not (currently) honoring the startup order?

    Hi Squid, it shouldn't, because the only thing that *would* matter is my Pi-hole container, but my server itself is set to use other public DNS servers. Plus, I run a secondary Pi-hole on a Raspberry Pi, totally external to my server, that acts as my backup DNS on my LAN.

     

    Since my server itself fails to ping out to 8.8.8.8 when this happens, it would seem to have nothing to do with container order.

     

    If I can be of any more help let me know. 

  7. 15 hours ago, jonp said:

    Hey everyone, just a quick update on this issue. The main problem we've faced is the inability to recreate this issue in our labs. We are still actively working on it, but if anyone here knows the full solution, we are open to providing a bounty for it. Just PM me and so long as the fix isn't a hack or workaround, we will gladly compensate you for your time and work. 

     

    Thank you for continuing to update everyone on this, I appreciate it even though I've been able to work around the issue with a VLAN.

     

    My only "fixes" are workarounds for now - I'm only sharing them here to help people who are frustrated by the crashes do something free, or at least very cheap (add a $15 NIC for another interface), to get around the issue.

     

    Hoping someone here has real answers at some point though - that would be great!

     

    If there's anything else I can provide from my system please let me know, happy to keep providing logs/diags/etc. 

  8. 6 minutes ago, Eddie Seelke said:

    No, I have to use a specific IP for this service and using a VLAN would change it. Well, I don't have to, but it would be too much work to change it. lol

     

    What service is it? Out of curiosity at this point really. 

     

    What about adding another NIC (if your motherboard doesn't already have 2 or more LAN ports)?

    That could be a very cheap $15 card for a gigabit NIC.

    https://amzn.to/3zfpmzF (referral link)

    I believe that way you could keep the same IP address, but give it its own network interface instead?

  9. 1 minute ago, Eddie Seelke said:

    I have been having this issue as well. I do have one docker that is using br0, but unfortunately I cannot move it to a VLAN. Is there any other way to fix this? Or is there an update on when a permanent fix will be out?

     

    Is there a reason you can't run a VLAN? (hardware doesn't support it?)

    Alternatively, you could add a second NIC to your server and run that container on a separate IP address from it, I believe? I don't know the specifics exactly, but I think that would work, so it's not assigned to br0 anymore.

     

    Also, are you able to simply not run that container on its own IP?

     

     

    I don't know of any other way besides those two options personally that will fix this. 

    I had success with the VLAN method, and implemented it in less than 10 minutes total. 

  10. 26 minutes ago, vagrantprodigy said:

    It has to be the br0 ones, as I've turned the others off completely. I really don't want to mess around with vlans in my home network, and complicate things further. I'll probably spin up a VM in ESXi for docker for now, and if this isn't fixed in the next few months, I may just end up migrating to a new platform. 6.7 broke things for me, as did 6.8 and 6.8.3, so I came from 6.6.7. I promised myself prior to 6.9.0 if this was another failed upgrade, I'd look into alternatives to unRAID, which really sucks, as I have 2 unraid pro licenses, and have been using unRAID for several years.

    To each their own, but a single VLAN for dockers is easier than setting up ESXi, IMO.

     

    Well, I agree - I didn't want to complicate my home network any further either, but it was an extremely simple process that took me less than 10 minutes to complete. In my eyes, 10 minutes to avoid crashes was definitely worth it.

     

    I've already linked to a post somewhere in this thread that goes over the details of adding a docker VLAN, complete with photos!

  11. 12 minutes ago, vagrantprodigy said:

    The kernel panics for me are getting more frequent, despite disabling all containers I don't absolutely need. All host-network containers are disabled at this point; I have 2 on br0 that are still running. I had my third kernel panic in 12 hours last night.

     

    Devs, is there a fix coming for this soon? If not, despite my 2 licenses, I need to start looking at other platforms, because this is causing a huge problem for me.

     

    Not a dev, but as noted in this thread a few times, it's your br0 containers causing the issue, not bridge vs. host.

     

    You probably need to put those on a separate VLAN.

     

    This is admittedly a workaround, but one that's worked for me. Stable with zero crashes since doing it over a month ago.

  12. 16 minutes ago, bonienl said:

     

    In this case, is the connection to the server not working anymore, or has the complete server halted?

    In other words, is the local console still working in this case?

     

     

    Complete server halt; the local console wouldn't respond (it just displayed the kernel panic/halt on the screen).

    All connections to the server were severed, and I had to reboot it with the power button.

  13. 20 minutes ago, bonienl said:

    If you have a way to create these call traces on demand that would be helpful (of course we need diagnostics to further investigate). I have host access enabled but don't have any of these call traces, and as such it is hard to reproduce the issue for me.

     

    In the next Unraid version some more conntrack modules will be loaded, it would hopefully help to tackle the problem in more detail.

     

     

    FWIW, I was never able to create them on demand despite trying. For me it seemed to happen roughly every ~60 hours; I would never hit 3 days of uptime before I made VLANs.

     

    I also never had call traces "build up" - it was one and done, resulting in a kernel panic for me.

     

    I still have host access enabled with VLANs, and so far I've eliminated my call traces and kernel panics.

    Uptime: 17 days 4 hours 8 minutes since making those changes.

     

    If anyone finds a way to force this to happen I would consider purposely recreating the issue to help provide whatever logs are needed beyond what I've already supplied (diags and syslog of the call trace and panic).

  14. 27 minutes ago, whoopn said:

     

    Ah, thank you! I'm going to set up syslog as well as disable any host-attached network settings for Docker... I'm seriously considering moving all of my docker containers to a VM...

     

    If you have anything on a br0 custom network, change that or put it on a VLAN, and possibly disable host access as well. I think moving things to a VM is way overkill. I have nearly 60 containers running and had just 2 on br0; I moved one to br0.10 (VLAN 10) and changed the other to bridge mode instead.

     

    Been running solid now ever since.
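    To make the br0.10 idea concrete: Unraid creates the VLAN sub-interface for you when you add VLAN 10 under Settings -> Network, but by hand it would look something like the sketch below. VLAN ID 10, the 192.168.10.0/24 subnet, and the network name are examples and assume the switch port is tagged for that VLAN:

```shell
# Create an 802.1Q VLAN sub-interface on top of the existing bridge.
ip link add link br0 name br0.10 type vlan id 10
ip link set br0.10 up

# Put the macvlan Docker network on the VLAN interface instead of br0 itself,
# so container traffic is tagged and kept off the host's untagged network.
docker network create -d macvlan \
  --subnet=192.168.10.0/24 \
  --gateway=192.168.10.1 \
  -o parent=br0.10 \
  vlan10
```

    Containers assigned to `vlan10` then get IPs routed by whatever handles VLAN 10 on your router, separate from the host's own address on br0.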

  15. 6 hours ago, bonienl said:

    Please attach complete diagnostics, a screenshot doesn't show the relevant information.

     

    In case your server doesn't respond anymore, consider activating the Mirror syslog to flash function; see Tools -> Syslog Server. This saves a copy of your syslog to flash, which can be retrieved afterwards.

     

    I set up an actual syslog server to capture it myself, but yes, this is indeed the way.

    I should note my first panic after doing this did NOT get captured, but the 2nd one did, and that's what's posted in the first post here.
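    If you go the remote-syslog route, it's worth confirming the server is actually receiving entries before the next panic, since a missed first capture (as above) is easy to hit. A quick sanity check, assuming an example syslog server at 192.168.1.50 listening on UDP 514:

```shell
# Send a test message from the Unraid console to the remote syslog server.
logger -n 192.168.1.50 -P 514 --udp "unraid syslog capture test"

# On the syslog server, confirm the line arrived (log path varies by distro/config).
grep "unraid syslog capture test" /var/log/syslog
```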

     

    9 hours ago, whoopn said:

    To UNRAID staff...can we get an official response?

     

    it happened again after only being up for one day.  


     

    They did already mention that they're looking into this :) Limetech popped in on this a few posts back.

  16. 1 minute ago, jonp said:

    Hi everyone,

     

    Thank you for your patience with us on this, and @bonienl for taking point on trying to recreate the issue. We are discussing this internally and will continue to do so until we have something to share with you guys. Issues like these can be tricky to pin down, so please bear with us while we attempt to do so.

     

    Thanks for the update - glad to see it's being talked about; that's all I can hope for, for now :)

     

    If you need any additional logs or info, let us know in the thread; myself and others here seem eager to help where we can by providing data/logs/specs, etc.

  17. 4 hours ago, jsiemon said:

    Count me as another who is now experiencing hard crashes on my server, and based on my reading here I suspect it is my dockers on br0 creating problems. I've been running Unraid for a long time (more than 15 years??) and can't recall ever having had a server crash. However, since upgrading to 6.9.1 I've had 3 hard crashes in a matter of a week that required forcibly powering off. Prior to upgrading to 6.9.1 I was running 6.8.2 using dockers on br0 with static IPs without any issue, but now this bug seems to have gotten me as well. Today I've moved all of my dockers to Bridge and Host where appropriate and will see if this resolves my crashing issues. I'll report back here either way.

    Please keep us posted. I moved to VLANs instead of static IPs, and so far my uptime is 9 days 12 hours 11 minutes.

     

    @limetech -- is there any way someone can chime in here on this?

    Not expecting any immediate fixes, but is this something related to 6.9.x that can be addressed, or?... We're all kind of in the dark here. I understand it's happened to some users in the past and VLANs fixed it for them, but this is a lot of users now reporting problems ONLY since 6.9.0/6.9.1... myself included.

  18. 8 hours ago, kaiguy said:

    Not sure if disabling host access to custom networks fixed it, or migrating the two containers that had static IPs assigned to br0 to a Raspberry Pi, but I'm no longer getting these syslog errors/locks.

    I would prefer to keep everything on the server, so my next project will be setting up a docker VLAN on my pfSense and TP-Link smart switch. Not once did I see these before 6.9.x, so I am hopeful that whatever is going on in this kernel gets corrected.

     

    I still have host access enabled, so that shouldn't be part of it (I edited the post to no longer state that host access is causing this) -- the fix was more likely moving your static br0 containers to a Pi.

     

    Also hoping we can get a chime-in here from someone at Limetech -- the VLAN isn't a huge issue for me, so I'm a happy camper now, but I can see where VLANs would cause more issues for certain apps, or where users just don't have the equipment at home for that type of setup.