• 6.9.0/6.9.1 - Kernel Panic due to netfilter (nf_nat_setup_info) - Docker Static IP (macvlan)


    CorneliousJD
    • Urgent

    So I had posted another thread about how, after a kernel panic, Docker host access to custom networks doesn't work until Docker is stopped and restarted on 6.9.0.

     

     

    After further investigation and setting up syslogging, it appears that the host access itself may actually be what's CAUSING the kernel panic?

    EDIT: 3/16 - I guess I needed to create a VLAN for my dockers with static IPs, so far that's working, so it's probably not HOST access causing the issue, but rather br0 static IPs being set. See following posts below.

     

    Here's my last kernel panic that thankfully got logged to syslog. It references macvlan and netfilter. I don't know enough to be super useful here, but this is my docker setup.

     

    image.png.dac2782e9408016de37084cf21ad64a5.png

     

    Mar 12 03:57:07 Server kernel: ------------[ cut here ]------------
    Mar 12 03:57:07 Server kernel: WARNING: CPU: 17 PID: 626 at net/netfilter/nf_nat_core.c:614 nf_nat_setup_info+0x6c/0x652 [nf_nat]
    Mar 12 03:57:07 Server kernel: Modules linked in: ccp macvlan xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_nat xt_MASQUERADE iptable_nat nf_nat xfs md_mod ip6table_filter ip6_tables iptable_filter ip_tables bonding igb i2c_algo_bit cp210x usbserial sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd ipmi_ssif isci glue_helper mpt3sas i2c_i801 rapl libsas i2c_smbus input_leds i2c_core ahci intel_cstate raid_class led_class acpi_ipmi intel_uncore libahci scsi_transport_sas wmi ipmi_si button [last unloaded: ipmi_devintf]
    Mar 12 03:57:07 Server kernel: CPU: 17 PID: 626 Comm: kworker/17:2 Tainted: G        W         5.10.19-Unraid #1
    Mar 12 03:57:07 Server kernel: Hardware name: Supermicro PIO-617R-TLN4F+-ST031/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
    Mar 12 03:57:07 Server kernel: Workqueue: events macvlan_process_broadcast [macvlan]
    Mar 12 03:57:07 Server kernel: RIP: 0010:nf_nat_setup_info+0x6c/0x652 [nf_nat]
    Mar 12 03:57:07 Server kernel: Code: 89 fb 49 89 f6 41 89 d4 76 02 0f 0b 48 8b 93 80 00 00 00 89 d0 25 00 01 00 00 45 85 e4 75 07 89 d0 25 80 00 00 00 85 c0 74 07 <0f> 0b e9 1f 05 00 00 48 8b 83 90 00 00 00 4c 8d 6c 24 20 48 8d 73
    Mar 12 03:57:07 Server kernel: RSP: 0018:ffffc90006778c38 EFLAGS: 00010202
    Mar 12 03:57:07 Server kernel: RAX: 0000000000000080 RBX: ffff88837c8303c0 RCX: ffff88811e834880
    Mar 12 03:57:07 Server kernel: RDX: 0000000000000180 RSI: ffffc90006778d14 RDI: ffff88837c8303c0
    Mar 12 03:57:07 Server kernel: RBP: ffffc90006778d00 R08: 0000000000000000 R09: ffff889083c68160
    Mar 12 03:57:07 Server kernel: R10: 0000000000000158 R11: ffff8881e79c1400 R12: 0000000000000000
    Mar 12 03:57:07 Server kernel: R13: 0000000000000000 R14: ffffc90006778d14 R15: 0000000000000001
    Mar 12 03:57:07 Server kernel: FS:  0000000000000000(0000) GS:ffff88903fc40000(0000) knlGS:0000000000000000
    Mar 12 03:57:07 Server kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 12 03:57:07 Server kernel: CR2: 000000c000b040b8 CR3: 000000000200c005 CR4: 00000000001706e0
    Mar 12 03:57:07 Server kernel: Call Trace:
    Mar 12 03:57:07 Server kernel: <IRQ>
    Mar 12 03:57:07 Server kernel: ? activate_task+0x9/0x12
    Mar 12 03:57:07 Server kernel: ? resched_curr+0x3f/0x4c
    Mar 12 03:57:07 Server kernel: ? ipt_do_table+0x49b/0x5c0 [ip_tables]
    Mar 12 03:57:07 Server kernel: ? try_to_wake_up+0x1b0/0x1e5
    Mar 12 03:57:07 Server kernel: nf_nat_alloc_null_binding+0x71/0x88 [nf_nat]
    Mar 12 03:57:07 Server kernel: nf_nat_inet_fn+0x91/0x182 [nf_nat]
    Mar 12 03:57:07 Server kernel: nf_hook_slow+0x39/0x8e
    Mar 12 03:57:07 Server kernel: nf_hook.constprop.0+0xb1/0xd8
    Mar 12 03:57:07 Server kernel: ? ip_protocol_deliver_rcu+0xfe/0xfe
    Mar 12 03:57:07 Server kernel: ip_local_deliver+0x49/0x75
    Mar 12 03:57:07 Server kernel: ip_sabotage_in+0x43/0x4d
    Mar 12 03:57:07 Server kernel: nf_hook_slow+0x39/0x8e
    Mar 12 03:57:07 Server kernel: nf_hook.constprop.0+0xb1/0xd8
    Mar 12 03:57:07 Server kernel: ? l3mdev_l3_rcv.constprop.0+0x50/0x50
    Mar 12 03:57:07 Server kernel: ip_rcv+0x41/0x61
    Mar 12 03:57:07 Server kernel: __netif_receive_skb_one_core+0x74/0x95
    Mar 12 03:57:07 Server kernel: process_backlog+0xa3/0x13b
    Mar 12 03:57:07 Server kernel: net_rx_action+0xf4/0x29d
    Mar 12 03:57:07 Server kernel: __do_softirq+0xc4/0x1c2
    Mar 12 03:57:07 Server kernel: asm_call_irq_on_stack+0x12/0x20
    Mar 12 03:57:07 Server kernel: </IRQ>
    Mar 12 03:57:07 Server kernel: do_softirq_own_stack+0x2c/0x39
    Mar 12 03:57:07 Server kernel: do_softirq+0x3a/0x44
    Mar 12 03:57:07 Server kernel: netif_rx_ni+0x1c/0x22
    Mar 12 03:57:07 Server kernel: macvlan_broadcast+0x10e/0x13c [macvlan]
    Mar 12 03:57:07 Server kernel: macvlan_process_broadcast+0xf8/0x143 [macvlan]
    Mar 12 03:57:07 Server kernel: process_one_work+0x13c/0x1d5
    Mar 12 03:57:07 Server kernel: worker_thread+0x18b/0x22f
    Mar 12 03:57:07 Server kernel: ? process_scheduled_works+0x27/0x27
    Mar 12 03:57:07 Server kernel: kthread+0xe5/0xea
    Mar 12 03:57:07 Server kernel: ? __kthread_bind_mask+0x57/0x57
    Mar 12 03:57:07 Server kernel: ret_from_fork+0x22/0x30
    Mar 12 03:57:07 Server kernel: ---[ end trace b3ca21ac5f2c2720 ]---
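For anyone else collecting these via syslog, here is a minimal sketch for pulling just the headline of each kernel trace out of a saved log. The function name `list_kernel_traces` is made up for illustration, and the log path in the usage line is an assumption; point it at wherever your syslog ends up.

```shell
# list_kernel_traces FILE
# Print the headline line of every kernel WARNING, BUG or Oops found in FILE.
# (Hypothetical helper; it only reads the file and changes nothing.)
list_kernel_traces() {
    grep -E 'kernel: (WARNING: CPU|BUG:|Oops:)' "$1"
}
```

Usage would be something like `list_kernel_traces /var/log/syslog`, which shows the source file and function of each trace (for example `nf_nat_core.c` and `nf_nat_setup_info`) without the register dumps.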

     






    Hey guys -- I was having a nearly identical issue on my own system recently -- kernel panics, not ALWAYS nf_conntrack but frequently it was -- and I disabled C-States using sysfs.

     

    It seems to have helped me, and I'm using docker in the exact way this thread indicates that it breaks. It's been only a day so far, but normally by now it would've done something stupid, so I think I might be on the right track. If anyone else wishes to try, here's a simple bash one-liner:

     

    for cpus in $(find /sys/devices/system/cpu -iname disable); do echo 1 > "$cpus"; done

     

    It's non-persistent, so worst case you've wasted some time. Cross your fingers with me, let's hope it helps at least someone.
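Since the change doesn't survive a reboot, it can also be checked or undone by hand. Below is a minimal sketch, assuming the same cpuidle `disable` files the one-liner writes to. The function names and the optional root-directory argument are made up for illustration, and writing to the real sysfs path needs root.

```shell
# set_cstate_disable VALUE [ROOT]
# Write VALUE (1 = disable, 0 = re-enable) to every file named "disable"
# under ROOT. ROOT defaults to the real sysfs CPU directory; passing a
# scratch directory lets you try the function without touching the system.
set_cstate_disable() {
    value="$1"
    root="${2:-/sys/devices/system/cpu}"
    find "$root" -name disable \
        -exec sh -c 'v="$1"; shift; for f in "$@"; do echo "$v" > "$f"; done' sh "$value" {} +
}

# show_cstate_disable [ROOT]
# Print "path: value" for each "disable" file so the current state is visible.
show_cstate_disable() {
    root="${1:-/sys/devices/system/cpu}"
    find "$root" -name disable \
        -exec sh -c 'for f in "$@"; do printf "%s: %s\n" "$f" "$(cat "$f")"; done' sh {} +
}
```

`set_cstate_disable 0` would put things back the way they were without a reboot, and `show_cstate_disable` confirms what is currently set.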
     


    Is this the same thing as the global C-state feature that people disable on AM4 motherboards in the BIOS?

     

     


    If it is, I have this disabled at the bios level as well as in my boot.cfg and I have still had lock ups with macvlan. 

    20 hours ago, K1ng0011 said:

    Is this the same thing as the global C-state feature that people disable on AM4 motherboards in the BIOS?

     

     

    I'm not directly familiar with what you're mentioning, nor do I see the feature you mention on the page you linked, BUT -- it should be. There's only one thing c-state applies to, and disabling it anywhere should disable it everywhere.

     

    EDIT: MAN I hope this doesn't mean I'm suffering from two different crash-inducing bugs.

    EDIT PART TWO: I'm suffering from two different crash-inducing bugs. Or at least, I was. Reverted to 6.8.3 -- ignore my c-state suggestion, it will not help this issue.

    Edited by codefaux

    I was on a Supermicro X9SRL-F board with an old Xeon E5 and no issues on 6.9.2.

     

    I just swapped the mobo with a Supermicro X11SCA-F and every few hours I get a kernel panic and the nf_conntrack message. It locked up once yesterday (no console, no ssh access, etc.) and required a hard reset and parity check.

     

    Today I moved the 2 containers off of br0 into br0.10 (vlan) and 2 hours later I got the kernel panic again (no lock up, though).

     

    I have 2 network interfaces bonded in this case.

     

    Here's the message:
     

    Apr 30 19:36:53 Tower kernel: ------------[ cut here ]------------
    Apr 30 19:36:53 Tower kernel: WARNING: CPU: 5 PID: 0 at net/netfilter/nf_conntrack_core.c:1120 __nf_conntrack_confirm+0x9b/0x1e6 [nf_conntrack]
    Apr 30 19:36:53 Tower kernel: Modules linked in: xt_mark xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat xt_nat xt_tcpudp iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap veth macvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs dm_crypt dm_mod dax nfsd lockd grace sunrpc i915 iosf_mbi drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops md_mod ipmi_devintf ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding e1000e igb i2c_algo_bit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel wmi_bmof kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ipmi_ssif crypto_simd cryptd glue_helper rapl intel_cstate mpt3sas i2c_i801 i2c_smbus input_leds raid_class intel_uncore nvme i2c_core scsi_transport_sas ahci led_class nvme_core wmi libahci video intel_pch_thermal acpi_ipmi ie31200_edac backlight ipmi_si
    Apr 30 19:36:53 Tower kernel: acpi_pad thermal button fan [last unloaded: e1000e]
    Apr 30 19:36:53 Tower kernel: CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.10.28-Unraid #1
    Apr 30 19:36:53 Tower kernel: Hardware name: Supermicro Super Server/X11SCA-F, BIOS 1.4 09/03/2020
    Apr 30 19:36:53 Tower kernel: RIP: 0010:__nf_conntrack_confirm+0x9b/0x1e6 [nf_conntrack]
    Apr 30 19:36:53 Tower kernel: Code: e8 dc f8 ff ff 44 89 fa 89 c6 41 89 c4 48 c1 eb 20 89 df 41 89 de e8 36 f6 ff ff 84 c0 75 bb 48 8b 85 80 00 00 00 a8 08 74 18 <0f> 0b 89 df 44 89 e6 31 db e8 6d f3 ff ff e8 35 f5 ff ff e9 22 01
    Apr 30 19:36:53 Tower kernel: RSP: 0018:ffffc900002108a0 EFLAGS: 00010202
    Apr 30 19:36:53 Tower kernel: RAX: 0000000000000188 RBX: 0000000000006ffc RCX: 0000000020a3ac63
    Apr 30 19:36:53 Tower kernel: RDX: 0000000000000000 RSI: 00000000000000fe RDI: ffffffffa060b1f0
    Apr 30 19:36:53 Tower kernel: RBP: ffff888617023400 R08: 0000000093f9e799 R09: ffff888101087480
    Apr 30 19:36:53 Tower kernel: R10: 0000000000000158 R11: ffff888105074100 R12: 00000000000034fe
    Apr 30 19:36:53 Tower kernel: R13: ffffffff8210b440 R14: 0000000000006ffc R15: 0000000000000000
    Apr 30 19:36:53 Tower kernel: FS:  0000000000000000(0000) GS:ffff88902c340000(0000) knlGS:0000000000000000
    Apr 30 19:36:53 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Apr 30 19:36:53 Tower kernel: CR2: 000014ea44ed23f4 CR3: 000000000200a003 CR4: 00000000003726e0
    Apr 30 19:36:53 Tower kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Apr 30 19:36:53 Tower kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Apr 30 19:36:53 Tower kernel: Call Trace:
    Apr 30 19:36:53 Tower kernel: <IRQ>
    Apr 30 19:36:53 Tower kernel: nf_conntrack_confirm+0x2f/0x36 [nf_conntrack]
    Apr 30 19:36:53 Tower kernel: nf_hook_slow+0x39/0x8e
    Apr 30 19:36:53 Tower kernel: nf_hook.constprop.0+0xb1/0xd8
    Apr 30 19:36:53 Tower kernel: ? ip_protocol_deliver_rcu+0xfe/0xfe
    Apr 30 19:36:53 Tower kernel: ip_local_deliver+0x49/0x75
    Apr 30 19:36:53 Tower kernel: ip_sabotage_in+0x43/0x4d [br_netfilter]
    Apr 30 19:36:53 Tower kernel: nf_hook_slow+0x39/0x8e
    Apr 30 19:36:53 Tower kernel: nf_hook.constprop.0+0xb1/0xd8
    Apr 30 19:36:53 Tower kernel: ? l3mdev_l3_rcv.constprop.0+0x50/0x50
    Apr 30 19:36:53 Tower kernel: ip_rcv+0x41/0x61
    Apr 30 19:36:53 Tower kernel: __netif_receive_skb_one_core+0x74/0x95
    Apr 30 19:36:53 Tower kernel: netif_receive_skb+0x79/0xa1
    Apr 30 19:36:53 Tower kernel: br_handle_frame_finish+0x30d/0x351
    Apr 30 19:36:53 Tower kernel: ? ipt_do_table+0x570/0x5c0 [ip_tables]
    Apr 30 19:36:53 Tower kernel: ? br_pass_frame_up+0xda/0xda
    Apr 30 19:36:53 Tower kernel: br_nf_hook_thresh+0xa3/0xc3 [br_netfilter]
    Apr 30 19:36:53 Tower kernel: ? br_pass_frame_up+0xda/0xda
    Apr 30 19:36:53 Tower kernel: br_nf_pre_routing_finish+0x23d/0x264 [br_netfilter]
    Apr 30 19:36:53 Tower kernel: ? br_pass_frame_up+0xda/0xda
    Apr 30 19:36:53 Tower kernel: ? br_handle_frame_finish+0x351/0x351
    Apr 30 19:36:53 Tower kernel: ? nf_nat_ipv4_pre_routing+0x1e/0x4a [nf_nat]
    Apr 30 19:36:53 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
    Apr 30 19:36:53 Tower kernel: ? br_handle_frame_finish+0x351/0x351
    Apr 30 19:36:53 Tower kernel: NF_HOOK+0xd7/0xf7 [br_netfilter]
    Apr 30 19:36:53 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
    Apr 30 19:36:53 Tower kernel: br_nf_pre_routing+0x229/0x239 [br_netfilter]
    Apr 30 19:36:53 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
    Apr 30 19:36:53 Tower kernel: br_handle_frame+0x25e/0x2a6
    Apr 30 19:36:53 Tower kernel: ? br_pass_frame_up+0xda/0xda
    Apr 30 19:36:53 Tower kernel: __netif_receive_skb_core+0x335/0x4e7
    Apr 30 19:36:53 Tower kernel: __netif_receive_skb_list_core+0x78/0x104
    Apr 30 19:36:53 Tower kernel: netif_receive_skb_list_internal+0x1bf/0x1f2
    Apr 30 19:36:53 Tower kernel: ? dev_gro_receive+0x55d/0x578
    Apr 30 19:36:53 Tower kernel: gro_normal_list+0x1d/0x39
    Apr 30 19:36:53 Tower kernel: napi_complete_done+0x79/0x104
    Apr 30 19:36:53 Tower kernel: igb_poll+0xcc8/0xef6 [igb]
    Apr 30 19:36:53 Tower kernel: ? resched_curr+0x1e/0x4c
    Apr 30 19:36:53 Tower kernel: net_rx_action+0xf4/0x29d
    Apr 30 19:36:53 Tower kernel: __do_softirq+0xc4/0x1c2
    Apr 30 19:36:53 Tower kernel: asm_call_irq_on_stack+0x12/0x20
    Apr 30 19:36:53 Tower kernel: </IRQ>
    Apr 30 19:36:53 Tower kernel: do_softirq_own_stack+0x2c/0x39
    Apr 30 19:36:53 Tower kernel: __irq_exit_rcu+0x45/0x80
    Apr 30 19:36:53 Tower kernel: common_interrupt+0x119/0x12e
    Apr 30 19:36:53 Tower kernel: asm_common_interrupt+0x1e/0x40
    Apr 30 19:36:53 Tower kernel: RIP: 0010:arch_local_irq_enable+0x7/0x8
    Apr 30 19:36:53 Tower kernel: Code: 00 48 83 c4 28 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 9c 58 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 55 8b af 28 04 00 00 b8 01 00 00 00 45 31 c9 53 45 31 d2 39 c5
    Apr 30 19:36:53 Tower kernel: RSP: 0018:ffffc900000fbea0 EFLAGS: 00000246
    Apr 30 19:36:53 Tower kernel: RAX: ffff88902c362380 RBX: 0000000000000008 RCX: 000000000000001f
    Apr 30 19:36:53 Tower kernel: RDX: 0000000000000000 RSI: 0000000024879873 RDI: 0000000000000000
    Apr 30 19:36:53 Tower kernel: RBP: ffffe8ffffb67a00 R08: 00000de4ea48c35c R09: 0000000000000000
    Apr 30 19:36:53 Tower kernel: R10: 000000000000218c R11: 071c71c71c71c71c R12: 00000de4ea48c35c
    Apr 30 19:36:53 Tower kernel: R13: ffffffff820c5dc0 R14: 0000000000000008 R15: 0000000000000000
    Apr 30 19:36:53 Tower kernel: cpuidle_enter_state+0x101/0x1c4
    Apr 30 19:36:53 Tower kernel: cpuidle_enter+0x25/0x31
    Apr 30 19:36:53 Tower kernel: do_idle+0x1a6/0x214
    Apr 30 19:36:53 Tower kernel: cpu_startup_entry+0x18/0x1a
    Apr 30 19:36:53 Tower kernel: secondary_startup_64_no_verify+0xb0/0xbb
    Apr 30 19:36:53 Tower kernel: ---[ end trace d5200fbed8c48686 ]---

     

    12 hours ago, dumurluk said:

    I was on a Supermicro X9SRL-F board with an old Xeon E5 and no issues on 6.9.2.

     

    I just swapped the mobo with a Supermicro X11SCA-F and every few hours I get a kernel panic and the nf_conntrack message. It locked up once yesterday (no console, no ssh access, etc.) and required a hard reset and parity check.

     

    Today I moved the 2 containers off of br0 into br0.10 (vlan) and 2 hours later I got the kernel panic again (no lock up, though).

     

    I have 2 network interfaces bonded in this case.

     


     

    My motherboard is a Supermicro X11SCA-F. I think I solved the problem by assigning the Docker containers' custom IPs to br1 (a second NIC in the server), which means you can't bond the two network interfaces.

    16 minutes ago, danieland said:

    My motherboard is a Supermicro X11SCA-F. I think I solved the problem by assigning the Docker containers' custom IPs to br1 (a second NIC in the server), which means you can't bond the two network interfaces.

    Thanks, that's what I was going to try next. Btw do you know which nic is the one shared with ipmi? Is it the igb one or the e1000e one as shown in unraid gui?


    This does not fix the problem, but I am using it as a workaround. Disabling "host access to custom networks" resolves any lockups for me. When I set up VLANs I still had the issue. I prefer not to double NAT, so I have always set my dockers' network type to Host or Custom: br0 with a static IP. I just moved my two dockers that had static IPs to the bridge network type. This gives them a private IP on their own subnet. It allows the qbittorrent docker I have running with OpenVPN to still work, and my other dockers to access qbittorrent. Everything works without "host access to custom networks" being enabled.
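To find out which containers would need moving, something like the following could help. It is only a sketch: the function name `containers_on_br` is made up, and it assumes the custom networks are named `br0`, `br0.10`, `br1` and so on, as they are in this thread. It reads `NAME NETWORKS` lines, which `docker ps` can produce with its `--format` option.

```shell
# containers_on_br
# Read "NAME NETWORKS" lines on stdin (NETWORKS is a comma-separated list)
# and print the name of each container attached to a network whose name
# starts with "br" followed by a digit (br0, br0.10, br1, ...).
# Docker's default NAT network is called "bridge", so it does not match.
containers_on_br() {
    awk '{
        n = split($2, nets, ",")
        for (i = 1; i <= n; i++)
            if (nets[i] ~ /^br[0-9]/) { print $1; break }
    }'
}
```

Usage would be `docker ps --format '{{.Names}} {{.Networks}}' | containers_on_br`. Anything it lists is still on a br0-style custom network rather than on bridge or host.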


    Here we go again. Before I could get to my server, it's locked up again. Barely lasted 12 hours. No br0 static IP containers. I have a couple on br0.10 vlan. "host access to custom networks" was enabled iirc.

     

    Here's the last of the log sent to remote syslog before the crash:

    May  1 13:35:22 Tower kernel: BUG: unable to handle page fault for address: 00000000ffffffae
    May  1 13:35:22 Tower kernel: #PF: supervisor read access in kernel mode
    May  1 13:35:22 Tower kernel: #PF: error_code(0x0000) - not-present page
    May  1 13:35:22 Tower kernel: PGD 800000061069f067 P4D 800000061069f067 PUD 0 
    May  1 13:35:22 Tower kernel: Oops: 0000 [#1] SMP PTI
    May  1 13:35:22 Tower kernel: CPU: 6 PID: 20760 Comm: node Tainted: G        W         5.10.28-Unraid #1
    May  1 13:35:22 Tower kernel: Hardware name: Supermicro Super Server/X11SCA-F, BIOS 1.4 09/03/2020
    May  1 13:35:22 Tower kernel: RIP: 0010:nf_nat_setup_info+0x129/0x6aa [nf_nat]
    May  1 13:35:22 Tower kernel: Code: ff 48 8b 15 ef 6a 00 00 89 c0 48 8d 04 c2 48 8b 10 48 85 d2 74 80 48 81 ea 98 00 00 00 48 85 d2 0f 84 70 ff ff ff 8a 44 24 46 <38> 42 46 74 09 48 8b 92 98 00 00 00 eb d9 48 8b 4a 20 48 8b 42 28
    May  1 13:35:22 Tower kernel: RSP: 0018:ffffc9000023c700 EFLAGS: 00010202
    May  1 13:35:22 Tower kernel: RAX: ffff88815d423e06 RBX: ffff8881147c57c0 RCX: 0000000000000000
    May  1 13:35:22 Tower kernel: RDX: 00000000ffffff68 RSI: 00000000c2955a34 RDI: ffffc9000023c720
    May  1 13:35:22 Tower kernel: RBP: ffffc9000023c7c8 R08: 00000000f7570d7c R09: ffff8881010870a0
    May  1 13:35:22 Tower kernel: R10: ffff8886a70e0388 R11: ffffffff815cbe4b R12: 0000000000000000
    May  1 13:35:22 Tower kernel: R13: ffffc9000023c720 R14: ffffc9000023c7dc R15: ffffffff8210b440
    May  1 13:35:22 Tower kernel: FS:  000015372ba5eb48(0000) GS:ffff88902c380000(0000) knlGS:0000000000000000
    May  1 13:35:22 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  1 13:35:22 Tower kernel: CR2: 00000000ffffffae CR3: 00000001fee5c001 CR4: 00000000003726e0
    May  1 13:35:22 Tower kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    May  1 13:35:22 Tower kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    May  1 13:35:22 Tower kernel: Call Trace:
    May  1 13:35:22 Tower kernel: <IRQ>
    May  1 13:35:22 Tower kernel: ? bond_start_xmit+0x26e/0x292 [bonding]
    May  1 13:35:22 Tower kernel: ? __ksize+0x15/0x64
    May  1 13:35:22 Tower kernel: ? krealloc+0x26/0x7a
    May  1 13:35:22 Tower kernel: nf_nat_masquerade_ipv4+0x10b/0x131 [nf_nat]
    May  1 13:35:22 Tower kernel: masquerade_tg+0x44/0x5e [xt_MASQUERADE]
    May  1 13:35:22 Tower kernel: ipt_do_table+0x51a/0x5c0 [ip_tables]
    May  1 13:35:22 Tower kernel: ? ipt_do_table+0x570/0x5c0 [ip_tables]
    May  1 13:35:22 Tower kernel: ? fib_validate_source+0xb0/0xda
    May  1 13:35:22 Tower kernel: nf_nat_inet_fn+0xe9/0x183 [nf_nat]
    May  1 13:35:22 Tower kernel: nf_nat_ipv4_out+0xf/0x88 [nf_nat]
    May  1 13:35:22 Tower kernel: nf_hook_slow+0x39/0x8e
    May  1 13:35:22 Tower kernel: nf_hook+0xab/0xd3
    May  1 13:35:22 Tower kernel: ? __ip_finish_output+0x146/0x146
    May  1 13:35:22 Tower kernel: ip_output+0x7d/0x8a
    May  1 13:35:22 Tower kernel: ? __ip_finish_output+0x146/0x146
    May  1 13:35:22 Tower kernel: ip_forward+0x3f1/0x420
    May  1 13:35:22 Tower kernel: ? ip_check_defrag+0x18f/0x18f
    May  1 13:35:22 Tower kernel: ip_sabotage_in+0x43/0x4d [br_netfilter]
    May  1 13:35:22 Tower kernel: nf_hook_slow+0x39/0x8e
    May  1 13:35:22 Tower kernel: nf_hook.constprop.0+0xb1/0xd8
    May  1 13:35:22 Tower kernel: ? l3mdev_l3_rcv.constprop.0+0x50/0x50
    May  1 13:35:22 Tower kernel: ip_rcv+0x41/0x61
    May  1 13:35:22 Tower kernel: __netif_receive_skb_one_core+0x74/0x95
    May  1 13:35:22 Tower kernel: netif_receive_skb+0x79/0xa1
    May  1 13:35:22 Tower kernel: br_handle_frame_finish+0x30d/0x351
    May  1 13:35:22 Tower kernel: ? ipt_do_table+0x570/0x5c0 [ip_tables]
    May  1 13:35:22 Tower kernel: ? br_pass_frame_up+0xda/0xda
    May  1 13:35:22 Tower kernel: br_nf_hook_thresh+0xa3/0xc3 [br_netfilter]
    May  1 13:35:22 Tower kernel: ? br_pass_frame_up+0xda/0xda
    May  1 13:35:22 Tower kernel: br_nf_pre_routing_finish+0x23d/0x264 [br_netfilter]
    May  1 13:35:22 Tower kernel: ? br_pass_frame_up+0xda/0xda
    May  1 13:35:22 Tower kernel: ? br_handle_frame_finish+0x351/0x351
    May  1 13:35:22 Tower kernel: ? nf_nat_ipv4_pre_routing+0x1e/0x4a [nf_nat]
    May  1 13:35:22 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
    May  1 13:35:22 Tower kernel: ? br_handle_frame_finish+0x351/0x351
    May  1 13:35:22 Tower kernel: NF_HOOK+0xd7/0xf7 [br_netfilter]
    May  1 13:35:22 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
    May  1 13:35:22 Tower kernel: br_nf_pre_routing+0x229/0x239 [br_netfilter]
    May  1 13:35:22 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
    May  1 13:35:22 Tower kernel: br_handle_frame+0x25e/0x2a6
    May  1 13:35:22 Tower kernel: ? br_pass_frame_up+0xda/0xda
    May  1 13:35:22 Tower kernel: __netif_receive_skb_core+0x335/0x4e7
    May  1 13:35:22 Tower kernel: ? igb_poll+0xcc8/0xef6 [igb]
    May  1 13:35:22 Tower kernel: __netif_receive_skb_one_core+0x3d/0x95
    May  1 13:35:22 Tower kernel: process_backlog+0xa3/0x13b
    May  1 13:35:22 Tower kernel: net_rx_action+0xf4/0x29d
    May  1 13:35:22 Tower kernel: __do_softirq+0xc4/0x1c2
    May  1 13:35:22 Tower kernel: asm_call_irq_on_stack+0x12/0x20
    May  1 13:35:22 Tower kernel: </IRQ>
    May  1 13:35:22 Tower kernel: do_softirq_own_stack+0x2c/0x39
    May  1 13:35:22 Tower kernel: do_softirq+0x3a/0x44
    May  1 13:35:22 Tower kernel: __local_bh_enable_ip+0x3b/0x43
    May  1 13:35:22 Tower kernel: ip_finish_output2+0x2ec/0x31f
    May  1 13:35:22 Tower kernel: ? ipv4_mtu+0x3d/0x64
    May  1 13:35:22 Tower kernel: __ip_queue_xmit+0x2a3/0x2df
    May  1 13:35:22 Tower kernel: __tcp_transmit_skb+0x845/0x8ba
    May  1 13:35:22 Tower kernel: tcp_connect+0x76d/0x7f4
    May  1 13:35:22 Tower kernel: tcp_v4_connect+0x3fc/0x455
    May  1 13:35:22 Tower kernel: __inet_stream_connect+0xd3/0x2b6
    May  1 13:35:22 Tower kernel: inet_stream_connect+0x34/0x49
    May  1 13:35:22 Tower kernel: __sys_connect+0x62/0x9d
    May  1 13:35:22 Tower kernel: __x64_sys_connect+0x11/0x14
    May  1 13:35:22 Tower kernel: do_syscall_64+0x5d/0x6a
    May  1 13:35:22 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
    May  1 13:35:22 Tower kernel: RIP: 0033:0x15372ba1d352
    May  1 13:35:22 Tower kernel: Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 8a d2 ff ff 41 54 b8 02 00 00 00 49 89 f4 be 00 88 08 00 55
    May  1 13:35:22 Tower kernel: RSP: 002b:00007ffcbe850d08 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
    May  1 13:35:22 Tower kernel: RAX: ffffffffffffffda RBX: 000015372ba5eb48 RCX: 000015372ba1d352
    May  1 13:35:22 Tower kernel: RDX: 0000000000000010 RSI: 00007ffcbe850dd0 RDI: 000000000000001c
    May  1 13:35:22 Tower kernel: RBP: 000015372ba5eb7c R08: 0000000000000000 R09: 0000000000000000
    May  1 13:35:22 Tower kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002a
    May  1 13:35:22 Tower kernel: R13: 0000000000000010 R14: 0000153729a44c68 R15: 0000559800038140
    May  1 13:35:22 Tower kernel: Modules linked in: xt_mark xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat xt_nat xt_tcpudp iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap veth macvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs dm_crypt dm_mod dax nfsd lockd grace sunrpc i915 iosf_mbi drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops md_mod ipmi_devintf ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding e1000e igb i2c_algo_bit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel wmi_bmof kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ipmi_ssif crypto_simd cryptd glue_helper rapl intel_cstate mpt3sas i2c_i801 i2c_smbus input_leds raid_class intel_uncore nvme i2c_core scsi_transport_sas ahci led_class nvme_core wmi libahci video intel_pch_thermal acpi_ipmi ie31200_edac backlight ipmi_si
    May  1 13:35:22 Tower kernel: acpi_pad thermal button fan [last unloaded: e1000e]
    May  1 13:35:22 Tower kernel: CR2: 00000000ffffffae
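    For anyone comparing their own logs against traces like the one above, here is a minimal Python sketch that flags kernel syslog lines mentioning macvlan or nf_nat. It is not an Unraid tool. The function name, the keyword list, and the assumed line layout ("Mon D HH:MM:SS host kernel: message") are all assumptions based on the log lines posted in this thread.

```python
import re

# Assumed syslog layout, taken from the lines posted in this thread:
#   "May  1 13:35:22 Tower kernel: <message>"
SYSLOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+(?P<msg>.*)$"
)


def find_kernel_hits(lines, keywords=("macvlan", "nf_nat")):
    """Return (timestamp, matched_keywords) for each kernel line that
    mentions any keyword. find_kernel_hits is a hypothetical helper name."""
    hits = []
    for line in lines:
        match = SYSLOG_RE.match(line.strip())
        if not match:
            continue  # not a kernel syslog line in the assumed layout
        found = [k for k in keywords if k in match.group("msg")]
        if found:
            hits.append((match.group("stamp"), found))
    return hits
```

    To use it, read a saved syslog file into a list of lines and pass that list to find_kernel_hits. A hit only shows that the keyword appears on a kernel line (for example in the "Modules linked in" list). It does not prove that the module caused the panic.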

     

    Link to comment
    Share on other sites
    20 hours ago, dumurluk said:

    Thanks, that's what I was going to try next. Btw do you know which nic is the one shared with ipmi? Is it the igb one or the e1000e one as shown in unraid gui?

    I'm not sure, I think it is igb.

    By the way, my IPMI remote console stops working once Unraid has loaded. Is that normal? I can only monitor the boot process.

    Link to comment
    Share on other sites
    4 hours ago, danieland said:

    I'm not sure, I think it is igb.

    By the way, my IPMI remote console stops working once Unraid has loaded. Is that normal? I can only monitor the boot process.

    Same. I'm assuming you have unraid load igpu drivers for hw transcode?

    Once those drivers are loaded, the motherboard disables vga, which is connected to the IPMI for ikvm.

     

    See here for a broader discussion on that: 

     

    Link to comment
    Share on other sites

    I've been dealing with this issue off and on across different versions of 6.9.x. There was a version of 6.9 where this was resolved. From what I remember, a different version of Docker is what solved these lockups, kernel panics and crashes, but Docker has since been updated in later builds of 6.9.x. I'm going to add a dedicated NIC for Docker containers and see if that helps.

    Edited by ryanhaver
    Link to comment
    Share on other sites
    22 hours ago, Lilarcor said:

    Is there a solution for it?

    The solution seems to be, for the time being, using 6.8. It doesn’t appear Limetech has made this a critical bug yet.

    Link to comment
    Share on other sites
    21 minutes ago, whoopn said:

    solution seems to be, for the time being, using 6.8

    This was my solution also, due to both this and a bug which eats SMART config files. 6.8.3 -- still saw one trace for nf_xxxxx but it was non-fatal and I've been stable for a week and a half ish, for the first time in kinda a while.

    Edited by codefaux
    Nomenclature
    Link to comment
    Share on other sites

    11 days and 6 hours of uptime with host access to custom networks disabled. unRAID 6.9.2.

    Link to comment
    Share on other sites

    I've never had host access to custom networks enabled and still see this problem. I thought I resolved the issues by putting all docker containers in standard bridge mode on their own NIC, but this doesn't seem to have resolved the issue. I was thinking that for some reason UNRAID did not like Docker Sharing an IP on the same NIC that hosted the UNRAID UI and/or any other services, but this doesn't seem to be the case.

     

    I'm starting to think this issue is more related to how Docker is implemented on UNRAID. I tend to agree with @CorneliousJD that this issue is related to the use of docker containers with custom (user defined) bridge networks. I never had this issue before setting up PiHole, which needs a custom bridge like br0 or br1. Even after putting PiHole on its own NIC using br1, the issue still resurfaced.

    Edited by ryanhaver
    Added additional context
    • Like 1
    Link to comment
    Share on other sites

    Ya the host access option makes no difference in my setup. It has been off for me from day 1 and if I put the unifi controller or pihole on a br0 to get their own ip, I lock up within 12-24 hours. 

    Link to comment
    Share on other sites

    Just a quick followup/update on the state of my installation since I last posted 30+ days ago on April 17.  Since reverting all of my docker containers back to Host from br0 I have had no crashes in almost 33 days.  However, what I find interesting is that my VMs are using br0 without issue.  It makes me think it is not a network issue but rather the way Docker is implemented in 6.9.x.  I believe ryanhaver hypothesized this in a previous post, and my experience seems to support it.  FWIW I'm running 6.9.2 and I'm still looking forward to a fix for this issue, as not using br0 for docker containers is a stopgap countermeasure, not a real solution.

    Edited by jsiemon
    Link to comment
    Share on other sites

    I've shifted away from using any user defined bridge networks and the issue is gone. This issue has existed intermittently in one form or another in various UNRAID 6.8.x and 6.9.x builds. The more confusing part is that when it was resolved back in 6.8.x it appeared to be due to an updated version of Docker. I didn't keep track of the docker versioning when 6.9.x was released, but I did not have this issue in earlier releases of 6.9.x.

    Link to comment
    Share on other sites

    You can add me to the list of people having kernel panics with macvlan call traces.  I only have 1 docker on custom:br0 (unifi controller).  I was stable on 6.8.3 for ages, then updated to 6.9.0 and had a kernel panic, downgraded to 6.8.3 and was stable until 6.9.2 came out, then updated and got kernel panics again.  I have disabled my unifi docker and am stable again for almost 2 weeks on 6.9.2.  I do not have host access enabled.  Motherboard is a Supermicro X9DRi-LN4+.  It would be great to have a solution to this issue.

    Link to comment
    Share on other sites

    Checking in here that I have the exact same problem on Unraid 6.9.2. I have an AdGuard docker on br0 and have the random crashes with the same error messages in the logs. 

     

    Running Unraid on an HPE MicroServer Gen10. 

    Link to comment
    Share on other sites

    The kernel panics are getting more frequent for me, despite disabling all containers I don't absolutely need. All host network containers are disabled at this point, and I have 2 running on br0. I had my third kernel panic in 12 hours last night.

     

    Devs, is there a fix coming for this soon? If not, despite my 2 licenses, I need to start looking at other platforms, because this is causing a huge problem for me.

    Link to comment
    Share on other sites
    12 minutes ago, vagrantprodigy said:

    The kernel panics are getting more frequent for me, despite disabling all containers I don't absolutely need. All host network containers are disabled at this point, and I have 2 running on br0. I had my third kernel panic in 12 hours last night.

     

    Devs, is there a fix coming for this soon? If not, despite my 2 licenses, I need to start looking at other platforms, because this is causing a huge problem for me.

     

    Not a dev, but as noted in this thread a few times it's your br0 ones causing the issue. Not bridge vs host. 

     

    You probably need to put those on a separate vlan. 

     

    This is admittedly a workaround, but one that's worked for me. Stable with zero crashes since doing it over a month ago 

    Link to comment
    Share on other sites




