• <6.10.2> Kernel bug freezes the whole system; hard reboot required


    Shawn Young
    • Minor

    The freeze happens at random intervals. Roughly every 2-5 days the server locks up and I have to hard reboot it.

     

    It looks like a hardware problem, but I have run memtest and all tests passed.

     

    Here is the syslog from the start of the freeze, captured before the reboot.

     

    Please let me know whether I should replace the motherboard or try something else.

     

    Jun 27 22:53:01 Tower kernel: BUG: unable to handle page fault for address: 0000001000000008
    Jun 27 22:53:01 Tower kernel: #PF: supervisor write access in kernel mode
    Jun 27 22:53:01 Tower kernel: #PF: error_code(0x0002) - not-present page
    Jun 27 22:53:01 Tower kernel: PGD 0 P4D 0 
    Jun 27 22:53:01 Tower kernel: Oops: 0002 [#1] SMP PTI
    Jun 27 22:53:01 Tower kernel: CPU: 2 PID: 26345 Comm: kworker/2:0 Tainted: G     U  W  O      5.15.43-Unraid #1
    Jun 27 22:53:01 Tower kernel: Hardware name: Gigabyte Technology Co., Ltd. Z170-HD3 DDR3/Z170-HD3 DDR3-CF, BIOS F21f 03/09/2018
    Jun 27 22:53:01 Tower kernel: Workqueue: events free_work
    Jun 27 22:53:01 Tower kernel: RIP: 0010:__purge_vmap_area_lazy+0x410/0x4c8
    Jun 27 22:53:01 Tower kernel: Code: 89 82 49 c7 46 18 00 00 00 00 49 c7 46 20 00 00 00 00 49 89 38 e8 c7 dd 26 00 49 c7 46 38 00 00 00 00 48 8b 55 00 49 8d 46 28 <48> 89 42 08 49 89 56 28 49 89 6e 30 48 89 45 00 4d 85 f6 74 35 49
    Jun 27 22:53:01 Tower kernel: RSP: 0018:ffffc90007cd3d78 EFLAGS: 00010202
    Jun 27 22:53:01 Tower kernel: RAX: ffff8882342833e8 RBX: ffffc9001617c000 RCX: ffff8881a6a69c00
    Jun 27 22:53:01 Tower kernel: RDX: 0000001000000000 RSI: ffffffff828944f0 RDI: ffff8882342833d0
    Jun 27 22:53:01 Tower kernel: RBP: ffff8881a6a69c00 R08: ffff8881a8f89e98 R09: 0000000080400030
    Jun 27 22:53:01 Tower kernel: R10: ffff8881a8f89e98 R11: ffff888234283400 R12: ffffc90016181000
    Jun 27 22:53:01 Tower kernel: R13: ffff8881a8f89e98 R14: ffff8882342833c0 R15: ffff8881a8f89eb0
    Jun 27 22:53:01 Tower kernel: FS:  0000000000000000(0000) GS:ffff888238a80000(0000) knlGS:0000000000000000
    Jun 27 22:53:01 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Jun 27 22:53:01 Tower kernel: CR2: 0000001000000008 CR3: 000000000420a005 CR4: 00000000003726e0
    Jun 27 22:53:01 Tower kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Jun 27 22:53:01 Tower kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Jun 27 22:53:01 Tower kernel: Call Trace:
    Jun 27 22:53:01 Tower kernel: <TASK>
    Jun 27 22:53:01 Tower kernel: free_vmap_area_noflush+0x23d/0x290
    Jun 27 22:53:01 Tower kernel: remove_vm_area+0x5e/0x76
    Jun 27 22:53:01 Tower kernel: __vunmap+0x74/0x178
    Jun 27 22:53:01 Tower kernel: free_work+0x22/0x2c
    Jun 27 22:53:01 Tower kernel: process_one_work+0x198/0x27a
    Jun 27 22:53:01 Tower kernel: worker_thread+0x19c/0x240
    Jun 27 22:53:01 Tower kernel: ? rescuer_thread+0x28b/0x28b
    Jun 27 22:53:01 Tower kernel: kthread+0xde/0xe3
    Jun 27 22:53:01 Tower kernel: ? set_kthread_struct+0x32/0x32
    Jun 27 22:53:01 Tower kernel: ret_from_fork+0x22/0x30
    Jun 27 22:53:01 Tower kernel: </TASK>
    Jun 27 22:53:01 Tower kernel: Modules linked in: xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle iptable_mangle vhost_net tun vhost vhost_iotlb tap macvlan xt_conntrack nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat br_netfilter xfs xt_MASQUERADE ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 md_mod kvmgt mdev i915 iosf_mbi i2c_algo_bit ttm drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables tg3 r8168(O) x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm intel_wmi_thunderbolt crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl intel_cstate intel_uncore nvme i2c_i801 i2c_smbus nvme_core i2c_core input_leds led_class fan wmi thermal video backlight button acpi_pad [last unloaded: tg3]
    Jun 27 22:53:01 Tower kernel: CR2: 0000001000000008
    Jun 27 22:53:01 Tower kernel: ---[ end trace e0d6c6e6b92502f8 ]---
    Jun 27 22:53:01 Tower kernel: RIP: 0010:__purge_vmap_area_lazy+0x410/0x4c8
    Jun 27 22:53:01 Tower kernel: Code: 89 82 49 c7 46 18 00 00 00 00 49 c7 46 20 00 00 00 00 49 89 38 e8 c7 dd 26 00 49 c7 46 38 00 00 00 00 48 8b 55 00 49 8d 46 28 <48> 89 42 08 49 89 56 28 49 89 6e 30 48 89 45 00 4d 85 f6 74 35 49
    Jun 27 22:53:01 Tower kernel: RSP: 0018:ffffc90007cd3d78 EFLAGS: 00010202
    Jun 27 22:53:01 Tower kernel: RAX: ffff8882342833e8 RBX: ffffc9001617c000 RCX: ffff8881a6a69c00
    Jun 27 22:53:01 Tower kernel: RDX: 0000001000000000 RSI: ffffffff828944f0 RDI: ffff8882342833d0
    Jun 27 22:53:01 Tower kernel: RBP: ffff8881a6a69c00 R08: ffff8881a8f89e98 R09: 0000000080400030
    Jun 27 22:53:01 Tower kernel: R10: ffff8881a8f89e98 R11: ffff888234283400 R12: ffffc90016181000
    Jun 27 22:53:01 Tower kernel: R13: ffff8881a8f89e98 R14: ffff8882342833c0 R15: ffff8881a8f89eb0
    Jun 27 22:53:01 Tower kernel: FS:  0000000000000000(0000) GS:ffff888238a80000(0000) knlGS:0000000000000000
    Jun 27 22:53:01 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Jun 27 22:53:01 Tower kernel: CR2: 0000001000000008 CR3: 00000001558a0001 CR4: 00000000003726e0
    Jun 27 22:53:01 Tower kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Jun 27 22:53:01 Tower kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Jun 27 22:53:04 Tower kernel: ------------[ cut here ]------------
    Jun 27 22:53:04 Tower kernel: refcount_t: underflow; use-after-free.
    Jun 27 22:53:04 Tower kernel: WARNING: CPU: 5 PID: 10544 at lib/refcount.c:28 refcount_warn_saturate+0xa7/0xe8
    Jun 27 22:53:04 Tower kernel: Modules linked in: xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle iptable_mangle vhost_net tun vhost vhost_iotlb tap macvlan xt_conntrack nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat br_netfilter xfs xt_MASQUERADE ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 md_mod kvmgt mdev i915 iosf_mbi i2c_algo_bit ttm drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables tg3 r8168(O) x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm intel_wmi_thunderbolt crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl intel_cstate intel_uncore nvme i2c_i801 i2c_smbus nvme_core i2c_core input_leds led_class fan wmi thermal video backlight button acpi_pad [last unloaded: tg3]
    Jun 27 22:53:04 Tower kernel: CPU: 5 PID: 10544 Comm: syncthing Tainted: G     UD W  O      5.15.43-Unraid #1
    Jun 27 22:53:04 Tower kernel: Hardware name: Gigabyte Technology Co., Ltd. Z170-HD3 DDR3/Z170-HD3 DDR3-CF, BIOS F21f 03/09/2018
    Jun 27 22:53:04 Tower kernel: RIP: 0010:refcount_warn_saturate+0xa7/0xe8
    Jun 27 22:53:04 Tower kernel: Code: 05 87 4f fc 00 01 e8 ad e2 43 00 0f 0b c3 80 3d 77 4f fc 00 00 75 53 48 c7 c7 e2 dd 10 82 c6 05 67 4f fc 00 01 e8 8e e2 43 00 <0f> 0b c3 80 3d 57 4f fc 00 00 75 34 48 c7 c7 0a de 10 82 c6 05 47
    Jun 27 22:53:04 Tower kernel: RSP: 0018:ffffc900012ef900 EFLAGS: 00010286
    Jun 27 22:53:04 Tower kernel: RAX: 0000000000000000 RBX: 000000000001dae5 RCX: 0000000000000027
    Jun 27 22:53:04 Tower kernel: RDX: 0000000000000003 RSI: ffffc900012ef788 RDI: ffff888238b5c510
    Jun 27 22:53:04 Tower kernel: RBP: ffff8881a8f89e00 R08: ffffffff822b4e28 R09: 0000000000000001
    Jun 27 22:53:04 Tower kernel: R10: 0000000000aaaaaa R11: ffffc90021e20420 R12: ffff8881a8f89e10
    Jun 27 22:53:04 Tower kernel: R13: ffffffff828fd500 R14: ffffc900012ef978 R15: ffffffff81ed36c0
    Jun 27 22:53:04 Tower kernel: FS:  000000c000062c90(0000) GS:ffff888238b40000(0000) knlGS:0000000000000000
    Jun 27 22:53:04 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Jun 27 22:53:04 Tower kernel: CR2: 00007f8ba56b4900 CR3: 00000001bdf18006 CR4: 00000000003726e0
    Jun 27 22:53:04 Tower kernel: Call Trace:
    Jun 27 22:53:04 Tower kernel: <TASK>
    Jun 27 22:53:04 Tower kernel: nf_ct_destroy+0x69/0x72 [nf_conntrack]
    Jun 27 22:53:04 Tower kernel: __nf_conntrack_find_get+0x7d/0x11c [nf_conntrack]
    Jun 27 22:53:04 Tower kernel: nf_conntrack_in+0x1d4/0x4df [nf_conntrack]
    Jun 27 22:53:04 Tower kernel: nf_hook_slow+0x3e/0x93
    Jun 27 22:53:04 Tower kernel: __ip6_local_out+0x109/0x134
    Jun 27 22:53:04 Tower kernel: ? ipv6_select_ident+0x11/0x11
    Jun 27 22:53:04 Tower kernel: ip6_local_out+0x1c/0x3d
    Jun 27 22:53:04 Tower kernel: ip6_send_skb+0x1e/0x5d
    Jun 27 22:53:04 Tower kernel: udp_v6_send_skb+0x308/0x37b
    Jun 27 22:53:04 Tower kernel: udpv6_sendmsg+0x7dd/0xa6d
    Jun 27 22:53:04 Tower kernel: ? ip_select_ident_segs+0x51/0x51
    Jun 27 22:53:04 Tower kernel: ? sock_sendmsg_nosec+0x1c/0x3c
    Jun 27 22:53:04 Tower kernel: sock_sendmsg_nosec+0x1c/0x3c
    Jun 27 22:53:04 Tower kernel: ____sys_sendmsg+0x143/0x1a8
    Jun 27 22:53:04 Tower kernel: ? __accumulate_pelt_segments+0x29/0x3c
    Jun 27 22:53:04 Tower kernel: ? __update_load_avg_cfs_rq+0x10d/0x1d2
    Jun 27 22:53:04 Tower kernel: ___sys_sendmsg+0x7f/0xb7
    Jun 27 22:53:04 Tower kernel: ? update_curr+0x26/0x140
    Jun 27 22:53:04 Tower kernel: ? __update_load_avg_cfs_rq+0xe9/0x1d2
    Jun 27 22:53:04 Tower kernel: ? update_cfs_rq_load_avg+0x138/0x146
    Jun 27 22:53:04 Tower kernel: ? update_cfs_rq_load_avg+0x138/0x146
    Jun 27 22:53:04 Tower kernel: ? __fget+0x29/0x2f
    Jun 27 22:53:04 Tower kernel: __sys_sendmsg+0x60/0x93
    Jun 27 22:53:04 Tower kernel: do_syscall_64+0x83/0xa5
    Jun 27 22:53:04 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
    Jun 27 22:53:04 Tower kernel: RIP: 0033:0x4b59db
    Jun 27 22:53:04 Tower kernel: Code: e8 6a 11 fb ff eb 88 cc cc cc cc cc cc cc cc e8 fb 56 fb ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
    Jun 27 22:53:04 Tower kernel: RSP: 002b:000000c000985a88 EFLAGS: 00000212 ORIG_RAX: 000000000000002e
    Jun 27 22:53:04 Tower kernel: RAX: ffffffffffffffda RBX: 000000c000040500 RCX: 00000000004b59db
    Jun 27 22:53:04 Tower kernel: RDX: 0000000000000000 RSI: 000000c00083be80 RDI: 000000000000000f
    Jun 27 22:53:04 Tower kernel: RBP: 000000c000985ad8 R08: 0000000000000001 R09: 000000c000898000
    Jun 27 22:53:04 Tower kernel: R10: 0000000000000008 R11: 0000000000000212 R12: 0000000000203000
    Jun 27 22:53:04 Tower kernel: R13: 0000000000000000 R14: 000000c001503860 R15: 0000152979c113b7
    Jun 27 22:53:04 Tower kernel: </TASK>
    Jun 27 22:53:04 Tower kernel: ---[ end trace e0d6c6e6b92502f9 ]---
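For anyone reading the first oops above, here is a minimal sketch of how the `#PF` lines can be cross-checked. It decodes the error code `0x0002` using the standard x86 page-fault error-code bits and confirms that the faulting address in CR2 is RDX plus 8, which matches the marked instruction bytes `<48> 89 42 08` (a 64-bit store of RAX to offset 8 from RDX). The helper name `decode_pf_error` is made up for illustration, and this only restates what the log already reports; it does not identify a root cause.

```python
# Sketch: decode an x86 page-fault error code and check the faulting address.
# decode_pf_error is a hypothetical helper, not part of any kernel tooling.

def decode_pf_error(code: int) -> dict:
    """Split the error code into the low five x86 #PF flag bits."""
    return {
        "protection_violation": bool(code & 0x01),  # False means not-present page
        "write": bool(code & 0x02),                 # False means read access
        "user_mode": bool(code & 0x04),             # False means supervisor mode
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }

# Values copied from the first oops in the syslog above.
info = decode_pf_error(0x0002)
rdx = 0x0000001000000000
cr2 = 0x0000001000000008
offset = cr2 - rdx

print(info)         # write access, not-present page, supervisor mode
print(hex(offset))  # displacement used by the faulting store
```

The decoded flags agree with the "supervisor write access in kernel mode" and "not-present page" lines, and the offset of 8 shows the kernel was writing through the pointer held in RDX, which is not a valid kernel address.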

     

    tower-diagnostics-20220628-0725.zip




    User Feedback

    Recommended Comments

    Check whether the issue in the thread below applies to you. If it does, switching to ipvlan should fix it: Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right). See the links below for more info:

     

    https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/

    See also here:

    https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/
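As a rough way to check whether a saved syslog shows the pattern discussed in the linked threads, the sketch below flags a log when the `macvlan` module is loaded and a trace frame passes through `nf_conntrack` or `nf_nat`. This is only a heuristic under those assumptions, not an official Unraid diagnostic; the function name `looks_like_macvlan_trace` and the regular expression are illustrative, and the sample is a shortened excerpt of the log in the report.

```python
import re

# Matches a call-trace frame such as "nf_ct_destroy+0x69/0x72 [nf_conntrack]"
# and captures the function name and, if present, the owning module.
TRACE_FRAME = re.compile(
    r"kernel:\s+\??\s*([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)

def looks_like_macvlan_trace(syslog_text: str) -> bool:
    """Heuristic: macvlan is loaded AND a trace frame is in conntrack/NAT code."""
    macvlan_loaded = False
    conntrack_frame = False
    for line in syslog_text.splitlines():
        if "Modules linked in:" in line:
            modules = line.split("Modules linked in:", 1)[1].split()
            macvlan_loaded = macvlan_loaded or "macvlan" in modules
        match = TRACE_FRAME.search(line)
        if match and match.group(2) in {"nf_conntrack", "nf_nat"}:
            conntrack_frame = True
    return macvlan_loaded and conntrack_frame

# Shortened excerpt of the second trace in the report (module list abbreviated).
sample = """\
Jun 27 22:53:04 Tower kernel: Modules linked in: xt_nat tun vhost tap macvlan xt_conntrack nf_nat nf_conntrack
Jun 27 22:53:04 Tower kernel: nf_ct_destroy+0x69/0x72 [nf_conntrack]
Jun 27 22:53:04 Tower kernel: nf_hook_slow+0x3e/0x93
"""

print(looks_like_macvlan_trace(sample))
```

A positive result only means the log resembles the reports in those threads. Whether Docker containers are actually on a custom macvlan network still has to be confirmed in Docker Settings.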


