
[SOLVED] Unraid "freeze" every 1-3 days.


Hi

 

Since beta 6.9.0.Beta28, my Unraid server has been freezing and hanging every 2-4 days, and I just can't find the reason :(

I just updated to RC2, but that did not change anything.

 

I have run memtest several times without any errors.

 

I often see this in the log prior to freezing:

 

Jan 17 12:07:32 Tower kernel: ------------[ cut here ]------------
Jan 17 12:07:32 Tower kernel: WARNING: CPU: 15 PID: 0 at net/netfilter/nf_conntrack_core.c:1120 __nf_conntrack_confirm+0x99/0x1e1
Jan 17 12:07:32 Tower kernel: Modules linked in: iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat xt_nat iptable_mangle ip6table_filter ip6_tables vhost_net tun vhost macvlan vhost_iotlb tap xt_MASQUERADE iptable_filter iptable_nat nf_nat ip_tables xfs md_mod i915 iosf_mbi i2c_algo_bit drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops bonding x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel wmi_bmof kvm intel_wmi_thunderbolt crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd btusb btrtl btbcm btintel bluetooth glue_helper rapl i2c_i801 mpt3sas nvme intel_cstate input_leds i2c_smbus nvme_core ecdh_generic intel_uncore i2c_core video led_class raid_class wmi igc backlight scsi_transport_sas ecc thermal acpi_pad button fan
Jan 17 12:07:32 Tower kernel: [last unloaded: i2c_dev]
Jan 17 12:07:32 Tower kernel: CPU: 15 PID: 0 Comm: swapper/15 Not tainted 5.10.1-Unraid #1
Jan 17 12:07:32 Tower kernel: Hardware name: Gigabyte Technology Co., Ltd. Z490I AORUS ULTRA/Z490I AORUS ULTRA, BIOS F4 06/17/2020
Jan 17 12:07:32 Tower kernel: RIP: 0010:__nf_conntrack_confirm+0x99/0x1e1
Jan 17 12:07:32 Tower kernel: Code: e4 e3 ff ff 8b 54 24 14 89 c6 41 89 c4 48 c1 eb 20 89 df 41 89 de e8 54 e1 ff ff 84 c0 75 b8 48 8b 85 80 00 00 00 a8 08 74 18 <0f> 0b 89 df 44 89 e6 31 db e8 89 de ff ff e8 af e0 ff ff e9 1f 01
Jan 17 12:07:32 Tower kernel: RSP: 0018:ffffc900004108f8 EFLAGS: 00010202
Jan 17 12:07:32 Tower kernel: RAX: 0000000000000188 RBX: 000000000000c885 RCX: 00000000fad213b7
Jan 17 12:07:32 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8200a6c0
Jan 17 12:07:32 Tower kernel: RBP: ffff8881fb61c280 R08: 000000008f9aa1f3 R09: ffff888100df8b60
Jan 17 12:07:32 Tower kernel: R10: 0000000000000158 R11: ffff888dbb05b500 R12: 0000000000007ba0
Jan 17 12:07:32 Tower kernel: R13: ffffffff8210da40 R14: 000000000000c885 R15: ffff8881fb61c28c
Jan 17 12:07:32 Tower kernel: FS:  0000000000000000(0000) GS:ffff88907e5c0000(0000) knlGS:0000000000000000
Jan 17 12:07:32 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 17 12:07:32 Tower kernel: CR2: 0000000000000000 CR3: 000000000400c003 CR4: 00000000007706e0
Jan 17 12:07:32 Tower kernel: PKRU: 55555554
Jan 17 12:07:32 Tower kernel: Call Trace:
Jan 17 12:07:32 Tower kernel: <IRQ>
Jan 17 12:07:32 Tower kernel: nf_conntrack_confirm+0x2f/0x36
Jan 17 12:07:32 Tower kernel: nf_hook_slow+0x39/0x8e
Jan 17 12:07:32 Tower kernel: nf_hook.constprop.0+0xb1/0xd8
Jan 17 12:07:32 Tower kernel: ? ip_protocol_deliver_rcu+0xfe/0xfe
Jan 17 12:07:32 Tower kernel: ip_local_deliver+0x49/0x75
Jan 17 12:07:32 Tower kernel: ip_sabotage_in+0x43/0x4d
Jan 17 12:07:32 Tower kernel: nf_hook_slow+0x39/0x8e
Jan 17 12:07:32 Tower kernel: nf_hook.constprop.0+0xb1/0xd8
Jan 17 12:07:32 Tower kernel: ? l3mdev_l3_rcv.constprop.0+0x50/0x50
Jan 17 12:07:32 Tower kernel: ip_rcv+0x41/0x61
Jan 17 12:07:32 Tower kernel: __netif_receive_skb_one_core+0x74/0x95
Jan 17 12:07:32 Tower kernel: netif_receive_skb+0x79/0xa1
Jan 17 12:07:32 Tower kernel: br_handle_frame_finish+0x30d/0x351
Jan 17 12:07:32 Tower kernel: ? ipt_do_table+0x570/0x5c0 [ip_tables]
Jan 17 12:07:32 Tower kernel: ? br_pass_frame_up+0xda/0xda
Jan 17 12:07:32 Tower kernel: br_nf_hook_thresh+0xa3/0xc3
Jan 17 12:07:32 Tower kernel: ? br_pass_frame_up+0xda/0xda
Jan 17 12:07:32 Tower kernel: br_nf_pre_routing_finish+0x23d/0x264
Jan 17 12:07:32 Tower kernel: ? br_pass_frame_up+0xda/0xda
Jan 17 12:07:32 Tower kernel: ? br_handle_frame_finish+0x351/0x351
Jan 17 12:07:32 Tower kernel: ? nf_nat_ipv4_in+0x1e/0x4a [nf_nat]
Jan 17 12:07:32 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0
Jan 17 12:07:32 Tower kernel: ? br_handle_frame_finish+0x351/0x351
Jan 17 12:07:32 Tower kernel: NF_HOOK+0xd7/0xf7
Jan 17 12:07:32 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0
Jan 17 12:07:32 Tower kernel: br_nf_pre_routing+0x229/0x239
Jan 17 12:07:32 Tower kernel: ? br_nf_forward_finish+0xd0/0xd0
Jan 17 12:07:32 Tower kernel: br_handle_frame+0x25e/0x2a6
Jan 17 12:07:32 Tower kernel: ? br_pass_frame_up+0xda/0xda
Jan 17 12:07:32 Tower kernel: __netif_receive_skb_core+0x335/0x4e7
Jan 17 12:07:32 Tower kernel: ? dma_pte_clear_level+0xff/0x159
Jan 17 12:07:32 Tower kernel: __netif_receive_skb_list_core+0x78/0x104
Jan 17 12:07:32 Tower kernel: netif_receive_skb_list_internal+0x1bf/0x1f2
Jan 17 12:07:32 Tower kernel: ? dev_gro_receive+0x55d/0x578
Jan 17 12:07:32 Tower kernel: gro_normal_list+0x1d/0x39
Jan 17 12:07:32 Tower kernel: napi_complete_done+0x79/0x104
Jan 17 12:07:32 Tower kernel: igc_poll+0x642/0xa27 [igc]
Jan 17 12:07:32 Tower kernel: net_rx_action+0xf4/0x29d
Jan 17 12:07:32 Tower kernel: __do_softirq+0xc4/0x1c2
Jan 17 12:07:32 Tower kernel: asm_call_irq_on_stack+0xf/0x20
Jan 17 12:07:32 Tower kernel: </IRQ>
Jan 17 12:07:32 Tower kernel: do_softirq_own_stack+0x2c/0x39
Jan 17 12:07:32 Tower kernel: __irq_exit_rcu+0x45/0x80
Jan 17 12:07:32 Tower kernel: common_interrupt+0x119/0x12e
Jan 17 12:07:32 Tower kernel: asm_common_interrupt+0x1e/0x40
Jan 17 12:07:32 Tower kernel: RIP: 0010:arch_local_irq_enable+0x7/0x8
Jan 17 12:07:32 Tower kernel: Code: 00 48 83 c4 28 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 9c 58 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 55 8b af 28 04 00 00 b8 01 00 00 00 45 31 c9 53 45 31 d2 39 c5
Jan 17 12:07:32 Tower kernel: RSP: 0018:ffffc90000173ea0 EFLAGS: 00000246
Jan 17 12:07:32 Tower kernel: RAX: ffff88907e5e2300 RBX: 0000000000000003 RCX: 000000000000001f
Jan 17 12:07:32 Tower kernel: RDX: 0000000000000000 RSI: 000000002c13c21a RDI: 0000000000000000
Jan 17 12:07:32 Tower kernel: RBP: ffffe8ffffddd300 R08: 000001d833406275 R09: 0000000000000000
Jan 17 12:07:32 Tower kernel: R10: 000000000000147b R11: 071c71c71c71c71c R12: 000001d833406275
Jan 17 12:07:32 Tower kernel: R13: ffffffff820c7e80 R14: 0000000000000003 R15: 0000000000000000
Jan 17 12:07:32 Tower kernel: cpuidle_enter_state+0x101/0x1c4
Jan 17 12:07:32 Tower kernel: cpuidle_enter+0x25/0x31
Jan 17 12:07:32 Tower kernel: do_idle+0x1a1/0x20f
Jan 17 12:07:32 Tower kernel: cpu_startup_entry+0x18/0x1a
Jan 17 12:07:32 Tower kernel: secondary_startup_64_no_verify+0xb0/0xbb
Jan 17 12:07:32 Tower kernel: ---[ end trace 9806cae49f5b5b24 ]---
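For anyone who wants to compare several of these traces across a syslog, here is a minimal Python sketch. It is not part of the original diagnostics: the function name `extract_warnings` and the regular expressions are assumptions based only on the log format shown above. It collects each block between "cut here" and "end trace", records where the WARNING was raised, and lists the call-trace frames, skipping the "?" frames that the kernel marks as unreliable.

```python
import re

# Syslog prefix as it appears in the excerpt above, e.g.
# "Jan 17 12:07:32 Tower kernel: ..." (assumed format).
PREFIX = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+\S+\s+kernel:\s?(.*)$")
# "WARNING: CPU: 15 PID: 0 at <file>:<line> <symbol+offset>"
WARNING = re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+) (\S+)")
# A call-trace frame such as "igc_poll+0x642/0xa27 [igc]".
# Lines starting with "?" or "RIP:" intentionally do not match.
FRAME = re.compile(r"^([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def extract_warnings(lines):
    """Return one dict per '[ cut here ]' ... '[ end trace ]' block."""
    blocks = []
    current = None
    for line in lines:
        m = PREFIX.match(line.rstrip("\n"))
        if not m:
            continue
        stamp, msg = m.group(1), m.group(2).strip()
        if "[ cut here ]" in msg:
            # Start of a new kernel warning block.
            current = {"time": stamp, "source": None, "symbol": None, "frames": []}
            continue
        if current is None:
            continue
        if "[ end trace" in msg:
            blocks.append(current)
            current = None
            continue
        w = WARNING.search(msg)
        if w:
            current["source"], current["symbol"] = w.group(1), w.group(2)
            continue
        f = FRAME.match(msg)
        if f:
            current["frames"].append(f.group(1))
    return blocks
```

A possible way to use it, assuming the syslog lives at /var/log/syslog on the server: pass the open file to `extract_warnings()` and print `time`, `source` and the first few `frames` of each block to see whether the same code path (here the bridge and conntrack functions) shows up before every freeze.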

 

Is anyone able to tell me what this means?

 

I have attached some diagnostic log files and would appreciate all comments or suggestions.

 

Best regards

Jan

 

tower-diagnostics-20210101-1426.zip tower-diagnostics-20210110-1344.zip
