unRaid crashing within 24 hours of every reboot/startup


sorrow
Solved by JorgeB

Current version: v6.12.6
The crashing has been happening since around/before: v6.11.x

 

I've been running unRaid for the last few years on a Dell R720, but it started crashing every few days, so I upgraded to a Dell R730. Now it crashes within 24 hours of every reboot or startup, at a seemingly random time within that window.

All of my drives show healthy. I ran a MemTest on v6.12.5 and it passed with zero errors. I have this server's syslog sent to another unRaid server, and either nothing unusual is logged before a crash or there are only normal messages that don't look like errors.

I've been going through the whole configuration and nothing really stands out as a reason for it to be suddenly crashing like this at random times.

Any help with troubleshooting steps, or just pointers on where to look next, would be greatly appreciated.

Thanks.


 


Are you using macvlan for docker networking?   That seems to be the most common cause of instability at the moment.   Note also that passing memtest is not definitive on RAM issues (whereas failing is).  Sometimes running with fewer sticks of RAM can help.

 

The log from the syslog server might be useful in showing what led up to the crash, even though you suggest it is not showing anything useful.

 


When you say macvlan, do you mean setting up a container's networking as something like Custom: br0, etc., instead of as a bridge? If so, I'm sure I have some Custom: br0 interfaces set up.

As for the memtest, I knocked it down to 2 sticks of RAM, then swapped those out for the other 2 and ran the test again, covering all 4 sticks.

 

Here is the output from the syslog server from last night, just before it crashed. It differs from the night before because it actually reports a "general protection fault", whereas the logs from the previous few nights did not.

 

Dec 27 05:00:01 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: BTRFS info (device sdf1): balance: start -dusage=50
Dec 27 05:00:01 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: BTRFS info (device sdf1): balance: ended with status: 0
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: general protection fault, probably for non-canonical address 0x37708569a0089b13: 0000 [#1] PREEMPT SMP PTI
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: CPU: 54 PID: 55089 Comm: node Tainted: P        W  O       6.1.64-Unraid #1
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.17.0 03/15/2023
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RIP: 0010:nf_nat_setup_info+0x14a/0x7d1 [nf_nat]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: Code: 4c 89 ff e8 1e f8 ff ff 48 8b 15 2a 6a 00 00 89 c0 48 8d 04 c2 4c 8b 30 4d 85 f6 74 2a 49 81 ee 90 00 00 00 eb 21 8a 44 24 46 <41> 38 46 46 74 21 49 8b 96 90 00 00 00 48 85 d2 0f 84 56 ff ff ff
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RSP: 0018:ffffc9000711c740 EFLAGS: 00010202
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RAX: ffff8888bb073706 RBX: ffff888149f1b600 RCX: 41d483225180f979
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RDX: 37708569a0089b5d RSI: 168e7d3aa1d626d9 RDI: edd8a2f5a67b2e7e
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RBP: ffffc9000711c808 R08: 10acad0aca90a25c R09: 210fba724e3b6d10
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: R10: 18009b0c7f645b6a R11: ffffc9000711c718 R12: ffffc9000711c81c
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: R13: 0000000000000000 R14: 37708569a0089acd R15: ffffffff82a14d00
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: FS:  0000150f49b29c18(0000) GS:ffff88885fcc0000(0000) knlGS:0000000000000000
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: CR2: 0000150f478ac000 CR3: 00000002a7f7c003 CR4: 00000000003706e0
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: Call Trace:
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: <IRQ>
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? __die_body+0x1a/0x5c
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? die_addr+0x38/0x51
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? exc_general_protection+0x30f/0x345
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? asm_exc_general_protection+0x22/0x30
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? nf_nat_setup_info+0x14a/0x7d1 [nf_nat]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? krealloc+0x82/0x93
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: nf_nat_masquerade_ipv4+0x114/0x13c [nf_nat]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: masquerade_tg+0x48/0x66 [xt_MASQUERADE]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ipt_do_table+0x519/0x5ba [ip_tables]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? xt_write_recseq_end+0xf/0x1c [ip_tables]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? __local_bh_enable_ip+0x56/0x6b
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? ipt_do_table+0x575/0x5ba [ip_tables]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: nf_nat_inet_fn+0x126/0x1a8 [nf_nat]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: nf_nat_ipv4_out+0x15/0x91 [nf_nat]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: nf_hook_slow+0x3d/0x96
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? __ip_finish_output+0x144/0x144
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: nf_hook+0xdf/0x110
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? ethtool_set_channels+0x93/0x181
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? __ip_finish_output+0x144/0x144
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ip_output+0x78/0x88
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? __ip_finish_output+0x144/0x144
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ip_sabotage_in+0x52/0x60 [br_netfilter]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: nf_hook_slow+0x3d/0x96
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? ip_rcv_finish_core.constprop.0+0x3e8/0x3e8
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: NF_HOOK.constprop.0+0x79/0xd9
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? ip_rcv_finish_core.constprop.0+0x3e8/0x3e8
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __netif_receive_skb_one_core+0x77/0x9c
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: netif_receive_skb+0xbf/0x127
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: br_handle_frame_finish+0x43a/0x474 [bridge]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? br_pass_frame_up+0xdd/0xdd [bridge]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: br_nf_hook_thresh+0xe5/0x109 [br_netfilter]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? br_pass_frame_up+0xdd/0xdd [bridge]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: br_nf_pre_routing_finish+0x2c1/0x2ec [br_netfilter]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? br_pass_frame_up+0xdd/0xdd [bridge]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? br_nf_hook_thresh+0x109/0x109 [br_netfilter]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: br_nf_pre_routing+0x236/0x24a [br_netfilter]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? br_nf_hook_thresh+0x109/0x109 [br_netfilter]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: br_handle_frame+0x27a/0x2e0 [bridge]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? br_pass_frame_up+0xdd/0xdd [bridge]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __netif_receive_skb_core.constprop.0+0x4fd/0x6e9
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? se_is_idle+0x16/0x34
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? place_entity+0x6e/0xae
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __netif_receive_skb_one_core+0x40/0x9c
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: process_backlog+0x8c/0x116
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __napi_poll.constprop.0+0x2b/0x124
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: net_rx_action+0x159/0x24f
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __do_softirq+0x129/0x288
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: do_softirq+0x7f/0xab
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: </IRQ>
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: <TASK>
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __local_bh_enable_ip+0x4c/0x6b
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __dev_queue_xmit+0x806/0x832
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? slab_post_alloc_hook+0x4d/0x15e
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? __skb_dequeue+0x39/0x39
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? ___neigh_lookup_noref+0x5b/0x6d
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ip_finish_output2+0x39d/0x3e0
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __ip_queue_xmit+0x2d8/0x31f
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __tcp_transmit_skb+0x84f/0x8bf
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: tcp_connect+0x7f1/0x87c
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: tcp_v4_connect+0x41c/0x47f
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __inet_stream_connect+0xe7/0x332
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ? percpu_counter_add_batch+0x85/0xa2
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: inet_stream_connect+0x39/0x52
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __sys_connect+0x68/0xa7
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: __x64_sys_connect+0x14/0x1b
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: do_syscall_64+0x6b/0x81
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RIP: 0033:0x150f49c3bf63
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 70 d0 ff ff 41 54 b8 02 00 00 00 55 48 89 f5 be 00 88 08 00
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RSP: 002b:00007ffc5f961ee8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RAX: ffffffffffffffda RBX: 000000000000002a RCX: 0000150f49c3bf63
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RDX: 0000000000000010 RSI: 00007ffc5f961fd0 RDI: 0000000000000015
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: RBP: 0000150f49b29c18 R08: 0000000000000000 R09: 0000000000000000
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000150f4786eff8
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: R13: 00007ffc5f961fd0 R14: 0000000000000010 R15: 0000150f49b29c4c
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: </TASK>
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: Modules linked in: xt_connmark xt_comment iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha udp_diag xt_mark nft_compat nf_tables xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap macvlan veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry dns_resolver md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls ixgbe xfrm_algo mdio igb intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul sr_mod crc32_pclmul mgag200 crc32c_intel
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: cdrom ghash_clmulni_intel drm_shmem_helper sha512_ssse3 sha256_ssse3 sha1_ssse3 drm_kms_helper aesni_intel ipmi_ssif crypto_simd drm cryptd rapl intel_cstate mxm_wmi backlight mei_me syscopyarea i2c_algo_bit sysfillrect ahci sysimgblt intel_uncore megaraid_sas fb_sys_fops i2c_core mei libahci ipmi_si wmi acpi_power_meter button unix [last unloaded: xfrm_algo]
Dec 27 05:01:08 503ebd0d06deb61de39a8fbd9006ec135ba4ebe1 kernel: ---[ end trace 0000000000000000 ]---
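
For anyone else reading a dump like this, here is a rough Python sketch of how the relevant parts could be pulled out of a saved syslog. It is only an illustration: the function name `summarize_trace` and the sample lines are made up for this example, and the patterns are based purely on the format of the trace above. It lists the fault lines, counts which kernel modules appear in the stack frames, and checks whether macvlan is in the "Modules linked in" list. It does not prove what caused the crash.

```python
import re
from collections import Counter

# Matches a stack frame that names a module, e.g.
# "nf_nat_setup_info+0x14a/0x7d1 [nf_nat]" (format taken from the trace above).
FRAME = re.compile(r"\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")


def summarize_trace(lines):
    """Return (fault_lines, module_counts, macvlan_loaded) for syslog lines."""
    faults = []
    modules = Counter()
    macvlan_loaded = False
    for line in lines:
        if "general protection fault" in line:
            faults.append(line.strip())
        match = FRAME.search(line)
        if match:
            modules[match.group(1)] += 1
        # The "Modules linked in:" line lists every loaded module by name.
        if "Modules linked in:" in line and "macvlan" in line.split():
            macvlan_loaded = True
    return faults, modules, macvlan_loaded


# Hypothetical sample, shortened from the log above ("host" replaces the hostname).
sample = [
    "Dec 27 05:01:08 host kernel: general protection fault, probably for "
    "non-canonical address 0x37708569a0089b13: 0000 [#1] PREEMPT SMP PTI",
    "Dec 27 05:01:08 host kernel: RIP: 0010:nf_nat_setup_info+0x14a/0x7d1 [nf_nat]",
    "Dec 27 05:01:08 host kernel: br_handle_frame+0x27a/0x2e0 [bridge]",
    "Dec 27 05:01:08 host kernel: Modules linked in: tap macvlan veth xt_nat",
]

faults, modules, macvlan_loaded = summarize_trace(sample)
print(len(faults), dict(modules), macvlan_loaded)
```

To run it against a real file, pass the open file instead of the sample list, for example `summarize_trace(open("syslog.txt"))`, where `syslog.txt` is whatever name the log was saved under.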

 

8 minutes ago, JorgeB said:

Go to (Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right)), then reboot.

 

Oh, in the Advanced Docker Settings it does list this:

 

Docker custom network type:  macvlan

 

I'll change this to ipvlan, reboot, and then see how that goes.

Thanks!
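
In case it helps someone verify the change after the reboot, here is a small Python sketch that lists each Docker network with its driver by calling `docker network ls` with a `--format` template. The helper names (`parse_network_drivers`, `docker_network_drivers`, `macvlan_networks`) are made up for this example, and it assumes the `docker` CLI is on the PATH of the machine where it runs. After switching to ipvlan, the custom network (br0 in this thread) would be expected to show `ipvlan` rather than `macvlan`.

```python
import subprocess


def parse_network_drivers(output):
    """Parse 'name driver' lines into a {name: driver} dict."""
    drivers = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            drivers[parts[0]] = parts[1]
    return drivers


def docker_network_drivers():
    """Ask the docker CLI for every network name and its driver."""
    result = subprocess.run(
        ["docker", "network", "ls", "--format", "{{.Name}} {{.Driver}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_network_drivers(result.stdout)


def macvlan_networks(drivers):
    """Return the names of networks still using the macvlan driver."""
    return sorted(name for name, driver in drivers.items() if driver == "macvlan")
```

On the server itself, `macvlan_networks(docker_network_drivers())` should return an empty list once no network uses macvlan any more.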

 


Looks like they added this to the Known Issues:
 

Known issues

Call traces and crashes related to macvlan

If you are getting call traces related to macvlan (or any unexplained crashes, really), as a first step we'd recommend navigating to Settings > Docker, switching to advanced view, and changing the Docker custom network type from macvlan to ipvlan. This is the default configuration that Unraid has shipped with since version 6.11.5 and should work for most systems.

Note that some users have reported issues with port forwarding from certain routers (Fritzbox) and reduced functionality with advanced network management tools (Ubiquiti) when in ipvlan mode. If this affects you, see the alternate solution available since Unraid 6.12.4.

 
