KingHawk Posted May 17

Starting last Sunday the server has been hanging/crashing quite often. At first I thought it was caused by corruption on the SSDs, so I formatted my two SSD pools. Sadly the hangs/crashes returned within 48 hours. I hope someone can help, because at this point I have no idea what's causing it.

jj-silverstone-diagnostics-20240517-0817.zip
syslog-192.168.2.110.log
JorgeB Posted May 17

Nothing relevant logged that I can see. Any changes made on Sunday, hardware or software?
KingHawk Posted May 17

I've been working a lot on my setup recently, but that's mainly in the Docker containers. The only Unraid setting I remember changing is enabling 'Host access to custom networks' in the Docker settings. Any other settings I changed were network related, and I thought I had reverted all of those changes because they didn't help the issue at the time.

I also asked ChatGPT-4o to analyze the logs, and it only made these recommendations:

### Recommendations
1. **Network Configuration:**
   - Address the network configuration issues, particularly the conflict with Docker networks (e.g., `network with name br0 already exists`).
2. **Resource Monitoring:**
   - Monitor resource usage during high Docker activity to identify potential bottlenecks or resource exhaustion.
3. **System Cooling:**
   - Ensure adequate cooling for drives, especially those that have reached high temperatures.

-----------------------------

Could the br0 conflict cause major hangs/crashes like this? If this is a conflict, I don't know how I created it.

PROTOCOL | ROUTE          | GATEWAY
---------|----------------|---------------------------
IPv4     | default        | 192.168.2.254 via shim-br0
IPv4     | default        | 192.168.2.254 via br0
IPv4     | 172.17.0.0/16  | docker0
IPv4     | 192.168.2.0/24 | shim-br0
IPv4     | 192.168.2.0/24 | br0

High temperatures have been an issue on my server at times, but I doubt that would happen in the middle of the night. New fans are arriving today, and I'm hoping to get temps down. It's also a shame I can't see AMD temps on Unraid, but I assume a CPU temperature issue shouldn't hang the system like this.
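As a side note, the duplication in that table can be spotted directly from the command line. This is a minimal sketch that counts IPv4 default routes; it uses the table from this post verbatim so the check is reproducible, but on a live system you would feed it the output of `ip -4 route` instead:

```shell
#!/bin/sh
# Sample data copied from the routing table above; on a real box use:
#   routes=$(ip -4 route)
routes='default via 192.168.2.254 dev shim-br0
default via 192.168.2.254 dev br0
172.17.0.0/16 dev docker0
192.168.2.0/24 dev shim-br0
192.168.2.0/24 dev br0'

# More than one default route means two interfaces (here shim-br0 and br0)
# both claim the same gateway, which matches the conflict described above.
defaults=$(printf '%s\n' "$routes" | grep -c '^default')
echo "default routes: $defaults"
```

With the table above this reports two default routes, one via shim-br0 and one via br0.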
JorgeB Posted May 17

You do have macvlan with bridging enabled, and that's a known issue, though it usually leaves related call traces in the log. Start by changing the Docker network type to ipvlan (or disable bridging if you really need macvlan) and retest. If the problem persists, I would retest with the Docker service disabled.
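For reference, on Unraid this switch is made in Settings > Docker (set "Docker custom network type" to ipvlan with the service stopped). On a plain Docker host, the equivalent change would look roughly like the sketch below; treat it as an illustration only, since the subnet, gateway, and `eth0` parent are taken from the routes quoted earlier in this thread, not from your actual config:

```shell
# Illustrative only: recreate the custom network with the ipvlan driver
# instead of macvlan. Values below are assumptions based on this thread.
docker network rm br0
docker network create -d ipvlan \
  --subnet=192.168.2.0/24 \
  --gateway=192.168.2.254 \
  -o parent=eth0 \
  br0
```

Containers attached to br0 keep their static IPs; the difference is that ipvlan shares the parent interface's MAC address, which avoids the macvlan/bridge conntrack issue.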
KingHawk Posted May 17 (edited)

If you think that's worth a shot, I can try it after I figure out how it affects my whole Docker network setup. Currently everything runs via bridge, except for adguard-home, which uses br0.

Edited May 17 by KingHawk
KingHawk Posted May 17

Guess I'll also update the BIOS (PRIME B650M-A WIFI II) to version 2613 when installing the new fans; it can't possibly make things worse... My current BIOS is already fairly recent, because I needed a fix for a BIOS bug that caused the NVMe drive to be dropped.
KingHawk Posted May 17 (edited)

A kernel warning appeared in my log, is this problematic?

EDIT: The fix for this is using ipvlan like you suggested isn't it? 😅

May 17 12:17:35 JJ-SILVERSTONE kernel: ------------[ cut here ]------------
May 17 12:17:35 JJ-SILVERSTONE kernel: WARNING: CPU: 1 PID: 8172 at net/netfilter/nf_conntrack_core.c:1210 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
May 17 12:17:35 JJ-SILVERSTONE kernel: Modules linked in: xt_connmark xt_mark iptable_mangle xt_comment iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha nvidia_uvm(PO) macvlan veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod tcp_diag inet_diag ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc bonding tls ixgbe xfrm_algo mdio zfs(PO) edac_mce_amd nvidia_drm(PO) edac_core intel_rapl_msr intel_rapl_common nvidia_modeset(PO) zunicode(PO) iosf_mbi zzstd(O) zlua(O) zavl(PO) amdgpu kvm_amd icp(PO) kvm nvidia(PO) gpu_sched drm_buddy i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel drm_display_helper ghash_clmulni_intel sha512_ssse3 sha256_ssse3 zcommon(PO) sha1_ssse3 drm_kms_helper aesni_intel znvpair(PO) crypto_simd spl(O)
May 17 12:17:35 JJ-SILVERSTONE kernel: cryptd drm agpgart i2c_piix4 nvme wmi_bmof ahci mpt3sas rapl k10temp i2c_core nvme_core ccp libahci raid_class scsi_transport_sas syscopyarea sysfillrect sysimgblt fb_sys_fops tpm_crb video tpm_tis tpm_tis_core wmi tpm backlight acpi_cpufreq button unix [last unloaded: xfrm_algo]
May 17 12:17:35 JJ-SILVERSTONE kernel: CPU: 1 PID: 8172 Comm: kworker/u32:2 Tainted: P O 6.1.79-Unraid #1
May 17 12:17:35 JJ-SILVERSTONE kernel: Hardware name: ASUS System Product Name/PRIME B650M-A WIFI, BIOS 2412 01/26/2024
May 17 12:17:35 JJ-SILVERSTONE kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
May 17 12:17:35 JJ-SILVERSTONE kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
May 17 12:17:35 JJ-SILVERSTONE kernel: Code: 44 24 10 e8 e2 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 7e e6 ff ff 84 c0 75 a2 48 89 df e8 9b e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 18 dd ff ff e8 93 e3 ff ff e9 72 01
May 17 12:17:35 JJ-SILVERSTONE kernel: RSP: 0018:ffffc90000234d98 EFLAGS: 00010202
May 17 12:17:35 JJ-SILVERSTONE kernel: RAX: 0000000000000001 RBX: ffff88859a1a3700 RCX: 3fea9838c99fde63
May 17 12:17:35 JJ-SILVERSTONE kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88859a1a3700
May 17 12:17:35 JJ-SILVERSTONE kernel: RBP: 0000000000000001 R08: 6bb951648c64c4aa R09: e42219b161685142
May 17 12:17:35 JJ-SILVERSTONE kernel: R10: 5d8dc8a14068ad07 R11: ffffc90000234d60 R12: ffffffff82a16f40
May 17 12:17:35 JJ-SILVERSTONE kernel: R13: 000000000001b8fa R14: ffff8881fce2ed00 R15: 0000000000000000
May 17 12:17:35 JJ-SILVERSTONE kernel: FS: 0000000000000000(0000) GS:ffff88901e240000(0000) knlGS:0000000000000000
May 17 12:17:35 JJ-SILVERSTONE kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 17 12:17:35 JJ-SILVERSTONE kernel: CR2: 0000152604cfb000 CR3: 000000000420a000 CR4: 0000000000750ee0
May 17 12:17:35 JJ-SILVERSTONE kernel: PKRU: 55555554
May 17 12:17:35 JJ-SILVERSTONE kernel: Call Trace:
May 17 12:17:35 JJ-SILVERSTONE kernel: <IRQ>
May 17 12:17:35 JJ-SILVERSTONE kernel: ? __warn+0xab/0x122
May 17 12:17:35 JJ-SILVERSTONE kernel: ? report_bug+0x109/0x17e
May 17 12:17:35 JJ-SILVERSTONE kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
May 17 12:17:35 JJ-SILVERSTONE kernel: ? handle_bug+0x41/0x6f
May 17 12:17:35 JJ-SILVERSTONE kernel: ? exc_invalid_op+0x13/0x60
May 17 12:17:35 JJ-SILVERSTONE kernel: ? asm_exc_invalid_op+0x16/0x20
May 17 12:17:35 JJ-SILVERSTONE kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
May 17 12:17:35 JJ-SILVERSTONE kernel: ? __nf_conntrack_confirm+0x9e/0x2b0 [nf_conntrack]
May 17 12:17:35 JJ-SILVERSTONE kernel: ? nf_nat_inet_fn+0x60/0x1a8 [nf_nat]
May 17 12:17:35 JJ-SILVERSTONE kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
May 17 12:17:35 JJ-SILVERSTONE kernel: nf_hook_slow+0x3d/0x96
May 17 12:17:35 JJ-SILVERSTONE kernel: ? ip_protocol_deliver_rcu+0x164/0x164
May 17 12:17:35 JJ-SILVERSTONE kernel: NF_HOOK.constprop.0+0x79/0xd9
May 17 12:17:35 JJ-SILVERSTONE kernel: ? ip_protocol_deliver_rcu+0x164/0x164
May 17 12:17:35 JJ-SILVERSTONE kernel: __netif_receive_skb_one_core+0x77/0x9c
May 17 12:17:35 JJ-SILVERSTONE kernel: process_backlog+0x8c/0x116
May 17 12:17:35 JJ-SILVERSTONE kernel: __napi_poll.constprop.0+0x2b/0x124
May 17 12:17:35 JJ-SILVERSTONE kernel: net_rx_action+0x159/0x24f
May 17 12:17:35 JJ-SILVERSTONE kernel: __do_softirq+0x129/0x288
May 17 12:17:35 JJ-SILVERSTONE kernel: do_softirq+0x7f/0xab
May 17 12:17:35 JJ-SILVERSTONE kernel: </IRQ>
May 17 12:17:35 JJ-SILVERSTONE kernel: <TASK>
May 17 12:17:35 JJ-SILVERSTONE kernel: __local_bh_enable_ip+0x4c/0x6b
May 17 12:17:35 JJ-SILVERSTONE kernel: netif_rx+0x52/0x5a
May 17 12:17:35 JJ-SILVERSTONE kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
May 17 12:17:35 JJ-SILVERSTONE kernel: ? _raw_spin_unlock+0x14/0x29
May 17 12:17:35 JJ-SILVERSTONE kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
May 17 12:17:35 JJ-SILVERSTONE kernel: process_one_work+0x1ab/0x295
May 17 12:17:35 JJ-SILVERSTONE kernel: worker_thread+0x18b/0x244
May 17 12:17:35 JJ-SILVERSTONE kernel: ? rescuer_thread+0x281/0x281
May 17 12:17:35 JJ-SILVERSTONE kernel: kthread+0xe7/0xef
May 17 12:17:35 JJ-SILVERSTONE kernel: ? kthread_complete_and_exit+0x1b/0x1b
May 17 12:17:35 JJ-SILVERSTONE kernel: ret_from_fork+0x22/0x30
May 17 12:17:35 JJ-SILVERSTONE kernel: </TASK>
May 17 12:17:35 JJ-SILVERSTONE kernel: ---[ end trace 0000000000000000 ]---

Edited May 17 by KingHawk
JorgeB Posted May 18 (Solution)

11 hours ago, KingHawk said:
The fix for this is using ipvlan like you suggested isn't it? 😅

Yep, or disable bridging for eth0; containers should keep working as they are.