KingHawk

Members
  • Posts: 14
  • Joined

  • Last visited

KingHawk's Achievements

Noob (1/14)

4

Reputation

  1. A kernel warning appeared in my log, is this problematic? EDIT: The fix for this is using ipvlan like you suggested, isn't it? 😅

     May 17 12:17:35 JJ-SILVERSTONE kernel: ------------[ cut here ]------------
     May 17 12:17:35 JJ-SILVERSTONE kernel: WARNING: CPU: 1 PID: 8172 at net/netfilter/nf_conntrack_core.c:1210 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
     May 17 12:17:35 JJ-SILVERSTONE kernel: Modules linked in: xt_connmark xt_mark iptable_mangle xt_comment iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha nvidia_uvm(PO) macvlan veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod tcp_diag inet_diag ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc bonding tls ixgbe xfrm_algo mdio zfs(PO) edac_mce_amd nvidia_drm(PO) edac_core intel_rapl_msr intel_rapl_common nvidia_modeset(PO) zunicode(PO) iosf_mbi zzstd(O) zlua(O) zavl(PO) amdgpu kvm_amd icp(PO) kvm nvidia(PO) gpu_sched drm_buddy i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel drm_display_helper ghash_clmulni_intel sha512_ssse3 sha256_ssse3 zcommon(PO) sha1_ssse3 drm_kms_helper aesni_intel znvpair(PO) crypto_simd spl(O)
     May 17 12:17:35 JJ-SILVERSTONE kernel: cryptd drm agpgart i2c_piix4 nvme wmi_bmof ahci mpt3sas rapl k10temp i2c_core nvme_core ccp libahci raid_class scsi_transport_sas syscopyarea sysfillrect sysimgblt fb_sys_fops tpm_crb video tpm_tis tpm_tis_core wmi tpm backlight acpi_cpufreq button unix [last unloaded: xfrm_algo]
     May 17 12:17:35 JJ-SILVERSTONE kernel: CPU: 1 PID: 8172 Comm: kworker/u32:2 Tainted: P O 6.1.79-Unraid #1
     May 17 12:17:35 JJ-SILVERSTONE kernel: Hardware name: ASUS System Product Name/PRIME B650M-A WIFI, BIOS 2412 01/26/2024
     May 17 12:17:35 JJ-SILVERSTONE kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
     May 17 12:17:35 JJ-SILVERSTONE kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
     May 17 12:17:35 JJ-SILVERSTONE kernel: Code: 44 24 10 e8 e2 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 7e e6 ff ff 84 c0 75 a2 48 89 df e8 9b e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 18 dd ff ff e8 93 e3 ff ff e9 72 01
     May 17 12:17:35 JJ-SILVERSTONE kernel: RSP: 0018:ffffc90000234d98 EFLAGS: 00010202
     May 17 12:17:35 JJ-SILVERSTONE kernel: RAX: 0000000000000001 RBX: ffff88859a1a3700 RCX: 3fea9838c99fde63
     May 17 12:17:35 JJ-SILVERSTONE kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88859a1a3700
     May 17 12:17:35 JJ-SILVERSTONE kernel: RBP: 0000000000000001 R08: 6bb951648c64c4aa R09: e42219b161685142
     May 17 12:17:35 JJ-SILVERSTONE kernel: R10: 5d8dc8a14068ad07 R11: ffffc90000234d60 R12: ffffffff82a16f40
     May 17 12:17:35 JJ-SILVERSTONE kernel: R13: 000000000001b8fa R14: ffff8881fce2ed00 R15: 0000000000000000
     May 17 12:17:35 JJ-SILVERSTONE kernel: FS: 0000000000000000(0000) GS:ffff88901e240000(0000) knlGS:0000000000000000
     May 17 12:17:35 JJ-SILVERSTONE kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     May 17 12:17:35 JJ-SILVERSTONE kernel: CR2: 0000152604cfb000 CR3: 000000000420a000 CR4: 0000000000750ee0
     May 17 12:17:35 JJ-SILVERSTONE kernel: PKRU: 55555554
     May 17 12:17:35 JJ-SILVERSTONE kernel: Call Trace:
     May 17 12:17:35 JJ-SILVERSTONE kernel: <IRQ>
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? __warn+0xab/0x122
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? report_bug+0x109/0x17e
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? handle_bug+0x41/0x6f
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? exc_invalid_op+0x13/0x60
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? asm_exc_invalid_op+0x16/0x20
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? __nf_conntrack_confirm+0x9e/0x2b0 [nf_conntrack]
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? nf_nat_inet_fn+0x60/0x1a8 [nf_nat]
     May 17 12:17:35 JJ-SILVERSTONE kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
     May 17 12:17:35 JJ-SILVERSTONE kernel: nf_hook_slow+0x3d/0x96
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? ip_protocol_deliver_rcu+0x164/0x164
     May 17 12:17:35 JJ-SILVERSTONE kernel: NF_HOOK.constprop.0+0x79/0xd9
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? ip_protocol_deliver_rcu+0x164/0x164
     May 17 12:17:35 JJ-SILVERSTONE kernel: __netif_receive_skb_one_core+0x77/0x9c
     May 17 12:17:35 JJ-SILVERSTONE kernel: process_backlog+0x8c/0x116
     May 17 12:17:35 JJ-SILVERSTONE kernel: __napi_poll.constprop.0+0x2b/0x124
     May 17 12:17:35 JJ-SILVERSTONE kernel: net_rx_action+0x159/0x24f
     May 17 12:17:35 JJ-SILVERSTONE kernel: __do_softirq+0x129/0x288
     May 17 12:17:35 JJ-SILVERSTONE kernel: do_softirq+0x7f/0xab
     May 17 12:17:35 JJ-SILVERSTONE kernel: </IRQ>
     May 17 12:17:35 JJ-SILVERSTONE kernel: <TASK>
     May 17 12:17:35 JJ-SILVERSTONE kernel: __local_bh_enable_ip+0x4c/0x6b
     May 17 12:17:35 JJ-SILVERSTONE kernel: netif_rx+0x52/0x5a
     May 17 12:17:35 JJ-SILVERSTONE kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? _raw_spin_unlock+0x14/0x29
     May 17 12:17:35 JJ-SILVERSTONE kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
     May 17 12:17:35 JJ-SILVERSTONE kernel: process_one_work+0x1ab/0x295
     May 17 12:17:35 JJ-SILVERSTONE kernel: worker_thread+0x18b/0x244
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? rescuer_thread+0x281/0x281
     May 17 12:17:35 JJ-SILVERSTONE kernel: kthread+0xe7/0xef
     May 17 12:17:35 JJ-SILVERSTONE kernel: ? kthread_complete_and_exit+0x1b/0x1b
     May 17 12:17:35 JJ-SILVERSTONE kernel: ret_from_fork+0x22/0x30
     May 17 12:17:35 JJ-SILVERSTONE kernel: </TASK>
     May 17 12:17:35 JJ-SILVERSTONE kernel: ---[ end trace 0000000000000000 ]---
  2. Guess I'll also update the BIOS to (PRIME B650M-A WIFI II) BIOS 2613 when installing the new fans, it can't possibly make things worse... My current BIOS is already pretty recent because I needed a fix for the NVME drive being dropped by a bug in the BIOS.
  3. If you think that's worth a shot I can try that after I figure out how this affects my whole docker network setup. Currently everything works via bridge except for adguard-home, which uses br0.
  4. I've been working a lot on my setup recently, but that's mainly in the dockers. The only Unraid setting I remember changing is enabling 'Host access to custom networks' in the Docker settings. Any other settings I changed were network related, and I thought I had reverted all of those changes because they didn't help with the issue at the time.

     I also asked ChatGPT4o to analyze it, and it only made these recommendations:

     ### Recommendations
     1. **Network Configuration:**
        - Address the network configuration issues, particularly the conflict with Docker networks (e.g., `network with name br0 already exists`).
     2. **Resource Monitoring:**
        - Monitor resource usage during high Docker activity to identify potential bottlenecks or resource exhaustion.
     3. **System Cooling:**
        - Ensure adequate cooling for drives, especially those that have reached high temperatures.
     -----------------------------

     Could the br0 conflict cause any major hangs/crashes like this? If this is a conflict, I don't know how I caused it.

     PROTOCOL | ROUTE          | GATEWAY
     ---------|----------------|---------------------------
     IPv4     | default        | 192.168.2.254 via shim-br0
     IPv4     | default        | 192.168.2.254 via br0
     IPv4     | 172.17.0.0/16  | docker0
     IPv4     | 192.168.2.0/24 | shim-br0
     IPv4     | 192.168.2.0/24 | br0

     High temperatures have sometimes been a thing on my server, but I doubt that would happen in the middle of the night. New fans are coming today, so I'm hoping to get temps down. It's also too bad I can't see AMD temps on Unraid, but I assume a CPU temp issue shouldn't hang the system like this.
  5. Starting last Sunday the server hangs/crashes quite often. At first I thought it was caused by corruption on the SSDs, so I formatted my 2 SSD pools. Sadly the hangs/crashes returned within 48 hours. I hope someone can help because at this point I have no idea what's causing it. jj-silverstone-diagnostics-20240517-0817.zip syslog-192.168.2.110.log
  6. Yes it is but I didn't mark it or add additional info to this post because I thought I deleted it 😮
  7. Thanks for the tips! Fixed the issue by upgrading from Corsair SF600 to SF750. Have been 8 days without issues now.
  8. Hi again ich77, I'm having some trouble with my NVidia card that first worked properly for a few weeks after you helped me get it working. Sadly now it keeps falling off the bus, I've made some changes to the BIOS and realloc but it didn't improve the situation... Do you have any ideas on how I could fix this, things I could try or is the GPU itself the problem?

     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: GPU at PCI:0000:01:00: GPU-33d616df-a0e8-4c9a-3c11-0cad75613c6e
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: A GPU crash dump has been created. If possible, please run
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: nvidia-bug-report.sh as root to collect this data before
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: the NVIDIA kernel module is unloaded.

     I've also documented some of the things I tried to fix it on this topic: jj-silverstone-diagnostics-20240306-2257.zip
  9. Sadly still having trouble with the GPU falling off the bus after working normally for 22 hours.

     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: GPU at PCI:0000:01:00: GPU-33d616df-a0e8-4c9a-3c11-0cad75613c6e
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: A GPU crash dump has been created. If possible, please run
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: nvidia-bug-report.sh as root to collect this data before
     Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: the NVIDIA kernel module is unloaded.
  10. I've managed to fix the 'BAR failed to assign' warnings with the following settings in BIOS:
      1) Above 4G Decoding - ENABLED
      2) Re-Size BAR Support - ENABLED
      3) SR-IOV Support - ENABLED
      4) Hot-Plug Support - DISABLED

      I've also added 'pci=realloc=off' to '/boot/syslinux/syslinux.cfg' as described here:

      So now I'll just wait and see if everything stays stable with this setup.
  11. I've had many problems this week so I'm looking at what to do now. Short summary of what happened:
      - Wednesday eve: sudden crash of unraid (no logs);
      - Thursday morning: raid-1 SSD pool shows corruption errors and GPU falls off the bus for the first time (rebooted as a quick fix);
      - Thursday afternoon: again SSD pool corruption errors and GPU falling off the bus;
      - Thursday eve: updated unraid from .6 to .8, no immediate errors; after that I ran a scrub with error fix but this didn't find any errors;
      - Friday morning: GPU falling off the bus, no corruption errors.

      I've just rebooted hoping the GPU would show up again but it didn't. I got a lot of errors like these (are they related?):

      Mar 1 14:57:30 JJ-SILVERSTONE kernel: pci 0000:06:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
      Mar 1 14:57:30 JJ-SILVERSTONE kernel: pci 0000:06:00.1: BAR 6: no space for [mem size 0x00080000 pref]
      Mar 1 14:57:30 JJ-SILVERSTONE kernel: pci 0000:06:00.1: BAR 6: failed to assign [mem size 0x00080000 pref]
      Mar 1 14:57:30 JJ-SILVERSTONE kernel: pci 0000:06:00.0: BAR 7: no space for [mem size 0x00100000 64bit]
      Mar 1 14:57:30 JJ-SILVERSTONE kernel: pci 0000:06:00.0: BAR 7: failed to assign [mem size 0x00100000 64bit]

      Everything was running fine after switching to the new GPU a while ago, so I wonder what changed to make the GPU start falling off the bus and even stop showing up completely. Looking forward to any help in fixing these issues. jj-silverstone-diagnostics-20240301-1507.zip
  12. I didn't know about that, thanks! I will try this tomorrow morning. EDIT: Successfully updated everything and disabled the script. New GPU successfully installed.
  13. I'm having some trouble getting my new RTX 4070 Ti Super working on unraid. When I install the new gpu the Nvidia Driver plugin shows [Unknown Error] under 'Installed GPU(s):' The latest version of the NVIDIA driver (in the plugin) is installed and this worked perfectly fine with the old gpu GTX 1050 Ti. Diagnostics attached, hope it helps. jj-silverstone-diagnostics-20240207-1346.zip
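
Several of the posts above quote recurring kernel messages (the NVIDIA Xid 79 "fallen off the bus" line, the PCI "BAR ... failed to assign" lines, and the `__nf_conntrack_confirm` warning). As a rough sketch only, a saved syslog could be scanned for those signatures along these lines. The `SIGNATURES` labels and the `scan_syslog` helper are hypothetical names made up for this example, and the patterns are derived only from the excerpts quoted here, so they may miss variants of the same messages.

```python
import re
from collections import Counter

# Patterns derived from the log excerpts quoted in the posts above.
# The keys are illustrative labels chosen for this sketch, not kernel terms.
SIGNATURES = {
    "gpu_fell_off_bus": re.compile(r"NVRM: Xid \(PCI:[0-9a-fA-F:.]+\): 79,"),
    "bar_assign_failed": re.compile(r"BAR \d+: failed to assign"),
    "conntrack_warning": re.compile(r"WARNING: .*__nf_conntrack_confirm"),
}


def scan_syslog(lines):
    """Count how many lines match each signature in SIGNATURES."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# Sample lines copied from the posts above.
sample = [
    "Mar 6 22:09:42 JJ-SILVERSTONE kernel: NVRM: Xid (PCI:0000:01:00): 79, "
    "pid='<unknown>', name=<unknown>, GPU has fallen off the bus.",
    "Mar 1 14:57:30 JJ-SILVERSTONE kernel: pci 0000:06:00.1: BAR 6: "
    "no space for [mem size 0x00080000 pref]",
    "Mar 1 14:57:30 JJ-SILVERSTONE kernel: pci 0000:06:00.1: BAR 6: "
    "failed to assign [mem size 0x00080000 pref]",
    "May 17 12:17:35 JJ-SILVERSTONE kernel: WARNING: CPU: 1 PID: 8172 at "
    "net/netfilter/nf_conntrack_core.c:1210 "
    "__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]",
]

counts = scan_syslog(sample)
for label in SIGNATURES:
    print(label, counts[label])
```

Because `scan_syslog` accepts any iterable of lines, an open file object (for example the `syslog-192.168.2.110.log` attachment mentioned above, opened with `errors="replace"`) can be passed in directly. Counting matches per signature only shows how often each symptom occurs; it does not by itself say whether the symptoms are related.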