July 25, 2023
Something changed and my system started crashing about 4 weeks ago. I have tried updating all software and it is still crashing. I'm not an expert, but any help would be appreciated. Here are my diagnostic logs. No new hardware has been added to my system. Note that the crashing started on 6.11.5, after running for over 6 months with no issues. I have now updated to 6.12.3 and am still having crashes every few days. Thanks in advance for your help. tower-diagnostics-20230717-2027.zip
Edited July 25, 2023 by cyberdac
July 31, 2023 Community Expert
Make sure this is done, and if it still crashes, enable the syslog server and post that after a crash.
August 1, 2023 Author
Thanks for the suggestions. I have turned off power save mode and turned off C-States. We'll see how it works. Thx, Dave
August 9, 2023 Author
I made the listed BIOS updates, but the system continued to crash within 48 hrs. So I had to replace the MB with an older Intel system. I will continue to debug the MB with Ubuntu 23.04.
August 13, 2023 Author
So my crashing, or hanging as I see it, continues. Now it is happening within 24 hrs of a reboot. Here are the things I've tried:
- New PSU
- Different MB (since put back to the original)
- Updated firmware on LSISAS2008: FWVersion(17.00.01.00), ChipRevision(0x03), BiosVersion(07.33.00.00)
- New firmware on LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
Here is my syslog. Any help is appreciated.
syslog
August 14, 2023 Author
The crashes started weeks before upgrading to 6.12.xx. Even after the upgrade to 6.12.xx, the crashes have persisted.
August 14, 2023 Author
Here might be a clue. I just rebooted from a freeze and started a move from my cache, as it was full from some files downloaded earlier today. Soon after, I got this error, taken from the syslog:
Aug 13 20:06:12 Tower kernel: ---[ end trace 0000000000000000 ]---
Aug 13 20:06:12 Tower kernel: BTRFS: error (device loop2: state A) in btrfs_run_delayed_refs:2144: errno=-117 Filesystem corrupted
Aug 13 20:06:12 Tower kernel: BTRFS info (device loop2: state EA): forced readonly
Aug 13 20:08:01 Tower kernel: ------------[ cut here ]------------
Aug 13 20:08:01 Tower kernel: WARNING: CPU: 8 PID: 977 at net/netfilter/nf_conntrack_core.c:1210 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
Aug 13 20:08:01 Tower kernel: Modules linked in: af_packet bluetooth ecdh_generic ecc macvlan veth xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc igb alx mdio amdgpu edac_mce_amd edac_core intel_rapl_msr intel_rapl_common iosf_mbi kvm_amd kvm gpu_sched drm_buddy video drm_ttm_helper ttm drm_display_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
Aug 13 20:08:01 Tower kernel: sha512_ssse3 drm_kms_helper aesni_intel crypto_simd gigabyte_wmi wmi_bmof mxm_wmi drm cryptd mpt3sas tpm_crb backlight agpgart syscopyarea i2c_piix4 tpm_tis sysfillrect i2c_algo_bit ahci tpm_tis_core rapl input_leds raid_class sysimgblt led_class k10temp ccp i2c_core scsi_transport_sas fb_sys_fops libahci thermal tpm wmi button acpi_cpufreq unix [last unloaded: igb]
Aug 13 20:08:01 Tower kernel: CPU: 8 PID: 977 Comm: kworker/u64:10 Tainted: P W O 6.1.38-Unraid #2
Aug 13 20:08:01 Tower kernel: Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming 5/AX370-Gaming 5, BIOS F51h 02/09/2023
Aug 13 20:08:01 Tower kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
Aug 13 20:08:01 Tower kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
Aug 13 20:08:01 Tower kernel: Code: 44 24 10 e8 e2 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 7e e6 ff ff 84 c0 75 a2 48 89 df e8 9b e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 18 dd ff ff e8 93 e3 ff ff e9 72 01
Aug 13 20:08:01 Tower kernel: RSP: 0018:ffffc90000408d98 EFLAGS: 00010202
Aug 13 20:08:01 Tower kernel: RAX: 0000000000000001 RBX: ffff8882d7e4d000 RCX: ed87fee5f359cae6
Aug 13 20:08:01 Tower kernel: RDX: 0000000000000000 RSI: 00000000000001fd RDI: ffff8882d7e4d000
Aug 13 20:08:01 Tower kernel: RBP: 0000000000000001 R08: 5ad6a1ff9ed79af4 R09: 556e5e42ab8ffcc6
Aug 13 20:08:01 Tower kernel: R10: 2816518b476e31c1 R11: ffffc90000408d60 R12: ffffffff82a11d00
Aug 13 20:08:01 Tower kernel: R13: 000000000001398c R14: ffff8883cadb9c00 R15: 0000000000000000
Aug 13 20:08:01 Tower kernel: FS: 0000000000000000(0000) GS:ffff88842ea00000(0000) knlGS:0000000000000000
Aug 13 20:08:01 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 13 20:08:01 Tower kernel: CR2: 00001457f76d9f58 CR3: 000000018d3ea000 CR4: 00000000003506e0
Aug 13 20:08:01 Tower kernel: Call Trace:
Aug 13 20:08:01 Tower kernel: <IRQ>
Aug 13 20:08:01 Tower kernel: ? __warn+0xab/0x122
August 14, 2023 Community Expert
6 hours ago, cyberdac said:
Aug 13 20:08:01 Tower kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
Aug 13 20:08:01 Tower kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
Switching to ipvlan should fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan; advanced view must be enabled, top right).
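For anyone triaging a similar syslog, here is a minimal sketch (not from the posters) that counts the log signatures quoted in this thread: the BTRFS corruption/read-only messages, the nf_conntrack warning, and the macvlan_process_broadcast workqueue line. The names `SIGNATURES`, `scan_syslog`, and `scan_file` are hypothetical, and the patterns are assumptions based only on the lines shown above, so other kernels or versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns derived from the log lines quoted in this thread (assumption:
# they may not match how other kernel versions phrase these messages).
SIGNATURES = {
    "btrfs_corruption": re.compile(r"BTRFS.*(errno=-117|forced readonly)"),
    "conntrack_warning": re.compile(r"WARNING: .*nf_conntrack"),
    "macvlan_trace": re.compile(r"macvlan_process_broadcast"),
}


def scan_syslog(lines):
    """Return a Counter of how many lines match each signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


def scan_file(path):
    """Scan a saved syslog file; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as fh:
        return scan_syslog(fh)
```

Calling `scan_file` on a saved syslog gives a rough count per signature. It only helps decide where to look first (filesystem versus Docker networking) and does not diagnose the cause.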
August 15, 2023 Author
I think I might have solved the mystery. It's still a little early, but everything was pointing to a bad cache drive. So this morning I changed out the cache drive, after spending the whole night moving files off the old one. It has now been 13 hrs with no crashes. This is the longest stretch I have gone in over a week. And thanks to JorgeB for the ipvlan tip 🙂
August 16, 2023 Author
It has now been 48 hrs without a crash or hang. This issue is closed and solved with a new cache drive. The old drive most likely just needed a fresh format, but I replaced it with a new drive instead. Thanks for all the help given. Long term I will back out some of the BIOS changes made while debugging, but I'm very happy to be past this nasty issue. 🙂