reknew Posted August 28, 2023 (edited)

tower-diagnostics-20230828-1259.zip

After running for about half a month, the server froze. The router showed Unraid as offline and a monitor connected over HDMI showed nothing, so I had to force a reboot. I'm very worried about my hard drives 😭 The last log entries before it went unreachable are below. Could someone take a look and tell me what the problem is?

Aug 27 13:17:08 tower webGUI: Successful login user root from 192.168.31.173
Aug 27 13:21:13 tower kernel: docker0: port 2(veth70c60bf) entered blocking state
Aug 27 13:21:13 tower kernel: docker0: port 2(veth70c60bf) entered disabled state
Aug 27 13:21:13 tower kernel: device veth70c60bf entered promiscuous mode
Aug 27 13:21:14 tower kernel: eth0: renamed from veth9b721fc
Aug 27 13:21:14 tower kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Aug 27 13:21:14 tower kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth70c60bf: link becomes ready
Aug 27 13:21:14 tower kernel: docker0: port 2(veth70c60bf) entered blocking state
Aug 27 13:21:14 tower kernel: docker0: port 2(veth70c60bf) entered forwarding state
Aug 27 13:22:46 tower webGUI: Successful login user root from 192.168.31.173
Aug 27 13:24:35 tower kernel: general protection fault, probably for non-canonical address 0xffff088204573860: 0000 [#1] PREEMPT SMP NOPTI
Aug 27 13:24:35 tower kernel: CPU: 7 PID: 9040 Comm: dockerd Tainted: P U O 6.1.36-Unraid #1
Aug 27 13:24:35 tower kernel: Hardware name: Maxsun MS-TZZ H610ITX 2.5G/MS-TZZ H610ITX 2.5G, BIOS 5.27 03/31/2023
Aug 27 13:24:35 tower kernel: RIP: 0010:evict+0x7d/0x150
Aug 27 13:24:35 tower kernel: Code: 48 8d ab 10 01 00 00 48 39 c5 74 43 48 8b 43 28 48 8d b8 40 05 00 00 e8 4c fe 61 00 48 8b 83 18 01 00 00 48 8b 93 10 01 00 00 <48> 89 42 08 48 89 10 48 8b 43 28 48 89 ab 10 01 00 00 48 89 ab 18
Aug 27 13:24:35 tower kernel: RSP: 0018:ffffc90000807c28 EFLAGS: 00010246
Aug 27 13:24:35 tower kernel: RAX: ffff888204573858 RBX: ffff888204573748 RCX: ffffffff81e41ca0
Aug 27 13:24:35 tower kernel: RDX: ffff088204573858 RSI: ffff8882045737c8 RDI: ffff88813e68e540
Aug 27 13:24:35 tower kernel: RBP: ffff888204573858 R08: ffff88815fec9b98 R09: ffffffff813b856c
Aug 27 13:24:35 tower kernel: R10: ffff888190548240 R11: 0000000000000005 R12: ffffffff81e41ca0
Aug 27 13:24:35 tower kernel: R13: ffff888117ef6718 R14: ffff888143f20000 R15: ffff8883f9bf2e68
Aug 27 13:24:35 tower kernel: FS: 00001540e1477700(0000) GS:ffff8884b09c0000(0000) knlGS:0000000000000000
Aug 27 13:24:35 tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 27 13:24:35 tower kernel: CR2: 000000c00069d360 CR3: 000000015f7a0000 CR4: 0000000000750ee0
Aug 27 13:24:35 tower kernel: PKRU: 55555554
Aug 27 13:24:35 tower kernel: Call Trace:
Aug 27 13:24:35 tower kernel: <TASK>
Aug 27 13:24:35 tower kernel: ? __die_body+0x1a/0x5c
Aug 27 13:24:35 tower kernel: ? die_addr+0x38/0x51
Aug 27 13:24:35 tower kernel: ? exc_general_protection+0x30f/0x345
Aug 27 13:24:35 tower kernel: ? asm_exc_general_protection+0x22/0x30
Aug 27 13:24:35 tower kernel: ? __clear_extent_bit+0x314/0x329
Aug 27 13:24:35 tower kernel: ? evict+0x7d/0x150
Aug 27 13:24:35 tower kernel: ? evict+0x6f/0x150
Aug 27 13:24:35 tower kernel: __dentry_kill+0xcb/0x131
Aug 27 13:24:35 tower kernel: shrink_dentry_list+0xaa/0xba
Aug 27 13:24:35 tower kernel: shrink_dcache_parent+0xf3/0x118
Aug 27 13:24:35 tower kernel: d_invalidate+0x74/0xdd
Aug 27 13:24:35 tower kernel: btrfs_delete_subvolume+0x409/0x528
Aug 27 13:24:35 tower kernel: btrfs_ioctl_snap_destroy+0x42a/0x50c
Aug 27 13:24:35 tower kernel: btrfs_ioctl+0x246/0x2883
Aug 27 13:24:35 tower kernel: ? __do_sys_newfstatat+0x35/0x5c
Aug 27 13:24:35 tower kernel: vfs_ioctl+0x1b/0x2f
Aug 27 13:24:35 tower kernel: __do_sys_ioctl+0x52/0x78
Aug 27 13:24:35 tower kernel: do_syscall_64+0x68/0x81
Aug 27 13:24:35 tower kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Aug 27 13:24:35 tower kernel: RIP: 0033:0x40468e
Aug 27 13:24:35 tower kernel: Code: 48 89 6c 24 38 48 8d 6c 24 38 e8 0d 00 00 00 48 8b 6c 24 38 48 83 c4 40 c3 cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
Aug 27 13:24:35 tower kernel: RSP: 002b:000000c000f05a28 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
Aug 27 13:24:35 tower kernel: RAX: ffffffffffffffda RBX: 00000000000000da RCX: 000000000040468e
Aug 27 13:24:35 tower kernel: RDX: 000000c000f05bb0 RSI: 000000005000940f RDI: 00000000000000da
Aug 27 13:24:35 tower kernel: RBP: 000000c000f05a68 R08: 0000000000000000 R09: 0000000000000000
Aug 27 13:24:35 tower kernel: R10: 0000000000000000 R11: 0000000000000206 R12: 000000c0014b4f10
Aug 27 13:24:35 tower kernel: R13: 0000000000000000 R14: 000000c0096a1ba0 R15: ffffffffffffffff
Aug 27 13:24:35 tower kernel: </TASK>
Aug 27 13:24:35 tower kernel: Modules linked in: xt_REDIRECT xt_mark ts_bm xt_string af_packet ccp xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat xt_addrtype br_netfilter bridge xfs xt_MASQUERADE ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag i915 iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nct6775 nct6775_core hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs 8021q garp mrp stp llc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 i2c_i801 btusb aesni_intel mei_hdcp mei_pxp i2c_smbus btrtl btbcm wmi_bmof crypto_simd
Aug 27 13:24:35 tower kernel: cryptd rapl intel_cstate intel_uncore btintel bluetooth i2c_core mei_me nvme r8169 tpm_crb nvme_core mei realtek tpm_tis tpm_tis_core video ahci input_leds ecdh_generic joydev led_class ecc libahci thermal wmi fan tpm backlight intel_pmc_core acpi_tad acpi_pad button unix
Aug 27 13:24:35 tower kernel: ---[ end trace 0000000000000000 ]---
Aug 27 13:24:35 tower kernel: RIP: 0010:evict+0x7d/0x150
Aug 27 13:24:35 tower kernel: Code: 48 8d ab 10 01 00 00 48 39 c5 74 43 48 8b 43 28 48 8d b8 40 05 00 00 e8 4c fe 61 00 48 8b 83 18 01 00 00 48 8b 93 10 01 00 00 <48> 89 42 08 48 89 10 48 8b 43 28 48 89 ab 10 01 00 00 48 89 ab 18
Aug 27 13:24:35 tower kernel: RSP: 0018:ffffc90000807c28 EFLAGS: 00010246
Aug 27 13:24:35 tower kernel: RAX: ffff888204573858 RBX: ffff888204573748 RCX: ffffffff81e41ca0
Aug 27 13:24:35 tower kernel: RDX: ffff088204573858 RSI: ffff8882045737c8 RDI: ffff88813e68e540
Aug 27 13:24:35 tower kernel: RBP: ffff888204573858 R08: ffff88815fec9b98 R09: ffffffff813b856c
Aug 27 13:24:35 tower kernel: R10: ffff888190548240 R11: 0000000000000005 R12: ffffffff81e41ca0
Aug 27 13:24:35 tower kernel: R13: ffff888117ef6718 R14: ffff888143f20000 R15: ffff8883f9bf2e68
Aug 27 13:24:35 tower kernel: FS: 00001540e1477700(0000) GS:ffff8884b09c0000(0000) knlGS:0000000000000000
Aug 27 13:24:35 tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 27 13:24:35 tower kernel: CR2: 000000c00069d360 CR3: 000000015f7a0000 CR4: 0000000000750ee0
Aug 27 13:24:35 tower kernel: PKRU: 55555554
Aug 27 13:24:35 tower kernel: note: dockerd[9040] exited with preempt_count 1
JackieWu Posted August 28, 2023 (edited) Solution

This kernel error involves the btrfs file system and memory, so it is hard to pin down the exact cause. I recommend testing your RAM first (Unraid ships with memtest86, or you can follow the testing method described here). If the memory checks out, try upgrading to version 6.12.3.

The archive you uploaded does not contain the logs from before the server went unreachable (August 27), only those from afterwards (August 28). It does, however, contain some file system errors for your cache pool:

Aug 28 07:42:27 tower kernel: XFS (nvme0n1p1): Metadata corruption detected at xfs_dinode_verify+0xa0/0x732 [xfs], inode 0x801eeefa dinode
Aug 28 07:42:27 tower kernel: XFS (nvme0n1p1): Unmount and run xfs_repair
Aug 28 07:42:27 tower kernel: XFS (nvme0n1p1): First 128 bytes of corrupted metadata buffer:
Aug 28 07:42:27 tower kernel: 00000000: 49 4e 41 ff 03 01 00 00 00 00 03 e8 00 00 00 64 INA............d
Aug 28 07:42:27 tower kernel: 00000010: 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 00 ................
Aug 28 07:42:27 tower kernel: 00000020: 35 41 ec 29 aa d6 9b 7c 35 42 3c 55 9e 8b ef f6 5A.)...|5B<U....
Aug 28 07:42:27 tower kernel: 00000030: 35 44 42 d3 93 b1 c6 ed 00 00 00 00 00 00 00 5e 5DB............^
Aug 28 07:42:27 tower kernel: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Aug 28 07:42:27 tower kernel: 00000050: 00 00 25 01 00 00 00 00 00 00 00 00 b4 e0 ec 16 ..%.............
Aug 28 07:42:27 tower kernel: 00000060: ff ff ff ff 63 4a cd c2 00 00 00 00 00 00 00 0e ....cJ..........
Aug 28 07:42:27 tower kernel: 00000070: 00 00 00 07 00 04 74 2f 00 00 00 00 00 00 00 08 ......t/........
Aug 28 07:42:27 tower kernel: XFS (nvme0n1p1): Metadata corruption detected at xfs_dinode_verify+0xa0/0x732 [xfs], inode 0x801eeefa dinode

This error is probably unrelated to the server going unreachable, but I still suggest repairing the cache pool's file system.
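One detail in the August 27 trace from the first post seems to fit the memory suspicion: RDX holds ffff088204573858 while RAX and RBP both hold ffff888204573858, and the faulting address 0xffff088204573860 is RDX + 8. The following is a minimal Python sketch of that comparison, not a diagnostic tool. It assumes 48-bit virtual addresses (4-level paging), and the helper names `is_canonical` and `flipped_bits` are made up for illustration.

```python
# Register values copied from the Aug 27 oops in the first post.
RAX = 0xFFFF888204573858
RDX = 0xFFFF088204573858
FAULT_ADDR = 0xFFFF088204573860  # address reported by the general protection fault


def is_canonical(addr: int) -> bool:
    """Assumes 48-bit virtual addresses: bits 63..47 must be all 0 or all 1."""
    top = addr >> 47  # the upper 17 bits of a 64-bit address
    return top == 0 or top == 0x1FFFF


def flipped_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(64) if (diff >> i) & 1]


print(is_canonical(RAX))         # pointer that looks like a normal kernel address
print(is_canonical(FAULT_ADDR))  # the address the kernel complained about
print(flipped_bits(RAX, RDX))    # which bits differ between the two registers
```

If the two registers differ in exactly one bit, that is the kind of pattern a faulty memory cell can produce, although a single trace cannot prove it; a full memtest run is still the real check.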
reknew (Author) Posted September 10, 2023

Testing with memtest showed that the memory really is faulty: even downclocked to 2133 it still throws occasional errors, so I have requested a replacement. I have also repaired the file system errors. Thank you!