Help! Server crashes after running for a while (5-10 days)


Go to solution Solved by anpple,


Your log contains a page fault error, which is related to memory:

 

Sep 27 05:51:16 Overse kernel: BUG: unable to handle page fault for address: ffffffff10d96206
Sep 27 05:51:16 Overse kernel: #PF: supervisor write access in kernel mode
Sep 27 05:51:16 Overse kernel: #PF: error_code(0x0002) - not-present page
Sep 27 05:51:16 Overse kernel: PGD 420e067 P4D 420e067 PUD 0 
Sep 27 05:51:16 Overse kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
Sep 27 05:51:16 Overse kernel: CPU: 6 PID: 160 Comm: kswapd0 Tainted: P     U  W  O       6.1.49-Unraid #1
Sep 27 05:51:16 Overse kernel: Hardware name: Micro-Star International Co., Ltd. MS-7C82/MAG B460M MORTAR WIFI (MS-7C82), BIOS 1.10 05/18/2020
Sep 27 05:51:16 Overse kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x152/0x1cf
Sep 27 05:51:16 Overse kernel: Code: b9 01 00 00 00 f0 0f b1 0b 74 76 eb cc c1 ee 12 83 e0 03 ff ce 48 c1 e0 05 48 63 f6 48 05 80 e1 02 00 48 03 04 f5 c0 ea 17 82 <48> 89 10 8b 42 08 85 c0 75 04 f3 90 eb f5 48 8b 32 48 85 f6 74 bc
Sep 27 05:51:16 Overse kernel: RSP: 0018:ffffc9000069faf0 EFLAGS: 00010286
Sep 27 05:51:16 Overse kernel: RAX: ffffffff10d96206 RBX: ffff888157b4aee8 RCX: 00000000001c0000
Sep 27 05:51:16 Overse kernel: RDX: ffff88901f3ae180 RSI: 0000000000003148 RDI: ffff888157b4aee8
Sep 27 05:51:16 Overse kernel: RBP: 0000000000000006 R08: 0000000000000000 R09: 000000000000018f
Sep 27 05:51:16 Overse kernel: R10: ffff88868948b800 R11: 0000000000000000 R12: ffff88901f3ae180
Sep 27 05:51:16 Overse kernel: R13: 0000000000000000 R14: ffff888157b4ae90 R15: 0000000000000000
Sep 27 05:51:16 Overse kernel: FS:  0000000000000000(0000) GS:ffff88901f380000(0000) knlGS:0000000000000000
Sep 27 05:51:16 Overse kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 27 05:51:16 Overse kernel: CR2: ffffffff10d96206 CR3: 0000000154ffe003 CR4: 00000000007706e0
Sep 27 05:51:16 Overse kernel: PKRU: 55555554
Sep 27 05:51:16 Overse kernel: Call Trace:
Sep 27 05:51:16 Overse kernel: <TASK>
Sep 27 05:51:16 Overse kernel: ? __die_body+0x1a/0x5c
Sep 27 05:51:16 Overse kernel: ? page_fault_oops+0x329/0x376
Sep 27 05:51:16 Overse kernel: ? fixup_exception+0x22/0x24b
Sep 27 05:51:16 Overse kernel: ? exc_page_fault+0xf4/0x11d
Sep 27 05:51:16 Overse kernel: ? asm_exc_page_fault+0x22/0x30
Sep 27 05:51:16 Overse kernel: ? native_queued_spin_lock_slowpath+0x152/0x1cf
Sep 27 05:51:16 Overse kernel: do_raw_spin_lock+0x14/0x1a
Sep 27 05:51:16 Overse kernel: shrink_lock_dentry+0xa1/0xea
Sep 27 05:51:16 Overse kernel: shrink_dentry_list+0x3d/0xba
Sep 27 05:51:16 Overse kernel: prune_dcache_sb+0x51/0x73
Sep 27 05:51:16 Overse kernel: super_cache_scan+0xf4/0x17c
Sep 27 05:51:16 Overse kernel: do_shrink_slab+0x188/0x2a1
Sep 27 05:51:16 Overse kernel: shrink_slab+0x1f9/0x267
Sep 27 05:51:16 Overse kernel: shrink_node+0x318/0x549
Sep 27 05:51:16 Overse kernel: balance_pgdat+0x4e9/0x6a2
Sep 27 05:51:16 Overse kernel: ? newidle_balance+0x289/0x30a
Sep 27 05:51:16 Overse kernel: kswapd+0x2f0/0x333
Sep 27 05:51:16 Overse kernel: ? _raw_spin_rq_lock_irqsave+0x20/0x20
Sep 27 05:51:16 Overse kernel: ? balance_pgdat+0x6a2/0x6a2
Sep 27 05:51:16 Overse kernel: kthread+0xe4/0xef
Sep 27 05:51:16 Overse kernel: ? kthread_complete_and_exit+0x1b/0x1b
Sep 27 05:51:16 Overse kernel: ret_from_fork+0x1f/0x30
Sep 27 05:51:16 Overse kernel: </TASK>
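For readers unfamiliar with the `#PF: error_code(0x0002)` line above, here is a minimal sketch of how that value breaks down into bits. The bit meanings follow the standard x86 page-fault error code layout (only the low five bits are covered); the function and table names are made up for illustration.

```python
# Low bits of the x86 page-fault error code.
# Each entry: (bit mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor (kernel) mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(code: int) -> list[str]:
    """Return a human-readable description of each relevant bit."""
    described = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            described.append(if_set)
        elif if_clear is not None:
            described.append(if_clear)
    return described


if __name__ == "__main__":
    # 0x0002 is the value from the log above.
    print(decode_pf_error_code(0x0002))
    # → ['not-present page', 'write access', 'supervisor (kernel) mode']
```

This matches what the kernel itself printed: a supervisor write to a page that is not present. It does not say why the address was bad, so it cannot by itself confirm or rule out faulty RAM.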

 

 

I suggest testing your memory. For how to do that, see:

 

 

Edited by JackieWu
On 2023/10/8 at 3:54 PM, JackieWu said:

Your log contains a page fault error, which is related to memory:

 

 

 

 

I suggest testing your memory. For how to do that, see:

 

 

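Thanks for the reply. I tested the memory the way you described, and all 3 runs came back PASS. Is there anything else that could cause this problem?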

WechatIMG240.jpg

Edited by shushi1010
5 hours ago, shushi1010 said:

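Thanks for the reply. I tested the memory the way you described, and all 3 runs came back PASS. Is there anything else that could cause this problem?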

WechatIMG240.jpg

 

Is it still crashing at random times? If so, please upload the logs (remember to enable the syslog server so the logs are saved).

 

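Also, regarding memory: I once ran into an unusual case where the system crashes were caused by memory, yet the memory test at the time found nothing wrong. So a passing memory test does not necessarily rule out a memory problem. That situation is fairly rare, though, so for now just keep observing.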

Edited by JackieWu
24 minutes ago, JackieWu said:

 

Is it still crashing at random times? If so, please upload the logs (remember to enable the syslog server so the logs are saved).

 

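Also, regarding memory: I once ran into an unusual case where the system crashes were caused by memory, yet the memory test at the time found nothing wrong. So a passing memory test does not necessarily rule out a memory problem. That situation is fairly rare, though, so for now just keep observing.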

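Yes, it crashed again around the 5th. I just booted it back up today and expect it to crash again in a few days. The log is attached. From a quick look, the error seems to be the same as in the previous crash.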

syslog-127.0.0.1.log
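One way to check whether two crashes really show the same error is to compare the function named in each kernel `RIP:` line. The following is a minimal sketch, assuming the syslog format shown in this thread; `crash_signatures` and `signatures_from_file` are hypothetical helper names, and the count is per `RIP:` line, which can appear more than once for a single crash.

```python
import re
from collections import Counter

# Matches e.g. "kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x152/0x1cf"
RIP_RE = re.compile(r"kernel: RIP: [0-9a-f]{4}:(\S+)")


def crash_signatures(lines):
    """Count the function named in each kernel RIP line."""
    counts = Counter()
    for line in lines:
        match = RIP_RE.search(line)
        if match:
            # Drop the "+0x152/0x1cf" offset so repeats group together.
            counts[match.group(1).split("+")[0]] += 1
    return counts


def signatures_from_file(path):
    """Same as crash_signatures, but reads a saved syslog file."""
    with open(path, errors="replace") as log:
        return crash_signatures(log)


if __name__ == "__main__":
    # The two RIP lines quoted in this thread, used as sample input.
    sample = [
        "Sep 27 05:51:16 Overse kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x152/0x1cf",
        "Oct  5 17:00:33 Overse kernel: RIP: 0010:nf_nat_setup_info+0x8c/0x7d1 [nf_nat]",
    ]
    for function, count in crash_signatures(sample).items():
        print(function, count)
```

For a saved log, something like `signatures_from_file("syslog-127.0.0.1.log")` would give the same kind of summary. Different function names in the two crashes suggest the errors are not actually identical.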

12 minutes ago, shushi1010 said:

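Yes, it crashed again around the 5th. I just booted it back up today and expect it to crash again in a few days. The log is attached. From a quick look, the error seems to be the same as in the previous crash.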

syslog-127.0.0.1.log 3.78 MB · 0 downloads

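A question: by "crash", do you mean Unraid becomes completely unreachable (no ping response, no SSH login), or could it be that only the WebUI is inaccessible while SSH still works? If it is the latter, Unraid has not actually crashed, and the problem needs to be investigated from a different angle.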
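The distinction above can be checked from another machine on the same network. This is a rough sketch that uses plain TCP connections instead of ping; it assumes SSH listens on port 22 and the WebUI on port 80 (adjust if yours differ), and the function names are made up for illustration.

```python
import socket
import sys


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify(ssh_ok: bool, webui_ok: bool) -> str:
    """Turn the two port checks into a rough diagnosis."""
    if ssh_ok and webui_ok:
        return "server reachable"
    if ssh_ok:
        return "only the WebUI is down; the OS is still running"
    if webui_ok:
        return "WebUI is up but SSH is down; check the SSH service"
    return "neither port answers; possible full crash or network loss"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the server address as the first argument.
    host = sys.argv[1]
    # Ports 22 and 80 are assumed defaults, not taken from this thread.
    print(classify(tcp_port_open(host, 22), tcp_port_open(host, 80)))
```

If neither port answers, that alone cannot tell a kernel crash apart from a network-only failure, so the message on the attached monitor and the saved syslog are still needed.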

1 hour ago, JackieWu said:

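A question: by "crash", do you mean Unraid becomes completely unreachable (no ping response, no SSH login), or could it be that only the WebUI is inaccessible while SSH still works? If it is the latter, Unraid has not actually crashed, and the problem needs to be investigated from a different angle.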

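Completely unreachable. The IP no longer shows up on the router either. The screen shows error messages, but I can neither log in nor exit, so all I can do is force a power-off and reboot.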

3 hours ago, shushi1010 said:

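Yes, it crashed again around the 5th. I just booted it back up today and expect it to crash again in a few days. The log is attached. From a quick look, the error seems to be the same as in the previous crash.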

syslog-127.0.0.1.log 3.78 MB · 0 downloads

 

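The log from the 5th contains a macvlan-related kernel error. Based on your description and the loss-of-connectivity problem that is fairly common right now, my guess is that this error may be what is crashing the system. You are on version 6.12.4, which includes a change that lets you turn off bridging to work around the macvlan call trace problem (although that problem may turn out to be unrelated to yours). You could try that approach. For the detailed steps, see my blog post "6.12.4: Solutions to the Loss-of-Connectivity Problem and Related Update Notes" (original title: 《6.12.4 关于失联问题的解决办法以及相关更新说明》).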

 

Oct  5 17:00:33 Overse kernel: ------------[ cut here ]------------
Oct  5 17:00:33 Overse kernel: WARNING: CPU: 8 PID: 13757 at net/netfilter/nf_nat_core.c:594 nf_nat_setup_info+0x8c/0x7d1 [nf_nat]
Oct  5 17:00:33 Overse kernel: Modules linked in: macvlan xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter nct6683 xfs md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag nct6775_core hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc bonding tls i915 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iosf_mbi drm_buddy i2c_algo_bit kvm ttm drm_display_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 drm_kms_helper aesni_intel crypto_simd drm cryptd btusb btrtl btbcm btintel mei_hdcp mei_pxp intel_gtt rapl bluetooth mpt3sas nvme intel_cstate i2c_i801 agpgart intel_wmi_thunderbolt
Oct  5 17:00:33 Overse kernel: i2c_smbus wmi_bmof mxm_wmi mei_me raid_class syscopyarea r8169 ahci intel_uncore ecdh_generic nvme_core i2c_core mei sysfillrect scsi_transport_sas joydev libahci ecc realtek sysimgblt thermal fb_sys_fops fan video wmi backlight intel_pmc_core acpi_pad acpi_tad button unix
Oct  5 17:00:33 Overse kernel: CPU: 8 PID: 13757 Comm: kworker/u24:0 Tainted: P     U  W  O       6.1.49-Unraid #1
Oct  5 17:00:33 Overse kernel: Hardware name: Micro-Star International Co., Ltd. MS-7C82/MAG B460M MORTAR WIFI (MS-7C82), BIOS 1.10 05/18/2020
Oct  5 17:00:33 Overse kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
Oct  5 17:00:33 Overse kernel: RIP: 0010:nf_nat_setup_info+0x8c/0x7d1 [nf_nat]
Oct  5 17:00:33 Overse kernel: Code: a8 80 75 26 48 8d 73 58 48 8d 7c 24 20 e8 18 1b 3b 00 48 8d 43 0c 4c 8b bb 88 00 00 00 48 89 44 24 18 eb 54 0f ba e0 08 73 07 <0f> 0b e9 75 06 00 00 48 8d 73 58 48 8d 7c 24 20 e8 eb 1a 3b 00 48
Oct  5 17:00:33 Overse kernel: RSP: 0018:ffffc90000304c78 EFLAGS: 00010282
Oct  5 17:00:33 Overse kernel: RAX: 0000000000000180 RBX: ffff888a85b1b700 RCX: ffff88812ec82e00
Oct  5 17:00:33 Overse kernel: RDX: 0000000000000000 RSI: ffffc90000304d5c RDI: ffff888a85b1b700
Oct  5 17:00:33 Overse kernel: RBP: ffffc90000304d40 R08: 00000000d41fa8c0 R09: 0000000000000000
Oct  5 17:00:33 Overse kernel: R10: 0000000000000098 R11: 0000000000000000 R12: ffffc90000304d5c
Oct  5 17:00:33 Overse kernel: R13: 0000000000000000 R14: ffffc90000304e40 R15: 0000000000000001
Oct  5 17:00:33 Overse kernel: FS:  0000000000000000(0000) GS:ffff88901f400000(0000) knlGS:0000000000000000
Oct  5 17:00:33 Overse kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct  5 17:00:33 Overse kernel: CR2: 00007f67dc05e000 CR3: 00000002f1452004 CR4: 00000000007706e0
Oct  5 17:00:33 Overse kernel: PKRU: 55555554
Oct  5 17:00:33 Overse kernel: Call Trace:
Oct  5 17:00:33 Overse kernel: <IRQ>
Oct  5 17:00:33 Overse kernel: ? __warn+0xab/0x122
Oct  5 17:00:33 Overse kernel: ? report_bug+0x109/0x17e
Oct  5 17:00:33 Overse kernel: ? nf_nat_setup_info+0x8c/0x7d1 [nf_nat]
Oct  5 17:00:33 Overse kernel: ? handle_bug+0x41/0x6f
Oct  5 17:00:33 Overse kernel: ? exc_invalid_op+0x13/0x60
Oct  5 17:00:33 Overse kernel: ? asm_exc_invalid_op+0x16/0x20
Oct  5 17:00:33 Overse kernel: ? nf_nat_setup_info+0x8c/0x7d1 [nf_nat]
Oct  5 17:00:33 Overse kernel: ? nf_nat_setup_info+0x44/0x7d1 [nf_nat]
Oct  5 17:00:33 Overse kernel: ? xt_write_recseq_end+0xf/0x1c [ip_tables]
Oct  5 17:00:33 Overse kernel: ? __local_bh_enable_ip+0x56/0x6b
Oct  5 17:00:33 Overse kernel: ? ipt_do_table+0x57a/0x5bf [ip_tables]
Oct  5 17:00:33 Overse kernel: ? xt_write_recseq_end+0xf/0x1c [ip_tables]
Oct  5 17:00:33 Overse kernel: __nf_nat_alloc_null_binding+0x66/0x81 [nf_nat]
Oct  5 17:00:33 Overse kernel: nf_nat_inet_fn+0xc0/0x1a8 [nf_nat]
Oct  5 17:00:33 Overse kernel: nf_nat_ipv4_local_in+0x2a/0xaa [nf_nat]
Oct  5 17:00:33 Overse kernel: nf_hook_slow+0x3a/0x96
Oct  5 17:00:33 Overse kernel: ? ip_protocol_deliver_rcu+0x164/0x164
Oct  5 17:00:33 Overse kernel: NF_HOOK.constprop.0+0x79/0xd9
Oct  5 17:00:33 Overse kernel: ? ip_protocol_deliver_rcu+0x164/0x164
Oct  5 17:00:33 Overse kernel: __netif_receive_skb_one_core+0x77/0x9c
Oct  5 17:00:33 Overse kernel: process_backlog+0x8c/0x116
Oct  5 17:00:33 Overse kernel: __napi_poll.constprop.0+0x28/0x124
Oct  5 17:00:33 Overse kernel: net_rx_action+0x159/0x24f
Oct  5 17:00:33 Overse kernel: __do_softirq+0x126/0x288
Oct  5 17:00:33 Overse kernel: do_softirq+0x7f/0xab
Oct  5 17:00:33 Overse kernel: </IRQ>
Oct  5 17:00:33 Overse kernel: <TASK>
Oct  5 17:00:33 Overse kernel: __local_bh_enable_ip+0x4c/0x6b
Oct  5 17:00:33 Overse kernel: netif_rx+0x52/0x5a
Oct  5 17:00:33 Overse kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
Oct  5 17:00:33 Overse kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
Oct  5 17:00:33 Overse kernel: process_one_work+0x1a8/0x295
Oct  5 17:00:33 Overse kernel: worker_thread+0x18b/0x244
Oct  5 17:00:33 Overse kernel: ? rescuer_thread+0x281/0x281
Oct  5 17:00:33 Overse kernel: kthread+0xe4/0xef
Oct  5 17:00:33 Overse kernel: ? kthread_complete_and_exit+0x1b/0x1b
Oct  5 17:00:33 Overse kernel: ret_from_fork+0x1f/0x30
Oct  5 17:00:33 Overse kernel: </TASK>
Oct  5 17:00:33 Overse kernel: ---[ end trace 0000000000000000 ]---
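To see how often this kind of trace shows up in a saved syslog, a small script can pull out each block between the `cut here` and `end trace` markers and keep the ones whose call trace touches the macvlan module. This is only a sketch under the assumption that the log looks like the excerpt above; `macvlan_traces` is a hypothetical helper name. It looks for the `[macvlan]` symbol annotation on purpose, because the bare word `macvlan` also appears in the "Modules linked in" list whenever the module is merely loaded.

```python
def macvlan_traces(lines):
    """Return the 'cut here' .. 'end trace' blocks that involve macvlan."""
    blocks = []
    current = None  # lines of the block being collected, if any
    for line in lines:
        if "[ cut here ]" in line:
            current = [line]
        elif current is not None:
            current.append(line)
            if "[ end trace" in line:
                # "[macvlan]" marks a symbol that belongs to the module.
                if any("[macvlan]" in entry for entry in current):
                    blocks.append(current)
                current = None
    return blocks


if __name__ == "__main__":
    # Shortened sample based on the excerpt above.
    sample = [
        "Oct  5 17:00:33 Overse kernel: ------------[ cut here ]------------",
        "Oct  5 17:00:33 Overse kernel: Modules linked in: macvlan xt_CHECKSUM",
        "Oct  5 17:00:33 Overse kernel: macvlan_broadcast+0x10a/0x150 [macvlan]",
        "Oct  5 17:00:33 Overse kernel: ---[ end trace 0000000000000000 ]---",
    ]
    print(len(macvlan_traces(sample)), "macvlan-related trace(s) found")
```

A count of zero after disabling bridging would be consistent with the workaround taking effect, though it would not prove that the crashes themselves are fixed.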

 
