Unraid crashes at random intervals


Go to solution Solved by lyqalex,


Ever since I installed Unraid, the machine has been crashing at irregular intervals (it usually lasts about 2-3 days).

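After that the machine becomes unreachable (neither HTTP nor SSH works) and it cannot be shut down cleanly. I set up a script that monitors the network and runs powerdown -r to reboot when ping fails, but every time I still end up pulling the power cord, which is hard on the disks.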
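A watchdog of the kind described above might look roughly like the following Python sketch. It is not the poster's actual script: the failure threshold, the check interval, and all function names are assumptions for illustration, and only the powerdown -r command comes from the post. It also assumes a Linux ping that accepts -c and -W.

```python
import subprocess
import time


def host_reachable(host, timeout_s=2):
    """Return True if a single ICMP echo request to `host` succeeds.

    Assumes a Linux `ping` where -c is the packet count and -W the
    reply timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def reboot_server():
    """Run the reboot command mentioned in the post."""
    subprocess.run(["powerdown", "-r"])


def watch(host, max_failures=3, interval_s=60,
          probe=host_reachable, reboot=reboot_server,
          sleep=time.sleep, max_checks=None):
    """Probe `host` repeatedly; reboot after `max_failures` misses in a row.

    `probe`, `reboot` and `sleep` are injectable so the loop can be
    exercised without touching the network. Returns True if a reboot
    was triggered, False if `max_checks` ran out first.
    """
    failures = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if probe(host):
            failures = 0          # any success resets the counter
        else:
            failures += 1
            if failures >= max_failures:
                reboot()
                return True
        sleep(interval_s)
    return False
```

Calling watch() with the address of the local gateway (a hypothetical 192.168.1.1, for instance) would loop until three consecutive probes fail. A script like this only helps while the system can still execute it; in the case described here, the machine still had to be power-cycled.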

 

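Remote syslog did not capture anything useful at first. After I enabled syslog mirroring, the machine crashed again today, and this time I found some odd log entries: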

 

Apr  2 03:48:01 Unraid kernel: ------------[ cut here ]------------
Apr  2 03:48:01 Unraid kernel: NETDEV WATCHDOG: eth0 (igb): transmit queue 6 timed out
Apr  2 03:48:01 Unraid kernel: WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0xcf/0x12b
Apr  2 03:48:01 Unraid kernel: Modules linked in: ccp macvlan xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod i915 iosf_mbi drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nct6683 ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding dm_mod dax x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl nvme intel_cstate intel_uncore nvme_core ahci video i2c_i801 libahci i2c_smbus backlight acpi_pad button igb i2c_algo_bit i2c_core e1000e
Apr  2 03:48:01 Unraid kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G    BUD W         5.10.28-Unraid #1
Apr  2 03:48:01 Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019
Apr  2 03:48:01 Unraid kernel: RIP: 0010:dev_watchdog+0xcf/0x12b
Apr  2 03:48:01 Unraid kernel: Code: 79 b7 00 00 75 38 48 89 ef c6 05 63 79 b7 00 01 e8 79 dd fc ff 44 89 e1 48 89 ee 48 c7 c7 ef 7f de 81 48 89 c2 e8 50 16 10 00 <0f> 0b eb 10 41 ff c4 48 05 40 01 00 00 41 39 f4 75 9d eb 16 48 8b
Apr  2 03:48:01 Unraid kernel: RSP: 0018:ffffc90000180ed8 EFLAGS: 00010286
Apr  2 03:48:01 Unraid kernel: RAX: 0000000000000000 RBX: ffff888104ec0438 RCX: 0000000000000027
Apr  2 03:48:01 Unraid kernel: RDX: 00000000ffffefff RSI: 0000000000000001 RDI: ffff88902ee98920
Apr  2 03:48:01 Unraid kernel: RBP: ffff888104ec0000 R08: 0000000000000000 R09: 00000000ffffefff
Apr  2 03:48:01 Unraid kernel: R10: ffffc90000180d08 R11: ffffc90000180d00 R12: 0000000000000006
Apr  2 03:48:01 Unraid kernel: R13: ffffc90000180f10 R14: ffffc90000180f10 R15: ffffffff820060c8
Apr  2 03:48:01 Unraid kernel: FS:  0000000000000000(0000) GS:ffff88902ee80000(0000) knlGS:0000000000000000
Apr  2 03:48:01 Unraid kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr  2 03:48:01 Unraid kernel: CR2: 0000154834000010 CR3: 000000000400a002 CR4: 00000000003726e0
Apr  2 03:48:01 Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr  2 03:48:01 Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr  2 03:48:01 Unraid kernel: Call Trace:
Apr  2 03:48:01 Unraid kernel: <IRQ>
Apr  2 03:48:01 Unraid kernel: call_timer_fn.isra.0+0x12/0x6f
Apr  2 03:48:01 Unraid kernel: ? netif_tx_lock+0x7a/0x7a
Apr  2 03:48:01 Unraid kernel: __run_timers.part.0+0x144/0x185
Apr  2 03:48:01 Unraid kernel: ? update_process_times+0x68/0x6e
Apr  2 03:48:01 Unraid kernel: ? hrtimer_forward+0x73/0x7b
Apr  2 03:48:01 Unraid kernel: ? tick_sched_timer+0x5a/0x64
Apr  2 03:48:01 Unraid kernel: ? timerqueue_add+0x62/0x68
Apr  2 03:48:01 Unraid kernel: run_timer_softirq+0x21/0x43
Apr  2 03:48:01 Unraid kernel: __do_softirq+0xc4/0x1c2
Apr  2 03:48:01 Unraid kernel: asm_call_irq_on_stack+0xf/0x20
Apr  2 03:48:01 Unraid kernel: </IRQ>
Apr  2 03:48:01 Unraid kernel: do_softirq_own_stack+0x2c/0x39
Apr  2 03:48:01 Unraid kernel: __irq_exit_rcu+0x45/0x80
Apr  2 03:48:01 Unraid kernel: sysvec_apic_timer_interrupt+0x87/0x95
Apr  2 03:48:01 Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
Apr  2 03:48:01 Unraid kernel: RIP: 0010:arch_local_irq_enable+0x7/0x8
Apr  2 03:48:01 Unraid kernel: Code: 00 48 83 c4 28 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 9c 58 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 55 8b af 28 04 00 00 b8 01 00 00 00 45 31 c9 53 45 31 d2 39 c5
Apr  2 03:48:01 Unraid kernel: RSP: 0018:ffffc900000b7ea0 EFLAGS: 00000246
Apr  2 03:48:01 Unraid kernel: RAX: ffff88902eea2380 RBX: 0000000000000006 RCX: 000000000000001f
Apr  2 03:48:01 Unraid kernel: RDX: 0000000000000000 RSI: 000000003c9b28ab RDI: 0000000000000000
Apr  2 03:48:01 Unraid kernel: RBP: ffffe8ffffaa1600 R08: 00017210d94e2967 R09: 0000000000000392
Apr  2 03:48:01 Unraid kernel: R10: 000000007fffffff R11: 071c71c71c71c71c R12: 00017210d94e2967
Apr  2 03:48:01 Unraid kernel: R13: ffffffff820c5dc0 R14: 0000000000000006 R15: 0000000000000000
Apr  2 03:48:01 Unraid kernel: cpuidle_enter_state+0x101/0x1c4
Apr  2 03:48:01 Unraid kernel: cpuidle_enter+0x25/0x31
Apr  2 03:48:01 Unraid kernel: do_idle+0x1a6/0x214
Apr  2 03:48:01 Unraid kernel: cpu_startup_entry+0x18/0x1a
Apr  2 03:48:01 Unraid kernel: secondary_startup_64_no_verify+0xb0/0xbb
Apr  2 03:48:01 Unraid kernel: ---[ end trace fb5642dccbb87fb3 ]---
Apr  2 03:48:01 Unraid kernel: igb 0000:01:00.0 eth0: Reset adapter
Apr  2 03:48:02 Unraid kernel: igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Apr  2 03:48:33 Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Apr  2 03:48:33 Unraid kernel: rcu: 	9-....: (60000 ticks this GP) idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=14882 
Apr  2 03:48:33 Unraid kernel: 	(t=60001 jiffies g=41273801 q=28685)
Apr  2 03:48:33 Unraid kernel: NMI backtrace for cpu 9
Apr  2 03:48:33 Unraid kernel: CPU: 9 PID: 393 Comm: kcompactd0 Tainted: G    BUD W         5.10.28-Unraid #1
Apr  2 03:48:33 Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019
Apr  2 03:48:33 Unraid kernel: Call Trace:
Apr  2 03:48:33 Unraid kernel: <IRQ>
Apr  2 03:48:33 Unraid kernel: dump_stack+0x6b/0x83
Apr  2 03:48:33 Unraid kernel: ? lapic_can_unplug_cpu+0x8e/0x8e
Apr  2 03:48:33 Unraid kernel: nmi_cpu_backtrace+0x7d/0x8f
Apr  2 03:48:33 Unraid kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3
Apr  2 03:48:33 Unraid kernel: rcu_dump_cpu_stacks+0x9f/0xc6
Apr  2 03:48:33 Unraid kernel: rcu_sched_clock_irq+0x1ec/0x543
Apr  2 03:48:33 Unraid kernel: ? trigger_load_balance+0x5a/0x1ca
Apr  2 03:48:33 Unraid kernel: update_process_times+0x50/0x6e
Apr  2 03:48:33 Unraid kernel: tick_sched_timer+0x36/0x64
Apr  2 03:48:33 Unraid kernel: __hrtimer_run_queues+0xb7/0x10b
Apr  2 03:48:33 Unraid kernel: ? tick_sched_do_timer+0x39/0x39
Apr  2 03:48:33 Unraid kernel: hrtimer_interrupt+0x8d/0x15b
Apr  2 03:48:33 Unraid kernel: __sysvec_apic_timer_interrupt+0x5d/0x68
Apr  2 03:48:33 Unraid kernel: asm_call_irq_on_stack+0xf/0x20
Apr  2 03:48:33 Unraid kernel: </IRQ>
Apr  2 03:48:33 Unraid kernel: sysvec_apic_timer_interrupt+0x71/0x95
Apr  2 03:48:33 Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
Apr  2 03:48:33 Unraid kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x79/0x18a
Apr  2 03:48:33 Unraid kernel: Code: c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 74 0c 0f ba e0 08 72 1a c6 47 01 00 eb 14 85 c0 74 0a 8b 07 84 c0 74 04 f3 90 <eb> f6 66 c7 07 01 00 c3 48 c7 c0 00 30 02 00 65 48 03 05 f0 8e f8
Apr  2 03:48:33 Unraid kernel: RSP: 0018:ffffc90000bbbb10 EFLAGS: 00000202
Apr  2 03:48:33 Unraid kernel: RAX: 0000000000000101 RBX: ffff88810bda92c0 RCX: 000ffffffffff000
Apr  2 03:48:33 Unraid kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffea00042171e8
Apr  2 03:48:33 Unraid kernel: RBP: ffffc90000bbbbb0 R08: ffff888000000000 R09: 0000000000000000
Apr  2 03:48:33 Unraid kernel: R10: ffffea000ae5b440 R11: 00000000000267d8 R12: ffff888100e8d940
Apr  2 03:48:33 Unraid kernel: R13: ffffea000ae5b400 R14: 00000001538a7ba9 R15: ffff88810bda92c0
Apr  2 03:48:33 Unraid kernel: queued_spin_lock_slowpath+0x7/0xa
Apr  2 03:48:33 Unraid kernel: page_vma_mapped_walk+0x497/0x4dc
Apr  2 03:48:33 Unraid kernel: try_to_unmap_one+0x115/0x5f1
Apr  2 03:48:33 Unraid kernel: ? check_pte+0x27/0x106
Apr  2 03:48:33 Unraid kernel: rmap_walk_anon+0xe7/0x156
Apr  2 03:48:33 Unraid kernel: try_to_unmap+0x88/0xc9
Apr  2 03:48:33 Unraid kernel: ? page_remove_rmap+0x1d8/0x1d8
Apr  2 03:48:33 Unraid kernel: ? __rcu_read_unlock+0x5/0x5
Apr  2 03:48:33 Unraid kernel: ? page_get_anon_vma+0x65/0x65
Apr  2 03:48:33 Unraid kernel: ? mmu_notifier_invalidate_range+0x10/0x10
Apr  2 03:48:33 Unraid kernel: migrate_pages+0x499/0x7c1
Apr  2 03:48:33 Unraid kernel: ? move_freelist_tail+0xba/0xba
Apr  2 03:48:33 Unraid kernel: ? isolate_freepages_block+0x26b/0x26b
Apr  2 03:48:33 Unraid kernel: compact_zone+0x6b7/0x90a
Apr  2 03:48:33 Unraid kernel: proactive_compact_node+0x75/0xa2
Apr  2 03:48:33 Unraid kernel: ? fragmentation_score_node+0x2b/0x59
Apr  2 03:48:33 Unraid kernel: kcompactd+0x1ee/0x22c
Apr  2 03:48:33 Unraid kernel: ? init_wait_entry+0x24/0x24
Apr  2 03:48:33 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f
Apr  2 03:48:33 Unraid kernel: kthread+0xe5/0xea
Apr  2 03:48:33 Unraid kernel: ? __kthread_bind_mask+0x57/0x57
Apr  2 03:48:33 Unraid kernel: ret_from_fork+0x1f/0x30
Apr  2 03:48:44 Unraid dhcpcd[1999]: br0: fe80::270:87ff:fee0:519 is unreachable
Apr  2 03:49:31 Unraid kernel: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 9-... } 63179 jiffies s: 27197 root: 0x200/.
Apr  2 03:49:31 Unraid kernel: rcu: blocking rcu_node structures:
Apr  2 03:49:31 Unraid kernel: Task dump for CPU 9:
Apr  2 03:49:31 Unraid kernel: task:kcompactd0      state:R  running task     stack:    0 pid:  393 ppid:     2 flags:0x00004008
Apr  2 03:49:31 Unraid kernel: Call Trace:
Apr  2 03:49:31 Unraid kernel: ? proactive_compact_node+0x75/0xa2
Apr  2 03:49:31 Unraid kernel: ? fragmentation_score_node+0x2b/0x59
Apr  2 03:49:31 Unraid kernel: ? kcompactd+0x1ee/0x22c
Apr  2 03:49:31 Unraid kernel: ? init_wait_entry+0x24/0x24
Apr  2 03:49:31 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f
Apr  2 03:49:31 Unraid kernel: ? kthread+0xe5/0xea
Apr  2 03:49:31 Unraid kernel: ? __kthread_bind_mask+0x57/0x57
Apr  2 03:49:31 Unraid kernel: ? ret_from_fork+0x1f/0x30
Apr  2 03:51:33 Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Apr  2 03:51:33 Unraid kernel: rcu: 	9-....: (240003 ticks this GP) idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=59467 
Apr  2 03:51:33 Unraid kernel: 	(t=240004 jiffies g=41273801 q=88942)
Apr  2 03:51:33 Unraid kernel: NMI backtrace for cpu 9
Apr  2 03:51:33 Unraid kernel: CPU: 9 PID: 393 Comm: kcompactd0 Tainted: G    BUD W         5.10.28-Unraid #1
Apr  2 03:51:33 Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019
Apr  2 03:51:33 Unraid kernel: Call Trace:
Apr  2 03:51:33 Unraid kernel: <IRQ>
Apr  2 03:51:33 Unraid kernel: dump_stack+0x6b/0x83
Apr  2 03:51:33 Unraid kernel: ? lapic_can_unplug_cpu+0x8e/0x8e
Apr  2 03:51:33 Unraid kernel: nmi_cpu_backtrace+0x7d/0x8f
Apr  2 03:51:33 Unraid kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3
Apr  2 03:51:33 Unraid kernel: rcu_dump_cpu_stacks+0x9f/0xc6
Apr  2 03:51:33 Unraid kernel: rcu_sched_clock_irq+0x1ec/0x543
Apr  2 03:51:33 Unraid kernel: ? trigger_load_balance+0x5a/0x1ca
Apr  2 03:51:33 Unraid kernel: update_process_times+0x50/0x6e
Apr  2 03:51:33 Unraid kernel: tick_sched_timer+0x36/0x64
Apr  2 03:51:33 Unraid kernel: __hrtimer_run_queues+0xb7/0x10b
Apr  2 03:51:33 Unraid kernel: ? tick_sched_do_timer+0x39/0x39
Apr  2 03:51:33 Unraid kernel: hrtimer_interrupt+0x8d/0x15b
Apr  2 03:51:33 Unraid kernel: __sysvec_apic_timer_interrupt+0x5d/0x68
Apr  2 03:51:33 Unraid kernel: asm_call_irq_on_stack+0xf/0x20
Apr  2 03:51:33 Unraid kernel: </IRQ>
Apr  2 03:51:33 Unraid kernel: sysvec_apic_timer_interrupt+0x71/0x95
Apr  2 03:51:33 Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
Apr  2 03:51:33 Unraid kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x79/0x18a
Apr  2 03:51:33 Unraid kernel: Code: c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 74 0c 0f ba e0 08 72 1a c6 47 01 00 eb 14 85 c0 74 0a 8b 07 84 c0 74 04 f3 90 <eb> f6 66 c7 07 01 00 c3 48 c7 c0 00 30 02 00 65 48 03 05 f0 8e f8
Apr  2 03:51:33 Unraid kernel: RSP: 0018:ffffc90000bbbb10 EFLAGS: 00000202
Apr  2 03:51:33 Unraid kernel: RAX: 0000000000000101 RBX: ffff88810bda92c0 RCX: 000ffffffffff000
Apr  2 03:51:33 Unraid kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffea00042171e8
Apr  2 03:51:33 Unraid kernel: RBP: ffffc90000bbbbb0 R08: ffff888000000000 R09: 0000000000000000
Apr  2 03:51:33 Unraid kernel: R10: ffffea000ae5b440 R11: 00000000000267d8 R12: ffff888100e8d940
Apr  2 03:51:33 Unraid kernel: R13: ffffea000ae5b400 R14: 00000001538a7ba9 R15: ffff88810bda92c0
Apr  2 03:51:33 Unraid kernel: queued_spin_lock_slowpath+0x7/0xa
Apr  2 03:51:33 Unraid kernel: page_vma_mapped_walk+0x497/0x4dc
Apr  2 03:51:33 Unraid kernel: try_to_unmap_one+0x115/0x5f1
Apr  2 03:51:33 Unraid kernel: ? check_pte+0x27/0x106
Apr  2 03:51:33 Unraid kernel: rmap_walk_anon+0xe7/0x156
Apr  2 03:51:33 Unraid kernel: try_to_unmap+0x88/0xc9
Apr  2 03:51:33 Unraid kernel: ? page_remove_rmap+0x1d8/0x1d8
Apr  2 03:51:33 Unraid kernel: ? __rcu_read_unlock+0x5/0x5
Apr  2 03:51:33 Unraid kernel: ? page_get_anon_vma+0x65/0x65
Apr  2 03:51:33 Unraid kernel: ? mmu_notifier_invalidate_range+0x10/0x10
Apr  2 03:51:33 Unraid kernel: migrate_pages+0x499/0x7c1
Apr  2 03:51:33 Unraid kernel: ? move_freelist_tail+0xba/0xba
Apr  2 03:51:33 Unraid kernel: ? isolate_freepages_block+0x26b/0x26b
Apr  2 03:51:33 Unraid kernel: compact_zone+0x6b7/0x90a
Apr  2 03:51:33 Unraid kernel: proactive_compact_node+0x75/0xa2
Apr  2 03:51:33 Unraid kernel: ? fragmentation_score_node+0x2b/0x59
Apr  2 03:51:33 Unraid kernel: kcompactd+0x1ee/0x22c
Apr  2 03:51:33 Unraid kernel: ? init_wait_entry+0x24/0x24
Apr  2 03:51:33 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f
Apr  2 03:51:33 Unraid kernel: kthread+0xe5/0xea
Apr  2 03:51:33 Unraid kernel: ? __kthread_bind_mask+0x57/0x57
Apr  2 03:51:33 Unraid kernel: ret_from_fork+0x1f/0x30
Apr  2 03:52:31 Unraid kernel: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 9-... } 243403 jiffies s: 27197 root: 0x200/.
Apr  2 03:52:31 Unraid kernel: rcu: blocking rcu_node structures:
Apr  2 03:52:31 Unraid kernel: Task dump for CPU 9:
Apr  2 03:52:31 Unraid kernel: task:kcompactd0      state:R  running task     stack:    0 pid:  393 ppid:     2 flags:0x00004008
Apr  2 03:52:31 Unraid kernel: Call Trace:
Apr  2 03:52:31 Unraid kernel: ? proactive_compact_node+0x75/0xa2
Apr  2 03:52:31 Unraid kernel: ? fragmentation_score_node+0x2b/0x59
Apr  2 03:52:31 Unraid kernel: ? kcompactd+0x1ee/0x22c
Apr  2 03:52:31 Unraid kernel: ? init_wait_entry+0x24/0x24
Apr  2 03:52:31 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f
Apr  2 03:52:31 Unraid kernel: ? kthread+0xe5/0xea
Apr  2 03:52:31 Unraid kernel: ? __kthread_bind_mask+0x57/0x57
Apr  2 03:52:31 Unraid kernel: ? ret_from_fork+0x1f/0x30
Apr  2 03:54:33 Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Apr  2 03:54:33 Unraid kernel: rcu: 	9-....: (420006 ticks this GP) idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=104071 
Apr  2 03:54:33 Unraid kernel: 	(t=420007 jiffies g=41273801 q=145781)
Apr  2 03:54:33 Unraid kernel: NMI backtrace for cpu 9
Apr  2 03:54:33 Unraid kernel: CPU: 9 PID: 393 Comm: kcompactd0 Tainted: G    BUD W         5.10.28-Unraid #1
Apr  2 03:54:33 Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019
Apr  2 03:54:33 Unraid kernel: Call Trace:
Apr  2 03:54:33 Unraid kernel: <IRQ>
Apr  2 03:54:33 Unraid kernel: dump_stack+0x6b/0x83
Apr  2 03:54:33 Unraid kernel: ? lapic_can_unplug_cpu+0x8e/0x8e
Apr  2 03:54:33 Unraid kernel: nmi_cpu_backtrace+0x7d/0x8f
Apr  2 03:54:33 Unraid kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3
Apr  2 03:54:33 Unraid kernel: rcu_dump_cpu_stacks+0x9f/0xc6
Apr  2 03:54:33 Unraid kernel: rcu_sched_clock_irq+0x1ec/0x543
Apr  2 03:54:33 Unraid kernel: ? trigger_load_balance+0x5a/0x1ca
Apr  2 03:54:33 Unraid kernel: update_process_times+0x50/0x6e
Apr  2 03:54:33 Unraid kernel: tick_sched_timer+0x36/0x64
Apr  2 03:54:33 Unraid kernel: __hrtimer_run_queues+0xb7/0x10b
Apr  2 03:54:33 Unraid kernel: ? tick_sched_do_timer+0x39/0x39
Apr  2 03:54:33 Unraid kernel: hrtimer_interrupt+0x8d/0x15b
Apr  2 03:54:33 Unraid kernel: __sysvec_apic_timer_interrupt+0x5d/0x68
Apr  2 03:54:33 Unraid kernel: asm_call_irq_on_stack+0xf/0x20
Apr  2 03:54:33 Unraid kernel: </IRQ>
Apr  2 03:54:33 Unraid kernel: sysvec_apic_timer_interrupt+0x71/0x95
Apr  2 03:54:33 Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
Apr  2 03:54:33 Unraid kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x79/0x18a
Apr  2 03:54:33 Unraid kernel: Code: c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 74 0c 0f ba e0 08 72 1a c6 47 01 00 eb 14 85 c0 74 0a 8b 07 84 c0 74 04 f3 90 <eb> f6 66 c7 07 01 00 c3 48 c7 c0 00 30 02 00 65 48 03 05 f0 8e f8
Apr  2 03:54:33 Unraid kernel: RSP: 0018:ffffc90000bbbb10 EFLAGS: 00000202
Apr  2 03:54:33 Unraid kernel: RAX: 0000000000000101 RBX: ffff88810bda92c0 RCX: 000ffffffffff000
Apr  2 03:54:33 Unraid kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffea00042171e8
Apr  2 03:54:33 Unraid kernel: RBP: ffffc90000bbbbb0 R08: ffff888000000000 R09: 0000000000000000
Apr  2 03:54:33 Unraid kernel: R10: ffffea000ae5b440 R11: 00000000000267d8 R12: ffff888100e8d940
Apr  2 03:54:33 Unraid kernel: R13: ffffea000ae5b400 R14: 00000001538a7ba9 R15: ffff88810bda92c0
Apr  2 03:54:33 Unraid kernel: queued_spin_lock_slowpath+0x7/0xa
Apr  2 03:54:33 Unraid kernel: page_vma_mapped_walk+0x497/0x4dc
Apr  2 03:54:33 Unraid kernel: try_to_unmap_one+0x115/0x5f1
Apr  2 03:54:33 Unraid kernel: ? check_pte+0x27/0x106
Apr  2 03:54:33 Unraid kernel: rmap_walk_anon+0xe7/0x156
Apr  2 03:54:33 Unraid kernel: try_to_unmap+0x88/0xc9
Apr  2 03:54:33 Unraid kernel: ? page_remove_rmap+0x1d8/0x1d8
Apr  2 03:54:33 Unraid kernel: ? __rcu_read_unlock+0x5/0x5
Apr  2 03:54:33 Unraid kernel: ? page_get_anon_vma+0x65/0x65
Apr  2 03:54:33 Unraid kernel: ? mmu_notifier_invalidate_range+0x10/0x10
Apr  2 03:54:33 Unraid kernel: migrate_pages+0x499/0x7c1
Apr  2 03:54:33 Unraid kernel: ? move_freelist_tail+0xba/0xba
Apr  2 03:54:33 Unraid kernel: ? isolate_freepages_block+0x26b/0x26b
Apr  2 03:54:33 Unraid kernel: compact_zone+0x6b7/0x90a
Apr  2 03:54:33 Unraid kernel: proactive_compact_node+0x75/0xa2
Apr  2 03:54:33 Unraid kernel: ? fragmentation_score_node+0x2b/0x59
Apr  2 03:54:33 Unraid kernel: kcompactd+0x1ee/0x22c
Apr  2 03:54:33 Unraid kernel: ? init_wait_entry+0x24/0x24
Apr  2 03:54:33 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f
Apr  2 03:54:33 Unraid kernel: kthread+0xe5/0xea
Apr  2 03:54:33 Unraid kernel: ? __kthread_bind_mask+0x57/0x57
Apr  2 03:54:33 Unraid kernel: ret_from_fork+0x1f/0x30
Apr  2 03:55:32 Unraid kernel: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 9-... } 423627 jiffies s: 27197 root: 0x200/.
Apr  2 03:55:32 Unraid kernel: rcu: blocking rcu_node structures:
Apr  2 03:55:32 Unraid kernel: Task dump for CPU 9:
Apr  2 03:55:32 Unraid kernel: task:kcompactd0      state:R  running task     stack:    0 pid:  393 ppid:     2 flags:0x00004008
Apr  2 03:55:32 Unraid kernel: Call Trace:
Apr  2 03:55:32 Unraid kernel: ? proactive_compact_node+0x75/0xa2
Apr  2 03:55:32 Unraid kernel: ? fragmentation_score_node+0x2b/0x59
Apr  2 03:55:32 Unraid kernel: ? kcompactd+0x1ee/0x22c
Apr  2 03:55:32 Unraid kernel: ? init_wait_entry+0x24/0x24
Apr  2 03:55:32 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f
Apr  2 03:55:32 Unraid kernel: ? kthread+0xe5/0xea
Apr  2 03:55:32 Unraid kernel: ? __kthread_bind_mask+0x57/0x57
Apr  2 03:55:32 Unraid kernel: ? ret_from_fork+0x1f/0x30
Apr  2 03:57:33 Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Apr  2 03:57:33 Unraid kernel: rcu: 	9-....: (600009 ticks this GP) idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=148692 
Apr  2 03:57:33 Unraid kernel: 	(t=600010 jiffies g=41273801 q=197481)
Apr  2 03:57:33 Unraid kernel: NMI backtrace for cpu 9

 

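From the log, it looks like something was triggered and the system then got stuck repeating the same stall over and over. Has anyone run into something similar? How did you fix it?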
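To see the repeating pattern at a glance, the log can be tallied with a short script. This is only a sketch: the regular expressions are written against the line formats visible in the excerpt above, and the function name summarize is made up for illustration.

```python
import re
from collections import Counter

# Patterns based on the kernel messages shown in the log excerpt.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (\S+) \((\w+)\): transmit queue (\d+) timed out"
)
STALL_RE = re.compile(r"rcu_sched self-detected stall on CPU")
COMM_RE = re.compile(r"CPU: (\d+) PID: \d+ Comm: (\S+)")


def summarize(lines):
    """Count NIC transmit timeouts and RCU stalls in syslog lines.

    Returns two Counters:
      - watchdog: keyed by (interface, driver, queue)
      - stalls:   keyed by (cpu, task name), taken from the first
                  "CPU: .. Comm: .." line after each stall message
    """
    watchdog = Counter()
    stalls = Counter()
    awaiting_comm = False
    for line in lines:
        m = WATCHDOG_RE.search(line)
        if m:
            watchdog[(m.group(1), m.group(2), int(m.group(3)))] += 1
            continue
        if STALL_RE.search(line):
            awaiting_comm = True
            continue
        if awaiting_comm:
            m = COMM_RE.search(line)
            if m:
                stalls[(int(m.group(1)), m.group(2))] += 1
                awaiting_comm = False
    return watchdog, stalls
```

The lines can come from any iterable, for example an open syslog file. A stall that shows up again and again with the same CPU and task, as kcompactd0 on CPU 9 does above, suggests one stuck task rather than many unrelated faults.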


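Problems like this are usually caused by hardware. Please provide details, listing your motherboard, CPU, other hardware, plugins and so on. In Unraid, go to Tools > Diagnostics, generate the diagnostics file and post it here. In the meantime, you can disable all Docker containers and VMs, and in particular uninstall any components that did not come from the app store, then test for stability. If the fault still occurs, you can conclude it is a hardware problem.

P.S. Do not use ES or QS (engineering sample or qualification sample) CPUs.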

On 4/2/2022 at 7:58 PM, lyqalex said:

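Problems like this are usually caused by hardware. Please provide details, listing your motherboard, CPU, other hardware, plugins and so on. In Unraid, go to Tools > Diagnostics, generate the diagnostics file and post it here. In the meantime, you can disable all Docker containers and VMs, and in particular uninstall any components that did not come from the app store, then test for stability. If the fault still occurs, you can conclude it is a hardware problem.

P.S. Do not use ES or QS (engineering sample or qualification sample) CPUs.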

 

Thanks for the reply. The diagnostics are attached.

After I disabled the VMs, it crashed again yesterday. The log is below:

Apr  3 01:58:59 Joes-Unraid kernel: ------------[ cut here ]------------
Apr  3 01:58:59 Joes-Unraid kernel: NETDEV WATCHDOG: eth0 (igb): transmit queue 3 timed out
Apr  3 01:58:59 Joes-Unraid kernel: WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0xcf/0x12b
Apr  3 01:58:59 Joes-Unraid kernel: Modules linked in: macvlan xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod i915 iosf_mbi drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nct6683 ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding dm_mod dax x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl nvme ahci intel_cstate i2c_i801 intel_uncore nvme_core video libahci i2c_smbus backlight acpi_pad button igb i2c_algo_bit i2c_core e1000e
Apr  3 01:58:59 Joes-Unraid kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G     U  W         5.10.28-Unraid #1
Apr  3 01:58:59 Joes-Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019
Apr  3 01:58:59 Joes-Unraid kernel: RIP: 0010:dev_watchdog+0xcf/0x12b
Apr  3 01:58:59 Joes-Unraid kernel: Code: 79 b7 00 00 75 38 48 89 ef c6 05 63 79 b7 00 01 e8 79 dd fc ff 44 89 e1 48 89 ee 48 c7 c7 ef 7f de 81 48 89 c2 e8 50 16 10 00 <0f> 0b eb 10 41 ff c4 48 05 40 01 00 00 41 39 f4 75 9d eb 16 48 8b
Apr  3 01:58:59 Joes-Unraid kernel: RSP: 0018:ffffc90000180ed8 EFLAGS: 00010286
Apr  3 01:58:59 Joes-Unraid kernel: RAX: 0000000000000000 RBX: ffff888105bb8438 RCX: 0000000000000027
Apr  3 01:58:59 Joes-Unraid kernel: RDX: 00000000ffffefff RSI: 0000000000000001 RDI: ffff88902ee98920
Apr  3 01:58:59 Joes-Unraid kernel: RBP: ffff888105bb8000 R08: 0000000000000000 R09: 00000000ffffefff
Apr  3 01:58:59 Joes-Unraid kernel: R10: ffffc90000180d08 R11: ffffc90000180d00 R12: 0000000000000003
Apr  3 01:58:59 Joes-Unraid kernel: R13: ffffc90000180f10 R14: ffffc90000180f10 R15: ffffffff820060c8
Apr  3 01:58:59 Joes-Unraid kernel: FS:  0000000000000000(0000) GS:ffff88902ee80000(0000) knlGS:0000000000000000
Apr  3 01:58:59 Joes-Unraid kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr  3 01:58:59 Joes-Unraid kernel: CR2: 000015056d9ca900 CR3: 000000000400a005 CR4: 00000000003706e0
Apr  3 01:58:59 Joes-Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr  3 01:58:59 Joes-Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr  3 01:58:59 Joes-Unraid kernel: Call Trace:
Apr  3 01:58:59 Joes-Unraid kernel: <IRQ>
Apr  3 01:58:59 Joes-Unraid kernel: call_timer_fn.isra.0+0x12/0x6f
Apr  3 01:58:59 Joes-Unraid kernel: ? netif_tx_lock+0x7a/0x7a
Apr  3 01:58:59 Joes-Unraid kernel: __run_timers.part.0+0x144/0x185
Apr  3 01:58:59 Joes-Unraid kernel: ? update_process_times+0x68/0x6e
Apr  3 01:58:59 Joes-Unraid kernel: ? hrtimer_forward+0x73/0x7b
Apr  3 01:58:59 Joes-Unraid kernel: ? tick_sched_timer+0x5a/0x64
Apr  3 01:58:59 Joes-Unraid kernel: ? timerqueue_add+0x62/0x68
Apr  3 01:58:59 Joes-Unraid kernel: run_timer_softirq+0x21/0x43
Apr  3 01:58:59 Joes-Unraid kernel: __do_softirq+0xc4/0x1c2
Apr  3 01:58:59 Joes-Unraid kernel: asm_call_irq_on_stack+0xf/0x20
Apr  3 01:58:59 Joes-Unraid kernel: </IRQ>
Apr  3 01:58:59 Joes-Unraid kernel: do_softirq_own_stack+0x2c/0x39
Apr  3 01:58:59 Joes-Unraid kernel: __irq_exit_rcu+0x45/0x80
Apr  3 01:58:59 Joes-Unraid kernel: sysvec_apic_timer_interrupt+0x87/0x95
Apr  3 01:58:59 Joes-Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
Apr  3 01:58:59 Joes-Unraid kernel: RIP: 0010:arch_local_irq_enable+0x7/0x8
Apr  3 01:58:59 Joes-Unraid kernel: Code: 00 48 83 c4 28 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 9c 58 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 55 8b af 28 04 00 00 b8 01 00 00 00 45 31 c9 53 45 31 d2 39 c5
Apr  3 01:58:59 Joes-Unraid kernel: RSP: 0018:ffffc900000b7ea0 EFLAGS: 00000246
Apr  3 01:58:59 Joes-Unraid kernel: RAX: ffff88902eea2380 RBX: 0000000000000006 RCX: 000000000000001f
Apr  3 01:58:59 Joes-Unraid kernel: RDX: 0000000000000000 RSI: 000000003c9b28ab RDI: 0000000000000000
Apr  3 01:58:59 Joes-Unraid kernel: RBP: ffffe8ffffaa1600 R08: 0000285539e1a494 R09: 0000000000000385
Apr  3 01:58:59 Joes-Unraid kernel: R10: 000000007fffffff R11: 071c71c71c71c71c R12: 0000285539e1a494
Apr  3 01:58:59 Joes-Unraid kernel: R13: ffffffff820c5dc0 R14: 0000000000000006 R15: 0000000000000000
Apr  3 01:58:59 Joes-Unraid kernel: cpuidle_enter_state+0x101/0x1c4
Apr  3 01:58:59 Joes-Unraid kernel: cpuidle_enter+0x25/0x31
Apr  3 01:58:59 Joes-Unraid kernel: do_idle+0x1a6/0x214
Apr  3 01:58:59 Joes-Unraid kernel: cpu_startup_entry+0x18/0x1a
Apr  3 01:58:59 Joes-Unraid kernel: secondary_startup_64_no_verify+0xb0/0xbb
Apr  3 01:58:59 Joes-Unraid kernel: ---[ end trace 5f09cc7a9ef954d7 ]---
Apr  3 01:58:59 Joes-Unraid kernel: igb 0000:01:00.0 eth0: Reset adapter
Apr  3 01:59:00 Joes-Unraid kernel: igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Apr  3 01:59:50 Joes-Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Apr  3 01:59:50 Joes-Unraid kernel: rcu: 	6-....: (1 GPs behind) idle=682/1/0x4000000000000004 softirq=728000/728001 fqs=14870 
Apr  3 01:59:50 Joes-Unraid kernel: 	(t=60000 jiffies g=2959521 q=21275)
Apr  3 01:59:50 Joes-Unraid kernel: NMI backtrace for cpu 6
Apr  3 01:59:50 Joes-Unraid kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G     U  W         5.10.28-Unraid #1
Apr  3 01:59:50 Joes-Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019
Apr  3 01:59:50 Joes-Unraid kernel: Call Trace:
Apr  3 01:59:50 Joes-Unraid kernel: <IRQ>
Apr  3 01:59:50 Joes-Unraid kernel: dump_stack+0x6b/0x83
Apr  3 01:59:50 Joes-Unraid kernel: ? lapic_can_unplug_cpu+0x8e/0x8e
Apr  3 01:59:50 Joes-Unraid kernel: nmi_cpu_backtrace+0x7d/0x8f
Apr  3 01:59:50 Joes-Unraid kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3
Apr  3 01:59:50 Joes-Unraid kernel: rcu_dump_cpu_stacks+0x9f/0xc6
Apr  3 01:59:50 Joes-Unraid kernel: rcu_sched_clock_irq+0x1ec/0x543
Apr  3 01:59:50 Joes-Unraid kernel: update_process_times+0x50/0x6e
Apr  3 01:59:50 Joes-Unraid kernel: tick_sched_timer+0x36/0x64
Apr  3 01:59:50 Joes-Unraid kernel: __hrtimer_run_queues+0xb7/0x10b
Apr  3 01:59:50 Joes-Unraid kernel: ? tick_sched_do_timer+0x39/0x39
Apr  3 01:59:50 Joes-Unraid kernel: hrtimer_interrupt+0x8d/0x15b
Apr  3 01:59:50 Joes-Unraid kernel: __sysvec_apic_timer_interrupt+0x5d/0x68
Apr  3 01:59:50 Joes-Unraid kernel: sysvec_apic_timer_interrupt+0x82/0x95
Apr  3 01:59:50 Joes-Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
Apr  3 01:59:50 Joes-Unraid kernel: RIP: 0010:nf_ct_zone_equal+0x24/0x2b [nf_conntrack]
Apr  3 01:59:50 Joes-Unraid kernel: Code: e9 b9 f8 ff ff c3 89 d1 b8 01 00 00 00 31 d2 d3 e0 0f b6 4f 0f 85 c1 74 04 66 8b 57 0c 0f b6 7e 03 31 c9 85 c7 74 03 66 8b 0e <66> 39 d1 0f 94 c0 c3 48 8b 86 90 00 00 00 89 f9 48 89 f7 48 8b 80
Apr  3 01:59:50 Joes-Unraid kernel: RSP: 0018:ffffc90000230978 EFLAGS: 00000202
Apr  3 01:59:50 Joes-Unraid kernel: RAX: 0000000000000001 RBX: ffff88811e778b88 RCX: 0000000000000000
Apr  3 01:59:50 Joes-Unraid kernel: RDX: 0000000000000000 RSI: ffff88811e778c8c RDI: 0000000000000003
Apr  3 01:59:50 Joes-Unraid kernel: RBP: ffff88811e778c80 R08: ffff88811e778b40 R09: ffff88811e778b88
Apr  3 01:59:50 Joes-Unraid kernel: R10: 0000000000000001 R11: ffffffff8210b440 R12: ffffffff8210b440
Apr  3 01:59:50 Joes-Unraid kernel: R13: ffffc900002309e0 R14: ffff88811e778c8c R15: ffff88811e778b40
Apr  3 01:59:50 Joes-Unraid kernel: nf_conntrack_tuple_taken+0xdc/0x144 [nf_conntrack]
Apr  3 01:59:50 Joes-Unraid kernel: nf_nat_used_tuple+0x2e/0x49 [nf_nat]
Apr  3 01:59:50 Joes-Unraid kernel: nf_nat_setup_info+0x332/0x6aa [nf_nat]
Apr  3 01:59:50 Joes-Unraid kernel: ? ipt_do_table+0x4bb/0x5c0 [ip_tables]
Apr  3 01:59:50 Joes-Unraid kernel: ? ipt_do_table+0x570/0x5c0 [ip_tables]
Apr  3 01:59:50 Joes-Unraid kernel: __nf_nat_alloc_null_binding+0x5f/0x76 [nf_nat]
Apr  3 01:59:50 Joes-Unraid kernel: nf_nat_inet_fn+0x91/0x183 [nf_nat]
Apr  3 01:59:50 Joes-Unraid kernel: ? br_handle_frame_finish+0x351/0x351
Apr  3 01:59:50 Joes-Unraid kernel: nf_nat_ipv4_pre_routing+0x1e/0x4a [nf_nat]
Apr  3 01:59:50 Joes-Unraid kernel: nf_hook_slow+0x39/0x8e
Apr  3 01:59:50 Joes-Unraid kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
Apr  3 01:59:50 Joes-Unraid kernel: NF_HOOK+0xb7/0xf7 [br_netfilter]
Apr  3 01:59:50 Joes-Unraid kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
Apr  3 01:59:50 Joes-Unraid kernel: br_nf_pre_routing+0x229/0x239 [br_netfilter]
Apr  3 01:59:50 Joes-Unraid kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
Apr  3 01:59:50 Joes-Unraid kernel: br_handle_frame+0x25e/0x2a6
Apr  3 01:59:50 Joes-Unraid kernel: ? br_pass_frame_up+0xda/0xda
Apr  3 01:59:50 Joes-Unraid kernel: __netif_receive_skb_core+0x335/0x4e7
Apr  3 01:59:50 Joes-Unraid kernel: __netif_receive_skb_list_core+0x78/0x104
Apr  3 01:59:50 Joes-Unraid kernel: netif_receive_skb_list_internal+0x1bf/0x1f2
Apr  3 01:59:50 Joes-Unraid kernel: ? dev_gro_receive+0x55d/0x578
Apr  3 01:59:50 Joes-Unraid kernel: gro_normal_list+0x1d/0x39
Apr  3 01:59:50 Joes-Unraid kernel: napi_complete_done+0x79/0x104
Apr  3 01:59:50 Joes-Unraid kernel: igb_poll+0xcc8/0xef6 [igb]
Apr  3 01:59:50 Joes-Unraid kernel: net_rx_action+0xf4/0x29d
Apr  3 01:59:50 Joes-Unraid kernel: __do_softirq+0xc4/0x1c2
Apr  3 01:59:50 Joes-Unraid kernel: asm_call_irq_on_stack+0xf/0x20
Apr  3 01:59:50 Joes-Unraid kernel: </IRQ>
Apr  3 01:59:50 Joes-Unraid kernel: do_softirq_own_stack+0x2c/0x39
Apr  3 01:59:50 Joes-Unraid kernel: __irq_exit_rcu+0x45/0x80
Apr  3 01:59:50 Joes-Unraid kernel: common_interrupt+0x119/0x12e
Apr  3 01:59:50 Joes-Unraid kernel: asm_common_interrupt+0x1e/0x40
Apr  3 01:59:50 Joes-Unraid kernel: RIP: 0010:arch_local_irq_enable+0x7/0x8

 

This time I plan to try disabling Docker.

 

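Would switching to an older version fix the problem? I actually ran a "black Synology" (unofficial Synology DSM) install on this exact hardware for several months without a single issue.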

 

unraid-diagnostics-20220404-1055.zip

  • 2 weeks later...

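Reminder: after uninstalling vfio.pci, remember to also delete vfio.conf on the flash drive, otherwise there will be no network after a reboot.

Progress update:

After removing vfio.pci, the previous error no longer occurs -> it ran stably for 3 days.

But another problem showed up: when Docker uses a macvlan network, a similar error also occurs occasionally -> the server goes down (and without leaving any logs).

After replacing every Docker container that used macvlan, it has now been running stably for 4.5 days with no errors.

I will keep watching it.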

 

Does anyone know of a solution to this macvlan problem?
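Before changing anything, it can help to list which Docker networks actually use the macvlan driver. The sketch below assumes the docker CLI is on the PATH and relies on its `docker network ls --format` option; the function names are illustrative, not part of any tool mentioned in this thread.

```python
import subprocess
from typing import List


def macvlan_networks(listing: str) -> List[str]:
    """Return the names of networks whose driver is macvlan.

    `listing` is expected to hold one "name<TAB>driver" pair per line.
    """
    names = []
    for line in listing.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].strip() == "macvlan":
            names.append(parts[0].strip())
    return names


def list_docker_networks() -> str:
    """Ask the docker CLI for every network as "name<TAB>driver" lines."""
    completed = subprocess.run(
        ["docker", "network", "ls", "--format", "{{.Name}}\t{{.Driver}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

Running macvlan_networks(list_docker_networks()) on the server would return the affected network names, which shows where containers still depend on macvlan.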

  • 6 months later...
  • 9 months later...
On 8/28/2023 at 12:21 AM, JackieWu said:

 

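You can try this method and see whether it solves the problem: "unRAID connectivity loss explained, and how to give Docker a dedicated network port" (《unRAID 失联问题解析以及如何给Docker分配独立网口》)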

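Mine happened because I added a USB 2.5G NIC alongside the onboard 1G one. I only had one cable, plugged into the 2.5G NIC, so I enabled bridging; otherwise I could not reach the server.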

 

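I have now made the 2.5G NIC eth0, disabled bridging, and switched to ipvlan. I will see in a few days whether it helps; personally I think it should. After I added the 2.5G NIC, disks kept failing and being rebuilt, and I also upgraded the system, without realizing the NIC might be the problem.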

 

I can't use eth1 for Docker because it is the 1G port, so I am using the same setup I had before with a single NIC.

 

Many thanks.

On 8/28/2023 at 12:21 AM, JackieWu said:

 

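You can try this method and see whether it solves the problem: "unRAID connectivity loss explained, and how to give Docker a dedicated network port" (《unRAID 失联问题解析以及如何给Docker分配独立网口》)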

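6.12.4-rc19 no longer has the macvlan option, and I have already removed bridging and separated the two network ports.

It still loses connectivity. I can't tell whether it has crashed or just dropped off the network, because my monitor no longer has a VGA port, so I have no way to check.

The NIC I added is a Realtek USB 2.5G adapter, which is a bit of a headache. (I installed the driver plugin for this NIC.)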

Link to comment
9 minutes ago, akina said:

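6.12.4-rc19 no longer has a macvlan option, and I have already disabled bridging and separated the two network ports.

It still drops off the network. I can't tell whether it is a hard freeze or just a loss of connectivity, because my current monitor has no VGA port, so I have no way to check.

The NIC I added is a Realtek USB 2.5G adapter, which really is a bit of a headache. (I installed the driver plugin for this NIC.)

We need to see the logs.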

Link to comment
On 8/29/2023 at 9:35 PM, JackieWu said:

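We need to see the logs.

It now lasts a little over a day each time. The only workaround for now is to stop using the Realtek 2.5G NIC temporarily and try running on the onboard 1G NIC alone.

If it still fails with a single NIC, I am out of ideas and will have to switch to a UGREEN or ZSpace NAS. It is a pity, since the license I bought would go to waste.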

Link to comment
1 minute ago, akina said:

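It now lasts a little over a day each time. The only workaround for now is to stop using the Realtek 2.5G NIC temporarily and try running on the onboard 1G NIC alone.

If it still fails with a single NIC, I am out of ideas and will have to switch to a UGREEN or ZSpace NAS. It is a pity, since the license I bought would go to waste.

Loss of connectivity can have many causes, and I have dealt with it under several different circumstances. It has to be diagnosed from the logs; if you would like, upload them so everyone can help.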
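
For a first pass over a syslog before posting it, a small scanner can count the error signatures that appear in the logs pasted in this thread. This is only a sketch: the category names and the `scan` function are illustrative, and the patterns cover just the three messages seen here (the igb transmit-queue watchdog, XFS metadata corruption, and the XFS filesystem shutdown), not every possible cause.

```python
import re
import sys
from collections import Counter

# Patterns taken from kernel messages quoted in this thread.
SIGNATURES = {
    "netdev_watchdog": re.compile(
        r"NETDEV WATCHDOG: \S+ \(\S+\): transmit queue \d+ timed out"
    ),
    "xfs_corruption": re.compile(r"XFS \(\S+\): Metadata corruption detected"),
    "xfs_shutdown": re.compile(r"XFS \(\S+\): .*Shutting down filesystem"),
}


def scan(lines):
    """Count how many lines match each known signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_syslog.py /path/to/syslog
    with open(sys.argv[1], errors="replace") as handle:
        for name, count in scan(handle).items():
            print(f"{name}: {count}")
```

A non-zero count only says where to look next; the full diagnostics zip is still what others need in order to help.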

Link to comment
On 8/31/2023 at 5:51 PM, JackieWu said:

 

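Loss of connectivity can have many causes, and I have dealt with it under several different circumstances. It has to be diagnosed from the logs; if you would like, upload them so everyone can help.

I have unplugged the 2.5G USB NIC; it has been 5 days with no problems so far.

But another problem has been bothering me during this time: the shares disappear.

The "Shares" page shows no folders at all, and they can't be accessed. Stopping and restarting the array brings them back, as does a reboot.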

Edited by akina
Link to comment
5 hours ago, akina said:

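I have unplugged the 2.5G USB NIC; it has been 5 days with no problems so far.

But another problem has been bothering me during this time: the shares disappear.

The "Shares" page shows no folders at all, and they can't be accessed. Stopping and restarting the array brings them back, as does a reboot.

Same as before: we need to see the logs.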

Link to comment
1 hour ago, akina said:

I have uploaded them; they should include logs from the disconnects as well.

tower-diagnostics-20230907-1626.zip 164.47 kB · 1 download

 

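I looked through your logs from September 1 to today. There are quite a few errors, including disk filesystem corruption as well as NIC driver errors.

My suggestions:

  • Repair the filesystem on Disk1;
  • Your NIC is using the r8169 driver, but the driver plugin you installed is for r8152. I suggest removing that plugin, rebooting unRAID, and observing;
  • Test your RAM; see here for a reference.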
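
One way to confirm which kernel driver an interface is actually bound to (the mismatch described in the second suggestion) is to read the `device/driver` symlink under `/sys/class/net`. A minimal Python sketch follows; the function name `bound_driver` and the interface name `eth0` are illustrative assumptions, so substitute the interface on your own system.

```python
import os


def bound_driver(iface, sysfs_root="/sys/class/net"):
    """Return the name of the kernel driver bound to `iface`, or None.

    On Linux, <sysfs_root>/<iface>/device/driver is a symlink whose
    final path component is the driver name (for example r8169 or r8152).
    Virtual interfaces such as bridges have no such link.
    """
    link = os.path.join(sysfs_root, iface, "device", "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # "eth0" is only an example interface name.
    print(bound_driver("eth0"))
```

If the name printed does not match the driver the plugin provides, the plugin is not the one driving that NIC.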
Edited by JackieWu
Link to comment
3 hours ago, JackieWu said:

 

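I looked through your logs from September 1 to today. There are quite a few errors, including disk filesystem corruption as well as NIC driver errors.

My suggestions:

  • Repair the filesystem on Disk1;
  • Your NIC is using the r8169 driver, but the driver plugin you installed is for r8152. I suggest removing that plugin, rebooting unRAID, and observing;
  • Test your RAM; see here for a reference.

I am repairing Disk1 now.

When I searched for NIC driver plugins, this same driver is what came up. Once the repair is finished, I will remove it and see.

I don't think the RAM is the problem. I have no spare desktop memory here; these sticks have been in use for 4 to 5 years and unRAID never errored, crashed, or froze. My soft router, laptop, and mini PC all use laptop memory, so swapping modules to test is not easy.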

Link to comment
On 9/7/2023 at 6:07 PM, JackieWu said:

 

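I looked through your logs from September 1 to today. There are quite a few errors, including disk filesystem corruption as well as NIC driver errors.

My suggestions:

  • Repair the filesystem on Disk1;
  • Your NIC is using the r8169 driver, but the driver plugin you installed is for r8152. I suggest removing that plugin, rebooting unRAID, and observing;
  • Test your RAM; see here for a reference.

The Disk1 repair ended with: Sorry, could not find valid secondary superblock

The unRAID log is as follows: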

Sep 12 05:33:18 Tower emhttpd: spinning down /dev/sde
Sep 12 05:33:20 Tower emhttpd: read SMART /dev/sde
Sep 12 06:00:28 Tower kernel: XFS (md1p1): Metadata corruption detected at xfs_buf_ioend+0xa9/0x384 [xfs], xfs_inode block 0xeeb46790 xfs_inode_buf_verify
Sep 12 06:00:28 Tower kernel: XFS (md1p1): Unmount and run xfs_repair
Sep 12 06:00:28 Tower kernel: XFS (md1p1): First 128 bytes of corrupted metadata buffer:
Sep 12 06:00:28 Tower kernel: 00000000: 9a 7f b7 4f f4 7d e0 f7 1a 2a 09 84 35 61 10 ef  ...O.}...*..5a..
Sep 12 06:00:28 Tower kernel: 00000010: 38 5f b8 35 98 0b 9c 29 dd ca 0b b6 cc ff 38 72  8_.5...)......8r
Sep 12 06:00:28 Tower kernel: 00000020: c3 c1 e1 f9 f0 a2 8e 27 5e 12 1f 44 6c 10 65 83  .......'^..Dl.e.
Sep 12 06:00:28 Tower kernel: 00000030: c7 47 8c 19 43 8c 70 ff e7 da 7d 15 ee 1a 04 1a  .G..C.p...}.....
Sep 12 06:00:28 Tower kernel: 00000040: 1e ed f8 1c a1 de 00 f1 d6 a2 f4 7a 1e fd d0 66  ...........z...f
Sep 12 06:00:28 Tower kernel: 00000050: 06 20 93 27 c2 42 68 02 7b fe 1c 76 90 c1 69 b6  . .'.Bh.{..v..i.
Sep 12 06:00:28 Tower kernel: 00000060: 43 66 1e f3 c8 2f 7a ce 8f 50 13 4f c7 69 f1 3a  Cf.../z..P.O.i.:
Sep 12 06:00:28 Tower kernel: 00000070: 70 3e 90 5f 13 87 ee 2b 2a e6 08 29 67 00 39 af  p>._...+*..)g.9.
Sep 12 06:00:28 Tower kernel: XFS (md1p1): metadata I/O error in "xfs_imap_to_bp+0x50/0x70 [xfs]" at daddr 0xeeb46790 len 32 error 117
Sep 12 06:00:29 Tower kernel: XFS (md1p1): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x19c/0x26f [xfs] (fs/xfs/xfs_trans_buf.c:296).  Shutting down filesystem.
Sep 12 06:00:29 Tower kernel: XFS (md1p1): Please unmount the filesystem and rectify the problem(s)
Sep 12 06:10:58 Tower ntpd[1245]: receive: Unexpected origin timestamp 0xe8aa0ef2.4699dd96 does not match aorg 0xe8aa0ef2.46985d01 from [email protected] xmt 0xe8aa0ef2.7475ae15
Sep 12 06:13:06 Tower ntpd[1245]: receive: Unexpected origin timestamp 0xe8aa0f72.469581dd does not match aorg 0xe8aa0f72.46972e59 from [email protected] xmt 0xe8aa0f72.7512d614
Sep 12 06:20:15 Tower ntpd[1245]: receive: Unexpected origin timestamp 0xe8aa111f.469fbc8f does not match aorg 0xe8aa111f.469c53f6 from [email protected] xmt 0xe8aa111f.59a190d3
Sep 12 06:20:51 Tower kernel: md1p1: writeback error on inode 2200312912, offset 251658240, sector 2548144816

Link to comment
