JoeZhao Posted April 2, 2022
Ever since I installed Unraid, the machine has been hanging at irregular intervals (it lasts roughly 2-3 days each time). Once it hangs I can no longer reach it over HTTP or SSH, and it cannot be shut down cleanly either (I set up a script that monitors the network and runs powerdown -r to reboot if ping fails). Every time, my only option is to pull the power, which is hard on the disks. At first I enabled remote logging but could not find the problem. Later I enabled syslog mirroring, and when it hung again today I found some odd log entries:
Apr 2 03:48:01 Unraid kernel: ------------[ cut here ]------------ Apr 2 03:48:01 Unraid kernel: NETDEV WATCHDOG: eth0 (igb): transmit queue 6 timed out Apr 2 03:48:01 Unraid kernel: WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0xcf/0x12b Apr 2 03:48:01 Unraid kernel: Modules linked in: ccp macvlan xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod i915 iosf_mbi drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nct6683 ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding dm_mod dax x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl nvme intel_cstate intel_uncore nvme_core ahci video i2c_i801 libahci i2c_smbus backlight acpi_pad button igb i2c_algo_bit i2c_core e1000e Apr 2 03:48:01 Unraid kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G BUD W 5.10.28-Unraid #1 Apr 2 03:48:01 Unraid kernel: Hardware name: To Be Filled By O.E.M.
To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019 Apr 2 03:48:01 Unraid kernel: RIP: 0010:dev_watchdog+0xcf/0x12b Apr 2 03:48:01 Unraid kernel: Code: 79 b7 00 00 75 38 48 89 ef c6 05 63 79 b7 00 01 e8 79 dd fc ff 44 89 e1 48 89 ee 48 c7 c7 ef 7f de 81 48 89 c2 e8 50 16 10 00 <0f> 0b eb 10 41 ff c4 48 05 40 01 00 00 41 39 f4 75 9d eb 16 48 8b Apr 2 03:48:01 Unraid kernel: RSP: 0018:ffffc90000180ed8 EFLAGS: 00010286 Apr 2 03:48:01 Unraid kernel: RAX: 0000000000000000 RBX: ffff888104ec0438 RCX: 0000000000000027 Apr 2 03:48:01 Unraid kernel: RDX: 00000000ffffefff RSI: 0000000000000001 RDI: ffff88902ee98920 Apr 2 03:48:01 Unraid kernel: RBP: ffff888104ec0000 R08: 0000000000000000 R09: 00000000ffffefff Apr 2 03:48:01 Unraid kernel: R10: ffffc90000180d08 R11: ffffc90000180d00 R12: 0000000000000006 Apr 2 03:48:01 Unraid kernel: R13: ffffc90000180f10 R14: ffffc90000180f10 R15: ffffffff820060c8 Apr 2 03:48:01 Unraid kernel: FS: 0000000000000000(0000) GS:ffff88902ee80000(0000) knlGS:0000000000000000 Apr 2 03:48:01 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 2 03:48:01 Unraid kernel: CR2: 0000154834000010 CR3: 000000000400a002 CR4: 00000000003726e0 Apr 2 03:48:01 Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 2 03:48:01 Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Apr 2 03:48:01 Unraid kernel: Call Trace: Apr 2 03:48:01 Unraid kernel: <IRQ> Apr 2 03:48:01 Unraid kernel: call_timer_fn.isra.0+0x12/0x6f Apr 2 03:48:01 Unraid kernel: ? netif_tx_lock+0x7a/0x7a Apr 2 03:48:01 Unraid kernel: __run_timers.part.0+0x144/0x185 Apr 2 03:48:01 Unraid kernel: ? update_process_times+0x68/0x6e Apr 2 03:48:01 Unraid kernel: ? hrtimer_forward+0x73/0x7b Apr 2 03:48:01 Unraid kernel: ? tick_sched_timer+0x5a/0x64 Apr 2 03:48:01 Unraid kernel: ? 
timerqueue_add+0x62/0x68 Apr 2 03:48:01 Unraid kernel: run_timer_softirq+0x21/0x43 Apr 2 03:48:01 Unraid kernel: __do_softirq+0xc4/0x1c2 Apr 2 03:48:01 Unraid kernel: asm_call_irq_on_stack+0xf/0x20 Apr 2 03:48:01 Unraid kernel: </IRQ> Apr 2 03:48:01 Unraid kernel: do_softirq_own_stack+0x2c/0x39 Apr 2 03:48:01 Unraid kernel: __irq_exit_rcu+0x45/0x80 Apr 2 03:48:01 Unraid kernel: sysvec_apic_timer_interrupt+0x87/0x95 Apr 2 03:48:01 Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20 Apr 2 03:48:01 Unraid kernel: RIP: 0010:arch_local_irq_enable+0x7/0x8 Apr 2 03:48:01 Unraid kernel: Code: 00 48 83 c4 28 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 9c 58 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 55 8b af 28 04 00 00 b8 01 00 00 00 45 31 c9 53 45 31 d2 39 c5 Apr 2 03:48:01 Unraid kernel: RSP: 0018:ffffc900000b7ea0 EFLAGS: 00000246 Apr 2 03:48:01 Unraid kernel: RAX: ffff88902eea2380 RBX: 0000000000000006 RCX: 000000000000001f Apr 2 03:48:01 Unraid kernel: RDX: 0000000000000000 RSI: 000000003c9b28ab RDI: 0000000000000000 Apr 2 03:48:01 Unraid kernel: RBP: ffffe8ffffaa1600 R08: 00017210d94e2967 R09: 0000000000000392 Apr 2 03:48:01 Unraid kernel: R10: 000000007fffffff R11: 071c71c71c71c71c R12: 00017210d94e2967 Apr 2 03:48:01 Unraid kernel: R13: ffffffff820c5dc0 R14: 0000000000000006 R15: 0000000000000000 Apr 2 03:48:01 Unraid kernel: cpuidle_enter_state+0x101/0x1c4 Apr 2 03:48:01 Unraid kernel: cpuidle_enter+0x25/0x31 Apr 2 03:48:01 Unraid kernel: do_idle+0x1a6/0x214 Apr 2 03:48:01 Unraid kernel: cpu_startup_entry+0x18/0x1a Apr 2 03:48:01 Unraid kernel: secondary_startup_64_no_verify+0xb0/0xbb Apr 2 03:48:01 Unraid kernel: ---[ end trace fb5642dccbb87fb3 ]--- Apr 2 03:48:01 Unraid kernel: igb 0000:01:00.0 eth0: Reset adapter Apr 2 03:48:02 Unraid kernel: igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX Apr 2 03:48:33 Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU Apr 2 03:48:33 Unraid 
kernel: rcu: 9-....: (60000 ticks this GP) idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=14882 Apr 2 03:48:33 Unraid kernel: (t=60001 jiffies g=41273801 q=28685) Apr 2 03:48:33 Unraid kernel: NMI backtrace for cpu 9 Apr 2 03:48:33 Unraid kernel: CPU: 9 PID: 393 Comm: kcompactd0 Tainted: G BUD W 5.10.28-Unraid #1 Apr 2 03:48:33 Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019 Apr 2 03:48:33 Unraid kernel: Call Trace: Apr 2 03:48:33 Unraid kernel: <IRQ> Apr 2 03:48:33 Unraid kernel: dump_stack+0x6b/0x83 Apr 2 03:48:33 Unraid kernel: ? lapic_can_unplug_cpu+0x8e/0x8e Apr 2 03:48:33 Unraid kernel: nmi_cpu_backtrace+0x7d/0x8f Apr 2 03:48:33 Unraid kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3 Apr 2 03:48:33 Unraid kernel: rcu_dump_cpu_stacks+0x9f/0xc6 Apr 2 03:48:33 Unraid kernel: rcu_sched_clock_irq+0x1ec/0x543 Apr 2 03:48:33 Unraid kernel: ? trigger_load_balance+0x5a/0x1ca Apr 2 03:48:33 Unraid kernel: update_process_times+0x50/0x6e Apr 2 03:48:33 Unraid kernel: tick_sched_timer+0x36/0x64 Apr 2 03:48:33 Unraid kernel: __hrtimer_run_queues+0xb7/0x10b Apr 2 03:48:33 Unraid kernel: ? 
tick_sched_do_timer+0x39/0x39 Apr 2 03:48:33 Unraid kernel: hrtimer_interrupt+0x8d/0x15b Apr 2 03:48:33 Unraid kernel: __sysvec_apic_timer_interrupt+0x5d/0x68 Apr 2 03:48:33 Unraid kernel: asm_call_irq_on_stack+0xf/0x20 Apr 2 03:48:33 Unraid kernel: </IRQ> Apr 2 03:48:33 Unraid kernel: sysvec_apic_timer_interrupt+0x71/0x95 Apr 2 03:48:33 Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20 Apr 2 03:48:33 Unraid kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x79/0x18a Apr 2 03:48:33 Unraid kernel: Code: c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 74 0c 0f ba e0 08 72 1a c6 47 01 00 eb 14 85 c0 74 0a 8b 07 84 c0 74 04 f3 90 <eb> f6 66 c7 07 01 00 c3 48 c7 c0 00 30 02 00 65 48 03 05 f0 8e f8 Apr 2 03:48:33 Unraid kernel: RSP: 0018:ffffc90000bbbb10 EFLAGS: 00000202 Apr 2 03:48:33 Unraid kernel: RAX: 0000000000000101 RBX: ffff88810bda92c0 RCX: 000ffffffffff000 Apr 2 03:48:33 Unraid kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffea00042171e8 Apr 2 03:48:33 Unraid kernel: RBP: ffffc90000bbbbb0 R08: ffff888000000000 R09: 0000000000000000 Apr 2 03:48:33 Unraid kernel: R10: ffffea000ae5b440 R11: 00000000000267d8 R12: ffff888100e8d940 Apr 2 03:48:33 Unraid kernel: R13: ffffea000ae5b400 R14: 00000001538a7ba9 R15: ffff88810bda92c0 Apr 2 03:48:33 Unraid kernel: queued_spin_lock_slowpath+0x7/0xa Apr 2 03:48:33 Unraid kernel: page_vma_mapped_walk+0x497/0x4dc Apr 2 03:48:33 Unraid kernel: try_to_unmap_one+0x115/0x5f1 Apr 2 03:48:33 Unraid kernel: ? check_pte+0x27/0x106 Apr 2 03:48:33 Unraid kernel: rmap_walk_anon+0xe7/0x156 Apr 2 03:48:33 Unraid kernel: try_to_unmap+0x88/0xc9 Apr 2 03:48:33 Unraid kernel: ? page_remove_rmap+0x1d8/0x1d8 Apr 2 03:48:33 Unraid kernel: ? __rcu_read_unlock+0x5/0x5 Apr 2 03:48:33 Unraid kernel: ? page_get_anon_vma+0x65/0x65 Apr 2 03:48:33 Unraid kernel: ? mmu_notifier_invalidate_range+0x10/0x10 Apr 2 03:48:33 Unraid kernel: migrate_pages+0x499/0x7c1 Apr 2 03:48:33 Unraid kernel: ? 
move_freelist_tail+0xba/0xba Apr 2 03:48:33 Unraid kernel: ? isolate_freepages_block+0x26b/0x26b Apr 2 03:48:33 Unraid kernel: compact_zone+0x6b7/0x90a Apr 2 03:48:33 Unraid kernel: proactive_compact_node+0x75/0xa2 Apr 2 03:48:33 Unraid kernel: ? fragmentation_score_node+0x2b/0x59 Apr 2 03:48:33 Unraid kernel: kcompactd+0x1ee/0x22c Apr 2 03:48:33 Unraid kernel: ? init_wait_entry+0x24/0x24 Apr 2 03:48:33 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f Apr 2 03:48:33 Unraid kernel: kthread+0xe5/0xea Apr 2 03:48:33 Unraid kernel: ? __kthread_bind_mask+0x57/0x57 Apr 2 03:48:33 Unraid kernel: ret_from_fork+0x1f/0x30 Apr 2 03:48:44 Unraid dhcpcd[1999]: br0: fe80::270:87ff:fee0:519 is unreachable Apr 2 03:49:31 Unraid kernel: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 9-... } 63179 jiffies s: 27197 root: 0x200/. Apr 2 03:49:31 Unraid kernel: rcu: blocking rcu_node structures: Apr 2 03:49:31 Unraid kernel: Task dump for CPU 9: Apr 2 03:49:31 Unraid kernel: task:kcompactd0 state:R running task stack: 0 pid: 393 ppid: 2 flags:0x00004008 Apr 2 03:49:31 Unraid kernel: Call Trace: Apr 2 03:49:31 Unraid kernel: ? proactive_compact_node+0x75/0xa2 Apr 2 03:49:31 Unraid kernel: ? fragmentation_score_node+0x2b/0x59 Apr 2 03:49:31 Unraid kernel: ? kcompactd+0x1ee/0x22c Apr 2 03:49:31 Unraid kernel: ? init_wait_entry+0x24/0x24 Apr 2 03:49:31 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f Apr 2 03:49:31 Unraid kernel: ? kthread+0xe5/0xea Apr 2 03:49:31 Unraid kernel: ? __kthread_bind_mask+0x57/0x57 Apr 2 03:49:31 Unraid kernel: ? 
ret_from_fork+0x1f/0x30 Apr 2 03:51:33 Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU Apr 2 03:51:33 Unraid kernel: rcu: 9-....: (240003 ticks this GP) idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=59467 Apr 2 03:51:33 Unraid kernel: (t=240004 jiffies g=41273801 q=88942) Apr 2 03:51:33 Unraid kernel: NMI backtrace for cpu 9 Apr 2 03:51:33 Unraid kernel: CPU: 9 PID: 393 Comm: kcompactd0 Tainted: G BUD W 5.10.28-Unraid #1 Apr 2 03:51:33 Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019 Apr 2 03:51:33 Unraid kernel: Call Trace: Apr 2 03:51:33 Unraid kernel: <IRQ> Apr 2 03:51:33 Unraid kernel: dump_stack+0x6b/0x83 Apr 2 03:51:33 Unraid kernel: ? lapic_can_unplug_cpu+0x8e/0x8e Apr 2 03:51:33 Unraid kernel: nmi_cpu_backtrace+0x7d/0x8f Apr 2 03:51:33 Unraid kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3 Apr 2 03:51:33 Unraid kernel: rcu_dump_cpu_stacks+0x9f/0xc6 Apr 2 03:51:33 Unraid kernel: rcu_sched_clock_irq+0x1ec/0x543 Apr 2 03:51:33 Unraid kernel: ? trigger_load_balance+0x5a/0x1ca Apr 2 03:51:33 Unraid kernel: update_process_times+0x50/0x6e Apr 2 03:51:33 Unraid kernel: tick_sched_timer+0x36/0x64 Apr 2 03:51:33 Unraid kernel: __hrtimer_run_queues+0xb7/0x10b Apr 2 03:51:33 Unraid kernel: ? 
tick_sched_do_timer+0x39/0x39 Apr 2 03:51:33 Unraid kernel: hrtimer_interrupt+0x8d/0x15b Apr 2 03:51:33 Unraid kernel: __sysvec_apic_timer_interrupt+0x5d/0x68 Apr 2 03:51:33 Unraid kernel: asm_call_irq_on_stack+0xf/0x20 Apr 2 03:51:33 Unraid kernel: </IRQ> Apr 2 03:51:33 Unraid kernel: sysvec_apic_timer_interrupt+0x71/0x95 Apr 2 03:51:33 Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20 Apr 2 03:51:33 Unraid kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x79/0x18a Apr 2 03:51:33 Unraid kernel: Code: c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 74 0c 0f ba e0 08 72 1a c6 47 01 00 eb 14 85 c0 74 0a 8b 07 84 c0 74 04 f3 90 <eb> f6 66 c7 07 01 00 c3 48 c7 c0 00 30 02 00 65 48 03 05 f0 8e f8 Apr 2 03:51:33 Unraid kernel: RSP: 0018:ffffc90000bbbb10 EFLAGS: 00000202 Apr 2 03:51:33 Unraid kernel: RAX: 0000000000000101 RBX: ffff88810bda92c0 RCX: 000ffffffffff000 Apr 2 03:51:33 Unraid kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffea00042171e8 Apr 2 03:51:33 Unraid kernel: RBP: ffffc90000bbbbb0 R08: ffff888000000000 R09: 0000000000000000 Apr 2 03:51:33 Unraid kernel: R10: ffffea000ae5b440 R11: 00000000000267d8 R12: ffff888100e8d940 Apr 2 03:51:33 Unraid kernel: R13: ffffea000ae5b400 R14: 00000001538a7ba9 R15: ffff88810bda92c0 Apr 2 03:51:33 Unraid kernel: queued_spin_lock_slowpath+0x7/0xa Apr 2 03:51:33 Unraid kernel: page_vma_mapped_walk+0x497/0x4dc Apr 2 03:51:33 Unraid kernel: try_to_unmap_one+0x115/0x5f1 Apr 2 03:51:33 Unraid kernel: ? check_pte+0x27/0x106 Apr 2 03:51:33 Unraid kernel: rmap_walk_anon+0xe7/0x156 Apr 2 03:51:33 Unraid kernel: try_to_unmap+0x88/0xc9 Apr 2 03:51:33 Unraid kernel: ? page_remove_rmap+0x1d8/0x1d8 Apr 2 03:51:33 Unraid kernel: ? __rcu_read_unlock+0x5/0x5 Apr 2 03:51:33 Unraid kernel: ? page_get_anon_vma+0x65/0x65 Apr 2 03:51:33 Unraid kernel: ? mmu_notifier_invalidate_range+0x10/0x10 Apr 2 03:51:33 Unraid kernel: migrate_pages+0x499/0x7c1 Apr 2 03:51:33 Unraid kernel: ? 
move_freelist_tail+0xba/0xba Apr 2 03:51:33 Unraid kernel: ? isolate_freepages_block+0x26b/0x26b Apr 2 03:51:33 Unraid kernel: compact_zone+0x6b7/0x90a Apr 2 03:51:33 Unraid kernel: proactive_compact_node+0x75/0xa2 Apr 2 03:51:33 Unraid kernel: ? fragmentation_score_node+0x2b/0x59 Apr 2 03:51:33 Unraid kernel: kcompactd+0x1ee/0x22c Apr 2 03:51:33 Unraid kernel: ? init_wait_entry+0x24/0x24 Apr 2 03:51:33 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f Apr 2 03:51:33 Unraid kernel: kthread+0xe5/0xea Apr 2 03:51:33 Unraid kernel: ? __kthread_bind_mask+0x57/0x57 Apr 2 03:51:33 Unraid kernel: ret_from_fork+0x1f/0x30 Apr 2 03:52:31 Unraid kernel: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 9-... } 243403 jiffies s: 27197 root: 0x200/. Apr 2 03:52:31 Unraid kernel: rcu: blocking rcu_node structures: Apr 2 03:52:31 Unraid kernel: Task dump for CPU 9: Apr 2 03:52:31 Unraid kernel: task:kcompactd0 state:R running task stack: 0 pid: 393 ppid: 2 flags:0x00004008 Apr 2 03:52:31 Unraid kernel: Call Trace: Apr 2 03:52:31 Unraid kernel: ? proactive_compact_node+0x75/0xa2 Apr 2 03:52:31 Unraid kernel: ? fragmentation_score_node+0x2b/0x59 Apr 2 03:52:31 Unraid kernel: ? kcompactd+0x1ee/0x22c Apr 2 03:52:31 Unraid kernel: ? init_wait_entry+0x24/0x24 Apr 2 03:52:31 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f Apr 2 03:52:31 Unraid kernel: ? kthread+0xe5/0xea Apr 2 03:52:31 Unraid kernel: ? __kthread_bind_mask+0x57/0x57 Apr 2 03:52:31 Unraid kernel: ? ret_from_fork+0x1f/0x30 Apr 2 03:54:33 Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU Apr 2 03:54:33 Unraid kernel: rcu: 9-....: (420006 ticks this GP) idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=104071 Apr 2 03:54:33 Unraid kernel: (t=420007 jiffies g=41273801 q=145781) Apr 2 03:54:33 Unraid kernel: NMI backtrace for cpu 9 Apr 2 03:54:33 Unraid kernel: CPU: 9 PID: 393 Comm: kcompactd0 Tainted: G BUD W 5.10.28-Unraid #1 Apr 2 03:54:33 Unraid kernel: Hardware name: To Be Filled By O.E.M. 
To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019 Apr 2 03:54:33 Unraid kernel: Call Trace: Apr 2 03:54:33 Unraid kernel: <IRQ> Apr 2 03:54:33 Unraid kernel: dump_stack+0x6b/0x83 Apr 2 03:54:33 Unraid kernel: ? lapic_can_unplug_cpu+0x8e/0x8e Apr 2 03:54:33 Unraid kernel: nmi_cpu_backtrace+0x7d/0x8f Apr 2 03:54:33 Unraid kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3 Apr 2 03:54:33 Unraid kernel: rcu_dump_cpu_stacks+0x9f/0xc6 Apr 2 03:54:33 Unraid kernel: rcu_sched_clock_irq+0x1ec/0x543 Apr 2 03:54:33 Unraid kernel: ? trigger_load_balance+0x5a/0x1ca Apr 2 03:54:33 Unraid kernel: update_process_times+0x50/0x6e Apr 2 03:54:33 Unraid kernel: tick_sched_timer+0x36/0x64 Apr 2 03:54:33 Unraid kernel: __hrtimer_run_queues+0xb7/0x10b Apr 2 03:54:33 Unraid kernel: ? tick_sched_do_timer+0x39/0x39 Apr 2 03:54:33 Unraid kernel: hrtimer_interrupt+0x8d/0x15b Apr 2 03:54:33 Unraid kernel: __sysvec_apic_timer_interrupt+0x5d/0x68 Apr 2 03:54:33 Unraid kernel: asm_call_irq_on_stack+0xf/0x20 Apr 2 03:54:33 Unraid kernel: </IRQ> Apr 2 03:54:33 Unraid kernel: sysvec_apic_timer_interrupt+0x71/0x95 Apr 2 03:54:33 Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20 Apr 2 03:54:33 Unraid kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x79/0x18a Apr 2 03:54:33 Unraid kernel: Code: c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 74 0c 0f ba e0 08 72 1a c6 47 01 00 eb 14 85 c0 74 0a 8b 07 84 c0 74 04 f3 90 <eb> f6 66 c7 07 01 00 c3 48 c7 c0 00 30 02 00 65 48 03 05 f0 8e f8 Apr 2 03:54:33 Unraid kernel: RSP: 0018:ffffc90000bbbb10 EFLAGS: 00000202 Apr 2 03:54:33 Unraid kernel: RAX: 0000000000000101 RBX: ffff88810bda92c0 RCX: 000ffffffffff000 Apr 2 03:54:33 Unraid kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffea00042171e8 Apr 2 03:54:33 Unraid kernel: RBP: ffffc90000bbbbb0 R08: ffff888000000000 R09: 0000000000000000 Apr 2 03:54:33 Unraid kernel: R10: ffffea000ae5b440 R11: 00000000000267d8 R12: ffff888100e8d940 Apr 2 03:54:33 Unraid kernel: R13: ffffea000ae5b400 
R14: 00000001538a7ba9 R15: ffff88810bda92c0 Apr 2 03:54:33 Unraid kernel: queued_spin_lock_slowpath+0x7/0xa Apr 2 03:54:33 Unraid kernel: page_vma_mapped_walk+0x497/0x4dc Apr 2 03:54:33 Unraid kernel: try_to_unmap_one+0x115/0x5f1 Apr 2 03:54:33 Unraid kernel: ? check_pte+0x27/0x106 Apr 2 03:54:33 Unraid kernel: rmap_walk_anon+0xe7/0x156 Apr 2 03:54:33 Unraid kernel: try_to_unmap+0x88/0xc9 Apr 2 03:54:33 Unraid kernel: ? page_remove_rmap+0x1d8/0x1d8 Apr 2 03:54:33 Unraid kernel: ? __rcu_read_unlock+0x5/0x5 Apr 2 03:54:33 Unraid kernel: ? page_get_anon_vma+0x65/0x65 Apr 2 03:54:33 Unraid kernel: ? mmu_notifier_invalidate_range+0x10/0x10 Apr 2 03:54:33 Unraid kernel: migrate_pages+0x499/0x7c1 Apr 2 03:54:33 Unraid kernel: ? move_freelist_tail+0xba/0xba Apr 2 03:54:33 Unraid kernel: ? isolate_freepages_block+0x26b/0x26b Apr 2 03:54:33 Unraid kernel: compact_zone+0x6b7/0x90a Apr 2 03:54:33 Unraid kernel: proactive_compact_node+0x75/0xa2 Apr 2 03:54:33 Unraid kernel: ? fragmentation_score_node+0x2b/0x59 Apr 2 03:54:33 Unraid kernel: kcompactd+0x1ee/0x22c Apr 2 03:54:33 Unraid kernel: ? init_wait_entry+0x24/0x24 Apr 2 03:54:33 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f Apr 2 03:54:33 Unraid kernel: kthread+0xe5/0xea Apr 2 03:54:33 Unraid kernel: ? __kthread_bind_mask+0x57/0x57 Apr 2 03:54:33 Unraid kernel: ret_from_fork+0x1f/0x30 Apr 2 03:55:32 Unraid kernel: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 9-... } 423627 jiffies s: 27197 root: 0x200/. Apr 2 03:55:32 Unraid kernel: rcu: blocking rcu_node structures: Apr 2 03:55:32 Unraid kernel: Task dump for CPU 9: Apr 2 03:55:32 Unraid kernel: task:kcompactd0 state:R running task stack: 0 pid: 393 ppid: 2 flags:0x00004008 Apr 2 03:55:32 Unraid kernel: Call Trace: Apr 2 03:55:32 Unraid kernel: ? proactive_compact_node+0x75/0xa2 Apr 2 03:55:32 Unraid kernel: ? fragmentation_score_node+0x2b/0x59 Apr 2 03:55:32 Unraid kernel: ? kcompactd+0x1ee/0x22c Apr 2 03:55:32 Unraid kernel: ? 
init_wait_entry+0x24/0x24 Apr 2 03:55:32 Unraid kernel: ? kcompactd_do_work+0x16f/0x16f Apr 2 03:55:32 Unraid kernel: ? kthread+0xe5/0xea Apr 2 03:55:32 Unraid kernel: ? __kthread_bind_mask+0x57/0x57 Apr 2 03:55:32 Unraid kernel: ? ret_from_fork+0x1f/0x30 Apr 2 03:57:33 Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU Apr 2 03:57:33 Unraid kernel: rcu: 9-....: (600009 ticks this GP) idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=148692 Apr 2 03:57:33 Unraid kernel: (t=600010 jiffies g=41273801 q=197481) Apr 2 03:57:33 Unraid kernel: NMI backtrace for cpu 9
Judging from the log, it looks as if something triggers a problem and the system then stays stuck repeating it in a loop. Has anyone run into something similar? How did you solve it?
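The network watchdog described in the first post (ping a host, run powerdown -r when it stops answering) could look roughly like the sketch below. This is only an illustration, not the poster's actual script: the target address, the failure threshold, the polling interval, and all function names are assumptions. Only the `powerdown -r` command comes from the post. Note that, as the post itself reports, such a script did not get the machine rebooted during these hangs.

```python
import subprocess
import time


def host_reachable(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo with the Linux ping utility; True if it was answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def request_reboot() -> None:
    """Run the Unraid command named in the post to reboot the server."""
    subprocess.run(["powerdown", "-r"], check=False)


def next_failure_count(previous: int, reachable: bool) -> int:
    """Reset the counter on success, otherwise count one more consecutive failure."""
    return 0 if reachable else previous + 1


def watch(host, max_failures=5, interval_s=60,
          probe=host_reachable, reboot=request_reboot, sleep=time.sleep):
    """Poll `host`; after `max_failures` consecutive misses, reboot once and stop.

    `probe`, `reboot` and `sleep` are injectable so the loop can be exercised
    without touching the network or the machine (hypothetical design choice).
    """
    failures = 0
    while True:
        failures = next_failure_count(failures, probe(host))
        if failures >= max_failures:
            reboot()
            return
        sleep(interval_s)
```

Requiring several consecutive failures before rebooting avoids restarting the server over a single dropped ping. To use it, one would call something like `watch("192.168.1.1")` with the address of a router or another always-on host; that address is a placeholder.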
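lyqalex Posted April 2, 2022
Faults of this kind are mostly caused by hardware. You need to provide detailed information: list the motherboard, CPU, other hardware, plugins, and so on. Generate the diagnostics in Unraid under Tools > Diagnostics and post them here. Before that, you can stop all Docker containers and VMs, and in particular uninstall any components that did not come from the app store, then test stability. If the fault still occurs, you can conclude it is a hardware problem. PS: do not use ES (engineering sample) or QS (qualification sample) CPUs.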
JoeZhao Posted April 4, 2022 Author
On 4/2/2022 at 7:58 PM, lyqalex said: Faults of this kind are mostly caused by hardware. You need to provide detailed information: list the motherboard, CPU, other hardware, plugins, and so on. Generate the diagnostics in Unraid under Tools > Diagnostics and post them here. Before that, you can stop all Docker containers and VMs, and in particular uninstall any components that did not come from the app store, then test stability. If the fault still occurs, you can conclude it is a hardware problem. PS: do not use ES (engineering sample) or QS (qualification sample) CPUs.
Thanks for the reply. The diagnostics are attached. After I disabled the VMs, it hung again yesterday. The log is below:
Apr 3 01:58:59 Joes-Unraid kernel: ------------[ cut here ]------------ Apr 3 01:58:59 Joes-Unraid kernel: NETDEV WATCHDOG: eth0 (igb): transmit queue 3 timed out Apr 3 01:58:59 Joes-Unraid kernel: WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0xcf/0x12b Apr 3 01:58:59 Joes-Unraid kernel: Modules linked in: macvlan xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod i915 iosf_mbi drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nct6683 ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding dm_mod dax x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl nvme ahci intel_cstate i2c_i801 intel_uncore nvme_core video libahci i2c_smbus backlight acpi_pad button igb i2c_algo_bit i2c_core e1000e Apr 3 01:58:59 Joes-Unraid kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G U W 5.10.28-Unraid #1 Apr 3 01:58:59 Joes-Unraid kernel: Hardware name: To Be Filled By O.E.M.
To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019 Apr 3 01:58:59 Joes-Unraid kernel: RIP: 0010:dev_watchdog+0xcf/0x12b Apr 3 01:58:59 Joes-Unraid kernel: Code: 79 b7 00 00 75 38 48 89 ef c6 05 63 79 b7 00 01 e8 79 dd fc ff 44 89 e1 48 89 ee 48 c7 c7 ef 7f de 81 48 89 c2 e8 50 16 10 00 <0f> 0b eb 10 41 ff c4 48 05 40 01 00 00 41 39 f4 75 9d eb 16 48 8b Apr 3 01:58:59 Joes-Unraid kernel: RSP: 0018:ffffc90000180ed8 EFLAGS: 00010286 Apr 3 01:58:59 Joes-Unraid kernel: RAX: 0000000000000000 RBX: ffff888105bb8438 RCX: 0000000000000027 Apr 3 01:58:59 Joes-Unraid kernel: RDX: 00000000ffffefff RSI: 0000000000000001 RDI: ffff88902ee98920 Apr 3 01:58:59 Joes-Unraid kernel: RBP: ffff888105bb8000 R08: 0000000000000000 R09: 00000000ffffefff Apr 3 01:58:59 Joes-Unraid kernel: R10: ffffc90000180d08 R11: ffffc90000180d00 R12: 0000000000000003 Apr 3 01:58:59 Joes-Unraid kernel: R13: ffffc90000180f10 R14: ffffc90000180f10 R15: ffffffff820060c8 Apr 3 01:58:59 Joes-Unraid kernel: FS: 0000000000000000(0000) GS:ffff88902ee80000(0000) knlGS:0000000000000000 Apr 3 01:58:59 Joes-Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 3 01:58:59 Joes-Unraid kernel: CR2: 000015056d9ca900 CR3: 000000000400a005 CR4: 00000000003706e0 Apr 3 01:58:59 Joes-Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 3 01:58:59 Joes-Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Apr 3 01:58:59 Joes-Unraid kernel: Call Trace: Apr 3 01:58:59 Joes-Unraid kernel: <IRQ> Apr 3 01:58:59 Joes-Unraid kernel: call_timer_fn.isra.0+0x12/0x6f Apr 3 01:58:59 Joes-Unraid kernel: ? netif_tx_lock+0x7a/0x7a Apr 3 01:58:59 Joes-Unraid kernel: __run_timers.part.0+0x144/0x185 Apr 3 01:58:59 Joes-Unraid kernel: ? update_process_times+0x68/0x6e Apr 3 01:58:59 Joes-Unraid kernel: ? hrtimer_forward+0x73/0x7b Apr 3 01:58:59 Joes-Unraid kernel: ? tick_sched_timer+0x5a/0x64 Apr 3 01:58:59 Joes-Unraid kernel: ? 
timerqueue_add+0x62/0x68 Apr 3 01:58:59 Joes-Unraid kernel: run_timer_softirq+0x21/0x43 Apr 3 01:58:59 Joes-Unraid kernel: __do_softirq+0xc4/0x1c2 Apr 3 01:58:59 Joes-Unraid kernel: asm_call_irq_on_stack+0xf/0x20 Apr 3 01:58:59 Joes-Unraid kernel: </IRQ> Apr 3 01:58:59 Joes-Unraid kernel: do_softirq_own_stack+0x2c/0x39 Apr 3 01:58:59 Joes-Unraid kernel: __irq_exit_rcu+0x45/0x80 Apr 3 01:58:59 Joes-Unraid kernel: sysvec_apic_timer_interrupt+0x87/0x95 Apr 3 01:58:59 Joes-Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20 Apr 3 01:58:59 Joes-Unraid kernel: RIP: 0010:arch_local_irq_enable+0x7/0x8 Apr 3 01:58:59 Joes-Unraid kernel: Code: 00 48 83 c4 28 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 9c 58 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 55 8b af 28 04 00 00 b8 01 00 00 00 45 31 c9 53 45 31 d2 39 c5 Apr 3 01:58:59 Joes-Unraid kernel: RSP: 0018:ffffc900000b7ea0 EFLAGS: 00000246 Apr 3 01:58:59 Joes-Unraid kernel: RAX: ffff88902eea2380 RBX: 0000000000000006 RCX: 000000000000001f Apr 3 01:58:59 Joes-Unraid kernel: RDX: 0000000000000000 RSI: 000000003c9b28ab RDI: 0000000000000000 Apr 3 01:58:59 Joes-Unraid kernel: RBP: ffffe8ffffaa1600 R08: 0000285539e1a494 R09: 0000000000000385 Apr 3 01:58:59 Joes-Unraid kernel: R10: 000000007fffffff R11: 071c71c71c71c71c R12: 0000285539e1a494 Apr 3 01:58:59 Joes-Unraid kernel: R13: ffffffff820c5dc0 R14: 0000000000000006 R15: 0000000000000000 Apr 3 01:58:59 Joes-Unraid kernel: cpuidle_enter_state+0x101/0x1c4 Apr 3 01:58:59 Joes-Unraid kernel: cpuidle_enter+0x25/0x31 Apr 3 01:58:59 Joes-Unraid kernel: do_idle+0x1a6/0x214 Apr 3 01:58:59 Joes-Unraid kernel: cpu_startup_entry+0x18/0x1a Apr 3 01:58:59 Joes-Unraid kernel: secondary_startup_64_no_verify+0xb0/0xbb Apr 3 01:58:59 Joes-Unraid kernel: ---[ end trace 5f09cc7a9ef954d7 ]--- Apr 3 01:58:59 Joes-Unraid kernel: igb 0000:01:00.0 eth0: Reset adapter Apr 3 01:59:00 Joes-Unraid kernel: igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX/TX Apr 3 01:59:50 Joes-Unraid kernel: rcu: INFO: rcu_sched self-detected stall on CPU Apr 3 01:59:50 Joes-Unraid kernel: rcu: 6-....: (1 GPs behind) idle=682/1/0x4000000000000004 softirq=728000/728001 fqs=14870 Apr 3 01:59:50 Joes-Unraid kernel: (t=60000 jiffies g=2959521 q=21275) Apr 3 01:59:50 Joes-Unraid kernel: NMI backtrace for cpu 6 Apr 3 01:59:50 Joes-Unraid kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G U W 5.10.28-Unraid #1 Apr 3 01:59:50 Joes-Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370M Pro4, BIOS P4.20 10/31/2019 Apr 3 01:59:50 Joes-Unraid kernel: Call Trace: Apr 3 01:59:50 Joes-Unraid kernel: <IRQ> Apr 3 01:59:50 Joes-Unraid kernel: dump_stack+0x6b/0x83 Apr 3 01:59:50 Joes-Unraid kernel: ? lapic_can_unplug_cpu+0x8e/0x8e Apr 3 01:59:50 Joes-Unraid kernel: nmi_cpu_backtrace+0x7d/0x8f Apr 3 01:59:50 Joes-Unraid kernel: nmi_trigger_cpumask_backtrace+0x56/0xd3 Apr 3 01:59:50 Joes-Unraid kernel: rcu_dump_cpu_stacks+0x9f/0xc6 Apr 3 01:59:50 Joes-Unraid kernel: rcu_sched_clock_irq+0x1ec/0x543 Apr 3 01:59:50 Joes-Unraid kernel: update_process_times+0x50/0x6e Apr 3 01:59:50 Joes-Unraid kernel: tick_sched_timer+0x36/0x64 Apr 3 01:59:50 Joes-Unraid kernel: __hrtimer_run_queues+0xb7/0x10b Apr 3 01:59:50 Joes-Unraid kernel: ? 
tick_sched_do_timer+0x39/0x39 Apr 3 01:59:50 Joes-Unraid kernel: hrtimer_interrupt+0x8d/0x15b Apr 3 01:59:50 Joes-Unraid kernel: __sysvec_apic_timer_interrupt+0x5d/0x68 Apr 3 01:59:50 Joes-Unraid kernel: sysvec_apic_timer_interrupt+0x82/0x95 Apr 3 01:59:50 Joes-Unraid kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20 Apr 3 01:59:50 Joes-Unraid kernel: RIP: 0010:nf_ct_zone_equal+0x24/0x2b [nf_conntrack] Apr 3 01:59:50 Joes-Unraid kernel: Code: e9 b9 f8 ff ff c3 89 d1 b8 01 00 00 00 31 d2 d3 e0 0f b6 4f 0f 85 c1 74 04 66 8b 57 0c 0f b6 7e 03 31 c9 85 c7 74 03 66 8b 0e <66> 39 d1 0f 94 c0 c3 48 8b 86 90 00 00 00 89 f9 48 89 f7 48 8b 80 Apr 3 01:59:50 Joes-Unraid kernel: RSP: 0018:ffffc90000230978 EFLAGS: 00000202 Apr 3 01:59:50 Joes-Unraid kernel: RAX: 0000000000000001 RBX: ffff88811e778b88 RCX: 0000000000000000 Apr 3 01:59:50 Joes-Unraid kernel: RDX: 0000000000000000 RSI: ffff88811e778c8c RDI: 0000000000000003 Apr 3 01:59:50 Joes-Unraid kernel: RBP: ffff88811e778c80 R08: ffff88811e778b40 R09: ffff88811e778b88 Apr 3 01:59:50 Joes-Unraid kernel: R10: 0000000000000001 R11: ffffffff8210b440 R12: ffffffff8210b440 Apr 3 01:59:50 Joes-Unraid kernel: R13: ffffc900002309e0 R14: ffff88811e778c8c R15: ffff88811e778b40 Apr 3 01:59:50 Joes-Unraid kernel: nf_conntrack_tuple_taken+0xdc/0x144 [nf_conntrack] Apr 3 01:59:50 Joes-Unraid kernel: nf_nat_used_tuple+0x2e/0x49 [nf_nat] Apr 3 01:59:50 Joes-Unraid kernel: nf_nat_setup_info+0x332/0x6aa [nf_nat] Apr 3 01:59:50 Joes-Unraid kernel: ? ipt_do_table+0x4bb/0x5c0 [ip_tables] Apr 3 01:59:50 Joes-Unraid kernel: ? ipt_do_table+0x570/0x5c0 [ip_tables] Apr 3 01:59:50 Joes-Unraid kernel: __nf_nat_alloc_null_binding+0x5f/0x76 [nf_nat] Apr 3 01:59:50 Joes-Unraid kernel: nf_nat_inet_fn+0x91/0x183 [nf_nat] Apr 3 01:59:50 Joes-Unraid kernel: ? 
br_handle_frame_finish+0x351/0x351 Apr 3 01:59:50 Joes-Unraid kernel: nf_nat_ipv4_pre_routing+0x1e/0x4a [nf_nat] Apr 3 01:59:50 Joes-Unraid kernel: nf_hook_slow+0x39/0x8e Apr 3 01:59:50 Joes-Unraid kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter] Apr 3 01:59:50 Joes-Unraid kernel: NF_HOOK+0xb7/0xf7 [br_netfilter] Apr 3 01:59:50 Joes-Unraid kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter] Apr 3 01:59:50 Joes-Unraid kernel: br_nf_pre_routing+0x229/0x239 [br_netfilter] Apr 3 01:59:50 Joes-Unraid kernel: ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter] Apr 3 01:59:50 Joes-Unraid kernel: br_handle_frame+0x25e/0x2a6 Apr 3 01:59:50 Joes-Unraid kernel: ? br_pass_frame_up+0xda/0xda Apr 3 01:59:50 Joes-Unraid kernel: __netif_receive_skb_core+0x335/0x4e7 Apr 3 01:59:50 Joes-Unraid kernel: __netif_receive_skb_list_core+0x78/0x104 Apr 3 01:59:50 Joes-Unraid kernel: netif_receive_skb_list_internal+0x1bf/0x1f2 Apr 3 01:59:50 Joes-Unraid kernel: ? dev_gro_receive+0x55d/0x578 Apr 3 01:59:50 Joes-Unraid kernel: gro_normal_list+0x1d/0x39 Apr 3 01:59:50 Joes-Unraid kernel: napi_complete_done+0x79/0x104 Apr 3 01:59:50 Joes-Unraid kernel: igb_poll+0xcc8/0xef6 [igb] Apr 3 01:59:50 Joes-Unraid kernel: net_rx_action+0xf4/0x29d Apr 3 01:59:50 Joes-Unraid kernel: __do_softirq+0xc4/0x1c2 Apr 3 01:59:50 Joes-Unraid kernel: asm_call_irq_on_stack+0xf/0x20 Apr 3 01:59:50 Joes-Unraid kernel: </IRQ> Apr 3 01:59:50 Joes-Unraid kernel: do_softirq_own_stack+0x2c/0x39 Apr 3 01:59:50 Joes-Unraid kernel: __irq_exit_rcu+0x45/0x80 Apr 3 01:59:50 Joes-Unraid kernel: common_interrupt+0x119/0x12e Apr 3 01:59:50 Joes-Unraid kernel: asm_common_interrupt+0x1e/0x40 Apr 3 01:59:50 Joes-Unraid kernel: RIP: 0010:arch_local_irq_enable+0x7/0x8
This time I plan to try disabling Docker. Would switching to an older version solve the problem? I actually ran an unofficial Synology DSM install ("black Synology") on exactly the same hardware for several months without any problems at all.
unraid-diagnostics-20220404-1055.zip
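Because the pasted logs are long and repetitive, it can help to reduce them to the few facts that matter: which NIC queue timed out, which CPU stalled, and whether successive RCU stall reports belong to the same grace period. The following is a minimal sketch under stated assumptions: the function and pattern names are made up for illustration, and the regular expressions were written only against the message shapes visible in this thread, so other kernel versions may format these lines differently.

```python
import re

# Patterns modelled on the log lines quoted in this thread (assumption:
# other kernels may word these messages differently).
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (\S+) \((\w+)\): transmit queue (\d+) timed out"
)
STALL_CPU_RE = re.compile(r"rcu:\s+(\d+)-\S+:\s+\(")
STALL_TIME_RE = re.compile(r"\(t=(\d+) jiffies g=(\d+) q=\d+\)")


def summarize(log_text: str) -> dict:
    """Extract NIC watchdog timeouts and RCU stall reports from raw syslog text."""
    watchdogs = [
        (dev, driver, int(queue))
        for dev, driver, queue in WATCHDOG_RE.findall(log_text)
    ]
    stalled_cpus = sorted({int(cpu) for cpu in STALL_CPU_RE.findall(log_text)})
    # Group the stall durations by grace-period number (the g= field).
    stall_jiffies_by_gp = {}
    for jiffies, grace_period in STALL_TIME_RE.findall(log_text):
        stall_jiffies_by_gp.setdefault(int(grace_period), []).append(int(jiffies))
    return {
        "watchdogs": watchdogs,
        "stalled_cpus": stalled_cpus,
        "stall_jiffies_by_gp": stall_jiffies_by_gp,
    }


# A few lines copied from the first post, flattened the same way as in the thread.
SAMPLE = (
    "Apr 2 03:48:01 Unraid kernel: NETDEV WATCHDOG: eth0 (igb): transmit queue 6 timed out "
    "Apr 2 03:48:33 Unraid kernel: rcu: 9-....: (60000 ticks this GP) "
    "idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=14882 "
    "Apr 2 03:48:33 Unraid kernel: (t=60001 jiffies g=41273801 q=28685) "
    "Apr 2 03:51:33 Unraid kernel: rcu: 9-....: (240003 ticks this GP) "
    "idle=8d6/1/0x4000000000000000 softirq=11958037/11958037 fqs=59467 "
    "Apr 2 03:51:33 Unraid kernel: (t=240004 jiffies g=41273801 q=88942)"
)

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Applied to the first post, every stall report (03:48:33, 03:51:33, 03:54:33, 03:57:33) names CPU 9 and the same grace period g=41273801 while t grows from 60001 to 600010 jiffies. That suggests one stall that never clears rather than several separate events, which matches the poster's impression that the system is stuck in a loop.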
lyqalex Posted April 4, 2022 Solution
Uninstall the vfio.pci plugin; from 6.9.x onward this plugin is no longer needed. After rebooting, redo your passthrough settings.
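JoeZhao Posted April 15, 2022 Author
Reminder: after uninstalling vfio.pci, remember to also delete vfio.conf from the flash drive, otherwise there will be no network after the reboot.
Progress update:
After removing vfio.pci, the earlier errors no longer occur -> it ran stably for 3 days.
But a different problem appeared: when Docker uses a macvlan network, similar errors are also reported sporadically -> the server goes down (without leaving any log at all).
After changing over every Docker container that used macvlan, it has now run stably for 4.5 days with no errors.
I will keep observing.
Is there any known solution to this macvlan problem?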
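To find out which Docker networks still use the macvlan driver before changing containers over, something like the following sketch could be used. It relies on the standard `docker network ls --format` option; the function names and the sample network names in the usage note are hypothetical.

```python
import subprocess
from typing import List


def list_docker_networks() -> str:
    """Return one 'name<TAB>driver' line per Docker network."""
    result = subprocess.run(
        ["docker", "network", "ls", "--format", "{{.Name}}\t{{.Driver}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def macvlan_networks(listing: str) -> List[str]:
    """Pick the names of networks whose driver is macvlan from the listing."""
    names = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        name, _, driver = line.partition("\t")
        if driver.strip() == "macvlan":
            names.append(name.strip())
    return names
```

On the server, `macvlan_networks(list_docker_networks())` would return the network names to look at. For example, a listing of `bridge`, `br0` and `host` where only `br0` uses macvlan yields just `br0`.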
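lyqalex Posted April 15, 2022
There are currently two ways to deal with this macvlan problem:
1. Roll back to 6.8.3. This is a temporary workaround, worth considering if the system is not very new.
2. Upgrade to 6.10 rc4. This is also a temporary workaround, recommended for newer systems.
PS: of course, waiting for the 6.10 stable release and upgrading then would be best.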
xxb Posted November 6, 2022
Same here. It has been several months and there is still no fix. I have swapped both the CPU and the RAM, and neither helped.
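akina Posted August 24, 2023
Same problem. Mine never lasts more than a few days before it hangs. On versions up to 6.9 it would run for months without hanging, and the uptime counter was usually reset only because of a power cut. In other words, I never once saw a hang on 6.9 or earlier.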
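JackieWu Posted August 27, 2023
On 8/24/2023 at 11:02 AM, akina said: Same problem. Mine never lasts more than a few days before it hangs. On versions up to 6.9 it would run for months without hanging, and the uptime counter was usually reset only because of a power cut. In other words, I never once saw a hang on 6.9 or earlier.
You could try this method and see whether it solves the problem: 《unRAID 失联问题解析以及如何给Docker分配独立网口》 ("Analyzing unRAID loss-of-connectivity problems and how to give Docker a dedicated network port")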
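akina Posted August 29, 2023
On 8/28/2023 at 12:21 AM, JackieWu said: You could try this method and see whether it solves the problem: 《unRAID 失联问题解析以及如何给Docker分配独立网口》
Some background: I had added a USB 2.5G NIC alongside the 1G onboard NIC. I only had one network cable, plugged into the 2.5G NIC, so I enabled bridging, because otherwise I could not reach the server.
I have now made the 2.5G NIC eth0, disabled bridging, and switched to ipvlan. I will see in a few days whether it helps. Personally I think it should fix it. After I added the 2.5G NIC, disks kept failing one after another and being rebuilt, and I also upgraded the system, without realizing that the NIC might be the problem.
I cannot use eth1 for Docker because it is the 1G port, so I am sticking with the way I used it before with a single NIC.
Thank you very much.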
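akina Posted August 29, 2023 On 8/28/2023 at 12:21 AM, JackieWu said: "You could try the method in this article and see whether it solves the problem: 《unRAID 失联问题解析以及如何给Docker分配独立网口》" 6.12.4-rc19 no longer has a macvlan option, and I have already removed the bridge and separated the two network ports. The server still drops off the network. I am not sure whether it is crashing or only losing connectivity, because my monitor no longer has a VGA port and I cannot check. The NIC I added is a Realtek ("crab") USB 2.5G NIC, which is a bit of a headache. (I installed the driver plugin for this NIC.)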
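JackieWu Posted August 29, 2023 9 minutes ago, akina said: "6.12.4-rc19 no longer has a macvlan option, and I have already removed the bridge and separated the two network ports. The server still drops off the network. I am not sure whether it is crashing or only losing connectivity, because my monitor no longer has a VGA port and I cannot check. The NIC I added is a Realtek ("crab") USB 2.5G NIC, which is a bit of a headache. (I installed the driver plugin for this NIC.)" We need to look at the logs.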
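akina Posted August 31, 2023 On 8/29/2023 at 9:35 PM, JackieWu said: "We need to look at the logs." It now lasts a little over a day each time. The only workaround for the moment is to stop using the Realtek 2.5G NIC and see how it goes with just the onboard 1G NIC. If it still fails with a single NIC, I am out of ideas and will have to switch to a UGREEN or ZSpace (极空间) NAS. It would just be a pity to waste the license I bought.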
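JackieWu Posted August 31, 2023 1 minute ago, akina said: "It now lasts a little over a day each time. The only workaround for the moment is to stop using the Realtek 2.5G NIC and see how it goes with just the onboard 1G NIC. If it still fails with a single NIC, I am out of ideas and will have to switch to a UGREEN or ZSpace (极空间) NAS. It would just be a pity to waste the license I bought." Loss of connectivity can have many causes, and I have dealt with it under several different circumstances. It has to be diagnosed from the logs. If you would like help, upload them so that everyone can take a look.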
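akina Posted September 6, 2023 (edited) On 8/31/2023 at 5:51 PM, JackieWu said: "Loss of connectivity can have many causes, and I have dealt with it under several different circumstances. It has to be diagnosed from the logs. If you would like help, upload them so that everyone can take a look." I unplugged the 2.5G USB NIC and it has been 5 days with no problems so far. But another issue has been bothering me during this time: the shares disappear. The "Shares" page shows no folders at all, and they cannot be accessed. Stopping and restarting the array brings them back, and so does a reboot. Edited September 6, 2023 by akina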
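JackieWu Posted September 6, 2023 5 hours ago, akina said: "I unplugged the 2.5G USB NIC and it has been 5 days with no problems so far. But another issue has been bothering me during this time: the shares disappear. The "Shares" page shows no folders at all, and they cannot be accessed. Stopping and restarting the array brings them back, and so does a reboot." Same answer as before: we need to look at the logs.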
akina Posted September 6, 2023 8 hours ago, JackieWu said: "Same answer as before: we need to look at the logs." This is what the log looks like when all the shares disappear. I have captured part of it.
JackieWu Posted September 7, 2023 16 hours ago, akina said: "This is what the log looks like when all the shares disappear. I have captured part of it." Do you have the complete log?
akina Posted September 7, 2023 6 minutes ago, JackieWu said: "Do you have the complete log?" I have uploaded it. It should also contain the logs from the disconnections. tower-diagnostics-20230907-1626.zip
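JackieWu Posted September 7, 2023 (edited) 1 hour ago, akina said: "I have uploaded it. It should also contain the logs from the disconnections. tower-diagnostics-20230907-1626.zip" I went through the entries in your logs from September 1 to today. There are quite a lot of errors, including both disk filesystem corruption and errors from the NIC driver. My suggestions: repair the filesystem on your Disk1; your NIC is using the r8169 driver, but the driver plugin you downloaded is for r8152, so I suggest removing that plugin, rebooting unRAID, and observing; test your RAM, there is a reference here. Edited September 7, 2023 by JackieWu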
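The driver mismatch described above (r8169 in use versus an r8152 plugin) can be checked directly, because Linux exposes the driver bound to a network device as a symlink under sysfs. The following is a minimal sketch under stated assumptions: the helper name `bound_driver` and its `sysfs_root` parameter are hypothetical and exist only for illustration and testing, and the interface name passed in (for example `eth0`) depends on the system.

```python
import os
from typing import Optional


def bound_driver(iface: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the kernel driver bound to `iface`, or None if there is none.

    Reads the symlink <sysfs_root>/class/net/<iface>/device/driver, whose
    target directory is named after the driver (for example r8169 or r8152).
    Virtual interfaces such as bridges have no such link, so None is returned.
    """
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    if not os.path.exists(link):
        return None
    # Resolve the symlink and keep only the final path component.
    return os.path.basename(os.path.realpath(link))
```

Calling `bound_driver("eth0")` and `bound_driver("eth1")` on the server would show which driver each port is actually using, which can then be compared with the driver the installed plugin provides.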
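akina Posted September 7, 2023 3 hours ago, JackieWu said: "I went through the entries in your logs from September 1 to today. There are quite a lot of errors, including both disk filesystem corruption and errors from the NIC driver. My suggestions: repair the filesystem on your Disk1; your NIC is using the r8169 driver, but the driver plugin you downloaded is for r8152, so I suggest removing that plugin, rebooting unRAID, and observing; test your RAM, there is a reference here." I am repairing disk1 now. When I searched for a driver plugin for the NIC, this same driver was what came up. Once the repair is finished I will remove the plugin and see. I do not think the RAM is likely to be the problem. I have no spare desktop RAM here, and these modules were in use for 4 or 5 years during which unraid never errored, crashed, or froze. My soft router, laptop, and mini PC all use laptop RAM, so swapping modules to test is not easy.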
akina Posted September 12, 2023 On 9/7/2023 at 6:07 PM, JackieWu said: "I went through the entries in your logs from September 1 to today. There are quite a lot of errors, including both disk filesystem corruption and errors from the NIC driver. My suggestions: repair the filesystem on your Disk1; your NIC is using the r8169 driver, but the driver plugin you downloaded is for r8152, so I suggest removing that plugin, rebooting unRAID, and observing; test your RAM, there is a reference here." The repair of disk1 ended with "Sorry, could not find valid secondary superblock". The unraid log is as follows:
Sep 12 05:33:18 Tower emhttpd: spinning down /dev/sde
Sep 12 05:33:20 Tower emhttpd: read SMART /dev/sde
Sep 12 06:00:28 Tower kernel: XFS (md1p1): Metadata corruption detected at xfs_buf_ioend+0xa9/0x384 [xfs], xfs_inode block 0xeeb46790 xfs_inode_buf_verify
Sep 12 06:00:28 Tower kernel: XFS (md1p1): Unmount and run xfs_repair
Sep 12 06:00:28 Tower kernel: XFS (md1p1): First 128 bytes of corrupted metadata buffer:
Sep 12 06:00:28 Tower kernel: 00000000: 9a 7f b7 4f f4 7d e0 f7 1a 2a 09 84 35 61 10 ef ...O.}...*..5a..
Sep 12 06:00:28 Tower kernel: 00000010: 38 5f b8 35 98 0b 9c 29 dd ca 0b b6 cc ff 38 72 8_.5...)......8r
Sep 12 06:00:28 Tower kernel: 00000020: c3 c1 e1 f9 f0 a2 8e 27 5e 12 1f 44 6c 10 65 83 .......'^..Dl.e.
Sep 12 06:00:28 Tower kernel: 00000030: c7 47 8c 19 43 8c 70 ff e7 da 7d 15 ee 1a 04 1a .G..C.p...}.....
Sep 12 06:00:28 Tower kernel: 00000040: 1e ed f8 1c a1 de 00 f1 d6 a2 f4 7a 1e fd d0 66 ...........z...f
Sep 12 06:00:28 Tower kernel: 00000050: 06 20 93 27 c2 42 68 02 7b fe 1c 76 90 c1 69 b6 . .'.Bh.{..v..i.
Sep 12 06:00:28 Tower kernel: 00000060: 43 66 1e f3 c8 2f 7a ce 8f 50 13 4f c7 69 f1 3a Cf.../z..P.O.i.:
Sep 12 06:00:28 Tower kernel: 00000070: 70 3e 90 5f 13 87 ee 2b 2a e6 08 29 67 00 39 af p>._...+*..)g.9.
Sep 12 06:00:28 Tower kernel: XFS (md1p1): metadata I/O error in "xfs_imap_to_bp+0x50/0x70 [xfs]" at daddr 0xeeb46790 len 32 error 117
Sep 12 06:00:29 Tower kernel: XFS (md1p1): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x19c/0x26f [xfs] (fs/xfs/xfs_trans_buf.c:296). Shutting down filesystem.
Sep 12 06:00:29 Tower kernel: XFS (md1p1): Please unmount the filesystem and rectify the problem(s)
Sep 12 06:10:58 Tower ntpd[1245]: receive: Unexpected origin timestamp 0xe8aa0ef2.4699dd96 does not match aorg 0xe8aa0ef2.46985d01 from [email protected] xmt 0xe8aa0ef2.7475ae15
Sep 12 06:13:06 Tower ntpd[1245]: receive: Unexpected origin timestamp 0xe8aa0f72.469581dd does not match aorg 0xe8aa0f72.46972e59 from [email protected] xmt 0xe8aa0f72.7512d614
Sep 12 06:20:15 Tower ntpd[1245]: receive: Unexpected origin timestamp 0xe8aa111f.469fbc8f does not match aorg 0xe8aa111f.469c53f6 from [email protected] xmt 0xe8aa111f.59a190d3
Sep 12 06:20:51 Tower kernel: md1p1: writeback error on inode 2200312912, offset 251658240, sector 2548144816
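Since the advice throughout this thread is to read the logs, a small filter can make long syslog files easier to triage. The following is only a sketch under stated assumptions: the function name `count_signatures` and the three labels are hypothetical, and the patterns cover only messages that actually appear in this thread (the igb transmit-queue timeout and the XFS corruption and shutdown messages); they are not a complete list of failure signatures.

```python
import re
from collections import Counter

# Patterns taken from kernel messages quoted in this thread; not exhaustive.
SIGNATURES = {
    "tx_timeout": re.compile(
        r"NETDEV WATCHDOG: \S+ \(\w+\): transmit queue \d+ timed out"
    ),
    "xfs_corruption": re.compile(r"XFS \(\S+\): Metadata corruption detected"),
    "xfs_shutdown": re.compile(r"Shutting down filesystem"),
}


def count_signatures(log_text: str) -> Counter:
    """Count how often each known signature occurs in `log_text`.

    The whole text is scanned at once, so this also works on logs whose
    line breaks were lost when they were pasted into a post.
    """
    counts = Counter()
    for name, pattern in SIGNATURES.items():
        counts[name] = sum(1 for _ in pattern.finditer(log_text))
    return counts
```

Running it over the syslog from a diagnostics archive gives a quick indication of whether a given outage looks network-related, filesystem-related, or both, before reading the full trace.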