WasteLand Posted August 27, 2022 (edited)

I have searched for the past few days and come up empty-handed; hopefully one of you geniuses can help me out. The server randomly locks up or crashes. Luckily I have iDRAC set up, as I am not where I can physically touch the server.

Unraid 6.10.3
Dell R710 (one PS is dead)

Here are some of the logs from today's crash; there are about 2,000 more lines of logs after this. 17:35:35 is when the server went offline.

Aug 27 14:10:55 Tower emhttpd: spinning down /dev/sdg
Aug 27 14:25:57 Tower emhttpd: spinning down /dev/sdg
Aug 27 14:40:57 Tower emhttpd: spinning down /dev/sdg
Aug 27 14:55:58 Tower emhttpd: spinning down /dev/sdg
Aug 27 15:10:59 Tower emhttpd: spinning down /dev/sdg
Aug 27 15:26:00 Tower emhttpd: spinning down /dev/sdg
Aug 27 15:41:01 Tower emhttpd: spinning down /dev/sdg
Aug 27 15:56:02 Tower emhttpd: spinning down /dev/sdg
Aug 27 16:11:03 Tower emhttpd: spinning down /dev/sdg
Aug 27 16:26:04 Tower emhttpd: spinning down /dev/sdg
Aug 27 16:41:05 Tower emhttpd: spinning down /dev/sdg
Aug 27 16:56:06 Tower emhttpd: spinning down /dev/sdg
Aug 27 17:11:07 Tower emhttpd: spinning down /dev/sdg
Aug 27 17:26:08 Tower emhttpd: spinning down /dev/sdg
Aug 27 17:35:35 Tower kernel: Invalid SPTE change: cannot replace a present leaf
Aug 27 17:35:35 Tower kernel: SPTE with another present leaf SPTE mapping a
Aug 27 17:35:35 Tower kernel: different PFN!
Aug 27 17:35:35 Tower kernel: as_id: 0 gfn: 3a3ff4 old_spte: ffff8881346e8d18 new_spte: 6100003511f4877 level: 1
Aug 27 17:35:35 Tower kernel: ------------[ cut here ]------------
Aug 27 17:35:35 Tower kernel: kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:446!
Aug 27 17:35:35 Tower kernel: invalid opcode: 0000 [#1] SMP PTI
Aug 27 17:35:35 Tower kernel: CPU: 1 PID: 6931 Comm: CPU 2/KVM Tainted: G W I 5.15.46-Unraid #1
Aug 27 17:35:35 Tower kernel: Hardware name: Dell Inc. PowerEdge R710/0YDJK3, BIOS 2.0.13 04/06/2010
Aug 27 17:35:35 Tower kernel: RIP: 0010:__handle_changed_spte+0x113/0x42d [kvm]
Aug 27 17:35:35 Tower kernel: Code: 0c 21 d8 a8 01 74 25 80 7c 24 10 00 74 1e 8b 74 24 20 45 89 f1 4d 89 e8 4c 89 e1 4c 89 fa 48 c7 c7 45 6b 27 a0 e8 0a 70 5a e1 <0f> 0b 4d 39 ec 0f 84 00 03 00 00 66 90 eb 35 65 8b 3d eb 98 db 5f
Aug 27 17:35:35 Tower kernel: RSP: 0018:ffffc9000647fa70 EFLAGS: 00010246
Aug 27 17:35:35 Tower kernel: RAX: 00000000000000c2 RBX: 0000000000000001 RCX: 0000000000000000
Aug 27 17:35:35 Tower kernel: RDX: 0000000000000000 RSI: ffff88890ba1c510 RDI: ffff88890ba1c510
Aug 27 17:35:35 Tower kernel: RBP: ffffc90006702000 R08: ffff88922ff697a8 R09: 3738346631313533
Aug 27 17:35:35 Tower kernel: R10: 3030303031363120 R11: 657470735f773120 R12: ffff8881346e8d18
Aug 27 17:35:35 Tower kernel: R13: 06100003511f4877 R14: 0000000000000001 R15: 00000000003a3ff4
Aug 27 17:35:35 Tower kernel: FS: 000014aa09fff640(0000) GS:ffff88890ba00000(0000) knlGS:0000000b4cf97000
Aug 27 17:35:35 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 27 17:35:35 Tower kernel: CR2: 000002b801f94000 CR3: 0000000972df8000 CR4: 00000000000026e0
Aug 27 17:35:35 Tower kernel: Call Trace:
Aug 27 17:35:35 Tower kernel: <TASK>
Aug 27 17:35:35 Tower kernel: tdp_mmu_set_spte_atomic_no_dirty_log+0x47/0x5c [kvm]
Aug 27 17:35:35 Tower kernel: kvm_tdp_mmu_map+0x33c/0x467 [kvm]
Aug 27 17:35:35 Tower kernel: direct_page_fault+0x487/0x6b9 [kvm]
Aug 27 17:35:35 Tower kernel: ? kvm_mtrr_check_gfn_range_consistency+0x61/0xe5 [kvm]
Aug 27 17:35:35 Tower kernel: ? raw_spin_rq_unlock_irqrestore+0xd/0x17
Aug 27 17:35:35 Tower kernel: kvm_mmu_page_fault+0x2a0/0x4b5 [kvm]
Aug 27 17:35:35 Tower kernel: ? sysvec_apic_timer_interrupt+0xb/0x7d
Aug 27 17:35:35 Tower kernel: ? asm_sysvec_apic_timer_interrupt+0x12/0x20
Aug 27 17:35:35 Tower kernel: ? vmx_vmexit+0x1d/0x40 [kvm_intel]
Aug 27 17:35:35 Tower kernel: ? vmx_vmexit+0x11/0x40 [kvm_intel]
Aug 27 17:35:35 Tower kernel: vmx_handle_exit+0x509/0x5df [kvm_intel]
Aug 27 17:35:35 Tower kernel: kvm_arch_vcpu_ioctl_run+0x1168/0x144b [kvm]
Aug 27 17:35:35 Tower kernel: ? kvm_vm_ioctl_irq_line+0x23/0x2f [kvm]
Aug 27 17:35:35 Tower kernel: kvm_vcpu_ioctl+0x195/0x56d [kvm]
Aug 27 17:35:35 Tower kernel: ? __seccomp_filter+0xa2/0x307
Aug 27 17:35:35 Tower kernel: vfs_ioctl+0x1e/0x2b
Aug 27 17:35:35 Tower kernel: __do_sys_ioctl+0x51/0x74
Aug 27 17:35:35 Tower kernel: do_syscall_64+0x83/0xa5
Aug 27 17:35:35 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Aug 27 17:35:35 Tower kernel: RIP: 0033:0x14ae0e458067
Aug 27 17:35:35 Tower kernel: Code: 3c 1c e8 2c ff ff ff 85 c0 79 97 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d1 1d 0d 00 f7 d8 64 89 01 48
Aug 27 17:35:35 Tower kernel: RSP: 002b:000014aa09ffcdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Aug 27 17:35:35 Tower kernel: RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 000014ae0e458067
Aug 27 17:35:35 Tower kernel: RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001a
Aug 27 17:35:35 Tower kernel: RBP: 000014aa0bec9a80 R08: 000055f01bd4fc38 R09: 000055f01b7a22b0
Aug 27 17:35:35 Tower kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Aug 27 17:35:35 Tower kernel: R13: 000055f01bd92a03 R14: 0000000000000002 R15: 0000000000000000
Aug 27 17:35:35 Tower kernel: </TASK>
Aug 27 17:35:35 Tower kernel: Modules linked in: veth xt_nat cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle macvlan vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding bnx2 sr_mod cdrom intel_powerclamp coretemp kvm_intel ipmi_ssif i2c_core kvm crc32c_intel intel_cstate intel_uncore mpt3sas input_leds led_class ata_piix raid_class scsi_transport_sas ipmi_si acpi_power_meter button [last unloaded: bnx2]
Aug 27 17:35:35 Tower kernel: ---[ end trace ddea469b93dfa5c1 ]---
Aug 27 17:35:35 Tower kernel: RIP: 0010:__handle_changed_spte+0x113/0x42d [kvm]
Aug 27 17:35:35 Tower kernel: Code: 0c 21 d8 a8 01 74 25 80 7c 24 10 00 74 1e 8b 74 24 20 45 89 f1 4d 89 e8 4c 89 e1 4c 89 fa 48 c7 c7 45 6b 27 a0 e8 0a 70 5a e1 <0f> 0b 4d 39 ec 0f 84 00 03 00 00 66 90 eb 35 65 8b 3d eb 98 db 5f
Aug 27 17:35:35 Tower kernel: RSP: 0018:ffffc9000647fa70 EFLAGS: 00010246
Aug 27 17:35:35 Tower kernel: RAX: 00000000000000c2 RBX: 0000000000000001 RCX: 0000000000000000
Aug 27 17:35:35 Tower kernel: RDX: 0000000000000000 RSI: ffff88890ba1c510 RDI: ffff88890ba1c510
Aug 27 17:35:35 Tower kernel: RBP: ffffc90006702000 R08: ffff88922ff697a8 R09: 3738346631313533
Aug 27 17:35:35 Tower kernel: R10: 3030303031363120 R11: 657470735f773120 R12: ffff8881346e8d18
Aug 27 17:35:35 Tower kernel: R13: 06100003511f4877 R14: 0000000000000001 R15: 00000000003a3ff4
Aug 27 17:35:35 Tower kernel: FS: 000014aa09fff640(0000) GS:ffff88890ba00000(0000) knlGS:0000000b4cf97000
Aug 27 17:35:35 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 27 17:35:35 Tower kernel: CR2: 000002b801f94000 CR3: 0000000972df8000 CR4: 00000000000026e0
Aug 27 17:35:53 Tower kernel: ------------[ cut here ]------------
Aug 27 17:35:53 Tower kernel: NETDEV WATCHDOG: eth0 (bnx2): transmit queue 7 timed out
Aug 27 17:35:53 Tower kernel: WARNING: CPU: 1 PID: 19 at net/sched/sch_generic.c:477 dev_watchdog+0x115/0x180
Aug 27 17:35:53 Tower kernel: Modules linked in: veth xt_nat cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle macvlan vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding bnx2 sr_mod cdrom intel_powerclamp coretemp kvm_intel ipmi_ssif i2c_core kvm crc32c_intel intel_cstate intel_uncore mpt3sas input_leds led_class ata_piix raid_class scsi_transport_sas ipmi_si acpi_power_meter button [last unloaded: bnx2]
Aug 27 17:35:53 Tower kernel: CPU: 1 PID: 19 Comm: migration/1 Tainted: G D W I 5.15.46-Unraid #1
Aug 27 17:35:53 Tower kernel: Hardware name: Dell Inc. PowerEdge R710/0YDJK3, BIOS 2.0.13 04/06/2010
Aug 27 17:35:53 Tower kernel: Stopper: multi_cpu_stop+0x0/0xca <- migrate_swap+0xfb/0x11a
Aug 27 17:35:53 Tower kernel: RIP: 0010:dev_watchdog+0x115/0x180
Aug 27 17:35:53 Tower kernel: Code: 25 cb 00 00 75 36 48 89 ef c6 05 7d 25 cb 00 01 e8 d0 8e fb ff 44 89 e1 48 89 ee 48 c7 c7 54 09 15 82 48 89 c2 e8 ae 30 12 00 <0f> 0b eb 0e 41 ff c4 48 05 40 01 00 00 e9 65 ff ff ff 48 8b 83 48
Aug 27 17:35:53 Tower kernel: RSP: 0000:ffffc90006400ec8 EFLAGS: 00010282
Aug 27 17:35:53 Tower kernel: RAX: 0000000000000000 RBX: ffff888104d88480 RCX: 0000000000000027
Aug 27 17:35:53 Tower kernel: RDX: 0000000000000003 RSI: ffffc90006400d50 RDI: ffff88890ba1c510
Aug 27 17:35:53 Tower kernel: RBP: ffff888104d88000 R08: ffff88922ff697a8 R09: 0000000000000000
Aug 27 17:35:53 Tower kernel: R10: 7420372065756575 R11: 712074696d736e61 R12: 0000000000000007
Aug 27 17:35:53 Tower kernel: R13: 000000010f195000 R14: ffffc90006400f10 R15: ffffffff816df274
Aug 27 17:35:53 Tower kernel: FS: 0000000000000000(0000) GS:ffff88890ba00000(0000) knlGS:0000000000000000
Aug 27 17:35:53 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 27 17:35:53 Tower kernel: CR2: 000014a35af78390 CR3: 0000000a45a0a000 CR4: 00000000000026e0
Aug 27 17:35:53 Tower kernel: Call Trace:
Aug 27 17:35:53 Tower kernel: <IRQ>
Aug 27 17:35:53 Tower kernel: ? psched_ppscfg_precompute+0x40/0x40
Aug 27 17:35:53 Tower kernel: call_timer_fn+0x59/0xde
Aug 27 17:35:53 Tower kernel: __run_timers+0x146/0x184
Aug 27 17:35:53 Tower kernel: ? enqueue_hrtimer+0x62/0x69
Aug 27 17:35:53 Tower kernel: ? recalibrate_cpu_khz+0x1/0x1
Aug 27 17:35:53 Tower kernel: run_timer_softirq+0x19/0x2d
Aug 27 17:35:53 Tower kernel: __do_softirq+0xef/0x218
Aug 27 17:35:53 Tower kernel: __irq_exit_rcu+0x4d/0x88
Aug 27 17:35:53 Tower kernel: sysvec_apic_timer_interrupt+0x66/0x7d
Aug 27 17:35:53 Tower kernel: </IRQ>
Aug 27 17:35:53 Tower kernel: <TASK>
Aug 27 17:35:53 Tower kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
Aug 27 17:35:53 Tower kernel: RIP: 0010:multi_cpu_stop+0x5a/0xca
Aug 27 17:35:53 Tower kernel: Code: c7 c7 00 b8 3a 82 49 c7 c5 00 b8 3a 82 e8 97 d0 2d 00 39 c5 41 0f 94 c7 eb 0b 89 ed 49 0f a3 6d 00 41 0f 92 c7 45 31 e4 31 ed <4c> 89 ef e8 9b ff ff ff 89 e8 8b 6b 20 39 e8 74 3b 83 fd 02 74 07
Aug 27 17:35:53 Tower kernel: RSP: 0000:ffffc900063e3e88 EFLAGS: 00000293
Aug 27 17:35:53 Tower kernel: RAX: 00000000320a1889 RBX: ffffc9000855bb48 RCX: 0000000000000000
Aug 27 17:35:53 Tower kernel: RDX: 00000000320a1887 RSI: 0000000000000286 RDI: 00000000320a1889
Aug 27 17:35:53 Tower kernel: RBP: 0000000000000001 R08: ffff88812e87e010 R09: ffff88890ba2bbf0
Aug 27 17:35:53 Tower kernel: R10: 0000000000000000 R11: 00000000000c8000 R12: 0000000000000000
Aug 27 17:35:53 Tower kernel: R13: ffffffff81e0aa80 R14: 0000000000000282 R15: ffffc9000855bb00
Aug 27 17:35:53 Tower kernel: ? multi_cpu_stop+0xab/0xca
Aug 27 17:35:53 Tower kernel: ? stop_machine_yield+0x3/0x3
Aug 27 17:35:53 Tower kernel: cpu_stopper_thread+0x93/0x109
Aug 27 17:35:53 Tower kernel: ? smpboot_register_percpu_thread+0xb7/0xb7
Aug 27 17:35:53 Tower kernel: smpboot_thread_fn+0x128/0x13c
Aug 27 17:35:53 Tower kernel: kthread+0xde/0xe3
Aug 27 17:35:53 Tower kernel: ? set_kthread_struct+0x32/0x32
Aug 27 17:35:53 Tower kernel: ret_from_fork+0x22/0x30
Aug 27 17:35:53 Tower kernel: </TASK>
Aug 27 17:35:53 Tower kernel: ---[ end trace ddea469b93dfa5c2 ]---

Edited August 27, 2022 by WasteLand
JorgeB Posted August 28, 2022

Does it happen without running VMs?
WasteLand Posted August 28, 2022

It does. Other than yesterday, I haven't spun up any VMs in months, and I have experienced the same random lockups.
JorgeB Posted August 29, 2022

Those errors look KVM-related. Enable the syslog server, don't run any VMs, and post that log together with the complete diagnostics after a crash.
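
Once a syslog has been captured after a crash, something like the following rough sketch could help pull out the kernel events worth posting. It is only an illustration: the function names are made up for this example, the marker strings are simply copied from the excerpt in the first post, and the input is assumed to be plain syslog text with the same "Aug 27 17:35:35 Tower kernel: ..." prefix seen there.

```python
import re
from typing import Iterable, List, Tuple

# Marker strings copied from the log excerpt in this thread (an assumption:
# other crashes may log different messages, so extend this tuple as needed).
MARKERS = ("kernel BUG at", "WARNING:", "NETDEV WATCHDOG", "Invalid SPTE change")

# Matches lines shaped like "Aug 27 17:35:35 Tower kernel: message".
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+kernel:\s+(.*)$")


def find_kernel_events(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, message) for kernel lines containing a marker."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a kernel line (e.g. emhttpd spin-down messages)
        timestamp, _host, message = match.groups()
        if any(marker in message for marker in MARKERS):
            events.append((timestamp, message))
    return events


def mentions_kvm(lines: Iterable[str]) -> bool:
    """True if any line carries a [kvm] or [kvm_intel] module tag."""
    return any("[kvm]" in line or "[kvm_intel]" in line for line in lines)


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the path of the saved syslog file as an argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        log_lines = handle.readlines()
    for timestamp, message in find_kernel_events(log_lines):
        print(timestamp, message)
    print("KVM modules in traces:", mentions_kvm(log_lines))
```

If the KVM check still comes back true on a log captured with no VMs running, that would be worth mentioning when posting the diagnostics.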