Since the reboot to update from 7.0.0 to 7.0.1, I've been getting consistent crashes. This box is now on 7.1.0-rc1. I've attached the diagnostics from after Wednesday's overnight crash, and the syslog from after the crash a few hours ago.
cs7box-diagnostics-20250416-1318.zip
syslog-20250417-1630.txt
The 0416 log shows hardware errors at the end:
Apr 16 00:28:24 CS7Box emhttpd: read SMART /dev/sde
Apr 16 02:18:23 CS7Box kernel: mce: [Hardware Error]: Machine check events logged
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Apr 16 02:18:24 CS7Box kernel: CPU1 BANK0 CMCI storm detected
Apr 16 02:18:25 CS7Box kernel: CPU3 BANK0 CMCI storm detected
The 0417 log is more verbose after running "mcelog --daemon" on startup:
Apr 17 02:50:46 CS7Box emhttpd: read SMART /dev/sde
Apr 17 02:52:41 CS7Box kernel: Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
Apr 17 02:52:41 CS7Box kernel: CPU: 9 UID: 99 PID: 2444554 Comm: Plex Media Scan Tainted: P O 6.12.23-Unraid #1
Apr 17 02:52:41 CS7Box kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
Apr 17 02:52:41 CS7Box kernel: Hardware name: ASUS System Product Name/ROG MAXIMUS X HERO (WI-FI AC), BIOS 2701 07/13/2021
Apr 17 02:52:41 CS7Box kernel: RIP: 0010:pmd_flags+0x1d/0x30
Apr 17 02:52:41 CS7Box kernel: Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 48 b8 00 f0 ff ff ff ff 0f 00 40 f6 c7 80 74 06 48 2d 00 f0 1f 00 48 f7 d0 48 21 f8 c3 <cc> cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90
Apr 17 02:52:41 CS7Box kernel: RSP: 0018:ffffc9000a74fb80 EFLAGS: 00010202
Apr 17 02:52:41 CS7Box kernel: RAX: 000ffffffffff000 RBX: 0000000000000000 RCX: 0000000000000000
Apr 17 02:52:41 CS7Box kernel: RDX: ffff88843037efa8 RSI: 0000149ffea00000 RDI: 000000045f5f9067
Apr 17 02:52:41 CS7Box kernel: RBP: ffff88820da3d498 R08: 0000000000000000 R09: 0000000000000000
Apr 17 02:52:41 CS7Box kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff88820da3d498
Apr 17 02:52:41 CS7Box kernel: R13: ffff88843037efa8 R14: ffffea0010c0dfa8 R15: ffffc9000a74fca0
Apr 17 02:52:41 CS7Box kernel: FS: 0000149ffe0f0b00(0000) GS:ffff88884ec40000(0000) knlGS:0000000000000000
Apr 17 02:52:41 CS7Box kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 17 02:52:41 CS7Box kernel: CR2: 0000149ffea11000 CR3: 0000000430438002 CR4: 00000000003726f0
Apr 17 02:52:41 CS7Box kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 17 02:52:41 CS7Box kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 17 02:52:41 CS7Box kernel: Call Trace:
Apr 17 02:52:41 CS7Box kernel: <TASK>
Apr 17 02:52:41 CS7Box kernel: is_swap_pmd+0x10/0x20
Apr 17 02:52:41 CS7Box kernel: is_pmd_migration_entry+0xa/0x30
Apr 17 02:52:41 CS7Box kernel: split_huge_pmd_locked+0x5c/0x750
Apr 17 02:52:41 CS7Box kernel: ? rwsem_down_write_slowpath+0x1e5/0x3e0
Apr 17 02:52:41 CS7Box kernel: __split_huge_pmd+0x99/0xf0
Apr 17 02:52:41 CS7Box kernel: vma_adjust_trans_huge+0x22/0x50
Apr 17 02:52:41 CS7Box kernel: __split_vma+0x19f/0x220
Apr 17 02:52:41 CS7Box kernel: vms_gather_munmap_vmas+0x80/0x1d0
Apr 17 02:52:41 CS7Box kernel: do_vmi_align_munmap+0x126/0x1a0
Apr 17 02:52:41 CS7Box kernel: ? __pfx_futex_wake_mark+0x10/0x10
Apr 17 02:52:41 CS7Box kernel: do_vmi_munmap+0x13e/0x150
Apr 17 02:52:41 CS7Box kernel: __vm_munmap+0x92/0xd0
Apr 17 02:52:41 CS7Box kernel: __x64_sys_munmap+0x17/0x20
Apr 17 02:52:41 CS7Box kernel: do_syscall_64+0x68/0xe0
Apr 17 02:52:41 CS7Box kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
Apr 17 02:52:41 CS7Box kernel: RIP: 0033:0x14a0070f6895
Apr 17 02:52:41 CS7Box kernel: Code: 00 00 00 0f 05 9b 48 89 c7 e9 27 55 fe ff 41 56 53 50 49 89 f6 48 89 fb e8 88 3d 02 00 b8 0b 00 00 00 48 89 df 4c 89 f6 0f 05 <9b> 48 89 c7 48 83 c4 08 5b 41 5e e9 fb 54 fe ff 31 c0 83 fa 04 74
Apr 17 02:52:41 CS7Box kernel: RSP: 002b:0000149ffe0eef30 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
Apr 17 02:52:41 CS7Box kernel: RAX: ffffffffffffffda RBX: 0000149ffea0a000 RCX: 000014a0070f6895
Apr 17 02:52:41 CS7Box kernel: RDX: 0000000000000000 RSI: 0000000000009000 RDI: 0000149ffea0a000
Apr 17 02:52:41 CS7Box kernel: RBP: 0000000000000000 R08: 0000000000000028 R09: 0000000000000000
Apr 17 02:52:41 CS7Box kernel: R10: 0000000000000001 R11: 0000000000000206 R12: 0000149ffea12ffc
Apr 17 02:52:41 CS7Box kernel: R13: 0000000000000001 R14: 0000000000009000 R15: 0000000000009000
Apr 17 02:52:41 CS7Box kernel: </TASK>
Apr 17 02:52:41 CS7Box kernel: Modules linked in: xt_connmark xt_comment iptable_raw xt_mark veth xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle iptable_mangle vhost_net tun vhost vhost_iotlb nf_conntrack_netlink xt_nat xt_tcpudp xt_conntrack nfnetlink xfrm_user xfrm_algo ip6table_nat xt_addrtype nvidia_uvm(PO) nfsd auth_rpcgss lockd grace sunrpc md_mod zfs(PO) spl(O) nct6775 nct6775_core hwmon_vid iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet cfg80211 rfkill bridge 8021q garp mrp stp llc e1000e r8169 realtek led_class nvidia_drm(PO) nvidia_modeset(PO) intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nvidia(PO) drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3
Apr 17 02:52:41 CS7Box kernel: drm_kms_helper sha1_ssse3 aesni_intel crypto_simd cryptd mei_hdcp mei_pxp drm rapl nvme intel_cstate wmi_bmof mxm_wmi mpt3sas intel_uncore i2c_i801 agpgart mei_me i2c_smbus tpm_crb raid_class nvme_core ahci mei i2c_core scsi_transport_sas libahci tpm_tis tpm_tis_core tpm video wmi libaescfb ecdh_generic backlight ecc acpi_pad button [last unloaded: e1000e]
Apr 17 02:52:41 CS7Box kernel: ---[ end trace 0000000000000000 ]---
Apr 17 02:52:41 CS7Box kernel: pstore: backend (efi_pstore) writing error (-28)
Apr 17 02:52:41 CS7Box kernel: RIP: 0010:pmd_flags+0x1d/0x30
Apr 17 02:52:41 CS7Box kernel: Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 48 b8 00 f0 ff ff ff ff 0f 00 40 f6 c7 80 74 06 48 2d 00 f0 1f 00 48 f7 d0 48 21 f8 c3 <cc> cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90
Apr 17 02:52:41 CS7Box kernel: RSP: 0018:ffffc9000a74fb80 EFLAGS: 00010202
Apr 17 02:52:41 CS7Box kernel: RAX: 000ffffffffff000 RBX: 0000000000000000 RCX: 0000000000000000
Apr 17 02:52:41 CS7Box kernel: RDX: ffff88843037efa8 RSI: 0000149ffea00000 RDI: 000000045f5f9067
Apr 17 02:52:41 CS7Box kernel: RBP: ffff88820da3d498 R08: 0000000000000000 R09: 0000000000000000
Apr 17 02:52:41 CS7Box kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff88820da3d498
Apr 17 02:52:41 CS7Box kernel: R13: ffff88843037efa8 R14: ffffea0010c0dfa8 R15: ffffc9000a74fca0
Apr 17 02:52:41 CS7Box kernel: FS: 0000149ffe0f0b00(0000) GS:ffff88884ec40000(0000) knlGS:0000000000000000
Apr 17 02:52:41 CS7Box kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 17 02:52:41 CS7Box kernel: CR2: 0000149ffea11000 CR3: 0000000430438002 CR4: 00000000003726f0
Apr 17 02:52:41 CS7Box kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 17 02:52:41 CS7Box kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 17 02:52:41 CS7Box kernel: note: Plex Media Scan[2444554] exited with preempt_count 1
The trace names Plex Media Scan as the running process, but the server has also crashed when the Plex container was not running.
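For anyone who wants to compare the two logs side by side, here is a rough sketch for pulling the hardware-error and oops lines out of a saved syslog. The patterns are only the strings that appear in the excerpts above, not a complete list of kernel error markers, and the name `scan_syslog` is made up for illustration.

```python
import re

# Patterns copied from the excerpts above; an assumption, not an exhaustive list.
PATTERNS = {
    "mce": re.compile(r"mce: \[Hardware Error\]"),
    "cmci_storm": re.compile(r"CMCI storm detected"),
    "oops": re.compile(r"kernel: Oops:"),
}


def scan_syslog(lines):
    """Group matching syslog lines by category.

    `lines` is any iterable of strings, e.g. an open file object.
    Returns a dict mapping each category name to its matching lines.
    """
    hits = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits
```

It could be run against the attached file with something like `scan_syslog(open("syslog-20250417-1630.txt", errors="replace"))`, then printing each category's lines with their timestamps to see whether machine-check events precede every crash.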
Tagging @optiman and @trurl from the previous discussion.