VladoPortos Posted August 17, 2022 (edited)

Hi all,

After updating to 6.10.0 and adding brand-new HW (RAM and GPU), I'm experiencing a crash within 24h, seemingly after the Parity Check finishes. I have tested the RAM with memtest and found no issue, updated the BIOS to the latest version just in case (did not help), and tested the GPU in another system with no issue.

Finally, I managed to catch the error by keeping an SSH connection to the server open the whole time, because it was not logged to the syslog server.

Aug 16 21:59:51 PlexServer kernel: usb 5-5: USB disconnect, device number 8
Aug 16 21:59:51 PlexServer acpid: input device has been disconnected, fd 7
Aug 16 21:59:51 PlexServer acpid: input device has been disconnected, fd 8
Aug 16 21:59:51 PlexServer acpid: input device has been disconnected, fd 9
Aug 16 22:56:56 PlexServer webGUI: Successful login user root from 10.0.0.246
Aug 16 23:09:06 PlexServer kernel: md: recovery thread: P corrected, sector=19695876888
Aug 16 23:09:06 PlexServer kernel: md: recovery thread: P corrected, sector=19695876944
Aug 16 23:09:06 PlexServer kernel: md: recovery thread: P corrected, sector=19695877192
Aug 16 23:09:06 PlexServer kernel: md: recovery thread: P corrected, sector=19695877216
Aug 16 23:09:06 PlexServer kernel: md: recovery thread: P corrected, sector=19695877248
Aug 16 23:09:06 PlexServer kernel: general protection fault, probably for non-canonical address 0xfffeffff828e18f8: 0000 [#1] SMP NOPTI
Aug 16 23:09:06 PlexServer kernel: CPU: 32 PID: 88791 Comm: sshd Tainted: P O 5.15.46-Unraid #1
Aug 16 23:09:06 PlexServer kernel: Hardware name: Gigabyte Technology Co., Ltd. TRX40 AORUS MASTER/TRX40 AORUS MASTER, BIOS F6 11/23/2021
Aug 16 23:09:06 PlexServer kernel: RIP: 0010:tcp_stream_memory_free+0x25/0x31
Aug 16 23:09:06 PlexServer kernel: Code: 91 ee 0e 00 c3 0f 1f 44 00 00 8b 87 0c 07 00 00 89 f1 8b 97 94 05 00 00 29 d0 8b 97 10 07 00 00 d3 e0 85 d2 75 0a 48 8b 57 30 <8b> 92 f8 03 00 00 39 d0 0f 92 c0 c3 0f 1f 44 00 00 83 ff 31 48 c7
Aug 16 23:09:06 PlexServer kernel: RSP: 0018:ffffc90001d57948 EFLAGS: 00010246
Aug 16 23:09:06 PlexServer kernel: RAX: 0000000000000000 RBX: ffff8881eb980000 RCX: 0000000000000001
Aug 16 23:09:06 PlexServer kernel: RDX: fffeffff828e1500 RSI: 0000000000000001 RDI: ffff8881eb980000
Aug 16 23:09:06 PlexServer kernel: RBP: ffff88815ed7e600 R08: ffff88815ed7e600 R09: ffffffff81673e5a
Aug 16 23:09:06 PlexServer kernel: R10: 0000000000000020 R11: 0000000000000d41 R12: 0000000000000000
Aug 16 23:09:06 PlexServer kernel: R13: ffff8881c062e4c0 R14: 0000000000000000 R15: 0000000000000000
Aug 16 23:09:06 PlexServer kernel: FS: 000014ca28ea7740(0000) GS:ffff889ffd800000(0000) knlGS:0000000000000000
Aug 16 23:09:06 PlexServer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 16 23:09:06 PlexServer kernel: CR2: 000014b9e4000010 CR3: 0000000f5946c000 CR4: 0000000000350ee0
Aug 16 23:09:06 PlexServer kernel: Call Trace:
Aug 16 23:09:06 PlexServer kernel: <TASK>
Aug 16 23:09:06 PlexServer kernel: __sk_stream_is_writeable.constprop.0+0x36/0x3d
Aug 16 23:09:06 PlexServer kernel: tcp_poll+0x162/0x1f1
Aug 16 23:09:06 PlexServer kernel: sock_poll+0xb9/0xc2
Aug 16 23:09:06 PlexServer kernel: do_one_tree+0x382/0x651
Aug 16 23:09:06 PlexServer kernel: ? br_dev_queue_push_xmit+0x136/0x160
Aug 16 23:09:06 PlexServer kernel: ? d_obtain_root+0x24/0x24
Aug 16 23:09:06 PlexServer kernel: ? d_obtain_root+0x24/0x24
Aug 16 23:09:06 PlexServer kernel: ? d_obtain_root+0x24/0x24
Aug 16 23:09:06 PlexServer kernel: ? tcp_recvmsg_locked+0x704/0x755
Aug 16 23:09:06 PlexServer kernel: ? __slab_free+0x8d/0x244
Aug 16 23:09:06 PlexServer kernel: ? devfreq_add_governor+0x1f/0x197
Aug 16 23:09:06 PlexServer kernel: ? ip_do_fragment+0x310/0x399

Not sure yet what it is. Has anybody seen this before? (I really need it to be stable again.)

EDIT - I just downgraded back to 6.9.2 and am running a parity check to see if it finishes OK, to eliminate v6.10.3 as the issue. (I kind of wanted to run a Win11 VM as my daily driver, though; that was my whole reason to upgrade HW and SW.) If it dies again, I'll start pulling HW out. I'm so frustrated right now...

plexserver-diagnostics-20220817-0546.zip

Edited August 20, 2022 by VladoPortos: solved
JorgeB Posted August 17, 2022
If v6.9.2 works, try upgrading to v6.11.0-rc3 instead; it might like the newer kernel more.
VladoPortos Posted August 17, 2022 Author
34 minutes ago, JorgeB said: If v6.9.2 works try upgrading to v6.11.0-rc3 instead, it might like the newer kernel more.
Will do. So far so good, but I think it dies either at the end of the parity check or soon after it's done. So around 13 hours to go...
JorgeB Posted August 17, 2022
Also make sure this is correctly set.
VladoPortos Posted August 17, 2022 Author (edited)
1 hour ago, JorgeB said: Also make sure this is correctly set.
Hmm. It has a Threadripper in it and 4x 32 GB Patriot Viper 4 DDR4 3600 MHz RAM, and I seated the modules exactly as the manual specified. In the table from your link the max supported speed is 3200 (I need to check in BIOS what speed the RAM is running at). Checking the processor now, it really does max out at 3200 (Threadripper 3960X), so depending on what's in BIOS I might need to dial it down. C-states I think I set correctly (I'll give it another look if it dies).
EDIT: checked dmidecode and the RAM is set to 2666 MHz, so that should be OK.
Edited August 17, 2022 by VladoPortos: more info
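For anyone wanting to repeat the dmidecode check, here is a minimal Python sketch that pulls the rated and configured speed of each DIMM out of `dmidecode -t memory` style text. The sample text is made up for illustration and is not output from this server, and the helper name `memory_speeds` is hypothetical. It assumes the configured speed is labelled either "Configured Memory Speed" or "Configured Clock Speed", since the label differs between dmidecode versions.

```python
import re

# Illustrative sample only (an assumption, not taken from the server in this thread).
SAMPLE = """\
Memory Device
	Locator: DIMM 0
	Speed: 3600 MT/s
	Configured Memory Speed: 2666 MT/s
Memory Device
	Locator: DIMM 1
	Speed: 3600 MT/s
	Configured Clock Speed: 2666 MHz
"""

# Match the rated "Speed" line and either spelling of the configured-speed line.
LINE = re.compile(
    r"^\s*(Speed|Configured (?:Memory|Clock) Speed):\s*(\d+)\s*(?:MT/s|MHz)",
    re.M,
)

def memory_speeds(text):
    """Return a list of (rated, configured) speeds, one pair per populated DIMM."""
    pairs, rated = [], None
    for label, value in LINE.findall(text):
        if label == "Speed":
            rated = int(value)
        else:
            pairs.append((rated, int(value)))
            rated = None
    return pairs

print(memory_speeds(SAMPLE))
```

On a real system the text would come from running `dmidecode -t memory` as root. Empty slots report "Unknown" instead of a number and are skipped by the pattern.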
VladoPortos Posted August 18, 2022 Author
Reporting back: on 6.9.2 the parity check finished. It reported 4 under "Sync errors corrected:", the same as before on 6.10 when it crashed.
Aug 17 23:31:44 PlexServer kernel: md: recovery thread: P corrected, sector=19695876888
Aug 17 23:31:44 PlexServer kernel: md: recovery thread: P corrected, sector=19695876944
Aug 17 23:31:44 PlexServer kernel: md: recovery thread: P corrected, sector=19695877192
Aug 17 23:31:44 PlexServer kernel: md: recovery thread: P corrected, sector=19695877216
Aug 18 00:10:33 PlexServer flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Aug 18 01:24:32 PlexServer webGUI: Successful login user root from 10.253.0.3
Aug 18 04:26:14 PlexServer kernel: md: sync done. time=79748sec
Aug 18 04:26:14 PlexServer kernel: md: recovery thread: exit status: 0
However, it did not die; it's going OK so far. I'll give it another day, and if it keeps working without a crash I'll call it a bug in 6.10.
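To compare the two parity runs without counting log lines by eye, a small Python sketch like the following can extract the corrected sectors and the run time from syslog text. The log lines are the ones posted in this thread; the helper name `summarize` and the variable names are hypothetical, and the patterns assume the message format stays exactly as shown above.

```python
import re

# Lines from the 6.9.2 run, as posted above.
RUN_692 = """\
Aug 17 23:31:44 PlexServer kernel: md: recovery thread: P corrected, sector=19695876888
Aug 17 23:31:44 PlexServer kernel: md: recovery thread: P corrected, sector=19695876944
Aug 17 23:31:44 PlexServer kernel: md: recovery thread: P corrected, sector=19695877192
Aug 17 23:31:44 PlexServer kernel: md: recovery thread: P corrected, sector=19695877216
Aug 18 04:26:14 PlexServer kernel: md: sync done. time=79748sec
Aug 18 04:26:14 PlexServer kernel: md: recovery thread: exit status: 0
"""

# Sectors from the 6.10 run in the first post (that run crashed before "sync done").
SECTORS_610 = {19695876888, 19695876944, 19695877192, 19695877216, 19695877248}

CORRECTED = re.compile(r"md: recovery thread: P corrected, sector=(\d+)")
DONE = re.compile(r"md: sync done\. time=(\d+)sec")

def summarize(log):
    """Collect corrected parity sectors and the sync duration (None if not logged)."""
    sectors = [int(s) for s in CORRECTED.findall(log)]
    done = DONE.search(log)
    return {
        "corrected": sectors,
        "duration_sec": int(done.group(1)) if done else None,
    }

summary = summarize(RUN_692)
print(len(summary["corrected"]), "sectors corrected")
print("only in the 6.10 run:", SECTORS_610 - set(summary["corrected"]))
```

Run against the posted logs, this shows the same four sectors in both runs, plus one extra sector (19695877248) logged just before the crash on 6.10.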
VladoPortos Posted August 18, 2022 Author
I'm away from home, but I lost the connection to my home network from my mobile, so I assume it died again. Time to start pulling out HW.
VladoPortos Posted August 18, 2022 Author Share Posted August 18, 2022 This is the issue I managed to capture: Syslog: Aug 18 10:05:05 PlexServer kernel: kernel tried to execute NX-protected page - exploit attempt? (uid: 0) Aug 18 10:05:05 PlexServer kernel: BUG: unable to handle page fault for address: ffffffff826bbdb4 Aug 18 10:05:05 PlexServer kernel: #PF: supervisor instruction fetch in kernel mode Aug 18 10:05:05 PlexServer kernel: #PF: error_code(0x0011) - permissions violation Aug 18 10:05:05 PlexServer kernel: PGD 200e067 P4D 200e067 PUD 200f063 PMD 80000000026001e3 Aug 18 10:05:05 PlexServer kernel: Oops: 0011 [#1] SMP NOPTI Aug 18 10:05:05 PlexServer kernel: CPU: 9 PID: 121827 Comm: kworker/u256:5 Tainted: P O 5.10.28-Unraid #1 Aug 18 10:05:05 PlexServer kernel: Hardware name: Gigabyte Technology Co., Ltd. TRX40 AORUS MASTER/TRX40 AORUS MASTER, BIOS F6 11/23/2021 Aug 18 10:05:05 PlexServer kernel: Workqueue: events_freezable_power_ thermal_zone_device_check Aug 18 10:05:05 PlexServer kernel: RIP: 0010:0xffffffff826bbdb4 Aug 18 10:05:05 PlexServer kernel: Code: 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 58 e7 51 eb 25 15 00 00 00 00 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 00 00 00 00 <00> 00 00 00 06 00 00 00 00 01 00 77 00 00 00 00 00 00 00 00 00 00 Aug 18 10:05:05 PlexServer kernel: RSP: 0018:ffffc9000565be28 EFLAGS: 00010246 Aug 18 10:05:05 PlexServer kernel: RAX: 0000000000000000 RBX: ffff888102674800 RCX: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: RDX: ffff889f53979c00 RSI: ffff88873c7b3000 RDI: ffff888102674bc8 Aug 18 10:05:05 PlexServer kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffff889b6dbb5c00 Aug 18 10:05:05 PlexServer kernel: R10: 0000000000000000 R11: ffff88873c7b3000 R12: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: R13: ffff888102674bc8 R14: ffff888102674bc8 R15: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: FS: 0000000000000000(0000) GS:ffff889ffd240000(0000) knlGS:0000000000000000 Aug 18 10:05:05 
PlexServer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Aug 18 10:05:05 PlexServer kernel: CR2: ffffffff826bbdb4 CR3: 00000018233ca000 CR4: 0000000000350ee0 Aug 18 10:05:05 PlexServer kernel: Call Trace: Aug 18 10:05:05 PlexServer kernel: ? thermal_zone_set_trips+0x2e/0x134 Aug 18 10:05:05 PlexServer kernel: ? thermal_get_temp+0x1e/0x37 [thermal] Aug 18 10:05:05 PlexServer kernel: ? thermal_zone_device_update+0xa8/0xe5 Aug 18 10:05:05 PlexServer kernel: ? process_one_work+0x13c/0x1d5 Aug 18 10:05:05 PlexServer kernel: ? worker_thread+0x18b/0x22f Aug 18 10:05:05 PlexServer kernel: ? process_scheduled_works+0x27/0x27 Aug 18 10:05:05 PlexServer kernel: ? kthread+0xe5/0xea Aug 18 10:05:05 PlexServer kernel: ? __kthread_bind_mask+0x57/0x57 Aug 18 10:05:05 PlexServer kernel: ? ret_from_fork+0x22/0x30 Aug 18 10:05:05 PlexServer kernel: traps: extendedTest.ph[97763] general protection fault ip:881cfa sp:7ffed44e6970 error:0 Aug 18 10:05:05 PlexServer kernel: Modules linked in: xt_mark nft_compat nft_counter nvidia_uvm(PO) xt_nat macvlan xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat Aug 18 10:05:05 PlexServer kernel: in php[600000+336000] Aug 18 10:05:05 PlexServer kernel: iptable_mangle nf_tables Aug 18 10:05:05 PlexServer kernel: Aug 18 10:05:05 PlexServer kernel: vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) drm backlight agpgart it87 hwmon_vid wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables wmi_bmof mxm_wmi edac_mce_amd amd_energy btusb btrtl btbcm btintel kvm_amd 
bluetooth kvm crct10dif_pclmul igb crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ecdh_generic crypto_simd ecc cryptd ccp i2c_piix4 ahci i2c_algo_bit glue_helper i2c_core libahci rapl k10temp thermal button acpi_cpufreq wmi nvme nvme_core Aug 18 10:05:05 PlexServer kernel: CR2: ffffffff826bbdb4 Aug 18 10:05:05 PlexServer kernel: ---[ end trace a35bf397933c9bf6 ]--- Aug 18 10:05:05 PlexServer kernel: general protection fault, probably for non-canonical address 0xfffe8881e212c9f8: 0000 [#2] SMP NOPTI Aug 18 10:05:05 PlexServer kernel: CPU: 15 PID: 97763 Comm: extendedTest.ph Tainted: P D O 5.10.28-Unraid #1 Aug 18 10:05:05 PlexServer kernel: RIP: 0010:0xffffffff826bbdb4 Aug 18 10:05:05 PlexServer kernel: Code: 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 58 e7 51 eb 25 15 00 00 00 00 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 00 00 00 00 <00> 00 00 00 06 00 00 00 00 01 00 77 00 00 00 00 00 00 00 00 00 00 Aug 18 10:05:05 PlexServer kernel: Hardware name: Gigabyte Technology Co., Ltd. 
TRX40 AORUS MASTER/TRX40 AORUS MASTER, BIOS F6 11/23/2021 Aug 18 10:05:05 PlexServer kernel: RIP: 0010:unlink_anon_vmas+0x62/0x127 Aug 18 10:05:05 PlexServer kernel: RSP: 0018:ffffc9000565be28 EFLAGS: 00010246 Aug 18 10:05:05 PlexServer kernel: Code: 6d 08 4c 89 e7 49 8b 75 00 e8 d9 eb ff ff 49 8d 75 40 48 89 ef 49 89 c4 e8 a5 b7 fe ff 49 8b 45 40 48 85 c0 75 09 49 8b 45 38 <ff> 48 34 eb 29 48 8b 45 18 48 89 ef 48 8b 55 10 48 89 42 08 48 89 Aug 18 10:05:05 PlexServer kernel: RSP: 0000:ffffc90004473bd8 EFLAGS: 00010246 Aug 18 10:05:05 PlexServer kernel: RAX: 0000000000000000 RBX: ffff888102674800 RCX: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: RDX: ffff889f53979c00 RSI: ffff88873c7b3000 RDI: ffff888102674bc8 Aug 18 10:05:05 PlexServer kernel: Aug 18 10:05:05 PlexServer kernel: RAX: fffe8881e212c9f8 RBX: ffff889fd3936600 RCX: ffff8899187fd2e0 Aug 18 10:05:05 PlexServer kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffff889b6dbb5c00 Aug 18 10:05:05 PlexServer kernel: R10: 0000000000000000 R11: ffff88873c7b3000 R12: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: RBP: ffff8899187fd2c0 R08: 000015189cf76000 R09: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: R13: ffff888102674bc8 R14: ffff888102674bc8 R15: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: FS: 0000000000000000(0000) GS:ffff889ffd240000(0000) knlGS:0000000000000000 Aug 18 10:05:05 PlexServer kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881e212c9f8 Aug 18 10:05:05 PlexServer kernel: R13: ffff8881e212c9f8 R14: ffff889fd3936668 R15: dead000000000100 Aug 18 10:05:05 PlexServer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Aug 18 10:05:05 PlexServer kernel: CR2: ffffffff826bbdb4 CR3: 00000018233ca000 CR4: 0000000000350ee0 Aug 18 10:05:05 PlexServer kernel: FS: 0000000000000000(0000) GS:ffff889ffd3c0000(0000) knlGS:0000000000000000 Aug 18 10:05:05 
PlexServer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Aug 18 10:05:05 PlexServer kernel: CR2: 0000149d85630180 CR3: 0000001a0e882000 CR4: 0000000000350ee0 Aug 18 10:05:05 PlexServer kernel: Call Trace: Aug 18 10:05:05 PlexServer kernel: free_pgtables+0x81/0xbb Aug 18 10:05:05 PlexServer kernel: exit_mmap+0xc4/0x155 Aug 18 10:05:05 PlexServer kernel: __mmput+0x3b/0xcf Aug 18 10:05:05 PlexServer kernel: do_exit+0x3b4/0x8eb Aug 18 10:05:05 PlexServer kernel: do_group_exit+0x8e/0x8e Aug 18 10:05:05 PlexServer kernel: get_signal+0x1b3/0x599 Aug 18 10:05:05 PlexServer kernel: arch_do_signal+0x2b/0x705 Aug 18 10:05:05 PlexServer kernel: ? signal_wake_up_state+0x11/0x20 Aug 18 10:05:05 PlexServer kernel: ? __send_signal+0x1c5/0x233 Aug 18 10:05:05 PlexServer kernel: exit_to_user_mode_prepare+0x38/0xc6 Aug 18 10:05:05 PlexServer kernel: irqentry_exit_to_user_mode+0x5/0x12 Aug 18 10:05:05 PlexServer kernel: exc_general_protection+0x1aa/0x1cc Aug 18 10:05:05 PlexServer kernel: ? vfs_write+0xec/0x121 Aug 18 10:05:05 PlexServer kernel: ? asm_exc_general_protection+0x8/0x30 Aug 18 10:05:05 PlexServer kernel: asm_exc_general_protection+0x1e/0x30 Aug 18 10:05:05 PlexServer kernel: RIP: 0033:0x881cfa Aug 18 10:05:05 PlexServer kernel: Code: Unable to access opcode bytes at RIP 0x881cd0. 
Aug 18 10:05:05 PlexServer kernel: RSP: 002b:00007ffed44e6970 EFLAGS: 00010206 Aug 18 10:05:05 PlexServer kernel: RAX: 00000000800c0005 RBX: 0000000000000005 RCX: 0001151898800000 Aug 18 10:05:05 PlexServer kernel: RDX: 00011518988641b0 RSI: 0000000000000064 RDI: 00001518988b14e0 Aug 18 10:05:05 PlexServer kernel: RBP: 000015189a600060 R08: 0000000000000001 R09: 00000000011370d4 Aug 18 10:05:05 PlexServer kernel: R10: 0000000000000005 R11: 0000000000000001 R12: 0000000000000055 Aug 18 10:05:05 PlexServer kernel: R13: 0001000000000050 R14: 000015189a616e50 R15: 000015189a600040 Aug 18 10:05:05 PlexServer kernel: Modules linked in: xt_mark nft_compat nft_counter nvidia_uvm(PO) xt_nat macvlan xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) drm backlight agpgart it87 hwmon_vid wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables wmi_bmof mxm_wmi edac_mce_amd amd_energy btusb btrtl btbcm btintel kvm_amd bluetooth kvm crct10dif_pclmul igb crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ecdh_generic crypto_simd ecc cryptd ccp Aug 18 10:05:05 PlexServer kernel: i2c_piix4 ahci i2c_algo_bit glue_helper i2c_core libahci rapl k10temp thermal button acpi_cpufreq wmi nvme nvme_core Aug 18 10:05:05 PlexServer kernel: ---[ end trace a35bf397933c9bf7 ]--- Aug 18 10:05:05 PlexServer kernel: RIP: 0010:0xffffffff826bbdb4 Aug 18 10:05:05 PlexServer kernel: Code: 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 58 e7 51 eb 25 15 
00 00 00 00 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 00 00 00 00 <00> 00 00 00 06 00 00 00 00 01 00 77 00 00 00 00 00 00 00 00 00 00 Aug 18 10:05:05 PlexServer kernel: RSP: 0018:ffffc9000565be28 EFLAGS: 00010246 Aug 18 10:05:05 PlexServer kernel: RAX: 0000000000000000 RBX: ffff888102674800 RCX: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: RDX: ffff889f53979c00 RSI: ffff88873c7b3000 RDI: ffff888102674bc8 Aug 18 10:05:05 PlexServer kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffff889b6dbb5c00 Aug 18 10:05:05 PlexServer kernel: R10: 0000000000000000 R11: ffff88873c7b3000 R12: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: R13: ffff888102674bc8 R14: ffff888102674bc8 R15: 0000000000000000 Aug 18 10:05:05 PlexServer kernel: FS: 0000000000000000(0000) GS:ffff889ffd3c0000(0000) knlGS:0000000000000000 Aug 18 10:05:05 PlexServer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Aug 18 10:05:05 PlexServer kernel: CR2: 0000149d85630180 CR3: 0000001a0e882000 CR4: 0000000000350ee0 Aug 18 10:05:05 PlexServer kernel: Fixing recursive fault but reboot is needed! Aug 18 10:05:06 PlexServer kernel: umip: Tdarr_Node[51140] ip:35de076ca3c9 sp:7ffcabe12398: STR instruction cannot be used by applications. Aug 18 10:05:06 PlexServer kernel: umip: Tdarr_Node[51140] ip:35de076ca3c9 sp:7ffcabe12398: For now, expensive software emulation returns the result. Dmesg: [Aug18 10:04] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) [ +0.000044] BUG: unable to handle page fault for address: ffffffff826bbdb4 [ +0.000030] #PF: supervisor instruction fetch in kernel mode [ +0.000025] #PF: error_code(0x0011) - permissions violation [ +0.000025] PGD 200e067 P4D 200e067 PUD 200f063 PMD 80000000026001e3 [ +0.000030] Oops: 0011 [#1] SMP NOPTI [ +0.000019] CPU: 9 PID: 121827 Comm: kworker/u256:5 Tainted: P O 5.10.28-Unraid #1 [ +0.000044] Hardware name: Gigabyte Technology Co., Ltd. 
TRX40 AORUS MASTER/TRX40 AORUS MASTER, BIOS F6 11/23/2021 [ +0.000053] Workqueue: events_freezable_power_ thermal_zone_device_check [ +0.000029] RIP: 0010:0xffffffff826bbdb4 [ +0.000020] Code: 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 58 e7 51 eb 25 15 00 00 00 00 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 00 00 00 00 <00> 00 00 00 06 00 00 00 00 01 00 77 00 00 00 00 00 00 00 00 00 00 [ +0.000085] RSP: 0018:ffffc9000565be28 EFLAGS: 00010246 [ +0.000024] RAX: 0000000000000000 RBX: ffff888102674800 RCX: 0000000000000000 [ +0.000030] RDX: ffff889f53979c00 RSI: ffff88873c7b3000 RDI: ffff888102674bc8 [ +0.000030] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff889b6dbb5c00 [ +0.000030] R10: 0000000000000000 R11: ffff88873c7b3000 R12: 0000000000000000 [ +0.000030] R13: ffff888102674bc8 R14: ffff888102674bc8 R15: 0000000000000000 [ +0.000030] FS: 0000000000000000(0000) GS:ffff889ffd240000(0000) knlGS:0000000000000000 [ +0.000034] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000025] CR2: ffffffff826bbdb4 CR3: 00000018233ca000 CR4: 0000000000350ee0 [ +0.000030] Call Trace: [ +0.000017] ? thermal_zone_set_trips+0x2e/0x134 [ +0.000024] ? thermal_get_temp+0x1e/0x37 [thermal] [ +0.000023] ? thermal_zone_device_update+0xa8/0xe5 [ +0.000024] ? process_one_work+0x13c/0x1d5 [ +0.000020] ? worker_thread+0x18b/0x22f [ +0.000020] ? process_scheduled_works+0x27/0x27 [ +0.000022] ? kthread+0xe5/0xea [ +0.000017] ? __kthread_bind_mask+0x57/0x57 [ +0.000021] ? 
ret_from_fork+0x22/0x30 [ +0.003626] traps: extendedTest.ph[97763] general protection fault ip:881cfa sp:7ffed44e6970 error:0 [ +0.003722] Modules linked in: xt_mark nft_compat nft_counter nvidia_uvm(PO) xt_nat macvlan xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat [ +0.000061] in php[600000+336000] [ +0.000001] iptable_mangle nf_tables [ +0.000122] vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) drm backlight agpgart it87 hwmon_vid wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables wmi_bmof mxm_wmi edac_mce_amd amd_energy btusb btrtl btbcm btintel kvm_amd bluetooth kvm crct10dif_pclmul igb crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ecdh_generic crypto_simd ecc cryptd ccp i2c_piix4 ahci i2c_algo_bit glue_helper i2c_core libahci rapl k10temp thermal button acpi_cpufreq wmi nvme nvme_core [ +0.000410] CR2: ffffffff826bbdb4 [ +0.000568] ---[ end trace a35bf397933c9bf6 ]--- [ +0.000001] general protection fault, probably for non-canonical address 0xfffe8881e212c9f8: 0000 [#2] SMP NOPTI [ +0.000003] CPU: 15 PID: 97763 Comm: extendedTest.ph Tainted: P D O 5.10.28-Unraid #1 [ +0.000075] RIP: 0010:0xffffffff826bbdb4 [ +0.000003] Code: 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 58 e7 51 eb 25 15 00 00 00 00 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 00 00 00 00 <00> 00 00 00 06 00 00 00 00 01 00 77 00 00 00 00 00 00 00 00 00 00 [ +0.000107] Hardware name: Gigabyte Technology Co., Ltd. 
TRX40 AORUS MASTER/TRX40 AORUS MASTER, BIOS F6 11/23/2021 [ +0.000007] RIP: 0010:unlink_anon_vmas+0x62/0x127 [ +0.000095] RSP: 0018:ffffc9000565be28 EFLAGS: 00010246 [ +0.000077] Code: 6d 08 4c 89 e7 49 8b 75 00 e8 d9 eb ff ff 49 8d 75 40 48 89 ef 49 89 c4 e8 a5 b7 fe ff 49 8b 45 40 48 85 c0 75 09 49 8b 45 38 <ff> 48 34 eb 29 48 8b 45 18 48 89 ef 48 8b 55 10 48 89 42 08 48 89 [ +0.000002] RSP: 0000:ffffc90004473bd8 EFLAGS: 00010246 [ +0.000139] RAX: 0000000000000000 RBX: ffff888102674800 RCX: 0000000000000000 [ +0.000002] RDX: ffff889f53979c00 RSI: ffff88873c7b3000 RDI: ffff888102674bc8 [ +0.000110] RAX: fffe8881e212c9f8 RBX: ffff889fd3936600 RCX: ffff8899187fd2e0 [ +0.000077] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff889b6dbb5c00 [ +0.000001] R10: 0000000000000000 R11: ffff88873c7b3000 R12: 0000000000000000 [ +0.000089] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000 [ +0.000001] RBP: ffff8899187fd2c0 R08: 000015189cf76000 R09: 0000000000000000 [ +0.000140] R13: ffff888102674bc8 R14: ffff888102674bc8 R15: 0000000000000000 [ +0.000002] FS: 0000000000000000(0000) GS:ffff889ffd240000(0000) knlGS:0000000000000000 [ +0.000080] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881e212c9f8 [ +0.000002] R13: ffff8881e212c9f8 R14: ffff889fd3936668 R15: dead000000000100 [ +0.000085] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000002] CR2: ffffffff826bbdb4 CR3: 00000018233ca000 CR4: 0000000000350ee0 [ +0.000089] FS: 0000000000000000(0000) GS:ffff889ffd3c0000(0000) knlGS:0000000000000000 [ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.001212] CR2: 0000149d85630180 CR3: 0000001a0e882000 CR4: 0000000000350ee0 [ +0.000088] Call Trace: [ +0.000073] free_pgtables+0x81/0xbb [ +0.000076] exit_mmap+0xc4/0x155 [ +0.000075] __mmput+0x3b/0xcf [ +0.000081] do_exit+0x3b4/0x8eb [ +0.000073] do_group_exit+0x8e/0x8e [ +0.000075] get_signal+0x1b3/0x599 [ +0.000075] arch_do_signal+0x2b/0x705 [ +0.000076] ? 
signal_wake_up_state+0x11/0x20 [ +0.000078] ? __send_signal+0x1c5/0x233 [ +0.000078] exit_to_user_mode_prepare+0x38/0xc6 [ +0.000080] irqentry_exit_to_user_mode+0x5/0x12 [ +0.000079] exc_general_protection+0x1aa/0x1cc [ +0.000079] ? vfs_write+0xec/0x121 [ +0.000075] ? asm_exc_general_protection+0x8/0x30 [ +0.000080] asm_exc_general_protection+0x1e/0x30 [ +0.000079] RIP: 0033:0x881cfa [ +0.000078] Code: Unable to access opcode bytes at RIP 0x881cd0. [ +0.000085] RSP: 002b:00007ffed44e6970 EFLAGS: 00010206 [ +0.000081] RAX: 00000000800c0005 RBX: 0000000000000005 RCX: 0001151898800000 [ +0.000088] RDX: 00011518988641b0 RSI: 0000000000000064 RDI: 00001518988b14e0 [ +0.000088] RBP: 000015189a600060 R08: 0000000000000001 R09: 00000000011370d4 [ +0.000089] R10: 0000000000000005 R11: 0000000000000001 R12: 0000000000000055 [ +0.000088] R13: 0001000000000050 R14: 000015189a616e50 R15: 000015189a600040 [ +0.000090] Modules linked in: xt_mark nft_compat nft_counter nvidia_uvm(PO) xt_nat macvlan xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) drm backlight agpgart it87 hwmon_vid wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables wmi_bmof mxm_wmi edac_mce_amd amd_energy btusb btrtl btbcm btintel kvm_amd bluetooth kvm crct10dif_pclmul igb crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ecdh_generic crypto_simd ecc cryptd ccp [ +0.000043] i2c_piix4 ahci i2c_algo_bit glue_helper i2c_core libahci rapl k10temp thermal 
button acpi_cpufreq wmi nvme nvme_core [ +0.000630] ---[ end trace a35bf397933c9bf7 ]--- [ +0.000081] RIP: 0010:0xffffffff826bbdb4 [ +0.000080] Code: 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 58 e7 51 eb 25 15 00 00 00 00 00 00 00 00 00 00 c0 b5 4e eb 25 15 00 00 00 00 00 00 <00> 00 00 00 06 00 00 00 00 01 00 77 00 00 00 00 00 00 00 00 00 00 [ +0.000150] RSP: 0018:ffffc9000565be28 EFLAGS: 00010246 [ +0.000082] RAX: 0000000000000000 RBX: ffff888102674800 RCX: 0000000000000000 [ +0.000090] RDX: ffff889f53979c00 RSI: ffff88873c7b3000 RDI: ffff888102674bc8 [ +0.000092] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff889b6dbb5c00 [ +0.000091] R10: 0000000000000000 R11: ffff88873c7b3000 R12: 0000000000000000 [ +0.000093] R13: ffff888102674bc8 R14: ffff888102674bc8 R15: 0000000000000000 [ +0.000088] FS: 0000000000000000(0000) GS:ffff889ffd3c0000(0000) knlGS:0000000000000000 [ +0.000093] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000082] CR2: 0000149d85630180 CR3: 0000001a0e882000 CR4: 0000000000350ee0 [ +0.000089] Fixing recursive fault but reboot is needed! [Aug18 10:05] umip: Tdarr_Node[51140] ip:35de076ca3c9 sp:7ffcabe12398: STR instruction cannot be used by applications. [ +0.000112] umip: Tdarr_Node[51140] ip:35de076ca3c9 sp:7ffcabe12398: For now, expensive software emulation returns the result. Any idea how to find out what is it related to, PCI device, or what ? Quote Link to comment
JorgeB Posted August 18, 2022
8 minutes ago, VladoPortos said: Any idea how to find out what is it related to, PCI device, or what ?
Cannot tell based on that. If it's hardware related, it's difficult to see where the problem could be.
VladoPortos Posted August 18, 2022 Author
So I even tried 6.11 rc, but it did not recognize the NVIDIA cards and crashed the moment I turned on Docker, so I'm back to 9.10. I removed one newly added component that I suspect is the cause and am now running a much longer memtest.
Solution VladoPortos Posted August 18, 2022 Author
Well, well, well, I was wrong! I actually have faulty RAM; at least I really hope it is the new pair I put in. I took them out and am running memtest again. (Please, please, please be the issue; that would be easy to solve.)
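A faulty-RAM diagnosis is at least consistent with the captured oopses. In the second fault posted above, RAX holds fffe8881e212c9f8 while R12 and R13 hold ffff8881e212c9f8, and the two values differ in a single bit. The Python sketch below checks that, assuming 48-bit virtual addresses (4-level paging); the helper names are hypothetical. A one-bit difference is suggestive of memory corruption, not proof of it.

```python
def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits - 1) of a 64-bit address are all equal.

    va_bits=48 is an assumption (4-level paging on x86-64).
    """
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

def differing_bits(a, b):
    """Return the bit positions in which two 64-bit values differ."""
    diff = a ^ b
    return [i for i in range(64) if (diff >> i) & 1]

FAULTING = 0xFFFE8881E212C9F8  # RAX in the second oops
INTACT = 0xFFFF8881E212C9F8    # R12 and R13 in the same oops

print(is_canonical(FAULTING))            # False
print(is_canonical(INTACT))              # True
print(differing_bits(FAULTING, INTACT))  # [48]
```

The address from the first crash, 0xfffeffff828e18f8, fails the same canonical check, which matches the kernel's "non-canonical address" message.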
VladoPortos Posted August 20, 2022 Author
Well, confirmed: after the new RAM was removed, the server is back to rock solid. I just returned from the store with the replacement and this time let memtest run for the whole night. Marking this as solved.