lostinspace Posted March 15, 2021 (edited)
System details
Unraid 6.9.1
Gigabyte B365M motherboard
Intel i7-8700 CPU
48 GB DDR4 RAM, 2666 MHz (mismatched sticks/brands)
LSI 9201-8i flashed to P20 IT mode
System was stable on 6.8.3 for months. Since upgrading to 6.9, and now 6.9.1, the system crashes and becomes completely unresponsive every few days (no GUI access, no SSH access, all dockers down, no network), requiring a hard power off to reboot it and get it running again. I set up an external syslog server to capture what happens before the system becomes unresponsive; the output is below. Also attached are the diagnostics taken after the hard power-off and reboot.
Mar 14 16:38:32 Unraid kernel: <IRQ>
Mar 14 16:38:32 Unraid kernel: CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.10.21-Unraid #1
Mar 14 16:38:32 Unraid kernel: CR2: 000000c000853000 CR3: 000000000200c001 CR4: 00000000003726e0
Mar 14 16:38:32 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 14 16:38:32 Unraid kernel: Call Trace:
Mar 14 16:38:32 Unraid kernel: Code: 6c a0 00 00 41 56 45 31 f6 41 55 41 89 d5 41 54 55 48 89 fd 48 89 f7 53 4c 8b 66 10 31 db 49 81 e4 00 f0 ff ff 45 39 ee 7d 28 <48> 8b 47 10 41 ff c6 8b 57 18 25 ff 0f 00 00 48 8d 84 10 ff 0f 00
Mar 14 16:38:32 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 14 16:38:32 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 14 16:38:32 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 14 16:38:32 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 14 16:38:32 Unraid kernel: DMAR: [DMA Write] Request device [0b:00.0] PASID ffffffff fault addr f3f63000 [fault reason 05] PTE Write access is not set
Mar 14 16:38:32 Unraid kernel: DMAR: [DMA Write] Request device [0b:00.0] PASID ffffffff fault addr f3f65000 [fault reason 05] PTE Write access is not set
Mar 14 16:38:32 Unraid kernel: DMAR: [DMA Write] Request device [0b:00.0] PASID ffffffff fault addr f3f66000 [fault reason 05] PTE Write access is not set
Mar 14 16:38:32 Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 14 16:38:32 Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 14 16:38:32 Unraid kernel: FS: 0000000000000000(0000) GS:ffff8886172c0000(0000) knlGS:0000000000000000
Mar 14 16:38:32 Unraid kernel: Hardware name: Gigabyte Technology Co., Ltd. B365M DS3H/B365M DS3H, BIOS F5 08/13/2019
Mar 14 16:38:32 Unraid kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 2abf9cd8898ea000
Mar 14 16:38:32 Unraid kernel: R13: 0000000000000010 R14: 0000000000000002 R15: 000000000000008c
Mar 14 16:38:32 Unraid kernel: RAX: b61397b86406a014 RBX: 000000000017710a RCX: 0000000000000002
Mar 14 16:38:32 Unraid kernel: RBP: ffff888101be30b8 R08: 0000000000000000 R09: ffff8883b2291000
Mar 14 16:38:32 Unraid kernel: RDX: b61397b86406a015 RSI: ffff8883b2291000 RDI: b61397b86406a014
Mar 14 16:38:32 Unraid kernel: RIP: 0010:intel_unmap_sg+0x26/0x68
Mar 14 16:38:32 Unraid kernel: RSP: 0018:ffffc900001a4ec8 EFLAGS: 00010083
Mar 14 16:38:32 Unraid kernel: __handle_irq_event_percpu+0x36/0xcb
Mar 14 16:38:32 Unraid kernel: blk_update_request: I/O error, dev nvme0n1, sector 1645609472 op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 0
Mar 14 16:38:32 Unraid kernel: general protection fault, probably for non-canonical address 0xb61397b86406a014: 0000 [#1] SMP PTI
Mar 14 16:38:32 Unraid kernel: handle_irq_event_percpu+0x2c/0x6f
Mar 14 16:38:32 Unraid kernel: nvme_irq+0xb/0x17 [nvme]
Mar 14 16:38:32 Unraid kernel: nvme_pci_complete_rq+0x56/0x61 [nvme]
Mar 14 16:38:32 Unraid kernel: nvme_process_cq+0xdb/0x15b [nvme]
Mar 14 16:38:32 Unraid kernel: nvme_unmap_data+0x51/0xae [nvme]
Any guidance on a solution would be greatly appreciated!
unraid-diagnostics-20210314-1716.zip
Edited March 24, 2021 by lostinspace
sfaruque Posted March 16, 2021 (edited)
I have a similar issue. Unfortunately I can't get into the WebGUI or SSH. The server basically appears to be dead. I couldn't even get any diagnostics. Is there any way of getting the diagnostics if I don't have access to the WebGUI or SSH? My video card is stubbed with 'System Tools', so booting Unraid in GUI mode doesn't even show anything on a monitor plugged directly into the server. What a mess!
Unraid version 6.9.1
ASRock X570 / Ryzen 3600 / 32GB ECC RAM
Edited March 16, 2021 by sfaruque
lostinspace (Author) Posted March 17, 2021
At the suggestion of a Reddit user who said he had similar problems, I have formatted the cache drive. I will update in a few days (or sooner, if it crashes).
lostinspace (Author) Posted March 23, 2021
Crashed again. External syslog below, diagnostics (after hard reset) attached.
Mar 23 03:44:27 Unraid kernel: #PF: error_code(0x0002) - not-present page
Mar 23 03:44:27 Unraid kernel: #PF: supervisor write access in kernel mode
Mar 23 03:44:27 Unraid kernel: ------------[ cut here ]------------
Mar 23 03:44:27 Unraid kernel: ---[ end trace 4d3dcddc45e38db6 ]---
Mar 23 03:44:27 Unraid kernel: <IRQ>
Mar 23 03:44:27 Unraid kernel: ? __kthread_bind_mask+0x57/0x57
Mar 23 03:44:27 Unraid kernel: ? process_scheduled_works+0x27/0x27
Mar 23 03:44:27 Unraid kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Mar 23 03:44:27 Unraid kernel: CPU: 1 PID: 1510 Comm: kworker/1:1H Not tainted 5.10.21-Unraid #1
Mar 23 03:44:27 Unraid kernel: CPU: 1 PID: 1510 Comm: kworker/1:1H Tainted: G D 5.10.21-Unraid #1
Mar 23 03:44:27 Unraid kernel: CR2: 0000000000000000
Mar 23 03:44:27 Unraid kernel: CR2: 0000000000000000 CR3: 000000000200c005 CR4: 00000000003726e0
Mar 23 03:44:27 Unraid kernel: CR2: 0000000000000000 CR3: 000000000200c005 CR4: 00000000003726e0
Mar 23 03:44:27 Unraid kernel: CR2: 0000000000000000 CR3: 000000000200c005 CR4: 00000000003726e0
Mar 23 03:44:27 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 23 03:44:27 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 23 03:44:27 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 23 03:44:27 Unraid kernel: Call Trace:
Mar 23 03:44:27 Unraid kernel: Call Trace:
Mar 23 03:44:27 Unraid kernel: Code: 05 c8 97 e2 00 01 e8 e7 6f 3e 00 0f 0b c3 80 3d b8 97 e2 00 00 75 53 48 c7 c7 52 63 da 81 c6 05 a8 97 e2 00 01 e8 c8 6f 3e 00 <0f> 0b c3 80 3d 98 97 e2 00 00 75 34 48 c7 c7 7a 63 da 81 c6 05 88
Mar 23 03:44:27 Unraid kernel: Code: c3 b8 00 fe ff ff f0 0f c1 07 c3 31 c0 48 81 ff 58 56 6f 81 72 0c 31 c0 48 81 ff 00 58 6f 81 0f 92 c0 c3 31 c0 ba 01 00 00 00 <f0> 0f b1 17 74 04 89 c6 eb bb c3 8b 07 45 31 c0 85 c0 75 11 ba 01
Mar 23 03:44:27 Unraid kernel: Code: c3 b8 00 fe ff ff f0 0f c1 07 c3 31 c0 48 81 ff 58 56 6f 81 72 0c 31 c0 48 81 ff 00 58 6f 81 0f 92 c0 c3 31 c0 ba 01 00 00 00 <f0> 0f b1 17 74 04 89 c6 eb bb c3 8b 07 45 31 c0 85 c0 75 11 ba 01
Mar 23 03:44:27 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 23 03:44:27 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 23 03:44:27 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 23 03:44:27 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 23 03:44:27 Unraid kernel: DMAR: [DMA Read] Request device [0b:00.0] PASID ffffffff fault addr d6766000 [fault reason 06] PTE Read access is not set
Mar 23 03:44:27 Unraid kernel: DMAR: [DMA Read] Request device [0b:00.0] PASID ffffffff fault addr f973c000 [fault reason 06] PTE Read access is not set
Mar 23 03:44:27 Unraid kernel: DMAR: [DMA Read] Request device [0b:00.0] PASID ffffffff fault addr ff10c000 [fault reason 06] PTE Read access is not set
Mar 23 03:44:27 Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 23 03:44:27 Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 23 03:44:27 Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 23 03:44:27 Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 23 03:44:27 Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 23 03:44:27 Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 23 03:44:27 Unraid kernel: FS: 0000000000000000(0000) GS:ffff888bff240000(0000) knlGS:0000000000000000
Mar 23 03:44:27 Unraid kernel: FS: 0000000000000000(0000) GS:ffff888bff240000(0000) knlGS:0000000000000000
Mar 23 03:44:27 Unraid kernel: FS: 0000000000000000(0000) GS:ffff888bff240000(0000) knlGS:0000000000000000
Mar 23 03:44:27 Unraid kernel: Hardware name: Gigabyte Technology Co., Ltd. B365M DS3H/B365M DS3H, BIOS F5 08/13/2019
Mar 23 03:44:27 Unraid kernel: Hardware name: Gigabyte Technology Co., Ltd. B365M DS3H/B365M DS3H, BIOS F5 08/13/2019
Mar 23 03:44:27 Unraid kernel: Modules linked in: xt_CHECKSUM ipt_REJECT macvlan ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat veth xt_MASQUERADE iptable_nat nf_nat xfs nfsd lockd grace sunrpc md_mod ip6table_filter ip6_tables iptable_filter ip_tables bonding i915 wmi_bmof iosf_mbi x86_pkg_temp_thermal i2c_algo_bit intel_powerclamp coretemp drm_kms_helper kvm_intel drm kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel intel_gtt crypto_simd cryptd agpgart mpt3sas i2c_i801 syscopyarea sysfillrect glue_helper sysimgblt r8169 rapl raid_class i2c_smbus fb_sys_fops scsi_transport_sas nvme i2c_core ahci intel_cstate nvme_core realtek wmi intel_uncore libahci video backlight thermal acpi_pad button fan
Mar 23 03:44:27 Unraid kernel: Modules linked in: xt_CHECKSUM ipt_REJECT macvlan ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat veth xt_MASQUERADE iptable_nat nf_nat xfs nfsd lockd grace sunrpc md_mod ip6table_filter ip6_tables iptable_filter ip_tables bonding i915 wmi_bmof iosf_mbi x86_pkg_temp_thermal i2c_algo_bit intel_powerclamp coretemp drm_kms_helper kvm_intel drm kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel intel_gtt crypto_simd cryptd agpgart mpt3sas i2c_i801 syscopyarea sysfillrect glue_helper sysimgblt r8169 rapl raid_class i2c_smbus fb_sys_fops scsi_transport_sas nvme i2c_core ahci intel_cstate nvme_core realtek wmi intel_uncore libahci video backlight thermal acpi_pad button fan
Mar 23 03:44:27 Unraid kernel: Oops: 0002 [#1] SMP PTI
Mar 23 03:44:27 Unraid kernel: PGD 0 P4D 0
Mar 23 03:44:27 Unraid kernel: R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000000000
Mar 23 03:44:27 Unraid kernel: R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000000000
Mar 23 03:44:27 Unraid kernel: R10: ffffc9000014cbd0 R11: ffffc9000014cbc8 R12: 0000000000000000
Mar 23 03:44:27 Unraid kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Mar 23 03:44:27 Unraid kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Mar 23 03:44:27 Unraid kernel: R13: 0000000000000000 R14: 0000000000000129 R15: 000000000000008a
Mar 23 03:44:27 Unraid kernel: RAX: 0000000000000000 RBX: ffff8881042607c0 RCX: 0000000000000027
Mar 23 03:44:27 Unraid kernel: RAX: 0000000000000000 RBX: ffff888104cdbe80 RCX: ffff888104cdbec8
Mar 23 03:44:27 Unraid kernel: RAX: 0000000000000000 RBX: ffff888104cdbe80 RCX: ffff888104cdbec8
Mar 23 03:44:27 Unraid kernel: RBP: 0000000000000000 R08: ffff888104cdbe80 R09: 00646b636f6c626b
Mar 23 03:44:27 Unraid kernel: RBP: 0000000000000000 R08: ffff888104cdbe80 R09: 00646b636f6c626b
Mar 23 03:44:27 Unraid kernel: RBP: ffff888104cdbe80 R08: 0000000000000000 R09: 00000000ffffefff
Mar 23 03:44:27 Unraid kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Mar 23 03:44:27 Unraid kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Mar 23 03:44:27 Unraid kernel: RDX: 00000000ffffefff RSI: 0000000000000001 RDI: ffff888bff258920
Mar 23 03:44:27 Unraid kernel: RIP: 0010:do_raw_spin_lock+0x7/0x12
Mar 23 03:44:27 Unraid kernel: RIP: 0010:do_raw_spin_lock+0x7/0x12
Mar 23 03:44:27 Unraid kernel: RIP: 0010:refcount_warn_saturate+0xa7/0xe8
Mar 23 03:44:27 Unraid kernel: RSP: 0018:ffffc9000014cda0 EFLAGS: 00010086
Mar 23 03:44:27 Unraid kernel: RSP: 0018:ffffc900003dbe38 EFLAGS: 00010246
Mar 23 03:44:27 Unraid kernel: RSP: 0018:ffffc900003dbe38 EFLAGS: 00010246
Mar 23 03:44:27 Unraid kernel: WARNING: CPU: 1 PID: 1510 at lib/refcount.c:28 refcount_warn_saturate+0xa7/0xe8
Mar 23 03:44:27 Unraid kernel: Workqueue: kblockd blk_mq_requeue_work
Mar 23 03:44:27 Unraid kernel: Workqueue: kblockd blk_mq_requeue_work
Mar 23 03:44:27 Unraid kernel: __handle_irq_event_percpu+0x36/0xcb
Mar 23 03:44:27 Unraid kernel: __refcount_sub_and_test.constprop.0+0x24/0x2a
Mar 23 03:44:27 Unraid kernel: blk_mq_free_request+0xc6/0xdf
Mar 23 03:44:27 Unraid kernel: blk_mq_request_bypass_insert+0x1b/0x72
Mar 23 03:44:27 Unraid kernel: blk_mq_requeue_work+0x8f/0xff
Mar 23 03:44:27 Unraid kernel: handle_edge_irq+0xb0/0xd0
Mar 23 03:44:27 Unraid kernel: handle_irq_event+0x34/0x51
Mar 23 03:44:27 Unraid kernel: handle_irq_event_percpu+0x2c/0x6f
Mar 23 03:44:27 Unraid kernel: kthread+0xe5/0xea
Mar 23 03:44:27 Unraid kernel: nvme_irq+0xb/0x17 [nvme]
Mar 23 03:44:27 Unraid kernel: nvme_process_cq+0xdb/0x15b [nvme]
Mar 23 03:44:27 Unraid kernel: process_one_work+0x13c/0x1d5
Mar 23 03:44:27 Unraid kernel: refcount_t: underflow; use-after-free.
Mar 23 03:44:27 Unraid kernel: ret_from_fork+0x22/0x30
Mar 23 03:44:27 Unraid kernel: worker_thread+0x18b/0x22f
unraid-diagnostics-20210323-0818.zip
JorgeB Posted March 23, 2021
Both crashes appear to be related to the NVMe device; look for a BIOS update.
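One way to check that reading of the logs: every DMAR fault in the syslogs posted in this thread names the same PCI address, 0b:00.0. The following is a minimal sketch, assuming the external syslog has been saved to a file (the path in the usage comment is hypothetical, as is the function name), that tallies DMAR fault lines per requesting device.

```shell
#!/bin/bash
# Count DMAR fault lines per PCI device in a syslog read from stdin.
# Prints "<count> <bus:dev.fn>" lines, most frequent device first.
# dmar_fault_counts is an illustrative name, not an Unraid tool.
dmar_fault_counts() {
    # grep -o copes with logs where several entries share one line.
    grep -o 'DMAR: \[DMA [A-Za-z]*] Request device \[[0-9a-f:.]*]' \
        | sed 's/.*Request device \[\(.*\)]/\1/' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Example usage (the log path is hypothetical):
#   dmar_fault_counts < /mnt/user/logs/syslog-unraid.log
#   lspci -s 0b:00.0    # show which device sits at that address
```

If a single address accounts for all the faults and `lspci -s` reports it as the NVMe controller, that would be consistent with the suggestion above; it does not by itself say whether the drive, the BIOS, or the kernel is at fault.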
lostinspace (Author) Posted March 23, 2021 (edited)
Thank you @JorgeB for your time and comments. I updated the motherboard BIOS to the latest available (what I believe is an experimental version; from F5 to F6e). There were no notes about SSD/NVMe changes, but we'll see. What concerns me is that this only started after the 6.9 upgrade; it worked fine for months before that. I'm considering pulling the drive and putting it into my Windows PC to update the firmware (SK Hynix P31 Gold NVMe). If I get another crash I will do so.
For the record, I ran a pass and a half (5 hours) of memtest86 with no errors.
Per the updated GPU Driver instructions, I also commented out the go file edits made in 6.8.3. I had already "enabled" the new Intel iGPU drivers via the touch command immediately after the upgrade to 6.9, but the guide now says to remove the go file edits.
Edited March 23, 2021 by lostinspace
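Before moving the drive to another machine, it can help to note which firmware revision it currently reports, so the result of an update can be confirmed afterwards. A minimal sketch, assuming the standard Linux sysfs layout under /sys/class/nvme; the function name is illustrative, and the optional argument exists only so the function can be pointed at a different directory.

```shell
#!/bin/bash
# Print model and firmware revision for each NVMe controller found in sysfs.
# nvme_firmware is an illustrative name; the optional argument overrides
# the sysfs class directory (default: /sys/class/nvme).
nvme_firmware() {
    local class_dir="${1:-/sys/class/nvme}"
    local ctrl
    for ctrl in "$class_dir"/nvme*; do
        [ -r "$ctrl/firmware_rev" ] || continue
        # These attributes may be padded with trailing spaces; strip them.
        printf '%s: %s (firmware %s)\n' \
            "$(basename "$ctrl")" \
            "$(sed 's/[[:space:]]*$//' "$ctrl/model")" \
            "$(sed 's/[[:space:]]*$//' "$ctrl/firmware_rev")"
    done
}

# Example usage on the server itself:
#   nvme_firmware
```

The revision printed here can be compared against whatever the vendor's update tool reports as the latest version.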
Tristankin Posted March 23, 2021
Intel iGPU seems to be the issue in 6.9.0/1. I rolled back for this exact reason, as media is the main reason I have the server. I'm not sure there is a definitive answer as to whether it is an incorrect config, a bad Plex container version, or something funky with the kernel and Intel iGPUs. I'm holding off on upgrading again until they get this sorted, but Intel iGPU transcoding users seem to be a small section of the community.
Hoopster Posted March 24, 2021
24 minutes ago, Tristankin said: "Intel iGPU seems to be the issue in 6.9.0/1,"
It's not a universal issue, as many others and I are currently running unRAID 6.9.1 and using the new way of loading i915 drivers documented in the release notes rather than loading them from the 'go' file as I did in previous versions. With 6.9.1, i915/Intel iGPU has worked for me with no issues in Plex and HandBrake with both the 'go' file and touch /boot/config/modprobe.d/i915.conf methods. I have a 9th-generation Intel CPU (Xeon E-2288G) with the UHD P630 iGPU.
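For anyone switching from the 'go' file method to the touch method mentioned above, here is a minimal sketch of both halves: creating the empty i915.conf, and checking whether the go file still loads the driver by hand. The function names are illustrative, and the `modprobe i915` pattern is an assumption about what the old go-file edits looked like; adjust it to match the actual file.

```shell
#!/bin/bash
# Create the empty i915.conf described in the 6.9 release notes.
# The directory argument defaults to the flash path quoted in the post.
enable_i915() {
    local dir="${1:-/boot/config/modprobe.d}"
    mkdir -p "$dir" && touch "$dir/i915.conf"
}

# List go-file lines that still load i915 manually (uncommented lines only).
# ASSUMPTION: the old edits used a plain "modprobe i915" line.
leftover_go_edits() {
    local go_file="${1:-/boot/config/go}"
    grep -n '^[[:space:]]*modprobe[[:space:]]\{1,\}i915' "$go_file"
}

# Example usage:
#   enable_i915
#   leftover_go_edits || echo "no manual i915 load left in the go file"
```

`leftover_go_edits` prints matching lines with their line numbers and returns non-zero when nothing matches, so it can be used directly in a condition as shown in the usage comment.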
ken-ji Posted March 24, 2021
Chiming in that I'm running an i7-7700, also with a UHD630 iGPU, and using it with Emby with iGPU transcoding as well. My server's rock stable (Unraid bugs notwithstanding). I've only rebooted it to enable VFIO binding and to recover from a bad package install (newer Slackware packages don't work, as they updated glibc but Limetech didn't).
lostinspace (Author) Posted March 24, 2021 (edited)
And another one. So updating the BIOS didn't resolve it, and taking the GPU edits out of the go file didn't resolve it either. Unless someone has other suggestions, I'll have to go back to 6.8.3.
Mar 24 08:26:42 Unraid emhttpd: read SMART /dev/sdh
Mar 24 08:27:09 Unraid kernel: #PF: error_code(0x0002) - not-present page
Mar 24 08:27:09 Unraid kernel: #PF: supervisor write access in kernel mode
Mar 24 08:27:09 Unraid kernel: ---[ end trace c84b3c57793f4667 ]---
Mar 24 08:27:09 Unraid kernel: ? __kthread_bind_mask+0x57/0x57
Mar 24 08:27:09 Unraid kernel: ? process_scheduled_works+0x27/0x27
Mar 24 08:27:09 Unraid kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Mar 24 08:27:09 Unraid kernel: CPU: 8 PID: 903 Comm: kworker/8:1H Not tainted 5.10.21-Unraid #1
Mar 24 08:27:09 Unraid kernel: CR2: 0000000000000000
Mar 24 08:27:09 Unraid kernel: CR2: 0000000000000000 CR3: 000000000200c002 CR4: 00000000003726e0
Mar 24 08:27:09 Unraid kernel: CR2: 0000000000000000 CR3: 000000000200c002 CR4: 00000000003726e0
Mar 24 08:27:09 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 24 08:27:09 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 24 08:27:09 Unraid kernel: Call Trace:
Mar 24 08:27:09 Unraid kernel: Code: c3 b8 00 fe ff ff f0 0f c1 07 c3 31 c0 48 81 ff 58 56 6f 81 72 0c 31 c0 48 81 ff 00 58 6f 81 0f 92 c0 c3 31 c0 ba 01 00 00 00 <f0> 0f b1 17 74 04 89 c6 eb bb c3 8b 07 45 31 c0 85 c0 75 11 ba 01
Mar 24 08:27:09 Unraid kernel: Code: c3 b8 00 fe ff ff f0 0f c1 07 c3 31 c0 48 81 ff 58 56 6f 81 72 0c 31 c0 48 81 ff 00 58 6f 81 0f 92 c0 c3 31 c0 ba 01 00 00 00 <f0> 0f b1 17 74 04 89 c6 eb bb c3 8b 07 45 31 c0 85 c0 75 11 ba 01
Mar 24 08:27:09 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 24 08:27:09 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 24 08:27:09 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 24 08:27:09 Unraid kernel: DMAR: DRHD: handling fault status reg 3
Mar 24 08:27:09 Unraid kernel: DMAR: [DMA Read] Request device [0b:00.0] PASID ffffffff fault addr d6c30000 [fault reason 06] PTE Read access is not set
Mar 24 08:27:09 Unraid kernel: DMAR: [DMA Read] Request device [0b:00.0] PASID ffffffff fault addr db7b2000 [fault reason 06] PTE Read access is not set
Mar 24 08:27:09 Unraid kernel: DMAR: [DMA Read] Request device [0b:00.0] PASID ffffffff fault addr db7b2000 [fault reason 06] PTE Read access is not set
Mar 24 08:27:09 Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 24 08:27:09 Unraid kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 24 08:27:09 Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 24 08:27:09 Unraid kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 24 08:27:09 Unraid kernel: FS: 0000000000000000(0000) GS:ffff888bff400000(0000) knlGS:0000000000000000
Mar 24 08:27:09 Unraid kernel: FS: 0000000000000000(0000) GS:ffff888bff400000(0000) knlGS:0000000000000000
Mar 24 08:27:09 Unraid kernel: Hardware name: Gigabyte Technology Co., Ltd. B365M DS3H/B365M DS3H, BIOS F6e 08/18/2020
Mar 24 08:27:09 Unraid kernel: Modules linked in: xt_CHECKSUM macvlan ipt_REJECT ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat veth xt_MASQUERADE iptable_nat nf_nat xfs nfsd lockd grace sunrpc md_mod ip6table_filter ip6_tables iptable_filter ip_tables bonding wmi_bmof x86_pkg_temp_thermal intel_powerclamp coretemp i915 kvm_intel kvm iosf_mbi crct10dif_pclmul i2c_algo_bit crc32_pclmul crc32c_intel ghash_clmulni_intel drm_kms_helper aesni_intel crypto_simd cryptd glue_helper drm mpt3sas intel_gtt agpgart i2c_i801 syscopyarea rapl i2c_smbus sysfillrect i2c_core sysimgblt r8169 nvme intel_cstate raid_class input_leds led_class scsi_transport_sas fb_sys_fops nvme_core intel_uncore ahci realtek wmi libahci video backlight thermal acpi_pad button fan
Mar 24 08:27:09 Unraid kernel: Oops: 0002 [#1] SMP PTI
Mar 24 08:27:09 Unraid kernel: PGD 0 P4D 0
Mar 24 08:27:09 Unraid kernel: R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000000000
Mar 24 08:27:09 Unraid kernel: R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000000000
Mar 24 08:27:09 Unraid kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Mar 24 08:27:09 Unraid kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Mar 24 08:27:09 Unraid kernel: RAX: 0000000000000000 RBX: ffff888104aba500 RCX: ffff888104aba548
Mar 24 08:27:09 Unraid kernel: RAX: 0000000000000000 RBX: ffff888104aba500 RCX: ffff888104aba548
Mar 24 08:27:09 Unraid kernel: RBP: 0000000000000000 R08: ffff888104aba500 R09: 00646b636f6c626b
Mar 24 08:27:09 Unraid kernel: RBP: 0000000000000000 R08: ffff888104aba500 R09: 00646b636f6c626b
Mar 24 08:27:09 Unraid kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Mar 24 08:27:09 Unraid kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Mar 24 08:27:09 Unraid kernel: RIP: 0010:do_raw_spin_lock+0x7/0x12
Mar 24 08:27:09 Unraid kernel: RIP: 0010:do_raw_spin_lock+0x7/0x12
Mar 24 08:27:09 Unraid kernel: RSP: 0018:ffffc90001863e38 EFLAGS: 00010246
Mar 24 08:27:09 Unraid kernel: RSP: 0018:ffffc90001863e38 EFLAGS: 00010246
Mar 24 08:27:09 Unraid kernel: Workqueue: kblockd blk_mq_requeue_work
Mar 24 08:27:09 Unraid kernel: blk_mq_request_bypass_insert+0x1b/0x72
Mar 24 08:27:09 Unraid kernel: blk_mq_requeue_work+0x8f/0xff
Mar 24 08:27:09 Unraid kernel: kthread+0xe5/0xea
Mar 24 08:27:09 Unraid kernel: process_one_work+0x13c/0x1d5
Mar 24 08:27:09 Unraid kernel: ret_from_fork+0x22/0x30
Mar 24 08:27:09 Unraid kernel: worker_thread+0x18b/0x22f
Edited March 24, 2021 by lostinspace
lostinspace (Author) Posted April 7, 2021
Just to add a data point here: I couldn't figure out the problem and was tired of the server freezing/crashing and becoming completely unresponsive, so I reverted to 6.8.3. I've now got 14 straight days of uptime, which was unheard of on 6.9/6.9.1. So something in 6.9/6.9.1 just doesn't like my hardware.
NerdyGriffin Posted April 13, 2021
On 3/23/2021 at 8:00 PM, Hoopster said: "... using the new way of loading i915 drivers documented in the release notes ..."
Could you please provide a link to this? I have searched around, but I cannot find it mentioned anywhere I have read so far.
Hoopster Posted April 13, 2021
6 minutes ago, NerdyGriffin said: "Could you please provide a link to this, I have searched around but I cannot find this mentioned anywhere that I have read so far"
https://wiki.unraid.net/Manual/Release_Notes/Unraid_OS_6.9.0#GPU_Driver_Integration
codefaux Posted April 28, 2021
Not sure if this is relevant, but I may have recently fixed a huge instability issue I had. I'm on an X8DTH-6F board with Xeon processors (which ones don't matter, trust me; I tried four different pairs, spread across the supported set) and had intermittent crashing. I fixed it by using sysfs to disable C-states on my CPUs. I can't promise this will help anyone, but I had VERY similar crash messages, pointing to a wide variety of random hardware. Log in via terminal, run this command, and cross your fingers.
for cpus in /sys/devices/system/cpu/cpu*/cpuidle/state*/disable; do echo 1 > "$cpus"; done
Hope this helps.
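The one-liner above can also be wrapped in a small function that reports how many idle states it touched, which makes it easier to tell whether anything happened at all. This is a sketch of the same idea rather than a different technique; the function name and the optional root argument are illustrative additions, not part of the original post.

```shell
#!/bin/bash
# Write 1 to every cpuidle "disable" file, as in the one-liner above,
# and report how many idle states were changed.
# disable_cstates is an illustrative name; the optional argument overrides
# the sysfs CPU directory (default: /sys/devices/system/cpu).
disable_cstates() {
    local cpu_root="${1:-/sys/devices/system/cpu}"
    local f count=0
    for f in "$cpu_root"/cpu[0-9]*/cpuidle/state*/disable; do
        # Skip anything that is missing or not writable (e.g. not root).
        [ -w "$f" ] || continue
        echo 1 > "$f" && count=$((count + 1))
    done
    echo "disabled $count idle states"
}

# Example usage (as root on the server):
#   disable_cstates
```

Values written to sysfs do not survive a reboot, so the command has to be run again after each boot for the setting to stay in effect. Writing 0 to the same files re-enables the states.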