April 2, 2024

I have been experiencing numerous crashes and hangs of the system. I noticed recently that they appear to happen whenever I've got VMs and/or Docker enabled.

If a parity check is running and VMs are enabled, the parity check speed drops drastically (I'm aware this is expected behaviour), but I also then start seeing kernel panics in syslog, and the parity check gets stuck and cannot be paused, cancelled, or killed, forcing me to hard reset the system. If I disable VMs and Docker, the parity check will run to completion, and similarly, if a parity check is not running, VMs and Docker will run without issue. If I then turn on either Docker or VMs while a parity check is running, I seem to immediately trigger a kernel panic and the parity check crashes.

This just happened to me. I had a parity check running with VMs and Docker disabled and it was chugging along nicely for hours. I then turned on VMs to grab some config off one quickly (thinking if I'm quick enough it'd be OK), but it instantly triggered the issue. I was still able to interact with the UI and SSH, but I couldn't kill the parity check, so I downloaded diagnostics (attached) and hard reset the system.

I've never had an issue before with parity checks whilst Docker is enabled, and I don't want to have to turn off everything for a day to run my monthly parity checks.

Any help you can provide is greatly appreciated.

Here is the syslogged kernel panic:

Apr 2 14:06:13 Valhalla kernel: ------------[ cut here ]------------
Apr 2 14:06:13 Valhalla kernel: kernel BUG at drivers/md/unraid.c:1617!
Apr 2 14:06:13 Valhalla kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
Apr 2 14:06:13 Valhalla kernel: CPU: 8 PID: 14807 Comm: unraidd0 Tainted: P O 6.1.82-Unraid #1
Apr 2 14:06:13 Valhalla kernel: Hardware name: ASUS System Product Name/ProArt Z790-CREATOR WIFI, BIOS 2102 03/15/2024
Apr 2 14:06:13 Valhalla kernel: RIP: 0010:unraidd+0x1051/0x1140 [md_mod]
Apr 2 14:06:13 Valhalla kernel: Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 f3 72 a0 48 8b 73 20 e8 fb 56 13 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10
Apr 2 14:06:13 Valhalla kernel: RSP: 0018:ffffc9000b32fdf0 EFLAGS: 00010246
Apr 2 14:06:13 Valhalla kernel: RAX: 0000000000000000 RBX: ffff888182bfe4d8 RCX: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: RDX: 0000000000000000 RSI: ffffffff829ec720 RDI: ffff888101afe038
Apr 2 14:06:13 Valhalla kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88812c450108
Apr 2 14:06:13 Valhalla kernel: R13: ffff888182bfe620 R14: ffff888182bfe698 R15: ffff88816ee41218
Apr 2 14:06:13 Valhalla kernel: FS: 0000000000000000(0000) GS:ffff88a03f200000(0000) knlGS:0000000000000000
Apr 2 14:06:13 Valhalla kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 2 14:06:13 Valhalla kernel: CR2: 0000150729e808c8 CR3: 000000000420a000 CR4: 0000000000750ee0
Apr 2 14:06:13 Valhalla kernel: PKRU: 55555554
Apr 2 14:06:13 Valhalla kernel: Call Trace:
Apr 2 14:06:13 Valhalla kernel: <TASK>
Apr 2 14:06:13 Valhalla kernel: ? __die_body+0x1a/0x5c
Apr 2 14:06:13 Valhalla kernel: ? die+0x30/0x49
Apr 2 14:06:13 Valhalla kernel: ? do_trap+0x7b/0xfe
Apr 2 14:06:13 Valhalla kernel: ? unraidd+0x1051/0x1140 [md_mod]
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Apr 2 14:06:13 Valhalla kernel: ? do_error_trap+0x6e/0x98
Apr 2 14:06:13 Valhalla kernel: ? unraidd+0x1051/0x1140 [md_mod]
Apr 2 14:06:13 Valhalla kernel: ? exc_invalid_op+0x4c/0x60
Apr 2 14:06:13 Valhalla kernel: ? unraidd+0x1051/0x1140 [md_mod]
Apr 2 14:06:13 Valhalla kernel: ? asm_exc_invalid_op+0x16/0x20
Apr 2 14:06:13 Valhalla kernel: ? unraidd+0x1051/0x1140 [md_mod]
Apr 2 14:06:13 Valhalla kernel: md_thread+0xf4/0x122 [md_mod]
Apr 2 14:06:13 Valhalla kernel: ? _raw_spin_rq_lock_irqsave+0x20/0x20
Apr 2 14:06:13 Valhalla kernel: ? signal_pending+0x1d/0x1d [md_mod]
Apr 2 14:06:13 Valhalla kernel: kthread+0xe4/0xef
Apr 2 14:06:13 Valhalla kernel: ? kthread_complete_and_exit+0x1b/0x1b
Apr 2 14:06:13 Valhalla kernel: ret_from_fork+0x1f/0x30
Apr 2 14:06:13 Valhalla kernel: </TASK>
Apr 2 14:06:13 Valhalla kernel: Modules linked in: vhost_net tun vhost tap kvm_intel kvm xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_iotlb veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) nfsd auth_rpcgss oid_registry lockd grace sunrpc tcp_diag inet_diag nct6775 nct6775_core hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs af_packet 8021q garp mrp bridge stp llc bonding tls igc atlantic i915 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp iosf_mbi drm_buddy i2c_algo_bit ttm btusb btrtl btbcm drm_display_helper btintel drm_kms_helper bluetooth drm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel sr_mod
Apr 2 14:06:13 Valhalla kernel: crypto_simd intel_gtt cryptd rapl ecdh_generic input_leds intel_cstate mei_hdcp mei_pxp wmi_bmof cdrom joydev led_class ecc mpt3sas intel_uncore i2c_i801 thunderbolt agpgart nvme i2c_smbus ahci mei_me raid_class scsi_transport_sas i2c_core nvme_core libahci mei syscopyarea sysfillrect video vmd sysimgblt fb_sys_fops thermal fan tpm_crb tpm_tis tpm_tis_core wmi tpm backlight intel_pmc_core acpi_pad acpi_tad button unix [last unloaded: kvm]
Apr 2 14:06:13 Valhalla kernel: ---[ end trace 0000000000000000 ]---
Apr 2 14:06:13 Valhalla kernel: RIP: 0010:unraidd+0x1051/0x1140 [md_mod]
Apr 2 14:06:13 Valhalla kernel: Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 f3 72 a0 48 8b 73 20 e8 fb 56 13 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10
Apr 2 14:06:13 Valhalla kernel: RSP: 0018:ffffc9000b32fdf0 EFLAGS: 00010246
Apr 2 14:06:13 Valhalla kernel: RAX: 0000000000000000 RBX: ffff888182bfe4d8 RCX: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: RDX: 0000000000000000 RSI: ffffffff829ec720 RDI: ffff888101afe038
Apr 2 14:06:13 Valhalla kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88812c450108
Apr 2 14:06:13 Valhalla kernel: R13: ffff888182bfe620 R14: ffff888182bfe698 R15: ffff88816ee41218
Apr 2 14:06:13 Valhalla kernel: FS: 0000000000000000(0000) GS:ffff88a03f200000(0000) knlGS:0000000000000000
Apr 2 14:06:13 Valhalla kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 2 14:06:13 Valhalla kernel: CR2: 0000150729e808c8 CR3: 0000000286308000 CR4: 0000000000750ee0
Apr 2 14:06:13 Valhalla kernel: PKRU: 55555554
Apr 2 14:06:13 Valhalla kernel: ------------[ cut here ]------------
Apr 2 14:06:13 Valhalla kernel: WARNING: CPU: 8 PID: 14807 at kernel/exit.c:814 do_exit+0x87/0x923
Apr 2 14:06:13 Valhalla kernel: Modules linked in: vhost_net tun vhost tap kvm_intel kvm xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_iotlb veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) nfsd auth_rpcgss oid_registry lockd grace sunrpc tcp_diag inet_diag nct6775 nct6775_core hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs af_packet 8021q garp mrp bridge stp llc bonding tls igc atlantic i915 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp iosf_mbi drm_buddy i2c_algo_bit ttm btusb btrtl btbcm drm_display_helper btintel drm_kms_helper bluetooth drm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel sr_mod
Apr 2 14:06:13 Valhalla kernel: crypto_simd intel_gtt cryptd rapl ecdh_generic input_leds intel_cstate mei_hdcp mei_pxp wmi_bmof cdrom joydev led_class ecc mpt3sas intel_uncore i2c_i801 thunderbolt agpgart nvme i2c_smbus ahci mei_me raid_class scsi_transport_sas i2c_core nvme_core libahci mei syscopyarea sysfillrect video vmd sysimgblt fb_sys_fops thermal fan tpm_crb tpm_tis tpm_tis_core wmi tpm backlight intel_pmc_core acpi_pad acpi_tad button unix [last unloaded: kvm]
Apr 2 14:06:13 Valhalla kernel: CPU: 8 PID: 14807 Comm: unraidd0 Tainted: P D O 6.1.82-Unraid #1
Apr 2 14:06:13 Valhalla kernel: Hardware name: ASUS System Product Name/ProArt Z790-CREATOR WIFI, BIOS 2102 03/15/2024
Apr 2 14:06:13 Valhalla kernel: RIP: 0010:do_exit+0x87/0x923
Apr 2 14:06:13 Valhalla kernel: Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 69 0b 81 00 48 83 bb b0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 6b 0a 81 00 48 8b 83 d0 06 00 00 83
Apr 2 14:06:13 Valhalla kernel: RSP: 0018:ffffc9000b32fee0 EFLAGS: 00010286
Apr 2 14:06:13 Valhalla kernel: RAX: 0000000000000000 RBX: ffff88815bc4b000 RCX: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff
Apr 2 14:06:13 Valhalla kernel: RBP: 000000000000000b R08: 0000000000000000 R09: 0000000000aaaaaa
Apr 2 14:06:13 Valhalla kernel: R10: 0000000000000001 R11: 0000000000000001 R12: ffff88815751b000
Apr 2 14:06:13 Valhalla kernel: R13: ffff88816ee839c0 R14: 0000000000000002 R15: ffffffff820b2735
Apr 2 14:06:13 Valhalla kernel: FS: 0000000000000000(0000) GS:ffff88a03f200000(0000) knlGS:0000000000000000
Apr 2 14:06:13 Valhalla kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 2 14:06:13 Valhalla kernel: CR2: 0000150729e808c8 CR3: 0000000286308000 CR4: 0000000000750ee0
Apr 2 14:06:13 Valhalla kernel: PKRU: 55555554
Apr 2 14:06:13 Valhalla kernel: Call Trace:
Apr 2 14:06:13 Valhalla kernel: <TASK>
Apr 2 14:06:13 Valhalla kernel: ? __warn+0xab/0x122
Apr 2 14:06:13 Valhalla kernel: ? report_bug+0x109/0x17e
Apr 2 14:06:13 Valhalla kernel: ? do_exit+0x87/0x923
Apr 2 14:06:13 Valhalla kernel: ? handle_bug+0x41/0x6f
Apr 2 14:06:13 Valhalla kernel: ? exc_invalid_op+0x13/0x60
Apr 2 14:06:13 Valhalla kernel: ? asm_exc_invalid_op+0x16/0x20
Apr 2 14:06:13 Valhalla kernel: ? do_exit+0x87/0x923
Apr 2 14:06:13 Valhalla kernel: make_task_dead+0x11c/0x11c
Apr 2 14:06:13 Valhalla kernel: rewind_stack_and_make_dead+0x17/0x17
Apr 2 14:06:13 Valhalla kernel: RIP: 0000:0x0
Apr 2 14:06:13 Valhalla kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
Apr 2 14:06:13 Valhalla kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Apr 2 14:06:13 Valhalla kernel: </TASK>
Apr 2 14:06:13 Valhalla kernel: ---[ end trace 0000000000000000 ]---

valhalla-diagnostics-20240402-1425.zip
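For anyone wading through a similar dump, here is a minimal Python sketch that pulls the key events out of syslog text like the one above: the `kernel BUG at <file>:<line>!` and `WARNING: ... at <file>:<line>` entries, plus the taint letters on the `Tainted:` line. The function names (`split_entries`, `summarize`, `taint_flags`) are made up for illustration and are not part of any Unraid tool. It assumes the classic `Mon D HH:MM:SS host kernel:` prefix seen in this post, and it only decodes the three taint letters that appear here (meanings as given in the kernel's tainted-kernels documentation).

```python
import re

# Classic syslog prefix as seen in the post, e.g. "Apr 2 14:06:13 Valhalla kernel: "
PREFIX = re.compile(r"[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \S+ kernel: ")

# Only the taint letters that appear in this thread; the kernel defines more.
TAINT = {
    "P": "proprietary module was loaded",
    "D": "kernel died recently (oops or BUG)",
    "O": "out-of-tree module was loaded",
}

def split_entries(text):
    """Split syslog text (even if flattened onto one line) into messages."""
    return [" ".join(p.split()) for p in PREFIX.split(text) if p.strip()]

def summarize(text):
    """Return (kind, source_file, line_number) for each BUG/WARNING entry."""
    events = []
    for msg in split_entries(text):
        m = re.match(r"kernel BUG at (\S+?):(\d+)!", msg)
        if m:
            events.append(("BUG", m.group(1), int(m.group(2))))
            continue
        m = re.match(r"WARNING: CPU: \d+ PID: \d+ at (\S+?):(\d+)", msg)
        if m:
            events.append(("WARNING", m.group(1), int(m.group(2))))
    return events

def taint_flags(msg):
    """Decode the single-letter flags following 'Tainted:' in one message."""
    m = re.search(r"Tainted: ((?:[A-Z] +)+)", msg)
    if not m:
        return []
    return [(f, TAINT.get(f, "not decoded in this sketch")) for f in m.group(1).split()]

if __name__ == "__main__":
    sample = (
        "Apr 2 14:06:13 Valhalla kernel: kernel BUG at drivers/md/unraid.c:1617! "
        "Apr 2 14:06:13 Valhalla kernel: CPU: 8 PID: 14807 Comm: unraidd0 "
        "Tainted: P D O 6.1.82-Unraid #1 "
        "Apr 2 14:06:13 Valhalla kernel: WARNING: CPU: 8 PID: 14807 at "
        "kernel/exit.c:814 do_exit+0x87/0x923"
    )
    print(summarize(sample))
    for entry in split_entries(sample):
        print(taint_flags(entry))
```

Read this way, the dump above contains two events: the BUG in the Unraid md driver, then a follow-on WARNING while the crashed `unraidd0` thread exits. The second event's taint string gains a `D`, which only records that the first oops already happened.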
April 2, 2024 Community Expert

32 minutes ago, humaintenance said: [md_mod]

The Unraid driver is crashing. This is almost always a hardware issue, but if it was working before with a different release, downgrade and re-test; on rare occasions it can be a kernel compatibility issue.
April 2, 2024 Author

The downgrade available to me through the UI is 6.12.8, which also had the issue. I found 6.11.5 via https://docs.unraid.net/unraid-os/download_list/, so I'll use that to roll back and test.

Edited April 2, 2024 by humaintenance
April 7, 2024 Author Solution

I didn't roll back to an earlier release yet. Instead I did things like run memtest (parallel) for a day, all fine. I then saw 6.12.10 was released and that it was a kernel rollback. Since upgrading to .10 I've not had it crash on me, so potentially that was it, though I'm still cautious and monitoring it closely. Will report back if I get the issue again.

Edit, Thursday 11th: I'm fairly confident the issues I was having were related to the kernel update, as I've had no crash since updating to 6.12.10. I've run a successful parity sync/corrections, all while having the Docker engine on and dockers running. Gonna mark this as resolved.

Edited April 11, 2024 by humaintenance