Kernel BUG/Oops in bio_associate_blkg_from_css during Parity Sync/Data Rebuild - Unraid 7.2.4 - General Support

March 12

System crashes repeatedly during Parity Sync/Data Rebuild operations. The crash occurs consistently after several hours of rebuild operation. System becomes completely unresponsive and requires a hard power cycle. The array does not auto-start after reboot (unclean shutdown detected).

I was able to capture the following kernel oops via remote syslog:

kernel: BUG: unable to handle page fault for address: 0000000000001018

kernel: #PF: supervisor read access in kernel mode

kernel: #PF: error_code(0x0000) - not-present page

kernel: PGD 0 P4D 0

kernel: Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI

kernel: CPU: 10 UID: 0 PID: 11559 Comm: unraidd0 Tainted: P O 6.12.54-Unraid #1

kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE

kernel: Hardware name: ASRock Z790 PG Riptide/Z790 PG Riptide, BIOS 20.01 09/24/2025

kernel: RIP: 0010:bio_associate_blkg_from_css+0x148/0x190

kernel: Code: 8b 3c 24 e8 aa 6a 5a 00 eb 04 4d 8b 76 30 4d 85 f6 74 0c 4c 89 f7 e8 37 f4 ff ff 84 c0 74 eb e8 3e 57 b4 ff eb 27 48 8b 45 08 <48> 8b 40 18 48 8b b8 e0 01 00 00 48 83 c7 38 e8 74 f4 ff ff 48 8b

kernel: RSP: 0018:ffffc90003eabcf8 EFLAGS: 00010246

kernel: RAX: 0000000000001000 RBX: ffffffff83063fe0 RCX: 0000000000000000

kernel: RDX: ffff8881174be480 RSI: ffffffff83063fe0 RDI: 0000000000000000

kernel: RBP: ffff8881ecc74048 R08: 0000000000000000 R09: 0000000000000000

kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001

kernel: R13: ffff8881ecc739a8 R14: ffff888106809860 R15: ffff8881ecc740f8

kernel: FS: 0000000000000000(0000) GS:ffff88a00f280000(0000) knlGS:0000000000000000

kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033

kernel: CR2: 0000000000001018 CR3: 0000000005618002 CR4: 0000000000772ef0

kernel: PKRU: 55555554

kernel: Call Trace:

kernel: <TASK>

kernel: ? submit_bio_noacct_nocheck+0x152/0x2e0

kernel: bio_associate_blkg+0x3b/0x50

kernel: bio_init+0x5b/0xa0

kernel: unraidd+0x1257/0x13d0 [md_mod]

kernel: ? preempt_latency_start+0x2b/0x50

kernel: ? md_thread+0xf1/0x120 [md_mod]

kernel: ? kthread_should_park+0x12/0x30

kernel: md_thread+0xf1/0x120 [md_mod]

kernel: ? __pfx_autoremove_wake_function+0x10/0x10

kernel: ? __pfx_md_thread+0x10/0x10 [md_mod]

kernel: kthread+0xec/0x100

kernel: ? __pfx_kthread+0x10/0x10

kernel: ret_from_fork+0x21/0x40

kernel: ? __pfx_kthread+0x10/0x10

kernel: ret_from_fork_asm+0x1a/0x30

kernel: </TASK>

Troubleshooting performed:

Memtest86 completed multiple passes with zero errors
All SMART checks passed on all drives
Updated Samsung 990 PRO NVMe firmware to latest
Disabled Intel VMD in BIOS
Adjusted CPU voltage settings (Vcore Compensation, LLC, voltage offset)
Reduced CPU P-Core and E-Core ratios
All temperatures well within spec (CPU 34-41°C, NVMes 38-47°C)
XMP not active, RAM running at JEDEC DDR5-3600
Crash captured via remote syslog (Python UDP listener on separate machine)
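For anyone who wants to capture an oops the same way, a remote syslog listener can be sketched in a few lines of Python. This is a minimal sketch and not the script used above: the function names, the log filename and the flush-per-message choice are all assumptions. Port 514 is the conventional syslog UDP port and usually needs root to bind on Linux, so a higher port can be substituted if the sender is pointed at it.

```python
import socket
from datetime import datetime, timezone


def open_listener(host: str = "0.0.0.0", port: int = 514) -> socket.socket:
    """Bind a UDP socket for incoming syslog datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one(sock: socket.socket, bufsize: int = 65535) -> str:
    """Block until one datagram arrives and return it as a single log line."""
    data, (sender, _port) = sock.recvfrom(bufsize)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    text = data.decode("utf-8", errors="replace").rstrip()
    return f"{stamp} {sender} {text}"


def run(logfile: str = "unraid-syslog.log", port: int = 514) -> None:
    """Append every received message to logfile (hypothetical filename)."""
    with open_listener(port=port) as sock, open(logfile, "a", encoding="utf-8") as out:
        while True:
            out.write(receive_one(sock) + "\n")
            # Flush per message so a partial trace survives if the sender hangs.
            out.flush()
```

Calling `run()` at the bottom of the script starts the loop. Flushing after every line matters here because the sending machine may lock up partway through the trace.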

Configuration:

1 parity drive (18TB), 13 data drives (8TB-18TB mix), 6 cache pools (mix of NVMe and SATA SSD)
60+ Docker containers running
Crash occurs consistently during Parity Sync/Data Rebuild, typically after 4-8 hours of operation
The unraidd0 thread faults on a near-null pointer dereference (address 0x1018) in bio_associate_blkg_from_css, reached from unraidd via bio_init while setting up block I/O
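The faulting address can be cross-checked against the register dump. The bytes at the `<48>` marker in the Code: line are `48 8b 40 18`, which decodes as `mov rax, [rax + 0x18]`, so the address read should equal RAX plus 0x18. The sketch below does that arithmetic from the oops text. The helper names are hypothetical, and it only handles this one instruction form.

```python
import re


def parse_registers(oops_text):
    """Collect 'NAME: 16-hex-digit' pairs such as RAX, RBP and CR2."""
    pattern = r"\b(R[A-Z0-9]{1,2}|CR[0-9]):\s*([0-9a-f]{16})\b"
    return {name: int(value, 16) for name, value in re.findall(pattern, oops_text)}


def faulting_bytes(oops_text):
    """Return the bytes of the Code: line starting at the <xx> marker."""
    code = re.search(r"Code:\s*(.*)", oops_text).group(1)
    tail = code[code.index("<"):].replace("<", "").replace(">", "")
    return bytes.fromhex(tail.replace(" ", ""))


def explain_fault(oops_text):
    """Return (RAX, displacement, RAX + displacement, CR2)."""
    regs = parse_registers(oops_text)
    insn = faulting_bytes(oops_text)[:4]
    # 48 = REX.W, 8b = mov r64, r/m64, ModRM 40 = [rax + disp8] into rax
    if insn[:3] != bytes.fromhex("488b40"):
        raise ValueError("sketch only handles mov rax, [rax + disp8]")
    disp = insn[3] - 256 if insn[3] >= 128 else insn[3]  # disp8 is signed
    return regs["RAX"], disp, regs["RAX"] + disp, regs["CR2"]
```

With the trace above, RAX is 0x1000 and the displacement is 0x18, giving 0x1018, which matches CR2. In other words the kernel followed a pointer whose value was 0x1000, which is not a valid kernel address. That is consistent with corrupted data reaching the driver, though it does not prove where the corruption came from.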

Additional context:

The crash was also preceded by an Unmanic segfault earlier in the same session, though that appeared unrelated
The system continued running after the kernel oops but the parity rebuild thread died, eventually leading to system unresponsiveness

Has anyone else encountered this crash signature during parity rebuilds on kernel 6.12.54?


March 12

1 hour ago, rhodesjo said:
[md_mod]

Unraid driver is crashing; this is almost always a hardware issue. Which CPU do you have?


March 12

Author

1 hour ago, JorgeB said:
Unraid driver is crashing; this is almost always a hardware issue. Which CPU do you have?

i9-14900K, but it has been on the approved microcode since day 1, about a year ago. I tried very hard to troubleshoot and rule out CPU degradation.


March 12

1 hour ago, rhodesjo said:
i9-14900K

I bet this is the problem. There have been dozens of confirmed cases with this and similar CPUs, especially the 13700K, 14700K, 13900K and 14900K.

Some users are on their 3rd and 4th CPU.


March 12

Author

25 minutes ago, JorgeB said:
I bet this is the problem. There have been dozens of confirmed cases with this and similar CPUs, especially the 13700K, 14700K, 13900K and 14900K.
Some users are on their 3rd and 4th CPU.

Thank you for the response. After further testing I've found that core 20 appears to be degraded: I've had 4 segfaults from different processes (Unmanic, Authentik/Celery, generic Python), all on core 20. I've now isolated it with isolcpus and am testing stability. I also had crashes with parity paused, which further points to the CPU rather than the md driver alone.
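One way to check whether segfaults cluster on a single core is to tally them from the syslog. Recent x86 kernels append "likely on CPU N (core C, socket S)" to each userspace segfault line, and the sketch below counts those per CPU. The function name and regular expression are assumptions for illustration. The regex expects the process name to contain no spaces, and older kernels that omit the suffix will simply produce no matches.

```python
import re
from collections import Counter

# Matches kernel lines of the form:
#   name[pid]: segfault at ... likely on CPU N (core C, socket S)
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at .*?likely on CPU (?P<cpu>\d+)"
)


def segfaults_by_cpu(lines):
    """Return (count per CPU, set of process names per CPU)."""
    counts = Counter()
    procs = {}
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            cpu = int(match["cpu"])
            counts[cpu] += 1
            procs.setdefault(cpu, set()).add(match["comm"])
    return counts, procs
```

Passing an open syslog file to `segfaults_by_cpu` and then calling `counts.most_common()` lists the CPUs with the most faults first. Several unrelated processes faulting on the same CPU number is the pattern described above. The CPU number reported is the logical CPU, which is the numbering that isolcpus also uses.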

I'll be filing an Intel RMA. Microcode 0x132 was applied at install about a year ago, so hopefully the degradation is limited. Appreciate the confirmation.
