UNRAID instability on N6005 since 6.12.* (Crashes every 48-72 hours)



Hey,

Since the start of the year I've been running an N6005 NUC with 4 x I226-V 2.5G network ports as an UNRAID device that hosts my Home Assistant, Pi-hole, and, mainly, my firewall, with no issues.

 

Since updating to 6.12.*, UNRAID has been crashing weekly; recently it has been every couple of days.

 

I've noticed some errors in the system log. ACPI is disabled in the BIOS.

 

Sep  4 22:13:31 UnraidRouter kernel: x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
Sep  4 22:13:31 UnraidRouter kernel: ACPI: Early table checksum verification disabled
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS03._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS04._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.SS01._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.SS02._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: xor: measuring software checksum speed
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS03._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS03._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS04._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS04._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.SS01._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.SS01._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.SS02._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.SS02._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS03._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)
Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS03._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Sep  4 22:13:31 UnraidRouter kernel: floppy0: no floppy controllers found
Sep  4 22:13:31 UnraidRouter kernel: tpm_crb: probe of MSFT0101:00 failed with error 378
Sep  4 22:14:35 UnraidRouter mcelog: failed to prefill DIMM database from DMI data
Sep  4 22:15:21 UnraidRouter kernel: BTRFS info (device loop2): using crc32c (crc32c-intel) checksum algorithm
Sep  4 22:15:25 UnraidRouter kernel: BTRFS info (device loop3): using crc32c (crc32c-intel) checksum algorithm
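
Most of the excerpt above is the same handful of ACPI messages repeated. A minimal sketch for collapsing such an excerpt into one line per distinct kernel message follows. The function name `summarize_kernel_messages` and the prefix pattern are assumptions written for the log format shown here, not part of any Unraid tooling; the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Assumed prefix format, taken from the excerpt above:
# "Sep  4 22:13:31 UnraidRouter kernel: <message>"
PREFIX = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+kernel:\s+(.*)$")


def summarize_kernel_messages(lines):
    """Count identical kernel messages, ignoring timestamp and hostname."""
    counts = Counter()
    for line in lines:
        match = PREFIX.match(line.strip())
        if match:  # non-kernel lines (e.g. mcelog) are skipped
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the log above; raw strings keep the backslashes.
    sample = [
        r"Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)",
        r"Sep  4 22:13:31 UnraidRouter kernel: ACPI Error: Aborting method \_SB.PC00.XHCI.RHUB.HS03._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)",
        r"Sep  4 22:13:31 UnraidRouter kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20220331/psargs-330)",
        r"Sep  4 22:13:31 UnraidRouter kernel: floppy0: no floppy controllers found",
        r"Sep  4 22:14:35 UnraidRouter mcelog: failed to prefill DIMM database from DMI data",
    ]
    for message, count in summarize_kernel_messages(sample).most_common():
        print(f"{count:3d}x {message}")
```

Run against the full syslog, this would show at a glance whether anything other than the repeated ACPI `_PLD` errors was logged at boot.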

 

These are the random kernel panics I was getting from within the VMs:

 

[fib_algo] inet.0 (bsearch4#28) rebuild_fd_flm: switching algo to radix4_lockless
kernel trap 1 with interrupts disabled


Fatal trap 1: privileged instruction fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer	= 0x20:0xffffffff81224720
stack pointer	        = 0x28:0xfffffe001079d458
frame pointer	        = 0x28:0xfffffe001079d540
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= resume, IOPL = 0
current process		= 0 (if_io_tqg_1)
trap number		= 1
panic: privileged instruction fault
cpuid = 1
time = 1692981925
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001079d270
vpanic() at vpanic+0x151/frame 0xfffffe001079d2c0
panic() at panic+0x43/frame 0xfffffe001079d320
trap_fatal() at trap_fatal+0x387/frame 0xfffffe001079d380
calltrap() at calltrap+0x8/frame 0xfffffe001079d380
--- trap 0x1, rip = 0xffffffff81224720, rsp = 0xfffffe001079d458, rbp = 0xfffffe001079d540 ---
lapic_handle_intr() at lapic_handle_intr/frame 0xfffffe001079d540
ng_pppoe_rcvdata() at ng_pppoe_rcvdata+0x339/frame 0xfffffe001079d5d0
ng_apply_item() at ng_apply_item+0x2bf/frame 0xfffffe001079d660
ng_snd_item() at ng_snd_item+0x28e/frame 0xfffffe001079d6a0
ng_apply_item() at ng_apply_item+0x2bf/frame 0xfffffe001079d730
ng_snd_item() at ng_snd_item+0x28e/frame 0xfffffe001079d770
ng_ppp_link_xmit() at ng_ppp_link_xmit+0x124/frame 0xfffffe001079d7c0
ng_apply_item() at ng_apply_item+0x2bf/frame 0xfffffe001079d850
ng_snd_item() at ng_snd_item+0x28e/frame 0xfffffe001079d890
ng_apply_item() at ng_apply_item+0x2bf/frame 0xfffffe001079d920
ng_snd_item() at ng_snd_item+0x28e/frame 0xfffffe001079d960
ng_iface_send() at ng_iface_send+0xdf/frame 0xfffffe001079d9e0
ng_iface_output() at ng_iface_output+0xe3/frame 0xfffffe001079da20
ip_tryforward() at ip_tryforward+0x4f7/frame 0xfffffe001079dae0
ip_input() at ip_input+0x724/frame 0xfffffe001079db70
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe001079dbc0
ether_demux() at ether_demux+0x159/frame 0xfffffe001079dbf0
ether_nh_input() at ether_nh_input+0x36b/frame 0xfffffe001079dc50
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe001079dca0
ether_input() at ether_input+0x69/frame 0xfffffe001079dd00
iflib_rxeof() at iflib_rxeof+0xbcb/frame 0xfffffe001079de00
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe001079de40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x15d/frame 0xfffffe001079dec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc3/frame 0xfffffe001079def0
fork_exit() at fork_exit+0x7e/frame 0xfffffe001079df30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001079df30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
panic.txt: privileged instruction fault
version.txt: FreeBSD 13.2-RELEASE-p2 stable/23.7-n254761-4b4f06e3731 SMP

 

Fatal double fault
rip 0xffffffff81115a76 rsp 0xfffffe00107a9dd0 rbp 0xfffffe00107a9dd0
rax 0x1063525043c96 rdx 0x1063500000000 rbx 0xfffff80001a60000
rcx 0 rsi 0 rdi 0xfffffe00107a9e88
r8 0xfad9c0 r9 0x80000000 r10 0xffffffff
r11 0x1 r12 0xfffff80001a60028 r13 0
r14 0x1063525043c96 r15 0x2f rflags 0x10246
cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
fsbase 0x35ad7a465120 gsbase 0xffffffff82c11000 kgsbase 0
cpuid = 1; apic id = 01
timeout stopping cpus
panic: double fault
cpuid = 1
time = 1693126628
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0011e4edb0
vpanic() at vpanic+0x151/frame 0xfffffe0011e4ee00
panic() at panic+0x43/frame 0xfffffe0011e4ee60
dblfault_handler() at dblfault_handler+0x1ce/frame 0xfffffe0011e4ef20
Xdblfault() at Xdblfault+0xd7/frame 0xfffffe0011e4ef20
--- trap 0x17, rip = 0xffffffff81115a76, rsp = 0xfffffe00107a9dd0, rbp = 0xfffffe00107a9dd0 ---
acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfffffe00107a9dd0
acpi_cpu_idle() at acpi_cpu_idle+0x2ef/frame 0xfffffe00107a9e10
cpu_idle_acpi() at cpu_idle_acpi+0x48/frame 0xfffffe00107a9e30
cpu_idle() at cpu_idle+0x9f/frame 0xfffffe00107a9e50
sched_idletd() at sched_idletd+0x4e1/frame 0xfffffe00107a9ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00107a9f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00107a9f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
panic.txt: double fault
version.txt: FreeBSD 13.2-RELEASE-p2 stable/23.7-n254761-4b4f06e3731 SMP

 

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address	= 0x0
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff8122470c
stack pointer	        = 0x28:0xfffffe001079d420
frame pointer	        = 0x28:0xfffffe001079d600
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= resume, IOPL = 0
current process		= 0 (if_io_tqg_1)
trap number		= 12
panic: page fault
cpuid = 1
time = 1693246635
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001079d1e0
vpanic() at vpanic+0x151/frame 0xfffffe001079d230
panic() at panic+0x43/frame 0xfffffe001079d290
trap_fatal() at trap_fatal+0x387/frame 0xfffffe001079d2f0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe001079d350
calltrap() at calltrap+0x8/frame 0xfffffe001079d350
--- trap 0xc, rip = 0xffffffff8122470c, rsp = 0xfffffe001079d420, rbp = 0xfffffe001079d600 ---
native_lapic_set_lvt_triggermode() at native_lapic_set_lvt_triggermode+0xdc/frame 0xfffffe001079d600
calltrap() at calltrap+0x8/frame 0xfffffe001079d600
--- trap 0x1, rip = 0xffffffff81224720, rsp = 0xfffffe001079d6d8, rbp = 0xfffffe001079d7e0 ---
lapic_handle_intr() at lapic_handle_intr/frame 0xfffffe001079d7e0
pf_test_state_udp() at pf_test_state_udp+0x130/frame 0xfffffe001079d850
pf_test() at pf_test+0xc57/frame 0xfffffe001079d9c0
pf_check_in() at pf_check_in+0x25/frame 0xfffffe001079d9e0
pfil_run_hooks() at pfil_run_hooks+0x97/frame 0xfffffe001079da20
ip_tryforward() at ip_tryforward+0x181/frame 0xfffffe001079dae0
ip_input() at ip_input+0x724/frame 0xfffffe001079db70
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe001079dbc0
ether_demux() at ether_demux+0x159/frame 0xfffffe001079dbf0
ether_nh_input() at ether_nh_input+0x36b/frame 0xfffffe001079dc50
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe001079dca0
ether_input() at ether_input+0x69/frame 0xfffffe001079dd00
iflib_rxeof() at iflib_rxeof+0xbcb/frame 0xfffffe001079de00
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe001079de40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x15d/frame 0xfffffe001079dec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc3/frame 0xfffffe001079def0
fork_exit() at fork_exit+0x7e/frame 0xfffffe001079df30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001079df30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
panic.txt: page fault
version.txt: FreeBSD 13.2-RELEASE-p2 stable/23.7-n254761-4b4f06e3731 SMP
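
Each dump above carries a `time =` value in seconds since the Unix epoch. A minimal sketch for turning those into readable UTC timestamps and measuring the gap between consecutive panics follows. The helper names `to_utc` and `hours_between` are made up for illustration; the three numbers are copied from the dumps.

```python
from datetime import datetime, timezone

# "time =" values copied from the three panic dumps above.
PANIC_TIMES = {
    "privileged instruction fault": 1692981925,
    "double fault": 1693126628,
    "page fault": 1693246635,
}


def to_utc(epoch_seconds):
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def hours_between(times):
    """Return the gap in hours between each pair of consecutive timestamps."""
    ordered = sorted(times)
    return [(later - earlier) / 3600 for earlier, later in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    for label, epoch in PANIC_TIMES.items():
        print(f"{to_utc(epoch):%Y-%m-%d %H:%M:%S} UTC  {label}")
    for gap in hours_between(PANIC_TIMES.values()):
        print(f"gap between panics: {gap:.1f} h")
```

The three panics fall on 25, 27 and 28 August 2023 (UTC), roughly 40 and 33 hours apart, which can be lined up against the host syslog for the same period.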

 

Does this look like a hardware issue or is this some random bug that has appeared since the updates?

Many thanks!

unraidrouter-diagnostics-20230905-1655.zip

Edited by kamikazedan
Updated explanation of issue
Link to comment
13 hours ago, kamikazedan said:

I raised an issue with OPNsense, who mentioned it could be UNRAID, FreeBSD, or hardware causing this issue.

I hope you told them you were running "PO"sense in a VM?

 

You can't properly run OPNsense or pfSense in a VM unless it's your home lab and then you would/should know why/when it doesn't work.

 

I run pfSense on a similar box, except... I only use that box as a bare metal router so that nothing gets to my Unraid box.

 

Clear as mud? 🙂

 

MrGrey.

Link to comment
11 hours ago, MrGrey said:

I hope you told them you were running "PO"sense in a VM?

 

You can't properly run OPNsense or pfSense in a VM unless it's your home lab and then you would/should know why/when it doesn't work.

 

I run pfSense on a similar box, except... I only use that box as a bare metal router so that nothing gets to my Unraid box.

 

Clear as mud? 🙂

 

MrGrey.


It's a home lab, and I've been running the firewall in a VM for years across different types of hardware, as many people do.

This is an UNRAID box purely for virtualisation of the firewall and DNS; no important data is stored on it.

I was clear that it was running in a VM.

Is your comment supposed to help in any way or is this a telling off for doing something wrong?

 

If you have any suggestions that could help, it would be appreciated.

Edited by kamikazedan
Link to comment

I've been running on a VM for 3 years now and have had ZERO problems... Would I implement it in a VM for a client? No, I make them buy Netgate appliances... Would I use it in my own business? Yup.

 

My first thought is to take a GOOD LOOK at the System Devices, and then downgrade to a known good version of Unraid. Take another look at the System Devices and see if there is a difference when it comes to the network cards... Maybe a new driver conflicting with the VM?
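
One way to make that comparison less error-prone is to save the device listing as text on each version and diff the two. A minimal sketch, assuming the listings have already been saved as lists of lines: the function name `diff_device_listings` and the sample lines are placeholders for illustration, not real output from this box.

```python
import difflib


def diff_device_listings(before_lines, after_lines):
    """Return unified-diff lines between two saved device listings."""
    return list(
        difflib.unified_diff(
            before_lines,
            after_lines,
            fromfile="known-good",
            tofile="current",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Placeholder listings; replace with the text saved from each Unraid version.
    known_good = [
        "02:00.0 Ethernet controller: example NIC",
        "    Kernel driver in use: driver-a",
    ]
    current = [
        "02:00.0 Ethernet controller: example NIC",
        "    Kernel driver in use: driver-b",
    ]
    for line in diff_device_listings(known_good, current):
        print(line)
```

An empty result means the two listings are identical, so any line printed with a leading `-` or `+` is a device or driver binding that changed between versions.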

 

Maybe try swapping out any gear you can... Simplify the config and see if the problem goes away? If you can't change the hardware maybe shut the pihole down for a little while.

 

As for MrGrey... have you never tried something new? Pushed a boundary on what you thought was possible and needed a little support? I learn by breaking things every day, and there are people on here that know so much about this stuff... People on this board have answered problems of mine using words I barely recognized, and I've been in this industry (successfully) for 30 years.

Link to comment
On 9/7/2023 at 8:56 PM, Arbadacarba said:

Have you installed the Guest agent?

OR switching AWAY from Spice?

 

image.png.b280ba76a8dfacecf16d73997f1c8291.png


The guest agent is installed now; I had not installed it before as it was never an issue.

 

So it appears I was totally wrong about just the VMs crashing; the whole OS is crashing.

So either something is totally broken in UNRAID or there's a hardware issue.

unraidrouter-diagnostics-20230910-1632.zip

Edited by kamikazedan
Link to comment
  • kamikazedan changed the title to UNRAID instability on N6005 since 6.12.* (Crashes every 48-72 hours)
