
6.12.8 Kernel panic when starting array



Out of the blue, my server stopped working after a few weeks of uptime. I could not access the GUI, so I rebooted the server.

The server seems to boot fine, but when I try to start the array, things go wrong.

I ran a 4-hour memtest (the latest from memtest.org, not the one included on the Unraid USB) with 0 errors.

I disabled Docker and VMs to make sure they are not causing the issue.

 

Since I cannot generate a diagnostics file once the array starts, perhaps the syslog excerpt below will help diagnose the issue?
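In case it helps anyone else: when the GUI diagnostics won't complete, the raw syslog can still be copied to the flash drive from a terminal so it survives a reboot. A minimal sketch, assuming the default Unraid log location and the flash drive mounted at /boot:

# copy the live syslog to the flash drive
cp /var/log/syslog /boot/syslog-$(date +%Y%m%d).txt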

 

Aug  4 20:48:18 SERVER1 emhttpd: mounting /mnt/cache
Aug  4 20:48:18 SERVER1 emhttpd: shcmd (568): mkdir -p /mnt/cache
Aug  4 20:48:18 SERVER1 emhttpd: shcmd (569): /usr/sbin/zpool import -f -N -o autoexpand=on  -d /dev/nvme1n1p1 -d /dev/nvme2n1p1 1547749684351647778 cache
Aug  4 20:48:18 SERVER1 kernel: VERIFY3(size <= rt->rt_space) failed (281442900058112 <= 8586334208)
Aug  4 20:48:18 SERVER1 kernel: PANIC at range_tree.c:436:range_tree_remove_impl()
Aug  4 20:48:18 SERVER1 kernel: Showing stack for process 7359
Aug  4 20:48:18 SERVER1 kernel: CPU: 6 PID: 7359 Comm: zpool Tainted: P           O       6.1.74-Unraid #1
Aug  4 20:48:18 SERVER1 kernel: Hardware name: Gigabyte Technology Co., Ltd. Z690 AORUS MASTER/Z690 AORUS MASTER, BIOS F8 08/08/2022
Aug  4 20:48:18 SERVER1 kernel: Call Trace:
Aug  4 20:48:18 SERVER1 kernel: <TASK>
Aug  4 20:48:18 SERVER1 kernel: dump_stack_lvl+0x44/0x5c
Aug  4 20:48:18 SERVER1 kernel: spl_panic+0xd0/0xe8 [spl]
Aug  4 20:48:18 SERVER1 kernel: ? slab_free_freelist_hook.constprop.0+0x3b/0xaf
Aug  4 20:48:18 SERVER1 kernel: ? bt_grow_leaf+0xc3/0xd6 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? bt_grow_leaf+0xc3/0xd6 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? zfs_btree_find_in_buf+0x4c/0x94 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? zfs_btree_find+0x16d/0x1b0 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? rs_get_start+0xc/0x1d [zfs]
Aug  4 20:48:18 SERVER1 kernel: range_tree_remove_impl+0x77/0x406 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? zio_wait+0x1ee/0x1fd [zfs]
Aug  4 20:48:18 SERVER1 kernel: space_map_load_callback+0x70/0x79 [zfs]
Aug  4 20:48:18 SERVER1 kernel: space_map_iterate+0x2d3/0x324 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? spa_stats_destroy+0x16c/0x16c [zfs]
Aug  4 20:48:18 SERVER1 kernel: space_map_load_length+0x93/0xcb [zfs]
Aug  4 20:48:18 SERVER1 kernel: metaslab_load+0x33b/0x6e3 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? zap_lookup_impl+0x1b0/0x1da [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? zap_lookup_impl+0x1b0/0x1da [zfs]
Aug  4 20:48:18 SERVER1 kernel: vdev_trim_calculate_progress+0x12e/0x217 [zfs]
Aug  4 20:48:18 SERVER1 kernel: vdev_trim_load+0x13d/0x148 [zfs]
Aug  4 20:48:18 SERVER1 kernel: vdev_trim_restart+0x144/0x1f0 [zfs]
Aug  4 20:48:18 SERVER1 kernel: vdev_trim_restart+0x1cc/0x1f0 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? preempt_latency_start+0x2b/0x46
Aug  4 20:48:18 SERVER1 kernel: vdev_trim_restart+0x1cc/0x1f0 [zfs]
Aug  4 20:48:18 SERVER1 kernel: spa_load+0xfcc/0x1110 [zfs]
Aug  4 20:48:18 SERVER1 kernel: spa_load_best+0x61/0x267 [zfs]
Aug  4 20:48:18 SERVER1 kernel: spa_import+0x282/0x5ac [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? get_nvlist+0xe8/0x119 [zfs]
Aug  4 20:48:18 SERVER1 kernel: zfs_ioc_pool_import+0xea/0x143 [zfs]
Aug  4 20:48:18 SERVER1 kernel: zfsdev_ioctl_common+0x68f/0x726 [zfs]
Aug  4 20:48:18 SERVER1 kernel: ? mod_lruvec_page_state.constprop.0+0x1c/0x2e
Aug  4 20:48:18 SERVER1 kernel: ? __kmalloc_large_node+0xd6/0xfb
Aug  4 20:48:18 SERVER1 kernel: ? __kmalloc_node+0x5e/0xb1
Aug  4 20:48:18 SERVER1 kernel: zfsdev_ioctl+0x5b/0xb4 [zfs]
Aug  4 20:48:18 SERVER1 kernel: vfs_ioctl+0x1b/0x2f
Aug  4 20:48:18 SERVER1 kernel: __do_sys_ioctl+0x52/0x78
Aug  4 20:48:18 SERVER1 kernel: do_syscall_64+0x68/0x81
Aug  4 20:48:18 SERVER1 kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
Aug  4 20:48:18 SERVER1 kernel: RIP: 0033:0x14582ea344e8
Aug  4 20:48:18 SERVER1 kernel: Code: 00 00 48 8d 44 24 08 48 89 54 24 e0 48 89 44 24 c0 48 8d 44 24 d0 48 89 44 24 c8 b8 10 00 00 00 c7 44 24 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 07 89 d0 c3 0f 1f 40 00 48 8b 15 f9 e8 0d
Aug  4 20:48:18 SERVER1 kernel: RSP: 002b:00007ffe48bbcb18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Aug  4 20:48:18 SERVER1 kernel: RAX: ffffffffffffffda RBX: 0000000000433320 RCX: 000014582ea344e8
Aug  4 20:48:18 SERVER1 kernel: RDX: 00007ffe48bbd480 RSI: 0000000000005a02 RDI: 0000000000000004
Aug  4 20:48:18 SERVER1 kernel: RBP: 00007ffe48bc0a60 R08: 000014582eb14490 R09: 000014582eb14490
Aug  4 20:48:18 SERVER1 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe48bbd480
Aug  4 20:48:18 SERVER1 kernel: R13: 000000000043b1b0 R14: 0000000000435930 R15: 0000000000000000
Aug  4 20:48:18 SERVER1 kernel: </TASK>

 

 

 

 

Solution (posted by JorgeB):

ZFS is panicking. See if the pool mounts read-only:

 

zpool import -o readonly=on /mnt/cache

 

If it works, start the array. The GUI will still show the pool as unmountable, but the data should be under /mnt/cache. Then back up and re-format the pool.


Initially, I couldn't do the read-only import as specified. When I changed /mnt/cache into just cache, it worked.
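For completeness, the exact command that ended up working, importing the pool by its name (cache) rather than the mountpoint:

zpool import -o readonly=on cache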

 

Then I proceeded as advised: started the array, copied the files off the cache, erased the cache drives, and re-formatted. After a reboot, I copied all files back to the new cache pool and enabled Docker again. Everything works as before.
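Roughly what the copy-off step looked like; a sketch only, the destination path is just an example and should point at a share or disk on the array with enough free space:

# copy everything off the read-only cache pool to the array
rsync -avh --progress /mnt/cache/ /mnt/disk1/cache_backup/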

 

Many, many thanks for this great advice and lightning-fast support!

 

Now that everything is up and running again, I'm wondering how this could happen. Is it most likely a hardware issue, a misbehaving Docker container, or just a ZFS quirk that could happen to anyone at any time?

As attached, the 2x WD RED 1TB NVMe drives are fairly young, with only 9.78 TB written, and I don't see any issues with them.

 

[Attached: SMART report screenshot for the two NVMe cache drives]
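For anyone who prefers to check the drives from the command line rather than the GUI screenshot, smartctl is included with Unraid; the device names below match the ones from the import command in the syslog:

smartctl -a /dev/nvme1n1
smartctl -a /dev/nvme2n1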

 

 

Otherwise, I'll just knock on wood that this won't happen again any time soon ;)

 

 
