Kernel BUG encountered while using Unraid



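I have only recently started using Unraid. The Unraid host keeps dropping off the network: neither the Unraid host itself nor the virtual machines running on it (for example the soft-router VM) can be reached. The log file shows the following errors: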

Oct  6 03:40:02 Tower crond[1853]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Oct  6 03:47:55 Tower kernel: __pte_list_remove: 00000000c9cfb92d 0->BUG
Oct  6 03:47:55 Tower kernel: ------------[ cut here ]------------
Oct  6 03:47:55 Tower kernel: kernel BUG at arch/x86/kvm/mmu/mmu.c:907!
Oct  6 03:47:55 Tower kernel: invalid opcode: 0000 [#1] SMP NOPTI
Oct  6 03:47:55 Tower kernel: CPU: 3 PID: 5144 Comm: kvm-nx-lpage-re Tainted: G     U            5.10.21-Unraid #1
Oct  6 03:47:55 Tower kernel: Hardware name: ASUS System Product Name/TUF GAMING B460M-PLUS (WI-FI), BIOS 0708 07/23/2020
Oct  6 03:47:55 Tower kernel: RIP: 0010:__pte_list_remove+0x21/0xf1 [kvm]

 

I don't know whether this error is what caused the hang, and I don't understand it myself, so I'm asking here for help.

 

Edited by ccttunraid
  • 2 years later...
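2 hours ago, ccmspringming said:

I ran into this problem too and found no fix.

The web UI is extremely sluggish.

Several processes belonging to one VM sit at 100% CPU and cannot be killed, even with kill -9.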

 

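If you'd like help diagnosing the problem, upload your diagnostics, enable the syslog server so that logs are saved, and post the log from the time the problem occurs. For the steps, see:

You can also boot Unraid in safe mode to see whether the problem still occurs.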
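
Once a syslog has been saved, a quick scan for kernel fault markers can narrow down which lines are worth posting. The following is only a minimal sketch: `find_kernel_faults` is a hypothetical helper name, and the marker list is an assumption based on the log lines seen in this thread, not an exhaustive set. Python is not part of a stock Unraid install, so this is meant to run on any machine with Python 3 that has a copy of the log.

```python
import re

# Markers taken from the logs in this thread; an assumption, not an exhaustive list.
FAULT_PATTERN = re.compile(r"kernel BUG at|PANIC:|invalid opcode|Call Trace:")


def find_kernel_faults(lines):
    """Return (line_number, text) pairs for kernel lines that match a fault marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Only consider lines logged by the kernel, then look for a marker.
        if " kernel: " in line and FAULT_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


sample = [
    "Oct  6 03:40:02 Tower crond[1853]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null",
    "Oct  6 03:47:55 Tower kernel: __pte_list_remove: 00000000c9cfb92d 0->BUG",
    "Oct  6 03:47:55 Tower kernel: ------------[ cut here ]------------",
    "Oct  6 03:47:55 Tower kernel: kernel BUG at arch/x86/kvm/mmu/mmu.c:907!",
    "Oct  6 03:47:55 Tower kernel: invalid opcode: 0000 [#1] SMP NOPTI",
]

for number, text in find_kernel_faults(sample):
    print(f"{number}: {text}")
```

To scan a real file, pass an open file object instead of `sample`, for example `find_kernel_faults(open(path, errors="replace"))`, where `path` is wherever the syslog server was configured to store the log.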

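On 10/13/2023 at 10:36 PM, JackieWu said:

If you'd like help diagnosing the problem, upload your diagnostics, enable the syslog server so that logs are saved, and post the log from the time the problem occurs. For the steps, see:

You can also boot Unraid in safe mode to see whether the problem still occurs.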

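I hit this problem again today.

When I tried to destroy the hung VM, I got this error:

error: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainGetBlockInfo)

For the rest of the diagnostics, please take a look at the uploaded file. Thanks.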

blackserver-diagnostics-20231020-0932.zip

On 10/20/2023 at 9:44 AM, ccmspringming said:

I hit this problem again today.

When I tried to destroy the hung VM, I got this error:

error: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainGetBlockInfo)

For the rest of the diagnostics, please take a look at the uploaded file. Thanks.

blackserver-diagnostics-20231020-0932.zip

 

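Your Windows VM has a 3090 GPU passed through to it, but in your system settings I don't see the GPU bound to vfio-pci. I suggest binding it and then watching whether the problem recurs.

The log also contains kernel errors about the cachepool cache pool. I suggest checking the ZFS file system on that pool.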

 

Oct 20 04:23:00 BlackServer kernel: CPU: 6 PID: 15026 Comm: z_wr_int_0 Tainted: P     U  W  O       6.1.49-Unraid #1
Oct 20 04:23:00 BlackServer kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z490 Taichi, BIOS P1.90 03/17/2021
Oct 20 04:23:00 BlackServer kernel: Call Trace:
Oct 20 04:23:00 BlackServer kernel: dump_stack_lvl+0x44/0x5c
Oct 20 04:23:00 BlackServer kernel: vcmn_err+0x86/0xc3 [spl]
Oct 20 04:23:00 BlackServer kernel: ? scsi_queue_rq+0x645/0x7cb
Oct 20 04:23:00 BlackServer kernel: ? blk_mq_dispatch_rq_list+0x4de/0x511
Oct 20 04:23:00 BlackServer kernel: ? number+0x195/0x311
Oct 20 04:23:00 BlackServer kernel: PANIC: cachepool: blkptr at 000000001049b76e has invalid CHECKSUM 128
Oct 20 04:23:00 BlackServer kernel: Showing stack for process 2232
Oct 20 04:23:00 BlackServer kernel: zfs_panic_recover+0x6b/0x86 [zfs]
Oct 20 04:23:00 BlackServer kernel: zfs_blkptr_verify_log+0x98/0xfc [zfs]
Oct 20 04:23:00 BlackServer kernel: </TASK>
Oct 20 04:23:00 BlackServer kernel: ? __blk_mq_sched_dispatch_requests+0xc9/0x11c
Oct 20 04:23:00 BlackServer kernel: ? blk_mq_sched_dispatch_requests+0x2f/0x5a
Oct 20 04:23:00 BlackServer kernel: ? select_task_rq_fair+0xb90/0xba6
Oct 20 04:23:00 BlackServer kernel: ? sched_clock_cpu+0x12/0xa1
Oct 20 04:23:00 BlackServer kernel: ? __smp_call_single_queue+0x23/0x35
Oct 20 04:23:00 BlackServer kernel: zfs_blkptr_verify+0x9f/0x380 [zfs]
Oct 20 04:23:00 BlackServer kernel: zio_free+0x21/0xe9 [zfs]
Oct 20 04:23:00 BlackServer kernel: dsl_dataset_block_kill+0x1da/0x42e [zfs]
Oct 20 04:23:00 BlackServer kernel: dbuf_write_done+0x4b/0x18c [zfs]
Oct 20 04:23:00 BlackServer kernel: arc_write_done+0x361/0x3a0 [zfs]
Oct 20 04:23:00 BlackServer kernel: ? preempt_latency_start+0x2b/0x46
Oct 20 04:23:00 BlackServer kernel: zio_done+0xa4e/0xc79 [zfs]
Oct 20 04:23:00 BlackServer kernel: zio_execute+0xb1/0xdf [zfs]
Oct 20 04:23:00 BlackServer kernel: taskq_thread+0x266/0x38a [spl]
Oct 20 04:23:00 BlackServer kernel: ? wake_up_q+0x44/0x44
Oct 20 04:23:00 BlackServer kernel: ? zio_subblock+0x22/0x22 [zfs]
Oct 20 04:23:00 BlackServer kernel: ? taskq_dispatch_delay+0x106/0x106 [spl]
Oct 20 04:23:00 BlackServer kernel: kthread+0xe4/0xef
Oct 20 04:23:00 BlackServer kernel: ? kthread_complete_and_exit+0x1b/0x1b
Oct 20 04:23:00 BlackServer kernel: ret_from_fork+0x1f/0x30
Oct 20 04:23:00 BlackServer kernel: <TASK>
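
To check the pool named in the PANIC line, the usual ZFS tools are `zpool status -v` and `zpool scrub`. As a rough sketch, the snippet below pulls the pool name out of a ZFS PANIC line like the one in this log and prints the commands to consider running by hand. `pool_from_panic` and `suggested_commands` are hypothetical helper names, and the regular expression assumes the `PANIC: <pool>: <message>` layout seen here.

```python
import re

# Assumes the "PANIC: <pool>: <message>" layout shown in the log in this thread.
PANIC_PATTERN = re.compile(r"kernel: PANIC: (?P<pool>[^:\s]+): (?P<message>.*)")


def pool_from_panic(line):
    """Return (pool, message) from a ZFS PANIC log line, or None if it does not match."""
    match = PANIC_PATTERN.search(line)
    if match is None:
        return None
    return match.group("pool"), match.group("message").strip()


def suggested_commands(pool):
    """Build the zpool commands to run manually; nothing is executed here."""
    return [
        f"zpool status -v {pool}",  # show pool health and any files with errors
        f"zpool scrub {pool}",      # re-read all data and verify checksums
    ]


line = (
    "Oct 20 04:23:00 BlackServer kernel: PANIC: cachepool: "
    "blkptr at 000000001049b76e has invalid CHECKSUM 128"
)
pool, message = pool_from_panic(line)
for command in suggested_commands(pool):
    print(command)
```

If the pool metadata is damaged, a scrub may trip the same panic again, so it is prudent to back up anything important on the pool before starting one.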

 

Link to comment
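4 hours ago, JackieWu said:

Your Windows VM has a 3090 GPU passed through to it, but in your system settings I don't see the GPU bound to vfio-pci. I suggest binding it and then watching whether the problem recurs.

The log also contains kernel errors about the cachepool cache pool. I suggest checking the ZFS file system on that pool.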

 


 

Thank you very much for your guidance.
