scottpk

Diagnosing Random Lockup

In general things have been pretty solid, but recently the system has been locking up every couple of days, and the only apparent way to recover is a hard reset. (Or at least the web UI stops responding. Since the video card is set up for passthrough, I don't know whether a message was printed on-screen.)

I can't think of anything that coincided with this starting to happen. I enabled the option to persist the syslog across reboots; attached is the syslog I captured today after having to hard reset again.

I see that from about 2:45 PM until I rebooted, the kernel logged an RCU stall report with a backtrace roughly every three minutes:

Aug 14 14:51:36 Raiden kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Aug 14 14:51:36 Raiden kernel: rcu: 	1-....: (420005 ticks this GP) idle=1be/1/0x4000000000000002 softirq=1700269094/1700269094 fqs=102699 
Aug 14 14:51:36 Raiden kernel: rcu: 	 (t=420006 jiffies g=217067417 q=25007350)
Aug 14 14:51:36 Raiden kernel: NMI backtrace for cpu 1
Aug 14 14:51:36 Raiden kernel: CPU: 1 PID: 19660 Comm: kworker/u64:2 Tainted: G        W         4.19.56-Unraid #1
Aug 14 14:51:36 Raiden kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P3.10 03/07/2019
Aug 14 14:51:36 Raiden kernel: Workqueue: events_power_efficient gc_worker
Aug 14 14:51:36 Raiden kernel: Call Trace:
Aug 14 14:51:36 Raiden kernel: <IRQ>
Aug 14 14:51:36 Raiden kernel: dump_stack+0x5d/0x79
Aug 14 14:51:36 Raiden kernel: nmi_cpu_backtrace+0x71/0x83
Aug 14 14:51:36 Raiden kernel: ? lapic_can_unplug_cpu+0x8e/0x8e
Aug 14 14:51:36 Raiden kernel: nmi_trigger_cpumask_backtrace+0x57/0xd7
Aug 14 14:51:36 Raiden kernel: rcu_dump_cpu_stacks+0x91/0xbb
Aug 14 14:51:36 Raiden kernel: rcu_check_callbacks+0x28f/0x58e
Aug 14 14:51:36 Raiden kernel: ? tick_sched_handle.isra.5+0x2f/0x2f
Aug 14 14:51:36 Raiden kernel: update_process_times+0x23/0x45
Aug 14 14:51:36 Raiden kernel: tick_sched_timer+0x36/0x64
Aug 14 14:51:36 Raiden kernel: __hrtimer_run_queues+0xb1/0x105
Aug 14 14:51:36 Raiden kernel: hrtimer_interrupt+0xf4/0x20d
Aug 14 14:51:36 Raiden kernel: smp_apic_timer_interrupt+0x79/0x91
Aug 14 14:51:36 Raiden kernel: apic_timer_interrupt+0xf/0x20
Aug 14 14:51:36 Raiden kernel: </IRQ>
Aug 14 14:51:36 Raiden kernel: RIP: 0010:gc_worker+0xf6/0x258
Aug 14 14:51:36 Raiden kernel: Code: 02 0f 8f ec 00 00 00 48 8b 05 00 ea 8b 00 05 00 5c 26 05 41 89 87 88 00 00 00 e9 d4 00 00 00 48 8b 15 e8 e9 8b 00 29 d0 85 c0 <7f> 11 4c 89 ff e8 da e9 ff ff ff 44 24 08 e9 b6 00 00 00 85 ed 0f
Aug 14 14:51:36 Raiden kernel: RSP: 0018:ffffc90008af7e68 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
Aug 14 14:51:36 Raiden kernel: RAX: 000000001742ae36 RBX: ffffffff8228e440 RCX: ffffffffffffffb8
Aug 14 14:51:36 Raiden kernel: RDX: 00000001174f929c RSI: 0000000000000135 RDI: ffffffff8228e440
Aug 14 14:51:36 Raiden kernel: RBP: 0000000000000000 R08: 0000000000000018 R09: 0000746e65696369
Aug 14 14:51:36 Raiden kernel: R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000002f36
Aug 14 14:51:36 Raiden kernel: R13: 0000000064ac2ff9 R14: ffff8882e3b18f48 R15: ffff8882e3b18f00
Aug 14 14:51:36 Raiden kernel: ? gc_worker+0x1cc/0x258
Aug 14 14:51:36 Raiden kernel: process_one_work+0x16e/0x24f
Aug 14 14:51:36 Raiden kernel: ? pwq_unbound_release_workfn+0xb7/0xb7
Aug 14 14:51:36 Raiden kernel: worker_thread+0x1dc/0x2ac
Aug 14 14:51:36 Raiden kernel: kthread+0x10b/0x113
Aug 14 14:51:36 Raiden kernel: ? kthread_park+0x71/0x71
Aug 14 14:51:36 Raiden kernel: ret_from_fork+0x22/0x40
Aug 14 14:54:36 Raiden kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Aug 14 14:54:36 Raiden kernel: rcu: 	1-....: (600008 ticks this GP) idle=1be/1/0x4000000000000002 softirq=1700269094/1700269094 fqs=146764 
Aug 14 14:54:36 Raiden kernel: rcu: 	 (t=600009 jiffies g=217067417 q=35280857)
Aug 14 14:54:36 Raiden kernel: NMI backtrace for cpu 1
Aug 14 14:54:36 Raiden kernel: CPU: 1 PID: 19660 Comm: kworker/u64:2 Tainted: G        W         4.19.56-Unraid #1
Aug 14 14:54:36 Raiden kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P3.10 03/07/2019
Aug 14 14:54:36 Raiden kernel: Workqueue: events_power_efficient gc_worker
Aug 14 14:54:36 Raiden kernel: Call Trace:
Aug 14 14:54:36 Raiden kernel: <IRQ>
Aug 14 14:54:36 Raiden kernel: dump_stack+0x5d/0x79
Aug 14 14:54:36 Raiden kernel: nmi_cpu_backtrace+0x71/0x83
Aug 14 14:54:36 Raiden kernel: ? lapic_can_unplug_cpu+0x8e/0x8e
Aug 14 14:54:36 Raiden kernel: nmi_trigger_cpumask_backtrace+0x57/0xd7
Aug 14 14:54:36 Raiden kernel: rcu_dump_cpu_stacks+0x91/0xbb
Aug 14 14:54:36 Raiden kernel: rcu_check_callbacks+0x28f/0x58e
Aug 14 14:54:36 Raiden kernel: ? tick_sched_handle.isra.5+0x2f/0x2f
Aug 14 14:54:36 Raiden kernel: update_process_times+0x23/0x45
Aug 14 14:54:36 Raiden kernel: tick_sched_timer+0x36/0x64
Aug 14 14:54:36 Raiden kernel: __hrtimer_run_queues+0xb1/0x105
Aug 14 14:54:36 Raiden kernel: hrtimer_interrupt+0xf4/0x20d
Aug 14 14:54:36 Raiden kernel: smp_apic_timer_interrupt+0x79/0x91
Aug 14 14:54:36 Raiden kernel: apic_timer_interrupt+0xf/0x20
Aug 14 14:54:36 Raiden kernel: </IRQ>
Aug 14 14:54:36 Raiden kernel: RIP: 0010:gc_worker+0xa7/0x258
Aug 14 14:54:36 Raiden kernel: Code: 44 89 e0 48 8d 04 c2 4c 8b 30 41 f6 c6 01 0f 85 36 01 00 00 41 0f b6 46 37 48 c7 c1 f0 ff ff ff 41 ff c5 48 6b c0 38 48 29 c1 <4d> 8d 3c 0e 49 8b 97 80 00 00 00 41 8b 87 88 00 00 00 0f ba e2 0e
Aug 14 14:54:36 Raiden kernel: RSP: 0018:ffffc90008af7e68 EFLAGS: 00000296 ORIG_RAX: ffffffffffffff13
Aug 14 14:54:36 Raiden kernel: RAX: 0000000000000038 RBX: ffffffff8228e440 RCX: ffffffffffffffb8
Aug 14 14:54:36 Raiden kernel: RDX: 00000001175251bf RSI: 0000000000000135 RDI: ffffffff8228e440
Aug 14 14:54:36 Raiden kernel: RBP: 0000000000000000 R08: 0000000000000018 R09: 0000746e65696369
Aug 14 14:54:36 Raiden kernel: R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000002f36
Aug 14 14:54:36 Raiden kernel: R13: 00000000dc4f32b4 R14: ffff8882e3b18f48 R15: ffff8882e3b18f00
Aug 14 14:54:36 Raiden kernel: ? gc_worker+0x1cc/0x258
Aug 14 14:54:36 Raiden kernel: process_one_work+0x16e/0x24f
Aug 14 14:54:36 Raiden kernel: ? pwq_unbound_release_workfn+0xb7/0xb7
Aug 14 14:54:36 Raiden kernel: worker_thread+0x1dc/0x2ac
Aug 14 14:54:36 Raiden kernel: kthread+0x10b/0x113
Aug 14 14:54:36 Raiden kernel: ? kthread_park+0x71/0x71
Aug 14 14:54:36 Raiden kernel: ret_from_fork+0x22/0x40
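
To make the repeated reports easier to compare, here is a rough Python sketch that pulls the key fields out of each stall report (CPU, task name, workqueue, and the RIP symbol) and estimates when the stall began. The function name `summarize_stalls` and its `year` and `hz` parameters are my own, not part of any tool. The year is not recorded in the syslog, so 2019 is a guess, and the tick rate of 1000 per second is inferred from the two reports above (600009 - 420006 = 180003 jiffies in the 180 seconds between them), so treat both as assumptions.

```python
import re
from datetime import datetime, timedelta

# Trimmed copy of the relevant lines from the two reports above.
SAMPLE = """\
Aug 14 14:51:36 Raiden kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Aug 14 14:51:36 Raiden kernel: rcu:   (t=420006 jiffies g=217067417 q=25007350)
Aug 14 14:51:36 Raiden kernel: CPU: 1 PID: 19660 Comm: kworker/u64:2 Tainted: G        W         4.19.56-Unraid #1
Aug 14 14:51:36 Raiden kernel: Workqueue: events_power_efficient gc_worker
Aug 14 14:51:36 Raiden kernel: RIP: 0010:gc_worker+0xf6/0x258
Aug 14 14:54:36 Raiden kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Aug 14 14:54:36 Raiden kernel: rcu:   (t=600009 jiffies g=217067417 q=35280857)
Aug 14 14:54:36 Raiden kernel: CPU: 1 PID: 19660 Comm: kworker/u64:2 Tainted: G        W         4.19.56-Unraid #1
Aug 14 14:54:36 Raiden kernel: Workqueue: events_power_efficient gc_worker
Aug 14 14:54:36 Raiden kernel: RIP: 0010:gc_worker+0xa7/0x258
"""

# "<Mon> <day> <time> <host> kernel: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+(.*)$")

# Fields to capture from the lines that follow each stall header.
FIELD_RES = {
    "jiffies": re.compile(r"\(t=(\d+) jiffies"),
    "cpu": re.compile(r"^CPU: (\d+) PID:"),
    "comm": re.compile(r"Comm: (\S+)"),
    "workqueue": re.compile(r"^Workqueue: (.+)$"),
    "rip": re.compile(r"^RIP: [0-9a-f]+:(\S+)"),
}


def summarize_stalls(text, year=2019, hz=1000):
    """Return one dict per stall report found in `text`.

    `year` and `hz` are assumptions: syslog timestamps carry no year, and
    the tick rate is inferred from the spacing of the reports.
    """
    reports = []
    current = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, msg = match.group(1), match.group(2)
        if "self-detected stall on CPU" in msg:
            logged = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            current = {"logged": logged}
            reports.append(current)
            continue
        if current is None:
            continue
        for key, pattern in FIELD_RES.items():
            found = pattern.search(msg)
            if found and key not in current:  # keep the first hit per report
                current[key] = found.group(1)
    for report in reports:
        if "jiffies" in report:
            stalled_for = timedelta(seconds=int(report["jiffies"]) / hz)
            report["stalled_since"] = report["logged"] - stalled_for
    return reports


if __name__ == "__main__":
    for report in summarize_stalls(SAMPLE):
        print(report)
```

Running it against the full attached syslog (by reading the file and passing its contents in place of `SAMPLE`) should show whether every report names the same CPU, task, and function, and whether they all trace back to the same starting time.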

Does anyone understand how to tell what it says is stalled, and/or how I can work backward to find the culprit?

syslog
