Dieter Koblenz Posted December 12, 2021
I've been working with UnRAID for some time now, and I recently installed a new MB, CPU and memory. Before that, I had the occasional crash and never completely figured out why. Now that I've got IPMI, I can view the screen, and I seem to be getting a kernel panic. This has happened three times now, at random intervals. I've disabled C-states in the BIOS (and via terminal), and the memory is running at its design speed (conforming to the best practices tutorial). Temperatures are all within acceptable parameters. I've attached my diagnostics file and the screenshot from the terminal. I am hoping someone can give me some pointers.
starbase-diagnostics-20211212-1025.zip
JorgeB Posted December 12, 2021
Enable the syslog server and post that after a crash.
Dieter Koblenz (Author) Posted December 12, 2021
3 hours ago, JorgeB said: Enable the syslog server and post that after a crash.
Yes, I am doing a couple of memtest passes first, and then I'll fire up the box again. I did enable the syslog server, but I need a remote recipient to sustain it, right? I think I enabled logging to a share as well.
ChatNoir Posted December 12, 2021
8 hours ago, Dieter Koblenz said: but I need a remote recipient to sustain it, right?
You can also mirror the syslog to the flash drive for debugging purposes.
Dieter Koblenz (Author) Posted December 14, 2021
Here's the syslog, though at first glance I can't find a problem just before the kernel panic.
syslog-127.0.0.1 - kopie.log
Dieter Koblenz (Author) Posted December 15, 2021
This morning I found my syslog filled with this error:
Dec 15 07:29:23 STARBASE nginx: 2021/12/15 07:29:23 [alert] 8124#8124: worker process 20337 exited on signal 6
I've stopped some Docker containers for the time being, and the spamming stopped. I've seen people in this thread: with a similar problem and also kernel panics, so this might be related, even though I hadn't seen this error anywhere before.
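To get a feel for how heavy the spamming described above is, the alert lines can be tallied per signal. This is only a rough sketch: the regular expression is modelled on the single log line quoted in the post, and the function name `count_worker_exits` is made up for illustration. Signal 6 is SIGABRT, meaning the worker process aborted.

```python
import re
from collections import Counter

# Pattern modelled on the one alert line quoted in the post (an assumption:
# other nginx versions may word this message differently).
WORKER_EXIT = re.compile(r"worker process (\d+) exited on signal (\d+)")


def count_worker_exits(lines):
    """Count nginx worker exits per signal number in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = WORKER_EXIT.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


# The line quoted in the post, used as a one-line sample.
sample = [
    "Dec 15 07:29:23 STARBASE nginx: 2021/12/15 07:29:23 [alert] 8124#8124: "
    "worker process 20337 exited on signal 6",
]
print(count_worker_exits(sample))
```

For a real log, pass an open file instead of the sample list, for example `count_worker_exits(open(path))`, where `path` points at the saved syslog.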
JorgeB Posted December 15, 2021
On 12/14/2021 at 6:27 AM, Dieter Koblenz said: Here's the syslog
Unfortunately there are no call traces logged there, so there's not much to see.
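A quick way to check a saved syslog for call traces before posting it is to scan for the usual kernel markers. The sketch below is an assumption-laden helper, not an Unraid tool: the marker strings are a guess at what typically shows up (exact kernel wording varies by version), and `find_kernel_markers` and `scan_file` are hypothetical names.

```python
# Assumed marker strings; real kernel messages vary between versions,
# so treat this list as a starting point rather than a complete set.
MARKERS = ("Call Trace", "Kernel panic", "rcu_sched", "BUG:", "Oops")


def find_kernel_markers(lines, markers=MARKERS):
    """Return (line_number, marker, text) for each line containing a marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for marker in markers:
            if marker.lower() in lowered:
                hits.append((number, marker, line.rstrip()))
                break  # report each line once, under its first matching marker
    return hits


def scan_file(path):
    """Scan a syslog file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_kernel_markers(handle)
```

Calling `scan_file` on the mirrored syslog returns an empty list when nothing matched, which corresponds to the "no call traces logged" situation above.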
Dieter Koblenz (Author) Posted December 18, 2021
I've uploaded a new syslog with several errors. I tried to reboot from the command line, but it failed.
Edit: Never mind, I see it doesn't contain anything useful.
syslog-127.0.0.1.log
Edited December 18, 2021 by Dieter Koblenz
Dieter Koblenz (Author) Posted December 27, 2021
As a temporary solution, I've added a script to reboot my machine at 4 AM. This prevented crashes for a couple of days, until this morning, when I had another (new) kernel panic. Unfortunately, I couldn't screenshot the debug screen this time, but I did see the following message: rcu_sched stall. There isn't much to go on so far, but it's definitely driving me nuts.
Dieter Koblenz (Author) Posted January 14, 2022
I've swapped my memory for ECC modules in the hope this would solve the problem. It didn't. I've now changed all components, including the motherboard, except for the drives. At this point, I'm considering moving to another OS to see if it also crashes this often.