July 25, 2025 For the past few weeks I've been getting random kernel panics on my Unraid NAS. I assumed they were related to a failing NVMe drive, but even after replacing that drive I'm still getting kernel panics. What steps can I take to diagnose this? The attached diagnostics were taken about 3 minutes after the panic, once the system came back up. Attachment: tower-diagnostics-20250725-1319.zip
July 25, 2025 Community Expert Make sure this has been taken care of: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/#findComment-819173
July 25, 2025 Author
2 minutes ago, JorgeB said: Make sure this has been taken care of: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/#findComment-819173
I'll give that a try. I've had issues with C-states before, but that was on a different motherboard and an Intel CPU.
July 25, 2025 Author Alright, I've set Power Supply Idle Control to Typical Current Idle as per that FAQ. I'll post here if it panics again or if it stays stable for 7 days straight.
July 25, 2025 Community Expert If the crashes continue, enable the syslog server and post the resulting log after a crash. It may also be worth running memtest.
July 25, 2025 Author
23 minutes ago, JorgeB said: If they continue enable the syslog server and post that after a crash, it may also be worth running memtest.
Yeah, I already had mirror syslog to flash set up because of the panics, but if it happens again I'll set up a remote syslog server as well.
August 1, 2025 Author Diagnostics attached, but I'm not sure how helpful they will be: I had turned off mirror syslog to flash, and I had to restart the NAS to get the web UI to respond after it crashed overnight. Attachment: tower-diagnostics-20250801-1055.zip
August 1, 2025 Community Expert The syslog-previous shows an issue with the user shares, but not the beginning of the problem. It could be unrelated, since that should not crash the server. Re-enable the syslog server and post the log after the next event.
August 4, 2025 Author Crashed again, and I don't see anything useful in the syslog output. This portion of the log runs from about 16-17 minutes before the crash right up to the crash. The server crashed at about 18:15Z (2:15 PM local time); I first wrote 18:17Z (2:17 PM). As you can see, the last output was from a few minutes before. Diagnostics attached; let me know if you need logs from even further back.
Edit: I got the times slightly wrong.
"timestamp","source","message"
"2025-08-04T18:00:02.000Z","Tower","Tower move: Starting Mover ..."
"2025-08-04T18:00:02.000Z","Tower","Tower move: Cron + options: start"
"2025-08-04T18:00:02.000Z","Tower","Tower move: ionice -c 2 -n 7 nice -n 0 /usr/local/emhttp/plugins/ca.mover.tuning/age_mover start"
"2025-08-04T18:00:52.000Z","Tower","Tower kernel: mdcmd (188): nocheck PAUSE"
"2025-08-04T18:00:52.000Z","Tower","Tower kernel:"
"2025-08-04T18:00:53.000Z","Tower","Tower kernel: md: recovery thread: exit status: -4"
"2025-08-04T18:01:03.000Z","Tower","Tower Parity Check Tuning: Send notification: Paused: Automatic Correcting Parity-Check (45.4% completed) (type=normal link=/Settings/Scheduler)"
"2025-08-04T18:03:04.000Z","Tower","Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update"
"2025-08-04T18:03:36.000Z","Tower","Tower kernel: docker0: port 2(veth8779f38) entered disabled state"
"2025-08-04T18:03:36.000Z","Tower","Tower kernel: veth93ddbf9: renamed from eth0"
"2025-08-04T18:03:36.000Z","Tower","Tower kernel: docker0: port 2(veth8779f38) entered disabled state"
"2025-08-04T18:03:36.000Z","Tower","Tower kernel: veth8779f38 (unregistering): left allmulticast mode"
"2025-08-04T18:03:36.000Z","Tower","Tower kernel: veth8779f38 (unregistering): left promiscuous mode"
"2025-08-04T18:03:36.000Z","Tower","Tower kernel: docker0: port 2(veth8779f38) entered disabled state"
"2025-08-04T18:06:24.000Z","Tower","Tower kernel: mdcmd (189): check resume"
"2025-08-04T18:06:24.000Z","Tower","Tower kernel:"
"2025-08-04T18:06:24.000Z","Tower","Tower kernel: md: recovery thread: check P ..."
"2025-08-04T18:06:29.000Z","Tower","Tower Parity Check Tuning: Send notification: Resumed: Automatic Correcting Parity-Check (45.4% completed) (type=normal link=/Settings/Scheduler)"
Attachment: tower-diagnostics-20250804-1435.zip
Edited August 4, 2025 by Meowcat285
August 5, 2025 Community Expert Having nothing relevant logged typically points to a hardware issue. Since you have multiple RAM sticks, try running the server with just one. If it still crashes, try with the other one. That will basically rule out bad RAM.
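For anyone wanting to put a number on "nothing relevant logged", here is a minimal Python sketch that measures how long the syslog was silent before a crash. It assumes the CSV export format posted in this thread (timestamp, source, message columns with UTC timestamps). The function names are hypothetical, the sample rows are copied from the log above, and the 18:15Z crash time is the poster's estimate, not something taken from the log itself.

```python
import csv
import io
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    # Timestamps in the export look like 2025-08-04T18:00:02.000Z (UTC).
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
        tzinfo=timezone.utc
    )


def load_entries(csv_text: str):
    # Return a list of (timestamp, message) tuples from the CSV text.
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(parse_ts(row["timestamp"]), row["message"]) for row in reader]


def silence_before_crash(entries, crash_time):
    # Find the newest log entry and how long before the crash it was written.
    last_ts, last_msg = max(entries, key=lambda entry: entry[0])
    return crash_time - last_ts, last_msg


# Three rows copied from the log posted above, used as sample input.
SAMPLE = '''"timestamp","source","message"
"2025-08-04T18:00:02.000Z","Tower","Tower move: Starting Mover ..."
"2025-08-04T18:06:24.000Z","Tower","Tower kernel: mdcmd (189): check resume"
"2025-08-04T18:06:29.000Z","Tower","Tower Parity Check Tuning: Send notification: Resumed: Automatic Correcting Parity-Check (45.4% completed) (type=normal link=/Settings/Scheduler)"
'''

if __name__ == "__main__":
    entries = load_entries(SAMPLE)
    # Assumed crash time: the poster's estimate of 18:15Z.
    crash = datetime(2025, 8, 4, 18, 15, tzinfo=timezone.utc)
    gap, msg = silence_before_crash(entries, crash)
    print(f"Last entry was written {gap} before the crash: {msg}")
```

A gap of several minutes with only routine mover and parity-check messages before it is consistent with the hardware-fault reading, but it does not prove it. A remote syslog server may still catch lines that never reach the flash drive.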
August 5, 2025 Author
7 hours ago, JorgeB said: Having nothing relevant logged typically points to a hardware issue, since you have multiple RAM sticks, try using the server with just one, if the same try with the other one, that will basically rule out bad RAM.
Would it be worth just running a memory test first off?
August 5, 2025 Author
1 hour ago, Meowcat285 said: Would it be worth just running a memory test first off?
So I did run a memory test and found no issues. I'm going to try disabling CPU C-states, given that I've had issues with those before on ASRock boards. If it's still crashing, I'll try what you suggested.