
Crashing After Running a few hours


Go to solution Solved by JorgeB,


Hello, I have tried a few things and can not seem to figure out what is going wrong.

 

Unraid version: 6.12.6

 

What is happening: Usually after a few hours of run time, I get a kernel error. It has crashed while idle, during parity checks, and during normal operation (watching a movie on Jellyfin). In that case it crashes less than 15 minutes into the movie.

 

What I have done:

-Upgraded from Trial to Pro, trial period was almost over regardless

-Used two different usb drives to boot

-Tried two different sets of RAM, with RAM speeds changed in BIOS to both OC and within normal spec 2866 & 3200

-I am running an AMD Ryzen 5 3600, I changed the idle power setting to typical in the bios and turned global c states off

-I have tried leaving all dockers off, I am unsure if it was due to the upgrade or not, but all my dockers were gone after I loaded into the GUI after one crash

-Booting up today I see my dockers are back with no input from me so I assume it was some kind of issue with loading, but I have no clue

-With the reappearance I have updated both Jellyfin (released today 06FEB2024) and Netdata (released today 06FEB2024). I also run Krusader but it did not have an update

 

It is possible I have done other troubleshooting but this is what I can remember off the top of my head. I have attached the diagnostic folder and a few log files. I am unsure of exactly what was needed so I hope I have what is required.

 

Status right now: Booted up, parity not running, Jellyfin running, no other dockers

Settings for everything in system and docker appear as they were before these crashes became common.

 

If there is anything else you need I will provide it and if anything happens on my end in the meantime I will update the thread

Troubleshooting 06FEB.7z

Link to comment

As I said in the above post, yes I have. However since I have booted up yesterday it has not crashed, which was not normal. I am currently running through a rotation of tasks I was doing prior to crashing to see if it triggers again. If not I guess just being powered down for 24+ hours removed the gremlins. If I do not notice anything in the next couple days I will come back and update and mark solved.

 

I had recently gotten new RAM and the crashes started around the time I installed it. I cannot remember for sure whether they began before or after the installation. As of now I have the original RAM installed. I may go back and test the new RAM again, but first I would like to run a successful parity check. I should note it has crashed on both sets of RAM, the only difference being that the new set is 32GB and the older set is 16GB.

Link to comment

I would first re-test with v6.11.5, since there have been some rare cases of kernel compatibility issues. If the Unraid driver keeps crashing, it's almost certainly hardware related. As for where to start, I would try running with just one stick of RAM (without XMP); if the result is the same, try a different stick. That will basically rule out the RAM.

Link to comment

Downgraded to 6.11.5. Booted up in safe mode, in hindsight I realized I didn't try this before, and completed the parity check. I have swapped out the RAM and checking to see if potentially those are still an issue. If everything is still going fine for the next couple days, I'll come back and mark this solved.

Link to comment

It was definitively the compatibility issue; both sets of RAM work just fine. I am also seeing significantly lower idle CPU usage. I am not sure if those two things are related, but it was something I noticed immediately. On the latest version of Unraid I was idling between 6-15% consistently, and now I idle around 1%, which seems far more normal. Both of these values were with all dockers off.

 

I tried to upgrade to 6.12.4 and the crashing immediately started happening again. Is there any way I can use my dockers on 6.11.5, or are there any other options? The app tab says unavailable prior to 6.12.0. I currently only have Jellyfin, Krusader, and Netdata, but I had wanted to explore other options, though I view the media platform as my higher priority.

Edited by Whitty
Addition of pertinent information
Link to comment

You should be able to make docker work after downgrading, from the v6.12.0 release notes:

 

"If you revert back from 6.12 to 6.11.5 or earlier, you have to force update all your Docker containers and start them manually after downgrading. This is necessary because of the underlying change to cgroup v2 starting with 6.12.0-rc1."

 

If it works, keep on v6.11 for now and then try v6.13 once it's out.
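Since the release note above ties the Docker behaviour to the cgroup v2 change, it can be useful to confirm which cgroup layout a given boot is actually using. The following is only a minimal sketch, not an official Unraid tool: it assumes the usual Linux convention that a unified cgroup v2 hierarchy exposes a `cgroup.controllers` file at its root, and the function name `cgroup_version` and the default mount point `/sys/fs/cgroup` are illustrative choices.

```python
from pathlib import Path


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 if `root` looks like a unified cgroup v2 hierarchy, else 1.

    Assumption: cgroup v2 exposes a `cgroup.controllers` file at the root
    of the hierarchy, while a v1 layout has per-controller subdirectories
    and no such file.
    """
    return 2 if (Path(root) / "cgroup.controllers").is_file() else 1
```

Calling `cgroup_version()` from a Python shell on the server before and after the downgrade should, under that assumption, report a different value for 6.12.x than for 6.11.5.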

 

 

Link to comment
  • 3 weeks later...

Not sure if this is still being monitored or not, but I am still having crashing issues. I've ruled out the RAM and the USB stick. I have attached a few diagnostic files here. At this point I feel like just getting a new motherboard and CPU, because I can't think of anything else. I have tested with both Unraid 6.11.5 and 6.12.8, and have since reverted back to 6.11.5.

 

The most recent crash occurred during a parity check, and that one was a full system crash. Most of the time it isn't really a full crash and shutdown: the CPU maxes out and there is usually a kernel panic error.
whittyshare-flash-backup-20240220-2216.zip

whittyshare-diagnostics-20240224_1356.zip whittyshare-diagnostics-20240223-1525.zip whittyshare-diagnostics-20240220-1709.zip whittyshare-diagnostics-20240218-1751.zip whittyshare-diagnostics-20240218-1643.zip whittyshare-diagnostics-20240218-1117.zip

Edited by Whitty
add info
Link to comment

In my log file I am getting the following error near continuously

 

Feb 25 02:32:24 Server kernel: CPU: 7 PID: 0 Comm: swapper/7 Tainted: P W O 5.19.17-Unraid #2

Feb 25 02:32:24 Server kernel: Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 5302 10/20/2023

Feb 25 02:32:24 Server kernel: Call Trace:

 

This fills my syslog file rather quickly and I am wondering if this is the reason for the crashes? I hadn't noticed this previously in troubleshooting so I cannot say for sure if this is new or not.
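To judge how quickly these traces are accumulating, and whether they stop after a change such as disabling Docker, the syslog can be tallied per hour. This is a rough sketch only: it assumes the syslog is a plain-text file with the same `Feb 25 02:32:24 Server kernel: ...` layout as the lines quoted above, and the function names and the path `/var/log/syslog` used in the note below are assumptions, not something taken from the diagnostics.

```python
import re
from collections import Counter

# Captures the "Feb 25 02" part (month, day, hour) of a syslog timestamp.
HOUR_PREFIX = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}):\d{2}:\d{2}\s")


def count_call_traces(lines):
    """Count kernel 'Call Trace:' lines per hour in an iterable of syslog lines."""
    per_hour = Counter()
    for line in lines:
        if "kernel: Call Trace:" not in line:
            continue
        match = HOUR_PREFIX.match(line)
        if match:
            per_hour[match.group(1)] += 1
    return per_hour


def count_call_traces_in_file(path):
    """Convenience wrapper: read a syslog file and tally call traces per hour."""
    with open(path, errors="replace") as handle:
        return count_call_traces(handle)
```

For example, `count_call_traces_in_file("/var/log/syslog")` (path assumed) returns a counter keyed by hour, so a reboot with all dockers off can be compared against an earlier hour when the errors were continuous.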

 

I had found a few other topics relating to similar errors and disabled my docker to see if that was causing the error and it is not related as far as I can tell.

 

This error appears with the entire array stopped as well.

 

After rebooting with all dockers off and autostart disabled, the error has not returned, at least not yet. I plan to keep the current configuration and start the other dockers later to see if they cause the issue. I attached the most recent diagnostics just in case, though I doubt they will provide any additional data.

whittyshare-diagnostics-20240224-2158.zip

Link to comment
  • Solution
8 hours ago, Whitty said:

In my log file I am getting the following error near continuously

 

Feb 25 02:32:24 Server kernel: CPU: 7 PID: 0 Comm: swapper/7 Tainted: P W O 5.19.17-Unraid #2

Feb 25 02:32:24 Server kernel: Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 5302 10/20/2023

Feb 25 02:32:24 Server kernel: Call Trace:

If you are getting errors with the two very different kernels, and assuming the Ryzen specific issues have been taken care of as linked above, it suggests a hardware issue.

Link to comment
