Server Crash: CPU and RAM being Taken to 100%, WebUI Unresponsive Requiring Unsafe Server Hard Shutdown on 6.12.6


Go to solution Solved by Linguafoeda,


I moved my server to a new apartment (new ISP and router) two days ago and am now having serious issues with it :( Within a couple of hours last night, I noticed the web UI had frozen, last showing the CPU pegged at 100% with temps in the 80s and RAM usage skyrocketing from 48% to 97%. I had to turn the server off unsafely (not sure what it's called, but a forced shutdown essentially). To make things worse, the mobo won't boot from the USB device until I unplug the PSU and plug it back in. This has now happened a second time, 24 hours later. I have no idea how to begin troubleshooting this or what is causing it. I didn't have this issue 3 days ago when my server was sitting in another apartment. I'm extremely sad / frustrated, given that a lot of what I do depends on this now-unstable server. I typically run Plex, related *arr apps, downloading utilities, a Windows 11 VM and some other random media-related things.

 

I started off not being able to connect, with WireGuard down and nginx not working (see the thread below that I'm having issues with), and now my CPU/RAM is going to 100% within 24 hours of use, causing a crash... Does anyone know what is happening here? I'm running 6.12.6, which has been running for at least a few weeks now (and again, all was fine until 3 days ago).

 

I got a notice last night from Fix Common Problems (FCP) to (a) use the Community Apps Realtek 8125G plugin, so I installed that, and (b) look at something related to "macvlan". FCP redirected me to this link (https://docs.unraid.net/unraid-os/release-notes/6.12.4/#fix-for-macvlan-call-traces), and I followed all the steps: I changed the network type to ipvlan, bonding was already on so no changes there, I turned bridging off, and I enabled "Host access to custom networks". I don't have any logs from when the server crashed and required a hard shutdown; I can only see logs from after I rebooted. Now the ipvlan option is grayed out in my server settings too... I'm at a loss as to what to do.

Edited by Linguafoeda
Link to comment

Unfortunately there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

Link to comment

I am having a similar issue.

I narrowed it down to the shfs process using most of the RAM and at some point crashing the server, which then requires a reboot. After the reboot, the process starts all over again.

This started happening recently, within the past month.

I rebooted the server last night and RAM usage is now up to 30.9% (it was 15% after the reboot).

I am on Unraid version 6.12.6.

root     11504  0.0  0.0 208872  2292 ?        Ssl  Dec25   0:00 /usr/local/bin/shfs /mnt/user0 -disks 510 -o default_permissions,allow_other,noatime
root     11515  136 30.9 20886396 20338792 ?   Ssl  Dec25 1210:05 /usr/local/bin/shfs /mnt/user -disks 511 -o default_permissions,allow_other,noatime -o remember=0
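For anyone reading the `ps` output above: the sixth column is RSS in KiB, so the second shfs process is holding roughly 19.4 GiB. Below is a minimal Python sketch of that arithmetic. It assumes the standard `ps aux` column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); the helper names `parse_ps_line` and `heaviest` are made up for illustration.

```python
# Minimal sketch: read the RSS column from `ps aux` output and convert it to GiB.
# The two lines are copied from the post above; the helper names are hypothetical.
PS_LINES = """\
root     11504  0.0  0.0 208872  2292 ?        Ssl  Dec25   0:00 /usr/local/bin/shfs /mnt/user0 -disks 510 -o default_permissions,allow_other,noatime
root     11515  136 30.9 20886396 20338792 ?   Ssl  Dec25 1210:05 /usr/local/bin/shfs /mnt/user -disks 511 -o default_permissions,allow_other,noatime -o remember=0
"""


def parse_ps_line(line):
    """Split one `ps aux` line into the fields we care about."""
    # Assumed column order: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    fields = line.split(None, 10)
    return {
        "pid": int(fields[1]),
        "cpu_pct": float(fields[2]),
        "mem_pct": float(fields[3]),
        "rss_gib": int(fields[5]) / (1024 ** 2),  # RSS is reported in KiB
        "command": fields[10],
    }


def heaviest(ps_text):
    """Return the parsed row with the largest resident memory."""
    rows = [parse_ps_line(line) for line in ps_text.splitlines() if line.strip()]
    return max(rows, key=lambda row: row["rss_gib"])


if __name__ == "__main__":
    top = heaviest(PS_LINES)
    print(f"PID {top['pid']}: {top['rss_gib']:.1f} GiB resident, {top['mem_pct']}% of RAM")
```

Running the same parse a few hours apart would show whether the shfs RSS keeps climbing or levels off.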

 

I have attached the diagnostic file.  Any help would be appreciated. 

tower-diagnostics-20231226-0759.zip

Link to comment
  • 2 weeks later...

You have many *arr applications and Plex. Could you disable those and see whether it still crashes? If it does, you may need to disable the Docker service; the aim is to identify whether the problem comes from Docker.

 

Your system temperature is in the 60s. Is that real, or is the wrong sensor selected? (It doesn't make sense for the CPU temp to be only in the 70s at 50% load when the system temp is in the 60s.)

Edited by Vr2Io
Link to comment
23 hours ago, Vr2Io said:

You have many *arr applications and Plex. Could you disable those and see whether it still crashes? If it does, you may need to disable the Docker service; the aim is to identify whether the problem comes from Docker.

Your system temperature is in the 60s. Is that real, or is the wrong sensor selected? (It doesn't make sense for the CPU temp to be only in the 70s at 50% load when the system temp is in the 60s.)

I will try that in a few days. I may try rolling back to 6.12.4 too. Should I just let my VM run and turn off Plex and the *arrs? It would be odd if I were the only one experiencing this (has it been posted elsewhere?) if it were related to one of those *arr/Plex Docker containers, right?

Link to comment
37 minutes ago, Linguafoeda said:

So this might be plex related?

 

Limit the Plex container to a reasonable amount of RAM and set the scheduled task to run at the next closest hour to see if it's fixed. If it is, you can change it back to your regular time. If there's a media file that the scanner doesn't like, it'll cause the process to run away, so limiting its RAM will restart the container instead of bringing down the system.
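One possible way to apply such a limit from the command line is Docker's `docker update --memory ... --memory-swap ...`, sketched below in Python. The container name `plex` and the `8g` limit are assumptions, so substitute your own. A limit set this way may not survive the container being recreated from its template; on Unraid the usual persistent route is adding `--memory=8G` to the container's Extra Parameters field.

```python
import shlex
import subprocess


def limit_memory(container, limit, dry_run=True):
    """Build (and optionally run) a `docker update` command that caps a container's RAM."""
    # Setting --memory-swap to the same value as --memory leaves the container no extra swap.
    cmd = ["docker", "update", "--memory", limit, "--memory-swap", limit, container]
    if not dry_run:
        # Raises CalledProcessError if docker rejects the update.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # "plex" and "8g" are placeholder values, not taken from the thread.
    print(shlex.join(limit_memory("plex", "8g")))
```

With `dry_run=True` (the default) nothing is changed; the command is only printed so it can be checked first.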

Link to comment
19 hours ago, Linguafoeda said:

[FFMPEG] - Unknown encoding

 

This suggests it's a video file that can't be properly read. If you added media before it started acting up, start by removing and/or examining those files, then add them back until you find the culprit.

Link to comment
6 hours ago, Michael_P said:

 

This suggests it's a video file that can't be properly read. If you added media before it started acting up, start by removing and/or examining those files, then add them back until you find the culprit.

 

I posted in the Plex forum to see how to troubleshoot this. I added a ton of different TV shows / movies, so it would be very cumbersome to test each and every one. Any idea how to find specifically the one that is throwing the error?
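One way to avoid checking files by hand is to run `ffprobe -v error` over the recently added files and list any that produce error output. This is only a sketch: there is no guarantee that the file upsetting the Plex scanner also fails ffprobe, and the `/mnt/user/media` path, the `*.mkv` pattern and the function names are assumptions.

```python
import subprocess
from pathlib import Path


def ffprobe_errors(path):
    """Return ffprobe's error output for one file ('' means nothing was reported)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", str(path)],
        capture_output=True,
        text=True,
    )
    message = result.stderr.strip()
    if not message and result.returncode != 0:
        message = f"ffprobe exited with code {result.returncode}"
    return message


def find_suspect_files(paths, probe=ffprobe_errors):
    """Return (path, error) pairs for every file the probe complains about."""
    suspects = []
    for path in paths:
        error = probe(path)
        if error:
            suspects.append((path, error))
    return suspects


if __name__ == "__main__":
    # Hypothetical library location; point this at the recently added folders.
    media = sorted(Path("/mnt/user/media").rglob("*.mkv"))
    for path, error in find_suspect_files(media):
        print(f"{path}: {error}")
```

The `probe` argument is there so a different check can be swapped in without touching the loop.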

Link to comment
On 1/11/2024 at 11:57 AM, Michael_P said:

 

I'd start with large blocks of files, and narrow it down from there. Or maybe Tdarr's health check could work, too
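The "large blocks, then narrow down" idea quoted above is essentially a binary search. Here is a minimal Python sketch, assuming a single bad file and some repeatable way to tell whether a batch triggers the problem (for example, adding only that batch to the library and running a scan). `find_culprit` and `triggers_problem` are hypothetical names, and in practice the check would be a manual step rather than a function call.

```python
def find_culprit(files, triggers_problem):
    """Return one file that makes triggers_problem(batch) true, or None if no batch does."""
    candidates = list(files)
    if not candidates or not triggers_problem(candidates):
        return None
    while len(candidates) > 1:
        first_half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the problem.
        if triggers_problem(first_half):
            candidates = first_half
        else:
            candidates = candidates[len(first_half):]
    return candidates[0]


if __name__ == "__main__":
    # Simulated library: 32 files, one of which is "bad".
    library = [f"episode_{i:02d}.mkv" for i in range(1, 33)]
    bad_file = "episode_23.mkv"
    checks = []

    def triggers_problem(batch):
        checks.append(len(batch))
        return bad_file in batch

    print(find_culprit(library, triggers_problem), "found after", len(checks), "checks")
```

The number of checks grows with the logarithm of the library size, so halving is much cheaper than testing files one at a time.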

 

How do you limit Plex's RAM usage? It just crashed again last night. I feel I can safely say it's Plex causing Unraid to crash, but now I want to troubleshoot and fix my Plex install while I'm away, instead of having my server constantly crash. I also pinned the Plex container to only use CPU 4-7 / HT 12-15.

 


Edited by Linguafoeda
Link to comment
  • Solution

As an update: a Plex employee told me this is a known bug in the beta Plex version 1.40.7775, which I had inadvertently been on. I changed the VERSION variable of the linuxserver container from "latest" to "docker", which downgraded me to 1.32.8.7639. The problem has disappeared. Hallelujah!
