Web UI drops off network, VMs and Docker stop - crashing?


Go to solution Solved by HBoardman,

Hi folks,

 

I know, another "why is my system crashing" post, sorry...

 

I'm having this issue where my system seems like it crashes, but I'm not sure how to verify this as the CLI stops outputting after the suspected crashes. After a day or so my dockers and VMs stop and I cannot access the Web UI. I also can't ping the server in the local network. All things considered, it seems like a crash and I have to hard shutdown to get the server responsive again, where it will crash again after a day or so. 
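
One way to pin down when the box actually drops off the network is to poll it from a second machine and keep a timestamped record. Below is a minimal Python sketch of that idea, not something taken from this thread: the function names (`is_reachable`, `find_outages`, `monitor`), the choice of TCP port 22 (SSH), and the 60-second interval are all assumptions for illustration.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 22 (SSH) is an assumption; use any port the server normally answers on.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_outages(samples):
    """Turn time-ordered (timestamp, reachable) pairs into (down_at, up_at) pairs.

    up_at is None when the host was still down at the last sample.
    """
    outages = []
    down_since = None
    for ts, up in samples:
        if not up and down_since is None:
            down_since = ts                    # first failed check of an outage
        elif up and down_since is not None:
            outages.append((down_since, ts))   # host answered again
            down_since = None
    if down_since is not None:
        outages.append((down_since, None))
    return outages


def monitor(host, rounds, interval=60.0):
    """Poll host `rounds` times, `interval` seconds apart, and return the outages."""
    samples = []
    for _ in range(rounds):
        samples.append((datetime.now(), is_reachable(host)))
        time.sleep(interval)
    return find_outages(samples)
```

Calling `monitor` with the server's IP and `rounds=1440` from another PC would cover roughly a day at the default interval. The first `down_at` timestamp can then be compared against the last lines written to the syslog before the hang.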

 

This happened a while ago, and I seemed to fix it by either disabling C-states or updating unRAID. Alas, I don't know what else I can do now the issue has come back. There have been no hardware changes, temps seem fine and I haven't changed anything massive software-wise other than updating unRAID to 6.12.3 and deleting unused Appdata folders from dockers I don't use anymore. 

 

Any advice would be hugely appreciated. If you need any data beyond this diagnostics file, just ask. 

 

Thanks!


viki-diagnostics-20230731-1757.zip

Link to comment

Okay, so I've had to restart my NAS because the UI was unavailable.

 

  1. It locked up the GUI again after a couple of minutes (perhaps due to me updating a Docker container?)
    1. Both hostname and IP don't work for the Unraid UI
  2. My Home Assistant VM stopped responding
  3. Most, but not all, Docker containers continued to work and I was still able to interact with them (e.g. watch stuff on Jellyfin)
  4. All SMB shares are unresponsive
  5. SSH doesn't react at all

I'm currently running Unraid version 6.12.2, so I've now started the update to 6.12.3, hoping that this will solve my problems. I've also enabled syslog and tried to put it in my downloads share.

Link to comment
On 7/31/2023 at 6:58 PM, Zoba said:

I'm running into exactly this problem. My Home Assistant VM stays responsive, but I can't access the GUI and SSH doesn't react at all. I feel like this only started after the latest update, though.

I can't even access my HA VM, I lose access to everything. Mine definitely started before the latest update, and happened again a few months ago which I seemed to fix by updating the OS.

Link to comment

Something else I've just noticed - I was doing some transcoding through Plex (nothing big, 1080 to 1080) and the server went down in a matter of minutes after being up. Is there anything in the logs to suggest that Plex might be the issue? It's currently running in a Docker container. 

Link to comment

Hey there, my HA and all the other stuff used to drop over the course of minutes/hours as well, but some stuff kept going (?). Didn't test how long things would take to get stuck because I'd rather use my NAS. I've now updated to the latest Unraid minor version and so far had no stops. I've recently added tone mapping and transcoding to Jellyfin as well - so it might have to do with it? I'm using a 3400G in my system and its iGPU to transcode.

Link to comment
57 minutes ago, Zoba said:

Hey there, my HA and all the other stuff used to drop over the course of minutes/hours as well, but some stuff kept going (?). Didn't test how long things would take to get stuck because I'd rather use my NAS. I've now updated to the latest Unraid minor version and so far had no stops. I've recently added tone mapping and transcoding to Jellyfin as well - so it might have to do with it? I'm using a 3400G in my system and its iGPU to transcode.

Is that 6.12.3? That's what I'm running now, so sadly that might not be my issue. I've stopped the Plex docker for now, to see if that makes any difference in stability. 

Link to comment
4 hours ago, JorgeB said:

Syslog has constant call traces, which looks more like a hardware issue. Try running with just 1 or 2 sticks of RAM (try different ones if there are still issues), and leave XMP disabled for testing.

Interesting - would it be worth running memtest? I have 2x8GB and 2x16GB sticks in at the moment, does unRAID not like weird configurations?
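
For anyone who wants to check their own syslog for the call traces mentioned above, here is a rough Python sketch. It assumes classic syslog lines that start with a `Mon DD HH:MM:SS` timestamp and that the kernel marks each stack dump with the literal text `Call Trace:`; the function names and the file path passed in are hypothetical.

```python
import re
from collections import Counter

# The Linux kernel prints "Call Trace:" immediately before a stack dump.
CALL_TRACE = re.compile(r"Call Trace:")


def count_call_traces(lines):
    """Return a Counter mapping 'Mon DD HH' hour buckets to call-trace counts."""
    per_hour = Counter()
    for line in lines:
        if CALL_TRACE.search(line):
            # Assumed line format: "Jul 31 17:42:10 host kernel: Call Trace:".
            # The first 9 characters ("Jul 31 17") identify the hour.
            per_hour[line[:9]] += 1
    return per_hour


def summarize_file(path):
    """Count call traces per hour in the syslog file at `path` (hypothetical path)."""
    with open(path, errors="replace") as fh:
        return count_call_traces(fh)
```

Calling `summarize_file` on the syslog from an extracted diagnostics zip would show whether the traces are spread evenly or cluster around particular hours, which may help line them up with things like transcoding sessions.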

Link to comment
Just now, HBoardman said:

Interesting - would it be worth running memtest?

 

This will never do any harm :) Note that passing memtest is not definitive proof that you have no memory issues, whereas failing it is proof that you do.

 

1 minute ago, HBoardman said:

I have 2x8GB and 2x16GB sticks in at the moment, does unRAID not like weird configurations?

 

At that level Unraid is just Linux (Slackware) so should be no more sensitive than other Linux systems.

Link to comment
4 hours ago, itimpi said:

 

This will never do any harm :) Note that passing memtest is not definitive proof that you have no memory issues, whereas failing it is proof that you do.

 

 

At that level Unraid is just Linux (Slackware) so should be no more sensitive than other Linux systems.

Makes sense. Although interestingly, I've had the Plex docker spun down since last night, and no signs of issues so far... I may migrate this to a VM and see if it still remains stable. Wonder what Docker could theoretically do to affect the stability of the entire system? 

Link to comment
1 hour ago, Zoba said:

Okay, it happened again. Started a video with transcoding and the entire server became unresponsive after roughly 20 seconds.

Mm, definitely something up with Plex. Can't tell if it's the Docker itself or something load-related when Plex starts doing some heavy lifting. 

Link to comment
  • Solution
8 hours ago, ggfools said:

You guys may be onto something, my machine became unresponsive last night while Plex was doing sonic analysis on my music library.

Migrated to a Windows VM yesterday (I know Windows isn't a great choice for a "server" VM, but I really can't be bothered to argue with Linux while I need my Plex server back up ASAP), and so far so good - unRAID has been stable. Time will tell if it remains this way, but at the moment it looks like something to do with the Plex Docker container/configuration. 

Link to comment
On 8/5/2023 at 10:21 AM, Zoba said:

@HBoardman did you run into any more issues?

 

EDIT: Also, what hardware do you use? An AMD iGPU like me?

Seems to have "fixed" it - no idea what issue the VM had, and I don't know how to troubleshoot it but for now moving Plex to a VM seems to make the system stable again. Unfortunate downside is running the overhead of an OS for one program, but I'd rather have my server running!

 

Hardware is a 9600K, 48GB of RAM, 20TB across 8 disks (I know, I'm working on it).

Link to comment
