HBoardman Posted July 31, 2023: Hi folks, I know, another "why is my system crashing" post, sorry... I'm having an issue where my system seems to crash, but I'm not sure how to verify this, as the CLI stops outputting after the suspected crashes. After a day or so my dockers and VMs stop and I cannot access the Web UI. I also can't ping the server on the local network. All things considered, it seems like a crash, and I have to do a hard shutdown to get the server responsive again, after which it crashes again within a day or so. This happened a while ago, and I seemed to fix it by either disabling C-states or updating unRAID. Alas, I don't know what else I can do now the issue has come back. There have been no hardware changes, temps seem fine, and I haven't changed anything major software-wise other than updating unRAID to 6.12.3 and deleting unused Appdata folders from dockers I don't use anymore. Any advice would be hugely appreciated; if you need any data beyond this diagnostics file, just ask. Thanks! Attachment: viki-diagnostics-20230731-1757.zip
HBoardman (Author) Posted July 31, 2023: Oh, and here's the diagnostics file from what I assume was before the most recent crash (earlier last night, as I realised the server was down this morning). Attachment: viki-diagnostics-20230316-1915.zip
JorgeB Posted July 31, 2023: Enable the syslog server and post that after a crash.
Zoba Posted July 31, 2023: I'm running into exactly this problem. My homeassistant VM stays responsive, but I can't access the GUI and ssh doesn't react at all. I feel like this only started after the latest update, though.
Zoba Posted July 31, 2023: Okay, so I've had to restart my NAS because the UI was unavailable. It locked up the GUI again after a couple of minutes (perhaps due to me updating a Docker container?):
- Both name and IP don't work for the Unraid UI
- My homeassistant VM stopped reacting
- Most, but not all, Docker containers continued to work and I was still able to interact with them (e.g. watch stuff on Jellyfin)
- All SMB shares are unresponsive
- ssh doesn't react at all
I'm currently running unRAID version 6.12.2, so I've now started the update to 6.12.3, hoping that this will solve my problems. I've also enabled syslog and tried to put it in my downloads share.
HBoardman (Author) Posted August 1, 2023: On 7/31/2023 at 6:58 PM, JorgeB said: "Enable the syslog server and post that after a crash." Do you want the syslog file itself? I had the syslog server running already, and this is the most recent file. It was last modified just after I rebooted following its most recent crash, which I assume was at 13:37 today, as that's when the log finishes. Appreciate any help. Attachment: syslog
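Editor's sketch: the post above estimates the crash time from the point where the log stops. A minimal Python sketch of that check follows. It assumes the file uses the traditional syslog line prefix (for example "Aug  1 13:37:02 hostname ..."); the function name, the regular expression and the file path in the comment are illustrative and do not come from the thread or from Unraid itself.

```python
import re

# Assumed traditional syslog prefix: "Mon DD HH:MM:SS " at the start of a line.
SYSLOG_PREFIX = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s")


def last_log_timestamp(lines):
    """Return the timestamp of the last line with a syslog-style prefix, or None."""
    last = None
    for line in lines:
        match = SYSLOG_PREFIX.match(line)
        if match:
            last = match.group(1)
    return last


# Hypothetical usage, with "syslog" standing in for wherever the file was saved:
# with open("syslog", errors="replace") as handle:
#     print(last_log_timestamp(handle))
```

Note that the last logged line only gives the latest time the system was known to be writing logs; the actual hang may have happened somewhat later.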
HBoardman (Author) Posted August 1, 2023: On 7/31/2023 at 6:58 PM, Zoba said: "I'm running into exactly this problem. My homeassistant vm stays responsive but can't access the gui and ssh doesn't react at all. I feel like this happened only after the latest update though" I can't even access my HA VM; I lose access to everything. Mine definitely started before the latest update, and happened again a few months ago, which I seemed to fix by updating the OS.
HBoardman (Author) Posted August 1, 2023: Something else I've just noticed - I was doing some transcoding through Plex (nothing big, 1080 to 1080) and the server went down in a matter of minutes after being up. Is there anything in the logs to suggest that Plex might be the issue? It's currently running in a Docker container.
Zoba Posted August 1, 2023: Hey there, my HA and all the other stuff used to drop over the course of minutes/hours as well, but some stuff kept going (?). I didn't test how long things would take to get stuck because I'd rather use my NAS. I've now updated to the latest Unraid minor version and so far have had no stops. I've recently added tone mapping and transcoding to Jellyfin as well - so it might have to do with that? I'm using a 3400G in my system and its iGPU to transcode.
HBoardman (Author) Posted August 1, 2023: 57 minutes ago, Zoba said: "Hey there, my HA and all the other stuff used to drop over the course of minutes/hours as well, but some stuff kept going (?). Didn't test how long things would take to get stuck because i'd rather use my NAS. I've now updated to the latest unraid minor version and so far had no stops. I've recently added tone mapping and transcoding to jellyfin as well - so it might have to do with it? I'm using a 3400g in my system and its igpu to transcode." Is that 6.12.3? That's what I'm running now, so sadly that might not be my issue. I've stopped the Plex docker for now, to see if that makes any difference in stability.
JorgeB Posted August 2, 2023: Syslog has constant call traces, which looks more like a hardware issue. Try running with just 1 or 2 sticks of RAM (try different ones if there are still issues), and leave XMP disabled for testing.
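Editor's sketch: to see how often call traces show up in a saved syslog, as mentioned in the post above, one option is to scan for the "Call Trace:" marker that the Linux kernel prints at the start of a stack dump. This is a minimal Python sketch, not a diagnostic tool; the function name is hypothetical, and a hit only shows where a trace begins, not what caused it.

```python
def find_call_traces(lines):
    """Return (line_number, text) for every line containing "Call Trace:"."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if "Call Trace:" in line:
            hits.append((number, line.rstrip("\n")))
    return hits


# Hypothetical usage, with "syslog" standing in for the saved log file:
# with open("syslog", errors="replace") as handle:
#     for number, text in find_call_traces(handle):
#         print(number, text)
```

The lines just before each hit usually carry the warning or error that triggered the trace, so the returned line numbers are mainly useful as places to start reading.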
HBoardman (Author) Posted August 2, 2023: 4 hours ago, JorgeB said: "Syslog has constant call traces, looks more like a hardware issue, try running with just 1 or 2 sticks of RAM (try different one if still issues), and leave XMP disabled for testing." Interesting - would it be worth running memtest? I have 2x8GB and 2x16GB sticks in at the moment; does unRAID not like weird configurations?
itimpi Posted August 2, 2023:
Just now, HBoardman said: "Interesting - would it be worth running memtest?"
This will never do any harm. Note that passing memtest is not definitive proof that you have no memory issues, whereas failing it is proof that you do.
1 minute ago, HBoardman said: "I have 2x8GB and 2x16GB sticks in at the moment, does unRAID not like weird configurations?"
At that level Unraid is just Linux (Slackware), so it should be no more sensitive than other Linux systems.
HBoardman (Author) Posted August 2, 2023: 4 hours ago, itimpi said: "This will never do any harm Note that passing memtest is not a definitive proof that you have no memory issues, whereas failing it is. At that level Unraid is just Linux (Slackware) so should be no more sensitive than other Linux systems." Makes sense. Although interestingly, I've had the Plex docker spun down since last night, and there have been no signs of issues so far... I may migrate Plex to a VM and see if the system still remains stable. I wonder what Docker could theoretically do to affect the stability of the entire system?
Zoba Posted August 2, 2023: Okay, it happened again. Started a video with transcoding and the entire server became unresponsive after roughly 20 seconds.
HBoardman (Author) Posted August 2, 2023: 1 hour ago, Zoba said: "Okay, it happened again. Started a video with transcoding and the entire server became unresponsive after roughly 20 seconds." Mm, definitely something up with Plex. Can't tell if it's the Docker itself or something load-related when Plex starts doing some heavy lifting.
ggfools Posted August 3, 2023: you guys may be on to something, my machine became unresponsive last night while plex was doing sonic analysis on my music library.
Solution: HBoardman (Author) Posted August 3, 2023: 8 hours ago, ggfools said: "you guys may be on to something, my machine became unresponsive last night while plex was doing sonic analysis on my music library." Migrated to a Windows VM yesterday (I know Windows isn't a great choice for a "server" VM, but I really can't be bothered to argue with Linux while I need my Plex server back up ASAP), and so far so good - unRAID has been stable. Time will tell if it remains this way, but at the moment it looks like something to do with the Plex Docker container/configuration.
Zoba Posted August 3, 2023: If so, it also affects Jellyfin. Disabling transcoding but leaving tone mapping on still fails.
Zoba Posted August 5, 2023: @HBoardman did you run into any more issues? EDIT: Also what hardware do you use? An AMD iGPU like me?
HBoardman (Author) Posted August 7, 2023: On 8/5/2023 at 10:21 AM, Zoba said: "@HBoardman did you run into any more issues? EDIT: Also what hardware do you use? An amd iGPU like me?" Seems to have "fixed" it - no idea what issue the VM had, and I don't know how to troubleshoot it, but for now moving Plex to a VM seems to have made the system stable again. The unfortunate downside is running the overhead of an OS for one program, but I'd rather have my server running! Hardware is a 9600K, 48GB of RAM, 20TB across 8 disks (I know, I'm working on it).