VelcroBP Posted December 27, 2022
I've been experiencing intermittent freezes where the UI and all container apps become unresponsive. I am still able to connect via SSH and IPMI, and rebooting via powerdown -r restores functionality for a while. I haven't been able to correlate the freezes with any particular action by a container or service. However, just prior to the most recent freeze this morning I noticed high RAM usage in the dashboard (98%). top only accounted for ~55%, so I don't think it was actually using all of the allocated RAM, but I thought I'd mention it. I've been running a Kiwi server on another PC since the last hang, and the syslog from today is attached.
Attached: MootowerSyslogCatchAll-2022-12-27.txt
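A gap between the dashboard figure and what top attributes to processes may come down to what each number counts (page cache, or files held in RAM-backed filesystems, for example). That is an assumption, not something the attached syslog confirms. As a minimal sketch, the function below computes the share of RAM that the kernel does not consider available, using the `MemTotal` and `MemAvailable` fields of `/proc/meminfo`. The function name and its optional file argument are illustrative only.

```shell
# Sketch: percentage of RAM that is not reclaimable, based on MemAvailable.
# The optional argument exists only so the function can be pointed at a saved
# copy of /proc/meminfo; it defaults to the live file.
mem_used_pct() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' "$meminfo"
}
```

Running `mem_used_pct` over SSH during a hang, and again after a reboot, would give a figure to compare against the dashboard percentage.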
JorgeB Posted December 27, 2022
Nothing relevant in the syslog. See if you can get the diagnostics next time. You can also try booting in safe mode to see if it helps.
VelcroBP (Author) Posted December 27, 2022
Is there a way to get diagnostics via command line? Are they reset upon reboot, or are they still available on the flash drive somewhere?
(Edited December 27, 2022 by VelcroBP: clarifying question)
JorgeB Posted December 27, 2022
Click on the link.
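For reference, Unraid provides a `diagnostics` command that can be run from a console or SSH session when the web UI is unresponsive. The sketch below assumes the resulting zip is saved to the logs folder on the flash drive (`/boot/logs`); that default path and the helper name `latest_diagnostics` are assumptions for illustration.

```shell
# Sketch: print the most recently modified diagnostics archive in a folder.
# The default folder is an assumption about where the console command saves.
latest_diagnostics() {
    local dir="${1:-/boot/logs}"
    ls -t -- "$dir"/*diagnostics*.zip 2>/dev/null | head -n 1
}
```

A typical sequence would be to run `diagnostics` during the hang and then `latest_diagnostics` to find the file to copy off the server before rebooting.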
VelcroBP (Author) Posted December 29, 2022
Just had another OS freeze. The shares are still accessible, but none of the web UIs are. I was able to generate diagnostics via IPMI and have attached them.
Attached: mootower-diagnostics-20221228-1931.zip
VelcroBP (Author) Posted December 29, 2022
And here is the version created just before the powerdown reboot.
Attached: mootower-diagnostics-20221228-1940.zip
JorgeB Posted December 29, 2022
Nothing relevant logged that I can see. Try booting in safe mode to rule out plugin issues.
VelcroBP (Author) Posted December 30, 2022
I will try safe mode. But since I don't have a definitive way of forcing the issue, or any correlated events to test, how will I rule out plugin issues? Do I leave it running, and if there's no freeze-up for x number of days, assume it's a plugin?
JorgeB Posted December 30, 2022
Basically yes. If it doesn't crash, start enabling services and plugins one by one.
VelcroBP (Author) Posted December 30, 2022
Is this done by deleting all plugins (per the quote below from a different post), then re-installing them one by one and booting in normal mode? Can I leave the config folders in /plugins/ for the re-install?
Quote: "Delete/rename all *.plg files in /boot/config/plugins, then re-enable one or a few at a time."
VelcroBP (Author) Posted December 30, 2022
Just now, VelcroBP said: "then re-install one-by one"
Or can I just restore a copy of the .plg file into the /plugins/ folder? I have backed up the folder to another PC.
JorgeB Posted December 31, 2022
You can just rename all plg files, then rename them back to plg one at a time.
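The rename approach can be sketched as two small shell functions. This is a minimal sketch, not an official procedure: the directory is the `/boot/config/plugins` path quoted earlier in the thread, while the `.off` suffix, the `PLG_DIR` variable, and the function names are arbitrary choices made for illustration.

```shell
# Sketch of the rename approach. PLG_DIR defaults to the path quoted earlier
# in the thread; the ".off" suffix is an arbitrary choice for illustration.
PLG_DIR="${PLG_DIR:-/boot/config/plugins}"

# Rename every *.plg file so it no longer carries the .plg extension.
disable_all_plugins() {
    local f
    for f in "$PLG_DIR"/*.plg; do
        [ -e "$f" ] || continue      # glob matched nothing
        mv -- "$f" "$f.off"
    done
}

# Restore a single plugin by base name, e.g. "example" for example.plg.off.
enable_plugin() {
    mv -- "$PLG_DIR/$1.plg.off" "$PLG_DIR/$1.plg"
}
```

For example, `disable_all_plugins` followed later by `enable_plugin example` (a hypothetical plugin name) restores one plugin file at a time, with a reboot between steps so the change takes effect.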
VelcroBP (Author) Posted January 2
So far it has been stable after a couple of days running in safe mode. I've renamed the extension of the .plg files. Clarification on the plugin testing: is it enough to install one at a time during the current safe boot session, or do I need to reboot into normal mode after restoring each .plg extension?
JorgeB Posted January 2
You need to reboot in normal mode.
VelcroBP (Author) Posted January 14
So I disabled all plugins and have been running in normal mode since 12/30. Every couple of days I re-enabled a few plugins, with the final batch being on 1/12. Until now there had been no issues or freezes at all, and I thought all must be well, with the issue having been a plugin problem that was corrected by reinstalling. Today I was testing the Roku Jellyfin app (having recently set up a container), and upon playing a file and returning to the menu, unRaid locked up again. It was just the same as before, with all shares accessible and the console available via SSH. Only the UI and apps are not responding. I can log into the Main and container web UIs, but nothing loads. I'm attaching a diagnostics zip from during the hang (generated via console command), as well as the auto-generated post-boot one. The freeze occurred at ~12:10. The syslog doesn't seem to report anything relevant, but I can attach it if needed.
Attached: mootower-diagnostics- DURING - 20230114-1211.zip, mootower-diagnostics- POST REBOOT - 20230114-1217.zip
VelcroBP (Author) Posted January 15
I was not able to force a hang by replicating the actions with Jellyfin. It might have just coincidentally frozen while I was testing Jellyfin (the client stream?).
VelcroBP (Author) Posted January 16
Just hung up again. This time, a user was initiating a Plex remote stream that was transcoding from 1080 to SD (no idea why she would have her iOS Plex app set to SD). Attached are the diagnostics from during the freeze (~15:25) and from right after rebooting and starting the array. Any help anyone can provide in parsing these for a potential cause to troubleshoot next would be greatly appreciated. I will continue disabling one plugin at a time unless other suggestions come in. Though the fact that the two most recent incidents involved Jellyfin playback or Plex transcoding/playback makes me inclined to think it's something related to video? I'm at a loss really.
Attached: mootower-diagnostics-20230116-1527 -- DURING FREEZE.zip, mootower-diagnostics-20230116-1537 -- AFTER REBOOT and ARRAY START.zip
JorgeB Posted January 17
Unfortunately there's nothing relevant logged.
VelcroBP (Author) Posted January 17
OK, thanks for looking. I will keep going with plugins, then I'll try disabling iGPU transcoding. Just grasping at straws really. If I can't find a root cause, I hope to have funds in the next few months to rebuild my server and replace/upgrade everything but the data drives. It just needs to hang in there until then.
VelcroBP (Author) Posted January 17
I had a thought last night; I don't know if it's relevant. I have been getting CRC errors on one of my data drives. I've kept an eye on it, and recently they started happening more frequently. I've swapped the cable and the mobo port, and they continued. Yesterday I moved the drive into a different bay in the Norco 3x5, and if the count still grows then I'm assuming a bad drive. With that said about the drive, is it possible that an error communicating with a data drive, like during playback or a transcode or whatever, could cause the OS/UI to freeze? On several of my reboots from hangs, the system failed to POST due to a SMART error with that drive. I press F1 to retry and it boots normally. Just grasping at straws really. I hope to have funds in the next few months to rebuild my server and replace/upgrade everything but the data drives. It just needs to hang in there until then lol.
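One way to tell whether the CRC count is still growing after the bay change is to record the raw value of SMART attribute 199 (reported as `UDMA_CRC_Error_Count` on many SATA drives) at intervals. The sketch below assumes the usual ten-column attribute table printed by `smartctl -A`, with the raw value in the last column; the function name is illustrative and `/dev/sdX` in the usage note is a placeholder for the actual device.

```shell
# Sketch: pull the raw UDMA CRC error count (SMART attribute 199) out of
# `smartctl -A` output supplied on stdin. Assumes the raw value is column 10.
crc_error_count() {
    awk '$1 == 199 { print $10 }'
}
```

Usage would look like `smartctl -A /dev/sdX | crc_error_count`, run once now and again after a few days of activity to compare the two numbers.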
JorgeB Posted January 18
Seems unlikely, but I won't say it's impossible.
VelcroBP (Author) Posted January 20
New theory: the issue occurs when Plex is transcoding AND a Nextcloud sync operation is running? That was the system state at the time of the hang today, anyway. So far Plex has been running for all the hangs I've been present for. And the only non-Plex one was with Jellyfin?? Also new with today's freeze: the UI returned many errors about devices not being available for unmounting during the powerdown, resulting in an unclean shutdown.
Attached: mootower-diagnostics-20230120-1458 - HANG TIME.zip, mootower-diagnostics-20230120-1501 - POST BOOT.zip