December 15, 2025

Over the last 3-4 days, my Unraid box has arbitrarily soft-locked itself:

- WebUI ceases to function
- SSH will not reply
- Responds to ping
- None of the containers work externally
- The VM on it pings but is non-responsive
- Terminal operations are incredibly slow to print, and I CANNOT log in because login times out after 60s

I managed to get it to recognize things were failing (I guess), and it dumped the diag logs and rebooted itself after ... 3-4 hours of dumping the logs.

I figured "Eh, maybe it's the version" and updated to 7.2.2 as troubleshooting. I also downgraded the GPU drivers from 590 to 580 (NVIDIA card for GPU passthrough). Neither change made any tangible difference.

Last night it froze while I was using the VM. No other container was under heavy load (Plex and Jellyfin had no users that I'm aware of, and audiobookshelf is not resource intensive).

I had nearly 200 days of uptime with this sucker before this began, and now I struggle to hit 20 hours.

In the logs there are some OOM errors, but these appear to be DURING the dump window, not before it locked. It locked at approximately 10 PM EST; specifically, the entry at 21:59:07 was me attempting to connect to it because I realized it was running like a pile of dung.

Any assistance would be WILDLY appreciated. My next step is an aggressive MEMTEST session, or restricting RAM to the VM and seeing what happens.

P.S. Thanks to the Discord crew in Support for the assists; they helped me go in the right direction to find it.

tower-diagnostics-20251215-0105.zip

Edited December 17, 2025 by considerthecricket
December 15, 2025 Community Expert

ffprobe appears to be the process causing the OOM; see if you can identify the container and limit its RAM usage.
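One way to identify the container, as a rough sketch rather than an official Unraid procedure: every process started inside a Docker container carries the container's 64-character ID in its cgroup path, so reading `/proc/<pid>/cgroup` for the ffprobe PID should reveal which container owns it. The helper names below are made up for illustration, and the exact cgroup path layout is an assumption that varies with the cgroup driver in use.

```python
import re
from pathlib import Path

# Docker container IDs are 64 lowercase hex characters. Assumption: the ID
# appears somewhere in the process's cgroup path (true for the common
# "/docker/<id>" and "docker-<id>.scope" layouts).
CONTAINER_ID = re.compile(r"\b([0-9a-f]{64})\b")


def container_id_from_cgroup(cgroup_text):
    """Return the first 64-hex container ID in cgroup text, or None."""
    match = CONTAINER_ID.search(cgroup_text)
    return match.group(1) if match else None


def container_id_for_pid(pid):
    """Read /proc/<pid>/cgroup on the host and extract the container ID."""
    return container_id_from_cgroup(Path(f"/proc/{pid}/cgroup").read_text())


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python3 find_container.py <pid of ffprobe>
    print(container_id_for_pid(int(sys.argv[1])))
```

The PID can come from `pgrep ffprobe` while the process is running, and the printed ID can be matched against the output of `docker ps --no-trunc`. Once the container is known, Docker's `--memory` flag (for example `--memory=4g`, a placeholder value, added to the container's extra parameters) caps how much RAM it can take.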
December 16, 2025 Author

9 hours ago, trurl said:

I have had C-state 6 disabled since I stood the box up in 2019. I had, no joke, a 200-day uptime running 7.1.2 before this began with next to no warning.
December 16, 2025 Author

9 hours ago, JorgeB said:
ffprobe appears to be the process causing the OOM; see if you can identify the container and limit its RAM usage.

I had suspected that too, but I think ffprobe is a symptom of the root cause, especially considering my base outage began hours before the first OOM error percolated to the logs.

I've got an 18-hour uptime with the same approximate set of things running now, with the parity check still chugging along.

I know that with virtualization the guest's processes are abstracted from the hypervisor, but is there any chance some of these logs are scraped from the guest as well? If so, ffmpeg could be causing some issues with this dump, considering that when it lost its mind I was streaming a game from Sunshine running within an Ubuntu VM.

But the guest has a fixed RAM allocation and is told "No more" beyond that.
December 16, 2025 Author

I managed to kind of catch the issue this morning before it rebooted or had a panic attack. I specifically logged into the CLI from the console last night and left it up.

I do agree it is an OOM issue, but the logging was skewed in some fashion. According to today's output, the Backblaze backup spikes to consuming a whole crap ton of RAM sometime during the night (7.5 GB worth) and isn't playing nicely.

Example from logs (truncated a bit):

Dec 16 06:20:47 Tower kernel: [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Dec 16 06:20:47 Tower kernel: [ 570766] 0 570766 9772504 4346299 4344553 0 1746 35758080 0 0 qemu-system-x86
Dec 16 06:20:47 Tower kernel: [1569864] 99 1569864 2872730 1953259 1953246 13 0 22814720 0 0 bztransmit64.ex

Between the VM and Backblaze, it's an ass kicker.
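For anyone reading the same kind of dump: the `rss` column in the kernel's OOM process table is counted in pages, not bytes. A small sketch like the one below turns those rows into GiB so the biggest consumers stand out. It assumes 4 KiB pages (the usual x86-64 default), and the function name and regex are my own, fitted to the lines pasted above rather than to any official log format.

```python
import re

PAGE_SIZE = 4096  # bytes; assumption: 4 KiB pages, the usual x86-64 default

# Matches per-process rows of the OOM table, e.g.
# "... kernel: [ 570766] 0 570766 9772504 4346299 ... qemu-system-x86"
# The header row ("[ pid ] uid tgid ...") has no digits in the brackets
# and is therefore skipped.
ROW = re.compile(
    r"kernel: \[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+.*\s(?P<name>\S+)\s*$"
)


def parse_oom_rows(lines):
    """Return (name, pid, rss_gib) tuples, largest resident set first."""
    rows = []
    for line in lines:
        match = ROW.search(line)
        if match is None:
            continue  # header or unrelated syslog line
        rss_gib = int(match["rss"]) * PAGE_SIZE / 2**30
        rows.append((match["name"], int(match["pid"]), round(rss_gib, 2)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    sample = [
        "Dec 16 06:20:47 Tower kernel: [ 570766] 0 570766 9772504 4346299 "
        "4344553 0 1746 35758080 0 0 qemu-system-x86",
        "Dec 16 06:20:47 Tower kernel: [1569864] 99 1569864 2872730 1953259 "
        "1953246 13 0 22814720 0 0 bztransmit64.ex",
    ]
    for name, pid, rss_gib in parse_oom_rows(sample):
        print(f"{name:20s} pid={pid:<8d} rss={rss_gib} GiB")
```

Run against the two rows above, this puts qemu-system-x86 at roughly 16.6 GiB resident and bztransmit64.ex at roughly 7.5 GiB, which lines up with the 7.5 GB figure for Backblaze.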