June 5, 2025
Hey everyone,
I'm really hoping someone can help me figure this out. About a month ago I built a new server and migrated all the disks etc. Everything seemed fine to start with, but after a while random freezes started to occur.
I'm not sure what is causing this. It might be a new VM I've created with HW passthrough, or it might occur while a parity check is running (because of the schedule or an unclean shutdown).
The strange thing is that while the UI seems to freeze, the VMs and (most of the) dockers keep running just fine. I'm just not able to access the server anymore and am forced to perform a hard shutdown (if pressing the button for just a couple of seconds doesn't work).
I attached the syslog and diagnostics to this topic. I couldn't find any big problems, but I'm hoping you can!
PS: Yes, I'm running out of space :) Fixing that as we speak.
massivedump-diagnostics-20250605-1939.zip syslog-10.10.5.3.zip
June 5, 2025 Community Expert
Please disable mover logging, start a new persistent syslog, and post that after the next crash.
June 5, 2025 Author
I see a lot of messages like this one:
php-fpm[11936]: [WARNING] [pool www] child 3703766 exited on signal 9 (SIGKILL) after 11.864694 seconds from start
Is this problem still a thing?
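To get a feel for how often the warning quoted above occurs, the syslog can be scanned for it. The following is a minimal sketch, not an Unraid tool: the function name `summarize_sigkills` is made up for illustration, and the pattern is built only from the single log line quoted in the post, so other php-fpm message variants would be missed.

```python
import re

# Pattern derived from the one log line quoted in the post above.
SIGKILL_RE = re.compile(
    r"php-fpm\[\d+\]: \[WARNING\] \[pool (?P<pool>[^\]]+)\] child (?P<child>\d+) "
    r"exited on signal 9 \(SIGKILL\) after (?P<seconds>[\d.]+) seconds from start"
)


def summarize_sigkills(lines):
    """Return (count, average lifetime in seconds) of SIGKILLed php-fpm children."""
    lifetimes = []
    for line in lines:
        match = SIGKILL_RE.search(line)
        if match:
            lifetimes.append(float(match.group("seconds")))
    if not lifetimes:
        return 0, 0.0
    return len(lifetimes), sum(lifetimes) / len(lifetimes)


# The sample line is taken verbatim from the post.
sample = [
    "php-fpm[11936]: [WARNING] [pool www] child 3703766 exited on signal 9 "
    "(SIGKILL) after 11.864694 seconds from start",
    "some unrelated syslog line",
]
count, average = summarize_sigkills(sample)
print(count, average)
```

For a real log, pass an open file object instead of the sample list, since the function accepts any iterable of lines.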
June 6, 2025 Community Expert
12 hours ago, martijndemulder said: I see a lot of messages like this one:
In my experience, these errors can be the result of the server being close to exhausting its memory. The GUI can become extremely slow, e.g. taking 1 minute to open the dashboard. Try limiting the memory for VMs/Docker services, or adding a little more RAM.
It could also be one or more containers hogging the CPU. Try pinning only some cores to them, and leave cores 0/1 available for Unraid.
I also recommend trying a couple of other things: go to Settings - Global share settings and set the Number of fuse File Descriptors to the max, and enable this:
https://docs.unraid.net/unraid-os/release-notes/7.0.0/#excessive-flash-drive-activity-slows-the-system-down
June 13, 2025 Author
What would you define as close to exhausting memory? 80%, 90% or higher? It's running at a fairly high usage at the moment (84%), since DDR5 is pretty expensive :) But if that's the root cause, it's a no-brainer to add extra memory!
I pinned half of the CPU cores to Unraid/Docker and the other half to VMs, so that the VM wouldn't slow down the server too much when running under full load.
Once I've rebooted again, I'll have a look at the "Excessive flash drive activity slows the system down" item.
Funny thing is that the server has been running for a week now. The GUI died almost instantly, but the VMs and dockers have been working fine. That's why I haven't posted any logging yet.
Edited June 13, 2025 by martijndemulder
June 14, 2025 Community Expert
Typically above 90%, but it can happen with lower values, like above 80%.
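As a rough illustration of those thresholds, the sketch below derives a usage percentage from `free -m` style output and compares it with the 80% and 90% marks from the reply. It rests on assumptions: "in use" is taken to mean total minus the `available` column, which may not match how the Unraid dashboard computes its figure, and the names `memory_in_use_pct` and `classify` as well as the labels are made up for illustration. The sample figures are the `free -mt` output the author posts later in this thread.

```python
# Thresholds quoted in the reply above; the labels are illustrative only.
ELEVATED_PCT = 80.0
HIGH_PCT = 90.0

# Sample `free -mt` output (MiB), copied from a later post in this thread.
FREE_SAMPLE = """\
               total        used        free      shared  buff/cache   available
Mem:           63810       43644        4972        2570       18486       20165
Swap:              0           0           0
Total:         63810       43644        4972
"""


def memory_in_use_pct(free_output):
    """Return the percentage of RAM that is not reported as 'available'."""
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            total = int(fields[1])
            available = int(fields[6])  # last column of the Mem: row
            return round(100 * (total - available) / total, 1)
    raise ValueError("no 'Mem:' line found in free output")


def classify(pct):
    """Map a usage percentage onto the thresholds from the reply."""
    if pct >= HIGH_PCT:
        return "high"
    if pct >= ELEVATED_PCT:
        return "elevated"
    return "ok"


pct = memory_in_use_pct(FREE_SAMPLE)
print(f"{pct}% in use: {classify(pct)}")  # prints "68.4% in use: ok"
```

With these numbers the result sits below both thresholds.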
June 15, 2025 Author
Check! I rebooted the server today and have access to the management again. I shut down one of the VMs in order to test that theory (it frees up 6 GB of memory, which puts me at the 68% mark ATM).
Are you interested in the syslog so far? Or doesn't the syslog show memory-related issues?
June 16, 2025 Author
I should've downloaded it, but I was optimistic that the management would stay online. Well, it didn't :(
I'll try to reboot it soon and upload the syslog!
[edit] I can SSH into it via one of my UniFi network devices, but commands like "htop" won't work. So I can't SSH into the server (10.10.5.3/24) from the 10.20.5.0/24 subnet, but I can via the UniFi device, which is in the 10.10.5.0/24 net. [/edit]
syslog-16-06-2025.zip
Edited June 16, 2025 by martijndemulder
June 17, 2025 Author
I did some more troubleshooting this evening, and it seems that somehow the routing isn't finding its way back to my client network anymore.
Yes, I know I've got 2 default routes. This is because of the way my management network (10.60.5.x/24, physical NIC) and server network (10.10.5.x/24, VLAN interface) are set up. This has worked fine for the last couple of years, and that's why I didn't change it.
root@P-y-n-h:~# ip route show
default via 10.10.5.1 dev br0.10 metric 1
default via 10.60.5.1 dev br0 metric 2
10.10.5.0/24 dev shim-br0.10 proto kernel scope link src 10.10.5.3 metric 1012
10.10.5.0/24 dev br0.10 proto kernel scope link src 10.10.5.3 metric 1013
10.60.5.0/24 dev br0 proto kernel scope link src 10.60.5.3 metric 1012
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
root@P-y-n-h:~# ip route show to 10.20.5.3
root@P-y-n-h:~# ip route show to 10.10.5.3/24
10.10.5.0/24 dev shim-br0.10 proto kernel scope link src 10.10.5.3 metric 1012
10.10.5.0/24 dev br0.10 proto kernel scope link src 10.10.5.3 metric 1013
My memory usage seems fine:
root@P-y-n-h:~# free -mt
               total        used        free      shared  buff/cache   available
Mem:           63810       43644        4972        2570       18486       20165
Swap:              0           0           0
Total:         63810       43644        4972
Edited June 17, 2025 by martijndemulder
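To reason about which route a reply to the client network would take, the posted table can be fed into a small lookup model. This is a simplified sketch under stated assumptions: it only applies longest-prefix matching with the lowest metric as tie-breaker, and it ignores policy routing rules, additional routing tables and interface state. The helper names `parse_routes` and `select_route` are made up for illustration, and the sketch does not reproduce or diagnose the actual fault.

```python
import ipaddress

# Routing table as posted above.
ROUTES = """\
default via 10.10.5.1 dev br0.10 metric 1
default via 10.60.5.1 dev br0 metric 2
10.10.5.0/24 dev shim-br0.10 proto kernel scope link src 10.10.5.3 metric 1012
10.10.5.0/24 dev br0.10 proto kernel scope link src 10.10.5.3 metric 1013
10.60.5.0/24 dev br0 proto kernel scope link src 10.60.5.3 metric 1012
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
"""


def value_after(fields, key, default=None):
    """Return the token following `key` (e.g. 'dev' or 'metric'), if present."""
    return fields[fields.index(key) + 1] if key in fields else default


def parse_routes(text):
    """Turn `ip route show` lines into dicts with network, dev and metric."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        prefix = "0.0.0.0/0" if fields[0] == "default" else fields[0]
        routes.append({
            "network": ipaddress.ip_network(prefix),
            "dev": value_after(fields, "dev"),
            "metric": int(value_after(fields, "metric", 0)),
        })
    return routes


def select_route(routes, destination):
    """Pick the most specific matching route; the lowest metric breaks ties."""
    addr = ipaddress.ip_address(destination)
    candidates = [r for r in routes if addr in r["network"]]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (-r["network"].prefixlen, r["metric"]))


routes = parse_routes(ROUTES)
print(select_route(routes, "10.20.5.3")["dev"])   # -> br0.10 (default route, metric 1)
print(select_route(routes, "10.10.5.20")["dev"])  # -> shim-br0.10
```

On the live system, `ip route get 10.20.5.3` asks the kernel for its actual decision. The empty result of `ip route show to 10.20.5.3` is expected: without a `match` or `root` modifier, `to` selects only routes whose prefix is exactly the one given, and the table has no 10.20.5.3/32 entry.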
June 22, 2025 Author Solution
Apparently I was suffering from the network issues that were fixed in release 7.1.4. Problem solved!