I switched from Proxmox to unRaid not long ago and ran into several problems/inconveniences that I managed to overcome. I'd like to share the fixes with everyone, since they don't seem to be common knowledge.

The issues:

1. Windows VMs (Win10 1703+) show high CPU usage when idle. Pretty self-explanatory: with the default settings unRaid creates VMs with, the CPU is constantly busy servicing timer interrupts.

2. KSM is not enabled by default, so much less RAM is free/available to other services when running two or more Windows VMs at the same time. Over time, as docker containers started using more RAM, this caused the OOM (out of memory) killer to kick in and kill one of my VMs. This fix will be most useful to people with limited RAM on their servers; I only have 32GB myself and it made a huge difference for me.

3. CPU pinning is the default in unRaid. Pinning is great when you want to isolate certain cores for a specific VM, for example when unRaid is also your main PC and you want dedicated cores for the VM you use day to day to play games or whatever else you do. It is terrible for server workloads, though, especially when your server doesn't have many cores but runs a lot of containers/services/VMs, and there is no way to know which core will be loaded at any given time while others sit idle.

Solutions:

1. I stumbled upon a thread on these forums that recommended enabling the HPET timer, which seemed to resolve the issue somewhat. The problem is that HPET is an unreliable clock source and often goes out of sync. The real solution is to enable the Hyper-V Enlightenments introduced in QEMU 3.0. They are already partially enabled in unRaid by default, and this is what Proxmox uses by default for Windows VMs.

Go to the settings for your Windows VM and enable XML view in the upper right corner. We need to edit two blocks.

Add the following to the <hyperv mode='custom'> block:

    <vpindex state='on'/>
    <synic state='on'/>
    <stimer state='on'/>

Add the following to the <clock offset='localtime'> block:

    <timer name='hypervclock' present='yes'/>

In the end it should look like the sketch below. The bonus is that this reduces idle CPU usage even further compared to HPET, without any of HPET's drawbacks. Please note this ONLY applies to Windows VMs; Linux and *BSD guests already use a different paravirtualized clock source.
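For reference, here is a minimal sketch of what the two edited blocks might look like afterwards. The entries already in your config (relaxed, vapic, spinlocks, rtc, pit, hpet) are whatever unRaid generated for your particular VM, so treat those as placeholders and keep what you have; only the marked lines are the additions.

    <!-- Sketch only: surrounding entries are what unRaid typically generates
         and may differ on your system; keep whatever is already there. -->
    <features>
      <acpi/>
      <apic/>
      <hyperv mode='custom'>
        <relaxed state='on'/>
        <vapic state='on'/>
        <spinlocks state='on' retries='8191'/>
        <vpindex state='on'/>   <!-- added -->
        <synic state='on'/>     <!-- added -->
        <stimer state='on'/>    <!-- added -->
      </hyperv>
    </features>
    <clock offset='localtime'>
      <timer name='hypervclock' present='yes'/>  <!-- added -->
      <timer name='rtc' tickpolicy='catchup'/>
      <timer name='pit' tickpolicy='delay'/>
      <timer name='hpet' present='no'/>
    </clock>

Note the ordering dependency in libvirt: stimer requires synic, and synic requires vpindex, which is why all three get enabled together.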
2. unRaid ships with a kernel that has KSM (kernel samepage merging) enabled (thank you, unRaid dev team). KSM scans the memory of multiple VMs for identical pages and replaces them with a single write-protected page, which can save a lot of RAM. The more similar your VMs are, the more RAM you save, with almost no performance penalty.

To enable KSM at runtime, append the following line to /boot/config/go:

    echo 1 > /sys/kernel/mm/ksm/run

Then remove the following block from the config of every VM that should participate in KSM:

    <memoryBacking>
      <nosharepages/>
    </memoryBacking>

Let it run for an hour or two, and then check whether it's actually working (besides seeing more free RAM):

    cat /sys/kernel/mm/ksm/pages_shared

The number should be greater than 0 if it's working. If it isn't, either your VMs aren't similar enough, or your server hasn't reached the threshold of % used memory yet. In my case (a Windows 11 VM and a Windows Server 2022 VM, 8GB of RAM each) it freed up a noticeable amount of RAM. If you want to tweak how aggressively KSM scans, see the tuning sketch at the end of this post.

3. We want to disable CPU pinning completely and let the kernel deal with scheduling, distributing the load across all cores of the CPU.

Why is CPU pinning not always good? Let's assume you did your best to distribute and pin cores to different VMs. For simplicity, say we have a 2-core CPU and 4 VMs: we pin core #1 to VM1 and VM3, and core #2 to VM2 and VM4. Now it so happens that VM1 and VM3 start doing something CPU-intensive at the same time. They have to share core #1 between the two of them while core #2 does nothing at all. Without pinning, the kernel scheduler would spread that load across both cores.

Go back into the VM settings (XML view) and delete the following block:

    <cputune>
    . . .
    </cputune>

Make sure the lines

    <vcpu placement='static'>MAX_CPU_NUMBER</vcpu>

and

    <topology sockets='1' dies='1' cores='MAX_CPU_NUMBER' threads='1'/>

still have the maximum number of cores your VM is allowed to use (obviously MAX_CPU_NUMBER is the number of cores you want to limit this particular VM to). You can confirm the pinning is gone with the virsh check at the end of this post.

NOTE: if you switch back from XML view to the basic view, change some setting (it could be completely unrelated), and save, unRaid may overwrite some of these settings. In particular, I noticed it likes to reset the maximum cores assigned to the VM down to a single core. If that happens, just switch back to XML view and fix "vcpu placement" and "topology" again.

Bonus:

- Make sure you are only using VirtIO devices for storage and network (a sketch follows below).
- For "network model" pick "virtio" for better throughput ("virtio-net" is the default).
- If you have a Realtek 8125A/B network adapter and are having throughput issues, have a look at @hkall's comment below.
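As promised in the bonus list, a minimal sketch of what VirtIO storage and network devices look like in the domain XML. The disk path and bridge name here are placeholders; use whatever your VM already has. The parts that matter are bus='virtio' on the disk target and model type='virtio' on the interface.

    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/domains/MyVM/vdisk1.img'/>  <!-- placeholder path -->
      <target dev='vda' bus='virtio'/>                    <!-- virtio block device -->
    </disk>
    <interface type='bridge'>
      <source bridge='br0'/>     <!-- placeholder bridge name -->
      <model type='virtio'/>     <!-- 'virtio' instead of the default 'virtio-net' -->
    </interface>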
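And the KSM tuning sketch mentioned in solution 2. These sysfs knobs are standard kernel KSM controls, not unRaid-specific; the values below are just illustrative, not something I've benchmarked, so adjust to taste.

    # Optional KSM tuning, e.g. appended to /boot/config/go after 'echo 1 > .../run'.
    # pages_to_scan: how many pages KSM scans per wake-up (kernel default: 100).
    echo 200 > /sys/kernel/mm/ksm/pages_to_scan
    # sleep_millisecs: how long KSM sleeps between scans (kernel default: 20).
    echo 50 > /sys/kernel/mm/ksm/sleep_millisecs
    # Useful stats besides pages_shared:
    grep . /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing

Higher pages_to_scan and lower sleep_millisecs make KSM merge faster at the cost of some background CPU, so it's a trade-off.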
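Finally, the pinning check mentioned in solution 3. unRaid ships with libvirt, so virsh is available from the shell; "MyVM" is a placeholder for your VM's name. With the cputune block removed, the CPU Affinity line should show a 'y' for every host core instead of just the previously pinned ones.

    # Placeholder VM name; 'virsh list --all' shows the real names.
    virsh vcpuinfo MyVM
    # Example output per vCPU (8-core host, no pinning):
    #   VCPU:           0
    #   CPU:            3
    #   State:          running
    #   CPU Affinity:   yyyyyyyy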