Please Add NUMAD



NUMAD is a simple program that makes NUMA-related tasks for QEMU much easier. Basically, it's an automatic NUMA affinity management daemon.

 

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-tool_reference-numad

 

https://linux.die.net/man/8/numad

 

Basically, when creating a VM you can set the NUMA placement to "auto", and libvirt will use NUMAD to determine the best memory allocation for you based on your selections.
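As a rough sketch, this is what that looks like in libvirt domain XML (element names per the libvirt docs linked below; the 8-vCPU count is just an example):

  <vcpu placement='auto'>8</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>

With placement='auto', libvirt queries numad and pins the vCPUs and memory to the node set it recommends.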

 

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-numa-numa_and_libvirt

3 hours ago, bastl said:

Threadripper users would also benefit from NUMAD

I would assume the board would have to support NUMA for this to be effective? From past posts I know my board doesn't have this option (that I've found yet), so my TR is all lumped together. Then again this hasn't affected me that much, or at least to a noticeable degree, on my "daily-driving."

 

Looks interesting/cool for the folks with the know-how.

9 hours ago, Jcloud said:

I would assume the board would have to support NUMA for this to be effective? …

It's probably named something obscure; I've seen UMA/NUMA given many different names depending on where it's referenced.

FWIW, UMA mode makes life easy, albeit with a performance hit.

 

+1 for OP suggestion

Saw that earlier; the last change was almost 3 years ago, and no license is specified, which might be problematic for us.
I'm sorry for being so annoying about this, but I've been having bad GPU performance and lag in VMs, and from the looks of it my VMs' RAM is getting split between NUMA nodes even though I only pin cores that belong to one or the other CPU.
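(For anyone who wants to verify the split on their own box, a couple of standard commands should show it; numactl and numastat are stock Linux tools:)

numactl --hardware    # list NUMA nodes with their CPUs and memory
numastat -c qemu      # compact per-node memory usage for qemu processes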

Here is the original discussion about it that I know of. https://forums.unraid.net/bug-reports/prereleases/660-rc2-vm-memory-allocation-across-numa-boundaries-r151/


Just now, limetech said:

For you guys to test this out, is it sufficient to put the numad executable somewhere, say /usr/local/sbin/numad, and then you can invoke it in your 'go' file the way you see fit?

I'd be fine with that. No reason to make people suffer if it causes an issue, but this also gives us a chance to try it out. Thanks for giving us that chance if that's the route you choose.

16 minutes ago, Jerky_san said:

I'd be fine with that. …

Download from here: https://s3.amazonaws.com/dnld.lime-technology.com/test/numad

 

Probably can just keep this on your flash device; from the console do something like this:

cd /boot
wget https://s3.amazonaws.com/dnld.lime-technology.com/test/numad
./numad

To invoke from 'go' file add this line:

/boot/numad

It has numerous options, which you can get from the aforementioned 'man' page.
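For example (flags per the numad man page linked earlier; the values are just illustrative):

/boot/numad -i 5:15    # scan the system every 5 to 15 seconds
/boot/numad -u 85      # target at most 85% utilization per node
/boot/numad -i 0       # a max interval of 0 tells a running numad to terminate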

 

If you determine this is useful, we can include it with the bzroot image and create a proper 'rc.numad' script.
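(For reference, a proper 'rc.numad' would presumably follow the usual Slackware-style rc pattern; this is a hypothetical sketch, not a script that shipped:)

#!/bin/sh
# rc.numad - hypothetical start/stop wrapper for the numad daemon
case "$1" in
  start)
    /usr/local/sbin/numad          # numad forks into the background by default
    ;;
  stop)
    /usr/local/sbin/numad -i 0     # a max scan interval of 0 tells the daemon to exit
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    ;;
esac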

1 hour ago, limetech said:

Download from here: https://s3.amazonaws.com/dnld.lime-technology.com/test/numad … If you determine this is useful, we can include it with the bzroot image and create a proper 'rc.numad' script.

So I did everything like you said. In my syslog I see:

 

Jan 25 21:19:46 Tower kernel: mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl

 

The only problem is that if I try to use it on a VM, say:

 

<vcpu placement='auto'>8</vcpu>

I get "unsupported configuration: numad is not available on this host". I threw numad in the /bin folder hoping that would resolve it, but it didn't. I'll be frank in stating I don't know enough about Linux to want to tinker too far with symlinks and paths that I might not be able to revert.
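(In case it helps anyone else: my guess is libvirt searches its PATH for the numad binary, so copying it somewhere standard like the /usr/local/sbin that limetech suggested, then restarting libvirt, might clear the error. Untested sketch; the rc.libvirt path is an assumption about Unraid's layout:)

cp /boot/numad /usr/local/sbin/numad
chmod +x /usr/local/sbin/numad
/etc/rc.d/rc.libvirt stop
/etc/rc.d/rc.libvirt start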

 

Also, in my experience so far it's not great. My own XML config tweaks are much better, though I wonder if numad takes into account PCI Express devices, which would explain why it's allocating everything from node 0 even though I have CPU cores from node 0 and node 1 allocated.
 

Another thing I still don't understand is that even when I use XML to tell QEMU how I want RAM to be allocated, it just flat out ignores me. For instance, with 64GB of RAM at 34% used and the below XML, I "might" get 12GB on node 0 and 4GB on node 1:

    <numa>
      <cell id='0' cpus='0-7' memory='8388608' unit='KiB'/>
      <cell id='1' cpus='8-15' memory='8388608' unit='KiB'/>
    </numa>

Right now I have it set to the below:

    <numa>
      <cell id='0' cpus='0-7' memory='16777216' unit='KiB'/>
    </numa>

But this is how the allocation ended up:

Per-node process memory usage (in MBs) for PID 108804 (qemu-system-x86)
         Node 0 Node 1 Node 2 Node 3 Total
         ------ ------ ------ ------ -----
Huge          0      0      0      0     0
Heap          0      0      0      0     0
Stack         0      0      0      0     0
Private   11386      0   5128      0 16514
-------  ------ ------ ------ ------ -----
Total     11386      0   5128      0 16514
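(From what I've read since, <numa><cell> only defines the topology the guest sees; actually binding each cell's memory to a host node seems to need per-cell <memnode> entries under <numatune>, per the libvirt docs. Something like the below, though I haven't verified it fixes the split, and the 0/2 nodeset just mirrors where my RAM landed:)

  <numatune>
    <memory mode='strict' nodeset='0,2'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='2'/>
  </numatune>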
 


So I did exactly what @limetech said to do, then rebooted.

I went into the XML file for the VM I'm working with and added this after the cputune section:

  <numatune>
    <memory mode='preferred' nodeset='1'/>
  </numatune>

After doing that, this is what I get when running numastat -c qemu.

[screenshot: numastat -c qemu output]

 

I was gonna use the "strict" option but it didn't seem to really work.

After a little playing around I decided to move my 32GB of RAM Plex Media Server VM to Node 0, which is GPU accelerated.

I moved my 16GB of RAM Gaming VM to Node 1.

I moved my 8GB of RAM pfSense VM to Node 0.

My Ubuntu Server VM, which is 16GB of RAM and isn't mission critical for high performance, I just left as normal, without the "numatune" section added to the XML file.

Here it is now.

[screenshot: numastat -c qemu output after reassigning the VMs]

Some of the VM is getting put into the other node, but I think that's the RAM being used by the QEMU process itself, and the RAM in the node is equal to or larger than the allocated memory.

 

You can read more about how to use this at the following sites/threads:

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html-single/virtualization_tuning_and_optimization_guide/index#chap-Virtualization_Tuning_Optimization_Guide-NUMA

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-numa-numa_and_libvirt

So far everything looks to be good, but I need to do some testing on the VMs to see if it's fixed any performance issues.

On 1/26/2019 at 11:14 AM, AnnabellaRenee87 said:

So I did exactly what @limetech said to do, then rebooted. … So far everything looks to be good, but I need to do some testing on the VMs to see if it's fixed any performance issues.

Have you tried spanning two NUMA nodes? When I tried it, it was terrible; it didn't do its job properly at all and kept all the RAM on a single node.

Link to comment
6 hours ago, Jerky_san said:

Have you tried spanning two NUMA nodes? When I tried it, it was terrible; it didn't do its job properly at all and kept all the RAM on a single node.

A quick check and everything looks like it's working great.

[screenshot: numastat output showing each VM on its assigned node]

The point of NUMAD was so we could pin VMs to NUMA nodes instead of having them split between nodes, because it causes lag when something has to cross from one CPU or NUMA node to the other.

 

I think all my Docker containers are going to whichever node is available. I don't know what the exact process name is for Docker containers (docker, dockerd, containerd, etc.).

[screenshot: numastat output for the Docker processes]
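(If you want to force a container onto one node, docker run does have flags for that; the image name here is just a placeholder:)

docker run --cpuset-cpus="0-7" --cpuset-mems="0" my-image    # pin the container's CPUs and memory to node 0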

6 hours ago, AnnabellaRenee87 said:

A quick check and everything looks like it's working great. …

My understanding was that it was supposed to optimally allocate RAM based on the assigned cores, which for me it doesn't do. Say I only assigned cores from node 2; it would put all my RAM for the VM on node 0. I would also assume that if I had cores from node 0 and node 2 it would properly allocate RAM based on that, but it didn't, sadly. And for some reason Unraid or QEMU doesn't listen either, even though I define all that in the topology. Even without numad it doesn't allocate the RAM based on the assigned topology, so I have to make fake machines that suck up the right amount of RAM to balance it myself.

 

Mine rarely splits, BTW, without me pushing it to. It usually will just grab everything from node 0 even when told to get it from node 2, or as much free RAM as node 0 has.

