Please Add NUMAD

Jerky_san · October 15, 2018

NUMAD is a simple program that makes NUMA-related tasks for QEMU much easier. Basically, it's an automatic NUMA affinity management daemon.

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-tool_reference-numad

https://linux.die.net/man/8/numad

Basically, when creating a VM you can set the NUMA placement to "auto" and libvirt will use NUMAD to determine the best memory allocation for you based on your selections.

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-numa-numa_and_libvirt

AnnabellaRenee87 · November 20, 2018

Yes, please add this! Being on a dual CPU system, my VMs' memory is being divided between the two nodes.

@limetech

bastl · November 21, 2018

Threadripper users would also benefit from NUMAD

Jcloud · November 21, 2018

3 hours ago, bastl said:

Threadripper users would also benefit from NUMAD

I would assume the board would have to support NUMA for this to be effective? From past posts I know my board doesn't have this option (that I've found yet) so my TR is all lumped together. Then again this hasn't affected me that much, or at least to a noticeable degree, on my "daily-driving."

Looks interesting/cool for the folks with the know-how.

bastl · November 21, 2018

@Jcloud All X399 boards should support it. AMD Ryzen Master does exactly the same when you switch to Game Mode.

tjb_altf4 · November 22, 2018

9 hours ago, Jcloud said:

I would assume the board would have to support NUMA for this to be effective? From past posts I know my board doesn't have this option (that I've found yet) so my TR is all lumped together. Then again this hasn't affected me that much, or at least to a noticeable degree, on my "daily-driving."

Looks interesting/cool for the folks with the know-how.

It's probably named something obscure; I've seen UMA/NUMA given many different names depending on where it's referenced.

FWIW, UMA mode makes life easy, albeit with a performance hit.

+1 for OP suggestion

Edited November 22, 2018 by tjb_altf4

Chamzamzoo · November 25, 2018

+1 yes please

jordanmw · December 19, 2018

Also wanting this.

limetech · January 25, 2019

Where's the source code? Is there any kind of UI that has to be designed/implemented as well?

AnnabellaRenee87 · January 25, 2019

1 hour ago, limetech said:

Where's the source code? Is there any kind of UI that has to be designed/implemented as well?

From looking around it looks like it's here https://github.com/K1773R/numad/blob/master/numad.c

limetech · January 25, 2019

13 minutes ago, AnnabellaRenee87 said:

From looking around it looks like it's here https://github.com/K1773R/numad/blob/master/numad.c

Saw that earlier, last change was almost 3 years ago, also no license specified, which might be problematic for us.

AnnabellaRenee87 · January 25, 2019

Saw that earlier, last change was almost 3 years ago, also no license specified, which might be problematic for us.

I'm sorry for being so annoying about this, but I've been having bad GPU performance and lag in VMs, and from the looks of it my VMs' RAM is getting split between NUMA nodes even though I only pin cores that belong to one CPU or the other.

Here is the original discussion about it that I know of. https://forums.unraid.net/bug-reports/prereleases/660-rc2-vm-memory-allocation-across-numa-boundaries-r151/


Gokux · January 25, 2019

1 hour ago, limetech said:

Saw that earlier, last change was almost 3 years ago, also no license specified, which might be problematic for us.

It is LGPL 2.1

"numad is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; version 2.1."

limetech · January 26, 2019

1 hour ago, Likos said:

It is LGPL 2.1

👍

Jerky_san · January 26, 2019

1 hour ago, limetech said:

👍

Thank you very much limetech for considering this. It's definitely a huge help for us with numa. Makes me kind of sad I bit on the 2990wx when the new designs have chiplets.

limetech · January 26, 2019

For you guys to test this out, is it sufficient to put the numad executable somewhere, say /usr/local/sbin/numad and then you can invoke in your 'go' file the way you see fit?

Jerky_san · January 26, 2019

Just now, limetech said:

For you guys to test this out, is it sufficient to put the numad executable somewhere, say /usr/local/sbin/numad and then you can invoke in your 'go' file the way you see fit?

I'd be fine with that. No reason to make people suffer if it causes an issue but also gives us a chance to try it out. Thanks for giving us that chance if that's the route you chose.

limetech · January 26, 2019

16 minutes ago, Jerky_san said:

I'd be fine with that. No reason to make people suffer if it causes an issue but also gives us a chance to try it out. Thanks for giving us that chance if that's the route you chose.

Download from here: https://s3.amazonaws.com/dnld.lime-technology.com/test/numad

Probably can just keep this on your flash device, from console do something like this:

cd /boot
wget https://s3.amazonaws.com/dnld.lime-technology.com/test/numad
./numad

To invoke from 'go' file add this line:

/boot/numad

It has numerous options you can get from aforementioned 'man' page.

If you determine this is useful we can include with bzroot image and create a proper 'rc.numad' script.
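
For anyone who wants to script the test setup, here is a minimal sketch of staging the binary off the flash device before starting it. The `stage_numad` helper name is made up for illustration, and only the /boot/numad and /usr/local/sbin/numad paths come from the posts above; treat it as an untested starting point rather than an official method.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): copy the numad
# binary from the flash device to a directory on PATH and make it
# executable. The default paths are the ones mentioned in this thread.
stage_numad() {
    src="${1:-/boot/numad}"
    dest="${2:-/usr/local/sbin/numad}"
    if [ ! -f "$src" ]; then
        echo "numad binary not found at $src" >&2
        return 1
    fi
    # Print the destination path only if both copy and chmod succeed.
    cp "$src" "$dest" && chmod 755 "$dest" && echo "$dest"
}

# In the 'go' file one might then stage the binary and start the daemon:
#   stage_numad && /usr/local/sbin/numad
```

Copying first keeps the daemon on the normal search path, so calling plain `numad` from the console works afterwards.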

Jerky_san · January 26, 2019

Sweet I'll give it a try later tonight

Jerky_san · January 26, 2019

1 hour ago, limetech said:

So did everything like you said. In my syslog I see

Jan 25 21:19:46 Tower kernel: mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl

Only problem is that if I try to use it on a vm say

<vcpu placement='auto'>8</vcpu>

I get "unsupported configuration: numad is not available on this host". I threw numad in the /bin folder hoping that would resolve it, but it didn't. I'll be frank in stating I don't know enough about Linux to want to tinker too far with symlinks and paths that I might not be able to revert.

Also, at least in my experience so far, it's not great. My own XML config tweaks are much better, though I wonder if numad takes into account PCI-express devices, which would be why it's allocating everything from node0 even though I have CPU cores from node0 and node1 allocated.

Another thing I still don't understand is that even when I use XML to tell QEMU how I want RAM allocated, it just flat out ignores me. For instance, with 64GB of RAM 34% used and the XML below, I "might" get 12GB on node 0 and 4GB on node 1.

    <numa>
      <cell id='0' cpus='0-7' memory='8388608' unit='KiB'/>
      <cell id='1' cpus='8-15' memory='8388608' unit='KiB'/>
    </numa>

Or right now I have it set to the below

    <numa>
      <cell id='0' cpus='0-7' memory='16777216' unit='KiB'/>
    </numa>

But this is how the allocation ended up

Per-node process memory usage (in MBs) for PID 108804 (qemu-system-x86)
         Node 0 Node 1 Node 2 Node 3 Total
         ------ ------ ------ ------ -----
Huge          0      0      0      0     0
Heap          0      0      0      0     0
Stack         0      0      0      0     0
Private   11386      0   5128      0 16514
-------  ------ ------ ------ ------ -----
Total     11386      0   5128      0 16514
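
To put a number on how lopsided an allocation like this is, the Total row can be turned into per-node percentages. This is a rough sketch, assuming whitespace-separated numastat output with nodes numbered consecutively from 0 and the grand total in the last column; `node_shares` is a made-up name, not part of numastat.

```shell
#!/bin/sh
# Hypothetical helper: read numastat per-process output on stdin and
# print each node's share of the "Total" row as a percentage.
# Assumes column 1 is the row label, the last column is the grand total,
# and the columns in between are node0, node1, ... in order.
node_shares() {
    awk '$1 == "Total" {
        total = $NF
        if (total == 0) exit          # nothing allocated, avoid dividing by zero
        for (i = 2; i < NF; i++)
            printf "node%d %.1f%%\n", i - 2, 100 * $i / total
    }'
}

# Example use against a running VM (the PID is the one from the post above):
#   numastat -p 108804 | node_shares
```

Only the Total row is read, so the Huge/Heap/Stack/Private breakdown is ignored.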

Edited January 26, 2019 by Jerky_san

AnnabellaRenee87 · January 26, 2019

So did exactly what @limetech said to do then rebooted.

I went into my XML file for the VM I'm working with and added this to my XML after the cputune section.

  <numatune>
    <memory mode='preferred' nodeset='1'/>
  </numatune>

After doing that I get this now when running numastat -c qemu.

[Image: screenshot of numastat -c qemu output]

I was gonna use the "strict" option but it didn't seem to really work or anything.

After a little playing around I decided to move my 32GB of RAM Plex Media Server VM to my Node 0 which is GPU Accelerated.

I moved my 16GB of RAM Gaming VM to Node 1.

I moved my pfSense 8GB of RAM to Node 0.

My Ubuntu Server VM, which has 16GB of RAM and isn't mission critical for performance, I just left at the default without the "numatune" section added to the XML file.

Here it is now.

[Image: screenshot after moving the VMs]

Some of the VM is getting put in to the other node but I think that's the RAM being used for the QEMU service and the RAM in the Node is equal or larger than the allocated memory.

You can read more about how to use this in the following sites/threads.

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html-single/virtualization_tuning_and_optimization_guide/index#chap-Virtualization_Tuning_Optimization_Guide-NUMA

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-numa-numa_and_libvirt

So far everything looks to be good but I need to do some testing on the VMs to see if it's fixed any performance issues.

AnnabellaRenee87 · January 26, 2019

Sorry to reply directly like this but I wanted to separate it from my last message.

Maybe on the Dashboard have 3 or more RAM usage bars?

Like on the first bar put the total system memory usage like it is now.

Bars 2 and 3 make those show what's left on each node. (or more if a system has more memory Nodes.)

Jerky_san · January 28, 2019

On 1/26/2019 at 11:14 AM, AnnabellaRenee87 said:

Have you tried spanning two numa nodes? When I tried it.. it was terrible.. didn't do its job properly at all and kept all the ram on a single node.

AnnabellaRenee87 · January 28, 2019

6 hours ago, Jerky_san said:

Have you tried spanning two numa nodes? When I tried it.. it was terrible.. didn't do its job properly at all and kept all the ram on a single node.

A quick check and everything looks like it's working great.

[Image: screenshot]

The point of NUMAD was so we could move VMs to NUMA Nodes vs having them split between NUMA Nodes because it causes lag when something has to go between CPUs or goes from one NUMA Node to the other.

I think all my dockers are going to whichever Node is available. Don't know what the exact process is for docker containers. (docker, dockerd, containerd, etc)

Edited January 28, 2019 by AnnabellaRenee87
Grammar.

Jerky_san · January 29, 2019

6 hours ago, AnnabellaRenee87 said:

A quick check and everything looks like it's working great.

The point of NUMAD was so we could move VMs to NUMA Nodes vs having them split between NUMA Nodes because it causes lag when something has to go between CPUs or goes from one NUMA Node to the other.

I think all my dockers are going to whichever Node is available. Don't know what the exact process is for docker containers. (docker, dockerd, containerd, etc)

My understanding was that it was supposed to optimally allocate RAM based on the assigned cores, which for me it doesn't do. Say I only assigned cores from node2: it would put all the RAM for the VM on node0. I would also assume that if I had cores from node0 and node2 it would allocate RAM across both accordingly, but sadly it didn't. And for some reason Unraid or QEMU doesn't listen either, even though I define all that in the topology. Even without numad it doesn't allocate the RAM based on the assigned topology, so I have to make fake machines that suck up the right amount of RAM to balance it myself.

Mine rarely split, BTW, without me pushing it to. It usually will just grab everything from node0 even when told to get it from node2. Or as much free RAM as node0 has.
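
To double-check which host cores actually belong to which node before pinning, the kernel exposes per-node CPU lists under /sys/devices/system/node. A small sketch, assuming the usual Linux sysfs layout with one nodeN/cpulist file per node; the `list_node_cpus` name and the optional directory argument are inventions for this example.

```shell
#!/bin/sh
# Hypothetical helper: print "nodeN: <cpu list>" for every NUMA node.
# The optional argument overrides the sysfs directory, which is only
# useful for trying the function out against a fake directory tree.
list_node_cpus() {
    base="${1:-/sys/devices/system/node}"
    for dir in "$base"/node[0-9]*; do
        # Skip anything without a readable cpulist (including an unmatched glob).
        [ -r "$dir/cpulist" ] || continue
        printf '%s: %s\n' "${dir##*/}" "$(cat "$dir/cpulist")"
    done
}

# Example use on the host:
#   list_node_cpus
```

Comparing that list with the cores pinned in the VM's XML shows whether the pinned cores really sit on the node the memory was requested from.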

Edited January 29, 2019 by Jerky_san
