Performance Improvements in VMs by adjusting CPU pinning and assignment



My VM is on my "unassigned device".

My CPUs show up like the following. How would I assign my VM if I want 4-6 CPUs for it?

cpu 0 / 12
cpu 1 / 13
cpu 2 / 14
cpu 3 / 15
cpu 4 / 16
cpu 5 / 17
cpu 6 / 18
cpu 7 / 19
cpu 8 / 20
cpu 9 / 21
cpu 10 / 22
cpu 11 / 23

Link to comment
Sorry if this is confusing, but here goes. You have a lot of threads but at a slower speed (2.2 GHz), and I'm not 100% sure what you want to do with your VM or whether you will want more than one VM, so I might try setting it up like this (see screenshot):

Red outline = give to UNRAID. You might want to give UNRAID more if you are going to use Dockers.
Yellow = VM1. I would give it 6 physical cores to access; because of the 2.2 GHz clock you may want to give it more.
Green = VM2, same idea as VM1.
Orange = all to VM1. You have the threads to do so if you want just one VM, and at 2.2 GHz you might find it helpful.

You still have the rest of your threads in reserve during testing if you need to give more to UNRAID, want to create more VMs, or need to give more to a VM to find the right balance for what you want to do.

I would look at isolcpus= and adding the emulatorpin cpuset= to the VM template.

[Screenshot: Capture.PNG]

Link to comment

Regarding isolcpus= and adding the emulatorpin cpuset= to the VM template:

So power down the VM, edit the XML file, and add the following:

  <cputune>
    <vcpupin vcpu='2' cpuset='14'/>
    <vcpupin vcpu='3' cpuset='15'/>
    <vcpupin vcpu='4' cpuset='16'/>
    <vcpupin vcpu='5' cpuset='17'/>
    <vcpupin vcpu='6' cpuset='18'/>
    <vcpupin vcpu='7' cpuset='19'/>
    <emulatorpin cpuset='0-12,1-3'/>
  </cputune>

Then add the following to the boot file:

append isolcpus=2,3,4,5,6,7,14,15,16,17,18,19 initrd=/bzroot

Link to comment
That looks correct, but I am not sure whether you just had a typo in your post: <emulatorpin cpuset='0-12,1-3'/> should look like <emulatorpin cpuset='0-12,1-13'/>.

With isolcpus=2,3,4,5,6,7,14,15,16,17,18,19 you are only telling UNRAID to ignore those CPUs. It still has access to 8,9,10,11,20,21,22,23 (as well as 0,1,12,13), and that's OK, just an FYI.
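To make the "what is left for UNRAID" arithmetic concrete, here is a minimal Python sketch. The function names are made up for illustration, and the 24-thread total is taken from the CPU list earlier in the thread. It builds the isolcpus= argument without spaces and lists the CPUs that stay available to the host.

```python
def isolcpus_arg(cpus):
    """Build a kernel isolcpus= argument: comma-separated, no spaces."""
    return "isolcpus=" + ",".join(str(c) for c in sorted(set(cpus)))


def remaining_cpus(total, isolated):
    """Return the CPUs that are NOT isolated, i.e. still used by the host."""
    isolated = set(isolated)
    return [c for c in range(total) if c not in isolated]


# Values from the post above: 24 logical CPUs, 12 of them isolated.
isolated = [2, 3, 4, 5, 6, 7, 14, 15, 16, 17, 18, 19]
print(isolcpus_arg(isolated))
print(remaining_cpus(24, isolated))
```

The second line printed should include 0, 1, 12 and 13 alongside 8-11 and 20-23, which is why the host still has plenty of threads left.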

Link to comment

You are not doing it correctly.  Vcpus are numbered from '0'.  Vcpus are the VM's CPUs, not the system CPUs.  You are also only assigning one thread of each thread pair.  This results in too much context switching.  You are better off assigning whole thread pairs to a VM.

 

Don't edit the XML to assign CPUs.  Use the VM editor.  Once you have done that, add the emulatorpin if necessary.

 

I think you want to do this:

 <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='4'/>
    <vcpupin vcpu='3' cpuset='14'/>
    <vcpupin vcpu='4' cpuset='15'/>
    <vcpupin vcpu='5' cpuset='16'/>
    <emulatorpin cpuset='0-12'/>
  </cputune>

 

So with the VM editor assign 2,3,4,14,15,16 cpus and let the VM editor make the settings.

 

You only need to emulatorpin a thread pair.  That's generally enough.

 

This setup assigns 3 cores and 6 threads to your VM.  I doubt you need any more cpus.

 

Re-read the first post and be sure you understand cores and threads.
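As an illustration only of the numbering rule above (vcpus count up from 0, host CPUs go in cpuset), here is a small Python sketch that prints the kind of cputune block the VM editor would be expected to produce. The helper name build_cputune is hypothetical, and this is not a suggestion to hand-edit the XML. The emulator pin is written as the comma-separated pair 0,12, which a later reply in this thread confirms is the intended form.

```python
def build_cputune(host_cpus, emulator_cpus):
    """Render a libvirt-style <cputune> block.

    vcpu numbers are the VM's own CPUs and always start at 0;
    cpuset values are the host's logical CPU numbers.
    """
    lines = ["<cputune>"]
    for vcpu, host_cpu in enumerate(host_cpus):
        lines.append(f"  <vcpupin vcpu='{vcpu}' cpuset='{host_cpu}'/>")
    emulator = ",".join(str(c) for c in emulator_cpus)
    lines.append(f"  <emulatorpin cpuset='{emulator}'/>")
    lines.append("</cputune>")
    return "\n".join(lines)


# Thread pairs 2/14, 3/15 and 4/16 for the VM, pair 0/12 for the emulator.
print(build_cputune([2, 3, 4, 14, 15, 16], [0, 12]))
```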

Link to comment

I see some people posting their XML settings to show how they are assigning CPUs.  This is very confusing, and some readers might feel they need to edit the XML directly instead of using the VM editor.  My suggestion is to show the thread pairs you are assigning, as I have done in the OP, and not the XML results.  We also need to see your total thread pairs.  For example:

Thread pairs in system:
0,4
1,5
2,6
3,7

CPU assignment to VM:
1,5
2,6
3,7

Emulator pin:
0,4

 

This way we can see your system architecture and how you are assigning the CPUs.
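A summary in that format can also be sanity-checked mechanically. The following Python sketch is only an illustration: check_assignment is a made-up helper, and treating an overlap between the emulator pin and the VM's CPUs as a problem is my assumption based on the example layout, not something stated in the posts. It flags thread pairs that are only half assigned to a VM.

```python
def check_assignment(thread_pairs, vm_cpus, emulator_cpus):
    """Return a list of human-readable problems with a CPU assignment."""
    problems = []
    vm = set(vm_cpus)
    emulator = set(emulator_cpus)
    for first, second in thread_pairs:
        # A pair should be given to the VM as a whole or not at all.
        if (first in vm) != (second in vm):
            problems.append(f"pair {first},{second} is only half assigned to the VM")
    overlap = sorted(vm & emulator)
    if overlap:
        # Assumption: the emulator pin is kept off the VM's own CPUs.
        problems.append(f"emulator pin overlaps VM CPUs: {overlap}")
    return problems


pairs = [(0, 4), (1, 5), (2, 6), (3, 7)]
print(check_assignment(pairs, [1, 5, 2, 6, 3, 7], [0, 4]))  # layout from the example
print(check_assignment(pairs, [1, 2, 3], [0, 4]))           # one thread per pair only
```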

 

I've seen other posts where a person only shows their XML CPU assignments when they have 16 or more threads and it's difficult to understand their architecture without reverse engineering.

 

I very strongly discourage editing the XML file directly.  There is generally no need to edit the XML file and the VM editor should be used.  If you do not know what you are doing, you can create real problems directly editing the XML.

Link to comment

Ok, so when I edit my VM, under logical CPUs I have 0-23.

Per your previous post I selected 2,3,4,14,15,16.

This is what the XML file shows. I only added emulatorpin cpuset='0-12' to the end of it.

Oh, is emulatorpin supposed to be 0-12 or 0,12?

<vcpu placement='static'>6</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='4'/>
    <vcpupin vcpu='3' cpuset='14'/>
    <vcpupin vcpu='4' cpuset='15'/>
    <vcpupin vcpu='5' cpuset='16'/>
    <emulatorpin cpuset='0-12'/>
  </cputune>

Link to comment

0,12
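
Why the comma matters: in a libvirt cpuset string a dash means a range and a comma separates individual CPUs. The Python sketch below is a simplified parser for illustration (expand_cpuset is a made-up name, and it does not handle the '^' exclusion syntax). It shows that '0-12' covers thirteen CPUs, including 2, 3 and 4 that are pinned to the VM, while '0,12' is just the one thread pair.

```python
def expand_cpuset(spec):
    """Expand a cpuset string such as '0-3,8' into a sorted list of CPUs.

    Simplified: supports commas and dash ranges only.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(expand_cpuset("0-12"))  # a range: every CPU from 0 through 12
print(expand_cpuset("0,12"))  # a pair: CPU 0 and its sibling thread 12
```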

Link to comment

hi dlandon

thank you for this extensive guide and all the explanations supporting it.

i just tried it on a windows VM. I'm not yet sure if it's because of the pinning, placebo :) or the fact that i pass through the entire usb controller hosting the unifying receiver - but it does "feel" snappier and without any lag...

I wanted to know if there is also a recommendation for optimizing Linux VMs (and not just Windows).

I assume it's a similar concept, but I'm wondering if there's something that needs to be done differently from Windows...

thanks

-d

Link to comment

Install the Tips and Tweaks plugin and adjust networking and disk caching.  These have caused issues for some.  Read the help for information on what to adjust.  Also read the Tips and Tweaks posts for ideas.

Link to comment

I meant to drop this here the other day for posterity: 

 

OS X Performance: cpu pinning, benchmarks, discussion and windows comparison

https://lime-technology.com/forum/index.php?topic=56139.0

 

*note - no audio/video testing was done in conjunction with these tests. Using the non-accepted settings may cause audio/video issues in windows. In previous testing, using non-paired cores in os x produced no audio/video issues and increased performance. This includes running 2 separate vm's on a core's separate threaded pairs. OS X audio issues only arose when vm's were placed on the same core threads. (more info here: https://lime-technology.com/forum/index.php?topic=51915.msg533600#msg533600)

 

Link to comment

thanks dlandon, i already did some of that as part of the optimizations for the Windows VM

to be more specific on the fine tuning, i changed only the networking part, but not the disk caching.

I read the blog post explaining the various combinations for the disk cache, but i didn't want to adjust them yet, as i wasn't sure what values would give me an improvement. My usage of VMs is not extreme, so I left them at the defaults, assuming that's good for regular usage :)

just a side note - the "fix common problems" plugin did not work for me. the installation hung for 2-3 minutes, i stopped it, then subsequent attempts to install resulted in a puzzle icon for the plugin, and clicking on the icon led to a blank page; will try to find out if it's a common problem with this plugin :)

thanks!

Link to comment

Post in the Fix Common Problems thread for help with install problems.

 

The issue with the disk cache settings is that they are set as a percentage of total memory.  The defaults were chosen for systems with small amounts of RAM, so on the many unRAID systems with large amounts of RAM they work out far too large.

Link to comment

thanks dlandon. i was able to install the fix common problems plugin without other issues (probably after some unraid restart).

I have 32 GB of RAM in my x99 work box.
- 2 VMs with GPU passthrough, one for folding (3 GB RAM), another one for regular work (8 GB RAM, maybe up to 12, if really required).
- 3 VMs for development (4 GB RAM each) - ubuntu 16.04, no GPU passthrough
- sometimes an occasional test VM with 2-4 GB RAM, no intensive work...

None of them have high I/O throughput to disk.

This box is used for work and not for data storage (i have a different box for this purpose - in my signature)

What would be an optimal setting? should i reduce from the default 10/20 to 5/10? or even less?

thanks,
-d

Link to comment

I have 32GB of RAM and I use 5/10.  I had an erratic mouse in my VM until I lowered the values.  The default settings are way too high for large amounts of RAM.
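
To put rough numbers on "way too high", here is a back-of-the-envelope Python sketch. It assumes the 10/20 and 5/10 figures are the background and foreground dirty-page percentages (the posts do not name the underlying settings), and it approximates by applying them to total RAM, as described above. The function name is hypothetical.

```python
def dirty_thresholds_gib(ram_gib, background_pct, dirty_pct):
    """Approximate how much unwritten data may sit in RAM at each threshold.

    Assumption: percentages are applied to total RAM; the kernel's own
    accounting may differ somewhat.
    """
    return ram_gib * background_pct / 100, ram_gib * dirty_pct / 100


for background_pct, dirty_pct in [(10, 20), (5, 10)]:
    background_gib, dirty_gib = dirty_thresholds_gib(32, background_pct, dirty_pct)
    print(f"{background_pct}/{dirty_pct} on 32 GB: "
          f"background flush at ~{background_gib:.1f} GiB, "
          f"hard limit at ~{dirty_gib:.1f} GiB")
```

Even at 5/10, a 32 GB box can buffer well over a gigabyte before writes start flushing, which is consistent with lowering the values on large-memory systems.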

Link to comment

that's the same behaviour i got till i did the cpu pinning based on your guide, and i also did passthrough for the entire USB controller (onboard 3.1) to the Windows VM. After that it reduced a lot, and only sometimes i could notice there's some very mild mouse lag...

 

but will try to reduce it to 5/10, and see if it makes it even better.

 

thanks!

Link to comment

 

What does everyone use to measure system latency in windows?

 

I was checking on a win10 vm I setup with 2 cores on my system (of 24 total)

 

When using DPC Latency checker, I average about 1000 with very, very few intermittent jumps to 2200. But when I set an emulator pin, the average climbs to 1500 with multiple regular jumps to 8000+

 

When using LatencyMon with no emulator pin I get (in vertical order) 250, 390, 500, 502, 76, and then eventually peg out pagefault at 82259.

 

when setting an emulator pin, i get: 380, 554, 777, 632, 894 and immediately peg hard pagefault at 170255.

 

Using emulator pin seems to cause higher numbers in both latency tests.....? -----> Is this an expected result?

Regardless of the numbers, I have zero issues using the windows vm. No laggy mouse, no audio/video problems. The vm works without any issues. Just curious what others are using to determine latency on their win vm's.

Link to comment

1812

 

i empirically test by moving the mouse across the screen, in various applications and games, when CPU load is high :)

 

I also noticed in a linux VM (connected to it via VNC), that the mouse lagged terribly, when going over some web page links - something that was affecting also the VM that initiated the VNC session.

But this was before passing through the entire usb controller with the unifying receiver attached to it and implementing the cpu pinning guides. After that, such mouse lag reduced a lot.

-d

Link to comment

I'm always a fan of "if it works, it works." I don't have any issues with latency in any of my vm's, just trying to get a better grip on what causes it with other folks and how they test for it.

 

Link to comment

maybe a small question: if i would like to test performance with isolcpus in syslinux, how would i add the line in case i already use an append for INTEL audio passthrough ?

currently it looks like this

label unRAID OS
  menu default
  kernel /bzimage
  append vfio-pci.ids=8086:a170 modprobe.blacklist=i2c_i801,i2c_smbus initrd=/bzroot

can i just add the isol append command ?

thanks ahead

Link to comment

You do not add another append line; simply add the isolcpus= setting somewhere between append and initrd=/bzroot.
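
As a sketch of what that edit amounts to, the Python snippet below inserts an isolcpus= argument into an existing append line just before initrd=. The helper name add_isolcpus is made up, and the CPU list 2,3,4,14,15,16 is only an example borrowed from earlier in the thread, since the question did not say which CPUs to isolate.

```python
def add_isolcpus(append_line, cpus):
    """Insert isolcpus=... into a syslinux append line, before initrd=."""
    tokens = append_line.split()
    if not tokens or tokens[0] != "append":
        raise ValueError("expected a line starting with 'append'")
    # Drop any existing isolcpus= so the option is not duplicated.
    tokens = [t for t in tokens if not t.startswith("isolcpus=")]
    arg = "isolcpus=" + ",".join(str(c) for c in cpus)
    # Place the new argument right before initrd=, or at the end if absent.
    index = next((i for i, t in enumerate(tokens) if t.startswith("initrd=")),
                 len(tokens))
    tokens.insert(index, arg)
    return " ".join(tokens)


line = "append vfio-pci.ids=8086:a170 modprobe.blacklist=i2c_i801,i2c_smbus initrd=/bzroot"
print(add_isolcpus(line, [2, 3, 4, 14, 15, 16]))
```

The result is still a single append line, with the existing vfio-pci and blacklist options untouched.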

Link to comment
  • 3 weeks later...

Running latest Unraid and not sure that my cpus are being isolated successfully. I get no errors etc. but if I launch my plex docker and begin streaming a video I can see the CPU load is being spread across all CPUs, including those that have been isolated. This causes massive latency spikes in my win7 VM. Should that happen? If the cores/threads are isolated should plex be able to use them?

 

*EDIT* I'm a div, I put the cpu isolation section in the wrong part of the boot config. Looks like finally I have a win7 VM I can use!

 

It just goes to show, just because a motherboard and CPU say they support VT-D/VT-X etc. it doesn't mean it will actually work very well. I was running a Z97 chipset motherboard with a Core i5, both of which support virtualisation, but latency was always all over the place and basically any VM was always unusable. Upgraded to an X99 Asrock board with a Xeon 2620 v4 and now it works great, no latency issues of any kind. Still have to isolate cores even with this setup though to get a latency-free environment.

Edited by allanp81
Link to comment
