20% reduction in CPU Performance?



I am evaluating combining my workstation / gaming rig with my NAS, and I am trying to understand the performance hit of moving to a VM.

 

I am losing around 20% of my expected CPU performance in both multi-threaded and single-threaded applications.

 

My test system is:

ASRock EP2C602

2x E5-2670

32GB of RAM

GTX 960 passed through to the VM

 

When I test the system on a clean Windows build (bare metal), I get a Cinebench R15 score of around 1980, which is as expected.

 

548142843_Win10BM.thumb.JPG.a0018d733818966fdb52226ef9da69f1.JPG

 

When I move to a VM and assign it only one CPU, I would expect to get around 50% of this, as I plan to use the other CPU for other tasks. Currently I am getting 750-820 in R15 (8 cores / 16 threads), and single-core performance shows about the same reduction relative to what I expected.

 

819.JPG.15c97d7ccfb0e44bd57813c417e52ed9.JPG

 

I have isolated a complete CPU for this testing (the CPU that is directly connected to the GPU); however, I get the same results on a VNC Windows 10 install on the other CPU (without isolation). Results are just as bad if I give it all 16 cores (a score of around 1600). 4 cores / 8 threads nets 400, so scaling looks constant, but I'm just missing 400 points somewhere?
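
To put numbers on the gap, here is a rough sketch that compares each VM score quoted above with a share of the bare-metal result. It assumes the score scales linearly with the number of threads given to the VM, which is only an approximation (turbo behaviour and hyperthreading do not scale perfectly), and it assumes "all 16 cores" means all 32 threads. The function names are made up for illustration.

```python
# Cinebench R15 scores quoted in the post.
BARE_METAL_SCORE = 1980   # both CPUs: 16 cores / 32 threads
BARE_METAL_THREADS = 32

# threads given to the VM -> (lowest, highest) measured score
vm_results = {
    8: (400, 400),
    16: (750, 820),
    32: (1600, 1600),
}


def expected_score(threads):
    """Assumption: the score scales linearly with the thread count."""
    return BARE_METAL_SCORE * threads / BARE_METAL_THREADS


def shortfall_percent(threads, measured):
    """Percentage by which a measured score falls below the linear expectation."""
    expected = expected_score(threads)
    return 100 * (expected - measured) / expected


for threads, (low, high) in vm_results.items():
    best = shortfall_percent(threads, high)
    worst = shortfall_percent(threads, low)
    print(f"{threads:2d} threads: expected {expected_score(threads):.0f}, "
          f"measured {low}-{high}, shortfall {best:.0f}%-{worst:.0f}%")
```

Under that linear assumption every configuration lands roughly 17% to 24% below expectation, which matches the "around 20%" in the title and suggests the loss is proportional rather than a fixed number of points.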

 

2066342400_IsolatedCPU.jpg.f0b4d98f6722445b5404541c99b9d487.jpg

 

I have Tips and Tweaks set to performance with boost enabled. The Windows 10 power mode is set to performance.

 

The CPU looks to be boosting to 3.0 GHz (which is consistent with my bare-metal testing).

 

2119229203_CPUlookstobeboosting.JPG.cb0e5f94f4351eb2461f00c2ab81359a.JPG

 

I have played around with moving the emulator cores off the VM's CPU to see if removing that overhead helps, but this had zero impact!

 

I get the feeling I am missing something, as most of the videos on Unraid show almost bare-metal performance in most applications.

 

XML

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='5'>
  <name>WIN.10.A</name>
  <uuid>73c3fde2-1f62-2524-5410-b8c6087fa87c</uuid>
  <description>Windows 10</description>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='16'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='18'/>
    <vcpupin vcpu='4' cpuset='4'/>
    <vcpupin vcpu='5' cpuset='20'/>
    <vcpupin vcpu='6' cpuset='6'/>
    <vcpupin vcpu='7' cpuset='22'/>
    <emulatorpin cpuset='8,24'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-3.1'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/73c3fde2-1f62-2524-5410-b8c6087fa87c_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
    <cache mode='passthrough'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/mnt/disks/Unassigned_SSD/WIN.10.A/vdisk1.img' index='2'/>
      <backingStore/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <alias name='virtio-disk2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/drivers/virtio-win-0.1.160-1.iso' index='1'/>
      <backingStore/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <alias name='ide0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:01:70:a7'/>
      <source bridge='br0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-5-WIN.10.A/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x1a' function='0x0'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>
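
For anyone comparing configs, a small sketch like the following can pull the CPU layout out of a libvirt domain XML so the pinning is easier to read at a glance. It uses only Python's standard `xml.etree.ElementTree`; `summarize_pinning` and `SAMPLE_XML` are illustrative names, and the sample is a trimmed copy of the `<vcpu>`, `<cputune>` and `<cpu>` sections posted above. Note that the XML as posted defines 8 vCPUs (4 cores x 2 threads). Whether host CPU pairs such as 0/16 are hyperthread siblings depends on the host's topology, which is assumed here and not checked.

```python
import xml.etree.ElementTree as ET

# Trimmed sample with the same shape as the XML posted above.
SAMPLE_XML = """
<domain type='kvm'>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='16'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='18'/>
    <vcpupin vcpu='4' cpuset='4'/>
    <vcpupin vcpu='5' cpuset='20'/>
    <vcpupin vcpu='6' cpuset='6'/>
    <vcpupin vcpu='7' cpuset='22'/>
    <emulatorpin cpuset='8,24'/>
  </cputune>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
</domain>
"""


def summarize_pinning(xml_text):
    """Return the vCPU count, per-vCPU pins, emulator pin and topology size."""
    root = ET.fromstring(xml_text)
    vcpus = int(root.findtext("vcpu"))
    # Map each guest vCPU number to the host cpuset it is pinned to.
    pins = {int(p.get("vcpu")): p.get("cpuset")
            for p in root.findall("cputune/vcpupin")}
    emulator = root.find("cputune/emulatorpin")
    topo = root.find("cpu/topology")
    topology_threads = (int(topo.get("sockets"))
                        * int(topo.get("cores"))
                        * int(topo.get("threads")))
    return {
        "vcpus": vcpus,
        "pins": pins,
        "emulator_cpuset": emulator.get("cpuset") if emulator is not None else None,
        "topology_threads": topology_threads,
    }


summary = summarize_pinning(SAMPLE_XML)
print(summary)
```

A quick sanity check is that `vcpus` equals `topology_threads` and that the emulator cpuset does not overlap any of the vCPU pins; both hold for the XML above.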

 

 

Edited by gray squirrel
typos

I am also seeing a performance reduction with a Sandy Bridge CPU in a VM. Mine is an engineering sample mobile chip, somewhere close to a 2820QM or 2760QM. There's one CPU-Z benchmark I've seen for a 2760QM with scores of 330/1540 for single/multi. I only get 280/1440 on my CPU. Perhaps it is an inherent performance penalty of VMs on the Sandy Bridge architecture?
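
As a rough sketch, the gap between those CPU-Z numbers works out as follows. The reference score is for a 2760QM while the chip in question is an engineering sample that is only "somewhere close" to it, so this is an approximate comparison at best; `deficit_percent` is just an illustrative helper.

```python
# CPU-Z scores quoted in the post (single-thread / multi-thread).
reference = {"single": 330, "multi": 1540}   # 2760QM result found online
measured = {"single": 280, "multi": 1440}    # result inside the VM


def deficit_percent(ref, got):
    """Percentage by which the measured score is below the reference."""
    return 100 * (ref - got) / ref


deficits = {name: deficit_percent(reference[name], measured[name])
            for name in reference}

for name, pct in deficits.items():
    print(f"{name}: {pct:.1f}% below the reference")
```

That gives about 15% for single-thread and about 6.5% for multi-thread, so the single-thread gap is the larger of the two in this case.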

 

And for CPU-intensive games like GTA V, I'm not getting good performance at all. My GPU is a 980 Ti and it performs fine in non-CPU-intensive workloads. My average in GTA Online is ~40-50 fps with constant drops to 30 fps. This is at 720p low settings, with all 4c/8t passed to the VM and the CPU at ~2.9-3.1 GHz in GTA V. For reference, a stock 2600K (3.5 GHz all-core) with a 980 Ti maintains 60 fps at 1080p high settings.

 

I did notice that I get much better GTA V performance on Q35 than on i440fx. For CPU benchmarks, I don't see a difference between Q35 and i440fx.

Edited by kakashisensei
  • 1 year later...

I've noticed the same behavior on my gaming server (unRaid 6.9.2). The server ran for several months without any problems, and I did not notice any issues in various games that are not particularly demanding.

 

Now I have tried Battlefield 2042, and CPU performance is extremely bad at almost 100% load, while the GPU sits at 30%. That sometimes means under 30 FPS. I know that BF is in no way optimized for performance yet, but the game still runs better on significantly weaker systems. So I did some testing and found something similar to your problems…

 

VM (Intel Xeon E-2246G):

VM.thumb.JPG.927de1bb6f439bdf09c3446dd90039e8.JPG

 

With VMs turned off and hardly any CPU usage, unRaid shows Turbo Boost over 4700 MHz (max turbo for this CPU is 4.80 GHz):

1569276777_VMOff.JPG.72e094d7f423864d9217920341f1854b.JPG

 

Then, when the VM is started and idle, the CPU does not go over 4600 MHz:

GameOff.JPG.af118c7c93ff7af0ccb3a8d6ce7fb9d6.JPG

 

And now it's getting weird... As soon as I start a game on the VM, the CPU no longer goes over 4500 MHz, even when the load is almost 100%, as is the case with BF 2042 running:

GameOn.JPG.ff115b4156dc5425f76705a170c9b79b.JPG

 

The full turbo boost potential of 4.80 GHz is never used, and the more CPU power the VM actually needs, the lower the clock goes. Did you find out why? Does anyone have a solution?
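
For scale, a quick sketch of how far each observed clock is from the 4.80 GHz maximum. The 4700 MHz figure is treated as a lower bound because the post says "over 4700", and the labels are illustrative. Keep in mind that on Intel CPUs the maximum turbo frequency generally applies only when few cores are active, so somewhat lower clocks under all-core load are expected behaviour and not necessarily a fault.

```python
MAX_TURBO_MHZ = 4800  # maximum turbo quoted in the post

# Peak clocks reported in the post for each state (MHz).
observed_mhz = {
    "VMs off": 4700,          # "over 4700", so treated as a lower bound
    "VM idle": 4600,
    "game running": 4500,
}


def below_max_percent(mhz):
    """Percentage by which a clock is below the maximum turbo frequency."""
    return 100 * (MAX_TURBO_MHZ - mhz) / MAX_TURBO_MHZ


for state, mhz in observed_mhz.items():
    print(f"{state}: {mhz} MHz, {below_max_percent(mhz):.2f}% below max turbo")
```

The worst case here is 6.25% below maximum turbo, so the clock difference alone looks too small to explain frame rates dropping under 30 FPS.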

