Graphics card poor performance on VM



Hello folks,

 

I have kind of a problem over here. First I'll tell you about my system, which has the following specifications:

 

Motherboard: Asus Z9PED-WS, Dual Socket 2011

CPU: 2x Intel Xeon E5-2670, 8 cores each at 2.6 GHz / 3.3 GHz Turbo (16 threads each, 32 threads total)

RAM: 16 GB registered ECC DDR3 1333 MHz from HP

GPU: Asus GTX 760 + Inno3d GTX 770 iChill HerculeZ

 

Unraid version: tested 6.17 / 6.18 / 6.19 and now on 6.20 Beta2

 

I tried setting up a gaming VM and ran some benchmarks comparing bare-metal performance with VM performance, assigning the following resources:

 

2 / 4 / 6 / 8 / 32 Cores

8 GB of RAM - fixed for all configs

GTX 760 / GTX 770 (only one at a time)

 

 

With any of those configs I have seen a performance hit of about 30%. I used CS:GO as my test suite since I play it a lot. Using a benchmark map that basically runs a cinematic, I was able to capture the average FPS for each config. Each config ran the benchmark 3 times and the results were averaged.

 

2 Cores + 760 - 94.3 FPS

4 Cores + 760 - 124.9 FPS

6 Cores + 760 - 127.1 FPS

8 Cores + 760 - 127.1 FPS

32 Cores + 760 - 127.3 FPS

Bare metal + 760 - 176.4 FPS

 

2 Cores + 770 - 154.3 FPS

4 Cores + 770 - 164.9 FPS

6 Cores + 770 - 171.7 FPS

8 Cores + 770 - 174.2 FPS

32 Cores + 770 - 179.1 FPS

Bare metal + 770 - 246.0 FPS
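
For anyone who wants to check the "about 30%" figure against the numbers above, here is a minimal Python sketch. The FPS values are copied from this post; the helper name `loss_percent` and the choice of the 32-core runs as the "best VM result" are assumptions made just for illustration.

```python
# FPS figures copied from the benchmark list above.
# "vm_best" is the 32-core run, which was the fastest VM result for each card.
results = {
    "GTX 760": {"vm_best": 127.3, "bare_metal": 176.4},
    "GTX 770": {"vm_best": 179.1, "bare_metal": 246.0},
}


def loss_percent(vm_fps: float, bare_fps: float) -> float:
    """Return the FPS lost inside the VM as a percentage of the bare-metal FPS."""
    return (bare_fps - vm_fps) / bare_fps * 100.0


for card, fps in results.items():
    loss = loss_percent(fps["vm_best"], fps["bare_metal"])
    print(f"{card}: {loss:.1f}% slower in the VM")
```

Even the best VM runs come out roughly 27 to 28% below bare metal on both cards, so the gap does not seem to depend on which GPU is passed through.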

 

 

Latest XML:

 

<domain type='kvm'>
  <name>Windows 8.1</name>
  <uuid>4f5d86fc-5587-d459-b26d-0451d606a8ba</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 8.x" icon="windows.png" os="windows"/>
  </metadata>
  <memory unit='KiB'>6291456</memory>
  <currentMemory unit='KiB'>6291456</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='20'/>
    <vcpupin vcpu='1' cpuset='21'/>
    <vcpupin vcpu='2' cpuset='22'/>
    <vcpupin vcpu='3' cpuset='23'/>
    <vcpupin vcpu='4' cpuset='24'/>
    <vcpupin vcpu='5' cpuset='25'/>
    <vcpupin vcpu='6' cpuset='26'/>
    <vcpupin vcpu='7' cpuset='27'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/VMs/Microsoft.Windows.8.1.x86.x64.AIO.German.iso'/>
      <target dev='hda' bus='sata'/>
      <readonly/>
      <boot order='2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/VMs/virtio-win-0.1.112-1.iso'/>
      <target dev='hdb' bus='sata'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/VMs/Windows 8.1/vdisk1.img'/>
      <target dev='hdc' bus='sata'/>
      <boot order='1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:30:37:96'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='connect'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <hostdev mode='subsystem' type='pci' managed='yes' xvga='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x1b' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x046d'/>
        <product id='0xc05b'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x04b4'/>
        <product id='0x0101'/>
      </source>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </memballoon>
  </devices>
</domain>

 

 

Any advice here? I would be very happy if someone could help.

 

THANK YOU

Link to comment

I have the same problem (980 Ti, 1x Xeon E5-2630 v3, 16 GB RAM, OS on an SSD that I mounted with Unassigned Devices). I got the best results with 4 cores and both threads pinned: the VM uses the "real" cores and the emulator uses the hyperthreaded cores. But these are just the "best" results I could get by editing the config; I still have frame drops and low FPS in some games. I did a lot of testing in Fire Strike, and it looks like a problem with CPU performance.

 

EDIT: My vCPU config

 

<cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <emulatorpin cpuset='12-15'/>
  </cputune>

 

 

  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='1'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>

 

The syslinux config (CPUs 3 and 11 are for my DSM VM):

 

label unRAID OS
  menu default
  kernel /bzimage
  append isolcpus=3,4,5,6,7,11,12,13,14,15 initrd=/bzroot

Link to comment

Whilst it's not usually recommended to share a core with the unRAID OS, I am currently doing this on my i7 until I get my new dual Xeon 2670 build up and running. My current hyperthreaded 4-core CPU (8 logical cores), passed through to the VM with unRAID also running on core 0, shows no performance loss compared to bare metal. I've played Ashes of the Singularity in DX12 and The Witcher 3 in DX11 on my AMD R9 280X, and Fire Strike benchmarks came within 1-2% of bare-metal performance.

 

30% overhead is definitely not usual. Not sure where you would go to start diagnosing this. Hopefully somebody can step in and help you out.

Link to comment

Thanks, man. I have borrowed an AMD HD 7870 GHz Edition just to test whether this is an Nvidia-only issue. I am seeing the same performance differences.

 

Have you noticed unusually high hard disk activity in windows by any chance? Leave windows performance / resource monitor open for a bit and keep an eye on it.

 

My question would then be: did you follow the guest VM post-install guide on the wiki? Before doing this I got extremely poor general performance, with my CPU constantly being chewed up. I'm not sure exactly what it was; I suspect it may have had something to do with Windows indexing.

 

- http://lime-technology.com/wiki/index.php/UnRAID_6/VM_Guest_Support

Link to comment
Have you noticed unusually high hard disk activity in windows by any chance?

 

No, disk activity is only a few kbits when idle.

 

did you follow the guest vm post install guide on the wiki

 

Yes, everything except enabling RDP (I don't need it because I'm using GPU passthrough).

Link to comment

My disk activity was only in the kb/s as well but the disk performance column on the task manager process tab was constantly showing 99%. I think it may have been disabling windows indexing that resolved this issue for me.

 

I don't have any other suggestions at this stage but will continue to keep an eye on your thread. Good luck.

Link to comment

My disk activity is at 0-1% when idle, even with indexing enabled.

 

f3dora, have you tried sending Lime Tech support a query with a pointer to one of your threads? I'm guessing they are pretty busy, but they do frequent the forums from time to time. They might be able to put you on the correct path towards a solution.

Link to comment

Alright, I have tried the tips mentioned above and still can't get any better performance out of my GPU. Disk activity seems totally normal at about 3% usage, and my CPU is idling at 17%.

 

But what I have noticed is that the GPU clock speed fluctuates constantly. Is there any way to set it to a fixed value? For Nvidia, please.

Link to comment

I unpinned the hyperthreaded cores and now my performance is a lot better! I have to run more benchmarks to test whether the problems are completely gone.

 

My new config (8-core Xeon CPU with HT):

<vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='13'/>
    <vcpupin vcpu='6' cpuset='14'/>
    <vcpupin vcpu='7' cpuset='15'/>
  </cputune>

 

<cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>

 

label unRAID OS
  menu default
  kernel /bzimage
  append isolcpus=1-7 initrd=/bzroot

 

source: https://lime-technology.com/forum/index.php?topic=49051.msg472259#msg472259

Link to comment
  • 2 weeks later...

I don't think you got it quite right.  You have set your cpus according to the following.

 

0,8
1,9
2,10
3,11
4,12 *
5,13 *
6,14 *
7,15 *

 

The "*" marks the pairs you are assigning to your VM, but you are isolating 1-7, so 0 and 8-15 are not isolated. That means CPUs 12-15, which are assigned to your VM, are not isolated, while CPUs 1-3 are isolated but used by neither unRAID nor this VM.

 

You need to isolate the cpus this way.

 

label unRAID OS
  menu default
  kernel /bzimage
  append isolcpus=4-7,12-15 initrd=/bzroot

 

This will isolate the physical cpus in pairs that are used by your VM.

 

You seem to be confusing physical CPUs 1-7 with your vCPUs. vCPU numbers are relative to the VM, whereas isolcpus takes physical (host) CPU numbers.
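
To double-check an isolcpus line against the vcpupin entries, a small script can help. This is only a sketch: `parse_cpu_list` and `check_isolation` are hypothetical helper names (not part of unRAID or libvirt), and the CPU numbers are the ones from the posts above.

```python
def parse_cpu_list(spec: str) -> set[int]:
    """Expand a kernel-style CPU list such as '4-7,12-15' into a set of ints."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_isolation(isolcpus: str, pinned: set[int]) -> tuple[list[int], list[int]]:
    """Return (pinned but not isolated, isolated but not pinned) as sorted lists."""
    isolated = parse_cpu_list(isolcpus)
    return sorted(pinned - isolated), sorted(isolated - pinned)


# Host CPUs named in the vcpupin lines of the config being discussed.
vm_pins = {4, 5, 6, 7, 12, 13, 14, 15}

print(check_isolation("1-7", vm_pins))        # ([12, 13, 14, 15], [1, 2, 3])
print(check_isolation("4-7,12-15", vm_pins))  # ([], [])
```

The first list shows pinned CPUs that unRAID can still schedule work on; the second shows CPUs that are isolated but sit idle. Note that the second list only accounts for this one VM, so CPUs isolated for another VM would also show up there.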

 

Also add emulatorpin:

 

<vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='13'/>
    <vcpupin vcpu='6' cpuset='14'/>
    <vcpupin vcpu='7' cpuset='15'/>
    <emulatorpin cpuset='0-3,8-11'/>
  </cputune>

 

This gets the emulator work off your VM cpus and onto the other cpus.
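
If you would rather generate the cputune block than type it by hand, something like the following Python sketch could work. `build_cputune` is a hypothetical helper, not an unRAID or libvirt tool; it only uses the element and attribute names that already appear in the XML in this thread, and it assumes Python 3.9+ for `ET.indent`.

```python
import xml.etree.ElementTree as ET


def build_cputune(vcpu_to_host: list[int], emulator_cpuset: str) -> str:
    """Build a <cputune> fragment: one <vcpupin> per vCPU plus one <emulatorpin>.

    vcpu_to_host[i] is the host CPU that vCPU i is pinned to.
    """
    cputune = ET.Element("cputune")
    for vcpu, host_cpu in enumerate(vcpu_to_host):
        ET.SubElement(cputune, "vcpupin", vcpu=str(vcpu), cpuset=str(host_cpu))
    ET.SubElement(cputune, "emulatorpin", cpuset=emulator_cpuset)
    ET.indent(cputune, space="  ")  # pretty-print in place (Python 3.9+)
    return ET.tostring(cputune, encoding="unicode")


# Pinning from the post above: vCPUs 0-7 on host CPUs 4-7 and 12-15,
# emulator threads on the remaining host CPUs.
print(build_cputune([4, 5, 6, 7, 12, 13, 14, 15], "0-3,8-11"))
```

Because the fragment is serialized by an XML library, every element comes out properly closed, which avoids the kind of typo that makes libvirt reject the definition. The output uses double quotes around attribute values, which libvirt accepts just like single quotes.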

Link to comment

Does not help my issues :/

 

The previous poster did not do it correctly.  Read this post very carefully and try again https://lime-technology.com/forum/index.php?topic=49051.msg472259#msg472259.

 

People are getting confused about virtual and physical cpus and what hyperthreading is all about.

 

Hyperthreading lets one physical core run two hardware threads, which the OS sees as two separate cpus, to try to get more performance out of each core.

 

A 4 core hyperthreaded processor will usually lay out like this:

0,4
1,5
2,6
3,7

 

The pairs are (0,4), (1,5), (2,6), and (3,7).  When isolating cpus you need to isolate them in pairs.  If you want to assign 2,3,6,7 to a VM, you need to isolate all four cpus.  Isolating 2,3 does not do the job.  The isolated cpus are physical cpus, not vcpus.

 

The pairs are not "a core and a hyperthread": both cpus in a pair are hardware threads of the same physical core. Cpus 3 and 7 share one core.
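
The pairing is not the same on every machine, so it is worth reading it from the host instead of guessing. A minimal Python sketch, assuming a Linux host that exposes `thread_siblings_list` under `/sys/devices/system/cpu` (run it from the unRAID console if Python is installed there; `parse_cpu_list` and `sibling_groups` are hypothetical helper names):

```python
from pathlib import Path


def parse_cpu_list(spec: str) -> list[int]:
    """Expand a kernel-style CPU list such as '0,4' or '0-1' into a sorted list."""
    cpus: set[int] = set()
    for part in spec.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_groups(sysfs_root: str = "/sys/devices/system/cpu") -> list[tuple[int, ...]]:
    """Return each physical core as a tuple of the logical CPUs that share it."""
    groups: set[tuple[int, ...]] = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        groups.add(tuple(parse_cpu_list(path.read_text())))
    return sorted(groups)


if __name__ == "__main__":
    # Prints one line per physical core, e.g. "0,4" on the 4-core layout above.
    for group in sibling_groups():
        print(",".join(str(cpu) for cpu in group))
```

Every CPU you pin to a VM should appear in isolcpus together with the other CPU on the same line of this output.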

Link to comment
  • 1 month later...

Hi Fresh, did you sort your problem out?

I have the same CPUs and motherboard and have sorted out all my speed issues. The key points were:

 

1. On the boot flash drive: isolate the cores you want to use for VMs so unRAID cannot use them.

2. I actually found hyperthreading cost me performance, so I only assigned 4 cores per VM (I previously had 6 cores, but it currently runs at lower FPS with 6, possibly because I didn't give the emulatorpin cpuset enough resources).

3. My machine gets higher FPS with the Hyper-V option on.

 

Flash drive settings I added:

append isolcpus=1-14,17-30 initrd=/bzroot

This isolates all my CPUs except 0, 15, 16 and 31, which are left for unRAID to use.

 

CPU Settings in xml

  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='4'/>
    <vcpupin vcpu='3' cpuset='5'/>
    <emulatorpin cpuset='6-22'/>
  </cputune>

 

 

Link to comment

I have tested bare metal against the VM. Although 1080p gaming is quite good, when I drop to the lowest resolution and detail to test the CPU I get at least a 30% loss of performance.

In Witcher 3 I get 146 FPS in the VM and 200 FPS on bare metal, but the video card is running at 100%. I expected 10%, but in quite a few scenarios I lose around 30%. I have all the correct settings and have tested many different configurations with the same result. I am happy with the gaming performance, but I feel the 2670 needs as much single-threaded performance as it can get, and the VM appears to have more overhead than I expected.

Link to comment