ilarion Posted February 1, 2021 (edited)

Hi, I have a very strange problem with a Windows VM and GPU passthrough. First, this is my hardware:

MB: Jingsha/KLISSRE X99-D8, C612 chipset
CPU: E5-2678 V3, 12C/24T, 2.5-3.3 GHz
RAM: 32 GB DDR4 2133, quad channel
GPU1: RX480 8GB
GPU2: R9 290 4GB
NVMe: 240 GB
PSU: EVGA SuperNOVA 1000 G2
Unraid version: 6.8.3

The Windows VM is:

8 cores pinned and isolated, with their corresponding threads (4 cores/8 threads left for Unraid)
16 GB RAM
GPU RX480 passed to the VM (XML edited to make it a multifunction device on the same bus)
NVMe passed to the VM as a PCIe device, isolated with the VFIO-config plugin
USB 3.0 card passed to the VM as a PCIe device, isolated with the VFIO-config plugin
GPU and NVMe set to high in MSI utility V3
CPU Scaling Governor: Performance
Enable Intel Turbo/AMD Performance Boost: Yes
No other VMs or Dockers

The VM feels perfect. Unigine Heaven results:

2799 on bare metal (same Windows, same NVMe)
2646 in the VM (same Windows, same NVMe)

GTA V feels the same: the FPS is the same, with no noticeable drops in FPS or frame times.

The problem is with Battlefield 1. There is severe stutter and there are drops in frame times. In RivaTuner, the GPU graph shows that every time there is a drop in frame time, GPU load drops from 100 to about 6 percent. This happens every second, so the GPU load graph looks like a comb. CPU load stays at 60-70 percent. BF1 is set to use the maximum of 11 threads that the game engine supports.

I tried:

Moving from a vfio hard disk on the cache drive (SATA SSD) to NVMe passthrough - no difference
Changing the PCIe slot for the GPU on the motherboard - no difference
VFIO allow unsafe interrupts: yes/no - no difference
Xeon turbo boost unlock hack - no difference with or without it (except in Cinebench)
A smaller/bigger number of cores and amount of RAM
6.9.0-rc2 - no difference
Different graphics drivers - there is a difference in other games but not in BF1

Because the NVMe is passed through, I can test both in the VM and on bare metal.
On bare metal there is no stutter in BF1, with the same CPU load and the same number of cores loaded because of the game engine limitation. I just can't understand what makes this game behave so poorly in the VM, while everything else feels the same.

This is my VM XML:

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='1'>
  <name>ASROCK-VM-NVME</name>
  <uuid>bac5863c-dead-e638-e244-c94a7c331d46</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>17301504</memory>
  <currentMemory unit='KiB'>17301504</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='16'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <vcpupin vcpu='3' cpuset='17'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='18'/>
    <vcpupin vcpu='6' cpuset='7'/>
    <vcpupin vcpu='7' cpuset='19'/>
    <vcpupin vcpu='8' cpuset='8'/>
    <vcpupin vcpu='9' cpuset='20'/>
    <vcpupin vcpu='10' cpuset='9'/>
    <vcpupin vcpu='11' cpuset='21'/>
    <vcpupin vcpu='12' cpuset='10'/>
    <vcpupin vcpu='13' cpuset='22'/>
    <vcpupin vcpu='14' cpuset='11'/>
    <vcpupin vcpu='15' cpuset='23'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-4.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/bac5863c-dead-e638-e244-c94a7c331d46_VARS-pure-efi.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='8' threads='2'/>
    <cache mode='passthrough'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/virtio-win-0.1.173-2.iso' index='1'/>
      <backingStore/>
      <target dev='hdb' bus='sata'/>
      <readonly/>
      <alias name='sata0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0xa'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0xb'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xc'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0xd'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0xe'/>
      <alias name='pci.7'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x6'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:0a:aa:54'/>
      <source bridge='br0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-ASROCK-VM-NVME/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
    </input>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev3'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source startupPolicy='optional'>
        <vendor id='0x2318'/>
        <product id='0x2808'/>
        <address bus='3' device='3'/>
      </source>
      <alias name='hostdev4'/>
      <address type='usb' bus='0' port='1'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>

I am attaching diagnostics too: x99-diagnostics-20210201-0633.zip

Edited February 1, 2021 by ilarion
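For anyone who wants to double-check the pinning in an XML dump like the one above, here is a minimal Python sketch. It is not part of the original post. It reads the `<vcpupin>` entries and reports whether each guest core's threads land on the same physical host core. It assumes that the host numbers hyperthread siblings as N and N+12 (a guess based on the 12C/24T CPU and the 4/16, 5/17 pairs in the post; on a real host this should be confirmed from the Unraid CPU pinning page), that every `cpuset` is a single CPU number, and that vCPU numbers are contiguous from 0. The helper name `pinning_pairs` and the trimmed `DOMAIN_XML` sample are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed sample in the shape of the posted domain XML (first two cores only).
DOMAIN_XML = """
<domain type='kvm'>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='16'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <vcpupin vcpu='3' cpuset='17'/>
  </cputune>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='2' threads='2'/>
  </cpu>
</domain>
"""


def pinning_pairs(domain_xml, host_cores=12):
    """Return (host_cpus, same_physical_core) for each guest core.

    host_cores=12 is an ASSUMPTION: hyperthread siblings are taken to be
    host CPUs N and N + host_cores.
    """
    root = ET.fromstring(domain_xml)
    # Map guest vCPU number -> pinned host CPU (single-CPU cpusets only).
    pins = {int(p.get("vcpu")): int(p.get("cpuset")) for p in root.iter("vcpupin")}
    threads = int(root.find("./cpu/topology").get("threads"))
    report = []
    # Consecutive vCPUs are grouped into guest cores of `threads` threads each.
    for first in range(0, len(pins), threads):
        group = [pins[v] for v in range(first, first + threads)]
        same_core = len({cpu % host_cores for cpu in group}) == 1
        report.append((group, same_core))
    return report


for host_cpus, ok in pinning_pairs(DOMAIN_XML):
    print(host_cpus, "same physical core" if ok else "split across cores")
```

Under those assumptions, the pairs in the posted XML (4/16 through 11/23) would each report as sharing one physical core, which matches the `threads='2'` topology the guest sees.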
zeus83 Posted February 1, 2021

Hi, it may be worth checking this
ilarion Posted February 2, 2021 (Author)

Nope, it wasn't this, but thanks for the answer. I found a workaround for the time being: if I switch to DX11, the performance is nearly the same as on bare metal with DX11. The solution is not perfect, because on bare metal with DX12 I get much better FPS, but it is OK for now because the frames are stable, though lower. I tried giving all cores to the VM and the performance with DX12 was much better, which is very strange because CPU usage was around 20%. So there is some kind of CPU bottleneck with DX12 in this game on my system.
zeus83 Posted February 2, 2021

I haven't run Battlefield 1, but every DX12 game I have tried runs perfectly fine. Have you tried any other DX12 game? It is still worth checking that the clock timer you use is TSC, because some gaming APIs depend on this timer during draw calls.
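As a rough starting point for the timer check suggested above, the following Python sketch lists the `<timer>` entries that a libvirt domain XML declares under `<clock>`. It only shows what the VM definition asks for; it does not tell you which clock source Windows actually ends up using inside the guest, which would have to be checked there. The sample `CLOCK_XML` is a trimmed copy of the `<clock>` block from the first post, and the helper name `clock_timers` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# <clock> block as posted in the first message, wrapped in a minimal <domain>.
CLOCK_XML = """
<domain type='kvm'>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
</domain>
"""


def clock_timers(domain_xml):
    """Return {timer name: value of its 'present' attribute} from the XML."""
    root = ET.fromstring(domain_xml)
    return {t.get("name"): t.get("present") for t in root.iter("timer")}


timers = clock_timers(CLOCK_XML)
print(timers)
# The posted XML declares hypervclock and hpet only, so no tsc entry appears.
print("tsc timer declared:", "tsc" in timers)
```

For the posted configuration this reports `hypervclock` as present and `hpet` as disabled, with no explicit `tsc` timer entry.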