ASUS TRX40-PRO & Threadripper 3960x


Okay guys, I'm running an ASUS TRX40-PRO with a Threadripper 3960X and 32 GB of 3200 memory.

Right now I'm just passing 10c/10t of "host" CPU and 20 GB of RAM, but I'm not impressed...

 

Any tips/tricks for getting the best performance out of a Windows VM?

Here's the current VM config:

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='2'>
  <name>Windows 10</name>
  <uuid>f0f6d8a1-ff05-65df-7cd5-69d6624a3f9b</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>20971520</memory>
  <currentMemory unit='KiB'>20971520</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>24</vcpu>
  <iothreads>2</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='12'/>
    <vcpupin vcpu='1' cpuset='36'/>
    <vcpupin vcpu='2' cpuset='13'/>
    <vcpupin vcpu='3' cpuset='37'/>
    <vcpupin vcpu='4' cpuset='14'/>
    <vcpupin vcpu='5' cpuset='38'/>
    <vcpupin vcpu='6' cpuset='15'/>
    <vcpupin vcpu='7' cpuset='39'/>
    <vcpupin vcpu='8' cpuset='16'/>
    <vcpupin vcpu='9' cpuset='40'/>
    <vcpupin vcpu='10' cpuset='17'/>
    <vcpupin vcpu='11' cpuset='41'/>
    <vcpupin vcpu='12' cpuset='18'/>
    <vcpupin vcpu='13' cpuset='42'/>
    <vcpupin vcpu='14' cpuset='19'/>
    <vcpupin vcpu='15' cpuset='43'/>
    <vcpupin vcpu='16' cpuset='20'/>
    <vcpupin vcpu='17' cpuset='44'/>
    <vcpupin vcpu='18' cpuset='21'/>
    <vcpupin vcpu='19' cpuset='45'/>
    <vcpupin vcpu='20' cpuset='22'/>
    <vcpupin vcpu='21' cpuset='46'/>
    <vcpupin vcpu='22' cpuset='23'/>
    <vcpupin vcpu='23' cpuset='47'/>
    <emulatorpin cpuset='0-1'/>
    <iothreadpin iothread='1' cpuset='2-3'/>
    <iothreadpin iothread='2' cpuset='4-5'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-4.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/f0f6d8a1-ff05-65df-7cd5-69d6624a3f9b_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vpindex state='on'/>
      <synic state='on'/>
      <stimer state='on'/>
      <reset state='on'/>
      <vendor_id state='on' value='none'/>
      <frequencies state='on'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
      <hint-dedicated state='on'/>
    </kvm>
    <vmport state='off'/>
    <smm state='on'>
      <tseg unit='MiB'>48</tseg>
    </smm>
    <ioapic driver='kvm'/>
    <vmcoreinfo state='on'/>
  </features>
  <cpu mode='host-passthrough' check='partial'>
    <topology sockets='1' cores='24' threads='1'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
    <feature policy='require' name='svm'/>
    <feature policy='require' name='apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='invtsc'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='yes'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' io='threads' iothread='1' queues='2'/>
      <source file='/mnt/user/domains/Windows 10/vdisk1.img' index='3'/>
      <backingStore/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <alias name='virtio-disk2'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' io='threads' iothread='2' queues='2'/>
      <source file='/mnt/user/domains/Windows 10/vdisk2.img' index='2'/>
      <backingStore/>
      <target dev='hdd' bus='virtio'/>
      <alias name='virtio-disk3'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/virtio-win-0.1.173.iso' index='1'/>
      <backingStore/>
      <target dev='hdb' bus='sata'/>
      <readonly/>
      <alias name='sata0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='qemu-xhci' ports='15'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0xa'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0xb'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xc'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0xd'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0xe'/>
      <alias name='pci.7'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0xf'/>
      <alias name='pci.8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x7'/>
    </controller>
    <controller type='pci' index='9' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='9' port='0x10'/>
      <alias name='pci.9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='pci' index='10' model='pcie-to-pci-bridge'>
      <model name='pcie-pci-bridge'/>
      <alias name='pci.10'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:08:d9:27'/>
      <source bridge='br0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-2-Windows 10/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <rom file='/boot/vBIOS/ASUS-ROG-STRIX-nVidia-GTX1050Ti-GP107-KVM.rom'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x46' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x46' slot='0x00' function='0x3'/>
      </source>
      <alias name='hostdev3'/>
      <address type='pci' domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>
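
A dump this long is easy to get out of sync after a few edits, so here is a rough sketch of a consistency check using only Python's standard `xml.etree.ElementTree`. The function name `check_domain` and the trimmed `SAMPLE` string are made up for illustration (4 vCPUs instead of 24, only the elements the check reads). It only compares counts and says nothing about whether the pinning itself is a good choice.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the config above: hypothetical sample with 4 vCPUs,
# keeping only the elements this check actually reads.
SAMPLE = """
<domain type='kvm'>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='12'/>
    <vcpupin vcpu='1' cpuset='36'/>
    <vcpupin vcpu='2' cpuset='13'/>
    <vcpupin vcpu='3' cpuset='37'/>
  </cputune>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='1'/>
  </cpu>
</domain>
"""


def check_domain(xml_text):
    """Return a list of problems; an empty list means the counts agree."""
    root = ET.fromstring(xml_text)
    problems = []

    # Number of vCPUs declared in <vcpu>.
    vcpus = int(root.findtext("vcpu"))

    # Every vCPU index 0..vcpus-1 should be pinned exactly once.
    pinned = [int(p.get("vcpu")) for p in root.findall("cputune/vcpupin")]
    if sorted(pinned) != list(range(vcpus)):
        problems.append(
            "vcpupin entries do not cover vCPUs 0..%d exactly once" % (vcpus - 1)
        )

    # sockets * cores * threads should equal the vCPU count.
    topo = root.find("cpu/topology")
    if topo is not None:
        total = (
            int(topo.get("sockets"))
            * int(topo.get("cores"))
            * int(topo.get("threads"))
        )
        if total != vcpus:
            problems.append(
                "topology gives %d CPUs but <vcpu> says %d" % (total, vcpus)
            )
    return problems


print(check_domain(SAMPLE) or "vcpu count, pinning and topology agree")
```

To run it against the real config, pass the text of the saved XML to `check_domain` in place of `SAMPLE`.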

 

Edited by ChewbaccaBG
config update
Link to comment

Well, it's a gaming VM. There's some stuttering and FPS drops from time to time without any config changes: on one boot it works okay, and after a reboot performance is poor.

So, I'm looking to optimize the VM as much as possible.

 

From what I see in dmesg, there are 4 NUMA regions:

[    0.000000] No NUMA configuration found
[    0.412283] pci_bus 0000:00: on NUMA node 0
[    0.414292] pci_bus 0000:20: on NUMA node 1
[    0.424251] pci_bus 0000:40: on NUMA node 2
[    0.425815] pci_bus 0000:60: on NUMA node 3

 

Link to comment

  • Watch and follow the video below to create the lstopo PNG image, then attach it.
  • Tools -> Diagnostics -> attach zip file.

Hopefully it gives a better idea of what needs to be tuned.

 

Also, did you isolate the cores for the VM?

 

 

 

Link to comment
2 minutes ago, ChewbaccaBG said:

# lstopo /mnt/user/appdata/topoligy-tr3960x.png
lstopo: error while loading shared libraries: libudev.so.0: cannot open shared object file: No such file or directory

Run this first before doing the lstopo command:

ln -s /lib64/libudev.so.1 /lib64/libudev.so.0

 

 

 

Link to comment

Only these things come to mind:

  • Isolate 24 logical cores (12 cores plus their SMT siblings) so the VM covers full dies and the load spreads evenly (20 will not spread evenly across the dies, since each CCX has 3 cores).
    • Something like 12 - 23 and 36 - 47.
    • You may even get better performance cutting it down to 12 cores. Fewer cores may lead to more consistent performance.
  • Take out all the cache + feature tuning in <cpu>, especially the cache.
    • L3 cache should work correctly with Q35 4.x and does not need emulation (emulated L3 cache is unlikely to improve performance).
    • The feature stuff was originally required as CPU was being emulated as EPYC instead of TR.
  • Take out <timer name='hpet' present='no'/> because you already have present='yes'.

  • Add <frequencies state='on'/> above </hyperv>.

  • Install Tips and Tweaks plugin and change your CPU governor to High Performance.

  • Pass through the USB controller instead of attaching USB devices. The libvirt USB sometimes causes lag under load.

 

Link to comment

I'm already passing the usb controller (ASMedia, still having trouble passing the other one).

I'll check the other stuff now and reboot the VM.

 

Here are pinned / isolated cores for the VM.

New VM (xml) config edited @ 1st post.

 

kernel /bzimage
append amd_iommu=on iommu=pt pci-stub.ids=8086:10c9 vfio-pci.ids=1022:1485,1022:149c rd.driver.pre=vfio-pci video=efifb:off isolcpus=12-23,36-47 nohz_full=12-23,36-47 rcu_nocbs=12-23,36-47 modprobe.blacklist=radeon,amdgpu,fglrx,nouveau,nvidiafb,nvidia,nvidia_drm,snd-pcsp initrd=/bzroot
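
The isolcpus range above and the vcpupin order in the XML in the first post can be cross-checked with a few lines of Python. This is only a sketch: the helper names `expand_cpu_list` and `sibling_pinning` are made up for illustration, and it assumes that host CPU N and N+24 are SMT siblings, which is what the pinning in the first post implies (confirm it with lstopo on your own box).

```python
def expand_cpu_list(spec):
    """Expand a kernel-style CPU list such as '12-23,36-47' into sorted ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_pinning(isolated, sibling_offset=24):
    """Build <vcpupin> lines with each core and its SMT sibling on adjacent vCPUs.

    ASSUMPTION: the sibling of host CPU n is n + sibling_offset. That matches
    the pairs used in this thread (12/36, 13/37, ...), not every machine.
    """
    isolated_set = set(isolated)
    # Keep only the CPUs whose assumed sibling is also isolated.
    cores = [c for c in isolated if c + sibling_offset in isolated_set]
    lines = []
    vcpu = 0
    for core in cores:
        for host_cpu in (core, core + sibling_offset):
            lines.append("<vcpupin vcpu='%d' cpuset='%d'/>" % (vcpu, host_cpu))
            vcpu += 1
    return lines


isolated = expand_cpu_list("12-23,36-47")
pins = sibling_pinning(isolated)
print(len(isolated), "isolated CPUs")
print("\n".join(pins))
```

The generated lines follow the same core/sibling interleaving as the `<cputune>` block in the first post. Note that the config pins sibling pairs to adjacent vCPUs while its `<topology>` line declares threads='1'; whether cores='12' threads='2' behaves better in the guest is something to test, not something this sketch decides.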

 

[Attached image: VM.png]

 

 

Edited by ChewbaccaBG
Link to comment

Disabled mem-ballooning, added iothreads & emulatorpin, config @ 1st post updated.

Tried following some of the tips in:

https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/

https://mathiashueber.com/qemu-troubleshooting-errors-gpu-passthrough-vm

 

Also, for some reason Task Manager in the Windows guest shows "L1 Cache: N/A". Any ideas?

 

More testing tomorrow.

Edited by ChewbaccaBG
Link to comment

[Attached image: So_far_1.png]

 

msconfig -> boot -> advanced options -> Number of processors set to 24 (total provided to the VM).

Page file -> disabled

MSI enabled on:

 

[Attached image: MSI.png]

 

More tweaks to come soon, I'm still reading a ton of posts on reddit and the libvirt manuals, testing and getting some feedback from LatencyMon.

Edited by ChewbaccaBG
Link to comment
  • 2 weeks later...

Do you have an HDMI audio device on your GTX 1050? If so, you need to set it up manually as a multifunction device in your config (see Spaceinvader One's "GPU passthrough" video on YouTube). Have you installed the VM (VirtIO) drivers in the Win10 guest? Also set the Windows power plan to High Performance, disable hibernation, etc.

Edited by Hakabe
Link to comment