At my wits' end with latency and audio drops in Win10 VM w/ Threadripper 1950X


Tritech


*********************EDIT*********************************

Check page 3 for my working .xml

********************************************************

I've got a Threadripper 1950X with 32 GB RAM on an ASRock X399 Professional board. I'm passing through an NVMe controller to a Win10 VM. Just updated to RC3 today. A 1080 Ti is passed through, but I'm using the onboard device for my audio. I'm getting audio drops and constantly have to switch my microphone input in Discord (every minute or so, switching from "default" to my mic and vice versa). In game, FPS seems reasonable. I'm running the usual suspects Docker-wise (Plex, SAB, Radarr, etc.) and they all seem fine. BIOS is up to date. HVM and IOMMU are enabled, as well as NUMA. I spent the last week trying to get UMA to work and got the same results. Currently using OVMF and Q35-3.1.

 

my syslinux line:

 

append isolcpus=12-15,28-31 initrd=/bzroot

 

My gpu is on numa node 1 (the 10de:1b06).

NUMA.thumb.png.2dce468f436e0db5ff70014d8e05c461.png

 

my xml: https://pastebin.com/1Em6QQTq

 

latency.thumb.PNG.374680f27ee7250ae45133b3055526a1.PNG

 

Things I've added to the xml

 

    <emulatorpin cpuset='0'/>  -will test if adding 16 to that helps

 

 <smbios mode='host'/> - didn't notice a difference

 

Things I've tried: both SeaBIOS and OVMF; OVMF seemed more responsive, with lower latency. I've tried the same machine on i440fx-3.1 and that seemed worse as well. I've also tested most of these in new VMs, installing Windows fresh each time.

 

I've installed "Tips and Tweaks" and set 'disable flow control and NIC offload' to yes. CPU scaling governor set to performance (seems slightly better than 'on demand').

 

Also applied MSI fix for all audio devices and gpu.

 

Does anyone have any suggestions to get this thing running well?

 

 

 


 

Edited by Tritech
Link to comment

Looks like you have the same config as me. ASRock Fatal1ty X399 Professional Gaming? 1080 Ti, NVMe and onboard audio passthrough. I only had audio drops before applying the MSI fix. Make sure to run the program as administrator, and recheck after every Nvidia driver update or Windows update that the settings are still in place. For me the settings have been reset a couple of times after updates.

I'm still using i440fx as the machine type because it was the recommended one when I started with Unraid at the end of 2017, and it still works today. I've played around a lot since then and tested a couple of things. Currently my main VM uses cores from die 1 only, including the emulator pin on that die. Die 0 is used for other VMs, Dockers and so on. All the cores used by the VM are isolated from Unraid so no other processes can access them.

The CPU mode I've set is meant to decrease the CPU's cache latency inside the VM. With one of the next major QEMU updates that shouldn't be needed anymore, and the cache should be detected correctly without this tweak. I haven't had the time yet to test whether the latest Q35 versions bring any big improvements. The XML below works just fine for me, so why change it, right?

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='4'>
  <name>WIN10_NVME_UEFI</name>
  <uuid>6dce2fa4-5c94-dd9c-4bd8-cf3524279efa</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>14</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='9'/>
    <vcpupin vcpu='1' cpuset='25'/>
    <vcpupin vcpu='2' cpuset='10'/>
    <vcpupin vcpu='3' cpuset='26'/>
    <vcpupin vcpu='4' cpuset='11'/>
    <vcpupin vcpu='5' cpuset='27'/>
    <vcpupin vcpu='6' cpuset='12'/>
    <vcpupin vcpu='7' cpuset='28'/>
    <vcpupin vcpu='8' cpuset='13'/>
    <vcpupin vcpu='9' cpuset='29'/>
    <vcpupin vcpu='10' cpuset='14'/>
    <vcpupin vcpu='11' cpuset='30'/>
    <vcpupin vcpu='12' cpuset='15'/>
    <vcpupin vcpu='13' cpuset='31'/>
    <emulatorpin cpuset='8-24'/>
    <iothreadpin iothread='1' cpuset='8-24'/>
  </cputune>
  <numatune>
    <memory mode='preferred' nodeset='1'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-3.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/6dce2fa4-5c94-dd9c-4bd8-cf3524279efa_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <topology sockets='1' cores='7' threads='2'/>
    <feature policy='require' name='topoext'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='svm'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/isos/clover/spaces_win_clover.img'/>
      <backingStore/>
      <target dev='hdc' bus='sata'/>
      <boot order='1'/>
      <alias name='sata0-0-2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source dev='/dev/disk/by-id/ata-Samsung_SSD_850_EVO_1TB_S2RFNX0J606029L'/>
      <backingStore/>
      <target dev='hdd' bus='sata'/>
      <alias name='sata0-0-3'/>
      <address type='drive' controller='0' bus='0' target='0' unit='3'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/virtio-win-0.1.160-1.iso'/>
      <backingStore/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <alias name='ide0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='nec-xhci' ports='15'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='sata0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:65:2d:ab'/>
      <source bridge='br0'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/1'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-4-WIN10_NVME_UEFI/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
    </input>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x43' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <rom file='/mnt/user/Backup/vbios/Strix1080ti/AsusStrix1080TI_dump_edit.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x43' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0a' slot='0x00' function='0x3'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x41' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x046d'/>
        <product id='0xc246'/>
        <address bus='5' device='2'/>
      </source>
      <alias name='hostdev4'/>
      <address type='usb' bus='0' port='1'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x1b1c'/>
        <product id='0x1b50'/>
        <address bus='5' device='3'/>
      </source>
      <alias name='hostdev5'/>
      <address type='usb' bus='0' port='2'/>
    </hostdev>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>

 

lm.thumb.png.899ea60715ff2369e5a72cc3346b7b5b.png

Edited by bastl
screenshot attached
Link to comment

Yeah, you've replied to a few of my other threads and I really appreciate the help! It's the same board. A couple of questions after looking at your .xml.

 

<emulatorpin cpuset='8-24'/>
<iothreadpin iothread='1' cpuset='8-24'/>


- Are these just unused threads you're assigning? Wouldn't some of these be on die 0 and 1?

<numatune>
    <memory mode='preferred' nodeset='1'/>
  </numatune>

are you on numa or uma?

 

 

I assume this is the EPYC hack?


 

<cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <topology sockets='1' cores='7' threads='2'/>
    <feature policy='require' name='topoext'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='svm'/>
  </cpu>

 

 

 

Edited by Tritech
Link to comment
1 minute ago, Tritech said:

- Are these just unused threads your assigning? Wouldn't some of these be on die 0? ie 16-23?

I see I have a mistake in my config. It should be "8,24" not "8-24".

 

These two threads are the first core and its hyperthread on the second die, the die I isolated and use only for my main VM. In theory, putting the iothread pin and the emulator pin on the same die as the cores the VM uses avoids communication between the two dies and reduces latency. These two threads are used only for emulation and the I/O work. I'm not sure how big the performance difference is, but I set it up a couple of weeks ago with the idea of limiting everything this VM does to a single die.
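To make the difference between the two notations concrete, here is a minimal Python sketch of how a CPU list like the one in `emulatorpin` expands. The helper name `expand_cpuset` is made up for this illustration, and it ignores libvirt's `^` exclusion syntax.

```python
def expand_cpuset(spec: str) -> list:
    """Expand a CPU list such as '8,24' or '9-15,25-31' into sorted thread numbers.

    Illustrative helper only: '^' exclusions are not handled.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A dash means an inclusive range, so '8-24' covers 17 threads.
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            # A bare number names exactly one thread.
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(expand_cpuset("8-24"))  # every thread from 8 through 24
    print(expand_cpuset("8,24"))  # only threads 8 and 24
```

So with '8-24' the emulator and iothread are allowed to run on threads 9-15 (the VM's own pinned cores) and on 16-23, while '8,24' restricts them to the two intended threads.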

 

13 minutes ago, Tritech said:

are you on numa or uma?

NUMA. I had set the mode to strict before, but for some reason the VM always grabs a couple of MB of RAM from node 0 even if I tell it not to. A lot of people have reported this behaviour. I haven't been able to figure it out yet, so I've set it to preferred until someone comes up with a fix.
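One way to see how much memory a process really holds on each node is to add up the `N<node>=<pages>` counters in `/proc/<pid>/numa_maps`. The sketch below is a rough Python take on that, not a tested tool: the function name is made up, the sample text is invented for illustration, and it assumes a 4 KiB page when a line carries no `kernelpagesize_kB` field.

```python
import re
from collections import Counter

# Per-node page counters look like "N0=24" or "N1=1000" in numa_maps.
_NODE_RE = re.compile(r"\bN(\d+)=(\d+)\b")
_PAGESIZE_RE = re.compile(r"\bkernelpagesize_kB=(\d+)\b")


def kib_per_node(numa_maps_text: str) -> dict:
    """Sum the memory (in KiB) that each NUMA node holds for one process."""
    totals = Counter()
    for line in numa_maps_text.splitlines():
        size = _PAGESIZE_RE.search(line)
        page_kib = int(size.group(1)) if size else 4  # assumption: 4 KiB default
        for node, pages in _NODE_RE.findall(line):
            totals[int(node)] += int(pages) * page_kib
    return dict(totals)


if __name__ == "__main__":
    # Invented sample in numa_maps style, not taken from a real VM.
    sample = (
        "7f0000000000 prefer:1 anon=10 dirty=10 N0=2 N1=8 kernelpagesize_kB=4\n"
        "7f0000200000 prefer:1 anon=1 dirty=1 N1=1 kernelpagesize_kB=2048\n"
    )
    print(kib_per_node(sample))
```

On a real host the text would come from the numa_maps file of the VM's qemu process, which normally needs root to read.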

 

18 minutes ago, Tritech said:

I assume this is the epic hack?

correct

Link to comment

I figured that was a typo with the '8-24'. I applied all three of those changes, basically the whole top section of your xml and still >2000 latency.

 

https://pastebin.com/aDNwA2P5

 

You have your main card in the top PCIe slot, correct? That's node 1.

 

Do you isolcpus=8-15,24-31 or do you leave 8/24 out?

 

 

I think when I changed to numa I switched to "channel", I'd have to go back and take a look. Is there anything else in the bios to check while I'm there?

 

Edit: Double checked my bios and it was "memory interleaving" that I set to channel. I've tested the xml changes on i440 and q35, both with about the same results. Do you have SMT on?

 

 

Edited by Tritech
Link to comment

Recheck your XML, part of the EPYC section is missing:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>

With this set up and working properly, you should see in Task Manager that the CPU is an EPYC. Yours is missing that part:

  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='14' threads='1'/>
    <feature policy='require' name='topoext'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='svm'/>
  </cpu>

 

14 minutes ago, Tritech said:

You have you main card in the top pcie slot correct? which is node1.

Yes, the 1080 Ti is in the first slot, which should be directly connected to node 1. I swapped the cards around a couple of weeks ago. The second card, a 1050 Ti, is in slot 3 now. I guess I have to reboot the server and check lstopo again.
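Besides lstopo, the kernel exposes a `numa_node` attribute for each PCI device in sysfs. A small Python sketch that reads it follows; the function name and the `sysfs_root` parameter are made up for the example, and the kernel reports -1 when it does not know the node, which can happen depending on BIOS memory mode.

```python
from pathlib import Path
from typing import Optional


def pci_numa_node(address: str, sysfs_root: str = "/sys/bus/pci/devices") -> Optional[int]:
    """Return the NUMA node the kernel reports for a PCI device.

    address is the full form such as '0000:43:00.0'. Returns None if the
    device or attribute is missing; -1 means the kernel has no node info.
    """
    node_file = Path(sysfs_root) / address / "numa_node"
    try:
        return int(node_file.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # 0000:43:00.0 is the GPU address used in the XML earlier in the thread.
    print(pci_numa_node("0000:43:00.0"))
```

If this prints 1 for the GPU, the card and the pinned cores are on the same node; None just means the address does not exist on that machine.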

 

18 minutes ago, Tritech said:

Do you isolcpus=8-15,24-31 or do you leave 8/24 out?

Currently i have set it to

isolcpus=9-15,25-31

Not exactly sure what Unraid does if I isolate cores 8 and 24 too. Maybe something else to test.
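A quick way to cross-check that every thread pinned to a vCPU is also covered by `isolcpus` is to compare the two lists. The Python sketch below does that for a domain XML string and a kernel command line; all function names are made up for the illustration, and it assumes a plain CPU list after `isolcpus=` (no flag prefixes such as `domain,` and no `^` exclusions).

```python
import re
import xml.etree.ElementTree as ET


def expand_cpuset(spec: str) -> set:
    """Expand a CPU list such as '9-15,25-31' into a set of thread numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolated_cpus(cmdline: str) -> set:
    """Return the threads named by isolcpus= on a kernel command line."""
    match = re.search(r"(?:^|\s)isolcpus=(\S+)", cmdline)
    return expand_cpuset(match.group(1)) if match else set()


def pinned_vcpu_threads(domain_xml: str) -> set:
    """Collect the host threads referenced by <vcpupin> elements."""
    root = ET.fromstring(domain_xml)
    cpus = set()
    for pin in root.findall("./cputune/vcpupin"):
        cpus |= expand_cpuset(pin.get("cpuset", ""))
    return cpus


def not_isolated(domain_xml: str, cmdline: str) -> set:
    """Threads that run vCPUs but are still available to the host scheduler."""
    return pinned_vcpu_threads(domain_xml) - isolated_cpus(cmdline)


if __name__ == "__main__":
    xml = (
        "<domain><cputune>"
        "<vcpupin vcpu='0' cpuset='9'/><vcpupin vcpu='1' cpuset='25'/>"
        "<emulatorpin cpuset='8,24'/>"
        "</cputune></domain>"
    )
    print(not_isolated(xml, "append isolcpus=9-15,25-31 initrd=/bzroot"))
    print(not_isolated(xml, "append isolcpus=12-15,28-31 initrd=/bzroot"))
```

The emulator pin is left out of the check on purpose, since threads 8 and 24 are not isolated in this setup.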

 

20 minutes ago, Tritech said:

I think when I changed to numa I switched to "channel", I'd have to go back and take a look. Is there anything else in the bios to check while I'm there?

Channel should be the same as what I've set. I tested all the different modes and ran a couple of memory latency, bandwidth and benchmark tests to confirm which mode does what. That was on one of the older 2.** BIOS versions; I'm currently running 3.30. There is a newer 3.50 I haven't checked yet. Besides this setting, enabling IOMMU, SR-IOV Support and SVM Mode, enabling the memory XMP profile and a slight overclock, everything is on default.

Link to comment
4 minutes ago, david279 said:

Why are you using sata as the bus for the 850 evo? That can lead to performance loss. Use virtio.

 

I don't have an EVO in this machine. It's a WD Black, and it's not using any emulation. The entire NVMe controller is passed through and Windows is installed as on a normal drive. I can dual boot this to literal bare metal if I need to.

 

<hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x41' slot='0x00' function='0x0'/>
      </source>
     </hostdev>

Basically the way you would pass through any USB controller, etc.

 

 

Unfortunately, none of these fixes worked. I'm basically in the same boat, but now it says I have an EPYC. The only differences I can see between your setup and mine are that I'm on 6.7rc3 and BIOS 3.5. Do you have SMT set to auto/enabled (I don't remember how it was worded)?

 

My bios setup was basically yours, xmp the memory (had to enable oc mode on the ram to get the xmp to take), iommu, SR-IOV and SVM. I'm still on stock clocks for now, I have a working OC for windows, but trying to take that out of the equation.

 

*sigh*

Link to comment

@Tritech Some other settings that might not be at their defaults:

 

CSM - disabled

Launch PXE OpROM Policy - Legacy only

Launch Storage OpROM Policy - Legacy only

Launch Video OpROM Policy - Legacy only

 

Fast Boot - disabled

Secure Boot - disabled

 

SD Configuration Mode - disabled

ACS enabled - auto

 

NVMe Raid Mode - disabled

 

ACPI HPET Table - enabled

 

Deep Sleep - disabled

 

AMD fTPM switch - disabled

SMT Mode - auto

 

 

Link to comment

Hold my beer, I'm going in!


BTW, the only oddity I see in the windows log when passing through the audio card is :

Cannot reset device 0000:09:00.3, depends on group 16 which is not owned.

09:00.3 is the audio card. I checked group 16 and that's:

 

AMD Zeppelin/Renoir PCIe Dummy Function | Non-Essential Instrumentation (09:00.0)

so I tried to pass that and now

 

Cannot reset device 0000:09:00.0, depends on group 17

and group 17 is a sata controller... so thats not gonna work.

 

audio still works.
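To see at a glance which devices share a group before deciding what to pass through, the groups can be read straight from sysfs. This is a minimal Python sketch assuming the usual `/sys/kernel/iommu_groups/<n>/devices/<address>` layout; the function name and the `root` parameter are made up for the example.

```python
from pathlib import Path


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> dict:
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled or not a Linux host
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for number, devices in iommu_groups().items():
        print(number, ", ".join(devices))
```

Any group that lists more than one device is a candidate for the kind of reset dependency shown in the log message above.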

 

 

Link to comment

I have the same group type on my Ryzen motherboard. When I tried passing through the onboard audio I would get audio lag and other audio oddities. This is with 8 threads isolated. I just bit the bullet and started using a USB audio device. The fact that the onboard audio is lumped in with some other devices may by itself lead to issues. I use a cheap Roccat Juke plugged into a USB hub connected to an onboard controller. Best 14.99 I could spend.

Sent from my SM-N960U using Tapatalk

Link to comment

I was already waiting for that question 😂 I've had that same error from the beginning.

Cannot reset device 0000:0a:00.3, depends on group 18 which is not owned.

Group 18 for me is the same as your group 16

[1022:1455] 0a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function

I never tried to pass that through because I never had audio issues with the onboard device inside the VM. The VM has been running for almost 10 hours now with an online radio playing in the background and not one single audio drop or lag.

Edited by bastl
Link to comment

Well, it's fucked good and proper. CSM disabled is very bad.

 

2019-02-11T13:59:17.121295Z qemu-system-x86_64: vfio_region_write(0000:42:00.0:region3+0x140f8, 0x43192d0d,8) failed: Device or resource busy

 

spammed the log 100% full in a few seconds, screen garbled. I've undone every change I made, pretty sure, and now every vm is getting error 43 on the video card.

 

 

Link to comment

Just to throw it out there, I've been chasing this damn unicorn since the 2990WX came out. On the RCs of Unraid my sound card literally just drops out for no reason. I have latency down, but it will randomly spike if you let it run long enough. Plus ASUS doesn't seem to care about my board anymore (Zenith): they haven't released the new AGESA update, and the Threadripper 2 update broke 3200 RAM speed.

Link to comment
1 hour ago, bastl said:

@Tritech I will check the next days if it's the newest BIOS causing it.

Thanks man, I figured it out. Well the most recent issue. Disabling CSM only lets you boot unraid USB in UEFI. Evidently it shits the bed when you do so. I managed to get into a vm with csm disabled and latency was great. Not sure if that was just due to no nvidia drivers loaded, or thats actually the fix. Can you actually boot and USE unraid if it boots as UEFI?

 

CSM - disabled = Big Problems

Launch PXE OpROM Policy - Legacy only =all network boot stuff disabled

Launch Storage OpROM Policy - Legacy only

Launch Video OpROM Policy - Legacy only

 

Fast Boot - disabled = yep

Secure Boot - disabled =yep

 

SD Configuration Mode - disabled = there were two of these "sd configuration mode, and eMMC/sd configuration" (currently reenabled to get shit working again, will test in a few) EDIT: disabled both

ACS enabled - auto = think it was auto (will check on this next reboot)

 

NVMe Raid Mode - disabled = yep

 

ACPI HPET Table - enabled = yep

 

Deep Sleep - disabled = didn't see it but anything suspend related was off

 

AMD fTPM switch - disabled = yep

SMT Mode - auto  = yep

 

also to note the choices for "memory interleaving"  are "none, channel, die, socket or auto". I selected channel.

 

 

 

 

Edited by Tritech
updated for future reference
Link to comment
5 minutes ago, Tritech said:

Thanks man, I figured it out. Well the most recent issue. Disabling CSM only lets you boot unraid USB in UEFI. Evidently it shits the bed when you do so. I managed to get into a vm with csm disabled and latency was great. Not sure if that was just due to no nvidia drivers loaded, or thats actually the fix. Can you actually boot and USE unraid if it boots as UEFI?

For me, I can boot Unraid in UEFI mode but my Nvidia GPU won't start for a VM.

Link to comment
2 hours ago, Jerky_san said:

Just to throw it out there I've been chasing this damn unicorn since the 2990wx came out. On the RC's of unraid my sound card literally just drops out for not reason. I have latency down but it will randomly spike if you let it run long enough. Plus ASUS doesn't seem to care about my board anymore (Zenith) and hasn't released the new AGESIS update along with breaking 3200 ram speed with the threadripper 2 update.

It's crazy that the Zenith isn't getting support. That was the flagship Gen1 board.

Link to comment
8 minutes ago, Tritech said:

It's crazy that the Zenith isn't getting support. That was the flagship Gen1 board.

Yeah, they honestly don't seem to care. The new Intels are out and they're all on that now. There hasn't been an update since November, and the only reason that update happened was that the board wouldn't boot with two 2080s. Back then, on the forums I was on, there was an actual ASUS employee who would listen to our concerns and problems, but he quit his job and since then it's been radio silence from ASUS. He was our only way in and was amazing. He would even make us custom BIOSes sometimes to fix issues before ASUS released fixes publicly.

 

Btw tried UEFI boot for shits and giggles and below is what I get.. Thats my 1080ti.. Simply doesn't even map now.

 

2019-02-11T17:06:39.233906Z qemu-system-x86_64: -device vfio-pci,host=41:00.0,id=hostdev0,bus=pci.0,addr=0x3,romfile=/mnt/disk1/domains/1080ti.rom: Failed to mmap 0000:41:00.0 BAR 1. Performance may be slow

 

Link to comment

Yea, that's exactly what mine was doing. I just had an idea but i'm heading out and can't test atm. I think to get uefi working i read somewhere that you need to add it in your syslinux like:

 

 kernel /bzimage
append vfio-pci.ids=10de:1b06,10de:10ef isolcpus=9-15,25-31 initrd=/bzroot

 

those ids being your gpu and gpu audio.

Edited by Tritech
Link to comment
12 minutes ago, Tritech said:

Yea, that's exactly what mine was doing. I just had an idea but i'm heading out and can't test atm. I think to get uefi working i read somewhere that you need to add it in your syslinux like:

 


 kernel /bzimage
append vfio-pci.ids=10de:1b06,10de:10ef isolcpus=9-15,25-31 initrd=/bzroot

 

those ids being your gpu and gpu audio.

tried it.. makes shit tons of writes to your log for the VM saying the resource is busy. Unraid isn't letting go.. Like the frozen song..

Link to comment
