Win 10 VFIO VM video/HDMI audio stuttering (AMD Ryzen 3000/x570)


JesterEE


Hey community!  I was really trying hard NOT to write this post.  My Google-fu is pretty strong, and I thought for sure I could find a solution to my issue on the vast interweb ... but I have just found lots of things that didn't seem to work 😥.

 

A lot of what I did for VM setup and debugging came from the Unraid wiki, the Unraid forum, SpaceInvader One's YouTube channel, /r/VFIO, the Level1Techs VFIO forum, and MathiasHueber.com's game VM tweak guide.

 

I'm looking for help/insight into the issues I'm seeing, helpful new troubleshooting steps, or just corroboration that someone with a similar system (x570 MB, CPU architecture, Unraid/QEMU version) is seeing the same types of things ... and that I'm not crazy.  The system works, the pass-through works ... it's just that the end experience leaves a fair amount to be desired.  Far from bare metal!  My last hardware build on Unraid 6.7.2-stable was under-powered, so I assumed the stuttering/lag/etc. I was seeing was because my hardware wasn't up to the task.  But I recently upgraded to 2019 tech and am still seeing the same types of issues.  I am perplexed!

 

System

Unraid 6.8.0-Stable (LinuxServer.io's NVIDIA build) | ASUS ROG Strix x570-E, AGESA 1.0.0.4 B, BIOS 1405 (11/26/2019) | AMD Ryzen 3800X (8C/16T) | 32GB DDR4-3600 | Gigabyte NVIDIA GTX 1060 6GB Windforce

The Windows 10 VM is a vdisk hosted on an Unassigned Devices 2.5" 1TB SATA SSD.

 

Problem

In a Windows 10 v1909 i440fx-4.1.1 VM with NVIDIA GPU and on-board USB pass-through, provisioned with an isolated 4C/8T from the 2nd CCX and 16GB RAM, I get a non-negligible amount of video/HDMI audio stuttering when watching YouTube, gaming, forcing audio output (e.g. clicking the Windows volume slider), etc.  This happens regardless of what else is running on my Unraid system (e.g. dockers), but more frequently with other applications running.

 

Other Notes

  • The ACS patch and unsafe interrupts are not required for my system to do hardware pass-through appropriately.
  • I notice that the lag is accompanied by either:
    • high CPU/single-thread spikes on the isolated VM cores or on the host cores where the emulator and iothreads are pinned (observed in the Unraid Dashboard), OR
    • guest VM I/O spikes (virtio ethernet/primary disk, observed in the guest Task Manager).
  • I tried isolating cores on the 1st CCX in addition to the cores on the 2nd CCX.  This caused input lag in the VM regardless of whether the cores were assigned to the VM or not.
  • I tried to set up my 2nd on-board gigabit NIC to be passed to the VM (hoping to alleviate the virtio ethernet lag spikes).  After adding the NIC (which is in its own IOMMU group) to the vfio-pci.cfg file and restarting, I tested the VM before assigning the NIC to it (i.e. no changes to the VM).  The host change alone caused input lag in the VM just like the previous bullet.  This seemed like a non-starter, so I reverted the binding without trying the NIC in the VM.
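
For reference, the binding mentioned above is a one-line config file on Unraid 6.8. A minimal sketch, assuming the `BIND=` format used by the VFIO-PCI Config plugin, with the Intel I211 NIC's address from the IOMMU listing further down (substitute your own device address):

```shell
# /boot/config/vfio-pci.cfg (illustrative sketch)
# 0000:05:00.0 is the on-board Intel I211 gigabit NIC in this system
BIND=0000:05:00.0
```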

 

Attempted Troubleshooting

  1. Enabling Message-Signaled Interrupts (MSI) for the GPU/HDMI audio using the MSI Utility v2
    • Yes, this did not fix it!
    • Yes, I re-enabled MSI of the devices when I installed new GPU drivers
    • Yes, I restarted the guest between applying and testing
  2. Adding/Removing MSI for other virtio devices
  3. Disabling Unraid docker during testing (i.e. free up the host resources)
  4. Updated Windows 10 guest virtio drivers to v0.1.173-2 (Link)
  5. Limited cores (1, 2, 4 HT core(s))
  6. Isolated/Non-isolated VM cores
  7. emulatorpin/iothreadpin VM template directives
  8. Hyper-V Enlightenment template directives
  9. Changing the HDMI audio pass-through guest address to be the same as the GPU, but with a different function (similar to the way it is on the host)
    •     <hostdev mode='subsystem' type='pci' managed='yes'>
            <driver name='vfio'/>
            <source>
              <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
            </source>
            <rom file='/mnt/user/Server/config/Gigabyte.GTX1060.rom'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
          </hostdev>
          <hostdev mode='subsystem' type='pci' managed='yes'>
            <driver name='vfio'/>
            <source>
              <address domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
            </source>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
          </hostdev>

       

    • to

    •     <hostdev mode='subsystem' type='pci' managed='yes'>
            <driver name='vfio'/>
            <source>
              <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
            </source>
            <rom file='/mnt/user/Server/config/Gigabyte.GTX1060.rom'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0' multifunction='on'/>
          </hostdev>
          <hostdev mode='subsystem' type='pci' managed='yes'>
            <driver name='vfio'/>
            <source>
              <address domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
            </source>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x1'/>
          </hostdev>

       

  10. Windows guest default audio format change (CD/DVD/Studio 16-bit/24-bit)
  11. Q35-4.1.1 VM with the same ... everything
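
Regarding item 1 above: the MSI Utility just flips a per-device registry value, so the change can also be made (or verified) by hand. A hedged sketch as a .reg file; the device instance path below is a placeholder, so look up your GPU/HDMI audio device's actual instance path in Device Manager (Details > Device instance path) before importing anything:

```reg
Windows Registry Editor Version 5.00

; PLACEHOLDER path: replace everything after Enum\PCI\ with your device's
; real instance path from Device Manager before using this.
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI\VEN_10DE&DEV_10F1&SUBSYS_XXXXXXXX&REV_A1\PLACEHOLDER\Device Parameters\Interrupt Management\MessageSignaledInterruptProperties]
"MSISupported"=dword:00000001
```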

 

VM Setup

 

VM Template

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm'>
  <name>Vanellope</name>
  <uuid>2eb4ab82-9e1e-dc2b-b155-d61c76458527</uuid>
  <description>Windows 10 Gaming VM</description>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <iothreads>2</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='12'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <vcpupin vcpu='3' cpuset='13'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='14'/>
    <vcpupin vcpu='6' cpuset='7'/>
    <vcpupin vcpu='7' cpuset='15'/>
    <emulatorpin cpuset='3,11'/>
    <iothreadpin iothread='1' cpuset='1,9'/>
    <iothreadpin iothread='2' cpuset='2,10'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-4.1'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/2eb4ab82-9e1e-dc2b-b155-d61c76458527_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <vpindex state='on'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
      <synic state='on'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
      <stimer state='on'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
      <reset state='on'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
      <frequencies state='on'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
    </hyperv>
    <vmport state='off'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
    <ioapic driver='kvm'/>    <!-- required for QEMU 4.0 or later -->
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='8' threads='1'/>
    <cache mode='passthrough'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
    <timer name='rtc' present='no' tickpolicy='catchup'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
    <timer name='pit' present='no' tickpolicy='delay'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
    <timer name='tsc' present='yes' mode='native'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/mnt/disks/CT1000MX500SSD1_1820E13CF91D/domains/Vanellope/Vanellope_i440fx_4p1p1.img'/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/mnt/disks/CT1000MX500SSD1_1820E13CF91D/games/blizzard/blizzard.img'/>
      <target dev='hdd' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/win_10_1909_x64.iso'/>
      <target dev='hda' bus='ide'/>
      <readonly/>
      <boot order='2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/virtio-win-0.1.173.iso'/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:29:0d:5c'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/Server/config/Gigabyte.GTX1060.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x3'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
</domain>
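
As a sanity check on the cputune block above: the vcpupin pairs (4/12, 5/13, 6/14, 7/15) only make sense if those host threads really are SMT siblings on the same physical core. One way to confirm that, reading the kernel's sysfs topology (a generic Linux check, not Unraid-specific):

```shell
# Print each host CPU's SMT sibling pair; on this 3800X, cpu4 should
# report "4,12" to match the vcpupin layout above.
for f in /sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list; do
  echo "$f: $(cat "$f")"
done
```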


IOMMU

IOMMU group 0:	[1022:1482] 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
IOMMU group 1:	[1022:1483] 00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
IOMMU group 2:	[1022:1482] 00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
IOMMU group 3:	[1022:1482] 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
IOMMU group 4:	[1022:1483] 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
IOMMU group 5:	[1022:1482] 00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
IOMMU group 6:	[1022:1482] 00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
IOMMU group 7:	[1022:1482] 00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
IOMMU group 8:	[1022:1484] 00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
IOMMU group 9:	[1022:1482] 00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
IOMMU group 10:	[1022:1484] 00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
IOMMU group 11:	[1022:1484] 00:08.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
IOMMU group 12:	[1022:1484] 00:08.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
IOMMU group 13:	[1022:790b] 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
[1022:790e] 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
IOMMU group 14:	[1022:1440] 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0
[1022:1441] 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1
[1022:1442] 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2
[1022:1443] 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3
[1022:1444] 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4
[1022:1445] 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5
[1022:1446] 00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6
[1022:1447] 00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7
IOMMU group 15:	[1022:57ad] 01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 57ad
IOMMU group 16:	[1022:57a3] 02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 57a3
IOMMU group 17:	[1022:57a3] 02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 57a3
IOMMU group 18:	[1022:57a3] 02:05.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 57a3
IOMMU group 19:	[1022:57a4] 02:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 57a4
[1022:1485] 06:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
[1022:149c] 06:00.1 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
[1022:149c] 06:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
IOMMU group 20:	[1022:57a4] 02:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 57a4
[1022:7901] 07:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
IOMMU group 21:	[1022:57a4] 02:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 57a4
[1022:7901] 08:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
IOMMU group 22:	[1000:0072] 03:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
IOMMU group 23:	[10ec:8125] 04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
IOMMU group 24:	[8086:1539] 05:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
IOMMU group 25:	[10de:1c03] 09:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
[10de:10f1] 09:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
IOMMU group 26:	[1022:148a] 0a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
IOMMU group 27:	[1022:1485] 0b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
IOMMU group 28:	[1022:1486] 0b:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
IOMMU group 29:	[1022:149c] 0b:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
IOMMU group 30:	[1022:1487] 0b:00.4 Audio device: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller
IOMMU group 31:	[1022:7901] 0c:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
IOMMU group 32:	[1022:7901] 0d:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)

Diagnostics: Attached

 

lstopo: Attached

 

Thanks for looking through this (long) post!

-JesterEE

 

 

topo_cogsworth_v3.png

cogsworth-diagnostics-20191224-1958.zip


@JesterEE Quick question: have you ever tried minimizing the PCI device pass-through to the bare minimum, i.e. only the GPU? The more devices you pass through, the more possible sources of failure you add. From what you've explained you've already tried a couple of things and tweaks, but from what I've seen over the 2 years I've been using Unraid, the more devices you add, the more possible hiccups you generate. Most issues show up for people passing onboard controllers like USB or network through to their VMs.

 

  • I would try to set up a VM with only the GPU passed through: no USB controller, no network controller.
  • Select your keyboard and mouse via the USB selection in the template.
  • Put the vdisk on a drive nothing else uses; maybe try another disk connected to another port (or M.2 slot). Onboard SATA controllers mostly sit behind the chipset and share resources with other devices like network ports, USB, or even onboard audio. If you push these devices too hard at the same time, you can end up in situations where other devices have to wait for their data to be processed. Usually at least one M.2 slot is also connected to the chipset. Double-check the manual for which device is connected where and which devices share resources, and try to avoid using those devices in the first place.
  • Disable every possible background script or docker.
  • Tweaks inside the VM: power plan set to High Performance, MSI fix, disable everything in W10 you don't need (telemetry, local update sharing, MS account sync, use a local account, etc.).
  • Add the PCI IDs of the card to the syslinux config.
  • Possibly worth trying: don't pass through the HDMI audio part of the card.
  • Try another PCIe slot for the card, and maybe put another one into the first slot. You have only one card in the system, right?
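
The "PCI IDs in the syslinux config" suggestion would look roughly like this on Unraid (a sketch, assuming the stock /boot/syslinux/syslinux.cfg layout; 10de:1c03 and 10de:10f1 are the GTX 1060 video/audio IDs from the IOMMU listing in this thread, so get yours with `lspci -nn`):

```shell
# /boot/syslinux/syslinux.cfg (excerpt, illustrative)
label Unraid OS
  menu default
  kernel /bzimage
  append vfio-pci.ids=10de:1c03,10de:10f1 initrd=/bzroot
```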

 

 

EDIT:

 

I didn't notice the following the first time reading your post: you're using "Unraid 6.8.0-Stable LinuxServer.io's NVIDIA Build". Adding the PCI IDs of the card to the syslinux config prevents Unraid from picking it up at boot; I'm not sure, but that build might cause some issues there. I've never tried these builds, but I guess the GPU will be used for Plex, right?

  • 3 weeks later...

@bastl Thanks for taking the time to reply to this thread.  Busy last few weeks and I didn't have too much time to investigate more.  And I was kind of hoping that the 6.9-rc series would have started before I started looking into it again 😅.

 

Anyway, you brought up some good points.  The GPU pass-through is necessary, so I reverted to just that device with a "fresh" XML template generated from the VM Manager.  Unfortunately, I still saw the same issue.  Though, as I added more pass-through devices (GPU audio, USB controller) and generated artificial and real (docker) loads on the host, it didn't get any worse.  So my money is on something in the kernel or the hardware firmware itself.  This was an interesting test though, because it made clear that I didn't actually need to pass through the USB controller to lower my DPC latency as others have reported ... and the stuttering wasn't from either the input devices or from the USB controller pass-through.

 

My vdisk is on a 2.5" SSD that is dedicated to this one VM.  It does tend to get hot though (tops out at 130°F under VM load), so I really need to try it with a fan on it to make sure it's not sensitive to the generated heat (but I doubt it ... it's still in the acceptable range).

 

I have not tried to stub the GPU.  I haven't seen anyone else do it so it didn't occur to me.  I'll try that next.  Just not tonight 😁.

 

You also brought up a good point about the modified Unraid kernel I'm running.  Yes, I run it to allow the GPU to run in a couple dockers.  I will try the stock kernel too and see if I get any different results.  I doubt swapping between the same revision of the NVIDIA build and stock build will cause any issues, so I have no problem giving it a shot even if I'd like to live on the NVIDIA build.  It would be good to know if that's the culprit.

 

-JesterEE


A little follow up. Today I stubbed the GPU and the attached HDMI audio controller. What do you know ... buttery smooth performance during 99.5% of my testing. So apparently, at least with my current Unraid install, stubbing the GPU is mandatory.  I am still on the 6.8.0-stable LinuxServer.io NVIDIA build, so I'm hoping (praying) the v5 kernel build won't require me to stub the GPU. Note that I did not need to pass through my USB controller for my keyboard and mouse to keep a solid experience.

 

Also, I needed to have the cores of my CPU used by the VM (4HT cores) isolated from the Linux kernel. I tried the VM without the isolation, and even without any other processes running (outside of whatever Unraid does in the background) I got lots of stutters even with the GPU stubbed.

 

Having to stub the GPU is kind of a deal breaker for me to even consider dealing with VFIO for gaming anymore. I know it might seem like a small reason, but let me break it down a bit for my scenario.

 

I upgraded my setup from an i7 2700. That CPU is more than capable of running Unraid and all the dockers I run, with a little compute power to spare. Even running a media server (Plex) is fine on this system as long as I have my NVIDIA GTX 1060 for the transcode tasks. And ... there's the rub. Without being able to have the GPU available to Linux AND the VM, I have no need to have my Unraid box do the gaming VM task. If I have to buy another GPU for Plex while I reserve the other for the VM ... I might as well have 2 separate physical PCs anyway. Since I still have my "old" components, the only thing I would need is another GPU to make a complete second system viable. So why would I keep half my Ryzen 3800X isolated just to do it in one box with half the performance? At that point it doesn't make sense for me, and I can build a much better standalone gaming machine with little extra investment. Just because you can do something doesn't always mean you should.

 

Now if there were better solutions to completely isolate components and free them again on demand ... I'd reconsider my position. CPU isolation with cset shield may be an option in the future if the developers implement it. I am not aware of a solution for on-demand VFIO pass-through, so that would also need to be available. And since this is not a main focus for the developers (rightfully so, IMO), this tech is surely not coming to Unraid Soon™ 🤣.

 

Anyway, I still want to test the Limetech Unraid 6.8.0-stable and LinuxServer.io's Unraid 6.8.0-rc7 NVIDIA build with the v5 kernel before I consider this testing complete. Because I need to have a GPU available to Plex, the Limetech builds are a non-starter, but I can at least see if there is a difference when not stubbing the GPU. Maybe the GPU drivers on the Unraid host are still holding onto the GPU preventing a good experience in the VM. This would be good information for LinuxServer.io at least. We'll see!

 

I will report my findings here for the community and others that run into similar issues and start searching for solutions.

 

-JesterEE

  • 2 weeks later...


 

 

OK, I give up!  I tried so many things ... and so many things just failed to provide a different end experience.  Since my last post I tried:

  • 4 different builds of Unraid ... v4 and v5 Linux kernels
    • 6.8.0-Stable
    • 6.8.0-Stable LinuxServer.io NVIDIA build
    • 6.8.0-RC7
    • 6.8.0-RC7 LinuxServer.io NVIDIA build
  • w/ and w/o VFIO GPU/USB pass-through
  • w/ and w/o CPU isolation
  • various number of vCPUs
  • various vCPU assignments
  • various memory allocations
  • fresh Windows 10 1909 VMs
    • both i440fx and Q35 variants
  • various virtio driver versions
  • fresh KVM XMLs
  • various BIOS and Unraid settings

 

In my last post I thought I had it figured out ... but after using it in that configuration for a time, it was better, but not as good as I originally thought, so the quest continued.

 

When I started working with KVM in Unraid, I thought it would be a great use of my equipment, maximizing its efficiency for my use case, and a chance to have some fun and tinker for a while with my new hardware.  At the end of all this effort, it has been very unfulfilling and frustrating.  I've probably spent >120 hours reading posts all over the internet and trying different things to get the VM experience to a place where I can almost forget I'm working with a VM.  I honestly don't think it's possible, and in hindsight, not worth the effort!

 

I don't know how others have done it, and I'm coming to think it's more a matter of personal perception, i.e. what's "good enough" for someone else is not "good enough" for me.  So, I'm done trying.  I'm back to a simple VM configuration, close to where I started before the search for the perfect settings.  It works well enough, but it's "not quite good enough" for me to use regularly; it will suffice till I can build yet another computer.  I uploaded a video of what I'm seeing to YouTube.  I would appreciate it if others with well-behaved game-oriented VMs could give it a watch and see if they see the same things I am, as a sanity check.  Maybe the small glitching is just part of working with a VM in this environment and I'm being hypersensitive.

 

 

 

Thanks for your feedback!

 

-JesterEE

 

  • 2 weeks later...

Hi,

 

I have a similar build running a Ryzen 3600.

 

Did you try removing those lines:

<synic state='on'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->
<stimer state='on'/>  <!-- https://mathiashueber.com/performance-tweaks-gaming-on-virtual-machines/ -->

With those lines enabled I get heavy stuttering and lag in my Windows 10 VM.

 

Another thing to look at is:

<timer name='tsc' present='yes' mode='native'/> 

You have to check whether your Linux host is indeed using the TSC clocksource:

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc

If you put your host system to sleep, then upon resume the TSC timer will most likely be marked as unreliable and dropped in favor of HPET (at least this is what I observed on AMD hardware). The TSC timer is crucial for a Windows 10 VM to run smoothly.

If this is the case for you, then you may try adding the kernel parameter 'tsc=reliable'.
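
On Unraid that parameter goes on the `append` line of the boot entry in /boot/syslinux/syslinux.cfg; a sketch assuming the stock entry (merge with whatever parameters are already on your line):

```shell
# In /boot/syslinux/syslinux.cfg, add tsc=reliable to the existing append line:
append tsc=reliable initrd=/bzroot
```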

 

Additionally, try running this software to monitor your DPC latency:

https://www.thesycon.de/eng/latency_check.shtml


@zeus83 Thanks for the comment.  When I reverted back to the simple XML those lines were removed and I still have stuttering.  My XML now is what Unraid generates in the GUI + edits for CPU feature policy='require' name='topoext', multifunction corrected GPU/HDMI pass through, and disk cache='none'.

 

It's hard to quantify better or worse stuttering between VM settings since it's very subjective.  My unscientific bar is "Can I do something with medium hardware intensity for 10-15 mins and notice significant lag/stutters?".  The answer to the question is always yes 😭.

 

I'm currently running the default VM clock setup:

  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
    <timer name='tsc' present='yes' mode='native'/>
  </clock>

My current_clocksource is tsc and my available_clocksource is tsc hpet acpi_pm.  My system does not sleep, so it should always be on the tsc clock.

 

I did not however run the kernel parameter 'tsc=reliable', so that's something I will try!  Thank you so much for the recommendation!  Getting rid of clock stability checks doesn't seem like a great idea overall ... but it's worth a try ... and some more reading if it actually works.

 

Note the DPC Latency Checker you linked does not work on newer versions of Windows (>8).  From their website:

Quote

The utility produces incorrect results because the implementation of kernel timers has changed in Windows 8, which causes a side effect with the measuring algorithm used by the utility.

 

The only software I know of for DPC latency checking that works on Windows 10 is LatencyMon.

 

General thread update:

I guess I didn't quite raise the white flag ... I'm just not investing a lot of time and trying everything under the sun anymore.

 

I did try adding another GPU (GeForce GTX 240 🤣) as the primary (PCIE1 8x) with the GeForce 1060 as secondary (PCIE2 8x), unbound from the Linux kernel with vfio-pci.cfg.  Didn't help.  It was pretty much the same as running just the GeForce 1060 in PCIE1 16x, either bound or unbound from the kernel, with the kernel parameter 'video=efifb:off' added so the framebuffer isn't held by the host.  This is what I expected, but now it's validated.

 

I think the next thing I'm going to try is Skitals' 6.8.0-RC5 kernel so I can try my on-board audio instead of the HDMI audio.  Maybe I'm asking the GPU bus to do too much 🙄.

 

-JesterEE

  • 2 weeks later...

Config update

 

Tried @zeus83's recommendation and I'm still having similar issues. Loaded @Skitals' 6.8.0-RC5 kernel but was having issues passing through my Nvidia card, so I scrapped that effort. Upgraded to 6.8.2; same issues as expected, but now my Nvidia card is passing through fine with QEMU 4.2.

 

My last straw is thermonuclear ... maybe I set an Unraid configuration option or loaded a plugin somewhere along the line that is interfering with VM performance. I'm going to temporarily disable my array, load a fresh install on my USB drive, and just run the virtual machine. No docker, no plugins, no tweaks, no nothing ... vanilla.  I'll try 6.8.2 and 6.8.0-RC7. If one works really well, I'll know it's something I did, and I'll re-setup my array and reconfigure Unraid as I like it until I figure out what's causing the issue. But I have low expectations.

 

-JesterEE


So I did what I said in the previous post. Nuked my Unraid USB (after backing it up, of course!), loaded 6.8.2, created a dummy array with another single USB drive, set a SSD as an Unassigned Device for a VM (actually, just used the same SSD/VM I have been using), isolated half the cores for the VM, stubbed my GPU, and played with a few benchmarks and games to test it out. Butter...so, so smooth. Bare metal performance. So, now I know it's possible ... now to figure out what's causing it to stop being that way.

 

The struggle continues...

 

-JesterEE

  • 1 month later...
On 2/23/2020 at 10:19 AM, JesterEE said:

So I did what I said in the previous post. Nuked my Unraid USB (after backing it up, of course!), loaded 6.8.2, created a dummy array with another single USB drive, set a SSD as an Unassigned Device for a VM (actually, just used the same SSD/VM I have been using), isolated half the cores for the VM, stubbed my GPU, and played with a few benchmarks and games to test it out. Butter...so, so smooth. Bare metal performance. So, now I know it's possible ... now to figure out what's causing it to stop being that way.

 

The struggle continues...

 

-JesterEE

Any word on this? I have not been able to pass the on board audio regardless of how many hours I dump into trying different things

On 4/13/2020 at 1:57 AM, Critica1Err0r said:

Any word on this? I have not been able to pass the on board audio regardless of how many hours I dump into trying different things

 

 

From what I've read (a while ago), the onboard audio needs to be addressed in the kernel.  Please see this 6.8.0-RC5 kernel mod by Skitals.  Unfortunately, this has not been addressed upstream in the mainline Linux 5.6 kernel (which is actually newer than the kernel the 6.9.0-beta series is built on), so we are SOL unless it's been addressed in another kernel module.  My recommendation: try Skitals' 6.8.0-RC5 kernel mod and the 6.9.0-beta1 build to see if either fixes it for you.  Not all x570 boards use the same audio chip/codec, so depending on what board you're running, your mileage may vary.  I tried the 6.8.0-RC5 kernel mod with my ASUS ROG Strix x570-E Ryzen build (see my sig), but had some issues, so I gave up on it.  If Skitals does a new version based on a 6.9.0-RC build, I might try it again.

17 hours ago, cobhc said:

I'd like to know if you managed to find out what was causing the issues @JesterEE. I'm on a completely different hardware configuration but I've been struggling with VM performance for a long time now.

No definitive answer, unfortunately!  I wish I had one!  I reloaded my server with the Unraid 6.8.3 LinuxServer.io Nvidia build and almost everything I would "normally" run, and my VM is pretty great.  I think any bottlenecks/stuttering I'm still getting are just a limitation of the hardware (GPU) I'm running and not the emulation.  Either way ... it's night and day from what I had originally.  It was awful!

 

When I was restarting my build I went pretty slowly, loading only one plugin or docker at a time while I tested out the VM performance.  There are still a couple plugins I want to run that I haven't loaded because I couldn't dedicate the time yet.

 

If I had to guess ... while tinkering with my first-ever Unraid setup, I probably changed some obscure setting that didn't need to be changed, and that hosed everything.  But I won't be sure till I load all the stuff I want to run.  As I get time to do my testing, I'll post here, but I doubt it will be soon.

 

Here are the plugins I'm running that I have whitelisted for my build:

  • CA Appdata Backup/Restore v2
  • CA Auto Update
  • CA Dynamix Unlimited Width
  • CA Fix Common Problems
  • CA User Scripts
  • Community Applications
  • ControlR
  • Disable Security Mitigations
  • Dynamix Active Streams
  • Dynamix Local Master
  • Dynamix SSD TRIM
  • Dynamix System Autofan (not currently using)
  • Dynamix System Info
  • Dynamix System Temp (not currently using - needs a newer Linux kernel for my x570 build)
  • Dynamix WireGuard
  • Enhanced Log Viewer
  • File Activity
  • Mover Tuning
  • NerdPack GUI
  • Preclear Disk
  • Recycle Bin
  • Statistics
  • Tips and Tweaks
  • Unassigned Devices
  • UnBALANCE
  • Unraid Nvidia
  • VFIO-PCI CFG
  • VM Backup

 

Here are a couple more I want to run that I've currently greylisted:

  • Dynamix Cache Dirs
  • Dynamix File Integrity
Edited by JesterEE
  • 1 year later...

Hi,

I'm not sure if I should really post on this 1-year-old thread, but just let me share this.

I have struggled with a similar issue on a different setup.

But now, it's fixed and working.

 

In my case, what I missed was enabling MSI on the second function (HD-Audio) of the GPU.

It's not visible in MSI Utility v2, AFAIK.

To do this, I obtained the VID/PID of the second function using lspci on the host:

$ sudo lspci -s 04:00.1
04:00.1 Audio device: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] (rev a1)
$ sudo lspci -s 04:00.1 -v -n
04:00.1 0403: 10de:0fbc (rev a1)
        Subsystem: 19da:1346
        Flags: bus master, fast devsel, latency 0, IRQ 140, IOMMU group 13
        Memory at df080000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel

So, 10de:0fbc is the VID/PID for the function.
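(Aside: if you want to script this, the VID/PID is simply the third whitespace-separated field of the `lspci -n` line.  A quick sketch using the canned output above, since the bus address is machine-specific:)

```shell
# Extract the vendor:device pair (VID:PID) from an `lspci -n -s` line.
# Canned line from the output above; on a live host you would use:
#   line=$(lspci -n -s 04:00.1)
line="04:00.1 0403: 10de:0fbc (rev a1)"
vidpid=$(echo "$line" | awk '{print $3}')
echo "$vidpid"   # -> 10de:0fbc
```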

I've changed the relevant registry key just like I did on the first function.

For me it was:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI\VEN_10DE&DEV_0FBC&SUBSYS_134619DA&REV_A1\4&26a0a5e1&0&0019\Device Parameters\Interrupt Management\MessageSignaledInterruptProperties\MSISupported
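Rather than hand-editing in regedit, the same change can be applied from an elevated Command Prompt inside the guest with `reg add`.  A sketch using my key path — note the device-instance segment (4&26a0a5e1&0&0019) is machine-specific, so look yours up under the VEN_10DE&DEV_0FBC key first:

```shell
:: Run elevated inside the Windows guest.  Enables MSI on the GPU's
:: HD-Audio function; the 4&26a0a5e1&0&0019 instance segment is from
:: MY machine -- substitute your own from regedit.
reg add "HKLM\SYSTEM\CurrentControlSet\Enum\PCI\VEN_10DE&DEV_0FBC&SUBSYS_134619DA&REV_A1\4&26a0a5e1&0&0019\Device Parameters\Interrupt Management\MessageSignaledInterruptProperties" /v MSISupported /t REG_DWORD /d 1 /f
```

Reboot the guest afterwards for the change to take effect.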

 

reference:

https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Slowed_down_audio_pumped_through_HDMI_on_the_video_card

 

My setup is as follows:

Guest OS: Windows 10

VMM: QEMU/KVM

Host Linux kernel: 5.4.38-gentoo

M/B: AsRock H170M Pro4

CPU: i3-7300

GPU: GTX750Ti

 
