Can't get GPU Passthrough working (Black screen / 1 core pegged at 100%)



Good day everybody, 

 

Would appreciate your assistance to find out the possible cause of my issue. 

 

My system specification below:
CPU: Ryzen 7 3700X

Motherboard: Gigabyte B450 Aorus Elite (version: F51)

GPU: EVGA GTX 1080 SC Gaming (P/N: 08G-P4-6183-KR)

 

I have watched numerous video guides on setting up GPU passthrough, but unfortunately I still can't get it working.

 

1) The monitor simply goes into standby mode when I start the VM. The monitor is connected to the GPU's DP port.

2) The VM does not boot at all with GPU passthrough. It works fine with VNC, however.

3) One of the CPU cores I have pinned to the VM gets stuck at 100% usage.

4) The VM will not shut down gracefully. I have to force stop it.

 

Steps performed:

1) SVM and IOMMU are enabled.

2) The GPU was grouped with other devices. I isolated the groups as described in this video (https://www.youtube.com/watch?v=qQiMMeVNw-o), using step 2: setting PCIe ACS override to "Both".

3) Downloaded and edited the appropriate VBIOS for my GPU. (https://www.techpowerup.com/vgabios/201504/evga-gtx1080-8192-180308)

4) Tried both machine types (i440fx and Q35). Neither worked.

5) Tried both OVMF and SeaBIOS. Neither worked.

 

Is there anything else that I might have missed? Here is some information that may help with troubleshooting:

 

100% CPU:

 

[Screenshot: 100CPU.PNG]

 

HTOP:

[Screenshot: htop.PNG]

 

IOMMU Group:

IOMMU group 24:	[10de:1b80] 07:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
IOMMU group 25:	[10de:10f0] 07:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
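
A listing like the one above can be checked mechanically. The following is a minimal Python sketch, not an Unraid tool: it parses lines in the format shown and reports any group in which a device intended for passthrough shares a group with other devices. The names `parse_iommu_listing` and `shared_groups` are made up for this illustration, and the regular expression assumes the exact line layout pasted above.

```python
import re
from collections import defaultdict

# Assumed line layout, taken from the listing above:
# IOMMU group 24:<tab>[10de:1b80] 07:00.0 VGA compatible controller: ...
LINE_RE = re.compile(
    r"IOMMU group (\d+):\s*\[([0-9a-f]{4}):([0-9a-f]{4})\]\s+"
    r"([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)",
    re.IGNORECASE,
)


def parse_iommu_listing(text):
    """Return {group_number: [(pci_address, "vendor:device", description), ...]}."""
    groups = defaultdict(list)
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            group, vendor, device, addr, desc = m.groups()
            groups[int(group)].append((addr, f"{vendor}:{device}", desc.strip()))
    return dict(groups)


def shared_groups(groups, wanted_addresses):
    """Return {group: [other addresses]} for groups mixing wanted and unwanted devices."""
    wanted = set(wanted_addresses)
    result = {}
    for group, devices in groups.items():
        addrs = {addr for addr, _, _ in devices}
        if addrs & wanted and addrs - wanted:
            result[group] = sorted(addrs - wanted)
    return result


LISTING = """\
IOMMU group 24:\t[10de:1b80] 07:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
IOMMU group 25:\t[10de:10f0] 07:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
"""

groups = parse_iommu_listing(LISTING)
for number in sorted(groups):
    print(number, groups[number])
print("shared with other devices:", shared_groups(groups, ["07:00.0", "07:00.1"]))
```

For the two lines posted here, neither GPU function shares its group with anything else, so the grouping itself does not look like the problem.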

 

VM Configuration

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='1'>
  <name>Windows 10 (Shared)</name>
  <uuid>e2de9569-5950-d734-0806-c014274fbb67</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='12'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <vcpupin vcpu='3' cpuset='13'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='14'/>
    <vcpupin vcpu='6' cpuset='7'/>
    <vcpupin vcpu='7' cpuset='15'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-4.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/e2de9569-5950-d734-0806-c014274fbb67_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/mnt/user/domains/Windows 10 (Shared)/vdisk1.img' index='1'/>
      <backingStore/>
      <target dev='hdc' bus='sata'/>
      <boot order='1'/>
      <alias name='sata0-0-2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='sata0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci' ports='15'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:53:e5:c0'/>
      <source bridge='br0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-Windows 10 (Shared)/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <rom file='/mnt/user/isos/EVGA.GTX1080.8192.180308.dump'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>
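
To review the passthrough entries in a configuration like this without reading the whole file, a small script can pull out just the `<hostdev>` elements. This is a minimal Python sketch using only the standard library; the helper names are invented for this illustration, and the embedded XML is a trimmed excerpt of the configuration above. In that configuration the GPU (07:00.0) and its audio function (07:00.1) are placed in separate guest slots (0x05 and 0x06). Some passthrough guides suggest presenting them as two functions of one multifunction guest device instead; whether that matters for this particular issue is an assumption, not something established in this thread.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the <hostdev> entries from the configuration above.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/isos/EVGA.GTX1080.8192.180308.dump'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
  </devices>
</domain>
"""


def pci_tuple(elem):
    """Convert an <address> element into (domain, bus, slot, function) integers."""
    return tuple(int(elem.get(key), 16) for key in ("domain", "bus", "slot", "function"))


def passthrough_devices(xml_text):
    """List host address, guest address, and ROM file for each PCI <hostdev>."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for hostdev in root.findall("./devices/hostdev[@type='pci']"):
        guest = hostdev.find("./address")
        rom = hostdev.find("./rom")
        found.append({
            "host": pci_tuple(hostdev.find("./source/address")),
            "guest": pci_tuple(guest) if guest is not None else None,
            "rom": rom.get("file") if rom is not None else None,
        })
    return found


def split_functions(devices):
    """Return host (domain, bus, slot) keys whose functions sit in different guest slots."""
    guest_slots = {}
    for dev in devices:
        if dev["guest"] is not None:
            guest_slots.setdefault(dev["host"][:3], set()).add(dev["guest"][:3])
    return [host for host, slots in guest_slots.items() if len(slots) > 1]


devices = passthrough_devices(DOMAIN_XML)
for dev in devices:
    print(dev)
print("host devices split across guest slots:", split_functions(devices))
```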

 

  • 1 year later...
  • 5 months later...
On 4/17/2021 at 9:48 PM, foop09 said:

Did you ever find a solution to this? I have the same issue: when I add the GPU with a VBIOS, the CPU gets pegged and the VM never boots or POSTs. However, if I set the graphics back to VNC, it's fine. Could it be my actual GPU? No idea.

I am having the same issue too. I have tried with old ATI and NVIDIA cards.

  • 2 months later...

In my case I have the best luck when my VM's "machine" is set to i440fx and the PCI device is set to "bind to VFIO at boot" in the device settings under the Tools menu. With these settings I have passed through two different GPUs with no issues. Obviously, before doing any of this, make sure your BIOS is configured properly for PCI passthrough.

 

However, I have two different AMD GPUs where passthrough NEVER works: I get the single core at 100% and the VM never POSTs. I'm guessing there's an incompatibility with these cards.
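
Whether a device really ended up bound to VFIO can be checked from the host. On Linux, each PCI device directory under `/sys/bus/pci/devices/` has a `driver` symlink when a kernel driver is bound to it. The following is a minimal Python sketch; `bound_driver` is a name made up for this example, and the demo runs against a throwaway directory tree that imitates sysfs so that it works on any machine. On a real host you would call it with the default root and your own device address.

```python
import os
import tempfile


def bound_driver(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None if unbound."""
    link = os.path.join(sysfs_root, pci_address, "driver")
    if not os.path.islink(link):
        return None
    # The symlink target ends in the driver's directory name, e.g. ".../drivers/vfio-pci".
    return os.path.basename(os.readlink(link))


# Demo on a fake tree (hypothetical layout for illustration only).
fake_root = tempfile.mkdtemp()
os.makedirs(os.path.join(fake_root, "0000:07:00.0"))
os.makedirs(os.path.join(fake_root, "0000:07:00.1"))
os.symlink(
    "../../../bus/pci/drivers/vfio-pci",
    os.path.join(fake_root, "0000:07:00.0", "driver"),
)

print(bound_driver("0000:07:00.0", fake_root))  # vfio-pci
print(bound_driver("0000:07:00.1", fake_root))  # None
```

If a GPU function reports a graphics or audio driver rather than `vfio-pci` before the VM starts, the host is still using it, which is the situation the "bind to VFIO at boot" setting is meant to avoid.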

  • 1 year later...
On 9/21/2021 at 1:40 AM, ghost82 said:

When you have issues with GPU passthrough, most of the time we can be of some help if you attach the Unraid diagnostics file and the output of this terminal command:

cat /proc/iomem

 

Common errors are: not splitting IOMMU groups, memory not mapped properly, not passing the proper GPU components to the VM, and a wrong target topology.

 

This is an old post, I know...

 

I have this problem with the pegged core with my secondary GPU, a Radeon R7 360. The funny thing is that I always have the problem in 6.10 and later, so I stayed on 6.9. But I recently noticed that I sometimes have it on 6.9 as well. The sporadic nature, and the fact that it seems never to happen on first boot (though once it does happen, one VM that uses the card might still boot OK), made me think it is some kind of resource leak. Can you explain the "memory not mapped properly" issue I might encounter and how I would troubleshoot it? I have plenty of free memory, but I could see some sort of fragmentation issue being involved. I'm pretty stumped on this one.
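
One commonly reported reading of "memory not mapped properly" (an assumption here, since the quoted post does not spell it out) is that part of the GPU's PCI memory is still claimed by a host framebuffer driver, which shows up in `/proc/iomem` as an entry such as `efifb`, `BOOTFB`, `vesafb`, or `simplefb` nested under the GPU's address. The following is a minimal Python sketch that looks for such entries. The function names are invented for this example, and `SAMPLE` is illustrative text in the `/proc/iomem` layout, not output from any system in this thread. On a real host, read `/proc/iomem` as root, because unprivileged reads show zeroed addresses.

```python
import re

# Each /proc/iomem line is "<start>-<end> : <name>", nested by two-space indentation.
ENTRY_RE = re.compile(r"^(\s*)([0-9a-f]+)-([0-9a-f]+) : (.*)$")

# Host framebuffer owners that suggest the GPU memory is still in use by the host.
FRAMEBUFFER_NAMES = {"efifb", "BOOTFB", "vesafb", "simplefb"}


def parse_iomem(text):
    """Return a list of (depth, start, end, name); depth comes from the indentation."""
    entries = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m:
            indent, start, end, name = m.groups()
            entries.append((len(indent) // 2, int(start, 16), int(end, 16), name.strip()))
    return entries


def claims_inside_device(entries, pci_address):
    """Return (name, start, end) for every region nested under the device's regions."""
    claims = []
    device_depth = None
    for depth, start, end, name in entries:
        if name == pci_address:
            device_depth = depth            # entering one of the device's regions
        elif device_depth is not None and depth > device_depth:
            claims.append((name, start, end))
        else:
            device_depth = None             # left the device's subtree
    return claims


# Illustrative sample only (hypothetical addresses).
SAMPLE = """\
e0000000-f1ffffff : PCI Bus 0000:07
  e0000000-efffffff : 0000:07:00.0
    e0000000-e02fffff : efifb
  f0000000-f1ffffff : 0000:07:00.0
    f0000000-f1ffffff : vfio-pci
"""

claims = claims_inside_device(parse_iomem(SAMPLE), "0000:07:00.0")
suspicious = [name for name, _, _ in claims if name in FRAMEBUFFER_NAMES]
print("claims:", [name for name, _, _ in claims])
print("host framebuffer claims:", suspicious)
```

Comparing the output right after a host reboot with the output after the pegged-core state appears would show whether the owners of the GPU's regions change between the two, which is one way to test the resource-leak idea.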
