AMD GPU Passthrough



Hi, I'm new to Unraid and the community. I have reused parts from my old PC for an Unraid server, and I have been trying to get a VM working with my Gigabyte 5700 XT, so far to no avail. I followed Spaceinvader's guide (parts 1 and 2) step by step and was able to set up the VM and install the virtio drivers. The problems start as soon as I change the graphics card from VNC to the dedicated GPU. On the first boot after the change, the VM is stuck on a black screen and cannot be shut down gracefully, so it has to be force stopped. From then on, whenever the dedicated GPU is selected, the VM does not answer pings, does not show up on the router, and cannot be reached over VNC, and I have to force stop it. When I change the VM settings back to VNC, everything works as usual.

 

I have now rebuilt the VM twice with the same results, tried all of the PCIe ACS override settings, and tried adding the vBIOS manually (downloaded from TechPowerUp).

 

Below is the XML of the VM after switching back and forth between VNC and the GPU:

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm'>
  <name>Windows 10</name>
  <uuid>3ca404d4-c486-5c66-7ff6-801fe07777ac</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='6'/>
    <vcpupin vcpu='2' cpuset='1'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <vcpupin vcpu='4' cpuset='2'/>
    <vcpupin vcpu='5' cpuset='8'/>
    <vcpupin vcpu='6' cpuset='3'/>
    <vcpupin vcpu='7' cpuset='9'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-5.1'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/3ca404d4-c486-5c66-7ff6-801fe07777ac_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/domains/Windows 10/vdisk1.img'/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/Windows.iso'/>
      <target dev='hda' bus='ide'/>
      <readonly/>
      <boot order='2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/virtio-win-0.1.190-1.iso'/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:71:36:2a'/>
      <source bridge='br0'/>
      <model type='virtio-net'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
</domain>
 

myserver-diagnostics-20211027-1652.zip

Link to comment
13 minutes ago, Sulframus said:

the VM is stuck on a black screen

 

You are trying to pass through the only GPU in the system. It can work, but you will probably have to disable efifb (if Unraid is booted in UEFI mode) or vesafb (if Unraid is booted in legacy BIOS mode).

As you can see, part of the GPU's memory is already in use and cannot be reserved by vfio.

Quote

Oct 27 07:47:37 MyServer kernel: vfio-pci 0000:09:00.0: BAR 0: can't reserve [mem 0xe0000000-0xefffffff 64bit pref]

Output of

cat /proc/iomem

from unraid terminal?
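
For anyone reading along, the address range in that kernel message can be decoded to see which BAR is affected and how large it is. This is only a minimal sketch in Python (run it on any machine with Python 3 and paste the syslog line in); the regular expression is written against the single message quoted above and is an assumption about the format, not a complete parser for kernel logs.

```python
import re

# Pattern modelled on the one syslog line quoted above (assumed format).
BAR_RE = re.compile(
    r"vfio-pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): can't reserve "
    r"\[mem 0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)


def parse_bar_failure(line):
    """Return (device, bar_index, start, end, size_bytes), or None if the line does not match."""
    m = BAR_RE.search(line)
    if m is None:
        return None
    start = int(m.group("start"), 16)
    end = int(m.group("end"), 16)
    # The range is inclusive, so the size is end - start + 1.
    return (m.group("dev"), int(m.group("bar")), start, end, end - start + 1)


line = ("Oct 27 07:47:37 MyServer kernel: vfio-pci 0000:09:00.0: "
        "BAR 0: can't reserve [mem 0xe0000000-0xefffffff 64bit pref]")
dev, bar, start, end, size = parse_bar_failure(line)
print(f"{dev} BAR {bar}: {size // 2**20} MiB at {start:#x}")
# prints: 0000:09:00.0 BAR 0: 256 MiB at 0xe0000000
```

The same start address (e0000000) is what to look for in /proc/iomem: whatever is listed underneath it is the thing holding the range.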

 

PS: I need to make a counter and see how it increases with time, seems like my answers are the same written over again in the forum :D

Please mods don't assume it's spam :D

 

Edited by ghost82
Link to comment
8 minutes ago, ghost82 said:

 

You are trying to pass through the only GPU in the system. It can work, but you will probably have to disable efifb (if Unraid is booted in UEFI mode) or vesafb (if Unraid is booted in legacy BIOS mode).

As you can see, part of the GPU's memory is already in use and cannot be reserved by vfio.

Output of

cat /proc/iomem

from unraid terminal?

 

Unraid is booting with UEFI. I had previously added "video=efifb:off" to the GUI Mode boot entry by mistake. I have now put it on the default entry and rebooted. Now I can see the VM on the router, but ping, RDP and VNC are still not working.
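
Since the option ended up on the wrong boot entry at first, here is a rough Python sketch for checking which entry actually carries it. It assumes the usual syslinux.cfg layout (a `label` line per entry, `menu default` marking the default, and an `append` line with the kernel options); the sample text is a hypothetical config modelled on a stock Unraid one, and `sample_cfg` is a helper made up for this demo.

```python
def boot_entries(cfg_text):
    """Map each 'label' in a syslinux.cfg to {'default': bool, 'append': str}."""
    entries = {}
    current = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        lower = line.lower()
        if lower.startswith("label "):
            current = line[6:].strip()
            entries[current] = {"default": False, "append": ""}
        elif current is not None and lower == "menu default":
            entries[current]["default"] = True
        elif current is not None and lower.startswith("append "):
            entries[current]["append"] = line[7:].strip()
    return entries


def default_has_option(cfg_text, option="video=efifb:off"):
    """True if the entry marked 'menu default' has the option on its append line."""
    for entry in boot_entries(cfg_text).values():
        if entry["default"]:
            return option in entry["append"].split()
    return False


def sample_cfg(default_append, gui_append):
    """Hypothetical two-entry config used only for this demo."""
    return "\n".join([
        "default menu.c32",
        "label Unraid OS",
        "  menu default",
        "  kernel /bzimage",
        f"  append {default_append}",
        "label Unraid OS GUI Mode",
        "  kernel /bzimage",
        f"  append {gui_append}",
    ])


wrong = sample_cfg("initrd=/bzroot", "initrd=/bzroot,/bzroot-gui video=efifb:off")
fixed = sample_cfg("initrd=/bzroot video=efifb:off", "initrd=/bzroot,/bzroot-gui")
print(default_has_option(wrong), default_has_option(fixed))
```

To use it on a real server, read the text of the flash drive's syslinux.cfg (normally under /boot/syslinux/, but check your own setup) and pass it to `default_has_option`.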

 

root@MyServer:~# cat /proc/iomem
00000000-00000fff : Reserved
00001000-0009ffff : System RAM
000a0000-000fffff : Reserved
  000a0000-000bffff : PCI Bus 0000:00
  000c0000-000dffff : PCI Bus 0000:00
    000c0000-000cdfff : Video ROM
  000f0000-000fffff : System ROM
00100000-09e0ffff : System RAM
  04000000-04a00816 : Kernel code
  04c00000-04e4afff : Kernel rodata
  05000000-05127f7f : Kernel data
  05471000-055fffff : Kernel bss
09e10000-09ffffff : Reserved
0a000000-0a1fffff : System RAM
0a200000-0a20bfff : ACPI Non-volatile Storage
0a20c000-0affffff : System RAM
0b000000-0b01ffff : Reserved
0b020000-c76cf017 : System RAM
c76cf018-c76e7c57 : System RAM
c76e7c58-c76e8017 : System RAM
c76e8018-c76f6057 : System RAM
c76f6058-d174dfff : System RAM
d174e000-d176cfff : ACPI Tables
d176d000-d81a0fff : System RAM
d81a1000-d81a1fff : Reserved
d81a2000-da60bfff : System RAM
da60c000-da749fff : Reserved
da74a000-da759fff : ACPI Tables
da75a000-da861fff : System RAM
da862000-dac21fff : ACPI Non-volatile Storage
dac22000-db77efff : Reserved
db77f000-ddffffff : System RAM
de000000-dfffffff : Reserved
e0000000-fec2ffff : PCI Bus 0000:00
  e0000000-f01fffff : PCI Bus 0000:07
    e0000000-f01fffff : PCI Bus 0000:08
      e0000000-f01fffff : PCI Bus 0000:09
        e0000000-efffffff : 0000:09:00.0
          e0000000-efffffff : vfio-pci
        f0000000-f01fffff : 0000:09:00.0
          f0000000-f01fffff : vfio-pci
  f8000000-fbffffff : PCI MMCONFIG 0000 [bus 00-3f]
    f8000000-fbffffff : Reserved
      f8000000-fbffffff : pnp 00:00
  fc600000-fc8fffff : PCI Bus 0000:0b
    fc600000-fc6fffff : 0000:0b:00.3
      fc600000-fc6fffff : xhci-hcd
    fc700000-fc7fffff : 0000:0b:00.1
      fc700000-fc7fffff : ccp
    fc800000-fc807fff : 0000:0b:00.4
    fc808000-fc809fff : 0000:0b:00.1
      fc808000-fc809fff : ccp
  fc900000-fcafffff : PCI Bus 0000:07
    fc900000-fc9fffff : PCI Bus 0000:08
      fc900000-fc9fffff : PCI Bus 0000:09
        fc900000-fc97ffff : 0000:09:00.0
          fc900000-fc97ffff : vfio-pci
        fc9a0000-fc9a3fff : 0000:09:00.1
          fc9a0000-fc9a3fff : vfio-pci
    fca00000-fca03fff : 0000:07:00.0
  fcb00000-fccfffff : PCI Bus 0000:02
    fcb00000-fcbfffff : PCI Bus 0000:03
      fcb00000-fcbfffff : PCI Bus 0000:05
        fcb00000-fcb03fff : 0000:05:00.0
        fcb04000-fcb04fff : 0000:05:00.0
          fcb04000-fcb04fff : r8169
    fcc00000-fcc7ffff : 0000:02:00.1
    fcc80000-fcc9ffff : 0000:02:00.1
      fcc80000-fcc9ffff : ahci
    fcca0000-fcca7fff : 0000:02:00.0
      fcca0000-fcca7fff : xhci-hcd
  fcd00000-fcdfffff : PCI Bus 0000:0d
    fcd00000-fcd007ff : 0000:0d:00.0
      fcd00000-fcd007ff : ahci
  fce00000-fcefffff : PCI Bus 0000:0c
    fce00000-fce007ff : 0000:0c:00.0
      fce00000-fce007ff : ahci
  fcf00000-fcffffff : PCI Bus 0000:01
    fcf00000-fcf03fff : 0000:01:00.0
      fcf00000-fcf03fff : nvme
  fd000000-fd0fffff : Reserved
    fd000000-fd0fffff : pnp 00:01
  fd500000-fd5fffff : Reserved
  fea00000-fea0ffff : Reserved
  feb80000-fec01fff : Reserved
    feb80000-febfffff : amd_iommu
    fec00000-fec003ff : IOAPIC 0
    fec01000-fec013ff : IOAPIC 1
  fec10000-fec10fff : Reserved
    fec10000-fec10fff : pnp 00:05
fec30000-fec30fff : Reserved
  fec30000-fec30fff : AMDIF030:00
fed00000-fed00fff : Reserved
  fed00000-fed003ff : HPET 0
    fed00000-fed003ff : PNP0103:00
fed40000-fed44fff : Reserved
fed80000-fed8ffff : Reserved
  fed81500-fed818ff : AMDI0030:00
fedc0000-fedc0fff : pnp 00:05
fedc2000-fedcffff : Reserved
fedd4000-fedd5fff : Reserved
fee00000-ffffffff : PCI Bus 0000:00
  fee00000-feefffff : Reserved
    fee00000-fee00fff : Local APIC
      fee00000-fee00fff : pnp 00:05
  ff000000-ffffffff : Reserved
    ff000000-ffffffff : pnp 00:05
100000000-41f37ffff : System RAM
41f380000-41fffffff : RAM buffer
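
One way to read a dump like this without scanning it by eye is to list what sits directly underneath the GPU's regions. The Python sketch below relies on the two-space indentation that /proc/iomem uses for nesting. The first sample is copied from the output above; the second, with efifb holding part of the range, is a hypothetical example of the "busy" case and not taken from this server.

```python
def parse_iomem(text):
    """Yield (depth, start, end, name) for each line of /proc/iomem output."""
    for raw in text.splitlines():
        if " : " not in raw:
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        span, name = raw.strip().split(" : ", 1)
        start, end = (int(x, 16) for x in span.split("-"))
        yield indent // 2, start, end, name


def claimants(text, pci_addr):
    """Names nested directly under every region that belongs to pci_addr."""
    found = set()
    device_depth = None
    for depth, _start, _end, name in parse_iomem(text):
        if name == pci_addr:
            device_depth = depth          # entering one of the device's regions
        elif device_depth is not None and depth == device_depth + 1:
            found.add(name)               # direct child: whoever claimed the region
        elif device_depth is not None and depth <= device_depth:
            device_depth = None           # left the device's subtree
    return found


FROM_THIS_THREAD = "\n".join([
    "      e0000000-f01fffff : PCI Bus 0000:09",
    "        e0000000-efffffff : 0000:09:00.0",
    "          e0000000-efffffff : vfio-pci",
    "        f0000000-f01fffff : 0000:09:00.0",
    "          f0000000-f01fffff : vfio-pci",
])

# Hypothetical: what the same region could look like while efifb still holds it.
HYPOTHETICAL_BUSY = "\n".join([
    "      e0000000-f01fffff : PCI Bus 0000:09",
    "        e0000000-efffffff : 0000:09:00.0",
    "          e0000000-e02fffff : efifb",
])

print(sorted(claimants(FROM_THIS_THREAD, "0000:09:00.0")))
print(sorted(claimants(HYPOTHETICAL_BUSY, "0000:09:00.0")))
```

If the only name reported for 0000:09:00.0 is vfio-pci, as in the dump above, the framebuffer is no longer holding the card's memory.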
 

Link to comment
9 minutes ago, Sulframus said:

Unraid is booting with UEFI. I had previously added "video=efifb:off" to the GUI Mode boot entry by mistake. I have now put it on the default entry and rebooted. Now I can see the VM on the router, but ping, RDP and VNC are still not working.

OK, was this output taken after you added video=efifb:off, or not? I ask because I don't see efifb in it, and the addresses are now free for vfio...

Try changing the GPU section of the XML from this:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>

 

To this:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </hostdev>
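
The layout change above (video and audio functions of the card in the same guest slot, with multifunction='on' on function 0) can be checked mechanically. This is a small Python sketch using only the standard library. It checks the convention suggested in this post, which is not a rule enforced by libvirt, and `make_hostdev` and `domain` are hypothetical helpers that build cut-down XML for the demo.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def check_multifunction(domain_xml):
    """Return a list of problems with how multi-function host devices are placed in the guest."""
    root = ET.fromstring(domain_xml)
    groups = defaultdict(list)  # host (domain, bus, slot) -> [(host function, guest address element)]
    for dev in root.iter("hostdev"):
        src = dev.find("source/address")
        dst = dev.find("address")
        if src is None or dst is None:
            continue
        host_slot = (src.get("domain"), src.get("bus"), src.get("slot"))
        groups[host_slot].append((int(src.get("function"), 16), dst))

    problems = []
    for host_slot, funcs in groups.items():
        if len(funcs) < 2:
            continue  # single-function device, nothing to check
        name = ":".join(host_slot)
        if len({(d.get("bus"), d.get("slot")) for _, d in funcs}) > 1:
            problems.append(f"{name}: functions sit in different guest slots")
        for host_fn, d in funcs:
            if int(d.get("function"), 16) != host_fn:
                problems.append(f"{name}: host function {host_fn} is not guest function {host_fn}")
            if host_fn == 0 and d.get("multifunction") != "on":
                problems.append(f"{name}: function 0 lacks multifunction='on'")
    return problems


def make_hostdev(host_fn, guest_slot, guest_fn, multifunction=False):
    """Hypothetical helper: one <hostdev> for host device 09:00.<host_fn>."""
    extra = " multifunction='on'" if multifunction else ""
    return (
        "<hostdev mode='subsystem' type='pci' managed='yes'><source>"
        f"<address domain='0x0000' bus='0x09' slot='0x00' function='{host_fn}'/></source>"
        f"<address type='pci' domain='0x0000' bus='0x00' slot='{guest_slot}' function='{guest_fn}'{extra}/>"
        "</hostdev>"
    )


def domain(*devs):
    """Hypothetical helper: wrap hostdev elements in a minimal domain document."""
    return "<domain><devices>" + "".join(devs) + "</devices></domain>"


BEFORE = domain(make_hostdev("0x0", "0x05", "0x0"), make_hostdev("0x1", "0x06", "0x0"))
AFTER = domain(make_hostdev("0x0", "0x05", "0x0", multifunction=True),
               make_hostdev("0x1", "0x05", "0x1"))

for problem in check_multifunction(BEFORE):
    print(problem)
print(check_multifunction(AFTER))
```

When feeding it a full XML export, pass the file contents as bytes or strip the `<?xml ...?>` line first if your parser complains about the encoding declaration.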

 

Note that your GPU may suffer from the so-called AMD reset bug, so change one thing at a time and reboot the whole server before starting the VM.

 

8 minutes ago, JonathanM said:

Trust me, we know exactly how you feel.

 

Thank you for taking the time to deal with this.

No problem at all. If I have the time and I think I can solve the issue, it's a pleasure.

Edited by ghost82
Link to comment

Unfortunately, no dice. 

 

Adding logs from VM itself as well.

 

-overcommit mem-lock=off \
-smp 8,sockets=1,dies=1,cores=4,threads=2 \
-uuid 3ca404d4-c486-5c66-7ff6-801fe07777ac \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=31,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=localtime \
-no-hpet \
-no-shutdown \
-boot strict=on \
-device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x7.0x7 \
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x7 \
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x7.0x1 \
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x7.0x2 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 \
-blockdev '{"driver":"file","filename":"/mnt/user/domains/Windows 10/vdisk1.img","node-name":"libvirt-3-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-3-format","read-only":false,"cache":{"direct":false,"no-flush":false},"driver":"raw","file":"libvirt-3-storage"}' \
-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=libvirt-3-format,id=virtio-disk2,bootindex=1,write-cache=on \
-blockdev '{"driver":"file","filename":"/mnt/user/isos/Windows.iso","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-2-format","read-only":true,"driver":"raw","file":"libvirt-2-storage"}' \
-device ide-cd,bus=ide.0,unit=0,drive=libvirt-2-format,id=ide0-0-0,bootindex=2 \
-blockdev '{"driver":"file","filename":"/mnt/user/isos/virtio-win-0.1.190-1.iso","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":true,"driver":"raw","file":"libvirt-1-storage"}' \
-device ide-cd,bus=ide.0,unit=1,drive=libvirt-1-format,id=ide0-0-1 \
-netdev tap,fd=33,id=hostnet0 \
-device virtio-net,netdev=hostnet0,id=net0,mac=52:54:00:71:36:2a,bus=pci.0,addr=0x2 \
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=serial0 \
-chardev socket,id=charchannel0,fd=34,server,nowait \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \
-device usb-tablet,id=input0,bus=usb.0,port=1 \
-device vfio-pci,host=0000:09:00.0,id=hostdev0,bus=pci.0,multifunction=on,addr=0x5 \
-device vfio-pci,host=0000:09:00.1,id=hostdev1,bus=pci.0,addr=0x5.0x1 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
2021-10-27 15:44:48.825+0000: Domain id=1 is tainted: high-privileges
2021-10-27 15:44:48.825+0000: Domain id=1 is tainted: host-cpu
char device redirected to /dev/pts/0 (label charserial0)

Link to comment
4 minutes ago, ghost82 said:

I cannot find anything abnormal in the diagnostics. To rule out a vBIOS issue, I would find a way to dump the card's own vBIOS.

I tried Spaceinvader's script for dumping the vBIOS from within Unraid, but it failed with an error. I will put the GPU in another machine to dump the vBIOS.

Edited by Sulframus
Typo
Link to comment
On 10/28/2021 at 8:22 AM, ghost82 said:

I cannot find anything abnormal in the diagnostics. To rule out a vBIOS issue, I would find a way to dump the card's own vBIOS.

After dumping the vBIOS from my own GPU and fully reinstalling the VM, I was able to get it working. But after some time the VM froze, and now it is behaving the same way as before. I'll try again tomorrow with a fresh image; hopefully the results will be good.
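
For a quick sanity check of a dumped or downloaded ROM file before pointing the VM at it, the standard PCI option ROM header can be inspected. A minimal Python sketch follows. It relies on the option ROM layout (0x55 0xAA signature at offset 0, a 16-bit pointer at offset 0x18 to a "PCIR" structure that holds the vendor and device IDs). It only shows that the file looks like an option ROM for the expected vendor, not that it is the right image for a particular card. The demo ROM is synthetic, and the device ID 0x731F used in it is an assumption for a Navi 10 card.

```python
import struct

AMD_VENDOR_ID = 0x1002


def inspect_rom(data):
    """Return (vendor_id, device_id) from a PCI option ROM image, or raise ValueError."""
    if len(data) < 0x1A or data[0:2] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA option ROM signature")
    # Offset 0x18 holds a little-endian pointer to the PCI data structure.
    (pcir_offset,) = struct.unpack_from("<H", data, 0x18)
    if pcir_offset + 8 > len(data) or data[pcir_offset:pcir_offset + 4] != b"PCIR":
        raise ValueError("PCI data structure ('PCIR') not found")
    vendor_id, device_id = struct.unpack_from("<HH", data, pcir_offset + 4)
    return vendor_id, device_id


# Synthetic 64-byte image built only to exercise the function (not a real vBIOS).
rom = bytearray(64)
rom[0:2] = b"\x55\xaa"
struct.pack_into("<H", rom, 0x18, 0x20)
rom[0x20:0x24] = b"PCIR"
struct.pack_into("<HH", rom, 0x24, AMD_VENDOR_ID, 0x731F)  # device ID assumed for the demo

vendor, device = inspect_rom(bytes(rom))
print(f"vendor {vendor:#06x}, device {device:#06x}, AMD: {vendor == AMD_VENDOR_ID}")
```

On a real file, read it with `open(path, "rb").read()` and compare the IDs with what `lspci -nn` reports for the card.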

Link to comment
  • 2 months later...

Any updates on this?

 

I also have a gigabyte 5700xt, and once upon a time with a previous config was doing GPU passthrough without any issues. Then recently I built a new server, put my 5700xt into it, followed the same steps as before and for the life of me can't get the damn thing to work.

 

Hoping someone's seen the light w/ this card + unraid combo

Link to comment
19 hours ago, Matthew Kent said:

Any updates on this?

 

I also have a gigabyte 5700xt, and once upon a time with a previous config was doing GPU passthrough without any issues. Then recently I built a new server, put my 5700xt into it, followed the same steps as before and for the life of me can't get the damn thing to work.

 

Hoping someone's seen the light w/ this card + unraid combo

Unfortunately, I've since moved my Unraid server onto different hardware, so I wasn't able to get GPU passthrough working in the end.

Link to comment
  • 1 year later...
On 1/12/2022 at 1:12 PM, Matthew Kent said:

Any updates on this?

 

I also have a gigabyte 5700xt, and once upon a time with a previous config was doing GPU passthrough without any issues. Then recently I built a new server, put my 5700xt into it, followed the same steps as before and for the life of me can't get the damn thing to work.

 

Hoping someone's seen the light w/ this card + unraid combo

Same issue here, and I've been pulling my hair out trying to get the GPU passed through. It shows up in the VM once I install and boot with VNC, but the drivers fail to install.

Edit: just to clarify this is on a 5700 xt as well

 

Edited by bonfire62
Link to comment

Hey! 
 

I finally figured it out. I don’t know why I was able to get this working on my old server, but essentially everything started working for me when I replaced my CPU with a combined CPU/GPU chip (one with integrated graphics). It may be that I had an extra junk GPU inside my old server, as it’s a full tower, and forgot about it. Either way, I think GPU passthrough doesn’t work properly unless the system has more than one GPU.

Edited by Matthew Kent
Link to comment
