[PCIe GPU Passthrough] Unable to passthrough to ANY VM (Error 43/Bootloop)



Hello Unraid Forums

 

I'm having major issues trying to set up PCIe GPU passthrough on my Unraid server. Everything else is working perfectly.

 

I started on 6.8.3 RC and I have upgraded to 6.9.0 beta1 to see if the issue was a bug with Unraid.


Before we start: I have tried multiple methods of passing a GPU through to a VM. I have read the forum rules, and although this subject has been covered before, I still cannot get to the bottom of the issue. I'll go into more detail below. I've attached the diagnostics and the XML for my Win10 VM. First, I'll outline why I decided to use Unraid.

 

Reasons for starting Unraid project

I have been studying for my CompTIA A+ and passed my first exam, with a view to going on to a CCNX qualification. I work in 1st Line IT Tech Support and have predominantly used Windows throughout my life. I wanted to learn more about Docker containers, VMs, paravirtualisation and cloud applications. I thought of renting a hosted VM platform, but the cost is way too high considering I have hardware I can learn with myself (cloud VMs with GPUs are $$$). I started messing about with VirtualBox and wanted to create a paravirtualised environment with PCIe GPU passthrough for gaming. It would serve as a sort of useful retirement home for spare parts etc.

 

What I want to achieve

I want to use dockers for more persistent applications, such as P2P, ad-blocking, media server, etc. My intention was to have the following:

  1. Intel I350-T4 Quad Port Gigabit - Passthrough to pfSense
  2. MSI GTX 970 - Passthrough to a stable gaming VM (Win10/Mint/SteamOS)
  3. Docker containers to handle app services, such as ad-blocking, P2P server, Plex, etc
  4. Hackintosh environment
  5. Windows 10 environment
  6. Linux environment

Just quickly for the record, uptime doesn't really matter too much to me. Whilst pfSense is my number one priority to learn, I need to do that last, after I have got the gaming/media VMs to work. I've got a failover plan if the Unraid server goes down, so let's leave my desire to use pfSense aside.

 

Brief Specs
CPU Intel i5-4440 - GPU MSI GTX 970 - MB Gigabyte B85 HD3 rev 2.1
1x 240GB SSD - 1x 1TB Parity Drives - 3x 500GB Data Drives 

Steam Controller - Intel I350-T4 Quad Port Gigabit

 

Guides/Videos Used So Far:

https://forums.serverbuilds.net/t/guide-remote-gaming-on-unraid/4248/9

Steps so far

I've been trying to get the GPU to pass through to a VM for the best part of 10 days solid whilst I'm furloughed, and I'm starting to lose it a bit. I know what I want to achieve is possible with the hardware I have (with some limitations); upgrades will be considered if I can get the project to a workable state.

The best I've got so far is to be able to pass the GPU through to Windows 10 Pro VM but I'm getting the dreaded error 43 message.

 

CPU and Motherboard

Before even starting this project I made sure that my CPU (i5-4440) and MB (Gigabyte B85-HD3) both support VT-d and virtualisation, which they apparently do.

 

GPU and vBIOS
I have placed the GTX 970 into my main PC and it worked perfectly. I did some stress testing and benchmarking and it's a perfectly usable card. Whilst I was doing this, I used MSI Afterburner to download my BIOS ROM for use as a vBIOS so I could follow the guides. What's really perplexing me is that I see a lot of users getting their GTX 970 to pass through with no problem, and another group saying they just couldn't do it and gave up.

 

I've tried legacy boot with the iGPU as the console device, to make sure Unraid isn't doing weird things to the GTX 970 by loading it for the console. I know it disappears when you launch the VM, but it's something I thought I'd try, as the second slot option hasn't been working for me either.

 

Unraid

I've tried both UEFI and Legacy boot options for Unraid. I followed the recommendation of @SpaceInvaderOne to install the plug-in.

 

No modification attempts

  • Attempted to passthrough on Linux Mint 19.3 - Fail
    • UEFI - Fail (System hangs after installing drivers)
    • Legacy - Fail (System hangs after installing drivers)
  • Attempted to passthrough to Windows 10 Pro - Fail
    • UEFI - Fail (Error 43)
    • Legacy - Fail (Error 43)
  • Attempted to passthrough to SteamOS - Fail
    • UEFI - Fail
    • Legacy - Fail

SpaceInvaderOne style

  • Edited VM XML - Fail
  • Edited Syslinux - Fail


Windows 10 XML

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm'>
  <name>VM-WIN10PRO</name>
  <uuid>NOT SHARING THIS WITH YOU</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='2'/>
    <emulatorpin cpuset='3'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-4.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/NOT SHARING THIS WITH YOU_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='2' threads='1'/>
    <cache mode='passthrough'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/domains/VM-WIN10PRO/vdisk1.img'/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/isos/virtio-win-0.1.185.iso'/>
      <target dev='hdb' bus='sata'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='qemu-xhci' ports='15'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:ac:c2:7f'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='4'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <rom file='/boot/GPU-vBIOS/nVidia/MSI-GTX970-GM204-200-A1.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x1'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x046d'/>
        <product id='0xc226'/>
      </source>
      <address type='usb' bus='0' port='1'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x046d'/>
        <product id='0xc227'/>
      </source>
      <address type='usb' bus='0' port='2'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x28de'/>
        <product id='0x1142'/>
      </source>
      <address type='usb' bus='0' port='3'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
</domain>
 

 

 

 

tango-cloud-diagnostics-20200607-1917.zip

Link to comment

Whilst trawling through information on B85 chipsets, I noticed it says that B85 doesn't support VT-d, even though the BIOS has an option for it. I'm conflicted about whether this board has a decent implementation.
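One way to settle the "does VT-d actually work on this board" question from the running system, rather than from spec sheets, is to check whether the kernel has populated any IOMMU groups. The sketch below is only an assumption-laden illustration: it relies on the standard Linux sysfs location `/sys/kernel/iommu_groups`, and the function name `count_iommu_groups` is made up for this example. A count of zero suggests the IOMMU is not active (disabled in the BIOS or not exposed), while a non-zero count, such as the groups listed later in this thread, suggests it is.

```python
from pathlib import Path


def count_iommu_groups(sysfs_root="/sys/kernel/iommu_groups"):
    """Return how many IOMMU groups the kernel exposes under sysfs_root.

    Hypothetical helper for illustration. Each group normally appears as a
    numbered directory; a missing or empty directory yields 0.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return 0
    return sum(1 for entry in root.iterdir() if entry.is_dir())


if __name__ == "__main__":
    groups = count_iommu_groups()
    if groups:
        print(f"IOMMU appears active: {groups} group(s) found")
    else:
        print("No IOMMU groups found: VT-d may be disabled or unsupported")
```

This only shows that the IOMMU is enabled; it says nothing about whether a particular device resets cleanly or whether the guest driver will accept it.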

I also tried making sure there were no other PCIe devices installed, as I know my CPU only provides one x16 slot directly and everything else goes through the southbridge. I installed an Intel NIC when I built this machine a few weeks ago, and whilst troubleshooting I realised that with the card installed the system only detects one of the 8GB modules (1x8GB), whereas without it both are detected (2x8GB), which is strange. I'm going to try this passthrough without the NIC on a fresh install (keeping only the disk/network config).

Maybe my motherboard is faulty and I'm actually getting a genuine error 43 when trying to pass through. I really don't want to upgrade just yet, but it's becoming apparent that this might be the only way forward with this issue. I'm thinking of going for a B450 MB and a Ryzen 3100 with about 16GB of DDR.

I'm also on a trial licence, and I don't really want to sink loads of money into a semi-new build. I want to get this hardware running as proof and then upgrade, so I'm going to plough on for a few more weeks and hopefully I can sort something out so I can justify the full licence and the following upgrades.

Link to comment

I managed to get it working on Windows 10 Pro. In order to get it working I did the following.

  1. Formatted the array and shut down
  2. Backed up my flash drive and separated my licence key file
  3. Unzipped the Unraid files, adding the licence key, the vBIOS and the latest VirtIO drivers (0.1.185)
  4. Configured the VM manager (NO ACS/Latest VirtIO)
  5. Created a VM with
    1. 3 CPUs / 8GB RAM
    2. i440fx-4.2 with OVMF BIOS
    3. Hyper-V off
    4. 60G vdisk, VNC, no other passthrough
    5. Install/Driver = SATA
    6. Passed through the vBIOS I pulled from MSI Afterburner
  6. Installed the Windows 10 VM as per the manual instructions (i.e. installing drivers pre-desktop)
  7. Booted to desktop, installed all further drivers and enabled RDP
  8. Restarted and logged in via RDP
  9. Installed latest GeForce drivers (446.14 DCH WHQL)

I'm fairly certain that having another PCIe device installed has been causing me issues, due to the above randomness caused by losing a memory module by inserting the NIC. I'm not sure if my northbridge/CPU/MB is faulty but I would certainly recommend removing PCIe devices to see if you can get GPU passthrough, then add them in afterwards.

Here are my IOMMU groupings:
 

IOMMU group 0:[8086:0c00] 00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
IOMMU group 1:[8086:0c01] 00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
              [10de:13c2] 01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
              [10de:0fbb] 01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
IOMMU group 2:[8086:0412] 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
IOMMU group 3:[8086:0c0c] 00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
IOMMU group 4:[8086:8c31] 00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
IOMMU group 5:[8086:8c3a] 00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
IOMMU group 6:[8086:8c2d] 00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
IOMMU group 7:[8086:8c20] 00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
IOMMU group 8:[8086:8c10] 00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5)
IOMMU group 9:[8086:8c14] 00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 (rev d5)
IOMMU group 10:[8086:8c16] 00:1c.3 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #4 (rev d5)
IOMMU group 11:[8086:8c26] 00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 05)
IOMMU group 12:[8086:8c50] 00:1f.0 ISA bridge: Intel Corporation B85 Express LPC Controller (rev 05)
               [8086:8c02] 00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05)
               [8086:8c22] 00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 05)
IOMMU group 13:[10ec:8168] 03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
IOMMU group 14:[8086:244e] 04:00.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 41)
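For anyone comparing their own groupings against the listing above, here is a rough Python sketch that parses this style of listing and reports which devices share a group with the GPU. It is an illustration only: the function names (`parse_iommu_groups`, `group_of`) are invented for this example, and the regular expressions assume the exact "IOMMU group N:[vendor:device] address description" layout shown in this post. In the listing above, the GTX 970 (01:00.0) and its audio function (01:00.1) share group 1 with only the CPU's PCIe x16 controller, which fits with the VM working with no ACS override.

```python
import re
from collections import defaultdict

# Matches the start of a new group, e.g. "IOMMU group 1:"
GROUP_RE = re.compile(r"^\s*IOMMU group (\d+):")
# Matches "[vendor:device] pci-address description"
DEVICE_RE = re.compile(r"\[([0-9a-f]{4}:[0-9a-f]{4})\]\s+(\S+)\s+(.+)")


def parse_iommu_groups(listing):
    """Map group number -> list of (vendor:device id, PCI address, description)."""
    groups = defaultdict(list)
    current = None
    for line in listing.splitlines():
        header = GROUP_RE.match(line)
        if header:
            current = int(header.group(1))
        device = DEVICE_RE.search(line)
        if device and current is not None:
            groups[current].append(
                (device.group(1), device.group(2), device.group(3).strip())
            )
    return dict(groups)


def group_of(groups, pci_address):
    """Return the group number containing pci_address, or None if absent."""
    for number, devices in groups.items():
        if any(address == pci_address for _, address, _ in devices):
            return number
    return None


SAMPLE = """\
IOMMU group 1:[8086:0c01] 00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
              [10de:13c2] 01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
              [10de:0fbb] 01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
IOMMU group 2:[8086:0412] 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
"""

if __name__ == "__main__":
    parsed = parse_iommu_groups(SAMPLE)
    gpu_group = group_of(parsed, "01:00.0")
    print(f"GPU is in IOMMU group {gpu_group}, shared with:")
    for dev_id, address, description in parsed[gpu_group]:
        print(f"  {address} [{dev_id}] {description}")
```

If unrelated devices (a NIC, a SATA controller) turn up in the same group as the GPU, that is usually the point at which people start looking at a different slot or the ACS override setting.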

I've downloaded Fallout New Vegas and it's working a dream now.

Link to comment

I was mucking with my VM template XML and started getting error 43 for the GPU.  I fixed it using something I found online by adding some hyperv parameters.  Specifically the vendor ID one is the one that the website I looked at says fixes it and sure enough I'm working great.

 

<hyperv>
  <relaxed state='on'/>
  <vapic state='on'/>
  <spinlocks state='on' retries='8191'/>
  <vpindex state='on'/>
  <synic state='on'/>
  <stimer state='on'/>
  <reset state='on'/>
  <vendor_id state='on' value='1234567890ab'/>
  <frequencies state='on'/>
</hyperv>
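In libvirt domain XML the `<hyperv>` element lives inside `<features>`, next to `<acpi/>` and `<apic/>`. As a hedged illustration of applying just the `vendor_id` part of the block above programmatically, the following Python sketch uses only the standard library's `xml.etree.ElementTree`. The function name `set_hyperv_vendor_id` is made up for this example, and the 12-character limit reflects my understanding of libvirt's documented restriction on the `value` attribute.

```python
import xml.etree.ElementTree as ET


def set_hyperv_vendor_id(domain_xml, vendor_id="1234567890ab"):
    """Return domain_xml with <features><hyperv><vendor_id .../> set.

    Hypothetical helper for illustration. Existing <features> children
    (acpi, apic, other hyperv flags) are left untouched.
    """
    if len(vendor_id) > 12:
        # Assumption: libvirt accepts at most 12 characters here.
        raise ValueError("vendor_id must be at most 12 characters")

    root = ET.fromstring(domain_xml)
    features = root.find("features")
    if features is None:
        features = ET.SubElement(root, "features")
    hyperv = features.find("hyperv")
    if hyperv is None:
        hyperv = ET.SubElement(features, "hyperv")
    vendor = hyperv.find("vendor_id")
    if vendor is None:
        vendor = ET.SubElement(hyperv, "vendor_id")
    vendor.set("state", "on")
    vendor.set("value", vendor_id)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    minimal = "<domain type='kvm'><features><acpi/><apic/></features></domain>"
    print(set_hyperv_vendor_id(minimal))
```

The output would still need to be pasted back into the VM's XML view (or applied with something like `virsh edit`). Take a copy of the original XML first, since editing the VM through a form-based editor afterwards may overwrite manual changes.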

 

Link to comment
  • 8 months later...
On 6/11/2020 at 7:11 AM, nickp85 said:

I was mucking with my VM template XML and started getting error 43 for the GPU.  I fixed it using something I found online by adding some hyperv parameters.  Specifically the vendor ID one is the one that the website I looked at says fixes it and sure enough I'm working great.

 




 

Thanks for this, mate! You saved my day. I tried almost everything on the internet and still had the same error 43; after this fix the card works PERFECTLY!
WBR!

Edited by b0n3v
Link to comment
