Radeon RX 7900 XT passthrough



Has anyone tried passing an AMD Navi 31 (Radeon RX 7900 XT or XTX) GPU through to a VM yet? I just got the RX 7900 XT and have not had any luck passing it through to a Windows 11 VM. I'm upgrading from an RX 6800 XT, which I had working previously. These are the steps I've taken:

 

  1. Bind 0d:00.0, 0d:00.1, 0d:00.2, and 0d:00.3 (all four devices in the GPU's IOMMU group) to VFIO at boot (see the sketch after this list)
  2. Add video=efifb:off to the append line (append initrd=/bzroot video=efifb:off) in the Syslinux configuration
  3. Set <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0' multifunction='on'/> on the GPU in the VM XML
  4. Change the other three GPU devices to the same bus and slot, each with its own function, i.e. <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
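For step 1, Unraid's System Devices page writes the bindings to /boot/config/vfio-pci.cfg on the flash drive. A minimal sketch of the file, assuming the four functions above (the file is normally generated by the GUI, and newer Unraid versions also record each device's vendor:device ID):

BIND=0000:0d:00.0 0000:0d:00.1 0000:0d:00.2 0000:0d:00.3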

 

After rebooting Unraid, nothing displays on the attached monitor on first VM boot. After remoting into the VM with RDP, the GPU shows a Code 43 error in Device Manager. After rebooting the VM, the GPU doesn't show up in Device Manager at all.
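One host-side sanity check worth running while debugging: confirm all four functions really are claimed by vfio-pci rather than amdgpu. A sketch, assuming the card sits at 0d:00 as above:

lspci -nnk -s 0d:00
# each of the four functions should report: Kernel driver in use: vfio-pci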


Maybe entering a vendor ID and setting the KVM hidden state to on will help:

 

  <features>
    <acpi/>
    <apic/>
    <hyperv mode='custom'>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='1234567890ab'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
  </features>
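(These lines go inside the VM XML's existing <features> block; the vendor_id value is an arbitrary string of up to 12 characters, so 1234567890ab is just a placeholder.)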

 

Not using beta drivers might help too.

 

Edited by Jumbo_Erdnuesse
  • Solution

I got this working! First, I was able to successfully pass through the GPU after switching to legacy CSM boot. That let me save my vbios rom. After adding the vbios rom to the VM XML and switching back to EFI boot, the GPU passthrough works! I can also confirm the GPU resets properly when rebooting the VM.

Edited by Ancalagon
On 12/28/2022 at 6:02 AM, Ancalagon said:

I got this working! First, I was able to successfully pass through the GPU after switching to legacy CSM boot. That let me save my vbios rom. After adding the vbios rom to the VM XML and switching back to EFI boot, the GPU passthrough works! I can also confirm the GPU resets properly when rebooting the VM.

What method did you use to save your vbios?

On 1/4/2023 at 4:24 PM, Skitals said:

Did you hex edit it, or use it exactly as dumped from GPU-Z?

I'm using the original vbios dump, no edits.
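For reference, besides GPU-Z on bare metal, a vBIOS can also be dumped from the Linux host via sysfs. A sketch, assuming the GPU sits at 0c:00.0, nothing is actively using the card, and the output filename is just a placeholder:

echo 1 > /sys/bus/pci/devices/0000:0c:00.0/rom
cat /sys/bus/pci/devices/0000:0c:00.0/rom > /mnt/user/isos/vbios/RX7900XT-dump.rom
echo 0 > /sys/bus/pci/devices/0000:0c:00.0/rom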

 

Quote

Can you share exactly what you did? Are you binding all 4 devices to vfio from the system devices page?

Yes, I am binding all 4 devices to vfio. After adding them to the VM, I'm manually editing the XML to put them on the same bus and slot as a multifunction device.

 

Here's the GPU portion of my XML:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
  </source>
  <alias name='hostdev0'/>
  <rom file='/mnt/user/domains/GPU ROM/AMD.RX7900XT.022.001.002.008.000001.rom'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0' multifunction='on'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
  </source>
  <alias name='hostdev1'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x0c' slot='0x00' function='0x2'/>
  </source>
  <alias name='hostdev2'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x2'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x0c' slot='0x00' function='0x3'/>
  </source>
  <alias name='hostdev3'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x3'/>
</hostdev>

 

Have you been able to boot the VM after booting the host in legacy CSM mode?

Edited by Ancalagon
34 minutes ago, Ancalagon said:

I'm using the original vbios dump, no edits. [...] Yes, I am binding all 4 devices to vfio. [...] Have you been able to boot the VM after booting the host in legacy CSM mode?

 

Thanks... I got it working. What I did differently, I have no idea. I swear I tried this exact setup like 5 minutes in.

 

Reset also seems to be working. That's a big relief, as there are multiple reports on Reddit of the XTX having the reset bug. I have stopped and started the VM multiple times and it works every time.

Quote

I got it working

Great to hear.

 

After I got this working, I kept experimenting to see if there were other ways to get it working without needing the vbios, but never found one. After reverting the changes, though, getting it to work again wasn't consistent, even using the same steps (booting into CSM mode first, etc.). But once it works, it seems to keep working; I've restarted the VM multiple times and never had it stop working.

 

When it wasn't working (always Code 43), I found that booting the VM with only the GPU passed through, and no other passed-through hardware, worked. Whether that was a coincidence or not, I can't say. It's definitely more finicky than the RX 6800 XT was.

On 1/4/2023 at 11:26 PM, Ancalagon said:

[...] It's definitely more finicky than the RX 6800 XT was.

 

Do you ever see the Tianocore screen when booting a VM? I always did with my 5700 XT; I never see it with the 7900 XTX. When it works, the first thing I see is the Windows login screen.

 

Finicky is an understatement. I lived with the 5700 XT for 2 years with the dreaded reset bug, and that was a joy in comparison.

 

I am also seeing massive instability. I've had multiple crashes. Once the screen went black while launching 3DMark, and it recovered with Adrenalin no longer running. Another time the Windows VM fully locked up while I was using Chrome. Looking at the Unraid dashboard, there were a couple of cores pegged at 100%, and I had to force stop the VM. I never saw anything like this in 2 years of using the 5700 XT and a Windows VM as my primary desktop computer.

 

Not sure if it's driver instability or issues with using the card in a virtualized environment. I haven't used it enough on bare metal Windows to know if there are stability issues there, too.

 

The resizable BAR thing is another issue. I'm used to games running within a few percent of bare metal inside a VM. Games that make use of resizable BAR are seeing 20% better fps on bare metal.


I've tried everything in this thread but can't seem to get my XTX passthrough to POST. The VNC display just says "...has not initialized the display (yet)". When I run the VM with just VNC it boots, and when I pass through my secondary Nvidia 3080 it works. I tried disabling resizable BAR and setting the XTX as the non-primary GPU in the BIOS, and nothing changed. I've run out of ideas.

 

Edit 1: I was finally able to get it to POST. I do have to reboot the host each time I want to reboot the VM right now. Now my problem is the driver install: it gets to 79% and then just sits there, doing nothing.

 

Edit 2: Finally seeing some results. I am passing a SATA drive through to the VM, so I figured maybe I should just install the drivers on bare metal and bypass whatever problem I was hitting. I got it to work! I did pick Q35-7.1 instead of i440fx-7.1 when I finally got it working; maybe that was my issue with installing drivers? Is Q35 good or bad? Now that it's seemingly working, I'll try to get it working with EFI boot. Big relief.
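For anyone wondering where the machine type lives, it's the machine attribute in the <os> block of the VM XML. A sketch of what Q35-7.1 looks like there:

<os>
  <type arch='x86_64' machine='pc-q35-7.1'>hvm</type>
</os>

Q35 presents a PCIe topology to the guest, while i440fx presents legacy PCI, which can matter for passthrough of modern PCIe GPUs.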

Edited by RustyClark (update)
2 hours ago, RustyClark said:

I've tried everything in this thread but can't seem to get my XTX passthrough to POST. [...]

 

I ditched the 7900 XTX. It was too many headaches, worse than the Navi reset bug. I picked up a 4070 Ti today and passthrough just works, first time, every time.

 

I've been team red for a long time, but the 7000 series is not cut out for VMs.

13 hours ago, Skitals said:

I ditched the 7900 XTX. [...] I've been team red for a long time, but the 7000 series is not cut out for VMs.

 

I somewhat agree, but Nvidia's left reality with their pricing, hence my move back to AMD. The 4070 Ti is a good card!

On 1/6/2023 at 10:57 AM, Skitals said:

Do you ever see the Tianocore screen when booting a VM?

Come to think of it, I don't think I have. I have seen the progress circle with the Windows 11 logo; that may have been after booting in CSM mode. I think it usually goes straight to the Windows login screen like you said, though.

 

On 1/6/2023 at 3:15 PM, RustyClark said:

Is Q35 good or bad?

I've been using Q35-7.1 myself.

On 1/7/2023 at 5:39 AM, RustyClark said:

Nvidia's left reality with their pricing

The size of their cards is bonkers now, too. For one of my machines, Nvidia cards don't come close to fitting in the case. I also prefer AMD for being able to run macOS VMs (hopefully the RX 7000 series will eventually be supported).

Edited by Ancalagon
  • 3 weeks later...

Thought I'd mention that in my situation I had to make some adjustments to get EFI boot to work. I use two GPUs, one Nvidia and the XTX.

 

I had to use the dumped vbios for the XTX.

I had to change video=efifb:off to video=efifb:off,vesafb:off,vesa:off (full Syslinux entry sketched below).

I had to disable resizable BAR in the BIOS (but kept "above 4G decoding").
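Putting the second change together, the Syslinux entry ends up looking something like this (a sketch; any other flags already on your append line stay where they are):

label Unraid OS
  kernel /bzimage
  append initrd=/bzroot video=efifb:off,vesafb:off,vesa:off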

 

The good news is that after getting EFI boot to work I no longer have to restart the host every time I need to reboot the XTX VM; the bad news is that resizable BAR does not seem to work for the XTX VM yet. Cheers!

16 hours ago, RustyClark said:

Thought I'd mention that in my situation I had to make some adjustments to get EFI boot to work. [...]

Try the solution posted in this thread; it works on my 3090 with an AMD Epyc setup.

 

Edited by shpitz461
  • 2 months later...
On 1/4/2023 at 6:52 PM, Ancalagon said:

I'm using the original vbios dump, no edits. [...] Yes, I am binding all 4 devices to vfio. After adding them to the VM, I'm manually editing the XML to put them on the same bus and slot as a multifunction device. [...]

 

When you say that you are "binding all 4 devices to vfio", are you referring to your equivalent of this?

[screenshot: System Devices page with the GPU's devices checked to bind to VFIO at boot]

 

Also, when I map all 4 of the devices to the Win10 VM, I get the below error:

 

[screenshot: error message]

 

Here is the GPU portion of my XML:

 

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/isos/vbios/RX7900XT-20230331.rom'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x01' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x01' function='0x1'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x01' function='0x2'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x01' function='0x3'/>
    </hostdev>

 

Also, FWIW, I am running an XFX 7900 XT on a WRX80 Creator v2 with a 5955WX as the CPU, on Unraid 6.12-rc2.

Edited by celborn
On 3/31/2023 at 7:59 PM, shpitz461 said:

Your GPU is in the same IOMMU group as other devices; you have to get it isolated into its own group (both xx.0 for video and xx.1 for audio).

If you have an option in the BIOS to enable SR-IOV try that. If that doesn't help you'll have to go the ACS-Override path...

I've tried SR-IOV and it hasn't helped.

Also, I did the ACS downstream override, and each of the devices ended up in its own IOMMU group, but passing the VGA and audio devices through to the Win10 VM just got me a black screen on the monitor.
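(For anyone following along: "ACS downstream" refers to Unraid's PCIe ACS override setting, which amounts to adding a kernel flag along these lines to the append line; a sketch:

pcie_acs_override=downstream

with downstream,multifunction as the more aggressive variant.)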

 

I was able to boot the VM's drive on bare metal with the 7900 XT and it worked flawlessly.

 

The question I have: I keep reading that everyone is passing all 4 devices through, but I'm only seeing two (23:00.0 & 23:00.1). What do y'all's graphics cards look like in the device view?

 

Also, when I try to manually pass through 21:00.0 & 22:00.0, I get a "Non-endpoint cannot be passed through to the guest" error.

 



I stumbled through this and finally got the XFX 7900 XT passed through properly to a VM.

 

Here's what I've got running.

 

I'm booting Unraid 6.12-rc2 in legacy mode (will test in UEFI later).

 

Here are my current boot settings, isolating several of the 5955WX cores for the VM's exclusive use:

 

kernel /bzimage
append initrd=/bzroot iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init isolcpus=9-15,25-31
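(Roughly what those flags do: iommu=pt and amd_iommu=on put the IOMMU into passthrough mode for host devices, nofb and nomodeset keep the host console off the GPU, initcall_blacklist=sysfb_init stops the kernel's simple framebuffer from claiming the card, and isolcpus=9-15,25-31 keeps the host scheduler away from the cores reserved for the VM.)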

 

Here is my GPU's IOMMU group, with the two PCIe devices I am passing through to the VM:

 

IOMMU group 13:
[1022:1482] 40:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
[1022:1483] 40:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
[1002:1478] 44:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev 10)
[1002:1479] 45:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (rev 10)
*[1002:744c] 46:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 31 [Radeon RX 7900 XT/7900 XTX] (rev cc)
*[1002:ab30] 46:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device ab30

 

 

I set up a new VM as Q35-7.1 instead of the i440fx the previous VM used, passing through the same bare metal SSD.

Installed a fresh Windows 11 Pro on the bare metal SSD from the old VM. 

At first I didn't assign a GPU to the VM and did the Win11 install via noVNC; once the install was finished, I added the 7900 XT in place of the virtual display and added the sound card portion.

Here is the GPU portion of the XML from my fresh install of Win11 Pro:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x46' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x46' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>

 

Hopefully this helps someone down the line who's having the same issues I was, and makes things easier on them. I spent 4-5 hours a day for 5 days straight tweaking settings, rebooting the server, rinse and repeat.

 

 

