5700 XT Passthrough issues, now OVMF VMs boot into UEFI



Unraid version: 6.7.2

 

Hardware:
Ryzen 7 3700X

ASRock X570 Extreme4 (with latest BIOS - 2.10)

RX 5700 XT

 

Setting up GPU pass-through to my Windows 10 VM has been a real struggle so far, so I'll try to give a detailed history.

If there are any settings or logs you'd like to see, I trust you will ask for them in the comments.
This being my first VM rodeo, it took a while to get the IOMMU groups right. After updating my BIOS they were more easily managed and look fine without overrides.

This setup without any bells and whistles did mysteriously pass through the GPU once! (makes you wonder?)

 

Then, after gracefully restarting the VM, Unraid stopped passing through the GPU - not throwing any error messages and occasionally freezing up the server.

(Somewhat related at this point: passing through my AMD Starship/Matisse HD Audio Controller along with the 5700 XT's video & audio caused the VM Manager to lock up and become inaccessible, with the rest of the Unraid GUI remaining responsive for 5-10 minutes before the system froze completely. Switching off autostart on the VM that was causing this was another hurdle I managed to overcome, but I replicated the circumstances in my current situation without autostart and got the same lockup issues. During this time I had to force-shut-down the server several times.)

 

I am aware of the AMD reset bug, but this behaviour persisted even after server reboots. So I started reading around and trying my options. Passing a VBIOS got me into the VM with passthrough a second and a third time, and after switching to Q35 3.1 between sessions 2 and 3 I even managed to install drivers successfully.

 

Then, another graceful restart later, I was back to no hand-off happening at all and the VM becoming unresponsive (I had to force shut down the VM many times to try different settings). Unraid refused to hand off the GPU in both CLI and GUI mode. Note that during this time the VM would still work fine with VNC.

In a moment of perhaps desperation I tried permitting Unraid's UEFI boot mode. This only made matters worse, crashing the server as soon as I'd boot the VM with passthrough, so I was once again forced to hard-shut-down the server. I quickly disabled that option again and tried some more VM configs to see if I could get the passthrough working again.

Lo and behold, the GPU was handed off by Unraid and on the screen appeared UEFI Interactive Shell 2.2; exiting and continuing, or choosing a boot device, all get me back to the same UEFI Shell screen.

 

Now we've arrived at my current situation. I've confirmed that Windows 10, Windows 8.1 and Ubuntu VMs based on OVMF all boot into the UEFI Interactive Shell, seemingly no matter the settings, including through VNC remote. I suspect that one of the many hard resets may have corrupted some file essential to OVMF.

 

I did manage to get a fresh Windows 10 VM up and running with SeaBIOS; it works fine with VNC, but when I pass through the GPU video and sound it gets handed over, shows a grey screen and some text for half a second, and then goes black. I left this VM on overnight and the only thing that changed is that the first assigned logical core is permanently stuck at 100% utilisation. Moving forward with SeaBIOS for my Win10 sounds fine; however, OVMF not working on any VM is too big an issue to ignore.

 

----------

 

That's a lot of potentially fixable issues in one post, and a lead on any of them would be greatly appreciated. The issues in short:
- VM Manager, and then the entire server, freezes when passing through the on-board sound card along with GPU audio & video.

- Seemingly any and all VMs with OVMF fail to boot properly after having enabled and then disabled Unraid UEFI boot.

- An overall, currently failing/unstable GPU passthrough situation (barring the AMD reset bug).
 

Edited by Cyclamate
Link to comment

I honestly can't help a lot with this, but I have noticed that when I reboot my Windows VM with a 5700 XT passed through, I get the reset bug, and a server reboot does not fix it. I have to shut down the server, wait at least 5 seconds, and then power it back up. When I just reboot the server, the LEDs on the 5700 XT don't go off, so the GPU is always receiving power and never clears the bug. Try a shutdown, wait for the LEDs to go off for a few seconds, and then power up and see if that fixes at least that issue.

 

For the others: if you think you corrupted some files, you can always reinstall Unraid on the USB and keep your current config to see if fresh files fix that. I believe it's a similar procedure to upgrading Unraid manually.

Link to comment
11 minutes ago, Kosslyn said:

I honestly can't help a lot with this, but I have noticed that when I reboot my Windows VM with a 5700 XT passed through, I get the reset bug, and a server reboot does not fix it. I have to shut down the server, wait at least 5 seconds, and then power it back up. When I just reboot the server, the LEDs on the 5700 XT don't go off, so the GPU is always receiving power and never clears the bug. Try a shutdown, wait for the LEDs to go off for a few seconds, and then power up and see if that fixes at least that issue.

 

For the others: if you think you corrupted some files, you can always reinstall Unraid on the USB and keep your current config to see if fresh files fix that. I believe it's a similar procedure to upgrading Unraid manually.

From my post count and other things, I assume it's quite obvious that I'm still new to Unraid.
I think it's amazing how far this forum, the documentation and Spaceinvader One's videos alone have gotten me so far.
It's exactly this kind of perhaps-obvious procedure that I was hoping I'd simply overlooked.

 

I will definitely try re-installing Unraid; I already have backups of the USB.

Is there anything besides the config folder that requires special attention?

 

Besides rebooting the server (gracefully or not), I also ran the manual "remove > suspended state > rescan" script (roughly the sequence sketched below) to try to circumvent the reset bug.
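For reference, the script I used does roughly this - the PCI addresses are placeholders for my card's video and audio functions, so treat it as a sketch rather than a copy-paste fix:

```
#!/bin/bash
# Remove the GPU's video and audio functions from the PCI bus (placeholder addresses)
echo 1 > /sys/bus/pci/devices/0000:0f:00.0/remove
echo 1 > /sys/bus/pci/devices/0000:0f:00.1/remove
# Briefly suspend the host to RAM - this is what is supposed to clear the Navi hang
# (press the power button, or pre-arm rtcwake, to resume)
echo -n mem > /sys/power/state
# Rescan the PCI bus so the card comes back
echo 1 > /sys/bus/pci/rescan
```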

But I'm at least a little hopeful that a longer graceful shutdown might give different results. Will report back when I get home.

Link to comment

Everything you have supports UEFI and you should use it. Turn on allow UEFI in Unraid, disable CSM in the BIOS, and stick to OVMF.

 

If your VMs were created with SeaBIOS, they probably do not support UEFI. The same goes if, say, Windows was installed on an SSD in legacy (CSM) mode. If you start a VM in OVMF (UEFI) mode and there is no bootable EFI partition on your vdisk/passed-through drive, you will end up with what you saw: the UEFI shell.

 

So turn on allow UEFI in Unraid, disable CSM/legacy boot in your motherboard BIOS, and create a NEW VM and install your operating system with OVMF.

 

Your life will be a lot easier if you use "video=efifb:off" as a kernel parameter in syslinux.cfg. This completely disables the framebuffer for Unraid, and vfio will have an easier time binding your GPU. With a single GPU I believe you will need to pass a vbios. Pass the AMD Navi 10 HDMI Audio with the GPU. The other one you tried is the onboard motherboard audio, and it WILL hard-lock Unraid.
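On a stock flash drive the parameter just gets appended to the boot line in syslinux.cfg (Main > Flash > Syslinux Configuration), something like this - the label names and the rest of the file may look slightly different on your install:

```
label Unraid OS
  menu default
  kernel /bzimage
  append video=efifb:off initrd=/bzroot
```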

 

Oh, and use Q35-4.0.1 or newer. v3.1 works, but you will only get PCIe gen 3 speeds by default; v4.0.1 and newer default to PCIe gen 4.
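Once you're on a version of Unraid that offers it, the machine type is just a dropdown choice, and it ends up in the VM's XML roughly like this (the exact string depends on the QEMU version shipped with your Unraid):

```xml
<os>
  <type arch='x86_64' machine='pc-q35-4.0.1'>hvm</type>
</os>
```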

 

That should give you a place to start!

Edited by Skitals
Link to comment
11 hours ago, Skitals said:

Everything you have supports UEFI and you should use it. Turn on allow UEFI in Unraid, disable CSM in the BIOS, and stick to OVMF.

So turn on allow UEFI in Unraid, disable CSM/legacy boot in your motherboard BIOS...

Done, and it seems to boot into UEFI mode now (high-res boot screens). I'll gladly stick to OVMF wherever possible, but there are still some things to overcome.

I noticed that with this change I am no longer able to boot into GUI mode - is that expected behaviour?

I'm mostly just using the WebGUI from a laptop, so it's not the end of the world, but would I need a second GPU otherwise?

 

Booting gives me my MoBo splash screen, with the boot options then "overlaid" onto it. Non-GUI mode shows up on screen and is interactive; GUI mode shows a black screen with only a blinking underscore and is not interactive.

 

Quote

If your VMs were created with SeaBIOS, they probably do not support UEFI. The same goes if, say, Windows was installed on an SSD in legacy (CSM) mode. If you start a VM in OVMF (UEFI) mode and there is no bootable EFI partition on your vdisk/passed-through drive, you will end up with what you saw: the UEFI shell.

... and create a NEW VM and install your operating system with OVMF.

I understand that it's not possible to go from SeaBIOS to OVMF pointing to the same vDisk.

 

The issue remains that on any new Windows 10 VM I configure - and I'm telling it to create a new vDisk - as long as OVMF is the BIOS, it shows the TianoCore screen for a split second and then drops into the UEFI Interactive Shell.

 

Purely as a trial, I did get an Ubuntu VM up and running with OVMF and no passthrough. So I suspected maybe there's something wrong with the Windows 10 ISO.
But both the one that worked before and one I downloaded fresh from Microsoft's website net the same result. What gives?

 

Quote

Your life will be a lot easier if you use "video=efifb:off" as a kernel parameter in syslinux.cfg. This completely disables the framebuffer for Unraid, and vfio will have an easier time binding your GPU. With a single GPU I believe you will need to pass a vbios.

Passing kernel parameters is something I have a hard time finding examples/instructions for, so I'm not sure I'm doing it right...

[Screenshot of my syslinux.cfg with the added parameter attached]

 

With this I no longer get the Linux boot log and prompt in non-GUI mode, or the single blinking dash in GUI mode; it just freezes after Loading /bzroot(-gui)...ok.
It's still perfectly accessible through the WebGUI. Again, is this expected behaviour due to only having one GPU?

It's worth mentioning that in either case the passthrough to the Windows 10 VM "works" now... only the VM is still getting stuck in the UEFI Interactive Shell, both with and without passthrough.

 

Quote

Pass the AMD Navi 10 HDMI Audio with the GPU. The other one you tried is the onboard motherboard audio, and it WILL hard-lock Unraid.

I've always passed both the video and audio (each in its own IOMMU group) when trying to pass through the GPU. One of Spaceinvader One's videos advises passing through onboard audio on top of that. Anyway, audio configuration is a later concern.
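For reference, the two hostdev entries in my VM's XML look roughly like this - the PCI addresses and the vbios path here are placeholders, not my exact values:

```xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <!-- GPU video function (placeholder address) -->
    <address domain='0x0000' bus='0x0f' slot='0x00' function='0x0'/>
  </source>
  <!-- vbios rom file (placeholder path) -->
  <rom file='/mnt/user/domains/vbios/navi10.rom'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <!-- the card's own HDMI audio function -->
    <address domain='0x0000' bus='0x0f' slot='0x00' function='0x1'/>
  </source>
</hostdev>
```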

 

Quote

Oh, and use Q35-4.0.1 or newer. v3.1 works, but you will only get PCIe gen 3 speeds by default; v4.0.1 and newer default to PCIe gen 4.

Interesting - the highest Q35 version in the VM configuration dropdown is v3.1 for me. Why could that be?

Link to comment

The "freezing" after 'Loading /bzroot(-gui)...ok' is normal with efifb:off. It's not recommended to use the GUI mode except for initial setup in case networking isn't working to use the web gui, so try sticking to the normal non-gui mode.

 

If you have Ubuntu up and running with VNC graphics, what happens when you pass through the 5700 XT to that VM?

 

Are you trying to pass through the 5700 XT to the Windows 10 VM before you have Windows installed and set up? That is not recommended. It's recommended to get your guest OS installed using VNC graphics, install the VirtIO drivers, etc. before passing through your GPU.

 

If the highest version of Q35 is 3.1, you are running an old version of Unraid. I would highly recommend running a version of Unraid with a 5.x kernel since you are using all very new tech (X570 chipset, PCIe gen 4 GPU, etc.). You also want to use a kernel with the Navi reset patch instead of that script you are using to "reset" the card. I would highly, highly recommend Unraid 6.8.0-RC5 combined with the corresponding custom kernel found in the first thread here:

 

Link to comment
36 minutes ago, Skitals said:

The "freezing" after 'Loading /bzroot(-gui)...ok' is normal with efifb:off. It's not recommended to use the GUI mode except for initial setup in case networking isn't working to use the web gui, so try sticking to the normal non-gui mode.

I see, that makes sense. So controlling Unraid through the WebGUI is the only option with this setting enabled as long as I have a single GPU, right?
Because the non-GUI console seems to require the framebuffer to show up on screen as well, and I disabled it for both GUI and non-GUI.

 

Quote

If you have Ubuntu up and running with VNC graphics, what happens when you pass through the 5700 XT to that VM?

So after a little bit of fiddling around within this new 18.04.3 Ubuntu VM...

Passthrough works as expected; I installed drivers, and as far as I can see everything is working as it should.

 

This is a newly configured Ubuntu VM on top of an OVMF BIOS.

That implies Windows, or perhaps specifically Windows 10, has something to do with my troubles, I suppose.

I'll try a Windows 7 or 8.1 install and see where that ends up.

 

Quote

Are you trying to pass through the 5700 XT to the Windows 10 VM before you have Windows installed and set up? That is not recommended. It's recommended to get your guest OS installed using VNC graphics, install the VirtIO drivers, etc. before passing through your GPU.

Yeah, I've been doing it the way you suggest, but I no longer can because I'm still not getting past the UEFI Interactive Shell when running Win10 on OVMF.

I merely tried to pass through once without installing the VirtIO drivers to see if the UEFI Shell would show up on screen, and it did; not really sure what I was trying to prove there.

 

Quote

If the highest version of Q35 is 3.1, you are running an old version of Unraid. I would highly recommend running a version of Unraid with a 5.x kernel since you are using all very new tech (X570 chipset, PCIe gen 4 GPU, etc.). You also want to use a kernel with the Navi reset patch instead of that script you are using to "reset" the card. I would highly, highly recommend Unraid 6.8.0-RC5 combined with the corresponding custom kernel found in the first thread here:

Now, I'm open to trying RC builds and kernel patches, but I'd prefer to identify the cause of my Win10 VMs being unable to boot successfully first...

Keep in mind that at one point I was at the stage where I had successfully installed drivers and wanted to reboot before configuring MSI interrupts.
That seems so long ago now...
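(For when I do get back to that point: as far as I know, MSI interrupts for the passed-through GPU audio device are usually enabled either with the MSI utility or by hand with a registry value along these lines - the exact device key obviously depends on the hardware IDs shown in Device Manager:)

```
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI\<device>\<instance>\Device Parameters\
  Interrupt Management\MessageSignaledInterruptProperties
    "MSISupported" (DWORD) = 1
```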

 

Man, if I had known that using newly developed hardware would give me this much of a headache to reach a stable VM, I'd have thought twice.

Link to comment
8 minutes ago, Cyclamate said:

Looking at the raw XML, it looks like I must have misunderstood the meaning of the CPU pinning layout; I'll make sure those are more sensible right now.

EDIT: Never mind, they're just not shown in the order I expected.

So to be clear: you were able to install/boot Windows with VNC graphics. You tried passing through your GPU and you only got the UEFI shell. You switched back to VNC graphics, and it still only goes to the UEFI shell. Is that correct?

Link to comment

If the above is the case, try this:

 

I think there is something about the 5700 XT that fudges up the nvram file, and the only solution is to create a new VM (unless there is a way to delete/rebuild the nvram file that I am unfamiliar with). It only takes a minute to create a new VM and point it to the vdisk you installed Windows to. Try doing this:

 

Create a new VM. Select the Windows 10 template. From the form view, make only the following changes:

 

Quote

Name: Change "Windows 10" to a name you have never used before such as "Windows 10 test"


Logical CPUs: Select all but the first row (leave cpu 0 / ?? and cpu 1 / ?? for Unraid)


Initial memory and Max memory: same value for both, 8GB is fine if you have 16GB total installed


Machine: q35-4.0.1


BIOS: OVMF


Hyper-V: Yes


USB Controller: 3.0 (nec XHCI)


OS Install ISO: LEAVE BLANK


VirtIO Driver ISO: If you installed the VirtIO drivers in your vdisk you can leave it blank; otherwise point it to the latest VirtIO ISO


Primary vDisk Location: MANUAL. In the field next to manual: /mnt/user/domains/GravitasOne/vdisk1.img
The other vdisk options will disappear if this vdisk is still there. Leave Primary vDisk Bus on VirtIO.


Graphics: Radeon 5700XT


Graphics ROM BIOS: /mnt/user/domains/VBIOS_roms/MSI.RX5700XT.8192.190805.rom


Sound Card: AMD Navi 10 HDMI


USB Devices: Check your keyboard and mouse.

Hit create and report back.

 

The only way I got my 5700 XT working was to go full send like this: creating a new VM pointing to a working Windows install, with the 5700 XT passed through from the get-go.
 

Edited by Skitals
Link to comment
10 minutes ago, Skitals said:

So to be clear: you were able to install/boot Windows with VNC graphics. You tried passing through your GPU and you only got the UEFI shell. You switched back to VNC graphics, and it still only goes to the UEFI shell. Is that correct?

I was able to install/boot with VNC graphics, installed the VirtIO drivers, passed through the GPU successfully a total of three times, and during the last session I successfully installed the AMD drivers.

Then I was troubleshooting the fact that it wouldn't pass through again (after VM shutdown and also after server reboot), though it would still boot normally with VNC. During this time I must have changed some setting or file, causing the current and all new VM installations running on OVMF to malfunction.

 

Getting that Ubuntu VM running on OVMF, even with passthrough, feels like great progress though; that narrows it down to either my Windows VM configuration or Win10-specific files.

Link to comment
1 minute ago, Cyclamate said:

I was able to install/boot with VNC graphics, installed the VirtIO drivers, passed through the GPU successfully a total of three times, and during the last session I successfully installed the AMD drivers.

Then I was troubleshooting the fact that it wouldn't pass through again (after VM shutdown and also after server reboot), though it would still boot normally with VNC. During this time I must have changed some setting or file, causing the current and all new VM installations running on OVMF to malfunction.

 

Getting that Ubuntu VM running on OVMF, even with passthrough, feels like great progress though; that narrows it down to either my Windows VM configuration or Win10-specific files.

I would try the above; it will only take a minute to create a new VM and plug in those options. It sounds the same as the bug I ran into: there is definitely something going wrong with the nvram file, and that's the only workaround I found. Creating a new VM and pointing it to the same exact vdisk builds a new nvram file.
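If you do want to poke at it directly: I believe Unraid keeps the per-VM OVMF variable store under /etc/libvirt/qemu/nvram/, one file per VM named after its UUID, so in theory you could stop the VM and copy a clean template over it. The paths below are from memory, so double-check them on your box before trying anything:

```
# List the per-VM OVMF nvram files (each VM's UUID is in the file name)
ls /etc/libvirt/qemu/nvram/
# With the VM stopped, overwrite its variable store with a clean copy of the template
cp /usr/share/qemu/ovmf-x64/OVMF_VARS-pure-efi.fd \
   /etc/libvirt/qemu/nvram/<vm-uuid>_VARS-pure-efi.fd
```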

Link to comment
4 hours ago, Skitals said:

I would try the above; it will only take a minute to create a new VM and plug in those options. It sounds the same as the bug I ran into: there is definitely something going wrong with the nvram file, and that's the only workaround I found. Creating a new VM and pointing it to the same exact vdisk builds a new nvram file.

Sadly it didn't work; it boots into the UEFI Shell just like any other new Windows VM would right now.

(That means not passing an ISO and instead linking to an existing vDisk with a Windows install on it made no difference.)

It doesn't really matter what preset I use for Windows; if it includes OVMF, it will get stuck in the UEFI shell.

 

The same goes for fresh Windows 8.1 and 7 VMs. But alas, the Ubuntu VM does run fine on OVMF!

While I much prefer the idea of OVMF actually working like it did at first, it's becoming awfully tempting to just run with SeaBIOS...

Link to comment
10 hours ago, Cyclamate said:

I was able to install/boot with VNC graphics, installed the VirtIO drivers, passed through the GPU successfully a total of three times, and during the last session I successfully installed the AMD drivers.

Then I was troubleshooting the fact that it wouldn't pass through again (after VM shutdown and also after server reboot), though it would still boot normally with VNC. During this time I must have changed some setting or file, causing the current and all new VM installations running on OVMF to malfunction.

 

Getting that Ubuntu VM running on OVMF, even with passthrough, feels like great progress though; that narrows it down to either my Windows VM configuration or Win10-specific files.

So are you saying that if you were to create a new Win10 VM right now with a new vdisk, Win10 ISO, OVMF, VNC, etc., it would go straight to the UEFI shell? You can't get to the installer? Or could you install it, and it works until you try passing through your GPU?

 

I'm afraid that the AMD driver install on your Win10 vdisk messed up that VM, but it shouldn't affect anything if you create a brand new VM with a new vdisk and nvram.

 

If you can, create a new VM with a new vdisk and install Windows 10 and the VirtIO drivers. Get it working with VNC. Before doing anything else, back up that vdisk so you have a clean baseline!

 

If you are really brave and want to risk fudging up your Ubuntu VM, take your working Ubuntu VM w/ passthrough and edit the XML. Replace the path in "<source file="/mnt/user/domains/Ubuntu/vdisk1.img"/>" with the path to your new/clean/working Win10 VNC vdisk. Don't change anything else. Hit save and start.
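For context, the whole disk block in the XML looks roughly like this (the driver and target lines may differ slightly depending on the template); only the source path needs to change:

```xml
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <!-- swap this path for the clean Win10 vdisk -->
  <source file='/mnt/user/domains/Ubuntu/vdisk1.img'/>
  <target dev='hdc' bus='virtio'/>
</disk>
```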

Link to comment

Also, to add: I had a LOT of issues getting my machine working, but now that it does, it works flawlessly. I started by installing Windows 10 natively onto an NVMe drive. When I first got passthrough working and installed the AMD drivers, I wasn't able to get the VM to start again. I didn't have the exact problem you are describing, but I had a LOT of issues. One thing I did that I thought might have made a difference was booting natively into Windows from that NVMe drive (which was now not working as a VM). To my surprise, the GPU and drivers worked fine natively. I rebooted into Unraid and the VM magically started working again.

 

If you have a spare drive or SSD, I would suggest installing Windows 10 natively on it and then passing through that drive instead of dealing with a vdisk. At the very least it helps with troubleshooting, because you can always boot directly into Windows 10 to see whether there is an issue with your Windows installation or whether it's virtualization quirks. Here is a video for reference on installing Windows natively and booting it as a VM:
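For reference, pointing a VM at a whole physical drive instead of a vdisk usually means a disk entry in the XML along these lines - the by-id name here is just an example, use your own drive's:

```xml
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <!-- pass the whole drive by its stable /dev/disk/by-id name (example name) -->
  <source dev='/dev/disk/by-id/nvme-Example_SSD_123456'/>
  <target dev='hdc' bus='virtio'/>
</disk>
```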

 

 

Link to comment
4 hours ago, Skitals said:

So are you saying that if you were to create a new Win10 VM right now with a new vdisk, Win10 ISO, OVMF, VNC, etc., it would go straight to the UEFI shell? You can't get to the installer? Or could you install it, and it works until you try passing through your GPU?

No installer, straight to the UEFI Shell. The reason I'm reluctant to run with SeaBIOS, which does give me an installer and does seem to work fine otherwise, is that OVMF worked for Windows at first. I've tried starting from scratch on a new vDisk numerous times and assure you I can't create a new stable baseline as things stand, not with OVMF.

 

That's why I'm just about ready to throw in the towel and stick with SeaBIOS for Windows, because I don't think I'd lose much, if anything, besides forward compatibility with a UEFI-based config.

Link to comment