Win10 VM graphics pass-through broke after AMD BIOS update


Just to inform everyone: since BIOS version 3.10 (and the latest, 3.20, as well), the same bug is present on the ASRock Rack X470D4U board.

PS: Downgrading the BIOS to the last working firmware, 1.50, worked for me, but only via DOS.

Edited by Trashor
Additional info
Link to comment

I'm posting here to stay up to date on this issue. Hopefully ASUS will release a BIOS update in the near future, since I may eventually upgrade to a 3rd Gen Ryzen processor. I'm also not sure whether I could downgrade the BIOS to get GPU passthrough working, since I am using a 2nd Gen Ryzen processor.

Link to comment
  • 2 weeks later...
On 8/18/2019 at 10:52 PM, Leoyzen said:

 I'm using RTX2070 for a Win 10/Ubuntu and RX560 for Win10/Hackintosh)

Hi,

Do you use a specific BIOS file to pass through your RTX 2070?

What is the IOMMU group for this card?

Can I see your VM XML config file, please?

See my post; I'm having trouble starting the VM.

Link to comment

Here is my tale, if it helps someone else:

 

I have an ASUS Prime X470 Pro board that had a 1700X in it along with an RX 570, and all was well until I upgraded to BIOS 5220 with AGESA 1.0.0.3 ABBA. I then received the well-documented "Unknown PCI header type 127" error when running a VM with GPU passthrough enabled.

I then upgraded the system to a 3900X with no other changes except for fixing my CPU pinning settings and removing a passed-through generic USB device that had mysteriously appeared in each of my VM settings. Once I had done that, I fired up my hackintosh VM and, surprisingly, Clover booted and displayed fine, but when booting into macOS the VM hung at the Apple logo.

After another attempt at starting that VM, I received the dreaded "Unknown PCI header type 127" error. I attempted to shut down Unraid, and it also hung during shutdown, requiring a forced power-down.

After booting back up, I attempted to boot a Linux Mint VM with GPU passthrough, and it booted successfully. I was able to update the system and run a few programs. Then I attempted to reboot the VM via the start menu, and the screen went dark and never came back. I performed a force shutdown of the VM and tried to boot it, and once again I got the error "Unknown PCI header type 127".

 

So far this is as much testing as I've been able to do. It seems I'm getting a bit farther than some in this, but the issue still isn't solved, even with the latest AGESA update.
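
For anyone wondering where the number 127 comes from, here is a rough sketch in Python. It assumes the usual PCI layout (the header-type byte sits at config-space offset 0x0E, and bit 7 is the multi-function flag that gets masked off) and that a card which has stopped responding answers every config read with all ones. The function names and the example address are made up for illustration; this is not taken from anyone's actual setup.

```python
from pathlib import Path

HEADER_TYPE_OFFSET = 0x0E  # standard PCI config-space offset of the header-type byte
HEADER_TYPE_MASK = 0x7F    # bit 7 is the multi-function flag, not part of the type


def header_type(config: bytes) -> int:
    """Return the header type encoded in a raw PCI config-space dump."""
    if len(config) <= HEADER_TYPE_OFFSET:
        raise ValueError("config space dump is too short")
    return config[HEADER_TYPE_OFFSET] & HEADER_TYPE_MASK


def looks_unresponsive(config: bytes) -> bool:
    """True if the first 64 bytes are all 0xFF, i.e. the device is not answering."""
    head = config[:64]
    return len(head) == 64 and all(b == 0xFF for b in head)


def read_config(pci_address: str, root: str = "/sys/bus/pci/devices") -> bytes:
    """Read the config space that sysfs exposes for a device.

    pci_address is something like '0000:0a:00.0' (a hypothetical example).
    """
    return (Path(root) / pci_address / "config").read_bytes()


if __name__ == "__main__":
    # A device that answers every read with 0xFF yields 0xFF & 0x7F = 127.
    dead = bytes([0xFF]) * 64
    print(header_type(dead), looks_unresponsive(dead))
```

If that assumption holds, "header type 127" just means the GPU is no longer answering on the bus at all, which would fit with it only showing up after a VM shutdown or force stop.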

 

 

Link to comment

Looks like I got mine working.

Build:
Mobo: B450M Mortar
BIOS: 7B89v1B
GPU: RX 460

I was constantly getting the header 127 problem, and every VM with the GPU attached failed to even start.

Solution:
1. Updated to the latest BIOS.
2. Updated to a custom-built kernel (described below; it's easy, just a matter of copying a couple of files).

3. After that, the GPU was able to boot into the VM, but I noticed graphical tearing on the screen, and the logs indicated that vfio_region_write failed. This was solved by adding a script to ensure Unraid didn't try to reserve the only GPU (even though I have the `append vfio-pci.ids` setting). Following the guide below helped me here (just make a script and run it).
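
Not the script from the guide, but a quick way to check whether the host is still holding on to the card. This is a minimal sketch that relies on the `driver` symlink sysfs keeps under each PCI device; the function name and the address in the usage note are hypothetical.

```python
import os
from typing import Optional


def bound_driver(pci_address: str, root: str = "/sys/bus/pci/devices") -> Optional[str]:
    """Return the name of the kernel driver bound to a PCI device, or None.

    sysfs exposes the binding as a 'driver' symlink inside the device
    directory; the last path component of its target is the driver name.
    """
    link = os.path.join(root, pci_address, "driver")
    if not os.path.islink(link):
        return None  # no driver is currently bound to this device
    return os.path.basename(os.readlink(link))
```

For example, `bound_driver("0000:0a:00.0")` returning `vfio-pci` would suggest the card is reserved for passthrough, while a host graphics driver name would suggest the host grabbed it first.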


After doing the three steps above, I finally got it working and was able to boot back into my Windows 10 gaming VM. Hope that helps someone, and thanks all!

Link to comment

I was concerned that I'd run into this issue when upgrading, but so far almost everything is working for me.  In case it's helpful to anyone else with similar hardware:

Unraid version 6.7.2 with PCIe ACS override set to Both
Mobo: Gigabyte AX370-Gaming K7 (F42a BIOS)
CPU: 1700X upgraded to 3900X


I have 2 Windows 10 VMs with a variety of passed through devices.

VM 1
R9 390 with VBIOS. This is the primary GPU.
Also passing through an NVMe SSD

VM 2
Radeon 7850 without VBIOS

The above worked with the standard 6.7.2 kernel, but I did switch to the custom kernel here in order to get sensors working: 


I followed the process here to upgrade from BIOS version F6 to F42a: https://forum.level1techs.com/t/solved-bought-x470-aorus-gaming-5-wifi-bios-update-recomendation/145836/9

The post in that thread was for a Gigabyte X470 board, but it turned out the same process works for X370.

Link to comment
  • 4 weeks later...
  • 2 weeks later...

Hey everyone, I've been tracking this issue the last week or so as the parts came in for my new build. Just wanted to share what's got my pass-through finally working.

 

My motherboard is an MSI B450 Gaming Pro Carbon AC and the GPU is an EVGA GTX 1060 SC Gaming. The motherboard originally shipped with BIOS version 7B85v16, and I upgraded to the latest non-beta version, 7B85v1A, but still had the header 127/D3 problem. However, sometime after I last refreshed the support page, version 7B85v1B left beta. I applied it, and this appears to have fixed the issue.

 

Notes for 7B85v1B:

- Update AMD ComboPI1.0.0.4 Patch B (SMU v46.54)
- Improved system boot up time
- Improved PCI-E device compatibility

 

I can now boot into my Windows 10 VM. Will report back with any further issues

Link to comment

 

On 11/21/2019 at 10:52 AM, whiskeykilo said:

 

[...] MSI B450 Gaming Pro Carbon AC and the GPU is an EVGA GTX 1060 SC Gaming. [...]

 

Notes for 7B85v1B:

- Update AMD ComboPI1.0.0.4 Patch B (SMU v46.54)
[...]

 

I can now boot into my Windows 10 VM. Will report back with any further issues

 

From what I am seeing, the latest AMD AGESA version, 1.0.0.4 Patch B, is working for some folks with the ASUS PRIME X470 PRO. I read it on the AMD support page: https://community.amd.com/thread/241650

And if @whiskeykilo got it working on the MSI B450 Gaming Pro Carbon AC, that is good evidence it may work on other platforms, too. I will try upgrading my BIOS to this latest version, maybe over this long weekend, and report back.

Link to comment

I have an ASUS Prime X470 Pro / Ryzen 7 2700 with the latest 1.0.0.4B firmware. I tried passthrough with a Radeon 5700 and an Nvidia GTX 970, with and without a VBIOS. Nothing works. I can't get them to work, and it is definitely a problem with the BIOS. The next thing I'll try is downgrading the BIOS. Which version did people have success with? Maybe I will try downgrading to the first one listed on the ASUS site.

Link to comment
  • 3 weeks later...

Update: I just upgraded my ASUS Prime X470 Pro / Ryzen 3900X to the latest 1.0.0.4B AGESA BIOS and tried GPU passthrough with my AMD RX 570 card. The VM booted fine, but when I shut down the VM and attempted to boot it again, the server did something very strange.

 

First, the spinning arrows indicating that the VM was attempting to boot kept going for a very long time. I got no video on the monitor, and once in a while a thread on the CPU would max out and then idle. After a while of this, the CPU suddenly went to 100% on all cores briefly, and then the VM went into a paused state. Try as I might, I could neither force-quit the VM nor get it to boot. The web interface for my server was very unresponsive, and the console seemed to freeze for 30 seconds at a time, only briefly updating its contents. I could not get the array to stop safely, and eventually the entire server froze and I was unable to get it to respond at all.

 

I was able to reproduce this several times and I'm afraid to do it again for fear of what it may be doing to my parity...

 

Anyone have any suggestions? I suspect this is similar to the reset bug but worse somehow. I only have a GPU and a USB keyboard and mouse directly passed through to the VM. No audio, no USB cards or anything like that.

 

Interestingly, I did find on the ASUS forum that this 1.0.0.4B BIOS has caused problems for users with sound cards, causing the entire system to not boot or to stutter and freeze in a fashion similar to what I'm seeing. I almost wonder if this could be related. Unfortunately, my BIOS does not have the option needed to try the suggested fix (disabling PCIe Ten Bit Tag Support) to see whether that's related...

 

https://rog.asus.com/forum/showthread.php?115064-Beware-of-agesa-1-0-0-4B-bios-not-good!/page2

Link to comment

Just to let you guys know (as this was the most helpful thread I've found related to this issue), this is not an issue only in Unraid. I'm currently using KVM on Ubuntu 18.04 and encountering the same issues. PCI passthrough doesn't throw any obvious errors in virt-manager, but nothing happens when trying to boot a machine that already worked on previous BIOS versions. Also, when I force off the VM, it starts throwing the header 127 error.

 

Specs:
ASUS Prime X370-Pro
Ryzen 1700
BIOS version 5220 - AGESA 1.0.0.3ABBA
Passthrough: GTX 1080 + VIA Technologies PCIe USB board

The fact that I'm using an Nvidia card and having the same issues rather rules out the AMD reset bug as the cause.

Things I tried:
Updated the kernel to 5.4.5 with the ACS patch
Reassigned the board (as the IDs changed when I upgraded the BIOS)
Tried BIOS version 5204 - AGESA 1.0.0.3AB

All options lead to the same issues.

 

AMD is doing some very nice work supporting the AM4 socket for a long period and adding support for the new chips on reasonably old chipsets, but I think we can all agree it's not nice of them to break things that worked on older versions in order to do that.

Also, on the board manufacturers' side (mainly ASUS in my case), there is no reason we can't have a reasonably straightforward and official way to downgrade our BIOS.

Link to comment

I'm not sure if I'm 100% on topic or should start a new thread, but I recently upgraded my X370 board and things have been interesting. I have an ASUS Prime X370 Pro, which was working fine with my Ryzen 1700 for a long time on an older BIOS from 2018, with decent IOMMU grouping for USB passthrough to VMs. I ordered a Ryzen 3900X and upgraded the BIOS while I waited for it to come in the mail, and ran into the same problem as Spaceinvader One in the video below. (I think this is what @mhentschke is getting.) Luckily, plugging in the 3900X mostly fixed things, but my IOMMU groups totally changed (I don't have a single USB controller that's in its own IOMMU group anymore, so that sucks). It sounds like, for that part of the problem, the solution is either upgrading the CPU or installing the custom kernel per previous posts.
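
In case it helps anyone compare groupings before and after a BIOS update, here is a small Python sketch that reads the layout the Linux kernel exposes under /sys/kernel/iommu_groups. The function names are made up for illustration, and it only lists PCI addresses; it does not look up device names.

```python
from pathlib import Path
from typing import Dict, List


def list_iommu_groups(root: str = "/sys/kernel/iommu_groups") -> Dict[int, List[str]]:
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups: Dict[int, List[str]] = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU is disabled or not supported on this host
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return dict(sorted(groups.items()))


def isolated_devices(groups: Dict[int, List[str]]) -> List[str]:
    """Return the devices that sit alone in their group (the easy ones to pass through)."""
    return [devices[0] for devices in groups.values() if len(devices) == 1]


if __name__ == "__main__":
    for number, devices in list_iommu_groups().items():
        print(f"group {number}: {', '.join(devices)}")
```

Saving the output before flashing and diffing it afterwards would show at a glance which controllers lost their own group.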

 

After the BIOS and CPU upgrade, the hardware addresses for some of my devices changed and had to be updated in the VM XMLs for things to work. For example, an address that used to point to a discrete USB controller I installed turned into some other PCI bridge or something, and that was listed under "Other PCI devices" at the bottom of the edit VM page. I suspect this was part of the problem @klingon00 described, the other part being the AMD reset bug causing his GPU to hang after a VM crashed and had to be force stopped. BTW, Klingon00, I would manually check your hackintosh XML, as I know that with my setup, based on a Spaceinvader One tutorial, if you change anything at all from the GUI menu it deletes some of the special QEMU arguments that let the Mac boot.
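
A rough way to spot stale addresses after an update is to pull the PCI hostdev entries out of the VM XML and check them against what the host currently has. This is only a sketch, assuming the usual libvirt layout (`<hostdev type='pci'>` with a `<source><address .../>` child). The sample XML and the function names are hypothetical, and a device that exists at the old address may still be the wrong device, so the result is a hint, not proof.

```python
import os
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical snippet in the shape libvirt uses for a passed-through PCI device.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""


def hostdev_pci_addresses(domain_xml: str) -> List[str]:
    """Return host PCI addresses (e.g. '0000:0a:00.0') referenced by a domain XML."""
    root = ET.fromstring(domain_xml)
    found = []
    for hostdev in root.findall("./devices/hostdev[@type='pci']"):
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        # The attributes are hex strings such as '0x0a'.
        domain = int(addr.get("domain", "0x0000"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        function = int(addr.get("function"), 16)
        found.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return found


def missing_devices(addresses: List[str], root: str = "/sys/bus/pci/devices") -> List[str]:
    """Return the addresses that no longer exist on the host."""
    return [a for a in addresses if not os.path.exists(os.path.join(root, a))]


if __name__ == "__main__":
    print(hostdev_pci_addresses(SAMPLE_XML))
```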

 

Now, what I'm getting is that my two Windows VMs with Nvidia cards passed through still work fine, but I've tried to pass through the secondary card to a Mac VM and several Linux distros, including the latest stable Ubuntu 18.04 LTS, and all will apparently crash on boot and just give me a black screen/no signal. The monitor will actually go to low-power mode, and I can't remote in via Splashtop, so I don't think anything is actually booting.

It seems like this all could be related to the Ryzen 3000 BIOS update issues, but if anyone has any hints for me about why GPU passthrough would work for Windows and not Linux/Mac, that would be awesome.

I think the next thing I will do is try passing through the GPU on Linux VMs with different machine types and switching between SeaBIOS and OVMF.

Sorry for the novel, but I'm really suspicious that all these different issues have the same root cause.

 

 

Link to comment

I've been doing some digging, and there are two relevant known issues listed on Reddit: https://www.reddit.com/r/VFIO/wiki/known_issues

The first is an IRQ issue on QEMU 4.0+ (Unraid 6.8.0 stable uses QEMU 4.1.something). The second is the BIOS update AGESA issues, which the wiki says are similar to the old Threadripper reset bug.

I may just try to roll back to an older version of Unraid and see if it resolves the QEMU issues and lets me boot.

Link to comment

Wanted to throw this out there: I upgraded my BIOS to the latest 1.0.0.4 Patch B (version 7B77v1D for my mobo), and my GPU passthrough with the MSI X470 Gaming M7 AC and a Ryzen 2700X now works without a hitch! Of course, the CPU pinning numbers did change, in addition to some IOMMU groups. Edit - I did this after upgrading to Unraid 6.8.0.
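
Since the pinning numbers can move around after a BIOS update, here is a hedged Python sketch for re-checking which logical CPUs are hyperthread siblings, using the `thread_siblings_list` files the Linux kernel exposes per CPU in sysfs. The function names are made up for illustration, and how you then map the pairs to VM pinning is up to you.

```python
from pathlib import Path
from typing import List, Tuple


def parse_cpu_list(text: str) -> List[int]:
    """Parse a kernel CPU list such as '0,12' or '0-1' into a list of integers."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def sibling_sets(root: str = "/sys/devices/system/cpu") -> List[Tuple[int, ...]]:
    """Return each physical core as a tuple of its logical CPU numbers."""
    seen = set()
    for path in Path(root).glob("cpu[0-9]*/topology/thread_siblings_list"):
        seen.add(tuple(sorted(parse_cpu_list(path.read_text()))))
    return sorted(seen)


if __name__ == "__main__":
    for core in sibling_sets():
        print(core)
```

Pinning a VM to whole sibling pairs from this list, rather than to the numbers that worked before the update, avoids splitting a physical core between the host and the guest.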

 

If you have been following this thread and waiting to update your motherboard BIOS because of this issue, I think it's a safe time to try, provided your manufacturer has released that latest version. Per MSI, for my motherboard it is: Update AMD ComboPI1.0.0.4 Patch B (SMU v46.54).

 

Cheers!

Edited by mattz
added Unraid version
Link to comment

Update:

 

I finally got fed up with the crashes associated with GPU passthrough, and the suggestions that I'm fighting the old AMD reset bug are correct. I managed to dig an old EVGA GTX 950 out of storage and put that in instead of my RX 570 card... and my VMs work just fine now. I'm able to reboot them at will, and swapping which VMs can use the card all works fine, without crashes, high CPU usage or black screens!

So I guess it's Nvidia for me again for a while. I was hoping to stick with the AMD card, since the hack VM works so much more easily with it, but I've been down this road before... Nvidia web drivers for Mac aren't exactly the best...

So to clarify: AGESA 1.0.0.4B has fixed my passthrough issues with my 3900X on an ASUS Prime X470 Pro motherboard. The AMD GPU reset bug is still an issue in Unraid, however, while Nvidia cards work fine.

Link to comment
