mattz

Win10 VM broke after AMD x470 BIOS update


Posted (edited)

A BIOS update broke my Win10 gaming VM. I updated my MSI x470 Gaming M7 AC mobo from 7B77v14 to 7B77v18 (I must have missed a few versions in between). After I saw the problem, I updated Unraid to 6.6.7, to no avail.

 

The error is a double-whammy. I now get two errors:

  1.  vfio: Cannot reset device 0000:1f:00.3, depends on group 20
  2. vfio: Unable to power on device, stuck in D3
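
For anyone comparing logs across attempts, here is a minimal sketch that pulls these two messages out of a VM log. The regular expressions are based only on the wording quoted above, and the function name `summarize_vfio_errors` is hypothetical; other QEMU versions may word the messages differently.

```python
import re

# Patterns built from the two messages quoted in this post (assumed wording).
RESET_RE = re.compile(
    r"vfio: Cannot reset device ([0-9a-f:.]+), depends on group (\d+)"
)
D3_RE = re.compile(r"vfio: Unable to power on device, stuck in D3")


def summarize_vfio_errors(log_text):
    """Return (reset_dependencies, d3_count) found in log_text.

    reset_dependencies is a list of (pci_address, group_number) tuples,
    d3_count is how many "stuck in D3" lines were seen.
    """
    deps = [(m.group(1), int(m.group(2))) for m in RESET_RE.finditer(log_text)]
    d3_count = len(D3_RE.findall(log_text))
    return deps, d3_count
```

Feeding it the text of the VM log after each attempt makes it easier to see whether a change moved the "depends on group" number or only left the D3 message.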

I can fix the first by removing the references to my audio card (which is 1f:00.3), but I am left with the "stuck in D3" error. Also, after an unsuccessful attempt to start the Win10 VM, I have tried to reboot Unraid, but it hangs after the shutdown procedure and requires a hard reset to bring the system back up.

 

The BIOS update did do something funny: it reassigned all my CPU pin pairings, so I had to fix that in both the pin assignments on the VM and in my append isolcpus line (screenshot attached). However, as far as I can see, it did not change any of the IOMMU groups, vfio-pci.ids, or device numbers.
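
A quick way to re-check the core/thread pairings after a BIOS change is to read them from sysfs. The sketch below assumes a Linux host that exposes `thread_siblings_list` under `/sys/devices/system/cpu`; the helper names `parse_cpu_list` and `sibling_groups` are made up for this example.

```python
import glob
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0,8' or '0-1,4' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_groups(cpu_root="/sys/devices/system/cpu"):
    """Return the unique sets of sibling threads, one tuple per physical core."""
    groups = set()
    pattern = os.path.join(
        cpu_root, "cpu[0-9]*", "topology", "thread_siblings_list"
    )
    for path in glob.glob(pattern):
        with open(path) as fh:
            groups.add(tuple(parse_cpu_list(fh.read())))
    return sorted(groups)
```

Calling `sibling_groups()` on the host and printing the result gives the pairs to compare against the VM's CPU pinning and the isolcpus list.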

 

Through my searching, the only references to the Stuck in D3 are related to graphics card pass-through, which I am using.  However, it has worked without a hitch until this mobo BIOS update, so I am not sure what has changed for me to tackle.  I wouldn't think the mobo upgrade would affect that.

 

What can I try to fix this D3 error?  What else can I provide to better assist with this?  Attached my Win10 VM XML, and put some other specs below.

 

Note: I have some Ubuntu VMs that are working just fine after the update, once I re-assigned the CPU pin pairings. Of course, I do not pass through GPUs or any other devices to these instances.

 

Thanks for reading!  I appreciate the help.

 

Unraid OS 6.6.7

Mobo - MSI x470 Gaming M7 AC, latest BIOS v18 (March 7, 2019)

CPU - Ryzen 2700x

GPU - EVGA 1070 FTW
 

image.png

vm-win10-xml-20190316.xml

Edited by mattz
added note about Unraid hanging on restart


Have you tried machine type 3.1? It looks like you are using 2.1, which is a little old at this point.

Posted (edited)

David, thanks for the response! I tried switching to i440fx-3.0 (Unraid doesn't have 3.1 yet, it seems), but that didn't help; I got the same messages about "stuck in D3" and "depends on group".

 

I also started from scratch: I created a new Windows 10 VM with the 3.0 machine type and the newer virtio 1.16 driver ISO, but had the same error messages when I tried attaching the Nvidia 1070 GPU (stuck in D3) or attaching the sound card (depends on group xx).

 

Then I switched on the PCIe ACS Override to split the IOMMU groups further and target the "depends on group" message. Same error messages. Tackling that "depends on" message has been more difficult than I thought. What exactly does it mean? Here is what I tried:

  • Sound card is Group 33, so I isolated that with append vfio-pci.ids=1022:1457. Restart, attach; the log says "Depends on Group 31".
  • Then isolated Group 31 with vfio-pci.ids=1022:1457,1022:1455. Restart, attach both of them to the VM; now the log says "Depends on Group 32". What does the SATA controller have to do with anything?
  • Then isolated Group 32 with vfio-pci.ids=1022:1457,1022:1455,1022:7901. Restart, attach all three of them to the VM. Now the log goes back to a "Stuck in D3" message, even though I didn't add the Nvidia GPU back.
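
To see which devices share a group before adding IDs one at a time, the groups can be listed straight from sysfs. This is only a sketch, assuming the usual Linux layout `/sys/kernel/iommu_groups/<n>/devices/<pci address>`; the helper names `iommu_groups` and `vfio_ids_option` are hypothetical, and the second one simply joins vendor:device pairs like the ones tried above.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses inside it."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # IOMMU disabled or not a Linux host
    for name in os.listdir(root):
        dev_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(dev_dir):
            groups[int(name)] = sorted(os.listdir(dev_dir))
    return groups


def vfio_ids_option(ids):
    """Join vendor:device pairs into a vfio-pci.ids= option, dropping repeats."""
    unique = []
    for item in ids:
        if item not in unique:
            unique.append(item)
    return "vfio-pci.ids=" + ",".join(unique)
```

For example, `vfio_ids_option(["1022:1457", "1022:1455", "1022:7901"])` rebuilds the string from the last bullet, and printing `iommu_groups()` before and after toggling the ACS override shows whether the grouping really changed.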

I think this mobo is just yanking my chain.  What are some next steps I can try to get this going?  My thought right now is to revert the BIOS to v14, but I assume I will just end up in this same place if I need to update the BIOS at a later date. :(

 

Thanks again, would still appreciate the input.

 

[screenshot]

 

 

image.png

Edited by mattz
moved image to attachment


I'm not an expert, but why are you passing the sound devices from groups 31, 32, and 33? Your GPU is in group 26 and the corresponding sound device is in group 27; you should always pass those as a pair. If you want to pass another sound device, use it as an additional one.

Also: Do you have a valid rom for the GPU? Is Hyper-V disabled?


@Jaster Good Q's - I do have the GPU audio as primary and my audio controller as the second audio device. However, I was passing through those other groups because of the log messages I was getting (screenshot below). I hope I do not need to pass through anything else, but after the BIOS update I got those messages.

 

For the other Q's

  • GPU ROM is valid... or at least, it worked for me until the mobo BIOS update.  Would I need a new ROM after a mobo BIOS update??
  • HVM is enabled

[screenshot]

 

This is the selection I have for the error below:

 

[screenshot]

[screenshot]


Try not isolating anything in group 30+; just add the HD controller sound card as a second sound card.


Ok, removed all PCIe isolation:

[screenshot]

 

Removed the GPU and passed through the mobo audio:

[screenshot]

 

The VM boots OK and I do see the audio device! But I still see the warnings in the log:

[screenshot]

[screenshot]

 

 

Now... If I add back the GPU, GPU ROM, and GPU Audio card... 

[screenshot]

 

I get the "stuck in D3" error again:

[screenshot]

 

 

It is strange that in the second shot, under "Other PCI Devices", the "USB 3.0 Host controller" went away, especially since it shouldn't have been available in the first place, as I didn't isolate it.

 

 


And... just for good measure, if I try it _without_ the GPU ROM BIOS, I get the same error, plus some other messages...

[screenshot]

[screenshot]

 

I am doing a full computer reboot after every attempt with a "stuck in D3" error to make sure everything is clean.

 

Would having a second graphics card be a way out of this mess?  Again, this all worked BEFORE the v18 BIOS update, when I was on v14...


Ok, so... I can't revert my BIOS; the MSI M-Flash does not allow flashing an older version. So at this point I am stuck, unable to load my GPU into any VM, even Ubuntu.

 

Searching for anything about a D3 power state with Nvidia has turned up very little. I did see a few references to the Linux kernel (including the few posts that mention D3 on this forum), and they pointed to Linux kernel 4.19 (Unraid 6.6.7 is still on 4.18). I did notice that a kernel update to 4.19 is coming in the 6.7.0 release, not to mention the QEMU update to v3.1 which @david7279 originally mentioned, so maybe I'll hold my breath until then!

Should I get into the Next branch or wait until Stable?  Would it do any good for me to get into the conversation about this next version or is my problem such an edge case I should wait for Stable?

 

Cheers,

 

[screenshot]

 

 


You could always make a flash/config/VM backup, then update to Next, and roll back to your current version as long as you don't update it again, no? (At least that's what I assume rollback does.)


Upgraded Unraid OS to 6.7.0-rc5. No good, I still have that "Stuck in D3" error.

I was holding out hope that the Linux kernel update to 4.19 would help, but that's a no-go... Not sure where to go now.

 

[screenshot]

 

 


You know, I really don't get it.  It seems like this is a Threadripper / Ryzen 2 issue from over a year ago.  Some pretty ingenious folks on reddit posted a fix for it.  However, the final fix was a BIOS update to x399 mobos. 

 

My CPU is a Ryzen 2 2700x... And it only broke with this error message on the _LATEST_ BIOS update.  I have emailed MSI's tech support, but I am not getting much help from them... They still think I need to reinstall Windows to make it work.  😕

