Windows 10 VM GPU crashing



Hello, I'm having issues with the Win10 VM that I use for VR gaming in my living room. They started when my server updated past 6.7; it had been working fine for a few months on Unraid 6.6.7.

 

Current Version 6.7.2 2019-06-25

mobo: asus z97-a

cpu: i7 4790k

memory: 32GB DDR3 RAM

gfx: MSI R9 390x 8GB with AMD driver 18.9.3 (all newer drivers give a black screen and crash instantly)

 

The Win10 VM gets half the CPU cores and 16GB RAM, with passthrough of the GPU and its HDMI audio, the motherboard sound card, and a USB controller.

 

The graphics driver seems to crash randomly in the VM, usually while playing games, and the VM then becomes unreachable: the screen disconnects and I can't ping or remote into it at all. I have to force stop the VM and then reboot the whole Unraid server, because the VM won't start again due to "Execution error internal error: Unknown PCI header type '127'", which I'm aware is a known issue. The Win10 event log shows "Display driver amdkmdap stopped responding and has successfully recovered." right when the VM becomes unreachable, even though it says the driver recovered.

 

Originally I had been sitting on Unraid 6.6.7 after finally getting everything working normally, but one day I rebooted the server and it seemed to have installed 6.7 on its own, and I started having issues. Possibly just a coincidence. The attached diagnostics were taken shortly after the VM crashed, though I don't see much in them related to the crash, since it seems to be Windows that's crashing. Anything you can tell me would be appreciated, even if it's a suggestion to get an NVidia GPU, because personally I'm ready to blame this 390x.

 

Thanks

unraid-diagnostics-20190813-0036.zip

Link to comment
28 minutes ago, mb9023 said:

I rebooted the server and it seemed to have installed 6.7 on its own, and I started having issues

Can't comment on your trouble, but this wouldn't have happened unless you clicked Update OS or hit the link in the banner that told you about an update. It is possible to downgrade if you can't sort everything out: go to Tools > Update OS and downgrade from there.

Link to comment
3 hours ago, phbigred said:

Sounds similar to a problem SpaceInvader One had. Might be worth attempting a move to the Q35 machine type instead of i440fx. I had a similar problem with my RX580 passthrough.

 

If I try to change my machine to Q35 I get this error

XML error: The PCI controller with index='0' must be model='pcie-root' for this machine type, but model='pci-root' was found instead

 

I'm fairly sure I tried to set up a separate Q35 VM before but ran into other issues. I guess I'll try it again after checking out this video. Thanks

Link to comment
27 minutes ago, mb9023 said:

If I try to change my machine to Q35 I get this error

XML error: The PCI controller with index='0' must be model='pcie-root' for this machine type, but model='pci-root' was found instead

 

I'm fairly sure I tried to setup a separate Q35 VM before but ran into other issues, I guess I'll try it again after checking out this video. Thanks

Start a new template. The change from i440fx to Q35 is too complicated for the GUI to adjust the XML.
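For illustration, here is a rough sketch of the parts of the libvirt XML that differ between the two machine types, which is why the error above complains about the PCI controller at index 0. The machine version suffix (3.1) is an assumption and will depend on the QEMU version on your server; a real Q35 template also contains additional pcie-root-port controllers and different device addresses that are not shown here.

```xml
<!-- i440fx template (machine version is an assumed example) -->
<os>
  <type arch='x86_64' machine='pc-i440fx-3.1'>hvm</type>
</os>
<devices>
  <controller type='pci' index='0' model='pci-root'/>
</devices>

<!-- Q35 template (machine version is an assumed example) -->
<os>
  <type arch='x86_64' machine='pc-q35-3.1'>hvm</type>
</os>
<devices>
  <controller type='pci' index='0' model='pcie-root'/>
</devices>
```

Because the device addresses hang off these controllers, letting a fresh template generate them is usually less error-prone than editing them by hand.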

Also, remember to add this bit of code at the bottom of your XML, just before </domain>. It makes the emulated PCIe root ports run at x16 width and 8 GT/s (instead of the default x1).

  <qemu:commandline>
    <qemu:arg value='-global'/>
    <qemu:arg value='pcie-root-port.speed=8'/>
    <qemu:arg value='-global'/>
    <qemu:arg value='pcie-root-port.width=16'/>
  </qemu:commandline>
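One thing to check when pasting this in: libvirt only accepts qemu: elements if the qemu XML namespace is declared on the opening domain tag. A minimal sketch of where everything sits is below; the comment stands in for the rest of your VM definition, and type='kvm' is assumed to match what your template already has.

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <!-- ... existing elements of the VM definition ... -->
  <qemu:commandline>
    <qemu:arg value='-global'/>
    <qemu:arg value='pcie-root-port.speed=8'/>
    <qemu:arg value='-global'/>
    <qemu:arg value='pcie-root-port.width=16'/>
  </qemu:commandline>
</domain>
```

If the namespace declaration is missing, the block will typically be rejected or dropped when the XML is saved, so it is worth reopening the XML view afterwards to confirm it is still there.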

 

Link to comment
6 minutes ago, testdasi said:

Start a new template. The change from i440fx to Q35 is too complicated for the GUI to adjust the xml.

Also, remember to add this bit of code at the bottom of your xml, in front of </domain>. This makes your emulated PCIe run at x16 (instead of the default x1).


  <qemu:commandline>
    <qemu:arg value='-global'/>
    <qemu:arg value='pcie-root-port.speed=8'/>
    <qemu:arg value='-global'/>
    <qemu:arg value='pcie-root-port.width=16'/>
  </qemu:commandline>

 

I had no idea this was a thing, wow. I'll give it a try as well.

Link to comment

So I tried the suggestions above: I created a new Q35 template with the same vdisks and was able to install the latest graphics driver, but the VM is still crashing. I think I might look into getting an NVidia card to replace it. That'll get me away from the AMD bug anyway. I've had a feeling for a long time that this card had some issues, but I've never been able to pin it down as it's never been this bad.

 

I would try going back to 6.6 but I only see 6.7 in restore points, and I'd rather not end up messing up stuff in the rest of the server anyway.

 

I'll update if replacing the card ends up working but it might be a week or so as I'm going on vacation for a few days.

Link to comment

Update 12 hours later: the VM crashed while I was trying to run a benchmark, and on server reboot my 390x wasn't showing up in Unraid at all. So I opened up the case, took the card out, and reseated it. A few hours of benchmarks and Beat Saber since then and it hasn't crashed once. Hopefully it stays that way.

Link to comment
  • 1 year later...
8 minutes ago, BilboT34Baggins said:

Resurrecting this topic, mb9023, are you still having issues? I'm having very similar issues with my rx580 - even down to unraid periodically not recognizing the gpu and having to reseat it. 

Nope, haven't had any issues in a long time. I don't think I changed anything else since going to Q35 and reseating the GPU.
