Virtual Machine Freezing with Nvidia GPU



Hi, I am currently trying to set up a virtual machine and pass through my Gigabyte 780 WindForce. I can add the card successfully, but the VM inevitably freezes at some random point. The virtual machine seems very stable until I add the graphics card; then, within a couple of hours, the VM freezes completely. It doesn't necessarily crash the computer: the monitor output just stays stuck on whatever I was doing when it froze, and I am unable to interact with the VM other than by force stopping it. Is there any way to fix this?


Could be a large number of things. Unfortunately, you haven't posted any info about your system, not even whether the VM is Linux, FreeBSD, or Windows 98...  Please add the Diagnostics file from your Unraid host (WebGui-->Tools-->Diagnostics-->Download)

And the XML file from your VM (WebGui-->VMs-->Edit the VM-->Switch from Form View to XML View-->Copy everything there to a text file and add it here also)
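Once the XML has been saved to a text file, a quick way to confirm which host PCI devices the VM is actually set to pass through is to read the `<hostdev>` entries. The following is a minimal sketch, assuming the standard libvirt domain XML layout (`<hostdev type='pci'>` containing `<source><address .../></source>`); the function name `list_passthrough_devices` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def list_passthrough_devices(domain_xml: str) -> list:
    """Return host PCI addresses (e.g. '0000:03:00.0') of passed-through devices.

    Assumes libvirt-style domain XML; the attribute names used here
    (domain, bus, slot, function) follow that layout.
    """
    root = ET.fromstring(domain_xml)
    addresses = []
    for hostdev in root.iter("hostdev"):
        if hostdev.get("type") != "pci":
            continue  # skip USB and other non-PCI host devices
        addr = hostdev.find("source/address")
        if addr is None:
            continue
        # Values are written as hex strings such as '0x03'.
        domain = int(addr.get("domain", "0x0"), 16)
        bus = int(addr.get("bus", "0x0"), 16)
        slot = int(addr.get("slot", "0x0"), 16)
        function = int(addr.get("function", "0x0"), 16)
        addresses.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return addresses
```

A GPU often shows up as two functions on the same slot (video and HDMI audio), so seeing only one address in the result may be worth a second look.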


 

On 3/28/2019 at 6:48 AM, Warrentheo said:

Could be a large number of things, unfortunately you haven't posted any info about your system, not even if the VM is Linux or FreeBSD or Windows 98...  Please add the Diagnostics file from your UnRaid host (WebGui-->Tools-->Diagnostics-->Download)

And the XML file from your VM (WebGui-->VM's-->Edit the VM-->Switch from Form View To XML View-->Copy everything there to a text file and add it here also)

Sorry, I failed to specify. It happens regardless of the OS. I've had it happen on Windows and Ubuntu whenever I add the GPU. I have since removed the GPU from the VM and have had no issues at all. I just added the GPU back into the VM so that it shows up in the XML, but I have not started the virtual machine again yet. Should I run the virtual machine with the GPU added back to get meaningful data from the Unraid diagnostics? Thank you very much for the help

vm.xml

 

edit: The Unraid forum limits uploads to 25 files. Which file do you want from the diagnostics, or should I upload it via drive or something?

Edited by martinpetrov1568
Last question

Here is a review I found on the Amazon page for your motherboard:

https://www.amazon.com/Supermicro-X9DAI-LGA2011-USB3-0-Motherboard/dp/B007HVZBWM#customerReviews

Bottom line, this appears to be a motherboard issue...

 

Quote

Good:
Outstanding features, PCIe 3.0, a good number of slots, Very stable if you don't mess with BIOS to try and get Windows 8/Server 2012 running.

Facts:
EATX form factor = large size, while I have dealt with large size before, this just seems bulky, and the need to shoe-horn the MB into a case. Also only 1/2 of the EATX mounting holes lined up, so I had to use the good old, white motherboard standoffs from before the industry started to standardize.

BAD BAD BAD:
The BIOS is the down fall of this motherboard, can not install Windows 8/Server 2012, even with all the options turned off in BIOS, and no BIOS update is out yet, however Windows 2008R2 and Windows 7 install just fine, with all options enabled in BIOS

PCIe 3.0 doesn't work, if you have a PCIe 3.0 video card (Nvidia) it will not boot, and if it does like my Windows 7 install did, it somehow uninstalled the video driver, and I was unable to get video back in Windows 7 (had to reinstall). Forcing BIOS to PCIe 2.0 everything works just fine, maybe it will work when LGA2011 Ivy Bridge comes out, but Sandy Bridge-E, forget about it.

Only discovered the BIOS problems because I was trying to get Windows 2012 to load:
BIOS seems to be a bit buggy, if something is not set just right it can hang in BIOS, reboot after reboot until it boots. Also I mistakenly changed a memory setting, and now the darn thing just hangs at memory detection, regardless of what type of memory is installed, with no memory it beeps, just like it should, memory installed it gets stuck. Been working with tech support, swap this swap that, it should work, well it doesn't, well it's your power supply then, no it's not (next topic), (yes I tried CMOS Clear), sending it back for RMA

Yes tech changes every day, but BIOS is 20+ years of experience, wow, I mean I can't remember having a real BIOS issue in the last 10 years, that clear CMOS didn't fix.

Power Supply:
This motherboard is very very picky about power, would not boot with most of the power supplies that I had on hand, PC Power and Cooling 1000w, Antec 800w, Seasonic 750w.

Strangely enough Vantec 650w what the heck? It's so old it still has the old style flat connectors for pre-ATX motherboards. But this one worked just fine, well, until I was stupid, and spun up my GPU Nvidia 660, playing a game, after about 3 hours, it said no more. Replaced the blown fuse, nope, still dead.

Bought a brand new Corsair 850w, thinking the others were only EPS 2.3 and not 2.31 as the motherboard manual said it needed to be, and also wanted 2 native 8 pin EPS connectors. Also as the 650w seemed to be almost enough, surely 850w would be enough, it worked for about a week and died. Next bought a Corsair AX1200w, seems to be big enough to handle it.

There is a version 3.3 of your BIOS (you currently have 3.2); this may or may not help...

https://www.supermicro.com/products/motherboard/Xeon/C600/X9DAi.cfm (Then click update BIOS)
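Since the review above describes PCIe 3.0 problems that went away when the BIOS was forced to PCIe 2.0, it may help to check which link speed the card has actually negotiated on the host. The sketch below parses the text output of `lspci -vv` for a single device and compares the `LnkCap` (capability) and `LnkSta` (current status) lines. The function name `parse_link_info` is made up for illustration, and the line format is assumed to match what common `lspci` versions print.

```python
import re

# Per-lane transfer rate reported by lspci, mapped to PCIe generation.
GT_PER_S_TO_GEN = {"2.5": 1, "5": 2, "8": 3}

# Matches lines such as:
#   LnkCap: Port #0, Speed 8GT/s, Width x16, ...
#   LnkSta: Speed 5GT/s (downgraded), Width x16 (ok)
LINK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_link_info(lspci_text: str) -> dict:
    """Extract link capability and status from `lspci -vv` output for one device."""
    info = {}
    for line in lspci_text.splitlines():
        match = LINK_RE.search(line)
        if not match:
            continue
        field, speed, width = match.groups()
        # Keep the first occurrence of each field only.
        info.setdefault(field, {
            "speed_gt_s": speed,
            "gen": GT_PER_S_TO_GEN.get(speed),  # None if the rate is not in the table
            "width": int(width),
        })
    return info
```

Note that GPUs commonly drop to a lower link speed when idle, so a `LnkSta` below `LnkCap` is not by itself proof of a problem; it is more telling if the status stays low while the card is under load.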

Edited by Warrentheo
1 hour ago, Warrentheo said:

Here is a review I found on the amazon page for you motherboard:

https://www.amazon.com/Supermicro-X9DAI-LGA2011-USB3-0-Motherboard/dp/B007HVZBWM#customerReviews

Bottom line, this appears to be a motherboard issue...

 

 

Interestingly, I have had more severe problems with the motherboard previously. I was running ES chips on it and because of that had to run an early BIOS. With that BIOS, I had the very same issues with PCIe 3.0. My GPU wouldn't even be recognized in Unraid, so I couldn't select it for passthrough. However, I have since bought normal Xeon 26XXs and updated my BIOS to the latest version. Now the 780 is recognized and I have no issues outside of a VM.

 

You are right, and it may very well be the motherboard, but how could a well-established company such as Supermicro release a product that claims to support PCIe 3.0, yet have it be buggy in VMs, something that server boards like these were intended for? I do agree that this may be the case, but I think there must be a more likely alternative?

 

I actually looked at the logs for the VM when uploading them and noticed this warning: qemu-system-x86_64: warning: guest updated active QH

I tried googling it briefly and ran into others who have the same warning and very similar issues with their GPU:

https://bugzilla.redhat.com/show_bug.cgi?id=1288805

https://forums.linuxmint.com/viewtopic.php?t=282914

Does anyone know more about this particular error? Obviously I'm gonna try to read more, but if you guys have any further input it would be much appreciated
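To see how often this warning appears and where it falls in the log relative to a freeze, the VM log can be scanned for it. This is a minimal sketch; the helper names are made up for illustration, and the log path passed on the command line is whatever file the VM log was read from (its location is not given in this thread).

```python
import sys
from pathlib import Path

# Exact text of the warning quoted above.
WARNING_TEXT = "guest updated active QH"


def find_qh_warnings(log_text: str) -> list:
    """Return (line_number, line) pairs for every log line containing the warning."""
    return [
        (number, line.strip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if WARNING_TEXT in line
    ]


def scan_log_file(path: str) -> list:
    """Read a log file from disk and return the matching lines."""
    return find_qh_warnings(Path(path).read_text(errors="replace"))


# Usage: python scan_log.py <path-to-vm-log>
if __name__ == "__main__" and len(sys.argv) > 1:
    for number, line in scan_log_file(sys.argv[1]):
        print(f"{number}: {line}")
```

If the warning only ever appears once at startup, it is less likely to be tied to a freeze that happens hours later than if it repeats right before the VM stops responding.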

