Server Crash on GPU Passthrough on X570



Hey Guys,

 

I apologize if this question has been asked before... I have gone through many different forum posts on here and other places and I haven't found a solution that works.

 

I am trying to create a Windows 10 VM to use as my gaming VM. I have followed SpaceInvaderOne's videos and am able to create the VM and run it through VNC without any problems. When I shut the VM down, add either of my GPUs (Sapphire R9 270X 4GB or GeForce GTX 750 Ti), and try to boot it again, it tries to run for a couple of minutes and then the server becomes unresponsive. I have to force a shutdown of the server to be able to boot again. I have been trying off and on for 3 months and am at a loss as to what my next steps may be. I have included more information below.

 

Of the things I have tried, each of these has been combined with the others at least once:

SeaBIOS or OVMF

Sapphire R9 270X 4GB or GeForce GTX 750 TI

Latest i440fx or Q35 variant

Multiple variants of Unraid releases

UEFI or Legacy boot

C-state on/off

 

I have a feeling that my next step will be compiling a 5.x kernel, but that would be a first for me, and I'd rather not do it unless it is needed.

 

My system is as follows:

Model: Custom

M/B: Micro-Star International Co., Ltd. X570-A PRO (MS-7C37) Version 3.0

BIOS: American Megatrends Inc. Version H.70. Dated: 01/09/2020

CPU: AMD Ryzen 5 3600X 6-Core @ 3800 MHz

HVM: Enabled

IOMMU: Enabled

Cache: 384 KiB, 3072 KiB, 32768 KiB

Memory: 16 GiB DDR4 (max. installable capacity 128 GiB)

Network: bond0: fault-tolerance (active-backup), mtu 1500
 eth0: 1000 Mbps, full duplex, mtu 1500

Kernel: Linux 4.19.107-Unraid x86_64

OpenSSL: 1.1.1d

Uptime: 0 days, 00:19:08

tower-diagnostics-20200320-1325.zip


Hey, are you still getting these issues?

 

I seem to have a very similar setup to you (Details in signature below), and I've been using my system as a VM gaming system for some time.

I recently changed from an R9-290 to a GTX1080Ti, and that seems to now be working well also after some modifications (details here, recommend you check it out). Using my R9-290, I had to use a compatible vBIOS to get it to work at all.

 

Some questions:

- Have you isolated your CPU usage between VMs and other tasks like Docker?

- What are the EXACT model numbers of your AMD and nVidia cards?

- Are you downloading and using a compatible vBIOS from techpowerup?
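On the vBIOS question, a quick sanity check on a downloaded ROM file can rule out an obviously wrong or malformed image before it is assigned to the VM. The sketch below is a minimal example, not a tool from this thread: it relies only on the standard PCI expansion ROM layout (a 0x55 0xAA signature, with a pointer at offset 0x18 to a "PCIR" structure that holds the vendor and device IDs). The function names are made up for illustration, and whether any leading bytes need to be removed from a given dump depends on the tool that produced it, which this thread does not confirm.

```python
import struct

ROM_SIGNATURE = b"\x55\xaa"   # PCI expansion ROM header signature
PCIR_SIGNATURE = b"PCIR"      # signature of the PCI data structure
VENDORS = {0x10DE: "NVIDIA", 0x1002: "AMD/ATI"}


def rom_ids_at(data, offset):
    """Return (vendor_id, device_id) if a PCI option ROM starts at offset, else None."""
    if data[offset:offset + 2] != ROM_SIGNATURE:
        return None
    if offset + 0x1A > len(data):
        return None
    # Bytes 0x18-0x19 of the ROM header point to the PCI data structure.
    (pcir_ptr,) = struct.unpack_from("<H", data, offset + 0x18)
    pcir = offset + pcir_ptr
    if data[pcir:pcir + 4] != PCIR_SIGNATURE or pcir + 8 > len(data):
        return None
    # Vendor ID and device ID follow the 4-byte "PCIR" signature.
    vendor_id, device_id = struct.unpack_from("<HH", data, pcir + 4)
    return vendor_id, device_id


def find_rom_start(data):
    """Scan for the first offset with a valid ROM header; return (offset, ids) or None."""
    pos = data.find(ROM_SIGNATURE)
    while pos != -1:
        ids = rom_ids_at(data, pos)
        if ids is not None:
            return pos, ids
        pos = data.find(ROM_SIGNATURE, pos + 1)
    return None


def describe_rom(path):
    """Print where the ROM image starts in the file at path and which IDs it declares."""
    with open(path, "rb") as fh:
        result = find_rom_start(fh.read())
    if result is None:
        print("no PCI option ROM header found")
        return
    offset, (vendor, device) = result
    name = VENDORS.get(vendor, "unknown vendor")
    print(f"ROM starts at byte {offset}: {vendor:04x}:{device:04x} ({name})")
```

Calling `describe_rom()` with the path to the downloaded file prints the offset and the vendor:device pair, which can then be compared with what `lspci -nn` reports for the card. A non-zero offset means the file has extra bytes in front of the ROM image.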

 

Some details I'd recommend for a smooth setup:

- Host: Use legacy boot on your host, avoid UEFI. It broke things for me. Also make sure in the BIOS that Legacy boot/compatibility is enabled for all PCIe devices.

- Host: I'm currently using Unraid 6.8.3, just upgraded, but I was stable on 6.8.1 and 6.8.2 for a good while.

- VM: Use OVMF BIOS, Q35-4.2 machine type.

- VM: Use latest VirtIO drivers, I'm using virtio-win-0.1.173-2.iso

- VM: All vdisks mounted using VirtIO.

- VM: Before trying to install Windows 10, try booting the VM with only 1 CPU core for the install.

 

Edited by KptnKMan

I had actually pinned your most recent topic about your 1080TI experience because most of what you were saying sounded very similar to what I am experiencing. The way you diagnosed and solved your issue was inspiring! I'll be working on it all day tomorrow and will definitely try some of the things you mentioned that I might have missed, along with following up on your 1080TI post. I'll update this post first thing in the morning with more information about the system and how my attempts at your ideas above turn out.

 

Just to make sure, you never had any issues with the 4.19.x kernel that is being supported in the latest releases?


Hey, if you're referring to the kernel in the latest 6.8.3 release, then I've had no issues. As a matter of fact, I've seen a slight performance boost in gaming, which I thought might be due to the AMD cpu cache passthrough update. The 4.19.107 kernel seems to be good for me.

 

The solution I found was mostly due to the work by @SpaceInvaderOne. He really figured out most of this, but I'm happy to help.

 

Let us know how you do.

Like I said, we seem to have similar hardware, so we can probably figure it out. 👍


Ok, more work has been done.

 

Following what you mentioned above:

I am using Unraid 6.8.3, VM setup is OVMF, Q35-4.2, using virtio-win-0.1.173-2.iso, and all vdisks are mounted using VirtIO.

 

I do not have any docker containers running, the only thing that "should" be processing is the single VM that I am attempting to start. I am able to install Windows without any problem using a single thread. I then shut it down, add the GPU, Sound Card(s), and pin 4 threads from my CPU and start it again and it hangs every time. The system is clearly working hard, but my ssh and local headless connection respond to input every couple of minutes. Using htop, I was able to see some of the updates (when it would update) and at multiple points, it was sending ALL CPU cores to 100%, not just the ones that had been pinned.

 

I think it's necessary to add that I just tried to do the same thing on a Manjaro install and VNC worked fine for the installation, but the same thing happened when I switched the GPU in and it hung.

1 hour ago, colem said:

I am using Unraid 6.8.3, VM setup is OVMF, Q35-4.2, using virtio-win-0.1.173-2.iso, and all vdisks are mounted using VirtIO.

I'm afraid I'll need still more information from you.

I've gone back to my previous post above and highlighted in red all the things that I think you need to definitively confirm.

You need to check and confirm all of them, including that your host's BIOS is set up correctly. I'd recommend using the BIOS Defaults option and then setting everything up in turn: Virtualisation Enabled, Legacy PCI Support, Boot order, etc. This matters especially if you've upgraded your BIOS. Also, make sure you're on the latest BIOS (but it seems like you are).

Also, be verbose and specific; I can't see what you're looking at.

 

A couple of extra things here:

- Assign your CPU cores in SETTINGS->CPU Pinning. I leave the first core completely alone for unRAID, use 4 cores for my VMs and currently my last core for Docker and one of my dev VMs. You can do it in each VM, but I find this method doesn't mess up my XML. Like this:

[Screenshot: CPU pinning settings]

- What does your XML look like? Maybe post that in a code block here.
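When posting or reviewing the XML, the few fields that matter most for passthrough can be pulled out with a short script. This is a minimal sketch that assumes the standard libvirt domain XML layout (`<os>/<type machine=...>`, `<os>/<loader>`, `<cputune>/<vcpupin>`, and `<hostdev type='pci'>`). `SAMPLE_XML` is a made-up fragment for illustration and is not anyone's actual config from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed domain XML used only to demonstrate the parser.
SAMPLE_XML = """
<domain type='kvm'>
  <os>
    <type arch='x86_64' machine='pc-q35-4.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
  </os>
  <cputune>
    <vcpupin vcpu='0' cpuset='6'/>
    <vcpupin vcpu='1' cpuset='14'/>
  </cputune>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source><address domain='0x0000' bus='0x2d' slot='0x00' function='0x0'/></source>
    </hostdev>
  </devices>
</domain>
"""


def summarize_domain(xml_text):
    """Extract machine type, firmware loader, vCPU pins and passed-through PCI devices."""
    root = ET.fromstring(xml_text)
    os_type = root.find("./os/type")
    loader = root.find("./os/loader")
    pins = {
        int(pin.get("vcpu")): pin.get("cpuset")
        for pin in root.findall("./cputune/vcpupin")
    }
    hostdevs = []
    for addr in root.findall("./devices/hostdev[@type='pci']/source/address"):
        # Attributes are hex strings such as '0x2d'; int(..., 16) accepts the prefix.
        domain = int(addr.get("domain", "0x0000"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        func = int(addr.get("function"), 16)
        hostdevs.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{func:x}")
    return {
        "machine": os_type.get("machine") if os_type is not None else None,
        # A missing <loader> usually means the default BIOS is used instead of OVMF.
        "loader": loader.text if loader is not None else None,
        "vcpu_pins": pins,
        "pci_hostdevs": hostdevs,
    }


if __name__ == "__main__":
    for key, value in summarize_domain(SAMPLE_XML).items():
        print(f"{key}: {value}")
```

The PCI addresses it prints are in the same `domain:bus:slot.function` form that `lspci` uses, so it is easy to see which host devices the VM is actually claiming.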

 

1 hour ago, colem said:

I then shut it down, add the GPU, Sound Card(s), and pin 4 threads from my CPU and start it again and it hangs every time. The system is clearly working hard, but my ssh and local headless connection respond to input every couple of minutes. Using htop, I was able to see some of the updates (when it would update) and at multiple points, it was sending ALL CPU cores to 100%, not just the ones that had been pinned.

Also, considering that you're having issues, I think you're changing waaaay too many things at once:

- You need to do everything step-by-step and see where problems arise. 1 change at a time. Check, confirm and reboot before anything else.

- How about keeping everything the same, on 1 core, and add just the GPU and GPU-Audio.

- Are you adding other Sound cards? Other devices?
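When adding just the GPU and its audio function, it is worth confirming that both functions sit in the same IOMMU group and that nothing unexpected shares that group. The sketch below assumes the usual Linux sysfs layout under `/sys/kernel/iommu_groups`. The function names are made up for illustration, and the slot address in the usage note is a placeholder rather than a value from this thread.

```python
import glob
import re
from collections import defaultdict

# Matches paths like /sys/kernel/iommu_groups/28/devices/0000:2d:00.0
DEVICE_PATH = re.compile(r"/iommu_groups/(\d+)/devices/([0-9a-fA-F:.]+)$")


def group_devices(paths):
    """Map IOMMU group number -> sorted list of PCI addresses found in paths."""
    groups = defaultdict(list)
    for path in paths:
        match = DEVICE_PATH.search(path)
        if match:
            groups[int(match.group(1))].append(match.group(2))
    return {group: sorted(devs) for group, devs in sorted(groups.items())}


def groups_for_slot(groups, slot):
    """Return the set of groups holding any function of one slot, e.g. '0000:2d:00'."""
    return {
        group
        for group, devs in groups.items()
        if any(dev.startswith(slot + ".") for dev in devs)
    }


def read_host_groups():
    """Read the groups of the machine this runs on (empty if the IOMMU is off)."""
    return group_devices(glob.glob("/sys/kernel/iommu_groups/*/devices/*"))
```

On the host, `groups_for_slot(read_host_groups(), "0000:2d:00")` should return a single group number if the GPU and its audio function are grouped together. An empty result from `read_host_groups()` suggests the IOMMU is not active.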

 

1 hour ago, colem said:

I think it's necessary to add that I just tried to do the same thing on a Manjaro install and VNC worked fine for the installation, but the same thing happened when I switched the GPU in and it hung.

- What GPU are you using? Different GPUs have different requirements.

- What vBIOS are you using and where did you get it?

 

Lastly, I'd recommend posting your unRAID diagnostics file, as it will help others to troubleshoot. That's in TOOLS->DIAGNOSTICS.

Edited by KptnKMan
  • 3 weeks later...
  • 10 months later...

Hi, I have this exact same problem. Has anyone found a solution? 

I have Aorus X570 Pro Wifi with Ryzen 7 3700X. 32GB DDR4, Nvidia RTX2060 Super.

I purchased unRaid mainly to use it for VMs. I followed all of @SpaceInvaderOne's instructions, but if I assign more than 1 logical CPU to the VM, it freezes.

Deactivated C-states in the motherboard BIOS: no luck.

Using VNC- no luck. 

Just hit a brick wall, any assistance would be great.


I just upgraded from what you can see in my signature to an Asus ROG Strix X570-E Gaming with a Ryzen 3700X. I've converted my Windows VM from SeaBIOS to OVMF, and it booted without any issues.

 

- I'm using the last 2 CPU cores with both of their threads, so: 6/14, 7/15

- I'm passing through my RX570 and also its audio, since they are in the same IOMMU group (no vBIOS set)

- In the host BIOS I've enabled: SVM Mode -> Enabled

- I have Hyper-V set to yes in the VM config
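Core/thread pairs such as 6/14 and 7/15 can be read from the host, so that a VM is pinned to whole physical cores rather than to threads of different cores. This is a minimal sketch that assumes the standard Linux sysfs topology files (`thread_siblings_list`). The function names are made up for illustration.

```python
import glob


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '6,14' or '0-1' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        elif part:
            cpus.append(int(part))
    return cpus


def core_pairs(sibling_lists):
    """Deduplicate per-CPU sibling lists into one sorted tuple per physical core."""
    return sorted({tuple(parse_cpu_list(text)) for text in sibling_lists})


def read_host_sibling_lists():
    """Read the sibling list of every logical CPU on the machine this runs on."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as fh:
            lists.append(fh.read())
    return lists
```

Running `core_pairs(read_host_sibling_lists())` on the host returns one tuple per physical core. Picking whole tuples from the end of that list matches the "last 2 cores with both threads" approach described above.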

 

Are you using SeaBIOS or OVMF?

