November 24, 2019
Hi, I'm a newbie at this and would appreciate any insight. I'm not sure how to troubleshoot an issue where the server crashes when I reboot or shut down a VM. I'm on Unraid 6.8 with two VMs, a Windows VM with an RTX 2080 passed through and a Fedora Linux VM with an RX 580 passed through, and both have the issue. Whenever I restart or shut down a VM there is an 80% chance that the Unraid server locks up and is no longer reachable via the web GUI or SSH. I attached my syslog, my IOMMU groups, and the XML for the Windows VM.
Attachments: groups.txt, syslog.txt, windows-vm-2080.txt
November 24, 2019 (Community Expert)
There is no stable 6.8 release yet, and you don't mention which release candidate you are running. Instead of the syslog alone, you should always go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post. It includes the syslog and many other things.
November 24, 2019
@sendas You are passing through the "NVIDIA Corporation TU104 USB 3.1 Host Controller" from the GPU. From your XML:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x2'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</hostdev>

For the audio part, in case you want to pass it through, it should be function='0x1' instead of 0x2. Try not to pass through any devices from a 2xxx Nvidia card other than the GPU and the audio part. A couple of days ago another user had issues as soon as he handed the USB device from his Nvidia card over to the VM. The VM crashed, and almost always the server became unstable, froze, or crashed as well.
Edited November 24, 2019 by bastl
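To make the mapping between an lspci-style address and the XML above concrete, here is a minimal sketch. The function name pci_to_hostdev_address is hypothetical (it is not something Unraid or libvirt ships); it only reformats a bus:slot.function string into the source address element quoted in the post, so 03:00.1, the audio function mentioned above, ends up with function='0x1'.

```shell
# Hypothetical helper, not part of Unraid or libvirt: turn an lspci-style
# address (03:00.1 or 0000:03:00.1) into a libvirt <address> element.
pci_to_hostdev_address() {
    addr=$1
    domain=0000
    # An address with two colons carries an explicit PCI domain; split it off.
    case $addr in
        *:*:*) domain=${addr%%:*}; addr=${addr#*:} ;;
    esac
    # Validate the domain (4 hex digits) and the bus:slot.function part.
    case $domain in
        [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) ;;
        *) echo "unrecognized PCI address: $1" >&2; return 1 ;;
    esac
    case $addr in
        [0-9a-fA-F][0-9a-fA-F]:[0-9a-fA-F][0-9a-fA-F].[0-7]) ;;
        *) echo "unrecognized PCI address: $1" >&2; return 1 ;;
    esac
    bus=${addr%%:*}      # text before the colon
    rest=${addr#*:}      # slot.function
    slot=${rest%%.*}
    func=${rest#*.}
    printf "<address domain='0x%s' bus='0x%s' slot='0x%s' function='0x%s'/>\n" \
        "$domain" "$bus" "$slot" "$func"
}

# Example: the audio function of the GPU at 03:00.
pci_to_hostdev_address 03:00.1
```

The result still has to be placed inside a hostdev/source block by hand, as in the XML quoted above; the sketch does not edit any VM definition.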
November 24, 2019 (Author)
Interesting, I intentionally passed through the USB-C port to use as a dedicated USB controller for the VM. I'm out of PCI-E slots for adding another card.
November 24, 2019
@sendas As far as I know, the USB-C port is only usable for VR headsets. Correct me if I'm wrong.
November 24, 2019 (Author)
The USB-C port seems to work as a normal USB hub when the VM is running. I can plug in a USB headset or thumb drive with no problem. The VM is rock solid for days, as long as I don't try to restart it.
November 24, 2019
@sendas Try without it and report back how it behaves without the USB controller. Just an idea. 😉
November 24, 2019 (Author)
OK, will try without it. I was reading the Arch wiki warning about devices that don't accept a reset. There's a bash script that lists the devices and shows which ones do or do not support reset. I'm assuming that if [RESET] is missing, the USB controller cannot be reset?
https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Passing_through_a_device_that_does_not_support_resetting
This was the output:

IOMMU group 21
[RESET] 03:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU104 [GeForce RTX 2080 SUPER] [10de:1e81] (rev a1)
IOMMU group 23
        03:00.2 USB controller [0c03]: NVIDIA Corporation TU104 USB 3.1 Host Controller [10de:1ad8] (rev a1)
November 24, 2019
@sendas I'm not sure how accurate the output of that command is. In case someone reads this and wants to check their own output, here is the command:

for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d);do echo "IOMMU group $(basename "$iommu_group")"; for device in $(\ls -1 "$iommu_group"/devices/); do if [[ -e "$iommu_group"/devices/"$device"/reset ]]; then echo -n "[RESET]"; fi; echo -n $'\t';lspci -nns "$device"; done; done

The HDMI audio controllers of both of my 10xx Nvidia cards aren't marked as resettable, but both are working. The same goes for my onboard audio controller. I never had an issue restarting a VM with it passed through. I'm not sure if @limetech did something special to reset these devices that have no RESET info. If so, and you can confirm that the USB controller of the card is the issue for you, we already have 2 users reporting this. Maybe one of the devs can then have a look into it.

IOMMU group 35
        0b:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU group 28
[RESET] 09:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] [10de:1c82] (rev a1)
IOMMU group 29
        09:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
IOMMU group 49
[RESET] 43:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
IOMMU group 50
        43:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
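For anyone who finds the one-liner hard to read, the same check can be sketched as a small function. This is only an illustration under a few assumptions: the name list_reset_support and its optional directory argument are hypothetical (the argument exists so the function can be pointed at a test tree), and the output is reduced to a flag, the group number, and the device address instead of calling lspci. The check itself is the one the command above performs: does a file named reset exist under the device's sysfs entry.

```shell
# Hypothetical helper: for each device in each IOMMU group, report whether
# sysfs exposes a "reset" file for it. The root directory defaults to the
# real sysfs location; passing another path is only meant for testing.
list_reset_support() {
    root=${1:-/sys/kernel/iommu_groups}
    for group in "$root"/*/; do
        # If there are no groups the glob stays literal, so skip non-directories.
        [ -d "$group" ] || continue
        for dev in "$group"devices/*; do
            [ -e "$dev" ] || continue
            if [ -e "$dev/reset" ]; then
                flag=reset
            else
                flag=no-reset
            fi
            # One tab-separated line per device: flag, group, PCI address.
            printf '%s\tgroup %s\t%s\n' "$flag" "$(basename "$group")" "$(basename "$dev")"
        done
    done
}

# Run against the real sysfs tree (prints nothing if no IOMMU groups exist).
list_reset_support
```

The address in the last column can be handed to lspci -nns to get the device description shown in the listings above. As the posts in this thread suggest, a missing reset file does not by itself mean a device will misbehave, so the flag is a hint and not a verdict.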
November 24, 2019 (Author)
Removing the Nvidia USB-C hub seems to have removed the issue. I tested a mix of restarts and shutdowns, well over 20 times, with no issue. It would be nice to be able to use the built-in hub, but if it's not a very VM-friendly controller I'll look into passing through one of the motherboard's built-in controllers and see if that works. Bastl, thanks for trying that out. I find it interesting that the audio is missing the RESET. My Nvidia audio is also missing the RESET, but I never use it; I just use a USB headset.
November 24, 2019 (Author)
As a side note, I'm getting 400 more points on my PassMark score after removing that Nvidia USB hub. Not sure what's going on there. Now on to passing through a different controller.
November 25, 2019
Read through this thread and it sounds like the long and short of it is that if you don't pass through / use the USB-C controller on the GPU, the lockups don't occur. Unfortunately with any type of hardware passthrough using KVM, this is a possibility. Hardware passthrough today works fairly well, but relies on a combination of the right firmware in the device and the right quirks in the kernel/KVM/QEMU. Over time, these things may improve, but for now, we lack the ability to provide any type of resolution on these hardware-specific issues.
March 1, 2020
When I run the command, nothing happens??

for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d);do echo "IOMMU group $(basename "$iommu_group")"; for device in $(\ls -1 "$iommu_group"/devices/); do if [[ -e "$iommu_group"/devices/"$device"/reset ]]; then echo -n "[RESET]"; fi; echo -n $'\t';lspci -nns "$device"; done; done
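One possible explanation, offered as an assumption since the post gives no further details: the loop prints nothing at all when /sys/kernel/iommu_groups contains no group directories, which is what happens when the IOMMU is disabled in the firmware or not active in the kernel. A minimal sketch to check for that case follows; the function name count_iommu_groups and its optional directory argument are hypothetical.

```shell
# Hypothetical helper: count the IOMMU group directories under sysfs.
# The optional argument overrides the directory and is only meant for testing.
count_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    count=0
    for group in "$root"/*/; do
        # With no groups the glob stays literal and fails the -d test.
        [ -d "$group" ] && count=$((count + 1))
    done
    echo "$count"
}

n=$(count_iommu_groups)
if [ "$n" -eq 0 ]; then
    echo "No IOMMU groups found: the IOMMU may be disabled in firmware or not enabled in the kernel."
else
    echo "$n IOMMU groups found."
fi
```

If this reports zero groups, the reset listing cannot show anything, and passthrough would not work on that system until the IOMMU is enabled.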
This topic is now archived and is closed to further replies.