February 22, 2016 Team, I need some help, or at least a point in the right direction. If I let the VM run for a while and then reboot it, it locks up my entire server. If the VM has been off for a day or so and I then try to boot it, that also completely locks up my server. While the VM is running there are no issues (aside from the SSDs in the cache pool being much slower than I anticipated, but I have a separate thread for that). I am running the rig listed in my signature, and my main goal is to use IOMMU to have this Windows 10 VM as my main gaming machine. I have the following VM setup: Q35-2.3, SeaBIOS, disk on my cache pool (256GB + 400GB SSD), cores 4-7 assigned, 10GB of RAM. I am passing in a GTX 770 and an entire USB controller so I can use my keyboard, mouse and headset. There are no apparent IOMMU group issues. I started this install with a fresh Windows 8.1, then went to a single core during the update to Windows 10. I have installed my own ELK + Filebeat Docker stack to try to capture and persist some of the logs when the server crashes, but I am having no luck; the errors don't get written fast enough. I am not sure what steps to take next to troubleshoot this. Any help would be greatly appreciated, even if it's just pointing me to the right place to start troubleshooting. I have attached the XML file for the VM and the libvirt log: windows10.xml.txt libvirt-win10.log.txt
February 26, 2016 Author Anyone have any insight here? The issue continues to occur. Today, for instance, the Windows VM had been running for days with no issues. I logged on, tested it, then just used the Windows shutdown. After the shutdown the entire server locked up. I am not sure why, but I wonder if this has anything to do with passing through the USB controller. Anyone want to help me troubleshoot?
February 27, 2016 Author OK, the woes continue. I spent the last couple of days spinning up an OSX VM, and the exact same thing happened. Running it last night, after I finally got everything working, was no issue. This morning I woke up the displays (I had disabled power saving mode because of some other posts), and the VM worked with no issue. I decided to shut it down via Apple menu -> Shut Down and boom, the entire unRAID server locked up, just like with the Windows VM. Can anyone help me troubleshoot? Here is my XML minus the key:
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>OSX-El-Capitan-10.11-VNC</name>
  <uuid>0ba39646-7ba1-4d41-9602-e2968b2fe36d</uuid>
  <metadata>
    <vmtemplate name="Custom" icon="osx.png" os="osx"/>
  </metadata>
  <memory unit='KiB'>9765888</memory>
  <currentMemory unit='KiB'>9765628</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-2.3'>hvm</type>
    <kernel>/mnt/cache/vstorage/osx/enoch_rev2795_boot</kernel>
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>core2duo</model>
  </cpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/cache/vstorage/osx/ElCapitan.img'/>
      <target dev='hda' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:00:20:30'/>
      <source bridge='br0'/>
      <model type='e1000-82545em'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x03' function='0x0'/>
    </interface>
    <memballoon model='none'/>
  </devices>
  <seclabel type='none' model='none'/>
  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='isa-applesmc,osk=xxxxxxxxxxxxxxxxxxxxxxx'/>
    <qemu:arg value='-smbios'/>
    <qemu:arg value='type=2'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=2,chassis=1,id=root.1'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=03:00.0,bus=pcie.0,multifunction=on,x-vga=on'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=03:00.1,bus=pcie.0'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=00:1d.0,bus=root.1,addr=00.0'/>
  </qemu:commandline>
</domain>
February 27, 2016 My guess is that your card doesn't support being reset. What you can try is ejecting the card from the VM before rebooting. I haven't done it myself, but I think it's done the same way as with USB devices.
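One rough way to test the "card can't be reset" guess from the host side: on Linux, a PCI function that the kernel knows how to reset individually gets a file named reset in its sysfs directory. The sketch below only checks whether that file exists for the three devices passed through in the XML above. The 0000 PCI domain prefix is an assumption (it is the usual default but is not stated in the thread), and reset_supported is just an illustrative helper name, not part of any tool. A missing file is a hint, not proof, that reset is the cause of the lockups.

```python
from pathlib import Path

# Standard sysfs location for PCI devices on Linux.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def reset_supported(address, sysfs_root=SYSFS_PCI):
    """Return True if a 'reset' file exists for the given PCI function.

    address is a full domain:bus:slot.function string such as "0000:03:00.0".
    sysfs_root can be overridden, which is mainly useful for testing.
    """
    return (Path(sysfs_root) / address / "reset").exists()


if __name__ == "__main__":
    # Host addresses taken from the vfio-pci arguments in the posted XML;
    # the "0000:" domain prefix is assumed.
    for address in ("0000:03:00.0", "0000:03:00.1", "0000:00:1d.0"):
        print(address, "reset file present:", reset_supported(address))
```

Run it on the unRAID host itself, not inside the VM. If the GPU or the USB controller shows no reset file, that would at least be consistent with the reset theory.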
February 28, 2016 Author I wonder if there is a way to do this on OSX. I'll try to test it on Windows and see what happens.
March 5, 2016 Author Anyone have any idea how to eject a PCI passthrough device? I do not see the option in Windows or OSX. Or does anyone have other advice on how to handle this problem? Both Windows and OSX work and reboot with no issues (I even have a script to hard shut them down and swap to the other OS). But if I leave them running (or shut off) for any extended period of time, the next power cycle is almost 100% a full unRAID server freeze. Thanks, Tom