Windows VM Locks up and Reboots


Recommended Posts

Hardware
MSI X470 Gaming Plus
Ryzen 2700x
Kraken X62 AIO with fans in push/pull
4 x 16 GB 3200Mhz Ram (Can get to boot at 2933, have backed down to 2866 currently, have also tried at 2133)
1 TB Sabrent Rocket NVMe drive (passed through to VM)
2 x 1 TB 5400 RPM HDD (One for array, one for parity)

3 x 500 GB Samsung 850 SSD (cache drive, not used for this VM)
Zotac 750ti (Used as primary GPU for UnRaid)
XFX Vega 64 Reference (Passed through to VM)
Corsair RM850x PSU (Purchased 6 months ago, entire 850w rating is supported on the 12v rail)
USB Controller (Passed through, devices connected: Mouse, KB, Soundcard, Bluetooth)

 

Tried
Clear CMOS button
Remove battery for 20+ minutes
Works fine bare metal
Changed RAM slots
Reseatted GPU
Creating a new VM
Overclocking, Underclocking, Stock for GPU, RAM, and CPU
Clean dust and repaste GPU
HPET on and off (unRaid and VM)

 

Issue
When running vm it quickly freezes, usb drops and reconnects then vm restarts. After 2 or 3 restarts it will usually freeze during boot and VM will need a force stop. It can be started back up at this point but sometimes during the freeze it will crash all of unRaid. At least to the point of the webserver being unreachable.

 

I will attach the most recent diagnostics from the usb drive and my current xml file. Logs from the VM and unRaid didn't seem out of the ordinary.

 

The strangest part is I've been running fine for so long. This VM has been in use for close to two years with minimal issues. I'm not sure if I recently tweaked something that caused issues or not. Just a few days before this I had an issue with stuttering when moving the mouse. Eventually tracked it down to the polling rate of the mouse (Worked fine at 1000 previously). I installed the software and turned it down to 250 and everything worked great for a full day. I woke up the next day and unRaid had crashed. It was frozen for quite a while and my system clock seemed to have gotten stuck when it froze; the time in the bios was off.

 

That made me think it was a hardware issue, but I've spent the last couple weeks running bare metal with no issues. No matter what stress test or benchmark I put it through everything has checked out. I've also ran unRaid overnight a few times and haven't had any crashes without running the VM. I also have another VM that I did let run during this time and it seemed to be fine. I do have VM cores isolated though so its possible the issue didn't show under less stress.

 

Just remembered I changed HPET settings around this time as well. Turned them off for unRaid and the VM and performance is better but still crashes and restarts.

I am using this build of unRaid as the newer kernel fixed some issues with the VM that pretty much started working properly after the update. Newer builds of unRaid go back to the older kernel.

Maybe a more experienced eye can spot the issue or point me in the right direction. Thanks for the help!

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='4'>
  <name>Windows 10 new</name>
  <uuid>226fdef2-a553-9fd5-585b-04d7050c9f48</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>6</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='8'/>
    <vcpupin vcpu='1' cpuset='9'/>
    <vcpupin vcpu='2' cpuset='10'/>
    <vcpupin vcpu='3' cpuset='11'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='13'/>
    <emulatorpin cpuset='14-15'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-4.1'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/226fdef2-a553-9fd5-585b-04d7050c9f48_VARS-pure-efi.fd</nvram>
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='6' threads='1'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0xa'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0xb'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xc'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0xd'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0xe'/>
      <alias name='pci.7'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0xf'/>
      <alias name='pci.8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x7'/>
    </controller>
    <controller type='pci' index='9' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='9' port='0x10'/>
      <alias name='pci.9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='pci' index='10' model='pcie-to-pci-bridge'>
      <model name='pcie-pci-bridge'/>
      <alias name='pci.10'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci' ports='15'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:31:71:67'/>
      <source bridge='br0'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/1'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-4-Windows 10 new/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x21' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x21' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x22' slot='0x00' function='0x3'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev3'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x1e71'/>
        <product id='0x170e'/>
        <address bus='1' device='2'/>
      </source>
      <alias name='hostdev4'/>
      <address type='usb' bus='0' port='3'/>
    </hostdev>
    <hub type='usb'>
      <alias name='hub0'/>
      <address type='usb' bus='0' port='2'/>
    </hub>
    <memballoon model='none'/>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>

 

fender-diagnostics-20200603-0701.zip windows_10_vm.txt

Edited by StuDaBaiker
Link to comment
  • 2 months later...

I had a similar issue where everything would lock up and crash unraid. Started out of the blue too. For my case it turned out to be USB controller which wasn't resetting properly.

 

Which device does this part of the XML correspond to?

    <controller type='pci' index='10' model='pcie-to-pci-bridge'>
      <model name='pcie-pci-bridge'/>
      <alias name='pci.10'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>

For my case, the template would always get edited to pcie-to-pci-bridge and the system didn't know what to do with it.

If this corresponds to the USB controller, try to remove the usb controller from the VM (and this part from the xml if it remains) and try to boot up.

Link to comment

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.