hiptoss VM issues (split from Windows issues with unRAID)



I've recently upgraded to 6.9.0beta22 and I'm having VM issues I haven't seen before.  No matter which settings I try, I consistently get a bluescreen on boot, prior to the Windows login.  The bluescreen is the "kernel security check failure" screen with the little QR code.  It auto-reboots, tries to repair, and repeats this bluescreen loop until I stop the VM.  I've tried various machine types, with Hyper-V and without, many cores or just one, various memory configurations, etc.  I have a 3960X and 128 GB of RAM, in case it matters.  Below is the XML of my latest attempt, and the diagnostics are attached.  In this iteration, I'm only using 8 cores and 16 GB of RAM, passing through just keyboard and mouse, and using only VNC for video.


Any help would be tremendously appreciated.

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm'>
  <name>abc</name>
  <uuid>5bf07f3a-c47f-f0c7-1698-9bf520dc24e1</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='20'/>
    <vcpupin vcpu='1' cpuset='44'/>
    <vcpupin vcpu='2' cpuset='21'/>
    <vcpupin vcpu='3' cpuset='45'/>
    <vcpupin vcpu='4' cpuset='22'/>
    <vcpupin vcpu='5' cpuset='46'/>
    <vcpupin vcpu='6' cpuset='23'/>
    <vcpupin vcpu='7' cpuset='47'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-4.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/5bf07f3a-c47f-f0c7-1698-9bf520dc24e1_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source dev='/dev/disk/by-id/nvme-Samsung_SSD_970_EVO_500GB_S466NX0KC32791N'/>
      <target dev='hdc' bus='ide'/>
      <boot order='1'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci' ports='15'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:10:46:d3'/>
      <source bridge='br0'/>
      <model type='virtio-net'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='3'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' websocket='-1' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x04d9'/>
        <product id='0x1818'/>
      </source>
      <address type='usb' bus='0' port='1'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x1038'/>
        <product id='0x1710'/>
      </source>
      <address type='usb' bus='0' port='2'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>

 


 

beastmode-diagnostics-20200623-1128.zip


Change the CPU from Host Passthrough to Emulated.  There is currently an issue with passing through AMD Ryzen 3000 series CPUs on QEMU 5.0 (which is what 6.9beta22 ships with).  You can do that in the WebGUI mode of the edit VM page; you don't even need to go into the Advanced XML view.

Thank you for the suggestion.  When I try that, I get a popup saying "XML error: Non-empty feature list specified without CPU model".  Is there something else I need to add?  I tried with i440fx-4.2 and 5.0.

EDIT: After creating a new VM, I'm in a functional state.  I still had to turn on ACS override within unRAID, even with the new options to pass components through directly.  After figuring that out, and using SATA for the two NVMe drives I pass through to the Windows VM, I'm up and running.  Thank you very much for your help.


A significant downside to using emulated CPUs is apparently that some apps and games actually check your CPU capabilities.  For example, last night I tried to play NBA 2k20, and I received a popup saying basically "cpu without SSE 4.2 detected" and the game wouldn't start.  Taking @Pducharme's tip, I spent a couple hours trying to figure out how to work around the QEMU issue with AMD 3000 series CPUs.  In my research, I ran across a great thread on the VFIO subreddit explaining the issue, and how to work around it.  

The "easy" solution I used was to simply edit the QEMU XML and change the cpu mode to 'host-model' instead of 'host-passthrough'.  It isn't perfect in that it shows my 3960X as an EPYC processor on boot, but it does allow me to use the VM for gaming.  (Note that the reddit thread lists more detailed methods that involve the -amd-stibp QEMU cpu argument.  I wasn't able to get those to work because of XML parsing errors.)

Here are the details of how I fixed this from the command line, after creating a regular Windows 10 VM with host-passthrough.  My VM is called "testing1", which creates a testing1.xml in /etc/libvirt/qemu.

cd /etc/libvirt/qemu

virsh edit testing1

^ This will open your XML for editing (change testing1 to whatever your .xml filename is, minus the .xml extension; i.e., myvm.xml = virsh edit myvm).

For my config the cpu section looked like this:
 

  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' dies='1' cores='10' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>


All I had to do was change that section to look like this (the mode becomes 'host-model' and the cache line is removed):
 

  <cpu mode='host-model' check='none'>
    <topology sockets='1' dies='1' cores='10' threads='2'/>
    <feature policy='require' name='topoext'/>
  </cpu>


After saving, and then starting the VM, everything "just worked."  I used it last night for a few hours, and I'm typing this post from the VM left on overnight.  So far so good.
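
For anyone who would rather script the same change than make it by hand, here is a minimal sketch in Python. It assumes the domain XML is available as a string (for example from `virsh dumpxml`); the function name `switch_to_host_model` and the `SAMPLE` snippet are made up for illustration and are not part of unRAID or libvirt.

```python
import xml.etree.ElementTree as ET


def switch_to_host_model(domain_xml: str) -> str:
    """Return the domain XML with <cpu> switched to mode='host-model'."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        raise ValueError("no <cpu> element found in domain XML")
    cpu.set("mode", "host-model")
    # Drop <cache mode='passthrough'/>, matching the hand-edited section.
    for cache in cpu.findall("cache"):
        cpu.remove(cache)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed-down domain used only to demonstrate the edit.
SAMPLE = """<domain type='kvm'>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' dies='1' cores='10' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>
</domain>"""

if __name__ == "__main__":
    print(switch_to_host_model(SAMPLE))
```

The function only returns a string. Applying the result is still up to you, for example by pasting it into `virsh edit` or saving it to a file and loading it with `virsh define`.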

I wanted to point this out in case anyone else finds themselves in my position, having no idea about the QEMU bug with the later AMD cpus.  Perhaps someone can add the workarounds from that thread in a more visible post until the bug is fixed in an updated release.  


Hi.

 

Same issue after upgrading from 6.8.3 to 6.9 beta22.  I had upgraded my server hardware to a Ryzen 3900X while on unRAID 6.8.3.  After upgrading to 6.9 beta22, my Win10 VM got stuck in the same BSOD loop as mentioned above.

 

The above solution, changing the XML to <cpu mode='host-model' check='none'> and deleting <cache mode='passthrough'/>, is a temporary fix for this issue.

 

THX!
 

  • 1 month later...

This looks to have saved my bacon! Thank you very much!

I just needed to set the number of cores to half the number of vCPUs (in my case I pass through 8 vCPUs in total):

  <cpu mode='host-model' check='none'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
    <feature policy='require' name='topoext'/>
  </cpu>
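
The cores/threads relationship above can be sanity-checked before starting the VM. This is a rough sketch in Python, assuming the product of sockets, dies, cores and threads should equal the `<vcpu>` count, which is what the post describes. `topology_matches_vcpus` and `SAMPLE_DOMAIN` are illustrative names, not part of any tool.

```python
import xml.etree.ElementTree as ET


def topology_matches_vcpus(domain_xml: str) -> bool:
    """Check that sockets * dies * cores * threads equals the <vcpu> count."""
    root = ET.fromstring(domain_xml)
    vcpus = int(root.findtext("vcpu"))
    topology = root.find("cpu/topology")
    total = 1
    for attr in ("sockets", "dies", "cores", "threads"):
        # Treat a missing attribute as 1 (an assumption of this sketch).
        total *= int(topology.get(attr, "1"))
    return total == vcpus


# Hypothetical minimal domain: 8 vCPUs exposed as 4 cores x 2 threads.
SAMPLE_DOMAIN = """<domain type='kvm'>
  <vcpu placement='static'>8</vcpu>
  <cpu mode='host-model' check='none'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
    <feature policy='require' name='topoext'/>
  </cpu>
</domain>"""

if __name__ == "__main__":
    print(topology_matches_vcpus(SAMPLE_DOMAIN))
```

With 8 vCPUs, `cores='4' threads='2'` passes the check, while a leftover `cores='10'` from a copied snippet would not.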

  • 2 weeks later...

I’m doubtful this will ever get sorted.  I can’t understand why unRAID staff can’t write up how to properly downgrade from 6.9.0 beta 25 to 6.8.3 without losing cache drives.  The little info you get on the pre-release doesn’t make any sense to me.

On 8/24/2020 at 12:53 AM, Dava2k7 said:

I’m doubtful this will ever get sorted.  I can’t understand why unRAID staff can’t write up how to properly downgrade from 6.9.0 beta 25 to 6.8.3 without losing cache drives.  The little info you get on the pre-release doesn’t make any sense to me.

 

I was pulling my hair out for hours a couple of weeks ago getting this all working.  This is my first AMD build in a loooooong time, a Ryzen 3700X on an ASRock B550M Steel Legend.  After working out that I needed to move to 6.9 beta 25 for the internal NIC, I found myself in the same position.  I'm not sure where I found this, but these command-line arguments are key to a successful Win10 VM using host passthrough.  My build has been solid since working this out!  Hope it helps!

  <vcpu placement='static'>8</vcpu>
  <iothreads>2</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='12'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <vcpupin vcpu='3' cpuset='13'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='14'/>
    <vcpupin vcpu='6' cpuset='7'/>
    <vcpupin vcpu='7' cpuset='15'/>
    <emulatorpin cpuset='1,9'/>
    <iothreadpin iothread='1' cpuset='0-1'/>
    <iothreadpin iothread='2' cpuset='2-3'/>
  </cputune>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,topoext=on,invtsc=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vendor-id=kittycatss,hv-vpindex,hv-synic,hv-stimer,hv-reset,hv-frequencies,host-cache-info=on,l3-cache=off,-amd-stibp'/>
  </qemu:commandline>
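
One detail the fragment above does not show: libvirt only accepts `<qemu:commandline>` when the `<domain>` element declares the qemu namespace (`xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'`). A missing declaration is one plausible cause of XML parsing errors with this approach, though nothing in this thread confirms that. Below is a hedged sketch in Python that appends the block with the namespace declared. `add_cpu_override` is a made-up helper name, and the `-cpu` string used in the example is shortened for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace libvirt uses for QEMU command-line passthrough elements.
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"
ET.register_namespace("qemu", QEMU_NS)


def add_cpu_override(domain_xml: str, cpu_arg: str) -> str:
    """Append <qemu:commandline> carrying '-cpu <cpu_arg>' to the domain."""
    root = ET.fromstring(domain_xml)
    cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")
    for value in ("-cpu", cpu_arg):
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": value})
    # Serializing declares xmlns:qemu on the root <domain> element.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical minimal domain and a shortened -cpu string.
    minimal = "<domain type='kvm'><name>abc</name></domain>"
    print(add_cpu_override(minimal, "host,topoext=on,-amd-stibp"))
```

The full `-cpu` value from the post can be passed in unchanged as `cpu_arg`; the sketch does not validate any of the individual flags.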

 
