Windows server VM fails every other boot



I moved to a new server, an R720 instead of the R710 I had before. I moved all the data over and used the same USB drive, but created a new array. Now I am unable to create a Server 2016 VM that survives a reboot. The Windows install itself is not the issue: it installs properly, and I get all the way through to the desktop. Then, when I reboot, it fails and shows the screen in the screenshot attached below. If I then make any change to the VM under Edit, such as moving to another NIC or adding a virtual drive, it fires right up. On the next reboot, I get the same screen again. Does anyone have any idea what I am doing wrong?

 

Thanks!

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='36'>
  <name>LabDC01</name>
  <uuid>6a058db3-194f-a87c-89f4-901f80c72dc2</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='8'/>
    <vcpupin vcpu='1' cpuset='20'/>
    <vcpupin vcpu='2' cpuset='9'/>
    <vcpupin vcpu='3' cpuset='21'/>
    <vcpupin vcpu='4' cpuset='10'/>
    <vcpupin vcpu='5' cpuset='22'/>
    <vcpupin vcpu='6' cpuset='11'/>
    <vcpupin vcpu='7' cpuset='23'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-3.1'>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/VMs/LabDC01/vdisk1.img'/>
      <backingStore/>
      <target dev='hdc' bus='sata'/>
      <boot order='1'/>
      <alias name='sata0-0-2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='sata0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:3c:c4:5d'/>
      <source bridge='br0'/>
      <target dev='vnet2'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/2'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/2'>
      <source path='/dev/pts/2'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-36-LabDC01/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='vnc' port='5902' autoport='yes' websocket='5702' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>
 

2019-06-03 15_06_43-QEMU (LabDC01) - noVNC.png
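
Since any change made under Edit lets the next boot succeed, one thing that may help is comparing the domain XML saved before and after such an edit (for example, two outputs of `virsh dumpxml LabDC01`). The following is only a minimal Python sketch of that comparison, not a known fix. The helper names `summarize_domain` and `diff_summaries` are made up for illustration, the sample XML is a trimmed-down stand-in for the dump above, and the switch to an `e1000` NIC is a hypothetical edit.

```python
import xml.etree.ElementTree as ET


def summarize_domain(xml_text):
    """Collect a few boot-relevant settings from a libvirt domain XML dump."""
    root = ET.fromstring(xml_text)
    os_type = root.find("./os/type")
    loader = root.find("./os/loader")
    summary = {
        "name": root.findtext("name"),
        "machine": os_type.get("machine") if os_type is not None else None,
        # No <loader> element means no firmware image is set in the XML.
        "loader": loader.text if loader is not None else None,
        "vcpus": root.findtext("vcpu"),
    }
    for i, disk in enumerate(root.findall("./devices/disk")):
        source = disk.find("source")
        target = disk.find("target")
        boot = disk.find("boot")
        summary[f"disk{i}"] = (
            source.get("file") if source is not None else None,
            target.get("bus") if target is not None else None,
            boot.get("order") if boot is not None else None,
        )
    for i, nic in enumerate(root.findall("./devices/interface")):
        model = nic.find("model")
        address = nic.find("address")
        summary[f"nic{i}"] = (
            model.get("type") if model is not None else None,
            address.get("slot") if address is not None else None,
        )
    return summary


def diff_summaries(before, after):
    """Return {key: (old, new)} for every setting that differs."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


# Trimmed-down sample based on the dump above (illustrative only).
BEFORE = """<domain type='kvm'>
  <name>LabDC01</name>
  <vcpu placement='static'>8</vcpu>
  <os><type arch='x86_64' machine='pc-i440fx-3.1'>hvm</type></os>
  <devices>
    <disk type='file' device='disk'>
      <source file='/mnt/user/VMs/LabDC01/vdisk1.img'/>
      <target dev='hdc' bus='sata'/>
      <boot order='1'/>
    </disk>
    <interface type='bridge'>
      <model type='virtio'/>
      <address type='pci' slot='0x03'/>
    </interface>
  </devices>
</domain>"""

# Hypothetical edit: the NIC model is switched from virtio to e1000.
AFTER = BEFORE.replace("<model type='virtio'/>", "<model type='e1000'/>")

changes = diff_summaries(summarize_domain(BEFORE), summarize_domain(AFTER))
print(changes)
```

With real dumps, the two strings would be read from the saved files instead. If the summaries come out identical, the difference between a failing and a working boot is probably not in these particular settings.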

4 minutes ago, testdasi said:

This looks to be a rather idiosyncratic problem with Windows and not Unraid. That file looks to be some HBA controller driver according to Google so I would suggest you rebuild your install media and reinstall Windows from scratch.

Just an update on what I have tried. I have redownloaded Server 2016 and also Server 2012. Both do exactly the same thing as shown above. I completely deleted my libvirt file and started over from scratch. I have pulled all my NICs, no change. I swapped my H310 HBA for an H200, no change. I have included more screenshots, as the file changes on every reboot. I tried creating the VM with just one core. Nothing is working, and I am getting a little nervous as this is a production machine. I appreciate any advice anyone is able to give.

 

2019-06-05 08_05_40-QEMU (LabDC01) - noVNC.jpg

2019-06-05 08_06_16-QEMU (LabDC01) - noVNC.jpg

2019-06-05 08_06_39-QEMU (LabDC01) - noVNC.jpg

On 6/3/2019 at 4:10 PM, poopsie said:

moved to a new server

 

14 minutes ago, poopsie said:

the file changes on every reboot.

Those two facts lead me to bad RAM. Has this new machine had at least 24 hours running memtest with a clean result? If it's using ECC RAM, I think you will need to create a boot USB with the new proprietary memtest program to get accurate results.

 

Perhaps as a test temporarily remove half the RAM and see if the symptoms change, swap with the removed set of RAM and repeat.

15 minutes ago, jonathanm said:

 

Those two facts lead me to bad RAM. Has this new machine had at least 24 hours running memtest with a clean result? If it's using ECC RAM, I think you will need to create a boot USB with the new proprietary memtest program to get accurate results.

 

Perhaps as a test temporarily remove half the RAM and see if the symptoms change, swap with the removed set of RAM and repeat.

I will give this a shot in a few hours and report back.  Thank you!!!!

4 hours ago, jonathanm said:

 

Those two facts lead me to bad RAM. Has this new machine had at least 24 hours running memtest with a clean result? If it's using ECC RAM, I think you will need to create a boot USB with the new proprietary memtest program to get accurate results.

 

Perhaps as a test temporarily remove half the RAM and see if the symptoms change, swap with the removed set of RAM and repeat.

So I ran home and swapped in RAM from another machine. It did the same thing, no change. I still had my old R710 lying around, the server this was working on before the move, so I pulled the RAM from that as well. No change, same error. This one really has me stumped...

Edited by poopsie