Windows 10 vm crashing to black screen randomly


Solved by Rbby258


Edit: I've narrowed it down to a hardware issue, possibly cpu or the motherboard. I will update and close this once I've found which.

 

For some reason my Windows 10 VM goes to a black screen after a few minutes in Windows. The VM uses a passed-through RTX 3090 and a 13900K, with everything water-cooled, and the same install runs perfectly on bare metal when booted straight from the NVMe. The motherboard is a Z790 Taichi Carrara.

 

The VM log shows this right when it crashes:

qemu-system-x86_64: vfio: Unable to power on device, stuck in D3

 

And this right after launching Overwatch 2, when the VM freezes:

 

-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/0 (label charserial0)
error: kvm run failed Invalid argument
RAX=fffff8016336500c RBX=000000000000000c RCX=000000000000000c RDX=0000000000000071
RSI=000000000000000c RDI=fffff80166e62f00 RBP=000000000000000c RSP=fffff80166e62e38
R8 =fffff80166e62f01 R9 =0000000000000001 R10=fffff80163365000 R11=0000000000000000
R12=000001f9f5b75cbc R13=000001fddf0054f8 R14=0000000000000000 R15=fffff80163c05128
RIP=fffff8016336500d RFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0000 0000000000000000 ffffffff 00c00000
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
FS =0053 0000000000000000 00003c00 0040f300 DPL=3 DS   [-WA]
GS =002b fffff8015ded9000 ffffffff 00c0f300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 fffff80166e49000 00000067 00008b00 DPL=0 TSS64-busy
GDT=     fffff80166e4afb0 00000057
IDT=     fffff80166e48000 00000fff
CR0=80050033 CR2=000001fddf00d5c8 CR3=00000003ab694000 CR4=00350ef8
DR0=000001fd93feeee0 DR1=00007ff73f0d0540 DR2=000001fd918a07f0 DR3=00007ff73f7452c8 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=cc cc cc cc cc cc cc ba 70 00 00 00 8a c1 ee ba 71 00 00 00 <ec> c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 44 8a c2 ba 70 00 00 00 8a c1 ee
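
For anyone reading the dump above: the `Code=` line holds the bytes around RIP, with the byte at RIP in angle brackets. Below is a minimal sketch that decodes only the handful of one-byte and fixed-length opcodes that appear in this particular window; it is not a general x86 disassembler, and the function name `decode` is made up for illustration. It only shows where the guest was when KVM returned the error, not why.

```python
# Bytes copied from the Code= line of the dump, starting at the first non-padding byte.
DUMP_CODE = "ba 70 00 00 00 8a c1 ee ba 71 00 00 00 <ec> c3"

# Single-byte opcodes that occur in this window.
SINGLE = {"cc": "int3", "ee": "out dx, al", "ec": "in al, dx", "c3": "ret"}


def decode(code: str) -> list[str]:
    """Decode the few opcodes seen in this dump; '=>' marks the byte at RIP."""
    raw = code.split()
    out = []
    i = 0
    while i < len(raw):
        marked = raw[i].startswith("<")
        byte = raw[i].strip("<>")
        if byte == "ba" and i + 4 < len(raw):
            # BA id: mov edx, imm32 (immediate is stored little-endian)
            imm_bytes = [t.strip("<>") for t in raw[i + 1:i + 5]]
            imm = int("".join(reversed(imm_bytes)), 16)
            text = f"mov edx, {imm:#x}"
            i += 5
        elif byte == "8a" and raw[i + 1:i + 2] == ["c1"]:
            # 8A C1: mov al, cl
            text = "mov al, cl"
            i += 2
        elif byte in SINGLE:
            text = SINGLE[byte]
            i += 1
        else:
            text = f"(undecoded byte {byte})"
            i += 1
        out.append(("=> " if marked else "   ") + text)
    return out


if __name__ == "__main__":
    print("\n".join(decode(DUMP_CODE)))
```

Read this way, the guest had written an index to I/O port 0x70 and was reading port 0x71 (the legacy CMOS/RTC index and data ports) when the run failed, which matches RDX=0x71 in the register dump.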

This problem has only just started happening; before, the VM would run for days until it was manually shut down.

 

Please let me know what other information you may need to help, as I'm no longer able to use this VM.

 

Many thanks.

 

 

tower-diagnostics-20230501-1714.zip


Try vfio-pci.disable_idle_d3=1 instead of disable_idle_d3=1

However, this may well be a motherboard BIOS issue.
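
To confirm which form of the option the host actually booted with, something like the sketch below can help. It just tokenizes the kernel command line; the function name `d3_option_state` and the sample strings in the usage note are made up for illustration. It accepts both `vfio-pci.` and `vfio_pci.` because the kernel treats dashes and underscores in parameter names as equivalent.

```python
from pathlib import Path


def d3_option_state(cmdline: str) -> str:
    """Classify how disable_idle_d3 appears on a kernel command line.

    Returns "prefixed" for vfio-pci.disable_idle_d3=...,
    "unprefixed" for a bare disable_idle_d3=..., otherwise "absent".
    """
    names = [tok.partition("=")[0].replace("-", "_") for tok in cmdline.split()]
    if "vfio_pci.disable_idle_d3" in names:
        return "prefixed"
    if "disable_idle_d3" in names:
        return "unprefixed"
    return "absent"


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    print(d3_option_state(Path("/proc/cmdline").read_text()))
```

For example, `d3_option_state("vfio-pci.disable_idle_d3=1")` gives `"prefixed"`, while `d3_option_state("disable_idle_d3=1")` gives `"unprefixed"`, which is the form the suggestion above says to replace.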

Also change the two hostdev entries for the card (host functions 01:00.0 and 01:00.1) from this:
 

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/isos/vbios/3090.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>

to this, which puts both functions on guest slot 0x05 and marks function 0x0 as multifunction:
 

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/isos/vbios/3090.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </hostdev>
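
The change above can be checked mechanically. The following is a rough sketch, not a libvirt tool: it groups `hostdev` entries by their host bus/slot and reports when functions of one physical card land on different guest slots, or when guest function 0 lacks `multifunction='on'`. The helper names `hostdev` and `multifunction_problems` are made up for illustration, and the generated XML keeps only the `source` and guest `address` elements from the snippets (the `driver` and `rom` lines are omitted for brevity).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def hostdev(host_fn: str, guest_slot: str, guest_fn: str, multi: bool = False) -> str:
    """Build a trimmed hostdev element like the ones in the post."""
    extra = " multifunction='on'" if multi else ""
    return (
        "<hostdev mode='subsystem' type='pci' managed='yes'>"
        f"<source><address domain='0x0000' bus='0x01' slot='0x00' function='{host_fn}'/></source>"
        f"<address type='pci' domain='0x0000' bus='0x00' slot='{guest_slot}'"
        f" function='{guest_fn}'{extra}/>"
        "</hostdev>"
    )


# The "from this" and "to this" layouts, wrapped in one root so they parse.
BEFORE = "<devices>" + hostdev("0x0", "0x05", "0x0") + hostdev("0x1", "0x06", "0x0") + "</devices>"
AFTER = "<devices>" + hostdev("0x0", "0x05", "0x0", multi=True) + hostdev("0x1", "0x05", "0x1") + "</devices>"


def multifunction_problems(devices_xml: str) -> list[str]:
    """Return human-readable problems with how multi-function cards are laid out."""
    root = ET.fromstring(devices_xml)
    by_host_device = defaultdict(list)
    for dev in root.iter("hostdev"):
        src = dev.find("source/address")
        guest = dev.find("address")
        if src is None or guest is None:
            continue
        key = (src.get("domain"), src.get("bus"), src.get("slot"))
        by_host_device[key].append(guest)

    problems = []
    for key, guests in by_host_device.items():
        if len(guests) < 2:
            continue  # single-function passthrough, nothing to check
        slots = {(g.get("bus", "?"), g.get("slot", "?")) for g in guests}
        if len(slots) > 1:
            problems.append(f"{key}: functions split across guest slots {sorted(slots)}")
        func0 = [g for g in guests if int(g.get("function", "0x0"), 16) == 0]
        if not any(g.get("multifunction") == "on" for g in func0):
            problems.append(f"{key}: guest function 0 is missing multifunction='on'")
    return problems


if __name__ == "__main__":
    for label, xml_text in (("before", BEFORE), ("after", AFTER)):
        print(label, multifunction_problems(xml_text) or "ok")
```

Run against the two layouts, the "from this" version is flagged and the "to this" version comes back clean. To check a real VM, the same function could be fed the `<devices>` section of the VM's XML.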

 

  • Solution
On 5/6/2023 at 11:12 AM, ghost82 said: [suggestion quoted in full; see the post above]

Hi, thanks for helping. I added that and tested many other things, and I finally have answers: a faulty 13900K and some faulty CableMod custom PSU cables. I ordered an i5 13500 and another "cheap" Z790 board to test the hardware. The 13900K didn't work properly on either board, and the cables would cause issues such as loss of display if the PC was knocked.

 

Changing the CPU and going back to my OEM cables got everything working again. I have to say, CableMod has issued new cables at no cost, and Intel has agreed to refund my 13900K because "we can only offer you a refund for the CPU, because we were informed that the part or any alternate parts are currently out of stock and there is no ETA for new incoming stock". Mad.
