(Solved) VM hangs on start up splash screen


Astryl


AMD Threadripper w/ 590X GPU passthrough. It was previously stable for well over a year. Suddenly, when I start the VM, I can no longer get into Windows. It either boot loops at the "performing recovery" screen or freezes while loading on the TianoCore splash screen.

 

Attempted fixes:

 

Four different vBIOS roms.

New VM XML

New vDisk / fresh VM (this worked temporarily, up until Windows loaded, then immediately hung at a black screen)

Bound and unbound the GPU at vfio.

 

I'm at my wits' end; this previously worked perfectly without issue.

 

 

orbital-diagnostics-20211117-1802.zip

Nov 18 02:07:01 ORBITAL kernel: Plex Media Serv[40895]: segfault at 14e338ecb018 ip 000014e33d56caa3 sp 000014e33736d3c0 error 4 in Plex Media Server[14e33ccc5000+bae000]
Nov 18 02:07:01 ORBITAL kernel: Code: 8b 45 08 49 8b 4d 20 48 89 ca 48 09 c2 0f 84 1a 02 00 00 48 39 c8 75 0e 49 8b 4d 18 49 3b 4d 30 0f 84 07 02 00 00 49 8b 4d 00 <83> 79 08 ff 0f 84 a5 01 00 00 41 8b 4e 10 83 f9 01 75 05 41 8b 0e
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS error (device sdc1): bad tree block start, want 1228455936 have 0
Nov 18 02:07:09 ORBITAL kernel: BTRFS info (device sdc1): failed to delete reference to Plex Media Server.4.log, inode 53909236 parent 268
Nov 18 02:07:09 ORBITAL kernel: BTRFS: error (device sdc1) in __btrfs_unlink_inode:4034: errno=-5 IO failure
Nov 18 02:07:09 ORBITAL kernel: BTRFS info (device sdc1): forced readonly
Nov 18 02:07:09 ORBITAL kernel: BTRFS: error (device sdc1) in btrfs_rename:9598: errno=-5 IO failure

 

 

Uh...

 

Is my cache pool dying? Good health reports on both drives...

1 hour ago, JorgeB said:

Looks more like filesystem corruption, but those errors are not in the posted diags.

 

Yeah, looking back at it, this is from today and may be related to the UD SSD that holds my Plex media metadata, so it is probably not relevant to the other issue.

 

Other than moving my GPU to a different slot, does anyone have any insight here?

 

Last night I tried the following additional steps:

 

Changed the PCIe ACS override from "Both" to "Multifunction"

Added "video=efifb:off" to my syslinux.cfg
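
One way to confirm that the parameter actually reached the kernel after a reboot is to look for it in /proc/cmdline. The helper below is only a minimal sketch: the function name has_kernel_param is made up for illustration, and the optional second argument exists purely so the check can be run against a saved copy of the command line.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline (hypothetical helper, not part
# of Unraid or any plugin).
has_kernel_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # Split the command line on spaces and look for an exact, literal match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Example:
#   has_kernel_param video=efifb:off && echo "parameter is on the command line"
```

If the check fails after a reboot, the edit probably landed on a boot entry other than the one that was actually booted.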

 

Still freezes at the same place on either the new or old vDisk.

 

 

-m 65536 \
-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":68719476736}' \
-overcommit mem-lock=off \
-smp 1,sockets=1,dies=1,cores=1,threads=1 \
-uuid a1b3e671-4ac9-86b0-dfd8-b927bd0d0dc2 \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=33,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=localtime \
-no-hpet \
-no-shutdown \
-boot strict=on \
-device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
-device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
-device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
-device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \
-device pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \
-device nec-usb-xhci,p2=15,p3=15,id=usb,bus=pcie.0,addr=0x7 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x0 \
-blockdev '{"driver":"file","filename":"/mnt/user/domains/Arcana/vdisk1.img","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":false,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}' \
-device ide-hd,bus=ide.2,drive=libvirt-1-format,id=sata0-0-2,bootindex=1,write-cache=on \
-netdev tap,fd=35,id=hostnet0 \
-device virtio-net,netdev=hostnet0,id=net0,mac=52:54:00:78:9b:47,bus=pci.1,addr=0x0 \
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=serial0 \
-chardev socket,id=charchannel0,fd=36,server=on,wait=off \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \
-audiodev id=audio1,driver=none \
-device vfio-pci,host=0000:4a:00.0,id=hostdev0,bus=pci.3,addr=0x0 \
-device vfio-pci,host=0000:4a:00.1,id=hostdev1,bus=pci.4,addr=0x0 \
-device usb-host,hostdevice=/dev/bus/usb/003/003,id=hostdev2,bus=usb.0,port=1 \
-device usb-host,hostdevice=/dev/bus/usb/007/002,id=hostdev3,bus=usb.0,port=2 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/3 (label charserial0)
2021-11-18T20:12:23.537143Z qemu-system-x86_64: vfio: Cannot reset device 0000:4a:00.1, no available reset mechanism.
2021-11-18T20:12:23.542112Z qemu-system-x86_64: vfio: Cannot reset device 0000:4a:00.1, no available reset mechanism.

 

 

Played around with it some more. It seems the GPU isn't resetting, even though I already have the AMD reset app installed from CA. I'm not sure what changed in 72 hours to make this suddenly become an issue.
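
The log above names 0000:4a:00.1, which the QEMU command line passes through as the second function next to 0000:4a:00.0. A rough way to pull every such address out of a VM log and then see which kernel driver holds each device is sketched below. The function name list_unresettable_devices is made up for illustration, and the log path in the example comment is an assumption (the VM name is guessed from the vdisk path).

```shell
#!/bin/sh
# list_unresettable_devices LOGFILE
# Prints, once each, every PCI address that QEMU reported with
# "Cannot reset device" (hypothetical helper for reading the VM log).
list_unresettable_devices() {
    grep -o 'Cannot reset device [0-9a-fA-F:.]*' "$1" | awk '{ print $4 }' | sort -u
}

# Example (log path is an assumption):
#   for dev in $(list_unresettable_devices /var/log/libvirt/qemu/Arcana.log); do
#       lspci -nnk -s "$dev"   # shows the device and "Kernel driver in use"
#   done
```

This only reports what the log already says. It does not show why the reset fails, but it makes it easy to check that each affected function is bound to vfio-pci.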

  • Astryl changed the title to (Solved) VM hangs on start up splash screen
