
Windows VM pauses when vm is shutdown



Hi!

 

When I shut down my Windows 10 VM (either from the Unraid GUI or from inside the VM), it ends up in a paused state instead of being stopped. A restart also ends up in a paused state.

 

If I try to resume the VM, I get:

internal error: unable to execute QEMU command 'cont': Resetting the Virtual Machine is required

So I force-stop and then start the VM, and everything works fine.

 

It's not really a big issue for me except:

  • If I shut down the VM, it doesn't release the RAM it was assigned. The RAM is only released after a force-stop.
  • Windows ran an update the other day and the VM ended up paused. When I rebooted it, it said "failed update, rolling back", but something went wrong because Windows Explorer would constantly crash after that, so I reset the VM.
  • I have to force-stop and start instead of just clicking start.

 

One potential clue: whenever I start the VM, the logs show this error:

qemu-system-x86_64: vfio_err_notifier_handler(0000:0a:00.1) Unrecoverable error detected. Please collect any data possible and then kill the guest

The VM runs fine despite the above error.

 

Any ideas?

 

Diagnostics are attached and thanks for looking!

 

 

 

tower-diagnostics-20200429-1337.zip


@Subtletree One of the passed-through devices is causing this error.

Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.1
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1:   device [1022:1483] error status/mask=00100000/04400000
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1:    [20] UnsupReq               (First)
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1:   TLP Header: 34000000 0a000010 00000000 80008000
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: broadcast error_detected message
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: broadcast mmio_enabled message
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: broadcast resume message
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: AER: Device recovery successful
00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
	Kernel driver in use: pcieport
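
For anyone reading the log above, here is a minimal Python sketch (not part of the original diagnostics) that decodes the `error status/mask` field into bit positions. The function names are made up for illustration, and the regex is modeled only on the line format shown in this thread. It shows that status 00100000 is bit 20, which matches the "[20] UnsupReq" line the kernel prints.

```python
import re

# Matches the "error status/mask=XXXXXXXX/XXXXXXXX" field of a kernel AER line.
# The pattern is an assumption based on the log format quoted in this thread.
STATUS_RE = re.compile(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})")


def set_bits(value):
    """Return the positions of the bits that are set in a 32-bit value."""
    return [bit for bit in range(32) if (value >> bit) & 1]


def decode_status_line(line):
    """Return (status_bits, mask_bits) for an AER status line, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    status, mask = (int(group, 16) for group in match.groups())
    return set_bits(status), set_bits(mask)


# Line copied from the log above.
line = (
    "Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1:   "
    "device [1022:1483] error status/mask=00100000/04400000"
)
print(decode_status_line(line))
```

Only the status bits describe the error that was reported; the mask bits mark error types the port is configured not to report.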

One of the devices you pass through is causing this PCI bridge to throw an error. It might be the following device, or one of the two USB controllers you're passing through:

07:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]

Remove them one by one and test if it helps.
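
To make the one-by-one test easier to judge, a rough Python sketch like the following could count AER reports per PCIe port in a saved copy of the syslog after each test boot. The function name is hypothetical and the regex is an assumption modeled only on the log lines quoted above, so other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: ...
# The pattern is based only on the lines quoted in this thread.
AER_RE = re.compile(
    r"pcieport (?P<port>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AER: (?P<kind>.+?) error received"
)


def count_aer_reports(log_text):
    """Count AER 'error received' lines, keyed by (port, error kind)."""
    counts = Counter()
    for log_line in log_text.splitlines():
        match = AER_RE.search(log_line)
        if match:
            counts[(match.group("port"), match.group("kind"))] += 1
    return counts


# Sample text built from two lines of the log quoted above.
sample_log = """\
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.1
Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: AER: Device recovery successful
"""

for (port, kind), count in count_aer_reports(sample_log).items():
    print(f"{port}: {count} x {kind}")
```

If the count for 0000:00:03.1 drops to zero after removing one device from the VM, that device is the likely culprit.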

  • 1 year later...

@SimpleDino I feel bad I never came back with an update!

 

If I remember correctly, @bastl was right and the "AER: Uncorrected (Non-Fatal) error received" message was the source of the problem (thanks @bastl!). I think I did something like the advice here and silenced the error. Whatever I was reading at the time suggested that it isn't really an issue and that silencing the error is fine.

 

I'm not at home right now, so I'll check my settings when I get home.

 
