Subtletree Posted April 29, 2020

Hi!

When I shut down my Windows 10 VM (either from the Unraid GUI or from inside the VM), it ends up in a paused state instead of being stopped. A restart also ends up in a paused state.

If I try to resume the VM I get:

    internal error: unable to execute QEMU command 'cont': Resetting the Virtual Machine is required

So I force-stop and then start the VM, and everything works fine.

It's not really a big issue for me, except:

- If I shut down the VM, it doesn't release the RAM it was assigned. The RAM is released after a force-stop.
- Windows did an update the other day and the VM ended up paused. When I rebooted it, it said "failed update, rolling back", but something went wrong because Windows Explorer would constantly crash after that, so I reset the VM.
- I have to force-stop and then start instead of just clicking start.

One potential clue: whenever I start the VM, the logs show this error:

    qemu-system-x86_64: vfio_err_notifier_handler(0000:0a:00.1) Unrecoverable error detected. Please collect any data possible and then kill the guest

The VM runs fine despite the above error.

Any ideas? Diagnostics are attached, and thanks for looking!

tower-diagnostics-20200429-1337.zip
bastl Posted April 29, 2020

@Subtletree One of the passed-through devices is causing this error.

    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.1
    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: device [1022:1483] error status/mask=00100000/04400000
    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: [20] UnsupReq (First)
    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: TLP Header: 34000000 0a000010 00000000 80008000
    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: broadcast error_detected message
    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: broadcast mmio_enabled message
    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: broadcast resume message
    Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: AER: Device recovery successful

    00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
        Kernel driver in use: pcieport

One of the devices you pass through is causing this PCI bridge to throw an error. It might be the following device or one of the 2 USB controllers you're passing through:

    07:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]

Remove them one by one and test whether it helps.
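For anyone triaging a similar syslog, the excerpt above can be reduced to "which port reported what" with a few lines of Python. This is only a rough sketch: the function name `summarize_aer` and both regular expressions are made up for illustration and are matched against the exact line shapes quoted in this thread, so other kernel versions may format these messages differently.

```python
import re

# Matches the "PCIe Bus Error" line as it appears in the log quoted above.
BUS_ERROR = re.compile(
    r"pcieport (?P<port>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCIe Bus Error: severity=(?P<severity>[^,]+), type=(?P<layer>[^,]+)"
)
# Matches a decoded status bit such as "[20] UnsupReq (First)".
# It does not match vendor:device IDs like "[1022:1483]" because of the colon.
STATUS_BIT = re.compile(r"\[\s*\d+\]\s+(?P<name>\w+)")


def summarize_aer(lines):
    """Collect one dict per 'PCIe Bus Error' line, plus the status flags that follow it."""
    events = []
    for line in lines:
        match = BUS_ERROR.search(line)
        if match:
            events.append({**match.groupdict(), "flags": []})
            continue
        match = STATUS_BIT.search(line)
        if match and events and "pcieport" in line:
            # Attach the flag to the most recent bus error seen.
            events[-1]["flags"].append(match.group("name"))
    return events


# Three of the lines from the log quoted in this thread.
SAMPLE = [
    "Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: PCIe Bus Error: "
    "severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)",
    "Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: device [1022:1483] "
    "error status/mask=00100000/04400000",
    "Apr 28 18:37:25 Tower kernel: pcieport 0000:00:03.1: [20] UnsupReq (First)",
]

if __name__ == "__main__":
    for event in summarize_aer(SAMPLE):
        print(event)
```

The port address it reports is the bridge that logged the error, not necessarily the faulty device, so removing the passed-through devices one at a time is still the way to find the culprit.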
SimpleDino Posted July 12, 2021

@Subtletree Was this ever solved? I'm having the same issue, unfortunately.

Br,
Subtletree Posted July 12, 2021 (Author)

@SimpleDino I feel bad that I never came back with an update!

If I remember correctly, @bastl was right and the "AER: Uncorrected (Non-Fatal) error received" message was the source of the problem (thanks @bastl!).

I think I did something like the advice here and silenced the error. Whatever I was reading at the time suggested that it's not really an issue and that silencing the error was fine.

I'm not at home now, so I will check my settings when I get home.