Windows 10 VM completely freezing after 6.11.1 upgrade



I tried that but I cannot even start the VM with that setting. I get the following error in the popup:

 

Unable to write to file /var/log/libvirt/qemu/Windows 10.log: No space left on device

 

It looks like that config instantly filled up all of the space allocated for logs since the dashboard shows Logs: 100%, yet there isn't much in /var/log at all. Had to reboot because I have no idea how to clear out that memory.
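
A minimal sketch for finding out what filled the log space and freeing it without a reboot. It assumes `/var/log` is a small RAM-backed filesystem (as the dashboard's "Logs" gauge suggests) and that GNU `find` is available. The helper names `largest_files` and `truncate_log` are made up for illustration.

```shell
# List the N largest regular files under a directory
# (defaults: /var/log, 10 files). Relies on GNU find's -printf.
largest_files() {
    local dir="${1:-/var/log}" count="${2:-10}"
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null | sort -n | tail -n "$count"
}

# Empty a log file in place instead of deleting it, so a process that
# still has the file open keeps writing to the same (now empty) file.
truncate_log() {
    : > "$1"
}

# Example (run as root on the server):
#   df -h /var/log
#   largest_files /var/log 5
#   truncate_log "/var/log/libvirt/qemu/Windows 10.log"
```

If `df` reports the filesystem as full while very little shows up in the listing, one possible explanation is a file that was deleted while a process still holds it open. `lsof +L1` lists such files.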

Edited by johnsanc
Link to comment
1 hour ago, johnsanc said:

I tried that but I cannot even start the VM with that setting. I get the following error in the popup:

 

Unable to write to file /var/log/libvirt/qemu/Windows 10.log: No space left on device
 

 

It looks like that config instantly filled up all of the space allocated for logs since the dashboard shows Logs: 100%, yet there isn't much in /var/log at all. Had to reboot because I have no idea how to clear out that memory.

 

This means an error flooded the log space.

You should find out what it is.

 

Link to comment

I don't know if this is exactly the same issue, but since I upgraded to 6.11 my Windows 11 VM started to have hiccups: it would freeze for a second from time to time. Then I started to have networking issues and could not get into my network shares. After several experiments I noticed in the Windows Control Panel that my virtual network adapters, which are connected to some VLANs, are constantly flipping between on and off. They are connected to different subnets and have different MAC addresses. If I leave one virtual network adapter connected and disable all the others, the VM becomes stable again. I had this setup running before without any issues. The problems started after upgrading, and even after downgrading I could not get things working the way they did before.

Link to comment
22 hours ago, btagomes said:

i noticed that in windows control panel, my network adapters (virtual) that are connected to some vlans, are constantly flipping from on/off.

I noticed that too, a long time ago, in my Windows 11 VM. I'm not sure whether it is a Windows bug, a QEMU bug, or an interaction between the two. However, it doesn't seem to affect networking or any other functionality, at least for me.

Link to comment
3 hours ago, johnsanc said:

i tested with VNC only. Same freeze issue if the standard memory backing portion for virtiofs is included.

Would you be able to supply a copy of the QEMU log for the VM, and would you be happy for me to share it with the Libvirt and QEMU dev teams? Just with memory backing set.

Link to comment

@SimonF - I will try to reproduce that and provide the logs, however they look the same as the ones I provided in my diagnostics. There is nothing useful in the logs when the freeze occurs at all.

 

Is there any way to enable better debug logging to capture info about the freeze?

 

Also, can others following this thread please try leaving in just the memory backing config, without any of the virtiofs configuration, to see whether you still observe the freeze?

 

  <memoryBacking>
    <source type='memfd'/>
    <access mode='shared'/>
  </memoryBacking>
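
To confirm which memory backing a VM is actually defined with, the element can be pulled out of a dumped domain XML. This is only a sketch: `show_memory_backing` is a hypothetical helper name, and the domain name and temp path in the example are assumptions.

```shell
# Print the <memoryBacking> element of a saved domain XML file.
# Returns non-zero when the element is absent.
show_memory_backing() {
    sed -n '/<memoryBacking>/,/<\/memoryBacking>/p' "$1" | grep .
}

# Example: dump the current definition first, then inspect it.
#   virsh dumpxml "Windows 10" > /tmp/win10.xml
#   show_memory_backing /tmp/win10.xml || echo "no memoryBacking configured"
```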

 

Link to comment
2 minutes ago, johnsanc said:

@SimonF - I will try to reproduce that and provide the logs, however they look the same as the ones I provided in my diagnostics. There is nothing useful in the logs when the freeze occurs at all.

 

Is there any way to enable better debug logging to capture info about the freeze?

 

Also can others following this thread also please try just leaving in the memory backing config and NOT any of the virtiofs stuff to see if you also still observe the freeze?

 

  <memoryBacking>
    <source type='memfd'/>
    <access mode='shared'/>
  </memoryBacking>

 

OK, I will look at your diagnostics. I still don't get any freezes.

Link to comment

@SimonF - I was able to get the VM to freeze again with just the memoryBacking config in place. I've attached new diagnostics and my VM XML as well. According to the Windows lock screen the freeze occurred around 1:47 PM ET today. The only thing I see in the logs is something like:

 

2022-10-16 18:16:12.052+0000: Domain id=5 is tainted: custom-ga-command
2022-10-16 18:17:02.245+0000: Domain id=5 is tainted: custom-ga-command
2022-10-16 18:38:28.779+0000: Domain id=5 is tainted: custom-ga-command

 

Each line only appears when trying to view the log from the VMs tab, or trying to shut down the VM normally. As previously mentioned, the only thing that you can do at this point is force stop the VM.

 

If there is a way to enable better logging I am happy to do so to help troubleshoot.
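
The freeze time above is given in local time (1:47 PM ET, which is 17:47 UTC assuming US Eastern Daylight Time), while the log timestamps carry a `+0000` offset, meaning UTC. A rough sketch of a filter that prints only the log lines inside a UTC window around the freeze follows. `log_between` is a hypothetical helper, and the log path in the example is taken from the error message earlier in the thread.

```shell
# Print log lines whose leading "YYYY-MM-DD HH:MM:SS" timestamp falls
# between START and END (inclusive). Both bounds are plain strings in
# the same format and the same time zone as the log (UTC here).
log_between() {
    local file="$1" start="$2" end="$3"
    awk -v s="$start" -v e="$end" '
        { ts = substr($0, 1, 19) }
        ts >= s && ts <= e
    ' "$file"
}

# Example: anything logged in the half hour around the freeze?
#   log_between "/var/log/libvirt/qemu/Windows 10.log" \
#       "2022-10-16 17:30:00" "2022-10-16 18:00:00"
```

Run against the three lines quoted above, that window returns nothing. This matches the observation that the log is silent at the moment of the freeze.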

tower-diagnostics-20221016-1438.zip johnsanc_win10vm_q35.xml

Edited by johnsanc
Link to comment

Hello all,

 

I am facing the exact same issue. I recently updated to Windows 11, and for the first few days all was OK.

Then I learned that the fix for the frame drops and audio stutters I have had ever since I started using Unraid KVM is to isolate the cores that I assign to the VM. That made a huge performance difference (in a good way), but at the same time the freezes started occurring. The symptoms are the same as everyone else's: I can see the Windows screen but it's unresponsive (I can't even move the mouse).

 

Some things I did in the past few days:

 

  • Isolated 6 of the 8 cores of my Intel Core i7-9700 CPU and used those for the VM. Unraid uses 0,1 and the VM uses 2,3,4,5,6,7
  • I pass through my Nvidia 3060 Ti, a USB PCIe controller and an NVMe SSD for extra storage. My vdisk is in the Unraid cache.
  • I updated to the latest virtiofs (using the CD image)
  • I moved my Win11 virtual memory from the vdisk to the directly attached NVMe
  • I am still on 6.10, by the way

I was about to create a new thread here asking for help and then I noticed this one, so here I am. As with everyone else here, no useful logs show up about the freezes.

 

Any help would be greatly appreciated.

 

 

Link to comment

I would suggest a separate post for your issues specifically. This thread is focused on Windows VMs completely freezing (not stuttering) after a few hours on 6.11.1 when trying to use Virtio-FS. Based on my findings so far these freezes are directly related to `memoryBacking` configuration... but no solution yet.

 

Hopefully @SimonF or another developer can help to get this issue into the correct hands.

Link to comment
10 hours ago, johnsanc said:

I would suggest a separate post for your issues specifically. This thread is focused on Windows VMs completely freezing (not stuttering) after a few hours on 6.11.1 when trying to use Virtio-FS. Based on my findings so far these freezes are directly related to `memoryBacking` configuration... but no solution yet.

 

Hopefully @SimonF or another developer can help to get this issue into the correct hands.

 

My Windows VM is freezing, so I think the issue is related to this thread.

 

This is part of my original post:

On 10/19/2022 at 9:30 AM, snolly said:

That made a huge performance difference (in a good way) but at the same time the freezes started occuring. Same symptoms as the rest of the guys. I can see the Windows screen but it's unresponsive (I can't even move the mouse)

 

I now updated to the latest Unraid 6.11.1 and the latest Virtio-FS drivers. Freezes still occur.

 

What I noticed (and that may help to figure out what is causing this) is this:

 

My CPU is 8 plain cores (no HT). 0,1 are pinned to all docker containers. 2-7 are pinned to the Windows VM.

 

If I isolate all of cores 2-7, the Windows VM freezes within a couple of minutes.

If I isolate 4-7 and leave 2,3 free (still pinned to the VM), Windows does not seem to freeze.

 

Performance is still better than with no isolated cores at all, but I would prefer to isolate 6 cores and pin them exclusively to my Windows VM.
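
When comparing the two setups it can help to record exactly which cores the kernel treats as isolated and which ones the VM is pinned to. The sketch below expands a kernel-style CPU list into individual core numbers. `expand_cpu_list` is a hypothetical helper, the domain name in the example is an assumption, and the example assumes the kernel exposes its isolated set in `/sys/devices/system/cpu/isolated`.

```shell
# Expand a kernel-style CPU list such as "0-1,4-7" into "0 1 4 5 6 7".
expand_cpu_list() {
    local part n last out=""
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        n="${part%-*}"       # start of the range (or the single core)
        last="${part#*-}"    # end of the range (same value for a single core)
        while [ "$n" -le "$last" ]; do
            out="$out$n "
            n=$((n + 1))
        done
    done
    printf '%s\n' "${out% }"
}

# Example (on the server):
#   expand_cpu_list "$(cat /sys/devices/system/cpu/isolated)"   # cores isolated at boot
#   virsh vcpupin "Windows 11"                                  # current vCPU-to-core pinning
```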

 

If you need any configuration or log files please let me know.

 

Cheers.

Edited by snolly
Link to comment
On 10/16/2022 at 7:48 PM, johnsanc said:

@SimonF - I was able to get the VM to freeze again with just the memoryBacking config in place. I've attached new diagnostics and my VM XML as well. According to the Windows lock screen the freeze occurred around 1:47 PM ET today. The only thing I see in the logs is something like:

 

2022-10-16 18:16:12.052+0000: Domain id=5 is tainted: custom-ga-command
2022-10-16 18:17:02.245+0000: Domain id=5 is tainted: custom-ga-command
2022-10-16 18:38:28.779+0000: Domain id=5 is tainted: custom-ga-command

 

Each line only appears when trying to view the log from the VMs tab, or trying to shut down the VM normally. As previously mentioned, the only thing that you can do at this point is force stop the VM.

 

If there is a way to enable better logging I am happy to do so to help troubleshoot.

tower-diagnostics-20221016-1438.zip 295.36 kB · 1 download johnsanc_win10vm_q35.xml 8.59 kB · 1 download

New drivers (225) were released today.

Link to comment
1 hour ago, SimonF said:

New drivers released today 225.

 

BE CAREFUL with 225.

It failed for me.

Not only did it fail to update, it also broke my 221 install.
I couldn't remove, repair or reinstall.
It deleted the display driver (so I had only low-res VNC) and it deleted the network connection.

I had to use the Microsoft troubleshooter, which I got into the VM on a USB stick after modifying the VM config to pass the stick through directly.

At least this properly deleted the crashed setup.
Even then, with all virtio drivers removed, 225 would not complete installing.
221 (re)installed fine.

 

So, beware - although your mileage may vary.

Do yourselves a favor and keep this in your VM beforehand:

https://support.microsoft.com/en-us/topic/fix-problems-that-block-programs-from-being-installed-or-removed-cca7d1b6-65a9-3d98-426b-e9f927e1eb4d

Edited by NLS
Link to comment

Confirmed, 225 is messed up. I think it's related to qemupciserial. I was able to install if I deselected that component. Attempting a manual install errors out on that component. I suspect it may be using a self-signed certificate or something.


Either way I'm giving it a shot. We shall see shortly if the VM freezes... fingers crossed.

 

EDIT: No luck. 225 drivers have the exact same issue and the VM freezes after about 2 hours if memoryBacking is included.

Edited by johnsanc
Link to comment
4 hours ago, johnsanc said:

Confirmed 225 is messed up. I think its related to qemupciserial. I was able to install if I deselected that component. Attempting a manual install errors on that component. I suspect it may be using a self-signed certificate or something.


Either way I'm giving it a shot. We shall see shortly if the VM freezes... fingers crossed.

 

EDIT: No luck. 225 drivers have the exact same issue and the VM freezes after about 2 hours if memoryBacking is included.

Thanks for the update. Did 225 resolve the permissions and multiple-shares issues?

Link to comment

So, I'm staying on 221.

I have been using VirtioFS for the last few days.

I haven't noticed a freeze yet, BUT
1) I haven't kept the VM on long enough (will do during the weekend).

2) I think it needs more RAM than before.

 

I suspect there is a memory leak, which would explain the "timed" freeze.

Has anyone evaluated other memory sharing methods? (And does anybody really know the differences?)

 

Link to comment

225 solves no relevant issues. According to GitHub, 227 should solve the permissions problem. I mentioned this in the other topic with a link to the developer comment.

 

I want to reiterate that the freeze occurs for me with every memoryBacking configuration I’ve tried, and has no relation to the presence of the virtiofs configuration in the XML. The freeze occurs even when virtiofs is not used at all.

 

On the Windows side everything is perfectly stable and there are no errors. With the exact same Windows setup, the freeze occurs whenever memoryBacking is defined.

 

I have no idea what causes this, and the logs are useless. If someone has instructions for turning on better logging, I'll do it. To me this sounds like a lower-level bug that isn't related to virtiofs at all.

Edited by johnsanc
Link to comment
54 minutes ago, johnsanc said:

225 solves no relevant issues. According to GitHub 227 should solve permissions problem. I mentioned this in the other topic with a link to the developer comment.

 

I want to reiterate that the freeze occurs for me with every memoryBacking configuration I’ve tried, and has no relation to the presence of the virtiofs configuration in the XML. The freeze occurs even when virtiofs is not used at all.

 

On the windows side everything is perfectly stable and there’s no errors. With the exact same windows setup the freeze will occur whenever menoryBacking is defined.

 

i have no idea what causes this and the logs are useless. If someone has instructions how to turn on better logging I’ll do it. To me this sounds like s lower level bug and isn’t related to virtiofs at all.

Sorry, work has been manic this week.

 

The Libvirt team thinks it is likely to be a QEMU issue, so an issue needs to be raised with the QEMU team.

 

root@computenode:~# cat /usr/local/sbin/qemu 
#!/bin/bash

eval exec /usr/bin/qemu-system-x86_64 $(/usr/local/emhttp/plugins/dynamix.vm.manager/scripts/qemu.php "$@")

You can add debug options to this script, but I haven't yet looked into which options are best.
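
As one possible starting point, and not a tested recommendation, QEMU's `-d` option selects log items and `-D` sends that output to a file. The sketch below only builds the extra arguments. `qemu_debug_args` is a hypothetical helper, and the log path and the choice of `guest_errors` are assumptions (`qemu-system-x86_64 -d help` lists the available items).

```shell
# Build extra QEMU logging arguments:
#   -D FILE   write QEMU's debug log to FILE
#   -d ITEMS  comma-separated list of log items to enable
qemu_debug_args() {
    local logfile="$1" items="${2:-guest_errors}"
    printf '%s %s %s %s\n' -D "$logfile" -d "$items"
}

# Hypothetical change to the last line of the wrapper shown above.
# The log path is an assumption; a location outside /var/log keeps a
# noisy debug log from filling the log space.
#   eval exec /usr/bin/qemu-system-x86_64 \
#       $(qemu_debug_args /mnt/user/system/qemu-debug.log) \
#       $(/usr/local/emhttp/plugins/dynamix.vm.manager/scripts/qemu.php "$@")
```

Whether these log items capture anything useful at the moment of the freeze is unknown, so treat the output as exploratory.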

Link to comment
On 10/16/2022 at 7:48 PM, johnsanc said:

@SimonF - I was able to get the VM to freeze again with just the memoryBacking config in place. I've attached new diagnostics and my VM XML as well. According to the Windows lock screen the freeze occurred around 1:47 PM ET today. The only thing I see in the logs is something like:

 

2022-10-16 18:16:12.052+0000: Domain id=5 is tainted: custom-ga-command
2022-10-16 18:17:02.245+0000: Domain id=5 is tainted: custom-ga-command
2022-10-16 18:38:28.779+0000: Domain id=5 is tainted: custom-ga-command

 

Each line only appears when trying to view the log from the VMs tab, or trying to shut down the VM normally. As previously mentioned, the only thing that you can do at this point is force stop the VM.

 

If there is a way to enable better logging I am happy to do so to help troubleshoot.

tower-diagnostics-20221016-1438.zip 295.36 kB · 2 downloads johnsanc_win10vm_q35.xml 8.59 kB · 2 downloads

One difference I have noticed from my setup is that you are using Q35 and I am using i440. Is everyone who has freezing issues running Q35, or are some of you running i440 as the machine type?

Link to comment
3 minutes ago, SimonF said:

One thing I have noticed different to my setup is that you are using Q35 and and I am using i440 is eveoryone having freezing issues running Q35 or are some of you running i440 as the machine type.


Freezes with i440 here. One thing that might be worth mentioning is that I have 6 out of 8 cores pinned to the VM. If I isolate all 6 cores in Unraid, the freeze occurs in a matter of minutes. If I isolate 4 of the 6, the freeze might occur after hours.

Link to comment
11 minutes ago, snolly said:


Freezes with i440 here. One thing that might worth mentioning is that I have 6 out of 8 cores pinned to the VM. If I isolate all 6 cores in Unraid then the freeze occurs in a matter of minutes. If I isolate 4 out of 6 the freeze might occur after hours.

And are you still running 6.10.x? 6.11.1 ships updated QEMU and Libvirt versions compared with 6.10.x.

Edited by SimonF
Link to comment
