At my wit's end with latency and audio drops in Win10 VM w/ Threadripper 1950X


Tritech


 

Ladies and gentlemen, we got'em.

 

Massive thanks to reddit user setzer for helping with this. I don't think he's on Unraid, but his help was invaluable. Latency is now down to at least manageable levels. I'll continue tweaking.

 

His .xml = https://pastebin.com/GT1dySwt

My .xml = https://pastebin.com/yGcL0GNj

 

and he also sent along some additional reading for us. https://forum.level1techs.com/t/increasing-vfio-vga-performance/133443


Yea, I didn't grasp the concept the initial post was making about creating a PCI root bus and assigning it instead of a card. The more recent activity there does suggest that the bulk of the improvements will come with QEMU updates... whenever we get those.

 

The guy I got it from said that the last lines in his XML were for a patched QEMU.

 

I was also recommended "hugepages", but after a cursory search it seems that Unraid enables that by default. I couldn't get a VM to load with it enabled.
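For reference, hugepages are usually requested per-VM in the domain XML rather than globally; a minimal sketch (assuming the host actually has hugepages reserved, e.g. via `vm.nr_hugepages` or a `hugepages=` boot parameter):

```xml
<memoryBacking>
  <!-- Back guest RAM with host hugepages; the VM will refuse to start
       if the host has no (or too few) hugepages reserved -->
  <hugepages/>
</memoryBacking>
```

If nothing is reserved on the host, the VM fails at start with an allocation error, which could explain the boot failure described above.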

<qemu:commandline>
  <qemu:arg value='-global'/>
  <qemu:arg value='pcie-root-port.speed=8'/>
  <qemu:arg value='-global'/>
  <qemu:arg value='pcie-root-port.width=16'/>
</qemu:commandline>
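Worth noting: for libvirt to accept a `<qemu:commandline>` block at all, the root `<domain>` element needs the QEMU XML namespace declared, e.g.:

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
```

Without that attribute, libvirt typically drops the whole block silently when the XML is saved.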

 


@Tritech 

<memory mode='strict' nodeset='1'/>

Btw, the numatune doesn't really work as it's supposed to. It always grabs RAM from the other node too: first VM "strict" with 8GB from node0, and second VM "preferred" with 16GB from node1. If I remember right, "strict" should cause an error when not enough RAM is available on the specified node, but it doesn't for me. No clue how to fix this yet.
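For comparison, the two modes being described look like this in the domain XML (node numbers here are only illustrative):

```xml
<!-- VM 1: allocate only from host node 0; with "strict" the allocation
     is supposed to fail rather than spill onto other nodes -->
<numatune>
  <memory mode='strict' nodeset='0'/>
</numatune>

<!-- VM 2: prefer host node 1, but silently fall back to other nodes -->
<numatune>
  <memory mode='preferred' nodeset='1'/>
</numatune>
```

Actual placement can be checked from the host with `numastat -p <qemu-pid>` per VM.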

[screenshot: numastat output]

 

 

Edit:

Another thing I noticed in your XML:

    <numa>
      <cell id='0' cpus='0-15' memory='16777216' unit='KiB'/>
    </numa>

Isn't that line telling the VM to use 16GB from node0 for cores 0-15, when you're using cores 8-15 and 24-32?

42 minutes ago, bastl said:

 

Edit:

Another thing I noticed in your XML:


    <numa>
      <cell id='0' cpus='0-15' memory='16777216' unit='KiB'/>
    </numa>

Isn't that line telling the VM to use 16GB from node0 for cores 0-15, when you're using cores 8-15 and 24-32?

I saw that, and yes, that's what it looks like to me. Lemme test.


@Tritech Ok, the "<numa>" tag only matters if you have more vCPUs than one NUMA node has and you want a NUMA topology inside the VM. Let's say 2 cores from each node; then you can tell the VM which "virtual" node uses how much RAM. This should not affect us.

 

Example from https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/virtualization_tuning_and_optimization_guide/

Btw, a really useful guide.


4 available nodes (0-3)
Node 0: CPUs 0 4, size 4000 MiB
Node 1: CPUs 1 5, size 3999 MiB
Node 2: CPUs 2 6, size 4001 MiB
Node 3: CPUs 3 7, size 4005 MiB

In this scenario, use the following Domain XML setting:

<cputune>
  <vcpupin vcpu="0" cpuset="1"/>
  <vcpupin vcpu="1" cpuset="5"/>
  <vcpupin vcpu="2" cpuset="2"/>
  <vcpupin vcpu="3" cpuset="6"/>
</cputune>
<numatune>
  <memory mode="strict" nodeset="1-2"/>
</numatune>
<cpu>
  <numa>
    <cell id="0" cpus="0-1" memory="3" unit="GiB"/>
    <cell id="1" cpus="2-3" memory="3" unit="GiB"/>
  </numa>
</cpu>

 

The <emulatorpin> tag specifies which host physical CPUs the emulator (a subset of a domain, not including vCPUs) will be pinned to. The <emulatorpin> tag provides a method of setting a precise affinity to emulator thread processes. As a result, vhost threads run on the same subset of physical CPUs and memory, and therefore benefit from cache locality.

@Tritech Does that mean emulatorpin should be outside the range of the vCPUs already in use? I already have my main VM set up so that the emulatorpin cores are separate from the cores the VM uses, on the same die. Difficult if it's not your main language ^^
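A sketch of the layout being described, with the emulator thread pinned to host cores the vCPUs don't use but on the same die (core numbers here are hypothetical):

```xml
<cputune>
  <!-- vCPUs pinned to host cores 9-15 and their SMT siblings 25-31 -->
  <vcpupin vcpu='0' cpuset='9'/>
  <vcpupin vcpu='1' cpuset='25'/>
  <!-- ...remaining vcpupin entries... -->
  <!-- emulator/vhost threads kept off the vCPU cores, same die -->
  <emulatorpin cpuset='8,24'/>
</cputune>
```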


I get what you're saying; I think it's saying they should be in the included range. You know how you left out cores 8/24? Well, I think they have to be on the same "domain" to be used at all, or at least to get the most out of them. At least that's the way I interpret it.

 

I've tweaked my config for now so they're all on the same domain. I'll fix it later when I change my isolcpus at reboot.
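For anyone following along: on Unraid, isolcpus is usually set by editing the `append` line in the Syslinux configuration (Main → Flash → Syslinux Configuration) and rebooting. A sketch with hypothetical core ranges, isolating cores 8-15 and their SMT siblings from the host scheduler so only pinned VMs use them:

```
label Unraid OS
  kernel /bzimage
  append isolcpus=8-15,24-31 initrd=/bzroot
```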

 

Here are some updates as well: it seems that storport.sys is what's giving me the highest execution time. Gonna see if I can track down any gains there.

[screenshot: storport.sys execution times]


Actually, I let it run a bit longer and both of the highest execution times are network related: ndis.sys and afd.sys. Come to think of it, you're using a different ethernet port than I am. I wonder if that may be part of the issue. I'm using the 10G port, which I don't really have a use for right now; the rest of my network is gigabit.
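Since ndis.sys points at the network stack, one common lever (assuming the VM NIC is a bridged virtual adapter rather than a passed-through device) is making sure it uses the paravirtual virtio model with the virtio drivers installed in Windows, instead of an emulated e1000/rtl8139 adapter; a minimal sketch, with the bridge name and MAC purely illustrative:

```xml
<interface type='bridge'>
  <mac address='52:54:00:aa:bb:cc'/>
  <source bridge='br0'/>
  <!-- paravirtual NIC; needs the virtio-net driver inside the guest -->
  <model type='virtio'/>
</interface>
```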

3 hours ago, Tritech said:

Yea, I didn't grasp the concept the initial post was making about creating a PCI root bus and assigning it instead of a card. The more recent activity there does suggest that the bulk of the improvements will come with QEMU updates... whenever we get those.

The guy I got it from said that the last lines in his XML were for a patched QEMU.

I was also recommended "hugepages", but after a cursory search it seems that Unraid enables that by default. I couldn't get a VM to load with it enabled.

<qemu:commandline>
  <qemu:arg value='-global'/>
  <qemu:arg value='pcie-root-port.speed=8'/>
  <qemu:arg value='-global'/>
  <qemu:arg value='pcie-root-port.width=16'/>
</qemu:commandline>

 

 

I've been pushing for the changes detailed in that Level1Techs forum post for a while...

https://forums.unraid.net/topic/77499-qemu-pcie-root-port-patch/

 

Feel free to post in there to push the issue. The next stable release of QEMU doesn't look like it's coming until April/May: https://wiki.qemu.org/Planning/4.0. So fingers crossed there's an Unraid release offering it soon after.

The alternative is for the @limetech guys to be nice to us and include QEMU from the master branch rather than from a stable release in the next RC...

Considering how many issues it would fix around Threadripper, as well as the PCIe passthrough performance increases, it would make a lot of people happy...

 

 

47 minutes ago, billington.mark said:

 

I've been pushing for the changes detailed in that Level1Techs forum post for a while...

https://forums.unraid.net/topic/77499-qemu-pcie-root-port-patch/

Feel free to post in there to push the issue. The next stable release of QEMU doesn't look like it's coming until April/May: https://wiki.qemu.org/Planning/4.0. So fingers crossed there's an Unraid release offering it soon after.

The alternative is for the @limetech guys to be nice to us and include QEMU from the master branch rather than from a stable release in the next RC...

Considering how many issues it would fix around Threadripper, as well as the PCIe passthrough performance increases, it would make a lot of people happy...

 

 

Yeah, I really hope they do what they did previously with a special build just for Threadripper. I'll stay on that until QEMU 4.0 makes it into Unraid, if it gets the performance increases being talked about. It will be exciting.

Link to comment
8 hours ago, billington.mark said:

Having a build with QEMU from master would benefit everyone, not just you guys with threadripper builds ;) 

Last night I worked for almost three hours to do just that. It's a lot harder than I expected to get working on Unraid. What I was hoping to do was just see how far I could get, knowing that when I restart it all blows away anyway.

 

I should mention it failed; I'm guessing LimeTech compiles with special options or something. It would constantly error when I tried to start the VM, saying the field name wasn't a valid field. Looking at the log of a working machine, it's apparently parsing the XML and presenting that as a command. I'll keep working on it, but the LimeTech QEMU executable is much larger too, so I'm missing something.

