SignorRossi

6.8.1 Singlethread performance (GTA5) not good enough

Hey!

 

My Unraid host is working fine except for single-thread performance, specifically in GTA5. I have not tested other games yet.

 

My hardware: Ryzen 9 3900X, Asus X470 Strix F, 64 GB RAM (3400/16 IF1700), RTX 2080 Ti, several SSDs and HDDs.

Unraid settings: CPU pinning and isolation (0/12), Spectre, Meltdown and Zombieload mitigations disabled, CPU Scaling Governor "performance", boost enabled (also in BIOS).

 

Results

CPU-Z 1.91 singlethread

Unraid: 51x, ESXi: 53x, bare metal: 55x

 

GTA5 (min fps, average over 5 runs):

Unraid (8C/16T): 33.63, ESXi (8C/16T): 45.52, bare metal: 50.80
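
For scale, the min-fps figures above can be expressed relative to bare metal. This is only a sketch of the arithmetic using the three numbers quoted in the post; the function and variable names are made up for illustration.

```python
# Minimum-fps figures quoted in the post above (average over 5 runs).
MIN_FPS = {"Unraid": 33.63, "ESXi": 45.52, "bare metal": 50.80}


def relative_to_baseline(results, baseline="bare metal"):
    """Return each result as a percentage of the baseline, rounded to 0.1."""
    base = results[baseline]
    return {name: round(100.0 * fps / base, 1) for name, fps in results.items()}


if __name__ == "__main__":
    for name, pct in relative_to_baseline(MIN_FPS).items():
        print(f"{name}: {pct}% of bare metal")
```

With these inputs the Unraid VM lands at about 66% of bare metal and the ESXi VM at about 90%.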

 

Outside the benchmark, I have a few spots on the GTA5 map where I measure fps, and there the VM loses 15 to 20 fps compared to bare metal. This is unsatisfying.

 

Do you guys have any other suggestions or tips on how to improve the performance of the VM? I read about lstopo, but I think this does not apply to the 3900X. Or am I wrong?

gamingvm.xml


Try running the performance governor while gaming to see if it helps. Use the Tips and Tweaks plugin to set it.


Thank you for the quick reply. But what do you mean? The governor is already set to "performance".
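
One way to double-check which governor the host is really using is to read it straight from sysfs. The sketch below assumes a Linux host that exposes the standard cpufreq files under `/sys/devices/system/cpu`; the helper name `read_governors` is hypothetical, and some systems (for example inside containers) expose no cpufreq data at all.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its current scaling governor."""
    governors = {}
    for gov_file in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        cpu_name = gov_file.parent.parent.name
        try:
            governors[cpu_name] = gov_file.read_text().strip()
        except OSError:
            continue  # e.g. CPU offline or file not readable
    return governors


if __name__ == "__main__":
    found = read_governors()
    if not found:
        print("no cpufreq information exposed on this system")
    # Sort numerically by the digits after the 'cpu' prefix.
    for cpu, gov in sorted(found.items(), key=lambda kv: int(kv[0][3:])):
        print(cpu, gov)
```

If every line prints `performance`, the setting is active on all logical CPUs and the governor can be ruled out as the cause.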

 

I'm running the benchmark right now and I see that the heaviest load is on HT cores 19 and 20. Wouldn't it be better if the load were on the physical cores? How can I achieve that? Does Process Lasso work in a VM?


We found yesterday that there are some issues with 3rd-gen Ryzen and caching. There isn't a way to fix it 100%.

 

Technically your cache should be fixed with the config below, but I doubt it will be. If it works you should see a boost in performance, but a fix is still needed in virtio/QEMU.

<cpu mode='host-passthrough' check='none'>
   <topology sockets='1' cores='8' threads='2'/>
   <cache mode='passthrough'/>
   <feature policy='require' name='topoext'/>
   <feature policy='disable' name='monitor'/>
   <feature policy='require' name='hypervisor'/>
   <feature policy='disable' name='svm'/>
   <feature policy='disable' name='x2apic'/>
</cpu>  

 


Thank you, but as you expected, it didn't help.

 

So I guess I have to wait until a QEMU update is released?


I have issues (stutters) with my gaming VM on my Ryzen 3800X. I've almost given up. I thought 6 months was enough for the software to catch up with the hardware, but I was wrong. I still have one last-ditch effort left, moving to the v5 Linux kernel on the last batch of RCs, but after that, I'm surrendering.

 

Anyway, I looked at your XML and I have a couple of recommendations (though I don't think they're going to help... still, maybe better in general). I expect you did all the normal things, so I'm only going to comment on what's there.

 

I see you're pinning the emulator to your lowest core. I wouldn't do that. In my testing, even an idling VM uses a fair amount of CPU for the emulator, and it can spike under load. You really don't want the scheduler working around the host tasks that usually run on the lowest core. Also, the emulator process is a single thread. I have been pinning the emulator to an HT pair (i.e. HT coreA, HT coreB) too, but just last night I monitored my emulator process and saw it hopping between both HTs while only one was active at any given time. That's probably OK, but maybe not the best for a long-running process where you want to minimize latency. Try keeping it to one vcpu. I saw no noticeable difference in either case, but since the process stays alive the whole time the VM is on, there's no reason to have the scheduler move it around. IMO, let the scheduler work around the emulator process.
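
The emulator-pinning advice above can be turned into a quick self-check of a domain XML. This is a rough sketch, not a libvirt tool: `check_emulatorpin` and `parse_cpuset` are hypothetical helpers, the warnings simply mirror the points made in this post, and the `^` exclusion syntax that libvirt cpusets allow is deliberately not handled.

```python
import xml.etree.ElementTree as ET


def parse_cpuset(spec):
    """Expand a cpuset string such as '4,16' or '0-2,12' into a set of ints.

    Assumption: only comma lists and ranges are used (no '^' exclusions).
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        elif part:
            cpus.add(int(part))
    return cpus


def check_emulatorpin(domain_xml):
    """Return a list of warnings about the <emulatorpin> in a domain XML string."""
    root = ET.fromstring(domain_xml)
    cputune = root.find("cputune")
    if cputune is None or cputune.find("emulatorpin") is None:
        return ["no <emulatorpin> set; the emulator thread may run anywhere"]
    emu = parse_cpuset(cputune.find("emulatorpin").get("cpuset", ""))
    vcpus = set()
    for pin in cputune.findall("vcpupin"):
        vcpus |= parse_cpuset(pin.get("cpuset", ""))
    warnings = []
    if 0 in emu:
        warnings.append("emulator is pinned to CPU 0, the lowest core")
    if emu & vcpus:
        warnings.append(f"emulator shares CPUs with vCPUs: {sorted(emu & vcpus)}")
    if len(emu) > 1:
        warnings.append(f"emulator may hop between CPUs: {sorted(emu)}")
    return warnings
```

Feeding it the text of a VM's XML returns an empty list when the emulator sits on a single CPU that is neither CPU 0 nor one of the pinned vCPU threads.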

 

I would try shifting all your cores to the highest cores on your CPU and leave the lowest for the host. Also, maybe limit your VM cores to ones on the same CCX so they don't have to talk over the Infinity Fabric. I tried to find an lstopo of the 3900X on Google but I couldn't find one. Maybe post yours here?
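
If lstopo is not at hand, the same grouping can be read from sysfs, since the logical CPUs that share an L3 cache correspond to one CCX on this CPU generation. A minimal sketch, assuming a Linux host that exposes the usual `cache/index*/level` and `shared_cpu_list` files; `l3_groups` is a hypothetical helper name.

```python
from pathlib import Path


def l3_groups(cpu_root="/sys/devices/system/cpu"):
    """Return the distinct lists of logical CPUs that share an L3 cache."""
    groups = set()
    for index_dir in Path(cpu_root).glob("cpu[0-9]*/cache/index[0-9]*"):
        try:
            # Only level-3 caches are of interest here.
            if (index_dir / "level").read_text().strip() != "3":
                continue
            groups.add((index_dir / "shared_cpu_list").read_text().strip())
        except OSError:
            continue  # cache information not exposed for this CPU
    return sorted(groups)


if __name__ == "__main__":
    for group in l3_groups():
        print("shares one L3:", group)
```

Each printed line is one candidate set of host CPUs to keep a VM inside. On a 3900X one would expect four such groups, but that is worth confirming against your own output rather than taking from this sketch.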

 

Unrelated to the CPU, you are passing in your GPU and HDMI sound as 2 separate addresses. SpaceInvader One talked about it in his Advanced GPU passthrough YouTube video. Probably won't change anything, but take a look at that video and try passing the GPU as a multifunction device.  Also, are you stubbing your GPU?

 

@Skitals has a 3900X. Maybe he has some more recommendations for you.

 

-JesterEE

Edited by JesterEE

1 hour ago, JesterEE said:

I thought 6 months was enough for the software to catch up with the hardware, but I was wrong.

[...]

@Skitals has a 3900X. Maybe he has some more recommendations for you.

 

-JesterEE

Unraid 6.8(.1) Stable uses a Linux 4.19 kernel. The 4.19 kernel was released in 2018. My recommendation is to stop using old software with new hardware.
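
To see which kernel a given Unraid build is actually running, the release string can be parsed and compared. A small sketch; the helper name is hypothetical and the sample string in the usage note is a made-up example of the usual format, not a value taken from this thread.

```python
import platform
import re


def kernel_major_minor(release):
    """Extract (major, minor) from a kernel release string like '4.19.88-Unraid'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    running = platform.release()  # same string `uname -r` prints on Linux
    try:
        print(running, "->", kernel_major_minor(running))
    except ValueError as err:
        print(err)
```

Tuples compare element by element, so `kernel_major_minor("4.19.88-Unraid") >= (5, 0)` is false for a 4.19 kernel and true for any 5.x release.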


@JesterEE thanks a lot for the information. I implemented your hints, but also without success. Multi-thread performance is great; single-thread performance still sucks.

 

Here is the output of lstopo:

topo.png.1f5c1c4cf069a447b4df9b48f83a66a9.png

 

 

@Skitals How do I use an up-to-date kernel with Unraid? How do I install a new kernel?

 

Edited by SignorRossi

1 hour ago, SignorRossi said:

@JesterEE thanks a lot for the information. I implemented your hints, but also without success. Multi-thread performance is great; single-thread performance still sucks.

 

Here is the output of lstopo:

topo.png.1f5c1c4cf069a447b4df9b48f83a66a9.png

 

 

@Skitals How do I use an up-to-date kernel with Unraid? How do I install a new kernel?

 

There are certain versions you will need to roll back to, and there is actually a specialized kernel as well. But hopefully Limetech will release 6.9rc1 and it will also do this.


Found this the other day. This might be related, it might not be. Take a look:

 

 


This is interesting! Right now I am trying to understand the libvirt documentation, there is something called "cachetune"...

 

Edit: cachetune is not supported 😕

Edited by SignorRossi


I haven't tested GTA5 (I'm installing it now to try), but I haven't had any performance issues in any games. I attached my redacted XML if you want to take a look; there is nothing special. I do not use CPU isolation. The CPU pinning is the last 10 core+HT pairs. I do not pin anything to the first 2 cores. I am using the default CPU frequency governor. The only BIOS changes from factory settings are enabling virtualization and setting the XMP profile for the RAM.

good_vm.txt

1 hour ago, SignorRossi said:

Thank you @Skitals! I copied your CPU config but still no success; single-thread performance is still low 😞

 

Which results do you get with CPU-Z 1.91?

There is more to it than the CPU settings. The machine type, which devices you are emulating, and which drivers you are using could all be factors: are you using a vdisk, how many devices use virtio, which USB drivers are you using, etc. Also your motherboard BIOS, which dictates your CPU microcode (AGESA). Edit: also which version of Unraid (including the kernel), which version of QEMU, which version of virtio, etc. For reference, I am using Unraid 6.8.0-rc5, AMD AGESA 1.0.0.4 B, and virtio drivers from virtio-win-0.1.160-1.iso. It's possible you have one device or one driver that is a bad actor.

 

I will mess around with GTA5 and CPU-Z this weekend and let you know what I find.

Edited by Skitals


I am currently using Unraid 6.8.0-rc7, the newest motherboard BIOS with AGESA 1.0.0.4B, and virtio drivers from virtio-win-0.1.171.iso. As SSD I am using a partition with SATA.

 

I looked at your XML, @Skitals. You are using so many devices that I don't think I can use it for my system.

 

I really appreciate your help! Thanks a lot!

Edited by SignorRossi


I think I solved the issue.

 

Steps I have taken:

Disabling Spectre, Meltdown and Zombieload (MDS) mitigations both in Unraid and in the Win 10 VM (InSpectre).

Set CPU Scaling Governor in Unraid to "on demand".

Isolate Cores 0, 12 for Unraid.

CPU Settings:

  <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='16'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <vcpupin vcpu='3' cpuset='17'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='18'/>
    <vcpupin vcpu='6' cpuset='7'/>
    <vcpupin vcpu='7' cpuset='19'/>
    <vcpupin vcpu='8' cpuset='8'/>
    <vcpupin vcpu='9' cpuset='20'/>
    <vcpupin vcpu='10' cpuset='9'/>
    <vcpupin vcpu='11' cpuset='21'/>
    <vcpupin vcpu='12' cpuset='10'/>
    <vcpupin vcpu='13' cpuset='22'/>
    <vcpupin vcpu='14' cpuset='11'/>
    <vcpupin vcpu='15' cpuset='23'/>
    <emulatorpin cpuset='1'/>
  </cputune>

Activate Hyper-V (which was disabled because of Error 43)

    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vpindex state='on'/>
      <synic state='on'/>
      <stimer state='on'/>
      <vendor_id state='on' value='123456789ab'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>

More CPU Settings:

  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='8' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='svm'/>
    <feature policy='disable' name='x2apic'/>
  </cpu>

In the VM I disabled UPnP.

In the VM I installed Process Lasso and set the power plan to "Bitsum Highest Performance".

In Process Lasso I optimized the settings for the GTA5 task (disable SMT etc.)
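
The pinning in the steps above follows a regular pattern: each guest vCPU pair maps to a host core and its SMT sibling, which in this numbering is the core number plus 12. A small generator for that pattern might look like the sketch below. The function name and parameters are hypothetical, and the offset of 12 is read off this particular 3900X layout, so it should be checked against lstopo before reuse elsewhere.

```python
def vcpupin_lines(first_core, last_core, smt_offset, emulator_cpu=None):
    """Build <cputune> XML pinning vCPU pairs to each core and its SMT sibling."""
    lines = ["<cputune>"]
    vcpu = 0
    for core in range(first_core, last_core + 1):
        # Even vCPU goes to the core, odd vCPU to its sibling thread.
        for host_cpu in (core, core + smt_offset):
            lines.append(f"  <vcpupin vcpu='{vcpu}' cpuset='{host_cpu}'/>")
            vcpu += 1
    if emulator_cpu is not None:
        lines.append(f"  <emulatorpin cpuset='{emulator_cpu}'/>")
    lines.append("</cputune>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Cores 4-11 with siblings 16-23 and the emulator on CPU 1, as in the post.
    print(vcpupin_lines(4, 11, smt_offset=12, emulator_cpu=1))
```

Changing the core range is then enough to try the 6C/12T or 9C/18T variants mentioned in the post without hand-editing every line.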

 

Et voilà: more than 46 min fps in the benchmark run, reproducible even after a reboot. And it doesn't matter whether the VM has 6C/12T, 8C/16T or 9C/18T; the performance in GTA5 is roughly the same. If I set the CPU Scaling Governor in Unraid to "performance" I gain 2 min fps on average, which is negligible, so I'd rather save some energy.

 

Thanks to everybody, your help is really appreciated! If you have any suggestions on how to improve things further, just let me know!

Edited by SignorRossi
6 hours ago, SignorRossi said:

Et voilà: more than 46 min fps in the benchmark run, reproducible even after a reboot. And it doesn't matter whether the VM has 6C/12T, 8C/16T or 9C/18T; the performance in GTA5 is roughly the same. If I set the CPU Scaling Governor in Unraid to "performance" I gain 2 min fps on average, which is negligible, so I'd rather save some energy.

 

Thanks to everybody, your help is really appreciated! If you have any suggestions on how to improve things further, just let me know!

Thanks for the update. I was just going to post an update and eat some crow, because my single-core benchmarks are similar to yours:

 

baremetal-vs-kvm.thumb.png.7580c70f8f9427907cdc8061b42b07e4.png

 

I think 94% of bare metal performance is about what is expected, and that's the same as you posted in your original post. I'm very curious: did your changes improve your CPU-Z benchmark figure?
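
As a sanity check on the 94% figure, the CPU-Z scores in the first post are only given as "51x" and "55x", so the ratio can only be bounded. The sketch below assumes those mean scores somewhere in 510 to 519 and 550 to 559; that reading of the "x" is an assumption, and the helper name is made up.

```python
def ratio_bounds(vm_low, vm_high, bare_low, bare_high):
    """Lowest and highest possible VM / bare-metal ratio, in percent."""
    lowest = round(100.0 * vm_low / bare_high, 1)
    highest = round(100.0 * vm_high / bare_low, 1)
    return lowest, highest


if __name__ == "__main__":
    # Assumed ranges for the masked scores "51x" (VM) and "55x" (bare metal).
    print(ratio_bounds(510, 519, 550, 559))
```

Under that assumption the Unraid VM sits between roughly 91% and 94% of bare metal in CPU-Z, so 94% is at the upper end of what the original numbers allow.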

 

I did play some GTA5 and didn't see any slowdowns (playing at >120 fps at 1440p on a 144 Hz FreeSync monitor). The game is reportedly notoriously CPU-bound. I tried doing some benchmarks, but the results are ALL over the place, especially for min fps. Some runs I would get 9 min fps... on bare metal. That doesn't show on screen or in gameplay; I think it's literally the first few frames when a pass first starts. For my VM I am passing 10c/20t, so that might be giving me the edge.
