Ryzen 1920X Win10 strange behaviour


Nooke

Hi,

 

I'm seeing serious differences in L3 cache performance in my Win10 VM depending on which cores are assigned to it.

 

There must be something fishy going on...

 

My current kernel options:

  append processor.max_cstate=1 nvme_core.default_ps_max_latency_us=0 kvm_amd npt=1 nested=1 amd_iommu=on isolcpus=6-11,18-23 nohz_full=6-11,18-23 rcu_nocbs=6-11,18-23 pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1 initrd=/bzroot

 

 

[Attached image: AIDA64 benchmark results]

 

Here is my current XML for 4 cores / 8 threads.

The only differences for 5c/10t and 6c/12t would be the core count and the emulatorpin setting.

 

win10vm.xml

 

I'm using Unraid 6.8.0-rc3

and the latest Windows 10 1903.

 

As I'm still having issues with my system, I'd appreciate any support :)

 

cheers

Nooke

 

1 hour ago, jordanmw said:

Check this out:

 

 

 

Thanks for sharing, but I don't have a core pairing issue here.

I'm only using cores from a single NUMA node (e.g. node 1, cores 6-11,18-23).

NUMA node 1 is connected to my GPU as well as my NVMe SSD (see lstopo attached).

[Attached image: lstopo output]

 

So my problem is that, even with all cores taken from a single node, I get badly skewed L3 cache results in AIDA64 depending on how the cores are assigned to the Win10 VM.


Each core has its own L1 / L2 so it scales pretty well.

In contrast, L3 is shared between groups of cores. As you can see in your lstopo output, the core groups 6-8 and 9-11 each have their own L3 cache.

So depending on how the cores are assigned (e.g. 6-9 is different from 6-7+9-10 despite both being 4 cores) AND the exact circumstances of the test run, your test results will differ.
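To make the 6-9 versus 6-7+9-10 example concrete, here is a minimal Python sketch that counts how many pinned CPUs land in each L3 group. It is only an illustration: the group names `L3-A`/`L3-B` and the helper functions are hypothetical, the grouping (6-8 and 9-11) is taken from the description above, and the SMT sibling of core N is assumed to be N + 12, which is what the `6-11,18-23` lists in this thread suggest.

```python
from collections import Counter

# Assumption from this thread: host cores 6-8 share one L3 cache and
# cores 9-11 share another. Group names are made up for illustration.
L3_GROUPS = {
    "L3-A": {6, 7, 8},
    "L3-B": {9, 10, 11},
}
# Assumption: the SMT sibling of core N is CPU N + 12 (6-11 pairs with 18-23).
SMT_OFFSET = 12


def parse_cpu_list(spec):
    """Expand a kernel-style CPU list such as '6-7,9-10' into [6, 7, 9, 10]."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def l3_spread(spec):
    """Count how many of the listed host CPUs fall into each L3 group."""
    counts = Counter()
    for cpu in parse_cpu_list(spec):
        # Map an SMT sibling back to its physical core before the lookup.
        core = cpu - SMT_OFFSET if cpu >= SMT_OFFSET else cpu
        for name, members in L3_GROUPS.items():
            if core in members:
                counts[name] += 1
    return dict(counts)


if __name__ == "__main__":
    print(l3_spread("6-9"))       # {'L3-A': 3, 'L3-B': 1}
    print(l3_spread("6-7,9-10"))  # {'L3-A': 2, 'L3-B': 2}
```

Both assignments use 4 cores, but the first is split 3+1 across the two L3 caches and the second 2+2, so a cache benchmark can reasonably behave differently between them.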

 

What issues do you have on your system that would lead you to think it's related to L3 performance?


I had some serious FPS drops and stutter in Win10 (not only in games).

The VM has felt sluggish since the last Win10 update.

 

Anyway, I more or less fixed it.

The L3 cache benchmarks look fine now (6 cores / 12 threads):

L3 read: 388 GB/s

L3 write: 300 GB/s

L3 copy: 367 GB/s

L3 latency: 10.6 ns

 

That's more like the results I expected in the first place.

 

If anyone else has issues like this, I can share my findings.

Edited by Nooke

Pretty simple: I just changed the CPU topology.

 

Previously I used the following:

  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='6' threads='2'/>
    <cache mode='passthrough'/>

 

and now I'm using this instead:

 

  <cpu mode='host-passthrough' check='none'>
    <topology sockets='2' cores='3' threads='2'/>
    <cache mode='passthrough'/>

 

That made my L3 cache benchmark results skyrocket.
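One possible reading of why this helps (not confirmed anywhere in this thread): with `sockets='1'` the guest treats all 12 vCPUs as one package, while `sockets='2' cores='3'` mirrors the two 3-core L3 groups on the host. The Python sketch below only illustrates how the two topologies partition the same 12 vCPUs. It assumes vCPUs are enumerated thread-first, then core, then socket, and that each guest socket is presented with its own L3; the function names are hypothetical.

```python
def guest_topology(sockets, cores, threads):
    """Return a (socket, core, thread) tuple for each vCPU index.

    Assumption: threads vary fastest, then cores, then sockets.
    """
    layout = []
    for vcpu in range(sockets * cores * threads):
        thread = vcpu % threads
        core = (vcpu // threads) % cores
        socket = vcpu // (threads * cores)
        layout.append((socket, core, thread))
    return layout


def vcpus_per_socket(sockets, cores, threads):
    """Group vCPU indices by guest socket (treated here as one L3 domain)."""
    domains = {}
    for vcpu, (socket, _core, _thread) in enumerate(
        guest_topology(sockets, cores, threads)
    ):
        domains.setdefault(socket, []).append(vcpu)
    return domains


if __name__ == "__main__":
    # Old topology: one package containing all 12 vCPUs.
    print(vcpus_per_socket(1, 6, 2))
    # New topology: two packages with 6 vCPUs (3 cores x 2 threads) each.
    print(vcpus_per_socket(2, 3, 2))
```

Both topologies expose 12 vCPUs. For the split to line up with the host, the vCPU pinning would also need to put vCPUs 0-5 on one host L3 group and 6-11 on the other, which depends on the `cputune` section of the XML.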

 

[Attached image: AIDA64 results, 1920X, 6c/12t, 2 sockets]

 

The overall smoothness of the Windows 10 VM has improved as well.

I'm now getting about the highest FPS I've ever had in CS:GO and World of Warcraft (and it's stable!).

 
