Ryzen/Threadripper PSA: Core Numberings and Assignments



18 minutes ago, bastl said:

The numatune setting only works if you set the RAM mode in your BIOS to channel. On all TR4 boards I've seen so far the default setting is auto and the CPU is presented to the OS as a single node. You might have to check your board's manual for how to populate the RAM slots correctly with 2 DIMMs. You're the first one I've seen here in the forum with only 2 DIMMs on the TR4 platform.

I changed numatune to 0-1 and I have better r/w/c memory speeds now. Latency is not too bad. I'll stick with this setting for now and see how I get on. Defo will upgrade to 64GB if and when memory prices come down a bit.

Link to comment
3 hours ago, mikeyosm said:

I changed numatune to 0-1 and I have better r/w/c memory speeds now. Latency is not too bad. I'll stick with this setting for now and see how I get on. Defo will upgrade to 64GB if and when memory prices come down a bit.

Even better mem stats now, getting close to bare metal 🙂

 

W10 XML:

 

<cputune>
    <vcpupin vcpu='0' cpuset='10'/>
    <vcpupin vcpu='1' cpuset='26'/>
    <vcpupin vcpu='2' cpuset='11'/>
    <vcpupin vcpu='3' cpuset='27'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='28'/>
    <vcpupin vcpu='6' cpuset='13'/>
    <vcpupin vcpu='7' cpuset='29'/>
    <vcpupin vcpu='8' cpuset='14'/>
    <vcpupin vcpu='9' cpuset='30'/>
    <vcpupin vcpu='10' cpuset='15'/>
    <vcpupin vcpu='11' cpuset='31'/>
    <emulatorpin cpuset='0,16'/>
  </cputune>
  <numatune>
    <memory mode='interleave' nodeset='0-1'/>
  </numatune>
  <resource>

 

  <cpu mode='custom' match='exact' check='partial'>
    <model fallback='allow'>EPYC-IBPB</model>
    <topology sockets='1' cores='6' threads='2'/>
    <feature policy='require' name='topoext'/>
  </cpu>

 

Clipboard Image.jpg

Link to comment
5 hours ago, mikeyosm said:

What did the cache look like before you created a new xml? And how does it look now? A comparison would be good so I can troubleshoot my performance issues. Thanks.

It was similar to Jerky_san's, showing 2/2/16/16-way instead of the correct 8/4/8/16-way. It does seem to have smoothed out the VM a bit. World of Warcraft now feels more like it does on bare metal. Just as a warning though, I did have to re-activate my Windows 10 licenses.

Link to comment
19 minutes ago, TType85 said:

It was similar to Jerky_san's, showing 2/2/16/16-way instead of the correct 8/4/8/16-way. It does seem to have smoothed out the VM a bit. World of Warcraft now feels more like it does on bare metal. Just as a warning though, I did have to re-activate my Windows 10 licenses.

OK, mine reports back correctly as 8/4/8/16 and I didn't have to re-activate.

Link to comment
2 minutes ago, jordanmw said:

Even with the BIOS set to channel and NUMA all set up within the VM, I am still getting memory used from both nodes. 20ish MB? Not sure what is going on there. I can decrease the RAM to account for the 20 MB but it still grabs the same amount:

numa.PNG

There are ways to do this, but if I'm being 100% honest I highly suggest avoiding it. Unraid is made to deal with Xeons/Intel more than NUMA and Threadripper. If you use the settings that get this basically perfect every time and then change the CPU count, the VM simply disappears: Unraid alters the CPU part of the config but doesn't update the NUMA settings (honestly I doubt it can, as it doesn't have numatune), so the config becomes invalid and is dropped. I lost two damn XML configs because of that. That's why my current gaming machine is not activated: I rebuilt from scratch, am still tweaking, and don't want to waste the time getting it activated again.

Link to comment
4 hours ago, mikeyosm said:

Even better mem stats now, getting close to bare metal 🙂

 

W10 XML:

 

<cputune>
    <vcpupin vcpu='0' cpuset='10'/>
    <vcpupin vcpu='1' cpuset='26'/>
    <vcpupin vcpu='2' cpuset='11'/>
    <vcpupin vcpu='3' cpuset='27'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='28'/>
    <vcpupin vcpu='6' cpuset='13'/>
    <vcpupin vcpu='7' cpuset='29'/>
    <vcpupin vcpu='8' cpuset='14'/>
    <vcpupin vcpu='9' cpuset='30'/>
    <vcpupin vcpu='10' cpuset='15'/>
    <vcpupin vcpu='11' cpuset='31'/>
    <emulatorpin cpuset='0,16'/>
  </cputune>
  <numatune>
    <memory mode='interleave' nodeset='0-1'/>
  </numatune>
  <resource>

 

  <cpu mode='custom' match='exact' check='partial'>
    <model fallback='allow'>EPYC-IBPB</model>
    <topology sockets='1' cores='6' threads='2'/>
    <feature policy='require' name='topoext'/>
  </cpu>

 

Clipboard Image.jpg

Spoke too soon. After using the VM for a few hours, it became very unresponsive and sluggish to the point it hard rebooted itself, grrr.

Checked Unraid memory usage and I had none left. Back to square one.

Link to comment
On 11/27/2018 at 3:44 PM, TType85 said:

It was similar to Jerky_san's, showing 2/2/16/16-way instead of the correct 8/4/8/16-way. It does seem to have smoothed out the VM a bit. World of Warcraft now feels more like it does on bare metal. Just as a warning though, I did have to re-activate my Windows 10 licenses.

Did you find out what caused the problem within the xml? (comparing the old and the new one)

 

Update: I looked for differences from my VM to a new Windows 10 and changed

<type arch='x86_64' machine='pc-i440fx-2.10'>hvm</type>

into

<type arch='x86_64' machine='pc-i440fx-3.0'>hvm</type>

Afterwards the cache showed up correctly in CPU-Z...

I have no clue what this setting is for or why the cache readout changed afterwards, but maybe this was just related to the restart of the VM and has nothing to do with it at all 🤣

 

 

Edited by Symon
Link to comment
55 minutes ago, mikeyosm said:

Spoke too soon. After using the VM for a few hours, it became very unresponsive and sluggish to the point it hard rebooted itself, grrr.

Checked Unraid memory usage and I had none left. Back to square one.

I'll just say that if you're splitting between NUMA nodes with the EPYC stuff, that will basically not work. QEMU doesn't provide the Windows machine with the NUMA information it needs to function properly.

Link to comment
1 hour ago, Symon said:

Did you find out what caused the problem within the xml? (comparing the old and the new one)

 

Update: I looked for differences from my VM to a new Windows 10 and changed


<type arch='x86_64' machine='pc-i440fx-2.10'>hvm</type>

into


<type arch='x86_64' machine='pc-i440fx-3.0'>hvm</type>

Afterwards the cache showed up correctly in CPU-Z...

I have no clue what this setting is for or why the cache readout changed afterwards, but maybe this was just related to the restart of the VM and has nothing to do with it at all 🤣

I think you got it there.  Mine was 2.11 now 3.0. That was the only change I could find.

Link to comment
2 hours ago, Jerky_san said:

I'll just say that if you're splitting between NUMA nodes with the EPYC stuff, that will basically not work. QEMU doesn't provide the Windows machine with the NUMA information it needs to function properly.

I tried strict on node 1 (my GPU is on that node) and, apart from the VM being very slow to power on, the performance increase was negligible.

I think waiting for a patched kernel will be a safer bet, I'm fresh out of ideas on how I can get close to bare metal memory performance.
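
For reference, the "strict node 1" attempt described above would look roughly like the fragment below. This is only a sketch: it assumes the cputune pinning from the earlier W10 XML is unchanged (host threads 10-15/26-31) and that those threads sit on node 1, as a later post in this thread maps them for a 16-core Threadripper. With strict, the guest RAM has to come from that node alone, so free memory on node 1 is worth checking before booting.

```xml
<numatune>
  <!-- strict: guest RAM may only be allocated from host node 1 -->
  <memory mode='strict' nodeset='1'/>
</numatune>
```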

Link to comment
1 hour ago, mikeyosm said:

I tried strict on node 1 (my GPU is on that node) and, apart from the VM being very slow to power on, the performance increase was negligible.

I think waiting for a patched kernel will be a safer bet, I'm fresh out of ideas on how I can get close to bare metal memory performance.

Looking at your XML, your CPU pairings are very strange. Forgot you're running a 2950X.

Edited by Jerky_san
Link to comment
  • 4 weeks later...

 

I'm getting closer to bare metal. RAM is past bare metal now and the L1-L3 latency is lower than bare metal. The speed is lower, I believe, because I only have 2 NUMA nodes involved; it would be faster if I involved all the NUMA nodes, including the ones that don't have memory controller access.

Physical

physical.PNG.0bfc8bb2bbe9a12376ec0eb48ff12cd5.PNG

Virtual

 

image.thumb.png.4fdc39cdf58d95d5a672b21fcd005a22.png

Old virtual

Performance.PNG.6a2c8cc53019cfbe2a913d615c826564.PNG

Edited by Jerky_san
Link to comment

More Tweaking with more cores across more numa

 

image.thumb.png.36e756b3204ce04a4bcb03fbb1931e06.png

 

Below is what I settled on, along with the XML. I actually got much faster L1-L3 read/writes, but I think I missed something as L3 cache slowed. I don't know where I messed up but I'll probably experiment more tomorrow. I don't really need all the cores though, so in a way I'm satisfied with what I got. Remember I am on a 2990WX, so adjust accordingly. I am thinking about doing videos like SpaceInvader but I am not great at speaking #_#;;

image.thumb.png.fc11ac8490e9306634f35e47f95c74e7.png

 

 

 <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='36'/>
    <vcpupin vcpu='2' cpuset='5'/>
    <vcpupin vcpu='3' cpuset='37'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='38'/>
    <vcpupin vcpu='6' cpuset='7'/>
    <vcpupin vcpu='7' cpuset='39'/>
    <vcpupin vcpu='8' cpuset='8'/>
    <vcpupin vcpu='9' cpuset='40'/>
    <vcpupin vcpu='10' cpuset='9'/>
    <vcpupin vcpu='11' cpuset='41'/>
    <vcpupin vcpu='12' cpuset='10'/>
    <vcpupin vcpu='13' cpuset='42'/>
    <vcpupin vcpu='14' cpuset='11'/>
    <vcpupin vcpu='15' cpuset='43'/>
    <emulatorpin cpuset='4-11'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0,2'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-3.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/3b8790bc-59c0-ff66-e9b9-c3c716abc8b5_VARS-pure-efi.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <synic state='on'/>
      <stimer state='on'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='custom' match='exact' check='partial'>
    <model fallback='allow'>EPYC-IBPB</model>
    <topology sockets='1' cores='8' threads='2'/>
    <feature policy='require' name='topoext'/>
    <numa>
      <cell id='0' cpus='0-7' memory='16777216' unit='KiB'/>
      <cell id='1' cpus='8-15' memory='16777216' unit='KiB'/>
    </numa>
  </cpu>

 

 

Why I decided to stop is below. It appears you get diminishing returns the more cores you add after a certain point. RAM also gets slower instead of faster.

24 physical cores

image.thumb.png.a80c6da28ed5175af465ae7684dff8d8.png

50 cores breaks L3 cache. Another thing I thought is that maybe, since I am using the EPYC workaround, it's causing things to break as well. Also ignore the last image; I don't know why it won't remove.

image.thumb.png.a0a72774d44f2e638b2f83050485f988.png

image.png

Edited by Jerky_san
Trying to delete rogue picture
Link to comment
  • thenonsense changed the title to Ryzen/Threadripper PSA: Core Numberings and Assignments
  • 3 months later...

I have been tweaking my VM settings and got CPU and memory speed decent, along with low latency for L1-L3, but the cache speeds are very low. Any suggestions?

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm'>
  <name>Windows 10 test</name>
  <uuid>924dacca-7dd4-bc22-0bb7-9280368c9603</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>6291456</memory>
  <currentMemory unit='KiB'>6291456</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='18'/>
    <vcpupin vcpu='1' cpuset='42'/>
    <vcpupin vcpu='2' cpuset='19'/>
    <vcpupin vcpu='3' cpuset='43'/>
    <vcpupin vcpu='4' cpuset='20'/>
    <vcpupin vcpu='5' cpuset='44'/>
    <vcpupin vcpu='6' cpuset='21'/>
    <vcpupin vcpu='7' cpuset='45'/>
    <emulatorpin cpuset='18-21'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-3.1'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/924dacca-7dd4-bc22-0bb7-9280368c9603_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='custom' match='exact' check='partial'>
    <model fallback='allow'>EPYC-IBPB</model>
    <topology sockets='1' cores='4' threads='2'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/disks/Samsung_SSD_840_PRO_Series_S12PNEAD275552V/VM/Windows 10 test/vdisk1.img'/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </disk>
   
   
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </memballoon>
  </devices>
</domain>
 

 

Annotation 2019-09-03 101009.jpg

Link to comment
15 hours ago, chron said:

<cputune>
    <vcpupin vcpu='0' cpuset='18'/>
    <vcpupin vcpu='1' cpuset='42'/>
    <vcpupin vcpu='2' cpuset='19'/>
    <vcpupin vcpu='3' cpuset='43'/>
    <vcpupin vcpu='4' cpuset='20'/>
    <vcpupin vcpu='5' cpuset='44'/>
    <vcpupin vcpu='6' cpuset='21'/>
    <vcpupin vcpu='7' cpuset='45'/>
    <emulatorpin cpuset='18-21'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>

Are you sure the cores you're using are on node 0? You have to set the correct node in numatune. The latency and speeds shown indicate that you're using the wrong node. With "strict" you're limiting the VM to RAM from one specific node only, in your case node 0. The cores have to be from the same node to get the best performance.

 

example from my setup:

  <cputune>
    <vcpupin vcpu='0' cpuset='9'/>
    <vcpupin vcpu='1' cpuset='25'/>
    <vcpupin vcpu='2' cpuset='10'/>
    <vcpupin vcpu='3' cpuset='26'/>
    <vcpupin vcpu='4' cpuset='11'/>
    <vcpupin vcpu='5' cpuset='27'/>
    <vcpupin vcpu='6' cpuset='12'/>
    <vcpupin vcpu='7' cpuset='28'/>
    <vcpupin vcpu='8' cpuset='13'/>
    <vcpupin vcpu='9' cpuset='29'/>
    <vcpupin vcpu='10' cpuset='14'/>
    <vcpupin vcpu='11' cpuset='30'/>
    <vcpupin vcpu='12' cpuset='15'/>
    <vcpupin vcpu='13' cpuset='31'/>
    <emulatorpin cpuset='8,24'/>
    <iothreadpin iothread='1' cpuset='8,24'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='1'/>
  </numatune>

grafik.thumb.png.e3a4283cf5fcb4840e15d464fe76a63a.png

 

node0 = cores 0-7/16-23

node1 = cores 8-15/24-31
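
As a rough side-by-side of the numatune modes that come up in this thread, here is a small sketch. The node numbers are just the ones from the 16-core layout above and are assumptions for any other system; only one memory line can be active at a time, so the alternatives are left as comments.

```xml
<numatune>
  <!-- strict: allocate guest RAM only from the listed host node(s) -->
  <memory mode='strict' nodeset='1'/>
  <!-- alternatives to the line above:
       mode='interleave' nodeset='0-1'  spreads pages across both nodes
       mode='preferred'  nodeset='1'    tries node 1 first, may fall back to others -->
</numatune>
```

Whichever mode is used, the vcpupin entries should point at host threads on the same node(s) named in nodeset, otherwise the memory accesses cross the fabric.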

Link to comment

I thought I needed to choose the node the RAM is connected to. Here is my CPU layout.

 

numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 24 25 26 27 28 29
node 0 size: 32123 MB
node 0 free: 1897 MB
node 1 cpus: 12 13 14 15 16 17 36 37 38 39 40 41
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus: 6 7 8 9 10 11 30 31 32 33 34 35
node 2 size: 32227 MB
node 2 free: 406 MB
node 3 cpus: 18 19 20 21 22 23 42 43 44 45 46 47
node 3 size: 0 MB
node 3 free: 0 MB
node distances:
node   0   1   2   3 
  0:  10  16  16  16 
  1:  16  10  16  16 
  2:  16  16  10  16 
  3:  16  16  16  10

 

I couldn't use node 0 or 2 for the VM because, I assume, the PCIe slot my GPU is in isn't connected to those: the VM wouldn't see the card if I had any other cores selected.

Link to comment

@chron You will see the best performance by choosing cores and RAM from the same node and not mixing them. The preferred nodes in your case are node 0 or node 2. Limit your VM to only one node and its RAM, and isolate the cores from everything else like you see in my picture above. The cores used by my VM are isolated and only used by this specific VM. Don't use cores from nodes 1 and 3 for VMs. It will work, sure, but you add an extra hop the data has to travel to get to the RAM, which adds latency and also reduces the speed slightly.
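
Applied to the numactl output above, that advice might look something like the fragment below for an 8-vCPU VM on node 2. Treat it as a sketch under assumptions: node 2 is taken to own host CPUs 6-11 and 30-35 (from the numactl listing), and the SMT sibling of host CPU N is assumed to be N+24, which matches the 18/42 style pairing in the posted XML but should be confirmed on the actual machine. Node 2 showed little free memory in that listing, so it is worth checking that the VM's RAM actually fits before using strict.

```xml
<vcpu placement='static'>8</vcpu>
<cputune>
  <!-- four cores from node 2, each with its assumed N+24 sibling -->
  <vcpupin vcpu='0' cpuset='8'/>
  <vcpupin vcpu='1' cpuset='32'/>
  <vcpupin vcpu='2' cpuset='9'/>
  <vcpupin vcpu='3' cpuset='33'/>
  <vcpupin vcpu='4' cpuset='10'/>
  <vcpupin vcpu='5' cpuset='34'/>
  <vcpupin vcpu='6' cpuset='11'/>
  <vcpupin vcpu='7' cpuset='35'/>
  <!-- emulator threads kept on the same node, off the vCPU cores -->
  <emulatorpin cpuset='6,30'/>
</cputune>
<numatune>
  <!-- RAM from node 2 only, matching the pinned cores -->
  <memory mode='strict' nodeset='2'/>
</numatune>
```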

Link to comment
  • 6 months later...

Oh, my, god.
I came across this topic a few times but I was too lazy to read it all.
This information REALLY needs to be condensed / communicated (or integrated in the OS?) by Unraid's dev team.
It would have saved me so much time. A bit pissed off at not having been informed properly; I was just told that TR was bad for gaming on Unraid... after having bought a Pro key. Meh.


 

I run a 1950x + 5700XT + 64GB.
Note that I use OVMF for the 5700 XT.

 

I was getting 750GB/sec bare metal on L3 and 45GB/sec in the VM.

I now get a solid 312GB/sec, which as I understand it is close to bare metal because I'm now using half the Threadripper.

 

 

Some tips :
I switched to NUMA by booting Windows on bare metal and setting the Threadripper to gaming mode with Ryzen Master.

The option may now be hidden in the BIOS, as AMD asked motherboard manufacturers to do so.

I saw no improvement (results were worse) without activating NUMA in the XML.

I was previously running 12 cores for the VM but I'm fine with only 8 if I get stability.
This applies not only to gaming but also to video editing, where you never really know if your system is performing at its best.


I'm redownloading Modern Warfare. It's a good benchmark as it was UNPLAYABLE before; the system stuttered every few secs.
I'll update the post to confirm/or not if it fixed the lags I was getting.

 

Here's (another) XML and screenshots of the tests I did before achieving what I was expecting to get, thanks to you guys who took the time to collect and share that precious information!

<domain type='kvm' id='1'>
  <name>Windows 10 RX</name>
  <uuid>XXXXXXXXXXXXXXXX</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>33554432</memory>
  <currentMemory unit='KiB'>33554432</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='16'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='18'/>
    <vcpupin vcpu='3' cpuset='19'/>
    <vcpupin vcpu='4' cpuset='20'/>
    <vcpupin vcpu='5' cpuset='21'/>
    <vcpupin vcpu='6' cpuset='22'/>
    <vcpupin vcpu='7' cpuset='23'/>
    <vcpupin vcpu='8' cpuset='24'/>
    <vcpupin vcpu='9' cpuset='25'/>
    <vcpupin vcpu='10' cpuset='26'/>
    <vcpupin vcpu='11' cpuset='27'/>
    <vcpupin vcpu='12' cpuset='28'/>
    <vcpupin vcpu='13' cpuset='29'/>
    <vcpupin vcpu='14' cpuset='30'/>
    <vcpupin vcpu='15' cpuset='31'/>
  </cputune>
  <numatune>
    <memory mode='interleave' nodeset='0-1'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-4.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/XXXXXXXXXXXXXXXXXXXXXXXX_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <topology sockets='1' cores='8' threads='2'/>
    <feature policy='require' name='topoext'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='svm'/>
    <numa>
      <cell id='0' cpus='0-7' memory='16777216' unit='KiB'/>
      <cell id='1' cpus='8-15' memory='16777216' unit='KiB'/>
    </numa>
  </cpu>

 

BENCHMARKS_UNRAID.png

Edited by dboris
Link to comment
2 hours ago, dboris said:

I came across this topic a few times but I was too lazy to read it all.
This information REALLY needs to be condensed / communicated (or integrated in the OS?) by Unraid's dev team.
It would have saved me so much time. A bit pissed off at not having been informed properly; I was just told that TR was bad for gaming on Unraid... after having bought a Pro key. Meh.

🥱  

Link to comment
7 minutes ago, dboris said:

Not everyone can afford to spend countless hours reading every topic on you-name-it-forum.

I just skimmed through this topic in less than 10 minutes. If that is "countless hours" for you then you have bigger things to be concerned about.

Blaming others for your own laziness will get you quite far.

Link to comment

I spent multiple weekends trying to problem solve my unraid system. 

So I'm glad it took you 10 min to read the topic, but it took me a few hours to do the benchmarks, test and confirm. Each Windows reboot requires a system reboot as I have a 5700XT.

This fixed my latency issues but somehow made my Realtek sound card crash (the Windows audio service consumes 15% of the CPU and freezes the Windows audio settings). It still works on bare metal on the same SSD, no problem. Don't you think I went through some other topics already?

As of today I just spent a straight 12 hours problem solving Unraid.

Did your 2150 posts take you 10 minutes too?


Don't you think I can rightfully express my regrets on not checking this topic twice, without having you trying to make yourself shine over me calling myself lazy ? Thanks for your forum contribution, thumbs down for your behaviour. 👎


 

 

 

 

Edited by dboris
Link to comment
11 hours ago, dboris said:

I spent multiple weekends trying to problem solve my unraid system. 

So I'm glad it took you 10 min to read the topic, but it took me a few hours to do the benchmarks, test and confirm. Each Windows reboot requires a system reboot as I have a 5700XT.

This fixed my latency issues but somehow made my Realtek sound card crash (the Windows audio service consumes 15% of the CPU and freezes the Windows audio settings). It still works on bare metal on the same SSD, no problem. Don't you think I went through some other topics already?

As of today I just spent a straight 12 hours problem solving Unraid.

Did your 2150 posts take you 10 minutes too?


Don't you think I can rightfully express my regrets on not checking this topic twice, without having you trying to make yourself shine over me calling myself lazy ? Thanks for your forum contribution, thumbs down for your behaviour. 👎


 

 

 

 

I don't really see what board you're running, but some boards are weird about passing through the audio. Even my Zenith sometimes shits the bed with the audio, though it's fairly rare. Also, testdasi, me, and another user whose username escapes me all have 2990WXs, and we've probably put countless hours of research into making our systems run VMs really well. In this thread, though, the useful material is mostly in the first few posts and then from where it gets down to mine and a few others.

Btw, most things have actually been integrated into Unraid (depending on what version you're running). For instance, you don't have to do all the special XML editing crap for a lot of stuff unless you're bridging both NUMA nodes together. Tbh Unraid can't do that very easily, as there is A LOT of special sauce in that. Heck, it took me months to make the NUMA render properly so memory could easily be split and allocated.

Anyways, I'd say learn how to skim fast, as that is what I generally do. I'm guessing that is what testdasi is talking about. Most of the responses in this thread are people asking questions, with only a few posting updates. Just lock onto those posters and go through their comments looking for updates. It will go a lot faster.

 

 

Below is all you really need these days, if you're running the latest Unraid, for the processor part to render the cache properly. Unraid actually does most of this for you now in the latest version. It will not do the NUMA part, at least to my knowledge.

  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='16' threads='1'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='svm'/>
    <feature policy='disable' name='x2apic'/>
    <numa>
      <cell id='0' cpus='0-7' memory='16777216' unit='KiB'/>
      <cell id='1' cpus='8-15' memory='16777216' unit='KiB'/>
    </numa>
  </cpu>

This is what you should be running for your numatune. It will properly allocate your RAM. The way you currently have it, it will allocate all it can from node 0 first, which is "problematic" to say the least. Please keep in mind I run a 2990WX, so your NUMA nodes will be different. I'm assuming 0/1 for yours, so yeah, don't just copy-paste this and then come back and complain it doesn't work please o_o.

  <numatune>
    <memory mode='strict' nodeset='0,2'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='2'/>
  </numatune>
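
For a 1950X, the same idea might translate to the sketch below. It assumes the host exposes NUMA nodes 0 and 1 (as the nodeset='0-1' in the quoted XML suggests) and that the guest has the two numa cells with ids 0 and 1 from the cpu block; neither has been verified on that system. It also only makes sense if the vCPUs in each guest cell are pinned to host threads on the matching host node.

```xml
<numatune>
  <!-- overall policy: guest RAM only from host nodes 0 and 1 -->
  <memory mode='strict' nodeset='0-1'/>
  <!-- guest cell 0 backed by host node 0, guest cell 1 by host node 1 -->
  <memnode cellid='0' mode='strict' nodeset='0'/>
  <memnode cellid='1' mode='strict' nodeset='1'/>
</numatune>
```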
Edited by Jerky_san
Link to comment

I noticed the audio was working on some other VMs, so after reinstalling Windows to rule out software bugs, and after a good night of sleep, I messed around with the XMLs and ended up solving my audio problem while retaining the L3 performance.

I get 4793 on Cinebench.

So, with a 1950X, this is what worked :

 

  <numatune>
    <memory mode='interleave' nodeset='0-1'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-4.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/xxx_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <topology sockets='1' cores='8' threads='2'/>
    <feature policy='require' name='topoext'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='svm'/>
    <numa>
      <cell id='0' cpus='0-7' memory='16777216' unit='KiB'/>
      <cell id='1' cpus='8-15' memory='16777216' unit='KiB'/>
    </numa>
  </cpu>
  

 

12b3f0c7-3fda-43c7-9b7e-ddffe291191d.jpg

Link to comment
