high vCPU count VM and overheads? (40+ vCPUs?)



Hi All,

I use my VM for CPU-based multithreaded computations, and I often use 40 or more vCPUs per VM.

My system has 44 cores (88 threads) on dual Xeons.

Is there any general knowledge about vCPU count and overhead? A quick sysbench threading benchmark with 80 threads seems to put a lot of load on the remaining CPUs (4 cores/8 threads are left for Unraid).

 

Any experience from people with high vCPU/high performance VMs?

 


Some apps are coded to use only 1 thread. For such apps, it is the speed of the single core that matters for the performance of the single app.

 

Some apps are written to use a fixed (or max) number of threads (e.g., 4). For such apps, the number of useful cores is limited.

 

Some apps are written to use a lot of threads, and for such apps all of the cores are useful. Games tend to be in this category.

 

Some people run a lot of apps at the same time, and having lots of cores would let them run in parallel without having to share the cores much if at all.

 

Of course if the apps you run are not CPU intensive, none of this may make much difference. Your threads will be mostly idle anyway.

 

If you are running demanding games, your high core count might be valuable. But games tend to be limited more by GPU performance than by CPU performance.

Would ten 4 GHz cores be better than twenty 2 GHz cores? That would depend on the app. The ten faster cores might be better for heavy spreadsheets, while the twenty 2 GHz cores might be better for simultaneous transcodes.

 

So see how your usage model stacks up against these use cases. If you have enough threads of execution running in parallel, your 44 cores may be giving you outstanding performance.


Hi, thanks for the reply, but I believe there is a bit of a misunderstanding. I probably need to rephrase.

I would like to know if QEMU/KVM has a problem, or maybe increasing overhead, when a VM has a lot of vCPUs.

Most people on the forum are normal users with 8-16 CPUs running games, Plex, etc. My case is high-performance multithreaded computation.

I feel that KVM/libvirt is hindering the VM when many cores are fully loaded.

Any known issues with this?


I run a 64-core OS X VM for video editing. It seems to work fine, minus a little performance loss due to virtualization. I end up using the remaining unused cores as emulator pins, which it never maxes out. I've run as few as 2-3 emulator pins with a single 64-core VM and not hit any limits in that regard either.

If you're going to run high core counts in a VM, you need to be on the 6.5.3-rc series, as they updated a setting to allow MUCH faster booting of high-core-count VMs than before.


So I isolated 80 CPUs; of those I gave 4 to the emulator pin (emupin) and 76 to one VM.

As you can see, the leftover cores for Unraid seem to be going crazy instead of the emupin cores.

At times (htop image) the Unraid web GUI is extremely slow, and you have to wait 10 seconds to refresh the VM or Docker page.

 

append isolcpus=2-21,46-65,24-43,68-87 initrd=/bzroot
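
To sanity-check an isolcpus list like the one above, a small Python sketch can expand the ranges and show what is left for the host. The 88 logical CPUs numbered 0-87 are an assumption based on the 44-core/88-thread system described earlier, and `parse_cpu_list` is just a name made up for this example.

```python
def parse_cpu_list(spec):
    """Expand a kernel-style CPU list such as "2-21,46-65" into sorted CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


# The isolcpus value from the append line in this post.
isolated = parse_cpu_list("2-21,46-65,24-43,68-87")

# Assumption: 88 logical CPUs, numbered 0-87.
all_cpus = set(range(88))
host_cpus = sorted(all_cpus - set(isolated))

print(len(isolated), "CPUs isolated")
print("left for the host:", host_cpus)
```

If the numbering assumption holds, this leaves 8 logical CPUs for the host, which lines up with the "4 cores/8 threads left for Unraid" mentioned in the first post.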

 

[Three screenshots attached, 2018-05-30]

 

16 hours ago, daemonix said:

really appreciate this!!

Is there a rule for the emupin? The more vCPUs on the VM, the more emupins?

Not really any hard-and-fast rules. I've run all sorts of combinations and it's all pretty close. An emulator pin can help with some minor latency and audio hiccups. Many think that the more CPUs you add, the more emulator pins you should have. I just watch, and if the pin(s) I'm using are maxing out, I add more. It seems really dependent on the type of workload.

 

15 hours ago, daemonix said:

another question: Is it better to give a single VM all of its vCPUs from one physical CPU, or half from CPU0 and half from CPU1? I'm thinking about memory 'channels' etc. on dual Xeons.

I don't know whether Unraid allocates RAM "all from one side (CPU) first" or "equally from all sides (CPUs)". I've never run into any memory speed issues using 2-4 processors, with RAM on 1-4 processors, so I've never looked into it. This isn't solid info, just my experience. The equipment I run was used for running way more VMs than I do, and it managed that fine. So I imagine that running a single VM doesn't stress it in terms of RAM access. But I wouldn't mind being proven wrong and being shown a better optimization that actually makes more than a 1% difference.
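
One way to look at the one-socket-versus-two question is to count how many pinned CPUs land on each NUMA node. The Python sketch below is only an illustration: the node layout in `example_nodes` is an assumption (guessed from the gaps in the isolcpus list posted in this thread, not confirmed by anyone here), and the function names are made up. On a real Linux host, `read_numa_nodes()` reads the per-node `cpulist` files under /sys/devices/system/node.

```python
import glob
import os
import re


def expand(spec):
    """Expand a CPU list such as "0-21,44-65" into a sorted list of ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_numa_nodes(base="/sys/devices/system/node"):
    """Return {node_id: [cpu ids]} from sysfs; empty dict if not available."""
    nodes = {}
    for path in sorted(glob.glob(os.path.join(base, "node[0-9]*", "cpulist"))):
        node_id = int(re.search(r"node(\d+)", path).group(1))
        with open(path) as fh:
            nodes[node_id] = expand(fh.read().strip())
    return nodes


def pins_per_node(pinned, nodes):
    """Count how many of the pinned CPUs belong to each NUMA node."""
    pinned = set(pinned)
    return {node: len(pinned & set(cpus)) for node, cpus in nodes.items()}


# Assumed layout for a dual 22-core host with hyperthreading (not verified).
example_nodes = {0: expand("0-21,44-65"), 1: expand("22-43,66-87")}

one_socket = pins_per_node(expand("2-21,46-65"), example_nodes)
both_sockets = pins_per_node(expand("2-21,46-65,24-43,68-87"), example_nodes)
print("40 vCPUs on one socket:", one_socket)
print("80 vCPUs across both:  ", both_sockets)
```

This only shows where the pins land. Whether keeping a VM on one node is actually faster still has to be measured with the real workload.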

 

 

13 hours ago, daemonix said:

So I isolated 80 cpus, from that I gave 4 to emupin and 76 to one VM.

As you can see it seems that the leftover cores for unraid are going crazy instead of the emupin cores.. 

At times (htop image) unraid WEBGUI is extremely slow and you have to wait 10 sec to refresh the VM or docker page.

 


 

 

Your emulator pin assignments of 2, 24, 46, 68 in the XML are boxed/identified as in your Unraid grouping, while the emulator pin boxes 3, 25, 47, 69 have nothing assigned to them. So that's why you are seeing the activity you described.


Yes, exactly. The cores are the same; it's a numbering issue. This is why I added the image, as it was weird for me to understand.

As you can see, the emupin cores are almost free (a bit of activity on one core) and the remaining Unraid (un-isolated) cores are almost maxed out! Even the web GUI is slower.
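
The numbering mismatch is easy to trip over, so here is a tiny Python sketch of it. It assumes htop is labelling its CPU meters starting from 1 while Unraid and the VM XML count from 0; `kernel_to_htop` is a made-up helper name.

```python
def kernel_to_htop(cpu_id, htop_counts_from_one=True):
    """Return the label htop shows for a CPU id as used in the XML/Unraid."""
    return cpu_id + 1 if htop_counts_from_one else cpu_id


# Emulator pin CPUs as written in the XML (0-based numbering).
emupin_cpus = [2, 24, 46, 68]

# The meters to look at in htop if it counts from 1.
htop_labels = [kernel_to_htop(cpu) for cpu in emupin_cpus]
print(htop_labels)
```

Under that assumption, the XML pins 2, 24, 46, 68 show up in htop as meters 3, 25, 47, 69, which matches the two sets of numbers being compared above.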


Testing update!!!

I think my HP ProLiant has a configuration problem or something... :(

I ran sysbench on my 'workstation' hardware, based on Supermicro, and a 60-core VM runs with almost ZERO utilisation on the remaining CPUs :S

Look at these results from a dual 'Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz' (the BLUE in htop is just Docker/Syncthing, nothing to worry about).

This time I had NO emupin, and all 60 vCPUs are isolated. Look at the sysbench latency too, nice and cool.

What should I do on the HP 380 G8 as troubleshooting steps? Reset the BIOS settings?

Thanks

 

[Three screenshots attached, 2018-05-31]

5 hours ago, glennv said:

htop starts counting CPU cores from 1, as seen in the screenshot, while Unraid's assignment starts from zero. It got me a few times as well, so I always check the Unraid dashboard for core utilisation instead.

 

That's what I get for being on the forums with only 2 hours of sleep...!

 

 

17 minutes ago, daemonix said:

What should I do on the HP 380 G8 as troubleshooting steps? Reset the BIOS settings?

 

Take the VM down to basics, just 4-8 cores and 1 emulator pin, and see if the problem can be duplicated. If so, post your diagnostics zip file and I'll see if anything jumps out.

You might also compare the C-state settings between the 2 servers. I haven't fiddled with those much. As far as a BIOS reset goes, maybe, but I haven't played with G8s that much, so I don't know their settings as closely as the G7/G6's.


The only thing I can think of (I am totally confused by all your test results as to what is what) is that the CPUs are not turboing.

Run the following on the Unraid server to display the current frequencies. Check during the tests whether they reach max turbo speed or are stuck at stock speed.

watch grep MHz /proc/cpuinfo

And to check which scaling driver is active on your cores (e.g. intel_pstate):

cat /sys/devices/system/cpu/cpufreq/policy*/scaling_driver

And then check the following. This is the important one; it should be zero (the file only exists when intel_pstate is the active driver):

cat /sys/devices/system/cpu/intel_pstate/no_turbo

Some example output values from my system for the CPU P-state/turbo settings:

 
# cat /sys/devices/system/cpu/intel_pstate/turbo_pct
36
# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
35
# cat /sys/devices/system/cpu/intel_pstate/num_pstates
23
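
If watching the raw numbers scroll by is awkward, the same check can be sketched in Python. This is only an illustration: the function names are made up, the sample text is invented to show the parsing, and the base clock you compare against is something you have to supply for your own CPU. The live part only runs where /proc/cpuinfo exists, and `no_turbo` is reported as `None` when the intel_pstate file is not present.

```python
import os
import re


def parse_cpu_mhz(cpuinfo_text):
    """Extract every "cpu MHz" value from /proc/cpuinfo-style text."""
    pattern = r"^cpu MHz\s*:\s*([\d.]+)"
    return [float(m) for m in re.findall(pattern, cpuinfo_text, flags=re.MULTILINE)]


def cores_above(mhz_values, base_mhz):
    """Count CPUs currently clocked above the given base frequency."""
    return sum(1 for mhz in mhz_values if mhz > base_mhz)


def read_no_turbo(path="/sys/devices/system/cpu/intel_pstate/no_turbo"):
    """Return the no_turbo value as text, or None if the file is absent."""
    if not os.path.exists(path):
        return None
    with open(path) as fh:
        return fh.read().strip()


# Invented sample, only to demonstrate the parsing.
sample = "processor\t: 0\ncpu MHz\t\t: 2200.000\nprocessor\t: 1\ncpu MHz\t\t: 3299.853\n"
sample_mhz = parse_cpu_mhz(sample)
print(sample_mhz, "->", cores_above(sample_mhz, 2200.0), "CPU(s) above base")

# Live check, run this while the benchmark is loading the cores.
if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as fh:
        live = parse_cpu_mhz(fh.read())
    if live:
        print(f"live: max {max(live):.0f} MHz across {len(live)} CPUs")
    print("no_turbo:", read_no_turbo())
```

If no CPU ever climbs above its base clock under load, that points at turbo being disabled or managed outside the OS.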

54 minutes ago, glennv said:

Check during the tests if they reach max turbo speed or are stuck on stock speed.

You are correct! Indeed, the 2699 v4 CPU in the HP is not going up to turbo speed, while the other one is OK.

 

55 minutes ago, glennv said:

And to check which scaling driver is active on your cores (e.g. intel_pstate):

pcc-cpufreq on the HP and acpi-cpufreq on the Supermicro (the one that performs OK).

How do I get intel_pstate on Unraid?

Both Unraid setups are very basic, with just a VM and one Docker container (Syncthing) running.


Good, you have a starting point now. You don't set these in Unraid. You have to dig into the BIOS of your specific motherboard and make sure all the proper CPU frequency scaling settings, including turbo, are activated. Then you can check on the Unraid command line, with the commands provided, whether it is working.

Typically these things are under advanced CPU configuration, advanced power management, etc.

For Supermicro boards I found this somewhere, but you have to check for your HP. At least you have something to compare and play with.

 

-------------- cut from Supermicro support -------

Question
How do I enable Turbo mode to get the maximum Turbo mode speed on my X10DRi motherboard?
Answer
Please make sure the following settings are correct:

1.    Make sure all cores are enabled: Advanced >> CPU Configuration >> Core enabled >> "0" to enable all cores.

2.    In the BIOS setup go to Advanced >> CPU Configuration >> Advanced Power Management and make sure the settings are as follows:
Power Technology >> Custom
Energy performance Tuning >> disable
Energy performance BIAS setting >> performance
Energy efficient turbo >> disable

3.    Then go to Advanced >> CPU Configuration >> Advanced Power Management >> CPU P state control and make sure the settings are as follows:
EIST (P-States) >> Enable
Turbo mode >> enable
P-state coordination >> HW_ALL

4.    Then go to Advanced >> CPU Configuration >> Advanced Power Management >> CPU C state control and make sure the settings are as follows:
Package C-state limit >> C0/C1 state
CPU C3 Report >> disable
CPU C6 report >> enable
Enhanced Halt state >> disable

-----------


Hi all,

A full "reset to manufacturer settings" in the HP 380 BIOS helped a lot!!!! Definitely no overhead and a higher CPU clock speed. The CPUs are 2699 v4, so turbo speed seems to be 3.6 GHz. At the moment I can see some cores going up to 3.2-3.3 GHz, but I think that's OK, as the full turbo is for only some cores (right?).

Regarding the ProLiant settings, here is a tour of the BIOS and the possible settings.

- At the moment I run the "Balanced" setting in the BIOS and the Performance setting on Unraid Tweaks.

The custom options seem similar to the settings you posted above, but I'm not sure.

 

[Five screenshots attached, 2018-05-31]

 

There is no TURBO setting on Unraid though!!! :S 

 

[Screenshot attached, 2018-05-31]

 

So with basic/auto 'balance performance and savings' mode: no overhead to non-pinned/non-isolated cores at all!!! :):):)

 

[Screenshot attached, 2018-05-31]

 

 

 

5 hours ago, daemonix said:

There is no TURBO setting on Unraid though!!! :S 

 

I had the same on my Dell server (T630 with dual Xeon E5-2640 v4), and it turned out that I had set its power/turbo to be managed by the BMC, not the OS. I changed the setting and now Unraid does the work!

I've no experience with HP servers but I'm sure others have ;)

  • 8 months later...
On 5/30/2018 at 1:00 AM, 1812 said:

I run a 64 core osx vm for video editing. Seems to work fine, minus a little performance loss due to virtualization. I end up using the remaining unused cores as emulator pins which it never maxes out. I've run as few as 2-3 emulator pins with single 64 core vm and not hit any limits in that regards either.

 

How did you manage to run so many cores in OS X? Whatever I do, I am stuck at 32 vcores. I have a dual 2697 v2 Supermicro board.

Whatever combination of sockets, cores, and hyperthreading I try, anything that ends up above 32 crashes during boot.

32 or lower and all is great.

I'm running a Clover-based Sierra VM and would like to assign 40 vcores to it and leave the rest for Unraid.

Did you do anything specific in Clover? Any specific machine type (I tried to change it but it crashed like mad; now on model 14.1) or other setting that makes it work?

I was almost convinced it was a hard limit until I found you and a few others running way above 32 vcores.


Remove the topology from the XML. As an added bonus, it will then actually use all the cores to 100%, instead of limiting the second "hyperthreaded core" to 80ish percent. I did a bunch of testing a year or so ago showing it.

For the model definition I used 14,2, with no special changes in Clover.


When I do that, it does boot with 40 vCPUs, and at first glance it all looks perfect. But most programs I start crash immediately, even Terminal.

Geekbench loses the 64-bit option and only shows the 32-bit option, which is interesting. It dies when I try to run it.

I have heard of this behavior, but only above 64 cores. I am running 14.2 as well, btw (that was a typo before).

Can you show me the XML of your working setup with 32+ cores, please? Maybe I'm missing something obvious.

 


 

 

I've never experienced the behavior you describe, nor have I ever seen anything above 64 cores, as that required modifying several components in OS X at the core level last time I checked.

 

Here is 60 cores:

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>OSX Desktop</name>
  <uuid>7b3b2d3d-80bc-fbb7-8d3b-a5eb7e88fc85</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Linux" icon="linux.png" os="linux"/>
  </metadata>
  <memory unit='KiB'>25165824</memory>
  <currentMemory unit='KiB'>25165824</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>60</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='4'/>
    <vcpupin vcpu='3' cpuset='5'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='7'/>
    <vcpupin vcpu='6' cpuset='8'/>
    <vcpupin vcpu='7' cpuset='9'/>
    <vcpupin vcpu='8' cpuset='10'/>
    <vcpupin vcpu='9' cpuset='11'/>
    <vcpupin vcpu='10' cpuset='12'/>
    <vcpupin vcpu='11' cpuset='13'/>
    <vcpupin vcpu='12' cpuset='14'/>
    <vcpupin vcpu='13' cpuset='15'/>
    <vcpupin vcpu='14' cpuset='16'/>
    <vcpupin vcpu='15' cpuset='17'/>
    <vcpupin vcpu='16' cpuset='18'/>
    <vcpupin vcpu='17' cpuset='19'/>
    <vcpupin vcpu='18' cpuset='20'/>
    <vcpupin vcpu='19' cpuset='21'/>
    <vcpupin vcpu='20' cpuset='22'/>
    <vcpupin vcpu='21' cpuset='23'/>
    <vcpupin vcpu='22' cpuset='24'/>
    <vcpupin vcpu='23' cpuset='25'/>
    <vcpupin vcpu='24' cpuset='26'/>
    <vcpupin vcpu='25' cpuset='27'/>
    <vcpupin vcpu='26' cpuset='28'/>
    <vcpupin vcpu='27' cpuset='29'/>
    <vcpupin vcpu='28' cpuset='30'/>
    <vcpupin vcpu='29' cpuset='31'/>
    <vcpupin vcpu='30' cpuset='32'/>
    <vcpupin vcpu='31' cpuset='33'/>
    <vcpupin vcpu='32' cpuset='34'/>
    <vcpupin vcpu='33' cpuset='35'/>
    <vcpupin vcpu='34' cpuset='36'/>
    <vcpupin vcpu='35' cpuset='37'/>
    <vcpupin vcpu='36' cpuset='38'/>
    <vcpupin vcpu='37' cpuset='39'/>
    <vcpupin vcpu='38' cpuset='40'/>
    <vcpupin vcpu='39' cpuset='41'/>
    <vcpupin vcpu='40' cpuset='42'/>
    <vcpupin vcpu='41' cpuset='43'/>
    <vcpupin vcpu='42' cpuset='44'/>
    <vcpupin vcpu='43' cpuset='45'/>
    <vcpupin vcpu='44' cpuset='46'/>
    <vcpupin vcpu='45' cpuset='47'/>
    <vcpupin vcpu='46' cpuset='48'/>
    <vcpupin vcpu='47' cpuset='49'/>
    <vcpupin vcpu='48' cpuset='50'/>
    <vcpupin vcpu='49' cpuset='51'/>
    <vcpupin vcpu='50' cpuset='52'/>
    <vcpupin vcpu='51' cpuset='53'/>
    <vcpupin vcpu='52' cpuset='54'/>
    <vcpupin vcpu='53' cpuset='55'/>
    <vcpupin vcpu='54' cpuset='56'/>
    <vcpupin vcpu='55' cpuset='57'/>
    <vcpupin vcpu='56' cpuset='58'/>
    <vcpupin vcpu='57' cpuset='59'/>
    <vcpupin vcpu='58' cpuset='60'/>
    <vcpupin vcpu='59' cpuset='61'/>
    <emulatorpin cpuset='1'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-q35-3.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/7b3b2d3d-80bc-fbb7-8d3b-a5eb7e88fc85_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'/>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/disks/Samsung_SSD_850_PRO_128GB/RD/vdisk.img'/>
      <target dev='hda' bus='sata'/>
      <boot order='1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/Apps Disk/vdisk1.img'/>
      <target dev='hdb' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/disks/pool2/scratch/vdisk1.img'/>
      <target dev='hdc' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='8' model='dmi-to-pci-bridge'>
      <model name='i82801b11-bridge'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
    </controller>
    <controller type='pci' index='9' model='pci-bridge'>
      <model name='pci-bridge'/>
      <target chassisNr='9'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='10' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='10' port='0xa'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <controller type='scsi' index='0' model='lsilogic'>
      <address type='pci' domain='0x0000' bus='0x09' slot='0x01' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:4d:55:22'/>
      <source bridge='br0'/>
      <model type='e1000-82545em'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0' multifunction='on'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </memballoon>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-usb'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='usb-mouse,bus=usb-bus.0'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='usb-kbd,bus=usb-bus.0'/>
    <qemu:arg value='-smbios'/>
    <qemu:arg value='type=2'/>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='Penryn,vendor=GenuineIntel,kvm=on,vmx,rdtscp,+invtsc,+avx,+aes,+xsave,xsaveopt,vmware-cpuid-freq=on,'/>
  </qemu:commandline>
</domain>

 

This is High Sierra. I haven't tried 32+ on macOS, as my GPU for this video editing rig requires Nvidia drivers.
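
For anyone building a similar `<cputune>` block, typing 60 pin lines by hand is error-prone, and forum software sometimes turns straight quotes in pasted XML into typographic ones, which an XML parser rejects. A minimal Python sketch, assuming the same simple mapping as the XML above (vCPU n pinned to host CPU n+2); the helper names are made up for illustration:

```python
import xml.etree.ElementTree as ET


def vcpupin_lines(host_cpus, indent="    "):
    """Build one <vcpupin> element per host CPU, numbering vCPUs from 0."""
    return [
        f"{indent}<vcpupin vcpu='{vcpu}' cpuset='{cpu}'/>"
        for vcpu, cpu in enumerate(host_cpus)
    ]


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# 60 vCPUs pinned to host CPUs 2..61.
lines = vcpupin_lines(range(2, 62))
cputune = "<cputune>\n" + "\n".join(lines) + "\n</cputune>"

print(lines[0])
print(lines[-1])
print("generated block parses:", is_well_formed(cputune))

# The same element with typographic quotes (U+2018/U+2019) does not parse.
curly = "<vcpupin vcpu=\u20180\u2019 cpuset=\u20182\u2019/>"
print("curly-quote version parses:", is_well_formed(curly))
```

The output can be pasted between `<cputune>` and `</cputune>`, with the `<emulatorpin>` line added by hand.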


Thanks. (You probably meant to say Mojave, as there are no Nvidia drivers for it yet.)

Mine is exactly like yours, excluding of course the devices. I use it for Resolve / Nuke / Houdini renders.

Crazy. No explanation. So it must be somewhere in Clover. It cannot be OS-specific, as I did a clean install of Mojave (without passing through my Nvidia card) and it behaves the same. There is nothing indicative in the boot logs, and everything indicates that 40 cores are active. But programs crash with (as usual) unhelpful errors. It's a system-wide thing that only appears with more than 32 cores. Anything lower than or equal to 32 and all is smooth. Insane.

I will try upgrading Clover to the latest version and see if that helps, but I doubt it, as I think I used the latest for the Mojave install.

 
