amstel Posted March 13, 2017
41 minutes ago, 1812 said: I don't use those, so you'll have to try and see.
Nope. Thanks for that. Last question: can I isolate 2 CPUs and use 4 CPUs for that VM? I mean, just like now when I'm passing the 4 CPUs, except that 2 of them will be isolated for this VM only? Thanks.
1812 Posted March 13, 2017
17 minutes ago, amstel said: can I isolate 2 CPUs and use 4 CPUs for that VM? I mean, just like now when I'm passing the 4 CPUs, except that 2 of them will be isolated for this VM only?
Yes. Whether a CPU is isolated from unRAID or not doesn't keep it from being assigned to a VM. If you do that, my suggestion would be to ensure that vcpu 0 of the VM is an isolated core, as Windows and most other operating systems will use that as a primary resource and favor it, especially during initial booting of the VM. If you did not change it, then the VM and unRAID would both try to use that core as primary, resulting in decreased performance. Also note that using the shared (non-isolated) cores may contribute to latency, diminishing the point of isolating cores.
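The vcpu 0 suggestion above can be checked mechanically. This is only a minimal sketch: the function name and the example pin layouts are hypothetical, and it assumes you already know which host CPUs are isolated.

```python
def vcpu0_is_isolated(pins, isolated):
    """Return True if guest vCPU 0 is pinned to an isolated host CPU.

    pins     -- dict mapping guest vCPU number -> host CPU number (hypothetical layout)
    isolated -- set of host CPU numbers kept away from the host
    """
    # dict.get returns None when vCPU 0 has no pin, and None is never in the set.
    return pins.get(0) in isolated


# Hypothetical 4-thread host with CPUs 1 and 3 isolated.
isolated = {1, 3}
favoured = {0: 1, 1: 3, 2: 0, 3: 2}   # vCPU 0 sits on isolated CPU 1
contended = {0: 0, 1: 2, 2: 1, 3: 3}  # vCPU 0 sits on shared CPU 0

print(vcpu0_is_isolated(favoured, isolated))
print(vcpu0_is_isolated(contended, isolated))
```

The first layout follows the advice; the second would leave the guest's primary vCPU competing with the host.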
amstel Posted March 13, 2017
20 minutes ago, 1812 said: Yes. Whether a CPU is isolated from unRAID or not doesn't keep it from being assigned to a VM. If you do that, my suggestion would be to ensure that vcpu 0 of the VM is an isolated core, as Windows and most other operating systems will use that as a primary resource and favor it, especially during initial booting of the VM. If you did not change it, then the VM and unRAID would both try to use that core as primary, resulting in decreased performance. Also note that using the shared (non-isolated) cores may contribute to latency, diminishing the point of isolating cores.
Yes, I know that. I just don't want the VM to be too slow because it is using 'only' 2 hyperthreaded cores (1 CPU). I will give those 2 different configurations a try and test each to see which behaves better. Thanks for the help!
Edited March 13, 2017 by amstel
amstel Posted March 13, 2017
1 hour ago, 1812 said: Yes. Whether a CPU is isolated from unRAID or not doesn't keep it from being assigned to a VM. If you do that, my suggestion would be to ensure that vcpu 0 of the VM is an isolated core, as Windows and most other operating systems will use that as a primary resource and favor it, especially during initial booting of the VM. If you did not change it, then the VM and unRAID would both try to use that core as primary, resulting in decreased performance. Also note that using the shared (non-isolated) cores may contribute to latency, diminishing the point of isolating cores.
Well, I have changed CPUs 1 and 3 to be isolated. Now I'd like to assign all 4 of them to the VM:
<cputune>
  <vcpupin vcpu='0' cpuset='1'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <vcpupin vcpu='2' cpuset='0'/>
  <vcpupin vcpu='3' cpuset='2'/>
  <emulatorpin cpuset='2'/>
</cputune>
Do I still need to use the emulatorpin feature? If yes, which CPU should I assign there? Thanks.
Edited March 13, 2017 by amstel
1812 Posted March 13, 2017
7 minutes ago, amstel said: Well, I have changed CPUs 1 and 3 to be isolated. Now I'd like to assign all 4 of them to the VM: <cputune> <vcpupin vcpu='0' cpuset='1'/> <vcpupin vcpu='1' cpuset='3'/> <vcpupin vcpu='2' cpuset='0'/> <vcpupin vcpu='3' cpuset='2'/> <emulatorpin cpuset='2'/> </cputune> Do I still need to use the emulatorpin feature? If yes, which CPU should I assign there? Thanks.
Your core assignments look soooo wonky, which is fine because Windows 10 doesn't care where the cores come from (as shown through benchmarking tests) or even whether you use threaded pairs. If you're using all the cores, there is no real need to specify an emulator pin. There might be a super small gain? But if you want to, make it 2 or 3.
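To see at a glance what a `<cputune>` block like the one quoted above actually pins, a small parser helps. The following is a sketch using only Python's standard library; the helper name `read_pins` is made up for illustration and the XML is amstel's snippet copied verbatim.

```python
import xml.etree.ElementTree as ET

# amstel's <cputune> block from the post above.
CPUTUNE = """
<cputune>
  <vcpupin vcpu='0' cpuset='1'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <vcpupin vcpu='2' cpuset='0'/>
  <vcpupin vcpu='3' cpuset='2'/>
  <emulatorpin cpuset='2'/>
</cputune>
"""


def read_pins(xml_text):
    """Return ({vcpu: cpuset string}, emulator cpuset string or None)."""
    root = ET.fromstring(xml_text.strip())
    pins = {int(e.get("vcpu")): e.get("cpuset") for e in root.findall("vcpupin")}
    emulator = root.find("emulatorpin")
    return pins, (emulator.get("cpuset") if emulator is not None else None)


pins, emulator = read_pins(CPUTUNE)
for vcpu in sorted(pins):
    print(f"vCPU {vcpu} -> host CPU {pins[vcpu]}")
print("emulator ->", emulator)
```

Here the emulator thread shares host CPU 2 with vCPU 3, which matches the point that a separate emulator pin gains little once every core is handed to the VM.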
amstel Posted March 13, 2017
20 minutes ago, 1812 said: Your core assignments look soooo wonky, which is fine because Windows 10 doesn't care where the cores come from (as shown through benchmarking tests) or even whether you use threaded pairs. If you're using all the cores, there is no real need to specify an emulator pin. There might be a super small gain? But if you want to, make it 2 or 3.
Mmm, well, I tested 4 different combinations with the Geekbench CPU benchmark, and these are the results (single-core, multi-core):
4 CPUs, no isolation: 3805, 6951
2 isolated CPUs (1,3): 3323, 4041
2 isolated CPUs (1,3) + 2 shared CPUs: 3632, 6588
2 isolated CPUs (1,3) + 2 shared CPUs + emulatorpin=2: 3554, 6781
It seems that the normal configuration gives the best benchmark result. What am I missing here? Thanks.
1812 Posted March 13, 2017
Typically you should run each test a minimum of three times and take the average. Your system could have been doing something in the background at any time, causing a variance. Your first and last tests are within 3% of each other on the multi-core score, and both use all the cores.
VM performance is more than just a raw benchmark score. On one of my transcoding cluster servers, I usually give the VM all processor cores, since the entire server is only doing one task assigned remotely. This does not take into account any audio/video latency that could occur, because I never use those interfaces.
CPU pinning is typically done to improve the ability to run multiple instances of something, be it VMs, Dockers, etc., and to find a balance that lets everything run at the same time. So it's not surprising that your tests with all cores in use are similar in speed. Additionally, your numbers will differ depending on what you have running in the background. If you have a Docker or two trying to use 30% of 2 cores vs. 30% of 4 cores, your VM will show a noticeable difference in benchmark scores.
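The advice to average several runs and compare the gap can be put into a few lines. This is a sketch only: the two score pairs are the first and last Geekbench results amstel posted, while the three-run list is invented to show the averaging step.

```python
from statistics import mean


def percent_gap(a, b):
    """Gap between two scores, as a percentage of the larger one."""
    high, low = max(a, b), min(a, b)
    return (high - low) / high * 100


# First and last configurations from amstel's results (single-core, multi-core).
no_isolation = (3805, 6951)
isolated_plus_emulatorpin = (3554, 6781)

single_gap = percent_gap(no_isolation[0], isolated_plus_emulatorpin[0])
multi_gap = percent_gap(no_isolation[1], isolated_plus_emulatorpin[1])
print(f"single-core gap: {single_gap:.1f}%")
print(f"multi-core gap:  {multi_gap:.1f}%")

# Hypothetical repeat runs of one configuration, averaged before comparing.
runs = [3805, 3790, 3820]
print("average of runs:", mean(runs))
```

With the posted numbers the multi-core gap comes out at roughly 2.4% and the single-core gap at roughly 6.6%, so a single run per configuration is not enough to tell the setups apart.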
DZMM Posted March 14, 2017
Curious: does this apply just to Windows VMs, or would a pfSense/FreeBSD VM benefit from adding an emulator pin? Also, should you always use the first core? I now have 4 VMs running and I'm wondering if I should 'load balance'.
1812 Posted March 14, 2017
You are moving the VM/host functions off the cores being utilized for the VM, so you are allowing more processing power to be dedicated to the VM, regardless of the OS. But I've found that when benchmarking with and without an emulator pin on a VM with many cores, there is only a very small difference in performance. My suspicion is that you'll find the most benefit when you have 1-3 cores being pushed to full utilization, or on a host with low CPU power.
harperhendee Posted March 21, 2017
I'm using a higher core count Xeon (22 cores/44 threads). I'm wondering if I should take into account the routing of these cores. In this generation of Xeon, there are multiple rings (see attached). It seems like you would want to keep VM cores near each other in the ring, so that they would use the fabric more efficiently. In the simplest case, I would create two VMs out of this topology using ring 1 and ring 2. When we go to 4 VMs, maybe align to the 4 columns.
Of course, I can just as easily convince myself that the CPUs should be spread out so that each one has more distributed bandwidth. I think this works out if there are multiple VMs that are not necessarily running full tilt at the same time. Spreading out the CPUs ensures that each CPU lives in a "quiet neighborhood." Therefore, the isolated VM with spread-out CPUs has more paths to memory and fewer rivals. But when all VMs are enabled and under load, they will end up stepping all over each other.
1812 Posted March 22, 2017
I'm curious! Can you do some benchmarking and let us know? I'm hoping in the future to move to an 80-thread machine.... muahahahaha.....
harperhendee Posted March 22, 2017
I took an educated guess on the CPU numbering, then mapped all 8 VMs + 1 core for unRAID. It looks like the diagram. I mainly mapped things based on what was easy in Visio. The final mapping does the following:
Unraid: Gets the first 2 threads.
Vanaheim and Muspelheim: Get the middle of each ring. They consume some of the other VM mappings if I want to run just these two.
Alfheim and Svartalfheim: My next two VMs, getting prepped for a new FOVE headset.
Niflheim, Helheim, Asgard, Midgard: Minimum-size VMs that are placeholders for future expansion.
Utgard: I'd like to have a dual-boot option into Windows (machine name Jotunheim). Utgard is a VM that can read and execute from Jotunheim's unmanaged disk.
Edited March 22, 2017 by harperhendee: Removed early draft of mapping
harperhendee Posted March 23, 2017
BTW, I was able to confirm a few things about CPU numbering and physical layout from Anandtech and a few other sources:
1) The CPUs are numbered in roughly consecutive order around the rings. The numbering is determined by a correlated variable, so they might not be exactly as you'd expect, but there's no functional difference between 0,1,2,3,4,5 and 1,0,3,2,5,4 ordering.
2) Base layouts give you basically 1, 1.5, or 2 iterations of the ring structure. If there are 12 cores in a ring, there will be 3 versions with 12/18/24 physical cores.
3) The CPUs are always fused off in equal numbers from each ring. This includes half-rings. So the 22-core Xeon has two rings of 11 cores each. There's never a 10-core and a 12-core ring.
4) There is a small latency price to pay when crossing rings. Try to minimize cross-ring traffic. From a topological point of view, if you cross a ring, you use 2x the bandwidth.
planetwilson Posted May 15, 2017
Having read through all this, I'm trying to work out the best approach for me. I have a 14-core E5-2695 v3 and 80GB of RAM for doing various VM-type scenarios. I often run a few different Windows VMs at once.
cpu 0 / 14
cpu 1 / 15
cpu 2 / 16
cpu 3 / 17
cpu 4 / 18
cpu 5 / 19
cpu 6 / 20
cpu 7 / 21
cpu 8 / 22
cpu 9 / 23
cpu 10 / 24
cpu 11 / 25
cpu 12 / 26
cpu 13 / 27
I am thinking of assigning the pairs from 3/17 onwards to VMs, always as whole pairs. So a beefy server would get 12/26 and 13/27, for instance. Another server might get 10/24 and 11/25. That way I am not splitting a core's threads across VMs. Not sure it is worth pinning the emulator stuff? Should I add the isolcpus bits to syslinux so that unRAID can only use, say, 0/14, 1/15 and 2/16 for itself and Dockers...
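The plan above (keep the first three sibling pairs for the host, hand out whole pairs to VMs from the top) can be sketched as follows. The helper names are hypothetical, and the N / N+14 pairing is taken from the list in the post rather than read from the hardware.

```python
def sibling_pairs(cores):
    """Pairs (core, core + cores), matching the 'cpu N / N+14' list above."""
    return [(c, c + cores) for c in range(cores)]


def allocate(pool, sizes):
    """Hand out whole sibling pairs from the end of the pool.

    sizes lists how many physical cores each VM should get.
    """
    remaining = list(pool)
    plans = []
    for n in sizes:
        if n < 1 or n > len(remaining):
            raise ValueError("invalid size or not enough cores left")
        plans.append(remaining[-n:])
        remaining = remaining[:-n]
    return plans


pairs = sibling_pairs(14)
host_pairs = pairs[:3]   # 0/14, 1/15, 2/16 stay with the host and Dockers
vm_pool = pairs[3:]      # 3/17 onwards go to VMs

# Comma-separated CPU list for an isolcpus= entry, if isolation is wanted.
isolcpus = ",".join(str(cpu) for cpu in sorted(c for pair in vm_pool for c in pair))
print("isolcpus=" + isolcpus)

for name, plan in zip(["beefy server", "second server"], allocate(vm_pool, [2, 2])):
    print(name, "->", plan)
```

Because pairs are never split, no VM ends up sharing a physical core with another VM.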
Felipe Avelar Posted June 8, 2017
Hello, I have an HP Z800 with 2 Xeon X5650s. My System Devices page shows the pairing like this:
cpu 0 <===> cpu 12
cpu 1 <===> cpu 13
cpu 2 <===> cpu 14
cpu 3 <===> cpu 15
cpu 4 <===> cpu 16
cpu 5 <===> cpu 17
cpu 6 <===> cpu 18
cpu 7 <===> cpu 19
cpu 8 <===> cpu 20
cpu 9 <===> cpu 21
cpu 10 <===> cpu 22
cpu 11 <===> cpu 23
I don't know how to tell the core from the HT... Help, please!
BRiT Posted June 9, 2017
That's because there is no difference between core and hyperthread. They are two sides of the same coin; it depends which side is face up. Both 0 and 12 are part of the same physical core. Whichever one is operating at the time can be considered the "core", and the one not currently operating can be considered the "hyperthread".
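On a Linux host the pairing shown above is exposed through sysfs, in `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`. A rough sketch of reading it follows; the function names are invented here, and it assumes the file holds a CPU list such as `0,12` or `0-1`.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Turn a kernel CPU list such as '0,12' or '0-1' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        elif part:
            cpus.append(int(part))
    return cpus


def physical_cores(sysfs="/sys/devices/system/cpu"):
    """Group logical CPUs by physical core using thread_siblings_list."""
    cores = set()
    for f in Path(sysfs).glob("cpu[0-9]*/topology/thread_siblings_list"):
        cores.add(tuple(parse_cpu_list(f.read_text())))
    return sorted(cores)


if __name__ == "__main__":
    # Each tuple is one physical core; neither member is "the" hyperthread.
    for core in physical_cores():
        print(core)
```

Every tuple printed is one physical core, and the two numbers in it are equivalent, which is the point made in the reply above.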
mathieuNls Posted June 26, 2017
Hi, I've followed most of the advice in this thread and I'm still falling short of native performance by about 17% on an i7-6700K without OC. For CPU single thread I score 393 (average of three tests with nothing else running), while the reference for my processor is 474 according to CPU-Z. The CPU multi-thread test is irrelevant, as not all the cores are assigned to the VM. Has anyone managed to get smaller losses (~5%)?
My config:
<name>Windows 10</name>
<uuid>d560712f-74e7-e728-d16e-7f42e6209349</uuid>
<metadata>
  <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
</metadata>
<memory unit='KiB'>20971520</memory>
<currentMemory unit='KiB'>20971520</currentMemory>
<memoryBacking>
  <nosharepages/>
  <locked/>
</memoryBacking>
<vcpu placement='static'>6</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='3'/>
  <vcpupin vcpu='1' cpuset='7'/>
  <vcpupin vcpu='2' cpuset='2'/>
  <vcpupin vcpu='3' cpuset='6'/>
  <vcpupin vcpu='4' cpuset='1'/>
  <vcpupin vcpu='5' cpuset='5'/>
  <emulatorpin cpuset='0,4'/>
</cputune>
<resource>
  <partition>/machine</partition>
</resource>
<os>
  <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
  <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
  <nvram>/etc/libvirt/qemu/nvram/d560712f-74e7-e728-d16e-7f42e6209349_VARS-pure-efi.fd</nvram>
</os>
<features>
  <acpi/>
  <apic/>
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <spinlocks state='on' retries='8191'/>
    <vendor id='none'/>
  </hyperv>
</features>
<cpu mode='host-passthrough'>
  <topology sockets='1' cores='3' threads='2'/>
</cpu>
<clock offset='localtime'>
  <timer name='hypervclock' present='yes'/>
  <timer name='hpet' present='no'/>
</clock>
Edited June 26, 2017 by mathieuNls
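One quick sanity check on a config like this is that the vCPU count, the number of `vcpupin` entries and the `sockets x cores x threads` product all agree. The sketch below does that with the standard library; the `<domain>` wrapper and the function name are assumptions added for illustration, and only the relevant elements of the posted config are included.

```python
import xml.etree.ElementTree as ET

# Relevant parts of the posted config, wrapped in a <domain> root (assumed).
DOMAIN = """<domain>
  <vcpu placement='static'>6</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='3'/>
    <vcpupin vcpu='1' cpuset='7'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='6'/>
    <vcpupin vcpu='4' cpuset='1'/>
    <vcpupin vcpu='5' cpuset='5'/>
    <emulatorpin cpuset='0,4'/>
  </cputune>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='3' threads='2'/>
  </cpu>
</domain>"""


def check_topology(xml_text):
    """Compare the vCPU count with the pin count and the topology product."""
    root = ET.fromstring(xml_text)
    vcpus = int(root.findtext("vcpu"))
    topo = root.find("cpu/topology")
    product = int(topo.get("sockets")) * int(topo.get("cores")) * int(topo.get("threads"))
    pinned = len(root.findall("cputune/vcpupin"))
    return {
        "vcpus": vcpus,
        "topology": product,
        "pinned": pinned,
        "consistent": vcpus == product == pinned,
    }


print(check_topology(DOMAIN))
```

For the posted config all three values are 6, so the shortfall against the CPU-Z reference is not explained by a mismatch here.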
SpaceInvaderOne Posted July 16, 2017
Hi. I have made a video about server tuning which covers isolating cores, emulator pin and other topics here. Hope it's useful.
Thomas van Dalen Posted September 3, 2017
Hi, I have a 1080 Ti and I have Hyper-V on. In the OP I see: Set Hyper-V to 'yes' unless you need it off for Nvidia GPUs. Can anyone tell me why that is? I have read it's been fixed with v6.2 and can be left enabled.
Except for the audio lag, everything else works fine, including gaming. I still suffer audio issues, mostly when Windows detects a USB plug-in or right after logging in. I already have Enable+ on both the GPU and the GPU's audio chip (GM200).
I have an i7-6700K:
cpu 0 / 4
cpu 1 / 5
cpu 2 / 6
cpu 3 / 7
syslinux.cfg:
append isolcpus=1,2,3,5,6,7 initrd=/bzroot,/bzroot-gui
VM XML:
<vcpu placement='static'>4</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <vcpupin vcpu='2' cpuset='6'/>
  <vcpupin vcpu='3' cpuset='7'/>
  <emulatorpin cpuset='0-1,4-5'/>
</cputune>
I believe that I have core 1 (cpu 0 / 4) for unRAID and Dockers (Docker names start with --cpuset=0,4). Did I understand this correctly?
core 2 = 1 / 5, reserved for later virtual machines
core 3 = 2 / 6, for the Win 10 gaming VM
core 4 = 3 / 7, for the Win 10 gaming VM
Edited September 3, 2017 by Thomas van Dalen
SpaceInvaderOne Posted September 4, 2017
17 hours ago, Thomas van Dalen said: Hi, I have a 1080 Ti and I have Hyper-V on. In the OP I see: Set Hyper-V to 'yes' unless you need it off for Nvidia GPUs. Can anyone tell me why that is? I have read it's been fixed with v6.2 and can be left enabled. Except for the audio lag, everything else works fine, including gaming. I still suffer audio issues, mostly when Windows detects a USB plug-in or right after logging in. I already have Enable+ on both the GPU and the GPU's audio chip (GM200). I have an i7-6700K: cpu 0 / 4, cpu 1 / 5, cpu 2 / 6, cpu 3 / 7. syslinux.cfg: append isolcpus=1,2,3,5,6,7 initrd=/bzroot,/bzroot-gui VM XML: <vcpu placement='static'>4</vcpu> <cputune> <vcpupin vcpu='0' cpuset='2'/> <vcpupin vcpu='1' cpuset='3'/> <vcpupin vcpu='2' cpuset='6'/> <vcpupin vcpu='3' cpuset='7'/> <emulatorpin cpuset='0-1,4-5'/> </cputune> I believe that I have core 1 (cpu 0 / 4) for unRAID and Dockers (Docker names start with --cpuset=0,4). Did I understand this correctly? core 2 = 1 / 5, reserved for later virtual machines; core 3 = 2 / 6, for the Win 10 gaming VM; core 4 = 3 / 7, for the Win 10 gaming VM.
Hi. No, you don't need to disable Hyper-V for Nvidia cards now.
IMO don't use isolcpus on quad-core servers. There are not enough cores for this. You can achieve what you want by careful pinning of containers and VMs. However, you are not pinning your containers correctly. This needs to be done in the extra parameters of the Docker container template.
You say you are leaving core 2 (1/5) for the future. Don't!! Worry about future VMs when you come to set them up, not now! Anyway, will you ever run 2 VMs at once, or just one at a time? Get what you have working first...
That core (1/5) is totally wasted. Because you have isolated three of your cores with isolcpus, they can only be used when you manually pin a process to them. The host can't touch them. The only process you have put on this core is emulatorpin. That is very light, so this core is really idle. Then unRAID and all your containers are on the first core (0/4). You're working that first core hard but giving the second core a day off. Use both of these cores for your containers, unRAID and the emulator pin.
Another thing to remember: when you have a normal bare-metal Windows PC and you want to play a game (I am guessing you are a gamer from the 1080 Ti!), you wouldn't expect great performance if you were also running HandBrake encoding video at the same time. You would stop that, play the game, and start it again later. It's the same with unRAID VMs and containers. We sometimes have to start and stop them. I explain how to set up different profiles for containers, so that when gaming a container will use one core and when not it can use all cores. It was in the second video in the server tuning series that I did. Maybe watch those for some ideas.
Edited September 4, 2017 by gridrunner
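As an illustration of the container pinning described above, the snippet below builds the `--cpuset-cpus` value that goes into a container's extra parameters, with one profile for gaming and one for idle time. It is only a sketch: the profile names and the helper are hypothetical, and the CPU numbers are the i7-6700K pairs from the thread.

```python
def cpuset_flag(cpus):
    """Build a docker --cpuset-cpus flag from an iterable of host CPU numbers."""
    return "--cpuset-cpus=" + ",".join(str(c) for c in sorted(set(cpus)))


# Hypothetical profiles for an 8-thread host with sibling pairs 0/4, 1/5, 2/6, 3/7.
PROFILES = {
    "gaming": [0, 4],     # containers squeezed onto one physical core
    "idle": range(8),     # containers may use every thread
}

for name, cpus in PROFILES.items():
    print(f"{name}: {cpuset_flag(cpus)}")
```

Switching a container between the two strings is what lets it stay out of the way while the gaming VM is running and use the whole CPU otherwise.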
Thomas van Dalen Posted September 4, 2017
3 hours ago, gridrunner said: IMO don't use isolcpus on quad-core servers. There are not enough cores for this. You can achieve what you want by careful pinning of containers and VMs. However, you are not pinning your containers correctly. This needs to be done in the extra parameters of the Docker container template.
I have seen the video and set the extra parameters of all Dockers to --cpuset-cpus=0,4. Dockers: deluge, filezilla, xeoma and zoneminder, nothing more. With append isolcpus=1,2,3,5,6,7 initrd=/bzroot,/bzroot-gui, unRAID won't use the other cores, right? So that should be good, right? Even the OP uses this line in syslinux.cfg for a quad-core setup.
gridrunner said: You say you are leaving core 2 (1/5) for the future. Don't!! Worry about future VMs when you come to set them up, not now!
I have also given that extra core 2 to Win 10. I am not sure if cores 3 and 4 alone are enough for gaming at ultra-high settings.
gridrunner said: Anyway, will you ever run 2 VMs at once, or just one at a time? Get what you have working first...
Probably not, no, but if 2 cores (cores 3 and 4) are enough for Windows ultra gaming settings, I'd like to be able to launch Kali next to it on a single core (core 2).
In the Windows XML I changed the line to <emulatorpin cpuset='0,4'/>, so now it uses cores 2, 3 and 4.
The video was useful. Sorry, I'm still having trouble understanding the big picture. And thank you for replying before. Anyway, I'm still experiencing the audio lag when USB devices get attached.
Edited September 4, 2017 by Thomas van Dalen
SpaceInvaderOne Posted September 4, 2017
44 minutes ago, Thomas van Dalen said: I have seen the video and set the extra parameters of all Dockers to --cpuset-cpus=0,4. Dockers: deluge, filezilla, xeoma and zoneminder, nothing more. With append isolcpus=1,2,3,5,6,7 initrd=/bzroot,/bzroot-gui, unRAID won't use the other cores, right? So that should be good, right? Even the OP uses this line in syslinux.cfg for a quad-core setup. I have also given that extra core 2 to Win 10. I am not sure if cores 3 and 4 alone are enough for gaming at ultra-high settings. Probably not, no, but if 2 cores (cores 3 and 4) are enough for Windows ultra gaming settings, I'd like to be able to launch Kali next to it on a single core (core 2). In the Windows XML I changed the line to <emulatorpin cpuset='0,4'/>, so now it uses cores 2, 3 and 4. The video was useful. Sorry, I'm still having trouble understanding the big picture. And thank you for replying before. Anyway, I'm still experiencing the audio lag when USB devices get attached.
Yes, isolcpus will isolate CPU cores from unRAID. But 'unRAID' doesn't just mean unRAID running its NAS duties. unRAID runs your Docker engine and VMs too. So yes, the cores are isolated from unRAID this way. If cores 2, 3 and 4 are isolated, unRAID can't use them itself when running Docker containers, NAS functions, VMs, etc.
It isn't so noticeable for the VMs because of how the unRAID VM manager handles creating VMs: you only have the option to pin cores. However, when not using the template manager, KVM can let the host scheduler place the vCPUs itself. You don't 'have' to pin cores. If you didn't, unRAID would handle a VM in the same way it would a Docker container that hasn't been pinned. The unRAID VM manager makes you manually pin the vCPU cores because in 99% of cases this gives the best results.
Anyway, even when you pin the vCPUs with the template, it only pins the vCPUs. As you know, it doesn't pin the emulator functions. You have to do this manually in the XML. So the problem can occur when you have isolated, say, 3 out of the 4 cores: unRAID can't use them, so its emulator functions will only be able to run on the remaining non-isolated core, because that's all unRAID has access to. Had the cores not been isolated, unRAID could have used all 4 cores and put that function where it sees best. Normally we pin the emulator function so it stays off the VM's cores. But when you have isolated cores it stays off those anyway, so there isn't any point pinning that function unless you want to pin it back onto the isolated cores.
Another reason not to isolate cores is that when the VM isn't running they are doing nothing. You may only be using a few Docker containers now, but later you may use something like Plex, and when it wants to transcode some streams, one core won't be enough. If all cores were free, unRAID could allow Plex to use more resources as it needs. However, if you didn't want it to, you would pin it to only the cores you want it to use.
To be honest, with server tuning there is no right way. No one size fits all. You have to mess around with various things and find what's best for you.
You say you have an audio lag when you plug USB devices into the server. So I assume that you have passed through a physical USB 3 controller to your VM? What do you mean by an audio lag? Do you mean the sound Windows makes when the device is plugged in is 'strange' and then everything is OK? Or that when you plug in a USB device, all the sound on the VM goes out of sync (such as a video playing)?
Also, when you post bits of your XML, it's best to just copy and paste all of it into the post so people can see everything that's in the VM. That way we can see if you are using your onboard motherboard sound or only your HDMI 1080 Ti sound, the USB passthrough, etc.
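The effect described above, where isolating cores shrinks what the host can schedule on, can be worked out from the `append` line itself. This is a rough sketch with made-up helper names, and it assumes `isolcpus=` holds a plain comma-separated list as in the thread (ranges are not handled).

```python
def isolated_cpus(append_line):
    """Return the set of CPUs named in isolcpus= (comma-separated form only)."""
    prefix = "isolcpus="
    for token in append_line.split():
        if token.startswith(prefix):
            return {int(c) for c in token[len(prefix):].split(",")}
    return set()


def host_cpus(total, append_line):
    """CPUs left for the host: NAS duties, Docker and unpinned emulator threads."""
    return sorted(set(range(total)) - isolated_cpus(append_line))


line = "append isolcpus=1,2,3,5,6,7 initrd=/bzroot,/bzroot-gui"
print("isolated:", sorted(isolated_cpus(line)))
print("left for the host:", host_cpus(8, line))
```

With the line from the thread only CPUs 0 and 4 remain, a single physical core for everything that is not explicitly pinned elsewhere.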
Thomas van Dalen Posted September 4, 2017
18 minutes ago, gridrunner said: Also, when you post bits of your XML, it's best to just copy and paste all of it into the post so people can see everything that's in the VM. That way we can see if you are using your onboard motherboard sound or only your HDMI 1080 Ti sound, the USB passthrough, etc.
Thank you for your time and explanation. My English is not great, and I am reading it over and over to try to understand more... I am still on the trial, but yes, when I get my last hiccups gone I am planning to buy the software. Please tell me if I have to remove the isolcpus=1,2,3,5,6,7 from append isolcpus=1,2,3,5,6,7 initrd=/bzroot,/bzroot-gui or not.
The audio lag is another issue and I do not feel it belongs in this thread, so I made a new post here:
SpaceInvaderOne Posted September 4, 2017
1 minute ago, Thomas van Dalen said: Thank you for your time and explanation. My English is not great, and I am reading it over and over to try to understand more... I am still on the trial, but yes, when I get my last hiccups gone I am planning to buy the software. Please tell me if I have to remove the isolcpus=1,2,3,5,6,7 from append isolcpus=1,2,3,5,6,7 initrd=/bzroot,/bzroot-gui or not. The audio lag is another issue and I do not feel it belongs in this thread, so I made a new post here:
Yes, I would remove the isolcpus setting, so change
append isolcpus=1,2,3,5,6,7 initrd=/bzroot
back to
append initrd=/bzroot
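The edit suggested above amounts to dropping one token from the `append` line. A minimal sketch of that transformation, with a hypothetical helper name and operating on the line as a string rather than on the real config file:

```python
def remove_isolcpus(append_line):
    """Drop any isolcpus=... token and keep the rest of the line unchanged."""
    return " ".join(t for t in append_line.split() if not t.startswith("isolcpus="))


before = "append isolcpus=1,2,3,5,6,7 initrd=/bzroot"
print(remove_isolcpus(before))
```

Other tokens on the line, such as the `initrd=` entry, pass through untouched, so the same function works for the `initrd=/bzroot,/bzroot-gui` variant as well.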
Thomas van Dalen Posted September 4, 2017
5 hours ago, gridrunner said: Yes, I would remove the isolcpus setting, so change append isolcpus=1,2,3,5,6,7 initrd=/bzroot back to append initrd=/bzroot
I have done that. Should I keep the <emulatorpin cpuset='0,4'/> on the Windows VM and --cpuset-cpus=0,4 in the Dockers, just to start with? Or should I leave those fields blank and only select the vCPUs from the template? Again, thank you.