Isolating CPU Not Working



Good Evening,

 

I've followed @gridrunner's excellent guide on maximizing performance in both unRaid and in a guest VM.  However, I'm seeing that Plex is not respecting the fact that the CPUs are isolated, and it often uses them when transcoding... this brings my VM to a screeching halt.  It was my understanding that isolating the CPUs meant that nothing could use them, with the exception of any VM you assigned them to... am I mistaken?

 

Please see my core assignments and my VM XML below.  Thanks in advance.

 

~Spritz

 

CPU Thread Pairings

Pair 1:	cpu 0 / cpu 16
Pair 2:	cpu 1 / cpu 17
Pair 3:	cpu 2 / cpu 18
Pair 4:	cpu 3 / cpu 19
Pair 5:	cpu 4 / cpu 20
Pair 6:	cpu 5 / cpu 21
Pair 7:	cpu 6 / cpu 22
Pair 8:	cpu 7 / cpu 23
Pair 9:	cpu 8 / cpu 24
Pair 10:	cpu 9 / cpu 25
Pair 11:	cpu 10 / cpu 26
Pair 12:	cpu 11 / cpu 27
Pair 13:	cpu 12 / cpu 28
Pair 14:	cpu 13 / cpu 29
Pair 15:	cpu 14 / cpu 30
Pair 16:	cpu 15 / cpu 31

 

<domain type='kvm'>
  <name>Brawn</name>
  <uuid>aa4f920a-0dfe-d619-f00b-46c900a1055c</uuid>
  <description>Gaming PC</description>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='18'/>
    <vcpupin vcpu='4' cpuset='3'/>
    <vcpupin vcpu='5' cpuset='19'/>
    <vcpupin vcpu='6' cpuset='4'/>
    <vcpupin vcpu='7' cpuset='20'/>
    <emulatorpin cpuset='15,31'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.10'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/aa4f920a-0dfe-d619-f00b-46c900a1055c_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/mnt/disks/Brawn_SSD_1/Brawn/vdisk1.img'/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/Data/OS_ISOs/Windows_10.iso'/>
      <target dev='hda' bus='ide'/>
      <readonly/>
      <boot order='2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/Data/OS_ISOs/virtio-win-0.1.141-1.iso'/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:40:f9:bb'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </memballoon>
  </devices>
</domain>

Oh, and my syslinux config:

default menu.c32
menu title Lime Technology, Inc.
prompt 0
timeout 50
label unRAID OS
  menu default
  kernel /bzimage
  append isolcpus=1,2,3,4,17,18,19,20 vfio-pci.ids=1b6f:7052 initrd=/bzroot

 

Link to comment
15 minutes ago, 1812 said:

 

And you're running Plex in a Docker container? A bit odd.

 

Without addressing the root problem, you could specify CPU pinning in Docker until the real solution is found: https://www.reddit.com/r/unRAID/comments/6hhvh5/cpu_pinning_to_specific_dockers/

 

Yup, using the linuxserver.io container.

 

Thanks for the suggestion; I had thought of doing the same thing myself.
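
For anyone finding this later, the pinning that Reddit thread describes boils down to Docker's --cpuset-cpus flag; a minimal sketch (the container name and core list here are illustrative, not my actual settings):

# Restrict a container to host cores 5 and 21 (illustrative values);
# this only limits the container -- it does not reserve those cores.
docker run -d --name=plex --cpuset-cpus=5,21 linuxserver/plex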

 

~Spritz~

Link to comment

So I've pinned both the Plex and NZBGet containers to specific CPUs, and that seems to have put a band-aid on the issue.  However, unRaid (and I assume Docker) is still using those supposedly isolated cores for other tasks, as even with the VM powered off those cores see some activity.  For the moment I can live with that, as whatever is hitting them is not a heavy hitter.

 

All that said, I'd still like to figure out why this isn't functioning as expected.  When I run the command (which escapes me at the moment) to verify that the CPUs are isolated, it returns the expected result.  I can also see the system parsing the isolated-CPU line during boot, without error.  Yet when I look at cAdvisor (and I don't know if this is accurate or not), it shows all CPUs as available for the containers to use.
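
For anyone following along, these are the sorts of checks I mean (not necessarily the exact command I used; the sysfs file only exists on newer kernels):

# Confirm the kernel actually received the isolcpus= parameter at boot
cat /proc/cmdline

# On newer kernels, list the cores the scheduler treats as isolated
cat /sys/devices/system/cpu/isolated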

 

I'm kind of at a loss on this one.  Any assistance would be appreciated.

 

Thanks!

 

~Spritz

Link to comment
31 minutes ago, Spritzup said:

So I've pinned both the Plex and NZBGet containers to specific CPUs, and that seems to have put a band-aid on the issue.  However, unRaid (and I assume Docker) is still using those supposedly isolated cores for other tasks,

With a container, pinning the app to a specific core does not mean that the core is for the container's exclusive use.  It only means that the container is limited to running on that core.  Everything else in the system still has access to that core.
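
A quick way to see this for yourself, assuming a box without isolcpus set (PID 1 is just a convenient example; taskset ships with util-linux):

# Pinning a container never shrinks any other process's affinity mask.
# Any unpinned host process can still run on the "pinned" cores:
taskset -cp 1
# pid 1's current affinity list: 0-31   (example output on a 32-thread box)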

 

33 minutes ago, Spritzup said:

Yet when I look at cAdvisor (and I don't know if this is accurate or not),

It is

 

You really want to peruse the Docker FAQ, specifically this post on how to fine-tune CPU pinning for Docker applications.

Link to comment

@Squid Thanks for the reply.  Unfortunately I think there is some confusion.  The issue is that Docker (and I assume by extension unRaid) is not respecting the "isolcpus" parameter in my syslinux file.  What should have been happening is that 8 cores would be isolated for VM use, and everything else would run on the remaining 24.

 

However, that did not appear to be happening, as I could observe both NZBGet and Plex using the supposedly isolated CPUs, thus bringing my VM to a screeching halt.  As a band-aid, I've pinned CPUs for specific container use, but this is not ideal IMO.

 

So TL;DR - the "isolated cores" in this case are those isolated for a VM using the "isolcpus" parameter.  The Docker CPU pinning is a band-aid, but it is working as expected.

 

~Spritz

Link to comment
2 minutes ago, Squid said:

I was too lazy to read the entire thread.  Only read the last couple posts.

 

haha, I've had days like that as well.  If you have any insight, I'd appreciate it.  I did read the previous link that you provided, and it was an interesting read, thanks for that :)

 

~Spritz

Link to comment

I don't run my system with any isolated cores, and as I said in the post, I'm not sure exactly how Docker pinning works in conjunction with CPU isolation (i.e., do the cores get renumbered, or what?).  Been waiting for someone to really experiment and figure that one out.
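
One quick experiment along those lines (container name illustrative): read the allowed-CPU list from inside a pinned container. The kernel reports host core numbers there, so nothing appears to get renumbered:

# From inside a container started with --cpuset-cpus=5,21:
docker exec plex grep Cpus_allowed_list /proc/self/status
# Cpus_allowed_list:   5,21   (host numbering, example output)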

Link to comment
  • 4 months later...

I am also seeing this situation. 

 

What's the best way/command to see who is using the CPU?

I assume commands like htop/top can show the process name, but I want to know at a higher level,

like the name of the Docker container or the VM.

 

My configuration for the 32 cores I have was to dedicate some specifically to VMs and others to unRaid and Docker containers:

 

append intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1  isolcpus=2-5,8-14,18-21,24-30 vfio-pci.ids=8086:244e,1033:0194,1b73:1100 modprobe.blacklist=i2c_i801,i2c_smbus initrd=/bzroot

 

I have this line in the Extra Parameters field of all my Docker containers:

--cpuset-cpus=0,1,6,7,16,17,22,23 --log-opt max-size=50m --log-opt max-file=1
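
To confirm the extra parameter actually took effect on a running container, docker inspect can report the applied cpuset (container name illustrative):

docker inspect --format '{{.HostConfig.CpusetCpus}}' plex
# 0,1,6,7,16,17,22,23   (expected output if the flag applied)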

 

Link to comment

Netdata does report all that under the Applications section.  IIRC, there may have to be a slight change to the template to have it report the name of the app instead of its docker ID.  Check the support thread.  Or for a simpler view, try the cAdvisor app.
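
For a quick one-off from the shell, something along these lines also works (the core number and PID here are illustrative):

# List processes whose last-run core (the PSR column) was core 3:
ps -eo pid,psr,comm | awk '$2 == 3'

# Map a PID back to its container via its cgroup path:
PID=1234                # illustrative; take one from the output above
cat /proc/$PID/cgroup   # a docker/<container-id> entry names the container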

Link to comment
  • 2 months later...

Bumping this because I'm seeing this CPU usage "leakage" as well.

 

It appears some Docker containers respect the isolcpus setting, and others only sort of do...

 

For example:

 

I have 12 threads, isolating 1-5 and 7-11, leaving 0 and 6.

 

Netdata has logged time on CPU 7 even though it shouldn't, and shows access to more cores than just 0 and 6:

 

[Screenshot: Netdata CPU usage chart showing time on cores beyond 0 and 6]

 

 

ZeroTier only has 0 and 6:

 

[Screenshot: ZeroTier CPU usage confined to cores 0 and 6]

 

Nextcloud, MariaDB, Let's Encrypt, and DuckDNS also only report usage on 0 and 6.

 

 

But CloudBerry has also snuck time in on CPU 7, and shows usage on 2 and 5 as well:

 

[Screenshot: CloudBerry CPU usage showing time on cores 2, 5, and 7]

 

 

There is no CPU pinning specified for any of these containers, so clearly something is allowing them to move outside of the isolated CPU set.
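
If anyone wants to chase this further, one sketch (container name illustrative; taskset ships with util-linux) is to check the actual affinity of a suspect container's main process:

# Get the host PID of the container's init process, then its affinity:
pid=$(docker inspect --format '{{.State.Pid}}' cloudberry)
taskset -cp "$pid"
# If isolcpus were fully honored, the isolated cores should be absent here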

 

 

 

To verify it isn't just a problem with one server, I checked another, and sure enough, CPU 3, which is isolated in the syslinux.cfg, is being used by Netdata again:

[Screenshot: Netdata CPU usage on isolated CPU 3 on the second server]

 

 

I have Plex on this server, but it appears to only have access to the cores it should.

 

 

 

Now, these amounts of usage are not enough to impact my performance/usability, but they do support the idea that some containers are finding their way onto other threads.

 

 

Link to comment
