• VM CPU utilization inconsistency


    Jerky_san
    • Retest Minor

    I have a gaming VM that runs on 16 cores (paired threads), with those cores isolated at boot. When mover or something similar is running in Unraid, I can 100% tell in game. It's as if Unraid is using those cores even though I have specifically told it not to. As a test, I ran mover and then ran "Dying Light", which usually runs at max FPS at 1080p with nearly everything turned to max. While mover is running there is constant stutter. I traced this to the first CPU core of the VM: Task Manager in the VM shows around 85% utilization, but Unraid says the core is at nearly 100%. It appears something is adding CPU usage on top of the VM's actual usage. When I stop the mover, Unraid shows the same figure as Task Manager in the VM. I would have expected Unraid to use one of the other cores I have (64 SMT threads / 32 physical cores in total), but it doesn't appear to want to.
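
One way to see how much of a host core's load is guest time and how much is host work is to sample /proc/stat twice and compare the counters. This is a minimal sketch, assuming a Linux host with the usual ten-column /proc/stat layout; the function names are illustrative and not part of Unraid.

```python
import time
from pathlib import Path


def parse_stat(text):
    """Map 'cpuN' to its list of jiffy counters from /proc/stat content."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate 'cpu' line and all non-CPU lines.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            counters[fields[0]] = [int(v) for v in fields[1:]]
    return counters


def breakdown(before, after):
    """Return busy/guest/host percentages between two counter samples.

    Column order: user nice system idle iowait irq softirq steal guest guest_nice.
    Guest time is already counted inside user/nice, so the total is the sum
    of the first eight columns only.
    """
    delta = [b - a for a, b in zip(before, after)]
    delta += [0] * (10 - len(delta))  # older kernels expose fewer columns
    total = sum(delta[:8])
    if total == 0:
        return {"busy": 0.0, "guest": 0.0, "host": 0.0}
    idle = delta[3] + delta[4]
    busy = 100 * (total - idle) / total
    guest = 100 * (delta[8] + delta[9]) / total
    return {"busy": busy, "guest": guest, "host": busy - guest}


def sample(cpu="cpu0", interval=1.0):
    """Sample one CPU over `interval` seconds (reads /proc/stat, Linux only)."""
    first = parse_stat(Path("/proc/stat").read_text())[cpu]
    time.sleep(interval)
    second = parse_stat(Path("/proc/stat").read_text())[cpu]
    return breakdown(first, second)
```

Calling `sample("cpu0")` on the host while mover runs would show whether the gap between the two readings is host-side time ("host") rather than time spent running the VM ("guest").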

    2990WX

    GTX 1070

    64GB 3200 CL15-15-15-35 RAM

    Edit:

    I realized I didn't show the CPU isolation settings used at boot:

     

    label unRAID OS
      menu default
      kernel /bzimage
      append isolcpus=0,1,2,3,4,5,6,7,32,33,34,35,36,37,38,39 vfio-pci.ids=1022:145f initrd=/bzroot
    label unRAID OS GUI Mode
      kernel /bzimage
      append isolcpus=0,1,2,3,4,5,6,7,32,33,34,35,36,37,38,39 vfio-pci.ids=1022:145f initrd=/bzroot,/bzroot-gui
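
To confirm that the kernel actually applied an isolcpus list like the one above, the requested CPUs can be compared with what the running kernel reports. This is a rough sketch, assuming a Linux kernel that exposes /sys/devices/system/cpu/isolated and plain "N" or "N-M" list entries; the helper names are made up for illustration.

```python
import re
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-7,32-39' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolcpus_from_cmdline(cmdline):
    """Return the CPUs requested via isolcpus= on a kernel command line."""
    match = re.search(r"\bisolcpus=(\S+)", cmdline)
    if not match:
        return set()
    # Drop optional flag words (for example 'domain') and keep the CPU entries.
    numeric = [p for p in match.group(1).split(",") if p and p[0].isdigit()]
    return parse_cpu_list(",".join(numeric))


def check_isolation():
    """Return (requested, active) CPU sets; reads /proc and /sys (Linux only)."""
    requested = isolcpus_from_cmdline(Path("/proc/cmdline").read_text())
    isolated = Path("/sys/devices/system/cpu/isolated")
    active = parse_cpu_list(isolated.read_text()) if isolated.exists() else set()
    return requested, active
```

If `requested - active` is empty, the list was applied. Note that isolcpus only keeps the scheduler from placing ordinary tasks on those CPUs; tasks that are explicitly pinned there, and some kernel threads and interrupts, can still run on them.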

     

     

    Pictures and diagnostics included

    Capture.PNG

    tower-diagnostics-20180905-1100.zip




    User Feedback

    Recommended Comments

    Alright, so I moved my gaming VM to cores much further down and ran mover again. It still spiked CPU 0, but it didn't affect my gaming VM since it was so far down. It appears the cpuset is being ignored for CPU 0.

    Link to comment
    I asked a short while ago what happens when core 0 is isolated. Theoretically unRAID should avoid the core, but now that I think about it, I don't think that's the case.

     

    This draws on my experience of isolating cores and then assigning the isolated cores to a docker. The docker would end up using ONE of the cores at 100%. A docker is part of what you would call "unRAID" (since it's part of the host), which means isolation doesn't actually prevent the host from using the core. My hypothesis is that a process doesn't know whether a core is isolated until it starts and checks the isolation list and/or is told "you naughty process, you can't use this". But since it already holds the core, it will carry on doing whatever it wants until it's done, as if it doesn't care. It is, however, prevented from using any other isolated core.
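
The hypothesis above could be checked by listing which host processes are actually allowed to run on a given core. This is a minimal sketch, assuming a Linux host where /proc/<pid>/status contains a Cpus_allowed_list field; the function names are illustrative only.

```python
from pathlib import Path


def expand(cpu_list):
    """Expand a list such as '0-7,32-39' into a set of CPU numbers."""
    cpus = set()
    for part in cpu_list.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def field(status_text, name):
    """Return the value of one 'Name: value' field from a status file, or ''."""
    for line in status_text.splitlines():
        if line.startswith(name + ":"):
            return line.split(":", 1)[1].strip()
    return ""


def allowed_cpus(status_text):
    """CPUs a process may run on, taken from its Cpus_allowed_list field."""
    return expand(field(status_text, "Cpus_allowed_list"))


def tasks_allowed_on(cpu):
    """Return sorted (pid, name) pairs whose affinity mask includes `cpu`."""
    hits = []
    for status in Path("/proc").glob("[0-9]*/status"):
        try:
            text = status.read_text()
        except OSError:
            continue  # the process exited while we were scanning
        if cpu in allowed_cpus(text):
            hits.append((int(status.parent.name), field(text, "Name")))
    return sorted(hits)
```

Running `tasks_allowed_on(0)` on the host while mover is active would list every process whose affinity still includes core 0. This only reports what is permitted, not what is currently running there.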

     

    So until this is fully resolved, the old advice to keep core 0 (and its SMT sibling) free still applies.

     

     

    That is complicated by the inconsistency in how core pairs are displayed in different environments.

    My 2990WX shows 0 paired with 1 (not 0 paired with 32 as on yours). Your Zenith X399 must be doing something very different.
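
To see which numbering a particular board uses, the SMT pairs can be read from the kernel's topology files rather than from the GUI. This is a short sketch, assuming a Linux host that exposes thread_siblings_list under /sys/devices/system/cpu; the function names are hypothetical.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a list such as '0,32' or '0-1' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def group_siblings(sibling_lists):
    """Collapse per-CPU sibling strings into a sorted list of unique groups."""
    groups = {tuple(sorted(parse_cpu_list(text))) for text in sibling_lists.values()}
    return sorted(groups)


def read_sibling_lists(base="/sys/devices/system/cpu"):
    """Read each CPU's thread_siblings_list from sysfs (Linux only)."""
    lists = {}
    for path in Path(base).glob("cpu[0-9]*/topology/thread_siblings_list"):
        cpu = int(path.parts[-3][3:])  # directory name 'cpu12' -> 12
        lists[cpu] = path.read_text()
    return lists
```

`group_siblings(read_sibling_lists())` returns one tuple per physical core, so a board that pairs 0 with 32 and one that pairs 0 with 1 are easy to tell apart before pinning a VM.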

    Config 5.JPG

    Link to comment
    Well, this is interesting. I wonder which board is right? I hope I'm not pinning my gaming VM to SMT siblings instead of real cores and the like. Maybe I should benchmark and see if it changes anything.

    Link to comment
    4 minutes ago, testdasi said:

    Shameless plug: I already did some testing in my build topic. :D

     

     

    Thanks, I'm reading it now and you make me sad... your IOMMU groups are far superior to mine ;-; Now I'm wishing I had bought that board lol

    Link to comment