[6.6.0-rc1] VM performance on Threadripper 2990WX


    SpaceInvaderOne
    • Solved

    I am having some problems with this version on a server with a Threadripper 2990WX, 64 GB of RAM, a GTX 1080 GPU and a Gigabyte X399 Gaming 7 motherboard with BIOS version F10 (AGESA 1.1.0.0).

     

    On 6.5.3 my boot times of vms are very fast and everything works as expected.

     

    1.  ubuntu mate  with 32 cores (64 threads), 32 gigs ram .... passed through  gtx 1080 

    time from clicking start to desktop ......43 seconds

    2. windows 10 with 16 cores(32 threads), 32 gigs ram ...passed through nvme drive, gtx 1080, USB controller

    time from clicking start to desktop ..... 33 seconds

    3. windows 10 6 cores (12 threads) 4 gigs ram...passed through gtx1080

    time from clicking start to boot ......25 seconds.

     

    On 6.6.0-rc1

    1.  ubuntu mate  with 32 cores (64 threads), 32 gigs ram .... passed through  gtx 1080 

    time from clicking start to desktop .......4 minutes 5 seconds

    2. windows 10 with 16 cores(32 threads), 32 gigs ram ...passed through nvme drive, gtx 1080, USB controller

    time from clicking start to desktop ..... after 10 minutes forced stopped vm

    3. windows 10 6 cores (12 threads) 4 gigs ram...passed through gtx1080

    time from clicking start to boot .......2 minutes and 19 seconds
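
To put the two sets of timings side by side, here is a rough sketch that works out the slowdown factor for each VM. The dictionary keys are just illustrative labels, and the 16-core Windows VM is entered as 600 seconds because it was force-stopped at 10 minutes, so its factor is only a lower bound.

```python
# Boot times in seconds, taken from the figures reported above.
BOOT_653 = {"ubuntu_mate_32c": 43, "win10_16c": 33, "win10_6c": 25}

# The 16-core Windows VM never reached the desktop on 6.6.0-rc1; it was
# force-stopped after 10 minutes, so 600 s is a lower bound, not a result.
BOOT_660RC1 = {
    "ubuntu_mate_32c": 4 * 60 + 5,
    "win10_16c": 10 * 60,
    "win10_6c": 2 * 60 + 19,
}


def slowdown(before, after):
    """Return after/before per VM, rounded to one decimal place."""
    return {vm: round(after[vm] / before[vm], 1) for vm in before}


if __name__ == "__main__":
    for vm, factor in slowdown(BOOT_653, BOOT_660RC1).items():
        print(f"{vm}: {factor}x slower")
```

On these numbers the two VMs that did boot are each a little under 6 times slower, and the one that did not finish is at least 18 times slower.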

     

    Also, benchmarks are suffering about a 50% drop in single-thread scores and a 25% drop in multi-core scores (using Geekbench on the Ubuntu VM).

    Is anyone else having issues with Threadripper and VMs?

     

    ultramagnus-diagnostics-20180904-2110.zip







    There was a kernel patch we used in the 6.5.3 release that addressed an issue with Threadripper 'reset'.  This patch no longer worked with the 4.17 kernel in 6.6.0 and, upon investigating, AMD provided an updated BIOS that addresses that issue.

     

    Probably this is a different issue.  I think 4.18 kernel has specific patches for Threadripper, but the Intel 10Gbit ixgbe out-of-tree driver does not compile with 4.18 and we have been waiting for an update:

    https://sourceforge.net/p/e1000/bugs/625/

     

    This is the problem we face in using OOT drivers: often we get stuck on a specific kernel.

     

    If we don't see an updated driver soon then we will revert to the stock ixgbe driver and move on to the 4.18 kernel.

    Link to comment

     

     

    Ah that makes sense. I did read that you had removed the reset patch and wondered why. 

    I did wonder if my issue could be to do with the SEV platform security issue between the newly released AGESA and newer Linux kernels, where the kernel tries to enable features that exist only in AMD EPYC CPUs and not in Threadripper.

    But I guess if that was the case Unraid wouldn't boot at all?

    Anyway, this server is going back to 6.5.3 for now and I will enjoy all the 6.6.0 features on my other server where it works fine :)

    Link to comment

    Silly one, but since you quote 50% and 25% reductions in benchmark scores, have you looked to see what CCXs your cores are mapped to? If they're on one of the secondary CCXs (that don't have direct access to the IMC) then that might be part of the issue.

     

    (Unless of course you've got much better scores changing nothing but the unRAID version back to 6.5.3. If that's the case, then feel free to ignore the above)
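
One way to check the mapping suggested above is sketched below: it groups host CPUs by NUMA node from the output of `lscpu -p=CPU,NODE` and reports which nodes a VM's pinned CPUs land on. The sample text and the pinned CPU list are made up for illustration and are not taken from the diagnostics in this thread. Which nodes actually have local memory on a given board has to be checked separately (for example with `numactl --hardware`, if it is installed).

```python
from collections import defaultdict


def cpus_by_node(lscpu_output):
    """Group CPU ids by NUMA node from `lscpu -p=CPU,NODE` style text."""
    nodes = defaultdict(list)
    for line in lscpu_output.splitlines():
        line = line.strip()
        # Lines starting with '#' are the header comments lscpu prints.
        if not line or line.startswith("#"):
            continue
        cpu, node = line.split(",")[:2]
        if node == "":
            # No NUMA information reported for this CPU; skip it.
            continue
        nodes[int(node)].append(int(cpu))
    return {node: sorted(cpus) for node, cpus in sorted(nodes.items())}


def nodes_used(pinned_cpus, mapping):
    """Return the sorted NUMA nodes that contain any of the pinned CPUs."""
    pinned = set(pinned_cpus)
    return sorted(node for node, cpus in mapping.items() if pinned & set(cpus))


# Illustrative sample only: a tiny 4-CPU, 2-node layout.
SAMPLE = """# CPU,NODE
0,0
1,0
2,1
3,1
"""

if __name__ == "__main__":
    mapping = cpus_by_node(SAMPLE)
    print(mapping)
    print(nodes_used([1, 2], mapping))
```

If a VM's pinned CPUs span several nodes, or sit entirely on nodes without local memory, that would be worth comparing between the two Unraid versions before blaming the release itself.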

    Link to comment

    I have a 2990WX as well. Boot time for a Windows 10 VM with 16 cores and 32 GB of RAM, with a GTX 1070, audio, NVMe and a USB controller passed through, is less than 1 minute. Probably closer to 30 seconds or less. Running 6.6.0-rc1.

    Link to comment

    Hi @eschultz, sorry for the late reply in testing this. Just got back from my summer hols. Tested this morning with rc3. Same result. VM performance very poor and startup times very, very slow. Gone back to 6.5.3 again.

    @Jerky_san what motherboard do you use in your system please? I am using a Gigabyte X399 AORUS Gaming 7 motherboard with the F10 BIOS (with AGESA 1.1.0.0 for Threadripper 2 support).

    Link to comment
    13 hours ago, gridrunner said:

    Hi @eschultz, sorry for the late reply in testing this. Just got back from my summer hols. Tested this morning with rc3. Same result. VM performance very poor and startup times very, very slow. Gone back to 6.5.3 again.

    @Jerky_san what motherboard do you use in your system please? I am using a Gigabyte X399 AORUS Gaming 7 motherboard with the F10 BIOS (with AGESA 1.1.0.0 for Threadripper 2 support).

    Asus Zenith  x399

    Link to comment

    I'm using an MSI X399 and @Jerky_san is using an Asus Zenith.  Both our latest BIOSes are based on AGESA 1.1.0.1A while your latest Gigabyte BIOS is based on AGESA 1.1.0.0.  Couldn't find a decent changelog for AGESA to compare, though.

    Link to comment

    Just to chime in.

    On my Threadripper 1920X I am not seeing any issues on an ASRock Taichi running BIOS 3.30.

     

    Performance on my win10 vm is the same and no slow start up.

     

     

     

     

    Link to comment

    I have a Gigabyte X399 Designare running the F10 BIOS with a Threadripper 2950X. Windows 10 performance on 6.6.0 rc2 was very poor and I rolled back to 6.5.3. I haven't tried rc3 but I don't see anything in the changelog that makes me think this would be resolved.

     

    What I was seeing was extremely slow startups, including a significant delay until seeing the Tianocore logo, and then the spinning dots while Windows is booting would sort of stutter. Boot up took a total of around 5 minutes compared to 30ish seconds on 6.5.3. Performance while in the VM was similar - lots of stutters and delays. Couldn't see anything significant of interest in Task Manager.

    Link to comment

    Given Gigabyte's horrendous track record in releasing BIOS updates, is there any chance of putting the dirty Threadripper patch back to see if its exclusion was the cause of the slowdown?

     

    Only the Gigabyte X399 Aorus Extreme got 1.1.0.1A and it was in early August. I have no hope that it will ever make it to other boards.

    Edited by testdasi
    Link to comment

    It's a possibility the issue is the core count.

     

    So far the only people with issues have the 32-core version.

    I know Linux/Windows and driver vendors are scrambling to support server core count CPUs in consumers' hands :)

    Edited by Dazog
    Link to comment
    55 minutes ago, Dazog said:

    It's a possibility the issue is the core count.

     

    So far the only people with issues have the 32-core version.

    I know Linux/Windows and driver vendors are scrambling to support server core count CPUs in consumers' hands :)

    @rinseaid reported the same problem with a 2950X, so it probably is not due to core count. More likely it is AGESA, since those with problems use Gigabyte mobos, which are still on 1.1.0.0 (instead of the newer 1.1.0.1A).

    @gridrunner given the problem is also reported on the 2950X, perhaps it's better to change the bug title to 2nd gen Threadripper, isn't it?

    Link to comment

    I meant between rc2 and rc3. Based on gridrunner's report of rc3 and the similarity in our hardware I'm not confident but will test and report results.

    Link to comment
    7 hours ago, rinseaid said:

    I have a Gigabyte X399 Designare running the F10 BIOS with a Threadripper 2950X. Windows 10 performance on 6.6.0 rc2 was very poor and I rolled back to 6.5.3. I haven't tried rc3 but I don't see anything in the changelog that makes me think this would be resolved.

     

    What I was seeing was extremely slow startups, including a significant delay until seeing the Tianocore logo, and then the spinning dots while Windows is booting would sort of stutter. Boot up took a total of around 5 minutes compared to 30ish seconds on 6.5.3. Performance while in the VM was similar - lots of stutters and delays. Couldn't see anything significant of interest in Task Manager.

    I was seeing this exact behavior with an early internal 6.6.0 build on my MSI X399 Threadripper, but that was before I upgraded the BIOS, and the AGESA version installed was really old at the time.

    Link to comment
    22 hours ago, Dazog said:

    Just to chime in.

    On my threadripper 1920x I am not seeing any issues on a asrock taichi running bios 3.30.

     

    Performance on my win10 vm is the same and no slow start up.

    It looks like ASRock BIOS 3.20 was AGESA 1.1.0.0 but there is no mention of an AGESA upgrade in 3.30.  If it is still AGESA 1.1.0.0, that would go against my suspicion that the older AGESA version is causing the slowdown with the newer kernel found in Unraid 6.6.0.

    Link to comment
    1 hour ago, eschultz said:

    It looks like ASRock BIOS 3.20 was AGESA 1.1.0.0 but there is no mention of an AGESA upgrade in 3.30.  If it is still AGESA 1.1.0.0, that would go against my suspicion that the older AGESA version is causing the slowdown with the newer kernel found in Unraid 6.6.0.

    3.30 is still 1.1.0.0

     

    I suspect it's a Threadripper 2 issue where a piece of software Unraid uses doesn't like the 32-core CPU.

     

    Link to comment
    48 minutes ago, Dazog said:

    3.30 is still 1.1.0.0

     

    I suspect it's a Threadripper 2 issue where a piece of software Unraid uses doesn't like the 32-core CPU.

     

    No, I don't think it would be down to the core count, because I have good performance on 6.5.3 with 32 cores. If it were the core count then I guess I would see the performance issues on 6.5.3 as well?

    I still have my 1950X, so I will swap out the 2990WX tomorrow and see if the problem continues with the gen 1 Threadripper.

    Link to comment

    It's not core count, as I have a 2990WX and it starts VMs quickly. My only problem is I still get random stuttering in games. I have everything isolated and such but still get random stutters. Ironically, on my 8-core 1700 this stuttering didn't happen, and I only had 4 physical cores for my gaming VM. My gaming VM currently has 8-10 physical cores (tried 10 just to see if more cores made a difference). The 8 cores are on the same die and I let everything else run on another 8 cores, effectively not using half the processor. Still stutters.

    Link to comment

    Just suggesting that there may be a regression in QEMU or the like, since 6.6 uses a newer version.

     

    Not many people have the 32 core variant, including developers of these applications. Shrugs.

    Link to comment




