testdasi

The Black Bear - Threadripper 2990WX build


Posted (edited)

After a few months of researching, prepping my data, persuading she-who-must-be-obeyed, etc. I finally pulled the trigger when Amazon finally had my motherboard in stock. I had an i7-5820K (overclocked to 4GHz) and a Xeon E3-1245 v5 server (ITX case), but that was more out of necessity: I couldn't afford dual Xeons to merge them and still have sufficient performance. The 2990WX came out at just the right price point to make the idea possible.

 

OS at time of building:  6.4.0

OS Current: 6.5.3
CPU: AMD Ryzen Threadripper 2990WX

Heatsink:  Noctua NH-U14S TR4-SP3 (in push-pull - I nicked a 15mm fan from my old NH-D14 workstation cooler)
Motherboard: Gigabyte X399 Designare EX (f10 BIOS)
RAM: 64GB Corsair Vengeance LPX 2666MHz + 32GB GSkill Ripjaw 4 2800MHz (nicked from the old workstation)
Case: Silverstone Fortress FT02 (old workstation case)
Drive Cage:  Evercool Dual (2x5.25 -> 3x3.5 with 80mm fan) - need this to mount the temp drive + 2x2.5" SSD for my workstation VM
Power Supply: Corsair HX850 (8+ years old!)

GPU: Zotac GTX 1070 Mini for main VM, Zotac GT 710 PCIe x1 for unRAID / whatever else

 

Parity Drive:  10TB Seagate Iron Wolf

Array: 8TB Seagate Archive, 8TB Seagate NAS, 8TB Hitachi HE8, 6TB Western Digital Black

Cache: 2x 1.2TB Intel 750 NVMe

Temp drive: 3TB Toshiba (a rebadged Hitachi Deskstar)

VM-only: 512GB Samsung SM951 M.2 (the AHCI variety), 2TB Samsung Evo 850 2.5" SSD, 2TB Crucial MX300 2.5" SSD

 

Total Hard Drive Array Capacity: 30TB

Total Cache Drive Array Capacity: 1.2TB

 

Primary Use: Main video/photo editing workstation + various unRAID stuff that people do on unRAID
Likes: It takes pretty much anything thrown at it and spits the output back in my face.
Dislikes: It's heavy AF!

Future Plans: Need to eBay off the old stuff so I don't upset she-who-must-be-obeyed

 

Some tips:

  • If you build from scratch, make sure to get a motherboard that can upgrade its BIOS without a CPU installed (and familiarise yourself fully with the process). Even Linus forgot to update his BIOS before his 2990WX build.
  • Windows isn't yet optimised for this many threads/cores. My testing shows anything more than 32 processes leads to a DROP in performance. Diminishing returns also mean going beyond 24-28 processes yields almost no improvement in real-world performance. (A process = 1 thread/core, e.g. 24 processes = 12 cores with SMT or 24 cores without SMT.)
  • Due to the Ryzen design (essentially gluing two 4-core units into a die and then gluing 2/4 dies into a CPU), a 24-core config performs better than 28-core, at least with my workload on Windows. 32-core without SMT is fastest, but only on bare metal.
  • Perhaps placebo, but I found spreading the cores out evenly seems to improve performance. I guess every 8 logical cores = 1 unit (CCX), so if you have an 8-core VM, for example, put 1 core in each unit.
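The last tip can be sanity-checked with a little shell arithmetic. This is a hypothetical helper, not anything unRAID ships: it assumes the numbering described above, where 8 consecutive logical core numbers map to 1 CCX.

```shell
# For each logical core in a candidate VM pinning, print which CCX it
# lands on, assuming 8 consecutive logical core numbers = 1 CCX.
pinning="3 5 7 11 13 15 19 21 23"   # example pinning to inspect
for cpu in $pinning; do
  echo "logical core $cpu -> CCX $(( cpu / 8 ))"
done
```

Counting how many cores land on each CCX quickly shows whether a pinning is spread evenly or lopsided.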

 

Finally, I would like to thank @gridrunner, @eschultz, @Jcloud, @methanoid, @tjb_altf4, @jbartlett, @guru69 for their kind advice during my research process. 👍

Front.jpg

Interior Side.jpg

 

Below are my IOMMU groups:

The motherboard's native IOMMU grouping is actually very good (e.g. it can pass through USB and the main GPU). I still turned on ACS override (with multifunction) because the LAN ports + wifi + the PCIe 2.0 slots are all in the same group, and I plan to pass one of them through to another VM.

And this Gigabyte motherboard is amazing in that it lets you pick which slot acts as the primary GPU, so you can put a good GPU in the fast slot and dump a cheapo in the slowest slot (in my case a single-slot PCIe x1 GT 710) for unRAID. No need to use a vBIOS either (but I use one regardless).
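For reference, a listing like the one below can be produced on any stock Linux box with a sysfs loop along these lines (unRAID has its own Tools page for this; this is just the generic approach):

```shell
# List every PCI device grouped by its IOMMU group, reading sysfs.
for dev in /sys/kernel/iommu_groups/*/devices/*; do
  [ -e "$dev" ] || continue                 # skip if IOMMU is disabled
  group=${dev#/sys/kernel/iommu_groups/}    # strip the sysfs prefix...
  group=${group%%/*}                        # ...leaving just the group number
  addr=${dev##*/}                           # PCI address, e.g. 0000:07:00.0
  echo "IOMMU group $group: $(lspci -nns "$addr")"
done
```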

IOMMU group 0:	[1022:1452] 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 1:	[1022:1453] 00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
IOMMU group 2:	[1022:1453] 00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
IOMMU group 3:	[1022:1452] 00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 4:	[1022:1452] 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 5:	[1022:1452] 00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 6:	[1022:1452] 00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 7:	[1022:1454] 00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
IOMMU group 8:	[1022:1452] 00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 9:	[1022:1454] 00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
IOMMU group 10:	[1022:790b] 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 59)
	[1022:790e] 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
IOMMU group 11:	[1022:1460] 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
	[1022:1461] 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
	[1022:1462] 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
	[1022:1463] 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
	[1022:1464] 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
	[1022:1465] 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
	[1022:1466] 00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
	[1022:1467] 00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
IOMMU group 12:	[1022:1460] 00:19.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
	[1022:1461] 00:19.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
	[1022:1462] 00:19.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
	[1022:1463] 00:19.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
	[1022:1464] 00:19.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
	[1022:1465] 00:19.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
	[1022:1466] 00:19.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
	[1022:1467] 00:19.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
IOMMU group 13:	[1022:1460] 00:1a.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
	[1022:1461] 00:1a.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
	[1022:1462] 00:1a.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
	[1022:1463] 00:1a.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
	[1022:1464] 00:1a.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
	[1022:1465] 00:1a.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
	[1022:1466] 00:1a.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
	[1022:1467] 00:1a.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
IOMMU group 14:	[1022:1460] 00:1b.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
	[1022:1461] 00:1b.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
	[1022:1462] 00:1b.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
	[1022:1463] 00:1b.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
	[1022:1464] 00:1b.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
	[1022:1465] 00:1b.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
	[1022:1466] 00:1b.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
	[1022:1467] 00:1b.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
IOMMU group 15:	[1022:43ba] 01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset USB 3.1 xHCI Controller (rev 02)
IOMMU group 16:	[1022:43b6] 01:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset SATA Controller (rev 02)
IOMMU group 17:	[1022:43b1] 01:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset PCIe Bridge (rev 02)
IOMMU group 18:	[1022:43b4] 02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
IOMMU group 19:	[1022:43b4] 02:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
IOMMU group 20:	[1022:43b4] 02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
IOMMU group 21:	[1022:43b4] 02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
IOMMU group 22:	[1022:43b4] 02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
IOMMU group 23:	[8086:1539] 04:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
IOMMU group 24:	[8086:24fd] 05:00.0 Network controller: Intel Corporation Wireless 8265 / 8275 (rev 78)
IOMMU group 25:	[8086:1539] 06:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
IOMMU group 26:	[10de:128b] 07:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
IOMMU group 27:	[10de:0e0f] 07:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
IOMMU group 28:	[8086:0953] 08:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01)
IOMMU group 29:	[1022:145a] 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 145a
IOMMU group 30:	[1022:1456] 09:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor
IOMMU group 31:	[1022:145f] 09:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] USB 3.0 Host controller
IOMMU group 32:	[1022:1455] 0a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 1455
IOMMU group 33:	[1022:7901] 0a:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
IOMMU group 34:	[1022:1457] 0a:00.3 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller
IOMMU group 35:	[1022:1452] 20:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 36:	[1022:1452] 20:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 37:	[1022:1452] 20:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 38:	[1022:1452] 20:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 39:	[1022:1452] 20:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 40:	[1022:1454] 20:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
IOMMU group 41:	[1022:1452] 20:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 42:	[1022:1454] 20:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
IOMMU group 43:	[1022:145a] 21:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 145a
IOMMU group 44:	[1022:1456] 21:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor
IOMMU group 45:	[1022:1455] 22:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 1455
IOMMU group 46:	[1022:1452] 40:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 47:	[1022:1453] 40:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
IOMMU group 48:	[1022:1453] 40:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
IOMMU group 49:	[1022:1452] 40:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 50:	[1022:1452] 40:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 51:	[1022:1453] 40:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
IOMMU group 52:	[1022:1452] 40:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 53:	[1022:1452] 40:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 54:	[1022:1454] 40:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
IOMMU group 55:	[1022:1452] 40:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 56:	[1022:1454] 40:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
IOMMU group 57:	[144d:a801] 41:00.0 SATA controller: Samsung Electronics Co Ltd Device a801 (rev 01)
IOMMU group 58:	[8086:0953] 42:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01)
IOMMU group 59:	[10de:1b81] 43:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
IOMMU group 60:	[10de:10f0] 43:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
IOMMU group 61:	[1022:145a] 44:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 145a
IOMMU group 62:	[1022:1456] 44:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor
IOMMU group 63:	[1022:145f] 44:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] USB 3.0 Host controller
IOMMU group 64:	[1022:1455] 45:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 1455
IOMMU group 65:	[1022:7901] 45:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
IOMMU group 66:	[1022:1452] 60:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 67:	[1022:1452] 60:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 68:	[1022:1452] 60:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 69:	[1022:1452] 60:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 70:	[1022:1452] 60:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 71:	[1022:1454] 60:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
IOMMU group 72:	[1022:1452] 60:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 73:	[1022:1454] 60:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
IOMMU group 74:	[1022:145a] 61:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 145a
IOMMU group 75:	[1022:1456] 61:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor
IOMMU group 76:	[1022:1455] 62:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 1455

 

Edited by testdasi


Nice rig. Just curious, why do you still have a CD-ROM in the age of USB images?

2 hours ago, ijuarez said:

Nice rig. Just curious, why do you still have a CD-ROM in the age of USB images?

Cuz it costs me more to get rid of it than to keep it 😆 And every now and then it does come in handy, e.g. not everyone uploads their wedding vid to YouTube.

12 minutes ago, testdasi said:

😆 not everyone uploads wedding vid to Youtube

Neanderthals


Posted (edited)
34 minutes ago, ijuarez said:
47 minutes ago, testdasi said:

😆 not everyone uploads wedding vid to Youtube

Neanderthals

I chuckled at this... Grandfolks and Neanderthals both follow geological time scales. ;)

 

3 hours ago, testdasi said:

Finally, I would like to thank @gridrunner, @eschultz, @Jcloud, @methanoid, @tjb_altf4, @jbartlett, @guru69 for their kind advice during my research process. 👍

You're welcome.

 

Went with a TR2, eh? Nice. With the lady-who-must-be-pleased factor, I'm a little surprised you didn't shop for a TR1; with the TR2 out, I expected TR1 prices to drop.

Either way that system looks good, I hope it serves you well.

 

Typical UCD questions:

   Do you have a Kill A Watt, or similar, to measure power draw for stats?

   What do your IOMMU groups look like?

Edited by Jcloud


What VMs are you running and what resources have you got allocated?

 

 

4 hours ago, ijuarez said:

Neanderthals

A surprising number of Millennials actually. And apparently cute tiny DVDs are popular with some folks.

 

4 hours ago, Jcloud said:

Went with a TR2, eh? Nice. With the lady-who-must-be-pleased factor, I'm a little surprised you didn't shop for a TR1; with the TR2 out, I expected TR1 prices to drop.

Either way that system looks good, I hope it serves you well.

 

Typical UCD questions:

   Do you have a Kill A Watt, or similar, to measure power draw for stats?

   What do your IOMMU groups look like?

I started with a TR1 build plan, but I timed it perfectly and got sign-off for a TR2. 😅 And TR1 prices went down before the TR2 came out but have actually gone back up. I remember at one point, when the rumour mill was in full swing, the 1950X was like 600; it's now 700+.

My watt meter is broken, so I'm not sure about power consumption. Will update the post with my IOMMU groups.

 

1 hour ago, Spies said:

What VMs are you running and what resources have you got allocated?

3 Windows VMs: 1 main workstation and 2 remote-only VMs.

2 Ubuntu servers as VPN gateways.

Half of the cores and RAM go to the workstation (these cores are all isolated). That's my main daily driver.

The rest of the VMs have 2 cores each, with some cores shared with unRAID dockers.

 

I'm currently testing a weird-and-wonderful config. TR2 logical core numbering is a bit different: logical cores 0 and 1 are both on the same physical core.

  • I assigned 24 odd (logical) cores to my workstation + the remaining 8 odd ones as emulator pins. The emulator cores are shared with the other various VMs (which don't really do much).
  • The remaining 32 logical cores (the even numbers) are distributed and shared among dockers and the usual unRAID stuff.

So no (physical) core is used exclusively by anything, and I rely on SMT to schedule the tasks appropriately. We'll see how it goes.
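The odd/even split above can be generated rather than typed out by hand. A sketch, assuming the 8 emulator-pin cores are the odd set 1, 9, 17, 25, 33, 41, 49, 57 (the same set excluded from the 24-core VM in later posts):

```shell
# Split the 32 odd logical cores of a 64-thread 2990WX into an
# emulator-pin set and a 24-core workstation set.
emu="1 9 17 25 33 41 49 57"          # assumed emulator-pin set
vm=""
for c in $(seq 1 2 63); do           # all 32 odd logical cores
  case " $emu " in
    *" $c "*) ;;                     # skip emulator cores
    *) vm="$vm $c" ;;                # everything else -> workstation VM
  esac
done
echo "workstation cores:$vm"
echo "count: $(echo $vm | wc -w)"    # 32 odd cores minus 8 = 24
```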

 

1 hour ago, gridrunner said:

Nice build. :)

 

Thanks. :D


14 hours ago, testdasi said:

Perhaps placebo, but I found spreading the cores out evenly seems to improve performance. I guess every 8 logical cores = 1 unit (CCX), so if you have an 8-core VM, for example, put 1 core in each unit.

Innnnnteresting.



So here is a quick summary of my test results.

I use bare metal with SMT off as the base (since it's fastest). The % below is how much slower each config is than base, so lower is better.

All tests were done on Windows, either bare metal or in a VM. SMT is on for the VMs. Nothing else was running during the tests (except for (8)).

  1. Bare metal, SMT on: 52% <-- yes, SLOWER!
  2. VM, cores 1-7, 17-23, 33-39, 49-55 (28 logical cores): 33%
  3. VM, all odd numbers except 1, 17, 33, 49 (28 logical cores): 36%
  4. VM, first 32 except 0, 8, 16, 24 (28 logical cores): 29%
  5. VM, all odd numbers (32 logical cores): 30%
  6. VM, last 32 (32 logical cores): 34%
  7. VM, all odd numbers except 1, 9, 17, 25, 33, 41, 49, 57 (24 logical cores): 20%
  8. VM, same as (7) but with 3 simultaneous docker transcodes on the even logical cores (24 logical cores): 56%

My conclusions:

  • (1), (7) and (8) say Windows is badly optimised for Threadripper 2, but Linux is much better.
  • (3)-(7) seem to confirm what I was guessing: every 8 logical cores represent 4 physical cores and thus 1 CCX. Spreading things evenly across more CCXs improves performance. 3-3-3-3-3-3-3-3 is faster than 3-4-3-4-3-4-3-4!

Linux is actually great with SMT optimisation, so I'll stick with my weird-and-wonderful config moving forward. :D

 

 

 

 

 

Config 5.JPG

Posted (edited)

Interesting details. I've gone back to my original "slower" CPU pinning, as it's easier for my brain to wrap around. For me, while the benchmarks take a hit, I don't really notice it in my real-world day-to-day driving.

Edited by Jcloud
gramar


Thanks to the magic of KVM, I now have MacOS running on an old Surface 3. 😁

MacOS on Surface 3~01.jpg

Thanks to the magic of KVM, I now have MacOS running on an old Surface 3.
MacOS on Surface 3~01.jpg
Nice



So apparently, the 2990WX all-core turbo is about 3.4GHz.

Note: this was on F10 BIOS.

# grep MHz /proc/cpuinfo
cpu MHz         : 3315.662
cpu MHz         : 3302.891
cpu MHz         : 3382.593
cpu MHz         : 3384.594
cpu MHz         : 3389.368
cpu MHz         : 3389.600
cpu MHz         : 3391.838
cpu MHz         : 3390.623
cpu MHz         : 3392.705
cpu MHz         : 3397.049
cpu MHz         : 3389.122
cpu MHz         : 3384.777
cpu MHz         : 3393.248
cpu MHz         : 3393.420
cpu MHz         : 3393.441
cpu MHz         : 3393.442
cpu MHz         : 3386.566
cpu MHz         : 3378.696
cpu MHz         : 3393.268
cpu MHz         : 3392.793
cpu MHz         : 3388.878
cpu MHz         : 3392.872
cpu MHz         : 3393.441
cpu MHz         : 3393.330
cpu MHz         : 3393.136
cpu MHz         : 3391.281
cpu MHz         : 3393.417
cpu MHz         : 3393.139
cpu MHz         : 3391.659
cpu MHz         : 3393.042
cpu MHz         : 3392.735
cpu MHz         : 3390.230
cpu MHz         : 3390.927
cpu MHz         : 3399.651
cpu MHz         : 3393.443
cpu MHz         : 3393.257
cpu MHz         : 3398.353
cpu MHz         : 3393.405
cpu MHz         : 3393.446
cpu MHz         : 3393.409
cpu MHz         : 3393.484
cpu MHz         : 3392.372
cpu MHz         : 3393.442
cpu MHz         : 3393.443
cpu MHz         : 3393.363
cpu MHz         : 3392.820
cpu MHz         : 3393.443
cpu MHz         : 3393.308
cpu MHz         : 3392.475
cpu MHz         : 3393.030
cpu MHz         : 3375.652
cpu MHz         : 3363.026
cpu MHz         : 3393.333
cpu MHz         : 3393.370
cpu MHz         : 3393.444
cpu MHz         : 3393.236
cpu MHz         : 3388.671
cpu MHz         : 3392.779
cpu MHz         : 3391.320
cpu MHz         : 3393.352
cpu MHz         : 3393.198
cpu MHz         : 3393.226
cpu MHz         : 3393.170
cpu MHz         : 3392.884

 

Edited by testdasi


So apparently my test (6) happened to be testing all the slow cores (those with no direct memory access).

I'll need to retest with the fast cores once my data migration is done.

~# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 0 size: 48208 MB
node 0 free: 350 MB
node 1 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 2 size: 48354 MB
node 2 free: 4680 MB
node 3 cpus: 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
node 3 size: 0 MB
node 3 free: 0 MB
node distances:
node   0   1   2   3
  0:  10  16  16  16
  1:  16  10  16  16
  2:  16  16  10  16
  3:  16  16  16  10

 


Love the F11e BIOS (AGESA 1.1.0.1a). All-core turbo is now 3.8GHz! :D

~# cat /proc/cpuinfo | grep "MHz"
cpu MHz         : 3838.883
cpu MHz         : 3842.082
cpu MHz         : 3839.437
cpu MHz         : 3838.451
cpu MHz         : 3841.782
cpu MHz         : 3842.523
cpu MHz         : 3842.181
cpu MHz         : 3835.253
cpu MHz         : 3843.495
cpu MHz         : 3834.948
cpu MHz         : 3840.813
cpu MHz         : 3841.923
cpu MHz         : 3841.034
cpu MHz         : 3841.805
cpu MHz         : 3835.424
cpu MHz         : 3841.688
cpu MHz         : 3833.602
cpu MHz         : 3842.614
cpu MHz         : 3842.236
cpu MHz         : 3842.521
cpu MHz         : 3843.175
cpu MHz         : 3842.522
cpu MHz         : 3841.415
cpu MHz         : 3842.517
cpu MHz         : 3831.458
cpu MHz         : 3842.367
cpu MHz         : 3842.650
cpu MHz         : 3842.180
cpu MHz         : 3842.218
cpu MHz         : 3841.945
cpu MHz         : 3842.148
cpu MHz         : 3840.602
cpu MHz         : 3832.557
cpu MHz         : 3842.336
cpu MHz         : 3842.060
cpu MHz         : 3840.882
cpu MHz         : 3841.774
cpu MHz         : 3840.777
cpu MHz         : 3842.270
cpu MHz         : 3842.064
cpu MHz         : 3842.156
cpu MHz         : 3835.171
cpu MHz         : 3841.963
cpu MHz         : 3840.519
cpu MHz         : 3839.358
cpu MHz         : 3833.257
cpu MHz         : 3830.856
cpu MHz         : 3840.741
cpu MHz         : 3834.879
cpu MHz         : 3842.435
cpu MHz         : 3841.519
cpu MHz         : 3840.938
cpu MHz         : 3842.043
cpu MHz         : 3840.830
cpu MHz         : 3841.720
cpu MHz         : 3837.862
cpu MHz         : 3841.364
cpu MHz         : 3840.644
cpu MHz         : 3824.251
cpu MHz         : 3840.582
cpu MHz         : 3842.038
cpu MHz         : 3840.441
cpu MHz         : 3841.091
cpu MHz         : 3840.340

 



After additional tuning (and BIOS updates), I have managed to get my 24-core VM (all odd numbers except 1, 9, 17, 25, 33, 41, 49, 57) pretty dang close to bare-metal 32-core non-SMT (F10 BIOS).

 

As with previous benchmarks, % is the slowdown versus bare-metal 32-core non-SMT, so lower is better.

  • Emulator core pinned on 9, 25, 41, 57, NUMA optimisation: 20%
  • Emulator core pinned on 9, 25, with NUMA optimisation: 8%
  • Emulator core pinned on 9, 25, with NUMA optimisation, F11e BIOS: 2% 😁

 

unRAID doesn't come with numatune or numad to optimise RAM placement for the NUMA design, but it's possible to work around. I created an MS-DOS "blocker" VM pinned to NUMA node 0 with just enough RAM allocated to reduce node 0's free RAM to roughly half of what my actual VM needs. Then I start the main VM, resulting in an almost perfect 50-50 split between node 0 and node 2.
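The blocker-VM sizing can be worked out from the `numactl --hardware` numbers posted earlier. A sketch with made-up example values:

```shell
# Size the "blocker" VM so that after it claims its RAM, node 0 has
# exactly half of the main VM's allocation left free (example values).
node0_free_mb=40000                               # free RAM on NUMA node 0
main_vm_mb=32000                                  # RAM the main VM will get
blocker_mb=$(( node0_free_mb - main_vm_mb / 2 ))  # leaves main_vm/2 on node 0
echo "blocker VM should claim ${blocker_mb} MB"
```

Start the blocker first, then the main VM; its allocation then spills roughly 50-50 across node 0 and the other memory-backed node.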

On 9/20/2018 at 1:38 PM, testdasi said:

After additional tuning (and BIOS updates), I have managed to get my 24-core VM (all odd numbers except 1, 9, 17, 25, 33, 41, 49, 57) pretty dang close to bare-metal 32-core non-SMT (F10 BIOS).

 

As with previous benchmarks, % is the slowdown versus bare-metal 32-core non-SMT, so lower is better.

  • Emulator core pinned on 9, 25, 41, 57, NUMA optimisation: 20%
  • Emulator core pinned on 9, 25, with NUMA optimisation: 8%
  • Emulator core pinned on 9, 25, with NUMA optimisation, F11e BIOS: 2% 😁

 

unRAID doesn't come with numatune or numad to optimise RAM placement for the NUMA design, but it's possible to work around. I created an MS-DOS "blocker" VM pinned to NUMA node 0 with just enough RAM allocated to reduce node 0's free RAM to roughly half of what my actual VM needs. Then I start the main VM, resulting in an almost perfect 50-50 split between node 0 and node 2.

Sweet work, matey. Would you mind posting your CPU assignments for your VMs so I can get an idea of how to configure my 2950X (when I receive it)? I'll be purchasing the MSI MEG Creation too, so I'm looking forward to that 🙂

Do all cores turbo correctly? I'm having to use the ZenStates script to get my 1700 to boost.

Edited by mikeyosm


Did the GPU passthrough go smoothly? I'm considering upgrading from my 2700X to a 2950X, and one of the VMs might get a GPU passed through in the future, so I was curious how smooth it all went (I heard TR had a rough start with compatibility).

On 10/23/2018 at 7:53 AM, mikeyosm said:

Sweet work, matey. Would you mind posting your CPU assignments for your VMs so I can get an idea of how to configure my 2950X (when I receive it)? I'll be purchasing the MSI MEG Creation too, so I'm looking forward to that 🙂

Do all cores turbo correctly? I'm having to use the ZenStates script to get my 1700 to boost.

Below are my core assignments for the main workstation VM. I tested a lot of configs and this is the best one. The idea is that the CPU divides into 4 NUMA nodes, each node has 2 CCXs, and each CCX has 4 pairs of SMT logical cores. So you can see that for each CCX, I assign 3 logical cores, none of which are SMT siblings, to my workstation VM.

 

Based on my testing, you want to spread the core assignments across as few NUMA nodes as possible, but as evenly as possible across the CCXs within those nodes, and always leave core 0 unused. So for instance, if you assign 6 cores to a VM, the best config is 3 on each CCX, with both CCXs on the same NUMA node. The side effect is that an odd number of cores on a node, at least with my workload, performs worse than assigning 1 fewer core and spreading them evenly.

 

You also might want to watch SpaceInvaderOne's video on how to identify which PCIe slot is attached to which NUMA node + how your cores are numbered. Mine shows 0 + 1 as an SMT pair, but I remember someone on the forum reporting that an Asus mobo pairs 0 + 32. For a gaming VM, you want to assign only the cores linked to the PCIe slot holding your GPU. That minimises latency when playing (although I don't see any difference between a gaming VM and my workstation VM; then again, I use Process Lasso to pin cores WITHIN the VM for certain processes, so that may be why).
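One quick way to check that slot-to-node mapping yourself is via sysfs. A sketch: the PCI address below is just an example (per my IOMMU list, the GTX 1070 sits at 43:00.0), and on systems without NUMA information the kernel reports -1.

```shell
# Print the NUMA node a PCI device is attached to (example address).
dev=0000:43:00.0                      # adjust to your GPU's PCI address
node_file=/sys/bus/pci/devices/$dev/numa_node
if [ -r "$node_file" ]; then
  echo "$dev is on NUMA node $(cat "$node_file")"
else
  echo "no such device: $dev"
fi
```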

 

And everything turbos to 3.8 GHz automatically (all cores) and drops to about 1.7 GHz when idle.

  <cputune>
    <vcpupin vcpu='0' cpuset='3'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='7'/>
    <vcpupin vcpu='3' cpuset='11'/>
    <vcpupin vcpu='4' cpuset='13'/>
    <vcpupin vcpu='5' cpuset='15'/>
    <vcpupin vcpu='6' cpuset='19'/>
    <vcpupin vcpu='7' cpuset='21'/>
    <vcpupin vcpu='8' cpuset='23'/>
    <vcpupin vcpu='9' cpuset='27'/>
    <vcpupin vcpu='10' cpuset='29'/>
    <vcpupin vcpu='11' cpuset='31'/>
    <vcpupin vcpu='12' cpuset='35'/>
    <vcpupin vcpu='13' cpuset='37'/>
    <vcpupin vcpu='14' cpuset='39'/>
    <vcpupin vcpu='15' cpuset='43'/>
    <vcpupin vcpu='16' cpuset='45'/>
    <vcpupin vcpu='17' cpuset='47'/>
    <vcpupin vcpu='18' cpuset='51'/>
    <vcpupin vcpu='19' cpuset='53'/>
    <vcpupin vcpu='20' cpuset='55'/>
    <vcpupin vcpu='21' cpuset='59'/>
    <vcpupin vcpu='22' cpuset='61'/>
    <vcpupin vcpu='23' cpuset='63'/>
    <emulatorpin cpuset='9,25'/>
  </cputune>
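If you want to double-check a pinning like the above against those rules (core 0 free, never both halves of an HT pair, cores spread evenly across CCX), here's a quick sanity-check sketch. It assumes my board's numbering: siblings are 2n/2n+1, 8 logical cores per CCX.

```python
# Sanity-check the cputune pinning against the rules described above.
# Assumes siblings are (2n, 2n+1) and 8 logical cores per CCX.
import xml.etree.ElementTree as ET
from collections import Counter

CPUTUNE = """<cputune>
  <vcpupin vcpu='0' cpuset='3'/>
  <vcpupin vcpu='1' cpuset='5'/>
  <vcpupin vcpu='2' cpuset='7'/>
  <vcpupin vcpu='3' cpuset='11'/>
  <vcpupin vcpu='4' cpuset='13'/>
  <vcpupin vcpu='5' cpuset='15'/>
  <vcpupin vcpu='6' cpuset='19'/>
  <vcpupin vcpu='7' cpuset='21'/>
  <vcpupin vcpu='8' cpuset='23'/>
  <vcpupin vcpu='9' cpuset='27'/>
  <vcpupin vcpu='10' cpuset='29'/>
  <vcpupin vcpu='11' cpuset='31'/>
  <vcpupin vcpu='12' cpuset='35'/>
  <vcpupin vcpu='13' cpuset='37'/>
  <vcpupin vcpu='14' cpuset='39'/>
  <vcpupin vcpu='15' cpuset='43'/>
  <vcpupin vcpu='16' cpuset='45'/>
  <vcpupin vcpu='17' cpuset='47'/>
  <vcpupin vcpu='18' cpuset='51'/>
  <vcpupin vcpu='19' cpuset='53'/>
  <vcpupin vcpu='20' cpuset='55'/>
  <vcpupin vcpu='21' cpuset='59'/>
  <vcpupin vcpu='22' cpuset='61'/>
  <vcpupin vcpu='23' cpuset='63'/>
  <emulatorpin cpuset='9,25'/>
</cputune>"""

pins = [int(e.get('cpuset')) for e in ET.fromstring(CPUTUNE).findall('vcpupin')]

assert 0 not in pins                      # core 0 left free for unRAID
assert len({p // 2 for p in pins}) == 24  # never both halves of an HT pair
print(Counter(p // 8 for p in pins))      # each of the 8 CCX gets 3 cores
```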

 

15 hours ago, Blindsay said:

Did the GPU passthrough go smoothly? I'm considering upgrading to a 2950X from my 2700X, and one of the VMs in the future might have a GPU passed through, but I was curious how smooth it all went (heard TR had a rough start with compatibility).

No problem at all with GPU passthrough.

 

It depends on the motherboard and GPU combo. My combo works well enough, except that the Gigabyte mobo doesn't like unRAID v6.6+: my VMs take up to 10 minutes to boot and, once booted, run ridiculously slowly. Nobody seems to know why. So I downgraded to 6.5.3 and expect to stay there for years to come, or at least until my current 5-year upgrade cycle ends.

 

Apparently other users reported no problem (e.g. with the ASRock X399 Taichi), so I'd recommend steering clear of Gigabyte mobos.

 

The 2950X is basically 2 Ryzen CPUs glued together, so for best performance you need to pin the VM to the node connected directly to your GPU's PCIe slot. Alternatively, using Process Lasso inside the VM to limit the cores used by your games (or Adobe Lightroom <-- does not like more than 6 cores) achieves a similar effect.
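Side note on the Process Lasso approach: the affinity it sets is just a CPU bitmask, the same kind that `start /affinity` accepts on Windows. A tiny hypothetical helper to compute one (the core numbers are guest cores and purely illustrative):

```python
# Hypothetical helper: build the hex CPU-affinity bitmask (one bit per
# logical core) in the form Windows' `start /affinity` accepts.
# Guest core numbers here are illustrative.

def affinity_mask(cores):
    mask = 0
    for c in cores:
        mask |= 1 << c       # set the bit for each allowed core
    return hex(mask)

# e.g. restrict a game to the first 6 guest cores:
print(affinity_mask(range(6)))  # 0x3f
```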

On 10/30/2018 at 7:56 AM, testdasi said:

Below are my core assignments for the main workstation VM. [...]

I was wondering, do you have a Plex server, and if you do, does your main VM get really slow when people are streaming from it? My main VM has all of its cores completely dedicated to it, but the minute someone starts watching something from Plex, my system's performance nosedives with massive stuttering. I can't figure out why, and it's starting to drive me a little crazy.

 

If you look at CPU utilization with htop, nothing is really crossing 50%. Memory is 60% used. The VM has its own dedicated SSD. It makes zero sense why the stuttering starts occurring. It usually begins with voice distortion on Discord and turns into a full-blown mess. Even games like Factorio that are basically just CPU- and memory-bound turn to crap, with me getting less than 20 fps.

Edited by Jerky_san

On 11/7/2018 at 1:19 PM, Jerky_san said:

I was wondering do you have a plex server and if you do does your main VM get really slow when people are streaming from it? [...]

Sorry for the late response. I've been in over my head with work.

I have Plex and didn't have any of your problems. What unRAID version do you use? Did you pin the VM to the node directly connected to the GPU?

 

I have voice distortion on my HDMI audio output plus general lag and stuttering if the node directly connected to the GPU is under super heavy load, which Plex certainly will cause, even when the VM cores are not shared. That's why I always leave at least 1 core free on that node, which seems to have solved the issue.

4 hours ago, testdasi said:

I have voice distortion on my HDMI audio output + general lag and stuttering if the node directly connected to the GPU is under super heavy load [...] That's why I always leave at least 1 core free, which seems to have solved the issue.

I found out that the cache topology the VM sees is wrong on Threadripper, and if you tell it to emulate an EPYC CPU the cache all passes through correctly and BAM, no more lag or stuttering or anything. Tis amazin. See the post below, but I'm hoping QEMU or Limetech will apply the patch I posted (as I don't know how) so I can cross NUMA again. For now I get a substantial FPS increase and all my cache levels are very close to bare metal. If you attempt to cross NUMA to the other die with a memory controller, though, it doesn't pass the NUMA info to the VM, so it gets crap memory performance.
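For anyone who wants to try the EPYC trick, it goes in the `<cpu>` section of the VM XML. This is a sketch, not Jerky_san's exact config: the topology values must match your own vcpupin count, and `<cache mode='passthrough'/>` needs a reasonably recent QEMU/libvirt.

  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>EPYC</model>
    <topology sockets='1' cores='12' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>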

 

 

2 hours ago, Jerky_san said:

I found out that the cache is wrong on threadripper and if you tell it to emulate an EPYC CPU the cache will all pass through correctly and BAM no more lag or stuttering or anything. [...]

Interesting. Was the XML in that post part of your unRAID VM XML?

I may try it when I have some time. I wonder if I can actually have the best of both worlds: using Process Lasso to restrict games to the same node while having my VM cross NUMA nodes for things that need more cores rather than low latency.

