testdasi

The Black Bear - Threadripper 2990WX build

114 posts in this topic

Minor updates:

  • Bought the Asus Hyper M.2 (4xM.2 -> PCIe x16) to mount the PM983 away from the GPU and open up future expandability. It now idles at under 40 degrees (5 below previous best), reaching around 46 during heavy gaming sessions (10 below previous best).
  • The SM951 (AHCI) refuses to die despite my intentional "running it into the ground" effort. Switching to the Hyper M.2 requires rejigging the config anyway, so I decided to switch to the SM951 as my workstation VM boot drive (again!). Along the way, I'm reminded of how compatible the AHCI PCIe M.2 standard is - it shows up to the VM as a SATA drive so it boots natively (unlike NVMe, for example, which requires jumping through a few hoops).
  • I now have a very #FirstWorldProblem of having 1x 2280 and 3x 22110 slots that I want to fill with M.2 drives. :)
  • I found my old Kingston SSDNow+ 128GB collecting dust (it dates back to before SATA III existed) so I decided to use it as my temp space for things waiting for the next gdrive batch. It doubles as a half-intentional protection against crypto viruses because of how unionfs works. If I were infected, it would take less than 30 minutes for the drive to fill up and error out (giving me a quick clue) while protecting my actual content.
Edited by testdasi

4 hours ago, testdasi said:

Asus Hyper M.2

Fascinating! Do you use my DiskSpeed Docker? I'm curious how these show up in it. Four controllers with one drive or one controller with four drives?

 

I think I might have to pick one of these up. How is the cooling? Seems like it'll be a good way to add VM drives or a larger cache pool.

Edited by jbartlett


Hi @testdasi

 

I read your thread with interest as I'm planning a Threadripper build and considering MB options. The Designare is high on my list and I wanted to check something with you.

 

At the beginning of your thread, you intimated that you would hesitate to recommend the Designare due to a few issues and niggles. However, reading on, it seems you've found solutions for many (all?) of the challenges. Would you still recommend looking elsewhere or would you feel it's now OK for an unRaid build?

 

Second, and if you would be so kind, would you mind sharing your passthrough method and, if possible, any config files relevant? 

 

I'm just interested to see what you've done with this board and how. I've been passing through GPUs, controllers and discrete devices since VMs became a thing in unRaid, but not frequently enough to remember how to do it each time. The best practice has evolved over time and I struggle to find the most up-to-date recommendations.

 

Cheers!

 

 

On 7/18/2019 at 6:27 PM, jbartlett said:

Fascinating! Do you use my DiskSpeed Docker? I'm curious how these show up in it. Four controllers with one drive or one controller with four drives?

 

I think I might have to pick one of these up. How is the cooling? Seems like it'll be a good way to add VM drives or a larger cache pool.

My vfio stubbing continues to work without any intervention so can't test it with the DiskSpeed Docker. But based on how I understand PCIe bifurcation, it should be as if you have 4 more M.2 slots. Everything (as far as I can tell) behaves just like the motherboard M.2 ports.

 

The cooling is excellent. 7-9 degrees C cooler. Most importantly, it cools back down very quickly after load.

4 minutes ago, meep said:

At the beginning of your thread, you intimated that you would hesitate to recommend the Designare due to a few issues and niggles. However, reading on, it seems you've found solutions for many (all?) of the challenges. Would you still recommend looking elsewhere or would you feel it's now OK for an unRaid build?

 

Second, and if you would be so kind, would you mind sharing your passthrough method and, if possible, any config files relevant? 

 

I'm just interested to see what you've done with this board and how. I've been passing through GPUs, controllers and discrete devices since VMs became a thing in unRaid, but not frequently enough to remember how to do it each time. The best practice has evolved over time and I struggle to find the most up-to-date recommendations.

 

I can fully recommend the Designare EX as long as you don't need ACS Override due to the lag issue I reported.

There may very well be some advanced fix that I don't know of but the easiest fix for me is to not use ACS Override and everything runs perfectly fine.

Come to think of it, it may be an issue with the i440fx VM machine type (I now use Q35 - that was one potential fix I didn't test).

 

There is really nothing special with my passthrough method. I basically follow the SpaceInvaderOne textbook.

  • vfio stub the devices that need to be passed through
  • dump my own vbios for the GPU and use it (even though it's theoretically not necessary since my motherboard BIOS allows me to boot Unraid with its cheapo GPU on slot 3)

For a refresher on PCIe passthrough, just watch SpaceInvaderOne's YouTube videos. They are practically just short of the guy going to your place to do it for you.

 

Of course if you have any problem, feel free to ask.


Do you happen to have an LSI SAS controller (or does anyone who has this mobo have one)? I found out that the ASRock X399 Taichi has an issue with LSI controllers where it just resets them constantly, and it needs a BIOS fix that hasn't come out yet. No news from ASRock on an ETA or whether they will fix it at all, so I am looking into getting a different mobo that supports SAS controllers. Does the Designare support a SAS controller?

On 8/26/2019 at 6:38 PM, urhellishntemre said:

Do you happen to have an LSI SAS controller (or does anyone who has this mobo have one)? I found out that the ASRock X399 Taichi has an issue with LSI controllers where it just resets them constantly, and it needs a BIOS fix that hasn't come out yet. No news from ASRock on an ETA or whether they will fix it at all, so I am looking into getting a different mobo that supports SAS controllers. Does the Designare support a SAS controller?

No, I don't use a SAS controller unfortunately, so I can't test that. I fully embrace the cloud now so I don't need a big array anymore.


Okay, I picked up this board, and for the most part it's going well. The only thing I can't figure out is that when my Windows 10 VM is set to BR0 it does not connect to the internet, but it connects with virbr0. Obviously the drawback to virbr0 is that you can't manage unRAID from the VM since it's on a different subnet from the rest of the network. I made sure the default network bridge for the VMs is br0, the VM has BR0 selected as active, and both ethernet ports have an active ethernet connection. Is there something I am missing here to get VMs on the same network as the host machine?

6 hours ago, urhellishntemre said:

Okay, I picked up this board, and for the most part it's going well. The only thing I can't figure out is that when my Windows 10 VM is set to BR0 it does not connect to the internet, but it connects with virbr0. Obviously the drawback to virbr0 is that you can't manage unRAID from the VM since it's on a different subnet from the rest of the network. I made sure the default network bridge for the VMs is br0, the VM has BR0 selected as active, and both ethernet ports have an active ethernet connection. Is there something I am missing here to get VMs on the same network as the host machine?

BR0 and br0 are different. Linux-based OSes are case-sensitive.

 

If you set it to br0 (note: lower case) and it still doesn't work then you need to post it in the VM section. It's certainly not hardware related.
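To illustrate the case-sensitivity point, here is a minimal Python sketch (not from the original post) that compares a bridge name against the interfaces Linux exposes under /sys/class/net. The function names `list_interfaces` and `check_bridge` are made up for this example, and it only checks the name, not whether the bridge actually carries traffic.

```python
import os


def list_interfaces(sys_net="/sys/class/net"):
    """Return the interface names the kernel exposes, or [] if the path is absent."""
    try:
        return sorted(os.listdir(sys_net))
    except OSError:
        return []


def check_bridge(name, interfaces):
    """Classify a bridge name against a list of interface names (case-sensitive)."""
    if name in interfaces:
        return "ok"
    # Look for an interface whose name differs only by letter case.
    for candidate in interfaces:
        if candidate.lower() == name.lower():
            return "case mismatch: did you mean " + candidate + "?"
    return "not found"


if __name__ == "__main__":
    # Hypothetical usage: run on the host to see how each spelling is treated.
    for wanted in ("br0", "BR0"):
        print(wanted, "->", check_bridge(wanted, list_interfaces()))
```

On a host that has a br0 bridge, the first spelling should report "ok" and the second a case mismatch.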


Updates:

  • Updated to 6.8.0-rc3 from 6.7.2. I have not used rc on my prod server for a long time now but decided to do it this time.
    • The qemu upgrade to v4.1 is much appreciated for its better PCIe support (e.g. no need for the manually added XML tags for Q35). I no longer need the qemu:commandline piece of code (aka the root port patch) and my PCIe shows up as x16 now.
    • Support for Wireguard. Now I have no clue how to set it up properly so will wait for Spaceinvaderone guide but it's better to be on 6.8 and know / work around any teething issues before his guide is out.
    • On the subject of teething issues, Spaceinvaderone reported some rather minor gripes with rc1.
  • My attempt to run the Kingston 128GB SSD into the ground has failed, just like my other attempts to run SSDs into the ground. This speaks volumes about the longevity and endurance of SSDs if used correctly, e.g. frequent trim, minimal long-term data (i.e. practical over-provisioning), etc.
  • Got a 4TB Samsung 860 EVO as NAS drive (i.e. long-term data) because my 2TB 850 Evo has already filled up. In the process, I have also rejigged my assignments.
    • The 2TB 850 EVO is now my cache drive. This resolves a rather peculiar issue with Plex. When my Plex db is on the i750, sometimes media thumbnails fail to load properly. I know it's not a data corruption because if I refresh the page, things show up normally again (and I don't have db corruption errors that others have reported with 6.7). I have done my testing and long-story-short, I think it's because the i750 has some funky latency that messes up Plex / browser image load timeout.
    • The i750 is now used as my temp drive i.e. heavy write. I store minimal long-term data on it to minimize write amplification.
    • My Crucial MX300 is used as my intermediate drive i.e. for data waiting to be processed. This will be another project for me to run to the ground - we'll see how "successful" I am (already 44TB written and only 1 bad block, 9191 spare blocks to go!).
    • My Kingston 128GB now serves as mount points for rclone and associated logs. Having mount points on the array occasionally caused my HDD to spin up unnecessarily.
    • I retired the 300GB Toshiba 7200rpm 2.5" HDD for the nth time. Given its 100MB/s speed, I really don't miss it that much.
  • My important data is now 1-2-4 (1 piece of data, 2 locations, 4 copies - primary, online, offline, offsite).

 

Edited by testdasi

47 minutes ago, H2O_King89 said:

Were you able to pass through the onboard audio?

No. It's in the same IOMMU group as a SATA device.

My hunch is that it's for the M.2 slot if run in SATA mode so theoretically I can vfio stub it but I have never gotten around to doing it since I don't need onboard audio.

 

Of course, ACS Override should work but then in my case, it lags if ACS Override is on.
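For anyone wanting to check the same thing on their own board, here is a rough Python sketch (not part of the original post) that reads the IOMMU groups from /sys/kernel/iommu_groups and reports which other devices share a group with a given PCI address. The function names are made up for this example and the address in the usage part is a placeholder, not the real address of the audio device on this board.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    if not os.path.isdir(root):
        return groups
    for group in os.listdir(root):
        devices_dir = os.path.join(root, group, "devices")
        if group.isdigit() and os.path.isdir(devices_dir):
            groups[int(group)] = sorted(os.listdir(devices_dir))
    return groups


def shared_with(address, groups):
    """Return the other devices in the same group as `address` (empty if alone or unknown)."""
    for members in groups.values():
        if address in members:
            return [m for m in members if m != address]
    return []


if __name__ == "__main__":
    # Placeholder address: substitute the one shown for your audio device.
    audio = "0000:00:00.0"
    print("Devices sharing a group with", audio, ":", shared_with(audio, iommu_groups()))
```

If the list is not empty, the device cannot be passed through on its own without stubbing the other group members or using ACS Override.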

Edited by testdasi

testdasi said:

No. It's in the same IOMMU group as a SATA device.
My hunch is that it's for the M.2 slot if run in SATA mode so theoretically I can vfio stub it but I have never gotten around to doing it since I don't need onboard audio.

Of course, ACS Override should work but then in my case, it lags if ACS Override is on.

Okay, I'll give it a try. I need audio in. It would be nice if we could pass an audio card to a Docker container to run the program.

Sent from my Pixel 4 XL using Tapatalk


Updates:

  • Great news! 6.8 has resolved the peculiar lagging with ACS Override on (which seemingly only happens on Gigabyte X399 motherboards). Tested with 6.8.0-rc7 and no lag with both Q35-4.1 and i440fx-4.1. ❤️ the new kernel!
  • During testing, I also discovered that Hyper-V has an impact on performance. I did CrystalDiskMark on my passed-through Samsung 970 NVMe and noticed with Hyper-V off:
    • Random I/O is 60% slower!
    • Sequential is capped at 2.6GB/s (approximately) (3.5GB/s with Hyper-V on).
  • Extrapolating on the above, I would imagine GPU performance would be worse with Hyper-V off too. Hence, I would say unless you have error code 43 that can't be resolved with any other tweak, don't just turn off Hyper-V by default.
  • I finally get what the Unraid Mount tag thing is all about. It allows you to mount an Unraid share directly to a Linux mount point (presumably skipping the network). The mount command needs to be run inside the guest OS for it to work, e.g.
    sudo mount -t 9p -o trans=virtio "Unraid Mount tag" /mnt/example

 

Edited by testdasi

14 hours ago, jbartlett said:

Interesting. I'll run this test on my VMs with GPU passed through and not.

Yes please. In case you haven't heard, there's a bug on the GUI that prevents changing HyperV status so you might want to create a new template.

23 hours ago, testdasi said:

Yes please. In case you haven't heard, there's a bug on the GUI that prevents changing HyperV status so you might want to create a new template.

I set the Hyper-V to No simply by removing the features/hyperv block in the XML editor. Scores were lower with Hyper-V off but by a negligible amount with a five test pass.

 

Timespy Graphics Only Scores, no CPU, 1080p with a Quadro P2000

Hyper-V          On    Off
Average        4865   4863
Highest        4876   4874
Lowest         4844   4838
Variance         32     36
Offset High      11     11
Offset Low       21     25
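As a quick cross-check of the table, the sketch below (added here, not part of the original post) recomputes the derived rows from the average, highest and lowest scores. The "Variance" row appears to be the high-to-low spread rather than a statistical variance, which is how it is computed here. The function names are made up for this example.

```python
def summarize(average, highest, lowest):
    """Recompute the derived rows of the table from the three measured values."""
    return {
        "range": highest - lowest,         # the row labelled "Variance" in the table
        "offset_high": highest - average,
        "offset_low": average - lowest,
    }


def percent_drop(baseline, other):
    """Relative drop of `other` versus `baseline`, in percent."""
    return (baseline - other) / baseline * 100.0


hyperv_on = summarize(average=4865, highest=4876, lowest=4844)
hyperv_off = summarize(average=4863, highest=4874, lowest=4838)
drop = percent_drop(4865, 4863)

if __name__ == "__main__":
    print("Hyper-V on: ", hyperv_on)
    print("Hyper-V off:", hyperv_off)
    print("Average drop with Hyper-V off: %.3f%%" % drop)
```

The average drop works out to about 0.04%, well inside the run-to-run spread of 32 to 36 points.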

 


Updates:

  • Thanks to @Jerky_san's XML, I made some edits to my main VM so I can now evenly distribute RAM across nodes 0 and 2 without the need for dummy VMs to block out available RAM. I went a bit further than his code and created 4 guest nodes (since my VM spans all 4 host nodes) to make process pinning with Process Lasso much easier - no need to click each core anymore, just click on the NUMA box to select all cores to that NUMA.

 

Code:

This numatune section (after </cputune>) creates 4 guest NUMA nodes (cellid) with strict allocation from host nodes 0 and 2 (nodeset). The 2990WX only has 2 nodes with their own memory controllers, so cells 2 and 3 are also allocated to host nodes 0 and 2.

  <numatune>
    <memory mode='strict' nodeset='0,2'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='2'/>
    <memnode cellid='2' mode='strict' nodeset='0'/>
    <memnode cellid='3' mode='strict' nodeset='2'/>
  </numatune>
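To make the guest-to-host mapping easier to see, here is a small Python sketch (added for illustration, not from the original post) that parses the numatune block above and prints which host node backs each guest cell. The helper names are made up, and `parse_nodeset` only handles comma lists and simple ranges, not every nodeset form libvirt accepts.

```python
import xml.etree.ElementTree as ET

# The numatune block from the post, copied verbatim.
NUMATUNE_XML = """
<numatune>
  <memory mode='strict' nodeset='0,2'/>
  <memnode cellid='0' mode='strict' nodeset='0'/>
  <memnode cellid='1' mode='strict' nodeset='2'/>
  <memnode cellid='2' mode='strict' nodeset='0'/>
  <memnode cellid='3' mode='strict' nodeset='2'/>
</numatune>
"""


def parse_nodeset(text):
    """Expand a nodeset such as '0,2' or '0-1,3' into a sorted list of ints."""
    nodes = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            nodes.update(range(int(first), int(last) + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


def guest_to_host(numatune_xml):
    """Return {guest cell id: [host nodes]} from a numatune element."""
    root = ET.fromstring(numatune_xml.strip())
    mapping = {}
    for memnode in root.findall("memnode"):
        mapping[int(memnode.get("cellid"))] = parse_nodeset(memnode.get("nodeset"))
    return mapping


if __name__ == "__main__":
    for cell, hosts in sorted(guest_to_host(NUMATUNE_XML).items()):
        print("guest cell", cell, "-> host node(s)", hosts)
```

Guest cells 0 and 2 land on host node 0, and cells 1 and 3 on host node 2, matching the description above.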

This cpu section is where the magic happens.

  • The numa section allocates an exact RAM amount to each guest node (which, using numatune above, allocates the same amount + overhead to the appropriate host node). Obviously the total across guest nodes should equal the total memory allocated to the VM.
  • The cpus tag identifies which guest cores are assigned to which cell ID. I grouped them in the same chiplet arrangement (e.g. 0-5 is in the same chiplet, matched to NUMA node 0 etc).
  • Cores in cells 2 and 3 are from the host nodes without a memory controller. However, since numa doesn't allow zero memory, I allocated a token amount of 1GB to each.
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC</model>
    <topology sockets='1' cores='24' threads='1'/>
    <feature policy='require' name='topoext'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='svm'/>
    <feature policy='disable' name='x2apic'/>
    <numa>
      <cell id='0' cpus='0-5' memory='25165824' unit='KiB'/>
      <cell id='1' cpus='6-11' memory='25165824' unit='KiB'/>
      <cell id='2' cpus='12-17' memory='1048576' unit='KiB'/>
      <cell id='3' cpus='18-23' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
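The "total across guest nodes should equal the VM total" rule is easy to check with a few lines of Python. This sketch (added for illustration, not from the original post) sums the cell sizes and vCPU counts from the numa block above. It assumes every cell uses unit='KiB' and a single 'first-last' cpus range, as in this config. The VM's own memory element is not shown in the post, so the 50 GiB figure is inferred from the cells.

```python
import xml.etree.ElementTree as ET

# The <numa> block from the cpu section above, copied verbatim.
NUMA_XML = """
<numa>
  <cell id='0' cpus='0-5' memory='25165824' unit='KiB'/>
  <cell id='1' cpus='6-11' memory='25165824' unit='KiB'/>
  <cell id='2' cpus='12-17' memory='1048576' unit='KiB'/>
  <cell id='3' cpus='18-23' memory='1048576' unit='KiB'/>
</numa>
"""

KIB_PER_GIB = 1024 * 1024


def cell_summary(numa_xml):
    """Return a list of (cell id, vCPU count, memory in KiB) tuples."""
    root = ET.fromstring(numa_xml.strip())
    cells = []
    for cell in root.findall("cell"):
        first, last = cell.get("cpus").split("-")
        vcpus = int(last) - int(first) + 1
        cells.append((int(cell.get("id")), vcpus, int(cell.get("memory"))))
    return cells


cells = cell_summary(NUMA_XML)
total_kib = sum(memory for _, _, memory in cells)
total_vcpus = sum(vcpus for _, vcpus, _ in cells)

if __name__ == "__main__":
    for cell_id, vcpus, memory in cells:
        print("cell", cell_id, "vcpus", vcpus, "memory", memory / KIB_PER_GIB, "GiB")
    print("total:", total_vcpus, "vCPUs,", total_kib / KIB_PER_GIB, "GiB")
```

The cells add up to 24 vCPUs (matching cores='24' in the topology line) and 52428800 KiB, i.e. 24 + 24 + 1 + 1 = 50 GiB, which is what the VM's memory setting would need to be.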

 

These were not related to the numa allocation but I still added them since they seem to give me a marginal 1-2% improvement in performance.

HyperV:

    <hyperv>
      <vpindex state='on'/>
      <synic state='on'/>
      <stimer state='on'/>
      <reset state='on'/>
      <vendor_id state='on' value='KVM Hv'/>
      <frequencies state='on'/>
    </hyperv>

Clock:

  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='yes'/>
  </clock>

 

Jerky_san's original post:

 

6 hours ago, testdasi said:

no need to click each core anymore, just click on the NUMA box to select all cores to that NUMA.

You're the second person I've seen mention a NUMA box. I've never seen one, can you send a screen shot?

1 hour ago, jbartlett said:

You're the second person I've seen mention a NUMA box. I've never seen one, can you send a screen shot?

It's under CPU Affinity, below the core boxes.

NUMA boxes.PNG

1 hour ago, testdasi said:

It's under CPU Affinity, below the core boxes.

Ah, I missed that it was under Process Lasso. My mind went straight to the Unraid VM editor.


Updates:

  • Attempted to update the BIOS to F12i only to waste 2 hours of my life, as the new BIOS is buggy and the patch notes are misleading.
    • Couldn't reliably save config.
    • When config could be saved, it didn't retain beyond 1-2 boot cycles.
    • Exit without saving = can't boot up at all (need to clear CMOS for it to boot back up).
    • Saving profile crashed the BIOS itself (blank screen).
    • The patch notes list "PCIe bifurcation" as an additional feature, which is misleading. It really just changes the wording in the BIOS. This feature has been available for a while, e.g. in F12e, just under a different name.
  • So in the process of restoring back to F12e (fortunately I kept both original Gigabyte BIOS as well as my tweaked one), I discovered Global C State Control is critical (at least on 5.x kernel) for performance.
    • With it disabled, CDM random performance dropped by 75%! It was not just benchmark, the lag was obvious, albeit not completely unusable like the ACS Override lag issue.
    • On that subject, perhaps the 2 issues were related. The newer kernel just mitigates the situation to some extent.
  • Also turned on Precision Boost and Typical Current Idle to see if it stabilises things.
  • Did some additional tweaks to the workstation VM in the <features> section

HyperV:

  • All of these offer even more performance, from 1% to 5%.
  • The vendor_id is an error code 43 workaround (even though I don't have the issue). The value must be exactly 12 characters for it to work.
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vpindex state='on'/>
      <synic state='on'/>
      <stimer state='on'/>
      <reset state='on'/>
      <vendor_id state='on' value='0123456789ab'/>
      <frequencies state='on'/>
    </hyperv>
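Since the 12-character requirement is easy to get wrong when picking your own value, here is a small Python sketch (added for illustration, not from the original post) that reads the vendor_id out of a hyperv block and reports its length. The 12-character rule itself is taken from the post above and is not verified here. The function name `vendor_id_length` is made up for this example.

```python
import xml.etree.ElementTree as ET

# The hyperv block from the post, copied verbatim.
HYPERV_XML = """
<hyperv>
  <relaxed state='on'/>
  <vapic state='on'/>
  <spinlocks state='on' retries='8191'/>
  <vpindex state='on'/>
  <synic state='on'/>
  <stimer state='on'/>
  <reset state='on'/>
  <vendor_id state='on' value='0123456789ab'/>
  <frequencies state='on'/>
</hyperv>
"""


def vendor_id_length(hyperv_xml):
    """Return the length of the vendor_id value, or None if it is absent or off."""
    root = ET.fromstring(hyperv_xml.strip())
    vendor = root.find("vendor_id")
    if vendor is None or vendor.get("state") != "on":
        return None
    return len(vendor.get("value", ""))


if __name__ == "__main__":
    length = vendor_id_length(HYPERV_XML)
    print("vendor_id length:", length, "(post says it should be 12)")
```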

KVM - error code 43 workaround

    <kvm>
      <hidden state='on'/>
    </kvm>

IOAPIC - apparently Q35 on qemu 4.0 had some changes that require this line - not that I had any issue that prompted it.

    <ioapic driver='kvm'/>

 

Source:

https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#QEMU_4.0:_Unable_to_load_graphics_drivers/BSOD/Graphics_stutter_after_driver_install_using_Q35

 

On 12/7/2019 at 3:40 AM, testdasi said:

All of these offer even more performance, from 1% to 5%

I saw a 0.73% drop with the additional Hyper-V settings compared to what the GUI puts in.
