Let's Talk IOMMU Groups, Passthrough, VT-d, Motherboards, ACS Override


bungee91

I'm hoping to start a discussion to better understand which boards properly support DMA remapping, isolation, and the ability to safely pass through individual PCIe cards.

 

I have read a fair amount of Alex Williamson's discussion on the topic, covering what is happening and what is supposed to happen. However, I also see a lot of people having issues: boards advertise VT-d support and IOMMU groups, yet the isolation that is supposed to be present is (from what I understand) non-existent.

 

I have fairly new consumer-grade hardware (i7-4790S and ASRock Z97 Extreme 4) that I planned to use for the "one box to rule them all" setup.

I searched heavily with the MB selection and decided this board was perfect for my needs.

3 PCIe-16x slots (16x/8x/4x)

3 PCIe-1x slots.

8 SATA ports (2 on a separate controller, which is not the Marvell one that has addressing issues when IOMMU (VT-d) is enabled).

 

Anyhow, long story short: by default all 6 PCIe slots are grouped within the same IOMMU group...  ???

Why?.. No idea... I have moved things around, etc.. and regardless of what I do they exist together.

So the only option I have for assigning one to a VM is the PCIe ACS Override setting (I just use downstream; there are others for specific device selection).
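
For anyone unsure whether the override is actually active, here is a rough sketch that looks for it on the kernel command line. It assumes the setting ends up as a `pcie_acs_override=` boot parameter (the name used by the out-of-tree ACS override patch, with values such as `downstream`); that mapping is my assumption, not something confirmed in this thread, and the function name is made up for illustration.

```python
def acs_override_options(cmdline):
    """Return the comma-separated values of pcie_acs_override=, or [] if absent.

    ASSUMPTION: the override is exposed as a 'pcie_acs_override=' kernel
    parameter, as in the out-of-tree ACS override patch.
    """
    for token in cmdline.split():
        if token.startswith("pcie_acs_override="):
            value = token.split("=", 1)[1]
            return [opt for opt in value.split(",") if opt]
    return []


if __name__ == "__main__":
    try:
        # /proc/cmdline holds the parameters the running kernel was booted with.
        with open("/proc/cmdline") as f:
            options = acs_override_options(f.read())
    except OSError:
        options = []
    if options:
        print("ACS override active:", ", ".join(options))
    else:
        print("No ACS override parameter found on the kernel command line")
```

Keep in mind that the override only changes how the kernel reports the groups; it does not add isolation the hardware lacks.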

I decided to try passing a tuner card in a 1x slot through to a Win7 VM that had been working rock solid... Nope: issues, with nothing useful in the syslog. The VM locked up and the server was unreachable (even PuTTY was dead). The console still worked fine, so I invoked the powerdown script and started again.

 

Anyhow, and sorry to rant...

It'd be nice to know which motherboards people are using that properly assign PCIe slots to their own IOMMU groups for proper isolation.

 

So, what hardware do you have? Does it properly assign the PCIe slots in isolated groups?
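
To make reports comparable, here is a minimal sketch for dumping the groups on a Linux host. It assumes the kernel exposes them under `/sys/kernel/iommu_groups/<group>/devices/` (the standard sysfs layout when the IOMMU is enabled); the function names are just illustrative.

```python
from pathlib import Path


def read_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} read from sysfs.

    Returns an empty dict if the directory is missing, which usually
    means the IOMMU is disabled or unsupported.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        # Each entry under devices/ is named after a PCI address.
        groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def shared_groups(groups):
    """Return only the groups that contain more than one device."""
    return {g: devs for g, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    for group, devices in sorted(read_iommu_groups().items()):
        print(f"IOMMU group {group}: {' '.join(devices)}")
```

A group with several entries is not automatically a problem (a GPU and its audio function normally share one), but unrelated slots showing up in the same group is the situation described above.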

 

ASRock Z97 Extreme 4 (from my testing, and understanding of how this all works) is NOT recommended for this use case.

 

I believe any input, discussion, or understanding of this BEFORE people buy hardware for this type of setup would be very beneficial.

I am considering replacing my MB with one that can properly handle this use case, and isolate separate cards correctly.

 

Discuss...

 

-----------

Useful reads:

https://lkml.org/lkml/2013/5/30/513

http://vfio.blogspot.com/2014/08/vfiovga-faq.html

http://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html

 

Link to comment

I read a bunch of the Alex Williamson discussion and it's indeed interesting how difficult it seems to be to provide isolation between groups.

 

I have to wonder if the issue is the use of consumer-grade hardware. A Socket 1150 CPU only provides 16 PCIe lanes, whereas you'd have up to 40 lanes with a 2011-v3 CPU. The 2011-v3 CPUs (both the i7s and E5s) also have more DMI channels; twice the memory bandwidth (51.2GB/s vs 25.6GB/s); and support CPUs with far more cores than the 4-core max you can get with Socket 1150 => as many as 18 cores in an E5 Xeon, and up to 8 cores with an i7-5960X. I don't know if the limited number of lanes on the 1150 boards also limits how you can group them, or if this is more due to the consumer-grade chipset vs. a server-class one, but it'd be interesting to do some testing with that. Hopefully JonP can provide us some feedback on this after v6 gets out  :)

 

I'm also looking at a "one box to rule them all" setup, but am strongly leaning towards a C612 chipset 2011 based setup using this board:  http://www.newegg.com/Product/Product.aspx?Item=N82E16813182927&cm_re=x10srl-f-_-13-182-927-_-Product  with an E5 Xeon (probably a 6-core E5-1650).

 

Link to comment

Bungee, just wanted to let you know I haven't forgotten about this thread, but won't have time to give you the reply it deserved probably until after we get 6.0 final out the door.  Thanks for understanding.

 

No worries, let's geek out and chat about it when you're not so pressed for time!  ;)

Link to comment

I think it's a little bit of both (the limited lane count and the consumer-grade chipset)...

This use case is certainly a minority one for these chipsets/controllers, and I believe for that reason not much thought is put into it.

I kind of expected to see information about this in my MB manual, similar to how IRQ mapping and shared ports used to be laid out years ago (apparently they don't do that anymore, I guess I'm getting old!).

The additional bus width/channels should certainly help with routing for proper isolation on non-consumer-grade chipsets/hardware. It is also a plus that the CPU has support for isolation on downstream ports! Most of the MBs we use are geared toward multiple GPUs running SLI or Crossfire; I am surprised that doesn't require isolation to work properly (haven't researched, making some assumptions here).

 

I have done further testing using the downstream override option, and it hasn't led to any issues that I've noticed yet (still a bit flaky at times with pass-through in general, however I also just seem to be unlucky  :P ).

Link to comment

Agree that manuals these days don't have NEAR the level of detail they used to. I think a large part of that is the massive integration => there's not nearly as much customization that can be done on the boards themselves, since so much is now integrated into the CPU or included in the chipset. The Intel ARK site has a good overview of the capabilities included in each of the various CPUs and chipsets, but you have to do some real digging to find the specific details for each supported feature.

 

There are a few folks on this board who are using E5 series Xeons in their systems ... not Haswell generation, but I think there are some Sandy or Ivy Bridge units. Perhaps one of them will chime in with their experiences.

 

 

 

Link to comment
  • 1 year later...

Would really like to see some community testing on this thread.

The ASRock Z170 Extreme7+ and ASRock Z170 Gaming i7 (which are basically identical) have 3 isolated 16x slots, but the 4th (top middle) 16x slot and all the other PCIe slots are grouped with the chipset and SATA controllers, and the USB ports are also tied to the chipset's groups. I can post a full copy-paste of the System Devices page when I get home.
