HOW TO: Using SR-IOV in UnRAID with 1Gb/10Gb/40Gb network interface cards (NICs)


BVD


Seeing this, I decided to put my Mellanox ConnectX-3 Pro into my UnRAID server.

 

I created 4 VFs following @ConnectivIT's guide using Option 1 (didn't set any static MACs yet). Then I added one of the VFs to my Win10 gaming workstation.

However, Windows Device Manager doesn't want to start the device. There is a Mellanox ConnectX-3 VPI (MT041900) Virtual Network Adapter visible, but it is stopped with code 43. I tried installing the 5.50.53000 WinOF driver from the Nvidia site, but still no go.

 

Any suggestion for a solution would be greatly appreciated!

Link to comment

What version of Windows 10? It's like I said with Mellanox drivers, they're all kinda tied up in licensing crap...

 

My suggestion would be to use something like 7zip to manually extract an older version of the drivers from the executable and then attempt manual installation.
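
For example, something like this (a rough sketch - the installer filename and extraction path are placeholders, and pnputil needs an elevated prompt):

7z x MLNX_WinOF-5_50_53000_All_win2019_x64.exe -oC:\WinOF_extracted
pnputil /add-driver C:\WinOF_extracted\*.inf /subdirs /install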

 

For Windows and virtual functions, Intel (and some Chelsio) cards are really the only surefire way to ensure compatibility, thanks to the crap Nvidia and Mellanox have pulled with their drivers.

Link to comment
14 hours ago, sorano said:

Seeing this, I decided to put my Mellanox ConnectX-3 Pro into my UnRAID server.

 

I created 4 VFs following @ConnectivIT's guide using Option 1 (didn't set any static MACs yet). Then I added one of the VFs to my Win10 gaming workstation.

However, Windows Device Manager doesn't want to start the device. There is a Mellanox ConnectX-3 VPI (MT041900) Virtual Network Adapter visible, but it is stopped with code 43. I tried installing the 5.50.53000 WinOF driver from the Nvidia site, but still no go.

 

Any suggestion for a solution would be greatly appreciated!

 

Try using mlxconfig to confirm SR-IOV is properly enabled on the card - step 3 here: https://mymellanox.force.com/mellanoxcommunity/s/article/howto-configure-sr-iov-for-connectx-3-with-kvm--infiniband-x
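
Roughly, if memory serves (the mt4103 device name is just what a ConnectX-3 Pro typically shows up as under the Mellanox tools - check mst status for yours):

mst start
mlxconfig -d /dev/mst/mt4103_pci_cr0 query
mlxconfig -d /dev/mst/mt4103_pci_cr0 set SRIOV_EN=1 NUM_OF_VFS=4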

 

Edit: 

 

You could probably follow the firmware update instructions in my post, but replace fw-ConnectX2-rel.mlx with fw-ConnectX3-rel.mlx

 

I'm not sure if the format of the ini file is the same for ConnectX-3 though, i.e. this part:

num_pfs = 1
total_vfs = 64
sriov_en = true

 

Maybe export what's in the current config:

 

mstflint -d mtxxxxx_pci_cr0 dc > backup.ini

 

What's in the [HCA] section?
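
For comparison, I'd expect the [HCA] section on a ConnectX-3 to look something like the below - but note the 0x1003 device ID and VF count are my assumptions, not a verified dump:

[HCA]
hca_header_device_id = 0x1003
num_pfs = 1
total_vfs = 16
sriov_en = true

And if the mlxburn syntax carries over from ConnectX-2, burning the edited config would be along the lines of: mlxburn -d /dev/mst/mt4103_pci_cr0 -fw fw-ConnectX3-rel.mlx -conf edited.ini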

 

I did all this on a Windows machine, but there are Mellanox tools in unRAID Apps: Mellanox-Firmware-Tools

 

Edit2:

 

Some more bed-time reading for you!  https://forums.servethehome.com/index.php?threads/how-to-enable-sr-iov-on-connectx-3.24321/

Edited by jortan
Link to comment
On 3/29/2021 at 7:06 PM, BVD said:

What version of Windows 10? It's like I said with Mellanox drivers, they're all kinda tied up in licensing crap...

 

My suggestion would be to use something like 7zip to manually extract an older version of the drivers from the executable and then attempt manual installation.

 

For Windows and virtual functions, Intel (and some Chelsio) cards are really the only surefire way to ensure compatibility, thanks to the crap Nvidia and Mellanox have pulled with their drivers.

It's Windows 10 20H2.

 

But yeah, seems to be related to that driver mess just like you wrote. While troubleshooting, I added the VF to an Ubuntu VM that just boots off the Ubuntu 20.10 live ISO, and the VF worked straight away.

 

Guess I'm gonna be finding and testing a lot of drivers; if you have any recommendations they would be greatly appreciated.

 

The card is updated with the latest official firmware, 2.42.5000.

Link to comment
3 hours ago, sorano said:

Guess I'm gonna be finding and testing a lot of drivers; if you have any recommendations they would be greatly appreciated.

 

It's a long shot, but you may want to try WinOF v5.50.54000 (the one that's allegedly for Server 2019 only).

 

It worked fine with Win10 LTSC (1809) and a ConnectX-2.

Link to comment

The code 43 error is the exact driver shenanigans they pull with video cards, but with virtual functions it's unfortunately even more difficult to get around.

 

A couple things you can try:

* instead of running the full installer, open the executable (.exe installer) with 7zip or the like, and manually install the drivers for your OS (using the .inf files)

* test against a different version of Windows 10 to verify whether it's locked to a specific build type (Education, Pro, Pro for Workstations, or an enterprise LTSC build)

* read through the release notes for earlier driver versions in order to find one that has the best chances for success; 2017 or earlier is likely your best bet in my experience

 - note: this may require a firmware downgrade of your card, something I have very limited personal experience with

Link to comment
  • 4 weeks later...

After spending way too many hours trying to get the Mellanox virtual function running under Windows, I finally gave up and bought an Intel X520-DA2.

 

New card, new problems.

It's like the kernel is ignoring the SR-IOV functions of the card for some reason.

 


 

I checked in the BIOS but could not find anything related to activating SR-IOV for the card either.

 

Since sriov_numvfs does not exist for the device, I cannot create any VFs.

 


Edited by sorano
Link to comment

You have to bind the 520 series prior to using VFs, as it's partitioning the entire device - take a look at the chipset-specific section, which goes over that a bit more. The guide covers it in pretty decent detail; just be sure to follow it through and you'll be set 👍

Link to comment
3 hours ago, BVD said:

You have to bind the 520 series prior to using VFs, as it's partitioning the entire device - take a look at the chipset-specific section, which goes over that a bit more. The guide covers it in pretty decent detail; just be sure to follow it through and you'll be set 👍

 

Damn, I really hoped that would have been it but it's still not working.

 

Same as before - I cannot even create VFs, even though the X520 is VFIO-bound.


 


 

sriov_numvfs still does not exist for the card.


 

 

Link to comment

I'd re-read the instructions - you have 4 lines for your VFs, doing two separate things, where you should only have one line per physical port.

 

Sorry, I'm mobile right now or I'd type out more, but it looks like you might've mixed both methods for some reason.

Link to comment
2 hours ago, BVD said:

I'd re-read the instructions - you have 4 lines for your VFs, doing two separate things, where you should only have one line per physical port.

 

Sorry, I'm mobile right now or I'd type out more, but it looks like you might've mixed both methods for some reason.

 

Well, in the last post you told me I need to VFIO-bind the NIC before configuring VFs, right?

So after that I added the first two lines:

# VFIO bind Intel X520 before creating Virtual Functions
bash /boot/config/vfio-pci-bind.sh 8086:154d 0000:06:00.0
bash /boot/config/vfio-pci-bind.sh 8086:154d 0000:06:00.1


Then the next two lines create the actual VFs:

# Create Intel X520 Virtual Functions
echo 4 > /sys/bus/pci/devices/0000:06:00.0/sriov_numvfs
echo 4 > /sys/bus/pci/devices/0000:06:00.1/sriov_numvfs

(This is the part that does not work: since sriov_numvfs is not visible under /sys/bus/pci/devices/0000:06:00.0/, the echo does nothing.)

 

Link to comment

Finally back home!

So this is where I think you got tripped up (just referencing the guide for any others that might come across this):
 

Quote

5.a - In Settings -> User Scripts, create a new script.

5.b For each interface, we'll call the script, specifying the vendor ID (note, this is different from the physical device's vendor ID), domain (always 0000 in our case), and bus ID - we'll choose to run this at first array start only, as it's only needed once per boot, one line per VF:

sudo bash /boot/config/vfio-pci-bind.sh 8086:10ed 0000:17:10.0;

 

Instead of putting this in a user script, you added it to the go file. The intent with the script is to allow you to bind any port (anything, really, that has function-level reset capabilities, but in this case a port) at any time, as long as that port isn't currently in an active (up) state. This is necessary in our case, as the virtual functions don't exist until after PCIe init time - they're only created once numvfs is written, which then 'partitions' (sorta, but that's the easiest way to think of it anyway) the physical function into whatever number of virtual functions you decide upon.

 

You shouldn't have to bind the physical port at all for method one - you're partitioning at the PCI layer, prior to the driver loading. This is where I was saying I'd thought perhaps you'd intermingled the two methods a bit.

 

So starting from scratch, your process should be something like:

* Add the 'echo' lines desired to the go file, then reboot

* Create a user script that calls the bind script for each of the VFs you want to pass on - run it once now to try it out, then set it to run on first array start only for the future (see the sketch below)
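
To make that concrete, a sketch using the X520 addresses from earlier in the thread (8086:10ed is the 82599/X520 virtual function device ID; the VF bus addresses are assumptions - check System Devices for yours):

# /boot/config/go - create 4 VFs per port at boot
echo 4 > /sys/bus/pci/devices/0000:06:00.0/sriov_numvfs
echo 4 > /sys/bus/pci/devices/0000:06:00.1/sriov_numvfs

# user script, run at first array start - one line per VF you pass through
bash /boot/config/vfio-pci-bind.sh 8086:10ed 0000:06:10.0
bash /boot/config/vfio-pci-bind.sh 8086:10ed 0000:06:10.1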

 

Lemme know if this doesn't get you over the finish line and we can see about taking a look together further down the road 


_____

 

As an alternative, method two would be an option here as well:
* Run the bind script for the port, or bind via the UI and reboot (this unloads the ixgbe driver, freeing up the physical function so it can be partitioned into virtual functions - something that isn't necessary in method one, since the VF creation is initiated prior to ixgbe being bound to the PF)

* Type the 'echo' line into the terminal for the ethX device

Link to comment

I really appreciate you taking your time trying to help.

 

Right now I'm just going to accept that this piece of trash Asus motherboard is fucking broken with SR-IOV and plan better for my next build.

 

No matter what I do, sriov_numvfs will not show up in sysfs for the device, so the echoing has no effect.
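
For anyone who lands here with the same symptom: before blaming the board outright, it may be worth confirming whether the kernel sees the SR-IOV capability at all. A generic set of checks (the 06:00.0 address is from the posts above):

lspci -s 06:00.0 -vv | grep -i -A4 'SR-IOV'              # is the capability advertised at all?
cat /sys/bus/pci/devices/0000:06:00.0/sriov_totalvfs     # only present once the kernel enables the capability
dmesg | grep -iE 'sr-?iov|ixgbe'                         # look for e.g. "not enough MMIO resources for SR-IOV"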

Link to comment
  • 2 months later...

Got SR-IOV working nicely with my X540 dual 10G card. It works great in Linux machines, but when I try to use a VF in OSX, where I have to use the SmallTree driver, it does not work: the VF has a different device ID (1515, instead of 1528 for the main device), which can be read (but not changed) from, in my case, /sys/bus/pci/devices/0000:82:00.0/sriov_vf_device.

So the driver does not even recognise the VF.

Anyone got something working on OSX and wants to share the process?

Link to comment

I'd be surprised if macOS even had a VF driver for anything that doesn't come from Apple - my expectation would be that you'd have to do some manual driver hacking to make it work... What cards does Apple support/sell for 10Gb or higher connectivity? Do they provide driver/firmware downloads?

Link to comment
1 hour ago, BVD said:

I'd be surprised if macOS even had a VF driver for anything that doesn't come from Apple - my expectation would be that you'd have to do some manual driver hacking to make it work... What cards does Apple support/sell for 10Gb or higher connectivity? Do they provide driver/firmware downloads?

 

Yeah, I'm sure hacking is involved ;-) Hence my question.

I've already been using flashed Intel 10G and 1Gb cards for years, with hacked SmallTree drivers so they're recognised as SmallTree cards, to run them in OSX.

But this SR-IOV is a new mode. Based on my testing and how I usually hack the drivers, my guess is that it's the different device ID presented to the OS, as I mentioned.

 

Not sure what Apple themselves sell nowadays for 10G, but there are a few companies (like SmallTree, SANLink, etc.) that have 10G (and higher) cards and their own drivers.

But SR-IOV will be new for these as well. It's a new thing we're playing with here, with a very small user base of like-minded enthusiasts.

Link to comment
  • 6 months later...

First of all, a huge thank you to you @BVD! IMO it's the most complete and "user-friendliest" guide to SR-IOV that I could find anywhere. Really cool!


I know that you focused this thread on NICs, but I tried to apply the guide to Intel's iGPUs, which (supposedly) support SR-IOV on Gen 11+.

I don't want to hijack this thread - details are in a separate thread.

 

When I try to use the VFs from the iGPU, I get an error stating that the VF needs a VF token to be used (from my research, a shared-secret UUID between the PF and VF). Most sources about VF tokens I could find are from discussions around DPDK, so I guess it's also a topic one could stumble upon when using VFs on NICs. I was wondering if someone here has ever had to deal with something similar and knows a way of setting the token with the workflow described here.
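
I haven't had to set a VF token myself, so treat this as a pointer rather than an answer: in DPDK, the shared UUID is handed to the process as an EAL flag, e.g. (the UUID and VF address here are made-up examples):

dpdk-testpmd --vfio-vf-token=14d63f20-8445-11ea-8900-1f9ce7d5650d -a 0000:00:02.1 -- -i

Whether or how that maps onto the QEMU command line UnRAID generates, I don't know - but it at least shows the shape of the thing.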

Link to comment
  • 2 months later...

Hi, thank you for a fantastic guide. I have been playing with it for some time and can't get it to work. I believe I have similar problems to some of the people in this thread. Let me describe my setup and hopefully someone can shed some light.

 

-ASUS TUF GAMING B550M-PLUS, BIOS version 2604

-AMD Ryzen 7 5700G

-32 GB Corsair RAM (3600 MHz)

-Intel X550-T2

 

Before any changes, the NIC shows up normally in System Devices, with both ports using the ixgbe driver.


 

 

Creating the VFs was rather easy. Whether editing the go file (method 1) or using the echo method (option 2), both work; I went with option 2 as it's easier.

sudo echo 4 > /sys/class/net/eth1/device/sriov_numvfs

eth1 corresponds to the 04:00.1 controller of the Intel X550. eth0 is used for UnRAID; eth2 is the onboard 2.5G NIC.
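
(For what it's worth, that ethX-to-PCI-address mapping can be double-checked straight from sysfs:)

readlink /sys/class/net/eth1/device     # prints ../../../0000:04:00.1 on this box, per the above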

 

After creating the VFs, System Devices shows that they exist.

 


 

Running the bind script from the UnRAID command line shows that the script unbinds the physical device, which should not happen:

 

root@Tower:~# sudo bash /boot/config/vfio-pci-bind.sh 8086:1565  0000:05:10.1
Vendor:Device 8086:1565 found at 0000:05:10.1

IOMMU group members (sans bridges):
/sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:04:00.0
/sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:04:00.1
/sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:05:10.1
/sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:05:10.3
/sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:05:10.5
/sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:05:10.7

Binding...
Unbound 0000:04:00.0 from ixgbe
Unbound 0000:04:00.1 from ixgbe
/boot/config/vfio-pci-bind.sh: line 157: /sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:05:10.1/driver_override: No such file or directory
/boot/config/vfio-pci-bind.sh: line 172: echo: write error: No such device
/boot/config/vfio-pci-bind.sh: line 157: /sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:05:10.3/driver_override: No such file or directory
/boot/config/vfio-pci-bind.sh: line 172: echo: write error: No such device
/boot/config/vfio-pci-bind.sh: line 157: /sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:05:10.5/driver_override: No such file or directory
/boot/config/vfio-pci-bind.sh: line 172: echo: write error: No such device
/boot/config/vfio-pci-bind.sh: line 157: /sys/bus/pci/devices/0000:05:10.1/iommu_group/devices/0000:05:10.7/driver_override: No such file or directory
/boot/config/vfio-pci-bind.sh: line 172: echo: write error: No such device

success...

Device 8086:1565 at 0000:05:10.1 bound to vfio-pci
Devices listed in /sys/bus/pci/drivers/vfio-pci:
lrwxrwxrwx 1 root root    0 Mar 30 11:08 0000:01:00.0 -> ../../../../devices/pci0000:00/0000:00:01.1/0000:01:00.0
lrwxrwxrwx 1 root root    0 Mar 30 11:08 0000:01:00.1 -> ../../../../devices/pci0000:00/0000:00:01.1/0000:01:00.1
lrwxrwxrwx 1 root root    0 Mar 30 11:08 0000:04:00.0 -> ../../../../devices/pci0000:00/0000:00:02.1/0000:02:00.2/0000:03:00.0/0000:04:00.0
lrwxrwxrwx 1 root root    0 Mar 30 11:08 0000:04:00.1 -> ../../../../devices/pci0000:00/0000:00:02.1/0000:02:00.2/0000:03:00.0/0000:04:00.1

ls -l /dev/vfio/
total 0
crw------- 1 root users 248,   1 Mar 30 10:52 10
crw------- 1 root root  248,   2 Mar 30 11:08 14
crw------- 1 root users 248,   0 Mar 30 10:52 9
crw-rw-rw- 1 root root   10, 196 Mar 30 10:52 vfio

I think this is where the problem lies: when I run the bind script from andre-richter, it unbinds my physical interface, and then, as the physical interface is bound to vfio, the VFs I created just vanish. Going into andre-richter's GitHub, I found these lines:

Quote

# (2) Unbinds all devices that are in the same iommu group as the supplied
# device from their current driver (except PCIe bridges).
#
# (3) Binds to vfio-pci:
# (3.1) The supplied device.
# (3.2) All devices that are in the same iommu group.

Afterward, System Devices shows that both NICs are bound to vfio, which should not happen, right?


I think what happened is that, after creating the VFs, the physical function and the VFs ended up in the same IOMMU group.

 

If my understanding is correct, the script shouldn't bind the physical function to vfio, since the ixgbe driver needs to be active in order to provide the VFs. But because the script binds every device in the same IOMMU group, in my case that includes the physical function. In addition, from reading the guide, devices that use the ixgbe driver don't have an abstraction layer - which is why I'm getting 5 NICs in total (including the physical one).
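
For anyone wanting to see this directly, the grouping can be dumped with a plain sysfs walk (generic, nothing UnRAID-specific):

for d in /sys/kernel/iommu_groups/*/devices/*; do
  g=${d%/devices/*}; g=${g##*/}                      # group number
  echo "IOMMU group $g: $(lspci -nns ${d##*/})"      # device in that group
done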

Link to comment

I got it. If I set the override to multifunction + downstream, the VFs end up in their own groups, just as you showed in your tutorial, and then it doesn't conflict with the script. For anyone with the same problem: go to VM settings and set PCIe ACS override to 'Both'. This allows all the VFs to be in their own IOMMU groups. I don't know if this is needed, or if someone can modify the script instead, but for now this works.
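
(For reference, and as I understand it - worth verifying on your own box - that VM-settings toggle just appends the ACS override patch's kernel parameter to the boot line in /boot/syslinux/syslinux.cfg, something like:)

append pcie_acs_override=downstream,multifunction initrd=/bzroot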

Link to comment
14 hours ago, Bobo_unraid said:

I got it. If I set the override to multifunction + downstream, the VFs end up in their own groups, just as you showed in your tutorial, and then it doesn't conflict with the script. For anyone with the same problem: go to VM settings and set PCIe ACS override to 'Both'. This allows all the VFs to be in their own IOMMU groups. I don't know if this is needed, or if someone can modify the script instead, but for now this works.

 

Whether or not the PCIe ACS override is needed is wildly system-dependent - any time you hit an issue with IOMMU grouping, it's one of the options available, but unfortunately not a one-size-fits-all.

 

Glad you got it figured out!!

Link to comment
8 minutes ago, BVD said:

 

Whether or not the PCIe ACS override is needed is wildly system-dependent - any time you hit an issue with IOMMU grouping, it's one of the options available, but unfortunately not a one-size-fits-all.

 

Glad you got it figured out!!

Also, I'm not too sure about the downside of enabling this option. I tend to enable stuff as needed - fewer options enabled means less trouble for me.

 

I think the script works only if your VF and physical function are not in the same IOMMU group. My ASUS motherboard puts the VFs and PFs in the same group, which was the problem. I'm not sure if this is a limitation of the bind script, but looking at the author's code description, it looks to me like, if all the VFs are in the same IOMMU group, supplying just one of the VFs to the script makes it bind all of them (not sure). It seems to me that's what the author is expecting (maybe the default norm is the PF in one group and the VFs in another). I wanted to modify the script to bind just the one device specified and not include the other devices in its IOMMU group, but my limited bash skills are just not letting me do that.

 

I think this might be a point worth adding to your tutorial: looking at the machine's IOMMU groups and how they interact with the vfio-pci-bind script.

Link to comment
  • 2 weeks later...

The limitation of IOMMU grouping is always on the side of the motherboard (I usually hesitate to make statements like this as fact, but I think it's pretty close here). The ability to segment down to individual traces is completely dependent upon the PCB design and BIOS implementation, which is on the MB side.

 

The 'downside' to enabling the override is, at least in theory, stability - what we're doing is 'ignoring' what the motherboard says it should support. In reality, if you enable it and everything 'just works', it'll more than likely continue to 'just work' indefinitely (barring any changes to the MB BIOS).

 

Unfortunately, IOMMU grouping and how it all actually works is beyond the scope of this tutorial, but I agree it's a subject that could use clarification. A lot of it boils down to hardware implementation and the option ROM size the MB vendor builds into its design - most consumer boards only have enough space to load the fancy graphical BIOS they come with, whereas workstation/server boards still tend towards keyboard-driven UIs (leaving more space for other operations).

Link to comment
  • 1 month later...

I'm surprised to find that the 9p driver for native file sharing just works - I almost thought there would be a specific driver for folder sharing.

 

One problem is really bugging me, and I feel like it's a simple question, but the amount of time it's made me bang my head is getting to be too much. So I'm gonna ask here.

 

Right now I have an OPNsense VM using VF 0 instead of virtio-net, having followed this guide and gotten it working. Everything works, except that my Pi-hole Docker container cannot ping OPNsense at 10.0.0.1. The Pi-hole container is set up using a Docker custom network (br0). This is not an issue when using virtio-net, and all the other containers using the custom network seem to be inaccessible as well.

 

I googled around and came across this trust parameter, set with the command:

ip link set eth0 vf 0 trust on

However, this does not seem to work, nor does it stay set over a reboot. I wonder if you have encountered similar problems?
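
If trust does turn out to be what's needed, the usual way to make it survive reboots on UnRAID would be a user script set to run at first array start - a sketch, with the PF name and VF number assumed from the description above:

#!/bin/bash
# re-apply VF flags after each boot; eth0 = the PF, VF 0 = the OPNsense VM
ip link set eth0 vf 0 trust on
ip link set eth0 vf 0 spoofchk off    # spoof checking can also silently drop VF traffic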

 

Link to comment

Would need to know more about your network for this one - e.g. the ifconfig outputs for each of the interfaces involved, whether you've got them both configured under the same domain, what routes you've configured, how you're attempting to communicate (tried pinging from one to the other and vice versa, or just using 53/853 for DNS/DoT, etc), and so on.

 

When using the virtio driver, you're saying "this application is part of this host" as far as the network is concerned - so if your host certs are correct and the like, your container's under that umbrella (with caveats of course). When using this guide though, you're saying "this is a separate server which simply communicates across the same physical port", so you essentially need to treat its configuration like that of a brand new host being added to your network.

Edited by BVD
Link to comment
  • 4 weeks later...

Awesome post!! I haven't yet upgraded my network to 10G, so I plan to use an older Intel 4-port gigabit NIC (I don't recall the chipset off the top of my head, but I know it is a PCIe 2.0 version). I am planning to run pfSense in a VM, so I want good network throughput. Assuming that NIC even works with SR-IOV, would you suggest I spend the time setting it up, or is there really only a benefit once you move to 10 gig?

Edited by sphbecker
Link to comment
