HOW TO: Using SR-IOV in UnRAID with 1Gb/10Gb/40Gb network interface cards (NICs)


BVD


Seeing this I decided to put my Mellanox ConnectX3-Pro into my UnRAID server.

 

I created 4 VFs following @ConnectivIT's guide using Option 1 (didn't set any static MAC yet). Then I added 1 of the VFs to my Win10 gaming workstation.

However, Windows device manager doesn't want to start the device. There is a Mellanox ConnectX-3 VPI (MT041900) Virtual Network Adapter visible, but it is stopped with code 43. I tried installing 5.50.53000 WinOF driver from Nvidia site, but still no go.

 

Any suggestion for a solution would be greatly appreciated!

Link to post

What version of Windows 10? It's like I said with Mellanox drivers, they're all kinda tied up in licensing crap...

 

My suggestion would be to use something like 7zip to manually extract an older version of the drivers from the executable and then attempt manual installation.

 

For Windows and virtual functions, Intel (and some Chelsio) cards are really the only sure-fire way to ensure compatibility, thanks to the crap Nvidia and Mellanox have pulled with their drivers.

Link to post
Try using mlxconfig to confirm SR-IOV is properly enabled on the card - step 3 here: https://mymellanox.force.com/mellanoxcommunity/s/article/howto-configure-sr-iov-for-connectx-3-with-kvm--infiniband-x

 

Edit: 

 

You could probably follow the firmware update instructions in my post, but replace fw-ConnectX2-rel.mlx with fw-ConnectX3-rel.mlx

 

I'm not sure if the format of the ini file is the same for ConnectX-3 though, i.e. this part:

num_pfs = 1
total_vfs = 64
sriov_en = true

 

Maybe export what's in the current config

 

mstflint -d mtxxxxx_pci_cr0 dc > backup.ini

 

What's in the [HCA] section?
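If it helps, here is a rough sketch for pulling just the [HCA] section out of the exported backup.ini. The helper name print_ini_section is made up for illustration, and it assumes the dump is a plain ini-style text file with [SECTION] headers, which I haven't confirmed for ConnectX-3:

```shell
#!/bin/bash
# Hypothetical helper: print the non-blank lines of one [section]
# from an ini-style file (assumed format of the mstflint "dc" dump).
# Usage: print_ini_section <file> <section>
print_ini_section() {
    local file="$1" section="$2"
    awk -v want="[$section]" '
        /^[ \t]*\[/ {                        # any section header line
            line = $0
            gsub(/^[ \t\r]+|[ \t\r]+$/, "", line)   # trim spaces and CR
            in_section = (line == want)      # are we inside the wanted section?
            next
        }
        in_section && NF { print }           # keep non-blank lines only
    ' "$file"
}

# Example: print_ini_section backup.ini HCA
```

If num_pfs, total_vfs and sriov_en show up in that output, the format is probably close enough to the ConnectX-2 one to edit the same way.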

 

I did all this on a Windows machine, but there are Mellanox tools in unRAID Apps: Mellanox-Firmware-Tools

 

Edit2:

 

Some more bed-time reading for you!  https://forums.servethehome.com/index.php?threads/how-to-enable-sr-iov-on-connectx-3.24321/

Edited by jortan
Link to post
It's Windows 10 20H2.

 

But yeah, seems to be related to that driver mess just like you wrote. While troubleshooting I added the VF to an Ubuntu VM that just boots off Ubuntu 20.10 live iso and the VF worked straight away.

 

Guess I'm gonna be finding and testing a lot of drivers. If you have any recommendations, they would be greatly appreciated.

 

The card is updated to the latest official firmware, 2.42.5000.

Link to post

It's a long shot, but you may want to try WinOF v5.50.54000 (the one that's allegedly for Server 2019 only).

 

Worked fine with Win10 LTSC (1809) and ConnectX-2

Link to post

The error 43 is the exact driver shenanigans they pull with video cards, but with virtual functions it's unfortunately even more difficult to get around.

 

A couple things you can try:

* instead of running the full installer, open the executable (.exe installer) with 7zip or the like, and manually install the drivers for your OS (using the .inf files)

* test against a different version of Windows 10 to verify whether it's locked to a specific build type (Education, Pro, Pro for Workstations, or an Enterprise LTS build)

* read through the release notes for earlier driver versions to find the one with the best chance of success; 2017 or earlier is likely your best bet in my experience

 - note, this may require a firmware downgrade of your card, which I've very limited personal experience with

Link to post
  • 4 weeks later...

After spending way too many hours trying to get a Mellanox Virtual Function running under Windows, I finally gave up and bought an Intel X520-DA2.

 

New card, new problems.

It's like the kernel is ignoring the SR-IOV functions of the card for some reason.

 

 

I checked in BIOS but could not find anything related to activating SR-IOV for the card either.

 

Since sriov_numvfs does not exist for the device, I cannot get any VFs.

 


Edited by sorano
Link to post

You have to bind the X520 series prior to using VFs, as it's partitioning the entire device. Take a look at the chipset-specific section of the guide, which goes over that in more detail. Just be sure to follow it through and you'll be set 👍

Link to post

Damn, I really hoped that would have been it but it's still not working.

 

Same as before. I cannot even create VFs even though the X520 is VFIO bound.

sriov_numvfs still does not exist for the card.

Link to post

I'd re-read the instructions. You have 4 lines for your VFs, doing two separate things, where you should only have one line per physical port.

 

Sorry, I'm on mobile right now or I'd type out more, but it looks like you might've mixed both methods for some reason.

Link to post

Well, in the last post you told me I need to VFIO-bind the NIC before configuring VFs, right?

So after that I added the first two lines:

# VFIO bind Intel X520 before creating Virtual Functions
bash /boot/config/vfio-pci-bind.sh 8086:154d 0000:06:00.0
bash /boot/config/vfio-pci-bind.sh 8086:154d 0000:06:00.1


Then the next two lines create the actual VFs:

# Create Intel X520 Virtual Functions
echo 4 > /sys/bus/pci/devices/0000:06:00.0/sriov_numvfs
echo 4 > /sys/bus/pci/devices/0000:06:00.1/sriov_numvfs

(This is the part that does not work, since sriov_numvfs is not visible under /sys/bus/pci/devices/0000:06:00.0/, so the echo does nothing.)
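For anyone debugging the same thing, a minimal sketch of a check that can be run before the echo lines. The function name sriov_status and its return codes are made up for illustration; the sysfs attribute names (driver, sriov_numvfs, sriov_totalvfs) are the standard kernel ones, and the optional second argument only exists so the function can be pointed at a test directory:

```shell
#!/bin/bash
# Hypothetical helper: report the bound driver and SR-IOV attributes
# for one PCI function.
# Usage: sriov_status <pci-address> [sysfs-devices-dir]
# Returns 1 if the device is missing, 2 if sriov_numvfs is missing.
sriov_status() {
    local addr="$1"
    local base="${2:-/sys/bus/pci/devices}"
    local dev="$base/$addr"

    if [ ! -d "$dev" ]; then
        echo "$addr: device not found"
        return 1
    fi

    # The "driver" symlink only exists while a driver is bound.
    local driver="none"
    if [ -L "$dev/driver" ]; then
        driver="$(basename "$(readlink "$dev/driver")")"
    fi
    echo "$addr: driver=$driver"

    if [ ! -e "$dev/sriov_numvfs" ]; then
        echo "$addr: sriov_numvfs missing"
        return 2
    fi
    echo "$addr: numvfs=$(cat "$dev/sriov_numvfs") totalvfs=$(cat "$dev/sriov_totalvfs")"
}

# Example: sriov_status 0000:06:00.0
```

In my case this would print the "sriov_numvfs missing" line for both ports, which is exactly the problem.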

 

Link to post

Finally back home!

So this is where I think you got tripped up (just referencing the guide for any others that might come across this):
 

Quote

5.a - In Settings -> User Scripts, create a new script.

5.b - For each interface, we'll call the script, specifying the vendor:device ID (note: the VF's device ID is different from the physical device's), the domain (always 0000 in our case), and the bus address, one line per VF. We'll choose to run this at first array start only, as it's only needed once per boot:

sudo bash /boot/config/vfio-pci-bind.sh 8086:10ed 0000:17:10.0;

 

Instead of putting this in a user script, you'd added it to the go file. The intent with the script is to allow you to bind any port at any time (really anything that has function-level reset capabilities), as long as that port isn't currently in an active (up) state. This is necessary in our case, as the virtual functions don't exist until after PCIe init time. They're only created once a value is written to sriov_numvfs, which then 'partitions' (sorta, but that's the easiest way to think of it anyway) the physical function into whatever number of virtual functions you decide upon.

 

You shouldn't have to bind the physical port at all for method one - you're partitioning at the pci layer, prior to the driver loading. This is where I was saying I'd thought perhaps you'd intermingled the two methods a bit.

 

So starting from scratch, your process should be something like:

* Add the 'echo' lines desired to the go file, then reboot

* Create a user script that calls the bind script for each of the VFs you want to pass on - run it once now to try it out, then set it to run on first array start only for the future
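As a rough sketch of what that user script could look like: once the VFs exist, the kernel exposes them as virtfnN symlinks under the physical function in sysfs, so the arguments for the bind script can be listed rather than typed by hand. The function name list_vf_bind_args is made up for illustration, and the optional second argument is only there so it can be tested against a fake directory:

```shell
#!/bin/bash
# Hypothetical helper: for each VF of a physical function, print
# "<vendor>:<device> <pci-address>", i.e. the two arguments that
# vfio-pci-bind.sh is called with in the guide.
# Usage: list_vf_bind_args <pf-pci-address> [sysfs-devices-dir]
list_vf_bind_args() {
    local pf="$1"
    local base="${2:-/sys/bus/pci/devices}"
    local link vf vendor device

    for link in "$base/$pf"/virtfn*; do
        [ -e "$link" ] || continue           # no VFs created yet
        vf="$(basename "$(readlink -f "$link")")"
        vendor="$(cat "$link/vendor")"       # e.g. 0x8086
        device="$(cat "$link/device")"       # e.g. 0x10ed
        echo "${vendor#0x}:${device#0x} $vf"
    done
}

# Example (dry run first, then feed the output to the bind script):
#   list_vf_bind_args 0000:06:00.0
#   list_vf_bind_args 0000:06:00.0 | while read -r id addr; do
#       bash /boot/config/vfio-pci-bind.sh "$id" "$addr"
#   done
```

This binds every VF on the port, so trim the list if you only want to pass some of them through.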

 

Lemme know if this doesn't get you over the finish line and we can see about taking a look together further down the road 


_____

 

As an alternative, method two would be an option here as well:
* Run the bind script for the port, or bind via the UI and reboot (this unloads the ixgbe driver, freeing up the physical function so it can be partitioned into virtual functions; this isn't necessary in method one because VF creation is initiated before ixgbe is bound to the PF)

* Type the 'echo' line into the terminal for the ethX device
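A minimal sketch of that 'echo' step with a couple of sanity checks around it, assuming the port shows up as a network interface with the usual /sys/class/net/ethX/device path. The function name set_numvfs and its return codes are made up for illustration. It writes 0 first because the kernel refuses to change an already non-zero VF count directly to another non-zero value:

```shell
#!/bin/bash
# Hypothetical helper: set the number of VFs on a network interface.
# Usage: set_numvfs <iface> <count> [sysfs-net-dir]
# Returns 1 if sriov_numvfs is missing, 2 if count exceeds the maximum.
set_numvfs() {
    local iface="$1" count="$2"
    local base="${3:-/sys/class/net}"
    local dev="$base/$iface/device"

    if [ ! -e "$dev/sriov_numvfs" ]; then
        echo "$iface: no sriov_numvfs attribute" >&2
        return 1
    fi

    local total
    total="$(cat "$dev/sriov_totalvfs")"
    if [ "$count" -gt "$total" ]; then
        echo "$iface: requested $count VFs, device supports $total" >&2
        return 2
    fi

    echo 0 > "$dev/sriov_numvfs"         # reset before changing the count
    echo "$count" > "$dev/sriov_numvfs"
}

# Example: set_numvfs eth0 4
```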

Link to post

I really appreciate you taking your time trying to help.

 

Right now I'm just going to accept that this piece of trash Asus motherboard is fucking broken with SR-IOV and plan better for my next build.

 

No matter what I do sriov_numvfs will not show up in sysfs for the device, so the echo'ing has no effect.

Link to post
