L0rdRaiden

New method for passthrough devices in Unraid 6.7 (vfio/BIND)


With the new release, Unraid 6.7, there is a new method to pass through devices.

Quote

New vfio-bind method.  Since it appears that the xen-pciback/pciback kernel options no longer work, we introduced an alternate method of binding, by ID, selected PCI devices to the vfio-pci driver. This is accomplished by specifying the PCI ID(s) of devices to bind to vfio-pci in the file 'config/vfio-pci.cfg' on the USB flash boot device.  This file should contain a single line that defines the devices:

BIND=<device> <device> ...

Where <device> is a Domain:Bus:Device.Function string, for example,

BIND=02:00.0

Multiple devices should be separated with spaces.

 

The script /usr/local/sbin/vfio-pci is called very early in system start-up, right after the USB flash boot device is mounted but before any kernel modules (drivers) have been loaded. The function of the script is to bind each specified device to the vfio-pci driver, which makes them available for assignment to a virtual machine and also prevents the Linux kernel from automatically binding them to any present host driver.

 

In addition, and importantly, this script will bind not only the specified device(s), but all other devices in the same IOMMU group as well. For example, suppose there is an NVIDIA GPU which defines both a VGA device at 02:00.0 and an audio device at 02:00.1. Specifying a single device (either one) on the BIND line is sufficient to bind both devices to vfio-pci. The implication is that either all devices of an IOMMU group are bound to vfio-pci or none of them are.

 

Right now I guess most of us have something like this in the Syslinux configuration, since it's the method discussed in the forum and in some YouTube videos:

[screenshot: Syslinux configuration]

And with these options enabled:

[screenshot: the enabled options]

 

So considering the new method and my current configuration to pass through a PCI network card, what should I add, remove, or keep from my current settings?

@limetech since this is a new feature, maybe someone from the team could explain it in detail for the KVM noobs.


You don't have to do anything if using vfio-pci.ids. This is for those that use xen-pciback.hide with the PCI ID in case they have two (or more) cards with the same vendor:device ID and only want one to be hidden (USB controller).

3 hours ago, saarg said:

You don't have to do anything if using vfio-pci.ids. This is for those that use xen-pciback.hide with the PCI ID in case they have two (or more) cards with the same vendor:device ID and only want one to be hidden (USB controller).

So if someone was going to be setting up some devices for PT in a vanilla system, nothing PT'd yet. Should we use this new method, or the example above where it's appended in the syslinux config?


To expand on @saarg's explanation: the "vfio-pci.ids" kernel parameter specifies devices that the Linux kernel should not try to initialize or assign to a driver, because doing so sometimes makes the device behave improperly when assigned to a VM (or makes it impossible to assign). This parameter identifies devices using "Vendor:Model" strings, where Vendor and Model are numeric values assigned by the device manufacturer. This is easy because that string will never change. The disadvantage is that if you have two or more of the exact same device in your server, then all of them will be "invisible" to Linux.

 

The other kernel parameter available was "xen-pciback.hide" (and synonym "pciback"), which accomplishes the same thing but takes a string of the form "Domain:Bus:Device.Function". Each of those values is also a number that identifies the device according to where it exists in your server's PCI bus topology. The advantage with this method is that an exact device can be identified independent of whether another of the same device exists in the server. The disadvantage is that if the physical h/w configuration changes, e.g., you move the device to a different PCI slot, then the PCI-ID of that device also changes.
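To illustrate the two addressing forms side by side, here is a small sketch (the helper names and the sample line are made up for illustration; the line format is just standard `lspci -nn` output):

```shell
# Pull both identifier styles out of one line of `lspci -nn` output.
# extract_bdf: the Bus:Device.Function address (first field) - changes if
#              the card moves to another slot.
# extract_id:  the last [vendor:device] bracket - identical for identical cards.
extract_bdf() {
  echo "$1" | awk '{print $1}'
}
extract_id() {
  echo "$1" | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tail -n 1 | tr -d '[]'
}

line='02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [10de:1b80] (rev a1)'
extract_bdf "$line"   # prints 02:00.0   (the form xen-pciback.hide / vfio-pci.cfg uses)
extract_id "$line"    # prints 10de:1b80 (the form vfio-pci.ids= uses)
```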

 

The problem we ran across was that xen-pciback.hide/pciback wasn't working any more, and rather than wait for a kernel dev to fix it, we decided to use the above alternate method. Note that in general, kernel evolution is moving away from kernel parameters and toward more flexible methods. For example, "isolcpus" is really deprecated, and there is an alternate method of isolating CPUs using config files which we will adopt in a future Unraid OS release.

 

As you can see, there is currently no "perfect" way of maintaining permanent assignment of devices to VM's.

3 minutes ago, limetech said:

To expand on @saarg's explanation: the "vfio-pci.ids" kernel parameter specifies devices that the Linux kernel should not try to initialize or assign to a driver, because doing so sometimes makes the device behave improperly when assigned to a VM (or makes it impossible to assign). This parameter identifies devices using "Vendor:Model" strings, where Vendor and Model are numeric values assigned by the device manufacturer. This is easy because that string will never change. The disadvantage is that if you have two or more of the exact same device in your server, then all of them will be "invisible" to Linux.

So, in my planning, I am going to be passing 2 x identical USB 3.1 PCIe x4 expansion cards and 2 x identical Zotac GTX 1080 mini GPUs through. The goal is to have one GPU and one USB card PT'd to each VM, a total of two VMs. Since I will have 2 x identical sets of matching hardware, I assume I should use this new method?

1 hour ago, cybrnook said:

So, in my planning, I am going to be passing 2 x identical USB 3.1 PCIe x4 expansion cards and 2 x identical Zotac GTX 1080 mini GPUs through. The goal is to have one GPU and one USB card PT'd to each VM, a total of two VMs. Since I will have 2 x identical sets of matching hardware, I assume I should use this new method?

Either method should work; note that all it does is tell the kernel not to bind any driver to the device(s), though actually it does bind the vfio stub driver (this is what prevents other drivers from binding). Another consideration is the IOMMU grouping. You can't pass two devices from the same IOMMU group to two different VM's unless you use the ACS override function. Using ACS override is "ok" in most Unraid applications because usually you are in complete control over what VM's are running and what is running in each VM. The cloud-server guys can't use ACS override because doing so (theoretically) allows one VM to gain access to another VM running on the same hardware - this is why the ACS override patch will never be accepted into the official kernel source tree. Personally, in this age we live in where everyone is extremely security-conscious, I would only use modern h/w with proper ACS support.
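To see the grouping on your own machine, a sketch like the following walks the standard Linux sysfs layout (nothing Unraid-specific; on hardware with no IOMMU enabled it simply prints nothing):

```shell
# List every IOMMU group and the PCI devices it contains. Devices sharing
# a group must all go to the same VM (or all be bound to vfio-pci).
list_iommu_groups() {
  for g in /sys/kernel/iommu_groups/*; do
    [ -d "$g" ] || continue          # IOMMU disabled/absent: nothing to list
    printf 'IOMMU group %s:' "${g##*/}"
    for d in "$g"/devices/*; do
      printf ' %s' "${d##*/}"
    done
    printf '\n'
  done
  return 0
}
list_iommu_groups
```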


I have this in my syslinux.cfg file...

 

label KVM with GUI/unRAID OS
  menu default
  kernel /bzimage
  append vfio-pci.ids=1b21:1142 intel_iommu=on pcie_acs_override=downstream,multifunction initrd=/bzroot,/bzroot-gui

 

which worked great until about rc7, and now I cannot get my VM to recognize my USB keyboard, or any hard drives I connect to the USB port.

 

Do these changes have anything to do with my new issue, or is it just coincidence and I need to look elsewhere for a solution to this new problem?

 

if so, where might I look next?

 

Thanks

media-diagnostics-20190511-1957.zip

6 hours ago, JustinChase said:

I have this in my syslinux.cfg file...

 

label KVM with GUI/unRAID OS
  menu default
  kernel /bzimage
  append vfio-pci.ids=1b21:1142 intel_iommu=on pcie_acs_override=downstream,multifunction initrd=/bzroot,/bzroot-gui

 

which worked great until about rc7, and now I cannot get my VM to recognize my USB keyboard, or any hard drives I connect to the USB port.

 

Do these changes have anything to do with my new issue, or is it just coincidence and I need to look elsewhere for a solution to this new problem?

 

if so, where might I look next?

 

Thanks

media-diagnostics-20190511-1957.zip 94.63 kB · 0 downloads

That issue has nothing to do with the new method. Probably best to open a new thread about the issue.


 

On 5/12/2019 at 5:16 AM, L0rdRaiden said:

Multiple devices should be separated with spaces

so would that be 

menu default
  kernel /bzimage
  append vfio-pci.ids=1b21:1142 1523:b456 intel_iommu=on pcie_acs_override=downstream,multifunction initrd=/bzroot,/bzroot-gui

 

or 

 

menu default
  kernel /bzimage
  append vfio-pci.ids=1b21:1142 vfio-pci.ids=1523:b456 intel_iommu=on pcie_acs_override=downstream,multifunction initrd=/bzroot,/bzroot-gui

 

 

 

9 hours ago, darthjonathan12 said:

 

so would that be 

menu default
  kernel /bzimage
  append vfio-pci.ids=1b21:1142 1523:b456 intel_iommu=on pcie_acs_override=downstream,multifunction initrd=/bzroot,/bzroot-gui

 

or 

 

menu default
  kernel /bzimage
  append vfio-pci.ids=1b21:1142 vfio-pci.ids=1523:b456 intel_iommu=on pcie_acs_override=downstream,multifunction initrd=/bzroot,/bzroot-gui

 

 

 

Figured it out: the IDs should be separated with commas.

 

menu default
  kernel /bzimage
  append vfio-pci.ids=1b21:1142,1523:b456 intel_iommu=on pcie_acs_override=downstream,multifunction initrd=/bzroot,/bzroot-gui
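One way to sanity-check this after rebooting (generic Linux, not specific to Unraid) is to read back the live kernel command line; the sketch below prints a notice instead of failing when the parameter is absent:

```shell
# Print the vfio-pci.ids= parameter the running kernel actually received,
# or a notice if it was not set.
ids=$(grep -o 'vfio-pci\.ids=[^ ]*' /proc/cmdline 2>/dev/null || echo 'vfio-pci.ids not set')
printf '%s\n' "$ids"
```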

 

 


Hi, just to confirm, is this the fix for GPU passthrough on current Ryzen 3 BIOSes where the GPU hangs in a low power state or reports PCI error 127?

3 hours ago, helin0x said:

Hi, just to confirm, is this the fix for GPU passthrough on current Ryzen 3 BIOSes where the GPU hangs in a low power state or reports PCI error 127?

 

No, this is just another way to blacklist the card using the PCI number.


Switched to AMD GPU and Unraid 6.7.2, nothing worked...until I found this thread. Now it works like a charm :)

 

syslinux.cfg and vfio-pci.ids do not work for me anymore.

 

The next test is to install the identical second card and see if I can use both. That was not possible with vfio-pci.ids or stubs. The cards have the same ID. Always error 127.

Edited by Attackwave

On 7/31/2019 at 11:25 PM, saarg said:

 

No, this is just another way to blacklist the card using the PCI number.

What is the best way to blacklist devices so that the unraid host's kernel doesn't try to use them? For example I have two GPUs and two USB cards dedicated to two VMs, not to mention all of the peripherals connected to those USB cards. I don't want unraid to even bother trying to use these devices when the VMs are not started - connecting those USB devices to the host is problematic.

 

Right now I can see that when I stop a VM, all the passed-through devices start appearing in dmesg. I used append vfio-pci.ids=01:00,81:00,02:00,03:00 but it didn't change this behavior at all. When I added the vfio-pci.cfg it totally broke passthrough.

So, to recap:


- Passthrough works well with no configuration added, but devices aren't properly blacklisted which requires manual intervention by me for every start/stop of the host or guest machines.

- the "new method" breaks passthrough

I need a more direct way of blacklisting devices at startup, ideally to tell the kernel to just leave 4 PCI IDs alone completely and use the hardware it has been provided.

Edited by geekazoid
More detail


I should add that you need to power cycle the motherboard between changes to these settings. The PCI root device was throwing errors in dmesg after I added the vfio options, even after rolling them back and rebooting. It must put the device in a state that doesn't change until you power cycle the board. YMMV.


Did some clean cycles to confirm my condition. Here's what happens when I employ vfio settings "new method" as above after a clean boot without it (no errors) and a reconfigure with it after a power cycle:

 

[   98.766638] vfio-pci 0000:01:00.0: BAR 1: can't reserve [mem 0xb0000000-0xbfffffff 64bit pref]

(above repeats ~2000 times)

[   98.932910] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[   98.932914] pcieport 0000:00:01.0:   device [8086:2f02] error status/mask=00004000/00000000
[   98.932918] pcieport 0000:00:01.0:    [14] CmpltTO                (First)
[   98.932922] pcieport 0000:00:01.0: broadcast error_detected message
[   98.933000] pcieport 0000:00:01.0: broadcast mmio_enabled message
[   98.933003] pcieport 0000:00:01.0: broadcast resume message
[   98.933010] pcieport 0000:00:01.0: AER: Device recovery successful
[   98.950407] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:01.0
[   98.950414] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[   98.950417] pcieport 0000:00:01.0:   device [8086:2f02] error status/mask=00004000/00000000
[   98.950421] pcieport 0000:00:01.0:    [14] CmpltTO                (First)

 

I can now remove the vfio settings, *power cycle*, and then it will go back to normal. That's the excruciating detail; lemme know if there is a better thread for this.


It doesn't matter which method you use to bind the devices. You use the one that works for you.

You have the wrong syntax for vfio-pci.ids=. It's not the PCI number you use, it's the provider:device ID you use.

On 11/1/2019 at 12:02 AM, saarg said:

It doesn't matter which method you use to bind the devices. You use the one that works for you.

You have the wrong syntax for vfio-pci.ids=. It's not the PCI number you use, it's the provider:device ID you use.

Sorry I think I forgot to enable notify on reply. Thanks for this important comment on my syntax. By provider I assume that you mean domain? So basically use the notation from dmesg.

 

So if I follow you, this:

append vfio-pci.ids=01:00 81:00 02:00 03:00 intel_iommu=on initrd=/bzroot

should be this:

append vfio-pci.ids=0000:0100 0000:8100 0000:0200 0000:0300 intel_iommu=on initrd=/bzroot

 

On 11/1/2019 at 12:02 AM, saarg said:

It doesn't matter which method you use to bind the devices. You use the one that works for you.

You have the wrong syntax for vfio-pci.ids=. It's not the PCI number you use, it's the provider:device ID you use.

Oh in fact I believe that you meant this:

 

[screenshots showing the devices' vendor:device IDs]

 

Thus the correct form would be:

append vfio-pci.ids=10de:13f1,10de:1d01,1912:0014,1b73:111 intel_iommu=on initrd=/bzroot

 

Is this right?

3 hours ago, geekazoid said:

Oh in fact I believe that you meant this:

 

[screenshots showing the devices' vendor:device IDs]

 

Thus the correct form would be:


append vfio-pci.ids=10de:13f1,10de:1d01,1912:0014,1b73:111 intel_iommu=on initrd=/bzroot

 

Is this right?

Yes the last one is the correct one.

29 minutes ago, saarg said:

Yes the last one is the correct one.

Cool. Well what it changes is this:

 

- the iommu group numbers change for my passthrough devices, possibly more. The total number of IOMMU groups is diminished. All my passthrough devices are listed though.

- my PCIe USB cards are still not showing up as shareable in the VM manager

- the devices connected to my passthrough USB cards are still connected to my host despite the blacklist

- this appears in dmesg:

root@fluffy:~# dmesg | grep vfio
[    0.000000] Command line: BOOT_IMAGE=/bzimage vfio-pci.ids=10de:13f1,10de:1d01,1912:0014,1b73:111i intel_iommu=on initrd=/bzroot
[    0.000000] Kernel command line: BOOT_IMAGE=/bzimage vfio-pci.ids=10de:13f1,10de:1d01,1912:0014,1b73:111i intel_iommu=on initrd=/bzroot
[   12.411973] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[   12.424129] vfio_pci: add [10de:13f1[ffffffff:ffffffff]] class 0x000000/00000000
[   12.424473] vfio-pci 0000:81:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[   12.436168] vfio_pci: add [10de:1d01[ffffffff:ffffffff]] class 0x000000/00000000
[   12.436405] vfio_pci: add [1912:0014[ffffffff:ffffffff]] class 0x000000/00000000
[   12.436621] vfio_pci: add [1b73:0111[ffffffff:ffffffff]] class 0x000000/00000000

Also this does nothing either way:

root@fluffy:~# cat /boot/config/vfio-pci.cfg 
BIND=0000:0100 0000:8100 0000:0200 0000:0300

All of my test cycles feature a full cold start. I'm going to start regression testing on 6.6 soon because this is messing with my workaday life.

6 hours ago, geekazoid said:

Cool. Well what it changes is this:

 

- the iommu group numbers change for my passthrough devices, possibly more. The total number of IOMMU groups is diminished. All my passthrough devices are listed though.

- my PCIe USB cards are still not showing up as shareable in the VM manager

- the devices connected to my passthrough USB cards are still connected to my host despite the blacklist

- this appears in dmesg:


root@fluffy:~# dmesg | grep vfio
[    0.000000] Command line: BOOT_IMAGE=/bzimage vfio-pci.ids=10de:13f1,10de:1d01,1912:0014,1b73:111i intel_iommu=on initrd=/bzroot
[    0.000000] Kernel command line: BOOT_IMAGE=/bzimage vfio-pci.ids=10de:13f1,10de:1d01,1912:0014,1b73:111i intel_iommu=on initrd=/bzroot
[   12.411973] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[   12.424129] vfio_pci: add [10de:13f1[ffffffff:ffffffff]] class 0x000000/00000000
[   12.424473] vfio-pci 0000:81:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[   12.436168] vfio_pci: add [10de:1d01[ffffffff:ffffffff]] class 0x000000/00000000
[   12.436405] vfio_pci: add [1912:0014[ffffffff:ffffffff]] class 0x000000/00000000
[   12.436621] vfio_pci: add [1b73:0111[ffffffff:ffffffff]] class 0x000000/00000000

Also this does nothing either way:


root@fluffy:~# cat /boot/config/vfio-pci.cfg 
BIND=0000:0100 0000:8100 0000:0200 0000:0300

All of my test cycles feature a full cold start. I'm going to start regression testing on 6.6 soon because this is messing with my workaday life.

The IOMMU groups will not change using either method.

Don't use both methods. Use one or the other.

 

After boot you can check which module is loaded using lspci -k
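For example, a minimal check looks like this (guarded so it degrades gracefully where `lspci` is unavailable); a device that was successfully stubbed shows "Kernel driver in use: vfio-pci" in the output:

```shell
# Show each PCI device together with the kernel driver that claimed it.
if command -v lspci >/dev/null 2>&1; then
  out=$(lspci -nnk 2>&1)
else
  out='lspci not available on this system'
fi
: "${out:=lspci produced no output}"   # fall back if the listing was empty
printf '%s\n' "$out"
```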

 

Post your diagnostics.

Edited by saarg

