GPU passthrough in Ubuntu desktop 22.04 VM not recognized



My Ubuntu VM installed with no issues and boots fine. I am also able to RDP into it with no problem (I have XRDP installed). I'm trying to do a GPU passthrough with an RTX 3090.

 

I've checked the IOMMU group below in Tools > System Devices (and rebooted):

IOMMU group 84 :

[10de:2204] 01:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)

[10de:1aef] 01:00.1 Audio device: NVIDIA Corporation GA102 High Definition Audio Controller (rev a1)

 

Then I selected "NVIDIA GeForce RTX 3090 (01:00.0)" in the dropdown of the "Graphics Card" field of my VM's configuration.

 

When I try to install the NVIDIA Linux driver, it can't find the GPU. I installed inxi to check and ran "sudo inxi --full". Although Unraid OS recognizes the GPU, my Ubuntu VM does not see it. Any ideas on what I'm missing?

 

Link to comment

You will not be able to use the Unraid NVIDIA driver with the card if you have it bound.

The card needs to be bound to vfio via its IOMMU group to be used properly in a VM.

You will need to edit the XML config to fix the graphics and audio functions of the graphics card.

See the Space Invader video.

*This is the secret sauce fix... ^ Thank you, Space Invader

 

You will need to edit the XML near the bottom so that the card's audio device is the next function (function 0x1) and stays on the same bus as the card's video device.


Optional but helpful edits:
#################################

 

Main > Flash > Syslinux Configuration: add these to the append line of your boot options (Unraid boots with Syslinux, not GRUB):
video=vesafb:off,efifb:off,simplefb:off,astdrmfb initcall_blacklist=sysfb_init pci=noaer pcie_aspm=off

 

We want to make sure the 3090 is fully bound to vfio as its in-use driver and that nothing on the host touches its framebuffer (FB), so it can be passed through for full use in the VM. We also don't want Unraid to control its power settings; the VM will handle that. That is the reason for these advanced options...


Example boot config that I run on my Unraid with the same GPU:

Current boot command:

kernel /bzimage
append initrd=/bzroot nvme_core.default_ps_max_latency_us=5500 default_hugepagesz=1G hugepagesz=1G pcie_acs_override=downstream,multifunction transparent_hugepage=always video=vesafb:off,efifb:off,simplefb:off,astdrmfb initcall_blacklist=sysfb_init pci=noaer pcie_aspm=off
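
After rebooting, it may be worth confirming that the options actually made it onto the kernel command line. A minimal sketch, assuming a POSIX shell on the Unraid host; missing_boot_params is a made-up helper name (not an Unraid tool), and the parameters in the usage line are just a subset of the options above:

```shell
# Print each expected parameter that is absent from a kernel command line.
# $1 = the command line string; remaining arguments = expected parameters.
missing_boot_params() {
  cmdline=" $1 "   # pad with spaces so every parameter is space-delimited
  shift
  for param in "$@"; do
    case "$cmdline" in
      *" $param "*) ;;                 # present, nothing to report
      *) printf '%s\n' "$param" ;;     # absent, print it
    esac
  done
}

# Assumed usage on the host (no output means everything listed is present):
#   missing_boot_params "$(cat /proc/cmdline)" pci=noaer pcie_aspm=off initcall_blacklist=sysfb_init
```

If a parameter is printed, the edit to the append line was probably not saved or a different boot entry was used.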

 

I also recommend adding this to your vfio modprobe config to help with passing the GPU.
Tools > System Drivers > set the "in use" dropdown to "all".

Scroll to vfio.conf, hit the pencil on the right, and add this for your system:

*Don't forget to save it on the right...

We want to edit this config and then rebuild the modules:

/etc/modprobe.d/vfio.conf

 

softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci
softdep nvidiafb pre: vfio-pci
softdep nvidia_drm pre: vfio-pci
softdep drm pre: vfio-pci

options vfio-pci ids=10de:2204,10de:1aef disable_idle_d3=1 enable_sriov disable_denylist disable_vga=1

 

reboot
#########################
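
The ids= value in the options line has to match your own card. As a rough sketch (vfio_ids is a helper name made up for illustration), the vendor:device pairs can be pulled out of the lines shown in Tools > System Devices and joined with commas:

```shell
# Read device lines on stdin, extract every [vvvv:dddd] ID pair, and print
# them as one sorted, comma-separated list suitable for "ids=".
vfio_ids() {
  grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | sort -u | paste -sd, -
}

# Example with the two lines from the first post:
printf '%s\n' \
  '[10de:2204] 01:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)' \
  '[10de:1aef] 01:00.1 Audio device: NVIDIA Corporation GA102 High Definition Audio Controller (rev a1)' |
  vfio_ids
```

The order of the IDs does not matter to vfio-pci; sorting is only there to remove duplicates.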


lspci -v will tell you which drivers are in use; you want to see vfio-pci as the kernel driver in use under both the 3090 and its audio device.
Then fix your XML: add multifunction to the video function of the graphics card, and set the audio function to the same bus as the video function.
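
To make the driver check less of an eyeball exercise, here is a small sketch that condenses lspci -nnk output to one line per NVIDIA function. It assumes the usual lspci layout (device header at column 0, indented "Kernel driver in use:" line below it); drivers_in_use is a hypothetical helper name:

```shell
# Print "<pci-address> <driver>" for every device in `lspci -nnk` output whose
# header line carries the given vendor ID (default 10de = NVIDIA).
# Devices with no bound driver are reported as "(none)".
drivers_in_use() {
  awk -v vendor="${1:-10de}" '
    function emit() { if (addr != "") print addr, (drv == "" ? "(none)" : drv) }
    /^[0-9a-f]/ {                        # a header line starts a new device block
      emit(); addr = ""; drv = ""
      if (index($0, "[" vendor ":")) addr = $1
      next
    }
    /Kernel driver in use:/ && addr != "" { drv = $NF }
    END { emit() }
  '
}

# Assumed usage on the Unraid host:
#   lspci -nnk | drivers_in_use 10de
```

For passthrough, both 01:00.0 and 01:00.1 should report vfio-pci; anything else (nvidia, nouveau, snd_hda_intel) means the host still owns that function.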

Link to comment

Thanks for the reply. There is a lot to unpack, so I'm trying to make sure I'm doing things right step by step. I made the multifunction edit to my VM's config, but it doesn't work yet. Does this at least look correct? (before I move on to something else)

 

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
    </hostdev>

Link to comment

Close, but no. Here is a fix for that portion:


The problem is that the PCI bus on both entries still needs to be different from the default. By default, Unraid likes to throw things onto PCI bus 4.


Usually: find "keyboard" in the XML; the hostdev entries come right after it. Add multifunction to the next PCIe device and fix the address settings on it and on the one after.

We will use the VM's virtual motherboard slot 2, and we will remove the broken/unnecessary XML.
I'm not sure if the alias name is breaking something for you here.

Note: now that you have edited the XML, you can't use the Unraid GUI VM options to make changes. If you do, you will erase all the XML edits and have to redo them after each GUI edit!
 

Quote

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
    </hostdev>


 

Your VM tab should also show this after the edits are done:
[screenshot: VM tab]

Note that it says NVIDIA graphics for the display output.


I recommend making your changes in the web GUI first, then making the necessary XML edits.
[screenshot]

 

I don't like to use the pin feature. It has its uses, but I'd rather set a vCPU count and let the machine's load balancer work it out.
I would need the entire XML to make those edits for you; it's hard to explain what to change and where. There are no security issues in sharing it.

This is my friend's Unraid with a Windows gaming machine with an NVIDIA 3060 as an example:
 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='1'>
  <name>Windows</name>
  <uuid>a0abe979-e098-7bee-2054-aefa184b8732</uuid>
  <description>Windwos-Gaming</description>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-7.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/a0abe979-e098-7bee-2054-aefa184b8732_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv mode='custom'>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <topology sockets='1' dies='1' cores='4' threads='1'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/domains/Windows/vdisk1.img' index='1'/>
      <backingStore/>
      <target dev='hdc' bus='sata'/>
      <serial>vdisk1</serial>
      <boot order='1'/>
      <alias name='sata0-0-2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <interface type='direct' trustGuestRxFilters='yes'>
      <mac address='52:54:00:a7:b1:48'/>
      <source dev='vhost0' mode='bridge'/>
      <target dev='macvtap0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-Windows/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='2'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <audio id='1' type='none'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x10' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x10' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0b' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>

 

Your XML will be different; you won't be able to copy and paste, as there are things you need to watch.

 

vcpu and cores edits are also removed if you make a GUI edit...


Don't delete your lines 3-7; this is the Unraid visual display in the VM tab.
Note my line 14: vcpu is set to 4, and cputune is removed.
My lines 19-21: the machine type is pc-q35-7.2.
The loader and nvram are VM specific!!!!
Note my lines 33-36: cores needs to be updated to fix hardware vCPU processing.
Note my line 49: this is where your vdisk is stored on Unraid.


Find keyboard, my line 150:
Lines 150-169 are my graphics card passthrough.

I also have a USB PCI device passed through.

You can use this as a reference if needed.

Link to comment
16 hours ago, bmartino1 said:

The Ubuntu VM also needs to have the NVIDIA drivers installed.

I would have you install openssh-server and connect to the VM via SSH.

 

You can see the NVIDIA card by typing lspci in the Ubuntu VM.

 

apt-get install mc nano vim openssh-server nfs-common cifs-utils

^ Other recommended packages.

https://ubuntu.com/server/docs/nvidia-drivers-installation
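
Before running any driver installer, it helps to check whether the card is on the guest's PCI bus at all, since the installer cannot find a device the VM was never given. A minimal sketch to run inside the Ubuntu VM, assuming lspci -nn prints the class code and vendor:device ID in brackets; guest_sees_gpu is a made-up helper name:

```shell
# Succeed (exit 0) if `lspci -nn` output on stdin lists a display-class device
# (PCI class 03xx) from the given vendor (default 10de = NVIDIA).
guest_sees_gpu() {
  grep -Eiq "\[03[0-9a-f]{2}\]:.*\[${1:-10de}:[0-9a-f]{4}\]"
}

# Assumed usage inside the VM:
#   if lspci -nn | guest_sees_gpu; then echo "GPU visible"; else echo "GPU not visible"; fi
```

If this reports "not visible", the problem is on the host or XML side, and no guest driver version will change that.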

 

I downloaded the driver directly from NVIDIA. During installation it doesn't recognize the existence of the 3090, nor do I see it when I run various hardware utilities (I can see the GPU in the Unraid host but not in the VM). I also connected to the Ubuntu VM with SSH as opposed to RDP, but it didn't make a difference.

Link to comment

I changed the bus setting to "2" as you suggested, but it wouldn't save the XML ("attempted double use of PCI Address" error). I was able to change it to "5", but it didn't make a difference.

 

Out of curiosity, are you on a system with two GPUs? My system only has one GPU, the 3090. It has no iGPU. The first reply (other poster) suggested that when my system boots, the host system might be locking the GPU, which is what is preventing the passthrough to the VM. I didn't understand his instructions on how to change that, though.

 

 

 

 

 

Link to comment

Technically yes, but it doesn't matter. My CPU has an onboard GPU. I run it in headless mode.

 

The error "attempted double use of PCI Address"
means that something else is already using virtual slot 2 for a device, so 2 is out. I would move back to 4, or try 7...
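
To see which guest address is colliding before trying to save, the edited XML can be scanned for repeats. This is only a sketch: it assumes the single-quoted attribute style and the bus/slot/function attribute order seen in the XML in this thread, and duplicate_guest_addresses is a hypothetical helper name:

```shell
# List guest-side PCI addresses (bus:slot.function) that appear more than once
# in a libvirt domain XML read from stdin. Host-side <source> addresses carry
# no type='pci' attribute, so they are not matched.
duplicate_guest_addresses() {
  grep -o "<address type='pci'[^>]*>" |
    sed -E "s/.* bus='([^']*)' slot='([^']*)' function='([^']*)'.*/\1:\2.\3/" |
    sort | uniq -d
}

# Assumed usage: paste the edited XML into a scratch file (draft.xml is a
# placeholder name) and run:
#   duplicate_guest_addresses < draft.xml
```

Any address it prints is claimed by two devices. In the XML posted in this thread, the virtio-serial controller already sits on bus 0x02, which would explain why bus 2 is refused.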

 

I would need the entire XML file to assist further.

Via SSH into Ubuntu: if you run lspci, do you see the card in the list?

If Unraid sees the 3090 via lspci -v,

what driver is in use? The NVIDIA card needs to be stubbed and bound via System Devices.


[screenshot]

I run AMD and stubbed my 6600 XT.
My friend runs AMD and stubbed his 3060.

We have stubbed and used a 3090 in testing and know that this works.


Step 1: confirm IOMMU and HVM:

Dashboard, top right info:

[screenshot]

Next step, verify the stubbed GPU:


/Tools/SysDevs

Tools > System Devices

?iommu slot 1?
Confirm with the 2 green dots by the graphics card and view the VFIO bind log:
[screenshot]

 

Step 2: VM tab.
Confirm the GUI settings:
[screenshot]

 

Add VM > Linux
Confirm the machine type:

[screenshot]

Must be Q35 (it can work with i440fx, but with issues), and must be version 7 or above.

 

Confirm the graphics options:

[screenshot]

 

The GUI must be set before the XML edits!
In my case it's an AMD card.

[screenshot]

 

Create the VM, then edit the XML.
Find keyboard and make the proper XML edits.

At the end of the graphics card's address line, add multifunction.

The next one is the audio/HDMI device.
Set it to the same bus and use function 1...

^ This is the bare minimum. The other, optional edits are side fixes, enhancements, and checks to stop errors and ease the graphics card's transition to use in the VM.
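
The bare-minimum rule above (video on function 0x0 with multifunction, audio on function 0x1 of the same bus and slot) can be checked mechanically. A rough sketch, assuming the single-quoted attribute style of the XML in this thread; check_gpu_functions is a made-up helper name and it only looks at guest addresses inside hostdev blocks:

```shell
# Read a libvirt domain XML on stdin. For every passed-through device placed on
# a non-zero function, verify that function 0x0 exists on the same guest
# bus/slot and has multifunction set to on. Prints problems, exits 1 if any.
check_gpu_functions() {
  awk -F"'" '
    /<hostdev /   { in_hostdev = 1 }
    /<\/hostdev>/ { in_hostdev = 0 }
    in_hostdev && /<address type=.pci./ {
      bus = slot = fn = multi = ""
      for (i = 1; i < NF; i += 2) {        # odd fields hold names, even fields values
        key = $i
        gsub(/[^a-z]/, "", key)            # strip spaces, "=", "<" and similar
        if (key == "bus")           bus   = $(i + 1)
        if (key == "slot")          slot  = $(i + 1)
        if (key == "function")      fn    = $(i + 1)
        if (key == "multifunction") multi = $(i + 1)
      }
      where = bus ":" slot
      if (fn == "0x0") { base[where] = (multi == "on") ? "on" : "off" }
      else             { extra[where] = 1 }
    }
    END {
      for (where in extra) {
        if (!(where in base))         { print where ": no function 0x0 on this bus/slot"; bad = 1 }
        else if (base[where] != "on") { print where ": function 0x0 lacks multifunction=on"; bad = 1 }
      }
      exit bad + 0
    }
  '
}

# Assumed usage (t-ubuntu is the VM name from this thread):
#   virsh dumpxml t-ubuntu | check_gpu_functions && echo "hostdev addresses look consistent"
```

No output and exit status 0 only means the addressing is consistent; it says nothing about whether the host has actually released the card to vfio.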

 

 


Link to comment

"technical yes. but doesn't mater. My cpu has a onboard GPU. i run it in headless mode."

 

What steps do I need to take to run Unraid in headless mode?

 

"lspci -v"


Does not see the GPU.

 

"bind via system devices."

 

Both the NVIDIA GPU and its audio device are checked in System Devices (i.e. bound together in the same IOMMU group).

 

Here is my Ubuntu VM's XML in full if it helps any...

 

<?xml version='1.0' encoding='UTF-8'?>
<domain type='kvm' id='60'>
  <name>t-ubuntu</name>
  <uuid>1f83ae0a-6582-6423-7dac-30c1882ceed8</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Ubuntu" icon="ubuntu.png" os="ubuntu"/>
  </metadata>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='24'/>
    <vcpupin vcpu='1' cpuset='56'/>
    <vcpupin vcpu='2' cpuset='25'/>
    <vcpupin vcpu='3' cpuset='57'/>
    <vcpupin vcpu='4' cpuset='26'/>
    <vcpupin vcpu='5' cpuset='58'/>
    <vcpupin vcpu='6' cpuset='27'/>
    <vcpupin vcpu='7' cpuset='59'/>
    <vcpupin vcpu='8' cpuset='28'/>
    <vcpupin vcpu='9' cpuset='60'/>
    <vcpupin vcpu='10' cpuset='29'/>
    <vcpupin vcpu='11' cpuset='61'/>
    <vcpupin vcpu='12' cpuset='30'/>
    <vcpupin vcpu='13' cpuset='62'/>
    <vcpupin vcpu='14' cpuset='31'/>
    <vcpupin vcpu='15' cpuset='63'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-7.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/1f83ae0a-6582-6423-7dac-30c1882ceed8_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <topology sockets='1' dies='1' cores='8' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/domains/t-ubuntu-chia/vdisk1.img' index='1'/>
      <backingStore/>
      <target dev='hdc' bus='virtio'/>
      <serial>vdisk1</serial>
      <boot order='1'/>
      <alias name='virtio-disk2'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0xa'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:e6:4f:6e'/>
      <source bridge='br0'/>
      <target dev='vnet57'/>
      <model type='virtio-net'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/1'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-60-t-ubuntu-chia/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <audio id='1' type='none'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+100</label>
    <imagelabel>+0:+100</imagelabel>
  </seclabel>
</domain>
 

 

 

 

 

 

Link to comment

Your XML looks good; I don't see any XML issues.

The fact that lspci / lspci -v on Unraid doesn't show you the data is a problem.

It's weird that System Devices shows it, though. A successful stub looks like this:

lspci -v on a friend's machine shows the GPU's kernel driver in use as vfio. Please confirm the graphics card is firmly seated and all power cables are fully inserted.
 

image.thumb.png.262f7c76ef73e54ba27fa2c076f2a32b.png

 

Since you don't have an onboard GPU and will run headless, you will need to add the framebuffer (video=) and initcall_blacklist options to your Unraid boot config (the Syslinux configuration, which I call the grub config) so Unraid runs in headless mode.

Go to Main > Flash.

Here you will need to edit your Unraid boot settings.

Be sure to keep any boot options Unraid has already set.

The syntax should be:

kernel /bzimage
append initrd=/bzroot video=vesafb:off,efifb:off,simplefb:off,astdrmfb initcall_blacklist=sysfb_init

Depending on your VM settings you may already have

"pcie_acs_override=downstream,multifunction"

Then the syntax would be:

kernel /bzimage
append initrd=/bzroot video=vesafb:off,efifb:off,simplefb:off,astdrmfb initcall_blacklist=sysfb_init pcie_acs_override=downstream,multifunction

A reboot is required to enact these settings.

I would also recommend unbinding the card, rebooting, adding the boot settings above, and then binding the card again to confirm a good stub.

Tools > System Devices (vfio binding):

 

image.thumb.png.dabd90e26054cd851406b7a8c6365364.png

 

image.thumb.png.483881b4faccea1d837d415b587552a2.png

 

Optional but recommended:
edit the vfio config and add

softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci
softdep nvidiafb pre: vfio-pci
softdep nvidia_drm pre: vfio-pci
softdep drm pre: vfio-pci

options vfio-pci ids=10de:2204,10de:1aef disable_idle_d3=1 enable_sriov disable_denylist disable_vga=1

Rebuild and reboot. The softdep lines make vfio-pci load before the NVIDIA and nouveau drivers, so it claims the card first.

Edited by bmartino1
Link to comment

It's rare, but your card may be affected by a firmware issue.

You may need to temporarily install Windows on a bare-metal PC with the graphics card and run the Windows NVIDIA driver installer.

The Windows installer also updates the firmware on 3090 cards.

The Linux driver installer can sometimes break this while you are trying to "stub" the card or set it to vfio.

 

Link to comment

  

On 4/4/2024 at 11:16 AM, bmartino1 said:

Verifying IOMMU parameters

 

dmesg | grep -e DMAR -e IOMMU


An Intel system will show something similar to:

image.png.811138c5b4308713bdd3eddee1bbd471.png

There should be a line that looks like "DMAR: IOMMU enabled". AMD systems print "AMD-Vi" lines instead.

If there is no output, something is wrong.
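
As a cross-check that doesn't depend on the exact dmesg wording, the kernel also exposes the groups under /sys/kernel/iommu_groups. A minimal sketch: the function name list_iommu_groups is made up for illustration, and the optional argument is only there so it can be pointed at a test directory.

```shell
# Print one line per device, prefixed with its IOMMU group number.
# Hypothetical helper; the optional argument is only for pointing it at a test tree.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # If no groups exist the glob matches nothing and stays literal; skip it.
        [ -e "$dev" ] || continue
        group="${dev%/devices/*}"   # strip "/devices/<address>"
        group="${group##*/}"        # keep only the group number
        printf 'IOMMU group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Run it as `list_iommu_groups` in the Unraid terminal. If it prints nothing at all, the kernel did not create any groups, which usually points at the BIOS or boot options rather than the VM.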

 

Verify IOMMU interrupt remapping is enabled

 

It is not possible to use PCI passthrough without interrupt remapping. Device assignment will fail with 'Failed to assign device "[device name]": Operation not permitted' or 'Interrupt Remapping hardware not found, passing devices to unprivileged domains is insecure.'.

All systems using an Intel processor and chipset that have support for Intel Virtualization Technology for Directed I/O (VT-d), but do not have support for interrupt remapping will see such an error. Interrupt remapping support is provided in newer processors and chipsets (both AMD and Intel).

To identify if your system has support for interrupt remapping:

 

dmesg | grep 'remapping'

 

If you see one of the following lines:

AMD-Vi: Interrupt remapping enabled

DMAR-IR: Enabled IRQ remapping in x2apic mode ('x2apic' can be different on old CPUs, but should still work)

then remapping is supported.

If your system doesn't support interrupt remapping, you can allow unsafe interrupts.
This is enabled under the Unraid VM settings:

image.png.e7e44d62d49c51c603a819d864b16deb.png
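
If you'd rather get a yes/no than read through the grep output, something like this works. It's only a sketch: remapping_status is a made-up name, and it looks for nothing beyond the two messages quoted above.

```shell
# Read kernel log text on stdin and report whether interrupt remapping is on.
# Hypothetical helper; it only knows the AMD-Vi and DMAR-IR messages quoted above.
remapping_status() {
    if grep -q -e 'AMD-Vi: Interrupt remapping enabled' \
               -e 'DMAR-IR: Enabled IRQ remapping'; then
        echo "interrupt remapping: enabled"
    else
        echo "interrupt remapping: not reported"
    fi
}
```

Usage: `dmesg | remapping_status`. "Not reported" can also just mean the boot messages have already rotated out of the kernel log, so check right after a reboot.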

 

 

Edited by bmartino1
Link to comment
On 4/3/2024 at 6:50 PM, bmartino1 said:

You need lspci -v on Unraid to show a "Kernel driver in use" line for every function of the graphics card.
If you don't see vfio / vfio-pci there, there is a bigger problem!

If not, the device will not function properly in a VM.

 

I ran "lspci -v | grep vfio" (on the host) and it returned the following:

Kernel driver in use: vfio-pci
Kernel driver in use: vfio-pci

I also ran "lspci -v | grep nvidia":

 Kernel modules: nvidia_drm, nvidia
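
Grepping for "vfio" drops the device names, so it doesn't show which two devices those lines belong to. A small sketch that keeps the address next to the driver; drivers_in_use is a made-up helper name, and it assumes the usual lspci -k layout (header line at column 0, details indented).

```shell
# Read `lspci -k` style output on stdin and print "<address>: <driver>".
# Hypothetical helper; devices without a bound driver print nothing.
drivers_in_use() {
    awk '
        /^[0-9a-f]/ { addr = $1 }                        # device header line
        /Kernel driver in use:/ { print addr ": " $NF }  # indented detail line
    '
}
```

Usage on the host: `lspci -nnk -s 01:00 | drivers_in_use` (-nn adds the vendor:device IDs, -k adds the driver lines, -s limits the output to the card at 01:00). Both 01:00.0 and 01:00.1 should come back with vfio-pci.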

Link to comment
I tried to add the line...

"kernel /bzimage append initrd=/bzroot video=vesafb:off,efifb:off,simplefb:off,astdrmfb initcall_blacklist=sysfb_init"

...but couldn't find the grub file on the flash drive. Where exactly is it located in Unraid, and what is its exact name?

Link to comment

If you haven't already: there is a newer Unraid OS release. I recommend updating to 6.12.10.

Perfect, lspci is showing the correct driver for VM use:

Kernel driver in use: vfio-pci

"Kernel modules:" lists the drivers that would be used if the card weren't bound to vfio-pci.

Main > Flash ...

/Main/Settings/Flash?name=flash

 

Click Main
image.png.b6489dc17d695f95ac666b3bd2b58f0e.png


Click Flash under Boot Device.

image.png.1a4588437cb9a97bd754136cc67add16.png

Find Syslinux Configuration.

image.png.f33fd9af222c3988354cd2e681f38204.png

To the right, in green, is your default boot option with its boot settings. On Unraid this Syslinux configuration is what I have been calling the grub config.

This is your grub config:

image.thumb.png.4958ed95063c50b260e1698f32c63d62.png

I have other options in my config, for hard disk and NVMe settings.

image.thumb.png.fb96f4b015729a50cba41f0dbd2bbc22.png

For NVIDIA with framebuffer issues, you just need the video portion plus whatever the Unraid GUI settings on the VM tab add. In this case, that's the pcie_acs_override option.

You may need to fix the syntax in the box when adding more options. When the VM tab inserts its option, the GUI doesn't read the file back correctly and can display it with the wrong syntax.

ex:
kernel /bzimage

pcie_acs_override=downstream,multifunction
append initrd=/bzroot

needs to become

kernel /bzimage
append initrd=/bzroot video=vesafb:off,efifb:off,simplefb:off,astdrmfb initcall_blacklist=sysfb_init pci=noaer pcie_aspm=off pcie_acs_override=downstream,multifunction
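
After the reboot you can confirm the running kernel actually received the options by looking at /proc/cmdline. A minimal sketch; check_cmdline is a made-up helper that takes the command line text first and then the options you expect to find in it.

```shell
# Report which of the expected boot options are present in a kernel command line.
# Hypothetical helper: first argument is the command line text, the rest are options.
check_cmdline() {
    cmdline="$1"; shift
    for opt in "$@"; do
        # Pad with spaces so only whole, space-separated options match.
        case " $cmdline " in
            *" $opt "*) echo "present: $opt" ;;
            *)          echo "MISSING: $opt" ;;
        esac
    done
}
```

Usage: `check_cmdline "$(cat /proc/cmdline)" initcall_blacklist=sysfb_init pcie_acs_override=downstream,multifunction`. Anything reported as MISSING means the edit landed on a boot entry other than the one that was booted, or the line got split.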

 

 

image.png

Edited by bmartino1
Link to comment
Thanks. I made the edit (and also made it the default menu option, I think). Can you quickly review it before I try rebooting, in case I messed something up?

 

default menu.c32
menu title Lime Technology, Inc.
prompt 0
timeout 50
label Unraid OS
  kernel /bzimage
  append initrd=/bzroot
label Unraid OS GUI Mode
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui
label Unraid OS Safe Mode (no plugins, no GUI)
  kernel /bzimage
  append initrd=/bzroot unraidsafemode
label Unraid OS GUI Safe Mode (no plugins)
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui unraidsafemode
label Memtest86+
  kernel /memtest
label GPU passthrough mode
  menu default
  kernel /bzimage
  append initrd=/bzroot video=vesafb:off,efifb:off,simplefb:off,astdrmfb initcall_blacklist=sysfb_init pci=noaer pcie_aspm=off pcie_acs_override=downstream,multifunction
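
To double-check which entry a file like the one above boots by default, the label carrying "menu default" can be pulled out with awk. This is only a sketch: default_label is a made-up name, and the path in the usage line is an assumption about where the flash drive's config is mounted.

```shell
# Print the label of the boot entry marked "menu default" in a syslinux config.
# Hypothetical helper; reads the named file, or stdin if no file is given.
default_label() {
    awk '
        /^[[:space:]]*label[[:space:]]/ {
            sub(/^[[:space:]]*label[[:space:]]+/, "")   # keep only the label text
            current = $0
        }
        /^[[:space:]]*menu[[:space:]]+default/ { print current }
    ' "$@"
}
```

Usage: `default_label /boot/syslinux/syslinux.cfg`. For the file above it should print "GPU passthrough mode".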

 

 

 

(btw- I'm on Unraid 6.12.8)

 

Edited by JKunraid
added info
Link to comment
It's best to stay up to date with OS releases: Tools > Update OS.

6.12.10 fixes a CVE issue.

I'm not sure whether you added the label to the global configuration.

It's best to edit the text under Unraid OS.

But yes, that looks good.

Link to comment

The last XML edit I can recommend is for the NVIDIA software, once the card is working in a VM:

 

  

On 8/11/2022 at 6:44 PM, ilarion said:

Why are some of the menus in the "Nvidia control panel" missing?

 

For those of you who do GPU passthrough with NVIDIA cards and are pulling your hair out over why half of the menus in the "Nvidia control panel" are missing: this is a problem Google didn't help with at all.

I'm lucky enough to have NVMe passthrough, so I could boot bare metal and see that the menus are visible there. To make them visible in the VM, put:

<kvm>
<hidden state='on'/>
</kvm>

in the "<features>" section of your XML.

So

  <features>
    <acpi/>
    <apic/>
  </features>

would become

  <features>
    <acpi/>
    <apic/>
    <kvm>
      <hidden state='on'/>
    </kvm>
  </features>

 

Link to comment
The good news: the system boots fine. The bad news: the VM still doesn't recognize the GPU. Any thoughts on what I should try next?

Link to comment

Your board may need an override option in the boot config.

We need the output of these two commands from the Unraid terminal.

Let's verify the IOMMU parameters:

dmesg | grep -e DMAR -e IOMMU

There should be a line that looks like "DMAR: IOMMU enabled" or "detected IOMMU".

If there is no output, something is wrong.

and

dmesg | grep 'remapping'

If you see one of the following lines:

AMD-Vi: Interrupt remapping enabled

DMAR-IR: Enabled IRQ remapping in x2apic mode ('x2apic' can be different on old CPUs, but should still work)

then remapping is supported.

If your system doesn't support interrupt remapping, you can allow unsafe interrupts.
This is enabled under the Unraid VM settings:

image.thumb.png.a1cfc3c99075354b56a9a597d17b938b.png

You may need to allow unsafe interrupts.

It can be set as a boot option, as a modprobe option, or as an added vfio parameter. The modprobe form is:

options vfio_iommu_type1 allow_unsafe_interrupts=1

Otherwise your board's BIOS firmware may not support IOMMU and your manufacturer is lazy. A BIOS upgrade or downgrade may be needed.

*Patches for Spectre and Meltdown sometimes break how IOMMU is supposed to operate. There was a big uproar around AMD, ASUS and Gigabyte at the time, as IOMMU was killed off as part of mitigating those vulnerabilities.

I recommend installing the Disable Security Mitigations plugin, without pressing disable, to see whether your processor/board is affected and whether mitigations are in effect.

image.png.7a78ed691e722eba02db862df617de81.png
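
The kernel also reports the mitigation state itself, independent of any plugin, under /sys/devices/system/cpu/vulnerabilities. A minimal sketch; show_mitigations is a made-up name and the optional argument exists only so it can be pointed at a test directory.

```shell
# Print "<vulnerability>: <kernel's status text>" for every entry the kernel reports.
# Hypothetical helper; the optional argument is only for pointing it at a test tree.
show_mitigations() {
    dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    done
}
```

Run `show_mitigations` before and after changing the plugin setting (with a reboot in between) to see whether anything actually changed.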
 

Edited by bmartino1
Link to comment
Results of running (dmesg | grep -e DMAR -e IOMMU) on host system.

 

[    0.507012] pci 0000:60:00.2: AMD-Vi: IOMMU performance counters supported
[    0.508237] pci 0000:40:00.2: AMD-Vi: IOMMU performance counters supported
[    0.513697] pci 0000:20:00.2: AMD-Vi: IOMMU performance counters supported
[    0.515574] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.518364] pci 0000:60:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.518532] pci 0000:40:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.518699] pci 0000:20:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.518867] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.521408] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    0.521522] perf/amd_iommu: Detected AMD IOMMU #1 (2 banks, 4 counters/bank).
[    0.521637] perf/amd_iommu: Detected AMD IOMMU #2 (2 banks, 4 counters/bank).
[    0.521747] perf/amd_iommu: Detected AMD IOMMU #3 (2 banks, 4 counters/bank).
[    1.748930] AMD-Vi: AMD IOMMUv2 loaded and initialized

 

 

Results of (dmesg | grep 'remapping') on the host:

[    0.519033] AMD-Vi: Interrupt remapping enabled

I installed "Disable Security Mitigations", with no luck. Is this enabled or disabled by default?

You mention adding the line (options vfio_iommu_type1 allow_unsafe_interrupts=1) to grub. Am I supposed to put it on its own line somewhere specific?

(e.g. append it to the line I previously added?)

 

label GPU passthrough mode
  menu default
  kernel /bzimage
  append initrd=/bzroot video=vesafb:off,efifb:off,simplefb:off,astdrmfb initcall_blacklist=sysfb_init pci=noaer pcie_aspm=off pcie_acs_override=downstream,multifunction options vfio_iommu_type1 allow_unsafe_interrupts=1

 

btw - thanks for the help. Hopefully we can figure this out but even if not I'm learning a lot just trying different things.

 

Link to comment

No worries.

 

The mitigations:

/Settings/DisableSecurity

are not disabled until you push the disable button.

Example:

image.thumb.png.01c74fdd2325bce56ad3945bf9065504.png

What hardware are you running?

Opteron / Threadripper / EPYC?

Processor / board manufacturer?

The concern is from:
[    0.518364] pci 0000:60:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.518532] pci 0000:40:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.518699] pci 0000:20:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.518867] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40

 

I ask because the IOMMU dmesg output shows memory locking similar to our lab tests with AMD Threadripper. IOMMU was blocked until we added three IOMMU settings to the boot config.

From my testing, AMD Opteron chips will only accept v1 IOMMU passthrough and won't take the newer stuff. That was on kernel 5 and may be board/BIOS specific.

Threadripper and EPYC required "amd_iommu=on iommu=pt intel_iommu=on" added to the boot options to fully unlock and use IOMMU properly (intel_iommu=on only applies to Intel hosts, so it does nothing on an AMD board).

^Adding that to boot may break things in Unraid.

But IOMMU may be working, as you did get
[    1.748930] AMD-Vi: AMD IOMMUv2 loaded and initialized

 

 

I don't remember the exact boot options; it will take me a bit to dig through my old notes. A lot has changed between kernels since that testing. All of this worked on kernel 5.15 LTS; we now run kernel 6.x, where some boot options are no longer needed and others are dead.

The setting "options vfio_iommu_type1 allow_unsafe_interrupts=1" goes in vfio.conf, or something like that, under modprobe, because Unraid's root filesystem is rebuilt in RAM at every boot.
You will need to go to the VM settings to enable unsafe interrupts.

"options ..." is modprobe syntax, not a kernel boot parameter, so it does not go on the append line.
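
To spell out the difference: a modprobe config line reads "options vfio_iommu_type1 allow_unsafe_interrupts=1", while the same module parameter written as a kernel boot option would be the single word vfio_iommu_type1.allow_unsafe_interrupts=1 (standard module.parameter=value form). Either way, you can read back what the loaded module is using. A rough sketch; unsafe_interrupts_state is a made-up name and the optional argument is only for testing.

```shell
# Report the current value of vfio_iommu_type1's allow_unsafe_interrupts parameter.
# Hypothetical helper; the optional argument lets it read a test file instead of sysfs.
unsafe_interrupts_state() {
    param="${1:-/sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts}"
    if [ ! -r "$param" ]; then
        echo "vfio_iommu_type1 not loaded (or parameter not readable)"
    elif [ "$(cat "$param")" = "Y" ]; then   # boolean parameters read back as Y or N
        echo "allow_unsafe_interrupts: on"
    else
        echo "allow_unsafe_interrupts: off"
    fi
}
```

Since dmesg on this system already reports "AMD-Vi: Interrupt remapping enabled", this setting is probably not what's blocking the card; the check is mostly useful to confirm what the VM settings toggle did.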

 

Link to comment

The last thing I would ask for is the diagnostics file.

I would need you to start the VM once first, to generate entries in dmesg / syslog.

This will also let us review some other config settings you may have set.

Download and attach it so that I and other advanced users can take a look and hopefully find and fix other errors.

 

image.thumb.png.f1fa4be3bdd7b365bf1de2e78c93e9f9.png
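
For a quick look before opening the full diagnostics, the kernel log can be narrowed down to the lines that mention vfio or the card's address. A minimal sketch; vfio_log_lines is a made-up name, and 01:00 is the card's host address from earlier in this thread.

```shell
# Keep only log lines that mention vfio or the given PCI address (default 01:00).
# Hypothetical helper; reads log text on stdin.
vfio_log_lines() {
    addr="${1:-01:00}"
    grep -i -e 'vfio' -e "$addr"
}
```

Usage, right after starting the VM once: `dmesg | vfio_log_lines 01:00`. It doesn't replace the diagnostics file, but errors about the device usually show up here first.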

 

Honestly, I'm not sure what else to check or try. IOMMU is good and remapping is in. The boot config keeps the host OS/kernel from using the graphics card or touching the framebuffer.
The VM hypervisor is enabled and working. The VM starts, but with a blank screen?

The XML is set up correctly, and the XML edits are in place to grab the card and use it in the VM.

Try reinstalling the VM OS, or try another distro such as Pop OS with the same VM config?

Pop comes with the driver prepackaged. If it works with Pop, it's a guest OS kernel/driver issue.

Install Windows 10 and see whether the card shows up and you get an error code 43?
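
Before swapping distros, it's worth checking from inside the Ubuntu guest whether the card is on the guest's PCI bus at all, driver or no driver. A small sketch; has_nvidia_device is a made-up name, and 10de is the NVIDIA vendor ID from the device IDs earlier in this thread.

```shell
# Read `lspci -nn` output on stdin and report whether any NVIDIA device is listed.
# Hypothetical helper; matches on the [10de:xxxx] vendor:device ID, not on names.
has_nvidia_device() {
    if grep -qi '\[10de:[0-9a-f]\{4\}]'; then
        echo "NVIDIA device visible on the PCI bus"
    else
        echo "no NVIDIA device on the PCI bus"
    fi
}
```

Usage inside the VM: `lspci -nn | has_nvidia_device`. With the XML posted above, the card should show up at guest address 04:00.0. If it's visible but the driver installer can't find it, the problem is on the guest driver side; if it isn't visible, the problem is still on the host or in the VM config.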

 

Edited by bmartino1
Link to comment
Threadripper 3970X on an Asus Zenith II Extreme motherboard. The BIOS is from 2022. There is a newer version, so I'm going to give it a shot tomorrow.

I created the diagnostics file. I have some privacy-related concerns about it, since it contains a lot of data. Is there any way I can send you specific files or folders? (I will review them before sending.)

I'll give Pop OS a try tomorrow, then another Linux distro with an existing template in the VM setup, then Big Brother AI spyware after that... umm, I mean Windows.

Link to comment
