[PLUGIN] Intel iGPU SR-IOV - Support Page


Posted (edited)

Using latest Unraid

 

i5 14500

B760M-A Motherboard

 

Downloaded the plugin, activated it, created the virtual GPUs, and passed one through to the VM. So far so good.

 

BUT

 

About 1-2 minutes after the VM starts, or when I shut down the VM, the GPU disappears, even from the host itself, and I need to reboot the host machine to bring it back.

 

Jun  3 14:05:57 NAS kernel: i915 0000:00:02.0: [drm] GT0: GuC firmware i915/tgl_guc_70.bin version 70.13.1
Jun  3 14:05:57 NAS kernel: i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
Jun  3 14:05:57 NAS kernel: i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads!
Jun  3 14:05:57 NAS kernel: i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
Jun  3 14:05:57 NAS kernel: i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
Jun  3 14:05:57 NAS kernel: i915 0000:00:02.0: [drm] GuC RC: enabled
Jun  3 14:05:57 NAS kernel: mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
Jun  3 14:05:57 NAS kernel: i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
Jun  3 14:05:57 NAS kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
Jun  3 14:05:57 NAS kernel: ACPI: video: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
Jun  3 14:05:57 NAS kernel: input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input7
Jun  3 14:05:57 NAS kernel: i915 0000:00:02.0: 7 VFs could be associated with this PF
Jun  3 14:05:57 NAS kernel: ata6.00: Enabling discard_zeroes_data
Jun  3 14:05:57 NAS kernel: sdc: sdc1 sdc2 sdc3 sdc4
Jun  3 14:05:57 NAS acpid: input device has been disconnected, fd 11
Jun  3 14:05:57 NAS kernel: pci 0000:00:02.0: Removing from iommu group 0

Is there any way to fix this? :(

IOMMU group 0 is the Intel GPU.

I tested passing through 0000:00:02.1 and 0000:00:02.2, and also the GPU itself (that is the log above, but the issue is the same with .1 and .2 :). I have tested everything I can...

These are the last lines when using 0000:00:02.2 (group 21). Ignore the filesystem messages about my SSD :D

Jun  3 14:48:37 NAS kernel: vfio-pci 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Jun  3 14:48:37 NAS kernel: br0: port 4(vnet2) entered blocking state
Jun  3 14:48:37 NAS kernel: br0: port 4(vnet2) entered disabled state
Jun  3 14:48:37 NAS kernel: device vnet2 entered promiscuous mode
Jun  3 14:48:37 NAS kernel: br0: port 4(vnet2) entered blocking state
Jun  3 14:48:37 NAS kernel: br0: port 4(vnet2) entered forwarding state
Jun  3 14:48:39 NAS kernel: device-mapper: ioctl: 4.47.0-ioctl (2022-07-28) initialised: dm-devel@redhat.com
Jun  3 14:48:39 NAS kernel: ata6.00: Enabling discard_zeroes_data
Jun  3 14:48:39 NAS kernel: sdc: sdc1 sdc2 sdc3 sdc4
Jun  3 14:48:40 NAS kernel: i915 0000:00:02.0: VF2 FLR
Jun  3 14:48:41 NAS kernel: i915 0000:00:02.0: VF2 FLR
Jun  3 14:48:41 NAS unassigned.devices: Disk with ID 'Samsung_SSD_860_EVO_500GB_S3Z2NB0K660578V (dev2)' is not set to auto mount.
Jun  3 14:48:41 NAS unassigned.devices: Disk with ID 'Samsung_SSD_860_EVO_500GB_S3Z2NB0K660578V (dev2)' is not set to auto mount.
Jun  3 14:48:41 NAS unassigned.devices: Disk with ID 'Samsung_SSD_860_EVO_500GB_S3Z2NB0K660578V (dev2)' is not set to auto mount.
Jun  3 14:48:41 NAS unassigned.devices: Partition '/dev/sdc2' does not have a file system and cannot be mounted.
Jun  3 14:49:34 NAS kernel: ata6.00: Enabling discard_zeroes_data
Jun  3 14:49:34 NAS kernel: br0: port 4(vnet2) entered disabled state
Jun  3 14:49:34 NAS kernel: device vnet2 left promiscuous mode
Jun  3 14:49:34 NAS kernel: br0: port 4(vnet2) entered disabled state
Jun  3 14:49:34 NAS kernel: sdc: sdc1 sdc2 sdc3 sdc4
Jun  3 14:49:34 NAS kernel: i915 0000:00:02.0: VF2 FLR
Jun  3 14:49:36 NAS unassigned.devices: Disk with ID 'Samsung_SSD_860_EVO_500GB_S3Z2NB0K660578V (dev2)' is not set to auto mount.
Jun  3 14:49:36 NAS unassigned.devices: Disk with ID 'Samsung_SSD_860_EVO_500GB_S3Z2NB0K660578V (dev2)' is not set to auto mount.
Jun  3 14:49:36 NAS unassigned.devices: Disk with ID 'Samsung_SSD_860_EVO_500GB_S3Z2NB0K660578V (dev2)' is not set to auto mount.
Jun  3 14:49:36 NAS kernel: vfio-pci 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: Running in SR-IOV VF mode
Jun  3 14:49:36 NAS unassigned.devices: Partition '/dev/sdc2' does not have a file system and cannot be mounted.
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: [drm] GT0: GUC: interface version 0.1.4.1
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: [drm] VT-d active for gfx access
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: [drm] Using Transparent Hugepages
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: [drm] GT0: GUC: interface version 0.1.4.1
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: GuC firmware PRELOADED version 1.4 submission:SR-IOV VF
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: HuC firmware PRELOADED
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: [drm] Protected Xe Path (PXP) protected content support initialized
Jun  3 14:49:36 NAS kernel: i915 0000:00:02.2: [drm] PMU not supported for this GPU.
Jun  3 14:49:36 NAS kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.2 on minor 2
Jun  3 14:49:36 NAS usb_manager: Info: rc.usb_manager  vm_action Windows 11 stopped end -
Jun  3 14:49:37 NAS kernel: i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
Jun  3 14:49:37 NAS kernel: i915 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
Jun  3 14:49:37 NAS kernel: pci 0000:00:02.1: Removing from iommu group 20
Jun  3 14:49:37 NAS kernel: i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=io+mem:owns=io+mem
Jun  3 14:49:37 NAS kernel: pci 0000:00:02.2: Removing from iommu group 21
Jun  3 14:49:38 NAS kernel: i915 0000:00:02.0: Disabled 2 VFs
Jun  3 14:49:38 NAS kernel: Console: switching to colour dummy device 80x25
Jun  3 14:49:38 NAS acpid: input device has been disconnected, fd 11
Jun  3 14:49:38 NAS kernel: pci 0000:00:02.0: Removing from iommu group 0
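
One way to cross-check what the log shows is to read the kernel's standard SR-IOV attributes for the PF from sysfs. This is only a rough sketch: it assumes the iGPU PF sits at 0000:00:02.0 as in the log above, and the function name sriov_status and the SYSFS_ROOT override are made up for illustration (they are not part of the plugin).

```bash
# Report whether the iGPU physical function (PF) is still present and how
# many virtual functions (VFs) are currently enabled on it.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# without real hardware; on a real host it defaults to /sys.
sriov_status() {
    local pf="${1:-0000:00:02.0}"          # PF address taken from the log above
    local root="${SYSFS_ROOT:-/sys}"
    local dev="$root/bus/pci/devices/$pf"

    if [ ! -e "$dev" ]; then
        echo "PF $pf missing"
        return 1
    fi
    if [ ! -r "$dev/sriov_numvfs" ]; then
        echo "PF $pf present, SR-IOV attributes not exposed"
        return 2
    fi
    echo "PF $pf present, $(cat "$dev/sriov_numvfs") of $(cat "$dev/sriov_totalvfs") VFs enabled"
}
```

Running sriov_status once before starting the VM and again after shutting it down should show whether only the VFs were disabled or the PF itself vanished from the host.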

image.thumb.png.c14f746d0d536dd7c6b134878b64ecd1.png

image.thumb.png.9b89a19aec73712bd142781c0bfb4ffa.png

nas-diagnostics-20240603-1411.zip

Edited by Encore

Hi guys,
I have, generally speaking, two problems.

 

1. While the Intel iGPU SR-IOV plugin is installed, the command intel_gpu_top does not work.

Quote

root@jericho:~# intel_gpu_top 
Failed to detect engines! (No such file or directory)
(Kernel 4.16 or newer is required for i915 PMU support.)

 

2. If I try to pass one of the virtual GPUs from the plugin through to my Jellyfin container, Jellyfin does not use hardware acceleration.

  • added the group to the container with "--group-add={groupid}"
    • Quote

      root@jericho:/dev/dri# getent group video
      video:x:18:sddm

       

  • added the device "/dev/dri/renderD130:/dev/dri/renderD130" to the container
    • Quote

      root@jericho:/dev/dri# ls -ali
      total 0
      408 drwxr-xr-x  3 root root       180 Jun  4 15:29 ./
        1 drwxr-xr-x 16 root root      3300 Jun  4 15:30 ../
      412 drwxr-xr-x  2 root root       160 Jun  4 15:29 by-path/
      410 crw-rw----  1 root video 226,   0 Jun  4 15:29 card0
      419 crw-rw-rw-  1 root video 226,   1 Jun  4 15:29 card1
      425 crw-rw-rw-  1 root video 226,   2 Jun  4 15:29 card2
      409 crw-rw-rw-  1 root video 226, 128 Jun  4 15:29 renderD128
      418 crw-rw-rw-  1 root video 226, 129 Jun  4 15:29 renderD129
      424 crw-rw-rw-  1 root video 226, 130 Jun  4 15:29 renderD130

    • Quote

      root@3b99ed97a0d8:/dev/dri# ls -ali
      total 0
      11 drwxr-xr-x 2 root root       60 Jun  4 15:36 .
       1 drwxr-xr-x 6 root root      360 Jun  4 15:36 ..
      12 crw-rw-rw- 1 root   18 226, 130 Jun  4 15:36 renderD130

       

  • enabled hardware acceleration in Jellyfin and selected Quick Sync, which is supported by my Alder Lake GPU
    • image.png.dcb2b576a96d915125c913e1c62083a9.png
  • Quote

    root@3b99ed97a0d8:/# /usr/lib/jellyfin-ffmpeg/ffmpeg -v verbose -init_hw_device vaapi=va -init_hw_device opencl@va
    ffmpeg version 6.0.1-Jellyfin Copyright (c) 2000-2023 the FFmpeg developers
      built with gcc 12 (Debian 12.2.0-14)
      configuration: --prefix=/usr/lib/jellyfin-ffmpeg --target-os=linux --extra-version=Jellyfin --disable-doc --disable-ffplay --disable-ptx-compression --disable-static --disable-libxcb --disable-sdl2 --disable-xlib --enable-lto --enable-gpl --enable-version3 --enable-shared --enable-gmp --enable-gnutls --enable-chromaprint --enable-opencl --enable-libdrm --enable-libass --enable-libfreetype --enable-libfribidi --enable-libfontconfig --enable-libbluray --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libopenmpt --enable-libdav1d --enable-libsvtav1 --enable-libwebp --enable-libvpx --enable-libx264 --enable-libx265 --enable-libzvbi --enable-libzimg --enable-libfdk-aac --arch=amd64 --enable-libshaderc --enable-libplacebo --enable-vulkan --enable-vaapi --enable-amf --enable-libvpl --enable-ffnvcodec --enable-cuda --enable-cuda-llvm --enable-cuvid --enable-nvdec --enable-nvenc
      libavutil      58.  2.100 / 58.  2.100
      libavcodec     60.  3.100 / 60.  3.100
      libavformat    60.  3.100 / 60.  3.100
      libavdevice    60.  1.100 / 60.  1.100
      libavfilter     9.  3.100 /  9.  3.100
      libswscale      7.  1.100 /  7.  1.100
      libswresample   4. 10.100 /  4. 10.100
      libpostproc    57.  1.100 / 57.  1.100
    [AVHWDeviceContext @ 0x55ef57b2cf00] Cannot open DRM render node for device 0.
    [AVHWDeviceContext @ 0x55ef57b2cf00] Cannot open DRM render node for device 1.
    [AVHWDeviceContext @ 0x55ef57b2cf00] Trying to use DRM render node for device 2.
    [AVHWDeviceContext @ 0x55ef57b2cf00] libva: VA-API version 1.21.0
    [AVHWDeviceContext @ 0x55ef57b2cf00] libva: Trying to open /usr/lib/jellyfin-ffmpeg/lib/dri/iHD_drv_video.so
    [AVHWDeviceContext @ 0x55ef57b2cf00] libva: Found init function __vaDriverInit_1_21
    [AVHWDeviceContext @ 0x55ef57b2cf00] libva: va_openDriver() returns 0
    [AVHWDeviceContext @ 0x55ef57b2cf00] Initialised VAAPI connection: version 1.21
    [AVHWDeviceContext @ 0x55ef57b2cf00] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 24.2.3 (7c1c775).
    [AVHWDeviceContext @ 0x55ef57b2cf00] Driver not found in known nonstandard list, using standard behaviour.
    [AVHWDeviceContext @ 0x55ef57b5c6c0] 0.0: Intel(R) OpenCL Graphics / Intel(R) UHD Graphics
    [AVHWDeviceContext @ 0x55ef57b5c6c0] Intel QSV to OpenCL mapping function found (clCreateFromVA_APIMediaSurfaceINTEL).
    [AVHWDeviceContext @ 0x55ef57b5c6c0] Intel QSV in OpenCL acquire function found (clEnqueueAcquireVA_APIMediaSurfacesINTEL).
    [AVHWDeviceContext @ 0x55ef57b5c6c0] Intel QSV in OpenCL release function found (clEnqueueReleaseVA_APIMediaSurfacesINTEL).
    Hyper fast Audio and Video encoder
    usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...

    Use -h to get full help or, even better, run 'man ffmpeg'
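
The permission checks in the steps above can be repeated in one go from inside the container. This is a minimal sketch, assuming a Linux userland with stat -c available; check_render_nodes is a made-up helper name, not something Jellyfin or the plugin ships.

```bash
# List every DRM render node in a directory with its owning GID, its mode
# bits, and whether the current user can open it for reading and writing.
# Note: root always passes the access test, so run this as the user the
# media server actually runs as.
check_render_nodes() {
    local dir="${1:-/dev/dri}"
    local node access found=0

    for node in "$dir"/renderD*; do
        [ -e "$node" ] || continue      # glob did not match anything
        found=1
        access=denied
        if [ -r "$node" ] && [ -w "$node" ]; then
            access=ok
        fi
        echo "$node gid=$(stat -c '%g' "$node") mode=$(stat -c '%a' "$node") access=$access"
    done

    if [ "$found" -eq 0 ]; then
        echo "no render nodes in $dir"
        return 1
    fi
}
```

If a node is reported as denied, the GID printed here is the one that would have to be passed to --group-add.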

I'm pretty much out of ideas.
SR-IOV works fine in my Windows 11 VM, so it is generally working in Unraid.

On 6/3/2024 at 2:45 PM, Encore said:

Using latest Unraid

 

i5 14500

About 1-2 minutes after the VM starts, or when I shut down the VM, the GPU disappears, even from the host itself, and I need to reboot the host machine to bring it back.

 


Hey, at the moment it looks like 14th gen is not supported by the module's source code. That may change in the future, though.

On 6/5/2024 at 12:54 AM, Stephan M. said:

Hi guys,
I have, generally speaking, two problems.

1. While the Intel iGPU SR-IOV plugin is installed, the command intel_gpu_top does not work.

2. If I try to pass one of the virtual GPUs from the plugin through to my Jellyfin container, Jellyfin does not use hardware acceleration.


 

Hey, it looks like you are using the wrong methods.

 

For 1:

As you are using SR-IOV, you have to adapt the command like this:

intel_gpu_top -d sriov

 

For 2:

 

Don't use one of the VFs for Docker. Use /dev/dri instead.
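
As a rough illustration of the /dev/dri approach: the helper below turns a getent-style group line, like the video:x:18:sddm line posted earlier, into the two Docker flags involved. dri_docker_flags is a hypothetical name, and the image in the usage comment is only an example, not necessarily the container in use here.

```bash
# Build the Docker flags for passing the whole /dev/dri directory into a
# container, plus the GID that owns the render nodes.
# Input: one getent-style group line (name:x:gid:members).
dri_docker_flags() {
    local group_line="$1"
    local gid

    gid=$(printf '%s\n' "$group_line" | cut -d: -f3)
    case "$gid" in
        ''|*[!0-9]*)
            echo "could not parse a numeric GID" >&2
            return 1
            ;;
    esac
    echo "--device /dev/dri:/dev/dri --group-add $gid"
}

# Usage on the host (the image name is an assumption, for illustration only):
#   docker run -d $(dri_docker_flags "$(getent group video)") jellyfin/jellyfin
```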

58 minutes ago, Stephan M. said:

No one here?

 

Sorry?! 👀 😆😜

11 minutes ago, giganode said:

For 1:

As you are using SR-IOV, you have to adapt the command like this:

intel_gpu_top -d sriov

That's amazing! Thank you 🙂

 

12 minutes ago, giganode said:

For 2:

 

Don't use one of the VFs for Docker. Use /dev/dri instead.

I already did that in the past because it's the default way with the same result.
The direct reference to one VF was just the last thing I tried.

 

13 minutes ago, giganode said:

Sorry?! 👀 😆😜

Don't worry 🙂 Appreciate your help a lot! :-* 

4 minutes ago, Stephan M. said:

I already did that in the past because it's the default way with the same result.

What about if you use /dev/dri/renderD128 ?

Posted (edited)
4 minutes ago, giganode said:

What about if you use /dev/dri/renderD128 ?

Unfortunately, the same problem.
Is there a log somewhere that shows me more information about the transcoding Plex is doing?

Edited by Stephan M.
Added more information

Just a little heads up for all i915 SRIOV users...

 

When the first beta/RC of Unraid 6.13.0 or 7.0 drops, SR-IOV will most likely not be supported there.

This is because the SR-IOV source code needs to be updated to be compatible with kernel 6.8+.

I have already created an issue on the GitHub repository of the kernel module's maintainer here.

The main issue is that the maintainer of the kernel module has moved on from SR-IOV and now relies entirely on PRs from the community.

The next issue with this module is that newer CPU generations (like 14th gen) won't be fully supported, because the codebase is not up to date and only supports up to 13th gen.

Intel is working on an implementation upstream in the kernel, but it seems it will take a bit longer until it works, everything is tested, and it can be pushed upstream.
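
To see on which side of the kernel 6.8 boundary mentioned above a given host sits, a small version comparison is enough. This is just a sketch: kernel_at_least is a made-up helper, and it only looks at the major and minor numbers of the release string.

```bash
# Return success if a kernel release string (as printed by `uname -r`)
# is at least the requested major.minor version.
kernel_at_least() {
    local release="$1" want_major="$2" want_minor="$3"
    local major rest minor

    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of what remains

    if [ "$major" -gt "$want_major" ]; then
        return 0
    fi
    [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]
}

# Usage:
#   if kernel_at_least "$(uname -r)" 6 8; then
#       echo "kernel is 6.8 or newer; the SR-IOV module may not build yet"
#   fi
```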

On 6/7/2024 at 9:56 AM, Stephan M. said:

Unfortunately, the same problem.
Is there a log somewhere that shows me more information about the transcoding Plex is doing?

 

Have you tried hardware acceleration in another container? You mentioned Jellyfin. Do you use Plex as well?

 

When hardware acceleration is working in Docker, you can take a look with ..

 

intel_gpu_top -d sriov

 

.. while transcoding.

 

Which Jellyfin container do you use?

On 6/7/2024 at 3:30 PM, giganode said:

 

Hey, at the moment it looks like 14th gen is not supported by the module's source code. That may change in the future, though.

Edited for a better understanding

 

I have exactly the same issue as Encore, but my system is a 12600K on a B760M D4 board, upgraded from a 9980HK with M4SMD (B150).

So my issue: I can pass a VF through to a VM and the GPU works fine in the VM, but once I shut down the VM, the GPU and all VFs disappear (from Tools - System Devices) until I reboot Unraid.

 

When the issue does not appear, the output of intel_gpu_top -d sriov is:

intel-gpu-top: Intel Alderlake_s (Gen12) @ /dev/dri/card0 -    0/ 699 MHz
    77% RC6;  0.04/26.65 W;        0 irqs/s

         ENGINES     BUSY                                        MI_SEMA MI_WAIT
       Render/3D    0.10% |▏                                   |      0%      0%
         Blitter    0.00% |                                    |      0%      0%
           Video    0.00% |                                    |      0%      0%
    VideoEnhance    0.00% |                                    |      0%      0%
         Compute    0.00% |                                    |      0%      0%

 

And the output of dmesg | grep i915:

[   42.451359] i915 0000:00:02.0: Running in SR-IOV PF mode
[   42.451628] i915 0000:00:02.0: [drm] VT-d active for gfx access
[   42.451725] i915 0000:00:02.0: vgaarb: deactivate vga console
[   42.451757] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[   42.452346] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[   42.454439] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
[   42.454532] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/adls_dmc_ver2_01.bin (v2.1)
[   42.476010] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/tgl_guc_70.bin version 70.13.1
[   42.476014] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[   42.478956] i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads!
[   42.479359] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[   42.479359] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[   42.479733] i915 0000:00:02.0: [drm] GuC RC: enabled
[   42.480181] mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
[   42.480249] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[   42.512448] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[   42.513433] i915 0000:00:02.0: 7 VFs could be associated with this PF
[   42.540781] fbcon: i915drmfb (fb0) is primary device
[   42.613281] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[   44.808474] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[   44.808522] i915 0000:00:02.1: enabling device (0000 -> 0002)
[   44.808535] i915 0000:00:02.1: Running in SR-IOV VF mode
[   44.809295] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.4.1
[   44.810804] i915 0000:00:02.1: [drm] VT-d active for gfx access
[   44.810816] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[   44.811192] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.4.1
[   44.811814] i915 0000:00:02.1: GuC firmware PRELOADED version 1.4 submission:SR-IOV VF
[   44.811815] i915 0000:00:02.1: HuC firmware PRELOADED
[   44.813559] i915 0000:00:02.1: [drm] Protected Xe Path (PXP) protected content support initialized
[   44.813561] i915 0000:00:02.1: [drm] PMU not supported for this GPU.
[   44.813602] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.1 on minor 1
[   44.813738] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[   44.813741] i915 0000:00:02.1: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[   44.813766] i915 0000:00:02.2: enabling device (0000 -> 0002)
[   44.813773] i915 0000:00:02.2: Running in SR-IOV VF mode
[   44.814013] i915 0000:00:02.2: [drm] GT0: GUC: interface version 0.1.4.1
[   44.814246] i915 0000:00:02.2: [drm] VT-d active for gfx access
[   44.814253] i915 0000:00:02.2: [drm] Using Transparent Hugepages
[   44.814604] i915 0000:00:02.2: [drm] GT0: GUC: interface version 0.1.4.1
[   44.815104] i915 0000:00:02.2: GuC firmware PRELOADED version 1.4 submission:SR-IOV VF
[   44.815105] i915 0000:00:02.2: HuC firmware PRELOADED
[   44.816641] i915 0000:00:02.2: [drm] Protected Xe Path (PXP) protected content support initialized
[   44.816643] i915 0000:00:02.2: [drm] PMU not supported for this GPU.
[   44.816672] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.2 on minor 2
[   44.816755] i915 0000:00:02.0: Enabled 2 VFs
[  149.702760] i915 0000:00:02.0: VF1 FLR
[  149.810657] i915 0000:00:02.0: VF1 FLR

 

Once I shut down the VM that has a VF configured, the GPU and all VFs disappear, and dmesg | grep i915 shows some new lines:

[  284.324075] i915 0000:00:02.0: VF1 FLR
[  286.533005] i915 0000:00:02.1: Running in SR-IOV VF mode
[  286.533827] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.4.1
[  286.535312] i915 0000:00:02.1: [drm] VT-d active for gfx access
[  286.535326] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[  286.535723] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.4.1
[  286.536229] i915 0000:00:02.1: GuC firmware PRELOADED version 1.4 submission:SR-IOV VF
[  286.536231] i915 0000:00:02.1: HuC firmware PRELOADED
[  286.537712] i915 0000:00:02.1: [drm] Protected Xe Path (PXP) protected content support initialized
[  286.537715] i915 0000:00:02.1: [drm] PMU not supported for this GPU.
[  286.537760] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.1 on minor 1
[  287.586734] i915 0000:00:02.1: [drm] *ERROR* tlb invalidation response timed out for seqno 23
[  287.626701] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[  287.626705] i915 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[  287.696644] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=io+mem:owns=io+mem
[  288.918021] i915 0000:00:02.0: Disabled 2 VFs

 

And the output of intel_gpu_top -d sriov:

Requested device sriov not found!
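
To pull just the relevant lines out of a long log, a filter along these lines may help. The patterns are simply the messages that show up in the logs in this thread around the time the GPU disappears; treat them as an assumption, not a definitive failure signature. scan_i915_log is a made-up name.

```bash
# Read kernel log text on stdin and print only the lines that, in the logs
# posted in this thread, appear when the VFs and the host GPU go away.
# Exit status is 0 if at least one such line was found.
scan_i915_log() {
    grep -E 'tlb invalidation response timed out|Disabled [0-9]+ VFs|Removing from iommu group'
}

# Usage:
#   dmesg | scan_i915_log
#   scan_i915_log < /var/log/syslog
```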

 

My go file:

#!/bin/bash
# Start the Management Utility
/usr/local/sbin/emhttp &

 

Steps I have tried:

1. Remove VFs from VMs.

2. Remove Intel GPU TOP and Intel Graphics SR-IOV then reboot Unraid.

3. Install Intel Graphics SR-IOV and Intel GPU TOP, enable the VFs, then reboot Unraid.

4. Ensure the host GPU is not passed through and the VFs are not bound (no checkmarks next to the VFs in System Devices), then add a VF to a VM.

5. Start the VM and ensure the GPU (UHD770 in my case) is recognized in Device Manager (in Windows).

6. Shut down the VM; all GPUs have disappeared from System Devices.
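
The binding check in step 4 can also be done from a shell by reading the driver symlink that the kernel exposes for every PCI device. This is a minimal sketch; bound_driver and the SYSFS_ROOT override are hypothetical names used only for illustration.

```bash
# Print the name of the kernel driver currently bound to a PCI device
# (for example i915 or vfio-pci), or "none" if nothing is bound or the
# device does not exist.
# SYSFS_ROOT is a hypothetical override for trying this without hardware.
bound_driver() {
    local bdf="$1"
    local link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$bdf/driver"

    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Usage:
#   for fn in 0 1 2; do
#       echo "0000:00:02.$fn -> $(bound_driver "0000:00:02.$fn")"
#   done
```

A VF that prints vfio-pci while no VM is running would suggest it is still bound for passthrough at boot.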

 

BTW, the Intel-GVT-g plugin had already been removed before I upgraded this NAS. The Unraid version is 6.12.10, but in System Devices my GPU is displayed like this:

[8086:4680] 00:02.0 VGA compatible controller: Intel Corporation AlderLake-S GT1 (rev 0c)

There is nothing like UHD770 in its name. Could this be a symptom of my issue?


Enclosed is the log file. Please let me know if there is any troubleshooting I can try. Thanks in advance!

diagnostics-20240618-2109.zip

Edited by Keniji
On 6/18/2024 at 7:27 AM, Keniji said:

There is nothing like UHD770 in its name. Could this be a symptom of my issue?

It's normal, as far as I know. The name is the same for me on 14th gen, and transcoding in the Jellyfin Docker container works fine; only SR-IOV is not working :/

I wonder why it happens on your 12th gen too.

I tested with a completely fresh Unraid USB stick, reset the BIOS/UEFI to defaults, and enabled only virtualisation, IOMMU groups, SR-IOV in the BIOS, etc., but it is still the same: the GPU is removed from the host after a minute or two, or after a reboot, even from the host system, and a host reboot is required.

I hope it will be fixed in 2024 :/ or upstreamed to the kernel by Intel itself.

1 hour ago, Encore said:

It's normal, as far as I know. The name is the same for me on 14th gen, and transcoding in the Jellyfin Docker container works fine; only SR-IOV is not working :/

 


Well, then mine is a bit different. If I don't shut down or reboot the VM, everything looks normal: the assigned VF works fine in the VM, and the generated VFs and the host GPU do not disappear...

So I still need to find out why the VFs get disabled and the host GPU disappears too.


Another update, with my issue resolved:

Let's say I have a VM called VM01, and I create a new VM called VM02 (with the same configuration and disk images).

Everything works fine with VM02, so I deleted VM01 (and rebooted the host, though that makes no difference) and renamed VM02 to VM01. Then the issue appeared again with this new VM01!? I really don't know why, but I finally decided to use a new name for the VM...

An update: The issue does not seem to exist on a fresh Unraid install. I have not been able to locate the cause yet, since my NAS runs some services (e.g. Home Assistant) and I can't take it offline for long, but I will figure it out.

  

On 6/21/2024 at 1:21 PM, alturismo said:

just fired the iov Test VM here up to check logs on shutdown

 

VM log

image.thumb.png.17f10bc8d908ab0e5d91c5eb5d4ec8ea.png

 

syslog (see last entry)

image.thumb.png.b209a0a3064a5e1418df4b6c9fff7fbd.png

 

and all devices are here as expected

 

image.thumb.png.ec8db7ca27e78b04c3f2a5aa15cdcfaf.png

 

image.thumb.png.6faa463cd86fc06e8e7bd40ac5901786.png

Thanks. Maybe my problem is also related to this line, which Encore mentioned before:

[  287.586734] i915 0000:00:02.1: [drm] *ERROR* tlb invalidation response timed out for seqno 23

 

I found a bug report(?), not sure if it's related:

https://github.com/strongtz/i915-sriov-dkms/issues/118

 

Edited by Keniji
On 6/9/2024 at 5:09 AM, ich777 said:

This is caused because SRIOV source code needs to be updated to be compatible with Kernel 6.8+


Is this still the case now that beta1 has launched?
I saw in the release notes that "Add SR-IOV support for Intel iGPU" is listed under Add/edit VM template, so I am wondering whether the kernel supports it now?

  • Like 1
Link to comment

Intel are adding a new graphics driver that has native SR-IOV support:

https://www.phoronix.com/news/Intel-Xe-DRM-Linux-6.9-Pull

https://lore.kernel.org/dri-devel/CAPj87rO4K6QS8hVn-d6N8CEi+Uibmgo6mZ5bNGz2rZDUMshvxA@mail.gmail.com/T/

 

This driver is for Intel Xe graphics, which includes the integrated graphics in Tiger Lake (11th gen) and newer, as well as their discrete GPUs.

 

The pull request says it's experimental for Tiger Lake (11th gen) to Meteor Lake (1st gen of Core Ultra mobile CPUs, released December 2023), and will be used as the primary driver for the next generation onwards.

 

What's unclear to me is:

  • Is this driver ready to test in 6.9, or are more improvements needed?
  • Will it support SR-IOV in integrated GPUs or only discrete GPUs?
  • When will the driver be considered production-ready? (i.e. no longer "experimental")

 

Unfortunately, Unraid 7 beta 1 uses Linux 6.8 which doesn't contain this driver, so we can't try it out with Unraid yet. 6.8 is EOL so they're planning to upgrade to 6.9 before RC1.

Edited by Daniel15
Link to comment
1 hour ago, Daniel15 said:

Unfortunately, Unraid 7 beta 1 uses Linux 6.8 which doesn't contain this driver, so we can't try it out with Unraid yet. 6.8 is EOL so they're planning to upgrade to 6.9 before RC1.

The driver is in 6.8 and was added to Unraid 7.0.0. It does drive the iGPU and Arc cards, but is limited at this point. It does not support SR-IOV, and that is not planned until 6.10 or 6.11.

 

You need to disable i915 binding to the cards. These lines are commented out, but if I remove the # the cards would no longer be bound to i915 and would be bound to xe instead.
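
For readers who cannot see the screenshots: the exact lines in them are not reproduced here, so the following is only an assumption of what such a configuration typically looks like. Both i915 and xe have a `force_probe` module parameter, and prefixing the device ID with `!` tells i915 to skip that device. The helper name `print_force_probe_conf` is made up; 4680 is the Alder Lake-S iGPU device ID listed by `intel_gpu_top -L` further down in this post.

```shell
#!/bin/sh
# Print modprobe.d-style lines that make i915 skip a device ID and let xe claim it.
# print_force_probe_conf is a hypothetical helper; the ID is the PCI device ID in hex.
print_force_probe_conf() {
    id=$1
    printf 'options i915 force_probe=!%s\n' "$id"
    printf 'options xe force_probe=%s\n' "$id"
}

# 4680 = Alder Lake-S iGPU in this post; use your own device ID.
print_force_probe_conf 4680
```

The same two settings can also be given on the kernel command line as `i915.force_probe=!4680 xe.force_probe=4680`. Commenting the lines out with # again hands the device back to i915.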

 

image.png

 

image.png

 

https://www.phoronix.com/review/intel-xe-benchmark

 

00:02.0 VGA compatible controller: Intel Corporation AlderLake-S GT1 (rev 0c)
        DeviceName: Onboard - Video
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 7d25
        Kernel driver in use: xe

03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770] (rev 08)
        Subsystem: Intel Corporation Device 1020
        Kernel driver in use: i915
        Kernel modules: i915, xe
09:00.0 VGA compatible controller: NVIDIA Corporation TU117GLM [Quadro T400 Mobile] (rev a1)
        Subsystem: NVIDIA Corporation Device 1489
        Kernel driver in use: nvidia
        Kernel modules: nvidia_drm, nvidia
0d:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 24 [Radeon RX 6400/6500 XT/6500M] (rev c7)
        Subsystem: XFX Limited Device 6405
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
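
If you just want a quick overview of which driver each card ended up with, the listing above can be condensed with a small filter. This is a sketch that assumes `lspci -k` style input with short slot addresses (no PCI domain prefix), as in the output above; the function name `drivers_in_use` is made up.

```shell
#!/bin/sh
# Read `lspci -k`-style text on stdin and print "<slot> <driver>" for each
# device that reports a "Kernel driver in use" line.
drivers_in_use() {
    awk '
        # Device header line, e.g. "00:02.0 VGA compatible controller: ..."
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { slot = $1 }
        # Indented driver line belonging to the most recent header
        /Kernel driver in use:/ { print slot, $NF }
    '
}

# Example (needs pciutils): lspci -k | drivers_in_use
```

For the listing above this would report the iGPU at 00:02.0 on xe while the Arc A770 at 03:00.0 is still on i915.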

 

intel_gpu_top does not support the xe driver.

 

root@computenode:~# intel_gpu_top -L
card1                    Intel Dg2 (Gen12)                 pci:vendor=8086,device=56A0,card=0
└─renderD129            
card3                    Intel Alderlake_s (Gen12)         pci:vendor=8086,device=4680,card=0
└─renderD131            
card0                    10de:1fb2                         pci:vendor=10DE,device=1FB2,card=0
└─renderD128            
card2                    1002:743f                         pci:vendor=1002,device=743F,card=0
└─renderD130            

 

root@computenode:~# intel_gpu_top -d pci:vendor=8086,device=4680,card=0
Failed to detect engines! (No such file or directory)
(Kernel 4.16 or newer is required for i915 PMU support.)

root@computenode:~# 

  • Like 1
Link to comment
38 minutes ago, SimonF said:

The driver is in 6.8 and was added to Unraid 7.0.0. It does drive the iGPU and Arc cards, but is limited at this point. It does not support SR-IOV, and that is not planned until 6.10 or 6.11.

 

Makes sense. Thanks for the info.

 

I did see some code for SR-IOV in the i915 driver in Intel's 6.9 kernel. If I get some free time, I'll try to see if those changes can be extracted into a DKMS module again. I don't have any kernel development experience though.

 

39 minutes ago, SimonF said:

intel_gpu_top does not support the xe driver.

Anything that depends on functionality in the i915 driver will need to be updated.

  • Like 3
Link to comment
Posted (edited)

Hey there, thanks for all the work on keeping this active! 

I'd love some help. I am trying to:
* Run a Plex Docker container with transcoding
* Run a Linux or Windows VM with output to a physical monitor via cable (not remotely)

 

I tried the following connections:

Monitor USB-C to USB-C
USB-C to HDMI
HDMI to HDMI
USB hub (with HDMI) to HDMI

 

With all of the above I get the terminal output on the monitor, Plex recognizes the vGPUs, the VMs also recognize the vGPUs, and I am enabling the USB in the Unraid VM settings, but I still can't get the VMs to output to the monitor. I'd love some help with this, as I have read the entire old thread and the new one with no luck so far 🤯

Edited by luct
Link to comment
