[Support] ich777 - AMD Vendor Reset, CoralTPU, hpsahba,...



13 hours ago, ich777 said:

I think it's better to talk here and close the Github issue.

 

Looking forward to your response.

root@Tower:~# lsmod
Module                  Size  Used by
nct6775                65536  0
hwmon_vid              16384  1 nct6775
apex                   16384  0
gasket                 94208  1 apex
ip6table_filter        16384  1
ip6_tables             28672  1 ip6table_filter
iptable_filter         16384  1
ip_tables              28672  1 iptable_filter
x_tables               45056  4 ip6table_filter,iptable_filter,ip6_tables,ip_tables
bonding               131072  0
edac_mce_amd           32768  0
kvm_amd               122880  0
kvm                   864256  1 kvm_amd
crct10dif_pclmul       16384  1
crc32_pclmul           16384  0
crc32c_intel           24576  0
ghash_clmulni_intel    16384  0
aesni_intel           380928  0
wmi_bmof               16384  0
crypto_simd            16384  1 aesni_intel
cryptd                 24576  2 crypto_simd,ghash_clmulni_intel
input_leds             16384  0
rapl                   16384  0
r8169                  77824  0
wmi                    28672  1 wmi_bmof
led_class              16384  1 input_leds
i2c_piix4              24576  0
i2c_core               86016  1 i2c_piix4
ahci                   40960  0
realtek                24576  1
k10temp                16384  0
ccp                    81920  1 kvm_amd
libahci                40960  1 ahci
button                 16384  0

 

root@Tower:~# uname -a
Linux Tower 5.13.8-Unraid #1 SMP Wed Aug 4 09:39:46 PDT 2021 x86_64 AMD Ryzen 5 1600 Six-Core Processor AuthenticAMD GNU/Linux


Unraid 6.10rc1

plugin pic


2 minutes ago, aceofskies05 said:

Unraid 6.10rc1

plugin pic

From what I see, the gasket and apex modules are loaded and should at least work.

 

Can you also give me the output of 'lspci -vv'? Or, even better, the Diagnostics (Tools -> Diagnostics -> Download -> drop the downloaded zip file here in the text box).

 

Was the module recognized on 6.9.2?

2 hours ago, ich777 said:

From what I see, the gasket and apex modules are loaded and should at least work.

 

Can you also give me the output of 'lspci -vv'? Or, even better, the Diagnostics (Tools -> Diagnostics -> Download -> drop the downloaded zip file here in the text box).

 

Was the module recognized on 6.9.2?

Sorry, I linked the wrong thread context; it's here: https://github.com/magic-blue-smoke/Dual-Edge-TPU-Adapter/issues/3

 

The issue exists on Unraid 6.9 and 6.10... I think this may be more of a driver issue? Maybe?

So what's going on is that there is a dual Edge TPU Coral in M.2 form factor... When you plug it in via an M.2/PCIe adapter you only get to use one TPU, i.e. with an adapter like this.

Now someone came along and created an adapter that lets you properly pass through BOTH TPUs to the system. If you go to the thread above, you can get a lot more details.

Now, if I pass the PCIe devices through to a VM (QEMU XML in Unraid) I can see the two TPU devices in the VM... I can then spin up the Frigate Docker instance that uses both Coral TPUs, and the Docker container inside the VM successfully uses both TPU devices. My theory is that the config is passed through to the VM and then Frigate uses a config like this to recognize each TPU. I also had to install the TPU as PCIe in the VM, i.e. https://coral.ai/docs/m2/get-started/#4-run-a-model-on-the-edge-tpu
 

detectors:
  coral1:
    type: edgetpu
    device: pci:0
  coral2:
    type: edgetpu
    device: pci:1

 

Now, in the VM, opening a shell IN the Docker container, I can run "ls -l /dev/apex*"... Here I can see both TPU apex devices.
[Screenshot: both /dev/apex devices listed inside the container in the VM]

Now, when I open a shell to the Frigate Docker container on Unraid (same image as the one running in the VM on Unraid) and run "ls -l /dev/apex*", I see this:

[Screenshot of the output]



My current working theory is that I somehow need to forward the PCI/apex devices to the container, but I don't see a way to do this... I'm thinking that since I forwarded them to the VM and then installed the TPU drivers via Google's Linux packages, it magically worked to pass them through to the Docker container in the VM.


I know this is slightly outside the scope of your driver plugin, but this interesting problem is causing me grief because I can't seem to get Unraid to pass the dual Edge TPU over to Docker on Unraid's native Docker platform. I'd really like to avoid the overhead of the VM.

tower-diagnostics-20210812-1732.zip

7 hours ago, aceofskies05 said:

Now, when I open a shell to the Frigate Docker container on Unraid (same image as the one running in the VM on Unraid) and run "ls -l /dev/apex*", I see this:

From what I see in the logs, you have bound the "two" devices (from how you explained it to me, it's actually one device with two chips on it, so you see two), but those two are bound to VFIO for use in a VM like you mentioned above. If they are bound to VFIO, no container or anything else on unRAID can use these devices, except the VMs:

03:00.0 System peripheral [0880]: Global Unichip Corp. Coral Edge TPU [1ac1:089a]
	Subsystem: Global Unichip Corp. Coral Edge TPU [1ac1:089a]
	Kernel driver in use: vfio-pci
	Kernel modules: apex
04:00.0 System peripheral [0880]: Global Unichip Corp. Coral Edge TPU [1ac1:089a]
	Subsystem: Global Unichip Corp. Coral Edge TPU [1ac1:089a]
	Kernel driver in use: vfio-pci
	Kernel modules: apex

(You can also see that there are kernel modules named "apex" available for those two devices.)

 

7 hours ago, aceofskies05 said:

My current working theory is that I somehow need to forward the PCI/apex devices to the container, but I don't see a way to do this... I'm thinking that since I forwarded them to the VM and then installed the TPU drivers via Google's Linux packages, it magically worked to pass them through to the Docker container in the VM.

Actually that is pretty simple, btw. I hope you are using this one:

[Screenshot of the container template]

 

Also, here is the support thread for the container itself: Click

(since I actually don't own a TPU device and I'm only the guy who creates the plugins with drivers for unRAID so that they work on/with unRAID)

 

You only have to click on the button at the bottom of the template to add another Path, Port, Variable, Label or Device, and then create an entry for a device like this:

[Screenshot of the device entry in the Docker template]

(Please make sure that you change the config type to "Device", enter the device path like "/dev/apex", click on "Add" and then "Apply", and you should now be able to see the device in the container.)
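
For reference, that template entry maps to Docker's --device flag under the hood. Here is a minimal sketch of the equivalent docker run call, assuming the host devices show up as /dev/apex_0 and /dev/apex_1 (the exact names depend on the apex driver) and with the image name left as a placeholder:

docker run -d --name frigate \
  --device /dev/apex_0 \
  --device /dev/apex_1 \
  <frigate-image>   # placeholder; keep the image and other settings from your existing template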

 

A short summary:

  1. Turn off autostart for the VM where you use the TPU
  2. Unbind the devices from VFIO and reboot your server (a quick check for this is sketched after this list)
  3. Create a Device entry as described above in the Docker template
  4. I would recommend deleting the mappings that are wrong for your use case: remove "/dev/bus/usb" and "/dev/dri/renderD128" from the template by simply clicking Remove
  5. Enjoy
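
A quick way to verify steps 2 and 3 after the reboot (a sketch; the /dev/apex_* names depend on the apex driver, and the 1ac1:089a ID is taken from the lspci output above):

ls -l /dev/apex*          # the apex device nodes should now exist on the host
lspci -nnk -d 1ac1:089a   # "Kernel driver in use" should now read apex instead of vfio-pci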

 

If it isn't working after this, please try to create a post in the appropriate support thread for Frigate, because there are people there who can really help (at least more than me) and who actually have such a device.

From what I know the Dual TPU should work just fine on unRAID. :)

  • 3 weeks later...

Sorry for the poor description.

 

The problem is that the intel_gpu_top doesn't show the stats.

 

What I have done so far for the intel_gpu_top:

 - touch /boot/config/modprobe.d/i915.conf

 - installed intel_gpu_top from the CA App

 - ran intel_gpu_top in the CLI (a quick sanity check is sketched below)
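
For context, a quick sanity check that the driver side is in place (a sketch using only commands that appear elsewhere in this thread):

lsmod | grep i915   # the i915 module should be loaded
ls -la /dev/dri/    # card0 and renderD128 should be present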

 

My unRAID version is 6.9.2

The BIOS is set to legacy mode instead of UEFI mode to get my GT1030 passthrough to a VM working. The BIOS is also set to use the iGPU as the primary display. Even with no screen connected, the iGPU is still shown in the device list and in /dev/dri, which is what I wanted too.

Currently I'm setting up the Plex Media Server container. I will report back whether that works with HW transcoding.

 

 

 

14 minutes ago, bergi9 said:

touch /boot/config/modprobe.d/i915.conf

Try it without the file, but I can't imagine that this makes a difference...

 

15 minutes ago, bergi9 said:

The BIOS is also set to use the iGPU as the primary display.

Have you changed any other settings? Please note that some motherboards actually need a display or a dummy HDMI plug installed so that everything works properly, even if the iGPU is set as the primary display output, when you have a dGPU in your system (like in my case).

 

If the problem persists, I will have to create an issue on the intel_gpu_tools GitLab with the details from your Diagnostics, if that's okay with you.

1 minute ago, ich777 said:

Please note that some motherboards actually need a display or a dummy HDMI plug installed so that everything works properly, even if the iGPU is set as the primary display output, when you have a dGPU in your system (like in my case).

I already connected my unRAID to a monitor and rebooted, no difference.

 

2 minutes ago, ich777 said:

If the problem persists, I will have to create an issue on the intel_gpu_tools GitLab with the details from your Diagnostics, if that's okay with you.

That's okay with me, just please use this diagnostics file. I will delete the other diagnostics file from the post above, as it includes my public IPv6 address.

homeserver-diagnostics-20210902-1556.zip


Figured it out: after changing the BIOS "VGA Priority" setting from "Intel Onboard Device" to "Onboard Device", intel_gpu_top works, and transcoding in Plex works after this change too. When the BIOS boots it picks my GT1030 until unRAID loads the i915 driver, then unRAID switches back to the Intel GPU so that the GT1030 is still available for VMs.

You may close the issue on the gitlab.

 

2 hours ago, ich777 said:
2 hours ago, bergi9 said:

touch /boot/config/modprobe.d/i915.conf

Try it without the file, but I can't imagine that this makes a difference...

You're right, that made no difference.

It also works without any monitor connected. I'm glad that it works without requiring a dummy plug.

[Photo: BIOS "VGA Priority" setting]


I want to add ZRAM and ZSTD support to my Unraid instance. I assume this helper would be the easiest way to go about this, but I'm wondering about how to use it properly. As far as I can see I can add user patches and user scripts, but the patches are run before 'make oldconfig' and the user scripts after the kernel is compiled. I don't need to patch anything, I only need to enable these config options:

 

CONFIG_ZSMALLOC=m
CONFIG_ZRAM=m
CONFIG_CRYPTO_ZSTD=m

 

Does this helper have any way to achieve this? For now I add this manually after 'make oldconfig' in the build script:

 

cd ${DATA_DIR}/linux-$UNAME
./scripts/config --set-val CONFIG_ZSMALLOC m
./scripts/config --set-val CONFIG_ZRAM m
./scripts/config --set-val CONFIG_CRYPTO_ZSTD m
make olddefconfig

 

It seems to have worked fine since I can create zram devices with zstd compression without problems after loading the modules:

 

modprobe zstd
modprobe zsmalloc
modprobe zram
54 minutes ago, Strayer said:

I assume this helper would be the easiest way to go about this, but I'm wondering about how to use it properly.

No, because I never intended it to build modules like that... but you can either do it the way you did, or:

 

Enable the custom mode in the advanced section, then the container will copy the whole build script over to the main directory.

Open up an unRAID terminal, preferably in PuTTY or an SSH client, and type in 'docker exec -ti Unraid-Kernel-Helper /bin/bash' to connect to the container's console. Then you can execute the build script step by step, replacing "make oldconfig" with "make menuconfig" so that you can select the modules.
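
A rough sketch of that flow, reusing only commands already shown in this thread (the kernel source path comes from the ${DATA_DIR}/linux-$UNAME variable in the build script, so treat it as an assumption for your setup):

# On the unRAID host: connect to the helper container's console
docker exec -ti Unraid-Kernel-Helper /bin/bash

# Inside the container: change into the kernel source tree used by the build script
cd ${DATA_DIR}/linux-$UNAME

# Select the wanted modules interactively instead of running 'make oldconfig'
make menuconfig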

 

Please note that the Unraid-Kernel-Helper will soon be deprecated, because since version 6.9.0 you can add nearly every module to unRAID via a plugin.

 

If you explain to me a little bit more about what purpose you need ZRAM and ZSTD for, maybe we can make a plugin for that.

8 minutes ago, ich777 said:

If you explain to me a little bit more about what purpose you need ZRAM and ZSTD for, maybe we can make a plugin for that.

 

First of all, thanks for replying! I'm using it as a replacement for tmpfs and because I want to add a small amount of compressed memory for swap (see the various discussions and blog posts about whether the Linux kernel benefits from having at least a bit of swap available).

 

Biggest specific use case for me is putting some very VERY verbose debug log files of a container on a compressed ram disk. I don't want to thrash my SSDs with them, but also want to avoid having array disks running because of log files. I can use tmpfs for that, but after testing a bit zram manages to compress the data down to 25% of the initial size. This needs a bit of custom code in a user script or the go file to create the block devices, file systems and mount them, obviously, but I do like the way it works. I definitely see myself using this for more files that don't need to be persisted through reboots.

 

zstd just because it is a bit more efficient at compressing compared to the default lzo (see e.g. ...).

 

2 hours ago, Strayer said:

This needs a bit of custom code in a user script or the go file to create the block devices, file systems and mount them, obviously, but I do like the way it works.

Can you post your script, or what you mounted where? A plugin for this should be no problem at all, but I'm currently looking for more answers and I really want to know what your script does... :D

11 minutes ago, ich777 said:

Can you post your script, or what you mounted where? A plugin for this should be no problem at all, but I'm currently looking for more answers and I really want to know what your script does... :D

 

Sorry, I was still playing around with this this morning and didn't finish anything yet. What I did for testing zram was essentially this:
 

modprobe zstd
modprobe zsmalloc
modprobe zram

zramctl -f -s 200MiB -a zstd
# returns device name, e.g. /dev/zram0

mkfs.ext4 /dev/zram0
mount /dev/zram0 /tmp/zramtest

 

I would probably throw something like this in my go file with a better mount point than /tmp/zramtest. I'd then use this mount point as a bind mount in the docker-compose.yml of the container I'm running, replacing the current bind mount on the array for the log files.
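
A rough sketch of what that could look like; the mount point /mnt/zram-logs and the container-side path are assumptions, while the commands themselves are the ones from the test above:

# In the go file (or a user script): create and mount a compressed RAM disk for log files
modprobe zstd
modprobe zsmalloc
modprobe zram

ZDEV=$(zramctl -f -s 200MiB -a zstd)   # prints the device it created, e.g. /dev/zram0
mkfs.ext4 -q "$ZDEV"
mkdir -p /mnt/zram-logs
mount "$ZDEV" /mnt/zram-logs

# Then bind-mount it into the container instead of an array path, e.g. in docker-compose.yml:
#   volumes:
#     - /mnt/zram-logs:/path/to/logs/in/container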

 

Doing this for swap is mostly the same, except using mkswap and swapon instead of mkfs and mount. Most of the popular zram packages (e.g. zram-tools from Debian) do some percentage calculation to determine how much RAM should be used for swap. That's a very different topic.
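
For completeness, a minimal sketch of the swap variant with a fixed size instead of the percentage-based sizing that zram-tools does (the 2 GiB size is just an example):

modprobe zstd
modprobe zsmalloc
modprobe zram

ZSWAPDEV=$(zramctl -f -s 2GiB -a zstd)   # e.g. /dev/zram0, or zram1 if one already exists
mkswap "$ZSWAPDEV"
swapon -p 100 "$ZSWAPDEV"                # high priority, so this swap is preferred over any disk-based swap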

13 minutes ago, Strayer said:

Sorry, I was still playing around with this this morning and didn't finish anything yet. What I did for testing zram was essentially this

I will look into this when I have a little bit more time... :)

 

But a plugin for this should be more than doable.

 

14 minutes ago, Strayer said:

Doing this for swap is mostly the same, except using mkswap and swapon instead of mkfs and mount.

Do you want to have a SWAP partition in RAM or do I misunderstand something here?

 

 

BTW nice avatar, remember playing DOTT on my old DOS machine... :D

18 minutes ago, ich777 said:

I will look into this when I have a little bit more time... :)

 

But a plugin for this should be more than doable.

 

Nice! Do you work on these plugins in the open? I couldn't find anything on Github, but I may be blind. I'm a bit wary of using precompiled 3rd party kernel modules. Sorry if this comes off rude, but I'd rather compile these myself like with the build script of this topic, where I at least have some kind of oversight on what happens.

 

18 minutes ago, ich777 said:

Do you want to have a SWAP partition in RAM or do I misunderstand something here?

 

I wouldn't mind moving less used stuff to a swap space placed in compressed RAM, yes. This is how most modern operating systems work anyway. There are some nice articles on this, I managed to find these that I read a while ago:

 

https://haydenjames.io/linux-performance-almost-always-add-swap-space/

https://haydenjames.io/linux-performance-almost-always-add-swap-part2-zram/

 

I pretty much started installing Debian's zram-tools on all (mostly cloud) servers that I manage and so far haven't run into any issues. That package creates swap on compressed RAM disks sized based on the RAM size and CPU count.

 

But my primary use case is to be able to use compressed ram disks as described with the docker containers in my previous post.

 

18 minutes ago, ich777 said:

BTW nice avatar, remember playing DOTT on my old DOS machine... :D

 

Ha, thanks. I have been using the avatar for more than 10 years now and shockingly few people have recognized it. I've started moving to a custom commissioned avatar on most of my profiles, though; I just forgot to change it here. Now I'm happy that I did forget :D

1 hour ago, Strayer said:

Nice! Do you work on these plugins in the open? I couldn't find anything on Github, but I may be blind. I'm a bit wary of using precompiled 3rd party kernel modules. Sorry if this comes off rude, but I'd rather compile these myself like with the build script of this topic, where I at least have some kind of oversight on what happens.

Yes, it's all on GitHub and open source; search the CA App for DVB Driver, Nvidia Driver,...

 

The build is completely automated and I do nothing special with the modules other than compiling them for the appropriate kernel version, and the plugin also updates the modules when you upgrade to a newer version.

 

You can, at least for now, but as said I will deprecate the Kernel-Helper soon, since troubleshooting is much more complicated with custom compiled images, also for the developers of unRAID itself. Also keep in mind that if it's installed as a plugin, the devs or even someone else could see that you are using plugins.

 

1 hour ago, Strayer said:

I wouldn't mind moving less used stuff to a swap space placed in compressed RAM, yes.

But why not use ZSWAP?

  • 2 weeks later...
23 minutes ago, Econaut said:

Interested in loading the amdgpu module and using the AMD APU (5700G) for docker containers - is that possible now?

It is tested on 6.9.x with AMD 3xxx-series APUs, 4xxx-series APUs and some other Ryzen CPUs with an integrated Radeon GPU, using Jellyfin (it would be best to visit my Jellyfin support thread).

 

I haven't had such a new CPU until now; try upgrading unRAID to 6.10.0-rc1 and downloading the Radeon TOP app from the CA App.

After that open up a terminal from unRAID itself and type in:

ls -la /dev/dri/

 

After that you should get an output that lists 'card0' & 'renderD128'.

 

If that is working you can try to install my Jellyfin container from the CA App and pass through the device /dev/dri (the description of the template and the variables should tell you everything).

Go to the settings in Jellyfin, and under Playback enable hardware transcoding and press Save at the bottom.

Then go back to the Jellyfin main menu, click on a video file and try to play it; click on the little gear icon after playback has started and change the quality to something lower than the input file.

If everything is working it should transcode through the iGPU.

 

Keep in mind that your CPU load can still be pretty high, since the audio is most likely transcoded as well and that is done on the CPU, and throttling is not working in Jellyfin (even though the checkbox is there).
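
For reference, the template's /dev/dri pass-through corresponds to Docker's --device flag. Here is a minimal sketch, with the image name left as a placeholder since the template fills in the real image, paths and variables:

docker run -d --name jellyfin \
  --device /dev/dri \
  <jellyfin-image-from-the-template>   # placeholder; add the volume mappings from the template as usual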

On 9/16/2021 at 6:33 AM, ich777 said:

It is tested on 6.9.x with AMD 3xxx-series APUs, 4xxx-series APUs and some other Ryzen CPUs with an integrated Radeon GPU, using Jellyfin (it would be best to visit my Jellyfin support thread). [...]

 

Cool! Worth a shot - I'm interested in trying all three (Jellyfin, Plex, Emby) to see if I can pass hardware transcode capability to each Docker container.

 

The install said this, which was interesting:

 

Verifying package radeontop-2021.09.13.txz.
Installing package radeontop-2021.09.13.txz:
PACKAGE DESCRIPTION:
Package radeontop-2021.09.13.txz installed.

-----AMDGPU Kernel Module already enabled!-----

----Installation of radeontop complete-----
plugin: radeontop.plg installed

 

I do have these items:


[Screenshot]

 

Is there a way to use this amdgpu module with the other Docker containers mentioned as well? (I'm trying out your Jellyfin shortly.)

