[Support] ich777 - AMD Vendor Reset, CoralTPU, hpsahba,...



22 hours ago, ich777 said:

Where is the error?

This is looking perfectly fine, nothing is using your GPU...

 

You can post images here too, just copy it to the clipboard and paste it here. ;)

 

Have you passed over the device /dev/dri in your container template to the container?

Is Plex now supporting AMD GPU transcoding?

 

I would recommend that you post this in the appropriate Plex thread because the card is recognized fine and I can imagine that the path /dev/dri does indeed exist on Unraid itself (...otherwise the card wouldn't show up in radeontop).


The error when running radeontop from command line is "Failed to open DRM node, no VRAM support." I think zeroes across the board still is fishy because I'm also not getting any sensor data, but I'm not sure if that's normal or not.
I've passed /dev/dri to the containers I used to test hardware acceleration, but there is only "card0" and no "renderD128", which I understand is required for transcoding.

 

radeontop -p /dev/dri/card0 returns "Unsupported driver ast" (the AST2500), so the AMD card is not showing up in /dev/dri.

I think my issue may be related to my IPMI, which has an Aspeed AST2500 with onboard graphics. I'm researching this thread, where someone has a similar issue: https://forums.unraid.net/topic/72829-hardware-transcoding-plex-transcoding-not-working-renderd128-missing/?do=findComment&comment=669903 However, this user has an Intel iGPU instead of a dedicated AMD GPU.

Tomorrow I'll try messing with the BIOS and IPMI VGA settings to see if it makes a difference. I will edit this post if I find a solution.


At this point I don't think it's a Plex-specific issue (I've tried several Docker containers that support hardware acceleration), so I don't want to post in that thread just yet, but I recognize the radeontop plugin is likely not the cause of my problem.

 

image.png

Edited by gezellig
Link to comment
55 minutes ago, gezellig said:

The error when running radeontop from command line is "Failed to open DRM node, no VRAM support."

I don't see that in your screenshot from above. Is this the exact message?

Have you already unbound the card from VFIO and rebooted your server?

 

55 minutes ago, gezellig said:

I think zeroes across the board still is fishy because I'm also not getting any sensor data, but I'm not sure if that's normal or not.

Also don't forget that radeontop is based on this GitHub repository and I compile this from time to time from the latest master because this repository doesn't follow a strict release cycle (last release was almost a year ago I think).

 

In other words, radeontop is no indication of whether your GPU is working or not.

 

1 hour ago, gezellig said:

radeontop -p /dev/dri/card0

Why are you doing it like that? Is card0 your Radeon card? Have you tried card1 yet?

What is the output from:

ls -la /dev/dri

 

1 hour ago, gezellig said:

At this point I don't think it's a Plex-specific issue (I've tried several Docker containers that support hardware acceleration)

But is Plex supporting AMD Hardware yet? AFAIK it is only supporting Intel and Nvidia by default, AMD is only working with workarounds.

You have to understand that the container needs to come with support for AMD because HW transcoding ≠ HW transcoding across different vendors.

 

1 hour ago, gezellig said:

I recognize the radeontop plugin is likely not the cause of my problem.

Exactly.

Have you tried the official Jellyfin container yet, to see whether HW transcoding works there?

 

It would also be helpful if you posted your Diagnostics.

Link to comment
On 9/21/2022 at 1:56 AM, ich777 said:

I don't see that in your screenshot from above. Is this the exact message?

Yes. It appears before the terminal shows the graph. Here's a gif firefox_fJKO9WvRcB.gif.b071b6f90c0577959b9429c079c1ab9f.gif
 

 

On 9/21/2022 at 1:56 AM, ich777 said:

Have you already unbound the card from VFIO and rebooted your server?

Yes.

 

On 9/21/2022 at 1:56 AM, ich777 said:

Why are you doing it like that? Is card0 your Radeon card? Have you tried card1 yet?

What is the output from:

ls -la /dev/dri

I thought card0 was my card, so just in case I tried to force radeontop to use card0 with that command. When I got "Unsupported driver ast", I realized the AST2500 might be showing up as card0. Here is the output of ls -la /dev/dri:

root@kea:~# ls -la /dev/dri
total 0
drwxrwxrwx  3 root root      80 Sep 21 00:01 ./
drwxr-xr-x 16 root root    3660 Sep 21 00:11 ../
drwxrwxrwx  2 root root      60 Sep 21 00:01 by-path/
crwxrwxrwx  1 root video 226, 0 Sep 21 00:01 card0
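
To see which kernel driver sits behind each node in /dev/dri, the sysfs tree can be walked. This is only a sketch, assuming the usual Linux layout under /sys/class/drm; the function name list_drm_drivers and its optional directory argument are made up for illustration and are not part of radeontop or Unraid.

```shell
# Print "<node> <driver>" for every DRM card and render node.
# The optional argument overrides the sysfs directory (handy for dry runs).
list_drm_drivers() {
    sysfs="${1:-/sys/class/drm}"
    for node in "$sysfs"/card[0-9]* "$sysfs"/renderD[0-9]*; do
        [ -e "$node" ] || continue            # glob did not match anything
        name="$(basename "$node")"
        case "$name" in
            *-*) continue ;;                  # skip connector entries such as card0-VGA-1
        esac
        if [ -L "$node/device/driver" ]; then
            # The driver link points at the driver directory; its last component is the name.
            driver="$(basename "$(readlink "$node/device/driver")")"
        else
            driver="none"
        fi
        echo "$name $driver"
    done
}

list_drm_drivers
```

A line naming the ast driver would belong to the onboard Aspeed graphics. A Radeon card handled by the amdgpu driver would be expected to appear as its own card node together with a renderD node.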


 

 

On 9/21/2022 at 1:56 AM, ich777 said:

Have you tried the official Jellyfin container yet, to see whether HW transcoding works there?

 

It would also be helpful if you posted your Diagnostics.

I've tried both Jellyfin and Emby, but both of those also rely on /dev/dri/renderD128 for VA-API, so if it's not showing up on the host, I can't pass it through to containers.

 

Link to comment
On 9/21/2022 at 12:03 AM, ich777 said:

You should be able to install it manually from here: Click

 

Great, that's what I did and I was able to get it working in checkmk. There's no Docker or VM information, though. I copied docker.py over to the plugins folder but wasn't sure how to get that working. Fabulous software though, so it would be nice to get it working.

Link to comment

I got it working! It turns out that card0 is the AST2500, and for some reason, even though the 5700 XT was showing up fine in lspci and in Unraid's "System Devices", it wasn't showing up in /dev/dri. After changing a lot of BIOS settings, I now have both card0 (AST2500) and card1 (5700 XT) showing up, and I'm able to pass card1 to containers. /dev/dri/renderD128 now also shows up for VA-API encoding.
This IPMI is very finicky. The BIOS settings I changed were:
SR-IOV Support: disabled
Prioritize onboard video instead of PCIe: enabled
Onboard VGA: from [Auto] to enabled (for some reason, enabled is required for the OS to recognize the 5700 XT, but it also makes the machine take about 2 minutes to POST and start displaying video, which is very odd. I also don't have access to IPMI remote control, which kind of sucks, but at least the GPU is recognized.)
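
Before handing the device to a container, it can help to confirm that the render node really exists on the host. A minimal sketch: the helper name render_device_flag is hypothetical, while --device is the regular Docker CLI option and jellyfin/jellyfin is the official Jellyfin image.

```shell
# Print a Docker --device flag for the render node, or fail if the node is missing.
# Only existence is checked here; a stricter check could use "-c" (character device).
render_device_flag() {
    node="${1:-/dev/dri/renderD128}"
    if [ -e "$node" ]; then
        echo "--device=$node:$node"
    else
        echo "render node $node not found on the host" >&2
        return 1
    fi
}

# Example use (not executed here):
#   docker run -d $(render_device_flag) jellyfin/jellyfin
```

If the helper fails, the problem is on the host side (BIOS settings or driver), not in the container.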
 

Edited by gezellig
Link to comment
13 hours ago, ich777 said:

Someone already forked my repo and made it available here: Click

 

I think that he updated the checkmk agent, if I'm not mistaken.

 

Looks like he did 13 days ago; however, that one doesn't pull up any services in cmk. Oh well... I'm sure I can figure it out eventually. Thanks for the help.

Link to comment
On 9/17/2022 at 10:29 AM, Frank Crawford said:

Folks,

I'm currently maintaining the it87 driver that is now being pulled into your systems, and the current version supports the following chipsets: IT8603E, IT8606E, IT8607E, IT8613E, IT8620E, IT8622E, IT8623E, IT8625E, IT8628E, IT8528E, IT8655E, IT8665E, IT8686E, IT8688E, IT8689E, IT8695E, IT8705F, IT8712F, IT8716F, IT8718F, IT8720F, IT8721F, IT8726F, IT8728F, IT8732F, IT8736F, IT8738E, IT8758E, IT8771E, IT8772E, IT8781F, IT8782F, IT8783E/F, IT8786E, IT8790E, IT8792E, Sis950.

 

If you have any of these chipsets, then you don't need to specify the "force_id" option any further.  If you do find you need it, please contact me and I'll look at adding any missing chipsets, but it will depend on what details I can find.

 

Unfortunately, for some motherboards which have multiple chipsets on board, you will need to add the option "ignore_resource_conflict=1" in place of "acpi_enforce_resources=lax" during the modprobe command, for similar reasons, although it only affects the it87 module and not others.

 

Good luck with the use of this module that various people have been improving over the years.

Welcome and thanks for your work, Frank. Your driver is working flawlessly on my Gigabyte B560M DS3H v2 (8689).

 

By the way, do you mean that "acpi_enforce_resources=lax" is not needed anymore?

 

Thanks

Link to comment
On 9/18/2022 at 7:20 AM, ich777 said:

Thanks for joining in here Frank!

 

I think you don't know this but on Unraid it would be better to pass this over with the syslinux.conf like:

i87.ignore_resource_conflict=1

 

or you could also do this with a file in /boot/config/modprobe.d/i87.conf with the content:

options i87 ignore_resource_conflict=1

 

and reboot afterwards regardless of which method you choose.

Hi ich777,

 

I think you mistyped here: after copy-pasting your instructions, the it87 driver wouldn't load on my server after a reboot. I suppose it should be:

it87.ignore_resource_conflict=1

or, in the it87.conf file:

options it87 ignore_resource_conflict=1

Looks like you mistyped "i87" for "it87" throughout your post.
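
For the file-based variant, the options file could be written like this. It is a sketch only: the path /boot/config/modprobe.d/it87.conf and the option line come from the posts above, while the function name write_it87_conf and its directory argument are invented for illustration.

```shell
# Write the modprobe options file for the it87 module.
# The optional argument overrides the target directory (useful for a dry run).
write_it87_conf() {
    dir="${1:-/boot/config/modprobe.d}"
    mkdir -p "$dir"
    printf 'options it87 ignore_resource_conflict=1\n' > "$dir/it87.conf"
}

# On the server (as root), followed by a reboot:
#   write_it87_conf
# One-off equivalent without a reboot, assuming the module is not loaded yet:
#   modprobe it87 ignore_resource_conflict=1
```

Note that the file name and the module name inside it are both it87; a misspelled module name means the option is never applied to the driver.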

Edited by PsychoRS
Link to comment
1 hour ago, PsychoRS said:

By the way, do you mean that "acpi_enforce_resources=lax" is not needed anymore?

Exactly.

 

1 hour ago, PsychoRS said:

Looks like you mistyped "i87" for "it87" throughout your post.

Exactly, thank you for pointing that out, I'll change that.

Link to comment
On 9/20/2022 at 12:03 PM, ich777 said:

Please keep it here on the public part of the forums, or is it something confidential…

I was sending Frank a PM about driver support for specific hardware, as per his GitHub advice. It doesn't need to be shared with everyone since it's for my specific motherboard.

On 9/20/2022 at 12:21 PM, ich777 said:

I didn't see that until now. Do you append any extra options to the module?

Sorry, it all works now and survives a restart if I use the workaround options in the screenshot below. I fixed it by uninstalling the Dynamix System Temp plugin, rebooting, then re-installing it. But according to Frank, I shouldn't need the acpi_enforce_resources=lax option or the force_id, hence the PM to get this sorted for others with the same hardware.

 

image.thumb.png.90bfbc9e52a785e217f94e34483482d4.png

 

When I get a solution I will share it here.

Edited by eatoff
Link to comment
5 minutes ago, ich777 said:

Shouldn't you rather use:

it87.ignore_resource_conflict=1

instead of the "lax" option?

... really, did I miss that somewhere?

What I had in the workaround worked.

 

Do you mean change the workaround from:

acpi_enforce_resources=lax if87.force_id=0x8689

 

to only:

it87.ignore_resource_conflict=1

Link to comment
38 minutes ago, eatoff said:

... really, did I miss that somewhere?

See here:

Quote

Unfortunately, for some motherboards which have multiple chipsets on board, you will need to add the option "ignore_resource_conflict=1" in place of "acpi_enforce_resources=lax" during the modprobe command, e.g.

 

I think you need both, at least you can try:

it87.ignore_resource_conflict=1 it87.force_id=0x8689 

 

 

I really don't like the "lax" option because under certain conditions it can damage the hardware...
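
For reference, with both parameters on the kernel command line the boot entry might look roughly like the fragment below. This assumes the stock Unraid layout in /boot/syslinux/syslinux.cfg, where the default entry's append line otherwise only carries initrd=/bzroot; the label text may differ on your system, and force_id=0x8689 is specific to the board discussed here.

```
label Unraid OS
  menu default
  kernel /bzimage
  append it87.ignore_resource_conflict=1 it87.force_id=0x8689 initrd=/bzroot
```

A reboot is needed for the change to take effect.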

Link to comment
8 minutes ago, ich777 said:

See here:

 

Just tried this one with no luck

 

This:

image.thumb.png.86857836ea36d03602f80b22b67d866b.png

 

Also results in this:

image.png.4a207f2b9c643bc5bb44446373147f49.png

 

Using the workaround from my image does actually work. But according to Frank it shouldn't be needed, so I'm trying to work with him to get it resolved.

Link to comment
12 minutes ago, ich777 said:

As I wrote above also append your probe.

Apologies, somehow I didn't read that last bit in your post.

 

To keep this thread up to speed, that did work. It now looks like:

image.thumb.png.4aabdc90f046368cc83056dd160b7b2d.png

 

And I now get:

image.png.eb939d2f25a782d769ff2aab9d87694c.png

 

26 minutes ago, ich777 said:

I think you need both, at least you can try:

it87.ignore_resource_conflict=1 if87.force_id=0x8689 

I really don't like the "lax" option because under certain conditions it can damage the hardware...

Just FYI, your code says if87.force_id=0x8689, should be it87.force_id=0x8689

 

Thanks again for all your help ich777

 

EDIT: I realise I wasn't clear: this setup does work for me, and I got rid of the lax argument.

Edited by eatoff
Link to comment
6 minutes ago, eatoff said:

Just FYI, your code says if87.force_id=0x8689, should be it87.force_id=0x8689

I just copy pasted it from you... :D

37 minutes ago, eatoff said:

acpi_enforce_resources=lax if87.force_id=0x8689

Anyways, I will change that.

 

6 minutes ago, eatoff said:

EDIT: I realise I wasn't clear: this setup does work for me, and I got rid of the lax argument.

Nice, glad it's now sorted out.

Link to comment
36 minutes ago, PsychoRS said:

Ich777, any idea of why I don't get Power Usage with your intel_gpu_top with my i5-11400?

No.

 

Is this also true when you issue:

intel_gpu_top

from an Unraid console? Please also post a screenshot.

 

I can recompile intel_gpu_top from the latest master branch later today, but I'm not sure if it will fix the issue.
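
One thing that can be checked in the meantime is whether the kernel exposes a GPU energy counter at all. The sketch below lists the RAPL energy events found in sysfs. It rests on the assumption that intel_gpu_top derives its power reading for integrated GPUs from these perf events (energy-gpu in particular), which is not confirmed in this thread; the helper name list_rapl_events is made up.

```shell
# List the RAPL energy events exposed by the kernel's "power" PMU.
# The optional argument overrides the events directory (useful for a dry run).
list_rapl_events() {
    dir="${1:-/sys/bus/event_source/devices/power/events}"
    if [ ! -d "$dir" ]; then
        echo "no RAPL power PMU found at $dir" >&2
        return 1
    fi
    for f in "$dir"/energy-*; do
        [ -e "$f" ] || continue               # glob did not match anything
        name="$(basename "$f")"
        case "$name" in
            *.scale|*.unit) continue ;;       # skip the per-event metadata files
        esac
        echo "$name"
    done
}

# Example use from an Unraid console (not executed here):
#   list_rapl_events
```

If energy-gpu is missing from the list, a rebuilt intel_gpu_top is unlikely to change anything, since the counter would not be available in the first place.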

Link to comment
19 minutes ago, ich777 said:

No.

 

Is this also true when you issue:

intel_gpu_top

from an Unraid console? Please also post a screenshot.

 

I can recompile intel_gpu_top from the latest master branch later today, but I'm not sure if it will fix the issue.

No power consumption output in console:

intelgputop.PNG

Edited by PsychoRS
Link to comment
7 minutes ago, PsychoRS said:

No power consumption output in console:

I will recompile it later today, so keep an eye on plugin updates. If you don't see it after the update, I think you are out of luck and will have to wait a bit longer until it is supported.

Link to comment
