[Support] ich777 - AMD Vendor Reset, CoralTPU, hpsahba,...



6 hours ago, sonisame said:

Wondering what I am doing wrong to have this issue.

You are doing nothing wrong, but just because the device is on the list doesn't mean that it is going to work.

I've had multiple users with RX580 cards who reported that they didn't need the plugin because it was working OOB for them.

 

Also make sure that the card is not bound to VFIO.

 

It also always depends on the manufacturer of the card: just because the RX580 is on the list doesn't mean that a card from manufacturer A works, even though the cards from manufacturers B, C, and E may work.

 

As said above, the AMD Vendor Reset plugin is a workaround, not a solution.

Link to comment

 

Hello everyone!

 

Any reports of system crashes caused by the intel-gpu-top plugin?

 

I don't remember exactly, but I believe the problem started after the last February update. The server started crashing and needed a hard reset from the physical button. I thought it was a hardware problem, but after removing the plugin the server no longer crashed.

Link to comment
56 minutes ago, Vattar said:

Any reports of system crashes caused by the intel-gpu-top plugin?

So far you are the only one and I can see that many people have it downloaded from the Git repository.

 

Are you sure that it isn't caused by anything in combination with Intel-GPU-TOP?

Link to comment
6 hours ago, ich777 said:

So far you are the only one and I can see that many people have it downloaded from the Git repository.

 

Are you sure that it isn't caused by anything in combination with Intel-GPU-TOP?

I don't think so, how can I know?

 

I mirrored the syslog to the USB flash drive to try to identify the cause of the problem, but nothing was logged.

 

When I installed Unraid for the first time (a year ago), I have the impression that I had the same problem with the plugin; however, once it was updated the crashes stopped.

 

Everything is working fine except that I no longer have GPU stats; however, transcoding is still working normally and the server has been stable for the fourth consecutive day.
 
I will keep testing for at least a week; if it remains stable, I will reinstall the plugin to confirm whether it is causing the problem.

Link to comment

So I've got two issues:

 

When using Intel-GPU-Top and the GPU statistics plugins, I'm running an i5-6500T, but it doesn't show in the dropdown in the statistics plugin (it only shows one option named '99: HD - 0000-00-000-000000'), and the only statistics that appear to be updating on the dashboard are the IMC Bus Utilization (changes quite a bit) and the Interrupts/sec (which is usually 2 or 4). Everything else is zeroes. When I run the CLI command, I get the same thing (which is why I'm posting in this support thread instead of the GPU Statistics one - the numbers match, they're just mostly all zeroes).

 

Second, when using the RTL8152/3/4/6 USB Drivers plugin, my adapter isn't showing up in the list. I've got a different USB3 Gigabit one that worked OOB on another port, but the 2.5GBit one I just installed doesn't show up. I'm not sure, but when trying to research why the ones I just bought weren't working on either my unRAID or Synology, I did find a comment in release notes about needing to update the package so the Synology version of the driver could support a new hardware revision of the adapter I'm using (TUC-ET2G). Is there a possibility this plugin needs a similar update? The relevant release is here: https://github.com/bb-qq/r8152/releases/tag/2.15.0-9 and the relevant commit for that release is here: https://github.com/bb-qq/r8152/commit/9015a72f773d8af17261a150ca38627bc9ab1350

 

I have not yet applied the change to my go file to enable 2.5GBit, because I wanted to see if I could get the adapter working at all first before trying to get it to run full speed.

 

Diagnostics attached.

tower-diagnostics-20230308-2251.zip

Edited by at0m
a word
Link to comment
9 hours ago, Vattar said:

When I installed Unraid for the first time (a year ago), I have the impression that I had the same problem with the plugin; however, once it was updated the crashes stopped.

The plugin on its own only installs the intel_gpu_top binary and does nothing else; if you don't start it, it won't do anything and your system will behave as if it weren't installed.

 

If you have the GPU Statistics plugin installed, it will call the binary at an interval of one second (I think) while you are on the Unraid Dashboard, so that you can see the GPU statistics there.

 

Do you have an HDMI Dummy Plug or a physical monitor attached to your iGPU?

 

 

I would also recommend that you change your Docker network from MACVLAN to IPVLAN if you have containers on br0, because that is more likely to be what crashes the server.

Link to comment
2 hours ago, at0m said:

When using Intel-GPU-Top and the GPU statistics plugins, I'm running an i5-6500T, but it doesn't show in the dropdown in the statistics plugin (it only shows one option named '99: HD - 0000-00-000-000000'), and the only statistics that appear to be updating on the dashboard are the IMC Bus Utilization (changes quite a bit) and the Interrupts/sec (which is usually 2 or 4). Everything else is zeroes. When I run the CLI command, I get the same thing (which is why I'm posting in this support thread instead of the GPU Statistics one - the numbers match, they're just mostly all zeroes).

If it shows something on the Dashboard then the GPU Statistics plugin is configured properly (please note that this would be better suited to the GPU Statistics plugin support thread).

 

Is something using your iGPU?

Please also see my comment here:

 

 

2 hours ago, at0m said:

Second, when using the RTL8152/3/4/6 USB Drivers plugin, my adapter isn't showing up in the list.

I really can't help with that. What chipset does your adapter use? The supported chipsets are listed in the plugin description.

 

2 hours ago, at0m said:

Is there a possibility this plugin needs a similar update?

Maybe, but that is something that can be requested in the Feature Requests forum; even if you get no answer there, the team watches that forum really closely.

Please include the exact adapter model, its chipset, and a link to the adapter.

Link to comment
8 hours ago, ich777 said:

Do you have an HDMI Dummy Plug or a physical monitor attached to your iGPU?

I have a monitor connected to the server; however, it remains off most of the time.

 

8 hours ago, ich777 said:

I would also recommend that you change your Docker network from MACVLAN to IPVLAN if you have containers on br0, because that is more likely to be what crashes the server.

Yes, I tested changing from macvlan to ipvlan in Docker; however, that didn't change anything. I've always used macvlan and never had any crash-related problems.

 

 

I will continue without the plugin for a few more days, and then I will use it again to validate if it is influencing something.


If I can contribute with more information I will be at your disposal! I will bring you more news about that soon. 

Edited by Vattar
Link to comment
13 minutes ago, Vattar said:

I will continue without the plugin for a few more days, and then I will use it again to validate if it is influencing something.

As said above I really can't imagine why your server would crash with only the Intel-GPU-TOP plugin installed because it does nothing by default. Maybe try to only install the Intel-GPU-TOP plugin without GPU Statistics.

 

Also this is the first report about Intel-GPU-TOP crashing the system. Please keep me updated.

 

BTW, the latest Intel-GPU-TOP plugin release has been downloaded almost 27,000 times to date:

    name: intel-gpu-top v1.27.1,
        name: intel.gpu.top-2023.02.15.txz,
        download_count: 26951,
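
For anyone who wants to read such a count themselves, here is a minimal Python sketch. It assumes the excerpt above comes from a GitHub REST API release object, where each entry under `assets` carries a `name` and a `download_count`; the sample payload is trimmed to the fields quoted in this post, and the function names are made up for illustration.

```python
import json

# Sample payload trimmed to the fields quoted above. The nesting under
# "assets" is an assumption based on the GitHub REST API release format.
SAMPLE_RELEASE = """
{
  "name": "intel-gpu-top v1.27.1",
  "assets": [
    {"name": "intel.gpu.top-2023.02.15.txz", "download_count": 26951}
  ]
}
"""


def asset_download_counts(release_json):
    """Return {asset name: download_count} for one release object."""
    release = json.loads(release_json)
    return {a["name"]: a["download_count"] for a in release.get("assets", [])}


def total_downloads(release_json):
    """Sum the download counts of all assets in one release object."""
    return sum(asset_download_counts(release_json).values())
```

Calling `total_downloads(SAMPLE_RELEASE)` adds up every asset in the release; with a single asset it is simply that asset's count.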

 

Link to comment
On 3/9/2023 at 2:07 AM, ich777 said:

If it shows something on the Dashboard then the GPU Statistics plugin is configured properly (please note that this would be better suited to the GPU Statistics plugin support thread).

 

Is something using your iGPU?

Please also see my comment here:

 

 

I really can't help with that. What chipset does your adapter use? The supported chipsets are listed in the plugin description.

 

Maybe, but that is something that can be requested in the Feature Requests forum; even if you get no answer there, the team watches that forum really closely.

Please include the exact adapter model, its chipset, and a link to the adapter.

I know the GPU Stats plugin is configured right. As I said, since the data matches, I came here. Nothing was using my iGPU though, so maybe that was causing it. I'll pursue that later.

 

 

If you look at the links I posted, it shows the chipset by where the ID was added to the file. Mine reports the same model number, v2.0R, which is what that patch was meant to add support for. You should literally be able to add that line to the driver the plugin uses. It's added to this table:

/* table of devices that work with this driver */

static const struct usb_device_id rtl8152_table[] = {

 

On the Unraid box, lsusb reports the following, which matches the commit, as previously stated:

Bus 002 Device 005: ID 20f4:e02c TRENDnet TUC-ET2G(v2.0R)
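
To make the comparison concrete: a USB driver's device table matches on the vendor and product ID pair, which is the `20f4:e02c` part of the lsusb line. Below is a minimal Python sketch that pulls that pair out of an lsusb line and checks it against a set of IDs. `EXAMPLE_TABLE` is a hypothetical stand-in, not the real contents of `rtl8152_table`; the only ID in it is the one from the line above.

```python
import re

# Matches lines such as:
#   Bus 002 Device 005: ID 20f4:e02c TRENDnet TUC-ET2G(v2.0R)
LSUSB_LINE = re.compile(
    r"^Bus (\d+) Device (\d+): ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\s*(.*)$"
)


def parse_lsusb_line(line):
    """Parse one lsusb line into a dict, or return None if it doesn't match."""
    m = LSUSB_LINE.match(line.strip())
    if m is None:
        return None
    bus, device, vendor, product, description = m.groups()
    return {
        "bus": int(bus),
        "device": int(device),
        "vendor_id": int(vendor, 16),    # hex string -> integer
        "product_id": int(product, 16),  # hex string -> integer
        "description": description,
    }


def is_listed(device, id_table):
    """True if the device's (vendor, product) pair is in the given ID set."""
    return (device["vendor_id"], device["product_id"]) in id_table


# Hypothetical ID set for illustration only; it is NOT the driver's real table.
EXAMPLE_TABLE = {(0x20F4, 0xE02C)}
```

If the pair is missing from the table the driver was built with, the driver won't bind to the adapter, which is what the commit linked earlier is meant to address.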

 

Was purchased from here: https://www.amazon.com/TRENDnet-Gigabit-Ethernet-TUC-ET2G-Chromebook/dp/B07RBMTVYF

Mfg website here, although the datasheet doesn't call out the chipset: https://www.trendnet.com/support/TUC-ET2G-v2

Link to comment
5 hours ago, at0m said:

Nothing was using my iGPU though, so maybe that was causing it. I'll pursue that later.

Please put some load on it and see if you get some values in the GPU Statistics plugin.

 

5 hours ago, at0m said:

You should literally be able to add that line to the driver the plugin uses.

Yes, that's of course true, but that's not how I do the plugins, because if I start adding this one, someone else will most certainly want me to add another device ID, and so on...

 

I would rather recommend that you create a PR in the source repository that my plugin is based on (which I think you've already done).

Link to comment
2 hours ago, sunnyjainb said:

I used windows vms and jellyfin to watch live stream but it still shows zeros

Are you really sure that Jellyfin is configured properly so that transcoding is working?

This is usually an indicator that transcoding is not working.

Please post in the corresponding support thread from your Jellyfin container.

BTW I would strongly recommend that you use the official Jellyfin container.


From what I see the Intel GPU TOP plugin is working fine and doing its job as it should.

Link to comment
2 hours ago, sunnyjainb said:

I used windows vms and jellyfin to watch live stream but it still shows zeros

 

windows vm > only when using gvt-g would trigger the igpu ...

jellyfin > may try playing in the browser, enable transcode or use a mobile device not in LAN, should force transcode too

Link to comment
1 hour ago, alturismo said:

 

windows vm > only when using gvt-g would trigger the igpu ...

jellyfin > may try playing in the browser, enable transcode or use a mobile device not in LAN, should force transcode too

[Attached image: image.png]

I have set this in Jellyfin and also tried playing at a non-original resolution in the web browser, but it still shows all zeros. I'm so confused. Is there any Docker container that relies heavily on the GPU that I could test with? Thanks a ton!

Link to comment
29 minutes ago, sunnyjainb said:

I have set this in Jellyfin and also tried playing at a non-original resolution in the web browser, but it still shows all zeros. I'm so confused.

This is something for the Jellyfin support thread if it is not working since this has nothing to do with the Intel GPU TOP Plugin.

Make sure that you also configure the iGPU in Jellyfin under Playback and then force a transcode; simply watching something isn't enough because not all media needs to be transcoded.

 

You also didn't answer if you are using the official container or a third party one.

Link to comment

Hi,

My system has an Nvidia 1050 Ti installed (all working), but I also have an integrated Radeon R5/R6/R7 Graphics (rev 84), for which I'm having issues installing the radeontop plugin. Everything shows up under devices OK, but when I try to install the plugin, I get the following:

 

plugin: installing: radeontop.plg
Executing hook script: pre_plugin_checks
plugin: downloading: radeontop.plg ... done

plugin: downloading: radeontop-2023.02.22.txz ... done


+==============================================================================
| Installing new package /boot/config/plugins/radeontop/radeontop-2023.02.22.txz
+==============================================================================

Verifying package radeontop-2023.02.22.txz.
Installing package radeontop-2023.02.22.txz:
PACKAGE DESCRIPTION:
Package radeontop-2023.02.22.txz installed.

---------Enabling AMDGPU Kernel Module---------

------Something went wrong! Can't enable-------
----AMDGPU Kernel Module, removing package!----
Removing package: radeontop-2023.02.22
Removing files:
--> Deleting /usr/bin/radeontop
--> Deleting /usr/local/emhttp/plugins/radeontop/bin/radeontop
--> Deleting /usr/local/emhttp/plugins/radeontop/images/radeontop.png
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm.so
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm.so.2
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm.so.2.4.0
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm_amdgpu.so
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm_amdgpu.so.1
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm_amdgpu.so.1.0.0
--> Deleting /usr/share/libdrm/amdgpu.ids
--> Deleting empty directory /usr/share/libdrm/
--> Deleting empty directory /usr/local/emhttp/plugins/radeontop/lib/
--> Deleting empty directory /usr/local/emhttp/plugins/radeontop/images/
--> Deleting empty directory /usr/local/emhttp/plugins/radeontop/bin/
WARNING: Unique directory /usr/local/emhttp/plugins/radeontop/ contains new files
plugin: run failed: /bin/bash
Executing hook script: gui_search_post_hook.sh
Executing hook script: post_plugin_checks

 

Unfortunately the syslog only shows:

Mar 18 11:57:04 Server root: plugin: running: anonymous
Mar 18 11:57:04 Server root: plugin: creating: /boot/config/plugins/radeontop/radeontop-2023.02.22.txz - downloading from URL https://github.com/ich777/unraid-radeontop/releases/download/2023.02.22/radeontop-2023.02.22.txz
Mar 18 11:57:04 Server root: plugin: checking: /boot/config/plugins/radeontop/radeontop-2023.02.22.txz - MD5
Mar 18 11:57:04 Server root: plugin: running: /boot/config/plugins/radeontop/radeontop-2023.02.22.txz
Mar 18 11:57:05 Server root: plugin: creating: /usr/local/emhttp/plugins/radeontop/README.md - from INLINE content
Mar 18 11:57:05 Server root: plugin: running: anonymous

 

Is there a way of finding out what the error is and how to solve it? I'd like to get the drivers installed so I can use the boot GUI.

server-diagnostics-20230318-1159.zip
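
One thing that can at least be checked is whether the amdgpu kernel module ended up loaded at all. A minimal Python sketch, assuming the standard Linux `/proc/modules` format, where the module name is the first whitespace-separated field of each line; the helper names are made up for illustration. It only tells you whether the module is loaded, not why enabling it failed.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            names.add(fields[0])
    return names


def is_module_loaded(name, proc_modules_text):
    """True if a module with exactly this name appears in the text."""
    return name in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of loaded modules (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the server itself this would be used as `is_module_loaded("amdgpu", read_proc_modules())`.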

Edited by ricostuart
attaching diagnostics
Link to comment
