[Plugin] Nvidia-Driver


ich777

5 hours ago, nuhll said:

 

What's VAAPI? I've removed /dev/dri, which was working fine, just to be sure.

Did the latest patches change anything GPU related? I think I was on 6.10.

Just a note, as it seems the cards may be initialized differently now.

 

root@AlsServer:~# ls -la /dev/dri/
total 0
drwxrwxrwx  3 nobody users      180 Oct  8 18:52 ./
drwxr-xr-x 17 root   root      3640 Oct  9 18:35 ../
drwxrwxrwx  2 nobody users      160 Oct  8 18:52 by-path/
crwxrwxrwx  1 nobody users 226,   0 Oct  8 18:52 card0
crwxrwxrwx  1 nobody users 226,   1 Oct  8 18:52 card1
crw-rw----  1 root   video 226,   2 Oct  9 19:47 card2
crwxrwxrwx  1 nobody users 226, 128 Oct  8 18:52 renderD128
crwxrwxrwx  1 nobody users 226, 129 Oct  8 18:52 renderD129
crwxrwxrwx  1 nobody users 226, 130 Oct  8 18:52 renderD130
root@AlsServer:~#

 

As you can see, I have 3 cards here: 2 x Nvidia and 1 x Intel iGPU.

Since we can't tell which card is which from this listing: before, my layout was

...128 iGPU

...129 NV

...130 NV

Now it has changed and the iGPU is at ...130 ;)
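
A rough way to check which render node belongs to which GPU is to read the PCI vendor ID behind each node from sysfs. This is only a sketch: it assumes the usual /sys/class/drm layout, and the function names vendor_name and list_render_nodes are made up for illustration. The vendor IDs are the standard PCI ones (0x8086 Intel, 0x10de NVIDIA, 0x1002 AMD).

```shell
#!/bin/sh
# Map a PCI vendor ID (as reported by sysfs) to a readable GPU vendor name.
vendor_name() {
    case "$1" in
        0x8086) echo "Intel" ;;
        0x10de) echo "NVIDIA" ;;
        0x1002) echo "AMD" ;;
        *)      echo "unknown ($1)" ;;
    esac
}

# Print every render node together with the vendor of the device behind it.
# Optional argument: the sysfs DRM directory (assumed default: /sys/class/drm).
list_render_nodes() {
    drm_dir="${1:-/sys/class/drm}"
    for node in "$drm_dir"/renderD*; do
        # Skip entries without a readable vendor file (also covers "no match")
        [ -r "$node/device/vendor" ] || continue
        id=$(cat "$node/device/vendor")
        echo "/dev/dri/$(basename "$node"): $(vendor_name "$id")"
    done
}

list_render_nodes
```

Whichever node is reported as Intel would be the one to enter as the render device in Plex. The numbering can change between boots or releases, so it is worth re-checking after an upgrade.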

 

To make Plex use a different card, I had to manually change the render device, which is done here:

 

image.png.b14808720c0cbfcbea1781a73aa7294a.png

 

by adding / changing this entry there:

image.thumb.png.6a366496f29d961d94e4d21639f2a625.png

 

Maybe this helps ...

Link to comment
7 hours ago, nuhll said:

Whats Vaapi?

Video Acceleration API (VA-API), but that's mainly used for Intel and AMD GPUs.

 

7 hours ago, nuhll said:

Did the latest patches change anything GPU related? Oo I think i was on 6.10

Have you passed the path /dev/dri to the container in your template? If yes, please remove it if you are using Nvidia, unless you want to use an Intel or AMD GPU for transcoding.

Link to comment

I am getting this error:

 

FATAL: Module nvidia not found in directory /lib/modules/5.19.14-unraid

 

Yet a package exists on the Unraid USB in the path pictured.

 

I took the server offline, moved an NVMe drive from a PCIe card to the motherboard M.2 slot and added a hard drive via SATA, awaiting preclear.

 

Once this error shows, the server shuts down and reboots, going through the same process a second time.

 

 

 

WhatsApp Image 2022-10-11 at 17.52.55.jpeg

Screenshot 2022-10-11 181128.png

 

 

Edited by Be-Art
Removal of Diagnostics for security
Link to comment
17 minutes ago, Be-Art said:

FATAL: Module nvidia not found in directory /lib/modules/5.19.14-unraid

I need the full diagnostics.

 

From what I see the download is not completed, is this a fresh installation from the plugin?

Also, I think you didn't wait until the Done button was displayed in the plugin installation window and just closed it, or am I wrong?
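
If the server can still be reached on the console, a quick check like the one below can show whether a module file is present for the running kernel at all. This is a sketch under assumptions: the helper name has_nvidia_module is made up, and the file name pattern nvidia.ko* is a guess at how the module is packaged.

```shell
#!/bin/sh
# Succeeds if a module file named nvidia.ko (optionally compressed, e.g. .xz)
# exists somewhere below the given kernel modules directory.
has_nvidia_module() {
    mod_dir="$1"
    [ -d "$mod_dir" ] || return 1
    found=$(find "$mod_dir" -type f -name 'nvidia.ko*' | head -n 1)
    [ -n "$found" ]
}

# Check the directory that belongs to the currently running kernel
mod_dir="/lib/modules/$(uname -r)"
if has_nvidia_module "$mod_dir"; then
    echo "nvidia module present in $mod_dir"
else
    echo "nvidia module missing in $mod_dir (driver package may not be installed)"
fi
```

A "missing" result would fit an incomplete download, since a package that was never fully fetched cannot be installed at boot.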

Link to comment

I was not doing anything with Nvidia manually. As mentioned above, I needed to add a SATA drive and move the M.2 drive from the PCIe card to the motherboard.

The Nvidia updates are automated to run, I think at 9am (I am pretty sure), with the rest of the Docker updates.

The system was shut down, the hardware changes were made and it was booted up again; now I cannot access the UI.

I have removed NerdPack manually, as I noted it was listed as incompatible in the diagnostics.

Link to comment
12 minutes ago, Be-Art said:

The system was shut down, the hardware changes were made and it was booted up again; now I cannot access the UI.

I think this issue isn't related to the Nvidia Plugin but anyways, please move your USB boot device to your Windows PC and remove the file ../config/plugins/nvidia-driver.plg and reboot your server.

 

I can only tell you from the Diagnostics that your driver is not fully downloaded.

Link to comment

Hey, noob here. I don't know what I may be doing wrong; can someone please help me out? I only seem to be able to get 2 Mbps 720p when I'm HW transcoding, but that's the same as with software transcoding. I've been at this for months now trying to fix it, with no luck. It was working fine when I was using the iGPU, then one day it just stopped working, so I switched to a P2000, thinking that would fix the problem.

Screen Shot 2022-10-11 at 12.42.09 PM.png

Screen Shot 2022-10-11 at 12.43.53 PM.png

Screen Shot 2022-10-11 at 12.48.02 PM.png

Screen Shot 2022-10-11 at 12.50.44 PM.png

Screen Shot 2022-10-11 at 12.52.32 PM.png

Screen Shot 2022-10-11 at 12.53.37 PM.png

Screen Shot 2022-10-11 at 12.56.27 PM.png

Screen Shot 2022-10-11 at 12.57.54 PM.png

Link to comment
2 minutes ago, shoyrock said:

Hey, noob here. I don't know what I may be doing wrong; can someone please help me out? I only seem to be able to get 2 Mbps 720p when I'm HW transcoding, but that's the same as with software transcoding. I've been at this for months now trying to fix it, with no luck. It was working fine when I was using the iGPU, then one day it just stopped working, so I switched to a P2000, thinking that would fix the problem.

This seems like a Plex specific question since the Card is recognized in the plugin and seems to be working fine.

Please head over to the appropriate support thread for the container (click the icon on the Docker page and select Support).

Link to comment
11 minutes ago, Be-Art said:

Tried the above, now I get this issue. The server is still auto-restarting after the Tower login and OS version screen.

No, something wasn't removed correctly, since the plugin has now downloaded the driver successfully. It is the wrong one, but the download succeeded.

You are using a card where you need to download the legacy driver 470 series.

 

Shut down your server, pull the USB Boot device, put it in your Windows PC and remove the folder .../config/plugins/nvidia-driver and the file .../config/plugins/nvidia-driver.plg again.

It wasn't removed properly before.

 

I would also recommend that you remove NerdPack and NerdPack.plg from your boot device which live in the same folder as the other two ".../config/plugins/".
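
If the boot device is plugged into a Linux or macOS machine rather than a Windows PC, the same cleanup could look roughly like this. It is only a sketch: remove_nvidia_plugin is a made-up helper, and the mount point passed to it is a placeholder for wherever the flash drive is mounted on your machine.

```shell
#!/bin/sh
# Remove the Nvidia-Driver plugin folder and its .plg file from a mounted flash drive.
# $1: mount point of the USB boot device (placeholder, depends on your system).
remove_nvidia_plugin() {
    [ -n "$1" ] || { echo "usage: remove_nvidia_plugin <flash mount point>" >&2; return 1; }
    plugins_dir="$1/config/plugins"
    [ -d "$plugins_dir" ] || { echo "no plugins directory at $plugins_dir" >&2; return 1; }

    rm -rf "$plugins_dir/nvidia-driver"
    rm -f "$plugins_dir/nvidia-driver.plg"

    # Verify that both the folder and the file are really gone
    if [ -e "$plugins_dir/nvidia-driver" ] || [ -e "$plugins_dir/nvidia-driver.plg" ]; then
        echo "removal incomplete" >&2
        return 1
    fi
    echo "nvidia-driver plugin files removed from $plugins_dir"
}
```

Calling it as remove_nvidia_plugin followed by the mount point removes only those two entries and leaves every other plugin in the folder untouched. The final check is there because the first attempt in this thread left files behind.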

 

As said above, the boot loop has nothing to do with the Nvidia Driver from my perspective.

Link to comment
2 hours ago, ich777 said:

This seems like a Plex specific question since the Card is recognized in the plugin and seems to be working fine.

Please head over to the appropriate support thread for the container (click the icon on the Docker page and select Support).

Thanks, I'll try to see if there's support and help, but I don't think there will be. I tried all the Plex Docker containers and still had the same problem with all of them.

Link to comment

Just upgraded my server to 6.11 and I can't get my card to show up.  Followed the instructions provided to 'fix' the updated drivers, no change.  Removed and reinstalled the entire plugin, no change.  The card shows up in System Devices, but not in Nvidia-Driver.  Screenshots from Nvidia-Driver and System Devices, along with diagnostics attached here. Quadro RTX 4000 in an HP DL380 Gen7 with dual Xeon L5630, 144GB DDR3 ECC RAM, using an x16 riser for the card.

system devices.png

no devices.png

diagnostics-20221011-1500.zip

Edited by worldspawn
Link to comment
9 hours ago, worldspawn said:

Quadro RTX 4000 in an HP DL380 Gen7 with dual Xeon L5630, 144GB DDR3 ECC RAM, using an x16 riser for the card.

Was this card working before or have you installed it recently?

 

Please make sure that you've enabled Above 4G Decoding and, if there is an option for it somewhere, Resizable BAR support.

Oct 11 15:00:17 Broomhilde kernel: NVRM: GPU 0000:11:00.0: RmInitAdapter failed! (0x23:0x65:1365)
Oct 11 15:00:17 Broomhilde kernel: NVRM: GPU 0000:11:00.0: rm_init_adapter failed, device minor number 0
Oct 11 15:00:21 Broomhilde kernel: NVRM: GPU 0000:11:00.0: RmInitAdapter failed! (0x23:0x65:1365)
Oct 11 15:00:21 Broomhilde kernel: NVRM: GPU 0000:11:00.0: rm_init_adapter failed, device minor number 0
Oct 11 15:00:25 Broomhilde kernel: NVRM: GPU 0000:11:00.0: RmInitAdapter failed! (0x23:0x65:1365)
Oct 11 15:00:25 Broomhilde kernel: NVRM: GPU 0000:11:00.0: rm_init_adapter failed, device minor number 0
Oct 11 15:00:29 Broomhilde kernel: NVRM: GPU 0000:11:00.0: RmInitAdapter failed! (0x23:0x65:1365)
Oct 11 15:00:29 Broomhilde kernel: NVRM: GPU 0000:11:00.0: rm_init_adapter failed, device minor number 0
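
To see at a glance which GPU keeps failing to initialize and how often, log lines like the ones above can be summarized with a small filter. This is a sketch, not part of the plugin: count_rm_init_failures is a made-up name, and it only matches the exact "RmInitAdapter failed" wording shown here.

```shell
#!/bin/sh
# Read kernel log lines on stdin and print "<count> <PCI address>" for each GPU
# that logged "RmInitAdapter failed".
count_rm_init_failures() {
    grep 'RmInitAdapter failed' \
        | sed -n 's/.*NVRM: GPU \([0-9a-fA-F:.]*\): RmInitAdapter failed.*/\1/p' \
        | sort | uniq -c | awk '{print $1, $2}'
}
```

It reads from standard input, so it can be fed from a saved syslog file or from dmesg, for example dmesg | count_rm_init_failures. A count that keeps growing after boot suggests the driver is retrying and never gets the card up.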

 

9 hours ago, worldspawn said:

using an x16 riser for the card.

I have also seen this message a few times with bad riser cables.

Link to comment
10 hours ago, shoyrock said:

Thanks, I'll try to see if there's support and help, but I don't think there will be. I tried all the Plex Docker containers and still had the same problem with all of them.

I think you have another issue here and this is not related to HW transcoding at all. Is it possible that you've got baked-in subtitles? If yes, from what I know this can cause really high CPU usage. But as said above, I would choose the Plex container that you want to use, go to the appropriate support thread and ask for help there.

 

Also HW transcoding is working from what I see from your screenshot.

Link to comment
13 hours ago, ich777 said:

Shut down your server, pull the USB Boot device, put it in your Windows PC and remove the folder .../config/plugins/nvidia-driver and the file .../config/plugins/nvidia-driver.plg again.

It wasn't removed properly before.

I attempted this with no joy. I found info that if the second M.2 socket 'm.2_2' on the ROG Strix B450-F Gaming II is used, then 'PCIe x16_1' will run in x8 mode. I eventually gave up making adjustments in the BIOS to get the system to accept the NVMe on the motherboard and put it back on the PCIe card. It was still tricky to get into the UI afterwards, as I received a 'page not found 404' error.

 

I'm not super experienced with all aspects of PCIe. Any thoughts? The monitor would output the boot process including hardware checks, and the BIOS would see both NVMe drives.

Link to comment
10 minutes ago, Be-Art said:

After attempting this with no joy.

Can you please be a bit more precise? What is the output? As said above, I don't think that is related to the Nvidia Driver plugin at all and would recommend that you post on the General Support sub-forums.

 

11 minutes ago, Be-Art said:

I found info that if the second M.2 socket 'm.2_2' on the ROG Strix B450-F Gaming II is used, then 'PCIe x16_1' will run in x8 mode.

Also if you post in the General Support sub-forums please provide which PCIe card/NVME you put where, otherwise people won't be able to help.

 

12 minutes ago, Be-Art said:

It was still tricky to get into the UI afterwards, as I received a 'page not found 404' error.

So is your server not boot looping anymore?

 

12 minutes ago, Be-Art said:

I'm not super experienced with all aspects of PCIe. Any thoughts? The monitor would output the boot process including hardware checks, and the BIOS would see both NVMe drives.

Please also (and of course always) provide your Diagnostics with the report on the General Support sub-forums.

Link to comment
7 hours ago, ich777 said:

Was this card working before or have you installed it recently?

 

Please make sure that you've enabled Above 4G Decoding and, if there is an option for it somewhere, Resizable BAR support.

I have also seen this message a few times with bad riser cables.

The card is working NOW, as in, it is outputting video and displaying the console, just not being detected correctly in Nvidia-driver.

 

Regarding the Above 4G decoding and Resizable BAR support, I'm not sure where those are, but the previous card Quadro P4000 was operating without any issue before the update.

 

 

It's not a riser cable. Servers use risers to rotate the cards 90 degrees so they fit into the case. It's an HP factory riser, it is known good, and it was working before the server update.

Link to comment
19 minutes ago, worldspawn said:

Regarding the Above 4G decoding and Resizable BAR support, I'm not sure where those are, but the previous card Quadro P4000 was operating without any issue before the update.

In the BIOS, please make sure that you turn on Above 4G decoding and if you have an option for Resizable BAR support also turn it on.

 

You have to understand that RTX cards work a little differently on the hardware side of things. The card may be working in terms of display output, but it is NOW using a generic driver and is not hardware accelerated in any way.

 

19 minutes ago, worldspawn said:

Servers use risers for rotating the cards 90 deg so they fit into the case, it's an HP factory riser, and riser is known good, and was working before the server update.

I trust you that it was working before, but newer graphics cards are notorious for not working well with unshielded risers (regardless of whether they are cables or PCBs). That doesn't mean the riser is bad; this was just a suspicion on my side, since it would not be the first time that a riser causes an issue.

Link to comment
2 hours ago, mikeyosm said:

Any chance you could add the NVIDIA GRID vGPU driver as well? Some of us use Tesla cards like the P4 and would be very useful to be able to use this in dockers under split workload scenarios.

No, this plugin only supports consumer grade cards, sorry.

Link to comment

I'm having some problems with the plugin after upgrading to 6.11.0 and now 6.11.1. 

 

After reboot the plugin disappeared - has happened multiple times. I was able to re-install it after reboot and it is working fine.  Now while it is working I would like to be able to check for any updated Nvidia drivers. When I click on the plugin nothing happens - empty screen? Other plugins are working fine.

 

nvidia-smi is still working and my docker is working

 

nvidia-smi
Sat Oct 15 12:13:15 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.76       Driver Version: 515.76       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:05:00.0 Off |                  N/A |
| 34%   32C    P8    N/A /  N/A |      1MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
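
While the plugin page stays blank, the installed driver version can at least be read from the terminal and compared by hand against the available releases. The helper below is a sketch (driver_version is a made-up name) that just pulls the version out of the nvidia-smi banner shown above.

```shell
#!/bin/sh
# Extract the driver version from the nvidia-smi banner read on stdin.
driver_version() {
    sed -n 's/.*Driver Version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only query the GPU when nvidia-smi is actually available on this machine
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "Installed driver: $(nvidia-smi | driver_version)"
fi
```

nvidia-smi can also print the version directly with nvidia-smi --query-gpu=driver_version --format=csv,noheader, which avoids parsing the banner.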

 

Thanks for the great work on the plugin and help!

Link to comment

Hey @ich777. Got your driver installed and plex HW transcoding working perfectly, thanks for putting this together.

 

I'm having issues with another Docker container using NVENC transcoding though. I've got dizquetv going and set up right, but when it tries to run a transcode using h264_nvenc it gives the error 'Cannot load libcuda.so.1'. Vexorian's not sure, but they do seem to think it's some issue with the container accessing the GPU. Any ideas?

Edited by skwisgaarz
Link to comment

A random issue started happening, I'm assuming since upgrading to 6.11: on a fresh boot, my P400 is recognized and works great. Shortly afterwards (around 15 minutes) it errors out and the Nvidia plugin no longer recognizes the card as being present.

 

My log is flooded with the following error:

Quote

Oct 15 19:25:26 Tower kernel: NVRM: GPU 0000:42:00.0: request_irq() failed (-22)

 

any thoughts on where I should start looking?

Link to comment
5 hours ago, chr said:

I'm having some problems with the plugin after upgrading to 6.11.0 and now 6.11.1. 

 

After reboot the plugin disappeared - has happened multiple times. I was able to re-install it after reboot and it is working fine.  Now while it is working I would like to be able to check for any updated Nvidia drivers. When I click on the plugin nothing happens - empty screen? Other plugins are working fine.

 

nvidia-smi is still working and my docker is working

 

nvidia-smi
Sat Oct 15 12:13:15 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.76       Driver Version: 515.76       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:05:00.0 Off |                  N/A |
| 34%   32C    P8    N/A /  N/A |      1MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

 

Thanks for the great work on the plugin and help!

 

Ah shit - I wasn't on the last page. My issue above *I believe* is identical to chr's issue

Link to comment
5 hours ago, skwisgaarz said:

I've got dizquetv going and set up right but when it tries to run a transcode using h264_nvenc it gives the error 'Cannot load libcuda.so.1'. Vexorian's not sure but they do seem to think it's some issue with the docker accessing the gpu. Any ideas?

Are you using the nvidia-tagged dizquetv Docker image or the regular one? I have no idea whether it's all-in-one or not; I would ask on their GitHub to find out which is the right one.

 

Just as a note: when I look at the repo, there are different editions.
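
One way to narrow down a 'Cannot load libcuda.so.1' error is to check whether the library is visible to the dynamic linker inside the container at all. This is a rough sketch with assumptions: has_libcuda is a made-up helper, it expects ldconfig -p output on stdin, and that will only work on images that ship a glibc ldconfig.

```shell
#!/bin/sh
# Read `ldconfig -p` output on stdin and succeed if libcuda.so.1 is listed.
has_libcuda() {
    grep -q 'libcuda\.so\.1'
}
```

It could be used as docker exec CONTAINER ldconfig -p | has_libcuda && echo found, where CONTAINER stands for the name of your dizquetv container. If the library is not listed, the container was most likely started without the NVIDIA runtime or from an image variant that does not expect a GPU. That would fit the question about which image tag is in use.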

Link to comment
