[Plugin] Nvidia-Driver


ich777

7 hours ago, keymaster said:

Hiya

Firstly, thank you for the fantastic tutorial and guide, and sorry for whatever my mistake is. I have tried and tried to get "hw" transcoding working in Plex (with Plex Pass, using the binhex plexpass image), but without success.

I have a Quadro P400 card and everything seems correct, so please can somebody point me to my error?

 

nvidea4.png

 

nvidea2.png

nvidia1.png

nvidea3.png

I would just like to note that your screenshot shows it attempting direct play on the iPhone, so no video transcoding would happen. To my knowledge the audio stream is only ever transcoded on the CPU, so you should never see HW transcoding for it.

Link to comment
2 hours ago, edrohler said:

Are there any specific steps to upgrade to Unraid 6.9 with the existing Nvidia Plugin already working and in use?

What release are you upgrading from?

 

Note that the Nvidia plugin that worked with Unraid 6.8.3 relied on a custom build. It is not the same as the Nvidia-Driver plugin used with 6.9.0, which does not need a custom build.

Link to comment

I'm getting:

Installed GPU(s):Failed to initialize NVML: Unknown Error

 

Hardware:

ASROCK X570 TAICHI

5950X

1080 TI (used for video passthrough to VM)

1660 Super (used for video passthrough to docker container [plex])

 

 

Let me know if you need anything else; I don't know Unraid well enough to know which log to grab.

 

Solution: Double-check your VFIO bindings and make sure you aren't selecting even part of the card. I had the card's USB controller selected, which stopped the driver from loading.

 

Edited by abc123
Link to comment
5 hours ago, edrohler said:

Are there any specific steps to upgrade to Unraid 6.9 with the existing Nvidia Plugin already working and in use?

Delete the old plugin, upgrade to 6.9.0 (after this your Plex container won't be able to start because the driver is missing), then go to the CA App, install the Nvidia-Driver plugin (for a detailed explanation see the first post of this thread), and reboot.

 

2 hours ago, isorage said:

Hi, is there a way to use older drivers? I have an NVIDIA Corporation GF100GL [Quadro 4000] (rev a3), and I think it needs driver 390.141.

Please read this post and follow back the linked posts:

 

1 hour ago, abc123 said:

Let me know if you need anything else; I don't know Unraid well enough to know which log to grab.

Please make sure that you are on the latest motherboard BIOS and check whether Above 4G Decoding is enabled.

If none of this helps, try booting in Legacy or CSM mode.

If you can't get it to work, please post the Diagnostics here (Tools -> Diagnostics -> Download -> drag the file here in the textbox).

Link to comment
On 3/2/2021 at 10:11 AM, ich777 said:

Please try it in a desktop PC and install the drivers...

 

Installed the card in my spare PC, loaded up the latest drivers, and it worked like normal with no issues. I let it run for 24 hours and even installed Plex and set up HW transcoding. It ran like a champ with a 4K movie in repeat mode.

 

On 3/2/2021 at 10:11 AM, ich777 said:

It's possibly some kind of hardware combination that won't work. Have you also looked at the C-States in your BIOS? I vaguely remember that there were problems with the "older" Ryzen CPUs and C-States.

 

I threw the card back into my Unraid server, but before I did that I went into the BIOS and disabled C-States. I installed the latest version of your Nvidia Driver plug-in and I also installed GPU Statistics.

 

One thing that caught my eye was that my GPU power state was at P0 and my fan was at 90%.

 

I was also getting a TON of these warnings in my log, roughly every 2 seconds:

 

kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs

kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]

 

Mind you that at this point I have only installed both plug-ins. I have not configured Plex for HW transcoding or anything like that. This is just the card installed with the drivers on the system. Nothing else. I don't use VMs so there is no passthrough or anything like that. 

 

After doing some digging around for those errors, I came across someone in the GPU Statistics thread who was having the same issue. The only way he was able to get the card to settle down and stop logging those warnings was by running

 

nvidia-smi --persistence-mode=1

 

Once I did this, the card backed the fans down to about 45%, the power draw dropped from 28 W to 7 W, and the power state now reads P8.
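For anyone who wants to repeat this check from the Unraid terminal, here is a minimal sketch. It assumes `nvidia-smi` is on the PATH and that the card reports the query fields `pstate`, `power.draw` and `fan.speed`. The helper name `summarize_gpu_state` is made up for illustration and is not part of the plugin.

```shell
#!/bin/bash
# Hypothetical helper (not part of the plugin): reads "pstate, power, fan"
# CSV lines, as printed by nvidia-smi with --format=csv,noheader, and
# prints one summary line per GPU.
summarize_gpu_state() {
    while IFS=, read -r pstate power fan; do
        # nvidia-smi puts a space after each comma; strip it.
        power=${power# }
        fan=${fan# }
        case "$pstate" in
            P0) note="maximum performance state" ;;
            *)  note="lower power state" ;;
        esac
        echo "$pstate ($note), power $power, fan $fan"
    done
}

# Only touch the GPU if the driver tools are actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    # Same command as in the post above.
    nvidia-smi --persistence-mode=1 || echo "could not enable persistence mode" >&2
    nvidia-smi --query-gpu=pstate,power.draw,fan.speed --format=csv,noheader \
        | summarize_gpu_state
fi
```

As far as I know the persistence mode setting is lost on reboot, so it would have to be reapplied after each boot.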

 

image.png.4c46200feb7787b6956cb5a509d190f0.png

 

The card has been sitting in my system now for a little over an hour and it has not fallen off the bus like it was before. So I am going to just let it hang out for the next day or two and see what it does. 

Link to comment
46 minutes ago, SiRMarlon said:

Installed the card on my spare PC and loaded up the latest drivers and it worked like normal, no issues.

What system exactly? Intel or AMD?

 

47 minutes ago, SiRMarlon said:

kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs

kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]

I think this has something to do with the AMD platform, since I have now received a few reports that it doesn't work on AMD systems.

I have to analyze this further.

It could possibly be just a small setting in the BIOS, or something that is solved with a BIOS update...

 

If you experience crashes again, can you try different BIOS versions for your board?

 

Can you take a few pictures of the BIOS, especially of the PCIe bus settings, and send them to me via PM? (You may find something about BAR and a setting called Above 4G Decoding.)

Link to comment

I disabled 4G Decoding and am still receiving 

Installed GPU(s):	Failed to initialize NVML: Unknown Error

None of my graphics cards are in the `VFIO-PCI CFG`. This error shows before I even boot my VM; when the VM is powered down, it shows my 1080 Ti.

image.thumb.png.85d39cd6f0cf0a502458ba4d656f044a.png

 

 

Here is my System Devices with 1660 Super:

[10de:21c4] 10:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 SUPER] (rev a1)

 

 

Never mind, I had accidentally selected the USB controller of the 1660 Super for VFIO binding, which is why the driver wasn't able to see the card correctly.
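The same mistake can be spotted from the terminal by looking at which kernel driver each function of the card is bound to. This is a minimal sketch, assuming `lspci -k` prints its usual layout (a device line followed by indented `Kernel driver in use:` lines). The function name `vfio_bound_functions` is made up for illustration, and the slot prefix `10:00` is taken from the System Devices line above.

```shell
#!/bin/bash
# Hypothetical helper: reads `lspci -k` output on stdin and prints every PCI
# function under the given slot prefix that is bound to vfio-pci.
vfio_bound_functions() {
    awk -v slot="$1" '
        # Device lines start with the PCI address, e.g. "10:00.2 USB controller: ..."
        /^[0-9a-f]/ { dev = $1 }
        # Indented detail lines belong to the most recent device line.
        /Kernel driver in use: vfio-pci/ && index(dev, slot) == 1 { print dev }
    '
}

# Run against the live system only if lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -k | vfio_bound_functions "10:00"
fi
```

If the card is meant to be used by the Nvidia driver, this should print nothing; any function listed (such as the card's USB controller) is still bound to VFIO.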

Edited by abc123
Link to comment
10 hours ago, ich777 said:

What system exactly? Intel or AMD?

 

That was my old Intel system, which is on Windows 10. I can try and throw it in my new AMD system this weekend to make sure it works on that with no issues.

 

10 hours ago, ich777 said:

I think this has something to do with the AMD platform, since I have now received a few reports that it doesn't work on AMD systems.

I have to analyze this further.

It could possibly be just a small setting in the BIOS, or something that is solved with a BIOS update...

 

I have the latest BIOS for my motherboard installed.

 

10 hours ago, ich777 said:

Can you take a few pictures of the BIOS, especially of the PCIe bus settings, and send them to me via PM? (You may find something about BAR and a setting called Above 4G Decoding.)

 

Yeah I can do that once I get back to the house later this evening. 

Link to comment

I'm having an issue pulling the latest drivers upon restart.

I found this in the logs; can you advise?

 

Spoiler

Mar  5 22:20:52 Odin root: plugin: installing: /boot/config/plugins/nvidia-driver.plg
Mar  5 22:20:52 Odin root: plugin: running: anonymous
Mar  5 22:20:52 Odin root: plugin: skipping: /boot/config/plugins/nvidia-driver/nvidia-driver-2021.03.04.txz already exists
Mar  5 22:20:52 Odin root: plugin: running: /boot/config/plugins/nvidia-driver/nvidia-driver-2021.03.04.txz
Mar  5 22:20:52 Odin root: 
Mar  5 22:20:52 Odin root: +==============================================================================
Mar  5 22:20:52 Odin root: | Installing new package /boot/config/plugins/nvidia-driver/nvidia-driver-2021.03.04.txz
Mar  5 22:20:52 Odin root: +==============================================================================
Mar  5 22:20:52 Odin root: 
Mar  5 22:20:52 Odin root: Verifying package nvidia-driver-2021.03.04.txz.
Mar  5 22:20:52 Odin root: Installing package nvidia-driver-2021.03.04.txz:
Mar  5 22:20:52 Odin root: PACKAGE DESCRIPTION:
Mar  5 22:20:52 Odin root: Package nvidia-driver-2021.03.04.txz installed.
Mar  5 22:20:52 Odin root: plugin: creating: /usr/local/emhttp/plugins/nvidia-driver/README.md - from INLINE content
Mar  5 22:20:52 Odin root: plugin: running: anonymous
Mar  5 22:20:53 Odin root: 
Mar  5 22:20:53 Odin root: +==============================================================================
Mar  5 22:20:53 Odin root: | WARNING - WARNING - WARNING - WARNING - WARNING - WARNING - WARNING - WARNING
Mar  5 22:20:53 Odin root: |
Mar  5 22:20:53 Odin root: | Don't close this window with the red 'X' in the top right corner until the 'DONE' button is displayed!
Mar  5 22:20:53 Odin root: |
Mar  5 22:20:53 Odin root: | WARNING - WARNING - WARNING - WARNING - WARNING - WARNING - WARNING - WARNING
Mar  5 22:20:53 Odin root: +==============================================================================
Mar  5 22:20:53 Odin root: 
Mar  5 22:20:53 Odin root: --------------------Nvidia driver v455.45.01 found locally---------------------
Mar  5 22:20:53 Odin root: 
Mar  5 22:20:53 Odin root: -----------------Installing Nvidia Driver Package v455.45.01-------------------
Mar  5 22:20:53 Odin dhcpcd[2791]: br0: leased 10.0.0.10 for 7200 seconds
Mar  5 22:20:53 Odin dhcpcd[2791]: br0: adding route to 10.0.0.0/24
Mar  5 22:20:53 Odin dhcpcd[2791]: br0: changing default route via 10.0.0.1
Mar  5 22:20:53 Odin dhcpcd[2791]: br0: deleting route to 169.254.0.0/16
Mar  5 22:20:53 Odin dhcpcd[2791]: br0: pid 2791 deleted default route via 10.0.0.1
Mar  5 22:20:54 Odin ntpd[2864]: Listen normally on 3 br0 10.0.0.10:123
Mar  5 22:20:54 Odin ntpd[2864]: Deleting interface #1 br0, 169.254.17.123#123, interface stats: received=0, sent=0, dropped=0, active_time=7 secs
Mar  5 22:20:54 Odin ntpd[2864]: new interface(s) found: waking up resolver
Mar  5 22:20:55 Odin nmbd[2903]: [2021/03/05 22:20:55.180477,  0] ../../source3/libsmb/nmblib.c:922(send_udp)
Mar  5 22:20:55 Odin nmbd[2903]:   Packet send failed to 169.254.255.255(138) ERRNO=Network is unreachable
Mar  5 22:20:57 Odin nmbd[2903]: [2021/03/05 22:20:57.182874,  0] ../../source3/libsmb/nmblib.c:922(send_udp)
Mar  5 22:20:57 Odin nmbd[2903]:   Packet send failed to 169.254.255.255(138) ERRNO=Network is unreachable
Mar  5 22:20:59 Odin nmbd[2903]: [2021/03/05 22:20:59.185263,  0] ../../source3/libsmb/nmblib.c:922(send_udp)
Mar  5 22:20:59 Odin nmbd[2903]:   Packet send failed to 169.254.255.255(138) ERRNO=Network is unreachable
Mar  5 22:21:01 Odin nmbd[2903]: [2021/03/05 22:21:01.187496,  0] ../../source3/libsmb/nmblib.c:922(send_udp)
Mar  5 22:21:01 Odin nmbd[2903]:   Packet send failed to 169.254.255.255(138) ERRNO=Network is unreachable
Mar  5 22:21:03 Odin nmbd[2903]: [2021/03/05 22:21:03.189500,  0] ../../source3/libsmb/nmblib.c:922(send_udp)
Mar  5 22:21:03 Odin nmbd[2903]:   Packet send failed to 169.254.255.255(138) ERRNO=Network is unreachable
Mar  5 22:21:03 Odin nmbd[2903]: [2021/03/05 22:21:03.189603,  0] ../../source3/libsmb/nmblib.c:922(send_udp)
Mar  5 22:21:03 Odin nmbd[2903]:   Packet send failed to 169.254.255.255(137) ERRNO=Network is unreachable
Mar  5 22:21:03 Odin nmbd[2903]: [2021/03/05 22:21:03.189631,  0] ../../source3/nmbd/nmbd_packets.c:179(send_netbios_packet)
Mar  5 22:21:03 Odin nmbd[2903]:   send_netbios_packet: send_packet() to IP 169.254.255.255 port 137 failed
Mar  5 22:21:03 Odin nmbd[2903]: [2021/03/05 22:21:03.189666,  0] ../../source3/nmbd/nmbd_nameregister.c:581(register_name)
Mar  5 22:21:03 Odin nmbd[2903]:   register_name: Failed to send packet trying to register name #001#002__MSBROWSE__#002<01>
Mar  5 22:21:29 Odin kernel: nvidia: loading out-of-tree module taints kernel.
Mar  5 22:21:29 Odin kernel: nvidia: module license 'NVIDIA' taints kernel.
Mar  5 22:21:29 Odin kernel: Disabling lock debugging due to kernel taint
Mar  5 22:21:29 Odin kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 245
Mar  5 22:21:29 Odin kernel: 
Mar  5 22:21:29 Odin kernel: nvidia 0000:02:00.0: enabling device (0100 -> 0103)
Mar  5 22:21:29 Odin kernel: nvidia 0000:02:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
Mar  5 22:21:29 Odin kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  455.45.01  Thu Nov  5 23:03:56 UTC 2020
Mar  5 22:21:29 Odin kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
Mar  5 22:21:29 Odin kernel: Linux agpgart interface v0.103
Mar  5 22:21:29 Odin kernel: nvidia-uvm: Loaded the UVM driver, major device number 243.
Mar  5 22:21:29 Odin kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  455.45.01  Thu Nov  5 22:55:44 UTC 2020
Mar  5 22:21:29 Odin kernel: [drm] [nvidia-drm] [GPU ID 0x00000200] Loading driver
Mar  5 22:21:29 Odin kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:02:00.0 on minor 0
Mar  5 22:21:31 Odin root: 
Mar  5 22:21:31 Odin root: --------------Installation of Nvidia driver v455.45.01 successfull-------------

 

Also, FYI, you spelled "successfull" incorrectly; it's "successful".

It's minor, I just noticed it while looking at these logs :)

Link to comment

Good Afternoon,

A thousand thanks for all the hard work on this plug-in!

 

I just upgraded from 6.8.3 to 6.9 following the plug-in instructions: plug-in removed and reinstalled, Docker disabled/enabled, with multiple reboots. Hardware encoding is now broken: playback stops after 2-3 seconds and errors out with "An unknown error occurred (4294967283)". Hardware decoding is working, and I've just disabled HW encoding for the time being. It was working without issue for the last year. I did remove the container and reinstall it.

 

repo: linuxserver/plex
Plex Version: 1.21.4.4079

 

X9DRi-LN4+/X9DR3-LN4
2x Nvidia GP107GL [Quadro P400]

 

What information can I supply, and what troubleshooting steps would you recommend?

 

I've also tried switching to binhex's plexpass container, which shows the same behavior of stopping after a few seconds.

 

Should I post this issue somewhere else?

 

image.thumb.png.52072ec2ab8fa56376f4e0f36cff0d4f.png

 

seine-diagnostics-20210305-1319.zip

Edited by Grrrreg
typo
Link to comment
19 hours ago, ich777 said:

Delete the old plugin, upgrade to 6.9.0 (after this your Plex container won't be able to start because the driver is missing), then go to the CA App, install the Nvidia-Driver plugin (for a detailed explanation see the first post of this thread), and reboot.

 

Please read this post and follow back the linked posts:

 

Please make sure that you are on the latest motherboard BIOS and check whether Above 4G Decoding is enabled.

If none of this helps, try booting in Legacy or CSM mode.

If you can't get it to work, please post the Diagnostics here (Tools -> Diagnostics -> Download -> drag the file here in the textbox).

Thanks for the info. Yes, the card is old. I just got a new one for my botting server and wanted to move this one to my media server, as it's better than the onboard one (server board). I use Kodi, so I don't need it, but I was thinking it would improve my VM when I open it in the web UI (Win 10, just to go online under the VPN). I guess I'll just retire this card; it had a good run.

Link to comment

Just an FYI on this system.

 

AMD Ryzen 7 2700X / ASRock B450 ITX / Nvidia Quadro P2000. It's been about 24 hours since I installed the card back into the system. I disabled C-States in the BIOS, I am running the card with the nvidia-smi persistence mode setting mentioned above, and the card has not fallen off the bus like it was doing before. I am going to let it run through the weekend, and if by Monday it has not fallen off the bus I will go ahead and reconfigure Plex to use the card and put it under load with a transcode to see if it still stays stable.

 

I am on Driver version 460.56

Link to comment
7 hours ago, SavellM said:

--------------------Nvidia driver v455.45.01 found locally---------------------

This line says that the plugin automatically installs the "old" driver, but I think only because that version is the one selected. (I have built in a routine that checks whether it can actually get all releases and falls back to the installed one if it has no access to the internet or even to GitHub.)

Please try to select 'latest' in the Plugin, press 'Update' and then restart (you can actually check what version would be installed before rebooting if you do 'cat /boot/config/plugins/nvidia-driver/settings.cfg' in the terminal).
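To compare what is configured with what was actually installed at boot, the `cat` above can be paired with a search of the syslog for the plugin's install messages. This is a minimal sketch, assuming the syslog is at `/var/log/syslog` and that the plugin logs lines of the form seen in the log posted above ("Installation of Nvidia driver v..."). The function name `installed_driver_versions` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: reads syslog text on stdin and prints the driver
# version from every "Installation of Nvidia driver v<version>" line.
installed_driver_versions() {
    grep -oE 'Installation of Nvidia driver v[0-9.]+' \
        | grep -oE '[0-9]+(\.[0-9]+)+'
}

# Show the plugin settings (path taken from the post above), if present.
cfg=/boot/config/plugins/nvidia-driver/settings.cfg
if [ -r "$cfg" ]; then
    cat "$cfg"
fi

# Show which driver version(s) the plugin reported installing since boot.
# The syslog path is an assumption.
if [ -r /var/log/syslog ]; then
    installed_driver_versions < /var/log/syslog \
        || echo "no install message found in syslog"
fi
```

If the version printed from the syslog differs from what is selected in the settings, the fallback to the locally stored driver has probably kicked in.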

 

7 hours ago, Grrrreg said:

Hardware encoding is now broken, playback stopping after 2-3 seconds and erroring out.

A few users on the first few pages of this thread had the same problem; I hope you don't mind me asking you to read through those posts. I think a reinstall of the container solves the issue. One user even said that clicking 'force update' would solve the problem, but I don't think that is enough...

Please keep me updated...

 

2 hours ago, SiRMarlon said:

Just an FYI on this system.

Thank you for the frequent updates. :)

 

If this is the solution to this problem I'll be very thankful! :)

Are the warnings gone from the syslog, or are they still in there?

 

I also have another user on the German subforums who has a similar problem with an ASRock X570 Taichi. I'm not sure if it's related to ASRock or to newer AMD systems in general.

Link to comment
53 minutes ago, ich777 said:

Thank you for the frequent updates. :)

 

If this is the solution to this problem I'll be very thankful! :)

Are the warnings gone from the syslog, or are they still in there?

 

I also have another user on the German subforums who has a similar problem with an ASRock X570 Taichi. I'm not sure if it's related to ASRock or to newer AMD systems in general.

 

All the warnings in the syslog are gone. Hopefully it stays stable for the next few days and I can get back to using it. :)
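For anyone checking the same thing, the warnings can be counted directly instead of scrolling through the log. This is a minimal sketch, assuming the syslog is at `/var/log/syslog` and matching on the `mapping multiple BARs` text from the warnings quoted earlier in the thread. The function name `count_bar_warnings` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: reads syslog text on stdin and prints how many
# "mapping multiple BARs" warnings it contains.
count_bar_warnings() {
    # grep -c prints 0 but exits non-zero when nothing matches, so
    # normalise the exit status.
    grep -c 'mapping multiple BARs' || true
}

# The syslog path is an assumption.
if [ -r /var/log/syslog ]; then
    echo "BAR warnings since boot: $(count_bar_warnings < /var/log/syslog)"
fi
```

Running it twice a few minutes apart shows whether the count is still climbing.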

Link to comment
3 minutes ago, SiRMarlon said:

All the warnings in the syslog are gone. Hopefully it stays stable for the next few days and I can get back to using it. :)

Have you done anything else? If the warnings are gone, I think you can try transcoding over the weekend...

It should work now hopefully.

Link to comment
2 hours ago, ich777 said:

This line says that the plugin automatically installs the "old" driver, but I think only because that version is the one selected. (I have built in a routine that checks whether it can actually get all releases and falls back to the installed one if it has no access to the internet or even to GitHub.)

Please try to select 'latest' in the Plugin, press 'Update' and then restart (you can actually check what version would be installed before rebooting if you do 'cat /boot/config/plugins/nvidia-driver/settings.cfg' in the terminal).

I did select 'latest'; it then asked me to restart, which I did twice.

This is what I saw.

 

I guess I'll try reinstalling the app again too... Thanks for your efforts.

Link to comment
