[Plugin] Nvidia-Driver


ich777


Nope, no firewall. It seems to be working, though, after updating Unraid to version 6.10, so that may have solved the problem. I will update if this run is successful.
Edit: Updating to 6.10 or higher seems to have fixed the problem. I was able to install the plugin, reboot, and then reinstall Plex with the --runtime=nvidia flag successfully.

Edited by KnobleOutlaw
update
Link to comment
21 minutes ago, KnobleOutlaw said:

Nope, no firewall. It seems to be working, though, after updating Unraid to version 6.10, so that may have solved the problem. I will update if this run is successful.

Please note that I've tried it several times over here and the download speeds are really all over the place: sometimes 100 KB/s and sometimes ~180 Mbit/s.

What I want to say is that at 100 KB/s the ~200 MB driver download can take a long time (more than half an hour).
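
To put rough numbers on this, here is a small sketch that estimates the download time for the two speeds mentioned above. It assumes decimal units (1 MB = 1000 KB, 8 bits per byte) and a constant speed; the function names are made up for this illustration.

```python
def download_seconds(size_mb, speed_kb_per_s):
    """Estimated seconds to fetch size_mb megabytes at speed_kb_per_s kilobytes per second."""
    return size_mb * 1000 / speed_kb_per_s


def mbit_to_kb_per_s(mbit_per_s):
    """Convert megabits per second to kilobytes per second (decimal units)."""
    return mbit_per_s * 1000 / 8


def human(seconds):
    """Format a duration in seconds as minutes and seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"


if __name__ == "__main__":
    driver_mb = 200  # approximate driver size mentioned in the thread
    for label, speed in [("100 KB/s", 100.0), ("180 Mbit/s", mbit_to_kb_per_s(180))]:
        print(label, "->", human(download_seconds(driver_mb, speed)))
```

At 100 KB/s this works out to a bit over 33 minutes, while at ~180 Mbit/s the same file is done in seconds, so a missing DONE button after a few minutes does not necessarily mean the install is stuck.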

Link to comment
16 minutes ago, Fantomen said:

Sorry. See attached. 

Since this is a Dell PowerEdge and the drivers are loaded, I would strongly recommend that you first check if any BIOS updates are available.

Make sure that you've enabled Above 4G Decoding and see if you have any options for Resizable BAR.

I'm not too familiar with Dell PowerEdge servers, but I know for a fact that if one BIOS option is set wrong the card won't initialize:

Jul 24 18:38:04 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x25:0x65:1428)
Jul 24 18:38:04 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0
Jul 24 18:38:05 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x23:0xffff:1382)
Jul 24 18:38:05 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0
Jul 24 18:39:53 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x23:0xffff:1382)
Jul 24 18:39:53 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0
Jul 24 18:39:53 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x23:0xffff:1382)
Jul 24 18:39:53 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0
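
If you want to check a syslog for these lines yourself, something like the following sketch would pull them out. It is only an illustration: the regular expression is written against the exact wording of the lines above, the function name is made up, and the sample text is copied from this post.

```python
import re

# Matches lines such as:
#   ... kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x25:0x65:1428)
NVRM_FAIL = re.compile(
    r"NVRM: GPU (?P<pci>[0-9a-fA-F:.]+): RmInitAdapter failed! \((?P<code>[^)]*)\)"
)


def find_init_failures(log_text):
    """Return a (pci_address, error_code) tuple for every RmInitAdapter failure."""
    return [(m.group("pci"), m.group("code")) for m in NVRM_FAIL.finditer(log_text)]


SAMPLE = (
    "Jul 24 18:38:04 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x25:0x65:1428)\n"
    "Jul 24 18:38:04 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0\n"
    "Jul 24 18:38:05 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x23:0xffff:1382)\n"
)

if __name__ == "__main__":
    for pci, code in find_init_failures(SAMPLE):
        print(pci, code)
```

To use it on a real log, read the syslog file from the Diagnostics zip into a string and pass that to find_init_failures() instead of SAMPLE.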

 

If possible, please also try a different PCIe slot. Also, do you know for certain that the card works fine?

Link to comment
3 hours ago, ich777 said:

Since this is a Dell PowerEdge and the drivers are loaded, I would strongly recommend that you first check if any BIOS updates are available.

Make sure that you've enabled Above 4G Decoding and see if you have any options for Resizable BAR.

I'm not too familiar with Dell PowerEdge servers, but I know for a fact that if one BIOS option is set wrong the card won't initialize:

Jul 24 18:38:04 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x25:0x65:1428)
Jul 24 18:38:04 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0
Jul 24 18:38:05 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x23:0xffff:1382)
Jul 24 18:38:05 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0
Jul 24 18:39:53 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x23:0xffff:1382)
Jul 24 18:39:53 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0
Jul 24 18:39:53 Anton kernel: NVRM: GPU 0000:82:00.0: RmInitAdapter failed! (0x23:0xffff:1382)
Jul 24 18:39:53 Anton kernel: NVRM: GPU 0000:82:00.0: rm_init_adapter failed, device minor number 0

 

If possible, please also try a different PCIe slot. Also, do you know for certain that the card works fine?

I just plugged the card into my desktop and can confirm that it’s working. 
 

4G decoding is enabled. 
 

So far I have tried two PCIe slots, one x16 and one x8. I’ll try the other four, though I believe they are x8 only.
 

I will also try to pass the card to a VM. If that works, then I’m not sure what is next.

Link to comment
13 minutes ago, Fantomen said:

I will also try to pass the card to a VM. If that works, then I’m not sure what is next.

VMs are a different kind of thing…

 

I think a good start would be to look here.

I don't really like Dell servers since they are sometimes a little finicky…


BTW, this is nothing technical and not related to your issue at all, but I find it kind of funny that you are trying to run an HP card in a Dell server… :D

Link to comment
10 hours ago, ich777 said:

BTW, this is nothing technical and not related to your issue at all, but I find it kind of funny that you are trying to run an HP card in a Dell server… :D

Haha, I managed to get the card for a good price on eBay (200€).

 

10 hours ago, ich777 said:

VMs are a different kind of thing…

 

I understand that, but if the card works in a VM, it should mean that there are no errors with the risers/PCIe slots/BIOS.

 

10 hours ago, ich777 said:

I think a good start would be to look here.

I don't really like Dell servers since they are sometimes a little finicky…

 

I will take a look and also see if I can PM someone who got it to work. Maybe I am overlooking something in the BIOS.

Link to comment
16 minutes ago, Fantomen said:

I understand that, but if the card works in a VM, it should mean that there are no errors with the risers/PCIe slots/BIOS.

Yes and no, because a VM handles everything differently: the card is basically initialized from scratch when you start the VM, just as it would be if it were installed in a real computer.

Hope this makes a bit of sense to you.

 

18 minutes ago, Fantomen said:

I will take a look and also see if I can PM someone who got it to work. Maybe I am overlooking something in the BIOS.

Don't get me wrong, but HP and Dell are really "special" about their branded hardware. I remember a time when some components wouldn't work unless they were specifically from Dell or HP, and some parts wouldn't work in other computers unless the machine was from Dell or HP.

I only wanted to point that out. ;)

Also make sure that the card gets enough power. I remember a few users here, also with Dell servers, who all had issues with these kinds of cards and servers.

 

Also, while trying the card in your personal computer, did you install the drivers and put a 3D load on it?

 

Do you boot Unraid with UEFI or CSM (Legacy)?

If you are using UEFI try CSM instead.
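
If you are not sure which mode the server actually booted in, a quick way to check on Linux is whether the kernel exposes the /sys/firmware/efi directory, which only exists after a UEFI boot. A minimal sketch, assuming Python 3 is available on the machine (the function name and the sysfs_root parameter are made up for this example, the latter only to make the check testable):

```python
import os


def boot_mode(sysfs_root="/sys"):
    """Return "UEFI" if <sysfs_root>/firmware/efi exists, otherwise "Legacy (CSM)"."""
    efi_dir = os.path.join(sysfs_root, "firmware", "efi")
    return "UEFI" if os.path.isdir(efi_dir) else "Legacy (CSM)"


if __name__ == "__main__":
    print("Booted in", boot_mode(), "mode")
```

This only reports how the running system was started; it does not change any boot setting.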

Link to comment

Hi @ich777 ! First of all thanks for the amazing support regarding this plugin.

 

So I'm currently in the same situation as @Fantomen but with a different PC (actually I tried with two and got the same results) and a different but supported GPU. I tried with my old custom PC and more recently with my HP Z440. My GPU is an MSI GTX 1050 Ti. I get video output and everything is working fine, but in the plugin settings it's not being detected. I can see it in the devices list and I can assign it to a VM (I just checked in case the GPU was defective), but no luck with the plugin.

 

I've tried multiple solutions from various replies (I kind of lost count of what I tried, lol), so thanks for any advice you'll be able to give me!

 

FYI, I did update my BIOS yesterday prior to moving my setup to the Z440.

unraid-diagnostics-20220728-1328.zip

Edited by mrwookie
Adding details.
Link to comment

Hi All,

 

So I have my Unraid server running pretty well. I have a Gigabyte Aorus Master motherboard with an AMD 3900X processor and 32 GB RAM. An RTX 2070 Super is installed in PCIe slot 1 and I get output on a monitor just fine. When I install the Nvidia Driver app it installs and I reboot, but when I open the app it shows no GPU found even though there is monitor output. I have updated the BIOS to the latest version and tried all the BIOS settings people have suggested on the forum, but to no avail. The GPU is found in System Devices and could also be bound if I were running a VM (which I am not). I have attached the diagnostics, as I am at a total loss now and have spent two days trying to get the GPU seen by the Nvidia Driver app.

 

Any help or thoughts greatly appreciated.

unraid-diagnostics-20220729-1243.zip

Link to comment
15 hours ago, tDames said:

I'm having an issue. I have an EVGA 1060 6GB and installed the plugin as per the instructions. It shows up for some time and then just disappears for no reason.

Was this Diagnostics file taken after the card dropped and was no longer recognized by your system, or while it was still recognized?

It looks like it was taken after a restart, because the card is recognized. It would be really helpful if you could post the Diagnostics from when the card is no longer recognized by your system.

 

Please also try to boot with Legacy Boot (CSM) instead of UEFI.

Link to comment
22 minutes ago, johntankard said:

So, output was

Did you actually wait until the driver plugin was fully downloaded and click the DONE button, or did you close the window with the red X on top?

 

Please remove the plugin, reboot and install it from the CA App again.

Seems like the driver package isn't fully downloaded.

 

22 minutes ago, johntankard said:

I have attached latest diagnostics again....

Where? Also, this isn't necessary; please follow the steps above and wait until the DONE button is displayed during plugin installation.

Link to comment
14 minutes ago, johntankard said:

I know not to click the red X; however, I have left it for over 1 hr and it just sat on the same screen with no DONE button....

If the DONE button isn't displayed, the driver download hasn't finished and the driver can't work.

 

GitHub gave me some trouble recently with download speeds, and with a ~200 MB driver, a download speed of about 15 KB/s can take a while.

 

Can you try to download this file, just to see which speeds you get (you don't need the file itself, this is only to see how fast the download is for you)? Currently I'm getting about 18 MB/s down from GitHub.

 

Please follow the procedure described above: remove the driver, reboot, and try to install it again from the CA App. I hope that it doesn't take this long to download this time.

Link to comment

@ich777 I followed your instructions and I'm now booting in Legacy mode and still no luck.
 

  1. I have 2 GPUs (one AMD and one Nvidia). If I unplug the power of the Nvidia, I get the following message in the boot sequence log right before the login lines appear: "Modprobe: ERROR: could not find module by name 'nvidia'"
  2. If I plug in both, with the Nvidia in the primary slot and the AMD in the secondary slot, I don't get the error, but the GPU is not detected by the plugin.
  3. After enabling Legacy boot (and disabling all UEFI boot options to make sure that I was booting in Legacy), I uninstalled the Nvidia driver plugin, rebooted as you instructed others in the forum, and did a clean install. I didn't click the red X but waited, clicked the DONE button, set Docker to OFF, Apply, Docker to ON, Apply, then checked if the GPU was detected. No luck, so I rebooted again and still no luck.
  4. If I type "modprobe nvidia" in the Unraid terminal, I don't see any output, and if I type "nvidia-smi" I get "No devices were found".

Here's my diagnostic after all those changes.

unraid-diagnostics-20220729-1054.zip

Link to comment

Hi Ich777,

 

So it's been over an hour and it is still stuck on "do not close this screen". I have performed some network tests and all is OK, and I have no problem installing other apps and seeing DONE appear. What I have noticed, though, is that when I start a clean install of the Nvidia driver it starts downloading at 3 MB/s for approx. 5 seconds and then the download stops. I tried downloading that file you sent me and it downloads at a steady 2.3 MB/s.

 

I just can't seem to pull the driver from GitHub. I use pfSense, but it's pretty much a vanilla install and, as I said, there is no problem installing other apps.

 

John

Link to comment
25 minutes ago, johntankard said:

I tried downloading that file you sent me and it downloads at a steady 2.3MB/s.

Then download the file that I've linked above on your computer and put it in the folder '/boot/config/plugins/nvidia-driver/packages/5.15.46/' on your server. After that, reboot and the driver should be installed.
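
For anyone who prefers to script the manual copy, here is a rough sketch of that step. It is not part of the plugin: the function name is made up, the package file name in the usage note is a placeholder (use whatever the linked file is called), and it assumes Python 3 on a machine where the flash drive is reachable. It also checks free space on the target first.

```python
import os
import shutil


def place_driver_package(src_file, packages_root, kernel_version):
    """Copy src_file into <packages_root>/<kernel_version>/ and return the new path.

    Raises OSError if the target filesystem has less free space than the file needs.
    """
    target_dir = os.path.join(packages_root, kernel_version)
    os.makedirs(target_dir, exist_ok=True)

    needed = os.path.getsize(src_file)
    free = shutil.disk_usage(target_dir).free
    if free < needed:
        raise OSError(f"only {free} bytes free in {target_dir}, need {needed}")

    # copyfile copies contents only, which avoids permission errors on FAT drives.
    destination = os.path.join(target_dir, os.path.basename(src_file))
    return shutil.copyfile(src_file, destination)
```

A call would look like place_driver_package("downloaded-driver-package", "/boot/config/plugins/nvidia-driver/packages", "5.15.46"), where the first argument is the placeholder for the downloaded file and 5.15.46 is the folder named above.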

 

Have you maybe done anything custom to Unraid, custom DNS servers...? Any AdBlocking software on your network or something like that?

Where are you located in the world?

 

Maybe try it again in a few hours. I really can't help here, since this is a download issue with GitHub. I've seen the same issue with German users, where the files downloaded at only a few KB/s and the download took forever.

 

What kind of USB Boot device do you have? Do you have enough free space on there?

Link to comment
44 minutes ago, mrwookie said:

I have 2 GPUs (one AMD and one Nvidia). If I unplug the power of the Nvidia, I get the following message in the boot sequence log right before the login lines appear : "Modprobe: ERROR: could not find module by name 'nvidia' "

Are you really sure that this is the message when you unplug the card? This indicates another issue...

 

49 minutes ago, mrwookie said:

If I type "modprobe nvidia" in the Unraid terminal, I don't see any output, and if I type "nvidia-smi" I get "No devices were found".

The drivers are already loaded from what I see here:

 

02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] [10de:1c82] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] GP107 [GeForce GTX 1050 Ti] [1462:3351]
	Kernel driver in use: nvidia
	Kernel modules: nvidia_drm, nvidia
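
If you want to check this on your own system, output in this format (it looks like what `lspci -nnk` prints) can be scanned for the "Kernel driver in use" line of each Nvidia device. The sketch below is only an illustration: the function name is made up and the sample text is the output quoted above.

```python
def kernel_driver_in_use(lspci_text, vendor="NVIDIA"):
    """Map the PCI address of each matching device to its 'Kernel driver in use' value.

    Devices without such a line map to None.
    """
    result = {}
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device entry.
            current = line.split()[0] if vendor in line else None
            if current is not None:
                result[current] = None
        elif current is not None and "Kernel driver in use:" in line:
            result[current] = line.split(":", 1)[1].strip()
    return result


SAMPLE = (
    "02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 "
    "[GeForce GTX 1050 Ti] [10de:1c82] (rev a1)\n"
    "\tSubsystem: Micro-Star International Co., Ltd. [MSI] GP107 "
    "[GeForce GTX 1050 Ti] [1462:3351]\n"
    "\tKernel driver in use: nvidia\n"
    "\tKernel modules: nvidia_drm, nvidia\n"
)

if __name__ == "__main__":
    print(kernel_driver_in_use(SAMPLE))
```

Keep in mind that a bound driver only means the module attached to the device; as this case shows, nvidia-smi can still report "No devices were found" if the card fails to initialize.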

 

Have you enabled Above 4G Decoding and Resizable BAR Support in your BIOS too?

 

I can only imagine that this is some kind of weird HP thing where the card doesn't initialize successfully...

Multiple users of HP or Dell servers have had issues, and a single wrongly set BIOS option can prevent the card from working.

 

What's really weird is that I have no indication that the driver isn't loaded successfully on your system, instead everything seems fine.

 

I would also recommend preventing UEFI boot in Unraid itself (click on the blue Flash text in the Main tab; there should be a checkbox for that. Don't forget to click Apply).

Link to comment
