[Plugin] Nvidia-Driver


ich777


17 minutes ago, ich777 said:

Please read the first recommended post on top.

 

There is no easy way to do that.

 

Sorry, but I can't help here since this card is simply not supported.

Maybe try to get a slightly newer card like a Quadro P400 or similar that is supported by the plugin.

I would still recommend that you get something like an Nvidia T400 since it's a recent card.

If you don't plan or want to use acceleration within Docker containers through your Nvidia graphics card then don't install this plugin!

 

Okay, so this isn't needed for me and my needs, and my issue has nothing to do with a missing driver, or at least nothing to do with this plugin. Thanks for the quick reply!

10 minutes ago, Ace319 said:

Okay, so this isn't needed for me and my needs, and my issue has nothing to do with a missing driver, or at least nothing to do with this plugin.

I think you misunderstood: the legacy driver won't work on the new kernels that Unraid is based on, so this card is basically obsolete. That's why I recommended a new card. :)

On 2/4/2024 at 12:35 PM, ich777 said:

Does your server have an active Internet connection on boot? In other words, does the PiHole that you were talking about run on your Unraid server?

 

If your server has an active Internet connection on boot, then simply restart, but keep in mind that the boot will take longer since it will download the driver on boot.

@ich777 Had to go back a while, but I did this step and the driver has updated. I'm still seeing the same web error in the pop-up about trying to download the last listed version, as well as the PHP error on the left side.

 

Guessing I'd need to remove the plugin, reboot, reinstall the plugin yeah? Don't see any steps on removing the driver itself. 

16 minutes ago, jaybird2203 said:

Guessing I'd need to remove the plugin, reboot, reinstall the plugin yeah? Don't see any steps on removing the driver itself. 

Yes, as I wrote above.

Some users strangely have that issue but I never was able to reproduce this.

43 minutes ago, ich777 said:

Yes, as I wrote above.

Some users strangely have that issue but I never was able to reproduce this.

Yeah, that sorted the issue on my setup - thanks for the direction!

 

Always get cautious about a plugin that manages a driver and the direction given is to remove the driver lol

Edited by jaybird2203
10 minutes ago, jaybird2203 said:

Always get cautious about a plugin that manages a driver and the direction given is to remove the driver lol

Nothing to worry about on Unraid and how the plugin system works with drivers.


Hello,

I'm looking for some assistance with an ongoing issue that's been giving me a bit of trouble.

My GPU, a 1060, seems to have disappeared from view. It's been chugging along fine for quite some time, particularly serving its purpose for transcoding on Plex.
 

Initially, I suspected a GPU hardware fault, possibly indicating the need for a replacement. However, I tested by booting into my gaming PC on the same rig (dual boot) and played a solid three-hour Battlefield session without a hitch. This seems to suggest that everything is shipshape on the hardware front.
 

In an effort to troubleshoot, I've recently updated to the latest version of Unraid (I was only a few versions behind) and also ensured I'm running the latest release branch of the Nvidia driver to cover all bases.


Please see the logs:
[   48.087597] [drm] [nvidia-drm] [GPU ID 0x00000a00] Loading driver
[   48.088477] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:0a:00.0 on minor 0


[  137.155842] nvidia-uvm: Loaded the UVM driver, major device number 239.
[  137.685108] NVRM: GPU at PCI:0000:0a:00: GPU-24fbbf6a-793a-da81-f287-80f2835cfcc5
[  137.685126] NVRM: Xid (PCI:0000:0a:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
[  137.685139] NVRM: GPU 0000:0a:00.0: GPU has fallen off the bus.
[  137.714976] NVRM: GPU 0000:0a:00.0: request_irq() failed (-22)
[  137.715000] NVRM: GPU 0000:0a:00.0: request_irq() failed (-22)

[  152.103589] NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x22:0x56:762)
[  152.103641] NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0
[  152.111316] NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x22:0x56:762)
[  152.111361] NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0
[  153.059424] NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x22:0x56:762)
[  153.059469] NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0
[  153.063811] NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x22:0x56:762)
[  153.063842] NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0

*** /proc/driver/nvidia/./gpus/0000:0a:00.0/information
*** ls: -r--r--r-- 1 root root 0 2024-04-01 15:16:20.086794978 +1000 /proc/driver/nvidia/./gpus/0000:0a:00.0/information
Model:          NVIDIA GeForce GTX 1060 6GB
IRQ:            114
GPU UUID:      GPU-24fbbf6a-793a-da81-f287-80f2835cfcc5
Video BIOS:      ??.??.??.??.??
Bus Type:      PCIe
DMA Size:      47 bits
DMA Mask:      0x7fffffffffff
Bus Location:      0000:0a:00.0
Device Minor:      0
GPU Excluded:     No

*** /proc/driver/nvidia/./gpus/0000:0a:00.0/unbindLock does not exist

Any suggestions would be great.

11 hours ago, MxFox said:

Initially, I suspected a GPU hardware fault, possibly indicating the need for a replacement. However, I tested by booting into my gaming PC on the same rig (dual boot) and played a solid three-hour Battlefield session without a hitch. This seems to suggest that everything is shipshape on the hardware front.

 

I don't have a solution for your issue, but I do applaud your diligence in testing the hardware.  😄

Edited by ConnerVT
13 hours ago, MxFox said:

I'm looking for some assistance with an ongoing issue that's been giving me a bit of trouble.

Please always include Diagnostics.

 

Anyway, you got an Xid 79 error, which you can read more about here.

 

On what BIOS version are you? In which slot on your Motherboard is the card? Is Above 4G Decoding and Resizable BAR support enabled in the BIOS?
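If it helps with spotting these in a long syslog, here is a minimal Python sketch that pulls the Xid events out of kernel log text. It is only an illustration: the function name `find_xid_events` and the regular expression are my own assumptions, modelled on the exact line format in the log posted above, and it assumes Python 3 is available wherever you run it (you can paste the log text on any machine).

```python
import re

# Matches kernel log lines in the format posted above, for example:
#   [  137.685126] NVRM: Xid (PCI:0000:0a:00): 79, pid='<unknown>', ...
XID_PATTERN = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+NVRM: Xid "
    r"\(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<code>\d+),"
)


def find_xid_events(log_text):
    """Return a list of (seconds_since_boot, pci_address, xid_code) tuples."""
    events = []
    for line in log_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            events.append(
                (float(match["time"]), match["pci"], int(match["code"]))
            )
    return events


# Three lines copied from the log in the post above.
sample = """\
[  137.685108] NVRM: GPU at PCI:0000:0a:00: GPU-24fbbf6a-793a-da81-f287-80f2835cfcc5
[  137.685126] NVRM: Xid (PCI:0000:0a:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
[  137.685139] NVRM: GPU 0000:0a:00.0: GPU has fallen off the bus.
"""

for seconds, pci, code in find_xid_events(sample):
    print(f"Xid {code} on {pci} at {seconds:.3f}s after boot")
```

Only the line containing "NVRM: Xid" is reported; the other NVRM lines are ignored on purpose, so a clean log gives an empty list.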


Looking for some direction here. I've had to remove all my plugins and docker.img. I started adding all the dockers back and got to the Nvidia plugin for my P2000. After doing the update and rebooting, I only get this info on the plugin page...

 

I'm not sure what directory or file this is referring to.  I can't seem to find anything googling it.  

Screenshot 2024-04-01 at 2.22.52 PM.png

Screenshot 2024-04-01 at 2.25.45 PM.png

Screenshot 2024-04-01 at 2.28.28 PM.png

Edited by mkyb14
updated with image of lib location
12 minutes ago, mkyb14 said:

added. 

Have you tried rebooting yet?

BTW, this is the wrong location: the file is located in /usr/lib64. You are actually looking at the 32-bit libraries in the screenshot above.
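To tell a 32-bit library from a 64-bit one without guessing from the directory name, you can look at the ELF header of the file. The following is a rough Python sketch, not something the plugin ships: the helper names are hypothetical, and it assumes Python 3 is available on the machine. It relies only on the ELF format itself, where byte 4 of the header (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.

```python
import sys

ELF_MAGIC = b"\x7fELF"
# EI_CLASS is the byte at offset 4 of the ELF header: 1 = 32-bit, 2 = 64-bit.
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def elf_class_of_header(header):
    """Classify the first bytes of a file as '32-bit', '64-bit' or 'not ELF'."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not ELF"
    return ELF_CLASSES.get(header[4], "unknown ELF class")


def elf_class_of_file(path):
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as handle:
        return elf_class_of_header(handle.read(5))


if __name__ == "__main__":
    # Pass the library paths to compare on the command line,
    # for example the copy under /usr/lib64 and the one from the screenshot.
    for path in sys.argv[1:]:
        print(f"{path}: {elf_class_of_file(path)}")
```

Run it with the paths of the two copies as arguments; a file under /usr/lib64 should report as 64-bit.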

 

Do you have any special scripts in place that might modify the PATH variable on your system, or something similar?

I can't see anything obvious from your Diagnostics at first glance.

The only thing that is a bit strange is that the driver was not installed early in the boot process, since it should be installed and pick up the card early.

 

You could of course try to:

  1. Remove the plugin
  2. Reboot
  3. Install the plugin
  4. Reboot

 

May I ask why you had to remove the docker.img? Hardware fault?

Otherwise, my best guess is that something is going on with the RAM, but that is just a vague guess, since the driver also gets installed into the rootfs, which on Unraid lives in system memory.


I will remove the plugin and start over. I removed everything because the system was locking up randomly after updating to 6.12.8, and no one in other forums could tell me why. So I booted into safe mode and slowly turned on dockers, then plugins, and it crashed a while after I had enabled the Nvidia one. That's why I removed everything to start over.

Give me about 5 minutes and I'll redo the plugin and post diagnostics again.

 

10 minutes ago, mkyb14 said:

same result.  remove plugin, re-install, reboot.  plugins, screenshot

Are you really sure that your RAM is okay?

 

As I said, my best guess is that there is something wrong with the RAM and the files are not installing correctly.

Are you on the latest BIOS version?

There must be something else going on since I also tested this exact driver package and it is working fine over here.

3 hours ago, ich777 said:

Please always include Diagnostics.

 

Anyway, you got an Xid 79 error, which you can read more about here.

 

On what BIOS version are you? In which slot on your Motherboard is the card? Is Above 4G Decoding and Resizable BAR support enabled in the BIOS?

Thanks for getting back to me. I've enabled 4G Decoding, but it turns out my BIOS doesn't support Resizable BAR.
 

I apologize for not sharing the logs initially. I figured it would be simpler to extract what I thought was relevant for you all. I have now uploaded the new logs.
 

Just to clarify, nothing has changed on my server recently. So, having to tweak BIOS settings to get it working doesn't quite add up for me, considering it's been running smoothly like this for years.
 

Perhaps someone else has encountered this issue before. After enabling 4G decoding, I'm not getting any display on my monitor. Unraid still boots up fine, but I can't see anything on the screen.

Please also see the slot I have the GPU plugged into:
image.thumb.png.140d727971fc4cbb9fff2bd89065f548.png

nvidia-bug-report.log.gz

Edited by MxFox
2 hours ago, ich777 said:

Are you really sure that your RAM is okay?

 

As I said, my best guess is that there is something wrong with the RAM and the files are not installing correctly.

Are you on the latest BIOS version?

There must be something else going on since I also tested this exact driver package and it is working fine over here.

Yes, the ECC RAM previously ran memtest for 48 hours, with no issues there or with the board. I'm on the current BIOS for the Supermicro board, along with the latest on all HBA cards. Burn-in in safe mode is fine. Now that this is running, I'll set up Plex again with the variables, then set up sab, Sonarr, and Radarr and see what happens over the following week or so.

1 hour ago, MxFox said:

please see attached..

Thanks.

 

The first thing that I noticed is that your BIOS is out of date, make sure to update your BIOS.

 

Please also make sure to disable C-States in the BIOS, since these cause issues with Nvidia cards (at least on Linux).

I deleted the Diagnostics from above because there was something in the go file that didn't need to be public.

 

BTW, I think that your motherboard supports Resizable BAR; maybe it is named differently.

17 hours ago, ich777 said:

Thanks.

 

The first thing that I noticed is that your BIOS is out of date, make sure to update your BIOS.

 

Please also make sure to disable C-States in the BIOS, since these cause issues with Nvidia cards (at least on Linux).

I deleted the Diagnostics from above because there was something in the go file that didn't need to be public.

 

BTW, I think that your motherboard supports Resizable BAR; maybe it is named differently.

So, I've gone ahead and updated my BIOS and disabled C-States, but unfortunately, I'm still not having any luck with locating anything related to Resizable BAR.
 

After rebooting and returning to Unraid, my graphics card still isn't showing up. I stumbled upon an error log message stating, "NVRM: GPU 0000:0a:00.0: RmInitAdapter failed!" I did some digging online but couldn't find much, except for a couple of folks mentioning issues with the newer Nvidia driver versions.
 

Feeling a bit stuck, I decided to take a step back and downgrade to version 545.29.05 of the driver, and lo and behold, my card was back in action. But 10 minutes later it dropped off again. Is there anything else I can do?
 

I came across this article... 

Diagnostics.zip

Edited by MxFox
6 hours ago, MxFox said:

Feeling a bit stuck

Multiple people in this thread have had the same issue:

 

Here is one who was able to solve it by upgrading the PSU:

 

Here is one who never responded again, but it seems to be fixed now (because I found follow-up Diagnostics from him and the card is properly recognized):

 

 

The next one is this one, and I'm also not sure whether it's fixed:

 

 

However, please keep in mind that Linux is a lot more sensitive to thermal or power delivery issues, for example. I remember one user whose GPU fell off the bus because of a thermal issue.
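To keep an eye on temperature and power draw while the card is still visible, something like the following Python sketch could be used. It is a rough illustration under a few assumptions: Python 3 is available, `nvidia-smi` is on the PATH, and the helper names `parse_reading` and `read_gpus` are made up for this example. The sample line at the bottom is invented purely to show the parsing and is not a reading from anyone's server.

```python
import subprocess

# Ask nvidia-smi for temperature (in C) and power draw (in W), one CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu,power.draw",
    "--format=csv,noheader,nounits",
]


def parse_reading(line):
    """Parse one 'temperature, power' CSV line; unreadable fields become None."""

    def to_float(field):
        try:
            return float(field.strip())
        except ValueError:
            # Some cards report placeholders such as [N/A] for a field.
            return None

    temperature, power = line.split(",")
    return to_float(temperature), to_float(power)


def read_gpus():
    """Run nvidia-smi once and return a list of (temperature, power) tuples."""
    output = subprocess.run(
        QUERY, capture_output=True, text=True, check=True
    ).stdout
    return [parse_reading(line) for line in output.splitlines() if line.strip()]


# Illustrative line only (made-up values), to show what the parser returns.
print(parse_reading("45, 27.50"))
```

On the server itself you would call `read_gpus()` in a loop with a delay and log the values, so that the last reading before the card drops off the bus is still on record.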
