
[Plugin] Nvidia-Driver


  • Author
6 minutes ago, SamH said:

...

"remaining": 0,

...

"used": 60,

...

There is your issue: something on your network is eating up all your GitHub API calls.

 

There was a monitoring container out there that used up all the API calls, but I can't remember its name; it was something with "Pi"...
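
For anyone who wants to check the same numbers on their own network: a minimal sketch in Python that asks GitHub's `/rate_limit` endpoint for the quota of the current public IP. It assumes an unauthenticated request (where the limit is 60 calls per hour per IP, which matches the "used": 60 above) and a machine with Python 3 behind the same public IP; the function names are made up for this example.

```python
import json
import urllib.request
from datetime import datetime, timezone

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def summarize_rate_limit(payload: dict) -> str:
    """Turn a /rate_limit response into a one-line summary of the core quota."""
    core = payload["resources"]["core"]
    # "reset" is a Unix timestamp telling when the quota is refilled.
    reset_at = datetime.fromtimestamp(core["reset"], tz=timezone.utc)
    return (
        f"used {core['used']} of {core['limit']}, "
        f"{core['remaining']} remaining, resets at {reset_at:%Y-%m-%d %H:%M} UTC"
    )


def fetch_rate_limit(url: str = RATE_LIMIT_URL) -> dict:
    """Query GitHub without a token, so the quota is counted per public IP."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Running `print(summarize_rate_limit(fetch_rate_limit()))` a few times while stopping containers or applications one by one should show which of them makes the "used" counter climb.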


4 minutes ago, ich777 said:

There is your issue: something on your network is eating up all your GitHub API calls.

There was a monitoring container out there that used up all the API calls, but I can't remember its name; it was something with "Pi"...

I have shut down all containers, although I don't see anything that should be using any API calls. I will wait and see what happens when the next hourly reset comes around.

  • Author
1 minute ago, SamH said:

I have shut down all containers, although I don't see anything that should be using any API calls. I will wait and see what happens when the next hourly reset comes around.

Are you maybe behind a CG-NAT where other users are using up all the API calls?
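
A rough way to answer that question is to look at the WAN address your router reports. The sketch below is only a heuristic, assuming the provider uses the shared address space 100.64.0.0/10 that RFC 6598 reserves for carrier-grade NAT (some providers use other ranges); `classify_wan_address` is a made-up helper name.

```python
import ipaddress

# RFC 6598 shared address space, reserved for carrier-grade NAT.
CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")


def classify_wan_address(address: str) -> str:
    """Classify the WAN address shown in the router's status page."""
    ip = ipaddress.ip_address(address)
    if ip in CGNAT_RANGE:
        return "cgnat"    # shared with other customers of the provider
    if ip.is_private:
        return "private"  # another NAT layer sits in front of the router
    return "public"
```

If the result is "cgnat" or "private", the unauthenticated GitHub quota is most likely shared with whoever else sits behind the same public IP.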

Just now, ich777 said:

Are you maybe behind a CG-NAT where other users are using up the API calls?

I have a sneaking suspicion that it was VSCode, or at least one of its extensions, that was using up all my free API requests. I will need to dig into that, but for now the drivers have installed successfully.

Thank you for all your help and patience!!

  • Author

@the-un-unraider is your problem solved?

 

The web client combined with Nvidia is known to cause issues if you change the quality settings on the fly.

I would recommend that you use a native app and then change the quality.

@ich777

No, the problem's still here, but I figured it's a Plex issue. It also happens on the native Windows client, though.

Wondering if someone can help me fix an issue with an RTX 4090: when the daily backup runs, the Nvidia GPU gets dropped for some reason. As a fix, I removed the driver and went through the steps to reinstall it.

The GPU is only assigned to the ollama Docker container, and after the backup ran last night the issue seems to have occurred again this morning.

Seems to have only started since I updated to Unraid 7

 

Here are some outputs - if someone can help troubleshoot please

 

CleanShot2025-01-30at08_34_05.thumb.png.cf5a13f5bd629aeabed9b78252098ae6.png

 

Docker run settings for ollama:

CleanShot2025-01-30at08_37_25.png.ee834cf17895327169c38b2013c86af3.png

 

Edit: seeing this in syslog
CleanShot2025-01-30at08_49_10.thumb.png.2cb302055bb9f5fd7db93911f69e4939.png

 

Attached - nvidia-bug-report

nvidia-bug-report.log-2.gz

Edited by m0dded

  • Author
32 minutes ago, m0dded said:

Seems to have only started since I updated to Unraid 7

Can you please upload your Diagnostics instead of the Nvidia bug report when the error occurs? Are you also sure that the power supply is up to the task?

 

However what you are seeing is probably caused by a C library that is shipped with the container.

Are you also sure that the card isn't overheating or similar issues?

@ich777 thanks for the quick reply.

The issue occurs at night during the backup; however, I just ran a manual backup after a reboot and the issue didn't occur.

Will get the diagnostics to you overnight.

PSU is LianLi 1000W EG1000.BE so don't think that is an issue - I have stress tested the gpu and there was no crash.

 

Would really like to solve this issue and achieve stability, so appreciate your help.

I do see this in syslog when the container starts:
 

Jan 30 09:34:31 Trinity kernel: br-5f6600742638: port 4(vethacf6d46) entered disabled state
Jan 30 09:34:31 Trinity kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20230628/dsfield-184)
Jan 30 09:34:31 Trinity kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20230628/dswload2-477)
Jan 30 09:34:31 Trinity kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20230628/psparse-529)
Jan 30 09:34:32 Trinity kernel: eth0: renamed from vethb22b052
Jan 30 09:34:32 Trinity kernel: br-5f6600742638: port 4(vethacf6d46) entered blocking state
Jan 30 09:34:32 Trinity kernel: br-5f6600742638: port 4(vethacf6d46) entered forwarding state
Jan 30 09:34:33 Trinity kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20230628/dsfield-184)
Jan 30 09:34:33 Trinity kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20230628/dswload2-477)
Jan 30 09:34:33 Trinity kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20230628/psparse-529)

This was in the bug report - not sure if it helps:

 

(['[  265.363644] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  565.77  Wed Nov 27 23:33:08 UTC 2024\n',
  '[38542.793206] NVRM: GPU at PCI:0000:01:00: GPU-68a24a84-1227-c298-42fe-359fe10a2390\n',
  "[38542.793209] NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.\n",
  '[38542.793213] NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.\n',
  '[38542.793240] NVRM: A GPU crash dump has been created. If possible, please run\n',
  '               NVRM: nvidia-bug-report.sh as root to collect this data before\n',
  '               NVRM: the NVIDIA kernel module is unloaded.\n',
  "[38542.793341] NVRM: Xid (PCI:0000:01:00): 154, pid='<unknown>', name=<unknown>, GPU recovery action changed from 0x0 (None) to 0x2 (Node Reboot Required)\n",
  'ERROR: An internal driver error occurred\n',
  'ERROR: An internal driver error occurred\n',
  'ERROR: An internal driver error occurred\n',
  "ERROR: Error while querying valid values for attribute 'OperatingSystem' on [gpu:0] (No such attribute).\n",
  'ERROR: An internal driver error occurred\n',
  "ERROR: Error while querying valid values for attribute 'Ubb' on [gpu:0] (No such attribute).\n",
  'ERROR: An internal driver error occurred\n',
  "ERROR: Error while querying valid values for attribute 'Overlay' on [gpu:0] (No such attribute).\n",
  'ERROR: An internal driver error occurred\n',
  "ERROR: Error while querying valid values for attribute 'Stereo' on [gpu:0] (No such attribute).\n",
  'ERROR: An internal driver error occurred\n',
  "ERROR: Error while querying valid values for attribute 'TwinView' on [gpu:0] (No such attribute).\n"],
 ['[  265.363644] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  565.77  Wed Nov 27 23:33:08 UTC 2024\n'],
 ['[    0.000000] Linux version 6.6.68-Unraid (root@Develop) (gcc (GCC) 14.2.0, GNU ld version 2.43.1-slack151) #1 SMP PREEMPT_DYNAMIC Tue Dec 31 13:42:37 PST 2024\n',
  'Linux version 6.6.68-Unraid (root@Develop) (gcc (GCC) 14.2.0, GNU ld version 2.43.1-slack151) #1 SMP PREEMPT_DYNAMIC Tue Dec 31 13:42:37 PST 2024\n'])

 

@m0dded reseat the gpu in the pcie slot

2 minutes ago, mcmasterp said:

@m0dded reseat the gpu in the pcie slot

@mcmasterp - I have done that, and the power cable too. The strangest thing is that it's stable when it gets used. The issue started to occur after Unraid 7, and I am assuming after the backup since it seems to happen overnight.

 

  • Author
8 hours ago, m0dded said:

This was in the bug report

Please always include your Diagnostics since they contain more and clearer information than the Nvidia bug report script.

 

4 hours ago, m0dded said:

Issue started to occur post Unraid 7

Have you considered that this is probably not the cause? Since we are talking about a Docker container, it could also be that the container got upgraded and that's why this error came up.

Nothing changed in terms of the driver or the Base OS.

 

However, as previously said, this seems to be an issue with a library in the container, and that is why your GPU falls off the bus:

8 hours ago, m0dded said:
Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.\n",

 

You have an Xid 79 error here; you can read more about that here:

https://docs.nvidia.com/deploy/xid-errors/index.html#xid-error-listing

 

However, this error is pretty random and anything could be the cause; it could even be a system memory error.
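
To see how often and with which codes the GPU reports Xid events, the syslog can be scanned for the `NVRM: Xid` lines. A minimal sketch in Python, assuming the line format shown in the logs above; the function name and the regular expression are made up for this example and not part of the plugin.

```python
import re
from collections import Counter

# Matches e.g. "NVRM: Xid (PCI:0000:01:00): 79, pid=..."
XID_PATTERN = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<code>\d+),"
)


def count_xid_events(lines):
    """Count Xid events per (PCI address, Xid code) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = XID_PATTERN.search(line)
        if match:
            counts[(match["pci"], int(match["code"]))] += 1
    return counts
```

It can be fed an open log file, for example `count_xid_events(open("/var/log/syslog", errors="replace"))` (the path is an assumption), and the resulting codes can then be looked up in the Nvidia Xid listing linked above.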

I simply did the OS update from 6.12.13 to 7.0 and got the following in red, plus a message not to reboot...

Can't seem to figure out why. 

 

30-01-2025 14:22 | Plugin Update Helper | Notification | Download from plugin package(s): Nvidia Driver for unRAID v7.0.0 failed! Please visit the support thread(s) before rebooting to avoid plugin issues! | alert

30-01-2025 14:22 | Plugin Update Helper | Notification | Nvidia Driver v download failed, please go to the support thread for this plugin and make a post with a screenshot from this error!

Screenshot 2025-01-30 at 2.27.19 PM.png

Edited by mkyb14

  • Author
6 hours ago, mkyb14 said:

Can't seem to figure out why. 

Please reboot; the reboot will take a bit longer since the driver plugin will download the driver (~220 MB) on boot.

Sorry, this was my fault since I introduced a change yesterday, but it should work fine now.

no fault here, you're doing us a favor making the package for us!  appreciate you

Hello, I've had this issue a few times: why can't the Nvidia driver find my GPU anymore after a while? It works for a bit, sometimes one day, sometimes two days, and then I have this issue. I'll attach the Unraid logs in case they help to solve this.

 

image.thumb.png.7cf1dbf80c1eb79dcd3b64b4db4f24fb.png

syslog-20250201-2217.zip

Edited by giovacca

5 hours ago, giovacca said:

ill attach the unraid logs if they can help to solve this

You should rather post your diagnostics to get better help.

What we see in the syslog is (beginning of the end...):

 

Feb  1 22:58:27 BigNigga kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.

 

Speculation now, things to check:

BIOS (Above 4G Decoding and Resizable BAR activated)

Energy savings (maybe too aggressive, start with defaults)

Nested enabled (maybe test without)

Card seating on the board (check it, or pull the card out and reseat it)

Power supply (is your PSU still OK, does it deliver enough power?)

Hardware failure (maybe the card is ... test it in another machine)

...

On 1/30/2025 at 6:34 PM, ich777 said:

Please always include your Diagnostics since they contain more and clearer information than the Nvidia bug report script.

 

@ich777 - Thanks will do - seems to have been quite stable after I uninstalled and reinstalled the nvidia driver. Hopefully the issue has gone away. Thanks again for your help and for the plugin

  • Author
36 minutes ago, m0dded said:

@ich777 - Thanks will do - seems to have been quite stable after I uninstalled and reinstalled the nvidia driver. Hopefully the issue has gone away. Thanks again for your help and for the plugin

I don't think this was the solution, because a simple restart of your server would have done exactly the same: the Nvidia driver is installed on each start of Unraid.

There is something wrong with the RTX 5080 FE. It shows as follows:

Nvidia Info:
Nvidia Driver Version: 570.86.16
Open Source Kernel Module: No
Installed GPU(s):
No devices were found

 

The system device shows:

[10de:2c02] 01:00.0 VGA compatible controller: NVIDIA Corporation Device 2c02 (rev a1)
[10de:22e9] 01:00.1 Audio device: NVIDIA Corporation Device 22e9 (rev a1)

 

The Diagnostics are uploaded.

tower-diagnostics-20250204-2322.zip

  • Author
1 minute ago, neoherozzz said:

There is something wrong with the RTX 5080 FE. It shows as follows:

Seems like 5000 series cards need the OpenSource Kernel module:

Feb  4 22:48:12 Tower kernel: NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:2c02)
Feb  4 22:48:12 Tower kernel: NVRM: installed in this system requires use of the NVIDIA open kernel modules.
Feb  4 22:48:12 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x22:0x56:884)
Feb  4 22:48:12 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
Feb  4 22:48:12 Tower kernel: NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:2c02)
Feb  4 22:48:12 Tower kernel: NVRM: installed in this system requires use of the NVIDIA open kernel modules.
Feb  4 22:48:12 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x22:0x56:884)
Feb  4 22:48:12 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

 

However, the Open Source Kernel module is based on the latest stable version on Unraid, which is currently 565.77.

Have you tried the Open Source driver which is available for Unraid 7.0.0 yet? Please test whether the Open Source driver version 565.77 also supports your card; if not, please let me know and I will look into what we can do about that.
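
For others with a new card who want to check their own syslog for the same hint, here is a minimal sketch in Python. It assumes the two-line message format from the log excerpt above; the function name and regular expression are made up for this example.

```python
import re

OPEN_MODULE_HINT = "requires use of the NVIDIA open kernel modules"
GPU_LINE = re.compile(
    r"NVRM: The NVIDIA GPU (?P<bdf>\S+) "
    r"\(PCI ID: (?P<pci_id>[0-9a-f]{4}:[0-9a-f]{4})\)"
)


def gpus_needing_open_modules(lines):
    """Return {bus address: PCI ID} for GPUs the driver says need the open modules."""
    found = {}
    pending = None  # GPU named on the previous line, if any
    for line in lines:
        match = GPU_LINE.search(line)
        if match:
            pending = (match["bdf"], match["pci_id"])
            continue
        # The hint is printed on the line right after the GPU is named.
        if pending and OPEN_MODULE_HINT in line:
            found[pending[0]] = pending[1]
        pending = None
    return found
```

A non-empty result means the proprietary kernel module refused the card and the Open Source driver package has to be selected in the plugin instead.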

15 minutes ago, ich777 said:

Please test if the Open Source driver version 565.77 also supports your card

It does not seem to work with 565.77; Open Source Kernel Module shows No.


Nvidia Info:

Nvidia Driver Version: 565.77

Open Source Kernel Module: No

Installed GPU(s):
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

 

 

tower-diagnostics-20250204-2347.zip

  • Author
Just now, neoherozzz said:

It does not seem to work with 565.77; Open Source Kernel Module shows No.

Did you select the Open Source version? From your Diagnostics I see that the non Open Source package is in place.

3 minutes ago, ich777 said:

Did you select the Open Source version?

I selected the New Feature Branch by mistake. With the Open Source driver it shows the following:

 

Nvidia Info:

Nvidia Driver Version: 565.77

Open Source Kernel Module: Yes

Installed GPU(s):
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

tower-diagnostics-20250204-2357.zip
