[Plugin] Nvidia-Driver


ich777


16 minutes ago, ich777 said:

When does it update or what do you mean exactly?

 

Can you please be a little bit more specific? Also, you didn't answer this question:

 

Did the driver update, or did the plugin update?

 

Have you already tried to pick a driver from the stable branch?

Sorry, the Nvidia driver updated on 22nd June according to Telegram. Yes, the GUI does lock up, so I had to go in with MC and remove the Nvidia plugin; now it does not lock up and there are no errors in the logs. When I reinstall the plugin I get this - R720XD kernel: NVRM: GPU 0000:42:00.0: RmInitAdapter failed! - and the GUI locks up. The card works; I've tested it in my mining machine. Yes, I tried latest and the option below that.

Telegram.jpg

22 minutes ago, leeknight1981 said:

Sorry, the Nvidia driver updated on 22nd June according to Telegram. Yes, the GUI does lock up

When you say GUI, do you mean the WebGUI from Unraid and not the plugin page?

 

Are you on the Dashboard when the GUI locks up, or does it lock up in general? Can you close the window and reopen the GUI again?

Have you installed any custom build of Unraid or is this a stock 6.9.2 build?

 

Please open up a Terminal from Unraid and type in: 'nvidia-smi' (without quotes) and post the output here.
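A quick sketch of that check, in case the command is missing entirely (the error message here is made up for illustration; the real nvidia-smi table depends on your card and driver):

```shell
# If the driver is installed and loaded, nvidia-smi prints a status
# table for each GPU; a missing binary means the driver package from
# the plugin never made it onto the system.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
else
    echo "nvidia-smi not found - driver package is not installed"
fi
```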

 

22 minutes ago, leeknight1981 said:

When I reinstall the plugin I get this - R720XD kernel: NVRM: GPU 0000:42:00.0: RmInitAdapter failed! - and the GUI locks up.

I really can't help here; this is usually a sign that the card can't initialize because of too little power or some other hardware-related issue (sometimes it can also happen when you are booting with UEFI).

 

22 minutes ago, leeknight1981 said:

The card works; I've tested it in my mining machine.

Did you mine with it, or did you test it with a 3D load and connect it to a display?

 

22 minutes ago, leeknight1981 said:

Yes, I tried latest and the option below that.

And rebooted in between?

 

 

The problem with this error is that it is not easy to identify. I think you are booting with Legacy (CSM)?

Please double check if you are booting with Legacy mode.

 

1 minute ago, ich777 said:

I put it in the machine, installed Windows 10 and the drivers, and the card showed up fine and worked on an external display. As for the GUI: if I click, say, the Plugins or Stats tabs, I get the three wavy orange lines and it does nothing. Totally standard Unraid, I don't play about with it. nvidia-smi: command not found. Honestly, it had been working perfectly fine until that Nvidia driver update; Emby has worked and the card has transcoded everything correctly. On the 22nd, about 10am UK time, I went onto Unraid and the GUI was weird, laggy and basically locked up, so I rebooted and it was the same. So I removed the Nvidia plugin, rebooted, and it's OK; reinstall it and I get that error. I can't get another card at the moment as none are in stock and they're well overpriced, since they're all being bought up for mining. This has been working happily for well over 12 months until that update on the 22nd.

4 minutes ago, leeknight1981 said:

nvidia-smi: command not found

If the plugin is installed and you get this output, then something is definitely wrong.

 

4 minutes ago, leeknight1981 said:

Honestly, it had been working perfectly fine until that Nvidia driver update

Downgrade to the driver that was installed previously; hopefully this will solve your issues.

 

5 minutes ago, leeknight1981 said:

On the 22nd, about 10am UK time, I went onto Unraid and the GUI was weird, laggy and basically locked up, so I rebooted and it was the same. So I removed the Nvidia plugin, rebooted, and it's OK.

This was after the driver upgrade and the reboot I think.

 

6 minutes ago, leeknight1981 said:

So I removed the Nvidia plugin, rebooted, and it's OK; reinstall it and I get that error.

You get the error, the GUI locks up, and Unraid is laggy?

 

7 minutes ago, leeknight1981 said:

This has been working happily for well over 12 months until that update on the 22nd

Have you double checked that you boot in legacy mode?

You can also try a BIOS reset, but only if you know what you are doing.

 

 

Do you have a second machine where you can test the card with Unraid? I would do it as follows:

  1. Create a new USB Boot stick on a spare USB Key
  2. Put the card in the other machine
  3. Boot the other machine with your card installed from the USB Key
  4. Register for a Trial on the WebGUI
  5. Install the CA App
  6. Install the Nvidia Driver plugin from the CA App
  7. See if the WebGUI becomes laggy on this machine too, if not open up a terminal and issue the command: 'nvidia-smi'
54 minutes ago, ich777 said:

OK, I'll have to set up Unraid on the tower, as the R710 and Supermicro servers won't take a GPU! I'm working till 1800 UK time, so I'll have a play tonight... It's just strange that it stopped working one hour after that update ;(

WhatsApp Image 2021-06-23 at 15.26.52 (1).jpeg

WhatsApp Image 2021-06-23 at 15.26.52.jpeg

Edited by leeknight1981: Card is OUT, I'll put it in a tower later
11 minutes ago, leeknight1981 said:

It's just strange that it stopped working one hour after that update ;(

Yes, but that's something that can happen. I don't think the new driver is causing the problem, though.

 

12 minutes ago, leeknight1981 said:

OK, I'll have to set up Unraid on the tower, as the R710 and Supermicro servers won't take a GPU!

Thank you, looking forward to your findings.

 

10 minutes ago, squirrellydw said:

How do I restart the Nvidia plugin from the terminal? I don't want to reboot the server right now.

The plugin is not designed to do that, because it would involve stopping the whole Docker service, uninstalling the old plugin, making sure the driver isn't used by anything, uninstalling the old driver, installing the new driver, and starting the Docker service again.
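For anyone who really wants to do it by hand, the sequence would look roughly like this. This is a hypothetical sketch only: the plugin-manager path is the one that appears in the syslog excerpts later in this thread, rc.docker is the standard Unraid Docker init script, and the install step assumes the .plg file is available locally; run it on a live server at your own risk.

```shell
# Hypothetical manual driver swap - mirrors the steps described above.
# Does nothing on systems that are not Unraid.
PLUGIN=/usr/local/emhttp/plugins/dynamix.plugin.manager/scripts/plugin
if [ ! -x "$PLUGIN" ]; then
    echo "not an Unraid system - nothing to do"
else
    /etc/rc.d/rc.docker stop                # stop the whole Docker service
    "$PLUGIN" remove nvidia-driver.plg      # uninstall the old plugin and driver
    # ...verify nothing still holds the nvidia kernel modules here...
    "$PLUGIN" install nvidia-driver.plg     # hypothetical: reinstall to get the new driver
    /etc/rc.d/rc.docker start               # start the Docker service again
fi
```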

 

You can of course do that by hand, but I don't see much benefit in doing so.

 

If the server is booted into GUI mode this would complicate things even more. :D

 

20 minutes ago, leeknight1981 said:

OK, the GPU is working and I have been smashing games on it for 2 hours, no sweat. So it's definitely not the graphics card 😕

Can you try to do the steps from above with a spare USB Key and Unraid and see if it works on this machine?

10 minutes ago, ich777 said:

Can you try to do the steps from above with a spare USB Key and Unraid and see if it works on this machine?

Not really, no. I don't have a spare USB, I'm not at home, and this is my mate's gaming machine. So I'll have to leave the card here and he will set Unraid up when he has 5 minutes, but I'm 99.6% certain it's not my hardware/server at fault 😜

 

I’ll come back to you 

 

maybe a day or two.

9 minutes ago, leeknight1981 said:

but I’m 99.6% certain it’s not my hardware / server at fault 😜

But what else could it be? On my server the plugin works just fine; I've now tried 3 different driver versions and all work fine.

 

Don't get me wrong, but if there was a problem with the plugin or the drivers themselves, I think more people would have reported it here... ;)

2 hours ago, ich777 said:

I know mate, I'm at a loss. We will try a fresh Unraid and slap the GPU in and see what happens; I'll come back to you with the findings :)


I'm having an issue where my card appears for a little bit and then disappears. It's an EVGA GeForce RTX 3060; on the plugin page it'll show the driver version and the installed GPU fine after I reboot. After I check it and refresh the page, it's gone and says No devices found.

I don't have a VM

I do see in the logs

NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1199)

NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
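Errors like these can be pulled out of the kernel log with a quick grep. A minimal sketch, demonstrated on a sample line from this thread (on a live server you would pipe dmesg or /var/log/syslog into it instead):

```shell
# Count NVRM init failures in kernel log output; a non-zero count
# means the driver could not bring the card up.
sample='NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1199)'
echo "$sample" | grep -c 'RmInitAdapter failed'
# prints: 1
```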


1 hour ago, HellraiserOSU said:

I do see in the logs

Please post your Diagnostics (Tools -> Diagnostics -> Download -> drop the downloaded zip file here in the text box).

 

1 hour ago, HellraiserOSU said:

After I check it and refresh the page

For how long is the card recognized? Can you use it in Docker containers, or does it drop instantly?

Do you boot with Legacy or UEFI?

 

Please also see here: Click

Changing the PCIe slot or the PCIe generation can help; also check if you have a setting in the BIOS named "Above 4G decoding" or "large/64bit BARs" and make sure to enable it.

On 6/28/2021 at 4:48 PM, ich777 said:

This is what mine's doing also! I'll let you know in another one :)


Guess who's back...

OK, the GPU works flat out in a gaming machine. It also works in Unraid, i.e. it shows up, including in the plugin.

So I put it back in my server: works fine! Then I install the plugin: card shows up, version number, GUID etc. Then I re-enabled Docker and boom, that same fault in the logs; the WebGUI locks up and freezes :( So definitely not the GPU or the server, as it only happens when I install the plugin and/or re-enable Docker. More than happy to assist with this via Teams or whatever :) Logs below in order of time: before shutdown, reboot, install plugin, re-enable Docker etc. :)

r720xd-diagnostics-20210701-2212.zip r720xd-diagnostics-20210701-2208.zip r720xd-diagnostics-20210701-2201.zip r720xd-diagnostics-20210701-2157.zip

9 minutes ago, ich777 said:

Have you tried it already in another system?

Also, I haven't heard back from @HellraiserOSU; not sure if it's solved now or not.

Removed the plugin:

Jul 1 22:30:44 R720XD emhttpd: cmd: /usr/local/emhttp/plugins/dynamix.plugin.manager/scripts/plugin remove nvidia-driver.plg
Jul 1 22:30:44 R720XD root: plugin: running: anonymous
Jul 1 22:31:01 R720XD kernel: docker0: port 1(veth049e849) entered disabled state
Jul 1 22:31:01 R720XD kernel: veth6194793: renamed from eth0
Jul 1 22:31:01 R720XD avahi-daemon[12225]: Interface veth049e849.IPv6 no longer relevant for mDNS.
Jul 1 22:31:01 R720XD avahi-daemon[12225]: Leaving mDNS multicast group on interface veth049e849.IPv6 with address fe80::5468:5dff:fe57:d2c7.
Jul 1 22:31:01 R720XD kernel: docker0: port 1(veth049e849) entered disabled state
Jul 1 22:31:01 R720XD kernel: device veth049e849 left promiscuous mode
Jul 1 22:31:01 R720XD kernel: docker0: port 1(veth049e849) entered disabled state
Jul 1 22:31:01 R720XD avahi-daemon[12225]: Withdrawing address record for fe80::5468:5dff:fe57:d2c7 on veth049e849.

9 minutes ago, leeknight1981 said:

OK, the GPU works flat out in a gaming machine

Have you tested it also with Unraid on the gaming machine? This was the main question...

 

1 minute ago, leeknight1981 said:

Removed the plugin:

Yes? And what's the problem here?


Yes, we installed Unraid on the gaming machine; yes, the card showed up, and yes, it showed in the plugin... but there was nothing on that machine to test, as it was just set up to see if the card was visible etc., since it's not my machine :(

Not saying there is a problem; I was showing what was in the logs after I removed it :) I'd just like to get to the bottom of this, as someone else is having the same issue I think. All was working fine, then an update, then it stopped working. I've tested the card; it outputs and games fine... It shows up in a fresh Unraid and in the plugin. I put the card back in mine, it's fine, it shows; install the plugin, disable/re-enable Docker, then I get the fault and what's in the logs, the same as the chap above :)

6 hours ago, leeknight1981 said:

Yes, we installed Unraid on the gaming machine; yes, the card showed up, and yes, it showed in the plugin...

If it showed up and didn't freeze on the gaming machine, then it seems to me this is a hardware combination issue on your machine.

 

Have you already tried the steps that I've linked above?

You also said something about a mining rig; do you have a chance to put another Nvidia card in the system?

 

6 hours ago, leeknight1981 said:

All was working fine, then an update, then it stopped working.

As said, at the time there was only a driver update in the plugin and nothing else; a rollback should have enabled it again. The changes to the plugin itself are only cosmetic and/or add new features, like the notification when a new driver is available, and do nothing to the drivers themselves.

 

6 hours ago, leeknight1981 said:

I'd just like to get to the bottom of this, as someone else is having the same issue I think.

If you search this thread for this issue, it was reported a few times, but after a BIOS reset/update/change everything seemed to work again; one user even told me that it worked after he put the card in another system and then back in his server.

 

This issue can also happen with risers that are too long or too poorly shielded, or with defective risers.


I think you need a section for the steps involved in swapping a card (Nvidia for Nvidia). For example, 970 out, 1030 in.

I had asked a question in the forums with no answer, however.

 

So, for anyone who is playing along at home (in the future):

 

1. Turned off all the auto-starts for Docker and VMs, to be on the safe side.

2. Shut down the Unraid PC.

3. Swapped out the 970 and put in the 1030.

4. Powered up the server.

5. Waited for full boot-up, watching terminal output (as I had HDMI plugged in).

6. Checked the Nvidia plugin - the new 1030 was recognised.

7. Updated the GPU info in the containers for Plex, etc.

8. Removed then reinstalled GPU Statistics (for some reason it did not refresh with the new info until I completed this).

9. Profit!!?????

38 minutes ago, Joshndroid said:

I think you need a section for the steps involved in swapping a card (Nvidia for Nvidia). For example, 970 out, 1030 in.

Swapping the card should be as easy as:

  1. Shutting down the server
  2. Pull the old card out and put the new card in
  3. Start the server back again
  4. At this point the containers that have the "old" Nvidia card assigned won't start
  5. Go to the plugin page and copy the UUID from the "new" card
  6. Change the UUID in the Docker templates from the "old" to the "new" UUID
  7. Finish

I don't think it's necessary to create a dedicated tutorial for this, since it's basically the same as when you install the plugin for the first time and assign an Nvidia card to a Docker container.
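Step 6 above, the UUID swap, boils down to a text substitution in the template. A sketch with made-up UUIDs and a simplified template line (the real templates are XML files, typically under /boot/config/plugins/dockerMan/ on Unraid; copy the real UUID from the plugin page):

```shell
# Replace the old card's UUID with the new one in a template line.
# Both UUIDs here are hypothetical examples.
old='GPU-04afba22-1111-2222-3333-444444444444'
new='GPU-9fe153bc-5555-6666-7777-888888888888'
line="<Value>$old</Value>"
echo "$line" | sed "s/$old/$new/"
# prints: <Value>GPU-9fe153bc-5555-6666-7777-888888888888</Value>
```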

 

38 minutes ago, Joshndroid said:

Removed then reinstalled GPU Statistics (for some reason it did not refresh with the new info until I completed this).

This is related to the GPU Statistics plugin and should be asked in the appropriate support thread. From what I know, you have to change the UUID in the dropdown and at least one other setting and it should pick that up; perhaps @b3rs3rk can help here.

3 hours ago, ich777 said:

The mining machine is mining, and I don't think the card in it will fit in my R720XD. It just seems odd: when I install the plugin it's all OK, the card is there; as soon as I disable and enable Docker it instantly says No Devices Found in the plugin and I get the NVRM: GPU 0000:42:00.0 RmInitAdapter failed error. So I'm convinced it's not my hardware, as it all works and did work fine until after that update. I give up tbh; I'll wait till it's fixed, I guess, and say it wasn't my server/GPU, as I'm not the only person with the issue :-/


1 hour ago, ich777 said:

By me replying and you replying, we have it here in the thread. It can be found now if people search. ¯\_(ツ)_/¯

