[Plugin] Nvidia-Driver


ich777


8 minutes ago, badatusernames said:

Well that might be my problem.  I just assumed you could just pop a GPU in and let it help out the whole machine.  I do not have an Intel iGPU.  Actually using an AMD Ryzen 3.

But I'm pretty sure something is using your card, otherwise it wouldn't show 'hw' next to the transcoded stream...

 

9 minutes ago, badatusernames said:

Just for more information, is there somewhere I could see what it is capable of, or what cards are capable of doing it?

Yep, there you go: Click

(The quickest way to get to your card is to untick the 'Show All' box and select only 'GK110 - Kepler (2nd Gen)'; in the drop-down list you can select the quality.)

Link to comment
1 minute ago, ich777 said:

But I'm pretty sure something is using your card, otherwise it wouldn't show 'hw' next to the transcoded stream...

But then shouldn't it pop up in the smi information?

 

Most of what I'd want to transcode is 4k to 1080 or lower, so seeing that it can't handle the HEVC stuff should mean it isn't going to be using this, right?

 

I guess the thing I'm finding out is this might be less helpful for plex than I initially thought.

 

I did see that this plugin shouldn't be used if we want to use it in a VM, is there a better way to get that passed through? Do I just need to disable the plugin for it to show up in the GPU drop down for the VMs? or do I need another plugin? It looks like the other big Nvidia plugin got discontinued or something so that's not really an option.

Link to comment

Hello,

 

As already reported, I've successfully set up an RTX 3060 Ti FE with this plugin and, above all, the help of @ich777, and it's properly used in 3 containers (telegraf for GPU monitoring, Plex for hw transcoding and Folding@Home for folding 24/7).

 

Now I'd like to go a bit further and optimize the power consumption vs folding ratio. I've read in numerous forums about folding that the way to go was to undervolt and overclock the GPU to get the same computing power with less power drawn. Helping medical research (against Covid-19 among others) when the server is idle is ofc great, but if I can limit my electricity bill with the same output, that's also fine 😉.

 

Undervolting is not an issue: with 'nvidia-smi --power-limit=XXX' I can set the power limit anywhere between 100W and 220W on this card, the default being 200W. But of course, when I lower the power limit, the built-in algorithm lowers the GPU and memory clock frequencies accordingly, which produces less science (as they say in the F@H forum).
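
For reference, a minimal sketch of how the reported range could be read back before a new limit is applied. The query fields are standard nvidia-smi ones; `clamp_watts` and the 150 W target are only illustrative assumptions, and the parsing assumes the card reports numeric limits (some cards print [N/A] instead).

```shell
# clamp_watts WANT MIN MAX: keep a requested limit (whole watts) inside the
# range the card reports. Hypothetical helper, not part of nvidia-smi.
clamp_watts() {
    if [ "$1" -lt "$2" ]; then
        echo "$2"
    elif [ "$1" -gt "$3" ]; then
        echo "$3"
    else
        echo "$1"
    fi
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Min and max limits of GPU 0, e.g. "100.00, 220.00".
    limits=$(nvidia-smi -i 0 --query-gpu=power.min_limit,power.max_limit \
        --format=csv,noheader,nounits)
    min=${limits%%.*}                    # integer part of the first value
    max=${limits##*, }
    max=${max%%.*}                       # integer part of the second value
    target=$(clamp_watts 150 "$min" "$max")   # 150 W is an arbitrary example
    nvidia-smi -i 0 -pl "$target"        # needs root, does not survive a reboot
else
    echo "nvidia-smi not found, skipping" >&2
fi
```

`-pl` is the short form of `--power-limit`, so both spellings used in this thread do the same thing.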

 

The only way I've found to overclock an Nvidia GPU under Linux command-line is through the nvidia-settings utility, which is built-in with the drivers, the commands to overclock being :

nvidia-settings -a '[gpu:0]/GPUGraphicsClockOffset[3]=50'       #offset of +50MHz on GPU Clock
nvidia-settings -a '[gpu:0]/GPUMemoryTransferRateOffset[3]=50' #offset of +50MHz on Memory Clock

The purpose is to compensate for the clock frequencies lowered by the power limit algorithm with a positive offset to get the same computing power with less power drawn. As always, the challenge is to find the sweet spot between power consumption, computing power and stability. But I've already reasonably overclocked CPUs and GPUs, and I'm patient, so this is not a problem.

 

Unfortunately I didn't get that far, because issuing the nvidia-settings ... command produces the following error:

root@NAS:~# nvidia-settings 
nvidia-settings: error while loading shared libraries: libX11.so.6: cannot open shared object file: No such file or directory

After checking, this nvidia-settings utility requires an X11 server on the machine.

Moreover, the xorg.conf file must contain these clauses 

    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "Coolbits" "28"

to give access to the clock offsets through nvidia-settings.
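
As a sketch only: on a system that has an X server, the two options could be written by the driver's own nvidia-xconfig tool instead of by hand. Whether nvidia-xconfig is shipped with the plugin's driver package is an assumption, `has_oc_options` is a hypothetical helper, and /etc/X11/xorg.conf is just the assumed default location.

```shell
# has_oc_options FILE: succeed if FILE contains both options quoted above.
# Hypothetical helper, not part of the driver.
has_oc_options() {
    grep -q '"Coolbits"' "$1" && grep -q '"AllowEmptyInitialConfiguration"' "$1"
}

XORG_CONF=${XORG_CONF:-/etc/X11/xorg.conf}   # assumed default location

if command -v nvidia-xconfig >/dev/null 2>&1; then
    # Let the driver's tool add the options to the generated xorg.conf.
    nvidia-xconfig --cool-bits=28 --allow-empty-initial-configuration
fi

if [ -f "$XORG_CONF" ] && has_oc_options "$XORG_CONF"; then
    echo "$XORG_CONF has the overclocking options"
else
    echo "$XORG_CONF is missing or lacks the options" >&2
fi
```

The X server has to be (re)started afterwards for the options to take effect.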

 

Of course I understand Unraid OS does not come with, and does not need btw, any graphical environment, and that's why the utility throws an error. And it's really a poor design choice by Nvidia to require a graphical environment to fine tune the GPU on a headless server under CLI. I think nvidia-settings is basically a graphical utility to tune the nvidia drivers which accepts CLI commands (see https://forums.developer.nvidia.com/t/overclock-gpu-at-lower-voltages/139326/2 for instance). Anyway that's the way it seems to be, unless I missed something...

 

All that could work fine if I ran F@H in a Linux VM with the GPU passed-through, but I'd prefer not to lose GPU computing performance due to the virtualization overhead and above all I would lose the ability to use the GPU in containers, and thus the whole interest of this great plugin...

 

So I raise the question to the community, as my technical skills enable me to understand the root cause of the dead-end, but sadly not to find a solution by myself 🙁

 

Thanks in advance for your ideas and feedback.

Edited by Gnomuz
Link to comment
2 hours ago, badatusernames said:

But then shouldn't it pop up in the smi information?

Yes, try restarting the Plex container and test again; maybe something is preventing it from showing up there. How much CPU usage do you have when you transcode 1080p to, for example, 720p?

Also I would recommend trying a 1080p movie.

 

2 hours ago, badatusernames said:

Most of what I'd want to transcode is 4k to 1080 or lower, so seeing that it can't handle the HEVC stuff should mean it isn't going to be using this, right?

Yes. Maybe try to get a GTX1050Ti on the used market; they are really cheap and very good for transcoding.

 

2 hours ago, badatusernames said:

I did see that this plugin shouldn't be used if we want to use it in a VM, is there a better way to get that passed through?

That is not 100% true; please read the first post again. You can install this plugin and also use Nvidia cards in VMs.

What you should not do is use one card for both a container and a VM. If you want to do something like that, you need one Nvidia card for your containers and another card for your VM.

If a card is transcoding in a container and you start a VM that uses the same card, it will most likely lead to a hard lockup of the server.

 

55 minutes ago, Gnomuz said:

nvidia-settings

Like you said, this utility is a GUI application meant for desktop use, so you need to run a desktop environment to bring it up (I think there are command-line switches, but I never investigated further).

 

What you can try is to boot Unraid in GUI mode and then start the utility (before issuing the command, be sure to type 'export DISPLAY=:0' in the console so that GUI applications use display :0).

For this to work you have to attach a monitor to the graphics card/iGPU.
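
A rough sketch of that step, assuming the X server started by GUI mode is display :0 (X display names carry a leading colon) and that nvidia-settings is available in that environment. The attribute name GPUGraphicsClockOffset and the [3] performance-level index are assumptions that may differ per card and driver.

```shell
# Point X11 clients started from this console at the first local display.
export DISPLAY=:0

if command -v nvidia-settings >/dev/null 2>&1; then
    # Read-only query of the current graphics clock offset on GPU 0.
    nvidia-settings -q '[gpu:0]/GPUGraphicsClockOffset[3]'
else
    echo "nvidia-settings not found" >&2
fi
```

Starting with a query rather than an assignment shows whether the utility can reach the X server at all before anything is changed.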

 

What do you want to do with that utility? It is a very basic utility and I don't think you can tweak much in there.

Link to comment
49 minutes ago, ich777 said:

Yes, try restarting the Plex container and test again; maybe something is preventing it from showing up there. How much CPU usage do you have when you transcode 1080p to, for example, 720p?

Also I would recommend trying a 1080p movie.

 

Yes. Maybe try to get a GTX1050Ti on the used market; they are really cheap and very good for transcoding.

 

That is not 100% true; please read the first post again. You can install this plugin and also use Nvidia cards in VMs.

What you should not do is use one card for both a container and a VM. If you want to do something like that, you need one Nvidia card for your containers and another card for your VM.

If a card is transcoding in a container and you start a VM that uses the same card, it will most likely lead to a hard lockup of the server.

Tried a 1080p to 720p transcode, and while Plex still tells me it's HW transcoding, I still don't have a running process in nvidia-smi. (I know the pic says SD, but I did both 720 and SD just to see)

 

But I'll be on the lookout for a 1050 then!

 

But on the VM side of things, I don't see it as an option for the GPU under graphics cards in the VM section.  It just has VNC. Do I need to start from scratch for my VM or is it just a setting I'm missing?

transcode.png

Edited by badatusernames
Link to comment
7 minutes ago, badatusernames said:

Tried a 1080p to 720p transcode, and while Plex still tells me it's HW transcoding, I still don't have a running process in nvidia-smi. (I know the pic says SD, but I did both 720 and SD just to see)

Can you look at the CPU usage? If it's very low while transcoding, it's definitely working :)

Maybe this is also something that nvidia-smi simply isn't showing in your case...
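
One way to cross-check this from the Unraid terminal, as a sketch: sample the encoder/decoder utilisation while a stream is being transcoded. `nvidia-smi dmon -s u` is a standard option; `nvenc_active` is a hypothetical helper that assumes the usual column order of that output (gpu, sm, mem, enc, dec).

```shell
# nvenc_active: read `nvidia-smi dmon -s u` output on stdin and succeed
# if any sample shows encoder or decoder utilisation above 0 %.
nvenc_active() {
    awk '!/^#/ && ($4 + 0 > 0 || $5 + 0 > 0) { found = 1 }
         END { exit !found }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Take 5 samples while the stream is playing.
    if nvidia-smi dmon -s u -c 5 | nvenc_active; then
        echo "encoder/decoder busy: hardware transcoding is active"
    else
        echo "encoder/decoder idle: the transcode is probably running on the CPU"
    fi
else
    echo "nvidia-smi not found" >&2
fi
```

`nvidia-smi -q -d UTILIZATION` prints the same encoder and decoder figures as a one-off report.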

 

7 minutes ago, badatusernames said:

But I'll be on the lookout for a 1050 then!

Don't forget the 'ti' :)

Here I can get them for about Eur. 80,-

 

7 minutes ago, badatusernames said:

But on the VM side of things, I don't see it as an option for the GPU under graphics cards in the VM section.  It just has VNC. Do I need to start from scratch for my VM or is it just a setting I'm missing?

I would recommend to get a second GPU for this.

 

You can click the little '+' icon and add the GPU; otherwise please ask in the VM subforums, as this has nothing to do with the plugin itself. :)

Also be sure not to use it in a container and a VM at the same time, as I said above.

Link to comment

Unfortunately the CPU usage was maxed out, so I think it's doing it through that and just not touching the GPU.  I'm beginning to wonder about its compatibility in general.  It is listed as part of the driver supported cards, but again, it is 6+yrs old.

 

And for the VM, I don't even see a little + icon, but I'll go to the VM sub to see if they can help.

Link to comment
1 hour ago, ich777 said:

What do you want to do with that utility? It is a very basic utility and I don't think you can tweak much in there.

Thanks for your reply, as usual. I'll try to boot Unraid in GUI mode tomorrow, it's late here.

 

What I want to do with nvidia-settings is overclock the GPU, while undervolting it (with 'nvidia-smi -pl XXX' which works) to diminish the power draw and maintain the folding throughput. nvidia-settings works in command mode, and you can do the same as in GUI mode, esp. apply an offset to your clocks, i.e. overclock, provided you have the famous "Coolbits" set to 28 in xorg.conf. The main issue is nvidia-settings needs to run in an X server.

 

Just to be sure, I have an integrated GPU on the IPMI motherboard (which is used by Unraid and is the primary graphics in BIOS), and the 3060 Ti with no screen attached, just computing/transcoding. If I understand correctly, I should attach a monitor to the 3060 Ti, change the primary graphics card in BIOS, and boot Unraid in GUI mode. We'll see ... I don't know what graphical environment Unraid uses when in GUI mode, and thus don't know how to tweak xorg.conf, if it exists.

 

I had thought of another approach, using a container like your "debian-buster-nvidia", which on paper has everything I would need : the driver is installed, the graphical environment exists, so nvidia-settings should work from inside the container, either in cli or graphical mode. And the tweaks applied from there would maybe have an effect on the card itself, and thus on processes launched by another container. But it's mostly speculation at that stage, and very likely wishful thinking 😉

I've found many over-complicated and unsure approaches around when it comes to overclocking an Nvidia GPU on a headless server not running an X server. The best I've found so far is https://gist.github.com/johnstcn/add029045db93e0628ad15434203d13c#overclocking (in the context of coin mining, but folding has the same requirements if not objectives...). But as we can't install much nor persist settings across reboots in Unraid, I quickly reach a dead-end on my side 🙁

Edited by Gnomuz
Link to comment

Hi all,

I'm using this plugin with an RTX 3090, but I am getting an SMI error. The card works fine in another machine and it's being picked up in System Devices on Unraid. I could use some help troubleshooting this issue.

I noticed the troubleshooting guide specifically warned against riser cables, but since I'm using an ITX case a riser cable is needed for this build. Does it sound like the riser I'm using is faulty?

Link to comment
7 hours ago, badatusernames said:

It is listed as part of the driver supported cards, but again, it is 6+yrs old.

Yep, that's true, but keep in mind that you can also use the card for other things that don't rely on hw transcoding, like Folding@home, or even for gaming in my Debian-Buster-Nvidia container.

 

7 hours ago, Gnomuz said:

I had thought of another approach, using a container like your "debian-buster-nvidia"

Never thought of that but this can actually work because you can launch the utility from a terminal from within the desktop environment.

 

7 hours ago, Gnomuz said:

But as we can't install much nor persist settings across reboots in Unraid, I quickly reach a dead-end on my side 🙁

Think of it from the other side: you also have the 'go' script, where you can do/run basically everything as long as it lives on the USB boot drive.

You can of course also do that in the debian-buster-nvidia container if you set it to autostart.
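
A minimal sketch of the 'go' script idea, assuming the script lives at /boot/config/go and that the Nvidia driver is already loaded when it runs (not verified here). `add_once`, the `GO_FILE` override and the 150 W value are hypothetical and only for illustration.

```shell
# add_once FILE LINE: append LINE to FILE unless an identical line exists.
# Hypothetical helper to keep repeated runs from duplicating entries.
add_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

GO_FILE=${GO_FILE:-/boot/config/go}   # Unraid's boot-time script on the USB drive

if [ -w "$GO_FILE" ]; then
    add_once "$GO_FILE" 'nvidia-smi -pm 1'     # enable persistence mode
    add_once "$GO_FILE" 'nvidia-smi -pl 150'   # re-apply the power limit on boot
else
    echo "$GO_FILE not writable, nothing changed" >&2
fi
```

Because the file sits on the flash drive, the added lines survive reboots even though the rest of the root filesystem does not.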

 

6 hours ago, Juxsta said:

I'm using this plugin with an RTX 3090, but I am getting an SMI error.

Can you give me the exact error message?

Without the error message I can't tell what's wrong. Are you on Unraid 6.9.0 RC1 or on beta 35?

Link to comment
On 11/23/2020 at 7:23 PM, ich777 said:

This seems pretty normal to me. My 1050Ti reports P0 at idle (and basically all the time), but it isn't actually in that state; the power state reporting is broken on that card for some reason. The 1060 3GB that I have idles at P5, and it also consumes almost no power.

The power states in general are a little weird. I think they range from P0 to P12, where P0 = max performance and P12 = max power saving, but as far as I know not all power states are available on all systems, nor on all cards.

Persistence mode only keeps the card in a certain state and/or alive so that it doesn't idle out. I also read that this mode or daemon is going to be deprecated soon.

Leave it as it is since all is working as expected or am I wrong?

 

 

Hi @ich777,

 

first of all thank you so much for your hard work. It runs pretty smooth on my end.
 

Regarding the information in the quoted post: does this mean that the 1050Ti will never change power states on Unraid with this exact configuration, but other models like the 1060 will?

Edited by ph0b0s101
Link to comment
44 minutes ago, ph0b0s101 said:

Regarding the information in the quoted post: does this mean that the 1050Ti will never change power states on Unraid with this exact configuration, but other models like the 1060 will?

This is a little weird on some cards. My GTX1050Ti reports P0 (the highest power state), but it isn't actually in that state when it's doing nothing. If you search Google for something like 'nvidia-smi GTX1050Ti power state' you will find many threads saying that the value is wrong or not reported at all.

From my research it's a simple bug in nvidia-smi, and I think Nvidia has no interest in fixing it because the card is now also pretty old...
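
To judge whether a reported P0 is plausible, the state can be printed next to the clocks and power draw, as in this sketch. The query fields are standard nvidia-smi ones (some may show [N/A] on affected cards); `pstate_index` is a hypothetical helper for scripting.

```shell
# pstate_index: turn a state label such as "P8" into its number, 8.
pstate_index() {
    echo "${1#P}"
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Reported state next to the values that show what the card is really doing.
    nvidia-smi --query-gpu=name,pstate,power.draw,clocks.sm,clocks.mem --format=csv
    state=$(nvidia-smi -i 0 --query-gpu=pstate --format=csv,noheader)
    echo "GPU 0 reports performance state index $(pstate_index "$state")"
else
    echo "nvidia-smi not found" >&2
fi
```

Low clocks and a low power draw alongside P0 would match the misreporting described above.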

 

21 minutes ago, tjb_altf4 said:

I would just request a future enhancement: a download meter of some kind. That's my only (minor) feedback.

Yes, that's something that is on my list but I have to look into this since I'm not a specialist in webcoding. :)

Link to comment
9 hours ago, ich777 said:

This is a little weird on some cards. My GTX1050Ti reports P0 (the highest power state), but it isn't actually in that state when it's doing nothing. If you search Google for something like 'nvidia-smi GTX1050Ti power state' you will find many threads saying that the value is wrong or not reported at all.

From my research it's a simple bug in nvidia-smi, and I think Nvidia has no interest in fixing it because the card is now also pretty old...

 

Thank you for sharing your research results with me and the community; this saves me a lot of time and effort trying to solve an issue that is caused by a bug.

 

Much appreciated ;-)

Link to comment
22 hours ago, ich777 said:

Can you give me the exact error message?

Without the error message I can't tell what's wrong. Are you on Unraid 6.9.0 RC1 or on beta 35?

Error message:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
————————————-

currently running beta 35

Link to comment
1 hour ago, Juxsta said:

Error message:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
————————————-

currently running beta 35

This is because the driver (455.35) for beta35 simply doesn't support the RTX 3090. You have to upgrade to RC1 to get the latest driver, which also supports the RTX 3090.

Link to comment
7 hours ago, kizer said:

So if we are running 6.9-beta35 and upgrade to 6.9-RC1 and have the Nvidia plugin set to auto update in the Plugin do we just update to RC1 and the plugin update takes care of the rest or do we have to do anything else?

 

I'm running a GTX1050TI just for the record. 

For now, yes and no.

The plugin checks on every installation/start/restart of the server if there is a newer version of the driver available and installs it on boot.

 

In the future I will implement a notification system so that you get a notification if a newer driver is available in the Unraid GUI but you have to manually reboot to install it.

The reboot is required because:

  • If something is using the card on an update it will simply fail and your server can/will crash
  • If nothing is using the card you have to restart the Docker daemon in order to pick up the new driver version - that's something I don't want to do automatically
Link to comment

Sorry guys if this is the wrong area, but I'm unsure exactly where to put this. I am and have been using Unraid 6.8.3, and I have an Nvidia 1070 in this system that I use for transcoding. I recently needed to do some reboots to add some new drives, and seemingly during these reboots I've lost access to my Nvidia GPU. The Nvidia plugin shows no devices found. If I run lspci in a terminal I see my 1070 listed. The GPU is powered on in my system: lights on, fans spinning. No VMs are running, so I'm unsure of the path I should take to figure this out. I never changed anything with this, and it had been working with no problems until I added my new drives.

Also, all of a sudden my Plex is randomly crashing. I see nothing in the Docker logs at all and Docker never stops, but I constantly have to restart the Plex container to get it back online. Does anyone have any ideas there?

Plex version - Version 1.21.1.3795

Thanks for your time

Screenshot_20201216-083554_Chrome.jpg

Screenshot_20201216-083716_Chrome.jpg

Screenshot_20201216-083711_Chrome.jpg

godzilla-diagnostics-20201215-1459.zip

Link to comment
23 minutes ago, bradtn said:

Sorry guys if this is the wrong area

It is but I am going to split it into the thread where it belongs.

 

On second thought, I think I will leave it here and split your other posts about that into this thread since the way forward for you may be in this thread.

 

Link to comment
3 minutes ago, trurl said:

On second thought, I think I will leave it here and split your other posts about that into this thread since the way forward for you may be in this thread.

Please don't post about the same problem in multiple places. It makes it impossible to coordinate responses. "Crossposting" has been considered bad form on message boards since before the World Wide Web.

Link to comment
1 minute ago, trurl said:

The Linuxserver nvidia plugin is no longer supported so I have left your posts here in this thread about a new nvidia plugin. You might want to start at the top.

I did see that, but why would that randomly affect my existing setup? My understanding when reading it was that it would only affect builds going forward.

Link to comment
1 hour ago, bradtn said:

Sorry guys if this is the wrong area, but I'm unsure exactly where to put this. I am and have been using Unraid 6.8.3, and I have an Nvidia 1070 in this system that I use for transcoding. I recently needed to do some reboots to add some new drives, and seemingly during these reboots I've lost access to my Nvidia GPU. The Nvidia plugin shows no devices found. If I run lspci in a terminal I see my 1070 listed. The GPU is powered on in my system: lights on, fans spinning. No VMs are running, so I'm unsure of the path I should take to figure this out. I never changed anything with this, and it had been working with no problems until I added my new drives.

 

Also, all of a sudden my Plex is randomly crashing. I see nothing in the Docker logs at all and Docker never stops, but I constantly have to restart the Plex container to get it back online. Does anyone have any ideas there?

 

 Plex version - Version 1.21.1.3795

 Thanks for your time

Screenshot_20201216-083716_Chrome.jpg

Screenshot_20201216-083711_Chrome.jpg

Screenshot_20201216-083554_Chrome.jpg

godzilla-diagnostics-20201215-1459.zip 232.33 kB · 0 downloads

I have subsequently upgraded my unraid to 6.9.0rc1 and grabbed the new proper Nvidia plugin and did all that and my issue remains the same. No devices found on Nvidia plugin page but lspci command lists my card. At a loss as to what I should do or try next 

Link to comment
37 minutes ago, bradtn said:

I have subsequently upgraded my unraid to 6.9.0rc1 and grabbed the new proper Nvidia plugin and did all that and my issue remains the same. No devices found on Nvidia plugin page but lspci command lists my card. At a loss as to what I should do or try next 

 Please don't quote your own posts... :)

 

Can you open a terminal on Unraid itself and give me the output of 'nvidia-smi'?
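
Alongside that output, a few more checks can narrow down why lspci lists the card while the plugin sees no device. This is only a sketch: `gpu_kernel_driver` is a hypothetical helper, and a card bound to a driver other than nvidia (for example vfio-pci) is just one possible cause, not a confirmed diagnosis.

```shell
# gpu_kernel_driver: read `lspci -nnk` output on stdin and print the kernel
# driver bound to each NVIDIA VGA/3D device.
gpu_kernel_driver() {
    awk '/^[0-9a-f]/ { in_gpu = (/NVIDIA/ && /VGA|3D controller/) }
         in_gpu && /Kernel driver in use:/ { print $NF }'
}

if command -v lspci >/dev/null 2>&1; then
    echo "driver bound to the card: $(lspci -nnk | gpu_kernel_driver)"
fi

# Is the nvidia kernel module loaded at all?
lsmod 2>/dev/null | grep -i '^nvidia' || echo "no nvidia kernel module loaded"

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
fi
```

An empty result on the first line means no kernel driver is bound to the card at all.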

Link to comment
