[Plugin] Nvidia-Driver


ich777


30 minutes ago, ich777 said:

I don't understand, what should that change exactly?

 

Did you reinstall the driver at some point, and/or do you have an iGPU built in? Do you use GUI mode?

 

You can however try to execute this command from the command line and reboot afterwards:

sed -i "/disable_xconfig=/c\disable_xconfig=true" /boot/config/plugins/nvidia-driver/settings.cfg

(but this would be the last thing that I would try)

 

The latest driver idles at 12 watts, whereas the previous one idled at 7 watts.

2 minutes ago, dopeytree said:

The latest driver idles at 12 watts, whereas the previous one idled at 7 watts.

I can compile this driver for 6.11.0 but are you really sure this is not also caused by the new Kernel and/or other things?

Usually a driver revision change doesn't introduce such a huge wattage increase.

2 hours ago, cherrymomo said:

My 3060 Ti draws 7 W at idle with driver v470.141.03, but with the other two drivers it idles at around 17-20 W.

Thank you for the report.

 

There is literally nothing I can do about that because this is the proprietary driver from Nvidia and you should report that on the Nvidia Forums.

 

Another thing to check before you report it: is this a measured value (taken with a power meter at the wall), or just something the GPU plugin reported? I wouldn't always rely on values reported by the software itself. It would also be good to know whether you measured this with nvidia-persistenced on or off, and which power state the card was in.

 

Another question I have: why does the GPU Statistics plugin report that your card is running at PCIe Gen 1 speeds when it is actually capable of Gen 4 speeds?

 

Also keep in mind that if you use the RTX 3060 Ti only for transcoding, it is really not the most efficient card for that. There are much cheaper and more efficient cards for such tasks, like the Nvidia T400 or T600, which get the job done just fine, were affordable even during the GPU crisis, and cost about $120 brand new.

The T400 has a maximum TDP of 30 W (and yes, this is the real TDP) and draws about 1-3 W at idle.

(I think the T600 has a maximum TDP of around 40W if I'm not mistaken and there is also a T1000 which has a maximum TDP of around 50W)
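
A minimal sketch of how the values asked about above (power state, reported power draw, persistence mode, PCIe link generation) could be read from the command line, assuming nvidia-smi is on the PATH. The function name gpu_idle_report is made up for illustration; the queried fields are standard --query-gpu properties. The power figure is still a software-reported value, so it does not replace a power meter at the wall.

```shell
#!/bin/bash
# Properties requested from nvidia-smi (all are documented --query-gpu fields).
GPU_FIELDS="name,pstate,power.draw,persistence_mode,pcie.link.gen.current,pcie.link.gen.max"

# Hypothetical helper: read "csv,noheader,nounits" rows on stdin and
# print one readable summary line per GPU.
gpu_idle_report() {
  awk -F', ' '{
    printf "%s: state=%s power=%sW persistence=%s pcie=gen%s/gen%s\n", $1, $2, $3, $4, $5, $6
  }'
}

# On the server (not executed here):
#   nvidia-smi --query-gpu="$GPU_FIELDS" --format=csv,noheader,nounits | gpu_idle_report
```

Running it once right after boot and once after the card has settled makes it easier to compare idle states between driver versions.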

1 hour ago, ich777 said:

Thank you for the report.

 

Over the past two hours I reinstalled each of the drivers, tested again, and found the following:
1. Idle power draw is not related to the driver version (after reinstalling the latest driver, my graphics card holds an idle power draw of 5 W).
PS: I am not using any power-saving script.

2. On the first boot of Unraid (with every GPU driver version), the graphics card keeps its memory clock very high, drawing about 40-50 W with the fan at 0 speed.
When I open the console and run nvtop, the GPU drops to a standby state (memory clock drops, fan at low speed, power draw 17-20 W). When I close the Unraid console, it returns to the high-load state with the fan at 0 speed. If I then play a video in Plex and close it, the GPU drops to a true idle state at 5 W.
So every time I start Unraid, I have to play a video once to get the GPU into low-power idle.

3. About the PCIe question: I just checked, and in the idle state the link defaults to PCIe Gen 1. When a program uses the GPU (such as video playback), it automatically switches to PCIe Gen 4.

屏幕截图 2022-09-28 075554.png

屏幕截图 2022-09-28 075623.png

1 minute ago, cherrymomo said:

On the first boot of Unraid (with every GPU driver version), the graphics card keeps its memory clock very high, drawing about 40-50 W with the fan at 0 speed.

That's why I recommend appending this to the go file:

nvidia-persistenced

This will solve it.
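
A minimal sketch of appending that line without creating duplicates on repeated runs, assuming the go file is at its usual Unraid location, /boot/config/go. The helper name append_once is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a file only if that exact
# line is not already present.
append_once() {
  local line=$1 file=$2
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# On the server, as root (not executed here):
#   append_once "nvidia-persistenced" /boot/config/go
```

The go file is only run at boot, so the change takes effect after the next reboot.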

 

2 minutes ago, cherrymomo said:

1. Idle power draw is not related to the driver version (after reinstalling the latest driver, my graphics card holds an idle power draw of 5 W).

So there is no difference in power usage...?

I thought there was a massive difference...

7 minutes ago, ich777 said:

That's why I recommend appending this to the go file:

nvidia-persistenced

This will solve it.

 

So there is no difference in power usage...?

I thought there was a massive difference...

I don't know whether peak power draw differs between driver versions, but across the three versions the idle power draw differs by only 1-2 W (highest 7 W, lowest 5 W).

 

All results were obtained after reinstalling the driver; results after a normal upgrade may differ.

Edited by cherrymomo
3 minutes ago, cherrymomo said:

I don't know whether peak power draw differs between driver versions, but across the three versions the idle power draw differs by only 1-2 W (highest 7 W, lowest 5 W).

I would recommend that you measure this with a real power meter and compare the results.

As always, this is just a software sensor: the value is calculated and doesn't necessarily reflect the real power usage.

 

As said above, to solve your issue after a reboot append this to your go file:

nvidia-persistenced

 


This is the first time I've installed the Nvidia driver, or at least it had been uninstalled in the past.

 

Unraid 6.11.0   Edit: also happens on Unraid 6.10.3

 

I can access Unraid via browser but local boot monitor is just black with a cursor line at the top left.

 

The local monitor shows all the text during boot, but when it comes time for the Unraid login screen to appear, everything goes black.

 

The Unraid boot video card is an old 32-bit ATI PCI card that has always worked.

image.png.ac9ac3d2187debee5df320d1bfe83f00.png

image.thumb.png.7c704db1c27c349f68f1ee42c880c7f0.png

 

unraid-diagnostics-20220928-1734.zip

Edited by Paul_Ber
3 hours ago, Paul_Ber said:

I can access Unraid via browser but local boot monitor is just black with a cursor line at the top left.

Execute this command from an Unraid terminal and reboot afterwards:

sed -i "/disable_xconfig=/c\disable_xconfig=true" /boot/config/plugins/nvidia-driver/settings.cfg
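
For anyone who wants to see what that command does before running it: the sed `c\` command replaces every line containing `disable_xconfig=` with `disable_xconfig=true`. A minimal sketch that does the same thing but keeps a backup copy first and prints the resulting line, assuming GNU sed (as shipped with Unraid). The function name set_disable_xconfig and the .bak suffix are made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper: set disable_xconfig=true in the given settings
# file, keeping a .bak copy of the previous contents.
set_disable_xconfig() {
  local cfg=$1
  cp -- "$cfg" "$cfg.bak" || return 1
  # GNU sed "c\" replaces each line matching the address with the given text.
  sed -i "/disable_xconfig=/c\disable_xconfig=true" "$cfg"
  # Show the resulting line so the change can be confirmed.
  grep -- "disable_xconfig=" "$cfg"
}

# On the server (not executed here), followed by a reboot:
#   set_disable_xconfig /boot/config/plugins/nvidia-driver/settings.cfg
```

If the file has no `disable_xconfig=` line at all, sed changes nothing and the final grep prints nothing, which is a quick hint that the setting is missing.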

 


Hi all,

I'm very new to unraid and servers in general. I have a good understanding of software and hardware but I'm still learning so please go easy on me.

I have an old GTX 980 Ti that I put in my server in hopes of using it for Plex transcoding. I installed the Nvidia driver from Apps and it reported a successful installation, but what I see is the following:

 

image.thumb.png.d5d4e33920a89632e031a7bb96e30a39.png

 

It doesn't look like the driver is picking up the installed GPU or displaying the driver version.

I stopped the docker process and restarted as I've seen suggested but that doesn't seem to have done the trick.

 

Any help would be greatly appreciated! 

 

8 hours ago, alturismo said:

Maybe start with a screenshot

 

image.thumb.png.3e0d163b3ee3b539efe648c1fd2912c6.png

 

and just to make sure, you didn't bind the GPU to a running VM or set a VFIO bind

 

I apologize. I've attached my diagnostics.

I don't have any VMs or containers set up yet. The only thing I've done so far is build the array and set up a few shares and users.

Next I wanted to install Plex, and I ran into this issue while following a YouTube tutorial on installing the Nvidia driver.

 

 

 

executor-server-diagnostics-20220930-0735.zip

2 hours ago, dfrontiera said:

I don't have any VMs or containers set up yet. The only thing I've done so far is build the array and set up a few shares and users.

It seems that you closed the plugin installation window before it actually finished.

Please uninstall the plugin, go to the CA App, download a fresh copy of the plugin, and wait for the Done button to appear.

32 minutes ago, ich777 said:

It seems that you closed the plugin installation window before it actually finished.

Please uninstall the plugin, go to the CA App, download a fresh copy of the plugin, and wait for the Done button to appear.

 

Yep that did it. Feel a bit silly, I should have tried that first. Thank you for your help!!! 


Hello!

 

I'm facing an issue with my Unraid tower.

I have an RTX 3080 + P2200, and the nvidia-driver plugin sees only the RTX 3080.

Unraid: up-to-date / nvidia-driver plugin: up-to-date / Nvidia driver installed: 515.76 / neither device is bound to VFIO / CPU: Threadripper 3960X / motherboard: Aorus Master TRX40

 

root@ValiLab:~# nvidia-smi
Mon Oct  3 15:41:08 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.76       Driver Version: 515.76       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:21:00.0 Off |                  N/A |
|  0%   48C    P8    14W / 370W |      0MiB / 10240MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
root@ValiLab:~# lspci -nnv | grep -i nvidia
21:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102 [GeForce RTX 3080] [10de:2206] (rev a1) (prog-if 00 [VGA controller])
        Kernel driver in use: nvidia
        Kernel modules: nvidia_drm, nvidia
21:00.1 Audio device [0403]: NVIDIA Corporation GA102 High Definition Audio Controller [10de:1aef] (rev a1)
4a:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106GL [Quadro P2200] [10de:1c31] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: NVIDIA Corporation GP106GL [Quadro P2200] [10de:131b]
        Kernel modules: nvidia_drm, nvidia
4a:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
        Subsystem: NVIDIA Corporation GP106 High Definition Audio Controller [10de:131b]
root@ValiLab:~# dmesg | grep nvidia
[   72.084431] nvidia: loading out-of-tree module taints kernel.
[   72.084434] nvidia: loading out-of-tree module taints kernel.
[   72.084829] nvidia: module license 'NVIDIA' taints kernel.
[   72.085201] nvidia: module license 'NVIDIA' taints kernel.
[   72.221439] nvidia-nvlink: Nvlink Core is being initialized, major device number 243
[   72.222868] nvidia 0000:21:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[   72.269157] nvidia 0000:4a:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[   72.513267] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  515.76  Mon Sep 12 19:11:54 UTC 2022
[   72.516799] [drm] [nvidia-drm] [GPU ID 0x00002100] Loading driver
[   72.517198] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:21:00.0 on minor 0
[   72.517675] [drm] [nvidia-drm] [GPU ID 0x00004a00] Loading driver
[   72.518082] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:4a:00.0 on minor 1
[  111.118063] NVRM: Persistence mode is deprecated and will be removed in a future release. Please use nvidia-persistenced instead.

 

Can anyone help, please?

Diagnostics here => valilab-diagnostics-20221003-1552.zip

Thanks !

Edited by Valiran
add CPU + MB info + diagnostics
1 hour ago, Valiran said:

Can anyone help, please?

Because your Quadro is bound to VFIO and the system can't see it:

4a:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106GL [Quadro P2200] [10de:1c31] (rev a1)
	Subsystem: NVIDIA Corporation GP106GL [Quadro P2200] [10de:131b]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidia_drm, nvidia
4a:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
	Subsystem: NVIDIA Corporation GP106 High Definition Audio Controller [10de:131b]
	Kernel driver in use: vfio-pci

 

Do you use the Quadro in a VM? If not I would recommend that you unbind it and reboot, after that it should show up just fine.
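
A minimal sketch of how to spot this yourself: it scans `lspci -nnk` style output and lists every NVIDIA device whose "Kernel driver in use" is vfio-pci. The function name vfio_bound_nvidia is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read `lspci -nnk` style output on stdin and print
# the PCI address of every NVIDIA device currently bound to vfio-pci.
vfio_bound_nvidia() {
  awk '
    # Device header lines start with the PCI address (not indented).
    /^[0-9a-f]/ { addr = $1; nvidia = ($0 ~ /NVIDIA/) }
    # Indented detail line naming the driver that claimed the device.
    /Kernel driver in use: vfio-pci/ && nvidia { print addr }
  '
}

# On the server (not executed here):
#   lspci -nnk | vfio_bound_nvidia
```

With the output quoted above this would list 4a:00.0 and 4a:00.1; an empty result means no NVIDIA device is held by vfio-pci.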

