[Plugin] Nvidia-Driver


ich777


8 hours ago, wgstarks said:

Got this while updating to 6.10.1. Guessing it’s still wrong but posting it as requested.

Fixed just now; it should work now. You may have to reboot once so that the new plugin update script is installed.

Link to comment
1 hour ago, tjb_altf4 said:

Upgraded to 6.10.1 and had a problem with gpu again, went into plugin settings and it has reverted to latest driver again.

Seems it is not honoring driver preference settings between upgrades?

I will look into this ASAP and test this.

Link to comment
1 hour ago, tjb_altf4 said:

Already rebooted again with the setting changed, this is what it looks like now, and this is what I had set in plugin prior to OS update

I've now tested it in a slightly different way, and the Plugin Update Helper was also fixed before I did that:

  1. Downgraded to 6.9.2
  2. Changed the driver version to 470.94 in the config file
  3. Triggered the update to 6.10.1
    grafik.png.5c55994f1db6a026e5d24765c4a5ae25.png
    In this case the Plugin Update Helper downloaded 470.94 for 6.10.1
  4. After the reboot to 6.10.1
    grafik.thumb.png.3f075f397f520b2bc07b65fb4de9d49e.png
    The right version is installed

 

 

The plugin (and also the Plugin Update Helper) should always honor the preferred driver version. I will do some more tests next week and report back.

 

The driver only falls back to latest if it can't find the selected driver version. Maybe the network was initialized a little too late on your machine, but that's only a very vague guess.

Anyway, this shouldn't happen anymore because the plugin update is fixed now, but as said above, you may have to reboot once more so that the new version is installed.
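
The fallback behaviour described above can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the function name select_driver and the idea of passing the available versions as arguments (newest first) are assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical helper, NOT taken from the plugin source.
# $1            = preferred driver version (e.g. "470.94") or "latest"
# remaining args = versions available for the running kernel, newest first (assumption)
select_driver() {
  local preferred="$1"
  shift
  if [ "$#" -eq 0 ]; then
    # Nothing to choose from, e.g. the version list could not be fetched.
    echo "no driver list available" >&2
    return 1
  fi
  local latest="$1"
  local v
  if [ "$preferred" != "latest" ]; then
    for v in "$@"; do
      if [ "$v" = "$preferred" ]; then
        # The preferred version exists for this kernel: honor it.
        echo "$preferred"
        return 0
      fi
    done
  fi
  # Preferred version not found (or "latest" requested): fall back to newest.
  echo "$latest"
}

select_driver 470.94 515.43.04 510.60.02 470.94   # prints 470.94
select_driver 470.94 515.43.04 510.60.02          # prints 515.43.04
```

The second call shows the situation from the report: if the preferred version is not in the list at the time of the check, the newest version is chosen instead.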

 

Sorry for the inconvenience...

Link to comment
2 hours ago, tjb_altf4 said:

this is what I had set in plugin prior to OS update

Did a bit more testing:

 

Downgrading from 6.10.1 -> 6.9.2 (picture from before the upgrade):

image.png.84a47a727ce9958a3b81c43b4f5135db.png

image.thumb.png.12c4d95c9724c5a1df3c97a28c04951b.png

 

Upgrade from 6.9.2 -> 6.10.0 (picture from before the upgrade):

image.thumb.png.556b3c2204312e9da1960850d2d5a1eb.png

 

The Plugin Update Helper fails on 6.10.0 because the Kernel version string is formatted incorrectly in this version. This is still a good test, because the plugin packages are not pre-downloaded as in your case and are downloaded on boot instead.

 

 

Upgrade from 6.10.0 -> 6.10.1 (picture from before the upgrade):

image.thumb.png.1b75728ca5705277e3df9f0236425327.png

 

You are completely correct that it falls back to latest.

 

Finally, I set the version to 470.94 again and upgraded to 6.10.1 (picture after the upgrade):

image.thumb.png.960366e46cb638f7f5f30db3fe762bbc.png

 

 

The version is reset to latest because the Plugin Update Helper fails. This shouldn't happen anymore in the upcoming Unraid version, because I've reported to Tom that the Kernel version string was formatted incorrectly, and it should now be fixed.

 

Thank you for the report! I will take a look to make the Kernel version detection even more robust.
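
A more defensive kernel version check could look something like the sketch below. The expected format (three numeric fields followed by "-Unraid", as in 5.15.40-Unraid) is an assumption based on the version strings seen in this thread; how exactly the string was malformed on 6.10.0 is not shown here, and the function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical validation helper, not the plugin's actual code.
# Succeeds only if the string looks like "<major>.<minor>.<patch>-Unraid".
kernel_string_ok() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+-Unraid$'
}

# Report the result instead of silently continuing with a bad string.
check_kernel() {
  if kernel_string_ok "$1"; then
    echo "ok: $1"
  else
    echo "unexpected kernel version string: $1" >&2
    return 1
  fi
}

check_kernel "5.15.40-Unraid"   # prints: ok: 5.15.40-Unraid
```

On a running system the argument would typically come from `uname -r`, for example `check_kernel "$(uname -r)"`.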

Link to comment

I'm not sure why my Quadro P2000 is not being detected. It is not locked or bound to any VMs or VFIO. The System Devices tool does detect the P2000, but the Nvidia Driver package won't see it.

 

System Devices:

 cbe0af32bb952925fc98b73e56d9bdbc.png

 

Nvidia Driver Package:

6c90a67f0a959599a41b61e09d24a8d6.png

 

Logs:

 

111ac58c8c24054e47b67ebfe0d4282d.png

 

 

Diagnostics:

 

mirzaserver-diagnostics-20220523-0208.zip

I bought this card second-hand very recently. Does it seem like something is wrong with the kernel?

Edited by Mmirzax
Link to comment
19 minutes ago, Mmirzax said:

I bought this card second-hand very recently. Does it seem like something is wrong with the kernel?

Did you make sure that the card is actually working?

 

Please also make sure that you are on the latest BIOS version and try to reseat the card in the PCIe slot. Do you have IOMMU enabled in your BIOS?

From what I see in your Diagnostics, you are booting with UEFI. Can you try to boot in Legacy (CSM) mode?

 

I can only think of a HW compatibility issue.

 

May I ask what you want to use the P2000 for: HW transcoding, or something else too?

Link to comment
1 minute ago, Mmirzax said:

It seems to be the card. I have updated the BIOS, tried booting with Legacy CSM, reseated the card, and tried another port. It does seem to be the card. I'll have to get it swapped out.

Do you maybe have another computer where you could put the card in, install the drivers, and put a 3D load on it?

This would also be a good thing to do first.

Link to comment

hi there guys,

 

i had to switch my usb stick today, which was a breeze and not a problem at all.

after the reboot and changing my key file, the nvidia drivers plugin didn't list my p400.

but that happens every now and then, most of the time a reboot fixes this.

 

sadly not today. from my old usb stick i can tell that nvidia driver version 510.60.02 was installed.

 

on every boot it tells me "modprobe: ERROR: could not insert 'nvidia': No such device"

 

i tried every other driver option within the plugin, no success. does anyone have an idea?

 

unraid version 6.9.2

 

more info will follow. still on the road, pardon me.

 

Link to comment
8 minutes ago, sausagewaterson said:

if i reseat the cards, the 1050ti isn't usable in my vms, but the p400 is with my docker. i am a little confused now. it appears to me that the vfio/iommu groups are changed/wrong. i am even more confused now...

From what I see in your logs, both of your Nvidia cards are bound to VFIO. This is the reason why the plugin can't see either of the cards:

03:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1fb2] (rev a1)
	Subsystem: Lenovo Device [17aa:1489]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidia_drm, nvidia
03:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
	Subsystem: Lenovo Device [17aa:1489]
	Kernel driver in use: vfio-pci
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] [10de:1c82] (rev a1)
	Subsystem: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] [10de:11bf]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidia_drm, nvidia
04:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
	Subsystem: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:11bf]
	Kernel driver in use: vfio-pci
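
To check this on your own system, a small filter over the lspci output can list the NVIDIA devices that are currently bound to vfio-pci. This is a minimal sketch; the function name vfio_bound_nvidia is made up for the example, and it simply assumes the `lspci -k` layout shown above (one device line followed by indented detail lines).

```shell
#!/bin/bash
# Hypothetical helper: reads "lspci -k" style text on stdin and prints the
# bus address of every NVIDIA device whose driver in use is vfio-pci.
vfio_bound_nvidia() {
  awk '
    # Device lines start with the bus address; detail lines are indented.
    /^[0-9a-f]/ { dev = $1; nvidia = ($0 ~ /NVIDIA/) }
    /Kernel driver in use: vfio-pci/ && nvidia { print dev }
  '
}

# Demo with two entries in the same layout as the output above.
vfio_bound_nvidia <<'EOF'
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] [10de:1c82] (rev a1)
	Kernel driver in use: vfio-pci
	Kernel modules: nvidia_drm, nvidia
04:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
	Kernel driver in use: vfio-pci
EOF
# prints:
# 04:00.0
# 04:00.1
```

On a live system the input would come from something like `lspci -nnk -d 10de: | vfio_bound_nvidia`, where `10de` is the NVIDIA vendor ID seen in the listing. Any address printed here is a device the plugin cannot use until it is unbound from VFIO.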

 

Link to comment
11 minutes ago, sausagewaterson said:

yes. makes sense to me. but if i hook up only the 1050ti and bind it, i can use it as before.

as soon as i add the t400, it isn't recognized anymore, which was not the case before.

Have you tried this yet:

image.png.622f589d15ee421c3e3711c3992f427c.png

 

Or what do you mean by that? Did you also bind the second card to VFIO?

Link to comment
1 hour ago, Bamabo said:

we need nvidia-driver-legacy :( for people like myself who still use a k2000

Did you even take a look at the plugin page itself yet? Driver version 470.94, maybe??? :D

 

Which Unraid version are you on?

 

The driver 470.94 is available for every Unraid version so far and I will compile it as long as it compiles on newer Kernel versions.

Of course, when you are on Unraid 6.9.2, which is now outdated because Unraid 6.10.1 stable was released, you won't see the version because I only list the last 8 drivers. It's still there, but you have to enable it manually.

Link to comment
On 5/22/2022 at 4:41 PM, ich777 said:

The version is reset to latest because the Plugin Update Helper fails. This shouldn't happen anymore in the upcoming Unraid version, because I've reported to Tom that the Kernel version string was formatted incorrectly, and it should now be fixed.

 

Thank you for the report! I will take a look to make the Kernel version detection even more robust.

FYI, not fixed in 6.10.2 (coming from 6.10.1): the helper failed and I needed to set the driver again, requiring an additional reboot post-upgrade.

 

(not a big issue for me, just wanted you to know in case it was meant to be fixed)

Edited by tjb_altf4
Link to comment
4 minutes ago, tjb_altf4 said:

I didn't look too closely, but it looked like it failed in a similar way to previous times where it failed (red notification), then OK (green notification).

I think what happened is that you still had the old version installed. The Plugin Update Helper only updates when you reboot or when you install a new plugin that needs modules for specific Kernel versions.

 

I've tested upgrading and downgrading between various Unraid versions.

 

EDIT: If you are curious to test, you can downgrade to 6.10.1 and, after a reboot, upgrade to 6.10.2 again; it should work just fine.

Link to comment
On 5/20/2022 at 11:08 AM, ich777 said:

Everything seems fine from what I see from your log.

 

Can you try the following:

Reboot without the driver plugin being installed

Issue these commands and post the output here (copy and paste the whole thing should work fine):

mkdir -p /tmp/nvdrv && cd /tmp/nvdrv
wget https://github.com/ich777/unraid-nvidia-driver/releases/download/5.15.40-Unraid/nvidia-515.43.04-5.15.40-Unraid-1.txz
installpkg nvidia-515.43.04-5.15.40-Unraid-1.txz
depmod -a
modprobe nvidia
rm -rf /tmp/nvdrv
nvidia-smi

 

Morning, just upgraded to 6.10.2 and my issue is back again. I repeated the steps I followed previously to get it working, but without success.

 

I have just tried your suggested steps above, results as follows:

mkdir -p /tmp/nvdrv && cd /tmp/nvdrv
wget https://github.com/ich777/unraid-nvidia-driver/releases/download/5.15.40-Unraid/nvidia-515.43.04-5.15.40-Unraid-1.txz
installpkg nvidia-515.43.04-5.15.40-Unraid-1.txz
depmod -a
modprobe nvidia
rm -rf /tmp/nvdrv
nvidia-smi
--2022-05-28 10:18:06-- https://github.com/ich777/unraid-nvidia-driver/releases/download/5.15.40-Unraid/nvidia-515.43.04-5.15.40-Unraid-1.txz
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/306724515/91c671a8-ea29-453f-8603-2b55d8950db6?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220528%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220528T091807Z&X-Amz-Expires=300&X-Amz-Signature=af4ced2956ac30529332c4a4049c19707e39ae3bc79c3d5116b925b9b047b4cf&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=306724515&response-content-disposition=attachment%3B filename%3Dnvidia-515.43.04-5.15.40-Unraid-1.txz&response-content-type=application%2Foctet-stream [following]
--2022-05-28 10:18:07-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/306724515/91c671a8-ea29-453f-8603-2b55d8950db6?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220528%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220528T091807Z&X-Amz-Expires=300&X-Amz-Signature=af4ced2956ac30529332c4a4049c19707e39ae3bc79c3d5116b925b9b047b4cf&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=306724515&response-content-disposition=attachment%3B filename%3Dnvidia-515.43.04-5.15.40-Unraid-1.txz&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.108.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 257129976 (245M) [application/octet-stream]
Saving to: ‘nvidia-515.43.04-5.15.40-Unraid-1.txz’

nvidia-515.43.04-5.15.40-Unraid-1.txz                       100%[========================================================================================================================================>] 245.22M  63.6MB/s    in 3.9s

2022-05-28 10:18:11 (62.7 MB/s) - ‘nvidia-515.43.04-5.15.40-Unraid-1.txz’ saved [257129976/257129976]

Verifying package nvidia-515.43.04-5.15.40-Unraid-1.txz.
Installing package nvidia-515.43.04-5.15.40-Unraid-1.txz:
PACKAGE DESCRIPTION:
Package nvidia-515.43.04-5.15.40-Unraid-1.txz installed.
modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.15.43-Unraid
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
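
The modprobe error above points at a kernel mismatch: the package name contains 5.15.40-Unraid, while the module directory belongs to 5.15.43-Unraid. A quick check before installing could look like the sketch below. It assumes the file naming scheme seen in this thread (nvidia-<driver>-<kernel>-<build>.txz); the function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers; they only rely on the package naming seen above:
#   nvidia-<driver version>-<kernel version>-<build>.txz

# Print the kernel version encoded in a package file name or URL.
pkg_kernel() {
  local name="${1##*/}"     # drop any directory or URL prefix
  name="${name#nvidia-}"    # drop the "nvidia-" prefix
  name="${name%-*.txz}"     # drop the trailing "-<build>.txz"
  echo "${name#*-}"         # drop the driver version, keep the kernel part
}

# Succeed only if the package was built for the given kernel.
pkg_matches_kernel() {
  [ "$(pkg_kernel "$1")" = "$2" ]
}

pkg_kernel nvidia-515.43.04-5.15.40-Unraid-1.txz   # prints 5.15.40-Unraid

if ! pkg_matches_kernel nvidia-515.43.04-5.15.40-Unraid-1.txz 5.15.43-Unraid; then
  echo "package does not match kernel 5.15.43-Unraid"
fi
```

On a running system the second argument would normally be `"$(uname -r)"`, so the check fails before installpkg is run with a package built for a different kernel.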
 

Link to comment

Amended the script to match the running kernel version (5.15.43-Unraid):

mkdir -p /tmp/nvdrv && cd /tmp/nvdrv
wget https://github.com/ich777/unraid-nvidia-driver/releases/download/5.15.43-Unraid/nvidia-515.43.04-5.15.43-Unraid-1.txz
installpkg nvidia-515.43.04-5.15.43-Unraid-1.txz
depmod -a
modprobe nvidia
rm -rf /tmp/nvdrv
nvidia-smi

 

Results:

Sat May 28 10:31:33 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   44C    P0    39W / 180W |      0MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:03:00.0 Off |                  N/A |
| 35%   32C    P0    N/A /  19W |      0MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
 

Link to comment

Following the success above, I went to CA and installed the plugin again:

image.thumb.png.5bfd8efedcf1820b1b00e1098a659357.png

Link to comment