[Plugin] Nvidia-Driver


ich777


4 hours ago, KennySPT said:

Followed the steps outlined at the beginning of this thread.   

Please post a screenshot from your Plex container where all the variables are visible.

 

Is there an official Plex image support thread for unRAID here on the forums? This has nothing to do with the driver itself; in your case the driver is working just fine, from what I see in the nvidia-smi output.

Link to comment
On 1/5/2022 at 5:02 AM, ich777 said:

What? Why? Have you restarted the Docker daemon once, or even read the first post or the red warning saying that you have to restart the Docker daemon once?

 

I think someone already forked my driver and also the repo and made it available; please ask in the Chinese-language subforums.

Probably just a network issue. My Unraid server often fails to download plugins, which leaves them running with something missing.

 

I've done all the steps the normal way, including disabling the Docker service and enabling it again via the WebUI. I also tried rebooting the Unraid server several times, which did not help.

 

Downloading files from foreign websites is always painful and fails easily here in mainland China. Maybe the runtime parameter didn't take effect because the Nvidia Driver plugin failed to install properly (even though the WebUI displayed "installed successfully").

 

So I just placed it manually into the daemon.json file, and the Emby container now runs correctly with GTX750 hardware transcoding.

 

BTW, I set http_proxy in /boot/config/go, and the system's network access (to CA or others) seems fine.

Edited by Zakikun
Link to comment
50 minutes ago, Zakikun said:

So I just placed it manually into the daemon.json file, and the Emby container now runs correctly with GTX750 hardware transcoding.

If you download the appropriate driver package from my Github and place it in the 'packages' directory in the nvidia-driver folder everything will be installed correctly, including the daemon.json file.

Link to comment
3 minutes ago, ich777 said:

If you download the appropriate driver package from my Github and place it in the 'packages' directory in the nvidia-driver folder everything will be installed correctly, including the daemon.json file.

Does the driver package append the runtime entry to /etc/docker/daemon.json, or does it replace the file automatically once the download completes?

 

mkdir -p /etc/docker 
tee /etc/docker/daemon.json <<-'EOF' 
{ 
    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"]
}
EOF

 

I think it's my own issue: I had edited /boot/config/go to create /etc/docker/daemon.json and point the Docker image registry mirror at a mainland server before I installed the Nvidia Driver plugin.

 

Does the plugin run earlier than the go file at boot, so that daemon.json is being overwritten by the go file rather than by the plugin?

Link to comment
22 minutes ago, Zakikun said:

Does the driver package append the runtime command into /etc/docker/daemon.json or replace it

22 minutes ago, Zakikun said:

I think it's my own issue: I had edited /boot/config/go to create /etc/docker/daemon.json and point the Docker image registry mirror at a mainland server before I installed the Nvidia Driver plugin.

22 minutes ago, Zakikun said:
tee /etc/docker/daemon.json <<-'EOF' 

 

With your method you are replacing the daemon.json that the Nvidia driver has already put in place. If you want to inject something into the file, I would rather do it like this:

sed -i '0,/{/a\    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"],' /etc/docker/daemon.json

With this method you inject "registry-mirrors" after the first { that is found, and the preconfigured nvidia runtime configuration in the file is left untouched.

 

You also don't need to create the path /etc/docker because this is done on driver installation.

 

The go file is executed after the plugins are installed. Please keep in mind that I would execute the line before emhttp is called in the go file, so that the file is modified before the Docker service is started.

 

The file should then look something like this:

{
    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"],
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
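If you test the line by hand and then run it again, the key would end up in the file twice. A small guard avoids that. This is only a sketch, not something the plugin ships: the function name inject_mirror is made up for illustration, and it assumes GNU sed and that the driver installation has already created the file.

```shell
#!/bin/bash
# Hypothetical helper (not part of the plugin): add "registry-mirrors" to an
# existing daemon.json only if the key is not already present.
inject_mirror() {
    local file="$1" mirror="$2"

    # Nothing to modify if the driver plugin has not created the file yet.
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }

    # Skip if a previous run already injected the key.
    if grep -q '"registry-mirrors"' "$file"; then
        return 0
    fi

    # Same GNU sed idiom as above: append one line after the first "{".
    sed -i "0,/{/a\\    \"registry-mirrors\": [\"$mirror\"]," "$file"
}
```

In the go file this could be called as inject_mirror /etc/docker/daemon.json "https://docker.mirrors.ustc.edu.cn" on a line before emhttp.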

 

 

Hope this makes sense, answers your questions and solves your issue. :)

Link to comment
30 minutes ago, ich777 said:

With your method you are replacing the daemon.json that the Nvidia driver has already put in place. If you want to inject something into the file, I would rather do it like this:

sed -i '0,/{/a\    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"],' /etc/docker/daemon.json

With this method you inject "registry-mirrors" after the first { that is found, and the preconfigured nvidia runtime configuration in the file is left untouched.

 

You also don't need to create the path /etc/docker because this is done on driver installation.

 

The go file is executed after the plugins are installed. Please keep in mind that I would execute the line before emhttp is called in the go file, so that the file is modified before the Docker service is started.

 

The file should then look something like this:

{
    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"],
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

 

 

Hope this makes sense, answers your questions and solves your issue. :)

Now I see. Thanks for the help.

So is this the correct startup process?

1. Unraid server boots up

2. Plugin is installed and extracts daemon.json to /etc/docker/daemon.json

3. go file is loaded, the extra mirror line is injected into daemon.json, and emhttp is called

4. Docker service starts with the JSON configuration

5. Containers start up in order

6. The related container starts with the nvidia runtime
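A quick way to sanity-check the result of step 3 is to look for both entries in the file. The sketch below is only an illustration: the function name check_daemon_json is invented here, and it does a plain text search rather than real JSON parsing.

```shell
#!/bin/bash
# Hypothetical check (names are illustrative): confirm that daemon.json still
# carries the nvidia runtime entry as well as the injected mirror entry.
check_daemon_json() {
    local file="${1:-/etc/docker/daemon.json}"
    local status=0

    if ! grep -q '"nvidia"' "$file" 2>/dev/null; then
        echo "nvidia runtime entry missing in $file"
        status=1
    fi
    if ! grep -q '"registry-mirrors"' "$file" 2>/dev/null; then
        echo "registry-mirrors entry missing in $file"
        status=1
    fi
    [ "$status" -eq 0 ] && echo "daemon.json looks as expected"
    return "$status"
}
```

A file that was overwritten by a heredoc containing only "registry-mirrors" would be reported with the nvidia runtime entry missing, which matches the situation described earlier in the thread.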

Link to comment
20 hours ago, Dazog said:

wow massive amount of fixes for this branch.

This driver currently does not compile because of this (last line from the changelog); I have to contact Nvidia about that:

Updated nvidia.ko to load even if no supported NVIDIA GPUs are present when an NVIDIA NVSwitch device is detected in the system. Previously, nvidia.ko would fail to load into the kernel if no supported GPUs were present.

 

Also keep in mind this is a beta driver.

 

You have to give me a few days to solve this.

Link to comment
9 hours ago, ich777 said:

This driver currently does not compile because of this (last line from the changelog); I have to contact Nvidia about that:

Updated nvidia.ko to load even if no supported NVIDIA GPUs are present when an NVIDIA NVSwitch device is detected in the system. Previously, nvidia.ko would fail to load into the kernel if no supported GPUs were present.

 

Also keep in mind this is a beta driver.

 

You have to give me a few days to solve this.

No rush, the "do not speak of" patch is always a few days behind the drivers anyway ;)

Link to comment
1 hour ago, Jschroedl said:

Hello,

 

I got the notification about the Nvidia 510 driver and I rebooted Unraid to install it. Looking at the last few messages, it appears that was a mistake. Now my instance doesn't boot. Is there a way to revert the installed Nvidia driver?

 

Joe Schroedl

You can select the driver.

Settings > Nvidia Driver > Select preferred driver version

Link to comment
6 hours ago, Jschroedl said:

Looking at the last few messages, it appears that was a mistake.

My server is running the 510 beta driver just fine.

You only see drivers that are already compiled for unRAID.

 

6 hours ago, Jschroedl said:

Now my instance doesn't boot

What exactly do you mean by this?

Link to comment
5 hours ago, GrahamsCrackers said:

I just replaced my EVGA 3070 via RMA. However, now when I load the Nvidia-Driver plug in, it is not recognizing my GPU. I'm not sure what the problem is.

Please make sure that you are on the latest BIOS version, enable "Above 4G Decoding" & "Resizable BAR" support in your BIOS and also disable C-States.

Link to comment
1 hour ago, GrahamsCrackers said:

In theory, the EVGA card should be functional because I just received it from EVGA directly. Not sure what else could be causing the issue. 

Maybe the card thinks you want to use it for mining or it needs at least a display output connected.

Have you tried to connect a display to it or a dummy HDMI plug and see if the card is recognized?

 

From what I saw in your Diagnostics the card is recognized and the driver is loaded but your log outputs this:

Jan 14 21:11:56 Quasar kernel: NVRM: GPU at PCI:0000:0a:00: GPU-928e9831-f6bf-21e5-c151-b84101b7a474
Jan 14 21:11:56 Quasar kernel: NVRM: Xid (PCI:0000:0a:00): 62, pid=15364, 0000(0000) 00000000 00000000
Jan 14 21:12:03 Quasar kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jan 14 21:12:03 Quasar kernel: caller _nv000649rm+0x1ad/0x200 [nvidia] mapping multiple BARs
Jan 14 21:12:03 Quasar kernel: NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
Jan 14 21:12:03 Quasar kernel: NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0
Jan 14 21:12:03 Quasar kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jan 14 21:12:03 Quasar kernel: caller _nv000649rm+0x1ad/0x200 [nvidia] mapping multiple BARs
Jan 14 21:12:03 Quasar kernel: NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
Jan 14 21:12:03 Quasar kernel: NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0

 

Log output like this usually means that Above 4G Decoding or Resizable BAR support is not turned on. You could also try to boot with UEFI.
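To pull just these lines out of a syslog, a small grep wrapper is enough. This is a sketch under assumptions: the function name scan_nvrm_errors is made up, and the default path /var/log/syslog is assumed; pass the syslog file from your diagnostics zip instead if you are looking at an export.

```shell
#!/bin/bash
# Hypothetical helper: list NVRM init failures and Xid events in a syslog file.
# The default path is an assumption; pass a file from your diagnostics instead.
scan_nvrm_errors() {
    local log="${1:-/var/log/syslog}"
    # Matches the kinds of NVRM lines shown in the log excerpt above.
    grep -E 'NVRM: .*(RmInitAdapter failed|rm_init_adapter failed|Xid)' "$log"
}
```

The function prints the matching lines and returns a non-zero exit code when nothing matches, so it can also be used in an if statement.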

Link to comment
3 hours ago, ich777 said:

Sure thing, why shouldn't they? If they are listed in the plugin, I've compiled them; otherwise they won't be displayed in the plugin anyway.

 

I only ask because someone said their machine would not boot after updating. I thought he had an issue with the newest driver.

Link to comment
3 hours ago, david279 said:

I only ask because someone said their machine would not boot after updating.

As you can see from the screenshot from me and @alturismo we are both on this driver version.

 

The boot issue can also be caused by something else, but I have not heard back from the user yet.

 

3 hours ago, david279 said:

thought he had an issue with the newest driver.

I don't think this plugin caused this; it is very unlikely that it prevents the server from booting.

Link to comment

I have an HDMI cable plugged into a monitor for all of these tests. I have switched to UEFI in the BIOS and double-checked that both Resizable BAR and Above 4G Decoding are enabled. I am able to get further in the process with the 2070m super than with the EVGA 3070, but I am still unable to HW transcode in Plex with either. Currently, I get a 1003 network error with the 2070, and the 3070 is still not detected by the system. I have attached diagnostics for the 2070. Maybe they will show more than the earlier logs from the 3070. quasar-diagnostics-20220116-1250.zip

Link to comment
