[Plugin] Nvidia-Driver


ich777

9 minutes ago, ConnerVT said:

When I tried "disable_xconfig=true" earlier, I may have been booting with UEFI.  One notable difference I saw on my machine is the text size (and GRUB menu) had smaller text.  In hindsight, this may be why I had the funny looking resolution when I tried this config entry before.

This is caused by the BIOS implementation, and there is mostly nothing I can do about it...

 

However, may I ask why you want to use the iGPU for the console, or rather for the Unraid GUI output?

It wouldn't make much of a difference if you, for example, disabled the iGPU and displayed the GUI through the Nvidia GPU. Of course, you would then have to set disable_xconfig back to false.

Link to comment
5 minutes ago, SBH said:

On the Gainward site, the driver link points to the NVIDIA page. This is the driver listed there for this GPU.

What is wrong with driver version: 470.141.03?

 

From what I see in your Diagnostics the driver is loaded and your card is recognized just fine.

 

BTW, if you want to use this card for hardware transcoding, that's a pretty bad choice because it doesn't even support h265 (HEVC).

 

25 minutes ago, SBH said:

Thanks a lot 

The screenshot that you've posted above is for another Unraid version too...

Link to comment
9 minutes ago, ich777 said:

What is wrong with driver version: 470.141.03?

I thought that it should match NVIDIA's recommendation.

 

9 minutes ago, ich777 said:

BTW, if you want to use this card for hardware transcoding, that's a pretty bad choice because it doesn't even support h265 (HEVC).

Thanks for this input. I was really thinking about using the GPU for HW transcoding and the iGPU for a small VM. In this case, I will switch.

My iGPU is performing well for the HW Transcoding.

 

Thanks for the support

Link to comment
13 minutes ago, SBH said:

I thought that it should match NVIDIA's recommendation.

Yes and no.

 

The reason it is this way for the legacy driver is that I only compile the latest legacy driver that is available when a new Unraid version is released. In this case, Nvidia driver version 470.141.03 was the latest legacy driver when Unraid 6.11.5 was released on November 3rd 2022:

image.png.a0ba25ffd44958d0bfd8b419dd48f732.png

(BTW, 5.19.17 is the kernel version of Unraid 6.11.5. Because the driver depends on the kernel version, this number lets the plugin identify exactly which Unraid version a user is running and which driver package needs to be downloaded.)
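
As a rough illustration of the kernel-version point above, here is a minimal sketch. It assumes that `uname -r` on Unraid prints the kernel version followed by a suffix such as `-Unraid` (an assumption, not something stated in this thread). The function name `kernel_base` is made up for this example and is not part of the plugin.

```shell
# Hypothetical helper, not taken from the plugin: reduce a kernel release
# string such as "5.19.17-Unraid" to the bare version "5.19.17".
kernel_base() {
  # ${1%%-*} removes everything from the first "-" onwards.
  printf '%s\n' "${1%%-*}"
}

# On a running server this would be called as:
#   kernel_base "$(uname -r)"
kernel_base "5.19.17-Unraid"   # prints 5.19.17
```

A bare version like this could then serve as the lookup key for a matching driver build; how the plugin actually does that lookup is not shown in this thread.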

 

This driver should run with all cards which need the legacy driver just fine.

 

On the other hand, for stable Unraid versions I compile every new Nvidia driver (for recent cards) that is released during the life cycle of a specific Unraid version. As you can see, for 6.11.5 there have been a lot since November 3rd:

image.png.7c8f6d3e457d39cc53c63d2c3ea51e99.png

 

 

13 minutes ago, SBH said:

Thanks for this input. I was really thinking about using the GPU for HW transcoding and the iGPU for a small VM. In this case, I will switch.

If you want to use this card for a VM, then please uninstall this plugin. This plugin is only meant for using your Nvidia card in one or more Docker containers.

 

See the first post from this thread:

image.png.74c6f93ee654efdc68b3761fdfd67da0.png

Link to comment
16 minutes ago, ConnerVT said:

A quick follow up.  When I tried "disable_xconfig=true" earlier, I may have been booting with UEFI.  One notable difference I saw on my machine is the text size (and GRUB menu) had smaller text.  In hindsight, this may be why I had the funny looking resolution when I tried this config entry before.  Something to think about.

 

I spoke too soon.  😞

 

A Reboot put me back into that strange display resolution (640x960) on a monitor connected to the iGPU.  (At least it boots to iGPU)

 

A Shutdown/power button start does display correctly (1360x768).  I now at least have a path to get here.

 

Interested if you have any ideas on the reboot/640x960 and how it can be corrected.

Link to comment
1 hour ago, ich777 said:

However, may I ask why you want to use the iGPU for the console, or rather for the Unraid GUI output?

 

Mainly to conserve the resources of the P400 (memory and running processes), and because the iGPU is in the system and paid for, so might as well put it to use.  It may also be the better option working with the KVM, as it is more tightly associated with the hardware if addressing BIOS/setup issues.  I just never expected it to be this much frustration

Link to comment
2 hours ago, ConnerVT said:

A Reboot put me back into that strange display resolution (640x960) on a monitor connected to the iGPU.  (At least it boots to iGPU)

Do you have the RadeonTOP plugin installed?

There is basically nothing I can do about that, but maybe also check whether you have blacklisted amdgpu on your system...
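
One way to check for such a blacklist entry is sketched below. The helper name `find_blacklist` is hypothetical, and the two directories in the example call are assumptions about where modprobe configuration usually lives on an Unraid box (the system directory and the flash drive's config folder); adjust them to your setup.

```shell
# Hypothetical helper: print modprobe config lines that blacklist a module.
# Usage: find_blacklist <module> <dir-or-file>...
find_blacklist() {
  module=$1
  shift
  # -r: search directories recursively, -E: extended regex,
  # -s: stay quiet about paths that do not exist.
  grep -rEs "^[[:space:]]*blacklist[[:space:]]+${module}([[:space:]]|\$)" "$@"
}

# Example call (paths are assumptions about a typical Unraid layout):
#   find_blacklist amdgpu /etc/modprobe.d /boot/config/modprobe.d
```

If the call prints nothing, no matching blacklist line was found in the given locations.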

 

2 hours ago, ConnerVT said:

Interested if you have any ideas on the reboot/640x960 and how it can be corrected.

This no longer has anything to do with the Nvidia Driver plugin, and I'm not too familiar with the GUI mode on Unraid because I don't see a point in running Unraid like that myself. I know there might be some use cases but, as said, not for me...

 

I can only think that the driver won't load because of above mentioned reasons.

 

2 hours ago, ConnerVT said:

Mainly to conserve the resources of the P400 (memory and running processes)

The desktop environment won't do much in terms of performance since this is a really lightweight task even for a T400...

 

2 hours ago, ConnerVT said:

and because the iGPU is in the system and paid for, so might as well put it to use.

If you someday use Jellyfin, you could already use your AMD iGPU for transcoding in the container. ;)

Not too sure about Plex, but I wouldn't bet on them having it ready yet... :D

Link to comment

I didn't think it had anything to do with the nVidia driver or plugin directly.  I was just trying to pick your mind (as the expression goes), for maybe "one of those things you find out while working on other things."  I believe you have more insight into all things related to Unraid's use of GPUs than most anyone.  I really appreciate all the help you have given me.

 

I've been invested in the Plex platform for years before I built an Unraid server.  Used to run Media Browser, until it evolved into Emby.  For those not watching in real time, that was a hot mess.  Been on Plex ever since.  Bought a Lifetime Pass for $75 USD in 2018 and haven't looked back.

Link to comment

I'm sure this has been asked before, but I can't seem to get a concrete answer that works for me.

My Quadro P600 isn't detected in Unraid. It does get picked up in "System Devices", but nothing else.

My logs show a lot of the below error:

Jan 24 23:09:22 io kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0x40:1423)
Jan 24 23:09:22 io kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
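
For anyone wanting to see how often this error shows up, a small sketch follows. The helper name `count_rminit_failures` is made up for this example, and the syslog path in the example call is an assumption about where the log lives; point it at whichever log file you are looking at.

```shell
# Hypothetical helper: count "RmInitAdapter failed" lines in a log file.
count_rminit_failures() {
  # grep -c prints the number of matching lines (0 if there are none).
  grep -c 'NVRM: .*RmInitAdapter failed' "$1"
}

# Example call (the syslog path is an assumption):
#   count_rminit_failures /var/log/syslog
```

This only counts the `RmInitAdapter failed` lines, not the accompanying `rm_init_adapter failed` lines, so each failed initialization is counted once.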

 

I can give more diagnostic logs if needed. Thanks all!

Link to comment
9 minutes ago, UpperCenter said:

Here you go.

May I ask, was the card working before or did you just install it?

May I also ask what you want to use the card for? If you want to use it for transcoding, your Skylake iGPU should do the job just fine up to h265 (HEVC), see: here

 

Please make sure that you've enabled Above 4G Decoding and Resizable BAR support (if you have that option) in your BIOS.

If that doesn't help, try booting with Legacy Boot (CSM) instead of UEFI.

Link to comment
13 hours ago, ich777 said:

May I ask, was the card working before or did you just install it?

May I also ask what you want to use the card for? If you want to use it for transcoding, your Skylake iGPU should do the job just fine up to h265 (HEVC), see: here

 

Please make sure that you've enabled Above 4G Decoding and Resizable BAR support (if you have that option) in your BIOS.

If that doesn't help, try booting with Legacy Boot (CSM) instead of UEFI.

 

I just installed the card; it's brand new. I wanted to use it for 4K transcoding, which the Skylake can't really handle. The mobo doesn't support either Above 4G Decoding or Resizable BAR. I will try CSM boot.

Link to comment
47 minutes ago, ich777 said:

I can't imagine that your board doesn't support Above 4G Decoding. Maybe look for something like Extended Address Space in the PCI section of the BIOS and enable it.

Went into the BIOS and didn't see anything that looked like 4G decoding or Extended Address Space, but there were some additional options that I turned on. It didn't fix the issue. I've enabled "CSM Support", which from what I can see is the only Legacy boot option I have. If it wasn't clear from the diagnostics, I'm using https://www.asrock.com/mb/intel/h110m-hdv/

Thanks again

Link to comment

I think I'm finally stuck.

 

Updated to 6.11.5 (after some system fan issues) and updated the Nvidia driver to 525.85.05.

 

I can see it in the plugin, but I cannot use '--runtime=nvidia'. It just fails with bad parameter, or if I check it I get the below:

image.thumb.png.5e6ab9a06497ad103d21bbb958b51335.png

 

I have tried a full uninstall/reinstall with reboots, changing versions, etc. It did work once, when I was looking through other people's issues and tried 'nvidia-persistenced' then '$(pidof nvidia-persistenced)', but now it just won't take at all. 

 

I'm sure it's something silly, like it always is, haha. Any help would be huge. Diagnostics attached :) 

tower-diagnostics-20230127-0816.zip

Link to comment
2 hours ago, BomB191 said:

I have tried a full uninstall/reinstall with reboots, changing versions, etc. It did work once, when I was looking through other people's issues and tried 'nvidia-persistenced' then '$(pidof nvidia-persistenced)', but now it just won't take at all. 

What exactly do you mean by that? Do you have nvidia-persistenced enabled? If so, you can kill it by running:

kill $(pidof nvidia-persistenced)

from an Unraid terminal.
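
If you are not sure whether the daemon is running at all, a slightly more defensive variant is sketched below. The function name `stop_if_running` is made up for this example; it only wraps the same `pidof` and `kill` commands shown above.

```shell
# Hypothetical helper: kill a process by name only if it is running.
stop_if_running() {
  # pidof exits non-zero when nothing matches; treat that as "nothing to do".
  pids=$(pidof "$1") || return 0
  # $pids is left unquoted on purpose so that several PIDs become several arguments.
  kill $pids
}

# Example call:
#   stop_if_running nvidia-persistenced
```

The difference from the one-liner is that this does not call `kill` with an empty argument list when the process isn't there.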

 

2 hours ago, BomB191 said:

I can see it in the plugin, but I cannot use '--runtime=nvidia'. It just fails with bad parameter, or if I check it I get the below:

Can you please double check if the UUID from the GPU matches in the template? Can you also maybe post a screenshot from the container template so that I can see which parameters you've added for the Nvidia Driver to work?

Please note that with most newer driver versions, the value "all" for the GPU UUID causes issues, and you should always enter the UUID of your card.

Also, please add a Variable, as described in the second post of this thread, with the Key "NVIDIA_DRIVER_CAPABILITIES" and the Value "all".

This should fix the issue.
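
For reference, here is roughly what those template settings amount to on a plain `docker run` command line. This is a sketch, not what the Unraid template generates verbatim. The helper name `nvidia_docker_args` is made up, and the variable names are the ones the NVIDIA container runtime reads (`NVIDIA_VISIBLE_DEVICES` for the card's UUID, `NVIDIA_DRIVER_CAPABILITIES` for the capabilities).

```shell
# Hypothetical helper: print the docker run options for one GPU UUID.
nvidia_docker_args() {
  uuid=$1
  case $uuid in
    GPU-*) ;;  # looks like a card UUID, carry on
    *) echo "expected a UUID starting with GPU-, got: $uuid" >&2; return 1 ;;
  esac
  printf '%s\n' "--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=$uuid -e NVIDIA_DRIVER_CAPABILITIES=all"
}

# Example (needs the Nvidia driver and Docker, so it is not run here):
#   uuid=$(nvidia-smi --query-gpu=uuid --format=csv,noheader | head -n 1)
#   docker run --rm $(nvidia_docker_args "$uuid") <image> nvidia-smi
```

The helper refuses the value "all" on purpose, in line with the advice above to always enter the card's own UUID.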

 

If the above doesn't help, please try clicking the button shown in this image (grafik.png.a6cd51764a63d28a1f1d21d3a10b8ec2.png) on the Docker page, with Advanced View turned on (don't forget to turn it off again).

 

If none of that helps, please try deleting the container (only the container on the Docker page) -> go to the Docker page and at the bottom click Add Container -> from the drop-down select your Unmanic template (this ensures that all the paths and settings from your old template are preserved) -> click Apply to install the container again.

Link to comment
6 hours ago, ich777 said:

What exactly do you mean by that? Do you have nvidia-persistenced enabled? If so, you can kill it by running:

kill $(pidof nvidia-persistenced)

from an Unraid terminal.

I enabled it, then killed it. (Someone else was having power state issues.)

 

6 hours ago, ich777 said:

Can you please double check if the UUID from the GPU matches in the template? Can you also maybe post a screenshot from the container template so that I can see which parameters you've added for the Nvidia Driver to work?

Please note that with most newer driver versions, the value "all" for the GPU UUID causes issues, and you should always enter the UUID of your card.

Also, please add a Variable, as described in the second post of this thread, with the Key "NVIDIA_DRIVER_CAPABILITIES" and the Value "all".

This should fix the issue.

image.thumb.png.5e5b163624a8a5b76745546123c46e4c.png

image.thumb.png.993d174df5f65d3b8c6717de6e59f71f.png

 

Still failed 

image.thumb.png.7f8ffaa5de9e1e7562db09a1fa9c718f.png

 

Tried uninstalling and then reinstalling as per the instructions, and this also failed.

Same with force update

image.png.0e74af974227e363470659a728dde0d4.png

 

 

The weird thing is everything else works:

image.png.72c4448fa966b72e5c3c12c0a6944a9a.png

Driver picks it up OK and so does nvidia-smi (I did try downgrading versions) 

image.png.d01c2b7885ff7305e7d12b371553625d.png

 

Same issue with the Plex Docker container from Plex.

I would expect this to work, but something's weird :( 

 

So, an update: I think it has something to do with /proc/sys/kernel/overflowuid. It's weird; even logged in as root, I don't have permission to delete it or modify its permissions. 

Edited by BomB191
Link to comment
4 hours ago, BomB191 said:

So, an update: I think it has something to do with /proc/sys/kernel/overflowuid. It's weird; even logged in as root, I don't have permission to delete it or modify its permissions. 

A bad parameter error is usually caused by the Nvidia runtime not being installed properly.

Do you have any custom script installed that messes with the daemon.json file?
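
A quick way to look at this is sketched below. It is a crude text check, not a JSON parser. It assumes that the runtime entry references a binary called `nvidia-container-runtime` (as the NVIDIA container toolkit normally configures it) and that the file sits at Docker's default location. The helper name `has_nvidia_runtime` is made up for this example.

```shell
# Hypothetical helper: crude text check for the Nvidia runtime in a daemon.json.
# Returns 0 if the file mentions nvidia-container-runtime, non-zero otherwise.
has_nvidia_runtime() {
  grep -q 'nvidia-container-runtime' "$1"
}

# Example calls (the path is Docker's default and assumed to apply here too):
#   has_nvidia_runtime /etc/docker/daemon.json && echo "runtime entry found"
#   docker info --format '{{json .Runtimes}}'   # what the daemon itself reports
```

If the file check succeeds but `docker info` does not list an nvidia runtime, the daemon was probably started before the file was written or something rewrote it afterwards.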

 

Other driver versions also don't work, as you wrote above?

 

What did you change to fix the fan issues that you were having? Is there maybe a setting that you've changed that could affect the Nvidia Driver?

 

This happened to me once, but a force update of the container fixed it back then on my test server.

Back then I also tried downgrading to a previous Unraid version and upgrading again, but I couldn't reproduce it.

Link to comment
58 minutes ago, ich777 said:

What did you change to fix the fan issues that you were having? Is there maybe a setting that you've changed that could affect the Nvidia Driver?

It was a fan issue with ASUS X470 boards: the fan controller that's now in the base Unraid/Linux image bugs out, causing all fans to stop after a random amount of time, within a week or so. I have a file in 'config\modprobe.d' named 'disable-asus-wmi.conf'

With the text of

'# Workaround broken firmware on ASUS motherboards
blacklist asus_wmi_sensors'

 

1 hour ago, ich777 said:

This happened to me once, but a force update of the container fixed it back then on my test server.

Back then I also tried downgrading to a previous Unraid version and upgrading again, but I couldn't reproduce it.

So an update while I was awaiting a reply.

I downgraded back to Unraid 6.10.3, tried the usual stuff, and it failed.

Updated back to 6.11.5, did the usual thing, and now it works. I have no idea why, but it has persisted through several reboots and transcoding now works on the GPU.

 

Best guess: something just got stuck initially and fixed itself when I went through the downgrade/upgrade cycle.

Link to comment
29 minutes ago, BomB191 said:

Best guess: something just got stuck initially and fixed itself when I went through the downgrade/upgrade cycle.

I really don't know what causes that issue because I can barely reproduce it, and I'm pretty much clueless about what it could be, since Unraid runs from RAM and the package is also installed on boot. This basically means everything is installed fresh on each reboot.

 

I can only imagine that something in Docker prevents it from running properly because that's the only place in the chain where something is stored across reboots.

 

Anyways, glad that everything is now working for you again!

Link to comment
