[Plugin] Nvidia-Driver


ich777


18 minutes ago, menseph said:

strange thing is I can also pass it to a VM and I see it in the system devices

If you bind it to VFIO (which is visible in your screenshot), the plugin can't see the card because it's reserved for use in the VM.

 

Please uncheck the two boxes, reboot and it should show up on the plugin page.

Link to comment

I've searched this topic, but haven't found anyone with my specific issue.

 

4 out of every 5 updates, I get a notification email that says this:

Event: Nvidia Driver
Subject: Notification
Description: Found new Nvidia Driver v535.113.01 but a download error occurred! Please try to download the driver manually!
Importance: alert

 

I don't see anything in the unRAID logs that shows a download error or a Nvidia-Driver plugin error. 

 

My networking is working. I can download plugins, updates, transfer files, etc.

 

To get the Nvidia-Driver plugin update, I need to manually click on the "Update & Download" button. 

That works every time. 

 

Could anyone explain to me why the automatic update function isn't working on my server??

Thanks

 

Running unRAID 6.12.4

All Plugins and Dockers are up to date.

 

 

820120140_Screenshot2023-09-22092020.thumb.png.be4e953e2b33044e5c6c52040e53af8f.png

myers19-diagnostics-20230922-0941.zip

Edited by FQs19
uploaded wrong unRAID server diagnostics
Link to comment
1 hour ago, FQs19 said:

Could anyone explain to me why the automatic update function isn't working on my server??

Actually no.

I will look into that. Can you please share a bit more information on that? I'm assuming you've set it to latest otherwise the automatic download function wouldn't work at all.

 

1 hour ago, FQs19 said:

4 out of every 5 updates, I get a notification email that says this

So it does work sometimes, and you then get a notification that the download was successful and that you should reboot the server, correct?

Maybe I changed something that prevents updates from working correctly, but I don't think so, because as far as I recall you are the first user reporting this. And if it works sometimes, it seems more likely that something is going wrong with the download from GitHub.

 

May I ask where in the world you are located?

The driver packages are hosted on GitHub, and some users occasionally report issues with downloads, but I'm not entirely sure why. Maybe the CDN is a bit flaky...

 

I don't think this is likely, but it could of course happen that the driver isn't ready yet when your server checks for updates. That shouldn't really be happening, though, because the time frame for that is only about 15 to 20 minutes.
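If the failures really are transient (a flaky CDN, or a package that isn't published yet), a retry with an increasing delay would normally paper over them. The following is only a minimal sketch of that idea and not the plugin's actual code: `download_with_retry` and the `fetch` callable are hypothetical names, and the attempt count and delays are arbitrary assumptions.

```python
import time
from typing import Callable, Optional


def download_with_retry(
    fetch: Callable[[], bytes],
    attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it succeeds or the attempts are used up.

    The delay doubles after every failed attempt (exponential backoff).
    """
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as error:  # network errors in the stdlib derive from OSError
            last_error = error
            if attempt < attempts - 1:
                # Wait base_delay, then 2x, then 4x, ... before the next try.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"download failed after {attempts} attempts") from last_error
```

In practice `fetch` could wrap something like `urllib.request.urlopen(url).read()`; `urllib.error.URLError` derives from `OSError`, so it would be caught by the handler above.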

 

1 hour ago, FQs19 said:

That works every time. 

Good to hear that at least this is working... :)

Link to comment
First, I hope you saw that I attached the wrong diagnostics file and attached the correct one. 

The server with this issue is Myers19.

 

I have the plugin set to Production and notifications are Enabled. 

The plugin does try to download automatically, but it fails, which is when I get the email notification about the download failure.

I do get email notifications when the download is successful and to reboot. 

I'm located in the eastern part of Pennsylvania, USA. 

Also, I don't have any network Ad blocker installed, like PiHole.

And my other unRAID server, which uses this same plugin, doesn't seem to have this issue with automatically downloading updates. 

Link to comment
3 minutes ago, FQs19 said:

First, I hope you saw that I attached the wrong diagnostics file and attached the correct one. 

Sure thing, I've looked at it but couldn't find anything suspicious at a quick look.

 

2 minutes ago, FQs19 said:

And my other unRAID server, which uses this same plugin, doesn't seem to have this issue with automatically downloading updates. 

Okay, then this is specific to your server but I really can't tell what's happening there...

How many servers do you have? Please note that I use GitHub API calls for the check, but that shouldn't be an issue at all since you can do 50 API calls in one hour. As long as you don't have more than 50 servers you should be good to go, and even then the update check is randomly triggered between 8am and 10am.
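To illustrate why a randomized check window keeps several servers from hitting the API at the same moment, here is a minimal sketch. It is not the plugin's code: `pick_check_time` is a hypothetical helper, and the only detail taken from this thread is the 8am to 10am window.

```python
import random
from datetime import datetime, time, timedelta


def pick_check_time(day: datetime, rng: random.Random) -> datetime:
    """Return a random moment between 08:00 (inclusive) and 10:00 (exclusive)."""
    window_start = datetime.combine(day.date(), time(8, 0))
    # Two hours expressed in seconds; randrange excludes the upper bound.
    offset_seconds = rng.randrange(2 * 60 * 60)
    return window_start + timedelta(seconds=offset_seconds)
```

Each server would draw its own offset, so even machines behind the same public IP are unlikely to run the check in the same second.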

 

This is really strange and I don't have an explanation for that... Sorry.

 

You can at least disable the update check. It doesn't make much sense to always be on the latest driver if you are using the card only for transcoding. If you are using the card for something like Steam Headless from @Josh.5, it makes more sense to always be on the latest driver.

Link to comment
I only have 2 servers. 

This specific server doesn't do anything besides hold data storage. 

The Nvidia 1060 that is in it actually isn't even being used, so I could just uninstall the plugin and be fine.

 

No worries on not being able to figure out the issue. 

I appreciate you looking into it and answering. 

 

Thanks. 

Link to comment

Hello, I have a nvidia quadro p1000 that I am trying to use with my docker containers (specifically plex or jellyfin) and the nvidia driver isn't working with it currently. I've tried changing from latest to specific builds of the drivers and none of them work. What is coming up is posted right below. 

image.thumb.png.0f1fe5e0aac1dd8f5758018e427603be.png

I also had it bound to vfio at boot before and it was giving me the error of NVIDIA-SMI has failed. 

image.thumb.png.0f46f2b7271245313353e56b6dcd5cdd.png

I'm not entirely sure what I'm doing wrong here, so any help would be greatly appreciated. I've attached my diagnostics file as well, if that helps. If you need any more info, please let me know! It also looks like the newest driver on Nvidia's site and the one in the pics above don't match; I don't know if that means anything.

unraidnas-diagnostics-20230925-1917.zip

Edited by SussyJack
added couple sentences for more context
Link to comment
17 minutes ago, ConnerVT said:

Unbind the GPU.

Hey! When I had it unbound, it was coming up with the first picture that I posted. I've just unbound it and it's still coming up with that error. Should I post new diagnostics?

 

Edit: After a couple reboots it seems to have finally shown up. 

 

Edit Edit: it now isn't showing up again. Same error

 

Final edit (hopefully): OK, I think I found what was going on. It was giving me that error because I somehow had that card assigned to a VM when it shouldn't have been. The card is now showing up in the Nvidia driver settings and it all seems to work. I'll keep an eye on it and update if I'm still having issues.

 

 

image.thumb.png.190fdb785f24723d51152be6a1f9e8df.png

Edited by SussyJack
Link to comment
6 hours ago, SussyJack said:

I also had it bound to vfio at boot before and it was giving me the error of NVIDIA-SMI has failed. 

Exactly as @ConnerVT said, unbind it from VFIO:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P1000] [10de:1cb1] (rev a1)
    Subsystem: Dell GP107GL [Quadro P1000] [1028:11bc]
    Kernel driver in use: vfio-pci
    Kernel modules: nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
    Subsystem: Dell GP107GL High Definition Audio Controller [1028:11bc]
    Kernel driver in use: vfio-pci

 

Link to comment
13 minutes ago, Justin Vesmic said:

In GUI Shows everything. But in plex Shows Unknown GPU and won't encode.

 

It looks like your GPU encoding works:

 

image.png.664d508cf0be8587cdc129cdf330e0bc.png

 

I would rather look into Plex and check whether everything is set up correctly, and/or whether you may be limited because the card is already running 4 Tdarr instances (going by the NVENC specs for your card, 5 should be allowed...).

Link to comment
This is my A4500. I have 2 cards installed; my 4060 ain't working tho :(

Screenshot 2023-09-27 033939.png

Screenshot 2023-09-27 033935.png

Screenshot 2023-09-27 034114.png

Edited by Justin Vesmic
Link to comment
4 minutes ago, Justin Vesmic said:

 

This is my A4500 I have 2 cards installed haha my 4060 ain't working tho :(

No worries, both of your cards are detected:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD107 [GeForce RTX 4060] [10de:2882] (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device [1458:4116]
    Kernel driver in use: nvidia
    Kernel modules: nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22be] (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device [1458:4116]
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102GL [RTX A4500] [10de:2232] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:163c]
    Kernel driver in use: nvidia
    Kernel modules: nvidia_drm, nvidia
02:00.1 Audio device [0403]: NVIDIA Corporation GA102 High Definition Audio Controller [10de:1aef] (rev a1)
    Subsystem: NVIDIA Corporation GA102 High Definition Audio Controller [10de:163c]

 

However, you have a lot of ACPI errors in your syslog:

Sep 27 02:45:52 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Sep 27 02:45:52 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Sep 27 02:45:52 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)

 

This usually indicates an error/bug in your BIOS, which I can do nothing about.
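
If you want a quick overview of how often each of these messages shows up, something like the following would do. It is only a sketch: `summarize_acpi_errors` is a hypothetical helper, and the regular expressions are modelled on the three syslog lines quoted above rather than on any formal description of the kernel's log format.

```python
import re
from collections import Counter

# Matches "ACPI Error: ..." as well as "ACPI BIOS Error (bug): ...".
ACPI_ERROR = re.compile(r"kernel: ACPI (?:BIOS )?Error(?: \(bug\))?: (.*)")
# Trailing source tag such as "(20220331/dswload2-477)".
SOURCE_TAG = re.compile(r"\s*\(\d{8}/[\w-]+\)$")


def summarize_acpi_errors(syslog_text: str) -> Counter:
    """Count ACPI error messages, ignoring timestamps and source tags."""
    counts: Counter = Counter()
    for line in syslog_text.splitlines():
        match = ACPI_ERROR.search(line)
        if match:
            message = SOURCE_TAG.sub("", match.group(1).rstrip())
            counts[message] += 1
    return counts
```

Feeding it the contents of the syslog and printing `counts.most_common()` shows whether the same few methods fail over and over, which is what you would expect from a firmware bug.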

Please make sure that you've enabled Resizable BAR and Above 4G Decoding in your BIOS (it could also be called Extended Address Space or similar in the PCI submenu of your BIOS).

 

In which slots do you have the cards? Maybe try to swap the cards in their slots.

 

May I ask why you need two cards for that?

Seems a bit overkill since the A4500 has unlimited transcodes IIRC (you can also check here).

 

Are you sure that you've assigned the right card with the right UUID in the Plex Docker template?
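
With two cards in the system it is easy to mix up the UUIDs. The sketch below parses the output of `nvidia-smi -L`, which prints one line per GPU with its name and UUID, so you can look up the UUID by card name. `parse_gpu_list`, `list_gpus` and `uuid_for` are hypothetical helper names, and I'm assuming the usual `GPU <index>: <name> (UUID: GPU-...)` line format; the UUID is typically what goes into the container's `NVIDIA_VISIBLE_DEVICES` variable, which is also an assumption here since the template field isn't named in this thread.

```python
import re
import subprocess
from typing import Dict, List

GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (GPU-[0-9a-fA-F-]+)\)\s*$")


def parse_gpu_list(text: str) -> List[Dict[str, str]]:
    """Parse `nvidia-smi -L` style text into index/name/uuid records."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            index, name, uuid = match.groups()
            gpus.append({"index": index, "name": name, "uuid": uuid})
    return gpus


def list_gpus() -> List[Dict[str, str]]:
    """Run nvidia-smi on the host and return the parsed GPU list."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return parse_gpu_list(result.stdout)


def uuid_for(gpus: List[Dict[str, str]], name_fragment: str) -> str:
    """Return the UUID of the single GPU whose name contains name_fragment."""
    hits = [g["uuid"] for g in gpus if name_fragment.lower() in g["name"].lower()]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one match, found {len(hits)}")
    return hits[0]
```

For example, `uuid_for(list_gpus(), "A4500")` would give the value to compare against what is entered in the Plex template.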

 

As said, I think the message that is spamming your syslog is the cause of the issue, and it indicates a bug in the BIOS. Please note that you are using consumer hardware, which in most cases is no longer designed for multiple GPUs (with the demise of SLI and so on).

Link to comment
Seems to work on Plex and Tdarr, I feel dumb lol. Returning the 4060 LP and gonna sell my A4500, just bought an A5000 :) Should be good enough for all my Dockers instead of multiple GPUs 😹

Link to comment

Finally giving in and asking for help. Been struggling to get this working for a while. I am looking to use my Nvidia GeForce 1070 FE to HW transcode 4k HEVC 10 Bit files. Was going to use the skylake igpu, but it doesn't look like it supports it. Will instead use the skylake igpu for VMs once I get it working using the Intel GVT-g plug-in. Want to get this working first then will post in that forum topic for help on that if needed.

 

System specs:
Motherboard: Gigabyte ga-z170x-gaming7 updated to the latest bios.
CPU: Intel 6700k skylake
Memory: 64 GB DDR4
GPU: Nvidia GeForce 1070 FE

This Nvidia GPU was used in a bare metal machine and also can be passed through to a VM using its vbios rom. In the VM I was able to install jellyfin and watch 4k HEVC 10 Bit files using my Firefox browser. It is installed in the PCIEx16 slot. PCIe ACS override is disabled. So far, I have the Nvidia-Driver installed and this is what shows up on the settings page and with 'watch nvidia-smi'.

Screenshot_20230928-101238_Firefox_1.thumb.jpg.9eb4a96f46fe52bfc3fa9643da60bb1e.jpgScreenshot_20230928-100142_Firefox_1.thumb.jpg.6c761cff0cd731a19f705a386dfb02f1.jpg

 

Here is my container dialog.

2045667663_Screenshot_20230928-113342_VivaldiBrowser.thumb.png.18ea55ce0a6ddea41d68bcf4d7f46ae5.png

When I try to play a 4k HEVC 10 Bit file, this is the error that shows up. Other video files as well as flac music files don't work either. I get the same playback error.

Screenshot_20230928-110213_Firefox_1.thumb.jpg.097cccf2fae86ab978b1b72900ee9d70.jpg

Here is the Jellyfin dashboard. It looks like it says it's transcoding, but the above playback error shows up after several seconds.

Screenshot_20230928-110403_Firefox_1.thumb.jpg.e803a4e35a47c08ae8e49f3261a7b894.jpg

Screenshot_20230928-110521_Firefox_1.thumb.jpg.5ed481cb52f17420a6e5f71c025d149a.jpg

Here is my diagnostics after getting the playback error.

bluedragon-diagnostics-20230928-1139.zip

 

Any help on this would be greatly appreciated!

 

Edited by Kinspappy
Deleted a duplicate image.
Link to comment
1 hour ago, Kinspappy said:

Finally giving in and asking for help. Been struggling to get this working for a while. I am looking to use my Nvidia GeForce 1070 FE to HW transcode 4k HEVC 10 Bit files.

What container are you using?

 

BTW, you've deleted the wrong duplicate image, I can't open your docker run image.

 

1 hour ago, Kinspappy said:

Want to get this working first then will post in that forum topic for help on that if needed.

This seems more like a configuration error in Jellyfin itself than an issue with the Nvidia Driver plugin since everything is working from what I can see from your Diagnostics.

 

Please try the official container from Jellyfin first and see if that works.

Link to comment
3 minutes ago, Kinspappy said:

Don't know how to install the official jellyfin version as I don't see it in Community Applications.

You can change the repository to jellyfin/jellyfin but be aware that you then have to set up Jellyfin from scratch.

 

I would recommend that you create a post in the support thread for the container, since this doesn't seem to be an Nvidia Driver issue at all.

 

As said above, everything seems to work fine and over here with my Nvidia T400 transcoding is working in the official container.

Link to comment