[Plugin] Nvidia-Driver


ich777


@Jacon this may be a shot in the dark, but take a look at the following.

 

What does this command show in the terminal, where plex is the case-sensitive name of your Docker container?

root@AlsServer:~# docker exec plex ls -la /dev/dri/
total 0
drwxr-xr-x 2 root root       160 Dec 26 09:39 .
drwxr-xr-x 6 root root       360 Dec 26 09:39 ..
crwxrwxrwx 1 plex users 226,   0 Dec 26 09:39 card0
crwxrwxrwx 1 plex users 226,   1 Dec 26 09:39 card1
crwxrwxrwx 1 plex users 226,   2 Dec 26 09:39 card2
crwxrwxrwx 1 plex users 226, 128 Dec 26 09:39 renderD128
crwxrwxrwx 1 plex users 226, 129 Dec 26 09:39 renderD129
crwxrwxrwx 1 plex users 226, 130 Dec 26 09:39 renderD130
root@AlsServer:~#
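
If you just want to know how many render nodes the container can see, the listing above can be counted with a small helper. This is only a sketch: `count_render_nodes` is a made-up name, and it assumes the `ls -la` output format shown above, where the device name is the last field on each line.

```shell
#!/bin/sh
# Hypothetical helper: count DRM render nodes in `ls -la /dev/dri/` output
# read from stdin. Assumes the device name is the last field on each line.
count_render_nodes() {
    awk '$NF ~ /^renderD[0-9]+$/ { n++ } END { print n + 0 }'
}

# Usage (replace "plex" with the exact name of your container):
#   docker exec plex ls -la /dev/dri/ | count_render_nodes
```

For the listing above this would print 3; a result of 1 means there is only one render node and nothing to switch.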

 

If there is more than one card, as here, then take a look at the following file, depending on your Plex Docker image.

image.png.375bbdfb5c46150eb71d0936df7ddba4.png

 

When you open the preferences.xml file, look for the render device listed there.

image.thumb.png.deb60fa88cbcad39de05966e34b98e53.png

 

Change it to your "other" device (in my example it's 130), then restart the Plex container and test.
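
For anyone who prefers to make that change from the terminal, here is a rough sketch. It assumes the render device is stored in a `HardwareDevicePath` attribute whose value ends in a node name such as `/dev/dri/renderD128`. The screenshot is not reproduced here, so check your own file first. `set_render_device` is a made-up helper name, and the path to the file depends on your container.

```shell
#!/bin/sh
# Hypothetical helper: point the HardwareDevicePath attribute at another render node.
#   $1 = path to the preferences file (location depends on your container)
#   $2 = render node number, e.g. 130
# Assumption: the attribute value contains a node name like renderD128.
set_render_device() {
    file=$1
    node=$2
    # Refuse anything that is not a plain number.
    case $node in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Keep a backup, then rewrite only the number after "renderD".
    cp "$file" "$file.bak" || return 1
    sed -E "s#(HardwareDevicePath=\"[^\"]*renderD)[0-9]+#\\1${node}#" "$file.bak" > "$file"
}

# Usage (the path below is a placeholder, not a real location):
#   set_render_device /path/to/preferences.xml 130
```

If the attribute is not in the file at all, the file comes out unchanged, and the backup stays next to it as `.bak`.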

 

This ONLY makes sense if you have more than just renderD128.

Link to comment

@alturismo "No such container exists" when I run that command.

 

I do not have the HardwareDevicePath= option when I open that XML.  I'm using the official Plex docker.

 

Link to comment
1 hour ago, Jacon said:

"No such container exists" when I run that command.

You have to change „plex“ to the exact name of your Plex container.
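
If you are not sure of the exact name, a case-insensitive search over the running containers can help. A small sketch follows; `match_container_name` is a made-up helper, while `docker ps --format '{{.Names}}'` is the standard way to list only container names.

```shell
#!/bin/sh
# Hypothetical helper: print the container names from stdin that contain $1,
# ignoring case, so the exact spelling can be copied into `docker exec`.
match_container_name() {
    grep -i -- "$1"
}

# Usage:
#   docker ps --format '{{.Names}}' | match_container_name plex
```

Whatever it prints is the name to use in place of `plex` in the command above.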

 

1 hour ago, Jacon said:

I do not have the HardwareDevicePath= option when I open that XML.  I'm using the official Plex docker.

Again, I would recommend that you post on the official support thread for Plex, because if it works for you with other containers, this is purely a Plex issue.

Link to comment
9 minutes ago, Jacon said:

Plex-Media-Server is the official name.  I now get 'ls: cannot access 'dev/dri: No such file or directory'

OK, then as @ich777 mentioned, it's more an issue for the Plex forum than for here.

 

2 hours ago, Jacon said:

@alturismo "No such container exists" when I run that command.

 

 

11 hours ago, alturismo said:

What does this command show in the terminal, where plex is the case-sensitive name of your Docker container?

 

;)

Link to comment
3 hours ago, alturismo said:

OK, then as @ich777 mentioned, it's more an issue for the Plex forum than for here.

 

 

;)

As info, I added this to Extra Parameters.  Still no change...reverts to CPU transcoding.

--runtime=nvidia --device=/dev/dri:/dev/dri

 

image.png.d24247acf9ddd2fb2d7e848710f902c3.png

 

I posted this over to the Plex forums too.  Adding info here for reference or in the event someone else has any idea.

Edited by Jacon
Link to comment

Well...I loaded up Jellyfin again for the hell of it, and it also reports back the same XID 31 error for the ffmpeg process.

 

NVRM: Xid (PCI:0000:01:00): 31, pid=29372, name=ffmpeg, Ch 00000008, intr 10000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_8 faulted @ 0x1_ffd0b000. Fault is of type FAULT_PDE ACCESS_TYPE_WRITE
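
To check whether different applications hit the same Xid, the kernel log lines can be boiled down to code, PID and process name. This is only a sketch based on the line format shown above; other driver versions may log it differently, and `parse_xid` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "<xid> <pid> <process name>" for each NVRM Xid
# line on stdin. The pattern is modelled on the single log line quoted above.
parse_xid() {
    sed -n -E 's/.*NVRM: Xid \([^)]*\): ([0-9]+), pid=([0-9]+), name=([^,]+),.*/\1 \2 \3/p'
}

# Usage:
#   dmesg | parse_xid
```

For the line above this gives `31 29372 ffmpeg`.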

 

This is either a hardware or driver problem.  Plex points to the driver, this forum points to Plex and the nvidia developer forums aren't really helpful.

 

All I can gather from nvidia is the XID 31 error means 1) bad driver or 2) bad application request, but since Plex and JF both throw this error, I lean to this being a driver problem.

XID 31: Fifo: MMU Error

 

https://docs.nvidia.com/pdf/XID_Errors.pdf

 

Link to comment
19 minutes ago, Jacon said:

This is either a hardware or driver problem.  Plex points to the driver, this forum points to Plex and the nvidia developer forums aren't really helpful.

 

Quote

XID 31: Fifo: MMU Error

This event is logged when a fault is reported by the MMU, such as when an illegal address access is made by an applicable unit on the chip Typically these are application-level bugs, but can also be driver bugs or hardware bugs.

Or it's really just hardware... Since you have tried all kinds of things, there is not much more anyone can do for you in this case. The drivers are installed and work "sometimes" in "some" apps like HandBrake, and they work overall for many, many users...

 

A GTX 1070 also isn't the newest model anymore; maybe it has been overclocked and is slowly reaching end of life. It can be anything (sadly).

 

In the end, to rule it out for good, you would have to get a spare card to test whether it's the card or the combination of mainboard, CPU, RAM, GPU, ... Or vice versa: if you know somebody nearby, take your card there and test it on a different machine.

Link to comment
This is a MSI GTX 1070 Gaming X which I think comes OC from the factory.  I've read a couple forum posts (not here) where factory overclocking can cause these errors in Linux.  I want to test that theory so I'm curious if there's a way to de-clock the GPU and memory back to nvidia recommended settings to see if that fixes my problem.  I bought this card second-hand for $100 a couple months ago, therefore I want to try and figure this out before hitting the nuclear option and buying a new card, like a p2000.

 

I appreciate your suggestions.  Thanks.

Link to comment
7 minutes ago, Jacon said:

This is a MSI GTX 1070 Gaming X which I think comes OC from the factory.  I've read a couple forum posts (not here) where factory overclocking can cause these errors in Linux.

This only applies to manual overclocking and not factory overclocking.

 

8 minutes ago, Jacon said:

I want to test that theory so I'm curious if there's a way to de-clock the GPU and memory back to nvidia recommended settings to see if that fixes my problem.

Not on Unraid. As said above, this only applies to manual overclocking. No manufacturer would sell a card that is unstable out of the box; just think of the RMAs that a manufacturer selling such cards would get...

 

10 minutes ago, Jacon said:

I bought this card second-hand for $100 a couple months ago,

I thought this was a new card, or at least that you had it installed somewhere and had validated that it works properly?

 

11 minutes ago, Jacon said:

before hitting the nuclear option and buying a new card, like a p2000

I don't recommend getting a P2000, because nowadays it is the most uneconomical card you can buy. It has already happened to a few people here in the support thread that a second-hand P2000 died on them within a couple of months or even days.

 

I would rather recommend that you buy something like a Nvidia T400, T600 or T1000.

You can get a T400 for about 100,- brand new, this card doesn't need external power, has a maximum TDP of 35W and is Turing based.

In terms of transcoding it can do the same as your 1070, 3 simultaneous transcodes.

Link to comment
I bought it used and it worked great for a few months, but then gave me trouble when I had the plugin set for Auto Update and it downloaded the r515 drivers (back to page 102-104).  Previous owner said it worked well but he had it sitting for awhile before giving it to me.

 

Then, all of a sudden it completely stopped working last week.

 

I'll take a look at the T-series cards.  Thanks!

Link to comment
14 minutes ago, Jacon said:

I bought it used and it worked great for a few months, but then gave me trouble when I had the plugin set for Auto Update and it downloaded the r515 drivers (back to page 102-104).

Again, this does not seem like a driver issue to me, because you are one of many people who have the driver installed and nobody else has such an issue with it. For everybody else it seems to be working, or my thread would get spammed with support requests:

grafik.png.817914bf668851d7ef34eb2f9575e705.png

(these are the download counts for the individual driver packages for Unraid 6.11.5)

 

You have to at least see my point of view: so many people have installed the drivers, and none of them has an issue like yours.

Or, to put it another way: if you have Unraid 6.11.5 installed with driver 525.60.13 and someone else has the exact same system and the same driver installed, you are both using the exact same package and nothing is different.

As said before, the drivers are precompiled and compressed into a package, which is downloaded by the plugin and ultimately installed on your system.

 

14 minutes ago, Jacon said:

Previous owner said it worked well but he had it sitting for awhile before giving it to me.

 

Then, all of a sudden it completely stopped working last week.

But maybe there is now some issue with the memory, or the card died in the last week. I really can't tell what changed on your system in the last week; maybe there was some other kind of update, or the card really died...

 

14 minutes ago, Jacon said:

I'll take a look at the T-series cards.  Thanks!

Sorry, but there is nothing more I can do about that. I also know a few people who are using Unraid and the plugin with, for example, a 1050, 1050 Ti, 3060 or 3080 Ti, and none of them has an issue with it.

Link to comment
On 2/18/2022 at 6:42 AM, ich777 said:

Exactly, because this is a Datacenter card.

 

To be honest I even don't know if Docker or their container toolkit is compatible with those cards.

 

What is the exact use case for this card in your server?

 

Not easily because it's not as easy as changing a URL.

The drivers need to be compiled for each individual unRAID version, then the container packages get added, then it is packed up and uploaded to GitHub.

I'm not sure if this has been discussed beyond this, and sorry if it has and it's not feasible, but have you given any more consideration to adding data center driver support?

 

I have some Tesla P4s, which are now selling for around $160 Canadian, making the P4 an extremely affordable and capable little card. It has the same GPU chip as the 1080, just limited to 75 watts and with restricted clocks (no external power necessary; you will need a 3D-printed cooling shroud and a 40x20mm fan, and mine by Noctua runs silent). It does, however, have 2 NVENC chips capable of H.265 8K and HEVC 10-bit, as opposed to most consumer and professional cards, which have only 1 NVENC chip, making it a great choice for a Plex server.

My ideal setup would be to have this as my primary Docker-dedicated GPU and save a video card with video outputs for a VM. It is detected by the Nvidia driver, but Plex crashes whenever it defaults to it to encode. I know it is difficult to get video games running on these cards; however, according to Reddit posts such as this, the default data center driver has no issues running encode and decode operations.

I honestly have no clue how much work it would be to include this, but I know I would greatly appreciate it, and with the prices of these cards seemingly only dropping, I would bet on them becoming a more popular home server choice due to their capabilities and especially their efficiency.

Edited by loganawe
adding a bit about how the p4 doesnt need additional power
Link to comment
19 minutes ago, ich777 said:

Tesla cards are working OOB with this Plugin.

Oh okay, is there anything I have to select? If not, do you have any idea why I'm having trouble getting it to actually transcode? The Plex application shows up in nvidia-smi but never actually does anything; a few seconds later an error pops up, the video crashes, and the list of active apps goes blank again.

Link to comment
Maybe the card is bad or something else wrong with the hardware?

Have you read the second post of this thread and added everything to the container that is necessary for transcoding?

 

A few people have now confirmed that it works with the driver; see this post:

 

Link to comment
7 minutes ago, ich777 said:

Maybe the card is bad or something else wrong with the hardware?

I've been able to get a couple of streams working concurrently off it. For some reason it just would not transcode with the v525.60.11 driver, even after multiple restarts. I've since switched to v470.141.03 and it's been functional enough to make me pleased for now. I am finding it to be much less cooperative than my GTX 1060, however: the desktop app seems to work fine enough, but on mobile there's issues (it will end the video when I select it to transcode, but if I restart playing it will work perfectly fine, even scrubbing around quickly), and the web app just hangs on a black screen. I'll try to test out a few more driver versions to see if anything changes.
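
When switching between driver versions like this, it can help to confirm which branch is actually loaded before testing again. A minimal sketch: `driver_major` is a made-up helper that pulls the major version out of the CSV line printed by `nvidia-smi --query-gpu=name,driver_version --format=csv,noheader`.

```shell
#!/bin/sh
# Hypothetical helper: print the major driver version for each GPU line on stdin.
# Expected input format: "<gpu name>, <driver version>", e.g. "Tesla P4, 470.141.03".
driver_major() {
    awk -F', ' '{ split($NF, v, "."); print v[1] }'
}

# Usage:
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader | driver_major
```

With the v470.141.03 driver mentioned above, this would print 470.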

Link to comment
56 minutes ago, loganawe said:

but on mobile there's issues (it will end the video when I select it to transcode, but if I restart playing it will work perfectly fine, even scrubbing around quickly), and the web app just hangs on a black screen.

This has nothing to do with the driver. This is a known Plex issue if you are using the web client; the mobile apps, however, should all just work fine.

For example, if you set the global streaming quality in your LAN to something lower than the source, it will also work in the web clients.

 

This was also discussed earlier in this thread (about one or two years ago someone reported that to me). Plex currently has a lot of issues with its transcoder. I also read somewhere that the 8Mbit/s setting is broken on Intel QSV transcoding, but I can't remember what the exact issue was <- read that somewhere on the German sub forums.

 

 

Link to comment
On 12/31/2022 at 12:23 AM, ich777 said:

I would rather recommend that you buy something like a Nvidia T400, T600 or T1000.

You can get a T400 for about 100,- brand new, this card doesn't need external power, has a maximum TDP of 35W and is Turing based.

In terms of transcoding it can do the same as your 1070, 3 simultaneous transcodes.

Any opinion on a 1660?  It’s Turing based. 
 

The T400 looks a little light for being able to 4K transcode 3-4 streams. 

Link to comment
1 hour ago, Jacon said:

Any opinion on a 1660?  It’s Turing based. 

I have many people having issues with GTX 1600 series cards, which fall off the bus for no reason or simply won't work at all.

 

Then pick a T600; it has 4GB of VRAM.

 

A T400 is more than capable of 3 simultaneous streams if we are not talking about 4K HFR…

I can transcode 3x 4K streams without any issue on my T400.

 

BTW the T400, T600 & T1000 use the same NVENC chip as the GTX 1600 series.

 

Also keep in mind that for transcoding you don't need a powerhouse of a graphics card anyway. Even my Intel i5-10600 is capable of 4+ 4K transcodes at once with QuickSync and only uses a fraction of the power that you would need with an Nvidia card.

Link to comment
Yeah, I really need 4K HDR transcoding capability with a little future-proofing. I guess I can always exchange a T400 if it isn’t up to the task. 

Link to comment

Hey @ich777!

 

I wondered if you'd implement custom URL support (or accept a PR?) so drivers can be downloaded from another location, namely a locally hosted WebDAV server or another webserver.

That way users could compile their driver packages themselves and host them on their local network.

 

This would be useful for users with GRID cards where the drivers need a subscription. But with that people could use the vGPU functionality of GRID/Tesla/Quadro etc. cards on UnRAID, though mdevctl would need to be compiled for UnRAID/Slackware and be added to NerdTools as well for creating mdev devices.

Edited by shawly
Link to comment
4 minutes ago, shawly said:

I wondered if you'd implement custom URL support (or accept a PR?) so drivers can be downloaded from another location, namely a locally hosted WebDAV server or another webserver.

In general I accept PRs, but only if they extend the plugin so that users can benefit from it.

 

5 minutes ago, shawly said:

That way users could compile their driver packages themselves and host them on their local network.

Can you explain why in a bit more detail?

Link to comment

Just edited my post with more details on why this would be useful. :)

 

  

19 minutes ago, shawly said:

This would be useful for users with GRID cards where the drivers need a subscription. But with that people could use the vGPU functionality of GRID/Tesla/Quadro etc. cards on UnRAID, though mdevctl would need to be compiled for UnRAID/Slackware and be added to NerdTools as well for creating mdev devices.

 

Small correction, the enterprise GRID drivers need a subscription. So they can't be downloaded from public servers. That's why one would need to provide their own webserver to host the files on their internal network for UnRAID to download them.

I mean technically the scripts wouldn't even need a custom webserver, they could just pull the packages from a disk on the UnRAID host.

 

With vGPU support one can create multiple vGPUs that can be used with KVM + QEMU so you can use your GPU in a virtual machine without having to passthrough the whole GPU to a guest and it can be split into multiple vGPUs as well so one can use multiple guests AND simultaneously use the host GPU via Docker.

Edited by shawly
Link to comment
15 minutes ago, shawly said:

Tesla/Quadro

Tesla and Quadro cards are already supported by the plugin.

 

15 minutes ago, shawly said:

GRID

Sorry but this plugin is not designed for Datacenter GPUs.

You can, however, fork it and create your own plugin for these cards and maybe publish it to the CA App, as long as it doesn't violate the Nvidia EULA.

This would also be beneficial to other users.

 

I would also be glad to help if you need anything.

 

17 minutes ago, shawly said:

though mdevctl would need to be compiled for UnRAID/Slackware and be added to NerdTools as well for creating mdev devices.

From my perspective, everything that is needed for the cards to work properly should be integrated into the plugin package itself and not added to NerdTools (by the way, also check out un-get as an alternative to NerdTools ;) ); otherwise it would not be very user-friendly to install.

I also include the Container Runtime and Nvidia Toolkit, which are needed for the cards to work in Docker; otherwise the driver alone would be pretty much useless...

Link to comment
