[Plugin] Linuxserver.io - Unraid Nvidia



10 hours ago, IamSpartacus said:

Tested transcoding six 4K HDR10 50-80Mbps files down to 2-4Mbps streams all at once.  Obviously not at all a real world test because I'll never be transcoding 4K at all (at least not on purpose) but was cool to see how well my 1080Ti handled it.

 

Thanks again so much to the LSIO team.  I owe you all a round.  Donate link?

 

Capture.PNG

Glad to hear it works

 

Here's the donate link: https://www.linuxserver.io/donate

 

Thanks

Link to comment
On 3/20/2019 at 10:03 PM, Xaero said:

image.thumb.png.6fd29df300f7d7ca56740f41c9a6dc6f.png

After running the script posted. You can see the nvidia smi entry highlighted in purple.
And yes, this is with titpetric/netdata from Community Apps.
Do you have the container set to save its configuration to an appdata folder, perhaps? If so, I think on startup it copies or symlinks the one in the appdata folder over the top of the one that script writes to. It also overwrites the file every time the container updates.

I checked, NOT there and no it's not saving data elsewhere. I just updated the container though and re-ran the script - which declared it had already been run for some reason despite my rebuilding\upgrading. Refreshed the screen and it's THERE! I wonder if they did something to the latest release? In any case it's there and this is REALLY cool :)

 

One odd thing - I notice that I have, under users, a set of GUIDs that you don't. The GUIDs I have are for individual containers. I can see usage of various things for each of them - just by GUID and not name. It would be way more useful by name, but anyway I just thought I'd point out this difference as you might also want to be able to monitor individual containers' resource usage. No idea how I ended up with this and you didn't, but let me know if you'd like to compare setup notes to try and get it!

 

Thank You!

Link to comment

Hello everyone! Thanks again, LSIO team, for putting this together and keeping it alive, as well as for working jointly with this community! You all rock!
So, I noticed some strange behavior with the plugin today. Note: I am running unRAID 6.7.0 RC5.
After a reboot, my GPUs' information is right where it needs to be, with all the appropriate details. After about an hour, though, the info disappears. The driver appears to be working correctly and still does transcoding with Plex.
I am not sure what information would be helpful, but I'll give you whatever I can to help figure out what is going on.

Below, I have attached my diagnostic file. (I did not see anything concerning, but another set of eyes wouldn't hurt.)

tower-diagnostics-20190326-0048.zip

Link to comment
On 3/22/2019 at 9:42 PM, BLKMGK said:

I checked, NOT there and no it's not saving data elsewhere. I just updated the container though and re-ran the script - which declared it had already been run for some reason despite my rebuilding\upgrading. Refreshed the screen and it's THERE! I wonder if they did something to the latest release? In any case it's there and this is REALLY cool :)

 

One odd thing - I notice that I have, under users, a set of GUIDs that you don't. The GUIDs I have are for individual containers. I can see usage of various things for each of them - just by GUID and not name. It would be way more useful by name, but anyway I just thought I'd point out this difference as you might also want to be able to monitor individual containers' resource usage. No idea how I ended up with this and you didn't, but let me know if you'd like to compare setup notes to try and get it!

 

Thank You!

The container data becomes present over time. The support page has documentation on how to pass something through to make it happen instantly, but recommends against it for security reasons.

 

If you leave it running the GUIDs should eventually become the container names. I don't normally run netdata.

Link to comment
1 minute ago, CHBMB said:

v6.7.0rc6 uploaded

Damn! You're faster than me at seeing there is a new build available from Unraid! 😮 I'll go read the changes. RC5 has been pretty stable for me. I'm in the process of installing 4 new 8TB WD drives (just started my 2nd drive... so I will only update in 20 hours).

Link to comment
21 hours ago, blinside995 said:

Hello everyone! Thanks again, LSIO team, for putting this together and keeping it alive, as well as for working jointly with this community! You all rock!
So, I noticed some strange behavior with the plugin today. Note: I am running unRAID 6.7.0 RC5.
After a reboot, my GPUs' information is right where it needs to be, with all the appropriate details. After about an hour, though, the info disappears. The driver appears to be working correctly and still does transcoding with Plex.
I am not sure what information would be helpful, but I'll give you whatever I can to help figure out what is going on.

Below, I have attached my diagnostic file. (I did not see anything concerning, but another set of eyes wouldn't hurt.)

tower-diagnostics-20190326-0048.zip

Updated to RC6 and it still seems to show the same behavior.

Link to comment
1 hour ago, blinside995 said:

Updated to RC6 and it still seems to show the same behavior.

On boot the plugin runs this command.

nvidia-smi --query-gpu=gpu_name,gpu_bus_id,gpu_uuid --format=csv,noheader | sed -e s/00000000://g | sed 's/\,\ /\n/g' > /tmp/nvidia

and outputs to a file /tmp/nvidia

When you go to the plugin page it reads the contents of that file to get the GPU info.

cat /tmp/nvidia

I can't tell you why your copy of that file is getting deleted, but that's what I would bet is causing it.  Try running the above command and see what the output is, both from a fresh boot and after the problem occurs.

 

My guess is you have some cleanup script or process that is deleting stuff from /tmp
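
For anyone who wants to script the check above, here is a minimal sketch. The function names (format_gpu_info, check_gpu_file, regenerate_gpu_file) are made up for illustration and are not part of the plugin; the nvidia-smi query and the /tmp/nvidia path are the ones quoted in this post. It assumes GNU sed, which understands \n in the replacement.

```shell
#!/bin/bash
# Sketch only: these helper names are hypothetical, not the plugin's code.

GPU_FILE="${GPU_FILE:-/tmp/nvidia}"   # path quoted in the post above

# Same two sed steps as the quoted command: drop the "00000000:" PCI
# domain prefix, then put each comma-separated field on its own line.
# Assumes GNU sed (for \n in the replacement).
format_gpu_info() {
    sed -e 's/00000000://g' -e 's/, /\n/g'
}

# Report whether the cached file exists and is non-empty.
check_gpu_file() {
    if [ -s "$GPU_FILE" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Rebuild the file, but only if nvidia-smi exists and the query succeeds,
# so a failed query never leaves an empty file behind.
regenerate_gpu_file() {
    command -v nvidia-smi >/dev/null 2>&1 || return 1
    local raw
    raw="$(nvidia-smi --query-gpu=gpu_name,gpu_bus_id,gpu_uuid --format=csv,noheader)" || return 1
    printf '%s\n' "$raw" | format_gpu_info > "$GPU_FILE"
}
```

Running check_gpu_file right after boot and again once the info has vanished should show whether the file itself is going missing; regenerate_gpu_file then puts it back.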

 

Link to comment
5 minutes ago, CHBMB said:

On boot the plugin runs this command.


nvidia-smi --query-gpu=gpu_name,gpu_bus_id,gpu_uuid --format=csv,noheader | sed -e s/00000000://g | sed 's/\,\ /\n/g' > /tmp/nvidia

and outputs to a file /tmp/nvidia

When you go to the plugin page it reads the contents of that file to get the GPU info.


cat /tmp/nvidia

I can't tell you why your copy of that file is getting deleted, but that's what I would bet is causing it.  Try running the above command and see what the output is, both from a fresh boot and after the problem occurs.

 

My guess is you have some cleanup script or process that is deleting stuff from /tmp

 

So, running that did bring back all the info for the GPUs. I'll check and see if some script is messing with that.

Link to comment
26 minutes ago, blinside995 said:

So, running that did bring back all the info for the GPUs. I'll check and see if some script is messing with that.

If you have anything clearing /tmp regularly, or the kernel is trying to clear memory because you are nearly out, the file would get deleted.
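
One way to narrow down what is removing the file is to log its state on a timer and compare the time it vanishes against cron jobs or user scripts. A rough sketch follows; the helper name and the log path are assumptions for illustration, not anything the plugin ships.

```shell
#!/bin/bash
# Hypothetical helper: append one timestamped line per check so the moment
# the file disappears can be matched against scheduled jobs.
WATCH_FILE="${WATCH_FILE:-/tmp/nvidia}"
WATCH_LOG="${WATCH_LOG:-/var/log/nvidia-file-watch.log}"   # assumed log path

log_file_state() {
    local state="missing"
    [ -e "$WATCH_FILE" ] && state="present"
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$WATCH_FILE" "$state" >> "$WATCH_LOG"
}

# Example: check once a minute until interrupted.
# while true; do log_file_state; sleep 60; done
```

The first "missing" line in the log gives a one-minute window to compare against whatever is scheduled on the box.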
 

 

36 minutes ago, CHBMB said:

 


nvidia-smi --query-gpu=gpu_name,gpu_bus_id,gpu_uuid --format=csv,noheader | sed -e s/00000000://g | sed 's/\,\ /\n/g' > /tmp/nvidia

My guess is you have some cleanup script or process that is deleting stuff from /tmp

Why is this being done in lieu of the plugin calling nvidia-smi via exec?
Surely polluting tmpfs is avoidable in this case?

Just curious, there's probably a good reason I can't think of.

Link to comment

I'm pulling my hair out on this one. Would there be any reason to think there may be any benefit in going from 6.6.7 to 6.7.0 rc6? I'm a newb at this, sorry.

 

Edit: okay, I'm dumb. I read others say that copying and pasting introduced unseen characters, but I didn't believe that could be the issue. Guess what: I typed it rather than pasting, and it finally worked. Don't be like me. I wasted so much time swapping in 3 different GPUs, and in the end it was sooo simple; I even read the fix but dismissed it without testing.
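
A quick way to check a pasted value for unseen characters is to look for any byte outside printable ASCII. This is only a sketch; has_hidden_chars is a made-up name, and it assumes the value you pasted should be plain ASCII.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) if the argument contains any byte
# outside the printable ASCII range (space through tilde).
has_hidden_chars() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

# Example:
#   if has_hidden_chars "$pasted_value"; then
#       echo "hidden characters found, type the value by hand instead"
#   fi
```

Note that tabs also count as hidden here, which is usually what you want when checking a single pasted ID.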

Link to comment
14 hours ago, Xaero said:

If you have anything clearing /tmp regularly, or the kernel is trying to clear memory because you are nearly out, the file would get deleted.
 

 

Why is this being done in lieu of the plugin calling nvidia-smi via exec?
Surely polluting tmpfs is avoidable in this case?

Just curious, there's probably a good reason I can't think of.

It's a couple of factors, firstly I know bash better than php :D

Secondly, there is a design element to this, with my current approach (which is very similar to what I use for the DVB plugin) the file should be persistent, except in blinside995's case, but we don't know why he's the only one with this issue yet.

So if it's persistent, the values persist even if the GPU is later used for something else, a VM being the obvious example. I was concerned that if we tried to do it on every page load then, firstly, we'd be polling a GPU that may now be doing something else, risking the integrity of that. Secondly, we're talking about a file of a few bytes here, and Unraid runs entirely from RAM, so as for polluting the /tmp folder, well, that's kind of its purpose.

I suppose I could move it somewhere else, but it'd still be in memory, as that's how Unraid works.
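
To illustrate the idea (in bash rather than the plugin's real page code, which isn't shown here), a reader that only ever touches the cached file might look like the sketch below. The function name is made up, and it assumes the file holds three lines per GPU (name, bus ID, UUID), which is what the boot command should produce.

```shell
#!/bin/bash
# Hypothetical cache-only reader: prints one summary line per GPU from the
# cached file and never calls nvidia-smi, so a GPU that is busy elsewhere
# (e.g. passed through to a VM) is never polled.
print_cached_gpus() {
    local cache="${1:-/tmp/nvidia}"
    local name bus uuid
    [ -r "$cache" ] || { echo "no cached GPU info at $cache"; return 1; }
    # Assumes three lines per GPU: name, bus ID, UUID.
    while IFS= read -r name && IFS= read -r bus && IFS= read -r uuid; do
        printf '%s | %s | %s\n' "$name" "$bus" "$uuid"
    done < "$cache"
}
```

The trade-off is the one described above: the page keeps working whatever the GPU is doing, but it shows nothing if the cached file goes missing.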

Link to comment
12 hours ago, huskycdn said:

I'm pulling my hair out on this one. Would there be any reason to think there may be any benefit in going from 6.6.7 to 6.7.0 rc6? I'm a newb at this, sorry.

 

Edit: okay, I'm dumb. I read others say that copying and pasting introduced unseen characters, but I didn't believe that could be the issue. Guess what: I typed it rather than pasting, and it finally worked. Don't be like me. I wasted so much time swapping in 3 different GPUs, and in the end it was sooo simple; I even read the fix but dismissed it without testing.

Can't say you weren't warned :D

 

Link to comment
7 hours ago, CHBMB said:

I was concerned if we tried to do it on every page load then firstly we're polling a GPU that may now be doing something else.

That'd be the obvious thing I didn't think of. Yeah, if the GPU got passed through to a VM or something it might vanish from nvidia-smi. 
Always learning.

Link to comment

Hi guys, I get this error after installing the Nvidia Unraid build: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."

Any idea what is causing this? I'm quite new to server software and hardware.

I have a 1050 Ti and tried to install Nvidia 6.7.0 rc6 with the 418.43 driver version.

(I used the Unraid Nvidia plugin to install this.)

The same error exists when I switch the GPU to an older GTX 970.

 

EDIT: I tried to install Unraid Nvidia 6.6.6 with the 410.78 driver.

Still the same error :/

 

Mobo: Supermicro X9DRi-LN4+/X9DR3-LN4+, Version REV:1.20A

Cpu:   Intel® Xeon® CPU E5-2670 v2 @ 2.50GHz

Link to comment
2 hours ago, Norrox said:

Hi guys, I get this error after installing the Nvidia Unraid build: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."

Any idea what is causing this? I'm quite new to server software and hardware.

I have a 1050 Ti and tried to install Nvidia 6.7.0 rc6 with the 418.43 driver version.

(I used the Unraid Nvidia plugin to install this.)

The same error exists when I switch the GPU to an older GTX 970.

 

EDIT: I tried to install Unraid Nvidia 6.6.6 with the 410.78 driver.

Still the same error :/

I have the same error, but when I install the Unraid Nvidia version it gets stuck at "checksum passed"; after that nothing happens.

Link to comment
Is there a way to check and see if unRaid even detects that I have a GPU?

Run lspci -v
This shows you all recognised hardware. You should see your GPU (look for the brand name) in the output list along with its address, which you will eventually need if you want to do anything with it.
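
Since lspci -v prints a lot, it can help to filter it down to the NVIDIA entries. A small sketch (filter_nvidia is just an illustrative name; lspci comes from pciutils):

```shell
#!/bin/bash
# Hypothetical helper: keep only the lines of lspci-style output (read from
# stdin) that mention NVIDIA, case-insensitively.
filter_nvidia() {
    grep -i 'nvidia'
}

# Usage on a real system:
#   lspci -nn | filter_nvidia
#   lspci -d 10de:        # 10de is NVIDIA's PCI vendor ID
```

If neither command prints anything, the card is not being enumerated on the PCI bus at all, so the driver has nothing to talk to.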
Link to comment
51 minutes ago, glennv said:


Run lspci -v
This shows you all recognised hardware. You should see your GPU (look for the brand name) in the output list along with its address, which you will eventually need if you want to do anything with it.

Thanks, I can't seem to find any reference to my GPU in there? 😮

https://pastebin.com/p1rWMTXT

Link to comment
3 hours ago, saarg said:

It's not there, so either your card is dead, not properly seated or not enough power. Plugged in the power? 

 

OK, now Unraid is finding it... a little embarrassing. I was using the wrong PCIe slot for my CPU: my mobo can take 2 CPUs and has 3 PCIe slots per CPU, and I was 100% sure the card was in the correct slot. By chance I saw it while glancing over the manual. Now I will check and see if I can get transcoding to work with Emby! :)

Link to comment

I have a question: can you do this on an existing install of unRAID, or does it require a completely new install? There are no CPU limitations with this, are there? I understand that on Windows you have to have a gen 8 CPU; my unRAID server has Xeon CPUs. Will this still work?

Link to comment
  • trurl locked this topic
This topic is now closed to further replies.