[Support] D34DC3N73R - Netdata GLIBC (GPU Enabled)



Netdata with Nvidia GPU monitoring in a container. 

 

⚠️ 5/9/23 - Netdata v1.39.0 - LAST RELEASE OF d34dc3n73r/netdata-glibc  ⚠️ 

 

Everything this image set out to do can now be accomplished with the stock netdata image. I will be updating the template to work with netdata/netdata, and my custom image will no longer be used. The template will still be set up to work with GPUs, but when the switch to the official netdata image happens, users MUST edit python.d.conf or go.d.conf to enable GPU charts.
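For the python.d.conf case, that edit might look like the sketch below. It assumes modules are toggled with a top-level `name: yes` line in python.d.conf; as far as I know go.d.conf nests the same key under `modules:`, so it is not handled here. The helper name enable_python_module is hypothetical and not part of Netdata.

```shell
#!/bin/sh
# Hypothetical helper: make sure "<module>: yes" is set in a python.d.conf-style file.
# Assumption: the file uses flat "name: yes|no" lines, optionally commented out with "#".
enable_python_module() {
  module="$1"
  conf="$2"
  if grep -Eq "^[[:space:]]*#?[[:space:]]*${module}:" "$conf"; then
    # Rewrite the existing (possibly commented) line in place via a temp file.
    _tmp="$(mktemp)"
    sed -E "s/^[[:space:]]*#?[[:space:]]*${module}:.*/${module}: yes/" "$conf" > "$_tmp"
    cat "$_tmp" > "$conf"
    rm -f "$_tmp"
  else
    # No line for this module yet, so append one.
    printf '%s: yes\n' "$module" >> "$conf"
  fi
}
```

Inside the container this could be called as `enable_python_module nvidia_smi /etc/netdata/python.d.conf`, followed by a container restart.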


NOTE: An Nvidia GPU and the Unraid Nvidia Plugin are recommended.
If you wish to use it without a GPU, turn on Advanced View, remove '--runtime=nvidia' from Extra Parameters, and either remove NVIDIA_VISIBLE_DEVICES or set it to 'void'.
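As a rough illustration of the Extra Parameters change, the hypothetical helper below (not part of the template) removes the flag from a parameter string and tidies the leftover spaces. In practice you would simply delete the text in the Unraid template field.

```shell
#!/bin/sh
# Hypothetical helper: drop "--runtime=nvidia" from an Extra Parameters string.
strip_nvidia_runtime() {
  printf '%s\n' "$1" \
    | sed -e 's/--runtime=nvidia//' \
          -e 's/  */ /g' \
          -e 's/^ //' \
          -e 's/ $//'
}

# Example call with the other flags left untouched:
strip_nvidia_runtime "--runtime=nvidia --cap-add SYS_PTRACE --security-opt apparmor=unconfined"
```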

 

d34dc3n73r/netdata-glibc info

This image was created because netdata/netdata uses Alpine, a musl distribution, as its base, and Nvidia drivers are only compatible with glibc distributions. This image uses netdata/netdata as a base and adds the GNU C library so it can run binaries linked against glibc. It does not contain nvidia-smi, but is compatible with nvidia-docker2, nvidia-container-toolkit, and the Unraid Nvidia Plugin.


 

 

TO INSTALL:
This template is available in Community Applications. Search 'netdata glibc' and install.

 

OVERRIDE DIRECTORY:

Override support works like the official Netdata image: 
https://learn.netdata.cloud/docs/installation/installation-methods/docker#configure-agent-containers

This image includes an override directory. The container path /etc/netdata will be mounted to /mnt/user/appdata/netdata/override. The override directory contains placeholder directories, a generated netdata.conf file, and the edit-config script. The edit-config script can be used to edit any stock conf file. For instance, to edit python.d.conf, do the following.
From the container console:

/etc/netdata/edit-config python.d.conf

This command will load the python.d.conf file from the stock configuration directory /usr/lib/netdata/conf.d, using vi as the editor. The edited file will be saved to /etc/netdata and will override the stock configuration when netdata is restarted. Subsequent edits to the file will load it from /etc/netdata.
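The lookup order described above can be sketched as a small shell function: prefer the copy in the override directory, otherwise fall back to the stock one. The function name effective_conf and the two environment variables are hypothetical; they exist only so the sketch can be tried outside the container.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the config file that would take effect.
# Assumption: a file in the override directory replaces the stock file of the same name.
effective_conf() {
  name="$1"
  override_dir="${NETDATA_OVERRIDE_DIR:-/etc/netdata}"
  stock_dir="${NETDATA_STOCK_DIR:-/usr/lib/netdata/conf.d}"
  if [ -f "$override_dir/$name" ]; then
    printf '%s\n' "$override_dir/$name"
  elif [ -f "$stock_dir/$name" ]; then
    printf '%s\n' "$stock_dir/$name"
  else
    # Neither directory has the file.
    return 1
  fi
}
```

From the container console, `effective_conf python.d.conf` should print a path under /etc/netdata once the file has been edited at least once.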

 

 

LINKS:

-----------------------------------------

Github Repository: https://github.com/D34DC3N73R/netdata-glibc

Docker Hub: https://hub.docker.com/r/d34dc3n73r/netdata-glibc

Unraid Template: https://raw.githubusercontent.com/D34DC3N73R/unraid-templates/master/netdata-glibc.xml
-----------------------------------------
Base Project: https://www.netdata.cloud/

Base Github Repository: https://github.com/netdata/netdata

Base Docker Hub: https://hub.docker.com/r/netdata/netdata

Edited by D34DC3N73R
update for Netdata v1.39
Link to comment

@rjlan The current suggestion is to mount a volume as a subdirectory of /usr/share/netdata/web.
To accomplish that, add a new path
Name: Custom Dashboards
Container Path: /usr/share/netdata/web/custom

Host Path: /mnt/user/appdata/netdata/web

After adding your custom dashboard.html to /mnt/user/appdata/netdata/web/ visit http://YOUR.SERVER.IP:19999/custom/dashboard.html

I tested this with their tv.html and their simple demo.html. One caveat: make sure to use full URLs in place of relative links to dashboard.js.

example:
<script type="text/javascript" src="dashboard.js?v20190902-0"></script>
becomes
<script type="text/javascript" src="http://localhost:19999/dashboard.js?v20190902-0"></script>
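If a dashboard file has several such references, the same substitution could be done with sed. This is only a sketch: the function name absolutize_dashboard_js is made up, and it assumes the base URL contains no `|` or `&` characters, which would confuse the sed expression.

```shell
#!/bin/sh
# Hypothetical helper: rewrite relative dashboard.js references to full URLs.
# Reads HTML on stdin and writes the rewritten HTML to stdout.
absolutize_dashboard_js() {
  base_url="$1"
  sed "s|src=\"dashboard\.js|src=\"${base_url}/dashboard.js|g"
}
```

For example, `absolutize_dashboard_js http://localhost:19999 < tv.html > /mnt/user/appdata/netdata/web/tv.html` would write a rewritten copy into the mounted folder.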

 

I'll think about adding this and an override path (for netdata.conf, etc) to the template, but I'd like to see if netdata implements any changes first.

Link to comment
  • 2 weeks later...

Update: Custom configuration support is now enabled. See the netdata configuration guide for details.

/mnt/user/appdata/netdata/override is mapped to /etc/netdata.  The nvidia-smi plugin is now enabled in the stock configuration so GPU support will still work with a blank override directory.

Note: /mnt/user/appdata/netdata/override will be empty by design. Stock configuration files are loaded from /usr/lib/netdata/conf.d/ 
If you'd like to alter one of these config files, you can copy it to the /etc/netdata directory and make edits. 
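A minimal sketch of that copy step, run from the container console, is below. The helper name copy_stock_conf is hypothetical; it refuses to overwrite a file that already exists in the override directory so earlier edits are not lost.

```shell
#!/bin/sh
# Hypothetical helper: copy a stock config file into the override directory.
# Arguments: relative file name, optional stock dir, optional override dir.
copy_stock_conf() {
  rel="$1"
  stock_dir="${2:-/usr/lib/netdata/conf.d}"
  override_dir="${3:-/etc/netdata}"
  # Fail if the stock file does not exist.
  [ -f "$stock_dir/$rel" ] || return 1
  # Only copy when there is no override yet.
  if [ ! -e "$override_dir/$rel" ]; then
    mkdir -p "$(dirname "$override_dir/$rel")"
    cp "$stock_dir/$rel" "$override_dir/$rel"
  fi
}
```

For example, `copy_stock_conf python.d.conf` copies the stock python.d.conf into /etc/netdata, where it can then be edited.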

 

Edit: I've done some work on the image so files from /etc/netdata will appear in the volume mount /mnt/user/appdata/netdata/override. Please reach out if you have any problems.

 

Also, a side note regarding custom dashboards: it appears netdata is working on a React dashboard, so I'm probably not going to put much time into custom dashboards for the current implementation.

Edited by D34DC3N73R
add override update
Link to comment
On 5/11/2020 at 5:14 PM, glompy said:

I fixed this by opening up the docker console and executing the following command

 


chown netdata:netdata -R /usr/share/netdata

 

I'm unable to reproduce this on a clean install. If you didn't edit the default template settings (specifically mounting a volume for custom dashboards), I'm guessing this was a problem with the netdata image and was fixed in a recent build. Please let me know if you run into any other problems.

Edit: it does appear to be a problem with v1.22.0 and was fixed in v1.22.1

Edited by D34DC3N73R
add fix info
Link to comment
  • 9 months later...
  • 2 weeks later...
On 2/21/2021 at 8:16 PM, Megaman69 said:

I'm trying to reverse proxy this with swag. I get

 

File does not exist, or is not accessible: /usr/share/netdata/web/netdata/

 

It's probably something simple, but I'm an idiot. Any suggestions?


Are you trying to use custom dashboards? Does it work locally?

Link to comment
On 3/4/2021 at 4:26 PM, derbo said:

Anyone using this with 6.9? 

 

I haven't used this in a long time and noticed it's now not showing nvidia-smi. Docker variable for Nvidia visible devices is set to all. 


I haven't tried this out with 6.9 yet. Do you have other containers that use the GPU passthrough and are they working? I'll have to look into 6.9 to see if they changed the way GPU settings work.

Link to comment
On 3/8/2021 at 3:05 PM, D34DC3N73R said:


I haven't tried this out with 6.9 yet. Do you have other containers that use the GPU passthrough and are they working? I'll have to look into 6.9 to see if they changed the way GPU settings work.

 

Not sure what changed between my post and yours, but I just tried it with 6.9.1 and the latest, and it's working fine now. /shrug.

Link to comment
  • 2 months later...
  • 1 month later...
On 3/9/2021 at 10:07 PM, derbo said:

 

Not sure what changed between my post and yours, but I just tried it with 6.9.1 and the latest, and it's working fine now. /shrug.

 

I just tried this for the first time in a while and noticed the GPU charts are missing too. The logs show:

 

2021-07-17 00:21:16: python.d INFO: nvidia_smi[nvidia_smi] : chart 'nvidia_smi.gpu0_processes_mem' was suppressed due to non updating
2021-07-17 00:21:16: python.d INFO: nvidia_smi[nvidia_smi] : chart 'nvidia_smi.gpu0_user_mem' was suppressed due to non updating

 

The GPU is working fine with Plex for transcoding and nvidia-smi returns output from the command line. I'll dig deeper another time.
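To see at a glance which charts were dropped, log lines in the format above could be filtered down to just the chart names. This is a sketch that assumes the message wording stays exactly as quoted; the function name suppressed_charts is made up.

```shell
#!/bin/sh
# Hypothetical helper: read python.d log lines on stdin and print the names
# of charts reported as "suppressed due to non updating", one per line.
suppressed_charts() {
  sed -n "s/.*chart '\([^']*\)' was suppressed due to non updating.*/\1/p"
}
```

Assuming the python.d messages reach the container log and the container is named netdata, `docker logs netdata 2>&1 | suppressed_charts` would list the affected charts.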

Edited by Simon
Link to comment
  • 6 months later...

I'm very interested in this docker, but unfortunately I do not have an Nvidia GPU. (Old AMD Radeon HD 5800). Is there any way I can install this docker? I tried installing, but it failed regardless of the "branch" I selected. (I did see the note about "turn on Advanced View and remove '--runtime=nvidia' from Extra Parameters, and remove NVIDIA_VISIBLE_DEVICES or set it to 'void'", but couldn't figure out how to do so.)
Any assistance would be greatly appreciated.

Link to comment
  • 3 weeks later...
On 1/28/2022 at 8:10 AM, StephenCND said:

I'm very interested in this docker, but unfortunately I do not have an Nvidia GPU. (Old AMD Radeon HD 5800). Is there any way I can install this docker? I tried installing, but it failed regardless of the "branch" I selected. (I did see the note about "turn on Advanced View and remove '--runtime=nvidia' from Extra Parameters, and remove NVIDIA_VISIBLE_DEVICES or set it to 'void'", but couldn't figure out how to do so.)
Any assistance would be greatly appreciated.

You can use the standard netdata image, unless you'd like to use it for the override support. In that case, make sure you're in the advanced view (upper right corner after clicking "install")
In "Extra Parameters" delete `--runtime=nvidia` 
Then click "Show more settings ..."
and click "Remove" on the `NVIDIA_VISIBLE_DEVICES` environment variable.

Link to comment
  • 4 months later...

Much thanks for creating this, it's been very helpful.

 

Question: what about the 'Sign Up To Cloud' button in the upper-right?

 

I clicked it, gave an email, got an email taking me to directions for all the ways to connect my systems to netdata's cloud storage.  One of them was for Docker, and the main difference seemed to be that it specified a base 64 rando identity param named NETDATA_CLAIM_TOKEN, and another NETDATA_CLAIM_URL=https://app.netdata.cloud

 

So I edited my netdata-glibc config and added those at the end of Extra Parameters:

 

--runtime=nvidia --cap-add SYS_PTRACE --security-opt apparmor=unconfined -e NETDATA_CLAIM_TOKEN=rando-string-goes-here -e NETDATA_CLAIM_URL=https://app.netdata.cloud

 

I clicked Apply and the container restarted, but the 'Sign Up To Cloud' button doesn't disappear, and after signing in at https://app.netdata.cloud/spaces it says I have no nodes.

 

Has anybody else figured out how to link netdata-glibc with netdata.cloud?

Link to comment
  • 4 months later...

Hello,

I've sorted the claim thing by adding a volume and claiming via script. It survives reboots ;)

Added this volume mount in the template: 

/var/lib/netdata/cloud.d/ -> /mnt/user/appdata/netdata/cloud.d/

As read here: https://learn.netdata.cloud/docs/agent/claim#connect-an-agent-running-in-docker

Quote

For the connection process to work, the contents of /var/lib/netdata must be preserved across container restarts using a persistent volume. See our recommended docker run and Docker Compose examples for details.

(well, this doc is quite outdated because mounting /etc/netdata or /var/lib/netdata won't work as we know...)

 

Then ran this command on host:

docker exec -it netdata netdata-claim.sh -token=TOKEN -url=https://api.netdata.cloud

As per documentation: https://learn.netdata.cloud/docs/agent/claim#using-docker-exec

 

Maybe 'netdata-claim.sh -token=TOKEN -url=https://api.netdata.cloud' also works in the container console from the Unraid GUI instead of SSH'ing into the host (but as I'm an SSH man ...)

 

Happy supervision!

Reynald

Link to comment
