falconexe

Ultimate UNRAID Dashboard (UUD)


 

Current Release: Version 1.4 | Version 1.5 (Adds Plex/Varken) In Active Development!

 

 

UUD NEWS:

  • 2020-10-09: UUD Version 1.4 is Released
  • 2020-09-28: The UUD is Featured in the Official UNRAID Monthly Newsletter (September 2020)!
  • 2020-09-21: UUD Version 1.3 is Released
  • 2020-09-14: UUD Version 1.2 is Released
  • 2020-09-12: UUD Version 1.1 is Released
  • 2020-09-11: The UUD is Born and Version 1.0 is Released

 

 

Overview:

Welcome to the OFFICIAL UUD forum topic. The UUD is my attempt to develop the Ultimate Grafana/Telegraf/InfluxDB/Plex/Tautulli/Varken dashboard. This entire endeavor started when one of our fellow users, @hermy65, posed a simple-sounding but complex question in another forum topic (see post #3). I decided to give it a shot, as I am an IT professional specializing in enterprise data warehousing and SQL Server. After a few days of hard work, UUD version 1.0 was released. We are currently on Version 1.4, with Version 1.5 in active development. If you are a Grafana developer, or have experience building dashboards/panels for UNRAID, please let me know. I would love to collaborate.

 

 

Version 1.4 Screenshots - Serial Numbers Redacted (Click the Images as They are Very High Resolution):

 

[Screenshots 1-10 of the Version 1.4 dashboard]

 

 

 

Disclaimer: This is based on my 30 Drive UNRAID Array. So this shows an example of a fully maxed out UNRAID setup with max drives, dual CPUs, Dual NICs, etc. You will/may need to adjust panels & queries to accommodate your individual UNRAID architecture. This is a heavily modified and customized version of GilbN's original off of his tutorial website, with new and original code. As such, he is a co-developer on this version. I have spent many hours custom coding new functionality and features based on that original template. Much has been learned and I am excited to see how far this can go in the future. GilbN has been gracious enough to help support my modded version here as he wrote the back-end. Thanks again!

 

 

Developers:

  • Primary Developer: @falconexe (USA)
    • UUD Founder | Active Development | Panels | Database Queries | Look & Feel | GUI | Refinement | Support
  • Co-Developer: @GilbN (Europe)
    • Original Template | Back-end | Dynamics | REGEX | Support | Tutorials

 

 

Contributors:

 

 

Dependencies (Last Updated On 2020-10-09)

  • Docker - InfluxDB
  • Docker - Telegraf
    • Docker Network Type: HOST (Otherwise You May Not Get All Server Metrics)
    • 👉 Create Telegraf Configuration File 👈 (DO THIS FIRST!)
      • Create and Place the Telegraf Configuration File into Directory "mnt/user/appdata/YOUR_TELEGRAF_FOLDER"
      • Enable and Install Telegraf Plugins
        • Telegraf Plugin - [[inputs.net]]
          • Enable in telegraf.conf
        • Telegraf Plugin - [[inputs.docker]]
          • Enable in telegraf.conf
        • Telegraf Plugin - [[inputs.diskio]]
          • Enable in telegraf.conf
          • To Use Static Drive Serial Numbers in Grafana (For DiskIO Queries) Do the Following:
            • Edit telegraf.conf > [[inputs.diskio]] > Add device_tags = ["ID_SERIAL"] > Use ID_SERIAL Flag in Grafana
            • Now Upon Booting, You Don't Have to Worry About SD* Mounts Changing (So Your Graphs Don't Get Messed Up!)
            • You Can Also Set Overrides on the Query Fields to Map the Serial Number to a Common Disk Name Like "DISK01" etc.
        • Telegraf Plugin - [[inputs.smart]]
          • Enable in telegraf.conf
            • Also Enable "attributes = true"
          • Bash Into Telegraf Docker and Run "apk add smartmontools"
        • Telegraf Plugin - [[inputs.ipmi_sensor]]
          • Enable in telegraf.conf
          • Bash Into Telegraf Docker and Run "apk add ipmitool"
        • Telegraf Plugin - [[inputs.apcupsd]]
          • Enable in telegraf.conf
      • Telegraf Docker Config
        • Add New Path (NOTE: This Path Has Now Been Merged Into Atribe's Telegraf Docker Image. Thanks @GilbN & @atribe) [Screenshot]
        • Post Arguments
          • "/bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && telegraf'" [Screenshot]
  • Docker - Grafana
  • CA Plugin: IPMI Tools
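
For anyone who prefers to see it in one place, the plugin bullets above boil down to roughly the following telegraf.conf fragment. Treat it as a minimal sketch, not the exact file used for the UUD: it only shows the section headers plus the two options named in the list (device_tags and attributes), the Docker endpoint line is an assumption about a typical setup, and everything not shown is assumed to stay at Telegraf's defaults.

```toml
# Sketch of the input sections named in the dependency list.
# Assumption: all options not shown stay at Telegraf's defaults.

[[inputs.net]]

[[inputs.docker]]
  # Assumption: the Docker socket is mapped into the Telegraf container.
  endpoint = "unix:///var/run/docker.sock"

[[inputs.diskio]]
  # Tag each disk with its serial so Grafana queries survive sdX renames.
  device_tags = ["ID_SERIAL"]

[[inputs.smart]]
  # Collect per-attribute S.M.A.R.T. data (needs smartmontools in the container).
  attributes = true

[[inputs.ipmi_sensor]]
  # Needs ipmitool in the container.

[[inputs.apcupsd]]
```

If a section header is still commented out (a leading #), Telegraf will not load that plugin, so check the "Loaded inputs" line in the Telegraf log after restarting.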

 

 

  • NON SERVER HARDWARE (If You Cannot Use "IPMI" and Need to Use "Sensors")
    • As an alternative to IPMI for monitoring CPU/System/Aux temps, you can try the Sensors plugin.
      • Telegraf Plugin - [[inputs.sensors]]
        • Enable in the Telegraf Config (Uncomment It)
        • Bash into the Telegraf Docker and Execute "apk add lm_sensors"
      • Stop All 3 Dockers (Grafana > Telegraf > InfluxDB)
      • If You Want to Keep This Plugin in Perpetuity, You Will Need to Modify Your Telegraf Docker Post Arguments (Adding lm_sensors):
        "/bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && apk add lm_sensors && telegraf'"
      • Start All 3 Dockers (InfluxDB > Telegraf > Grafana)
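
As a rough illustration of the "uncomment it" step above, the enabled section could look something like this. It is only a sketch: the timeout line is optional and is shown commented out at what I understand to be the plugin's default value, so adjust or drop it as needed.

```toml
# Sketch: sensors input enabled (requires lm_sensors inside the Telegraf container).
[[inputs.sensors]]
  # Optional: maximum time the `sensors` command is allowed to run.
  # timeout = "5s"
```

The measurement and field names you get back depend entirely on your motherboard's sensor chips, which is why the template sensor queries in the dashboard need to be adapted per system.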

 

 

Dashboard Variables (Update These For Your Server): 

[Screenshot: dashboard variables]

 

 

 

Compatible With:

 

 

Let me know if you have any questions or are having any issues getting this up and running if you are interested. I am happy to help. I haven't been this geeked out about my UNRAID server in a very long time. This is the cherry on top for my UNRAID experience going back to 2014 when I built my first server. Thanks everyone!

 

 

VERSION 1.4 (Latest)

Ultimate UNRAID Dashboard - Version 1.4 - 2020-10-09 (falconexe).json

 

VERSION 1.3 (Deprecated)

Ultimate UNRAID Dashboard - Version 1.3 - 2020-09-21 (falconexe).json

 

VERSION 1.2 (Very Deprecated)

Ultimate UNRAID Dashboard - Version 1.2 - falconexe.json

 

 

Edited by falconexe

LATEST RELEASE NOTES:

 

49 minutes ago, falconexe said:

The Ultimate UNRAID Dashboard Version 1.4 is here! This is a MASSIVE 😁 update adding many new powerful features, panels, and hundreds of improvements. The main goal of this release is to increase usability and simplify the dashboard so more people can modify it without getting lost in REGEX and having to ask for support as often. As a result, the most complex queries have been rewritten in a way that is clear and transparent, while still remaining just as powerful. Finally, I have added requested features and thrown in some new bells and whistles that I thought you guys would like. As always, I'm here if you need me. ENJOY!

 

 

Highlights:

  • Keep it Simple - Added User Transparency Back Into Dashboard by Removing REGEX on Certain Panels
    • This Will Make it Extremely Easy to Customize the Dashboard to Your Specific Needs/Requirements
    • You Can Now See Exactly How Certain Panels are Derived and Making Modifications is Self Explanatory
    • This Will Also Make Support MUCH Easier For Everyone!
  • Multi-Host Support
    • Change the Host Drop Down Variable and Monitor Another Host Instantly
    • Added the Host Variable to Every Single Panel
    • The Entire Dashboard Can Now Monitor Any Host in Real Time With a Single Variable Change Via Drop Down Menu!
  • Initial Support For Non Server Hardware
    • Initial Support For Sensors Plugin to Monitor Non Server Hardware (Only Used If IPMI Is NOT Supported on Your Hardware)
    • Requires New "sensors" Plugin (See Dependencies Section on Post #1)
    • Added Template Sensor Queries (Disabled By Default)
      • You Will Need to Modify These Example Queries As Required For Your Non Server Hardware
      • These are Just Building Blocks to Help Those Who Cannot Use IPMI
      • Please See the Forum Topic For Detailed Help!    
  • Initial Support For Unassigned Drives
    • Added Ability For Unassigned Drives Via 2 Variables (Serial and Path)
    • Added Unassigned Drives to Panels Throughout Dashboard Where Applicable
    • Default Dashboard Comes With Only 1 Unassigned Path Variable
      • You Will Need to Add Additional Path Variables to Include/Exclude Multiple Unassigned Drive Paths
  • Support For Multiple Cache Drives in DiskIO Graphs
  • Support For Multiple Unassigned Drives in DiskIO Graphs
  • Monitoring of ALL System Temps
  • Monitoring of ALL System Voltages
  • Monitoring of ALL System Fans
  • Monitoring of RAM DIMM Temps
  • Further GUI Refinements to Assist with Smaller Resolution Monitors
  • Variable Changes
    • Removed Redundant And/Or Unneeded Variables
      • Cleans Up and Reduces Clutter Of Upper Variable Menu
    • Re-Ordered Variables
      • Smaller Length Variables Are Now First (Typically Row 1)
      • Longer Length Variables Are Now Last (Typically Row 2)
    • Standardized Dashboard to Use Single Datasource Instead of 3
      • Before: Telegraf/Disk/UPS
      • After: Telegraf
      • This Also Keeps the Variables Menu Cleaner With Less Clutter (2 Less Variables!)
    • Standardized All Variables Names in Title Case, Logical Prefixes, and Added Underscores to Separate Words
    • Shortened Variable Label Text When/Where Possible
  • Changed All Panels to Use Default Min Interval Setting of Datasource
    • Set Once on the Datasource and All Panels Not Explicitly Set Will Auto Adjust
    • Only Those Panels Different From the Default Min Interval Are Now Explicitly Set (Example: Array Growth)
  • Modified and Added New Auto-Refresh Time Interval Options In Drop Down Menu
    • Now: 30s,1m,5m,10m,15m,30m,1h,2h,6h,12h,1d
  • Replaced All "Retro LCD" Bar Gauges With "Basic" (Cleaner GUI With Unified Aesthetic)
  • Adjusted All Panel Thresholds to Be More Accurate on Color Changes (See Bug Fixes)
  • Added GROUP BY "time($__interval)" To All Panels
    • Increases Overall Dashboard Performance
  • Removed Min/Max/Avg Values From All Line Graphs to Decrease Screen Width Requirements
    • Shows More Data on Smaller Screens
  • Corrected Various Grammatical Errors
  • Bug Fixes and Optimizations
  • Hundreds of Other Quality of Life and Under the Hood Improvements
    • You Can't See Them, But They're There...In Code...LOTS OF CODE

 

Bug Fixes:

  • Changed Remaining Panels Using FROM "autogen" to "default"
  • Updated All Aliases to Match Panel Names
    • There Were Still Some Discrepancies
  • Adjusted All Threshold Values to Be 1/10th Below Desired Measurement
    • Forces Color Change on Next Whole Number
    • Example: 90% Is Supposed to Be Red, But Would Still Show Preceding Orange Threshold Color (89.9% Resolves This)

 

New Panels:

  • Overwatch
    • System Temps
      • Monitors ALL System Temps (Including CPU)
      • Uses IPMI Unit "degrees_c" to Pull Values Instead of Individual Names
      • Added/Modified Panel Description
    • System Power
      • Monitors ALL System Voltages
      • Uses IPMI Unit "volts" to Pull Values Instead of Individual Names
      • Added/Modified Panel Description
    • Fan Speeds (Replaces Fan Speed Gauges):
      • Monitors ALL System Fans 
      • Uses IPMI Unit "rpm" to Pull Values Instead of Individual Names
      • Also Fixes Issue Where Labels Were Not Being Dynamically Generated
      • Added/Modified Panel Description
    • RAM Load:
      • Shows Current RAM Usage %
      • Replaces RAM Used %
  • Disk I/O
    • Unassigned I/O (Read & Write)
      • Adds Support to Monitor Disk I/O of Unassigned Drives
      • Does Not Show Min/Max/Avg Values On Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens
      • Added Ability to Show Multiple Unassigned Drives by Serial Number
  • Disk Overview
    • Unassigned Storage
      • Adds Support to Monitor Storage of Unassigned Drives
  • Detailed Server Performance
    • RAM DIMM Temps 
      • Adds Support to Monitor RAM DIMM Temps
      • Uses IPMI & REGEX

 

Panel Changes:

  • Overwatch
    • ALL Subpanels
      • Overhauled Look and Feel
    • Array Total
      • Added Sparkline Graph
    • Array Utilized
      • Added Sparkline Graph
    • Array Available
      • Added Sparkline Graph
    • Array Utilized %
      • Added Sparkline Graph
    • Cache Utilized
      • Added Sparkline Graph
    • Cache Utilized %
      • Added Sparkline Graph
    • CPU Load
      • Added Sparkline Graph
    • RAM Load
      • Added Sparkline Graph
    • 1GbE Network
      • Renamed Panel
      • Changed Orientation to Vertical
      • Added Sparkline Graph
    • 10GbE Network
      • Renamed Panel
      • Changed Orientation to Vertical
      • Added Sparkline Graph
    • Array Growth (Year)
      • Renamed Panel
        • Previously Named "Array Growth (Annual)"
  • DISK I/O
    • Cache I/O (Read & Write)
      • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens
      • Added Ability to Show Multiple Cache Drives by Serial Number
    • Array I/O (Read)
      • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens
    • Array I/O (Write)
      • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens
  • Disk Overview
    • Array Disk Storage
      • Added Used % Field
      • Now Used to Indicate Drive Free Space By Color
      • Modified Thresholds to Be More Accurate
    • Total Array Storage
      • Renamed Panel
        • Previously Named "Array Storage"
      • Added Used % Field
      • Now Used to Indicate Drive Free Space By Color
      • Modified Thresholds to Be More Accurate
    • Drive Temperatures
      • Renamed Panel
        • Formerly "Drive Temperatures (Celsius)"
      • Added Support For Unassigned Drives
  • Detailed Server Performance
    • Network Interfaces (RX)
      • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens
    • Network Interfaces (TX)
      • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens
    • Network 1GBe
      • Renamed Panel
        • Formerly "Network 1GBe (eth0)"
        • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
          • Shows More Data on Smaller Screens
    • Network 10GBe
      • Renamed Panel
        • Formerly "Network 10GBe (eth2)"
        • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
          • Shows More Data on Smaller Screens
    • RAM
      • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens
    • CPU Package
      • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens
    • CPU 01 Load
      • Renamed Panel
        • Formerly "CPU 01"
        • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
          • Shows More Data on Smaller Screens
      • Removed REGEX and Manually Set Cores Individually
        • Increases Supportability
        • Makes it Easier For Novice Users by Increasing Query Transparency
        • Ensures Tags Stay Ordered Numerically (1,10,11...2,20,21... Is Now 1,2,...10...20...)
        • Renamed Each Core With +1 Array Order Naming (Core 00 Now = Core 01...)       
    • CPU 02 Load
      • Renamed Panel
        • Formerly "CPU 02"
        • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
          • Shows More Data on Smaller Screens
      • Removed REGEX and Manually Set Cores Individually
        • Increases Supportability
        • Makes it Easier For Novice Users by Increasing Query Transparency
        • Ensures Tags Stay Ordered Numerically (1,10,11...2,20,21... Is Now 1,2,...10...20...)
        • Renamed Each Core With +1 Array Order Naming (Core 00 Now = Core 01...)
    • CPU 01 Core Load
      • Changed Bar Gauge Type From "Retro LCD" to "Basic"
      • Changed Bar Gauge Orientation to Vertical
    • CPU 02 Core Load
      • Changed Bar Gauge Type From "Retro LCD" to "Basic"
      • Changed Bar Gauge Orientation to Vertical
    • Fan Speeds
      • Renamed Panel
        • Formerly "IPMI Fan Speeds"
      • Removed Min/Max/Avg Values From Line Graph to Decrease Screen Width Requirements
        • Shows More Data on Smaller Screens

 

Updated Panel Descriptions:

  • Overwatch
    • System Temps
      • Note: Uses IPMI
    • System Power
      • Note: Uses IPMI
    • Fan Speeds
      • Note: Uses IPMI
    • Array Total:
      • Note: Change Path to "mnt/user" if Cache Drive is Not Present
    • Array Utilized
      • Note: Change Path to "mnt/user" if Cache Drive is Not Present
    • Array Available
      • Note: Change Path to "mnt/user" if Cache Drive is Not Present
    • Array Utilized %
      • Note: Change Path to "mnt/user" if Cache Drive is Not Present
    • Array Growth (Day)
      • Note: Change Path to "mnt/user" if Cache Drive is Not Present
    • Array Growth (Week)
      • Note: Query Options > Min Interval Must Match on Week/Month/Year to Stay in Sync (Set to 2 Hours by Default For Performance Reasons) - Change Path to "mnt/user" if Cache Drive is Not Present
  • Disk Overview
    • Array Disk Storage
      • Note: Uses Variable
    • Array Total Storage
      • Note: Change Path to "mnt/user" if Cache Drive is Not Present
    • Unassigned Storage
      • Note: Uses Variable
    • Drive S.M.A.R.T. Health Summary
      • Removed Description
    • Drive Life
      • Removed Description
  • Detailed Server Performance
    • CPU 01 Core Load
      • Removed Description
    • CPU 02 Core Load
      • Removed Description
    • RAM DIMM Temps
      • Note: Uses IPMI & REGEX

 

Removed/Converted/Deprecated Panels:

  • Overwatch
    • CPU 01 Temp
    • CPU 02 Temp
    • RAM Free %
    • Fan Speed Gauges

 

Variables:

 

[Screenshot: dashboard variables]

 

  • New
    • Drives_Unassigned
      • Used to Select Unassigned Drive(s) From Drop Down Menu
    • Path_Unassigned
      • Used to Set a Single Unassigned Drive Path For Inclusion/Exclusion in Drive Panels
      • Add Additional Unassigned Path Variables to Include/Exclude Additional Unassigned Drive Paths
  • Renamed
    • Host
      • Formerly "host"
    • Datasource_Telegraf
      • Formerly "telegrafdatasource"
    • CPU_Threads
      • Formerly "cputhreads"
    • UPS_Max_Watts
      • Formerly "upsmaxwatt"
    • UPS_kWh_Price
      • Formerly "upskwhprice"
    • Currency
      • Formerly "currency"
    • Drives_Flash
      • Formerly "flashdrive"
    • Drives_Cache
      • Formerly "cachedrives"
    • Drives_Parity
      • Formerly "paritydrives"
    • Drives_Array
      • Formerly "arraydrives"
  • Deprecated
    • diskdatasource
    • upsdatasource

 

 

See Post Number 1 For the New Version 1.4 JSON File!

 

 

Previous Release Notes:

 

Version 1.3

 

Edited by falconexe


RESOURCES:

 

 Original Forum Post Where I Initially Developed UUD Versions 1.0 & 1.1:

 

Tutorials:

 

 

Docker Support: 

 

  • @atribe Docker Repo (Base Dockers): 

 

  • @testdasi Grafana Unraid Stack (Integration): 
Edited by falconexe


Again, great job on the dash!! I'll be checking it out later this evening 😁


Picking up our conversation from the previous thread:

 

Quote

You also need to ensure your Telegraf config is setup under S.M.A.R.T. where you are not explicitly placing drive names in there.

This is what the inputs.smart section says now:

# # Read metrics from storage devices supporting S.M.A.R.T.
 [[inputs.smart]]
#   ## Optionally specify the path to the smartctl executable
#   # path = "/usr/bin/smartctl"
#
#   ## On most platforms smartctl requires root access.
#   ## Setting 'use_sudo' to true will make use of sudo to run smartctl.
#   ## Sudo must be configured to to allow the telegraf user to run smartctl
#   ## without a password.
#   # use_sudo = false
#
#   ## Skip checking disks in this power mode. Defaults to
#   ## "standby" to not wake up disks that have stoped rotating.
#   ## See --nocheck in the man pages for smartctl.
#   ## smartctl version 5.41 and 5.42 have faulty detection of
#   ## power mode and might require changing this value to
#   ## "never" depending on your disks.
    nocheck = "standby"
#
#   ## Gather all returned S.M.A.R.T. attribute metrics and the detailed
#   ## information from each drive into the 'smart_attribute' measurement.
#   # attributes = false
#
#   ## Optionally specify devices to exclude from reporting.
#   # excludes = [ "/dev/pass6" ]
#
#   ## Optionally specify devices and device type, if unset
#   ## a scan (smartctl --scan) for S.M.A.R.T. devices will
#   ## done and all found will be included except for the
#   ## excluded in excludes.
#   # devices = [ "/dev/ada0 -d atacam" ]
#
#   ## Timeout for the smartctl command to complete.
#   # timeout = "30s"

I had not previously edited this section at all, since there was no indication of needing to do so in GilbN's tutorial. I restarted telegraf after making these changes, but now (5-10 minutes later), I'm not seeing any HD data at all. I've updated to your V1.2 dash (replaced v1.1 with the same ID number) and I'm not getting anything.

 

I've noted that from an SSH directly into the server, that `ls` cannot find /usr/bin/smartctl, but `which` seems to be able to find it, so I'm cornfused...

root@NAS:/usr/bin# which smartctl
/usr/sbin/smartctl
root@NAS:/usr/bin# ls -la sma*
/bin/ls: cannot access 'sma*': No such file or directory

Of course, I realized that my drives were all spun down, and, theoretically, it should have been skipping data gathering for them. However, I'm still getting nothing even 10 minutes after clicking "spin up all drives".

Edited by FreeMan

40 minutes ago, FreeMan said:

Of course, I realized that my drives were all spun down, and, theoretically, it should have been skipping data gathering for them. However, I'm still getting nothing even 10 minutes after clicking "spin up all drives".

 

I have this line commented out. Try that.

# nocheck = "standby"

 

Please also screenshot one of the queries where you are trying to select any drive data. I need to look at how/what it is trying to do. Maybe the Array I/O section or drive temps panel would be a good place to start. In all cases, you will need to correctly pick your drive(s) in these queries.

 

After some other searching, you may have to enable S.M.A.R.T. on your drives explicitly. I found the below command. Where "X" is your correct last drive kernel letter. However, you really shouldn't have to do this as my drives had smart natively within UNRAID. Nothing special had to be done. My guess is that you don't need to do this, but wanted to throw it out there as some people couldn't get S.M.A.R.T. to work until they ran this command (use at your own risk).

 

smartctl -s on /dev/sdX

Edited by falconexe

11 minutes ago, falconexe said:

After some other searching, you may have to enable S.M.A.R.T. on your drives explicitly. I found the below command. Where "X" is your correct last drive kernel letter. However, you really shouldn't have to do this as my drives had smart natively within UNRAID. Nothing special had to be done. My guess is that you don't need to do this, but wanted to throw it out there as some people couldn't get S.M.A.R.T. to work until they ran this command (use at your own risk).

My thought is that this shouldn't be necessary as I'm getting SMART reports in the unRAID WebGUI by default as well. After uncommenting the [[inputs.smart]] section header, I noticed this in the log:

2020-09-14T18:40:25Z I! Starting Telegraf 1.15.3
2020-09-14T18:40:25Z I! Using config file: /etc/telegraf/telegraf.conf
2020-09-14T18:40:25Z I! Loaded inputs: processes hddtemp netstat disk diskio docker sensors kernel apcupsd net smart cpu swap mem system
2020-09-14T18:40:25Z I! Loaded aggregators:
2020-09-14T18:40:25Z I! Loaded processors:
2020-09-14T18:40:25Z I! Loaded outputs: influxdb
2020-09-14T18:40:25Z I! Tags enabled: host=NAS
2020-09-14T18:40:25Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"NAS", Flush Interval:10s
2020-09-14T18:40:25Z I! Starting Telegraf 1.15.3
2020-09-14T18:40:25Z I! Using config file: /etc/telegraf/telegraf.conf
2020-09-14T18:40:25Z I! Loaded inputs: processes hddtemp netstat disk diskio docker sensors kernel apcupsd net smart cpu swap mem system
2020-09-14T18:40:25Z I! Loaded aggregators:
2020-09-14T18:40:25Z I! Loaded processors:
2020-09-14T18:40:25Z I! Loaded outputs: influxdb
2020-09-14T18:40:25Z I! Tags enabled: host=NAS
2020-09-14T18:40:25Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"NAS", Flush Interval:10s
2020-09-14T18:40:30Z E! [inputs.smart] Error in plugin: smartctl not found: verify that smartctl is installed and that smartctl is in your PATH
2020-09-14T18:40:30Z E! [inputs.smart] Error in plugin: smartctl not found: verify that smartctl is installed and that smartctl is in your PATH

So it seems that it's not able to find smartctl at all, which is really odd since:

root@NAS:~# which smartctl
/usr/sbin/smartctl

indicates that it's there and on the path.

Also:

root@NAS:~# smartctl -a /dev/sdb
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
<snip>

so from a command line, at least, I can access smartctl

 

I was not getting any UPS info, then I discovered the [[inputs.apcupsd]] section and uncommented the header there. Now UPS data is working fine.

2 minutes ago, FreeMan said:

I was not getting any UPS info, then I discovered the [[inputs.apcupsd]] section and uncommented the header there. Now UPS data is working fine.

 

I added this to the list of plugins to enable in the announcement post. Thanks for catching that. I forgot about that one...


I noticed you are still loading HDDTemp. You can comment that out in the config since we will be using S.M.A.R.T. exclusively to pull temps.

 

2020-09-14T18:40:25Z I! Starting Telegraf 1.15.3
2020-09-14T18:40:25Z I! Using config file: /etc/telegraf/telegraf.conf
2020-09-14T18:40:25Z I! Loaded inputs: processes hddtemp netstat disk diskio docker sensors kernel apcupsd net smart cpu swap mem system

 

7 minutes ago, FreeMan said:

My thought is that this shouldn't be necessary as I'm getting SMART reports in the unRAID WebGUI by default as well. After uncommenting the [[inputs.smart]] section header, I noticed this in the log:


2020-09-14T18:40:30Z E! [inputs.smart] Error in plugin: smartctl not found: verify that smartctl is installed and that smartctl is in your PATH

So it seems that it's not able to find smartctl at all, which is really odd since:


root@NAS:~# which smartctl
/usr/sbin/smartctl

indicates that it's there and on the path.

 

 

 

Agreed if you are getting S.M.A.R.T. reports anywhere, then it is working.

 

Perhaps you need to explicitly tell it the path in the Telegraf config for some reason on your system. Can you try uncommenting this line and ensuring the path for your server is correct?

 

# # Read metrics from storage devices supporting S.M.A.R.T.
[[inputs.smart]]
#   ## Optionally specify the path to the smartctl executable
#   # path = "/usr/bin/smartctl"
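
One thing worth keeping in mind here: Telegraf runs inside its own Docker container, so the path has to point at smartctl as the container sees it, not as the UNRAID host sees it. Below is a sketch of the uncommented section. It assumes smartmontools has been installed inside the Telegraf container (see the dependencies on post #1) and that running "which smartctl" inside that container reports /usr/sbin/smartctl; please verify the location on your own setup before copying it.

```toml
# Sketch: smart input with an explicit smartctl path.
[[inputs.smart]]
  # Assumption: this is where `which smartctl` points *inside the container*.
  path = "/usr/sbin/smartctl"

  # Needed for the per-attribute data the dashboard queries.
  attributes = true
```

If the binary is missing from the container entirely, no path value will help, and the log will keep reporting that smartctl cannot be found.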

2 minutes ago, falconexe said:

Perhaps you need to explicitly tell it the path in the Telegraf config for some reason on your system. Can you try uncommenting this line and ensuring the path for your server is correct?

I'd thought about that...

 

Uncommented (removed both #) and at least the error is different, and different is progress. Right! Right??

2020-09-14T19:00:27Z I! Starting Telegraf 1.15.3
2020-09-14T19:00:27Z I! Using config file: /etc/telegraf/telegraf.conf
2020-09-14T19:00:27Z I! Loaded inputs: mem apcupsd disk net netstat diskio kernel processes system cpu docker sensors smart swap
2020-09-14T19:00:27Z I! Loaded aggregators:
2020-09-14T19:00:27Z I! Loaded processors:
2020-09-14T19:00:27Z I! Loaded outputs: influxdb
2020-09-14T19:00:27Z I! Tags enabled: host=NAS
2020-09-14T19:00:27Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"NAS", Flush Interval:10s
2020-09-14T19:00:27Z I! Starting Telegraf 1.15.3
2020-09-14T19:00:27Z I! Using config file: /etc/telegraf/telegraf.conf
2020-09-14T19:00:27Z I! Loaded inputs: mem apcupsd disk net netstat diskio kernel processes system cpu docker sensors smart swap
2020-09-14T19:00:27Z I! Loaded aggregators:
2020-09-14T19:00:27Z I! Loaded processors:
2020-09-14T19:00:27Z I! Loaded outputs: influxdb
2020-09-14T19:00:27Z I! Tags enabled: host=NAS
2020-09-14T19:00:27Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"NAS", Flush Interval:10s
2020-09-14T19:00:30Z E! [inputs.smart] Error in plugin: failed to run command '/usr/bin/smartctl --scan': fork/exec /usr/bin/smartctl: no such file or directory -

Also, I did stop HDDTemp and commented it out of the config so it's not looking for it on startup.

 

Interesting that the log says "Starting Telegraf 1.15.3" twice within the same second with no apparent "Stopping telegraf" message in between.

1 hour ago, FreeMan said:

I'd thought about that...

 

Uncommented (removed both #) and at least the error is different, and different is progress. Right! Right??


2020-09-14T19:00:30Z E! [inputs.smart] Error in plugin: failed to run command '/usr/bin/smartctl --scan': fork/exec /usr/bin/smartctl: no such file or directory -

Also, I did stop HDDTemp and commented it out of the config so it's not looking for it on startup.

 

Interesting that the log says "Starting Telegraf 1.15.3" twice within the same second with no apparent "Stopping telegraf" message in between.

Hang tight. I found the fix. Testing now...

@falconexe finally getting this set up, and the main issue I'm running into so far is getting my UPS data to pull in. This is what my telegraf config looks like:

 

# # Monitor APC UPSes connected to apcupsd
[[inputs.apcupsd]]
#   # A list of running apcupsd server to connect to.
#   # If not provided will default to tcp://127.0.0.1:3551
#   servers = ["tcp://127.0.0.1:3551"]
#
#   ## Timeout for dialing server.
#   timeout = "5s"

I'm guessing I need to fill in the IP of my Unraid server here, since I'm using the built-in APC UPS daemon under Settings -> UPS, but when I try that it says no route to host. I'm guessing I need to configure something, but I'm not sure what. Perhaps I cannot use the built-in APC UPS daemon in Unraid?

 

Edited by hermy65

17 minutes ago, falconexe said:

Hang tight. I found the fix. Testing now...

@FreeMan

 

Add this to your Post Arguments on the Docker Edit Page for Telegraf.

 

/bin/sh -c 'apk update && apk add smartmontools && telegraf'

 

This fixes the issue on my side when testing.

2 minutes ago, hermy65 said:

@falconexe finally getting this set up, and the main issue I'm running into so far is getting my UPS data to pull in. This is what my telegraf config looks like:

 


# # Monitor APC UPSes connected to apcupsd
[[inputs.apcupsd]]
#   # A list of running apcupsd server to connect to.
#   # If not provided will default to tcp://127.0.0.1:3551
#   servers = ["tcp://127.0.0.1:3551"]
#
#   ## Timeout for dialing server.
#   timeout = "5s"

I'm noticing that in my telegraf log files this is happening over and over again, even though that server is commented out in the config as referenced above:

 


2020-09-14T20:53:30Z E! [inputs.apcupsd] Error in plugin: dial tcp 127.0.0.1:3551: connect: connection refused

I'm guessing I need to configure something, but I'm not sure what. Perhaps I cannot use the built-in APC UPS daemon in Unraid under the Settings -> UPS section?

I have a meeting for actual work, ha ha. I'll take a look at this later today. But yes, it appears that Telegraf cannot communicate with your UPS. I have an APC 1500 and it works for me. Try putting your server's IP address in the servers line and uncommenting it.

8 hours ago, Roxedus said:

You can use this method to install ipmitool and the sensors at startup; this way you can get auto-updates for telegraf. https://selfhosters.net/docker/telegraf/ipmi/

@Roxedus

 

Thanks again for this tip. What if you need to install more than one package this way? The UUD as currently set up uses both IPMI and S.M.A.R.T. Do you know the syntax for passing multiple arguments?

 

I've tried the following. Individually, they both work, but combined, they do not.

 

Docker Post Arguments:

 

Fails:

  • /bin/sh -c 'apk update && apk add ipmitool && telegraf' /bin/sh -c 'apk update && apk add smartmontools && telegraf'
  • /bin/sh -c 'apk update && apk add ipmitool && telegraf', /bin/sh -c 'apk update && apk add smartmontools && telegraf'
  • /bin/sh -c 'apk update && apk add ipmitool && telegraf' 'apk update && apk add smartmontools && telegraf'

 

Just now, falconexe said:

@Roxedus

 

Thanks again for this tip. What if you need to install more than one package this way? The UUD as currently set up uses both IPMI and S.M.A.R.T. Do you know the syntax for passing multiple arguments?

 

/bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && telegraf

9 minutes ago, GilbN said:

/bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && telegraf

 

9 minutes ago, Roxedus said:

Don't add && between the packages

You guys ROCK. @GilbN, you were missing the close quote on the end BTW...

 

So this works perfectly:

 

Solved: /bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && telegraf'

 


 

I added this in the topic header under Dependencies so new users know to do this.
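
For anyone wondering why the combined attempts failed, here is a minimal sketch of how sh -c treats its arguments. It uses echo as a stand-in for the apk and telegraf steps (my substitution, not the real commands), so it is safe to run anywhere. Only the first string after -c is executed; anything after it becomes positional parameters, which would explain why appending a second /bin/sh -c '...' did nothing useful.

```shell
# Extra words after the -c string become positional parameters ($0, $1, ...),
# so the second command string is never executed. Prints only "first".
sh -c 'echo first' sh -c 'echo second'

# A single -c string chained with && runs each step in order and stops
# at the first step that fails. Prints "first" and then "second".
sh -c 'echo first && echo second'
```

The same reasoning suggests telegraf should be the last step in the chain, since it presumably keeps running in the foreground. As Roxedus hinted, apk add also accepts several package names in one call (apk add ipmitool smartmontools), which should be equivalent to the two chained apk add steps.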

Edited by falconexe

19 minutes ago, falconexe said:

@FreeMan

 

Add this to your Post Arguments on the Docker Edit Page for Telegraf.

 

/bin/sh -c 'apk update && apk add smartmontools && telegraf'

 

This fixes the issue on my side when testing.

@FreeMan

 

Actually, add this instead. It installs both ipmitool and smartmontools automatically whenever the Docker container starts. I believe this is why smartctl was not being found in your path. Without these extra arguments, you would have to open a shell in the container and install them by hand each time the Docker updates/restarts.

 

/bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && telegraf'

 

Report back and let me know if your log is no longer spamming pink.
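
A quick way to confirm the packages actually landed is to check for the binaries from a shell inside the Telegraf container. This is only a sketch: missing_tools is a made-up helper name, and I am assuming the ipmitool package provides a binary called ipmitool, the same way smartmontools provides the smartctl that the log complains about.

```shell
# Hypothetical helper: print the name of every listed tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    # command -v exits non-zero when the tool cannot be found
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Run inside the Telegraf container; no output means both tools were found.
missing_tools smartctl ipmitool
```

If smartctl still shows up as missing after a restart, the Post Arguments probably did not run, so the container log is the next place to look.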

Edited by falconexe

23 minutes ago, hermy65 said:


I'm guessing I need to fill in the IP of my Unraid server here, since I'm using the built-in APC UPS daemon under Settings -> UPS, but when I try that it says no route to host. I'm guessing I need to configure something, but I'm not sure what. Perhaps I cannot use the built-in APC UPS daemon in Unraid?

 

You can. I use it. As long as telegraf runs as host (host network mode), that should work. Is the daemon running on the host?

Edited by GilbN

4 minutes ago, GilbN said:

You can. I use it. As long as telegraf runs as host (host network mode), that should work. Is the daemon running on the host?

Ah, that may be my issue. My telegraf does not run as host, so maybe that's why it's not working?

2 minutes ago, hermy65 said:

Ah, that may be my issue. My telegraf does not run as host, so maybe that's why it's not working?

Yeah you should really run telegraf as host, as you want it to get all the host metrics ;)
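
On the "connection refused" error: the comment in the config says the plugin defaults to tcp://127.0.0.1:3551 when servers is not provided, so leaving [[inputs.apcupsd]] active with servers commented out still polls that address. If Telegraf is not running as host, 127.0.0.1 is most likely the container itself rather than the Unraid server. The sketch below probes an address from inside the container. probe_apcupsd is a hypothetical helper, and it assumes an nc build that supports -z and -w, which not every image ships.

```shell
# Hypothetical helper: take a Telegraf-style "tcp://host:port" entry and
# report whether a TCP connection to it can be opened.
# Assumes nc supports -z (connect without sending data) and -w (timeout).
probe_apcupsd() {
  target=${1#tcp://}   # strip the scheme prefix
  host=${target%:*}    # everything before the last colon
  port=${target##*:}   # everything after the last colon
  if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
    echo "reachable $host:$port"
  else
    echo "unreachable $host:$port"
  fi
}

# The plugin default from the config comment above.
probe_apcupsd tcp://127.0.0.1:3551
```

If the default address is unreachable from inside the container but the daemon is running on the server, that points at the network mode rather than at the UPS settings.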
