Ultimate UNRAID Dashboard (UUD)


32 minutes ago, falconexe said:

 

Finally, let us know when you would like to tackle the APC/CPU temps issue. One thought on CPU temps is that you may not be running on server hardware, and therefore IPMI doesn't apply. In that case you can try using the sensors plugin. I'll be adding this support to version 1.4.

SELECT moving_average("temp_input", 10) FROM "sensors" WHERE ("feature" = 'smbusmaster_0') AND $timeFilter

 

This is how I get my Threadripper temps.
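For anyone unfamiliar with moving_average() in the query above, here is a rough sketch of what the 10 does: each output point is the mean of 10 consecutive readings, so a single momentary spike gets smoothed out. The function and the sample readings are made up for illustration, and this is not a claim about how InfluxDB implements it internally.

```python
def moving_average(values, window):
    """Return the mean of each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical temperature readings in degrees Celsius, with one spike (80).
readings = [40, 42, 44, 46, 80, 46, 44, 42, 40, 38, 36, 34]
smoothed = moving_average(readings, 10)
print(smoothed)
```

Note that 12 readings with a window of 10 yield only 3 smoothed points, which is why the first few points of a panel using a moving average can look sparse.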

Link to post

The main issue I had was with enabling the various plugins in Telegraf.

I do understand the concept of commenting and uncommenting, but I am certainly no programmer and it was not obvious what precisely I was supposed to do:

  • What to uncomment (mostly just the "title" of the section)
  • and how much to remove (the # plus the following space)

On my first tries, I uncommented way too much in those sections and had plenty of errors in the docker log.

Then you don't mention adding the InfluxDB IP address + port, so it was not doing much. Seems silly but I had to figure it out.

That out of the way, I had some trouble adding InfluxDB as a data source within Grafana. I didn't see your link to GilbN's tutorial but did find one on Reddit.

After that, I did some tuning of course, but only within the things I was able to understand. I am going slowly and tackling one issue at a time. :)
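To make the "what to uncomment and how much to remove" point concrete, here is a minimal sketch that strips the "# " from just the section header of one plugin and leaves every other line alone. The helper name and the sample text are hypothetical; the sample only approximates what a stock telegraf.conf section looks like, so check it against your own file.

```python
def uncomment_section_header(conf_text, section):
    """Remove the leading '# ' from the header line of one Telegraf plugin section."""
    target = f"# [[{section}]]"
    out = []
    for line in conf_text.splitlines():
        if line.strip() == target:
            # Drop only the '#' and the single space that follows it.
            out.append(line.replace("# ", "", 1))
        else:
            out.append(line)
    return "\n".join(out)


# Assumed shape of a commented-out section; not copied from a real config.
sample = "\n".join([
    "# # Monitor sensors, requires lm-sensors package",
    "# [[inputs.sensors]]",
    "#   ## Timeout is the maximum amount of time that the sensors command can run.",
    '#   # timeout = "5s"',
])
result = uncomment_section_header(sample, "inputs.sensors")
print(result)
```

Only the header line changes. The InfluxDB address is set the same way in the [[outputs.influxdb]] section, typically as a urls entry pointing at your server's IP and InfluxDB's port (8086 by default).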

Link to post
44 minutes ago, ChatNoir said:

The main issue I had was with enabling the various plugins in Telegraf.

I do understand the concept of commenting and uncommenting, but I am certainly no programmer and it was not obvious what precisely I was supposed to do:

  • What to uncomment (mostly just the "title" of the section)
  • and how much to remove (the # plus the following space)

On my first tries, I uncommented way too much in those sections and had plenty of errors in the docker log.

Then you don't mention adding the InfluxDB IP address + port, so it was not doing much. Seems silly but I had to figure it out.

That out of the way, I had some trouble adding InfluxDB as a data source within Grafana. I didn't see your link to GilbN's tutorial but did find one on Reddit.

After that, I did some tuning of course, but only within the things I was able to understand. I am going slowly and tackling one issue at a time. :)


Thanks for the feedback. I’ve been debating whether or not to just include a default Telegraf config. I’ll probably do this now so all anyone has to do is add their host/IP address in there. That way everything on the UUD will just work without much fussing around with code.

 

Furthermore, I’ll probably end up making COMPLETE install instructions instead of pointing people to other websites. Just need some time to write it all up. I’m currently super deep into developing Version 1.4...and it’s a massive update.

 

We’ll continue to dial the UUD in with each release. It’s a ton of work and planning as you can imagine. So far, everyone is loving it. There is a slight learning curve, but the ROI is tremendous and this entire thing is just plain friggen cool!

Link to post
On 9/25/2020 at 4:55 AM, falconexe said:

 

As an alternative to IPMI for monitoring CPU/System/Aux temps, you can try the Sensors plugin.

  • Enable [[inputs.sensors]] in the Telegraf Config (Uncomment It) [image.png.428640e03b730afa1d36b8cb0f59b753.png]
  • Bash into the Telegraf Docker and Execute "apk add lm_sensors"
  • Stop All 3 Dockers (Grafana > Telegraf > InfluxDB)
  • If You Want to Keep This Plugin in Perpetuity, You Will Need to Modify Your Telegraf Docker Post Arguments To (Adding lm_sensors):
    • "/bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && apk add lm_sensors && telegraf'"

  • Start All 3 Dockers (InfluxDB > Telegraf > Grafana)

Let me know if that works for you.
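The Post Arguments string in the steps above is just a chain of package installs followed by the telegraf command. A small sketch of how that string is put together, in case you need to add or drop a package (the helper name is made up; the package list is the one from the post):

```python
def build_post_arguments(packages):
    """Build the Telegraf container's Post Arguments string for a list of Alpine packages."""
    steps = ["apk update", "apk upgrade"]
    steps += [f"apk add {name}" for name in packages]
    steps.append("telegraf")  # Telegraf itself must be the last command in the chain.
    return "/bin/sh -c '" + " && ".join(steps) + "'"


post_arguments = build_post_arguments(["ipmitool", "smartmontools", "lm_sensors"])
print(post_arguments)
```

Because the packages are installed every time the container starts, they survive container updates, which is the point of putting them in Post Arguments rather than running apk add by hand.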

I followed these steps, updated the advanced settings of the Telegraf docker, and restarted all 3 docker containers, but the CPU temps and other sensors are still not shown. I also ran apk add lm_sensors, and that seems to have worked. How quickly am I meant to see any output in Grafana?

Also, my SMART disk health output does not look like the sample screenshots. Is there something I still need to change?

smart.thumb.PNG.5a46ba93cfb8f46ccae1ca002acb022f.PNG

Edited by MrLondon
Link to post
21 hours ago, falconexe said:

Hmm, that is a weird one. I've gone through the CA backup process and have not seen that. I assume if you refresh, or close/reopen the window, it remains?

 

I'm sure you are already doing this, but one thought is to make sure that all 3 Dockers (Grafana/Telegraf/InfluxDB) are set to auto restart. The way I accomplish this is by defining an Extra Parameter of "--restart=always" in each Docker. Or you can use the auto restart buttons in the UNRAID GUI.

 

image.thumb.png.ef967774cc517992613e79a6e3543d4e.png

 

If this issue still persists after refreshing/reopening Grafana, try closing the browsers, clearing your cache/cookies, and reload.

 

Report back. My next backup is Monday at 5AM.

Firefox closed for an update and they're still there. I didn't clear cache/cookies, though. It does remain after a Ctrl-F5 hard refresh of the tab, and after close/reopen the tab.

 

I do have the `--restart=always` parameter set for one of the 3 dockers (don't recall which) per your earlier instructions, but I've never had any issue with dockers starting with a server restart or after a CA Backup run.

 

Interestingly, this is not the first CA Backup run since installing all this stuff, it just happens to be the one that caused a weird visual glitch.

Edited by FreeMan
Link to post
10 hours ago, MrLondon said:

I followed these steps, updated the advanced settings of the Telegraf docker, and restarted all 3 docker containers, but the CPU temps and other sensors are still not shown. I also ran apk add lm_sensors, and that seems to have worked. How quickly am I meant to see any output in Grafana?

Also, my SMART disk health output does not look like the sample screenshots. Is there something I still need to change?

smart.thumb.PNG.5a46ba93cfb8f46ccae1ca002acb022f.PNG

 

What kind of hardware are you running on? If you can't use IPMI (AKA you are on non-server hardware), which the dashboard is currently configured for, you will need to modify the queries to instead use "Sensors" now that you have that plugin installed and activated.

 

@GilbN do you have any example query language you can send him for sensors?

 

Grafana should display output immediately and will refresh based on your currently set interval.

 

Please let us know what your S.M.A.R.T. query looks like. Right click the panel and click "Explore". Screenshot both the query and data so we can see what you are dealing with.

Edited by falconexe
Link to post
4 hours ago, FreeMan said:

Firefox closed for an update and they're still there. I didn't clear cache/cookies, though. It does remain after a Ctrl-F5 hard refresh of the tab, and after close/reopen the tab.

 

I do have the `--restart=always` parameter set for one of the 3 dockers (don't recall which) per your earlier instructions, but I've never had any issue with dockers starting with a server restart or after a CA Backup run.

 

Interestingly, this is not the first CA Backup run since installing all this stuff, it just happens to be the one that caused a weird visual glitch.

 

Clear cache and cookies and report back.

Link to post
4 minutes ago, falconexe said:

 

What kind of hardware are you running on? If you can't use IPMI (AKA you are on non-server hardware), which the dashboard is currently configured for, you will need to modify the queries to instead use "Sensors" now that you have that plugin installed and activated.

 

@GilbN do you have any example query language you can send him for sensors?

 

Grafana should display output immediately and will refresh based on your currently set interval.

 

Please let us know what your S.M.A.R.T. query looks like. Right click the panel and click "Explore". Screenshot both the query and data so we can see what you are dealing with.

Hi, thanks for the reply. I am using an AMD motherboard and the temp is shown in Unraid itself. It is an ASRock B450 motherboard with an AMD 2600 processor. Here is the Explore screen you requested.

 

smart_explore.thumb.PNG.71119d4db298d30ced4729d2152bfedc.PNG

Link to post
16 minutes ago, MrLondon said:

Hi, thanks for the reply. I am using an AMD motherboard and the temp is shown in Unraid itself. It is an ASRock B450 motherboard with an AMD 2600 processor. Here is the Explore screen you requested.

 

smart_explore.thumb.PNG.71119d4db298d30ced4729d2152bfedc.PNG

 

Yeah you'll need to use sensors for that MB/CPU hardware config. IPMI will not work.

 

The S.M.A.R.T. query looks normal on your screenshot, but I also need to see the data.

 

Can you scroll down and post a screenshot of the table below? It looks like this. Feel free to sanitize the serial numbers out. We need to see where the data is falling out. AKA, what drives and/or fields are not showing up (if any).

 

image.thumb.png.08dc12022b3972df09046f1bb0378a48.png

Edited by falconexe
Link to post
7 minutes ago, falconexe said:

 

Yeah you'll need to use sensors for that MB/CPU hardware config. IPMI will not work.

 

@MrLondon Here is a sample CPU query using sensors from GilbN's original dashboard. Not sure if it will work for you, as I can't test it on server hardware, but if you plug this in, it might just work.

 

image.png.90b8ce1f93edb34ecf21030060357934.png

 

Here is the JSON for that single query too. First save the dashboard. Duplicate the current panel. You can then click the new panel, select "Inspect > Panel JSON", remove the old JSON code, paste this in, and click Apply.

 

{
  "datasource": "$telegrafdatasource",
  "fieldConfig": {
    "defaults": {
      "custom": {},
      "unit": "celsius",
      "min": 0,
      "max": 100,
      "thresholds": {
        "mode": "absolute",
        "steps": [
          {
            "color": "rgb(0, 255, 255)",
            "value": null
          },
          {
            "color": "#EAB839",
            "value": 50
          },
          {
            "color": "red",
            "value": 75
          }
        ]
      },
      "mappings": []
    },
    "overrides": []
  },
  "gridPos": {
    "h": 12,
    "w": 11,
    "x": 2,
    "y": 18
  },
  "hideTimeOverride": true,
  "id": 128,
  "interval": "$interval",
  "links": [],
  "options": {
    "reduceOptions": {
      "values": false,
      "calcs": [
        "lastNotNull"
      ],
      "fields": ""
    },
    "orientation": "horizontal",
    "displayMode": "lcd",
    "showUnfilled": true
  },
  "pluginVersion": "7.1.5",
  "targets": [
    {
      "alias": "$tag_feature $tag_chip",
      "groupBy": [
        {
          "params": [
            "$__interval"
          ],
          "type": "time"
        },
        {
          "params": [
            "feature"
          ],
          "type": "tag"
        },
        {
          "params": [
            "chip"
          ],
          "type": "tag"
        }
      ],
      "measurement": "sensors",
      "orderByTime": "ASC",
      "policy": "default",
      "query": "SELECT distinct(\"temp_input\") FROM \"sensors\" WHERE (\"chip\" = 'coretemp-isa-0000' AND \"feature\" = 'core_0') AND $timeFilter GROUP BY time($__interval) fill(null)",
      "rawQuery": false,
      "refId": "C",
      "resultFormat": "time_series",
      "select": [
        [
          {
            "params": [
              "temp_input"
            ],
            "type": "field"
          },
          {
            "params": [],
            "type": "last"
          }
        ]
      ],
      "tags": [
        {
          "key": "feature",
          "operator": "!~",
          "value": "/.*package/"
        }
      ]
    }
  ],
  "title": "$host - CPU Temp",
  "transformations": [],
  "type": "bargauge",
  "cacheTimeout": null,
  "description": "",
  "timeFrom": null,
  "timeShift": null
}
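The "query" string in that JSON names an Intel-style chip (coretemp-isa-0000) and feature (core_0), so on other hardware those two values will most likely need to change. The sketch below just rebuilds the same query text for whatever chip and feature your system reports. The helper name is made up, and the AMD chip name in the usage lines is a hypothetical placeholder; look up the real values in Explore or from the output of the sensors command inside the Telegraf container.

```python
QUERY_TEMPLATE = (
    'SELECT distinct("temp_input") FROM "sensors" '
    'WHERE ("chip" = \'{chip}\' AND "feature" = \'{feature}\') '
    'AND $timeFilter GROUP BY time($__interval) fill(null)'
)


def sensors_temp_query(chip, feature):
    """Return the raw InfluxQL text for one lm_sensors temperature reading."""
    return QUERY_TEMPLATE.format(chip=chip, feature=feature)


# Values taken from the panel JSON above.
intel_query = sensors_temp_query("coretemp-isa-0000", "core_0")

# Hypothetical AMD values; your chip and feature names will differ.
amd_query = sensors_temp_query("k10temp-pci-00c3", "tctl")

print(intel_query)
print(amd_query)
```

The resulting text can be pasted into the panel's query editor in raw (text) mode to check whether any data comes back before changing the panel itself.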

 

 

Edited by falconexe
Link to post
9 minutes ago, falconexe said:

 

Yeah you'll need to use sensors for that MB/CPU hardware config. IPMI will not work.

 

The query looks normal on your screenshot, but I also need to see the data.

 

Can you scroll down and post a screenshot of the table below? It looks like this. Feel free to sanitize the serial numbers out. We need to see where the data is falling out. AKA, what drives and/or fields are not showing up (if any).

 

image.thumb.png.08dc12022b3972df09046f1bb0378a48.png

Something is strange. I have 17 drives in the system (2 parity, 1 cache, and 14 data drives), but my table does not list them all, and I don't have all the headers you are showing in your table. How is this possible?

 

smart_table.thumb.PNG.379f813fb704ee3a95c74a62387f2185.PNG

Link to post
19 minutes ago, MrLondon said:

Something is strange. I have 17 drives in the system (2 parity, 1 cache, and 14 data drives), but my table does not list them all, and I don't have all the headers you are showing in your table. How is this possible?

 

smart_table.thumb.PNG.379f813fb704ee3a95c74a62387f2185.PNG

 

It is possible something was missed in the install/config with S.M.A.R.T. I'm noticing that you seem to be missing data any time "smart_attribute" is being called, but you do have data for "smart_device".

Edited by falconexe
Link to post
10 minutes ago, falconexe said:

 

I'm also noticing that you seem to be missing data any time "smart_attribute" is being called, but you do have data for "smart_device".

 

@MrLondon Yeah, so you missed the following when you installed it.

 

image.png.7d58202759c2026a6456a1e5c92502da.png

 

Within the telegraf.conf, you need to add/uncomment/set to true the following line.

 

image.thumb.png.33056f8834273a72151355cfd46cec7e.png

 

 

Get that fixed and restart all 3 dockers (Grafana/Telegraf/InfluxDB). I'm betting your issue will be resolved!
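Since the screenshots did not survive, here is a sketch of a quick check for this setting. It assumes, based on the follow-up reply mentioning "attributes", that the line in question is the attributes option in the [[inputs.smart]] section. The function name and sample text are hypothetical.

```python
def smart_attributes_enabled(conf_text):
    """Return True if an uncommented `attributes = true` sits under [[inputs.smart]]."""
    in_smart = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("[["):
            # A new plugin section starts; track whether it is the smart input.
            in_smart = line == "[[inputs.smart]]"
        elif in_smart and not line.startswith("#"):
            key, _, value = line.partition("=")
            if key.strip() == "attributes" and value.strip() == "true":
                return True
    return False


# Assumed config fragments for illustration only.
still_commented = "[[inputs.smart]]\n  # attributes = true\n"
enabled = "[[inputs.smart]]\n  attributes = true\n"
print(smart_attributes_enabled(still_commented))
print(smart_attributes_enabled(enabled))
```

If the check reports False, the "smart_attribute" measurement will not be collected, which would match the empty columns described above.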

Edited by falconexe
Link to post

You were correct. Somehow the attributes line was commented out again with a #. I have removed it and now I have more information, but it is still only showing the information for 2 drives, not all 17. In the docker I have this line: /bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && apk add lm_sensors && telegraf'

smart_table1.thumb.PNG.7c26d6024e7e58ada8ec9cc768c506bd.PNG

Link to post
1 hour ago, MrLondon said:

You were correct. Somehow the attributes line was commented out again with a #. I have removed it and now I have more information, but it is still only showing the information for 2 drives, not all 17. In the docker I have this line:

 

/bin/sh -c 'apk update && apk upgrade && apk add ipmitool && apk add smartmontools && apk add lm_sensors && telegraf'

  


smart_table1.thumb.PNG.7c26d6024e7e58ada8ec9cc768c506bd.PNG

 

Please confirm that you restarted the Dockers in this order.

 

Stop: Grafana > Telegraf > InfluxDB

Start: InfluxDB > Telegraf > Grafana
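The stop order is simply the start order reversed, so the database goes down last and comes up first. A small sketch that generates the docker commands in that order (the container names are assumptions; use the names shown on your Docker tab):

```python
def restart_commands(start_order):
    """Return docker CLI commands: stop in reverse order, then start in order."""
    stops = [["docker", "stop", name] for name in reversed(start_order)]
    starts = [["docker", "start", name] for name in start_order]
    return stops + starts


# Assumed container names; yours may be capitalized or named differently.
commands = restart_commands(["influxdb", "telegraf", "grafana"])
for command in commands:
    print(" ".join(command))
```

The sketch only prints the commands. Running them from the UNRAID terminal, or clicking Stop and Start in the GUI in the same order, has the same effect.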

 

Please also clear your browser history/cookies. Enough time should have passed by now for a few new datapoints to be written to the database, so I would expect all of these drives to have data in many of these fields. I also built the query to retag NULL values as "N/A", so even if some are missing because certain attributes do not apply to a given drive technology, it won't break anything.

 

Finally, I cannot guarantee that something didn't change within the query and/or related settings (either intentionally or by accident). If all else fails, you can try loading the following JSON into just that panel to "reset" it back to the default that was released in version 1.3. I already provided those instructions on how to accomplish this above in past posts. They have also been posted numerous times within this topic (see page 10).

 

Drive S.M.A.R.T. Health Overview.txt

 

Once you do all of this, please post a screenshot of both the query raw data table and the actual panel again. And Report Back...

 

 

Edited by falconexe
Link to post

Quick UUD development update. Work on Version 1.4 continues. It is going to be a few more weeks before it is released. Thousands of lines of code, new panels, bug fixes, optimizations, and even more capability (including multi server). Thanks again for all of your feedback and continued support. My Wife hates me right now (JK), but you guys love me, so....😂

 

As a way to assist the UUD community, IF you have your UUD working, and would like to assist, troubleshoot, and guide new users along, that would be fantastic. The more support I do, the longer it takes for 1.4 to drop. I have multiple day jobs and also run a company, so time is limited. I appreciate it! Have a great coming week everyone!

Link to post
28 minutes ago, FreeMan said:

Done. Still have the non-functional scroll bars on the horizontal graphs.

 

More of a mild nuisance than a "problem" - work on higher priority items.

Unfortunately, this is not something I can fix. I have never heard of this before and have no idea what could have caused it. You are the only person reporting this so far.

 

I would try installing a secondary copy of 1.3 and see if the default version has the same issue. You could also try reverting your current dashboard to a previous version before the appdata backup and see if that helps.

 

Perhaps the appdata plugin corrupted something? I myself use the same exact setup and don't have the issue and I've backed up 4 times now since I started.

 

Try those 2 options and let me know if it gets fixed. I'd also be curious whether this happens from another computer and in a different browser.

 

 

Link to post
14 minutes ago, ctrlaltd1337 said:

Is there a way to not include my /downloads share in the array growth calculation? It will delete itself eventually so it's not a big deal, but if I could just not include that share in the calculation that would be awesome. :)

 

image.thumb.png.cc3708ad51f913166df7032f65328a74.png

 

Nope, share paths do not exist inside of "disk", which is the plugin grabbing that data. /mnt/user is the path for the array disks. Since the share sits on 1 or more disks in the array, it gets counted. The only thing I can think of is to put that share on a single disk and exclude that disk from the query (where path DOES NOT EQUAL that disk).

Link to post
6 minutes ago, falconexe said:

I would try installing a secondary copy of 1.3 and see if the default version has the same issue. You could also try reverting your current dashboard to a previous version before the appdata backup and see if that helps.

I still have v1.2 installed. Oddly, it's showing the same symptom.

 

It is probably something wonky in Grafana. If it's still showing like this when you release v1.4, I'll uninstall & reinstall the Grafana docker. In the meantime, as I said, it's a mild nuisance, not a "problem". It's more annoying that it's weird and without an obvious solution than it is to look at.

 

Don't sweat it!

Edited by FreeMan
Link to post

Hey @falconexe.  First off - thanks for the lesson on doing the cpu information the "old" way.  I explored doing that, but then I took the time to dive into the regex and it was ridiculously simple.

 

I also finally got back to the UPS portion and I did figure some things out with the UPS stats.  It seems that any panel that utilizes a variable is not getting parsed.  The variable doesn't seem to get included in the query.. for example in this query:

 

SELECT last("load_percent") * .01 * $upsmaxwatt/1000 FROM "apcupsd" WHERE $timeFilter GROUP BY time($__interval)

The "$upsmaxwatt" isn't getting passed. The query inspector shows "error parsing query: found /, expected identifier, string, number, bool at line 1, char 37". This error is happening with each variable, and I cannot for the life of me figure out why the variables aren't getting passed, since they are all filled out.
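One way to read that error: if $upsmaxwatt expands to nothing, the "/" ends up exactly at character 37 of the query, which is where the parser complains. The sketch below reproduces that with a naive string substitution. It is only an illustration of the symptom, not Grafana's actual interpolation code, and the 1500 W rating is a made-up value.

```python
TEMPLATE = (
    'SELECT last("load_percent") * .01 * $upsmaxwatt/1000 '
    'FROM "apcupsd" WHERE $timeFilter GROUP BY time($__interval)'
)


def interpolate(template, variables):
    """Naively substitute $name placeholders with the given values."""
    query = template
    for name, value in variables.items():
        query = query.replace("$" + name, value)
    return query


empty = interpolate(TEMPLATE, {"upsmaxwatt": ""})
filled = interpolate(TEMPLATE, {"upsmaxwatt": "1500"})  # hypothetical UPS rating in watts

# 1-based position of the first "/" when the variable expands to an empty string.
slash_position = empty.index("/") + 1
print(slash_position)  # prints 37
print(filled)
```

That match suggests, but does not prove, that the variable resolves to an empty value at query time even though it looks filled in on the dashboard. Checking the variable's current value under Dashboard settings > Variables would be a reasonable next step.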

 

For giggles I installed a different panel that includes UPS stats (Unraid System Dashboard v2 from @GilbN) and the UPS data on his works out of the box, and other than the variable names they appear identical in query structure.

 

Thoughts?  Anything else I can supply or look at to help figure this out?

 

Thanks!

Link to post
1 hour ago, FreeMan said:

I still have v1.2 installed. Oddly, it's showing the same symptom.

 

It is probably something wonky in Grafana. If it's still showing like this when you release v1.4, I'll uninstall & reinstall the Grafana docker. In the meantime, as I said, it's a mild nuisance, not a "problem". It's more annoying that it's weird and without an obvious solution than it is to look at.

 

Don't sweat it!

Yeah definitely something specific to your system, and I agree, something sounds corrupted. Let us know if you get it fixed and how you did. It may help someone else someday.

Link to post
