Ultimate UNRAID Dashboard (UUD)


DEVELOPER UPDATE:

 

Hey guys. Sorry it has been so long since my last post/news. Unfortunately, I am going through a divorce. It was a complete shock, and I'll just leave it at that. As you can imagine, it has taken up a lot of my time and energies. I was on track to release UUD 1.6 right before that bombshell went off.

 

So, I am slowly getting back into development mode to polish a few things up and get this out the door. Another user @LTM and I have also been hard at work on another project that will directly support the UUD now and into the future. I am very excited to make that introduction and announcement soon.

 

I wanted to say thanks to @GilbN and all of the other users in the community who have helped out people with questions and issues while I was away. It's great to see! Stay tuned...

 

Edited by falconexe
Link to comment
5 hours ago, falconexe said:

DEVELOPER UPDATE:

 

Hey guys. Sorry it has been so long since my last post/news. Unfortunately, I am going through a divorce. It was a complete shock, and I'll just leave it at that. As you can imagine, it has taken up a lot of my time and energies. I was on track to release UUD 1.6 right before that bombshell went off.

 

So, I am slowly getting back into development mode to polish a few things up and get this out the door. Another user @LTM and I have also been hard at work on another project that will directly support the UUD now and into the future. I am very excited to make that introduction and announcement soon.

 

I wanted to say thanks to @GilbN and all of the other users in the community who have helped out people with questions and issues while I was away. It's great to see! Stay tuned...

 

I'm so sorry to hear this. Please reach out if there is anything we can do for you.

Link to comment
7 hours ago, SpencerJ said:

FYI- Grafana/InfluxDB users, if you wake up to a broken dashboard due to an update, here is a fix: 

 

Thank you @T0rqueWr3nch!

 

OK, it seems I shouldn't have taken that exactly literally.

 

Quote

influxdb:1.8.4-alpine@UnraidOfficial pic.twitter.com/JfDgwDGwJ6

gives me an error that the repository name must be lower case and leaves me with an orphaned image.

 

However

Quote

influxdb:1.8.4-alpine

does work.

 

Unfortunately, before I saw SpencerJ's post, I'd fiddled with the drive dropdowns and have lost some variable settings so I'll have to reset them. Better than having to completely start from scratch.

 

Hrmm... Maybe I spoke too soon:

image.png.ba6ad77be6246ec7271f323d2155fd83.png

I get the same missing drives...

 

Then I stopped all three dockers and restarted them in the order InfluxDB > Telegraf > Grafana, and all is good now. Whew!
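
For anyone who wants to script that restart order, here is a minimal Python sketch. The container names are assumptions (use whatever names your Docker tab shows), and it only wraps the standard `docker stop` and `docker start` commands: it stops the containers in reverse order, then starts them InfluxDB first, Grafana last.

```python
import subprocess

# Hypothetical container names; replace with the names on your Docker tab.
START_ORDER = ["influxdb", "telegraf", "grafana"]


def build_commands(names, action):
    """Return one `docker <action> <name>` command per container, in order."""
    return [["docker", action, name] for name in names]


def restart_stack(names=START_ORDER, run=subprocess.run):
    """Stop the containers in reverse order, then start them in order."""
    commands = build_commands(list(reversed(names)), "stop")
    commands += build_commands(names, "start")
    for cmd in commands:
        # check=True raises CalledProcessError if docker reports a failure.
        run(cmd, check=True)
    return commands

# Usage (on the Unraid host, where the docker CLI is available):
#   restart_stack()
```

The `run` parameter is only there so the function can be dry-run with a stand-in instead of actually calling Docker.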

Edited by FreeMan
  • Thanks 1
Link to comment

Thank you everyone for all your contributions. Can't wait for 1.6. I just had a question. I'm running on a Dell R710 with IPMI enabled. Accessing my iDRAC, I can see all my system stats. On the dashboard the stats show up, but the system temp and system power panels are only displaying a single variable. Not sure if anyone's seen this before or has any experience adjusting it. Any help would be appreciated.

Dashboard.JPG

Link to comment

This might have been asked and answered, but I'll give it a shot anyway. I use Prometheus instead of Telegraf/Influx for my monitoring (other machines right now), so I was wondering if anyone had forked this project to work with Prom? It is a wicked cool dashboard; I just don't want to run 2 monitoring solutions or switch my other stuff (if it isn't forked I might have to do one or the other though, because this board is amazing).

Link to comment

Maybe this was discussed in previous pages but tried to search for it and couldn't find anything (nor could I find an actual 'search' button in the thread but that's another issue xD). 

 

Is there any workaround for Telegraf and the SMART input plugin preventing disks from spinning down? I kinda thought it was a 6.9 RC2 bug, but after some comments on Reddit I realized that bit was causing it. Disks remain off if manually spun down, but will not go into that state on their own even if completely inactive, because Telegraf checking the SMART values seems to be preventing it.

Link to comment
3 hours ago, Laucien said:

Maybe this was discussed in previous pages but tried to search for it and couldn't find anything (nor could I find an actual 'search' button in the thread but that's another issue xD). 

 

Is there any workaround for Telegraf and the SMART input plugin preventing disks from spinning down? I kinda thought it was a 6.9 RC2 bug, but after some comments on Reddit I realized that bit was causing it. Disks remain off if manually spun down, but will not go into that state on their own even if completely inactive, because Telegraf checking the SMART values seems to be preventing it.

I just checked. My disks spin down and I use the smart plugin. 

Link to comment
I just checked. My disks spin down and I use the smart plugin. 
Huh weird.

In my case I have 2 SATA HDDs in this server and I'm using the smart plugin to get all the info for the dashboard. My Telegraf config is the one from the first post, only with IPMI disabled as I don't have that. Unraid version is 6.9 RC2.

I was having an issue in that the drives would never spin down even if nothing was using them, but if I manually spun them down they would remain like that until something actually tried to access the data. Following a couple of posts on Reddit, I disabled the auto fan plugin and my Telegraf docker and the problem went away. Then, trying to narrow it down, I enabled Telegraf again with the SMART plugin disabled and the disks went back to spinning down as expected.

Any idea how I could troubleshoot this? Or what logs/info I could share.
Link to comment

Does your appdata folder sit on these disks? AKA are you NOT using a cache disk? If so, the data will be written to the appdata folder that sits on these disks based on your Telegraf interval (30 seconds is default). And if you have these 2 disks in the same spin-up group, that would also explain it.

 

 

Link to comment
6 minutes ago, falconexe said:

Does your appdata folder sit on these disks? AKA are you NOT using a cache disk? If so, the data will be written to the appdata folder that sits on these disks based on your Telegraf interval (30 seconds is default). And if you have these 2 disks in the same spin-up group, that would also explain it.

 

 

 

Nope, appdata is cache only, and the disks remain spun down for hours when nothing is actively accessing the files. Other than me browsing the contents or Plex/Nextcloud doing their things, there are no other containers or VMs that access the array.

The issue I'm having is not that the disks mysteriously wake up but that it's keeping them from spinning down.

At first I thought there was an issue with 6.9 RC2, and people replying to my post here seem to confirm that, but disabling the SMART plugin in Telegraf made the problem go away so... wut?

Link to comment

OK so I found something. I installed dstat to check the I/O on a specific disk and here's what I got: 

 

Here's with Telegraf with the SMART input plugin enabled: 

----system---- --dsk/sdc-- ----most-expensive----
     time     | read  writ|     i/o process
28-02 02:25:46|   0     0 |shfs       2079k 4901k
28-02 02:25:47|   0     0 |php-fpm    7661k  758k
28-02 02:25:48|   0     0 |shfs       2233k 5318k
28-02 02:25:49|   0     0 |shfs       2006k 4968k
28-02 02:25:50|2048B    0 |shfs       2334k 5935k
28-02 02:25:51|   0     0 |php-fpm      45M 3862k
28-02 02:25:52|   0     0 |shfs       2324k 4945k
28-02 02:25:53|   0     0 |shfs       2191k 5546k
28-02 02:25:54|   0     0 |containerd-9506k 9409k
28-02 02:25:55|   0     0 |shfs       2421k 6957k
28-02 02:25:56|   0     0 |shfs       2100k 6074k
28-02 02:25:57|   0     0 |shfs       2106k 5652k
28-02 02:25:58|   0     0 |cache_dirs   13k   20M
28-02 02:25:59|   0     0 |shfs       2858k 7894k
28-02 02:26:00|2048B    0 |shfs       4220k 8579k
28-02 02:26:01|   0     0 |php-fpm      70M 6148k
28-02 02:26:02|   0     0 |php-fpm      89M 4730k
28-02 02:26:03|   0     0 |containerd-9487k 9408k
28-02 02:26:04|   0     0 |shfs       2079k 5756k
28-02 02:26:05|   0     0 |shfs       2291k 6356k
28-02 02:26:06|   0     0 |shfs       2250k 6382k
28-02 02:26:07|   0     0 |php-fpm    7801k  758k
28-02 02:26:08|   0     0 |dockerd     481k   94k
28-02 02:26:09|   0     0 |shfs        179k  419k
28-02 02:26:10|2048B    0 |telegraf   1947k   19k

 

 

See how there's some read activity every 10 seconds? That's the scanning interval configured in Telegraf. 

 

Here's the same query with Telegraf running but with the SMART plugin disabled: 

----system---- --dsk/sdc-- ----most-expensive----
     time     | read  writ|     i/o process
28-02 02:28:50|   0     0 |shfs       2269k 5409k
28-02 02:28:51|   0     0 |shfs       2053k 4867k
28-02 02:28:52|   0     0 |shfs       2138k 5050k
28-02 02:28:53|   0     0 |shfs       1976k 5076k
28-02 02:28:54|   0     0 |shfs       2200k 5333k
28-02 02:28:55|   0     0 |containerd-9506k 9409k
28-02 02:28:56|   0     0 |shfs       2300k 5075k
28-02 02:28:57|   0     0 |php-fpm      42M 3813k
28-02 02:28:58|   0     0 |shfs       2401k 6440k
28-02 02:28:59|   0     0 |shfs       2208k 6437k
28-02 02:29:00|   0     0 |shfs       4119k 8965k
28-02 02:29:01|   0     0 |shfs         10M   16M
28-02 02:29:02|   0     0 |cache_dirs   76k   20M
28-02 02:29:03|   0     0 |containerd-9487k 9408k
28-02 02:29:04|   0     0 |shfs       2210k 5747k
28-02 02:29:05|   0     0 |shfs       2460k 6214k
28-02 02:29:06|   0     0 |shfs       2687k 7160k
28-02 02:29:07|   0     0 |shfs       2217k 5862k
28-02 02:29:08|   0     0 |shfs       2290k 6244k
28-02 02:29:09|   0     0 |shfs       2224k 6365k
28-02 02:29:10|   0     0 |shfs       2639k 7237k
28-02 02:29:11|   0     0 |shfs       2914k 7400k
28-02 02:29:12|   0     0 |containerd   93k  480k
28-02 02:29:13|   0     0 |containerd   93k  480k
28-02 02:29:14|   0     0 |shfs        264k  326k
28-02 02:29:15|   0     0 |containerd   93k  480k
28-02 02:29:16|   0     0 |containerd   93k  480k
28-02 02:29:17|   0     0 |containerd   93k  480k
28-02 02:29:18|   0     0 |containerd   93k  480k
28-02 02:29:19|   0     0 |containerd   93k  480k
28-02 02:29:20|   0     0 |dockerd    1532k  147k
28-02 02:29:21|   0     0 |dockerd     894k  450k
28-02 02:29:22|   0     0 |cache_dirs  276k   29M
28-02 02:29:23|   0     0 |shfs       1960k 5683k
28-02 02:29:24|   0     0 |shfs       2035k 6021k
28-02 02:29:25|   0     0 |containerd-9506k 9409k
28-02 02:29:26|   0     0 |shfs       2179k 5248k
28-02 02:29:27|   0     0 |shfs       2107k 5013k
28-02 02:29:28|   0     0 |shfs       2530k 6197k

 

 

See how there's no activity?
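
If anyone wants to check the same thing on their own box without eyeballing the output, here is a rough Python sketch that pulls the rows with a non-zero read column out of dstat text like the above and reports the gaps between them. It assumes the exact three-column layout shown in this post (time | read writ | process) and a hypothetical `year` argument, since dstat's timestamps here carry no year.

```python
from datetime import datetime


def read_timestamps(dstat_text, year=2021):
    """Return the timestamps of rows whose disk read column is non-zero.

    Assumes rows look like '28-02 02:25:50|2048B    0 |shfs ...'.
    """
    stamps = []
    for line in dstat_text.splitlines():
        parts = line.split("|")
        if len(parts) < 3:
            continue  # separator/header rows without column bars
        try:
            when = datetime.strptime(
                f"{year}-{parts[0].strip()}", "%Y-%d-%m %H:%M:%S"
            )
        except ValueError:
            continue  # the 'time | read writ | ...' header row
        columns = parts[1].split()
        if columns and columns[0] != "0":
            stamps.append(when)
    return stamps


def intervals_seconds(stamps):
    """Seconds between consecutive read events."""
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
```

Feeding it the first capture above gives reads at 02:25:50, 02:26:00 and 02:26:10, i.e. a steady 10-second gap, while the second capture yields no read events at all.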

 

 

Anyway, I know this isn't related to the Grafana dashboard project, so I don't want to derail the thread. If anyone has any idea where I could look, I'd really appreciate it.

Edited by Laucien
  • Thanks 1
Link to comment
On 2/28/2021 at 6:38 AM, Laucien said:

Maybe this was discussed in previous pages but tried to search for it and couldn't find anything (nor could I find an actual 'search' button in the thread but that's another issue xD). 

 

Is there any workaround for Telegraf and the SMART input plugin preventing disks from spinning down? I kinda thought it was a 6.9 RC2 bug, but after some comments on Reddit I realized that bit was causing it. Disks remain off if manually spun down, but will not go into that state on their own even if completely inactive, because Telegraf checking the SMART values seems to be preventing it.

Also having this issue on upgrade.

 

I have 7 disks, and I can see the read counts incrementing every 10 seconds even though the drives are spun down.

Appears to not wake up drives, but it does stop them from spinning down.

 

Should we submit a bug report?

Link to comment

 

 

8 hours ago, deanpelton said:

but it does stop them from spinning down.

Yep, same here.  Drives will no longer spin down.  This is just since upgrading last night to 6.9.0.  With 6.8.3, no issues.

 

I stopped Grafana, Telegraf and Influxdb as I only use them for UUD and the disks spun down again after the designated inactivity period.

 

There are documented changes to smartctl handling in 6.9.0 and that seems to have impacted the way Telegraf deals with SMART as well.

Link to comment
57 minutes ago, Hoopster said:

 

 

Yep, same here.  Drives will no longer spin down.  This is just since upgrading last night to 6.9.0.  With 6.8.3, no issues.

 

I stopped Grafana, Telegraf and Influxdb as I only use them for UUD and the disks spun down again after the designated inactivity period.

 

There are documented changes to smartctl handling in 6.9.0 and that seems to have impacted the way Telegraf deals with SMART as well.

It's just the [[inputs.smart]] section that is keeping the drives alive.

If you comment it out of the Telegraf config file you can keep using your dashboard. Don't forget to comment out any other lines within that section as well.

 

It does look to be an Unraid issue, given that the read counts for the drives increment because of this even though most of them are spun down.
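
In case it saves someone a manual edit, here is a small Python sketch of that workaround. It comments out the `[[inputs.smart]]` header and every setting under it, and it assumes the section runs until the next line starting with `[[` (the next plugin block), which is how the plugin sections in a typical telegraf.conf are laid out. The file path in the usage comment is hypothetical.

```python
def comment_out_section(conf_text, header="[[inputs.smart]]"):
    """Comment out a Telegraf plugin section and the settings under it.

    Assumption: the section ends at the next line starting with '[['.
    Lines that are blank or already commented are left untouched.
    """
    out = []
    in_section = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped == header:
            in_section = True
        elif in_section and stripped.startswith("[["):
            in_section = False  # next plugin block begins here
        if in_section and stripped and not stripped.startswith("#"):
            out.append("# " + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"

# Usage (hypothetical path; back up the file first):
#   path = "/mnt/user/appdata/telegraf/telegraf.conf"
#   with open(path) as f:
#       text = f.read()
#   with open(path, "w") as f:
#       f.write(comment_out_section(text))
```

Restart the Telegraf container afterwards so it picks up the changed config. The SMART panels on the dashboard will stop updating while the section is disabled.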

Link to comment

Thanks everyone for reporting this issue.

 

@SpencerJ are you and the team aware of this?

 

The UUD has only been tested/released on 6.8.3 stable. I have yet to upgrade myself. I’ll release 1.6 first, then look into upgrading my 2 servers soon thereafter.

 

Edited by falconexe
  • Like 2
Link to comment
15 minutes ago, falconexe said:

team aware of this

There is already a bug report for spin-downs. A smartctl call made outside of array processing shows up as I/O because the command reads from the disk, even though it may not be real user data. Limetech is aware.

Edited by SimonF
  • Thanks 1
Link to comment

Wondering if anyone can help me figure out why I am not getting data for some metrics? At one point I had the library data and stream data. I recently moved the server to a different location with a different network, but I believe I changed IP addresses where applicable. Not sure why there are negative numbers in growth? Why is there some data about the library but not the rest? I have disk data, but can't get the overview?

 

 

image.png.d18d93bbcbf072558119347427a4754d.pngimage.thumb.png.b8f16e9e87ed3cef5d6295d0f9e5f7f4.pngimage.thumb.png.d927fda8e4d48ef87d1a640db3205619.pngimage.png.bbe1077b7027e641c8252da27cf77985.png image.png.87368014bcbca78bc1ba695453bdec56.png

Edited by dbinott
Link to comment
2 hours ago, falconexe said:

Thanks everyone for reporting this issue.

 

@SpencerJ are you and the team aware of this?

 

The UUD has only been tested/released on 6.8.3 stable. I have yet to upgrade myself. I’ll release 1.6 first, then look into upgrading my 2 servers soon thereafter.

 

We are aware. Let us know what you find after you upgrade and we’ll get together to get this sorted. 
 

Cc: @limetech

  • Like 3
Link to comment
On 2/19/2021 at 5:01 PM, FreeMan said:

Ah! That makes perfect sense and, yeah, I knew that. This makes sense - that's yesterday's growth, since it was all written to cache and the mover ran around 01:30 today.

 

One thing I did was change the displays to the last 24 hours, 7 days, 4 weeks, and 365 days.

 

Under Query Options and Relative Time

 

Today = now-24h

Week = now-7d

Month = now-30d

Year = now-365d

 

 

image.png.23e4bb9603e74245d0e1497326fd51c5.png

 

You can also turn off the time label on individual graphs (e.g. "Last 24 hours") by selecting Hide time info, shown below.

 

image.png.7a80e2b5548b045f308ae9fe7bee6f37.png

 

 

 

Edited by hogfixer
  • Thanks 1
Link to comment
