CS01-HS

[Guide] Grafana spinup graph


I have a visually pleasing (but technically dirty) solution to my quest for a spin-up graph. I'm new to Grafana and find it frustrating, so if anyone has improvements, feel free to post them.

 

This is the end result:

2004995325_ScreenShot2020-11-22at5_38_09AM.thumb.png.c2322de0e856a21aed10e2285a4a2981.png

 

Setup:

 

1. Start with a User Scripts script, set to run every 5 minutes, that tracks drive activity and temperature in InfluxDB (borrowed from this php version).

Replace every XX with your system's settings (the default InfluxDB port is 8086).

#!/bin/bash

# User settings
INFLUX_IP="XX"
INFLUX_PORT="XX"
HOSTNAME="XX"

# Drive IDs (in /dev/disk/by-id/) in position order from top of graph to bottom
declare -a DRIVE_LIST=(
  "XX"
  "XX"
)

position=0
# Loop through drives
for drive in "${DRIVE_LIST[@]}"; do
    # Capture smartctl output; -n standby avoids waking a sleeping drive
    smartctl_output=$(smartctl -n standby -AH "/dev/disk/by-id/$drive")
    # Treat the drive as awake unless smartctl reports standby
    active=''
    if ! grep -q 'Device is in STANDBY mode' <<< "$smartctl_output"; then
        active=",active=1"
        # SMART attribute 194 is the temperature; column 10 is its raw value
        temp=$(awk '$1 == 194 {print $10}' <<< "$smartctl_output")
        # Only add temp_c when a value was found, so the point stays valid
        [[ -n $temp ]] && active+=",temp_c=${temp}"
    fi

    # Write one point per drive using InfluxDB line protocol
    curl -i -XPOST "http://$INFLUX_IP:$INFLUX_PORT/write?db=telegraf" \
        --data-binary "hdd_spin,host=$HOSTNAME,id_serial=$drive position=$position${active}"

    position=$((position + 1))
done
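
To make the data format easier to check before pointing the script at a live database, here is a minimal sketch of how each line-protocol point is assembled. The function name build_point and the example host and drive IDs are made up for illustration; the measurement, tag, and field names are the ones used in the script above.

```shell
# Build one InfluxDB line-protocol point for a drive.
# Arguments: host, drive id, position, awake flag (1 or 0), temperature (may be empty)
# build_point is a hypothetical helper, not part of the original script.
build_point() (
    host=$1; drive=$2; position=$3; awake=$4; temp=$5
    fields="position=${position}"
    if [ "$awake" -eq 1 ]; then
        fields="${fields},active=1"
        # Leave out temp_c when there is no reading, so the point stays valid
        if [ -n "$temp" ]; then
            fields="${fields},temp_c=${temp}"
        fi
    fi
    printf 'hdd_spin,host=%s,id_serial=%s %s\n' "$host" "$drive" "$fields"
)

# Example with made-up values: one awake drive, one sleeping drive
build_point "tower" "ata-EXAMPLE_SERIAL1" 0 1 35
build_point "tower" "ata-EXAMPLE_SERIAL2" 1 0 ""
```

A sleeping drive only reports its position, which is what leaves the gap in its ribbon on the graph.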

 

2. Create a new panel with the following query:

1974348793_ScreenShot2020-11-22at5_42_11AM.thumb.png.7a3f6552c68c0abcc3daa35f39088911.png

 

 

3. Now the hacks start. The graph goes from 0 to (in my case with 7 drives) -7. We need a way to turn these lines into pretty ribbons. We'll graph the pos column along negative-y (so position 0 is at the top), then for every drive we'll create a corresponding transform that takes the drive's position, negates it and subtracts 1 (so position p becomes -(p + 1)), and have Grafana fill the space between the two.

 

Here are the first three transforms in my setup in position order:

Parity disk in position 0 transform to -1 (0 - 1)

1st Pool disk in position 1 transform to -2 (1 * -2/1)

1st Array disk in position 2 transform to -3 (2 * -3/2)

 

Grafana doesn't allow fractions so you'll have to calculate the decimal value.

 

The next entry in the sequence would be:

Position 3 transforms to -4 (3 * -4/3), so the multiplier to enter is -1.33.
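
If you have more than a few drives, the decimals can be generated rather than worked out by hand. This is only a sketch that follows the arithmetic in this post: position 0 gets a subtraction, and every other position p gets the multiplier -(p + 1)/p. The function name fill_transforms is made up, and the drive count is passed as an argument.

```shell
# Print the transform for each drive position 0..count-1.
# Position 0 cannot be scaled (0 times anything is 0), so it gets "subtract 1".
# Every other position p is multiplied by -(p + 1)/p to land on -(p + 1).
# fill_transforms is a hypothetical helper name.
fill_transforms() (
    count=$1
    p=0
    while [ "$p" -lt "$count" ]; do
        if [ "$p" -eq 0 ]; then
            printf 'position 0: subtract 1\n'
        else
            awk -v p="$p" 'BEGIN { printf "position %d: multiply by %.2f\n", p, -(p + 1) / p }'
        fi
        p=$((p + 1))
    done
)

# Example: list the transforms for a 7-drive setup
fill_transforms 7
```

Two decimals match the -1.33 used here; the result lands a hair off the whole number (3 * -1.33 = -3.99), which should not be visible at this scale.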

 

89143669_ScreenShot2020-11-22at5_48_39AM.thumb.png.15a3dd31da93bb7cda8954376d949562.png

 

4. Now go to Overrides to alias the drive pos and temp fields

1300646233_ScreenShot2020-11-22at5_57_55AM.thumb.png.609f76c2681abb6e637648aa460eaba1.png1410948183_ScreenShot2020-11-22at6_02_11AM.thumb.png.2f708fc9b426d898045ab809e2410261.png

 

5. Now go to Panel to tweak the display.

904917215_ScreenShot2020-11-22at6_10_28AM.thumb.png.e59da259e7c4cf3ce589499cd6885a19.png316088348_ScreenShot2020-11-22at6_11_00AM.thumb.png.b3863443ae8aaba58f222e0f82497fe0.png1971833116_ScreenShot2020-11-22at6_11_24AM.thumb.png.1dd75170da3715faa538649a7eaef23c.png  

 

Create series overrides for each drive's position field (-pos), fill field (-fill) and temperature field (no suffix).

Note the "Fill below to" setting on the -pos fields, which creates the "ribbons."

1789346449_ScreenShot2020-11-22at6_20_09AM.thumb.png.ad1dbed1efd22729d40e34a1ca626357.png198542958_ScreenShot2020-11-22at6_22_16AM.thumb.png.ae3674e905d2689e1418218f396a72f9.png112611757_ScreenShot2020-11-22at6_24_07AM.png.3da53055dd1116608f8eb5406f0c22d4.png 

 

That's it.

 

To verify you haven't missed anything, a completed panel for 7 drives will have:

  • 7 Transforms
  • 14 Overrides
  • 21 Series Overrides
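
For a different number of drives, the same checklist scales per drive: one transform, two overrides (the pos and temp aliases) and three series overrides (-pos, -fill and temperature). A small sketch, with panel_counts as a made-up helper name:

```shell
# Expected item counts for a panel with the given number of drives.
# panel_counts is a hypothetical helper name.
panel_counts() (
    drives=$1
    printf 'Transforms: %d\n' "$drives"
    printf 'Overrides: %d\n' $((drives * 2))
    printf 'Series Overrides: %d\n' $((drives * 3))
)

# Example: the 7-drive panel from this guide
panel_counts 7
```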

 

 

Note that the Legend will sort alphabetically by serial ID, not position (unfortunately).

If you're lucky (or obsessive enough to reposition your drives alphabetically, ahem) they'll match.

 

EDIT: 11/22/2020 - Updated instructions for version 2, which adds temperature.

Edited by CS01-HS
v2


This is pretty sweet. I'm the developer of the Ultimate Unraid Dashboard (UUD). If you want to cross-promote your solution, feel free to also post or link it in my topic. I don't personally use Spin-up Groups, but this is a really neat solution to non-native data within Telegraf.

 

 


When I get some more time, I'll deep dive in. Based on a quick read, I should be able to help you clean some of this up.

16 minutes ago, falconexe said:

This is pretty sweet. I'm the developer of the Ultimate Unraid Dashboard (UUD). If you want to cross-promote your solution, feel free to also post or link it in my topic. I don't personally use Spin-up Groups, but this is a really neat solution to non-native data within Telegraf.

Sure, maybe if/when it gets cleaned up.

 

Just to be clear this isn't related to spin-up groups (which I don't use either), just standard drives. I wanted a way to easily track whether my drives were sleeping/waking too frequently. Someone cleverer might be able to integrate total wakes over the specified time range.
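
As a starting point for the total-wakes idea, here is a rough sketch that counts wake-ups outside Grafana. It assumes you have already exported one drive's samples in time order, one per line, with 1 for awake and 0 for asleep (the script above writes no active field for a sleeping drive, so those samples would need to be mapped to 0 first). The function name count_wakes is made up.

```shell
# Count wake-ups in a time-ordered list of samples read from stdin, one per line:
# 1 = drive awake, 0 = drive asleep. A wake-up is a 0 followed by a 1.
# count_wakes is a hypothetical helper name.
count_wakes() {
    awk 'NR > 1 && prev == 0 && $1 == 1 { wakes++ } { prev = $1 } END { print wakes + 0 }'
}

# Example with made-up samples: the drive wakes twice
printf '%s\n' 1 1 0 0 1 1 0 1 | count_wakes
```

With samples every 5 minutes, a wake and sleep that both happen between two samples would be missed, so treat the count as a lower bound.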

1 minute ago, CS01-HS said:

Sure, maybe if/when it gets cleaned up.

 

Just to be clear this isn't related to spin-up groups (which I don't use either), just standard drives. I wanted a way to easily track whether my drives were sleeping/waking too frequently. Someone cleverer might be able to integrate total wakes over the specified time range.

 

Ahh, thanks for that clarification. I meant spin down delay, not groups (my bad). My disks spin 24/7, so this would not be useful for me, but MANY people do spin theirs down, so I can totally see the value in this type of graph. It would really help in finding a rogue drive spinning up and then traversing the logs to find out WHY.

 

I can also see this stuff going into a heat map type graph. If you look at my UUD topic, I'll be adding these to my next update. You could adapt them on a per drive/pool basis and have a heat map dashboard for all of them as an overview panel. You could set them all to the same timeframe like last 24 hours. Just a different way of looking at the same data... Anyway, congrats on this solution/hack.

 

Heat map example:

 

image.png.27c0a34475f48ddf129c165268de7771.png
