doron Posted August 29, 2020 Author Share Posted August 29, 2020 (edited)

Awesome indeed. I'm now running this script off of rsyslog - whenever the spindown message shows up. Short testing so far, so use at your own risk - but it seems to work like a charm. Woo hoo!

EDIT: This indeed works, but it seems that the drive doesn't stay spun down for very long - "someone somewhere" appears to spin it back up a few moments later. Obviously the dot remains grey and not green, but the disk does start revolving again... Hmm. This does not seem to happen with SATA drives.

#!/bin/bash
#
# Spin down SAS drives - stopgap script until Unraid does organically.
#
# This script is initiated via syslog - when Unraid issues the "spindown n" message.
# If the drive is SAS, the script will issue the commands to spin down a SAS drive.
#
# Spin up is not implemented - assumed to "just happen" when i/o is directed at drive.
#
# @doron 2020-08-30

MDCMD=/usr/local/sbin/mdcmd
SG_MAP=/usr/bin/sg_map
SG_START=/usr/bin/sg_start
SMARTCTL=/usr/sbin/smartctl

grep -qe "mdcmd.*spindown" <<< "$1" || exit 0

# Get syslog line without line breaks and whatnot
LINE=$(paste -sd ' ' <<< $1)

# Obtain Unraid slot number being spun down, from syslog message
SLOTNUM=$(sed -r 's/.*: *spindown ([[:digit:]]*).*/\1/' <<< $LINE)

# Get the device name from the slot number
RDEVNAME=$($MDCMD status | grep "rdevName.$SLOTNUM" | sed 's/.*=//')

if [ "$($SMARTCTL -i /dev/$RDEVNAME | grep protocol | sed -r 's/.*protocol: *(.*) .*/\1/')" == "SAS" ]
then
  # Figure out /dev/sgN type name from /dev/sdX name
  SGDEVNAME=$($SG_MAP | grep "/dev/$RDEVNAME" | sed -r 's/(.*)[[:space:]].*/\1/' )
  if [ "$SGDEVNAME" != "" ] ; then
    # Do the magic
    $SG_START --pc=3 $SGDEVNAME
    logger -t "SAS Assist" "spinning down slot $SLOTNUM, device $SGDEVNAME"
  fi
fi

To trigger it, place the script somewhere permanent and add something like this into a conf file in /etc/rsyslog.d:

:msg,contains,"spindown" ^PATH-TO-SCRIPT

Edited August 30, 2020 by doron
sota Posted August 29, 2020 Share Posted August 29, 2020

Am I doing something wrong (highly possible), or will this not affect unassigned devices?

root@Cube:~# sg_start -vvv --pc=3 /dev/sg13
open /dev/sg13 with flags=0x802
    start stop unit command: 1b 00 00 00 30 00
duration=0 ms
root@Cube:~# sdparm --command=sense /dev/sg13
    /dev/sg13: SEAGATE DKS2E-H4R0SS 7FA6
Additional sense: Standby condition activated by command

Looks fine, but if I touch the Main page and refresh unassigned devices, the dot stays green, and a re-run of sense gives me...

root@Cube:~# sdparm --command=sense /dev/sg13
    /dev/sg13: SEAGATE DKS2E-H4R0SS 7FA6
root@Cube:~# <end>

I'm guessing the refresh "touched" the device?
SimonF Posted August 30, 2020 Share Posted August 30, 2020 (edited)

I found that smartctl requests can spin up the drives. I have set my device poll to 18000, which means they wake up every 5 hours. Not sure about UD, as I don't have a SAS drive as an unassigned device - my UD drives are SATA only.

I have also found a way to use the sdX names, by using --readonly so the device is opened read-only and not read-write:

sg_start --readonly --pc=3 /dev/sdd

sg_raw can also be used:

sg_raw -v -R /dev/sdd 1b 00 00 00 10 00
    cdb to send: [1b 00 00 00 10 00]
SCSI Status: Good
root@Tower:~# sg_raw -v -R /dev/sdd 1b 00 00 00 30 00
    cdb to send: [1b 00 00 00 30 00]
SCSI Status: Good

Edited August 30, 2020 by SimonF
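For anyone puzzling over those hex bytes: in a START STOP UNIT CDB (opcode 0x1b), the power condition sits in the upper nibble of the fifth byte - which is why 30 means standby and 10 means active. Here is a small illustrative sketch; the decode_ssu_cdb helper is made up for this example, not a real tool:

```shell
#!/bin/bash
# Illustrative helper (hypothetical, not a real utility): decode the
# POWER CONDITION field from the 6-byte START STOP UNIT CDB used in the
# sg_raw calls above. Byte 4's upper nibble holds the power condition:
# 1=ACTIVE, 2=IDLE, 3=STANDBY.
decode_ssu_cdb() {
  local byte4="$5"                 # fifth hex byte of the CDB
  local pc=$(( 16#$byte4 >> 4 ))   # upper nibble = power condition
  case $pc in
    0) echo "NONE (start/stop bit applies instead)" ;;
    1) echo "ACTIVE" ;;
    2) echo "IDLE" ;;
    3) echo "STANDBY" ;;
    *) echo "OTHER ($pc)" ;;
  esac
}

decode_ssu_cdb 1b 00 00 00 30 00   # the spindown command -> STANDBY
decode_ssu_cdb 1b 00 00 00 10 00   # the spinup command   -> ACTIVE
```

So the two sg_raw lines above are just the same command with different power conditions.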
keshavdaboss Posted September 1, 2020 Share Posted September 1, 2020 (edited)

On 8/30/2020 at 6:25 AM, SimonF said:

i found that smartctl requests can spin up the drives. I have set my device poll to 18000 which means they wake up every 5 hours. Not sure about UD as dont have a SAS drive as UD only SATA I have also found a way to use the sdX names by using the --readonly so the device is opened as readonly and not readwrite. sg_start --readonly --pc=3 /dev/sdd also sg_raw can be used also. sg_raw -v -R /dev/sdd 1b 00 00 00 10 00 cdb to send: [1b 00 00 00 10 00] SCSI Status: Good root@Tower:~# sg_raw -v -R /dev/sdd 1b 00 00 00 30 00 cdb to send: [1b 00 00 00 30 00] SCSI Status: Good

@SimonF how do you change your smart device poll time for just the SAS drive?

Edited September 1, 2020 by keshavdaboss
SimonF Posted September 1, 2020 Share Posted September 1, 2020

1 hour ago, keshavdaboss said:

@SimonF how do you change your smart device poll time for just the SAS drive?

It's system-wide I believe, and there is no option per device. I have only SAS drives in my test system array, so that works for me.
JimJamUrUnraid Posted September 7, 2020 Share Posted September 7, 2020

For the users that have tried this, how many watts do you see your UPS dropping per disk that you put in standby? And in total? I have 17 disks total (15 data + 2 parity), of which 4 are SAS disks. When I put mine in standby using the newly found methodology, I don't really see any improvement in power consumption. Curious what others are seeing.
Cilusse Posted September 7, 2020 Share Posted September 7, 2020

My array consists of 4 SAS drives and 2 SSDs. At idle with only my background processes and the SSDs it’s using 40W; at idle with the SAS array unnecessarily spinning it uses 80W; and with the drives active and working, it’s around 100W. I have a little current meter on my plug to measure and track energy usage and cost.
doron Posted September 8, 2020 Author Share Posted September 8, 2020

On 8/30/2020 at 4:25 PM, SimonF said:

i found that smartctl requests can spin up the drives. I have set my device poll to 18000 which means they wake up every 5 hours. Not sure about UD as dont have a SAS drive as UD only SATA

I took some time to do a longer-running test, monitoring all my SAS drives every few seconds, to try and get a more comprehensive picture of what's actually going on.

Bottom line: the state 3 thing does work - it spins down the drives, and subsequent i/o wakes them up. However, these SAS drives tend to spin back up for various other reasons, not all of which I can yet map. Indeed, the periodic SMART queries spin them up, but those aren't the only events causing that.

I have a script that spins a SAS drive down when the syslog message about it is spewed by Unraid. I wrote an additional small script that looks for all SAS drives that should currently be spun down (aka greyed in the UI), and if they're not really spun down - spins them back down. I had it run immediately after the SMART query (you can do that with a "plugin EVENT"), to eliminate that cause, and kept the monitor running.

What I found is that some of these drives keep spinning back up - after a few minutes or seconds - with no apparent reason (i.e. Unraid did not spin them up, and I don't see i/o being done against them). At this time I don't have a good guess as to why this happens.
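For anyone who wants to reproduce this kind of monitoring, a rough sketch is below. It assumes sdparm is installed and that a drive in standby reports "Standby condition activated" in its sense data, as shown elsewhere in this thread; the classify_sense and monitor helpers are made up for this example, and the device names are placeholders:

```shell
#!/bin/bash
# Sketch of a spin-state monitor. classify_sense reads the output of
# "sdparm --command=sense /dev/sdX" on stdin and prints the state;
# monitor loops over the given devices and timestamps each reading,
# so spurious spin-ups show up with a time in the log.
classify_sense() {
  if grep -iq "standby condition activated" ; then
    echo "STANDBY"
  else
    echo "SPINNING"
  fi
}

monitor() {                     # usage: monitor /dev/sdb /dev/sdc ...
  while sleep 10 ; do
    for d in "$@" ; do
      state=$(sdparm --command=sense "$d" 2>/dev/null | classify_sense)
      echo "$(date '+%F %T') $d $state"
    done
  done
}
```

Run it as e.g. `monitor /dev/sdb /dev/sdc > /tmp/spinlog &` and grep the log afterwards for STANDBY-to-SPINNING transitions.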
SimonF Posted September 8, 2020 Share Posted September 8, 2020

@doron Not sure if background media scans happen while drives are in standby. I have disabled them on my drives, from when I was using idle timers:

sdparm --clear=EN_BMS --save /dev/sdX

You can see whether background scans are active via smartctl. Also, maybe some other functions are accessing the drive - I know smartctl doesn't work with the standby option on SAS. Does the GUI check for standby? Do you run the IPMI plugin and check disk temps? That may cause spinups too.
Cilusse Posted September 8, 2020 Share Posted September 8, 2020

9 hours ago, doron said:

I took some time to do a longer-running test, monitoring all my SAS drives every few seconds, to try and get a more comprehensive picture as to what's actually going on. Bottom line: The state 3 thing does work, and it spins down the drives, so that subsequent i/o wakes them up. However these SAS drives tend to spin back up for various other reasons, not all of which I can yet map. Indeed the periodical SMART instructions spins them up, but these aren't the only events causing that. I have a script that spins a SAS drive down when the syslog message about it is spewed by Unraid. I wrote an additional small script that looks for all SAS drives that should currently be spun down (aka greyed in the UI), and if they're not really spun down - spins them back down. I had it run immediately after the SMART query (you can do that with a "plugin EVENT"), to eliminate that cause, and kept the monitor running. What I found is that some of these drives keep spinning back up - after a few minutes or seconds - with no apparent reason (i.e. Unraid did not spin them up, and I don't see i/o being done against them). At this time I don't have a good guess as to why this happens.

@doron What is the best way to implement such scripts? This reaches the limits of my Slackware knowledge - thanks!
doron Posted September 8, 2020 Author Share Posted September 8, 2020 (edited)

46 minutes ago, Cilusse said:

@doron What is the best way to implement such scripts ? This reaches the limits of my Slackware knowledge Thanks!

Ah. Note that both scripts are currently temp hacks, so I haven't yet bothered to install them properly (or at least ensure that they survive a reboot).

Re the syslog script: you can add a file into /etc/rsyslog.d named 99-<something>.conf, containing the following:

:msg,contains,"spindown" ^/path/to/script

after which you need to restart rsyslogd:

/etc/rc.d/rc.rsyslogd restart

Re the event script: This is based on the Unraid plugin event system (plugins can ask for an upcall from Unraid in case of certain "events", one of which is the SMART collection). To do it "properly" you need to have a plugin. Since this is a hack, I just piggy-backed on an existing plugin. Any will do; I used "User Script", which originally does not make use of this event (called "poll_attributes"). So: in /usr/local/emhttp/plugins/<your-selected-plugin>/event/, place your script under the name "poll_attributes" and automagically, it will run right after every SMART poll. Note that this script blocks emhttp (i.e. emhttp waits for it to complete). I paste mine below (I shared my syslog script in a previous message).
#!/bin/bash

SG_MAP=/usr/bin/sg_map
SG_START=/usr/bin/sg_start
SMARTCTL=/usr/sbin/smartctl
SDPARM=/usr/sbin/sdparm
DISKS_INI=/var/local/emhttp/disks.ini

(
# Get a list of disks that are expected to be spun down right now
DISKS_SPUN_DOWN=$(cat $DISKS_INI | paste -sd '^' | sed -e 's/\^\[/\n\[/g' | tr "^" " " | grep DISK_OK | grep "color=\"green-blink\"" | sed -r 's/.* device=\"([a-z0-9]*).*/\1/' )

for RDEVNAME in $DISKS_SPUN_DOWN ; do
  # If it's a SAS device
  if [ "$($SMARTCTL -i /dev/$RDEVNAME | grep protocol | sed -r 's/.*protocol: *(.*) .*/\1/')" == "SAS" ]
  then
    # Figure out /dev/sgN type name from /dev/sdX name
    SGDEVNAME=$($SG_MAP | grep "/dev/$RDEVNAME" | sed -r 's/(.*)[[:space:]].*/\1/' )
    if [ "$SGDEVNAME" != "" ] ; then
      # If it's not currently spun down...
      if ! grep -iq "standby condition activated" <<< $($SDPARM --command=sense $SGDEVNAME) ; then
        # ... Do the magic
        $SG_START --pc=3 $SGDEVNAME
        logger -t "SAS Assist" "spinning device $RDEVNAME back down"
      fi
    fi
  fi
done
) &

Edited September 8, 2020 by doron
Duggie264 Posted September 9, 2020 Share Posted September 9, 2020

Another upvote from me - I've been having this problem for a couple of years now, and with 15 4TB SAS drives spinning day and night, I've paid a few quid more than I needed/wanted to! Currently run 14 Seagate ST4000NM0023 (SAS) and 2 ST4000DM005 (SATA) drives through 2 LSI 9207-8i in IT mode. Would love a simple way to put the drives into standby!
SimonF Posted September 9, 2020 Share Posted September 9, 2020

14 hours ago, doron said:

$SG_START --pc=3 $SGDEVNAME

You can use the sdX device if you use -r; this would simplify your script.
doron Posted September 9, 2020 Author Share Posted September 9, 2020

7 minutes ago, SimonF said:

You can use the sdX drive if you use -r this would simplify your script.

Absolutely. I had the code done before you came up with the new and improved method, so it's still there. If/when I repack it as a plugin, I'll probably improve that aspect too.
JimJamUrUnraid Posted September 9, 2020 Share Posted September 9, 2020

When I was messing about, --readonly did not work for me, as in it did not successfully put the drive into standby mode. I ended up having to use the sgX device references. I have HGST 8TB drives if it makes any difference. I haven’t looked at the man page, but my guess is -r and --readonly are the same thing.
absolute_badger Posted September 10, 2020 Share Posted September 10, 2020 (edited)

Vote from me. Just built my new Unraid server to replace two old NAS boxes. 4x SAS drives and 4x SATA drives running on an LSI 9211-8i in IT mode. Unraid will spin down the SATA drives but not the SAS drives. Lots of unnecessary heat and power consumption with drives spinning away when they aren't in use 90% of the time!

Edited September 10, 2020 by absolute_badger
SimonF Posted September 10, 2020 Share Posted September 10, 2020

Yes, they are the same. How did you check standby? I found device polling would spin them up.

Using sdX without -r or --readonly, I saw an entry in the syslog for the disk and no spin-down action on the drive. With sgX you do not need to specify readonly.

The command I use to see spindown is as follows:

sdparm --command=sense /dev/sdg
    /dev/sdg: HGST HUS724030ALS640 A1C4
Additional sense: Standby condition activated by command
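Since the thread keeps switching between the sdX and sgX name spaces: sg_map26 (from sg3_utils) converts one name to the other directly, and where only plain sg_map is available, its output (pairs of "/dev/sgN /dev/sdX") can be filtered with a one-liner. The sg_name_for helper below is made up for this example - it is just a text filter over sg_map's output:

```shell
# Hypothetical helper: given a /dev/sdX name as $1, read "sg_map"-style
# output (pairs of "/dev/sgN /dev/sdX") on stdin and print the matching
# /dev/sgN name. Being a pure text filter, it needs no device access.
sg_name_for() {
  awk -v d="$1" '$2 == d { print $1 }'
}

# typical use:
#   SG=$(sg_map | sg_name_for /dev/sdg)
#   sg_start --pc=3 "$SG"
```

On systems with sg_map26, `sg_map26 /dev/sdg` does the same job in one call.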
JimJamUrUnraid Posted September 10, 2020 Share Posted September 10, 2020

7 hours ago, SimonF said:

Yes they are the same, how did you check standby? I found device polling would spin them up. Using sdX without -r or --readonly I saw an entry in the syslog for the disk and no action on the drive spin down. sgX you do not need to specify readonly. Command I use to see spindown is as follows. sdparm --command=sense /dev/sdg /dev/sdg: HGST HUS724030ALS640 A1C4 Additional sense: Standby condition activated by command

I only used --readonly with sdX. Yes, that is the command I used to see if the drive was spun down. I’ve followed your previous posts very closely. Sometime this week I will revisit and see if it’s still acting up.
doron Posted September 11, 2020 Author Share Posted September 11, 2020

On 9/8/2020 at 9:00 PM, SimonF said:

@doron not sure if background media scans happen if drives in standby. I have disabled on my drives for when i was using idle timers. sdparm --clear=EN_BMS --save /dev/sdX You can see if background scans are active on Smartctl. Also maybe some other functions are accessing drive as I know Smartctl doesn't work with the standby option on SAS. Does the GUI check for standby? Do you run the IPMI plugin and checking disk temps as this may cause spinups also.

So it turns out that disabling this does further improve the situation, but it's still happening: the SAS drives still wake up from time to time, for no apparent reason and with no i/o done against them. With my two scripts running constantly, the overall standby times are longer - but it's not perfect. Maybe we can figure out some more reasons for these drives to spin up and out of STANDBY state. @SimonF? 🙂
Pourko Posted September 11, 2020 Share Posted September 11, 2020 (edited)

On 9/8/2020 at 7:13 AM, doron said:

I have a script that spins a SAS drive down when the syslog message about it is spewed by Unraid.

It doesn't seem wise to hitch your wagon to something that's been buggy for a very long time. Especially when there's a very easy way to do this yourself -- just read /sys/block/sdX/stat directly, and when you notice that there's been no i/o activity for a certain period of time, then just go ahead and spin down the drive. For example, I am attaching here my own little script that has been faithfully serving me for over five years now. Just disable all spindown stuff in the UI, start my script from your "go" file, and forget about it. (Note, the UI may also be buggy in the way it polls for smart data, thus spinning up your disks, so you may want to look into that too.)

#!/bin/bash
# spind: disk spin-down daemon
copy="Version 3.9 <c> 2020 by Pourko Balkanski"
prog=spind
####################################################################
MINUTES=${MINUTES:-60} # the number of idle minutes before spindown
####################################################################
idleTimeout=$(($MINUTES*60)) # in seconds
loopDelay=61 # seconds

kill $(pidof -x -o $$ $0) 2>/dev/null # our previous instances, if any
[ "$1" = "-q" ] && exit 0 # Don't start a new daemon if called with -q
renice 5 -p $$ >/dev/null # renice self

log () { logger -t $prog $@ ;}
log $copy

# Make a list of the disks that could be spun down
i=0
for device in /dev/[sh]d[aaa-zzz] ;do
  if proto=$(smartctl -i $device | grep -iE ' sas| sata| ide') ;then
    ((i++))
    devName[$i]=$device
    cmdStat[$i]="cat /sys/block/$(basename $device)/stat"
    devLastStat[$i]=$(${cmdStat[$i]})
    devSecondsIdle[$i]=0
    devError[$i]=0 # We'll use this to flag disks that won't spin down
    cmdSpinStatus[$i]="hdparm -C $device"
    cmdStandby[$i]="hdparm -y $device"
    if grep -iq ' SAS' <<<$proto ;then
      # Switch from /dev/sdX to /dev/sgN
      devName[$i]=$(sg_map26 $device)
      cmdSpinStatus[$i]="sdparm --command=sense ${devName[$i]}"
      cmdStandby[$i]="sg_start --pc=3 ${devName[$i]}"
    fi
    theList+="${devName[$i]} "
  fi
done
devCount=$i

if [ "$theList" = "" ] ;then
  log 'No supported disks found. Exiting.'
  exit 1
fi

log "Will spin down disks after $MINUTES minutes of idling."
log "Monitoring: $theList"

while :;do
  sleep $loopDelay
  for i in $(seq $devCount) ;do
    [ ${devError[$i]} -gt 2 ] && continue # this disk has previously failed to spin down.
    devNewStat[$i]=$(${cmdStat[$i]})
    if [ "${devNewStat[$i]}" != "${devLastStat[$i]}" ] ; then
      # Some i/o activity has occurred since the last time we checked.
      devSecondsIdle[$i]=0
      devLastStat[$i]=${devNewStat[$i]}
    else
      # No new activity since we last checked...
      # ...So, let's check its spin status
      if ${cmdSpinStatus[$i]} | grep -iq standby ; then
        devSecondsIdle[$i]=0
      else
        # it's currently spinning
        let "devSecondsIdle[$i] += $loopDelay"
        # Check if it's been idling for long enough...
        if [ ${devSecondsIdle[$i]} -gt $idleTimeout ] ; then
          # It is time to spin this one down!
          log "spinning down ${devName[$i]} "
          ${cmdStandby[$i]} >/dev/null 2>&1
          devSecondsIdle[$i]=0
          sleep 1 # no need to worry about race conditions here.
          # Check if the drive actually spun down as a result of our command
          if ${cmdSpinStatus[$i]} | grep -iq standby ;then
            devError[$i]=0
          else
            ((devError[$i]++))
            [ ${devError[$i]} -gt 2 ] && log "${devName[$i]} fails to spin down."
          fi
        fi
      fi
    fi
  done
done &
disown
exit 0

spind-3.9.zip

Edited September 11, 2020 by Pourko
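The core idle test the script above relies on - diffing successive reads of /sys/block/sdX/stat - boils down to a couple of lines. A minimal sketch (snap needs a real block device at runtime; io_happened is a pure string compare, so it can be exercised without any hardware):

```shell
# Any change between two snapshots of /sys/block/<dev>/stat means i/o
# hit the disk in between; an unchanged snapshot means the disk idled.
snap() { cat "/sys/block/$1/stat" ; }          # e.g. snap sdg
io_happened() { [ "$1" != "$2" ] ; }           # compares two snapshots

# typical use:
#   before=$(snap sdg); sleep 60; after=$(snap sdg)
#   io_happened "$before" "$after" && echo "sdg was busy"
```

This is why the approach is immune to UI state bugs: it observes the kernel's own i/o counters rather than what Unraid believes the disk is doing.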
SimonF Posted September 11, 2020 Share Posted September 11, 2020

I have written the standby function into smartctl for SCSI (SAS) devices, as it's currently only available for ATA, and submitted it to the owner for inclusion. Example outputs below - not sure if that would be of use.

root@Tower:~# smartctl -i -n standby /dev/sdg
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY BY COMMAND mode, exit(2)

root@Tower:~# smartctl -i -n standby /dev/sdh
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY BY TIMER mode, exit(2)

root@Tower:~# smartctl -i -n never /dev/sdh
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              HUS724030ALS640
Revision:             A1C4
Compliance:           SPC-4
User Capacity:        3,000,592,982,016 bytes [3.00 TB]
Logical block size:   512 bytes
LU is resource provisioned, LBPRZ=0
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca027baa9a8
Serial number:
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Sep 11 20:28:44 2020 BST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Power mode was:       STANDBY BY TIMER

I have also provided some code to the devs for changes to mdcmd, for inclusion within Unraid.
doron Posted September 11, 2020 Author Share Posted September 11, 2020

1 hour ago, Pourko said:

It doesn't seem wize to hitch your wagon to something that's been buggy for a very long time.

Hi. You are coming to a thread after a long discussion - please read it over. In short, hanging the action on the syslog thing (and the other script/hack I posted) is by no means a "solution" - it's a stopgap testing mechanism, to test our assumptions about the feasibility and the efficacy of the STANDBY command solution. So far, it generates mixed results: the drives do spin down, but after a while they spin back up. You'd not see it unless you monitor the STANDBY status of the drive rather closely. Looking at your script, it seems to be susceptible to the same issue.

By the way, to do what you are doing, you don't need to read i/o counters - you can tell the drive to automatically spin down after a certain amount of idle time. That, in turn, has two drawbacks: (a) Unraid's spin up/down management is not aware of this, so no UI display and settings dialog control. (b) Same problem as above - the drives do spin back up, with no apparent i/o, for reasons yet to be understood.

1 hour ago, Pourko said:

(Note, the UI may also be buggy in the way it polls for smart data, thus spinning up your disks, so you may want to look into that too.)

See previously in this thread.
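For reference, the drive-side idle timer mentioned above lives in the SCSI Power Condition mode page, which sdparm can edit. A rough sketch follows - treat the field acronyms (STANDBY, SCT) as an assumption, since they can differ between sdparm versions; list what your build actually exposes with `sdparm -p po /dev/sdX` before setting anything:

```shell
# Enable the standby condition and set its timer on a SAS drive.
# SCT (standby condition timer) is in 100 ms units, so 18000 = 30 minutes.
# --save writes the saved page so the setting survives a power cycle.
sdparm --set=STANDBY=1 --set=SCT=18000 --save /dev/sdX

# Read the fields back to verify:
sdparm --get=STANDBY --get=SCT /dev/sdX
```

As doron notes, Unraid's spin-up/down accounting won't know about a timer set this way, so the UI dots won't reflect it.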
Pourko Posted September 11, 2020 Share Posted September 11, 2020

1 minute ago, doron said:

you don't need to read i/o counters - you can tell the drive to automatically spin down after a certain amount of idle time.

Right. But I have a bunch of disks that disregard that setting. Which was the main reason I wrote my script. Anyway, I was only trying to help. For myself, I have a solution that has been working flawlessly on my server for years. If you don't like it -- forget I posted it. Cheers.
doron Posted September 11, 2020 Author Share Posted September 11, 2020

Just now, Pourko said:

Anyway, I was only trying to help. For myself, I have a solution that has been working flawlessly on my server for years. If you don't like it -- forget I posted it.

On the contrary, it would in fact be good for you to take part - you seem to have relevant experience - it'd just be good to get in sync with the discussion.
Pourko Posted September 11, 2020 Share Posted September 11, 2020 (edited)

See, I have the feeling that you are not correctly identifying the problem. The way I see it, the problem is not how to spin down disks; the problem is that some buggy scripts in the UI don't know how to properly query a disk without waking it up, and they don't know when to rightfully display a green ball (or whatever other color). Personally, I rarely use the UI for anything, and on my server disks spin down when they are supposed to, and they stay spun down.

From reading the posts in this thread, I have the impression that you are trying to fix things kind of backwards, i.e., you take some info from the UI (that does not match reality) and try to make the disk status match that unreal info from the UI. That is why I suggested that maybe you shouldn't bother doing it that way, and instead plead with the UI people to fix their UI, if the UI is that important to you. I hope this explanation makes some sense. :-)

Edited September 11, 2020 by Pourko