doron Posted August 29, 2020 Author Share Posted August 29, 2020 (edited)

Awesome indeed. I'm now running this script off of rsyslog - whenever the spindown message shows up. Short testing so far, so use at your own risk - but it seems to work like a charm. Woo hoo!

EDIT: This indeed works, but it seems that the drive doesn't stay spun down for very long - "someone somewhere" appears to spin it back up a few moments later. Obviously the dot remains grey and not green, but the disk does start revolving again... Hmm. This does not seem to happen with SATA drives.

#!/bin/bash
#
# Spin down SAS drives - stopgap script until Unraid does organically.
#
# This script is initiated via syslog - when Unraid issues the "spindown n" message.
# If the drive is SAS, the script will issue the commands to spin down a SAS drive.
#
# Spin up is not implemented - assumed to "just happen" when i/o is directed at drive.
#
# @doron 2020-08-30

MDCMD=/usr/local/sbin/mdcmd
SG_MAP=/usr/bin/sg_map
SG_START=/usr/bin/sg_start
SMARTCTL=/usr/sbin/smartctl

grep -qe "mdcmd.*spindown" <<< "$1" || exit 0

# Get syslog line without line breaks and whatnot
LINE=$(paste -sd ' ' <<< $1)

# Obtain Unraid slot number being spun down, from syslog message
SLOTNUM=$(sed -r 's/.*: *spindown ([[:digit:]]*).*/\1/' <<< $LINE)

# Get the device name from the slot number
RDEVNAME=$($MDCMD status | grep "rdevName.$SLOTNUM" | sed 's/.*=//')

if [ "$($SMARTCTL -i /dev/$RDEVNAME | grep protocol | sed -r 's/.*protocol: *(.*) .*/\1/')" == "SAS" ]
then
  # Figure out /dev/sgN type name from /dev/sdX name
  SGDEVNAME=$($SG_MAP | grep "/dev/$RDEVNAME" | sed -r 's/(.*)[[:space:]].*/\1/' )
  if [ "$SGDEVNAME" != "" ] ; then
    # Do the magic
    $SG_START --pc=3 $SGDEVNAME
    logger -t "SAS Assist" "spinning down slot $SLOTNUM, device $SGDEVNAME"
  fi
fi

To trigger it, place the script somewhere permanent and add something like this into a conf file in /etc/rsyslog.d:

:msg,contains,"spindown" ^PATH-TO-SCRIPT

Edited August 30, 2020 by doron
sota Posted August 29, 2020 Share Posted August 29, 2020

Am I doing something wrong (highly possible), or will this not affect unassigned devices?

root@Cube:~# sg_start -vvv --pc=3 /dev/sg13
open /dev/sg13 with flags=0x802
    start stop unit command: 1b 00 00 00 30 00
duration=0 ms
root@Cube:~# sdparm --command=sense /dev/sg13
    /dev/sg13: SEAGATE DKS2E-H4R0SS 7FA6
Additional sense: Standby condition activated by command

Looks fine, but if I touch the Main page and refresh unassigned devices, the dot stays green, and a re-run of sense gives me...

root@Cube:~# sdparm --command=sense /dev/sg13
    /dev/sg13: SEAGATE DKS2E-H4R0SS 7FA6
root@Cube:~# <end>

I'm guessing the refresh "touched" the device?
SimonF Posted August 30, 2020 Share Posted August 30, 2020 (edited)

I found that smartctl requests can spin up the drives. I have set my device poll to 18000, which means they wake up every 5 hours. Not sure about UD, as I don't have a SAS drive as an unassigned device - my UD drives are SATA only.

I have also found a way to use the sdX names, by using --readonly so the device is opened read-only and not read-write:

sg_start --readonly --pc=3 /dev/sdd

sg_raw can also be used:

sg_raw -v -R /dev/sdd 1b 00 00 00 10 00
    cdb to send: [1b 00 00 00 10 00]
SCSI Status: Good
root@Tower:~# sg_raw -v -R /dev/sdd 1b 00 00 00 30 00
    cdb to send: [1b 00 00 00 30 00]
SCSI Status: Good

Edited August 30, 2020 by SimonF
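For anyone puzzling over those hex bytes: in a START STOP UNIT CDB (opcode 0x1b), the power condition sits in the upper nibble of the fifth byte - which is why 30 means standby and 10 means active. Here is a small illustrative sketch; the decode_ssu_cdb helper is made up for this example, not a real tool:

```shell
#!/bin/bash
# Illustrative helper (hypothetical, not a real utility): decode the
# POWER CONDITION field from the 6-byte START STOP UNIT CDB used in the
# sg_raw calls above. Byte 4's upper nibble holds the power condition:
# 1=ACTIVE, 2=IDLE, 3=STANDBY.
decode_ssu_cdb() {
  local byte4="$5"                 # fifth hex byte of the CDB
  local pc=$(( 16#$byte4 >> 4 ))   # upper nibble = power condition
  case $pc in
    0) echo "NONE (start/stop bit applies instead)" ;;
    1) echo "ACTIVE" ;;
    2) echo "IDLE" ;;
    3) echo "STANDBY" ;;
    *) echo "OTHER ($pc)" ;;
  esac
}

decode_ssu_cdb 1b 00 00 00 30 00   # the spindown command -> STANDBY
decode_ssu_cdb 1b 00 00 00 10 00   # the spinup command   -> ACTIVE
```

So the two sg_raw lines above are just the same command with different power conditions.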
keshavdaboss Posted September 1, 2020 Share Posted September 1, 2020 (edited)

On 8/30/2020 at 6:25 AM, SimonF said:

i found that smartctl requests can spin up the drives. I have set my device poll to 18000 which means they wake up every 5 hours. Not sure about UD as dont have a SAS drive as UD only SATA I have also found a way to use the sdX names by using the --readonly so the device is opened as readonly and not readwrite. sg_start --readonly --pc=3 /dev/sdd also sg_raw can be used also. sg_raw -v -R /dev/sdd 1b 00 00 00 10 00 cdb to send: [1b 00 00 00 10 00] SCSI Status: Good root@Tower:~# sg_raw -v -R /dev/sdd 1b 00 00 00 30 00 cdb to send: [1b 00 00 00 30 00] SCSI Status: Good

@SimonF how do you change your smart device poll time for just the SAS drive?

Edited September 1, 2020 by keshavdaboss
SimonF Posted September 1, 2020 Share Posted September 1, 2020

1 hour ago, keshavdaboss said:

@SimonF how do you change your smart device poll time for just the SAS drive?

It's system-wide I believe, and there is no option per device. I have only SAS drives in my test system array, so that works for me.
JimJamUrUnraid Posted September 7, 2020 Share Posted September 7, 2020

For the users that have tried this, how many watts do you see your UPS dropping per disk that you put in standby? And in total? I have 17 disks total (15 data + 2 parity), of which 4 are SAS disks. When I put mine in standby using the newly found methodology, I don't really see any improvement in power consumption. Curious what others are seeing.
Cilusse Posted September 7, 2020 Share Posted September 7, 2020

My array consists of 4 SAS drives and 2 SSDs. At idle with only my background processes and the SSDs it’s using 40W; at idle with the SAS array unnecessarily spinning it uses 80W; and with the drives active and working, it’s around 100W. I have a little current meter on my plug to measure and track energy usage and cost.
doron Posted September 8, 2020 Author Share Posted September 8, 2020

On 8/30/2020 at 4:25 PM, SimonF said:

i found that smartctl requests can spin up the drives. I have set my device poll to 18000 which means they wake up every 5 hours. Not sure about UD as dont have a SAS drive as UD only SATA

I took some time to do a longer-running test, monitoring all my SAS drives every few seconds, to try and get a more comprehensive picture of what's actually going on.

Bottom line: the state 3 thing does work - it spins down the drives, and subsequent i/o wakes them up. However, these SAS drives tend to spin back up for various other reasons, not all of which I can yet map. Indeed, the periodic SMART queries spin them up, but those aren't the only events causing that.

I have a script that spins a SAS drive down when the syslog message about it is spewed by Unraid. I wrote an additional small script that looks for all SAS drives that should currently be spun down (aka greyed in the UI), and if they're not really spun down - spins them back down. I had it run immediately after the SMART query (you can do that with a "plugin EVENT"), to eliminate that cause, and kept the monitor running.

What I found is that some of these drives keep spinning back up - after a few minutes or seconds - with no apparent reason (i.e. Unraid did not spin them up, and I don't see i/o being done against them). At this time I don't have a good guess as to why this happens.
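For anyone who wants to reproduce this kind of monitoring, a rough sketch is below. It assumes sdparm is installed and that a drive in standby reports "Standby condition activated" in its sense data, as shown elsewhere in this thread; the classify_sense and monitor helpers are made up for this example, and the device names are placeholders:

```shell
#!/bin/bash
# Sketch of a spin-state monitor. classify_sense reads the output of
# "sdparm --command=sense /dev/sdX" on stdin and prints the state;
# monitor loops over the given devices and timestamps each reading,
# so spurious spin-ups show up with a time in the log.
classify_sense() {
  if grep -iq "standby condition activated" ; then
    echo "STANDBY"
  else
    echo "SPINNING"
  fi
}

monitor() {                     # usage: monitor /dev/sdb /dev/sdc ...
  while sleep 10 ; do
    for d in "$@" ; do
      state=$(sdparm --command=sense "$d" 2>/dev/null | classify_sense)
      echo "$(date '+%F %T') $d $state"
    done
  done
}
```

Run it as e.g. `monitor /dev/sdb /dev/sdc > /tmp/spinlog &` and grep the log afterwards for STANDBY-to-SPINNING transitions.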
SimonF Posted September 8, 2020 Share Posted September 8, 2020

@doron Not sure if background media scans happen while drives are in standby. I have disabled them on my drives, from when I was using idle timers:

sdparm --clear=EN_BMS --save /dev/sdX

You can see whether background scans are active via smartctl. Also, maybe some other functions are accessing the drive - I know smartctl doesn't work with the standby option on SAS. Does the GUI check for standby? Do you run the IPMI plugin and check disk temps? That may cause spinups too.
Cilusse Posted September 8, 2020 Share Posted September 8, 2020

9 hours ago, doron said:

I took some time to do a longer-running test, monitoring all my SAS drives every few seconds, to try and get a more comprehensive picture as to what's actually going on. Bottom line: The state 3 thing does work, and it spins down the drives, so that subsequent i/o wakes them up. However these SAS drives tend to spin back up for various other reasons, not all of which I can yet map. Indeed the periodical SMART instructions spins them up, but these aren't the only events causing that. I have a script that spins a SAS drive down when the syslog message about it is spewed by Unraid. I wrote an additional small script that looks for all SAS drives that should currently be spun down (aka greyed in the UI), and if they're not really spun down - spins them back down. I had it run immediately after the SMART query (you can do that with a "plugin EVENT"), to eliminate that cause, and kept the monitor running. What I found is that some of these drives keep spinning back up - after a few minutes or seconds - with no apparent reason (i.e. Unraid did not spin them up, and I don't see i/o being done against them). At this time I don't have a good guess as to why this happens.

@doron What is the best way to implement such scripts? This reaches the limits of my Slackware knowledge - thanks!
doron Posted September 8, 2020 Author Share Posted September 8, 2020 (edited)

46 minutes ago, Cilusse said:

@doron What is the best way to implement such scripts ? This reaches the limits of my Slackware knowledge Thanks!

Ah. Note that both scripts are currently temp hacks, so I haven't yet bothered to install them properly (or at least ensure that they survive a reboot).

Re the syslog script: you can add a file into /etc/rsyslog.d named 99-<something>.conf, containing the following:

:msg,contains,"spindown" ^/path/to/script

after which you need to restart rsyslogd:

/etc/rc.d/rc.rsyslogd restart

Re the event script: This is based on the Unraid plugin event system (plugins can ask for an upcall from Unraid in case of certain "events", one of which is the SMART collection). To do it "properly" you need to have a plugin. Since this is a hack, I just piggy-backed on an existing plugin. Any will do; I used "User Script", which originally does not make use of this event (called "poll_attributes"). So: in /usr/local/emhttp/plugins/<your-selected-plugin>/event/, place your script under the name "poll_attributes" and automagically, it will run right after every SMART poll. Note that this script blocks emhttp (i.e. emhttp waits for it to complete). I paste mine below (I shared my syslog script in a previous message).
#!/bin/bash

SG_MAP=/usr/bin/sg_map
SG_START=/usr/bin/sg_start
SMARTCTL=/usr/sbin/smartctl
SDPARM=/usr/sbin/sdparm
DISKS_INI=/var/local/emhttp/disks.ini

(
# Get a list of disks that are expected to be spun down right now
DISKS_SPUN_DOWN=$(cat $DISKS_INI | paste -sd '^' | sed -e 's/\^\[/\n\[/g' | tr "^" " " | grep DISK_OK | grep "color=\"green-blink\"" | sed -r 's/.* device=\"([a-z0-9]*).*/\1/' )

for RDEVNAME in $DISKS_SPUN_DOWN ; do
  # If it's a SAS device
  if [ "$($SMARTCTL -i /dev/$RDEVNAME | grep protocol | sed -r 's/.*protocol: *(.*) .*/\1/')" == "SAS" ]
  then
    # Figure out /dev/sgN type name from /dev/sdX name
    SGDEVNAME=$($SG_MAP | grep "/dev/$RDEVNAME" | sed -r 's/(.*)[[:space:]].*/\1/' )
    if [ "$SGDEVNAME" != "" ] ; then
      # If it's not currently spun down...
      if ! grep -iq "standby condition activated" <<< $($SDPARM --command=sense $SGDEVNAME) ; then
        # ... Do the magic
        $SG_START --pc=3 $SGDEVNAME
        logger -t "SAS Assist" "spinning device $RDEVNAME back down"
      fi
    fi
  fi
done
) &

Edited September 8, 2020 by doron
Duggie264 Posted September 9, 2020 Share Posted September 9, 2020

Another upvote from me - I've been having this problem for a couple of years now, and with 15 4TB SAS drives spinning day and night, I've paid a few quid more than I needed/wanted to! Currently run 14 Seagate ST4000NM0023 (SAS) and 2 ST4000DM005 (SATA) drives through 2 LSI 9207-8i in IT mode. Would love a simple way to put the drives into standby!
SimonF Posted September 9, 2020 Share Posted September 9, 2020

14 hours ago, doron said:

$SG_START --pc=3 $SGDEVNAME

You can use the sdX device if you use -r; this would simplify your script.
doron Posted September 9, 2020 Author Share Posted September 9, 2020

7 minutes ago, SimonF said:

You can use the sdX drive if you use -r this would simplify your script.

Absolutely. I had the code done before you came up with the new and improved method, so it's still there. If/when I repack it as a plugin, I'll probably improve that aspect too.
JimJamUrUnraid Posted September 9, 2020 Share Posted September 9, 2020

When I was messing about, --readonly did not work for me, as in it did not successfully put the drive into standby mode. I ended up having to use the sgX device references. I have HGST 8TB drives if it makes any difference. I haven’t looked at the man page, but my guess is -r and --readonly are the same thing.
absolute_badger Posted September 10, 2020 Share Posted September 10, 2020 (edited)

Vote from me. Just built my new Unraid server to replace two old NAS boxes. 4x SAS drives and 4x SATA drives running on an LSI 9211-8i in IT mode. Unraid will spin down the SATA drives but not the SAS drives. Lots of unnecessary heat and power consumption with drives spinning away when they aren't in use 90% of the time!

Edited September 10, 2020 by absolute_badger
SimonF Posted September 10, 2020 Share Posted September 10, 2020

Yes, they are the same. How did you check standby? I found device polling would spin them up.

Using sdX without -r or --readonly, I saw an entry in the syslog for the disk and no spin-down action on the drive. With sgX you do not need to specify readonly.

The command I use to see spindown is as follows:

sdparm --command=sense /dev/sdg
    /dev/sdg: HGST HUS724030ALS640 A1C4
Additional sense: Standby condition activated by command
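Since the thread keeps switching between the sdX and sgX name spaces: sg_map26 (from sg3_utils) converts one name to the other directly, and where only plain sg_map is available, its output (pairs of "/dev/sgN /dev/sdX") can be filtered with a one-liner. The sg_name_for helper below is made up for this example - it is just a text filter over sg_map's output:

```shell
# Hypothetical helper: given a /dev/sdX name as $1, read "sg_map"-style
# output (pairs of "/dev/sgN /dev/sdX") on stdin and print the matching
# /dev/sgN name. Being a pure text filter, it needs no device access.
sg_name_for() {
  awk -v d="$1" '$2 == d { print $1 }'
}

# typical use:
#   SG=$(sg_map | sg_name_for /dev/sdg)
#   sg_start --pc=3 "$SG"
```

On systems with sg_map26, `sg_map26 /dev/sdg` does the same job in one call.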
JimJamUrUnraid Posted September 10, 2020 Share Posted September 10, 2020

7 hours ago, SimonF said:

Yes they are the same, how did you check standby? I found device polling would spin them up. Using sdX without -r or --readonly I saw an entry in the syslog for the disk and no action on the drive spin down. sgX you do not need to specify readonly. Command I use to see spindown is as follows. sdparm --command=sense /dev/sdg /dev/sdg: HGST HUS724030ALS640 A1C4 Additional sense: Standby condition activated by command

I only used --readonly with sdX. Yes, that is the command I used to see if the drive was spun down. I’ve followed your previous posts very closely. Sometime this week I will revisit and see if it’s still acting up.
doron Posted September 11, 2020 Author Share Posted September 11, 2020

On 9/8/2020 at 9:00 PM, SimonF said:

@doron not sure if background media scans happen if drives in standby. I have disabled on my drives for when i was using idle timers. sdparm --clear=EN_BMS --save /dev/sdX You can see if background scans are active on Smartctl. Also maybe some other functions are accessing drive as I know Smartctl doesn't work with the standby option on SAS. Does the GUI check for standby? Do you run the IPMI plugin and checking disk temps as this may cause spinups also.

So it turns out that disabling this does further improve the situation, but it's still happening: the SAS drives still wake up from time to time, for no apparent reason and with no i/o done against them. With my two scripts running constantly, the overall standby times are longer - but it's not perfect. Maybe we can figure out some more reasons for these drives to spin up and out of STANDBY state. @SimonF? 🙂
Pourko Posted September 11, 2020 Share Posted September 11, 2020 (edited)

On 9/8/2020 at 7:13 AM, doron said:

I have a script that spins a SAS drive down when the syslog message about it is spewed by Unraid.

It doesn't seem wise to hitch your wagon to something that's been buggy for a very long time. Especially when there's a very easy way to do this yourself -- just read /sys/block/sdX/stat directly, and when you notice that there's been no i/o activity for a certain period of time, then just go ahead and spin down the drive. For example, I am attaching here my own little script that has been faithfully serving me for over five years now. Just disable all spindown stuff in the UI, start my script from your "go" file, and forget about it. (Note, the UI may also be buggy in the way it polls for smart data, thus spinning up your disks, so you may want to look into that too.)

#!/bin/bash
# spind: disk spin-down daemon
copy="Version 3.9 <c> 2020 by Pourko Balkanski"
prog=spind
####################################################################
MINUTES=${MINUTES:-60} # the number of idle minutes before spindown
####################################################################
idleTimeout=$(($MINUTES*60)) # in seconds
loopDelay=61 # seconds

kill $(pidof -x -o $$ $0) 2>/dev/null # our previous instances, if any
[ "$1" = "-q" ] && exit 0 # Don't start a new daemon if called with -q
renice 5 -p $$ >/dev/null # renice self

log () { logger -t $prog $@ ;}
log $copy

# Make a list of the disks that could be spun down
i=0
for device in /dev/[sh]d[aaa-zzz] ;do
  if proto=$(smartctl -i $device | grep -iE ' sas| sata| ide') ;then
    ((i++))
    devName[$i]=$device
    cmdStat[$i]="cat /sys/block/$(basename $device)/stat"
    devLastStat[$i]=$(${cmdStat[$i]})
    devSecondsIdle[$i]=0
    devError[$i]=0 # We'll use this to flag disks that won't spin down
    cmdSpinStatus[$i]="hdparm -C $device"
    cmdStandby[$i]="hdparm -y $device"
    if grep -iq ' SAS' <<<$proto ;then
      # Switch from /dev/sdX to /dev/sgN
      devName[$i]=$(sg_map26 $device)
      cmdSpinStatus[$i]="sdparm --command=sense ${devName[$i]}"
      cmdStandby[$i]="sg_start --pc=3 ${devName[$i]}"
    fi
    theList+="${devName[$i]} "
  fi
done
devCount=$i

if [ "$theList" = "" ] ;then
  log 'No supported disks found. Exiting.'
  exit 1
fi

log "Will spin down disks after $MINUTES minutes of idling."
log "Monitoring: $theList"

while :;do
  sleep $loopDelay
  for i in $(seq $devCount) ;do
    [ ${devError[$i]} -gt 2 ] && continue # this disk has previously failed to spin down.
    devNewStat[$i]=$(${cmdStat[$i]})
    if [ "${devNewStat[$i]}" != "${devLastStat[$i]}" ] ; then
      # Some i/o activity has occurred since the last time we checked.
      devSecondsIdle[$i]=0
      devLastStat[$i]=${devNewStat[$i]}
    else
      # No new activity since we last checked...
      # ...So, let's check its spin status
      if ${cmdSpinStatus[$i]} | grep -iq standby ; then
        devSecondsIdle[$i]=0
      else
        # it's currently spinning
        let "devSecondsIdle[$i] += $loopDelay"
        # Check if it's been idling for long enough...
        if [ ${devSecondsIdle[$i]} -gt $idleTimeout ] ; then
          # It is time to spin this one down!
          log "spinning down ${devName[$i]} "
          ${cmdStandby[$i]} >/dev/null 2>&1
          devSecondsIdle[$i]=0
          sleep 1 # no need to worry about race conditions here.
          # Check if the drive actually spun down as a result of our command
          if ${cmdSpinStatus[$i]} | grep -iq standby ;then
            devError[$i]=0
          else
            ((devError[$i]++))
            [ ${devError[$i]} -gt 2 ] && log "${devName[$i]} fails to spin down."
          fi
        fi
      fi
    fi
  done
done &
disown
exit 0

spind-3.9.zip

Edited September 11, 2020 by Pourko
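The core idle test the script above relies on - diffing successive reads of /sys/block/sdX/stat - boils down to a couple of lines. A minimal sketch (snap needs a real block device at runtime; io_happened is a pure string compare, so it can be exercised without any hardware):

```shell
# Any change between two snapshots of /sys/block/<dev>/stat means i/o
# hit the disk in between; an unchanged snapshot means the disk idled.
snap() { cat "/sys/block/$1/stat" ; }          # e.g. snap sdg
io_happened() { [ "$1" != "$2" ] ; }           # compares two snapshots

# typical use:
#   before=$(snap sdg); sleep 60; after=$(snap sdg)
#   io_happened "$before" "$after" && echo "sdg was busy"
```

This is why the approach is immune to UI state bugs: it observes the kernel's own i/o counters rather than what Unraid believes the disk is doing.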
SimonF Posted September 11, 2020 Share Posted September 11, 2020

I have written the standby function into smartctl for SCSI (SAS) devices, as it's currently only available for ATA, and submitted it to the owner for inclusion. Example outputs below - not sure if that would be of use.

root@Tower:~# smartctl -i -n standby /dev/sdg
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY BY COMMAND mode, exit(2)

root@Tower:~# smartctl -i -n standby /dev/sdh
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY BY TIMER mode, exit(2)

root@Tower:~# smartctl -i -n never /dev/sdh
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              HUS724030ALS640
Revision:             A1C4
Compliance:           SPC-4
User Capacity:        3,000,592,982,016 bytes [3.00 TB]
Logical block size:   512 bytes
LU is resource provisioned, LBPRZ=0
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca027baa9a8
Serial number:
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Sep 11 20:28:44 2020 BST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Power mode was:       STANDBY BY TIMER

I have also provided some code to the devs for changes to mdcmd, for inclusion within Unraid.
doron Posted September 11, 2020 Author Share Posted September 11, 2020

1 hour ago, Pourko said:

It doesn't seem wize to hitch your wagon to something that's been buggy for a very long time.

Hi. You are coming to a thread after a long discussion - please read it over. In short, hanging the action on the syslog thing (and the other script/hack I posted) is by no means a "solution" - it's a stopgap testing mechanism, to test our assumptions about the feasibility and the efficacy of the STANDBY command solution. So far, it generates mixed results: the drives do spin down, but after a while they spin back up. You'd not see it unless you monitor the STANDBY status of the drive rather closely. Looking at your script, it seems to be susceptible to the same issue.

By the way, to do what you are doing, you don't need to read i/o counters - you can tell the drive to automatically spin down after a certain amount of idle time. That, in turn, has two drawbacks: (a) Unraid's spin up/down management is not aware of this, so no UI display and settings dialog control. (b) Same problem as above - the drives do spin back up, with no apparent i/o, for reasons yet to be understood.

1 hour ago, Pourko said:

(Note, the UI may also be buggy in the way it polls for smart data, thus spinning up your disks, so you may want to look into that too.)

See previously in this thread.
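For reference, the drive-side idle timer mentioned above lives in the SCSI Power Condition mode page, which sdparm can edit. A rough sketch follows - treat the field acronyms (STANDBY, SCT) as an assumption, since they can differ between sdparm versions; list what your build actually exposes with `sdparm -p po /dev/sdX` before setting anything:

```shell
# Enable the standby condition and set its timer on a SAS drive.
# SCT (standby condition timer) is in 100 ms units, so 18000 = 30 minutes.
# --save writes the saved page so the setting survives a power cycle.
sdparm --set=STANDBY=1 --set=SCT=18000 --save /dev/sdX

# Read the fields back to verify:
sdparm --get=STANDBY --get=SCT /dev/sdX
```

As doron notes, Unraid's spin-up/down accounting won't know about a timer set this way, so the UI dots won't reflect it.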
Pourko Posted September 11, 2020 Share Posted September 11, 2020

1 minute ago, doron said:

you don't need to read i/o counters - you can tell the drive to automatically spin down after a certain amount of idle time.

Right. But I have a bunch of disks that disregard that setting. Which was the main reason I wrote my script. Anyway, I was only trying to help. For myself, I have a solution that has been working flawlessly on my server for years. If you don't like it -- forget I posted it. Cheers.
doron Posted September 11, 2020 Author Share Posted September 11, 2020

Just now, Pourko said:

Anyway, I was only trying to help. For myself, I have a solution that has been working flawlessly on my server for years. If you don't like it -- forget I posted it.

On the contrary, it would in fact be good for you to take part - you seem to have relevant experience - it'd just be good to get in sync with the discussion.
Pourko Posted September 11, 2020 Share Posted September 11, 2020 (edited)

See, I have the feeling that you are not correctly identifying the problem. The way I see it, the problem is not how to spin down disks; the problem is that some buggy scripts in the UI don't know how to properly query a disk without waking it up, and they don't know when to rightfully display a green ball (or whatever other color). Personally, I rarely use the UI for anything, and on my server disks spin down when they are supposed to, and they stay spun down.

From reading the posts in this thread, I have the impression that you are trying to fix things kind of backwards, i.e., you take some info from the UI (that does not match reality) and try to make the disk status match that unreal info from the UI. That is why I suggested that maybe you shouldn't bother doing it that way, and instead plead with the UI people to fix their UI, if the UI is that important to you. I hope this explanation makes some sense. :-)

Edited September 11, 2020 by Pourko