[Plugin] Spin Down SAS Drives


doron

57 minutes ago, doron said:

I'll use that as a filter in the script, ergo "can work with that".

Probably there are no true non-SAS "SCSI" controllers out there, as in parallel SCSI or Fibre Channel.  I know we don't include any of the parallel scsi drivers into our kernel.  I would think it's safe to assume that any scsi is going to be SAS.

 

The purpose of the 'emhttp_device_scsi_smart' script is to produce the output of 'smartctl -A'.  emhttpd will execute a shell command like this (example device sdb):

"{ /usr/local/sbin/emhttp_device_scsi_smart sdb > /var/local/emhttp/smart/sdb.new ; mv /var/local/emhttp/smart/sdb.new /var/local/emhttp/smart/sdb } &"

What this does is invoke the transport-specific script, which uses smartctl and writes its output to a 'sdb.new' file.  After this has completed, the 'mv' command renames the file from 'sdb.new' to 'sdb'.  In Linux this is atomic; that is, if a process is reading /var/local/emhttp/smart/sdb at the time the 'mv' executes, the process still reads the old file.  When the process closes the file, the vfs layer will then delete the old file.  Hope this makes sense.
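
The write-then-rename step can be sketched as a small shell helper. This is only an illustration: 'publish_atomically' is a made-up name, and unlike the one-liner above it skips the rename when the producer command fails, which is a choice of this sketch and not something emhttpd is stated to do.

```shell
#!/bin/bash
# Sketch of the write-then-rename pattern.
# publish_atomically is a hypothetical helper name, not part of Unraid.
# Usage: publish_atomically TARGET COMMAND [ARGS...]
publish_atomically() {
    local target=$1; shift
    local tmp="${target}.new"
    # Run the producer command, capturing its output in the temporary file.
    # On failure, drop the partial file and leave the old target untouched.
    "$@" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Both names live in the same directory, so mv is a plain rename and
    # readers see either the old file or the new one, never a partial file.
    mv "$tmp" "$target"
}
```

For example, 'publish_atomically /var/local/emhttp/smart/sdb smartctl -A /dev/sdb' would mirror the command shown above for device sdb.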

 

Note that those smartctl calls are relatively expensive - meaning, not only do they take a long time to execute, they also totally disrupt any I/O stream taking place to the device.  Typically the driver has to completely flush its I/O queue, plug incoming I/O requests, execute the sense-mode (or whatever) command to get the SMART data, then finally unplug normal I/O.  This is one reason emhttpd wants to manage spin up/down - because it keeps track of a device's spinning state without having to actually interrogate the device.  (The other reason is to implement spinup groups.)

 

BTW: can you post here a sample output of 'smartctl -A /dev/sdX' of a SAS drive you have?

Link to comment
22 minutes ago, limetech said:

Probably there are no true non-SAS "SCSI" controllers out there, as in parallel SCSI or Fibre Channel.  I know we don't include any of the parallel scsi drivers into our kernel.  I would think it's safe to assume that any scsi is going to be SAS.

Well, this drive is a SATA SSD:

/dev/disk/by-id/scsi-1ATA_TS120GSSD220S_<redacted>

It is under ESXi though (RDM), so you may think of it as an exception.

 

Quote

Hope this makes sense.

It does. Thanks.

Quote

BTW: can you post here a sample output of 'smartctl -A /dev/sdX' of a SAS drive you have?

Sure, here you go. This is with -A:

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature:     26 C
Drive Trip Temperature:        85 C

Manufactured in week 36 of year 2018
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  739
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  16374
Elements in grown defect list: 0

And this is with -a (little a):

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              HUH721212AL4200
Revision:             A3D0
Compliance:           SPC-4
User Capacity:        12,000,138,625,024 bytes [12.0 TB]
Logical block size:   4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca2708b9bf8
Serial number:        <redacted>
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Sun Nov 22 02:14:43 2020 IST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature:     28 C
Drive Trip Temperature:        85 C

Manufactured in week 36 of year 2018
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  740
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  16375
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 768214057353216

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0      60350      60070.756           0
write:         0        0         0         0          7      24002.699           0
verify:        0        0         0         0        811          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -     148                 - [-   -    -]
# 2  Background short  Completed                   -     128                 - [-   -    -]
# 3  Background short  Completed                   -      22                 - [-   -    -]

Long (extended) Self-test duration: 65535 seconds [1092.2 minutes]

Note that both spin the drive up.

Edited by doron
Link to comment
16 minutes ago, doron said:

Well, this drive is a SATA SSD:


/dev/disk/by-id/scsi-1ATA_TS120GSSD220S_<redacted>

It is under ESXi though (RDM), so you may think of it as an exception.

 

For those devices it will invoke:

/usr/local/sbin/emhttp_device_scsi-1ATA_<operation>

 

Here is the complete list of transports emhttpd currently recognizes:

ata
scsi-SATA
scsi-1ATA
scsi
usb
nvme
virtio
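
To illustrate how a script path could be put together from that list, here is a rough sketch. The transport names and the '/usr/local/sbin/emhttp_device_<transport>_<operation>' pattern come from the posts above; how emhttpd actually decides which transport a device belongs to is not stated in this thread, so matching the /dev/disk/by-id name on its longest known prefix is purely an assumption, and the function names are hypothetical.

```shell
#!/bin/bash
# Transport names from the list above, longest prefixes first so that
# "scsi-1ATA" and "scsi-SATA" are tried before plain "scsi".
TRANSPORTS="scsi-SATA scsi-1ATA virtio nvme scsi ata usb"

# ASSUMPTION: classify a by-id name by its prefix. Not emhttpd's real logic.
transport_of() {
    local id=$1 t
    for t in $TRANSPORTS; do
        case $id in
            "$t"-*|"$t"_*) echo "$t"; return 0 ;;
        esac
    done
    return 1
}

# Build the callout path for a by-id name and an operation such as "smart".
script_for() {
    local id=$1 op=$2 t
    t=$(transport_of "$id") || return 1
    echo "/usr/local/sbin/emhttp_device_${t}_${op}"
}
```

With this sketch, 'script_for scsi-1ATA_TS120GSSD220S_<redacted> smart' yields the scsi-1ATA variant of the script path, while a by-id name that starts with plain 'scsi-' falls through to the scsi variant.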

 

20 minutes ago, doron said:

Sure, here you go. This is with -A:

Thanks.  This is consistent with my results from a couple years ago first trying to get SAS spin control to work.  The temperature is being reported correctly.  It's possible before I release that I might get rid of the 'smart' <operation> script callout.
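
For reference, the temperature line in SAS-style 'smartctl -A' output like the sample above can be picked out with a short filter. This is a minimal sketch; 'sas_temperature' is a hypothetical helper name and it assumes the "Current Drive Temperature:" label appears exactly as in that sample.

```shell
#!/bin/bash
# Read 'smartctl -A' text on stdin and print the current temperature in C.
# Exits non-zero if no "Current Drive Temperature:" line is present.
sas_temperature() {
    awk -F: '
        /^Current Drive Temperature:/ {
            split($2, f, " ")   # $2 looks like "     26 C"
            print f[1]
            found = 1
            exit                # stop at the first match, then run END
        }
        END { exit !found }
    '
}
```

Something like 'smartctl -A /dev/sdX | sas_temperature' would then print just the number, bearing in mind that the smartctl call itself spins the drive up.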

 

22 minutes ago, doron said:

Note that both spin the drive up.

Yup.  Also the '-n standby' seems to work for ATA but not SAS.  This also is handled by emhttpd - that is, it only issues 'smartctl' under these conditions:

  1. the array is Stopped: in this case it's polled every 1 sec
  2. the array is Started and we just detected a spin-up, either by I/O or by explicit spin up command: in this case we issue a single smartctl to get the temperature
  3. the array is Started and poll_interval (Settings/Disk Settings) has elapsed.
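
The three conditions above can be written down as a predicate. This is only a sketch of the decision as listed: emhttpd is not a shell script, its real bookkeeping is not shown in this thread, and the function name and argument layout are hypothetical.

```shell
#!/bin/bash
# Usage: should_run_smartctl ARRAY_STATE JUST_SPUN_UP SECS_SINCE_LAST POLL_INTERVAL
#   ARRAY_STATE      "Stopped" or "Started"
#   JUST_SPUN_UP     "yes" if a spin-up was just detected, else "no"
#   SECS_SINCE_LAST  seconds since the last smartctl call for this device
#   POLL_INTERVAL    the poll_interval setting, in seconds
should_run_smartctl() {
    local state=$1 just_spun_up=$2 elapsed=$3 interval=$4
    if [ "$state" = "Stopped" ]; then
        [ "$elapsed" -ge 1 ]            # condition 1: polled every 1 sec
    elif [ "$just_spun_up" = "yes" ]; then
        return 0                        # condition 2: single call after spin-up
    else
        [ "$elapsed" -ge "$interval" ]  # condition 3: poll_interval has elapsed
    fi
}
```

The sketch does not model whether the device is currently spinning; it only encodes the three cases as stated.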
Link to comment
2 minutes ago, limetech said:

 

For those devices it will invoke:


/usr/local/sbin/emhttp_device_scsi-1ATA_<operation>

 

Here is the complete list of transports emhttpd currently recognizes:

Perfect. Thx.

2 minutes ago, limetech said:

Yup.  Also the '-n standby' seems to work for ATA but not SAS.

Indeed. BTW my plugin installs a wrapper for smartctl which reinstates the "-n standby" thingie (until smartmontools.org fix it).
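
For readers wondering what such a wrapper might look like, here is a rough sketch. It is NOT the plugin's actual code: 'REAL_SMARTCTL', its default path, and all function names are assumptions, and it assumes that 'sdparm --command=sense' prints a line containing "Standby condition activated" for a spun-down SAS drive.

```shell
#!/bin/bash
# Hypothetical location of the original binary after the wrapper replaces it.
REAL_SMARTCTL=${REAL_SMARTCTL:-/usr/sbin/smartctl.real}

# True if the arguments contain "-n standby" or "--nocheck=standby".
wants_standby_check() {
    local prev="" arg
    for arg in "$@"; do
        [ "$prev" = "-n" ] && [ "$arg" = "standby" ] && return 0
        [ "$arg" = "--nocheck=standby" ] && return 0
        prev=$arg
    done
    return 1
}

# True if the sense text on stdin reports a standby power condition.
# ASSUMPTION: the wording matches what sdparm prints on your system.
sense_says_standby() {
    grep -q "Standby condition activated"
}

# Skip the real smartctl when a standby check was requested and the drive
# reports standby; otherwise pass everything through unchanged.
smartctl_wrapper() {
    local dev="${!#}"    # the last argument is taken to be the device
    if wants_standby_check "$@" &&
       sdparm --command=sense "$dev" 2>/dev/null | sense_says_standby; then
        echo "smartctl wrapper: $dev is in standby, skipping"
        return 2
    fi
    "$REAL_SMARTCTL" "$@"
}
```

The return value of 2 is meant to follow smartctl's documented default exit status for '-n' when a device is skipped; treat that as something to verify against your smartctl version.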

Link to comment
  • 2 weeks later...

hi people!

i am actually building a new NAS obviously based on unRAID and this time it will have a 24 drive case!!!

my previous one was based on SATA hdd and looking for new drives i've found really good prices for SAS drives.

my expander card and the rack support SAS but i was worried about spin down so thank you so much for this add-on.

could you suggest some SAS drives that work with this spin down plugin?

i have read that the HGST HUS724030ALS640 is working. any other model?

Link to comment

Doron,

 

Any way to switch back to version .6? That version seemed to work as long as no timers were set on my Seagate Exos 7 drives. The new version seems to not spin down my drives anymore. I checked with  sdparm --command=sense /dev/sd[d-j]  to check all my drives, and no additional lines are sensed.

 

Thanks in advance

Link to comment
3 hours ago, nlcjr said:

Any way to switch back to version .6? That version seemed to work as long as no timers were set on my Seagate Exos 7 drives. The new version seems to not spin down my drives anymore. I checked with  sdparm --command=sense /dev/sd[d-j]  to check all my drives, and no additional lines are sensed.

Hmm, that Should Not Occur™.

Can you share the log lines you get around the time of the spindown (i.e. including and immediately following the "spindown" message for one of your SAS drives)?

Do you see any messages at all with "SAS Assist" prefix?

Also, can you share your /etc/rsyslog.conf ?

Link to comment

Sure,

 

I made some changes since I sent the forum post, so hopefully the logs aren't too confusing.

 

When I had .6 installed I could also see the drive go to standby by the drive ready light going out. That no longer occurs.  Parity drives are set to never standby as they don't seem to work correctly, being NetApp drives, i.e. no grey ball and lots of log entries trying over and over to spin down. 0 and 29 I think.

 

All others but cache are set to 1 hour. They do change to grey ball but no power savings via supermicro power consumption display. Drive ready light also stays lit on supermicro.

 

Thanks for your relentless pursuit to save watts!

 

u3.JPG

u2.JPG

u1JPG.JPG

syslog.txt

Link to comment
17 hours ago, doron said:

 

(sorry for previous misfire, I was reading on a tiny phone)

 

Does this same drive react the same way to direct spindown? e.g.:


sg_start -r --pc=3 /dev/sdd

sdparm --command=sense /dev/sdd

 

is it safe to run that command while the array is in use?

Link to comment
5 hours ago, nlcjr said:

Yes it's the same result.

This is essentially what the plugin does when it tries to spin down SAS drives. So either this does nothing in your setup (combination of hard drive and controller), or something else is going on. 

Could it be that you have constant i/o against the array - in which case, a spun-down drive will immediately spin back up? Do you see i/o counters (read or write) on the main page moving, during that time?

Link to comment
On 11/22/2020 at 1:12 AM, limetech said:

I don't think this is necessary.  emhttpd won't invoke smartctl unless device is spinning.

The change for -n standby has now been added to smartctl with release 7.2

 

The change is not in 7.1 but will be in 7.2. Not sure if you would use the source from this build or wait for the 7.2 one, which I think will be out at the end of the year. The build is here: https://circleci.com/gh/smartmontools/smartmontools/1121

 

Sorry I missed the updates re changes to md and scripts. Is there a process that is checking for spinups outside of Unraid control?

Link to comment
1 hour ago, SimonF said:

The change for -n standby has now been added to smartctl with release 7.2

Thanks for doing that. That's great news.

1 hour ago, SimonF said:

Sorry I missed the updates re changes to md and scripts. Is there a process that is checking for spinups outside of Unraid control?

Unraid (in kernel up to 6.8 and in userspace in the future) looks at i/o activity to determine spin up of a drive.
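
As a generic illustration of the idea (not of Unraid's actual implementation, which is not shown in this thread), i/o activity on a Linux block device can be detected by comparing the completed-request counters in /sys/block/<dev>/stat between two points in time. The helper names below are hypothetical.

```shell
#!/bin/bash
# Print reads completed + writes completed, i.e. fields 1 and 5 of a
# /sys/block/<dev>/stat style file given as $1.
io_completed() {
    awk '{ print $1 + $5 }' "$1"
}

# Succeed if the counter in stat file $1 differs from a previously saved
# value $2, meaning some read or write completed in between.
had_io_since() {
    [ "$(io_completed "$1")" != "$2" ]
}
```

A monitor could save 'io_completed /sys/block/sdd/stat' at spin-down time and later call 'had_io_since /sys/block/sdd/stat "$saved"' to decide whether the drive has probably been woken by i/o.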

Link to comment
17 hours ago, doron said:

This is essentially what the plugin does when it tries to spin down SAS drives. So either this does nothing in your setup (combination of hard drive and controller), or something else is going on. 

Could it be that you have constant i/o against the array - in which case, a spun-down drive will immediately spin back up? Do you see i/o counters (read or write) on the main page moving, during that time?

So D, you got me digging deeper. I found on those drives IDLE_B=1, IDLE_C=1, and IDLE=1. So I changed those to zero.

These drives are already sleeping according to Unraid (grey ball). Checked with sdparm, they show no additional lines, as before. When I issue the sg_start and mdcmd commands, they both work when checked with sdparm, and the light goes out on the Supermicro drive bay. Within a few minutes the drive spins back up and the Supermicro light comes back on. But they are still grey-balled and sleeping in Unraid.

 

Maybe Cache Directories is not working properly. I do not see any activity on the reads/writes.

 

Thanks for at least getting me closer.

Link to comment
22 hours ago, doron said:

I've just pushed a new version - 0.8 - of the plugin, supporting 6.9.0-rc1 and its new spin up/down mechanisms.

As always, please report issues.

Many thanks, doron and limetech, for working through this. My setup is finally working. I installed the new release candidate version and .8 spindown. Started an MC session of about 1 TB and went to bed. This AM all drives are in standby properly. At idle I am finally down below 50 W on my Supermicro 16-bay JBOD.

 

Keep up the great work.

Link to comment
3 hours ago, nlcjr said:

My setup is finally working.

Thanks for reporting! I'm really happy to hear that.

 

Would you mind running

/usr/local/emhttp/plugins/sas-spindown/sas-util

and send me the resulting file (/tmp/sas-util-out)?

(when run without parameters it doesn't do anything intrusive, just reports the HDD and controller(s) models in JSON format)

You can pm me or post here (no sensitive info such as serial numbers etc. is shared). Thanks

Link to comment
