Sleep script spins up disks?



I'm preclearing a new 2TB drive and just happened to notice the power draw on my new kill-a-watt meter fluctuating a lot - like between 80 and 100 watts.  I listened to my server and heard a lot of fan noise  ;) and a faint sound like the disks spinning up and down cyclically.  After turning all my case fans to low I could hear it very well: it was the disks spinning up and down.  I'm only running the Basic version now, so I only have 3 disks.  What I heard was all three disks (which showed as spun down) spinning up for just a few seconds and then back down.  This process repeated at a constant frequency of about once a minute.

My immediate thought was that something in the sleep script, which runs hdparm every minute on all the disks, must be doing something here.  As a test I removed the initialization of my sleep script from the go file, rebooted, spun down all disks, and did not observe this repetitive spinning up and down like before.  So now I'm wondering: is this normal, or do I have something messed up in my sleep script or go file?  As I said, I'm preclearing a new 2TB drive, so the last thing I want is for my three disks to be spinning up and down every minute for the next ~30 hours. :-[

 

I've attached copies of my sleep (s3) and go files. 

go.txt

s3.txt


^^Not sure I understand why that applies to my problem.  I have previously observed what you mentioned when the server was preparing to sleep: if all drives were spun down, then just before the server entered S3 all the drives would spin up, and then the machine would go to sleep.  However, under the observed condition my server should not have been preparing to enter S3 sleep.  There are 4 drives connected: sda, sdb, sdc, & sdd.  You can see these are all listed in my s3 script.

#!/bin/bash
drives="/dev/sda /dev/sdb /dev/sdc /dev/sdd"
timeout=5
count=$timeout

while true; do
  # If none of the listed drives reports "active", count down; otherwise reset
  if ! hdparm -C $drives | grep -q active; then
    count=$((count - 1))
  else
    count=$timeout
  fi
  if [ $count -le 0 ]; then
    # Do pre-sleep activities
    sleep 5
    # Go to sleep (S3)
    echo 3 > /proc/acpi/sleep
    # Do post-sleep activities
    # Force NIC into gigabit mode
    # (might be needed; the NIC forgets about gigabit when it wakes up)
    ethtool -s eth0 speed 1000
    # Force a DHCP renewal (shouldn't be used for static-IP boxes)
    /sbin/dhcpcd -n
    sleep 5
    count=$timeout
  fi
  # Wait a minute
  echo COUNT $count
  sleep 60
done

When I observed this condition I was preclearing sdd, and the other drives were all spun down.  The script queries all of the listed drives using the hdparm command, and if at least one is active it will not prepare to enter sleep.  (Note: I did not write this script, so I don't completely understand it.)  So it appears to me that just performing an hdparm query causes a drive to spin up to report its status and then spin back down after it has replied.  Does this make sense?
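
For what it's worth, here is the once-a-minute check pulled out of the script so it can be run by hand (the drive list is copied from my script above; the sample output is just the form hdparm normally prints, not captured from my server):

hdparm -C /dev/sda /dev/sdb /dev/sdc /dev/sdd

# -C sends a CHECK POWER MODE query; it only reports the state and is
# not supposed to spin a drive up just to answer.  Typical output:
#  /dev/sda:
#   drive state is:  standby        <- spun down
#  /dev/sdd:
#   drive state is:  active/idle    <- spinning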


I don't know if this is relevant, but it seems to me that your disks are spinning down and immediately something spins them up.  This might mean that the following series of actions occurs:

(1) all drives seem asleep --> (2) server tries to sleep --> (3) drives spin up (a common phenomenon before going to sleep) --> (4) something stops the sleep from occurring

 

Since I don't know shell scripting that well I cannot help you thoroughly, but my guess is this:

Action (1) should not occur because one drive is preclearing.  That means there is a mistake in your script, most probably in this line:

drives="/dev/sda /dev/sdb /dev/sdc /dev/sdd"

 

One of the drives is the flash, and there are 3 disks + 1 preclearing, so there must exist a /dev/sde, which must be the disk that is being precleared (you can easily see that in the web menu, in the Devices tab).  That's why it is not being detected as awake, which leads to sleep, which in turn is stopped by the preclear action somehow.

 

Since you can never be sure which drive gets which letter, I propose you use these two lines somewhere:

flash=/dev/`ls -l /dev/disk/by-label | grep UNRAID | cut -d"/" -f3 | cut -c 1-3`

awakeHDDs=$(for d in $(ls /dev/[hs]d? | grep -v "$flash"); do hdparm -C $d | grep active; done | wc -l)

 

The first line identifies the flash drive; the second finds all drives (except the flash) that are awake and counts them.  These lines are part of the s3_simple.sh script, which can be found somewhere in the forum.

 


Thank you, papnikol, for the reply.  Here are the drive assignments:

 

sda = parity drive

sdb = data drive 1

sdc = data drive 2

sdd = new disk being precleared

sde = flash disk

 

Before I added the new 2TB drive to my array, my sleep script looked like this...

drives="/dev/sda /dev/sdb /dev/sdc"

 

I forgot to add "/dev/sdd" to that line of the script, and the server actually entered S3 sleep during my first attempted preclear.  Whoops.  So I added "/dev/sdd" and that corrected the problem.  However, now I have this strange spinning-up-then-immediately-back-down problem.

 

Since I don't know shell scripting that well I cannot help you thoroughly, but my guess is this:

Action (1) should not occur because one drive is preclearing.  That means there is a mistake in your script, most probably in this line:

drives="/dev/sda /dev/sdb /dev/sdc /dev/sdd"

I do not know shell scripting well either; that's why I am asking for help.  I believe that something in my script is causing this problem.  The script polls all of these drives once a minute using the hdparm command to ascertain drive status, and my suspicion was that executing the hdparm command on a spun-down/sleeping drive caused the drive to temporarily spin up to report its status.  Now, the code you posted appears to only poll drives that are awake (i.e. not spun down), so this might actually fix my issue - at least it will test my theory.  I also believe this problem would not just occur when preclearing a drive, but whenever a drive is spun down, which could happen in a lot of situations; I just happened to discover it while preclearing.  I'll try this new code out tonight to see if it works.
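
For reference, here is roughly how I picture those two lines slotting into the polling half of my s3 script (just a sketch based on papnikol's snippet; it assumes the flash carries the UNRAID volume label, as s3_simple.sh expects):

#!/bin/bash
# Sketch: poll every real disk (flash excluded) instead of a hard-coded list
flash=/dev/$(ls -l /dev/disk/by-label | grep UNRAID | cut -d"/" -f3 | cut -c 1-3)
timeout=5
count=$timeout

while true; do
  # Count the drives (other than the flash) that report "active"
  awakeHDDs=$(for d in $(ls /dev/[hs]d? | grep -v "$flash"); do
                hdparm -C $d | grep active
              done | wc -l)
  if [ "$awakeHDDs" -eq 0 ]; then
    count=$((count - 1))
  else
    count=$timeout
  fi
  # ...same sleep/wake-up handling as in my current script once count hits 0...
  echo COUNT $count
  sleep 60
done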

 

The funny thing is that I used the s3 script documented on the Setup Sleep (S3) and Wake on Lan (WOL) wiki page.  All I did was update the drive assignments in the script as the directions indicated, so I would suspect that anyone else who used that script would have the same issue I do.  The script works (i.e. puts the server to sleep and wakes it up), but it seems to cause this unusual behavior of spinning up sleeping drives, which I consider a very bad side effect.  Most users would probably not even notice, because it is hard to hear the drives spinning up and down over the case fans.  I had to turn my fans to low and take the side panel off my case (Antec 300) before I could be sure that what I was hearing was in fact the drives spinning up and then back down again.


It could be as simple as the firmware on your specific disk being buggy/different and spinning up the drive when an hdparm -C command is issued to it.

I suppose, but I'm pretty sure that what I heard was all three disks spinning up in rapid succession (within a second of each other) and then back down.  This repeated over and over at a frequency of approximately once a minute, which is how often the sleep script polls the drives using hdparm.  I'm certain it was not just a single disk spinning up and then down.  The three disks are all WDs (WD10EARS, WD1001FALS, & WD1600AAJS).  The two 1TB drives are new (~2 months old) and the 160GB drive is about 18 months old.  I've never updated the firmware on any of them, so based on the assortment of drives I have, and the fact that all of them were acting this way, I doubt it is buggy firmware.  I have an old IBM Deskstar drive that I can put in the server; maybe I'll try that one as well to see how it responds.

It could be as simple as the firmware on your specific disk being buggy/different and spinning up the drive when an hdparm -C command is issued to it.

 

To check if Joe is right, you just have to issue the hdparm command manually and see what happens.
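
Something like this should settle it (a minimal test; I'm assuming /dev/sdb is one of the drives that is currently spun down):

hdparm -y /dev/sdb    # -y forces the drive into standby (spun down)
sleep 10
for i in 1 2 3; do
  hdparm -C /dev/sdb  # pure status query; it should keep reporting
                      # "standby" without waking the drive
  sleep 20
done

If the drive audibly spins up (or the kill-a-watt jumps) on the -C queries alone, then the firmware theory fits.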

 

Also, maybe try and see what other processes are running; maybe there is something strange.

 

Finally, maybe try posting your syslog.  I don't know if I can help you, but maybe someone else can...
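
For example (generic commands, nothing unRAID-specific):

ps aux                        # anything unexpected running?
tail -n 100 /var/log/syslog   # recent entries around the time of the spin-ups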


Well, my theory that the hdparm command was spinning up the HDDs was incorrect.  If that were true, this strange spinning up and back down would occur whenever a disk was spun down (aka asleep), and I have verified that is not happening.  So I'm back to what papnikol suggested before.

I don't know if this is relevant, but it seems to me that your disks are spinning down and immediately something spins them up.  This might mean that the following series of actions occurs:

(1) all drives seem asleep --> (2) server tries to sleep --> (3) drives spin up (a common phenomenon before going to sleep) --> (4) something stops the sleep from occurring

What I observed was exactly the same as what I've seen when my server goes to sleep.  It's as though the server was trying to go to sleep (when it shouldn't have) but something stopped it.  I'm focusing on steps 1 and 4 of the sequence.  Is it possible that a drive being precleared would appear asleep to the s3 script, but then, when the server tries to sleep, the preclear somehow stops the sleep process?  I've just finished swapping out a drive in my server, so now I can test this condition to see if I can reproduce it.  Once I can reproduce it, then (with you guys' help) I might be able to solve this mystery.
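
One way to catch it in the act might be a small logging loop running alongside the sleep script (a sketch; the log path is just an example):

#!/bin/bash
# Record what hdparm -C reports for every drive once a minute, so the log
# shows exactly what state each drive was in when the spin-ups happened
while true; do
  { date; hdparm -C /dev/sd[a-d]; echo; } >> /var/log/drive-state.log
  sleep 60
done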

Well, now I'm totally clueless.  After I noticed that strange behavior at the beginning of my preclear, I removed the initialization of the sleep script from the go file because I wanted to get my new drive checked out.  I've left the sleep script out for the past few days while I've been reorganizing my array - swapping drives in and out, preclearing, rebuilding parity, etc.  I finished that up on Wednesday night.  Last night I put the two lines back in my go file to initialize the sleep script.  I installed a spare 160GB HDD as well, rebooted the server, and started a preclear on the 160GB drive.  So I'm right back to the scenario where I first noticed the anomaly.  Once all the drives in my array spun down I waited, and nothing happened - no spinning up and down of the idle drives like before.  WTF ???  I don't know what to say; I know it happened once before, but I cannot reproduce it now.  Of course my array is different now after adding a new drive, and I'm preclearing a different HDD, so that may have something to do with it.  So I'm just going to have to let this one go, I guess.  Maybe there was something specific to the firmware of that WD20EARS drive?  I will be paying close attention to see if it ever happens again.  Maybe if I sneak down to the basement in stealth mode I can catch it in the act!  8)  Sorry for the distraction.

