• [6.9.0] HDD's no spin down after update


    xruchai
    • Solved Minor

    After updating to 6.9 Final, the SATA HDDs no longer go into standby after the 30-minute spin-down delay.

     

    I also set the delay to 15 minutes, but the HDDs still don't go into standby.

    I had not changed any system settings before the update. I only changed things afterwards, while trying to solve the problem (uninstalling plugins, etc.).

     

    Before the update, on 6.8.3, the spin down worked fine.

     

    I hope that you can help.

     

    • Like 2



    User Feedback

    Recommended Comments



    If you click the spin-down button on Main, do they spin down?

     

    Also, autofan reads drive temperatures. I'm not sure where it gets them from, but if it calls smartctl without -n standby, that will spin up the disks.

     

    I think telegraf also caused issues in the betas and the 6.9 RCs. Do you use that?

    Edited by SimonF
    • Like 1
    Link to comment
    3 minutes ago, SimonF said:

    If you click the spin-down button on Main, do they spin down?

     

    Also, autofan reads drive temperatures. I'm not sure where it gets them from, but if it calls smartctl without -n standby, that will spin up the disks.


    If I set the HDDs to standby manually and then run smartctl -n standby -i /dev/sdb, I get the message: Device is in standby mode.

    The hard drives remain in standby so there is no access, that works.
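The manual check described here can be scripted. The following is a minimal sketch, assuming smartctl's documented behavior of exiting with status 2 when `-n standby` finds the drive in a low-power mode. The function name `drive_state` and the `SMARTCTL` override variable are hypothetical and exist only for illustration.

```shell
#!/bin/bash
# Hypothetical helper, not part of Unraid or any plugin.
# SMARTCTL can be overridden (e.g. for testing); it defaults to the real binary.
SMARTCTL="${SMARTCTL:-smartctl}"

# drive_state DEVICE
# Prints "standby" if smartctl refuses to query the drive because it is
# spun down, otherwise "active-or-unknown".
# Caveat: exit status 2 can also mean the device could not be opened.
drive_state() {
  local dev="$1" rc=0
  "$SMARTCTL" -n standby -i "$dev" >/dev/null 2>&1 || rc=$?
  if [ "$rc" -eq 2 ]; then
    echo "standby"
  else
    echo "active-or-unknown"
  fi
}
```

Calling `drive_state /dev/sdb` right after a manual spin-down should print `standby` without waking the drive, which matches the behavior reported above.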

    To me it looks like the timer isn't running or the spin down task isn't triggered?

     

    I don't use telegraf.

    Edited by xruchai
    Link to comment
    2 minutes ago, xruchai said:

    spin down task isn't triggered

    Or something may be accessing the disks, so Unraid still thinks I/O is happening.

     

    The spin-down process changed in the 6.9 RCs compared with 6.8.3. Did you see my question about telegraf?

     

     

    Link to comment

    Yes, I saw it late and then edited my post. I don't use telegraf.

    There is also no activity on the hard drives, only on the SSDs, which is how it should be. Strangely, when I manually put the hard drives into standby, they stay in that state.

    As I said, I haven't changed anything since the update from 6.8.3 to 6.9.0 that could influence this behavior :(.

    Edited by xruchai
    Link to comment

    You ought to reboot into safe mode and see if the disks spin down. If they do then some plugin is the cause.

     

    • Like 1
    Link to comment

    In my observations, having Dynamix Fan Control installed kept my drives spun up. I rely on that plugin a lot, so I had to drop back to 6.9 Beta-35, which appears to have no problems with the drives or my fans.

    Link to comment

    Looking at the code for Autofan, it may cause an issue for people with SAS drives because it uses hdparm. @limetech, I am guessing the use of smartctl will show up as activity against the drive? But why would that be different from pre-6.9?

     

    function_get_highest_hd_temp() {
      HIGHEST_TEMP=0
      for DISK in "${HD[@]}"; do
        SLEEPING=`hdparm -C ${DISK} | grep -c standby`
        if [[ $SLEEPING -eq 0 ]]; then
          if [[ $DISK == /dev/nvme[0-9] ]]; then
            CURRENT_TEMP=$(smartctl -A $DISK | awk '$1=="Temperature:" {print $2;exit}')
          else
            CURRENT_TEMP=$(smartctl -A $DISK | awk '$1==190||$1==194 {print $10;exit}')
          fi
          if [[ $HIGHEST_TEMP -le $CURRENT_TEMP ]]; then
            HIGHEST_TEMP=$CURRENT_TEMP
          fi
        fi
      done
    }
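One variation that may be worth testing is to let smartctl itself skip sleeping disks via `-n standby` instead of probing with hdparm first. The sketch below is an illustration only, not the plugin's code and not a confirmed fix. `parse_temp`, `highest_temp`, and the `SMARTCTL` override are hypothetical names; the awk filter is the same attribute 190/194 match used in the function above. As noted elsewhere in this thread, smartctl 7.1 reportedly does not support `-n standby` for SAS drives, so this would only help with SATA devices.

```shell
#!/bin/bash
# Hypothetical sketch; SMARTCTL can be overridden for testing.
SMARTCTL="${SMARTCTL:-smartctl}"

# Read `smartctl -A` output on stdin and print the raw value (column 10)
# of the first SMART attribute with ID 190 or 194.
parse_temp() {
  awk '$1==190||$1==194 {print $10; exit}'
}

# highest_temp DEVICE...
# Prints the highest temperature among the given drives, or 0 if none
# reported one. Drives in standby are skipped, because `-n standby` makes
# smartctl exit before querying them.
highest_temp() {
  local highest=0 disk temp
  for disk in "$@"; do
    temp=$("$SMARTCTL" -n standby -A "$disk" 2>/dev/null | parse_temp) || true
    # Only accept a plain integer; empty output means "asleep or no data".
    if [[ "$temp" =~ ^[0-9]+$ ]] && (( temp > highest )); then
      highest=$temp
    fi
  done
  echo "$highest"
}
```

Usage would look like `highest_temp /dev/sdb /dev/sdc`. NVMe devices would need the separate `Temperature:` match from the original function and are left out here.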

     

    Link to comment
    44 minutes ago, limetech said:

    If you remove Dynamix System Autofan plugin, does issue persist?

     

    I've not tried removing it to be honest. When I was tinkering with RC2 and Beta-35 the only difference on my end was installing the RC2 so whatever changed I can only "assume" was the culprit and I saw a post on the forum from somebody else who noticed the same thing. 

     

    Not putting any blame on anything, I'll give Final an install soon and take a look before removing the plugin to see if it changes anything.

    Link to comment

    Some info from my log right after I manually spin down the array:

     

    Mar  2 12:37:17 Tower emhttpd: spinning down /dev/sdl
    Mar  2 12:37:20 Tower emhttpd: spinning down /dev/sdk
    Mar  2 12:37:20 Tower emhttpd: spinning down /dev/sdh
    Mar  2 12:37:21 Tower emhttpd: spinning down /dev/sdj
    Mar  2 12:38:02 Tower kernel: sd 7:0:4:0: attempting task abort!scmd(0x000000007e3428ef), outstanding for 7017 ms & timeout 7000 ms
    Mar  2 12:38:02 Tower kernel: sd 7:0:4:0: [sdl] tag#3214 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
    Mar  2 12:38:02 Tower kernel: scsi target7:0:4: handle(0x000d), sas_address(0x4433221105000000), phy(5)
    Mar  2 12:38:02 Tower kernel: scsi target7:0:4: enclosure logical id(0x5c81f660f5419d00), slot(6) 
    Mar  2 12:38:02 Tower kernel: sd 7:0:4:0: task abort: SUCCESS scmd(0x000000007e3428ef)
    Mar  2 12:38:04 Tower emhttpd: read SMART /dev/sdl
    Mar  2 12:38:12 Tower kernel: sd 7:0:3:0: attempting task abort!scmd(0x00000000637eab23), outstanding for 7016 ms & timeout 7000 ms
    Mar  2 12:38:12 Tower kernel: sd 7:0:3:0: [sdk] tag#3223 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
    Mar  2 12:38:12 Tower kernel: scsi target7:0:3: handle(0x000c), sas_address(0x4433221106000000), phy(6)
    Mar  2 12:38:12 Tower kernel: scsi target7:0:3: enclosure logical id(0x5c81f660f5419d00), slot(5) 
    Mar  2 12:38:12 Tower kernel: sd 7:0:3:0: task abort: SUCCESS scmd(0x00000000637eab23)
    Mar  2 12:38:15 Tower emhttpd: read SMART /dev/sdk
    Mar  2 12:38:22 Tower kernel: sd 7:0:2:0: attempting task abort!scmd(0x000000005b9e412a), outstanding for 7063 ms & timeout 7000 ms
    Mar  2 12:38:22 Tower kernel: sd 7:0:2:0: [sdj] tag#3236 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
    Mar  2 12:38:22 Tower kernel: scsi target7:0:2: handle(0x000b), sas_address(0x4433221101000000), phy(1)
    Mar  2 12:38:22 Tower kernel: scsi target7:0:2: enclosure logical id(0x5c81f660f5419d00), slot(2) 
    Mar  2 12:38:22 Tower kernel: sd 7:0:2:0: task abort: SUCCESS scmd(0x000000005b9e412a)
    Mar  2 12:38:25 Tower emhttpd: read SMART /dev/sdj
    Mar  2 12:38:32 Tower kernel: sd 7:0:0:0: attempting task abort!scmd(0x0000000000b292e2), outstanding for 7069 ms & timeout 7000 ms
    Mar  2 12:38:32 Tower kernel: sd 7:0:0:0: [sdh] tag#3248 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
    Mar  2 12:38:32 Tower kernel: scsi target7:0:0: handle(0x0009), sas_address(0x4433221100000000), phy(0)
    Mar  2 12:38:32 Tower kernel: scsi target7:0:0: enclosure logical id(0x5c81f660f5419d00), slot(3) 
    Mar  2 12:38:32 Tower kernel: sd 7:0:0:0: task abort: SUCCESS scmd(0x0000000000b292e2)
    Mar  2 12:38:35 Tower emhttpd: read SMART /dev/sdh
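To see at a glance how quickly each drive is touched again after a spin-down, log lines like the ones above can be summarized. This is a rough sketch that assumes the emhttpd message format shown in this log ("spinning down DEVICE" and "read SMART DEVICE") and that both events fall on the same day; `wake_delays` is a hypothetical name.

```shell
#!/bin/bash
# wake_delays: read syslog lines on stdin and, for every drive, print
# "DEVICE SECONDS", the time between an emhttpd "spinning down" message
# and the next emhttpd "read SMART" message for that drive.
# Assumption: timestamps do not cross midnight.
wake_delays() {
  awk '
    # Convert HH:MM:SS to seconds since midnight.
    function secs(t,    p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }

    $5 == "emhttpd:" && $6 == "spinning" && $7 == "down" {
      down[$8] = secs($3)
      next
    }
    $5 == "emhttpd:" && $6 == "read" && $7 == "SMART" && ($8 in down) {
      printf "%s %d\n", $8, secs($3) - down[$8]
      delete down[$8]
    }
  '
}
```

Piping the syslog through it (the log path depends on the system) gives one line per spin-down/read pair. Short delays suggest something is querying the drives shortly after they are spun down.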

     

    I have 4 array drives (3 + parity) attached to a Dell PERC H310 (flashed to LSI 9211-8i). I AM using telegraf, but [inputs.smart] is disabled in the config files.

     

    SSDs are connected right to MB headers.

     

    UPDATE before I posted: I found a separate [inputs.hddtemp] in telegraf.conf and disabled it....it appears to have fixed my immediate spin-up issue.  
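Because a second disk-related input was easy to miss here, a quick scan of the config for active (uncommented) disk inputs may help. This is a sketch under assumptions: the function name `active_disk_inputs` is made up, the path to telegraf.conf has to be supplied by the caller, and the pattern accepts both the `[inputs.x]` spelling used in this post and the `[[inputs.x]]` form.

```shell
#!/bin/bash
# active_disk_inputs CONFIG_FILE
# Prints the name ("smart" or "hddtemp") of every disk-related Telegraf
# input section that is NOT commented out in the given config file.
active_disk_inputs() {
  grep -E '^[[:space:]]*\[\[?inputs\.(smart|hddtemp)\]\]?' "$1" \
    | sed -E 's/^[[:space:]]*\[+inputs\.([a-z]+)\]+.*/\1/'
}
```

An empty result means neither input is active in that file. Any name printed is a candidate for polling the drives.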

    Link to comment
    On 3/2/2021 at 11:58 AM, limetech said:

    If you remove Dynamix System Autofan plugin, does issue persist?

     

    Spindown works only if I disable autofan and every other addon that queries smart data (telegraf, etc.) Once spun down these queries don't seem to wake them but too soon to say definitively.

     

    Edited by CS01-HS
    • Thanks 1
    Link to comment

    I can also confirm that.
    I just do not understand why: version 6.8.3 ran without problems and, according to other users, so did the RC versions.

    Link to comment

    Can confirm, same behavior:

     

    Current setup is 3 HDD with no parity, + SSD cache. I have docker with telegraf working, and also Dynamix fan control, but they are NOT the issue.

     

    The drives refuse to spin down on schedule, even with Docker disabled and plugins disabled (safe boot). When they are spun down manually, they remain in that state until woken up.

     

    In the Dashboard, all three disks show constant read activity of exactly 341 B/s or 682 B/s (the same value on all drives at the same time), but this is a "fake" read: the Dashboard still shows that read activity when the drives have been spun down manually, which is of course impossible.

     

    It seems to me that this is a bug where Unraid is registering fake read activity and therefore does not spin the drives down automatically. Docker/Grafana and plugins cannot be blamed, as the problem persists with no plugins and Docker disabled.
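One way to cross-check the Dashboard figure is to compare the kernel's own counters in /proc/diskstats over an interval. The sketch below assumes the standard diskstats layout (field 3 is the device name, field 6 is sectors read, field 10 is sectors written); `sectors_moved` is a hypothetical name. Whether Unraid's spin-down logic or the Dashboard relies on these counters is not established in this thread.

```shell
#!/bin/bash
# sectors_moved SNAPSHOT_BEFORE SNAPSHOT_AFTER DEVICE
# Given two saved copies of /proc/diskstats, print how many sectors were
# read plus written on DEVICE (e.g. "sdb") between the two snapshots.
sectors_moved() {
  awk -v dev="$3" '
    $3 != dev { next }
    FILENAME == ARGV[1] { before = $6 + $10 }
    FILENAME == ARGV[2] { after  = $6 + $10 }
    END { print after - before }
  ' "$1" "$2"
}
```

For example: `cat /proc/diskstats > /tmp/a; sleep 60; cat /proc/diskstats > /tmp/b; sectors_moved /tmp/a /tmp/b sdb`. A result of 0 while the Dashboard still shows reads would support the idea that the displayed activity is not ordinary block I/O.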

     

    This behavior was not present with 6.8.3 (automatic spin down worked perfectly fine)

    Edited by Carpe_Diem
    added some more info
    Link to comment

    same issue... after the update to 6.9.0 my drives won't automatically spin down.  I thought it had to do with cache_dirs running, but it's been over 18hrs and still no spindown.  Manually spinning them down works and it stays spun down until accessed.  I can also confirm that weird 341B/s on all drives every few seconds...

    [screenshot]
     

    Dynamix autofan is installed but disabled.  Telegraf is installed.  However, all this worked just fine prior to the upgrade.  So far, this is my only issue.

     



    Edit: After spinning them down, I'm still getting these weird 85.0 B/s reads...
     

    [screenshot]

    Edited by jbquintal
    added second picture...
    Link to comment

    If anyone is having issues you could install doron's SAS Helper even if you only have SATA drives.

     

    This creates a wrapper for smartctl in /usr/sbin.

     

    Add DEBUG=true to the script as shown below; it will then log where each smartctl call comes from.

     

    The log will show entries like:

    Mar 3 14:10:38 Tower SAS Assist v0.85: debug: smartctl wrapper caller is bash, grandpa is ttyd, device /dev/sdg, args "/dev/sdg"

    Mar 3 14:13:23 Tower SAS Assist v0.85: debug: smartctl wrapper caller is smartctl_type, grandpa is emhttpd, device /dev/sdb, args "-A /dev/sdb"

     

    Once you have identified where the issue is you can remove the plugin.

     

     

    Quote
    
    root@Tower:/usr/sbin# cat smartctl
    #!/bin/bash
    
    # Spin down SAS drives - stopgap scripts until Unraid does organically.
    #
    # This script is a wrapper for "smartctl" - which in 7.1 does not support the "-n standby"
    # flag for SAS drive. This wrapper works around that, by checking whether the drive is SAS
    # and if so, avoid calling smartctl (return silently).
    #
    # v0.85
    # (c) 2019-2021 @doron - CC BY-SA 4.0
    
    . /usr/local/emhttp/plugins/sas-spindown/functions
    DEBUG=true
    DEVICE="${@: -1}"
      
    $DEBUG && Log "debug: smartctl wrapper caller is $(cat /proc/$PPID/comm), grandpa is $(cat /proc/$(cat /proc/$PPID/stat | cut -d" " -f4)/comm), device $DEVICE, args \"$@\""
    
    
    if [[ "$@" =~ \-n\ +standby\  || "$@" =~ "--nocheck=standby" ]] ; then
    
            if IsSBY $DEVICE ; then
    
                    $DEBUG && Log  "debug: device $DEVICE is spun down, smartctl evaded"
                    echo "SAS Assist Plugin: $DEVICE is in standby mode (spun down), smartctl exits(2)"
    
                    exit 2          # Match smartctl's exit code, thanks @segator
    
            fi
    
    fi
    
    $REALSMART "$@"
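Once the debug lines accumulate, they can be tallied to see which process tree calls smartctl most often. A small sketch, assuming the exact message format shown in the two sample log lines earlier in this post; `smartctl_callers` is a hypothetical name.

```shell
#!/bin/bash
# smartctl_callers: read syslog lines on stdin and print one line per
# distinct (grandparent, caller, device) combination found in the
# wrapper's debug messages, prefixed with how often it occurred:
#   COUNT GRANDPARENT CALLER DEVICE
smartctl_callers() {
  sed -n 's/.*smartctl wrapper caller is \([^,]*\), grandpa is \([^,]*\), device \([^,]*\),.*/\2 \1 \3/p' \
    | sort | uniq -c | awk '{print $1, $2, $3, $4}'
}
```

Lines whose grandparent is emhttpd come from Unraid itself. Other names point at a plugin, container, or interactive shell.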

     

     

    • Thanks 1
    Link to comment

    Dropping in to say I had spin-down issues in RC2, and removing the Dynamix autofan plugin didn't fix them. After upgrading to 6.9.0 the issue persisted, but this time removing Dynamix autofan looks to have solved it.

    Link to comment

    Similar boat here. 

     

    6.8.3 working fine. 6.9.0 not working. 

     

    I read in earlier posts that autofan was the issue. I removed it and still had the problem. Another post mentioned telegraf and inputs.hddtemp. I turned hddtemp off and the problem persisted. I do have inputs.smart enabled, so that might be the issue. I ultimately turned off all of telegraf/influx/varken, and the drives now go into standby.

     

    Curious. 

    Link to comment

    Same issue here, no problem in 6.8.3. I have only tried 6.9.0-rc2 and 6.9.0, and the problem was there in both. I disabled autofan and telegraf and the problem disappeared. However, because of my hardware setup I am rather dependent on the autofan plugin to spin up the fans when the HDDs get warm.

    • Like 1
    Link to comment

    I'm also having this issue.

    Everything worked fine in 6.8 but since upgrading, drives will not spin down automatically (if the spin down button is pressed they do spin down okay).

    From reading the above: I have Dynamix autofan installed, but (like many others) disabling it is not an option with my setup as it is.

    Do we have any ideas when/if this is something that can be fixed? 

    • Like 2
    Link to comment

    I might have this issue too. The HDDs were spun up 24/7 after the update to 6.9.0.

    After deactivating Dynamix autofan, the HDDs spin down after the spin-down delay of 3 h. But they are regularly spun up again by SMART reads. Is this default behavior that I overlooked before, or unwanted behavior?

    Mar 5 07:56:35 Server emhttpd: read SMART /dev/sdd
    Mar 5 07:56:47 Server emhttpd: read SMART /dev/sdc
    Mar 5 10:57:15 Server emhttpd: spinning down /dev/sdd
    Mar 5 10:57:18 Server emhttpd: spinning down /dev/sdc
    Mar 5 11:01:20 Server emhttpd: read SMART /dev/sdc
    Mar 5 11:01:43 Server emhttpd: read SMART /dev/sde
    Mar 5 11:02:00 Server emhttpd: read SMART /dev/sdd
    Mar 5 14:01:45 Server emhttpd: spinning down /dev/sde
    Mar 5 14:02:02 Server emhttpd: spinning down /dev/sdd
    Mar 5 14:02:06 Server emhttpd: spinning down /dev/sdc

     

    • Like 1
    Link to comment

    I'm having the same issue with drives not spinning down after upgrading to 6.9 

    I have the spin-down delay set to 15 minutes and have already confirmed that with Dynamix Auto Fan Control enabled the drives won't spin down, while with it disabled the drives spin down within 15 minutes as they should. I also confirmed that the drives stay down if spun down manually.

    • Like 3
    Link to comment

    Spinning control was changed in 6.9 to handle multiple pools and SAS devices.  Some plugins are incompatible with these changes, as noted, Dynamix Auto Fan Control.  We're looking into fixing the plugin.

    • Like 1
    Link to comment

    I'm having the same issue with drives not spinning down after upgrading to 6.9. I have Telegraf installed and am getting the 341.00 B/s read on the drives, and I don't have Dynamix Auto Fan Control installed.

    Link to comment

    I'm also experiencing the issue.
    I don't have the Fan plugin installed on my system, though.

    I do have telegraf installed and monitoring the SMART state of my drives.

    That didn't prevent the disks from spinning down in 6.8.2. Could you also take a look at that?

    It seems to result in the same strange 341.00 B/s read on all drives simultaneously.

     

    Manual spin-down works without problems, but it would be nice to get automatic spin-down working again.

    Link to comment





  • Status Definitions

     

    Open = Under consideration.

     

    Solved = The issue has been resolved.

     

    Solved version = The issue has been resolved in the indicated release version.

     

    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.

     

    Retest = Please retest in latest release.


    Priority Definitions

     

    Minor = Something not working correctly.

     

    Urgent = Server crash, data loss, or other showstopper.

     

    Annoyance = Doesn't affect functionality but should be fixed.

     

    Other = Announcement or other non-issue.