doron

Community Developer
  • Posts

    640
  • Days Won

    2

Posts posted by doron




  1. I'm actually unable to use my drives right now.  Since they stay spun up all the time, they are running really hot (51c avg) and I'm afraid to leave them in service.

    You may want to check your cooling. Depending on physical environment etc., a properly cooled drive that's spun up but idle would not typically average at these temps.

    Sent from my tracking device using Tapatalk

  2. 51 minutes ago, @zen said:
    emhttpd: sdspin /dev/sdf down: 1

    This might indicate an issue.

     

    Can you, on the Unraid console, issue this command:

    hdparm -y /dev/sdf

    and post the output?

     

    (note the small y, not uppercase Y...)
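As a follow-up check after the command above, the drive's power state can be queried with `hdparm -C`. The sketch below is not from the thread. It assumes a SATA drive and the usual `drive state is:` line in the `hdparm -C` output, and `parse_drive_state` is a made-up helper name.

```shell
# Sketch: extract the power state from `hdparm -C` output read on stdin.
# Assumes the usual " drive state is:  <state>" line; prints "unknown" otherwise.
parse_drive_state() {
    awk -F': *' '
        /drive state is/ { print $2; found = 1 }
        END { if (!found) print "unknown" }'
}

# Intended use on a real system (replace sdX with the actual device):
#   hdparm -y /dev/sdX
#   hdparm -C /dev/sdX | parse_drive_state
```

If this prints `standby` right after the `-y` command, the drive accepted the spin down request.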

    @zen, please add some more info re this problem. Do they never spin down? Spin down then up again? Logs?...

    (Edit - sent out prematurely): the plugin is not supposed to interfere with SATA spin down, at all. That said, stuff can happen - but, you said that removing it did not remove the issue, so chances are your issue is elsewhere.




  4. @Elmojo @Toibs finding who issues i/o activity against your drives can be tricky at times.
    If you have ruled out network activity against your shares, try stopping all plugins, Docker containers and VMs and see what happens. Then, add them back one by one. You might be able to find the culprit this way.
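While adding things back one by one, it can help to confirm whether any block I/O reached the drive at all. The following is a minimal sketch, not something from the thread: it reads the completed-read and completed-write counters (fields 1 and 5) from the kernel's `/sys/block/<dev>/stat` file. The helper names `io_counts` and `io_delta` are made up for illustration. It shows only that I/O happened, not who issued it.

```shell
# Print "reads writes" (completed I/O counts) from one line of /sys/block/<dev>/stat.
io_counts() {
    awk '{ print $1, $5 }'
}

# Compare two "reads writes" snapshots and print how many new I/Os completed.
# $1 = snapshot before, $2 = snapshot after
io_delta() {
    set -- $1 $2   # split into: reads_before writes_before reads_after writes_after
    echo "reads=$(( $3 - $1 )) writes=$(( $4 - $2 ))"
}

# Intended use (replace sdX with the actual device):
#   before=$(io_counts < /sys/block/sdX/stat)
#   sleep 300
#   after=$(io_counts < /sys/block/sdX/stat)
#   io_delta "$before" "$after"
```

A non-zero delta after re-enabling a single plugin or container points at that component as a likely source.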


  5. @Elmojo, from looking at your data, it does seem that your internal ZFS pool (the one we're discussing) is in fact being accessed a short while after being spun down. You can see that the "SMART read" messages (a great indication of a drive being spun up) show up about 1-2 minutes after the respective spin down messages. This, in most cases, indicates some sort of disk activity.

     

     The question of whether the drive in fact spins down can be answered as follows:

     

    1. Open both a UI and a CLI terminal window, have both ready

    2. Push the UI button to spin the drive down, after noting its dev name

    3. Immediately thereafter (don't wait too long), issue, on the CLI terminal, this command:

    sdparm -C sense /dev/sdX

    Replace sdX with the actual device name.

    If the drive is spun down, you will see the word "standby" somewhere in the sense data.
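The check in step 3 can be wrapped in a small helper so the answer is a plain yes or no. This is only a sketch of the steps above: `sense_reports_standby` is a hypothetical name, and it does nothing more than look for the word "standby" in whatever text it is given.

```shell
# Reads sense data text on stdin; succeeds (exit 0) if the word "standby"
# appears anywhere in it, case-insensitively.
sense_reports_standby() {
    grep -qi 'standby'
}

# Intended use, right after pressing the spin down button in the UI
# (replace sdX with the actual device):
#   if sdparm -C sense /dev/sdX | sense_reports_standby; then
#       echo "sdX is spun down"
#   else
#       echo "sdX does not report standby"
#   fi
```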




  6. Any thoughts?

    Some quick questions:
    1. Under normal operation, are you seeing log messages with "SAS Assist" prefix?
    2. When you manually spin down your zfs pool, are you seeing any? Can you post them?
    3. Can you post the output of
    ls -la /dev/disk/by-path



    (Or, you can just post diagnostics)


  7. 4 hours ago, dunee said:

    I can't figure this out on my own.

     

    Let's break this into parts:

     

    4 hours ago, dunee said:

     

    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sdm
    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sdj
    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sdk
    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sdh
    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sdg
    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sde
    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sdc
    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sdl
    Dec 14 11:44:17 Tower emhttpd: spinning down /dev/sdi
    Dec 14 11:44:17 Tower SAS Assist v2022.08.02: Spinning down device /dev/sdj
    Dec 14 11:44:17 Tower SAS Assist v2022.08.02: Spinning down device /dev/sdh
    Dec 14 11:44:17 Tower SAS Assist v2022.08.02: Spinning down device /dev/sdg
    Dec 14 11:44:17 Tower SAS Assist v2022.08.02: Spinning down device /dev/sdk
    Dec 14 11:44:17 Tower SAS Assist v2022.08.02: Spinning down device /dev/sdi
    Dec 14 11:44:17 Tower SAS Assist v2022.08.02: Spinning down device /dev/sdl
    Dec 14 11:44:35 Tower emhttpd: read SMART /dev/sdm
    Dec 14 11:47:54 Tower emhttpd: read SMART /dev/sdg
    Dec 14 11:48:12 Tower emhttpd: read SMART /dev/sdh
    Dec 14 11:48:23 Tower emhttpd: read SMART /dev/sdi
    Dec 14 11:48:36 Tower emhttpd: read SMART /dev/sdj
    Dec 14 11:48:48 Tower emhttpd: read SMART /dev/sdk
    Dec 14 11:49:00 Tower emhttpd: read SMART /dev/sdl

     

    I have a mixture of SATA and SAS drives. The SATA drives stay spun down, the SAS drives spin up, one after the other, within a few minutes.

     

    If the re-spin happens within a few minutes (rather than a couple of seconds), then it's almost certain that either (a) someone is issuing i/o against the drive or (b) some plug-in has woken up and is spinning the drive up, either rightly or erroneously (we've seen both - e.g. an old version of the autofan plugin used to do the wrong thing when inquiring about the drives, which would wake up all SAS drives). 
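To tell "a few minutes" from "a couple of seconds" per drive, the delay between each `spinning down` message and the next `read SMART` message for the same device can be computed from the log. The sketch below assumes the syslog line format shown in the quoted log and that both messages fall on the same day. `spinup_delays` is a made-up helper name, and `/var/log/syslog` as the log location is an assumption.

```shell
# Reads syslog lines on stdin and prints "<device> <seconds>" for each
# emhttpd "read SMART" message that follows a "spinning down" message
# for the same device. Assumes both timestamps are on the same day.
spinup_delays() {
    awk '
    # Convert HH:MM:SS to seconds since midnight.
    function secs(t,   p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
    $5 == "emhttpd:" && $6 == "spinning" && $7 == "down" { down[$8] = secs($3) }
    $5 == "emhttpd:" && $6 == "read" && $7 == "SMART" && ($8 in down) {
        printf "%s %d\n", $8, secs($3) - down[$8]
        delete down[$8]
    }'
}

# Intended use:
#   spinup_delays < /var/log/syslog
```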

     

    Have you looked at your plugins? Can you disable them all and try? Then, if the issue goes away, you can add them back one by one.

     

    4 hours ago, dunee said:

     

    I've shut down my dockers, this still happens. The open files plugin does not show any files being touched when the drives spin up - especially since the SAS drives have no files on them.

     

    root@Tower:~# /usr/local/emhttp/plugins/sas-spindown/sas-util 
    
    SAS Spindown Utility (v20210201.01)
    ls: cannot access '/dev/disk/by-path/*-sas-*': No such file or directory
    
    Error: No SAS drives detected.
    
    Now exiting.
    
    root@Tower:~# ls /dev/disk/by-path/
    pci-0000:00:1a.0-usb-0:1.2:1.0-scsi-0:0:0:0@        pci-0000:02:00.0-scsi-0:0:12:0-part1@  pci-0000:02:00.0-scsi-0:0:6:0@
    pci-0000:00:1a.0-usb-0:1.2:1.0-scsi-0:0:0:0-part1@  pci-0000:02:00.0-scsi-0:0:1:0@         pci-0000:02:00.0-scsi-0:0:6:0-part1@
    pci-0000:02:00.0-scsi-0:0:0:0@                      pci-0000:02:00.0-scsi-0:0:1:0-part1@   pci-0000:02:00.0-scsi-0:0:7:0@
    pci-0000:02:00.0-scsi-0:0:0:0-part1@                pci-0000:02:00.0-scsi-0:0:2:0@         pci-0000:02:00.0-scsi-0:0:7:0-part1@
    pci-0000:02:00.0-scsi-0:0:10:0@                     pci-0000:02:00.0-scsi-0:0:2:0-part1@   pci-0000:02:00.0-scsi-0:0:8:0@
    pci-0000:02:00.0-scsi-0:0:10:0-part1@               pci-0000:02:00.0-scsi-0:0:4:0@         pci-0000:02:00.0-scsi-0:0:8:0-part1@
    pci-0000:02:00.0-scsi-0:0:11:0@                     pci-0000:02:00.0-scsi-0:0:4:0-part1@   pci-0000:02:00.0-scsi-0:0:9:0@
    pci-0000:02:00.0-scsi-0:0:11:0-part1@               pci-0000:02:00.0-scsi-0:0:5:0@         pci-0000:02:00.0-scsi-0:0:9:0-part1@
    pci-0000:02:00.0-scsi-0:0:12:0@                     pci-0000:02:00.0-scsi-0:0:5:0-part1@

     

    Ah. That is unrelated to the above and seems to indicate an issue with that script. It may be incompatible with your controller, other firmware or kernel version. I'll look into that later but the actual plugin code is unrelated (and does not use this method to detect SAS drives - proof is: You do get the "SAS Assist" messages, meaning that the plugin has indeed acted on your SAS drives).

     

    4 hours ago, dunee said:

    I have an Dell R730xd with an H330 controller in HBA mode (so not flashed to IT mode).

    Ah, Dell servers. Gotta love them (I do, actually), but they have so many firmware layers that you can sometimes lose track of who did what... I do hope it is not some firmware code that causes the drives to spin up.

     

    I wouldn't bother with the IT mode thing; you do have it in HBA mode which is what we need.

     

    4 hours ago, dunee said:

    The recalcitrant SAS drives all appear to be Seagates.

     

    We've seen issues with Seagate drives not spinning down. However, having them spin up after a few minutes is not one of them.

  8. 4 minutes ago, pawelb said:

    This case: https://www.inter-tech.de/productdetails-144/2U-2404S_EN.html

    And drives are Toshiba N300.

    And, per the subject of this thread, can you check whether there's power on pin 3? I'm not familiar with this enclosure; could it be the culprit?

    (if the drives are SATA and the backplane is dual SAS/SATA, as your backplane seems to be, then essentially there shouldn't be a need to go for a SAS controller.)

  9. On 8/4/2023 at 12:37 AM, Hugh Jazz said:

    if i choose a keyfile, can i just use any random file i want and store it on a usb stick or something?

    Yes, any file, on the location of your choice. Make very sure though:

    1. It is accessible to Unraid during (re)starting the array
    2. It is kept intact, bit-wise, throughout the life of the array (do not trust a copy/paste of its contents, for example, etc.)
    3. You have a good backup copy in a safe place you remember... If you lose it, you lose your entire array and anything else that's encrypted using this keyfile.

    This all may sound trivial, but - I've seen all of those happen. Better safe.
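For point 2, a byte-for-byte comparison against the backup is a cheap way to confirm the keyfile has not changed. This is a minimal sketch using the standard `cmp` utility. The function name `keyfile_intact` and the paths in the usage comment are hypothetical.

```shell
# Succeeds (exit 0) only if the two files are identical byte for byte.
# $1 = keyfile in use, $2 = backup copy
keyfile_intact() {
    cmp -s "$1" "$2"
}

# Intended use (both paths are placeholders):
#   if keyfile_intact /path/to/keyfile /path/to/backup/keyfile; then
#       echo "keyfile matches backup"
#   else
#       echo "keyfile differs from backup (or a file is missing)"
#   fi
```

Even a single trailing newline added by a copy/paste makes the comparison fail, which is exactly the kind of change that would stop the keyfile from unlocking the array.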

  10. 15 minutes ago, Hugh Jazz said:

    hi! is it possible to use this tool just to verify my password without changing it?

    Sure; just run the tool as you normally would. Once asked for the old (current) password/key, provide it. The tool then tries this key on each available drive. If it can't open any of them, it will shout. If you're asked for the new key, it means the key is good; just hit ^C (ctrl-C) and leave.

  11. 44 minutes ago, suppa-men said:

    @Doron, please find the output below.

     

    Looks like the SATA drives indeed do spin down. That's the first time I'm seeing this. Probably the doing of the HBA.

    Does this mean that hdparm -y /dev/sdX does not work for these drives?

     

    I could probably add some option to "force" a drive to be considered as SAS even though it isn't. Thing is, it should probably be configured by S/N (to survive restarts). If you then make a change - e.g. connect that SATA drive to an on-board SATA port - you'd need to remove this setting or it will break things.

     

    The alternative is to hack the drive_types table in your "go" script 🙂

  12. @suppa-men, the poster you reference appeared to not really have an issue - their SATA drives were spinning back up a few minutes after being spun down, so probably as a result of i/o activity.

     

    What you're describing is something I have not seen, or had reported to me, yet. You issue a SAS/SCSI command against a SATA drive and you report that it actually worked. This is very interesting.

    Let me ask you this - how do you verify that the SATA drive does in fact spin down following the sg_start command?

  13. 10 hours ago, bilbobagginz said:

    where i can find a list of SAS drives that work with spindown? 

    i'd like to buy some mid size 3/4TB HDD for my setup

    Oddly, I have never compiled a list of drives that do perform nicely with the spin down/up SCSI/SAS commands. I have collected a list of exclusions - i.e. drives (or drive/controller combos) that misbehave, or are otherwise known to either ignore the spin down command or create some sort of breakage upon receiving it.

    It might be a good idea to compile success stories into such a list.

     

    I'll kick it off: I have a few HUH721212AL4200 (12TB HGST) on an on-board supermicro SAS controller (LSI 2308). They spin down and up rather perfectly.

  14. On 4/2/2023 at 5:49 AM, Octalbush said:


    Here's the output:
     

    sdb     | MG06SCA800A   | 1000:0097:1028:1f45   |  n/a  | 
    sdc     | MG06SCA800A   | 1000:0097:1028:1f45   |  n/a  | 
    sdd     | MG06SCA800A   | 1000:0097:1028:1f45   |  n/a  | 
    sdg     | MG06SCA800A   | 1000:0097:1028:1f45   |  n/a  | 
    sdi     | MG06SCA800A   | 1000:0097:1028:1f45   |  n/a  | 

     

    I have added the MG06 SAS drives to the exclusions list, so the plugin will not try to spin them down.
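When reporting a drive for the exclusions list, the table above can be condensed to its distinct entries. The sketch below assumes the second column is the drive model and the third is the controller ID string, which is a reading of the quoted output rather than documented behaviour. `unique_models` is a made-up helper name.

```shell
# Reads the pipe-separated table on stdin and prints each distinct
# "<column 2> <column 3>" pair once (assumed to be model and controller ID).
unique_models() {
    awk -F'|' 'NF >= 3 {
        # Trim surrounding whitespace from both columns.
        gsub(/^[ \t]+|[ \t]+$/, "", $2)
        gsub(/^[ \t]+|[ \t]+$/, "", $3)
        key = $2 " " $3
        if (!(key in seen)) { seen[key] = 1; print key }
    }'
}

# Intended use:
#   /usr/local/emhttp/plugins/sas-spindown/sas-util | unique_models
```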

  15. 3 hours ago, Octalbush said:

    Unfortunately this plugin causes my drives to get read errors, and today it even made one go disabled and I am currently rebuilding it now. It seems to work ok when I let Unraid manage the spindowns, but if I spin it down using the web gui button, it can cause errors, and in todays case, the drive to be disabled entirely. I notice I get some errors over time even if I let Unraid spin them down based on time, but they don't disable the drive. I am using Toshiba MG06 8TB SAS drives.

    It turns out that the standby (aka "spin down") commands are interpreted differently by different SAS HDDs, unlike SATA drives. These commands are not well standardized. Some drives, after receiving this command, do spin down, but expect an explicit "spin up" command to start revolving again. This behavior is not compatible with Unraid, which expects a spun-down drive to spin back up automatically when the next I/O is directed at it. This translates to read or write errors (depending on the I/O that was underway); if it was a write, you'll get the drive red-x'ed, like you experienced.

    This has been reported a lot with Seagate drives, and a bit less with Toshiba drives.

     

    There's very little the plugin can do about it. I can add the MG06 to the exclusion list - which will mean that the plugin will simply avoid touching these drives. 

    Towards that end, can you share the output of:

     

    /usr/local/emhttp/plugins/sas-spindown/sas-util