SuperDan Posted July 30, 2020

Maybe there is some kind of hardware requirement that needs to be met before it will work? I tried this on my Dell 720XD, which has a stock PERC H310 controller (not reflashed with another vendor's firmware), and it worked.

    # sg_start -rS /dev/sdh
    # smartctl -i /dev/sdh
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Vendor:               HITACHI
    Product:              HUC106060CSS600
    Revision:             A430
    Compliance:           SPC-4
    User Capacity:        600,127,266,816 bytes [600 GB]
    Logical block size:   512 bytes
    Rotation Rate:        10020 rpm
    Form Factor:          2.5 inches
    Logical Unit id:      0x5000cca0214c3e74
    Serial number:        PPWAXW5B
    Device type:          disk
    Transport protocol:   SAS (SPL-3)
    Local Time is:        Thu Jul 30 13:25:40 2020 PDT
    device is NOT READY (e.g. spun down, busy)

    # dd of=/dev/null if=/dev/sdh
    # smartctl -i /dev/sdh
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Vendor:               HITACHI
    Product:              HUC106060CSS600
    Revision:             A430
    Compliance:           SPC-4
    User Capacity:        600,127,266,816 bytes [600 GB]
    Logical block size:   512 bytes
    Rotation Rate:        10020 rpm
    Form Factor:          2.5 inches
    Logical Unit id:      0x5000cca0214c3e74
    Serial number:        PPWAXW5B
    Device type:          disk
    Transport protocol:   SAS (SPL-3)
    Local Time is:        Thu Jul 30 13:27:36 2020 PDT
    SMART support is:     Available - device has SMART capability.
    SMART support is:     Enabled
    Temperature Warning:  Enabled
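The two smartctl listings above differ only in the final status lines, so a script could tell a spun-down drive from a ready one by looking for the NOT READY line. A minimal sketch, assuming the wording of that line stays as shown in the output above; `is_not_ready` is a hypothetical helper name, not part of smartmontools:

```shell
#!/bin/bash
# is_not_ready: hypothetical helper. Reads `smartctl -i` output on stdin and
# succeeds (exit 0) if it contains the NOT READY status line quoted above.
is_not_ready() {
    grep -q 'device is NOT READY'
}

# Example with real hardware (left commented out on purpose):
#   smartctl -i /dev/sdh | is_not_ready && echo "sdh is spun down or busy"
```

Note that, per the smartctl text itself, NOT READY can also mean "busy", so this is only a hint, not proof of a spin-down.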
doron Posted August 7, 2020 Author

I took another look at this, and there's a clue that might help move this forward (at least I was not aware of it). It appears that in SAS/SCSI drive management there are two distinct spindown states: STOP and STANDBY. Both make the drive park its heads and spin down; however, STOP requires an explicit START to have the drive spin back up, whereas STANDBY does what we're used to seeing in the ATA world: the next I/O to the drive will have it spin up again.

Now, the sg_start -S command that we've been toying with issues the STOP command (as documented). This is why we've seen the behavior we've been seeing: unless we issue the corresponding sg_start -s, the device will remain stopped and I/O against it will fail.

Now, there seems to be a way to cause the drive to go to standby by using sdparm, e.g. (roughly):

    sdparm --flexible --save -p po --set=STANDBY=1 /dev/sdj

but when I issue this command, the syslog shows this cryptic message:

    Tower kernel: sdj: sdj1

and the drive seems to be spun up and ready to go (i.e. dd from it works instantaneously). I'm not sure which component issues this message, or whether it's related to the fact that the drive does not spin down, but the message is issued in correspondence with using the sdparm command. Perhaps @limetech can shed more light on this. BTW, issuing this command against a drive while the array was started caused the array to go into a parity check(?!?).

There's also an SCT parameter (timeout for the drive itself to do a STANDBY), which we could play with (set to a very short period so it's almost an immediate spindown), but trying it in Unraid caused the same result as above.
keshavdaboss Posted August 9, 2020 Share Posted August 9, 2020 On 8/7/2020 at 1:53 AM, doron said: I took another look at this, and there's a clue that might help move this forward (at least I was not aware of it). It appears as if in SAS/SCSI drive management there are two distinct spindown states: STOP and STANDBY. Both make the drive park its heads and spin down; however, STOP requires an explicit START to have the drive spin back up, whereas STANDBY does what we're used to see in the ATA world - the next I/O at the drive will have it spin up again. Now the sg_start -S command that we've been toying with, issues the STOP command (as documented). This is why we've seen the behavior we've been seeing: Unless we issue the corresponding sg_start -s, the device will remain stopped and I/O against it will fail. Now, there seems to be a way to cause the drive to go to standby by using sdparm. e.g. (roughly): sdparm --flexible --save -p po --set=STANDBY=1 /dev/sdj but when I issue this command, the syslog shows this cryptic message: Tower kernel: sdj: sdj1 and the drive seems to be spun up and ready to go (i.e. dd from it works instantaneously). Not sure which component issues this message and in fact, whether it's related to the fact that the drive does not spin down but the message is issued in correspondence with using the sdparm command. Perhaps @limetech can shed more light on this. BTW issuing this command against a drive when the array was started, caused the array to go into parity check(?!?). There's also an SCT parameter (timeout for the drive itself to do a STANDBY), which we could play with (set to a very short period so it's almost an immediate spindown) but trying it in Unraid, this caused the same result as above. I think you are on to something! Looking at this post from a while ago: It seems like people were able to get their disks spun up and down on SAS2008 Cards, and the IBM Br10i. 
I know a lot of people run 2008 cards but I don't see this feature implemented.
Chahk Posted August 10, 2020

Another +1. Btw, I'm running a motherboard with a built-in SAS2008 controller, and spin-down doesn't work.
Cilusse Posted August 22, 2020

I'm also showing my interest in such a feature being implemented! The sg_start command works perfectly on my system. I also realised that the syslog shows when Unraid spins a drive down, so there should be a way to tie that log entry to the sg_start command and follow Unraid's intent to spin a drive down. The syslog doesn't say when a drive is spun back up (or at least not with my current settings), but if it did, we would also have a way to trigger the SAS drives back up when this log entry is detected.
itimpi Posted August 22, 2020

2 hours ago, Cilusse said: The syslog doesn't say when a drive is spun back up (or at least not with my current settings), but if it did, we would also have a way to trigger the SAS drives back up if this log entry is detected.

I think the problem is that with SATA drives Unraid does not need to take any explicit action to spin up the drives (which is why there are no log entries), as it happens automatically as soon as any attempt is made to access the drive. It appears that with SAS drives an explicit command needs to be used to spin up the drives before attempting to access them, or you get an error, and at the moment Unraid has no logic of this type in place.
doron Posted August 22, 2020 Author

4 minutes ago, itimpi said: I think the problem is that with SATA drives Unraid does not need to take any explicit action to spin up the drives (which is why there are no log entries), as it happens automatically as soon as any attempt is made to access the drive. It appears that with SAS drives an explicit command needs to be used to spin up the drives before attempting to access them, or you get an error, and at the moment Unraid has no logic of this type in place.

That's partially correct. See my post above. Basically, SCSI / SAS drives have two different spindown states: STOP and STANDBY. The difference between the two is that if a drive is in the STOP state, it needs an explicit START to spin up (or it will not process I/O commands), whereas if it's in STANDBY, it will spin up implicitly on the next I/O. We need a handy tool to put the drive in STANDBY. It turns out that sg_start -S places the drive in STOP; this implies what you wrote above, and it is what we're experiencing in our tests. Once we have that tool, @Cilusse's idea of tying it to a syslog action could be used as a fine stopgap until Limetech includes it in the base product. But I don't have an answer to this question yet.
Cilusse Posted August 27, 2020

I've been researching ways that might allow automation of the sg_start command. iotop (available through NerdPack) can show real-time read/write usage for any disk. This might not work, because a spun-down SAS drive will probably not show any activity as Unraid won't be able to access it.

Then I came across ioping. This one is not available in NerdPack so I can't easily test it, but it logs the access latency to each drive. I am assuming that a spun-down SAS drive will show an ever-increasing latency until it fails and Unraid reports a read error in the array. There might be a way to write a script that looks at those latencies and spins a SAS drive up when the latency reaches a set threshold, just before Unraid fails the request and reports an error. Pair that with the syslog message that is sent when a drive is spun down by Unraid, and you have an almost automatic system to spin SAS drives up and down.

If anyone has tinkered with those tools before, I would suggest giving it a try. What do you guys think?

Edited August 27, 2020 by Cilusse
SimonF Posted August 29, 2020

I was able to get devices to spin down using a timer, and I kept looking to see if I could get them put into standby by command, which I have now been able to do. Standby needs to use the pass-through device sgX and not the sdX one. sg_map will show the mapping:

    /dev/sg0  /dev/sda
    /dev/sg1  /dev/sr0
    /dev/sg2  /dev/sdb
    /dev/sg3  /dev/sdc
    /dev/sg4  /dev/sdd
    /dev/sg5  /dev/sde
    /dev/sg6  /dev/sdf
    /dev/sg7  /dev/sdg
    /dev/sg8  /dev/sdh
    /dev/sg9  /dev/sdi
    /dev/sg10  /dev/sdj

The following shows the disk status:

    root@Tower:~# sdparm --command=sense /dev/sdd
        /dev/sdd: HITACHI HMRSK 3P02
    Additional sense: Idle condition activated by timer

Command to put the drive into standby (Power Condition 3 is standby):

    root@Tower:~# sg_start -vvv --pc=3 /dev/sg4
    open /dev/sg4 with flags=0x802
        start stop unit cdb: [1b 00 00 00 30 00]
          duration=0 ms
    root@Tower:~# sdparm --command=sense /dev/sdd
        /dev/sdd: HITACHI HMRSK 3P02
    Additional sense: Standby condition activated by command

The drive spins up as required. Drive polling also spins them up; I am guessing that, as Unraid doesn't know they are spun down, it runs smartctl against the drive.

SCSI Power Conditions

Edited August 29, 2020 by SimonF Additional Info
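Since the standby command has to go to the sgN node while most people know their drives by sdX name, a small lookup over the sg_map listing can bridge the two. This is only a sketch, assuming the two-column sg_map output shown above; `sd_to_sg` is a hypothetical helper name:

```shell
#!/bin/bash
# sd_to_sg: hypothetical helper. Reads `sg_map` output on stdin
# (columns: /dev/sgN /dev/sdX) and prints the sg node that maps to the
# block device given as $1. Prints nothing if there is no match.
sd_to_sg() {
    awk -v dev="$1" '$2 == dev { print $1; exit }'
}

# Example with real hardware (left commented out on purpose):
#   sg=$(sg_map | sd_to_sg /dev/sdd)
#   [ -n "$sg" ] && sg_start --pc=3 "$sg"
```

The empty-result check in the usage example matters: without it, a drive that is missing from the listing would lead to sg_start being called with no device argument.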
sota Posted August 29, 2020

@simonf so you're saying:

    sg_start -vvv --pc=3 /dev/sg4      <--- spin down disk
    sdparm --command=sense /dev/sdd    <--- preps disk to spin back up on next access attempt

If so, I'll toss one of my SAS disks into the machine and play with it in a little while.

Now here's another question: what happens if you execute those same commands on a SATA disk? If the need to ID a disk as SAS or SATA can be eliminated, that would go a long way to this idea bearing fruit.
SimonF Posted August 29, 2020

The sdparm command only shows the status. The disk spins up when an access is made to it; you can also force a spin-up using --pc=1 or just -s. -vvv is not required for the command to run; it just gives more verbose info about the SCSI command. Those commands only work on SAS; you will get an error message for SATA.

I had previously used the following to set timers on the disks, and the sense would then say standby by timer rather than by command:

    sdparm --flexible -6 -l -S --set SZCT=9000 /dev/sdi
    sdparm --flexible -6 -l -S --set STANDBY_Z=1 /dev/sdi
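For anyone choosing their own timer value: as far as I understand the SCSI power condition mode page, the standby_z condition timer (SZCT) is counted in 100 ms units, which would make the SZCT=9000 above equal to 15 minutes. Treat that unit as an assumption to verify against your drive's documentation. A small conversion sketch, with `minutes_to_szct` as a hypothetical helper name:

```shell
#!/bin/bash
# minutes_to_szct: hypothetical helper. Converts whole minutes to an SZCT
# value, ASSUMING the timer is expressed in 100 ms units
# (60 s * 10 units per second per minute).
minutes_to_szct() {
    echo $(( $1 * 60 * 10 ))
}

# Example with real hardware (left commented out on purpose), reusing the
# sdparm flags quoted in the post above:
#   sdparm --flexible -6 -l -S --set SZCT="$(minutes_to_szct 15)" /dev/sdi
```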
sota Posted August 29, 2020

ok. so now the question is, if tossing that command at a SATA disk returns an error, is it a no-harm/no-foul error. AKA, if we blindly sent both (SATA and SAS specific) commands to both types of disks, are there any negative consequences. I'm looking at it from the elimination of logic standpoint, so there doesn't have to be code that does "IF/THEN/ELSE", which we know someone will break by having a disk report weirdly at some point.
SimonF Posted August 29, 2020

15 minutes ago, sota said: ok. so now the question is, if tossing that command at a SATA disk returns an error, is it a no-harm/no-foul error. AKA, if we blindly sent both (SATA and SAS specific) commands to both types of disks, are there any negative consequences. I'm looking at it from the elimination of logic standpoint, so there doesn't have to be code that does "IF/THEN/ELSE", which we know someone will break by having a disk report weirdly at some point.

I cannot say whether this would cause harm; the commands fail for the wrong drive types. sg4 is SAS and sg2 is SATA. Maybe others will be able to comment. I would expect that the information smartctl reports for the disk confirms the protocol in use and could be used to decide which command to run.

    root@Tower:~# hdparm -y /dev/sg4

    /dev/sg4:
     issuing standby command
    SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 18 00 00 00 00 20 00 00 c0 00 00 00 00 f8 21 00 00 00 00 00 00 00 00 00 00
    SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 18 00 00 00 00 20 00 00 c0 00 00 00 00 f8 21 00 00 00 00 00 00 00 00 00 00
     HDIO_DRIVE_CMD(standby) failed: Input/output error

    root@Tower:~# sg_start -vvv --pc=3 /dev/sg2
    open /dev/sg2 with flags=0x802
        start stop unit cdb: [1b 00 00 00 30 00]
          duration=0 ms
    start stop unit:
    Fixed format, current; Sense key: Illegal Request
    Additional sense: Invalid field in cdb
      Sense Key Specific: Error in Command: byte 4 bit 3
     Raw sense data (in hex), sb_len=26, embedded_len=26
            70 00 05 00 00 00 00 12 00 00 00 00 24 00 00 cb
            00 04 00 00 00 00 00 00 00 00
    START STOP UNIT command failed

    root@Tower:~# smartctl -i /dev/sg2
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Model Family:     Seagate Pipeline HD 5900.1
    Device Model:     ST3320310CS
    Serial Number:
    LU WWN Device Id: 5 000c50 010d03c22
    Firmware Version: SC14
    User Capacity:    320,072,933,376 bytes [320 GB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    5900 rpm
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   ATA8-ACS T13/1699-D revision 4
    SATA Version is:  SATA 2.6, 3.0 Gb/s
    Local Time is:    Sat Aug 29 16:01:45 2020 BST
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    root@Tower:~# smartctl -i /dev/sg4
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.7.8-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Vendor:               HITACHI
    Product:              HMRSK2000GBAS07K
    Revision:             3P02
    Compliance:           SPC-4
    User Capacity:        2,000,398,934,016 bytes [2.00 TB]
    Logical block size:   512 bytes
    Rotation Rate:        7200 rpm
    Form Factor:          3.5 inches
    Logical Unit id:      0x5000cca01bd5f764
    Serial number:
    Device type:          disk
    Transport protocol:   SAS (SPL-3)
    Local Time is:        Sat Aug 29 16:01:51 2020 BST
    SMART support is:     Available - device has SMART capability.
    SMART support is:     Enabled
    Temperature Warning:  Enabled
doron Posted August 29, 2020 Author

Okay, @SimonF, I think you nailed it! I just tested this on my SAS drives. My results:

1. The command does actually work on the /dev/sdX devices as well, but what happens is some kernel thing negates it immediately (with a message in syslog - see this post for details). For some obscure reason, issuing the command against /dev/sgN does not initiate this auto-negation, hence - success! Great find.

2. Indeed this is the STANDBY (rather than STOP) state, which means the next i/o does spin the drive up. Even tested with the array started (ain't I so brave). No damage, no red x. This is the state we've been looking for.

3. Issuing this against SATA drives (per @sota) - well, mixed blessing. On one hand it does send the drive to spin down. However the next spin up of the drive spews some funny i/o error messages on syslog. It does spin up, and the i/o does complete successfully (I do my testing with dd) but this is not something we want. So I guess program logic will need to make the distinction between SAS and SATA.

That is very cool. @limetech, I think this might be ready for you.

    sg_start --pc=3 /dev/sgN

Edited August 29, 2020 by doron
SimonF Posted August 29, 2020

26 minutes ago, doron said: Okay, @SimonF, I think you nailed it! I just tested this on my SAS drives. My results: 1. The command does actually work on the /dev/sdX devices as well, but what happens is some kernel thing negates it immediately (with a message in syslog - see this post for details). For some obscure reason, issuing the command against /dev/sgN does not initiate this auto-negation, hence - success! Great find. 2. Indeed this is the STANDBY (rather than STOP) state, which means the next i/o does spin the drive up. Even tested with the array started (ain't I so brave). No damage, no red x. This is the state we've been looking for. 3. Issuing this against SATA drives (per @sota) - well, mixed blessing. On one hand it does send the drive to spin down. However the next spin up of the drive spews some funny i/o error messages on syslog. It does spin up, and the i/o does complete successfully (I do my testing with dd) but this is not something we want. So I guess program logic will need to make the distinction between SAS and SATA. That is very cool. @limetech, I think this might be ready for you. sg_start --pc=3 /dev/sgN

@limetech --pc=1 can be used to spin drives up. @doron thanks for the feedback.
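To summarise the four sg_start variants discussed so far in one place, here is a sketch that maps a requested state to the command line. It only prints the command rather than running it, and it uses nothing beyond the options quoted in this thread; `sg_power_cmd` and the action names are hypothetical:

```shell
#!/bin/bash
# sg_power_cmd: hypothetical helper. Prints (does not run) the sg_start
# invocation for the requested state, using only options quoted in the thread.
#   $1 = action (standby|active|stop|start), $2 = device node
sg_power_cmd() {
    local action="$1" dev="$2"
    case "$action" in
        standby) echo "sg_start --pc=3 $dev" ;;  # STANDBY: next I/O spins the drive up
        active)  echo "sg_start --pc=1 $dev" ;;  # force a spin-up
        stop)    echo "sg_start -S $dev" ;;      # STOP: I/O fails until an explicit start
        start)   echo "sg_start -s $dev" ;;      # START: undo a previous stop
        *)       echo "unknown action: $action" >&2; return 1 ;;
    esac
}

# Example: print the standby command for a pass-through node.
sg_power_cmd standby /dev/sg4
```

Printing instead of executing keeps the sketch safe to try on a live array; piping the output to a shell, or replacing `echo` with the real call, is left to the reader.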
sota Posted August 29, 2020

I like how smartctl reports:

    SATA Version is:      SATA 2.6, 3.0 Gb/s    <-- SATA
    Transport protocol:   SAS (SPL-3)           <-- SAS

so any logic has to go "digging" to find the answer. Figures; can't make it easy!
SimonF Posted August 29, 2020

7 minutes ago, sota said: I like how smartctl reports: SATA Version is: SATA 2.6, 3.0 Gb/s <-- SATA Transport protocol: SAS (SPL-3) <-- SAS so any logic has to go "digging" to find the answer. Figures; can't make it easy!

On my system sg_map -i provides more info: entries showing ATA are SATA devices and entries showing a vendor name are SAS, so maybe that could be used.

    root@Tower:~# sg_map -i
    /dev/sg0  /dev/sda  SanDisk  Cruzer Fit  1.00
    /dev/sg1  /dev/sr0  HL-DT-ST  BD-RE BU40N  1.03
    /dev/sg2  /dev/sdb  ATA  ST3320310CS  SC14
    /dev/sg3  /dev/sdc  HGST  HUS724030ALS640  A1C4
    /dev/sg4  /dev/sdd  HITACHI  HMRSK2000GBAS07K  3P02
    /dev/sg5  /dev/sde  HGST  HUS724030ALS640  A1C4
    /dev/sg6  /dev/sdf  HGST  HUS724030ALS640  A1C4
    /dev/sg7  /dev/sdg  HGST  HUS724030ALS640  A1C4
    /dev/sg8  /dev/sdh  ATA  CT500MX500SSD1  023
    /dev/sg9  /dev/sdi  HITACHI  HMRSK2000GBAS07K  3P02
    /dev/sg10  /dev/sdj  ATA  SSD PLUS 480GB  00RL
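The vendor-column idea can be sketched as a filter over the `sg_map -i` listing. This assumes the third whitespace-separated field is the vendor, as in the output above; `non_ata_nodes` is a hypothetical helper name. Note what the listing above implies: the USB stick (SanDisk) and the optical drive (HL-DT-ST) are also "not ATA", so this filter alone does not prove a device is SAS.

```shell
#!/bin/bash
# non_ata_nodes: hypothetical helper. Reads `sg_map -i` output on stdin and
# prints the sg node of every entry whose vendor field (third column) is not
# "ATA". This selects "not SATA", which is NOT the same as "is SAS".
non_ata_nodes() {
    awk 'NF >= 3 && $3 != "ATA" { print $1 }'
}

# Example with real hardware (left commented out on purpose):
#   sg_map -i | non_ata_nodes
```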
sota Posted August 29, 2020

The problem I see with that is the only logical choice is to go with not-ATA and assume it's SAS at that point, and you know what they say about ass u me. Having a maintained positive lookup table for SAS drives sounds painful.

OR...

    root@Cube:~# smartctl -i /dev/sg4 | grep -i "SATA"
    SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
    root@Cube:~# smartctl -i /dev/sg4 | grep -i "SAS"
    root@Cube:~#
    root@Cube:~# smartctl -i /dev/sg13 | grep -i "SAS"
    Transport protocol:   SAS (SPL-3)
    root@Cube:~# smartctl -i /dev/sg13 | grep -i "SATA"
    root@Cube:~#

I guess maybe it could be two lines?

    if TRUE <SATA_TEST> do sata-standby-command
    if TRUE <SAS_TEST> do sas-standby-command

and ignore the FALSE cases? If for some bizarre reason neither is true, then nothing happens/no command is run.

Edited August 29, 2020 by sota
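The two positive tests could be sketched roughly as follows. The helper reads `smartctl -i` text on stdin and prints the command that would apply, or nothing when neither test matches. `standby_cmd` is a hypothetical name, the match patterns assume the smartctl 7.1 line formats shown in this thread, and `hdparm -y` is assumed here as the SATA standby command:

```shell
#!/bin/bash
# standby_cmd: hypothetical helper. Reads `smartctl -i` output on stdin and
# prints (does not run) the standby command for that drive type.
#   $1 = sg node (used for SAS), $2 = sd node (used for SATA)
# Prints nothing if the drive matches neither test.
standby_cmd() {
    local sg="$1" sd="$2" info
    info=$(cat)
    if printf '%s\n' "$info" | grep -q '^Transport protocol:.*SAS'; then
        echo "sg_start --pc=3 $sg"
    elif printf '%s\n' "$info" | grep -q '^SATA Version is:'; then
        echo "hdparm -y $sd"
    fi
}

# Example with real hardware (left commented out on purpose):
#   smartctl -i /dev/sg4 | standby_cmd /dev/sg4 /dev/sdd
```

Unlike the two independent "if" lines proposed above, this sketch uses `elif`, so at most one command is ever printed for a drive; that is a design choice of the sketch, not something the thread settled on.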
doron Posted August 29, 2020 Author

Or this:

    if [ "$(smartctl -a /dev/sde | grep protocol | sed -r 's/.*protocol: *(.*) .*/\1/')" == "SAS" ]
    then
        echo yes   # do SAS thing
    else
        echo no    # do non-SAS thing
    fi
sota Posted August 29, 2020

are we sure 'protocol' will never show up in an SATA listing?
doron Posted August 29, 2020 Author

Just now, sota said: are we sure 'protocol' will never show up in an SATA listing?

We're not. But if it does, it probably won't have "SAS" as its value.
sota Posted August 29, 2020

ah... yea I re-read the line. Not a master programmer... anymore. If you wanted Applesoft BASIC or 6502 assembly, I was your guy for a long time.
doron Posted August 29, 2020 Author

14 minutes ago, sota said: ah... yea I re-read the line. Not a master programmer... anymore. If you wanted Applesoft BASIC or 6502 assembly, I was your guy for a long time.

Yep, I can relate to that... At the time of the 6502 I was programming S/370 ASM and PL/1. But we digress ☺️
sota Posted August 29, 2020

Obviously @limetech has to agree, but I think at least the groundwork is there to do some integrated testing. The base is there though; a command that puts an SAS disk into standby such that it'll spin back up when it gets touched, and code to be able to figure out which command to run for SAS vs. SATA.
Cilusse Posted August 29, 2020

This is awesome! The command:

    sg_start -vvv --pc=3 /dev/sgX

works perfectly with my SAS-only array. I've put it in a User Script running hourly to regularly make sure the drives are in standby, and whenever I access any files, the required drives spin back up! No errors to report so far, and it's finally completely automatic! I really hope that Limetech can implement this behaviour in a future version of Unraid. Thank you so much for all the contributions, I love this community!
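An hourly script along those lines might look like the sketch below. It is not the script from the post above, which was not shared; `SAS_NODES` and `DRY_RUN` are hypothetical names, and the default node list is a placeholder to replace with your own sg nodes. By default it only prints what it would do.

```shell
#!/bin/bash
# Sketch of an hourly spin-down script for a SAS-only array.
# SAS_NODES: space-separated sg nodes to put into standby (placeholder default).
# DRY_RUN:   1 = only print the commands, anything else = actually run them.
SAS_NODES="${SAS_NODES:-/dev/sg3 /dev/sg4}"
DRY_RUN="${DRY_RUN:-1}"

standby_all() {
    local dev
    for dev in $SAS_NODES; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run: sg_start --pc=3 $dev"
        else
            # Keep going if one drive refuses the command.
            sg_start --pc=3 "$dev" || echo "sg_start failed for $dev" >&2
        fi
    done
}

standby_all
```

Like the approach described above, this sends every listed drive to standby on each run regardless of recent activity, so a drive that is in use simply spins back up on its next I/O.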