doron

Community Developer
  • Posts

    640
  • Days Won

    2

Everything posted by doron

  1. Hi @SimonF, thanks for this. Yes, we played quite extensively with the SCSI options etc. What was new to me was the Seagate toolset. So, basically, things like:

         SeaChest_PowerControl -d /dev/sgX --spinDown
         SeaChest_PowerControl -d /dev/sgX --transitionPower standby_z
         SeaChest_PowerControl -d /dev/sgX --transitionPower sleep

     For each of these cases:
     (a) Test that the drive indeed spins down (sense).
     (b) Issue a direct i/o (read, write) against it and see if it spins up or produces 0x4/0x11.

     Something like that. Again, my base hypothesis is that the behavior will remain the same (requiring an explicit "move-to-active" command and returning 4/11 without it) but, who knows, we may be surprised.

     Edit:

         ./SeaChest_PowerControl -d /dev/sg8 --checkPowerMode
         ==========================================================================================
          SeaChest_PowerControl - Seagate drive utilities - NVMe Enabled
          Copyright (c) 2014-2021 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
          SeaChest_PowerControl Version: 3.0.2-2_2_3 X86_64
          Build Date: Jun 17 2021
          Today: Tue Apr 5 23:11:16 2022 User: root
         ==========================================================================================
         /dev/sg8 - HUH721212AL4200 - - SCSI
         Device is in the Standby_Z state activated by host command
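To make the test matrix above easier to hand to someone acting as remote hands, here is a minimal Python sketch that only builds and prints the command lines; it does not run anything. The SeaChest_PowerControl options are the ones quoted in the post. Everything else is an assumption for illustration: the helper names `build_test_plan` and `render`, the use of `--checkPowerMode` for step (a) instead of a raw sense request, the `dd` direct read for step (b), and the example device paths. The write test is left out on purpose, since a direct write would overwrite data on the drive.

```python
import shlex

# The three power-down variants quoted in the post.
SPIN_DOWN_VARIANTS = {
    "spinDown": ["--spinDown"],
    "standby_z": ["--transitionPower", "standby_z"],
    "sleep": ["--transitionPower", "sleep"],
}


def build_test_plan(sg_dev, block_dev):
    """Return one dict per variant with the commands for each step.

    sg_dev and block_dev are placeholders supplied by the caller
    (e.g. the sg node and the matching sd node of a non-array drive).
    """
    plan = []
    for name, args in SPIN_DOWN_VARIANTS.items():
        plan.append({
            "case": name,
            # Step 0: put the drive into the low-power state.
            "spin_down": ["SeaChest_PowerControl", "-d", sg_dev, *args],
            # Step (a): confirm the drive reports the low-power state
            # (assumption: --checkPowerMode is used instead of raw sense).
            "check_state": ["SeaChest_PowerControl", "-d", sg_dev,
                            "--checkPowerMode"],
            # Step (b): one direct (uncached) read; watch syslog for 0x4/0x11.
            "direct_read": ["dd", f"if={block_dev}", "of=/dev/null",
                            "bs=4096", "count=1", "iflag=direct"],
        })
    return plan


def render(plan):
    """Turn the plan into shell-safe lines that can be pasted one at a time."""
    lines = []
    for case in plan:
        lines.append(f"# case: {case['case']}")
        for step in ("spin_down", "check_state", "direct_read"):
            lines.append(shlex.join(case[step]))
    return lines


if __name__ == "__main__":
    # Hypothetical device names; substitute the real ones.
    for line in render(build_test_plan("/dev/sg8", "/dev/sdh")):
        print(line)
```

Keeping the plan as data rather than executing it means the commands can be reviewed before anything touches a drive, and the syslog can be checked between steps.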
  2. Sure, we can do that. Let's validate first - do we know that these drives do exhibit that behavior? I mean, you did report a 1.2 drive (I presume HP branded) getting this problem. Not all SG drives show this - some behave perfectly. If memory serves, for the most part we've seen this happen on Constellation ES.3 series drives (these are 3.5"). Not sure about these newer Exos series.
  3. What you're seeing is probably quite similar to what others are seeing with some brands of drives/controllers. See quite a few posts in this thread (including my post from a minute ago, just above). If you provide some more info on the failure (drive and controller brand/model, syslog lines at the time of the failure) I'd be able to say more about whether this is a similar mishap (intervention required, drive in standby, ASC/ASCQ 0x4/0x11). Or post diagnostics. But I'm assuming it's the same.
  4. Interesting. I was not aware of this Seagate nomenclature. I've peeked at it now; it seems to deal with the same power states this plugin is attempting to deal with. Seagate provides a set of tools of its own to interact with these modes. Specifically, to manipulate PowerChoice modes one needs to use a tool called SeaChest_PowerControl. While I did download and play with this tool, I do not have a single Seagate drive here, so what's coming is mostly conjecture.

     If I had to guess, I'd say that this tool basically issues SCSI commands similar to those issued natively by the plugin. In other words, I'd guess it is a different wrapping for the same basic set of commands.

     In essence, the problem we keep bumping our heads into, mainly with some of the Seagate drives, is that we do manage to spin them down into STANDBY_Z mode; however, Unraid i/o logic does not expect to have to explicitly "wake" the drive back up. It expects the drive to spin up automagically the next time an active i/o is issued at it. And it seems that some drives (again, we've seen that mainly with Seagates but not only - the OEM world is complex as it is, so no brand wars please), when they get a read or write while spun down, return an i/o error (ASC/ASCQ 0x4/0x11 resp.) instead of spinning up. They expect an explicit "move back to ACTIVE" command. That in turn makes Unraid red-x them, and boom.

     If anyone has time and some non-array SG drives to work with me on this, as remote hands, I may be able to allocate some time for it. LMK. But the odds of this bet are slim, I think, since it is mainly about whether the drive will auto-spin-up, not whether it will spin down successfully.
  5. Thanks @SimonF , that's interesting. It may well be related to some of the cases. I believe the plugin code already avoids smartctl for some of the functions due to something similar, I'll check when I get home. It would be good if the patched smartctl finds itself in the next RC. Sent from my tracking device using Tapatalk
  6. Unfortunately there's not much more I can suggest. Many Seagate drives have demonstrated erratic results wrt standby/spindown. I don't know of a reliable way around it.
  7. If you do, please report your results here - it may help other users in the future.
  8. Ah I missed the fact they are all tied to a single controller. In that case we must conclude that the specific drive (ST33000650SS) does not interpret the spin down commands as expected. Unfortunately there is no "standard behavior" re this. Newer Seagate drives tend to do the right thing, quite a few older ones do not. Your drive seems to have been pulled from an Adaptec system, which may or may not have to do with this issue. There's a slight chance that updating the drive's firmware will help, but I can't guarantee it will (or that you won't end up with a bricked drive), so weigh your risk. If you try this, you may want to check out this. I have not tried it, and it seems to be specifically not intended for Adaptec OEM drives, but who knows.
  9. Can you specify the exact model of the drive, and the models of the two controllers - the one that "does the right thing" and the one that does not?
  10. What you bumped into is called "A Bug" 🙂 I got confused for a moment since you are posting about v0.8, which is not posted via this thread, but rather via a plugin - a kind of work-in-progress that I started in September and never properly completed. At any rate, I just posted plugin v0.9, with that bug resolved. @Towley, could you please confirm that the problem is resolved? Thanks! Once you use a file, the file content is your key. If you put "test1" in it, that becomes your passphrase. One important thing to note is the ending newline. Type unraid-newenckey -h and read the last paragraph.
  11. babaa abbab babaa !
  12. Quite an elusive issue. My drives are old WD and newer HGST and I don't see that problem at all, neither in 6.9 nor in 6.10-rc1/2.
  13. Just to clarify, this appears to be unrelated to the plugin, and applies to both SATA and SAS drives. Those drives (mainly Seagates) are spun up soon after being spun down. Seems to be kernel related, though not sure.
  14. I don't seem to be able to reproduce this. Can you expand on what the issue was? Can you paste the error here?
  15. Ouch. Shouldn't have happened. Will fix when I get a chance (on the road now). Workaround is to use key files. Place old and new passphrases in key files and do the swap. Mind the ending newline (see doc). Keep the "new" key file safe until you are sure you can unlock interactively!
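Since the key-file workaround depends on the exact bytes in the file, here is a small Python sketch for writing a passphrase to a key file with no trailing newline and for checking whether an existing file ends with one. The helper names `write_key_file` and `ends_with_newline` and the file paths are made up for illustration. How the tool itself treats an ending newline is described in the last paragraph of `unraid-newenckey -h`; this sketch makes no claim about that and only shows how to control what lands in the file.

```python
import os


def write_key_file(path, passphrase):
    """Write the passphrase bytes exactly as given, with no newline appended.

    The file is created with mode 0600 so only the owner can read it.
    Returns the number of bytes written.
    """
    data = passphrase.encode("utf-8")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return len(data)


def ends_with_newline(path):
    """Report whether the file's last byte is a newline."""
    with open(path, "rb") as fh:
        return fh.read().endswith(b"\n")


if __name__ == "__main__":
    # Hypothetical path; pick a location that suits your setup.
    written = write_key_file("/tmp/newkey.example", "test1")
    print(written, ends_with_newline("/tmp/newkey.example"))
```

An editor or a plain `echo` typically appends a newline, so a file that looks like it contains "test1" may actually hold six bytes; checking the file before the swap avoids locking yourself out.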
  16. You are very welcome - thanks for reporting success.
  17. Do you see a "SAS Assist" message about spinning the drive down, and then immediately thereafter a message about reading SMART? There was an issue that started in 6.9.2 (vs. 6.9.1), where some drives are spinning down and immediately back up. It seems to be a kernel issue, I'm not sure it's been resolved yet. It was reported against both SAS and SATA drives.
  18. If you get the i/o errors mentioned above on an array drive, the drive will need a reboot to resume operation (the message is not a warning). So, no.
  19. Indeed, @madejackson and @francishefeng59, unfortunately some SAS drives (mostly reported with Seagates but with some others as well) do not respond as expected to the spin-down (aka STANDBY) command and require an explicit wake-up call, which is not really applicable in the Unraid realm (expected behavior: Spin up automatically upon next i/o). The result is typically an i/o error with sense 0x2 and ASC/ASCQ 0x4/0x11, recoverable only upon a reboot. I've made some attempts to map the bad actors (seems to be drive+controller related) and exclude them, but can't say it was an overwhelming success. So this is where we stand right now.
  20. Just installed and everything is looking perfect - thanks again @StevenD! This is awesome.
  21. The latter might be related to different work profiles (e.g. often-written files or folders on this sdg drive, etc.). To eliminate that, I'd swap cables / ports with the other ones and see whether the issue stays with the dev (i.e. sdg) or with the drive (i.e. S/N / Unraid slot).
  22. It appears as though your drive misbehaves with respect to the spin-down process. "LU not ready, intervention required" means it is spun down and does not spin up automatically when an i/o is directed at it. This piece is rather vaguely defined in the SAS protocols, and different drive/ctrl/firmware combos produce different results. Your downgrading to p16 might have something to do with that.
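Several of the posts above refer to the same signature: sense key 0x2 with ASC/ASCQ 0x4/0x11. As a rough illustration of where those three numbers live, here is a minimal Python sketch that pulls them out of a raw SCSI sense buffer, assuming the standard fixed (response code 0x70/0x71) and descriptor (0x72/0x73) layouts. The function names `parse_sense` and `needs_explicit_spinup` are made up for this example, and the sample buffer is hand-built, not captured from a real drive.

```python
def parse_sense(buf):
    """Return (sense_key, asc, ascq) from a raw sense buffer.

    Fixed format (0x70/0x71): key in byte 2 (low nibble), ASC in byte 12,
    ASCQ in byte 13. Descriptor format (0x72/0x73): key in byte 1
    (low nibble), ASC in byte 2, ASCQ in byte 3.
    """
    if not buf:
        raise ValueError("empty sense buffer")
    code = buf[0] & 0x7F  # strip the VALID bit
    if code in (0x70, 0x71):
        if len(buf) < 14:
            raise ValueError("fixed-format sense data too short")
        return buf[2] & 0x0F, buf[12], buf[13]
    if code in (0x72, 0x73):
        if len(buf) < 4:
            raise ValueError("descriptor-format sense data too short")
        return buf[1] & 0x0F, buf[2], buf[3]
    raise ValueError(f"unknown sense response code 0x{code:02x}")


def needs_explicit_spinup(buf):
    """True for the 0x2 / 0x4 / 0x11 combination discussed in the posts."""
    return parse_sense(buf) == (0x2, 0x04, 0x11)


if __name__ == "__main__":
    # Hand-built fixed-format example carrying key 0x2, ASC 0x04, ASCQ 0x11.
    sample = bytes([0x70, 0, 0x02, 0, 0, 0, 0, 0x0A,
                    0, 0, 0, 0, 0x04, 0x11, 0, 0, 0, 0])
    key, asc, ascq = parse_sense(sample)
    print(f"key=0x{key:x} asc=0x{asc:x} ascq=0x{ascq:x}",
          needs_explicit_spinup(sample))
```

A check like this only identifies the condition; as the posts note, the drive still has to be brought back to the active state before i/o will succeed.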