Everything posted by doron

  1. Okay, that's actually a feature. Since your drives clearly can't be spun down properly, the plugin avoids them (they are on the exclude list). If it can't help, at least it avoids collateral damage... Now, when you say "loops" - is there actually an endless loop of "spindown 3" - "spindown 4" - "spindown 3" - "spindown 4", or is this the result of your hitting the green button a few times in a row? That said, if all your SAS drives are on the exclude list, there's unfortunately little point in running this plugin at this time 😞
  2. Just pushed out version 0.7 of the plugin. There are many changes; a few notable ones are listed below. The main method of spinning a SAS drive down remains the same - meaning that if you had issues (or worse, red x's) following spindown attempts in previous versions, there's a good chance this version will not improve that particular situation, so please test with care.
     - Adapt the syslog hook to various Unraid configs (between Unraid and Dynamix there are several different forms of syslog config, which vary depending on your syslog settings, so there's now a mechanism that reconfigures the hook per situation and responds dynamically to changes in settings). @stigs, I'm guessing this might address your issue as well.
     - Filter out syslog lines (aka "spam"...) from some SAS devices rejecting the ATA standby op (e0)
     - Introduce an exclusion list, which should gradually come to contain drive/controller combinations known to not respond favorably to the spindown command
     - More consistent log messages and tags
     - Add new debug and testing tools
     - Many other changes, major code reorg
     Enjoy! Please report issues (or success).
  3. Thanks for reporting this. The next 0.7 version should(...) address your case. I'll hopefully push it out within the next couple of days. When you update to it, please report again.
  4. This is weird - unless it reflects your pushing the green buttons for disks 2 and 3 repeatedly, several times in quick succession. Is this what happens? BTW, these messages are generated when Unraid tries ATA spindown commands against a SAS drive. The next version of the plugin (I've been sitting on it for a while, hoping to collect more data on drive/controller combos with which there are failures) filters these messages out of your log.
  5. Can you share a sample of the spam you received in the log?
  6. When a drive is marked disabled in the array (gets the red x), it will not come back up automatically. It needs to be rebuilt; there are a few guides on how to do that. However, the drive is not physically disabled - it is probably in good shape - it just needs to be reintroduced into the array and rebuilt.
  7. Thanks for reporting that. Have you at any time tried the manual command to spin a drive down? Such as:
     sg_start -r --pc=3 /dev/sdX
     I wonder whether it gets stuck the same way, and then gets a "task abort" a bit later. I haven't seen similar reports until now.
  8. If it works correctly, you should see log lines like this:
     SAS device /dev/sdl is spun down, smartctl evaded
  9. If you still have an empty log (assuming the modified 99-... file is there), I'd like to get to the bottom of this. Have you made changes (or installed 3rd party stuff that makes changes) to rsyslog.conf etc.? Do you have other "interesting" items in /etc/rsyslog.d/ ? Can you pm me your /etc/rsyslog.conf ?
  10. When cryptsetup opens a LUKS2 (newer format) header, it needs this /run/cryptsetup directory to place some locks for atomic operations. If this directory is missing, it spits out that warning message, creates the directory and continues. In the case above, it completed successfully. The UD code seems to assume that if cryptsetup luksOpen generated any output on stdout/stderr, it means it had an error - and assumes failure (.../include/lib.php). Therefore this warning message from cryptsetup causes it to give up. @DaveDoesStuff may have recently formatted this particular drive, so that the header format is LUKS2. Hence, this chain of events. You may want to consider changing UD to look at cryptsetup's exit code, rather than stdout content, for a reliable indication of success/failure.
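  The suggested fix can be sketched as below. This is a minimal illustration, not UD's actual code; fake_cryptsetup is a stand-in I invented to simulate cryptsetup luksOpen printing the LUKS2 locking-directory warning while still exiting 0 (success), as in the case above.

  ```shell
  #!/bin/sh
  # Sketch: judge luksOpen success by exit code, not by output content.
  # fake_cryptsetup is a hypothetical stand-in for `cryptsetup luksOpen`:
  # it emits the LUKS2 locking-directory warning but exits 0 (success).
  fake_cryptsetup() {
      echo "WARNING: Locking directory /run/cryptsetup is missing!" >&2
      return 0
  }

  out=$(fake_cryptsetup 2>&1)
  rc=$?

  if [ "$rc" -eq 0 ]; then
      # Success - any output is just a warning worth logging, not an error.
      echo "luksOpen succeeded (warning: $out)"
  else
      echo "luksOpen failed with exit code $rc" >&2
  fi
  ```

  With an exit-code check like this, the warning would be logged but the mount would proceed.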
  11. I looked at the UD code and indeed this is the case. When it issues the luksOpen command it expects a null (empty) response on stdout. When there's any (warning) message there, it assumes luksOpen failed and barfs. I suppose you can work around that by creating the directory /run/cryptsetup in your go script (permissions must be 700), which will eliminate the warning message and therefore the failure, but best would be for UD to fix that. EDIT: If you do want the workaround, just add this to your go script:
      mkdir -pm 700 /run/cryptsetup
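  For reference, a quick sanity check of what that one-liner does (using a temporary path for illustration rather than the real /run target, and assuming GNU stat):

  ```shell
  #!/bin/sh
  # Demonstrate the go-script workaround: -p creates parent directories
  # and is a no-op if the directory already exists; -m 700 sets the
  # owner-only permissions cryptsetup requires for its locking directory.
  base=$(mktemp -d)            # stand-in for /run, just for this demo
  mkdir -pm 700 "$base/cryptsetup"
  stat -c '%a' "$base/cryptsetup"   # prints 700
  ```

  Because -m sets the mode explicitly, the result is 700 regardless of the current umask.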
  12. My initial guess is it's a small issue in Unassigned Devices. The error from luksOpen is just a warning, but UD interprets the nonzero rc as an error and gives up. What happens if you mount it from the UI? Does it work fine and mount the drive?
  13. Nothing?! As in no log lines at all?! (If so, something must have gone wrong with the copy/paste. Did you get any error messages when you restarted rsyslogd? Anyway, if that's the case just delete the file (99-spinsasdown.conf) and then remove/install the plugin.) The green button will not always reflect status (sometimes it will, but not with all drives/controllers). The proper indication for debugging is the sdparm command. Yes that was my assumption, which is why I asked for the test.
  14. Hang on. What's interesting is that the command you issue manually is the same command issued by the plugin when it reports "spinning down". So something might be spinning the drive back up before you test. The double message might be a result of your rsyslog configuration (have you changed it from the default?). I have a new version of the plugin, not pushed out yet, which should also address that. Would you mind testing something? If okay with you, please replace the entire contents of the file /etc/rsyslog.d/99-spinsasdown.conf with this:

      # SAS Spindown Assist v0.69k
      $RuleSet local
      if $msg contains "spindown" and $.sasdone != 1 then {
          ^/usr/local/bin/unraid-spinsasdown
          set $.sasdone = 1 ;
      }
      # Some SAS devices spew errors when hit with ATA spindown opcode; filter out
      if re_match($msg, ".*error: mdcmd.*Input/output error \\(5\\): write.*") or re_match($msg, ".*md: do_drive_cmd:.*ATA_OP e0 ioctl error.*") then stop
      $RuleSet remote
      if re_match($msg, ".*error: mdcmd.*Input/output error \\(5\\): write.*") or re_match($msg, ".*md: do_drive_cmd:.*ATA_OP e0 ioctl error.*") then stop

      Once you save the file, restart the rsyslog server:

      /etc/rc.d/rc.rsyslog restart

      Let me know how it fares. As I said, this is part of an upcoming version of the plugin. BTW, this has an extra benefit - it should filter out the i/o error messages that you see in your log whenever Unraid tries to spin down a SAS device. Last question: what SAS controller are you using?
  15. A few questions:
     - What happens if you manually spin down /dev/sg7?
     - When you check sense right after you see the message "spinning down slot 3, device /dev/sdh (/dev/sg7)" - do you see the device spun down? (additional sense etc.)
     - Do you always get two messages of "spinning down" instead of one?
  16. I've just posted a new version on the top post (0.6). It should cover all mounted encrypted (LUKS) drives, be it array, cache or UD managed. Enjoy, and stay healthy and safe.
  17. Thanks. How did you determine that the drive does not spin down? Can you share the details of the controller?
  18. Yes, the script would in most cases skip handling the UD encrypted drives. This is due to the naming convention used by UD. I'll look into changing the code so it handles these too.
  19. Ah. Our posts crossed paths. So this is exactly what I suspected. The script did not deal with the UD device (I'll need to figure out why - perhaps it wasn't mounted at the time the script ran?), so the UD device still has the old key. This is easily solvable. You want to add the new key to the UD device. Let's do it manually. Assume the UD drive is sdX. Then, you want to:
      cryptsetup luksAddKey /dev/sdX1 /root/keyfile
      This should prompt you for the current ("old") key. Just type it in. Once this completes successfully, everything should work with the new key. If you want, you can later remove the old key, but as I said, there's no harm in having two. Lemme know how you fare.
  20. That's good news. My advice: do not link. Copy. In your go file, just:
      cp /boot/config/keyfile /root/
      Ouch. Sounds painful. Hmm. So let's zoom in on that drive and take it step by step.
      1. Are you sure that drive was on the list of drives you changed the key for? Could it be that it was not, and it's now expecting the old key?
      2. Did the script complete successfully? Did you record (or can you still find) its console output?
      Assuming this UD drive is sdX, please try:
      cryptsetup luksOpen --test-passphrase --key-file /root/keyfile /dev/sdX1
      (Note the 1 at the end of the drive name - first partition.) If this fails, try without the keyfile:
      cryptsetup luksOpen --test-passphrase /dev/sdX1
      When prompted, provide your old key. If this one works, we've found the problem. If this is the case, post here and I'll guide you through changing the key manually (basically you need to add the new key, and only once that's successful, remove the old key - or you can even leave the old key be; LUKS works perfectly well with two keys). Not sure about the autostart, but avoiding an angry wife is our first priority.
  21. @DaveDoesStuff - Just so I understand the current situation - so you have the key in /root/keyfile and some of your drives won't open when you start the array? If you just copy the key file to /root/keyfile and then start the array - what do you get?
  22. Indeed, the UI doesn't sense it when you spin the drive down from the command line. However, if you use the plugin in the normal course of things, when Unraid is the one to schedule the spindown, it should work fine and be reflected in the UI in green/grey (it does here; I've been using the plugin for a couple of weeks now). (Sorry for misreading your previous message - I thought you were referring to the log messages you quoted.) If you did it manually via the console (using sg_start and sdparm) - while the array was started - then this might be it. If you did it in sync with Unraid (i.e. either via its own scheduled spindown or by pushing the green button) - I'd think not. You can rebuild that drive, which should restore its contents.
  23. Thanks for taking the time. How did you determine that the drive is not spun down? These two messages (OP e0 error and mdcmd error (5)) are not really read errors. They are the drive/controller's response to the ATA ops sent by Unraid, and are not an indication of a problem. I'm assuming you were receiving them before installing the plugin (and keep receiving them if you remove the plugin) - please correct me if I'm wrong. The next version of the plugin (not pushed out yet) includes a feature that filters out these syslog messages.