Community Developer

Everything posted by doron

  1. Thanks @trurl. Much appreciated and apologies for that. Your findings are aligned with the initial description (btw, being author of SAS Spindown plugin, I have some mileage of intimacy with these drives 🙂 ). Trying to prioritize the burning situation here, I'd like to put aside, for the moment, the reason for parity2's failure (I'll deal with it later), and focus on the operational, Unraid-specific question: At this state, am I right in assuming disk3 is indeed fully rebuilt? Will stopping the futile "rebuild" process right now, doing a New Config without parity2 and with "parity assumed valid", return the array to a stable, protected state? (then, I will deal with parity2 and probably rebuild it, but I will have a protected array). Thanks!
  2. D3 is the one that's invalid (as you say, this is expected). Q is disabled (DISK_DSBL). Here. Thanks.
  3. Yes, that's what it started doing, but Q is DSBL right now so it just reads stuff and writes nothing. Recommendation? (a) Just wait for it to finish doing nothing for the next ~20 hours and then rebuild Q or (b) do a New Config assuming D3 is indeed good vis-a-vis P (taking Q out of the game)? BTW I do think this is a bug worth addressing. But currently I'm with the operational questions.
  4. Thanks. That drive is disabled, and nothing is in fact being written to it, so it is certainly not being rebuilt. But wait. You may be on to something here. When the data rebuild started, parity2 was not disabled. I started rebuilding both, assuming the double issue was a controller / cabling mishap (which I still think it might have been, but that's beside the point). Parity2 red-x'ed again during the rebuild. So we might be looking at a corner case bug, where data rebuild is not aware that its 2nd target is disabled, and thinks it needs to complete the run. Plausible? Anyway, the situation now is that I believe disk3 is already fully rebuilt, and parity2 is not being rebuilt, so the process is chewing in vain. What would be the least-risky procedure to bring the array back to a protected state? I thought of:
     - Stop the rebuild
     - Bring down array
     - New config without parity2, and "parity assumed valid"
     - When all is green, insert parity2 and rebuild it.
     Is this a good way? Is there a better way (e.g. can I tell emhttp now that disk3 is valid, in spite of the allegedly incomplete rebuild)?
  5. Hi folks, I'm recovering from a mishap with two red x HDDs. (Thank God and @limetech for dual parity!!) Two parity drives of 12TB. One of them is red-x'ed. Five data drives of 4TB and one of 12TB. One of the 4TB was red-x'ed. Started data rebuild for the 4TB. Parity2 is still disabled. Data rebuild seems to have covered the entire 4TB of the target drive, but the drive's icon is still orange and "rebuild" continues (just reading, not writing anything). It is now at 117% and happily chewing on. It appears to want to go all the way to 12TB (shudder), although the rebuilt data drive is only 4TB. I've never seen this before. Is this a known issue? Surely it's not how it's supposed to work... If I stop the array and try to mark the disk as "parity presumed valid" (I guess I'd need "new config" for that?) - would that work? Any feedback would be appreciated. This seems to be odd behavior - new to me - and since I'm now with no protection (remember, two red x's) I don't want to break anything. Unraid 6.10.2.
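To put the 117% figure in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch, and it assumes that the displayed percentage is relative to the 4TB data drive and that 1 TB = 1000 GB; neither is confirmed in the post.

```shell
# ASSUMPTION: the GUI percentage is relative to the 4TB data drive.
data_tb=4        # size of the data drive being rebuilt
parity_tb=12     # size of the parity drives
percent=117      # progress figure quoted in the post

# Position reached so far, in GB (integer arithmetic, 1 TB = 1000 GB).
position_gb=$(( data_tb * 1000 * percent / 100 ))
# Percentage the counter would show if the run went all the way to 12TB.
final_percent=$(( parity_tb * 100 / data_tb ))
# Amount still to be read in that case.
remaining_gb=$(( parity_tb * 1000 - position_gb ))

echo "position:  ${position_gb} GB"
echo "final:     ${final_percent}%"
echo "remaining: ${remaining_gb} GB"
```

Under those assumptions the run is about 4.68TB in, would end at 300%, and has roughly 7.3TB left to read.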
  6. Not much except that per your findings, ST2000NM0001 seems to not honor the "spin down" (standby) command. As discussed in this thread, unfortunately not all drives / controllers handle this instruction the same way. Some do the right thing, some spin it down and require an explicit "wake up" instruction (resulting in red-x in Unraid systems), and some just ignore it. If you want to run a test, then assuming the HDD is /dev/sdx, try this: sg_start -rp3 /dev/sdx && sleep 1s && sdparm -C sense /dev/sdx Let us know what you get on output.
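If you want to repeat the test on several drives, the output can be classified with a small helper. This is a rough sketch only: the function name classify_sense is made up for illustration, and the text it looks for ("standby condition") is an assumption about what a spun-down drive reports, so compare it against what your drive actually prints.

```shell
# classify_sense: read the output of "sdparm -C sense" on stdin and
# print a rough verdict.
# ASSUMPTION: a drive that went into standby reports sense text that
# contains "standby condition"; adjust the pattern to your real output.
classify_sense() {
  if grep -qi "standby condition"; then
    echo "spun-down"
  else
    echo "not-spun-down"
  fi
}

# Usage (device name as in the post):
#   sg_start -rp3 /dev/sdx && sleep 1s && sdparm -C sense /dev/sdx | classify_sense
```

A "not-spun-down" verdict right after the standby request would be consistent with a drive that ignores the command.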
  7. There's a chance it has to do with the enclosure / backplane / controller. I'd check whether there's updated firmware for any of them. Long shot but worth a search. Sent from my tracking device using Tapatalk
  8. Try this: /etc/rc.d/rc.autofan restart
  9. Very good. Now let's hope @bonienl considers adding the code fix into the plugin.
  10. Okay @DavidNguyen, here goes. Again, please note this is not sanctioned by the plugin author and should be considered a hack, just for testing. Also note that I didn't package it back into plugin format, which means that the change will be lost upon reboot. Attached please find the modified script "autofan". Steps to activate:
      - Place the script in /usr/local/emhttp/plugins/dynamix.system.autofan/scripts/
      - chmod 755 autofan
      - Go to settings, Fan Auto Control, set the function to "Disable", then Apply
      - (Make sure process "autofan" is not running)
      - Set function back to "Enable", then Apply
      You should now have autofan function without offending your SAS drives. Please report success / issues.
      autofan
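The first two steps above (copy the script, then chmod 755) can be sketched as a small shell helper. The function name install_script is hypothetical and not part of the plugin; the GUI steps (Disable, Apply, Enable, Apply) still have to be done by hand.

```shell
# install_script SRC DEST_DIR: copy SRC into DEST_DIR and make it
# executable (mode 755). Hypothetical helper, for illustration only.
install_script() {
  src=$1
  dest_dir=$2
  [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
  [ -d "$dest_dir" ] || { echo "no such directory: $dest_dir" >&2; return 1; }
  cp "$src" "$dest_dir/" || return 1
  chmod 755 "$dest_dir/$(basename "$src")"
}

# Usage, with the path given in the post:
#   install_script ./autofan /usr/local/emhttp/plugins/dynamix.system.autofan/scripts
```

As noted in the post, the change does not survive a reboot, so the copy has to be repeated after each boot for as long as you are testing.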
  11. @DavidNguyen, meanwhile please keep us posted as to whether disabling the autofan plugin indeed stopped the phenomenon. If you want to test, I can send you a (hacked, unauthorized) code fix for that plugin to try.
  12. Hi @bonienl, It seems like this plugin will spin up SAS drives every $INTERVAL minutes. It checks for HD temps, however the test for spun-down status is good for ATA drives and not for SAS drives. I proposed a small code change (pull request onto your repo) to fix this. Thanks for considering it.
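The idea behind the proposed change can be illustrated roughly as follows: before reading a temperature, ask a transport-specific check whether the drive is spun down, and skip the read if it is. This is not the actual pull request. Every function name here is hypothetical, and the spin-state checks and temperature read are stubs so that the control flow can be run on its own.

```shell
# Stub spin-state checks, one per transport. A real version would query
# the drive; these names and return values are placeholders.
is_spun_down_ata() { return 1; }   # stub: pretend ATA drives are awake
is_spun_down_sas() { return 0; }   # stub: pretend SAS drives are in standby

# Stub temperature read (hypothetical).
read_temp() { echo "temp for $1"; }

# poll_drive DEV TRANSPORT: read the temperature only if the
# transport-specific check says the drive is not spun down.
poll_drive() {
  dev=$1
  transport=$2
  case $transport in
    ata|sas) ;;
    *) echo "unknown transport: $transport" >&2; return 2 ;;
  esac
  if "is_spun_down_$transport" "$dev"; then
    echo "skip $dev (spun down)"
  else
    read_temp "$dev"
  fi
}

poll_drive /dev/sdx sas
poll_drive /dev/sdy ata
```

The point of the dispatch is that a check written for ATA drives is never applied to a SAS drive, which is the situation described in the post.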
  13. Just for the sake of testing, can you disable the System Autofan and System Temp plugins and retry? EDIT: Reading the code of System Autofan, it will definitely spin up SAS drives every $INTERVAL minutes. My guess is that if you disable this plugin, the phenomenon you describe will go away. I will contact the author.
  14. Spinning up a few minutes later would probably be due to some activity against the drive. You may want to check for other plugins, Dockers or VMs that generate periodic I/O against the array.
  15. Generally speaking, the passphrase is placed in /root/keyfile, but please read the official Unraid docs for the complete picture (there's UI to specify keyfile, etc.)
  16. root@Tower:~# unraid-newenckey -h
      == unraid-newenckey v0.9, made for Unraid, change encrypted volumes' unlock key. @doron ==
      Usage: unraid-newenckey [current-key-file] [new-key-file]
      Both positional arguments are optional and may be omitted. If provided, each of them is either the name of a file (containing a passphrase or a binary key), or a single dash (-). For each of the arguments, if it is either omitted or specified as a dash, the respective key will be prompted for interactively.
      Note: if you provide a key file with a passphrase you later intend to use interactively when starting the array (the typical use case on Unraid), make sure the file does not contain an ending newline. One good way to do that is to use "echo -n", e.g.:
      echo -n "My Good PassPhrase" > /tmp/mykeyfile
      root@Tower:~#
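To double-check the "no ending newline" requirement from the usage text, something like the following can help. It is a minimal sketch: the helper names write_keyfile and ends_with_newline are made up for illustration, and printf '%s' is used here as an alternative to "echo -n" with the same effect.

```shell
# write_keyfile PASSPHRASE FILE: write the passphrase with no trailing newline.
write_keyfile() {
  printf '%s' "$1" > "$2"
}

# ends_with_newline FILE: succeed if the last byte of FILE is a newline (0x0a).
# An empty file has no last byte, so the check fails for it.
ends_with_newline() {
  [ "$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')" = "0a" ]
}

# Usage:
#   write_keyfile "My Good PassPhrase" /tmp/mykeyfile
#   ends_with_newline /tmp/mykeyfile && echo "WARNING: keyfile ends with a newline"
```

Checking the last byte directly avoids guessing from how the file looks in an editor, since many editors add a final newline silently.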
  17. Here you go @another_hoarder. To use it, put it anywhere convenient, align permissions (chmod +x generic.wrapper) and then use thusly: generic.wrapper <CommandToBeWrapped> In your case, the command to be wrapped would be smartctl. To remove the wrapper, run the same command again. No warranty, your own risk etc. generic.wrapper
  18. Enabling debug on the plugin will not generate very much output on syslog, if at all (umm, I guess I need to change that). Note however that if you are looking for debug info on spin up, you'd probably not have found it here anyway - the spin ups / SMART reads occur outside of the plugin and unrelated to it. If you want to explore debugging other parts of the system, you might want to wrap smartctl with some debug code, to track calls to it. Let me know if you want to explore that path, I can provide you with a neat little wrapper for that.
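For readers curious what wrapping a command with debug code can look like, here is a minimal sketch of the general technique: rename the real binary and put a small logging shim in its place. It is not the wrapper offered in this thread; the function names, the ".real" suffix and the log format are all assumptions made for illustration.

```shell
# wrap_command TARGET LOGFILE: rename TARGET to TARGET.real and install a
# shim that logs each call before running the real binary.
# Hypothetical helper; use at your own risk.
wrap_command() {
  target=$1
  logfile=$2
  [ -x "$target" ] || { echo "not executable: $target" >&2; return 1; }
  [ -e "$target.real" ] && { echo "already wrapped: $target" >&2; return 1; }
  mv "$target" "$target.real" || return 1
  cat > "$target" <<EOF
#!/bin/sh
# Log a timestamp, the parent PID and the arguments, then run the real binary.
echo "\$(date '+%F %T') ppid=\$PPID args: \$*" >> "$logfile"
exec "$target.real" "\$@"
EOF
  chmod 755 "$target"
}

# unwrap_command TARGET: put the real binary back in place.
unwrap_command() {
  [ -e "$1.real" ] || { echo "not wrapped: $1" >&2; return 1; }
  mv -f "$1.real" "$1"
}

# Usage (log path is just an example):
#   wrap_command "$(command -v smartctl)" /tmp/smartctl-calls.log
#   ... wait for a spin-up, then inspect the log ...
#   unwrap_command "$(command -v smartctl)"
```

Logging the parent PID is a design choice aimed at the question raised here: it gives a starting point for finding out which process issued the call that woke the drive.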
  19. rc8 -> stable. No issue. Happy puppy here.
  20. Just upgraded (from rc6). No issues. Thank you!
  21. Updated from rc5. No issues. Looking good, great work.
  22. That may be related to the "smartctl -n" issue spinning up drives. You may want to check 6.10-rc5, which per @SimonF includes an updated version of smartctl which has this issue solved (haven't seen it in the release notes btw).
  23. (sorry for the deleted post) I just pushed out a newer version of the plugin (also with a more a-la-mode version numbering scheme). Should fix this issue - let me know if it does.
  24. Is the time span between spindown and "read SMART" always the same (about 1 min)? Does the same thing happen both when you let the drive spin down "naturally" (i.e. following the Unraid spin down delay) and when you force spin down from the GUI? Could you perhaps have some folder that lives mainly on these drives and sees frequent I/O? Have you tried upgrading to a 6.10 RC? BTW all my SAS drives are HGST (HUH721212AL4200) with an (onboard) LSI controller and they remain spun down nicely.
  25. In the case above, it seems that the spin-up (read SMART) happens more than a full minute after the spin down. This might be a result of normal I/O (e.g. read) against the array.