
doron

Community Developer · Posts: 642 · Days Won: 2

Everything posted by doron

  1. @Derek_, glad it worked well for you. Just for other readers of this thread - note that for the array disks you must use the /dev/mdX devices (not /dev/sdX), whereas for the encrypted cache drive you need to find the actual device name /dev/sdX and then use it with a "1" suffix, for the first (only) partition on that disk. You do not do this for parity drives - they are not LUKS devices. You're very welcome. Great question. You do have 7 left. What technically happened is that for each drive, you added a key into slot #1 (the second slot), then restarted and tested, and then removed the key from slot #0, marking it "disabled". The next time you do the same exercise (assuming no fiddling in between - I'm simplifying this a bit), you will add a key into the first free slot - which will now be #0. So if I understand your underlying question, you can repeat this process an infinite number of times - as long as you remove the old keys. The eight slots can hold up to eight keys simultaneously.
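For readers who want to check the slot state themselves, a minimal sketch (assuming /dev/md1 as an example array device):

# Show the LUKS header, including which of the eight key slots are enabled or disabled
cryptsetup luksDump /dev/md1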
  2. I don't see a reason to stop anything. The key slots are fixed and are manipulated in place. Your data is not affected during the process. Just do the full set of luksAddKey, stop/start when convenient to see that the new key works, then a set of luksRemoveKey.
  3. Absolutely not. You must do it when the array is started. In fact when it's not started you will not be able to follow the instructions above, as the /dev/mdX devices will not be available.
  4. @Derek_, limetech's post in the thread referenced above has it exactly right. In brief, you can either use "cryptsetup luksAddKey" followed by "cryptsetup luksRemoveKey", or just use "cryptsetup luksChangeKey" which, in most instances, just bundles these two actions together. The safest way would be to do a whole set of luksAddKey, test that everything works with the new key (array stop/start from the GUI etc.), and then use luksRemoveKey to clean up. The /dev/ devices you must use as operands for cryptsetup are /dev/md1 for disk1, /dev/md2 for disk2, and so forth. Note that you have to do it unto all of your encrypted array disks(!), and make very sure you assign the exact same key to all of them. Another thing you want to be mindful of is your cache. If you have configured it to be encrypted XFS, you'll need to take care of it, too. To figure out the correct device name for the operation, you can check out your GUI main tab. Under "Cache Devices" you will find your cache drive. There, you will have the device name in parentheses under "Identification". So if, for example, you find "(sdc)" in there, your parameter for the luksAddKey / luksRemoveKey would be /dev/sdc1 (note the "1" suffix, indicating the first partition). (Edited - corrected cache reference)
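A minimal sketch of the flow described above, assuming three array disks, an encrypted cache at (sdc), and hypothetical key file paths - adjust all names to your own setup:

# Hypothetical key files holding the current key and the new key
OLDKEY=/root/oldkeyfile
NEWKEY=/root/newkeyfile

# Add the new key to every encrypted array disk (array must be started)
for dev in /dev/md1 /dev/md2 /dev/md3; do
    cryptsetup luksAddKey --key-file "$OLDKEY" "$dev" "$NEWKEY"
done

# Encrypted cache: operate on the first partition of the raw device, e.g. (sdc) -> /dev/sdc1
cryptsetup luksAddKey --key-file "$OLDKEY" /dev/sdc1 "$NEWKEY"

# Stop/start the array from the GUI with the new key to verify it works, then
# remove the old key from each device
for dev in /dev/md1 /dev/md2 /dev/md3 /dev/sdc1; do
    cryptsetup luksRemoveKey "$dev" "$OLDKEY"
done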
  5. Hi folks, I've recently pre-cleared 3 drives using the plugin (run off of Unassigned Devices, successfully completed) and then added them to the array, one by one. Two drives were recognized by Unraid as clear and added immediately. The third was not recognized as being precleared and Unraid kicked off a clear - again. For a 12TB drive this is a bit frustrating... The two drives that went in smoothly are 4TB, 512e. The one drive that went into clearing again is 12TB, 4Kn. Btw the Unassigned Devices code recognized all three of them as "precleared" (under fs type). Any thoughts on what could be the reason for the larger (newer) drive to not be recognized as pre-cleared? Could it have something to do with the drive being 4Kn? (I posted about this here and received the good advice to ask on this thread).
  6. Hi all, I've seen this discussed in old threads, but no resolution, so here goes. I've just added three drives to my array: two 4TB drives and one 12TB drive. All three had just previously been precleared successfully using the preclear plugin (and the Unassigned Devices plugin noticed the signature and marked their fs as "precleared"). I added them one by one (just caution, no reason). The two 4TB drives were recognized as precleared, added immediately and were ready for formatting. The 12TB one was seemingly not recognized as such, and starting the array initiated a clearing. For a 12TB drive this is going to take many hours, which is a bit frustrating as we've just been through a full preclear. One fact that may or may not be related is that the "problematic" 12TB drive is 4Kn, while the 4TB drives are 512e. Could it be that the PC plugin's signature on such a drive is not properly detected by Unraid? Any other reason that may cause this? (Unraid version 6.7.2.)
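If one wants to see what Unraid is actually looking at, a hedged sketch for comparing the signature area of an accepted drive against the rejected one (device names sdb and sdd are assumptions for illustration):

# Dump the first sector of a drive that was accepted as precleared and of the
# 12TB drive that was not, then compare the two dumps by eye
dd if=/dev/sdb bs=512 count=1 2>/dev/null | hexdump -C
dd if=/dev/sdd bs=512 count=1 2>/dev/null | hexdump -C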
  7. I didn't look up the specific mobo but typically the SATA ports are tied off of the chipset, which you can't passthru. The better news is that you can pass the SATA drives you connect to them, as RDM drives. This forum has numerous guides as to how to do that. This will leave your slots free.
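For readers looking for the mechanics, a hedged sketch of creating a physical-compatibility RDM on an ESXi host (the device identifier and datastore paths are assumptions; the forum guides cover the details):

# List the physical disks to find the device identifier of the drive to map
ls /vmfs/devices/disks/
# Create a physical-compatibility RDM pointer file on a datastore
vmkfstools -z /vmfs/devices/disks/t10.ATA_____EXAMPLE_DISK_ID \
    /vmfs/volumes/datastore1/unraid/unraid_disk1_rdm.vmdk
# Then attach the resulting .vmdk to the Unraid VM as an existing disk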
  8. A data point: My SAS drives (HGST HC520, relatively new crop) appear to spin up and down properly for Unraid, in the sense that the GUI shows them as spun down and turns their temp display into "*", and there are no i/o sense errors in syslog. However, spin down seems to not actually happen, judging by the fact that clicking "spin up" on the GUI "spins them up" instantly, and their temp remains around 30C while my spun down SATA drives wake up at around 24C. Connection is via an on-board LSI 2308 chip.
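One hedged way to poke at this from the command line, assuming sdparm is available on the box (the device name is an example):

# Ask the SAS drive to stop (spin down), wait a while, then start it again and
# note how long the start takes; a near-instant response suggests the platters
# never actually stopped spinning
sdparm --command=stop /dev/sdg
sdparm --command=start /dev/sdg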
  9. It's small. As you say, there are workarounds. I was thinking about the principle of least surprise, or "how many things will break with alt A vs. alt B". For me it's some logic for setting up /root/keyfile; for others it might be something else. Anyway - minor annoyance at most, easy to resolve.
  10. Actually I do understand, quite well in fact. Which is why I commented that it makes sense to me to set fmask & dmask to 77 (which is 077). The bit that made me wonder is the disabling of the X bit for root (which is what's caused by setting fmask to 177 as opposed to 077). Understood and appreciated. However this is unrelated to turning off the X bit for root. That is the only bit I'm concerned with. Nor should you. It makes good sense to make all these files inaccessible to non-root users. Bottom line - perhaps you want to reconsider that one bit, the X bit for root. It has nothing to do with access to the sensitive files; those will be fully protected from non-root snoopers with dmask=77 and fmask=77. Thanks!
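To make that one bit concrete, a small illustration of how the vfat fmask values discussed above map onto file permissions (the device and mount point are placeholders):

# fmask=077: root keeps rwx on every file, everyone else gets nothing
mount -t vfat -o fmask=077,dmask=077 /dev/sda1 /mnt/test
ls -l /mnt/test            # files show as -rwx------
umount /mnt/test

# fmask=177: the execute bit is masked off even for root
mount -t vfat -o fmask=177,dmask=077 /dev/sda1 /mnt/test
ls -l /mnt/test            # files show as -rw-------
umount /mnt/test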
  11. I suspect this is likely to break some stuff users have been doing for years. I know it does for me. So if I may suggest: include this change in the release notes, as something that matters, and add an "advanced, don't push button unless you know whatcha doin" type of option to set dmask and fmask to non-default values. Regardless, I'm not sure I follow the logic (i.e. the security value) of making that change. If someone who's up to no good has write access to the flash drive content, it's clearly game over. So while fmask=77 might help with respect to future non-root users, I fail to see how fmask=177 helps. An elaboration on the threat model would help here. Thanks for all the good stuff pouring in!
  12. Could you please elaborate a little bit on the new security scheme? Or provide a link if it has already been discussed (sorry, I couldn't find it). Thanks!
  13. Thanks for taking the time! (Just to avoid confusion, this particular issue was first reported by @electron286; I was asking about progress since I bumped into the same issue.) At any rate, I found the problem, at least in my case. My setup includes a floppy drive, which is a platform-connected device. Seems like the ScanControllers.cfm device tree parsing is slightly confused by this. What you get in ls -l /sys/block for this device looks like this:
fd0 -> ../devices/platform/floppy.0/block/fd0
Once instructed to ignore such devices, the scanning process completes happily and everything starts to work flawlessly. Below is a proposed patch to ScanControllers.cfm. Again, thanks for building and maintaining this fantastic tool over the years!
--- ScanControllers.cfm.orig 2019-08-07 06:57:04.000000000 +0300
+++ ScanControllers.cfm 2019-10-09 10:49:20.150512121 +0300
@@ -388,7 +388,7 @@
 <CFFILE action="write" file="#PersistDir#/ls_sysblock.txt" output="#BlockDevices#" addnewline="NO" mode="666">
 <CFLOOP index="i" from="2" to="#ListLen(BlockDevices,Chr(10))#">
 <CFSET CurrLine=ListGetAt(BlockDevices,i,Chr(10))>
-<CFIF FindNoCase("virtual",CurrLine) EQ 0>
+<CFIF FindNoCase("virtual",CurrLine) EQ 0 AND FindNoCase("platform",CurrLine) EQ 0>
 <CFSET DrivePath="/sys/" & ListDeleteAt(ListLast(CurrLine,">"),1,"/")>
 <CFSET CurrDrive=Duplicate(Drive)>
 <CFSET CurrDrive.DevicePath=DrivePath>
  14. Hi @jbartlett - has there been progress on this issue? I'm seeing the same issue ("usr/bin/lspci: option requires an argument -- 's'") so I thought I'd ask. Thanks!
  15. Yes, indeed. Thanks for this! - I will add this to the post. I didn't go that path since (a) I was concerned that, placed over a plug pin, the tiny piece of tape would not hold for many insert/removal cycles and would eventually peel off and cause trouble - I'm not sure how real that concern is, and (b) the adapters I applied the "destructive" way on are quite inexpensive (the first type can be sourced from Aliexpress for a little over a buck apiece).
  16. This has been mentioned here and there on the forums, but I thought I'd offer a concise guide in the hope of saving some grief for at least one other person 🙂
The symptom is quite simple: you purchased one or more new drives, you connect them to your existing setup and they just don't spin up. They seem to be DOA. The good news is that most likely they're not, and you're just having a case of the SATA Pin 3 syndrome. This will happen when your power supply is not of the newest crop, in one of two cases:
The drive is SAS, newer crop, and you are trying to connect it to SATA-style SAS controller ports, such as those on boards like the SM X10SL7-F, using contraptions such as this SATA-to-SAS adapter or that SAS-to-SATA cable.
The drive is SATA, newer crop. For example, some of WD's (ahem, HGST's) new enterprise drives - in my case, the HC520 - specifically HUH721212AL4200 (SAS) or HUH721212ALN600 (SATA) - will demonstrate this issue.
Very briefly, the issue has to do with a newer spec of the SATA power connector. Revisions of the SATA spec after Rev 3.2 (which does not have this) redefine the function of pin 3 on the SATA power connector to be Power Disable. This means that for drives supporting this feature, if the drive sees live voltage on this pin (typically 3.3V), it will power itself off (or not power on, as the case may be). (This is done in support of hot-swap enclosures and arrays, where the ability to hard-power-cycle a single drive without physical access or total system disruption is a boon.) WD has a nice writeup about this if you want to read more.
So basically, if these drives see voltage on pin 3, they will not start. Now, many PSUs we use these days, unless they're extremely new, will show 3.3V on all three pins 1/2/3. (In addition, some cheap SAS-SATA adapters short pins 1-2-3 together, so even if your PSU does not feed pin 3 with 3.3V, your drive may still see voltage there due to this "feature".)
So how do we fix this? One way is to use Molex-to-SATA power adapters, such as this. These do not carry 3.3V in (only 5V and 12V), so problem solved. Even if you have a SATA power chain cable, you can still feed its end off of one of these and you should be good. One type of SAS-to-SATA adapter - this one - also solves the problem, since its power is fed by a Molex power plug.
Another way (the one I used) is to hack these adapters. Take a sharp cutter and gently pick up, fold and break pin 3 of the power section of your adapter, counting from the L-shaped end (do not even consider doing this unto the drive itself!). Make very sure you get rid of the broken pin, so it doesn't find itself inside some electronics later. It will look like this:
On some other adapters, however, it gets slightly trickier. Seems like some of them short pins 1-2-3 together, so removing only input pin 3 does not help, cuz the 3.3V that's fed to pin 1 flows to the drive's pin 3. In that case, what I did is remove all three pins 1-2-3. Note that pins 1-2 are marked "reserved" in the standard, so I didn't expect anything bad to happen. And sure enough, everything started to churn. This is how it looks:
A similar and less destructive way to solve this issue, as mentioned below by @jonathanm, is to use a thin slice of Kapton tape and place it over pin 3. If choosing this solution I'd probably opt for applying it onto pin 3 of the drive itself, to make the solution compatible with adapters that short pins 1-2-3 together. There are several good online guides for this, e.g. this one. I didn't opt for this solution since I was unsure just how well the tiny piece of tape would withstand multiple insert/removal cycles of the plug.
Now, your drives should spin up and live happily ever after. Edit: Added the Kapton tape method, the SAS-SATA-Molex adapter and some text clarifications.
  17. Yes, this should be possible - the question is, why? What are you trying to achieve?
  18. I know, sorry about that. I was debugging the situation, shut down to check hardware, and the logs and diags are gone. I need to see to it that at least syslog persists on my server. I'd try to reproduce on a test server if I could run one... Side note: a test/debug license could have been a great feature, e.g. make that license type do something like shut down the server after 24 hours, or after two hours of "idling", or some other arbitrary action that is very annoying for production servers but harmless for testing. I for one would have used it a lot 🙂 Exactly. Thanks!
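As a hedged stopgap for keeping syslog around across reboots, something like the following could work, assuming /boot is the flash device and the interval is arbitrary:

# Copy the live syslog to the flash drive every 10 minutes so the latest copy
# survives a reboot (paths are illustrative)
mkdir -p /boot/logs
( crontab -l 2>/dev/null; echo '*/10 * * * * cp /var/log/syslog /boot/logs/syslog-latest.txt' ) | crontab -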
  19. I'm very sure it was only rebuilding parity[1]. However, probably due to some leftover somewhere, it decided that "Total size: 12TB" and was apparently trying to do, well, exactly that. If I were to try to reproduce this (I wish I had a test system, as I had with 5.x - a testing license would have been great), I would have tried to have two parity drives of the same size, and then insert a smaller drive into the parity[1] slot and see if it's reproduced. @itimpi - if you have the bandwidth and interest... It usually allows replacing one drive in one slot, I believe. See above - 'twas not rebuilding both. It was rebuilding one, assuming an incorrect size.
  20. Following your suggestion I just did. And - lo and behold - it now reports it is doing (Total size:) 4TB of sync (not 12TB as it did the previous time around). Judging by this, I'm quite sure it will also complete successfully (we'll know in 8 hours or so...). So it may have been a one-off quirk. Which is probably why no one else reported this happening.
  21. s/will not/should not/ 🙂 But it did. This is what I was reporting. Stop array; unassign the red-x (smaller) drive from the parity slot; start array. Indeed. And no file i/o was happening anyway at that time - only parity sync. What you're saying as "it will not" and "nor will it" should really be "it should not" and "nor should it". My detailed problem report above asserts that it did exactly that, at exactly 4TB (33.3%) into the parity sync. I checked the logs at that point and this is what they were saying (unfortunately the machine was rebooted a couple of times since then so I can't provide syslog, but trust me on that): it was trying to access the disk beyond its end, got a failure and error sense, and disabled the parity drive. See above. Unfortunately it does. Which is why I said this is something that needs fixing.
  22. Let me clarify. I agreed with you that it might be the best way to have both parity drives sync simultaneously. However, in the case that there are two drives of different sizes in the two parity slots, the process fails and one of the drives gets a red x (dsbl). This is the part that needs fixing, IMHO. (E.g. one way to address it would be to make sure the smaller of the two is still at least as large as the largest data drive, and then proceed to the ends of both drives - but not beyond - so that the process completes successfully.) I didn't think that; this is what happened. It tried to read past the end of the smaller drive, failed, and placed the parity1 drive in the red-x disabled state. It is now not part of the array (it is unassigned). The array now has parity2 and three data drives. Nothing left out of the description 🙂 The parity sync process failed. The first parity drive was placed in red x after Unraid tried to read past its end. See section 2 under "What we did" in the first post. Sorry if this wasn't clear from my post.
  23. Agreed. In this case, parity is built for an array in which the largest data disk is size X and the two parity disks are size X and size Y>>X, and it fails. It might be something worth fixing. I'm still unsure as to my previous questions - (a) is the array protected in its current state (data drives and parity2 only)? (b) is there any safe way to return to 4TB parity without going thru an unprotected period? Thanks!!
  24. Thanks for responding @trurl. I'm not sure what you mean by "continuing". The parity sync of the 12TB parity2 had been completed previously, after 23 hours and 12TB of reported progress. Are you saying that it started re-calculating parity2, as a function of a first parity drive joining in? If so, that would make sense; the observed data points are (a) it reported "syncing 12TB" from the moment it started building the 4TB parity, and (b) it was trying to access (I guess either r or w) the 4TB drive beyond its physical limits. Fair question. I was hoping to pull all three out of the game and run extended tests on them, to see whether I have a batch of lemons or this is a one-off (they aren't cheap). One of the other two does report a small number (like 15 or so) of "Correction algorithm invocations" upon write that were successfully corrected. Probably normal and not a cause for alarm, but due to the one lemon, I want to be extra careful. So I'm trying to return to 4TB parity. But -- agreed, if all else fails, I can probably do that. Prefer not to if at all possible.