EMPTY and Remove a Drive Without Losing Parity


NAS


Resurrecting this topic after the move to v5.

 

With all the bells and whistles being discussed, something as fundamental as this should be possible. You should not have to put unRAID into a failed state to remove a drive.

 

I assume no one ever worked out how to do this?

 

 

Edit by limetech: moved this discussion here.

 

 

Update: We now have an untested procedure:

 

....

If you really want to remove a drive and leave parity undisturbed, this will work:

1. Start array in Maintenance mode.  This ensures no file systems are mounted.

2. Identify which disk you're removing; let's say it's disk3.  Take a screenshot of your current drive assignments.

3. From the command line type:

dd bs=1M if=/dev/zero of=/dev/md3    # 'md3' here corresponds to 'disk3'

4. Go to bed because this will take a long time.

5. When the command completes, Stop array, go to Utils page, click 'New Config' and execute that Utility.

6. Go back to Main, assign Parity and all devices except the one you just cleared.

7. Click the "Parity is already valid" checkbox, and click Start.
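For anyone who wants a progress readout during step 3, a variant like this should work (a sketch; status=progress assumes a reasonably recent GNU coreutils dd):

dd bs=1M if=/dev/zero of=/dev/md3 status=progress    # progress on stderr

Note that dd ends with a "No space left on device" error when it reaches the end of the device -- for this procedure, that is the expected, successful outcome.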

 

Any code changes I make will just refine the above process.  Here are several I can think of:

a) Add the ability to mark a disk "offline".  This lets us start the array normally in step 1 so the server is not down during the process.

b) Add some code to put the array in a mode where all writes are "reconstruct writes" vs. "read-modify-writes" (see the sketch after this list).  This requires all the drives to be spun up during the clearing process but would probably let step 3 run 3x faster.

c) Add an explicit "Clear the offline disk and remove from array when done" button.

...
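For anyone wondering why (b) would be faster: a read-modify-write has to read the old data and old parity blocks before writing, while a reconstruct-write just reads the other data drives and streams pure sequential writes to the cleared disk and the parity disk.  Both compute the same parity, which a little shell arithmetic can illustrate (single bytes standing in for whole blocks; values are arbitrary):

d1=0xA5; d2=0x3C; d3=0x0F              # current block on each data drive
parity=$(( d1 ^ d2 ^ d3 ))             # current parity block
new=0x00                               # the zero we are writing to disk1

rmw=$(( parity ^ d1 ^ new ))           # read-modify-write: patch parity
rcw=$(( new ^ d2 ^ d3 ))               # reconstruct-write: rebuild parity
echo $(( rmw == rcw ))                 # prints 1: both give the same parity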

 

Important warning!!!  The procedure above erases the drive completely, and assumes you have nothing worth saving on the drive.  If you have unsaved files on the drive, COPY THEM OFF FIRST!!!  Or you will lose them!

 

 

Link to comment

I assume no one ever worked out how to do this?

 

There's really not much difference in the "at risk" time between updating parity to allow a drive to be removed and simply doing a new parity sync.

 

If there was a "remove drive" feature, it would have to do a complete pass on the drive being removed and update the parity drive based on the results => the system would be "at risk" for the duration of that update.    Simply doing a New Config without the disk you no longer want will do a full parity sync -- with the same effective result.

 

There IS conceptually a way to remove a drive without losing parity protection ... but you need a utility I'm not sure is available.  If you wrote all zeroes to every sector in the drive you want to remove (while it's still part of the array);  and then did a parity check to confirm everything's good in the array -- you could then do a New Config and safely check the "Trust Parity" box, since you would KNOW that the drive being removed was all zeroes and thus had no impact on parity  :)
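If you wanted to actually verify the "all zeroes" assumption before trusting parity, something along these lines would do it (a sketch, assuming the GNU dd/od that Slackware-based unRAID ships; slow, since it reads the entire device -- 'md3' is just an example):

# od squeezes runs of identical lines, so an all-zero device produces
# almost no output; grep then hunts for any hex digit other than 0.
if dd if=/dev/md3 bs=1M status=none | od -An -tx1 | grep -q '[1-9a-f]'; then
    echo "NOT all zeroes -- do not trust parity"
else
    echo "all zeroes -- removal will not affect parity"
fi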

Link to comment

That is my point: conceptually, with XOR parity, if you can know the drive to be removed is all zeros, you can remove it without doing any further parity work and with zero "at risk" time. The tools at our disposal make this a clunky, risky, and unsupported process, but it could be the exact opposite of that relatively easily.

 

This assumes the disk is empty. If you assume the disk is not empty, then it is perfectly possible in theory to make the drive read-only and then programmatically remove its contribution from parity, without the need to spin up the whole array. This is less "nice" but still nicer than breaking the array and recalculating parity from scratch.
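(The parity math for that second case is simple: XORing the departing drive's data out of the existing parity, block by block, yields the parity of the remaining drives. Illustrative shell arithmetic, bytes standing in for blocks:

d1=0xA5; d2=0x3C; d3=0x0F             # d3 is the read-only drive leaving
parity=$(( d1 ^ d2 ^ d3 ))            # parity as it stands today
parity_new=$(( parity ^ d3 ))         # XOR the departing drive back out
echo $(( parity_new == (d1 ^ d2) ))   # prints 1: parity of what remains
)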

 

But there is a considerable reduction in "at risk" time with these approaches, as at no point do you have no parity.

 

 

Link to comment

Agree it would be very convenient.    On the other hand, it's ALREADY very simple => it just requires that you note the drives used for parity and (if you have one) cache.    Nothing else matters in the assignments (the order of the data drives doesn't matter) => so a New Config is not exactly "hard"  :)

 

Link to comment

Agree it would be very convenient.    On the other hand, it's ALREADY very simple => it just requires that you note the drives used for parity and (if you have one) cache.    Nothing else matters in the assignments (the order of the data drives doesn't matter) => so a New Config is not exactly "hard"  :)

The order of the drive assignments doesn't matter to YOU.  It does to ME, as I have shares for different computers on the same unRAID array and I like to group the drives together.  Purely cosmetic, but it makes it easier for me to know what is on the drives without having to have a directory handy.  And it is not simple if I have to find my reading glasses to read the screen capture so that I can make sure I'm reassigning the correct serials to the correct slots for the remaining drives.  A button I can read without the reading glasses, but the serials are too small.  And yes, I could increase the screen fonts, but then that reduces what is on the screen once I've gone and found my "reading" glasses for another reason.  And that just increases the inconvenience factor for a change.
Link to comment

Actually it matters to me too  :)    ... I have mine in the order they're in my UnRAID tower so it's trivial to remove a disk should that be necessary (failure; replacement; etc.).

 

I was, of course, referring to the fact that it doesn't matter to UnRAID -- i.e. to the functionality of the system.

 

[I also don't like to have to get my reading glasses and magnifying glass to read drive serial numbers  :) ]

 

Link to comment
To have a utility to write zeros to a drive you intend to remove in order to maintain parity seems very risky to me, and I wouldn't want to do that.

Would you be ok with doing that procedure on a drive and partition that mounted cleanly and had a completely empty filesystem?

 

It would be nice to be able to at least manually empty a drive and then start a procedure to purge the drive from the array just like it was never there to begin with. That way we could shrink the array without losing parity protection for the remaining drives. As it is right now, once a slot is used, it's a pain to shrink the slot count, even if you have expanded the TB capacity with much larger drives.

Link to comment

As we can see, "mdcmd set invalidslot 99" does not work like it used to.  Maybe Limetech can shed some light as to what the expected result of this command is in the current version?

 

Doesn't "Parity is already valid" option do exactly the same that command did?

 

Btw, I also like the idea of being able to remove a drive while keeping parity protection. I did it once last year for testing, by manually zeroing a drive with dd and then using the "New Config" and "Parity is already valid" options, and it worked fine. But there are surely risks doing it manually: you could mistake a drive number, or slip up while reassigning drive slots after New Config (this gets easy anyway if Tom implements a way to remove a drive, even with no 'zeroing' feature), or something else. With a proper GUI tool to do it AND, as jonathanm stated, ONLY allowing it for a completely empty filesystem (0 files/dirs), along with the right warning text, I think it could be rather safe?

Link to comment

...

To have a utility to write zeros to a drive you intend to remove in order to maintain parity seems very risky to me, and I wouldn't want to do that.

 

Yeah there is a finite risk but no more so than the code that asks the user if they want to format a new drive.

 

If the disk is part of the array and is empty, then surely it should be a relatively safe procedure. Failing that, we could make it command-line only, for the brave.

 

The main thing is you should not have to put the array into an unprotected state to remove an empty drive. For some people this is multiple days of all disks churning, which is a big risk in itself, and it's just plain ugly anyway.

 

Edit: maybe we should just talk about how to do it and make it possible again, like in 4.7, rather than debating whether it should be a native feature.

Link to comment

I really do think array contraction needs to be looked at and added to core unRAID. We have ways to expand our arrays with ease but nothing to safely contract them; an oversight in my eyes.

 

Add this to the list along with dual parity and a scheduler (so you can schedule parity checks and other tasks without the need for plugins), but those are for another topic and may turn up one day.

Link to comment

The problem I see is that users would expect a remove to happen almost immediately.  This cannot be the case if you want to maintain parity. 

 

If you are going to keep the array protected during the remove you have to first write zeroes to the drive (while it is still part of the array) which will take as long as the pre-clear required when adding a drive (that has not previously been pre-cleared).  This could easily take something like 8-10 hours for a large drive. 
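That estimate is easy to sanity-check. Assuming, say, a 5TB drive sustaining an average of ~150 MB/s of sequential writes (both figures purely illustrative):

echo $(( 5 * 1000**4 / (150 * 1000**2) / 3600 ))    # -> 9 (hours)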

 

Depending on exactly how it is implemented it might also mean that the array is off-line during the whole process (as it is when adding a disk that is not already pre-cleared), which is again something I do not see users being happy with.

Link to comment
This could easily take something like 8-10 hours for a large drive. 

Rebuilding parity after removing a drive will take the same amount of time AND there is no protection at all.

 

...might also mean that the array is off-line during the whole process

Isn't this also the case if you rebuild parity with a new config?

 

...users would expect a remove to happen almost immediately.

Tell them it won't!

Link to comment
...might also mean that the array is off-line during the whole process

Isn't this also the case if you rebuild parity with a new config?

No, you can read and write to the array while parity is building.  I don't, because I don't want to take any chances.  I don't think the array would HAVE to be offline during the clearing of the drive to remove it.  The array may have to be put into a read-only mode so that no updates happen to complicate the parity and drive updates going on while the drive is being removed.  That is why I suggested an automatic parity build after removing drives would be alright for me: a smaller change.  Actually, I would be alright with the array becoming read-only during a parity build as long as it remains fully read/write during a parity sync.
Link to comment

I would rather Tom spend his time adding email, UPS, and preclear functionality than changing something that already works fine (New Config).

 

Definitely agree.  While it's conceptually easy to remove a drive from the array without ever running "at risk" (i.e. by writing zeroes to the drive), it's not much of a risk if you (a) do a parity check and ensure you get all zeroes in the error and sync error counts, and then just (b) do a New Config.  The only "at risk" time is then the new parity sync.

 

If you want to be really "paranoid" about it, you can even completely eliminate the risk for that as well by using a new parity drive at that point -- keeping your original one -- and not using the array at all during the sync.  Then if something DID go wrong, you could re-construct the original array with another "New Config" (using the original parity drive and the drive you had removed) and do a "Trust Parity" ... and you'd be exactly where you started  :)

Link to comment

Also this feature might be possible to implement in a plugin.

 

I've never written one either; but clearly this is a conceptually very simple thing to implement => you just need a utility that writes all zeroes to the disk you want to remove.    I could easily do this in Windows; but am not a "Linux" guy and am not inclined to change that.    Of course, a good plugin that did this would also check to ensure there were no files on the disk before zeroing it -- and warn the user if that wasn't the case.

 

But IMHO it's simply not a feature that's needed.  The "New Config" already makes this a VERY simple task with minimal "at risk" time.

Link to comment

I'm sorry if this has already been said....  I have a basic understanding of how parity works, but I don't understand all the details discussed.  Anyway.... if one drive is removed, the parity for that section would simply be reversed from its current state?  So rather than rewriting that entire section, why not just have a 'flag' indicating that that section of parity should be treated in reverse?  Sorry if that makes no sense... it's kinda hard to explain what I'm thinking!

Link to comment

....  if one drive is removed.... the parity for that section would simply be reversed from its current state?  ...

It would be "reversed" if and only if the corresponding bit from the removed drive was a "1". If the corresponding bit was a "0" then it would be unchanged. Hence the discussion about zeroing the drive before removal.

 

Link to comment

I would rather Tom spend his time adding email, UPS, and preclear functionality than changing something that already works fine (New Config).

 

I also have to agree with this. While I believe the 'remove drive' feature would be welcomed by some, I'm also fairly confident that the other features listed above are needed more and would be appreciated by a much larger percentage of the unRaid user base.

Link to comment
