Parity-Swap how-to



I googled this and searched for it on here, and couldn't find a how-to.

 

I have a 3TB drive that is disabled because it failed. My parity drive is 3TB as well. I bought two 4TB drives. How do I copy the existing parity contents onto one new 4TB drive and set the configuration to accept it as the new parity drive, so I can rebuild the contents of the disabled 3TB drive onto the other new 4TB drive as a data drive?

 

Moderator edit: fix 'party' in heading, for searching purposes

Link to comment

The most recent documented case of doing the Parity-Swap-Disable process is the one I went through. Lately everyone else takes the scared way out and avoids it entirely.

 

I had slightly more steps to go through because my bad drive wasn't marked as disabled so I had to force it to disabled. This is what I posted when I went through this process -- http://lime-technology.com/forum/index.php?topic=35014.0

 

That was on v5, but the process should be the same on v6, although the UI may be different. Perhaps @jonp or @limetech can chime in about this?

Link to comment

Thank you...I think I found the proper steps (after more digging) and got it going so far...

 

 

You must replace a failed disk with a disk which is as big or bigger than the original and not bigger than the parity disk. If the replacement disk is larger than your parity disk, then the system permits a special configuration change called swap-disable.
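
The size rule above can be written as a small decision function. This is only a sketch: the function name and the return labels are made up for illustration and are not part of unRAID.

```python
def replacement_mode(failed_tb: float, parity_tb: float, new_tb: float) -> str:
    """Classify a replacement disk using the size rule quoted above.

    Hypothetical helper for illustration; sizes are in TB.
    """
    if new_tb < failed_tb:
        # The replacement must be as big as or bigger than the original.
        return "too small"
    if new_tb <= parity_tb:
        # Not bigger than parity: an ordinary rebuild onto the new disk works.
        return "standard rebuild"
    # Bigger than parity: the new disk has to become parity (swap-disable).
    return "swap-disable"


# The case from this thread: failed 3TB disk, 3TB parity, new 4TB disk.
print(replacement_mode(3, 3, 4))
```

For the 3TB/3TB/4TB case this lands in the last branch, which is why the new 4TB disk cannot simply take the failed disk's slot.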

 

For swap-disable, you use your existing parity disk to replace the failed disk, and you install your new big disk as the parity disk:

 

Stop the array.

Power down the unit.

Replace the parity hard disk with a new bigger one.

Replace the failed hard disk with your old parity disk.

Power up the unit.

Start the array.

When you start the array, the system will first copy the parity information to the new parity disk, and then reconstruct the contents of the failed disk.
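
To see why the copy has to finish before the rebuild, here is a toy model of the two phases. It assumes single parity computed as a byte-wise XOR of the data disks, uses tiny 4-byte "disks", and assumes the extra space on the larger parity disk ends up holding zeros; all names are illustrative only.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy array: three 4-byte data disks protected by one 4-byte parity disk.
disk1 = bytes([1, 2, 3, 4])
disk2 = bytes([5, 6, 7, 8])        # this is the disk that "fails"
disk3 = bytes([9, 10, 11, 12])
old_parity = xor_blocks([disk1, disk2, disk3])

# Phase 1: copy the old parity onto the new, larger parity disk.
# Assumption: the extra space is zero, since no data disk reaches that far.
new_parity = old_parity + bytes(2)

# Phase 2: the old parity disk is now free to be overwritten, so the failed
# disk is reconstructed from the new parity plus the surviving data disks.
rebuilt = xor_blocks([new_parity[:4], disk1, disk3])

assert rebuilt == disk2
print("rebuilt disk matches the failed disk:", rebuilt == disk2)
```

The order matters because the old parity disk is the rebuild target: once the rebuild starts writing to it, the only remaining copy of the parity is the one on the new disk.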

 

Link to comment

"When you start the array, the system will first copy the parity information to the new parity disk, and then reconstruct the contents of the failed disk."

 

This is an interesting scenario. Is this step automatic? Can it go sideways (assuming no more failed disks)?

 

Thanks.

Link to comment

This should all be automatic as long as there are no unforeseen events such as a power failure or another disk failing during this process.  It is also recommended that you do not try to write to the array during the process (although I do not think the system will prohibit you from doing so, as it is emulating the failed disk until the process completes).

 

As a precaution I would make sure that you keep the old 'failed' disk untouched until the recovery process finishes.  As long as the disk has not physically failed then in the worst case most of its data is probably recoverable.

Link to comment

The Swap-Disable instructions in the Wiki leave out an important step.    The drive you want to rebuild has to be marked as "missing" before this will work.

 

In addition, this is strictly for the case where you want to replace the parity drive and use the old parity drive to replace a failed drive.    You'll need to do that first; THEN you can replace one of your drives with the other 4TB drive.

 

To do this, follow these steps:  [shut down instead of just stopping if you need to make any physical drive changes]

 

(a)  Stop the array; remove the failed drive from the configuration; then Start the array again so it shows that drive as "Missing".

 

(b)  Stop the array; assign the new 4TB drive as parity and the old parity drive in place of the missing drive.

 

(c)  Now you'll be presented an option to "Copy" the parity -- click on that and wait (hours) for it to finish;  then Start the array and let it do the rebuild.

 

(d)  When that's done, you can then rebuild any drive you want with your other 4TB drive [You'll need to do the same process to force it "Missing" before assigning the 4TB drive]
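
The ordering in steps (a) to (d) can be sketched as a tiny checklist that refuses out-of-order steps. This is not how unRAID is implemented; the step names and the class are invented here just to show why skipping the "Missing" step gets you stuck.

```python
# Hypothetical step names mirroring (a), (b), (c) and the rebuild in (c).
STEPS = ["mark_missing", "assign_disks", "copy_parity", "rebuild"]


class ParitySwapChecklist:
    """Tracks which steps are done and rejects any step taken out of order."""

    def __init__(self):
        self.done = []

    def do(self, step: str) -> None:
        expected = STEPS[len(self.done)] if len(self.done) < len(STEPS) else None
        if step != expected:
            raise RuntimeError(f"cannot do {step!r} now; next step is {expected!r}")
        self.done.append(step)


checklist = ParitySwapChecklist()
try:
    checklist.do("assign_disks")   # skipping step (a)
except RuntimeError as err:
    print(err)

for step in STEPS:                 # the steps in the order given above
    checklist.do(step)
print(checklist.done)
```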

 

Link to comment
  • 3 weeks later...

This post came in handy, as I am just going through this process right now. I can't believe how many HDs have failed in the last month. What I didn't expect though was that the array would be offline while the copying is in progress. :( I guess I understand why, but still was hoping it wouldn't be the case.

 

hades

Link to comment
  • 5 months later...

I have created an updated wiki page for the procedure, titled "The Parity Swap procedure".

 

I would really appreciate review and corrections, especially from Gary if he has time.  It's wordy and has no pictures (afraid that's not my strong point), but I believe it has extra hand-holding, both for new users and for all of us who rarely run it.

 

I've called it the 'Parity Swap' procedure, not the 'Swap Disable' procedure, which it's called more often.  I hope that's not a problem, and I can change it, but I think 'Parity Swap' is clearer, easier to understand.

 

It's not well tested.  I just used it successfully on my own v6.1 system, but not with a failed drive, so there may be behavioral quirks with other versions and situations.  PLEASE let us know!

Link to comment

I never liked "Swap-Disable" as a name either ... but I agree it's also not a "Parity Swap".    I suppose "Swap-Disable" was derived from the fact that it's a procedure for swapping the parity drive for a disabled drive (while simultaneously upgrading parity).

 

Not sure what the best name for the procedure is => perhaps just say what it is instead of trying to think of a catch-phrase ... e.g. "Replacing a failed drive when you need to use a drive larger than the parity drive"

 

 

Link to comment

I'll try to keep my thoughts short...  I really really don't like 'swap-disable', and was hoping there must be *something* better, anything, for multiple reasons.  No matter how you try to explain it, it's still a stretch, and very awkward.  And I don't want the word 'disable' in there at all either, because it's not intrinsic to the process.

 

Please forgive a little history...  In the earliest unRAID days, drives needed to be replaced, and then as now, we often wanted to replace a drive with a larger one.  So very quickly there was a need for a special procedure to do that when the new drive was larger than the parity drive.  But I believe the thinking was still (as it still is in much of the RAID world) that when there are disk issues the whole drive has failed, and you either rebuild or replace the whole drive.

unRAID has changed that, and we rarely have to replace an entire failed drive.  We recover most of it, and we decide to 'fail' the drives ourselves because of the increased info we have.  We believe the drive is 'going to fail', not 'already failed' as in the past.  But back in the beginning, Tom created this procedure for the 'already failed' case, and it probably didn't occur to him then how much we would use it now with drives that haven't failed (yet).

How often do you see reports where a user had a drive issue and reported using the old 'swap disable' procedure without issue?  I can't recall any!  On the other hand, how often have you seen reports where a user tried to use the procedure but it wouldn't work?  Relatively often!  Why?  Because they were trying to use it with a drive that HAD NOT FAILED YET!  I suspect most of us have used or would use it this way, with 'unfailed' drives, but because of its heritage, we have to artificially 'fail' the drive to meet the procedure's requirements.

I hope that in the future Tom will modify that to recognize the very specific conditions: exactly 2 drive assignments are different, one of them is the old parity drive now sitting in the other slot, and a drive larger than the old parity drive is in the parity slot.  No more need to 'fail' a drive ourselves.
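
The detection rule wished for above could look something like the following sketch. It is purely hypothetical: unRAID does not expose such a function, and the slot names, disk labels and size table are made up. It also assumes both assignment maps contain the same slots and that every slot is filled.

```python
def looks_like_parity_swap(old: dict, new: dict, sizes: dict) -> bool:
    """Return True if the change from old to new assignments matches the
    'two slots changed, old parity moved, bigger disk in parity' pattern.

    old/new map slot name -> disk label; sizes maps disk label -> size in TB.
    """
    changed = [slot for slot in old if old[slot] != new[slot]]
    if len(changed) != 2 or "parity" not in changed:
        return False
    other = next(slot for slot in changed if slot != "parity")
    old_parity_moved = new[other] == old["parity"]
    bigger_parity = sizes[new["parity"]] > sizes[old["parity"]]
    return old_parity_moved and bigger_parity


sizes = {"P3": 3, "A3": 3, "B3": 3, "N4": 4}
old = {"parity": "P3", "disk1": "A3", "disk2": "B3"}
new = {"parity": "N4", "disk1": "P3", "disk2": "B3"}
print(looks_like_parity_swap(old, new, sizes))
```

Note that nothing in this check requires disk1 to be disabled or missing, which is exactly the change being asked for.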

 

So the word 'disable' is not a critical intrinsic part.  I hope Brit will continue his name progressions, but this time branch off without any 'disable's.

 

I'm dropping 'parity swap' as 2 esteemed gentlemen have vetoed it.  I think we all agree there's probably not one intuitive term we can use, but while I do agree with the pragmatism of just describing it fully, we humans need shortcuts, have to call it something catchy, with as few words as possible.  I don't know what's best, and will agree to whatever the consensus is.  I do have a couple of suggestions, that perhaps will inspire a better term yet.  One is 'parity data swap', longer but a little easier to explain.  The other idea is go in a very different direction with something like 'double upgrade'.  You may still have to explain that you can't upgrade 2 data drives at the same time, one of them has to be the parity drive ... but at that point they should get it, they should understand what it's all about!  But 'double upgrade' is probably too different to catch on.

Link to comment

Disabled / Missing is intrinsic and should be mentioned because the procedure won't work without a disk being marked as disabled or missing.

 

 

 

I agree. "Disabled" should stay in the title until the procedure no longer requires a disabled disk. Although, what is the point of this procedure if a disk is not disabled?

Link to comment

I do agree that this procedure should be possible to do without having to fail/disable the disk, and that it should be easier to do without all the workarounds. It's a freaking pain in the ass to mark a disk as disabled / missing. After I went through this procedure I even posted a list of features I considered vital, with marking a disk as disabled or doing this procedure more naturally being part of it.

 

I think my requests got lost in the forum reorgs as it was well before LT got semi-organized.

 

As for why this would be done, imagine it as "array expansion with preventative maintenance". Remember, read issues do not count as disk disable, nor do drives running hotter than normal, nor do sector reallocations or other SMART issues!

 

Scenario:

You have your slots filled with 4TB drives.

You have a disk that is nearing failure (read errors, pending sector reallocation, etc.).

You see that larger drives are cheaper and faster than your current drives (6TB or 8TB vs 4TB).

You want to replace your data drive before it completely fails.

 

That is close to the scenario I went through, with my 2TB drive having read issues when 4TB HGST drives were the sweet spot for new drives.

 

Link to comment

... Although, what is the point of this procedure if a disk is not disabled?

 

I agree.  If you don't have a disabled disk, it's a FAR better idea to do the upgrade in two steps ... (1) Replace the parity drive with a larger drive; and then (2) replace the disk you want to replace.

 

I believe the ONLY reason this "swap-disable" process was implemented in the first place was so if you had a failed disk; and wanted to take advantage of the fact that drives had evolved and you could buy a larger disk -- but didn't have a large enough parity disk -- you could (a) buy the larger disk; and (b) then use it for parity, while still being able to rebuild the failed disk on the old parity disk.    Thus you were "swapping" the parity disk for a "disabled" disk ... which I suspect is where the name came from.

 

There's NO reason to use this process with a disk that isn't disabled.    You can simply do the two steps I outlined above.

 

Link to comment

I completely disagree.

 

Why would you want to replace parity by rebuilding it off the data drives when one of your data drives is having read errors or other issues that do not trigger it to be disabled? I think it makes more sense to replace the parity drive with a larger drive when it's merely copying data from the existing parity drive and not reading it from a questionable data drive.

 

Once the new parity drive is rebuilt off the data on the old parity drive, then you can safely reconstruct the data from the new parity and other working data drives.
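
The difference between the two approaches comes down to which disks get read in each phase. A rough sketch, assuming single parity and a rebuild that reads parity plus every other data disk; the function and the phase labels are illustrative only:

```python
def disks_read(approach: str, data_disks: list, suspect: str) -> dict:
    """List which disks are read in each phase of the given approach."""
    others = [d for d in data_disks if d != suspect]
    if approach == "two-step":
        return {
            # A parity sync onto the new parity disk reads every data disk.
            "build new parity": list(data_disks),
            "rebuild suspect disk": ["new parity"] + others,
        }
    if approach == "parity-swap":
        return {
            # The copy reads only the old parity disk.
            "copy parity": ["old parity"],
            "rebuild suspect disk": ["new parity"] + others,
        }
    raise ValueError(f"unknown approach: {approach!r}")


data = ["disk1", "disk2", "disk3"]
for approach in ("two-step", "parity-swap"):
    phases = disks_read(approach, data, suspect="disk2")
    reads = sum(disks.count("disk2") for disks in phases.values())
    print(approach, "reads the suspect disk", reads, "time(s)")
```

In this model the two-step upgrade depends on one full, clean read of the questionable disk, while the parity swap never reads it at all.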

Link to comment

That's a reasonable point.    I was thinking in terms of Rob's comment about replacing a drive with a larger drive, even though it hadn't failed, but we wanted to buy a drive larger than the current parity drive to upgrade the potential drive sizes in the future.  [i.e. "... we often want to replace a drive with a larger one.  So very quickly there was a need for a special procedure to do that when the new drive was larger than the parity drive ..."]

 

But if you have a drive that you know is having issues and you want to replace it, then I guess you COULD use this procedure to replace it.    Personally, I'd take a directory listing of the drive; replace it with a new drive; and then copy the files it had contained from my backups, because if I didn't trust the drive, I'd likely not trust the files either.

 

I think the real point is you should always have a spare equal to the size of the parity drive, so this issue never occurs ... and when you USE that spare, if you replace it with a larger drive, then you should immediately upgrade the parity drive (which can then be your spare), so you never encounter the situation in the first place.    This was pretty pricey to do a few years ago, but with current drive prices it's not an unreasonable thing to do.

 

Link to comment
  • 1 year later...

hi guys

Sorry for the confusion, I'm a newbie and just need to follow the right process. I'm seeing a lot of discussion on how to do the parity swap, and as I understand it, you are explaining how to replace a data drive using the current parity drive because the newly bought drive is bigger than the current parity.

 

However, my actual parity drive is showing read errors and I want to replace it with a new one. Note that the new drive has the same specs as my current, failing parity drive.

Do I still need to follow the "parity swap procedure" even though I'm not planning on keeping the current "defective" parity drive in my array?

Or is there another process I'm missing to just replace a parity drive with another drive due to failure?

 

thank you for your help.

 

 

Link to comment
3 minutes ago, caballopazo said:

do i still need to follow the "parity swap procedure" even though i'm not planning on keeping the current "defective" parity drive in my array?

 

No, you just do a standard disk replacement: power down, replace the parity disk, power back on, assign the new parity disk, and start the array to begin the parity sync.

Link to comment
