Failure of a disk while rebuilding onto an upgraded (larger) disk


This is a hypothetical scenario, and I'm sorry if I should have been able to find this already in the forums.

I spent a while searching both here and in the FAQ, but couldn't find the answer for this. Hopefully, you guys can enlighten me a bit.

The scenario is as follows:

 

With a 4-disk array (3 data, 1 parity), I'm replacing the smallest drive (let's call it disk 1) with a larger one (disk 5).

During the rebuild, one of the other data disks (disk 2) fails, and the rebuild stops.

 

There's nothing wrong with disk 1, it has just been removed from the array.

Is there any way of saving the data on disk 2 from this position?

For example, by re-inserting disk 1, swapping out disk 2 for disk 5, and then restarting the rebuild process?

 

How would one go about this? Is it even doable?

I can't seem to wrap my head around this.

Link to comment

The easiest approach, by far, is to save a copy of the "config" directory PRIOR to swapping in the new disk. Do this with the array in a STOPPED state.

 

So...

Perform a parity check to ensure the entire array is healthy. It can be a NOCORRECT check.

Stop the array.

Save a copy of the /boot/config directory (just in case).

Replace the disk being upgraded (assume disk1 for this example).

Start the array; the old contents will be reconstructed on the new disk.

 

If it completes, great...

 

If it fails because a different disk crashes, stop the array once more.

Return the old disk1 to its original slot in the array, removing the newer disk you were attempting to install.

Restore the old config directory contents (you can just copy them back).

Replace the crashed disk2 (you can even use the disk you had intended as the replacement for disk1).

When you then start the array, it will appear as if you are upgrading/replacing disk2, and you can proceed.

The key is to keep a copy of the config folder while in a "STOPPED" state so you can revert to it.
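
The save-and-revert step above can be sketched in Python. This is only an illustration, not an unRAID tool: the function names and the backup destination are hypothetical, and only the /boot/config path comes from the steps above. It assumes Python 3.8 or newer and that the array is stopped whenever either function is run.

```python
import shutil
import time
from pathlib import Path


def backup_config(config_dir, backup_root):
    """Copy config_dir into a new timestamped folder under backup_root."""
    src = Path(config_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"config directory not found: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"config-{stamp}"
    # copytree refuses to write into an existing folder, so an older
    # backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest


def restore_config(backup_dir, config_dir):
    """Copy the saved files back, overwriting files of the same name."""
    shutil.copytree(backup_dir, config_dir, dirs_exist_ok=True)


# Hypothetical usage (the destination folder is an assumption):
#   saved = backup_config("/boot/config", "/boot/config-backups")
#   ... the rebuild fails, the array is stopped, the old disk1 is put back ...
#   restore_config(saved, "/boot/config")
```

Note that restore_config only overwrites files that exist in the backup; it does not delete files created after the backup was taken.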

 

Joe L.

Link to comment

Wow! Not only doable, but quite simple, really.

And explained much more concisely than I managed (I'm an ESL speaker, so I'll use that as an excuse).

Thanks ever so much :)

You are welcome. 

 

Oh yes, your command of "English" would put many of today's school kids' grammar to shame. It is very good.

I would never have guessed your native language was not English.

I have no second language skills at all (unless you count computer programming languages, in which case I have no idea how many languages I've learned over the years... many).

 

Joe L.

Link to comment

Just note that what was suggested only works if there are no writes to any of the disks after the swap, i.e. no writes are allowed during the rebuild process.

 

If you start with a pre-cleared disk and the rebuild has progressed past the size of the old disk, then the array is good at that point. The rebuild will write zeros to the new disk once past the size of the old disk, but if the new disk is already zeroed then the array is already protected. So, you could theoretically get a failure that still leaves the array protected. However, I don't believe unRAID would tag the disk as good before the rebuild is completed.
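
A small sketch of why a zeroed tail leaves the array protected, assuming plain single-parity XOR (a simplified model, not unRAID's actual code; the function name and byte values are made up for illustration). A disk that is shorter than parity contributes zeros beyond its end, so a larger replacement whose extra space is already zero yields exactly the same parity.

```python
from functools import reduce


def xor_parity(disks, length):
    """Byte-wise XOR across disks; a disk shorter than `length` counts as zeros."""
    padded = [d + bytes(length - len(d)) for d in disks]
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*padded))


old_disk1 = bytes([0x11, 0x22])              # the small disk being upgraded
disk2 = bytes([0xA0, 0xB0, 0xC0, 0xD0])      # another, larger data disk
parity_before = xor_parity([old_disk1, disk2], 4)

# Larger replacement: rebuilt contents followed by a pre-cleared (zero) tail.
new_disk1 = old_disk1 + bytes(2)
parity_after = xor_parity([new_disk1, disk2], 4)

print(parity_before == parity_after)  # prints True
```

If the tail of the new disk held anything other than zeros, the two parity values would differ until the rebuild had overwritten that region.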

 

Link to comment

Just note that what was suggested only works if there are no writes to any of the disks after the swap, i.e. no writes are allowed during the rebuild process.

Not so sure about that statement... not as an absolute blanket statement.  I think you can write to ANY drive except the drive being re-constructed.

 

If all works as expected, when writing to a disk, only that disk AND parity need be updated.  The rest are left as is.

 

So, assuming there are writes, they are made to parity and to the data disk being written (not the disk being reconstructed).

The disk being re-constructed is not changed, so unless I'm not thinking clearly, you could stop a failed reconstruction, replace the old config, swap in a different disk to be rebuilt, and all would still work.

 

The only time parity would get out of sync is if you write to the disk being reconstructed and then swap in the old copy. (It would not have the "writes" made to it that were made to parity, since it was out of the array.)

 

Therefore, I think you could write to the array, but NOT to the disk being re-constructed, and then still be able to swap back in the old disk and config files to recover from a different disk failure.
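
The reasoning above can be checked with a toy model. This sketch assumes single-parity XOR and tiny two-byte "disks"; the function names and data values are invented for illustration and it is not how unRAID is implemented. It writes to one disk while disk1 is being rebuilt, then tries to recover a failed disk2 from parity, the shelved old disk1 and disk3.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


def write_block(disks, parity, name, new_data):
    """Write to one data disk; parity is updated from old data, new data and old parity."""
    new_parity = xor_blocks(parity, disks[name], new_data)
    disks[name] = new_data
    return new_parity


def simulate(write_target):
    """Return True if disk2 can still be rebuilt using the shelved old disk1."""
    disks = {"disk1": b"\x01\x02", "disk2": b"\x10\x20", "disk3": b"\x0a\x0b"}
    parity = xor_blocks(*disks.values())
    shelved_disk1 = disks["disk1"]    # old disk, pulled out and left untouched
    original_disk2 = disks["disk2"]   # what a correct rebuild should produce
    # A write arrives while disk1 is being reconstructed.
    parity = write_block(disks, parity, write_target, b"\x55\x66")
    # disk2 then fails; rebuild it from parity, the old disk1 and disk3.
    rebuilt = xor_blocks(parity, shelved_disk1, disks["disk3"])
    return rebuilt == original_disk2


print(simulate("disk3"))  # prints True: the write went to another data disk
print(simulate("disk1"))  # prints False: the write went to the disk being rebuilt
```

In this model the recovery only breaks when the write lands on the disk being reconstructed, because the shelved copy no longer matches what parity expects.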

 

Joe L.

Link to comment
