
How to re-add a drive that has been disabled without having to rebuild it?


je82


In my test lab I have:

2 parity drives

2 data drives that are active in the array

1 drive of the same make/model as the 2 active data drives, but not assigned to the array

1 cache device

 

I wanted to remove the one drive that is the same make/model as the other 2 that are active in the array. I didn't know which disk was which, so I accidentally pulled one of the drives that was active in the array (while the system was on and the array was mounted).

 

The drive became emulated, which is expected. I shut down the system and pulled out the correct device that I wanted to remove.

 

I powered on the system. Now the drive that was pulled earlier from the array is emulated and disabled, even though it's a working drive.

 

I unassign the device, start the array, stop the array, and reassign the device. Now it has to rebuild? Why do I have to rebuild when the data is there already?

 

TLDR: How do I enable a disabled device without having to rebuild all the data on it?


As far as Unraid is concerned, the drive was disabled due to a write failure, so it does not know that the data on it is good, which is why it is trying to do a rebuild.

 

If you are SURE that no writes were made to the array then you can do the following:

  • Use Tools >> New Config to reset the array. Use the option to retain current assignments.
  • Go back to the Main tab, select the ‘Parity is Good’ option and start the array. Everything should come back up OK.

It might be a good idea at this point to start a correcting parity check to make sure that parity really is good. You may get a few corrections almost immediately if a ‘housekeeping’ type write was missed because the drive was disabled. However, if after a few minutes no further corrections appear, you can probably safely stop the parity check.
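If you prefer doing this from a console instead of the web GUI, something along these lines should also work. This is only a rough sketch from memory, so verify the exact mdcmd syntax against your Unraid release before relying on it:

```bash
# Start a parity check from the console (on the releases I have used,
# a plain "check" writes corrections; "check NOCORRECT" is the read-only variant)
mdcmd check

# Watch progress -- Unraid exposes its own status fields here
cat /proc/mdstat

# Stop the check early once no further corrections are appearing
mdcmd nocheck
```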

2 hours ago, itimpi said:

As far as Unraid is concerned, the drive was disabled due to a write failure, so it does not know that the data on it is good [...]

Thank you! I will try this after the rebuild is complete and simulate a drop again. I am only doing this for testing; I want to know exactly how this stuff works before I migrate my data over, so I have an idea what to do when shit hits the fan.

 

You're saying that running the parity correction check and only letting it run for a couple of minutes will safely cover whatever was written very recently after the drive dropped? That is good to know. Thanks for the info!

 

Quote

You're saying that running the parity correction check and only letting it run for a couple of minutes will safely cover whatever was written very recently after the drive dropped? That is good to know.

Not quite. I am saying that anything written to the start of the drive, such as the mounted status, would be corrected quickly. Data files, which are likely to be much further into the drive, would not be covered.

3 minutes ago, je82 said:

You're saying that running the parity correction check and only letting it run for a couple of minutes will safely cover whatever was written very recently after the drive dropped?

No, that's not it at all.

 

When a drive can no longer be accessed by Unraid for whatever reason, all activity to that drive slot is emulated. Since there are writes to the file system metadata when the drive slot is unmounted, parity is guaranteed to be out of sync with the actual physical drive.

 

So, to answer the title of the thread: there is no way to add back a drive that has been disabled without rebuilding it, unless you don't care about any writes that happened after the disabling event. You will still need to correct parity for all changed bits, which may be anywhere on the drive, so a full correcting check is needed. That will take roughly the same time as a rebuild anyway.

 

Parity is calculated across the entire capacity of the drive from start to finish, without regard to content. If there have been NO data writes, then the only changes will be to areas close to the beginning of the drive. If something had been written to a portion of the drive beyond the point where you cancelled the parity correction, then that area of parity would still be invalid and would cause corruption if a different drive failed before you got parity back in sync.

 

Normally when the array is operating you would not know for a few minutes that the drive had been dropped, and even after it's dropped the array will allow reads and writes to the slot without error, so you would pretty much be guaranteed to lose something if you re-added the drive without rebuilding. Unraid doesn't drop a drive unless a write fails, so you are going to lose that specific write, and any subsequent ones as well.

4 minutes ago, jonathanm said:

No, that's not it at all. [...]

Thanks for the info. I guess the follow-up question would be: what happens if a whole controller card starts acting up and instantly drops 8 drives? I cannot rebuild because 8 drives were lost/disabled. What do I do in this scenario to save the data?

 

I guess what I am asking is: how do I go about mounting an encrypted Unraid XFS disk directly in Linux, in order to access the data as a "JBOD" drive instead of through the array, in case of an emergency?
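I'm guessing it would be something roughly like this from any Linux machine, since the encrypted array disks appear to just be LUKS containers with XFS inside (the device name below is only a placeholder, and I'd like to confirm this before I have to rely on it in an emergency):

```bash
# Placeholder device -- the real data partition would be partition 1 of whichever disk it is
DISK=/dev/sdX1

# Unlock the LUKS container using the array's encryption passphrase
cryptsetup open "$DISK" recovered_disk

# Mount the XFS filesystem read-only so nothing on it gets modified
mkdir -p /mnt/recovered
mount -o ro -t xfs /dev/mapper/recovered_disk /mnt/recovered

# ...copy the data off somewhere safe, then clean up
umount /mnt/recovered
cryptsetup close recovered_disk
```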

 

Thanks.

Just now, je82 said:

I guess the follow-up question would be: what happens if a whole controller card starts acting up and instantly drops 8 drives? I cannot rebuild because 8 drives were lost.

Unraid will only disable as many drives as you have parity drives, so after you correct the hardware issue and Unraid can see all the drives again, it will allow you to rebuild the drives that actually got disabled (the first ones to fail). There may be some corruption, but likely not much, as the dropped drives can't have been written to.

You can't start the array with too many failed drives, and I think the array goes offline if it exceeds fault tolerance. You are in an ideal place to test that scenario: set up a test array with only 1 parity and 3 data drives, fail 2 data drives and see what happens. If you reattach the failed drives and any slots show unmountable, try running the file system checks in maintenance mode.
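For the unmountable case, the file system check can be run from a console while the array is started in maintenance mode. Rough sketch only: the md device naming varies between Unraid releases (e.g. /dev/md1 vs /dev/md1p1), and an encrypted disk would use its /dev/mapper device instead:

```bash
# Dry run first: report file system problems on disk1 without changing anything
xfs_repair -n /dev/md1

# If the report looks sane, run the actual repair against the same device
xfs_repair /dev/md1
```

Running the repair against the md device rather than the raw sdX partition keeps parity in sync with whatever the repair changes.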
