fastest "resilvering" of array ...



Hi there,

 

I added a drive to my Unraid machine while it was running, and apparently I somehow jiggled one of the main array drives (a 16TB drive). Unraid detected read errors for a few seconds and dropped the drive. The array has an 18TB parity drive and some more 14TB drives in it.

I'm quite certain that the drive in question is totally fine.

Is there a way to "resilver" it without stopping the array, removing the device, preclearing it and putting it in again to rebuild?

 

Thanks,

 

M

Link to comment

OK, so new config would just assume all disks are OK but I'd have to resync/check parity to make sure everything is in order? But this would mean the array is down for that time, right?

Whereas during a rebuild of the disk I can actually keep using it?

 

Thx,

 

M

Edited by MatzeHali
Link to comment

You can still use the array whether you are rebuilding parity or rebuilding a data disk.

 

Technically, you should rebuild the data disk since that is the disk that is out of sync. If you rebuild parity instead, any writes that might have happened to the data disk when it was disabled are lost.
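To make the difference concrete, here is a minimal sketch. It assumes a single parity disk modeled as the bytewise XOR of all data disks, with one tiny block per disk. The disk contents and names such as `xor_parity` are made up for illustration and are not Unraid code.

```python
from functools import reduce
from operator import xor

def xor_parity(blocks):
    """Single parity modeled as the bytewise XOR of all given blocks."""
    return bytes(reduce(xor, col) for col in zip(*blocks))

# Hypothetical three-disk array, one 4-byte block per disk.
disks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(disks)

# Disk 1 gets disabled. A later write of b"NEW!" to it is emulated:
# only parity changes, the physical disk keeps its old contents.
stale_on_disk = disks[1]
others = [disks[0], disks[2]]
parity = xor_parity(others + [b"NEW!"])

# Option A: rebuild the data disk from parity plus the other disks.
rebuilt_disk = xor_parity(others + [parity])

# Option B: trust the stale disk and rebuild parity from the disks as they are.
rebuilt_parity = xor_parity(others + [stale_on_disk])
recovered_after_b = xor_parity(others + [rebuilt_parity])

print(rebuilt_disk)        # the emulated write survives
print(recovered_after_b)   # the emulated write is gone
```

In this toy model, option A brings back the data written while the disk was disabled, while option B silently keeps the old contents.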

 

It isn't failed reads that cause a disk to become disabled. Unraid disables a disk when a write to it fails. When a read fails, Unraid may try to rewrite the data by getting it from the parity calculation, which means reading all the other disks. If that write fails, the disk is disabled. So a failed read can cause a failed write that disables the disk.

 

But in the general case where a write fails, Unraid disables the disk. After a disk is disabled, it isn't used again until rebuilt. All reads of the disk are emulated from the parity calculation by reading all other disks. All writes to the disk, including that initial failed write, are emulated by updating parity. So that initial failed write, and any subsequent writes, can be recovered by rebuilding the disk.
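The emulation described above can be sketched as a toy model. It assumes single parity computed as a bytewise XOR and one small block per disk. `TinyArray` and its methods are hypothetical names for illustration, not anything Unraid exposes.

```python
from functools import reduce
from operator import xor

class TinyArray:
    """Toy model: each disk holds one block; parity is the XOR of all blocks."""

    def __init__(self, blocks):
        self.disks = [bytearray(b) for b in blocks]
        self.parity = self._xor(self.disks)
        self.disabled = None  # index of the disabled disk, if any

    @staticmethod
    def _xor(blocks):
        return bytearray(reduce(xor, col) for col in zip(*blocks))

    def _others(self, i):
        return [d for j, d in enumerate(self.disks) if j != i]

    def read(self, i):
        if i == self.disabled:
            # Emulated read: parity XOR every other disk.
            return bytes(self._xor(self._others(i) + [self.parity]))
        return bytes(self.disks[i])

    def write(self, i, data):
        if i != self.disabled:
            self.disks[i] = bytearray(data)
        # Parity always reflects the new data; for a disabled disk
        # this parity update is the whole (emulated) write.
        self.parity = self._xor(self._others(i) + [bytearray(data)])

    def rebuild(self):
        # Copy the emulated contents back onto the physical disk.
        i = self.disabled
        self.disks[i] = bytearray(self.read(i))
        self.disabled = None

arr = TinyArray([b"AAAA", b"BBBB", b"CCCC"])
arr.disabled = 1              # pretend a write to disk 1 just failed
arr.write(1, b"NEW!")         # emulated write, physical disk untouched
emulated = arr.read(1)        # emulated read sees the new data
physical_before = bytes(arr.disks[1])
arr.rebuild()
physical_after = bytes(arr.disks[1])
```

The physical disk only catches up with the emulated contents once `rebuild()` runs, which is why the disabled disk is not used again until it has been rebuilt.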

 

And it is even possible that a failed write could cause filesystem corruption unless the disk is rebuilt. Some of the writes to a disk are filesystem metadata, which is what allows the bytes of your data to be organized into files and folders. If that metadata isn't updated correctly, it could cause corruption.

 

 

Link to comment
1 hour ago, MatzeHali said:

without stopping the array, removing the device, preclearing it and putting it in again to rebuild?

Absolutely no point in clearing a disk before rebuilding it, since the rebuild will completely overwrite the disk regardless of what is on it.

 

If you were going to replace the disk with a new disk, preclear might be used to test the new disk, but that has nothing to do with rebuilding to it.

 

1 hour ago, MatzeHali said:

I'm quite certain that the drive in question is totally fine.

If you want us to take a look to see if there is anything wrong with the disk, or anything else that might cause problems for rebuilding it, post diagnostics.

 

Do any of your disks have SMART warnings on the Dashboard page? Is the emulated disk mountable?

 

No need to even remove the disk if you aren't replacing it.

 

https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself

 

Link to comment
2 hours ago, MatzeHali said:

 

I added a drive to my Unraid machine while it was running

 

2 hours ago, MatzeHali said:

I somehow jiggled one of the main array drives

As you found out, hot plugging just isn't worth the risk. You have to stop the array to make disk changes anyway, so you may as well power down to be safe.

 

Like trurl said, you need to power down and secure all connections before bringing the array back up and attempting the rebuild. If you have a read error on any of the other array disks while rebuilding, the rebuild will be corrupt.
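A rough illustration of why a bad read elsewhere ruins the rebuild, assuming single parity modeled as a bytewise XOR. The read error is simulated here as one wrong byte coming back from another disk; the disk contents and the `xor_blocks` helper are made up for the example.

```python
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(xor, col) for col in zip(*blocks))

# Hypothetical array: three data disks plus single parity.
disks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(disks)

# Rebuilding disk 1 needs a correct read from every other disk and parity.
good_rebuild = xor_blocks([disks[0], disks[2], parity])

# Simulated problem: a loose connection makes disk 2 return one wrong byte.
bad_read = b"CCXC"
bad_rebuild = xor_blocks([disks[0], bad_read, parity])

# Positions where the rebuilt disk differs from the original contents.
damaged = [i for i, (a, b) in enumerate(zip(bad_rebuild, disks[1])) if a != b]
print(good_rebuild == disks[1], damaged)
```

Every wrong byte read from another disk ends up as a wrong byte at the same position on the rebuilt disk, so secure connections matter more during a rebuild than at any other time.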

Link to comment

I probably would not hot plug for changes to the main array, but there's other stuff I do on that box, so I hot plug drives in for other stuff all the time. Can't just shut down a server which is in use because I need to do something else.

Just that one time I somehow must have jiggled a drive of the main array.

Rebuild worked fine, also just added another drive to have double parity. Better safe than sorry. ;)

So all went well.

Link to comment
