
[solved] questions about array shrink procedure


CaptainTivo


I have a v6 system with a bunch of smaller drives (1 and 2 TB) that I would like to retire.  I plan to add a larger disk (8 TB), copy the data from the small ones to it, and remove them.

Here is my plan and some questions.  I would love to have any comments on whether this is the best way to do this, too.  I would think this is a not-uncommon situation, but I have yet to find a built-in way to do it.

For the purposes of this post, I have a 1 TB drive (disk 6), let's call it the 'source' disk, that I plan to copy to a 4 TB drive (disk 10), call it the 'target' disk. I would then remove the 'source' disk.

 

1. With the array started: copy all data from disk 6 to disk 10, using the cp -rp command in a screen window.  I understand that doing file copies/moves on the /mnt/disk* mounts bypasses the automatic "distribution" done by the shares.

Using this command:  cp -rp /mnt/disk6/* /mnt/disk10.  This will make a copy of all files and folders on the source disk to the target disk, preserving all Linux permissions, right?
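A minimal sketch of that copy step, assuming GNU cp as found on a typical Linux system. `-p` keeps mode, ownership and timestamps, but the `/mnt/disk6/*` glob does not match top-level names that start with a dot, so hidden files at the root of the disk would be left behind. Copying `src/.` in archive mode (`-a`) picks those up as well. The function name `copy_disk_contents` is made up for illustration.

```shell
# copy_disk_contents SRC DST
# Copy everything under SRC (including dotfiles) into DST,
# preserving permissions, ownership, timestamps and symlinks.
copy_disk_contents() {
    src=$1
    dst=$2
    # "SRC/." means "the contents of SRC", so nothing is skipped by globbing.
    cp -a "$src"/. "$dst"/
}

# Example for the disks in this post (not executed here):
#   copy_disk_contents /mnt/disk6 /mnt/disk10
```

Working on the /mnt/disk* paths keeps the copy disk-to-disk, as described in step 1.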

 

2. Delete all the data (now duplicated) from disk6.  (I do a copy then delete instead of a move as a precaution.)

rm -rf /mnt/disk6/*
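One possible way to make the "copy, then delete" precaution explicit is sketched below, assuming a POSIX shell with `find` and `cmp` available. It checks that every file on the source has an identical counterpart on the target before anything is removed, and it empties the source including top-level dotfiles, which `rm -rf /mnt/disk6/*` would leave in place. `verify_copy` and `empty_disk` are hypothetical helper names, and the target path must be absolute.

```shell
# verify_copy SRC DST
# Succeeds only if every regular file under SRC exists under DST
# with identical content. Extra files on DST are ignored.
verify_copy() {
    src=$1
    dst=$2
    [ -d "$src" ] || return 1
    bad=$(cd "$src" && find . -type f -exec sh -c '
        for f in "$@"; do
            cmp -s "$f" "$0/$f" || printf "%s\n" "$f"
        done' "$dst" {} +)
    if [ -n "$bad" ]; then
        printf 'missing or different on target:\n%s\n' "$bad" >&2
        return 1
    fi
}

# empty_disk DIR
# Remove everything inside DIR (dotfiles too) but keep DIR itself.
empty_disk() {
    find "$1" -mindepth 1 -delete
}

# Example for the disks in this post (not executed here):
#   verify_copy /mnt/disk6 /mnt/disk10 && empty_disk /mnt/disk6
```

Chaining the two with `&&` means the delete only runs when the comparison found no differences.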

 

3. I plan to use this method to write zeroes to disk6, preserving parity, and then remove the disk:

https://lime-technology.com/wiki/Shrink_array, using the "Clear Drive Then Remove Drive" Method

 

I have some questions about this method, which I will interlace with the quoted procedure below.  I also will turn on "reconstruct write" for the duration.

 

Quote
  1. Make sure that the drive you are removing has been removed from any inclusions or exclusions for all shares, including in the global share settings.

Already, this seems wrong.  If I turn off inclusion and exclusion on the shares page, then *by default* this disk is included in the share.  For instance, I have a top-level share called "Videos"; it includes all disks except that I explicitly exclude disk4, which is used as a backup for other computers in the house.  In other words, disk6, the disk I am removing, *is* included in the Videos share, even though it is neither explicitly *included* nor *excluded* using the Share settings drop-down menu.  I would have thought that the source disk would need to be excluded from the share to prevent any writes from allocating files onto it.  After all, the array is running and in use during this procedure, right?

 

Here is a screenshot of my Shares tab:

[Screenshot of the Shares tab: shares.thumb.jpg]

 

Quote
  2. Make sure you have a copy of your array assignments, especially the parity drive. You may need this list if the "Retain current configuration" option doesn't work correctly.
  3. It is highly recommended to turn on reconstruct write, as the write method (sometimes called 'Turbo write'). With it on, the script can run 2 to 3 times as fast, saving hours!

    In Settings -> Disk Settings, change Tunable (md_write_method) to reconstruct write.

  4. Make sure ALL data has been copied off the drive; drive MUST be completely empty for the clearing script to work.
  5. Double check that there are no files or folders left on the drive.

    Note: one quick way to clean a drive is reformat it! (once you're sure nothing of importance is left of course!)

  6. Create a single folder on the drive with the name clear-me - exactly 7 lowercase letters and one hyphen
  7. Run the clear an array drive script from the User Scripts plugin (or run it standalone, at a command prompt).
    • If you prepared the drive correctly, it will completely and safely zero out the drive. If you didn't prepare the drive correctly, the script will refuse to run, in order to avoid any chance of data loss.
    • If the script refuses to run, indicating it did not find a marked and empty drive, then very likely there are still files on your drive. Check for hidden files. ALL files must be removed!
    • Clearing takes a loooong time! Progress info will be displayed, in 6.2 or later. Prior to 6.2, nothing will show until it finishes.
    • If running in User Scripts, the browser tab will hang for the entire clearing process.
    • While the script is running, the Main screen may show invalid numbers for the drive, ignore them. Important! Do not try to access the drive, at all!
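Steps 4 to 6 of the quoted procedure amount to "the disk holds nothing except an empty folder named clear-me". The actual clearing script is not shown in this thread, so the check below is only an approximation of that precondition, with a made-up function name `is_ready_to_clear`. It can be handy for spotting leftover hidden files before the script refuses to run.

```shell
# is_ready_to_clear DISK
# Succeeds only if DISK contains exactly one entry: an empty
# directory called "clear-me". Hidden files count as leftovers.
is_ready_to_clear() {
    disk=$1
    [ -d "$disk/clear-me" ] || return 1
    # List anything on the disk other than the marker folder itself.
    extra=$(cd "$disk" && find . -mindepth 1 ! -path ./clear-me)
    [ -z "$extra" ]
}

# Example for the disk in this post (not executed here):
#   mkdir /mnt/disk6/clear-me
#   is_ready_to_clear /mnt/disk6 && echo "disk6 looks ready"
```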

"Do not try to access the drive, at all!"  I thought the whole point of this method was to allow use of the array while this process is going on.  The only way to prevent access to the server is to turn off shares, otherwise you risk some local process or networked computer doing a write to it.  Since you have not removed the source drive from the shares, and the array is started,  it is vulnerable, in fact targeted by the shares process to allocate new writes there.  Seems to me that the source disk should be excluded from the shares and that would fix the problem.  Comments?

 

Quote
  1. When the clearing is complete, stop the array
  2. Go to Tools then New Config
  3. Click on the Retain current configuration box (says None at first), click on the box for All, then click on close
  4. Click on the box for Yes I want to do this, then click Apply then Done
  5. Return to the Main page, and check all assignments. If any are missing, correct them. Unassign the drive(s) you are removing. Double check all of the assignments, especially the parity drive(s)!
  6. Click the check box for Parity is already valid, make sure it is checked!
  7. Start the array! Click the Start button then the Proceed button (on the warning popup that will pop up)
  8. Parity should still be valid, however it's highly recommended to do a Parity Check

 

This all seems right to me.  The zero write to the mounted disk will cause the parity to be updated and you can then safely check "Parity is already valid" after you un-assign and remove the disk.
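A toy illustration of why that works, assuming single parity computed as a bytewise XOR across the data disks (the thread itself does not spell out the parity scheme). A disk that holds only zeroes contributes nothing to the XOR, so dropping it leaves the parity byte unchanged. The byte values are arbitrary.

```shell
# parity BYTE...
# XOR all arguments together, the way one parity byte would be
# computed from the bytes at the same offset on each data disk.
parity() {
    p=0
    for byte in "$@"; do
        p=$(( p ^ byte ))
    done
    echo "$p"
}

with_zeroed_disk=$(parity 165 60 0)   # third disk has been zeroed
without_disk=$(parity 165 60)         # third disk removed
echo "$with_zeroed_disk $without_disk"
```

Both values come out the same, which is the situation "Parity is already valid" relies on.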

 

Thanks for your help.

 

BTW, I had a devil of a time finding this procedure.  I first found the Lime-tech wiki: https://lime-technology.com/wiki/FAQ_remove_drive

but that page is for v5 and below.  I tried the links for v6 mentioned at the top of this page and they *download* a short file that contains this:

"The MediaWiki FAQ can be found at:
https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:FAQ"

WTF???

 

Finally, I found this page: https://lime-technology.com/forums/topic/46802-faq-for-unraid-v6/?tab=comments#comment-496153

which helpfully notes that the TOC is broken.  This was posted in April *2016*.  Two years ago, and Lime-tech has not fixed the FAQ link on their own wiki?????

 

 

 

Link to comment

OK. Read your post again. The example you gave was all about adding one disk and removing another. If instead you want to remove multiple disks, then I would just add the larger disk, copy all the to be removed smaller disks to the larger disk, then New Config without the smaller disks and let parity rebuild. Writing zeros to each of the smaller disks is going to be a lot more time and effort. Rebuilding parity is the simple way to remove disks. The additional time and effort involved doing it the other way may seem safer, but if there are any complications it could actually have risks of its own.

 

As always, you should make sure you have good backups of any important and irreplaceable files. This is true even if you aren't making any changes to your array.

Link to comment
22 minutes ago, CaptainTivo said:

Already, this seems wrong.  If I turn off inclusion and exclusion on the shares page, then *by default* this disk is included in the share.

 

Note that there is just a tiny amount of time from you removing the drive from inclusions/exclusions until you clear the drive and remove any file systems on it. So it's only during this time you don't want to add new files that might be allocated to this drive. The reason for clearing off inclusion/exclusion is just so the disk will not remain mentioned in share configuration files after you remove the disk.

 

24 minutes ago, CaptainTivo said:

The only way to prevent access to the server is to turn off shares, otherwise you risk some local process or networked computer doing a write to it.

 

Note that the text says to avoid accesses to the drive - not accesses to the server. After you have started the clear process, you really do not have any need to access it. And while the clear script runs, the drive no longer has any valid file system on it that might tempt unRAID to copy new files to it.

Link to comment
2 hours ago, trurl said:

add the larger disk, copy all the to be removed smaller disks to the larger disk, then New Config without the smaller disks and let parity rebuild.

This. Writing zeroes to maintain parity is false economy, when you could just as easily keep the removed source drive intact as a backup for the files you copied.

 

As long as all your drives have good SMART reports and your pre-flight parity check comes up clean, there is zero reason to fight tooth and nail to keep parity intact while removing multiple drives. Much better to get the large drive tested and included in the array, copy the data, then break parity to remove ALL the small drives at once and do a single parity rebuild operation with the final complement of drives.

 

Also the global exclude function overrides the per share inclusion / exclusion, which makes it "safe" to violate the typical rule that keeps you from losing data copying from a disk to a user share. So, if you globally exclude your disk 6 source drive, it is in this one case permissible to copy from /mnt/disk6/*.* to /mnt/user/, because disk6 is globally excluded from participating in the user shares. Personally I would prefer to copy from disk to disk anyway so I control the destination disk specifically, but that is the origin of the global exclude setting.

Link to comment
On 7/25/2018 at 4:32 PM, jonathanm said:

This. Writing zeroes to maintain parity is false economy, when you could just as easily keep the removed source drive intact as a backup for the files you copied.

 

As long as all your drives have good SMART reports and your pre-flight parity check comes up clean, there is zero reason to fight tooth and nail to keep parity intact while removing multiple drives. Much better to get the large drive tested and included in the array, copy the data, then break parity to remove ALL the small drives at once and do a single parity rebuild operation with the final complement of drives.

 

Also the global exclude function overrides the per share inclusion / exclusion, which makes it "safe" to violate the typical rule that keeps you from losing data copying from a disk to a user share. So, if you globally exclude your disk 6 source drive, it is in this one case permissible to copy from /mnt/disk6/*.* to /mnt/user/, because disk6 is globally excluded from participating in the user shares. Personally I would prefer to copy from disk to disk anyway so I control the destination disk specifically, but that is the origin of the global exclude setting.

 

Fair enough.  This seems like the way to go:

1. copy all data from each disk which is to be removed: "cp -rp /mnt/disk_to_be_removed/* /mnt/larger_disk_which_will_remain"

2. stop array

3. New Config, remove the "disks to be removed" from the config

4. start array (rebuild parity).
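Before step 1 it may be worth confirming that the data on all the disks being retired actually fits on the disk that stays. The sketch below assumes standard `du`, `df` and `awk`; the helper names `used_kb`, `avail_kb` and `fits_on_target` are invented for this example.

```shell
# Kilobytes used by a directory tree.
used_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Kilobytes still available on the filesystem that holds a path.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 {print $4}'
}

# fits_on_target TARGET SOURCE...
# Succeeds if the combined size of all SOURCE trees is no larger
# than the free space on TARGET.
fits_on_target() {
    target=$1
    shift
    need=0
    for src in "$@"; do
        need=$(( need + $(used_kb "$src") ))
    done
    have=$(avail_kb "$target")
    echo "need ${need} KB, have ${have} KB"
    [ "$need" -le "$have" ]
}

# Example with hypothetical disk numbers (not executed here):
#   fits_on_target /mnt/disk10 /mnt/disk6 /mnt/disk7
```

The comparison is approximate, since filesystem overhead on the target is not accounted for, so leaving some headroom is sensible.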

 

I still think following the exact steps in the  "Clear Drive Then Remove Drive" Method procedure will not work properly.

 

 

Link to comment
On 7/25/2018 at 7:29 PM, CaptainTivo said:

"Do not try to access the drive, at all!"  I thought the whole point of this method was to allow use of the array while this process is going on.  The only way to prevent access to the server is to turn off shares, otherwise you risk some local process or networked computer doing a write to it.  Since you have not removed the source drive from the shares, and the array is started,  it is vulnerable, in fact targeted by the shares process to allocate new writes there.  Seems to me that the source disk should be excluded from the shares and that would fix the problem.  Comments?

 

The script unmounts the disk before starting the clearing process so nothing can be written to that disk, but the script appears to be working very slowly with recent unRAID releases, so if you want to do this procedure it should be done manually with dd, which needs the array started in maintenance mode, i.e., not accessible.
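A rough sketch of what zeroing with dd looks like, demonstrated on a scratch file rather than a real device and assuming GNU dd. The thread does not name the device to write to on the server, so `/dev/mdX` in the comment below is a placeholder and not a verified path; pointing dd at the wrong device destroys data. The helper names are made up.

```shell
# zero_fill TARGET MEGABYTES
# Overwrite the first MEGABYTES MiB of TARGET with zeroes,
# without truncating whatever follows.
zero_fill() {
    target=$1
    megabytes=$2
    dd if=/dev/zero of="$target" bs=1M count="$megabytes" conv=notrunc 2>/dev/null
}

# count_nonzero_bytes FILE
# Prints 0 when FILE contains nothing but zero bytes.
count_nonzero_bytes() {
    tr -d '\000' < "$1" | wc -c | tr -d ' '
}

# For a whole device the count is left out so dd runs to the end
# (placeholder device name, not executed here):
#   dd if=/dev/zero of=/dev/mdX bs=1M status=progress
```

Trying `zero_fill` on a throwaway file first and checking it with `count_nonzero_bytes` is a cheap way to get comfortable with the command before running it against an array disk in maintenance mode.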

Link to comment
16 hours ago, johnnie.black said:

 

The script unmounts the disk before starting the clearing process so nothing can be written to that disk, but the script appears to be working very slowly with recent unRAID releases, so if you want to do this procedure it should be done manually with dd, which needs the array started in maintenance mode, i.e., not accessible.

Thank you.  That makes perfect sense and should be included in the procedure description.  Anyway, I decided that jonathanm is right and this is really false economy.

Link to comment
  • CaptainTivo changed the title to [solved] questions about array shrink procedure
54 minutes ago, CaptainTivo said:

Thank you.  That makes perfect sense and should be included in the procedure description.  Anyway, I decided that jonathanm is right and this is really false economy.

 

The unmount is implied - the disk needs a valid file system to be mounted. And zeroing it removes the file system. The OS would be very unhappy if you zero the disk content before unmounting.

Link to comment

Archived

This topic is now archived and is closed to further replies.
