Copying 27 TB from server A to B


pyrater


What is the best way to do this? Enable parity first, or just do a parity sync once the data is copied? Use rsync over SSH, Windows Explorer, or TeraCopy? Any best practices for this, or am I overthinking it?

 

I kind of just want to disable parity, copy the data at 125 MB/s, and then redo parity... seems fastest, but other threads state that this may result in data corruption that you will not know about...


I'm not in too much of a hurry. I would go for rsync - and I would then run rsync again to verify if there are updated/new files since the first copy started.

 

At 110 MB/second you can manage about 400 GB/hour, so just under 3 days for the full 27 TB. And if you need to, you can stop the copy process in the middle if you want to watch a movie without the copy process stealing too much disk/network bandwidth. For me, it wouldn't be the total time that matters most but how much the copy process will affect my access to the data.
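The arithmetic above can be sketched quickly. The figures (27 TB, 110 MB/s) come from this thread; decimal units (1 TB = 1000 GB) are assumed:

```shell
#!/bin/sh
# Back-of-the-envelope transfer time: 27 TB at ~110 MB/s sustained.
# Real throughput will vary with file sizes and parity settings.
TOTAL_GB=27000
RATE_MB_S=110
GB_PER_HOUR=$(( RATE_MB_S * 3600 / 1000 ))   # MB/s -> GB/hour
HOURS=$(( TOTAL_GB / GB_PER_HOUR ))
DAYS_X10=$(( HOURS * 10 / 24 ))              # tenths of a day, integer math
echo "${GB_PER_HOUR} GB/hour, about ${HOURS} hours (~$(( DAYS_X10 / 10 )).$(( DAYS_X10 % 10 )) days)"
```

That lines up with the "about 400 GB/hour, just under 3 days" estimate above.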

 

Just remember that you want to use turbo-write, i.e. reconstructive writes, or the receiving machine will not be able to keep up with the network link speed.

 

 

I'm assuming most of the data is media data (movies, audio, images), so there is no gain to hope for from stream-compressing the data - rsync's -z option would just burn CPU on already-compressed files.

1 hour ago, pyrater said:

Will rsync account for broken pipes / files if I set it up to run from, say, midnight to 5 AM over and over until all the data is copied? That way it's usable in the day and not killing the network while I'm on it.

 

In my experience rsync is very resilient against problems like disconnects, running out of space at the wrong time, and the like. Even user aborts with a Ctrl-C don't cause problems. I re-run, it sees what's still to be completed, and then it gets on with it. I also use it occasionally to do binary comparisons between servers, so I also have an additional check that the copies are good. It's never let me down.


I'm thinking of doing transfers from 11 PM to 7 AM every night until it's complete. Anyone see any issues with this?

 

Run from source server:


ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]

 

cd /usr/local/

nano rsync_script

 

#!/bin/sh
rsync -e 'ssh -p 22' -av /mnt/user/Data 192.168.2.5:/mnt/user/Data

 

Ctrl+X (then Y and Enter to save)

 

chmod +x rsync_script


crontab -e 
00 21 * * * /usr/local/rsync_script
0 7 * * * killall rsync


Mostly got the info from: https://www.techrepublic.com/article/how-to-set-up-auto-rsync-backups-using-ssh/
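One caveat with the crontab above: if a night's transfer is still running when the next 21:00 start fires, cron will happily launch a second rsync over the same tree. Wrapping the script in flock (from util-linux) prevents that; the lock file path below is just an example. A local demonstration of the behaviour:

```shell
#!/bin/sh
# In the crontab, the guard would look like:
#   00 21 * * * flock -n /var/run/rsync_script.lock /usr/local/rsync_script
# With -n, flock exits immediately if the lock is already held,
# instead of starting a second copy of the same transfer.
LOCK=$(mktemp)
flock -n "$LOCK" sleep 3 &           # first "run" holds the lock
sleep 1
if flock -n "$LOCK" true; then       # second "run" while the first is live
    RESULT=started
else
    RESULT=skipped
fi
echo "$RESULT"
wait
rm -f "$LOCK"
```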

 

 

Edited by pyrater
39 minutes ago, pyrater said:

S80 can you hook me up with the command to do the comparisons between servers? I would like to run that after the data is copied to make sure nothing was borked in the process. Currently I'm using

 

rsync -av --info=progress2 -e ssh Data/ [email protected]:/mnt/user/

OK - so this is what I use, but I am not using ssh. I have the rsync daemon running on the server I am copying to. You can have it running on the source if you prefer, and then pull the files to the destination. Obviously the source and destination parameters would need to change... My destination is called BackupServer, and I am checking the entire contents of a disk on the source server (disk1 in this case) against a share on the destination server (Disk1-backup).

rsync -rvnc --progress --delete --timeout=18000 /mnt/disk1/ BackupServer::mnt/user/Disk1-backup > /mnt/cache/Disk1_differences.txt 2>&1 &

For my use, this spawns a task (the & at the end) so that I can run multiple concurrent sessions on different physical disk drives.  The -n option is to perform a dry-run, so no files get changed by this.  The -c option is to use the checksums of the files as a comparison instead of the date and time stamps.  Any differences are logged in a file for later examination - in this case Disk1_differences.txt.  The very long timeout is simply because if you have a small number of very large files (big blu-ray rips, for example) the connection can go quiet for a long time while the checksums are being calculated, and I did have some issues with rsync timing out in such cases.
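For anyone wanting to replicate the daemon setup mentioned above, the destination machine needs a module defined in its rsyncd.conf. This sketch assumes a module named mnt, to match the BackupServer::mnt/... path in the command; the real module name and options on S80_UK's server may differ:

```ini
# /etc/rsyncd.conf on the destination server (module name assumed)
uid = root
gid = root

[mnt]
    path = /mnt
    read only = no
    comment = backup target for disk compares and copies
```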

 

Another approach altogether would be to use the Dynamix File Integrity plugin. This stores a hash for each file in the extended attributes. rsync copies those across as well when copying files, provided you add the -X option to preserve extended attributes (the plugin can also generate them later for files that arrive without one). Then you can use the same plugin on the destination server to check the files there. I haven't yet tried this plugin to check files on my backup server, but I am now starting to use it to validate files on my main server.

Edited by S80_UK

In my case the backup is a mirror of the master, so --delete tells me if there are any files on the backup which are not present on the source.  If I didn't do a dry run, those files would get deleted from the backup.  That's not what everyone wants, but it suits my use case.

Edited by S80_UK
15 hours ago, pyrater said:

Will rsync account for broken pipes / files if I set it up to run from, say, midnight to 5 AM over and over until all the data is copied? That way it's usable in the day and not killing the network while I'm on it.

Rsync is excellent at allowing you to restart aborted copy operations. That's why I suggested rsync and the ability to just copy during off-hours and break the copy when you want personal access to your files without rsync stealing bandwidth.

