June 6, 2025

I have been concerned with backups for the last month or two. I used to have a makeshift "backup" of my server data on an old PC, but the data set outgrew its storage about 6 months ago, and then the PC died completely in May. It was time for something better, especially since a syntax error in a cleanup script a few weeks ago had accidentally deleted some files.

First, I rebuilt the PC as a new Unraid server, reusing some of the small drives that were in it before (2x4TB and 2x8TB) and adding 2 new 20TB drives. I bought a new motherboard, CPU and RAM as well. All told, the backup box ended up with 64TB usable, compared to my main server's 90TB. That's fine at the moment, as the used space on main is only 65TB and I have several folders of stuff that I am OK with not backing up (porn, downloaded software installers, etc). I decided to forgo parity on the backup box, as the main box has parity, and this is after all a backup. Once all the hardware issues were solved, I installed Unraid and grabbed a license.

So then it came down to the nuts and bolts of how I was going to back up. I had done some looking around on this before. One of the things I really like about Unraid is that even if a disk fails and parity is completely faulty, you can always read data from the non-failed disks. I work in IT, and while there are good reasons for products like Veeam and ShadowProtect to use their own image-based backup formats, I really wanted my backup server to be readable without any software other than Unraid.

I narrowed the choice down to two options, rdiff-backup and rsnapshot. Both build on rsync (rsnapshot calls rsync directly, rdiff-backup uses the librsync library) to mirror whole directories onto the target device, and both provide space-efficient backup increments, but they differ in how they store the increments. Rdiff-backup uses its own format for the increments, and it further improves its space efficiency by storing only compressed diffs of changed files.
It does mean that to restore from anything other than the most recent backup, you need a tool to browse and recover files. There is a pretty good web interface for this called rdiffweb, which is a separate product from rdiff-backup itself.

The other option is rsnapshot. It uses clever tricks with hardlinks to present each increment as a top-level folder that contains all the files as of that point in time. There are no file-level diffs and no compression, so an increment uses the full space of every new or changed file.

The additional features of rdiff-backup led me to pick it as my backup solution. I got it all set up in Docker, first using someone else's container, then forking it on GitHub and rolling in my own mods. I started my initial backup of the "most important" data (media files, pictures, etc) and let it run. The first backup took a week, but I thought "oh, this is just the initial full backup, it's copying everything, increments will be faster." I was wrong. Rdiff-backup is SLOW. My first incremental looked like it was on track to take 6 or 7 days, so I scrapped that plan and switched to rsnapshot.

I'm not aware of anything in Community Apps for rsnapshot, so I put my recently learned roll-your-own Docker skills to work. I set up a container image with rsnapshot and ssh, plus some scripts to get crucial config files out of the container and into appdata. If someone wants to follow in my (faint and dubious) footsteps, I can DM you the link to pull my container.

Here's how it ends up going: I have the same container deployed on both servers. On the backup Unraid server, I deployed my rsnapshot-docker container with the backup share mounted as /backup, and a couple of SMB remotes from a Windows box that I wanted to back up mounted under the root as well. I created an rsnapshot.conf file that lists all my sources, and set up the "alpha" and "beta" backup plans to keep 6 and 4 snapshots respectively. I will be running those as dailies and weeklies.
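For anyone setting this up, the two retention tiers come down to a couple of lines in rsnapshot.conf. rsnapshot requires tabs (not spaces) between fields, so this rough sketch writes the lines with printf to make the tabs explicit. The temp file and the /backup/ snapshot_root are placeholders for illustration, not a copy of the actual config, and a real file also needs the backup source lines.

```shell
#!/bin/bash
# Sketch only: writes the retention part of an rsnapshot.conf.
# The output path and snapshot_root are assumptions for illustration.
set -eu

conf="$(mktemp)"   # stand-in for the real rsnapshot.conf location
{
    printf 'config_version\t1.2\n'
    printf 'snapshot_root\t/backup/\n'
    # Tiers are listed from most frequent to least frequent.
    printf 'retain\talpha\t6\n'   # dailies: keep 6
    printf 'retain\tbeta\t4\n'    # weeklies: keep 4
} > "$conf"

cat "$conf"
```

With the tiers in this order, a beta run promotes the oldest alpha snapshot instead of copying from the source again. Once the backup lines are added, `rsnapshot -c <file> configtest` will check the syntax.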
I might increase the 4 weeks, or add a gamma / monthly tier, if space permits. I put another copy of the same container on my main server, with the shares I wanted to back up mounted under the container's root as well.

When rsnapshot backs up, you have to give it a "repository" folder to put the backups in. Each backup gets its own folder in the "backup root" (the snapshot_root setting) defined in the config file, with the first backup being "alpha.0". I chose to have the repository folder be the hostname of my primary Unraid, so my backup folder looks like this: alpha.0/sourcehostname/sharename/foo/bar/baz.txt, and the Windows box gets its own hostname in there as well. That path setup is why the source-side container has everything mounted to root: if I mounted the shares as /source/share, I would end up with that additional directory level in the backup folders (like alpha.0/sourcehostname/source/sharename/foo/bar/baz.txt).

In order to keep fine control over what I am backing up, I set up the rsnapshot.conf to list just the root sharenames, and I use an exclude file (I called it backup.list) in the appdata folder to control directory exclusions.

Of course, I also had to do some things with ssh to get unattended connectivity. That was your basic ssh key setup plus an ssh config file on the backup box that specifies the port and IP to connect to and which private key to use, and that disables host key checking, since my containers regenerate their ssh host key when they upgrade or rebuild. Note that the ssh session is container to container, so you have to allocate a port other than 22 to the container and use that when backing up. I keep the connectivity working across reboots by mapping /root/.ssh/ to /mnt/user/appdata/rsnapshot-docker/ssh/.

Then I got to scheduling. I started off on the assumption that I would be using cron from inside the container to schedule jobs.
This turned out not to work so well. I suspect I needed to install some additional packages or tweak the container setup in some way, but I decided all that was not really worth it and fell back to using the User Scripts plugin on the host for scheduling. I just have a script that runs "docker exec -d rsnapshot-docker rsnapshot alpha", followed by some grepping of log files to detect errors and warnings and send me a message on ntfy.sh.

That first scheduled alpha backup kicked off about 3pm. I had added a new directory of about 7TB to this backup, so I think it will run overnight and into tomorrow, but I am getting really close to having everything I want in the backup job. By early next week, this should be humming along with no intervention from me.

Let me explain a bit how rsnapshot works, because I think it's genius. The initial backup goes into the alpha.0 folder, with all files included. Then, when the next alpha pass runs (the next day in my case), it renames alpha.0 to alpha.1, copies only new or changed files into a fresh alpha.0, and hardlinks the unchanged files to the existing ones in alpha.1. This is all quite fast, since under the hood it uses rsync, which is really efficient. When it reaches the configured number of increments to keep, it simply deletes the oldest one and rolls the remaining folders up one number. Because of how hard links work, a file's data stays on disk as long as any snapshot still links to it, so every remaining folder is still a complete backup. The newest backup in each series is always numbered 0.

The best part is that it is fast. It's been about 5 days since I started with rsnapshot, and I already have more data on the backup than I did in my initial run with rdiff-backup. An incremental without substantial data changes runs in about 30 minutes, which is actually pretty excellent for the number of files I am processing (about half a million).

So anyway, I am tickled pink with the backup at this point.
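For anyone curious about the hardlink trick described above, here is a small toy script that imitates one rotation by hand. It is not rsnapshot's code, and the file names and scratch directory are made up for the demo. It only shows why an unchanged file costs no extra space and why deleting an old snapshot is safe.

```shell
#!/bin/bash
# Toy illustration of hardlink snapshots. This is NOT rsnapshot's code;
# every path and file name here is invented for the demo.
set -eu

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
mkdir "$work/source" "$work/backup"
echo "unchanged" > "$work/source/keep.txt"
echo "version 1" > "$work/source/edit.txt"

# First pass: a plain full copy becomes alpha.0.
cp -a "$work/source" "$work/backup/alpha.0"

# One file changes before the next pass.
echo "version 2" > "$work/source/edit.txt"

# Next pass: rotate alpha.0 to alpha.1, then build a fresh alpha.0.
mv "$work/backup/alpha.0" "$work/backup/alpha.1"
mkdir "$work/backup/alpha.0"
for f in "$work/source"/*; do
    name="$(basename "$f")"
    if cmp -s "$f" "$work/backup/alpha.1/$name"; then
        # Unchanged: hardlink to the previous snapshot, no new data written.
        ln "$work/backup/alpha.1/$name" "$work/backup/alpha.0/$name"
    else
        # New or changed: store a real copy.
        cp -a "$f" "$work/backup/alpha.0/$name"
    fi
done

# [ a -ef b ] is true when both paths are the same file on disk.
keep_shared=no
edit_shared=no
if [ "$work/backup/alpha.0/keep.txt" -ef "$work/backup/alpha.1/keep.txt" ]; then keep_shared=yes; fi
if [ "$work/backup/alpha.0/edit.txt" -ef "$work/backup/alpha.1/edit.txt" ]; then edit_shared=yes; fi
echo "keep.txt shared between snapshots: $keep_shared"
echo "edit.txt shared between snapshots: $edit_shared"

# Dropping the old snapshot does not hurt the newer one.
rm -rf "$work/backup/alpha.1"
survivor="$(cat "$work/backup/alpha.0/keep.txt")"
echo "keep.txt after deleting alpha.1: $survivor"
```

Run as-is, it should report keep.txt as shared and edit.txt as not shared, and keep.txt is still readable after alpha.1 is gone. As far as I understand, the real tool leaves the comparing and linking to rsync rather than using cmp and ln like this.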
June 7, 2025

I had to ask ChatGPT to summarize your literal wall of text

This script is what I use to back up between servers