KVM Live Backup qcow2



Hi All,

 

I know this has been mentioned a million times, but I'm trying to back up my VMs live, without any shutdown or suspension, as part of my daily incremental rsync backup.

The machines in question run CCTV and VoIP systems, so they need to remain online.

 

Right now I have this, which does work: it creates a temporary overlay file, allowing the base image to stay unchanged. You should then be able to clone the original file, which in itself should work as an isolated backup, and once done, merge the overlay back into the main file.

 

virsh snapshot-create-as --domain "Windows Server 2016" $(date '+%Y-%m-%d-%H-%M-%S') --diskspec hdc,file="/mnt/user/ArrayVDisks/TempVms/overlays/Windows Server 2016.qcow2" --disk-only --atomic
virsh blockcommit "Windows Server 2016" hdc --active --verbose --pivot
rm "/mnt/user/ArrayVDisks/TempVms/overlays/Windows Server 2016.qcow2"

 

If I then run this command afterwards, I can see the snapshot has been created, and I can watch the overlay file being created and then merged as expected:

virsh snapshot-list "Windows Server 2016"

 

So far, so good

 

But when I run this together with my rsync command, it doesn't behave properly. Below is a complete copy of the script:

 

sources=(
	"/mnt/cache/VMImages/Windows Server 2016/"
)
targets=(
	"/mnt/disks/Disk_1/VMImages/Windows Server 2016"
)
arraylength=${#sources[@]}

# Snapshot first so the base image stays quiesced while we copy it
virsh snapshot-create-as --domain "Windows Server 2016" $(date '+%Y-%m-%d-%H-%M-%S') --diskspec hdc,file="/mnt/user/ArrayVDisks/TempVms/overlays/Windows Server 2016.qcow2" --disk-only --atomic

for (( ii=1; ii<${arraylength}+1; ii++ )); do
	echo "Starting backup of ${sources[$ii-1]} to ${targets[$ii-1]}"
	mkdir -p "${targets[$ii-1]}" 2>/dev/null

	# Rotate the last 30 backups; the oldest (backup.30) is recycled as the new backup.0
	BACKUPS=30
	END=$((BACKUPS - 1))
	mv "${targets[$ii-1]}"/backup."$BACKUPS" "${targets[$ii-1]}"/backup.tmp 2>/dev/null
	for ((i=END; i>=0; i--)); do
		mv "${targets[$ii-1]}"/backup."$i" "${targets[$ii-1]}"/backup.$(($i + 1)) 2>/dev/null
	done
	mv "${targets[$ii-1]}"/backup.tmp "${targets[$ii-1]}"/backup.0 2>/dev/null

	# Hard-link backup.1 into backup.0 so unchanged files cost no extra space
	cp -al "${targets[$ii-1]}"/backup.1/. "${targets[$ii-1]}"/backup.0
	rsync -ahv --delete --progress --exclude "docker.img" --exclude "Program Files" "${sources[$ii-1]}" "${targets[$ii-1]}"/backup.0/
done

# Merge the overlay back into the base image and remove it
virsh blockcommit "Windows Server 2016" hdc --active --verbose --pivot
rm "/mnt/user/ArrayVDisks/TempVms/overlays/Windows Server 2016.qcow2"

 

Using the same internal backup loop on a normal set of folders gives me proper incremental backups that only copy over file differences. For the VM image it seems to take a full copy of the base image every time, in this instance creating a 50GB copy every day rather than transferring the maybe 2GB of actual differences.
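
One thing I've read since that might explain it (I haven't verified this on my setup): rsync defaults to --whole-file when both ends are local paths, so the delta-transfer algorithm never runs, and because backup.0 starts life as hard links to backup.1, a changed image gets written out in full as a new file. Something along these lines might transfer only the changed blocks, at the cost of the hard-link history:

# A sketch, not a drop-in fix: force delta-transfer and update the file in place.
# Warning: --inplace modifies the hard-linked file, so backup.1 .. backup.30
# would all silently change with it.
rsync -ahv --delete --progress --no-whole-file --inplace --exclude "docker.img" --exclude "Program Files" "${sources[$ii-1]}" "${targets[$ii-1]}"/backup.0/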

 

I'm really new to rsync and to the KVM virsh command line, so I'm hoping I'm just misunderstanding something and there is an obvious fix.

I keep seeing "--no-metadata" mentioned in online virsh backup examples like the one above, but I'm unsure exactly what it does or whether it would fix the issue.
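
For what it's worth, my understanding of --no-metadata (happy to be corrected) is that it only changes libvirt's bookkeeping: the overlay file is still created and used exactly as before, but libvirt keeps no record of the snapshot, so nothing appears in snapshot-list and there is no snapshot metadata to clean up after the blockcommit. It shouldn't change what rsync sees at all:

# Same snapshot as above, but libvirt records nothing about it
virsh snapshot-create-as --domain "Windows Server 2016" $(date '+%Y-%m-%d-%H-%M-%S') --diskspec hdc,file="/mnt/user/ArrayVDisks/TempVms/overlays/Windows Server 2016.qcow2" --disk-only --atomic --no-metadata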

 

Help would be hugely appreciated, as I know a lot of people are looking for the same thing. My aim is eventually to rsync the differentials to an offsite SSH server, but I need to get this working locally first: I can't afford to copy 500GB of VM images a night, or to have them offline during the process. :)

 

As an aside, I'm using qcow2 because these run on an NVMe drive, so I need them to be sparse images that only use the space they actually need.
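
(For anyone unfamiliar: a qcow2 image created this way is thin-provisioned, reporting the full virtual size while only allocating what the guest has actually written. The path and size below are just examples.)

qemu-img create -f qcow2 "/mnt/user/ArrayVDisks/ExampleVM.qcow2" 500G
qemu-img info "/mnt/user/ArrayVDisks/ExampleVM.qcow2"   # compare "virtual size" with "disk size"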

 

Regards,

Jamie

  • 4 weeks later...

OK, for anyone following this or interested: this may be turning into more of a general virsh support thread than anything, as I have an idea of what might work for achieving at least some form of easier backups. Having had to trash a VM recently, this has become a more prominent issue for me.

 

Please note: I have not tried rolling a VM back to a snapshot yet, so don't rely on the above for serious backups until it has been tested more.

 

Now, as I'm fairly new to virsh and there are no doubt some gurus on here, this is my current script:

#!/bin/bash

#Declare backup sources
sources=(
	"/mnt/cache/VMImages/vm1/"
	"/mnt/disk1/Disk1Test/VM Image/vm2/"
)
targets=(
	"/mnt/disks/Disk_1/VMImages/vm1"
	"/mnt/disks/Disk_1/VMImages/vm2"
)
vmname=(
	"vm1"
	"vm2"
)
arraylength=${#sources[@]}

# Declare backup drives
deviceuuid=(
	"6ed5043c-14ee-41f2-903d-d201ec50d39f"
)
devicemount=(
	"/mnt/disks/Disk_1"
)
devicelength=${#deviceuuid[@]}

# Mount drives
for (( ii=1; ii<${devicelength}+1; ii++ )); do
	if grep -qs "${devicemount[$ii-1]}" /proc/mounts; then
		echo "${devicemount[$ii-1]}" " - mounted"
	else
		echo "${devicemount[$ii-1]}" " - not mounted"
		mkdir "${devicemount[$ii-1]}"
		echo "${devicemount[$ii-1]}" " - created mount path"
		mount -t xfs UUID="${deviceuuid[$ii-1]}" "${devicemount[$ii-1]}"
		if [ $? -eq 0 ]; then
			echo "${devicemount[$ii-1]}" " - mount success!"
		else
			echo "${devicemount[$ii-1]}" " - mount failed!"
			exit 1;
		fi
	fi
done

# Handle Backup
for (( ii=1; ii<${arraylength}+1; ii++ )); do
	echo "Starting backup of" "${sources[$ii-1]}" " to " "${targets[$ii-1]}"
	mkdir -p "${targets[$ii-1]}"
	
	#virsh domblklist "${vmname[$ii-1]}"
	virsh snapshot-create-as --domain "${vmname[$ii-1]}" $(date '+%Y-%m-%d-%H-%M-%S') --diskspec hdc,file="/mnt/user/ArrayVDisks/TempVms/overlays/${vmname[$ii-1]}.qcow2" --disk-only --atomic

	# Check for an existing backup image (avoids word-splitting on paths with spaces)
	if find "${targets[$ii-1]}" -name "*.qcow2" -print -quit | grep -q .; then
		echo "Running incremental backup - setting as inplace"
		rsync -ahv --delete --progress --inplace "${sources[$ii-1]}" "${targets[$ii-1]}"
	else
		echo "Running first backup - setting as sparse"
		rsync -ahv --delete --progress --sparse "${sources[$ii-1]}" "${targets[$ii-1]}"
	fi
	
	virsh blockcommit "${vmname[$ii-1]}" hdc --active --verbose --pivot
	rm "/mnt/user/ArrayVDisks/TempVms/overlays/${vmname[$ii-1]}.qcow2"
	#virsh snapshot-list "${vmname[$ii-1]}"
done


# Unmount drives
for (( ii=1; ii<${devicelength}+1; ii++ )); do
	if grep -qs "${devicemount[$ii-1]}" /proc/mounts; then
		fuser -k "${devicemount[$ii-1]}"
		umount "${devicemount[$ii-1]}"
		echo "${devicemount[$ii-1]}" " - unmounted"
		rmdir "${devicemount[$ii-1]}"
		echo "${devicemount[$ii-1]}" " - removed mount path"
	fi
done

 

So, this mounts the drives I specify to mount points of my choosing, then loops through the VMs as needed. It creates a snapshot for each VM and checks whether that VM has been backed up before. If it hasn't, rsync does a sparse copy to preserve the space used rather than copying at the full image size; if a backup already exists, it does an in-place update, which just updates the existing sparse image, again keeping the files small.

 

So far this seems to work fine: I have a 30GB qcow2 VM that's using 5.6GB, and three backup runs later the backup is still only 5.6GB, as nothing in the VM ever really changes even though it is running. There is also no noticeable difference to the VM while all this is running.

 

Once the image is copied, the script commits the overlay back into the base image, removes the overlay file, and moves on to the next VM; when everything is done, it unmounts all the drives. This is so I can back up my VMs to multiple drives if needed - I use the same approach for important documents on my array as well.

 

So that's all working, but there is still that dreaded rsync pass that copies the entire image over every time, which isn't ideal but is tolerable while I'm doing local backups.

 

So I had a thought earlier, but I can't figure out for the life of me whether it's even remotely possible.

 

From what I have read, the "virsh snapshot-create-as" command can build a snapshot backing chain with multiple layers when the "--reuse-external" parameter is passed. An example is below:

 

A brand new image is created and looks like this: base.img

After 1 snapshot this looks like this: base.img -> snapshot1

After 2 snapshots this looks like this: base.img -> snapshot1 -> snapshot2
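
If I'm reading the documentation right, building that chain looks something like the sketch below (untested; the VM name, target dev, and paths are made up). With --reuse-external you pre-create the overlay yourself, e.g. with qemu-img; newer qemu-img versions also want -F for the backing format:

# Pre-create the overlay that will sit on top of base.img
qemu-img create -f qcow2 -b /vms/base.img -F qcow2 /vms/snapshot1.qcow2
# Take the disk-only snapshot into the pre-created file
virsh snapshot-create-as --domain examplevm snap1 --diskspec hdc,file="/vms/snapshot1.qcow2" --disk-only --atomic --reuse-external --no-metadata
# The chain is now: base.img -> snapshot1.qcow2, and the VM writes to snapshot1.qcow2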

 

So my thought was: what happens if, when we create snapshot1, our backup script copies base.img?

When we create snapshot2, we copy snapshot1.

We then commit snapshot1 back into base.img using blockcommit with "--shallow --keep-relative", which leaves: base.img -> snapshot2

The next backup run then creates snapshot1 on top of snapshot2, commits snapshot2 as above, and so on in a loop: one run creates snapshot1, the next creates snapshot2, etc.

 

The file you then transfer is the "middle" snapshot, which is in essence the set of changes between the two snapshot points, resulting in a much smaller file to copy.
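
If I've got the semantics right, one backup run under that scheme would look roughly like this (an untested sketch; the VM name, target dev, and paths are placeholders), starting from base.img -> overlay1 with the VM writing to overlay1:

# Freeze overlay1 by putting overlay2 on top of it
virsh snapshot-create-as --domain examplevm state2 --diskspec hdc,file="/overlays/overlay2.qcow2" --disk-only --atomic --no-metadata
# overlay1 is now a read-only middle layer; copy it off as this run's differential
rsync -avh --sparse "/overlays/overlay1.qcow2" "/backups/"
# Fold overlay1 into base.img, leaving base.img -> overlay2
virsh blockcommit examplevm hdc --top "/overlays/overlay1.qcow2" --verbose --wait
rm -f "/overlays/overlay1.qcow2"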

 

In my mind that would work up to this point (although I know far too little to say whether it's actually plausible).

Now the main issue comes in. In your backup destination you end up with a pile of snapshot overlays and a base file, but how on earth would you get those overlays to commit back into the main file? The overlays and the base file would each be aware of entirely different backing chains, so I'm not sure how this could be maintained.
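
One idea that might answer this (untested): qemu-img can rewrite an overlay's backing-file pointer without touching any data, via its "unsafe" rebase mode, and then an offline commit with qemu-img, rather than virsh, should fold the copied overlay into the copied base:

# -u only rewrites the backing-file header; it trusts that the content really matches
qemu-img rebase -f qcow2 -u -b "/backups/base.img" "/backups/overlay1.qcow2"
qemu-img commit "/backups/overlay1.qcow2"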

 

My hope is that this thread becomes more of a rambling of ideas that helps someone else come up with a good solution. At work we use Windows Hyper-V with active replication (so if one server dies, server 2 is only slightly behind and can be spun up in its place); I would love to do something similar with my home unRAID boxes and KVM.

 

I'm aware this post is long and the topic may be drifting, so perhaps it's worth moving elsewhere? Either way, hopefully the scripts above help someone.

 

Regards,

Jamie

  • 10 months later...
On 6/30/2019 at 10:06 PM, ThatDude said:

Hey did you ever figure this out?

 

I would really like to make live incremental backups that I push to the cloud but I don't want to re-invent the wheel if you've already done it.

 

I'm afraid not. I had instances where the overlay images would fail to merge back in, so the backup would then fail on the next run.


That's been my experience too 😞

 

I've fallen back to scripting a graceful shutdown then taking a full backup of the VMs. I really wanted incremental backup files so that I could push them to cloud storage. I need to figure out a workaround.
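
For reference, the shutdown-then-copy routine only needs a few lines (a sketch; the VM name and paths are examples):

virsh shutdown examplevm
# Wait for the guest to actually power off before touching its disk
until [ "$(virsh domstate examplevm)" = "shut off" ]; do sleep 5; done
rsync -avh --sparse "/vms/examplevm.qcow2" "/backups/"
virsh start examplevm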


BTRFS snapshots work, though if they're made with the VM running the snapshot will be in a crash-consistent state; still better than nothing. What I do is take daily snapshots with the VMs online (followed by send/receive to another disk) and try once a week to do an offline snapshot. This way I have more options for recovery if needed; for example, I can go back to the last offline snapshot but grab any file/setting from the most recent online snapshot. More info here.
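
Roughly, that routine looks like this (a sketch with example paths; the VM share has to be a btrfs subvolume, and "previous" stands in for whatever the last snapshot was called):

# Read-only snapshot of the VM subvolume (crash-consistent if VMs are running)
SNAP="/mnt/pool/vms/.snapshots/$(date +%Y-%m-%d)"
btrfs subvolume snapshot -r /mnt/pool/vms "$SNAP"
# Incremental send against the previous snapshot, received on another disk
btrfs send -p "/mnt/pool/vms/.snapshots/previous" "$SNAP" | btrfs receive /mnt/disk2/vm-backups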

  • 5 months later...

So it's been a long time since I last looked at this, but as I'm bored over the holidays I thought I'd look at doing this a little differently. Instead of full backups, which take too much time and data, I looked at implementing an approach closer to what I think Windows replication would use.

 

I have not been using this script full time; it's just a testing script for now with no error checks etc., but if anyone wants to comment on it or elaborate on it, they are more than welcome to. Testing-wise, I have run it many times, shutting down and restarting the VM after various runs, and even mounting the backup image in a VM, and it all seems fine and works OK.

 

I have yet to bother exporting the XML file, as that seems like the easiest part in reality.
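
For completeness, that export should just be a one-liner along these lines (using the test VM name from the script below):

virsh dumpxml "Cent8Backup" > "/mdev1/VM Backup/Cent8Backup/Cent8Backup.xml"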

 

In short, this is how the script works:

  1. Loop through the VMs listed at the top, then loop through each drive assigned to them
  2. Create the folder the backup will be stored in, if needed
  3. Check whether overlay 1 or 2 exists; if neither does, this is an initial backup
    1. Create overlay 1 and back up the initial, large base image
    2. Shrink the copied image back down using qemu-img, then move the shrunken version into place as the main backup image
  4. If overlay 1 or 2 exists, handle things differently
    1. Create a new snapshot (overlay) using the alternate snapshot name (state1 and state2 switch each run)
    2. Copy the previously used snapshot to the backup location as the changes since the last backup
    3. Merge the old snapshot into the live base image
    4. Rebase the partial image in the backup folder onto the main backup image taken on the first run
    5. Commit the changes into the base backup image, so the backup file is complete again
    6. Remove the old snapshot file
    7. Shrink the backup image down again to keep sizes small, as snapshot commits do swell the underlying image if it is sparse

The reason I went down this route is that because you're only copying the changes since the last backup, the network load, and thus the time to complete, is extremely low. As you're also block-committing a previous snapshot rather than the live overlay, it should avoid errors in the commit. This should make the process a little more reliable and considerably faster than copying the base each time.

 

Now onto the downside. As you're using an overlay image at all times after the first commit, the overlay risks getting fairly large if the backup isn't run often while the VM is changing a lot. The upside: on my test 80GB CentOS install (12GB used), the script does a full backup and image shrink in 25 seconds (600MB of disk changes), so this could be run very often in a replication-style setup (6 seconds for a few hundred MB of changes without the image shrink).

 

Again, this is just me having a play, using various tools for fun to see if this was possible at all; so far it seems to be working, and any feedback from those with more knowledge than me would be appreciated. I must reiterate, however: this is being used for testing and is not production-ready in any way. Use it at your own risk.

 

To test, simply change the initial BaseBackupFolder, OverlayPath, and VMNames values.

 

#!/bin/bash
echo "Starting backup system..."
BaseBackupFolder="/mdev1/VM Backup/"
OverlayPath="/mdev1/VM Overlays/"
VMNames=(
	"Cent8Backup"
)
arraylength=${#VMNames[@]}
echo "Found ${arraylength} VMs that need backing up.."
echo " "

# Handle Backup
for (( ii=0; ii<${arraylength}; ii++ )); do
	BackupPath="${BaseBackupFolder}${VMNames[$ii]}"

	# Capture the target devices and source paths as arrays, one element per disk
	mapfile -t DriveLetter < <(virsh domblklist "${VMNames[$ii]}" --details | grep ^file | grep disk | awk -F' {2,}' '{print $3}')
	mapfile -t FilePaths < <(virsh domblklist "${VMNames[$ii]}" --details | grep ^file | grep disk | awk -F' {2,}' '{print $4}')
	TotalDrives=${#DriveLetter[@]}

	echo "VM Name: ${VMNames[$ii]}"
	echo "Drives: ${TotalDrives}"
	echo " "

	for (( iii=0; iii<${TotalDrives}; iii++ )); do
		DiskName=$(basename -- "${FilePaths[$iii]}")
		OverlayDestination="${OverlayPath}${VMNames[$ii]}-${DriveLetter[$iii]}-overlay1.qcow2"
		BackupTarget="${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-main.qcow2"
		echo "Backing up: ${DriveLetter[$iii]}"

		# Make sure the per-VM backup folder exists
		mkdir -p "${BackupPath}"

		if [[ ! -f "${OverlayPath}${VMNames[$ii]}-${DriveLetter[$iii]}-overlay1.qcow2" && ! -f "${OverlayPath}${VMNames[$ii]}-${DriveLetter[$iii]}-overlay2.qcow2" ]]; then
			echo "Initial overlay not found, creating first backup"
			echo "File Path: ${FilePaths[$iii]} -> ${BackupTarget}"
			echo "Overlay Path: ${OverlayDestination}"
			virsh snapshot-create-as --domain "${VMNames[$ii]}" guest-state1 --diskspec "${DriveLetter[$iii]}",file="${OverlayDestination}" --disk-only --atomic --no-metadata
			rsync -avh --info=progress2 --sparse "${FilePaths[$iii]}" "${BackupTarget}"
		else
			echo "Initial overlay found, running sequential backup"
			if [ ! -f "${OverlayPath}${VMNames[$ii]}-${DriveLetter[$iii]}-overlay2.qcow2" ]; then
				FilePaths[$iii]="${OverlayPath}${VMNames[$ii]}-${DriveLetter[$iii]}-overlay1.qcow2"
				OverlayDestination="${OverlayPath}${VMNames[$ii]}-${DriveLetter[$iii]}-overlay2.qcow2"
				BackupTarget="${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-partial.qcow2"
				echo "File Path: ${FilePaths[$iii]} -> ${BackupTarget}"
				echo "Overlay Path: ${OverlayDestination}"
				virsh snapshot-create-as --domain "${VMNames[$ii]}" guest-state2 --diskspec "${DriveLetter[$iii]}",file="${OverlayDestination}" --disk-only --atomic --no-metadata
				rsync -avh --info=progress2 --sparse "${FilePaths[$iii]}" "${BackupTarget}"
				virsh blockcommit "${VMNames[$ii]}" "${DriveLetter[$iii]}" --top "${FilePaths[$iii]}" --verbose --wait
				rm -f "${FilePaths[$iii]}"
				qemu-img rebase -f qcow2 -u -b "${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-main.qcow2" "${BackupTarget}"
				qemu-img commit "${BackupTarget}"
				rm -f "${BackupTarget}"
			else
				FilePaths[$iii]="${OverlayPath}${VMNames[$ii]}-${DriveLetter[$iii]}-overlay2.qcow2"
				OverlayDestination="${OverlayPath}${VMNames[$ii]}-${DriveLetter[$iii]}-overlay1.qcow2"
				BackupTarget="${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-partial.qcow2"
				echo "File Path: ${FilePaths[$iii]} -> ${BackupTarget}"
				echo "Overlay Path: ${OverlayDestination}"
				virsh snapshot-create-as --domain "${VMNames[$ii]}" guest-state1 --diskspec "${DriveLetter[$iii]}",file="${OverlayDestination}" --disk-only --atomic --no-metadata
				rsync -avh --info=progress2 --sparse "${FilePaths[$iii]}" "${BackupTarget}"
				virsh blockcommit "${VMNames[$ii]}" "${DriveLetter[$iii]}" --top "${FilePaths[$iii]}" --verbose --wait
				rm -f "${FilePaths[$iii]}"
				qemu-img rebase -f qcow2 -u -b "${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-main.qcow2" "${BackupTarget}"
				qemu-img commit "${BackupTarget}"
				rm -f "${BackupTarget}"
			fi
		fi
		echo "shrinking backup image"
		qemu-img convert -O qcow2 "${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-main.qcow2" "${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-main.qcow2.shrunk"
		rm -f "${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-main.qcow2"
		mv "${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-main.qcow2.shrunk" "${BackupPath}/${VMNames[$ii]}-${DriveLetter[$iii]}-main.qcow2"
		echo "image shrunk"
		# uncomment the following to completely flatten. Remember to delete the running overlay afterwards
		# virsh blockcommit "${VMNames[$ii]}" "${DriveLetter[$iii]}" --active --verbose --pivot
		echo " "
	done

done

It uses the same principle I discussed previously, whereby we go from:

  1. Base -> Overlay1
  2. Base -> Overlay1 -> Overlay2
  3. Base -> Overlay2
  4. Base -> Overlay2 -> Overlay1
  5. Start again at point 1

 

At every point, the file we copy is the middle overlay, which you can see in steps 2 and 4.

 

Regards,

Jamie


Thanks bastl,

 

Mine was a little more generic, as it's currently running on non-unRAID hardware. I'm aiming it more towards my remote backup server, which uses SSH to back up my remote systems; my current script could be configured to run against most things remotely, for example.

 

That script will certainly be useful to read through, however.

7 hours ago, bigjme said:

Mine was a little more generic, as it's currently running on non-unRAID hardware. I'm aiming it more towards my remote backup server, which uses SSH to back up my remote systems; my current script could be configured to run against most things remotely, for example.

 

I think the base script that the plugin actually runs could be easily adapted to a non-unraid system (but don't quote me on that lol). I'm still maintaining the script in a separate repo, so here's a link to just the script portion if you don't want to sort through the whole plugin:

 

https://github.com/JTok/unraid-vmbackup/blob/master/script

  • 6 months later...
On 12/30/2019 at 7:23 PM, JTok said:

 

I think the base script that the plugin actually runs could be easily adapted to a non-unraid system (but don't quote me on that lol). I'm still maintaining the script in a separate repo, so here's a link to just the script portion if you don't want to sort through the whole plugin:

 

https://github.com/JTok/unraid-vmbackup/blob/master/script

Your script does not take into account dynamic qcow2 disks. When I use it to back up a 500GB qcow2 disk (actual size 23GB), it copies the file including all the empty space, resulting in a backup over 20 times larger than the actual machine disk. Any ideas on how to fix this? I'm using virt-manager on Linux Mint, by the way, not unRAID. The script itself does indeed run on a non-unRAID system.
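
A few standard ways to keep a copy sparse that I'm looking into (untested against that script specifically; paths are examples):

rsync -avh --sparse /vms/disk.qcow2 /backups/                             # preserve holes while copying
cp --sparse=always /vms/disk.qcow2 /backups/disk.qcow2                    # same idea with cp
qemu-img convert -O qcow2 /backups/disk.qcow2 /backups/disk-small.qcow2   # repack an already-bloated copy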

