Subversion is great, but like any data repository, it must be backed up regularly. Many people have tried to implement version control without really understanding how it works, only to later discover that their backup strategy wasn’t working.
The svn backup script I use is run every night as part of a cron job. Each morning I get an email telling me if everything went ok or not. Here is a list of what I want to happen with each backup:
- Dump all the data out of the repository
- Name the file with a timestamp in the filename. Something like YYYYMMDD-HHMM will work.
- gzip the file to save space
- Move a copy of the file to another server using scp
That seems pretty basic, but when I’m doing a backup by hand, I like to go a step further and verify the backup by creating a new repository, filling it with the backed-up data and then checking it out. This lets me verify that my backup works and that I can get my code back if necessary. So for this verification stage I want to do the following:
- Pull the zipped file back down from the remote server
- Unzip it.
- Create a new repository
- Load all of my content into the new repository
- Check out a copy of trunk into a directory
- Clean up
The following Perl script accomplishes everything I need in an svn backup script. When it is run with cron, I get a short email every day telling me that it completed. The output is intentionally terse. If I get a long email I know something went wrong, but I don’t have to wade through a bunch of logging information if everything went as planned. If you want more output, remove the -q from the Subversion commands. The emails that cron sends me look like this if nothing went wrong:
Dumping Subversion repo /var/svn to my_backup-20050921-0100...
Backing up through revision 340...
Compressing dump file...
Created /home/admin/backups/my_backup-20050921-0100.gz
my_backup-20050921-0100.gz transferred to my.server.com
---------------------------------------
Testing Backup
---------------------------------------
Downloading my_backup-20050921-0100.gz from my.server.com
Unzipping my_backup-20050921-0100.gz
Creating test repository
Loading repository
Checking out repository
Cleaning up
If you want to use this on Windows, you’ll need to make a few changes. First, the way we generate the date and time stamp for the file name will need to change. You’ll probably want to use something other than scp and gzip as well.
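One way to avoid depending on the Unix date command is to build the stamp in Perl itself. The following is only a sketch, assuming the core POSIX module is available (it ships with standard Perl distributions); make_stamp is a name I made up for illustration and is not part of the script below.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: build the YYYYMMDD-HHMM stamp in pure Perl,
# so no external `date` command is needed.
sub make_stamp {
    my ($epoch) = @_;
    $epoch = time unless defined $epoch;   # default to the current time
    return strftime("%Y%m%d-%H%M", localtime($epoch));
}

my $bkup_file = "my_backup-" . make_stamp();
print "$bkup_file\n";
```

Because the stamp no longer comes from a shell command, there is also no trailing newline to chomp.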
Here is the script. I hope some people find it useful.
#!/usr/bin/perl
use strict;
use warnings;

# Configuration: adjust these paths and host names for your setup
my $svn_repo       = "/var/svn";
my $bkup_dir       = "/home/backup_user/backups";
my $bkup_file      = "my_backup-";
my $tmp_dir        = "/home/backup_user/tmp";
my $bkup_svr       = "my.backup.com";
my $bkup_svr_login = "backup";

# Append a YYYYMMDD-HHMM stamp to the file name
$bkup_file = $bkup_file . `date +%Y%m%d-%H%M`;
chomp $bkup_file;

# Look up the most recent revision in the repository
my $youngest = `svnlook youngest $svn_repo`;
chomp $youngest;

# Dump the repository to a file
my $dump_command = "svnadmin -q dump $svn_repo > $bkup_dir/$bkup_file";
print "\nDumping Subversion repo $svn_repo to $bkup_file...\n";
print `$dump_command`;
print "Backing up through revision $youngest... \n";

# Compress the dump file
print "\nCompressing dump file...\n";
print `gzip -9 $bkup_dir/$bkup_file`;
my $zipped_file = $bkup_dir . "/" . $bkup_file . ".gz";
print "\nCreated $zipped_file\n";

# Copy the compressed dump to the backup server
print `scp $zipped_file $bkup_svr_login\@$bkup_svr:/home/backup/`;
print "\n$bkup_file.gz transferred to $bkup_svr\n";

# Test the backup
print "\n---------------------------------------\n";
print "Testing Backup";
print "\n---------------------------------------\n";

print "Downloading $bkup_file.gz from $bkup_svr\n";
print `scp $bkup_svr_login\@$bkup_svr:/home/backup/$bkup_file.gz $tmp_dir/`;

print "Unzipping $bkup_file.gz\n";
print `gunzip $tmp_dir/$bkup_file.gz`;

print "Creating test repository\n";
print `svnadmin create $tmp_dir/test_repo`;

print "Loading repository\n";
print `svnadmin -q load $tmp_dir/test_repo < $tmp_dir/$bkup_file`;

print "Checking out repository\n";
print `svn -q co file://$tmp_dir/test_repo $tmp_dir/test_checkout`;

# Remove everything the test created
print "Cleaning up\n";
print `rm -f $tmp_dir/$bkup_file`;
print `rm -rf $tmp_dir/test_checkout`;
print `rm -rf $tmp_dir/test_repo`;
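The script relies on a long email to signal trouble and never looks at whether a command actually succeeded. If you would rather have it stop at the first failure, something along these lines might work. This is a sketch only: run_or_die is a hypothetical helper, not something the script above uses, and it assumes a Unix-style shell that understands 2>&1.

```perl
use strict;
use warnings;

# Hypothetical helper: run a shell command, return its captured output,
# and die with the exit status if the command fails.
sub run_or_die {
    my ($cmd) = @_;
    my $output = `$cmd 2>&1`;       # capture stdout and stderr together
    my $status = $? >> 8;           # exit code of the command
    die "Command failed (exit $status): $cmd\n$output" if $? != 0;
    return $output;
}

# Example with a harmless command; in the backup script this would wrap
# the svnadmin, gzip and scp calls instead.
print run_or_die("echo backup step ok");
```

When the script dies under cron, the error message ends up in the email, so a failed step is obvious without reading through the normal output.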
Eric Wilhelm has another Subversion backup method that is worth checking out as well. His method is based on dumping out a backup after every X commits instead of on a fixed schedule. This has some advantages, particularly with large repositories that don’t change very often.
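To give a rough idea of what a commit-count approach could look like, here is a sketch that builds an incremental dump command once enough new revisions have accumulated. This is not Eric’s script. The function name, the output file naming and the idea of tracking the last dumped revision yourself are all my own assumptions; the only Subversion features it relies on are the --incremental and -r options of svnadmin dump.

```perl
use strict;
use warnings;

# Hypothetical helper: return the svnadmin command that dumps the revisions
# committed since the last backup, or nothing if fewer than $every new
# revisions have accumulated.
sub incremental_dump_command {
    my ($repo, $last_dumped, $youngest, $every, $out_dir) = @_;
    return if $youngest - $last_dumped < $every;
    my $from = $last_dumped + 1;
    my $file = "${out_dir}/incr-${from}-${youngest}.dump";
    return "svnadmin dump -q --incremental -r ${from}:${youngest} $repo > $file";
}

# Example: the last backup ended at r340, the repository is now at r352,
# and we dump after every 10 commits.
my $cmd = incremental_dump_command("/var/svn", 340, 352, 10,
                                   "/home/backup_user/backups");
print defined $cmd ? "$cmd\n" : "Not enough new commits yet\n";
```

The last dumped revision would have to be stored somewhere between runs, for example in a small state file, and restoring means loading the incremental dumps in order.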
A very nice script indeed. I’ve got an enhancement proposal, though: use ISO-format timestamps. Those are guaranteed to be understood by people and machines alike. date -Is produces such timestamps to second-level precision. They look like this:
2005-10-25T18:56:37+0200
Easily parseable by machines and unsuspecting non-nerdy ;) users. It also includes the timezone, which is nifty if you admin machines across time or even date borders. I’ve taken to using only ISO stamps everywhere since they can be sorted quite easily, too.
Excellent script, but I am wondering what would happen if there’s a write to one of the files in the repository during the dump.
What is the best way to handle that: shut down the Apache process before the dump and restart it afterwards?
Thanks for this… doing a test check-out seems like a very sensible idea. I will use this as a template for doing something similar in PHP (I’m on a Windows box and so don’t have Perl installed by default).
Thank you. I’ve tried this script and it worked (with a few modifications).
You are great
svn-hot-backup – the Python script
I use the svnmirror.sh script to back up the Subversion repository.
wow!
I like the idea of testing the backup!
Thanks for sharing.
Slavi
Great post. Came up very high on Google :) I just automated backups in my post-commit hook but opted for the svn-fast-backup script as provided for the fsfs repository. Works like a charm.
Is it really necessary to use perl if you mainly just use backticks to execute shell commands?
Perl means I can use the same script regardless of what system I’m using. It would be trivial to convert it to a particular shell script if you prefer not to use Perl.