Backing Up Subversion Automatically

Subversion is great, but like any data repository, it must be backed up regularly. Many people have tried to implement version control without really understanding how it works, only to later discover that their backup strategy wasn’t working.

The Subversion backup script I use runs every night as a cron job. Each morning I get an email telling me whether everything went OK. Here is a list of what I want to happen with each backup:

  1. Dump all the data out of the repository
  2. Name the dump file with a date and time stamp. Something like YYYYMMDD-HHMM will work.
  3. gzip the file to save space
  4. Move a copy of the file to another server using scp

Seems pretty basic, but when I’m doing a backup by hand, I like to go a step further and verify the backup by creating a new repository, loading the backed-up data into it, and then checking it out. This lets me verify that my backup works and that I can get my code back if necessary. So for this verification stage I want to do the following:

  1. Pull the zipped file back down from the remote server
  2. Unzip it.
  3. Create a new repository
  4. Load all of my content into the new repository
  5. Check out a working copy into a directory
  6. Clean up
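
One further check fits naturally into the verification steps above: comparing the youngest revision of the restored repository with that of the original. The following is only a sketch under stated assumptions. The function names parse_revision and same_youngest are hypothetical, and the strings passed in are assumed to be raw output from svnlook youngest.

```perl
use strict;
use warnings;

# Extract a revision number from the output of `svnlook youngest`.
# Returns undef if the output is not a plain non-negative integer.
sub parse_revision {
    my ($output) = @_;
    return undef unless defined $output;
    return $output =~ /^\s*(\d+)\s*$/ ? $1 + 0 : undef;
}

# True when the restored repository reaches the same revision as the
# original. Both arguments are raw `svnlook youngest` output.
sub same_youngest {
    my ($original_out, $restored_out) = @_;
    my $orig = parse_revision($original_out);
    my $rest = parse_revision($restored_out);
    return defined $orig && defined $rest && $orig == $rest;
}

# Example with canned output. In a real script these strings would come
# from running `svnlook youngest` against the original repository and
# against the freshly loaded test repository.
print same_youngest("340\n", "340\n") ? "revisions match\n" : "MISMATCH\n";
```

If a commit lands between recording the original revision and taking the dump, the two numbers can legitimately differ, so a mismatch is a prompt to look closer rather than proof of a bad backup.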

The following Perl script accomplishes everything I need in a Subversion backup script. When it is run with cron, I get a short email every day telling me that it completed. The output is intentionally terse. If I get a long email I know something went wrong, but I don’t have to wade through a bunch of logging information if everything went as planned. If you want more output, remove the -q from the Subversion commands. The emails that cron sends me look like this if nothing went wrong:

Dumping Subversion repo /var/svn to my_backup-20050921-0100...
Backing up through revision 340...

Compressing dump file...

Created /home/admin/backups/my_backup-20050921-0100.gz

my_backup-20050921-0100.gz transferred to

Testing Backup
Downloading my_backup-20050921-0100.gz from
Unzipping my_backup-20050921-0100.gz
Creating test repository
Loading repository
Checking out repository
Cleaning up

If you want to use this on Windows, you’ll need to make a few changes. First, the way the date and time stamp for the file name is generated will need to be changed. You’ll probably want to use something other than scp and gzip as well.
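
As a sketch of the first change, the date and time stamp can be built inside Perl with strftime from the core POSIX module instead of calling the external date command. The function name backup_file_name and its optional epoch argument are assumptions made for this example and do not appear in the original script.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build a backup file name such as "my_backup-20050921-0100" without
# calling the external `date` command. The optional epoch argument
# exists only to make the function easy to test; by default the
# current time is used.
sub backup_file_name {
    my ($prefix, $epoch) = @_;
    $epoch = time unless defined $epoch;
    return $prefix . strftime("%Y%m%d-%H%M", localtime($epoch));
}

print backup_file_name("my_backup-"), "\n";
```

The result could be assigned to $bkup_file in place of the backtick call to date, and no chomp is needed because strftime does not add a trailing newline.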

Here is the script. I hope some people find it useful.

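#!/usr/bin/perl
use strict;
use warnings;

# Settings: adjust these for your own servers and paths.
my $svn_repo = "/var/svn";                   # repository to back up
my $bkup_dir = "/home/backup_user/backups";  # where the local dump is written
my $bkup_file = "my_backup-";                # file name prefix
my $tmp_dir = "/home/backup_user/tmp";       # scratch space for the test restore
my $bkup_svr = "";                           # host name of the remote backup server
my $bkup_svr_login = "backup";               # login on the remote backup server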

# Append a date and time stamp (YYYYMMDD-HHMM) to the file name.
$bkup_file = $bkup_file . `date +%Y%m%d-%H%M`;
chomp $bkup_file;
# Ask the repository for its most recent revision number.
my $youngest = `svnlook youngest $svn_repo`;
chomp $youngest;

# Dump the repository; -q suppresses the per-revision progress output.
my $dump_command = "svnadmin -q dump $svn_repo > $bkup_dir/$bkup_file";
print "\nDumping Subversion repo $svn_repo to $bkup_file...\n";
print `$dump_command`;
print "Backing up through revision $youngest...\n";
print "\nCompressing dump file...\n";
# gzip replaces the dump file with $bkup_file.gz.
print `gzip -9 $bkup_dir/$bkup_file`;
my $zipped_file = $bkup_dir . "/" . $bkup_file . ".gz";
print "\nCreated $zipped_file\n";
# Copy the compressed dump to the backup server.
print `scp $zipped_file $bkup_svr_login\@$bkup_svr:/home/backup/`;
print "\n$bkup_file.gz transferred to $bkup_svr\n";

#Test Backup
print "\n---------------------------------------\n";
print "Testing Backup";
print "\n---------------------------------------\n";
print "Downloading $bkup_file.gz from $bkup_svr\n";
print `scp $bkup_svr_login\@$bkup_svr:/home/backup/$bkup_file.gz $tmp_dir/`;
print "Unzipping $bkup_file.gz\n";
print `gunzip $tmp_dir/$bkup_file.gz`;
print "Creating test repository\n";
print `svnadmin create $tmp_dir/test_repo`;
print "Loading repository\n";
print `svnadmin -q load $tmp_dir/test_repo < $tmp_dir/$bkup_file`;
print "Checking out repository\n";
print `svn -q co file://$tmp_dir/test_repo $tmp_dir/test_checkout`;
print "Cleaning up\n";
print `rm -f $tmp_dir/$bkup_file`;
print `rm -rf $tmp_dir/test_checkout`;
print `rm -rf $tmp_dir/test_repo`;
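
The script above prints its progress messages whether or not each command succeeded, and relies on the commands' own error output to make the cron email longer. A stricter variation would stop at the first failing command. The following is a minimal sketch of that idea, assuming a helper named run, which is hypothetical and not part of the original script.

```perl
use strict;
use warnings;

# Run an external command and die with a descriptive message if it
# cannot be started, is killed by a signal, or exits with a non-zero
# status. run() is a hypothetical helper for illustration.
sub run {
    my @cmd = @_;
    my $status = system(@cmd);
    if ($status == -1) {
        die "Could not start '@cmd': $!\n";
    }
    if ($status & 127) {
        die sprintf("'%s' was killed by signal %d\n", "@cmd", $status & 127);
    }
    if ($status >> 8) {
        die sprintf("'%s' exited with status %d\n", "@cmd", $status >> 8);
    }
    return 1;
}

# Demonstration using the running Perl interpreter ($^X) so that the
# example does not depend on svnadmin being installed.
run($^X, "-e", "exit 0");
print "first command succeeded\n";

# A failing command raises an exception that can be caught and reported.
eval { run($^X, "-e", "exit 3") };
print "caught: $@" if $@;
```

Commands that need shell redirection, such as the svnadmin dump and load lines, can be passed to run as a single string so that the shell still handles the > and <.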

Eric Wilhelm has another Subversion backup method that is worth checking out as well. His method dumps a backup every X commits rather than on a fixed schedule. This has some advantages, particularly with large repositories that don’t change very often.
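
To illustrate the general idea of commit-count-based backups, here is a rough sketch. It is not Eric Wilhelm's script. It assumes a post-commit hook, which Subversion calls with the repository path and the new revision number as its first two arguments. The function names, the interval of 100 commits, and the output path are all assumptions chosen for this example. The sketch only prints the svnadmin command it would run.

```perl
use strict;
use warnings;

# Decide whether a commit should trigger a dump, and if so which
# revision range to dump. Returns an empty list when no dump is due.
# $every is a hypothetical setting: dump once per $every commits.
sub dump_range {
    my ($rev, $every) = @_;
    return () if $rev < 1 || $rev % $every != 0;
    my $lower = $rev - $every + 1;
    return ($lower, $rev);
}

# Build the svnadmin command for an incremental dump of that range.
sub dump_command {
    my ($repo, $lower, $upper, $out_dir) = @_;
    return "svnadmin dump -q --incremental -r $lower:$upper $repo"
         . " > $out_dir/dump-$lower-$upper";
}

# When called as a post-commit hook, the repository path and the new
# revision number arrive as the first two arguments.
my ($repo, $rev) = @ARGV;
if (defined $repo && defined $rev) {
    my ($lower, $upper) = dump_range($rev, 100);
    print dump_command($repo, $lower, $upper, "/home/backup_user/backups"), "\n"
        if defined $lower;
}
```

With these assumptions, a commit that creates revision 200 would produce a dump of revisions 101 through 200, while revisions 101 to 199 would produce nothing.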