Backups in general, and SVN backups in particular, are critical for disaster recovery. There are a few requirements to consider:
- Backups should mirror (or come close to) the current state of the system, i.e. be a snapshot.
- Backups should be incremental. If a snapshot is malformed, or the system was corrupt when the snapshot was taken, one can roll back to a previous working snapshot.
- Back up only what you need to survive. There is no sense in including temporary files, libs that are part of the distribution (and easily downloadable), etc.
Lower-level note: I don’t have the resources for LVM snapshots, and the physical disks are RAID 6.
Considerations
Originally, I was backing up my uberSVN installation and all of the SVN repositories using a simple rdiff-backup command. This approach is shortsighted: an rsync-style copy of the internal repositories directory does not account for commits, prop changes, hooks, etc. that occur while the copy is running, which could leave a non-restorable backup. Using svnadmin hotcopy addresses this concern. However, hotcopy does not perform an incremental backup for you, so I needed to couple it with rdiff-backup. It is worth noting that this type of copy includes props, hooks, commits and other information. That makes it more comprehensive than a normal svnadmin dump, with the downside that it is only truly compatible with the version of SVN that generated it.
As if this wasn’t enough to think about, hotcopy does not preserve file system timestamps. This is a problem for rdiff-backup, which decides whether a file has changed from its timestamp and file size; even though it uses the same underlying algorithm as rsync, AFAIK it does not support rsync’s checksum feature for transfers. So after the svnadmin hotcopy is performed, file attributes should be synchronized from the live repositories as well (with slight risk, I might add).
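The timestamp fix can be tried in isolation. The following is a minimal sketch, not the exact code from the script: it uses a throwaway directory from mktemp, a plain cp -R standing in for hotcopy (both leave the copy with fresh timestamps), and a hypothetical helper named sync_times.

```bash
#!/bin/bash
# Sketch: restore modification times on a copy that lost them.
# Everything lives in a throwaway directory; all names here are illustrative.
work=$(mktemp -d)
mkdir -p "$work/src/db" "$work/copy"

# A stand-in "repository" file with an old modification time (2020-01-01 00:00).
echo "revision data" > "$work/src/db/current"
touch -t 202001010000 "$work/src/db/current"

# cp -R without -p stamps the copy with the current time.
cp -R "$work/src/db" "$work/copy/db"

# For every entry in the copy, take the timestamps of the same relative path
# in the source tree (touch -r copies times from a reference file).
sync_times() {
    local src="$1" dest="$2"
    ( cd "$dest" && find . -exec sh -c 'touch -r "$1/$2" "$2"' _ "$src" {} \; )
}

sync_times "$work/src" "$work/copy"
```

After sync_times runs, the copied file is neither newer nor older than its source, so a timestamp and size comparison treats an unchanged file as unchanged. The slight risk mentioned above remains: a file that really did change without changing size would also be given the old timestamp.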
Lastly, uberSVN has a handful of configurations and metadata that must be backed up per installation. Its GUI has an easy backup feature, but there is no CLI access/equivalent that I could find. I’m sure I could dig through the source and figure out exactly what was included in the backup, but I decided to reverse-engineer the backup file (which uses ZIP compression, by the way) and infer other details. uberSVN includes (or should include) the following directories from its install directory (default is /opt/ubersvn): /opt/ubersvn/{conf,openssl,tomcat,ubersvn-db}. From this, a backup archive can be carefully constructed for later restoration within the GUI.
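The brace notation in the path above is shell shorthand for four separate directories. A small sketch, assuming the default install directory; the variable names are made up for illustration, and nothing here says how the GUI's archive is laid out internally.

```bash
#!/bin/bash
# Assumption: uberSVN lives in its default install directory.
installDir="/opt/ubersvn"

# Brace expansion is purely textual, so this works even if the paths do not exist.
backupDirs=( "$installDir"/{conf,openssl,tomcat,ubersvn-db} )

# One path per line.
printf '%s\n' "${backupDirs[@]}"
```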
The Implementation (tl;dr)
This is a bash script; the first two groups of variables are the configurable part. It also takes an exclusive lock with flock (the lock file lives under /var/lock/subsys), so two runs cannot overlap, which is helpful when running it from a crontab. I haven’t extensively tested its reliability, but it does work… generally speaking. Here’s the hack:
```bash
#!/bin/bash

remotePath="/home/backup/ubersvn-repos";
remotePort=22;
remoteUser="backup";
remoteHost="example.com";

targetPrefix="/tmp/svnbackup"; # no trailing slash, where to put it temporarily
repoPrefix="/opt/ubersvn/repositories/"; # with trailing slash, contains all the repos
svnadmin="/opt/ubersvn/bin/svnadmin"; # path to ubersvn's svnadmin binary
rdb="/usr/bin/rdiff-backup"; # path to rdiff-backup binary
lockPath="/var/lock/subsys/backup-svn.sh.lock"; # path to store the lock file
extraPathsPrefix="/opt/ubersvn"; # no trailing slash, $extraPaths will be postfixed
read -r -d '' extraPaths <<EOT
conf
openssl
tomcat
ubersvn-db
EOT

# prepare repo hotcopy paths
timestamp=$(date +"%s");
targetTemp="${targetPrefix}_${timestamp}";
repoList=$(find "$repoPrefix" -mindepth 1 -maxdepth 1 -type d); # direct subdirectories only

(
    flock -x -w 10 200 || exit 1 # wait up to 10 seconds for the lock, then give up

    # hotcopy all svn repos (repo names containing whitespace are not supported)
    for repoPath in $repoList; do
        repo=$(basename "$repoPath");
        tempRepoPath="${targetTemp}/repositories/${repo}";
        mkdir -p "$tempRepoPath";
        "$svnadmin" hotcopy --clean-logs "$repoPath" "$tempRepoPath";
    done;

    # sync file attributes (i.e. timestamps) to prepare incremental backup, this is kinda dirty
    cd "${targetTemp}/repositories/" || exit 1; # never run the find below from the wrong directory
    find . -exec touch -r "$repoPrefix"\{\} \{\} \;
    cd - >/dev/null 2>&1;

    # copy extra paths (e.g. ubersvn configs)
    for extra in $extraPaths; do
        /bin/cp -fRp "${extraPathsPrefix}/${extra}" "${targetTemp}/${extra}"; # preserve attributes
    done;

    "$rdb" --remote-schema "ssh -C -p $remotePort %s" "$targetTemp" "$remoteUser"@"$remoteHost"::"$remotePath"; # incremental backup

    rm -fr "$targetTemp"; # clean up
) 200>"$lockPath";
```
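The locking wrapper around the main body of the script can also be tried on its own. This is a sketch under a few assumptions: the lock file is a throwaway from mktemp rather than a path under /var/lock/subsys, run_locked is a hypothetical helper name, and flock is called with -n (fail immediately) instead of the script's -w 10 so the demonstration does not sit waiting for ten seconds.

```bash
#!/bin/bash
# Sketch of the lock pattern: hold an exclusive flock on file descriptor 200
# for the lifetime of a subshell. The lock path is a throwaway temp file.
lockPath=$(mktemp)

run_locked() {
    (
        # -n: fail right away instead of waiting if someone else holds the lock
        flock -x -n 200 || exit 1
        "$@"
    ) 200>"$lockPath"
}

# The outer call takes the lock; the nested call runs while it is still held,
# opens the lock file again, and is refused.
run_locked run_locked true
nested=$?

# Once the outer subshell has exited, the lock is free again.
run_locked true
later=$?

echo "nested=$nested later=$later"
```

The lock is released automatically when the subshell exits and file descriptor 200 is closed, so a crashed run does not leave a stale lock behind even though the lock file itself stays on disk.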