
Backing up your Linux box with DAR

December 19, 2008

Posted by haskelladdict in Linux.

More than just a handful of people unfortunately don’t do it, despite the fact that it is actually quite straightforward and painless. I am talking, of course, about backing up your important data! Just remember, it takes only a fraction of a second to type rm -f inside the wrong directory at the wrong time. Besides, hard drives are known to give up the ghost unexpectedly. Hence, backing up important data should really be as automatic and natural a task as checking one’s email.

For Linux systems there exists a plethora of good backup solutions. Here, I will explain the use of DAR [1], which makes backing up data a breeze.

Before we get started, we need to decide on a backup strategy. One reasonable choice is to perform a full backup initially, followed by incremental backups at regular intervals (daily, weekly, whatever seems appropriate). Periodically, e.g., every 6 months, another full backup is made, followed by another set of incrementals until the next full backup is due. If you are backing up mission-critical things, daily snapshots are certainly appropriate. Otherwise, weeklies might be sufficient.
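
The schedule above can be sketched as a small shell function that decides which kind of backup is due. This is only an illustration: the function name backup_kind, the 180-day default (roughly the 6 months mentioned above), and the idea of passing in the age of the last full backup are my own assumptions and not something DAR provides.

```shell
# Hypothetical helper: decide whether a full or an incremental backup is due.
# $1 = age of the last full backup in days (empty if there is none yet)
# $2 = maximum age in days before a new full backup is made (default: 180)
backup_kind() {
  local age_days="$1"
  local max_age="${2:-180}"
  if [ -z "$age_days" ] || [ "$age_days" -ge "$max_age" ]; then
    echo "full"
  else
    echo "incremental"
  fi
}
```

A wrapper script could call backup_kind and then run either the full or the incremental dar command accordingly.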

Next, we have to decide what files to back up and where to store them. Let’s say we want to back up all of user Susan’s home directory, /home/susan, and we will keep the backup files on a separate hard-drive (USB or whatever) mounted at /mnt/backup. We start the backup process via

/usr/bin/dar -R /home/susan -c /mnt/backup/susan-home-full -z -s 1000M -P /home/susan/random-stuff

It is pretty easy to understand what is going on. “-R” specifies the root of the directory tree to be backed up; “-c” gives the path and base name of the backup, to which dar appends a slice number and the .dar extension (susan-home-full.1.dar, susan-home-full.2.dar, and so on); “-z” specifies that we want to compress the backup using gzip; “-s” gives the maximum size of the individual backup files, or slices (we certainly don’t want one huge 200G backup file); “-P” excludes certain sub-directories/files from the backup. Please consult [2] for many more options. That’s it, you just backed up your data!
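
Right after creating an archive it doesn't hurt to let dar check it. The sketch below wraps dar's “-t” (test archive) option in a small function. The name check_archive and the DAR variable are assumptions of mine for illustration; only the “-t” option itself comes from dar.

```shell
# Path to the dar binary; can be overridden through the environment.
DAR="${DAR:-/usr/bin/dar}"

# Hypothetical helper: test an archive and report the result.
# $1 = archive base name, e.g. /mnt/backup/susan-home-full (no .1.dar suffix)
check_archive() {
  if "$DAR" -t "$1" > /dev/null; then
    echo "archive OK: $1"
  else
    echo "archive FAILED: $1" >&2
    return 1
  fi
}
```

For the backup above this would be called as check_archive /mnt/backup/susan-home-full.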

In addition, it is very useful to use dar_manager, which is supplied with dar, to generate a database of all files contained in our backups. This will greatly simplify, i.e., automate, the retrieval of data, especially when parts of it are spread over multiple full and/or incremental backup files:

# create database
/usr/bin/dar_manager -C /home/susan/dar_backup_database.dmd 

# append full backup to database
/usr/bin/dar_manager -B /home/susan/dar_backup_database.dmd -A /mnt/backup/susan-home-full

Here, “-B” specifies the path to the database file and “-A” gives the path to our initial full backup.

Now we can simply run incremental backups at whatever time intervals are necessary. To do so, the following simple bash script backup_home.sh might be helpful:

#!/bin/bash

# backup
if [ -z "$1" ]; then
  echo "usage: backup_home.sh <reference archive>" >&2
  exit 1
fi

date=$(date +%F)

# create incremental backup
/usr/bin/dar -R /home/susan -c "/mnt/backup/susan-home-diff-${date}" -A "$1" -z -s 1000M -P /home/susan/random-stuff

# update database
/usr/bin/dar_manager -B /home/susan/dar_backup_database.dmd -A /mnt/backup/susan-home-diff-${date}

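Using this script, doing an incremental backup amounts to typing the following, where the argument is the base name of the previous backup, i.e., its path without the slice number and .dar extension:

./backup_home.sh <path to previous backup>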
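
Since dar wants the base name of the reference archive while the files on disk are slices such as susan-home-full.1.dar, a little helper can save some typing. The following is a minimal sketch; the function names archive_basename and latest_archive are made up for this example, and picking the reference by modification time assumes that the newest archive in the directory is the one you want to build on.

```shell
# Hypothetical helper: turn a slice file name into the archive base name
# expected by dar's -A option, e.g. foo.1.dar -> foo
archive_basename() {
  printf '%s\n' "$1" | sed -E 's/\.[0-9]+\.dar$//'
}

# Hypothetical helper: base name of the most recently modified archive.
# $1 = backup directory, e.g. /mnt/backup
# Prints nothing if the directory contains no archives.
# Note: parses ls output, so it assumes file names without newlines.
latest_archive() {
  local newest
  newest=$(ls -t "$1"/*.1.dar 2>/dev/null | head -n 1)
  if [ -n "$newest" ]; then
    archive_basename "$newest"
  fi
}
```

With these in place, the script could be called as ./backup_home.sh "$(latest_archive /mnt/backup)".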

Voila!

Well, not so fast. Before calling it a day and feeling all “backed up”, we need to make sure that our backups work as intended. Most importantly, we need to make sure that we can actually recover our data from the backup. Here, I’ll explain how to recover a specific set of files from the backup. If you need to recover the whole archive, e.g., after a disk failure, please consult [2] for a very nice step-by-step walk-through. So let’s pretend we lost /home/susan/pics and would like to recover it from our backups. We will recover the files into /tmp/restore via

# set up dar manager for recovery
/usr/bin/dar_manager -B /home/susan/dar_backup_database.dmd -o -R /tmp/restore/ -O -w

# recover data (paths in the database are relative to the -R root used for the backup, here /home/susan)
/usr/bin/dar_manager -B /home/susan/dar_backup_database.dmd -r pics

In the initial setup step, “-o” stores the options that dar_manager will pass on to dar when restoring. Of these, “-R” specifies where to recover the data to (without it dar restores into the current directory, which could replace the original files if run from /home/susan), “-w” tells dar not to warn before overwriting, so that it is ok to overwrite earlier versions of files that dar_manager might pull from the backup, and “-O” tells dar to ignore file ownership. The second step then gets our files back; the files to be recovered are listed after the “-r” switch. If everything went well, you should now have a pristine copy of /home/susan/pics, as of your last backup, inside /tmp/restore. If not, please make sure to investigate what went wrong until you are absolutely sure your backups work. It certainly doesn’t hurt to occasionally check that data recovery works as it should.
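
For such an occasional recovery drill, i.e., while the original data still exists, the restored copy can be compared against the live directory. The sketch below does this with plain diff -r; the function name verify_restore is an assumption of mine. Keep in mind that files changed since the last backup will legitimately show up as differences.

```shell
# Hypothetical helper: compare a restored tree with the original one.
# $1 = original directory, $2 = restored copy
verify_restore() {
  if diff -r "$1" "$2" > /dev/null; then
    echo "restore matches: $2"
  else
    echo "restore differs from $1" >&2
    return 1
  fi
}
```

After the recovery above, this could be run as verify_restore /home/susan/pics /tmp/restore/pics.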

We have only scratched the surface of what DAR can do. If the above is all you need, great. If not, please consult the docs for more info. As usual, man dar and man dar_manager are your friends.

[1] http://dar.linux.free.fr/
[2] http://dar.linux.free.fr/doc/Tutorial.html