0. Naming conventions for this tutorial
We'll call the server that holds the backups "back.example.com", or just "back", and the production server whose files you want to save "prod.example.com", or just "prod". Something like this:
prod# echo test
shows that you need to type "echo test" on your production server.
Note that this howto is written for Debian; there might be some differences on another OS.
1. Why automysqlbackup and dirvish?
Doing a backup over FTP works well, and DTC helps by generating a backup script that is started every day/week/month. While this is easy to use, it always re-sends things that have already been saved, and it doesn't keep a history (this is not an incremental backup).
For this reason, we recommend that advanced users use dirvish and automysqlbackup instead. The resulting backup has a few days of history for all files, and a daily, weekly and monthly backup of all your databases. It also performs much better (no need for CPU-intensive tasks like creating tar.gz archives), saves a lot of bandwidth because things already on the backup server are not sent again, and is secure (it runs over ssh). The setup below is also well suited to backing up a server that uses DTC.
2. Setting-up automysqlbackup on your server
Using rsync for a backup works well for everything BUT MySQL databases, because copying the data files of a running MySQL server can give an inconsistent copy. This is why you should use automysqlbackup. This software is of the "install and forget" kind, and it is extremely easy to use (it works without any configuration).
Because we loved the simplicity of automysqlbackup, GPLHost created a Debian package for it. It is available only in Squeeze and SID (it's not in Lenny), so if you use Lenny, you might want to add one of the mirrors of GPLHost (which you must have already done if you use DTC).
To install automysqlbackup, just do:
prod# apt-get install automysqlbackup
By default it will save all of your MySQL databases in /var/lib/automysqlbackup. If you want to change this, for example if you want the backup to be in /var/www so it will later be saved with the rest of the files in /var/www/sites, you can edit /etc/default/automysqlbackup.
Note that by default, automysqlbackup will generate one backup every day for the last week, one backup every week for the last month, and one backup every month for the last year. It appears to me that this is enough for most situations, unless you need dumps every hour (in which case you can always change how often automysqlbackup is started, and that will be it...).
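As a quick sanity check after the first night, something like the following could confirm that fresh dumps exist. This is only a sketch: the helper name recent_dumps is made up for illustration, and it assumes the dump files sit somewhere below the backup directory and have ".sql" in their name (for example mydb.sql.gz).

```shell
#!/bin/sh
# Hypothetical helper (not part of automysqlbackup): count the dump files
# below a backup directory that were modified within the last 24 hours.
# Assumption: dump file names contain ".sql" (e.g. mydb.sql.gz).
recent_dumps() {
    dir=${1:-/var/lib/automysqlbackup}
    find "$dir" -type f -name '*.sql*' -mtime -1 | wc -l
}

# Example: warn if nothing was dumped since yesterday.
# [ "$(recent_dumps)" -gt 0 ] || echo "no recent MySQL dump found" >&2
```

If you changed the backup directory in /etc/default/automysqlbackup, pass the new path as the first argument.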
3. What is the principle of Dirvish
Dirvish uses rsync to do its backups. On the backup server, you install dirvish and rsync; on the server you wish to back up, you only need rsync. I stress it again: DO NOT install dirvish on the server you wish to back up, this is not needed.
Dirvish uses "vaults" where files are stored. These are in fact just backup repositories containing both the configuration/definition of the server you want to keep a backup of, and the files themselves.
4. Installing software and creating the dirvish user on the production server
First of all, install ssh, rsync and sudo:
prod# apt-get install ssh sudo rsync
Then you need to create a username so that the backup server can login and use rsync. We always use "dirvish" as username, but you could use virtually anything. Then you have to configure /etc/sudoers to allow the "dirvish" user to use rsync as root. Here's how:
prod# adduser --disabled-password --gecos "Dirvish backup user" dirvish
prod# echo "dirvish ALL = NOPASSWD: /usr/bin/rsync" >>/etc/sudoers.d/dirvish_backup
We set up the user with --disabled-password because the backup server will log into the production one with ssh keys only.
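If you prefer to script this step, a small helper along these lines could write the rule with the restrictive permissions sudo expects for files in /etc/sudoers.d. It is a sketch under assumptions: the function name write_sudoers_rule is hypothetical, and it overwrites the target file instead of appending to it.

```shell
#!/bin/sh
# Hypothetical helper: write the rsync sudo rule for a user into a file
# and make that file read-only for owner and group (mode 0440).
write_sudoers_rule() {
    target=$1   # e.g. /etc/sudoers.d/dirvish_backup
    user=$2     # e.g. dirvish
    printf '%s ALL = NOPASSWD: /usr/bin/rsync\n' "$user" > "$target" || return 1
    chmod 0440 "$target"
}

# Example (as root on the production server):
# write_sudoers_rule /etc/sudoers.d/dirvish_backup dirvish
# visudo -cf /etc/sudoers.d/dirvish_backup   # optional syntax check
```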
5. Setting-up ssh keys
As root user on the backup server, create a pair of private and public ssh keys:
back# ssh-keygen -t rsa
When asked for the path, just type enter. When asked for a passphrase, enter an empty passphrase (just type enter), because nobody will be at the keyboard to type it. Do not worry, this is still safe as long as the private key stays readable by root only on the backup server. Next, you have to copy the public key of the backup server to the production server. Just read the public key on the backup server with:
back# cat /root/.ssh/id_rsa.pub
Then set up the ssh directory and key file for the dirvish user on the production server. Something like this:
prod# mkdir /home/dirvish/.ssh
prod# chown dirvish:dirvish /home/dirvish/.ssh
prod# chmod 700 /home/dirvish/.ssh
prod# echo "ssh-rsa <your-ssh-key>" >/home/dirvish/.ssh/authorized_keys2
prod# chown dirvish:dirvish /home/dirvish/.ssh/authorized_keys2
prod# chmod 600 /home/dirvish/.ssh/authorized_keys2
Here, "ssh-rsa <your-ssh-key>" stands for the content of the ssh public key (see above), which you copy and paste into your newly created file "authorized_keys2".
Note that the address at the end of the key (usually user@host) is only a comment and is not needed for authentication, so you can leave it out. Next, simply try, from the backup server and as root, to log in as the dirvish user on your production server:
back# ssh dirvish@prod.example.com
It should print a warning about the fingerprint of the sshd of your production server. Just type "yes", and normally you'll get in without typing a password. You should then be connected to your production server from the backup server as user "dirvish".
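Section 5 suggests leaving the trailing address off the key when pasting it. A minimal sketch of how that could be done on the command line follows; the helper name strip_key_comment is hypothetical, and it assumes a plain public key line with no options in front of the key type.

```shell
#!/bin/sh
# Hypothetical helper: keep only the key type and the base64 key data of
# a public key line read from stdin, dropping the trailing comment.
# Assumption: the line has no options before the key type.
strip_key_comment() {
    awk 'NF >= 2 { print $1, $2 }'
}

# Example (on the backup server):
# strip_key_comment < /root/.ssh/id_rsa.pub
```

The output is the line to paste into the file on the production server.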
6. Setting-up Dirvish on your backup server
If you haven't done it yet, it is time to install dirvish:
back# apt-get install dirvish
First, you have to write the main configuration file of dirvish. Here's an example of what should be in /etc/dirvish/master.conf on your backup server:
bank:
    /var/backup
index: text
exclude:
    lost+found/
    /proc
    /sys
    /etc/mtab
    /var/cache/apt
    core
    *~
    .nfs*
    /tmp/*
Runall:
    prod.example.com
image-default: %Y%m%d
expire-default: +9 days
rsh: ssh -o "BatchMode yes" -o "StrictHostKeyChecking no"
I believe that the above is self-explanatory, but here are a few comments on it anyway. The "bank:" directive tells dirvish where you will store your backup files. "Runall:" lists all the folders in /var/backup that are "vaults", meaning that here, /var/backup/prod.example.com will be where your production server files are saved. Of course, in "Runall:" you can list more than a single vault. "image-default:" names each backup after its date, and "expire-default:" keeps each backup for 9 days. Now, you have to configure your vault. First create the folder:
back# mkdir -p /var/backup/prod.example.com/dirvish
Here's an example configuration file. In this example, it is stored in /var/backup/prod.example.com/dirvish/default.conf:
rsync-client: sudo rsync
client: dirvish@prod.example.com
tree: /
exclude:
    + /etc/mlmmj
    + /var/www
    /*/
    /*
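This tells dirvish which user to log in as, and which command to use for rsync.
Notice that the tree is set to / and that the exclude list then excludes everything but the folders marked with +. This is because the tree option doesn't allow multiple folders per vault. The + means to include the folder. You must specify the includes before the excludes. This is a work-around that lets you back up all your DTC settings in one backup. Be aware that rsync only descends into a folder whose parent is not excluded, so a nested path such as /etc/mlmmj may also need rules for its parent (for example "+ /etc/" before it and "/etc/*" after it).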
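Since every vault listed in "Runall:" needs its own dirvish/default.conf, a small pre-flight check along these lines can catch a missing or misplaced file before the first run. This is a sketch: the function name check_vaults is hypothetical, and the vault names are passed by hand, not parsed from master.conf.

```shell
#!/bin/sh
# Hypothetical pre-flight check: every vault named on the command line
# must have a dirvish/default.conf below the bank directory.
check_vaults() {
    bank=$1
    shift
    status=0
    for vault in "$@"; do
        if [ ! -f "$bank/$vault/dirvish/default.conf" ]; then
            echo "missing config for vault: $vault" >&2
            status=1
        fi
    done
    return $status
}

# Example (on the backup server):
# check_vaults /var/backup prod.example.com
```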
If you have commands to run before or after the backup, you can use the pre-client, post-client, pre-server and post-server commands. pre-client will run on the production server, while pre-server will run on the backup server. This could be done like this, in default.conf:
pre-client: /etc/init.d/mysql stop
post-client: /etc/init.d/mysql start
Note that this is just an example, of course you don't really want to shutdown your MySQL server during the backup operation that could take a while...
You might want to use the pre-client command as another way to back up your whole MySQL database. In default.conf:
pre-client: /etc/dirvish/sqldump-zip.sh
And in /etc/dirvish/sqldump-zip.sh on the production server:
#!/bin/sh
#Remove the last backup
rm -f /var/www/MyDatabaseBackup.zip
#Dump the entire database
/usr/bin/mysqldump -A -a -e -pMyPassword > /var/www/MyDatabaseBackup.sql
#Compress the sql file (saves about 80% of space)
/usr/bin/zip /var/www/MyDatabaseBackup.zip /var/www/MyDatabaseBackup.sql
#Remove the redundant sql file
rm -f /var/www/MyDatabaseBackup.sql
Notice that putting the zip file in /var/www will cause it to be backed up when your back up runs.
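A possible variation on the script above is sketched below. It rests on assumptions that are not part of the original setup: the MySQL credentials are read from the calling user's ~/.my.cnf (so the password does not appear on the command line), gzip is used in place of zip, and the function name dump_all_databases is hypothetical. The old archive is only replaced once the new dump has succeeded.

```shell
#!/bin/sh
# Sketch of an alternative dump script.
# Assumptions: credentials come from ~/.my.cnf, gzip replaces zip.
dump_all_databases() {
    out=$1                  # e.g. /var/www/MyDatabaseBackup.sql.gz
    tmp=$out.tmp.$$
    # Dump first and compress second, so that a failed dump is noticed.
    if mysqldump --all-databases > "$tmp" && gzip -f "$tmp"; then
        mv "$tmp.gz" "$out" # replace the previous archive in one step
    else
        rm -f "$tmp" "$tmp.gz"
        return 1
    fi
}

# Example call, as the last line of the pre-client script:
# dump_all_databases /var/www/MyDatabaseBackup.sql.gz
```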
7. Running the first backup by hand
Dirvish requires you to do the first backup "by hand". This is very easy, here's an example:
back# dirvish --vault prod.example.com --init
Note that because the first backup will take a long time, it is highly recommended to run this first dirvish backup in a "screen" session, to avoid losing it if your ssh connection drops during the backup.
That is it, everything is ready. From now on, your backup server will log into your production server every day to do a backup using rsync over ssh. There's nothing more you have to do for this setup! But what about the SQL dbs we talked about before? Well, if you configured automysqlbackup to do dumps in /var/www/automysqlbackup like I suggested earlier, they will be included in the dirvish backup as well, so no worries here.
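To confirm that a run really produced an image, you could look for the newest dated folder in the vault. The sketch below assumes the image names follow the %Y%m%d pattern set by image-default in master.conf; the function name latest_image is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the newest image in a vault.
# Assumption: images are named with 8 digits (image-default: %Y%m%d).
latest_image() {
    vault_dir=$1
    ls -1 "$vault_dir" | grep -E '^[0-9]{8}$' | sort | tail -n 1
}

# Example (on the backup server):
# latest_image /var/backup/prod.example.com
```

An empty result means no image has been created in that vault yet.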
8. Setting up a CRON job
If you installed Dirvish with apt-get on Debian, then a cron job has been created for you automatically.
A cron job should call:
/usr/sbin/dirvish-expire --quiet && /usr/sbin/dirvish-runall --quiet
Note, if you installed Dirvish using the install script in the tar file (from the Dirvish website) then the dirvish executables may be located in /usr/local/sbin.
Make sure that /usr/local/sbin is in the path when you run the cron job. dirvish-runall calls dirvish from the path. If dirvish-runall can't find dirvish your job won't be run but no error message is produced.
A suggested script to run both commands and set the path might be:
#!/bin/sh
PATH=$PATH:/usr/local/sbin/
/usr/local/sbin/dirvish-expire --quiet
/usr/local/sbin/dirvish-runall --quiet
To run the job every night (here at 22:04, using the /etc/crontab format) you might use:
4 22 * * * root /etc/dirvish/dirvish-cronjob
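Because a missing dirvish on the path produces no error message, the cron script could check for it explicitly. The following is a sketch of that idea; the helper name require_cmd is hypothetical, and the commented lines show where the real calls would go.

```shell
#!/bin/sh
# Sketch of a cron wrapper that fails loudly if dirvish is not on the PATH.
PATH=$PATH:/usr/local/sbin:/usr/sbin
export PATH

# Hypothetical helper: succeed only if the named command can be found.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "error: $1 not found in PATH ($PATH)" >&2
        return 1
    }
}

# In the real cron script, the checks and calls would follow:
# require_cmd dirvish || exit 1
# dirvish-expire --quiet
# dirvish-runall --quiet
```

Cron mails anything written to stderr to the job's owner, so the error would no longer go unnoticed.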
9. Restoring from the backup
Having a backup but not knowing how to restore is stupid, right? Well, with dirvish, it's extremely simple. Let me show you:
back# ls /var/backup/prod.example.com
20100530 20100601 20100603 20100605 20100607 dirvish
20100531 20100602 20100604 20100606 20100608
Each day, you will have a folder as above, with the date as timestamp. Then in each folder, you have something like this:
back# ls /var/backup/prod.example.com/20100608
index log summary tree
Above, only "tree" is a folder, the others are just files you can ignore. In /var/backup/prod.example.com/20100608/tree, you have a complete copy of all backed-up files, so restoring simply means copying what you need back from there. If a file has not changed since the backups of the previous days (eg: it's been there unchanged for 10 days), then the file will in fact be a hardlink. This means the file takes up storage space only ONCE for all 9 days of history, even though it appears in each individual daily backup as if it were a normal file.
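You can verify the hardlink behaviour yourself by comparing inode numbers: two paths with the same inode are the same file on disk. The sketch below uses a hypothetical helper name (inode_of), and the file path in the example is only an illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the inode number of a file.
inode_of() {
    ls -di "$1" | awk '{ print $1 }'
}

# Example (on the backup server; index.html is an illustrative path):
# a=/var/backup/prod.example.com/20100607/tree/var/www/index.html
# b=/var/backup/prod.example.com/20100608/tree/var/www/index.html
# [ "$(inode_of "$a")" = "$(inode_of "$b")" ] && echo "stored once"
```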
10. Last word
This howto didn't cover the backup of /etc/mlmmj that you would need to backup as well if you run DTC.