----------------------------------------------------------------------------------------
Remote offsite backups for your webserver
Have a Linux web server you need to back up remotely? If you have a high-speed internet connection at home and a spare computer, it's easy and inexpensive to keep an up-to-date copy of your website offsite.
Tools
What you'll need:
- a spare computer. It needs at least one large hard drive, and preferably two large hard drives plus a CD or DVD burner to make archives.
- a high-speed internet connection at home with some spare bandwidth
- a copy of a Linux distribution such as Mandriva or Ubuntu
The concept
What we're attempting to create is a system that keeps a complete backup copy of your webserver offsite. We'll do that by 'syncing' a computer at your home with your actual webserver. This synchronization will be done once each day, typically in the early hours of the morning when there's little traffic on your home connection or your webserver. To keep traffic light we'll copy only the files that changed each day, and leave out non-vital files such as log files.
That gives us a backup. However, we also need to be concerned about the backup going bad somehow (perhaps through bad media), and about keeping archived copies in case we need to retrieve something from a week or a month ago. We'll do that by creating a date-stamped, compressed copy of the entire backup on a separate drive on the home computer.
At regular intervals we'll then archive these compressed backups to either CD or DVD and remove them from our computer to prevent the hard drive from filling up.
The flow of the backups and archives is as follows:
Webserver-
->Offsite computer drive 1
-->gets compressed and a copy put on offsite computer drive 2
-->occasionally archives are pulled from offsite computer drive 2 and burned to CD or DVD
Setup of the home computer
For the purposes of this tutorial I'm going to assume you have a computer with one smaller hard drive, sufficient to keep a copy of your webserver, and one larger hard drive to hold older copies of the backups. The computer can be relatively low end; even an early Pentium 4 will be more than sufficient.
First, install your favourite Linux distribution on your home computer. Ensure you have 'rsync' installed, ssh running, and software to burn your CDs. Most modern Linux distributions will have all three of these as readily available options.
Install Linux on the first, smaller drive. Make sure you leave the bulk of the space set up as a separate partition mounted at '/backup'. The second drive should be partitioned separately and mounted at '/backup2'.
Allowing the remote computer to login
Each night the remote webserver will need to log in to the home computer automatically via ssh, without a password. We need to allow this passwordless login explicitly on the home computer.
Rather than retyping this common task, here's a quick tutorial telling you how to set this up:
http://www.linuxproblem.org/art_9.html
You should now be able to log in from your webserver to the home backup computer without entering a password.
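For readers who prefer to see the steps in one place, here is a minimal sketch of a key-based setup run on the webserver. It is not taken from the linked tutorial: the helper name make_backup_key, the key path and the user name 'backupuser' are assumptions made for illustration, and 256.256.256.256 is the same placeholder address used throughout this article.

```shell
#!/bin/sh
# Sketch only: KEY_FILE defaults to a standard OpenSSH identity path.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_ed25519}"

# make_backup_key (hypothetical helper)
# Creates a key pair with an empty passphrase unless one already exists.
make_backup_key() {
    if [ ! -f "$KEY_FILE" ]; then
        mkdir -p "$(dirname "$KEY_FILE")"
        ssh-keygen -q -t ed25519 -N "" -f "$KEY_FILE"
    fi
}

# On the webserver you might then run something like:
#   make_backup_key
#   ssh-copy-id -i "$KEY_FILE.pub" backupuser@256.256.256.256
#   ssh backupuser@256.256.256.256 true    # should return without a password prompt
```

The ssh-copy-id step asks for the home computer's password one last time. After that, the final ssh command should succeed silently.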
Running the offsite backup
Now we need to create a backup script on the webserver to backup all the files that have changed that day. Using your favourite editor, create a file and enter the following:
rsync -e ssh -azv /directory-to-backup 256.256.256.256:/backup/webserver --exclude '/path-to-log-files' --delete-excluded
Where /directory-to-backup is the directory on your webserver to back up, 256.256.256.256 is a placeholder for the IP address of your home computer, and /path-to-log-files is the directory containing the log files that we don't want to back up.
The --delete-excluded option deletes any files from your home computer that have been removed on your webserver, and also deletes any files there that match an exclude pattern. You can add additional '--exclude' paths to prevent backing up additional directories.
If you want to remotely back up additional directories you can duplicate the above line in the file as many times as you need, changing only the appropriate '/directory-to-backup'. For example you might want to add this as a second line in the file to back up your mysql tables:
rsync -e ssh -azv /var/lib/mysql 256.256.256.256:/backup/webserver --delete-excluded
Save the file as 'remote-backups.sh' and make it executable with 'chmod +x remote-backups.sh'.
Now we need to execute this file once a day. I run mine at 3:00 a.m. each night. On the webserver type the command 'crontab -e' to edit your crontab and add the following line:
0 3 * * * /path-to-backup-script/remote-backups.sh
That tells the webserver to run the remote backup each night at 3:00 a.m. It will log on to the home computer over ssh (so the transfer is encrypted) and, using compression, copy over any files changed since the last backup.
To test it, you can run the script from the Linux command prompt by entering './remote-backups.sh'. Caution: the first time you run this it will copy over all the files, which will take quite some time.
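Before that first full run, it can be reassuring to preview what rsync would transfer. The sketch below wraps the same rsync options in a small function and adds rsync's --dry-run flag on request. The function name run_backup and the DRY_RUN and RSYNC variables are assumptions introduced here, not part of the original script.

```shell
#!/bin/sh
# Placeholders from the tutorial; replace them with your own values.
DEST_HOST="256.256.256.256"
DEST_DIR="/backup/webserver"

# run_backup SRC [EXCLUDE]   (hypothetical helper)
# Syncs SRC to the home computer, optionally skipping EXCLUDE.
# Set DRY_RUN=1 to make rsync list its actions without changing anything.
# RSYNC may name a different command to run; it defaults to rsync.
run_backup() {
    src=$1
    exclude=${2:-}
    set -- -e ssh -azv --delete-excluded
    if [ -n "$exclude" ]; then
        set -- "$@" --exclude "$exclude"
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        set -- "$@" --dry-run
    fi
    ${RSYNC:-rsync} "$@" "$src" "$DEST_HOST:$DEST_DIR"
}
```

With DRY_RUN=1 set, calling run_backup /directory-to-backup /path-to-log-files prints the list of files rsync would send and leaves the home computer untouched. Clearing DRY_RUN and calling it again performs the real copy.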
There! You should now have a complete copy of any vital files from your webserver on your home computer.
Creating archives
Next we need to protect ourselves against a bad backup. To do this we create archives or compressed copies of the backup on a physically different hard drive. This is the /backup2 hard drive we set up on the remote home computer.
This is fairly straightforward. We simply need to create a file on the home computer called 'archive.sh' (made executable with 'chmod +x archive.sh') with the following lines in it:
#!/bin/sh
stamp=$(date +%Y%m%d)    # read the date once so both commands use the same name
cd /backup || exit 1
tar -czf "${stamp}webserver-backup.tgz" webserver
mv "${stamp}webserver-backup.tgz" /backup2
This script creates a compressed copy of the most recent backup in a file whose name contains the date. If the first file is created on December 1 2007, the compressed file will be called 20071201webserver-backup.tgz. The file created on the next day will be 20071202webserver-backup.tgz.
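Since the whole point of the archive is to guard against a bad backup, one possible refinement is to check that each archive is readable before giving it its final name. The following is a sketch under that assumption, not a replacement the tutorial requires. The function names archive_name and make_archive, the '.part' suffix, and the BACKUP_ROOT and ARCHIVE_DIR variables are all introduced here for illustration.

```shell
#!/bin/sh
# Defaults match the directories used in this tutorial.
BACKUP_ROOT="${BACKUP_ROOT:-/backup}"
ARCHIVE_DIR="${ARCHIVE_DIR:-/backup2}"

# archive_name [STAMP]   (hypothetical helper)
# Prints the archive file name for a YYYYMMDD stamp, defaulting to today.
archive_name() {
    printf '%swebserver-backup.tgz\n' "${1:-$(date +%Y%m%d)}"
}

# make_archive [STAMP]   (hypothetical helper)
# Writes the archive under a temporary name, lists it back with tar to
# confirm it can be read, and only then renames it to its final name.
make_archive() {
    name=$(archive_name "${1:-}")
    tar -czf "$ARCHIVE_DIR/$name.part" -C "$BACKUP_ROOT" webserver &&
    tar -tzf "$ARCHIVE_DIR/$name.part" >/dev/null &&
    mv "$ARCHIVE_DIR/$name.part" "$ARCHIVE_DIR/$name"
}
```

Calling make_archive with no argument uses today's date. A file still ending in '.part' in the archive directory indicates a run that did not finish cleanly.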
Now we need to create a cron job on the home computer to run this script. I like to run this script at 8:00 a.m. each day to ensure that the backup from the webserver has completed. Simply type in 'crontab -e' and add the following line:
0 8 * * * /path-to-archive.sh/archive.sh
That's it! After a few days you should have a current uncompressed copy of your webserver's files located on your home computer at /backup/webserver. On the second drive you should have multiple compressed copies of the entire backup in filenames like this:
20061201webserver-backup.tgz
20061202webserver-backup.tgz
20061203webserver-backup.tgz
so you can retrieve files from your backup for any specific day.
One point we still have to address is what to do when the /backup2 drive gets full. At this point I simply burn copies of the compressed backups to a CD or DVD, then delete all the files in /backup2.
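If you would rather clear out only the older archives after burning them, the date in each file name makes them easy to select. Here is a minimal sketch, assuming the naming scheme shown above. The function name archives_before and the ARCHIVE_DIR variable are hypothetical.

```shell
#!/bin/sh
ARCHIVE_DIR="${ARCHIVE_DIR:-/backup2}"

# archives_before CUTOFF   (hypothetical helper)
# Prints every archive in ARCHIVE_DIR whose YYYYMMDD stamp is earlier
# than CUTOFF. Files that do not follow the naming scheme are ignored.
archives_before() {
    cutoff=$1
    for f in "$ARCHIVE_DIR"/*webserver-backup.tgz; do
        [ -e "$f" ] || continue
        base=${f##*/}
        stamp=${base%webserver-backup.tgz}
        case $stamp in
            [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
            *) continue ;;
        esac
        if [ "$stamp" -lt "$cutoff" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, archives_before 20061203 would list the archives from December 1 and 2, 2006. Once those are safely on disc, the same list can be passed to rm.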
That's it - you now have a remote backup for your Linux webserver for only the cost of an inexpensive older computer.
----------------------------------------------------------------------------------------