CYA - Conserve Your Assets, with rsync

Some time back (actually quite a while back) I wrote a series of articles called the Windows to Linux roadmap. Now that I'm editor of the Linux site on developerWorks, I have to look at these things from a different perspective, and it is bittersweet to watch them age. Ubuntu, which is my primary environment now, wasn't around at that time. Tools have also come along to make management easier; back then, Webmin was really the only consistent tool I could find. (Webmin is still around, by the way, and I might still consider it if I were managing servers and needed to share management with people who didn't have a strong Linux background.)

One of the articles I was looking over today was the one on doing backups. In 2003 the backup landscape was pretty dismal, at least from where I could see it. Were I to write that article today I would have more tools to discuss, my favorite being rsync. Rsync was actually around when I wrote the articles, but it was one of those resources that lurked in the shadows, like so many little tools do. Essentially, rsync is designed to duplicate files, and it tries to be as efficient as possible by transferring only the delta (the changes) in files when it can. It has a number of options and can be set up to do transfers over the network and through encrypted tunnels if desired. I wrote a little script that I run manually whenever I wish to do a backup, though I could run it automatically if I chose, and probably should.

The script does a backup to my local USB drive and also does a dump to a network machine through an encrypted tunnel. That machine could be anywhere as long as I can reach it over the network; I access it through an Internet address, so it works when I'm on the road as well. I also use key-based authentication in ssh.
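The script itself isn't reproduced here, so what follows is only a minimal sketch of a script along those lines, not my actual one. The source directory, the USB mount point, the remote host (backup.example.com), the exclude-file location, and the CYA_RUN switch are all placeholders invented for illustration. To be safe, the sketch only prints the rsync commands unless CYA_RUN=1 is set.

```shell
#!/bin/sh
# cya.sh: sketch of a two-destination rsync backup.
# Every path and the host name below are placeholders (assumptions).
SRC="${SRC:-$HOME/}"                      # trailing slash: copy the contents of the directory
USB_DEST="${USB_DEST:-/media/usbdrive/backup/}"
REMOTE_DEST="${REMOTE_DEST:-me@backup.example.com:backup/}"
EXCLUDES="${EXCLUDES:-$HOME/.cya-exclude}"

# Print the command by default; execute it only when CYA_RUN=1.
run() {
    if [ "${CYA_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s ' "$@"
        printf '\n'
    fi
}

# Local copy: -a preserves permissions, times and symlinks; -v lists files.
run rsync -av --exclude-from="$EXCLUDES" "$SRC" "$USB_DEST"

# Remote copy: -z compresses data in transit; -e ssh sends it through ssh
# (key-based login is assumed, so no password prompt).
run rsync -avz -e ssh --exclude-from="$EXCLUDES" "$SRC" "$REMOTE_DEST"
```

To run something like this unattended, a crontab line such as `30 2 * * * CYA_RUN=1 $HOME/bin/cya.sh` would do, where the script path is again hypothetical.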

The --exclude-from parameter lets me set up a file containing paths (with wildcards) that I do not want to back up: things like the Trash, cache files, and so on.
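For illustration, an exclude file might look something like the one this snippet writes. The file name and the patterns are assumptions based on a typical Ubuntu home directory, not a copy of my own list. rsync reads one pattern per line and ignores blank lines and lines beginning with #.

```shell
#!/bin/sh
# Write a sample exclude list. File name and patterns are hypothetical.
EXCLUDES="${EXCLUDES:-./cya-exclude.txt}"

cat > "$EXCLUDES" <<'EOF'
# Desktop trash folder (its location varies by desktop environment)
.local/share/Trash/
# Application caches, wherever a directory of this name appears
.cache/
# Editor backup files
*~
EOF

echo "Wrote exclude list to $EXCLUDES"
```

A trailing slash restricts a pattern to directories, and a pattern without a leading slash can match at any depth in the tree.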

The first backup is a bear because it has to transfer all of the data. After that it's easier because it only addresses changes. Of course, one problem with this is that it doesn't take file deletions into account. Rsync can do that, but I found that it defeated the purpose of the backup if I was trying to recover files that I'd deleted accidentally. So I set up another script, which I call cya-purge.sh, that handles that sort of clean-up. I run it periodically, when I'm pretty sure that I don't have something I need to restore.

The second script is identical to the first, except for the --delete parameter, which tells rsync to remove files from the backup that are no longer on my system.
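As a rough sketch (again, not my actual cya-purge.sh), the purge variant could look like this. The paths, host name, and CYA_RUN switch are the same kind of invented placeholders, and the commands are only printed unless CYA_RUN=1 is set.

```shell
#!/bin/sh
# cya-purge.sh: sketch of the purge run, i.e. the backup plus --delete.
# Every path and the host name below are placeholders (assumptions).
SRC="${SRC:-$HOME/}"
USB_DEST="${USB_DEST:-/media/usbdrive/backup/}"
REMOTE_DEST="${REMOTE_DEST:-me@backup.example.com:backup/}"
EXCLUDES="${EXCLUDES:-$HOME/.cya-exclude}"

# Print the command by default; execute it only when CYA_RUN=1.
run() {
    if [ "${CYA_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s ' "$@"
        printf '\n'
    fi
}

# --delete removes files from the destination that no longer exist in the source.
run rsync -av --delete --exclude-from="$EXCLUDES" "$SRC" "$USB_DEST"
run rsync -avz -e ssh --delete --exclude-from="$EXCLUDES" "$SRC" "$REMOTE_DEST"
```

Because --delete is destructive, it can be worth adding rsync's --dry-run (-n) option the first time, which lists what would be removed without touching anything.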

I agree that my solution is somewhat inelegant, and probably more hands-on than many people would prefer their backup to be. However, at the time that's really what I was looking for and I still enjoy doing it this way. I have a lot of granular control over this and don't have to mess with interfaces or anything like that. It's simple.

Duplicity

Of course, my hairy-man approach to backups is not going to be to most people's taste. For them (or you) there is duplicity, an elegant tool built on the rsync algorithm (through librsync) that bundles files into smaller chunks, suitable for storing on remote machines. It also manages the backups, keeping files around for a period of time and then allowing them to leave gracefully, something that I would like to get my own scripts to do when I have time to wrap my brain around it. Duplicity is the engine behind Déjà Dup, the default backup tool in Ubuntu, so if you have that turned on, you are using it!
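If you'd rather drive duplicity from the command line, a minimal sketch might look like the following. The source directory, the sftp target, and the retention periods are assumptions picked for illustration, and the exact command syntax can differ between duplicity versions, so check the man page for yours. As with the rsync sketches, the commands are only printed unless CYA_RUN=1 is set.

```shell
#!/bin/sh
# Sketch of a duplicity backup with expiry.
# Paths, host name and retention periods are placeholders (assumptions).
SRC="${SRC:-$HOME}"
TARGET="${TARGET:-sftp://me@backup.example.com/duplicity-backup}"
KEEP="${KEEP:-6M}"          # 6M means six months in duplicity's time format

# Print the command by default; execute it only when CYA_RUN=1.
run() {
    if [ "${CYA_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s ' "$@"
        printf '\n'
    fi
}

# Full backup the first time, incremental afterwards; start a new full
# backup once the last one is more than a month old. duplicity encrypts
# with GnuPG by default and will ask for a passphrase.
run duplicity --full-if-older-than 1M "$SRC" "$TARGET"

# Remove backup sets older than $KEEP. Without --force, duplicity only
# lists what it would delete.
run duplicity remove-older-than "$KEEP" --force "$TARGET"
```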

My first experience with duplicity was not great. It spent a few hours doing a full backup of my user directory (gigs and gigs of data) and then deleted it when it was done. I never did figure out why. However, when I recently tried it again through the Ubuntu control panel it seemed to work fine. I would need to do some tinkering to see how best to emulate my current system of dual backups to a local and a remote device, but it might be worth the trouble. When I looked at the settings to refresh my memory, I was amused to see that the automatic backup for today had already occurred and that I had not noticed. That's a good sign!

Other solutions

Of course, a number of other backup solutions have evolved over the nine years or so since I penned (or should I say keyboarded) that article. Notable ones are Bacula, fwbackups, and Amanda. At some point I may dig into them a little more, but in the meantime you will probably enjoy what you can do with rsync. I should point out that there are ways to use rsync in Windows as well. Take a look at this article if you want to explore that.