Having a backup strategy is key. What works for me won't necessarily work for you. I am going to outline my method for my own reference, to get feedback, and in the hope it might help someone else out there.

I am using Amazon S3 as my back-end, taking advantage of 10 GB free for a year. I am abusing Fabric as my glue. Fabric is meant for remote execution, but I am using its local feature as a kind of Pythonic shell script.

The first major issue is security. S3 is private by default, but you are still implicitly trusting Amazon. The only real solution is to encrypt your data, which is easier said than done. The core of this post is documenting my method for encryption.

Encrypting Your Bits

I use OpenSSL for the encryption instead of GPG or PGP because it is already on my Mac and on most Linux systems, no apt-getting required.

Conceptually, a backup is like a message sent from present-day me to future me using asymmetric (public/private key) encryption. I want to encrypt the backup with my public key and decrypt it later with my private key.

The trouble is that OpenSSL's asymmetric encryption can only handle small files; RSA can only encrypt data smaller than the key itself. The solution is to generate a unique symmetric key for every backup, encrypt the backup with it, and then encrypt that small key file with my public key. I save the symmetrically encrypted backup and the asymmetrically encrypted symmetric key together and send them both to S3. Without my private key the two files get you nothing, since they are both encrypted.

The first thing you have to do, once and only once, is generate the public/private key pair for encrypting the symmetric encryption keys (passwords):

import os
from fabric.api import local

def create_keys():
    """ Only do this once, if you overwrite your keys you won't be able to decrypt your backups! """
    key_file_name = os.path.join(root, "keys/backup.pem")
    # Generate encrypted private key
    local("openssl genrsa -des3 -out %s 1024" % key_file_name)
    # Output the public part
    local("openssl rsa -in %s -pubout > %s" % (key_file_name, key_file_name + ".pub"))

This script has several global configuration variables which aren't addressed. It also won't clean up your local store except to delete the unencrypted key.

Recovery

Like claiming an insurance policy, recovery is the most important part of any backup system. And like insurance, people tend not to talk much about how it will work, nor do they test it often. Given the complexity of the backup process, I figured outlining the recovery process would be handy for reference, even if the automation ultimately fails me.

def recover_backup(id):
    # Try to get the file
    conn = boto.connect_s3(ACCESS_KEY, SECRET_KEY)
    bucket = conn.get_bucket("amjoconn-backups")
    key = bucket.get_key("unified_backup-%s.tar.gz" % id)
    if key is None:
        print "Unable to find backup with id %s" % id
        return
    recover_name = os.path.join(recovery_location, "recover-%s.tar.gz" % id)
    print "Writing", "unified_backup-%s.tar.gz" % id, "to", recover_name
    print "This might take a bit of time..."
    key.get_contents_to_filename(recover_name)
    local("tar -xzf %s" % recover_name)

There it is: how I am keeping my bits safe. A problem like this is never really solved, though. I hope to update this post over time as I discover better ways to back up, because this solution has plenty of room for improvement.