Duplicity and Amazon S3 - New Backup strategy for the Tuquoc Server

After comparing different options (NAS, Amazon S3), I decided to go with the Amazon S3 web service. Comparing the prices, Amazon S3 is far less expensive than a NAS system. Another argument for Amazon S3 is that the backup isn't stored in the same building as the server. One drawback is that a curious Amazon administrator might have a look at the backed-up emails. Therefore, I use Duplicity to encrypt and transfer the data.

Duplicity is a backup tool that uses GnuPG to encrypt the data and librsync to work out what has changed since the last backup. Because the backups are incremental, far less data has to be transferred to Amazon S3.

I started my "New Backup Adventure" by googling around and found a little how-to, which I used as a starting point.

Here is a short explanation of how I installed and configured the backup on my Ubuntu server (be root or die!):

Install duplicity and python-boto:

  • apt-get install duplicity
  • apt-get install python-boto

Python-boto is the library duplicity uses to communicate with Amazon S3.
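Before going on, it can help to check that the tools really ended up on the PATH. The following is only a minimal sketch: the function name missing_tools is made up for this example, and it only checks for commands, not for the python-boto library itself.

```shell
#!/bin/sh
# Print the name of every given command that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of duplicity.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: report what still needs installing before the first backup run.
missing=$(missing_tools duplicity gpg)
if [ -n "$missing" ]; then
    echo "Still missing: $missing"
fi
```

If nothing is printed, both duplicity and gpg were found.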

Build a new bucket on the Amazon S3:

  • Go to Amazon S3 and register yourself.
  • Generate a Secret Access Key.
  • To create the bucket I used JetS3t. Log in to JetS3t with the Access Key ID and the Secret Access Key and create a bucket.

Creating a pair of keys with GnuPG:

  • gpg --gen-key
  • Choose 1) DSA and ElGamal
  • Choose 4096bit for the key length
  • Fill out the additional questions (name, email, comment)

Important: If you don't generate the key on the server itself, you have to mark the key as trusted after importing it. Otherwise, duplicity will throw a "broken pipe" error. To do so:

  • Import the key with: gpg --import KEYFILE
  • gpg --edit-key HEXNAMEOFKEY
  • trust
  • 5
  • quit
KEYFILE is the file containing the exported key. HEXNAMEOFKEY is the 8-digit hexadecimal ID of the key. The 5 selects ultimate trust in gpg's trust menu.
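If you only have the full fingerprint at hand (for example from gpg --list-keys --fingerprint), the 8-digit key name is, for the usual OpenPGP keys, simply its last eight hex digits. A minimal sketch, assuming a space-grouped fingerprint as input; short_key_id is a name made up for this example:

```shell
#!/bin/sh
# Print the short (8 hex digit) key ID for a full fingerprint.
# Spaces are stripped first, since gpg prints fingerprints in groups.
# tail -c 9 keeps the last 8 characters plus the trailing newline.
# (short_key_id is a hypothetical helper, not a gpg command.)
short_key_id() {
    printf '%s\n' "$1" | tr -d ' ' | tail -c 9
}
```

For example, short_key_id 'AAAA BBBB CCCC DDDD EEEE  FFFF 0000 1111 DEAD BEEF' prints DEADBEEF.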
 

Write a simple backup script:

To understand the duplicity command, first have a look at:
  • man duplicity
My script looks like this:
 
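#!/bin/sh

# duplicity reads these credentials from the environment.
export PASSPHRASE='Enter here the GPG Passphrase'
export AWS_ACCESS_KEY_ID='Enter here the Access Key of the Amazon S3'
export AWS_SECRET_ACCESS_KEY='Enter here the Secret Access Key'

# Remove backups older than three months.
# (Newer duplicity versions may only list them unless --force is added.)
duplicity remove-older-than 3M --encrypt-key=GPGKEYNAME --sign-key=GPGKEYNAME s3+http://S3BucketName

# Back up only the included path; everything else under / is excluded.
duplicity --encrypt-key=GPGKEYNAME --sign-key=GPGKEYNAME --include=/some/file/you/like/to/backup --exclude='**' / s3+http://S3BucketName

# Clear the secrets from the environment again.
export PASSPHRASE=
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=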
 
 
A few comments on the script:
  • The PASSPHRASE= line contains your GPG key passphrase, so be very careful about who can read your script.
  • GPGKEYNAME is again the 8-digit hexadecimal ID of the key.
  • All backups older than 3 months are removed with duplicity remove-older-than 3M.
  • With duplicity I back up the / directory and exclude everything ('**'). With the include option, I then include the files to back up. The --include has to come before the --exclude, because the first matching option wins.
  • You can get the S3BucketName from the JetS3t tool.
  • The Access Key ID and the Secret Access Key are the ones you generated earlier during the Amazon S3 registration.
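The include/exclude pair in the script works because duplicity checks each file against the options in order and takes the first one that matches. The following is only a rough sketch of that idea using plain shell patterns, not duplicity's real matching code; select_path and INCLUDE_PATH are names made up for this example.

```shell
#!/bin/sh
# Hypothetical include path, mirroring the placeholder in the script.
INCLUDE_PATH=/some/file/you/like/to/backup

# Print "include" or "exclude" for a path, checking the rules in the
# order they appear on the command line; the first matching rule wins.
select_path() {
    case "$1" in
        "$INCLUDE_PATH"|"$INCLUDE_PATH"/*) echo include ;;  # --include=...
        *)                                 echo exclude ;;  # --exclude='**'
    esac
}
```

With these rules, select_path /etc/passwd reports exclude, while the include path itself and anything below it report include. If the two rules were swapped, the catch-all exclude would match first and nothing would be backed up.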

Finally, make sure the script is NOT world-readable and move it to the weekly cron directory:

  • chown root:root the_backup_script
  • chmod go-rwx the_backup_script
  • chmod u-w the_backup_script
  • chmod u+x the_backup_script (cron only runs the files in this directory that are executable)
  • mv the_backup_script /etc/cron.weekly
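Since the script contains the passphrase and the Secret Access Key, it is worth double-checking the permissions afterwards. A minimal sketch, assuming the usual ls -l output format; is_private is a name made up for this example:

```shell
#!/bin/sh
# Succeed only if group and others have no permissions on the file.
# Characters 5-10 of the ls -l mode string are the group and other bits.
# (is_private is a hypothetical helper.)
is_private() {
    [ "$(ls -ld -- "$1" | cut -c5-10)" = "------" ]
}
```

It could be used like this: is_private /etc/cron.weekly/the_backup_script || echo "permissions are too open"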

Et voilà, you have a Duplicity - Amazon S3 backup!