Automatic Backups to Amazon S3 Are Easy
Push important files to the cloud with s3cmd and cron
You have good reason to back up your files, and Amazon S3 is a cost-effective storage option. While it doesn't take the place of a dedicated drive that you own, it can be useful for redundancy nonetheless. With a few easy command-line steps (plus some prerequisites), you can set up your machine to automate backups to S3 in no time.
Prerequisites
- An Amazon Web Services account and your Amazon access credentials
- s3cmd: a command-line interface to S3
- cron: the job scheduler, pretty standard on Unix-based systems
As of this writing, installing s3cmd should be straightforward:
# Mac users
$ brew install s3cmd
# Linux (as root, or with sudo)
$ yum install s3cmd
# or
$ apt-get install s3cmd
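You can confirm the install worked by asking s3cmd for its version (the exact output will vary by release):
# Verify the installation
$ s3cmd --version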
Optional:
- gpg: an open-source encryption program
Setup
First you'll need to configure s3cmd by running s3cmd --configure. Have your Amazon access key and secret key at the ready.
If you plan to store sensitive data on S3, enter the path to your gpg executable; s3cmd will then encrypt your data before transferring it from your machine to S3, and decrypt it when you download it back. Keep in mind that encrypted files won't be readable by others who have direct access to the bucket but lack your encryption password.
Here's a sample result:
$ s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3
Access Key: xxxxxxxxxxxxxxxxxxxx
Secret Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: xxxxxxxxxx
Path to GPG program: /usr/local/bin/gpg
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]: Yes
New settings:
Access Key: xxxxxxxxxxxxxxxxxxxx
Secret Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Encryption password: xxxxxxxxxx
Path to GPG program: /usr/local/bin/gpg
Use HTTPS protocol: True
HTTP Proxy server name:
HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] Y
Please wait...
Success. Your access key and secret key worked fine :-)
Now verifying that encryption works...
Success. Encryption and decryption worked fine :-)
Save settings? [y/N] y
Configuration saved to '$HOME/.s3cfg'
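Behind the scenes, s3cmd shells out to the gpg program you configured. A rough manual equivalent, using gpg's symmetric mode (the filenames here are just examples; s3cmd handles all of this for you):
# Roughly what s3cmd does before an encrypted upload
$ gpg --symmetric --output backup.sql.gpg backup.sql
# ...and after an encrypted download
$ gpg --decrypt --output backup.sql backup.sql.gpg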
Backup
Now all you need is a file to back up and an S3 bucket to store it in.
Let's say you're a web developer like me and you want to back up your MySQL or Postgres development data. First, generate the backup file (you may need to add command-line options for your database credentials, of course):
# mysql
$ mysqldump my_app_development > backup-`date +%Y-%m-%d`.sql
# or postgres
$ pg_dump my_app_development > backup-`date +%Y-%m-%d`.sql
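If your database requires authentication, the credential flags look like this (myuser is a placeholder; both tools will prompt for the password):
# mysql, with a username and password prompt
$ mysqldump -u myuser -p my_app_development > backup-`date +%Y-%m-%d`.sql
# or postgres, forcing a password prompt
$ pg_dump -U myuser -W my_app_development > backup-`date +%Y-%m-%d`.sql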
You can use s3cmd to create a bucket, which is essentially a top-level directory in your S3 account. Since bucket names must be unique across all S3 users, you won't be able to call it something generic like "backups". It's helpful to use a prefix like your email or handle.
Creates an S3 bucket called 'myname-backups':
$ s3cmd mb s3://myname-backups
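Running s3cmd ls with no arguments lists every bucket in your account, so you can confirm the new one exists:
# List all buckets
$ s3cmd ls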
Now you're ready to deliver. Encrypt and send your sql dump file to your new S3 bucket:
$ s3cmd put backup-2014-02-01.sql s3://myname-backups/backup-2014-02-01.sql --encrypt
You can verify it's in the bucket:
$ s3cmd ls s3://myname-backups/
2014-02-01 22:32 1109702 s3://myname-backups/backup-2014-02-01.sql
And retrieve it (with automatic decryption when performed on your machine):
$ s3cmd get s3://myname-backups/backup-2014-02-01.sql
s3cmd supports a wide range of configuration options beyond those entered during the setup phase. Once set, your global configuration is editable in your .s3cfg file, typically saved in your home directory. You can also set options at the command line.
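For example, s3cmd's documented --config and --no-encrypt flags let you swap in an alternate config file or skip encryption for a single command (the file names here are placeholders):
# Use an alternate configuration file for one command
$ s3cmd --config=$HOME/.s3cfg-work ls
# Upload a single file without encryption, overriding your saved default
$ s3cmd put notes.txt s3://myname-backups/notes.txt --no-encrypt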
Automate
Backing up is good, but automatic, recurring backups are even better; like saving money, it's more likely to happen when you make a computer do it for you.
Let's write the backup script that our cron task will run:
#!/usr/bin/env bash

# Name the backup by date and dump to a temporary local file
TIMESTAMP=$(date +%Y-%m-%d)
TEMP_FILE=$(mktemp tmp.XXXXXXXXXX)
S3_FILE="s3://myname-backups/local/data/backup-$TIMESTAMP"

pg_dump my_app_development > "$TEMP_FILE"

# Encrypt and upload, then remove the local copy
s3cmd put "$TEMP_FILE" "$S3_FILE" --encrypt
rm "$TEMP_FILE"
Save this in a directory for your local scripts, like $HOME/bin/database_backup.sh, and add execute permissions with chmod +x ~/bin/database_backup.sh.
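Before scheduling it, it's worth a manual test run to confirm the dated dump lands in the bucket:
# Run the backup script by hand
$ ~/bin/database_backup.sh
# Confirm the upload arrived
$ s3cmd ls s3://myname-backups/local/data/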
To edit your crontab, run crontab -e, and set the script to run every day at 10 PM:
# Backup database to S3 daily
0 22 * * * /Users/myname/bin/database_backup.sh
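If you want a record of each run, a common variation (the log path is just an example) appends the script's output and errors to a file:
# Backup database to S3 daily, with logging
0 22 * * * /Users/myname/bin/database_backup.sh >> /Users/myname/backup.log 2>&1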
Easy, right?