buckets of fun — nginx + s3

We use Amazon’s S3 a lot. With it we archive backups, store images, and sometimes even host entire static websites, like the Atlas and Voila homepages.

It’s simple but powerful, and its guaranteed 99.9% monthly uptime conveniently matches our SLA. Although its ability to host sites is useful, it’s somewhat limited for us: we also serve API traffic from atlas.metabroadcast.com, and a hostname pointed straight at a bucket can only serve what’s in that bucket.

buckets as a backend

So rather than pointing atlas.metabroadcast.com directly at the S3 bucket, we point it at Nginx, which decides whether a request is for the Atlas homepage, for the API, or for something else entirely.
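As a rough sketch of that routing (an illustration, not our actual config), a server block might look like the one below. The /api/ prefix, the 127.0.0.1:8080 API backend and the example-bucket name are all made-up placeholders.

```nginx
server {
    listen 80;
    server_name atlas.metabroadcast.com;

    # Hypothetical API prefix and backend address: requests under /api/
    # are handed to the API service.
    location /api/ {
        proxy_pass http://127.0.0.1:8080;
    }

    # Catchall: anything not matched by a more specific location
    # is proxied to the S3 bucket (placeholder bucket name).
    location / {
        proxy_pass http://example-bucket.s3.amazonaws.com;
    }
}
```

Nginx picks the longest matching prefix, so the bare / location only sees requests that nothing more specific has claimed.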

For homepage requests, Nginx simply proxies the request to the S3 bucket. Or at least, it should be simple. Unfortunately, quite a few places out there document the wrong way to do this. And I don’t mean less-than-ideal wrong; I mean fully not-working-at-all wrong, which caused quite a bit of frustration for a while.

the code

So here’s how we’ve done it. Our Nginx config proxies any publicly available S3 assets and, placed as the last location in the Nginx server block, acts as a catchall.

We hide a few Amazon (amz) headers, but most importantly we update the Host header to match the format of the bucket endpoint. This is the bit S3 uses to know which bucket’s contents should be returned, and without which we would just see errors.
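For illustration, a minimal location along those lines might look like this. Treat it as a sketch under assumptions rather than the exact config: example-bucket is a placeholder bucket name, the endpoint uses the virtual-hosted example-bucket.s3.amazonaws.com format, and the hidden headers are the usual x-amz-id-2 and x-amz-request-id response headers.

```nginx
# Placeholder bucket name; substitute your own bucket endpoint.
location / {
    proxy_http_version 1.1;

    # S3 picks the bucket from the Host header, so it must match
    # the bucket endpoint rather than the public hostname.
    proxy_set_header Host example-bucket.s3.amazonaws.com;

    # Don't forward client credentials to a public bucket.
    proxy_set_header Authorization "";

    # Strip Amazon-specific response headers before replying.
    proxy_hide_header x-amz-id-2;
    proxy_hide_header x-amz-request-id;

    proxy_pass http://example-bucket.s3.amazonaws.com;
}
```

If the bucket is set up for static website hosting, the bucket’s website endpoint would take the place of the REST-style endpoint in both the Host header and proxy_pass, since only the website endpoint serves index documents.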

In the end, some of the magic that makes S3 work so well is what made it less than obvious what was required for the proxy rules. And a bunch of differing, and incorrect, guides on how to get it to work didn’t really help. Hopefully someone else will stumble across this post and find it useful!
