In my previous post on storing web assets on Amazon S3, I promised to share the Ant script I developed over a week of work and testing to jumpstart your own efforts. Let’s look at what I wanted to accomplish.

The Ideal Deployment

The ideal deployment would accomplish a number of things that I listed in my previous post, such as:

Combine multiple Javascript or CSS files to reduce the number of downloads

Automatically set far-future headers, mime type, encoding and so forth

Finally, compress text files for browsers that support Gzip and upload them automatically

That’s a good wish list because it’s everything I did by hand the first time and it was a pain in the ass. Being fully automated means that with one simple execution and a couple of parameters, we can push an entire repository of web assets off to Amazon S3 and keep them up to date.

Implementation

Here’s the background on my site and why this script does what it does. It’s easy to customize so you should be able to use it for your own needs with little effort but this gives some rationale as to why I’ve made certain decisions.

Be sure to check my previous post to fully understand the limitations of S3. I won’t be re-addressing here why we have two buckets, set headers or compress content separately.

Javascript

On my site, we include Google Analytics on every page. We also include jQuery and Dan Switzer’s qForms on many pages. The public facing part of our site also includes the AddThis widget that lets people share our content via social media sites like Facebook and Twitter.

Depending on the page you hit, you might have to download all four of those things. Or in certain rare cases, maybe just one of them (GA). We made the decision to create a bundle of core Javascript that would include all four of those items in a single file. Once the bundle is stripped, minified and served with a far-future expiry date (which means each user only ever downloads it once), the file is small enough to send to every user instead of sending it piecemeal. jQuery accounts for the biggest chunk and since we’re moving towards more interactivity, we decided to bite the bullet and bundle the four scripts together.

Javascript is compressed by the impressive YUI-Compressor written by Julien Lecomte.

CSS

Historically, we’ve always used separate files for print and screen CSS. However, there is a technique that lets you put both sets of styles into the same stylesheet, which not only eliminates the extra file but also reduces the total amount of CSS needed. It boils down to using this technique in your single CSS file:

This first part will eliminate the second file but it doesn’t reduce the overall amount of CSS. For that, I turned to this explanation of putting generic styles outside of the @media { } block, which gives you a set of “base” styles that are then overridden by media-specific styles (in my case, print). This is the “cascading” in Cascading Style Sheets. You have to laugh when you work with something for years and still learn something so fundamental. In the end, I have three CSS files that look like:

During deployment, the script concatenates the three files together into a single CSS file that handles both screen and print. Then it’s minified by YUI-Compressor and gzipped for a teensy end product.

Why keep the three separate files? I find it easier to go into a smaller file to find a style than deal with one giant file all the time. If I were starting from scratch, I probably wouldn’t have gone this way but since I already had the three files in my source control, I left them as-is and let Ant put them together.

Updating files with far future Expires

Astute readers will be wondering: if you set the cache headers to expire a year from now, how do you make changes to those files? Won’t the browser use its local copy effectively ignoring the updated file on the server?

Yes.

When I consulted at Yahoo in 2003 helping them with a major search redesign, I was exposed to their internal interface to Akamai’s content distribution network. Their answer to this problem was very simple: rename the file. If the file was foo.css, name it foo2.css and update your HTML to point at it instead. Assuming the actual HTML doesn’t also have a far futures expiry, then the next request will load foo2.css instead and see the updated styles.

This sounds like a pain in the butt but is not that bad. You should already have your templates abstracted in some fashion, either in an MVC system, a custom tag or some other templating mechanism that separates out the core aspects of your look and feel. That means when your core JS and CSS files are updated, and thus renamed, there should only be one or two places you need to make changes.

Ask yourself: is a tiny extra bit of work on your part worth a huge speed increase for every one of your users on every request ever made to your website? There is only one correct answer to that question.

Preparing Ant

Let’s get down to business. In addition to Ant 1.7, you’ll also need the libraries and Ant tasks this script relies on, including YUI-Compressor, Svnant and s3cmd:

Getting those working is beyond the scope of this post; you’ll need some Ant skills but basically it involves downloading the JARs and putting them into your Java lib/ext or ant/lib folder. On CentOS Linux, that would be /usr/java/latest/jre/lib/ext and /usr/share/ant/lib.

Why not jets3t?

There is a pure Java interface for S3, jets3t, but it didn’t work for my purposes here. That may change over time, and a pure Java library would be preferable to an external dependency like s3cmd.

Properties Files

I use one properties file per environment that I deploy to and my Ant scripts ask me which environment I want to target when they start. For the script below, they would be named production.properties, staging.properties and development.properties:

The Script

With all of the preparations done, we can try the actual Ant script. I’m sorry the formatting of this makes you scroll horizontally; I’m going to get a new code plugin soon to eliminate this. You might download the file to follow along instead.

The first and default target is picktarget – this prompts the user for which environment to deploy to and sets some Ant properties for helper libraries and options before jumping off to the target specified in the properties file:
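For readers without the original listing, here is a minimal sketch of how a target like picktarget could be put together with Ant's built-in input, property and antcall tasks. The property names target.env and deploy.target are assumptions made for illustration, as is the placeholder deploy target; they are not taken from the actual script.

```xml
<project name="static-assets" default="picktarget" basedir=".">

  <!-- Default target: ask which environment to deploy to, load its
       properties file, then hand off to the target that file names. -->
  <target name="picktarget">
    <!-- validargs restricts the answer to the three environments
         named in the post; target.env is a hypothetical property name -->
    <input message="Deploy to which environment?"
           validargs="production,staging,development"
           addproperty="target.env"/>

    <!-- Loads e.g. production.properties from the base directory -->
    <property file="${target.env}.properties"/>

    <!-- deploy.target is a hypothetical key expected in that file,
         e.g. deploy.target=deploy -->
    <antcall target="${deploy.target}"/>
  </target>

  <!-- Placeholder so the sketch runs on its own -->
  <target name="deploy">
    <echo message="Deploying to ${target.env}"/>
  </target>

</project>
```

Because antcall passes the caller's properties along by default, everything loaded from the properties file is visible to the target it jumps to.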

I like to have an init target that cleans up existing directories and prepares the script to run. I used to remove the directory completely and recreate it but it extends the length of time that SVN exports or checkouts take over slow networks so I generally keep the source directory and only rebuild the work directory now:
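As a rough illustration of that approach, an init target might keep the source directory and rebuild only the work directory. The directory names src and work are assumptions for this sketch.

```xml
<project name="static-assets-init" default="init" basedir=".">

  <!-- Hypothetical locations; adjust to your own layout -->
  <property name="src.dir" location="src"/>
  <property name="work.dir" location="work"/>

  <target name="init">
    <!-- Keep the source directory between runs so Subversion only
         has to transfer what changed -->
    <mkdir dir="${src.dir}"/>

    <!-- Throw away and recreate the work directory on every run -->
    <delete dir="${work.dir}"/>
    <mkdir dir="${work.dir}"/>
  </target>

</project>
```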

The hyphen in front of this target name indicates it is private. I only call this as a dependency from other targets using the “depends” syntax. This is pretty straightforward: it uses the Svnant library to check out my source code from Subversion using the properties file we specified.

I like to use the current svn revision number as my release number and I embed it in my application for reference. I also print it to the screen while deploying.

Now we start to get to the good stuff. Here I’m creating the directory structure to build my static assets with a directory for concatenating and minimizing Javascript and CSS files. These files wind up in a /global directory during deployment for inclusion in my HTML templates.

One of the challenges of bundling Google Analytics and AddThis javascript code into your app is that you’re no longer getting their updates on every page request. Generally this is OK – you probably don’t need an update from them every day. But from time to time, there are new features and enhancements (especially in GA) that you’ll want to capture and update in your bundle. I’ve automated this process by fetching those files during deployment so I have the latest each time I deploy.
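Fetching the third-party scripts at deploy time can be done with Ant's get task. The sketch below is one possible shape for it; both URLs are placeholders supplied as properties and should be replaced with whatever the vendors currently publish, and the directory name is also an assumption.

```xml
<project name="fetch-third-party" default="fetch" basedir=".">

  <!-- Hypothetical build directory for Javascript sources -->
  <property name="js.build.dir" location="work/js"/>

  <!-- Placeholder URLs: substitute the vendors' current script URLs -->
  <property name="ga.url" value="http://example.com/ga.js"/>
  <property name="addthis.url" value="http://example.com/addthis_widget.js"/>

  <target name="fetch">
    <mkdir dir="${js.build.dir}"/>

    <!-- usetimestamp skips the download when the local copy is
         already as new as the remote one -->
    <get src="${ga.url}" dest="${js.build.dir}/ga.js" usetimestamp="true"/>
    <get src="${addthis.url}" dest="${js.build.dir}/addthis.js" usetimestamp="true"/>
  </target>

</project>
```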

Note: because of the way the combined files are named, you may need to update your templates when you deploy your assets!

Now take all of the Javascript and CSS files and concatenate them into fewer files. Note that ORDER MATTERS! We list the files in an explicit order to satisfy any dependencies that we may have in the system. Once combined, we run them through YUI-compressor to squeeze them down and finally rename them back to their original names.
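A sketch of the concatenate-then-compress step follows, assuming the YUI-Compressor JAR sits at lib/yuicompressor.jar and that the four scripts have hypothetical file names. A filelist is used rather than a fileset because it preserves the order in which the files are listed, which is the whole point here.

```xml
<project name="bundle-js" default="bundle" basedir=".">

  <!-- Hypothetical paths and file names -->
  <property name="js.src.dir" location="work/js"/>
  <property name="global.dir" location="work/global"/>
  <property name="yui.jar" location="lib/yuicompressor.jar"/>

  <target name="bundle">
    <mkdir dir="${global.dir}"/>

    <!-- filelist keeps the files in exactly the order given, so
         jQuery lands ahead of anything that depends on it -->
    <concat destfile="${global.dir}/core-full.js" fixlastline="yes">
      <filelist dir="${js.src.dir}"
                files="jquery.js,qforms.js,addthis.js,ga.js"/>
    </concat>

    <!-- Run YUI-Compressor: -o names the output, the last
         argument is the input file -->
    <java jar="${yui.jar}" fork="true" failonerror="true">
      <arg value="--type"/>
      <arg value="js"/>
      <arg value="-o"/>
      <arg file="${global.dir}/core.js"/>
      <arg file="${global.dir}/core-full.js"/>
    </java>

    <!-- Only the minified file is deployed -->
    <delete file="${global.dir}/core-full.js"/>
  </target>

</project>
```

The same pattern works for the CSS bundle with --type css and a filelist naming the stylesheets in cascade order.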

As we discussed above, when you set a far-future Expires header on a file and you need to change that file, the only realistic strategy is to rename it. This next target does that automatically by using an md5 hash (Ant’s “checksum”) as part of the filename. Why use this instead of, say, the revision number? Because we only want to update the references to these files in our HTML templates if something actually changes. It’s quite possible to push your static assets many times without any real changes: unless you updated your own Javascript or CSS, or Google or AddThis updated theirs, the bundled files are identical. Because the hash is computed from the file contents, an unchanged file keeps the same name, which saves you the effort of updating your HTML template references.

I will admit that I don’t like using an md5 hash because it’s hard to spot check for changes. I haven’t come up with anything better yet.

The end of this target prints the filenames to the output so you can compare them to the ones in your templates. If you were clever, you would put these filenames into a config file of some sort that your templates used so you didn’t have to actually update the templates each time.
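The renaming idea can be sketched with Ant's checksum task, which can store the hash in a property instead of writing a checksum file. The file names, the property name core.js.md5 and the assets.properties config file are all assumptions for illustration; the last step shows one way to act on the config-file suggestion above.

```xml
<project name="fingerprint-assets" default="fingerprint" basedir=".">

  <!-- Hypothetical location of the bundled, minified files -->
  <property name="global.dir" location="work/global"/>

  <target name="fingerprint">
    <!-- Store the MD5 of the file contents in a property -->
    <checksum file="${global.dir}/core.js" algorithm="MD5"
              property="core.js.md5"/>

    <!-- Rename core.js so its name changes only when its contents do -->
    <move file="${global.dir}/core.js"
          tofile="${global.dir}/core-${core.js.md5}.js"/>

    <!-- Print the name so it can be compared with the templates -->
    <echo message="Core Javascript: core-${core.js.md5}.js"/>

    <!-- Optionally record it in a config file the templates could read -->
    <propertyfile file="${global.dir}/assets.properties">
      <entry key="core.js" value="core-${core.js.md5}.js"/>
    </propertyfile>
  </target>

</project>
```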

When developing locally for example, I may wish to push to a local directory rather than to S3. On our live servers, we keep a copy of our static assets on the boxes alongside our production code. In case S3 were to have a serious outage, we could update one configuration file and suddenly start using our local assets again instead of S3. Given that we don’t control S3, we feel this is a good backup strategy.
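A local push can be as simple as a copy. In this sketch, deploy.dir and local.assets.dir are hypothetical properties; in practice the destination would more likely come from the per-environment properties file.

```xml
<project name="local-push" default="localpush" basedir=".">

  <!-- Hypothetical source and destination directories -->
  <property name="deploy.dir" location="work"/>
  <property name="local.assets.dir" location="/var/www/static"/>

  <target name="localpush">
    <mkdir dir="${deploy.dir}"/>

    <!-- Copy the built assets next to the application code;
         unchanged files are skipped by default -->
    <copy todir="${local.assets.dir}" preservelastmodified="true">
      <fileset dir="${deploy.dir}"/>
    </copy>
  </target>

</project>
```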

And finally, repeat the push but this time to our compressed bucket. We use two uploads here – one for the uncompressed content like images which won’t be Gzipped and another for the compressed content that includes a couple of additional headers for Vary and Content-Encoding so that browsers and proxies will know what to do with it:
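The two uploads might look roughly like the sketch below, which gzips one bundled file with Ant's gzip task and shells out to s3cmd through exec. The bucket names, directories, file name and Expires date are all placeholders, and the header set simply mirrors the ones named above; treat it as a starting point rather than the actual target.

```xml
<project name="push-s3" default="push" basedir=".">

  <!-- Hypothetical directories, buckets and expiry date -->
  <property name="plain.dir" location="work/plain"/>
  <property name="gzip.dir" location="work/gzip"/>
  <property name="s3.bucket" value="s3://example-assets/"/>
  <property name="s3.bucket.gz" value="s3://example-assets-gz/"/>
  <property name="expires" value="Thu, 31 Dec 2037 23:59:59 GMT"/>

  <target name="push">
    <!-- Compress the bundle, keeping its original name so the URL
         path is the same in both buckets -->
    <mkdir dir="${gzip.dir}/global"/>
    <gzip src="${plain.dir}/global/core.js"
          destfile="${gzip.dir}/global/core.js"/>

    <!-- Upload 1: uncompressed content such as images -->
    <exec executable="s3cmd" failonerror="true">
      <arg value="put"/>
      <arg value="--recursive"/>
      <arg value="--acl-public"/>
      <arg value="--add-header=Expires:${expires}"/>
      <arg value="${plain.dir}/"/>
      <arg value="${s3.bucket}"/>
    </exec>

    <!-- Upload 2: gzipped text with the extra headers -->
    <exec executable="s3cmd" failonerror="true">
      <arg value="put"/>
      <arg value="--recursive"/>
      <arg value="--acl-public"/>
      <arg value="--mime-type=application/javascript"/>
      <arg value="--add-header=Expires:${expires}"/>
      <arg value="--add-header=Content-Encoding:gzip"/>
      <arg value="--add-header=Vary:Accept-Encoding"/>
      <arg value="${gzip.dir}/"/>
      <arg value="${s3.bucket.gz}"/>
    </exec>
  </target>

</project>
```

Forcing a single MIME type works here only because the compressed directory in this sketch holds Javascript alone; a real script would need one upload per content type or a per-file loop.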

Last of all, the roll-up targets. None of the above targets are really designed to be called directly. Rather, the following targets use the “depends” feature of Ant to combine multiple targets into something useful like: predeploy, deploy, redeploy, localpush, and repush. Those should all be semi-guessable in terms of what they accomplish.
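To show the shape of such roll-up targets, here is a sketch in which each private target is reduced to an echo. The private target names and the way the public targets are composed are guesses made for illustration; only the public names come from the list above.

```xml
<project name="rollup" default="deploy" basedir=".">

  <!-- Stand-ins for the real work; names are hypothetical -->
  <target name="init"><echo message="init"/></target>
  <target name="-checkout"><echo message="checkout"/></target>
  <target name="-bundle"><echo message="concatenate and minify"/></target>
  <target name="-fingerprint"><echo message="rename by checksum"/></target>
  <target name="-copylocal"><echo message="copy to local directory"/></target>
  <target name="-pushs3"><echo message="upload to S3"/></target>

  <!-- Public roll-ups: Ant runs each dependency once, left to right -->
  <target name="predeploy" depends="init,-checkout,-bundle,-fingerprint"/>
  <target name="deploy" depends="predeploy,-pushs3"/>
  <target name="localpush" depends="predeploy,-copylocal"/>

  <!-- Push again without rebuilding -->
  <target name="repush" depends="-pushs3"/>

</project>
```

Targets whose names start with a hyphen cannot be invoked from the command line, so only the roll-ups are callable directly.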

12 Comments

When you’re developing locally, I assume you have all these js and css separate, and in your html you’re including them individually, correct?

So how are you modifying your HTML when you deploy to production? or does your app just have a switch that detects its environment and then decides whether to include the individual versions or the concatenated versions?

Brian,
OK, so I understand how you’re changing the path to where the files live. But what about the files themselves? For example, let’s say you and your team have a handful of javascript files that you all work on all the time. These aren’t open source projects that only change once in a while… they’re files you work on every day. So in your html, you have

And on production, you want to have these files all concatenated, so you put that in your build script and it creates a new file: JSFilesIChangeEVeryDay_Combined.js or whatever

Now, your HTML needs to change on production, so that it points to your new combined file instead of each individual js file.

My question is: how are you managing the changing of your HTML during deployments? Or aren’t you?

See, I’m on the verge of instituting a very similar thing as you’ve described, and in my head, I see me simply using ANT to do a find/replace in our main layout file which looks for the big chunk of text that includes all the single JS files and replaces them with a single include for the combined JS file. But if there are other approaches to doing that, I’d love to hear them.

brian said:

I don’t use the individual files locally on a day to day basis – I use the concatenated ones. If I change one of those underlying JS files then I recompile them. If I was doing this day in and day out like you’re suggesting, I would probably have a separate target that just took my local files and rebuilt the concatenated versions into a statically named file for use during development. Then you could use your normal deployment scripts to adjust which files to use based upon the environment.

It’s important to work with the compiled scripts most of the time because there are problems that can crop up with minimizing and concatenation. You won’t see them if you’re normally working with separate files.

I don’t know of a good solution beyond what you’re proposing: a more complex bit of Ant search and replace.