We’re always on the lookout for ways to speed up our web application, and one of my favorite tools for optimization is the YSlow Firefox extension. Based on rules that came out of research by Yahoo engineer Steve Souders (his book High Performance Web Sites is a must read for anyone interested in front end engineering), the tool hooks into Firebug and helps you diagnose issues that can shave seconds off your pages’ load times. While we were able to implement most of the suggestions fairly easily, Rule #3, which specifies adding a far future Expires header, required a bit of elbow grease that some of you might be interested in.

Rule #3 recommends that you set an Expires header on your static files (images, CSS and JavaScript) very far into the future (like 10 years) so that the browser’s cache is used to load those elements rather than making another HTTP request, which is costly when it comes to page load times. Implementing this is pretty easy. In your .htaccess file, you can use the following code:
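A minimal sketch of such a rule, assuming mod_expires is enabled on the server. The list of file extensions is an assumption and should be adjusted to whatever static files you serve:

```apache
# Requires mod_expires; the extension list below is an assumption.
<IfModule mod_expires.c>
    ExpiresActive On
    <FilesMatch "\.(gif|jpe?g|png|ico|css|js)$">
        # Tell browsers the file stays fresh for ten years after it is fetched.
        ExpiresDefault "access plus 10 years"
    </FilesMatch>
</IfModule>
```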

There’s a catch, though, as Yahoo’s own guidelines point out: “Keep in mind, if you use a far future Expires header you have to change the component’s filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component’s filename, for example, yahoo_2.0.6.js.”

We, of course, didn’t have a built-in build process that added the version number to our static files. Obviously, we weren’t interested in changing version numbers by hand or having tons of different versioned files lying around in our SVN repository. And so, motivated by a goal (increasing our YSlow score) and sloth (not doing something manually), we figured out the following automated solution.

The first thing we did was set up some mod_rewrite rules to allow version numbers in our file names. In our .htaccess file, we added the following lines:
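A sketch of what such a rule might look like, assuming the .htaccess file sits in the document root and the version is made up of digits only. The exact pattern is an assumption:

```apache
RewriteEngine On
# Serve /css/structure.1234.css from /css/structure.css.
# Restricting the version to digits keeps names such as
# jquery.min.js from being rewritten by accident.
RewriteRule ^(scripts|css)/(.+)\.([0-9]+)\.(js|css)$ $1/$2.$4 [L]
```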

This quietly rewrites any request for a file in our /scripts/ or /css/ folders that has a version number between the file name and the extension back to just the file name and extension. For example, I could now write the URL /css/structure.css as /css/structure.1234.css and Apache would treat both as the exact same file. We only version our JavaScript and CSS files, but you could easily adapt the rule for images as well, like so:
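A sketch of how the rule could be extended to cover images. The `images` folder name and the list of image extensions are assumptions:

```apache
RewriteEngine On
# Same idea, with an images folder and common image types added.
RewriteRule ^(scripts|css|images)/(.+)\.([0-9]+)\.(js|css|jpe?g|gif|png)$ $1/$2.$4 [L]
```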

Once that was in place, we wrote a tiny PHP function that looks at the last modified date of the file and automatically rewrites the URL with that Unix timestamp as the version number. Here’s that PHP function:
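A minimal sketch of such a function, assuming PHP 7.1 or later. The name `autoVer` comes from the discussion of this post, but the body here is an illustration rather than the original code, and the `$docRoot` parameter is a hypothetical addition that makes the function usable outside a web request:

```php
<?php
/**
 * Return $url with the file's last-modified Unix timestamp inserted
 * between the file name and its extension, for example
 * /css/structure.css becomes /css/structure.1199145600.css.
 *
 * $docRoot is a hypothetical parameter; it defaults to the server's
 * document root when omitted.
 */
function autoVer(string $url, ?string $docRoot = null): string
{
    $docRoot = $docRoot ?? ($_SERVER['DOCUMENT_ROOT'] ?? '');
    $file = rtrim($docRoot, '/') . $url;

    // If the file cannot be found, fall back to the unversioned URL.
    if (!is_file($file)) {
        return $url;
    }

    // Only version URLs whose last path segment has an extension.
    $slash = strrpos($url, '/');
    $dot = strrpos($url, '.');
    if ($dot === false || ($slash !== false && $dot < $slash)) {
        return $url;
    }

    // Insert the timestamp before the last period only, so a name
    // with several periods is versioned exactly once.
    return substr($url, 0, $dot) . '.' . filemtime($file) . substr($url, $dot);
}
```

In a template this could be used as `<link rel="stylesheet" href="<?php echo autoVer('/css/structure.css'); ?>">`, so the URL changes by itself whenever the file is modified.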

It’s a great little system that required very little effort on our end. It resulted in a noticeably faster browsing experience for clients who visit certain pages often, because their browsers take full advantage of their primed caches rather than calling our servers every time they load a page. The best part is that when we make a change to a CSS or JavaScript file, we don’t have to worry about tracking or managing version numbers or multiple files.

Automatically Version Your CSS and JavaScript Files by Kevin Hale

· 68 Comments! ·

Nice tip, especially the PHP code. But I would like to second Stan. Wouldn’t it be easier (as in: no need for .htaccess, which slows down the server a bit - yes, I know, only a bit) if you just appended the version info in a query string?
As long as the query string doesn’t change, the browser should be able to cache the files anyway.

We haven’t tested whether browsers would consider different query strings sufficient for proper caching, but I don’t see why using version numbers in a query string wouldn’t be a valid alternative approach. You could easily adjust the code so as to avoid the .htaccess rules and even simplify the PHP function. I think we’re just neat freaks when it comes to code, and that is why we approached it the way we did.

UPDATE! : As Anup pointed out in a comment below, URLs with query strings (i.e., script.js?v=1234) are NOT cached by the browser, and so the method above should be used, because it inserts the version number into the filename.

Wouldn’t the query string approach require you to change the URL in every file in which it is used? If I am understanding it correctly, that is a bit of a drawback. With the approach Kevin took, you just edit a file as normal with your new code, and everything else takes care of itself.

— On second thought, you could use the same PHP to append the query string, and not need the htaccess part. My bad.
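For illustration, a minimal sketch of that query string variant, assuming PHP 7.1 or later. The function name `autoVerQuery` and the `$docRoot` parameter are hypothetical, and whether every browser caches such URLs is debated in the comments that follow:

```php
<?php
/**
 * Append the file's last-modified Unix timestamp as a query string,
 * for example /scripts/prototype.js?v=1199145600.
 * No rewrite rule is needed because the path itself is unchanged.
 */
function autoVerQuery(string $url, ?string $docRoot = null): string
{
    $docRoot = $docRoot ?? ($_SERVER['DOCUMENT_ROOT'] ?? '');
    $file = rtrim($docRoot, '/') . $url;

    // If the file cannot be found, leave the URL untouched.
    if (!is_file($file)) {
        return $url;
    }

    return $url . '?v=' . filemtime($file);
}
```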

Kevin: Thank you very much! What a great way of getting a discussion going…and making us all think about page load optimization.

Stan: You have almost answered a question I had for some time now. How does this “prototype.js?v=1234” work? I’m sure I must have missed the memo, but I’ve noticed similar things in code from other developers. Would you be able to point me to a link that explains this a little bit more in-depth?

Query strings in the URL won’t guarantee caching. According to Cal Henderson, “According to the letter of the HTTP caching specification, user agents should never cache URLs with query strings. While Internet Explorer and Firefox ignore this, Opera and Safari don’t - to make sure all user agents can cache your resources, we need to keep query strings out of their URLs.” (http://www.thinkvitamin.com/features/webapps/serving-javascript-fast)

@WebGyver: I believe the querystring is used either for URL rewriting, or, when — in this instance — the js extension is mapped to be handled by something else (e.g. php) which outputs the right JavaScript, if needed.

Rails automatically does this by appending the timestamp to the filepath when you use the helper functions, but I didn’t like the idea of checking a bunch of files for their timestamp on each request, so I coded my own site to use
and then automatically substitute “?VERSION” by the file’s svn revision when I deploy to production. So in production it’s all static.

All the people who talk of the file.ext?93399393 fix need to read all the comments. It’s been stated that this does not work, as HTTP specifies that user agents should not cache resources with query strings (though some still do anyway). Kevin’s approach should work in all cases because it actually is a unique filename. Great tip, Kevin!

As long as freshness (Expires/Cache-Control:max-age) is provided (as it is in this technique), URLs with query strings are indeed cacheable and this is specified more explicitly in the updated HTTP/1.1 drafts:

“caches MUST NOT treat responses to such URIs as fresh unless the server provides an explicit expiration time.”

All of them [browsers] will cache responses from URIs that contain question marks, as long as there’s freshness information present. IE is a little bit aggressive with them; it will cache even without freshness information. That isn’t necessarily bad, just something to be aware of.

While the specifications do state that query strings can be cached, these are updated drafts—meaning they’re not necessarily followed by the browsers and so in practice, creating an optimization strategy on something that might work in the future (especially on browser feature development) seems odd.

Also, I’m not really sure why there’s such a big push to avoid the htaccess rules. We use them on Wufoo and the processing time is negligible on our servers.

Thanks Kevin, I’m currently looking at implementing the solution, with a little tweaking: I’m thinking I can write a script that will search for JS and CSS files (for example) within the application folder tree, get the timestamp, and generate a configuration file for the templating engine with the “version number” that came from the timestamp. The difference would be to request timestamps only at build time, and not during execution (though the file access for the timestamp may be minimal, I don’t know for sure).
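A rough sketch of such a build-time script, assuming PHP 7.1 or later and Unix-style paths. The function names `buildVersionMap` and `writeVersionConfig` and the shape of the generated file are hypothetical:

```php
<?php
/**
 * Walk $root for .js and .css files and return a map of
 * URL path => last-modified Unix timestamp.
 */
function buildVersionMap(string $root): array
{
    $root = rtrim($root, '/');
    $map = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        $ext = strtolower($file->getExtension());
        if ($file->isFile() && in_array($ext, ['js', 'css'], true)) {
            // Strip the root so the key is the URL path, e.g. /css/a.css.
            $url = substr($file->getPathname(), strlen($root));
            $map[$url] = $file->getMTime();
        }
    }
    ksort($map);
    return $map;
}

/**
 * Write the map as a PHP file that returns an array, so templates
 * can load it with a plain include instead of calling filemtime().
 */
function writeVersionConfig(array $map, string $target): void
{
    file_put_contents($target, "<?php\nreturn " . var_export($map, true) . ";\n");
}
```

Run once per build, this moves the timestamp lookups out of the request path: the templates only read the generated array.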

Hey Fabrice, that’s an awesome way to do it. We use a three stage push method and svn, and we didn’t want to have someone responsible for running a “crawl” script before every push. As I mentioned, we do this on Wufoo, and one of the things we do is limit it to a maximum of two JavaScript files and two CSS files on any HTML page. The CPU processing for filemtime is very minor, and the tradeoff in terms of not having to worry about the process at all and the apparent speed increase is definitely worth it for us.

.htaccess - the moment you turn on .htaccess in Apache, you’re slowing down your web server. You’re basically forcing Apache to do a stat on every directory from your resource up to your docroot just to figure out whether a .htaccess file exists. stat is slow.

include($_SERVER['DOCUMENT_ROOT'].'/path/to/autoVer.php');
Even though DOCUMENT_ROOT might be constant, PHP doesn’t know that, and this makes your include filename variable. That means APC can’t cache it well, which causes further slowdowns at run time.

Both of these affect the response time of your server, so they won’t actually affect your YSlow grade at all, but they will slow you down.

Regarding always using the same filename for different versions… you could run into problems if you have multiple pages/apps that depend on different versions of the file. With YUI, for example, since the same URLs are used by many properties, and many products outside of Yahoo!, the URLs cannot be changed without breaking a lot of apps.

Regarding using the file’s mtime as a version, you’ll run into problems when you hit one of the following two scenarios:
1. You have more than one server for load balancing (get around this by using the svn version number, or the mtime of the file in svn)
2. The file changes more frequently than once a second (though in this case, you probably want it cached only if you’re getting a few hundred thousand requests a second).

Hi again. After implementing something similar on my site I noticed a big drop in bandwidth, which is great. However, the number of “Hits” reported is the same. And when I observe with Firebug, I still get a fetch time above 0 ms on the CSS, JS, etc., which tells me the browser is still fetching, despite the files now having “Expires” around January 2018 and “Cache-control max-age=315360000”. Did you find something similar? Why isn’t the browser caching the files? Hmm…

Fabrice, check the HTTP request and response headers (using Firebug or LiveHTTPHeaders). Is there an ETag in the response headers? Is the Last-Modified header getting sent? Are the Expires and Cache-Control headers really getting sent? Is it HTTP/1.1 or 1.0?
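If you would rather check from a script, here is a small sketch, assuming PHP 7.1 or later, that picks the caching-related headers out of a list of raw header lines such as the array returned by PHP’s `get_headers()`. The function name `cachingHeaders` and the example URL are hypothetical:

```php
<?php
/**
 * Pick the caching-related headers out of a list of raw header lines.
 * The first line is kept as the status line, which shows whether the
 * response is HTTP/1.0 or HTTP/1.1.
 */
function cachingHeaders(array $headerLines): array
{
    $wanted = ['expires', 'cache-control', 'etag', 'last-modified'];
    $found = ['status' => $headerLines[0] ?? ''];

    foreach ($headerLines as $line) {
        // Skip anything that is not a "Name: value" line.
        if (!is_string($line) || strpos($line, ':') === false) {
            continue;
        }
        [$name, $value] = explode(':', $line, 2);
        $name = strtolower(trim($name));
        if (in_array($name, $wanted, true)) {
            $found[$name] = trim($value);
        }
    }
    return $found;
}

// Example (the URL is a placeholder):
// print_r(cachingHeaders(get_headers('http://example.com/css/structure.1234.css')));
```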

Can anyone think of a way of making this automatically apply to all images, and maybe other files without having to stick the PHP into their paths? I’d like to use this in WordPress, but I can’t really embed the PHP into each image’s path as the editor won’t allow it, etc.

I don’t know why everyone says it’s a great article. I think you just want to say it’s great because I can put my backlink here for free. So I am honest. Thanks for the good website and thanks for the PR juice.

I revised autoVer to squash a couple of bugs. First, if the filename had more than one period in it, the number would be inserted multiple times. Second, if the file were in the root directory, it wouldn’t be handled properly.