
Monday, May 21, 2007

ETags are not a panacea

panacea: A remedy for all diseases, evils, or difficulties; a cure-all.
From Answers.com

In my post "That Darned Cat! - 1",
I complained about Twitter performance, and peeked at some of their HTTP headers, noticing that they
didn't seem to honor ETag or Last-Modified cache validators on conditional requests.

Since posting, Twitter performance is back on track. I haven't checked, but I'm guessing they didn't
add ETag support. :-)

A number of people seemed to read into my post that the lack of ETag support is a cause of Twitter's
performance problems. I'd be the first to admit that such a proposition is a bit of a stretch. ETags
are no panacea, and in fact you'll have to write more code to handle them correctly. It's even harder
if you're using some kind of high-level framework for your app. This isn't easy stuff.

And in general, my 20+ years of programming have taught me that your first guess at where
the performance problems in your code lie is dead wrong. You really need to break out some
diagnostic tools, or write some, to figure out where your problems are. Since I don't have
the Twitter code, I'm of course at a complete loss to guess where their problems are, when they
have them.

ETag and Last-Modified processing is something you ought to do, if you can afford it,
because it does allow for some optimization in your client / server transactions.
To be clear, the optimization is that the server doesn't have to send the content
it would have sent to the client, as the client has indicated it already has
that 'version' of it cached. There is still a round-trip to the server involved. If you're
looking for an absolute killer optimization though, you should be looking at
Expires and Cache-Control headers. See Mark Nottingham's recent post
"Expires vs. max-age" for
some additional information, along with the link to his
caching tutorial.
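
To make the validator round-trip concrete, here is a minimal sketch of the server side, in Python. The names make_etag and respond are made up for illustration, the ETag is just a truncated hash of the body (an assumed scheme, not anything Twitter or any framework does), and weak validators (W/"...") are ignored to keep it short.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, payload) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}

    # If-None-Match may carry several comma-separated tags, or "*".
    if_none_match = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in if_none_match.split(",")]

    if etag in candidates or "*" in candidates:
        # The client already holds this version: headers only, no body.
        return 304, headers, b""
    return 200, headers, body
```

Note that in this sketch the server still builds the whole body just to compute the tag. The bytes saved are on the wire, not in the app, which is part of why "if you can afford it" matters.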

Expires and friends are killer, because they allow the ultimate in client / server
transaction optimization; the transaction is optimized away completely.
The client can check the expiration date, and determine that the data it has
cached has not 'expired', and thus it doesn't need to ask the server for it at all. Unfortunately,
many applications won't be able to use these headers, if their data is designed to change
rapidly; e.g., Twitter.
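
Here's a rough sketch of the check a client-side cache makes before deciding whether to hit the network at all. The function name is_fresh is hypothetical, and this is a simplified model: it only looks at max-age, no-cache, no-store and Expires, and ignores the rest of what real HTTP caches consider (Age, heuristics, Vary and so on).

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers: dict, fetched_at: datetime, now: datetime) -> bool:
    """Decide whether a cached response can be reused without asking the server."""
    cache_control = headers.get("Cache-Control", "")
    if "no-store" in cache_control or "no-cache" in cache_control:
        return False

    # max-age takes precedence over Expires when both are present.
    match = re.search(r"max-age=(\d+)", cache_control)
    if match:
        return now - fetched_at < timedelta(seconds=int(match.group(1)))

    expires = headers.get("Expires")
    if expires:
        try:
            return now < parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            # An unparseable Expires value is treated as already expired.
            return False

    # No explicit freshness information: go back to the server.
    return False
```

When is_fresh returns True there is no request at all, not even the conditional one that ETags need.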

Sam Ruby also blogged about another
great example
of Expires headers. How often does the Google maps image data really change?

Here's another great example, applicable to our new web 2.0-ey future.
Yahoo! is
hosting their YUI library
for any application to use directly, without copying the toolkit to their own web site.
Let's peek at the headers from one of their files:

Good stuff! The Expires and Cache-Control headers render this file pretty much immutable,
as it should be. When Yahoo! releases the next version of the toolkit, it'll be hosted at
a different URL base, and so will be unaffected by the headers of this particular file;
they will be different URLs.
This sort of behaviour is highly optimal for web 2.0-ey apps, which are wont to download
a lot of static HTML, CSS and JavaScript files, which, for some particular version of the
app, will never change. And thus, by having the files cached on the client in such a way
that it never asks the server for them again, the app will come up all the quicker.
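
As a sketch of the same idea for your own app: put the version in the URL, and serve everything under that URL with far-future caching headers. The helper names, the example.com host and the one-year lifetime below are all assumptions for illustration, not Yahoo!'s actual setup.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

ONE_YEAR = 365 * 24 * 60 * 60  # seconds; an assumed "far future" lifetime


def versioned_url(base: str, version: str, path: str) -> str:
    """Build a URL whose path changes whenever the app version changes."""
    return f"{base.rstrip('/')}/{version}/{path.lstrip('/')}"


def far_future_headers(now: datetime) -> dict:
    """Headers that let clients reuse a versioned file without revalidating.

    `now` must be a timezone-aware UTC datetime.
    """
    return {
        "Cache-Control": f"public, max-age={ONE_YEAR}",
        "Expires": format_datetime(now + timedelta(seconds=ONE_YEAR), usegmt=True),
    }


# A new release gets a new URL, so the old cached copy is simply never asked for.
old = versioned_url("http://static.example.com/app", "1.0.0", "js/main.js")
new = versioned_url("http://static.example.com/app", "1.0.1", "js/main.js")
```

The trade-off is that you can never fix a file in place; any change, however small, means a new version in the URL.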