Why ESI is Still Important, and How to Make it Better

More than ten years ago, I was working at Akamai and got involved in the specification of Edge Side Includes (ESI), sort of a templating language for intermediaries.

Since then, interest in ESI has grown, waned and been reborn. As far as I can tell, it’s implemented not only by Akamai and Oracle (the main forces behind it), but also in Varnish, Squid, and lots of other places.

Back then, I had a strong suspicion that it’d die because people would see it as locking them into Akamai (or some other vendor). Why, then, is this limited, funny, embarrassingly simple little templating language still around?

In a word, it’s concurrency.

In the last couple of years, it’s become hot to build massively scalable Web servers by re-thinking how they handle concurrency, often using asynchronous, non-blocking single-process servers rather than threads or multiple processes.

The benefits of this approach have been known for a long time; way before Dan Kegel wrote the C10K page, Web proxy servers like Squid (and its predecessor, Harvest) were using this approach because it’s the only sensible way to scale for them.

However, as folks are finding out when they use newer tools that implement these methods (e.g., Twisted, Node.js), writing event-driven code is something you either love or hate. Many developers can’t stand it, especially for debugging (personally, I love it, but that’s just me).

So, ESI is a way to offer the massive concurrency of non-blocking, asynchronous servers in a way that’s easy to digest. Since fetching a URI doesn’t block, the only overhead is in stitching the page together, and you can control the overhead of that by limiting the language’s capability.
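To make that concrete, here is a minimal sketch in Python of what an ESI processor does with its includes: start every fetch at once, wait for them together, then stitch the results into the template. The fragment store, the URIs and the `fetch` and `render` helpers are all made up for illustration, and the "network" is just a short non-blocking sleep; a real implementation would issue HTTP requests through a cache.

```python
import asyncio
import re

# Hypothetical fragment store standing in for real HTTP fetches.
FRAGMENTS = {
    "/header": "<h1>News</h1>",
    "/stories": "<ul><li>Story</li></ul>",
    "/footer": "<p>Bye</p>",
}

# Matches the basic ESI include element, e.g. <esi:include src="/header"/>
INCLUDE = re.compile(r'<esi:include src="([^"]+)"\s*/>')


async def fetch(uri: str) -> str:
    """Pretend to fetch a fragment; the sleep yields to other fetches."""
    await asyncio.sleep(0.05)
    return FRAGMENTS[uri]


async def render(template: str) -> str:
    """Fetch all included fragments concurrently, then stitch the page."""
    uris = list(dict.fromkeys(INCLUDE.findall(template)))  # de-duplicate
    bodies = await asyncio.gather(*(fetch(uri) for uri in uris))
    fetched = dict(zip(uris, bodies))
    return INCLUDE.sub(lambda m: fetched[m.group(1)], template)


TEMPLATE = (
    '<body><esi:include src="/header"/>'
    '<esi:include src="/stories"/>'
    '<esi:include src="/footer"/></body>'
)

if __name__ == "__main__":
    print(asyncio.run(render(TEMPLATE)))
```

The page author only writes the template; the event-driven part stays inside the processor, and the three fetches overlap rather than running one after another.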

This makes ESI a great tool for building highly scalable dynamic Web sites without writing and debugging new code. Win.

Making ESI Better

ESI is, as mentioned, more than a decade old, and the Web has changed a lot in the intervening time. Even putting that aside, ESI isn’t exactly what we’d call Web-friendly. We can do better.

Over that time, I’ve had a number of thoughts about how to improve ESI as a language, which I’ve shared with some interested people privately. One of my back-burner projects has been to implement this, but I have to admit that this isn’t going to happen soon, since I’m busy doing several other things.

Instead, I’m going to dump those ideas here, and hope someone runs with them. Here are a few:

The biggest single way I can see to improve ESI is to make it possible to source variables from a URI. In other words, it should be possible to fetch a URI, parse the response (probably in JSON), and then reference the data returned when evaluating the template.

This would enable some really exciting things. Because variables are now just state, you can do things like cache user preferences – using plain old HTTP caching – and have that state be local to where it’s needed. When you update that state, it can be invalidated. ESI expressions now can have arbitrary, application-relevant input, instead of being limited to a few paltry request headers.

Here, you see some JSON being loaded into the user_prefs variable, from a URI that’s templated using a cookie that identifies the user, to drive how the page loads. This is very similar to a set of techniques I discussed a while back for composing services “RESTfully”, and it still works.
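As a rough sketch of the same idea in Python rather than ESI syntax: expand a URI from a cookie, load the JSON found there into a variable, and reference its members while evaluating the template. Everything named here is an assumption for illustration, including the `prefs.example.com` service, the `{user_id}` placeholder, and the `$(user_prefs.theme)` reference style; none of it is proposed syntax.

```python
import json
import re

# Hypothetical preference service keyed by URI; it stands in for a
# cacheable HTTP resource that returns JSON.
PREFS_SERVICE = {
    "http://prefs.example.com/users/abc123": '{"theme": "dark", "stories": 5}',
}


def expand(uri_template: str, cookies: dict) -> str:
    """Fill {name} placeholders in the URI from the request's cookies."""
    return re.sub(r"\{(\w+)\}", lambda m: cookies[m.group(1)], uri_template)


def load(uri: str) -> dict:
    """Fetch and parse JSON; a real processor would use its HTTP cache."""
    return json.loads(PREFS_SERVICE[uri])


def render(template: str, variables: dict) -> str:
    """Replace $(var.member) references with values from loaded variables."""
    def lookup(match):
        return str(variables[match.group(1)][match.group(2)])
    return re.sub(r"\$\((\w+)\.(\w+)\)", lookup, template)


cookies = {"user_id": "abc123"}
uri = expand("http://prefs.example.com/users/{user_id}", cookies)
variables = {"user_prefs": load(uri)}
page = render(
    '<body class="$(user_prefs.theme)">Top $(user_prefs.stories) stories</body>',
    variables,
)
```

Because the preferences live behind an ordinary URI, the intermediary can cache them like any other response and drop them when they are invalidated.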

JSON also presents a way to clean up the variable model generally; instead of the random collection of variables, ESI 2.0 could instantiate a request object, with appropriate members like .method, .cookie, .headers, and so forth. It also brings about the possibility of making response attributes available as well, at least in the context of an include.
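One possible shape for such a request object, sketched in Python. The class and its attribute names simply mirror the members mentioned above; the `uri` field and the lower-cased header keys are my own assumptions, not part of any spec.

```python
from dataclasses import dataclass, field
from http.cookies import SimpleCookie


@dataclass(frozen=True)
class Request:
    """Hypothetical request object exposed to template expressions."""
    method: str
    uri: str
    headers: dict = field(default_factory=dict)  # assumed lower-cased names

    @property
    def cookie(self) -> dict:
        """Cookies parsed from the Cookie header, as a plain name/value map."""
        parsed = SimpleCookie(self.headers.get("cookie", ""))
        return {name: morsel.value for name, morsel in parsed.items()}


request = Request("GET", "/home", {"cookie": "user_id=abc123; theme=dark"})
```

A template expression could then refer to `request.method` or `request.cookie["user_id"]` instead of a grab-bag of specially named variables.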

Going even further, JavaScript presents an opportunity to rally around a common, well-understood syntax for things like variable references, operators, and even common functions (e.g., string manipulation).

esi:include desperately needs a timeout parameter, and a sensible means of specifying fallback content (probably as a child of the include element).
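The behaviour I have in mind is roughly the following, sketched in Python with asyncio. The `include` helper, its `timeout` and `fallback` parameters, and the simulated `slow_origin` are hypothetical names for illustration only; how this would be spelled in ESI markup is left open.

```python
import asyncio


async def slow_origin(delay: float, body: str) -> str:
    """Simulated origin fetch that takes `delay` seconds to respond."""
    await asyncio.sleep(delay)
    return body


async def include(fetch, timeout: float, fallback: str) -> str:
    """Return the fetched fragment, or the fallback if it is late or fails."""
    try:
        return await asyncio.wait_for(fetch, timeout)
    except (asyncio.TimeoutError, OSError):
        # Timed out or the connection failed: serve the fallback content.
        return fallback


if __name__ == "__main__":
    fast = asyncio.run(include(slow_origin(0.01, "<p>ads</p>"), 1.0, ""))
    late = asyncio.run(include(slow_origin(1.0, "<p>ads</p>"), 0.05, "<p>no ads</p>"))
    print(fast, late)
```

The point is that a slow or broken fragment costs the page at most the timeout, and the author decides what fills the hole.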

Deeper integration with HTTP is necessary; not only should it be possible to access arbitrary aspects of the incoming request, but it should be possible to affect more of the outgoing response; e.g., the status code. Likewise, finer-grained control over outgoing requests (generated by include as well as load) would be good (e.g., via attributes on the element).

There are lots of smaller, easier wins. Not requiring valid XML is an obvious one; integrating URI Templates is likewise a no-brainer. Cleaning up some of the cruft in the syntax would be nice; there are some elements that people just don’t need in there (e.g., esi:inline, the alt attribute).

Anybody up for it?

9 Comments

Jan Algermissen said:

Great ideas, Mark!

The biggest problem I see is finding the right balance between enabling a useful amount of templating features (you mention string manipulation) on the one hand and shoving in a full blown scripting engine (e.g. JavaScript) on the other.

While limiting the templating capabilities provides for a simpler spec and faster implementations, it can also lead to undesired dependencies between services. For example, service A (sending the ESI template) may require an ESI-included service B to produce representations in a certain way (e.g. send address strings in a certain format) because of limitations in the templating capabilities.

Jan

Friday, October 21 2011 at 11:20 AM

Erik Mogensen said:

Varnish implements a tiny (?) subset of ESI (the Good Parts?). I propose that ESI should be stripped down. I think the subset is esi:include and esi:remove, based on the most-bang-for-the-buck principle.

inline, try, choose/when, vars and so on can be implemented by an origin server, returning the appropriate esi instructions to the intermediary, so when sponsoring Varnish Cache to implement ESI we chose to focus on the juicy bit, namely include and remove.

Maybe a “surrogate capability” which defined this minimal useful subset would be an idea. ESI-minimal/1.0 or something.

Friday, October 21 2011 at 11:52 AM

Mark Nottingham said:

Ilya G commented on Twitter:

“I’m growing more and more convinced that ESI is unnecessary.. client-side JS seems to be solving the same problem already.”

It’s definitely part of the solution, and people have talked about that for a while (even creating direct replacements).

The difference is that ESI in an intermediary has the advantage of shared caching; if your content is shared among a number of users, but combined in different ways, it’s a good candidate for composing the entire response from the intermediary, rather than going back to the origin.

Combine that with the benefit of being closer to the client (think Australia, or mobile, or emerging markets), and you’re shaving a huge amount of latency off your pages – both because they can be composed more locally, and because connections are terminated closer to the clients.

Saturday, October 22 2011 at 9:30 AM

Ilya Grigorik said:

Mark, but don’t we effectively get most of the same benefits? It seems like the core benefit of ESI is the fact that it forces you to decompose your page into multiple services. These endpoints, in turn, can implement their own smart caching and can be backed by CDNs, etc. In other words, all we’ve done is we’ve moved the ESI templating to the client.. and that has both pros and cons.

Pro: we know we want and need AJAX, so having an intermediary ESI service means replicating the template in both places, which is not a great experience. Con: the burden is on the client to assemble the page, which means many outbound requests.. but that, while not ideal, seems like a reasonable tradeoff.

In other words, if we know we need to decompose at UI layer, why bother with an intermediary? My own personal objection until recently has been: yes, but all of this JS coordination at UI layer breaks apart when we’re not JS enabled (ex: crawlers). Having said that, using a templating system like Closure (or similar), we can effectively render the same templates on server-side or client-side.. which gets us the benefit of both.

I do still see a place for ESI in specialized use cases.. but for general use, it seems like JSON endpoints + server/client templates is the right answer for the most part?

Saturday, October 22 2011 at 10:40 AM

Stefan Tilkov said:

I’d be extremely interested in this. Currently, ESI is very much tied to caching (even if only in people’s perception), but the more general use case is extremely interesting: the general aggregation of content, in my particular case from a large Web app where each page is dependent on a bunch of loosely coupled modules (I know, this sounds suspiciously like “SOA” or “portal”, but anyway). “A templating language for intermediaries” sounds exactly like what the world needs.

Saturday, October 22 2011 at 12:56 PM

Mark Nottingham said:

Hi Ilya,

You can totally do the composition in-client, yes. As you point out, the con of this is making several requests; current browsers don’t give enough control over pipelining nor connections to make it optimal, especially for poorly-connected clients.

The other benefit of doing it at the edge, rather than in-browser, is that you don’t expose your application data as easy-to-consume / repurpose services. Some businesses care about that, some don’t.

Like I said, I’m a little surprised that ESI is still going; there are a lot of trade-offs inherent in it, and while some people get excited about it (I still meet people who think it’s the bee’s knees, and they’re not dumb people), others see this happening in-client more.

My interest is mainly in making sure that people who want to use ESI have an opportunity to see the full potential of the language; it’s pretty sucky and old at the moment.

I’d also like to see more declarative client-side frameworks out there, ones that work well with the Web, instead of requiring code for everything (hey, I can dream).

Sunday, October 23 2011 at 3:57 AM

Jan Algermissen said:

Ilya,

mobile devices are also something where ESI is beneficial because a) there can be transformation to representations suited for the device after the ESI processing and b) letting the server determine what processing happens on the client is problematic for mobile devices. [1]

Jan

[1] Incidentally, I just came across this piece by Mark (Baker): http://www.coactus.com/blog/2007/08/mobile-ajax-workshop-position-paper/ (from 2007, but spot on as always)

Sunday, October 23 2011 at 7:46 AM

Ilya Grigorik said:

Mark, Jan fair points, I guess I’m just reflecting on how my own thinking has shifted with respect to ESI over the past few years.

While I was a huge fan of the spec, in large part due to the scatter/gather architecture it implied, it does also introduce a pretty high cost: the front-end developer now needs to learn yet another language. Given that this was originally conceived in a server-side context, it seems that recent trends are only making the case worse for ESI.. We want more reactive applications, which means the same architecture but client-side composition. This may make for sub-optimal load-time performance, but that’s a different story. In any case, I see where you’re coming from, but I’m still of the mind that ESI may be relegated to some niche use-cases. I doubt any attempt at standardization of this would bear much fruit: it seems like if you need ESI, then you’re probably in a rather special case (scale, distribution, etc), at which point a customized solution has likely been built already.

Re: obfuscation / hiding - that’s a fair concern.. you can’t do everything in the client. Having said that, for the majority of use cases obfuscating the JS source seems to get you 95% of the way there. Many of Google’s services are great examples: in theory, we do have “unofficial APIs”; in practice, relying on obfuscated endpoints and variables is brittle at best, and impractical in real life (been there, done that :))

Jan: It seems like we’re finally moving away from the “mobile” version of the web. All modern smartphones are perfectly capable of rendering the full experience. Granted, I’m not saying that’s necessarily the optimal experience on a much smaller screen, but at least from what I’ve seen.. in practice it yields better results than those uninspiring “mobile versions” of web apps (which are usually thrown together after the fact).

Monday, October 24 2011 at 5:44 AM

Mark Nottingham said:

I’d agree the impetus for standardising it is low; it’s important for it to be portable, but that can be achieved by Open Source.

I also can’t really argue that it’s a niche technology; I definitely can’t see it “taking over.” So, I should probably qualify the “important” in the title; it’s important for some.