Problem

Forces

The state of many web applications is inherently volatile. Changes can come from numerous sources: other users, external news and data feeds, the results of complex calculations, or triggers based on the current time and date.

HTTP requests can only be initiated by the client. When a state change occurs on the server, there's no way for the server to open connections to interested clients.

Solution

Stream server data in the response of a long-lived HTTP connection. Most web services do some processing, send back a response, and immediately exit. But in this pattern, they keep the connection open by running a long loop. The server script uses event registration or some other technique to detect any state changes. As soon as a state change occurs, it pushes new data to the outgoing stream and flushes it, but doesn't actually close it. Meanwhile, the browser must ensure the user-interface reflects the new data. This pattern discusses a couple of techniques for Streaming HTTP, which I refer to as "Page Streaming" and "Service Streaming".

Note that this pattern is very experimental, especially the "Service Streaming" variant. There remain many questions about feasibility and scalability. One particular gotcha is the effect of proxies: sometimes a proxy sitting between server and browser will buffer responses, an unfortunate optimisation that prevents real-time data from flowing into the browser.

"Page Streaming" involves streaming the original page response. Here, the server immediately outputs an initial page and flushes the stream, but keeps the connection open. It then proceeds to alter the page over time by outputting embedded scripts that manipulate the DOM. As far as the browser is concerned, it's still loading the initial page, so whenever it encounters a complete <script> tag, it executes the script immediately. A simple demo is available at http://ajaxify.com/run/streaming/.

For example, the server can initially output a div which will always contain the latest news.

print ("<div id='news'></div>");

But instead of exiting, it starts a loop to update the item every 10 seconds.
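As a sketch, here's what each iteration of that loop writes, translated to Node.js for illustration (the demo itself is PHP; the function and parameter names here are our own):

```javascript
// Hypothetical sketch of one iteration of the server loop ("Page Streaming").
// `res` is anything with a write() method, e.g. an http.ServerResponse that
// has been flushed but deliberately left open.
function pushNewsUpdate(res, headline) {
  // Escape characters that would break the embedded JS string literal
  var safe = headline.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  // The browser is still parsing the page, so it executes this <script>
  // block as soon as the complete tag arrives.
  res.write("<script>document.getElementById('news').innerHTML = '" +
            safe + "';</script>\n");
}
```

The real loop would call something like this after each state change (or every ten seconds), flush the stream, and sleep, without ever closing the response.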

Each language and environment will have its own idiosyncrasies in implementing this pattern. For PHP, there's fortunately some very useful advice in the online comments for the flush() function. It turned out to be necessary to execute ob_end_flush() before flush() could be called. There's also a max_execution_time parameter you might need to increase, and the web server will have its own timeout-related parameters to tweak.

That illustrates the basic technique, and there are some refinements discussed below and in the Design Decisions. One burning question you might have is how the browser initiates communication, since the connection is in a perpetual response state. The answer is to use a "back channel", i.e. a parallel HTTP connection. This can easily be accomplished with an XMLHttpRequest Call or an IFrame Call. The streaming service will be able to effect a subsequent change to the user interface, as long as it has some means of detecting the call. For example, a session object, a global application object (such as the applicationContext in a Java Servlet container), or the database.

"Page Streaming" means the browser discovers server changes almost immediately. This opens up the possibility of real-time updates in the browser, and allows for bi-directional information flow. However, it's quite a departure from standard HTTP usage, which leads to several problems. First, there are unfortunate memory implications, because the JavaScript keeps accumulating, and the browser must retain all of it in its page model. In a rich application with lots of updates, that model is going to grow quickly, and at some point a page refresh will be necessary in order to avoid hard-drive swapping, or a worse fate. Second, long-lived connections will inevitably fail, so you have to prepare a recovery plan. Third, most servers can't deal with lots of simultaneous connections. Running multiple scripts is certainly going to hurt when each script runs in its own process, and even in more sophisticated multi-threading environments, resources are limited.

Another problem is that JavaScript must be used, because it's the only way to alter page elements that have already been output. In its absence, the server could only communicate by appending to the page. Thus, browser and server are closely coupled, making it difficult to write a rich Ajaxy browser application.

"Service Streaming" is a step towards solving these problems, though it doesn't work on all browsers. The technique relies on XMLHttpRequest Call (or a similar remoting technology like IFrame Call). This time, it's an XMLHttpRequest connection that's long-lived, instead of the initial page load. There's more flexibility regarding the length and frequency of connections. You could load the page normally, then start streaming for thirty seconds when the user clicks a button. Or you could start streaming once the page is loaded, and keep resetting the connection every thirty seconds. Having a range of options helps immeasurably, given that HTTP Streaming is constrained by the capabilities of the server, the browsers, and the network.

As for the mechanics of service streaming, the server uses the same trick of looping indefinitely to keep the connection open, and periodically flushing the stream. The output can no longer be HTML script tags, because the web browser wouldn't automatically execute them, so how does the browser deal with the stream? The answer is that it polls for the latest response and uses it accordingly.

The responseText property of XMLHttpRequest always contains the content that's been flushed out of the server, even while the connection is still open. So the browser can run a periodic check, e.g. to see whether its length has changed. One problem, though, is that once flushed, the service can't undo anything it has output. For example, the responseText string arising from a timer service might look like this: "12:00:00 12:01:05 12:02:10", whereas ideally it would be just "12:02:10". The solution here is to parse the response string and only look at the last value. To be more precise, the last valid value, since you don't want to grab a partial result. One way to apply this technique: to ease parsing, the service outputs each message delimited by a special token, "@END@". Then, a regular expression can be run to grab the latest message, which must be followed by that token to ensure it's complete.
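A minimal sketch of that extraction, using a split rather than a regular expression (the function name is ours; the "@END@" token is from the example above):

```javascript
// Return the last complete message in a streamed responseText, where the
// server terminates each message with "@END@". A trailing partial message
// (still being flushed) is ignored; null means no complete message yet.
function latestMessage(responseText) {
  var parts = responseText.split("@END@");
  // The final element is either "" (the text ends exactly at a token) or a
  // half-flushed message; either way it is not a complete message.
  parts.pop();
  return parts.length > 0 ? parts[parts.length - 1] : null;
}
```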

That's great if you only care about the last message, but what if the browser needs to log all messages that came in? Or process them in some way? With a polling frequency of ten seconds, the previous sequence would lead to values being skipped; the browser would jump from 12:00:00 to 12:02:10, ignoring the second value (12:01:05). If you want to catch all messages, you need to keep track of the position you've read up to, a technique used in the Refactoring Illustration below.

In summary, Service Streaming makes the HTTP Streaming approach more flexible, because you can stream arbitrary content rather than JavaScript commands, and because you can control the connection's lifecycle. However, it combines two technologies that aren't consistent across browsers, with predictable portability issues. Experiments suggest that the Page Streaming technique works on both IE and Firefox ([1]), but Service Streaming only works on Firefox, whether XMLHttpRequest ([2]) or IFrame ([3]) is used. With XMLHttpRequest, IE suppresses the response until it's complete; with the IFrame, it works if a workaround is used:
IE only starts accepting messages from the server after the first 256 bytes have arrived, so the only thing to do is to send 256 dummy bytes before sending the messages.
After this, all messages will arrive as expected. So full Service Streaming is possible in IE, too!
You could claim that's either a bug or a feature; either way, it works against HTTP Streaming. So for portable page updates, you have a few options:

Use a limited form of Service Streaming, where the server blocks until the first state change occurs. At that point, it outputs a message and exits. This is not ideal, but certainly feasible (see Real-World Examples).
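The 256-byte IFrame workaround quoted above amounts to sending filler before the first real message; a sketch (the function name is ours, and the figure is taken from the quote):

```javascript
// Filler the server writes before any real content, so that IE starts
// delivering the iframe's stream immediately (per the workaround above).
function iePreamble() {
  return new Array(257).join(" "); // 256 space characters
}
```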

Design Decisions

How long will you keep the connection open?

It's impractical to keep a connection open forever. You need to decide on a reasonable period of time to keep the connection open, and this will depend on:

The resources involved: server, network, browsers, and supporting software along the way.

How many clients will be connecting at any time. Not just averages, but peak periods.

How the system will be used - how much data will be output, how will the activity change over time.

The consequences of too many connections at once. For example, will some users miss out on critical information?

It's difficult to give exact figures, but it seems fair to assume a small intranet application could tolerate a connection of minutes or maybe hours, whereas a public dotcom might only be able to offer this service for quick, specialised situations, if at all.

How will the browser distinguish between messages?

As the Solution mentions, the service can't erase what it has already output, so it often needs to output a succession of distinct messages. You'll need some protocol to delineate the messages, e.g. the messages fit a standard pattern, the messages are separated by a special token string, or the messages are accompanied by some sort of metadata, such as a header indicating message size.
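As an illustration of the metadata option, here's a sketch of length-prefixed framing (the "LEN:payload" format is our own invention, not something the pattern prescribes):

```javascript
// Parse a stream where each message is prefixed by its character count and
// a colon, e.g. "5:hello6:world!". Returns the complete messages plus the
// unconsumed tail (an incomplete frame still being flushed).
function parseSizedFrames(stream) {
  var messages = [];
  var pos = 0;
  for (;;) {
    var colon = stream.indexOf(":", pos);
    if (colon === -1) break; // no complete header yet
    var len = parseInt(stream.substring(pos, colon), 10);
    if (isNaN(len) || stream.length < colon + 1 + len) break; // body incomplete
    messages.push(stream.substring(colon + 1, colon + 1 + len));
    pos = colon + 1 + len;
  }
  return { messages: messages, rest: stream.substring(pos) };
}
```

Unlike token delimiters, this scheme lets message bodies contain arbitrary content, at the cost of the server knowing each message's size up front.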

Real-World Examples

APE

Ajax Push Engine (APE) is a complete open-source packaged solution (including a Comet server), designed to push real-time data in a lightweight, highly scalable, and modular way, using only JavaScript on the client side. Because it builds on web standards, APE is fully cross-browser: it uses the latest browser features and provides backward compatibility for older ones. The APE server is written in C, has a small footprint, and does not require any dependencies.

Caplin Liberator

Liberator is a free version of a commercial Comet server from Caplin Systems. Liberator can handle tens of thousands of concurrent users on a single mid-range server while processing up to 4 million messages per second with very low message latency. Liberator is a standalone server that runs on Linux, and is widely used for financial applications. It supports Ajax, Flex, Silverlight, Java and .Net clients.

Escape From The Web

Escape From The Web is a web-based terminal emulator written in Python using the Tornado web framework and the MochiKit JavaScript framework. It is based on Ajaxterm but unlike Ajaxterm, EFTW does *not* use long-polling (constantly re-requesting screen updates every second or so). Instead, EFTW uses long-lived HTTP streams to keep the screen updated at all times which allows for more immediate updates and uses considerably less bandwidth.

Jotspot Live

Jotspot Live is a live, multi-user wiki environment which uses HTTP Streaming to update message content. In an interview with Ajaxian.com, developer Abe Fettig explained the design is based on LivePage (see below).

Kaazing Enterprise Comet

Kaazing Enterprise Comet provides the industry's most productive and advanced environment for creating real-time Web applications that extend SOA's event and message delivery to the browser, allowing browsers to participate in the server-side message bus. All applications developed with Enterprise Comet are deployed to standard Java EE containers with no browser plugins.

Lightstreamer

Lightstreamer was one of the very first servers to successfully implement HTTP Streaming, back in 2000. The product has evolved to support all the real-time web paradigms, including Comet and WebSockets. Lightstreamer is now a widespread solution, used by startups as well as Fortune 500 corporations, wherever there is a need to push live data to and from mobile and desktop browsers and applications.

LivePage

LivePage is part of Donovan Preston's Nevow framework, a Python-based web application framework. Events are pushed from the server using XMLHttpRequest-based Service Streaming. For compatibility reasons, Nevow uses the technique mentioned in the solution, where the connection closes after first output. Donovan explained the technique to me:

When the main page loads, an XHR (XMLHttpRequest) makes an "output conduit" request. If the server has collected any events between the main page rendering and the output conduit request rendering, it sends them immediately. If it has not, it waits until an event arrives and sends it over the output conduit. Any event from the server to the client causes the server to close the output conduit request. Any time the server closes the output conduit request, the client immediately reopens a new one. If the server hasn't received an event for the client in 30 seconds, it sends a noop (the javascript "null") and closes the request.

MigratoryData WebSocket Server

MigratoryData WebSocket Server is an enterprise-grade WebSocket server for building highly scalable real-time websites. A single MigratoryData instance is able to handle up to 1 million concurrent users. For old browsers without native WebSockets, data is pushed via techniques (such as HTTP Streaming) which use a single persistent TCP connection, as WebSockets do. So true data push is achieved for all browsers via pure JavaScript (no plug-in needed).

NGiNX HTTP Push Module

NGiNX HTTP Push Module is an addon for the Nginx server that turns it into a robust HTTP Push and Comet server. It takes care of all the connection juggling and exposes a simple interface for broadcasting messages to clients via plain old HTTP requests. This lets one write live-updating asynchronous web applications as easily as their old-school classic counterparts, since your code does not need to manage requests with delayed responses.

Pushlets

Realtime on Rails

Martin Scheffler's Realtime on Rails is a real-time chat application that uses Service Streaming on Firefox and, because of the restrictions described in the Solution above, Periodic Refresh on other browsers.

StreamHub

StreamHub Comet Server is a cross-platform push server with free and commercial versions. It can handle tens of thousands of users, or hundreds of thousands when run as part of a cluster. It is mainly used in low-latency financial applications. Currently it supports Ajax, Java, .NET, and HTML clients; additionally, C++, Silverlight, and iPhone clients are available under early access. A GWT Comet Adapter is available for implementing Comet in GWT applications.

WebSync

WebSync is a complete implementation of the Bayeux protocol for the Microsoft platform (.NET/IIS/SQL) allowing full-duplex communication in all major browsers, mobile phones, and common language frameworks. WebSync is both vertically and horizontally scalable, able to support tens of thousands of concurrent users on each server. It has built-in support for server farms with fail-over handling. WebSync provides an ultra-light JavaScript client, a .NET client, as well as PHP and .NET publishers for thin and thick application integration. WebSync is 100% standards-compliant, making use of HTML5 features whenever browser support is available. WebSync is available as both an add-on module for IIS and as a service on demand (SaaS).

Rupy

Refactoring Illustration

Streaming Wiki Demo

The Basic Wiki Demo updates messages with Periodic Refresh, polling the server every five seconds. This Demo replaces that mechanism with Service Streaming. The AjaxCaller library continues to be used for uploading new messages, which effectively makes it a back-channel.

The service remains generic, oblivious to the type of client that connects to it. All it has to do is output a new message each time it detects a change. Thus, it outputs a stream like this: <message>content</message>some time later<message>content</message>.... To illustrate how the server can be completely independent of the client application, it's up to the client to terminate the service (in a production system, it would be cleaner for the service to exit normally, e.g. every sixty seconds). The server detects data changes by comparing old messages to new messages (it could also use an update timestamp if the message table contained such a field).

The browser sets up the request using the standard open() and send() methods. Interestingly, there's no onreadystatechange handler, because we're going to use a timer to poll the text. (In Firefox, a handler might actually make sense, because onreadystatechange seems to be called each time the service flushes output.)

pollLatestResponse() keeps reading the outputted text. It keeps track of the last complete message it detected using nextReadPos. For example, if it's processed two 500-character messages so far, nextReadPos will be 1001. That's where the search for a new message will begin. Each complete message after that point in the response will be processed sequentially. Think of the response as a work queue. Note that the algorithm doesn't assume the response ends with </message>; if a message is half-complete, it will simply be ignored.
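A sketch of that algorithm follows (the demo's actual code differs in detail; the names track the description above, though this version uses zero-based indices):

```javascript
var nextReadPos = 0; // index just past the last complete message processed

// Scan responseText from nextReadPos, handing each complete
// <message>...</message> to onMessage in order. A half-flushed trailing
// message is left alone until the next poll.
function pollLatestResponse(responseText, onMessage) {
  for (;;) {
    var start = responseText.indexOf("<message>", nextReadPos);
    if (start === -1) return; // nothing new
    var end = responseText.indexOf("</message>", start);
    if (end === -1) return; // message not fully flushed yet
    onMessage(responseText.substring(start + "<message>".length, end));
    nextReadPos = end + "</message>".length;
  }
}
```

In the real demo this would run on a timer, reading the XMLHttpRequest's responseText each tick.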

Alternatives

Periodic Refresh

Periodic Refresh is an obvious alternative to HTTP Streaming. Rather than having the server push data, the client checks for new data by frequently polling the server. Generally, Periodic Refresh is easier to implement than HTTP Streaming (which is one reason dedicated HTTP Streaming servers exist). However, whether Periodic Refresh or HTTP Streaming is more scalable depends on the particular implementation: although streaming requires resources to manage the long-lived connections, periodic refresh incurs the overhead of many requests that end up returning no data.

This works fine in Firefox and Internet Explorer... I haven't tested anything else. If the browser disconnects from the server, it's as if the iframe finished loading. Firefox correctly calls the .onload function; IE seems to have a bug which prevents this from happening. However, you can still use the readyState property to see if the page has "finished loading", checking it with a timer every half-second to see if we're disconnected. (Kinslayer)