A new HTTP header that might be useful

The Hypertext Transfer Protocol, HTTP, underlies every website visit you make. Your browser negotiates with the server for the website. Once they reach their split-second agreement, the server starts to send the web page, along with any other requested files (like images) [1].

[1] Curious? If you are using Firefox, install an extension called Firebug. This popular programmer's tool reveals all this under-the-hood magic.

The governing document specifying the protocol is RFC 2616, published by the Internet Engineering Task Force (IETF), an anarchic nerdopoly that authoritatively defines hundreds of protocols and file formats for internet use.

In common with many other IETF protocols, HTTP allows servers and browsers to exchange information not specified in the RFC by using X- headers. Originally the ‘X’ stood for ‘experimental’. Today X- headers are used to create important de facto sources of information and to facilitate services that build on the basic HTTP model.
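For illustration, here is what X- headers look like in the wire format, with a short Python sketch that picks them out of a raw response. The `X-Example` header and its value are invented for the example; `X-Powered-By` is a real-world header commonly emitted by PHP servers.

```python
# Parse a raw HTTP/1.1 response and collect its X- extension headers.
# X-Example is a made-up header; X-Powered-By is a common real one.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "X-Powered-By: PHP/5.2.0\r\n"
    "X-Example: experimental-value\r\n"
    "\r\n"
    "<html>...</html>"
)

# Headers end at the first blank line; the body follows.
header_block = raw_response.split("\r\n\r\n", 1)[0]
headers = {}
for line in header_block.split("\r\n")[1:]:  # skip the status line
    name, _, value = line.partition(": ")
    headers[name] = value

x_headers = {k: v for k, v in headers.items() if k.startswith("X-")}
print(x_headers)  # {'X-Powered-By': 'PHP/5.2.0', 'X-Example': 'experimental-value'}
```

Servers and browsers that don't understand a given X- header simply ignore it, which is what makes this extension mechanism safe to use.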

I think that a useful additional header would be X-Torrent.
HTTP is a client-server protocol: all clients look up a single service by a universal addressing scheme. That single service provides the data, images, and the rest by itself. This often leads to complicated schemes to disguise whole legions of physical servers as a single logical server in order to cope with demand.

If you are a large firm with a relatively predictable load, this is all well and good. But if you are a small website, perhaps hosted by a shared host or a modest virtual private server, it won’t stand up to the sudden surge in traffic that might come down the pipe from Digg, Reddit, Slashdot and the like.

By contrast, the BitTorrent protocol works by distributing files between peers in small chunks. As more and more peers obtain parts of the file, the aggregate bandwidth and load capacity for that file increases. A popular torrent can have staggering aggregate bandwidth by a simple scheme of many small connections donating some of their capacity.

The thinking behind the X-Torrent header is that whenever a web server returns any headers over HTTP, it includes an X-Torrent header pointing to a torrent tracker for that document. When a website becomes heavily loaded, browsers would use the BitTorrent protocol to obtain the requested document or resource from their peers, distributing the load amongst the audience as well as on the server.
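A minimal sketch of the server side of this idea, using Python's standard `http.server`. The header name `X-Torrent`, the tracker host, and the path-to-torrent URL mapping are all assumptions of this proposal, not any existing standard:

```python
from http.server import BaseHTTPRequestHandler

TRACKER_BASE = "http://tracker.example.com"  # hypothetical tracker host


def x_torrent_url(path: str) -> str:
    """Map a requested resource path to its (assumed) .torrent URL."""
    return f"{TRACKER_BASE}{path}.torrent"


class XTorrentHandler(BaseHTTPRequestHandler):
    """Handler that advertises a hypothetical X-Torrent header on every response."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        # The non-standard part: tell clients where a torrent mirror
        # of this exact resource can be found.
        self.send_header("X-Torrent", x_torrent_url(self.path))
        self.end_headers()
        self.wfile.write(body)
```

A browser that understood the header could ignore it under normal conditions and fall back to the torrent only when the origin server stopped responding.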

This would be particularly useful for cases where many users try to access a single large file at the same time. It would also provide a useful form of distributed caching — other systems might be able to pull resources directly from the torrent rather than the original server.

Such a scheme would require web servers to push the header and maintain a BitTorrent tracker file. Browsers would need to include the BitTorrent protocol. After that, it might turn out to be a way to make everyone’s life just a little easier.

Update: a commenter on Reddit makes the fairly sensible suggestion that alternatively, web browsers could send a message to the web server that they will accept a torrent file instead of the original using the HTTP Accept-Encoding header. This would eliminate the need for another header and simplify the semantics a bit.
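The commenter's variant might look like the following server-side sketch. Note that "torrent" is not a registered content-coding token for Accept-Encoding, so this is purely illustrative of the negotiation shape:

```python
# Sketch of content negotiation via Accept-Encoding. The "torrent"
# token is invented here -- it is not a registered HTTP content coding.
def choose_representation(accept_encoding: str) -> str:
    """Return "torrent" if the client advertises it, else "identity"."""
    # Strip quality parameters like ";q=0.8" from each token.
    tokens = [t.strip().split(";")[0] for t in accept_encoding.split(",")]
    return "torrent" if "torrent" in tokens else "identity"

print(choose_representation("gzip, torrent"))  # torrent
print(choose_representation("gzip, deflate"))  # identity
```

The appeal is that this reuses HTTP's existing negotiation machinery instead of adding a header; the cost, as a commenter notes below, is that the overloaded server still has to answer the request at all.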

13 Responses to A new HTTP header that might be useful

Each administrator having to set up a tracker may not be as big a deal as you think… Amazon S3 automagically produces torrent files for any files that are stored on the service. See here for some info.

a commenter on Reddit makes the fairly sensible suggestion that alternatively, web browsers could send a message to the web server that they will accept a torrent file instead of the original using the HTTP Accept-Encoding header.

This doesn’t seem sensible to me, since it depends on the server that we’re assuming cannot send you the page in the first place. There’s also a problem with page updating: let’s say you updated this blog post, and client 1 had cached and started torrenting the old version. Client 2 comes along, successfully receives the new updated page, and starts serving it through the X-Torrent tracker. Client 1 needs to realize it has an old copy, and re-request the page or get off the tracker.

This presents a security hole: any client can join the X-Torrent tracker, claim it has a new version of the site, DDoS the original server, and distribute its own defaced version using X-Torrent. The only way to guarantee that the author’s content is, excuse me for the pun, authoritative is to have the author act as the sole seed, bringing updates to the tracker and refreshing everyone who is not identified as the author.
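Worth noting: standard BitTorrent already guards against substituted content, because the .torrent metainfo the author publishes carries a SHA-1 hash for every piece, and peers discard pieces that fail the check. A defaced copy cannot pass verification, and a genuinely new page version requires new metainfo from the author, so the server stays authoritative for updates. A minimal sketch of that check (the sample bytes are invented):

```python
import hashlib


def verify_piece(piece_data: bytes, expected_sha1: bytes) -> bool:
    """Accept a downloaded piece only if it matches the hash in the metainfo."""
    return hashlib.sha1(piece_data).digest() == expected_sha1


# The author publishes these hashes inside the .torrent file.
original = b"the author's published page"
expected = hashlib.sha1(original).digest()

print(verify_piece(original, expected))         # True
print(verify_piece(b"defaced page", expected))  # False
```

This doesn't solve the stale-copy problem the comment raises, but it does mean a malicious peer can only serve old content, not arbitrary content.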

I can’t see how this can work (without a lot of effort). It’s a good idea but you’re going to end up with something like Amazon’s S3 (thanks gassit) which isn’t as transparent as you want.

First issue: the server is stalled and can’t supply the page, so you will never see the X-Torrent header.

So you solve that like Amazon do, by layering BitTorrent over the top – i.e. instead of serving the actual page to all requesters, you serve up a torrent file, and then the client browser uses that to download the page. With light traffic this ends up a little slower, as there is a level of indirection, and when things get heavy there’s a BitTorrent cloud to serve up your page. Cool idea, and it’s what Amazon are doing.

But you are going to have to change all your links – i.e. get your site to reference a bunch of torrent files rather than HTML files.

Next you’re going to have to get clients (all of them, if you are to have any hope of transparency) to recognize what you are doing.

That is, you are going to have to turn the web (or at least your part of it) into a *.bt cloud rather than a *.html cloud.

In other words, browsers have to become BitTorrent clients. Not technically impossible, but not piss easy either (Firefox has a torrent plugin but I’ve never got it to work). Further, you have to distribute this new functionality and get everyone to use it.

At this point you’re trying to recreate the web from scratch using (let’s make up an acronym) BTTP instead of HTTP.

To make this workable I think you’d have to extend HTTP using something like the temporary-redirect condition (“this page temporarily moved” – I forget the code), so that the browser would attempt to reload the page the user is looking for but actually load the torrent file instead (which assumes your server is responsive enough to return that condition *and* that some server is available and responsive enough to subsequently serve up the torrent file).
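The redirect flow described above might be sketched as follows. Status 302 is HTTP's temporary redirect ("Found" in RFC 2616) and is cheap to emit even from a stressed server; the threshold, paths, and tracker URL here are invented:

```python
# Sketch: below a load threshold, serve the page normally; above it,
# redirect the browser to a .torrent for the same resource.
# tracker.example.com and the threshold are illustrative assumptions.
def respond(path: str, active_connections: int, overload_threshold: int = 1000):
    """Return an (HTTP status, headers) pair for the given load level."""
    if active_connections > overload_threshold:
        # 302 Found: temporary redirect to the torrent mirror.
        return 302, {"Location": f"http://tracker.example.com{path}.torrent"}
    return 200, {"Content-Type": "text/html"}

status, _ = respond("/index.html", 5)        # normal load -> 200
status, _ = respond("/index.html", 5000)     # overload    -> 302
```

This keeps the HTML links unchanged and pushes the torrent fallback into the response path, at the cost of requiring the browser to know what to do with a 302 that points at a torrent file.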

Then – so long as the browser can recognize torrent files – Bob’s your uncle.

I just don’t see how you could make this happen quickly enough to get the benefits you seek. You need a certain population of enabled browsers to get enough of a cloud going to get the benefits.

The technical way of explaining what I’ve described above is:
1. You’re layering (or tunnelling) one protocol (BT) over another (HTTP)
2. That means you need protocol agents for that second protocol (BT) at each end (both in the browser and the server)

OK, it would add an extra layer of indirection to BitTorrent, but you could add extra things like proxying and authentication to torrents.

You’re looking at it as an extension to HTTP, and thinking that it has some problems. Yes, but it doesn’t mean that every HTML page has to be a BT page. It could mean the opposite: every BT packet could have an HTTP header. That would add a lot of already well-thought-out functionality.