Is the World Wide Web for luvvies and VCs – or for all of us?

Part 1: In which we look at what the Greatest Living Briton got wrong (and right)

Analysis The Web turns 25 years old today, and its inventor Sir Tim Berners-Lee has written yet another declaration of rights – a "Magna Carta" – to mark the occasion.

These incessant anniversaries are proof that journalists and media luvvies love looking backwards rather than reporting what's in front of them - warts and all. But especially not the warts. They are also proof that web people cannot resist the self-indulgence of penning a manifesto.

The web does not lack manifestos, declarations, decrees, charters, proclamations, or lists. Many were written in bedrooms; all have the sickly insincerity of a Hallmark greeting card. Perhaps a Web Manifesto is the zenith of all listicles, a kind of Holy Listicle Scripture.

Nobody could reasonably disagree with Berners-Lee's commandments at their most vague and generalised. But these windy declarations are really vanity products, to remind the world how self-righteous the writer is. Much as some people like to advertise: "I've done my recycling. Have you?" others like to declare: "I use the Web. I am more virtuous than thou. My manifesto proves it." It pays to look behind the curtain.

Before we can look forward, though, we have to see what the web is good at and bad at - but before we can do that, we have to establish what it is, exactly. And how we all got here.

A bastard publishing format

The World Wide Web was essentially a quick hack; it was a piece of improvisation. It just happened to be a hack the world found useful at the time. Electronic publishing using machine-readable tags, or markup, to give things in documents meaning and describe how they appeared, had been evolving throughout the 1970s.

By the mid-1980s, it was moving from something only IBM customers could use to an open format everybody could use. The landmark was the first SGML specification, published in 1985. Its roots are described here - in the most important Web document nobody has ever read. SGML was rich and promising indeed.

Writing in 1971, Charles Goldfarb described the fundamentals:

The principle of separating document description from application function makes it possible to describe the attributes common to all documents of the same type. ... [The] availability of such "type descriptions" could add new function to the text processing system. Programs could supply markup for an incomplete document, or interactively prompt a user in the entry of a document by displaying the markup. A generalized markup language then, would permit full information about a document to be preserved, regardless of the way the document is used or represented.

So SGML was about a lot more than presentation, the coat of paint on the toy. It allowed classes of documents and even "mini-languages" to be defined, and documents to link to data outside themselves. It could yank in databases, and everything processed had a meaning.
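Goldfarb's principle is easier to see than to describe. What follows is only a minimal sketch, in Python, and it uses XML as a stand-in because the standard library has no SGML parser. The element names (memo, to, date, body), the memo's contents and both functions are invented for illustration; none of it comes from a real document type.

```python
import xml.etree.ElementTree as ET

# A document marked up by what things ARE, not by how they look.
# Element names and contents are hypothetical, made up for this sketch.
SOURCE = """
<memo>
  <to>All staff</to>
  <date>1971-06-01</date>
  <body>The canteen is closed on Friday.</body>
</memo>
"""


def render_plain(doc):
    """One 'application': lay the memo out as plain text."""
    return "TO: {}\nDATE: {}\n\n{}".format(
        doc.findtext("to"), doc.findtext("date"), doc.findtext("body")
    )


def extract_record(doc):
    """Another 'application': treat the very same markup as data."""
    return {child.tag: child.text for child in doc}


doc = ET.fromstring(SOURCE.strip())
print(render_plain(doc))
print(extract_record(doc))
```

The markup says nothing about fonts or layout, which is the point: the same source feeds a formatter and a database loader, and neither had to be known when the memo was written.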

By the late 1980s, everyone involved in professional technical publishing was getting ready for the SGML revolution, and one of these was a contractor in technical publishing at CERN, Tim Berners-Lee. He took the basic elements from what would be another instance of an SGML markup language and lashed it to the client/server architecture of the academics' network. He wanted something much simpler and immediately useful. SGML was all about doing it right - and it was complex and formal.

The brilliantly clever bit of Berners-Lee's proposal was the simplicity - he'd created an instantly useful document management system. For a while, as the internet was opened up, Berners-Lee's HTML was just another navigation system alongside Gopher and WAIS. But by the end of 1994 VC money turned an academic side project into Netscape, and from that point on, the world would have to work with HTML.

So, we've been struggling with the compromises and omissions of the hack ever since.

In 1992, Berners-Lee revised HTML to make it more SGML compliant, and less anarchic: browsers would henceforth present (more or less) the same results. It was only in 1996, with the development of the XML and XHL specs, that some of what Berners-Lee, in his scramble to make something small and useful, had omitted began to be restored.

Equipped with these, document links can have multiple sources and targets. XML grappled with semantics, too, giving tags meanings. Style sheets were another plank of '80s-era SGML publishing, and attempted to sort out the presentation mess - separating the look from the document itself. They popped up for the web in 1998. Berners-Lee himself has been hawking the semantic web for almost 20 years.

So if you want a glimpse of "the future of the web" today, strictly speaking, you only need to look at what Berners-Lee left out (and what XML and other specs "restored", to some extent, 18 years ago). I put "restored" in scare quotes there, because, incredibly, much of it has yet to be implemented - and some of the most mind-blowing parts have been completely forgotten. A search for XHL brings up the first reference to the technology only on the third page of the Google search results. We have to live with the consequences.

Try, for example, searching for articles or blog posts on Thai politics published in December 2013 - with the date as part of the query. You can't. Google's intitle: and related: directives merely hint at what the web could look like. For all the PhDs at Google, Microsoft and Facebook, the web we use today is far dumber than it should be. That's not all that "didn't happen". The web developed in ways that precluded some quite interesting business innovation, too.