The dream of the Semantic Web

The Semantic Web is going to succeed, in the face of its grandest ambitions.

When Tim Berners-Lee first proposed the idea, it promised something big. Very big. In his article in Scientific American, the vision he sketched for us had the world’s data interoperating smoothly. But the vision was actually even bigger than that. It came out of the idea of knowledge representation, which in turn sprang from work in artificial intelligence. This initial dream of the Semantic Web wasn’t just about applications sharing data. Rather, the Semantic Web promised to put this data together in ways that understood the relationships to such a degree that smart programs would be able to make logical leaps to derive new information. So, if it knew that the #86 bus runs from Cleveland Circle to Harvard Square, it would also be able to deduce that the bus has a motor, that it might occasionally need oil, that people ride in it, that they perhaps pay for the privilege, that it therefore might have some form of money-collecting device and that it won’t fit in your pocket.

This is because the Semantic Web doesn’t just give us a way to share data. Any ol’ standard lets us do that, whether it’s JPG for sharing images or comma-delimited files for sharing spreadsheets. The Semantic Web gives us a way to capture the relationships among the pieces. It does this through the Semantic Web’s preferred data capture standard, RDF (Resource Description Framework). RDF captures information in what philosophers used to call judgments: "A relates to B." But unlike the usual links in HTML documents, RDF lets you specify what the relationship is: A is the sister of B, B has a height of 6 feet, B is a hedge fund manager. From this, a computer could conclude that A’s sibling is six feet tall. With a little more information about how relationships work, the computer could also figure out that A is a woman and that B would probably be a good person for A to ask for a loan. These sorts of relationships are specified in what the Semantic Web calls ontologies, which are themselves expressed in a standard called OWL (Web Ontology Language).
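
To make the A-and-B example a little more concrete, here is a rough sketch in plain Python, not in real RDF or OWL. The predicate names (sisterOf, siblingOf, heightInFeet and so on) and the single hand-written rule are invented for illustration; an actual system would use URIs for its terms and an ontology language to express the rule. The point is only to show how a stated relationship plus a small rule yields facts nobody wrote down.

```python
# Statements as (subject, predicate, object) triples.
# All names here are hypothetical, chosen only to mirror the example above.
triples = {
    ("A", "sisterOf", "B"),
    ("B", "heightInFeet", 6),
    ("B", "occupation", "hedge fund manager"),
}


def infer(facts):
    """Apply one illustrative rule repeatedly until nothing new appears.

    Rule (an assumption standing in for an ontology): if X is the sister
    of Y, then X is female and X and Y are siblings of each other.
    """
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            if p == "sisterOf":
                new.add((s, "gender", "female"))
                new.add((s, "siblingOf", o))
                new.add((o, "siblingOf", s))
        if new <= facts:  # no fact we did not already have
            return facts
        facts |= new


def sibling_heights(facts, person):
    """Return (sibling, height) pairs for every sibling with a known height."""
    siblings = {o for s, p, o in facts if s == person and p == "siblingOf"}
    return {(s, o) for s, p, o in facts if p == "heightInFeet" and s in siblings}


inferred = infer(triples)
```

Asking `sibling_heights(inferred, "A")` gives back B with a height of 6, and `inferred` now also records that A is female, even though neither fact appears in the three original statements. Deciding that B is a good person to ask for a loan would take more rules, and more judgment, than this sketch attempts.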

All this is great. It enables the network to be smarter, and not just in the I-know-more-facts-than-you sort of way, because ontologies express relationships as well as facts. And because ontologies are expressed in standard formats, these smarts are cumulative. Brilliant. That Tim Berners-Lee is one heck of a smart guy. (And, it cannot be said too often, his generosity in making the Web’s standards fully open is epochal. Thank you, Sir Tim!)

So, what’s the problem? There is no problem. The Semantic Web is a great idea. It’s just not the only idea. We are always going to know, understand and care about more than any knowledge representation system can keep up with. That’s not just because there is so much to know. It’s also because so much of what we know is difficult to express precisely. Ontologies express knowledge but also necessarily (in almost all cases) clean it up, which means simplifying and specifying. There’s no harm there, so long as we remember what we’re doing, but it does mean that ontologies are tools good for some types of tasks and not very good for others. It also suggests that a system that is composed of lots of small ontologies loosely joined—and multiple ontologies covering the same fields in different ways—will capture more knowledge and be more robust than single ontologies that cover huge fields. Multiple messy ontologies include more of how the world seems to multiple, messy people. The Semantic Web’s value will grow as it becomes as inconsistent, ambiguous and imperfect as our own collective knowledge is.

Of course, the Semantic Web is only part of what we need. We will always require the help of smart, opinionated, knowledgeable people who direct us to what seems interesting and important to them. They’re going to do that by posting links and explaining why those links matter. And then they’re going to argue against the very places they’re pointing us to: "You have to read what this person says! It’s totally wrong!" That’s semantics, too. And we’re going to need more and more and more of it if we are to make any sense of our world.

But, how will we find those people? And how will we be sure that they’re finding everything worth our time? And how can we make sure we’re not just listening to people who agree with us? And how will we be sure that what they claim is true isn’t a passel of lies?

We can’t. No amount of knowledge representation, computerized smarts or human effort can guarantee any of that. Knowledge will always be hit or miss, whether it’s the systematized Semantic Web or the World Wide Web’s wildly unsystematic ecosystem of wise guys, savants and know-it-alls. That’s inevitable. And, when you think about it, the alternative would be intensely undesirable.