Sir Tim Berners-Lee: Semantic Web is open for business

Web inventor Sir Tim Berners-Lee says that the building blocks of the Semantic Web are in place. In an interview recorded earlier this month, he also questions the attitude of social networking companies to data ownership.

Our wide-ranging dig into the past, present and future of the Semantic Web was recorded for one of the regular Talking with Talis podcasts, and now appears here as the first of a new podcast series for ZDNet: Talking Semantics.

In this post, I'd like to draw out some aspects of the conversation that I found most interesting. Have a listen for yourself, draw your own conclusions, and please do share them in TalkBack.

"I think... we've got all the pieces to be able to go ahead and do pretty much everything... [Y]ou should be able to implement a huge amount of the dream, we should be able to get huge benefits from interoperability using what we've got. So, people are realizing it's time to just go do it."

Asked about an important article in Scientific American from 2001, Berners-Lee was quick to move past the grand vision outlined there, and to stress the importance of simple yet empowering steps:

"In fact, the gain from the Semantic Web comes much before that. So maybe we should have written about enterprise and intra-enterprise data integration and scientific data integration. So, I think, data integration is the name of the game. That's happening, it's showing benefits. Public data as well; public data is happening and it is providing the fodder for all kinds of mashups.

What we should realize is that the return on investment will come much earlier when we just have got this interoperable data that we can query over."
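
As a rough illustration of what "interoperable data that we can query over" might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `ex:` identifiers, the two sample datasets and the `match` and `people_based_in` helpers are invented, and a real deployment would more likely use RDF and SPARQL than Python lists. The point is only that once two sources share identifiers and a common shape, one query can span both.

```python
from typing import List, Optional, Tuple

# A statement is a (subject, predicate, object) triple of identifiers.
Triple = Tuple[str, str, str]

# Hypothetical data from two independent sources that share identifiers.
STAFF_DB: List[Triple] = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:globex"),
]
PUBLIC_REGISTER: List[Triple] = [
    ("ex:acme", "ex:basedIn", "ex:london"),
    ("ex:globex", "ex:basedIn", "ex:paris"),
]


def match(
    graph: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def people_based_in(graph: List[Triple], city: str) -> List[str]:
    """Join across sources: who works for an organisation based in `city`?"""
    orgs = {t[0] for t in match(graph, p="ex:basedIn", o=city)}
    return sorted(t[0] for t in match(graph, p="ex:worksFor") if t[2] in orgs)


# Integration here is just concatenation, because both sources use the
# same triple shape and the same identifiers for organisations.
merged = STAFF_DB + PUBLIC_REGISTER
print(people_based_in(merged, "ex:london"))
```

Neither source alone can answer the question; the answer only appears once the two are merged.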

We spent some time (almost 15 minutes, from about 20 minutes in, for those listening along) talking about the ways in which data holders will benefit from their data being visible to a new generation of Semantic Web applications:

"There's an awful lot of data out there. And I think, one of the huge misunderstandings about the Semantic Web is, 'oh, the Semantic Web is going to involve us all going to our HTML pages and marking them up to put semantics in them.' Now, there's an important thread there, but to my mind, it's actually a very minor part of it. Because I'm not going to hold my breath while other people put semantics in by hand... So, where is the data going to come from? It's already there. It's in databases..."

The W3C-supported Linked Data Project is one compelling example of a community effort to take data and make it more visible to the rest of the Semantic Web. Projects such as DBpedia, MusicBrainz and Revyu.com are enriching existing content, and increasingly providing tools with which new content can be created. As Tim notes:

"So, some data is scraped from HTML pages, some of it is pulled out of databases, some of it comes from projects which have been in XML. So, things come in many different ways. And once they're exported, as you browse around the RDF graph, as you write mash-ups to reuse that data, you really don't have to be aware of how it was produced."

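To make that last point concrete, here is a small Python sketch of two very different sources being exported into one common shape. It is only an illustration under stated assumptions: the sample database row, the XML fragment, the `ex:` identifiers and both converter functions are hypothetical, and plain tuples stand in for a real RDF graph.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical source 1: a row pulled out of a relational database.
ROW: Dict[str, str] = {"id": "1", "name": "Artist One", "country": "GB"}

# Hypothetical source 2: a record from a project that publishes XML.
XML = (
    "<artists><artist id='2'>"
    "<name>Artist Two</name><country>FR</country>"
    "</artist></artists>"
)


def triples_from_row(row: Dict[str, str]) -> List[Triple]:
    """Export one database row as triples about a single subject."""
    subject = f"ex:artist/{row['id']}"
    return [
        (subject, f"ex:{column}", value)
        for column, value in row.items()
        if column != "id"
    ]


def triples_from_xml(xml_text: str) -> List[Triple]:
    """Export each <artist> element as triples, one per child element."""
    root = ET.fromstring(xml_text)
    triples: List[Triple] = []
    for artist in root.findall("artist"):
        subject = f"ex:artist/{artist.get('id')}"
        for child in artist:
            triples.append((subject, f"ex:{child.tag}", child.text or ""))
    return triples


# After export, the consumer sees one list of triples and cannot tell
# (and need not care) which statements came from which source.
graph = triples_from_row(ROW) + triples_from_xml(XML)
names = sorted(o for _, p, o in graph if p == "ex:name")
print(names)
```

The code that reads `graph` never mentions databases or XML, which is the property Tim describes: how the data was produced disappears once it has been exported.
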
Impressive as these activities are, if we are to see similar growth in the availability of data from less philanthropic sources, we need greater clarity about the 'proper' use and reuse of data. Much as Creative Commons has attempted for 'creative works', recent activity around the Open Data Commons offers useful pointers to the way forward here, and I may delve further into that area in a future post.

Towards the end of our conversation, we built upon our earlier discussion of shared and open data by turning to the sites currently attracting criticism for their rather different perspective: the social networks. Asked,

"Do you think developers of applications like, say, Facebook and LinkedIn and the rest, are ready to embrace the Semantic Web, or do you think they think they can do it themselves?"

Tim responded:

"It is a very grown-up thing to realize that you are not the only social networking site... otherwise it is like a website which doesn't have any links out. In the Semantic Web similarly, if you don't have any links out, well, that's boring.

In fact, a lot of the value of many websites is the links out."

Whilst quick to recognise that sites such as LiveJournal support the FOAF specification, Tim drew a clear distinction between those few examples and the majority:

"Now if you look at the social networking sites which, if you like, are traditional Web 2.0 social networking sites, they hoard this data. The business model appears to be, 'We get the users to give us data and we reuse it to our benefit. We get the extra value.'"

...

"Web 2.0 is a stovepipe system. It's a set of stovepipes where each site has got its data and it's not sharing it. What people are sometimes calling a Web 3.0 vision where you've got lots of different data out there on the Web and you've got lots of different applications, but they're independent. A given application can use different data. An application can run on a desktop or in my browser, it's my agent. It can access all the data, which I can use and everything's much more seamless and much more powerful because you get this integration. The same application has access to data from all over the place."

Those who doubt the commitment of current players can, perhaps, be reassured by simply remembering the speed with which the current market leaders grew. Consider, too, Tim's observation:

"People can indeed choose not to go to that site [if it does not open access to their data]"

In other words, in a market such as the one in which we operate, there is always scope for new entrants with new values and new business models. If users are compelled by the new proposition they can - and will - move with remarkable rapidity. The big question, though, has to be... do they care enough?

We talked for a fascinating hour, during which we ranged from past to future and from technology to policy. We covered specifications such as RDF and SPARQL, and we talked about the pressing need for more accessible texts to explain the Semantic Web to mainstream business. We remembered that Tim's original web client was both editor and browser, and speculated about how things might have evolved differently if today's Read/Write Web of blogs and wikis had been an integral part of the way everyone was introduced to thinking about the Web all those years ago.

There is much still to do, and Sir Tim Berners-Lee is clearly enthused by the journey that lies ahead. Listening to him, it's hard not to share that enthusiasm.