'Semantic' website promises to organise your life

Making sense of an ever-increasing number of emails, web links, feeds, and social networking contacts is tough for even the most organised person. But a new website promises to make the task much easier by gradually learning to identify relevant information in this deluge of data.

Radar Networks, a company based in San Francisco, US, is betting it can make sense of all the information collected by a web user with software that learns to distinguish between people, places, companies, and more.

Its website, called Twine, harnesses the philosophy at the core of a discipline called the "semantic web".

The semantic web is an extension of the current web in which information is stored in a machine-readable format. It should allow computers to handle information in more useful ways by processing the meanings within documents instead of simply the documents themselves. To an extent, some web tools, such as tags, already tap into this philosophy.
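To give a feel for what "machine-readable" might mean in practice, here is a minimal Python sketch that stores a few facts from this article as subject-predicate-object statements and then queries them. The predicate names (`is_a`, `made_by`, `based_in`) and the `query` helper are made up for illustration; this is not Twine's data model, and real semantic web systems use standards such as RDF rather than plain Python tuples.

```python
# Illustrative facts as (subject, predicate, object) statements.
# The predicate names are hypothetical, chosen only for this sketch.
triples = [
    ("Twine", "is_a", "Website"),
    ("Twine", "made_by", "Radar Networks"),
    ("Radar Networks", "is_a", "Company"),
    ("Radar Networks", "based_in", "San Francisco"),
    ("San Francisco", "is_a", "Place"),
]


def query(triples, subject=None, predicate=None, obj=None):
    """Return every statement matching the given fields; None matches anything."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "Which things are companies?" becomes a lookup rather than a text search.
companies = [s for (s, _, _) in query(triples, predicate="is_a", obj="Company")]
```

Because each fact has an explicit structure, a program can answer the question without having to interpret free-form text.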

Data overload

Although available to only about 100 testers, Twine has caused a stir among web experts because it is one of the first commercial ventures to try harnessing the semantic web.

As a technology that could transform the way websites work, the semantic web is often also associated with the term "Web 3.0".

Twine uses a semantic approach to act as a personal organiser, bookmark service, and social network combined. A user adds information by creating a note, forwarding an email, uploading a document, or tagging a web page.

"Twine is a service that helps you de-fragment your digital life," says company founder Nova Spivack. "Today we all have different bits of data in different places, there is no easy way to see all you know, and share and manage it."

Twine annotates information semantically, highlighting the names of people or companies mentioned in an email, for example, and grouping these names into two categories at one side. This allows a user to explore connections between different documents, and to see their information organised in a more insightful fashion, Spivack says.
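The idea of grouping annotated names and following them across documents can be sketched in a few lines of Python. The document identifiers and annotations below are hypothetical and written by hand, and `build_index` is an illustrative helper; nothing here reflects how Twine stores or displays its data.

```python
from collections import defaultdict

# Hypothetical documents, each already annotated with (name, category) pairs.
documents = {
    "email-1": [("Nova Spivack", "person"), ("Radar Networks", "company")],
    "note-1": [("Radar Networks", "company")],
    "bookmark-1": [("Nova Spivack", "person"), ("Hewlett-Packard", "company")],
}


def build_index(documents):
    """Group names by category and record which documents mention each name."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, entities in documents.items():
        for name, category in entities:
            index[category][name].add(doc_id)
    return index


index = build_index(documents)

# Documents connected by a shared mention of the same company.
shared = sorted(index["company"]["Radar Networks"])
```

Here `shared` lists the two documents that mention Radar Networks, which is the kind of connection a user could then explore.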

Linguistic approach

But currently very little of the information available on the web is annotated in this way, so Twine has to perform the annotation itself. It does this by using a combination of natural language processing and machine learning techniques. That is, it borrows techniques from linguistics to infer meaning from words, sentence structure, and grammar.

The machine learning process will, over time, allow Twine to learn from user behaviour, Spivack says. For example, it may 'learn' that what it thought was the name of a person is in fact the name of a company.
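As a rough illustration of learning from corrections, the Python sketch below simply counts how often each label has been applied to a name and reports the most frequent one. It is a deliberately simple stand-in, not the machine learning method Twine uses, which the company has not described here; the class and method names are hypothetical.

```python
from collections import Counter, defaultdict


class EntityTypeLearner:
    """Keeps a tally of labels per name and reports the most common one."""

    def __init__(self):
        self.votes = defaultdict(Counter)

    def record(self, name, entity_type):
        # Each automatic guess or user correction counts as one vote.
        self.votes[name][entity_type] += 1

    def best_type(self, name, default="unknown"):
        counts = self.votes.get(name)
        if not counts:
            return default
        return counts.most_common(1)[0][0]


learner = EntityTypeLearner()
learner.record("Radar Networks", "person")   # initial, mistaken guess
first_guess = learner.best_type("Radar Networks")

learner.record("Radar Networks", "company")  # a user corrects the label
learner.record("Radar Networks", "company")  # a second user agrees
revised_guess = learner.best_type("Radar Networks")
```

After the two corrections, the tally favours "company" over the original "person" label, so the revised guess changes accordingly.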

"I like to think that in the order of 10 years it will become more like an assistant than a web page," he says. There are also plans to provide tools that will allow other websites to upload information to Twine. "That includes a range of social networks and could even include companies with proprietary data, for example, stock market data," Spivack says.

Advertising 3.0

At the same time, Twine could enable entirely new forms of advertising. "If we understand about your interests we can provide more relevant adverts," Spivack says. "If they can become 100% relevant, they actually become content, not adverts."

Many experts believe Twine and other semantic web technologies have great potential, but are keen to test them before making a judgement.

"I think it's a good application that can exploit the current semantic web technology," says Tim Finin, a web researcher at the University of Maryland in Baltimore, US. "[But] I'm hesitant to describe it as 'the next big thing'."

'Legacy information' or older pages on the web could be a serious problem for a semantic website, points out Nigel Shadbolt, a semantic web expert at Southampton University in the UK. This is because older pages will not have the underlying annotations that the semantic web harnesses to extract meaning. Furthermore, and paradoxically, there is also a risk of overloading users with new information. "That's a very tough problem," he says.

Another issue may be the amount of excitement building around semantic web start-ups. "With any new technology there's always a risk of hype," says Steve Cayzer, a semantic web researcher at Hewlett-Packard labs in Bristol, UK.

"I think a lot of people get hung up on the word 'semantic'," he says. "It really means an information-rich model, but what some people might take it to mean is that there is a human-level understanding."
