I’ve just been experimenting with Yahoo Pipes (very cool, quite intuitive), and I’m wondering whether it might offer another very useful layer to web semantics. At its most basic level, the semantic web is about the various natural meanings attached to particular words, sets of data, etc. Pipes works with a type of tagged data: linked sources that suggest more expansive meaning and intention.

It’s like a ‘still’ (a distilling apparatus) for the web: distill your own web juice.

What is WebFountain, and why do you have to read documents on the semantic web to hear of it?

Compared to IBM’s WebFountain project, search engines and search directories garner attention because they are the surface accoutrements of the WWW. But after reading a bit on WebFountain, I propose that the WWW will be left to such surface cyberscrapers, while the brutal backbone work of corralling metadata will be sweated out in the Milky Way-Wide Web (MWWW) by WebFountain.

According to John Battelle, who has recently published a very intriguing book, The Search: How Google and its Rivals Rewrote the Rules of Business and Transformed Our Culture, WebFountain (WF) dominates the realm of what is too simply called the “semantic web.” If you’re interested in a comprehensive tale of IBM’s odyssey, Battelle features a lengthy blog post on WebFountain, from which he largely draws for the same section in his book. In a nutshell, WebFountain is currently in research and development as a “platform–middleware” (Battelle).

If you can wrap your head around the concept that searching the web is still a fairly primitive, clunky process, then it may be possible to understand that the brainiacs behind WebFountain are carving away at a sophisticated metadata analyzer that could very well be the virtual scaffolding supporting the MWWW in the future.

So what will WebFountain be capable of?

Ideally, users will be able to propose queries to various portals on the internet designed to drill into multifarious “silos” of data. According to an article in The Economist, users will initially be limited to business, industry, and academia: those internet mavens that are incapable right now of really plumbing the expanse of content and data. Traditional search engines allow us to access an impressive pit of data, but keyword searches are really limited to pits: holes drilled straight down into like data. Search as we know it is not capable of synthesizing the subtle connections between disparate data sets. Enter metadata tagging. “A long, long time ago, in a galaxy far far away” cyber data was finally intuitively connected….

I digress. What kind of results will a system run on WebFountain’s platform be able to serve up? Battelle proposes examples of the types of complex and unique queries it could handle. Here’s one of them: “‘Give me all the documents on the web which have at least one page of content in Arabic, are located in the Midwest, and are connected to at least two similar documents but are not connected to the official Al Jazeera website, and mention anyone on a specified list of suspected terrorists.’” This query clearly asks for data to be returned from distantly related sources. Right now Google is incapable of managing this query, and a semantic-web-based search engine, Hakia, is incapable as well. Hakia reports at the top of the results page for this query that it doesn’t get “the nuance” of the request.
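
To make the shape of that query a little more concrete, here is a rough Python sketch of how it might look if every document already carried metadata tags. None of this reflects how WebFountain actually works. The `Doc` fields, the `matches` function, the placeholder names, and the `.example` URLs are all my own assumptions, and “similar” is crudely simplified to “another linked document tagged as Arabic.”

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Doc:
    # Hypothetical metadata tags; a real system would have to derive these.
    url: str
    languages: set[str]
    region: str
    links: set[str] = field(default_factory=set)   # URLs this document links to
    people: set[str] = field(default_factory=set)  # names mentioned in the text

def matches(doc: Doc, corpus: list[Doc], watchlist: set[str], excluded_site: str) -> bool:
    """Return True if doc satisfies every clause of the example query."""
    if "ar" not in doc.languages:          # has Arabic content
        return False
    if doc.region != "Midwest":            # located in the Midwest
        return False
    if excluded_site in doc.links:         # not connected to the excluded site
        return False
    if not (doc.people & watchlist):       # mentions someone on the list
        return False
    # Assumption: "similar" means a linked document that is also tagged Arabic.
    similar_linked = [
        other for other in corpus
        if other.url in doc.links and other.url != doc.url and "ar" in other.languages
    ]
    return len(similar_linked) >= 2        # connected to at least two similar documents

# Toy corpus with placeholder URLs and names.
corpus = [
    Doc("a.example", {"ar", "en"}, "Midwest", {"b.example", "c.example"}, {"Person A"}),
    Doc("b.example", {"ar"}, "Midwest", {"a.example"}, set()),
    Doc("c.example", {"ar"}, "Northeast", set(), set()),
    Doc("d.example", {"ar"}, "Midwest",
        {"b.example", "c.example", "excluded.example"}, {"Person A"}),
]
watchlist = {"Person A", "Person B"}

hits = [d.url for d in corpus if matches(d, corpus, watchlist, "excluded.example")]
print(hits)
```

The point of the sketch is that each clause touches a different kind of tag (language, geography, link structure, named people), which is exactly what a keyword index on its own cannot combine.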