Report from the IKS Workshop in Amsterdam

IKS (Interactive Knowledge Stack) is an initiative to help Content Management Systems enrich their content with semantic information (think Semantic Web). The initiative, which is in part funded by the European Union, has now reached a point where a first usable version is available. The IKS Project is running an early adopters program and a series of workshops to help CMS vendors become familiar with the system and to collect their early feedback.

One such workshop was held on December 9 and 10 in Amsterdam, Netherlands. I (Dirk) attended this workshop, representing Geeklog, to get a better idea of what this project is all about.

What the IKS Project has produced so far is a webservice with a REST API to which a CMS can send its content, e.g. an article. The webservice then sends back some RDFa that contains the semantic information for the article. So if the article contains the word "Paris", you'd get back information stating that Paris is a place, along with a list of possible places called Paris that the article may be referring to. Since there are several places in the world called Paris, the RDFa will also include a confidence level for each of the options. The quality of the results depends on the semantic engines behind the webservice.
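To make the confidence levels a bit more concrete, here is a minimal sketch in Python of how a CMS might pick one of the candidates it gets back. The `Candidate` structure, the `best_candidate` helper, the threshold and the numbers are all made up for illustration; they are not the format the webservice actually returns.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One possible meaning of a term (hypothetical structure, not the real response format)."""
    label: str
    kind: str
    confidence: float  # 0.0 (unlikely) to 1.0 (certain)


def best_candidate(candidates, threshold=0.5):
    """Return the most confident candidate, or None if none reaches the threshold."""
    if not candidates:
        return None
    top = max(candidates, key=lambda c: c.confidence)
    return top if top.confidence >= threshold else None


# Made-up confidence values, for illustration only.
paris = [
    Candidate("Paris, France", "Place", 0.82),
    Candidate("Paris, Texas", "Place", 0.11),
]

print(best_candidate(paris).label)  # prints "Paris, France"
```

Returning `None` below a threshold is one possible design choice: it lets the CMS leave a term unannotated rather than link it to a poor guess.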

The webservice used to be called FISE but has since been accepted by the Apache Software Foundation as an incubating project and will be known as Apache Stanbol from now on.

Stanbol can be thought of as middleware with the REST API on one end and a collection of semantic engines on the other end. As mentioned above, the quality of the results depends on these semantic engines. The idea is that you can select which engines you use for your CMS or website. This allows for specialized engines that can identify, say, medical terms. Or you may drop the engine that identifies places and use one that knows about Greek myths, where "Paris" would most likely refer to a person.

From a technical point of view, you would typically run your own instance of Stanbol (although it can also be shared between sites). Being written in Java, Stanbol is a bit of a heavyweight but simple enough to install. In preparation for the workshop, I hacked together a very primitive Geeklog plugin that simply sends every story to Stanbol when it is saved. The REST API makes this really easy. The trickier part is parsing and interpreting the RDFa and using it for something interesting. For my prototype I settled on making it highlight the places and persons it had identified. There will probably be PHP libraries developed as part of Stanbol, which would also offer a way for Geeklog to contribute to the project.
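The two steps described above (send the story, then highlight what was found) can be sketched roughly as follows. This is not the prototype plugin, which is a Geeklog plugin written in PHP; it is a minimal Python illustration. The endpoint URL is an assumption about a local installation, and `build_request`, `highlight` and the `entities` mapping are hypothetical names. Parsing the response is left out, so `highlight` simply takes the already-identified entities as input.

```python
import html
import re
import urllib.request

# Assumed address of a local Stanbol instance; adjust to your installation.
STANBOL_URL = "http://localhost:8080/enhancer"


def build_request(text, url=STANBOL_URL):
    """Prepare (but do not send) a POST request carrying the story text."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def highlight(text, entities):
    """Wrap each identified entity in a <span> whose class names its kind.

    `entities` maps a surface form (e.g. "Paris") to a kind (e.g. "place").
    Everything else in the text is HTML-escaped.
    """
    if not entities:
        return html.escape(text)
    # Try longer names first so a long name wins over a shorter prefix.
    names = sorted(entities, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    out, pos = [], 0
    for match in pattern.finditer(text):
        out.append(html.escape(text[pos:match.start()]))
        name = match.group(1)
        out.append('<span class="%s">%s</span>'
                   % (html.escape(entities[name]), html.escape(name)))
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


request = build_request("Marie moved to Paris.")
# urllib.request.urlopen(request) would actually send it to the webservice.

print(highlight("Marie moved to Paris.", {"Marie": "person", "Paris": "place"}))
```

With a stylesheet that gives the `person` and `place` classes different colours, this would yield the kind of highlighting the prototype does.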

Speaking of Geeklog: What does all this mean for Geeklog now? That's up to us, i.e. the Geeklog community, really. Is there enough interest in getting involved with what looks like a promising project to finally make the Semantic Web (proposed in 1999, after all) a reality? Can you think of use cases for adding semantic information to your Geeklog site? Please leave comments below.

To throw out just one idea: Google is already interpreting some semantic information embedded in pages (see Google Rich Snippets), so this may help with SEO, at least for some sites.

Wow, probably 4+ years since I read anything about RDFa - but I've always considered it a brilliant format. Honestly anything to boost SEO is a step forward. Is a plugin going to be hearty enough, or should it be a built-in service for any plugin to take advantage of?

-s

---FlashYourWeb and Your Gallery with the E2 XML Media Player for Gallery2 - http://www.flashyourweb.com

Whether this should become a plugin or core functionality really depends on the use cases we come up with. Which is why I'm reaching out to the community for ideas and feedback.

From a technical point of view, my prototype plugin simply hooks into PLG_itemSaved. Add some functions to make the parsing of the RDFa easier, and this could very well be implemented as a plugin (plus maybe some tweaks in the plugin API here and there). However, if there's a lot of interest in this or a great use case, we may want to make it a core feature ...

In your tests, was there any noticeable delay for processing, for long or short articles? This would be a great addition to the commerce plugins and it would also make sense to make it available to staticpages - heck, any plugin should have access to it.

-s
