We have 28 presentations, with speakers from ten countries. Topics include semantic interoperability, image retrieval, multimedia challenges, mapping and modelling.

As keynote speakers we have:

Professor David Crystal, the renowned author, linguist and broadcaster, talking about semantic targeting;

Clifford Lynch, Director of the Coalition for Networked Information, on the subject of e-Research and New Challenges in Knowledge Structuring.

Conference Proceedings

Fifteen papers from the conference were published in a special issue of Aslib Proceedings, Volume 62, Issue 4/5. For all of these, as well as the other presentations at the conference, slides and abstracts are available on this site. Papers, presentations and recordings, where available, can be found under each Presentation, accessed via each Session listed below.

Many thanks to Conrad Taylor for both recording and photographing the event.

Programme Highlights

David Crystal (Crystal Semantics)

Semantic targeting: past, present and future

This keynote address will look at the evolution of the linguistic approach to content analysis which Crystal has been developing over the past 20 years. It begins with the knowledge management taxonomy used for the Cambridge family of general encyclopedias, and follows its transformation into an Internet taxonomy, with applications in automatic document classification, search engine assistance, e-commerce, online advertising, and Internet security. Recent developments have brought a focus on advertising, a field which has seen ideas develop from simple keyword analysis to contextual advertising and now to semantic targeting. Crystal explores the difference between these notions, and describes current issues in the way semantic targeting is evolving, including ways of handling site sensitivity, sentiment, intention, and cultural localization.

Clifford Lynch (Coalition for Networked Information)

e-Research and new challenges in knowledge structuring

Our second keynote speaker gives a high-level overview of some of the developments in e-research and cyberinfrastructure, emphasising opportunities for data curation and data reuse, and giving considerable attention to the humanities and social sciences as well as science and engineering. Lynch will also look at developments in "citizen science" and what might be thought of as "citizen humanities" in this context. The talk will conclude with consideration of the changing nature of publishing/authoring, particularly in the scholarly sphere, and the implications of the production of structured, reusable, and interchangeable knowledge as part of the processes of scholarship and scholarly communication.

Tom Scott and Michael Smethurst (BBC)

Building coherence at bbc.co.uk

Think of the BBC as a storytelling organisation; then think of the transition needed from storytelling in the world of linear broadcasting to that of the non-linear, hypertext world of the web. The value in a website lies not in the implicit (meta)data of its domain model but rather in the way the domain model overlaps and intersects with other domains. As ever, the links are more important than the nodes because that's where the context lives: programmes:segment music:track, programmes:segment food:recipe etc. In this way we weave new 'user journeys' into and out of a domain, into and out of bbc.co.uk: from archive episodes no longer available online, to a recipe page, to a chef, to another recipe and back to a recent episode. Using well-targeted, content-specific links we could not only escape the dead-end content silos that characterised bbc.co.uk but also point users back to programmes that would hopefully inform, educate and all that stuff. In building bbc.co.uk/programmes and bbc.co.uk/music in this way we have kept everything in its right place: we've built a sane, maintainable, scalable, accessible site that search engines love and that can be easily evolved to add new features and functionality. So to anyone considering how best to build websites, we'd recommend you throw out the Photoshop and embrace Domain Driven Design and the Linked Data approach every time. Even if you never intend to publish RDF, it just works.
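As a rough illustration of the idea in the abstract above (links between domains, rather than the nodes themselves, carry the context), the following minimal Python sketch stores a handful of cross-domain links as subject/predicate/object triples and walks a 'user journey' across them. The identifiers, predicate names and helper functions are all hypothetical; they are not real bbc.co.uk URIs or vocabulary terms, and this is not a description of how the BBC sites are actually implemented.

```python
from collections import defaultdict, deque

# Hypothetical cross-domain links as (subject, predicate, object) triples.
# Identifiers and predicates are illustrative only, not real BBC URIs.
TRIPLES = [
    ("programmes:episode/1", "has_segment", "programmes:segment/1"),
    ("programmes:segment/1", "features", "food:recipe/1"),
    ("food:recipe/1", "created_by", "food:chef/1"),
    ("food:chef/1", "created", "food:recipe/2"),
    ("food:recipe/2", "featured_in", "programmes:episode/2"),
]


def build_graph(triples):
    """Index the triples by subject so outgoing links can be followed."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def journey(graph, start, goal):
    """Breadth-first search for a shortest chain of links from start to goal.

    Returns the list of nodes visited, or None if no chain exists.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _predicate, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return None


def domain(node):
    """Return the domain prefix of an identifier, e.g. 'food' for 'food:chef/1'."""
    return node.split(":", 1)[0]


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    path = journey(graph, "programmes:episode/1", "programmes:episode/2")
    # Show how the journey leaves the programmes domain and returns to it.
    print(" -> ".join(path))
    print([domain(node) for node in path])
```

The journey here runs from an archive episode, through a recipe and a chef, and back to another episode, crossing from the programmes domain into food and out again purely by following links.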

Image retrieval

This session will begin with an overview of the state of the art for the image retrieval market in Still digital images - the hardest things to classify and find, given by Ian Davis of Dow Jones Client Solutions. For those of us who mostly handle retrieval from text, this will bring into focus the added difficulties and rather different needs experienced by the users of images. Davis will probe the strengths and weaknesses of the different approaches through which the challenges can be met.

While image retrieval has traditionally relied on indexing with controlled vocabularies, Chris Town in his talk Giving meaning to content through ontology based image retrieval will argue that such keyword-based multimedia retrieval effectively treats images as "black boxes", since all indexing and search is based on the labels associated with a given image rather than on the image itself. Furthermore, manual image annotation is an expensive process which is prone to problems such as errors, inconsistencies, ambiguity, lack of context, and both over- and under-keywording. But the alternative approach of content-based image retrieval (CBIR) has mostly failed to gain wide adoption. Town will outline why this may be the case, with particular emphasis on the point that CBIR solutions have not done enough to bridge the "semantic gap" between their system's retrieval model and that of the user. He will demonstrate how ontological query languages have been utilised by Imense Ltd. to provide more effective image retrieval and image analysis solutions.

A third dimension will be added by Elaine Ménard from McGill University in Canada, speaking about Ordinary image retrieval in a multilingual context: a comparison of two indexing vocabularies. She compares traditional image indexing using a controlled vocabulary with free image indexing using an uncontrolled vocabulary. Her study also compares image retrieval within two retrieval contexts: a monolingual context, where the language of the query is the same as the indexing language, and a multilingual context, where the language of the query is different from the indexing language.

Paul Miller (The Cloud of Data)

Exploiting data in the cloud

Much of the recent attention devoted to Cloud Computing has been concerned with outsourcing of hardware or hosting of applications. Important as these trends are, Miller will argue that the Cloud is capable of far more than simple replication of existing enterprise processes. Amazon's recently announced Public Data Sets programme and the World Wide Web Consortium's (W3C) Linked Open Data community project illustrate the opportunity for re-use of public data, with licensing frameworks evolving to reflect shifting presumptions. Specifications from the Semantic Web are being put to work as enterprises such as Thomson Reuters seek to unlock value in expensively curated internal data. What happens as increasing quantities of data become accessible, as attitudes to control and ownership morph, and as technologies evolve to enhance 'enterprise' applications with insight from beyond the firewall? Where might the balance lie between comprehensiveness and insight on one hand, and security and control on the other? Miller will point us to the future.

The full conference programme brochure (as shown below) is available here.

The conference followed a series of highly successful afternoon meetings that have now become part of the ISKO UK regular programme and attract people from different communities working in the knowledge and information arena.

Papers, presentations and recordings for the ISKO UK 2009 conference are available under each Presentation, accessed via the Sessions listed above.