Goal: National Semantic Web Infrastructure with Applications

The ambitious goal of FinnONTO is to lay a foundation for a national metadata, ontology, ontology service, and linked data framework in Finland, and to demonstrate its usefulness in practical applications. In our vision, a conceptual semantic infrastructure is needed for the semantic web in the same way as roads are needed for traffic and transportation, power plants and electrical networks are needed for energy supply, or GSM standards and networks are needed for mobile phones and wireless communication. A solid, commonly agreed open infrastructure would make it much easier and cheaper for public organizations and companies to create interoperable intelligent contents and services on the coming semantic web. The infrastructure should be open source and its central components should be maintained by the public sector in order to guarantee wide usage and interoperability across different application domains and users.

FinnONTO Phases

FinnONTO consists of a series of project phases. The work started in 2003 with 14 funding companies and public organizations, and has grown year after year. In 2008-2009 there were already 38 funding organizations in the project making FinnONTO truly a national effort. In the last phase 2010-2012, the project was split into two projects: FinnONTO 2.0, continuing the FinnONTO tradition (36 organizations) directly, and its spinoff project Semantic ubiquitous services SUBI (17 organizations), focusing on mobile applications of the FinnONTO infrastructure. Well over 40 Finnish organizations now fund and participate in the FinnONTO project.

Research Results

The project has been active in many fields of research in ontologies, metadata, automatic annotation, semantic computing, and the web. At the end of this web page, publications of the project during the years are listed.

Main Demonstrators

The project has been application-oriented and produced many demos and systems, some of which have already been deployed in practice. The main demonstrators of the project include:

National Ontology Service ONKI (published 2008) with its ontologies and vocabularies, the core of the FinnONTO infrastructure. ONKI is used for publishing interlinked, collaboratively created ontologies and vocabularies in a centralized way. A major novelty of the system is that it can be integrated with legacy systems and applications using AJAX, Web Services, and REST. The system is provided to society as a free Living Laboratory by FinnONTO. In summer 2009, ONKI had 150 organizations as registered web service users and some 10 000 unique end-users per month. More information and publications about ONKI can be found at the ONKI homepage.

Applications of the FinnONTO infrastructure in eCulture, eHealth, eLearning, eGovernment, and eCommerce, including semantic portals such as
MuseumFinland (Semantic Challenge Award in 2004) (published 2004),
HealthFinland (Semantic Challenge Award in 2008) (published 2008), and
CultureSampo (published 2008).
More information, publications, and the systems online can be found via the homepage links above.

World Summit Award WSA 2010, SmartMuseum system (EU project) that utilized the CultureSampo system of FinnONTO, Abu Dhabi, 2010.

Consortium Organization

The research has been directed and mostly carried out by the Semantic Computing Research Group (SeCo) at the Aalto University (formerly Helsinki University of Technology, TKK) and the University of Helsinki. The University of Tampere also contributed to the work until 2009.

The consortium behind the project represents a wide area of functions of the society including museums, libraries, business, health organizations, government, media, and education. Public organizations, companies, and universities are participating in the project.

2017

A major challenge in publishing linked Cultural Heritage (CH) collections on the web is interoperability. This is due to the heterogeneity of CH contents and the distributed content creation model where publishers focus on their own data with little consideration for others’ data. As a solution approach, the “Sampo” model is presented based on using domain independent modeling standards, on a model for aligning metadata models, and on sharing domain ontologies for populating the metadata models. The harmonized data is published for machines as a linked data service, to be used by applications for human users. To illustrate and evaluate the model, three online systems on the Web, CultureSampo, BookSampo, and WarSampo, are presented.

The Finnish Ontology Library Service ONKI was published as a living laboratory prototype for public use in 2008. Its idea is to support content indexers and ontology developers via a browser interface and machine APIs. ONKI has been well-accepted, but being a prototype maintained by the ending research project FinnONTO (2003–2012), a more sustainable service was needed, supported by permanent governmental funding. To achieve this, ONKI was deployed and is being further developed by the National Library of Finland into a new national vocabulary service Finto. We discuss challenges in the deployment of ONKI into Finto and lessons learned during the transition process.

Semantic and context knowledge have been envisioned as an appropriate solution for addressing the content heterogeneity and information overload in mobile Web information access, but few have explored their full potential in mobile scenarios, where information objects refer to their physical counterparts, and retrieval is context-aware and personalized for users. We present SMARTMUSEUM, a mobile ubiquitous recommender system for the Web of Data, and its application to information needs of tourists in context-aware, on-site access to cultural heritage. The SMARTMUSEUM system utilizes Semantic Web languages as the form of data representation. Ontologies are used to bridge the semantic gap between heterogeneous content descriptions, sensor inputs, and user profiles. The system makes use of an information retrieval framework wherein context data and search result clustering are used in recommendation of suitable content for mobile users. Results from laboratory experiments demonstrate that ontology-based reasoning, query expansion, search result clustering, and context knowledge lead to significant improvement in recommendation performance. The results from field trials show that the usability of the system meets users’ expectations in real-world use. The results indicate that semantic content representation and retrieval can significantly improve the performance of mobile recommender systems in knowledge-rich domains.

The BookSampo dataset provides information as linked data on fiction literature published in Finland going back to the 15th century, along with rich descriptions of both their content and context. The dataset contains data on nearly 400,000 subjects, including literary works, authors, book covers, reviews, awards, images, and movies, over 3 million triples in total. The data has been applied as the basis of the BookSampo portal in public use in Finland, and is aligned with the cross-domain cultural heritage contents and ontologies of CultureSampo, another in-use semantic portal. The data has been used to answer complex questions, such as what topics one should write about in order to win a literary award (based on statistics). The metadata was transformed into RDF from legacy library databases, then enriched manually by dozens of librarians in a Web 2.0 fashion in Finnish public libraries, and is constantly updated at a rate of some 90,000 new triples monthly.

We present an approach to diversify entity search by utilizing semantics present and inferred from the initial entity search results. Our approach makes use of ontologies and independent component analysis of the entity descriptions to reveal direct and latent semantic connections between the entities present in the initial search results. The semantic connections are then used to sample a set of diverse entities. We empirically demonstrate the performance of our approach through retrieval experiments that use a real-world dataset composed from four entity databases. The results indicate that our approach significantly improves both diversity and effectiveness of entity search.

We present an ontology for managing the scientific and common names of birds. The ontology is based on the TaxMeOn meta-ontology model for biological names. The ontology is in use as an ontology service and it has been applied in a bird watching system.

Animals and plants are referred to using scientific or common names depending on the expertise of an audience or a source of data. The names change in time and therefore their usage as identifiers as such is problematic. We present a solution for managing and using plant names as an ontology. The ontology is based on the TaxMeOn meta-ontology for biological names. In order to refer to organisms unambiguously and publish information as Linked Data on the web, the names are given URIs. The ontology is developed collaboratively and it supports the approval process and temporal tracking of the common names. We introduce an ontology service of plant names for end-users and provide user interfaces and APIs for integrating the ontology into applications.

Observational data about species of public interest, such as birds and butterflies, is often created and collected by volunteered citizen scientists, and used by professionals for managing biodiversity. The education and skills of the citizens participating in the work varies a lot, and the process of making observations is typically not systematic but rather ad hoc. As a result, the quality of the observational data in repositories, such as the Global Biodiversity Information Facility GBIF Data Portal, is often not good, hampering its utilization severely. This paper presents an approach for enhancing data quality in a citizen science setting, and presents a mobile tool BirdWatch for citizen observers, mitigating difficulties in producing high quality Linked Data for biodiversity management.

Ontology repositories, such as NCBO Bioportal, ONKI and Cupboard, help finding and using ontologies on the Semantic Web. However, currently each ontology repository constitutes a separate island with its own user interface, APIs, users, ontology languages and set of ontologies. Because there is not a universal way to access all ontology repositories, doing global search, browsing, and inference over all available ontology repositories turns out to be technically difficult and is generally not done. Ontologies are not reused as much as they could and hence the full potential of ontologies is not achieved. To address the problem, we propose the Normalized Ontology Repository (NOR) approach to make the ontology repositories universally accessible while maintaining their unique functionalities and strengths. The SKOS language is used as the lowest common denominator for presenting the ontologies. In addition, a simple API for searching and accessing the ontologies is defined. As a proof-of-concept evaluation, we present three case implementations to demonstrate the NOR approach: 1) the distributed architecture of the ONKI repository, 2) the metasearch for ONKI and NCBO Bioportal, and 3) publishing informal ontological concept collections as NOR end-points, demonstrated with the semantic portal CultureSampo and the metadata editor SAHA.
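The idea of a small shared search interface over otherwise independent repositories, with results reduced to SKOS-style fields, can be illustrated roughly as follows. This is a minimal sketch and not the NOR API itself: the `Concept`, `Repository`, `InMemoryRepository`, and `metasearch` names are hypothetical, and the in-memory repository only stands in for an adapter that would wrap a real repository.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Concept:
    """A concept reduced to SKOS-style fields that every repository can supply."""
    uri: str
    pref_label: str
    source: str  # name of the repository the hit came from


class Repository(Protocol):
    """Hypothetical shared interface each repository adapter would implement."""

    def search(self, query: str) -> Iterable[Concept]: ...


class InMemoryRepository:
    """Stand-in adapter holding a few (URI, preferred label) pairs in memory."""

    def __init__(self, name: str, labels: dict[str, str]) -> None:
        self.name = name
        self._labels = labels  # uri -> preferred label

    def search(self, query: str) -> list[Concept]:
        # Case-insensitive substring match on the preferred label.
        needle = query.lower()
        return [
            Concept(uri, label, self.name)
            for uri, label in self._labels.items()
            if needle in label.lower()
        ]


def metasearch(repos: Iterable[Repository], query: str) -> list[Concept]:
    """Send one query to every repository and merge the normalized hits."""
    hits: list[Concept] = []
    for repo in repos:
        hits.extend(repo.search(query))
    # Sort by label, then by source, so the merged list is deterministic.
    return sorted(hits, key=lambda c: (c.pref_label.lower(), c.source))
```

Calling `metasearch([repo_a, repo_b], "bird")` returns one merged, sorted list of `Concept` objects. Each repository keeps its own storage and matching logic behind `search`, which is the point of using a lowest common denominator.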

BookSampo is a joint project between the Finnish public libraries and semantic web researchers, to improve fiction literature search and recommendation. In the project, dozens of librarians around Finland have used a collaborative web-based metadata editor to input diverse knowledge about fiction literature into a shared database. Particularly, the project has sought to improve access by indexing not only bibliographical information about the books, but focusing on the content and context of the works. In order to do this, the database employs advanced techniques such as functional, content-centered indexing, ontological vocabularies and the networked data model of linked open data. To demonstrate the functionality this makes possible, the fiction literature portal http://www.kirjasampo.fi/ was created. This portal uses the knowledge created in the project to offer advanced semantic search and recommendation based on the database created. In addition, web services exposing direct access to the data have been used for example in culture hack events to answer more complex questions, such as where in Finland are the most crimes committed in fiction literature.

Academic users often find work with online primary sources both rewarding and challenging. Improving subject access in these sources is essential as digital collections propagate and work with primary sources becomes increasingly important in humanities curricula. A user needs assessment was conducted with humanities users at the University of Colorado Boulder to facilitate engagement with these sources. Two of the major user needs identified were improving findability and context, particularly for historical subjects. Linked Data can help meet these needs by linking related concepts in the sources using a specialized vocabulary, enriching them with outside resources, and enabling semantically rich services that empower users. This paper discusses a project the authors undertook to enhance subject access in CU’s WWI Collection Online by deep linking historical data on the civilian experience in occupied Belgium. This work is intended to lead to a richer understanding of forces shaping the WWI period.

TravelSampo [1] is a prototype system with which museums can interactively create audio guide tours inside museums and outside in the open air. The system includes a web-based editor with which a curator can describe objects in an exhibition, or in the open air, using a set of shared ontologies published in the National Ontology Service ONKI (http://onki.fi/), and upload related audio descriptions, text, and images. Each exhibit object is given an identifier and a geo-location. When the end-user is near the object, either in a museum or in the open air, information related to the object can be given to her based on the object identifier or GPS location. A major novelty of TravelSampo lies in its ability to associate the object metadata automatically with millions of semantically related pieces of information available through the Linked Data cloud (http://linkedata.org/) and the CultureSampo system (http://www.kulttuurisampo.fi/). For example, a painting can be linked, based on the underlying ontologies and metadata, with the biography of the painter in Wikipedia or in the National Biography, with other paintings of the artist in the collections of other museums, with photos and books about the artist, and so on. This gives the end-user a richer experience than is possible with traditional audio guide systems. For the museums, TravelSampo offers a cost-efficient and dynamic way of creating information-rich audio guide programs, and of re-using and linking each other's collections through linked data, leading to a win-win situation. The paper presents and discusses the underlying ideas of TravelSampo and our experiences in developing the system, especially from the viewpoint of the content publishers, i.e. the museums.
[1] E. Mäkelä, J. Väätäinen, R. Alitalo, O. Suominen, E. Hyvönen: Discovering Places of Interest through Direct and Indirect Associations in Heterogeneous Sources - The TravelSampo System. Terra Cognita 2011: Foundations, Technologies and Applications of the Geospatial Web, CEUR Workshop Proceedings, Vol-798, 2011. http://ceur-ws.org/Vol-798/proceedings.pdf
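The location-based part of the TravelSampo description, offering information when the user is near a geo-located object, can be sketched with a plain great-circle distance check. This is an illustration only, not the TravelSampo implementation: the function names, the dictionary layout of the objects, and the 50 m default radius are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_objects(objects: list[dict], lat: float, lon: float,
                   radius_m: float = 50.0) -> list[dict]:
    """Return the objects within radius_m of the user, nearest first.

    Each object is assumed to be a dict with "id", "lat", and "lon" keys.
    """
    with_distance = [
        (haversine_m(lat, lon, obj["lat"], obj["lon"]), obj) for obj in objects
    ]
    in_range = [pair for pair in with_distance if pair[0] <= radius_m]
    return [obj for _, obj in sorted(in_range, key=lambda pair: pair[0])]
```

The identifiers returned by such a lookup would then be the starting points for following links into related content. Indoors, where GPS is unreliable, the object identifier route mentioned above is the more plausible trigger.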

Events are an essential component of cultural heritage (CH) Linked Data (LD): they link actors, places, times, objects, and other events into larger narrative structures, providing a rich basis for semantic searching, recommending, analysis, and visualization of CH data. This paper argues that shared vocabularies (gazetteers, ontologies) of events, such as the “Battle of Normandy” or “Crucifixion of Jesus”, are necessary to facilitate the aggregation and linking of heterogeneous content from various collections. For example, biographies, histories, photos, and paintings often reference or depict events. A set of general requirements for an event gazetteer is presented, based on the needs of publishing, aggregating, and reusing cultural heritage content as Linked Data. After this, a metadata model addressing the presented requirements for representing historical events is outlined. The model is being applied in a case study aimed at developing an event ontology for World War I (WWI). Our goals from an end-user perspective are twofold: 1) Facilitate event-based cataloging for curators in memory organizations; 2) Utilize semantic event descriptions and narrative event structures in end-user applications for searching and linking documents and other content about WWI, and for structuring and visualizing them.

This paper presents the CultureSampo system for publishing heterogeneous linked data as a service. Discussed are the problems of converting legacy data into linked data, as well as the challenge of making the massively heterogeneous yet interlinked cultural heritage content interoperable on a semantic level. Novel user interface concepts for then utilizing the content are also presented. In the approach described, the data is published not only for human use, but also as intelligent services for other computer systems that can then provide interfaces of their own for the linked data. As a concrete use case of using CultureSampo as a service, the BookSampo system for publishing Finnish fiction literature on the semantic web is presented.

2011

BookSampo is a semantic portal in use, covering metadata about practically all Finnish fiction literature of Finnish public libraries on a work level. The system introduces a variety of semantic web novelties deployed into practice: The underlying data model is based on the emerging functional, content-centered metadata indexing paradigm using RDF. Linked Data (LD) principles are used for mapping the metadata with tens of interlinked ontologies in the national FinnONTO ontology infrastructure. The contents are also linked with the large LD metadata repository of related cultural heritage content of CultureSampo. BookSampo is actually based on using CultureSampo as a semantic web service, demonstrating the idea of re-using semantic content from multiple perspectives without the need for modifications. Most of the content has been transformed automatically from existing databases, with the help of ontologies derived from thesauri in use in Finland, but in addition tens of volunteer librarians have participated in a Web 2.0 fashion in annotating and correcting the metadata, especially regarding older literature. For this purpose, semantic web editing tools and public ONKI ontology services were created and used. The paper focuses on lessons learned in the process of creating the semantic web basis of BookSampo.

Linked Open Aalto is a research project aiming at developing a semantic web approach for creating and publishing interlinked educational, research, and managerial contents produced at different communities, schools, departments, research groups, and persons in Aalto. By using semantic Linked (Open) Data principles, technologies, and open datasets available, Aalto contents can be interlinked with related teaching and research materials in Finland and internationally. By aggregating and combining local contents from separate incompatible data silos and systems, the end-user can be provided with a global, cross-disciplinary perspective to knowledge produced in Aalto and other universities. For example, a web page describing a course can be interlinked automatically with related research results, publications, projects, Wikipedia pages, research groups, researchers, internationally available video lectures, open course materials, events in Aalto, conferences, blog discussions, and so on.

We consider ontology evolution in a system of light-weight Linked Data ontologies, aligned with each other to form a larger ontology system. When one ontology changes, the human editor must keep track of the actual changes and of the modifications needed in the related ontologies in order to keep the system consistent. This paper presents the analysis tool MUTU, with which such changes and their potential effects on other ontologies can be found. Such an analysis helps ontology editors understand the differences between ontology versions and update linked ontologies when changes occur in other components of an ontology system.
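The kind of analysis described here, finding what changed between two versions and which alignments to other ontologies may need attention, can be illustrated with simple set operations. This is a toy sketch and not MUTU's actual method: it assumes each version is a mapping from concept identifier to label, and each alignment is a (source concept, target concept) pair pointing into the changed ontology.

```python
def diff_versions(old: dict[str, str], new: dict[str, str]) -> dict[str, set[str]]:
    """Compare two versions given as {concept_id: label} mappings."""
    old_ids, new_ids = set(old), set(new)
    return {
        "added": new_ids - old_ids,
        "removed": old_ids - new_ids,
        # Concepts present in both versions whose label differs.
        "relabeled": {c for c in old_ids & new_ids if old[c] != new[c]},
    }


def affected_mappings(changes: dict[str, set[str]],
                      mappings: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """List the alignments whose target concept was removed or relabeled.

    Each mapping is a (concept in another ontology, concept in the changed
    ontology) pair; the returned ones are candidates for manual review.
    """
    touched = changes["removed"] | changes["relabeled"]
    return [(source, target) for source, target in mappings if target in touched]
```

A real ontology comparison would also have to look at hierarchy and property changes; labels alone are used here only to keep the example short.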

Data quality is a growing concern on the Semantic Web. The amount of data available is growing faster than ever, and the emphasis thus far has been on creating and interlinking data without much regard to how good the data actually is. The trend is shifting from creating new data to refining what already exists. Data quality is a subjective concept and a formal representation for it is often troublesome. First, we must define what is meant by data quality - what are the different facets of the concept. Second, a way for representing this quality must be found. Third, actual processes to refine data and improve its quality and ways to take data quality into account on the Semantic Web must be developed. This work presents some solutions to the problem. Many ways to annotate quality metadata as RDF are first discovered, along with their pros and cons. A framework for managing RDF-based quality metadata is presented, with a set of tools for specifically managing the quality annotations. Additionally, an automatic annotation system and a schema validation system, within the restraints of the open world assumption, have been designed, implemented and integrated into the framework. The system has been tested using real life datasets with promising first results.

Semantic web technologies have introduced the idea of annotating content in terms of concepts taken from ontologies. Since concepts are defined in terms of properties and relations to other concepts, descriptions grow into larger RDF graphs that can be used as a basis for data integration and intelligent information retrieval. Since ontologies do not typically contain all the possible concepts needed for annotation, it is usually necessary to offer the annotator the possibility to introduce new free keywords or tags in addition to the predefined ontology concepts. The problem then is that free keywords/tags do not have ontological connections to the rest of the RDF graph, unless such relations are defined by the annotator. We present a process for integrating free keywords into the ontological framework, and a practical tool implementation of it, discussing the challenges and possibilities introduced by the system. We also describe a case study performed for the Finnish Defence Forces, where the tool is used for creating a faceted semantic search portal featuring the free keywords and the ontological concepts at the same time.

Biodiversity management requires the usage of heterogeneous biological information from multiple sources. Indexing, aggregating, and finding such information is based on names and taxonomic knowledge of organisms. However, taxonomies change in time due to new scientific findings, opinions of authorities, and changes in our conception about life forms. Furthermore, organism names and their meaning change in time, different authorities use different scientific names for the same taxon in different times, and various vernacular names are in use in different languages. This makes data integration and information retrieval difficult without detailed biological information. This paper introduces a meta-ontology for managing the names and taxonomies of organisms, and presents three applications for it: 1) publishing biological species lists as ontology services (ca. 20 taxonomies including more than 80,000 names), 2) collaborative management of the vernacular names of vascular plants (ca. 26,000 taxa), and 3) management of individual scientific name changes based on research results, covering a group of beetles. The applications are based on the databases of the Finnish Museum of Natural History and are used in a living lab environment on the web.

Structured semantic metadata about unstructured web documents can be created using automatic subject indexing methods, avoiding laborious manual indexing. A successful automatic subject indexing tool for the web should work with texts in multiple languages and be independent of the domain of discourse of the documents and controlled vocabularies. However, analyzing text written in a highly inflected language requires word form normalization that goes beyond rule-based stemming algorithms. We have tested the state-of-the-art automatic indexing tool Maui on Finnish texts using three stemming and lemmatization algorithms and tested it with documents and vocabularies of different domains. Both of the lemmatization algorithms we tested performed significantly better than a rule-based stemmer, and the subject indexing quality was found to be comparable to that of human indexers.
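Why suffix stripping alone falls short for Finnish can be seen in a toy comparison. The sketch below is not Maui and not any of the algorithms evaluated in the paper: the suffix list, the two-entry lemma dictionary, and the function names are made up for illustration. The two word forms do show genuine Finnish stem changes: "kadulla" (on the street) comes from "katu", and "kädessä" (in the hand) comes from "käsi".

```python
from typing import Callable, Iterable

# A tiny, made-up list of case endings for the toy stemmer.
SUFFIXES = ("ssa", "ssä", "lla", "llä")

# A tiny, made-up lemma dictionary standing in for a real lemmatizer.
LEMMAS = {"kadulla": "katu", "kädessä": "käsi"}

VOCABULARY = ["katu", "käsi"]  # controlled vocabulary labels (base forms)


def naive_stem(word: str) -> str:
    """Strip one known case ending; the stem change inside the word remains."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Look the word form up in the dictionary; unknown words are kept as-is."""
    return LEMMAS.get(word, word)


def match_vocabulary(tokens: Iterable[str], vocabulary: Iterable[str],
                     normalize: Callable[[str], str]) -> set[str]:
    """Return vocabulary labels whose normalized form matches a normalized token."""
    normalized_labels = {normalize(label): label for label in vocabulary}
    return {
        normalized_labels[normalize(token)]
        for token in tokens
        if normalize(token) in normalized_labels
    }
```

With `naive_stem`, the tokens `["kadulla", "kädessä"]` normalize to "kadu" and "käde" and match neither vocabulary label. With `lemmatize`, both map to their base forms and match.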

The number of open datasets available on the web is increasing rapidly with the rise of the Linked Open Data (LOD) cloud and various governmental efforts for releasing public data in different formats, not only in RDF. The aim in releasing open datasets is for developers to use them in innovative applications, but the datasets need to be found first and metadata available is often minimal, heterogeneous, and distributed making the search for the right dataset often problematic. To address the problem, we present DataFinland, a semantic portal featuring a distributed content creation model and tools for annotating and publishing metadata about LOD and non-RDF datasets on the web. The metadata schema for DataFinland is based on a modified version of the voiD vocabulary for describing linked RDF datasets, and annotations are done using an online metadata editor SAHA connected to ONKI ontology services providing a controlled set of annotation concepts. The content is published instantly on an integrated faceted search and browsing engine HAKO for human users, and as a SPARQL endpoint and a source file for machines. As a proof of concept, the system has been applied to LOD and Finnish governmental datasets.
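A dataset description of the kind mentioned above can be sketched as a handful of RDF triples. The example below uses standard voiD and Dublin Core terms (`void:Dataset`, `void:sparqlEndpoint`, `void:dataDump`, `dcterms:title`), not the modified schema used by DataFinland, whose details are not given here. The helper names are hypothetical, and the serializer covers only the small N-Triples subset the example needs.

```python
from typing import Optional

VOID = "http://rdfs.org/ns/void#"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = tuple[str, str, str]  # (subject IRI, predicate IRI, serialized object)


def _iri(value: str) -> str:
    return f"<{value}>"


def _literal(text: str) -> str:
    # Escape the characters that must not appear raw in an N-Triples literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def describe_dataset(uri: str, title: str,
                     sparql_endpoint: Optional[str] = None,
                     data_dump: Optional[str] = None) -> list[Triple]:
    """Build a minimal voiD description for one dataset."""
    triples: list[Triple] = [
        (uri, RDF_TYPE, _iri(VOID + "Dataset")),
        (uri, DCTERMS + "title", _literal(title)),
    ]
    if sparql_endpoint:
        triples.append((uri, VOID + "sparqlEndpoint", _iri(sparql_endpoint)))
    if data_dump:
        triples.append((uri, VOID + "dataDump", _iri(data_dump)))
    return triples


def to_ntriples(triples: list[Triple]) -> str:
    """Serialize the triples one per line in N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)
```

For a non-RDF dataset, the endpoint and dump arguments would simply be left out. That is one reason a schema meant to cover both kinds of datasets needs properties beyond plain voiD.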

Ontologies and vocabularies are a key resource for creating interoperable metadata on the Semantic Web. To make finding and using ontologies easier, the idea of Ontology Repositories has been introduced with current implementations including e.g. the NCBO Bioportal, ONKI and Cupboard. There is a genuine need for different kinds of Ontology Repositories, each focusing on different kinds of specific user needs, different ontologies and different organizational requirements which cannot be addressed by a single general implementation. However, at the moment each Ontology Repository is a separate island with its own user interfaces and APIs. They also use varying ontology languages such as OWL, SKOS, and RDF Schema. Due to this, global search, browsing, and inference over the repositories is difficult and generally not done, which means that, for example, finding and reusing existing ontologies becomes difficult. To address the problems, we have developed a loosely coupled Network of Ontology Repositories (NOR) architecture that makes the repositories globally interoperable while maintaining their unique functionalities and strengths. To participate in the network, each ontology repository is required to implement a shared API. As a proof-of-concept evaluation, we present three case implementations demonstrating different aspects of the NOR approach: 1) internal distributed architecture of ONKI, 2) global search of ONKI and NCBO Bioportal, 3) publishing non-ontological concept collections as NOR endpoints, demonstrated with the semantic portal CultureSampo and the metadata editor SAHA.

Purpose – Library Director Jarmo Saarti introduced a wide or ideal model for fiction in literature in his dissertation, published in 1999. It introduces those aspects that should be included in an information system for fiction. Such aspects include literary prose and its intertextual references to other works, the writer, readers' and critics' receptions of the work as well as a researcher's view. It is also important to note how libraries approach a literary work by means of inventory, classification and content description. The most ambiguous of the aspects relates to that context in cultural history, which the work reflects and is a part of. The paper aims to discuss these issues. Design/methodology/approach – Since the model consists of several components which are not found in present library information systems and cannot be implemented by them, a new way had to be found to produce, save, process and present fiction-related metadata. The Semantic Computing Research Group of Aalto University has developed several Semantic Web services for use in the field of culture, so cooperation with it and the use of Semantic Web tools were a natural starting point for the construction of the new service. Kirjasampo will be based on the Semantic Web RDF data model. The model enables a flexible linking of metadata derived from different sources, and it can be used to build a Semantic Web that can be approached contextually from different angles. Findings – The “semantically enriched” ideal model for fiction has hence been realised, at least to some extent: Kirjasampo supports literature-related metadata that is more varied than earlier and aims to account for different contexts within literature and connections with regard to other cultural phenomena. It also includes contemporary reviews of works and, as such, readers' receptions as well. Modern readers can share their views on works, once the user interface of the server is completed. It will include several features from the Kirjasto 2.0 application, which enables the evaluation, description and recommendations of works. The service should be online by the end of Spring 2011. Research limitations/implications – The project involves novel collaboration between a public library and a computer science research unit, and utilises a novel approach to the description of fiction. Practical implications – The system encourages user participation in the description of fiction and is of practical benefit to librarians in understanding both how fiction is organised and how users interpret the same. Originality/value – Upon completion, the service will be the first Finnish information system for libraries built with the tools of the Semantic Web which offers a completely new user environment and application for data produced by libraries. It also strives to create a new model for saving and producing data, available to both library professionals and readers. The aim is to save, accumulate and distribute literary knowledge, experiences and silent information.

This master's thesis explores a way in which documents can be automatically classified based on their contents. Automatic classification of data is one of the main applications of machine learning. With the help of already classified data, a model for the most likely class can be learned. Whether background knowledge from ontologies can be added to the model in order to improve the classification accuracy is also explored in this master's thesis. A new machine learning model is introduced that incorporates ontology information. The proposed method for learning a classification model and enhancing it with ontology information is used in a case study for the Finnish National Archives and a set of digital documents that have been manually classified. An RDF schema for representing documents, sentences and words is created in order to prepare the data for the machine learning analysis. The words are put into base form and matched semi-automatically with concepts of the General Finnish Ontology YSO. Then the ontology-enhanced model is applied to the data and the most likely classes for documents are learned. The master's thesis shows that the classification accuracy of the model increases when ontology information is added to it.

People frequently need to find knowledge related to places when they plan a leisure trip, when they are executing that plan in a certain place, or when they want to virtually explore a place they have visited in the past. In this chapter we present and discuss a set of methods for searching and browsing spatiotemporally referenced knowledge related to cultural objects, e.g. artifacts, photographs and visiting sites. These methods have been implemented in the semantic cultural heritage portal CULTURESAMPO that offers map-based interfaces for a user to explore hundreds of thousands of content objects and points of interest in Finland. Our goal is to develop and demonstrate novel ways to help the user 1) to decide where to go for a trip, and 2) to learn more about the neighborhoods and points of interest during the visit.

Ontologies and vocabularies are a key resource for creating interoperable metadata on the Semantic Web. To make finding and using ontologies easier, the idea of Ontology Repositories has been introduced, with current implementations including e.g. the NCBO Bioportal, ONKI and Cupboard. However, at the moment each ontology repository is a separate island with its own user interfaces and APIs. They also use varying ontology languages such as OWL, SKOS, RDF Schema and others. Due to this, global search, browsing, and inference over the repositories is difficult and generally not done. At the same time, there is a genuine need for different kinds of Ontology Repositories, each focusing on different kinds of specific user needs, different ontologies and different organizational requirements, which cannot be addressed by a single global implementation. Since there are benefits to having interoperability among the repositories, we have developed a loosely coupled Network of Ontology Repository (NOR) architecture that makes the repositories globally interoperable while maintaining their unique functionalities and strengths. To participate in the network, each ontology repository is required to implement a shared API. As a proof-of-concept, we present a global metasearch prototype for searching simultaneously hundreds of ontologies in the ONKI and NCBO Bioportal repositories.

This thesis explores the possibilities of using the view-based search paradigm to create intelligent user interfaces on the Semantic Web. After surveying several semantic search techniques, the view-based search paradigm is explained, and argued to fit in a valuable niche in the field. To test the argument, numerous portals with different user interfaces and data were built using the paradigm. Based on the results of these experiments, this thesis argues that the paradigm provides a strong, extendable and flexible base on which to build semantic user interfaces. Designing the actual systems to be as adaptable as possible is also discussed.

Finding ontologies and concepts from a collection of ontologies is a recurring task in many use cases, such as content indexing, searching, and ontology developing. To facilitate this, efficient search and browsing methods are needed. This paper introduces ONKI2, an ontology browser providing a user interface for a repository of ontologies. The system provides a multi-facet search facility for finding an ontology. Finding concepts is supported by autocompletion-based text search that can be refined with additional restrictions. ONKI2 is in use in the Finnish Ontology Library Service ONKI for a collection of 79 ontologies and vocabularies.

Ontology repository systems are used for publishing and sharing ontologies and vocabularies for content indexing, information retrieval, content integration, and other purposes. However, interlinking these distributed repositories to provide global search and browsing over the repositories has not been done. In the spirit of Linked Open Data, we propose creating a network of Linked Open Ontology Services (LOOS) consisting of ontology repositories that publish their content using a shared API. To test the approach, we have defined an HTTP API and present a proof-of-concept implementation consisting of three client applications that are used for accessing a LOOS network of over 50 ontology servers, part of the Ontology Library Service ONKI.

Ontology repository systems are used for publishing and sharing ontologies. However, currently the repositories form separate islands of ontologies, which hinders the user from finding and utilizing the most suitable ontological concepts and ontologies on a global level. In contrast, this paper presents the idea of creating a network of Linked Open Ontology Services (LOOS) based on a set of ontology services that publish their content via a shared API. This facilitates global search and browsing over all ontologies in the network. LOOS has been implemented in the National Finnish Ontology Service ONKI serving currently 79 ontologies.

Publishing information about upcoming events such as concerts and discussion group meetings in a structured format allows the event information to be aggregated, filtered and delivered to potential participants. Making automatic personalized recommendations about events requires structured metadata such as machine-understandable locations and semantic descriptions about the topic and audience of the event. We present a survey of the state of current semantic representation formats for events, including iCalendar and its RDFa and microformat representations, and show that their support for expressing rich structured metadata is limited. We have also tested how well different tools support and understand the formats. Based on the surveys we have implemented a rich event information schema for a health-oriented activity portal and developed an aggregation and validation tool for gathering and processing event information.

Providing citizens with reliable, up-to-date and individually relevant health information on the web is done by governmental, non-governmental, business and other organizations. Currently the information is published with little co-ordination and co-operation between the publishers. For publishers, this means duplicated work and costs due to publishing the same information repeatedly on many websites. Maintaining links between websites also requires work. From the citizens’ point of view, finding content is difficult due to e.g. differences in layman’s vocabularies compared to medical terminology and difficulties in aggregating information from several sites. To solve these problems, we propose a national scale semantic publishing system HealthFinland which consists of 1) a centralized content infrastructure of health ontologies and services with tools, 2) a distributed semantic content creation channel based on several health organizations, and 3) an intelligent semantic portal aggregating and presenting the contents from intuitive and health promoting end-user perspectives for human users as well as for other web sites and portals.

Authors and documents with identical titles are common in the digital library environment. In order to manage identities correctly, authority control is used by library and information scientists for disambiguating and cross-referencing entity names. We argue that the benefits of traditional authority control can be enhanced by using techniques and technologies of the Semantic Web, leading to simpler management of multiple languages, better linkability of resources, simpler reuse of authority registries in applications, and less work in indexing. To demonstrate our propositions, we have created a prototype of an ontology server and service called ONKI People that is used in two ways: First, it is a centralized authority service providing human end-users with efficient and easy to use authority finding and disambiguation services based on faceted semantic search and visualizations. The services are available online also as AJAX and Web Services API for machines to use. Second, the underlying RDF triple store can be used as a content resource in applications such as semantic cultural heritage portals. The paper discusses and demonstrates both use cases in a real life setting.

The Semantic Web extends traditional web documents, i.e. the Web of Pages, with conceptual structures based on ontologies and metadata, i.e. the Web of Data. This paper presents a hybrid document search approach combining the benefits of the traditional text search of literal documents and the semantic search based on their underlying conceptual structures. The approach is based on document expansion, where documents are automatically annotated with not only the concepts explicitly present in a given document, but also with the ontologically related concepts using smaller weights. Our test results using the CLEF Test Suite suggest that document expansion alone achieves better recall than text search at the expense of precision. As a solution, a method of combining document expansion with text search is presented in which better recall was obtained without sacrificing precision. This approach seems promising when integrating unstructured, textual content with the Semantic Web of Data.
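
The document expansion idea described above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the toy `BROADER` hierarchy, the `decay` factor, and the linear combination in `hybrid_score` (with its `alpha` parameter) are all assumptions made for illustration, since the abstract does not state how weights are assigned or how the two searches are combined.

```python
from collections import defaultdict

# Hypothetical toy ontology: concept -> list of broader concepts.
BROADER = {
    "salmon": ["fish"],
    "fish": ["animal"],
}

def expand_annotations(concepts, broader=BROADER, decay=0.5):
    """Annotate a document with its explicit concepts (weight 1.0) and
    with ontologically related broader concepts using smaller weights."""
    weights = defaultdict(float)
    for concept in concepts:
        weight = 1.0
        frontier = [concept]
        seen = set()  # guards against cycles in the hierarchy
        while frontier:
            next_frontier = []
            for c in frontier:
                if c in seen:
                    continue
                seen.add(c)
                weights[c] = max(weights[c], weight)
                next_frontier.extend(broader.get(c, []))
            frontier = next_frontier
            weight *= decay  # each step up the hierarchy weighs less
    return dict(weights)

def hybrid_score(query_terms, doc_tokens, concept_weights, alpha=0.5):
    """Assumed linear mix of plain text matching and expanded-concept matching."""
    if not query_terms:
        return 0.0
    text = sum(1 for t in query_terms if t in doc_tokens) / len(query_terms)
    semantic = sum(concept_weights.get(t, 0.0) for t in query_terms) / len(query_terms)
    return alpha * text + (1 - alpha) * semantic

weights = expand_annotations(["salmon"])
print(weights)
print(hybrid_score(["fish"], {"salmon", "river"}, weights))
```

In this sketch a document that only mentions "salmon" still receives a non-zero score for the query "fish", which is the recall gain the abstract attributes to expansion; the text component is what keeps exact matches ranked above expanded ones.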

Vocabularies are the building blocks of the Semantic Web providing shared terminological resources for content indexing, information retrieval, data exchange, and content integration. Most semantic web applications in practical use are based on lightweight ontologies and, more recently, on the Simple Knowledge Organization System (SKOS) data model being standardized by W3C. Easy and cost-efficient publication, integration, and utilization methods of vocabulary services are therefore highly important for the proliferation of the Semantic Web. This paper presents the ONKI SKOS Server for these tasks. Using ONKI SKOS, a SKOS vocabulary or a lightweight ontology can be published on the web as ready-to-use services in a matter of minutes. The services include not only a browser for human usage, but also Web Service and AJAX interfaces for concept finding, selecting and transporting resources from the ONKI SKOS Server to connected systems. Code generation services for AJAX and Web Service APIs are provided automatically, too. ONKI SKOS services are also used for semantic query expansion in information retrieval tasks. The idea of publishing ontologies as services is analogous to Google Maps. In our case, however, vocabulary services are provided and mashed-up in applications. ONKI SKOS was published in the beginning of 2008 and is to our knowledge the first generic SKOS server of its kind. The system has been used to publish and utilize some 60 vocabularies and ontologies in the National Finnish Ontology Service ONKI www.yso.fi.

In this paper we present an ontology-based query expansion widget which utilizes the ontologies published in the ONKI Ontology Service. The widget can be integrated into a web page, e.g. a search system of a museum catalogue, enhancing the page by providing a query expansion functionality. We have tested the system with general, domain-specific and spatio-temporal ontologies.

Semantic web techniques can be used to relate two things together. However, usually this relation is not accompanied with a measure that would tell how interesting the relation is. Data mining tradition provides interestingness measures; it is natural to try and fit semantic web and data mining traditions together. In this paper we use support and confidence values provided by association rule mining as interestingness measures for relations. The presented method is tailored to location ontologies in order to find out what interesting mutual relations two places have based on annotations in the cultural heritage domain. The method also uses ontology-based reasoning to group places together. We present tests of running the method against a set of over 60,000 annotations in order to find out cultural heritage connections between places.
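
As a rough illustration of using support and confidence as interestingness measures for relations between places, the following sketch computes both values for every pair of places that co-occur in an annotation. The function name `place_rules`, the input format (one set of place names per annotated object) and the sample data are hypothetical; the ontology-based grouping of places mentioned in the abstract is not modelled here.

```python
from collections import Counter
from itertools import combinations

def place_rules(annotations, min_support=0.0):
    """Return {(a, b): (support, confidence)} for each rule a -> b.

    support    = share of annotations mentioning both a and b
    confidence = share of annotations mentioning a that also mention b
    """
    n = len(annotations)
    single = Counter()
    pair = Counter()
    for places in annotations:
        single.update(places)
        pair.update(combinations(sorted(places), 2))
    rules = {}
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        rules[(a, b)] = (support, count / single[a])
        rules[(b, a)] = (support, count / single[b])
    return rules

# Hypothetical annotations: each set lists the places attached to one object.
annotations = [
    {"Helsinki", "Espoo"},
    {"Helsinki", "Espoo"},
    {"Helsinki"},
    {"Turku"},
]
for rule, (support, confidence) in sorted(place_rules(annotations).items()):
    print(rule, round(support, 2), round(confidence, 2))
```

Confidence is asymmetric, so the same pair of places can yield a strong rule in one direction and a weaker one in the other.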

This paper discusses problems of creating and using ontology library services in production use. One approach to a solution is presented with an online implementation, the Finnish Ontology Library Service ONKI, that is in pilot use on a national level in Finland. ONKI contributes to previous research on ontology libraries in many ways: First, mashup and web service support with various tools is provided for cost-efficient utilization of ontologies in indexing and search applications. Second, services covering the different phases of the ontology life cycle are provided. Third, the services are provided and used in real world applications on a national scale. Fourth, the ontology framework is being developed by a collaborative effort by organizations representing different application domains, such as health, culture, and business.

CULTURESAMPO is an application demonstration of a national level publication system of cultural heritage contents on the Web, based on ideas and technologies of the Semantic (Web and) Web 2.0. On the semantic side, the system presents new solutions to interoperability problems of dealing with multiple ontologies of different domains, and to problems of integrating multiple metadata schemas and cross-domain content into a homogeneous semantic portal. A novelty of the system is to use semantic models based on events and narrative process descriptions for modeling and visualizing cultural phenomena, and for semantic recommendations. On the Web 2.0 side, CULTURESAMPO proposes and demonstrates a content creation process for collaborative, distributed ontology and content development including different memory organizations and citizens. The system provides the cultural heritage contents to end-users in a new way through multiple (nine) thematic perspectives, based on semantic visualizations. Furthermore, CULTURESAMPO services are available for external web-applications to use through semantic AJAX widgets.

We present an overview of CultureSampo, an ambitious system for creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic web challenge of aggregating highly heterogeneous, cross-domain cultural heritage collections and other contents into a semantically rich intelligent system for human and machine users. At the same time, CultureSampo is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens. This paper focuses on CultureSampo’s search, recommendation, and visualization services for the end-users. The key idea here is to access cultural heritage on the Semantic Web through nine “thematic perspectives”, such as places on the maps, the social network of cultural persons, timelines, and narrative texts, e.g. biographies and literary works.

This paper presents an overview of the Semantic Web 2.0 application CultureSampo, an ambitious system for creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic web challenge of aggregating highly heterogeneous, cross-domain cultural heritage content into a semantically rich intelligent system for human and machine users. At the same time, CultureSampo is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens.

Ontologies aim to capture knowledge about things and their relationships. Publishing ontologies on the Semantic Web enables people and organizations to use shared ontologies in annotating e.g. photographs, videos, music, and other types of cultural objects. Search engines also use relationships provided by ontologies in semantic search, e.g. for query expansion or for view-based search. However, building ontologies is a time-consuming process, and it should be helped by automatic finding of interesting, possible relationships. Finding the correct concept for annotation purposes is helped by subsumption and partonomy hierarchies and associative relationships. In this paper we show how an analysis of co-occurrences of concepts in annotations can be used to provide interesting relationships for enriching ontological structures. We use association rule mining techniques and test the idea using a set of annotations of cultural objects in the CULTURESAMPO portal and the Finnish General Upper Ontology YSO. The results are visualized in the ONKI SKOS browser to give an additional layer on top of the original relationships of the YSO ontology. An analysis shows that the best-ranked relationships should also be included in the ontology as subclass-of or associative relationships.

This paper presents solutions and lessons learned in the FinnONTO project carried out in Finland in 2003–2007. The paper focuses on three aspects of interoperability of digital collections: first, transforming thesauri to ontologies; second, publishing ontologies for the use of indexers and content providers; third, ontology-based methods for improving end-user access to digital collections. The first aspect is analysed through case studies done with Finnish thesauri. The second is discussed by presenting the ONKI ontology server. The last aspect is demonstrated in the scope of the semantic portal CultureSampo for publishing cultural heritage on the Semantic Web.

This paper presents the national level cross-domain ontology and ontology service infrastructure ONKI used in Finland. The novelty of ONKI is based on two ideas. First, the core ontologies are developed collaboratively by experts transforming thesauri into mutually aligned lightweight ontologies, based on a large top ontology that is extended by various domain specific ontologies. Second, the National Ontology Service ONKI has been implemented for publishing ontologies cost-efficiently as ready to use services. ONKI provides legacy and other applications with ready to use functionalities for using ontologies on the HTML level by Ajax and semantic widgets. ONKI has been used in various applications for creating mash-up applications in a way analogous to using Google Maps, but in our case external applications are mashed-up with ontology support for indexing and information retrieval.

This paper presents the Semantic Web 2.0 application CULTURESAMPO, an ambitious system of creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic challenge of aggregating highly heterogeneous, cross-domain cultural heritage into a semantically rich intelligent system for human and machine users. At the same time, CULTURESAMPO is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens.

Finding people is essential in finding information. Librarians and information scientists have studied authority control; psychologists and sociologists have studied social networks. In the former, authors link to documents (and co-authors), creating access points to information. In the latter, social paths serve as channels for rumours as well as expertise. Key problems include identification and disambiguation of individuals, followed by difficulties of tracking the social connections. With the semantic web, these aspects can be approached simultaneously. In this paper, we define a simple ontology for describing people and organizations. The model is based on FOAF and other existing vocabularies. We also demonstrate search and visualization tools for finding people.

Thesauri and other controlled vocabularies act as building blocks of the Semantic Web by providing shared terminology for facilitating information retrieval, data exchange and integration. Representation and publishing methods are needed for utilizing thesauri efficiently, e.g., in content indexing and searching. W3C has provided the Simple Knowledge Organization System (SKOS) data model for expressing concept schemes, such as thesauri. A standard representation format for thesauri eliminates the need for implementing thesaurus specific rules or applications for processing them. However, there do not exist general tools which provide out of the box support for publishing and utilizing SKOS vocabularies in applications, without needing to implement application specific user interfaces for end users. For solving this problem the ONKI-SKOS server is presented.

This article presents the vision and results of creating a national level cross-domain ontology service infrastructure in Finland in the FinnONTO project. The novelty of the infrastructure is based on two ideas. First, a system of open source core ontologies is being developed by transforming thesauri into mutually aligned lightweight ontologies, including a top ontology of 20,000 concepts that is extended by various domain specific ontologies. Second, the ONKI Ontology Server framework for publishing ontologies as ready to use services has been designed and implemented. ONKI provides legacy and other applications with ready to use functionalities for using ontologies on the user interface level as semantic widgets. The idea is to use ONKI for creating mash-up applications in a way analogous to using Google or Yahoo Maps, but in our case external applications are mashed-up with ontology support. The ontology framework presented is operational on the web and is being used in creating the application demonstrations.

Many tasks on the semantic web require the user to choose concepts from a limited vocabulary, e.g. for describing an indexed resource or for use in semantic search. Semantic autocompletion interfaces offer an efficient way for concept selection. However, these interfaces usually do not expose the semantic context of the matched concepts, thereby making it hard to know if a matched concept is the right one, as well as hiding possibly more appropriate choices. Ontology browsers, on the other hand, show context but do not allow quick discovery or embedding into other applications. To lessen these problems, we present an interface combining semantic autocompletion with in-place ontological context navigation. Because required context differs between ontologies, the implementation was designed to make it easy to add different contexts and visualizations. To test the applicability of our idea and implementation, the system was tested on three ontologies with different requirements and structure.

The Semantic Web is based on using ontologies for enabling semantically disambiguated data exchange between distributed systems on the web. This requires efficient means for publishing ontologies on the web to ensure the availability, sharing and acceptance of the ontologies. Support services are needed for utilizing ontologies easily and cost-effectively in applications and legacy systems lacking ontology support. To address these vital needs, this paper presents the ONKI ontology service which provides ready-to-use mash-up functionalities, such as semantic disambiguation, concept finding and concept fetching as ready-to-use web widgets for adding ontology support to e.g. HTML forms using JavaScript. Two implementations of the ONKI Server are presented: ONKI-SKOS for ontologies presented in the Simple Knowledge Organization System (SKOS) language and ONKI-Geo for geographical ontologies with a map interface. The presented ONKI systems are operational on the web, used in the National Finnish Ontology Service. They have been successfully used in several pilot applications.

Content annotations in semantic cultural heritage portals commonly make spatiotemporal references to historical regions and places using names whose meanings are different in different times. For example, historical administrative regions such as countries, municipalities, and cities have been renamed, merged together, split into parts, and annexed or moved to and from other regions. Even if the names of the regions remain the same (e.g., “Germany”), the underlying regions and their relationships to other regions may change (e.g., the regional coverage of “Germany” at different times). As a result, representing and finding the right ontological meanings for historical geographical names on the semantic web creates severe problems both when annotating contents and during information retrieval. This paper presents a model for representing the meaning of changing geospatial resources. Our aim is to enable precise annotation with temporal geospatial resources and to enable semantic search and browsing using related names from other historical time periods. A simple model and metadata schema is presented for representing and maintaining geospatial changes from which an explicit time series of temporal part-of ontologies can be created automatically. The model has been applied successfully to representing the complete change history of municipalities in Finland during 1865–2007, and the resulting ontology time series is used in the semantic cultural heritage portal CULTURESAMPO to support faceted semantic search of contents and to visualize historical regions on overlaying maps originating from different historical eras.
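
To give a feel for how a time series can be derived automatically from maintained change records, here is a deliberately simplified sketch. It only tracks which regions exist after each change (merges and splits expressed as removed and added regions); the temporal part-of relations and overlaps of the actual model are not represented. The `Change` record, the function name and the region names are hypothetical and do not reflect the paper's metadata schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Change:
    """Hypothetical change record: a merge, split or rename in a given year."""
    year: int
    removed: frozenset  # regions that cease to exist
    added: frozenset    # regions that come into existence

def region_time_series(initial, changes):
    """Return [(year, regions)] snapshots, one per change, in chronological order."""
    current = set(initial)
    series = []
    for change in sorted(changes, key=lambda c: c.year):
        current = (current - change.removed) | change.added
        series.append((change.year, frozenset(current)))
    return series

# Hypothetical regions: C is split in 1900, A and B are merged in 1950.
changes = [
    Change(1950, frozenset({"A", "B"}), frozenset({"AB"})),
    Change(1900, frozenset({"C"}), frozenset({"C1", "C2"})),
]
for year, regions in region_time_series({"A", "B", "C"}, changes):
    print(year, sorted(regions))
```

Each snapshot could then serve as the set of regions valid for annotation or search within the period that starts at its year.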

Geographic place names are semantically often highly ambiguous. For example, there are 491 places in Finland sharing the same name “Isosaari” (great island) that are instances of several geographical classes, such as Island, Forest, Peninsula, Inhabited area, etc. Referring unambiguously to a particular “Isosaari”, either when annotating content or during information retrieval, can be quite problematic and requires usage of advanced search methods and maps for semantic disambiguation. Historical places introduce even more challenges, since historical metadata commonly make spatiotemporal references to historical regions and places using names whose meanings are non-existent or different in different times. This paper presents how these problems have been addressed in a large Finnish place ontology SUO and a historical geo-ontology SAPO. A location ontology server ONKI-Geo has been created for publishing the ontologies and utilizing them as mashup services. To demonstrate the usability of our ontologies, two case applications in the cultural heritage domain are presented.

This paper presents the semantic portal CULTURESAMPO: Finnish Culture on the Semantic Web. The portal provides memory organizations and other cultural content publishers with a national, shared semantic publication channel for heterogeneous cultural contents. The content comes from over ten organizations and is annotated using various ontologies of the FinnONTO infrastructure. For the end-user, intelligent semantic search, recommendation, and visualization services for accessing and learning about cultural heritage are provided.

In this paper we examine 1) the scope of geo-ontologies used especially for the purposes of information retrieval on the Web, 2) the core geographical concepts and their mutual relations, and 3) the properties the concepts have. Furthermore, we present the Finnish geo-ontology (Suomalainen paikkaontologia, SUO) and discuss the theories and principles that have governed the development process, as well as the limitations and requirements that the use of geographical dictionaries as an instance data source has imposed on the content and the structure of SUO.

Cultural heritage is by nature strongly interlinked, e.g. thematically and historically, but at the same time distributed in heterogeneous collections of different memory organizations at different locations. In order to provide the end-users with aggregated homogeneous views to distributed heterogeneous contents, semantic portals have been created successfully based on metadata and shared (or aligned) ontologies. This paper discusses two problems encountered in such a distributed semantic content creation environment. First, during the content creation work, how could a publisher start using shared ontologies in legacy cataloguing and annotation systems that do not support ontologies? Second, during content publication, how could a publisher re-use the aggregated content in its own legacy publication system, e.g., on the ordinary web pages of a museum or in a collection browser? As a solution, we present the ONKI Ontology Server for adding shared ontological annotation functionalities to legacy cataloguing systems in a practical, cost-efficient and lightweight way. For distributed publishing of the aggregated semantic portal services, we introduce the lightweight mash-up web widget components called floatlets. A major idea behind both the ONKI functionalities and floatlets is that they can be easily integrated with legacy systems on the user interface level, in the same spirit as e.g. Google Maps.

We argue that an ontology of historical events is needed in semantic portals for cultural heritage for three reasons. First, ontological identifiers (URIs) of events, such as World War II or the coronation of Napoleon, are needed in order to make collection metadata mutually interoperable in terms of related events, in the same vein as identifiers are needed for identifying artifact types, persons, and geolocations when annotating collection items. Second, events are of central importance in creating semantic links between cultural contents in applications such as recommendation systems. Third, historical events are important as content items of their own, forming the backbone of chronological histories.

In this paper, we argue for a need to shift focus in semantic search from the items themselves to using them as lenses to wider topics. A system for doing this in the cultural heritage domain is presented, duplicating on the web the way exhibitions in the real world are organized. An interface for specifying such exhibitions is presented, combining a general narrative pattern with semantic autocompletion and the novel concept of domain-centric view-based search. This also solves a number of problems view-based search has previously encountered in the cultural heritage domain. Presented also are multiple visualizations for the exhibition, supporting the user in making sense of the data and in doing exploratory search.

This paper presents a method for making metadata conforming to heterogeneous schemas semantically interoperable. The idea is to make the knowledge embedded in the schema structures interoperable and explicit by transforming the schemas into a shared, event-based representation of knowledge about the real world. This enables and simplifies accurate reasoning services such as cross-domain semantic search, browsing, and recommending. A case study of transforming three different schemas and datasets is presented. An implemented knowledge-based recommender system utilizing the results in the semantic portal CultureSampo was found useful in a preliminary user study.

Producing semantic metadata requires efficient methods, e.g., concept finding, for accessing and using ontologies. To add such functionalities to metadata applications such as cataloging systems in museums, we propose a mash-up approach where ready-to-use user interface components for using specific ontologies are made available to be integrated into applications. As a proof-of-concept, we present the Ontology Service ONKI, which implements semantic autocompletion concept search and concept browsing for ontologies as shared mash-up components.

A lot of functionality is needed when an application, such as a museum cataloguing system, is extended with semantic capabilities, for example ontological indexing functionality or multi-facet search. To avoid duplicate work and to enable easy and cost-efficient integration of information systems with the Semantic Web, we propose a web widget approach. Here, data sources are combined with functionality into ready-to-use software components that allow adding semantic functionality to systems with just a few lines of code. As a proof of the concept, we present a collection of general semantic web widgets and case applications that use them, such as the ontology server ONKI, the annotation editor SAHA and the culture portal CultureSampo.

Semantic autocompletion interfaces offer an efficient way for concept selection useful in both search and annotation applications. However, these interfaces usually do not expose the semantic context of the matched concepts, thereby making it hard to know if a matched concept is the right one, as well as hiding possibly more appropriate choices. To lessen these problems, we present an in-place ontological context navigation interface to be used with semantic autocompletion.

In current Semantic Web view-based search systems, views are formed by selecting properties and enumerating all their values as selections. This approach breaks down with multiple content types, such as in the cultural heritage domain, because the number of differing properties, and therefore of views, becomes unmanageable. We propose a novel solution termed Domain-Centric View-Based Search, in which views are created based on common property ranges and domain ontologies.
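A minimal sketch of the range-based grouping follows. The property names, range classes, and items are hypothetical, and the sketch leaves out the domain-ontology hierarchies that the approach also relies on. It only shows how several properties sharing a range collapse into one view.

```python
from collections import defaultdict

# Hypothetical property -> range class declarations.
PROPERTY_RANGE = {
    "placeOfManufacture": "Place",
    "placeOfUsage": "Place",
    "creator": "Person",
    "photographer": "Person",
}

# Hypothetical items of different content types with differing properties.
ITEMS = {
    "vase1": {"placeOfManufacture": "Helsinki", "creator": "A. Potter"},
    "photo7": {"placeOfUsage": "Turku", "photographer": "B. Lens"},
    "chair3": {"placeOfUsage": "Helsinki"},
}


def build_views(items, property_range):
    """Build one view per range class: view -> value -> set of item ids."""
    views = defaultdict(lambda: defaultdict(set))
    for item_id, properties in items.items():
        for prop, value in properties.items():
            range_class = property_range.get(prop)
            if range_class is not None:
                views[range_class][value].add(item_id)
    return views


views = build_views(ITEMS, PROPERTY_RANGE)
```

Four properties yield only two views here ("Place" and "Person"), and selecting "Helsinki" in the "Place" view returns both the vase and the chair even though they are linked to the place through different properties.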

This paper shows how semantic web techniques can be applied to solving problems of distributed content creation, discovery, linking, aggregation, and reuse in health information portals, from both the end-users' and the content publishers' viewpoints. As a case study, the national semantic health portal HealthFinland is presented. It provides citizens with intelligent searching and browsing services to reliable and up-to-date health information created by various health organizations in Finland. The system is based on a shared semantic metadata schema, ontologies, and ontology services. The content includes metadata about thousands of web documents such as web pages, articles, reports, campaign information, news, services, and other information related to health.

Geographic place names are widely used but are often semantically highly ambiguous. For example, there are 491 places in Finland sharing the same name Isosaari (great island) that are instances of several geographical classes, such as Island, Forest, Peninsula, Inhabited area, etc. Referring unambiguously to a particular Isosaari, either when annotating content or during information retrieval, can be quite problematic and requires the use of advanced search methods and maps for semantic disambiguation. This paper presents an ontology server, ONKI-Paikka, for solving the place finding and place name disambiguation problem. In ONKI-Paikka, places can be found by a faceted search engine, combined with semantic autocompletion and a map service for constraining search and for visualizing results. The service can be connected to legacy applications cost-effectively by using Ajax technology in the same spirit as Google Maps, which is used in ONKI-Paikka as a subservice.

This paper presents a system for searching semantic relations between web resources, in our case significant persons of art history. The system is based on the Union List of Artist Names (ULAN) metadata of some 120,000 persons and organizations.
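The abstract does not say which search method the system uses. As one possible illustration, a relation between two persons can be found as a shortest chain of links in a graph using breadth-first search. The names and relations below are invented for the sketch and are not ULAN data.

```python
from collections import deque

# Hypothetical (person, relation, person) statements; not real ULAN data.
RELATIONS = [
    ("Artist A", "teacher of", "Artist B"),
    ("Artist B", "collaborated with", "Artist C"),
    ("Artist C", "patron was", "Patron D"),
    ("Artist A", "sibling of", "Artist E"),
]


def find_relation_path(start, goal, relations=RELATIONS):
    """Return a shortest chain of (source, relation, target) steps, or None."""
    graph = {}
    for source, label, target in relations:
        graph.setdefault(source, []).append((label, target))
        # Relations are also traversed backwards, labelled as inverses.
        graph.setdefault(target, []).append((f"inverse of '{label}'", source))

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, label, neighbor)]))
    return None
```

Calling `find_relation_path("Artist A", "Patron D")` returns the three-step chain through Artist B and Artist C, which could then be rendered to the user as an explanation of how the two persons are connected.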

The Semantic Web is based on using shared ontologies for enabling semantically disambiguated data exchange between distributed systems on the web. This requires, from the ontology publisher's viewpoint, efficient means for publishing ontologies on the web to ensure the availability and acceptance of the ontologies. From the ontology user's viewpoint, support services are needed for utilizing ontologies easily and cost-effectively in the users' own systems, which are typically legacy systems without ontology support. This paper presents the ONKI ontology server for addressing these vital needs. For the publisher, ONKI provides a server and a Simple Knowledge Organization System (SKOS) compatible light-weight ontology browser with ready-made web interfaces for making ontologies available both for human and machine users. For external legacy and other applications, ONKI provides centralized ontology services for semantic disambiguation, concept finding, and concept fetching. A major contribution of ONKI is to provide these services as ready-to-use functionalities for creating mash-up applications very cost-efficiently. Two prototypes of the system (ONKI-SKOS for all kinds of ontologies and ONKI-Geo for geographical ontologies with a map mash-up interface) are operational on the web and are currently being successfully used in several pilot applications.

This paper concerns the idea of publishing heterogeneous cultural content on the Semantic Web. By heterogeneous content we mean metadata describing potentially any kind of cultural objects, including artifacts, photos, paintings, videos, folklore, cultural sites, cultural process descriptions, biographies, history etc. The metadata schemas used are different and the metadata may be represented at different levels of semantic granularity. This work is an extension to previous research on semantic cultural portals, such as MuseumFinland, that are usually based on a shared homogeneous schema, such as Dublin Core, and focus on content of similar kinds, such as artifacts. Our experiences suggest that a semantically richer event-based knowledge representation scheme than traditional metadata schemas is needed in order to support reasoning when performing semantic search and browsing. The new key idea is to transform different forms of metadata into event-based knowledge about the entities and events that take place in the world or in fiction. This approach facilitates semantic interoperability and reasoning about the world and stories at the same time, which enables implementation of intelligent services for the end-user. These ideas are addressed by presenting the vision and solution approaches taken in two prototype implementations of a new kind of cross-domain semantic cultural portal “CULTURESAMPO: Finnish Culture on the Semantic Web”.

An event-based approach is presented for annotating events and narrative structures underlying texts and stories semantically. The idea is applied to using the Finnish national epic Kalevala for accessing related cultural contents, such as artifacts, paintings etc. in a semantic portal.

This article presents the vision and results of creating the basis for a national semantic web content infrastructure in Finland in 2003-2007. The main elements of the infrastructure are shared and open metadata schemas, core ontologies, and public ontology services. Several practical applications testing and demonstrating the usefulness of the infrastructure are overviewed in the fields of eCulture, eHealth, eGovernment, eLearning, and eCommerce.

We present the ONKI ontology server, a mash-up approach for integrating ontology library services with semantic web applications. The idea of ONKI is to provide applications with ready-to-use ontology service functionalities, such as semantic autocompletion, browsing, and annotation support, at the user interface level using AJAX mash-up technologies. The system is being integrated with various semantic web applications.

Creation of rich, ontology-based metadata is one of the major challenges in developing the Semantic Web. Emerging applications utilizing semantic web techniques, such as semantic portals, cannot be realized if there are no proper tools to provide metadata for them. This paper discusses how to make provision of metadata easier and more cost-effective by an annotation framework comprising an annotation editor combined with shared ontology services. We have developed an annotation system supporting distributed collaboration in creating annotations, and hiding the complexity of the annotation schema and the domain ontologies from the annotators. Our system adapts flexibly to different metadata schemas, which makes it suitable for different applications. Support for using ontologies is based on ontology services, such as concept searching and browsing, concept URI fetching, semantic autocompletion, and linguistic concept extraction. The system is being tested in various practical semantic portal projects.

View-based search provides a promising paradigm for formulating complex semantic queries and representing results on the Semantic Web. A challenge for the application of the paradigm is the complexity of providing view-based search services through application programming interfaces (API) and web services. This paper presents a solution on how semantic view-based search can be provided efficiently through an API or as a web service to external applications. The approach has been implemented as the open source tool Ontogator, which has been applied successfully in several practical semantic portals on the web.
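To make the paradigm concrete, the following is a minimal sketch of what a view-based search call computes: the items matching the current selections, plus per-view hit counts for further refinement. The function name, data layout, and items are hypothetical; this is not Ontogator's actual API.

```python
# Hypothetical items: id -> view name -> set of values.
ITEMS = {
    "a": {"material": {"wood"}, "place": {"Helsinki"}},
    "b": {"material": {"wood", "metal"}, "place": {"Turku"}},
    "c": {"material": {"glass"}, "place": {"Helsinki"}},
}


def facet_search(items, selections):
    """Return (sorted matching ids, hit counts per view and value).

    `selections` maps a view name to the single value chosen in that view;
    an item matches only if it satisfies every selection.
    """
    matches = sorted(
        item_id
        for item_id, facets in items.items()
        if all(value in facets.get(view, ()) for view, value in selections.items())
    )
    counts = {}
    for item_id in matches:
        for view, values in items[item_id].items():
            for value in values:
                view_counts = counts.setdefault(view, {})
                view_counts[value] = view_counts.get(value, 0) + 1
    return matches, counts


matches, counts = facet_search(ITEMS, {"material": "wood"})
```

The counts tell a client which further selections would still return results, which is what lets a user interface guide the user away from empty result sets.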

Content in semantic web portals is often projected along application-specific navigational taxonomies and linked semantically. This paper presents a logic-based method and a server ONTODELLA for these tasks. We argue that logic rules between the content layer and the application layer add flexibility and better architectural separation of content and functionality. The system has been implemented and applied successfully in several semantic portals.

This paper generalizes the idea of traditional syntactic text autocompletion onto the semantic level. The idea is to autocomplete typed text into ontological categories instead of words in a vocabulary. The idea has been implemented and its application for semantic indexing and content-based information retrieval in multi-facet search is proposed. Four operational semantic portals on the web using the implementation are presented as application cases.
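A minimal sketch of the idea, assuming a toy ontology with made-up URIs and labels: the typed text is matched against preferred and alternative labels, and the result is a list of concepts rather than a list of words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    uri: str               # hypothetical identifier
    pref_label: str
    alt_labels: tuple = ()


# Toy ontology; URIs and labels are invented for illustration.
CONCEPTS = [
    Concept("ex:chair", "chair", ("seat",)),
    Concept("ex:chairperson", "chairperson", ("chair",)),
    Concept("ex:church", "church"),
]


def autocomplete(typed, concepts=CONCEPTS, limit=10):
    """Return (matched label, concept) pairs whose label starts with the typed text."""
    prefix = typed.strip().lower()
    if not prefix:
        return []
    hits = []
    for concept in concepts:
        for label in (concept.pref_label, *concept.alt_labels):
            if label.lower().startswith(prefix):
                hits.append((label, concept))
    # Sort by label, then URI, so the suggestions are deterministic.
    hits.sort(key=lambda hit: (hit[0].lower(), hit[1].uri))
    return hits[:limit]
```

Typing "chai" suggests the label "chair" twice, once for each of two different concepts, so the user picks a meaning rather than a string. Typing "se" finds the furniture concept through its alternative label "seat".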

This thesis explores the possibilities of using the view-based search paradigm to create intelligent search interfaces on the Semantic Web. After surveying several current semantic search techniques, the view-based search paradigm is explained, and argued to fit in a valuable niche in the field. To test the argument, OntoViews, a semantic view-based search portal creation tool, was designed and implemented, and eight portals with five vastly different user interfaces were built using it. Based on the results of these experiments, this thesis argues that the paradigm, particularly as implemented in the OntoViews tool, provides a strong, extensible and flexible base on which to build semantic search applications. The particular problems faced in applying view-based search for semantic interfaces are noted, along with explanations of how they were solved in the OntoViews architecture. Finally, directions and ideas for future research are presented for both the paradigm and the implementation architecture.

A prototype semantic yellow page service portal is described. Our idea is to represent service offerings as events and processes in terms of ontologies. Based on versatile semantic descriptions, users can be provided with a flexible view-based search engine enhanced with semantic text autocompletion.