Technologies

Context

Standard vocabularies and ontologies are key elements in achieving data interoperability. The D2KAB project (www.d2kab.org) develops and supports AgroPortal (http://agroportal.lirmm.fr), a reference ontology repository for agronomy, food and plant sciences. We collaborate with the Stanford NCBO BioPortal group to synchronize our efforts and share technology development. We have already designed and implemented an advanced prototype offering ontology-based services that hosts 106 ontologies or vocabularies, including some reference resources in the domain: Agrovoc, NAL Thesaurus, Crop Ontology, etc. With such a number of ontologies, new problems have arisen, such as describing, selecting, evaluating, trusting and interconnecting ontologies, as well as using them for semantic annotation of data.

We are offering a postdoc or researcher position to develop new ontology management and alignment capabilities inside AgroPortal, including capturing and synchronizing metadata descriptions; facilitating the cohabitation, interoperation and appropriate use of different types of semantic resources (e.g., from SKOS vocabularies to formal OWL ontologies); improving ontology selection and recommendation; and enabling ontology interoperation. Building also on the experience and technology developed with the YAM++ application (http://yamplusplus.lirmm.fr), LIRMM's ontology alignment matcher, we will develop a state-of-the-art framework for mapping extraction, generation, validation, evaluation, storage and retrieval by adopting a complete semantic web and linked open data approach and engaging the community for curation.

Detailed description

A key aspect in addressing semantic interoperability in agronomy, plant sciences, nutrition and biodiversity is the use of ontologies as a common denominator to describe data, make them interoperable and turn them into structured and formalized knowledge. Biomedicine has always been a leading domain for semantic interoperability, pioneering the development of reference ontologies such as the Gene Ontology. This has served as a model for the agronomic, environmental and plant sciences (e.g., Plant Ontology [1], Crop Ontology [2]), opening the space to various types of semantic applications [3], to data integration and to decision support. Semantic interoperability has been identified as a key issue for agronomy and biodiversity sciences, and the use of ontologies as a way to address it [4], [5]. As more ontologies and vocabularies are produced in the domain, the need to host them, describe them appropriately and manage the alignments between them becomes more pressing.

By reusing the NCBO BioPortal technology, we have designed AgroPortal, an ontology repository for the agronomy domain (http://agroportal.lirmm.fr) [7]. The main objective of the AgroPortal project is to develop and support a reference ontology repository for agronomy, food, plant sciences, and biodiversity. It offers the community a robust and reliable advanced prototype service that features ontology hosting, search, versioning, visualization and commenting, services for semantically annotating data with the ontologies, as well as the storage and exploitation of ontology alignments, all within a semantic web compliant infrastructure. Ontologies in the portal are being developed within multiple agronomic use cases, including Agronomic Linked Data (http://agrold.org) and INRA Linked Open Vocabularies (http://lovinra.inra.fr), an effort to publish vocabularies produced or co-produced by INRA.

YAM++ is a state-of-the-art ontology alignment system being developed at LIRMM [8]. YAM++ uses machine-learning techniques to combine different similarity measures, exploiting the intrinsic textual features of ontologies to provide similarity scores based on information retrieval techniques. YAM++ obtained excellent results during the OAEI 2013 campaign [9]. Since 2016, YAM++ has also been available as a multifunctional web service application (http://yamplusplus.lirmm.fr) allowing manual mapping validation and enrichment.
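To give a rough idea of what combining similarity measures over textual features can look like, here is a minimal Python sketch. It is not YAM++'s actual method: YAM++ learns how to combine measures, whereas this sketch assumes fixed weights, and the function names, weights and threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two labels."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def edit_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def combined_similarity(a: str, b: str,
                        w_token: float = 0.5, w_edit: float = 0.5) -> float:
    """Weighted sum of the two measures (fixed weights are an assumption)."""
    return w_token * token_jaccard(a, b) + w_edit * edit_similarity(a, b)


def best_matches(source_labels, target_labels, threshold: float = 0.7):
    """For each source label, keep the best target label above the threshold."""
    if not target_labels:
        return []
    matches = []
    for s in source_labels:
        best = max(target_labels, key=lambda t: combined_similarity(s, t))
        score = combined_similarity(s, best)
        if score >= threshold:
            matches.append((s, best, score))
    return matches
```

Called on two short label lists, for example `best_matches(["Wheat"], ["wheat", "maize"])`, the sketch pairs labels that differ only in case and drops unrelated ones. A real matcher would also use synonyms, structure and context, and would leave candidate mappings open to manual validation.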

The postdoc/researcher mission will be to:

Work with partners on the design (using semantic web standards) of their ontologies/vocabularies and on their integration (when not yet done) within AgroPortal.

Work on metadata extraction, synchronisation and exploitation to facilitate the selection and recommendation of ontologies (cf. [10,11]).

Align the ontologies within AgroPortal to one another and to the GACS vocabulary (cf. below). Release mappings as linked open data.

Design an ontology alignment framework inside AgroPortal to make YAM++/AgroPortal the reference platform to extract, generate, validate, evaluate, store and retrieve ontology alignments. Work with partners on generating and curating mappings using this framework.

Contribute to the GACS project with the AgroPortal alignment framework and make AgroPortal the preferred platform for hosting and browsing the GACS vocabulary.
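As an illustration of what releasing mappings as linked open data could involve, the following Python sketch serializes mappings as N-Triples using the standard SKOS mapping properties. It is only a sketch under stated assumptions: the function names are hypothetical, the example IRIs use the placeholder domain example.org, and nothing here reflects how AgroPortal actually stores or publishes its mappings.

```python
# SKOS namespace and its five standard mapping properties.
SKOS = "http://www.w3.org/2004/02/skos/core#"
MAPPING_RELATIONS = {
    "exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch",
}


def mapping_to_ntriple(source_iri: str, relation: str, target_iri: str) -> str:
    """Return one N-Triples line stating a SKOS mapping between two concepts."""
    if relation not in MAPPING_RELATIONS:
        raise ValueError(f"not a SKOS mapping property: {relation}")
    return f"<{source_iri}> <{SKOS}{relation}> <{target_iri}> ."


def mappings_to_ntriples(mappings) -> str:
    """Serialize (source, relation, target) tuples, one triple per line."""
    return "".join(mapping_to_ntriple(*m) + "\n" for m in mappings)


if __name__ == "__main__":
    # Placeholder IRIs, not real vocabulary identifiers.
    example = [
        ("http://example.org/vocabA#Wheat", "exactMatch",
         "http://example.org/vocabB#wheat"),
        ("http://example.org/vocabA#DurumWheat", "broadMatch",
         "http://example.org/vocabB#wheat"),
    ]
    print(mappings_to_ntriples(example), end="")
```

The resulting file can be loaded into any RDF triple store, which is one way mappings can be queried alongside the ontologies they connect.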

The project will be driven by the use cases of the D2KAB ANR project (e.g., food packaging, agro-agri linked data, wheat phenotype, ecosystems & plant biogeography). In collaboration with the RDA Agrisemantics working group (http://agrisemantics.org), we will work on the development of the Global Agricultural Concept Scheme (GACS), an important international initiative to integrate Agrovoc, the CAB Thesaurus, and the NAL Thesaurus (www.agrisemantics.org/gacs) [6]. Because of its size and its endorsement by major organizations, GACS will certainly be a major element in the lingua franca for agriculture (and related domains), and AgroPortal has been proposed to the Agrisemantics WG as the platform for accessing each of the three original thesauri as well as GACS itself. We will produce alignments to build GACS and to interconnect it with other ontologies in AgroPortal.

Expected profile

We are looking for a motivated postdoc or experienced researcher. The candidate must hold a PhD in Informatics / Computer Science and must have experience in the semantic web area and in using ontologies. The candidate should demonstrate aptitude in, or a match with, most of the following aspects: