Implications of Linked Data for GEOSS

This page discusses how the concept of linked data can be applied to data, metadata, semantics, etc. throughout GEOSS. This could be a major shift for GEOSS, since linked data is strongly semantically driven. One approach is to modify the GCI components over time to handle linked data, alongside an education and outreach effort that encourages data providers to adopt linked data. The long-term goal is for linked data to be used more and more widely, resulting in greater benefit to GEOSS users and providers.

The initial effort, coming from the SIF, is to identify, analyze, and recommend some best practices and guidelines for using linked data. The SIF will also study and recommend how GCI components can be modified to exploit linked data if it is available. This effort is not meant to disrupt any existing operations.

Identified Best Practices and Guidelines for Linked Data

W3C Best Practices for Publishing Linked Data (W3C Working Group Note 09 January 2014)

The W3C Best Practices address distinct phases of publishing linked data, summarised by the following section headings extracted from the document:

PREPARE STAKEHOLDERS

SELECT A DATASET

MODEL THE DATA

SPECIFY AN APPROPRIATE LICENSE

GOOD URIs FOR LINKED DATA

USE STANDARD VOCABULARIES

CONVERT DATA

PROVIDE MACHINE ACCESS TO DATA

ANNOUNCE NEW DATA SETS

RECOGNIZE THE SOCIAL CONTRACT

Consideration of all of the above aspects is key to implementing linked data solutions. For GEOSS and a possible GCI implementation, which areas should be focussed on? For example, can GEOSS define a policy for URI/IRI specification (cf. RFC 5870)? A URI policy is important for persistence and has to be assessed before later stages such as converting data. For which categories could the GCI implement an appropriate service, and for which does the responsibility lie with the data provider?
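To make the URI policy question more concrete, the following is a minimal sketch of what such a policy might look like if expressed in code. It is illustrative only: the base address https://example.org/id, the path pattern, and the function names mint_uri and geo_uri are assumptions made for this example and are not part of any GEOSS or GCI specification. The second function formats a position as a "geo" URI, the scheme defined by RFC 5870.

```python
import re
from typing import Optional

# Hypothetical base for persistent identifiers; not an actual GEOSS namespace.
BASE = "https://example.org/id"


def mint_uri(kind: str, local_id: str) -> str:
    """Build a URI of the form BASE/kind/slug.

    The slug is lower-case with hyphens and carries no file extension,
    so the identifier does not change if the serialisation format changes.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", local_id.lower()).strip("-")
    if not slug:
        raise ValueError("local_id must contain at least one letter or digit")
    return f"{BASE}/{kind}/{slug}"


def _num(value: float) -> str:
    """Format a number as a plain decimal (no exponent, no trailing zeros)."""
    return format(value, ".6f").rstrip("0").rstrip(".")


def geo_uri(lat: float, lon: float, alt: Optional[float] = None) -> str:
    """Format a WGS 84 position as an RFC 5870 'geo' URI."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    parts = [_num(lat), _num(lon)]
    if alt is not None:
        parts.append(_num(alt))
    return "geo:" + ",".join(parts)


if __name__ == "__main__":
    print(mint_uri("dataset", "Sea Surface Temp 2014"))
    print(geo_uri(48.2010, 16.3695, 183))
```

Keeping the minting rule in one place is one way to make persistence checkable before any data is converted. Whether such a rule would be applied by a GCI service or by each data provider is exactly the open question raised above.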

Converting complete data sets to linked data presents challenges, and a common practice is to extract and link only metadata or subsets of the data. In the Melodies project, for example, image data is preserved as is, while metadata and extracted features are recorded as linked data. Solutions for full datasets include the RDF Data Cube Vocabulary: "The model underpinning the Data Cube vocabulary is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations. The Data Cube vocabulary is a core foundation which supports extension vocabularies to enable publication of other aspects of statistical data flows or other multi-dimensional data sets."
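As a rough illustration of the RDF Data Cube approach, the sketch below writes a few observations as Turtle using the qb: vocabulary and the SDMX dimension and measure terms. The ex: namespace, the dataset and observation names, and the sample values are all hypothetical. A complete Data Cube publication would also need a qb:DataStructureDefinition, which is omitted here for brevity.

```python
from decimal import Decimal
from typing import Iterable, Tuple

# qb: and sdmx-* are the published Data Cube / SDMX namespaces;
# ex: is a hypothetical namespace used only for this example.
PREFIXES = """\
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> .
@prefix sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <https://example.org/id/> .
"""

# One row: (observation id, area id, year, value)
Row = Tuple[str, str, str, float]


def observation(dataset: str, row: Row) -> str:
    """Serialise one row as a qb:Observation in Turtle."""
    obs_id, area, year, value = row
    # xsd:decimal does not allow exponent notation, so format explicitly.
    literal = format(Decimal(str(value)), "f")
    return (
        f"ex:{obs_id} a qb:Observation ;\n"
        f"    qb:dataSet ex:{dataset} ;\n"
        f"    sdmx-dimension:refArea ex:{area} ;\n"
        f'    sdmx-dimension:refPeriod "{year}"^^xsd:gYear ;\n'
        f'    sdmx-measure:obsValue "{literal}"^^xsd:decimal .\n'
    )


def to_turtle(dataset: str, rows: Iterable[Row]) -> str:
    """Return a Turtle document declaring the dataset and its observations."""
    body = [f"ex:{dataset} a qb:DataSet .\n"]
    body.extend(observation(dataset, row) for row in rows)
    return PREFIXES + "\n" + "\n".join(body)


if __name__ == "__main__":
    sample = [("obs1", "area1", "2013", 17.1), ("obs2", "area1", "2014", 17.3)]
    print(to_turtle("sst", sample))
```

Each observation carries its own dimension values, which is what allows a multi-dimensional data set to be queried and linked at the level of individual cells rather than as a single opaque file.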