Young Lives Linked Data Demonstrator

Young Lives is an international study of childhood poverty, involving 12,000 children in 4 countries over 15 years. It is led by a team in the Department of International Development at the University of Oxford in association with research and policy partners in the 4 study countries: Ethiopia, India, Peru and Vietnam. One of the goals of the programme is to see widespread use of both its published research and the datasets that have been collected as part of the study.

We undertook two pilots to explore how linked data might play a role in communicating Young Lives research.

Phase 1

We developed a set of scripts to model a subset of the Young Lives survey data as RDF Linked Data (Notes on Modelling RDF data). The modelled data was loaded into an OntoWiki platform and made accessible to browse as linked data. A graphing widget was developed to query the data using SPARQL and to display comparable datasets from WHO (mocked up in RDF, as raw RDF was not available).
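As a rough illustration of what modelling survey data as RDF involves, the sketch below turns one survey row into N-Triples statements. It is not the project's actual script: the `http://example.org/younglives/` namespace, the column names (`child_id`, `country`, `height_cm`) and the sample values are all assumptions made up for the example.

```python
# Hypothetical namespace and property names, for illustration only.
BASE = "http://example.org/younglives/"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def row_to_ntriples(row):
    """Turn one survey row (a dict of strings) into N-Triples lines."""
    subject = f"<{BASE}child/{row['child_id']}>"
    country = escape_literal(row["country"])
    height = escape_literal(row["height_cm"])
    return [
        f'{subject} <{BASE}vocab/country> "{country}" .',
        f'{subject} <{BASE}vocab/heightCm> "{height}"^^<{XSD_DECIMAL}> .',
    ]


# Made-up sample row, not real study data.
sample_rows = [{"child_id": "c001", "country": "Peru", "height_cm": "121.5"}]
for sample in sample_rows:
    print("\n".join(row_to_ntriples(sample)))
```

Each row becomes a subject URI with one statement per variable. Once loaded into a triple store, statements of this shape can be queried with SPARQL in the way the graphing widget did.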

We identified that, whilst linked data should allow connections to be made across datasets:

(a) Subtle differences in definition across datasets can limit the possibility of automated comparison;

(b) There are limited datasets available covering key development topics in RDF at present, and this limits the scope for automated comparison.

We also found that the self-describing nature of linked data can support in-depth annotation of an academic study.

The Phase 1 demonstration is no longer online.

Phase 2

The second phase focussed on:

Modelling all the questions asked during the study and publishing these as linked data;

Publishing selected statistics from Round three of the study as linked data and making these accessible through an interactive visualisation;

Representing the structure of the Young Lives study as linked data (using a SKOS concept scheme);

Modelling details of publications from the study, integrating data on these from R4D, a third-party source of data on some of the publications.

The resulting site at data.younglives.org.uk has been designed to provide a stable platform for end-users to access and browse - making key concepts and findings from the study accessible to both humans and computers.

The Young Lives Grapher - which takes RDF Data Cubes and displays interactive graphs using the Google Chart API

The CSV to Data Cube Import Tool - which takes formatted CSV files, and provides an interface to convert these into RDF Data Cubes

Custom code to generate mappings between SPSS, DDI and RDF models of study structure, and to process Young Lives publication files, is also archived on GitHub.
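To give a sense of what converting a formatted CSV file into an RDF Data Cube involves, here is a minimal sketch that writes one `qb:Observation` per CSV row as Turtle. It is not the CSV to Data Cube Import Tool itself: the `ex:` namespace, the column names (`country`, `indicator`, `value`), the dataset identifier and the sample values are assumptions, and a full cube would also need a data structure definition declaring its dimensions and measures.

```python
import csv
import io

# qb: is the W3C Data Cube vocabulary; ex: is a hypothetical namespace.
PREFIXES = (
    "@prefix qb: <http://purl.org/linked-data/cube#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
    "@prefix ex: <http://example.org/younglives/> .\n"
)


def csv_to_datacube(csv_text, dataset="ex:dataset1"):
    """Convert CSV text with country, indicator and value columns to Turtle."""
    reader = csv.DictReader(io.StringIO(csv_text))
    blocks = [PREFIXES]
    for number, row in enumerate(reader, start=1):
        country = row["country"]
        indicator = row["indicator"]
        value = row["value"]
        # One observation per CSV row, attached to the assumed dataset.
        blocks.append(
            f"ex:obs{number} a qb:Observation ;\n"
            f"    qb:dataSet {dataset} ;\n"
            f'    ex:country "{country}" ;\n'
            f'    ex:indicator "{indicator}" ;\n'
            f'    ex:value "{value}"^^xsd:decimal .\n'
        )
    return "\n".join(blocks)


# Made-up sample input, not real study statistics.
SAMPLE_CSV = "country,indicator,value\nPeru,example_indicator,12.5\n"
print(csv_to_datacube(SAMPLE_CSV))
```

The sketch does no escaping or validation of cell values, so it is only suitable for clean, trusted input.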

Technical Issues Paper

A Technical Issues paper collates key learning from the process of developing a number of linked data pilots. It highlights key considerations for practitioners exploring the development of linked open data projects in the development field. The draft is available in three sections:

The paper remains a working draft, though elements of it have been re-used in a number of other focussed publications.

The Social Life of Open Data

The Social Life of Open Data (SLOD) project has explored the ways in which open data re-use relies upon a chain of re-use, with the different actors between 'raw datasets' and their re-use playing significant roles in shaping how data is interpreted and used.

In looking at IATI data, a distinction between the 'infrastructures' of open data and the 'eco-system' of re-use was identified, and this formed the basis of the initial project write-up. Time constraints and the technical difficulties of capturing and managing full provenance information mean that a full analysis of the social provenance chains involved in IATI data re-use (and the application of the method to further datasets) has been postponed until later in 2012, to be completed as part of the author's PhD work outside of the IKM Emergent programme.

Outputs

SLOD Tool - an open source Django application for capturing data using the W3C PROV-DM model.
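The idea of a chain of re-use can be sketched with the derivation relation from PROV-DM (wasDerivedFrom). The example below is not taken from the SLOD Tool: the entity names are hypothetical, and the chain is held in a plain Python dictionary rather than a Django model.

```python
# Each key was derived from its value (PROV-DM wasDerivedFrom).
# Entity names are hypothetical examples.
DERIVATIONS = {
    "ex:visualisation": "ex:cleaned-extract",
    "ex:cleaned-extract": "ex:raw-dataset",
}


def provenance_chain(entity, derivations):
    """Follow wasDerivedFrom links from an entity back to its original source."""
    chain = [entity]
    seen = {entity}
    while chain[-1] in derivations:
        source = derivations[chain[-1]]
        if source in seen:
            # Stop on a cycle rather than loop forever.
            break
        chain.append(source)
        seen.add(source)
    return chain


print(" <- ".join(reversed(provenance_chain("ex:visualisation", DERIVATIONS))))
```

Recording who performed each step (PROV-DM agents and activities) would be needed to capture the social side of the chain; this sketch covers only the data lineage.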