The W3C HCLS BioRDF Task Force on Provenance

W3C HCLS BioRDF Task Force

Short description: One of the current goals of the BioRDF Task Force is to transform microarray gene expression results into RDF while preserving provenance information about those results, such as which samples were used, which institutions contributed the samples, and which experimental factors were used to produce the results. In the first iteration, we created a provenance data model that captures provenance information at four levels:

the institutional level, which describes the laboratory performing an experiment and the publication reporting the results;

the experimental context level, which describes samples used in the experiment and the list of genes being studied;

the statistical and significance analysis level, which describes the statistical and significance analysis tools used in an experiment and results of the analysis;

the dataset description level, which provides descriptive metadata about the gene list results from each study.
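
The four levels above can be sketched as a small data structure that flattens into subject-predicate-object triples. This is only an illustration, not the task force's actual model: the namespace `http://example.org/biordf-prov#`, the class names, the property names (`performedBy`, `reportedIn`, `usedSample`, and so on), and all sample values are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical namespace; the task force's real vocabulary is not given here.
EX = "http://example.org/biordf-prov#"

Triple = Tuple[str, str, str]


@dataclass
class InstitutionalLevel:
    laboratory: str   # laboratory performing the experiment
    publication: str  # publication reporting the results


@dataclass
class ExperimentalContextLevel:
    samples: List[str]  # samples used in the experiment
    genes: List[str]    # genes being studied


@dataclass
class AnalysisLevel:
    tools: List[str]    # statistical and significance analysis tools
    results: List[str]  # results of the analysis


@dataclass
class DatasetDescriptionLevel:
    title: str        # descriptive metadata about the gene list
    description: str


@dataclass
class GeneListProvenance:
    dataset_id: str
    institutional: InstitutionalLevel
    context: ExperimentalContextLevel
    analysis: AnalysisLevel
    dataset: DatasetDescriptionLevel

    def to_triples(self) -> List[Triple]:
        """Flatten the four levels into triples about one gene list."""
        s = EX + self.dataset_id
        triples: List[Triple] = [
            (s, EX + "performedBy", self.institutional.laboratory),
            (s, EX + "reportedIn", self.institutional.publication),
            (s, EX + "title", self.dataset.title),
            (s, EX + "description", self.dataset.description),
        ]
        triples += [(s, EX + "usedSample", x) for x in self.context.samples]
        triples += [(s, EX + "studiedGene", g) for g in self.context.genes]
        triples += [(s, EX + "analysedWith", t) for t in self.analysis.tools]
        triples += [(s, EX + "analysisResult", r) for r in self.analysis.results]
        return triples


# Placeholder values only; none of these come from a real study.
record = GeneListProvenance(
    dataset_id="example-dataset",
    institutional=InstitutionalLevel("Example Laboratory", "Example Publication"),
    context=ExperimentalContextLevel(["sample-1", "sample-2"], ["gene-A"]),
    analysis=AnalysisLevel(["example-tool"], ["example-result"]),
    dataset=DatasetDescriptionLevel("Example gene list", "Placeholder metadata"),
)
triples = record.to_triples()
for t in triples:
    print(t)
```

Keeping each level in its own class mirrors the separation described above, so one level can be revised or mapped to another vocabulary without touching the others.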

In that first iteration we deliberately did not reuse any existing provenance vocabularies or ontologies, so that our data modelling remained independent of them. We are now refactoring the model and mapping it to an existing provenance vocabulary.