Whose data is it anyway?

Who owns scientific data? The scientist who collects it or the person that pays for it? With the UK making serious commitments to open access for the outcomes of research, Dr Liz Harley asks whether the data underpinning those outcomes should be publicly available as well.

Ownership of scientific data is a contentious issue. Historically, the scientist who developed the hypothesis, designed the experiments and analysed the outcomes has been considered the owner of the data. No one would dispute Charles Darwin’s ownership of the data he meticulously collected through his pigeon breeding experiments or barnacle dissections.

Darwin paid for his research using family money, and conducted much of his work in laboratories on his own premises. But today the vast majority of UK scientific research is funded not by personal fortune, like Darwin’s, but by the Research Councils. Their money is allocated by the Government and comes ultimately from the taxpayer. Today Darwin would be working out of a laboratory on UCL’s Gower Street (the site of his family’s London house) and funded by a grant from the Natural Environment Research Council.

One of the key arguments of the Open Access movement is that scientists should be accountable to the public that funds their research. Making the outcomes of publicly funded work free at the point of publication allows the public to assess for themselves the research conducted with their money and ostensibly on their behalf. Research Councils UK, the umbrella body for UK research councils, has set in motion an ambitious open access policy that would see all peer-reviewed research articles that acknowledge research council funding published in an open access format. The policy was implemented on April 1st 2013, with the anticipation of a transition period of around five years.

That takes care of the outcomes of research, the published papers, but what about the underlying data? Does anyone have a right to see, or demand to see, that data?

There are many benefits to having openly available scientific data, chief of which is aiding the development of new research. Hypothetically, if Darwin had made his data on the basic physiology of the barnacle freely available, it could have formed the basis for research that Darwin could never have imagined, perhaps in the fields of conservation or ecotoxicology.

Having access to existing data could prevent duplication of research efforts by different groups, which would be particularly beneficial when that research involves the use of animals. And from a broader perspective, making all data subject to wider scrutiny – essentially peer review – has the potential to identify mistakes, deliberate or otherwise, more rapidly. While open data is unlikely to eliminate scientific fraud, it could make fraud a lot more difficult to get away with.

However, many would argue that while it is acceptable to make the outcomes of data publicly available, the data itself is the intellectual property of the scientist. The adage ‘publish or perish’ is often used to describe the pressure that scientists are under to generate publications, and as one dataset can form the basis of multiple publications there is often little incentive to make that resource available to competing interests. And there is nothing to stop the unscrupulous passing another scientist’s data off as their own.

But theoretically there is a system that could manage these competing interests. The premise is simple: research grants typically run for a fixed length of time, approximately three years. So a condition of receiving the research funding could be that within three years of the end of the grant the researcher has to make all data collected under that grant publicly available. This would give the scientist time to work with the data and produce the publications that are so critical to research success, while still ensuring that the public who paid for the data will get to see it.

The question of who owns what makes implementing any kind of concrete open data policy difficult. At present Research Councils UK have a set of common principles, which acknowledge the importance of open data, along with the “legal, ethical and commercial constraints” surrounding data release.

But in a climate of greater openness and transparency, not just in science but across all sectors, perhaps what we should be asking ourselves is not who owns the data, but how we can use it to do the greatest good.