Obama Administration backs open access to all federal research

Papers available one year after publication, data behind them preserved.

Today, John Holdren, the head of the White House Office of Science and Technology Policy, announced that the administration is adopting a policy that would see nearly all of the science papers produced through federal funding made accessible to the public within a year of their publication. The new rules would apply to any agency that has a research budget of over $100 million, and it would include measures for preserving any digital data that was associated with the research.

A similar policy has already been adopted by the National Institutes of Health, and there were indications that the administration had been considering this measure for some time. It was perhaps pushed along by a "We the People" petition that succeeded under the previous standards, reaching 65,000 signatures (100,000 are now needed). Still, Holdren's announcement finally clarifies the intended plan.

The one-year term for publications to remain behind paywalls is only a guideline, and agencies can shift the limit based on the publishing issues faced by their particular fields. As further protection for publishers, each agency must develop "procedures the agency will take to help prevent the unauthorized mass redistribution of scholarly publications." At the same time, however, the plan calls for agencies to provide strategies for helping the public search for and find the papers and to make metadata available for aggregation and analysis. Given all that, it's not clear whether "unauthorized mass redistribution" would even be necessary.

In addition to the publications, the plan calls for agencies to preserve "digitally formatted scientific data resulting from unclassified research" and to make that accessible too. This won't include things like lab notebooks or draft versions of the papers, but it might include databases and images that were essential to the analysis.

Overall, the goal is a good one, as it should help provide the public with access to the research it paid for, and scientists will have a greater ability to find and link their research with past work. The big challenge, however, is that it's all supposed to be done without any additional spending. Preservation of data and sharing it in a usable form aren't always cheap.

I remember signing this petition a few weeks back! The first one so far that I've signed to receive a response. Though I do wonder what repercussions could result from this, if any. One year sounds fair enough. Not so long that we'll be slowing down progress, and not so short that the groups will not be able to profit from their work, or to get a head start.

"Nearly all" probably means all of the boring stuff. DHS/NSA/CIA mathematical and scientific discoveries probably won't count eh? There's a wealth of unpublished scientific knowledge being held under lock and key by some more secretive agencies.

"And while this new policy call does not insist that every agency copy the NIH approach exactly, it does ensure that similar policies will appear across government." - From the response

So this is just the starting point, so long as the research is not compromising, or otherwise dangerous in the wrong hands. I do not see why more policies like this will not be put in place.

I remember signing this petition a few weeks back! The first one so far that I've signed to receive a response. Though I do wonder what repercussions could result from this, if any. One year sounds fair enough. Not so long that we'll be slowing down progress, and not so short that the groups will not be able to profit from their work, or to get a head start.

It's not as much preserving the group's head start (if follow-up studies aren't well under way by the time you publish, you're doing something wrong or are exiting the field) or profit potential (profit-driven research like a drug candidate will be published after it has undergone some form of IP protection) as supporting the infrastructure that goes into the journal setup. You have to pay for the websites, the editors' time, typesetting, etc. Peer review is generally on a volunteer basis, but the costs behind it are non-trivial, and in the journal model are paid for by subscriptions and to an extent page fees. If the papers are all open access immediately, there's no impetus to subscribe to get the latest articles. PLoS et al. offset this with more substantial publication charges.

As a researcher, I both applaud and am scared by this. Forcing journals to have open access after a year is a great idea, although I'm not sure how they'll enforce it (given that not all journals are based in the US, and some might argue this is an anti-business move, given that they're in it for the money).

The thing I'm scared of is the massive bureaucratic nightmare that will descend upon us when we are required to make our data public. Where will these data be stored, and in what format? How are these rules going to be enforced exactly? It just sounds like a nightmare to police and run, even if it's a good idea in practice.

Agreed that it will be a tricky system, but the brass tacks of it is, when one takes government money, one plays by government rules.

This is overall quite a good thing, but I am concerned about "...it's all supposed to be done without any additional spending." A long-term data retention plan can get pricey. There's the upfront storage cost, power and maintenance. Then there's the maintenance of the data formats and the tools necessary to read them.

I hope they can find a way to account for all of that, without having to impact the actual research budgets...

I like this idea, both halves of it. I like reading study results, but there are a lot of problems that can be seen, and having access to metadata - and actual studies, can help.

Here's an example. A European study looked at whether living near powerful electrical distribution lines caused illness. They looked at several dozen illnesses, and found a couple that made 2-sigma (95%) confidence, and several that made 1-sigma. This was reported in the press as a disaster - these wires kill people. However, somebody let the raw data out, and it became clear that there were just as many 1-sigma and 2-sigma cases where it appeared to prevent disease. Had the metadata not been released, this wouldn't have been discovered and everybody would have been left with the wrong impression.
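The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "several dozen" is taken to be 36 illnesses (a hypothetical figure), the tests are treated as independent, and the 2-sigma threshold is approximated as a 5% chance of a false alarm per test, as in the comment above. None of these numbers come from the study itself.

```python
def expected_false_positives(num_tests: int, alpha: float) -> float:
    """Average number of tests that cross the threshold by chance alone."""
    return num_tests * alpha


def prob_at_least_one(num_tests: int, alpha: float) -> float:
    """Chance that at least one independent test crosses the threshold by luck."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Hypothetical inputs: 36 illnesses checked, 5% false-alarm rate per illness.
NUM_ILLNESSES = 36
ALPHA = 0.05

if __name__ == "__main__":
    print("expected chance hits:", expected_false_positives(NUM_ILLNESSES, ALPHA))
    print("P(at least one chance hit):", prob_at_least_one(NUM_ILLNESSES, ALPHA))
```

Under those assumptions, a couple of 2-sigma results among several dozen comparisons is roughly what chance alone would produce, which is why seeing the full set of results matters.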

There are many cases where cause and effect are very unclear. Does sodium cause heart problems, or do people who eat a lot of fast food have heart trouble and also eat a lot of sodium? Putting people on low sodium diets doesn't seem to help. Does salty sushi cause the same problems as salty French fries? Metadata will help.

Separately, the new up/down voting process sort of gives me the creeps. As I was writing this, I caught myself trying to say things other Ars readers would like and avoid things they wouldn't like. I know we don't want trolls, but I must say that making things a popularity contest doesn't always lead to the most healthy discussion.

In the areas I am familiar with, all the costs and about 99% of the profits of publishers come from one source: the federal government. The writers (aka researchers) don't get paid for their articles, often having to pay to publish them (page fees).

I can't see why the government can't simply create database/web servers to host the papers. Require grant recipients to deposit their papers on a government server (instead of publishing). Then either reduce the grants by the amount saved in not publishing or charge a publication fee to cover the costs.

In the end the government would save a lot by doing this.

PS: When I was a grad student, a friend forwarded me TeX preprints that he got emailed by a friend. Unfortunately this got really unwieldy after a while, so Paul Ginsparg created something generally known as xxx.lanl.gov. That eventually evolved into arXiv. The system has been the principal place for most researchers to look up papers for thirty years. Add a status (unreviewed, prereviewed, reviewed) and a reviewing system to it, and for those areas you would be there.

Obama is following instead of leading. All publications of research funded by the federal government should be open access on publication. In practice, most are already available after a year. But, it is a scandal that even PNAS has a waiting period before access is open. Government sponsored organizations should not be funding their operations by subscription fees particularly when those fees are mostly paid through government grants. This is just another case of bureaucratic floundering. The President's actions are just another example of the hypocrisy of his pretensions to open government.

And what do you suppose you'll do with all this information on the oxidative detoxification of aqueous bark extracts now that it will be available to you? My strong suspicion is "nothing, we just want it". Or "nothing, but I need to know exactly what sort of tinfoil is best at guarding against the government's Thought Beacons".

The US government produces a great deal of core research on a lot of things as boring as bark extracts, and a great deal that sounds boring but which has real economic value to those people who know what to do with it. And if you're one of those people who know what to do with it, you already know who in the government you need to ask to get free and unfettered access to it, or at least you should.

Me, I'd rather make the Chinese government go through the effort of stealing it rather than just giving it to them by making it public.

And as a researcher I'd rather not have data that I want to see behind a paywall when my institution doesn't subscribe to that journal.

In addition to the publications, the plan calls for agencies to preserve "digitally formatted scientific data resulting from unclassified research" and to make that accessible too. This won't include things like lab notebooks or draft versions of the papers, but it might include databases and images that were essential to the analysis.

Who's going to pay for this archiving? Is the funding from government funding agencies going to increase to allow for it? Or is the money expected to be siphoned off from actual research spending?

Publication is the whole point of scientific research. Scientific research depends fundamentally on everyone seeing and thinking about what everyone else is doing. If there is a cost associated with publication, that cost has to be the very first line item in the research budget. But more importantly, the costs of publication are demonstrably trivial in comparison to the benefits received.

Consider, for example, what happened when the research of Mr. Max Planck regarding blackbody radiation was read by a young man with no university affiliation who simply worked as a lowly third-class technical expert at the Bern Patent Office and liked to read scientific research papers in his spare time. He spent some time thinking about Mr. Planck's research and ultimately published his own thoughts, thereby giving the world an enormously valuable gift - the discovery of what would later become very widely known as the photon. But that was only the beginning. This young man, Mr. Albert Einstein, subsequently gave the world many more enormously valuable free gifts such as special relativity and general relativity.

If this young man - working in drudgery in a lonely corner of an obscure patent office and getting paid virtually nothing for his labor - had been effectively blocked from reading Mr. Planck's research by a government paywall, the world would never have received any of these enormously valuable free gifts!

Sharing is caring. The correct government policy is to immediately release all research, Wikileaks-style, so that anyone and everyone can collect, index and store any or all of it themselves on their own websites. If the government has to pay for servers and bandwidth in order to do that, then so be it. The boost to science provided by worldwide free and immediate access to all scientific research is absolutely invaluable.

As a researcher, I both applaud and am scared by this. Forcing journals to have open access after a year is a great idea, although I'm not sure how they'll enforce it (given that not all journals are based in the US, and some might argue this is an anti-business move, given that they're in it for the money).

The thing I'm scared of is the massive bureaucratic nightmare that will descend upon us when we are required to make our data public. Where will these data be stored, and in what format? How are these rules going to be enforced exactly? It just sounds like a nightmare to police and run, even if it's a good idea in practice.

I don't think our concern in the US is non-US-based journals. The outcry here is that the US taxpayers are funding this research, and then to view the results, they have to pay a second time. So the argument is that paying taxes should be enough to view the results, which should be free.

And what do you suppose you'll do with all this information on the oxidative detoxification of aqueous bark extracts now that it will be available to you? My strong suspicion is "nothing, we just want it". Or "nothing, but I need to know exactly what sort of tinfoil is best at guarding against the government's Thought Beacons".

Me? I can't do anything with it, but it bothers me that a person who might make use of it is having difficulty seeing it. Especially since those who have easy access are exactly those who think inside the box created by modern scientific edifices, and those with the most potential to make the greatest use of that information tend to think outside the box and are at times much less likely to have access.

Quote:

The US government produces a great deal of core research on a lot of things as boring as bark extracts, and a great deal that sounds boring but which has real economic value to those people who know what to do with it. And if you're one of those people who know what to do with it, you already know who in the government you need to ask to get free and unfettered access to it, or at least you should.

Yes, if you think inside the box. But what of those who do think outside the box? Those who, for example, might want to try an aqueous bark extract as a cure for a serious disease, e.g. malaria?

Quote:

Me, I'd rather make the Chinese government go through the effort of stealing it rather than just giving it to them by making it public.

Why steal when they can buy? Anything that is that sensitive should not even be available behind present paywalls. It's called being classified.

I like this idea, both halves of it. I like reading study results, but there are a lot of problems that can be seen, and having access to metadata - and actual studies, can help.

Here's an example. A European study looked at whether living near powerful electrical distribution lines caused illness. They looked at several dozen illnesses, and found a couple that made 2-sigma (95%) confidence, and several that made 1-sigma. This was reported in the press as a disaster - these wires kill people. However, somebody let the raw data out, and it became clear that there were just as many 1-sigma and 2-sigma cases where it appeared to prevent disease. Had the metadata not been released, this wouldn't have been discovered and everybody would have been left with the wrong impression.

There are many cases where cause and effect are very unclear. Does sodium cause heart problems, or do people who eat a lot of fast food have heart trouble and also eat a lot of sodium? Putting people on low sodium diets doesn't seem to help. Does salty sushi cause the same problems as salty French fries? Metadata will help.

Separately, the new up/down voting process sort of gives me the creeps. As I was writing this, I caught myself trying to say things other Ars readers would like and avoid things they wouldn't like. I know we don't want trolls, but I must say that making things a popularity contest doesn't always lead to the most healthy discussion.

Though your point is valid, you're confusing data and metadata. Data would be something like:

Jane Doe, age 4, yes near powerline, no cancer, no kidney disease, yes autism
John Smith, age 19, no near powerline, no cancer, yes kidney disease, no autism
Paul Jones, age 44, no near powerline, yes cancer, yes kidney disease, no autism
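To make the distinction concrete, here is a minimal Python sketch, not anything taken from the policy or the study. The three records are the rows above; the `Subject` class, the `describe` function, and the particular metadata fields (record count, field names, age range) are hypothetical choices for illustration. The data is the per-person rows, while the metadata is a description of the dataset as a whole.

```python
from dataclasses import dataclass, fields


@dataclass
class Subject:
    """One row of data: the observations for a single person."""
    name: str
    age: int
    near_powerline: bool
    cancer: bool
    kidney_disease: bool
    autism: bool


# The data: the three example rows from the comment above.
records = [
    Subject("Jane Doe", 4, True, False, False, True),
    Subject("John Smith", 19, False, False, True, False),
    Subject("Paul Jones", 44, False, True, True, False),
]


def describe(rows):
    """Build hypothetical metadata: facts about the dataset, not about any person."""
    ages = [row.age for row in rows]
    return {
        "record_count": len(rows),
        "fields": [f.name for f in fields(Subject)],
        "age_range": (min(ages), max(ages)),
    }


if __name__ == "__main__":
    print(describe(records))
```

Catching the effect described in the power-line example would require the rows themselves, since the summary alone says nothing about who got sick.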

No point in getting excited over this. They still think too much of the various entertainment industries, and in this case the publishers, to actually let the data out. As soon as that group starts complaining, there will be a quick change of mind, and out will come the restrictions and the fees.

Why the 100 million dollar hurdle? Something tells me that's a loophole which effectively makes this worthless. They'll subdivide departments or find other creative ways to make sure all research stays under the $100 million mark.

And what do you suppose you'll do with all this information on the oxidative detoxification of aqueous bark extracts now that it will be available to you? My strong suspicion is "nothing, we just want it". Or "nothing, but I need to know exactly what sort of tinfoil is best at guarding against the government's Thought Beacons".

The US government produces a great deal of core research on a lot of things as boring as bark extracts, and a great deal that sounds boring but which has real economic value to those people who know what to do with it. And if you're one of those people who know what to do with it, you already know who in the government you need to ask to get free and unfettered access to it, or at least you should.

Me, I'd rather make the Chinese government go through the effort of stealing it rather than just giving it to them by making it public.

Don't be silly; the research pertaining to the thought-ray damping characteristics of tin-foil (and aluminium-foil, for that matter) and similar matters has never been hidden from public view by academic paywalls -- that stuff's classified.

This is a good thing. The government publishes a lot of information online already, and while some of it might be boring or technical (like the elasticity of beef prices), it already serves the public good. This is especially useful for college and late high school coursework. That's when you're surveying a technical field, but not doing your own work.

I hope that this directive eventually makes it into a stronger law passed by Congress. Honestly though, it will probably affect the operating procedures of agencies and the next administration won't even notice that it's there.

I'm curious to see how it will interface with private researchers who draw a significant portion of their funding from the government, but not all of it.

I predict more publications made "secret" to prevent them from being published. NSA et al have a lot to cover up.

Exactly my thought. I'd place a nickel on this being a case of hypocrisy in purity.

Except that the nice part about scientific discovery is that it doesn't really care about "secret." It is silly to think that knowledge will not come out eventually if it is based on experimentation, etc. This is a good first step. I feel like it is not revolutionary, but it is at least evolutionary. This information should be free and out there. People should be paying not for the articles or studies, but for the analysis of the articles and studies. In that way there is room for scientific journals to exist. Sometimes, you don't read a medical journal like Blood for the actual study. You may be reading it for the report on the validity of a study. Can you get that information from an abstract? Probably, yes. So, this year-long thing is a compromise. I guess it is about these scientific journals not being creative enough to really jump on the fact that they could publish a lot more online, and then charge for the analysis. I assume there is still some competition between publications for subscriptions.

It's not as much preserving the group's head start (if follow-up studies aren't well under way by the time you publish, you're doing something wrong or are exiting the field) or profit potential (profit-driven research like a drug candidate will be published after it has undergone some form of IP protection) as supporting the infrastructure that goes into the journal setup. You have to pay for the websites, the editors' time, typesetting, etc. Peer review is generally on a volunteer basis, but the costs behind it are non-trivial, and in the journal model are paid for by subscriptions and to an extent page fees. If the papers are all open access immediately, there's no impetus to subscribe to get the latest articles. PLoS et al. offset this with more substantial publication charges.

This...

The unseen work is often not factored into the overall costs. I'm all for open access, but there needs to be some type of realistic balance.

I don't think our concern in the US is non-US-based journals. The outcry here is that the US taxpayers are funding this research, and then to view the results, they have to pay a second time. So the argument is that paying taxes should be enough to view the results, which should be free.

Just for the record, quite a lot of US-taxpayer-funded research is published in international journals: Nature, for example (based in the UK). I'm not saying that this is necessarily a problem: if the National Science Foundation, DoE, DoD, etc., require that the results of their funding be made available after a year, the publishers will have to adapt their copyright agreements or else they'll lose a lot of high-impact results and the quality of their products will suffer. I would imagine most have already had to deal with the NIH requirements anyway.