Aaron Swartz and Open Access

Just time for a very brief comment about the tragic death, apparently by his own hand, of Aaron Swartz on Friday. For those of you who haven’t followed the story, or perhaps don’t even know who he was, Aaron Swartz was an “internet activist” and leading champion of the open data movement. He was a young man, only 26 when he died, who was prepared to fight for a cause he truly believed in. And to die for it.

Aaron Swartz was being prosecuted for alleged illegal downloads of scientific papers from the JSTOR system so he could make them available to the public. If convicted he would have faced a sentence of up to 35 years in prison.

Whether his prosecution was according to the letter of the law is a question I’ll leave for others to discuss. I’ll just say that it’s profoundly objectionable that the papers in JSTOR are behind a paywall in the first place. It is just another example of how the academic publishing industry now actively stifles the free communication of scientific ideas and results that it purports to facilitate.

Aaron Swartz was a controversial character, but I know I’m not alone in thinking that his prosecution was at the least heavy-handed and at the worst downright vindictive. Academics have been using the hashtag #PDFtribute on Twitter to pay tribute to his courage and to follow his example by posting their own research publicly free of charge.

Astronomers have been making their results available in this way for years, through the arXiv. We have also been paying through the nose for subscriptions to journals that do little more than duplicate the arXiv submission, at a cost so prohibitive that the public can’t access them. In future we’re supposed to pay huge fees up front to academic publishing houses, to duplicate the arXiv in a different but equally pointless way. Pointless, that is, from any perspective other than their own profits.

As regular readers of this blog will know, I’ve suggested a way to bypass traditional journals and achieve a form of publication that is both open to all and run at a minimal cost to authors. That will be going on-line in the not-too-distant future. One thing remaining to be resolved is the name for the new system. I still haven’t decided on that, but at least I now know to whose name it will be dedicated.

Because publication in certain journals is regarded on CVs as proof of classy work. There is some justification for that view too, but hopefully Peter’s journal or similar will also rapidly gain a reputation for quality and we can dispense with academic journal publishers, who were once symbiotic but are now parasitic – and hugely expensive.

Oh, things are much worse in biology. We don’t have arXiv, or a credible equivalent. So the only thing to do is publish in a journal — which means running the peer-review gauntlet and multiple rejections if you “aim high”. As a result, most work is several years out of date by the time it’s published.

So my point wasn’t “oh, you silly astronomers should be more like us cool biologists”, but “as we silly biologists strive to reach a situation more like the one you cool astronomers enjoy, how can we fully understand how things work in your saner field?”

While arXiv itself doesn’t have peer review, in practice all serious stuff there is at least submitted to reputable journals, so arXiv provides an alternative distribution mechanism which is essentially independent of the question of peer review.

Suppose you have a job vacancy to fill, and there are 100 applicants, each with 50 papers. This is completely realistic. You are not going to read 5000 papers. So, the fact that papers appear in reputable journals is a rough indicator of some minimum quality. Yes, there might be alternatives, but even if they exist, it is not the decision of those applying for jobs (or for those who have jobs who are applying for grants) to change the system.

Peter’s idea can work only if sufficient quality control is applied. Since arXiv has been around for a while, no one has to submit to a journal to make his work available. For similar reasons, people reading on the arXiv don’t read all papers, but have to filter them somehow and, especially if one doesn’t know the author, acceptance by a reputable journal can be a useful criterion.

OK, so the purpose of journals in astronomy is purely to Keep Score. I can imagine that. But it leaves the question of why anyone bothers actually subscribing to the journals, when all the actual science is right there on arXiv.

The point is simply that things don’t have to be refereed to go on the arXiv, but they do to get in a journal. My plan is basically a service to referee arXiv submissions and place a quality mark on those that pass muster. This makes journals entirely redundant. I think they are anyway, actually, because good papers get cited whether they are refereed or not.

First, not all journals are funded exclusively by subscriptions. Some are funded by professional societies or by the research budget of countries. Second, there might be some pressure to limit submissions to subscribers. And, obviously, if the journal ceases to exist because no-one subscribes, then it can’t provide a stamp of approval.

“My plan is basically a service to referee arXiv submissions and place a quality mark on those that pass muster.”

Yes, and I think it has a chance of success by going that route.

“I think they are anyway, actually, because good papers get cited whether they are refereed or not.”

Do you know of any non-refereed paper which a) has been cited more than a few times and b) is good? Apart from Grigori Perelman’s proof of the Poincare conjecture, none springs to (my) mind. (Considering that he rejected both the Fields Medal and the Clay Millennium Prize, I don’t think he was worried that not publishing in a reputable journal might hamper his future financial situation, as most of us mortals would be.)

Also, what about stuff like Sjur Refsdal’s gravitational-lens papers in the 1960s? They weren’t cited very often until gravitational lenses were actually observed, and Sjur was later nominated for a Nobel Prize at least a couple of times (but never won and, since he is dead, never will) after the field he essentially single-handedly founded took off. However, in the early 1970s, anyone awarding a job based on number of citations would probably have picked someone else. But “published in MNRAS” conveys a stamp of approval and Sjur got a professorship in 1970. He worked mainly on stars for 10 years until observations caught up with theory and he went back to gravitational lenses.

Yes, citations might have some correlation to quality for “high-impact” bandwagon stuff, but such a metric will leave the blue-sky researchers behind. Also, consider that the timescale for a paper to pick up citations is about the same as the timescale for non-permanent jobs. As such, if citations alone count, even good people will have left the field before collecting enough to get them a job.

Just thought I’d point out a good and non-refereed arXiv-submission-only paper that has gained more than a few citations – the Cordes and Lazio paper describing their model of the Galactic free electron distribution currently has 599 citations. To be fair, this is the only very highly cited article I know of, but given my limited depth of knowledge I’m sure there must be others. Within my field of gravitational waves the LIGO Scientific Collaboration/Virgo Collaboration have started placing some more technical papers as arXiv only.

It occurs to me that if the old ‘pay-for-access’ business model had continued, journals would have been out of business sooner or later once more disciplines started switching to arXiv-like distribution models, and they could persuade university administrators to stop subscribing to expensive journals.

The switch to ‘pay-to-publish’ has really been a life-line to the journals.

“It occurs to me that if the old ‘pay-for-access’ business model had continued, journals would have been out of business sooner or later once more disciplines started switching to arXiv-like distribution models”

Actually, no. In astrophysics, essentially everything has been put on arXiv for at least the last dozen years. Again, the elephant in the room is the fact that journals, however much one hates them, do provide a stamp of approval which is otherwise not available. Peter’s plan to replace this with his own stamp is a good idea. It is not about distribution; we already have that. (And the discussion shouldn’t be contaminated by bringing peer review into it.)

“The switch to ‘pay-to-publish’ has really been a life-line to the journals.”

Indeed. Anyone in elected office who agreed to this boondoggle should not be re-elected to said office, or should be ousted for gross misconduct.

It’s only slowly become apparent to me what I should have spotted from the off: the hostility towards author-pays open access on this blog is because you’re all astronomers, and so you’ve grown used to the excellent free service of arXiv.

Over here in biology, where we have nothing like arXiv, author-pays OA is a big step forward, compared with the Nothing Is Freely Available status quo.

That’s certainly not to say that the Finch Report’s account of Gold OA was anything close to what it ought to have been. In particular, its cost estimate of £1500-2000 for a typical APC is grossly inflated, as I have argued in detail. (I came up with a figure of £283, which is less than one sixth of what Finch suggests. That is likely to fall further as economies of scale increasingly kick in.)

I think you underestimate the enormous cultural shift that arXiv has slowly but surely induced in astronomy and the other fields that it covers. Building an arXiv for biology would not be technically very demanding. But getting old-school biologists to use it, or cite material in it, would be a mammoth task.

(For the same reason, although most universities have institutional repositories, they tend to be very forlorn. Little content, poor usability, ambiguous terms, no interoperability.)

So credit to arXiv for providing the infrastructure that’s needed here. The problem of course is cultural. As you’ll see from the comments on the linked article, many or most palaeontologists don’t seem to consider this citable. I don’t know how we’re going to go about changing that.

With author-pays OA, journals have a conflict of interest in that they earn more money the more articles they publish, so there is an incentive to publish sub-standard stuff. The more inflated the fee is, the more they earn per bad article.

Also, this might lead to some sort of flat fee being paid by research councils or whatever rather than by authors themselves. You can count on this being inflated even more and more difficult to break out of than the present system.

The fear that journals supported by APCs will blindly accept sub-standard material is an old one. I don’t think it has teeth, though. There have always been good journals and bad journals — we can all point to terrible subscription-based journals. What we do is: not send our stuff to those journals, and not pay much attention to what they publish. Same applies in the Gold-OA world.

So we have PLOS Biology, which is highly respected; and PLOS ONE, which at least in my field is now a very important journal (though I hear that may be different in other fields); and various BMC journals and lots of well-respected singletons. And we also have a barrel-sweeping of crud that no-one pays attention to.

It’s worth taking a step back to gain perspective. Historically, journals provided two services – quality control and dissemination. The cost of journal publication, whether author-pays or via subscription, has gradually become exorbitant. The internet is the game-changer: it renders the dissemination service redundant, and at the same time provides academics with a tool to take back control of publication and save a lot of money. The only question is how to organise ourselves online so as to provide quality control. Debate about that is secondary to the main issue, which is that the internet is available.

Perhaps there is also a moral issue. We don’t like having to give up copyright on our own writings, do refereeing for free, then pay huge sums of money to get behind a paywall to read our own research which in most cases has been publicly funded.

“As you’ll see from the comments on the linked article, many or most palaeontologists don’t seem to consider this citable. I don’t know how we’re going to go about changing that.”

While it’s not true that anyone can post to arXiv, the hurdle is lower than for acceptance by a reputable journal. And the main reason that the stuff at arXiv is OK is that it is also submitted to reputable journals. Thus, a paper which is only on arXiv and nowhere else is uncitable, not because it can’t be found but because there is no “stamp of quality”. Hence Peter’s idea to use arXiv as a distribution mechanism (which is already the case in astronomy) but provide the missing seal of approval.

So, put your journal papers on arXiv as well, so that people can read them, not so that people can cite them. You can put the journal reference there, too, so that it can be cited by people who have read only the arXiv version. That’s the first step. The second is to replace the journal’s seal of approval with another one.

I have to say I am disappointed to find that even in astronomy, only journals are “real”, and arXiv is only an easier way to get hold of journals’ articles. I’d thought (or maybe hoped) that arXiv, which is manifestly superior in every practical sense to the journal system, was the Real Thing in astronomy, the dog that wagged the journal tail.

Probably the only reason it is not is the lack of a seal of approval. Yes, some journal articles are bad or even crackpot, and occasionally there is the arXiv-only submission which is very good, but these are exceptions. Thus Peter’s idea has a real chance to take off. I think it is expecting too much for people without a permanent job to be the first ones to publish in the new journal; some big names (including, but not limited to, those on the board of directors) should get the ball rolling.

The fact is that arXiv-only articles, at least good ones, are rare. Even people submitting good articles only to the arXiv won’t change things, because no-one reads all articles and one needs an “accepted for publication in” filter. Peter’s idea will provide that filter.

I’m waiting to see a) who’s on the board and b) when the first article by each board member will appear in the new journal (and nowhere else, except arXiv, of course).

Leaving aside the question of whether open access is good (most here probably agree that it is) and whether the potential punishment here was excessive, at least compared to, say, that for killing civilians in third-world countries (most here probably agree that it was), the question remains whether Swartz’s actions helped or hurt his cause. Yes, some laws are wrong and should be changed, but this must happen through some democratic process. The alternative is that anyone who disapproves of any law is allowed to break it, in which case one needs no laws at all and society would no longer function. After all, a law is necessary only when not everyone agrees whether it is correct, otherwise there would be no need for one. I certainly don’t want to live in a society where whoever drums up the most active support (even though this might reflect only a small fraction of the population) decides what happens. Among other things, it’s not fair to those who actually have to work to stay alive and at the same time allow society to function at all. One shouldn’t be distracted by the fact that his goal might have been desirable, otherwise one is forced into the position “activism is good only if it supports my goals”.

Allowed by some “higher law”. If you want to break a law as a matter of conscience, and are prepared to suffer the consequences, fine, but the idea that society would still function if everyone broke every law they wanted to is wrong. Of course, there are times when having a society disappear would be preferable to it continuing, but that is not the case here.

Yes, most people approve of most laws, but that can’t be a defense (“Your Honour, let me off the hook; most people approve of most laws, so I shouldn’t be punished for breaking just one law.”)

Yes, sometimes people do things they don’t intend to, or disagree with, in hot blood, but that does not let one off the hook. In any case, Swartz’s activities were clearly premeditated.

I agree that this is a real problem. Of course, whatever the law says, someone has a vested interest in it. The question is whether the majority of the electorate would have preferred a different law. The only good solution here is a plebiscite, the result of which must be binding. The number of signatures to initiate it should be realistic, and “more yes than no” should be required to pass.

Whether or not the majority of the electorate, as opposed to the majority of internet activists, agree with Swartz is another question.

I’m not saying that this is the case here, but it is often claimed that laws don’t represent the will of the electorate but such a claim is not evidence. In some cases, there have been plebiscites which resulted in an outcome quite different than what activists expected. Actually, this is not surprising: few people take to the streets to defend the status quo. Thus, there are usually more people demonstrating for change than against it. Unless one actually has an absolute majority active in such a demonstration (usually, the number is orders of magnitude smaller), it really doesn’t mean much. Even 30% of the population for something doesn’t mean it should be implemented if 70%, or even 31%, think it shouldn’t be. (General caveat: I think that majority rule should apply only in cases where a decision has to be made. Many things should be allowed on principle even if the majority doesn’t like them, in particular things which don’t affect the majority anyway. But that is not the case here.)

I don’t know how I missed all the posts about the Open Journal of Astrophysics, but I’m tremendously excited about it! I’ve grown so used to hearing people complain about the current science publishing model that it’s a bit of a shock to see someone actually doing something about it. ;) Seriously, though, I’m amazed that you guys have been able to put this together in such a short time, and I hope the launch goes smoothly!

p.s. Since the system still hasn’t been named, I can’t resist putting in another suggestion. Since you’re building a machine that will take in a flood of mixed-quality arXiv papers on one end and let out a trickle of high-quality papers on the other, why not call it “the AstroFilter”?