Monday, January 21, 2008

Peter Murray-Rust is a committed advocate of Open Access (OA). He is, however, a disappointed one. He is disappointed not because so few researchers are willing to self-archive their scholarly papers on the Web, not because it is proving so hard to persuade funders and research institutions to introduce Open Access mandates, but because of a failing he sees within the movement itself. Out of his disappointment, however, has come a new movement: the Open Data movement.

As a Reader in molecular informatics at the University of Cambridge, Murray-Rust is interested in scholarly papers less for their textual content than for the raw data contained within them: the graphs and tables, the molecular structures, the spectral and crystallography data, the photographs of proteins, and all the other factual information that litters science papers.

As such, much of Murray-Rust's time is spent not on reading the scholarly literature, but mining it — using various software tools to automatically extract the "embedded data" contained in the tables, the charts, and the images in science papers, and capturing the "supplemental information" that invariably accompanies the papers. After aggregating all these data Murray-Rust will compare them, input them into programs, use them to create predictive models, and reuse them for a variety of different purposes.

In short, Murray-Rust is working at the frontline of what has been dubbed Science 2.0, an online interactive environment where a great deal of the information used is more likely to have been discovered, aggregated and distributed by software and machines than it is by humans; an environment where data are constantly used and reused — pumped through new tools like RSS feeds, and displayed in mashups, wikis, and the various other tools developing around Open Notebook Science.

Murray-Rust's ultimate goal is to create and exploit what he calls the chemical semantic web — a web that would assume most scientific information was unencumbered by proprietary interests, and able to be freely shared and exchanged.

In practice, however, mining the scholarly literature remains a difficult and risky activity, explains Murray-Rust — not so much because the technology is still in its infancy, but because scholarly publishers routinely appropriate the content of research papers, and then lock it up behind financial firewalls and prohibit its reuse.

Assuming that the Open Access movement was committed to removing these barriers, Murray-Rust became an OA advocate. After all, as leading OA advocate Peter Suber puts it, Open Access implies scholarly literature that is "digital, online, free of charge, and free of most copyright and licensing restrictions". That, says Murray-Rust, is what is needed to build the semantic web.

But while the definition of Open Access agreed at the launch of the 2001 Budapest Open Access Initiative (BOAI) states that any paper made Open Access must be free of copyright and licensing restrictions, Murray-Rust discovered that in most cases publishers and authors still fail to provide the necessary permissions when making papers Open Access. Where a paper is flagged as being Open Access, reuse is often prohibited. And even where there is no specific prohibition, usage conditions are frequently not specified, effectively placing the paper into licensing limbo.

In many cases, says Murray-Rust, Open Access publishers don't even articulate to themselves under what conditions they are making their papers available on the Web, let alone provide an appropriate licence. As a result, third parties cannot know what usage is permitted. And where publishers do think it through, and attach a licence, the usage conditions are in any case often non-conformant with the BOAI definition.

The legal status of papers that researchers themselves self-archive on the Web, or in their institutional repositories, is equally uncertain, and sometimes reuse is expressly forbidden.

What frustrates him, says Murray-Rust, is that this confusion could have been avoided had the Open Access movement emulated the Open Source Initiative (OSI) and developed customised OA licences. And having done so, he adds, the movement (again like the OSI) could have policed the use of the term Open Access, and publicised and sanctioned publishers who fail to use the licences, or who make false claims about Open Access. It should also have better educated researchers about licensing.

Further limiting what he can do, adds Murray-Rust, traditional subscription publishers like the American Chemical Society and Wiley explicitly forbid text mining of papers they publish. At the same time these publishers insist that authors not only sign over the copyright in the paper, but also ownership of the supplemental data, despite the fact that factual data are not subject to copyright.

After failing to persuade Open Access advocates to hear his concerns, Murray-Rust began to direct his energies to what he calls the Open Data movement, for which he is now a leading advocate. While he remains an advocate for OA, he explains, he has come to believe that the issue of Open Data needs to be addressed separately. For where the Open Access movement is concerned only with ensuring that scholarly papers are human readable, the Open Data movement requires that they are also machine readable. And since Open Data implies reuse, it is vital that licences are provided that specifically permit this.

Fortunately, Science Commons stepped into the breach, and is proving a valuable ally, not least by developing the Open Data protocol and the recently-announced Public Domain Dedication & Licence (PDDL) — thereby providing the first component of the legal framework that Murray-Rust believes is needed to enable text mining, and helping in the creation of the chemical semantic web.

I had been keen to speak with Murray-Rust for some time, so I was pleased recently to be able to hook up with him on the telephone. I found his ebullient style, rapid delivery, and quick-fire mind both challenging and fascinating. Above all, the conversation offered me an interesting new perspective on Open Access, and confirmed suspicions I have long harboured that the Open Access movement would truly benefit from having an official body to represent its interests.

Murray-Rust is a vivid and rumbustious person who does not pull his punches. When I emailed the draft text of the interview to him, however, he asked that I stress the positive rather than the negative in this introduction. "Yes, I am angry, but not completely," he wrote. "I believe in the power of the bottom-up to change things and I am optimistic that we shall get change."

He also asked me to underline his appreciation for all that the Open Access movement has achieved, and requested I append this paragraph: "Although this interview highlights some of the shortcomings of Open Access movement I want to pay tribute to the many activists who have devoted and often courageously worked to make scholarly knowledge free for everyone. I'd particularly like to say something very appreciative about Peter Suber, and I'd like also to mention the Scholarly Publishing & Academic Resources Coalition (SPARC) and the Wellcome Trust — who in my opinion have probably been the largest force for change recently."

####

If you wish to read this interview in its entirety please click on the link below. I am publishing it under a Creative Commons licence, so you are free to copy and distribute it as you wish, so long as you credit me as the author, do not alter or transform the text, and do not use it for any commercial purpose.

If after reading it you feel it is well done you might like to consider making a small contribution to my PayPal account. I have in mind a figure of $8, but whatever anyone felt inspired to contribute would be fine by me. Payment can be made quite simply by quoting the e-mail account: richard.poynder@btinternet.com. It is not necessary to have a PayPal account to make a payment.

What I would ask is that if you point anyone else to the article then you consider directing them to this post, rather than directly to the PDF file itself.

Thursday, January 10, 2008

Last month it was announced that President Bush had signed the long-awaited omnibus spending bill that, amongst other things, will require the US National Institutes of Health (NIH) to mandate Open Access to all the research it funds. While a few have expressed dissatisfaction with some of the details of the mandate, the news has been widely greeted as a major victory for the Open Access movement in the US, and one that came only after a long struggle.

In Europe, meanwhile, the news was decidedly disappointing: it finally became clear that over-cautious European politicians and bureaucrats had chosen not to act in the interests of science, and would not be pushing for Open Access.

The disappointment was all the greater given the enthusiastic way in which the research community had responded to a petition that Open Access advocates had organised earlier in the year urging the EC to act on the recommendations of its own report, and mandate all EU-funded researchers to make their papers freely available on the Internet. With the petition attracting 18,500 signatures in a matter of weeks, it was universally assumed that a mandate was inevitable. It turned out, however, that aggressive lobbying by self-serving publishers had persuaded EC officials to drop the mandate.

As project manager for the petition, Open Access advocate Dr Alma Swan was personally involved in events. When I learned that she was passing through Oxford, therefore, I tracked her down in Oxford's famous Randolph Hotel. Sitting in the (to me) somewhat incongruous surroundings of the Randolph's plush tea rooms, I asked Swan what had gone wrong, and where it leaves the Open Access movement in Europe.

Far from being fazed by developments, however, Swan was as confident as ever. "One thing that those who oppose Open Access must understand is that we are not going to give up," she assured me. "Moreover, we are going to be more tenacious than the people who oppose us."

Besides, she added, the battle isn't going to be won in the corridors of power, but in the meeting rooms and the labs of research institutions. Here, she assured me, the omens are good — as awareness continues to grow that Open Access isn't just a trendy buzz word, or even an end in itself, but the enabler for a much larger revolution; a revolution, moreover, that universities will find it increasingly difficult to resist.

Swan's quiet confidence is also hard to resist. What makes her arguments particularly compelling is that Swan is not an over-earnest ideologue, but a generous-spirited and witty woman with an infectious, and somewhat subversive, sense of fun.

Nor is she obsessed with demonising publishers. After all, she points out, in resisting Open Access they are only doing what businesses are expected to do in capitalist democracies — seeking to maximise their profits. But she adds that their hypocrisy is nevertheless depressing. While making public statements claiming to support the principle of Open Access, she says, publishers are constantly engaged in behind-the-scenes attempts to derail it.

In characteristic Swan style, when pushed to take a jibe at publishers, she ends our meeting with a humorous anecdote, remarking that one prominent member of the publisher lobby group STM has developed a sneaky habit of stealing the jokes from her presentations. With a mischievous twinkle in her eye she says, "I've told him that whenever he's in the audience when I'm presenting, it'll be a strait-laced show."

####

If you wish to read the interview with Alma Swan please click on the link below. I am publishing it under a Creative Commons licence, so you are free to copy and distribute it as you wish, so long as you credit me as the author, do not alter or transform the text, and do not use it for any commercial purpose.

If after reading it you feel it is well done you might like to consider making a small contribution to my PayPal account. I have in mind a figure of $8, but whatever anyone felt inspired to contribute would be fine by me.