Friday, January 08, 2010

Is Gold Open Access growing? If so, how quickly? More importantly, how accurately is the Open Access (OA) movement able to measure its growth?

At the end of last year I posted a list of what I believed were the main OA developments in 2009, which I prefaced with some commentary. In that commentary I argued that the rate of progress of both Gold OA and Green OA had accelerated during 2009.

To support my statement about Gold OA I said that over the year 100 more journals had been added to the Directory of Open Access Journals (DOAJ).

Gavin Baker quite rightly challenged my claim about Gold OA, pointing out that in reality "fewer journals were added to DOAJ in 2009 than in 2008 — about 100 fewer, it looks to be."

More detailed figures were later published by Peter Suber in the January SPARC Open Access Newsletter. In 2009, said Suber, DOAJ "added 723 peer-reviewed OA journals, representing 19% growth over the previous year. Last year it grew by 812 journals or 27%. In 2008, it added 2.2 titles per day, but in 2009 the rate was closer to 1.99 titles per day. It now lists a total of 4,535 peer-reviewed OA journals."
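As a rough check, Suber's percentages and daily rates can be recomputed from the three numbers he gives. The sketch below is only illustrative: the variable names are mine, and it assumes the 723 and 812 figures are net additions, so that earlier year-end totals can be recovered by simple subtraction.

```python
# Figures quoted from Suber's newsletter; variable names are illustrative.
added_2009 = 723
added_2008 = 812
total_end_2009 = 4535

# Assumption: the additions are net of removals, so earlier totals
# can be recovered by subtraction.
total_end_2008 = total_end_2009 - added_2009
total_end_2007 = total_end_2008 - added_2008

growth_2009 = added_2009 / total_end_2008 * 100
growth_2008 = added_2008 / total_end_2007 * 100

per_day_2009 = added_2009 / 365
per_day_2008 = added_2008 / 366  # 2008 was a leap year

print(f"2008: +{added_2008} journals, {growth_2008:.0f}% growth, {per_day_2008:.2f} per day")
print(f"2009: +{added_2009} journals, {growth_2009:.0f}% growth, {per_day_2009:.2f} per day")
print(f"Difference in additions: {added_2008 - added_2009}")
```

On these assumptions the percentages come out at 27% and 19%, and the gap between the two years is 89 journals, which squares with Baker's "about 100 fewer".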

In confirming Baker's claim that 2009 had seen a fall in the growth of Gold OA, Suber added that while the DOAJ grew significantly last year, "it grew more slowly in 2009 than in 2008. So far we don't know why. There could have been fewer new launches, which the recession makes likely."

As such, Baker appears to have been right to challenge my assertion about Gold OA. It would have been more accurate, it seems, to have said that Gold OA continued to grow in 2009, rather than that growth had accelerated.

In his newsletter, however, Suber did offer another possible explanation for the decrease in the number of journals added to DOAJ in 2009. It could be, he said, that "some number of new launches might have gone unnoticed or still be working their way through the DOAJ's indexing backlog."

In any case, Baker's challenge prompted me to give some thought to how the OA movement measures the growth of Gold OA. As a result I have had to conclude that I made a more fundamental error than the one Baker charged me with. For it turns out that seeking to measure the growth of Gold OA by counting the number of journals in the DOAJ is a deeply flawed approach.

For instance, as Suber suggests — and publishing consultant Alma Swan confirms — the DOAJ registry (in Swan's words) "merely reflects the pace at which the Lund people add them."

In other words, the team that manages DOAJ (which is based at Lund University Libraries) may be struggling to keep pace with the growth of new journals. As Swan puts it, "100 journals fewer could just mean that someone in the Lund team had swine flu for two months and didn't do as much work as last year".

Of course, some of the journals not yet listed may fail to meet the criteria for inclusion in the DOAJ, but for several of them at least this seems an unlikely explanation.

Huge backlog

When I contacted the Lund team they confirmed that they have been struggling to keep up, not least because over the past twelve months they have had fewer people assigned to adding journals to the database. There is therefore, I was told, currently a "huge backlog" of journals waiting to be added.

It is perhaps no surprise then that fewer journals were added to the database in 2009. Nor does the situation look set to improve in the near future: Staff shortages aside, the task of adding new journals to the DOAJ is a time-consuming process, one that generally requires a series of interchanges with the editor or owner of the journal.

Moreover, keeping the directory current requires constant weeding, a process that saw around 100 journals removed from the DOAJ during 2009. The larger the database grows, the more time-consuming it will surely be to manage.

All of which would appear to suggest that the number of Gold OA journals is growing at a faster rate than is evident if one simply counts those listed in DOAJ.

On the other hand, there are reasons for arguing that counting the number of journals added to the DOAJ might actually overstate, rather than understate, growth. How come? Because the date a journal is added to the DOAJ is not necessarily the date that it comes into existence.

As Swan also points out, "Just because a Gold OA journal was added in 2009 does not mean that it is a new OA journal. They may be new ones, but not necessarily so. They may be pre-existing Gold OAJs, or pre-existing journals that turn Gold."

Swan adds: "When I helped Sally Morris work through DOAJ years ago, checking when journals were launched and how much they published, there was a huge variation, and some of them were relatively old journals, mostly humanities ones that had always published for free from university departments or research groups; the fact that DOAJ catalogued them in, say, 2005, means nothing if they were actually launched in 1989."

But whether it understates or overstates growth, counting the number of journals added to the DOAJ would appear to be a highly inaccurate way of trying to measure the growth of Gold OA.

An alternative approach

So how do we measure it? Bo-Christer Björk, who is based at the Hanken School of Economics in Helsinki, Finland, has been doing some work on this. He agrees with Swan's second point: "A lot of journals have joined DOAJ much later than they were started, so the growth might be overstated."

Consequently Björk has adopted an alternative approach. Instead of using DOAJ he has been using Ulrich's Periodicals Directory for his IT-barometer study. As he explains: "We undertook a search for peer reviewed, scholarly journals, with a particular start year, and then searched for all journals versus Open Access journals."

Below is a graph Björk created using that data, which plots the growth of new OA journals as a percentage of all new journals.

The graph suggests that the percentage of new journals created as OA journals in 2007 was 32%. In terms of growth, says Björk, "The share has been quite stable since 2000."

However, while Björk was able to avoid the time-lag issues associated with using DOAJ, he experienced the same problem with regard to journals converting to OA. As he puts it, "It is very difficult to trace the development of Gold OA backwards. The information is as accurate as the background data in Ulrich's. It also doesn't tell you when a journal may have become OA, so some of these may have swapped model."

Nevertheless, he adds, "Since the rate of renewal of the stock of existing journals is around 2% per annum this indicates that Gold OA might be increasing its overall share by perhaps 0.5% per annum."
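One way to read Björk's back-of-envelope estimate is as a simple turnover calculation: each year a small slice of the journal stock is replaced, and the OA share shifts by the difference between the OA share of the incoming journals and that of the outgoing ones. The sketch below is my own reconstruction, not Björk's method. The function name is hypothetical, and the OA share of journals leaving the stock is an assumed parameter that the text does not supply.

```python
def annual_share_gain(renewal_rate, oa_share_new, oa_share_retired):
    """Change in Gold OA's share of all journals over one year.

    renewal_rate:      fraction of the journal stock replaced each year
    oa_share_new:      OA fraction among newly launched journals
    oa_share_retired:  OA fraction among journals that cease (assumed)
    """
    return renewal_rate * (oa_share_new - oa_share_retired)

# Upper bound: no OA journals among those that cease publication.
upper = annual_share_gain(0.02, 0.32, 0.0)

# Illustrative only: an assumed 7% OA share among ceasing journals
# happens to reproduce Björk's "perhaps 0.5%".
illustrative = annual_share_gain(0.02, 0.32, 0.07)

print(f"Upper bound: {upper * 100:.2f} percentage points per year")
print(f"Illustrative: {illustrative * 100:.2f} percentage points per year")
```

With a 2% renewal rate and 32% of new journals being OA, the gain cannot exceed about 0.64 percentage points a year, so "perhaps 0.5%" is plausible as a net figure.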

While not directly comparable with Suber's figures, this gives a somewhat different picture of the growth of Gold OA.

Unsatisfactory methodology

In fact, adds Björk, if one looks at article numbers rather than journal numbers, growth appears even more modest. As he puts it: "Calculated over the number of articles the increase is likely to be even lower [than 0.5%], since the yearly article volume of newly-founded journals is low (with the exception of the PLoS journals)."

As such, Björk's work highlights a further problem: Not only does counting the number of journals listed in the DOAJ produce a very inaccurate picture of the growth of Gold OA, it is also unlikely to reveal Gold OA's true share of the market, since the number of papers published in OA journals tends to be lower than in subscription journals. As Swan puts it, "Some Gold journals publish one article per century; others publish hundreds per year."

Explains Swan: "It is easier to launch a non-OA journal in terms of gathering articles, because there is no charge to the authors for publishing in a non-OA journal." By contrast, she adds, "OA journals may struggle a bit for authors, especially before they have built a reputation."

Moreover, the difference between OA journals themselves can be quite stark. Contrast, for instance, a typical OA journal publishing a handful of articles a year with PLoS ONE.

For a number of reasons, not least its multidisciplinary approach, PLoS ONE has experienced explosive growth since its launch on 20th December 2006. During the few days that remained of 2006 PLoS ONE published 138 papers. In 2007 it published 1,231 articles, and in 2008 it published 2,726, an increase of about 120% on the previous year.

In 2009, says PLoS ONE publications manager Rebecca Walton, PLoS ONE published 4,400 papers (an increase of roughly 61% on 2008), taking the total since launch to 8,495.

Anticipating that PLoS ONE will publish 6,600 papers in 2010 (a further 50% increase), Walton expects that this year it will become the largest scholarly journal in the world by volume of published research articles. She explains: "We believe we are the third largest scholarly journal in the world, behind Physical Review B and Applied Physics Letters, and in 2010 we will become the largest."
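The year-on-year growth can be recomputed directly from the article counts quoted above. This is a minimal sketch using only those figures; the 2010 value is Walton's projection rather than an actual count, and the helper name is my own.

```python
# Article counts per year as given in the text; 2010 is a projection.
articles = {2006: 138, 2007: 1231, 2008: 2726, 2009: 4400, 2010: 6600}

def pct_increase(previous, current):
    """Percentage increase from one year's output to the next."""
    return (current - previous) / previous * 100

years = sorted(articles)
growth = {
    year: pct_increase(articles[prev], articles[year])
    for prev, year in zip(years, years[1:])
}

# Running total from launch to the end of 2009 (excludes the projection).
total_to_2009 = sum(n for year, n in articles.items() if year <= 2009)

# 2007 is skipped: 2006 covered only the last few days of the year.
for year in (2008, 2009, 2010):
    print(f"{year}: {growth[year]:.0f}% up on {year - 1}")
print("Total published, launch to end of 2009:", total_to_2009)
```

The percentages are still large, but each is smaller than the one before, which is what one would expect as the base grows.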

As such, PLoS ONE could now be publishing the same number of papers as 100 regular OA journals.

The truth of the matter, concludes Swan, is that — whatever source one uses — trying to measure the growth of Gold OA by counting journals "is an unsatisfactory methodology".

With the growth of Hybrid OA (where an author can continue to publish in a subscription journal but opt to pay to have individual papers made OA) the pointlessness of counting Gold OA journals becomes even more pronounced. In such an environment, the only meaningful way of sizing Gold OA is to count articles, not the number of Gold journals.

In short, the OA movement really needs to abandon its fixation on counting journals and concentrate on papers. "The best way to measure growth would be by number of Gold articles published each year," says Swan.

However, she adds: "No-one is monitoring that. Bo-Christer Björk's people have a methodology for assessing the proportion of all the literature that is OA, but I don't know if they differentiate between Green and Gold that way."
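To make the difference between counting journals and counting articles concrete, here is a minimal sketch. The records are entirely hypothetical (invented journal names and article counts, chosen only to illustrate the effect), and the function name is my own.

```python
# Hypothetical records: (journal, year, is_gold_oa, articles_published).
# These numbers are invented for illustration; they are not real data.
records = [
    ("Gold Journal A", 2009, True, 10),
    ("Gold Journal B", 2009, True, 15),
    ("Gold Journal C", 2009, True, 25),
    ("Subscription Journal D", 2009, False, 400),
    ("Subscription Journal E", 2009, False, 550),
]

def gold_shares(records, year):
    """Return Gold OA's share by journal count and by article count."""
    rows = [r for r in records if r[1] == year]
    gold = [r for r in rows if r[2]]
    journal_share = len(gold) / len(rows)
    article_share = sum(r[3] for r in gold) / sum(r[3] for r in rows)
    return journal_share, article_share

by_journal, by_article = gold_shares(records, 2009)
print(f"Gold share of journals: {by_journal:.0%}")
print(f"Gold share of articles: {by_article:.0%}")
```

In this made-up sample Gold accounts for 60% of the journals but only 5% of the articles. A journal-level flag like this also cannot capture Hybrid OA, where the OA status would need to be recorded per article.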

Considerable frustration

Actually, says Björk, he does differentiate between Green and Gold OA, and his figures suggest that the overall percentage of articles published as Gold OA in 2008 was 9.3% (with around 11.8% available as Green OA, giving an overall OA percentage of around 21%).

Björk's figures, we should note, also include papers published as Hybrid OA (24% of his Gold OA total).
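Putting Björk's percentages together gives a rough breakdown of the 2008 literature. The sketch assumes, as the text's own total does, that the Gold and Green figures do not overlap; the variable names are illustrative.

```python
# Shares of all 2008 articles, in percent, as quoted from Björk.
gold_share = 9.3
green_share = 11.8
hybrid_fraction_of_gold = 0.24  # Hybrid OA as a fraction of the Gold total

# Assumption: Gold and Green are counted without overlap.
overall_oa = gold_share + green_share
hybrid_share = gold_share * hybrid_fraction_of_gold
full_gold_share = gold_share - hybrid_share

print(f"Overall OA: {overall_oa:.1f}%")
print(f"Hybrid OA: {hybrid_share:.1f}% of all articles")
print(f"Articles in fully Gold journals: {full_gold_share:.1f}% of all articles")
```

On these figures Hybrid OA accounts for a little over 2% of all articles, leaving about 7% published in fully Gold journals.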

"The method we have is based on googling with a sample of journal articles," he explains, adding that a similar method has been used in a very thorough study by a group of Japanese researchers in the biomedical field. "Their results are slightly higher than ours and we have yet to analyse from where the difference stems."

(The Japanese study, published in January last year, estimated that, at the time of the study, 27% of articles published in biomedicine were available on an OA basis.)

Nevertheless, since Björk's main focus is on taking a snapshot of the current situation, rather than measuring the growth of Gold OA, his figures don't take us much further forward if our aim is to track growth over time.

So we are left with the question we began with: Is Gold OA growing? If so, how quickly, and is that growth accelerating? "I have no evidence to show any acceleration in growth," says Björk. "On the contrary it seems that growth has been relatively stable, after a short expansive period when BioMed Central and PLoS were founded."

In short, we simply don't know how fast Gold OA is growing, or even if it is growing. We assume that it is, but no one seems able to demonstrate this in a convincing way.

All in all, while Baker appears to have been right to challenge my claim that the growth of Gold OA accelerated in 2009, he should perhaps have gone on to point out that counting Gold journals is not the way to track the growth of OA anyway.

Meanwhile, for some OA advocates the absence of reliable data here is a source of considerable frustration. Leo Waaijers, manager of the SURF Platform ICT and Research, for instance, would dearly like to see some accurate figures on the growth of Gold OA.

When I asked Waaijers if he knew whether this information was available he replied: "I have been thinking about this and looking around over the last few days but I could not find any reliable source. All in all it is a strange thought: in a period of information overload we seem not to be able to produce a more or less accurate record of an important development in scientific and scholarly communication."

There is scope here, it would seem, for someone to provide a useful service!