Monday, March 13, 2017

The OA interviews: Philip Cohen, founder of SocArXiv

Fifteen years
after
the launch of the Budapest Open Access Initiative (BOAI) the OA revolution
has yet to achieve its objectives. It does not help that legacy publishers are
busy appropriating open access, and diluting it in ways that benefit them more than the
research community. As things stand we could end up with a half revolution.

But
could a new development help recover the situation? More specifically, can the newly
reinvigorated preprint movement gain sufficient traction, impetus, and focus to
push the revolution the OA movement began in a more desirable direction?

This
was the dominant question in my mind after doing the Q&A below with Philip Cohen, founder of the
new social sciences preprint server SocArXiv.

Preprint
servers are by no means a new phenomenon. The highly successful physics
preprint server arXiv (formally
referred to as an e-print service) was founded way back in 1991, and today it hosts
1.2 million e-prints in physics, mathematics, computer science, quantitative biology, quantitative finance and statistics. Currently around 9,000-10,000 new papers each
month are submitted to arXiv.

Yet
arXiv has tended to complement – rather than compete with – the legacy
publishing system, with the vast majority of deposited papers subsequently being
published in legacy journals. As such, it has not disrupted the status quo in
ways that are necessary if the OA movement is to achieve its objectives – a
point that has (somewhat bizarrely) at times been celebrated by open access
advocates.

In
any case, subsequent attempts to propagate the arXiv model have generally proved
unsuccessful. In 2000, for instance, Elsevier launched a chemistry
preprint server called ChemWeb, but closed it in 2003. In
2007, Nature launched Nature
Precedings,
but closed it in 2012.

Hope springs eternal

Fortunately,
hope springs eternal in academia, and new attempts to build on the success of
arXiv are regularly made. Notably, in 2013 Cold Spring Harbor Laboratory (CSHL)
launched a preprint server for the biological sciences called bioRxiv. To the joy of preprint
enthusiasts, it looks as if this may prove a long-term success. As of March 8th
2017, some 8,850 papers had been
posted, and the number of monthly submissions has grown to around 620.

Buoyed
up by bioRxiv’s success, and convinced that the widespread posting of preprints
on the open Web has great potential for improving scholarly communication, last
year life scientists launched the ASAPbio
initiative.
The initial meeting was deemed so successful that the normally acerbic PLOS co-founder Michael Eisen penned an
uncharacteristically upbeat blog post about it (here).

Has
something significant changed since Elsevier and Nature unsuccessfully sought to
monetise the arXiv model? If so, what? Perhaps the key word here is “monetise”.
We can see rising anger at the way in which legacy publishers have come to
dominate and control open access (see here, here, and here for instance), anger
that has been amplified by a dawning realisation that the entire scholarly communication
infrastructure is now in danger of being – in the words of Geoffrey Bilder – enclosed by private
interests, both by commercial publishers like Elsevier, and by for-profit
upstarts like ResearchGate and Academia.edu (see here, here and here for instance).

CSHL/bioRxiv
and arXiv are, by contrast, non-profit initiatives whose primary focus is on research, and facilitating research, not the pursuit of profit. Many feel that
this is a more worthy and appropriate mission, and so should be supported. Perhaps, therefore, what has changed is that there is a new awareness that while legacy publishers contribute
very little to the scholarly communication process, they nevertheless profit from it, and excessively at that. And for this reason they are a barrier to achieving the
objectives of the OA movement.

Reproducibility crisis

But
what is the case for making preprints freely available online? After all, the
research community has always insisted that it is far preferable (and safer) for scholars to rely on papers that have been through the peer-review process, and published
in respectable scholarly journals, in order to stay up to date in their field, not on self-deposited early
versions of papers that might or might not go on to be published.

Advocates
for open access, however, now argue that making preprints widely available enables
research to be shared with colleagues much more quickly. Moreover, they say, it enables papers to potentially be scrutinised by a much greater number of eyeballs
than with the traditional peer review system. As such, they add, the
published version of a paper is likely to be of higher quality if it has first been made available as a preprint. In addition, they say, posting
preprints allows researchers to establish priority in their discoveries and
ideas that much earlier. Finally, they argue, the widespread sharing of
preprints would benefit the world at large, since it would speed up the entire
research process and maximise the use of taxpayer money (which funds the
research process).

Many
had assumed that OA would provide these kinds of benefits. In addition to making
papers freely available, it was assumed that open access would introduce a
quicker time-to-publish process. This has not proved the case. For instance, while
the peer review “lite” model pioneered by PLOS ONE did initially lead to faster
publication times, these have subsequently begun to lengthen again.

Above
all, open access has failed to address the so-called reproducibility crisis (also
referred to as the replication crisis). By utilising a more transparent publishing process (sometimes including open peer review) it was assumed that open
access would increase the quality of published research. Unfortunately, the
introduction of pay-to-publish gold OA has undermined this, not least because
it has encouraged the emergence of so-called predatory OA
publishers
(or article brokers), who gull researchers
into paying (or sometimes researchers willingly pay) to have their papers published
in journals that wave papers past any review process.

The
reproducibility crisis is by no means confined to open access publishing (the
problem is far bigger), but it could hold out the greatest hope
for the budding preprint movement.

Why
do I say this? And what is the reproducibility crisis? Stanford Professor of
Medicine John Ioannidis neatly summarised the reproducibility crisis in 2005, when
he called his seminal paper on the topic “Why most published research findings are false”. In this and
subsequent papers Ioannidis has consistently argued that the findings of many
published papers are simply wrong.

Shocked
at Ioannidis’ findings, other researchers set about trying to size the problem
and to develop solutions. In 2011, for instance, social psychologist Brian Nosek launched the Reproducibility
Project,
whose first assignment consisted of a collaboration of 270 contributing authors
who sought to repeat 100 published experimental and correlational psychological
studies. Their conclusion: only 36.1% of the studies could be replicated, and
where they did replicate, their effects were smaller than the initial studies’
effects, seemingly confirming Ioannidis’ findings.

The
Reproducibility Project has subsequently moved on to examine the situation in cancer
biology (with similar initial
results).
Meanwhile, a survey undertaken by Nature last year would appear to confirm
that there is a serious problem.

Whatever
the cause and extent of the reproducibility crisis, Nosek’s work soon attracted
the attention of John Arnold, a former Enron trader who has committed a large
chunk of his personal fortune to funding those working to – as Wired puts it – “fix
science”. In 2013, Arnold awarded Nosek a $5.25 million grant to allow him and colleague
Jeffrey Spies to found the Center for
Open Science (COS).

COS
is a non-profit organisation based in Charlottesville, Virginia. Its mission is
to “increase openness, integrity, and reproducibility of scientific research”. To
this end, it has developed a set of tools that enable researchers to make their
work open and transparent throughout the research cycle. So they can register their
initial hypotheses, maintain a public log of all the experiments they run, and the
methods and workflows they use, and then post their data online. And the whole
process can be made open for all to review.

Open Science Framework

At
the heart of the COS project is the Open Science Framework (OSF). This, COS executive
director Brian Nosek explained to me last year, consists
of two main components – a back-end application framework and a front-end view.
“The back-end framework is an open-source, general set of tools and services
that can be used to support virtually any service supporting the research
lifecycle”, he explained, adding that the front-end is the interface through
which researchers interact with the system.

How will this help the preprint movement? If the objective is to make the entire research process
open and transparent then posting preprints is clearly an essential part of the OSF vision. And to assist in this the Open Science Framework includes a module called OSF Preprints. Any researcher
can post preprints directly into OSF
Preprints. Importantly, the service also allows “collections” to be created.
These can be collections of, say, journals, meetings, registries, or indeed preprints. And they can be community-based collections with a branded community interface. SocArXiv is one of those community interfaces.

As
COS Community Manager Matt Spitzer explained to me last year, “SocArXiv
will simply be a branded service built on a generalised OSF pre-print service.”

Elsewhere,
the Latin American online library and publishing platform SciELO has announced plans to launch
a preprint service. And for those in the humanities the Humanities
Commons
has launched CORE.

True,
CORE is described as a repository, but it caters for preprints too. Indeed, it
seems likely that we could see repositories and preprint servers start to
merge. In the Q&A below Cohen stresses that SocArXiv is not intended exclusively
for preprints and, as we shall see, he believes it is important that it should
not.

Clearly
keen to play in the preprint pond, for-profits are riding the wave too. Both PeerJ and F1000Research now offer preprint
services, although these are primarily intended to feed the pay-to-publish
services these companies offer. Likewise, OA publisher MDPI has launched preprints.org, presumably for similar reasons.

Finally,
we could note that the American Chemical Society (ACS) has announced plans to launch
a preprint server (ChemRxiv) too. This is
ironic given its response to the launch
of ChemWeb 17 years ago, but underlines how attitudes to preprints have changed.

Central Service

As
the number of preprint servers increases, however, so concern has grown that
the landscape could become overly complex, and inefficient. At ASAPbio’s third
meeting, therefore, it was proposed that a central
preprint service be created. Explaining the logic for this, ASAPbio commented “an increasing
number of intake mechanisms … may lead to confusion and difficulty in finding
preprints, heterogenous standards of ethical disclosure, duplication of effort
in creation of infrastructure, and uncertainty of long-term preservation.”

ASAPbio
has already attracted $1 million in
funding for the mooted Central Service, and since OSF Preprints could be said
to contain the seeds for creating this – in so far as it is fast becoming the
platform of choice for those setting up preprint servers and because, courtesy
of its partnership with SHARE, it is already
harvesting preprints from third-party servers (i.e. bioRxiv, arXiv, PeerJ and CogPrints) – COS is
bidding to build the
ASAPbio Central Service.

But
the million-dollar question is whether this fledgling preprint movement has the
potential to get the OA revolution back on track, and perforce reduce the degree of control publishers now have over scholarly communication. Key to this, of
course, will be whether the new services can attract sufficient papers to make
them viable, and whether they will prove financially sustainable over time. Above
all, however, their success will depend on whether they can play a meaningful role in reinventing scholarly communication for the networked world.

Here
we could note that in the Q&A below Cohen voices concern that ASAPbio envisages the Central Service as catering for preprints
alone. This, he says, could prove “a gift to the publishers, who retain their
dominance by controlling the so-called ‘version of record.’”

He adds: “There is no reason to
erect this barrier between systems, where the ‘preprints’ system only publishes
non-reviewed work, and the journals only publish reviewed work – except to
protect the revenue stream of the publishers.”

Importantly, Cohen warns, fixating
on “the idea of the ‘complete’ draft may impede innovation toward more advanced
forms of communication.”

As
we noted, arXiv has done little to disrupt the legacy publishing system. The
danger is that the new generation of preprint servers will achieve little more than
arXiv in this regard. That is, they could become no more than repositories of articles jostling for a place in traditional (or pay-to-publish OA) journals. Already we can see journal editors seeking to position them
as passive reservoirs of papers waiting to be selected for publishers’
pay-to-publish mills.

Preprint
servers have the potential to be far more than that. They should be viewed as nurseries
in which new forms of scholarly communication are experimented with and
developed. As such, they should be viewed as separate and, to a great extent,
independent of the legacy journal system. One hopeful sign here is that we have
seen the emergence of new overlay journals like Discrete
Analysis
and Quantum. Built on top
of arXiv, these tend to be scholar-led, community-owned journals created and
managed to review, highlight, and disseminate high-quality research papers, not
to monetise them. As such, they can be seen as alternatives rather than complements
to the traditional system (and its oligopolists).

Complete the revolution?

Evidently,
Cohen would like to see SocArXiv play a similar role. When I asked him if he
envisaged the service adding comment and post-publication review functionality,
or becoming a platform for new overlay journals, he replied, “[I]t’s important to point out that, as an open service, it
is possible right now for anyone to develop those functions. Any institution,
working group, department, or library could put up a list of papers,
automatically or manually generated, and host discussions on them, facilitate
peer review, and produce their own overlay journals. A big part of our outreach
job in the coming year is to get people who have the knowhow and resources to
develop such things to jump on it and bring them to fruition.”

Cohen
clearly also has his eye on a world beyond the traditional journal. Writing on the LSE blog last year he
said, “I hope that SocArXiv will enable us to save research from the journal
system.”

And
below he points out that scholarly work involves far more than journal
articles, not least data and commentary. “SocArXiv
does not require the disruption of the journal system, but if we help make that
happen, and help build a better system to replace it, I would be glad.”

The
good news is that if the preprint movement flourishes, and manages to maintain
an existence independent of traditional publishers, it has the potential to
complete the revolution the OA movement began. And if all else fails, it could seek
to cut publishers out of the loop altogether and take back ownership of
scholarly communication.

Alternatively,
of course, it may – like the OA movement more generally – end up captured and exploited
by legacy publishers, who will seek to use it in a way that props up the
outdated and inefficient model of scholarly communication that currently allows them to
make excessive profits from the public purse. Not only would this be a waste of
taxpayers’ money, but it would hobble and hold back the global research
endeavour.

The interview begins …

RP:
What is SocArXiv, who should use it, and why?

PC: SocArXiv is an open
archive of the social sciences, a free, noncommercial service for rapid sharing
of academic papers. It is built on the Open Science Framework, an open access,
open source platform that also allows researchers to upload entire projects
(e.g., data and code) and link them to research results.

Anyone who does research in the
social sciences should consider using it. Because SocArXiv is a not-for-profit
alternative, researchers can be assured that they are sharing their research in
an environment where access, inclusivity, and preservation, rather than profit,
will remain at the heart of the mission.

All this is in contrast to the
for-profit companies that want to monetize your research, including
Academia.edu, ResearchGate, the Elsevier products Mendeley and SSRN, and Google
Scholar. They may or may not provide people with something useful – access,
storage, social networking, metrics – but they exist to make money for their
investors, and that’s not our mission.

RP:
How is the service managed and by whom?

PC: SocArXiv is administratively housed at the University of
Maryland, under my direction, with a steering committee of sociologists and
academic librarians. That means that our grant money is administered by UMD,
and we receive tax-deductible contributions through the university’s foundation.

In our operations we are a partner
of the nonprofit Center for Open Science (COS), which built and operates the
archive. As a member community of the COS Preprints service,
we participate in their Advisory Group, which consults on questions of
governance and technology.

RP:
As you indicated, the SocArXiv steering committee is heavy
on sociologists. Does that tell us anything beyond the fact that you are a
sociologist and so presumably reached out to your colleagues in the first
instance?

PC: That’s correct. Being a small operation, it helped to start
with people in one discipline as a way to organize our discussion of needs and
desires – what we want, and how can we make it happen.

Of course, our needs and desires are
very similar to those of people in other disciplines, but it helped to think
locally. The system is open to all disciplines – anyone who wants their work to
appear under the words “social science” (we have a number of papers, for
example, from anthropology, geography, and urban planning). It’s also important
that the sociologists on the committee include experts in such subjects as the
sociology of knowledge, organizations, social movements, and higher education.

Beyond our researchers, by working
with leaders of the academic library community as well, we are developing the
project on a foundation of good preservation, access, and public service – and
lots of experience managing information projects. Additionally, as we gain
institutional supporters we are including them on a consultative advisory
board.

RP:
Are social scientists more or less likely to embrace open access and preprint
servers than other disciplines? What are the discipline-specific issues here,
and are there any disincentives for social scientists to use a service like
SocArXiv?

PC: I can’t generalize to social science in general, but some
patterns are clear. For example, economists are used to reading important work
online before it’s peer reviewed, and they have high-status outlets for working
papers that are recognized outside of academia – as when major news
organizations report on NBER Working Papers.

Sociologists, on the other hand,
expect to hear about interesting research first at a conference – where they
will see slides but not have access to a paper – and then wait months (or
years) to read it in a peer reviewed journal. I use that example purposefully,
because it also correlates with the massive disparity in social and political
influence between economics and sociology.

RP:
How receptive are social science journals to accepting papers that have been on
a preprint server? Is there an issue here?

PC: I don’t know of any major social science journal that will
not accept papers that have been posted in a public repository. The American Sociological Association, for example, although it has a bad track record of operating for-profit journals and discouraging open
access, explicitly permits publication in all of its journals of papers that have been
posted in non-peer-reviewed repositories.

RP:
We last spoke in July 2016. What has changed since then, and is the service proving more or
less popular/successful than you anticipated?

PC: We have made great strides since our soft launch last
summer as the first community in the OSF Preprints service. In December COS
launched a more fully featured web interface for uploading and discovering
papers, and several other communities have started up (in agriculture,
psychology, engineering, and research transparency). All the papers from these
services become part of the same open system.

As COS is leading on the technology,
we have been concentrating on the scholar and community side. We have received grants
of $50,000 each from the Alfred P. Sloan Foundation
and the Open Society Foundations,
and contributions of $10,000 each from two libraries
(UCLA and MIT). At the University of Maryland, we have received support from the Department of Sociology,
the College of Behavioral and Social Sciences, and the University Libraries. We
are using this money for outreach and development, to build the user base and
expand the community, and to bring people together to work on next steps.

To that end, this year we will hold
a symposium called O3S: Open Scholarship for the Social Sciences, on the UMD
campus October 26-27. We hope it will be the first in a series of conferences, and
we will feature panels showcasing open scholarship, research on open
scholarship, and a workshop on the future of SocArXiv. With keynote addresses
by COS co-founder and CTO Jeffrey Spies and sociologist Tressie McMillan Cottom, we think this is going to be a great event. And we will have some funding to
bring junior scholars to the symposium. (The call for papers and more information is
available on our blog site, SocOpen.org.)

Meanwhile, new people are posting
papers every day. At this writing we recently passed 800 papers, posting at a
rate of several per day. March looks great, starting off at double the rate of
the previous two months. Of course, this is an infinitesimal fraction of the
social science coming out. I had naively thought we would grow faster.

The users remain concentrated among
people who use Twitter and people who are motivated
to move their papers over from the corporate paper sites. So there is lots of
room for growth, and outreach is the watchword.

New scholarly communication system

RP:
Preprint servers seem to be enjoying a new lease of life, particularly in the
wake of the launch of bioRxiv and the ASAPbio initiative. Most recently, we have seen announcements for new preprint
servers from SciELO and ECS.
Do you see SocArXiv as part of a new movement? If so, how would you
characterise the nature and the goals of this movement?

PC: Preprints are a good workaround for our highly
dysfunctional journal publishing system. With preprints you can get your work
out in a timely way, to actual readers, while preserving your ability to
publish in regular journals for prestige and promotion. Lots of credit to the
big idea from arXiv.org, which started this for math and physics decades ago.
They have preserved their journal system while enhancing the efficacy and
efficiency of their research.

This is what we want to do for the
social sciences in the near term, while participating in the broader
interdisciplinary movement to build a new scholarly communication system over
time.

RP:
We have also seen a recent call for a central preprint service.
Some have expressed doubts about this. For instance, quantum physicist Michael Nielsen commented “it creates an effective monopoly, which tends to suppress innovation”. On the
other hand, the institutional repository movement has demonstrated that
creating an effective distributed system faces its own kind of challenges. What
are your views on the need for a central service, and the pros and cons of
central vs. distributed services? Would a central service be competitive with
subject-specific preprint servers like SocArXiv in your view, or complementary?

PC: I have positive and negative responses to the central
preprint service. On the positive side, I reject the fear that a central
service will be a monopoly and suppress innovation. This shows a fundamental
misunderstanding of open systems. If they are really open, they can’t be
monopolies, because they present no obstacles to entry or innovation. You can’t
start a petroleum or journal publishing company today because Exxon or Elsevier
will crush you in the marketplace – you need to take sales away from them to
succeed, and they will sell what you are selling for less, preventing you from
getting started. Truly open scholarship is not like that. Anyone can distribute
the information however they want without taking it away from anyone else.

Of course there is competition in
open scholarship – for attention, for grant money, for legitimacy – but it is
not like actual market competition because the products are free and unlimited
copies. The idea that ASAPbio or COS is dominant like Exxon or Elsevier is
dominant is just very naïve about the power of global capitalism.

Seriously, Elsevier is making
billions of dollars off a premodern publishing system that no one in their
right mind would have designed this way half a century ago. That’s suppressing
innovation. COS is the size of a thumb drive to them; it could be a thousand
times bigger without posing the threat to innovation that they do. On the
contrary, beyond their own innovation, open platforms like the OSF encourage
innovation by others because anyone can build integrations and applications on
top of them.

And that brings me to my negative
response. ASAPbio intends the Central Service to include only preprints, which
they define as, “Complete and public drafts of scientific documents, yet to be
certified by peer review.” I believe this definition – which preserves the
journal article as the unit of scholarly output – is limiting in two ways.

First, by insisting that preprints
are not yet peer reviewed, it is a gift to the publishers, who retain their
dominance by controlling the so-called “version of record.” There is no reason
to erect this barrier between systems, where the “preprints” system only
publishes non-reviewed work, and the journals only publish reviewed work –
except to protect the revenue stream of the publishers.

Second, the idea of the “complete”
draft may impede innovation toward more advanced forms of communication. Of
course that is how most researchers in the journal disciplines work today, but
a more innovative future is within our grasp.

In real life, today, scholarly work
includes registrations, code, data, comments, and reviews themselves – but we
usually only count published papers. Work does not stop when a draft is
“complete.” Just yesterday I had the very common, frustrating experience of
flipping back and forth between two papers by the same research team, produced
in series, with the second building only very slightly off the first. The team
was spinning out small bits of “complete” research in rapid succession, to
publish them as quickly as possible – and maximize the lines on their CVs.

If scholarly communication were
allowed to break out of the journal article mode, they could simply have rolled
out sequential analyses along a research path. The peer review system that
accompanies such innovation would be more efficient and – if it were conducted
according to open scholarship principles – more informative and engaging, with
reviews of different components of the research ideally provided as context to
readers and researchers alike as the project evolves.

This is just one scenario, used to
illustrate the possibilities for genuine innovation outside the relatively
ancient and hidebound paper system. Post-publication review may turn out to be
great, and I’m worried that a narrow definition of preprints will hinder that
potential development.

For what it is worth, although we
are on a system called “OSF Preprints,” SocArXiv invites people to post working
papers (drafts in progress), preprints (things to be published), and postprints
(things already published elsewhere), as long as the author has the right to
distribute them. We see no reason to impose limits to one or another of these
categories.

Clearly, the norms and practices
associated with emerging scholarly communication systems are yet to be
established. We want to develop new ideas while also allowing people to get
jobs, get promoted, and use peer review to maintain standards of quality – all
at higher speed and reduced cost – and we think we’re off to a great start at
doing that.

One final point on the Central
Service: I’m excited by the proposal
from the Center for Open Science, in response to the Request For Applications.
In addition to an exemplary model of community governance, great technology,
and a demonstrated commitment to open science principles in so many ways, COS
offers the prospect of a preprint system that ties in to a wider set of tools
and materials, which – while meeting the requirements of the RFA – might allow
the system that evolves to be less constraining than I’m afraid it might
otherwise be. I don’t know who the other contenders are, but I’d love to see
COS build it.

RP:
As SocArXiv will be using the Center for Open Science platform it will be
linked into SHARE. What does SHARE bring to the party? Presumably its function
is as a discovery service only, since its currency is metadata rather than full-text,
right? The OAI-PMH
harvesting protocol that the IR movement developed was based on metadata, but
has not really been that successful. What are your thoughts on these matters?

PC: What SHARE brings to SocArXiv is the same thing it brings to all of the 150 data sources
it currently aggregates, from the giant arXiv and PubMed Central to smaller
individual institutional repositories and SocArXiv. SHARE is not designed to be
a discovery platform in and of itself; it harvests, normalizes, and then
distributes a dataset of research events, which include the posting of
preprints.

Through SHARE, the Association of Research Libraries and COS provide public infrastructure for disseminating
metadata for any purpose. SHARE provides great opportunities for SocArXiv,
allowing people to create custom research streams, institutional reports,
discovery tools, and anything else you can do with research metadata.

As a rudimentary example, I myself (knowing next to nothing
about such things) built a Twitter feed for SocArXiv papers using SHARE (@socarxivpapers), which I described on our blog.

Someone who knew what they were doing could do a lot more,
and we’re excited to make that possible. (I am not dodging the question of
OAI-PMH, it’s just beyond my expertise to comment on that.)

RP:
You mentioned data and software code earlier. SocArXiv acts as a repository for
these too?

PC: Yes. SocArXiv and the other services on OSF Preprints run
on the Open Science Framework.
Preprints may be nested within projects on that platform, and include any
research materials.

This is a very powerful and flexible platform, which
includes storage, researcher collaboration tools, versioning, analytics,
variable public access settings, and the ability to mint DOIs. This is a great
benefit of working with COS, which is providing this application framework as a
free public good.

Copyright

RP: Last July, The Scholarly Kitchen gave you a hard time over
whether uploads to SocArXiv are vetted, and suggested that without
moderation the service will have a problem with regard to copyright
infringement. What is the current situation, and who is responsible if a paper
uploaded to the service infringes someone’s copyright? Likewise, how are
nonsense, off-topic and inappropriate papers filtered out (are they)?

PC: Our mission is to provide access, not to police copyright.
All SocArXiv users agree to the COS terms of use, which, in accordance with the Digital Millennium Copyright Act, offer a means of complaining if anyone thinks something
has been posted in violation of their copyright.

To my knowledge we have yet to receive such a complaint.
Maybe Scholarly Kitchen thinks
everyone has a moral obligation to play the role of copyright police. This is
not our job. Although we will of course comply with the law, as noted, we’re
not raising and spending money and recruiting volunteers to devote to the
prevention of minor copyright infractions.

In my experience, most authors have no idea what’s in the
ridiculous contracts they sign, and they often veer between exaggerated
paranoia and reckless egalitarianism when it comes to sharing their work.

Often, we get the worst of both worlds. For example, I
learned from your tweet of a new (paywalled) study finding that 40% of papers on ResearchGate were in violation of publisher
copyrights. This is a case when researchers are stealing their own work from
Elsevier (and others) and then giving it to ResearchGate to sell, for which the
researcher receives nothing. Congratulations, academic freedom! As I wrote about Sci-Hub, “if your entire enterprise can be brought down by the
insertion of 11 characters into a URL, your system may in fact not be
sustainable.”

On the question of moderation and
quality control, at present papers are not vetted before they are posted. We
manually take down the very few things that are obviously inappropriate. This
works when you’re taking in a few papers a day, but obviously we will need a
more robust moderation system as the service grows, including clear guidelines
and a routine plagiarism check.

It is our hope that we can persuade
researchers to reallocate some of the time they currently donate as reviewers
in the service of monopolistic for-profit companies to our public-good project,
and volunteer to work as moderators (as arXiv has done). COS is currently
developing the moderation dashboard we will need to carry this out.

That said, I personally think it
would be good for us to get beyond the fear of having our work contaminated by
the proximity to work of lesser quality (or elevated by the esteemed
contributions of others, for that matter). It is different when people discover
books by browsing shelves; in that case it’s a shame to have bad books getting
in your way. But with a free digital archive the downside to accepting bad work
is not so great.

We expect people will mostly find
specific research on SocArXiv through, for example, published citations, the
recommendations of colleagues, through aggregations created by subject experts,
from institutional lists, conference programs, and social media.

We also hope to provide tools such
as lists of most-read, most-cited, most-favourably reviewed, and so on (or
these may be developed by third parties). Most mathematicians don’t read raw
feeds from arXiv, and we don’t think that’s how people will use SocArXiv
either.

I think we will be able to surface
great work without requiring all submissions to be of high quality, with all
the energy and expense that would entail. We encourage people to brag not about
the existence of their paper on SocArXiv, but rather about its value.

RP: Where does SocArXiv fit with the
larger agenda that I think you refer to as “open scholarship”? Where does open
begin and end so far as research in social sciences is concerned?

PC: To clarify, when we say “open scholarship,” we are aligning with the open science
movement, but including those who don’t consider their work to be “science.”
The open approach responds to many of the problems we face in the research
community today, including the long-run issues in academia generally and the
current crisis associated with the Trump presidency.

The SocArXiv steering committee just posted a statement in response to the planned March for Science,
titled “Social Science without Walls,” which summarizes our view on this
question. In it we argue that SocArXiv will help us realize our collective
goals of making our work better, more efficient, more relevant, and less
hierarchical.

The social science without walls
made possible by open source, open access research infrastructure, we wrote,
“allows us to make the best use of our resources, improve the process and
products of our work, bring it to more people faster, and dissolve the
obstacles to interaction that plague our industry.”

From the research process itself
through dissemination of results and – crucially, today especially – engagement
with wider publics, open scholarship is foundational to our vision of social
science.

RP: I believe you are of the view
that the research community needs to take back control of scholarly publishing.
What does that mean in practice? Does it mean, for instance, you believe
traditional publishers no longer have a valid role in scholarly communication?
And how does SocArXiv facilitate the process of taking back control?

PC: Most of what commercial journal publishers do, academics actually do. We research,
write, review, edit, and promote our work – and commercial publishers organize
that labour, partly to our benefit and the public’s benefit but largely to
their own. Some of what they do is outside of our expertise, including editing
and producing publications, but those functions are secondary. And a lot of
what they do is only necessary to serve the needs of the system they rely on,
such as marketing and policing copyrights and devising means of keeping content
from reaching readers.

An open access scholarly publishing
system could do more, faster, better, and vastly cheaper, without most of what
commercial publishers do. SocArXiv does not require the disruption of the
journal system, but if we help make that happen, and help build a better system
to replace it, I would be glad.

Funding and the future

RP: You mentioned that SocArXiv has
already attracted some funding. Can you say more about funding and how it can be
assured over time? How successful have you been to date in your funding
efforts? Can you envisage the service ever offering paid-for services in order
to be financially sustainable? If so, what kind of services, and whom would you
expect to be billed?

PC: The operation of the archive is funded by COS at present. The grant money and institutional
contributions we have so far are going to design and outreach and governance
efforts. I hope we will be able to continue building the system with money from
foundations such as those that support us now, as we develop a model of
sustainability that derives support from the voluntary contributions of
academic institutions and research funders.

I have been inspired by arXiv’s model (and they have graciously consulted
with us, in addition to letting us riff off their name), and I hope that we can
follow in their footsteps on sustainability as well.

We are committed to offering a free
service for researchers and readers, and open access indefinitely. We might in
principle offer ancillary services to institutions for a fee, but we have as
yet no such plans. (Note to institutional readers: if you are currently paying
SSRN thousands of dollars per year for a paper series or a list of papers,
contact us!)

RP: To what extent is the SocArXiv project
focused on advocacy as much as service provision? More generally, is there a
danger that the preprint movement might end up chasing after buzzwords and
trends, rather than sparking fundamental change in scholarly communication
(which I think has been a tendency within the OA movement)?

PC: We have to do some of both – advocacy and service provision
– but ultimately I hope our service will be our advocacy. I can write polemics
all day long (and I often do), but in the absence of a working open archive
they won’t mean that much.

Participation in the archive is not
conditional on some political or social movement affiliation. At our most
ambitious we do want to shift the ground on which social science is built, but
that’s going to require offering something new and professionally rewarding
beyond a cutting critique of Elsevier.

RP: What future plans are there for SocArXiv? I have seen mention of a comment
function, post-publication review, and overlay journals. Are these all on the
table? What other features/functionality do you anticipate offering in the
future?

PC: Those are all potentially important features, although not
as important as a smoothly operating basic archive, with transparent
governance, shared norms, and community support – so I’m not rushing.

However, it’s important to point out
that, as an open service, it is possible right now for anyone to develop those
functions. Any institution, working group, department, or library could put up
a list of papers, automatically or manually generated, and host discussions on
them, facilitate peer review, and produce their own overlay journals. A big
part of our outreach job in the coming year is to get people who have the
know-how and resources to develop such things to jump on it and bring them to fruition.

I especially want to encourage
people who are already in the business of aggregating papers – such as
conferences and paper competitions – to use the system. Anyone running a paper
competition could require the papers be posted on SocArXiv, where they could be
juried as they are made public.

Similarly, conference submissions
could be done through the archive, with papers tagged according to their panel
sessions or subject areas. These are simple examples of how we could do work we
are already doing but in an open way, using the tools SocArXiv already has made
available to move toward an open scholarship culture.

RP: What are the primary obstacles today to achieving the changes you would like to
see to scholarly communication, how can they be overcome, and what long-term
opportunities does the open agenda offer the research community?

PC: You may have meant practical obstacles, but all these words
later I’m inclined toward a more philosophical answer. To my mind, our biggest
obstacles are institutional inertia and risk aversion.

No reasonable person would design an
academic publishing system like this if we were building it today. When I was a
grad student in the early 1990s, before the web, we had to physically be in the
library to read the journals (I did not subscribe to any). Now that we have the
capacity to provide them to anyone anywhere at a fraction of the cost, are they
any more accessible?

The great innovations in journal
publishing technology in the last quarter century seem to have gone to building
and maintaining elaborate paywall and authentication systems, and legal
protocols to enforce them – and more is spent keeping people out than bringing
people in.

The American Sociological
Association, in my own discipline, still allocates “pages” to journal editors
according to the cost of printing and shipping paper, setting an arbitrary
limit to how many “top” articles may exist. Fearing a future in which “the
journal world may not be as profitable in the future as it is now,” the ASA has responded by working on inventing new paywall journals.

Inertia is normal for social
institutions, of course, but journal publishing seems to have more than most.
I’m sure this comes from the slow turnover of generations in academia, and from
the constricting job market that compels professors to squeeze harder to make
students in their own image, out of fear of joint failure.

That’s probably also why they fight
so hard to maintain our arbitrary prestige and ranking system, which bestows
success or failure on scholars before anyone beyond a tiny committee of
reviewers has laid eyes on their actual work, much less assessed its impact.

There is also big money at stake.
But it’s not just executives and managers of the multinational conglomerates
that sit atop the system, it’s also the conferences and receptions and awards
(and tote bags) they dole out, for which the vast majority of faculty
continuously scrap.

We could do so much better for so
much less money.

But there are risks. We have to be
willing to try new things, to step out from under the current system. We have
to evaluate people not based on the pedigree of their journal publications but
on the quality of their work. We have to reward career pathways that differ
from the ones that got us where we are.

Some attempts will fail. But if
we’re guided by sound principles, focus on what’s important, and play to our
strengths – doing the things we do well and contracting for the things we don’t
– the rewards will be greater down the road. And that’s what the open agenda
offers.