Fixing authorship – towards a practical model of contributorship

Introduction

As we near the completion of the metamorphosis of scholarly publishing from a paper-based medium to one based entirely on the Internet, there is an increasing need to enrich the environment with a connected network, unfettered by the legacy of putting ink onto paper. One of the more recent areas to come under consideration is authorship, and how its issues and concepts can be represented in a wholly digital world. For legal and copyright reasons, the concept of ‘an author’ of a scholarly work is likely to persist for some time. However, the idea that a simple list of authors is the optimum way of recording scholarly achievement has reached the end of its shelf life. It’s time to move on.

Anyone who is connected with scholarly publishing knows that a variety of tasks are covered and obscured by the term “authorship”, and that there are vital research tasks that are not considered to be worthy of the term. Moreover, there are many grey areas: for example, ‘guest’ authorship - where author lists include the names of people who have had little or no impact on the research work - and ‘ghost’ authorship - where legitimate authors do not appear on the author list for reasons of expediency or politics.

Clearly, there cannot be just one resolution for authorship-related problems. However, the study of contributorship - and the development of a standard infrastructure to support more nuanced relationships between researcher and published output - promises to solve the logistical issues, and to illuminate those that have an ethical basis. A prominent example of work in this area is the recent International Workshop on Contributorship and Scholarly Attribution (IWCSA), in which we participated and which recently published its results (1).

Authorship broken, needs fixing

Current definitions of authorship only cover a very limited series of relationships that a person can have with a published article. Typical author lists tend to only include authors and/or editors, with other contributions and relationships being inconsistently indicated via text in an acknowledgements section.

This binary approach - essentially a relic from the print age - to recognizing contributions to a published scholarly work has many flaws. The Harvard Workshop recognized nine specific issues which are listed in Table 1:

| Problem identified by Workshop | Resolution approach |
| --- | --- |
| Varied authorship conventions across disciplines | - |
| Increasing number of authors on articles | - |
| Inadequate definitions of authorship | - |
| Inability to identify individual contributions | - |
| Damaging effect of authorship disputes | - |
| Current metrics are inadequate to capture and include new forms of scholarship and effort | |

Table 1: Problems identified by the Workshop

Many readers will be familiar with some or even all of these issues as authors or editors. Here we want to highlight and elaborate on what we consider the most prominent ones:

Varied authorship conventions across disciplines

It often comes as a surprise to find that different disciplines vary in the significance of author order and role. Take, for example, the diverse ways in which the same author order of a fictional paper written by Smith, Taylor and Thorisson might be interpreted depending on discipline:

| Discipline | How the author order is interpreted |
| --- | --- |
| High Energy Physics | Author list is in alphabetical order; no precedence can be inferred. Names may include engineers as well as researchers. |
| Economics, some fields within Social Sciences | Author list is in alphabetical order; no precedence can be inferred. |
| Life Sciences | Smith the postdoc did most of the experimental work, but Thorisson was the principal investigator who led the scientific direction of the work. The alphabetical order is coincidental. |
| ‘Standard’ order | Smith is the senior researcher who did most of the work. Taylor is subordinate to Smith, and Thorisson is subordinate to Taylor. The alphabetical order is coincidental. |

Table 2: Varied authorship conventions across disciplines

Increasing number of authors on articles

High Energy Physics (HEP) is well-known for long author lists on research papers, with over 3,000 authors credited in recent extreme cases. This is in part because of the complexity and scale of HEP research, but also because HEP publications tend to give equal weighting to researchers and engineers alike. Clearly, the traditional model of the author as the writer of the work is not being applied in this discipline (2).

Equally, having 1000+ authors on a single paper presents novel logistical problems of managing a non-trivial amount of publication metadata - merely getting all the names and affiliations correct is a significant challenge. In fields other than HEP, there is also a clear trend towards an increased number of authors per published paper. For example, the Wellcome Trust reports that the number of authors on its genetics papers rose from around 10 to nearly 29 between 2004 and 2010. Furthermore, many standard ways of assessing scholarly impact will share the value amongst the authors in an entirely arbitrary manner. This leads to the so-called “dilution effect”, whereby even a well-cited paper makes little or no contribution to the metrics for individual authors because credit is “diluted” across the large number of authors.
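The dilution effect can be made concrete with a small calculation. The sketch below assumes equal fractional counting, in which a paper's citations are divided evenly among its authors. This is only one possible sharing scheme, chosen here for illustration and not one the article names, and the function name and citation counts are hypothetical.

```python
def fractional_credit(citations: int, n_authors: int) -> float:
    """Per-author share of a paper's citations under equal fractional counting.

    Assumption: credit is split evenly; real metrics may weight authors
    differently or not split credit at all.
    """
    if n_authors < 1:
        raise ValueError("a paper needs at least one author")
    return citations / n_authors


# A hypothetical paper with 300 citations.
small_team = fractional_credit(300, 3)     # 100.0 citations per author
hep_scale = fractional_credit(300, 3000)   # about 0.1 citations per author
```

Under this assumed scheme the same well-cited paper is worth a thousand times less to each author of the 3,000-author version, which is the kind of dilution described above.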

Inadequate definitions of authorship

There is no universal definition of what is meant by research authorship: the closest that exists is a set of rules drawn up by the International Committee of Medical Journal Editors (ICMJE) (3). These rules have been adapted and used by a number of journals over the last several years, although even the ICMJE itself recognizes that they are outdated (Christine Laine, Editor of Annals of Internal Medicine, reported at IWCSA).

Inability to identify individual contributions

With any multi-author work, the individuals listed as authors will have contributed to different tasks, but traditional author lists do not allow credit to be recorded below the level of “author”. Many journals now allow (or even require) contributorship statements at the end of the article, but these are rarely in any kind of standardized form that can be processed in an automated fashion to inform calculation of impact, expertise or standing. This lack of granularity can lead to the case where a senior researcher who has had little or no influence on a paper can be credited with “proper” authorship, whereas a computer programmer who made a significant contribution via the construction of key algorithms is perhaps not credited at all.
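To illustrate what automated processing of a standardized contributorship statement could look like, here is a minimal sketch. It assumes a statement has already been captured as a list of (person, role) pairs. That format, the role labels and the function names are all hypothetical, and are not an existing journal or ORCID schema.

```python
from collections import defaultdict

# Hypothetical structured statement for the fictional paper by
# Smith, Taylor and Thorisson; the role labels are illustrative only.
STATEMENT = [
    ("Smith", "experimental work"),
    ("Smith", "writing"),
    ("Taylor", "algorithm creation"),
    ("Thorisson", "principal investigator"),
    ("Thorisson", "writing"),
]


def roles_by_person(statement):
    """Group the declared roles under each contributor, in statement order."""
    grouped = defaultdict(list)
    for person, role in statement:
        grouped[person].append(role)
    return dict(grouped)


def people_by_role(statement):
    """Group contributors under each declared role, in statement order."""
    grouped = defaultdict(list)
    for person, role in statement:
        grouped[role].append(person)
    return dict(grouped)
```

With the statement in this form, the programmer's contribution is recorded explicitly (`roles_by_person(STATEMENT)["Taylor"]` gives `["algorithm creation"]`) instead of being lost in a free-text acknowledgement.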

Damaging effect of authorship disputes

The lack of clarity of authorship claims and credit has led to a growth in authorship disputes and a number of scandals. A detailed and standardized method of declaring contributions is likely to put an end to all but the most egregious of such disputes.

The problems revealed by an analysis of author / article relationships fall into two broad categories: logistical (in other words, technical) and ethical. However, these are not conveniently discrete categories: an inability to precisely define the relationship leads to a position whereby a research team is obliged to force classification upon its members. Given that authorship is the principal means of recognizing academic achievement, this is not without weight.

Contributorship

We hope that one of the major outcomes of this field of work will be an evidence-based system of classifying relationships between researcher and a published work. Moreover, we hope that this taxonomy will facilitate codification of relationships that go beyond traditional authorship, thus removing the difficult decisions that can arise when compiling an author list. For example, by explicitly allowing “data collection” or “algorithm creation” as a type of contribution, it would be possible to formally attribute credit to members of the team whom a strict adherence to authorship conventions (such as they are) would likely ignore, whilst not conflating the precise nature of the researcher’s contribution with intellectual leadership. In the same vein, specifying “Head of research team” or “Principal investigator” would facilitate distinguishing a senior member’s relationship with the work from those who also made intellectual contributions (see Textbox).
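One way to picture such a taxonomy is as a small hierarchy in which every specific role rolls up to the top-level statement “contributor”. The sketch below is an assumption made for illustration: the intermediate groupings and the function names are hypothetical, and only the labels “data collection”, “algorithm creation” and “principal investigator” come from the examples above.

```python
# Hypothetical taxonomy: each role maps to its broader parent role.
# "contributor" is the root and corresponds to presence in an author list.
PARENT = {
    "contributor": None,
    "intellectual contribution": "contributor",
    "data interpretation": "intellectual contribution",
    "technical contribution": "contributor",
    "data collection": "technical contribution",
    "algorithm creation": "technical contribution",
    "leadership": "contributor",
    "principal investigator": "leadership",
}


def ancestors(role):
    """Return the broader roles above `role`, nearest first."""
    chain = []
    current = PARENT[role]
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def is_a(role, broader):
    """True if `role` is `broader` itself or falls somewhere beneath it."""
    return role == broader or broader in ancestors(role)
```

In this sketch `is_a("principal investigator", "contributor")` holds while `is_a("principal investigator", "intellectual contribution")` does not, so a senior member's relationship with the work stays distinct from an intellectual contribution, yet every role can still be reduced to the plain claim of being a contributor.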

Clearly, the answer to this problem goes beyond the creation of a standard - there needs to be an infrastructure for storing these complex relationships, tools to create them and maintain them, and ways of displaying them. Most importantly, the benefits of fully recording these relationships must outweigh (and be seen to outweigh) the cost of the additional complexity and work required (i.e. beyond what is currently the norm).

Software can certainly help in this effort (although the idea of determining who-did-what with a list of 1000+ researchers is overwhelming!) and there have been some very good examples of simple, spreadsheet-based tools in recent proof-of-principle projects. However, the task of apportioning responsibilities (and rewards) can start earlier - perhaps within research tools such as Mendeley.

Help is coming

Many of the issues highlighted above are being tackled by a diverse community of agencies and approaches, many of which came together for the IWCSA workshop. Here we want to highlight a particularly important one: the Open Researcher & Contributor ID initiative (ORCID: http://about.orcid.org). Launched in mid-October 2012, the registry service operated by ORCID enables researchers to create a public identity and obtain a persistent personal identifier, and to maintain a centralized record of their scholarly activities (4), (5).

Whilst the basic idea of an online “author profile” is not unique or innovative in itself, several key attributes differentiate the new service from the myriad free and commercial services in this space. First, it is backed by a non-profit, community-based organization with participation from commercial publishers, academic institutions, research libraries, funding agencies and many others. Second, major stakeholders in the ORCID community are committed to building software applications and platforms that will build on and integrate with the central ORCID service for automatically linking scholars and their published works.

At the time of writing, the ORCID service is limited in functionality and is experiencing some early growing pains, but it is improving over time with the strong support of the community. Despite these initial teething troubles, several integrations built by ORCID’s launch partners are already operational and more will come online in the next several months.

So what is ORCID’s relevance to the attribution challenges outlined above? Although the first-generation service is functionally limited, the core system has been built to support future developments and definitions that go beyond basic author or editor roles. These can potentially include richer contributorship statements such as the examples already given above. It follows that ORCID can serve as a central index or discovery hub in which to look up not merely the base contributor-work relationship, but also the nuances of that relationship if more detailed information is available.

Conclusions

Definitions are softening: in the new world of online digital publishing, “articles” are more than words on paper, metrics are more than citation counts, usage is more than subscriptions - and authors are more than just writers. The concept of authorship is rooted in our culture and in our minds, and that principle will not go away. But the idea of contributorship offers a richer set of definitions that enable our contributions to human knowledge to be recorded more precisely, if only we are willing to embrace it, and if the tools and infrastructure are developed that allow us to capture this information whilst not increasing administrative burden.

Conflict of interest statement

The authors have both been active contributors to ORCID in the past three years. As of October 2012, one of them (GAT) is employed by ORCID part time to work on the EU-funded ODIN project (http://odin-project.eu).

Contributorship statement

The authors contributed equally to the drafting of this article.

About the authors

Gudmundur ‘Mummi’ Thorisson is an academic and consultant interested in scientific communication, in particular as this relates to open access to and use/reuse of research data in the life sciences. He has been involved in various projects relating to identity & unique identifiers in research and scholarly communication, most recently the ORCID initiative. Through his previous work in the GEN2PHEN project (http://www.gen2phen.org) he has also contributed to several database projects in the biomedical research domain, notably GWAS Central (http://www.gwascentral.org).

Gudmundur holds a PhD from the University of Leicester in the United Kingdom and worked there as a post-doctoral researcher after graduating in 2010. He currently works part time for ORCID on the ODIN project (http://odin-project.eu), whilst also working in a research support role at the Institute of Life and Environmental Sciences (http://luvs.hi.is), University of Iceland, Reykjavik where he is now based.

Mike Taylor is a research specialist in Elsevier Labs and the newest member of the Research Trends Editorial Board. His current areas of work include altmetrics, contributorship, research networks, the future of scholarly communications and other identity issues. He has worked in various capacities within the ORCID initiative. Previous to joining Elsevier Labs, Mike worked in various technology and publishing groups within Elsevier.

7 Responses

Authorship is a fascinating topic and others have attempted, over the years, to standardize it. As you describe, allocation of responsibilities and rewards requires clear and unambiguous declarations of authorship/contributorship.

However, there is an argument to be made that ambiguity is not entirely a bad thing in science. The sociologist of science, Harriet Zuckerman (1968), made a compelling argument that ambiguity in authorship helps researchers collaborate.

An authorship taxonomy, as you propose, may actually make things worse in science, adding stress, political and status battles over exactly who did what and how much.

“I have shown that patterns of name ordering have been devised as adaptive mechanisms to heighten the visibility of role-performance of individual investigators working in collaboration. The various patterns which, at first, seem to have unequivocal meanings for scientists do convey varying degrees of ambiguity and, to this extent, fail to serve their intended functions. However, it is also the case that such ambiguity in the meaning of name orders reduces the stress of collaboration. As in other departments of social life, making things explicit often introduces strain into social relations.”

–Zuckerman HA. 1968. Patterns of Name Ordering Among Authors of Scientific Papers: A Study of Social Symbolism and Its Ambiguity. The American Journal of Sociology 74: 276-91. http://www.jstor.org/stable/2775535

I very much like your contribution above, and appreciate the nuanced approach that Zuckerman describes.

We feel that this socially-mediated ambiguity is perfectly well accommodated within a taxonomy – which would necessarily be hierarchical. So the present model could be encapsulated with a simple assertion that “I am a contributor”, which equates exactly with having presence in an author list. As behaviour and tools evolve, people can add in further detail: “I am a contributor, I interpreted the data”.

There certainly is demonstrable post-hoc disagreement over who-did-what (Ilakovac et al, 2007, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1764586/?tool=pmcentrez&rendertype=abstract), and many examples of ethical issues. Our belief is that an infrastructure that permits capture of this detail as the work develops would enable detailed accuracy, and expose any disagreements (which is preferable to them being hidden) – whilst being able to retain the top-level declaration that “I am a contributor”.