How can we use research performance indicators in an informed and responsible manner? Guest post by Henk Moed

In this post, distinguished Professor Henk Moed explains why he has published his recent book Applied Evaluative Informetrics, and outlines his critical views on a series of fundamental problems in the current use of research performance indicators in research assessment. The book “gives an overview of the pros and cons of 28 often used indicators, including Journal Impact Factor, SNIP, SJR, relative citation rates, h index, full text downloads and altmetrics”. His contributions to the field of bibliometrics include the development of the journal metric Source Normalised Impact per Paper (SNIP).

During the past decade, science policy has placed increasing emphasis on societal value and value for money, on performance-based funding and on the globalization of academic research. It has also shown a growing need for internal research assessment and research information systems.

At the same time, due to the computerization of the research process and the digitization of scholarly communication, research assessment is increasingly becoming a ‘big data’ activity, involving multiple comprehensive citation indexes, electronic full text databases, large publication repositories, usage data from publishers’ sites, and altmetric, webometric and other new data sources.

These trends created an increasing interest in the development, availability and application of new indicators for research assessment. Many new indicators were developed, and have become available on a large scale. Desktop bibliometrics is becoming a common assessment practice.

But the way bibliometric (or, more generally, informetric) indicators are used in research assessment is increasingly criticized. Indicators may be biased and not measure what they are supposed to measure; most studies adopt a limited time horizon; indicators can be manipulated, and may have constitutive effects; measuring societal impact is problematic; and when indicators are used, an evaluative framework and assessment model are often lacking.

In my book I discuss the various criticisms in detail. I reflect upon their implications for the actual use of informetric indicators in research assessment, and for future indicator development. The central question in my book is: How can we use research performance indicators in an informed and responsible manner, taking into account the critique on the way they are currently used, and properly exploiting their potential?

The book expresses the following views, partly in support of, and partly as a counter-critique to, the criticisms of current practices in the use of research performance indicators.

Calculating indicators at the level of an individual and claiming that they measure by themselves the individual’s performance suggests a façade of exactness that cannot be justified. A valid and fair assessment of individual research performance can be conducted properly only on the basis of sufficient background knowledge of the particular role the individual played in the research presented in their publications, and by also taking into account other types of information on their performance.

The notion of making a contribution to scientific-scholarly progress does have a basis in reality, which can best be illustrated by referring to a historical viewpoint. History will show which contributions to scholarly knowledge are valuable and sustainable. In this sense, informetric indicators do not measure contribution to scientific-scholarly progress, but rather indicate attention, visibility or short-term impact.

Societal value cannot be assessed in a politically neutral manner. Establishing the criteria for assessing societal value is not a matter in which scientific experts have qualitate qua a preferred status; it should eventually take place in the policy domain. One possible option is to move away from the objective of evaluating an activity’s societal value, towards measuring in a neutral manner researchers’ orientation towards any articulated, lawful need in society.

Studies on changes in editorial and author practices under the influence of assessment exercises are most relevant and illuminative. But the issue at stake is not whether scholars’ practices change under the influence of the use of informetric indicators, but rather whether or not the application of such measures enhances research performance. Although this is in some cases difficult to assess without extra study, other cases clearly show traces of mere indicator manipulation with no positive effect on performance at all.

A typical example of a constitutive effect is that research quality is more and more conceived as what citations measure. More empirical research on the size of constitutive effects is needed. If there is a genuine constitutive effect of informetric indicators in quality assessment, one should not direct the critique of current assessment practices merely at informetric indicators as such, but rather at any claim to an absolute status for a particular way of assessing research quality. Research quality is not what citations measure, but at the same time peers may assess it wrongly.

If the role of informetric indicators has become too dominant, it does not follow that the idea of intelligently combining peer judgments and indicators is fundamentally flawed and that indicators should be banned from the assessment arena. But it does show that the combination of the two methodologies has to be organized in a more balanced manner.

In the proper use of informetric tools an evaluative framework and an assessment model are indispensable. To the extent that in a practical application an evaluative framework is absent or implicit, there is a vacuum that may easily be filled either with ad hoc arguments of evaluators and policy makers, or with unreflected assumptions underlying informetric tools. Perhaps the role of such ad hoc arguments and assumptions has nowadays become too dominant. It can be reduced only if evaluative frameworks become stronger, and more actively determine which tools are to be used, and how.

The following alternative approaches to the assessment of academic research are proposed.

A key assumption in the assessment of academic research has been that it is not the potential influence or importance of research, but the actual influence or impact that is of primary interest to policy makers and evaluators. But an academic assessment policy is conceivable that rejects this assumption. It embodies a shift in focus from the measurement of performance itself to the assessment of preconditions for performance.

Rather than being used as an indicator of research importance or quality, citations could provide a tool in the assessment of communication effectiveness, and express the extent to which researchers bring their work to the attention of a broad, potentially interested audience. This extent can in principle be measured with informetric tools. Such an approach discourages the use of citation data as a principal indicator of importance.

The functions of publications and other forms of scientific-scholarly output, as well as their target audiences, should be taken into account more explicitly than they have been in the past. Scientific-scholarly journals could be systematically categorized according to their function and target audience, and separate indicators could be calculated for each category. Indicators of the internationality of communication sources can be calculated that are more sophisticated than the journal impact factor and its variants.

One possible approach to the use of informetric indicators in research assessment is a systematic exploration of indicators as tools to set minimum performance standards. Using baseline indicators, researchers will most probably change their research practices as they are stimulated to meet the standards, but if the standards are appropriate and fair, this behavior will actually increase their performance and that of their institutions.

At the upper part of the quality distribution, it is perhaps feasible to distinguish entities which are ‘hors catégorie’, or ‘at Nobel Prize level’. Assessment processes focusing on the very top of the quality distributions could further operationalize the criteria for this qualification.

Realistically speaking, rankings of world universities are here to stay. Academic institutions could, individually or collectively, seek to influence the various systems by formally sending to their creators a request to consider the implementation of a series of new features: more advanced analytical tools; more insight into how the methodological decisions influence rankings; and more information in the system about additional, relevant factors, such as teaching course language.

In response to major criticisms of current national research assessment exercises and performance-based funding formulas, an alternative model would require less effort, be more transparent, stimulate new research lines and reduce to some extent the Matthew Effect. The basic unit of assessment in such a model is the emerging research group rather than the individual researcher. Institutions submit emerging groups and their research programs, which are assessed in a combined peer review-based and informetric approach, applying minimum performance criteria. A funding formula is partly based on an institution’s number of acknowledged emerging groups.
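To make the last point concrete, here is a minimal sketch of what a funding formula "partly based" on emerging groups might look like. It is an illustration only and is not taken from the book: the `Institution` fields, the `allocate` function, the existing `base_share` component and the `group_weight` parameter are all hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass
class Institution:
    name: str
    base_share: float      # hypothetical: relative share under an existing formula
    emerging_groups: int   # number of acknowledged emerging groups


def allocate(institutions, budget, group_weight=0.3):
    """Split `budget` so that a fraction `group_weight` follows emerging-group counts.

    The remaining fraction follows each institution's base share.
    Both the blend and the default weight are assumptions for illustration.
    """
    if not 0.0 <= group_weight <= 1.0:
        raise ValueError("group_weight must be between 0 and 1")
    total_base = sum(inst.base_share for inst in institutions)
    if total_base <= 0:
        raise ValueError("base shares must sum to a positive number")
    total_groups = sum(inst.emerging_groups for inst in institutions)
    # If no groups were acknowledged anywhere, fall back to the base shares only.
    weight = group_weight if total_groups > 0 else 0.0

    allocation = {}
    for inst in institutions:
        base_part = inst.base_share / total_base
        group_part = inst.emerging_groups / total_groups if total_groups else 0.0
        allocation[inst.name] = budget * ((1 - weight) * base_part + weight * group_part)
    return allocation


if __name__ == "__main__":
    example = [Institution("A", 0.5, 3), Institution("B", 0.5, 1)]
    for name, amount in allocate(example, budget=100.0, group_weight=0.4).items():
        print(f"{name}: {amount:.2f}")
```

In a sketch like this, the weight given to emerging groups is a policy choice rather than a technical one, which is in line with the argument that an evaluative framework should determine which tools are used, and how.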

The practical realization of these proposals requires a large amount of informetric research and development. The book proposes several new directions for indicator development. They constitute important elements of a wider R&D program of applied evaluative informetrics. The further exploration of measures of communication effectiveness, minimum performance standards, new functionalities in research information systems, and tools to facilitate alternative funding formulas should be conducted in close collaboration between informetricians and external stakeholders, each with their own domain of expertise and responsibilities.