If you've been reading this blog for a while, you know that this is exactly what I've been saying over and over again: we need a scientific approach to understand the academic system, much like we have one to understand the economic system. We are wasting time, money and effort because we don't understand the social dynamics of the communities and don't know what benefits knowledge discovery. Instead of a scientific investigation of these matters, we spend time with useless complaints. Clearly, funding agencies should be interested in how to use their resources more efficiently. It's thus no surprise the NSF has such a program. It's a surprise other agencies don't.

Thus, I agree with the general sense of Lane's article. Current metrics to assess scientific productivity are poor. They are used even though that is known. There are no international standards. Many measures for creativity and productivity are not used at all. Lane writes:

Existing metrics do not capture the full range of activities that support and transmit scientific ideas, which can be as varied as mentoring, blogging or creating industrial prototypes. [...]

Knowledge creation is a complex process, so perhaps alternative measures of creativity and productivity should be included in scientific metrics, such as the filing of patents, the creation of prototypes [4] and even the production of YouTube videos. Many of these are more up-to-date measures of activity than citations.

She also points out that there are many differences between fields that have to be accounted for, and that the development of good metrics is an interdisciplinary effort at the intersection of the social and natural sciences. I would have added the computer sciences to that. I agree with all that. What I don't agree with is the underlying assumption that measuring scientific productivity is under all circumstances good and useful to begin with. Julia Lane actually doesn't provide a justification, she just writes:

"Scientists are often reticent to see themselves or their institutions labelled, categorized or ranked. Although happy to tag specimens as one species or another, many researchers do not like to see themselves as specimens under a microscope — they feel that their work is too complex to be evaluated in such simplistic terms. Some argue that science is unpredictable, and that any metric used to prioritize research money risks missing out on an important discovery from left field. It is true that good metrics are difficult to develop, but this is not a reason to abandon them. Rather it should be a spur to basing their development in sound science. If we do not press harder for better metrics, we risk making poor funding decisions or sidelining good scientists."

True, this is no reason to abandon metrics. But let me ask the other way 'round: Where is the scientific evidence that the use of metrics to measure scientific success is beneficial for progress?

I don't have any evidence one way or the other (well, if my proposal under Ms Lane's program had been approved, I might have). So instead I'll just have to offer some arguments why the mere use of metrics can be counterproductive. First, let me be clear that scientific research can be very different from one field to another. We also previously discussed Alexander Shneider's suggestion that science proceeds in various different stages. The stages basically differentiate the phase of the creative process. For some of these stages, the use of metrics can be useful. Metrics are useful if it is uncontroversial what constitutes progress or good research. This will be the case in the stages where a research field is established. That's basically the paper-production, problem-solving phase. It's not the "transformative" and creative stage.

One has to be very clear on one point: metrics are not external to the system. The measurement does affect the system. Julia Lane actually provides some examples for that. Commonly known as "perverse incentives," it's what I've referred to as a mismatch between primary goals and secondary criteria: You have a primary goal. That might be fuzzy and vague. It's something like "good research" or "insight" or "improved understanding." Then you try to quantify it by use of some measure. If you use that measure, you have now defined for the community what success means. You dictate to them what "good research" is. It's 4 papers per year. It's 8 referee reports and 1 YouTube video. It doesn't matter what it is and how precise you make it, the point is that this measure in turn becomes a substitute for the primary goal:

"The Research Performance Progress Report (RPPR) guidance helps by clearly defining what agencies see as research achievements, asking researchers to list everything from publications produced to websites created and workshops delivered."

So you're telling me what I should be achieving? And then you want me to spend time counting my peas?

Measures for achievements are fine if you have good reason to believe that your measure (and you could adapt it when things go astray) is suitably aligned with what you want. But the problem arises in cases where you don't know what you want. Lane alludes to this with her mention of researchers who think their work is "too complex" (to be accurately measured) and that "science is unpredictable." But then she writes this is no reason to abandon metrics. I already said that metrics do have their use, so the conclusion cannot be to abandon them altogether. But merely selecting what one measures tells people what they should spend their time on. If it's not measured, what is it good for? Even if these incentives are not truly "perverse" in that they lead the dynamics of the system totally astray, they divert researchers' interests. And there's the rub: how do you know that deviation is beneficial? Where's the evidence? You with your science metric, please tell me, how do you know what are the optimal numbers a researcher has to aim at? And if you don't know, how do you dare to tell me what I should be doing?

The argument that I've made previously in my post "We only have ourselves to judge each other" is that what you should be doing instead is to just make sure the system can freely optimize, at least within some externally imposed constraints that basically set the goals of research within the context of the society. The last thing you should be doing is to dictate to researchers what is the right thing to do, because you don't know. How can you know, if they don't know?

And yes, that's right, I'm advocating laissez-faire for academia. All you out there who scream for public accountability, you have completely missed the point. It's not that scientists don't want to be accountable. There's just no sensible way to account for their work without that accounting hindering progress. Call that the measurement problem of academia if you like.

Bottom line: Before you ask for more scientific science metrics, deliver scientific evidence that the use of such metrics is beneficial for scientific progress to start with.

47 comments:

As has been mentioned by Julia Lane, good physicists should not be sidelined. To put it in other terms: how can we be sure that not only 'mainstream' physics is supported by such systems, like the Brazilian one?

On the other hand, how can we be sure that sidelining leads to good results?

We can't be sure. The problem is that the use of metrics raises the impression one can be sure what's good and what isn't. That fundamental insecurity is reflected in the community and their judgement. If one tries to replace it with a "sure" measure, one brings in the implicit assumption that it's possible to "know better." But there are no two scientists who agree on which way to pursue research is the best and why and who is doing the right thing and who isn't. We should make sure this diversity (in content as well as procedure) is preserved, rather than using a one-size-fits-all international standardized measure. If anything, that's a sure way into stagnation. Best,

Konrad Hinsen said: Two fundamental problems with metrics in science are that quantity does not imply quality, and that short-term impact does not imply long-term significance. The real value of many scientific discoveries often becomes apparent only many years later. It would be interesting to evaluate metrics by applying them to research that is a few decades old. Would they have identified ideas and discoveries that we now recognize as breakthroughs?

I've used the word to mean a quantification (here, of scientific achievement) by making use of collected data and giving them some weight (relevance). The citation index (or the h-index, or whatever) is one such data point that you can raise. You can count the number of papers, the impact factor of the journals they got published in, videos produced, seminars, patents, conferences attended, blogposts, etc. Lots of numbers. You raise them for everyone in the group/department/university or for a single person. You take all this data and from it you compute a measure for "scientific achievement" (as for example done with the university rankings). What I'm saying here is that simply by doing so (raising and comparing the data, computing some measure for achievement) you are modifying the system. I'm asking whether that's desirable or whether, at least in some cases, it's an unwelcome disturbance that is counterproductive. Best,

Yes, nice quotation. These are two of the most commonly raised objections. Of course defenders of metrics would say, you just have to use the right metric. You can measure quality: just judge this work on a 5 star scale. You can try to incorporate long-term impact by explicitly asking for it, etc. That could indeed improve matters compared to nowadays. But either way, what I'm saying is that it doesn't matter what great metric you come up with, you'll never know it's the "right" metric. There's no knowing and there's no substitute for peers' judgement. Whatever you change, you can only make it worse. Best,

I have the feeling that attempting to analyse the scientific process as a scientific matter per se is an over-analysis. This would create more venues for people publishing even more papers that probably will lead nowhere. The best would be a fundamental change towards a more natural selective situation.

I've always thought that if one decides to go into the big sacrifice, dedication and effort of a PhD and postdoc(s), this would be considered already proof of competence, responsibility and motivation. (I firmly believe in inner motivation, as well as some random components, and capacity for science, much like in arts, music and writing.)

A very simple and natural solution to reflect that the above premise is good enough would be to aim at the best educational system, the best laboratories, libraries and resources, select only the *very* best students, and after these have gone through all their formative years (educationally and professionally speaking), then... that's it. You have the best possible scientist, right? What else is there to do?

Whatever their competence or profile (theoretical? creative? etc), they *will* work. Let them "play". They will be creative, technical, whatever. Let them *think*. These people are intelligent, diligent, right? They have inner motivation, they have already proved enough. Why doubt them constantly? No need for elaborate measures and over-analysis. They will fail or not fail. There are no guarantees, but it's better to have a cleaner situation than a distorted, confusing, useless, non-optimal one. (I see that as almost a big "lie", an almost unethical thing).

Science is not about an industry, a commerce, or a financial institution. Any measure has only the effect of making people run after it in order to keep themselves in the game (e.g. the publish or perish game). It's not by a big number of scientists spending their lifelong careers publishing mediocre papers that science will advance (although I know some people that firmly believe that "productivity" is all), but by the very few percent of them who had the best education, "talent" or firm dedication to pursue the best quality work as their only truthful aim. (I think that, fundamentally, that is the real core for the advancement of science; to accept something else is foolishness).

In a selective situation, with far fewer papers around to read, only the most interesting or relevant work would be published and receive appropriate visibility. A mediocre work would be recognized more easily and would be an embarrassing situation in such a selective community of scientists.

I recognize that the issue is not that simple, especially for experimental sciences, but for the theoretical sciences, I see no reason to prefer the current ridiculous situation (one cannot keep track of one's particular field with so much noise and trash) over a simpler and more natural solution, especially considering that theoretical scientists are relatively cheap to finance. And having a more selective number of them, the cost would be insignificant. Give them permanent positions and let them think.

Some fifty years after the first quantitative attempts at citation indexing, it should be feasible to create more reliable, more transparent and more flexible metrics of scientific performance. The foundations have been laid. Most national funding agencies are supporting research in science measurement, vast amounts of new data are available on scientific interactions thanks to the Internet, and a community of people invested in the scientific development of metrics is emerging. Far-sighted action can ensure that metrics goes beyond identifying 'star' researchers, nations or ideas, to capturing the essence of what it means to be a good scientist.

I saw the connection to "Web of knowledge" that you were referring to earlier.

Attempts like ORCID or Nature to give identifier recognition, along with "the scientist's presence," as you have been doing over the years, as well as others to make science more accessible toward understanding "is a character we might wish to imbue in all who advance the principles of science," as much as an algorithm identifier toward the consensus of ability and responsibility.

For me this sets apart the fundamental idea as to what to this point has not been recognized is now being done so.

This is partly developed in respect of internet development, in that protocols had to be adapted to engage the full scope of the individual's interactions.

Thus becoming part of an accountable system?

I have noticed this part of your engagement with asking about citations and who is judging as part of questions of others as well, hence the idea of marketing and such of one's wares. Left over from the questions of Peter's as well.

Like Phil it is awareness about "quality" and not quantity that one should engage the value of how far research can penetrate all levels of society, is indicative of the value science can play when it reaches across all those levels of society as well.

Well, I would agree with you that the best would be a change towards a more natural selective situation. But the point I was trying to make with saying we need a scientific study of the academic system is that it doesn't matter what you think or I think or Julia Lane thinks. We might still be arguing and exchanging anecdotes in a few thousand years. We're scientists, we should be analyzing the situation and figuring out a scientific answer to tell who is right with what's the best thing to do. Best,

Yes, I understand. But my point is that a first, concentrated effort towards changing to a more selective situation is very fundamental, and I think it is clearly a consensus (I may be wrong, though); it is not something that needs a scientific analysis to begin with, and at the end such an effort would exactly clean the path towards settling the "scientific answer to tell who is right", as you put it -- which in fact would be a second order refinement to find the optimum situation.

I strongly doubt there is a consensus on this. On the contrary, the majority of people nowadays seems to believe in or at least doesn't object to the use of perpetual accountability. (That I referred to as "pea counting." I just noticed with delay it's a German idiom that probably doesn't make much sense in English.) Best,

I couldn't agree more that a laissez-faire approach, particularly when combined with long term investment, is critical to scientific study.

Of all people, it was actually my father I picked this up from. He tells an interesting anecdote:

In the early 70s he was a post-doc in molecular biology at Cambridge under Crick. At that time Brenner was establishing the nematode C. elegans as a model for developmental and genetic biology. This undertaking took a considerable amount of time, during which few or no results were published. If Brenner had tried to establish this model in the current funding environment, with five year cycles and a reliance on publishing records, then C. elegans would never have been developed, and a large part of our understanding of developmental biology would have been lost. That sort of long term commitment to scientific research is almost never seen anymore.

"Erbsen zählen" is correct. Google Translate must have thought it was the beginning of a sentence, "Erbsen zählen... zu den dümmsten Gemüsesorten," which would mean "Peas are among the most stupid sorts of vegetables."

(...) the majority of people nowadays seems to believe in or at least doesn't object to the use of perpetual accountability. (That I referred to as "pea counting." (...)

Yes, but that's exactly because they managed to stay in the system *according to* its rules -- "bean counting" (eh... I wonder what the equivalent is in Portuguese...). Those who objected or did not adapt to the system, left. That's the "natural selection" that we have discussed before.

By "consensus" I mean the people who *want* to change the system, the new generations, who are entering the system. I think it's virtually impossible to change the current situation while the older generations dominate the field. So we have to address the former, the consensus must come from the former. Otherwise, I do not think something can be changed starting with only rational analysis with the people who are benefiting from the system. But I may be wrong.

BTW, Bee, we generally agree on various matters. I came to notice that you usually focus on trying to rationally improve or perform small changes of some existing situation. I tend to be more radical and idealistic, towards fundamental changes. This is often not too rational. Your method may be easier to realize at first, easier to quantify and argue about. Mine is difficult to accomplish, but is well intended and aims at a general change. In any case, I do not want to dispute your paradigm on "how to change the world" :); yet I do hope the general solution somehow lies in between both approaches.

Yes, I referred to that as "survivor bias" elsewhere: people who are successful within the current system today have no reason to want to change it. That's one of the reasons why there hasn't been change yet (another reason is simply inertia). That doesn't mean though the majority of people in the system think it's maximally efficient, and the question is what's the fraction. My impression is that meanwhile it's a very small fraction. It's the top-cited ivy-league, top-institution people that manage to delude themselves into believing academia works great.

Well, yes, I guess the people who want to change the system have a consensus wanting to change the system. But I'm not sure that's relevant. It is, in practical and theoretical terms, irrelevant what people think who are entering the system. They have no experience. They have not had any chance to learn from their actions. They don't know how the system works. They don't know what consequences their actions have for the community. They don't know anything about the community or its working to begin with. I understand why you want them to be the ones who make a difference. And I think this would be a good thing to do. But what I am asking for is to put aside for a moment your and my personal opinion and be more careful: How do you know this would be the right thing to do? What reason do you have to believe it's not a giant mistake you'd make? Can you provide scientific evidence?

What I'm saying is actually pretty radical if you think about it. I'm saying: stop the trial and error. Raise data, analyze the situation, draw conclusions. Stop the handwaving and the rhetoric. Think scientifically. Act accordingly. Everybody who is in a situation of power is bound to be opposed because they are likely to lose power. In any case, I think there is a sufficiently large basis of people in the system, who did not leave, who have made their experience, and who understand like you and I that it suffers from a lot of problems. Over the course of time, I have met and discussed with dozens of people in many different fields (chemistry, biology, medicine, economics, physics, etc) who agree that the system is extremely inefficient and in desperate need of improvement (the NSF program that Julia Lane manages is an indicator for that). The problem is not to find them. The problem is to get them to move their ass. Best,

I think there is a sufficiently large basis of people in the system, who did not leave, who have made their experience, and who understand like you and I that it suffers from a lot of problems. (...)

(...)It is, in practical and theoretical terms, irrelevant what people think who are entering the system. They have no experience.

That is right of course; the former group is the one who must take action now (somehow); the latter will be "in action" only in the future. There isn't much they can do for sure, but this is not only for lack of experience, but also for lack of representativeness (they are inexperienced but hopefully not stupid). This is something to be changed as part of the process.

I guess the people who want to change the system have a consensus wanting to change the system.

Sorry for the misunderstanding, this is ridiculously obvious. I mean consensus of those who want to first change *according* to what I have said, namely, by increasing quality by a very strong selection of students (towards future researchers).

I would agree that both Bee’s and Christine’s positions essentially rest on the recognition that quality simply can’t be quantified, but rather is recognized almost instinctively and really can’t have a metric applied to it at all. This of course was Pirsig’s central premise, and interestingly he initially came to consider this when he first aspired to science as a vocation, only later to shun it because of the methods it incorporated to evaluate itself.

”The difference between a good mechanic and a bad one, like the difference between a good mathematician and a bad one, is precisely this ability to select the good facts from the bad ones on the basis of quality. He has to care! This is an ability about which normal traditional scientific method has nothing to say. It's long past time to take a closer look at this qualitative preselection of facts which has seemed so scrupulously ignored by those who make so much of these facts after they are "observed." I think that it will be found that a formal acknowledgment of the role of Quality in the scientific process doesn't destroy the empirical vision at all. It expands it, strengthens it and brings it far closer to actual scientific practice.”

”We have artists with no scientific knowledge and scientists with no artistic knowledge and both with no spiritual sense of gravity at all, and the result is not just bad, it is ghastly. The time for real reunification of art and technology is really long overdue”

There is no doubt in my mind that the path these two are walking has been walked before. I would even suspect the "older generation" was once young :) and they too tried to break free from the constraints that seemed insurmountable for their starting adventures.

So what's changed?

There is more accountability now that technology has made its way into the ideas of information sharing and development that would allow us to see aspects of the character of the scientist at heart doing their work.

In the spirit of our discussion is an example that I found of relevance as well, as the words that you have quoted that eventually lead to the understanding of the quality of the work.

In this sense, it is timeless and exists from one generation to the next. Are the technologies taking us away from pursuing the deeper thoughts that are applicable to us even today from historical forebears?

In this sense the times are always changing in terms of the technologies, and Pirsig caught a sense of this when he wrote....

What is in mind is a sort of Chautauqua...that's the only name I can think of for it...like the traveling tent-show Chautauquas that used to move across America, this America, the one that we are now in, an old-time series of popular talks intended to edify and entertain, improve the mind and bring culture and enlightenment to the ears and thoughts of the hearer. The Chautauquas were pushed aside by faster-paced radio, movies and TV, and it seems to me the change was not entirely an improvement. Perhaps because of these changes the stream of national consciousness moves faster now, and is broader, but it seems to run less deep. The old channels cannot contain it and in its search for new ones there seems to be growing havoc and destruction along its banks. In this Chautauqua I would like not to cut any new channels of consciousness but simply dig deeper into old ones that have become silted in with the debris of thoughts grown stale and platitudes too often repeated.

This search for meaning even amidst the science is an essential recognition of things that are about to change, but have always been part of the generations, from one to the next. A maturation, with this greater sense of responsibility.

For some reason, shaking the classical coats of our history is a way of proclaiming ownership about who we are in the world of science today. A rebirth you might say. A tossing off of the blanket, for a new season that is to begin.

Pirsig recognized the superficial nature that could be espoused with the technologies and the idea about this move to recognition of the quality still needed a good assessment of what was being espoused and from what sections of thought we were seeing revealed.

”Yes, I agree with you. I find it somewhat depressing that the problem is so obvious, yet still persists.”

Yes, isn’t it curious that despite the words and thoughts of those like Socrates, Plato, Aristotle, Descartes, Einstein and more recently Pirsig, all this still goes for the most part largely unheeded. I would say, however, that this results not solely, as Pirsig contends, from its being intentionally ignored, but rather from its never having been understood widely enough to be appreciated. I think the key ingredient is what Pirsig recognized, and which I find you have taken up as well: that the false divisions and subdivisions drawn between our methods of understanding and knowledge creation are what first must be exposed and got rid of.

Although I wouldn’t contend that Pirsig had all the answers, I do believe he did at least understand the question and also was true if to no one else at least to himself. That has become evident for me, since in all his years, although being one of the best “technical writers” of the past century he only wrote and had published two books, which I find as a testament to him knowing only too well that qualified truth is seldom if ever improved by producing a sequel; which unfortunately is the opposite of the common perception in both the arts and the sciences and has been for quite some time.

It reminds me of my own experience when I was looking for a PhD grant (back in 2004). After months of searching, I finally found a laboratory in Grenoble that was interested in me, with a research project dealing with Drosophila, i.e., the fruit fly. They asked the region for money with my CV. The answer was: "to get the money, the project must have obvious industrial aims", which is hard to prove with Drosophila... Any project with mice in it would've been considered, because the industrial aim is in general more obvious to the eyes in such a case. But all biologists know that the results gathered with Drosophila are more "interesting" than those with mammals and can often be extended to mammals in the long run, simply because the fly has a simpler genome, and so one can more easily identify what's happening in the "black box"... The knowledge is clearer with the fly, but the metric used here says it is no good because it is far from human...

The fact is, any standard or metric squeezes the accountable research into a sub-space within the space of all possibilities. The latter is infinite, and we are afraid of this infinity. This endless void makes us sick, and we need to focus on one thing in order to not face the vertigo it could cause. For this reason, the current system is scared of letting researchers think and decide on their own, or otherwise the system would lose its power over individuals.

I personally believe in the universality of logic. Thanks to it, Science has won over the Church's talk about Nature, as minds eventually feel stuck in a maze and misled by false conceptions.

A Science of Science has, however, its own polemic: it must study itself from the beginning, or it would mean it's not self-coherent. So then, a science of science of science, then a science of science of ... of science... Another infinity which has us confused again.

The reason why we are so reluctant to let things run by themselves may be there.

A large part of expenditures on science are on things like the CERN accelerator, large telescopes, spacecraft instrumentation and so on. Or the human genome project. While you might count these more as engineering than science, nevertheless these programs have to be run with some metrics.

So some scientific expenditures have to have metrics attached. Likewise in applied research which is directed to solve some specific problem - progress can be judged.

It is the "green field" areas of science where metrics probably do not apply - because we have no or little knowledge, because we do not even know what we're looking for.

Yes, I agree. As I said in my post, it depends on the field, and in many (I'm tempted to say most) cases my sense is metrics won't do much harm but instead improve objectivity. I was just saying the one-size-fits-all approach can do harm in some cases, so better move carefully and identify these cases first. Best,

Thanks for sharing your experience, that's interesting. Yes, metrics do restrict a very large or possibly infinite number of factors down to a few. This creates two problems. The first is that to make metrics better you have to include more and more data and make this data more and more refined. That's already bad because it takes up more and more time and more and more people to create a more and more accurate picture of reality. But the real problem is that no human can plausibly judge on all this data from all these people. You have to rely on some software to do that. This is problematic because software has the tendency to weed out any sort of individuality (how many points do you score on the international standard norm for scientists?) Point is, the best way to judge on a human is another human. No, that's not objective, and that's not standardized. But try to convince me a computer averaging over some thousand people will do better. You can go and collect all the data you want, you're only making it worse. And in some fields, especially the fundamental research, very-long-term investment, that human judgement has always played a relevant role. What we should be doing is to make sure the conditions are such that there is as little externally imposed bias as possible, not trying to replace it or to average it over the globe. Best,

I was thinking more about this and, taking up some of the interesting comments here, in the end it seems to me that it's purely and simply about the *measure of a scientist* (A).

This is a much deeper question than I had realized. Clearly, one attaches to that a computation of his/her *output* (B) under some *criteria* (C).

Now:

(A) is fundamentally unethical IMO!

(B) misses long-term research, potential for future outcomes, etc.;

(C) is arbitrary to the extent that it can never be all-encompassing, given the human diversity and nature of the work (the danger of "the one-size-fits-all approach" that Bee refers to).

It's a deeply problematic concept. Do you want to *measure* Einstein, Dirac, Feynman? Is it correct to attach a grade of 10 to Einstein and a grade of 8 to Dirac, etc.? I know we tend to compare the lifetime achievements and genius of scientists, but does such a "measure" really *add fundamental value* to their accomplishments?

It is clear that we value the work of such geniuses and even want to compare their contributions. But such comparisons have no intrinsic meaning, in the sense that a measure attached to these scientists while they were alive would not have made any difference at all (or, if anything, a negative one, making some of them leave the field before accomplishing anything).

So it's much worse than I thought.

The simplest approach would be to find a solution that, by construction, makes such measurements completely dispensable.

The point is: at the forefront of theory there are no guaranties of success anyway!

One can only hope to increase the potential for interesting results by some natural mechanism; one that I have mentioned is a more sensible selection of technically very good, creative, inquisitive and motivated students, because these are the lifelong qualities that matter. (As I mentioned previously, this rationale fits the theoretical sciences better.)

Point is, the best way to judge on a human is another human. No, that's not objective, and that's not standardized.

There, I would say this is what peer review does, which is precisely the system we have at the moment. And the problem with it is all the possible conflicts of interest: as an expert, can I judge someone's groundbreaking work if it breaks my own ground in turn? To bypass this issue, resorting to a computerized method appears to be the neutral solution. It is all about personal ethics, but how can we be sure everyone is going to act according to these ethics? Would you let a discovery disturb your field if you felt it meant your research contract would be cancelled at the end of the month? That's tough indeed.

Dr Philip Gibbs is presently publishing on his weblog a series of historical instances in which scientists had their discoveries rejected for unscientific reasons. We now know they were right, though at the time their work was completely rejected by the respected scientists of the field (see Philip Gibbs's last publication here). These are probably worst-case scenarios of what can happen when judgements about work are made entirely by human scientists who do not put the interest of science before their own.

Or maybe we have no other solution than dealing with uncertainty anyway... Heisenberg taught us that this is the way it is when one tries to be more and more precise in accounting for the state of things.

Your analysis in three points is a striking demonstration of the weaknesses we are talking about. Indeed, those who judge you need simplicity to deal with your case, as otherwise they could not deal with it at all. But how can one summarize someone in three such points? It's like taking three photos of Paris and showing them to a friend who has never been there, saying: "Here's Paris!" Everybody does this, though it's incomplete to say so. After looking at the three photos, your friend thinks he now knows everything about what Paris looks like. In his mind, a small bunch of neurons has encoded a small closed world called "Paris", which resembles the three photos, and for which no outer connections can be thought of... In fact, we like simplicity because it allows us to decide rapidly, as considering each and every possible parameter would take centuries before we could decide anything. The only way to go beyond such a limitation would be to constantly think of what is to come next, to think about things in an open way... This goes against the stability of things, and it costs effort and pain.

Very nice post that suggests an interesting yet counterintuitive way to look at the future of metrics.

You mention that metrics can disturb the research process by incentivizing researchers to pursue directions or activities they would not normally favor. Well, what if the metrics themselves are so complicated and weigh so many different factors that, to most individuals, they are almost uninterpretable? A sufficiently complex metric might no longer be easily "gameable" and researchers might simply ignore the details and pursue research as they please.

As more and more of scientific activity moves online or becomes otherwise trackable, I think this is a strong possibility.

That's an interesting way to put it, I think there's a lot of truth to what you're saying. To me the clearest statement is in your sentence

The point is: at the forefront of theory there are no guaranties of success anyway!

Which I tried to express with:

You with your science metric, please tell me, how do you know what are the optimal numbers a researcher has to aim at? And if you don't know, how do you dare to tell me what I should be doing?

That, I think, is the greatest problem with the attempt to "measure" somebody's promise, as far as fundamental research is concerned: it's the implicit assumption that such a measure is knowable and quantifiable and expressible in computer code, and that this procedure can do better than humans. In the long run, that inevitably also goes along with erasing local and cultural differences, which will result in more streamlining.

One could, on some philosophical level, probably argue that one can measure everything to sufficiently good precision if one only collects enough data with enough precision. You could add psychological profiles, biorhythm, genetic information, IQ and various personality tests etc., just think about it (I can just imagine we're going there, give it a decade or two). Even if the true space of parameters is infinite dimensional and arbitrarily complex, it's probably "theoretically doable." Besides the question (that I raised in my previous comments) of whether the result justifies the effort, the big issue is: what's it good for if you don't know what you're looking for anyway? Providing an international standard for academic achievement in a very real sense tells scientists what's "right" and what's "wrong." We already have that situation today, though it's more subtle. (You have to be deaf and blind not to know what's the career-wise "right" thing to do, but at least there's still some room for the quirky outsiders.) If you raise it to a standard you're going to exponentiate the problems that we have today. That's what I'm afraid is going to happen. Best,

If somebody can write computer code, somebody else will figure out how to game it. Just look around, this happens all the time and everywhere. What you say simply disregards the way natural selection proceeds and has always proceeded. People learn from their behavior. They adapt accordingly. That researchers will (and can) simply ignore the metrics is about the last thing that's going to happen. But besides that, nobody is going to accept a metric that's "uninterpretable" as you say. Julia Lane writes in her article:

"Importantly, data collected for use in metrics must be open to the scientific community, so that metric calculations can be reproduced. This also allows the data to be efficiently repurposed."

Yes, that's right, that's peer review. And it's also right that peer review today suffers from a lot of problems. What I am saying is that peer review is in the end the only judgement we have to rely on (if we need a judgement before Nature judges herself, which, in fundamental research, is all the time). Thus, we should make sure that peer review works efficiently. To this end, I am saying we should most importantly reduce all sorts of pressure that scientists are subject to (financial pressure, public pressure, time pressure, peer pressure) and raise the value of constructive criticism (you might want to add to this some education on how the scientific system works best and what results individual behavior has on the aggregate level).

See, the problem that you've pointed out, it's caused by: a) the reviewer is afraid of damaging his future options, b) if he offers useful criticism it's not acknowledged and he's just wasting his time, c) because he can. Now consider the following change of scenario: a) Said reviewer doesn't have anything to fear for his future. His only interest is to find out the truth about Nature. If he's wrong and his colleague is right, sooner or later it will come out anyway. Since he has nothing to be afraid of, why should he attempt to reject a paper that could be right? b) If his comments are fruitful, this contribution will be appropriately acknowledged. c) If he writes crappy reports only to silence a colleague, there is a risk it will fall back on him negatively.

See what I mean? There's so many things that would be so easy to improve... Best,

I'm glad you're thinking about this so deeply. But I can tell you that for experimentalists in the US system there is already a firm and quantitative measure that hiring institutions, i.e. labs and university departments, use to judge scientists' productivity/value: overhead dollars, plain and simple. Anyone's value to the institution is ultimately determined by their ability to secure funding. The result is that the people who rise to the highest and most secure positions, i.e. tenured group leaders, are typically not those who have the deepest understanding or the most original ideas, but those who are the best managers and can best interface with funding agencies.

This system has a lot of drawbacks, some more obvious and some less so; in particular, it tends to lock everyone at all levels into carrying out exactly the funding agency's specific agenda, very much at the expense of original thought or even talking to people who work on other projects, to say nothing of other fields. At the same time, I don't really have a good alternate system to suggest that has any chance of being adopted.

Sorry about that. That was a blogger glitch. I prescheduled the post yesterday evening for today, but it went out immediately. (This has happened before.) I can then remove it from the blog, but not from the feeds. I did it anyway, thinking that those of you who use feeds will figure out what happened. Best,

I know. Even in theoretical physics, one's ability to bring in money matters for tenure. See what's going on here: hiring committees at universities export their judgement to funding agencies. Now funding agencies also don't decide off-hand who gets the bucks, they just mediate a peer-review process. That review process however suffers from many problems (one of which is financial pressure...). Now look at this mess. Is it any surprise the system suffers from unwelcome emergent trends that are hard for individuals to connect to their own behavior? As a result, nobody feels responsible for that shit. (There are exceptions to this, e.g. private institutes are usually not required to play by anybody else's rules. They frequently do anyway. That too isn't too hard to explain if you have to compete according to some international standard norm that tells you what "success" means.) Best,