Gender discrimination in high energy physics ?

This is the subject of a recently uploaded arXiv preprint (filed under Physics and Society). The author, physicist and statistician Sherry Towers, carries out a detailed analysis of the productivity and career paths of a sample of 57 postdoctoral researchers (48 males and 9 females) in high-energy physics, working on the Run II D0 experiment at the Fermi National Accelerator Laboratory (Fermilab) during the 1998-2006 period.
Upon comparing the relative productivities (assessed through co-authorship of internal progress reports) of female and male postdoctoral researchers, as well as the rates at which researchers in both groups were invited to speak at conferences and eventually landed university faculty appointments, the paper claims evidence of systematic gender bias.
The above is quite an indictment, one to be taken very seriously, especially for a field of scientific inquiry, namely experimental high-energy physics, that prides itself on being as egalitarian and gender-blind as it gets.
The science enterprise can ill afford to exclude from participation more than half of the population, if it is to survive and thrive. Although I cannot really say that I have ever witnessed a blatant case of gender discrimination (GD) on the job, I have certainly no trouble believing that it exists. I have heard competent, brilliant female colleagues lament it; and there is really no reason why my discipline, physics, should be immune from something that is otherwise so pervasive throughout society.
But, is GD really practiced so openly that it can be detected quantitatively as easily as suggested in this paper, based on a sample of nine ? Is such a gloomy assessment of the current state of affairs, truly warranted by the data ?
I am not convinced in the least by the case expounded by the author. Some key assumptions underlying her analysis of the data are overly simplistic; I am puzzled by her claim of statistical significance of the findings, given the small size of the sample; most importantly, though, I find the charge of GD based upon the evidence at hand, exceedingly weak.
I may well be wrong, of course, in which case I thank in advance those who will set me straight. In the meantime, I am going to state my reasons for being skeptical.

The case
Of the 48 males in the sample, sixteen went on to take faculty jobs at the end of their postdoctoral appointments; four out of the nine female researchers did the same [0].
The author contends that female researchers in the sample proved, on average, significantly more productive than their male counterparts, during their postdoctoral tenure. Yet, they reaped significantly fewer faculty jobs than those to which their high productivity would have entitled them. Towers points to this as evidence of GD.
She also identifies conference presentations as a mechanism whereby Fermilab exercises its bias against women, unfairly enhancing the chances of less productive male researchers of landing a faculty position, as they allegedly enjoy a far greater share of allotted speaking slots than their female colleagues [1]. Towers maintains that a more equitable allocation of conference presentations among researchers in the sample (i.e., one commensurate with their individual productivities) would likely have resulted in additional female hires. A consequence of this reasoning is that fair hiring practices should ultimately reflect differences in productivity among researchers.
The connection between conference presentations and faculty hires may be worth exploring in and of itself, but is conceptually separate from the (more important) one of female career advancement. In order to assess the merit of any charge of GD vis-a-vis hiring, the first question that one has to ask is the following: Has the productivity of female researchers in Towers’ sample been insufficiently rewarded, based on the rate at which they landed faculty jobs ? Is a gender-blind hiring scenario (i.e., one in which researchers are preferentially hired based on productivity alone) statistically irreconcilable with the evidence presented in her paper ?

The first, obvious consideration is that, making allowance for the small number of female researchers in Towers’ sample, female researchers were hired as faculty at a rate about one third higher than their male colleagues (4/9, or roughly 44%, versus 16/48, or roughly 33%). Towers acknowledges this fact on page 7 of her paper, quickly pointing out that there is no “statistically significant difference” between the fractions 4/9 and 16/48 (as we shall see, Towers’ case of GD in hiring rests on the alleged “statistical significance” of the difference between 4/9 and 6/9). Be that as it may, Towers claims that their noticeably higher productivity should have been rewarded by an even greater number of female faculty hires.
It seems fair to state that her case is not self-evident. It necessarily requires a careful analysis of the data, as the raw numbers do not immediately raise suspicions of discrimination against women.
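As a rough check on the raw numbers, one can ask how often at least four of the twenty hires would be women if the hires were drawn purely at random from all 57 postdocs, with no regard to gender or productivity. The sketch below computes that hypergeometric tail probability using only the counts quoted above (57 researchers, 9 women, 16 + 4 = 20 hires). It is not the test used in Towers’ paper, and the function names are illustrative assumptions.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k 'successes' among `draws` items drawn without replacement)."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

def prob_at_least(k_min, population, successes, draws):
    """P(at least k_min successes) for the same hypergeometric draw."""
    upper = min(successes, draws)
    return sum(hypergeom_pmf(k, population, successes, draws)
               for k in range(k_min, upper + 1))

# Counts from the text: 57 postdocs, 9 of them women, 20 faculty hires in all.
p_four_or_more = prob_at_least(4, population=57, successes=9, draws=20)
print(f"P(at least 4 of the 20 hires are women) = {p_four_or_more:.2f}")
```

Under this purely random null the probability comes out near 0.39, so four female hires out of nine is an unremarkable outcome in either direction, which is consistent with the absence of a significant difference between 4/9 and 16/48.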

Measuring productivity
Towers introduces a measure of research productivity (P_l) equal to the number of Fermilab internal progress reports (papers) co-authored by the l-th researcher (male or female). The author reports the average values of P for male and female researchers (Table 1):
PF = 1.70 +/- 0.39 (average yearly productivity for female scientists)
PM = 1.38 +/- 0.17 (average yearly productivity for male scientists)
I have no idea of what those reports are, have never read (and likely never will read) any of them; it seems at least debatable whether a measure of this type is a reliable indicator of talent (or even actual productivity) [2]. But hey, what do I know ? I am no high-energy physicist after all, and the above measure is likely to be just as imperfect as any other. Moreover, it satisfies a basic criterion, namely it has no obvious intrinsic built-in bias (much less gender-specific). As we shall see, this issue turns out to matter very little, in the end.
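A quick way to see how far apart the two quoted averages are is to express their difference in units of the combined uncertainty. The sketch below assumes that the quoted +/- values are independent one-sigma Gaussian uncertainties, which the paper may or may not intend; it is a back-of-the-envelope check, not Towers’ own significance test, and the function names are illustrative.

```python
import math

# Averages and uncertainties as quoted from Table 1 of the paper.
P_F, ERR_F = 1.70, 0.39  # female researchers
P_M, ERR_M = 1.38, 0.17  # male researchers

def difference_in_sigmas(a, err_a, b, err_b):
    """Difference a - b in units of its combined (quadrature) uncertainty."""
    return (a - b) / math.hypot(err_a, err_b)

def one_sided_p_value(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z = difference_in_sigmas(P_F, ERR_F, P_M, ERR_M)
print(f"difference = {z:.2f} sigma, one-sided p = {one_sided_p_value(z):.2f}")
```

Under these assumptions the difference is about 0.75 standard deviations, i.e., well within what two samples with the same underlying mean would routinely produce.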

The above average values hardly point to a “statistically significant” difference given their relatively large uncertainties, especially if the small female sample size is taken into account. More cogent information must therefore be contained in the actual productivity distributions which, as Towers maintains, are very different for males and females. No actual data are shown in the paper; only a schematic description of such distributions is offered (page 9). Specifically, Towers divides the sample into four distinct groups:
a) One consisting of half of all males (24 researchers) producing “almost nothing”
b) A “moderately productive” one, comprising “slightly less than half” of all males
c) A group comprising all the women in the sample, of roughly equal, “high productivity”
d) A small group of “extremely productive” males
(incidentally: how do you pick your guys, high energy physicists ?).
The author contends, on the one hand, that the 24 “productive males” are at least as productive as the least productive female (page 7), and that the productivity distribution for female researchers is relatively narrow (page 9); on the other hand, she clearly makes the point that the productivity of the nine female researchers is, as a whole, “significantly above” that of the group of “moderately productive” males. Of course, it would have been a heck of a lot easier if she had actually shown the data, but it seems reasonable to conclude that, aside from a few “extremely productive” male outliers, the productivity distributions of productive male and female researchers largely overlap, the female one slightly but noticeably displaced toward higher values.

Discrimination against…
Towers’ claim of GD boils down to the following observation: four hires among women is significantly fewer than what strict application of productivity criteria would dictate. According to her, a gender-blind hiring process ought to have resulted in two additional positions for female researchers (page 15), a difference that the statistician Towers evidently deems too large to be accounted for by a mere statistical fluctuation [3].
Now, if GD were mostly at the root of the above outcome, wouldn’t you generally expect that every “extremely productive” male would get his faculty position, and that those positions (two, or, however many) which did not go to deserving women, would go preferentially to male researchers in the “moderately productive” group ? After all, the pool of productive male researchers was large enough to fill in principle all (20) available positions; even if the hiring process is indeed biased in favor of male applicants, should the “productive” ones not retain an edge over their “unproductive” colleagues ?
This is where Towers’ case begins to crack. From her paper we learn that just eleven of the twenty-four productive males landed faculty appointments (page 7). It is not even clear from Towers’ text whether all the “extremely productive” ones were successful (I suspect not). What is clear is that in almost a third of all male hires (five out of sixteen), “unproductive” individuals were selected. In other words, not only five out of nine “very productive” female researchers, but as many as thirteen “moderately” to “extremely” productive male researchers, and for that matter even nineteen “unproductive” ones, appear to have been discarded from consideration, for reasons ostensibly unrelated to what Towers refers to as “productivity”. If we restrict ourselves to “productive” researchers, then nearly equal proportions of women and men (five out of nine women and thirteen out of twenty-four men) may rightfully claim to have been victims of (gender-unrelated) “discrimination”, based on Towers’ productivity argument.

Who should have got those jobs
Obviously, there exists another explanation for the above outcome, one that does not involve any discrimination [4]. In general, raw productivity, no matter how measured, will be but one of many factors intervening in a hiring decision, and quite likely not the most important one. This is why, time and again, an “extremely productive” short-listed candidate walks away empty-handed, and one “less productive” (on paper, that is) gets the job. Given that Towers’ argument rests on statistical considerations, is it possible to make a simple model of the hiring process, in which the different productivities of the various candidates are taken into account probabilistically ?
Let us stick with Towers’ sample. A very simple model distribution, consistent with the information given by Towers, is the one shown here:

Model productivity distribution for Towers' sample, based on the information supplied in her paper.

The following simplifying assumptions are made:
1) All individuals in the same group are equally productive.
2) A clear-cut difference in productivity exists among all groups; in particular, every female researcher is assumed to be 25% more productive than every “moderately productive” male (an assumption manifestly at odds with Towers’ admission of substantial overlap between the productivity distributions of “moderately productive” male and female researchers, but one which we shall make nonetheless, for simplicity). The productivity of “unproductive” males is set to zero.
3) The “moderately productive” group includes 20 individuals.
4) The average female productivity is 2.0, that of the males 1.0 (in other words, we are taking a value on the high end of the uncertainty range for the females, and on the low end for the males). As a result, the few “extremely productive” males are twice as productive as the females.
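The four assumptions above pin down the model completely, and it is easy to verify that they are mutually consistent. The short sketch below does so using the per-group values that follow from them (1.6 for the “moderately productive” males and 4.0 for the four “extremely productive” ones, the same figures used in the simulation section); the dictionary layout and helper name are illustrative assumptions.

```python
# group name -> (head count, model productivity)
MODEL = {
    "unproductive males": (24, 0.0),
    "moderately productive males": (20, 1.6),
    "females": (9, 2.0),
    "extremely productive males": (4, 4.0),
}

def group_average(names):
    """Head-count-weighted average productivity over the named groups."""
    total = sum(MODEL[n][0] * MODEL[n][1] for n in names)
    heads = sum(MODEL[n][0] for n in names)
    return total / heads

MALE_GROUPS = [n for n in MODEL if n != "females"]
male_count = sum(MODEL[n][0] for n in MALE_GROUPS)
female_level = MODEL["females"][1]

male_avg = group_average(MALE_GROUPS)
female_avg = group_average(["females"])
# Number of males strictly less productive than every woman in the model.
males_below = sum(MODEL[n][0] for n in MALE_GROUPS if MODEL[n][1] < female_level)

print(f"male average   = {male_avg:.2f}")
print(f"female average = {female_avg:.2f}")
print(f"female / moderate male = {female_level / MODEL['moderately productive males'][1]:.2f}")
print(f"males below every woman = {males_below}/{male_count}")
```

The male average comes out at 1.0, the female one at 2.0, each woman is 1.25 times as productive as a “moderately productive” male, and 44 of the 48 males sit below every woman.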

In the absence of actual data, the above, crude model distribution is purposefully constructed to be most favorable to the author’s thesis, i.e., all of the aspects upon which Towers’ argument hinges are deliberately amplified. For example, female researchers are considered as a whole twice as productive, and each individually more productive than over 91% of males; the “one-sided significance test probability” P(female > male) (see page 8 of Towers’ paper; incidentally, this is not a particularly informative quantity, as it is insensitive to the magnitudes of relative differences in productivity), takes on a value of 0.99999994 for the above distribution, as opposed to 0.98 for Towers’ sample (Table I of her paper).
Yet, it is easy to show that, even with the above distribution, on purely probabilistic grounds there is no reason to attribute to anything other than a statistical fluctuation a number of positions going to women equal to four, of the twenty available.

Simulation
One way to gain some additional quantitative insight consists of “simulating the hiring season” on a computer. The only discriminating factor among individuals is assumed to be their productivity, i.e., the process is “blind” to anything else (to phrase it differently, any other factor only exerts a random influence). We begin by making the simplest possible assumption, namely that a difference in productivity proportionally translates into a different probability of being hired. That is, a researcher who is, say, 20% more productive than a colleague, is also 20% more likely than the latter to land a faculty job. We thus assign an a priori hiring weight to each researcher proportional to his/her productivity, based on the above model distribution, namely 0 for the unproductive researchers, 1.6 for each of the twenty members of the “moderately productive” group, 2.0 for each of the nine “very productive” females, and 4.0 for each of the four “extremely productive” males. Note how, by making a priori probabilities proportional to productivity, we build into the model a “fairness assumption”, i.e., none of the 24 “slackers” will get a job.
We then perform a Monte Carlo simulation, i.e., use a common random number generator to sample 20 individuals from the pool, each according to his/her own probability; any one candidate may be sampled more than once (multiple offers), but only “one offer is accepted”. The sampling stops as soon as all twenty positions are taken.
On repeating the procedure illustrated above many times, one can keep track of how many women land jobs each time, and construct a frequency histogram. The numerical implementation is straightforward.
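A minimal Python sketch of the procedure just described follows. It is not the code used to produce the numbers quoted in this post; the names (GROUPS, simulate_season, cumulative_distribution), the seed and the number of trials are all illustrative assumptions. Each candidate carries a weight equal to his/her model productivity, offers are drawn one at a time in proportion to weight, and repeated offers to someone already hired are ignored until twenty distinct people hold positions.

```python
import random
from itertools import accumulate

# (group, head count, relative hiring weight = model productivity)
GROUPS = [
    ("unproductive male", 24, 0.0),
    ("moderately productive male", 20, 1.6),
    ("female", 9, 2.0),
    ("extremely productive male", 4, 4.0),
]
POSITIONS = 20

# One entry per candidate, listed in group order.
IS_FEMALE = [name == "female" for name, count, _w in GROUPS for _ in range(count)]
CUM_WEIGHTS = list(accumulate(w for _name, count, w in GROUPS for _ in range(count)))

def simulate_season(rng):
    """Return the number of women hired in one simulated hiring season."""
    candidates = range(len(IS_FEMALE))
    hired = set()
    while len(hired) < POSITIONS:
        # One offer, drawn in proportion to weight; an offer to someone
        # who already accepted a job changes nothing.
        pick = rng.choices(candidates, cum_weights=CUM_WEIGHTS, k=1)[0]
        hired.add(pick)
    return sum(IS_FEMALE[i] for i in hired)

def cumulative_distribution(trials, seed=0):
    """Estimate f(n) = P(n or fewer women hired) for n = 0..9."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(trials):
        counts[simulate_season(rng)] += 1
    f, running = [], 0
    for c in counts:
        running += c
        f.append(running / trials)
    return f

if __name__ == "__main__":
    for n, value in enumerate(cumulative_distribution(10000)):
        print(f"f({n}) = {value:.3f}")
```

The returned list f holds, at index n, the estimated probability that a season ends with n or fewer women hired; the exact figures fluctuate slightly with the seed and the number of trials. Changing the weights in GROUPS is all it takes to explore other assumptions about how productivity translates into hiring odds.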

Before discussing the results, it is worth restating that the above distribution, utilized to carry out the simulation, is just a model one. The only reason for using it, is that no raw data are provided in Towers’ paper. Such a distribution, however, is based on all the (few) numbers, and the qualitative description given by Towers, with simplifying assumptions biasing it in favor of female applicants. It thus seems very unlikely that the results and conclusions would be drastically altered, if the actual distribution were used. If anything, the use of real data can be expected to weaken Towers’ case.
What is shown in the above figure is the computed probability f(n) that a hiring season will conclude with the hire of n or fewer of the nine women on the job market. Obviously, n can be as low as zero and as high as 9, with f(9)=1. As we can see, based on productivity alone as much as 44% of the time only five or fewer women can be expected to be hired, and four or fewer 18% of the time (in case you are wondering: the probability that at least one of the four “geniuses” will not be hired is 46%; 9% of the time at least two of them will not be hired… so much for “publish or perish”…).
An event that occurs more than one time in six (i.e., with an 18% frequency) cannot in fairness be regarded as “rare”; more importantly, if only slightly more sensible assumptions are built into the model, the predicted frequency of occurrence of what Towers regards as a suspiciously disappointing hiring outcome for female researchers (i.e., four or fewer), rises considerably [5]. Thus, there really seems to be no statistical basis whatsoever to allege discrimination toward women in the hiring process, at least based on Towers’ description of her own data, which do not appear statistically inconsistent with a gender-blind operation [6]. Based on this conclusion, one can infer that any imbalance in the allocation among researchers of conference presentations ostensibly had, at least on average, no measurable effect on the careers of the researchers.

Conference presentations
Before exploring the (narrower) issue of equity in the allocation of conference presentations, it is worth clarifying and/or restating the following:
1) Conference presentations, in and of themselves, are only a modest “reward” for one’s scientific accomplishments. The main objective of a postdoctoral researcher is not speaking at conferences, but rather landing a university faculty position. Towers’ case of GD is ultimately about jobs, not talks. That is, talks are only relevant insofar as they can be shown to affect the likelihood of being hired. A charge of GD based on conference presentations alone is not likely to be taken very seriously.
2) As usual, correlation does not imply causation. Very often, young researchers who speak at conferences are on their way to faculty jobs anyway, typically as a result of strong support from senior mentors (which is also likely why they speak at the conference in the first place). Thus, conference presentations merely reaffirm an existing state of affairs. Conversely, a conference presentation is unlikely to reverse the fortunes of anyone enjoying only lukewarm support from his/her postdoctoral advisor.

Towers’ data would appear to indicate that opportunities to speak at conferences were mostly granted to male researchers, but the validity of this contention is difficult to assess independently, as no raw numbers of conference presentations for female and male researchers are provided. Towers’ own “conference reward ratio” is misleading, as it assigns disproportionate weight to talks given by “unproductive” researchers. A more in-depth analysis may well show that, much like in the case of hires, “productive” male researchers were put at an equal or even greater disadvantage, thereby undermining the claim of GD.
But the most striking result by far, is that the correlation between conference presentations and faculty appointments is virtually non-existent for male researchers (Table 1). In other words, while the lion’s share thereof may have gone to male researchers, they derived no measurable benefit from conference presentations (Towers herself is at a loss explaining it).
This suggests that the issue of conference presentations is likely just a “red herring”, largely irrelevant to hiring. If one is to pursue this route any further, badly needed are a clearer sense of the importance attributed to presentations by postdoctoral researchers themselves, as well as a better understanding of the process by which conference slots are allocated.
It is not at all implausible, for example, that a productive, confident researcher with a strong CV, may not feel particularly urgent a need to speak at a conference. In fact, it is not at all rare for speaking opportunities to go to postdoctoral scientists with weaker research records, in an attempt to help them jump-start difficult job searches (this normally happens with the consensus of the group).

Generally speaking, it is clear that any action (deliberate or not) whose effect is that of depriving a researcher of the proper recognition for the work accomplished (including a chance to showcase in public his/her speaking ability) is unacceptable, and should therefore be prevented and remedied. At the same time, given that the impact of conference presentations on career advancement is unclear (to say the least), serious allegations such as “gender discrimination” and/or possible violation of Title IX regulations seem unwarranted, if solely or primarily based on conference presentations.

Conclusion
At the end of her paper (appendix), Towers laments the unwillingness of the administration of Fermilab to follow up with an investigation on her charge of discrimination against women, based on her statistical analysis. With all due respect for the academic and the statistician, I myself would not find in her data and analysis thereof, sufficient, convincing evidence of gender discrimination, even after accepting her many questionable premises and extrapolations (e.g., the notion that an “effective”, “productive” researcher is one who co-authors many reports, the ensuing inference that almost half of Fermilab postdocs spend their day surfing the web, the suggestion that, despite their lackluster research record, quite a few of these “parasites” will still manage to impress gullible search committees just by giving a conference presentation or two … hey, don’t get me wrong, I am all for making fun of high energy physicists but, in fairness, this borders on the ridiculous… it reads more like Dilbert than academia, or science).
Naturally, search committees, being composed of humans, are fallible. It appears from the data, however, that if mistakes were made (i.e., the possible hires of unproductive researchers), both the female and male populations in the sample were equally affected.

Now, that there is no evidence of hiring bias in this case, is obviously not the same as saying that there is no gender discrimination in particle physics, or all physics for that matter. Simply, it is not nearly as blatant as Towers suggests. In any case, a statistically convincing case has, in my opinion, not been built this time. Or, yet. Quite possibly, as the size of the sample at the disposal of researchers grows, clearer evidence will emerge and Towers’ thesis will be vindicated. However, I find her contention that a measure of productivity based on raw paper count should directly correlate with hiring, incredibly naive. Most reasonable scientists (male and female) would readily agree that evaluating a faculty candidate is much more complex and multi-faceted a proposition than that. Her apparent eagerness to build a case of discrimination upon such a shaky foundation is surprising.

The reason for the current under-representation of women at the faculty level, as it emerges from the above examination, seems to be merely the sheer outnumbering by men, a well-known, long recognized problem of modern science. Doubtless, discrimination of various forms has a lot to do with that; but I suspect that the most devastating kind is at work long before women reach college age. I am afraid it is very hard to try and find a remedy, that late into the game.

Notes
[0] No information is offered in the study, as to what fractions of female and male researchers in the sample actually did seek faculty appointments in the first place, how aggressively, how many offers were turned down, etc. (only a rather superficial comment is made, in a footnote at the bottom of page 7). It is not even clear to me whether the author regards any of that as relevant. These are clearly highly non-trivial aspects, if one is intent on building a case of GD within a certain professional sector. Everything else being equal, various societal pressures are widely believed to affect professional choices of women to a greater degree than men. Family reasons, for example, may induce a woman not to apply for a job, or turn down a job offer, more often than a man. The paper comes across as tacitly making the assumption that both male and female researchers ought to be regarded as equally free to pursue their career ambitions, and that any female under-representation at the faculty level, is largely attributable to more or less deliberate discrimination at hiring time, on the part of the scientific community.
[1] An invitation to speak at a conference constitutes an explicit acknowledgment of one’s leading contribution to a scientific project. Conference presentations are believed to be important for young physicists, typically more so than mere co-authorship of the article describing the research presented, especially in a field like experimental high energy physics, where collaborations involve hundreds of researchers, and spotting talent and creativity can be a complex proposition. By speaking at a conference, a researcher gains exposure to the broader community, conceivably strengthening his/her chances of a successful faculty job search. However, the actual impact of conference presentations on one’s job search is far from clear. Towers’ own data, for example (Table I), show at best a weak correlation.
[2] These numbers mean that most of the researchers in the sample co-authored between one and two internal reports per year. On page 11, Towers states that it is “not unusual” for a report to have “in excess of 20 to 30 authors”. Towers seemingly attributes no importance to the number of co-authors, i.e., each report is counted as one paper, for the purpose of assessing the productivity of a researcher.
Also, a distinction is made by the author, between “physics” papers, as opposed to “service” papers, the former focusing on scientific advances (presumably of greater prestige and impact on one’s career), the latter having to do instead with the overall operation of the facility (i.e., the laboratory), and therefore generally less qualifying from a professional standpoint. The author carries out separate statistical analyses for the two different types of papers (and related conference presentations), as well as a “global” one, in which all papers and conferences are counted equally (this is the one given in the text). The average “physics” productivity of males and females is identical within statistical fluctuations (0.78+/-0.22 for females and 0.72+/-0.10 for males).
[3] That’s right, Towers’ case of GD is based on a deviation of two from a target of six in a sample of nine… and, what is the basis for Towers’ estimate of six, anyway ? One would expect her to have simply ranked all researchers according to her measure of productivity (after all, her entire case is one of discrimination against productive individuals), and that six would be just the number of women ranked in the top twenty most productive researchers. Surprisingly, however, that is not how Towers arrived at her estimate (in fact, that information is not given at all, in her paper). Instead, she relies on her own parametric model of career advancement, which includes, besides productivity, also variables such as “socialization”, and is different for males and females (page 12). Aside from the doubtful reliability of such an approach, why use a model when actual productivity data are at hand ?
[4] My personal take is that Towers’ “productivity measure” does not really measure much of anything relevant to faculty hiring (e.g., overall scientific ability). This is in line with its relatively weak correlation with actual hiring (Table 1 of Towers’ paper), and explains the “anomaly” of so many supposedly “unproductive” researchers, who were deemed suitable faculty candidates by (reputedly competent) hiring committees. More generally, the only lesson to be learned from Towers’ work may be that the evaluation of the activity of a scientist through a mere paper count is misguided and misleading. This is, of course, neither novel nor surprising a concept.
[5] The simplest, most obvious correction consists of abandoning the draconian assumption of zero productivity for the twenty-four less productive researchers, assuming instead a small but non-zero value (say ten times smaller than that of the “moderately productive” males). Adjusting the weight of “extremely productive” males to 3.04 (in order to keep the average at 1.0), and leaving the weights of the remaining two groups unchanged, immediately brings to 27% the probability that four or fewer women be hired, i.e., over one in four times (note that we are still assuming female researchers to be on average twice as productive as male researchers; on taking a less severe 1.7:1.3 female vs male average productivity ratio, the four-or-fewer outcome probability goes up even further, to 34%, i.e., over one in three times). The relatively large size of the “unproductive” group has a significant effect on the outcome, even if individual hiring probability is small.
Now, as anyone who has served on a physics search committee knows, the odds of landing a position do not increase linearly with one’s productivity. In fact, “number of papers” essentially ceases to be relevant, above a minimal threshold below which a candidate is regarded as not productive enough. Other qualities, such as effectiveness in communicating (a faculty member will spend much of his/her time teaching), a well-defined research plan, strong letters of recommendation, will play a much more important role. Thus, a more realistic model is one that assigns essentially the same likelihood of being hired to all productive candidates. In that scenario, the probability that four or fewer women will be hired is 36%. One does not see how a credible statistically-based case of “gender discrimination” can ever stand on such odds, which are, of course, a direct consequence of the very small size of the sample.
[6] The numerical experiment described in note [5] yields additional interesting information. The simulation in which each member of the “unproductive” group is assumed to be ten times less likely to be hired than any of the “moderately productive” males, yields a probability for four or fewer women to be hired of approximately 27%, but as low as 0.1% for five or more unproductive male hires. This certainly points to a statistically significant deviation of the actual male hiring pattern from what one would expect based on productivity alone; that is, “unproductive” researchers were rewarded well beyond their “merit”. However, in our model this happens almost exclusively at the expense of other men. Obviously, this may be a sign of questionable, scarcely transparent hiring practices (but not of GD); more likely, though, it may merely reflect the dubious value of Towers’ productivity metric, i.e., the scarce importance attributed by search committees to the number of internal reports co-authored.
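The reweighting described in note [5] is simple arithmetic and can be checked in a few lines. The sketch below solves for the weight of the four “extremely productive” males that keeps the male average at 1.0 once the twenty-four “unproductive” males are given one tenth of the “moderately productive” weight (0.16 instead of 0). The function name and its parameter names are illustrative assumptions.

```python
def rebalanced_top_weight(target_avg=1.0, n_low=24, w_low=0.16,
                          n_mid=20, w_mid=1.6, n_top=4):
    """Weight of the top group that keeps the overall male average fixed."""
    n_total = n_low + n_mid + n_top
    # Total weight required, minus what the two lower groups already supply,
    # shared equally among the members of the top group.
    return (target_avg * n_total - n_low * w_low - n_mid * w_mid) / n_top

print(f"extremely productive weight = {rebalanced_top_weight():.2f}")
```

With the default arguments this gives 3.04, the value quoted in note [5]. Feeding the modified weights (0.16, 1.6, 2.0 and 3.04) into a weighted-sampling simulation of the hiring season is then a matter of changing four numbers.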

15 Responses to “Gender discrimination in high energy physics ?”

Yep. That is a question someone should investigate. Basically, I can think of three possibilities:

1) Many of the 24 are not and were never expected to be productive, but were in fact hired as technicians. In that case, one should review the policy of hiring technicians via postdoc channels. (The fact that they didn’t hire any woman for this would still be interesting, btw.)

2) Massive affirmative action for doubtful male applicants. 2b) It has to do with some other quotas that need to be filled, like country of origin or some professor’s mafia (Massimo, feel free to substitute your favorite local variant for mafia). And this group happens to be male and not very good.

3) Nine women isn’t many, so it could be a statistical fluke. Half of the men are unproductive, 9/2 ≈ 4, which gives a standard deviation of 2 for independent counts. Assuming that (i) one can occasionally get more than one standard deviation, (ii) any statistical argument based on numbers like 2 or 4 is shaky, and (iii) the counts might not be independent, getting a 0 instead of 4 is possible. But not all that likely, so we arrive at the preferred ending: “More data needed.”

Re: (incidentally: how do you pick your guys, high energy physicists ?).

Agree with everything you say. However, I would like to stress how, in the original study, the tacit assumption is made that every person in the sample would have applied for the same jobs, and would have been ready to take any offer anywhere. Is this realistic ?
We are talking an incredibly small sample. It is quite possible, for example, that some of those women may have been married to some of the men (it is not uncommon in our field to choose a colleague for partner), and this could have conceivably greatly limited their freedom of taking jobs anywhere. If we are talking even just one or two of them, it’s a huge fraction of a 9-person set.

Also, as you correctly opine, it is quite possible that expectations of “productivity” were not the same for everyone. Otherwise, how did those five lazy ass technicians get faculty jobs ? If it was just mafia at work, then many men were also discriminated against, not just the women.
More likely, the “productivity measure” is bogus, and the entire approach flawed.

This post has been brought to our attention by the Thought Crime Algorithm of the Masses. You have been identified as being in dire need of immediate rehabilitation.

This is an unacceptable analysis – the incorrect conclusion has been reached. Whenever any question of discrimination arises, the answer must always be ‘yes’. Failure in this regard will be taken as evidence of either a nefarious agenda, or a lack of compassion and lack of respect for Diversity. We demand this analysis be remedied immediately by the inclusion of the following as your conclusion:

“Regardless of these gamed statistics which give the wrong answer, the whole issue could be put to rest using a strict quota system to ensure a complete adherence to social justice for all oppressed peoples.”

You have clearly escaped the recent advances made in public school curricula, so we recommend a crash program of reeducation. Consult the following resources:

Towers introduces a measure of research productivity, defined as $P_l = \sum_i (1/N_{il})$, where the sum runs over all Fermilab internal progress reports (papers) co-authored by the $l$th researcher (male or female), and $N_{il}$ is the number of co-authors of the $i$th such paper.

Maybe you can help me because I can’t find that definition in the paper (version 3). To the contrary, Towers states:

We thus use the number of internal papers authored or co-authored by each researcher in our sample as a measure of their productivity.

I read that as 1 point every time your name appears in the author list, no matter how many other names there are.

That was my main criticism of Towers’ preprint, because she later shows that women appear on papers with lots of other scientists while men are more likely to publish alone or in smaller groups. That would undermine the claim that women are really more productive (for example, if you count productivity the way you do, which is a lot more intuitive).
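
To see how much the choice of measure can matter, here is a minimal sketch comparing the two ways of counting. The function names and the two researchers are hypothetical; each report is represented only by its number of authors.

```python
def straight_count(author_counts):
    """One point per report, no matter how many names are on it."""
    return len(author_counts)


def fractional_count(author_counts):
    """Each report contributes 1/N, where N is its number of authors."""
    return sum(1.0 / n for n in author_counts)


# Hypothetical researchers, for illustration only.
large_groups = [10, 10, 10]  # three reports, ten authors each
small_groups = [1, 2]        # one solo report, one two-author report

for name, reports in [("large groups", large_groups), ("small groups", small_groups)]:
    print(name, straight_count(reports), round(fractional_count(reports), 2))
```

By straight count the first researcher comes out ahead (3 versus 2), while by fractional count the second one does (1.5 versus 0.3), so the ranking can flip depending on the definition.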

You are absolutely right. Thank you for pointing this out. My bad.
I am going to revise my article and add a note. I guess I must have been so convinced that what she meant by “productivity” was my “definition” that I ended up misreading.
Now, in principle it may not have mattered, because, to me, there was no a priori reason for assuming that women would appear in large collaborations more often than men, which would render “my” measure and hers sort of equivalent.

Indeed, as you say, her own socialization data suggest that “men are more likely to publish alone or in smaller groups” (although I am not sure how seriously to take that claim, given the large uncertainties), and this may alter the picture substantially, to the point where her case could completely fall apart. However, I still maintain that it is statistically very weak, even if we assume equal likelihood of men and women to be part of large collaborations.

this may alter the picture substantially, to the point where her case could completely fall apart.

It may not collapse altogether because it is interesting in itself where those differences come from. That may even be a hint of existing discrimination. But that has to be looked into, the data alone don’t seem to prove it.

Is there any way you can translate your article ?

I’d rather not. It’s rather long, and writing in English doesn’t come as easily to me as it should. You could check the Google translation, but I’m not sure if that’s comprehensible at all (and it’s probably prone to misunderstandings).

Yeah, I cannot really follow the translation.
Going back to productivity, one of the reasons why I was initially confused is that, if we just take a straightforward paper count, then it seems to me that for the most part we are talking between one and two papers per person. These are rather small numbers.
Is it fair to call someone “productive” who has co-authored one or two internal reports (possibly as part of a large group), concurrently stating that someone whose name does not appear on any internal report has contributed “almost nothing” ? Does it really work that way ? This seems awfully simplistic…

It is also unclear to me whether men were indeed allocated significantly more speaking slots than women. Towers’ own “conference reward ratio” is misleading, as it assigns most of the weight to presentations given by “unproductive” researchers.
For example, consider a hypothetical situation in which 4 men and 5 women each publish 9 papers, and one extra man publishes nothing. Suppose all ten give an invited talk. In this case, the average conference reward ratio (eqns. 1 and 2) works out to be 0.1 for women and 0.28 for men, but to conclude from that that “men are rewarded almost three times more than women” would be absurd.
All we would be talking about, in that case, is just one unproductive person giving a talk, something for which there could be many reasons, not necessarily having to do with “discrimination”.
It would be more useful to have raw numbers of presentations.
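
The arithmetic of the hypothetical example above can be checked with a short sketch. It assumes that each person’s reward ratio is talks / (papers + 1); this form is an assumption, chosen because it reproduces the 0.1 and 0.28 quoted above, and should be checked against eqns. 1 and 2 of the preprint.

```python
def reward_ratio(talks, papers):
    """Assumed form of the per-person ratio: talks / (papers + 1)."""
    return talks / (papers + 1)


def mean(values):
    return sum(values) / len(values)


# Hypothetical sample from the example: (talks, papers) per person.
women = [(1, 9)] * 5
men = [(1, 9)] * 4 + [(1, 0)]  # four productive men, one with no papers

women_avg = mean([reward_ratio(t, p) for t, p in women])
men_avg = mean([reward_ratio(t, p) for t, p in men])

print(round(women_avg, 2), round(men_avg, 2))  # 0.1 0.28
```

The single man with no papers contributes a ratio of 1, ten times that of everyone else, which is what drives the men’s average up.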

I recently went through a faculty job search. I am a man. I was repeatedly told that there were dozens of applicants for each position to which I applied. I was told that if I was a woman I would be hired, because the departments were all gender-imbalanced (by which, they meant, that the gender balance of the department reflected the field at large). But I am a man and, as a result, the departments had a responsibility to find a qualified woman for the position.

I was also told, each time, that the gender-based search was purely informal and that the statement would be denied if I repeated it.

I don’t usually read this blog, but ended up here after reading the article. I have found the comments and forums related to the article far more informative than the article itself! When I got to this comment, I felt I had to reply, having myself been on many faculty search committees. Let me identify myself as female and a professor of physics. Let me also say that I think Okham’s analysis is thorough and mostly right. However, with respect to the comment above:

I very much suspect that statements of the form: “If you were a woman, we would hire you.” were made in a confidential manner to you in order to make you feel better about not getting the job. They are not factual. It is unfortunate that such statements are likely to be believed by most male physicists looking for a job, which simply makes them even more effective as a salve. In fact, it is really quite clever. There are many incentives in place to hire women, so officially you can make the statement that you are actively searching for qualified women for the position. However, qualified women routinely get ranked lower than men for the following reasons.

Many of the confidential letters of recommendation for qualified women dwell on personality and degree of assertiveness (either too much or too little), rather than scientific accomplishments. This personality rating is then used to say either that they will not be leaders in the field (not assertive enough) or that they may be difficult to work with (too assertive). Being “just right” is an extremely narrow window. The letters for male applicants match potential leadership qualities with their work, instead of their personality.

I have watched faculty meetings which drop qualified women to the bottom of the list due to vague comments about not fitting in, or doesn’t act like a physicist. Such comments might have merit if they referred to experimental style, teaching, or thinking. However, these comments, upon later elucidation, refer to her clothes, her persona, her… femaleness.

Once low in the ranking, the woman candidate is packaged up as a “member of the minority pool, who was interviewed, but didn’t make the cut”, and this satisfies the University rules about affirmative action (or whatever they call it these days).

One other point. Gender Imbalance in a department is related to gender imbalance in the pool of applicants, something tracked by APS and AIP via number of female graduates and postdocs on a year by year basis. Therefore a Physics Dept with only 5% women is actually doing better than the pool and is NOT gender imbalanced.

Kidding aside, let me clarify, for the record: I do believe that we have, in physics, a problem of under-representation of women and minorities, and I think it is hurting the field as a whole. I also think that a sweeping indictment like that made by Towers can do a lot of damage, by inducing many potentially interested bright young women to stay away. And, while they should certainly be warned about and aware of existing discrimination (which exists in all strata of society anyway), purposefully making things sound worse than they are never does anyone any good.

Secondly, with reference to the comment to which you have replied: I see no direct connection between the case discussed in the preprint and attempts on the part of universities to diversify their pool of faculty, which, to me, can be justified on various grounds.

I have watched faculty meetings which drop qualified women to the bottom of the list due to vague comments about not fitting in, or doesn’t act like a physicist.

Unfortunately, I have witnessed exactly the same, at least once. It’s the dreadful “I have the gut feeling that she is not…. and that he may be more… even though he hasn’t published as much…”. But it is not always like that, and the target is not always a woman.

Now that I have broken through the wall of ten comments (still, more than half of them are probably mine, but that’s just a detail), I expect major advertisers to approach me with lucrative contracts. The ensuing commercialization of my blog, with me becoming rich and famous, will, alas, also be the cause of its decline. Britney Spears’ divorce, or Michael Jackson’s latest facial surgery, will take the place of referees and statistical analysis, in order to increase the blog audience. Depressing, I know…