Women Are More “Election-Averse” Than Men

In an effort to control for confounding factors that might affect the relative propensities for women and men to enter politics, we take the question of candidate emergence into the laboratory. We find evidence that women are election averse while men are not. This difference does not arise from disparities in abilities, risk aversion, or beliefs, but rather from the specific competitive and strategic context of campaigns and elections. The key features of our experimental design involve (1) an objective task that represents policymaking ability, (2) monetary rewards that ensure that all subjects, regardless of gender, face the same incentives to select a representative with the highest task ability, and (3) a comparison of alternative selection mechanisms. In Experiment 1, we find that men and women are equally likely to volunteer when the representative is selected randomly, but that women are less likely to be candidates when the representative is chosen through an election. In Experiment 2, we find that women’s election aversion persists with variations in the electoral environment; it disappears only when campaigns are both more truthful and less costly.

From a working paper by Kristin Kanthak and Jonathan Woon. This “election aversion” among women—even women in the high-status jobs that tend to feed into politics—is well-documented in surveys of potential candidates. See the work of Jennifer Lawless and Richard Fox (book, article). This simple experiment provides confirmation.

10 Responses to Women Are More “Election-Averse” Than Men

First, a triviality. If Figure 1 is an actual screen capture of the computerized interface presented to subjects, it’s perhaps worth pointing out that it manages to misspell one of the 25 words on the screen. While a 4% error rate is not the end of the world, it does not inspire confidence that other aspects of data collection and analysis have been performed as meticulously as one might hope.

The authors state (p. 12): “An individual’s decision to volunteer or not boils down to a comparison between her own score and the beliefs she holds about other volunteers’ scores.” Wouldn’t it be true, however, that the decision to volunteer might also be affected by how hard the subject wanted to work? If 2/3 of the payout is determined by the group representative’s score, it would not be irrational for a subject to coast in calculating sums, forgoing some of the payout she might receive by trying harder but still collecting significant payouts from the work of the group representative. A subject with that mindset would be less likely to volunteer to be the group representative, because if she were, 100% of her payout would be a function of how hard she worked. (I owe this insight to being a profoundly lazy person myself.) Since the experiment showed no gender differences in the rate of volunteering, effort minimization (i.e., laziness) should not affect the conclusions of the study even if it played a role in the decision to volunteer. But it does suggest that the statement on page 12 may be overly simplistic.
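To make the arithmetic of this coasting argument concrete, here is a minimal sketch. The paper’s actual payment parameters are not given in this thread, so the 1/3–2/3 weighting is the only number taken from the comment above; the scores and effort costs below are made-up values purely for illustration.

```python
# Hypothetical illustration of the "coasting" incentive described above.
# Only the 1/3 vs. 2/3 payout split comes from the discussion; all other
# numbers are invented for the sake of the example.

def expected_payout(own_score, rep_score, is_representative):
    """Payout when 2/3 of a non-representative's pay rides on the rep's score."""
    if is_representative:
        return own_score  # the representative's pay depends entirely on her own work
    return (1 / 3) * own_score + (2 / 3) * rep_score

def net_utility(own_score, rep_score, effort_cost, is_representative):
    """Payout minus a subjective cost of the effort expended."""
    return expected_payout(own_score, rep_score, is_representative) - effort_cost

# A subject who coasts (low score, low effort cost) while expecting a
# strong representative, versus the same subject volunteering and grinding:
coast = net_utility(own_score=5, rep_score=15, effort_cost=1, is_representative=False)
grind = net_utility(own_score=15, rep_score=15, effort_cost=5, is_representative=True)
# With these invented numbers, coasting yields the higher net utility,
# which is the mindset RobC describes.
```

Under these (assumed) parameters, the lazy strategy dominates, so the decision to volunteer would indeed depend on more than a comparison of scores and beliefs about others’ scores.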

At page 16, the authors postulate that gender differences in candidate entry might be due to different beliefs about how much competition to expect and to different beliefs about the informativeness of elections. Isn’t it also possible that gender differences in candidate entry were due to gender differences in competitiveness itself? In other words, do boys enjoy competitive games more than girls, and if so, might elections function as a kind of competitive game? The principle of parsimony would suggest that this possible explanation for gender differences in the desire to put oneself forward as a candidate be considered. Perhaps the authors dealt with this subject and I missed it.

At page 18, the authors discuss “an important difference in the distribution of scores. . . . [T]he variance in performance is quite a bit higher for men. There are a few men with very high scores, and women’s scores are clustered more tightly around the mean . . . .” There be dragons! It was a similar issue that got Larry Summers in hot water and led to the grovel-fest that was nonetheless unsuccessful in saving his job.

Finally, because this has gone on too long, we are informed (p. 10) that the experiments were conducted at the Pittsburgh Experimental Economics Laboratory at the University of Pittsburgh, with 130 subjects participating in Experiment 1 and 200 subjects (so far) participating in Experiment 2. If the authors say anything else about the nature of the subjects (other than their gender), I missed it. I’m guessing, however, that the subjects were students, probably most if not all undergraduates, and that their age range was very different from the general population of candidates for elective office. Very likely the unrepresentative nature of the subjects, in terms of age and probable geographic distribution, does not affect whether the results of the experiments can fairly be extrapolated to the wider population and provide insights into gender differences among electoral candidates in the United States. But doesn’t it deserve at least a mention?

As someone who uses and intends to keep using lab experiments, I find RobC’s final point frustrating, since it seems indicative of the laziness of potential reviewers. The concern has been dealt with multiple times, and even a basic Google Scholar search would turn up the Frechette paper on the behavior and performance of students in the lab, which itself reviews several studies that address this issue in particular. The superficial reading of the paper also suggests that RobC is more interested in long comments than in thoughtful critique, so it should not surprise me. And yet it does.

The random-period selection for payoffs deals with the “laziness” thing. The different tasks and steps get directly at the competitiveness of women vs. men. Larry Summers got in trouble for his phrasing and for comments made without specific reference to data; this paper notes an empirical fact from the experiment. But, you know, I’m sure they appreciate someone catching the typo.
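For readers unfamiliar with the design feature being invoked here, a quick sketch of why random-period payment blunts coasting. I’m assuming the standard setup in which exactly one period is drawn at random and paid; the paper’s actual payment rate and number of periods aren’t stated in this thread, so the numbers below are illustrative only.

```python
import random

def realized_payment(scores_by_period, rate_per_point=0.5, rng=random):
    """Pay for exactly one randomly drawn period, not the sum of all periods.
    (Assumed design; rate_per_point is an invented parameter.)"""
    paid_period = rng.randrange(len(scores_by_period))
    return scores_by_period[paid_period] * rate_per_point

def expected_payment(scores_by_period, rate_per_point=0.5):
    """With a uniform draw, expected pay is the mean per-period score times the rate."""
    return rate_per_point * sum(scores_by_period) / len(scores_by_period)

# Because any period might be the one that pays, slacking in even a single
# period lowers expected earnings:
steady_effort = expected_payment([12, 12, 12])  # full effort every period
coasting_once = expected_payment([12, 12, 4])   # coasts in the last period
# steady_effort exceeds coasting_once, so full effort maximizes expected pay.
```

The design choice here is the point: since a subject cannot know in advance which period counts, the effort-minimizing strategy RobC describes is costly in expectation in every period.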

While I don’t do experimental work, I don’t think RobC’s request that the authors disclose the nature of the experimental pool is out of line. Regardless of whether studies show that students behave like randomly selected members of the population, they are still not randomly selected members of the population. I expect people doing observational work to disclose the provenance of their covariates for things like democracy and development; a brief note about the experimental pool is not too much to ask.

And I ask this in true curiosity: is there decent research to suggest that 20 year olds at elite American universities produce identical experimental results to other populations? I’m only familiar with the work on the ultimatum game…

Kris Kanthak here. Thanks, all, for your interest in our paper. Couple of thoughts:
On the typo, yup, there’s a typo. We “inherited” that typo from the folks who did a similar experiment first. We can’t change it now, because doing so would violate all sorts of rules about good experimentation. You have to have stuff be the same from session to session. So if, in your first session, a capuchin monkey runs through the lab, you either have to throw out those data or let the damn monkey run around every time you run a session. Happily, we have a typo and not a monkey, for sanitary reasons.
On the laziness issue, I’ve seen the data: Subjects aren’t being lazy. But it makes sense that they aren’t. You, RobC, as a lazy person, just wouldn’t show up to our session. But these people showed up, knowing they would be doing some sort of task for money. And to be clear, EVERYONE has to do the mathematical task again, so you could either sit there for five minutes, twirling your thumbs, or just do the math and earn money. Part of the reason we pay you for doing the math is so that you want to try hard. We want to pay you enough so you care about how well you do, and we could tell pretty quick if people didn’t care. They do.
Next: Ah, yes, the Larry Summers graph (which is what we call it ourselves). Summers got into hot water for his impolitic discussion of that type of graph, not for its mere existence. This is why we note its existence, and then quickly move on.
On the undergraduate pool issue, Turing’s right: This is fairly well-trodden ground and I’m fairly comfortable about my position on that ground. As Turing says, Frechette finds very little difference between behavior of undergrads and behavior of professionals in experimental settings and has pretty extensive evidence to back him up, and that is but one example. The article is here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1939219 (You have to give RobC the LINK, Turing. He’s lazy.)

Thanks for your response. I did actually read the Frechette article, which provides some support for the use of undergraduates in economic simulations, but hardly disposes of the issue. In fact, I’ll see your Frechette and raise you one, an article by Oreskes, Shrader-Frechette and Belitz that states: “Fundamentally, the reason for modeling is a lack of full access, either in time or space, to the phenomena of interest. In areas where public policy and public safety are at stake, the burden is on the modeler to demonstrate the degree of correspondence between the model and the material world it seeks to represent and to delineate the limits of that correspondence.”

Your study does not involve public safety but it does involve public policy, so the authors’ advice may be appropriate here. But whether or not you choose to justify the use of undergraduates, let me return to the point I made in my original comment: shouldn’t you disclose it? What justification is there for the lack of transparency on this point?

I won’t waste folks’ time by debating the other points here, but let it be noted that my comment about laziness wasn’t intended to suggest that the participants were too lazy to participate in the study. Rather, the normal motivations that apply to man as an economic animal suggest that many people may seek to minimize effort (not to the extent of twirling their thumbs, but in how hard they push themselves to calculate the sums with maximum speed and accuracy), and the statement in your article that their decision boils down to a simple comparison of their scores to their beliefs about others’ scores fails to address that.

BTW, I may be fundamentally lazy, but not too lazy to read your paper, or to find and read the Frechette article, or to find and read the NSF award abstract concerning the $246,717 we taxpayers are paying for your research. Your condescension is hardly helpful here. Concerning academic norms, Andy Gelman wrote recently, “Listening to criticism and recognizing our errors isn’t just good scientific ethics, it’s also good for us.” As Gandhi responded when asked what he thought about Western civilization, that would be a very good idea.

RobC: Let me interject here to note that this is a working paper that was presented at a conference I attended, and I did not inform Kanthak or Woon that I was writing this post. Which is to say: this paper is not fully finished, and there is little doubt in my mind that the student subject pool would have been noted and discussed in any version submitted to a scholarly journal. Don’t read too much into the fact that it’s not mentioned or defended here — which may have as much to do with when I pulled the trigger and wrote the post as with anything Kanthak and Woon intended.

In terms of tone — and knowing Kristin a bit — I don’t think she is condescending to you, simply teasing you a bit using the same theme, laziness, that you raised in a bit of self-deprecation.

And, in fairness, let me also note that you extrapolated from a typo to make what was — to me at least — a snide remark about what that “4% error rate” does or does not suggest about the “meticulousness” of the rest of the paper. I bring that up to say: I don’t approve of condescension, but I think you can make your (perfectly reasonable) points with a milder tone and without hinting around about systematic malpractice on the authors’ part.

Just to throw my comments into the thread: I expect that Kanthak, even if (mildly) annoyed by Rob’s tone, probably appreciates his comments. When someone gives a reasonable criticism of one of my papers-in-progress, even if that criticism turns out to be misinformed or easily correctable, it often reflects some lack of communication on my part. In this case, for example, Kanthak and Woon might be motivated to add some footnotes addressing the typo, the question about generalizability, and other issues raised in the discussion. This would make the paper stronger and might prompt some additional thoughts on the authors’ part.

Super-polite comments are more pleasant to deal with than snarky comments, but on the other hand we have to recognize that commenters are commenting for free. If a bit of enjoyment in snark is their payoff for providing us with helpful feedback, that’s a small price to pay!

Overall, I much prefer responding to blog comments than responding to referee comments. Referees have the power and can be arbitrary in so many ways.

John and Andy, thank you for your comments. Perhaps my 4% error rate reference was a little snarky, but remember that the typo I was talking about was not a typo in an article or working paper, it was a typo in one of a very few data entry screens seen by participants, screens that I would have expected were carefully designed and agonized over by multiple pairs of eyes before being presented to subjects. It’s not clear to me from Professor Kanthak’s reply whether the mistake was made in the original Niederle and Vesterlund task or at a later point, in a study that she and Woon “inherited” but that isn’t discussed in the working paper. Her belief seems to be that the integrity of the experiment required the continuation of that error for the experiments described in the working paper. Perhaps I’m reading her wrong, because that seems very odd.

My only other snark, in my view, was toward the Larry Summers issue, and that was meant as a wry comment about the mishegas at Harvard and the political correctness in the academy that makes discussion of certain subjects “impolitic” (Kanthak’s term), not in any way as a criticism of the authors’ inclusion of the subject in their publication.

In other respects, I thought I went out of my way to be gentle, conceding as to one of my points that “it should not affect the conclusions of the study,” and as to another that it was a suggestion to be considered, and as to a third, “Very likely the unrepresentative nature of the subjects, in terms of age and probable geographic distribution, does not affect whether the results of the experiments can fairly be extrapolated to the wider population and provide insights into gender differences among electoral candidates in the United States.”

I didn’t attack the fundamental nature of the study, even though to be honest I have grave reservations whether these simple experiments do in fact provide any meaningful insights into the far more complex and interesting decisions of people deciding whether to run for public office, and–a subject perhaps for another day–whether whatever limited insights they do provide are worth the quarter of a million dollars they cost. And I worry greatly that the academic game of “telephone” over time strips away the limited nature of the experiments and treats the result as providing confirmation of election aversion among women (see John’s OP) and subsequently becomes authority for policy prescriptions. Is it far-fetched to imagine that in the future we’ll read someplace, “We need to reduce the costs of campaigns which have been found (Kanthak and Woon, 2013) to disproportionately cause women not to run for elective office.”?

In any event, my comments were of a far more limited and frankly non-threatening nature. You’re both generous in your interpretation of Professor Kanthak’s response. John has little doubt that a final paper would mention that the subjects were undergraduates and doesn’t think her tone was condescending. Andy thinks she probably appreciates my comments. I don’t know Professor Kanthak personally and can go only by what she wrote, which strikes me as defensive and more than a little ticked off; I see heels being dug in. Andy’s statement last week about academic norms is nice to hear, but based on what I’ve seen in comments on The Monkey Cage, it may well be a good deal more prescriptive than descriptive.

RobC: I suggested that these experiments “provide confirmation” of *other scholarly work* that relies on extensive *non-experimental* evidence (multiple surveys, in this case — see the links provided). In no way did I suggest that these experiments alone suffice to prove the case.

For what it’s worth, RobC, I interpreted Dr. Kanthak’s comments as playful and not condescending, perhaps because I have had the benefit of knowing her. And from my reading of her remarks, it appears that the spelling error was made in a prior experiment, was not caught, and cannot be changed “now” to protect the integrity of the experiment…although the middle part of that chain is more of an inference on my part.
