Is there a conflict?

The June issue of Physics Today included the letter “Sexism may be in the eye of the beholder,” in which Richard Wolfson explains why he chose not to use Leon Lederman and Christopher Hill’s book Quantum Physics for Poets in the classroom:

Much as I liked the book, in the end I chose not to adopt it. My reason was the very example the reviewer touts as an instance of Lederman’s engaging writing: the image of a reader peering in the window of Victoria’s Secret while Lederman and Hill enlighten him—and it is clearly a him—about wave–particle duality. Read the cited passage in all its detail and it isn’t hard to draw several conclusions about how the authors, perhaps subconsciously, view their readers as male; as drawn, in a slightly voyeuristic way, to Victoria’s Secret; and as thinking highly of their own sexual allure.

How would a female student react to Lederman and Hill’s example? Would it make her feel included among those interested in physics? Would it make her comfortable in the presence of male physicists or her fellow physics students? I think not. Had this example occurred just once, I might have let it go and adopted the book. But Victoria’s Secret is mentioned every time the wave–particle duality comes up—which is frequently in this book on quantum physics.

Lederman and Hill had a response in the same issue, saying that there is no problem with the book because Victoria’s Secret is a common store nationwide and both men and women look in the windows. They conclude with:

We are inclined to disagree, however, with Mr. Wolfson’s conclusion about the effect of the Victoria’s Secret windows metaphor on our female readers: We have done the experiment of taking the risk, and we have not received a single complaint thus far from anyone else that our book is sexist.

(I should also note, since it’s such a pet peeve, that they begin their letter by pointing out that in a previous book they utilized the story of a great female mathematician, Emmy Noether. As though that makes one immune to all future criticism?)

According to a blog post by Ed Bertschinger, after this first letter another reader did complain, and was told by Hill that the example would be changed in future versions. Says Bertschinger, “Is it possible to eliminate the implicit bias that fails to see how one’s cultural metaphors exclude others? Sometimes I think that solving this problem is much harder than solving the many-body Schrodinger equation. Astronomers and physicists like intellectual challenges. This one is worthy of our sustained effort.”

I agree that this is worth our effort, but I can also see how those casually reading the letters might misinterpret the intentions. I can hear the moans and groans already… “All this attention over one little example? What are you advocating here, a boycott, a book-burning party?” No, nothing so extreme. I’ll first clarify what the example was, and then why I liked Wolfson’s response.

After reading the two letters, I was curious for myself if the example really assumed a male reader or not. Simply mentioning Victoria’s Secret is not necessarily problematic in itself. So, I checked out a copy of the book. The example is first mentioned in an introductory section (in my 2011 copy, on page 26), and in my view, it definitely assumes a male reader. The nail in the coffin was the description of how most of the photons “reflect off your face and pass right through the store window, providing a clear image of you (handsome devil!) to anyone who happens to be on the other side of the window (the window mannequin dresser?).” The phrase “handsome devil” definitely describes a guy. Sure, technically the word “handsome” could describe a woman, but that’s not typically how the word is used in our culture.

“Oh no, heaven forbid, freak out, one phrase in a three hundred page book!” someone out there over the Internet is sarcastically gasping. Yeah, I get that. It is not at all shockingly offensive. As with plenty of other “controversies” about women in science, it’s not that offensive, more like eye-roll inducing. And as Wolfson points out, the rest of the book is great, though this example keeps popping up. The point is this: given the choice between two books for your course, why choose the one that you know will exclude some of your students? And this assumes you only have a choice of two books; in reality there are many more. Wolfson made his choice, and made known why. Though the authors originally denied any problem, it sounds like they will now be changing it. Hey, that sounds to me like… progress!

This is the kind of activism that I like to see. Sometimes as an advocate for women in science, I encounter some knee-jerk reactions to my (perceived) goals, like I’m the PC police here to ruin everyone’s fun. This kind of activism called out a problem, and now it’s being fixed. Little steps here and there get us farther towards a goal of helping everyone feel included. It doesn’t happen overnight, and there will be problems along the way, so we sort them out as we encounter them. I think an important distinction in this process is to separate out “you are sexist/racist/ableist/whateverist” from “that statement was sexist/racist/ableist/whateverist.” It’s not that this book or these authors are sexist, and obviously I can see why an accusation of that would be hurtful and cause a defensive reaction. Wolfson didn’t do that; he pointed out that this example was written assuming a male reader, which could make a female reader feel excluded. Despite all the other positives with this book, he chose a different book to avoid that negative. We should all be responsible for the things we say and do and have the ability to say “you’re right, that was my bad.” I would expect others to be able to call me out for language that illustrates an implicit bias (and people do); that doesn’t mean I’m a bad person. We all have implicit biases, so we should all be prepared to own up when they come out unexpectedly.

This reminds me of my process for our end-of-semester course evaluations (though it’s been a while since I filled one out). We are asked to evaluate our instructor’s respect for all students regardless of a long list of categories (e.g. race, sex, age, religion) on a scale of 1-6. If you say or do absolutely nothing offensive or eyebrow-raising throughout the entire course, you get a 5 from me. That’s the default. To earn that 6, you’d need to do one “small act of activism” throughout the entire course. Use the pronoun “she” instead of “he” when talking about an example one time in the semester, you’ve got a 6. One. Time. Any one statement or action that recognizes a non-traditional group will make the cut.

We should all be striving for that 6. Pointing out a perceived bias does not have to make you the fun police, and owning up to a bias does not make you a horrible person. It just means we’re all slowly making progress towards the same goal.

A short piece by Ed Rybicki, published in Nature, has been making the rounds online and has caused quite a storm. People have been both vehemently opposing the article and also strongly defending it. It’s called “Womanspace” and is an anecdotal story about the differences between men and women shopping.

The comments section of the article is certainly very popular, so check that out too. Before I write down my thoughts, I wanted to give an opportunity for some folks to share their thoughts here. I’ll add mine soon.

A friend pointed me to an interesting new article in Physics Today titled “Problems with problem sets” (sorry, it’s behind a paywall). The abstract states:

Undergraduate physics problem sets and textbook examples often assume prior knowledge that is more common in men than in women. Could that difference be deterring women from pursuing careers in physics?

Their argument is based on a review of physics problems that start off like this example:

The 200-kg steel hammerhead of a pile driver is lifted 3.00 m above the top of a vertical I-beam being driven into the ground. The hammerhead is then dropped, driving the I-beam 7.4 cm deeper into the ground. . . .

If you’re not familiar with those construction terms, it’s more difficult to approach this problem. Their argument is that due to gender socialization, men are more likely than women to be familiar with these terms, and therefore one group of people is being put at a disadvantage. If you can access the article, I highly suggest reading the whole thing, because the writers do an excellent job of laying out their argument and addressing the caveats.

They suggest, for the example above, changing the wording to this:

A cylindrical rod is being driven into the ground by a machine that drops a heavy weight on it, lifts the weight, and drops it again. Such a machine, called a pile driver, is frequently used in construction projects.

My first thought when reading the article was: yes. There was definitely a difference between the background that I brought to my college physics experience and that of my male classmates; I think in my circumstances, it played out much more in the lab than in problem sets. I went to a small school that did not use the same textbooks as other intro physics courses, so I was personally unfamiliar with the wording of the problems presented here. Glad I was spared that, thank you Professors! Still, this issue reminded me of what I previously wrote regarding Girl Scouts vs. Boy Scouts, as one anecdote for how gender socialization could impact how one reads these physics problems. Beyond that, I’m assuming readers are somewhat familiar with how gender socialization could be important here.

But aside from my reaction, I was a bit surprised by the reactions of some other folks, so I wanted to address potential issues here.

First, it’s important to remember that to say that these problems are biased towards men does not mean every man is at an advantage and every woman is at a disadvantage. It means that with a large enough sample, you’re going to have more men familiar with the terminology than women. Every person has their own unique background, but on the whole boys are more likely to have participated in activities and played with toys that give them knowledge of tools, construction, engineering, etc. So while many men can say “I have no idea what that means!” and many women can say “Duh of course I know what an I-beam is!” the point still stands.

Second, gender is admittedly but one of many reasons why a person would or would not be familiar with construction terms. You can talk about class differences, age differences, differences in whether or not you had mono the week that your teacher talked about certain tools in shop class in 11th grade, whatever. To recognize the existence of other groups potentially biased by these questions does not negate the fact that gender can play a huge role. Furthermore, that’s an excellent argument for changing the wording, because the questions should be clear to everyone! The article is not saying “We need to change the wording so women will understand,” it’s saying “We need to change the wording so everyone can understand, and we’ll help a lot of women in particular, among others.”

I would also want to stress that obviously, you’d have a hard time finding a person who would say “I quit my physics major because I didn’t know what a pile driver was.” Some people will get by just fine picking up these things along the way, but others won’t. The reason why is that things like this can add up to the feeling of “I don’t belong here in this class” or “I don’t have the ability to succeed in this.” Studies have shown that women are often less confident in their math abilities than their equally talented male counterparts, so even receiving the same grade on a confusing assignment can result in the feelings I mentioned. Again, this is statistically speaking, on the large scale. Though I would also point out that it’s important to address the problem from the other direction, by working to improve confidence!

Finally, one person said that if you fix this bias, you’ll just introduce another bias. When I called this out as fatalistic, they argued they were just being cautious, that you’d have to make sure the new bias is less than the old. Well, sure! Obviously no one would advocate for introducing new biases! But that’s not a reason to stop the process of making sure problem sets are accessible to everyone (given, of course, the math and science pre-reqs of the course, which seem fine in the rewritten example above). I would obviously advocate caution when writing such problem sets, and many educators have studied this issue extensively, so that prior knowledge should be applied as well. This is simply adding more knowledge to that field. Will we have to continue to study how problem sets are written? Of course, but that’s not a reason to not act upon this observation.

Overall, though this was not a “scientific study” but more of a set of observations and suggestions, I thought it was useful for current and future physics teachers to keep in mind.

I recently attended a presentation by a representative of ETS (Educational Testing Service, the company that runs the GRE, TOEFL, and Praxis tests) about their most recent changes to the GRE. The audience was those of us at the university involved in the graduate admissions process, so that we could better understand the new scores coming in this fall, what exactly the GRE is evaluating, and also a bit about the TOEFL and how it is administered. In addition to addressing the GRE and TOEFL, the company’s representative also introduced us to the new Personal Potential Index (PPI) that is being offered by ETS as a “stand-alone” product.

The PPI is intended to measure non-cognitive skills that have been shown to be correlated with success in graduate study: “knowledge and creativity, resilience, communication skills, planning and organization, teamwork, and ethics and integrity.” At first I thought, is this some kind of “personality test,” where people are just going to answer the way that they think people want to hear? Actually, it’s essentially a website where you choose your evaluators (just like you would choose your letter of recommendation writers – seems like they’d be the same people), and they rate you on 24 statements, which you can read here.

For each statement, students are rated on a 5-point scale. Of course, it’s a bit “top-heavy,” since (hopefully) few people agreeing to recommend a student would rank them poorly. The choices are: below average, average, above average, outstanding (top 5%), and truly exceptional (top 1%). Evaluators may also choose “Insufficient Opportunity to Evaluate.” Then they can also add comments for each category.

For the student, if they waive their FERPA rights, then they do not see the results but can choose who among their reviewers they include in their report (though it is not clear to me, if you cannot see the results, why you would not use someone’s evaluation). The report can be sent to 4 schools if you’ve taken the GRE (and if you’re using this system, you probably are applying to grad school and so you probably already have paid for the GRE), and additional schools for $20 each. I am not a big fan of increasing financial barriers to grad school, but at least including 4 reports with the price of the GRE helps.

That’s how this new product works. No one at our university said they used it, and the representative from ETS named maybe two or three universities that had at least one program require it. So clearly, it’s not a “big thing” yet, but perhaps it will be in the future.

The question is, should it be? Like many in the audience (at least, the very vocal ones behind me), I was pretty skeptical about the PPI. Can you really narrow down a person’s “potential” to succeed with a simple number? Isn’t that half of my beef with the GRE and other standardized tests anyway? I also couldn’t shake the suspicion that every evaluator is just going to rank their student in the top 1% anyway, so students are going to be stuck paying $20 to distribute a report that has just as high of a score as everyone else’s.

But then I started thinking about this in the context of recommendation letters, which this service is meant to – well, I’m not sure if it’s meant to supplement or replace them. And I remembered how unconscious bias has been shown to come into play often with recommendation letters, and that this can unfairly (and unintentionally) harm the career prospects of women. (I would also be interested to hear if such bias has been found with other cultural groups, especially if there is a disconnect in understanding between the writer and the student.) For those not familiar, I’m referring to the 2003 study by Trix and Psenka and the 2009 study by Madera, Hebl, and Martin. MHM09 stated “(a) that women were described as more communal and less agentic than men (Study 1) and (b) that communal characteristics have a negative relationship with hiring decisions in academia that are based on letters of recommendation (Study 2).” TP03’s abstract states:

Letters written for female applicants were found to differ systematically from those written for male applicants in the extremes of length, in the percentages lacking in basic features, in the percentages with doubt raisers (an extended category of negative language, often associated with apparent commendation), and in frequency of mention of status terms. Further, the most common semantically grouped possessive phrases referring to female and male applicants (‘her teaching,’ ‘his research’) reinforce gender schema that tend to portray women as teachers and students, and men as researchers and professionals.

When letters focus more on the “concrete” signs of success (publications, a take-charge attitude), as they do for males, they tend to be valued more highly than letters which address more social or personal aspects, as recommenders tend to do with women. Remember, these are subtle changes being analyzed in the letters, and they’re not because the women had no accomplishments to talk about (these were all successful applicant letters). It’s all about what first comes to mind, how you choose to phrase things, what you emphasize, etc.

Given that letters of recommendation can be subject to bias, could a well-researched, quantifiable, and standardized recommendation system (still with room for personal comments and insight, of course) be more fair? One issue with standard letters of recommendation is that they are so open-ended, and it’s in those situations where unconscious bias and schemas are most likely to be used. The University of Michigan’s STRIDE group (Committee on Strategies and Tactics for Recruiting to Improve Diversity and Excellence) emphasizes focusing on multiple, specific criteria during evaluation (in the specific case of hiring). They offer a “Candidate Evaluation Tool,” kind of a worksheet, where evaluators have to specifically mark the candidate’s potential in a variety of specific areas. This avoids “global judgements,” where biases are more likely to come into play. Forcing an evaluator to sit down and analyze based on specific criteria has helped those who may have been harmed by “snap judgements.”

In the case of letters of recommendation, are they the “global judgements” whereas the PPI can serve much like STRIDE’s Candidate Evaluation Tool? Not only does this assure that important aspects of an applicant’s success aren’t forgotten, but it also means flaws can be honestly pointed out (a letter writer probably won’t mention a candidate’s poor ability to take criticism, but could honestly mark them as “average” when specifically asked). Not all bias can be removed, if in years of interactions, evaluators still focus more on communal aspects with their female students. But it’s a step that forces evaluators to think about specific aspects of personality, one question at a time.

On the other hand… changes aren’t implemented in a vacuum. If you implement any changes in hiring/evaluation, it’s important to train the people using them. Those responsible for admissions, mostly research professors, will probably not receive such training in the PPI, if it were to become a new standard. I worry that some would just focus on scores for the statement “Is among the brightest persons I know,” and just forget about teamwork, integrity, and resilience. Remember, that was one of the problems mentioned in MHM09, that these aspects were not particularly valued (that study was specifically in medicine). Will the PPI be used as intended, or largely ignored as “irrelevant?”

Even after the amount of thinking I had to put into this issue just to write this post, I’m still on the fence. On the one hand, this is just another product that may cost applicants more money, but on the other hand the idea is very consistent with my understanding of the best practices in evaluation. The more specific the evaluation criteria, the less likely it is that schemas and unconscious bias will creep in, as they do with open-ended evaluations. If the criteria are not valued by those using the evaluations, however, the PPI will not have gained anything for graduate school applicants.

I’m curious what others think. Could the PPI be helpful, harmful, or neither? If the idea is on track but the implementation is lacking, what could be done to promote its usefulness?

Although the research described in this article about Boy Scouts and Girl Scouts is not that radical, it did bring up memories of my childhood experience. What was surprising about the article to me was not necessarily the different content of the badges, but that even similar badges had different names between the Boy and Girl Scouts:

Denny found boys’ badge titles use more career-oriented language (such as Engineer, Craftsman, Scientist), whereas girls’ badge titles consistently use more playful language with less of a career orientation. (Instead of the boy’s “Astronomer,” the comparable girls badge is called “Sky Search.” Instead of “Mechanic,” a similar girl badge is called “Car Care.”)

Though I was unaware of that difference (and probably wouldn’t completely understand the implications at the time), even as a fifth grader, I knew that I was getting the short end of the stick in Girl Scouts. The boys got to make sweet racing cars out of wood while we were making crap out of yarn and plastic straws and doing choreographed dances to Billy Joel’s “River of Dreams.” Although I didn’t understand the larger context, I knew that the boys got to do cool things and we got to do kind of boring, “girly” things. Which is not to say that my time in Girl Scouts was not valuable – it definitely helped in making friends and gaining new experiences. Yet I can see in retrospect how such activities contribute to the steering of children’s career paths simply due to their gender. Even as a kid, I knew something was off.

A new paper in the Proceedings of the National Academy of Sciences has been causing a storm: Understanding current causes of women’s underrepresentation in science, by Stephen J. Ceci and Wendy M. Williams, both from the Department of Human Development at Cornell University. (What is a Department of Human Development? According to their website, “The Department of Human Development addresses the biological, emotional, cognitive, and social factors that shape human behavioral development and the potential of research for enhancing development and well-being from infancy through old age in diverse social contexts.”)

Though there are points to be made about the media coverage of this article, I want to focus on the text of the article itself. The article claims to prove that there is no sex discrimination in (i) manuscript reviewing, (ii) grant funding or (iii) interviewing/hiring. The last section then explains the point that

Women’s current underrepresentation in math-intensive fields is not caused by discrimination in these domains, but rather to sex differences in resources, abilities, and choices (whether free or constrained).

Let’s examine the two issues separately, and then wrap it up in summary.

I: Discrimination

The authors break this up into three parts. I have not been very aware of studies on discrimination in journal reviewing, and have not heard any big push against it, so to me this part is largely a strawman. However, it also possibly suffers from the weakness of the next section, regarding grant funding. The often quoted statistic is that women need to be 2.5 times as strong an applicant, by objective measures, to be rated as strongly (subjectively) as a male applicant. This is based on a paper in Nature which studied postdoctoral fellowship applications to the Swedish Medical Research Council (MRC) in 1995. After comparing an objective “total impact measure,” based on “total number of publications, total number of first-author publications, total citations, total impact measure, first-author impact measure, and first-author citations,” they found that “a female applicant had to be 2.5 times more productive than the average male applicant to receive the same competence score [from the judges].” Ceci and Williams now go on to cite multiple studies where grant applications for males and females had the same success rates, and take this to indicate that such discrimination has never been reproduced and therefore doesn’t exist. But nowhere do they attempt to compare apples to apples. Male and female success rates say nothing about the content of those applications – did the successful women have to try twice as hard as the men? That’s the discrimination question at the heart of the MRC study, and that’s not addressed by any of the other studies about acceptance rates. The exception is a follow-up study of the MRC, which found no discrimination, but in fact a small effect in favor of women instead. But that’s not surprising – if your esteemed organization made headlines for being discriminatory, wouldn’t you take that as a giant kick in the behind to pay attention to the issue, and sometimes that involves a bit of recalibrating?

The third discrimination issue is related to hiring practices. Ceci and Williams first list a handful of the many studies that have shown gender preferences in hiring, but then never go on to disprove them. The next paragraph includes the sentence “A Government Accounting Office (GAO) report notes that women in math-intensive fields express feelings of isolation, dissatisfaction, and discrimination,” and then goes on to talk about how women are more likely to work part-time because they choose to and that’s why they get paid less. I’m sorry, did I just blink and miss something? How was this transition logical, except to imply that women feel discriminated against because they work part time? Perhaps they feel discriminated against because they are being discriminated against? I don’t understand how these issues (discrimination in hiring and self-reported discrimination) just managed to fall out of the article at this point.

The only time hiring discrimination is actually addressed in this section is the statement that “among PhDs applying for tenure-track jobs, women were slightly more likely than men to be invited to interview and offered jobs.” Yet this is, once again, the same key omission from the entire paper – just who are these women that are applying? If STEM fields lose increasing numbers of women as you go up the pipeline, who is left by the time they are applying for tenure-track jobs: the superstars, of course! The ones who have fought against the tide, who are committed, who have made it that far despite many barriers against them. The fact that they are slightly more likely than men to be interviewed doesn’t mean discrimination doesn’t exist, because it ignores the many women that have already been discriminated against. And, again, it says nothing about how hard these women had to work compared to men.

All in all, I did not feel that the paper held up to its claims of proving that discrimination doesn’t exist. Quite frankly it was shoddily done and failed to address the actual key issues that exist in regards to discrimination. As Joan Schmelz wrote in the AAS Women newsletter on February 11, 2011: “I think the authors should be sentenced to walk a mile in my shoes . . . I hope I am wearing the highest of my high heels on that day!”

II: “Today’s Causes” of Underrepresentation

The quote that sums it all up:

That women tend to occupy positions offering fewer resources is not due to women being bypassed in interviewing and hiring or being denied grants and journal publications because of their sex. It is due primarily to factors surrounding family formation and childrearing, gendered expectations, lifestyle choices, and career preferences – some originating before or during adolescence… – and secondarily to sex differences at the extreme right tail of mathematics performance on tests used as gateways to graduate school admission.

Ceci and Williams break it down again into three issues: career preferences, ability differences, and fertility/lifestyle choices. What boggles my mind is how these are somehow stated as issues independent of discrimination, whereas they are truly embedded in a gendered society that influences everyone throughout their entire lives. And the mathematical differences thing – is this truly being brought up, despite all the studies that show that gap decreasing? How can something biologically innate be changing so rapidly? And then of course, there’s the fact that you can teach spatial ability, the well-demonstrated existence of stereotype threat, and how silly it is to focus on the gender makeup of the top 0.01% when a far greater percentage of people than that go on to get degrees in science. (If you say there are about 10 million scientists and engineers working in this country [old numbers], that’s about 3% of the population now, which is 300 times the number of people you are talking about in that tiny tiny tail of the SAT-M scores, and those are the ones that get STEM degrees, which plenty of high-scoring math students don’t.)
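To check my back-of-the-envelope numbers in that parenthetical, here is a minimal Python sketch. The 10 million figure and the 0.01% tail come from the paragraph above; the population of roughly 310 million is an assumption made purely for illustration, not a sourced statistic.

```python
# Rough check of the "300 times" comparison in the paragraph above.
# Only the first and third inputs come from the post; the population
# value is an assumption for illustration.
scientists_and_engineers = 10_000_000  # "about 10 million" (the post's "old numbers")
assumed_population = 310_000_000       # assumption: approximate national population
tail_fraction = 0.0001                 # the top 0.01% of SAT-M scorers

# Share of the population working as scientists and engineers
workforce_fraction = scientists_and_engineers / assumed_population

# How many times larger that share is than the extreme right tail
ratio = workforce_fraction / tail_fraction

# The same comparison using the rounded 3% quoted in the text
rounded_ratio = round(0.03 / tail_fraction)

print(f"STEM workforce share: {workforce_fraction:.1%}")
print(f"Ratio to the top 0.01%: {ratio:.0f}x")
print(f"Ratio using the rounded 3%: {rounded_ratio}x")
```

With these inputs the workforce share comes out a little above 3%, so the ratio lands in the neighborhood of the 300 quoted above. A different population estimate would shift the exact number, but not the order of magnitude, which is the point of the comparison.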

The article ends by noting that fertility choices and work-home balance come up as big issues in many statistics, and that we ought to be addressing those issues with family-friendly policies. Okay. But how is this issue not related to discrimination? The fact that this is a woman’s burden and that men come out relatively unscathed in the whole process when they have children (I don’t mean to imply that it’s not difficult, just that the same career effects aren’t seen) says something about our society and the unfair treatment of some over others – that’s discrimination.

Summing it All Up

The paper states its reason for downplaying discrimination in the abstract:

Thus, the ongoing focus on sex discrimination in reviewing, interviewing, and hiring represents costly, misplaced effort: Society is engaged in the present in solving problems of the past, rather than in addressing meaningful limitations deterring women’s participation in science, technology, engineering, and mathematics careers today.

Do we need to address issues other than discrimination? Yes, a resounding yes. Work-life issues are huge in academics (and elsewhere) and can affect both men and women. But that doesn’t mean that we can’t also talk about discrimination. I don’t know how “costly” it is to hold seminars and workshops that raise awareness, but I highly doubt that the money spent on such efforts could fund entire campus day-care centers, hire postdocs or teachers to fill in for new parents, and fund couples-hiring initiatives. Furthermore, not talking about discrimination and unconscious bias is dangerous because awareness is key to addressing the biases that we all have. Have things gotten better since the Swedish Medical Research Council? You bet, and that’s because they were called out, big time. To say that talking about discrimination is counter-productive only allows it to continue. It’s definitely not so blatant anymore, which is why more of the focus is on unconscious bias, which is not adequately addressed in this article. I’d really love for these authors to walk a mile in my shoes, too.

Sociological Images had a short post on “Rulers of Science” and Male as Default, highlighting a set of rulers, one of which is called “Rulers of Science” and lists famous scientists, and one called “Great Women Rulers of Science.”

The comment thread is very interesting, as there is certainly a lot to think about. Enjoy!