5 Things You Should Know About Orchestras and Blind Auditions

Unless you were completely off the grid this week, you probably heard about the now-infamous “Google memo”. Written by a (since fired) 28-year-old software engineer at Google, the memo is a ten-page document in which the author lays out his beliefs about why gender gaps in tech fields continue to exist. While the author did not succeed in getting any policies at Google changed, he did manage to kick off an avalanche of hot takes examining whether the gender/tech gap is due to nature (population-level differences in interests/aptitude) or nurture (embedded social structures that make women unwelcome in certain spaces). I have no particular interest in adding another take to the pile, but I did see a few references to the “blind orchestra auditions study” that reminded me I had been wanting to write about that one for a while, to dig into a few things it did and did not say.

For those of you who don’t know what I’m talking about, here’s the rundown: back in the 1970s, top orchestras in the US were 5% female. By the year 2000, they were up to almost 30% female. Part of the reason for the change was the introduction of “blind auditions”, where the people holding tryouts couldn’t see the identity of the person trying out. This finding normally gets presented without a lot of context, but it’s good to note that someone actually did decide to study this phenomenon to see whether the two things really were related. They got their hands on all of the tryout data for quite a few major orchestras (they declined to name which ones, as that was part of the agreement for getting the data) and tracked what happened to individual musicians as they tried out. This led to a data set that had overall population trends, but could also be used to track individuals. You can download the study here, but these are my highlights:

1. Orchestras are a good place to measure changing gender proportions, because orchestra jobs don’t change. Okay, first up is an interesting “control your variables” moment. One of the things I didn’t realize about orchestras (though maybe I should have) is that the top ones have not changed in size or composition in years. So basically, if you suddenly are seeing more women, you know it’s because the proportion of women overall is increasing across many instruments. In the words of the authors: “An increase in the number of women from, say, 1 to 10, cannot arise because the number of harpists (a female-dominated instrument), has greatly expanded. It must be because the proportion female within many groups has increased.”

2. Blind auditions weren’t necessarily implemented to cut down on sexism. Since this study is so often cited in the context of sexism and bias, I had never actually read why blind auditions were implemented in the first place. Interestingly, according to the paper, the initial concern was nepotism. Basically, orchestras were filled with their conductors’ students, and other potentially better players were shut out. When they opened the auditions up further, they discovered that when people could see who was auditioning, they still showed preferential treatment based on resume. This is when they decided to blind the audition, to make sure that all preconceived notions were controlled for. The study authors chose to focus on the impact this had on women; in their words, “Because we are able to identify sex, but no other characteristics for a large sample, we focus on the impact of the screen on the employment of women.”

3. Blinding can help women out. Okay, so first up, the most often reported findings: blind auditions appear to account for about 25% of the increase in women in major orchestras. When they studied individual musicians, they found that women who tried out in both blind and non-blind auditions were more successful in the blinded auditions. They also found that having a blind final round increased the chances a woman was picked by about 33%. This is what normally gets reported, and it is a correct reporting of the findings.

4. Blinding doesn’t always help women out. One of the more interesting findings of the study that I have not often seen reported: overall, women did worse in the blinded auditions. As I mentioned up front, the study authors had the data for groups and for individuals, and the findings from #3 were pulled from the individual data. When you look at the group data, you actually see the opposite effect. The study authors suggest one possible explanation for this: adopting a “blind” process dropped the average quality of the female candidates. This makes a certain amount of sense. If you sense you are a borderline candidate, but also think there may be some bias against you, you would be more likely to put your time into an audition where you knew the bias factor would be taken out. Still, that result interested me.

5. The effects of blinding can depend on the point in the process. Even after controlling for all sorts of factors, the study authors found that bias was not equally present at every stage. For example, they found that blind auditions seemed to help women most in preliminary and final rounds, but actually hurt them in the semi-final rounds. This would make a certain amount of sense: presumably the people doing the judging use different criteria in each round, and some of those criteria may be biased in different ways than others. Assuming that all parts of the process work the same way is probably a bad assumption to make.

Overall, while the study is potentially outdated (from 2001, using data from the 1950s-1990s), I do think it’s an interesting frame of reference for some of our current debates. One article I read about it talked about the benefit of industries figuring out how to blind parts of their interview process because it gets them to consider all sorts of different people, including those lacking traditional educational requirements. With many industries dominated by those who went to exclusive schools, hiding identity could have some unexpected benefits for all sorts of people. However, as this study also shows, it’s probably a good idea to keep the limitations of this sort of blinding in mind. Even established bias is not a consistent force that produces identical outcomes at all time points, and any measure you institute can quickly become a target that changes behavior. Regardless, I think blinding is a good thing. All of us have our own pitfalls, and we all might be a little better off if we see our expectations toppled occasionally.

Hmmmm. Your points 3 and 4 seem contradictory. Am I missing something? Is it possible that the effect of blind auditions is to induce more women to try out, so that individual women do more poorly, but the overall effect is more women who are selected?

Yeah, they do contradict each other. I was trying to be cute. Sorry about that.

And yes, the explanation is similar to what you suggested, though they phrase it differently. Their theory is that blind auditions cause more women to try out, so overall a smaller percentage of the group gets selected. However, if they looked at any individual woman (let’s call her Sally), they found blinding helped. If Sally tried out in two unblinded auditions and one blinded audition, she was more likely to get the job under the blinded audition than random chance would suggest. They called it “when controlled for quality of player”. Hope that’s a little clearer!
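If the arithmetic helps, here is a minimal sketch of how both things can be true at once. Every number in it is made up for illustration (none of it comes from the study), and the two "strong" and "borderline" groups are a hypothetical simplification: each individual woman does better behind the screen, yet the group-wide success rate drops because more borderline candidates show up.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    p_unblinded: float  # hypothetical per-audition success chance, no screen
    p_blinded: float    # hypothetical per-audition success chance, with screen
    n_unblinded: int    # how many audition when there is no screen
    n_blinded: int      # how many audition when there is a screen


def aggregate(groups, blinded):
    """Return (group-wide success rate, expected number of women hired)."""
    auditions = sum(g.n_blinded if blinded else g.n_unblinded for g in groups)
    hires = sum(
        g.n_blinded * g.p_blinded if blinded else g.n_unblinded * g.p_unblinded
        for g in groups
    )
    return hires / auditions, hires


# Made-up numbers: borderline candidates only bother auditioning behind a screen.
groups = [
    Group("strong", p_unblinded=0.20, p_blinded=0.30, n_unblinded=10, n_blinded=10),
    Group("borderline", p_unblinded=0.02, p_blinded=0.05, n_unblinded=0, n_blinded=40),
]

rate_open, hires_open = aggregate(groups, blinded=False)
rate_blind, hires_blind = aggregate(groups, blinded=True)

print(f"no screen: group rate {rate_open:.0%}, expected hires {hires_open:.1f}")
print(f"screen:    group rate {rate_blind:.0%}, expected hires {hires_blind:.1f}")
```

In this toy setup every woman's own odds go up with the screen, the group rate is cut in half, and more women are hired in total, which is the same shape as the pattern described above.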

I don’t think that point #1 is complete. To control your variables here, you need to ensure both that the composition of the orchestras hasn’t changed (which you do), and that the composition of the applicant stream hasn’t changed. That is, if in 1970 there were 95 men and 5 women applying for the jobs, you would expect that only 5% of the selectees would be women (assuming a perfectly equal distribution of talent). But if by 2015 there were still 95 men applying, and now there were 95 women also applying, then the selectees rising from 5% to 30% is not a success story, but rather a retrogression.
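To spell that arithmetic out, here is a quick sketch. The applicant counts are the illustrative ones from this comment, not real data, and the function name is just something made up for the example; it assumes talent is distributed identically across men and women, so the expected share of hires simply matches the share of applicants.

```python
def expected_female_share(men_applying: int, women_applying: int) -> float:
    """Expected share of hires who are women if talent is equally distributed."""
    total = men_applying + women_applying
    if total == 0:
        raise ValueError("applicant pool is empty")
    return women_applying / total


# Illustrative applicant pools from the comment above (not real data).
share_1970 = expected_female_share(men_applying=95, women_applying=5)
share_2015 = expected_female_share(men_applying=95, women_applying=95)
observed_2015 = 0.30  # the roughly 30% figure quoted in the post

print(f"1970 pool: {share_1970:.0%} of hires expected to be women")
print(f"2015 pool: {share_2015:.0%} of hires expected to be women")
print(f"gap vs. observed: {(share_2015 - observed_2015) * 100:.0f} percentage points")
```

Under those assumptions, 30% would fall well short of what the hypothetical 2015 pool predicts, which is why the applicant stream matters as much as the orchestra composition.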

And obviously, lots more women entered the workforce between 1970 and 2015. Even more, there has been a marked drop in the number of women who take time off their careers to stay home with young children. For long-training and personal-connections based careers (and I’d assume that professional musicians are both), “dropping out” for a while usually means dropping out permanently (at least for the top jobs). (There’s also been a steady decline in male labor force participation overall, but I don’t know how much that would affect the pool of wanna-be musicians.) So, given that there are likely lots more women in the qualified-applicant pool in 2015 vs. 1970, and possibly fewer men as well, I would expect that the number of female musicians hired would rise under any plausibly merit-based selection, even the old one. I don’t think we can draw conclusions on the efficacy (for this purpose) of blind auditions per se (although possibly as an anti-cronyism measure, or to increase transparency, it might be a great idea anyway).

I think this is Google-guy’s main point: if the qualified applicant pool is 80-20 male-female, then it’s unrealistic to think that a 50-50 outcome for hiring/promotion is either achievable or fair. He also says that the majority of the difference between men’s and women’s interest/aptitude in engineering is due to innate differences, i.e. not under Google’s control, which is more contentious.

That’s a good point, and it’s definitely something I skimmed over. They actually were attempting to look at that and specifically designed a lot of their analysis to determine whether the rise was solely due to the increase in female candidates (which occurred at the same time, for all the reasons you describe), or whether the blinding also played a role. What helped them out is that they got the full audition list for each orchestra, so they could track the success rate of individual women. Some orchestras moved to blinded auditions before others, so they could compare success for one individual woman across multiple tryouts. If Sally got to the final round in the two blinded auditions but didn’t advance out of the preliminary round in the unblinded ones, they could conclude the blinding was helping her. Men who went on multiple auditions saw the opposite effect: they were more likely to get knocked out early in the blinded auditions and made it further in the unblinded ones. It was these comparisons across individual performers that yielded the results that get quoted so often.

I did want to note that this data only went up through the 90s, so the effect we see today may not be the same as what they saw then. I’m pretty much with you on the anti-cronyism part, though. It seems like a wise idea to evaluate work with no context every once in a while to see if you’re actually judging work the same with and without context.