Friday, May 04, 2007

Racial bias and NBA referees -- Part 2

This is part 2 of my comments on the NBA racial discrimination study. Part 1 is here. The NYT article on the study is here.

The study found a difference in the rates of how white and black referees call fouls against white and black players – notably, white referees call fewer fouls than expected against white players.

From this, the authors of the study conclude that, because most of the referees in the league are white, it would be advantageous to hire more white players and fewer black players. I don’t think that's correct, for several reasons, and I will explain why.

I'll start by describing the paper's approach. There were many different tests and regressions, but the first one, and the simplest one, is not that different from the more complicated ones. So I'll use that one.

This is the first chart in Table 3, which is below. The numbers are fouls called per 48 minutes, by race of the (majority of the three) referees and the perpetrator:

                       black players      white players
majority white refs    4.33               4.95
majority black refs    4.33               5.02

You'll notice that where black players are concerned, the race of the referees didn't matter: fouls were called at almost exactly the same rate by both races. ("Almost" because the numbers are rounded.)

But white players were caught much less often by white referees than by black referees, 4.95 fouls to 5.02 fouls, a difference of .07 fouls every 48 minutes.

And so, you can make this argument: in the left column, which represents black players, the top number is exactly the same as the bottom number. If white referees were as consistent for white players as they were for black players, the second column would look the same, and the top number there would also be equal to the bottom number. The chart would then look like this:

                       black players      white players
majority white refs    4.33               5.02 (was 4.95)
majority black refs    4.33               5.02

This new chart represents how you expect it "should" be in the absence of bias. The top-right cell (the one that changed) represents white referees making calls on white players. So translating the change of chart into English, you could say:

1. "White referees didn't call as many fouls on white players as they should have."

The "1." in front of the statement is because, as you've probably figured out, there's more than one way to adjust the chart. Instead of changing the top-right cell, we could change the bottom-right cell, so that the black ref number becomes equal to the white ref number:

                       black players      white players
majority white refs    4.33               4.95
majority black refs    4.33               4.95 (was 5.02)

This is just as consistent as the first way. The interpretation here is:

2. "Black referees called more fouls on white players than they should have."

These two statements, number one and number two, are equally consistent with the data. There is no way to determine which one is true. Indeed, they could both be true: maybe white refs were too lenient on white players, AND black refs were too hard on white players. Or it's possible that all refs are more lenient with white players. Maybe the "real" number should have been even higher than 5.02. Maybe blacks let lots of fouls go, but whites let even more fouls go! Or maybe both refs were tougher with white players than black players – but white refs were not as much more tough as black refs were more tough.

We can reduce that mess to one principle, which we can say at least three different ways:

(c) regardless of whether black referees were too tough or too lenient on white players, the white referees were either less tough or more lenient.

The statements are three different ways of saying the same thing. The first way kind of implies that it's the white referees who are biased. The second way implies that it's the black referees who are biased. And the third way implies that maybe all the referees might have been biased.

Any of those is possible. There's evidence of bias, but absolutely no evidence of who is biased.

But, wait, we're not finished: there are still other interpretations of the chart. This time, start with the right column. White refs gave white players 0.07 fewer fouls than black refs. If white refs are 0.07 more lenient towards white players, they should also be 0.07 more lenient towards black players. That is, the left column should look the same as the right column! And so, instead of giving them 4.33 fouls per game, they should have given them only 4.26:

                       black players      white players
majority white refs    4.26 (was 4.33)    4.95
majority black refs    4.33               5.02

This gives rise to a third interpretation:

3. "White referees called more fouls on black players than they should have."

And there's one more interpretation, which happens when you change the bottom-left cell:

                       black players      white players
majority white refs    4.33               4.95
majority black refs    4.40 (was 4.33)    5.02

4. "Black referees called fewer fouls on black players than they should have."

(Digression: to some, interpretations 3 and 4 are perhaps less intuitive than 1 and 2. In the first two charts, black referees and white referees treated black players exactly the same. This satisfies our wants and expectations for NBA officiating, that the race of the referee should make no difference. It seems like those numbers somehow must be right, and making them unequal is the wrong thing to do.

But I think that prior expectation is unrealistic. Why shouldn't there be differences in refereeing style between whites and blacks? There are differences in basketball style. There are differences in culture, in politics, and many other areas. That's one of the reasons employers like "diversity" – because people of different races can bring different attitudes and methods to the business. So why shouldn't referees be a bit different by race?

Sure, the NBA tries to enforce consistency, so you wouldn't expect any large differences. But isn't it acknowledged that some referees call a tighter game, and some call a more lenient game? Perhaps the division isn't exactly 50-50 between white and black.

And, indeed, other data show a difference: in one of their subsequent breakdowns, it was found that black referees were slightly more strict than whites even in fouls called against blacks.

Anyway, I include this digression because the authors of the study don’t really consider the possibility of non-bias-related differences between the two groups of referees. But if you do consider such differences, you can come up with explanations for the data other than racial bias (which I'm saving for part 3). The authors didn't do that, and neither did most of the opinions on the subject that I've read.)

So what can we conclude, not knowing which of the four possibilities is correct?

First, that there's some bias going on. But does it favor white players or black players? And is it by white referees or black referees?

Under 1 and 2, it's bias for and against white players; under 3 and 4, it's bias against and for black players. Under 1 and 3, it's bias by white referees. Under 2 and 4, it's bias by black referees.

So we don't know who benefits, and whose 'fault' it is.

What we do know is that, under any of the four possible interpretations,

And that's why, in the study, the authors repeatedly refer to "own-race bias" instead of "white bias" or "black bias." Every player benefits when more referees are of his own race.

(And one quick point: there is a fifth interpretation, which is that some combination of all four of the others is right! Maybe black refs both favor blacks and stiff whites. Maybe white refs both favor whites and stiff blacks. And when you put them all in a chart, you get what we got.)

------

What is the magnitude of the "own-race" bias? You'll notice that in every one of the four cases, we had to change one of the cells by exactly the same amount: 0.07. (It always has to work out that way, that no matter which of the four cells you change, it will have to change by the exact same amount.)
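Here is a minimal sketch of why the adjustment is always the same size. It treats the chart as a 2x2 table and computes the "difference of differences" between the two columns. The function name `interaction` and the dictionary layout are my own choices for illustration, not anything taken from the paper.

```python
# Fouls per 48 minutes from the first chart of Table 3.
# Keys are (majority race of the referee crew, race of the player).
rates = {
    ("white", "black"): 4.33,
    ("white", "white"): 4.95,
    ("black", "black"): 4.33,
    ("black", "white"): 5.02,
}

def interaction(r):
    """Black-ref minus white-ref gap for white players, minus the same gap for black players."""
    white_players = r[("black", "white")] - r[("white", "white")]
    black_players = r[("black", "black")] - r[("white", "black")]
    return white_players - black_players

gap = interaction(rates)  # about 0.07

# Direction in which each single cell must move to make the two columns agree.
signs = {
    ("white", "white"): +1,  # interpretation 1
    ("black", "white"): -1,  # interpretation 2
    ("white", "black"): -1,  # interpretation 3
    ("black", "black"): +1,  # interpretation 4
}

adjusted = {}
for cell, sign in signs.items():
    fixed = dict(rates)
    fixed[cell] = rates[cell] + sign * gap
    adjusted[cell] = fixed
    print(cell, round(fixed[cell], 2), round(interaction(fixed), 10))
```

Whichever cell you pick, moving it by `gap` wipes out the interaction, and no smaller move will. That is all the chart can tell you: the size of the gap, not which cell is "wrong."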

If you can change the race of a single one of your (48-minute) players, so that it now matches the majority of the referees that game, you should reduce your fouls by 0.07 (or about one foul per 15 such games). If, somehow, you could change all five players, it would be one foul every three games – or about one point per game. That's a lot. But it's hard to have two sets of five players who are exactly equal in every respect but race, which renders this strategy fairly unworkable.

Can you gain an advantage by switching race for an entire season, instead of by the game? I will soon argue that you cannot. But it sure seems like you can. Here's the logic:

Since two-thirds of NBA referees are white, most games (in theory, about 74%) would have a majority of white referees. So if you replace a black player with a white player all season, he'll have the advantage in 74% of games, and a disadvantage in 26% of games. 74% minus 26% is about half, so the advantage of that is about 0.035 fouls per game – one foul in 30 games. If you assume that the victim benefit equals the perpetrator benefit, double that to one foul in 15 games. That's about six fouls over a season. What's that – maybe a half a win or so?
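To make that arithmetic easy to check, here is a rough sketch. It assumes each of the three referees on a crew is independently white with probability two-thirds (which is where the 74% comes from), and it assumes an 82-game regular season. The variable names are just for illustration.

```python
from math import comb

p_white = 2 / 3      # share of white referees, as stated above
crew_size = 3        # referees per game
gap = 0.07           # own-race gap in fouls per 48 minutes, from the chart
season_games = 82    # assumption: a full NBA regular season

# Probability that at least two of the three referees are white (binomial).
p_white_majority = sum(
    comb(crew_size, k) * p_white**k * (1 - p_white) ** (crew_size - k)
    for k in range(2, crew_size + 1)
)

# Share of games with the advantage, minus share of games with the disadvantage.
net_share = p_white_majority - (1 - p_white_majority)

advantage_per_game = gap * net_share     # fouls avoided per game
doubled = 2 * advantage_per_game         # if the victim benefit equals the perpetrator benefit
fouls_per_season = doubled * season_games

print(f"P(white majority)   = {p_white_majority:.3f}")
print(f"net share of games  = {net_share:.3f}")
print(f"advantage per game  = {advantage_per_game:.4f}")
print(f"fouls per season    = {fouls_per_season:.1f}")
```

The exact probability is 20/27, and the season total comes out a little under six fouls, which is consistent with the rounded numbers in the text.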

The New York Times article recaps the issue like this:

Their results suggested that for each additional black starter a team had, relative to its opponent, a team’s chance of winning would decline from a theoretical 50 percent to 49 percent and so on, a concept mirrored by the game evidence: the team with the greater share of playing time by black players during those 13 years won 48.6 percent of games …

But I don't think that's right. The overall tendency of players to avoid and draw fouls will be accounted for in their statistics and evaluations. And so if teams consider fouls for and against when making personnel decisions, black teams should be exactly as likely to win as white teams. The GM may not know why player W takes fewer fouls than player B, but he doesn’t care – he'll hire player W even without knowing.

Teams are rational. If you had two players, one white and one black, equal in talent (not including the effects of racial bias), the white player would have better stats than the black player – because he plays against white refs more often! Therefore, he'd be contributing more to the team, and he would already have the job.

That is: even though the racial bias is invisible, the results are not. And teams would already have made the optimal player moves to correct for it, just as they correct for every other player characteristic.

That's in theory: in practice, that might not be entirely correct. It's possible that some of the effect has gone unnoticed – most likely, a player's ability to induce fouls on the other team. But if that's the case, then the problem is that teams aren't measuring it. The standard economic assumption is that teams are rational, and would notice it. As an empirical question, my gut says that teams would notice a fair bit of it, maybe most of it.

Look at it this way. Suppose that 25 years ago, a racist demon started giving every white baby an extra dollop of basketball talent. Nobody knew that until today, when a study came out proving it. Does that mean that teams should immediately rush out and get white players? Of course not. The white players' extra talent was obvious, even if the cause wasn't, and all the demon-gifted great young white players were already snapped up and are now playing.

Similarly, the effects of race-biased refereeing have already been factored into which players have jobs today. Which race got the extra jobs, blacks or whites? We don’t know, because we don’t know which way the bias goes. If the hidden truth is that referees are biased in white players' favor, then some white players have jobs they wouldn’t have otherwise. If the bias is in black players' favor, then the black players have the extra jobs. Just as you would expect.

But there's no way to tell which way it goes. The study provides insufficient evidence to tell whether the bias favors white players or black players.

------

In Part 3, I'll argue that the data can be explained by other theories than racism. One such possibility – and I think it's a good one – is discussed by Guy here, and by Andrew and Guy in the comments to this post.

17 Comments:

I agree that your four explanations are all theoretically possible. But the virtually identical rate for black players (regardless of ref race) would certainly be an odd coincidence. I think 1 and 2 (or some combo) are much more plausible.

* *

"If you assume that the victim benefit equals the perpetrator benefit, double that to one foul in 15 games. That's about six fouls over a season. What's that – maybe a half a win or so?"

6 fouls equals about 4 points, which I believe is more like 1/6 or 1/7 of a win in the NBA.

* *

At the end of the paper, the authors argue that the data shows more bias by white refs than black refs. Can you comment on that?

**

It's interesting that black refs appear to call somewhat more fouls overall than white refs (if I'm reading the data correctly). Given that 83% of PT goes to black players, the reverse should be true if refs are biased in favor of same-race players. That suggests a real race difference in foul-calling propensity.

1. The standard deviation for the "virtually zero" difference is .016, which is significant in the basketball sense. So I think it probably *is* a coincidence.

Look at it this way: different referees probably have different rates at which they call fouls, even on players of the same race. There are only about 15 black referees. So it would be coincidence indeed if they just happened to have exactly the same foul-calling propensity as the white refs.

2. Thanks. Does that also consider offensive fouls?

3. Yes, you're right, the authors do argue that, on page 29. I haven't thought about that aspect much, but let me read it again ...

4. Black refs might call more fouls overall, but disproportionally against white players. What matters is not the actual rate of fouls, but the rate of fouls *relative to the other race*.

Even if white refs NEVER call fouls, if black refs call 1 foul against a black player and 2 against a white player, that benefits black players.

Basically, the authors break down the bias number by individual referee instead of by referee race. They find that blacks hover close to zero, while whites vary a lot from zero.

But I think that's exactly the same result as the chart, just broken down differently. Again, the authors assume that the "real" rate should be zero, and that therefore black refs aren't biased but white refs are.

And again, I think they are ignoring the possibility of explanations 3 and 4. If those are correct, then the non-bias baseline for that graph would NOT be zero. It could be anywhere. It could even, in theory, move high enough that all the bias would be caused by black refs.

Since the sample size for white refs is much larger than for black refs, we should expect a larger significance for the whites, if the TRUE effect for both is the same. (Correlation approaches 1 as sample size approaches infinity)

***

Phil, in my blog, I said that the difference seems to be in the all-same-race crews, so that the mixed-race crews had no effect (or at least had their effect cancelled). I would suggest using those crews as the control groups.

Tom: True that as far as white players were concerned, the ref combination 2-black 1-white looked almost the same as the 2-white 1-black. But do you really think it's safe to conclude that they're truly identical?

And I'm not sure what you mean by using them as a control group. Do you mean that will help us figure out where the bias is, and where the benefit is?

Also, the 2x2 matrix compares "white majority" referee crews to "black majority" referee crews. If 2/1 crews are indeed the same as 1/2 crews, then the differences in that matrix are indeed comparing all-white to all-black, no?

Phil: I believe that defensive fouls are usually estimated at -.5 points, as it usually results in 1.5 points from the free throw line, vs. the 1 point the opposing team was expected to score with a possession. Offensive fouls are -1 point (loss of possession). I don't know what the right mix is, but you can see it doesn't matter much. (It would be interesting to know if the 'bias' tends to appear more on off vs def fouls. The authors' data doesn't seem to include that.)

I don't recall the pythag formula for NBA, but I think it yields a point/win converter in the 25-30 range.
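As a rough check on the "1/6 or 1/7 of a win" figure, here is a small sketch using the numbers above: about 0.5 points per defensive foul, 1 point per offensive foul, and 25 to 30 points per win. The offensive share of fouls is unknown, so the values tried below are purely hypothetical, and the function name `points_lost` is made up for illustration.

```python
fouls = 6                    # extra fouls over a season, from the post
cost_defensive = 0.5         # points per defensive foul (estimate quoted above)
cost_offensive = 1.0         # points per offensive foul (estimate quoted above)
points_per_win = (25, 30)    # rough converter range quoted above

def points_lost(n_fouls, offensive_share):
    """Points given up on n_fouls when a given share of them are offensive fouls."""
    per_foul = offensive_share * cost_offensive + (1 - offensive_share) * cost_defensive
    return n_fouls * per_foul

# Hypothetical mixes: all defensive, one-third offensive, all offensive.
for share in (0.0, 1 / 3, 1.0):
    pts = points_lost(fouls, share)
    wins = [pts / ppw for ppw in points_per_win]
    print(f"offensive share {share:.2f}: {pts:.1f} points, "
          f"{min(wins):.2f} to {max(wins):.2f} wins")
```

With a one-third offensive share, six fouls cost 4 points, which at 25 to 30 points per win is in the 1/6 to 1/7 neighborhood. Even the all-offensive extreme stays under a quarter of a win.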

If all that's happening here are the fouls, the impact is clearly small. But if their findings on team win% are correct, it suggests that players are changing their approach in response to the bias -- whites being more aggressive and/or blacks less -- so that the foul difference is small but overall game impact is large.

However, I have a hard time seeing how that happens if players aren't conscious of the bias. Even if refs were calling 10% fewer fouls on white players (before players adjust their approach), how long would it take a player to notice such a differential, conclude this ref squad was tough or loose, and adjust their play? The game would be over, I think. The only way this can be true is if NBA players know from the start, based on refs' race, to play a certain way. But, we haven't seen a single player of either race say that's true (so far). So I'm skeptical about this scenario.

Guy: The 0.5 and 1.0 both make sense. I'll have to check the paper again to see what it says. They probably double their number to account for fouls induced on the other team.

That's a good question, whether players are able to notice whether the referees are tough or loose that day. I wonder if there's enough variation (aside from caused by race) that players can sometimes pick up on it. I agree with you that if it's only 10%, probably nobody notices.

Sometimes in hockey games, when the game gets a little rough, the announcers will say the referees will crack down a little more to make sure the aggression doesn't get out of hand. Do they say that in basketball? Fouls aren't violent, emotional or dangerous like in hockey, so probably not?

Phil: My 10% higher foul rate was just a made up number, but it's about 3X the rate they find. The point is that it's hard to imagine that this happens at an unconscious level on players' part. A player only draws 4 fouls per game on average. So how long would it take for a white player to notice he was getting away with more aggressive play than usual, given the subtle foul-calling differences we're talking about here? Either players are aware of this and adjust from the beginning of the game -- but refuse to admit it publicly -- or race of refs doesn't really impact players' approach.

* *

Back to your post #2: For black players, study has samples of 49,000 maj black ref games and 165,000 maj white ref games. Seems like the chance there is a non-trivial true difference in foul-calling (given identical observed rates) is pretty small.

There is also a good theoretical reason to think that the real difference (if any) occurs with white players: approximately 83% of fouls by white players will be opposite-race (against a black player), compared to just 17% for black players. If refs are race-biased, and we think they are equally conscious of race of the 'foulee' (per table 6), it's reasonable to assume that most of the bias occurs when refs have to choose between a player of own race and player of opposite race. And that would then result in a much larger spread in the foul rate for white players.

On your first point, I think you're right. It would be different in hockey, of course, where refs often call 100% more fouls, not just (say) 10%.

On your second point, I'm not sure I understand what you're getting at. There will be exactly as many white-on-black fouls as black-on-white. And the measure of bias doesn't depend on how many incidents are actually observed.

Can you flesh out your argument a bit, with maybe some numbers? Maybe I'm just dense today.

Yes, there will be roughly equal # of B-on-W and W-on-B fouls. But for a white player, c. 83% of his fouls will be mixed race, whereas only about 17% of a black player's fouls will be against a white player. So if ref bias were only or mainly expressed when a play forces a ref to side either with a player of his own race or the other race, then there's no reason for white and black refs to judge black players differently, since they are mainly fouling other black players. That won't be true, however, for white players.

OK, I think I see what you're saying ... you're saying that the observed difference in fouls for black players will be small, because black-on-black fouls are neutral for both refs (black perp, black victim).

But even though the difference in fouls was small, the difference in *bias* might be large. It's just watered down by the large number of black-on-black fouls.

Right, the amount of ref bias could be large or small, but it won't show up in foul rates for black players if the bias is only revealed on mixed-race plays. In that scenario, if 100% of players were black, we would not be able to observe the bias at all, because all plays would involve players of same race. But if 50% of players were white and there were a lot of mixed-race plays, then the bias would have a much greater overall impact and we would expect to see differential foul rates for both white and black players, depending on the refs' race.
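A toy version of that scenario, just to put numbers on it. It assumes the bias shows up only on mixed-race plays, that a player's opponents are black in proportion to the 83% share of playing time mentioned earlier in the thread, and that the underlying bias is sized so that white players show the observed 0.07 gap. All of that is a simplification for illustration, not something the paper estimates.

```python
def observed_gap(bias_on_mixed_plays, other_race_share):
    """Gap in a player's foul rate between ref groups, if bias applies only to mixed-race plays."""
    return bias_on_mixed_plays * other_race_share

black_share = 0.83   # share of playing time going to black players

# Hypothetical underlying bias, sized so white players show a 0.07 gap.
bias = 0.07 / black_share

gap_white_players = observed_gap(bias, black_share)      # opponents are mostly black
gap_black_players = observed_gap(bias, 1 - black_share)  # opponents are rarely white

print(f"underlying bias on mixed plays: {bias:.3f}")
print(f"observed gap, white players:    {gap_white_players:.3f}")
print(f"observed gap, black players:    {gap_black_players:.3f}")

# The two limiting cases described above.
print(f"all-black league, black players: {observed_gap(bias, 0.0):.3f}")
print(f"50/50 league, either race:       {observed_gap(bias, 0.5):.3f}")
```

Under these assumptions the gap for black players would be only about a fifth the size of the gap for white players, so a near-zero difference in the left column of the chart would not by itself rule out bias.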