Sabermetric Research

Phil Birnbaum

Monday, May 28, 2007

The large supply of tall people

In this New York Times article from three weeks ago, Dave Berri, co-author of "The Wages of Wins," once again argues that the relatively low level of competitive balance in the NBA is due to the "short supply of tall people":

"The population of [necessarily tall] athletes the N.B.A. draws upon is quite small. … As the evolutionary biologist Stephen Jay Gould observed, when a population is relatively small, the difference between the very best and the average athlete will be quite large. In other words, when your population of athletes is small, your league will have less competitive balance."

I have previously made two points in response to this argument. First, there are other, stronger reasons that the level of competitive balance in the NBA is low. And, second, every sport has specialized requirements for its athletes -- why should height be any different?

It's the second point I want to address in more detail here.

Berri asserts that basketball players need to be tall. Which is true. But, obviously, height isn't enough. They also need to be able to shoot straight. They need to be able to anticipate plays. They need to be able to read a defense. And so on.

Why, then, should height be special? Isn't there equally a "short supply of people who can feed a teammate with a blind pass?" Isn't there a "short supply of people who can make a difficult fadeaway jumper?"

And, as I wrote previously, Wayne Gretzky saw plays unfold in slow motion, which gave him extra time to react. This made him perhaps the greatest playmaker in hockey history. You've got to think that even among all the hockey players in the world, there's a very short supply of people who can read plays anywhere close to the NHL level.

What makes height different from shooting, or playmaking? For almost any skill (or "characteristic," if you don't consider height a skill), the population will possess it in a normal distribution. Professional athletes are drawn from those winding up in the far right tail of that distribution -- the best of the best (or tallest of the tallest). Of course, any sport requires not just one skill, but many. So an NBA player could probably be only "above average" in some areas, so long as he's *way* above average in enough other areas.

But there's a short supply of humans who are in the right tail of any and every normal distribution. Again, why should height be different? I see three ways height is different, and all three of them work *against* Berri's argument.

First: not all basketball players are tall. A quick check of the Detroit Pistons roster shows that four of their players are 6'3" or shorter. It's not really rare to be that tall. I know several people who are around 6-3.

But how many people do I know who can beat out even one Detroit Piston in any basketball-related skill? Probably none. In passing, shooting, or dribbling, any NBA player could beat any of my friends with no effort at all.

What does this mean? That while height is important for a basketball player, it appears to be almost the *least* important of all the skills! If you count all the men in North America who are at least as tall as the fourth-shortest player on an NBA team (6-3), you'll get a pretty high number. If you count all the men in North America who can block shots at least as well as the fourth-worst player on an NBA team, you'll get a pretty low number. When you look at it this way, there's comparatively a pretty *large* supply of tall people.
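To put a rough number on "pretty high," here is a minimal sketch of the right-tail calculation. It assumes adult male height is roughly normal with a mean of 69 inches and a standard deviation of 3 inches; those two parameters are illustrative assumptions, not figures from Berri or from this post, and the function name is made up for the example.

```python
import math


def fraction_at_least(threshold, mean, sd):
    """Share of a normal(mean, sd) population at or above threshold."""
    z = (threshold - mean) / sd
    # Upper-tail probability of the standard normal, via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Assumed, illustrative parameters (not from the post): heights in inches.
MEAN_IN, SD_IN = 69.0, 3.0

for label, inches in [("6-3", 75), ("6-7", 79)]:
    share = fraction_at_least(inches, MEAN_IN, SD_IN)
    print(f"{label} or taller: roughly 1 in {1 / share:,.0f} men")
```

Under any parameters in this neighborhood, men of 6-3 or taller number in the millions across North America, which is the sense in which the supply is large compared with the supply of NBA-level shot blockers.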

Second: tall men -- really tall men, say 6-7 or such -- may be fairly rare in the general population, but they are easily noticed. So while there might be a short supply of tall people, probably all of them have tried to play basketball! That is, suppose that in the general US population, men who are 6-7 are exactly as rare as men who can hit a fadeaway jumper under pressure with two seconds left on the shot clock. You'd find more of the former in the NBA than the latter. Why? Because almost all tall teenagers attract attention as basketball players, and they all will have had tryouts. Meanwhile, not-so-tall teenagers who might have had basketball skills could wind up not playing at all (especially if they are younger than their peers, and so temporarily smaller), or concentrating on other sports.

Relative to other skills, and considering only the population that considers playing basketball, height is again in larger supply, not shorter.

Third: tallness is evident early, but basketball skill may develop later. According to Wikipedia, even Michael Jordan "was not initially a standout player for the North Carolina Tar Heels." In any sport, history is full of players drafted in the very late rounds who turn out to be superstars, and there are lots of players who unexpectedly develop into elite players even after making the big leagues.

"Tall" is obviously easier to scout for than other skills. How many teenagers would have pursued an NBA career, but didn't know they would eventually be able to pass and shoot and read a defense extremely well? Probably lots. How many teenagers would have pursued an NBA career, but didn't know they would eventually be extremely tall? Probably very few.

Again, the height requirement means that more of the best players in the world will make the NBA -- not fewer.

In summary:

1. Height is not as rare in the general population as other basketball attributes, which suggests that it's less important.

2. People with height are almost 100% likely to consider and pursue basketball, unlike people with other basketball skills.

3. Height is evident early in a player's life, unlike other basketball skills.

The fact that height is more important in the NBA than in other sports results in an increase in the basketball-suited population that actually pursues the sport. The market for professional basketball players is much more efficient at recognizing productivity when it appears in the guise of height than when it appears in other ways.

And so I don't think it's correct that NBA play quality -- and therefore competitive balance -- is reduced by a shortage of tall players. If anything, height is *more* abundant than other skills.

The argument should go exactly the other way. The fact that height is important in the NBA results in *more* competitive balance compared to other sports, not less.

La Russa said back in Oakland Jim Lefebvre said a team’s chances of winning dramatically increase when it adds a third inning of scoring in any game. They started keeping track of such things and found the three-inning rule true.

The article shows that teams scoring in only two innings have a winning percentage of .417, but if they score in three innings, they jump to .623. If you score in three *or more* innings, you'll be .711.

But that's irrelevant. Obviously, only runs are important. If your opponents score more runs than you, you lose. Even if they scored them all in one inning, you still lose. And it's not like "scoring in at least three innings" is a skill you can teach your team independent of scoring runs in general. So what's the point?

In the game La Russa complained about, his team scored four runs in the seventh and won 4-1. Someone should show him page 173 of the 1986 Bill James Baseball Abstract, where James found that if you score four or more runs in a game, you'll be .747. (It's probably lower now in this high-offense era, but the point still holds.)

In any case, I don't think this is anything that La Russa can screw up the team with ... there are few reasonable managerial strategies that increase the number of scoring innings while decreasing the number of runs. And maybe La Russa didn't really mean what it sounds like he meant; perhaps he was just making conversation for the reporter.

Wednesday, May 16, 2007

Ripsometrics revisited

The second annual Rock-Paper-Scissors tournament took place in Las Vegas recently. According to the New York Times, it was a big event. Bud Light was the sponsor, and ESPN was there.

It seems like the game should be just luck. Predictably, adherents disagree:

Some view the game as a matter of pure luck, but others, like [league founder] Mr. [Matti] Leshem, say there is a clear-cut strategy that involves learning to read an opponent, much as poker players do.

Free-agent horses: a better hobby than an investment

A few months ago, I wrote that professional sports teams return very little profit compared to other investments. My argument was that owning a team is the way for rich people to gain fame and have fun playing fantasy sports with real money.

Here's an argument from William Baldwin of Forbes that says horse racing is the same thing – a supposed business that's really just a hobby. Baldwin calculates that, across all racehorse owners, the cost of upkeep is $2 billion, but there's less than $1 billion in purses to go around. Conclusion: owners know that they stand to lose half their investment, but continue anyway because they're not in it just for the money.

Baldwin's article is actually an editor's note commenting on another article that appeared in the same issue. That story, by Dan Seligman, talks about a horse auction where animals go for millions of dollars, a lot like promising free-agents. Seligman is one of my favorite journalists (although I haven't seen much of his work lately). He's not afraid of numbers, and he understands what they mean and what they don’t mean; I don't recall a statistical argument of his that I thought was unsound.

In this article, he does a bit of original sabermetrics, finding a correlation of 0.4 between horse "tryout camp" results and eventual sale price. I trust his findings a lot more than if he had gone out and quoted some supposed expert somewhere.

Additionally, the article suggests that free agent horses are wildly overpaid. Of 22 horses that sold for over $1 million, only one earned back his million in purses. A lot of their income comes from stud fees -- old blue-chip horses siring young speculative horses.

If Forbes is right about the horse business, you should expect to find it all privately owned (like baseball teams and Picassos). It would be difficult to sell shares in a business where you lose more than 50% of your capital every year.

Saturday, May 12, 2007

MLB standards allow lots of ball juicing

Baseball offense is way up in the last 15 years or so, and one theory is juiced baseballs. However, MLB insists that the balls are tested every year to make sure they're consistent.

But James Sherwood, the professor in charge of actually testing the baseballs, says that the standard of what constitutes an acceptable baseball is way too broad:

"Their testing window is this big," he says, his hands a foot apart. "I don't know why it was ever set that wide." A ball testing at the high end could travel as much as 50 feet farther than one falling on the low end, he says. That's the difference between a lot of home runs and a whole lot of home runs.

You'd think, though, that if MLB was juicing the ball deliberately, the quality control guys would have been warned to keep their mouths shut.

Wednesday, May 09, 2007

Racial bias and NBA referees -- Part 3

The study found that significantly more fouls were called when the referees were of different race than the perpetrator, and/or the same race as the victim. Could there be other explanations than racial bias?

Here's one, suggested by Guy in the comments to part 1:

... what if white players disproportionately commit certain types of fouls more than blacks, and other fouls less? And what if black and white refs, for cultural/historical reasons, also employ different standards for different types of fouls -- blacks quicker to call X foul, but slower to call Y? And suppose these both reflect differences in the "black" and "white" game, such that white refs are more sensitive to the kind of fouls committed more by black players. Wouldn't that then create the result the authors found?

I think it would.

It shouldn't be too difficult to check, if data on foul types is available. You can see if whites tend to take different fouls than blacks. You can check if white referees tend to call different fouls than black referees. Then, you can adjust all the numbers to see if the bias still holds up.

In baseball terms, it's like adjusting baseball numbers for park effects, and then for home/road – the arithmetic is pretty simple.

Let us assume refs have at some point played basketball. Let us also assume black players play a more aggressive, foul-inducing basketball. Now, the white refs would have played with a "white" style while playing, and would be more prone to see aggressive "black" play as fouls, while black refs would see that style as normal and be less prone to call fouls.

So, taking a white player, both refs would likely see the same behavior as fouls.

On the other hand, an aggressive black player would more likely be seen as fouling by a white ref who played a less aggressive style than by a black ref who is used to the more aggressive style.

Andrew's theory requires that blacks play more aggressively in general, where Guy's requires that blacks and whites take different kinds of fouls. But I think they're really the same theory: both say that referees see certain aspects of the game differently because of race, and both say that players play certain aspects of the game differently because of race.

I like this theory, and I bet it figures at least a little bit into the observations. As Andrew says, black referees and white referees grew up with different views of the game of basketball, just as they undoubtedly grew up with different views of a lot of things in life. It would be a huge coincidence if the interaction between referee tendencies and player tendencies turned out to be exactly zero.

By the way, the study considers that black and white players may have different styles. But it argues that style can be predicted from a function of a player's statistics, and several of their regressions considered such factors. (Specifically, they say that trying to predict race based on all those individual statistics yields a decent r-squared of 0.2.) But I'm skeptical that the connection between stats and style is really enough to say they're the same thing. I can easily imagine two players, both with exactly the same stats (including fouls), where the two players foul in different ways.

So should we conclude that referees are racially biased? I think this theory is enough to at least show reasonable doubt.

----

Here's another possibility. There's something economists call "rational discrimination," where people make decisions about a person based on the statistical profile of their racial group, even though they might not have any bias against the group in general. For instance,

Jesse Jackson once commented, “There is nothing more painful for me at this stage in my life than to walk down the street and hear footsteps and start thinking about robbery—then look around and see somebody white and feel relieved.” (source)

We can assume that Jesse Jackson isn't prejudiced against his own race; he's just noting the unfortunate statistical fact that muggings are disproportionately committed by blacks. His relief is "rational," in the sense that he does indeed have less of a statistical chance of being mugged than he thought.

Often, when two players make contact, it's hard to tell which player is at fault. Suppose that referees have noticed – consciously or unconsciously – that when a certain situation happens, and it's between a white player and a black player, it's usually the white player's fault: say, 80% of the time. But the ref only gets a good view three-quarters of the time. What should he do the other times?

Presumably, he has to call a foul on at least one of the players, and he should make his best guess possible. Should he consider race? If he's completely unsure who's at fault, should he give the foul to the white player, who has an 80% probability of having done the crime? Or should he remain race-neutral, and flip a mental 50-50 coin?

I think he'd have to call it on the white player 80% of the time. Otherwise, the white player will have an incentive to automatically foul the black player, knowing that the opponent will be falsely accused more than he should be.
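The options in this hypothetical can be compared with a little arithmetic. The sketch below takes the made-up 80% figure at face value and computes, for the plays the referee cannot see clearly, how often the foul lands on the player actually at fault under each guessing policy. The function name and the policy labels are mine, introduced only for illustration.

```python
def share_blamed_correctly(p_call_white, p_white_at_fault=0.80):
    """Fraction of unseen plays where the foul goes to the player at fault.

    p_call_white: how often the ref gives the unseen foul to the white player.
    p_white_at_fault: the hypothetical 80% figure from the example above.
    """
    right_when_white = p_call_white * p_white_at_fault
    right_when_black = (1 - p_call_white) * (1 - p_white_at_fault)
    return right_when_white + right_when_black


policies = [
    ("race-neutral coin flip", 0.5),
    ("call white 80% of the time", 0.8),
    ("always call white", 1.0),
]
for name, p in policies:
    print(f"{name}: {share_blamed_correctly(p):.0%} of unseen fouls land on the right player")
```

This only measures accuracy on the unseen plays. It says nothing about the incentive effects on players, which are the basis of the argument above.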

Now, suppose that black referees have noticed the 80% figure more than the white referees. That would account for the "same-race" effect. It might still be caused by bias, but a different sort of bias. The black referees were able to spot the differences in style of play. But the white referees, perhaps out of unconscious bias, did not.

I have to say that I don’t think this is the true explanation. But it does have the advantage that it predicts same-race favoritism among both perpetrators and victims. Plus, if it is correct, the bias is easy to fix – just share the 80% figure with the white referees.

-----

Ian Ayres, one of the economists the New York Times called on to review the paper, believes that the findings are indeed the result of unconscious bias. He argues that such bias against blacks is pervasive, and that while well-meaning people can consciously overcome it when given time, that's not possible when making split-second judgements, no matter how hard they might try.

Ayres refers to the "Implicit Association Test," or "IAT." That's a little experiment that tries to measure your bias. Roughly speaking, it works like this. Suppose you want to figure out if you're biased against blacks. They give you two timed tests. In the first test, you have to divide a combination of words and faces into two groups. Group 1 contains words that are "good" (like "celebration" or "joy") and faces that are white. Group 2 contains words that are "bad" (like "evil" and "sadness") and faces that are black. So you have to separate "good or white" from "bad or black."

In the second test, they switch you up, so that you have to separate "good or black" from "bad or white."

The idea is that if you already associate white with goodness and black with badness, the first test will be much easier than the second one, and you'll finish it much faster.

I was skeptical, until I actually found some online tests. I took the one that checks if I'm biased towards Canada over the USA. I'm Canadian, but not particularly patriotic (except when it comes to hockey), and I'm more pro-American than most of my friends. But it freaked me out how much easier it was when the pictures of Canada were paired with the "good" words. I came out as having a "strong automatic preference" for Canada, the largest bias possible under the test.

Ayres thinks this test (for black vs. white) could serve as a proxy for estimating referee bias. I think his theory is reasonable, that the IAT probably correlates with decisions made on the court. It would be nice to have some evidence, of course, and Ayres does suggest giving all the referees the test to see how it matches up to the individual referee "bias" scores in the study. (To get the refs to take the test, you might have to have a third party administer it, and guarantee that only the correlations will be released, and not individual results. Also, you would probably have to hold a gun to their loved ones' heads.)

Now, I have to say that while I think the IAT probably measures something real, I don't think it necessarily measures racist attitudes. It could just show preference for one's "in-group" – and that preference is universal. According to Wikipedia,

Experiments in psychology have shown that group members will award one another higher payoffs even when the "group" they share seems random and arbitrary, such as having the same birthday, having the same final digit in their U.S. Social Security Number, or even being assigned to the same flip of a coin.

I believe this is true. I'm as big a skeptic of astrology as anyone, but if you did an IAT by zodiac sign, I'm pretty sure I'd show a strong preference for my own. (Go, Capricorn!) So I think that part of what the Price/Wolfers study found as "racial discrimination" is actually just in-group bias.

That is: I think if referees do indeed favor players of their own race for non-basketball reasons, they would also favor players of their own religion, or players from the same school, or players from the same city, or players who support the same political causes. (More importantly, I think that referees may favor certain teams over others. Any time I watch a sporting event between two teams I never heard of, it isn't long before I start arbitrarily rooting for one of them. But maybe that's just me.)

Admittedly, I have no proof of this, and a possible counterargument comes from the IAT itself. As expected, white people taking the IAT come out with a preference for white over black. But black people also prefer white over black! (The IAT site suggests that there's more than just in-group preference happening – see answers #8 and #9.)

A different finding supports the idea that the preference for whites comes from something other than irrational racism. Ayres writes,

Simply showing people a picture of Michael Jordan (or Bill Cosby) reduces implicit racial bias in the IAT. This suggests that people so much want to "Be Like Mike" that he transcends racial categories, making us think less about race.

That implies that, to some extent, it wasn't about race in the first place. Real racists don't care if you're Michael Jordan or Bill Cosby – you still can't get into their club, you're still ordered to sit at the back of the bus, and you still have to drink from the "colored" fountain. It seems to me that if looking at Bill Cosby reduces your unconscious bias against blacks, it's really an in-group thing – you identify with Cosby, you feel that he's somehow one of you, and that winds up extending to all blacks.

Anyway, I'm speculating here, and it seems like debates on the IAT have been going on for years, by people who are smarter than me and have more than ten minutes' experience with it. Still, when Ayres suggests that the IAT should be used by the NBA when making referee hiring decisions, I'm not convinced.

For one thing, if you trust the IAT, then referee bias must be fluctuating all the time. If just looking at Bill Cosby can make you feel less biased, wouldn't an ugly disagreement with a black player make you feel more biased?

And second, could it really be true that blacks would be less prejudiced against whites because they favor whites on the IAT? Just two days ago, King Kaufman wrote,

An ESPN-ABC News poll has found that black baseball fans are more than twice as likely as white fans to be rooting for Barry Bonds to break Henry Aaron's career home run record, and white fans are more than twice as likely as black fans [76% to 37%] to believe that Bonds knowingly used steroids, more than three times as likely to believe he shouldn't go to the Hall of Fame.

Now, these numbers are so far apart, that you've got to believe that one side must be significantly biased. If you're a white person, and believe that whites are very biased but blacks aren't, you are forced to accept the black consensus on the Bonds issue. You've also got to accept that O. J. Simpson might be innocent (73% of blacks think so, versus 13% of whites). And on any other issue that has to do with race, where bias could come into the equation, you need to figure that black opinion, as unbiased, is much more likely to be correct than white.

I'm not prepared to do that. Sometimes, I agree with the consensus "white" point of view. Sometimes, I agree with the consensus "black" point of view. Sometimes, I think the truth is somewhere in the middle.

So I think it's overwhelmingly likely that both blacks and whites are biased in favor of their own race, regardless of their average results on the IAT.

The important question then becomes: if in-group bias is universal, can referees be trained to overcome it in split-second decisions? I don't know, and I don't even know how you would check.

----

Anyway, I guess I can sum up my evaluation of the study as:

1. I think the effect the study found is real, that players benefit when referees are of the same race.

2. I think the study cannot and does not show whether white referees are biased, whether black referees are biased, or whether both are biased. I don't think you can even tell which way they're biased.

3. The Guy/Andrew theory – interaction between types of fouls taken and styles of fouls enforced – is a possible explanation for the observed effect.

Also, my gut says – without necessarily any evidence:

4. The Guy/Andrew theory probably only accounts for a small part of the effect, if any (although I am open to contrary evidence).

5. There probably is a real bias effect happening, but "in-group" bias is a more likely explanation than race bias.

6. No matter how non-racist or anti-racist the referee, such bias might be impossible to eliminate.

7. There are probably many similar "in-group" biases that referees have, in all sports, but that nobody has studied yet.

For these reasons, knowing about the same-race effect won't really interfere with my enjoyment of the game. While I think the small amount of bias that the study found might be real, I suspect it's just part of human nature. So long as there isn't any conscious racism, and the league and referees are trying their best to improve and be fair – and I believe both conditions to be true – I don’t think the integrity of the game is in jeopardy.

Friday, May 04, 2007

Racial bias and NBA referees -- Part 2

This is part 2 of my comments on the NBA racial discrimination study. Part 1 is here. The NYT article on the study is here.

The study found a difference in the rates of how white and black referees call fouls against white and black players – notably, white referees call fewer fouls than expected against white players.

From this, the authors of the study conclude that, because most of the referees in the league are white, it would be advantageous to hire more white players and fewer black players. I don’t think that's correct, for several reasons, and I will explain why.

I'll start by describing the paper's approach. There were many different tests and regressions, but the first one, and the simplest one, is not that different from the more complicated ones. So I'll use that one.

This is the first chart in Table 3, which is below. The numbers are fouls called per 48 minutes, by race of the (majority of the three) referees and the perpetrator:

                      black players   white players
majority white refs   4.33            4.95
majority black refs   4.33            5.02

You'll notice that where black players are concerned, the race of the referees didn't matter: fouls were called at almost exactly the same rate by both races. ("Almost" because the numbers are rounded.)

But white players were caught much less often by white referees than by black referees, 4.95 fouls to 5.02 fouls, a difference of .07 fouls every 48 minutes.

And so, you can make this argument: in the left column, which represents black players, the top number is exactly the same as the bottom number. If white referees were as consistent for white players as they were for black players, the second column would look the same, and the top number there would also be equal to the bottom number. The chart would then look like this:

                      black players   white players
majority white refs   4.33            5.02 (was 4.95)
majority black refs   4.33            5.02

This new chart represents how you expect it "should" be in the absence of bias. The top-right cell (the one that changed) represents white referees making calls on white players. So translating the change of chart into English, you could say:

1. "White referees didn't call as many fouls on white players as they should have."

The "1." in front of the statement is because, as you've probably figured out, there's more than one way to adjust the chart. Instead of changing the top-right cell, we could change the bottom-right cell, so that the black ref number becomes equal to the white ref number:

                      black players   white players
majority white refs   4.33            4.95
majority black refs   4.33            4.95 (was 5.02)

This is just as consistent as the first way. The interpretation here is:

2. "Black referees called more fouls on white players than they should have."

These two statements, number one and number two, are equally consistent with the data. There is no way to determine which one is true. Indeed, they could both be true: maybe white refs were too lenient on white players, AND black refs were too hard on white players. Or it's possible that all refs are more lenient with white players. Maybe the "real" number should have been even higher than 5.02. Maybe blacks let lots of fouls go, but whites let even more fouls go! Or maybe both refs were tougher with white players than black players – but white refs were not as much more tough as black refs were more tough.

We can reduce that mess to one principle, which we can say at least three different ways:

(c) regardless of whether black referees were too tough or too lenient on white players, the white referees were either less tough or more lenient.

The statements are three different ways of saying the same thing. The first way kind of implies that it's the white referees who are biased. The second way implies that it's the black referees who are biased. And the third way implies that maybe all the referees might have been biased.

Any of those is possible. There's evidence of bias, but absolutely no evidence of who is biased.

But, wait, we're not finished: there are still other interpretations of the chart. This time, start with the right column. White refs gave white players 0.07 fewer fouls than black refs. If white refs are 0.07 more lenient towards white players, they should also be 0.07 more lenient towards black players. That is, the left column should look the same as the right column! And so, instead of giving them 4.33 fouls per game, they should have given them only 4.26:

                      black players     white players
majority white refs   4.26 (was 4.33)   4.95
majority black refs   4.33              5.02

This gives rise to a third interpretation:

3. "White referees called more fouls on black players than they should have."

And there's one more interpretation, which happens when you change the bottom-left cell:

                      black players     white players
majority white refs   4.33              4.95
majority black refs   4.40 (was 4.33)   5.02

4. "Black referees called fewer fouls on black players than they should have."

(Digression: to some, interpretations 3 and 4 are perhaps less intuitive than 1 and 2. In the first two charts, black referees and white referees treated black players exactly the same. This satisfies our wants and expectations for NBA officiating, that the race of the referee should make no difference. It seems like those numbers somehow must be right, and making them unequal is the wrong thing to do.

But I think that prior expectation is unrealistic. Why shouldn't there be differences in refereeing style between whites and blacks? There are differences in basketball style. There are differences in culture, in politics, and many other areas. That's one of the reasons employers like "diversity" – because people of different races can bring different attitudes and methods to the business. So why shouldn't referees be a bit different by race?

Sure, the NBA tries to enforce consistency, so you wouldn't expect any large differences. But isn't it acknowledged that some referees call a tighter game, and some call a more lenient game? Perhaps the division isn't exactly 50-50 between white and black.

And, indeed, other data show a difference: in one of their subsequent breakdowns, it was found that black referees were slightly more strict than whites even in fouls called against blacks.

Anyway, I include this digression because the authors of the study don’t really consider the possibility of non-bias-related differences between the two groups of referees. But if you do consider such differences, you can come up with explanations for the data other than racial bias (which I'm saving for part 3). The authors didn't do that, and neither did most of the opinions on the subject that I've read.)

So what can we conclude, not knowing which of the four possibilities is correct?

First, that there's some bias going on. But does it favor white players or black players? And is it by white referees or black referees?

Under 1 and 2, it's bias for and against white players; under 3 and 4, it's bias against and for black players. Under 1 and 3, it's bias by white referees. Under 2 and 4, it's bias by black referees.

So we don't know who benefits, and whose 'fault' it is.

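What we do know is that, under any of the four possible interpretations, white players do relatively better with white referees, and black players do relatively better with black referees.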

And that's why, in the study, the authors repeatedly refer to "own-race bias" instead of "white bias" or "black bias." Every player benefits when more referees are of his own race.

(And one quick point: there is a fifth interpretation, which is that some combination of the other four is right! Maybe black refs both favor blacks and stiff whites. Maybe white refs both favor whites and stiff blacks. And when you put them all in a chart, you get what we got.)

------

What is the magnitude of the "own-race" bias? You'll notice that in every one of the four cases, we had to change one of the cells by exactly the same amount: 0.07. (It always has to work out that way: no matter which of the four cells you change, it has to change by exactly the same amount.)
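Here's a minimal Python sketch of why it's always 0.07. This is my own illustration, not anything from the study: the dictionary keys and the two function names are made up for the example, and the only real inputs are the four numbers from the chart. The idea is that the "own-race" effect is the difference between the white-minus-black gap under black refs and the same gap under white refs, and nudging any single cell moves that difference one-for-one.

```python
# Fouls per 48 minutes from the chart: (referee majority, player race) -> rate.
chart = {
    ("white_refs", "black_players"): 4.33,
    ("white_refs", "white_players"): 4.95,
    ("black_refs", "black_players"): 4.33,
    ("black_refs", "white_players"): 5.02,
}


def interaction(t):
    """How much the white-minus-black foul gap grows when the crew
    goes from majority white to majority black."""
    gap_white_refs = t[("white_refs", "white_players")] - t[("white_refs", "black_players")]
    gap_black_refs = t[("black_refs", "white_players")] - t[("black_refs", "black_players")]
    return gap_black_refs - gap_white_refs


def adjustment_needed(t, cell):
    """Amount to add to a single cell so the interaction becomes zero."""
    ref, player = cell
    # Raising a cell moves the interaction by +1 or -1 per unit,
    # depending on which corner of the 2x2 chart it sits in.
    sign = 1 if (ref == "black_refs") == (player == "white_players") else -1
    return -sign * interaction(t)


print("own-race effect:", round(interaction(chart), 2))
for cell in chart:
    print(cell, round(adjustment_needed(chart, cell), 2))
```

Every cell comes back with an adjustment of 0.07 in one direction or the other, which is just the four interpretations again: each one picks a different cell to "fix."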

If you can change the race of a single one of your (48-minute) players, so that it now matches the majority of the referees that game, you should reduce your fouls by 0.07 (or about one foul per 15 such games). If, somehow, you could change all five players, it would be one foul every three games – or about one point per game. That's a lot. But it's hard to have two sets of five players who are exactly equal in every respect but race, which renders this strategy fairly unworkable.

Can you gain an advantage by switching race for an entire season, instead of by the game? I will soon argue that you cannot. But it sure seems like you can. Here's the logic:

Since two-thirds of NBA referees are white, most games (in theory, about 74%) would have a majority of white referees. So if you replace a black player with a white player all season, he'll have the advantage in 74% of games, and a disadvantage in 26% of games. 74% minus 26% is about half, so the advantage of that is about 0.035 fouls per game – one foul in 30 games. If you assume that the victim benefit equals the perpetrator benefit, double that to one foul in 15 games. That's about six fouls over a season. What's that – maybe a half a win or so?
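For anyone who wants to check that arithmetic, here's a rough Python sketch. It assumes each of the three referees is independently white with probability two-thirds, that the per-game effect is the 0.07 from the chart, that the victim effect is the same size as the perpetrator effect, and that the player plays all 82 regular-season games. The function and variable names are mine.

```python
from math import comb


def prob_majority_white(p_white=2 / 3, crew_size=3):
    """Chance that more than half of a randomly drawn crew is white,
    assuming each referee is independently white with probability p_white."""
    need = crew_size // 2 + 1
    return sum(
        comb(crew_size, k) * p_white**k * (1 - p_white) ** (crew_size - k)
        for k in range(need, crew_size + 1)
    )


p = prob_majority_white()        # 20/27
net_share = p - (1 - p)          # games with the edge minus games without it
per_game = 0.07 * net_share      # net fouls saved per game as the fouler
per_game_both = 2 * per_game     # assumes the victim effect is equally large
season = 82 * per_game_both      # 82-game regular season

print(f"majority-white crews: {p:.3f}")
print(f"net fouls per game:   {per_game:.3f}")
print(f"fouls per season:     {season:.1f}")
```

That gives roughly 0.74 for the share of majority-white crews, about 0.034 fouls per game before doubling, and about five and a half fouls over a season, in line with the rounded figures above.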

The New York Times article recaps the issue like this:

Their results suggested that for each additional black starter a team had, relative to its opponent, a team’s chance of winning would decline from a theoretical 50 percent to 49 percent and so on, a concept mirrored by the game evidence: the team with the greater share of playing time by black players during those 13 years won 48.6 percent of games …

But I don't think that's right. The overall tendency of players to avoid and draw fouls will be accounted for in their statistics and evaluations. And so if teams consider fouls for and against when making personnel decisions, black teams should be exactly as likely to win as white teams. The GM may not know why player W takes fewer fouls than player B, but he doesn’t care – he'll hire player W even without knowing.

Teams are rational. If you had two players, one white and one black, equal in talent (not including the effects of racial bias), the white player would have better stats than the black player – because he plays against white refs more often! Therefore, he'd be contributing more to the team, and he would already have the job.

That is: even though the racial bias is invisible, the results are not. And teams would already have made the optimal player moves to correct for it, just as they correct for every other player characteristic.

That's in theory: in practice, that might not be entirely correct. It's possible that some of the effect has gone unnoticed – most likely, a player's ability to induce fouls on the other team. But if that's the case, then the problem is that teams aren't measuring it. The standard economic assumption is that teams are rational, and would notice it. As an empirical question, my gut says that teams would notice a fair bit of it, maybe most of it.

Look at it this way. Suppose that 25 years ago, a racist demon started giving every white baby an extra dollop of basketball talent. Nobody knew that until today, when a study came out proving it. Does that mean that teams should immediately rush out and get white players? Of course not. The white players' extra talent was obvious, even if the cause wasn't, and all the demon-gifted great young white players were already snapped up and are now playing.

Similarly, the effects of race-biased refereeing have already been factored into which players have jobs today. Which race got the extra jobs, blacks or whites? We don’t know, because we don’t know which way the bias goes. If the hidden truth is that referees are biased in white players' favor, then some white players have jobs they wouldn’t have otherwise. If the bias is in black players' favor, then the black players have the extra jobs. Just as you would expect.

But there's no way to tell which way it goes. The study provides insufficient evidence to tell whether the bias favors white players or black players.

------

In Part 3, I'll argue that the data can be explained by other theories than racism. One such possibility – and I think it's a good one – is discussed by Guy here, and by Andrew and Guy in the comments to this post.

Thursday, May 03, 2007

Racial bias and NBA referees -- Part 1

A new study says that NBA referees are racially biased.

The study came to light Wednesday in an article by Alan Schwarz on the front page of the New York Times. To his credit, Schwarz did a bit of investigating before running the article -- he had obtained an advance copy of the study, and he sent it to three prominent economists (Ian Ayres, David Berri, and Larry Katz), who pronounced it valid.

Here's the study, courtesy of the NYT website. It's called "Racial Discrimination Among NBA Referees," by Joseph Price and Justin Wolfers.

Price and Wolfers looked at every regular season NBA game from 1991-92 to 2003-04. First, they noted that white players took more fouls than black players (after normalizing to 48 minutes on the court):

4.330 fouls/game – black players
4.970 fouls/game – white players

The authors are quick to point out that this statistic, by itself, is not evidence of discrimination; the two groups are probably just different. "The difference in foul rates," they write, "largely reflect the fact that white players tend to be taller, heavier, and more likely to play center than black players."

(By the way, the Times article gets this wrong; in its second paragraph, it says "white referees called fouls at a greater rate against black players than against white players." That isn't true – both white and black referees called significantly more fouls against white players.)

But even if white players take more fouls than black players for non-racial reasons, you can still check for referee bias. There are three referees per game, and you can measure the difference in black/white fouls with more black referees, and compare it to the same stats with fewer black referees. The results:

0 white refs: whites take 0.827 more fouls than blacks
1 white ref: whites take 0.675 more fouls than blacks
2 white refs: whites take 0.654 more fouls than blacks
3 white refs: whites take 0.574 more fouls than blacks
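To put a rough number on that trend, here's a small Python sketch that fits a straight line through those four points. It's only a back-of-the-envelope illustration of my own: the fit is unweighted (it ignores how many games each crew type actually worked), it is not the regression the authors ran, and the names are made up for the example.

```python
# Gap in fouls per 48 minutes (white players minus black players),
# keyed by the number of white referees on the three-man crew.
gap_by_white_refs = {0: 0.827, 1: 0.675, 2: 0.654, 3: 0.574}


def ols_slope(points):
    """Ordinary least-squares slope of y on x for a dict of {x: y}."""
    xs = list(points)
    ys = [points[x] for x in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


slope = ols_slope(gap_by_white_refs)
total_change = gap_by_white_refs[3] - gap_by_white_refs[0]

print(f"change in gap per extra white ref: {slope:.3f}")
print(f"all-white crew vs. all-black crew: {total_change:.3f}")
```

On these four numbers, the gap shrinks by about 0.08 fouls for each additional white referee, and by about a quarter of a foul going from an all-black crew to an all-white one.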

Now, if you're a skeptic like me, the first thing you usually do, after a result like this, is to think of other reasons that the effect shows up. For instance, in baseball, if someone found that player X hits better on weekends than weekdays, you'd say, "Aha! That's because weekday games are played during the day, and weekend games at night. So what's really being measured is day/night effects, not weekend/weekday effects!" And you'd probably be right.

Is there a similar argument that can be made in this study? Nope. The equivalent to weekend/weekday in this study is white refs/black refs. So to make the day/night argument, you have to find something that correlates with the race of the referees. But you probably won't – because the NBA assigns the referees almost completely at random!

So it doesn't matter what internal correlations there are within the rest of the data. Maybe games in certain arenas are more aggressive, and should have more fouls. It doesn't matter, because black and white referees are equally likely to be assigned to those games. Maybe different players have different styles. It doesn't matter, because black and white referees are equally likely to be assigned to those players. And so on, for anything you can think of.

Now, it *is* possible that despite the referees being assigned randomly, whites were assigned to more high-foul games just by chance. But that's what the significance tests are for, and it turns out the probability of that is very small. And, just to make sure, the authors ran a couple of other tests, and couldn't find any significant differences between the white-referee games and the black-referee games.

And, just to make completely sure, the authors ran a bunch of regressions that corrected for all kinds of factors: player characteristics, home/road, stadium, and so on. They even ran one mega-huge test, where they used dummy variables for every different player, and for every different referee! And for all those tests, the measured effect of the race of the referee was pretty much the same. And that's as expected, because, no matter how many factors you correct for, the referees were assigned randomly to those factors anyway. The only possibility is that some weird random things happened, like white referees were assigned to all the games with the basketball equivalent of Donald Brashear – but the authors' extra tests show that didn't happen.

So the differences are real. The next question is, what do they mean? And do they prove racial bias against blacks? I'm thinking about some of this stuff, and I'll probably discuss some of it in another post, which is why this is "part 1".

For now, I'll list a few more of the findings from the study:

1. The effect is mostly limited to the frequency with which referees call fouls on whites:

Blacks are called pretty consistently regardless of the race of the referee. This is kind of interesting, but I'm not sure it necessarily means anything (more next post).

2. There is a similar race effect based on the victim of the foul, not just the perpetrator. It's roughly equal in size. This means that in some ways, the effect is double what it appears to be from the above results.

Since box scores don't identify the victim, his race had to be approximated by the "average race" of the team. This increases the variance of the estimates, but they were still all statistically significant.

3. If I understand the regression coefficients properly, the real-life effect is this: Going from the extreme case of "all black players and all white referees" to "all white players and all white referees" would save a team about 2 extra foul calls a game (one fewer as perp and one more as victim). It also leads to a 5-point difference. (This assumes that the new white players are exactly as likely to commit the same foul-related plays as the black players. In real life, white players foul more.)

However, this 5 points is an unrealistic measure of the effect. In real life, there are 67% white referees, and five players on the court. A more realistic example: if you replace one black player with a white player, you improve by 67% of 1/5 of 5 points, which is 2/3 of a point. That's still pretty significant, but doesn't really affect strategy much at all.
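That scaling can be written out as a few lines of Python. It's only a sketch of the rough calculation above, with variable names of my own choosing; the 5-point figure is my reading of the regression, and treating the effect as proportional to the share of white referees is a simplifying assumption.

```python
full_swing_points = 5.0    # all five players swapped, all-white crew
players_on_court = 5
share_white_refs = 0.67    # roughly two-thirds of NBA referees

# One player swapped, still assuming an all-white crew.
per_player_all_white_crew = full_swing_points / players_on_court

# Scale down by the share of referees who are actually white.
per_player_realistic = share_white_refs * per_player_all_white_crew

print(f"one player, all-white crew: {per_player_all_white_crew:.2f} points")
print(f"one player, realistic mix:  {per_player_realistic:.2f} points")
```

Changing `share_white_refs` or `full_swing_points` shows how sensitive the 2/3-of-a-point figure is to those two inputs.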

And keep in mind, too, that if two teams have an equal percentage of minutes played by whites, they'll cancel out and neither team has the advantage.

Also, note that if referee bias gives whites fewer fouls, that will show up in their statistics (fewer turnovers, for instance, since an offensive foul counts as a turnover). So the "Moneyball" effect is not 100% of the race effect – some of the advantage should be priced into white players' salaries already. At least, that's true for the part of the advantage that shows up in stats. For the other part – for instance, if white players are more likely to draw opposition fouls, but teams don’t keep track of that – maybe there's a slight opportunity to exploit there.

Overall, I think the race effect doesn't really affect the outcome of the games very much. But, obviously, the study raises more important issues than just basketball strategy.