Sabermetric Research

Saturday, March 31, 2007

Rugby: visiting teams more aggressive than home teams

At least one study has shown that players on a home team will have higher levels of testosterone than players on the visiting team. Does that mean home players will be more aggressive?

Apparently it's the opposite. A study called "Game location and aggression in rugby league," by Marc V. Jones, Steven R. Bray, and Stephen Olivier, found that in British rugby, visitors committed more "aggressive behaviors" than the home side.

In order to eliminate the possibility of referee bias, the authors videotaped 21 games, and asked expert observers to look for aggression. They found 632 such acts – 306 from the home team, and 326 from the visiting team.

They broke down as follows:

137 by the home team when the home team won
119 by the visiting team when the visiting team won
140 by the home team when the home team lost
185 by the visiting team when the visiting team lost

It looks like most of the effect comes from visitors who lost. (I assume these numbers don't add to 632 because of ties.)

More interesting is the breakdown by half (again, ties excluded):

341 aggressive acts in the first half
240 aggressive acts in the second half

The authors don't do a significance test for half, but, clearly, there are more instances of aggression in the first half. I wonder if this applies to other sports. Perhaps the famed unwillingness of NHL referees to call penalties in overtime is partly due to there actually being less to call.
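The test the authors skipped is easy to sketch (my calculation, not theirs): under a null hypothesis of no half effect, each of the 581 acts classified by half is a coin flip between halves, so a normal-approximation binomial test applies.

```python
from math import erf, sqrt

def binom_ztest(k, n, p=0.5):
    """Normal-approximation test: are k successes in n trials
    consistent with success probability p?"""
    z = (k - n * p) / sqrt(n * p * (1 - p))
    # two-sided p-value from the standard normal CDF
    two_sided_p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2.0))))
    return z, two_sided_p

# 341 first-half acts out of the 581 classified by half
z, p = binom_ztest(341, 341 + 240)
# z comes out around 4.2 -- far beyond conventional significance
```

So the first-half excess is not a fluke of sample size; a z of about 4 is significant at any conventional level.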

Broken down further:

Teams who eventually won: 165 first half, 91 second half.
Teams who eventually lost: 176 first half, 149 second half.

Almost all the difference between winning teams and losing teams comes in the second half. Perhaps in the first half, the outcome is still in doubt – it's only in the second half that the leading team realizes it's going to win, and finds it less necessary to take out their aggression.

It would have been very interesting to see whether the observers' counts of aggressive acts matched the referees' calls, but the authors don't give that information.

Tuesday, March 27, 2007

Overtime incentives study

When an NHL team loses a game in overtime or in a shootout, it still gets a point in the standings. That means that, for overtime games, three points are divided between the two teams. But for non-overtime games, only two points are divided. So teams have an incentive to allow more games to go into overtime.

I've done a little study to check that. It's still in draft form. You can find it here.

Comments are appreciated. (Also, if anyone knows if there was a rule change or something responsible for the big jump from 2001-02 to 2002-03, please let me know.)
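A toy model of the incentive (my own sketch, not part of the study): under the win=2 / OT-or-shootout-loss=1 / regulation-loss=0 system, two evenly matched teams that let the game reach overtime split more expected points than two that settle it in regulation.

```python
def expected_points(p_win, p_reach_ot):
    """Expected standings points per game under win=2,
    OT/shootout loss=1, regulation loss=0.
    Simplifying assumption: reaching overtime is independent
    of which team eventually wins."""
    p_ot_loss = p_reach_ot * (1 - p_win)
    return 2 * p_win + p_ot_loss

# Evenly matched teams, game settled in regulation: 1.0 point each
# Evenly matched teams, game always reaching OT:    1.5 points each
```

The pot grows from 2.0 to 3.0 points the moment the game reaches overtime, which is the whole incentive in one line.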

UPDATE, March 30: study is now updated (same link). I've added some tests suggested by commenters, and suggested some possible reasons for the observed effects.

Young sabermetrician in the news

Victor Wang, who had an article in a recent "By the Numbers," is profiled in a Minnesota newspaper. The reporter gets the sabermetrics right, but thinks that correlations are something taught in calculus class.

Alan Schwarz wrote about Wang in the New York Times, about a month ago (subscription required).

"By the Numbers" would get a lot more publicity if we had more 16-year-old contributors. :)

Friday, March 23, 2007

Double steals: are managers too conservative?

Dan found that for a double steal, the theoretical break-even percentages (1970-2006) were 60% with nobody out, 53% with one out, and 78% with two outs. Anything less than that, and it's costing the team runs.

With nobody out, teams were successful only 57% of the time, which makes the no-out double steal look like an overused strategy. However, the breakeven percentages assume that it will be the lead runner who's thrown out. But, in real life, it's sometimes the trailing runner. If you take that into account, the 60% theoretical breakeven point is too high. It's probably even less than the observed 57%.
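The breakeven logic can be made explicit. Using illustrative run-expectancy values for the no-out case (rounded numbers of my own choosing, not the 1970-2006 figures from Dan's study), the point about the trailing runner falls right out of the formula:

```python
def breakeven(re_start, re_success, re_fail):
    """Success probability at which an attempt neither gains nor
    loses expected runs: p*re_success + (1-p)*re_fail = re_start."""
    return (re_start - re_fail) / (re_success - re_fail)

# Illustrative run expectancies (assumed, rounded):
RE_12_0OUT = 1.44   # runners on 1st and 2nd, no out (start state)
RE_23_0OUT = 1.94   # runners on 2nd and 3rd, no out (success)
RE_2_1OUT  = 0.68   # runner on 2nd, one out (lead runner thrown out)
RE_3_1OUT  = 0.95   # runner on 3rd, one out (trailing runner thrown out)

be_lead  = breakeven(RE_12_0OUT, RE_23_0OUT, RE_2_1OUT)   # ~0.60
be_trail = breakeven(RE_12_0OUT, RE_23_0OUT, RE_3_1OUT)   # ~0.49
```

Losing the trailing runner leaves a man on third instead of second, so that failure mode is less costly, and any mix of the two failure modes pulls the true breakeven below the lead-runner-only figure.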

Where it gets more interesting is in the one-out case. Teams were successful there 1587 times out of 2258, or 70%. This is substantially higher than the 53% breakeven point. Dan suggests that managers underestimate the value of getting a runner to third with one out, and that's why they don’t call for the double-steal enough.

I should mention that these results aren't necessarily inconsistent with perfectly rational managers. Suppose that sometimes, perhaps with really fast runners and a pitcher with a slow delivery, the success rate is 90%. Other times, the chance is only 30%. If managers are smart enough to go for it only when the chances are greater than the 53% breakeven, the overall success rate will be somewhere between 53% and 90% -- perfectly consistent with the observed 70% rate.
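That selection story is easy to simulate (a sketch with made-up numbers: situational success probabilities uniform between 30% and 90%):

```python
import random

random.seed(1)

BREAKEVEN = 0.53  # one-out breakeven from the post

def observed_rate(n_situations=200_000):
    """Each situation has its own true success probability, drawn
    uniformly from 0.30-0.90 (an assumption). A rational manager
    attempts the double steal only above the breakeven; the data
    then show us only the success rate of attempted steals."""
    successes = attempts = 0
    for _ in range(n_situations):
        p = random.uniform(0.30, 0.90)
        if p > BREAKEVEN:
            attempts += 1
            successes += random.random() < p
    return successes / attempts

rate = observed_rate()
# rate lands around 0.71 -- near the observed 70%, even though
# every individual decision was exactly rational
```

The observed rate is just the average of the attempted situations' probabilities, which here is midway between 0.53 and 0.90.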

However, even if that's the case, we can say that managers are less inclined to call for a double steal than to call for just a regular stolen base. According to "The Book," the overall (single) steal success rate is 69%, which is very close to the breakeven. The 0-out double steal rate is also close to its breakeven, but the one-out double-steal rate is higher than the breakeven. So, overall, the double steal is used more conservatively than the single steal.

The obvious question is: why are managers more conservative with double steals than with single steals? One answer suggests itself from the NFL study finding that teams are kicking field goals on fourth-and-short when they should be going for it: that managers are afraid that the risky play will fail, and their reputation will suffer.

If coaches are generally afraid of being criticized, they will call for unusual plays less often than they should. A double-steal is seen as an out-of-the-ordinary play, for which the manager gets a large part of the credit or blame. A regular steal is routine, so credit usually goes to the runner or the catcher, and managers can call for steals without fear of criticism.

If this theory is true, we should find that the kinds of plays for which managers are second-guessed are consistently underused, especially as compared to plays that are similar but uncontroversial. This double-steal data is mildly supportive of that theory.

Sunday, March 18, 2007

Do women's basketball teams have better stats on the road?

While looking through some references for academic studies my library has access to, I found a very confused study of home field advantage: "Team Quality and the Home Advantage."

The authors define home field advantage (HFA) as "the consistency with which home teams win over 50% of ... games." That can be interpreted many different ways. But it seems that what they mean by HFA is actual home team winning percentage, so that if teams are .600 at home, the HFA is 60%. That's reasonable, I suppose, when applied to a .500 league, but obviously not when applied to an individual team. The 1998 Yankees had the league's best home record (.765), but that doesn't necessarily mean they had the best home field advantage, at least in the normal sense of how people use the expression.

But the authors don't make the distinction, and throughout the paper, they usually treat "home field advantage" and "record at home" as interchangeable. Indeed, they go out of their way to inform us that better teams would be expected to have better home records – and they quote other studies confirming this, as if it's an important finding. It never seems to occur to them to treat HFA as a function of a team's difference between home and road record.

The main part of the paper is a regression on the results of single games. The authors chose a decade's worth of NCAA women's basketball, and used game data for four teams – the two best over the decade, and the two worst. (Since each team played two games against each opponent per season – one at home, and one on the road – the schedule is balanced.)

Then, they ran a regression on whether or not the home team won the game.

(Why did they use a full decade's record to choose the best and worst teams? Because, they say, a single season's record might be flawed, since some players might have been absent part of the season due to injury. But why would that matter? And, besides, doesn't their method have the same problem? In their own data, every player will have been absent at least part of the decade due to, say, having graduated.)

The independent variables in the regression were:

-- a dummy variable for whether one of the "two best" teams was involved;
-- attendance, expressed as a percentage of capacity;
-- distance the visiting team travelled;
-- who won the previous game that season between the two teams, if any;
-- the half-time score of the current game;
-- the difference between the number of rebounds for this team this game, and the number of rebounds for this team in the other game that season, even if that other game hadn't been played yet;
-- the same as above, but for fouls, steals, field goal percentage, and free throw percentage.

This is quite muddled.

First, the half-time score doesn't cause home field advantage, it is caused by it. Do the authors really believe that the home field advantage doesn't have anything to do with scoring, that it can cause you to win games in which the other team scores more points than you do? Do they believe that there's no home field advantage until the second half of a game, so you can just take the first-half score as independent of HFA?

And what about the performance measures, like field goal percentage? Aren't those affected by home field advantage also? Or is the theory that teams play exactly as well at home as on the road, but magical scoring dust somehow makes that identical performance worth more points?

And, why take the difference between the home and road performance, anyway? If you hit for an 80% free throw percentage at home, why would you suspect your chances of winning this game are different if you only shot 70% on the road three weeks ago?

The authors do realize, after the first regression, that having half-time score in the model is perhaps not a good idea, and so they run a second one without it. But they still solemnly conclude that "the results clearly implicate the importance of half-time score on game outcome."

Anyway, I don't think any of the regression results can be taken seriously at all. But there's one very surprising result the authors found that didn't come from the regression: that the low-quality teams actually played better on the road than at home, at least according to certain statistics:

"... high quality teams had a higher field goal percentage at home when matched against the same opponent away for a particular year on 63 separate occasions, whereas they did worse at home than away on only 25 occasions. In contrast, low quality teams did better on the road 51 times and better at home on only 35 occasions."

They also made more steals and committed fewer fouls on the road (but rebounds and free-throw percentage showed no difference).

Can anyone explain this? The only thing I can think of is some kind of park effect. If the bad teams played in arenas with bad visibility, where overall field goal percentages were very, very low, that would explain why both they and their opponents have better stats away from that court.

That explanation seems very unlikely, though. I don't mean to sound disrespectful, but, given the other flaws in this study, is it possible that this particular finding might just be an error?

Do fast players age more gracefully than slow players?

I love this study. Actually, I love most of the stuff Tango comes up with, but this is one of my favorite kinds of studies. It's simple, it doesn't rely on any fancy statistical methodology, and the results are easy to understand and hard to dispute.

Tango found the twenty "best-hitting speedsters" up to age 30, from players born 1941-1960. Then, for each of the twenty, he used some version of similarity scores to find the most comparable slowster – that is, a player similar in every category except speed (3B and SB/CS, presumably – Tango doesn’t give us his actual similarity formula).

Tango observes that part of the fun is seeing who matches up with whom; my favorite, for some reason, is Al Oliver as a slow Ralph Garr.

The conclusion: skilled fast players age at about the same rate as skilled slow players, but manage to hang on for about 12% more career playing time.

This result appears to contradict a similar study from Bill James. In the "Rookies" section in the 1987 Abstract (p. 67), Bill James looked at 40 pairs of players, selected similarly from rookies throughout baseball history. The results were much more dramatic: 43% more major league games for the fast players. James didn't give full stats for the two groups, but he did say that the fast players had only 38% more home runs in 43% more games, which means that the fast players hit for less future power than the slow players.

Why do the two studies show such a drastic difference in the effect of speed on career length? I can think of a couple of reasons, theoretically possible but probably not true:

The James study compared all kinds of young players after only one season; Tango's study compared only older star players with established records. Maybe players slow down most between 20 and 30. Most guys who are slow at 24 are going to be too slow to play by 29; but the ones who stay in shape are as slow at 30 as they're ever going to get, and can stick around longer.

Or maybe fast young guys tend to rise faster than slow guys up to age 27, but decline at the same rate after. That would explain why the two groups are different from age 24-40, but the selectively sampled older players are similar from age 30-40.

Monday, March 12, 2007

Another "value added" paper

The "Value Added" approach for baseball – where a player is credited with the increase in run expectation or win probability arising from his contribution – seems to keep getting independently reinvented. (For instance, see the bottom of page 3 here.)

Here's the latest, from the Retrosheet website, by Kyle Y. Lin and Ken K. Lin. The authors do not cite any non-academic references. They use wins as their measure instead of runs. And they include a nice explanation of the algorithm for recursively calculating theoretical (Markov) win probabilities from Retrosheet data.

Is the distribution of no-hitters "memoryless"?

Here's a new research study (.pdf) by Michael Huber and Andrew Glen, recently published on the Retrosheet website. The paper tries to confirm whether certain rare events in baseball – namely, no hitters, triple plays, and hitting for the cycle – are "memoryless".

A "memoryless" distribution would mean that the odds of seeing a no-hitter (or cycle, or triple play) on any given day don’t depend on whether or not there was a no-hitter yesterday, or last month, or last year. If the distribution is memoryless, it doesn't matter how long it's been since the last one – the probability of seeing the next one should be the same, regardless.
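Formally (my illustration, not the paper's notation): if the per-game probability of a no-hitter is constant, waiting times between no-hitters are geometric, and the geometric distribution is the discrete memoryless distribution: P(wait > s+t | wait > s) = P(wait > t).

```python
def tail(p, t):
    """P(waiting time > t games) for a geometric wait
    with constant per-game probability p."""
    return (1 - p) ** t

p = 0.0005          # assumed per-game no-hitter probability
s, t = 100, 400
conditional = tail(p, s + t) / tail(p, s)
# memorylessness: having already waited s games changes nothing
assert abs(conditional - tail(p, t)) < 1e-12
```

Any deviation from memorylessness in the data therefore implies the per-game probability is *not* constant, which is exactly the era argument below.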

Now, obviously, there are reasons to believe this should not be the case, at least theoretically. Consider no hitters. In the low-hitting environment of the mid-1960s, 27 consecutive batting outs would be a more frequent event than in the modern era. All things being equal, then, we might expect more no-hitters then than now. And so if there was a no hitter on day X, it's more likely that X is in 1967 than 2006, and the odds of a no hitter on day X+1 should therefore be higher. Memorylessness should not be the case.
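A rough number on that era effect (my own crude calculation: treat a no-hitter as 27 consecutive hitless at-bats, ignoring walks and errors, and use approximate league batting averages as the per-batter hit probability):

```python
def no_hitter_odds(league_ba, batters=27):
    """Crude P(no-hitter): each of 27 batters goes hitless,
    independently, with probability (1 - league_ba)."""
    return (1 - league_ba) ** batters

odds_1968 = no_hitter_odds(0.237)  # "Year of the Pitcher" (approx. league BA)
odds_2006 = no_hitter_odds(0.269)  # modern high-offense era (approx. league BA)
ratio = odds_1968 / odds_2006
# the low-offense environment makes a no-hitter roughly 3x as likely
```

Even a modest 32-point swing in league batting average, compounded over 27 batters, triples the odds, which is why era matters.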

But, in practice, the era effect is small. Figure 4 of the study compares theoretical inter-arrival times (that is, the number of games between successive no-hitters) to actual, and the two curves look very close.

The authors do later break down the results by era, but, unfortunately, they don’t show any more graphs. They do give us significance levels, though. They reject memorylessness (at p= .05) for triple plays 1961-76, hitting for the cycle 1901-19, and no-hitters in 1920-41 and 1961-76. Again, these are probably era effects – 1961-76 is not a homogeneous era. It comprises high offense in the early-60s, low offense in the mid-to-late 60s, and part of the DH era.

Also, many of the other eras show effects that are "nearly" significant – for cycles, only two of the six eras show levels above 0.25, and, for no-hitters, only one of six is above 0.25.

So can we still say that the rare events are "roughly" memoryless? This might be one of those cases where one picture is indeed worth a thousand numbers. All those significance levels confirm what we already knew – that playing conditions change over time, and that we have to reject pure memorylessness. But they don't tell us *how close* to memoryless the distributions actually are. Thanks to Figure 4, though, we can probably say "close enough."

Thursday, March 08, 2007

Home field advantage and testosterone

In a previous post on home field advantage (HFA), commenter Nate suggested that evolution might have created an innate increase in performance when humans defend their own turf. He mentioned a study where British soccer players had higher testosterone when playing at home.

In a separate study published this past summer, a Ph.D. candidate in Canada took saliva samples from 14 players on a minor-league hockey team before and after games. The key finding: Levels of testosterone, which have been found to facilitate assertive and aggressive behavior, were 25% to 30% higher before home games, suggesting the home arena triggered players' elemental instinct to protect their territory. "It has the potential to go a long way in developing techniques to create the ideal physiological profile prior to playing," says Justin Carre, the doctoral candidate at Brock University in St. Catharines, Ontario, who co-authored the study.

Monday, March 05, 2007

Can batters decrease strikeouts by reducing their power?

Here is a baseball study called "The Tradeoff Between Home Runs and Contact Rate."

Lawrence W. Cinamon ran a regression on home run rate (HR/AB), as predicted by independent variables for park, season, and contact rate (non-K/AB). He uses two different models, with two different statistical corrections, the first for autocorrelation, and the second for autocorrelation and heteroskedasticity. (I leave it to the statisticians reading this to comment on the corrections, which I don't understand.)

For the first model, Cinamon finds that home run rate increases by .041 percentage points for every one-percentage-point increase in strikeout rate. For the second model, it's .14 percentage points.

So, using the second model, increasing strikeouts from 20% of 500 AB to 21% of 500 AB should increase the number of home runs by .7 of a home run. Roughly, every seven strikeouts equals one home run.
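The arithmetic, spelled out (using the second model's slope from the post):

```python
def tradeoff(delta_k_pct, ab=500, slope_pct_per_pct=0.14):
    """Extra strikeouts and extra home runs from raising
    strikeout rate by delta_k_pct percentage points,
    given the regression slope in percentage points of HR/AB
    per percentage point of K rate."""
    extra_k = delta_k_pct / 100 * ab
    extra_hr = slope_pct_per_pct / 100 * delta_k_pct * ab
    return extra_k, extra_hr

extra_k, extra_hr = tradeoff(1.0)   # 20% -> 21% of 500 AB
# 5 extra strikeouts buy 0.7 extra home runs:
# about seven strikeouts per home run
```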

My comments:

First, if you look at it a slightly different way, the effect is even larger. Another way to think of home run power is per ball in play: HR/BIP, rather than HR/AB. That is, the chance of hitting a home run *given that you don't strike out* increases by more than .14 points. This is consistent with Voros McCracken's model (also used by Jim Albert), where a player's ability is the combination of four skills: (a) walking; (b) striking out when not walking; (c) hitting HRs when not walking or striking out; and (d) getting a hit when not walking, striking out, or hitting a home run.

That is: suppose Dave Kingman could hit a home run every time he put the ball in play – but he struck out 95% of the time. He'd have 25 hits, and 25 HR. We'd still call him a power hitter, despite his .050 batting average.
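The hypothetical extreme Kingman in numbers (treating every non-strikeout at-bat as a ball in play, which ignores walks and sacrifices for simplicity):

```python
def hr_rates(ab, strikeouts, hr):
    """Home-run rate per at-bat vs per ball in play
    (simplification: every non-K at-bat counts as a ball in play)."""
    balls_in_play = ab - strikeouts
    return hr / ab, hr / balls_in_play

# 500 AB, 95% strikeouts, a homer every time the ball is in play
hr_per_ab, hr_per_bip = hr_rates(500, 475, 25)
# a .050 "batting average" as HR/AB, but 1.000 HR per ball in play
```

The two rates tell completely different stories about the same hitter, which is why the HR/BIP framing makes the regression's effect look even bigger.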

Second, Cinamon had a quadratic variable for years of experience, but dropped it because power peaked at 22 years, and "only three players in the sample have years numbered 22 or higher." But that's no reason to drop the variable. (I'm sure you could find diseases that peak at age 100, even though not many people live to be 100.) Cinamon says the result is due to "survivorship bias" – that home runs cause long careers, not the other way around. But both are true. Players do hit for more power as they get older, as Bill James told us many years ago.

In any case, even if there is survivorship bias, I don't see why that should invalidate the results. Cinamon says that "long ball causes long careers." But isn't that *more* reason to adjust for age?

What Cinamon does instead is limit his study to the first five seasons for players' careers. Since the tradeoff between HR and K is very likely to change as a player ages, the study's results now apply only to younger players.

Third, are we sure what the study observes is really a tradeoff? Suppose players are independently either high- or low-strikeout, and independently high- or low- power. And suppose there's no tradeoff possible – players are what they are, and can't change. And everyone makes the majors except the bad group, the high-K low-power group. In that case, you'd get exactly the results you see here: HR hitters (a combination of low-K and high-K) have more strikeouts than singles hitters (who are low-K only). But there would be no tradeoff, just the illusion of one.
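That no-tradeoff world is easy to simulate (a sketch with assumed 50/50 coin flips for each trait):

```python
import random

random.seed(0)

def survivors(n=30_000):
    """No tradeoff at all: power and strikeouts are independent
    coin flips, but the high-K/low-power quarter never makes the
    majors. Returns (high_k, high_power) pairs for survivors."""
    out = []
    for _ in range(n):
        high_k = random.random() < 0.5
        high_power = random.random() < 0.5
        if high_k and not high_power:
            continue   # selected out of the major-league sample
        out.append((high_k, high_power))
    return out

sample = survivors()
power_hitters   = [hk for hk, hp in sample if hp]
singles_hitters = [hk for hk, hp in sample if not hp]
k_rate_power   = sum(power_hitters) / len(power_hitters)      # ~0.50
k_rate_singles = sum(singles_hitters) / len(singles_hitters)  # 0.0
# power hitters "strike out more" -- purely from selection
```

Half the surviving power hitters are high-K, while every surviving singles hitter is low-K, so a regression on the survivors finds a strong K-HR association where no individual tradeoff exists.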

In that model, the effect is 100% selection bias, and 0% tradeoff. Cinamon assumes the effect is 0% selection bias, and 100% tradeoff. What's the real answer? Obviously, it's somewhere in the middle. But where? And how could we design a study to tell?

(Hat tip: Marginal Revolution.)

Friday, March 02, 2007

Home field advantage in speed skating

Most studies of home field advantage (HFA) deal with team sports. This has both benefits and drawbacks. On the one hand, schedules in team sports are normally balanced between home and road games, which makes the effect easy to measure. On the other hand, the effect can only be measured by the outcome of games, which makes it difficult to find where the effect is coming from.

In a study called "Home advantage in speed skating," econometrician Ruud H. Koning tries to isolate HFA for individual speed skaters. To do that, he directly measures the skaters' performance -- that is, their speed.

Unfortunately, though, you can't just measure speed directly -- you have to make all sorts of adjustments. For one thing, there are "park" adjustments, since skaters post faster times in indoor tracks than outdoor tracks. They perform differently at high altitudes than at low altitudes. Skaters as a group perform better over time, dropping a huge 7% off their times between 1987 and 2002. (Some of the improvement is caused by changes in equipment, such as something called a klap skate, which was introduced in 1996.) Average speeds are different for different lengths of race. Competitors, for some reason, perform better at major events such as the Olympics.

And so, Koning had to adjust all the observed speeds, by running a regression of log(speed) on all the above factors. His estimates of HFA were:

Men are 0.2% faster at home than on the road;
Women are 0.3% faster at home than on the road.

I hoped to compare these numbers to the HFA in team sports, like baseball. To do that, we'd want to know the answer to the question "if two exactly equal skaters faced each other, how often would the home skater beat the visiting skater?" Alas, the study doesn't give us enough information to tell. If we knew the standard deviation of results for an individual skater, that would be enough. But we don't have that. (For instance, suppose the SD was 0.14%. Then, the SD of the difference between two equal skaters would be root-2 times that, or 0.2%. The observed HFA would then be exactly 1 SD. The chance of a normal variable exceeding 1 SD in a given direction is about 16%, and so the road skater would be about .160 against the home skater.)
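That back-of-envelope calculation can be written out (my sketch, with the 0.14% SD as a pure assumption; the relevant tail is one-sided):

```python
from math import erf, sqrt

def road_win_prob(hfa_pct, skater_sd_pct):
    """P(an equally skilled road skater beats the home skater).
    Assumes each performance is normal with the given SD (in %);
    the difference of two independent performances then has
    SD = sqrt(2) * skater_sd_pct, shifted by the HFA."""
    z = hfa_pct / (sqrt(2) * skater_sd_pct)
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))   # one-sided tail P(Z > z)

p = road_win_prob(0.2, 0.14)   # the hypothetical SD from the post
# about 0.16 -- the road skater as roughly a .160 "team"
```

With a larger assumed SD, the HFA shrinks relative to the noise and the road skater's chances climb back toward .500; that sensitivity is exactly why the missing SD matters.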

What Koning does tell us is that the average difference between Olympic gold and silver in 2002 was 0.77% for men and 0.45% for women. In baseball, the 162-game difference between home and road is about 13 games, while, typically, the best team in MLB beats the second best team by only a game or two.

So in skating, the distance between first and second is three times the HFA. In baseball, the difference between first and second is maybe 1/5 the HFA. This suggests that HFA has a much, much smaller effect in skating than in baseball.

I would have expected HFA in skating to be much higher. For one thing, there's a lot of luck in baseball -- the inferior team wins a reasonably large proportion of games. But in speed sports like skating, you'd think there's a lot less luck, and individual times would be pretty consistent. In that case, you'd think that any advantage in ability would translate fairly directly into advantage in observed performance.

Put another way: home teams play .540 baseball. But if you compared only strikeout rates, I bet home teams would play higher than .540. And if you compared ball/strike ratios, home teams would be better still. Speed skating seems more like a pure skill, more like throwing strikes than scoring more runs. And so you'd expect a larger effect.

Second, it seems like the regression result is so small that changes in the method might significantly affect it. For instance, the Olympic Games effect is four times the size of the HFA. If there are other confounding factors that weren't included, those could significantly change the size of the observed HFA.

In fairness, I don't know if there are any of those, and Koning seems to know a lot about skating and has added a fair number of variables based on his understanding. (And I think this study illustrates the importance of knowing about the sport you're analyzing -- researchers with little knowledge of skating, such as myself, might have remained ignorant of the sudden introduction of the klap skate, and used a linear variable for year instead of separate dummies to capture the abrupt change.)

But still: why is the Olympic effect so large? Koning makes the argument that the Olympics feature competitors who are at the top of their game. But wouldn't they also be at the top of their game in the years immediately before and after the Olympics? That's an important variable that isn't included -- stage in a skater's career. Would adding that variable change the conclusions?

Or, less importantly, what if the klap skate improved short distance skating more than long distance skating? If you added interaction variables for distance and year, how much would home advantage change then? Probably not much at all, but maybe a lot.

But perhaps I'm nitpicking. Still, given that the effect is so small, and there are so many variables to adjust for, to my mind this study constitutes fairly weak evidence of HFA in skating.