The sabermetric blog of Cyril "Cy" Morong, professor of Economics at San Antonio College

Wednesday, January 7, 2009

Which Players Had The Most Surprising Walk Rates?

A few years ago I noticed that Miller Huggins walked quite a bit yet did not seem to be much of a hitter. From 1904-1916 he led the league in walks 4 times and was in the top ten 7 other times. So what kind of fearsome hitter was he that he got walked so much? His career batting average (AVG) was .265, not bad in the dead ball era. The league average was .260 during his career. His career slugging percentage (SLG) was .314 while the league average was .343.

But a better measure of power is isolated power (ISO), or SLG - AVG. It tells us extra bases per AB (after all, if a guy can get a single every time up, his SLG would be 1.000 yet he has no power). Huggins' ISO was .049 while the league average was .083. So I was very impressed that he knew the strike zone well enough and had so much discipline that he could walk so frequently yet not have much hitting ability in general.
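To make the ISO arithmetic concrete, here is a small Python sketch using the figures quoted above. The function names `iso` and `relative_to_league` are just illustrative, and the relative figure is computed from the rounded numbers in this post, so it may differ slightly from what an encyclopedia would report.

```python
def iso(slg, avg):
    """Isolated power: extra bases per at-bat (SLG minus AVG)."""
    return slg - avg


def relative_to_league(player_value, league_value):
    """Express a player's figure on a scale where 100 is league average."""
    return 100 * player_value / league_value


huggins_iso = iso(0.314, 0.265)        # Huggins' career SLG and AVG
league_iso = iso(0.343, 0.260)         # league SLG and AVG during his career
singles_only_iso = iso(1.000, 1.000)   # a single every time up: no power

print(round(huggins_iso, 3), round(league_iso, 3))
print(round(relative_to_league(huggins_iso, league_iso)))
print(singles_only_iso)
```

On these rounded figures, Huggins' ISO comes to roughly 59% of the league's.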

I thought it would be interesting to come up with some kind of measure of this ability. At first I used walk rate divided by ISO. But this turned out to be unfair to many sluggers since ISO can go very high (theoretically as high as 3.000). Their walk rate would end up being divided by a very large number so their Walk rate/ISO would be low.
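A quick sketch of why the ratio penalizes sluggers. The 3.000 ceiling corresponds to a home run in every at-bat (an SLG of 4.000 minus an AVG of 1.000). The two players below are invented purely for illustration, and expressing both walk rate and ISO relative to a league average of 100 is an assumption here, not necessarily how the original ratio was computed.

```python
def iso(slg, avg):
    """Isolated power: SLG minus AVG."""
    return slg - avg


# A home run every at-bat gives the theoretical maximum ISO.
max_iso = iso(4.000, 1.000)


def walk_to_iso_ratio(rel_walk_rate, rel_iso):
    """Walk rate divided by ISO, both on a scale where 100 is league average."""
    return rel_walk_rate / rel_iso


# Hypothetical players, not real data.
slugger = walk_to_iso_ratio(150, 200)       # walks 50% more than average, twice the power
slap_hitter = walk_to_iso_ratio(100, 50)    # average walk rate, half the power

print(max_iso, slugger, slap_hitter)
```

The hypothetical slugger walks 50% more often than average, yet his ratio comes out well below that of a merely average walker with little power.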

So I ran a regression. A player's walk rate (relative to the league average) was the dependent variable and his ISO (relative to the league average) was the independent variable. The idea is that power hitters would get walked more than other hitters. I used all players with 5000+ career plate appearances (885 players). The data comes from the Lee Sinins Complete Baseball Encyclopedia. Here is the regression equation:

Walk Rate = 61.5 + .428*ISO

(In the Sinins encyclopedia, a walk rate of 150, for example, means that the player walked 50% more than average). The r-squared was .159, meaning that only 15.9% of the variation in hitters' walk rates is explained by variation in ISO. But the t-value for the coefficient on ISO was over 12, so it was statistically very significant.
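For readers who want to try this kind of fit themselves, here is a minimal sketch of a one-variable least-squares regression in plain Python. It is not the software used for this post: the function name `fit_simple_ols` is hypothetical and the five data points are invented, since the 885-player data set is not reproduced here. The last lines use the standard identity for a one-variable regression, t^2 = r^2 (n - 2) / (1 - r^2), to check that the reported r-squared and t-value are consistent with each other.

```python
import math


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared, t_value_of_slope).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares give r-squared.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - sse / sst

    # Standard error of the slope, then its t-value.
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_slope = slope / se_slope
    return intercept, slope, r_squared, t_slope


# Invented relative ISO and relative walk rates (100 = league average).
rel_iso = [50, 80, 100, 150, 200]
rel_walk = [90, 70, 110, 120, 160]

intercept, slope, r_squared, t_slope = fit_simple_ols(rel_iso, rel_walk)
print(round(intercept, 2), round(slope, 3), round(r_squared, 3), round(t_slope, 2))

# Consistency check on the reported figures: r-squared of .159 with 885 players.
implied_t = math.sqrt(0.159 * (885 - 2) / (1 - 0.159))
print(round(implied_t, 2))
```

The implied t-value works out to about 12.9, which agrees with the "over 12" reported above.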

I then used the equation to predict each player's walk rate and found the difference between his actual rate and his predicted rate. All the players were then ranked from highest to lowest by this difference. The table below shows the top 25.

So Roy Thomas did the best. His actual walk rate was 249, or 2.49 times the average, but his ISO was only 55, or 55% of the average. The equation predicts that he would have a walk rate of 85.06 (or 85.06% of the league average). Since 249 - 85.06 = 163.94, his walk rate was that many points above expected and he had the highest difference. Miller Huggins does very well, coming in at number 7.
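The predict-and-rank step can be sketched in a few lines of Python. The coefficients are the rounded ones from the equation above; Roy Thomas's figures are the ones quoted in this post, while the other two players are hypothetical and included only so there is something to rank. The function names are illustrative.

```python
INTERCEPT = 61.5   # rounded coefficients from the regression equation
SLOPE = 0.428


def predicted_walk_rate(rel_iso):
    """Walk rate (100 = league average) predicted from relative ISO."""
    return INTERCEPT + SLOPE * rel_iso


def surprise(rel_walk_rate, rel_iso):
    """Actual minus predicted walk rate; positive means more walks than expected."""
    return rel_walk_rate - predicted_walk_rate(rel_iso)


# (name, relative walk rate, relative ISO)
players = [
    ("Hypothetical Free Swinger", 60, 100),   # invented
    ("Roy Thomas", 249, 55),                  # figures quoted in the post
    ("Hypothetical Slugger", 150, 200),       # invented
]

# Rank from most to least surprising walk rate.
ranked = sorted(players, key=lambda p: surprise(p[1], p[2]), reverse=True)
for name, walk, power in ranked:
    print(name, round(predicted_walk_rate(power), 2), round(surprise(walk, power), 2))
```

With the rounded coefficients the prediction for Thomas comes to 85.04 rather than 85.06, presumably because the figure in the text was computed from unrounded coefficients.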

The players who walked the least (based on the equation) are below:

I also did something similar using SLG. Here are the best players followed by the worst.