“No game in the world is as tidy and dramatically neat as baseball, with cause and effect, crime and punishment, motive and result, so cleanly defined.”
—Paul Gallico

In last week’s article I introduced a framework for evaluating baserunning in an attempt to determine the magnitude of the impact both good and bad baserunners have on their teams. In that article I used play-by-play data from Retrosheet for the five-year period from 2000 to 2004 to analyze three situations that I felt would likely reveal baserunning skill. These were:

Runner on first, second not occupied and the batter singles

Runner on first, second not occupied and the batter doubles

Runner on second, third not occupied and the batter singles

In these situations I took into account both the number of outs and the fielder to which the ball was hit. In crunching the numbers I then introduced the concept of Expected Bases (the number of bases a runner was expected to gain given his opportunities), Incremental Bases (the difference between the Expected Bases and the actual number of bases gained), and Incremental Base Percentage (the ratio of total bases gained to expected bases). What I found was that there is about a 30-base span between the best and worst baserunners and that the IBP typically ranges from 1.15 to .85. A couple of minor corrections to the first article can be found on my blog.
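To make the three measures concrete, here is a minimal sketch of how they could be computed. It assumes that Expected Bases is the average number of bases gained by all runners in the same bucket of situation, outs and fielder, and that Incremental Bases is actual minus expected so that positive values are good. The function names and the four sample opportunities are made up for illustration and are not Retrosheet data.

```python
from collections import defaultdict

# Each opportunity: (runner, situation, outs, fielder, bases_gained).
# Illustrative values only; these are NOT taken from Retrosheet.
opportunities = [
    ("A", "1B_single", 0, "LF", 1),
    ("A", "1B_single", 0, "LF", 2),
    ("B", "1B_single", 0, "LF", 1),
    ("B", "1B_single", 0, "LF", 1),
]


def expected_bases_table(opps):
    """Average bases gained in each (situation, outs, fielder) bucket.

    Assumption: the expectation is the overall average for the bucket.
    """
    totals = defaultdict(lambda: [0, 0])  # bucket -> [bases, count]
    for _, situation, outs, fielder, gained in opps:
        bucket = totals[(situation, outs, fielder)]
        bucket[0] += gained
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def runner_summary(opps):
    """Return Expected Bases (EB), Incremental Bases (IB) and IBP per runner."""
    table = expected_bases_table(opps)
    actual = defaultdict(float)
    expected = defaultdict(float)
    for runner, situation, outs, fielder, gained in opps:
        actual[runner] += gained
        expected[runner] += table[(situation, outs, fielder)]
    return {
        runner: {
            "EB": expected[runner],
            "IB": actual[runner] - expected[runner],   # actual minus expected
            "IBP": actual[runner] / expected[runner],  # ratio of actual to expected
        }
        for runner in actual
    }


summary = runner_summary(opportunities)
for runner, stats in summary.items():
    print(runner, stats)
```

In this toy sample the bucket average is 1.25 bases per opportunity, so runner A, who gained three bases in two chances, comes out ahead of expectation and runner B, who gained two, falls short.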

Refinements

As many readers immediately perceived, these raw numbers are not the end of the story. A number of readers e-mailed suggestions for refining the framework, which included factoring in singles stretched into doubles and doubles stretched into triples (the effect of third base coaches), trying to factor out the effects of the hit-and-run, and including the scenario where second is occupied with a runner on first when a batter singles. While these are all excellent ideas and would make the framework more accurate, they all have problems.

Unfortunately, the play-by-play data from Retrosheet provides no ability to determine when a Juan Pierre stretches a hit rather than simply coasting into second or third. Nor does it allow for determining when a hit-and-run was on, something that would certainly affect the Cardinals’ statistics given Tony La Russa’s penchant for putting on the play.

And while as a Cubs fan I’m well aware of the cost of a third-base coach, having endured “Waving” Wendell Kim in 2003 routinely sending runners to their doom, I can’t think of a good way to isolate the effect of third-base coaches. Information about who was coaching third in a given game is nonexistent, and coaches probably don’t move around enough to be able to generate comparisons of their effect on teams. In addition, an aggressive third-base coach will not only cause more runners to be thrown out but will also cause more bases to be gained, and the two effects may well cancel each other out. Finally, I did consider including the situation where second base was occupied but dropped the idea, since I was concerned that the defense might make a play on the lead runner, allowing the runner on first to take third for “free.”

But while I chose not to pursue these refinements, there are a couple that are both doable and have an impact, as James Click also discovered in his baserunning analysis published in the 2005 Baseball Prospectus. To introduce the first, take a look at the team leaders in IBP for each of the five years:

In addition to the Rockies, Texas and St. Louis each appear in the top five three times, with the White Sox and Twins twice each, meaning that 15 of the 25 spots are occupied by the same five teams. On the other end of the spectrum, the Red Sox, Dodgers, Brewers, Phillies, Blue Jays, Giants and Astros all appear in the bottom five more than once. Enough said. What is going on is that parks play a role in determining how often runners advance and how many bases they gain when doing so. And that difference is largely based on the field to which the ball is hit and the surface of the park. For example, in looking at the scenario where there is a runner on first base and the batter singles, the following tables show the high and low totals for the five years when the ball is fielded by each of the three outfield positions.

Here you can see that Fenway’s Green Monster has the effect of doubling the chances that a runner would move from first to third or score on a single to left, as opposed to SkyDome (Rogers Centre), where the AstroTurf (replaced with FieldTurf this season) and smaller dimensions ostensibly hold runners in check. The spacious center-field area at Coors Field allows twice as many runners to advance or score as Jacobs Field, while the right-field well 353 feet from the plate at Wrigley allows runners to advance to third or score 50% of the time. In Yankee Stadium, at 318 feet in right field, runners advance to third or score just 36% of the time.

In order to see how these effects played out overall, I calculated a park factor (IBP/PF) for each park for every season. My approach to doing so was similar to how Batter and Pitcher Park Factors (BPF, PPF) are calculated. First I calculated the aggregate IBP in all games played at each park and then did the same for all road games for the team that played in that park. So for example, the Rockies and their opponents recorded an IBP of 1.03 in Coors Field in 2003 and an IBP of .99 on the road. I then divided the home IBP by the road IBP, in this case 1.03/.99 = 1.04, and kept only half of that ratio’s distance from 1.00, since the team only plays half of its games at home. For Coors Field in 2003 the IBP/PF was therefore calculated at 1.02. For the five seasons the Coors Field factors were:
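The arithmetic just described can be sketched in a few lines. The function name is my own label for illustration; the inputs are the Coors Field 2003 figures quoted above.

```python
def ibp_park_factor(home_ibp, road_ibp):
    """Park factor for IBP: halve the home/road ratio's distance from 1.0.

    Halving reflects that a team plays only half of its games at home.
    """
    ratio = home_ibp / road_ibp
    return 1 + (ratio - 1) / 2


# Coors Field, 2003: IBP of 1.03 at home and .99 on the road.
coors_2003 = ibp_park_factor(1.03, 0.99)
print(round(coors_2003, 2))  # 1.02
```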

2000 1.03
2001 1.03
2002 1.02
2003 1.02
2004 1.01

These are pretty consistent and show that Coors generally provides a 2-3% advantage to runners in gaining incremental bases. One might wonder if the quality of the home team’s outfield defense might skew these numbers, but since the same defense plays both at home and on the road the effect, if any, should be cancelled out.

Once the park factors were calculated I took a look at their variability. The lowest IBP/PF was recorded in San Diego (Qualcomm Stadium) in 2002 at .96 and the highest was in Texas in 2003 at 1.04, so generally you can see that these park factors don’t vary as much as BPF and PPF, which can range from .90 to 1.10 or higher in the case of Coors Field. And even though some parks such as Coors Field and The Ballpark at Arlington show consistency, there is also a good deal of variability in the season-to-season factors for a particular park. Running a simple regression (and excluding teams that moved parks such as the Padres, Reds, Brewers, Pirates and Phillies) showed that the correlation coefficients for the four pairs of consecutive seasons were positive three times but reached .3 only once, which is pretty weak. Clearly single-year park factors here should be taken with a grain of salt. As a result, I averaged the park factor for each park across the five years it was in use with the results below:
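For readers who want to repeat the stability check, here is a rough sketch of both steps: correlating each season's factors with the next season's across parks, and then averaging each park's factors. Only the Coors Field row comes from this article; the other three parks and their values are hypothetical placeholders, so the correlations this prints mean nothing in themselves. The function names are likewise my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Season-by-season IBP/PF for 2000 through 2004.
factors = {
    "Coors Field": [1.03, 1.03, 1.02, 1.02, 1.01],  # from the article
    "Park X": [0.99, 1.01, 0.98, 1.00, 1.02],       # hypothetical
    "Park Y": [1.00, 0.98, 1.01, 0.99, 0.97],       # hypothetical
    "Park Z": [0.98, 1.00, 1.00, 0.97, 0.99],       # hypothetical
}


def season_pair_correlations(park_factors):
    """Correlate each season's factors with the next season's, across parks."""
    parks = list(park_factors)
    n_seasons = len(park_factors[parks[0]])
    return [
        pearson(
            [park_factors[p][i] for p in parks],
            [park_factors[p][i + 1] for p in parks],
        )
        for i in range(n_seasons - 1)
    ]


def multi_year_average(park_factors):
    """Average each park's factor over the seasons it was in use."""
    return {p: sum(v) / len(v) for p, v in park_factors.items()}


print(season_pair_correlations(factors))  # four values, one per season pair
print(multi_year_average(factors)["Coors Field"])
```

Averaging trades responsiveness for stability: a park that genuinely changed (new turf, moved fences) would be smoothed over, which is one reason parks that changed hands or surfaces deserve separate treatment.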

This further shrinks the variability to where there is only a 5% swing between the best and worst parks. Because of the small spread, the argument can be made that characteristics that allow runners to advance more frequently in a ballpark are cancelled by other characteristics that don’t, resulting in most parks drifting back towards the middle. If that’s the case, what we really need to do is calculate park factors for each scenario or at least each field to which the ball is hit. That makes sense, but it will have to wait until another day.

Using these park factors I then went back and recalculated the IBP for each player in each season and then re-ranked the players for the last five years. The top 15 (actually 17 with ties) are listed below where Bases+ is the number of bases adjusted for park factor and IBP+ is the adjusted IBP.
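As a rough illustration of what such an adjustment might look like, the sketch below divides a runner's bases gained by his home park's averaged factor and recomputes the ratio against Expected Bases. That divide-by-factor step is an assumption made for the example, not a description of the exact calculation behind the table, and the runner's totals are hypothetical.

```python
def park_adjust(bases_gained, expected_bases, park_factor):
    """Return (Bases+, IBP+) under an assumed divide-by-factor adjustment.

    Assumption: the averaged park factor is applied to the runner's
    total bases gained; the real adjustment may differ in detail.
    """
    bases_plus = bases_gained / park_factor
    ibp_plus = bases_plus / expected_bases
    return bases_plus, ibp_plus


# Hypothetical runner in a runner-friendly park (factor 1.02).
bases_plus, ibp_plus = park_adjust(
    bases_gained=204, expected_bases=180, park_factor=1.02
)
print(round(bases_plus, 1), round(ibp_plus, 3))
```

Under this reading a factor above 1.0 trims a runner's total and a factor below 1.0 inflates it, which matches the direction of the adjustments discussed next.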

As you can see, the IBP+ values are very similar to the unadjusted totals. (You’ll also note that Damian Jackson, who was inadvertently excluded from the list in my first article, now takes the top spot from Jack Wilson by percentage points.) Those players who were helped most by their park were:

As you might have expected, we have two Rangers and three Rockies, all of whom gained from four to eight bases because of their parks. In the case of Larry Walker it drops him from 24th to 39th on the list of 210 players with 100 or more opportunities over the given years. You can find a complete list of the 210 players on my blog. The players who were most hurt by their parks include:

Once again, it’s not surprising that two Padres are on the list, given that both Qualcomm and PETCO had IBP/PFs under 1.0. Note, though, that the magnitude of the effect is only around four bases over the five seasons.

When applied at the team level, the leaders and trailers for the past five seasons are:

It is interesting that the 2000 Brewers led the league while the 2001 edition was in the cellar. A quick look reveals that in 2000 Ron Belliard and Marquis Grissom accounted for 20 incremental bases all by themselves in 91 opportunities, with James Mouton adding over five more in just 10 opportunities. In 2001, with Grissom gone, his replacement Devon White added just over three incremental bases, while Belliard accounted for just two, and Jose Hernandez along with Jeromy Burnitz were dinged for -14, thereby driving the team to the bottom.

On Deck

The second refinement is to take my measures of Incremental Bases (IB) and Incremental Base Percentage (IBP) and convert these into runs gained or lost. That will be the subject of next week’s article.