Thursday, November 29, 2007

The biggest change is probably the inclusion of Contact Slugging Percentage (cSlug) per a suggestion by Tom Tango. It also now includes a total line on the grid display (circled in red) and a new summary display, corrects a bug where home runs weren't included in cBA (formerly BABIP), and shows a scrolling list of new articles at BP.

Wednesday, November 28, 2007

Just wanted to create a link to this BP Unfiltered post by Nate Silver related to my baserunning framework. As Nate says, the numbers for both major and minor leaguers will be in this year's Baseball Prospectus and it'll be nice to see them included in things like PECOTA and perhaps someday in measures that reflect total player performance.

The correlations that Nate mentions are similar to what I found in the past, and Nate is right in removing EqSBR from the picture if you want to look solely at baserunning, as opposed to basestealing, skill. Using a rate statistic is also appropriate since the number of opportunities does vary considerably, although of course it can't account for the quality of those opportunities. That's the reason I typically use a rate statistic based on the ratio of runs to expected runs (the difference being metrics like EqHAR and so on) and then simply use a cutoff for the number of opportunities (or times on base would work as well).

In any case, it's gratifying to see this work being systematized and calculated for both historical seasons and during the season.

"In short, we view means and medians as the hard ‘realities,’ and variation that permits their calculation as a set of transient and imperfect measurements of this hidden essence…But all evolutionary biologists know that variation itself is nature’s only irreducible essence. Variation is the hard reality, not a set of imperfect measures for a central tendency. Means and medians are the abstractions."--Stephen Jay Gould, from the essay "The Median Isn’t the Message"

That quote is from one of my favorite authors (himself a big baseball fan), and was appropriately used by Nate Silver when he introduced the PECOTA system in the 2003 Baseball Prospectus. The concepts embodied in the quote have been on my mind the past couple of weeks, ever since I mentioned Wily Mo Pena’s platoon split in my inaugural column and received a healthy dose of reader feedback.

Perhaps the most prevalent question asked why it is that the performance analysis community in general doesn’t seem to pay much attention to individual platoon splits or view them as very important.

In order to answer that, we’ll explore the nature of variation in these splits, using a couple of different methods in an attempt to illustrate why variation is the irreducible essence when it comes to platoon splits.

Of Lefty Mashers and LOOGYs

Platooning in baseball, known more formally as the "two-platoon" system, has been around for nearly 100 years. Most famously, it was exploited by Casey Stengel when he managed the Yankees in the 1950s; it is often said that he learned the strategy from John McGraw when the lefty-hitting Stengel was himself platooned, remaining on the bench against southpaws after he'd been dealt to McGraw's New York Giants in 1921. In 1922 and 1923, Stengel would hit .368/.436/.564 and .339/.400/.505 in that role. However, there is ample evidence that McGraw wasn't the only one who understood that hitters tended to hit better against pitchers of the opposite hand. In fact, some credit the 1914 "Miracle" Boston Braves with being the first team, guided by Braves manager George Stallings, to extensively take advantage of this strategy.

Last season, this debate was again brought to the fore with the publication of Bill James’ essay "Underestimating the Fog" in the 2005 Baseball Research Journal and more recently by the excellent chapters devoted to the subject in The Book and by our own James Click in Baseball Between the Numbers.

To provide a little context to the question, I used Retrosheet data and took a look at all 16,562 player seasons from 1970 through 1992. Overall, and excluding switch hitters, the 3,371 players this encompassed performed as follows:

In other words, as a group, right-handed hitters enjoyed improvements of 17 points in batting average, 19 points in on base percentage, and 33 points in slugging percentage against left-handed pitchers, while the advantage for lefties versus right-handers was a bit higher, at 24, 26, and 56 points, respectively.

Clearly this confirms that platoon advantages exist and are significant, and that the difference is larger for left-handed hitters than for right-handers. This fact is exploited by modern managers who employ LOOGYs (left-handed one-out guys) like the Cubs' Scott Eyre, whose sole job is to come in to pitch to the opposing team’s toughest left-handed batters.

But for the analysis that follows we'll limit the list of players to non-switch-hitters who came to the plate more than 2000 times, which leaves us with 505 players (188 left-handed hitters and 317 right-handers).

As you can see, although the performance improves because we’re now dealing with better hitters overall, the differences are basically the same, even a bit larger. This is a bit counterintuitive, since one might be inclined to think that these splits would be smaller, on the theory that players who have more ABs do so in part because they are able to handle a wider variety of pitchers.

But men like George Stallings, John McGraw, and Casey Stengel already knew that the advantage existed. Lurking beneath the question of why this advantage doesn't get more play from analysts lies another question. Is the ability to hit one side better than another a repeatable skill, or simply a phenomenon that affects all hitters more or less equally? In other words, when it comes to individual hitters, is it the mean or the variation that is the hard reality?

Variations on the Theme

To explore this issue, we'll look at two measures of the variation of these platoon splits using the group of 505 hitters with 2,000 or more plate appearances from 1970-1992.

One simple approach we can use to study these splits a bit closer is to look at how they are distributed. If platoon differences were spread randomly across the population of hitters without regard to skill, we would expect the pattern of those differences to follow a normal distribution, one that takes the shape of a bell curve. Some hitters would certainly do better than others, but the differences could be attributed to chance. And as many readers are aware, the empirical rule relating to normal distributions states that approximately 68% of the values will fall within one standard deviation of the mean, 95% within two, and 99% within three standard deviations.
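To make that chance-only baseline concrete, here's a small simulation sketch (all ability numbers here are hypothetical, not from the data set): every hitter is given the identical true platoon advantage, binomial luck alone generates the observed splits, and the resulting spread follows the empirical rule.

```python
import random
import statistics

random.seed(42)

# Hypothetical setup: every hitter shares the same "true" ability --
# .270 vs same-side pitchers, .290 vs opposite-side -- so any spread
# in observed splits is due to chance alone.
TRUE_SAME, TRUE_OPP = 0.270, 0.290
AB_SAME, AB_OPP = 600, 1400  # rough career at-bat counts

def simulated_split():
    """Observed platoon split for one hitter under binomial sampling."""
    hits_same = sum(random.random() < TRUE_SAME for _ in range(AB_SAME))
    hits_opp = sum(random.random() < TRUE_OPP for _ in range(AB_OPP))
    return hits_opp / AB_OPP - hits_same / AB_SAME

splits = [simulated_split() for _ in range(505)]
mean = statistics.mean(splits)
sd = statistics.stdev(splits)

within_1sd = sum(abs(s - mean) <= sd for s in splits) / len(splits)
within_2sd = sum(abs(s - mean) <= 2 * sd for s in splits) / len(splits)
print(f"mean split {mean:.3f}, sd {sd:.3f}")
print(f"within 1 sd: {within_1sd:.0%}, within 2 sd: {within_2sd:.0%}")
```

Run it and roughly 68% of the simulated splits land within one standard deviation and about 95% within two, even though no hitter "owns" a split at all.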

So what do we find when we look at these splits?

The following tables show the mean, standard deviation, and the percentage of split values that fall within the first three standard deviations for left and right-handed hitters for various offensive categories.

The table reveals that the average left-handed hitter hit 34 points better against right-handers, but that the difference from the mean spans a range from 12 to 56 points (34 plus and minus the standard deviation of 22 points). For right-handed hitters the mean value for batting average is 18 points with an average spread of between 1 and 35 points. As you can see, there is a fairly large amount of variability for hitters from both sides.

It should be noted that the mean values shown in these tables differ from the average values shown in the previous table, since the former are computed from totals for the entire data set whereas the latter are averaged for each set of players.

This data can also be looked at graphically; below, the split differences in OBP for right-handed hitters are shown:

What the tables show pretty clearly is that platoon splits are indeed normally distributed for both populations. The conclusion one might draw from this is that even if you found a player with a large platoon split, one couldn’t automatically assume that the split wasn’t the result of random chance. After all, assuming the splits were random you would still by chance have a couple of players with splits that are greater than three standard deviations from the mean.

In and of itself, the fact that the values are normally distributed does not mean they are necessarily random. After all, all kinds of characteristics in the real world--such as IQ scores--are normally distributed, and yet most people assume that there are true ability-based differences between individuals.

A more sophisticated approach to looking at the variation is to try and determine how persistent split differences are throughout a player’s career. In other words, even assuming a normal distribution, do the same hitters have either large or small splits from year to year or do their splits vary more or less randomly within the distribution? If the splits are persistent, we would be more likely to assume they are based on actual skill differences rather than random fluctuations around a general advantage.

A technique that's been used by others to look into similar questions is to divide a player's career into even and odd years and then compare the splits to each other. This has two advantages. First, it allows for the comparisons to be made with larger sample sizes than doing so between individual seasons. In this data set, the average hitter had 126 plate appearances against lefties per season, whereas when using career halves the average hitter had over 1,500 plate appearances against right-handers in each half of their career and almost 700 plate appearances against lefties. Incidentally, the number of plate appearances versus southpaws has declined over the years. In this data set there were 27 seasons where players garnered over 300 plate appearances against lefties (Rusty Staub had 364 in 1978)--a feat that is non-existent today. Second, using even and odd seasons tends to even out changes in the offensive environment as well as differences in the player's overall performance over time.

After splitting each career into its respective halves, we can then plot those halves against each other to see if there is a healthy positive correlation between the two halves of the career. If so, we would be inclined to side with the view that platoon splits have their basis in ability differences between individuals.

For example, the following graph plots the even and odd years for left-handed hitters’ slugging percentage split difference.

The line shown is a measure of the strength of the relationship between the two halves.

The numerical measure of that strength is termed the correlation coefficient (r). Typically r values--which range from -1 to 1, with -1 being a perfect negative correlation and 1 being a perfect positive one--greater than .70 indicate that there is a strong positive correlation between the two values whereas values less than .30 indicate a weak correlation. For the above graph the correlation coefficient is .18, and while positive--as indicated by the upward slope of the best fit line--the value is well below the threshold for even a weak positive correlation. In other words, a left-handed hitter’s slugging percentage in even years is a poor predictor of their slugging percentage in odd years.

By squaring r we can get the coefficient of determination, or R-square, which is a measure of how much the variation in one of the values can explain the variation in the other. In this case the R-square is just 3.1%.
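The even/odd comparison and the r and R-square calculations can be sketched in a few lines of Python. The "career" data below is simulated, not the Retrosheet sample: each hitter gets a small true-ability component plus a large chance component, which is enough to reproduce the kind of weak correlation discussed above.

```python
import random
import statistics

random.seed(7)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical careers: a small true-ability split (mean .030, sd .008)
# buried under much larger chance noise in each career half.
true_skill = [random.gauss(0.030, 0.008) for _ in range(505)]
even_half = [s + random.gauss(0, 0.020) for s in true_skill]
odd_half = [s + random.gauss(0, 0.020) for s in true_skill]

r = pearson_r(even_half, odd_half)
print(f"r = {r:.2f}, R-square = {r * r:.1%}")
```

Even with a genuine (if small) skill component baked in, the even/odd correlation comes out weakly positive, which is exactly the pattern the real splits show.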

The following tables show the r value for each of the splits for lefties and righties.

As you might have expected given the previous graph, the correlation coefficients for batting average, on base percentage, and slugging percentage are pretty small, which is an indication that split values vary widely even between the larger sample sizes that result from dividing careers into even and odd years.

The rightmost column records whether or not the correlation is significant at the 95% confidence level. In other words, when considering split differences in AVG for both sides and OBP for lefties, we cannot conclude with 95% certainty that the correlation coefficient is actually different from zero. Interestingly, SLG for lefties is not statistically significant even though its r is higher than that for righties; that's because there is also more variability in the split difference for left-handers.

Another interesting point is that both strikeout and walk rate are more strongly correlated than the other measures, indicating that they contain larger ability components. Strikeout and walk rates are not typically what springs to mind when people think of platoon differences among individual hitters, and yet the data support the notion that strikeout and walk rates are, in fact, the most persistent characteristics of these splits.

So where does this leave us?

Using these two ways of looking at variation it would appear that, as Gould said, variation is the hard reality when it comes to platoon splits and the mean values we’re left with are mostly a byproduct of that reality. Another way of saying this is that, as Albert and Bennett put it in their book Curve Ball, platoon splits are an example of a "bias situation," where all hitters tend to take advantage to the same degree in the long run (once the vagaries of chance have evened out), rather than an "ability situation" where differences are governed by individual hitter differences.

As an aside, the fact that platoon splits are influenced very heavily by small sample size and luck resolves a conundrum my brother and I encountered when simulating seasons with Strat-O-Matic baseball back in the early 1980s.

In the card version of the game you have a 50% probability of determining the outcome of a play based on the hitter's card, and a 50% chance based on the pitcher's card. Hitters whose massive single-season platoon splits were faithfully recorded on their cards (say, Keith Moreland in 1983) could therefore always be found. On the other hand, most pitchers--usually having faced more hitters and therefore having larger sample sizes and smaller fluctuations--did not have such extreme splits. Consequently, it was usually easy to stack the lineup and neutralize even the best left-handed starting pitchers, and to pinch-hit very successfully even against would-be LOOGYs (as few of them as there were in the early 1980s). As a result, left-handed pitchers were worth far less than they should have been and teams loaded with left-handers fared worse than expected. Trust me. I managed the 1983 Padres with the lefty trio of starters Tim Lollar, Dave Dravecky, and Mark Thurmond who "helped" my team to over 100 losses--19 games worse than their actual record.
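The exploit falls out of simple arithmetic. Under the 50/50 mechanic the long-run batting average is just the mean of the two card rates, so a fluky one-season split on a hitter's card passes straight through. A minimal sketch (all card values below are made up for illustration):

```python
# Stripped-down model of the card game's 50/50 mechanic: a coin flip
# decides whether the hitter's or the pitcher's card governs each plate
# appearance, so the effective average is the mean of the two card rates.

def effective_avg(hitter_card, pitcher_card):
    return 0.5 * (hitter_card + pitcher_card)

# A righty whose fluky single-season split is baked into his card...
hitter_vs_lhp, hitter_vs_rhp = 0.340, 0.240
# ...facing a lefty whose larger sample produced only a mild split.
lefty_pitcher_vs_rhb = 0.265

# Stacking right-handed hitters against the lefty starter pays off:
print(round(effective_avg(hitter_vs_lhp, lefty_pitcher_vs_rhb), 4))  # 0.3025
```

Carding the hitter on a multi-year split instead of one lucky season would pull that .340 back toward his true rate and blunt the exploit.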

I admit that I haven’t kept up with how Strat-O-Matic or other simulation games have evolved, but incorporating a general platoon advantage or even basing the platoon split on multiple years in order to dampen the variation would have been a step in the right direction.

A Step Beyond

I mentioned in the opening that an excellent chapter on platooning appeared in The Book, and so let's come full circle.

Looking at the overall distributions and correlations can give us big clues that platoon splits for the most part are a group characteristic rather than an individual one. But you’ll notice that while the correlations certainly are weak, they are positive and most are statistically significant. So what is the meaning of those very weak correlations?

In the chapter discussing this topic, the authors apply a more rigorous statistical technique using data from 2000-2004 to clear away some of the fog. As I noted in a previous column, they conclude that when using their statistic wOBA (weighted On Base Average, which is similar to linear weights), a right-handed hitter’s platoon split should be regressed to the mean by weighting the league average by 2,200 plate appearances whereas a lefty’s split would need to be weighted by 1,000 league average plate appearances. In other words, these small correlations can be accounted for when trying to estimate what a player’s true platoon skill is, but doing so requires putting heavy emphasis on the league average.

Using this technique they show that even the most extreme lefty masher from 2000-2004, Brian Jordan, would have his measured wOBA platoon split of .101 regressed by 63% to .037.
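The regression step described above can be sketched directly: the observed split is blended with the league average, with the league average weighted by the stated number of plate appearances (2,200 for righties, 1,000 for lefties). The example numbers in the code are hypothetical, not Jordan's actual line.

```python
def regressed_split(observed_split, pa, league_split, regression_pa):
    """Shrink an observed platoon split toward the league average by
    mixing in regression_pa plate appearances of league-average split
    (per the text: 2,200 PA for righties, 1,000 PA for lefties)."""
    return (observed_split * pa + league_split * regression_pa) / (pa + regression_pa)

# Hypothetical lefty: a .060 observed split over 500 PA, against a
# .025 league-average split, shrinks most of the way back to the mean.
print(round(regressed_split(0.060, 500, 0.025, 1000), 3))  # 0.037
```

The fewer plate appearances behind the observed split, the more the estimate collapses onto the league average, which is why even extreme observed splits carry so little predictive weight.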

And that's the essence of why platoon splits, while important in a general sense, are not an object of focus to the degree that one might expect. Put simply, yearly random fluctuation due to small sample sizes swamps the actual skill that players do possess, making the splits less valuable as tools for prediction and for assessing value.

Friday, November 16, 2007

I know most Rockies fans were hopeful that Troy Tulowitzki would lay claim to the National League Rookie of the Year Award. Well, the award was given out last Monday and in what was the closest race in the NL since the current system was put in place in 1980, Milwaukee's Ryan Braun edged Tulo 128 to 126 in the voting. Braun received 17 first place votes to Tulowitzki's 15.

While it's hard to fault voters for choosing Braun given the monstrous offensive season he had - breaking the rookie record for slugging percentage held by Mark McGwire and hitting 34 home runs in two-thirds of a season - one gets the impression that the voters may have underappreciated the difference between the two players on the defensive side of the ball. Be that as it may, we shouldn't be too hard on them, since historically defense has defied meaningful analysis, and looking at things like error totals (Braun made 26 while Tulowitzki made 11) doesn't really give a good feel for the more important aspects of defense such as range.

But we're here to correct that.

Below you'll find a table that compares the two players in terms of their raw numbers using two of the advanced performance measures we track at Baseball Prospectus.

BRAR stands for Batting Runs Above Replacement and is the number of runs the player contributed above and beyond what a replacement level offensive player (think of a pretty weak bench player and you'll have the idea) would contribute if allowed to make the same number of outs. We use outs instead of plate appearances or at bats since ultimately baseball is predicated on the idea of avoiding outs if you're on offense and finding a way to get outs on defense. BRAR includes basic baserunning and is adjusted for things like the league and home park context. The latter is of course key in the case of Tulowitzki since he plays at Coors Field which, even in the humidor era, is a ballpark with a marked hitter's advantage. FRAR stands for Fielding Runs Above Replacement and is the number of runs the player saved on defense over a replacement level player (a relatively poor fielder) at his position. Outs is obviously the number of outs each player consumed offensively and AdjG is the number of 9-inning game equivalents the player played at his position.

So from the table we can divine that while Braun contributed twenty more runs offensively and consumed 130 fewer outs, Tulowitzki amazingly was responsible for saving 61 more runs on defense. Put together, Tulowitzki contributed 76 runs to Braun's 35, a difference of 41 runs and equivalent to about 4 wins (10 additional runs for a team in the big scheme of things is roughly equivalent to a win). Can that be right? Such a large number makes one immediately suspicious, and so let's take a look at an alternate metric.

John Dewan of Baseball Info Solutions has created a Plus/Minus system that tracks each ball hit to any fielder and determines in the aggregate how many balls plus or minus the fielder was able to handle as compared to an average fielder at his position. In Dewan's system Braun came up with a total of -41 and Tulo was at +35. In other words, Braun didn't handle 41 balls that an average third baseman would have while Tulowitzki made 35 more plays than would have been expected. Not incidentally, Braun's rating was the lowest of any position player and Tulowitzki's was the highest among shortstops as he was crowned the best fielding shortstop in baseball by Dewan in the 2008 Bill James Handbook.

We can convert Dewan's plus/minus values to runs by assuming that each ball not fielded became a single (an assumption that is kind to Braun since he plays third base where some of the balls he failed to field undoubtedly became doubles and not singles) and knowing how much a single and an out are worth in terms of runs. Luckily, there are a series of run estimation formulas known collectively as "Linear Weights". In that system each single is given a "weight" equivalent to 0.47 runs and each batting out -0.25 runs. So given these values it turns out that a would-be out that turns into a hit costs a team roughly 0.70 runs. If we then take the difference between Braun and Tulowitzki, 76 plays, and multiply by 0.70 we get 53.2 runs. Both systems are saying essentially the same thing - Tulowitzki was worth somewhere between 50 and 60 runs more defensively in 2007.
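The conversion above is simple enough to write down directly; the plus/minus totals and run values are the ones quoted in the paragraph:

```python
# Linear-weights values from the text: a single is worth +0.47 runs,
# a batting out -0.25 runs, so a would-be out that becomes a hit
# swings run expectancy by roughly 0.70 runs (the text's rounded figure).
RUNS_PER_MISSED_PLAY = 0.70

braun_plus_minus = -41   # plays below an average third baseman
tulo_plus_minus = 35     # plays above an average shortstop

play_gap = tulo_plus_minus - braun_plus_minus  # 76 plays
print(round(play_gap * RUNS_PER_MISSED_PLAY, 1))  # 53.2 runs
```

Note that the unrounded swing is actually 0.47 + 0.25 = 0.72 runs per play; using 0.70 keeps the arithmetic conservative.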

But of course Braun didn't play an entire season and so could it be the case that if we gave him a full season his offensive dominance would wipe out the difference?

To answer that question, below is a table that pro-rates both players' offensive and defensive statistics to the same number of outs consumed on offense and adjusted games on defense.

If both players had consumed 450 outs, Braun would have produced 71 more runs than a replacement player while Tulowitzki would still have been at 30. Those 41 additional runs are not inconsequential. However, had Braun played 162 games on defense he would have given up 23 more runs than a replacement level fielder while Tulo would have saved 49 runs. When you add it all up, the gap between the two players shrinks to about 30 runs, or the equivalent of three wins. As my colleague Joe Sheehan noted, Tulowitzki was robbed--perhaps not by much, but in the end he was the more deserving.
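The bottom line can be checked with a few lines of arithmetic, using the prorated figures from the table and the rough 10-runs-per-win rule mentioned earlier:

```python
RUNS_PER_WIN = 10  # rough rule of thumb used in the post

# Prorated figures from the text: runs above replacement at a common
# baseline of 450 outs on offense and 162 adjusted games on defense.
braun_brar, braun_frar = 71, -23
tulo_brar, tulo_frar = 30, 49

offense_gap = braun_brar - tulo_brar  # 41 runs in Braun's favor
defense_gap = tulo_frar - braun_frar  # 72 runs in Tulo's favor
net_gap = defense_gap - offense_gap   # net edge for Tulowitzki

print(net_gap, round(net_gap / RUNS_PER_WIN, 1))  # 31 runs, about 3 wins
```

The net comes out to 31 runs, which rounds to the roughly 30 runs, or three wins, cited in the text.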

But Rockies fans shouldn't feel too bad. Look what happens when you add the name Hunter Pence to the list and use the same rationale as discussed above.

The Astros' Pence comes out on top just barely by virtue of a solid season both offensively and defensively. Still, there is value in playing time and that, in my estimation anyway, should push Tulowitzki back over the top.

One of the trademarks of Omar Minaya’s three seasons as the Mets general manager has been his relentlessness in the free-agent market. He isolates an area of need, locks onto a player and does not move on until a deal is completed.

With Jorge Posada never truly an option, Minaya identified Yorvit Torrealba as his catcher of choice. After some fast-moving negotiations, the Mets are on the verge of signing him to a three-year, $14.4 million contract.

Everything is going along nicely until we run smack into the words "Yorvit Torrealba" in a sentence in which they seemingly don't belong. The author does go on to point out that Torrealba's road numbers (.212/.292/.326) indicate that the Mets "probably" plan to bat him eighth. Probably? Did we really think he was going to hit second or third?

Thursday, November 15, 2007

This week on Baseball Prospectus my focus on baserunning turns to the minor leagues as I present the leaders and trailers at the four top levels of the minors (AAA, AA, high and low A) and then an aggregate total to account for players who moved between leagues.

Braves fans may be interested to learn that Gorkys Hernandez fared particularly well and finished second in the minors by contributing +10.2 runs overall on the bases. He was excellent at going from first to third and second to home on hits (easily tops at the level in EqHAR), and stole bases at an 83% rate, good for +2.5 runs. On the other hand, former Moneyball pinup Jeremy Brown was the clear winner of the "Triple-A Plodder Award", giving up 8.8 runs, 7.6 of those while attempting, or not attempting, to advance on hits (EqHAR). His total in that category was almost 2.5 runs worse than anyone else at his level as he managed to get thrown out four of the six times he tried to advance on hits in 30 opportunities.

I thought it might also be interesting to take a look at how a few of the other top prospects going into 2007 fared. So below we have a table of those players who made it into the top 20 of Baseball America's Top Prospects list published at the end of February.

Tuesday, November 13, 2007

Last week I had the opportunity to listen to a talk given by an evangelical Christian leader to an audience of primarily evangelicals on the topic of global climate change. Given the rancor that sometimes accompanies this issue, it was a bit of a surprise, as were some of the reactions I observed afterwards. But this subject has been on my mind for the last few months after attending a lecture given by Dr. Richard Alley, professor of geosciences at Penn State University in State College, Pennsylvania, last June at the Denver Science Museum and having just finished reading Tim Flannery's 2006 book The Weather Makers: How Man Is Changing the Climate and What It Means for Life on Earth.

Actually, I hadn't kept up with the most recent surveys on evangelicals' views on climate change and so after the talk I took a look around and found that in one recent study 33% of evangelical Christians describe global warming as a "major issue" while the numbers are over 50% for people of other faiths or of no faith. The results were culled from a Barna poll and were based on nationwide surveys conducted on 1,007 adults in January 2007 and 1,004 adults in July-August 2007. They break down like this:

Percentage That Agree Global Warming is a Major Issue
Evangelical Christians: 33%
Non-evangelical born again Christians: 55%
Notional Christians: 59%
Non-Christians of other faiths: 61%
Agnostics and atheists: 69%

and the article goes on to give the following percentages within Christianity although it's not clear if this is in response to the same question...

Regardless, the gap still exists, and the reasons for it and why it tends to shrink as you go from evangelical to non-evangelical and Protestant to Catholic are intriguing. While I'm no sociologist, I'll venture a few guesses based on my own experience and the reactions I observed last week.

A Conspiracy wrapped in a Hoax. Within the evangelical community there is certainly an emphasis on prophecy and end-times events, as evidenced by the popularity of the Left Behind series, which focuses on one particular (and historically recent) interpretation of the book of Revelation. In that vein, one of the conditions or preconditions often discussed for the rise of the anti-Christ is the formation of a one-world government, which, interestingly enough, dovetails with the fears of other groups. And so many evangelicals feel that global climate change and the kind of international treaties it may produce (Kyoto) is just the kind of issue that could be used to hasten the creation of a one-world system as people look from national to international solutions. Christians of other stripes don't as often take this view of the book of Revelation, and therefore this issue doesn't arise. Oh, and it doesn't help that in the U.S. one of the main advocates of climate change has been a traditional enemy of conservatives, and that a large majority of evangelicals are political conservatives.

Scientific Skepticism. Spurred on by the very prevalent belief in young-earth creationism, many evangelicals chafe when the discussion turns to the scientific consensus on climate change, let alone things like ice cores showing temperature patterns that go back hundreds of thousands of years. On the former issue, young-earth creationists already believe that the scientific consensus on the pattern and history of life is fatally flawed, and so disbelieving the human connection with climate change is not only to be expected but is actually easier than denying biological evolution. As to the latter, the belief that the earth is only around 10,000 years old means that climate reconstructions and talk of previous ice ages (or even the most recent one) are unconvincing, to say the least. And of course there's a liberal sprinkling of myths that circulate as well, including the one about volcanoes emitting more CO2 than man-made sources (in actuality, human sources emit 130-150 times more). Once again, other stripes of Christianity don't hold this view, and in fact young-earth creationism is almost entirely restricted to the U.S.

Dissonance with God's Plan. Although it seems simple on the surface one of the primary issues may simply be the belief that the all powerful God is in control of his creation and so Christians needn't worry. This would make sense but I think it goes a little deeper and reflects a feeling of dissonance with the idea of God commanding mankind to proliferate and subdue the earth. If indeed not only the industrial revolution but all of mankind's activity on the earth for the past 10,000 years (the so-called "Anthropocene") has brought about these changes, something would seem amiss with God's plan. One would think this conflict would arise in other flavors of Christianity as well but perhaps their view of God's plan is somewhat less deterministic.

A Question of Priority. Finally, there are many evangelicals who simply believe that Christians should be focusing on other, more pressing social issues. While that's an argument I can certainly respect, there would seem to be no reason why one would necessarily preclude the other, and why Christians couldn't focus on both the right to life and the environment.

In the end this seemed to be the main point made by the speaker I heard last week. Christians can all agree that we have a responsibility both to our fellow man, especially the poor and the oppressed, and to be stewards of the creation. In the area of climate change these go hand in hand, since by caring for the environment we can help lessen or avoid the deleterious effects, most of which will be felt by the poor whom Christians are called to serve.

Monday, November 12, 2007

Every day I take a look at BallBug.com and peruse the top baseball stories published on the web. Over the past couple of weeks I've been amused by the number of stories, like this one, that focus on Yorvit Torrealba. In this particular case the news is that Torrealba won't sign early with the Rockies before their exclusive negotiating rights run out and will instead take his chances on the open market with teams like the Mets and Marlins.

As far as the Rockies are concerned, not negotiating a deal right now is certainly the smart move. Torrealba's exposure in the postseason seems to have driven the market for him well above what he's actually worth. In the end he is still a backup catcher who hit .255/.323/.376 with an EqA of .236. He actually performed better offensively in 2006 because he hit with more power, and his bum right shoulder is a liability in an area (throwing out runners) where you typically want your backup to be better than league average. As a backup catcher he is barely adequate, and backup catchers are essentially freely available and can be plucked out of the backup catcher Oort cloud at will. Somebody is going to overpay.

Also saw this nice recap of Daisuke Matsuzaka's season. For the record my prediction of Dice-K's season was 17-12, 4.22 ERA, 179 K against an actual of 15-12, 4.40, 201 K. Not too bad considering it was pretty much a stab in the dark. I did take into consideration his adjusting to a five-man rotation and thought he might wear down, which made my prediction one of the more pessimistic ones. One thing that surprised me, given his large repertoire, was how frequently he used his fastball.

Want to know about winners? Pedroia gave up his scholarship at Arizona State to free up money to sign a much-needed pitcher, so when the Sun Devils reached the College World Series, coaches and players had "DP" on their caps in honor of their leader who never got to Omaha. The sabermetrics guys in their garages never understand these things.

I'm less upset by this than simply confused. The column starts off talking about A-Rod and his announcement during game three of the World Series and then quickly transitions to this non sequitur. Why? What's the point? Who cares? Perhaps there is a private joke in there somewhere, but I've missed it.

In writing that post I was looking at projections for individuals and thought I'd post the results for three players: Pete Rose, whose projections were fairly close; Willie Mays, who was in the middle of the pack in terms of the standard deviation of the difference between actual and projected; and Rogers Hornsby, who came down on the high side, in part because of his 1926 season, when the projection had him at an NOPS/PF of 151 and he came in at 116.

Keep in mind these projections are based on a three-year weighted average, regressed to the mean, and are age and league adjusted (as well as park adjusted).
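For readers curious about the mechanics, the weighting-and-regression step can be sketched in a few lines of Python. To be clear, the 5/4/3 weights and the 200-PA regression ballast below are illustrative assumptions, not the parameters actually used for these projections, and the age, league, and park adjustments are omitted.

```python
# Sketch of a three-year weighted average regressed to the mean.
# The 5/4/3 weights and 200-PA ballast are illustrative assumptions.

def project_rate(seasons, league_mean, weights=(5, 4, 3), ballast=200):
    """seasons: list of (rate, plate_appearances), most recent first."""
    num = 0.0
    den = 0.0
    for (rate, pa), w in zip(seasons, weights):
        num += w * rate * pa
        den += w * pa
    # Regress toward the league mean in proportion to the ballast
    num += ballast * league_mean
    den += ballast
    return num / den

# A hitter with rates of .850, .820, .780 over roughly 500 PA per year
# projects a bit above .800, pulled slightly back toward the .750 mean
proj = project_rate([(0.850, 500), (0.820, 480), (0.780, 510)],
                    league_mean=0.750)
```

Note that a player whose past rates sit exactly at the league mean projects to the league mean, which is the point of the regression step.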

"What this tends to suggest, by the way, is that platoon splits are something relatively immutable: a player is ‘born’ with a certain platoon split, and it isn’t likely to change very much over the course of his career, although it may take a lot of data to ‘discover’ exactly what it is."

That observation was interesting to me since I wrote a pair of articles on platoon splits back in April of 2006 for BP. The second of those articles in particular took a quick look at whether hitters learn to hit better against the same side relative to the opposite side over time, along with how the splits differ among different types of hitters.

So for those interested that 2006 column published on April 27th is reproduced below for your reading pleasure...

April 27, 2006
Schrodinger's Bat: Of Crowds and Splits
by Dan Fox

“I have never wished to cater to the crowd; for what I know they do not approve, and what they approve I do not know.”

--Epicurus (341 BC – 271 BC)

In the fall of 1906 Francis Galton (1822-1911), the British polymath and half-cousin of Charles Darwin, decided to attend a country fair near his home. Galton was a man of many and varied talents--he invented the weather map, a method for classifying fingerprints, and even the silent dog whistle--but among them was a statistical bent, and he had used his skills to try and understand human differences and heredity.

Those of us in the performance analysis community particularly owe him a debt, since he is generally credited with creating the concept of regression and discovering the idea of regression to the mean, both of which we often apply with a vengeance. Of course, he was also a big proponent of eugenics, although in his defense, we can note that he cautioned people against the negative forms that became popular in the first half of the twentieth century.

Be that as it may, on that particular day, Galton watched as 800 people (many “non-experts” in Galton’s own words) bought tickets to try and guess the weight of an ox after it had been slaughtered and dressed, in order to win a prize. Being interested in the intelligence of individuals, he asked for the tickets after the event. After tabulating them, he noticed something interesting. The ox had actually weighed 1,198 pounds and while the guesses of the individuals formed a normal distribution, the mean of those guesses was 1,197.
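Galton's result is easy to reproduce in miniature: even when every individual guess is noisy, the average of many guesses hugs the true value. The 75-pound error distribution below is an invented illustration, not Galton's actual data.

```python
import random

random.seed(6)  # fixed seed for reproducibility
true_weight = 1198  # the ox's actual dressed weight in pounds

# 800 fairgoers each guess with independent noise; the 75-pound
# standard deviation is assumed purely for illustration
guesses = [true_weight + random.gauss(0, 75) for _ in range(800)]
crowd_mean = sum(guesses) / len(guesses)
# Individually noisy, collectively close: the mean of the 800 guesses
# lands within a handful of pounds of the true weight
```

The standard error of the mean shrinks with the square root of the crowd size, which is why 800 mediocre guessers can collectively rival an expert.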

As you might imagine, one of the wonderful byproducts of the information age is the ability to harness the wisdom Galton discovered through the blogosphere and the miracle of email. Well, in response to my column on platoon splits two weeks ago, I received a whole boatload of wisdom (although I would not classify readers as non-experts) and so this week I want to cull that wisdom to take a look at the two most frequently mentioned aspects that were missing from that original article.

Slice and Dice

For those who haven’t read the original article the thesis was simply this: performance analysts don’t typically put a whole lot of emphasis on platoon splits because the amount of variability inherent in them (even across halves of careers) means that for the majority of players there is little quantifiable difference between the assumption that there is a general platoon advantage for right-handed hitters versus left-handed pitchers (and vice versa), and the notion that there is an inherent skill above and beyond this general advantage.

The data set used in that article included all 505 players who garnered 2,000 or more plate appearances from 1970 through 1992 (incidentally the 1993-1998 and 2005 play by play data are now available on the Retrosheet web site).

Although we broke this group of hitters into left- and right-handers and measured both the average platoon split and how it correlated across even and odd years of careers, several readers wondered what the data would look like if it were further sliced. For example, is it the case that platoon splits are larger or smaller for certain kinds of hitters?

To take a first run at that question we can calculate the overall Isolated Power (ISO, defined as slugging percentage minus batting average) for each hitter and then divide the hitters into three groups.

ISO between .100 and .150. These are hitters with just below-average to slightly above-average power; the mean ISO for the group was around .136. Hitters like Garry Maddox, Alan Trammell, Robin Yount, Bill Buckner, and Jose Cruz are included in this list of 183 hitters, 81 left-handed and 102 right-handed.
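For concreteness, the ISO calculation and grouping can be sketched as follows. The middle band matches the .100-.150 range described above; the outer cutoffs are my reading of the implied low- and high-power groups.

```python
# Sketch of the grouping: ISO = SLG - AVG, with hitters binned into
# three power bands. The outer cutoffs are assumed, not stated.

def iso(slg, avg):
    """Isolated Power: slugging percentage minus batting average."""
    return slg - avg

def power_group(iso_value):
    if iso_value < 0.100:
        return "low"
    elif iso_value <= 0.150:
        return "middle"
    return "high"

# A .255 AVG / .376 SLG hitter has a .121 ISO and lands in the middle band
group = power_group(iso(slg=0.376, avg=0.255))
```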

What this table shows is that as a group, the higher the ISO, the larger the platoon split, at least when it comes to on base percentage and slugging percentage. This can also be shown graphically below, where differences in OPS are indicated by the Y-axis on the left, and split differences by AVG, OBP, and SLG are indicated on the right.

As we move to the right, the bars for right- and left-handed hitters get a little higher. One might theorize that the reason for this is that hitters with more power generally have longer swings. And with a longer swing comes the need to commit earlier and a greater likelihood of being victimized by pitchers of the same hand who feature hard breaking stuff.

This is borne out by taking a quick look at the pitchers that gave these hitters the most trouble. The following list includes the fifteen pitchers with the lowest OPS against left-handed hitters in the < .1 ISO group (with more than 250 plate appearances), along with their repertoire as listed in the highly-recommended The Neyer/James Guide to Pitchers.

This list is very different. It includes fourteen southpaws and just a single right-hander, and is not necessarily populated with the best overall pitchers of the period. And whereas the repertoires of the first group of pitchers varied, ten of those here featured sliders or hard curves in their top two pitches. Clearly the harder breaking stuff proved to be more difficult for the more powerful hitters.

Incidentally, Neyer/James did not have a listing for Bob Lacey, and I couldn’t find out what he threw before publication of this article. Inquiring minds do want to know, and so I’m sure some A’s fan out there can fill us in.

The other question then is whether or not splits for players with a larger ISO are more predictive. In other words, is a larger ISO indicative of players with more consistent platoon splits?

The answer appears to be 'no.' On balance, the correlations between even and odd years for the higher ISO groups are no larger than those for the lowest group, and there is no trend in either direction. Although this seems counterintuitive, it may simply indicate that as a group, power hitters have the same variability in their splits as other hitters do; their distribution is simply centered on slightly larger differences.
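The even/odd-year consistency check works by pairing each hitter's platoon split in even-numbered seasons with his split in odd-numbered seasons and correlating the two series across hitters. The split values below are made up purely to show the mechanics, not drawn from the study's data.

```python
# Sketch of the even/odd-year check: correlate each hitter's platoon
# split in even years with his split in odd years. Data is invented.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# One OPS split per hitter in each half of his career (illustrative)
even_splits = [0.020, 0.045, 0.010, 0.060, 0.035]
odd_splits = [0.030, 0.040, 0.015, 0.050, 0.025]
r = pearson_r(even_splits, odd_splits)
```

A correlation near zero between the two halves would mean the observed splits are mostly noise; a high correlation would indicate a stable, player-specific skill.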

This same question repeated itself many times, as readers asked whether platoon splits actually shrink over time. In other words, over the course of their careers, do hitters learn to hit better against same-side pitchers, and therefore reduce the opposing manager’s advantage when using his pen?

To look at this, we can take all of the hitters in the original study and focus only on those who played in the majors for 15 consecutive years. That leaves us with 104 players, 40 left-handed hitters and 64 right-handed.

We can then graph their performance over the course of these 15 years against both right and left-handed pitchers for each type of hitter.

By glancing at the graphs, you can see that split differences for both right- and left-handed hitters don’t really change much throughout the period. The gaps between the red and navy blue lines, the pink and green lines, and the aqua and purple lines remain comparatively constant, indicating that there appears to be little "learning" going on relative to performance against their strong side. For right-handed hitters, the gap remains roughly 18 points of batting average, 20 points of on-base percentage, and 38 points of slugging percentage, while for lefties the gaps are 26, 32, and 61 points respectively. However, there are two trends that are discernible.

First, the performance of right-handed hitters against right-handed pitchers declines precipitously around year 13, while performance against left-handed pitchers does so more slowly, thereby widening the gap a bit. For left-handed hitters this trend is not present. As a result, older right-handed hitters are probably relatively more valuable in a platoon role than are older left-handed hitters.

Second, left-handed hitters' performances against left-handed pitchers basically peak by year five and then hold steady, even as left-handers continue to improve against the opposite side through year nine. By contrast, right-handed hitters seem to improve against right-handed pitchers (as they do against lefties) more steadily through year eight before they begin to decline once again, widening the gap just a bit.

Focusing only on 15-year veterans gives us a long period of time to examine, but it cuts down the sample size and ensures that we’re looking only at a group of generally pretty good ballplayers. It certainly could be the case that players with lesser skills have a different progression, although it seems unlikely. In any event, it's a test that will have to wait for another day.

Heading Home

Like Galton I’m always impressed with the wisdom of crowds, and especially the insightful wisdom of BP readers. Keep it coming, and feel free to share your thoughts, opinions, complaints, and insights any time.

"The reason people find it so hard to be happy is that they always see the past better than it was, the present worse than it is, and the future less resolved than it will be" - Blaise Pascal

It's that time of year again when baseball minds start to think about next year and indulge in all manner of projections and forecasts. It is of course a perfect complement to the hot stove talk as we imagine what this or that player might do for our favorite team. In my column on Baseball Prospectus today I take a historical look at projections by "projecting the past," using a simple but fairly effective algorithm for generating projections for every player in every season from 1903 through 2006, some 17,000 player seasons in all.

What I was looking for were the biggest booms and busts of all time; in other words, which players most exceeded a reasonable expectation and which ones went into the tank. Although our (and our favorite front office's) expectations may not always be reasonable, the article compares the projection to the reality in a couple of different ways and lists a top and bottom ten using those methods. Hope you enjoy it. The projections use normalized and park-adjusted OPS and are based on a three-year weighted average that is regressed to the mean and adjusted for age and league.

So surf over and find out who was the biggest bust of all time (a hint: he hit a few taters in his time) and the biggest boom (hint: my sister's favorite player, and not a favorite of Mad Dog's).

The problem is that transfers are based on local revenues. Teams that receive money are encouraged to invest it in their payrolls. But if a team actually attracts fans by fielding a winning team, its revenue-sharing receipts will be reduced.

To create a more balanced playing field, revenue-sharing payments should be increased for teams that attract more fans. I have devised an approach for doing this based on a statistical analysis of teams’ payrolls, winning percentages and attendance. It takes into account the size of the team’s local population, to acknowledge that teams in places like New York and Chicago have greater financial incentives to invest in players than teams in places like Milwaukee and Kansas City do.

Here’s how my formula would have affected the revenue-sharing payments to the Pittsburgh Pirates, which last year filled only 60 percent of its seats but received $25 million in revenue sharing. If the team could have increased its attendance rate to 70 percent, its payments would have grown to $29 million, and if attendance had gone up to 80 percent, the payments would have reached $33 million. My formula would have had even more significant consequences for the Devil Rays. Based on the team’s 38-percent attendance rate, its revenue-sharing payments would have been reduced from $33 million to $13.5 million.
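As a sanity check on the quoted figures, the three Pirates numbers are consistent with a simple linear schedule of about $0.4 million per point of attendance rate. The sketch below just reproduces that arithmetic; it is not the study's actual formula, which also accounts for local population and clearly uses different parameters for each team, as the Devil Rays figure shows.

```python
# Back-of-the-envelope check of the quoted Pirates payments:
# 60% -> $25M, 70% -> $29M, 80% -> $33M implies a linear schedule.
# The base payment and $0.4M-per-point slope are inferred from those
# three data points, not taken from the study itself.

def pirates_payment(attendance_pct, base_pct=60, base_payment=25.0,
                    slope=0.4):
    """Payment in $M as a linear function of attendance percentage."""
    return base_payment + slope * (attendance_pct - base_pct)

payments = {pct: pirates_payment(pct) for pct in (60, 70, 80)}
```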

While the study won't be published for a few months, this synopsis makes me wonder how he takes into consideration the differences in the sources of local revenue between teams. It seems fine to remove the disincentive to attracting more fans, and looking at local population and attendance levels seems like a way to get this done. But clearly a large part of the disparity in local revenues has to do with media market size, which, incidentally, also influences attendance. Lewis made mention of media market size in the Carroll interview but didn't focus on it as I would have expected, since there is a real disparity between the teams. If the Yankees can bring in over $90M in local media revenues while a team like the Royals falls somewhere between $5M and $10M, there would have to be a portion of the revenue sharing that reflects only these differences and is unaffected by his plan.

Thursday, November 01, 2007

Today on Baseball Prospectus my column walked down memory lane back to 1977 and, after reliving a few memories of the Cubs and their collapse after holding a 7.5 game lead on July 1st, looked at the best and worst baserunners of that year, similar to what I did for 1987.

One of the points I made in the article followed the revelation that runs contributed via stolen bases and pickoffs (EqSBR) were often much larger in the negative range thirty years ago than we see today.

"Not all managers or players were aware of the costs of allowing Garry Templeton and Enos Cabell to run with abandon, leaving a trail of broken innings in their wake. As a result, stolen base percentages were relatively low, just 62.9 percent overall. When noting the general rule that a 67-70 percent success rate is required to break even (yes, it varies by the base/out situation, and that's taken into account by EqSBR), you can see how teams left a lot of runs on the field. In fact, the aggregate EqSBR for the majors stood at a staggering -418 runs for the 1977 season, or over -16 runs per team."
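The 67-70 percent break-even rule quoted above falls out of simple expected-run arithmetic: an attempt breaks even when the expected run value of the steal and the caught stealing cancel out. The run values below (roughly +0.19 runs for a steal and -0.44 for a caught stealing) are commonly cited approximations, not the exact situational inputs EqSBR uses.

```python
# Break-even success rate p solves: p * sb_value + (1 - p) * cs_value = 0.
# The run values are commonly cited approximations, assumed here.

def break_even_rate(sb_value=0.19, cs_value=-0.44):
    """Success rate at which a steal attempt is run-neutral."""
    return -cs_value / (sb_value - cs_value)

p = break_even_rate()  # roughly 0.70, in line with the quoted rule
```

At 1977's aggregate 62.9 percent success rate, each attempt sat well below that threshold, which is how the leaguewide total ends up deeply negative.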

A very smart reader (Guy Molyneux) correctly pointed out that while it's probably true that managers allowed marginal stolen base threats to run too frequently, they also employed the hit and run more often; from the perspective of EqSBR, those plays look exactly like stolen base attempts and are visible only when the runner is thrown out. In fact, I had noted in the piece that Mike Vail of the Mets was 0 for 9 (with two pickoffs), costing his team 4.6 runs, but there's little doubt that a good proportion of those were broken hit and run plays. Guy supports his point by noting that in 1977 there were fewer double plays grounded into per ground ball hit than in 2007 (.75 versus .82) and that there is a pretty strong negative correlation between stolen bases per game and GIDP per game (-.7 from 1970 through 2007). Most interestingly, this would also have an effect on runners advancing from first to third (a topic I also mentioned briefly). With additional runners in motion, they should have been more successful in advancing, and indeed that is what we see: the advancement rate has been declining since 1970.

In the final analysis, the more aggressive baserunning of the 1970s certainly cost teams runs via caught stealing, but some of that cost was offset by their ability to advance more frequently. While you might imagine that my metric that tracks advancement on hits (EqHAR) would compensate, that is not the case. EqHAR compares each runner against the average advancement for the year in question, so there is no absolute baseline against which it is measured, as there is with EqSBR. Unfortunately, the data we have doesn't allow us to totally disambiguate the two.