Baseball ProGUESTus

Entropy and the Eephus

Most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers, and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.

My favorite pitch of the 2013 season happened in late August, in a game between two teams far out of contention. Carlos Villanueva was on the mound, Jayson Werth in the box. Following a 90 mph first-pitch fastball, Villanueva unleashed a glorious, arcing 57 mph eephus: a ball without spin, without speed, and utterly unexpected.

Werth’s reaction, dubbed by some an “existential crisis,” was undoubtedly among the most en-GIF’d moments of the year. In stunned repose, he watched as the lazy arc of the ball took it over the plate for a called strike. Asked after the game about his daring decision to challenge one of the best hitters in the NL with an eephus, Villanueva offered an apt explanation: “I’m not a power pitcher so I have to mix up my repertoire.”

I want to linger for a moment on Villanueva’s quote. We tend to take it as received wisdom that a pitcher who lacks velocity can make up for it to some extent with a broad and varied arsenal of pitch types. (Usually, this arsenal doesn’t include a large dose of the eephus, although it has been employed to great effect in the past.) The theory goes that by cleverly mixing pitches, a cunning pitcher is able to create uncertainty in the opposing hitter, thus driving him to hesitate (or, in Werth’s case, freeze solid). Yet, whereas we can quantify and examine the effect of a fastball’s speed on a pitcher’s results, we run into considerably more difficulty in analyzing how a junkballer leverages uncertainty to deceive a batter.

I set out to measure this uncertainty and its effects on a pitcher’s success. To rigorously quantify uncertainty, I use a metric called entropy. Entropy is an important quantity throughout science, statistics, and psychology. In practice, entropy is best thought of as the number of yes-or-no questions that must be answered to reveal the identity of an unknown pitch. We can divide entropy into two components: one is simply the number of pitch types a pitcher throws, and the other is how evenly he uses that repertoire. This latter component, which I call evenness, can be defined as the entropy divided by the number of pitch types. Evenness therefore encapsulates the portion of entropy that depends on how uniformly pitches are employed.

Thus, entropy is high when a pitcher has many different kinds of pitches, each of which is used with some frequency. In this case, from the batter’s perspective, it is difficult to predict what the next pitch will be. Conversely, low entropy would be achieved when a pitcher is all-fastball, all the time, in which case a hitter can focus primarily on location.
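As a minimal sketch of the two quantities described above, the Python below computes entropy in bits (Shannon entropy, which matches the yes-or-no-question reading) and evenness as entropy divided by the number of pitch types. The function names, the pitch-type codes, and the two pitch logs are hypothetical illustrations, not the Pitch Info data.

```python
import math
from collections import Counter


def pitch_entropy(pitches):
    """Shannon entropy, in bits, of a sequence of pitch-type labels."""
    if not pitches:
        raise ValueError("need at least one pitch")
    counts = Counter(pitches)
    total = sum(counts.values())
    # Each pitch type contributes p * log2(1/p), where p is its usage share.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def evenness(pitches):
    """Entropy divided by the number of distinct pitch types (the article's definition)."""
    return pitch_entropy(pitches) / len(set(pitches))


# Hypothetical pitch logs: an all-fastball pitcher and a perfectly balanced four-pitch mix.
fastball_only = ["FF"] * 100
balanced_four = ["FF", "SL", "CH", "CU"] * 25

print(pitch_entropy(fastball_only))   # 0.0 bits: no questions needed
print(pitch_entropy(balanced_four))   # 2.0 bits: two yes-or-no questions
print(evenness(balanced_four))        # 0.5
```

With four equally likely pitches, two yes-or-no questions pin down the pitch type, which is why the balanced mix comes out at exactly two bits.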

Note well here that I am focusing only on the uncertainty a pitcher builds by varying pitch type. There are many other ways in which a pitcher could confuse a batter, such as by varying speeds (as in the eephus) and location in the zone. Certainly, these two elements (and likely other things, besides) are important pieces of the puzzle, but they are beyond the scope of the present article.

Some methodological details follow. I used pitch type data graciously supplied to me by Pitch Info. I define fastball velocity as the speed of the pitcher’s top 100 pitches, to avoid situations in which a pitcher throws a purposefully slower fastball variant (two-seam vs. four-seam vs. cut, etc.). And I limit myself here to starting pitchers, whom I define as those who threw >100 innings in 2013 (except R.A. Dickey, on account of his knuckleball, which generates uncertainty in an entirely different manner).
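The velocity definition above can be sketched in a few lines. This assumes that "top 100 pitches" means the 100 fastest pitches and that their speeds are averaged; the article does not spell out either detail, and the function name and sample speeds are hypothetical.

```python
import heapq
from statistics import mean


def top_velocity(speeds_mph, n=100):
    """Mean speed of the n fastest pitches (assumed reading of 'top 100 pitches')."""
    if not speeds_mph:
        raise ValueError("need at least one pitch speed")
    fastest = heapq.nlargest(n, speeds_mph)  # returns fewer than n if the log is short
    return mean(fastest)


# Hypothetical season log: 150 slower two-seamers and 100 harder four-seamers.
season_speeds = [90.0] * 150 + [95.0] * 100
print(top_velocity(season_speeds))  # 95.0
```

Taking only the fastest pitches keeps deliberately slower fastball variants from dragging the figure down, which is the stated purpose of the definition.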

To begin, I wanted to test Villanueva’s axiom: do pitchers with slow fastballs possess high entropy to make up for it? For a first approximation, the answer appears to be yes.

There is a weak, but significant, negative relationship between fastball velocity and entropy (R² = .099). Interestingly, the correlation is primarily driven by the absence of pitchers who fall in the lower left portion of the graph, the region corresponding to throwers with neither complex repertoires nor fastball velocity. Here we run into a classic problem in sabermetrics, selection bias: pitchers who have neither fastball velocity nor entropy are presumably not successful, and so don’t pitch in the major leagues.
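For readers who want to reproduce this kind of fit, here is a minimal sketch of a simple linear regression of entropy on velocity that also reports R². The function name and the five data points are made up for illustration and are not the article's sample. For scale, an R² of .099 with a negative slope corresponds to a correlation of roughly -0.31.

```python
def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, plus R-squared."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Hypothetical (velocity in mph, entropy in bits) pairs, for illustration only.
velocity = [88.5, 90.0, 91.5, 93.0, 95.0]
entropy = [2.1, 1.7, 2.0, 1.6, 1.8]

slope, intercept, r_squared = linear_fit(velocity, entropy)
print(f"slope={slope:.3f}, R^2={r_squared:.3f}")
```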

This point can be powerfully illustrated by an outlier in the above graph, Jason Marquis. Marquis, it turns out, had the sixth-worst fastball velocity in the league last season, and coupled that with the fifth-worst entropy. I don’t think it is a coincidence that he also had by far the lowest VORP of any pitcher, nearly doubling (in negative value) the next-worst player. Should he continue to post such putrid results, he will be hard-pressed to remain on a major league roster (and thus, within the sample).

To get a better feel for the different types of players, I broke down each quadrant of the graph, picked out some representative hurlers (some good and some bad for each), and computed a few relevant summary statistics.

Beginning in the upper left, we see the pitchers who lack velocity but have high entropy. This group of wily veterans, containing names such as Mark Buehrle, Hisashi Iwakuma, and yes, Carlos Villanueva himself, corresponds pretty well to what I would consider “junkballers.” Lacking plus fastball velocity, they must rely upon control and their extensive pitch repertoires. However, they don’t tend to produce many strikeouts, and their results suffer.

At the opposite corner, we find a cluster of familiar names: Max Scherzer, Justin Verlander, and Clayton Kershaw among them. These are excellent pitchers who thrive with lively fastballs and hence need not dig deep into the junkballer’s bag of tricks. The only strike against them is the highest rate of free passes of the bunch. For hopefully obvious reasons, I call these the power pitchers, and so dub the regression line of entropy vs. velocity the Junkballer-Power Pitcher Axis.

The lower left corner has few pitchers in it, and the ones who live there mostly aren’t very good, with exactly two exceptions: Cliff Lee and Madison Bumgarner. These are the only two (out of twenty) with sub-3.5 FIPs (by far the lowest fraction of all the quadrants); they are, I think, the exceptions who prove the rule. The rest of this category I call the scrubs; the less said about them the better.

But in the upper right, we again encounter quality, albeit at the expense of slightly lower fastball velocity (hence, “Slower Aces”). Indeed, of the four quadrants, this group has the lowest median FIP, and encompasses such top-tier talents as David Price and Yu Darvish. These guys have it all: searing fastballs and gyrating breaking balls, deployed in entropic mixtures to produce the best strikeout-to-walk ratio of all the categories. This group has the best percentage of excellent (sub-3.5 FIP) pitchers, narrowly edging out the Power Pitchers.
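The quadrant split can be sketched as below. The article does not say where the dividing lines fall, so this assumes a split at the median velocity and median entropy of the sample; the function name and the four placeholder pitchers are hypothetical.

```python
from statistics import median


def quadrant_labels(pitchers):
    """Map name -> quadrant label, given name -> (velocity_mph, entropy_bits).

    Assumption: quadrants are cut at the sample medians of each variable.
    """
    velo_cut = median(velo for velo, _ in pitchers.values())
    entropy_cut = median(ent for _, ent in pitchers.values())
    labels = {}
    for name, (velo, ent) in pitchers.items():
        fast = velo >= velo_cut
        varied = ent >= entropy_cut
        if varied and not fast:
            labels[name] = "Junkballer"      # upper left
        elif fast and not varied:
            labels[name] = "Power Pitcher"   # lower right
        elif fast and varied:
            labels[name] = "Slower Ace"      # upper right
        else:
            labels[name] = "Scrub"           # lower left
    return labels


# Placeholder pitchers with made-up numbers, one per quadrant.
sample = {
    "Pitcher A": (88.0, 2.2),
    "Pitcher B": (95.0, 1.4),
    "Pitcher C": (88.5, 1.3),
    "Pitcher D": (94.0, 2.1),
}
print(quadrant_labels(sample))
```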

I hope this quadrant-splitting exercise made two things clear. First, there are ways to be successful and unsuccessful in each of the four categories. Each group has at least one ace, and each has at least one barely competent, basically replacement-level starter. This truth speaks to the diverse ways in which major league pitchers can excel: sometimes with velocity, sometimes with entropy and trickery, and sometimes with things that clearly can’t be captured by these two factors (ahem, Cliff Lee). It may also be the case that Lee, bane of my results, actually does create a lot of uncertainty, but through a mechanism my pitch-type-focused metric doesn’t count (e.g., a deceptive delivery or varying location).

Secondly, there appear to be some potentially interesting relationships between entropy and some of the determinants of pitcher success. To explore this notion, I adopted a multiple linear regression approach. For this analysis, I split entropy up into its component parts in case they had different effects. The two elements of entropy are 1) the number of unique pitches a player throws, and 2) how evenly those pitches are used. I focus for now on strikeouts, measured as K/9, because I think this is the most obvious way that entropy can be useful (as Werth can surely attest).

It’s been known for some time that fastball velocity is linked to lots of strikeouts, and I find that to be true. I also account for breaking ball nastiness, measured by the average level of spin. Considering (alongside those elements of “stuff”) the effects of our two kinds of entropy, I arrive at the following model:

As expected, fastball velocity is the most important determinant of K/9. Riding second and third, however, are the two components of entropy: evenness, and number of pitch types. These both strongly increase strikeout rate, as hypothesized. Indeed, their effects are just slightly stronger than my spin-based index of breaking ball nastiness.

There’s one additional unexpected variable here, and that’s an interaction term between the two entropy components. Generally, the interaction term implies that although evenness and pitch repertoire size are good for racking up the strikeouts, having both may not increase a pitcher’s Ks as much as expected—in other words, increasing entropy offers diminishing returns. There are a few possible explanations for this effect. I favor the notion that above some threshold value of entropy, batters simply stop trying to guess what the next pitch will be. At this point, instead of being bamboozled by uncertainty, they may just wait to see how the pitch develops and react accordingly.
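To make the interaction term concrete, here is a sketch that fits K/9 on pitch-type count, evenness, and their product, then reads off how the payoff from evenness changes with repertoire size. Everything numeric here is invented: the coefficients in `true_beta` and the sample points are hypothetical, the data are noise-free, and velocity and spin are left out to keep the example short. The helper names are mine, not the article's.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(n_types, evenness):
    """Columns: intercept, pitch types, evenness, interaction."""
    return [1.0, float(n_types), evenness, n_types * evenness]


def evenness_effect(beta, n_types):
    """Change in predicted K/9 per unit of evenness at a given repertoire size."""
    return beta[2] + beta[3] * n_types


# Hypothetical coefficients with a negative interaction, used to generate K/9.
true_beta = [2.0, 0.8, 6.0, -1.0]
samples = [(3, 0.30), (3, 0.50), (4, 0.30), (4, 0.50),
           (5, 0.40), (6, 0.30), (6, 0.45)]
rows = [design_row(n, e) for n, e in samples]
k9 = [sum(b * x for b, x in zip(true_beta, row)) for row in rows]

beta = fit_ols(rows, k9)
for n_types in (3, 4, 5, 6):
    print(n_types, round(evenness_effect(beta, n_types), 3))
```

With a negative interaction coefficient, the marginal benefit of evenness shrinks as the repertoire grows, which is the diminishing-returns pattern described above.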

I ought to note that this regression isn’t some weird, artificial effect of small sample size: all of the terms in the model are statistically significant, most of them extremely so. The combined model fits the data quite well and is able to explain in total more than 25 percent of the variation in strikeout rates across the league (R² = .273). So, entropy clearly offers some explanatory power in understanding strikeout rates. Because the strikeout is so important to a pitcher’s ultimate results, this positive relationship suggests that all other things being equal, a pitcher with more entropy is a better pitcher.

With that said, I don’t mean to overstate the importance of entropy. Fastball velocity is, as ever, king, with entropy a merely secondary influence. Entropy is something to compensate for or augment velocity, as Villanueva’s quote illustrates and as the Junkballer-Power Pitcher Axis bears out. The two elements of entropy (pitch evenness and the size of a pitcher’s repertoire) do predict strikeout rates with some accuracy. In addition, I would be remiss not to mention some excellent recent work from MGL and Jon Roegele, which shows respectively that throwing more kinds of pitches and throwing a lower percentage of fastballs reduces the times-through-the-order penalty (TTOP). Because these factors are highly correlated with more entropic pitchers, their results suggest that entropy may offer ancillary benefits in terms of reducing batter familiarity. Most of all, entropy offers a first glance into the complex guessing game that is pitch selection, in which the reward for guessing just right can be a hitter’s fearful glimpse into the void.

Funny that the announcers mentioned Dave LaRoche during the Villanueva/Werth pitch bc I happened upon this yesterday - http://2.bp.blogspot.com/-br6BDUDN2AE/UvL59-0BRoI/AAAAAAAADLI/npmuUYGIXlw/s1600/Dave+RaLoche+strikes+out+Gorman+Thomas+on+eephus.gif ... I wish I could properly credit the person who put it in my feed, but I only had the GIF open on a tab and not the tweet.

Shouldn't this analysis separate left-handed from right-handed pitchers? I'm not so surprised to see Cliff Lee and Madison Bumgarner in that lower-left part of the chart, in that they have one degree of "differentness" wired in just by being lefties. There probably are interesting lefty/righty comparisons to be done here, but lumping them all together seems to me to run an apples-and-oranges risk.

Yes, there are some interesting differences in the data between lefties and righties.
The prevailing patterns related to strikeout rate, velocity, and entropy are there in both subpopulations, though, which is why I chose to pool them. In other words, although LH and RH are apples and oranges with regards to some characteristics, they seem to be all apples with regards to how entropy increases strikeout rate.

In the future, I'm definitely going to split them up and examine each separately, also looking at batter handedness as a factor.

The negative interaction for the entropy components is the result of how evenness is calculated (entropy/number of pitches) - for a given degree of entropy evenness is inversely related to the number of pitches. Try the regression again using entropy instead of evenness or use the logarithms of evenness and number of pitches to avoid the artifact.
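A quick sketch of the relationship being described, assuming only the article's definition of evenness as entropy divided by the number of pitch types. The entropy value of 2.0 bits is a hypothetical example and the function name is mine.

```python
import math


def evenness_from(entropy_bits, n_types):
    """Evenness as defined in the article: entropy / number of pitch types."""
    return entropy_bits / n_types


entropy_bits = 2.0  # hypothetical entropy, held fixed

# Holding entropy fixed, evenness falls as the number of pitch types rises.
for n_types in (4, 5, 6):
    print(n_types, round(evenness_from(entropy_bits, n_types), 3))

# On the log scale the two components add up to entropy exactly:
# log(entropy) = log(evenness) + log(n_types)
n_types = 5
lhs = math.log(entropy_bits)
rhs = math.log(evenness_from(entropy_bits, n_types)) + math.log(n_types)
print(math.isclose(lhs, rhs))
```

Because entropy equals evenness times the number of pitch types, taking logarithms turns that product into a sum, which is why the log-log specification needs no separate interaction term.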

I'm not sure I see why evenness and pitch types are necessarily related (besides the obvious), or why exactly that would result in an artifactual interaction term. But taking your word for it, I recalculated the model with just entropy, and the logarithm of evenness + the logarithm of pitch types. In each case, the effect of entropy and its components is to increase K rate. Hopefully that ameliorates your concerns.

"You should either have a model with entropy and number of pitches and their interaction."

I'm still a little lost, cause you said "either" but then no "or". So, to recap, I fit the following models:
- entropy alone
- pitch types + evenness + interaction
- logarithm of pitch types + logarithm of evenness

All had positive effects of entropy or its components on K/9.

It doesn't make a lot of sense to me to fit entropy and number of pitches in the same model (as I think you suggested) because number of pitches is in the calculation for entropy, and so they are VERY highly correlated. This causes multicollinearity.
While evenness and number of pitch types are also (negatively) correlated, it is to a much smaller degree.

Is there a way to measure the release point and/or arm action for different types of pitches?

I recall from many years ago reading Mike Mussina spent hours practicing his delivery of various pitches to make his release look exactly the same, regardless of pitch type, to delay the time at which the batter could pick up the pitch type. I took the story at face value...wondering if it actually can be measured.

If so, "sameness" of delivery might explain some of how a pitcher like Cliff Lee separates himself from his otherwise scrub-worthy peers based on velocity+entropy. Or, alternately, if measurable it may be possible to add it into the Gory Math soup, and refine (upwards, presumably) the entropy score of a Cliff Lee, and reduce that of a scrub, helping to explain the differences in performance.

Are you planning to go look at whether the measurable difference between the pitch types matters for this effect? That is, maybe batters do worse against a pitcher who has three pitches with radically different velocity/horizontal movement/vertical movement, than against one who uses four that are just different enough to be classified separately.