
Zach Britton is MLB’s best groundball pitcher. His groundball rates in each of the past two years—twice eclipsing 77 percent—are the highest figures on FanGraphs’ all-time leaderboard. If you don’t watch many Orioles games, you might assume Britton induces his grounders chiefly with sinkers at the bottom of the strike zone. Keeping the ball down, after all, is the sinkerballer’s credo, but the Orioles closer doesn’t fit into that mold.

Against righties, there are ample pitches at and below the knees, but the thicker clusters are in the heart of the strike zone. Similarly, most grounders coming off lefties’ bats came on middle-middle pitches. What gives? The platitude about keeping the ball down hardly applies to Britton’s grounders, and it might be overblown for other pitchers, too. Many other pitch and contextual factors could play a part in predicting grounders. Which matter most?

Set-up

The plan here will be to predict whether or not a batted ball will be a grounder with the three dozen variables below.

Total number of hard fastballs thrown inside before that pitch (THIFB), total number of slow fastballs thrown inside before that pitch (TSIFB)

Each of these inputs is its own individual question. Is a grounder more likely when the pitcher throws with a sharp downward plane? What if the pitcher mixes his pitches well (measured by entropy)? What if the situation calls for the batter to loft the ball into the outfield? To what extent does previously pitching inside matter? We’ll put these questions through a model that will answer each simultaneously: a decision tree model with boosting.
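The pitch-mix measure mentioned above can be illustrated with a short sketch. The article does not spell out how entropy was computed, so this assumes plain Shannon entropy over a pitcher's pitch-type shares; the function name and the pitch-type codes are made up for illustration.

```python
import math
from collections import Counter

def pitch_mix_entropy(pitch_types):
    """Shannon entropy (in bits) of a pitcher's pitch-type distribution.

    Assumption: entropy is taken over simple usage shares; the article
    does not state the exact definition it used.
    """
    counts = Counter(pitch_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        share = n / total
        entropy -= share * math.log2(share)
    return entropy

# Hypothetical arsenals: one pitch only vs. four pitches used equally.
one_pitch = ["SI"] * 100
even_mix = ["FF", "SL", "CH", "CU"] * 25

one_pitch_entropy = pitch_mix_entropy(one_pitch)
even_mix_entropy = pitch_mix_entropy(even_mix)
```

A pitcher who throws a single pitch scores zero, and the score grows as usage is spread more evenly across more pitch types.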

You’ve likely come across a standard decision tree, which models an event by taking a data set and stratifying it into spaces. Those trees are nicely straightforward and take a simple flow chart form. But a single decision tree suffers from high variance: it tends to overfit the data it was built on and then does poorly in out-of-sample testing with new data.

Variance can be reduced by fitting the tree many times over, motivating the use of boosting. This adds power to the decision tree framework by fitting a new tree to each current tree’s set of residuals. Over many tree-fitting iterations, this “slow learning” process attacks areas where the model is underperforming and adapts to provide more spot-on predictions. Because the model allows for non-linear relationships and accounts for any potential interaction effects, it’s suitable for spatial data (like pitch locations). Its output won’t be a tidy coefficient formula, but we’ll get the percentage influence of each factor as part of the whole.
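The residual-fitting loop described above can be sketched in a few lines. This is a toy, not the model behind the article: it assumes one-split trees ("stumps"), squared-error residuals on 0/1 grounder labels, and a tiny made-up data set with two features (vertical location and vertical movement, both in inches). All function names are hypothetical.

```python
def fit_stump(X, residuals):
    """Return the one-split tree (feature, threshold, left_mean, right_mean)
    that minimizes squared error against the residuals, or None if no split exists."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return None if best is None else best[1:]

def predict_stump(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def boost(X, y, n_trees=50, learning_rate=0.1):
    """Slow learning: each new stump is fit to what the current model still gets wrong."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return {"base": base, "rate": learning_rate, "stumps": stumps}

def predict(model, row):
    return model["base"] + model["rate"] * sum(
        predict_stump(s, row) for s in model["stumps"])

# Made-up pitches: [vertical location, vertical movement]; 1 = grounder.
X = [[-4, -6], [-2, -5], [0, -4], [1, -7], [3, 2], [5, 4], [6, 1], [8, 6]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

model = boost(X, y)
low_sinking = predict(model, [-3, -5])   # low pitch with downward movement
high_rising = predict(model, [7, 5])     # high pitch with "rise"
```

A real classifier would typically boost on a log-loss rather than squared error and would use deeper trees to capture interactions, but the loop has the same shape: predict, take residuals, fit a small tree to them, add a shrunken copy to the model.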

I’ll run individual models for three pitch-type groups: fastball, offspeed, and breaking. The broad pitch-type categories (detailed in the appendix) will allow us to home in on the traits necessary to get a grounder out of a fastball, offspeed pitch, or breaking pitch. For instance, the fastball model can simultaneously address the difference between four-seamers with a bit of tail vs. sinkers with great movement. Pitches on 0-0 counts are excluded at this stage so the previous pitch attributes can be evaluated in full.

After fine-tuning through cross-validation, the models perform reasonably well. By area under the ROC curve (AUC), my predictions would be regarded as fairly good, as per this primer from the University of Nebraska.
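For readers unfamiliar with AUC, here is a minimal sketch of what the number measures. It assumes the pairwise definition (the chance that a randomly chosen grounder received a higher predicted probability than a randomly chosen non-grounder, with ties counting half); the labels and scores are invented, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives (grounders) and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one grounder and one non-grounder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Invented example: 1 = grounder, scores are predicted grounder probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)

# A model that gives every batted ball the same score cannot rank them at all.
coin_flip = roc_auc(labels, [0.5] * 6)
```

An AUC of 0.5 is no better than guessing and 1.0 is a perfect ranking, which is the scale the cited primer grades against.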

Results

The influence scores reported by the boosted trees are normalized to sum to 100, showing the relative importance of each of the variables passed into the models. Pitch attributes will be our starting point, as they do the heavy lifting in explaining grounders. First we’ll put the sinkerballer’s credo to the test, considering vertical location’s influence results and its underlying marginal effects: modeled values of groundball percentage as the other factors are taken at their averages.
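Both ideas in this paragraph, influence scores rescaled to sum to 100 and marginal effects read off with the other factors held at their averages, can be sketched as follows. The raw gain numbers, the feature names, and the stand-in prediction function with its coefficients are all made up for illustration; they are not taken from the article's models.

```python
import math

def normalize_influence(raw_gains):
    """Rescale raw per-variable importance so the scores sum to 100."""
    total = sum(raw_gains.values())
    return {name: 100.0 * gain / total for name, gain in raw_gains.items()}

def marginal_curve(predict, rows, feature, grid):
    """Sweep one feature across `grid` while every other feature sits at its sample mean."""
    means = {k: sum(r[k] for r in rows) / len(rows) for k in rows[0]}
    curve = []
    for value in grid:
        point = dict(means)
        point[feature] = value
        curve.append((value, predict(point)))
    return curve

# Hypothetical raw gains from a fitted model.
influence = normalize_influence(
    {"vertical_location": 39.0, "vertical_movement": 45.6, "everything_else": 115.4})

# Stand-in for a fitted model; the coefficients are invented.
def toy_predict(point):
    z = -0.06 * point["height"] - 0.14 * point["vmov"]
    return 1.0 / (1.0 + math.exp(-z))

rows = [{"height": 2.0, "vmov": 4.0}, {"height": 10.0, "vmov": 8.0},
        {"height": 18.0, "vmov": -3.0}]
curve = marginal_curve(toy_predict, rows, "height", [-6, 0, 6, 12])
```

Each point on the curve answers one question: for an otherwise average pitch, what is the modeled chance of a grounder at this height?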

In this and several other charts, you’ll notice in the x-axes that PITCHf/x factors were rescaled so lefties and righties of various heights can be compared analogously. Here, a knee-high pitch takes a value of zero.

INFLUENCE OF VERTICAL LOCATION ON GB%

Fastball    Offspeed    Breaking
19.5%       23.0%       19.8%

Vertical location clearly is important. In terms of influence, it’s the most vital factor in offspeed and breaking-ball predictions and a close second in fastball predictions. Still, pitchers should recognize that it’s just ~20 percent of the equation.

What are the practical effects of a well-located pitch? With similarly interweaving curves, it’s clear that down is universally better. One notable difference here is that the curves for secondaries curl up on the right side—showing that as pitchers throw breaking balls and offspeed pitches higher and higher, GB% flattens out and stops dropping. Contrast that with fastballs, for which GB% continues to get worse as pitches are thrown belt-high and above. Locating an otherwise-identical pitch an inch lower raises the probability of a grounder by 1.4 percent on fastballs and 1.2 percent for offspeed and breaking balls.

By influence, horizontal location matters quite a bit more for offspeed than hard or breaking stuff, but all chart curves here follow a sinusoidal path. Throwing inside to a batter is a bit better than grooving one (obviously); after passing by the middle of the plate, GB% climbs and climbs the farther outside a pitch is from a batter. I would have guessed that pounding hitters inside with riding sinkers or cutters is a reasonably effective way to get grounders, but that’s hardly the case. Instead, as long as a pitch is thrown at least from the middle of the plate (at dist ≈ 2.25) and outward, each additional inch outside will result in GB% rises of 1.7 percent for fastballs, 1.4 percent for offspeed, and 1.3 percent for breaking balls.

Moving down the line, we’ll next look at movement.

INFLUENCE OF VERTICAL MOVEMENT ON GB%

Fastball    Offspeed    Breaking
22.8%       14.3%       13.0%

The most interesting takeaway here is that vertical movement is the most crucial factor for fastballs. Heat’s percentage importance nearly doubles the rates owned by secondaries, and its marginal GB% changes faster.

Still, for all pitches, the more downward movement (the more negative the number), the better the resulting GB%. “Rising” pitches, those with positive movement, aren’t good for grounders. The penalty eases up slightly at the highest rungs of movement, although the curves are still strongly linear (with correlations in excess of 0.98 in all three instances). All else equal, each additional inch of downward movement will increase GB% by 3.5 percent for fastballs, 2.2 percent for offspeed pitches, and 1.7 percent for breaking balls.

INFLUENCE OF HORIZONTAL MOVEMENT ON GB%

Fastball    Offspeed    Breaking
9.2%        1.6%        1.7%

Lateral movement is among the most important characteristics for fastballs, comprising nearly 10 percent of the recipe. The chart shows that fastballs moving in towards hitters (with negative movement) are effective groundball pitches; those are cutters to opposite-handed hitters and sinkers to same-handed hitters. For each additional inch of inward movement beyond 1.5 inches, a pitcher can raise his fastball GB% by 2.5 percent. Any outward movement beyond that hurts the groundball effort.

It’s easy to envision a hitter swinging at a fading changeup and weakly grounding out to the pull side. But the extent that an offspeed (or breaking) pitch moves away from a batter is of no consequence. This is reflected in small influence figures and flat marginal curves. If anything, sliders and changeups moving in towards batters are a teeny tiny bit more effective, but in the end, lateral movement shouldn’t be part of the pitcher’s calculus if he’s looking to turn an offspeed or breaking pitch into a grounder.

Velocity is a big part of Britton’s ability to overpower hitters; how does it help groundball percentage?

INFLUENCE OF VELOCITY ON GB%

Fastball    Offspeed    Breaking
3.6%        10.1%       10.6%

In terms of influence, velocity is roughly three times as important for secondary pitches as for fastballs. Yet all the velocity curves are similar, being close to parallel as they proceed on extremely linear paths. With an extra 1.0 mph of velocity, an otherwise identical pitch will yield a 1.5 percent rise in GB% for fastballs, a 1.7 percent GB% bump for offspeed pitches, and a 1.6 percent jump for breaking balls.

The rest of the pitch attributes, shown below, hold much lesser weight in prediction. Surprisingly, release height is among this group.

INFLUENCE OF OTHER PITCH ATTRIBUTES

Input             Fastball    Offspeed    Breaking
Release height    2.4%        2.1%        2.0%
Break angle       1.9%        1.0%        3.6%
Spin rate         1.5%        1.2%        2.0%
Arm angle         2.9%        2.4%        2.5%

When a short pitcher comes along, there’s a question of whether his lack of downward plane will make it hard to get hitters out. But height doesn’t measure heart, and it also isn’t much of a groundball catalyst. It helps some; an extra upward inch in release height, for instance, adds a 0.4 percent GB% boost on fastballs. But again, there are much larger groundball rises to be had if a pitcher can squeak out inch-level improvements in location and movement. Short guys shouldn’t be deterred if their groundball stuff is otherwise solid.

Next we’ll move onto the other variable categories. The influence figures for the situational factors are shown below.

INFLUENCE OF SITUATIONAL FACTORS

Input                    Fastball    Offspeed    Breaking
Handedness               0%          0%          0%
Plate count              2.1%        0.4%        0.5%
Leverage index           0.4%        0.7%        0.8%
Double-play situation    0%          0%          0.1%
Sac fly situation        0%          0%          0.1%
Two out nobody on        0%          0.1%        0%
Year                     0.7%        1.2%        1.2%

Even if a batter wants to hit a sac fly, stay out of a double play, or launch a home run with two outs and the bases empty, there won’t be any change in whether his batted ball is a grounder or not. Whether or not the batter finds himself in a clutch situation hardly matters either. Failing to pick up crucial sac flies can be frustrating, but maybe we should give batters a pass, as the outcome appears to be out of their control and counterbalanced by the pitcher’s desire to prevent a fly ball.

Plate count matters a bit on fastballs—GB% trickles down by about one percent as the count becomes more favorable to the hitter.

Another finding here is that batter/pitcher handedness, in and of itself, is irrelevant. It’s how pitches move that is important. That’s a nontrivial distinction, particularly when many managers are wedded to making substitutions that optimize the traditional left/right platoon. Pitcher arsenals need to be considered when making relief and pinch-hitting substitutions.

INFLUENCE OF BATTER/PITCHER TALENT

Input               Fastball    Offspeed    Breaking
Batter Proj. GB%    12.3%       11.4%       13.5%
Batter Proj. HR%    1.2%        1.2%        1.6%
Pitcher Entropy     1.3%        1.3%        1.6%

Yes, it’s true: groundball-hitting batters hit grounders. These influence figures hover around 12 percent, far less than the pitch attributes discussed above. The upshot here is that pitchers are much more in control of whether or not the ball is hit on the ground.

The batter’s home run talent and pitcher’s ability to mix pitches hold virtually no significance, the same fate that meets the previous pitch characteristics.

INFLUENCE OF PREVIOUS PITCH ATTRIBUTES

Input                      Fastball    Offspeed    Breaking
Vertical location          1.5%        2.3%        2.2%
Horizontal location        2.2%        1.9%        2.4%
Vertical movement          1.0%        1.6%        1.9%
Horizontal movement        0.7%        0.7%        1.0%
Velocity                   1.0%        1.1%        1.3%
Break angle                0.8%        1.4%        1.6%
Is fastball                0%          0.1%        0%
Is offspeed                0%          0%          0%
Is breaking                0%          0%          0%
Is hard fastball inside    0%          0%          0%
Is slow fastball inside    0%          0%          0%
Pitcher pace               0.8%        0.8%        1.7%

The way a pitcher immediately sets up the ball-in-play pitch is pretty unimportant in generating grounders. Totaled together, the “previous” variables are a bit mightier (they compose about one-tenth of the recipe), but improving a pitch’s GB% this way can only be done in tiny increments. All pitches see slight groundball bumps if an inside pitch or a low pitch precedes the BIP pitch. Pitches’ groundball-friendliness also can get little boosts if the prior pitch “rises” high, is slower, or comes at a quick pace.

A pitcher who does all these things well can raise his GB% a few ticks. But the greatest increases come when pitchers improve their movement or location instead of sequencing. The last table shows that even the long-revered brushback pitch is inconsequential.

INFLUENCE OF PREVIOUS EVENTS IN THE PA

Input    Fastball    Offspeed    Breaking
THIFB    0.1%        0%          0%
TSIFB    0.1%        0%          0.1%

Left out of the original analysis were a pair of extra factors that are worth testing. In previous research, Baseball Prospectus’ Harry Pavlidis found that to get a grounder from a pitched change-up, it’s good if there’s a small gap between the fastball and offspeed offering, and it’s good if the change-up sinks more relative to the fastball. I went back to re-run the offspeed model with these variables included. The direction of my results was in agreement with his: Offspeed pitches perform better with smaller velocity differentials and more sink than fastballs. But the big difference in my results is that these factors hold little import. Each factor hits just over three percent importance.

The difference, I’d think, is due to my use of more rigorous methods that further take context into account. Between this latest result and the lackluster results from the other sequencing variables, it’s clear that pitches have a natural groundball talent unto themselves, largely distinct from other aspects of the arsenal.

Wrapping up with the Best Groundball Pitchers

This analysis shows that keeping the ball down is just 20 percent of the groundball puzzle, a lower estimation than most sinkerballers surely would guess. It’s important, but the same can be said of several other factors. Velocity, both components of movement, horizontal location, and the batter’s own groundball tendencies matter a great deal, and other factors also claim smaller chunks of predictive power.

So who does the model predict as the best groundball pitchers? The table below shows the top ten player-seasons by predicted GB% on all pitches (min. 100) through 2015. For completeness, separate models were run to make groundball estimates for pitches coming on 0-0 counts, and those predictions are included in these tallies.
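A ranking like this one can be assembled from per-pitch predictions in a few lines. The sketch below assumes the season figure is a simple average of each pitch's predicted grounder probability, which the article does not confirm; the pitcher names, the data, and the function name are placeholders.

```python
from collections import defaultdict

def top_player_seasons(pitches, min_pitches=100, top_n=10):
    """Rank pitcher-seasons by mean predicted GB%.

    pitches: iterable of (pitcher, season, predicted_grounder_probability).
    Assumption: the season value is an unweighted mean over all pitches.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (pitcher, season) -> [sum, count]
    for pitcher, season, prob in pitches:
        entry = totals[(pitcher, season)]
        entry[0] += prob
        entry[1] += 1
    ranked = [
        (pitcher, season, 100.0 * prob_sum / count)
        for (pitcher, season), (prob_sum, count) in totals.items()
        if count >= min_pitches
    ]
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked[:top_n]

# Placeholder data: Pitcher C has the best rate but too few pitches to qualify.
pitches = (
    [("Pitcher A", 2015, 0.75)] * 120
    + [("Pitcher B", 2015, 0.50)] * 150
    + [("Pitcher C", 2015, 0.90)] * 40
)
leaders = top_player_seasons(pitches)
```

Because every pitch gets a prediction, not just the ones put in play, a figure built this way is not directly comparable to a pitcher's actual GB% on batted balls.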

Despite the upward locations in the initial chart, the models identify Britton as the best groundballer in the PITCHf/x era. His fastball’s velocity, downward movement, and lateral movement are all at the top of the class. Joining Britton at the top are several of the best sinkerballers of the past eight years.

Notice also there are more seasons coming from the Pirates than any other club. The model loves Jared Hughes, and John Holdzkom’s dominant nine-inning stint with the 2014 Buccos was enough to earn him a 7th-place ranking. This isn’t a surprise, given the Pirates’ devotion to a strategy of creating grounders to be gobbled up into defensive shifts. An interesting question follows: what impact does being a Pirate have on a pitcher’s groundball percentage? Tomorrow we’ll examine that question and take a closer look at Pittsburgh’s strategy.

Gerald Schifman is the lead researcher at Crain's New York Business and a writer at The Hardball Times. He previously worked in the New York Mets' baseball operations department and in Major League Baseball's publishing department. Follow him on Twitter @gschifman.

You only give us the top 10 predicted GB% pitcher seasons since 2008 to support your methodology’s effectiveness and unfortunately that is not much support at all. The top 3 predicted seasons are indeed the top 3 actual GB% pitcher seasons and Venters’ 2012, predicted as number 5, is actually number 6. Which would be impressive if any of the other 6 predicted top 10 were even close, but none of those even cracked the top 30. Hughes and Holdzkom’s predicted GB% as 78%+ actually turn out to be 56.3% and their top 10 rankings fall to the 280s. That… Read more »

I didn’t list the top 10 seasons to show the models’ effectiveness; my intention was to highlight the best GB pitchers. I can show some additional testing here: I grabbed pitcher-pitch type-seasons which had at least 40 BIP, and compared the predicted GB% to the actual GB%, weighting pitchers by their total BIP. Aggregated up, fastball and changeup predictions were -0.1% less than the actual figures. For breaking balls, the difference was an even slimmer -0.01% underprediction. Over and underpredictions cancel themselves out, leaving no persistent bias in either direction. As MGL notes below, zero error is to be expected… Read more »

1 year 3 months ago

Guest

Peter Jensen

The average AUC hovered around ~0.7 in each instance. You didn’t include the actual AUC value in your initial article so there was no way to evaluate your model from that standard. You stated that your predictions were “fairly good” but the source you cited in your article evaluates a .7 AUC as being on the cusp between fair and poor. One thing to remember about the figures in the table is that these are all-pitch predictions, not just estimates for BIP. It’s not an apples-to-apples comparison between those predictions and the actual GB% (which come only on BIP, of… Read more »

Your point about the AUCs has merit. Still, scores were generally inside that ‘C’-grade band, and the predictions are at least useful as benchmarks. I’m saying that if Eppley’s every pitch were put into play, I predict that his GB% would be 80.1%. We can’t directly evaluate whether the model predictions are right or wrong, because many pitches won’t have a BIP to compare against. I look at the all-pitch predictions as another way to consider a pitcher’s GB talent beyond the typical GB/BIP percentage. The fact is that as a pitch crosses the plate with a certain location, movement, etc.,… Read more »

1 year 3 months ago

Guest

Peter Jensen

I you implying that Cody Eppley you you list as the 3rd best GB pitcher with your metric predicting 80.1% of his pitches would be ground balls if hit only had an actual GB rate of 60.3% because the batters wouldn’t cooperate and hit his best pitches? This sentence should have been: Are you implying that Cody Eppley, who you listed as having the 3rd best pitcher season with your metric predicting 80.1% of his ground balls if they were hit, actually had a ground ball percent of 60.3 because the batters wouldn’t cooperate and hit his best pitches? Sorry… Read more »

7 months 19 days ago

Guest

MGL

Good and interesting analyses, although somewhat intuitive. I agree with Peter in that the model should not be biased, i.e. in large samples the predicted and actual should be equal. He is also right that the training data and testing data should definitely be independent. Given that they are not, it is even more surprising that the predicted and actual results are not equal. “This analysis shows that keeping the ball down is just 20 percent of the groundball puzzle, a lower estimation than most sinkerballers surely would guess. ” A sinkerball is actually defined in two ways: One, pitches… Read more »

I don’t think I imply that a sinkerball is only tied to vertical location, but yes, certainly—whether or not a pitch is a sinker hinges principally on downward movement.

As for your other remarks, please see my response to Peter (above).

1 year 3 months ago

Guest

GB

“Good and interesting analyses, although somewhat intuitive.”

That may be the nicest thing MGL has ever said online.

1 year 3 months ago

Guest

Peter Jensen

“That may be the nicest thing MGL has ever said online.”

That is an unfair and untrue and silly comment. MGL is a tough critic and doesn’t pull his punches when he feels that posters haven’t done their homework, but he often gives praise to those who he feels have done serious research and analysis well.