Glossary

Because we like baseball, we like statistics. Baseball statistics, that is. You can thoroughly enjoy baseball without paying any attention to its statistics, of course, but to really understand the game deeply, you’ve got to dive in.

Following are definitions of most of the stats we use. For more information, be sure to check out Fangraphs’ Sabermetric Library. By the way, when you see a writer describe a batter’s performance as three numbers separated by slashes, such as .275/.338/.425 (insert your own numbers), he’s referring to the batter’s BA/OBP/SLG. If you’re not sure what those are, read on.

Batting Average on Balls in Play. This is a measure of the share of batted balls in play that fall safely for a hit (not including home runs). The exact formula we use is (H-HR)/(AB-K-HR+SF). This is similar to DER, but from the batter’s perspective. And note: BABIP is not the exact inverse of DER; there are small differences in the formulas.
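To make the formula concrete, here is a minimal Python sketch. The function name and the season line are made up for illustration; only the formula itself comes from the definition above.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play to measure")
    return (h - hr) / balls_in_play


# Made-up season line, for illustration only.
print(round(babip(h=150, hr=20, ab=550, k=100, sf=5), 3))
```

Home runs come out of both the numerator and the denominator because no fielder has a chance to make a play on them.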

The percent of plate appearances that result in a walk. This measure is preferred to BB/9 (Walks per nine innings) because it has a proper denominator—plate appearances—instead of innings (an inning equals three outs regardless of the number of plate appearances).

A run contribution formula created by David Smyth, which quantifies the number of runs contributed by a batter. The fundamental formula for Base Runs is (baserunners times scoring rate) plus home runs. You can read more about it in this Wiki entry.

This measures the number of runs a baserunner contributed, above or below the average baserunner that year, through all baserunning events, including stolen bases, caught stealing, advancing on hits and being thrown out on the bases.

The number of hits a batter achieved by bunting the ball. BuH% is calculated as the number of bunt hits divided by the number of balls bunted into play (including sacrifice bunts). It doesn’t include attempted bunts or foul bunts, only bunts put in play.

One of the benefits of the Win Probability Added approach is that it provides a useful way to measure the criticality of a situation, via the Leverage Index statistic. “Clutch” is a single number that sums up a player’s performance in different Leverage Index situations. The specific definition starts with a player’s overall WPA divided by his overall Leverage Index. In general, a higher Leverage Index will accentuate his WPA contribution (good or bad), and dividing by LI “normalizes” it.

The stat then looks at each individual play and divides the player’s WPA in that specific play by the LI of that specific play. It then sums up all the plays in the year. Finally, it subtracts this figure (the sum of each specific WPA/specific LI) from the first figure (overall WPA/overall LI). The result is an index of whether the player did better or worse in “clutch” situations. Zero is neutral, a positive number is “clutch” and a negative number isn’t.
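The two steps can be sketched in Python as follows. This is only an illustration: the function name and the sample plays are invented, and "overall Leverage Index" is assumed to mean the simple average of LI across the player's plays (what the Leverage Index entry calls pLI).

```python
def clutch(plays):
    """plays: list of (wpa, li) pairs, one per play, with li > 0."""
    total_wpa = sum(wpa for wpa, _ in plays)
    # Assumption: overall LI is the plain average of per-play LI.
    overall_li = sum(li for _, li in plays) / len(plays)
    overall = total_wpa / overall_li
    # Each play's WPA divided by that play's own LI, summed over all plays.
    play_by_play = sum(wpa / li for wpa, li in plays)
    return overall - play_by_play


# Invented example: a good result in a big spot, a bad one in a quiet spot.
sample = [(0.10, 2.0), (-0.02, 0.5)]
print(round(clutch(sample), 3))
```

If every play had the same LI, the two figures would be identical and the result would be zero, which matches the "zero is neutral" reading.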

Defense Efficiency Ratio. The percent of times a batted ball is turned into an out by the team’s fielders, not including home runs. The exact formula we use is (BFP-H-K-BB-HBP-Errors)/(BFP-HR-K-BB-HBP). This is similar to BABIP, but from the defensive team’s perspective. Please note that errors should include only errors on batted balls.
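As a rough illustration, the DER formula translates directly into Python. The function name and the team totals are hypothetical; the `errors` argument is assumed to already be limited to errors on batted balls.

```python
def der(bfp, h, k, bb, hbp, errors, hr):
    """Defense Efficiency Ratio: (BFP-H-K-BB-HBP-Errors) / (BFP-HR-K-BB-HBP)."""
    outs_on_balls_in_play = bfp - h - k - bb - hbp - errors
    balls_in_play = bfp - hr - k - bb - hbp
    return outs_on_balls_in_play / balls_in_play


# Hypothetical team season totals, for illustration only.
print(round(der(bfp=6200, h=1400, k=1200, bb=500, hbp=60, errors=90, hr=170), 3))
```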

ERA measured against the league average, and adjusted for ballpark factors. The specific formula divides the league ERA by the pitcher’s ERA (and adjusts for ballpark). So an ERA+ of 125, for instance, means that the league ERA was 25% higher than the pitcher’s ERA (which means that the pitcher’s ERA was 80% of the league ERA). Careful with those ratios.

ERA measured against the league average, and adjusted for ballpark factors. Similar to ERA+, but the ratio is reversed: pitcher’s ERA is divided by the league ERA. An ERA- of 125 means that the pitcher’s ERA was 25% higher than the league average while an ERA- of 80 means that the pitcher’s ERA was 20% lower than league average.
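Because the two ratios are easy to mix up, here is a small Python sketch of both. It is deliberately simplified: the ballpark adjustment is left out, and the function names and ERA figures are invented for the example.

```python
def era_plus(pitcher_era, league_era):
    """League ERA divided by pitcher ERA, scaled to 100 (no park adjustment)."""
    return 100 * league_era / pitcher_era


def era_minus(pitcher_era, league_era):
    """Pitcher ERA divided by league ERA, scaled to 100 (no park adjustment)."""
    return 100 * pitcher_era / league_era


# Invented example: a 3.20 ERA in a league with a 4.00 ERA.
print(round(era_plus(3.20, 4.00)), round(era_minus(3.20, 4.00)))
```

The same pitcher shows up as 125 on one scale and 80 on the other, which is exactly the asymmetry the ERA+ entry warns about.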

Defensive Runs Saved. A fielding metric that combines a number of advanced stats into a single number that indicates the number of runs above/below average a player contributed at a certain position in the field. Developed by John Dewan and the Baseball Info Solutions staff, it includes the impact of every fielder’s range, the impact of an outfielder’s arm (on stopping runners from advancing), the ability to turn a double play (for middle infielders), how well third and first basemen handle bunts, how well pitchers and catchers control the running game and a catcher’s impact on earned runs allowed by specific pitchers. It also includes the impact of misplays and good fielding plays, as identified by the BIS video scouts.

Fielding Independent Pitching, a measure of all those things for which a pitcher is specifically responsible. The formula is (HR*13+(BB+HBP-IBB)*3-K*2)/IP, plus a league-specific factor (usually around 3.2) to round out the number to an equivalent ERA number. FIP helps you understand how well a pitcher pitched, regardless of how well his fielders fielded. The theory behind FIP was first articulated by Voros McCracken and the exact formula was invented by Tangotiger.
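Here is a minimal Python sketch of the FIP formula. The function name and the pitching line are hypothetical, and the default constant of 3.2 simply reuses the "usually around 3.2" figure above; the real factor varies by league and year. Innings are assumed to be true fractions (200 1/3 innings is 200.333, not the box-score shorthand 200.1).

```python
def fip(hr, bb, hbp, ibb, k, ip, league_constant=3.2):
    """(HR*13 + (BB+HBP-IBB)*3 - K*2) / IP, plus a league-specific constant."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip + league_constant


# Hypothetical pitcher season, for illustration only.
print(round(fip(hr=20, bb=50, hbp=5, ibb=3, k=200, ip=200.0), 2))
```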

GB/FB stands for Ground ball to Fly ball Ratio. It is the number of ground balls divided by the number of fly balls (but not line drives) hit by the batter or allowed by the pitcher. It includes all batted balls, not just outs.

The number of times a batter Grounded Into Double Plays. Depending on the data source, this sometimes includes all double plays, not just those that are grounded into. As a general rule, 80% to 90% of all double plays are grounded into.

Gross Production Average, a variation of OPS, but more accurate and easier to interpret. The exact formula is (OBP*1.8+SLG)/4, sometimes adjusted for ballpark factor. The scale of GPA is similar to BA: .200 is lousy, .265 is around average and .300 is a star.
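As a quick check on the scale, this Python sketch applies the GPA formula (without any ballpark adjustment) to the .338 OBP and .425 SLG from the slash line used as an example at the top of this glossary. The function name is our own.

```python
def gpa(obp, slg):
    """Gross Production Average: (OBP * 1.8 + SLG) / 4, no park adjustment."""
    return (1.8 * obp + slg) / 4


print(round(gpa(0.338, 0.425), 3))
```

That batter lands a little below the .265 mark quoted above as roughly average.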

Home Runs as a percentage of fly balls, including both infield and outfield flies (but not line drives). For pitchers, significant variations from 10% are probably random to a large degree, but for hitters this stat is more indicative of a true skill.

The percent of flyballs that are infield flies. On Fangraphs, infield flies are those that travel no more than 140 feet from home plate, including caught foul balls. For some pitchers, inducing infield flies may be a repeatable skill.

Strikeouts per nine innings pitched. This stat is inferior to K% because it is based on innings (or outs) instead of plate appearances. For instance, a pitcher who gets every out through a strikeout but gives up a hit half the time will have the same K/9 as a pitcher who strikes out every batter.
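The example in this entry can be worked through in a few lines of Python. The function names are ours, and both pitchers are assumed to throw exactly nine innings.

```python
def k_per_9(strikeouts, innings):
    """Strikeouts per nine innings pitched."""
    return 9 * strikeouts / innings


def k_pct(strikeouts, batters_faced):
    """Strikeouts as a share of batters faced."""
    return strikeouts / batters_faced


# Pitcher A strikes out all 27 batters he faces.
pitcher_a = (k_per_9(27, 9), k_pct(27, 27))
# Pitcher B also gets all 27 outs by strikeout, but gives up a hit
# to every other batter, so he faces 54 batters in the same nine innings.
pitcher_b = (k_per_9(27, 9), k_pct(27, 54))
print(pitcher_a, pitcher_b)
```

Both pitchers have the same K/9, but K% shows that the second one strikes out only half the batters he faces.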

Line Drive Percentage. Baseball Info Solutions tracks the trajectory of each batted ball and categorizes it as a groundball, fly ball or line drive. LD% is the percent of batted balls that are line drives. Line drives are not necessarily the hardest hit balls, but they do fall for a hit around 75% of the time.

Leverage Index, invented by Tangotiger, is a measure of how critical a specific batting situation is. One (1) is average, anything above one is more critical and anything less than one is less critical. There are several types of Leverage Index tracked at Fangraphs: pLI is LI averaged across all plays, gmLI is a pitcher’s LI when he first enters a game, inLI is the average LI at the beginning of each pitcher’s inning pitched and exLI is the LI when he leaves a game. For batters, phLI measures average LI in pinch-hitting situations.

A system for assigning weights to each type of play (outs, singles, doubles, etc.) that is fundamental to most Run Creation formulas today, such as wOBA. These weights are expressed as the average numbers of runs above or below average that each play created. There are several ways to assign linear weights to plays. One of the best-known is the Value Added approach, which is illustrated in this Retrosheet article by Tom Ruane. On most statistical websites, linear weights are calculated for each individual year; sometimes broken out for the league, too.

LOB stands for Left On Base. It is the number of runners that are left on base at the end of an inning. LOB% is slightly different; it is the percentage of baserunners allowed that didn’t score a run. LOB% is used to track pitcher’s luck or effectiveness (depending on your point of view). The exact formula is (H+BB+HBP-R)/(H+BB+HBP-(1.4*HR)). You can read more about LOB% in this article.
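For illustration, the LOB% formula looks like this in Python. The function name and the pitching line are invented; only the formula comes from the definition above.

```python
def lob_pct(h, bb, hbp, r, hr):
    """LOB%: (H + BB + HBP - R) / (H + BB + HBP - 1.4 * HR)."""
    baserunners = h + bb + hbp
    return (baserunners - r) / (baserunners - 1.4 * hr)


# Invented pitcher season, for illustration only.
print(round(lob_pct(h=180, bb=60, hbp=5, r=80, hr=20), 3))
```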

The Marcels are a simple way of calculating a player forecast. Named after the monkey from Friends (so simple a monkey could do them), they simply consist of averaging a player’s previous experience (with greatest weight on the most recent years) and regressing to the major league average depending on the number of years the player has been in the majors. This is done for each component (home runs, doubles, walks, etc.) A simple aging factor is applied, but no park factor. The Marcels, made available by Tangotiger, are explained in this post.

Meltdowns. Any relief appearance in which a pitcher caused his team’s WPA to decrease by at least .06. Meltdowns and Shutdowns are an improvement over Holds and Saves for ranking relief pitchers. Why .06? Read about it here.

On Base Percentage, the proportion of plate appearances in which a batter reached base successfully, including hits, walks and hit by pitches. The specific formula is (H + BB + HBP) divided by (AB + BB + HBP + SF). OBP is also a powerful performance metric when interpreted as the percentage of times the batter didn’t make an out.

The percentage of times a batter makes fair or foul contact with a pitch outside the strike zone when he swings at it. The strike zone is defined using PITCHf/x data, though firms with video scouts may interpret it differently. Average is 50% to 60%.

The percentage of pitches outside the strike zone that a batter swings at. The strike zone is defined using PITCHf/x data, though firms with video scouts may interpret it differently. Average is around 30%.

Putouts, the number of times a fielder recorded an out in the field. First basemen and outfielders get lots of these. From a pitching perspective, PO stands for pick offs — the number of times a pitcher picks a baserunner off a base.

PrOPS stands for “Predicted OPS.” It was developed by J.C. Bradbury and introduced in this article. PrOPS isn’t really a stat; it’s a formula for predicting what a player’s OPS is likely to be in the future based on his batted balls, strikeouts, home runs and walks.

A formula for converting a team’s Run Differential into a projected Won/Loss record. The formula is RS^2/(RS^2+RA^2). Teams’ actual won/loss records tend to mirror their Pythagorean records, and variances can usually be attributed to luck.

You can improve the accuracy of the Pythagorean formula by using a different exponent (the 2 in the formula). In particular, a sabermetrician named US Patriot discovered that the best exponent can be calculated this way: (RS/G+RA/G)^.287, where RS/G is Runs Scored per game and RA/G is Runs Allowed per game. This is called the PythagoPat formula.
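Both versions fit in a short Python sketch. The function names and the team totals are made up for the example; the formulas are the ones given in the two entries above.

```python
def pythagorean_win_pct(rs, ra, exponent=2.0):
    """RS^x / (RS^x + RA^x); x = 2 gives the basic Pythagorean formula."""
    return rs ** exponent / (rs ** exponent + ra ** exponent)


def pythagopat_exponent(rs, ra, games):
    """PythagoPat exponent: (RS/G + RA/G) ^ .287."""
    return (rs / games + ra / games) ** 0.287


# Made-up team: 800 runs scored, 700 allowed, over 162 games.
rs, ra, games = 800, 700, 162
basic = pythagorean_win_pct(rs, ra)
pat = pythagorean_win_pct(rs, ra, pythagopat_exponent(rs, ra, games))
print(round(basic * games), round(pat * games))
```

Multiplying the winning percentage by games played turns it into a projected win total, which is usually how the result gets quoted.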

Runs Created. Invented by Bill James, RC is a very good measure of the number of runs a batter contributed to his team’s offense. The basic formula for RC is OBP*TB, but it has evolved into over fourteen different versions.

Runs Created Above Average. A stat invented and tracked by Lee Sinins, the author of the Sabermetric Baseball Encyclopedia. Lee calculates each player’s Runs Created, and then compares it to the league average, given that player’s number of plate appearances. Lee uses a different version of RC than we do, though the two are very similar.

This is a very specific measure of the contributions a batter made to his team. The impact of each hit and out is measured by its impact on the change in the team’s actual and expected number of runs scored. The number of expected runs scored (both before and after the play) is based on a run expectancy table that includes all 24 base/out situations. You can find a typical run expectancy table at the beginning of this article.

As an example, a standard linear weight for a home run is 1.4. This means that, on average, all home runs add 1.4 runs to a team’s runs scored when factoring in the difference between run expectancies before and after the home run. In RE24, however, a home run with the bases empty will count as only one run (because a bases empty situation with the same number of outs has the same run expectancy). RE24 will then count a home run with the bases loaded and two outs as 3.2 runs, since the team was expected to score about 0.8 runs, on average, in that situation. So exactly *when* the batter hits each kind of hit is taken into account.

Note that RE24 doesn’t take the inning or score into account. This is done in WPA.
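Here is a rough Python sketch of how a single play is scored in RE24: the run expectancy after the play, minus the run expectancy before it, plus any runs that scored. The table values are placeholders (only the 0.8 for bases loaded and two outs comes from the example above), and a real table covers all 24 base/out states.

```python
# Placeholder run expectancies keyed by (bases, outs). Hypothetical values,
# except 0.8 for bases loaded with two outs, which is quoted in the text.
RUN_EXPECTANCY = {
    ("empty", 0): 0.5,
    ("empty", 2): 0.1,
    ("loaded", 2): 0.8,
}


def re24(before, after, runs_scored, table=RUN_EXPECTANCY):
    """Change in run expectancy on one play, plus runs scored on it."""
    return table[after] - table[before] + runs_scored


# Solo home run, nobody out: the base/out state is unchanged, so it is worth 1 run.
solo = re24(("empty", 0), ("empty", 0), 1)
# Grand slam with two outs: 4 runs score and the bases are cleared.
slam = re24(("loaded", 2), ("empty", 2), 4)
print(solo, round(slam, 1))
```

With these placeholder numbers the grand slam comes out to the four runs scored minus the 0.8 that was already expected, plus the small expectancy that remains with the bases empty and two outs.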

When you talk about stats, how do you compare players? You can compare batters and pitchers to the equivalent of “zero” runs (though being above zero obviously has different implications for batters and pitchers). You can compare players to average too. In fact, you have to compare fielders to average if you use advanced fielding stats such as Ultimate Zone Rating. So when you combine batting, pitching and fielding stats into one “uberstat” such as Win Shares or Wins Above Replacement, you have to compare players to average.

And that creates a problem. Comparing players to average means that a player who plays just one game and is average in that game is just as valuable as a player who played an entire average season. That doesn’t make sense. So we compare players to replacement level instead. Replacement level is usually thought of as the level at which a player can be easily replaced by a bench player, a freely available free agent or someone else. Replacement levels are also useful for salary analysis in which it is assumed that a player who performs at a replacement level should receive a minimum salary.

There are several legitimate ways to calculate replacement levels. Here is one, presented by our own Sean Smith. A key thing to remember is that replacement levels will differ by position, because it is harder to find a good-hitting catcher than a good-hitting first baseman (for example).

Runs Saved Above Average. This stat, which is also tracked and reported by Lee Sinins, is a measure of a pitcher’s effectiveness and contribution. The formula is league-average RA/IP minus the pitcher’s RA/IP, times total innings pitched, so a positive number means the pitcher allowed fewer runs than an average pitcher would have.

Revised Zone Rating is the proportion of balls hit into a fielder’s zone that he successfully converted into an out. Zone Rating was invented by John Dewan when he was CEO of Stats Inc. John is now the owner of Baseball Info Solutions, where he has revised the original Zone Rating calculation so that it lists balls handled out of the zone (OOZ) separately (and doesn’t include them in the ZR calculation) and doesn’t give players extra credit for double plays (Stats had already made that change).

Shutdowns. Any relief appearance in which a pitcher caused his team’s WPA to increase by at least .06. Meltdowns and Shutdowns are an improvement over Holds and Saves for ranking relief pitchers. Why .06? Read about it here.

Ultimate Zone Rating. The name was coined by John Dewan in the late 1990s and the stat was created by Mitchel Lichtman in the early 2000s. Essentially, UZR looks at the trajectory and speed of every batted ball and, based on overall major league averages, assigns a probability that a certain position will field it. If a player at that position fields it, he gets credit above the overall major league average. If he doesn’t, he gets negative credit.

UZR is adjusted in many ways and includes other fielding metrics. It gets complicated very quickly. But the bottom line is that a player with zero UZR is an average fielder at his position. A positive number indicates he was an above-average fielder, and a negative number indicates that he was below average.

Wins Above Replacement. This is a metric that combines a player’s contributions on offense and defense and then compares him to the appropriate replacement level for his position. Currently, there are two versions in general use, at Baseball Reference and Fangraphs. The framework was developed at Tangotiger’s blog.

Weighted On Base Average. wOBA was developed by Tangotiger and introduced in The Book: Playing the Percentages in Baseball (essential reading for all sabermetric baseball fans). wOBA is a rate stat, calibrated to follow the same scale as On-Base Percentage, that captures the value of all of a hitter’s contributions.

Win Probability Added. A system in which each player is given credit toward helping his team win, based on play-by-play data and the impact each specific play has on the team’s probability of winning. You can read more about WPA in The One About Win Probability.

Weighted Runs Created Plus. This measures how a player’s wRC compares to the league average, adjusted for ballpark. 100 is average, 120 is 20 percent above average and 80 is 20 percent below average.

Win Shares. Invented by Bill James. Win Shares is a very complicated statistic that takes all the contributions a player makes toward his team’s wins and distills them into a single number that represents the number of wins contributed to the team, times three. Bill continues to make changes to his system and you can read about them at Bill James Online.

Expected Fielding Independent Pitching. This is an experimental stat that adjusts FIP and “normalizes” the home run component. Research has shown that home runs allowed are pretty much a function of fly balls allowed and home park, so xFIP is based on the average number of home runs allowed per outfield fly. Theoretically, this should be a better predictor of a pitcher’s future ERA.

The percentage of times a batter makes fair or foul contact with a pitch inside the strike zone when he swings at it. The strike zone is defined using PITCHf/x data, though firms with video scouts may interpret it differently. Average is 85% to 90%.

For fielding stats, the areas on a ballfield in which at least a certain percentage of batted balls (typically around 50%) are handled for outs. Zones are standardized and defined separately for each position. For pitching stats, the strike zone.

The overall percentage of pitches that are inside the strike zone, regardless of whether the batter swung or not. The strike zone is defined using PITCHf/x data, though firms with video scouts may interpret it differently. Average is 45%.

The percentage of pitches inside the strike zone that a batter swings at. The strike zone is defined using PITCHf/x data, though firms with video scouts may interpret it differently. Average is around 65%.