On June 24, 2017, Joe Mauer went 2-for-2 with two walks against Corey Kluber. It was perhaps one of Mauer’s most impressive performances of last season; Kluber allowed only one other base runner in seven innings and struck out 13. For this performance, and many others, Kluber won his second Cy Young Award last year.

But on that particular day – June 24 – the numbers say (my numbers say) Corey Kluber had reached his peak. In the third inning, when he faced Joe Mauer, he was more difficult to reach base against than any other pitcher, at any time, in 2017.

I present… relOBP! (Also its cousins, relAVG and relSLG, but they can wait.)

relOBP attempts to quantify how good a batter is at reaching base, and how good the opposing pitcher is at preventing that, for each play in the season. relOBP therefore requires two numbers: a pitcher score and a batter score. Let’s jump in.

Defining relOBP

Math follows, but not particularly hard math.

Suppose that, in a given play, the probability of a batter getting on base is given as P(reach) = o*c, where o = the effective number of opportunities the pitcher gives per plate appearance (average 1.000) and c = the batter’s rate of capitalizing on those opportunities.

If every pitcher always has o-score = 1.000, then the c-score is simply a batter’s on-base percentage (or, for my purposes, his on-base percentage in the surrounding ±30 plate appearances).
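As a rough illustration of that starting point, here is a minimal Python sketch of a rolling on-base percentage over the surrounding ±30 plate appearances. The function name `rolling_obp` and the choice to include the current plate appearance in its own window are my assumptions, not details taken from the original code.

```python
def rolling_obp(reached, half_window=30):
    """Initial c-score estimate: OBP over the surrounding plate appearances.

    reached: list of 0/1 flags, one per plate appearance, in chronological order.
    half_window: how many PAs to look back and ahead (30 in the article).
    Assumption: the current PA is counted in its own window.
    """
    scores = []
    for i in range(len(reached)):
        lo = max(0, i - half_window)                  # clip at season start
        hi = min(len(reached), i + half_window + 1)   # clip at season end
        window = reached[lo:hi]
        scores.append(sum(window) / len(window))
    return scores


# Tiny example with a window of +-1 PA instead of +-30.
print(rolling_obp([1, 0, 1, 0], half_window=1))
```

Near the start and end of the season the window is simply shorter, which is why those plate appearances carry less information.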

But once we have an initial estimate for each batter’s c-score at each plate appearance, we can use it to estimate the opposing pitcher’s o-score at that plate appearance as well. How? Well, let’s suppose that a pitcher has a constant o-score o over an interval of plate appearances 1…n. Then the expected number of batters to reach base is E[Total_Reach] = o*∑c_i, where c_i is the c-score of the ith batter (at this point, his on-base percentage over his recent and upcoming plate appearances). Thus, the pitcher’s o-score at the center of the interval can be estimated as the number of batters who actually reach base, divided by the sum of their c-scores.
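To make that estimate concrete, here is a small Python sketch of the o-score calculation for one interval. The function name `estimate_o_score`, the sample numbers, and the fallback to 1.0 when the c-scores sum to zero are illustrative assumptions.

```python
def estimate_o_score(reached, c_scores):
    """Estimate a pitcher's o-score over an interval of plate appearances.

    reached: 0/1 flags for whether each batter in the interval reached base.
    c_scores: the c-score of each of those batters at that plate appearance.
    Returns (batters who reached) / (sum of their c-scores).
    """
    total_c = sum(c_scores)
    if total_c == 0:
        return 1.0  # assumption: fall back to a neutral pitcher
    return sum(reached) / total_c


# Six batters whose c-scores sum to 2.00; only one reaches base.
c = [0.30, 0.35, 0.32, 0.28, 0.40, 0.35]
print(estimate_o_score([0, 0, 1, 0, 0, 0], c))  # about 0.5: a tough pitcher
```

An o-score below 1.000 means fewer batters reached than their c-scores predicted, and above 1.000 means more did.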

We now have first estimates of o- and c-scores; but with better o-scores, we can calculate better c-scores, and vice versa. Thus, we can just iterate this process until o- and c-scores converge (which they do, rather rapidly).
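The back-and-forth can be sketched in a few lines of Python. This is a deliberately simplified version, not the author's code: it gives each batter one season-long c-score and each pitcher one season-long o-score instead of using rolling windows, it assumes the c-score update mirrors the o-score update (times reached divided by the sum of opposing o-scores), and it does not renormalize o-scores to average exactly 1.000. The name `iterate_scores` and the sample plays are hypothetical.

```python
from collections import defaultdict


def iterate_scores(plays, n_iter=20):
    """Alternate between c-score and o-score estimates.

    plays: list of (batter, pitcher, reached) tuples, reached in {0, 1}.
    Simplification: one constant score per player, no rolling windows.
    """
    batters = {b for b, _, _ in plays}
    pitchers = {p for _, p, _ in plays}
    o = {p: 1.0 for p in pitchers}  # start with every pitcher neutral
    c = {}
    for _ in range(n_iter):
        # c-score: times reached / sum of opposing pitchers' o-scores
        num, den = defaultdict(float), defaultdict(float)
        for b, p, r in plays:
            num[b] += r
            den[b] += o[p]
        c = {b: num[b] / den[b] if den[b] > 0 else 0.0 for b in batters}
        # o-score: times reached / sum of opposing batters' c-scores
        num, den = defaultdict(float), defaultdict(float)
        for b, p, r in plays:
            num[p] += r
            den[p] += c[b]
        o = {p: num[p] / den[p] if den[p] > 0 else 1.0 for p in pitchers}
    return o, c


plays = [("A", "P1", 1), ("A", "P2", 0), ("B", "P1", 0),
         ("B", "P2", 1), ("A", "P1", 1), ("B", "P2", 0)]
o_scores, c_scores = iterate_scores(plays)
print(o_scores, c_scores)
```

After the final o-score update, each pitcher's expected times on base allowed (the sum of o*c over his plate appearances) matches the number he actually allowed, which is the consistency condition the iteration is chasing.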

I ran 20 iterations of this process on all plays from the 2017 regular season, calculating c-scores for on-base percentage, slugging percentage, and batting average, though as I said I’ll focus on on-base percentage in this article. For purely arbitrary reasons, my intervals were a batter’s previous and next 30 plate appearances, and a pitcher’s previous and next 50 plate appearances (when available); however, the final numbers are not especially sensitive to interval sizes.

/end math

relOBP Leaders

Let’s check out some leaderboards! Consider the following table, showing the Top 10 batters by relOBP in 2017, as well as the actual Top 10.

The following table shows a player’s relOBP as his average c-score, weighted by how many adjacent plate appearances were available to calculate it (e.g. early and late season plate appearances are weighted less heavily).

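| RK | Player | relOBP | Avg. opponent o-score | PA |
|----|--------|--------|-----------------------|-----|
| 1 | Joey Votto | 0.462 | 0.990 | 707 |
| 2 | Mike Trout | 0.455 | 0.975 | 507 |
| 3 | Aaron Judge | 0.436 | 0.969 | 678 |
| 4 | Jose Altuve | 0.434 | 0.952 | 662 |
| 5 | Paul Goldschmidt | 0.411 | 1.000 | 665 |
| 6 | Justin Turner | 0.408 | 1.017 | 543 |
| 7 | Kris Bryant | 0.407 | 1.010 | 666 |
| 8 | Tommy Pham | 0.405 | 1.003 | 530 |
| 9 | Anthony Rendon | 0.404 | 1.008 | 605 |
| 10 | Eric Hosmer | 0.402 | 0.961 | 671 |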
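For readers curious how a season-long relOBP could be built from per-plate-appearance c-scores, here is one possible Python sketch of the weighting. The exact weight used here (the count of neighboring plate appearances available within ±30) and the name `weighted_rel_obp` are my assumptions about what "weighted by how many adjacent plate appearances were available" means.

```python
def weighted_rel_obp(c_scores, half_window=30):
    """Average a batter's c-scores, weighting each PA by available neighbors.

    c_scores: the batter's c-score at each plate appearance, in order.
    Assumed weight: number of other PAs within +-half_window of this one,
    so early- and late-season PAs count for less.
    """
    n = len(c_scores)
    if n == 0:
        return 0.0
    weights = [min(i, half_window) + min(n - 1 - i, half_window)
               for i in range(n)]
    total = sum(weights)
    if total == 0:  # a single plate appearance has no neighbors
        return c_scores[0]
    return sum(w * c for w, c in zip(weights, c_scores)) / total


# With a +-1 window the weights are [1, 2, 1], so the middle PA counts double.
print(weighted_rel_obp([0.2, 0.4, 0.6], half_window=1))  # about 0.4
```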

And 2017’s actual season leaders:

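| RK | Player | OBP | PA |
|----|--------|-----|-----|
| 1 | Joey Votto | 0.454 | 707 |
| 2 | Mike Trout | 0.442 | 507 |
| 3 | Aaron Judge | 0.422 | 678 |
| 4 | Justin Turner | 0.415 | 543 |
| 5 | Tommy Pham | 0.411 | 530 |
| 6 | Jose Altuve | 0.410 | 662 |
| 7 | Kris Bryant | 0.409 | 665 |
| 8 | Paul Goldschmidt | 0.404 | 665 |
| 9 | Anthony Rendon | 0.403 | 605 |
| 10 | Freddie Freeman | 0.403 | 514 |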

While the order changes a bit, the top-10 are mostly the same (Freddie Freeman falls from 10th in OBP to 16th in relOBP, however, and is replaced by Eric Hosmer, who was 11th in OBP).

Note, however, that not everyone faced the same quality of competition. Justin Turner faced weaker pitchers, on average; he effectively had 1.017 times as many opportunities to get on base as a hitter facing neutral pitching (~9 more PA over the course of the season), while Jose Altuve faced tougher competition, effectively losing 32 plate appearances.
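The arithmetic behind those two figures is just plate appearances times the gap between the average opponent o-score and 1.000. A quick Python check, using the numbers from the relOBP table (the helper name `extra_opportunities` is mine):

```python
def extra_opportunities(pa, avg_o_score):
    """Effective plate appearances gained (+) or lost (-) versus neutral pitching."""
    return pa * (avg_o_score - 1.0)


print(extra_opportunities(543, 1.017))  # Justin Turner: roughly +9.2
print(extra_opportunities(662, 0.952))  # Jose Altuve: roughly -31.8
```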

relOBP and Luck

relOBP lets us see (approximately) how good the opposing pitcher is in each plate appearance (of course, we’re not accounting for handedness in our simple model).

For example, here’s a season of plate appearances from Mike Trout.

When the o-score (orange) dips low, that’s a tough matchup; when it spikes, it’s an easy one. In gray, you can see Mike Trout’s actual rolling OBP, and in blue, his c-score. When the c-score is higher than the OBP, Trout was hitting better than he appeared (given the matchup), and vice versa. You might wonder what the cumulative difference in those scores is; did he gain or lose expected times on base? On net, he was unlucky; he lost almost 6 times on base because he faced harder-than-average pitching.
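One way to read "times on base lost" is as the sum, over a batter's plate appearances, of c*(1 - o): what he would expect against neutral pitching minus what he would expect against the pitchers he actually faced. That reading is my assumption, not a formula stated in the article, but a season-level approximation built from the leaderboard numbers lands close to the "almost 6" quoted for Trout. The function names below are hypothetical.

```python
def times_on_base_lost(c_scores, o_scores):
    """Sum of c*(1 - o) over plate appearances (assumed definition).

    Positive values mean the batter lost expected times on base to
    tougher-than-average pitching; negative values mean he gained.
    """
    return sum(c * (1.0 - o) for c, o in zip(c_scores, o_scores))


def season_estimate(pa, rel_obp, avg_o_score):
    """Season-level approximation using only leaderboard columns."""
    return pa * rel_obp * (1.0 - avg_o_score)


print(season_estimate(507, 0.455, 0.975))  # Mike Trout: roughly 5.8
print(season_estimate(662, 0.434, 0.952))  # Jose Altuve: roughly 13.8
```

The season-level version treats relOBP and the average o-score as if they were constant all year, so it will not match the per-plate-appearance totals exactly.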

Trout was not the hardest hit, however. That would be Miguel Cabrera (and indeed, much of the Detroit Tigers’ lineup):

| Player | Times on base lost |
|--------|--------------------|
| Miguel Cabrera | 16.3 |
| Justin Upton | 14.3 |
| Jose Altuve | 13.9 |
| Ian Kinsler | 13.8 |
| Nicholas Castellanos | 13.6 |
| Manny Machado | 12.5 |
| Melky Cabrera | 12.0 |
| Jonathan Schoop | 11.8 |
| Adam Jones | 11.5 |
| Jose Abreu | 11.2 |

The Tigers were facing some tough pitching, apparently. One wonders if some of the other fine hitters here (Altuve, Machado, Abreu) were particularly prone to facing difficult relievers, and if this would explain their presence on the list. On the flip side, the luckiest player of 2017 in this respect was Ozzie Albies, who gained 9.7 times on base.

I have a lot more graphs, but that’s really not the point of the article: I only wish to introduce relOBP to you all, and now you’ve met it and can be friends.

I would love to hear your reaction to relOBP, to the methodology behind it, and any suggestions you might have for improving it. Also, if you would like to see my code or play with some of the data, let me know in the comments!

Thanks! Code and a filtered version of the 2017 data are here: https://github.com/polkerty/relobp. It’s not heavily commented. To get a leaderboard, set the “OUTPUT_PLAY_STRENGTH” constant at the top to false (it is not false by default); otherwise you’ll get a CSV showing the o/c scores for each plate appearance. Also of note: while Statcast includes batters’ names in plain text, pitchers appear only as IDs, but if you google “mlb [id]” you’ll find the player of interest. And also of note: the code is set up to work nicely for generating hitting leaderboards, but not pitching leaderboards. I guess…

In other words, the worse the lineup, the more their opportunities are affected.

This is a good observation! It seems difficult to remove the bias of one’s lineup when evaluating the opposing pitcher. Maybe we could look at the pitcher’s last start and next start, but then we also lose some precision. Any ideas on this front are welcome.

Also of note – the 2017 Tigers were 16/30 in MLB in OBP, so the effect isn’t *all that* strong.