The MVP race: Harper and his rivals

When I last wrote about the MVP race a month ago, I said I’d revisit it near the end of the season. With Bryce Harper posting amazing statistics again in September, he seems to have emerged as the consensus pick for MVP. Although a bit of a slump over the last week cost him his lead in a few categories, he still has the best batting statistics in baseball. As of now (after the first game of the Nats’ October 3 doubleheader), he leads the majors in slugging, on-base plus slugging (OPS), and wins above replacement (WAR); leads the National League in batting average and runs scored; is tied for the lead in home runs; and is a close second in on-base percentage.

Who are his rivals? The place to start is with the leaders in WAR. Let’s look at this link from FanGraphs, which combines position players and pitchers; I’ll look at the version that bases pitcher WAR 50% on fielding independent pitching (FIP) and 50% on runs allowed, though you can compare with other versions. Here are the top 15 in the NL through Friday, October 2:

Bryce Harper 9.4

Jake Arrieta 8.4

Clayton Kershaw 8.0

Zack Greinke 7.6

Joey Votto 7.5

Paul Goldschmidt 7.1

Yoenis Cespedes* 6.7

A.J. Pollock 6.4

Kris Bryant 6.3

Jason Heyward 6.0

Andrew McCutchen 5.8

Max Scherzer 5.8

Buster Posey 5.6

Anthony Rizzo 5.5

Gerrit Cole 5.2

*For Yoenis Cespedes, I’ve included his WAR earned in the American League before his trade. He’s earned 2.7 WAR while with the Mets.

Among the top three pitchers, Arrieta has made his last start, while Greinke is scheduled to pitch tonight and Kershaw might pitch tomorrow, leaving the tight Cy Young race still a bit up in the air.

Although no single candidate has emerged as a strong rival to Harper, I’d like to focus on interesting cases that have been made for a couple of players: Cespedes and Rizzo.

The case for Cespedes

The argument is that the Mets were a near-.500 team, lagging behind the Nationals in a lackluster race, until Cespedes was acquired at the trade deadline on July 31. Since August 1, the Mets have gone 36–21 and easily clinched the division title. Hitting .287/.336/.610 with 17 home runs for his new team, Cespedes was the spark that lit the Mets’ offense.

This argument tells a story, and like most such stories it simplifies in ways that distort. For example, in addition to Cespedes arriving, the Mets’ offense was also sparked by the arrivals of Michael Conforto and David Wright (from the DL) and by hot hitting from Duda, Granderson, d’Arnaud, and Murphy, as well as significant contributions from their pitchers and other players. The Mets’ success can’t be attributed solely or even primarily to Cespedes.

On the other hand, the discussion of Cespedes does raise an interesting issue for MVP voters: how to treat his statistics from the AL before he was traded. I’ve read several arguments saying that the AL performance shouldn’t count, since this is the NL MVP award. I’m going to make the counterargument, however.

Suppose that Cespedes had been traded at the same time, but that his statistics were flipped with Harper’s, so that Cespedes had clearly been the best player in baseball over the full year. Don’t you suppose many of us would be saying that his overall performance deserved to be recognized, even though it meant counting the performance from both leagues? With interleague play and other changes, I think it’s time to stop thinking of the two leagues as completely independent entities and to start thinking of them as conferences within the same overall league. So I’m willing to count Cespedes’ performance while with the Tigers in his NL MVP consideration. But the downside is that any special merit he might get for playing for a playoff team for two months needs to be offset by down-weighting the four months he spent with a team that collapsed. (I don’t give much credit for playing for a winner anyway, but I know many voters do.)

Clutch hitting and the case for Anthony Rizzo (& Kris Bryant)

Another interesting candidate, Anthony Rizzo, has been promoted by several sabermetric writers. The argument is based on the idea that hitting in clutch or “high leverage” situations (late in close games, especially with runners on base) is more important than hitting in more typical situations. There are several ways of measuring “clutch” situations, and according to most of them, Rizzo has hit better than average in those situations while Harper has hit worse.

With a 1.272 OPS, Rizzo has been a fantastic hitter in clutch, or high-leverage, situations. Compare that figure with the statistics for Harper:

Bryce Harper

Situation         PA    BA     OBP    SLG    OPS
High leverage     126   .240   .405   .427   .832
Medium leverage   245   .379   .490   .717   1.207
Low leverage      275   .327   .462   .677   1.139

Now, if two players have the same overall statistics, I think it’s obvious that the one who hits better in high leverage situations will help his team win more games. Hitting a home run with a runner on base in the 8th inning of a tie game is clearly going to contribute to a win much more than hitting a solo home run in the 8th inning of a 14–3 blowout.

Rizzo leads the majors in win probability added (WPA), a statistic calculated from the change in the probability of winning the game for each event that occurs in it. I use WPA quite a lot, for example, in my month-in-review posts that tell the story of dramatic clutch hits, shutdowns, or meltdowns that occurred during the month. WPA is a great story-telling statistic because it highlights the dramatic moments when the game is on the line. But as a measure of value, WPA is fundamentally flawed. When a team wins a game 5 to 4 in extra innings, all 5 runs were equally needed for the victory, whereas WPA will give inordinate weight to the final “game winning” run. Yet, without each of the previous 4 runs, the team never would have made it to extra innings and the clutch situation. Thus, WPA can’t be used directly as a measure of value.
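To make the objection concrete, here is a minimal Python sketch of how WPA credit gets assigned. The win probabilities are made-up numbers for a hypothetical 5–4 extra-inning win, not taken from any real game or win-expectancy table, and the function name is just for illustration.

```python
# Hypothetical scoring plays for the winning team in a 5-4 extra-inning win.
# Each entry: (description, win probability before, win probability after).
# All probabilities are invented for illustration.
scoring_plays = [
    ("Run 1, 1st inning", 0.50, 0.60),
    ("Run 2, 3rd inning", 0.45, 0.56),
    ("Run 3, 5th inning", 0.40, 0.53),
    ("Run 4, 8th inning (ties game)", 0.25, 0.52),
    ("Run 5, 10th inning (walk-off)", 0.62, 1.00),
]


def wpa(before, after):
    """WPA for one event: the change in the team's win probability."""
    return after - before


credits = {desc: wpa(before, after) for desc, before, after in scoring_plays}
for desc, credit in credits.items():
    print(f"{desc}: WPA {credit:+.2f}")

# Each of the five runs was necessary for the win, yet the credit is uneven.
largest = max(credits, key=credits.get)
smallest = min(credits, key=credits.get)
print(f"Most credit: {largest}; least credit: {smallest}")
print(f"Ratio: {credits[largest] / credits[smallest]:.1f}x")
```

With these invented inputs the walk-off run gets several times the credit of the first-inning run, even though removing either one turns the win into something else.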

In a recent article in Grantland, Ben Lindbergh uses a variant of WPA called “championship win probability added” (cWPA) to argue for Rizzo as a potential MVP candidate. This statistic extends the idea of WPA to the sequencing of games: a game late in the season of a close pennant race is higher leverage, or more clutch, than a game earlier in the season. But as a measure of value it has the same flaw as WPA. If a team wins its division or wild card slot on the last day of the season, 90 wins to 89 for its opponent, all 90 wins count equally (and are thus of equal value) in winning the championship. The games won in April or May count just as much in the standings as a game won in late September or early October.

Dave Studeman, writing for Hardball Times, recognizes that WPA is flawed as a measure of value and proposes a different approach. He argues that a run in a one-run game is worth more than a run in a two-run game, and comes up with measures of the value of a run in one-run, two-run, three-run games, etc., which he calls “margin factors.” Now I concede Studeman’s point that a run is worth more in a close game, though I admit I don’t fully understand how he measures the margin factors. Using them, Studeman reweights each player’s run contributions according to whether the margin in the game was one run, two runs, etc. Rizzo came out on top in this calculation.

Now, assuming that the margin factors accurately measure the value of a run in each of these situations, I still have a problem with Studeman’s approach. When Harper is batting in the fourth inning, we don’t know whether the game will ultimately turn out to have a one-run or a five-run margin. Measuring the impact based on the closeness of the game after the fact is like measuring how well a player hits in games his team wins. It might be measuring something, but it is so confounded by what all of the other players in the game are doing that it’s not a clean measure at all. If I were using this approach, I’d recommend using Bayes’ theorem to calculate the value of each run based on the probability, at the time the player is batting, that the game will end with each of the various margins. It would be a lot of work to do those calculations, but I believe they could be done using the Markov approach that’s used by sabermetricians such as the authors of The Book.
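The weighting idea can be sketched in a few lines of Python. This is only an illustration of valuing a run by the probabilities known at the time of the plate appearance (strictly speaking, an expected value over the possible final margins). The margin factors and probabilities are invented placeholders, not Studeman’s values and not output from a real Markov model, and the function name is hypothetical.

```python
# Hypothetical margin factors: value of a run by the game's final margin.
# Placeholder numbers, NOT Studeman's published figures.
margin_factor = {1: 1.5, 2: 1.2, 3: 1.0, 4: 0.8, 5: 0.6}  # 5 means "5 or more"


def expected_run_value(margin_probs, factors=margin_factor):
    """Weight each margin factor by the probability, estimated at the time
    of the plate appearance, that the game ends with that margin."""
    total = sum(margin_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("margin probabilities must sum to 1")
    return sum(p * factors[m] for m, p in margin_probs.items())


# Invented probabilities of each final margin, as seen from two game states.
tie_game_4th_inning = {1: 0.30, 2: 0.22, 3: 0.17, 4: 0.12, 5: 0.19}
up_six_in_8th_inning = {1: 0.02, 2: 0.04, 3: 0.08, 4: 0.14, 5: 0.72}

print(f"Tie game, 4th inning: {expected_run_value(tie_game_4th_inning):.3f}")
print(f"Up six, 8th inning:   {expected_run_value(up_six_in_8th_inning):.3f}")
```

The point of the sketch is that the weight attached to a run depends only on what could be known when the batter stepped in, not on how the rest of the game happened to unfold. In a real calculation the probability tables would have to come from a model of game states rather than from guesses like these.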

A simpler approach, and the one I would take if I were undertaking the research, would be to get ahold of a statistical baseball game simulator and run 10,000 seasons with Rizzo’s, and then Harper’s, high-, medium-, and low-leverage statistics and see how the teams do. While I expect that the simulations will show that there’s a clear benefit to batting well in high-leverage situations, I think they’ll also show that Harper’s better statistics in medium- and low-leverage situations also have a clear impact. So while I give some weight to the “clutch” argument, I’m not yet persuaded that Rizzo should beat Harper for MVP.

By the way, it’s interesting that Rizzo’s teammate, Kris Bryant, has clutch statistics that are just as good as Rizzo’s. You may have noticed that Bryant is slightly ahead of Rizzo in FanGraphs WAR, thanks to playing a more demanding position well and to being a good baserunner. So taking account of the clutch argument actually boosted Bryant quite a bit on my pseudo-ballot.

My vote:

Obviously, I don’t have a vote for the MVP, but based on these considerations, this is how I would fill out a ballot if I had one. (Again, rankings might change slightly based on games scheduled for tonight or tomorrow.)