Wins Above Replacement

I’ve gotten tired of having to explain some things every year. I’m finally writing it down and putting it in one place. Wins Above Replacement is a tool that was developed for baseball– but is now being used in every sport. Its goal is to do two things:

1. Measure a player’s total contribution in every statistical category (offensive and defensive). The contribution is measured in runs (because we know how to convert hits, walks, errors, putouts, etc. into runs). Runs are then converted into wins. To oversimplify, you take the total number of runs scored in a season and divide it by the total number of wins. This lets us compare players from the dead ball / spitball era (or 1963-68, when the strike zone was bigger and the mound was higher) to the lively ball of the 20’s and 30’s (or the steroids era). The number of runs per win varies from season to season, but it has usually been around 10 in my lifetime.

2. Compare his contribution to a replacement-level player. A “replacement level player” isn’t a real person. It’s the average production for the position over a full season (140 games, 30 starts, 60 relief appearances), reduced by 40%.
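For anyone who wants to see the arithmetic, here is a minimal sketch of those two steps in Python. It follows the oversimplified version described above, not the full formula any stats site actually publishes. The league totals and the player's run values are made-up numbers, picked only so that runs per win comes out to 10, and the function names are mine.

```python
# Hypothetical league totals (NOT real season data), chosen so the math is easy.
LEAGUE_RUNS = 24_300
LEAGUE_WINS = 2_430


def runs_per_win(league_runs, league_wins):
    """Oversimplified conversion rate: total runs divided by total wins."""
    return league_runs / league_wins


def replacement_runs(average_runs, discount=0.40):
    """Replacement level as described above: average production reduced by 40%."""
    return average_runs * (1 - discount)


def wins_above_replacement(player_runs, average_runs, rpw):
    """Runs above the replacement baseline, converted into wins."""
    return (player_runs - replacement_runs(average_runs)) / rpw


rpw = runs_per_win(LEAGUE_RUNS, LEAGUE_WINS)

# Hypothetical player: worth 90 runs at a position where average is 60.
war = wins_above_replacement(player_runs=90, average_runs=60, rpw=rpw)
print(f"runs per win: {rpw:.1f}, WAR: {war:.1f}")
```

Note that the same 90-run player would be worth more wins in a low-scoring season (fewer runs per win) and fewer in a high-scoring one, which is the point of converting to wins.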

“WAR” (as it is abbreviated) is expressed as a cumulative total– not a ratio. That’s deliberate, because it lets you measure the actual impact of the player. You can say “Francisco Lindor was 5.7 wins above replacement in 2016”, which means, if Lindor had joined a commune and Terry Francona had given his 158 games played, 684 plate appearances and 1,364 defensive innings to some backup infielder or a schmoe in AAA– the Indians would have won nearly 6 fewer games.

At this point, people always ask two questions.

“Why don’t you use the average?”

The answer, very simply, is “Because of situations like the Indians’ catchers in 2016.” Last year, the Indians played 1,445 defensive innings. They used the following players (innings refers to the number of innings they played at the position). Note the WAR in the right column.

Player          Age   Innings   BA     OPS    WAR
Yan Gomes       28    582.1     .167   .527   -0.8
Roberto Perez   27    451.2     .183   .579    0.5
Chris Gimenez   33    389.2     .216   .602   -0.6
Adam Moore      32     21.1     0-5    .000   -0.1

Everyone was stinky except Perez– and he didn’t play well. If you prorated his statistics to full time, he would have finished with 1.4 WAR– below the 2.0 minimum that you need to be productive.
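One note on reading the innings column: in baseball notation, the digit after the decimal point is a count of outs, not a tenth. So 582.1 means 582 and one-third innings. Here is a quick sketch (the helper name is mine) showing that the four catchers really do add up to the team’s 1,445 defensive innings:

```python
from fractions import Fraction


def innings_to_outs(notation: str) -> int:
    """Convert baseball innings notation ('582.1' = 582 innings + 1 out) to outs."""
    whole, _, extra_outs = notation.partition(".")
    return int(whole) * 3 + int(extra_outs or 0)


# Innings caught, copied from the table above.
catcher_innings = {
    "Yan Gomes": "582.1",
    "Roberto Perez": "451.2",
    "Chris Gimenez": "389.2",
    "Adam Moore": "21.1",
}

total_outs = sum(innings_to_outs(v) for v in catcher_innings.values())
total_innings = Fraction(total_outs, 3)
print(total_innings)  # 1445
```

Adding the numbers as ordinary decimals would give 1,444.6, which is why the conversion to outs matters.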

This is one of the rare occasions when the Indians weren’t playing bad players because they were stupid. This was literally all they had available. Moore was one of their two AAA catchers– the other was Guillermo Quiroz, who was 34, had played for seven different organizations and hit even worse than Moore did at Columbus.

Why not pull someone up from Akron? Look at the players they had. The most productive was Jeremy Lucas, who was 25 and hit .252 with a .775 OPS. The guy who played the most was Eric Haase. At 23, he maybe has a future– but he hit .208 with a .703 OPS. They brought 23-year-old Daniel Salters up from the Carolina League late in the AA season, but he was struggling, so he wasn’t an option.

Both Gomes and Perez were injured– they didn’t come back quickly. Cleveland tried to trade for a catcher; nobody wanted to offer something better at a reasonable price.

That inability to find a quality regular is the reason you can’t use “average” as your benchmark. Often a team simply doesn’t have an average player. Remember, an average player is someone in the middle of the pack– half the players are better than he is and half are worse.

A replacement player– someone 40% worse than average– is someone any team can find. He’s a minor-league free agent. He’s a “non-roster” player who gets invited to spring training. He’s a veteran AAA player that can be had in trade for that 23-year-old with the good fastball who walks 6 men per game.

By using replacement level as your comparison, you can ask questions like “Why are you playing someone who’s performing worse than a street free agent?” As the Indians often do.

“Why don’t you adjust for playing time?”

WAR uses a cumulative total– not a percentage or a per-game ratio. That means it gives Josh Tomlin (174 innings pitched in the regular season) as many WAR (1.6) as Andrew Miller (who pitched 29). And that is, contrary to what you might think, the correct measure.

Why? Because we’re trying to measure total contribution, not performance level.
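To make the distinction concrete, here is a small sketch using the Tomlin and Miller numbers above. The “WAR per 100 innings” rate is something I made up for illustration; it is not a published statistic.

```python
# (innings pitched, WAR) for the 2016 regular season, from the paragraph above.
pitchers = {
    "Josh Tomlin": (174, 1.6),
    "Andrew Miller": (29, 1.6),
}


def war_per_100_innings(war, innings):
    """Illustrative rate stat: performance level, independent of workload."""
    return war / innings * 100


for name, (innings, war) in pitchers.items():
    rate = war_per_100_innings(war, innings)
    print(f"{name}: {war} WAR total, {rate:.2f} WAR per 100 innings")
```

The rate says Miller pitched at roughly six times Tomlin’s level. The total says the two of them added the same number of wins to the team, and the total is what WAR is built to report.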

1. Being healthy and available to play matters. A team needs to put 9 men on the field at all times.

Go back and look at the chart of the 2016 Indians catchers. Roberto Perez (0.5 WAR) played at a higher level than Yan Gomes (-0.8) or Chris Gimenez (-0.6). But, due to injury, he wasn’t available until July.

Perez played 70 games in 2016– nine on rehab assignment and 61 in Cleveland. Pro-rating his performance to a full season (1.4 WAR) would create the false impression that he contributed more than he actually did.

The obvious example is Dave Justice. In 1996, his OPS (.923) was a lot higher than it was in 1992 (.805). But in 1996, he played only 60 games. In 1992, he played 144.

If you were Manager Bobby Cox or GM John Schuerholz, which season would you have considered “better”?

2. Using a ratio or percentage overvalues players who aren’t that good. A player can look a lot better than he really is if the coach or manager is picking his spots for him.

Here’s another example from 2016. WAR considered the following players more or less equal in 2016. If you look at the OPS, that seems ridiculous. If you look at the playing time, it does not:

Player          Games   PA    OPS    WAR
Mike Napoli     150     645   .800   1.0
Brandon Guyer    38      96   .907   1.1
Tyler Naquin    116     365   .886   0.9

Guyer and Naquin produced at a much higher level– but they were being platooned. Mike Napoli wasn’t. He had to hit against everyone– even pitchers he struggles with.

Brandon Guyer has an .861 career OPS against lefties– but it falls to .644 against righties. If he’d played 150 games, his OPS would probably have been 150 points lower.
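Here is a rough sketch of what happens if you turn the table above into a rate stat. It scales each player’s WAR to 600 plate appearances; the 600 figure is my assumption for a full-time workload, not a number from any official formula.

```python
# (games, plate appearances, WAR) copied from the table above.
hitters = {
    "Mike Napoli": (150, 645, 1.0),
    "Brandon Guyer": (38, 96, 1.1),
    "Tyler Naquin": (116, 365, 0.9),
}

FULL_SEASON_PA = 600  # assumed full-time workload, for illustration only


def war_per_full_season(war, plate_appearances, full_season=FULL_SEASON_PA):
    """Naive proration: assumes the player keeps the same pace with more playing time."""
    return war / plate_appearances * full_season


by_total = sorted(hitters, key=lambda n: hitters[n][2], reverse=True)
by_rate = sorted(
    hitters,
    key=lambda n: war_per_full_season(hitters[n][2], hitters[n][1]),
    reverse=True,
)

for name in by_rate:
    _, pa, war = hitters[name]
    print(f"{name}: {war} WAR in {pa} PA, {war_per_full_season(war, pa):.1f} per {FULL_SEASON_PA} PA")
```

Prorated this way, Guyer comes out near 7 WAR per 600 plate appearances, which is MVP territory. Nobody believes that, because the proration assumes he would hit righties the way he hits lefties.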

Fatigue also played a big part in Napoli’s results. He was 34 last year; 150 games played beat his previous high (set in 2010) by 10 games. Napoli had an .838 OPS on August 31st, but he hit the wall in September (.140 batting average; .612 OPS).

The Indians tried to rest Napoli– they used him at DH in 51 games and played Carlos Santana at first. The problem is that Napoli can’t sit on the bench and go up cold; he hit .261 with an .828 OPS playing first, but only .189 with a .722 OPS at DH. He doesn’t like to do it, which is why most managers don’t ask him.

They could have rested him more against righties (he hit .229 against them). But his OPS against righties (.792) was close to his production against lefties (.817). Anyway, who else were they going to play? They only used five first basemen: Napoli, Santana, Lonnie Chisenhall, Jesus Aguilar and Chris Gimenez.

If the Indians had used Santana at first and Gimenez at DH (they really did– thankfully only for one game), Napoli could have stayed fresh and boosted his WAR numbers. Raise your hand if you believe the team would have benefited from that.

3 thoughts on “Wins Above Replacement”

Also, I once again think better roster selection (and more flexibility with the existing roster) might have fixed this problem. If Gio had been given a shot at the big club as the season wore on, he could have had time at third, JRam at 2nd and Kipnis at DH.

I never like to rely too much on stats I can’t calculate myself. Also, WAR relies on models and consensus decisions made by other people. It’s hard to assess the flaws– it requires so much data manipulation to look under the hood and kick the tires.

On the other hand, it’s being used pretty much everywhere. It is calculated by people who really try to be fair and complete. It measures things of value. And it’s orders of magnitude better than Paul Hoynes being able to say that Felix Fermin (3.0 WAR in 652 games with the Indians) was one of the unrecognized stars, or Hal Lebovitz writing that Rick Manning saved the Indians 100 runs a year with his defense.

I use the NFL’s Quarterback Rating formula– and that’s got a lot of components that I violently disagree with. I grit my teeth and use this too.