The burdens of being average

The black hole in your lineup: during the season, it’s the bane of your existence. Every day, the manager trots out the same garbage player because he has no better option (or refuses to ink a better player’s name into the lineup card). But at the end of the season, a black hole becomes a reason to be optimistic. Replacing the black hole with any competent hitter for the position represents an easy upgrade. Sure, it was bad while it lasted, but when constructing next year’s team, having a black hole makes identifying—if not fixing—your team’s weakness a snap.

Remember the 2005 Mets? Their first base rotation was one of the least productive in history. All told, Mets first basemen Doug Mientkiewicz, Mike Jacobs and Chris Woodward combined to hit .227/.303/.391, for a putrid .694 OPS, or 82 OPS+. That’s awful almost anywhere on the diamond, let alone at an offense-first position like first base. Accordingly, the Mets upgraded to Carlos Delgado that offseason and he plugged the hole nicely, posting a .909 OPS.

In this vein, we often hear pundits—mostly the smart ones—noting that if only Team X could get average production at a particular position, its offense would be so much better. And that is correct thinking—to a point.

Average production at a particular position is a tricky thing. Last year, for example, the average line for all center fielders was .272/.334/.422, a .757 OPS or a 97 OPS+. (Aside: For this little study, any player who spent at least half his games as a center fielder had all of his plate appearances counted toward this sample, so long as he accrued more than 100 plate appearances. So guys who played multiple positions, like Ryan Freel, had all of their hitting stats counted toward the position at which they played most often. Not a perfect system by any means.) But these figures are misleading, because they are tilted toward the guys who get more plate appearances.
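For the curious, the sample rule in that aside can be sketched in a few lines of Python. This is only an illustration: the `PlayerSeason` class, its field names and the four example players are made up here, and treating "half his games" as half of the games summed across positions is an assumption about how to handle guys who moved around.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    name: str            # hypothetical label, not a real player
    games_by_pos: dict   # position -> games played there
    pa: int              # total plate appearances, all positions


def counts_toward(player, pos, min_pa=100):
    """True if all of the player's PA should count toward `pos`."""
    total_games = sum(player.games_by_pos.values())
    if total_games == 0:
        return False
    # At least half of his games at this position...
    at_least_half = 2 * player.games_by_pos.get(pos, 0) >= total_games
    # ...and more than `min_pa` plate appearances overall.
    return at_least_half and player.pa > min_pa


sample = [
    PlayerSeason("Regular", {"CF": 140}, 620),
    PlayerSeason("Utility guy", {"CF": 45, "LF": 30, "2B": 10}, 280),
    PlayerSeason("Corner guy", {"CF": 20, "RF": 90}, 450),
    PlayerSeason("Cup of coffee", {"CF": 25}, 60),
]

center_fielders = [p.name for p in sample if counts_toward(p, "CF")]
print(center_fielders)
```

Note that the utility guy's left field and second base plate appearances all get lumped in with the center fielders, which is exactly the imperfection the aside admits to.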

Is that fair? Good players almost always get more plate appearances, with the odd Juan Pierre here or there. So when we compute positional averages for center fielders, we end up giving a lot of weight to top-flight players like Curtis Granderson, Grady Sizemore and Ichiro Suzuki. But half of baseball is just showing up, and there are a number of center fielders who just showed up. We shouldn’t shortchange guys like Jerry Owens and Jacque Jones just because they didn’t play every day. Not playing every day may be a sign of injury, but in many cases it is because a player is not very good, is platooned, is replaced midseason, etc. These are the guys who make up the major league population.

If you’re a visual person, this picture might help.

The gray bars show the number of players who fell into a particular OPS+ range; the blue dots connected by the red line show the average number of plate appearances guys in that category accumulated last year. Only a minority of center fielders had an OPS+ over 100, but because those guys were so valuable, they accumulated a lot of plate appearances. In stat speak, we might say that the distribution of playing time is not independent of performance. Well, duh: if you’ve got a center fielder who can put up a 120 OPS+, you’ll run him out there every day of the week and twice on Sundays!

This inflates the positional averages, so while all the center fielders averaged a 97 OPS+, the average center fielder did not put up a 97 OPS+. That’s a subtle distinction, but an important one because players themselves are discrete (if not always, ahem, discreet). If you’re looking for an average-hitting center fielder, what are you really looking for?
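If the distinction feels slippery, a toy example might help. The sketch below uses six invented (OPS+, plate appearances) pairs, not real players, and it weights OPS+ directly by plate appearances. That is a simplification: a true positional line pools everyone's raw hitting stats, but the weighting effect is the same.

```python
# Hypothetical (OPS+, plate appearances) pairs, invented for illustration.
players = [(125, 650), (110, 600), (90, 400), (85, 300), (80, 200), (75, 150)]

total_pa = sum(pa for _, pa in players)

# "All the players averaged": each player counts in proportion to his PA.
pa_weighted_mean = sum(ops * pa for ops, pa in players) / total_pa

# "The average player": each player counts once, whatever his playing time.
per_player_mean = sum(ops for ops, _ in players) / len(players)

print(round(pa_weighted_mean, 1), round(per_player_mean, 1))
```

With these made-up numbers the playing-time-weighted figure lands near 103 while the per-player figure is around 94, because the two everyday guys at the top soak up more than half the plate appearances.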

There are a few possibilities for getting around the distorting effect of playing time distribution, but the easiest is simply to find the median performance of a center fielder rather than the mean. To do this, we line up all the center fielders in our list and find the guy in the middle. For center fielders last year, that guy was Jim Edmonds, who had an 88 OPS+. That’s way less than the average of 97 we identified earlier! To put that in perspective, that nine-point difference in OPS+ is about 40 points of OPS, or almost three-quarters of a win over a full season. That’s a pretty big deal when you consider that, on the free agent market, one win costs about $5 million.
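Here is a rough sketch of "line them up and find the guy in the middle," followed by the back-of-the-envelope dollar math. The seven players are hypothetical placeholders; the three-quarters of a win and the $5 million per win are the rough figures quoted above, not anything computed here.

```python
# Hypothetical (name, OPS+) pairs; placeholders, not real players.
players = [("A", 125), ("B", 110), ("C", 95), ("D", 90),
           ("E", 84), ("F", 80), ("G", 72)]


def median_player(players):
    """Sort best to worst by OPS+ and return the player in the middle.

    With an even number of players this picks the weaker hitter
    of the two in the middle.
    """
    ranked = sorted(players, key=lambda p: p[1], reverse=True)
    return ranked[len(ranked) // 2]


print(median_player(players))

# Rough figures taken from the text above.
wins_gap = 0.75                # "almost three-quarters of a win"
dollars_per_win = 5_000_000    # "one win costs about $5 million"
gap_cost = wins_gap * dollars_per_win
print(gap_cost)
```

By that arithmetic, the gap between the mean and the median center fielder is worth somewhere under $4 million a year on the open market.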

For all positions except second base, the mean OPS+ (the average performance) is higher than the median OPS+ (the performance of the average). As we saw with the center field example, this indicates that there were premium talents at those positions who accumulated lots of plate appearances. Guys like Victor Martinez and Russell Martin behind the dish, Albert Pujols and Alex Rodriguez on the infield corners, Magglio Ordonez and Matt Holliday at the outfield corners, Hanley Ramirez and Carlos Beltran up the middle—these guys all wrecked the curve last year, gathering lots of plate appearances and performing spectacularly.

The lone exception is second base, where the median OPS+ is higher than the mean OPS+. Last year, second base was home to many solid but few spectacular hitters. Only Chase Utley hit like a superstar, but guys like Brandon Phillips, Mark Ellis and Placido Polanco all had good to very good years. A few awful hitters—Craig Biggio, Marcus Giles and Ray Durham—got a whole lot of plate appearances for one reason or another, dragging the average down. But if we look at the median, 2007 was a good hitting year for second basemen.

In fact, while the mean OPS+ for left fielders (the mashing, defensively challenged run producers) was much higher than for second basemen (the scrappy bantamweight glove men), the median OPS+ was the same! That’s right: last year’s typical second baseman hit as well as last year’s typical left fielder. You wouldn’t know that simply by looking at positional averages because there were awesome hitters like Adam Dunn, Barry Bonds and Matt Holliday at the top of the left field list. But there were also a good number of poor-hitting left fielders who didn’t get a lot of plate appearances: Endy Chavez, Rob Mackowiak, Darin Erstad. Keep in mind we’re only looking at last year’s data, and this is by no means a general conclusion.

So if your favorite team is looking to upgrade at the hot corner, don’t freak out if all it can find is a league-average hitter. A third baseman who hit for a 109 OPS+, last year’s mean for third basemen, would have ranked 15th among the 40 third basemen with 100 or more plate appearances. A more appropriate target is the median OPS+, which was 101. That’s the difference between, say, Kevin Kouzmanoff and Wilson Betemit.
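To see how a mean-level hitter can rank well above the middle of the pack, here is a small sketch. The eleven OPS+ values are invented for illustration (they are not last year's third basemen), and `rank_among` is just a helper name chosen for this example.

```python
from statistics import median

# Hypothetical OPS+ values for the players at one position.
ops_plus_values = [140, 130, 120, 112, 105, 101, 98, 95, 90, 85, 80]


def rank_among(value, values):
    """Rank a hitter with `value` would hold: 1 + number of players above him."""
    return 1 + sum(1 for v in values if v > value)


mean_level_hitter = 109   # assumed stand-in for a positional mean
typical_hitter = median(ops_plus_values)

print(rank_among(mean_level_hitter, ops_plus_values), "of", len(ops_plus_values))
print(typical_hitter, rank_among(typical_hitter, ops_plus_values))
```

In this toy list the mean-level hitter slots in ahead of the guy in the middle, which is the same pattern as the third base numbers above.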

In all things baseball, we are interested in finding a baseline, whether it be average, bench or replacement. The way performances are distributed should affect the way we choose those baselines, and I hope this was an illustration of that.

References & Resources

The Baseball-Reference Play Index was indispensable for this project. If you haven’t signed up for it yet, I am not sure what you are waiting for. Because I used B-R, the version of OPS+ presented here follows the formula 100*(OBP/lgOBP + SLG/lgSLG - 1). While the merits of OPS+ can be debated (it’s preferable to use something that measures performance in real terms, like runs or wins), it is both convenient and park-adjusted. An 80-20 solution: 80% of the solution at 20% of the effort.
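For anyone who wants to play along at home, the formula above translates directly into Python. This is a bare-bones sketch: the function and parameter names are my own, the league figures in the example are made up, and the park adjustment is assumed to be already baked into whatever league OBP and SLG you pass in.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """OPS+ per the formula above: 100 * (OBP/lgOBP + SLG/lgSLG - 1)."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical league line, for illustration only.
LG_OBP, LG_SLG = 0.330, 0.420

# A hitter who exactly matches the league comes out at 100.
print(round(ops_plus(0.330, 0.420, LG_OBP, LG_SLG)))

# A hitter 10% better than league in both OBP and SLG comes out at 120.
print(round(ops_plus(0.363, 0.462, LG_OBP, LG_SLG)))
```

Because the two ratios are added together, being 10% better than league in both components yields a 120, not a 110.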

Mike Emeigh, a frequent poster at the Baseball Think Factory, is going to think that I stole this idea from him, based on the conversation that starts in comment 81 in this thread. That’s a coincidence—although I figure that neither of us is the first person to talk about mean versus median production.