Tuesday, September 20, 2016

Victims' Ward

As Mark Sheldon reported, the Reds gave up three homeruns to the Cubs in a 5-2 loss Monday night, bringing their total to 242 homeruns allowed on the season and breaking the record held by the '96 Tigers. (The '96 Tigers faced the likes of Mark McGwire, Ken Griffey Jr., and Albert Belle, but, since that was the year before interleague play started, not a single pitcher.) The Reds have allowed more homeruns this season than the Baltimore Orioles have hit - 242 to 237. And there are still 12 games to go.

"The bane of our season," as manager Bryan Price said, has also been bases on balls. All told, Reds pitching has allowed an on-base percentage of .345, a slugging average of .457, and an OPS of .802. The MLB averages this year are .322/.418/.740.

If the Reds' pitching staff were a hitter, who would it be? What I mean is, what batter hits the most like the average Reds opponent?

The way I'm going to answer this is to look at binary components, but just the first three - the three true outcomes - since those are the ones the pitcher has total control over.

Here are the rates for Reds opponents this year, and the MLB average:

Split      $BB   $SO   $HR
Reds Opp   11%   22%    6%
2016 MLB    9%   23%    4%

Translation: the Reds walk (or hit) about 1 in 9 batters they face. Of the batters they don't walk (or hit), they strike out about 2 in 9. And when a batter connects, he hits a homerun 6% of the time.
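
As a rough sketch, the three rates could be computed like this in Python. Each rate uses only the batters left over after the previous outcome is removed. The function name and the counts are made up for illustration (they are not the Reds' actual totals), and treating "connects" as every plate appearance that isn't a walk, hit batter, or strikeout is one reading of the description above.

```python
def binary_components(pa, bb, hbp, so, hr):
    """Return ($BB, $SO, $HR) as sequential rates.

    Assumed reading of the description:
      $BB = (BB + HBP) / PA
      $SO = SO / (PA - BB - HBP)
      $HR = HR / (PA - BB - HBP - SO)
    """
    free_passes = bb + hbp
    bb_rate = free_passes / pa

    not_walked = pa - free_passes      # batters who were not walked or hit
    so_rate = so / not_walked

    contact = not_walked - so          # batters who put the bat on the ball
    hr_rate = hr / contact

    return bb_rate, so_rate, hr_rate


# Hypothetical counts, chosen only to land near the Reds' rounded rates.
bb_rate, so_rate, hr_rate = binary_components(pa=1000, bb=100, hbp=10, so=200, hr=40)
print(f"$BB {bb_rate:.0%}  $SO {so_rate:.0%}  $HR {hr_rate:.0%}")
```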

So compared to the league average, Reds pitchers walk more batters, strike out (slightly) fewer batters, and, obviously, give up a lot more long balls.

So what batter has produced the closest to these same rates?

To answer that, I made similarity scores, like the ones Bill James created, but based on the three binary components. I'll explain how I did it at the end, since it's boring.

If the Reds' pitching staff were a hitter, it would be a lot like Seth Smith. Said another way, Reds pitchers turn the average hitter into Seth Smith.

Smith, a left-handed hitter, has been platooned this year, so he barely made the 400-PA cutoff I used, and he wouldn't qualify for rate stats. But he's hit .262 with 16 HR, 60 RBI, and a .349 OBP.

Former MVP Andrew McCutchen, the second-most similar batter, is having a down year for him, hitting just .252 with 23 HR. But he's actually been a little worse than the average Reds opponent, with more strikeouts and fewer homeruns.

Now, the explainin':

I made a list of every batter with at least 400 PA this year, which is 189 players. I figured their $BB, $SO, and $HR rates, and the standard deviation of each rate across those batters.

I wanted to give each component the same weight, so for each one I took 1000, divided it by three, and divided that by four times the component's standard deviation (s). Or:

weight = 1000 / 3 / (s * 4)

Then I converted each component into a "penalty": the distance (the absolute value of the difference) between the batter's rate and the Reds pitchers' rate, multiplied by the component's weight. I subtracted all three penalties from 1000. Written out in equation form, it looks something like this:

Sim Score = 1000 - penaltyBB - penaltySO - penaltyHR

penaltyBB = absValue(redsBB - batterBB) * weightBB

where redsBB and batterBB are the $BB rates for Reds pitchers and for each batter respectively, and weightBB is from the first formula above. The SO and HR penalties are figured the same way from their own rates and weights.
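
For anyone who would rather read code than spreadsheet formulas, here is a minimal Python sketch of the same calculation. The Reds' rates are the rounded figures from the table above; the three batters are invented placeholders, not real players from the 189-man list. The function names are made up, and the choice of the sample standard deviation (statistics.stdev) is an assumption, since the post doesn't say which version was used.

```python
from statistics import stdev

COMPONENTS = ("bb", "so", "hr")


def component_weights(batters):
    """weight = 1000 / 3 / (s * 4), where s is the standard deviation
    of that component across all batters in the pool."""
    weights = {}
    for c in COMPONENTS:
        s = stdev([rates[c] for rates in batters.values()])  # assumption: sample std dev
        weights[c] = 1000 / 3 / (s * 4)
    return weights


def sim_score(target, batter, weights):
    """Sim Score = 1000 - penaltyBB - penaltySO - penaltyHR."""
    penalties = (abs(target[c] - batter[c]) * weights[c] for c in COMPONENTS)
    return 1000 - sum(penalties)


# Reds opponents' rates, rounded as in the table.
reds = {"bb": 0.11, "so": 0.22, "hr": 0.06}

# Hypothetical batters, for illustration only.
batters = {
    "Batter A": {"bb": 0.10, "so": 0.21, "hr": 0.05},
    "Batter B": {"bb": 0.06, "so": 0.30, "hr": 0.03},
    "Batter C": {"bb": 0.14, "so": 0.15, "hr": 0.07},
}

weights = component_weights(batters)
scores = {name: sim_score(reds, rates, weights) for name, rates in batters.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # most similar first

for name in ranked:
    print(f"{name}: {scores[name]:.0f}")
```

With this weighting, a batter who sits four standard deviations away from the Reds in one component loses a third of the 1000 points on that component alone, which is what keeps the three components on equal footing.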

I'm probably making this sound complicated, but it was simple to work out in Excel following Bill's methodology, especially since I reduced it to just three "penalties". It honestly took a lot longer for me to explain my method than to do the research in the first place, and even then I probably didn't explain it as clearly as Bill James does. He doesn't get enough credit for that.