Our last two posts focused on explaining some essential offensive and pitching statistics to help us evaluate the 2010 Indians. Rather than introduce another stat this week, I thought we’d take a little break to discuss batting orders, their importance to a team’s ability to score runs, and how Manny Acta has begun to think about setting up a lineup for the 2010 season given the recent roster decisions.

There’s a lot of conventional wisdom about batting orders out there. We’ve heard it all before. Leadoff men should be speedsters. Wait, no! Leadoff men should get on base! Batters in the #2 hole should be good at moving runners over, but shouldn’t worry too much about getting on base themselves. Sluggers should always bat cleanup. The best all-around offensive player on the team should bat in the three spot. Pitchers should bat last (in NL parks). But why? Can’t we explain any of these notions? Are they correct, or just accepted because some guy who chews tobacco said so?

Let’s look at some numbers to try to find the answers. (The subsequent charts are quoted from The Book, and have been compiled using actual game data.)

The chart below presents how many times, on average, each spot in a batting order actually gets to bat in a game:

| Order | PA/G | PA/Season |
|---|---|---|
| 1 | 4.83 | 782 |
| 2 | 4.72 | 765 |
| 3 | 4.61 | 747 |
| 4 | 4.49 | 727 |
| 5 | 4.39 | 711 |
| 6 | 4.26 | 690 |
| 7 | 4.14 | 671 |
| 8 | 4.02 | 651 |
| 9 | 3.90 | 632 |

So if we’re concerned with getting our best hitter to bat as often as possible, we’d put him in the leadoff spot, and then fill each successive spot by the best remaining hitter, right? This way, our best hitter would have 150 more PAs than our worst hitter over the course of a season. Done!!
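If you want to check the arithmetic behind that 150 figure, here is a minimal sketch. It assumes the PA/Season column is simply PA/G multiplied by a 162-game season and rounded, which matches the chart but is my reading of it rather than anything stated in The Book. The names `PA_PER_GAME` and `season_pa` are made up for this example.

```python
# PA per game by lineup slot, copied from the chart above.
PA_PER_GAME = {1: 4.83, 2: 4.72, 3: 4.61, 4: 4.49, 5: 4.39,
               6: 4.26, 7: 4.14, 8: 4.02, 9: 3.90}

GAMES_PER_SEASON = 162  # assumption: a full regular season


def season_pa(slot: int) -> int:
    """Season plate appearances for a slot, rounded to a whole PA."""
    return round(PA_PER_GAME[slot] * GAMES_PER_SEASON)


# Gap between the leadoff spot and the ninth spot over a season.
pa_gap = season_pa(1) - season_pa(9)

for slot in PA_PER_GAME:
    print(slot, season_pa(slot))
print("Leadoff minus ninth:", pa_gap)
```

Each step down the order costs roughly 15 to 20 plate appearances a season, which is why the gap from top to bottom adds up so quickly.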

Not quite, of course. We know that baseball games are more complex than this. They require some planning toward “big innings”—chains of positive offensive events that lead to runs being scored. These events happen more often in particular contexts, and those contexts cannot be ignored. In addition to number of PAs, here are the two major contexts to consider, with accompanying charts:

1. How many men will be on base when a spot comes to bat?

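| Order | PA Bases Empty | PA Men On | # of Runners On |
|---|---|---|---|
| 1 | 3.11 | 1.72 | 2.39 |
| 2 | 2.63 | 2.09 | 2.77 |
| 3 | 2.38 | 2.23 | 3.00 |
| 4 | 2.19 | 2.30 | 3.20 |
| 5 | 2.28 | 2.11 | 3.10 |
| 6 | 2.29 | 1.97 | 2.84 |
| 7 | 2.20 | 1.94 | 2.74 |
| 8 | 2.17 | 1.85 | 2.61 |
| 9 | 2.13 | 1.77 | 2.48 |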

As the chart shows, the leadoff man will have more PAs with no runners on than any other spot in the lineup (because he’s the only one guaranteed to lead off an inning in every game: the first), so his ability to do major damage will be limited. He also bats with the fewest runners on base over an entire game. And look at the cleanup spot: he bats with the most runners on base during a game. Maybe we need to rethink our first theory about our best hitter leading off.
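One way to put the same point in a single number is the share of each slot's plate appearances that come with at least one runner on. This is a rough sketch using only the two PA columns from the chart; the "share" framing and the names `PA_SPLIT` and `men_on_share` are mine, not The Book's.

```python
# (PA with bases empty, PA with men on) per game, from the chart above.
PA_SPLIT = {
    1: (3.11, 1.72), 2: (2.63, 2.09), 3: (2.38, 2.23),
    4: (2.19, 2.30), 5: (2.28, 2.11), 6: (2.29, 1.97),
    7: (2.20, 1.94), 8: (2.17, 1.85), 9: (2.13, 1.77),
}


def men_on_share(slot: int) -> float:
    """Fraction of a slot's plate appearances that come with runners on."""
    empty, men_on = PA_SPLIT[slot]
    return men_on / (empty + men_on)


shares = {slot: men_on_share(slot) for slot in PA_SPLIT}
most = max(shares, key=shares.get)    # slot that hits with men on most often
least = min(shares, key=shares.get)   # slot that hits with men on least often

for slot, share in shares.items():
    print(f"#{slot}: {share:.1%} of PAs with men on")
print("Most often:", most, "Least often:", least)
```

The two PA columns in each row add up to that slot's PA/G from the first chart, so the share is just "PA Men On" divided by total PAs.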

2. How many outs will there likely be when a spot comes to bat?

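| Order | 0 Outs | 1 Out | 2 Outs |
|---|---|---|---|
| 1 | 48% | 26% | 26% |
| 2 | 33% | 41% | 26% |
| 3 | 28% | 35% | 37% |
| 4 | 34% | 31% | 35% |
| 5 | 35% | 33% | 33% |
| 6 | 33% | 34% | 33% |
| 7 | 33% | 33% | 34% |
| 8 | 34% | 33% | 33% |
| 9 | 34% | 33% | 33% |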

You’ll notice that the 6-through-9 slots all bat pretty evenly across the various “out-states”: each about a third of the time (which makes sense). But the 1-through-4 slots have more variation. I’ll let you figure out why this is the case, but it should be pretty obvious.

Let’s jump forward. I’ll spare you some of the ugly results of trying to combine these three charts, and skip right to the money shot. What we want to know is how many runs, on average, are contributed by each spot in the batting order, by each event that can happen in an at bat. This chart presents the run values (more on that in a second) for each spot in the batting order for the various outcomes:

| Order | 1B | 2B | 3B | HR | NIBB | K | Out (non-K) |
|---|---|---|---|---|---|---|---|
| 1 | 0.515 | 0.806 | 1.121 | 1.421 | 0.385 | -0.329 | -0.328 |
| 2 | 0.515 | 0.799 | 1.100 | 1.450 | 0.366 | -0.322 | -0.324 |
| 3 | 0.493 | 0.779 | 1.064 | 1.453 | 0.335 | -0.317 | -0.315 |
| 4 | 0.517 | 0.822 | 1.117 | 1.472 | 0.345 | -0.332 | -0.327 |
| 5 | 0.513 | 0.809 | 1.106 | 1.438 | 0.348 | -0.324 | -0.323 |
| 6 | 0.482 | 0.763 | 1.050 | 1.376 | 0.336 | -0.306 | -0.306 |
| 7 | 0.464 | 0.738 | 1.014 | 1.336 | 0.323 | -0.296 | -0.296 |
| 8 | 0.451 | 0.714 | 0.980 | 1.293 | 0.312 | -0.287 | -0.286 |
| 9 | 0.436 | 0.689 | 0.948 | 1.249 | 0.302 | -0.278 | -0.277 |

Again, this is a lot to digest, but let me take you through some of it. That 0.515 run value under “1B” means that when a leadoff hitter gets a single, his team—ON AVERAGE—is going to score about a half a run (.515 runs) more that inning than they would have before he hit the single. When the same batter strikes out (second to last column), he costs his team .329 runs compared to what they would have scored before the strikeout. Since the average team scores a bit under 5 runs a game, when an inning starts, they’re expected to score .555 runs before the inning ends, so strikeouts lower that, while singles raise it. This may be tough to wrap your head around at first, but this chart tells us everything we need to know about constructing batting orders. (I was told there would be no math.)
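For anyone who wants to see the bookkeeping, here is a small sketch of how these run values add up. It assumes a nine-inning game (which is how .555 runs per inning turns into "a bit under 5 runs a game") and that run values can simply be summed across plate appearances. The sample game line is invented purely for illustration, and the names are hypothetical.

```python
RUNS_PER_INNING = 0.555  # expected runs at the start of an inning, from the text
INNINGS = 9              # assumption: a regulation nine-inning game

# Leadoff-spot run values, copied from row 1 of the chart above.
LEADOFF_RUN_VALUES = {
    "1B": 0.515, "2B": 0.806, "3B": 1.121, "HR": 1.421,
    "NIBB": 0.385, "K": -0.329, "OUT": -0.328,
}


def line_run_value(events):
    """Sum the average run value of each event in a leadoff hitter's game."""
    return sum(LEADOFF_RUN_VALUES[event] for event in events)


runs_per_game = RUNS_PER_INNING * INNINGS

# Hypothetical game: a single, a walk, a strikeout, and two other outs.
sample_game = ["1B", "NIBB", "K", "OUT", "OUT"]
sample_value = line_run_value(sample_game)

print(f"Expected runs per game: {runs_per_game:.3f}")
print(f"Sample leadoff line, runs vs. average: {sample_value:+.3f}")
```

Notice that a 1-for-4 day with a walk still comes out slightly below average here, because the three outs cost more than the single and the walk bring in.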

Let’s start by asking some questions. When is a HR worth the most? Looks like the number 4 spot in the lineup (1.472 Run Value—higher than any other spot). Conventional wisdom wins! HR hitters should bat cleanup. When is a walk worth the most? The leadoff spot—by a lot! Again, this corresponds well with our notions of high on-base percentage guys leading off. The #6 through #9 slots appear to become less effective across all events as the batting order progresses, so we should probably just put the four worst hitters in descending order in those spots. We’re doing great so far: all of this makes sense with what we already believe about batting orders.

Enough with the conventional wisdom; it’s time to blow your mind a bit. Compare the 2nd hole in the lineup to the 3rd spot. Which one has a higher value to its team? Where should we put the better hitter to maximize run scoring? That’s right. The second spot in the lineup can contribute more runs with the same hitter than the third spot can. So you should put a better hitter second in the lineup rather than third. Why would this be? Well, believe it or not, the extra plate appearances over the course of a season for the #2 hitter end up adding more value than the fact that more people are on base when the #3 hitter bats.
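Here is a rough sketch of that comparison. It takes one made-up hitter, gives him the same per-PA outcome rates in both spots, and multiplies by each spot's season PAs. The outcome rates in `RATES` are hypothetical (they are not any real player's line), and treating a season as per-PA run value times PA count is a simplification of what The Book does.

```python
EVENTS = ("1B", "2B", "3B", "HR", "NIBB", "K", "OUT")

# Run values for the #2 and #3 spots, copied from the chart above.
RUN_VALUES = {
    2: (0.515, 0.799, 1.100, 1.450, 0.366, -0.322, -0.324),
    3: (0.493, 0.779, 1.064, 1.453, 0.335, -0.317, -0.315),
}

# Season plate appearances for each spot, from the first chart.
SEASON_PA = {2: 765, 3: 747}

# Hypothetical good hitter: share of PAs ending in each event (sums to 1).
RATES = (0.160, 0.050, 0.005, 0.040, 0.100, 0.170, 0.475)


def runs_per_pa(slot: int) -> float:
    """Average runs above average per plate appearance in this slot."""
    return sum(rate * value for rate, value in zip(RATES, RUN_VALUES[slot]))


def season_runs(slot: int) -> float:
    """Runs above average over a season's worth of PAs in this slot."""
    return runs_per_pa(slot) * SEASON_PA[slot]


for slot in (2, 3):
    print(f"#{slot}: {runs_per_pa(slot):+.4f} per PA, {season_runs(slot):+.1f} per season")
```

With this particular stat line the second spot comes out ahead. Plug in your own rates to see how much the gap moves for different kinds of hitters.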

I’ll let you look over the chart for yourself to verify, but I think we’re ready to make a fairly unconventional conclusion: the most valuable spots in the lineup are the first, second, and fourth spots, followed by the three and five spots. Take a look. Seriously weird stuff.

So how does all this affect the Indians in 2010, and Manny Acta’s insistence that he’d like to have a consistent lineup everyday—especially with the recent news that Grady is scheduled to bat second this season? Rather than accepting my logic above, let’s run some simulations.

Last week, Scott sent me a link that attempts to optimize batting orders. Try it out for yourself, but basically, you enter nine players’ OBPs and slugging percentages, and it calculates the best and worst lineups by runs scored.

Here are the ten guys I believe will be vying for spots in an everyday lineup. I’ve used CHONE’s projections for their OBP and slugging, but feel free to adjust these according to your own projections. Either way, I don’t think the table below is unrealistic:

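| Player | OBP | SLG |
|---|---|---|
| Sizemore | 0.370 | 0.484 |
| Cabrera | 0.365 | 0.432 |
| Choo | 0.372 | 0.460 |
| Hafner | 0.351 | 0.446 |
| Branyan | 0.329 | 0.473 |
| Peralta | 0.328 | 0.408 |
| LaPorta | 0.337 | 0.457 |
| Marson | 0.344 | 0.352 |
| Valbuena | 0.328 | 0.400 |
| Brantley | 0.349 | 0.363 |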

Notice that CHONE projects Choo and Sizemore to be the two best players on the team (no surprise there), with Branyan, Hafner, and Cabrera the next three in some order.
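A quick way to sanity-check that ranking is to sort the table by OPS (OBP plus slugging). To be clear, OPS is a shortcut I'm assuming here for illustration; it isn't what the lineup tool uses, and it undervalues OBP a bit. The names below are made up for the example.

```python
# CHONE projections from the table above: player -> (OBP, SLG).
PROJECTIONS = {
    "Sizemore": (0.370, 0.484), "Cabrera": (0.365, 0.432),
    "Choo": (0.372, 0.460), "Hafner": (0.351, 0.446),
    "Branyan": (0.329, 0.473), "Peralta": (0.328, 0.408),
    "LaPorta": (0.337, 0.457), "Marson": (0.344, 0.352),
    "Valbuena": (0.328, 0.400), "Brantley": (0.349, 0.363),
}


def ops(player: str) -> float:
    """On-base plus slugging: a crude one-number summary of a hitter."""
    obp, slg = PROJECTIONS[player]
    return obp + slg


# Best hitter first.
ranked = sorted(PROJECTIONS, key=ops, reverse=True)

for player in ranked:
    print(f"{player:<9} {ops(player):.3f}")
```

By this measure Cabrera and Hafner are essentially tied, with LaPorta just behind them, so "in some order" is doing real work in that sentence.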

Ready to be surprised? Let’s leave Brantley out for the first simulation, since all signs point to Branyan effectively taking his spot in the everyday lineup. Here are the best and worst batting orders determined by run production:

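| Spot | Best | Worst |
|---|---|---|
| 1 | Choo | LaPorta |
| 2 | Sizemore | Valbuena |
| 3 | Cabrera | Hafner |
| 4 | Branyan | Marson |
| 5 | Hafner | Peralta |
| 6 | LaPorta | Cabrera |
| 7 | Peralta | Choo |
| 8 | Valbuena | Sizemore |
| 9 | Marson | Branyan |
| RPG | 5.257 | 4.96 |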

Quick. Who are our three best hitters? I bet you’d say Choo (highest OBP), Sizemore (high OBP, higher slugging), and Branyan (he slugs close to .500, remember). Look at where those three fall in the most effective lineup! First, second, and fourth! Exactly what we’d expect from the work we did above! And we fill out the top five with Hafner and Cabrera. Since Hafner has more pop than Cabrera, he should bat fifth, which leaves third for Cabrera! Math works!

Now look at the worst lineup. What I find most interesting is that even with the worst possible lineup, we’re only costing ourselves about 0.30 runs per game compared to the best possible lineup. Yes, that’s a cost of roughly 50 runs over the course of a season, or about 5 wins, but we’re talking about THE WORST LINEUP against the BEST. No manager would bat Sizemore eighth and Marson fourth, but even if you did, you’re talking about fractions of a run per game! This is why a lot of the talk over Wedge’s batting orders was misplaced: it wasn’t that he was putting guys in the wrong order so much as he was putting the wrong guys in the lineup (I’m looking at you, David Dellucci).

Now let’s look at the lineup with Brantley in there rather than Branyan, just to check if we really should be starting the Love Muscle over the Rookie. According to CHONE’s projections, Brantley gets the nod in OBP by .020, but he gives up .110 in slugging. I find this fairly believable. Here are the best and worst lineups with Brantley:

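| Spot | Best | Worst |
|---|---|---|
| 1 | Cabrera | Peralta |
| 2 | Sizemore | Marson |
| 3 | Brantley | Hafner |
| 4 | Hafner | Brantley |
| 5 | Choo | Valbuena |
| 6 | LaPorta | Cabrera |
| 7 | Peralta | Choo |
| 8 | Valbuena | Sizemore |
| 9 | Marson | LaPorta |
| RPG | 5.171 | 4.89 |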

Best case scenario with Brantley in the game? 5.171 runs per game. Best with Branyan? 5.257 runs per game. That comes out to 14 more runs over the course of a season with Branyan over Brantley, or an extra win or two. An added bonus? Playing Branyan for a year keeps Brantley’s arbitration clock from ticking! This way, we get an extra year of club-control for Brantley, while only paying Branyan $2 million.
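The season totals are easy to reproduce from the runs-per-game figures in the two simulation tables. This sketch assumes a 162-game season and the common rule of thumb of about 10 runs per win; that conversion is my assumption for the example, though it lines up with treating 50 runs as about 5 wins. The dictionary keys are labels I made up.

```python
GAMES = 162        # assumption: a full regular season
RUNS_PER_WIN = 10  # assumption: rough rule of thumb, not a projection

# Runs per game from the two simulation tables above.
RPG = {
    "best_with_branyan": 5.257,
    "worst_with_branyan": 4.96,
    "best_with_brantley": 5.171,
}


def season_gap(better: str, worse: str) -> float:
    """Extra runs over a season from one lineup compared to another."""
    return (RPG[better] - RPG[worse]) * GAMES


# Cost of the worst batting order versus the best one (same nine players).
order_gap = season_gap("best_with_branyan", "worst_with_branyan")
# Gain from starting Branyan instead of Brantley (best order for each).
roster_gap = season_gap("best_with_branyan", "best_with_brantley")

for label, gap in (("Best vs. worst order", order_gap),
                   ("Branyan vs. Brantley", roster_gap)):
    print(f"{label}: {gap:.1f} runs, about {gap / RUNS_PER_WIN:.1f} wins")
```

The point to take away is the scale: both the batting-order effect and the roster effect are measured in a handful of wins at most.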

Now, if I were the GM, would I have signed Branyan? Probably not. After all, what’s an extra win or two worth on a team that is expected to finish below .500? Furthermore, I’m all for young players getting opportunities. But it’s not my $2 million, and for the Indians, an extra year of cheap Brantley in 2016 (due to a delayed arbitration clock) might be worth a $2 million investment in Branyan in 2010. So while I know it wasn’t a popular move to effectively send Brantley back down for more seasoning, once the team signed Branyan, there was no reason to let Brantley start the season with the big league club.

As always, feel free to ask questions, and I’ll do my best to point you toward an answer.


Great analysis here. My problem with giving Branyan ABs over ANY of our young guys is that his production last year is way over his norm. At age 33 he hits 31 homers. His previous season high was 24, and that was 7 years ago. I expect Branyan to struggle, especially being forced to be the big run producer in the line-up.

http://waitingfornextyear.com Scott

My question – how much should handedness factor in? Wedge was HUGE on the L-R-L-R potential. As of now, having Cabrera lead off puts Sizemore, Choo, Hafner and Branyan in a row.

After all that house cleaning last season, how on earth is Travis Hafner still on this roster?

Jon Steiner

I hear you Rick. I think the problem with the CHONE projections isn’t the rate stats–those OBP and slugging look right to me (his career slugging is .491 and OBP is .331). But Branyan has played more than 120 games in a season exactly once–last year. Now some of that history has to do with platooning (which Acta says he won’t do), but a lot of that was injuries. I’m guessing his health will determine Brantley’s arbitration clock more than the Indians Front Office.

http://www.60bpm.com/ Robbie

Ow… my head hurts, but it’s a good kinda hurt.

BisonDeleSightings

Great stuff. One question about the run values chart, though: are those driven from actual game data? If so, I wonder to what extent those values are skewed by the fact that most teams follow similar patterns for their lineups. For example, the cleanup hitter generates most runs per HR because the #1-3 hitters tend to be the teams’ best on-base guys.

Jon Steiner

@ Scott: I would guess that handedness was probably discussed a good deal before deciding to move Grady to the #2 hole; it’s really the only reason to even think about keeping him in the leadoff spot, in my opinion. But since ~70% of SPs are right-handed, we’ll have a nice advantage for the first six innings of most games. Nice enough to outweigh those situational lefties in the seventh and eighth? I’m not sure, but the Indians are counting on it.

@ Kensha: I’m sure the Indians would trade Hafner for a bag of balls right now to get out from under that contract, but there’s not a team in baseball who’d take him, considering what he’s owed ($40 million).

@ Bison: you are a very astute reader–each week you point something out that I purposely left out so as not to confuse people. Basically, the answer to your question is yes, those numbers are derived from the way teams currently set up their lineups. The effects of this construction aren’t entirely clear, but you’re correct to suggest that HRs are very valuable in the cleanup spot in part because of who typically bats before. I think some more research might be required to say for sure how big this issue may be, but people smarter than me seem to think the effect is fairly minimal.

http://www.msblsim.com boogeyman

Brantley should be playing as much as possible…Brantley/Sizemore/Choo in the OF and Brantley, Cabrera and Sizemore 1-3 in the lineup. The problem is way too many lefties who strike out too much, but we all knew that already. Hopefully Santana makes it up fast or maybe blows them away this spring and if so he’s your #4 and you can put Choo at #5.

MP34

Indians, “Logic on Batting Orders” ? Surely you jest.

All kidding aside, nice piece Scott.

Karsten

The logic it is too powerful! I love the insight here, I’d love to see how this lineup works for realsies.

And even though I’m already a huge Brantley fan I’m happy to see him sit out for now while Russell gets his swings this year. It’s not like Brantley won’t be getting better, he’ll only continue to improve in AAA, and heck if he’s lighting it up and anything goes haywire with our OF, he’ll be up real quick.

http://www.msblsim.com boogeyman

Anyone wanna bet these stats don’t pan out?

historycat

I’ll be honest this is the first article on the Indians I’ve tried this year.

I got through a couple paragraphs and gave up. That is not at all a commentary on you Scott.

I cannot get excited for this team. Last year was the first year that I did not attend at least one game. I so very much want to love the Indians. I was with them through the 80’s, I remember Felix “the cat” Fermin. I just cannot like this team right now. I cannot go through another rebuilding year right now.

Please tell me something to look forward to for the Tribe.

mark

the weirdest piece of data in this article to me: if you look at the chart, the #2 batter is actually (ever so slightly) better off striking out than making a non-K out. odd.

5haun

@14 Maybe that’s why they moved Grady to second in the order instead of third…

Jon Steiner

@ Mark: My guess would be the danger of grounding into a double play comes into that run value. You’re right though, it’s counterintuitive.

http://waitingfornextyear.com Scott

MP34 and historycat – I can take no credit for this work. These SABR pieces are guest posts by Jon Steiner. Please check out the wOBA piece for his introduction.

I’m glad to see the warm – and continually growing – response.

http://waitingfornextyear.com Scott

“Please tell me something to look forward to for the Tribe.”

And this is another reason why we’re rolling out these pieces. I refuse to be fair weather, and we can only do so much complaining. We may as well use this season to learn some new things!

fjs

If the numbers were built by lineups that were set using conventional wisdom, wouldn’t they naturally track to more or less support the conventional wisdom?

Jon Steiner

@fjs: Not necessarily. The conventional wisdom says that the third hitter should be better than the second (right?), but this modelling suggests otherwise–that the second batter should be one of your 3 best hitters, and the third hitter shouldn’t be. That conclusion is based on actual game data, and then verified by the runs-per-game model. I’m pretty sure that isn’t what most people think of as “conventional wisdom.”

You and Bison are correct to wonder how much the sampling bias is affecting the model, but it doesn’t necessarily follow that the sampling bias makes the status quo more attractive. The opposite could just as easily be true.

Eric

Cool analysis, but it would be interesting to see how protection for a hitter affects stats as well. For instance, does Cabrera get more pitches to hit with Sizemore behind him than if the two were flipped?

fjs

@history

For me, the reason for this season, besides the fact that it’s baseball, is to see if there’s something to look forward to next year. It’s to watch guys develop and try to have hope for the future. Plus, in the AL Central, all it takes is a little luck in guys playing above themselves to win it. Get a few guys ready for the bigs, get them some experience, and 2011 could be something.

Matt

Grady is a terrible choice for the second hole because he strikes out way too much. Productive outs put more pressure on the pitchers/defense and usually separate great teams from bad ones. Grady is best suited to be a 5 hitter until he decides to stop yanking at everything and hit the ball where it is pitched. If he could cut his strikeouts down by 50% then he could settle into the three hole. Where would he bat in the Indians’ heyday with Lofton, Vizquel, Alomar, Belle, etc.? Probably the bench.

Well thought out and it does make sense statistically…as far as it goes but you forgot some situations and stats.

Speed at the top of the lineup and the player’s ability to get on base affect the batters that follow.
Grady Sizemore walking in his first AB will help the batting averages of the following three batters.
The attention of the pitcher to 1st base and the holding of the runner will add points to the next batters.

Secondly, double plays are not taken into account. The speed and base running ability of the top of the lineup will decrease DPs and increase infield errors and throwing errors by outfielders.

Station-to-station base running is a sure formula for low run production but base running is not considered in your stats.

So maybe after adding the above the same results may be had, but first we must consider all the stats.
