Monday, February 02, 2009

BABIP III: High Strikeout Hitters

I don't want to spend the rest of the preseason writing about this, but I am intrigued by one thing that Brett said:

As for making contact (eg. Swisher), intuitively I agree with the authors that contact rate and BABIP should be inversely proportional.

They wrote: "One might expect a higher contact rate to lead to a higher BABIP, but the opposite actually seems to be the case. This is likely caused by the correlation between strikeouts and power, since players who swing hard tend to either miss entirely or crush the ball for hits."

Brett can't be specifically talking about low contact hitters, because the top BABIP hitters aren't pure strikeout hitters. Certainly, there are some high strikeout guys on these lists (Hunter Pence 2007, Ryan Howard 2006), but many of the hitters here aren't high strikeout guys.

I am curious about Hardball Times' theory. Are players who are swinging hard "crushing the ball for hits" when they're not swinging and missing?

Top 10 Strikeouts 2008

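| # | Player | K | BABIP | BA | HR | $ |
|---|---|---|---|---|---|---|
| 1 | Mark Reynolds | 204 | .329 | .239 | 28 | $20 |
| 2 | Ryan Howard | 199 | .289 | .251 | 48 | $30 |
| 3 | Jack Cust | 197 | .311 | .231 | 33 | $16 |
| 4 | Dan Uggla | 171 | .323 | .260 | 32 | $22 |
| 5 | Carlos Pena | 166 | .307 | .247 | 31 | $21 |
| 6 | Chris Young | 165 | .304 | .248 | 22 | $19 |
| 7 | Adam Dunn | 164 | .262 | .236 | 40 | $21 |
| 8 | Matt Kemp | 153 | .363 | .290 | 18 | $33 |
| 9 | Jim Thome | 147 | .276 | .245 | 34 | $20 |
| 10 | Ryan Ludwick | 146 | .349 | .299 | 37 | $32 |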

We've talked a lot about "regression to the mean" the last few days, so it's useful to know that the mean in 2008 for BABIP was .313 (strictly speaking, that figure is the median). That is to say that of the 147 hitters who qualified for the batting title in 2008, the 74th hitter, Adam LaRoche, had a .313 BABIP.

This chart isn't a ringing endorsement for the theory that these guys are "smoking the ball" when they're not missing it. Dunn, Thome, and Howard are the only hitters who are well below the mean, but only Kemp and Ludwick are really well above it.
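As a rough check on that reading, the sketch below compares the ten BABIP figures from the 2008 list with the .313 league figure. The variable names and the .020 cutoff for "well" above or below are my own assumptions for illustration; the numbers themselves are copied from the list.

```python
# BABIP for the 2008 top-10 strikeout hitters, copied from the list above.
babip_2008 = {
    "Mark Reynolds": 0.329,
    "Ryan Howard": 0.289,
    "Jack Cust": 0.311,
    "Dan Uggla": 0.323,
    "Carlos Pena": 0.307,
    "Chris Young": 0.304,
    "Adam Dunn": 0.262,
    "Matt Kemp": 0.363,
    "Jim Thome": 0.276,
    "Ryan Ludwick": 0.349,
}

LEAGUE_MID = 0.313  # 2008 figure quoted in the post
MARGIN = 0.020      # assumed cutoff for "well" above/below; not from the post

group_avg = sum(babip_2008.values()) / len(babip_2008)
well_above = sorted(n for n, b in babip_2008.items() if b - LEAGUE_MID > MARGIN)
well_below = sorted(n for n, b in babip_2008.items() if LEAGUE_MID - b > MARGIN)

print(f"group average: {group_avg:.3f}")
print("well above:", well_above)
print("well below:", well_below)
```

With that cutoff, the group average comes out at .311, a couple of points from the league figure, which fits the point that these hitters as a group are not obviously beating it.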

Top 10 Strikeouts 2007

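| # | Player | K | BABIP | BA | HR | $ |
|---|---|---|---|---|---|---|
| 1 | Ryan Howard | 199 | .336 | .268 | 47 | $32 |
| 2 | Dan Uggla | 167 | .286 | .245 | 31 | $16 |
| 3 | Adam Dunn | 165 | .309 | .264 | 40 | $28 |
| 4 | Jack Cust | 164 | .366 | .256 | 26 | $16 |
| 5 | Mike Cameron | 160 | .300 | .242 | 21 | $18 |
| 6 | Grady Sizemore | 155 | .334 | .277 | 24 | $31 |
| 7 | B.J. Upton | 154 | .399 | .300 | 24 | $30 |
| 8 | Brandon Inge | 150 | .308 | .236 | 14 | $10 |
| 9 | Jhonny Peralta | 146 | .329 | .270 | 21 | $16 |
| 10 | Carlos Pena | 142 | .305 | .282 | 46 | $31 |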

Looking at this backwards, half of the players from the 2008 list repeat on the 2007 list.

2007's mean is about the same as 2008's (.312), but now we have hitters like Upton and Cust who are truly smoking the ball when they're not swinging and missing. With the exception of Pena, there is an incredible amount of variance in the repeaters from one year to the next.
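To put numbers on that year-to-year swing, here is a minimal sketch that computes the 2007-to-2008 change in BABIP for the five hitters who appear on both lists. The figures are copied from the two lists; the variable names are illustrative.

```python
# (2007 BABIP, 2008 BABIP) for hitters on both top-10 strikeout lists.
repeaters = {
    "Ryan Howard": (0.336, 0.289),
    "Dan Uggla": (0.286, 0.323),
    "Adam Dunn": (0.309, 0.262),
    "Jack Cust": (0.366, 0.311),
    "Carlos Pena": (0.305, 0.307),
}

# Change from 2007 to 2008, rounded to thousandths to hide float noise.
changes = {name: round(b08 - b07, 3) for name, (b07, b08) in repeaters.items()}

# Print the biggest movers first.
for name, delta in sorted(changes.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:12s} {delta:+.3f}")

steadiest = min(changes, key=lambda name: abs(changes[name]))
print("steadiest:", steadiest)
```

Four of the five moved by 37 points or more in one direction or the other, while Pena moved by two.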

We're still not any closer to figuring out how much of this is good luck one year versus bad luck the next. And I'm not necessarily convinced that Brandon Inge is due some additional luck because he's taking some "good cuts" on the balls he's not completely missing.

Top 10 Strikeouts 2006

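| # | Player | K | BABIP | BA | HR | $ |
|---|---|---|---|---|---|---|
| 1 | Adam Dunn | 194 | .278 | .234 | 40 | $20 |
| 2 | Ryan Howard | 181 | .363 | .313 | 58 | $43 |
| 3 | Curtis Granderson | 174 | .337 | .260 | 19 | $14 |
| 4 | Bill Hall | 162 | .324 | .270 | 35 | $24 |
| 5 | Alfonso Soriano | 160 | .302 | .277 | 46 | $44 |
| 6 | Jason Bay | 156 | .338 | .286 | 35 | $31 |
| 7 | Richie Sexson | 154 | .303 | .264 | 34 | $20 |
| 8 | Grady Sizemore | 153 | .342 | .290 | 28 | $29 |
| 9 | Nick Swisher | 152 | .287 | .254 | 35 | $18 |
| 10 | Jhonny Peralta | 152 | .329 | .257 | 13 | $7 |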

Once again, more repeaters (Dunn, Howard, Sizemore, and Peralta). Now you've got even more hitters (six) above the mean (.314) for 2006.

One more...

Top 10 Strikeouts 2005

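| # | Player | K | BABIP | BA | HR | $ |
|---|---|---|---|---|---|---|
| 1 | Adam Dunn | 168 | .281 | .247 | 40 | $24 |
| 2 | Richie Sexson | 167 | .307 | .263 | 39 | $26 |
| 3 | Pat Burrell | 160 | .341 | .281 | 32 | $27 |
| 4 | Preston Wilson | 148 | .317 | .260 | 25 | $24 |
| 5 | Brad Wilkerson | 147 | .317 | .248 | 11 | $20 |
| 6 | Troy Glaus | 145 | .287 | .258 | 37 | $25 |
| 7 | Jason Bay | 142 | .355 | .306 | 32 | $40 |
| 8 | Brandon Inge | 140 | .315 | .261 | 16 | $15 |
| 9 | Alex Rodriguez | 139 | .349 | .321 | 48 | $49 |
| 10 | Jim Edmonds | 139 | .314 | .263 | 29 | $22 |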

These BABIP numbers all look pretty good. But for some of these players (Sexson, Wilkerson, Wilson, Inge), it's the low BA/high K combination, and not the BABIP, that should have been the warning sign.

One significant problem with expecting BABIP to be higher for sluggers is that it loses sight of what a strikeout is.

It's a negative outcome. It's three strikes, usually with some swings and misses along the way. There might be some hitters taking great cuts and just missing, but there are also a lot of bad swings on strike three.
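For reference, the commonly used BABIP formula is (H - HR) / (AB - K - HR + SF), so strikeouts drop out of the denominator entirely while still counting against batting average. The sketch below uses that standard formula with two made-up stat lines (hypothetical hitters, not anyone in the lists above) that share the same at-bats, home runs, and BABIP but differ in strikeouts.

```python
def babip(h, hr, ab, k, sf=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (h - hr) / (ab - k - hr + sf)


def batting_average(h, ab):
    """Plain batting average: hits divided by at-bats."""
    return h / ab


# Two hypothetical hitters: same AB, HR, and BABIP, different strikeout totals.
contact = dict(h=134, hr=20, ab=500, k=100)
whiffer = dict(h=110, hr=20, ab=500, k=180)

for label, line in (("100 K", contact), ("180 K", whiffer)):
    print(
        label,
        f"BABIP {babip(**line):.3f}",
        f"BA {batting_average(line['h'], line['ab']):.3f}",
    )
```

Both invented hitters post a .300 BABIP, but the one with 180 strikeouts hits .220 against .268 for the other. A healthy BABIP says nothing about the at-bats that ended without a ball in play.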

After Pat Burrell's .341 in 2005, he's put up .298, .283, and .275. He might be below his expected BABIP, but after three years of putting up BABIPs under .300, it might be time to ask if our expectations are too high.

After Sexson's .217 in 2007, a lot of the numbers wags said Richie would bounce back. He did...to .275, which didn't make enough of an impact.

I think there are hitters who make great contact or swing and miss. But there are also hitters taking lousy swings and putting the ball into play. Don't assume someone who mashes the ball out of the park is making good contact when that ball stays in the yard.

2 comments:

Strangely, there is not a very strong correlation between contact rate and BABIP. I've been playing around with those numbers myself tonight and found that over the last 5 seasons, for those batters with 500 or more plate appearances (737 in all), the mean BABIP is .308. This graph shows the rather weak correlation for the large sample size. I was rather surprised by the results myself.
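The 737-hitter data set and the graph are not reproduced here. As an illustration of how such a correlation could be computed, this sketch defines a Pearson coefficient by hand and, purely as a stand-in, applies it to the strikeout totals and BABIPs of the ten hitters on the 2008 list in the post. Strikeout totals are only a crude proxy for contact rate, and ten hitters is far too small a sample to conclude anything from.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# 2008 top-10 strikeout hitters, in list order (K totals and BABIP).
strikeouts = [204, 199, 197, 171, 166, 165, 164, 153, 147, 146]
babips = [0.329, 0.289, 0.311, 0.323, 0.307, 0.304, 0.262, 0.363, 0.276, 0.349]

r = pearson_r(strikeouts, babips)
print(f"r = {r:.2f}")
```

A value of r near zero would match the weak relationship described in the comment; values near +1 or -1 would indicate a strong one.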

I think that while regression to the mean is relevant for pitchers, the theory from THT says that hitters have their own individual mean to regress to, based on their GB/FB/LD rates, their speed, and their batting side. There are also some park factors.

That's why Jeter and Ichiro always appear at the top of the lists. It's not luck, it's design. But pitchers only have control over three outcomes: HR, Ks, and BB. Everything else is ballpark and defense. When a pitcher has one good year of BABIP, it can be assumed that the next year, if his three true outcomes are identical, his ERA should regress toward the mean because of the increased hits. It's all related to DIPS ERA or other similar calculations.

For the hitters, though, Burrell seems to have a similar BABIP every year, around the league average, except for the one year. That means he was lucky that year, not unlucky the others. His style of swing, foot speed, and right-handed bat just don't lead to a high BABIP. He will regress toward his personal mean, not the league mean.
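The personal-mean idea can be sketched with the four Burrell BABIPs quoted in the post (.341 in 2005, followed by .298, .283, and .275, which I am assuming are 2006 through 2008). Comparing each season with the average of the other three is my own illustrative choice, not THT's expected-BABIP method, which per the comment also draws on batted-ball mix, speed, handedness, and park.

```python
from statistics import mean

# Burrell's BABIP by season; 2006-2008 years are inferred from the post's wording.
burrell = {2005: 0.341, 2006: 0.298, 2007: 0.283, 2008: 0.275}


def gap_from_other_seasons(seasons, year):
    """Return one season's BABIP minus the mean of all the other seasons."""
    others = [value for y, value in seasons.items() if y != year]
    return seasons[year] - mean(others)


for year in burrell:
    print(year, f"{gap_from_other_seasons(burrell, year):+.3f}")

outlier = max(burrell, key=lambda y: abs(gap_from_other_seasons(burrell, y)))
print("biggest outlier:", outlier)
```

On these four seasons, 2005 sits about 56 points above the average of the other three, which is the pattern the comment describes: one lucky year rather than three unlucky ones.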