The Walkers

By Bill James

May 31, 2019

Who's the greatest walker who ever lived, the greatest at extracting a walk from a pitcher? Optimally one would want to adjust that list for batter quality -- Barry Bonds is not the person I am asking about here, I don't think, pitchers were afraid to let him hit anything. Who's the greatest active walker?

Asked by: wovenstrap

Answered: 5/30/2019

Interesting question. I might do a study related to that; threat-adjusted walk percentage, or something. I don't think there is anybody now who sort of specializes in that, the way some players did in the past (Max Bishop, Ed Yost, Eddie Stanky, Gene Tenace), perhaps because the game has changed. I might study it.

The great Jim Murray once wrote about Maury Wills that there was one category Wills would never lead the league in: Walks. Murray was reflecting the normal assumption of the time, which was that a walk was an act of the pitcher. Pitchers, he was implicitly saying, have an element of choice in who they walk. They would never choose to walk Maury Wills, because he was a threat on the base paths.

Wovenstrap’s question, when you think about it, still relies on that frame of reference. When you ask "who is best at extracting a walk from a pitcher?", you are still implicitly stating that the act belongs to the pitcher, although advancing the argument by crediting the batter with an ability rather like the ability of a farmer to extract a loan from a banker or the ability of the cop to extract a confession from a criminal.

So my data group here is all player/seasons (non-pitcher) from 1913 to 2017, using 1913 as the back border because I wanted to study the influence of strikeouts on walks, assuming that some hitters who work the pitcher for a walk are also working him for a strikeout, like Mickey Mantle, let’s say. Strikeout records are incomplete before 1913. I used 2017 as the other border because I was using a version of the data base that I haven’t completely updated for 2018.

I tried to create a formula which estimates the "expected walks" for any hitter, based on

1) Power,

2) Batting average,

3) Strikeouts,

4) Speed, and

5) Anything else that pops up as a separator between high-walk and low-walk players.

Power

Walks for a hitter definitely increase as power increases, although this effect is not as large as I would have guessed that it was. There are many singles hitters in history who drew large numbers of walks; there are many power hitters who didn’t.

Batting Average

Batting average did not turn out to be useful as a predictor of walks drawn for a hitter, except for hitters who hit for a very high average, over .340. If a player hits for a very high average, there is some "risk avoidance" of the hitter in run situations. Otherwise, walks relate to batting average in a U-shaped curve. The center of the batting average chart for all players in the study was .267. Players with batting averages near .267 had lower walk rates than either players with very high batting averages or players with very low batting averages. However, when I tried to build "absolute distance from batting average of .267" into the walks prediction formula, I could not improve the accuracy of the predictions by the use of that information, other than making an adjustment to expected walks for players who hit over .340.

Strikeouts

Strikeouts, at least in this study, did not seem to be a meaningful predictor of walks drawn, independent of home runs. Of course strikeouts are fellow-travelers of home runs, and home runs predict walks, so there is an indirect effect of strikeouts on walks through the home run column. But when you adjust for that, strikeouts do not seem to be useful information for predicting walks, at least that I could find in my four or five hours of messing around with the data.

Speed

Speed, again, is not a useful predictor of walks, I don’t believe. "Fast" players actually walk MORE than "Slow" players, which makes sense when you understand that the batter is actually more in control of when a walk occurs than the pitcher is.

We know that the batter is more in control of when a walk occurs than the pitcher is because the standard deviation of walks per plate appearance is higher for batters than it is for pitchers. In order to model the actual outcomes, you have to leave more space for the batter to control the event than you leave for the pitcher to control the event. Understanding that, then, one would predict that fast players should walk more than slow players do, actually for the reason stated by Jim Murray. As a pitcher would want to avoid walking a very fast runner, so also a fast runner would have more incentive to draw a walk than a slow runner would. If the batter is more in control of the walk outcome than the pitcher is, one would expect walks to increase as speed increases, as they do. But to make this information useful in a prediction model, you’d have to put more time into it than I have available.

Anything else

There is, actually, something else that pops out of the data that is pretty important: handedness. Left-handed batters, controlling for power, walk quite significantly more often than right-handed hitters do. Switch hitters walk even a little bit more often than left-handers. We’ll have to adjust for that.

Now that I think about it, I have "height" in the data base; I should have studied that. It’s probably useful. Oh, well. I’m moving on; I’m publishing this today. If you want to study it with height included as a factor, you go ahead.

Before I make the Predictive Formula

The first thing that I have to explain is that, by "home run percentage", I don’t EXACTLY mean the home run percentage, and, by batting average, I don’t EXACTLY mean the batting average. I modified the player’s expected walks by his home run rate, and by his batting average if his batting average was over .340, but I didn’t use raw home run rate or raw batting average. For home run rate, I used this:

(Home Runs + 1) / (At Bats + 42)

And for "batting average" I used this:

(Hits + 16) / (At Bats + 60)

It’s what I call ballast. I do that so that I don’t get crazy results for players who hit 2 home runs in 5 at bats. Going 9-for-25 with 3 homers doesn’t make you a .360 hitter with a home run rate of 72 per 600 at bats; it makes you a .294 hitter with a home run rate of 36 per 600 at bats, which is still pretty good, but normal.
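The two ballast adjustments can be sketched in a few lines of Python. The function names here are my own, chosen only for illustration; the constants (1 and 42, 16 and 60) are the ones given above.

```python
def ballasted_hr_rate(home_runs: int, at_bats: int) -> float:
    """Home run rate with ballast: (HR + 1) / (AB + 42)."""
    return (home_runs + 1) / (at_bats + 42)


def ballasted_average(hits: int, at_bats: int) -> float:
    """Batting average with ballast: (H + 16) / (AB + 60)."""
    return (hits + 16) / (at_bats + 60)


# The 9-for-25, 3-homer example from the text.
avg = ballasted_average(9, 25)               # 25 / 85
hr_per_600 = 600 * ballasted_hr_rate(3, 25)  # 600 * 4 / 67
print(f"{avg:.3f}", round(hr_per_600))       # prints: 0.294 36
```

The ballast matters a great deal in 25 at bats and very little over a full season, which is the point of it.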

The Predictive Formula

A player’s expected walks are:

1) His home run rate,

2) Times .7,

3) Plus .069,

4) Times his plate appearances,

5) Times 1.1 if the player is a switch hitter, 1.08 if he is a left-handed hitter, and 0.92 if he is a right-handed hitter, and

6) Increased by 12% if he hit over .340.
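For anyone who wants to run this on their own data, here is a minimal Python sketch of the six steps. The function name, the argument names and the "S"/"L"/"R" codes are assumptions made for the illustration; the constants come from the steps above, and both the home run rate and the batting average are the ballasted versions.

```python
# Step 5 multipliers, keyed by a hypothetical handedness code.
HAND_FACTOR = {"S": 1.10, "L": 1.08, "R": 0.92}


def expected_walks(home_runs: int, at_bats: int, hits: int,
                   plate_appearances: int, bats: str) -> float:
    """Expected walks for one player/season, following the six steps."""
    hr_rate = (home_runs + 1) / (at_bats + 42)   # ballasted home run rate
    average = (hits + 16) / (at_bats + 60)       # ballasted batting average

    expected = hr_rate * 0.7 + 0.069             # steps 1-3
    expected *= plate_appearances                # step 4
    expected *= HAND_FACTOR[bats]                # step 5
    if average > 0.340:                          # step 6
        expected *= 1.12
    return expected


# Duke Snider, 1954: 40 HR, 584 AB, 679 PA, left-handed.
# 199 hits is what a .341 average in 584 at bats works out to.
print(round(expected_walks(40, 584, 199, 679, "L"), 1))  # prints: 84.2
```

The .340 test is applied to the ballasted average, not the raw one, so a raw .341 hitter does not get the 12% bump.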

Let’s do a couple of players for whom the formula works, to show how it works. Duke Snider in 1954 hit 40 home runs in 584 at bats, which we change to 41 in 626, figuring that he had a home run rate of .0655.

That number we multiply by .7, making .0458.

To that we add .069, making .1148.

We multiply his plate appearances (679) times .1148, making 77.98.

This we increase by 8% because Snider was a left-handed hitter, making an expectation of 84.2 walks.

Snider in 1954 hit .341. HOWEVER, because of the "ballast" adjustment, we don’t treat him as a .341 hitter, but as a .334 hitter. Therefore, we don’t make an additional adjustment for his high batting average, and his expected walks stay at 84.2.

He did in fact walk 84 times. The formula accurately predicts his walks, in that particular case.

George Springer in 2017 hit 34 homers in 548 at bats, which we interpret as a home run percentage of .0593 (35/590). Multiply that by .7, you have .0415. Add .069, you have .1105.

Multiply that by his plate appearances, 629, and you have 69.52 walks. He is, however, a right-handed hitter, so we reduce that by 8%, and his expected walks are 63.96. He actually drew 64 walks.

Chris Davis in 2017 hit 26 homers in 456 at bats, which we interpret to be a home run rate of .0542 (27/498). Multiply that by .7, you have .0380. Add .069, you have .107. Multiply that by his plate appearances, 524, and you have 56.04. He’s a left-handed hitter, so we’ll increase that by 8%; that makes 60.53. He actually drew 61 walks.

Hanley Ramirez hit his expected walks exactly in both 2016 and 2017.

Let’s do somebody who has no power. Luis Polonia in 1991 hit 2 homers in 604 at bats, which we interpret as a home run rate of .00464, or 3/646. Multiply that by .7, you have .00325. Add .069, you have .07225. Multiply that by his plate appearances, 662, and you have 47.83 walks. He’s a left-handed hitter, so we increase that by 8%, and he winds up with 51.66 expected walks. He actually drew 52 walks, so the formula works.

Later on, we’ll do the cases where it doesn’t work. There are lots of cases where it works perfectly, literally hundreds of them, thousands of them if you included the low-at-bat guys, and many cases where it doesn’t work. But when it doesn’t work, it is equally likely to be 50 walks too high, or 50 walks too low.

The Results

First of all, you have to just throw Barry Bonds in 2004 out the window; Barry Bonds in 2004 is just stupid data. Note what I am saying; I am not saying that you have to throw Barry Bonds’ data out the window; I am just saying Barry Bonds in 2004. We all know why this is; we don’t need to talk about it or explain it. We’re just going to throw it away and move on.

Throwing away Barry Bonds in 2004, the 25 most exceptional walk seasons of all time—that is, the 25 seasons in which the player most exceeded his expected walks—are these 25 seasons:

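| Rank | First | Last | Year | Expected | Actual |
|---|---|---|---|---|---|
| 1 | Eddie | Stanky | 1945 | 48 | 148 |
| 2 | Eddie | Yost | 1956 | 53 | 151 |
| 3 | Eddie | Stanky | 1946 | 42 | 137 |
| 4 | Eddie | Stanky | 1950 | 51 | 144 |
| 5 | Barry | Bonds | 2002 | 106 | 198 |
| 6 | Eddie | Joost | 1949 | 62 | 149 |
| 7 | Eddie | Yost | 1950 | 56 | 141 |
| 8 | Ferris | Fain | 1949 | 54 | 136 |
| 9 | Ted | Williams | 1947 | 82 | 162 |
| 10 | Jimmy | Wynn | 1969 | 68 | 148 |
| 11 | Rickey | Henderson | 1996 | 46 | 125 |
| 12 | Luke | Appling | 1935 | 43 | 122 |
| 13 | Max | Bishop | 1929 | 50 | 128 |
| 14 | Eddie | Yost | 1954 | 53 | 131 |
| 15 | Luke | Appling | 1949 | 44 | 121 |
| 16 | Jimmy | Wynn | 1976 | 51 | 127 |
| 17 | Max | Bishop | 1926 | 40 | 116 |
| 18 | Gene | Tenace | 1977 | 49 | 125 |
| 19 | Eddie | Yost | 1959 | 60 | 135 |
| 20 | Ferris | Fain | 1950 | 59 | 133 |
| 21 | Eddie | Stanky | 1951 | 53 | 127 |
| 22 | Max | Bishop | 1930 | 54 | 128 |
| 23 | Jack | Clark | 1989 | 58 | 132 |
| 24 | Rickey | Henderson | 1989 | 52 | 126 |
| 25 | Eddie | Yost | 1960 | 52 | 125 |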

Eddie Stanky and Eddie Yost are, by this chart, the greatest non-threat Walkers of all time. Wovenstrap said that "Barry Bonds" is not the answer he is looking for, and I agree that it isn’t, but Bonds in 2004 would have been +123 walks, if we were counting that.

Running the data for Stanky in 1945, The Brat hit 1 home run in 555 at bats, which we interpret as a home run rate of .00335. Multiply that by .7, you’ve got .00234. Add .069, you’ve got .07134. Multiply that by his plate appearances (726), you’ve got an expectation of 51.8 walks, but he’s a right-handed hitter, so we reduce that by 8%, and he’s down to 47.7, which we will call 48. He actually drew 148 walks, which I think was a National League record at the time, so he beat expectations by 100. He is the only player in history, other than Bonds in 2004, to beat his expected walk total by 100.

Ed Yost in 1956. . . Ed Yost was known as the Walking Man. Yost hit 11 homers in 515 at bats, which we interpret as a Home Run Percentage of .02154. Multiply that by .7, that’s .0151; add .069 and it is .0841. Yost had 684 plate appearances, so that’s an expectation of 57.5 walks, but he was a right-handed hitter, so we multiply that by .92, and we’re down to 52.9, or 53 walks. He actually drew 151 walks, so that’s +98.

These are the greatest NON-Walking seasons in the data, by this method:

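| Rank | First | Last | Year | Expected | Actual |
|---|---|---|---|---|---|
| 1 | Garret | Anderson | 2000 | 78 | 24 |
| 2 | Rougned | Odor | 2016 | 72 | 19 |
| 3 | Alfonso | Soriano | 2002 | 73 | 23 |
| 4 | Bill | Terry | 1932 | 81 | 32 |
| 5 | Garret | Anderson | 2001 | 74 | 27 |
| 6 | Woody | Jensen | 1936 | 63 | 16 |
| 7 | Hal | Trosky | 1936 | 83 | 36 |
| 8 | Lou | Brock | 1967 | 70 | 24 |
| 9 | Joe | Pepitone | 1964 | 70 | 24 |
| 10 | Al | Oliver | 1973 | 67 | 22 |
| 11 | Cecil | Cooper | 1982 | 77 | 32 |
| 12 | Tony | Oliva | 1964 | 79 | 34 |
| 13 | Kirby | Puckett | 1988 | 67 | 23 |
| 14 | Joe | Pepitone | 1963 | 67 | 23 |
| 15 | Adam | Jones | 2014 | 63 | 19 |
| 16 | Garret | Anderson | 2002 | 73 | 30 |
| 17 | Dante | Bichette | 1995 | 65 | 22 |
| 18 | Willie | Davis | 1966 | 58 | 15 |
| 19 | Garry | Templeton | 1979 | 60 | 18 |
| 20 | Chuck | Klein | 1930 | 96 | 54 |
| 21 | Andre | Dawson | 1987 | 74 | 32 |
| 22 | Dave | Robertson | 1916 | 56 | 14 |
| 23 | A.J. | Pierzynski | 2013 | 53 | 11 |
| 24 | Garret | Anderson | 2003 | 73 | 31 |
| 25 | Bobby | Tolan | 1969 | 68 | 27 |
| 26 | Felipe | Alou | 1966 | 65 | 24 |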

So the greatest NON-Walker of all time is, let us say, Garret Anderson in 2000. I’ll run the data for Garret Anderson. Anderson hit 35 homers in 647 at bats, which we interpret as a Home Run Percentage of .0522. Multiply that by .7, that’s .0366. Add .069, that’s .1056. Multiply that by his plate appearances, 681, and he’s expected to walk 71.896 times. He’s a left-handed hitter, so we increase that by 8%, and we’re up to 78 expected walks. He actually walked only 24 times, so he is 54 walks short of expectation, the largest shortfall of all time.

People think that Barry Bonds and Babe Ruth and Ted Williams walked a tremendous amount because they were left-handed hitters and great hitters with high averages and a lot of power, so naturally they’re going to walk a lot. It’s not that simple. Garret Anderson in 2000 was a left-handed hitter; he hit .286 with 35 homers, which is pretty good, similar to what Barry Bonds did in 1995 (.294 with 33 homers). But when Bonds did that, he drew 120 walks; when Anderson did it, he drew 24. Bill Terry in 1932 was a left-handed hitter who hit for a good average (.350) with 28 homers, but he didn’t draw walks. Hal Trosky in 1936 was a left-handed hitter who hit .343 with 42 homers, but he didn’t draw walks. Cecil Cooper in 1982 was a left-handed hitter who hit .313 with 32 homers, but he didn’t draw walks. Tony Oliva in 1964 was a left-handed hitter who hit .323 with 32 homers, but he didn’t draw walks. Chuck Klein in 1930 was a left-handed hitter who hit .386 with 40 homers, but he didn’t draw a lot of walks. Bonds, Ruth and Ted Williams didn’t draw huge numbers of walks because they were left-handed hitters who hit .350 with power; they took a large number of walks because that was part of their approach. It was an additional skill that they had.

Career Numbers

Well, before I get there, there is a point I should have made earlier.

The Point I Should have Made Earlier

Hitters walk, over time, in about 8.8% of plate appearances. The first-effort approach to predicting walks for each hitter, then, is just to predict that every hitter will walk in about 8.8% of his plate appearances.

You can improve that estimate by (1) adjusting for his power, and (2) adjusting for whether he is a left-handed hitter or a normal person, but these improvements don’t actually do a hell of a lot. You have a certain amount of error in the first effort, and then, by making these adjustments, you can remove about 20% of that error, get the estimates 20% closer to what actually happens.

And then that’s about all you can do, just remove about 20% of the error. I’m sure you could do better than I could, if you spent a week with the data rather than a few hours, but I’m also pretty confident that you couldn’t remove 40 or 50% of the error; maybe you could get to 25% or something, but not much better.
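If you want to check the "share of error removed" idea on your own data, here is one way it could be set up in Python. The article does not say how the error was measured, so using mean absolute error is an assumption, as are the function and variable names. The seven seasons are the ones worked through in this article; they are hand-picked perfect fits and extreme misses, not a fair sample, so whatever number this prints should not be expected to land near 20%.

```python
LEAGUE_WALK_RATE = 0.088  # first-effort estimate: 8.8% of plate appearances

# (player/season, plate appearances, formula-expected walks, actual walks),
# all copied from the worked examples in this article.
SEASONS = [
    ("Duke Snider 1954", 679, 84.2, 84),
    ("George Springer 2017", 629, 63.96, 64),
    ("Chris Davis 2017", 524, 60.53, 61),
    ("Luis Polonia 1991", 662, 51.66, 52),
    ("Eddie Stanky 1945", 726, 47.7, 148),
    ("Eddie Yost 1956", 684, 52.9, 151),
    ("Garret Anderson 2000", 681, 78, 24),
]


def mean_abs_error(predicted, actual):
    """Average absolute miss, in walks (an assumed error measure)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def share_of_error_removed(baseline, model, actual):
    """Fraction of the baseline error that the model gets rid of."""
    return 1 - mean_abs_error(model, actual) / mean_abs_error(baseline, actual)


actual = [s[3] for s in SEASONS]
baseline = [LEAGUE_WALK_RATE * s[1] for s in SEASONS]
model = [s[2] for s in SEASONS]
print(f"{share_of_error_removed(baseline, model, actual):.1%}")
```

Run against the full set of player/seasons, the same two functions would give the comparison described above.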

The reason that is true is that drawing walks is a strong individual skill or trait. How many walks you draw... and here again, we have to say "with the exception of Barry Bonds in 2004." But with the exception of Barry Bonds in 2004, how many walks you draw is NOT primarily a function of how much power you have or what your batting average is or whether you are fast or slow or fat or ugly or whether you hit right-handed or left-handed. It is primarily a function of the batter’s ability to draw a walk. Because it is an independent skill of its own, it cannot be indirectly measured or independently predicted with great accuracy based on the player’s other characteristics.

Career Numbers

These are the greatest walk-drawers of all time, career totals, in terms of drawing more walks than you would expect them to draw:

| First | Last | Expected | Actual | Margin |
|---|---|---|---|---|
| Rickey | Henderson | 1078 | 2190 | 1112 |
| Eddie | Yost | 699 | 1614 | 915 |
| Barry | Bonds | 1672 | 2558 | 886 |
| Joe | Morgan | 1091 | 1865 | 774 |
| Ted | Williams | 1247 | 2021 | 774 |
| Max | Bishop | 477 | 1153 | 676 |
| Eddie | Stanky | 374 | 996 | 622 |
| Frank | Thomas | 1046 | 1667 | 621 |
| Luke | Appling | 704 | 1302 | 598 |
| Willie | Randolph | 651 | 1243 | 592 |
| Babe | Ruth | 1530 | 2062 | 532 |
| Harmon | Killebrew | 1047 | 1559 | 512 |
| Pee Wee | Reese | 701 | 1210 | 509 |
| Eddie | Joost | 538 | 1043 | 505 |
| Lu | Blue | 596 | 1092 | 496 |
| Jimmy | Wynn | 728 | 1224 | 496 |
| Harlond | Clift | 574 | 1070 | 496 |
| Edgar | Martinez | 791 | 1283 | 492 |
| Ferris | Fain | 414 | 904 | 490 |
| Jack | Clark | 776 | 1262 | 486 |
| Mickey | Mantle | 1253 | 1733 | 480 |
| Gene | Tenace | 506 | 984 | 478 |
| Tony | Phillips | 842 | 1319 | 477 |
| Eddie | Collins | 740 | 1213 | 473 |
| Bobby | Abreu | 1007 | 1476 | 469 |

26th is Wade Boggs. And these are the greatest walk-drawers in terms of the ratio of actual to expected walks drawn, minimum total of 1000 between expected and actual:

| First | Last | Expected | Actual | Ratio |
|---|---|---|---|---|
| Eddie | Stanky | 374 | 996 | 2.660 |
| Max | Bishop | 477 | 1153 | 2.418 |
| Eddie | Yost | 699 | 1614 | 2.309 |
| Ferris | Fain | 414 | 904 | 2.182 |
| Rickey | Henderson | 1078 | 2190 | 2.031 |
| Gene | Tenace | 506 | 984 | 1.946 |
| Rick | Ferrell | 480 | 931 | 1.939 |
| Eddie | Joost | 538 | 1043 | 1.938 |
| Willie | Randolph | 651 | 1243 | 1.909 |
| Lyn | Lary | 378 | 705 | 1.864 |
| Harlond | Clift | 574 | 1070 | 1.863 |
| Luke | Appling | 704 | 1302 | 1.849 |
| Lu | Blue | 596 | 1092 | 1.833 |
| Roy | Cullenbine | 467 | 852 | 1.823 |
| Elmer | Valo | 518 | 943 | 1.822 |
| Willie | Kamm | 469 | 824 | 1.756 |
| Dave | Magadan | 414 | 718 | 1.733 |
| Pee Wee | Reese | 701 | 1210 | 1.726 |
| Joe | Morgan | 1091 | 1865 | 1.710 |
| Jimmy | Wynn | 728 | 1224 | 1.681 |
| Earl | Torgeson | 584 | 980 | 1.679 |
| Mike | Hargrove | 576 | 965 | 1.675 |
| Elbie | Fletcher | 510 | 851 | 1.670 |
| Bill | North | 376 | 627 | 1.669 |
| Eddie | Collins | 740 | 1213 | 1.640 |
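The two career rankings, by margin and by ratio, come from the same two numbers per player. A small Python sketch, using a handful of career lines copied from the tables in this article; the class and function names are made up for the illustration. Because the expected totals shown in the tables are rounded to whole walks, ratios recomputed this way can differ from the published ones in the third decimal.

```python
from typing import NamedTuple


class Career(NamedTuple):
    name: str
    expected: int  # career expected walks, as published (rounded)
    actual: int    # career actual walks


CAREERS = [
    Career("Rickey Henderson", 1078, 2190),
    Career("Eddie Yost", 699, 1614),
    Career("Barry Bonds", 1672, 2558),
    Career("Max Bishop", 477, 1153),
    Career("Eddie Stanky", 374, 996),
    Career("A.J. Pierzynski", 735, 308),
]


def margin(c: Career) -> int:
    """Walks drawn beyond expectation."""
    return c.actual - c.expected


def ratio(c: Career) -> float:
    """Actual walks divided by expected walks."""
    return c.actual / c.expected


def qualifies(c: Career, minimum: int = 1000) -> bool:
    """Minimum total of 1000 between expected and actual walks."""
    return c.expected + c.actual >= minimum


by_margin = sorted(CAREERS, key=margin, reverse=True)
by_ratio = sorted((c for c in CAREERS if qualifies(c)), key=ratio, reverse=True)

for c in by_ratio:
    print(f"{c.name:18} {margin(c):+5d} {ratio(c):.3f}")
```

The margin list favors long careers; the ratio list, with the 1000 minimum, favors the specialists, which is why Stanky tops one and Henderson the other.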

These are the top 25 NON-Walkers of all time:

| First | Last | Expected | Actual | Ratio |
|---|---|---|---|---|
| A.J. | Pierzynski | 735 | 308 | .419 |
| Garret | Anderson | 910 | 429 | .471 |
| Willie | Davis | 882 | 418 | .474 |
| Bill | Buckner | 892 | 450 | .504 |
| Garry | Templeton | 688 | 375 | .545 |
| Cecil | Cooper | 794 | 448 | .564 |
| Carl | Crawford | 648 | 377 | .582 |
| Vada | Pinson | 983 | 574 | .584 |
| Al | Oliver | 908 | 535 | .589 |
| George | Sisler | 794 | 472 | .594 |
| Ivan | Rodriguez | 862 | 513 | .595 |
| Andre | Dawson | 979 | 589 | .602 |
| Steve | Garvey | 786 | 479 | .609 |
| Alfonso | Soriano | 809 | 496 | .613 |
| Juan | Gonzalez | 744 | 457 | .614 |
| Vinny | Castilla | 685 | 423 | .617 |
| Joe | Carter | 849 | 527 | .621 |
| Lloyd | Waner | 669 | 420 | .627 |
| Robinson | Cano | 875 | 550 | .628 |
| Willie | Wilson | 676 | 425 | .629 |
| Frank | White | 651 | 412 | .633 |
| Brandon | Phillips | 651 | 416 | .639 |
| Willie | McGee | 701 | 448 | .639 |
| Matt | Williams | 733 | 469 | .640 |
| Lee | May | 760 | 487 | .641 |

Here is a full listing of all players whose walks + expected walks total 1,000 or more. Data for active players is a year out of date, because I haven’t updated something for 2018 yet: