How many ballots to get elected to the Hall?

Voting shall be based upon the player’s record, playing ability, integrity, sportsmanship, character, and contributions to the team(s) on which the player played.
— Official Baseball Hall of Fame BBWAA Election Rules

I don’t think it’s entirely shocking to say that the official Hall of Fame election rules contain no language about the number of a candidate’s ballot appearances as part of a voter’s rubric. And yet, almost without fail, Hall of Fame ballot nominees must work their way up from below, slowly gaining ground and traction over the years to crack the 75 percent threshold at a later date. I’m not here to debate the wisdom of that wrinkle. I’m more interested in seeing if it’s possible to shed a little more light on how the voters separate those worthy of third-ballot entry from those who manage to sneak in on their 15th.

Of course, the basic underlying idea is that the most elite members of baseball history deserve quicker entry, which leads to an artificial hierarchy that separates first ballot entrants from those deemed less qualified. But how does the BBWAA make that decision? How do the voters filter the select few worthy of the inner circle from the rest of the general Hall of Fame population?

(A technical note that applies to the rest of this article: I’m going to use a data set that includes all BBWAA-elected Hall of Famers from 1968 to the present. Nothing before 1968, and no Hall of Famers elected by the Veterans Committee.)

The obvious place to start would be total career WAR. A quick perusal of first balloters turns up a reasonably complete who’s who of the 100 WAR plateau: guys like Stan Musial, Al Kaline, Willie Mays, Nolan Ryan and a handful of others. The numbers back this idea.

                         Batters   Pitchers
First Ballot HoFers      93.9      83.7
2nd-15th Ballot HoFers   67.0      61.6
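For anyone who wants to reproduce this kind of split, here is a minimal sketch. It assumes the figures in the table are group averages of career WAR, which the table itself does not state. The player records and the `mean_war_by_group` helper are hypothetical placeholders, not the data set used in this article.

```python
from statistics import mean

# Hypothetical records: (name, ballots needed for election, career WAR).
# Placeholder values only; these are not the article's data.
players = [
    ("Player A", 1, 100.0),
    ("Player B", 1, 90.0),
    ("Player C", 4, 70.0),
    ("Player D", 15, 60.0),
]


def mean_war_by_group(records):
    """Return (first-ballot mean WAR, later-ballot mean WAR)."""
    first = [war for _, ballots, war in records if ballots == 1]
    later = [war for _, ballots, war in records if ballots > 1]
    return mean(first), mean(later)


first_avg, later_avg = mean_war_by_group(players)
print(f"First ballot: {first_avg:.1f}, 2nd-15th ballot: {later_avg:.1f}")
```

Running the same function separately on batters and pitchers would fill in the two columns.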

If we throw every Hall of Fame batter onto a graph, with the years they spent on a BBWAA ballot on one axis and their total career WAR on the other, we get a similar grouping, where the first-ballot Hall of Famers mostly sit on a different tier than the rest. More interestingly, another pattern emerges, or more accurately, a lack of one: there doesn’t seem to be much of a pattern after the first ballot, and that could be an interesting avenue for a more in-depth look later on.

So far so good, but this just corroborates pretty much everything that the BBWAA has all but officially said—better players get in more quickly. But how does the BBWAA define “better”? What statistics do they value over others, when it comes to judging player value? And what about more traditional statistics? A large majority of the voting bloc appear to be members of the old guard, seemingly more likely to champion metrics like pitcher wins and batting average over WAR and wOBA. Do the data agree?

Rather than a simple comparison of first balloters against the rest, I found correlation coefficients between the number of years a Hall of Famer spent on the ballot and various statistics, traditional and sabermetric. To put it simply, the correlation coefficient ranges from negative one to one. If it’s zero, the variable has no linear relation to a player’s number of years on the ballot. A coefficient of one or negative one indicates a perfect linear relationship. Rate stats and counting stats act differently here, so it’s important to compare within each group, but not across groups. Compare batting average to on-base percentage and RBI to WAR, but not a rate stat to a counting stat, because playing time is valued fairly highly among the voters. First up, batters. I’ll list rate stats first, followed by counting stats.

Stat    Correlation Coefficient (R)
AVG      0.06
OBP      0.17
SLG     -0.07
ISO     -0.09
wOBA    -0.05
H        0.45
HR       0.14
RBI      0.23
SB       0.27
WAR      0.45

And for pitchers.

Stat    Correlation Coefficient (R)
K/9      0.09
BB/9    -0.04
HR/9     0.19
ERA      0.01
FIP     -0.08
Wins     0.41
WAR      0.32
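For readers who want to check figures like the ones in the two tables above, here is a minimal sketch of Pearson's r in plain Python. It assumes the coefficients are ordinary Pearson correlations between ballot years and each stat, which is the usual reading of "R" but is not spelled out in the text. The input lists are made-up placeholders, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the root sum of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical inputs: years on the ballot and career hits for five players.
ballot_years = [1, 1, 3, 9, 15]
career_hits = [3400, 3100, 2900, 2500, 2300]

r = pearson_r(ballot_years, career_hits)
print(f"r = {r:.2f}")
```

With ballot years used directly, a negative r means a higher total goes with a quicker election. The sign convention behind the tables is not stated, so compare magnitudes with that in mind.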

Traditional stats seem to be slightly favored across the board, although WAR certainly puts up a good fight. Intriguingly, the BBWAA has historically favored on-base percentage over batting average or slugging percentage, which seems to indicate that OBP may have been a larger part of how people internally judged player value before sabermetrics enjoyed a sharp rise in popularity over the last 10 to 20 years or so. On the pitching side of things, however, wins are still the leading indicator of ballot years, compared to a set of every pitching counting statistic I could find.

I’d love to see how these figures change over the decades, as more and more sabermetric-oriented writers are given the right to vote. Will the BBWAA’s voting practices change accordingly?

References & Resources

All statistics from FanGraphs, with the exception of Hall of Fame ballot years, which were pulled from Baseball-Reference. Special thanks to commenter Jeremy, who inspired this article from a comment in my last piece.

Comments

I think for HOF analysis it is interesting to consider WAA as well as WAR, to see how voters value short but amazing production vs. steady but lower production. We might see some interesting patterns emerge about the relative perceived value of peak vs. duration by adding WAA to the mix.

I would suggest using xER and strikeouts as measures for pitchers too. xER = IP*(lgERA - ERA); it basically counts how many earned runs above average a pitcher prevented. Another pitching counting stat you could use is (IP*ERA+)/100.
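A quick sketch of both suggested stats, with one assumption flagged: ERA is expressed per nine innings, so the version below divides by 9 to land on actual earned runs, a factor the formula as written in the comment leaves out. The function names and the career line are hypothetical.

```python
def runs_prevented(ip, era, lg_era):
    """Earned runs prevented versus a league-average pitcher over `ip` innings.

    ERA is per nine innings, so the ERA gap is scaled by ip / 9.
    Assumption: the comment's IP*(lgERA - ERA) omits this factor.
    """
    return ip * (lg_era - era) / 9.0


def era_plus_innings(ip, era_plus):
    """The second suggested counting stat: (IP * ERA+) / 100."""
    return ip * era_plus / 100.0


# Hypothetical career line, not a real pitcher.
xer = runs_prevented(ip=3000.0, era=3.00, lg_era=3.90)
weighted_ip = era_plus_innings(ip=3000.0, era_plus=130)
print(f"xER: {xer:.1f}, ERA+-weighted innings: {weighted_ip:.1f}")
```

Both are counting stats, so they would belong alongside wins and WAR in the comparison rather than with the rate stats.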

The reason counting stats are favored would seem to be that high counting stats are correlated with long careers.

On the whole, I would think that the reason OBP is favored over AVG is that AVG is anti-correlated with power, as it depends on not striking out.

For pitchers, I have trouble believing that the voters consider HR/9 more important than K/9. I think that might be a fluke of some sort in the data, or maybe something that is masking some other factor.

I think you really need to adjust these stats by historical era to evaluate their impact on voting. The reason K/9 appears unimportant, for example, is probably because K rates have risen steadily over the years. Bob Feller and Ron Darling, for example, had the same K/9.
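One simple way to do that kind of era adjustment is to index each pitcher's rate to the league rate of his time, in the style of a "plus" stat. The sketch below is only an illustration: `k9_plus` is a hypothetical helper, and the league figures are placeholders rather than real league averages.

```python
def k9_plus(k9, league_k9):
    """K/9 relative to the league K/9 of the same era; 100 is league average."""
    if league_k9 <= 0:
        raise ValueError("league K/9 must be positive")
    return 100.0 * k9 / league_k9


# Hypothetical: two pitchers with the same raw K/9 in different environments.
low_strikeout_era = k9_plus(k9=6.0, league_k9=4.0)
high_strikeout_era = k9_plus(k9=6.0, league_k9=6.0)
print(low_strikeout_era, high_strikeout_era)
```

Correlating ballot years with the indexed value instead of raw K/9 would show whether the weak relationship is really an artifact of rising strikeout rates.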

In order to make this kind of predictive inference about the qualities associated with HoF election, it seems important to also include candidates who didn’t get in—much like the basketball reference HoF odds tool.

Just spitballing a little bit here: you could use as an inclusion criterion either “appeared on any BBWAA HoF ballot” or maybe something like “appeared on five ballots” to set the bar a little higher, and assign the candidates who were never elected an elected-on-nth-ballot value of, say, 30. Or you could reverse the order of the x-axis, and have “not elected” at 0, “elected on 15th ballot” at 1, … up to “elected on 1st ballot” at 15.

As currently constructed, you’re saying something to the effect of “If Larry Walker were eventually elected to the HoF, then this is how many ballots it probably took.” And it seems like the first part of that statement is a much bigger question than the second.
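The two encodings floated above for candidates who never got in could be sketched as follows. The function names are hypothetical, `None` stands in for "not elected", and the value 30 is just the placeholder from the comment.

```python
MAX_BALLOTS = 15
NOT_ELECTED_SENTINEL = 30  # the comment's "say, 30"; an arbitrary choice


def sentinel_encoding(elected_on):
    """Ballot number of election, or a large sentinel if never elected."""
    return NOT_ELECTED_SENTINEL if elected_on is None else elected_on


def reversed_encoding(elected_on):
    """0 for not elected, 1 for the 15th ballot, up to 15 for the 1st ballot."""
    if elected_on is None:
        return 0
    if not 1 <= elected_on <= MAX_BALLOTS:
        raise ValueError("ballot number out of range")
    return MAX_BALLOTS + 1 - elected_on


# Hypothetical candidates: elected on the 1st ballot, the 15th, and never.
for outcome in (1, 15, None):
    print(outcome, sentinel_encoding(outcome), reversed_encoding(outcome))
```

Either way, the correlations would then reflect the whole ballot population, though the size of the gap assigned to "not elected" is a modeling choice that affects the result.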