Monday, January 20, 2014

A slow news day can be a painful thing. On Thursday, January 16, it resulted in something very dangerous indeed: the baseball media commenting on the Oscar nominations, announced at 8:30 that morning. While I'm already not that curious what Dan Shaughnessy has to say about American Hustle, what actually happened was arguably worse: writers began to draw parallels between the Academy Award voting process and the Baseball Hall of Fame. A sampling:

I hope @MrBrianKenny can work up some righteous indignation about these Oscar nominations. Not seeing a lot of transparency in process!

A few weeks after the Hall of Fame election unleashed perhaps baseball's greatest internecine vitriol yet, some BBWAA members were clearly still smarting. The Hall election that they administer has come under attack for both its results and its process, and many voting members got defensive about it. What this Oscar comparison revealed is that they clearly think they are being singled out for criticism—with the implication that this is unfair. The BBWAA's sins are no worse than any other imperfect voting system, they believe, yet baseball's zealots attack them with unprecedented nastiness that's disproportional to the severity of the crime. Why can't we all just get along like those nice people in Hollywood?

It's hogwash. Anyone who uses the Oscars (the Oscars!) as an example in their argument that the Hall of Fame gets exposed to an undue amount of criticism is not a very close watcher of Hollywood's awards season.

The Oscars, of course, are second-guessed all the time—not just by fans, but by the media and blogosphere too. People are still fuming about Crash's defeat of Brokeback Mountain for Best Picture of 2005, and accusations of racism flew after Meryl Streep defeated Viola Davis for Best Actress just two ceremonies ago.

But that's just the results.

People in the Oscar-watching business have the same complaints about their voting process as baseball observers do about the BBWAA's. The difference has been—are you sitting down for this?—the Academy has actually listened and changed its voting process in response.

Probably the best example of this happened two years ago when controversy erupted over the Best Original Song category. Across all of 2011, the Academy's music branch found just two songs worthy of nominations, despite a long list of excellent qualifiers. Having just two nominees in a category, at an awards show more accustomed to five, was an embarrassment for the Academy, and sharp criticism of the music branch's convoluted nomination system flew in from all sides. Within a few short months, the Academy announced that it would throw out the old process and switch to a simple new method: the five songs with the most votes would get nominations. It wasn't hard, and no one felt like it undermined tradition. The following year, Adele's "Skyfall" bested four other solid nominees in the category.

What's that? Best Original Song doesn't have the same prestige and tradition around it as, say, Best Picture? Well, the Academy has futzed with that award too—probably more than any other. Prior to the 2009 Oscars, the Academy stirred the pot by announcing it would double the number of Best Picture nominees—and change their method of choosing a winner from first-past-the-post to instant-runoff voting. The changes were a response to criticism that smaller, independent films with strong but not widespread followings were being crowded out of the Best Picture race by the Hillary Clintons and Mitt Romneys of the race that had the resources to compete. The Oscars had always been a competition with five contestants—not only for Best Picture, but to this day that remains the number of nominees in almost every Oscar category. Yet the Academy wasn't afraid to mess with tradition to make their process better.

Two years later, the Academy tweaked the process again based on feedback and perceived issues with the new Best Picture experiment. Currently, instead of forcing 10 Best Picture nominees in a year that might not merit it, the Oscars' nomination system is flexible enough to produce anywhere from five to 10 nominees. The modified instant-runoff voting system is complex but ingenious. Imagine that—a group always striving to improve its election process, even at the edges. The Academy has a very instructive lesson for the BBWAA here: if you change your process and don't like it, you're not stuck with it. Every change can be a learning experience, and you have as many tries as you need to get it right.

Finally, the Oscars are continually adding and reinventing categories as needed. The award for Best Makeup was created in 1981 as a direct result of the outcry over the lack of recognition for The Elephant Man (1980), which so many considered an obviously award-worthy makeup job. An award for Best Animated Feature was added for 2001 in order to address the Academy's longstanding anti-animation bias. Today, there are movements afoot to add categories such as Best Casting, to split categories such as Best Makeup and Hairstyling, and to combine categories such as Best Sound Mixing and Best Sound Editing.

So, no, the Hall of Fame is not the only non-political election that people complain about. Within the baseball universe, people hear incessant complaints about baseball's election process; in the blogosphere of Oscar watching, the chorus for change is equally loud. So it is within any industry, really, from similar crowd-pleasing rules changes at the Emmys to choosing the next monarch of England. The BBWAA simply thinks it is alone in receiving criticism because its members are not plugged into any of these other circles. Maybe if they were, they'd learn a valuable lesson: other organizations bend under the weight of public pressure to improve, and almost none are worse off for doing so. If the BBWAA is upset because it feels like it is the only election that people hurl their invective at, maybe its members should consider that theirs is also the only election that refuses to evolve.

Monday, January 13, 2014

New year, new policy initiatives on the state level. Over the coming weeks and months, governors across the country will give their State of the State or budget addresses laying out their agenda for 2014. Like the national State of the Union address, these speeches are often critical to learning about legislative priorities and the political circumstances of the governor. As close readers know, I believe that most policy that truly affects people is made on the state level—and so I believe state politics is absolutely essential to pay attention to.

As I do every year, then, here's a list of the dates of each state's State of the State address for 2014—updated in real time as dates are announced. Tune in for an unbeatably rich appreciation of local politics; alternatively, after each speech is delivered, I'll link to the transcript on this page as well.

Saturday, January 11, 2014

The Hall of Fame fever has already broken—the baseball world moved on to Alex Rodriguez today—but I want to get in a final word before we all forget everything we've learned and do it all over again in 2015.

There was a lot of anger this week. There were people saying anyone who didn't vote for Greg Maddux should lose their right to vote. After one writer actually was stripped of his right to vote, the liberal writers and bloggers who condoned the Deadspin poll suggested that nonserious voters like Murray Chass get their votes revoked instead. And I've seen it written that the Hall of Fame needs to step outside its current voting process and form a committee to get PED users like Barry Bonds and Roger Clemens into the Hall, because the current process clearly won't be voting them in any time soon.

I want to be 100% clear: as I've said many times, I believe Bonds and Clemens are Hall of Famers. I don't believe players should be banned for life from the Hall of Fame because they committed offenses that weren't even enough to get them banned a single game when they played. (For those who played during the drug-testing era, I believe that MLB's clearly delineated penalties of 50 games for a first offense, 100 for a second, and life for a third set the ground rules for a player's eligibility. If someone fails three drug tests, he's banned for life, including from the Hall. If he fails fewer than three, he has served his time and all should be forgiven.) I also believe that Murray Chass, Ken Gurnick, and others were wrong and destructive to have voted the way they did: opting for Jack Morris over Greg Maddux, choosing to leave slots on their ballots open when there are up to 20 Hall-worthy players, and so on.

But, to be clear, I also believe that millions of Americans were wrong when they went into the voting booth on November 6, 2012, and voted against my preferred candidate. Likewise, I hate that a small minority of people don't treat their vote with the respect that I, as a lover of politics and a junkie for government, believe it deserves. People make silly write-in votes, or they vote for third parties in extremely close and consequential elections. But that's their right; people disagree sometimes. And I don't support efforts to change the way we elect presidents because other people happened to be allowed to vote for someone else—and certainly not because a small minority of people acted really stupidly. (That's always going to happen.)

Therefore, while I sympathize deeply with those in baseball who are frustrated with the BBWAA's incompetence (in my opinion) at electing worthy candidates, I urge my compatriots to tone it down a bit and think about the reforms they're proposing. The key phrase in the previous sentence is "in my opinion"; other people can have different ones, and we need to accept that. I don't think I'm any less flabbergasted than any of you at the ignorance that some voters put on display, and I agree that they'll probably never come around and that that's unacceptable. But think about Rand Paul or Elizabeth Warren or whoever might be your own ideological opposite; they probably drive you up a tree too. "How can he THINK THAT?!?" you've probably yelled at the TV. Well, because this is America.

We have to be rational enough to draw the line between process and results. If we don't like the results—even if we find the results completely maddening, irrational, and corrupt—we can't automatically think the process must also change. It's easy to think we've crossed over that line where the ends justify the means: "Barry Bonds is so obviously Hall-worthy that any process that doesn't elect him is necessarily wrong." Except that's a big problem, because 65.3% of BBWAA voters—and probably a comparable percentage of the public—don't agree. Extend that logic to politics—"Ron Paul is so obviously the best candidate for president that any process that doesn't elect him is necessarily wrong"—and the undemocratic nature of that kind of comment becomes clear.

(It's this kind of logic that has led to laws, such as voter-ID laws, that load the dice in favor of one party over another. Both sides have historically been guilty of trying to change the rules because they so desperately believe decision-making power must be taken out of the hands of those who disagree with them. I oppose these laws even more than I oppose laws that I'm ideologically against because they specifically undermine what should be the bipartisan priority of fairness.) People do have a right to their opinion, and saying so isn't a squishy way to evade the issue. It's a reality we have to deal with. Instead of trying to oppress others, it's something we have to learn to adapt to. That's the only way we can, hopefully, move on to the stage of trying to persuade others to our side and return to a productive dialogue.

Make no mistake: authority must be flexible, and so some rules must sometimes change. The Constitution has to be a living document, and the BBWAA bylaws must change with the times. That's why I support changes that vast majorities of the BBWAA (say, two-thirds or three-fourths) can agree on, just like we allow our Constitution to be changed via the amendment process. I believe eliminating the "you can vote for a maximum of 10 players" rule falls into this category; I don't think I've seen a single baseball writer still defending that rule. But when it comes to disqualifying certain voters for the contents of their votes (no matter how disrespectful) or forming committees that circumvent a majority, we should not even be considering it. It would, quite literally, be the BBWAA version of tyranny.

The exit polls deviated from the actual results by an average of 3.3 percentage points; the median error was 2.9 points. They had some big misses: Tim Raines, whom they overestimated by 8.2 points; Barry Bonds, whom they overestimated by 7.6; and Curt Schilling, whom they overestimated by 7.3. But the exit polls, for the first time ever, also nailed two players' vote totals exactly: Morris and Sammy Sosa.

Otherwise, many of the deviations from the exit polls occurred in predictable directions, if not predictable magnitudes. That was the rationale behind my Hall of Fame vote-projection model, which sought to apply an adjustment to the exit polls based on their margins of error in previous years. (For example, Lee Smith is always underestimated by exit polls, while Raines is always overestimated; this is due to the nature of which voters do not release their ballots.)

My model was better at predicting the final vote than the raw exit polls, although not by as much as it could have been. The average error of my projections was 3.0 points; the median deviation was 2.6 points. Due to the nature of the analysis, my biggest misses were on first-time candidates who had no polling-error history to go off. Using a pseudoscientific process that looked at the performances of similar players, I created adjustment factors for these candidates as well, though in retrospect this probably did more harm than good. I overestimated Mike Mussina by 8.4 points, overestimated Tom Glavine by 6.1 points, and underestimated Jeff Kent by 6.4 points. It was a valiant effort, but I think I'll be scrapping this method of predicting first-time candidates in 2015.

Remove the five first-timers, though, and my basic prediction method actually validated quite well. Among the 17 candidates with vote histories to go off, my average deviation was a much better 2.3 points, and my median error was all the way down to 1.6 points. The only two who gave me trouble were Schilling and Morris, and no calculation method could have nailed the very unusual performances of these two on Hall Election Day. Morris, as mentioned above, hit his exit-poll percentage exactly, which is very uncharacteristic for him; he has always outperformed his polls, and by bigger and bigger margins every year. Schilling, on the other hand, had the opposite problem; he had only one year of data to base a prediction on, so there simply wasn't enough information to know how nonpublic voters truly felt about him.

Nevertheless, there are always ways to improve, and we should never stop looking for them. This post mortem would be pointless if it didn't answer the question, "How can I make this model better next year?"

One excellent idea, suggested to me by multiple people, is to calculate the adjustment factors differently—more specifically, to weight more recent data more heavily, rather than relying on a simple average. This makes a lot of sense, and I was enthusiastic about this idea until I took a closer look after the election. Comparing my actual model with an alternative one in which the 2013 error counted for more revealed that recent error was no more predictive than error over time! Out of the 11 2014 candidates who had more than one year of past voting data, six actually moved further away from their ultimate vote total using the alternative, weighted model. In other words, although clearly neither is perfect, the long-term average of a player's deviation from the polls is more predictive than what they did in the year prior.
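For anyone who wants to try this comparison at home, here is a minimal sketch of the two ways to compute an adjustment factor. The linear weights (1, 2, 3, ... from oldest to newest) and the sample errors are assumptions for illustration only; they are not the weights or the data I actually tested.

```python
def simple_average(errors):
    """Plain mean of a player's past polling errors (actual % minus exit-poll %)."""
    return sum(errors) / len(errors)


def recency_weighted_average(errors, weights=None):
    """Weighted mean of past polling errors, ordered oldest to newest.

    By default the weights are 1, 2, ..., n, so the most recent year counts
    the most. That weighting scheme is an assumption made for this sketch.
    """
    if weights is None:
        weights = range(1, len(errors) + 1)
    weights = list(weights)
    if len(weights) != len(errors):
        raise ValueError("need exactly one weight per error")
    return sum(w * e for w, e in zip(weights, errors)) / sum(weights)


def closer_model(simple, weighted, actual_error):
    """Report which adjustment factor landed nearer the error that occurred."""
    if abs(simple - actual_error) <= abs(weighted - actual_error):
        return "simple"
    return "weighted"


# Hypothetical polling errors for one player, oldest year first.
hypothetical_errors = [4.0, 2.0, 6.0]
simple = simple_average(hypothetical_errors)
weighted = recency_weighted_average(hypothetical_errors)
```

Running `closer_model` for every returning candidate and counting the results is one way to reproduce the kind of head-to-head tally described above.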

However, this just proves that new data aren't any better than an entire data set; it doesn't say anything about the oldest data, and when they start to go bad. Next year I'll have the choice whether to calculate adjustment factors from two-year polling error, three-year polling error, four-year polling error, or even five-year polling error. Using numbers from 2010 to predict something about voting in 2015 definitely seems like a bit of a stretch; could using four-year-old numbers, like I did this year, also be using information that is past its expiration date?

I recalculated my projections using adjustment factors averaged from just the past three years of voting history and made a startling discovery: the average error dropped to 2.9 points, and the median error plummeted to 2.1. Using three-year polling error this year would have produced more accurate results, suggesting that four-year-old data have indeed outlived their usefulness. (Curious to see if we could do even better, I also tested a version that used an average of just the previous two years; this produced an equivalent average error to the four-year average—3.0 points—but a lower median error—2.2 points. That's still worse than the three-year calculation, though, indicating that three years is the sweet spot.) Therefore, I will use three-year calculations for my 2015 projections—accounting for the exit-poll error in 2012, 2013, and 2014 only.
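The window comparison can be sketched in a few lines. The function name, the dictionary layout, and the sample errors below are all hypothetical; only the idea of averaging the most recent N years of polling error comes from the analysis above.

```python
def windowed_factor(errors_by_year, target_year, window):
    """Average a player's polling error over the `window` years before target_year.

    errors_by_year maps an election year to (actual % minus exit-poll %).
    Years with no data are skipped; returns None if nothing falls in the window.
    """
    years = [
        year
        for year in range(target_year - window, target_year)
        if year in errors_by_year
    ]
    if not years:
        return None
    return sum(errors_by_year[year] for year in years) / len(years)


# Hypothetical polling errors for one long-tenured candidate.
hypothetical_history = {2010: 9.0, 2011: 3.0, 2012: 4.0, 2013: 5.0}

four_year = windowed_factor(hypothetical_history, 2014, window=4)
three_year = windowed_factor(hypothetical_history, 2014, window=3)
two_year = windowed_factor(hypothetical_history, 2014, window=2)
```

With `target_year=2015` and `window=3`, the same function would draw on 2012, 2013, and 2014 only, which is the plan for next year.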

Another tweak I will make to the experiment will be a subtle, yet important, change in how I calculate the polling error itself. Currently, I subtract a player's percentage on public ballots (the polls) from his percentage on all ballots (the actual vote). However, it would be more precise to subtract his percentage on public ballots from his percentage on private ballots (i.e., all ballots minus public ballots). This is because "turnout" for the exit polls (i.e., the number of ballots publicly revealed) has increased dramatically in recent years. In 2012, we knew of only 114 public ballots out of 573 total (19.9%); in 2014, a whopping 208 people out of 571 (36.4%) revealed their ballots beforehand. This necessarily creates some error because, for instance, some of the +7.9% error for Jack Morris in 2012 was eaten into by the 94 additional ballots that were public in 2014 but private in 2012. Put another way, if someday 98% of voters make their ballots public before Election Day, my adjustment factors will be pretty useless; we'll already know that there can be only a minuscule amount of error in those polls. The very error represented by my adjustment factors by definition gets smaller with a larger sample size in @RRepoz's exit poll.
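To make the difference between the two error definitions concrete, here is a rough sketch. The ballot counts (114 public out of 573) are the 2012 figures quoted above; the vote counts for the player are invented for illustration, and the function names are my own.

```python
def pct(votes, ballots):
    """Share of ballots naming the player, as a percentage."""
    return 100.0 * votes / ballots


def error_vs_all_ballots(total_votes, total_ballots, public_votes, public_ballots):
    """Current method: percentage on all ballots minus percentage on public ballots."""
    return pct(total_votes, total_ballots) - pct(public_votes, public_ballots)


def error_vs_private_ballots(total_votes, total_ballots, public_votes, public_ballots):
    """Proposed method: percentage on private ballots minus percentage on public ballots."""
    private_ballots = total_ballots - public_ballots
    if private_ballots <= 0:
        raise ValueError("no private ballots to compare against")
    private_votes = total_votes - public_votes
    return pct(private_votes, private_ballots) - pct(public_votes, public_ballots)


# 2012 ballot counts from the text; the player's vote counts are hypothetical.
TOTAL_BALLOTS, PUBLIC_BALLOTS = 573, 114
hypothetical_total_votes, hypothetical_public_votes = 300, 50

old_error = error_vs_all_ballots(
    hypothetical_total_votes, TOTAL_BALLOTS, hypothetical_public_votes, PUBLIC_BALLOTS
)
new_error = error_vs_private_ballots(
    hypothetical_total_votes, TOTAL_BALLOTS, hypothetical_public_votes, PUBLIC_BALLOTS
)
```

Because the all-ballots percentage already includes the public ballots, the current method always understates the gap between public and private voters, and it understates it more as the public share grows. The private-ballot version does not depend on how many ballots happened to be revealed in a given year.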

Finally, one common denominator in both my projections' errors and the polls' errors was that we both guessed too high. Most players ended up getting fewer votes than we projected they would—whether because of the controversial Rule of 10, or old-school voters protesting the Steroid Era, or for some other reason. This suggests automatically building in a one- to two-point negative adjustment factor in addition to the player-specific one. However, a closer examination reveals that it is disproportionately the first-time candidates dragging the ballot in this direction; they over-performed in exit polls by an average of 3.7 points. Rather than try to devise my own method of projecting these fickle ballot rookies—an endeavor that has failed two years in a row now—perhaps I should simply dock them each a few points next year and call it a day.

Tuesday, January 7, 2014

In my last post predicting the outcome of tomorrow's Hall of Fame vote, I applied an "adjustment factor" to existing exit polls of the Hall election to arrive at projected vote totals for each player. However, there was one critical shortcoming of my method: because it was based on the historical error of past such "exit polls," there was no way to predict how first-time candidates will perform. In 2014, Greg Maddux, Tom Glavine, Frank Thomas, Mike Mussina, and Jeff Kent are on the ballot for the first time, and we have no clue how they will fare beyond their current polling numbers. Will they, like so many players before them, over- or under-perform those polls by a statistically significant degree?

Last year, I attempted a linkage analysis to answer this question, but it was inconclusive. This year, we try something simpler; while we can't look at actual historical results for these players, we can look at historical results for players who are similar to them. We have trustworthy exit-poll data for each of the past four Hall of Fame elections; here is how each player's final vote total differed from his exit poll, sorted by his year on the ballot.

In the far-right column, you can see the average deviations for individual players that we've already used as adjustment factors for the 2014 vote. In the bottom-most row are average deviations not by player, but for specific years on the ballot. A clear pattern jumps out: candidates under-perform their exit polls in the early stages of their candidacy, but, after their sixth year on the ballot, the switch is flipped—they begin to over-perform their expected totals.

This makes logical sense. The voters who aren't covered by the exit polls tend to be more conservative; they're writers who are no longer covering baseball or don't believe in the transparency of releasing their ballot publicly. Retired voters especially are likely to prefer players they witnessed and covered—too curmudgeonly, or not familiar enough with more recent players, to recognize their greatness. There is also probably a "distance makes the heart grow fonder" aspect to it; it's easier to misremember players who played 15 or 20 years ago as better than they were.

So already we have a useful tidbit of information. Players who are appearing on the Hall ballot for the first time tend to do a little bit worse than polling suggests—an average of 2.0 points worse. That's an adjustment we can apply to all first-time candidates.

But, other than all being first-timers, Maddux, Glavine, Thomas, Mussina, and Kent have little else in common. To learn more, we must separate players into categories. Let's start with the simplest: hitters and pitchers.

The average for all the exit-poll deviations for each category is in that category's lower-right corner, in the highlighted cell. Pitchers, it turns out, are looked upon more favorably than hitters by voters not accounted for in the exit polls. But we can get even more specific, by breaking the players down by position:

Relief pitchers, with their gaudy totals of the deeply flawed save statistic, are the most beloved by the "old-school" nonpublic voters; they get an average boost of 6.1 percentage points above and beyond their polling numbers. Starters are no slouches either, though; they earn a 2.2-point boost. Most offensive positions do not have significant changes, and some even come with sharp penalties—notably the middle infield (though it is worth noting that the sample size we're calculating from is very small). These numbers are, at worst, an indicator of which direction (up or down) we can expect a first-time candidate to move; at best, they're position-specific adjustment factors.

What other categories can we separate players into? Well, since this is Hall of Fame voting, we'd be remiss not to compare PED users and PED non-users. Since voters also penalize some players for suspected PED use—despite little to no evidence for it—I'll also put them into a category of their own. Apologies in advance for these categories; they're necessarily subjective, because most of the Steroid Era is based on hearsay.

Unsurprisingly, nonpublic voters are not kind to PED users. Exit polls typically overestimate them by 2.5 points. Suspected users Mike Piazza and Jeff Bagwell are also docked some points, though not as many; there is probably less universal condemnation of them even among conservative voters. "Clean" players (and I put this in scare quotes because there really is no way to know whether they were truly PED-free) are actually slightly underestimated by exit polls.

(There could be some contamination in these data, though. "Clean" players are also exclusively the players who have been on the ballot for eight years or more. Old-school voters might be voting for them at higher rates because of the age bias discussed earlier; alternatively, maybe the age bias is due to retired voters' aversion to steroids. That said, Mark McGwire's evolving margin of error suggests that time may heal all wounds. From a –13.2% adjustment factor in his fourth year on the ballot, McGwire cut that to –0.7% in his fifth year, and in his sixth and seventh years he has actually benefited from nonpublic ballots. This may be a ray of hope for steroids-tainted candidates; voters' nostalgia may win out after they get tired of feeling vindictive. That will certainly be something to watch for in Clemens's and Bonds's numbers this year.)

There's one other way we can categorize players that's been nagging at me: by race. Despite its best efforts since integration, baseball has had trouble becoming a truly colorblind sport, still questionably throwing around terms like "horse" and "scrappy." I wondered if old-school voters' dislike for Tim Raines or Barry Larkin had anything more to it than the types of players they were.

Without assigning blame or making accusations, the numbers do suggest that nonpublic ballots are skewed toward white players. White players gain an average of 1.4 points on top of their exit polls, while African Americans lose 1.1 points and Hispanics lose 1.0. There are certainly prominent exceptions within each race (e.g., Lee Smith), and thus I think factors like position are more predictive. However, it is not inconceivable that a strand of racism remains among the BBWAA's oldest and crustiest members.

So, finally, let's apply these findings to our first-time candidates. Through some quick-and-dirty arithmetic, we can arrive at a back-of-the-napkin adjustment factor for each of them. These will be incorporated into my final Hall of Fame election forecast on Wednesday, but the crude nature of this analysis will mean there's a greater margin of error for these adjustment factors than for the ones based on historical fact. Without further ado:

Greg Maddux
–2.0% (first-time candidate) + 2.2% (starting pitcher) + 0.7% (clean) + 1.4% (white) = adjustment factor of +2.3% [note: because Maddux is so close to 100%, we cannot add the full 2.3 points; his new vote projection will simply bring his total as high as it can be given that we know there is one ballot against him]
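The quick-and-dirty arithmetic is just a sum of the category adjustments, with a ceiling for a candidate polling near 100%. In this sketch the four components are the ones listed for Maddux; his exit-poll percentage (99.5) and the expected number of ballots (570) are placeholders I made up, and the function names are hypothetical.

```python
def first_timer_adjustment(components):
    """Add up the category-based adjustments, rounded to one decimal place."""
    return round(sum(components.values()), 1)


def capped_projection(poll_pct, adjustment, known_no_votes, expected_ballots):
    """Apply the adjustment, but never exceed what known 'no' ballots allow."""
    ceiling = 100.0 * (expected_ballots - known_no_votes) / expected_ballots
    return min(poll_pct + adjustment, ceiling)


maddux_components = {
    "first-time candidate": -2.0,
    "starting pitcher": 2.2,
    "clean": 0.7,
    "white": 1.4,
}
maddux_adjustment = first_timer_adjustment(maddux_components)

# 99.5 and 570 are placeholder values, not real polling or ballot figures.
maddux_projection = capped_projection(
    99.5, maddux_adjustment, known_no_votes=1, expected_ballots=570
)
```

Summing the components treats the categories as independent of one another, which is a strong simplification given how small each category's sample is.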

Thursday, January 2, 2014

In my opinion, the most useful contributions to Baseball Hall of Fame voting and the attendant debate are the ballot aggregators of Twitter users @RRepoz and @leokitty. The former runs the comprehensive HOF Ballot Collecting Gizmo, while the latter maintains a Google spreadsheet with each individual ballot detailed. Together, they are the Hall of Fame equivalent of exit polling an election.

However, as those who work in politics know, every poll has a margin of error. They can even be flat wrong—remember when aides were calling John Kerry "Mr. President" after looking at the first wave of exits in 2004? In this case, these Hall of Fame polls are at one big disadvantage to the generally sound practice of political polling: they aren't representative, as scientific polls are made to be by weighting.

Through no fault of these aggregators, Hall of Fame exit polling is by definition skewed toward a self-selected pool: BBWAA members who are willing to make their ballots public. This tends to include more progressive scribes: those who value transparency, and not those who stopped covering baseball 20 years ago (these retired reporters may not even have an outlet to publish their Hall of Fame column even if they wanted to write one). In political terms, the poll over-represents certain demographics and undercounts certain other populations who still vote in high numbers.

Therefore, if you quote these aggregators' raw numbers as direct predictions of final vote totals—as many people on Twitter seem to be doing—you're going to be in for a surprise on January 8, when full results are announced. It's just as dangerous as relying on unweighted polling numbers in politics.

What we need to predict the Hall of Fame is more than a flawed poll—it's a model, of the sort used with great success by Nate Silver in the past few elections. Except ours is much simpler—all we have to do, in pollster terms, is tweak the numbers based on where past polls have historically fallen short.

You could say what we're doing here is the baseball version of UnskewedPolls.com, the ill-fated conservative alternative to FiveThirtyEight that manipulated 2012 polls and convinced many that Mitt Romney would actually win the election. Well, I prefer to think of it as the work any pollster must do to refine his or her raw data into a releasable scientific survey: weighting the numbers based on known facts and sound logic to get a representative sample.

Below is a comparison of the final exit polls for last year's Hall of Fame election and the actual results. We're using @RRepoz's polling here, since his sample of 194 ballots out of an eventual 569 cast (34.1%) was larger than @leokitty's:

You can see that the polls significantly short-changed certain players and over-hyped certain others. A similar discrepancy exists in the 2012 Hall of Fame exit polls, for which we use @leokitty's data. Her sample size that year was 114 out of 573 ballots cast (19.9%):

In 2011, @leokitty captured 122 votes out of an eventual 581 (21.0%):

And, finally, for 2010, @leokitty had the biggest available sample, 92 out of 539 ballots (17.1%):

Four years of data should be sufficient for our purposes, especially since a clear pattern has emerged. The exit polls understate support for "old-school" candidates like Jack Morris, Lee Smith, and Don Mattingly. They overstate support for more subtle greatness—especially Tim Raines—and controversial candidates like Barry Bonds and Roger Clemens.

Each individual is over- or under-sampled by different degrees, however. A simple average of each player's "margin of error" in the past four elections yields an adjustment factor that we can apply to this year's exit polling. The chart below extrapolates each candidate's projected vote share for 2014 based on this adjustment factor and @RRepoz's current (updated as of 9pm ET on January 5) exit polls.

(Note that, unfortunately, these projections don't work on first-time candidates, because there is no vote history to calculate an adjustment factor from. In these cases, I've left their projected vote totals as they are in the polls. However, a future post here will explore ways to guesstimate adjustment factors for these players as well.)
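In code, the whole model boils down to a few lines. This is only a sketch of the method as described above: the function names are mine, and the past errors in the example are invented rather than taken from the actual exit-poll tables. The 60.3% figure is the poll number for Jack Morris cited in this post.

```python
from statistics import mean


def adjustment_factor(past_errors):
    """Simple average of a player's past errors (actual % minus exit-poll %)."""
    return mean(past_errors)


def project(poll_pct, past_errors):
    """Shift the current exit-poll percentage by the adjustment factor.

    First-time candidates have no past errors, so their poll number is
    returned unchanged. The result is clamped to the 0-100 range.
    """
    if not past_errors:
        return poll_pct
    return max(0.0, min(100.0, poll_pct + adjustment_factor(past_errors)))


# Hypothetical past errors for an "old-school" candidate polling at 60.3%.
hypothetical_errors = [6.0, 8.0, 7.0, 5.0]
hypothetical_projection = project(60.3, hypothetical_errors)

# A first-timer keeps his raw polling number.
first_timer_projection = project(70.0, [])
```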

The model projects that four candidates will be elected to the Hall this year: Greg Maddux, Tom Glavine, Frank Thomas, and, in a squeaker, Craig Biggio. Falling short, thanks in part to his negative adjustment factor, will be Mike Piazza. Meanwhile, the top candidate likely to benefit from the old-school boost, Jack Morris, simply has too much ground to make up. He's currently polling at 60.3%, and while he is almost certain to get more than that on Wednesday, a 14.7-point bump would be an unprecedented margin of error. Falling off for another reason could be Rafael Palmeiro, who looks like he'll survive according to the raw numbers, but his adjustment puts him under 5%. While polls indicate Don Mattingly is in real danger of dropping off, precedent suggests he'll get a big boost on Election Day.

Of course, even a few days out, it's still fairly early in the voting process. @RRepoz has aggregated just 131 ballots; if history is any guide, more will soon be publicly released, which will increase the accuracy of exit polls and thus improve these projections. Stay tuned to Baseballot on Twitter to get daily updates as we count down the days to January 8.