Overthinking It: Neither a Berroa Nor a [Ver]lander Be

A quick glance at the recent fortunes of last year’s Rookie of the Year hopefuls reveals that bringing home the hardware is far from a guarantor of future success. Of those receiving votes in the American League, victor Andrew Bailey, runner-up Elvis Andrus, and fourth-place finisher Jeff Niemann have used their successful starts as springboards for continued or heightened success. However, 2010 has been less kind to the other AL tyros who earned some November lovin’ from the BBWAA. A mere eight months ago, Rick Porcello was mowing down the Twins in the Metrodome as the Tigers’ season entered overtime; now he finds himself sporting an ERA over five (Skill-Interactive and otherwise). Gordon Beckham and his TAv below the terrible twos (.196, to be precise) have earned Ozzie Guillen’s ire (a distinction shared by many), and Brett Anderson, all but christened Cy Youngster prior to the season, may have pitched his last.

The “sophomore slump” (as opposed to the sophomoric slump) entered the baseball lexicon long ago, as a term used to describe the tendency of successful rookies to struggle in their second years of action. One might well wonder whether this effect is real, or merely the product of selective memory. Aaron Gleeman investigated the issue several years ago, but since the world has moved on from Win Shares (say thankee), it now seems like a suitable time to hit refresh and examine the matter through a time WARP. Here’s a visual representation of performance by Rookies of the Year in the first seasons of their eligibility for the award, as well as during their second seasons:

On average, each season’s most recognized (if not most worthy of recognition) rookie accumulated 3.6 WARP along the way to earning his laurels. Of the award winners, 77 (61.6 percent) accumulated lower WARP totals in their second years of action; one (Derek Jeter, who warded off decline with a concentrated beam of pure clutch captaincy emitted by his 1996 World Series ring) matched his WARP total out to the first decimal; and 47 (37.6 percent) exceeded their previous season’s total. Of course, considering the invisible error bars surrounding our estimates of win value, those figures represent only an approximation of the true proportions of improvement and decline.

The reasons for the sophomore slump may seem straightforward, but if debate still rages over whether Han shot first, who am I to declare this matter settled so soon? As Mike Stadler notes in The Psychology of Baseball, the sophomore slump need not be a matter solely composed of cold, hard, numbers. Stadler writes, “The player who began one year in relative rookie obscurity might be dazzled by the spotlight that shines on last year’s Rookie of the Year—the extra media attention and higher expectations may force him to perform under more pressure than in the first year. Or, it might be that rookies, still untested and with something to prove, are just a bit more motivated, a bit more driven to succeed, and that some of this edge is naturally lost after a player has won his place on the team.”

Although such internal explanations for external behaviors can be alluring, this one reeks of the fundamental attribution error. Stadler concedes that the effect is better explained by a statistical phenomenon with which baseball fans of the basement-dwelling persuasion have become quite familiar—namely, regression to the mean. Since I’ve already used poor Stadler as a straw man, I’ll give him a chance to sound smart: “Anytime that we measure some statistic that changes over time in a population of individuals and select from the set those individuals with the most extreme scores, and then measure that statistic again at a later time, we can expect that the second measurement for the selected individuals will tend to move from the original extreme score toward the overall average for all individuals in the set.” In other words, when we focus our attentions on Rookies of the Year, we’re selecting individuals with extreme performances (which enabled them to win the award in the first place). As a result, we shouldn’t be surprised when these extremophiles subsequently return to environments more hospitable to harboring baseball life.
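For readers who would rather watch the effect than take Stadler's word for it, here is a minimal Python sketch of regression to the mean. Everything in it is an assumption made for illustration, not something drawn from the WARP data: each simulated rookie class holds 30 players, a season's performance is underlying talent plus an independent dose of luck, and both are normally distributed around a league average of zero. The function name `simulate_top_rookie` and its parameters are hypothetical.

```python
import random
from statistics import mean


def simulate_top_rookie(n_classes=2000, class_size=30,
                        talent_sd=1.0, luck_sd=1.0, seed=42):
    """Average year-1 and year-2 performance of each class's best rookie.

    Assumed toy model: performance = talent + luck, with fresh luck
    drawn independently in each season. League average is 0.
    """
    rng = random.Random(seed)
    year1_best, year2_best = [], []
    for _ in range(n_classes):
        # True talent stays fixed from one season to the next.
        talents = [rng.gauss(0.0, talent_sd) for _ in range(class_size)]
        # Luck is redrawn every season.
        year1 = [t + rng.gauss(0.0, luck_sd) for t in talents]
        year2 = [t + rng.gauss(0.0, luck_sd) for t in talents]
        # Select the most extreme year-1 performer, as an award would.
        best = max(range(class_size), key=lambda i: year1[i])
        year1_best.append(year1[best])
        year2_best.append(year2[best])
    return mean(year1_best), mean(year2_best)


first, second = simulate_top_rookie()
print(f"top rookie, year 1: {first:.2f}")
print(f"top rookie, year 2: {second:.2f}")
```

Under these assumptions the selected players' second seasons should land well short of their first while still sitting above the league average, with no spotlight or complacency required: they were picked partly for their talent and partly for their luck, and only the talent carries over.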

If the psychological approach to explaining the sophomore slump were valid, we’d expect to see it reflected in our data. After all, if the “dazzled by the spotlight” argument held water, the aforementioned player’s performance would return to its award-winning level after his second-season slump sent the spotlight elsewhere. Can we observe this behavior? Stadler cites one unattributed study that searched for such a “rebound effect,” but I’ll spare you the details of the statistical indicators examined by its authors, for fear of offending your sabermetric sensibilities. Rather, let’s transform our earlier bar chart into line graph form, and extend it to year three and beyond (assigning 0 WARP to full seasons lost to injury, but omitting seasons lost to military service and, in the case of Ken Hubbs, death):

Although the Rookies of the Year recovered slightly from their second-year low as a group, they couldn’t quite re-scale the lofty heights of their award-winning campaigns. Consider, too, that the average age of these players in their rookie seasons was only 23.3; regardless of one’s loyalties in the Great Sabermetric Peak Age Debate (an issue on which the finest minds of Blefuscu may soon weigh in), one would expect these players to improve, on average, by virtue of the maturation process alone. That their performance peak preceded their physical peak, despite the experiential and physical gains accrued by the time that they’d attained the latter landmark, suggests that regression to the mean was, in fact, the culprit.

Before we move on, let’s take a look at the biggest second-year losers and gainers among the Rookies of the Year, because when Alex Trebek asks you, “This MLB Rookie of the Year experienced the largest decline in WARP in his second full season,” you’d damn well better have “Who is Carl Morton?” on the tip of your tongue.

Note that this article’s title was prompted by Berroa’s precipitous decline. Being like 2006 AL RotY Justin Verlander hasn’t been such a bad thing, but “Neither a Lefebvre Nor a Berroa Be” didn’t have quite the same ring to it. You got a problem with that, take it up with management. Without further ado, on to those who ate regression for breakfast:

This is the part of the article where I observe that 1954 NL RotY Wally Moon had (and still appears to have) a unibrow thick enough to give Cyclops or Geordi La Forge a serious case of eye-visor envy.

If you peruse almost any collection of statistical leaderboards and laggardboards, you’ll notice that the names on any given list tend not to reappear the following year. However, as a group, Rookies of the Year aren’t necessarily the most extreme performers; because the BBWAA voters who anoint each season’s Rookie of the Year can be swayed by team performance, empty statistics, or a particularly compelling narrative, “Rookie of the Year” is not always synonymous with “Most Valuable Rookie” (of course, “Most Valuable Player” hasn’t always been synonymous with “Most Valuable Player,” but that’s a mildly interesting subject for another day). A Rookie-of-the-Year victory is a flawed proxy for high performance, so it might make some sense to bypass it entirely and let the numbers speak for themselves. Let’s bring back our first graph, with the WARP total of each season’s most productive rookie (in this case, beginning in 1956) plotted in red:

As a group, the High-WARP rookies outperformed the Rookies of the Year by more than a full win in their respective rookie campaigns. Since the High-WARP group encompasses an even more extreme sample of performers than the Rookie of the Year group, we would expect its members’ regressions to be more pronounced, and that’s exactly what we see. The High-WARP rookies experienced a decline of 1.6 WARP in their second years of action, fully twice the decline experienced by the RotY group.
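The claim that a more extreme sample should regress harder can be checked in a toy model too. The sketch below is built on invented assumptions rather than the article's data: a hypothetical "award winner" is chosen by a noisy vote (first-year performance plus voter whim, controlled by the made-up parameter `vote_sd`), while the "top performer" is simply the rookie with the best first-year number. The function name `compare_selection_rules` is likewise hypothetical.

```python
import random
from statistics import mean


def compare_selection_rules(n_classes=2000, class_size=30, talent_sd=1.0,
                            luck_sd=1.0, vote_sd=1.5, seed=7):
    """Compare year-2 declines for two ways of picking a rookie.

    Assumed toy model: performance = talent + luck each season, and the
    award goes to whoever has the highest (year-1 performance + vote noise).
    """
    rng = random.Random(seed)
    top_y1, top_y2, award_y1, award_y2 = [], [], [], []
    for _ in range(n_classes):
        talents = [rng.gauss(0.0, talent_sd) for _ in range(class_size)]
        year1 = [t + rng.gauss(0.0, luck_sd) for t in talents]
        year2 = [t + rng.gauss(0.0, luck_sd) for t in talents]
        # Voters see year-1 performance through a layer of narrative noise.
        votes = [y + rng.gauss(0.0, vote_sd) for y in year1]
        top = max(range(class_size), key=lambda i: year1[i])
        award = max(range(class_size), key=lambda i: votes[i])
        top_y1.append(year1[top])
        top_y2.append(year2[top])
        award_y1.append(year1[award])
        award_y2.append(year2[award])
    return {
        "top_year1": mean(top_y1),
        "top_drop": mean(top_y1) - mean(top_y2),
        "award_year1": mean(award_y1),
        "award_drop": mean(award_y1) - mean(award_y2),
    }


results = compare_selection_rules()
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

In this setup the top performers post the better first year and suffer the larger second-year drop, matching the direction of the gap described above. The sketch says nothing about why the award winners later catch up; it gives its imaginary voters no insight into talent at all.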

There’s another item of interest here. Since 1956, WARP and the BBWAA have seen eye-to-eye only 38.9 percent of the time. We’ve established that the BBWAA might be less than adept at identifying each season’s most valuable rookie (in fact, no fewer than six Rookies of the Year played at sub-replacement level, led by AL RotY Albie Pearson’s -1.0 WARP in 1958), but maybe that was never its intention. Last voting season, our own resident BBWAA members wrestled with the issue of whether the likelihood of future success should have any bearing on RotY award voting; based on the historical voting results, it appears that most writers have concluded, whether consciously or otherwise, that it should. By year three, the Rookies of the Year have drawn even with the High-WARP rookies, and by year four, the Rookies of the Year have pulled ahead, albeit only slightly. Perhaps we can take this to mean that the writers have been adept at distinguishing flashes in the pan from players with staying power, regardless of whether they’ve exceeded their voting mandates in doing so. Don’t tell Murray Chass I said so, but from that perspective, the BBWAA electorate may have been something of a wise crowd all along.

The high-WARP approach is going to eliminate a lot of rookies called up in June, who may have played better but not as much, and so did not accumulate as much WARP. If you took the player with the highest WARP per unit of service time (or something to that effect, but done better by smarter people), how different would the results be?

I liked the headline too.
Kind of a snarky shot here though: "... since the world has moved on from Win Shares..." In case you hadn't noticed, the world also has moved on from WARP and its prior generation of fielding metrics. Most baseball analysts are using WAR in various incarnations.