Prospect prizefights

Two weeks ago, I introduced the “versus draftee” split. The idea behind it is simple enough. In the college game, players face a broad range of talent, from future major league superstars to guys who will never throw a baseball after graduation. If we’re going to look at a college player’s stats, why not focus on his numbers against the best?

In that article, I looked at how college hitters performed against top-flight competition—that is, against guys who had been or would be drafted. It’s only natural to turn our focus to college pitchers.

A different view of Friday night

Elite college pitchers have a straightforward task during the regular season. They take the ball every Friday night and square off against the best opponents on their schools’ schedules. Unlike hitters, who can pad their stats in relatively easy midweek matchups, top hurlers have little rest from the best.

This means several things for the task at hand. A college ace pitching for a major program is already facing high-quality competition. Most of his opponents are also big-time programs, and there aren’t many holes in those lineups.

To take just one example, David Price faced about 500 batters in his junior year for Vanderbilt. Of those, 230 were drafted at some point. Some analysts have likened major-conference college baseball to advanced Single-A, or even Double-A. There are more easy outs in the SEC than in the Florida State League, but there is some validity to the comparison.

Compared to analyzing hitters, we’re often working with much larger samples. If you recall from the article on college batters, we rarely got more than 100 plate appearances against drafted pitchers. Price is an extreme example, but he’s not alone. Thirty-nine pitchers faced more than 100 drafted batters last year.

There’s a downside to these big samples. It’s feast or famine. Since the good hitters are facing their conference rivals on Friday night, Friday night starters at mid-majors are facing everybody else. While Chance Ruffin racked up 140 batters faced against draftees and Mike Leake accumulated 131, Stephen Strasburg only reached 75, Eric Arnett got to 69, and Dallas Baptist’s Victor Black only hit 50.

In other words, our data on the “versus draftee” split is relatively good for ace pitchers at elite programs … but those guys are already facing quality competition most of the time. For everybody else, we don’t have much to go on.

Atop the boards

Enough theorizing. Draft watchers are already getting revved up for June 7, and the best college prospects in the nation are battling it out at the top of draft boards. Let’s take a look at some of those elite prospects and how they performed against drafted hitters last year.

Of this group, only Eibner had trouble striking out prospects at an elite-level clip, while several of these top pitchers struggled with their control against better opponents. Intuitively, that doesn’t surprise. One stereotype of a college hitter with no pro potential is a guy who can’t stop hacking away. It isn’t fair to everyone, but there’s at least a grain of truth there.

As we did last time, let’s look at the 2009 results from a different angle. Which pitchers most thoroughly dominated drafted competition? Using strikeouts per nine innings and a minimum of 70 batters faced against draftees to rate the field, here are the best:

You heard it here first: Strasburg is pretty good at pitching. This is an impressive list overall, and if we had gone deeper, it would still look good. Among the next 20, at 10 K/9 or higher, are Josh Spence, Chris Dwyer, Brad Boxberger and Gerrit Cole.
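For readers who want to build a similar leaderboard, here is a minimal sketch of the ranking described above: K/9 against drafted hitters, with a minimum of 70 batters faced. The record layout, the field names and the three sample pitchers are all hypothetical, and I am assuming K/9 is computed from outs recorded (27 outs per nine innings). It is an illustration, not the exact procedure behind the list.

```python
from dataclasses import dataclass


@dataclass
class DrafteeSplit:
    """One pitcher's line against drafted hitters (hypothetical layout)."""
    name: str
    batters_faced: int
    outs: int
    strikeouts: int


def k_per_nine(split):
    """Strikeouts per nine innings; nine innings equal 27 outs."""
    return 27.0 * split.strikeouts / split.outs


def rank_by_k9(splits, min_bf=70):
    """Keep pitchers who meet the batters-faced minimum, best K/9 first."""
    qualified = [s for s in splits if s.batters_faced >= min_bf and s.outs > 0]
    return sorted(qualified, key=k_per_nine, reverse=True)


# Made-up lines, purely for illustration.
sample = [
    DrafteeSplit("Pitcher A", batters_faced=120, outs=90, strikeouts=40),
    DrafteeSplit("Pitcher B", batters_faced=75, outs=54, strikeouts=30),
    DrafteeSplit("Pitcher C", batters_faced=50, outs=36, strikeouts=25),
]

ranked = rank_by_k9(sample)
for s in ranked:
    print(f"{s.name}: {k_per_nine(s):.1f} K/9 in {s.batters_faced} BF")
```

Pitcher C has the gaudiest rate of the three but is left out because he falls short of the 70-batter minimum, which is the same fate that befalls many mid-major aces in the real data.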

What does it mean?

When I embarked on this project, I had two things in mind. First, how players perform against the highest-quality competition is something that scouts and commentators talk about. At the very least, they imply that it has predictive value. As with any other element of baseball’s conventional wisdom, it would be nice to know if it were true.

Second, it wasn’t just the conventional wisdom that suggested this type of analysis would bear fruit. It seemed intuitively clear to me that how amateurs perform against other top prospects has more predictive value than how they perform against run-of-the-mill college players. It was just a matter of how to calculate it.

There’s some basis for that suspicion. For an article published in The Hardball Times Baseball Annual 2010, I did this type of analysis for Triple-A hitters, looking solely at their results against pitchers who also spent time in the big leagues. I discovered that, at least for some stats, results against sometime-big-leaguers were more predictive of big-league performance than were general Triple-A batting lines.

The reasoning for college players is analogous. I’ve repeatedly referred to the cliche of the “major league curveball,” the mythical and oft-cited pitch that good big leaguers can hit but fringe guys can only wave at in desperation. In other words, it seems inappropriate to view the relationship between Triple-A and the bigs as a linear one. For some players, according to this reasoning, there’s a sort of invisible wall. Their big-league stats won’t be 20 percent worse than their Triple-A results; they’ll be disastrous.

It’s easy to sound confident when referring to one amateur player as having professional potential and another as having none. Does this mean we’re looking at another non-linear situation? If we are, analysis along the lines of the “versus draftee” split is necessary.

Linear wins the day

With enough data, we can get a glimpse of whether the “versus draftee” split has value. Let’s get the question clear in our minds. What is more indicative of future professional success: a relatively large sample against lower-quality competition, or a relatively small sample against higher-quality competition?

In other words, if we want to predict the professional results of an amateur pitcher using college statistics, we can use all of his stats, adjusted for park and competition, or we can use only his stats against draftees.

That’s what I tested. There are about 100 pitchers who (a) faced at least 200 drafted hitters in their college careers at some point between 2007 and 2009, and (b) faced at least 200 hitters below Double-A in their first two professional seasons.

Thus, we could compare the predictiveness of those pitchers’ overall stat lines and their “versus draftee” results. The adjusted overall stat lines won in a rout.

For both K/9 and BB/9, the correlation between adjusted overall stat lines and professional results was about 0.45. Between “versus draftee” performance and professional results? Approximately 0.25.
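The comparison above boils down to two correlation coefficients computed over the same set of pitchers. As a rough sketch, assuming the figures quoted are ordinary Pearson correlations (my assumption here; the coefficient type is not spelled out above), the calculation looks like this. The three lists are invented numbers, so the printed values say nothing about real pitchers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical K/9 figures, one entry per pitcher, in the same order.
college_overall_k9 = [9.1, 7.4, 10.8, 6.2, 8.5, 11.3]    # park/competition adjusted
college_vs_draftee_k9 = [7.0, 9.9, 8.1, 8.8, 6.4, 10.2]  # draftee split only
pro_k9 = [8.2, 6.9, 9.5, 6.0, 7.1, 10.1]                 # low-minors results

r_overall = pearson(college_overall_k9, pro_k9)
r_draftee = pearson(college_vs_draftee_k9, pro_k9)
print(f"overall vs. pro:    r = {r_overall:.2f}")
print(f"vs-draftee vs. pro: r = {r_draftee:.2f}")
```

Whichever predictor posts the higher coefficient is the better guide to pro results. The same function works for BB/9 by swapping in walk rates.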

There are dozens of ways to massage the data looking for some way in which “versus draftee” stats are predictive, or some population for which they are useful. I didn’t try them all, but I tried a lot of them. Maybe the guys with the worst performance against draftees suffered in the pros? Nope. Perhaps the split is only relevant when it differs markedly from overall performance? Negative.

Are the matchups against draftees more valuable? Do we get something worthwhile if we weight those plate appearances more heavily? No. In fact, the more heavily you weight them, the worse the results get. Here’s how bleak it is: Using plate appearances against draftees is only slightly more valuable than randomly choosing a sample of equal size.
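Two of the checks above are easy to sketch: weighting the draftee matchups more heavily, and drawing a random sample of equal size as a baseline. The version below uses strikeouts per batter faced rather than K/9 to keep the arithmetic simple, which is my simplification. The function names, the weight parameter and the sample numbers are all hypothetical.

```python
import random


def weighted_k_rate(k_draftee, bf_draftee, k_other, bf_other, weight=1.0):
    """Strikeout rate per batter faced, counting each plate appearance
    against a draftee `weight` times. weight=1.0 is the plain overall rate."""
    numerator = weight * k_draftee + k_other
    denominator = weight * bf_draftee + bf_other
    return numerator / denominator


def random_subsample_k_rate(outcomes, size, rng):
    """Strikeout rate in a random draw of `size` plate appearances.
    `outcomes` holds 1 for a strikeout and 0 for anything else."""
    picked = rng.sample(outcomes, size)
    return sum(picked) / size


# Made-up pitcher: 20 K in 100 BF against draftees, 90 K in 300 BF otherwise.
plain = weighted_k_rate(20, 100, 90, 300, weight=1.0)
tilted = weighted_k_rate(20, 100, 90, 300, weight=3.0)
print(f"unweighted: {plain:.3f}, draftees counted triple: {tilted:.3f}")

# Baseline: a random sample the same size as the draftee split (100 BF).
all_outcomes = [1] * 110 + [0] * 290
rng = random.Random(2010)  # fixed seed so the draw is repeatable
baseline = random_subsample_k_rate(all_outcomes, 100, rng)
print(f"random 100-BF sample: {baseline:.3f}")
```

To repeat the test, you would compute each of these rates for every pitcher in the sample and correlate them with professional results. If the draftee-heavy versions do no better than the random draws, the split isn’t adding information.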

At least we cleared that up.

What “linear” means for draftniks

Negative results are never exciting, but this one might be. It’s disappointing to take this split out of our toolbox so soon after adding it, but the study ends up giving us some valuable information.

Most important, it reminds us to have faith in competition adjustments at the college level. Too often, fans look at college data and write it off as too messy, or too complicated. That’s right: It is messy and complicated. That just means that it’s more valuable to those who take the time to clean it up.

The apparent linearity of college talent gives us more confidence in statistical evaluations of players at smaller colleges. This year, analysts will pick apart guys like Chris Sale (Florida Gulf Coast), Bryce Brentz (Middle Tennessee State) and Michael Choice (UT-Arlington). We can’t get much out of their raw stats, but we can do a lot with proper adjustments. That reasoning may even extend to Division II standouts such as Florida Southern’s Daniel Tillman.

The triumph of linear also means that we don’t have to put as much stock in marquee matchups. They are fun to watch, for sure, but in most cases, they aren’t crucial for talent evaluation. Scouts may be able to determine something from such encounters, perhaps noting that a hitter can’t get around on 95 mph heat, or that a pitcher relies too much on getting mediocre hitters to chase balls out of the zone. But that may be as far as it goes.

Lest you get too excited, or even disappointed, remember it’s still early. In the grand scheme of play-by-play college analysis, we’re about as far along as Stephen Strasburg’s pro career. Maybe in five years, we’ll be able to identify that “invisible wall.” Maybe the “versus draftee” split has predictive value for performance at and above Double-A. Time will tell.

Comments

Would it be possible to look at the fraction of draftees that make it to Double-A (or Triple-A or the majors) for both hitters and pitchers? My thinking is that the pool of draftee hitters is larger, so it will include a broader range of hitter quality (i.e., a certain number of hitters are actually drafted for their defense, while pitchers are generally drafted for their pitching)… In this way, the spread of draftee hitter talent may not be that much less than the spread for all college hitters, so you’re simply taking a smaller sample size… I assume this is what your comment on the draftee group not being appreciably different than a random sample of the same size indicates…