PECOTA Takes on Prospects

Introduction: I Now Pronounce You Scout and Stat

When I reread Nate Silver’s PECOTA Takes on Prospects series, three themes emerged. One, minor-league statistics are pretty damn good at predicting future performance. Two, so many factors can derail a prediction, particularly for young prospects. Three—which doubles as a disclaimer for this series—I’m not Nate Silver. Apologies in advance.

In the two-year run of his series, Silver, using stats alone, attempted to quantify the future value of prospects with rookie eligibility. He rated each prospect with a metric named UPSIDE and parsed his rankings, explaining why, for example, Dustin Pedroia deserved no. 4 prospect status in 2006, after just 538 (whoa) plate appearances above High-A. Or why UPSIDE projected Alexi Casilla for an excellent five years at shortstop and Troy Tulowitzki for just a decent future. We’ll be doing the same during the next few weeks, aided by Rob McQuown’s revival of Silver’s models.

UPSIDE was conceived to provide a complementary perspective to scouting rankings. It’s not necessarily a better perspective (in fact, it’s not a better perspective); it’s simply a process that chooses to rely solely on rigorous calculation. Obviously, we should avoid numbers-only approaches when evaluating players, because prospects are far more nuanced than their stats and quantifiable attributes. But this extreme approach does provide some benefits. Scouting valuations, while comprehensive, might overweight certain factors. They might, say, misadjust for league and level effects. We know the California League is a hitter’s league, but how much of a hitter’s league is it? How much better is the competition in the Eastern League? And consequently, how much do we debit and credit batters and pitchers within our rankings?

This isn’t the fault of scouts, of course. It’s just humanly impossible to consider park, league, and competition-level effects for every player. We know the effects exist, but the adjustments we apply in our heads are rarely precise. This is the primary argument for a statistically driven prospect ranking. PECOTA weights such factors objectively, while incorporating additional factors like position, aging curves, and major-league equivalencies (Davenport translations). PECOTA digests all of these inputs, generates an UPSIDE value for each prospect, and forms a ranked list.

This stat-driven approach not only offers a different perspective but also prompts further questions that may lead to a better understanding of a prospect. UPSIDE will be presented as a single figure (which, by the way, is available on the player cards), and when that figure defies the prospect’s scouting diary, we’ll delve into the machinery behind UPSIDE. Is he performing especially well for his age at a high minor-league level? Will his skillset falter against better competition? Does PECOTA project a flat development curve for him?

Models have flaws too. They miss certain attributes. They are emotionless machines that do not discriminate between players (or more accurately, player IDs). This is where scouting comes in: Why, for example, do 185 players rank above Austin Hedges in UPSIDE when he ranks 18th on Jason Parks’ Top 101? While PECOTA considers age, body type, handedness, past performance, environment, and much more, variables like pitch mix, position flexibility, mental acumen, injuries, and makeup have yet to be incorporated. Models can be powerful, but they’re just simplified versions of reality. We can reference BP’s prospect experts to explain the complexities that PECOTA misses. Together, we hope they’ll provide the most complete view of today’s prospect landscape.

Prospects have upside, but how much UPSIDE?
UPSIDE measures the player’s likelihood of above-average performance while under his parent club’s financial control. For players aged 24 or older, that’s their next five years. For younger players, it’s the peak projected five-year window (PEAK) through their age-28 season (i.e., whichever five-year window is highest among ages 20-24, 21-25, and so on through ages 24-28). Why age 28? As Silver explained in 2007, “This is the real fruit of the unforgiving labor of scouting and development: getting impact performances from players who are still cheap under the reserve clause, or in arbitration.”
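For readers who think in code, here is a rough sketch of that window selection. It is not PECOTA’s actual implementation: the function name, the dictionary of projected WARP by age, and the choice to count missing ages as zero are all assumptions made for illustration, as is treating the player’s first projected season as the earliest possible window start.

```python
def peak_window(warp_by_age, first_age, span=5, last_age=28):
    """Return (start_age, total_warp) for the window described above.

    warp_by_age: hypothetical dict mapping age -> projected WARP that season.
    first_age:   age in the player's first projected season.
    Ages missing from the dict count as 0.0 WARP (an assumption of this sketch).
    """
    def total(start):
        return sum(warp_by_age.get(age, 0.0) for age in range(start, start + span))

    if first_age >= 24:
        # Older prospects: simply the next five seasons.
        return first_age, total(first_age)

    # Younger prospects: the best five-year window ending no later than age 28.
    latest_start = last_age - span + 1  # 24, i.e. the 24-28 window
    best_start = max(range(first_age, latest_start + 1), key=total)
    return best_start, total(best_start)


# Made-up projection for a 20-year-old, purely for illustration.
projection = {20: 0.5, 21: 1.0, 22: 2.0, 23: 3.0, 24: 4.0,
              25: 4.0, 26: 3.5, 27: 1.5, 28: 1.0}
print(peak_window(projection, first_age=20))  # (22, 16.5)
```

In this invented example the 22-26 window wins, so a projection that peaks early is not penalized for what happens at 27 and 28.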

With PECOTA as its backbone, UPSIDE calculates these peaks by first finding the prospect’s top 20 Comparable Players. The grittiest calculations arise here: Each prospect is compared to every player-season in our database based on baseball attributes and performance. Each comparison is given a similarity score; we take the 20 highest. The higher a player’s overall Similarity Index, the more confidence we can have in his UPSIDE rating.

Once UPSIDE determines those 20 comparables, it sums their above-average PEAK WARP values, doubles the sum, and weights each comparable by similarity (the higher the similarity score, the higher the weighting). Basically, UPSIDE determines prospect upside by looking at similar players’ peaks. It disregards below-average performances as they contradict the definition of upside. Here’s Nate again:

The cost of employing a prospect is relatively low, both in terms of financial outlay and opportunity cost (a player can simply be left in the minors if he’s not good enough for MLB), so assigning negative points for a below-average or below-replacement level performance isn’t quite fair.

We’ve modified the definition of UPSIDE from our initial release, which used the sum of PEAK non-negative WARP values. Above-average reflects the UPSIDE definition better than above-replacement: It takes much more skill to clear an average WARP than zero WARP, and that skill would indicate the presence of upside.
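A minimal sketch of the arithmetic, under some loudly stated assumptions: “above-average” is read here as the portion of each season’s WARP above a fixed league-average figure (the default of 2.0 is a placeholder, not a published constant), below-average seasons contribute zero, and the similarity weighting is taken to be a weighted mean. The function and argument names are hypothetical, and the output is not on the published UPSIDE scale, so don’t expect it to reproduce the numbers on the player cards.

```python
def upside_sketch(comparables, average_warp=2.0):
    """Similarity-weighted mean of doubled above-average PEAK WARP.

    comparables:  list of (similarity, peak_warps) pairs, where peak_warps
                  holds one comparable's seasonal WARP over his PEAK window.
    average_warp: assumed WARP of an average player-season (placeholder).
    """
    weighted_total = 0.0
    weight_sum = 0.0
    for similarity, peak_warps in comparables:
        # Only the portion above average counts; below-average seasons add 0.
        above_avg = sum(max(w - average_warp, 0.0) for w in peak_warps)
        weighted_total += similarity * 2.0 * above_avg
        weight_sum += similarity
    return weighted_total / weight_sum if weight_sum else 0.0


# Two invented comparables: a close match who became a star,
# and a looser match who never cleared average.
comps = [
    (80, [5.0, 4.0, 1.0, 2.0, 3.0]),
    (20, [1.0, 1.5, 0.5, 0.0, 2.0]),
]
print(upside_sketch(comps))
```

Because of the floor at zero, the second comparable can only dilute the result through his weight; he can never push it negative, which is the point of Silver’s argument about low-cost failures.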

With the methodology covered, let’s turn to the rating scale. We interpret UPSIDE by the following chart, borrowed from Nate (and Kevin Goldstein) once more:

Upside Score: Definition

100+: Excellent Prospect. One of the better prospects in baseball. Strong chance of long major league career, perhaps with several All-Star appearances. May have Hall of Fame potential, especially if prospect is young or has a rating of 150 or higher.

50-100: Very Good Prospect. Strong chance of a meaningful major league career, with some legitimate chance at stardom. Best-case outcomes may involve some Hall of Fame potential.

25-50: Good Prospect. Reasonable chance of a meaningful major league career, but only an outside chance at stardom.

10-25: Average Prospect. Some chance of a meaningful major league career, but more likely to end up on the major league fringe. Highly unlikely to make two or more All-Star appearances.

0-10: Marginal Prospect. Very little chance of becoming a major league regular, excluding extreme mitigating circumstances affecting the player’s statistical record.
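If you want to apply the scale to a column of UPSIDE figures yourself, something like the following would do. The chart’s ranges share endpoints (50 appears in two rows, for example), so this sketch assumes the lower bound of each range is inclusive; that tie-breaking rule and the names below are my assumptions, not part of the chart.

```python
# (floor, label) pairs, highest tier first. Floors come from the chart;
# treating each floor as inclusive is an assumption of this sketch.
TIERS = [
    (100, "Excellent"),
    (50, "Very Good"),
    (25, "Good"),
    (10, "Average"),
    (0, "Marginal"),
]


def upside_tier(score):
    """Map an UPSIDE score to its label on the rating scale."""
    for floor, label in TIERS:
        if score >= floor:
            return label
    raise ValueError("UPSIDE scores are assumed to be non-negative")


print(upside_tier(219.7))  # Excellent
```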

With that in mind, let’s jump into PECOTA’s Top 100. This ranking is generated solely by PECOTA and its UPSIDE algorithm, with no qualitative adjustments applied. It comprises rookie-eligible players under age 27 and excludes Japanese imports. Jason Parks’ rating of the prospect is listed for comparison, and pitcher rows are shaded.

It's been seven years since Nate Silver last wrote this series. Seven years, coincidentally, is the amount of time a team financially controls a player, and UPSIDE, Silver’s prospect metric, attempts to forecast the five-year PEAK within those seven years. Given the time that has passed, we can evaluate the accuracy of UPSIDE from 2006-07.

In his wrap-up of 2006, Silver formed “Team PECOTA,” consisting of the highest-ranking players in his PECOTA Top 50 who did not rank in Baseball America’s top 50. Here they are, with WARP accumulated during their seven-year spans since 2006:

Hart peaked at 91 on BA’s top 100 in 2002, falling off thereafter. Scouts liked the raw power but feared his long swing, which was already causing high strikeout rates. Nevertheless, .302/.398/.560 with 31 steals in Triple-A—a .302 True Average—impressed PECOTA. In terms of level translations, the gap between Triple-A and the major leagues is small. Pedroia’s high ranking was similarly motivated.

Nunez hit for a .309 TAv in short-season ball during 2005; with few stats to go on, PECOTA fancied the 18-year-old shortstop. He ranked sixth on BA’s top 10 that year too, but after a combined .214/.261/.308 slash across two levels in 2006, he didn’t re-appear in UPSIDE rankings or BA’s top 10. Small-sample, short-season statistics can mislead PECOTA; here, it got Nunez wrong.

Petit pitched very well in Double-A as a 20-year-old, posting walk and strikeout rates of four and 29 percent, respectively, but those have obviously failed to translate to the major leagues. He confused scouts then, too: The numbers were “really quite fantastic,” in Silver’s words, but not the stuff. Despite the reservations, BA ranked him the second-best prospect in the Mets system. We rarely see pitchers who rely on command and deception earn upper-tier prospect status, and BA placed him 77th overall. PECOTA, ranking him 17th, couldn’t make this hedge, not knowing that Petit’s strikeout numbers were a product of command rather than velocity.

We can also review PECOTA’s favorites in its 2007 iteration, compared to Kevin Goldstein’s rankings:

A much more disappointing group overall—just four of these nine have played in a major-league game this season. Shortstops account for three misses—Casilla, Lillibridge, and Rodriguez. From the same class, Brandon Wood and Reid Brignac joined them as highly rated shortstops on scout lists who just didn’t make it. You can attribute their failure to a number of factors, but the common theme here seems to be lack of major-league power. This wasn’t a huge concern for Silver, as long as the prospect made up for it with contact, batting eye, defense, and speed. However, none of PECOTA’s top-ranked shortstops ended up being outstanding in any such category, and even today, we see very few no-power, average-elsewhere middle infielders—depending on how you view his defense, Alcides Escobar straddled replacement level last year. Without power or a true skill, such prospects constantly live on the major-league fringe, despite playing a premium position.

Tulowitzki, a shortstop prospect who succeeded in that barren ’07 class, was underrated by PECOTA relative to Goldstein’s no. 24 ranking. PECOTA couldn’t pinpoint a “true skill” of Tulowitzki’s or give him bonus points for being a grinder, a word used by both Silver and Goldstein. While PECOTA deemed the above-average contact and speed of the prospects above deserving of high UPSIDE, it wasn’t impressed by Tulowitzki’s then-average power.

Iannetta has the 10th-highest WARP since 2007 among catchers, which is not bad for a guy who’s never played a full season. As Silver wrote, “I suspect the scouting-based lists are failing to account adequately for his positional value.” Scouts dinged Iannetta’s arm and athleticism, but he succeeded thanks to the power and on-base skills he displayed in the minors.

PECOTA knows less than lists informed by scouting info, but it didn’t whiff completely, naming multiple prospects whose skills it liked enough to grant a high UPSIDE rating. The objective of PECOTA/UPSIDE isn’t to replicate or beat the scouting rankings—anyone can espouse the talents of Byron Buxton or Carlos Correa. We’re interested in which prospects the scouting lists leave off entirely, and what PECOTA saw in them based on stats alone. By 2021, PECOTA, acting on incomplete information, will have made more incorrect forecasts, but the prospects it nails may not have been on anyone’s top 100.

PECOTA rates seven prospects as “Excellent” and 52 as “Very Good,” cutting off the fallen Jonathan Singleton as its first “Good” prospect. No surprise up top: Byron Buxton takes the number-one UPSIDE rating, which tends to happen when Mike Trout is your primary comparable. Oscar Taveras follows him, with Colby Rasmus, Wil Myers, and Adam Jones as his first three comparables. As indicated by the rating scale, PECOTA expects long major-league careers out of Buxton and Taveras.

Overall, PECOTA’s top 100 contains 60 position players and 40 pitchers. Of the position players, UPSIDE, as it has in the past, rates shortstops and center fielders quite well. It may actually be overrating them, if we contemplate the trajectory of minor leaguers. PECOTA tends to compare players to others at the same position, but a shortstop in Low-A today might not be a shortstop in the major leagues, because he might not have the range for the position. Nevertheless, teams will keep that potential shortstop at that position in the minors, in hopes that he’ll develop at the position—he has positional upside, after all. That raises the UPSIDE of some shortstops and center fielders, because PECOTA doesn’t necessarily know who will eventually be forced to second base or right field. Attrition at the position happens slowly, and those who make it—the Starlin Castros, the Austin Jacksons—become comparables for current minor leaguers.

PECOTA projects 213 prospects, starting with Singleton, as “Good,” giving them a “reasonable chance at a meaningful major-league career, but only an outside chance at stardom.” Forty-one of them close this top-100 list, and the names are a mix of young and old, hitters and pitchers, scout-adored and not. By its definition of “Good,” PECOTA tells us that a few of these prospects will become stars. Scouts have their eyes on a few, but PECOTA has its inclinations too—prospects it nearly pushes into “Very Good” territory without much scout support, for instance. Here, UPSIDE becomes a diamond-mining tool: Of these scout-neglected prospects, who have the best arguments for stardom based on their stats? This is perhaps the most interesting application of UPSIDE—to speculate about which of the relative no-names will eventually become big names.

But aside from analyzing PECOTA’s likes and dislikes, we’re also interested in comparing its rankings to qualitative ones, such as Jason Parks’ Top 101. This exercise can reveal what the model misses, which prospects may be underrated scouting-wise, and which prospects both scouts and stats agree on. With two lists, let’s observe how they overlap:

34 appear on both lists

66 appear exclusively on PECOTA’s top 100

66 appear exclusively on Parks’ top 101

Here we observe the nature of a purely stat-driven prospect ranking: It disagrees with a scout-based ranking two-thirds of the time. I appreciate this large figure—it shows how naïve a stat-driven list can be, given that scouts have access both to stats and to their own insight about players. It stresses that PECOTA thinks certain players aren’t getting enough credit, leading us to ask why. It also pushes us to consider how to improve PECOTA as a model.

We’ll examine the disagreements between PECOTA and BP’s Prospect Staff throughout the series, position by position and player by player.

The Hybrid Top 60
We can synthesize the Baseball Prospectus minds by combining both PECOTA’s and Jason Parks’ lists into one “BP” rank. Below, I’ve taken the geometric mean of the PECOTA Top 100 and Parks Top 101 ranking for each player. If the prospect didn’t make Parks’ 101, I assigned him a default Parks ranking of 150.

I also capped PECOTA rankings at 150, so anyone exceeding that number receives a PECOTA ranking of 150. This minimizes any outlying factors PECOTA couldn’t account for. For example, Lucas Giolito has a PECOTA ranking of 3,409 because of limited playing time; we’d rather not dock him for that. So for the purposes of the Hybrid Top 60, he sits in a 6,669-way tie as 150th on PECOTA’s rankings (that’s about how many players we ran PECOTAs for).
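In code, the blend looks roughly like this. The cap and default of 150 and the geometric mean come straight from the description above; the function names, the use of `None` for a prospect missing from Parks’ list, and the example ranks are mine and purely illustrative.

```python
import math

CAP = 150  # cap on PECOTA ranks and default Parks rank, per the text


def hybrid_score(pecota_rank, parks_rank=None):
    """Geometric mean of the capped PECOTA rank and the Parks rank.

    parks_rank=None stands for a prospect who missed Parks' Top 101.
    """
    pecota = min(pecota_rank, CAP)
    parks = CAP if parks_rank is None else parks_rank
    return math.sqrt(pecota * parks)


def hybrid_ranking(prospects):
    """Order prospect names by hybrid score, best (lowest) first.

    prospects: dict of name -> (pecota_rank, parks_rank or None).
    """
    return sorted(prospects, key=lambda name: hybrid_score(*prospects[name]))


# Invented ranks, not taken from either list.
sample = {
    "Prospect A": (4, 9),       # liked by both lists
    "Prospect B": (3409, 6),    # small-sample pitcher, PECOTA rank capped
    "Prospect C": (10, None),   # PECOTA favorite absent from the Top 101
}
print(hybrid_ranking(sample))
```

The geometric mean rewards agreement: a prospect ranked 4th and 9th scores 6.0, while one ranked 10th by PECOTA and unranked by Parks scores about 38.7, well behind where a simple average of the two lists’ enthusiasm might suggest.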

How does Bubba Starling get ranked as 29th by Pecota? I used to be a true believer in him but no longer am. Does Pecota heavily weigh, or include at all, high school stats? Starling's performance in the minors has been unremarkable.

Thanks for resurrecting Pecota Takes on the Prospects. I always thought it was a fun alternative take on prospect evaluation. Taken for what it is, it is useful. The depth and quality of writing you have given to this subject is impressive. I guess Pecota is a sucker for catching prospects.

It feels like there is a systematic bias in the PECOTA rankings against pitchers (12 of the top 13 are position prospects). Is that a function of TINSTAAPP or some other bias inherent in the system? Surprising to see the large gaps in the ratings even in pitchers in the high minors like Heaney or Taillon, let alone down to guys like Giolito ranked #3490 (!) by PECOTA.

PECOTA will have some trouble with small-sample pitchers, and Giolito has fewer than 40 professional innings to his name, which is why, for the purposes of the Hybrid Top 60, I capped his ranking at 150.

Andrew mentions this when discussing the bias towards SS and CF (2nd paragraph after the PECOTA Top 100 table). Certainly it's easy to assume that catchers, as part of the same side of the defensive spectrum, are treated the same way.

Thanks, Andrew and BP. This is fantastic, and I appreciate the various elements -- description of the stat and process, review of the old results, comparison with a scouting-based prospect list -- that you've woven together. Looking forward to the rest of the series, and perhaps eventually a further exploration of your line "It also pushes us to consider how to improve PECOTA as a model."

There's a slight favoritism on catchers, but not a huge one (Jason Parks has 8 in his top 101, PECOTA has 11, albeit ranked higher). It's something to explore in next week's piece.

As for Roth, 45 pro innings isn't the greatest sample to work from. Roth is also unusual in that 20 of those innings are in the majors, where he had decent ratios. Probably higher than he should be, but we won't hard-code PECOTA for such unusual cases.

Can you do a similar table of the players that BA rated highly and were not included on Pecota's list in 2006 & 2007? It would be interesting to see if there were comparable WARP totals on those players.

PECOTA on prospects is exactly why I started subscribing to Baseball Prospectus. It is the most valuable and unique feature BP has to offer. Thank you for resurrecting it! I hope this is relaunched next year two months sooner so that we can use this information in our fantasy drafts.

Really excited about this series. One question about the calculation of UPSIDE - you say it's "above-average" WARP. Above-average relative to what?

Correct me if I'm wrong, but the calculation seems to go like this:
- Find the 20 most-comparable seasons (which means the players who, at that point in their career, had had a career most similar to the one being considered).
- For Buxton, this would be Trout 2012, FMart 2009, Daniel Fields 2011, JUp 08, Heyward '10, Cutch '07, Angel Morales '10, Allen Hanson '13, Domingo Santana '13, Jaff Decker '10, Yelly '12, Luigi Rodriquez '13, Myers '11, Gose '11, Eddie Rosario '12, Billy Butler '06, Tabata '09, Snider '08, Nimmo '13, Taveras '12
- Starting with those seasons, get the actual or projected performance for each player in their best 5 years prior to age 28 or the 5 years following the comp if they were at least 24. It figures some average to compare those WARP values to, and sums the above-average ones.
- So that produces 20 5-year sums of WARP. For each, we double it (presumably to scale it to a similar level as the previous non-negative WARP version?) and multiply by similarity, then we add them all together and divide by the sum of all 20 similarities
- So now we have a number. What does this mean? For Buxton, the number is 219.7 which is way more than for anyone else. Yet his long-term PECOTA forecast (which is also supposed to be a weighted mean) has his best 5 years prior to age 28 as 4.5, 4.0, 4.8, 5.1, and 4.7 WARP, respectively. That totals to 23.1, so a) why is his UPSIDE not in the neighborhood of 20 x 23.1 = 462, and b) why would UPSIDE be more reliable than his PEAK5 WARP?

The average is 1.62. The sum of the above-average ones, times 2, is 38.6, which is meant to represent an average PEAK5 of just-under 10, so I can't just divide by 5 to get the average it corresponds to, because it depends on how many players ended up counting as above-average.

*This one is approximate, since I didn't want to try to translate his minor league stats from last year into major-league WARP

Yep, these are the main steps of the process. With regard to Buxton, UPSIDE is the composite of above-average WARP values, so subtract a little over 10 WARP (5 x 2) to sum the "Wins Above Average," assuming an average player produces ~2 WARP.

The long-term PECOTA forecast for Buxton has him playing full seasons (600+ PA), while UPSIDE's comparables do not necessarily assume so.

Any player will inevitably have comparables with below-average performance, which will lower his UPSIDE.

Interesting. And does it use projections for the comps, or only players who had actual production that could be compared?

Assuming it uses projections (given the comparables list on the PECOTA cards), I wonder how it would change if only completed seasons were used (so therefore for prospects all the comps have to be from 6-10 years ago or more)

I'm intrigued about Michael Roth's high placement on the list. Given that his numbers across the board have been unremarkable thus far, it makes me wonder what kind of weight PECOTA puts on a player's trajectory in the minors.

Roth, for instance, jumped from Rookie Ball to Double-A between '12 and '13, then hopped directly into the Angels bullpen after just a week in the Texas League.