Author Archive

Twenty-one-year-old Brandon Finnegan put his name on the map in Tuesday night’s epic Wild Card game, when he tossed scoreless 10th and 11th innings before leading off the 12th with a walk to Josh Reddick, who would eventually come around to score against Jason Frasor. Drafted by the Royals with this year’s 17th overall pick, Finnegan made quick work of the minor leagues, making his big league debut on September 6th at Yankee Stadium, just 81 days (and 27 minor league innings) removed from his last appearance with TCU in this year’s College World Series.

To find comps for Finnegan, I first looked for pitchers with a similar arsenal of pitches. Using a minimum of 1,000 pitches, I sought out left-handed pitchers who threw fastballs, sliders, and changeups (Finnegan’s three pitches) at least 90% of the time since 2008, and threw each of these pitches at least 5% of the time. From there, I turned to the PITCHf/x database to find out how often these pitchers’ pitches fell within Finnegan’s middle 50% of values for velocity, break angle, break length, spin rate, and spin direction from his eight big-league games. These are the pitchers who threw the highest ratio of pitches comparable to what Finnegan threw. The similarity percentage was calculated by dividing each pitcher’s share of pitches meeting these criteria by the share of Finnegan’s own pitches that met them. The ERAs were calculated over the last seven years: 2008-2014.
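For anyone who wants to replicate the similarity calculation, here is a minimal sketch of how it could be set up. The field names (`velo`, `break_angle`, and so on) are my own placeholders rather than actual PITCHf/x column names, and I'm assuming the "middle 50%" means the range between the 25th and 75th percentiles of the reference pitcher's values, applied to all five metrics at once.

```python
from statistics import quantiles

# Hypothetical field names; real PITCHf/x columns are named differently.
METRICS = ("velo", "break_angle", "break_length", "spin_rate", "spin_dir")


def middle_50_ranges(pitches):
    """Return {metric: (25th percentile, 75th percentile)} for a list of pitch dicts."""
    ranges = {}
    for metric in METRICS:
        q1, _, q3 = quantiles([p[metric] for p in pitches], n=4, method="inclusive")
        ranges[metric] = (q1, q3)
    return ranges


def share_in_ranges(pitches, ranges):
    """Share of pitches that fall inside every metric's range at once."""
    hits = sum(
        all(low <= p[metric] <= high for metric, (low, high) in ranges.items())
        for p in pitches
    )
    return hits / len(pitches)


def similarity(pitches, reference_pitches):
    """A pitcher's share of comparable pitches divided by the reference pitcher's own share."""
    ranges = middle_50_ranges(reference_pitches)
    return share_in_ranges(pitches, ranges) / share_in_ranges(reference_pitches, ranges)
```

Dividing by the reference pitcher's own share matters because even he only lands inside all five of his middle-50% ranges on a fraction of his pitches, so a raw share would understate how similar anyone is. Note that the division would fail if none of the reference pitcher's pitches fell inside every range simultaneously.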

Finnegan looks to have a bright future ahead of him, as his top two comps are two of the most dominant pitchers in baseball: one a starter (Sale) and one a reliever (Watson). It remains to be seen which path the Royals will choose for their hard-throwing lefty going forward. While it’s tempting to slot him into a relief role next season, the wiser decision might be to stretch him out as a starter, where he would be able to take full advantage of his three-pitch arsenal. But either way, until the Royals’ playoff run comes to an end, Brandon Finnegan will be allowed to air it out for just an inning or two at a time on easily the biggest stage he’s ever seen. And given his lights-out stuff, he might just end up being this year’s Francisco Rodriguez.

It’s been quite a week for the WAR stat. Since Jeff Passan dropped his highly controversial piece on the metric on Sunday night, the interwebs have been abuzz with arguments both for and against the all-encompassing value stat. One criticism in particular that caught my eye came from Mike Newman, who writes for ROTOscouting. Newman’s qualm had to do with a piece of WAR that’s often taken for granted: the positional adjustment. He made the argument that current WAR models underrate players who play premium defensive positions, pointing out that it would be “laughable” for Jason Heyward to replace Andrelton Simmons at shortstop, but not at all hard to envision Simmons being an excellent right fielder.

This got me thinking about positional adjustments. Newman’s certainly right to question them, as they’re a pretty big piece of the WAR stat, and one most of us seem to take for granted. Plus, as far as I’m aware, none of the major baseball websites regularly update the amount they credit (or debit) a player for playing a certain position. They just keep the values constant over time. I’m sure that whoever created these adjustments took steps to ensure they accurately represented the value of a player’s position, but maybe they’ve since gone stale. It’s certainly not hard to imagine that the landscape of talent distribution by position may have changed over time. For example, perhaps the “true” replacement level for shortstops is much different than it was a decade or so ago when Alex Rodriguez, Derek Jeter, Nomar Garciaparra, and Miguel Tejada were all in their primes.

I decided to try and figure out if something like this might be happening. If the current positional adjustments were in fact misrepresenting replacement level at certain positions, we’d expect the number of players above replacement level to vary by position. For example, there might be something like 50 above-replacement third basemen, but only 35 shortstops. Luckily, the FanGraphs leaderboard gives you the ability to query player stats by position played, which proved especially useful for what I was trying to do. For each position, I counted the number of plate appearances accumulated by players with a positive WAR and then divided that number by the total plate appearances logged at that position. Here are the results broken out by position for all games since 2002.
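The calculation itself is simple enough to sketch in a few lines. This is only an illustration of the counting described above: the `(position, plate_appearances, war)` row layout is my own assumption about how a by-position leaderboard export might be arranged, not the actual FanGraphs format.

```python
from collections import defaultdict


def above_replacement_share(rows):
    """Share of plate appearances at each position logged by positive-WAR players.

    rows: iterable of (position, plate_appearances, war) tuples, one per
    player-at-position line (a hypothetical layout for illustration).
    """
    total = defaultdict(int)
    positive = defaultdict(int)
    for position, pa, war in rows:
        total[position] += pa
        if war > 0:
            positive[position] += pa
    return {pos: positive[pos] / total[pos] for pos in total if total[pos]}
```

Weighting by plate appearances rather than counting heads keeps a handful of cup-of-coffee players from swamping the regulars at a position.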

Based on this data, it seems like the opposite of Newman’s hypothesis may be true. A significantly higher portion of positive-WAR plate appearances have come from players at the tougher end of the defensive spectrum, which implies that teams don’t have too difficult of a time finding shortstops and center fielders who are capable of logging WARs above zero. Less than 13% of all SS and CF plate appearances have gone to sub-replacement players. But finding a replacement-level designated hitter seems to be slightly more difficult, as teams have filled their DH with sub-replacement-level players nearly 30% of the time. Either teams are really bad at finding DH types (or at putting them in the lineup), or the positional adjustments aren’t quite right. The disparities are even more pronounced when you look at what’s taken place from 2002 to 2014.

The share of PAs logged by shortstops and center fielders hasn’t changed much over the years, but the numbers have plummeted for first basemen, corner outfielders, and DH’s. From Billy Butler and Eric Hosmer, to Jay Bruce and Domonic Brown, this year’s lineups have been riddled with sub-replacement hitters manning positions at the lower end of the defensive spectrum. Meanwhile, even low-end shortstops and center fielders, like Derek Jeter and Austin Jackson, have managed to clear the replacement level hurdle this season if we only count games at their primary positions.

The waning share of above-replacement PA’s coming from 1B, LF, RF, and DH has caused the overall share to drop as well, with a particularly big drop coming this year. Here’s a look at the overall trend.

And here it is broken down by position…

And just between this year and last…

Frankly, I’m not sure what to make of all of this. I’m hesitant to call it evidence that the positional adjustments are broken. There could be some obvious flaw to my methodology that I’m not considering, but I find it extremely interesting that there’s been such a shift between this year and last. We’re talking an 8 percentage point jump in the number of PAs that have gone to sub-replacement-level players. Maybe it’s been spurred by the rise of the shift or maybe year-round interleague play has something to do with it, but it seems to me that something’s going on here. And I’m interested to hear other people’s thoughts on these trends.

Rockies outfielder Corey Dickerson is quietly having an excellent season at the plate. Believe it or not, the 25-year-old is hitting an impressive .315/.371/.577, which even after adjusting for the effects of Coors Field, is still good for a 144 wRC+ — 13th highest among players with at least 400 plate appearances. Dickerson’s batted pretty sparingly against lefties, which has certainly played a role in his gaudy stat line, but platoon or no platoon, a .405 wOBA is certainly nothing to sneeze at.

While Dickerson’s out-of-the-blue breakout is interesting, the approach he’s used to get there is what makes him truly unusual. Since debuting last season, he’s swung at 62% of pitches inside the strike zone and 42% of pitches outside of it, making him about 1.5 times (62%/42%) as likely to swing at a strike as at a ball. This is the lowest such ratio of any player with at least 600 PA’s these last two years. Dickerson’s not a free swinger, per se (his overall swing rate of 51% is 38th out of 251 players with at least 600 PA’s), but he just doesn’t discriminate based on whether or not a pitch is in the strike zone. Here’s a look at the hitters with the lowest Z-Swing%/O-Swing% these last two seasons:

Dickerson’s contact rates tell a similar story. Just like his overall swing rate, Dickerson’s contact rate of 81% isn’t all that interesting. Here, he checks in at 151 out of 251. But also like his swing rate, it doesn’t change very much depending on a pitch’s location. He’s put wood on 83% of pitches he’s offered at in the zone, compared to 74% outside of it, making him 1.1 times as likely to connect on a pitch within the zone — fourth lowest out of 251.

Multiplying these two metrics (Contact% x Swing%) gives us Dickerson’s contact rate over all pitches seen, regardless of that pitch’s location. Let’s call this AllContact% to distinguish it from the traditional Contact%. This number shows just how much of an outlier he really is. For the average major league hitter, a pitch thrown in the strike zone results in contact 2.9 times as often as one outside of it, but for Dickerson, a pitch in the zone is less than 1.7 times as likely. Even if we set the bar as low as 70 plate appearances to include 577 players, this is still the lowest in baseball since the start of 2013.
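To make the arithmetic concrete, here is a quick sketch that reproduces the three ratios from Dickerson’s rounded rates quoted above. The function and variable names are just labels I made up for this example, and because the inputs are rounded, the outputs are approximate.

```python
def all_contact(swing_rate, contact_rate):
    """Contact per pitch seen: the chance of a swing times the chance that swing connects."""
    return swing_rate * contact_rate


# Dickerson's rounded rates as quoted in the text
z_swing, o_swing = 0.62, 0.42        # swing rates in and out of the zone
z_contact, o_contact = 0.83, 0.74    # contact rates on those swings

swing_ratio = z_swing / o_swing              # about 1.48
contact_ratio = z_contact / o_contact        # about 1.12
z_all = all_contact(z_swing, z_contact)      # about 0.515
o_all = all_contact(o_swing, o_contact)      # about 0.311
all_contact_ratio = z_all / o_all            # about 1.66, i.e. "less than 1.7"
```

Since AllContact% is a product, the in-zone/out-of-zone ratio is just the swing ratio times the contact ratio, which is why two individually low ratios compound into such an extreme one.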

And unsurprisingly, he’s also the all-time leader since 2007 (the earliest year with PITCHf/x data). Dickerson had the lowest among all players with 100 PA’s here, but I set the threshold to 600 PA’s to avoid having a leaderboard filled with obscure players like Jesus Feliciano and Jordan Brown. In case you were wondering, Vladimir Guerrero checked in at 2.13.

Dickerson’s indifference to a pitch’s location means it’s probably only a matter of time before pitchers just stop throwing the ball in the strike zone, especially if he keeps slugging well above .500. So far this year, opposing pitchers have thrown Dickerson a strike just over 45% of the time. This is lower than the league average of 49%, but isn’t exceptionally low, especially for a free-swinging power hitter. Guys like Jose Abreu, Carlos Gomez, and Pablo Sandoval see strikes around 42% of the time, so pitchers could almost certainly get away with throwing Dickerson a few more balls. Sure, he’s shown that he’s able to hit those pitches, but even for a player like Dickerson, chasing after bad pitches is still a recipe for lots of swings and misses. His 74% O-Contact% is well above the league average of 63%, yet still lower than the overall Contact% of 80%.

Dickerson’s one-size-fits-all approach to swinging has worked well so far, but it remains to be seen what will happen when pitchers start exploiting it by throwing more balls out of the zone. Maybe he’ll be unfazed and keep on raking. Maybe he’ll turn into a strikeout machine, who needs to refine his approach to even stay in the big leagues. Either way, Corey Dickerson’s a fascinating player, who’s unlike any we’ve seen in recent years, and it’ll be interesting to see if he’s able to keep succeeding going forward.

At the start of the 2013 season, Brett Gardner adopted a new, more aggressive approach at the plate in the hopes of barreling more hittable pitches. Up to that point, the slap-hitting outfielder had been one of the most patient hitters in baseball. Gardner sat out most of 2012 due to injury, but swung at just 32.7% of all pitches seen between 2010 and 2011, the fewest of any player with at least 300 plate appearances. Last year, his swing rate jumped to 40.1%, with most of his new-found aggressiveness focused on pitches located within the strike zone. While his zone swing rate rose by 13 percentage points from 2010 to 2013, his rate for pitches out of the zone only increased by seven.

The change seemed to pay off. Gardner posted a career high .143 ISO last season — much better than his career mark of .103 — on his way to a very respectable 108 wRC+. He’s carried that success over to this season as well. With 16 homers, he’s doubled his total from last season — which was already a career high — and with a 119 wRC+, he’s developed into one of the better-hitting outfielders in all of baseball.

But unlike last season, he’s no longer sporting a swing percentage north of 40%. Instead, it’s fallen back to 36.6%, just a tad higher than his 35% mark from 2011. So if Gardner’s back to his old ways of watching two thirds of all pitches go by, how has he managed to keep hitting for power? The answer has everything to do with plate discipline. Gardner’s continued to take advantage of hittable pitches, but has also gotten much better at laying off pitches outside of the strike zone. First, let’s look at how often he’s swung at pitches inside of the strike zone.

Since adopting his more aggressive approach two springs ago, Gardner’s behavior on pitches in the zone hasn’t changed much. Maybe he’s gotten a little less aggressive over the past couple of years, but for the most part, his swing rates have been pretty consistent. It’s probably safe to say that Gardner’s a guy who swings at about 50-55% of pitches in the strike zone. We see a different story, however, when it comes to pitches outside of the zone.

At least initially, Gardner’s swing rate on balls out of the zone also spiked. He seemingly became more aggressive on all pitches, without discriminating based on location. But that’s changed over the past couple of seasons, as he’s swung at fewer and fewer pitches out of the zone. His O-Swing% dipped below 18% in both July and August — down from around 25% in early 2013 — putting him on par with what he was doing back in 2010 and 2011. Today, Gardner’s been nearly three times more likely to swing at a strike than a ball, up from two times as likely in April of 2013.

Gardner’s improved plate discipline is nothing new. Although his change in approach puts a kink in the trend, Gardner’s been getting better at deciding whether or not to swing since his first days in the big leagues, and probably even longer. Even before he re-evaluated his approach before the 2013 season, he was already starting to transition from a “guy who doesn’t swing at anything” to a “guy who doesn’t swing at balls”.

Coming up through the minors, Gardner didn’t impress many scouts with his tools, and barely even made his college team as a walk-on. Sure, he’s always had plus-plus speed, but that only gets you so far when you’re an outfielder with little power to speak of. Rather than relying on his pure hitting skills, Gardner makes it work with his zen-like plate discipline. By swinging at so few balls out of the zone, Gardner practically forces pitchers to leave the occasional pitch over the heart of the plate, and has just enough pop in his bat to make them pay for it. But most importantly, he’s learned how to take advantage of those mistake pitches, while simultaneously laying off of the bad ones.

Brandon Moss has wielded an immensely potent bat since joining the Athletics’ lineup in June of 2012. Between 2012 and 2013, he hit a remarkable 146 wRC+, and clubbed a homer once every 15.7 PA’s, placing him third in baseball behind Chris Davis and Miguel Cabrera over that span. Moss kept up the hot hitting to start the 2014 season, as well. The 30-year-old 1B/OF/DH posted a 162 wRC+ in the season’s first two months, further establishing himself as a key cog in one of baseball’s most potent lineups.

But Brandon Moss hasn’t been himself lately. Since his last home run on July 24th, he’s only managed three extra-base hits, resulting in a laughable .168/.317/.198 batting line. Moss’s slump has also coincided with a change in his hitting approach. Moss appears to have gotten a bit more passive at the plate, swinging at way fewer pitches both inside and outside of the strike zone. This new-found passivity took a turn for the extreme once the calendar turned to August, when his O-Swing% and Z-Swing% fell to 27% and 65%, respectively — both around six percentage points lower than his career norms.

Moss’s decision to lay off more pitches has unsurprisingly led to a spike in both his walk and strikeout numbers, but it’s also resulted in his power completely flat-lining. Moss has basically been Adam Dunn without the power these last couple of months. That’s a pretty terrible hitter, and is part of the reason why the A’s went out and got the real Adam Dunn to help their sputtering offense.

The new swing profile is something that’s recently changed, making it the obvious culprit for Moss’s drop-off in production, but we shouldn’t immediately rule out the possibility that pitchers have changed the way they’re approaching him. It could just be that he’s swinging at fewer pitches because he’s getting fewer pitches to hit. That doesn’t seem to be the case, though, as Moss’s zone breakdown from August looks nearly identical to what it was over the season’s first four months. For whatever reason, Moss just isn’t swinging as often as he used to.

It’s not entirely clear what’s spurred Moss’s sudden reluctance to swing the bat, but all indications are that it’s done a number on his offensive performance. Unlike the Brandon Moss who, up until recently, could be counted on for a wRC+ north of 130, this latest iteration seems to be letting a few too many hittable pitches float down the heart of the plate. And based on what’s transpired over the last month or two, Moss’s best bet is probably to re-discover the more aggressive approach that’s worked so well for him in the past.

The ability to quantify the value of catcher framing has been one of the biggest sabermetric breakthroughs of the last decade. By parsing through PITCHf/x data, analysts like Mike Fast, Max Marchi, Dan Brooks, and Harry Pavlidis have managed to shed light on which catchers are adept at turning balls into strikes, uncovering hidden value in otherwise unremarkable players, including Rene Rivera, Chris Stewart, and of course, Jose Molina.

MLB front offices have taken notice. Several teams, including the Yankees, Rays, Red Sox, Pirates, Padres, and Brewers, have begun hoarding good-framing catchers over the past few years. But one team that’s missing from this list is the Oakland Athletics, who have historically been among the first adopters of sabermetric principles. One would think that the A’s would be all over the Jose Molinas and Chris Stewarts of the world, yet Billy Beane and co. seem to have missed the memo on acquiring good framers. In fact, they’ve made a habit of employing poor ones. According to Baseball Prospectus’ model, A’s catchers rank fourth from last in framing runs saved this season. This isn’t a one-year anomaly, either. Here’s a look at all of the catchers the A’s have used since 2010, along with their career framing numbers.

That right there is a pretty sorry group of framers. There’s not a single catcher in the group who’s even above average. So what gives? Why has Billy Beane — who’s nearly synonymous with the term “market inefficiency” — been so reluctant to exploit the latest market inefficiency?

As far as I can tell, there are two possible explanations, and the real answer is probably some combination of the two:

1) The A’s have chosen to employ catchers who excel in areas other than pitch-framing.

2) The A’s aren’t completely buying into all of this pitch-framing stuff.

Let’s start with the first explanation. Since 2010, A’s catchers have accumulated 12.1 fWAR (which doesn’t account for framing), putting them 15th out of 30 MLB organizations. But since 2012, the year after Mike Fast’s research first brought the value of pitch framing to the public’s eye, the A’s rank 10th. The average wRC+ from a catcher is 93, but the A’s have done much better than that of late by employing guys like John Jaso (136 wRC+) and Derek Norris (110 wRC+). Even if you were to dock Oakland’s catchers for their poor framing skills, they’d still fall somewhere in the middle of the pack in terms of total value. Basically, the A’s have managed to find good, cheap catchers, who generate value in ways other than framing pitches. Plus, for all we know, the A’s might have reason to believe these guys excel in other overlooked areas. They could be superb game callers, for example.

But that can’t be all that’s going on. Sure, the A’s have done a decent enough job of finding catching talent without prioritizing framing, but it’s not like they’ve had Mike Piazza or Johnny Bench behind the plate. Jaso and Norris are fine players, but aren’t exactly superstars. Plus, it should tell us something that they haven’t even brought in any bottom-of-the-barrel framing specialists. Erik Kratz and Chris Stewart were both traded for warm bodies last winter, but the A’s instead chose to roll with Vogt as their primary catching depth.

Perhaps the A’s have reason to believe that publicly available framing models overstate the value-add of a framed pitch? As Dave Cameron recently pointed out, it’s not entirely clear if the full value of a framed pitch should be attributed to the catcher, with none of the credit going to the pitcher. Current models don’t account for how a pitcher might change his approach based on the framing abilities of his catcher, and research shows that pitchers do in fact change their approach based on who’s catching, throwing a few more pitches outside of the strike zone:

Oakland’s brain trust is about as progressive as they come, and has a proven penchant for unearthing value from unlikely places. When a team like that zigs while others zag, it probably makes sense to ask why. This isn’t to say that the publicly-available framing data is useless, as having a good framer undeniably adds some value, even if it’s only a few runs. But the fact that the A’s have yet to employ a single plus framer should lead us to wonder if there’s a piece of the puzzle we might be missing.

As you’ve probably heard by now, the Baltimore Orioles have made a habit of outperforming their run differential these last three years. In 2012, they finished the year 93-69, but their +7 run differential suggested they didn’t play much better than a .500 team. This year, they’re at it again. They currently sit atop the American League East with a 73-52 record, but their peripheral stats suggest they’ve lucked into a few wins along the way.

This has inevitably led to some disagreement over the true talent of recent Orioles teams. On the one hand, it’s been well established that things like BaseRuns and Pythagorean records do a pretty good job of predicting a team’s win-loss record. But at the same time, Buck Showalter’s Orioles have been pulling this off for a while now. Even if you understand and accept the concept of random variation, it’s a little hard to believe that the Orioles’ run has been entirely due to luck.

Jeff Sullivan recently penned a convincing article, dispelling the myth that clutch teams remain clutch over an extended period of time. He compared teams’ first-half clutch scores to their second-half scores, finding no correlation between the two, concluding that “team clutch” is not a repeatable skill.

Sullivan’s argument is pretty persuasive, but Major League teams today are sort of like a Ship of Theseus: They experience lots of turnover over the course of a year, and come September, many look completely different than they did on opening day. Perhaps a comparison of half-seasons might not be picking up on the “magic” that often exists for only part of a year, when a team had the right combination of players on its roster.

To test whether this might be the case, I looked at month-to-month correlations for all consecutive months from 2009 to 2013. I also broke things up by hitting clutch and pitching clutch to see if there might be a phenomenon that exists on only one side of the ball.
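For the curious, here is a rough sketch of how a month-to-month comparison like this could be set up. It pairs each month’s clutch score with the following month’s for the same team and season, then computes a plain Pearson correlation across all pairs. The dictionary layout is a hypothetical arrangement I chose for the example, and Pearson correlation is my assumption about the measure used.

```python
from math import sqrt


def consecutive_month_pairs(clutch_by_team_season):
    """Pair each month's clutch score with the next month's, within one team-season.

    clutch_by_team_season: {(team, season): [score for month 1, month 2, ...]}
    (a hypothetical layout for illustration).
    """
    xs, ys = [], []
    for months in clutch_by_team_season.values():
        for this_month, next_month in zip(months, months[1:]):
            xs.append(this_month)
            ys.append(next_month)
    return xs, ys


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)
```

Pairing only within a team-season keeps September of one year from being matched against April of the next, when the roster may look nothing alike.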

There isn’t much going on here, as all three trend lines are pretty darn close to flat. But we do see a slight upward slope to the trend line for pitchers. It’s not enough to be statistically significant (P-Value=.27), but maybe it could be picking up on something. For instance, it doesn’t seem far-fetched that some managers might be better than others at deploying relievers in situations where they’re likely to succeed. The 2012 Orioles’ bullpen, after all, was more clutch than average in all six months of the season. So maybe their success had something to do with the way Buck Showalter managed his bullpen? Let’s see if we see anything more definitive by breaking the correlations up by starters and relievers.

Nada. Both rotation and bullpen clutch show no signs of correlation, which implies that the hint of a relationship for month-to-month pitching clutch was purely statistical noise. Pretty much any way you slice it, there’s just no evidence suggesting that team clutch is in any way a repeatable skill, even over very short periods of time. Some teams, like the Orioles, do manage to string together consecutive months of clutch performance. But the overall lack of correlation between consecutive months shows that a team’s clutch performance is about as random as a coin flip. If you flip a coin enough times, you’ll eventually get 10 heads in a row. By that same logic, you’re bound to find a stretch as extreme as the Orioles’ if you string together enough three-year stretches.

All statistics courtesy of FanGraphs and their infinitely useful splits data.

Big-league teams today employ a myriad of data-driven strategies to eke every last drop of value from the players on their rosters. Many of these strategies consist of matching up hitters and pitchers based on their handedness. Between lineup platoons and highly-specialized bullpens, managers today go to great lengths to ensure they’re putting their players in the best possible situation to succeed.

It’s easy to see why. With very few exceptions, Major League hitters hit much better against opposite-handed pitching. In terms of wOBA (vs. opposite-handed – vs. same-handed), lefties perform about .043 better against righties, while righties hit .031 better against lefties. Yet not all platoon splits are created equal. Players like Shin-Soo Choo, David Wright, and Jonny Gomes are notorious for their drastic splits, while others put up comparable numbers no matter who’s on the mound. Ichiro Suzuki and Alex Rodriguez are a couple of the no-platoon-split poster boys.

Ok, so some batters have bigger platoon splits than others, but is there any particular reason for this? Take Choo for example. Is there something inherent to his skill set or approach that causes him to struggle against lefties?

Hoping to find an answer, I ran some regressions in search of attributes that might make a player more likely to have an exaggerated platoon split. I tested all sorts of things out there — from walk rate and swing% to a player’s height and throwing arm — but didn’t come away with much. Aside from a hitter’s handedness, attributes that proved statistically significant included: a hitter’s overall wOBA, his line drive rate, his strikeout rate, and his contact rate on pitches out of the zone, but even those relationships are extremely weak. It takes .100 points on a batter’s wOBA, or a 10% increase in K% or LD%, to move a batter’s platoon split by just .010 points. This tells us something, but not a ton, and at the end of the day, these variables account for a nearly negligible 4% of the variation in hitters’ platoon splits. Here’s the resulting R output. My sample included all batter seasons from 2007-2013 with at least 100 plate appearances against both lefties and righties, excluding switch hitters:

Good hitters or guys who strike out frequently might be a little more prone to having large platoon splits. But for all practical purposes, a player’s ability to hit one type of pitching better than the other seems to be a skill that’s independent of all others. Aside from going by a player’s platoon stats, which can take years to become reliable, there’s little we can do to anticipate which hitters might fare particularly badly against same-handed pitching. And with the exception of players with long track records of unusual platoon splits, like Choo and Ichiro, it’s generally safe to assume that any given hitter’s true-talent platoon split is within shouting distance of the average: .043 for lefties and .031 for righties.
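If you want to build a similar sample yourself, the split definition and the sample filter described above can be sketched roughly like this. The dictionary keys (`bats`, `pa_vs_lhp`, `woba_vs_rhp`, and so on) are names I invented for the example and don’t correspond to any particular data source.

```python
def platoon_split(woba_vs_opposite, woba_vs_same):
    """Platoon split in wOBA: vs. opposite-handed minus vs. same-handed pitching."""
    return woba_vs_opposite - woba_vs_same


def eligible(season, min_pa=100):
    """Keep batter seasons with enough PA against both hands; drop switch hitters."""
    return (
        season["bats"] in ("L", "R")
        and season["pa_vs_lhp"] >= min_pa
        and season["pa_vs_rhp"] >= min_pa
    )


def season_split(season):
    """Platoon split for one batter season, oriented by the batter's handedness."""
    if season["bats"] == "L":
        return platoon_split(season["woba_vs_rhp"], season["woba_vs_lhp"])
    return platoon_split(season["woba_vs_lhp"], season["woba_vs_rhp"])
```

Defining the split as opposite-handed minus same-handed means a positive number always indicates the usual platoon advantage, regardless of which side of the plate the hitter stands on.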

Over the last few weeks, I have written a series of posts looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I analyzed hitters in Rookie leagues, Short-Season A, Low-A, High-A, Double-A, and Triple-A using a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes — such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
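For readers who haven’t run into a probit before, here is a bare-bones sketch of how one turns inputs into a probability: take a weighted sum of the inputs and pass it through the standard normal CDF. This shows only the general mechanics, not KATOH itself; the feature names, coefficients, and intercept passed in would all be made up for illustration and are not KATOH’s fitted values.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """Probability that a standard normal variable is less than or equal to z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def probit_probability(features, coefficients, intercept):
    """Predicted probability from a probit model: the normal CDF of a linear score.

    features and coefficients are dicts keyed by the same (hypothetical) names.
    """
    score = intercept + sum(coefficients[name] * value for name, value in features.items())
    return standard_normal_cdf(score)
```

Because the normal CDF flattens out at the extremes, a strong stat line at a high level pushes the probability toward 100% without ever exceeding it, which is why many top prospects end up bunched in the high 90s.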

After receiving a few requests, I decided to apply the model to players of years past. In what follows, I dive into what KATOH would have said about recent top prospects, look at the highest KATOH scores of the last 20 years, and highlight some instances where KATOH missed the boat on a prospect. If you’re feeling really ambitious, here’s a giant Google Doc of KATOH scores for all 40,051 player seasons since 1995 (minimum 100 plate appearances in a short-season league or 200 in full-season ball).

Before I delve into the parade of lists, I want to point out one disclaimer to what I’m doing here. KATOH was derived from the performances of historical players, so applying the model to those same players might make it look a little better than it is. Take a player like Jason Stokes for example. Although he was a very well-regarded prospect in the early 2000s (#15 and #51 per Baseball America in 2003 and 2004), KATOH consistently gave him probabilities in the 70s and 80s. But part of that is likely because Stokes’ data points were incorporated into the model. If I had created KATOH in 2005, Stokes’ MLB% may have been a few percentage points higher. Even so, a few data points generally aren’t enough to substantially change a model that incorporates thousands. In other words, it’s probably safe to assume that a player’s MLB% using today’s KATOH is roughly in line with what he would have received at the time.

Now, onto the results. Here’s what KATOH thought about some of the most recent top 100 prospects:

2013 Top 100 Prospects

| Player | Year | Age | Level | MLB Probability |
| --- | --- | --- | --- | --- |
| Xander Bogaerts | 2013 | 20 | AA | 99.888% |
| Xander Bogaerts | 2013 | 20 | AAA | 99.869% |
| George Springer | 2013 | 23 | AAA | 99.816% |
| Gregory Polanco | 2013 | 21 | AA | 99.614% |
| Nick Castellanos | 2013 | 21 | AAA | 99.608% |
| Kolten Wong | 2013 | 22 | AAA | 99.428% |
| Wil Myers | 2013 | 22 | AAA | 99.418% |
| Miguel Sano | 2013 | 20 | A+ | 99.335% |
| Tyler Austin | 2013 | 21 | AA | 99.194% |
| Jackie Bradley | 2013 | 23 | AAA | 99.079% |
| Kaleb Cowart | 2013 | 21 | AA | 99% |
| Byron Buxton | 2013 | 19 | A+ | 98% |
| Francisco Lindor | 2013 | 19 | A+ | 98% |
| Christian Yelich | 2013 | 21 | AA | 97% |
| Byron Buxton | 2013 | 19 | A | 97% |
| Addison Russell | 2013 | 19 | A+ | 97% |
| Billy Hamilton | 2013 | 22 | AAA | 96% |
| Brian Goodwin | 2013 | 22 | AA | 96% |
| Carlos Correa | 2013 | 18 | A | 96% |
| Slade Heathcott | 2013 | 22 | AA | 96% |
| Javier Baez | 2013 | 20 | A+ | 95% |
| Jake Marisnick | 2013 | 22 | AA | 95% |
| Albert Almora | 2013 | 19 | A | 95% |
| Jonathan Singleton | 2013 | 21 | AAA | 94% |
| Mike Zunino | 2013 | 22 | AAA | 94% |
| Alen Hanson | 2013 | 20 | A+ | 94% |
| Gregory Polanco | 2013 | 21 | A+ | 92% |
| Javier Baez | 2013 | 20 | AA | 91% |
| Jorge Soler | 2013 | 21 | A+ | 90% |
| Gary Sanchez | 2013 | 20 | A+ | 89% |
| Austin Hedges | 2013 | 20 | A+ | 89% |
| Mike Olt | 2013 | 24 | AAA | 87% |
| Miguel Sano | 2013 | 20 | AA | 83% |
| George Springer | 2013 | 23 | AA | 82% |
| Mason Williams | 2013 | 21 | A+ | 78% |
| Trevor Story | 2013 | 20 | A+ | 61% |
| Bubba Starling | 2013 | 20 | A | 61% |
| Courtney Hawkins | 2013 | 19 | A+ | 58% |
| Roman Quinn | 2013 | 20 | A | 58% |

2012 Top 100 Prospects

Player | Year | Age | Level | MLB Probability
Jurickson Profar | 2012 | 19 | AA | 99.975%
Anthony Rizzo | 2012 | 22 | AAA | 99.947%
Manny Machado | 2012 | 19 | AA | 99.937%
Billy Hamilton | 2012 | 21 | AA | 99.856%
Oscar Taveras | 2012 | 20 | AA | 99.827%
Kolten Wong | 2012 | 21 | AA | 99.824%
Nolan Arenado | 2012 | 21 | AA | 99.759%
Leonys Martin | 2012 | 24 | AAA | 99.737%
Nick Franklin | 2012 | 21 | AA | 99.737%
Yasmani Grandal | 2012 | 23 | AAA | 99.714%
Wil Myers | 2012 | 21 | AAA | 99.659%
Andrelton Simmons | 2012 | 22 | AA | 99.566%
Travis D’Arnaud | 2012 | 23 | AAA | 99.512%
Jedd Gyorko | 2012 | 23 | AAA | 99.493%
Hak-Ju Lee | 2012 | 21 | AA | 99.492%
Jonathan Singleton | 2012 | 20 | AA | 99.482%
Nick Castellanos | 2012 | 20 | AA | 99.465%
Jonathan Schoop | 2012 | 20 | AA | 99.443%
Jean Segura | 2012 | 22 | AA | 99.423%
Nick Castellanos | 2012 | 20 | A+ | 99.051%
Starling Marte | 2012 | 23 | AAA | 99.015%
Anthony Gose | 2012 | 21 | AAA | 99%
Rymer Liriano | 2012 | 21 | AA | 99%
Jake Marisnick | 2012 | 21 | AA | 99%
Xander Bogaerts | 2012 | 19 | A+ | 98%
Michael Choice | 2012 | 22 | AA | 98%
Gary Brown | 2012 | 23 | AA | 98%
Christian Yelich | 2012 | 20 | A+ | 98%
Nick Franklin | 2012 | 21 | AAA | 97%
Javier Baez | 2012 | 19 | A | 97%
Brett Jackson | 2012 | 23 | AAA | 96%
Zack Cox | 2012 | 23 | AAA | 92%
Mason Williams | 2012 | 20 | A | 91%
Gary Sanchez | 2012 | 19 | A | 89%
Jake Marisnick | 2012 | 21 | A+ | 88%
Francisco Lindor | 2012 | 18 | A | 88%
Cheslor Cuthbert | 2012 | 19 | A+ | 87%
Miguel Sano | 2012 | 19 | A | 86%
Billy Hamilton | 2012 | 21 | A+ | 83%
George Springer | 2012 | 22 | A+ | 80%
Christian Villanueva | 2012 | 21 | A+ | 80%
Mike Olt | 2012 | 23 | AA | 79%
Matt Szczur | 2012 | 22 | A+ | 78%
Rymer Liriano | 2012 | 21 | A+ | 76%
Blake Swihart | 2012 | 20 | A | 66%
Cory Spangenberg | 2012 | 21 | A+ | 64%
Bubba Starling | 2012 | 19 | R | 17%

2011 Top 100 Prospects

Player | Year | Age | Level | MLB Probability
Mike Trout | 2011 | 19 | AA | 99.973%
Brett Lawrie | 2011 | 21 | AAA | 99.969%
Anthony Rizzo | 2011 | 21 | AAA | 99.911%
Wil Myers | 2011 | 20 | AA | 99.654%
Christian Colon | 2011 | 22 | AA | 99.495%
Brandon Belt | 2011 | 23 | AAA | 99.414%
Austin Romine | 2011 | 22 | AA | 99.393%
Jesus Montero | 2011 | 21 | AAA | 99.379%
Devin Mesoraco | 2011 | 23 | AAA | 99.205%
Brett Jackson | 2011 | 22 | AAA | 99.199%
Dustin Ackley | 2011 | 23 | AAA | 99.196%
Yonder Alonso | 2011 | 24 | AAA | 99%
Lonnie Chisenhall | 2011 | 22 | AAA | 99%
Zack Cox | 2011 | 22 | AA | 98%
Jason Kipnis | 2011 | 24 | AAA | 98%
Mike Moustakas | 2011 | 22 | AAA | 98%
Desmond Jennings | 2011 | 24 | AAA | 98%
Jonathan Villar | 2011 | 20 | AA | 98%
Matt Dominguez | 2011 | 21 | AAA | 98%
Jurickson Profar | 2011 | 18 | A | 97%
Bryce Harper | 2011 | 18 | A | 97%
Tony Sanchez | 2011 | 23 | AA | 97%
Dee Gordon | 2011 | 23 | AAA | 97%
Grant Green | 2011 | 23 | AA | 97%
Manny Machado | 2011 | 18 | A+ | 97%
Nolan Arenado | 2011 | 20 | A+ | 96%
Chris Carter | 2011 | 24 | AAA | 96%
Travis D’Arnaud | 2011 | 22 | AA | 96%
Wilmer Flores | 2011 | 19 | A+ | 95%
Jose Iglesias | 2011 | 21 | AAA | 95%
Hak-Ju Lee | 2011 | 20 | A+ | 94%
Brett Jackson | 2011 | 22 | AA | 93%
Jonathan Singleton | 2011 | 19 | A+ | 92%
Joe Benson | 2011 | 23 | AA | 91%
Gary Sanchez | 2011 | 18 | A | 86%
Wilin Rosario | 2011 | 22 | AA | 86%
Nick Castellanos | 2011 | 19 | A | 85%
Nick Franklin | 2011 | 20 | A+ | 83%
Jean Segura | 2011 | 21 | A+ | 82%
Cesar Puello | 2011 | 20 | A+ | 82%
Derek Norris | 2011 | 22 | AA | 76%
Jonathan Villar | 2011 | 20 | A+ | 73%
Aaron Hicks | 2011 | 21 | A+ | 68%
Billy Hamilton | 2011 | 20 | A | 61%
Miguel Sano | 2011 | 18 | R | 44%
Josh Sale | 2011 | 19 | R | 15%

Next, let’s take a look at some of the highest KATOH scores of all time, namely those of players who received a score of at least 99.9%. There aren’t any complete busts among these players, as virtually all of them went on to play in the majors.

All-Time Top KATOH Scores

Player | Year | Age | Level | MLB Probability
Sean Burroughs | 2000 | 19 | AA | 99.998%
Luis Castillo | 1996 | 20 | AA | 99.995%
Fernando Martinez | 2007 | 18 | AA | 99.994%
Daric Barton | 2005 | 19 | AA | 99.992%
Alex Rodriguez | 1995 | 19 | AAA | 99.992%
Carl Crawford | 2001 | 19 | AA | 99.992%
Elvis Andrus | 2008 | 19 | AA | 99.992%
Adam Dunn | 2001 | 21 | AAA | 99.990%
Joe Mauer | 2003 | 20 | AA | 99.989%
Ryan Sweeney | 2005 | 20 | AA | 99.984%
Nick Johnson | 1999 | 20 | AA | 99.984%
Jose Tabata | 2009 | 20 | AA | 99.983%
Jose Tabata | 2008 | 19 | AA | 99.983%
Travis Snider | 2009 | 21 | AAA | 99.981%
Joaquin Arias | 2005 | 20 | AA | 99.980%
Matt Kemp | 2006 | 21 | AAA | 99.979%
Jose Reyes | 2002 | 19 | AA | 99.979%
Jurickson Profar | 2012 | 19 | AA | 99.975%
Mike Trout | 2011 | 19 | AA | 99.973%
Jay Bruce | 2008 | 21 | AAA | 99.971%
Brett Lawrie | 2011 | 21 | AAA | 99.969%
B.J. Upton | 2004 | 19 | AAA | 99.959%
Howie Kendrick | 2006 | 22 | AAA | 99.951%
Ryan Howard | 2005 | 25 | AAA | 99.951%
Dioner Navarro | 2004 | 20 | AA | 99.950%
Luis Rivas | 1999 | 19 | AA | 99.949%
Lastings Milledge | 2005 | 20 | AA | 99.948%
Anthony Rizzo | 2012 | 22 | AAA | 99.947%
Billy Butler | 2006 | 20 | AA | 99.946%
Fernando Martinez | 2008 | 19 | AA | 99.944%
Alberto Callaspo | 2004 | 21 | AA | 99.944%
Jose Lopez | 2003 | 19 | AA | 99.939%
Freddie Freeman | 2010 | 20 | AAA | 99.939%
Manny Machado | 2012 | 19 | AA | 99.937%
Rickie Weeks | 2005 | 22 | AAA | 99.935%
Casey Kotchman | 2004 | 21 | AAA | 99.932%
Eric Chavez | 1998 | 20 | AAA | 99.930%
Adrian Beltre | 1998 | 19 | AA | 99.927%
Shannon Stewart | 1995 | 21 | AA | 99.917%
Anthony Rizzo | 2011 | 21 | AAA | 99.911%
Karim Garcia | 1995 | 19 | AAA | 99.910%
Jay Bruce | 2007 | 20 | AAA | 99.907%
Jeff Clement | 2008 | 24 | AAA | 99.902%
Miguel Cabrera | 2003 | 20 | AA | 99.900%

All of the players who registered a KATOH score of at least 99.9% did so while playing in either Double- or Triple-A. This isn’t all that surprising since these are the levels closest to the big leagues. But what about the lower levels? Like we saw in Double- and Triple-A, there weren’t any complete busts among the highest-ranking hitters from full-season A-ball. For both full-season leagues, each of the 20 top-ranked players has either made it to the majors or, in the case of Carlos Correa, is young enough to still have an excellent chance to do so. But on the bottom two rungs of the minor league ladder, we come across a few instances where KATOH whiffed, most notably in Garrett Guzman (74%), Richard Stuart (72%), and Pat Manning (72%).

Top KATOH Scores for Seasons in High-A

Player | Year | Age | Level | MLB Probability
Adrian Beltre | 1997 | 18 | A+ | 99.863%
Andruw Jones | 1996 | 19 | A+ | 99.568%
Giancarlo Stanton | 2009 | 19 | A+ | 99.405%
Billy Butler | 2005 | 19 | A+ | 99.348%
Miguel Sano | 2013 | 20 | A+ | 99.335%
Chris Snelling | 2001 | 19 | A+ | 99.241%
Jason Heyward | 2009 | 19 | A+ | 99.097%
Andy LaRoche | 2005 | 21 | A+ | 99.091%
Wilmer Flores | 2010 | 18 | A+ | 99.075%
Nick Castellanos | 2012 | 20 | A+ | 99.051%
Jose Reyes | 2002 | 19 | A+ | 99%
Casey Kotchman | 2003 | 20 | A+ | 99%
Vernon Wells | 1999 | 20 | A+ | 99%
Travis Lee | 1997 | 22 | A+ | 99%
Brandon Wood | 2005 | 20 | A+ | 98%
Xander Bogaerts | 2012 | 19 | A+ | 98%
Justin Huber | 2003 | 20 | A+ | 98%
Aramis Ramirez | 1997 | 19 | A+ | 98%
Jay Bruce | 2007 | 20 | A+ | 98%
Byron Buxton | 2013 | 19 | A+ | 98%

Top KATOH Scores for Seasons in Low-A

Player | Year | Age | Level | MLB Probability
Mike Trout | 2010 | 18 | A | 99%
Adrian Beltre | 1996 | 17 | A | 98%
Jurickson Profar | 2011 | 18 | A | 97%
Bryce Harper | 2011 | 18 | A | 97%
Sean Burroughs | 1999 | 18 | A | 97%
Andruw Jones | 1995 | 18 | A | 97%
Byron Buxton | 2013 | 19 | A | 97%
Jason Heyward | 2008 | 18 | A | 97%
Corey Patterson | 1999 | 19 | A | 97%
Vladimir Guerrero | 1995 | 20 | A | 97%
Javier Baez | 2012 | 19 | A | 97%
Ian Stewart | 2004 | 19 | A | 96%
Lastings Milledge | 2004 | 19 | A | 96%
Carlos Correa | 2013 | 18 | A | 96%
Prince Fielder | 2003 | 19 | A | 96%
Delmon Young | 2004 | 18 | A | 96%
Josh Vitters | 2009 | 19 | A | 96%
Chad Hermansen | 1996 | 18 | A | 95%
Wilmer Flores | 2010 | 18 | A | 95%
B.J. Upton | 2003 | 18 | A | 95%

Top KATOH Scores for Seasons in Short-Season A

Player | Year | Age | Level | MLB Probability | Played in Majors (1 = yes, 0 = no)
Chris Snelling | 1999 | 17 | A- | 82% | 1
Richard Stuart | 1996 | 19 | A- | 72% | 0
Aramis Ramirez | 1996 | 18 | A- | 71% | 1
Ryan Kalish | 2007 | 19 | A- | 71% | 1
Cory Spangenberg | 2011 | 20 | A- | 66% | 0
Hanley Ramirez | 2002 | 18 | A- | 66% | 1
Wilson Betemit | 2000 | 18 | A- | 65% | 1
Ismael Castro | 2002 | 18 | A- | 65% | 0
Vernon Wells | 1997 | 18 | A- | 64% | 1
Carlos Figueroa | 2000 | 17 | A- | 61% | 0
Carson Kelly | 2013 | 18 | A- | 61% | 0
Pablo Sandoval | 2005 | 18 | A- | 60% | 1
Dan Vogelbach | 2012 | 19 | A- | 59% | 0
Manny Ravelo | 2000 | 18 | A- | 57% | 0
Chip Ambres | 1999 | 19 | A- | 57% | 1
Maikel Franco | 2011 | 18 | A- | 55% | 0
Jurickson Profar | 2010 | 17 | A- | 55% | 1
Derek Norris | 2008 | 19 | A- | 54% | 1
Cesar Saba | 1999 | 17 | A- | 54% | 0
Edinson Rincon | 2009 | 18 | A- | 52% | 0

Top KATOH Scores for Seasons in Rookie ball

Player | Year | Age | Level | MLB Probability | Played in Majors (1 = yes, 0 = no)
Jeff Bianchi | 2005 | 18 | R | 76% | 1
Justin Morneau | 2000 | 19 | R | 74% | 1
Addison Russell | 2012 | 18 | R | 74% | 0
Garrett Guzman | 2001 | 18 | R | 74% | 0
James Loney | 2002 | 18 | R | 74% | 1
Prince Fielder | 2002 | 18 | R | 73% | 1
Pat Manning | 1999 | 19 | R | 72% | 0
Wilmer Flores | 2008 | 16 | R | 70% | 1
Alex Fernandez | 1998 | 17 | R | 70% | 0
Dorssys Paulino | 2012 | 17 | R | 69% | 0
Tony Blanco | 2000 | 18 | R | 69% | 1
Hank Blalock | 1999 | 18 | R | 69% | 1
Joe Mauer | 2001 | 18 | R | 69% | 1
Hanley Ramirez | 2002 | 18 | R | 69% | 1
Ramon Hernandez | 1995 | 19 | R | 68% | 1
Angel Salome | 2005 | 19 | R | 68% | 1
Marcos Vechionacci | 2004 | 17 | R | 67% | 0
Gary Sanchez | 2010 | 17 | R | 66% | 0
Scott Heard | 2000 | 18 | R | 65% | 0
Jose Tabata | 2005 | 16 | R | 65% | 1

Now for KATOH’s biggest whiffs. Looking at seasons prior to 2011, the following players had very high KATOH ratings, but never made it to baseball’s highest level. The biggest miss was Cesar King, a defensive-minded catcher from the Rangers organization. Though to KATOH’s credit, King did spend five days on the Kansas City Royals’ roster in 2001 without getting into a game. Following King are a couple of busted Yankees prospects in Jackson Melian and Eric Duncan. Not to make excuses for KATOH, but these guys’ high scores may have had something to do with the way the Yankees over-hyped their prospects back then. If those two weren’t on Baseball America’s top 100 list, KATOH would have pegged them in the 70s, rather than in the high 90s.

KATOH’s Biggest Misses

Player | Year | Age | Level | MLB Probability
Cesar King | 1998 | 20 | AA | 99.427%
Jackson Melian | 2000 | 20 | AA | 99%
Eric Duncan | 2005 | 20 | AA | 98%
Matt Moses | 2006 | 21 | AA | 98%
Juan Williams | 1995 | 21 | AA | 98%
Jeff Natale | 2005 | 22 | AA | 97%
Eric Duncan | 2006 | 21 | AA | 97%
Nick Weglarz | 2010 | 22 | AAA | 96%
Nick Weglarz | 2009 | 21 | AA | 96%
Tony Mota | 1999 | 21 | AA | 95%
Micah Franklin | 1998 | 26 | AAA | 94%
Billy Martin | 2003 | 27 | AAA | 94%
Bill McCarthy | 2004 | 24 | AAA | 94%
Jackson Melian | 1999 | 19 | A+ | 94%
Tagg Bozied | 2004 | 24 | AAA | 94%
Kevin Grijak | 1995 | 23 | AAA | 93%
Angel Villalona | 2008 | 17 | A | 93%
Danny Dorn | 2010 | 25 | AAA | 93%
Nic Jackson | 2003 | 23 | AAA | 92%
Pat Cline | 1997 | 22 | AA | 92%

And here are the major leaguers whom KATOH deemed least likely to make it when they were in the minors. It’s worth noting that a couple of them, Jorge Sosa and Jason Roach, made it as pitchers.

Worst KATOH Scores Who Made it to the Majors

Player | Year | Age | Level | MLB Probability
Justin Christian | 2004 | 24 | A- | 0.017%
Jorge Sosa | 1999 | 21 | A- | 0.027%
Tyler Graham | 2006 | 22 | A- | 0.087%
Gary Johnson | 1999 | 23 | A- | 0.136%
Bo Hart | 1999 | 22 | A- | 0.155%
Tommy Manzella | 2005 | 22 | A- | 0.181%
Michael Martinez | 2006 | 23 | A- | 0.185%
Eddy Rodriguez | 2012 | 26 | A+ | 0.194%
Kevin Mahar | 2004 | 23 | A- | 0.215%
Will Venable | 2005 | 22 | A- | 0.232%
Brent Dlugach | 2004 | 21 | A- | 0.268%
Sean Barker | 2002 | 22 | A- | 0.270%
Steve Holm | 2002 | 22 | A- | 0.301%
Edgar V. Gonzalez | 2000 | 22 | A- | 0.315%
Peter Zoccolillo | 1999 | 22 | A- | 0.328%
Konrad Schmidt | 2007 | 22 | A- | 0.337%
Tommy Medica | 2010 | 22 | A- | 0.365%
Brian Esposito | 2008 | 29 | AA | 0.392%
Jason Roach | 1997 | 21 | A- | 0.396%
Jorge Sosa | 2000 | 22 | A- | 0.439%

KATOH’s far from perfect, but overall, I think it does a pretty decent job of forecasting which players will make it to the majors. That being said, it’s still a work in progress, and I have a few ideas rolling around in my head to improve on the model. Furthermore, I’m working to develop something that will forecast how a minor leaguer will perform upon reaching the majors, to complement his MLB%. I’ll be dropping these new and improved KATOH projections (for both hitters and pitchers) after this year’s World Series, when we’ll all be desperate for something baseball-related to get us through the winter.

Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. So far, I’ve analyzed hitters in Rookie leagues, Low-A, High-A, Double-A and Triple-A using a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes — such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
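For readers who want to see the mechanics, here’s a minimal Python sketch of how a probit model turns inputs into a probability. The variable names and every coefficient are hypothetical placeholders of my own choosing, not KATOH’s fitted values. The only standard piece is the probit link itself, which passes a weighted sum of the inputs through the standard normal CDF.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(intercept: float, coefs: dict, inputs: dict) -> float:
    """Probit link: P(event) = Phi(intercept + sum of coef * input)."""
    z = intercept + sum(coefs[name] * inputs[name] for name in coefs)
    return normal_cdf(z)

# Hypothetical coefficients, for illustration only (NOT KATOH's values).
INTERCEPT = 5.0
COEFS = {"age": -0.25, "strikeout_rate": -2.0, "iso": 3.0, "top_100": 1.0}

# Two made-up hitters: a young top-100 prospect and an older non-prospect.
young_prospect = {"age": 19, "strikeout_rate": 0.18, "iso": 0.200, "top_100": 1}
older_non_prospect = {"age": 23, "strikeout_rate": 0.25, "iso": 0.100, "top_100": 0}

p_young = probit_probability(INTERCEPT, COEFS, young_prospect)
p_older = probit_probability(INTERCEPT, COEFS, older_non_prospect)
print(f"young prospect: {p_young:.1%}")
print(f"older non-prospect: {p_older:.1%}")
```

Because the normal CDF flattens out at both ends, the same bump in a stat moves a middling probability far more than it moves one already near 0% or 100%.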

For hitters in Low-A and High-A, age, strikeout rate, ISO, BABIP, and whether or not he was deemed a top 100 prospect by Baseball America all played a role in forecasting future success. And walk rate, while not predictive for players in Rookie ball, Low-A, or High-A, added a little bit to the model for Double-A and Triple-A hitters. Today, I’ll look into what KATOH has to say about players in Short-Season A-ball. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted to reflect his league’s average for that year. For those interested, here’s the R output based on all players with at least 200 plate appearances in a season in SS A-ball from 1995-2007.
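Here’s one way that kind of league adjustment could look in Python. This sketch assumes the adjustment is a simple ratio of a player’s rate to his league’s rate for the same year. That choice, the `league_adjusted` and `adjust_player` helpers, and the sample numbers are all assumptions for illustration, since the exact formula isn’t spelled out above.

```python
# Assumed form of the adjustment: player rate divided by league rate.
# The league averages below are hypothetical sample values.
LEAGUE_AVERAGES = {
    ("New York-Penn", 2005): {"iso": 0.100, "strikeout_rate": 0.200},
    ("Northwest", 2005): {"iso": 0.125, "strikeout_rate": 0.220},
}

def league_adjusted(player_rate: float, league_rate: float) -> float:
    """Express a player's rate relative to his league (1.0 = league average)."""
    if league_rate <= 0:
        raise ValueError("league_rate must be positive")
    return player_rate / league_rate

def adjust_player(stats: dict, league: str, year: int) -> dict:
    """Adjust every stat in `stats` against that league-year's averages."""
    averages = LEAGUE_AVERAGES[(league, year)]
    return {name: league_adjusted(value, averages[name])
            for name, value in stats.items()}

# The same raw line looks better in the lower-offense league.
hitter = {"iso": 0.150, "strikeout_rate": 0.180}
nypl_view = adjust_player(hitter, "New York-Penn", 2005)
nwl_view = adjust_player(hitter, "Northwest", 2005)
print(nypl_view)
print(nwl_view)
```

A ratio keeps every stat on the same "1.0 means average" scale, which makes seasons from different years and leagues comparable before they go into the regression.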

Just like we saw with hitters in Rookie ball, a player’s Baseball America prospect status couldn’t tell us anything about his future as a big leaguer. This was entirely due to the scarcity of top 100 prospects in the sample, as only a handful of players spent the year in SS A-ball after making BA’s top 100 list. Somewhat surprisingly, walk rate is predictive for players in SS-A, despite being statistically insignificant for hitters in Rookie ball and the more advanced A-ball levels. Another interesting wrinkle is the “Strikeout_Rate:Age” variable. Basically, this says that strikeout rate matters more for younger players than for older players at this level, although frequent strikeouts are obviously a bad thing no matter how old you are.
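The interaction term is easier to see with numbers. In the Python sketch below, the coefficient on strikeout rate and the coefficient on “Strikeout_Rate:Age” are invented for illustration (the signs are picked to match the description above, the magnitudes are not KATOH’s), and the functions show how the two combine into an age-dependent penalty inside the model’s weighted sum.

```python
# Hypothetical coefficients, chosen only so the pattern matches the text:
# strikeouts always hurt, but they hurt more for younger players.
B_STRIKEOUT = -12.0     # main effect of strikeout rate (made up)
B_STRIKEOUT_AGE = 0.5   # "Strikeout_Rate:Age" interaction (made up)

def strikeout_slope(age: float) -> float:
    """Change in the weighted sum per unit of strikeout rate at a given age."""
    return B_STRIKEOUT + B_STRIKEOUT_AGE * age

def strikeout_contribution(strikeout_rate: float, age: float) -> float:
    """Combined main-effect and interaction terms for one player."""
    return strikeout_slope(age) * strikeout_rate

# Same 25% strikeout rate, three different ages.
for age in (18, 20, 22):
    slope = strikeout_slope(age)
    penalty = strikeout_contribution(0.25, age)
    print(f"age {age}: slope {slope:+.1f}, contribution {penalty:+.2f}")
```

With these made-up numbers the slope stays negative at every age in the range but shrinks toward zero as players get older, which is the "matters more for younger players" pattern.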

The season is less than 50 games old for most teams in the New York-Penn and Northwest Leagues, which makes it a little premature to start analyzing players’ stats. But just for kicks, here’s a look at what KATOH says about this year’s crop of players with at least 100 plate appearances through July 28th. The full list of players can be found here, and you’ll find an excerpt of those who broke the 40% barrier below:

Player | Organization | Age | MLB Probability
Rowan Wick | STL | 21 | 82%
Eduard Pinto | TEX | 19 | 68%
Marcus Greene | TEX | 19 | 60%
Mauricio Dubon | BOS | 19 | 59%
Franklin Barreto | TOR | 18 | 57%
Christian Arroyo | SFG | 19 | 57%
Skyler Ewing | SFG | 21 | 56%
Taylor Gushue | PIT | 20 | 55%
Domingo Leyba | DET | 18 | 55%
Raudy Read | WSN | 20 | 53%
Nick Longhi | BOS | 18 | 52%
Andrew Reed | HOU | 21 | 52%
Danny Mars | BOS | 20 | 51%
Amed Rosario | NYM | 18 | 49%
Yairo Munoz | OAK | 19 | 48%
Seth Spivey | TEX | 21 | 47%
Mike Gerber | DET | 21 | 47%
Mark Zagunis | CHC | 21 | 47%
Kevin Krause | PIT | 21 | 46%
Leo Castillo | CLE | 20 | 45%
Jordan Luplow | PIT | 20 | 45%
Mason Davis | MIA | 21 | 40%
Kevin Ross | PIT | 20 | 40%
Franklin Navarro | DET | 19 | 40%

As we saw with Rookie league hitters, KATOH doesn’t think any of these players are shoo-ins to make it to the majors. Even Rowan Wick, who hit a Bondsian .378/.475/.815 before getting promoted, gets just 82%. This goes to show that SS A-ball stats just aren’t all that meaningful.

Once the season’s over, I’ll re-run everything using the final 2014 stats, which will give us a better sense of which prospects had the most promising years statistically. I also plan to engineer an alternative methodology — to supplement this one — that will take into account how a player performs in the majors, rather than his just getting there. Additionally, I hope to create something similar for projecting pitchers based on their statistical performance. In the meantime, I’ll apply the KATOH model to historical prospects and highlight some of its biggest “hits” and “misses” from years past. Keep an eye out for the next post in the coming days.