Japanese Projections – Part 1: Background

Predicting the future is hard, but it can be a lot of fun.

Projection is the art (fools say science) of predicting the future when applied to baseball players. Generally, it involves looking at what a player has done in the past, especially in the recent past, adjusting for various things (like age, injuries, et cetera), and applying an algorithm or set of algorithms to create a set of magic numbers. These magic numbers, though based on reality, can have tragic consequences. For instance, two years ago I reasoned that the $10m the White Sox committed to Jermaine Dye for 2005 and 2006 might have been better spent on Jose Cruz Jr. For those keeping score:

Oh well, one outta two ain’t bad, right? Point being, at least for idiots like me, it’s hard to predict the future for major league players – there is a lot of variability, for lots of reasons. And it gets harder.

If major league players are hard to project, what about Japanese league players? The first step in our quixotic attempt to predict the future for this small subset of players is translating their Japanese statistics into a context we can understand. Although we have access to plenty of Japanese statistics, we aren’t quite sure what they mean. For instance, a major league pitcher with a 3.00 ERA last year performed very well indeed. A 3.00 ERA in Nippon Professional Baseball isn’t as good; how good is it?

In attempting to answer that question, we run into many problems. First, the sample of players who play in both NPB and MLB is very small: only a few players each year go in one direction or the other. Compare that with the minor leagues; every American-born player who played a major league game in 2006 spent some time in the minors… and we still have trouble projecting minor league players! Also, the sample is biased; players who go from MLB to Japan tend to do so because they weren’t good enough to play in MLB any more, whereas Japanese players come to America because they were too good for the league. Unlike minor league translations, which usually compare players across multiple minor league levels in the same year, Japanese translations rely on comparing performance in separate years, even though much might have changed in the interim.

Aside: Imagine for a moment that Bizarro World Andruw Jones spent 2005 and 2006 (after learning his new batting stance and thus gaining lots of power at the plate) in Japan. We can assume he would have put up astronomical numbers: at least 60-65 home runs, for instance, despite the shortened season. Jones in 2003 and 2004 had a slugging percentage of .500; in 2005 and 2006, he slugged .553. Thus, translating Bizarro Jones’ stats to MLB (by comparing them with 2003-2004 Jones) would overestimate MLB’s dampening effect on SLG by roughly 10%. That is the size of the improvement in his slugging percentage (.500 to .553, or about 10.6%), which was exogenous to league difficulty.
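To put rough numbers on that aside, here is a minimal Python sketch. The two MLB slugging percentages come from the paragraph above; the Japanese slugging percentage is a made-up placeholder (Bizarro Jones never existed, after all), and the function name is just for illustration. The point is only that the size of the bias does not depend on what he would have slugged in Japan.

```python
# Andruw Jones's MLB slugging percentages, as quoted in the text.
SLG_MLB_BEFORE = 0.500  # 2003-2004, before the new stance
SLG_MLB_AFTER = 0.553   # 2005-2006, after the new stance

# ASSUMPTION: an invented Japanese SLG for Bizarro Jones in 2005-2006.
SLG_JAPAN_HYPOTHETICAL = 0.700


def slg_factor(mlb_slg, npb_slg):
    """Translation factor: multiply an NPB SLG by this to estimate MLB SLG."""
    return mlb_slg / npb_slg


# What we would compute by comparing Japan 2005-06 with MLB 2003-04.
naive = slg_factor(SLG_MLB_BEFORE, SLG_JAPAN_HYPOTHETICAL)
# What we would compute if we could compare the same version of the player.
fair = slg_factor(SLG_MLB_AFTER, SLG_JAPAN_HYPOTHETICAL)

# His real-world improvement, unrelated to league difficulty.
improvement = SLG_MLB_AFTER / SLG_MLB_BEFORE - 1
# How far the naive factor falls short of the fair one.
shortfall = 1 - naive / fair

print(f"naive factor: {naive:.3f}, fair factor: {fair:.3f}")
print(f"improvement in SLG: {improvement:.1%}")
print(f"naive factor is too low by: {shortfall:.1%}")
```

Change the hypothetical Japanese figure to anything you like; the naive factor comes out too low by the same proportion, because the Japanese number cancels out of the ratio.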

In addition to all the factors mentioned above, there are factors that are unquantifiable. Japan and the United States are very different places; how much of a player’s failure to hit stems from the relative difference in the level of competition, and how much stems from him not being able to find his favorite foods, from not having anyone in the clubhouse to talk to, from not being able to converse with 99% of the people he meets?

Translations are a crude way of adjusting for these factors in one fell swoop. We know they aren’t very good, but they’re the best we have. Jim Albright came up with these translations for pitchers:

League                 Hits     Homers   Walks    Strikeouts
Japan                  14624    1545     5832     10963
Majors                 15737    1910     6252     9695
Adjustment factors*    1.076    1.236    1.072    0.884

These are matched-innings translations. That is, they assume that the pitchers in both leagues pitched the same number of innings in each case. Each factor is the Majors total divided by the Japan total (for hits, 15737 / 14624 ≈ 1.076). If Pitcher A gave up 100 hits in 100 innings in Japan, we’d expect him to give up about 108 hits (100 × 1.076 ≈ 107.6) in 100 innings in America.
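For anyone who wants to play along at home, the arithmetic can be sketched in a few lines of Python. The league totals are the ones in the table; the function names and Pitcher A's full stat line are my own inventions for illustration (only his 100 hits in 100 innings appear in the text), so treat this as a rough sketch rather than Albright's actual procedure.

```python
# League totals from the matched-innings table.
JAPAN = {"hits": 14624, "homers": 1545, "walks": 5832, "strikeouts": 10963}
MAJORS = {"hits": 15737, "homers": 1910, "walks": 6252, "strikeouts": 9695}


def adjustment_factors(japan=JAPAN, majors=MAJORS):
    """Majors total divided by Japan total, for each category."""
    return {stat: majors[stat] / japan[stat] for stat in japan}


def translate(npb_line, factors):
    """Scale an NPB pitching line to an MLB estimate over the same innings."""
    return {stat: value * factors[stat] for stat, value in npb_line.items()}


FACTORS = adjustment_factors()

# ASSUMPTION: a made-up 100-inning line for Pitcher A in Japan.
pitcher_a = {"hits": 100, "homers": 10, "walks": 30, "strikeouts": 90}
translated = translate(pitcher_a, FACTORS)

for stat, value in pitcher_a.items():
    print(f"{stat:<11}{value:>5} -> {translated[stat]:6.1f}  (x{FACTORS[stat]:.3f})")
```

Keeping the factors in a dictionary keyed by category means a second set of translation factors can be swapped in later without touching the rest of the code.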

Because I’m stubborn, I like to reinvent the wheel. I plan on taking a second look at these translation factors later. For now, they’ll be an easy way of giving us some translated data we can use to take a stab at the question on everyone’s mind: how will Daisuke Matsuzaka do in the major leagues?