Archive for the ‘japan’ Category

According to Jon Heyman of Sports Illustrated, Hanshin accepted a bid of $25 million for the posting rights to Kei Igawa. We should learn who the lucky winner was later today. Last month, I predicted a posting fee of $5-$10m for Igawa… yet another meal of ground glass to eat. Yum!

A few days ago, this blog took your predictions for who would win the rights to Daisuke Matsuzaka. As ESPN reported tonight, the Red Sox won his rights for a whopping $51.1 million. That means the winner of our contest was… no one! That’s right, no one predicted that the Red Sox would outbid everyone. The closest prediction in terms of dollars goes to “George Steinbrenner”, who “predicted” the Devil Rays would win the auction with a bid of $52,000,000.01. George, I’m not sure if you were serious, but that’s deadly accurate (sort of).

Good luck to the Red Sox; here’s hoping Matsuzaka is the ace that the Red Sox Nation will expect him to be.

Wow. Olney, citing “officials monitoring the bidding”, says the Sox “may” have offered between $38 million and $45 million for his rights. That would be higher than every prediction in our contest except those of Steinbrenner and Dr. Evil (no commentary on the seriousness of their predictions).

It’s impossible to not enjoy all the hedging in these rumors-made-stories. In other news, Angelina Jolie “may” leave Brad Pitt to hang out with this author. It’s possible! The rumor the ST has says the Rangers may have offered “close to $30 million.”

This post has been made largely irrelevant by the excellent work of Jeff Sackmann over at the Hardball Times (Jeff is also perhaps the primary reason for the author’s GMAT score – thanks again for the great blog, Jeff!). Even so, this subject deserves attention.

Akinori Iwamura has been posted by his team and is pretty much guaranteed to be on an MLB team next year. Earlier, we used the factors Aaron Gleeman developed for his Kenji Johjima projection in an off-the-cuff Iwamura projection. Today we’ll be more rigorous.

Most translation systems include data from players who go both from NPB to MLB and from MLB to NPB. Instead, we will only use data from NPB players born in Japan who later played in MLB. The sample, unfortunately, is quite small: it includes Ichiro Suzuki, Hideki Matsui, Tsuyoshi Shinjo, Kazuo Matsui, Tadahito Iguchi, So Taguchi, and Kenji Johjima. We used data from their final three seasons in Japan and compared that with their first two seasons (assuming they had that many) in America. This was done using matched PAs; thus, in all cases, Japanese numbers were interpolated to match the number of plate appearances each player had in America. Summary data for each statistic gave us the following translations:

To use these factors, simply apply them to an NPB line while holding PA constant (for instance, if player X hits 20 HR in 500 PA in NPB, he’d hit about 10 HR in 500 PA in America). The largest factor, by far, is for HR. Going to America is devastating to NPB home run hitters: they hit homers at roughly half the rate per plate appearance in America as they did in Japan. Interestingly enough, this group struck out less in America than in Japan, which indicates they probably changed their hitting approach significantly.
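For the curious, here is a rough Python sketch of the matched-PA idea. It assumes a factor is the pooled MLB total divided by the pooled NPB total, after scaling each player's NPB numbers to his MLB plate appearances. The function names are made up for illustration, and the only factor plugged in is the roughly 0.5 HR factor implied by the 20 HR to 10 HR example above.

```python
def matched_pa_factor(pairs):
    """Estimate a translation factor for one stat.

    pairs holds one (npb_stat, npb_pa, mlb_stat, mlb_pa) tuple per player.
    Each NPB total is scaled to the player's MLB plate appearances, then
    the pooled MLB total is divided by the pooled, scaled NPB total.
    Pooling by summing is an assumption, not something the post states.
    """
    scaled_npb = sum(npb * mlb_pa / npb_pa for npb, npb_pa, _, mlb_pa in pairs)
    mlb_total = sum(mlb for _, _, mlb, _ in pairs)
    return mlb_total / scaled_npb


def translate_batting(line, factors):
    """Apply factors to an NPB line while holding PA constant.

    Stats with no factor (including PA itself) pass through unchanged.
    """
    return {stat: value * factors.get(stat, 1.0) for stat, value in line.items()}


# The example from the text: 20 HR in 500 PA with an HR factor of about 0.5.
npb_line = {"PA": 500, "HR": 20}
print(translate_batting(npb_line, {"HR": 0.5}))
```

Holding PA constant keeps the translated line directly comparable to the original one, which is why PA never gets a factor of its own.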

Akinori Iwamura

Here are his numbers for 2004-2006 in Japan.

Here are those numbers translated to MLB using the above method.

He loses about 20 points of AVG, 40 points of OBP, and 100 points of SLG on average. Yikes. Here’s a 3-year weighted projection, pro-rated to 160 games played (as he was very durable in Japan).

That’s a little better than my back-of-the-envelope projection from last week, but still not all that great for a third baseman. Basically, it’s a slightly better version of David Bell. Here’s hoping he can play second. He did win another gold glove this year, so that’s something.

Tadahito Iguchi

Let’s try out this method on some other players, even though it’s cheating (you shouldn’t apply a model to the data used in making it). Below are Iguchi’s projected 2005 line (year in italics) and his actual line (below).

Not bad – I’ll take a projection that is within 21 points of OPS anytime. Of course, we expected this to happen, since his numbers were used to make the model.

What about some other Japanese stars? Let’s take a look.

Kosuke Fukudome

This star OF for the Chunichi Dragons probably isn’t making the jump any time soon, but let’s take a look at what he might do in the bigs.

As you can see, his skills transfer very well. Along with his good hitting numbers, Fukudome is an outstanding center fielder; he’d find quite a few suitors in MLB if he wanted to try his luck here. Sadly, he turns 30 next April, so we’ll have missed the prime of his career if and when he ever decides to come over.

Shinnosuke Abe

Abe is the Yomiuri Giants’ starting catcher. He’ll never be posted, so we’ll have to wait for him to come over via free agency if he wants to play in MLB (he won’t be eligible for three more seasons). Here’s his projection.

Abe slugged .630 in 2004, but hasn’t come near that since, and it shows in his projection. His HR have gone 33-26-10 in the past three years. Pass.

Next week, we’ll take another look at pitcher projections using homegrown translations that will hopefully be a little more accurate and/or believable.

5 minutes ago, bidding for Daisuke Matsuzaka’s posting rights officially ended. MLB will forward the amount of the winning bid to Seibu, who has 4 days to decide whether or not to accept it. At that point, the winning team is announced and they can begin to negotiate with Matsuzaka’s agent, Scott Boras.

In the comments, feel free to guess the amount of the winning bid and the team that won. Our predictions after the break:

Earlier, we talked about how hard it is to predict the future. As our old professor Larry Sabato used to say:

He who lives by the crystal ball ends up eating ground glass.

So, with that in mind, let’s fry up a delicious glass omelet!

Jim Albright’s system gives us the means to translate Japanese into American, so to speak: numbers from NPB become numbers from MLB. Of course, the translations overlook several factors. They do not account for park effects, for one. Another: they don’t adjust for age. And finally: they don’t account for league difficulty. These are problems I’ll try to tackle at some point in the future; but for now, we’ll overlook our beauty queen’s gapped teeth and barely noticeable moustaches, and get her ready for the swimwear competition.

The first set of translations is easy: we simply hold IP constant and multiply the other statistics by the translation factors. As mentioned yesterday, 100 hits in 100 IP become about 108 hits in 100 IP. Et cetera. (Stats we don’t have translation factors for, like HBP and WP, were left unchanged.) The largest adjustment turns out to be for home runs; despite the bigger parks in America, pitchers have trouble keeping the ball in the yard when they make the trip over here.

So, we’ve translated hits, home runs, strikeouts, and walks to MLB equivalents. What now? We used Bill James’ component ERA formula to calculate an ERC for each player. Then, from that ERC and the number of innings pitched, we work out how many earned runs the player would have allowed.
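Here is a small Python sketch of that step. It uses the commonly published form of James' component ERA; the post doesn't say exactly which version was used, so treat the constants as an assumption. Intentional walks are assumed to be zero when unknown, and the function names are mine.

```python
def component_era(h, hr, bb, hbp, bfp, ip, ibb=0):
    """Component ERA (ERC) in its commonly published form.

    ip must be decimal innings (200 1/3 innings is 200.333, not 200.1).
    ibb defaults to 0, an assumption for when intentional walks are unknown.
    """
    # PTB: James' estimate of total bases allowed by the pitcher.
    ptb = 0.89 * (1.255 * (h - hr) + 4 * hr) + 0.56 * (bb + hbp - ibb)
    base = (h + bb + hbp) * ptb / (bfp * ip) * 9
    erc = base - 0.56
    # The published formula switches to a multiplier for very low values.
    if erc < 2.24:
        erc = base * 0.75
    return erc


def earned_runs_from_erc(erc, ip):
    """Back out earned runs from an ERC and innings pitched."""
    return erc * ip / 9


# Made-up line, purely to show the call: 200 IP, 200 H, 20 HR, 60 BB, 5 HBP, 850 BFP.
erc = component_era(h=200, hr=20, bb=60, hbp=5, bfp=850, ip=200)
print(round(erc, 2), round(earned_runs_from_erc(erc, 200), 1))
```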

Aside: the ERC formula requires BFP (which we don’t have for all years) as one of its inputs. Using the Lahman database and Excel, I regressed BFP on IPouts, H, BB, K, HBP, and HR. I used the weights from this regression to estimate BFP. For a full season’s worth of hitters (800+ BFP) the calculated value is usually within 5 BFP and rarely further than 15 BFP from the correct value. Click here to see the regression results.

Then, using their historical ratio of R to ER, we take a stab at guessing how many unearned runs they might have allowed in addition. If we wanted, we could also try to guess how many wins and losses a player would have had based on their RA, an assumed team RA and run context. For now, we’ll just ignore them in our translated statistics.

After we have all our translations done, we should adjust everything for age and park. And maybe we will, later. But for now, a simple flat 3/2/1 projection without mean regression will have to suffice. What that means in English: we will assign each of the last three years a weight of either 3 (for the most recent year), 2, or 1. We will then calculate the weighted average for stats like BB, H, K, IP, etc. using that algorithm. ERC, ER, and R are re-calculated as described above. Finally, we will re-calculate starts and innings pitched based on the assumption that Japanese pitchers will throw fewer pitches per start (but start more frequently) in America, and pro-rate other stats accordingly.
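In code, the flat 3/2/1 weighting might look something like the Python sketch below. The function name and the sample seasons are invented for illustration; the rate stats (ERC, ER, R) would still be re-calculated from the weighted components afterward, as described above.

```python
WEIGHTS = (3, 2, 1)  # most recent season first


def weighted_projection(seasons, weights=WEIGHTS):
    """Flat weighted average of counting stats, with no mean regression.

    seasons is a list of dicts ordered most recent first. If fewer than
    three seasons are given, only the matching leading weights are used.
    """
    total_weight = sum(weights[:len(seasons)])
    return {
        stat: sum(w * season[stat] for w, season in zip(weights, seasons)) / total_weight
        for stat in seasons[0]
    }


# Invented numbers, most recent season first.
seasons = [
    {"IP": 200, "K": 180},
    {"IP": 170, "K": 150},
    {"IP": 140, "K": 120},
]
print(weighted_projection(seasons))
```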

Aside: Since the start of the 2000 season, 420 pitchers have started at least 25 games with one team during a season while making no relief appearances. I calculated the average number of batters those pitchers faced per start — it’s 26.7. From this number, we can assume either a number of starts or a number of batters faced and back into innings pitched (and hence other numbers) that way.
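One way to do the backing-in, sketched in Python: pick a number of starts, multiply by 26.7 to get a target number of batters faced, and scale every counting stat in the line by the same ratio. This assumes the line already carries a BFP figure (possibly an estimated one) and that the pitcher's per-batter rates don't change with workload. The function name and the sample line are hypothetical.

```python
BFP_PER_START = 26.7  # average batters faced per start, from the aside above


def rescale_to_starts(line, starts, bfp_per_start=BFP_PER_START):
    """Scale a projected line so that BFP equals starts * bfp_per_start.

    Every counting stat, IP included, is multiplied by the same ratio,
    so per-batter rates are left untouched.
    """
    target_bfp = starts * bfp_per_start
    scale = target_bfp / line["BFP"]
    scaled = {stat: value * scale for stat, value in line.items()}
    scaled["GS"] = starts  # starts are an input here, not a scaled stat
    return scaled


# Invented line: 200 IP and 840 BFP, rescaled to 30 starts.
line = {"GS": 28, "IP": 200.0, "BFP": 840.0, "H": 190.0}
print(rescale_to_starts(line, starts=30))
```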

Kei Igawa

Here’s how Igawa did in Japan the past three years:

Pretty good numbers (although lots of home runs). 228 K in 200 IP looks great. Watch what happens after the translation:

Some good, some bad. Note that Igawa’s BB/K ratios are always pretty good, though he gave up too many baserunners and homers in ’04 and ’05. Hard to find anything wrong with the translated 2006 line, although we find it a tad too optimistic a translation. Keep in mind that GS has not been adjusted, and it’s unlikely that Igawa would have stayed in each start as long as these stats would lead you to believe. We will adjust for that in his projection.

Aside: You might wonder why Igawa’s Japanese ERA was nearly identical in 2004 and 2005 yet translated so differently. The first numbers use his actual Japanese ERA; the second estimate what his ERA would have been in America given his component stats. Thus, despite posting similar ERAs in Japan in 2004 and 2005, Igawa’s components indicate he pitched much better in 2004 than he did the following year.

Now, we project his stats using the model described above. We will assume he makes 30 starts and faces 26.7 batters per start. Also, we assume he plays for a team that scored 4.85 runs per game (splitting the difference between the AL and the NL, as this projection applies to neither league in particular). Finally, we’ll assume he got a decision for every 9 IP and calculate his winning percentage using James’ pythagorean formula with an exponent of 1.82. That gives us this:
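The win-loss part of that recipe can be sketched in a few lines of Python. It assumes the pitcher's total runs allowed per 9 innings stand in for his team's runs allowed per game; the 4.85 runs of support, the 1.82 exponent, and the one-decision-per-9-IP rule come from the paragraph above, while the function name and the sample inputs are made up.

```python
def pythagorean_record(ip, runs_allowed, team_runs_per_game=4.85,
                       exponent=1.82, ip_per_decision=9.0):
    """Estimate wins, losses and winning percentage for a pitcher.

    The pitcher's runs allowed per 9 IP are compared against the assumed
    run support using the pythagorean formula.
    """
    ra9 = runs_allowed * 9 / ip
    scored = team_runs_per_game ** exponent
    allowed = ra9 ** exponent
    win_pct = scored / (scored + allowed)
    decisions = ip / ip_per_decision
    wins = win_pct * decisions
    return wins, decisions - wins, win_pct


# Invented example: 190 IP and 90 total runs allowed.
wins, losses, pct = pythagorean_record(ip=190, runs_allowed=90)
print(round(wins, 1), round(losses, 1), round(pct, 3))
```

A pitcher who allows exactly 4.85 runs per 9 comes out at .500, which is a handy sanity check on the setup.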

The ERA is deceptive – he’s giving up a lot of unearned runs. Basically a league-average starter. This line is somewhat similar to Matt Clement or Jeremy Bonderman ca. 2005. If he can match this projection, he’s worth Jeff Suppan money.

Hiroki Kuroda

He doesn’t strike out a ton of hitters, but he keeps the ball in the park and has great control (his R/ER ratios are surprising – they’re very low for a groundball pitcher, as he reportedly is). Translated:

Those hold up very well, mainly because he doesn’t walk anyone and keeps the ball down. Note the 2005 3.17 translated ERA matches the 3.17 actual ERA by a lucky quirk: his Japanese peripherals suggested he was unlucky to have an ERA as high as it was. Projected to 2007:

A Cy Young candidate in the National League. Two important caveats: first, there’s no age adjustment, and he’s on the bad side of 30. This would cause him to take a hit. Second, it seems unlikely that a guy who relies on control could allow so many balls in play but so few over the fence. This projection is probably at least a run too low.

Kazumi Saitoh

Ahh, my favorite player in NPB. His numbers are fantastic; how will they hold up? Actual numbers:

Not a lot of innings in 2004 and 2005; was he hurt? Tons of runs in 2004 despite pretty good peripherals, too. Translations:

Not a bad 2006, huh? The H/9 looks too low, though. He would have run away with the Cy Young if he put up those numbers in MLB. 2007 projection:

Sign me up! I’m not sure if he would be able to sustain the BABIP, though. I think this projection is a tad optimistic, but I buy it more than Kuroda’s. Note that I didn’t give Saitoh 30 starts, as that would have been a reach given the number of innings he’s thrown recently.

Daisuke Matsuzaka

What we’ve all been waiting for. Actual stats:

Absolutely dominant. 138 hits in 186 innings is incredible. He did miss a few starts in 2004 to injury. Translated:

It’s hard not to get excited. K/BB is still over 5. HR rates are low. Wow. And the projection:

Wonder why teams are bidding $25 million just to talk to this guy? Now you know. He probably won’t be this good — his projected BABIP is too high, for instance. But you never know…

Kei Igawa started last night (the night of the 7th) against the MLB All Stars, according to this article in the Daily Yomiuri. He was wild but effective, allowing five walks but just two runs, and left with the score tied two all. MLB then piled on five runs against Japanese relievers, winning the game 7-2. Igawa also struck out four.

Projection is the art (fools say science) of predicting the future when applied to baseball players. Generally, it involves looking at what a player has done in the past, especially in the recent past, adjusting for various things (like age, injuries, et cetera), and applying an algorithm or set of algorithms to create a set of magic numbers. These magic numbers, though based on reality, can have tragic consequences. For instance, two years ago I reasoned that the $10m the White Sox committed to Jermaine Dye for 2005 and 2006 might have been better spent on Jose Cruz Jr. For those keeping score:

Oh well, one outta two ain’t bad, right? Point being, at least for idiots like me, it’s hard to predict the future for major league players – there is a lot of variability, for lots of reasons. And it gets harder.

If major league players are hard to project, what about Japanese league players? The first step in our quixotic attempt to predict the future for this small subset of players is translating their Japanese statistics into a context we can understand. Although we have access to plenty of Japanese statistics, we aren’t quite sure what they mean. For instance, a major league pitcher with a 3.00 ERA last year performed very well indeed. A 3.00 ERA in Nippon Professional Baseball isn’t as good; how good is it?

In attempting to answer that question, we run into many problems. First, the sample of players who play in both NPB and MLB is very small — only a few players each year go in one direction or another. Compare that with the minor leagues; every American-born player who played a major league game in 2006 spent some time in the minors… and we still have trouble projecting minor league players! Also, the sample is biased; players who go from MLB to Japan tend to do so because they weren’t good enough to play in MLB any more, whereas Japanese players come to America because they were too good for the league. Unlike minor league translations, which usually compare players across multiple minor league levels in the same year, Japanese translations rely on comparing performance in separate years, even though much might have changed in the interim.

Aside: Imagine for a moment that Bizarro World Andruw Jones spent 2005 and 2006 (after learning his new batting stance and thus gaining lots of power at the plate) in Japan. We can assume he would have put up astronomical numbers: at least 60-65 home runs, for instance, despite the shortened season. Jones in 2003 and 2004 had a slugging percentage of .500; in 2005 and 2006, he slugged .553. Thus, translating Bizarro Jones’ stats to MLB (by comparing them with 2003-2004 Jones) would overestimate the dampening effect on SLG of MLB by at least 10%, the improvement in his slugging percentage exogenous to league difficulty.

In addition to all the factors mentioned above, there are factors that are unquantifiable. Japan and the United States are very different places; how much of a player’s failure to hit stems from the relative difference in the level of competition, and how much stems from him not being able to find his favorite foods, from not having anyone in the clubhouse to talk to, from not being able to converse with 99% of the people he meets?

Translations are a crude way of adjusting for these factors in one fell swoop. We know they aren’t very good, but they’re the best we have. Jim Albright came up with these translations for pitchers:

League                 hits     homers   walks    strikeouts
Japan                  14624    1545     5832     10963
Majors                 15737    1910     6252     9695
Adjustment factors*    1.076    1.236    1.072    0.884

These are matched-innings translations. That is, they assume that the pitchers in both leagues pitched the same number of innings in each case. If Pitcher A gave up 100 hits in 100 innings in Japan, we’d expect him to give up 108 hits in 100 innings in America.
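To make the arithmetic concrete, here is a short Python sketch that rebuilds the adjustment factors from the Japan and Majors totals above and applies them to a line while holding innings constant. The totals are Albright's, as quoted; the function name is just for illustration.

```python
# Matched-innings totals quoted above.
JAPAN = {"H": 14624, "HR": 1545, "BB": 5832, "K": 10963}
MAJORS = {"H": 15737, "HR": 1910, "BB": 6252, "K": 9695}

# Each adjustment factor is the Majors total divided by the Japan total.
FACTORS = {stat: MAJORS[stat] / JAPAN[stat] for stat in JAPAN}


def translate_pitching(line, factors=FACTORS):
    """Apply the factors to an NPB pitching line, holding IP constant.

    Stats without a factor (IP, HBP, WP and so on) pass through unchanged.
    """
    return {stat: value * factors.get(stat, 1.0) for stat, value in line.items()}


print({stat: round(f, 3) for stat, f in FACTORS.items()})
# {'H': 1.076, 'HR': 1.236, 'BB': 1.072, 'K': 0.884}

pitcher_a = translate_pitching({"IP": 100, "H": 100})
print(round(pitcher_a["H"]))  # 108
```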

Because I’m stubborn, I like to reinvent the wheel. I plan on taking a second look at these translation factors later. For now, they’ll be an easy way of giving us some translated data we can use to take a stab at the question on everyone’s mind: how will Daisuke Matsuzaka do in the major leagues?

Interesting stuff. The Daily Yomiuri reports that the Hanshin Tigers will allow star LHP Kei Igawa to follow his dream and jump over to MLB by posting him next week. Igawa is scheduled to start next Tuesday in the NPB/MLB All Star series currently underway in Japan. I expect Hanshin hopes to replace their star with free agent Hiroki Kuroda, who the article reports is filing for free agency despite a contract offer of 3 years, 1 billion yen from his current team.

As for Igawa, he’ll probably have plenty of suitors. Along with all the losers in the Matsuzaka sweepstakes, expect the Mariners, Dodgers, and Braves to be interested.

Finally. Besides Seattle, the Angels, Orioles, and Giants have all reportedly dropped out of the race. Unfortunately, this means that Matsuzaka is even more likely to end up on the Yankees. Booo! Teams have until 5pm next Wednesday to get their bids in; after that, Seibu has a few more days to decide whether or not it will accept the bid.