Monday, November 26, 2007

The "K" study for real

[UPDATE: Originally, this post said that the study used only players whose initials were both K. Commenters told me that I had misread the study: it covered players where EITHER initial was K. Sorry for my screw-up. The original (incorrect) post is available from me if you want it. What follows is the new version.]

The biggest difference was in the 1920s -- only 1.2%. The biggest modern difference is in the 60s and 80s, at 0.6%. From 2000-2003, the effect went the other way -- 15.4% for K players, 16.4% for others.

So I don't see where the authors' 1.6% difference comes from. I'll try a significance test simulation later and update this post.

UPDATE: The "K" players in real life had 566,374 PA. I ran a simulation that had an average of 568,558 PA. The SD of strikeout rate was 0.43 percentage points, which is higher than the 0.3 points that I observed, so the observed difference is less than one SD. So the big question remains: why did the authors get such a high strikeout rate difference?
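For anyone who wants to try this kind of check, here is a minimal sketch of one way such a simulation could be set up. The post doesn't describe how the actual simulation was built, so everything here is an assumption: the function names are made up, the player pool is synthetic (not real batting data), and the rule of drawing random players until the group reaches the target PA is just one plausible choice.

```python
import random
import statistics


def make_fake_players(n, rng):
    """Build a synthetic pool of (PA, SO) pairs. NOT real data; illustration only."""
    players = []
    for _ in range(n):
        pa = rng.randint(50, 6000)
        # Hypothetical career strikeout rate, clamped to a plausible range.
        true_rate = min(max(rng.gauss(0.14, 0.04), 0.03), 0.35)
        players.append((pa, round(pa * true_rate)))
    return players


def simulate_group_rates(players, target_pa, n_trials, rng):
    """For each trial, draw random players until their combined PA reaches
    target_pa, then record (total PA, group strikeout rate)."""
    results = []
    for _ in range(n_trials):
        order = players[:]
        rng.shuffle(order)
        total_pa = total_so = 0
        for pa, so in order:
            if total_pa >= target_pa:
                break
            total_pa += pa
            total_so += so
        results.append((total_pa, total_so / total_pa))
    return results


rng = random.Random(2007)  # fixed seed so the run is repeatable
pool = make_fake_players(3000, rng)
trials = simulate_group_rates(pool, target_pa=566_374, n_trials=200, rng=rng)

pas = [pa for pa, _ in trials]
rates = [rate for _, rate in trials]
print("average PA per simulated group:", statistics.mean(pas))
print("SD of group strikeout rate (pct points):", 100 * statistics.stdev(rates))
```

With real player data in place of the synthetic pool, the SD printed at the end is the number to compare against the observed difference between "K" players and everyone else.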

FURTHER UPDATE: Mystery solved by Tango! It looks like the authors weighted every player equally, instead of weighting every PA equally. See comments.
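To see why the weighting matters, here is a small sketch with made-up numbers (the three players below are hypothetical, and the function names are mine, not the study's). A player with very few PA and a high strikeout rate barely moves the PA-weighted figure but pulls the player-weighted average up a lot.

```python
def pa_weighted_rate(players):
    """Every PA counts equally: total strikeouts over total PA."""
    total_pa = sum(pa for pa, _ in players)
    total_so = sum(so for _, so in players)
    return total_so / total_pa


def player_weighted_rate(players):
    """Every player counts equally: plain average of individual rates."""
    rates = [so / pa for pa, so in players]
    return sum(rates) / len(rates)


# Hypothetical (PA, SO) lines: two regulars and one short-career player.
players = [(600, 90), (500, 80), (20, 10)]

print("PA-weighted:    ", pa_weighted_rate(players))
print("player-weighted:", player_weighted_rate(players))
```

In this toy example the PA-weighted rate is about 16.1%, while the player-weighted rate is 27.0%, because the 20-PA player with a 50% strikeout rate counts as much as each regular.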