Earlier this week I was working on a long-term project of mine and ran
into a minor partial result that really interests me. I also needed a
short topic to balance some general notes that have been building up, so
stick with me through a little statistical tapdancing here at the top and
see if I'm right about this being interesting; then we'll get to the news
and notes.

The project as a whole involves trying to come up with a measure that
predicts how a team will do in the postseason. While a lot of folks, me
included, try to do this with the ISRs, that's not really what they're for,
and I won't ever optimize them to predict the postseason at the expense of
how they measure the quality of the season. For the potential independent
measure I'm trying to develop, though, I'm looking at things like how "hot"
a team is through various types of sequences of wins and losses,
distribution throughout the season of opponent quality, and that sort of
thing.

Anyway, one thing that I've found is the following little tidbit. Let's
make up two measures of run scoring for each game for each team. Just to
make them easier to follow, we'll call them Offensive Quality (OQ) and
Defensive Quality (DQ). All they are is the ratio of how the team did
against how most teams do against their opponents:
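Here's a minimal sketch of one way to read those ratios, in Python. The exact orientation is an assumption on my part: OQ is runs scored divided by what the opponent usually allows, and DQ is runs allowed divided by what the opponent usually scores, so an OQ above 1 and a DQ below 1 both mean the team beat the norm. The function name and the opponent-average inputs are hypothetical.

```python
def game_quality(runs_scored, runs_allowed, opp_avg_allowed, opp_avg_scored):
    """Return (OQ, DQ) for a single game.

    Assumed definitions (not confirmed by the text above):
      OQ = runs scored  / opponent's average runs allowed per game
      DQ = runs allowed / opponent's average runs scored per game
    """
    oq = runs_scored / opp_avg_allowed
    dq = runs_allowed / opp_avg_scored
    return oq, dq


# Made-up example: win 8-3 over a team that usually allows 5.0 and scores 6.0.
oq, dq = game_quality(runs_scored=8, runs_allowed=3,
                      opp_avg_allowed=5.0, opp_avg_scored=6.0)
```

Putting the opponent's average in the denominator keeps a shutout from causing a division by zero, which is the main reason for orienting DQ this way in the sketch.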

There are all sorts of adjustments you could make to that based on park
factors and the like, and I may do some of that going forward, but you get
some interesting stuff from just taking the average of those scores for all
of a team's games (call 'em OQA and DQA, just for simplicity). It's
interesting, at least, in the sense that the result defies traditional
wisdom, which is that strong pitching is a requirement to advance in the
postseason.

The best measure I've been able to come up with so far for postseason
success is number of tournament wins. This isn't perfect (three wins may
or may not mean a regional win, for example, which is not what most folks
think of as equal success), but most alternatives are too subjective.
Since we need a measure to be able to correlate with, we'll go with this.

Now, if you correlate the OQA and DQA described above with the number of
postseason wins, you get a measure of how well each one -- offense and
defense -- predicts postseason success. I imagined
going in that the correlation for DQA would be much higher -- after all,
the cliche is that hitting pulls in the fans, but pitching wins
championships. It turns out, though, that the two are almost equal -- .48
for DQA and .46 for OQA. In other words (and in conclusion) there's not an
advantage to putting more of your resources on one side or the other; just
recruit and play the best players you can regardless of which side of the
equation they're on.
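For anyone who wants to try this at home, the calculation can be sketched as below: average each team's per-game scores to get OQA and DQA, then take the Pearson correlation of each with postseason wins. The team names and every number here are made-up placeholders, not real data, and using Pearson's r specifically is my assumption about what "correlate" means.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder data: team -> (per-game OQ, per-game DQ, postseason wins).
teams = {
    "Team A": ([1.4, 1.1, 1.3], [0.6, 0.8, 0.7], 5),
    "Team B": ([1.0, 0.9, 1.1], [0.9, 1.0, 1.1], 2),
    "Team C": ([0.8, 0.7, 0.9], [1.2, 1.3, 1.1], 0),
    "Team D": ([1.2, 1.0, 1.1], [0.8, 0.9, 1.0], 3),
}

oqa = [mean(oq) for oq, _, _ in teams.values()]   # season-average OQ
dqa = [mean(dq) for _, dq, _ in teams.values()]   # season-average DQ
wins = [w for _, _, w in teams.values()]

r_oqa = pearson(oqa, wins)
r_dqa = pearson(dqa, wins)
print(f"OQA vs. wins: {r_oqa:.2f}   DQA vs. wins: {r_dqa:.2f}")
```

The sign of the DQA figure depends on which way DQ is oriented; if lower DQ means better defense, as in these placeholder numbers, the correlation comes out negative, and it's the size of it that you'd compare with the OQA figure.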

I've gotten a couple of copies of the official RPI report for this year
(thanks to those who forwarded them), and I've been able to verify that the
formula I'm using is still correct. There are two sources for
discrepancies, and they're worth a quick discussion. The first is that the
NCAA is doing some intermediate rounding in their calculations. There's no
real reason to consider what they're doing or what I'm doing right or wrong
(other than that theirs is what counts), but it's worth noting that this
tiny change (something like .0004, if I remember right) was enough to
switch Arizona State and LSU in the final rankings this year. That should
be remembered whenever you're looking at teams next to each other who are
very close in the actual RPI number. The other discrepancies come in
because the question of what is a home game and what is a neutral-site game
is quite a fuzzy one at times -- some early-season tournament games are
home and some are neutral, and, as far as I can tell, it's up to the teams
which are which.
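To see how intermediate rounding can flip two close teams, here's a small sketch. It assumes the commonly cited basic RPI weighting (25% winning percentage, 50% opponents' winning percentage, 25% opponents' opponents' winning percentage) and assumes the rounding happens on the three components to three decimal places; I don't know exactly where the NCAA rounds, and the numbers for the two teams are invented purely to show the effect.

```python
def rpi(wp, owp, oowp, places=None):
    """Basic RPI: 0.25*WP + 0.50*OWP + 0.25*OOWP.

    If `places` is given, each component is rounded to that many decimal
    places first (an assumed stand-in for "intermediate rounding").
    """
    if places is not None:
        wp, owp, oowp = (round(v, places) for v in (wp, owp, oowp))
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# Invented components for two hypothetical teams that are very close.
team_x = (0.7004, 0.5504, 0.5204)
team_y = (0.6996, 0.5496, 0.5206)

exact_x, exact_y = rpi(*team_x), rpi(*team_y)
rounded_x, rounded_y = rpi(*team_x, places=3), rpi(*team_y, places=3)

print(f"exact:   X={exact_x:.5f}  Y={exact_y:.5f}")
print(f"rounded: X={rounded_x:.5f}  Y={rounded_y:.5f}")
```

With these made-up inputs, X is ahead when nothing is rounded and Y is ahead when the components are rounded first, even though no individual number moves by more than .0005.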

Speaking of the RPI, long ago I promised an explanation of exactly what
was wrong with the RPI mathematically. I never got around to it (darn
archives), and frankly there are enough different things wrong that no one
explanation will ever quite suffice, but Paul Kislanko has written a very
good discussion of the math behind the problems; head on over and take a
look, and see what else is on the site while you're there.

If you're interested in reprinting this or any other Boyd's World material
for your publication or Web site, please read the reprint policy and
contact me.