Wednesday, October 01, 2008

I make a statistically based, non-subjective “Power Poll” each year. This year I am trying to refine my methods to reflect which statistics actually relate to success.

I’m sorting through all the NCAA statistics for the past several years, looking for correlations. To do so I’m using linear regressions. An example of this is –

What we have here is the linear regression of all 119 NCAA teams last year, with the y-axis being each team's win-loss percentage for the year and the x-axis its average points scored per game. In the upper right are Hawaii and Kansas at 12-1 each with 43.4 ppg and 42.9 ppg respectively, while in the lower left we find 1-11 FIU at 15 ppg.

I also identify Florida, LSU and Ohio State on the graph. The line running through the center of the graph is the regression line (its steepness is the “slope”), which shows the win-loss percentage one might expect for a given points-per-game figure when all the teams are fitted together. Florida, located beneath the line, did worse on a win-loss basis than their points-per-game of 42.5 would suggest.
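To make the idea of a fitted line and a team sitting "beneath the line" concrete, here is a minimal sketch of an ordinary least-squares fit. It is not the tool used for the graph, and the data are an assumption for illustration: only the three teams whose records and scoring averages are quoted above, not the full 119-team set, so the numbers it prints will not match the real chart. The names `fit_line` and `residual` are hypothetical helpers.

```python
# Toy data (assumption): the three teams quoted in the post, as
# (points per game, win-loss percentage). Not the full 119-team data set.
teams = {
    "Hawaii": (43.4, 12 / 13),
    "Kansas": (42.9, 12 / 13),
    "FIU": (15.0, 1 / 12),
}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [ppg for ppg, _ in teams.values()]
ys = [pct for _, pct in teams.values()]
slope, intercept = fit_line(xs, ys)


def residual(ppg, win_pct):
    """Actual win percentage minus the fitted line's prediction.

    Negative means the team sits beneath the line (won less than
    its scoring would suggest); positive means above it.
    """
    return win_pct - (slope * ppg + intercept)


for name, (ppg, pct) in teams.items():
    print(f"{name}: residual {residual(ppg, pct):+.3f}")
```

With the real data, a team like Florida would show a negative residual, which is just another way of saying it falls beneath the line.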

The correlation between points per game and record is 0.704616406330361. A correlation of 1 (or -1) is considered to show perfect correlation, while 0 shows no correlation. The 0.7 we found shows a high degree of correlation between points per game and winning.
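For anyone who wants to see where a number like that comes from, below is a rough sketch of the Pearson correlation coefficient computed by hand. The function name `pearson_r` is my own, and the sample data are an assumption: just the three teams quoted earlier, so the result will sit much closer to 1 than the 0.70 found across all 119 teams.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data (assumption): Hawaii, Kansas and FIU only, not the full data set.
ppg = [43.4, 42.9, 15.0]
win_pct = [12 / 13, 12 / 13, 1 / 12]

r = pearson_r(ppg, win_pct)
print(f"correlation on the toy sample: {r:.3f}")
```

The sign carries the direction: a statistic that falls as wins rise gives a negative value, and only the size of the number says how tight the relationship is.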

When you look at the correlation between record and defensive points allowed for 2007, the result was 0.731562154315535, showing a slightly stronger relationship between a good defense and a winning record than between scoring and a winning record.

I plan on doing this for a variety of statistics for the past several years, and hopefully incorporating it in my Power Poll.

Credit for the use of a powerful statistical software tool (free, at that) goes to -

5 comments:

mergz, the problem with designing a correlation table in college football is the VAST variation in the ability level of the potential opposition pool among different schools and leagues.

A particular stat may show a high level of correlation to winning % when you are comparing teams who play 'similar' schedules. But the HUGE disparity in the level of play, from league-to-league or region-to-region, would seem likely to skew almost any statistically-based analysis. Ex: how can you account for situations where some teams use 2nd, 3rd, or 4th team players for substantial periods in some games while some coaches leave their 1st-teamers in longer (and potentially pad their stats v. lesser opposition)?

I know that your focus is on college football, but I believe your analytical method would be more substantiatable when used to study the NFL, where there is not nearly as great a disparity in ability/play level.

An OBVIOUS example of the case of misleading stats is UF's Columbus campus--AN ohio state university.

For the last several years they have CONSISTENTLY piled up huge numbers v. their 'normal' schedule; but when faced with stronger opposition from outside their regional comfort zone, they have failed EQUALLY consistently.

There is OBVIOUSLY a direct correlation between WHO Aosu is playing and their winning %:

I just skimmed this post, but seeing that you're in that relatively small community of college football bloggers who care about "doing stats" well...

I need some help thinking about how to approach yardage stats. In particular, I'm wondering if anyone's researched how sustainable a season like the one Vanderbilt's having so far is, with poor yardage stats and lots of hidden yards.

Brilliant stuff, but I like to come down on the side of chaos theory and/or the butterfly effect when it comes to putting stats and predictions together in football. Think about it: weather is about as predictable as NCAA football. So many small factors can have such an impact on events. And those small factors, sometimes undeterminable, can seem to compound themselves over time. Too many to name. I think the ability to determine and identify those small factors, sometimes invisible to the naked eye, is the key and what Vegas goes after. Only problem is they are most likely always changing.
