"The Big Ten Scorecard": An Attempt To Grade The Season

So, now that we’re simply awaiting the start of the bowl season to cap off an interesting year in the Big Ten, and indeed in college football, I thought I might present to the board for comment something I had been considering doing for a while – “The Big Ten Scorecard”.

It’s not by any means scientific, and I don’t pretend to be an expert at these things, but what I tried to do here is take summary metrics and compare them to what the conference game averages would have been. Including the BTCG, there were 146 games played (allowing that two teams played 13 games, of course), so the sample size is sufficient, in my view, to present what an average Big Ten game stat line would look like.

Bearing that in mind, there are several tables – passing offense and defense, rushing offense and defense, scoring offense and defense, a summary of offensive and defensive metrics showing how many targets were met, and an overall “score” for the team.

Score? Metrics?

Glad you asked. I kept it fairly simple for this first pass at the idea. You will see in the tables many boxes shaded in red with numbers in red as well. I went with the mean for each statistic as the target, so essentially, what we’re discussing is the team’s performance against the Big Ten mean on 31 measures.

So, for the most part, on offense, if a team was below the mean on a certain measure, the box is shaded because it indicates a performance that was generally subpar compared to the rest of the Big Ten. The sole exception would be interceptions, in which case being below the mean is obviously preferred.

On defense, on the other hand, generally numbers below the mean would be preferred; the sole exception again (for purposes of this experiment) would be interceptions, as more indicates an opportunistic defense.

There are some confounding factors, of course, such as teams facing pass-heavy or run-heavy opponents, but the human performance aspect of football allows teams proficient in stopping such attacks to meet other targets.

In other words, in my totally contrived system, there are 31 possible points, and if you “exceed the target” (perform well against the mean), you get one point for that measure. I have even included handy icons to graphically illustrate which teams are making the grade compared to their conference compatriots, if you will, in the summary tables.

Basically, the final score is the percentage of measures on which you exceeded the conference average. Obviously, no team scored 100%. Indeed, no one even hit 80%, so there is a bit of a curve involved as well. One other thing that some will undoubtedly notice: sometimes, a team that came in at what appears to be average is still in a shaded box. I rounded the numbers for simplicity in the tables, so what it means, more often than not, is that the unrounded figure is still slightly below the actual mean.
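For anyone who wants to tinker, the scoring logic described above boils down to a few lines. This is just a sketch in Python with made-up metric names and numbers; the real scorecard uses the 31 measures from the tables.

```python
# Sketch of the scorecard logic: one point per metric where the team
# beats the conference mean. Metric names and values here are invented
# for illustration; the actual scorecard uses 31 measures.

# For each metric, record whether a HIGHER value is better
# (True for most offensive stats, False for e.g. interceptions thrown).
HIGHER_IS_BETTER = {
    "pass_yds_per_game": True,
    "rush_yds_per_game": True,
    "points_per_game": True,
    "interceptions_thrown": False,
}

def scorecard(team_stats, conference_means):
    """Return the fraction of metrics on which the team beats the mean."""
    points = 0
    for metric, value in team_stats.items():
        mean = conference_means[metric]
        if HIGHER_IS_BETTER[metric]:
            points += value > mean
        else:
            points += value < mean
    return points / len(team_stats)

team = {"pass_yds_per_game": 250.0, "rush_yds_per_game": 180.0,
        "points_per_game": 31.0, "interceptions_thrown": 14}
means = {"pass_yds_per_game": 225.0, "rush_yds_per_game": 170.0,
         "points_per_game": 27.0, "interceptions_thrown": 11}

print(scorecard(team, means))  # 3 of 4 targets met -> 0.75
```

The hypothetical team above beats the mean on three of four measures (it threw more interceptions than average), so it scores 75%.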

TL;DR – Michigan, Nebraska, Ohio State, Penn State and Wisconsin all exceed the Big Ten mean most of the time in most areas. Michigan State and Northwestern do this only slightly more than half the time. Iowa, Minnesota, Indiana and Purdue struggle, needless to say. Illinois...well, at least this season gets the “Firestone Smoldering Rubber Award”.

Again, this is my first stab at such a thing, and I welcome comments and suggestions. For as long as I am here, I would like to make it a yearly thing with perhaps even midpoint reports.

“When you average stuff, half of it is below average and half of it is above average” only holds when the distribution is symmetric, like the normal distribution.

To use a football example: if you have, say, Michigan and Ohio excelling in these statistics, much higher than the rest of the teams, their stats might skew the average upward. In that case, depending on how much higher the stats for those two schools were compared to the rest of the conference, you could have quite a few schools below the average and only 3 or 4 schools above it.
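A toy example (numbers invented) shows how two outliers can drag the mean above most of the field:

```python
# Toy illustration: two outlier teams pull the conference mean high
# enough that most of the other teams sit below it. These points-per-game
# numbers are made up for the sake of the example.
points_per_game = [45, 42, 24, 23, 22, 21, 21, 20, 19, 18, 17, 16]

mean = sum(points_per_game) / len(points_per_game)
below = sum(1 for x in points_per_game if x < mean)

print(f"mean = {mean:.1f}, teams below mean = {below}")
```

With these invented numbers, nine of twelve teams land below the mean, even though "half above, half below" is what intuition suggests.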

If each school in the B1G was a data point of just one value, you might be right. However, you would also need the distribution to be normal, as mentioned above. This very likely wouldn't be true because we know that the quality of B1G football teams is not a bell curve, and 12 just isn't a very large sample size.

Now, each school in the B1G is not just one value. There are many numerical values associated with each team, and for each one the distribution will be different. Just look at offensive points per game - 8 teams are above the mean, 4 are below. I'm pretty confident that observation alone should punch a hefty hole in whatever point you were trying to make about the uselessness of these data.

“...When you average out stuff... half of it is below average...” I’m not sure what this has to do with football, other than using it to show that, on average, some schools are better than average.

I'm good with that.

In the universe of distributions there are infinities of kind. When human minds fathom them they do so... for almost all parts... through the normal distribution. I clipped in the bell curve image on a lark because this analysis and metric is fashioned on the mean (not the distribution around it) of 31 different distributions...which spawned this fun. Stats are fun but not necessarily meaningful.

LSA then takes each school mean for each targeted distribution and does exactly what you say might make my obscure point valid... and reduces it to a single data point for each school.

Here's a picture of that distribution of overall targets attained ...

...wait for it...

...as fitted to a normal distribution...uh...hmm...it's a pretty good fit given we are only looking at 12 datapoints.

Michigan is highlighted...as usual... above the mean...which is what most people here come to hear.

As to my original point... it was simply this - Good Post LSA. Let's focus the metrics to GAF criteria if possible but good post.

GAF - Good At Football

What you're saying is mathematically sound. I do appreciate that you've made an effort to explain yourself. I agree that if you take enough different distributions & smush them together, it's going to look pretty normal. Actually, it doesn't even matter if I agree - that's just reality.

But we do care about the constituent target distributions. We love any aggregate score that shows we're better than average, sure, but it's just not very informative about how or why. You lose the detail when you smash every other category in with it, & if that's what you have to resort to in order to claim that the data aren't interesting ... well, that's just like, your opinion, man.

As for me, I think it's interesting that Michigan is below average in avg. pass attempts & completions, but higher than Nebraska in avg. YPA & avg. YPC. Yeah, I should mention that we're not just comparing against the mean - we get to compare the raw numbers against other teams' raw numbers too.

Anyway, don't worry - I'm not railing against your understanding of stats, you're fine, you're clever. But there's a lot you can take away from The Big Ten Scorecard if you're willing to look at more than the final table.

I guess... but are we looking at garbage time, strength of schedule differences, strength of offenses/defenses, downfield success rates, play success, points per play? There are a ton of stats (most of which take real work to bring out) that are meaningful to football. Choosing an aggregate of summary stats... IDK about that. If it brings up questions about the constituent distributions, then by all means let’s go there, but always with an eye to what is happening on the field and with the team.

LSA knocks me out with his production of work here. I’m looking at these tables partly because my daughter won’t let me correct her math homework, but mostly because I love football and Michigan. I think the targets need revision. Let’s look at success rate to begin with: 1st downs that gain 5 yards or more (50% of the needed yards on non-standard 1st-and-more-than-10 situations); 2nd downs that gain 70% of the needed yardage for a first down; 3rd and 4th downs that convert. Football Outsiders has a good rundown of GAF statistics for offense, defense and special teams. The Hidden Game of Football has good footbally stats. If I had LSA’s ability to kick it out... I’d go in this direction, because there are few people who do. There is this guy from Maryland who does a pretty interesting take. I’d like to see that done for Michigan-related data more often.
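For what it's worth, the success-rate criteria described above are straightforward to compute once you have play-by-play data. A sketch in Python, with an invented handful of plays:

```python
# Success-rate criteria as described above: a play is a "success" if it
# gains >= 50% of the yards to go on 1st down, >= 70% on 2nd down,
# or converts on 3rd/4th down.
def play_successful(down, yards_to_go, yards_gained):
    if down == 1:
        return yards_gained >= 0.5 * yards_to_go
    if down == 2:
        return yards_gained >= 0.7 * yards_to_go
    return yards_gained >= yards_to_go  # 3rd/4th down: must convert

def success_rate(plays):
    """plays: iterable of (down, yards_to_go, yards_gained) tuples."""
    plays = list(plays)
    successes = sum(play_successful(*p) for p in plays)
    return successes / len(plays)

# Two invented drives' worth of plays.
drive = [(1, 10, 6), (2, 4, 3), (3, 1, 2), (1, 10, 2), (2, 8, 5), (3, 3, 1)]
print(f"success rate = {success_rate(drive):.2f}")  # 3 of 6 -> 0.50
```

The hard part, of course, is not the arithmetic but collecting the play-by-play data in the first place.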

You are the one who advanced the thesis that single data points per school would create a normal distribution. I just showed it. Truth be told, if you smush different distributions together, I have no idea what they will look like, though I can fit them to a normal distribution, albeit not necessarily well.

I'm looking at this passing thing now for Mich and Nebraska... let me think about that... I think it's due to a lot of this sort of thing...

It's interesting that you mention "The Hidden Game Of Football", as I started reading it again last night to get some ideas for other improvements. Admittedly, with the season being over, I wanted to see what it looked like using more easily available statistics and bounce it off the board more than anything, and as usual, the board has come through with some great discussion and ideas.

Based on what Ron Utah suggested below actually, I've added some stuff on completion percentage and I was thinking 3rd and 4th down conversions might be a good addition to this as well. First downs and first down yardage would be interesting as well and, as you suggest, probably provide even more insight into success on offense.

I've also toyed with the notion of incorporating median values, since individual teams play only 12 or 13 games, though this might require collecting individual game data from day one. I may very well do this next season, as it might aid in producing even better results.
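A quick illustration of why the median might help with only a dozen or so games per team: one blowout moves the mean noticeably, while the median barely budges. The game scores below are invented:

```python
# Mean vs. median over a small, invented 12-game season: a single
# blowout (the 63) pulls the mean up, but the median reflects a
# typical game better.
from statistics import mean, median

points_scored = [24, 27, 21, 63, 17, 28, 24, 20, 31, 23, 26, 22]

print(f"mean   = {mean(points_scored):.1f}")  # pulled up by the 63
print(f"median = {median(points_scored)}")    # closer to a typical game
```

The trade-off the diarist notes is real: per-team medians require per-game data, not just the season summary tables.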

Thanks to everyone - I do appreciate the feedback.

"Funny isn't it, how naughty dentists always make that one fatal mistake."

While statistical analysis only tells part of the story, I for one ALWAYS appreciate your work, and this is another great example. Thank you, and well done.

A few tweaks to consider:

Number of attempts in the passing and running game is not a good "target." Being above or below that does not indicate success or failure; however, a total number of plays on offense (higher is better) and defense (lower is better) might be a good measure. However, I do like to see the data on attempts; I just wouldn't include it in the "target" metrics.

That said, completion percentage is an important statistic, and I think you ought to chart that.

All-in-all, a very good post, and fun to think about. What's best is that it seems to work: the teams with the highest overall scores were the best in the B1G, and it ranked the offenses and defenses accurately as well.

11 National Championships. 42 B1G Championships. Winningest program in college football. HAIL TO THE VICTORS

Excellent diary, LSA! It's interesting to compare season results (W/L) with % of targets achieved. The two correlate almost perfectly. It's also weird to notice that despite teams thinking JT Floyd was vulnerable all year, they ran the ball more often than normal and threw the ball less often than normal. Did we just get lucky that teams didn't try to exploit Floyd more, especially with Countess out and no pass rush to speak of? Granted, the B1G doesn't throw much to begin with, but we were still below average for pass attempts against.

Even though the sample size is small, because there were so many categories it was not surprising that there is a very strong correlation between score and record. What you should have done is order record versus score. If you had included just Big Ten games and thrown out the nonconference ones, the correlation would be even more extreme.

It makes sense that a team doing better in more categories would win more games. Any deviation would be due to categories not covered. For example, a big-play defense or a mistake-prone offense might skew the numbers for an individual team.

My guess is that if you could quantify more categories, your numbers would match reality even more closely.

Another interesting item: you have a self-weighting mechanism for tempo. A team that plays a slow-tempo game (fewer plays) will have inferior offensive numbers just because of the fewer plays. However, that will also improve its defensive numbers more than it should.

I had a friend who got paid a lot of money by Oldsmobile to tell them customers were happier the fewer times they had to bring their car into the shop. Despite that, I think this exercise is useful in that it shows no team was truly lucky or unlucky. A team like NW that lost three games in the last minute did so not because it was unlucky, but because its pass defense was beyond terrible, which I think one would find even if you evaluated more generic metrics like YPA.