Statisticatin'

It is our nature to assign narratives to games. Against UConn it was brutal offensive efficiency. Against Notre Dame it was the first appearance of clutch Denard. Against UMass and Indiana it was praying our offense would have the ball last, and against Bowling Green it was relief that we finally took care of business.

For me, Michigan St was a game of seven plays. There were seven big plays on Saturday, and all of them went Michigan St’s way. There was plenty that didn’t go right in between, but those seven plays masked what in some ways was a better performance than it felt, and in some ways worse. Denard’s three interceptions, two in the end zone (-10 points). Edwin Baker 61 yards, +5. Le’Veon Bell 41 yards, +4. Cousins to Dell for 41 yards, +4. Cousins to Dell again, 44 yards, +3. Those seven plays, a 26 point swing. Only two plays for the offense +3 or higher, no running plays worth more than 1.4 points. All of the big plays went Sparty’s way.

What we hoped was not true now appears to be so: this team will not win Big Ten games without exceptional offensive performances. We are who we are at this point.

Game Notes

Rush Off: +5

Pass Off: 0

Rush Def: –15

Pass Def: –10

Field Position: Michigan +4

Denard was +3 on the ground and a +1 through the air.

Shaw and Smith were both +0 and Hopkins was +1 on two strong carries.

Cousins was +13 combined.

Baker was +7

Caper was +3

Bell was +7

I went 4-1 against the spread last weekend in my picks. I did not pick MSU, Purdue, Wisconsin or S Carolina to win like three of them did, but all four managed to cover as I predicted. Only Indiana let me down. The game went down as I predicted but the Indiana offense couldn’t quite put together enough garbage time points to cover the spread.

Projections

Michigan’s poor showing on Saturday, along with increases from Illinois, Ohio St, Purdue and Wisconsin, move the win projections down a game and a half or so. That is a huge drop for one week.

Remaining Schedule

Win odds drop across the board. Illinois is probably overvalued, but after back-to-back strong showings they look to be better than expected this season. Wisconsin is starting to come closer to conventional feelings, but is still too high.

Team – National Rank – Win Odds

Iowa – 26 – 54%

@ Penn St – 57 – 59%

Illinois – 23 – 48%

@ Purdue – 77 – 73%

Wisconsin – 52 – 72%

@ Ohio St – 9 – 17%
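One rough way to read the table above is to add up the win odds, since the expected number of wins is the sum of the individual win probabilities. This is only a sketch of that arithmetic; it is not necessarily how the projections in this post are produced, and the variable names are made up for illustration.

```python
# Win odds copied from the table above, expressed as probabilities.
win_odds = {
    "Iowa": 0.54,
    "@ Penn St": 0.59,
    "Illinois": 0.48,
    "@ Purdue": 0.73,
    "Wisconsin": 0.72,
    "@ Ohio St": 0.17,
}

# Expected wins = sum of the per-game win probabilities.
expected_wins = sum(win_odds.values())
print(f"Expected wins over the remaining six games: {expected_wins:.2f}")
```

By this reading the six remaining games are worth a little over three expected wins.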

Projected Big Ten Standings

1. Ohio St

2. Michigan St

3. Illinois

4. Iowa

5. Michigan

6. Wisconsin

7. Purdue

8. Northwestern

9. Penn St

10. Indiana

11. Minnesota

Just like last week, Illinois and Wisconsin feel like they are reversed, but to date, the computer likes what Illinois has done a lot and still isn’t sold on the Badgers.

All numbers included in this preview use my PAN metric, Points Above Normal. PAN is essentially how many points a team/unit/player was worth above an average FBS team. For reference, an average FBS team is approximately equal to Illinois or a top team from the MAC.

All games against FCS teams are excluded for all teams, as well as any plays in the second half where one team leads by more than 2 touchdowns or any end of half run out the clock situations.
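The exclusion rules above can be sketched as a simple filter. This is an illustration only: the Play record and its field names are hypothetical, and reading "more than 2 touchdowns" as a margin greater than 14 points is an assumption, not something the rules spell out.

```python
from dataclasses import dataclass


@dataclass
class Play:
    # Hypothetical fields, named for illustration only.
    opponent_is_fcs: bool   # game is against an FCS team
    half: int               # 1 or 2
    score_margin: int       # absolute point difference before the snap
    run_out_clock: bool     # end-of-half, run-out-the-clock situation


def counts_toward_pan(play: Play) -> bool:
    """Return True if the play survives the exclusion rules."""
    if play.opponent_is_fcs:
        return False
    # Assumption: "more than 2 touchdowns" means a margin above 14 points.
    if play.half == 2 and play.score_margin > 14:
        return False
    if play.run_out_clock:
        return False
    return True


plays = [
    Play(opponent_is_fcs=False, half=1, score_margin=21, run_out_clock=False),
    Play(opponent_is_fcs=False, half=2, score_margin=21, run_out_clock=False),
    Play(opponent_is_fcs=True, half=1, score_margin=0, run_out_clock=False),
]
kept = [p for p in plays if counts_toward_pan(p)]
print(len(kept))
```

Note that a big lead in the first half does not exclude a play under these rules; only second-half blowout plays are dropped.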

At this point adjustments for strength of opponent are directional but still highly uncertain. They will now be used in all situations except where otherwise noted.

Rush Offense vs Michigan St

Michigan Off: +8 PAN, 2nd nationally, 1st Big Ten

Michigan St Def: -0 PAN allowed, 54th, 6th

[Chart note: positive numbers mean good performances for both offense and defense.]

Despite Michigan St holding Wisconsin to 24 points last week, they fared worse against Wisconsin’s running game than the average team has in 2010.

Michigan has been between good and ridiculous in every game this year. Michigan St’s rush defense is definitely an upgrade over the Hoosiers from last week, but it’s more of an “allow touchdowns in reasonable-length drives” as opposed to a “regularly keep us out of the end zone” type matchup.

Robinson will obviously be the catalyst for Michigan. His rushing PAN is +9, over 3.5 points better than anyone else in the nation. Vincent Smith and Michael Shaw have both hovered right around zero, with Smith grading out higher due to a combination of the long run against Indiana and Shaw’s best performance coming against an FCS opponent.

Michigan will face a better rush defense than they have seen in the last month or so but Michigan St hasn’t shown anything yet to indicate they are good enough to slow Michigan’s run game down more than anyone else has. Look for at least a full touchdown worth of advantage from the Michigan rushing attack.

Pass Offense vs Michigan St

Michigan Off: +7, 4th, 2nd

Michigan St Def: –1 allowed, 46th, 4th

After struggling for the first month of the season, Michigan St’s pass defense was the difference in shutting down Wisconsin last week. Michigan’s pass game has continued to be very potent and last week at Indiana was the best game yet. Because Michigan’s passing success is built so much off the success of the running game, it’s not clear how a good pass defense will be able to defend the Wolverine passing game.

Robinson has been +6 PAN on the season and is the 20th ranked passer in my ratings, which reward volume, something Michigan has very little of in the passing game. Robinson will be aided by a group of receivers who have been much more productive than last year. Hemingway and Roundtree are both averaging a solid +6 PAN per game and Odoms is at +3. Stonum is potentially a threat but has only been worth +2 against FBS competition.

Michigan St stepped up last week and did a great job limiting Wisconsin through the air. Although both Michigan and Wisconsin’s passing games are set up by strong ground games, the spread and shred is very different from old school Big Ten rushing. If Michigan St can replicate last week’s success this matchup could be a draw instead of the +6 for Michigan the numbers indicate.

Rush Defense vs Michigan St

Michigan Def: +1 allowed, 72nd, 7th

Michigan St Off: +3, 28th, 5th

This chart, with the exception of the MSU-WMU game, is a chart of averageness. Michigan hasn’t been gouged in the running game but they haven’t been closing the door on anyone either. Michigan St’s two-headed running attack came out big in Week 1 but has been relatively quiet since.

Michigan St’s +3 PAN is essentially split between Le’Veon Bell and Edwin Baker with both players contributing equally to the success. Bell had the big day against Western with a +7 showing but was shut down against Wisconsin by going –4 on the day. Baker has been much more consistent with all four games going between +1 and +3.

If you take out the beatdown Michigan St’s backs administered in Week 1, the Spartan ground game looks much more tameable. Michigan cannot sleep on this matchup but is in better shape than I thought. Michigan should have a chance to reasonably contain the Spartan backs, shifting the challenge to the…

Pass Defense vs Michigan St

Michigan Def: –1 allowed, 42nd, 3rd

Michigan St Off: +2, 32nd, 6th

As I have stated previously, the high rating for Michigan’s pass defense against Indiana will not hold. I do agree with Brian that Indiana will probably still have the best pass offense in the Big Ten this year, but when you are compared to Akron and Western Kentucky in your performance, you should always come up looking good, even if you allow nearly 500 yards through the air.

Michigan’s pass defense is very difficult to assess right now. Indiana doesn’t have good comps to measure against. The BG performance mostly looks bad because of one fluke play where BG got away with a massive hold. ND and UConn both look like respectable performances in comparison with how other teams have defended them.

Michigan St has really stepped up their passing game in the last two FBS games, with a pair of +7’s against legitimate opponents. After two sub-zero PANs for Kirk Cousins to start the season, he has been +11 and +12 in his last two games (the difference between the team and his score is that sacks count against the team but not the player).

The receiving has been pretty balanced with three Spartans checking in at +4 on the season. BJ Cunningham, Keshawn Martin and Mark Dell have carried most of the load this year. All three have a game rated +9 or higher on the season.

If Kirk Cousins can keep up his recent success, this is where it starts to get scary for Michigan. The Spartan passing attack will not be as good as what we saw last week in Indiana, but if Michigan doesn’t improve in the secondary, the results could be nearly as bad, especially if Michigan St can keep Michigan looking in the backfield on play action.

Special Teams vs Michigan St

Keshawn Martin. That’s the two big green bars on the graph. He has three big returns on the year, one going the distance against Wisconsin last week. Let’s not kick to him, although he has lost a fumble on a return this season. Michigan’s blue bars have gotten closer to zero, mostly because we have chosen to forgo special teams altogether. The Spartan kicking situation is a polar opposite to Michigan. Michigan’s kickers have cost the team 5 points vs an average kicker while Michigan St’s kicker has been 5 points better than the average kicker. If this game is decided by special teams it is very unlikely to be a Wolverine win.

Predictions Almost Certain to Cost You Money if Taken Seriously

At a neutral site, this matchup is pretty much a tossup. Luckily it’s in Ann Arbor. If Michigan can keep Keshawn Martin from breaking a long return and is at least even in turnovers I would feel really good about the chances.

Introduction

Inspired by all of the great statisticatin’ done by such MGoUsers as Misopogon, the Mathlete, and, most recently (and in the past), MCalibur, I decided to look into something I’ve been wondering about for a while. Well, four somethings really, all related to the importance of yards on first down in determining eventual success at getting first downs and sustaining drives:

Something I

How much do varying numbers of yards on first down affect the probability of getting a first down (or a touchdown) in that series?

Here I was simply wondering about the “how much?” question. It goes without saying that losing 10 yards on first down (or starting first and 20) reduces the probability that you will get a first down, but by how much? Similarly, obviously the more yards you get on first down, the better your chances are of getting a first down on the series, but by how much? Are there thresholds beyond which your probability of getting a first down increases appreciably, or is it more or less a linear relationship, where every additional yard on first down increases your probability of getting a first down by the same amount?

Something II

How much variation is there between teams in their ability to recover from bad first down plays?

My assumption here was that with principally running teams like the RR-era WVU, it is harder to overcome a bad first down play than with more balanced teams like the Lloyd-era UM or the Vest’s OSU teams. Conversely, I assumed that when the RR-era WVU or UM teams got at least four yards on first down, a first down on the series was virtually a lead pipe cinch. But, as these were only assumptions, I was interested in doing some analysis to check this out.

Something III

How much variation is there across games in the ease with which first downs are gotten, and in the effects of various numbers of yards on the probability of getting a first down?

For example, you would think that statistically it would be easier to get a first down on any given series in a home game, other things equal, right? Or, it would be harder to get a first down on any given series in a game against an opponent with a better defense, right?

Based on my data, the answer to both of these questions is NSFMF. I’ll explain later.

Something IV

How much do things other than the focus of the analysis, like field position and penalties, affect first down probabilities?

The Data

When I started, I knew I wanted to compare Lloyd-UM with RR-UM and RR-WVU, since I wanted to see how the spread n’ shred in its mature form would compare with the more anemic version (UM 2008-2009) and with the DeBordian “rock, rock, rock, rock, rock, ICBM, rock, rock, rock (also rock)” approach. In compiling the sample, I made several choices:

I did not look at UM in 2008 because I thought it would unfairly penalize Michigan and/or RR, and also I’m pretty sure we invaded Grenada that year and they called off the season.

I added another comparison team, the 2006-2009 OSU juggernaut. Damn, those guys won a lot of games in those years. Fuckers…

which I included. I debated about this latter non-omission because I didn’t want to unfairly stack the deck against Lloyd, but I figured (1) omitting Baby Seal U from the other coaches actually (slightly) stacked the deck in favor of Lloyd, and (2) there are Baby Seal Us and then there are Appy State Us.

This is a picture of an actual baby seal.

As for the unit of analysis (or “record” or “case,” depending on your disciplinary background), you may or may not know that ESPN.com publishes the play by play for each game, with pretty detailed information on each play.

At the game level, the sample consists of 122 games played from 2005 to 2009 by three schools (OSU, WVU, UM) and three coaches (Tressel, Rodriguez, Carr). For each game, I recorded:

whether it was a home game or not (away- and neutral-field games were coded the same. In retrospect I probably should have distinguished between these, but it didn’t end up mattering anyway).

At the play/series level, the sample consists of 3,529 first down plays and the series these plays began. For the teams of interest (i.e., not the opponents), I recorded the following data for each first down play:

The dependent variable was whether the series ended in a first down.

The primary independent variable was the number of yards gained on first down.

The control variables were:

the field position on first down;

the yards to go on first down;

whether there was an offensive or defensive penalty (or both) on the series (penalties on first down, where first down was repeated, figured into the “yards to go” variable);

whether there was a turnover on the series;

whether there was low time (less than a minute) in the second or fourth quarters;

whether there was a pass or run on first down;

the quarter the series took place in; and

the number of previous first downs for the drive in which the first down took place (so, if it is the first first down play in a drive, this variable would be scored 0; if a team makes a first down, this variable would be scored 1 for the second first down play in the same drive).
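To make the coding scheme above concrete, here is a minimal sketch of what one record might look like. The class, field names, and helper functions are all hypothetical; the actual data were coded by hand and analyzed in Stata, so this is just an illustration of the variables listed.

```python
from dataclasses import dataclass


@dataclass
class FirstDownPlay:
    # Hypothetical field names; one record per first-down play/series.
    made_first_down: bool      # dependent variable (first down or touchdown)
    yards_gained: int          # primary independent variable
    yards_from_goal: int       # field position on first down
    yards_to_go: int
    offensive_penalty: bool
    defensive_penalty: bool
    turnover: bool
    low_time: bool
    is_pass: bool
    quarter: int
    previous_first_downs: int


def is_low_time(quarter: int, seconds_left_in_quarter: int) -> bool:
    """Less than a minute left in the second or fourth quarter."""
    return quarter in (2, 4) and seconds_left_in_quarter < 60


def previous_first_down_counts(series_in_drive: int) -> list:
    """Counter value for each successive first-down play in one drive."""
    return list(range(series_in_drive))


# A drive with three first-down plays is scored 0, 1, 2 on the last variable.
print(previous_first_down_counts(3))

example = FirstDownPlay(
    made_first_down=True, yards_gained=4, yards_from_goal=80, yards_to_go=10,
    offensive_penalty=False, defensive_penalty=False, turnover=False,
    low_time=is_low_time(quarter=1, seconds_left_in_quarter=45),
    is_pass=False, quarter=1, previous_first_downs=0,
)
```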

Table 1 below shows the sample by season and team.

The Analysis

Hierarchical Linear Models

My initial plan was to run two-level hierarchical linear models (HLM), in which first-down plays/series are nested within games. Briefly, HLM allows you to calculate how much of the variation in the dependent variable is due to level-1 (play/series-level) factors like yards on first down, field position, etc., and how much is due to level-2 (game-level) factors like opponent defensive strength, home/away game, etc.

Essentially, HLM would calculate the average probability of getting a first down, as well as the effect of the level-one independent variables on that probability, for each of the 122 games, and then those parameters would be the dependent variables to be predicted as a function of level-2 (game-level) variables.

Fortunately for those of you who are about to stop reading, one of the things I discovered is that there is not significant variation from game to game, either in the probability of getting a first down or in the effects of the level-1 independent variables, to support an HLM analysis.

This does not mean that, for example, UM had exactly the same average success in getting a first down against OSU as they did against Eastern Michigan. What it does mean is that there is not so much variation from game to game in this average probability that it makes sense to predict that scant amount of variation with game-level factors.

The Probit Binary Response Model

Hence, the following is just a play/series-level analysis, which is probably more intuitive for the reader anyway. Because the dependent variable is dichotomous (0 if no first down on the series, 1 if first down or touchdown), I used the probit binary response model (PBRM). For those of you not steeped in this method, the PBRM is one of several regression-like methods for binary dependent variables.

Probit coefficients are in the metric of z-scores, the units of the standard normal cumulative distribution function (CDF). When you evaluate the standard normal CDF at the z-score the model predicts for a given play, it tells you the probability of scoring a “1” on the dependent variable.

The sign and magnitude of probit coefficients are interpreted in the standard way: a negative effect means that the variable lowers the probability of scoring a “1” on the dependent variable, positive coefficients mean that the variable increases the probability, and larger coefficients (in absolute value terms) mean stronger effects.

Except for Table 3 below, I have transformed all coefficients into probabilities, so you don’t have to worry about the metric of the coefficients.
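For anyone curious about the transformation from probit coefficients to probabilities, here is a minimal sketch. The intercept and the per-yard coefficient are made-up numbers chosen for illustration; they are not the estimates from Table 3.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Probability that a standard normal variable is at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(intercept, coefficients, values):
    """Sum intercept + coefficient * value, then map the z-score to a probability."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return standard_normal_cdf(z)


# Hypothetical model: intercept of 0.0 and 0.1 z-score units per yard gained.
p_zero_yards = probit_probability(0.0, [0.1], [0])
p_five_yards = probit_probability(0.0, [0.1], [5])
print(round(p_zero_yards, 3), round(p_five_yards, 3))
```

Because the CDF is S-shaped, the same coefficient moves the probability more when the z-score is near zero than when it is far out in either tail, which is why it is easier to read the results as probabilities.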

Several Words on Sampling Error

You may remember from some statistics course that it is generally good practice to report not just the point estimates from any statistical analysis, but also an estimate of sampling error. This is why when networks report polling data, they usually say something like “Candidate X is leading Candidate Y by 5 points [the point estimate], with a margin of error plus or minus 3 points [the sampling error estimate].”

Virtually all statistical software packages (I used Stata/SE 10) assume that the data were gathered via a simple random sample, in which all samples of a given size have an equal probability of selection. Clearly, my choice to non-randomly sample three teams and five seasons, and then take a census of all games (except for Baby Seal U games) and first down plays violates this assumption. Hence, this analysis isn’t necessarily representative of the nation-wide effects of first down yards (and other variables) on first-down probabilities. You should interpret all of these findings as merely relating to UM, OSU, and WVU for the years specified.

The Results

Descriptive Statistics

Figures 1 and 2 below show, respectively, the number of yards gained on first down and the starting field position for any particular series. Recall that there can be multiple series within a drive, so Figure 2 should not be interpreted as the starting field position for the drive.

Note from Figure 1 that the modal number of yards gained on first down is zero. Obviously, this can occur via an incomplete pass, a completed pass for no gain, or a rush for no gain. The distribution is right-skewed, although fairly normally distributed (excluding the zero yards bar) within a range of about a loss of 10 yards and a gain of about 20 yards.

Note from Figure 2 that the modal starting field position is 80 yards from the opponent’s goal line (or the offensive team’s 20). This is largely due to touchbacks on punts or kickoffs, of course.

Table 2 below shows the descriptive statistics by team for the variables used in the analysis. Note that the percentage of first down plays where the series ended in a first down or touchdown ranges from 66% for the 2009 UM team to about 76% for the 2006-2007 WVU teams. This should explain in part the 5-7 record of the former team and the shredding of opponents achieved by the mature WVU teams. Interestingly, OSU and Lloyd-era UM had about the same overall probability of getting a first down.

Time will tell if the RR UM teams can recapture that glory, or whether the spread n’ shred was simply more effective (1) in the Big East, (2) with Pat White/Steve Slaton, or (3) both (1) and (2).

One bit of hopeful evidence comes from the opponent total defense rank (near the bottom of Table 2). It doesn’t appear as though WVU played an appreciably easier average schedule than OSU, and if anything, WVU’s opponents finished their seasons with, on average, better-ranked defenses than either Lloyd-era or RR-era UM.

In terms of the primary independent variable of interest, Figure 3 shows the distribution of yards gained on first down, by team. Note that RR-UM was more likely than the other teams to lose from 1 to 4 yards on first down, less likely to gain from 3 to 5 yards, more likely to gain 6 or 7 yards (there may be a small sample size problem here), and less likely to hit a big play on first down (10 or more yards) than OSU or WVU.

Interestingly, RR’s WVU teams were less likely to gain 0 to 2 yards on first down, which is probably largely due to the lower percentage of passing plays on first down for WVU (17% vs. about 32-34% for the other three teams). This should demonstrate that RR/Magee understand that when you have Pat White, you run the ball on first down (and most downs thereafter). When you have Tate, you have to be more balanced. Say, maybe these guys do know about football…

Other points of interest from Table 2:

Lloyd’s teams were more disciplined on offense with respect to penalties than the Vest’s teams--about 4.7% of OSU’s series had at least one post-first down offensive penalty (recall that the first down penalties were folded into the “yards to go” variable), compared to 2.8% for Lloyd-UM. RR’s teams fall in between.

On the other hand, the Vest’s teams drew more post-first down defensive penalties than RR’s teams. Perhaps the passing attack invites more encroachment/pass interference calls than a more ground-based attack?

Turnovers! About 7.6% of RR-UM’s series ended in turnovers, compared to 4.0 to 4.7% for the other teams. Yikes.

PBRM Results

Figures 4-6 show some results from the regression analysis. First, Figure 4 shows the probability of getting a first down after selected numbers of yards on each first down play, assuming (1) it was first and 10, and (2) there was no penalty on the series.

Note that losing five or more yards on first down gives you about a 0.25-0.30 probability of getting a first down, whereas, obviously, gaining 10 or more yards is by definition a first down (on first and 10 at least).

In between these extremes, the first down returns to yards on first down are basically linear, though there are fairly noticeable inflection points between losing 5 or more and losing 1 to 4 yards (the first two points in the curves) and between gaining 3 to 5 and gaining 6 or 7 yards. By the way, I chose these categories based on exploratory analyses that showed that there was no statistically significant difference between gaining, say, 0, 1, or 2 yards.
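The yardage categories described in this section can be sketched as a simple binning function. The category labels are my own shorthand, and the 8-to-9-yard bin is an assumption, since the text names the other cut points but never mentions that range explicitly.

```python
def yards_category(yards: int) -> str:
    """Map yards gained on first down to the categories used in Figure 4."""
    if yards <= -5:
        return "lose 5+"
    if yards <= -1:
        return "lose 1-4"
    if yards <= 2:
        return "gain 0-2"
    if yards <= 5:
        return "gain 3-5"
    if yards <= 7:
        return "gain 6-7"
    if yards <= 9:
        return "gain 8-9"   # assumed bin, not named in the text
    return "gain 10+"


for y in (-7, -2, 0, 4, 6, 9, 12):
    print(y, yards_category(y))
```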

Finally, notice the similarity between the OSU and Lloyd-UM curves. This shouldn’t be particularly surprising, since those teams pursued fairly similar offensive strategies--lots of off tackle to Hart/Wells interspersed with daggers to Manningham/Ginn.

I was interested to see that WVU dominated the story, at all categories of yards gained on first down. That is, it isn’t true that the WVU offense bogged down especially on small losses or gains on first down. A great offense will overcome.

Figure 5 shows the probability of getting a first down by field position on first down, in 10-yard increments. There are basically four points here:

Being inside your own 20 reduces your probability of getting a first down, probably because of more conservative play calling;

There is basically no difference between the 20 and the 50;

Probabilities go up between the 50 and field goal range (a field goal attempt was coded 0 on the dependent variable, since there was no first down or touchdown);

The probability goes way down in field goal range, probably because coaches elect to take the 3 points instead of going for it on 4th (see the Mathlete’s excellent diary on this).

Figure 6 shows basically the same trends, broken down by teams. There isn’t much to see here, except that WVU was awesome, RR-UM sucked, and OSU/Lloyd-UM were basically indistinguishable. It looks like a good rule of thumb is that WVU had a 10-percentage point better probability of getting a first down than RR-UM and a 5-percentage point advantage over the Vest and Lloyd.

Table 3 shows the full regression results. There isn’t much new here, but just to recap:

Yards on first down matters a lot (duh I);

WVU kicked ass;

It’s harder to get a first down on first and 20 than first and 5 (duh II);

Field position doesn’t matter as much as you might think;

Offensive penalties make it harder to get a first down; defensive penalties make it easier (duh III); and

Ceteris paribus, passing on first down increases the probability of getting a first down on that series (though in analysis not shown here, I found that, not surprisingly, it increases the chances of a turnover [see Hayes, W.]).

One other thing: in the note to Table 3, it says that the “Pseudo R2” is .3008. This is a statistic calculated in the PBRM that is analogous to the R2 (r-squared) statistic in linear regression, which is interpreted as the percentage of the variation in the dependent variable that is explained by the model. It’s hard to say whether 30% is a lot or a little; all I know from the coding is that there were lots of series in which a team would lose 10 on first down and still get a first down, and others where they would gain 9 on first down and fail to get a first down. So, there is still a large stochastic component to the process.
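For the curious, the “Pseudo R2” that Stata prints for probit models is McFadden’s version, which compares the log-likelihood of the fitted model with that of a model that has only an intercept. Here is a minimal sketch of that calculation; the outcomes and predicted probabilities are toy numbers, not my data.

```python
import math


def log_likelihood(outcomes, probabilities):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probabilities)
    )


def mcfadden_pseudo_r2(ll_model: float, ll_null: float) -> float:
    """1 minus the ratio of the model log-likelihood to the null log-likelihood."""
    return 1.0 - ll_model / ll_null


# Toy example: four series, three of which ended in a first down.
outcomes = [1, 1, 1, 0]
null_probs = [0.75] * 4              # intercept-only model predicts the overall rate
model_probs = [0.9, 0.8, 0.8, 0.3]   # hypothetical fitted probabilities

ll_null = log_likelihood(outcomes, null_probs)
ll_model = log_likelihood(outcomes, model_probs)
print(mcfadden_pseudo_r2(ll_model, ll_null))
```

A value of zero means the model does no better than guessing the overall first-down rate on every series, and values closer to one mean the predicted probabilities sit closer to what actually happened.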

Stuff You’d Think Might Matter but Didn’t, Statistically

Statistically, variables that had no significant (but see “Several Words on Sampling Error” above) effect on the probability of getting a first down (net of the other variables included in the model shown in Table 3) included:

Home vs. not home game;

Which game of the season it was;

The quarter of the game;

The drive number (these last two suggest that there is not a robust effect of either “bursting out of the gate” or “starting sluggishly.” Sometimes teams start strong and finish weak, other times the reverse happens);

Number of previous first downs on a drive. This was interesting to me, because one often thinks, I think, that teams get “hot” on a drive. In other words, each first down makes it successively easier to get the next first down. My analysis suggests this is not true, at least in these data. There are a couple of explanations for this: one is that it does get slightly easier to get a first down the closer you get to your opponent’s goal line (though not in the field goal zone), so the two effects are collinear--the more first downs you get on a drive, the better your field position is, and it is that latter issue that affects first down probabilities. The second goes back to the stochastic component--there are just as many drives where a team will gain 3 first downs and then stall as ones where they will gain 3 first downs and then 2 more.

Conclusions

I have few beyond the things I’ve already mentioned. Basically, yards on first down are incredibly important, but not in any surprising way. The more yards you get, the better your chances are of getting a first down. However, there is a large random component to getting first downs, so yards aren’t everything.

In terms of UM football, it is clear that the mature spread n’ shred is lethal. But you already knew that. The question is whether UM can recapture that WVU magic. I guess I’m optimistic, for several reasons:

The RR offense requires experienced, athletic players, really at all offensive positions. This we now have, and/or are quickly cultivating.

A heavily run-based offense is slightly less likely to turn the ball over and much less likely to suffer no gain on first down (due to the lack of incomplete passes). This bodes well for sustained drives.

WVU played, on average, slightly better defenses than UM did (at least if you think total defense rank at the end of the season is a good indicator of defensive strength), and defenses that were as good as those played by OSU. So, at least by this figuring, there is no reason to think that UM’s current schedule is too good for us to be successful.

Obviously, the $1M unanswered question is whether the RR offense will be as successful at UM as it was at WVU. The analysis I have done can’t really speak to this question, but neither does it suggest obvious reasons why it won’t be successful. It does show how powerful the WVU version was, and I for one support giving RR enough time to have a reasonable chance to put that offense into place.