FO Basics: Seven Years of FO Research

by Aaron Schatz

Over the next couple weeks, we're going to run a series of articles we're calling FO Basics. We get a lot of questions about our work, but there are also a lot of readers who don't ask questions. We hope this series will help answer some questions and clarify some confusing things for even those readers who don't respond on the message boards.

The schedule:

August 30: Where our stats come from, and the difference between charting stats and play-by-play stats.

September 1: Our college stats, how they differ from our NFL stats and from each other.

September 6: The importance (and limitations) of watching games on tape.

September 7: Regression towards the mean -- what it means, and how we use it.

While reading Football Outsiders or the Football Outsiders Almanac, new readers will often come across an offhand comment about, for example, the idea that fumble recovery is not a skill, and wonder what in the heck we are talking about. In Football Outsiders Almanac 2010, we include an essay called "Pregame Show" which gives a basic look at some of the most important precepts that have emerged during seven years of Football Outsiders research. Today we are republishing that essay here as part of our preseason "FO Basics" series. You will also find links to the original research online when possible, or mentions of where that research appeared in print. (Some research was developed over time and therefore doesn't have specific articles to be linked.)

This essay is always featured on the website if you scroll to "About" in the green bar above and then scroll down and click on "FO Basics."

You run when you win, not win when you run.

The first article ever written for Football Outsiders was devoted to debunking the myth of "establishing the run." There is no correlation whatsoever between giving your running backs a lot of carries early in the game and winning the game. Just running the ball is not going to help a team score; it has to run successfully.

There are two reasons why nearly every beat writer and television analyst still repeats the tired old-school mantra that "establishing the run" is the secret to winning football games. The first problem is confusing cause and effect. There are exceptions, usually when the opponent is strong in every area except run defense, like last year's New Orleans Saints. However, in general, winning teams have a lot of carries because their running backs are running out the clock at the end of wins, not because they are running wild early in games.

The second problem is history. Most of the current crop of NFL analysts came of age or actually played the game during the 1970s. They believe that the run-heavy game of that decade is how football is meant to be, and today's pass-first game is an aberration. As we addressed in an essay in Pro Football Prospectus 2007 about the history of NFL stats, it was actually the game of the 1970s that was the aberration. The seventies were far more slanted towards the run than any era since the arrival of Paul Brown, Otto Graham, and the Cleveland Browns in 1946. Optimal strategies from 1974 are not optimal strategies for today's game.

A sister statement to "you have to establish the run" is "team X is 5-1 when running back John Doe runs for at least 100 yards." Unless John Doe is ripping off six-yard gains Chris Johnson-style, the team isn't winning because of his 100-yard games. He's putting up 100-yard games because his team is winning.

A great defense against the run is nothing without a good pass defense.

This is a corollary to the absurdity of "establish the run." If you don't believe us, meet our good friends the 2006-2007 Minnesota Vikings. With rare exceptions, teams win or lose with the passing game more than the running game -- and by stopping the passing game more than the running game. The reason why teams need a strong run defense in the playoffs is not to shut the run down early, it's to keep the other team from icing the clock if they get a lead. You can't mount a comeback if you can't stop the run.

Running on third-and-short is more likely to convert than passing on third-and-short.

On average, passing will always gain more yardage than running, with one very important exception: when a team is just one or two yards away from a new set of downs or the goal line. On third-and-1, a run will convert for a new set of downs 36 percent more often than a pass. Expand that to all third or fourth downs with 1-2 yards to go, and the run is successful 40 percent more often. With these percentages, the possibility of a long gain with a pass is not worth the tradeoff of an incomplete that kills a drive.

This is one reason why teams have to be able to both run and pass. The offense also has to keep some semblance of balance so they can use their play-action fakes, and so the defense doesn't just run their nickel and dime packages all game. Balance also means that teams do need to pass occasionally in short-yardage situations; they just need to do it less than they do now. Teams pass roughly 60 percent of the time on third-and-2 even though runs in that situation convert 20 percent more often than passes. They pass 68 percent of the time on fourth-and-2 even though runs in that situation convert twice as often as passes.

Standard team rankings based on total yardage are inherently flawed.

When you open your newspaper on Sunday morning, you'll see that the little agate-type previews of each game list team rankings by total yardage. That is still how the NFL "officially" ranks teams, but these rankings rarely match up with common sense. That is because total team yardage may be the most context-dependent number in football.

It starts with the basic concept that rate stats are generally more valuable than cumulative stats. Yards per carry says more about a running back's quality than total yardage; completion percentage says more than a quarterback's total number of completions. The same thing is true for teams; in fact, it is even more important because of the way football strategy influences the number of runs and passes in the game plan. Poor teams will give up fewer passing yards and more rushing yards because opponents will stop passing once they have a late-game lead and will run out the clock instead. For winning teams, the opposite is true. Did Detroit really have a better passing game than San Diego or New England in 2006, or did the Lions have more passing yards because they went 3-13 and thus threw the ball more than any team except for Green Bay, while the Chargers and Patriots were a combined 26-6 and spent a lot of time killing the clock with the running game?

Total yardage rankings are also skewed because some teams play at a faster pace than other teams. In 2009, Houston had 320 more total yards than Indianapolis, but that's not because Houston had the better offense. The Colts ran only 164 offensive drives that year, compared to 180 offensive drives for the Texans.
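To see why the drive counts matter, here is a minimal Python sketch of the pace adjustment. It uses only the three figures quoted above (180 drives, 164 drives, and the 320-yard gap). The function names are illustrative, and yards per drive is used here as a simple stand-in for a pace-adjusted rate, not as a description of how DVOA is calculated. Since the text does not give either team's season total, the sketch solves for the break-even point instead of assuming one.

```python
# Figures quoted in the text for 2009.
HOUSTON_DRIVES = 180
INDIANAPOLIS_DRIVES = 164
HOUSTON_YARDAGE_EDGE = 320  # Houston's total yards minus Indianapolis's


def yards_per_drive(total_yards: float, drives: int) -> float:
    """Rate stat: total yards divided by the number of offensive drives."""
    return total_yards / drives


def break_even_colts_yards() -> float:
    """Colts season total at which both teams average the same yards per drive.

    Solves Y / 164 = (Y + 320) / 180 for Y.
    """
    return (HOUSTON_YARDAGE_EDGE * INDIANAPOLIS_DRIVES) / (
        HOUSTON_DRIVES - INDIANAPOLIS_DRIVES
    )


def colts_better_per_drive(colts_total_yards: float) -> bool:
    """True if Indianapolis gained more yards per drive than Houston."""
    colts_rate = yards_per_drive(colts_total_yards, INDIANAPOLIS_DRIVES)
    texans_rate = yards_per_drive(
        colts_total_yards + HOUSTON_YARDAGE_EDGE, HOUSTON_DRIVES
    )
    return colts_rate > texans_rate


if __name__ == "__main__":
    print(break_even_colts_yards())
```

The break-even works out to 3,280 yards, a total that would be low for a full 16-game season. For any Colts total above that, Indianapolis gained more per drive even though Houston gained more overall.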

Pro Football Prospectus 2005, Cleveland chapter

Pro Football Prospectus 2005, New York Jets chapter

A team will score more when playing a bad defense, and will give up more points when playing a good offense.

This sounds absurdly basic, but when people consider team and player stats without looking at strength of schedule, they are ignoring this. In 2004, Carson Palmer and Byron Leftwich had very similar numbers, but Palmer faced a much tougher schedule than Leftwich did. Palmer was better that year, and better in the long run.

In 2007, Oakland running back Justin Fargas had four games with at least 115 rushing yards. Those games came against the teams ranked 31st (Miami), 30th (New York Jets), 29th (Houston), and 26th (Denver) in defensive DVOA against the run. On the other hand, he gained only 41 yards on 15 carries against Green Bay, ranked eighth, and only 58 yards on 22 carries against Minnesota, ranked first.

Because players and teams don't give the exact same performance every week, this is more of a general law, and it doesn't necessarily apply in the short term. Sometimes the short term lasts a whole year -- for example, if you are the 2006 Jacksonville Jaguars.

Pro Football Prospectus 2005, Cincinnati chapter

If their overall yards per carry are equal, a running back who consistently gains yardage on every play is more valuable than a boom-and-bust running back who is frequently stuffed at the line but occasionally breaks a long highlight-worthy run.

Our brethren at Baseball Prospectus believe that the most precious commodity in baseball is outs. Teams only get 27 of them per game, and you can't afford to give one up for very little return. So imagine if there was a new rule in baseball that gave a team a way to earn another three outs in the middle of the inning. That would be pretty useful, right?

That's the way football works. You may start a drive 80 yards away from scoring, but as long as you can earn 10 yards in four chances, you get another four chances. Long gains have plenty of value, but if those long gains are mixed with a lot of short gains, you are going to put the quarterback in a lot of difficult third-and-long situations. That means more punts and more giving the ball back to the other team rather than moving the chains and giving the offense four more plays to work with.

The running back who gains consistent yardage is also going to do a lot more for you late in the game, when the goal of running the ball is not just to gain yardage but to eat clock time. If you are a Chicago Bears fan watching your team with a late lead, you don't want to see three straight Matt Forte stuffs at the line followed by a punt. You want to see a game-icing first down.

A common historical misconception is that our preference for consistent running backs means that "Football Outsiders believes that Barry Sanders was overrated." Sanders wasn't just any boom-and-bust running back, though; he was the greatest boom-and-bust runner of all time, with bigger booms and fewer busts. Our play-by-play database only goes back to 1993, but Sanders led the league in rushing DYAR for 1996 and was second behind Terrell Davis in 1997.

Rushing is more dependent on the offensive line than people realize, but pass protection is more dependent on the quarterback himself than people realize.

Some readers complain that this idea contradicts the previous one. Aren't those consistent running backs just the product of good offensive lines? The truth is somewhere in between. There are certainly good running backs who suffer because their offensive lines cannot create consistent holes (Frank Gore in 2009, for example). Most boom-and-bust running backs, however, contribute to their own problems by hesitating behind the line whenever the hole is unclear, looking for the home run instead of charging forward for the four-yard gain that keeps the offense moving.

As for pass protection, some quarterbacks have better instincts for the rush than others, and are thus better at getting out of trouble by moving around in the pocket or throwing the ball away. Others will hesitate, hold onto the ball too long, and lose yardage over and over.

Note that "moving around in the pocket" does not necessarily mean "scrambling." In fact, a scrambling quarterback will often take more sacks than a pocket quarterback, because while he's running around trying to make something happen, a defensive lineman will catch up with him.

Shotgun formations are generally more efficient than formations with the quarterback under center.

Over the past three seasons, offenses have averaged 5.9 yards per play from Shotgun, but just 5.1 yards per play with the quarterback under center. This wide split exists even if you analyze the data to try to weed out biases like teams using Shotgun more often on third-and-long, or against prevent defenses in the fourth quarter. Shotgun offense is more efficient if you only look at the first half, on every down, and even if you only look at running back carries rather than passes and scrambles.

Clearly, NFL teams have figured the importance of the Shotgun out for themselves. Over the past four seasons, the average team has gone from using Shotgun 19 percent of the time to 36 percent of the time, not even counting the Wildcat and other college-style option plays that have become popular in recent years. Before 2007, no team had ever used Shotgun on more than half its offensive plays. In the past two seasons, five different teams have used Shotgun over half the time. It is likely that if teams continue to increase their usage of the Shotgun, defenses will adapt and the benefit of the formation will become less pronounced.

Pro Football Prospectus 2007, Tampa Bay chapter

A running back with 370 or more carries during the regular season will usually suffer either a major injury or a loss of effectiveness the following year, unless he is named Eric Dickerson.

Terrell Davis, Jamal Anderson, and Edgerrin James all blew out their knees. Larry Johnson broke his foot. Earl Campbell and Eddie George went from legendary powerhouses to plodding, replacement-level players. Shaun Alexander broke his foot and became a plodding, replacement-level player. This is what happens when a running back is overworked to the point of having at least 370 carries during the regular season.

The "Curse of 370" was expanded in Pro Football Prospectus 2006 to include seasons with 390 or more carries in the regular season and postseason combined. Research also shows that receptions don't cause a problem, only workload on the ground.

Plenty of running backs get injured without hitting 370 carries in a season, but there is a clear difference. On average, running backs with 300 to 369 carries and no postseason appearance will see their total rushing yardage decline by 15 percent the following year and their yards per carry decline by two percent. The average running back with 370 or more regular-season carries, or 390 including the postseason, will see their rushing yardage decline by 35 percent, and their yards per carry decline by eight percent. However, the Curse of 370 is not a hard and fast line where running backs suddenly become injury risks. It is more of a concept where 370 carries is roughly the point at which additional carries start to become more and more of a problem.
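The thresholds and average declines above can be written down as a small Python sketch. The function names are hypothetical, and treating "no postseason appearance" as zero postseason carries is a simplifying assumption. Because 370 is not a hard line, the output should be read as a group average and not as a prediction for any one back.

```python
from typing import Optional, Tuple

REGULAR_SEASON_THRESHOLD = 370  # regular-season carries
COMBINED_THRESHOLD = 390        # regular season plus postseason carries

# Average year-over-year declines quoted in the text:
# (share of rushing yards lost, share of yards per carry lost).
HEAVY_WORKLOAD_DECLINE = (0.35, 0.08)
MODERATE_WORKLOAD_DECLINE = (0.15, 0.02)


def is_curse_of_370_season(regular_carries: int, postseason_carries: int = 0) -> bool:
    """True if a season meets either workload threshold described in the text."""
    return (
        regular_carries >= REGULAR_SEASON_THRESHOLD
        or regular_carries + postseason_carries >= COMBINED_THRESHOLD
    )


def average_next_year(
    rushing_yards: float,
    yards_per_carry: float,
    regular_carries: int,
    postseason_carries: int = 0,
) -> Optional[Tuple[float, float]]:
    """Apply the group-average decline for the back's workload bucket.

    Returns (rushing yards, yards per carry) for the following year, or None
    when the workload falls outside both groups described in the text.
    """
    if is_curse_of_370_season(regular_carries, postseason_carries):
        yards_drop, ypc_drop = HEAVY_WORKLOAD_DECLINE
    elif 300 <= regular_carries <= 369 and postseason_carries == 0:
        # Assumption: zero postseason carries stands in for "no postseason appearance".
        yards_drop, ypc_drop = MODERATE_WORKLOAD_DECLINE
    else:
        return None
    return rushing_yards * (1 - yards_drop), yards_per_carry * (1 - ypc_drop)
```

For example, `average_next_year(1500, 4.0, 380)` applies the 35 percent and eight percent declines, while `average_next_year(1200, 4.0, 320)` applies the milder 15 percent and two percent declines.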

Research in Pro Football Prospectus 2008 suggests that overuse in college does not create a problem for top prospects, but also shows that players chosen after the first round rarely have a successful NFL career after a college season over 330 carries.

Wide receivers must be judged on both complete and incomplete passes.

We don't yet know enough to precisely parse the blame for incomplete passes, but we know that wide receiver catch rates are as consistent from year to year as quarterback completion percentages. Since 2001, Hines Ward has never caught fewer than 59 percent of intended passes, whether from Kordell Stewart, Tommy Maddox, or Ben Roethlisberger. Plaxico Burress, playing with the same quarterbacks as well as with Eli Manning, never caught more than 58 percent of intended passes, and in three different years had a catch rate below 50 percent. However, it is also important to look at catch rate in the context of the types of routes each receiver runs. We recently expanded on this idea with a new plus/minus metric.

The total quality of an NFL team is three parts offense, three parts defense, and one part special teams.

There are three units on a football team, but they are not of equal importance. Our DVOA ratings provide good evidence for this. The special teams ratings are turned into DVOA by comparing how often field position on special teams leads to scoring compared to field position and first downs on offense. After figuring out these numbers, the top ratings for special teams are roughly one-third as high as the top ratings for offense or defense.

Offense is more consistent from year to year than defense, and offensive performance is easier to project than defensive performance. Special teams is less consistent than either.

Nobody in the NFL understands this concept better than Indianapolis Colts general manager Bill Polian. Both the Super Bowl champion Colts and the four-time AFC champion Buffalo Bills of the early 1990s were built around the idea that if you put together an offense that can dominate the league year after year, eventually you will luck into a year where good health and a few smart decisions will give you a defense good enough to win a championship. (As the Colts learned in January 2007, you don't even need a year, just four weeks.) Even the New England Patriots, who are led by a defense-first head coach in Bill Belichick, have been more consistent on offense than on defense since they began their run of success in 2001.

Teams with more offensive penalties generally lose more games, but there is no correlation between defensive penalties and losses.

Defensive penalties often represent strong defensive play that goes just over the line between legal and illegal. As long as penalties are only called every so often, this kind of close play leads to successful defense.

Connected to this precept: The penalty that correlates highest with losses is the False Start, and the penalty that teams will have called most consistently from year to year is the False Start.

Pro Football Prospectus 2007, St. Louis chapter

Field-goal percentage is almost entirely random from season to season, while kickoff distance is one of the most consistent statistics in football.

This theory, which originally appeared in the New York Times in October 2006, is one of our most controversial, but it is hard to argue against the evidence. Measuring every kicker from 1999 to 2006 who had at least ten field goal attempts in each of two consecutive years, the year-to-year correlation coefficient for field-goal percentage was an insignificant .05. Mike Vanderjagt didn't miss a single field goal in 2003, but his percentage was a below-average 74 percent the year before and 80 percent the year after. Adam Vinatieri, supposedly the best kicker in the game, has never had two straight seasons with accuracy better than last year's NFL average of 83 percent.

On the other hand, the year-to-year correlation coefficient for kickoff distance, over the same period as our measurement of field-goal percentage and with the same minimum of ten kicks per year, is .61. The same players consistently lead the league in kickoff distance, particularly Rhys Lloyd, Olindo Mare, and Stephen Gostkowski.
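For readers who want to try this kind of measurement themselves, here is a minimal Python sketch of a year-to-year correlation. It pairs each kicker's value in one season with his value in the next, keeps only pairs where both seasons meet the ten-attempt minimum, and computes a standard Pearson coefficient. The data layout and function names are assumptions made for illustration, not our actual database or code, and no real kicking data is included.

```python
from math import sqrt
from typing import Dict, List, Tuple

MIN_ATTEMPTS = 10  # minimum kicks per season, as in the text

# Hypothetical layout: {kicker: {year: (attempts, value)}}, where value is
# a field-goal percentage or an average kickoff distance.
Seasons = Dict[str, Dict[int, Tuple[int, float]]]


def consecutive_season_pairs(seasons: Seasons) -> List[Tuple[float, float]]:
    """Collect (year N value, year N+1 value) pairs that meet the minimum."""
    pairs = []
    for by_year in seasons.values():
        for year, (attempts, value) in by_year.items():
            following = by_year.get(year + 1)
            if following is None:
                continue  # no season recorded for the next year
            next_attempts, next_value = following
            if attempts >= MIN_ATTEMPTS and next_attempts >= MIN_ATTEMPTS:
                pairs.append((value, next_value))
    return pairs


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def year_to_year_correlation(seasons: Seasons) -> float:
    """Correlation between one season's value and the next season's value."""
    pairs = consecutive_season_pairs(seasons)
    return pearson_r([a for a, _ in pairs], [b for _, b in pairs])
```

A coefficient near zero, like the .05 for field-goal percentage, means one season tells you almost nothing about the next. A coefficient like the .61 for kickoff distance means the same kickers tend to stay near the top or bottom from year to year.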

Recovery of a fumble, despite being the product of hard work, is almost entirely random.

Stripping the ball is a skill. Holding onto the ball is a skill. Pouncing on the ball as it is bouncing all over the place is not a skill. There is no correlation whatsoever between the percentage of fumbles recovered by a team in one year and the percentage they recover in the next year. The odds of recovery are based solely on the type of play involved, not the teams or any of their players.

Fans like to insist that specific coaches can teach their teams to recover more fumbles by swarming to the ball. Chicago's Lovie Smith, in particular, is supposed to have this ability. However, since Smith took over the Bears, their rate of fumble recovery on defense went from a league-best 76 percent to a league-worst 33 percent in 2005, then back to 67 percent in 2006. Last year, they recovered 57 percent of fumbles, close to the league average.

Fumble recovery is equally erratic on offense. In 2008, the Bears fumbled 12 times on offense and recovered only three of them. In 2009, the Bears fumbled 18 times on offense, but recovered 13 of them.

Fumble recovery is a major reason why the general public overestimates or underestimates certain teams. Fumbles are huge, turning-point plays that dramatically impact wins and losses in the past, while fumble recovery percentage says absolutely nothing about a team's chances of winning games in the future. With this in mind, Football Outsiders stats treat all fumbles as equal, penalizing them based on the likelihood of each type of fumble (run, pass, sack, etc.) being recovered by the defense.

Other plays that qualify as "non-predictive events" include blocked kicks and touchdowns during turnover returns. These plays are not "lucky," per se, but they have no value whatsoever for predicting future performance.

Pro Football Prospectus 2005, New Orleans chapter

Field position is fluid.

Every yard line on the field has a value based on how likely a team is to score from that location on the field as opposed to from a yard further back. The change in value from one yard to the next is the same whether the team has the ball or not. The goal of a defense is not just to prevent scoring, but to hold the opposition so that the offense can get the ball back in the best possible field position. A bad offense will score as many points as a good offense if it starts each drive five yards closer to the goal line.

A corollary to this precept: The most underrated aspect of an NFL team's performance is the field position gained or lost on kickoffs and punts. This is part of why players like Devin Hester and Josh Cribbs can have such an impact on the game, even when they aren't taking a kickoff or punt all the way back for a touchdown.

Also, see the book The Hidden Game of Football by Pete Palmer, Bob Carroll, and John Thorn.

The red zone is the most important place on the field to play well, but performance in the red zone from year to year is much less consistent than overall performance.

Although play in the red zone has a disproportionately high importance to the outcome of games relative to plays on the rest of the field, NFL teams do not exhibit a level of performance in the red zone that is consistently better or worse than their performance elsewhere, year after year. The simplest explanation why is a small(er) sample size and the inherent variance of football, with contributing factors like injuries and changes in personnel.

Football Outsiders Almanac 2009, "The Red Menace"

Defenses which are weak on first and second down, but strong on third down, will tend to decline the following year. This trend also applied to offenses through 2005, but may or may not still apply today.

We discovered this when creating our first team projection system in 2004. It said that the lowly San Diego Chargers would have one of the best offenses in the league, which seemed a little ridiculous. But looking closer, our projection system treated the previous year's performance on different downs as different variables, and the 2003 Chargers were actually good on first and second down, but terrible on third.

Teams get fewer opportunities on third down, so third-down performance is more volatile -- but it's also a bigger part of a team's overall performance than first or second down, because the result is usually either very good (four more downs) or very bad (losing the ball to the other team with a punt). Over time, a team will play as well in those situations as it does in other situations, which will bring the overall offense or defense in line with the offense and defense on first and second down.

This trend is even stronger between seasons. Struggles on third down are a pretty obvious problem, and teams will generally target their off-season moves at improving their third-down performance ... which often leads to an improvement in third-down performance.

However, we have discovered something surprising over the past three years: The third-down rebound effect seems to have disappeared on offense, as we explain in the Philadelphia chapter of Football Outsiders Almanac 2010. We don't know yet if this change is temporary or permanent, and there is no such change on defense.

Injuries regress to the mean on the seasonal level, and teams that avoid injuries in a given season tend to win more games.

There are no doubt teams with streaks of good or bad health over multiple years. However, teams who were especially healthy or especially unhealthy, as measured by our Adjusted Games Lost (AGL) metric, almost always head towards league average in the subsequent season. Furthermore, injury -- or the absence thereof -- has a huge correlation with wins, and a significant impact on a team's success. In 2008, six of the seven least-injured teams in the league made the playoffs. The teams with the biggest drop in injuries between 2008 and 2009 were Baltimore and Dallas, a pair of playoff teams. Teams with a high number of injuries are a good bet to improve the following season.

Pro Football Prospectus 2008, "The Injury Effect"

By and large, a team built on depth is better than a team built on stars and scrubs.

This is connected to the previous statement, because teams need to go into the season expecting that they will suffer an average number of injuries no matter how healthy they were the previous year. The Redskins went into 2006 with a Super Bowl-quality starting lineup, and finished 5-11 because they had no depth. You cannot concentrate your salaries on a handful of star players because there is no such thing as avoiding injuries in the NFL. Every team will suffer injuries; the only question is how many. The game is too fast and the players too strong to build a team based around the idea that "if we can avoid all injuries this year, we'll win."

Running backs usually decline after age 28, tight ends after age 29, wide receivers after age 30, and quarterbacks after age 32.

This research was originally done by Doug Drinen (editor of Pro-Football-Reference.com). In recent years, a few players have had huge seasons above these general age limits (most notably Tiki Barber, Tony Gonzalez, and Terrell Owens), but the peak ages Drinen found a few years ago still apply to the majority of players.

During the summer of 2007, ESPN The Magazine asked us to research when players decline at "non-skill" positions. This research was not as rigorous as our usual work, and needs a little more attention before we're ready to stand by it. For the curious, however, the preliminary results said that defensive ends and defensive backs generally begin to decline after age 29, linebackers and offensive linemen after age 30, and defensive tackles after age 31.

The future NFL success of quarterbacks chosen in the first two rounds of the draft can be projected with a high degree of accuracy by using just two statistics from college: games started and completion percentage.

This theory was introduced in Pro Football Prospectus 2006 and further refined in Pro Football Prospectus 2007. The projection created by these stats is known as the Lewin Career Forecast, after the creator of the theory, David Lewin, who now works for the Cleveland Cavaliers.

Scouts expected players such as Kyle Boller (48 percent), Jim Druckenmiller (54 percent) and Ryan Leaf (54 percent) to suddenly figure out how to complete passes once they hit the NFL. It isn't surprising that it didn't happen. Having a high completion percentage (above 60 percent or so) is no guarantee of success, especially if it was done in a small number of games in a fluky system (Tim Couch being a strong example), but it is a prerequisite for it. Games started are important because the more film that exists of a player in game conditions, the easier it is to find weaknesses that might come out against different opponents or different schemes. When scouts don't get sufficient information, they place too much weight on "measurables" and off-field workouts, and make mistakes like Couch (26 starts), Leaf (24 starts) or Akili Smith (19 starts).

The Lewin Career Forecast only applies to the first two rounds because it assumes that with enough game film to judge, scouts can accurately identify players who are "system quarterbacks" and will not succeed in the NFL, and those players appropriately fall on draft day (Texas Tech quarterbacks like Graham Harrell are a good historical example).

From 1996-2005, the worst quarterback drafted in the top two rounds who had 37 or more college starts and a completion rate above 60 percent was Eli Manning. When the worst projection belongs to a quarterback who just led a two-minute drill to finish off a historic Super Bowl upset, that's a good projection system. However, the Lewin system has mixed successes (Kevin Kolb) with failures (Brady Quinn, Matt Leinart, Brian Brohm) in recent years, and we're likely to revisit it in the near future.

Highly-drafted wide receivers without many college touchdowns are likely to bust.

Football Outsiders Almanac 2009 introduced a new metric called Playmaker Score, which measures rookie wide receivers by simply multiplying average yards per reception and total career touchdowns in college. Players who score high in this metric do not necessarily become stars in the NFL, but no first- or second-round pick with a score below 8.0 has yet lived up to his draft position. Like the Lewin Career Forecast, Playmaker Score is far more accurate with receivers chosen in the first two rounds, and it doesn't seem to work for hybrid slot receiver/running backs such as Percy Harvin and Dexter McCluster.

Football Outsiders Almanac 2009, "Introducing Playmaker Score"

Championship teams are generally defined by their ability to dominate inferior opponents, not their ability to win close games.

Football games are often decided by just one or two plays -- a missed field goal, a bouncing fumble, the subjective spot of an official on fourth-and-1. One missed assignment by a cornerback, or one slightly askew pass that bounces off a receiver's hands and into those of a defensive back five yards away, and the game could be over. In a blowout, however, one lucky bounce isn't going to change things.

Championship teams beat their good opponents convincingly and destroy the cupcakes on the schedule. Certainly there are exceptions to this rule, including the past two Super Bowl champions. Unless this becomes a trend that lasts four or five years, it is hard to say this rule no longer exists.

Comments

Nobody in the NFL understands this concept better than Indianapolis Colts general manager Bill Polian. Both the Super Bowl champion Colts and the four-time AFC champion Buffalo Bills of the early 1990s were built around the idea that if you put together an offense that can dominate the league year after year, eventually you will luck into a year where good health and a few smart decisions will give you a defense good enough to win a championship

Does he understand this, or was he just in a position to acquire a very good quarterback?

I mean that Ozzie Newsome fellow seems like a bright guy, but he built the Ravens off defense. So why would he do that? My guess is that elite defenders were available to him. So that's the route he took.

Since some of your research keys off broadcast talking points, how about the larger scope of NFL "parity?" TV commentators often mention how game events, draft behavior, and rule changes reflect parity--or don't. Do FO stats have a long-term view for, or point to a better understanding of, league parity?

No, you're mixing up cause and effect, such as in the RB situation. They're not saying that trouncing the weak teams MAKES a team good, they're saying that good teams trounce the weak teams BECAUSE they're good already. So it doesn't matter if a team schedules cupcake teams or not (at least as far as this point is concerned).

I'm not sure I agree with the last point. Now, I agree with the statistically inspired principle: that strong teams trounce cupcakes and win against strong opposition most of the time. I agree that "good teams win close games" is nonsense.

I don't agree that the last two Super Bowl champions are a counterexample. They just weren't very strong teams. The Saints were probably the fourth or fifth best team in the playoffs, let alone the league, and Pittsburgh, well, wasn't terribly strong either. The fact that they don't fit the profile of strong teams isn't a knock on the profile, it's a knock on the teams.

The Saints had several notable defensive players injured in the second half of the season who were healthy in the postseason. The injuries were particularly acute in the secondary. Jabari Greer was probably the most notable, since the Saints rely so much on their corners to run Gregg Williams' schemes.

Games Jabari Greer was healthy (not including last game of the regular season when several other starters were resting for the playoffs):

As I said, there were several other players injured around midseason who were healthy when the playoffs came around. But Greer was probably the most important.

I don't think the Saints' season disproves the 'Guts versus Stomps' argument. What it does is emphasize the importance of health (which is another point in this list). When the playoffs came around they resembled the healthy team of the early part of the season, which had a string of stomps.

The 2008 Steelers weren't strong? They were 2nd in DVOA and went 12-4 against a brutal schedule. They weren't as good as Tennessee or the Giants that year (IMO, the 2 best teams, at least until Plex shot himself), but they were still strong.

I would agree with this statement if it were stated in August/September 2008, but New Orleans had the top point differential in 2009, and Pittsburgh was in the top 5 in 2008 while playing arguably one of the 5 toughest schedules in the league. There have always been ebbs/flows to the accuracy of this statement, much of it due to which teams are considered to be SB contenders.

Probably the biggest exception to this recently is the Colts under Manning - they seem to win a disproportionate share of close games while a team like Baltimore destroys weak teams then looks futile against the Colts (they've lost the last 7 games by an average score of 24-9). Of course they don't play each other in the regular season this year, so unless they face each other in the playoffs we won't be able to test this situation again.

Could it be that the elder Manning is like Kobe Bryant in close games?

"Championship teams beat their good opponents convincingly and destroy the cupcakes on the schedule."

I think the next sentence was saying that the last two champions were an exception to *this* rule - that they happened to win championships without necessarily dominating their opposition like good teams generally do. (Reading between the lines, they were perhaps more lucky than good. But they certainly get to keep the trophies.)

commissioner leaf (comment #4)
If you want to argue that MIN & IND were better than the Saints last year (and MAYBE the Cowboys), I think we can have a reasonable debate. Both of those games were close, and could have gone either way. But tell me--which other teams were better???? However, considering the Saints beat the 1st two, and only lost by a TD when they were driving, in DAL territory, in the last seconds of the game, I don't think you can say FOR SURE that ANY team was better than the Saints last year.
A couple of points that show them being the BEST of them all (which also are pretty important to winning games): the BEST pass defense in the RED ZONE; the BEST 4th Q point differential (~+90); the BEST running game, according to DVOA (helps bleed the clock at the end); arguably the BEST QB in the NFL RIGHT NOW (at the least, he set a new record for comp. %--and he beat the other two).
Now--you want to cite ANYTHING in your favor???

This column is like a greatest hits album, which is cool unless you're a hipster. I love all the FO stuff except for the metrics for individual non-QB performance. I appreciate that the perfect shouldn't be the enemy of the good but I'm not convinced that these metrics are an improvement over conventional statistics plus subjective assessment.

Exactly, the FO system severely penalizes players that make big plays, and ALY and DYAR conflict with one another. As discussed in another thread:

RB A starts on his 20 and rushes for 3 consecutive 10 yard runs resulting in 1st downs. The team stalls on the 50 yard line and punts. The end result is a total of 6.45 success points for that RB.

RB B starts on his 20 and takes the handoff for 79 yards and is tackled at the 1. This is worth 5.7 success points for the RB.

Now, when judging a *team's* performance and predicting future outcomes, RB A may be in a better situation since ripping off 3 consecutive 10 yard runs probably means RB A is decent and his line is decent. RB B's run is more of an outlier that doesn't correlate well with his team winning (not that he didn't contribute more than RB A) due to the inability to predict long plays. However, the play itself *should* describe how good the player actually is - ideally with a different metric like "individual success points". It does not. With success points RB A > RB B.

Great piece. FO has contributed an impressive amount to the discussion, and might be the largest force behind getting football stats analysis out of the stone age. My hat is off to you guys; I've been a fan since I discovered the site in 2006. There's no better place on the web for getting at the nuts and bolts of the game, IMHO.

If there was one thing about FO I could change, however, it would be the various authors' occasional overstatement of their case. There are many instances when obvious objections to the argument are overlooked. And while FO presents itself as merely the messenger of cold, objective facts, there are times when the editorial side comes through too strongly.

I only mention it because the very first thing Aaron discusses is one of these beaten-to-death horses that FO has gotten (at least partially) wrong:

The first article ever written for Football Outsiders was devoted to debunking the myth of "establishing the run." There is no correlation whatsoever between giving your running backs a lot of carries early in the game and winning the game. Just running the ball is not going to help a team score; it has to run successfully.

There are two reasons why nearly every beat writer and television analyst still repeats the tired old-school mantra that "establishing the run" is the secret to winning football games.

Obvious counterargument: Sure, there are some idiot commentators and analysts (perhaps even coaches) who think that simply giving the running back the ball 20 times before halftime, regardless of the quality of the runs, is "establishing the run." But, seriously now, almost everyone else takes "establishing the run" to include successful running. So it becomes a red herring to talk about "no correlation whatsoever between giving your running backs a lot of carries early in the game and winning the game." I'd be willing to bet there's a correlation when those runs are successful, which is all most coaches, etc., are getting at.

Anyway, this is but one quibble, albeit a recurring one, that I have with an otherwise excellent website. This piece is a nice celebration of FO's contributions to football analysis over the last few years. Congrats to you guys.

Yes, there is often a difference between what commentators mean by "establish the run," and what coaches mean by "establish the run." You have to remember that a) this was less true seven years ago and b) we're mostly reacting to commentators. When I first wrote that article, it was common to watch a Tennessee Titans game and hear the announcers going on and on about "establishing the run" as Eddie George constantly ran into his line for no gain. And as I often say, FO was not established to improve NFL coaching and management. It was established to improve NFL coverage by the media.

Aaron: Did you see the Brian Burke column in the Times today? He suspiciously uses the "kneel to win" argument but attributes it to his dad when he was a kid. Let's just say I'm skeptical that Burke pere was that prescient and sophisticated a football fan.

Burke seems to hold a grudge against FO. He's written a few pieces criticizing the statistical rigor of FO's methods (particularly the curse of 370, and more recently the predictive powers of DVOA), which is somewhat reasonable, but has gone way overboard in hyperbole and name calling. He almost seems to take FO's success personally.

I have no idea who first coined the sarcastic "kneel to win" anti-wisdom, or whether this has anything to do with FO, but it does seem a little suspicious.

. . . but, I've also noticed knee-jerk reactions against any occurrence of the phrase "establishing the run." Not necessarily by FO writers, but often enough that I'm on my own, mini-countercrusade against it.

This knowledge is the reason why I cringe every time I hear about how the Steelers want to go back to their 'run first Steelers style football'.
Every time Tomlin runs some double-TE I formation up the middle on first down for 2 yards... I die a little.

Yeah, the point of the double-TE formation is to force the defense to fill the box, and then go over the top on it, while allowing for very sound pass protection. That is the classic Joe Gibbs approach to offense, which usually gets misrepresented as a Woody Hayes, three yards and a cloud of dust, strategy. Of course, you do have to actually be able to consistently rip off some five and six yard runs out of the formation, and have the personnel to get vertical, to make it work.

What has always been interesting to me is that another acorn from the Sid Gillman/Don Coryell tree, Mike Martz, decided to just spread the field, put out five guys in patterns on each pass, and let the QB take his licks.

and even then, an accurate depiction of the "cloud of dust" offense would include the significant advantage in talent that Ohio State (and Michigan, who at the time ran a similar offense) enjoyed over the majority of their opponents.

In that context, it was enough simply to run the ball to exhaustion; perhaps by no coincidence, those teams rarely found success in the Rose Bowl with that offense. But as depth in the conference began to improve, even at that level, a straight-ahead run-heavy offense was no longer guaranteed to be a success to the degree it was in the '70s.

However, run-heavy offenses are still popular in the conference. Last year, Wisconsin ran on about 63% of its plays, even though they gained 8.1 yards per attempt through the air, second only to Penn State in the Big Ten. (Their S&P+ was better for passing than for rushing as well.)

And Ohio State is still the "leader" in running the ball, 64% of the time. (Note that those are using conventional NCAA stats that count sacks as rushing attempts.) With S&P+ about even for rushing and passing, it's almost as though Tressel is trying to compensate for a passing attack that isn't where he wants it to be. Detractors would suggest that he's afraid to throw the ball because it might give the Buckeyes an insurmountable lead, and what's more exciting than a close game?

But back to the pro side of things, yeah, when teams line up in a heavy formation and then plod up the middle, it's weird to hear commentators explain that this is exactly what you want to do. Funny, I'd rather see the Lions go to a 2-TE Ace formation and throw to Pettigrew, Johnson, and RB-of-the-day to get the safeties to back off and then hit the line a few times.

Agreed. I think the Steelers' generally top-10ish offense in the last decade has more to do with having superior personnel than superior game planning.

I was hopeful when Rooney and Tomlin and Arians all stressed they wanted to run more efficiently, not more often, and I know pre-season play balance doesn't carry over to the real season, but yikes. I'm not looking forward to my blood pressure this year.

I think one of FO's greatest contributions has been to emphasize how large a role pure random chance plays in the outcome of close games, and how much confirmation bias skews our perspective of the game. I had known this on some level since I was a kid, but until it was quantified better, it had not been driven home completely. I've been watching the game closely for 35 years, but it has been only in the last 6 years or so that I think I've had a clear view of what I was seeing, or perhaps more accurately, it is only in the last 6 years or so that I've appreciated how little I really "know" with a lot of confidence.

A side-effect has been that, while I had little use for television yappers during games before I started reading FO, I rarely have the announcers turned up at all anymore. I sometimes can get the crowd noise on the second language feed, without announcers, and that is ideal. It would be one thing if play-by-play guys still did a decent job on substitutions, but they seem to be mostly concerned with mugging with the useless analysts, so why bother?

First the nitpick: " We help this series will help answer some questions".

Methinks you meant "We hope this series..."

Otherwise, fantastic article. One question I've always had about running back ability vs. quality of offensive line is how Walter Payton fits into this, as I remember that his offensive line was a shambles in his prime years. So when/if you ever get to analyzing that period, I would be really interested in what you come up with.

Aside from what Aaron wrote up top, things I've learned over the years include:

1. Never trade up to draft Florida State QB Adrian McPherson in a community mock draft. You will never live the ridicule down.

2. From the Brady-Manning debate, I learned that Brady is better at rescuing lost kittens from trees, Manning better at later consoling the firemen who didn't get there in time.

3. I bear witness that there is no punter but ROBO-PUNTER, and he is worth a 1st-round draft pick and the salary it commands.

4. Al Davis may be crazy and he may hurt his team year after year, but do you really want to see a world in which raiderjoe goes without the man? I, for one, am not that cruel.

5. Bears fans will eternally think that this is the year their free agent machinations will pay off. The only pieces that were missing before were that seventh-round wideout and that last half-retired, Alzheimer's-stricken tackle/backup-running back/center-with-one-arm.

There are so many things FO has done for the game, both in making it more exciting to analyze and watch, so thank you, guys, for all the hard work you've done and congratulations on your well-earned success.

This really was a good refresher--like getting together with a bunch of guys from high school/college that you assume you knew already and would be bored with, only to realize, when you're done, that there were good reasons you were friends in the first place.

FO has really contributed to general understanding of the hidden aspects of football, and maybe, in some ways (if teams start using similar analysis) improved the game itself. I know that's a little crazy sounding, but I would not be surprised if a few younger, astute assistants were FO fans, and carried some of this info forward into their jobs as they climb the NFL ladder....

All these years later, and I STILL have a problem accepting guts and stomps, but that's MY problem. The data and analysis seem right, it just conflicts with my instincts.

I must admit that I never really understood your Guts and Stomps article, but using the definitions from it, the 2009 Saints during the regular season had:
Stomps 3
Guts 2
Dominations 3
Skates 2
and added 2 Dominations and 1 Gut in the playoffs.

When you write: "Championship teams beat their good opponents convincingly and destroy the cupcakes on the schedule", that means more Stomps and Dominations than Guts and Skates, which the Saints certainly had. The Saints don't seem to be an exception to the rule.
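For anyone who wants to check that tally, here is a minimal sketch in Python. It uses only the counts quoted above; grouping Stomps with Dominations as "convincing" wins and Guts with Skates as "close" wins is the reading taken in this comment, and the variable and function names are made up for illustration. Games that fall outside the four categories are simply not counted.

```python
# Counts quoted above for the 2009 Saints; the names here are illustrative only.
regular_season = {"Stomps": 3, "Guts": 2, "Dominations": 3, "Skates": 2}
playoffs = {"Dominations": 2, "Guts": 1}

# Assumption: Stomps and Dominations are the "convincing" wins,
# Guts and Skates are the "close" ones.
CONVINCING = {"Stomps", "Dominations"}


def split(tally):
    """Return (convincing, close) win counts for one tally."""
    convincing = sum(n for kind, n in tally.items() if kind in CONVINCING)
    close = sum(n for kind, n in tally.items() if kind not in CONVINCING)
    return convincing, close


regular_convincing, regular_close = split(regular_season)
playoff_convincing, playoff_close = split(playoffs)
total_convincing = regular_convincing + playoff_convincing
total_close = regular_close + playoff_close

print(f"Regular season: {regular_convincing} convincing, {regular_close} close")
print(f"With playoffs:  {total_convincing} convincing, {total_close} close")
```

Under that grouping the convincing wins outnumber the close ones both in the regular season and with the playoffs added.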

What may be a bit unusual about the Saints is that if they had experienced some moderately bad fumble recovery luck in the Conference Championship, they likely would have been thoroughly stomped on their home field.

Couple of counterpoints, Will:
Fumble recovery is luck--stripping it out is a skill. The Saints repeatedly STRIPPED the ball, and on a couple it was OBVIOUS they were trying to strip it out.
It also seemed that both teams got 1 "lucky" recovery--the Saints when Favre & Peterson fumbled the handoff near the goalline, and the Vikes when Peterson somehow recovered his own fumble in between about 5 Saints.

Which is why I wrote "moderately bad fumble RECOVERY luck" (caps added). Yes, the Saints stripped effectively, and the Vikings protected the ball poorly. Still, if the Saints had experienced just a little worse luck in recovering fumbles (b.t.w., let's not overlook the fumble that didn't show up in the stats, when the Saints had the ball knocked loose in the fourth and inches, only to have the ball pinned between the linebacker's shoulder pad and the running back's thigh pad) they very, very likely lose, with a significant chance of losing by 14 or more. They were thoroughly whipped on the line of scrimmage, and for all the attention given to Favre's last pass, Favre outplayed Brees by a non-trivial margin. All my Viking fan bitterness aside, it isn't very usual for the eventual Super Bowl champ to have their QB outplayed, and to be dominated on the line of scrimmage, on their home field, in the playoffs.

I understand the point about field goal percentage being random, but isn't a kicker who has never had a season where he's converted fewer than 80% of his attempts probably better than one who bounces around in the 70% range? You aren't saying that it's COMPLETELY random and there's no difference in talent between one kicker and another, are you?

Also, just to point out, you said that Adam Vinatieri has never had consecutive seasons where he's been better than 83%. Well, Nate Kaeding (playoffs notwithstanding) has been better than 83% four years in a row, and has been better than 87% in 4 of his 6 NFL seasons. It seems like 172 attempts over 6 years and a career 87.2% conversion rate should tell us that he's better than, say, Kris Brown who's spent most of his career in the 60-70% range, with a few years breaking 80%.

I think what the research shows is that all NFL kickers are pretty good at their job, and most variation in field goal percentage from year to year is just random noise rather than talent of the kicker changing.

It also shows that field goal percentage is a poor measure of true accuracy. I'm guessing if you measure exactly where a kicker put the ball on every kick, you could build a decent model for predicting accuracy.
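To put a rough size on that noise, here is a minimal sketch that treats each attempt as an independent coin flip with a fixed make probability (a simplification, since real kicks vary by distance and conditions). The 80% true rate and the 30 attempts per season are assumptions chosen for illustration, not figures from the research, and the function name is made up.

```python
import math


def fg_pct_standard_error(true_rate, attempts):
    """Standard error of an observed make rate under a simple binomial model."""
    return math.sqrt(true_rate * (1 - true_rate) / attempts)


# Assumed inputs: an 80% kicker attempting 30 field goals in a season.
se = fg_pct_standard_error(0.80, 30)
print(f"One standard error: about {se * 100:.1f} percentage points")
```

Under those assumptions a single season's percentage can swing by roughly seven points in either direction from chance alone, which is comparable to the gap between a good kicker and a mediocre one.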

It really doesn't make that much of a difference. Kickers rarely attempt more than 40 FGs a season; a difference of 10% in accuracy works out to less than a point per game for the heavily-used kickers.

Yes, Kaeding is more accurate than Kris Brown ... but even at that, you're talking about the current career leader in FG% and comparing him to one of the least-accurate active kickers in the NFL. And even at that, the difference is less than 10%: Brown's career percentage is 77.3%. So you go back to the 1-point-per-4-games difference.

Brown is another example of the unpredictability of kickers: on two separate occasions in his career (1999-2000 and 2007-08), Brown's had back-to-back seasons greater than 83%.
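The points-per-game arithmetic above can be sketched in a few lines of Python. The 40 attempts and the two career percentages come from this thread; the 16-game season and 3 points per field goal are stated as assumptions, and the function name is made up for illustration.

```python
def points_per_game_gap(attempts, pct_better, pct_worse, games=16, fg_points=3):
    """Expected scoring gap per game between two kickers given equal attempts."""
    extra_makes = attempts * (pct_better - pct_worse)
    return extra_makes * fg_points / games


# A flat 10% accuracy gap for a heavily used kicker (40 attempts).
flat_gap = points_per_game_gap(40, 0.90, 0.80)

# Kaeding (87.2% career) vs. Brown (77.3% career), both assumed to get 40 attempts.
career_gap = points_per_game_gap(40, 0.872, 0.773)

print(f"10% gap: {flat_gap:.2f} points per game")
print(f"Kaeding vs. Brown: {career_gap:.2f} points per game")
```

Both cases come out under a point per game, and most kickers attempt fewer than 40 field goals, so the real gap is usually smaller still.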

I would support an amendment to the United States Constitution to the effect that no person would be permitted to speak on a televised NFL broadcast unless and until they could recite each and every one of these conclusions verbatim.

This particular precept should be amended with "and passes not thrown to them, if only we had a way to do this." A receiver who seldom gets thrown to is probably doing a poor job, whatever his catch percentage and yards per catch.

Still, I love this stuff. Thanks for posting an article that lets us focus on football analysis.

I have a question regarding the philosophy on boom-or-bust RBs vs. ones that can continuously rip off 5-yard gains. There seems to be no mention of how many people the defense commits to the run. I do not know the numbers, so maybe I am being foolish, but from my limited perception I would guess that Adrian Peterson is considered a less than perfect boom-or-bust running back. Don't the 8 guys in the box Brett Favre gets to see on first down mean anything? I just find it hard to believe that Adrian Peterson is valued as basically an above-average running back, when defensive coordinators are forced to tailor the personnel to his appearance in the backfield. When you watch the guy play, whenever he gets any sort of room, he makes something happen. The run blocking he gets is just atrocious, and their 20th-ranked placement on this site seems to be artificially inflated by Adrian's success.