stats

I have a collection of statistics that I have kept for a number of years (since the 80s) regarding team level comparisons -- e.g. win-loss records, etc... Many of the stats are from the last 10 years to give a "current" flavor to the all-time records for those of you looking to back up your arguments with recent stats and attempting to debunk arguments like "you were good in the 1920s, but since then ..."

My main webpage is http://webpages.charter.net/ultimakhan/ and Brian was gracious enough to post it in one of his posts a few years back -- I just wanted to add a diary entry for those looking for an easier link to the content.

As Michigan fans get ready for year two of this newfangled offense, it strikes me that the old measures of success are no longer applicable.

In days of old when the ball was rarely in the air, just making receptions was a thing of beauty. But now with every other play being a swing pass or a slip screen, simply catching the ball is not enough.

Case in point, Marvelous Martavious Odoms. He, being the record holder of receptions, has yet to actually impress me. At least not on a consistent basis. Yes, he had 49 receptions. But what did he do with them? He averaged 9 yards per reception. He scored 1 touchdown. I yawned.

Because what you're not seeing in those stats is the number of drops he had. You're not seeing the number of times he gained 3 yards when we needed 4 on third down. And you're not seeing his pathetic work on returns.

Cumulative stats mean less and less these days. Back around the time I was born, teams played 10 games and maybe a bowl. Now some teams play 14 per year.

Texas Tech seems to set new passing records every year. But their quarterback went undrafted. For a while, John Navarre held most of the passing records at Michigan. John Navarre was probably the 5th or 6th best QB I've personally seen suit up for the maize and blue. (Brady, Griese, Harbaugh, Grbac, Collins, Henne, argue amongst yourselves)

The problem with these stats is that they only keep track of the good, without penalty of the bad. What would be much more telling are stats that include efficiencies.

"Aha, but what about the one hit wonders?!" I can hear you say. "What about the LB who catches 1 yard passes on the goal line for TD's? Should he be considered the best receiver?" No you fool. But he should be given props.

No. Stop thinking so one-dimensionally. This is a college full of engineers. So find one, buy him some beers, and get him to explain how one point does not a histogram make, my young padawan.

There is some consciousness of the need for better stats. Increasingly, commentators rely on things such as yards per carry or yards per attempt. These are better. It's two pieces of information combined into one. It's like Ernest Rutherford looking at the plum pudding model and saying, "Wait, we can do better!" But they're far from perfect.

This is why QB's have more complicated efficiency ratings. This is why Brian complains about redzone scoring efficiency. People know that flat stats are useless. (BTW redzone scoring efficiency should be points scored in the redzone per redzone trip. Yeah, it's not out of 100%, but it's an easy number to understand. A team with a score of 3.5 is not as good as a team with a 6.8. Or if you don't have a kicker and go for two every time, you could get a score of 8.0.)
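For anyone who wants to try the points-per-trip idea on a box score, here is a minimal Python sketch. The function name and the sample trips are made up for illustration; the only thing taken from the text is the definition (points scored in the redzone divided by redzone trips).

```python
def redzone_points_per_trip(points_by_trip):
    """Average points per redzone trip; a trip that ends without a score counts as 0."""
    if not points_by_trip:
        raise ValueError("need at least one redzone trip")
    return sum(points_by_trip) / len(points_by_trip)


# Made-up example: TD + PAT, field goal, turnover, TD + PAT.
trips = [7, 3, 0, 7]
print(redzone_points_per_trip(trips))  # 4.25
```

A team that scores a touchdown and converts the two-point try on every trip would land at exactly 8.0, which is the ceiling mentioned above.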

So let me be your Niels Bohr and suggest some stats that can take us to a deeper understanding of a receiver's value in football. If you happen to work for ABC or ESPN, please send me proper compensation for when you utilize these in your graphics.

Some of these are not of my original creation. Like Newton with the Principia, I simply gather these ideas and put my own concise twist on them to go along with my own inventions.

a) Yards per thrown at. This has three pieces of information in it. I want to know how many times he catches it, but give him a penalty for drops, and find out what he does with it after catching it.

b) TD's per redzone thrown at. This tells you if the WR is a big target who can get open in close space or box out effectively.

c) Conversion Efficiency = (receiving yards minus (half the yards to go)) multiplied by the down number, per thrown at. Gaining 12 yards on 4th and 10 is a 28. Gaining 6 yards on 1st and 10 is a 1. Gaining 6 yards on 3rd and 5 is a 10.5. Gaining 3 yards on 3rd and 8 is a -3.

d) Snag and Go = (Total receiving yards/(yards BEFORE the catch)) multiplied by (receptions per thrown at). This tells you if the kid is dependable, if he's got any shake and bake. The first ratio is high for a guy like Steve Breaston, but close to one for a guy like Jason Avant. But Avant would score higher on the second ratio, just not enough to overcome his lack of YAC.
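The four stats above can be sketched in a few lines of Python. This is only one reading of the definitions: the `Target` record and its field names are invented for illustration, an incompletion is assumed to count as zero yards (which is what gives the drop penalty), and Snag and Go is assumed to use season totals rather than per-play ratios.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """One pass thrown at a receiver. Field names are illustrative."""
    down: int               # 1 through 4
    to_go: int              # yards needed for a first down
    caught: bool
    yards: int = 0          # total yards gained on the play (0 if incomplete)
    air_yards: int = 0      # yards gained BEFORE the catch
    redzone: bool = False
    touchdown: bool = False


def _require(targets):
    if not targets:
        raise ValueError("need at least one target")


def yards_per_target(targets):
    # (a) Yards per thrown at: incompletions drag the average down.
    _require(targets)
    return sum(t.yards for t in targets) / len(targets)


def td_per_redzone_target(targets):
    # (b) TD's per redzone thrown at.
    rz = [t for t in targets if t.redzone]
    return sum(t.touchdown for t in rz) / len(rz) if rz else 0.0


def conversion_score(t):
    # (c) Single-play score: (yards - half the yards to go) * down number.
    return (t.yards - t.to_go / 2) * t.down


def conversion_efficiency(targets):
    # (c) Average single-play score over every pass thrown at the receiver.
    _require(targets)
    return sum(conversion_score(t) for t in targets) / len(targets)


def snag_and_go(targets):
    # (d) (total yards / yards before the catch) * (receptions per thrown at).
    _require(targets)
    caught = [t for t in targets if t.caught]
    before = sum(t.air_yards for t in caught)
    if before <= 0:
        raise ValueError("ratio is undefined without positive yards before the catch")
    total = sum(t.yards for t in caught)
    return (total / before) * (len(caught) / len(targets))


print(conversion_score(Target(down=4, to_go=10, caught=True, yards=12)))  # 28.0
print(conversion_score(Target(down=1, to_go=10, caught=True, yards=6)))   # 1.0
print(conversion_score(Target(down=3, to_go=8, caught=True, yards=3)))    # -3.0
```

One wrinkle the sketch makes visible: a receiver who lives on screens caught behind the line has zero or negative yards before the catch, so the Snag and Go ratio needs some extra rule for that case. Here it simply raises an error.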

So there it is, your new measures of receiver excellence. And if you think this is over the top, wait till you see what I've got in mind for QB's. Gametracker stats will have lots more colorful graphs.

And if I get bored enough this summer, I'll even prove that these stats work by going to mgovideo and getting numbers for Braylon, Mario, and Martavious, and show you just how much work the kid needs to do.

Note: This is a long and complex read. I know that. I'm looking for assistance with a project I'm working on that I know everyone will be interested in. If you wish to skip all of the reading, I have summarized everything in bullet points at the bottom.

I had hoped to keep this my little secret until I was completely done and I could unveil everything at once, but I no longer believe that I could do this project as efficiently without some other input. As an engineer, I require myself to do everything with as high efficiency as possible so I must petition the MGoBlog community for help.

As many (more likely all since you're on a site like this) of you are aware, there have been more and more threads being posted which essentially go down as so:

Poster 1: "We're going after slot-dot X and he's only 3 stars! Argh! Doesn't RichRod understand he's not at WVU anymore and he needs to get MICHIGAN quality recruits. RichRod=Fail."

Poster 2: "Stars don't matter, obviously RichRod thinks that he's good enough and that's good enough for me."

Poster 3: "Rankings are early, they'll change, just settle down for now"

Poster 2: "He's only 3 stars but look at his offer sheet, I'd take someone that's 3 stars with offers from USC, OSU, UF, 'Bama, etc. over a 5 star with offers from us and the MAC."

Poster 1: "Stars do matter, you need talent!"

Poster 2: "Mike Hart, Braylon Edwards... nuff said"

And so on and so on.

So, I started thinking about rankings and their usefulness at predicting future college and pro success. To that end, I'm going to undertake what I believe will be the largest statistical analysis of recruiting rankings to date. But I need some help.

Let me describe what I'm planning on doing, what I've already done to accomplish that goal, and what I still need to do. Then I'll finally be able to show everyone what I need help with. You'll also be informed enough to offer criticisms, advice, and ask questions if necessary.

1- What I plan on doing

I'm going to take all recruiting data from Scout and Rivals from 2002-2009. As of right now, that includes: name, positional rank, number of stars, HT/WT/40, position, hometown, and home state. I'm then going to also compile data on how many starts each player had in each year of his career, if he redshirted, if he left early for the draft (manifested as number of years of eligibility remaining), the number of All-Conference honors received, and the number of All-American honors received. I will also take information on if they were drafted, what round they were drafted in, what overall number they were drafted as, what position they were drafted for, and what team they went to.

Once I have all of that data, I will first do a top-level analysis to see, independent of everything else, how star rankings alone are at predicting collegiate and pro success as defined by the stats that I will have collected above.
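As a rough idea of what that top-level pass might look like, here is a small Python sketch that groups players by star rating and reports average career starts and draft rate. The rows are entirely made up and the column choice is an assumption; the real data set would have many more fields.

```python
from collections import defaultdict

# Entirely made-up rows for illustration: (name, stars, career_starts, drafted)
recruits = [
    ("Player A", 5, 30, True),
    ("Player B", 5, 12, False),
    ("Player C", 4, 25, True),
    ("Player D", 3, 38, True),
    ("Player E", 3, 0, False),
    ("Player F", 3, 10, False),
]


def summarize_by_stars(rows):
    """Group players by star rating and compute simple success measures."""
    groups = defaultdict(list)
    for _name, stars, starts, drafted in rows:
        groups[stars].append((starts, drafted))

    summary = {}
    for stars, players in groups.items():
        n = len(players)
        summary[stars] = {
            "players": n,
            "avg_starts": sum(s for s, _ in players) / n,
            "draft_rate": sum(d for _, d in players) / n,
        }
    return summary


for stars, stats in sorted(summarize_by_stars(recruits).items(), reverse=True):
    print(stars, stats)
```

With real data, the interesting question is whether the gaps between star groups hold up once sample sizes and position are taken into account, which is where the later, deeper cuts come in.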

From then on, I will keep trying to dig further to get more and more relevant models and conclusions. This will include but will not be limited to how the average rankings of the other players around another player (independent of that player's rankings) affect collegiate/pro success, the number of blue-chip recruits that completely fail, the number of blue-chip recruits that leave their home state, the average team ranking, success of rankings at predicting success at each individual position, the effect of positional ranking on future success, etc.

I'm going to try to come up with as many ways as possible to analyze the data that either decouple the data or give conclusions that are independent of coupling. Figuring out how to do that will be difficult but fun.

As a side note, this will also let me eventually compare Scout and Rivals to say with some authority, whose [final] rankings are more accurate.

Of course, I will also apply standard statistical analysis procedures to determine if my conclusions could be deemed statistically relevant or not (I don't know with what percent confidence yet so don't ask).

2- What I have done

It's all well and good to have thought all of this out (I'd be willing to wager that at least one other person currently reading this has thought about it), but thinking alone won't get any of us anywhere. So, I've started to do a lot of the grunt work as a sign of my commitment so that people will understand that I'm dedicated enough to make helping me worth their time.

I have already collected all of the information from Rivals for every class and every player.

So, for the classes from 2002-2009, I have every name, positional rank, Rivals Rating (RR), star rating, position (as Rivals breaks it down), and what school they committed to.

I have also created an Excel spreadsheet template that will allow me (once I get all of that data) to merely copy and paste a few things from Rivals and all of the data that I have on every player will be retrieved. With that, I will be able to create a spreadsheet for every BCS team (as Rivals only has complete listings for BCS teams) which will have every class and all of the data for each kid in every class all in one spot. Then I'll be able to do my analyses more easily.

3- What I still need to do

Obviously I'm still not done with the collecting data/grunt work as I still have to take all of Scout's data. It's taking a little while because of the way that they format their data compared to Rivals. Fortunately, I have solved the problem and can now do the usual copy and paste (followed by several other things to make it all work).

I'm considering also grabbing data from ESPN but I'm really not sure if it's even worth it. They only have data from 2007-2009 (I believe) so that doesn't even include a class that has been drafted yet.

More importantly, I need to find a source for the other data that I'm trying to collect. I need to find some place(s) that lists all of the following:

If a player redshirted

Number of starts each year

Every All-Conference team (not just first team) for all BCS conferences starting from 2002

There is also some other data that I’m going to try and collect but I already have sources for that so it need not be listed here.

4- What I need help with

I need help finding the data that I list above. Pieces of it are available everywhere but I haven’t found a single site that has a repository of all the information implied in even just one of those points above.

Additionally, getting individual statistics is extremely hard, even though it would allow more comparisons than possibly anything else. There are literally tens of thousands of players. There were over 1000 wide receivers in 2009 alone! There are simply too many players to try and go to each player’s individual profile page somewhere and collect the data. I, unfortunately, require lists. That is, unless there is some tool or way to automate that data collection process. I myself know of no such way but that is one of the reasons that I’m asking the MGoBlog community for help, because I don’t necessarily know everything that I could do to make this project as easy as possible (at least on the data collection front).

I’d also like to find a way to collect data on all of the schools that have officially offered a kid a scholarship to see if there is some way to show that stars or scholarship offers is, statistically speaking, the best measure of a kid’s future ability. Again, I can’t go to every Rivals profile page to try and collect that data. This is one area where I feel that since the pages are so similar, it might be possible to write some sort of script to do the work for me. Unfortunately, I’m a ChemE and MSE person, not a CSE person (for those of you outside the engineering that’s Chemical Engineering, Material Science Engineering, and Computer Science and Engineering respectively) so I don’t know what tool or utility I would go about using to accomplish that. I am in Tech. Services so I’m sure that if someone pointed out to me the appropriate tool and maybe some documentation on how to use it then I wouldn’t have any problems.
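For what it's worth, this kind of job can be sketched with nothing but the Python standard library. Everything about the page layout below is hypothetical: the sketch assumes each offer sits in an `<li class="offer">` element, which is almost certainly not how the real profile pages are marked up, so the parser would need to be adapted after looking at the actual HTML. The `fetch_offers` helper and its `delay_seconds` parameter are likewise just illustrative names.

```python
import time
import urllib.request
from html.parser import HTMLParser


class OfferParser(HTMLParser):
    """Collects the text inside <li class="offer"> elements (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.offers = []
        self._in_offer = False

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "offer") in attrs:
            self._in_offer = True
            self.offers.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_offer = False

    def handle_data(self, data):
        if self._in_offer:
            self.offers[-1] += data


def parse_offers(html):
    """Return the list of offering schools found in one profile page."""
    parser = OfferParser()
    parser.feed(html)
    return [o.strip() for o in parser.offers if o.strip()]


def fetch_offers(urls, delay_seconds=2.0):
    """Download each profile URL and parse it, pausing between requests."""
    results = {}
    for url in urls:
        with urllib.request.urlopen(url) as resp:
            html = resp.read().decode("utf-8", errors="replace")
        results[url] = parse_offers(html)
        time.sleep(delay_seconds)  # go easy on the site
    return results


sample = '<ul><li class="offer">Michigan</li><li class="offer"> Ohio State </li><li>Other</li></ul>'
print(parse_offers(sample))  # ['Michigan', 'Ohio State']
```

Keeping the parsing separate from the downloading means the parser can be tested on a saved copy of one page before it is pointed at thousands of them.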

Summary

I know that what I wrote above was long so here’s the summary (whether you read everything preceding this or not).

I’m going to perform a statistical analysis on Scout and Rivals to determine how good their final star ratings and positional rankings are at predicting future success both in college and the pros. To do so, I have already collected the data from Rivals and am currently working on collecting data from Scout. I will probably not take data from ESPN although that is not a certainty.

To determine collegiate success I will take data that includes but is not limited to All-Conference honors, All-American honors, and the number of starts. To determine pro success I will take into consideration where a player was drafted and for what position.

I know where to acquire some of the information that I need but I still need help finding useful places to take large amounts of data on:

Transfers

All-Conference teams

What position each player was drafted for

Number of starts by each player

I would also like to find a way to automate data collection, specifically with an eye towards collecting data on what schools offered each kid a scholarship. Since there are tens of thousands of kids this cannot be done individually but must somehow be automated. I do not know how to do that and am thus asking for help. The same situation applies for collecting individual, position-specific statistics on each kid.

If anyone would like to help me out with what I have asked, then I would greatly appreciate it. Any criticisms will be well-received (or at least as well-received as I can) and taken into account. Any comments or other thoughts are also welcome and appreciated.

For more information, read the sections above.

Update: 3-26-09

Since so many people have responded with helpful ideas, if you wish to contact me with anything that you either don't want to post in the comments, is too long and complicated for the comments, or that you wish to have a more private dialogue about then email me at: [email protected]

That's not my main email so I won't check it as often (i.e. not every 20 minutes) but I'll try to check it at least once a day. If you want to send me anything, links or other work that you've done that might help me, then send it there.

Thanks for all the great ideas and please keep them coming. I'm still thinking about ways to handicap teams that have a lot or a little talent relative to the average (for reasons that are too long to fully explain in this update, although there are some interesting thoughts on why and how in the comments below). I'm also looking for ways to automate the data collection process. There are a few suggestions below but I'm going to be looking for more so please tell me.

Again, I prefer using the comments if possible but if not then email me.

Update 2: 3-27-09

Well, it's been pointed out in the comments and confirmed by me that the email address listed above doesn't work. That's because I had a small typo. Of course, small typos in email addresses are big typos.

I wanted to try to put Steven Threet’s performance thus far into perspective. Some people think he’s a stiff who lathers his hands in butter before taking a snap. Others see him as a competent young QB who can hold his own despite having the deck stacked against him with the depth chart on offense. He’s probably somewhere in between... so let’s find out. One of my motivations has been that I’ve sensed Brian at MGoBlog souring on him a lot in the past couple of weeks. Obviously he isn’t Pat White in this offense, but he isn’t Ryan Mallett either. Let me explain...

I set out to compare his performance to other Michigan freshman quarterbacks. It is at this point, before even looking at a stat, where you can really understand Threet’s situation. Freshmen aren’t supposed to play at that position. No matter how good you are in high school, you’re not ready to start at a big-time college program. Ask Jimmy Clausen. Ask Ryan Mallett. You can argue that Threet is a RED-shirt freshman and not a TRUE freshman, but since graduating high school he’s spent a few months at Georgia Tech, a few months under Lloyd Carr, and a few months under Rodriguez. Not exactly a master’s course in preparation for this season.

There are only two circumstances where a freshman QB is likely to succeed:

He’s a freakish athlete and can rely on his athleticism to make up for his mental and physical deficiencies. Pryor at OSU falls into this category. His passing stats are very modest and they’re keeping things really simple. He uses his legs to overcome his passing limitations.

He’s surrounded by a talented, veteran team. This is the Chad Henne category. We saw in Henne’s sophomore year that he wasn’t the second coming of John Elway that we saw his freshman year. Why? Braylon Edwards. Henne knew that if he threw it near Braylon, he was likely to have a chance to catch it, no matter how mediocre the throw was. To some extent this applies to Pryor as well since he is starting behind a veteran offensive line with a Heisman quality RB.

Steven Threet has neither luxury. But how does he compare to other freshman QB’s? If someone can dig up NCAA statistics from all schools it would be really interesting. I took on just Michigan QB’s. And it wasn’t difficult: they barely play as freshmen. At this point, through 7 games, Threet has thrown the second most passes of any freshman QB that I could find in the recent history of the school. Let’s look at them:

Season  Name             GP  ATT  COMP  PCT   TD  INT  YDS   Y/A
2008    Steven Threet     7  145    76  52.4   6    3   792  5.46
2007    Ryan Mallett     10  141    61  43.3   7    5   892  6.33
2004    Chad Henne       12  399   240  60.2  25   12  2743  6.87
2000    John Navarre      5   77    40  51.9   8    1   583  7.57
1995    Scott Dreisbach   4  106    56  52.8   3    3   850  8.02
1989    Elvis Grbac       8  116    73  62.9   8    3   824  7.10
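The PCT and Y/A columns are just derived from the raw counts, so anyone extending the table to other schools can recompute them. A minimal Python sketch follows, using the attempts, completions, and yards listed above; the `derived` helper name is made up.

```python
# (season, name, attempts, completions, yards) copied from the table above
rows = [
    (2008, "Steven Threet", 145, 76, 792),
    (2007, "Ryan Mallett", 141, 61, 892),
    (2004, "Chad Henne", 399, 240, 2743),
    (2000, "John Navarre", 77, 40, 583),
    (1995, "Scott Dreisbach", 106, 56, 850),
    (1989, "Elvis Grbac", 116, 73, 824),
]


def derived(att, comp, yds):
    """Return (completion percentage, yards per attempt), rounded like the table."""
    pct = round(100 * comp / att, 1)
    ypa = round(yds / att, 2)
    return pct, ypa


for season, name, att, comp, yds in rows:
    pct, ypa = derived(att, comp, yds)
    print(f"{season} {name:<16} PCT {pct:5.1f}  Y/A {ypa:.2f}")
```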

I was surprised at Navarre’s stats. I remember him being horrible as a redshirt freshman and single-handedly giving away the UCLA game. But his stats aren’t horrible. Obviously Henne is on a plane all of his own, though his Y/A were lower than I expected considering the Braylon factor. Ryan Mallett sure was terrible, wasn’t he?

If I’m not mistaken, Threet is the first one of those players to be given the job out of necessity rather than due to an injury to a more experienced player, though I can’t remember Grbac’s situation. I also mentioned the lack of a surrounding cast. Here’s another chart comparing those QB’s.

Wow, nostalgia. I’m starting to wonder if Arrington’s departure hurt this offense more than anyone else.

I’m going to limit the number of conclusions I draw from this information and leave it open to interpretation. Obviously missing from this data are the number of fumbles and any rushing statistics.

My opinion from this information and from watching the games is this: Threet is not a very good quarterback right now, but he’s not a bad one either. Unlike some areas of the team he is showing progress from week to week and compares favorably to other freshmen at his position. The quarterback position is a problem for the team, but the problem is mostly due to a lack of depth and experience, not directly because Steven Threet is incapable of leading the team to victory. Threet is raw and inconsistent, but he’s competent and should develop into a solid QB in time. There is little in my limited analysis that shows he’s worse than most of those other guys. What Michigan lacks is a capable experienced QB (Jason Forcier?) who could hold down the fort this season.

Let’s give Threet time and the benefit of the doubt... and do whatever we can to keep Sheridan off the field.

I'm going to stop screaming like a little girl about Mike Barwis for a moment and play devil's advocate. My personal opinion is that he's a godsend and exactly what our program needs to break through to the next level. But I was reading some comments on the ESPN articles and realized that every story has two sides. So let's take a closer look at why we think he's so great and why fans of other teams might not agree.

Ryan m'fckn Mundy

What we think: If Barwis can change this guy from someone we were practically shoving out the door into an NFL draft pick, just imagine what he can do with players that don't inherently suck!!

What skeptics might say: Mundy wasn't really that bad before. He sucked because of bad coaching.

Upon Closer Inspection: In 2006 he had a whopping 25 tackles, 1 int, and 1 sack. I can't even find his name for 2005. (In other words, Antonio Bass had one more tackle than him in that year.) In 2004 he's credited with 51 tackles, 2 int, and 1 pass breakup. And he got 10 tackles in 2003. After one offseason of training with Barwis he had 62 tackles, 3 int, and 3 fumble recoveries. His sophomore year showed some promise, but the complete regression during 2006 might lend some weight to the bad coaching theory.

Gittleson was ancient

What we think: If your S&C program was designed in the 1970's and only two programs still use it, and the other one is grandpa Paterno at Penn State, then you are behind the times. Barwis is bringing a modern, cutting edge program, so we're no longer going to be decades behind the rest of the country, we're going to be decades ahead.

What skeptics might say: One S&C program is just as good as any other. What really matters is how much effort the players put forth.

Upon Closer Inspection: Michigan has had several players taken in the first round of the NFL draft, including Jake Long who was #1 overall this year. But there are quotes floating around the internets (citation needed) about how Michigan players getting ready for the combine were in for a world of hurt because of the backwards training they were used to. Nothing speaks louder than results, and if it's true that Brandon Graham was only doing 315 but is now maxing out at 475, well, Brian said it best. EEEEEEEE!!!!

Ninja Offense Scoring

What we think: Better conditioned players means more scoring!!!

What skeptics might say: WVU only scored that much because they didn't play anyone once Miami, VaTech, and BC left the Big East.

Upon Closer Inspection: I looked at the WVU scores for the past 5 years (since Barwis took over their S&C program) and found that the offense improved its per game scoring every year. In 2003 they scored 28.5 pts/game, 2004 was 30.0, 2005 was 32.1, 2006 was 38.8, and last year was 39.6. This is a good trend. It's true that WVU struggled offensively in the few games it played against the three teams that went to the ACC, averaging just 21.6 points against them. The one year WVU played all three was 2003, when they scored 20, 28, and 35 points against Miami, VaTech, and BC respectively. In bowl games, WVU scored just 7 and 18 points while getting clobbered by Maryland and FSU, but exploded with 38, 38, and 48 points against much higher ranked Georgia, GaTech, and Oklahoma in BCS bowls. Again, this is a good trend.
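The "improved every year" claim is easy to check mechanically. Here is a small Python sketch using the per-game numbers quoted above; labeling the final season as 2007 is an assumption based on "last year", and the function name is made up.

```python
# Points per game from the paragraph above; 2007 is assumed to be "last year".
ppg = {2003: 28.5, 2004: 30.0, 2005: 32.1, 2006: 38.8, 2007: 39.6}


def improves_every_year(ppg_by_year):
    """True if scoring went up in every consecutive pair of seasons."""
    years = sorted(ppg_by_year)
    return all(ppg_by_year[a] < ppg_by_year[b] for a, b in zip(years, years[1:]))


years = sorted(ppg)
changes = {b: round(ppg[b] - ppg[a], 1) for a, b in zip(years, years[1:])}

print(improves_every_year(ppg))  # True
print(changes)                   # {2004: 1.5, 2005: 2.1, 2006: 6.7, 2007: 0.8}
```

The year-over-year changes show most of the jump came between 2005 and 2006.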

Speed Kills!

What we think: Barwis will make the players faster! Rich Rod's offense works because of the high tempo and speed of the players.

What skeptics might say: You can't coach speed. Players are either fast or they're not. We've recruited players who were designed to be in a slow plodding offense. No amount of running will change that. If you don't have Pat White and Steve Slaton you're screwed.

Upon Closer Inspection: While it's true that genetics set the baseline for your speed, proper training can help you achieve maximum potential. You can't make a 5.4 guy into a 4.5 guy. But there is plenty of evidence to show that one or two tenths of a second improvement can be made. Especially for linemen who have not been pressured to run as much before. Many of the players we have recruited at the skill positions are already fast enough to run the spread, but the incoming freshman class and next year's commits are even faster to begin with.

WVU winning BCS games

What we think: This shows that RichRod/Barwis can compete with and beat the best.

What skeptics might say: Those opponents in the BCS games were overrated and the scores were close. USC, Florida, and LSU would have killed them.

Upon closer inspection: Who can say? We couldn't beat USC either. But WVU beating the SEC champ, the ACC runner up, and the Big12 champ in successive years is still pretty darn impressive no matter what the score.

We lost games because we sucked in the 4th quarter

What we think: Our crappy conditioning caused us to lose games in the 4th quarter. That won't happen with Mike Barwis on the job!!

What skeptics might say: We weren't that bad in the 4th quarter before; in fact we had several comebacks, like 28 pts against Minnesota and triple OT vs. MSU. So even if we lost a few in the 4th, it all evens out, and there won't be much difference now.

Upon closer inspection: In the last 5 years Michigan has lost 17 of 62 games, WVU lost 14 of 63, 9 of which were during Barwis's first 2 years as head of the S&C. In the 2nd half, Michigan was outscored 20 times, 8 of which resulted in losses after leading or being tied at the half. For WVU the numbers are 17 times, of which only 4 resulted in losing the lead (the others were games in which WVU was ahead by a lot or already behind). But if you look at just the 4th quarter, the numbers become more interesting. Michigan was outscored in the 4th quarter 19 freaking times. 6 of those were 4th quarter collapses where we lost the lead, and 4 of them were double digit 4th quarter leads. These all happened in 2004 and 2005. (2005 only had 5 games where we weren't outscored in the 4th quarter or 2nd half, so it truly earned the name 'season of unending pain'.) WVU was outscored in the 4th 21 times, BUT ONLY 1 RESULTED IN A LOST LEAD. One! One freaking game did they lose in the 4th quarter. It was Barwis's first game as head of S&C, against Wisconsin of all teams. So when Mike Barwis says we're not going to lose games in the 4th quarter, he means it. Conversely, WVU only won 7 games in the 2nd half, 2 of those in the 4th. Michigan won 7 games in the 2nd half, but 9 in the 4th. I think this has more to do with coaching. Lloyd would sit on a lead, lose the third quarter and then open up a bit to win in the end. Whereas with RR he doesn't hold back. He's either going to beat you and put you away in the first half, or just trail for the entire game.
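For anyone who wants to redo this tally from quarter-by-quarter box scores, the bookkeeping can be sketched in Python. The category definitions are one reading of the paragraph above, not necessarily how the original spreadsheet was built: a "lost lead" here means leading after three quarters and not winning in regulation, overtime is ignored, and the sample game is made up.

```python
def classify_game(team_q, opp_q):
    """team_q / opp_q: points scored in each of the four regulation quarters."""
    team_after3, opp_after3 = sum(team_q[:3]), sum(opp_q[:3])
    team_final, opp_final = sum(team_q), sum(opp_q)
    lead_entering_4th = team_after3 - opp_after3
    lost_lead = lead_entering_4th > 0 and team_final < opp_final
    return {
        "outscored_in_4th": team_q[3] < opp_q[3],
        "outscored_in_2nd_half": sum(team_q[2:]) < sum(opp_q[2:]),
        "lost_lead_in_4th": lost_lead,
        "blew_double_digit_4th_qtr_lead": lost_lead and lead_entering_4th >= 10,
    }


# Made-up game: up 24-10 after three quarters, lose 27-24.
print(classify_game([7, 10, 7, 0], [3, 7, 0, 17]))
```

Running this over every game in a season and counting the True values per category would reproduce the kind of totals quoted above.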

And lastly, Fire and Brimstone

What we think: Mike Barwis is super energetic! His passion is contagious! It's his way or the highway! If he followed me around at work, I'd become CEO in no time!!!

What skeptics might say: He's full of hot air. All S&C guys sound like that. He just uses meaningless jargon and doesn't really know what he's talking about. Anyone can swear up a storm, that doesn't make you a good motivator.

Upon closer inspection: Having only seen and heard a few scattered interviews and not having had the pleasure (hell) of a workout with him, I can't really say. The energetic part is on display in every one of his videos and can't be disputed. Whether or not it is effective is probably best seen in how the people he works with respond to it. But the quotes from people who have met him are all pointing towards legitimate charisma. Even the recruits continually mention positive things about him after only short introductions and demonstrations. Having a young, energetic guy around who attracts NFL alumni back to campus can only be a good thing for recruiting.

Conclusion: Mike Barwis. All your recruits are belong to him. Mike Barwis. Perhaps doing this research has tempered my enthusiasm for this coming season, but it looks like the program is headed on an upward track. Mike Barwis is the real deal. Maybe not a miracle worker, but he's exactly what we needed to move our program into the 21st century.

*ps. if you're interested in the data I used you can get it from mgoblue.com and wvustats.com or you can email me at BlueSeoul at hotmail.com and I'll email you the spreadsheet since I don't know how to upload it to the diary.