Thursday, October 15, 2009

An Interview with Richard Billingsley

Of the six computer ratings in the BCS formula, the Billingsley Report is unquestionably the most controversial. And of the seven programmers (Anderson & Hester count as two), Richard Billingsley is undoubtedly the most opinionated and colorful. The Billingsley Report has been part of the BCS since its second season, 1999, but the data goes all the way back to 1869, to Princeton vs. Rutgers, the first college football game ever played.

With the 2009 BCS Standings set to debut on Sunday, the Guru decided to have a chat with Mr. Billingsley this week in a no-holds-barred, hour-long phone interview. This is what he had to say:

Guru: Why are you a college football fan?

Billingsley: I can honestly say I came by it naturally. I was born into a football-crazy family. My grandfather and parents were big college football fans and I remember being a fan as a young child at 5-6 years old. I was literally a fan before the game was on TV. It really boggles my mind that college football has gone from being on the radio to black-and-white TV to color TV and now to every outlet you can think of. The first football game that left an impression on me was in 1957, when Notre Dame beat Oklahoma and ended (the Sooners') 47-game winning streak. I was born in Oklahoma and still live here. I remember the entire state in such shock over that loss, literally like someone had died in the family. That made such an impact on me because I figured if it's that important to adults, there must be something to it, so I was hooked.

Guru: How did you get into the rankings business?

Billingsley: I was interested in college football rankings, such as the AP and the UPI - the coaches poll back in the old days - at a very young age. Back then, the polls came out on Tuesdays. The voting was done so late on Sundays it didn't meet press deadlines, so the rankings did not come out until Tuesdays in the papers. I'd run home from school looking at those polls in the newspaper and most of the time, I'd go 'man, did these guys watch the same games as I did? This is not right. There has to be a better way of ranking teams.' Even when I was a teenager, I thought they were not paying attention to strength of schedule and playing favorites with traditional powerhouse schools. I didn't think these should be tolerated. So I sat down one day and started tinkering with my own mathematical formula.

Guru: When did you first publish your rankings?

Billingsley: I first ran it for a couple of years, in 1968 and '69. In 1970 it was published for the first time, by a local neighborhood newspaper in Houston. I did that for years. It was just a hobby, something I printed out for friends, family and coworkers. It wasn't until 1994 that I started wondering who'd been No. 1 in my system back in the 1920s, and I thought I could find out by running my system through those years if I had all the information I needed. So I sat down and wrote letters to every Division I school's SID (Sports Information Director), asking for their football records as far back as they had them. And every SID responded and sent me their press guide. That was my starting place. What I did not realize I'd come up against was that there's a vast number of discrepancies between schools - they did not agree on the date, or the score, or in one instance, who won the game. But eventually I pieced everything together like a jigsaw puzzle. I started in 1869 and ran everything from that point through the current year. I finally finished it in 1996 - it took me two years to do all that. One of my friends would tell me: 'you're either the most dedicated college football fan or you lead the most pathetic life of any man.'

Guru: And how did your rankings get into the BCS?

Billingsley: When I finished the project, I was pretty proud of what I had done and I took the results and mailed them to Richard Campbell, the director of statistics for the NCAA. He said it was amazing and he'd like to publish this in the NCAA records book. It was in the 1996 records book, and it was the first time ever I had gotten any recognition. And then the real break came in 1999, when Roy Kramer (then BCS chair and commissioner of the SEC) wanted to expand the computer rankings from three to seven. His deputy Charles Bloom called me and said we got your name from the NCAA and would like to see your work. And for the first time, my rankings were included in 1999. The fact is, in the six computers, there are the most time-tested brilliant mathematical minds - everybody but me, I'm not a mathematician - but one of the requirements was that the BCS needed to see 10 years of rankings on a week-by-week basis and see how far back you could go. (Bloom) was amused that I had such a vast amount of information. How I ended up getting in the BCS was probably because they were impressed with my research.

Guru: You had to alter your formula after the BCS required that margin of victory (MOV) be dropped as a component. How did you feel about that?

Billingsley: That happened after the 2001 season, but I had actually agreed to take (MOV) out before the BCS requested us to take it out. It had become so obvious to me during the 2001 season that the coaches were using the scores of the game to manipulate the computers. It was the most unsportsmanlike thing I've ever seen in my life and I wanted no part of it. I agree that using MOV gives you a better predictor for future games, a more accurate predictor. But with it or without it, it has never changed my top two teams at the end of the season.

Guru: Your system tends to produce rankings quite distinct from the other five BCS computers. Why is that?

Billingsley: My system is probably the most different from the other computer systems. The other five guys are looking at it from a purely mathematical standpoint - don't get me wrong, I applaud their systems and I have tremendous respect for what they do. But my system is not purely mathematically based. My rankings are based on rules that are put in place from a fan's perspective, things that I think are important for ranking college football teams. My rankings are more closely related to human voters, an improved AP poll, if you will. It reacts to games more like a human voter but does it without biases like the name of the team, the conference they play in, etc. It's mainly concerned with wins, losses, strength of schedule (SOS) and head-to-head results. The core of my system is not something you see in most computers. It's not necessarily better - in purely mathematical terms, it's not as good - but the public relates very well to the system.

Guru: So how do you respond to the accusation that your system is a 'one-man poll'?

Billingsley: My response is that it's a 100 percent computer-generated formula, there's no personal input on a week-by-week basis. Anyone can duplicate the system if they have the program. Other than that I wrote the program, I have no influence.

Guru: But you do have a preseason ranking. Isn't that inherently biased?

Billingsley: In my system, you carry the rankings from one season to the next, exactly from last season's final rankings. You must have a starting position. Both Sagarin and Massey use that philosophy; the only difference is I found a way to do it without MOV and still have a pretty accurate system. Some people think before the season, everybody should start out equal, but my answer to that is it looks fair on the surface to start everyone equal, but it's not logical because we know for a fact that all teams are not equal, so how can we ignore that? It's more fair if Idaho starts out the same as Southern Cal. It's fair to Idaho. But is it fair to USC? In my mind, (to start out teams equally) skews any hopes of an unbiased SOS. That's why my SOS is dramatically different from what other computers are showing you.

Guru: Does it bother you that your rankings get tossed at a disproportionally high rate every week?

Billingsley: I certainly know my rankings give a different perspective. I know there are those out there who like to say, 'look, Billingsley gets thrown out more than other computers, so it must not be any good.' My response is this: My computer program is more correlated to human votes than other computers, so of course it gets thrown out more often. But that's what the BCS was trying to accomplish. They don't want me to be like the other computers, they want a different perspective. It doesn't bother me that my rankings get thrown out more often, but it bothers me that people don't understand why.

Guru: Do you get disgruntled fans writing you as a result?

Billingsley: I do get hate mail from fans but you can't satisfy everybody. It's always from the fans of a few teams feeling that they're getting the shaft. If they love your ranking, they won't write, so 90 percent of the mail is bad. When this happened 10 years ago it broke my heart. But over the years, it just rolls off my back. I have always tried my best to respond to every email I get, but I finally had to post a message that says I won't respond to anything with vulgarity. I've gotten some disgustingly vulgar email - those people really are not fans, they're just disgruntled human beings. You may not agree with me but at least you can be respectful.

Guru: What would you do to change the current BCS formula?

Billingsley: A couple of things. First, the weight between computers and humans should be 50-50. That's the way it was and then it changed because of an overreaction to that particular season (2003). In this formula, we're ignoring SOS - 33 percent is not enough weight to accurately describe SOS. The voters just don't have the capacity to gauge SOS the way they should, even if they had the inclination to do so. There is a difference, from a mathematical standpoint, between a 47th-ranked team and 48th but (the voters) cannot give us that distinction. That's why computers should have more weight. Second, they need to stop throwing out the high number (among the computers). They can throw out the low number but keep the high number. The reason I'm saying that is the six formulas we have now are narrowed down from hundreds of computers and it's the best we have, so we really shouldn't toss out two more in every ranking. For example, last season, I continuously had USC higher than other systems and at the end of the season, guess where they ended up? So my question is, why is it that the Trojans are not allowed to get the benefit of a system that gives them a higher rating?

Guru: Aside from the formula, do you want to see the BCS changed in any way?

Billingsley: I'm a fan of the BCS, but not because I'm a part of it. I think this system is the best way to bring together a No. 1 and No. 2 and keep the integrity of the season intact. I'm not a fan of a playoff. I prefer the way it is. In some seasons, the plus-one would make a lot of sense but it doesn't in every season. So whatever we do it's got to fit every season, that's what people forget. If you make a change based on one season's result, you're making a mistake. Any change that takes place should be made over a long period of investigation and research. We don't need a plus-one. If I have one criticism (for the BCS), it's that in the early years they made too many changes based on a knee-jerk reaction. But in their defense, they didn't have a choice. It was something we were all experiencing for the first time.

The problem with Billingsley's ranking system is that two teams could play identical schedules and have identical records, but the team that had the better season the previous year will be ranked higher. He talks about trying to be fair, but this is about the most unfair thing I can think of.

Billingsley's system still allows teams to manipulate their schedules to game themselves to the top of the rankings.

The best way to evaluate teams is to see what they did against good teams (teams with a winning record -- 7 out of 12, with FCS schools not counting). Good teams rarely lose to bad teams, so the best teams are those that win against the best competition.

Looking back to 2002, while the Ohio State/Miami Fiesta Bowl was good and all, USC and Georgia both played harder schedules in harder conferences and were at least as good as, if not better than, either Ohio State or Miami, with each school winning more games against good teams in the regular season.

Without a playoff, we are left with more mythical national champions. The only difference is we are told that Florida and OU were the best two teams in 2008 -- I disagree -- Texas, Utah, and USC were equally deserving and demonstrate why a playoff is necessary.

"The core of my system is not something you see in most computers. It's not necessarily better - in purely mathematical terms, it's not as good - but the public relates very well to the system."

I've always thought the entire purpose of the computer rankings was to be purely mathematical and unbiased, and who cares if the public relates well? As he even says elsewhere in the interview, people are happy when their team is ranked highly and mad when it's not. The general public doesn't seem to be the best judge of these things.

Personally I'd like to see everyone follow Colley's lead and actually explain what it is they're doing. This way we can judge for ourselves (or at least those of us who are mathematically inclined) whether the algorithms are worth their weight.

Samuel Chi

The Guru is a journalist who takes time from his busy schedule to provide this important public service. And of course, the Guru is so well-rounded that he has interests beyond the gridiron and crystal ball. Check out his other adventures -- after first buckling your seat belt.