Football Statistics: the Impact of Smiling

If you've ever watched a professional football game (and this is probably true for most professional sports) then you have seen these little portraits of players that appear at the bottom of the screen. On some TV networks they are actually short video clips where the players announce their alma mater, on some networks they are animations where the players each raise their heads and occasionally blink (these creep me out), and for other networks these head-shots are just still photos. Some players smile in their photos, some do not.

Key & Peele have a recurring bit about this player introduction phenomenon.

While watching a Seahawks game this past year, my mother-in-law posed an amusing question: Do players who smile in their photos play better football?
The question is simple and whimsical, in other words perfect. I don't know anything about how often these photos are taken, what the player's mindset is when they're shot, or if there is any prior expectation about attitude/persona and player record. I set out to find some answers...

For this study I am only focusing on quarterbacks (QBs) in American professional football (NFL), though it would be easy to extend to all positions if anyone can help me get the data! Right away I know I'll need a few ingredients: photos of each player, some classification of their smiles, and some real stats on their records in the NFL.

Gathering Data

For player stats I used the 2013 season data for all 63 QBs in the NFL here. For each player I manually clicked on their profile and saved their headshot image. This was a bit tedious, as image filenames had to be matched up to player records, and was the main reason I didn't extend this analysis to other positions. With 63 players' images and stats in hand, it was time to write some code!

Ranking Smiles

How do you quantify who is smiling most or least in photos? There are certainly computer vision or machine learning approaches to this problem, but I am not a master of either, and 63 images is not a good training set for such a study. Instead I chose to rank the smiles by hand.

At first I decided to try and sort all 63 photos at once. Sorting 63 things at once turns out to be fairly difficult, and the best approach is to print all the images on paper, lay them on a table, and sort them by hand. This approach is entirely arbitrary, and I don't have a color printer. Plus, humans are much better at "A or B" choices, rather than "rank A thru Z".

Instead, since "Who is smiling more?" is a fairly subjective question, I decided to try an "A or B" approach. I would compare two randomly chosen players, look at their photos side-by-side, and choose who I thought was smiling more. Simple A versus B choices! If these random trials are repeated enough, a total ranking should emerge.

With these "A or B" trials, I could use the Elo rating system to assign each player a total score. This rating system is a classic algorithm, developed for rating chess match-ups between players of different ranks (experts versus novices). In this rating system, when a top-ranked player (or in my case, top-ranked smile) is matched up against a low-ranked player (frown) and wins, the top-ranked player only gains a few points and the low-ranked player only loses a few. However, if the low-ranked player were to win (an upset), they would gain a lot of points, and the top-ranked player would lose just as many. Variations of this system have been used in sports, video games, and even in choosing baby names.
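To make the "few points versus a lot of points" behavior concrete, here is a minimal sketch of a standard Elo update. The K-factor of 32 and the 400-point scale are the common chess defaults and are assumptions on my part for this illustration, not necessarily the values used for the smile rankings. The function names are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, a_won, k=32.0):
    """Return the new (rating_a, rating_b) after one A-versus-B match-up.

    k=32 is an assumed K-factor; larger values make ratings move faster.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total number of points is conserved.
    return rating_a + delta, rating_b - delta


# A big smile (1800) against a frown (1400).
favorite, underdog = 1800.0, 1400.0
print("favorite wins:", update_elo(favorite, underdog, a_won=True))
print("upset:        ", update_elo(favorite, underdog, a_won=False))
```

With these assumed settings, the favorite gains only about 2.9 points for the expected win, while an upset moves about 29.1 points from the favorite to the underdog.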

One example match-up from my Python script. I would say "B" is smiling more in this case.

With my 63 photos in hand, I wrote a small bit of Python code to display random match-ups and save the results. I gathered just over 1000 random match-ups of these 63 QBs. Here are the sorted Elo scores, as a function of win rate. You can see that players who "win" (smiling more) a larger fraction of their "games" (match-ups) generally have a higher Elo score. This makes good sense!

A subtle detail of Elo ranking is that the ranked result depends on the order of match-ups. Thus the player's rating and rank would be different even if each "game" played had the exact same outcome, simply due to the order the games were played.

To explore this, I took the 1000 QB trials and did 1000 re-samplings, randomly shuffling the match-up order, but keeping the outcome of each match the same. For each player I then calculated their median Elo score among all 1k resamples. This median defines their ranking in my analysis. This also produces a bootstrapping-like uncertainty for the player's Elo score. The average uncertainty in Elo score is +/- 10.5, while the average difference between adjacent Elo scores is 9.1. This means that the total ranking should be good to about 1 position for each player.
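The shuffle-and-replay idea can be sketched in a few lines. This is only an illustration under assumptions: the starting rating of 1500, the K-factor of 32, the use of a standard deviation as the "uncertainty", and the three-player toy match list are all made up here and are not the real QB data or settings.

```python
import random
import statistics


def run_elo(matches, players, k=32.0, start=1500.0):
    """Replay matches in the given order; each match is a (winner, loser) pair."""
    ratings = {p: start for p in players}
    for winner, loser in matches:
        expected_win = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        delta = k * (1.0 - expected_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    return ratings


def resample_elo(matches, players, n_resamples=1000, seed=0):
    """Shuffle the match order many times, keeping every outcome fixed.

    Returns {player: (median rating, standard deviation across shuffles)}.
    """
    rng = random.Random(seed)
    samples = {p: [] for p in players}
    shuffled = list(matches)
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        for player, rating in run_elo(shuffled, players).items():
            samples[player].append(rating)
    return {p: (statistics.median(v), statistics.pstdev(v)) for p, v in samples.items()}


# Toy match-ups (hypothetical): A usually beats B, B usually beats C.
players = ["A", "B", "C"]
matches = (
    [("A", "B")] * 6 + [("B", "A")] * 2
    + [("B", "C")] * 6 + [("C", "B")] * 2
    + [("A", "C")] * 7 + [("C", "A")] * 1
)

summary = resample_elo(matches, players, n_resamples=200)
for name, (median, spread) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{name}: median={median:.1f}  spread={spread:.1f}")
```

Comparing each player's spread to the gap between neighboring medians gives the same kind of "good to about N positions" estimate described above.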

The 2013 NFL QBs, Sorted by Smile

This is a terrifying and yet beautiful animation.

Watch it while focusing on their mouths, and it's amusing how they transition quickly from Pissed Off grimaces to Indifferent smirks, then to Pleasant smiles, and finally to Thrilled grins. Watch it while focusing on their eyes, which are closely aligned in the photos, and it will haunt your dreams. I am also fascinated by the incredible variety of their head shapes... I'll let you pick your own outliers in that field.

The grumpiest QBs in this dataset were: Matt Barkley (Eagles) and Matt Schaub (who played for the Texans in 2013, now for the Raiders).

The happiest QBs in this dataset were: RG3 (Washington) and Thad Lewis (in 2013 for the Bills, now for the Texans). My boy, Russell Wilson (Seattle) placed a respectable 3rd!

Smiles versus QB Records

The tl;dr is that no strong trends appear between smile rating and player statistics. This shouldn't surprise you, since as I said from the outset this idea is silly. In the next figure you can see some very low-signal "trends", but nothing is significant. In each panel the smile Elo score goes from left to right, with the happiest person on the right.
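One simple way to put a number on a "trend" like this is a rank correlation between smile score and a given stat. The sketch below computes Spearman's rho in plain Python; it is a swapped-in illustration, not the test behind the figure, and the smile scores and passer ratings in it are invented placeholder values.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            result[idx] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Placeholder values only, NOT the real 2013 data.
smile_elo = [1350.0, 1480.0, 1520.0, 1610.0, 1700.0, 1850.0]
passer_rating = [88.1, 75.3, 91.0, 80.2, 79.5, 95.4]
print(f"Spearman rho = {spearman(smile_elo, passer_rating):.2f}")
```

A rho near 0 means no monotonic relationship. With only 63 QBs, even a modest non-zero value can easily arise by chance, so it would need a significance test before being taken seriously.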

What is more interesting are the correlations between the various QB statistics. Here is the "scatter matrix" for these stats. The general trend is that if you're a good QB you're a lot better at almost all of these stats. There are also some bimodal or non-linear trends, which are probably due to starting QBs having an opportunity to get, for example, more yards per game (yardsG in this figure).

Rolling Out the Big Guns

I'll admit, I was a bit disappointed that no obvious trends showed up in the Smiles vs Stats plots. So in a last-ditch attempt to find some correlation I decided to employ a machine learning algorithm. The data were set up perfectly to dump straight into some pre-fab ML code, where the features were the QB stats and the quantity to predict was the smile (Elo) rating.

I'm not an ML expert, so I drew wisdom from an intro to ML that Josh Bloom provided at the AstroData Hack Week last fall. There are many types of ML algorithms, and I don't usually know how to decide between them. Paraphrasing his advice: for most problems, Random Forest is the best. (I used RF to do time series forecasting as part of the Hack Week in 2014!) So by borrowing liberally from the example Python code that Josh Bloom provided in his ML introduction, I trained an RF on these QB stats and Elo smile ratings.

Here's the resulting residual between the RF prediction and the Elo smile rating:

Note that with so few QBs I didn't split my sample into train vs test data (tsk tsk), so this is just the prediction based on the exact same data I trained on. The Elo scores span from 1300 to 1900, while the residuals span from -150 to 100. You can see we're missing a big component here. For clarity, let's just show the prediction versus the median score:

In blue is the 1-to-1 trend of Elo score versus itself. What is interesting is that, despite getting incorrect values for Elo rating, the RF does appear to predict a rough ordering of smiles from the stats alone! Of course, the accuracy is not great: it is only good to about 40%. Also interesting: the two highest rated (by me) smiles are definitely the highest in the predicted Elo score by the RF!

Another property of the RF is that it tells you how important each feature was in the resulting fit. Here I'm showing the feature importance for all 15 features used in the solution above.

The winner was Feature 14, labeled as "rate" in the scatter matrix, also known as passer rating.
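For readers who want to try this kind of fit, here is a minimal sketch using scikit-learn's RandomForestRegressor. It is not the code from the ML introduction mentioned above. The data are synthetic stand-ins (random numbers shaped like 63 QBs by 15 stats), and the target is deliberately built to depend on column 14 so the example has a known answer. The number of trees and the random seed are arbitrary choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the real table: 63 QBs x 15 stats (NOT the real data).
n_qbs, n_features = 63, 15
X = rng.normal(size=(n_qbs, n_features))

# Fake "smile Elo" that depends only on column 14, plus noise.
y = 1600.0 + 100.0 * X[:, 14] + rng.normal(scale=20.0, size=n_qbs)

forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X, y)

# In-sample prediction: same data as training, so these residuals are optimistic.
residuals = forest.predict(X) - y

# One importance value per feature; the values sum to 1.
importances = forest.feature_importances_
top = int(np.argmax(importances))

print(f"residual range: {residuals.min():.1f} to {residuals.max():.1f}")
print(f"most important feature: {top} (importance {importances[top]:.2f})")
```

Because the residuals here come from the training data, they understate the real error. With more players, holding out a test set or using cross-validation would give a fairer picture.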

So a possible conclusion here: Guys who have a better passer rating might smile more in their picture.

Perhaps if you have a rookie who is smiling a lot in their official photo, draft them up for fantasy football points!

Now for total honesty: I am deeply skeptical that these features (QB stats) have anything to do with the labels (smile score). There is a lot machine learning can tell us about football that is useful and interesting. This project is a reminder that machine learning might be able to find subtle and complex correlations, but it cannot guarantee meaning. I would, however, be very interested to see how the smile ratings compare to salary and endorsements...

- - -

If you'd like to check my work, I'll be putting some of the code and data for this project on GitHub, and I can provide more data on my Elo rankings if anyone is interested! If you are interested in helping me expand this project, or might know how to spin up an "A vs B as a Service" app on AWS or Heroku or something, drop me a line!