A melting pot of statistics, machine learning and data visualization

NBA basketball is one of the sports I enjoy watching the most. As I was ordering my (undisclosed amount)th beer while watching a game after work, it occurred to me how often I had seen sparsely populated arenas, with large areas of seats going unoccupied. This got me thinking about the average fan attendance for NBA teams, what factors could be influencing attendance, and ultimately, which NBA team has the most loyal fans.

After some online browsing, Python scraping and data cleansing, I was able to obtain a good amount of data from the awesome guys at basketball-reference.com. Unfortunately, I could not find any records of fan attendance before 1981, so this analysis is restricted to the period from 1981 to 2013 (with records for 2002-2006 also missing). First, I wanted to see if there were any trends in NBA fan attendance per season.

Fan attendance for each NBA team during the seasons 1981 to 2013. Years marked with a red asterisk represent seasons shortened by a lockout. Data for the years 2002 to 2006 were not available.

The two most striking features of the plot above are the obvious increase in fan attendance from 1981 to 1995, and the stagnation thereafter. The increase makes sense, since those years are widely regarded as the golden era and renaissance of basketball, full of rivalries and Hall of Fame players in their prime. Unsurprisingly, the years 1999 and 2012, which were both shortened by ~4 months due to a lockout, saw a drop in total fan attendance (purely as a result of fewer games being played; if I were more rigorous, I would normalize for this and also for the overall US population, but I wanted to visualize the raw numbers).
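For the curious, the per-game normalization mentioned above might look roughly like the sketch below. This is not the code behind the plot. It assumes the attendance figures are home-game totals and that seasons are labeled by the year in which they ended; the attendance numbers and the name `attendance_per_home_game` are made up for illustration. The game counts are the real ones: 82 games in a full season (41 at home), 50 in the first lockout season and 66 in the second.

```python
# Home games per team by season label. A full season has 82 games (41 at home);
# the two lockout seasons had 50 games (25 at home) and 66 games (33 at home).
# Assumption: seasons are labeled by the year in which they ended.
HOME_GAMES = {1998: 41, 1999: 25, 2011: 41, 2012: 33}

# Hypothetical season totals for a single team (made-up numbers, not real data).
total_attendance = {1998: 820_000, 1999: 500_000, 2011: 738_000, 2012: 594_000}


def attendance_per_home_game(totals, home_games):
    """Divide each season's total attendance by the number of home games played."""
    return {season: totals[season] / home_games[season] for season in totals}


per_game = attendance_per_home_game(total_attendance, HOME_GAMES)
for season in sorted(per_game):
    print(season, round(per_game[season]))
```

With these toy numbers, the drop in raw totals during the shortened seasons disappears once attendance is expressed per home game.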

Next, I investigated whether team success (the net number of wins per season) could be an indicator of fan attendance. Not surprisingly, teams that won more also attracted more fans (d'oh!). This was true regardless of the team's conference (East or West).

Fan attendance as a function of number of wins for all NBA teams during the period of 1981-2013

I also looked at whether fans were more attracted by teams that scored a lot, or by teams that put an emphasis on defense. However, I had to consider historical trends in scoring, and adjust for the fact that defenses and offenses have gotten more sophisticated over time. Therefore, I decided to look at the fan attendance numbers of each NBA team during a given season, and plot them as a function of the team's deviation from the median number of points scored by all teams during that season. The plot below shows the aggregate of all points after considering each individual season between 1981 and 2013. Interestingly, although teams that score more attract more fans, it seems that good defense is even more likely to attract crowds.

Fan attendance as a function of the number of points scored for and against the home team. To adjust for the variability in offensive/defensive points scored in each season, the attendance numbers are plotted against the home team's deviation from the season average.
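To make the per-season adjustment concrete, here is a minimal sketch of how a team's deviation from the season median could be computed. It is an illustration only, not the code used for the plot: the records are made-up toy numbers, and the function name `deviation_from_season_median` and the record layout are my own assumptions.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy records: (season, team, points scored per game).
# These are made-up numbers, not real NBA data.
records = [
    (1985, "A", 110.0),
    (1985, "B", 105.0),
    (1985, "C", 100.0),
    (2010, "A", 98.0),
    (2010, "B", 95.0),
    (2010, "C", 101.0),
]


def deviation_from_season_median(rows):
    """Return (season, team, points minus that season's median) for each row."""
    points_by_season = defaultdict(list)
    for season, _team, points in rows:
        points_by_season[season].append(points)

    # One median per season, so each team is compared only to its contemporaries.
    season_median = {s: median(vals) for s, vals in points_by_season.items()}

    return [(s, team, pts - season_median[s]) for s, team, pts in rows]


deviations = deviation_from_season_median(records)
for season, team, dev in deviations:
    print(season, team, f"{dev:+.1f}")
```

The same function could be applied to points allowed instead of points scored to get the defensive side of the comparison.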

Of course, the caveat of the above plot is that teams that score a lot and/or defend well are more likely to win, and thus attract more fans. Indeed, winning teams usually develop bandwagon fans, which inflates their attendance numbers. Therefore, I sought to find out which teams had the most loyal fans in the NBA. In my mind, the mark of a truly loyal fanbase is one that shows up to support its team regardless of win/loss ratio. For these reasons, I plotted the fan attendance of each NBA team normalized by its number of wins.

And so the most loyal fanbases are the good people of Memphis, Minnesota and Toronto!
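The loyalty measure above boils down to attendance divided by wins. Here is a rough sketch of that calculation, assuming totals aggregated per team (whether the normalization is done per season or over the whole period is an assumption on my part). The team names and numbers are hypothetical placeholders, and `attendance_per_win` is just an illustrative helper name.

```python
# Hypothetical toy totals per team (made-up numbers, not real NBA data).
teams = {
    "Team A": {"attendance": 700_000, "wins": 50},
    "Team B": {"attendance": 600_000, "wins": 20},
    "Team C": {"attendance": 650_000, "wins": 41},
}


def attendance_per_win(stats):
    """Return attendance divided by wins for each team, skipping winless teams."""
    return {
        name: s["attendance"] / s["wins"]
        for name, s in stats.items()
        if s["wins"] > 0  # avoid dividing by zero
    }


# Rank teams from most to fewest fans per win.
ranking = sorted(attendance_per_win(teams).items(), key=lambda kv: kv[1], reverse=True)
for name, fans_per_win in ranking:
    print(name, round(fans_per_win))
```

Note that this metric rewards teams that keep drawing crowds while losing: in the toy data, Team B has the lowest raw attendance but comes out on top once wins are taken into account.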

I will add all the relevant code to my GitHub account soon (basically as soon as I've commented it!)