Making Analytic Rain in Today’s NBA: The Stat Revolution in Pro Basketball

If you’ve taken a course in the social sciences, you are well aware of the quantitative approaches that now dominate these fields, from political science to economics to sociology. For several decades this emphasis on empiricism has reigned supreme. In some ways, this fetishization of quantitative methodologies seems like a lot of social scientists protesting too much: “Hey! We’re scientists too!” And we all know how prescient and skilled political scientists and economists are at predictions and policy… But we’ll leave that be for now. While these methodologies are now fundamental building blocks of social science, they have also begun to spread well beyond it, from approaches to governance to sports. Following in baseball’s footsteps, many NBA front offices are now utilizing a “moneyball” approach to building their rosters. This approach, first utilized in Major League Baseball by Oakland A’s General Manager Billy Beane and made famous by Michael Lewis’ book Moneyball, eschews many traditional metrics for evaluating player and team success. It is essentially an extension of the broader trend toward empiricism, and in sports circles it represents an analytics revolution.

The stat/analytic revolution in color

When the Memphis Grizzlies traded Rudy Gay, a bête noire of analytics adherents, this season, it was viewed as a sort of victory for the NBA “moneyball” crowd. See, the Grizzlies hired John Hollinger, one of the leading lights of the NBA analytics movement, as vice president of basketball operations in late 2012, giving him a major voice in roster decisions. Gay is what’s known as a “high-volume” scorer, that is, someone who scores a lot but requires an inordinate number of shots and overall time with the ball to do so. Grizzlies Coach Lionel Hollins, a former NBA player, was reportedly against the trade and valued Gay much more than Hollinger did. Yet after the trade the Grizzlies showed no obvious decline in offensive production, even though they replaced the younger and more talented Gay with Tayshaun Prince’s gaunt corpse. Despite his freakish athletic abilities and prototypical NBA wing stature, Gay is simply an inefficient NBA player.

Over the course of the last five years, NBA teams have increasingly employed an “analytics” framework to build their rosters. These metrics emphasize efficiency and in some ways have been transformed by the emergence of “stretch 4’s” like Dirk Nowitzki and players like LeBron James and Carmelo Anthony who are capable of playing power forward and spacing the floor with their shooting ability. High-volume scorers who pound the ball into the floor for 20 seconds and launch inefficient mid-range jumpers, even though they may score 20 points a game, are increasingly viewed as detriments to their respective teams. For example, look at how both Monta Ellis and Brandon Jennings were treated during free agency this offseason. They were on the market dramatically longer than a lot of other guys in the league who put up 17-20 points and 5-8 assists a game. Why? Because guys like this just aren’t valued as highly under the analytics paradigm. They aren’t efficient enough. Now, think about the Miami Heat this year. They essentially played a position-less system, with five guys rotating in unison and nearly every player on their roster, save the clownish “Birdman,” capable of hitting three-point shots, shooting at an efficient clip, and defending multiple positions. The game done changed.

Clowning Birdman?

One key paradigmatic shift in the league revolves around the types of shots players take. Kirk Goldsberry over at Grantland did a masterful job this season providing a host of interesting charts showing the subtle changes in the types of shots players are taking and the overall increase in three-point shooting across the league. Launching deep two-point jump shots is inefficient. If a player can take one or two steps back and instead shoot a three-pointer, then why not do that? So, teams with forward-looking management are increasingly loading rosters with guys who can hit threes (well, except for last year’s incredibly disappointing Lakers). Teams like the Knicks and the Rockets fired up an astonishing number of three-pointers this past year, and with other teams like the Heat and the up-and-coming, and unconscionably fun to watch, Golden State Warriors dominating the league, it seems like this trend will certainly continue. Shit, even the steely ol’ general, San Antonio Head Coach Gregg Popovich, revamped his roster to conform to this new development.

One of the new metrics used by adherents to the “analytics” framework, called Player Efficiency Rating (PER), attempts to capture a player’s total contributions in one statistical measure. Although not without its flaws, PER has become the leading metric used to evaluate a player. League average PER is adjusted to 15.0 every year, and the metric accounts for variables like team pace. To give a sense of the scale, the highest career average PER is 27.91, held by Michael Jordan. With that being said, I had the sense that perhaps PER was overblown. First, it doesn’t necessarily do an adequate job of measuring a player’s value. It does not include many important measures, particularly on the defensive end, that are invaluable to player and team success. So, using multivariate regression analysis, I looked at some key statistics (team PER, three-point percentage, three-point makes, and opponents’ points scored) and their relationship to team winning percentage. Ultimately, the data showed that guys who do math and this stuff for a living know a lot more about basketball than me.

A little bit of PER with the morning coffee

The table below provides descriptive statistics for each of the aforementioned variables. These descriptive statistics help give us a sense of dispersion throughout the data before performing the multivariate regression test.

I converted winning percentage from the decimal format used by NBA.com (for example, .805) to a more straightforward percentage (80.5). The mean winning percentage is in fact 50 percent, meaning that in the 2012-13 NBA season, on average teams won as many games as they lost. There is, of course, a wide variance in these numbers. The Miami Heat, the repeat champions, ended the season having won an impressive 80.5 percent of their games, and the lowly Orlando Magic finished with a 24.4 winning percentage.

As discussed below in reference to my regression analysis, the variance in the Team PER average is notably small. As player PER is adjusted to a 15.0 average each year, it should be no surprise that the mean Team PER is 15.0 as well. Once again, the Miami Heat, largely on the back of the uniquely efficient year that LeBron James had, led the league with a 17.5 Team PER. Interestingly, although the Washington Wizards had a winning percentage over 10 percentage points higher than the Orlando Magic, they had the lowest Team PER in the 2012-13 season, posting a 13.5. The Golden State Warriors led the league with an impressive 40.3 percent shooting from the three-point line, with the Miami Heat coming right behind at 39.6 percent. It is worth noting that both of these teams made deep playoff runs, with the Heat ultimately winning the championship. At the other end, the Minnesota Timberwolves posted the lowest three-point percentage. The next three teams at the bottom of the list, Orlando, Phoenix, and Charlotte, also posted three of the lowest winning percentages in the league. The New York Knicks led the league in three-point makes per game, with 10.9, with the Miami Heat in third. Every team in the top 10 in three-point makes, with the exception of the Portland Trail Blazers, made the playoffs this year.

The anomalous Grizzlies

Although the variance may seem relatively small in three-point makes, the difference between making 10.9 three-pointers per game and the league low of 4.7, as the Memphis Grizzlies posted this year, translates to more than 18 points a game. However, the Memphis Grizzlies proved to be an interesting anomaly, as they were able to make a deep playoff run even though they made the fewest three-point shots a game and were below the league Team PER average. The Grizzlies did lead the league in opponents’ points scored, allowing only 89.3 points a game, nearly 9 points better than the league average, 6 points better than the champions, and 15 points better than the league-worst Sacramento Kings. What the Grizzlies have, that few teams do, are two dominant big men, Marc Gasol (an analytics wet dream) and Zach Randolph, who control the paint and the glass on both ends of the floor. While they are never going to score 110 points a game, well, few teams will ever score 110 on them.
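The arithmetic behind that scoring gap is easy to check. Below is a minimal Python sketch; the function name is made up for illustration, and the only inputs are the two per-game figures quoted above.

```python
def three_point_scoring_gap(makes_high: float, makes_low: float) -> float:
    """Points per game separating two teams on made three-pointers alone."""
    points_per_three = 3
    return round((makes_high - makes_low) * points_per_three, 1)


# League-high Knicks (10.9 makes a game) vs. league-low Grizzlies (4.7).
gap = three_point_scoring_gap(10.9, 4.7)
print(gap)  # 18.6
```

That is 6.2 extra made threes a night, worth 18.6 points, before a single two-pointer or free throw is counted.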

Variable                    Coefficient (p-value)
Constant                    20.85
Team PER                    9.9 (p-value=.000)
3 Pt. Percentage            .988 (p-value=.118)
3 Pt. Makes                 .455 (p-value=.619)
Opponents Points Scored     -1.62 (p-value=.000)
Adjusted R Square           .897
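Read as an equation, the coefficients above give a simple formula for predicted winning percentage. Here is a minimal Python sketch of that formula. The function name, the units (winning percentage and three-point percentage on a 0-100 scale), and the sample team are assumptions made for illustration; the sample inputs are hypothetical and are not taken from the data set.

```python
# Coefficients as reported in the regression table.
COEFFICIENTS = {
    "constant": 20.85,
    "team_per": 9.9,
    "three_pt_pct": 0.988,
    "three_pt_makes": 0.455,
    "opp_points": -1.62,
}


def predicted_win_pct(team_per, three_pt_pct, three_pt_makes, opp_points):
    """Predicted winning percentage (0-100 scale) from the fitted equation.

    Assumes three_pt_pct is given in percent (e.g. 36.0, not 0.36).
    """
    c = COEFFICIENTS
    return (
        c["constant"]
        + c["team_per"] * team_per
        + c["three_pt_pct"] * three_pt_pct
        + c["three_pt_makes"] * three_pt_makes
        + c["opp_points"] * opp_points
    )


# HYPOTHETICAL team, invented for illustration: league-average PER,
# 36% from three, 7 made threes a game, 98 points allowed a game.
baseline = predicted_win_pct(15.0, 36.0, 7.0, 98.0)

# Same hypothetical team with one extra point of Team PER.
improved = predicted_win_pct(16.0, 36.0, 7.0, 98.0)

print(round(baseline, 1))             # 49.3
print(round(improved - baseline, 1))  # 9.9
```

The hypothetical team lands at roughly a .500 record, and bumping its Team PER by one point moves the prediction by exactly the Team PER coefficient, which is all a regression coefficient means: the change in the outcome for a one-unit change in that variable with everything else held fixed.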

As the adjusted R-square for my regression model demonstrates, this model can account for approximately 90% of the variance in winning percentage for NBA teams in the 2012-13 season. For those of you who are unfamiliar with some of the jargon here, an adjusted R-square essentially tells us how much of the variation in the dependent variable is explained by the independent variables taken together, with a small penalty for each extra variable thrown into the model. In this regression test, winning percentage is the dependent variable, with Team PER, three-point percentage and makes, and opponents’ points scored serving as independent variables. So, this adjusted R-square essentially tells us that about 90% of the variation in team winning percentage can be explained by the four independent variables. In other words, although I was ultimately wrong about the importance of Team PER, the model I produced here does account for an overwhelming share of the differences in team winning percentage.
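For the curious, the “adjusted” part is just a correction for sample size and the number of independent variables. The sketch below uses the standard textbook formula. It assumes the regression was run on all 30 NBA teams with the four independent variables listed above; the team count is an assumption, since the sample size is not stated here.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Standard adjustment: penalize R-square for each extra predictor."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def r_squared_from_adjusted(adjusted: float, n_obs: int, n_predictors: int) -> float:
    """Invert the adjustment to recover the unadjusted R-square."""
    return 1 - (1 - adjusted) * (n_obs - n_predictors - 1) / (n_obs - 1)


N_TEAMS = 30       # ASSUMPTION: one observation per NBA team
N_PREDICTORS = 4   # Team PER, 3 Pt. Percentage, 3 Pt. Makes, Opponents Points

raw_r_squared = r_squared_from_adjusted(0.897, N_TEAMS, N_PREDICTORS)
print(round(raw_r_squared, 3))  # 0.911
```

Under those assumptions the unadjusted R-square would be about .911, so the penalty for carrying four variables with only 30 teams is small but not zero.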

Don’t ever doubt my team’s PER!

In the face of my hypothesis that Team PER could not adequately account for variance in winning percentage, the variable proved to be highly statistically significant, with a p-value of nearly .000. Ok, again, what the hell is a p-value? The p-value is the probability of seeing a relationship at least as strong as the one in the data if the null hypothesis were actually true. So, the chief null hypothesis of this regression test, which happens to line up with my own hunch in this particular case (sorry if I’m losing you here), is as follows: “There is no relationship between winning percentage and team PER.” The cutoff for statistical significance is conventionally, and somewhat arbitrarily, set at .05, which means we reject the null hypothesis only when a result this strong would turn up less than 5% of the time by chance alone. So, if the p-value on this test had been higher than .05, I could not have ruled out that the apparent link between team PER and winning percentage was just noise. Again, with a p-value of .000, I was profoundly wrong in my initial hypothesis.
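One way to get a feel for what a p-value measures is a permutation test: scramble the link between the two variables many times and count how often pure chance produces a correlation as strong as the real one. The Python sketch below does this with SYNTHETIC numbers generated on the spot, not the actual 2012-13 team data. It is also a different technique from the regression p-values reported in the table, so treat it as an illustration of the idea only. All names are made up for the example.

```python
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Share of random shuffles with a correlation at least as strong as observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_shuffles + 1)


# SYNTHETIC data: 30 fake teams whose win percentage tracks a fake PER, plus noise.
noise = random.Random(42)
fake_per = [13.5 + 4.0 * i / 29 for i in range(30)]
fake_win_pct = [50 + 10 * (per - 15.0) + noise.gauss(0, 5) for per in fake_per]

p_value = permutation_p_value(fake_per, fake_win_pct)
print(p_value < 0.05)  # True
```

When the fake win percentages really do depend on the fake PER, almost no shuffle matches the observed correlation, so the p-value is tiny. That is the sense in which a small p-value counts as evidence against “there is no relationship.”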

For every one-unit increase in team PER, winning percentage rises by roughly 9.9 percentage points, holding the other variables constant. In other words, if a team’s average PER increases by one full point, it should win about 8 additional games over an 82-game season. Opponents’ points scored also proved to be highly statistically significant, with a p-value approximating .000. For every additional point allowed per game, winning percentage goes down by about 1.6 percentage points. Neither three-point percentage nor three-point makes proved to be statistically significant, with p-values of .118 and .619 respectively. Although these two variables are part of the model that produced the high adjusted R-square, on their own they did not add statistically significant explanatory power once team PER and opponents’ points scored were accounted for.
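Converting those coefficients into games is a one-line calculation. A quick Python sketch, assuming the standard 82-game regular season (the function name is made up for illustration):

```python
GAMES_PER_SEASON = 82  # standard NBA regular season


def pct_points_to_games(pct_points: float, games: int = GAMES_PER_SEASON) -> float:
    """Convert a change in winning percentage (0-100 scale) into wins."""
    return pct_points / 100 * games


per_effect = pct_points_to_games(9.9)        # one extra point of Team PER
defense_effect = pct_points_to_games(-1.62)  # one extra point allowed per game

print(round(per_effect, 1))      # 8.1
print(round(defense_effect, 1))  # -1.3
```

So one point of Team PER is worth about eight wins, and every extra point allowed per game costs a little more than one.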

The four principal conclusions of this analysis are as follows: 1) There is a strong positive relationship between team winning percentage and team PER; 2) There is no statistically significant relationship between team winning percentage and three-point percentage once the other variables are controlled for; 3) There is no statistically significant relationship between team winning percentage and three-point makes once the other variables are controlled for; and 4) There is a strong negative relationship between team winning percentage and opponents’ points scored.

The implications of this analysis for NBA front offices are clear. Efficiency, captured in statistics like PER that account for not just a player’s raw numbers, but their total value added to the team, should be the most prized element in constructing a roster. The Memphis Grizzlies, who were below the league average in team PER, proved to be a statistical outlier, as shown in the chart above. And teams like the Chicago Bulls (team PER of 14.17) and Indiana Pacers (team PER of 14.77) that hovered near league average in team PER also had top flight defenses, as the Bulls were third in opponents points scored and the Pacers were second.

Ultimately, we can expect to see a transformation in the league as teams evolve to focus on spacing, three-point shooting, and players who are capable of guarding multiple positions: essentially the 2012-13 Miami Heat. Teams are increasingly looking for what Bill Simmons calls DTA (defense, threes, athleticism). Look at the roster the Lakers put together. It had little to no spacing, little to no shooting, and old guys who couldn’t guard quick athletic wings. As an NBA fanatic, I find it fun to watch the game evolve and to see the ways that computer modeling, analytics, and smart, innovative people are working to build efficient NBA rosters. This trend augurs well for the fans, as the game increasingly moves away from the brutal defense-first orientation we saw in the ’90s to the electric, three-point-firing, increasingly position-less games we see today.