Old Championships, Modern Players

Introduction

A few weeks ago, I saw a post on the NBA subreddit putting the 2003 San Antonio Spurs’ championship into perspective by comparing its roster to current players. While the original author’s logic is far too simple and flawed, the concept is worth a look. I chose to expand on this idea by finding similar players for the ’03 Spurs, ’11 Mavericks, and ’08 Celtics. These Spurs and Mavericks teams are commonly cited as cases of a team having just one incredible superstar (Tim Duncan and Dirk Nowitzki, respectively) and a collection of subpar players. The ’08 Celtics are a more down-to-earth example: one of the first super teams, albeit an aging one.

So were Tim’s and Dirk’s supporting casts really that bad? My initial answer is mostly no. Basketball is and always will be a team sport. However, as detailed briefly by Malcolm Gladwell in this episode of Revisionist History, a transcendent basketball player is possibly the determining factor in how great his team is.

Method

Warning: Math ahead. Feel free to skip this section and go straight to the results.

To find the most similar players, I used basketball-reference.com’s player statistics. The features include all per-game, per-36-minutes, and advanced statistics such as offensive rating. I did not include the shooting statistics found on individual player pages because I don’t know how to scrape them all yet. If you know how or have access to them, I want to know! You can learn more about each statistic in the site’s glossary.

Next, the player pool starts with the 1980-81 season and ends with the 2016-17 season. Where the player pool begins is mostly irrelevant because I only pull comparisons from the 2010-11 season onward. Current players are defined as players active during or after the 2010-11 season. I figured this would provide more recognizable names. Anything older would be inaccurate because of how differently basketball is played now versus then. I also wanted to keep the Reddit post’s idea of using modern players for comparison. This is really just for fun.

Players are then subsetted to those who played more than 26 games and at least 450 minutes in the season. 27 games is about a third of the season, and the 450-minute cutoff was chosen to include Speedy Claxton, a significant player in the Spurs’ playoff lineup.
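For the curious, the subsetting step can be sketched in pandas roughly like this. It is a minimal illustration, not my actual script: the column names "G" (games played) and "MP" (total minutes) and the sample rows are assumptions made up for the example.

```python
import pandas as pd

def filter_players(stats: pd.DataFrame,
                   min_games: int = 27,
                   min_minutes: int = 450) -> pd.DataFrame:
    """Keep player-seasons with enough games and total minutes.

    "G" and "MP" are assumed column names, not necessarily the
    ones used in the real data set.
    """
    # More than 26 games is the same as at least 27 games.
    mask = (stats["G"] >= min_games) & (stats["MP"] >= min_minutes)
    return stats.loc[mask].reset_index(drop=True)

# Made-up rows, for illustration only.
sample = pd.DataFrame({
    "Player": ["A", "B", "C"],
    "G": [30, 26, 82],
    "MP": [470, 900, 2800],
})
filtered = filter_players(sample)
```

Player B is dropped here despite plenty of minutes because 26 games falls just short of the games cutoff.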

In order to treat all the features equally, I used scikit-learn’s StandardScaler to standardize each feature so that points and blocks, for example, carry equal weight. Otherwise, points, which naturally run to higher numbers, would carry more weight than steals or blocks in my comparison method.
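Here is a small sketch of what the scaling does, using made-up numbers rather than real player data. StandardScaler subtracts each column’s mean and divides by its standard deviation, so every feature ends up centered at 0 with a spread of 1.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up per-game numbers; columns are points, steals, blocks.
raw = np.array([
    [25.0, 1.0, 2.5],
    [10.0, 2.0, 0.5],
    [4.0, 0.5, 0.2],
])

scaler = StandardScaler()
scaled = scaler.fit_transform(raw)

# After scaling, every column has mean 0 and standard deviation 1,
# so a gap in points no longer dwarfs a gap in steals or blocks.
col_means = scaled.mean(axis=0)
col_stds = scaled.std(axis=0)
```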

Finally, the most fun part is the comparison method. I implemented a KDTree to calculate the Euclidean distance between data points. While we can Pythagorean theorem the hell out of 2D or even 3D features, 25 dimensions is a bit harder for me to comprehend. So we can employ linear algebra and more maths to see which points are closest to one another. The closer the point, the more similar one player is to another, based purely on statistics. I chose to exclude the distance values from my results for display simplicity. If you are interested in how similar each player is, feel free to message me. I ignored player position when querying results because of how differently basketball is played now.
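A rough sketch of the nearest-neighbor query, assuming scikit-learn’s KDTree (my real code may differ in the details). The helper name most_similar and the toy 2D data are hypothetical; the real features have about 25 columns. Euclidean distance is just the square root of the summed squared differences across every feature, which is the Pythagorean theorem stretched to more dimensions.

```python
import numpy as np
from sklearn.neighbors import KDTree
from sklearn.preprocessing import StandardScaler

def most_similar(pool: np.ndarray, target: np.ndarray, k: int = 5):
    """Return indices and distances of the k pool rows closest to target.

    Hypothetical helper: the scaler is fit on the pool and reused
    for the target, which is one reasonable choice among several.
    """
    scaler = StandardScaler().fit(pool)
    tree = KDTree(scaler.transform(pool), metric="euclidean")
    dist, idx = tree.query(scaler.transform(target.reshape(1, -1)), k=k)
    return idx[0], dist[0]

# Toy pool of four "modern players" with two made-up features each.
pool = np.array([
    [0.0, 0.0],
    [1.0, 1.0],
    [10.0, 10.0],
    [5.0, 5.0],
])
target = np.array([1.2, 0.9])  # the "old" player to match

idx, dist = most_similar(pool, target, k=2)
```

The results come back sorted from closest to farthest, which is why players higher up in a results list are the better matches.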

Results

Remember that the players higher up on the list are more similar to the projected target player. I highlighted the two most similar players in their respective team colors. I also included the season and team of the player queried. I believe those are important things to remember when considering the exact player. Rookie Steph Curry and MVP Steph Curry are very different players.

2011 Dallas Mavericks

Surrounding Dirk Nowitzki:

I’m just going to say it. That doesn’t look like a championship team to me.

Discussion

When looking at these player comparisons, Dirk’s run is easily the most impressive championship run in the modern era. The 3rd-seeded Mavericks were a laughingstock at the time. Even the 6th-seeded Blazers were jokingly favored over them. The Mavs then swept the defending champion Lakers in the semifinals, beat a young, up-and-coming Thunder team in the conference finals, and beat the first-year Heatles in the finals. We called Dirk soft. We laughed at his teams. We laughed when Cuban refused to pay a future 2-time MVP, but the lovable German and his glorious blonde locks proved he is one of the greatest.

Tim Duncan’s unsupported run in 2003, while incredible due to his individual feats, looks a bit overstated. Young Manu was good. Young Tony Parker, albeit with questionable decision making, was good. Bruce Bowen was good. Seeing how David Robinson compares to old Tim Duncan also means he was still really good at basketball. Perhaps we remember Tim’s run as running with a bunch of scrubs because of just how incredible he was that year, how amazing Coach Pop is, and because that was the beginning of the big 3 coming together.

Moving Forward

The next step would be to gather shooting statistics such as volume of 2-pointers, shot distances, and other percentages to get a better idea of player similarity. While this only addresses how players are similar offensively, I believe these are especially important statistics because they show a lot about player positioning and where each player is effective. A high-scoring Anthony Davis does not play in the same areas as a high-scoring Kevin Love or Hakeem.

If I could gather defensive statistics and create metrics for defense, it would make this query much, much better. Unfortunately, I do not have access to player tracking data. I will be writing a post soon about the things we cannot measure that have significant effects on defense.

Caveats

Defensive statistics are incomplete. I am aware we cannot reasonably compare players this way. This was simply a fun exercise putting Tim’s and Dirk’s championship runs into perspective.

It’s difficult to find players to compare to great players. We define great players as great because they are outliers. Trying to find things to compare to outliers is in itself a bit silly. This is why I chose to leave out other championship teams and the two main stars, and why I think the Boston Celtics comparison table is probably garbage. I would not put any weight on it myself. It’s just there because it’s fun to see.

I am aware of the inconsistencies in table appearances, and I got really tired of them at this point. Excel is no fun to use!

I am a data practitioner who loves making simple, powerful visualizations and working with machine learning algorithms to learn more about the world.
Basketball is incredibly dynamic, and it only takes one player to change an entire game plan. Data has stories hidden within it, and I want to reveal why my favorite team is better than yours is. And why Ben Simmons is amazing.