Thursday, August 7, 2008

Note for those viewing this post in a feedreader: today's Arbitrarian is very graphics-heavy, and the images won't show up in the feed version of the post. If you want the full experience, I suggest clicking through to see the original at HP.

Last week, we looked at player productivity, as based on box score output. Today, we're going to look at value, which, as you will see, is somewhat different than productivity.

The MVP is not necessarily the best player in the league, nor the most efficient. Many times, the MVP award goes to the player widely considered to be the best player on one of the better teams, but when it comes to choosing among the best players on several of the best teams, there appears to be no hard-and-fast rule, and subjectivity enters into play. Today, I will propose that value ought to be quantified in terms of individual contributions to team success, where success is measured in wins.

If we can estimate the number of wins for which each player is responsible, we can do away with the arbitrary focus on only the best few teams. It is theoretically possible, for example, for the most valuable player to be an absolutely dominant but lonely contributor on a middling team, while the better teams each have enough decent players that no single one can be credited with a large portion of their success. We are still, however, left with the problem of objectively measuring each player's contribution to team wins. To do this, I'd first like to explore...

A non-basketball thought experiment

Imagine a lemonade stand owned and staffed by Xavier, Yvette, and Zach. They make money by selling home-brewed lemonade at the end of their cul-de-sac, and only one of them staffs the stand at any given time. After their first month in business, they look at their lemonade sales revenue, and try to figure out which salesperson deserves what part of the income. One option would be to split the revenue into thirds--three employees, three parts. Zach claims that such a distribution is unfair because he worked about half of the total number of hours, while Yvette and Xavier worked about a quarter of the hours each. He claims that the distribution should thus be more like (1/4, 1/4, 1/2).

Xavier points out, however, that if they are trying to assess each employee's value, they should try to find a more specific measure of actual revenue generated by each seller. He suggests that, since revenue is generated by lemonade sales, revenue generation should be measured in terms of the number of lemonades sold by each employee. Since they kept detailed records of such numbers, this is easy to calculate: Xavier sold 2/5 of all glasses, Yvette 1/2, and Zach just 1/10. Zach is disappointed that his pay-per-hour gambit was foiled, but must concede that this arrangement is more just--Yvette and Xavier are much better salespersons, and did more to help the company make money, while Zach mostly daydreamed during his hours on the job.
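The two proposals are the same proportional split applied to different measures. A minimal sketch, assuming a made-up revenue figure of 120 (the story gives only the fractions) and a hypothetical helper named split_revenue:

```python
from fractions import Fraction


def split_revenue(revenue, shares):
    """Split revenue in proportion to each seller's share of some measure."""
    total = sum(shares.values())
    return {name: revenue * share / total for name, share in shares.items()}


# Hypothetical monthly revenue; the story does not give a dollar figure.
revenue = 120

# Zach's proposal: split by share of hours worked.
by_hours = split_revenue(
    revenue,
    {"Xavier": Fraction(1, 4), "Yvette": Fraction(1, 4), "Zach": Fraction(1, 2)},
)

# Xavier's proposal: split by share of glasses sold.
by_glasses = split_revenue(
    revenue,
    {"Xavier": Fraction(2, 5), "Yvette": Fraction(1, 2), "Zach": Fraction(1, 10)},
)

for name in by_hours:
    print(f"{name}: by hours {float(by_hours[name]):.2f}, "
          f"by glasses {float(by_glasses[name]):.2f}")
```

Under these assumed numbers, Zach's cut falls from 60 to 12 once the split follows sales instead of hours, which is the point of the story.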

Back to basketball

What I have in mind is the application of a similar methodology to basketball. We have an excellent estimator of aggregate value--team wins; and credit for these wins can be apportioned to the players who work for those wins. There are a plethora of ways this could be done--we could arbitrarily estimate credit for each player on each team: The superstar might get 50% of the credit, the rest of the starters get 10% of the credit each, while the remainder is split amongst the bench players. Perhaps we could look at minutes played--after all, ceteris paribus, removing a mediocre player, and replacing him with a better player for the same number of minutes, should result in a greater number of team wins. Similarly, increasing the number of minutes played by a good player (to a point) should increase wins, while increasing minutes played by a bad player should lead to fewer wins.

This method isn't foolproof, however: certain high-minute players might be daydreamer-types like Zach in the example above, while others might be feverishly productive. Consider, for example, Matt Carroll versus Yao Ming in 2007-08. Both played roughly the same number of minutes (2016 and 2044), but Yao was substantially more productive (by almost any measure) than was Carroll in that amount of time. Estimates of their value should reflect this difference.

Instead of minutes, I have chosen to use Model-Estimated Value (or MEV, discussed here) as an estimate of player productivity. There are several advantages to this choice, but two stand out. First, as discussed previously, MEV is a good estimator of per-game productivity, and so is more helpful to us than looking at, say, games played, minutes played, or points scored alone.

The second advantage comes from the fact that MEV does not perfectly capture player value. If it did, then team-level MEV would correlate perfectly with team wins, and we would not need separate measures for productivity and value. Rather, since some aspects of player value are omitted from the box score--things like defense, effort, intensity, etc--we may scale our MEV productivity estimates by team success, which does implicitly measure all of each player's contributions.

Bruce Bowen, for example, had a 07-08 per-game MEV of 5.60, which put him below, among others, Wally Szczerbiak. Many would cite this as an example of the failings of MEV--its inability to fully measure defense (not to mention a lack of adjustment for playing time and pace) leads to an undervaluation of players like Bowen. However, Bowen's defense does show up in the Spurs' success--no small part of their winning can be attributed to his contributions. Similarly for Szczerbiak--his contributions are reflected in the success had by Cleveland and Seattle--that is, relatively little success. Thus, by crediting players for team success, using MEV as our measure of productivity, we may get closer to measuring each player's actual value.

This method is still not perfect. We might still be undervaluing Bowen's relative contribution to the Spurs, and overvaluing Szczerbiak's contribution to his teams. However, given two players with identical MEV numbers on teams with otherwise identical rosters, the player whose MEV is "worth more" will help his team win more games.

Calculation and results

The measurement of each player's value to their team is straightforward. Merely take each player's season total MEV for a given team, and divide it by that team's season total MEV. This gives us a metric I call Percent Valuable Contributions, or PVC. For Kevin Garnett in 07-08, this calculation takes his season total MEV (1,534.19) and divides it by that of the Celtics as a whole (8,282.62), resulting in a percentage (expressed as a decimal): 0.185. This means that Garnett is responsible for 18.5% of the Celtics' success, which is a rather large portion, indeed. From here, estimation of value is very easy. Simply take this PVC number, and multiply it by team wins. This gives you each player's BoxScores (BXS), their individual contribution to team success. The first tab on the table below depicts the numbers that go into the BoxScores calculation for each player in the 2007-08 season.
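As a rough sketch of the arithmetic just described, using the Garnett figures quoted above and Boston's 66 wins; the function names pvc and bxs are illustrative only, not taken from any published code:

```python
def pvc(player_mev, team_mev):
    """Percent Valuable Contributions: a player's share of team MEV."""
    return player_mev / team_mev


def bxs(player_mev, team_mev, team_wins):
    """BoxScores: the player's PVC multiplied by team wins."""
    return pvc(player_mev, team_mev) * team_wins


# Kevin Garnett, 2007-08: season MEV, Celtics team MEV, Celtics wins.
garnett_pvc = pvc(1534.19, 8282.62)
garnett_bxs = bxs(1534.19, 8282.62, 66)

print(f"PVC: {garnett_pvc:.3f}")  # about 0.185
print(f"BXS: {garnett_bxs:.2f}")  # about 12.23
```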

I've sorted each player by their PVC for the sake of comparison. In terms of value to their team, the top three players are LeBron James, Chris Paul, and Al Jefferson. All three are good players, but certainly Jefferson is a step below the other two. Since BoxScores accounts for team success, we can clearly see Jefferson's actual value is much less than that of James and Paul--he might be the most valuable player on the Timberwolves roster, but such is not a high distinction.

For more insight, note the series of players whose PVC comes in at around 0.185: Steve Nash, Joe Johnson, Kevin Garnett, Richard Jefferson, and Carlos Boozer. Even the casual fan knows that these players are not all equally valuable, though they may be equally valuable to their respective teams. Note that Nash and Boozer generated a much higher MEV total than the other three--this is largely because, as the table also shows, Phoenix and Utah had much greater team MEV totals, thanks to a faster-paced playing style. Team wins complete the picture--Richard Jefferson was responsible for 18.5% of his team's 34 wins. His value is thus estimated at 6.29 BXS. Kevin Garnett was responsible for 18.5% of his team's 66 wins. His value is thus 12.23 BXS. As you can see, by accurately measuring productivity (MEV), and accounting for team success (wins), we are able to objectively assess each player's value (BXS).

An aside into rated productivity

Another useful measure, especially for comparing players on poor teams, or those who played limited minutes, is what I call the Valuable Contributions Ratio (VCR). This is a pace- and playing time- adjusted metric of productivity assessed at the per-minute level. As above, this calculation is straightforward and intuitive. Merely take each player's PVC (MEV/team MEV) and divide it by each player's percent of team minutes played (min/team min). Thus, we are dividing a percentage by another percentage (which is why I call it a ratio--units are somewhat meaningless). This statistic controls for team pace and playing time, and is independent of team quality--it captures productivity relative to the time allowed for production.
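A minimal sketch of the VCR calculation as defined above. The inputs are hypothetical round numbers chosen for illustration, not any real player's totals, and the function name vcr is an assumption:

```python
def vcr(player_mev, team_mev, player_min, team_min):
    """Valuable Contributions Ratio: share of team MEV over share of team minutes."""
    mev_share = player_mev / team_mev      # this is PVC
    minute_share = player_min / team_min   # percent of team minutes played
    return mev_share / minute_share


# Hypothetical part-time player: 6.8% of team MEV in 5% of team minutes.
example_vcr = vcr(player_mev=680.0, team_mev=10000.0,
                  player_min=1000.0, team_min=20000.0)

print(f"VCR: {example_vcr:.2f}")
```

A value above 1 means the player produced a larger share of team MEV than his share of team minutes; a value below 1 means the opposite.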

This is useful for comparing bench players, players who miss a substantial number of games, and rookies. Bench players get a "fair shake" by this statistic, because they often have less time on the floor in which to accumulate MEV toward a larger cumulative share of team success. Same for injured players--Andrew Bynum did not play very many games for the Lakers in 07-08, and as such was less valuable in terms of team wins. However, when he did play, he produced very efficiently, with a VCR of 1.36. (This means that he was responsible for 1.36% of his team's production for every 1% of team minutes played--which is very efficient.) VCR is useful for comparing rookies, as well, since they often play relatively few minutes, and since their teams often win very few games. Rookies with high BXS are the most impressive, but more often than not, rookies don't produce many wins. Rather, they may produce MEV efficiently, and we can see this in VCR. Among rookies with substantial playing time in 07-08, Carl Landry produced the most efficiently, with a very respectable VCR of 1.39. (The Arbitrary Rookie of the Year, Kevin Durant, was 7th among rookies by VCR, and 5th on the rookie BXS list after Scola, Horford, Moon, and Thaddeus Young. He did, however, lead all rookies in points per game. WoW Club!)

BXS MVPs and all-time greats

Look again at the table above, but this time, select the second sheet, titled "07-08 BXS." This lists each player from the 07-08 season, on each team for which he played, and includes measures of productivity (PPG, and MEV/gp), efficiency (VCR), and value in terms of BXS. As you can see, the obvious most valuable player in 2007-08 was Chris Paul, who was responsible for over a quarter of his team's 56 wins, for a BXS of 15.41. Kobe Bryant, this year's Arbitrary MVP, had a good showing as well, but was responsible for almost three fewer wins than was Paul. Kevin Garnett, who was thought to be a more subjective favorite for MVP, acquitted himself nicely in objective terms, by generating 12.23 wins even while missing eleven games.

For a more historical perspective, see the third tab, "BXS Seasons." This lists the same information, but for the 500 most valuable seasons from the population of every professional basketball season since the beginning of the NBA, even including the ABA. Unsurprisingly, Chamberlain tops the list, although his most valuable season was not his most productive in MEV terms. Shaquille O'Neal's dominating performance for the incredible 67-win 99-00 Lakers is the most valuable season in recent memory, followed closely by Jordan's post-first-retirement 72-win season in Chicago. Perhaps surprising, but perhaps not to those who have always appreciated the Big Ticket, is Kevin Garnett's extremely high value in 03-04. Always a valuable player to his team, Garnett and the Timberwolves finally put it together for one great year, and Garnett's relative value (PVC) translated to absolute value (BXS).

The final tab, "BXS Careers," accumulates the performance of 500 NBA greats. The table is sorted by BXS82, which is the number of wins each player would be expected to produce in 82 games played, given his career performance. There may be a few surprises, but they are instructive: The first is Alex Groza, who was a great player in the early years of professional basketball, but whose career was cut short. The second surprise might be the ordering of Michael Jordan, relative to Magic Johnson and Tim Duncan. Many fans and observers would identify Jordan as one of the most valuable players ever, if not the most valuable, and here he ranks sixth at a per-game level. The first thing to note is that Duncan has always been more valuable than his box score statistics might indicate, and this is reflected in his BXS measure. Secondly, Duncan has not yet seen his productivity or value decline substantially due to aging. For the most part, Duncan's 13.73 BXS82 average comes from the peak of his career. Johnson retired after a relatively short career, and his comeback in 1996 was brief. Jordan, by comparison, had a second comeback for Washington during which he played 142 games of much less valuable basketball. If you exclude Jordan's Washington years, his career BXS82 becomes 14.62, which puts him solidly above the other two.
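The exact BXS82 formula is not spelled out here. One plausible reading, assumed in the sketch below, is career BXS divided by career games played, scaled to 82 games. The season lines are invented to show how a few low-value late seasons pull the career rate down; they are not any real player's numbers.

```python
def bxs82(seasons):
    """Career BXS per 82 games, assuming BXS82 = total BXS / total games * 82.

    seasons is a list of (bxs, games_played) tuples.
    """
    total_bxs = sum(b for b, _ in seasons)
    total_games = sum(g for _, g in seasons)
    return total_bxs / total_games * 82


# Hypothetical career: two peak seasons, then one much less valuable season.
career = [(14.0, 82), (15.0, 80), (5.0, 60)]

full_rate = bxs82(career)
peak_rate = bxs82(career[:-1])  # same career with the last season excluded

print(f"Full career BXS82: {full_rate:.2f}")
print(f"Excluding last season: {peak_rate:.2f}")
```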

Visualizing value

Now for the first graphical visualizations in the life of this young column. Since BXS is derived by multiplying player contributions (PVC) by team success (team wins), we can envision BXS itself as the area of a rectangle with sides of PVC and Wins. This lends itself to graphical expression, with the league as a rectangle, 41*30 = 1230 wins wide, and 100% high. Partitions may be made on the horizontal axis for each team, scaling each section by that team's number of victories. Within each team's segment, further divisions can be made for each player, according to their contribution to team success. The best explanation is an example, displayed below:

You may use the controls at the top left to zoom in and pan across the graphic to see more detail, or an expanded overview, as you wish. The "Fullscreen Version" link directs you to a much larger version of the same display, which may be easier to grok. The graphic above displays BXS for the 2007-08 season, with team success increasing from left to right, and player contributions increasing from bottom to top. Colors reflect each player's statistically derived playing style, where red indicates a propensity for scoring, green denotes perimeter play, and blue highlights interior-play tendencies. Much more will be said about this classification scheme next week.
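The layout described above can be sketched as a small function that turns team wins and player PVC values into rectangles. This is only an illustration of the geometry, not the code behind the actual graphic; the function name layout, the tuple format, and the two toy teams are all assumptions.

```python
def layout(teams):
    """Return (team, player, x, y, width, height) rectangles.

    teams maps team name -> (wins, {player: pvc}).
    Width is team wins and height is PVC, so each area equals BXS.
    """
    rects = []
    x = 0.0
    # Teams run left to right in order of increasing wins.
    for team, (wins, players) in sorted(teams.items(), key=lambda kv: kv[1][0]):
        y = 0.0
        # Players stack bottom to top in order of increasing PVC.
        for player, share in sorted(players.items(), key=lambda kv: kv[1]):
            rects.append((team, player, x, y, float(wins), share))
            y += share
        x += wins
    return rects


# Two hypothetical teams with two players each.
toy_league = {
    "A": (20, {"a1": 0.6, "a2": 0.4}),
    "B": (50, {"b1": 0.25, "b2": 0.75}),
}

rects = layout(toy_league)
for team, player, x, y, w, h in rects:
    print(f"{team}/{player}: x={x}, y={y}, width={w}, height={h}, area={w * h:.1f}")
```

Because every team's PVC values sum to 1, the areas within a team sum to its win total, and the whole picture sums to total league wins.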

Several things I'd like to point out in the graphic above to get you started: First, note how tall the rectangles of Chris Paul, LeBron James and Al Jefferson are--this is because height is scaled according to PVC, and these three players were the most responsible for their teams' success.

Looking across the top row of the graphic, we can identify each team's Most Valuable Player. For the lowly Heat, Dwyane Wade was most valuable, despite injury. Calderon was most valuable in Toronto (7.1 BXS), though Bosh was a very close second (7.0 BXS). The most valuable player on the best team was Kevin Garnett, but since he had a very supportive team behind him, his individual value was somewhat less than that of Chris Paul, whose supporting cast drops off substantially in terms of contributions after Stojakovic.

One useful perspective granted by displaying contributions in this manner is that it is easy to compare units across teams. For example, Boston was famed for the Big Three of Garnett, Pierce and Allen. Using the scale on the right of the graphic, we can see that together, these three accounted for almost half of Boston's success. Looking across the graphic from the lowest part of Allen's rectangle, however, we can see that the big three that was most valuable to their team can actually be found in New Orleans, where Paul, West and Chandler can be credited with almost 60% of the Hornets' success. On the other side of the coin, Detroit, Houston, and Chicago all got a fairly balanced set of contributions, as their subjective reputations might have suggested. I would be very interested to hear about your own observations, as well as your opinions as to how well this graphic meshes with your subjective opinions, in the comments.

I've also developed an interactive presentation of the graphic above, with even more detailed statistics. Just follow the link below to the Interactive BoxScores Explorer page. The league-wide graphic has been scaled to fit in your browser window, and players' statistical details pop up on mouseover. Try it--it's somewhat addictive.

Make sure to click around a little bit--I've created "player cards" for each individual, which display even more detailed statistical information, including their playing style, most and least similar players, the mean and standard deviation of their "counting" statistics, and a season-long sparkline of their productivity. Feel free to use them in any application you wish.

The player cards are a quick and easy way to assess any player. For fantasy purposes, for example, if you're comparing two players with similar averages in assists, you might want to pick the player with the smaller standard deviation about that mean, as indicated by the error bars in the middle section of the card. Alternatively, if you are interested in whether a certain player tends to produce more as the season goes on, the seasonal trends should give some insight into this, as well as how long it takes the player to recover fully from injury, or how much they produce when their minutes go up. Also, I've included each player's most- and least-similar match, based on the 07-08 season, which can help give you an idea of the niche they fill on their team.

Historical BXS franchise timelines

Another excellent use of this BXS area diagramming visualization is to display franchise histories. Better years are represented by wider segments, and the best players rise to the top with tall rectangles. Eras can be identified by patterns in color. Here are two examples, the first depicting the LA Lakers franchise, and the second, Boston Celtics history:

Several eras stand out in the graphic above. The Mikan era was eventually replaced by the Baylor/West dynasty which became the Chamberlain/West years. Notice, incidentally, how West becomes "greener" over the course of his career, indicating a shift away from focusing on scoring, and toward a concentration on other perimeter contributions. A Kareem era follows, though his best years are at this point behind him, and massive team success comes only with the addition of Magic Johnson. Johnson leads the Lakers for ten consecutive years and his retirement marks the end of an era of dominance. LA returns to form in the late 1990s in the hands of O'Neal and Bryant, who turn in some incredible performances--interestingly, there is an obvious breakpoint between 2001 and 2002 on the graphic indicating the switch from the Lakers being "Shaq's team" to being "Kobe's team." The 2006 version of Bryant was forced to carry the scoring load to a massive degree, but the 2008 version (as is evidenced by a much less red color), has been freed up to focus less on point production, and more on doing other things to help his team win. Perhaps the 08-09 season will see Gasol and Bynum float to the top of the column, turning in full, healthy seasons for a very successful LA team.

Boston history is marked even more clearly by the careers of its greatest players. The Celtics of the 1960s are consistently topped by Bill Russell's defensive-interior blue, and bolstered by some great scorers, like Havlicek and Jones. The 1980s saw a parallel to the Lakers above, in which a perimeter player (here Bird) led a team supported by strong interior scorers. The 1986 and 1987 Celtics offer an interesting starting lineup of all greens and blues--no single player was responsible for most of the scoring, while other types of contributions were made by all. After years of seasons with little success and narrow columns, the Celtics finally turned it around last season with the addition of Garnett and Allen, almost tripling 06-07's win total.

The final graphic below offers an alternative take on presenting BoxScores, by tracing the careers of each of 50 NBA greats. Following the peaks and valleys of each player's tenure, we can also see the years in which many stars were shining brightly. 1972 was a great year for the sport, as was 1990--many of the NBA's greatest players had good seasons in these years, and the league may be seen to peak at these points. The graphic makes it possible to see the beginning of new eras--witness the start of Bird's and Johnson's careers, followed shortly by the rookie seasons of Thomas, Drexler, Jordan, Olajuwon, Stockton, and Barkley. We can see a big dip in lockout-shortened 1999, and then another in 2004. This second dip may be troubling--has the quality of the league declined so sharply? Worry not--many modern greats retired in the years just before 2004, meaning that their layers drop out of the picture, setting the stage for the new era of NBA stars we are witnessing today.

As always, I'd very much like to hear your opinions of BoxScores as a measure of value, as well as whether or not you think it gets things right. Was Magic really at his peak in 1987? Were Garnett and Pierce in 2008 in the same league as Bird and McHale in their prime? Should Chris Paul have been the MVP this past season? Were the Rockets really the most balanced of the good teams last year, and will the addition of Ron Artest make them that much more indomitable? Please feel free to leave a comment and take part in the now-customary brief survey below. Next week, I'll go into much more detail on a new way to describe players, without having to use all those pesky words.

PostScript: Discussion of the "50-Win Standard"

I hope you took the time to read Josh Tucker's excellent discussion on the established precedent of giving the MVP award only to players on teams with fifty or more wins. I have a few thoughts on how the 50-win minimum precedent fits in with the BoxScores methodology I've established here.

The first is that I essentially agree with the implied criteria of such a cutoff (or the implementation of the "Bryant-Nash Rule"). That is, I think that value should determine the MVP, and value is measured in wins, not strictly in individual statistics.

However, as an Arbitrarian, I would tend to shy away from establishing an arbitrary (though precedented) line of demarcation between those who should and should not be under consideration. I think that if you put Wilt Chamberlain on a team with 11 kindergartners, and that team won 41 games, I'd want to consider Chamberlain for MVP. That is, there should be a sliding scale, in the sense that each of the Detroit Pistons individually is less valuable than a single LeBron James, though the Pistons collectively tend to do better than do the Cavaliers collectively.

This is built right into the estimation of BoxScores: A player contributes X% of the production for a team with Y wins, and so he is credited with X•Y of those wins. Fortunately, from the standpoint of the 50-win precedent, as team wins decrease, it gets harder and harder for any player to outproduce a player on a 50+ win team.

For example, imagine a season in which player A contributes 20% of the production for a 50-win team. (20% is on the low end for MVP-candidate PVC, and 50 wins is at the low end for a contender as well, so this is a conservative estimate for an MVP frontrunner.) Such a player accumulates 0.2*50 = 10 BXS. Player B, let's say, is on a 41-win team. In order to be more valuable than A, B would have to be responsible for more than 10/41 = 24.4% of his team's production, which is very high, indeed. In 07-08, only two players had more than a quarter of their team's valuable contributions, and one of those was the BoxScores MVP, Chris Paul (whose team had 56 wins). LeBron James was the other, and despite his contributing more than 2/7 of his team's production, Cleveland's win total of 45 left him second in value.
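The break-even share in this example generalizes to any win total. A short sketch, where the function name required_pvc and the list of win totals are chosen only for illustration:

```python
def required_pvc(target_bxs, team_wins):
    """PVC a player needs on a team with team_wins to match target_bxs."""
    return target_bxs / team_wins


# Player A: 20% of the production on a 50-win team.
bxs_a = 0.2 * 50

# Share player B needs on a 41-win team to match A.
needed = required_pvc(bxs_a, 41)
print(f"Player A BXS: {bxs_a:.1f}")
print(f"Break-even PVC on a 41-win team: {needed:.1%}")

# The required share climbs as team wins fall.
for wins in (56, 50, 45, 41, 30):
    print(f"{wins} wins -> {required_pvc(bxs_a, wins):.1%} of team production")
```

Since a PVC above roughly 25% was reached by only two players in 07-08, the rising break-even share shows why a sub-50-win MVP is unlikely under this measure.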

In sum, even if we use BoxScores as our measure of value, it is highly unlikely (although not impossible) that the MVP will come from a sub-50-win team. The precedent will likely remain intact.