Breaking down Wins Produced

To paraphrase John Von Neumann, truth is much too complicated to allow anything but approximations.

Case in point: it is a simple truth that everything in the Wins Produced model comes from the boxscore. All the math is based on actual, measurable events that occur on a basketball court. Understanding this can lead to some interesting revelations about the game of basketball.

But that’s really getting ahead of myself. Let’s roll it back and start from the beginning, because people always get confused by the model itself. One of the key misconceptions is that we somehow make up some of the numbers that go into the model. We get asked about this all the time. It is patently untrue, but I can understand how people get that impression.

Bear with me

Now, Prof. Berri has a wonderful page outlining in detail how the model is built. This is fabulous if you want to replicate the model and go off and do it yourself. It’s actually, to me, the easiest of the advanced models to build and replicate (and trust me, I’ve done them all). The problem lies in how faint that praise really is.

The level of detail in that explanation can be very intimidating to the more casual fan. I must admit, dear reader, that even I can get lost in the intricacies of the model.

So trust me to take it upon myself to attempt to clarify. My goal here is to walk you through the model and clear up some common misconceptions. That will then let me have some fun with the components of the model.

The first bit is simple. Wins Produced is based on point margin and the fact that it correlates strongly with winning (rocket science, I know). Points scored on offense and points given up on defense are good indicators of a team’s ability to win games. On this pretty much everyone agrees.

In the simplest terms, there are really four things that actually matter to winning in basketball:

Points Scored on offense

Possessions Employed on offense

Points allowed on defense

Possessions Acquired on defense
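As a rough illustration of the possessions side, a common boxscore approximation (not necessarily the exact Wins Produced bookkeeping) estimates possessions employed from attempts, rebounds, and turnovers:

```python
def possessions_employed(fga, orb, tov, fta, ft_weight=0.44):
    """Estimate possessions used: field-goal attempts, minus offensive
    rebounds (which extend a possession), plus turnovers, plus the share
    of free-throw attempts that end a possession (~0.44 is a common
    convention; the exact weight varies by model)."""
    return fga - orb + tov + ft_weight * fta

# Hypothetical single-game team totals
print(possessions_employed(fga=85, orb=12, tov=14, fta=25))  # 98.0
```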

What the model actually does is map every action on the court that’s measured in the boxscore to one of those four things and map out its value to wins.

The model then is divided into three distinct parts:

The individual player’s production on the offensive end. This includes field goals made and missed, free throws made and missed, turnovers, and assists: all the player actions related to the end of his team’s plays on the offensive end. Let’s call this Individual Offense.

The individual player’s production on the defensive end and on the boards. Here we have all the plays related to a player gaining his team possession of the basketball by ending or preventing the opponent’s offensive opportunities; this includes rebounds (offensive and defensive), steals, and blocks. Let’s call this Individual Defense + Rebounding.

The team factors. These capture all the boxscore stats that are specifically not assigned to an individual. The first group includes the rebounds and turnovers that are not assigned to any player. The second includes all the opponent’s missed field goals not captured in the individual stats via blocks. Let’s call this Team Defense.

Now, the majority of these stats are just tallied up and multiplied by a value to come up with our total productivity. There are three exceptions to this. Team Defense factors are divided amongst the players on a team based on minutes played by each player. In essence, this gives you credit for playing on a good or a bad defensive team. Assists are the second exception. In the simplest possible terms, a player gets credit for assists in comparison to an average player on his team. Better-than-average assist producers get credit; below-average guys get docked (and nobody likes playing with them). The assist itself gets assigned about half the value of the points produced by the assisted field goal. The last exception is defensive rebounding. For each defensive rebound gained by a player, he takes away about half a rebound from his teammates. What we then do is take the value of that half rebound, toss it in a rebound tip jar, and spread it amongst the team’s players based on minutes played.
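The minutes-based split and the rebound tip jar could be sketched like this (the 0.5 weight follows the "about half" above; the player names and numbers are made up for illustration):

```python
def split_by_minutes(total_value, minutes):
    """Split a team-level value among players in proportion to minutes played."""
    team_minutes = sum(minutes.values())
    return {p: total_value * m / team_minutes for p, m in minutes.items()}

def rebound_tip_jar(dreb, minutes, diminish=0.5):
    """Each player keeps about half of each defensive rebound; the other
    half goes into a pool redistributed to the team by minutes played."""
    pool = diminish * sum(dreb.values())
    credit = split_by_minutes(pool, minutes)
    return {p: dreb[p] * (1 - diminish) + credit[p] for p in dreb}

# Two hypothetical players: note the team total (12 boards) is preserved
adjusted = rebound_tip_jar({"A": 10, "B": 2}, {"A": 36, "B": 12})
print(adjusted)  # {'A': 9.5, 'B': 2.5}
```

The same `split_by_minutes` helper is how a Team Defense value would be shared across a roster.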

So a player’s individual productivity is the sum of Individual Offense and Individual Defense + Rebounding. This is then adjusted by the value of Team Defense to give total player productivity. To get Wins Produced, we compare the player’s (or team’s) productivity against the average productivity at his position (or for the average team). A player’s value is then a function of how good he is versus the average opponent faced. The classic model does this in wins, but we can also very easily translate it to points (and I’ve done just that).
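A minimal sketch of that comparison to the positional average, assuming the commonly derived figure that an average player accounts for about 0.100 wins per 48 minutes (41 wins spread over 5 positions across 82 games; the exact constants belong to the full model):

```python
def wins_produced_sketch(prod_per48, pos_avg_per48, minutes, avg_wp48=0.100):
    """Sketch: a player's wins are his per-48 productivity relative to
    the positional average, plus an average player's share of team wins
    (~0.100 per 48 minutes), scaled by minutes played."""
    return (prod_per48 - pos_avg_per48 + avg_wp48) * minutes / 48

# A player producing 0.150 per 48 at a 0.100-average position, over 2400 minutes
print(wins_produced_sketch(0.150, 0.100, 2400))  # about 7.5 wins
```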

If this is too much trouble for you, we’ve even come up with a nice simple version:

Let’s get to the payoff. If I work the numbers for all these factors since 1978, I come up with some very interesting results. First, let’s look at last year by category as a proof of concept:

(Editor’s Note: I screwed this up initially, fixed now. Carry on.)

Chicago, Miami, Philadelphia, and OKC show up as our top four (I feel sorry for the Bulls fans). Boston, New York, and then Memphis show up as our best defenses (with the Celts being historically good). On offense the top three are San Antonio, OKC, and Denver. For individual defense and rebounding we get Chicago, the Lakers, and OKC. This lines up with reality very well. The Lakers, for example, have been good at the possessions-acquired game in the last decade (offense and team defense have always been a concern as well). The bottom of that list is fairly accurate too.

Let’s get back to the historical chart though:

A few things jump out here:

The correlation of the total Wins Produced model to wins remains very consistent over time, in the 90%+ range.

The models containing just the boxscore stats (Wins Produced and Win Score) track very closely to each other.

The explanatory value of each of the factors changes dramatically over time. Offense has risen over time (hello, three-point shot) relative to Individual Defense + Rebounding, though the latter has made a comeback more than a few times. Team Defense in particular seems to go in and out of fashion (and if I had to guess, it is very, very sensitive to rule changes).

Efficient offense seems to be priority one for winning. Individual Defense + Rebounding and Team Defense seem to be more arbitrary (good to have, but no guarantee of success).

I hope this clarifies some doubts and leads to some more interesting questions. It’s certainly going to lead to some more posts (yes, dear readers, that is foreshadowing).

45 Responses to "Breaking down Wins Produced"

Thanks for these awesome graphics and charts; I always enjoy these posts. However, I have two questions that you guys can hopefully help me with:

1) Is it fair to divide team defense among the players? I’m sure the Wages of Wins writers of all people would surely be against saying Tyson Chandler = Amare Stoudemire on the defensive end.

2) I see that Opponent’s 3FGM, 2FGM, and FTM are used in the calculation. These combined will give the Opponent’s total score. Along with that, we have 3FGM, 2FGM, and FTM for each player, meaning we have the player’s total points. Since a win is completely determined by which team has more points, shouldn’t a model that includes all these statistics have a 100% correlation?

Chirag,
1) I agree with you. We spend a lot of time on this particular topic, actually. Hopefully we can come up with a better system for this in the future.
2) The model is built around point margin. Point margin and wins do not perfectly line up. Stuff happens :-)

1) This is an issue people often have with Wins Produced, but I think that it mostly stems from a misunderstanding of what the model is doing here. Luckily, in this post, Arturo broke it down, so you can go see for yourself what’s going on. The most important thing to note is that Team Defense does not equal ALL of the team’s defensive output, but rather the team’s defensive output MINUS individual defensive output. So the model is not at all saying that Tyson Chandler is equivalent to Amare Stoudemire on defense, because Tyson Chandler’s individual defensive output is much higher than Amare’s. Arturo even points out in the article that team defense has an effect that seems to spike occasionally, but mostly is fairly insignificant.

1) (Continued)
It is still true that poor defensive players do get a slight “unfair” boost from the team defensive adjustment. Amare probably gets credit for some of Tyson Chandler’s box-outs, and for the shot disruptions Chandler creates that don’t end in a block, but these events happen rarely enough that they don’t have a significant effect on player ratings. Of course everyone would prefer that this team adjustment only include events that are truly not the result of individual defense, but for things not in the box score (except charges), we just don’t have a reliable way to accurately attribute credit. You could resort to something like adjusted plus-minus to “capture” all of these events, but then you would be taking on a model with enormous problems in order to correct for a relatively small source of error on the individual level (which, by the way, disappears at the team level).

2) I would just add that your assertion works on a per-game level but breaks down on any level that takes multiple games into account: while, in aggregate, score differential is extremely predictive of team wins, variations in how those points are distributed across games will have an effect on the actual win-loss record. Here’s an example. Say that over a two-day period, team Blue and team Red each play two games (none against each other), and both end up with an average point differential of +10. There are various ways to achieve this average. Say team Blue blew out their opponent in the first game, achieving a +21 differential; then they must have lost the second game by 1 point. Team Red, on the other hand, beat its first opponent by exactly +10 points and its second by +10 as well.

2) (Continued)
So even though the point differentials over the period for the Blue team and the Red team match, the Blue team ended up with a 1-1 record while the Red team went 2-0. This effect is why point differential’s predictiveness of team wins and losses falls short of 100%.
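The Blue/Red example can be checked in a few lines:

```python
# Both teams average +10 over two games, yet their records differ
blue = [21, -1]   # blew out game one, lost game two by a point
red = [10, 10]    # won both games by ten

def record(margins):
    """Count wins and losses from a list of per-game point margins."""
    wins = sum(1 for m in margins if m > 0)
    return wins, len(margins) - wins

avg = lambda ms: sum(ms) / len(ms)
print(avg(blue), record(blue))  # 10.0 (1, 1)
print(avg(red), record(red))    # 10.0 (2, 0)
```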

Daniel M,
Yes. A metric built around efficiency differential (like Wins Produced or Win Shares) would have a similar correlation.
As shown here, it’s the team’s (or player’s) raw efficiency differential versus the average level of production, so some strength of schedule is built into the model, since you do not play an average schedule. Opponents are adjusted for, but things like altitude and days of rest are not.

I see that you alluded to year adjustments (maybe coming in the new model?), but I am wondering why they weren’t there in the first place. I am fairly certain some sort of three-year-window adjustment could be included to value turnovers, rebounds, etc. equally.

Has anyone ever done any work to look at whether the value of a Rebound or a Turnover or a made FG have changed over time?

From the charts above it looks like it remains fairly stable (the full model has almost always correlated >90% with team wins). But it could still be possible that the model would improve if the calculations were based on more localized time periods rather than on everything from 1987 to today.

You would want to avoid overfitting the model but it would be interesting to see what the trends are over time. Is rebounding more important today than twenty years ago? Or less?
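One way to explore that would be re-fitting a stat’s value on overlapping season windows. A toy sketch (the data layout and numbers here are hypothetical, and a real fit would use every boxscore factor, not a single slope):

```python
def fit_slope(x, y):
    """Ordinary least-squares slope of y on x (single predictor, no libraries)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

def era_coefficients(data_by_season, width=3):
    """Re-fit a stat's value to point margin on overlapping season windows,
    instead of pooling every season into one regression."""
    seasons = sorted(data_by_season)
    coefs = {}
    for i in range(len(seasons) - width + 1):
        window = seasons[i:i + width]
        xs = [v for s in window for v in data_by_season[s]["stat"]]
        ys = [v for s in window for v in data_by_season[s]["margin"]]
        coefs[tuple(window)] = fit_slope(xs, ys)
    return coefs

# Toy per-team stat totals and point margins, by season
toy = {
    2010: {"stat": [1, 2], "margin": [2, 4]},
    2011: {"stat": [3], "margin": [6]},
    2012: {"stat": [4], "margin": [8]},
}
print(era_coefficients(toy))  # {(2010, 2011, 2012): 2.0}
```

Comparing the windowed coefficients against the pooled ones would show whether, say, rebounding’s value has drifted over eras.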

@Daniel M, Arturo
“Any metric that enforces summing to the team efficiency differential will have an identical correlation to that.”

Correct me if I’m wrong, but Wins Produced isn’t like some algebraic sum that is set to equal efficiency differential, so implying that the correlation of Wins Produced in modelling Wins should be /identical/ to that of Points Differential in modelling wins is incorrect. Wins Produced is more a model of Points Differential in terms of box score statistics that is then translated into Wins based on how point differential itself maps to wins. In other words, the above assertion is too simplistic, and it implies that because WP is built on Points Diff, it is trivial for it to achieve a similar correlation to wins as Points Diff. I don’t think that implication is correct, and one can convince oneself of this by comparing the correlations to wins of all of the models based on Points Diff, and seeing that they’re not equal.

Love the breakdown; it’s probably the reason I got so attached to this site. WP seems so predictive and powerful. However, since it is 90% and not 100%, I have to ask: has a team or player consistently broken the model? Do the Spurs consistently overperform, or are there merits to “Defense Wins Championships” versus the “7-seconds Suns” being regular-season teams? Just curious. Great stuff.

Shawn,
I was being simplistic. You are correct. I could build a simple model on point differential and minutes played by the player, and it would be very accurate at the team level but would tell me bupkus about the players. The value in models like Wins Produced is the fact that all the publicly available data from the boxscore has been mapped to point differential using sound mathematical principles. Yes, there are gaps in the data and areas for improvement, but we feel it’s a reasonable approximation of the truth.

“Correct me if I’m wrong, but Wins Produced isn’t like some algebraic sum that is set to equal efficiency differential, so implying that the correlation of Wins Produced in modelling Wins should be /identical/ to that of Points Differential in modelling wins is incorrect. ”

The team adjustment for Wins Produced (and many other metrics) effectively does force the team to sum to point differential. Win Shares, ASPM, Wins Produced, etc. all effectively will be the same in that regard. The only reason there wouldn’t be 100% correlation to wins by any of those metrics is the fact that there is some (generally) random error translating point differential to wins.

Hey Daniel,
Your wording has been a bit sketchy in terms of connotation.

Your first point is that any metric using point differential will behave the same. Point differential is indeed a good indicator of wins; it’s what most people use for teams. In essence, it’s a good thing to model.

Your next points are a bit off. You say “effectively does force,” and here’s where I find the terminology off. Wins Produced says “wins come from point differential,” then “point differential can be derived from the following box score values,” and finally “the box score values are assigned as such,” where most are assigned to individual players. There is no forcing; this is straight math, with regressions deciding the value of a difference of a point (roughly 0.033 wins) and of each box score stat as it relates to that (which you can see in the calculating section).

Your last point is right: point differential is a good model for wins, but it is not 100% (teams lose close games, teams get blown out).
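Taking the roughly 0.033 wins-per-point figure mentioned above, a naive linear translation from per-game margin to season wins might look like this (a sketch, not the model’s actual regression, and it breaks down at extreme margins):

```python
def expected_wins(avg_margin, games=82, wins_per_point=0.033):
    """Naive linear map: winning percentage is ~0.5 plus ~0.033 per
    point of average per-game margin, scaled to a season."""
    return games * (0.5 + wins_per_point * avg_margin)

print(round(expected_wins(0.0)))  # 41: a dead-average team
print(round(expected_wins(5.0)))  # 55: roughly a +5 contender
```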

Rather than just divvy up team defense (and rebounds) based on minutes played, why not try to divvy it up more accurately using lineup analysis? The precise data for which lineups were on the court at which times is out there… 82games.com uses it, for instance.

The simplistic approach would be to divvy up team defense and rebounds among the 5 players on the court at a given time. That would help a lot when it comes to differentiating between players who often sub in for one another (so, say, Harrington wouldn’t get credit for many of Faried’s rebounds last season in Denver).

Even better would be to perform a regression analysis on all of a team’s lineups, and figure out the impact each player has on opponent’s offensive efficiency and on defensive rebounding rates. That obviously adds a lot of complexity to the model, but it’s a reasonably self-contained MMSE problem that could be done for each team. I suspect the results would improve correlation of team defense to team wins.
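A toy version of that least-squares idea, with made-up stint data (a real fit would use far more stints, control for opponents, and regularize the rank-deficient system):

```python
import numpy as np

# Hypothetical stint data: each row marks which of six players were on
# the floor (1 = on court); y is opponent points per possession during
# that stint.
players = ["A", "B", "C", "D", "E", "F"]
X = np.array([
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1],
], dtype=float)
y = np.array([0.98, 1.05, 1.02, 0.95])

# Minimum-norm least-squares estimate of each player's association with
# opponent efficiency while he is on the floor
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
impact = dict(zip(players, coef))
print(impact)
```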

The much simpler part is the model that maps Points Differential to Wins. Getting player performance to map to Points Differential is basically always the more complex part of any of these models. Modelling wins from point differential is pretty straightforward, and it doesn’t really make sense to say that models will necessarily be the same because of a component of relatively minimal complexity; in fact, that claim is incorrect. All you could say to that end is that the explanatory power of the model that maps points differential to wins constitutes an upper bound for the explanatory power of any model that uses it as the mechanism for getting to wins. The strengths or weaknesses of the model beyond that are its own, and have nothing to do with other models which use the Pts Diff -> Wins mechanism.

As for the “team adjustment” comment: when you’re building a model, you are trying to mirror reality. If there is a part of the domain of your model that you don’t attribute to particular factors, you still have to include it in the model; otherwise the model is incorrect. This shouldn’t be controversial. The issue isn’t whether or not a model has an adjustment, but how well it enumerates the factors that contribute to the outcomes it is trying to model. The mere presence of an adjustment does not speak to this significant qualitative distinction between models. So it strikes me as a strange thing to note that Win Shares, ASPM, Wins Produced, etc. have adjustments. It’s like saying models are models.

The adjustment doesn’t mean that the model will sum to Wins. Because of the particular factors each model builder chooses to enumerate, there are factors that lie outside of the adjustment. Those enumerations amount to going out on a limb, saying that this factor affects outcomes in this manner, and you could be wrong on any of those particulars; if you are, your model’s predictions will diverge from actual outcomes. If you don’t enumerate anything, then nobody will use your model, and if you enumerate things that you can’t accurately estimate given your inputs, then your model’s explanatory power will suffer. Therefore, it’s not correct that Win Shares, ASPM, and Wins Produced need be of equal explanatory power merely because they use adjustments and translate back to Points Differential, which is what you seem to be arguing.

I believe the point Daniel is trying to make, and pardon me if I’m mistaken, is that using team level correlation to wins doesn’t necessarily show that a model is good. For example, you could create a model that only tracked players’ points scored, and then gave a defensive adjustment for opponent’s points scored. This model would suggest that Kobe Bryant was amazing and Tyson Chandler was bad. But, when taken at a team level, it would be highly correlated to winning, because point differential is correlated to winning.

What makes a model good is being able to be broken down to a player level and then be predictive when using different combinations of players.

I think that it’s still useful to note the correlation of the model to the factor that you are modelling; in fact, that’s the whole point of the model. Those experienced in statistics or econometrics might not be overly impressed by a model that has a high correlation to the factor it is modelling, but I think there are at least a couple of very important reasons for noting the model’s performance.
Firstly, it helps to establish credibility with lay readers, especially if you contrast a model with high explanatory power against a popular model with low explanatory power. Sure, those lay readers may read too much into a statement of explanatory power, but at least it highlights the difference between a model that gets the basics right and one that is way off.

Secondly, model makers and users should periodically test how their model performs on the factor being modelled, especially in a dynamic system. It’s conceivable that you could build a model that enumerates many factors and accurately tracks wins at first, but then drifts over time as the system being modelled changes. That would have happened, at least to some extent, if you had made your model before the three-point era with a factor for field goals but not for three-pointers. That’s an obvious rule change that needs to be accounted for in the model, but there may be other, non-obvious cases of factor drift that you might not catch without looking at the model’s overall performance. Looking at performance from year to year allows you to at least identify when you’ve got an unanticipated problem.

Also, to bring it back to the first chart, those years in which the full model dips might be good seasons to mine for ideas on how to improve the model. Of course it’s not unlikely that a lot of that is random variation, but you might chance upon natural experimental controls that you hadn’t considered. For instance, this past season seems to have been a /relatively/ poor-performing year for the model. Of course my first thought was “Lockout!”, but backtracking to 1999, it doesn’t look like a lockout necessarily hurts the model’s efficacy too much. The other nice thing about this chart is that it shows that there really hasn’t been too much factor drift. It has effectively modelled wins for a long time (I know much of this is retrodiction, but it’s still impressive that it fits so well given everything that’s changed in the game).

Awesome graph, it looks like you have a negative (as in photonegative) model of the effect of changing teams on production. Are you going to have much to say on that when you release the model? Should I venture to hope for a positive model?

“What makes a model good, is being able to be broken down to a player level, and then be predictive when using different combinations of players.”

Correct. To be predictive not of itself (which proves nothing except you’re measuring the same thing) but to be predictive of team point differential with new lineups.

The ultimate proof of an individual player statistic’s validity is when you can take the individual ratings for Y-1 and predict lineup performance in Y, with lineups that did not exist in year Y-1. That is the only way to truly prove the validity of the statistic.

“So the only way to test a model is to look at how it predicts small segments of an actual game? We aren’t interested in an actual game? Or actual season? ”

Obviously we care about the actual game and actual season! However, lineups that were used in Y-1 and Y will be very highly correlated no matter what stat you’re looking at, so it will be harder to discern differences in performance from one metric to another in total. If a specific lineup produced a specific performance in Y-1, it will likely look similar in Y, no matter to whom in the lineup you attribute the performance.

If a lineup is +15 in Y-1, and I attribute that all to player A, and in Y, the lineup is still +15… what have I validated?

If however, I look at new lineups, where player A is not with the same players as last year, I can truly figure out if player A was indeed worth what I estimated, +15. If I predict, based on Y-1, that a lineup should be worth +15 in Y, and it actually is worth +5, I might have an issue.
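That validation idea could be sketched as follows, with hypothetical per-48 ratings:

```python
def predict_lineup_margin(prev_ratings, lineup):
    """Predict a lineup's per-48 margin in year Y by summing its five
    players' year Y-1 ratings (hypothetical per-48 values)."""
    return sum(prev_ratings[p] for p in lineup)

# Hypothetical Y-1 ratings; suppose this five never shared the floor in Y-1
ratings = {"A": 6.0, "B": 3.0, "C": 2.0, "D": 1.0, "E": 3.0}
predicted = predict_lineup_margin(ratings, ["A", "B", "C", "D", "E"])
print(predicted)  # 15.0
# If the lineup's observed year-Y margin is, say, +5 rather than +15,
# the Y-1 credit assignment (e.g. too much to player A) is suspect.
```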

Hmm, I still don’t understand how incorporating opponent team statistics works.

If the only variables I used were points, Opp 2FGM, Opp 3FGM, Opp FTM, then I have the exact point differential. Of course point differential doesn’t map out exactly to wins, but it is fairly close. With just those 4 statistics, I could achieve about the same correlation to Wins as the WP stat does.

So doesn’t using opponent statistics and simply dividing them up equally among the players make it a bit unfair?
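For concreteness, the arithmetic behind the point about made shots: totals (and therefore the exact margin) follow directly from made twos, threes, and free throws.

```python
def points_from_makes(fg2m, fg3m, ftm):
    """Total points are fully determined by made twos, threes, and FTs."""
    return 2 * fg2m + 3 * fg3m + ftm

# With makes for both sides, the exact point margin follows directly
# (hypothetical game totals):
margin = points_from_makes(32, 10, 18) - points_from_makes(30, 8, 20)
print(margin)  # 8
```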

The problem with using line-ups for something like that is sample size. Plus/minus statistics, which have an inherently high variance, usually require multiple seasons of data to overcome sample-size issues. If you’re looking at specific line-ups, which play far fewer minutes in a given season than an individual player does, you need an even larger sample to actually prove anything. Since line-ups rarely stay together that long, it becomes almost impossible to prove anything at the line-up level.

A far easier solution would be to look at teams which have recently had a lot of player turnover, and compare the expected wins produced of the entire team with how many games they actually won.

I think a good test of any model is to measure how well it does vs. the Vegas odds makers when it disagrees with Vegas.

For example, in Vegas right now I am sure there are number crunchers looking at the data from last year and the year before, injuries from last year, current injuries, trades and FA signings late last year and this offseason etc… and building an odds line for how many games they expect each team to win and also projecting odds for winning the championship.

If your model is really good (because we know Vegas is very good), you should be able to value some of those things at least as well as, and hopefully better than, Vegas. There will always be issues with getting them all right (future injuries, the difficulty of evaluating rookies, midseason trades, etc.), but you should be able to value what we do know better.

I already have a couple of teams in mind for gambling purposes that I may bet early in the season because of insights like that.

“The problem with using line-ups for something like that is sample size. Plus/minus statistics, which have an inherently high variance, usually require multiple seasons of data to overcome sample-size issues. If you’re looking at specific line-ups, which play far fewer minutes in a given season than an individual player does, you need an even larger sample to actually prove anything[...]

A far easier solution would be to look at teams which have recently had a lot of player turnover, and compare the expected wins produced of the entire team with how many games they actually won.”

When you look at many pairs of years Y and Y-1, you should be able to overcome the sample-size issues pretty easily by looking, in aggregate, at all of the lineups in Y that were not there in Y-1.

I agree that the team-level analysis would be easier and would get at the same point.