9.15.2007

Fly Into Tomorrow

Ok. Back to what's important.

If you haven’t been following this week’s PER debate, it probably isn’t worth starting now. And if you have been following it, chances are you’re bored sick. But for those eight or nine of you out there actually looking forward to the next installment, come and join me for one final ride.

When we last left off, the theory of PER inflation was under fire from the gentlemen at BallHype, who put together an impressive study showing that season-to-season increases in minutes actually correspond to increased, not decreased, productivity – even for the subgroups we argue are inflation-prone. The problem with looking for the minutes-productivity relationship in this data is that causation may very well run the other way, like when players get more minutes because they've improved. I originally thought this problem could be avoided by looking at intra-season (i.e. game level) data. But as commenter Brian M wisely points out, the problem remains that coaches generally let players play when they’re hot, and bench them when they're cold - so we still won't know if minutes increase productivity, or the other way around.

Rather than look directly at the mpg-PER relationship, I thought I’d try approaching the problem from another angle. Our original hypothesis was that per-minute productivity will decline with large jumps in mpg because of a) the increased quality of teammates with whom production is shared, and b) the increased quality of defenders. My idea is pretty simple: if we can show the (negative) effect of match-up quality on productivity, and the (positive) effect of minutes-played on match-up quality, this would provide some indirect proof that per-minute adjustment creates inflated PERs.

The raw data I use are from the bizarrely under-the-radar +/- stats website basketballvalue.com. For each game, they provide data on every 5-on-5 combination that takes the floor and the total number of minutes elapsed for each match-up. Thus, for every player-game observation it is possible to calculate the average quality of teammates and defenders in that game: first, by taking each player’s 2006-2007 PER rating, then multiplying that figure by the fraction of time they share the floor. For example, if Ginobili plays 30% of a 2-on-2 game with Duncan (PER 26) and 70% with Oberto (PER 12), then the average PER of his teammates would be 16.2 (.3*26 + .7*12). By applying the same method to our 5-on-5 data we derive our two independent variables, "Teammate-PER" and "Opponent-PER", for each player, for each game. Then, by linking these match-up variables with boxscore data from the same games, we can analyze the effect of both Teammate- and Opponent-PER on individual game-level production.
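To make the weighting concrete, here's a minimal Python sketch of the calculation just described. The stint structure and the function name are my own invention (the real basketballvalue.com files look different), but the arithmetic reproduces the Ginobili example exactly.

```python
# Minimal sketch of the minutes-weighted matchup calculation. The
# "stints" structure and function name are hypothetical placeholders;
# the weighting matches the post's example: each teammate's season PER
# is weighted by the fraction of the player's floor time shared with him.

season_per = {"Duncan": 26, "Oberto": 12}  # 2006-07 PERs from the example

# (minutes in the stint, teammates on the floor during that stint)
stints = [
    (6.0, ["Duncan"]),   # 30% of a 20-minute night
    (14.0, ["Oberto"]),  # the other 70%
]

def matchup_per(stints, season_per):
    total_minutes = sum(minutes for minutes, _ in stints)
    weighted = 0.0
    for minutes, others in stints:
        # average PER of the players sharing the floor during this stint
        stint_avg = sum(season_per[p] for p in others) / len(others)
        weighted += (minutes / total_minutes) * stint_avg
    return weighted

print(matchup_per(stints, season_per))  # .3*26 + .7*12 = 16.2
```

The same function yields Opponent-PER if you pass in the five defenders for each stint instead of the teammates.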

The results of our regression analysis are given in the tables below. For the first model, we test the effect of Teammate- and Opponent-PER on three different measures of per-minute production (that is, our dependent variable): the NBA efficiency metric (NBA48), a simplified Hollinger metric (Hollinger48), and Dave Berri’s Win Score (WS48), each of which is normalized a la PER. We use a linear fixed effects model to control for both individual and team effects. (Without getting too technical, what this basically means is that we cancel out the effects of fixed differences in individual and team productivity, i.e. the fact that Kobe or the Suns are more productive on average, and in ways that are unrelated to Opponent-PER.) For the second model, we test the effect of Opponent-PER on WS48 using different subgroups and controlling for minutes played.
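For readers unfamiliar with fixed-effects regression, here's a toy Python sketch of the "within" transformation the model relies on: demean each player's outcome and regressor by his own averages before fitting the slope, so that fixed player-level differences cancel out. All of the numbers below are invented for illustration; they are not from the actual study.

```python
# Toy sketch of the fixed-effects (within-player) idea: subtract each
# player's own averages from both the outcome (per-minute production)
# and the regressor (Opponent-PER), so fixed differences between players
# cancel and only within-player variation identifies the slope.
# All data values are invented for illustration.
from collections import defaultdict

data = [
    # (player, opponent_per, ws48)
    ("Star", 12, 25.0), ("Star", 18, 24.4), ("Star", 22, 23.8),
    ("Bench", 10, 14.0), ("Bench", 16, 13.4), ("Bench", 20, 13.0),
]

# 1. Per-player means of the regressor and the outcome
totals = defaultdict(lambda: [0.0, 0.0, 0])
for player, x, y in data:
    t = totals[player]
    t[0] += x; t[1] += y; t[2] += 1
means = {p: (sx / n, sy / n) for p, (sx, sy, n) in totals.items()}

# 2. Demean and fit the slope by ordinary least squares
num = den = 0.0
for player, x, y in data:
    mx, my = means[player]
    num += (x - mx) * (y - my)
    den += (x - mx) ** 2
slope = num / den  # negative here: tougher defenders, lower WS48
print(round(slope, 3))
```

Note that the big level gap between the two players (25-ish vs. 14-ish WS48) never enters the estimate; only each player's game-to-game variation does, which is exactly the point of the fixed-effects control.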

We find that the effect of Teammate-PER is weaker than expected, and its significance is sensitive to the metric we select. For both NBA48 and Hollinger48, increasing Teammate-PER has a small and significant negative effect on productivity, but using Berri’s WS48 metric, that significance disappears. (This makes some sense, since Berri’s system emphasizes shot efficiency over point totals, making the benefit of high-quality passing more important than the cost of reduced attempts). Moreover, when included in a model with Opponent-PER, the effect of Teammate-PER drops out entirely (see table 2).

In contrast, the effect of Opponent-PER (i.e. the quality of defenders) is robust for all three performance measures. In the first model, the effect is still quite small – a .06 decline in WS48 for every 1-unit increase in Opponent-PER. However, that effect increases significantly with the addition of further controls (i.e. mpg). And when we focus only on our original subgroups - i.e. players with 15+ PERs and high mpg - the effect jumps to –0.20. This means that increasing the average quality of the opposition from 10 to 20 PER – that is, going from a match-up with bench players to a match-up with starters – leads to a 2-point decline in per-minute Win Score (where WS48 is normalized with a mean of 15). Not an enormous decline, but still significant.

Going back to our original theory of PER inflation, we also tested to see if Teammate-PER and Opponent-PER are indeed correlated with the number of minutes an individual plays. As one might expect, we find that yes, the longer a player stays on the floor, the higher the quality of both teammates and defenders. Thus, given the positive effect of minutes played on match-up quality, and the negative effect of match-up quality on individual production, it seems plausible that – all things really being equal – an increase in minutes will lead to (slightly) decreased productivity, on average. And that this is especially true for above-average bench players who get a large bump in mpg - that is, the subgroup we originally hypothesized would be subject to inflation. In short, THE THEORY OF INTERTEMPORAL HETEROGENEITY LIVES.

A couple of quick caveats: first, these are the results of a pretty quick-and-dirty analysis, so please judge them accordingly. Also, while I do know my way around this kind of analysis, I'm far from an expert, so consider that as well. Finally, it's true that the problem raised by Brian M still applies: players who are more productive will stay in games longer, and thus face better defenses. However, this just means that, if anything, the effect we observe is understated, so it hardly undermines our case.

30 Comments:

Regarding your last paragraph: The correlation versus causation barrier still applies, and I would argue that you're adding an unnecessary step:

...players who are more productive will stay in games longer, and thus face better defenses.

It seems like the most common sense explanation is that more productive players face better defenses because the other coach wants to stop them. Conversely, bench players get run when the other team's starters are resting. And anytime a bench guy starts taking more playing time, he will naturally play against the other team's starters more.

I realize this sort of restates your theory, but you hadn't mentioned that part of it is probably voluntary matchup juggling by coaches. But hear me out for the rest of this, because I think I actually have something useful.

Still, none of this stuff controls for the fact that a guy might get more playing time just because he happens to be playing better (whether from game to game or season to season). The way to do that (if you can fish it out of the available game data) is to compare the set of guys who got increased minutes due to injuries versus those who got increased minutes for other reasons. When a guy's minutes increase because he's replacing an injured player, this is by no choice of the coach or anyone. If there is a significant difference in production increases between these guys and the others, you can start to infer that the others are getting extra run because they are playing better.

sb - that's a good point. I was thinking more of the fact that bench players who play longer (because they're hot) will inevitably end up facing the starting defense, and vice-versa. but you're right that if they're hot, they'll probably face better defense regardless of how long they play - because the opposing coach wants to stop them.

reggie- i feel you. i think shoals wrote something about the darko rant @ fanhouse.

THIS DISCUSSION IS CLEARLY THE MOST BORINGEST THING EVER, LIKE DUH, THAT YOU'VE EVER DONE. WHY ARE YOUR PAGE HITS DOWN? I DUNNO, PROLLY COS THIS IS BORING. WHY IS THIS ALL IN CAPS? I DUNNO, I GUESS IT WOULDA BEEN OBVS TO YOU THAT P-E-R WHATEVERS ARE AS BORING AS ALL CAPS SO I THINK I'M GONNA KEEP IT UP WITH THIS CRAP UNTIL YOU START BRINGING US REAL CONTENT.

This is pretty good stuff. I'd like to see if it extends well to a few priors from the real world. For example, by this method, are Kobe, Wade, Nash, etc still the elite? To me, one of the primary draws of PER is that Kobe wins, and I think that most watchers of basketball would agree that he is the best.

Also, it would be nice to formalize the defense thing, but that is probably better left to someone who gets paid to do this, only because it would be a pain.

Although individual numbers will be damaged, isn't mashing the #5 scorer in the league with the #1 rebounder and a talented, veteran swingman (super-glue, as it were) just destined to work? I don't like the Celtics as they were, I don't care for their history of dominance, but it HAS to be a decent team doesn't it?

PER is handy as a proxy for player quality, but it is more a measure of offense than defense (because of the lack of shot defense), so it isn't a perfect comprehensive proxy. Rerunning with 06-07 adjusted +/- data when it becomes available might be an option.

The debate will continue. I appreciate your efforts, and your hypothesis and findings add to that side of the scale.

Sorry guys, but all this blowharding about Efficiency Ratings (when is the last time you referenced them without the acronym?) obscures the whole point of the Association (sorry Charlie, but it's the ring, baby!) with its attempt to statistically evaluate disparate players who play on different teams with different systems and styles and rosters. WTF? The only true measure of success in this game is who wins and loses and who can hold their cookies in the playoffs. Let's get back to talking about something culturally relevant. I know it's the offseason, but damn, can't we talk about next year, or all this summer's international basketball, or this picture from the late-90s film "Celtic Pride" (http://www.lovefilm.com/lovefilm/images/products/0/21220-large.jpg): Damon Wayans as Kevin Garnett and Danny Ainge as one of those familiar-looking white guys. The symbolism is just too much.

"Although individual numbers will be damaged, isn't mashing the #5 scorer in the league with the #1 rebounder and a talented, veteran swingman (super-glue, as it were) just destined to work?"

Sounds familiar to me, a lot like "isn't mashing the #1 point guard in the league with one of the top 5 scorers, along with a talented, up-and-coming swingman, just destined to work?" That was how people described the Kidd/Carter/Jefferson Nets. Garnett/Pierce/Allen are definitely a talent upgrade, but I think the '04-'07 Nets have proven you absolutely need a bench and at least one more adequate starter. I'm excited as hell to see it, but I just can't stop equating Rondo/Allen/Pierce/Garnett/Perkins with Kidd/Carter/Jefferson/Krstic/Collins.

I read an article somewhere, I'll try and dig it up, arguing that shooters being hot or cold is simply a matter of perception. In other words, the player's shot percentage doesn't increase or decrease by anything more than a negligible number. Do you think that's true of games as well?

One question: Why did you mix your PERs and your Berris in the final findings? I'm not sure how much the PER-WS disparity mucks things up, if at all. But it seems weird to say 'players who are good by this measure and play more against players who are good by that same measure perform poorer by this other measure.'

Do you have data using either >average WS or with PER as the dependent variable?

Also: big ups for using basketballvalue.com. That site's in the holy triumvirate now with 82games and basketball-reference (and by association, dougstats).

This is the most boring shit I have read in a long time. Not this post specifically, but the whole stat debate. Who cares? The stats are bullshit. We all know that. Do we really need to talk about it for a week? I mean, come on.

And arguing about basketball existentialism isn't basketball nerd-dom? Get a life. Stats aren't bullshit, they offer an interesting way to look at an issue that is typically approached through purely qualitative means.

TZ - Good question. I'm not sure why I went with Berri's WS. But the results in Table 2 are basically the same no matter which measure (NBA48, Hollinger48, or WS48) you choose - i.e. for the 4th model, it varies between -.17 and -.21. Also, the "Hollinger48" measure that I constructed really isn't the same as Hollinger's PER - I don't use his constants for pace and team rebounding, etc. For Teammate-PER and Opponent-PER, I just used the season-level PER numbers from ESPN. But since I had to construct my own for the game-level measure, I just did a very simple approximation of his formula. I don't expect the results would differ too much either way, though.

Everyone else-Yeah, I get it. Shit is boring. I couldn't agree more. Feel free to discuss anything you like.

I guess the difference is that basketball existentialism is not boring. Stat debates are. Watch the games and determine who is good based on actually watching the guys play. This isn't baseball. And it isn't math class.

Other than causality, there is a huge issue with this. Let's assume that PER is a zero-sum game, which it isn't, but for the most part it should be. If someone gets a steal (PER good), that means someone else is credited with a turnover (PER bad). If we're using the season-end PER for each player, then that means it accounts for all the good and bad things a player did during the season. Assume there are only two players in the league: Player A has a high PER, and Player B has a low PER. Now if you go through each moment of each game, you'll find that Player B did worse when facing Player A. Why? Because Player A got steals, and Player B committed turnovers.

Taking this analogy back to your study, in general the average of all the players will look worse than their season average against Player A, because we know Player A was good on the year! The more I think about it, I think you've created a tautology.

I don't think PER is bullshit or boring, and I have been reading every post so far. What disturbs me about this whole discussion being on freedarko is that it signifies to some degree a turn away from the oppositional or -- wait for it -- postcolonial stance of the site thus far. Please please step back from the brink of hyper-rationality, embrace the fluid, mystical nature of existence (and basketball) and quit trying to represent what cannot be represented. This is really Victorian age shit here, this codifying impulse, and must have something to do with the proliferation of "experts" and their need to maintain their privileged position through inaccessible statistical machinations. You don't have to stop with this, but please don't give up the messianic discourse we have come to love (read: need).

Correct me if I'm wrong, but what I understand you to be saying is this: if Player B had a large number of turnovers last season, this means that every time he took the floor, his opponents' steals would have been higher, on average, because his losses were by definition their gains. The idea behind a fixed-effects model is that it controls for these kinds of problems. All the fixed effects of low-PER players (the fact that they tend to share the floor with high-steals players, etc.) are controlled for. In a sense, it's as if we're looking at one individual player and then seeing how his performance varies with the quality of the opponents he faces. So, if it's a player who committed a high number of turnovers, the question is, did he commit more turnovers against certain defenses than others? Does that address your point?

Ok, let's take this example and port it into baseball. For instance, we want to test if a player with 300 ABs will perform better/same/worse when we give him 600 ABs (for a second, let's assume that he's not a lefty masher or other platoon player). Now we can run the same test you just did, accounting for everyone's teammates and opponents. That is, look at every batter/pitcher matchup and see how each batter did. In general we'll find that batters did worse against the best pitchers in the league.

Everyone should perform worse when facing stiffer competition. Sure there might be a crappy batter that owns a Johan Santana. But overall in general the league will be worse against the best pitchers.

But what does this say about our 300 AB guy? Does this mean that his OPS will decline when he's given more ABs?

What might sell me on this study is to compare low-minute players with high-minute players of the same PER. That is, take all the 15-minute players with a PER of 15, and compare them to the 30-minute players with a PER of 15. And so on and so forth. See how they do against the top X% PER players of the league.

If your theory is correct, the higher minute players should outperform their low minute brethren.
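That proposed check is easy to sketch in code. The player records below are hypothetical placeholders, not real data; the point is just the grouping logic: bucket players by season PER, split them into low- and high-minute groups, and compare each group's production against elite defenses.

```python
# Sketch of the proposed comparison, with hypothetical player records:
# (season_per, mpg, ws48_vs_top_defenses). Bucket players by season PER,
# split into low- and high-minute groups, and compare their production
# against elite defenses.

players = [
    (15, 14, 12.1), (15, 16, 12.5),   # low-minute guys, PER 15
    (15, 29, 13.5), (15, 31, 13.8),   # high-minute guys, PER 15
]

def avg(xs):
    return sum(xs) / len(xs)

low  = [w for per, mpg, w in players if per == 15 and mpg < 20]
high = [w for per, mpg, w in players if per == 15 and mpg >= 28]

# Under the inflation theory, the high-minute group should hold up
# better against top defenses than equal-PER low-minute players,
# so this gap should come out positive.
gap = avg(high) - avg(low)
print(round(gap, 2))
```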