I have been preparing for some time now to reconstruct Box Plus/Minus (BPM), with a goal of addressing the major existing issues.

Here are some issues that have been identified (I will add more to this post as more are brought forward):

Poor handling of outliers on offense

Mishandling of interaction terms (related to the first)

Poor estimation of defense

Poor handling of blocks (as shown by college BPM being dominated by block%)

Some of these can be readily addressed by changing the statistic entirely. However, I don't want to do that. BPM will remain a box-score-based statistic. The goal is to make a statistic that can easily be applied to other leagues and contexts that do not have such good data coverage. So these are the constraints:

Box score stats only (i.e. anything that can be calculated from the stats we have from the 80s.)

No PbP stats, not even things like "assisted by" ratios.

Nothing super complex that can't be done by someone with Excel and a good knowledge of math.

Focus on Explanation, not Prediction. What happens should be credited to the team. No luck adjustment. (A good explanatory stat can be converted to a predictive stat with appropriate regression to the mean.)
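The parenthetical about converting an explanatory stat into a predictive one can be sketched as a simple shrinkage toward the league mean. This is only an illustration, not the method BPM uses: the function name, the choice of minutes as the sample-size measure, and the `k_minutes` constant are all assumptions made up for the example.

```python
def regress_to_mean(observed, minutes, league_mean=0.0, k_minutes=1000.0):
    """Shrink an explanatory rating toward the league mean.

    k_minutes is a hypothetical tuning constant: the number of minutes at
    which the observed rating and the league mean get equal weight.
    """
    reliability = minutes / (minutes + k_minutes)
    return league_mean + reliability * (observed - league_mean)


# Same observed rating, different sample sizes (made-up numbers).
low_minutes = regress_to_mean(6.0, 1000.0)   # weight 0.50 on the observation
high_minutes = regress_to_mean(6.0, 3000.0)  # weight 0.75 on the observation
```

The more minutes behind a rating, the less it is pulled toward average, which is the sense in which the luck adjustment can be left to a later step.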

To do this project, I have worked with an NBA team (special thanks to them!) to develop an improved RAPM basis. This basis provides average RAPM over 6-year eras, and handles aging/role changes via a Bayesian prior. These shorter eras allow far better coverage of outliers (like LeBron).

Here is a sample of the data:

What I am interested in on this forum thread is to get ideas from the public about possible ways to reformulate the metric to achieve these goals as comprehensively as possible.

One good interaction that may or may not improve sample size is "turnovers vs. expected", i.e. projected turnovers from the stat line vs. actual turnovers. That has a very good R-squared in general, and to me it seems pretty important.
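A minimal sketch of the "turnovers vs. expected" idea, assuming the projection is an ordinary least-squares fit. To keep it short, a single hypothetical predictor (usage rate) stands in for the full stat line, and all the numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def turnovers_vs_expected(usage, turnovers):
    """Actual minus projected turnovers; negative means fewer than expected."""
    slope, intercept = fit_line(usage, turnovers)
    return [t - (intercept + slope * u) for u, t in zip(usage, turnovers)]


# Made-up sample: usage rate (%) and turnovers per game for four players.
usage = [15.0, 20.0, 25.0, 30.0]
tov = [1.0, 2.0, 2.0, 4.0]
residuals = turnovers_vs_expected(usage, tov)
```

The residual, rather than the raw turnover count, would then be the candidate term: it separates "turns it over a lot because he has the ball a lot" from "turns it over more than his role predicts".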

Look at how much assist rate over role / position (PG, wing, big of primary, secondary, tertiary) contributes to team assist rate differential vs. league average. If you dominate the ball but don't make the team above average, you probably aren't as good as your assist rate alone suggests.

While the assist-rebound interaction term may have helped on average, it seemed to overly penalize the extremes. I'd see if you could dampen the extremes or reduce the size of the factor in general. There is not a great case for saying that having the same person rebound and distribute is more valuable than having two separate players at the same levels.

If you are going to have a versatility-reward interaction term, it should include scoring and probably shot defense.

Look at how much assist rate over role / position (PG, wing, big of primary, secondary, tertiary) contributes to team assist rate differential vs. league average. If you dominate the ball but don't make the team above average, you probably aren't as good as your assist rate alone suggests.

This is an interesting perspective... Probably relates to blocks on the defensive side as well. (Hello, Hassan Whiteside)

Assuming you stick with that, it appears scoring defense is either the same for everybody via a team-level adjustment (regardless of who you guard, or whether you are even on the court), not included at all, or estimated in a semi-complicated but doable and maybe helpful way.

What about estimating a starter's defense as 40% the presumed starter matchup's scoring defense, 15% the average of the starters at the two closest positions, 10% the average of the other two starters, 20% the average of the sub matchups at the position, and 15% the average of the subs at other positions? Defense is shared. This is rough, but it may be better than everybody getting the same grade or no grade at all.
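To make the arithmetic concrete, here is a rough sketch of that blend for a starter. The weights are the ones proposed above; the dictionary keys, the function name, and the sample inputs are hypothetical, and the inputs are assumed to be opponent scoring relative to league average in whatever unit you prefer.

```python
# Weights from the proposal above; they sum to 100%.
STARTER_WEIGHTS = {
    "own_matchup": 0.40,           # presumed starter matchup
    "adjacent_starters": 0.15,     # average of the two closest positions' starters
    "other_starters": 0.10,        # average of the other two starters
    "subs_same_position": 0.20,    # average of sub matchups at the position
    "subs_other_positions": 0.15,  # average of subs at other positions
}


def blended_scoring_defense(components, weights=STARTER_WEIGHTS):
    """Weighted average of opponent scoring figures, one per matchup group."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[key] * components[key] for key in weights)


# Made-up inputs: opponent scoring vs. league average (negative is good defense).
sample = {
    "own_matchup": -2.0,
    "adjacent_starters": -1.0,
    "other_starters": 0.0,
    "subs_same_position": 1.0,
    "subs_other_positions": 2.0,
}
grade = blended_scoring_defense(sample)
```

Passing a different `weights` dictionary is how a separate set of proportions for bench players could be plugged in.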

Assuming you stick with that, it appears shot defense is either the same for everybody via a team-level adjustment (regardless of who you guard, or whether you are even on the court), not included at all, or estimated in a semi-complicated but doable and maybe helpful way.

Something position-based could be workable (e.g. centers get more credit for 2-pointers?).

My weights try to account for general substitution patterns and who guys switch onto the most. Reverse the substitution-related proportions for subs.

I will be working with season-level data, so something based on individual matchups will not be feasible. In other words, we may have season-long opponent 2pt%, but we won't have it broken out by position.

To do this project, I have worked with an NBA team (special thanks to them!) to develop an improved RAPM basis. This basis provides average RAPM over 6-year eras, and handles aging/role changes via a Bayesian prior. These shorter eras allow far better coverage of outliers (like LeBron).

Can you go into any more detail on the approach here?

Briefly: construct a simple prior based on MPG and team quality. Subtract that out of matchup data, run the RAPM, and then add it back in as a postprocessing step.
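For anyone who wants to see those three steps written out, here is a minimal sketch, assuming the RAPM step is a plain ridge regression on stint margins. How the prior is built from MPG and team quality is not specified here, so it is simply passed in as a list; possession weighting and the 6-year era handling are left out, and every name and number below is hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rapm_with_prior(X, y, prior, lam):
    """Ridge regression on the margins left over after subtracting the prior.

    X: one row per stint, +1 / -1 / 0 per player (on court for / against / off).
    y: observed margin per stint. prior: one rating per player. lam: ridge penalty.
    """
    p = len(prior)
    # Step 1: subtract what the prior already explains from each stint margin.
    resid = [yi - sum(x * q for x, q in zip(row, prior)) for row, yi in zip(X, y)]
    # Step 2: ridge normal equations (X'X + lam*I) delta = X'resid.
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * r for row, r in zip(X, resid)) for i in range(p)]
    delta = solve(xtx, xty)
    # Step 3: add the prior back in as post-processing.
    return [q + d for q, d in zip(prior, delta)]


# Made-up example: three players, three stints.
X = [[1, -1, 0], [1, 0, -1], [0, 1, -1]]
y = [4.0, 1.0, -2.0]
prior = [2.0, 0.0, -1.0]
ratings = rapm_with_prior(X, y, prior, lam=10.0)
```

The effect is that the ridge penalty shrinks each player toward his prior instead of toward zero, so a player with little data ends up near the prior and one with lots of data can move well away from it.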