If there are any questions about Box Plus/Minus calculations, please feel free to ask them in the comments section below. I will be migrating questions about Box Plus/Minus from my comments page and other locations to this page to keep them all together.

39 Comments

Mr. Myers, I have read your BPM article on basketball-reference with great interest. I am in the midst of a project comparing four methods (yours, WS, WP, and EWA), and I can’t figure out one part of your methodology: replacement level is -2.0 BPM, but what is the win total for a team of replacement level players? If I sum up all the VORP for 2014, I get 810.2 wins above replacement level, which leaves 419.8 of the league’s 1,230 wins for the replacement level, which divides rather neatly to 14 wins per team. Is that a coincidence, or is 14 wins the number? I get 810.8 wins for 2013 too, but I thought asking directly was most prudent.

Since a team of -2.0 players will produce about 14 wins (depending on how you do the team points to wins conversion: Pythagorean method, linear, etc.), that is the number of wins a replacement level team will produce according to VORP.

The number that was derived directly was the -2.0 value, per the extremely long thread on Tom Tango’s blog (linked to in the BPM writeup). A team of replacement players will produce some wins; the difficulty lies in establishing that number. Several methods were used in the later portions of the Tom Tango comment thread, and all basically came up with the -2.0 value.
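As a rough check of the arithmetic in this exchange, the sketch below backs out the replacement-level win total from a league-wide sum of wins above replacement. It assumes a 30-team league with an 82-game schedule (1,230 total wins) and that the VORP sum has already been converted to wins, as in the question. The function name is illustrative only and does not come from the BPM spreadsheet.

```python
def replacement_team_wins(total_wins_above_replacement, teams=30, games_per_team=82):
    """Wins per team that remain once wins above replacement are removed.

    Assumption: every game produces exactly one win, so the league total
    is teams * games_per_team / 2 (1,230 for a 30-team, 82-game season).
    """
    league_wins = teams * games_per_team / 2
    replacement_wins = league_wins - total_wins_above_replacement
    return replacement_wins / teams


# League-wide sums quoted in the question above.
wins_2014 = replacement_team_wins(810.2)
wins_2013 = replacement_team_wins(810.8)
print(round(wins_2014, 1), round(wins_2013, 1))
```

Both seasons come out at roughly 14 wins per team, in line with the answer above. The exact figure still depends on how points are converted to wins.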

I’m loving your Box Plus/Minus stat. One thing I noticed is that guards seem to have better BPM and VORP than bigs. Is this an accurate observation? Is there anything in BPM that favors perimeter players over bigs? Thanks!

It looks like 3PAr is 3PA/FGA. It is not in the bkref glossary, but is in the advanced stats. TO% looks like a number between 0 and 1, not TOV% which is 0-100.
I tried TOV%/100 but that didn’t work. So what is TO%?

Love your work. I’ve been playing around with creating BPM numbers for the last 20 years of ACC basketball (spoiler: Duncan was amazing). Question: On the spreadsheet I downloaded from B-R.com, the “Raw SPM” column (column AS) adds up four of the five to its left, but not “Val/Shot.” When calculating “Raw OBPM” (column BG), it adds up all five columns to the left, including “Val/Shot.” Is this right?

I’m trying to get a handle on Box Plus/Minus, and I have found that the best way to do that is to try to calculate it myself.

So I downloaded the big excel-file on basketball-reference to look into the machinery.

First of all, nice job.

I have a question: How do you calculate the replacement level? You have values that you subtract from each of the components of BPM (“Reb/Vers”, “Defense”, “Offense” and “MPG + Int”), but I cannot see where you get those values from.

There’s quite a bit of VBA in this spreadsheet that has to be activated to get everything to run. I wasn’t using a new version of Excel when I wrote the sheet, so that shouldn’t be an issue. I think I was using 2007?

For everyone else’s reference, the Excel spreadsheet has some “average values” subtracted out of each component of the BPM, simply for presentation within the Excel file (so everyone is centered on 0). Since this is before the team adjustment, those values have no bearing on the final result.

1. In the Bkref sheet, BPM is the sum of Reb/Vers, Offense, Defense, and MPG, but OBPM also includes Val/Shot, and you said that may be a bug. However, the OBPM numbers on the Basketball Reference pages are closer to the result that includes Val/Shot. Does that mean the site itself has the bug?

2. In playoff BPM, how should I compute a team’s net rating and offensive rating? Should I use net rating = the team’s playoff ORtg − the team’s playoff DRtg, and offensive rating = the team’s playoff ORtg − the league’s playoff average ORtg?

3. To calculate a team’s rating in the playoffs, do I just multiply each player’s playoff minutes by his regular season BPM, sum the products, divide by the sum of playoff minutes, and multiply by 5?

Hello, Y2J, thanks for your question!
1. Has that not been fixed yet? I’ll check into it. Val/Shot was just an intermediate term in the offensive calculation. Basketball Reference could certainly have the same bug, since they simply copied my (perhaps buggy!) Excel file to create BPM for their site.
2. The Team Ratings to use in the playoffs are a bit different than the regular season. I take the actual efficiencies and adjust by the strength of their opponents. So if an offensive rating is, say, 113 vs. a playoff average of 107, that would be a +6. Then the strength of schedule adjustment would be applied to that +6 number. I calculate strength of schedule based on the actual playoff lineups, using the players’ regular season BPMs.
So, suppose the Spurs used Duncan and Manu 35 MPG in the playoffs, then their “expected strength” would be based on Duncan and Manu’s regular season BPM, but using the playoff distribution of minutes. Most teams will be better in the playoffs due to their tightened rotation of players.
So back to the original team. It was +6 in the playoffs. It played, say, the Mavs 4 times and the Spurs 7 times. The Mavs had an “expected defensive strength” of +1, and the Spurs an “expected defensive strength” of +6. So the team’s strength of schedule would be (4*1+7*6)/(4+7) = +4.2. Therefore the “Adjusted Offensive Rating” for that team in the playoffs would be +6 + 4.2 = +10.2.
Does that make sense?
3. That’s the calculation for “expected strength” of a team in the playoffs, which I just referred to in the paragraph above. That is used, as discussed, in determining a team’s strength of schedule in the playoffs.
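The playoff steps in this reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for the example, the “expected strength” formula is the one proposed in question 3 (a minute-weighted average of regular season BPM, times 5), and the sample inputs are the ones from the worked example above (+6 raw rating, 4 games against a +1 opponent, 7 games against a +6 opponent).

```python
def expected_strength(players):
    """Expected team strength from playoff minutes and regular season BPM.

    players: list of (playoff_minutes, regular_season_bpm) tuples.
    Assumption: minute-weighted average BPM, times 5 players on the floor.
    """
    total_minutes = sum(minutes for minutes, _ in players)
    weighted_bpm = sum(minutes * bpm for minutes, bpm in players)
    return weighted_bpm / total_minutes * 5


def strength_of_schedule(opponents):
    """Games-weighted average of the opponents' expected strengths.

    opponents: list of (games_played, opponent_expected_strength) tuples.
    """
    total_games = sum(games for games, _ in opponents)
    return sum(games * strength for games, strength in opponents) / total_games


def adjusted_rating(raw_rating, opponents):
    """Rating relative to playoff average, plus strength of schedule."""
    return raw_rating + strength_of_schedule(opponents)


# Worked example from the reply: 113 offense vs. a 107 playoff average,
# 4 games vs. a +1 defense and 7 games vs. a +6 defense.
raw_offense = 113 - 107
schedule = [(4, 1.0), (7, 6.0)]
sos = strength_of_schedule(schedule)
adj_offense = adjusted_rating(raw_offense, schedule)
print(round(sos, 1), round(adj_offense, 1))
```

The unrounded schedule strength is 46/11, about 4.18, so the adjusted offensive rating is about 10.18, which rounds to the +10.2 quoted above.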

And for the last question:
Playoff BPM will not sum to 0, because it’s on the same scale as the regular season. All the players on one team will be adjusted to sum to their team’s playoff adjusted ratings, but that’s the only adjustment of the numbers.

I know there are some adjustments made for BPM, but there’s something I don’t understand.

How can Jonas Valanciunas be a +13 ORtg − DRtg, yet a -1.0 in BPM, when for other players the two generally seem to line up fairly closely? E.g., DeMar DeRozan is a +3 ORtg − DRtg and a +0.9 BPM, and Kyle Lowry is a +15 ORtg − DRtg and a +6.7 BPM. (Yes, I’m a Raptors fan, and yes, Lowry is very underrated and DeMar a bit overrated.)

For someone more comparable to JV, Andre Drummond is a +5 O-D, and 0.6 BPM

Is there a mistake with Jonas’ data, or is there some weird adjustment that affects some players much more than others that I don’t know about? I had a look at the article where it’s calculated, but couldn’t follow it well enough to figure it out.

BPM is not based on ORtg-DRtg at all. ORtg-DRtg depends on who else is on the court at the same time, so it can be wildly inaccurate (noisy) as an individual player measure. Over a huge dataset of players, ones with better ORtg-DRtg should have better underlying BPMs (since they are likely better players) but at an individual level the correlation will be low.

Adjusted team ratings are simply team ratings that are adjusted for the team’s strength of schedule. If every team plays the same schedule, no Strength of Schedule adjustment is needed and the results would be accurate. It is far more important in leagues with imbalanced scheduling (like the NBA western vs. eastern conference).

I recently read some stuff about Box Plus Minus stats on Basketball Reference and did some research on my own, but couldn’t get sure answers on calculating BPM in the playoffs. I also sent you an e-mail with this same message, but I’m also leaving a comment here because I’m not a fan of how fast e-mails come back 🙁

Well anyways, I also tried asking BR, but they referred me to you as you are the creator of the stat.

Ok, so to get to the point, BPM_Team_Adjustment = [Team_Rating*120% – sum(Player_%Min*Player_RawBPM)]/5.

1. Player_%Min gets adjusted only for minutes played in the playoffs, correct?

2. Do the MP and G values in Player_RawBPM get replaced with their respective playoff values, or do I leave them as the regular season values?

3. If I change only the MP and G values in Player_RawBPM, do all the other values besides MP and G in Player_RawBPM come from the regular season?

4. I understood that Team_Rating = NETRtg of the teams. So would Team_Rating = The average of all opposing teams’ NETRtg?

5. Does “Add actual playoff efficiency differential to the calculated strength of schedule to get the adjusted team efficiency differential used in the BPM calculation” mean Playoffs_BPM_Team_Adjustment = [Team_Rating + sum(Player_%Min*Player_RawBPM)]?

6. and do I not have to worry about multiplying by 120% or dividing by 5?

I’m not sure if I ever emailed back, so I’m responding here (Sorry, this was stuck in the queue so long!)

The team adjustment is odd in the playoffs, for sure.

All of the data (minutes, raw BPM) is from the playoffs only, except team rating. There, I use the net rating of the team in the playoffs + the estimated team ratings of their opponents (Strength of Schedule), based on those teams’ regular season performance.

Love your stat and reading about it on Basketball Reference! I do have a question about the formula for your team adjustment. I am not clear on what “S” is in the formula. Is it different than the player’s raw BPM? I thought it was the sum of all of the components, but that would make it the same as their rawBPM which is also in the formula separately. And in your example you plugged in the number 7 but you didn’t plug in minutes percentage or the other one (“S “or raw BPM). Maybe I am just missing something. Could you help me understand?

The team adjustment is to take the team rating (times 120%) and subtract out the Sum of the RawBPM of ALL PLAYERS on the team (times their respective % of minutes). That team adjustment value would then be applied to ALL PLAYERS on the team.

So you have to calculate RawBPM*Player%Min for everyone on the team before you can calculate the team adjustment.

Ohh, gotcha! So the “S” would be the equivalent of a summation symbol (Sigma) for all of the players? And then one last question, how is the team adjustment applied to each player? Is it just added to or subtracted from the player’s rawBPM to get the player’s final BPM?
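A minimal sketch of the team adjustment discussed in this exchange, following the formula quoted earlier in the comments: [Team_Rating*120% – sum(Player_%Min*Player_RawBPM)]/5. Two things here are assumptions rather than statements from this thread: Player_%Min is taken as the fraction of the team’s game minutes a player was on the court (so the values sum to about 5 across a roster), and the adjustment is added to each player’s raw BPM to get the final figure. The function names and roster numbers are hypothetical.

```python
def team_adjustment(team_rating, players):
    """One adjustment value shared by every player on the team.

    players: list of (pct_min, raw_bpm) tuples.
    Assumption: pct_min is the share of the team's game minutes the
    player was on the floor, so a full roster sums to roughly 5.
    """
    weighted_raw = sum(pct_min * raw_bpm for pct_min, raw_bpm in players)
    return (team_rating * 1.2 - weighted_raw) / 5


def adjusted_bpms(team_rating, players):
    """Assumption: the same adjustment is added to every raw BPM."""
    adjustment = team_adjustment(team_rating, players)
    return [raw_bpm + adjustment for _, raw_bpm in players]


# Hypothetical roster: (pct_min, raw_bpm); the pct_min values sum to 5.
roster = [(0.8, 4.0), (0.7, 2.0), (0.7, 1.0), (0.6, 0.0), (0.6, -1.0),
          (0.5, -2.0), (0.4, -1.5), (0.4, -3.0), (0.3, -2.5)]
team_rating = 5.0

final = adjusted_bpms(team_rating, roster)
# After the adjustment, the minute-weighted sum matches 120% of team rating.
check = sum(pct_min * bpm for (pct_min, _), bpm in zip(roster, final))
print(round(team_adjustment(team_rating, roster), 3), round(check, 3))
```

Because the pct_min values sum to 5, dividing by 5 spreads the shortfall evenly over the roster, and the minute-weighted sum of the adjusted values lands on 120% of the team rating.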

Hi Daniel – fantastic job. I was wondering if there is an updated BPM spreadsheet with values for 2016-17 and 2017-18? The reference sheet on the BRef website only has it for the period 1977-78 until 2015-16. Also, is it possible to get BPM data going back to the 1950s and 1960s?

I am not sure how easy it would be to update the spreadsheet for 2017 and 2018, since BBRef reformatted their site completely. You could manually copy in the source data to generate the updated BPMs.

BPM is only generated back to the mid 1970s because some of the stats that it uses for the calculation were not tracked prior to that time. A reduced version of BPM could be developed, but it would not be as accurate.

I recently started doing some statistical analysis for my school’s sports teams, and I have a couple questions about BPM and VORP.

First of all, I am only in high school and I am wondering if you think using BPM and VORP can accomplish the same as it does for professional leagues? I understand there will be some obvious differences, and the numbers for high school will likely be much lower but will it still be effective in trying to determine the players’ quality and contribution to the team compared to other players?

Second, if I were to do it at a high school level, some of the coefficients of the BPM equation, such as ReMPG, would have to change (in high school, games are 32 minutes long). How would this change the “a” coefficient, if at all? Also, are the other coefficients determined for a professional level? If yes, how would I go about determining more appropriate coefficients for the high school level?

To be honest, I would use the coefficients as they are. The ReMPG coefficient could be a bit off, I agree; that is basically a proxy for the added information a coach’s playing time choices give about how good the player is. That could be more variable at other levels of competition.

A simpler, more linear (no interaction terms) version might be better for levels of competition with fewer games and fewer minutes. The interaction terms are overly sensitive to outlier input values, which are far more common in low-minutes samples.