Learn a Stat: Box Plus Minus and VORP
Welcome back to Hack a Stat! In this chapter of Learn a Stat, which is also the last one (for now), we will discover Box Plus Minus and VORP.
Box Plus-Minus is another all-in-one statistic that tries to condense all of a player’s contributions into a single number. It is Daniel Myers’ idea: a stat with characteristics similar to PER or Win Shares, but in the form of a plus-minus. In other words, the number (positive or negative) lets us gauge the player’s impact on the court, calculated from box scores.
Although the BPM can be calculated directly, the Offensive Box Plus Minus (OBPM) and the Defensive Box Plus Minus (DBPM) can also be obtained: as the names suggest, the first refers only to the offensive phase, the second to the defensive phase. Because the box score covers nearly every aspect of the offensive phase, the OBPM rests on solid data and is quite reliable; the same cannot be said for the DBPM, as several defensive contributions are not recorded in the box score.
The VORP, on the other hand, stands for Value Over Replacement Player and is a statistic dependent on the BPM: it takes BPM values to create a unique rating scale for the League in which the Replacement Player will be the reference value.
Definition and starting data
The Box Plus Minus expresses the player’s impact on the court as a point differential (positive or negative) per 100 possessions. The Offensive and Defensive versions have the same meaning but refer only to the offensive and defensive impact. As previously mentioned, the Box Plus Minus starts from the data of a normal box score, although the formula actually uses the advanced statistics derived from the classic box score values rather than the raw counts.
The VORP converts the Box Plus Minus into an estimate of the player’s contribution measured against the Replacement Player. The reference value is -2, the BPM assigned to the Replacement Player. The result is a single comparison scale whose threshold is 0 (i.e. the Replacement Player’s VORP): a VORP greater than 0 indicates a positive contribution, while a lower one identifies a player who is playing poorly.
Both statistics are based on seasonal values.
We need various data to calculate these two stats:
Box Plus Minus
- Minutes played [MP];
- Team minutes played [TeMP];
- Games played [GP];
- Offensive Rebound Percentage [OR%];
- Defensive Rebound Percentage [DR%];
- Total Rebound Percentage [TR%];
- Steal Percentage [ST%];
- Block Percentage [BLK%];
- Assist Percentage [Ast%];
- Usage Percentage [Usg%];
- Turnover Percentage/Ratio [TO%];
- True Shooting Percentage [TS%];
- Team True Shooting Percentage [TeTS%];
- 3-point frequency [%3P];
- League 3-point frequency [Lg%3P];
- Team Net Rating [TmNetRtg];
- Team Offensive Rating [TmOffRtg];
- League Offensive Rating [LgOffRtg];
VORP
- Box Plus Minus [BPM];
- Minutes played [MP];
- Team minutes played [TeMP];
- Team games played [TeGP].
Formulas and calculation
Box Plus Minus
As with the PER, the BPM is calculated by adding up the player’s different contributions to obtain a raw value, gBPM, which is then calibrated on team performance to obtain the actual Box Plus Minus. Each addend of the raw value relates to one or more contributions and always carries a corrective factor.
NB1: An updated BPM has recently been released; the revision certainly changed the corrective factors. As soon as they are made public, I will update this Learn a Stat.
NB2: unless otherwise specified, the values are used as usually displayed (e.g. for a TR% of 25, use 25 in the formula).
The first term takes into account the minutes played:
There is a simple reasoning behind the first term: when comparing BPM across players, you come across both high-minute and low-minute players. Broadly speaking, those with high minutes will have faced more critical moments of the game than the latter. To give the former greater weight, four 0-minute games are added when averaging: the total minutes are divided by the number of games played plus four. Myers set this value for the NBA. For the statistics that I calculate for Serie A, Euroleague and Eurocup I decided to add two 0-minute games instead, given that the total number of matches is much lower (30/34 games against 82).
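The padded average described above can be sketched in a few lines of Python. The function and parameter names are my own (hypothetical), and the corrective factor that multiplies this value inside the raw BPM is deliberately left out.

```python
def regressed_minutes_per_game(minutes_played: float,
                               games_played: int,
                               padding_games: int = 4) -> float:
    """Minutes per game after adding `padding_games` 0-minute games.

    padding_games=4 is the NBA value mentioned in the text;
    2 is the value the text uses for Serie A, Euroleague and Eurocup.
    """
    return minutes_played / (games_played + padding_games)


# A starter: 2000 minutes in 70 games (raw average about 28.6).
starter = regressed_minutes_per_game(2000, 70)

# A low-minute player: 200 minutes in 10 games (raw average 20.0).
reserve = regressed_minutes_per_game(200, 10)

print(round(starter, 2), round(reserve, 2))
```

The padding barely moves the starter but cuts the reserve's average sharply, which is exactly the extra weight for high-minute players that the term is meant to provide.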
The following terms are related to offensive and defensive rebounds:
The fourth and fifth relate to steals and blocks:
The sixth takes into consideration the assists, while the next takes the turnovers:
The percentage of turnovers is multiplied by the Usage in order to consider it on team possessions (as already happens for ST% and BLK%) since the TO% (or TO Ratio) is calculated on individual possessions.
NB: the TO% must be used as a pure number and not as a percentage (e.g. for a TO% of 25, use 0.25).
The next term refers to offensive production and differs from the previous ones, which are linear:
Let’s go in order: the first part calculates the individual possessions concluded with a shot or an assist (1 – TO% provides exactly this) and redistributes them on team possessions through the Usage. The second part calculates the offensive production starting from the player’s TS% and compares it with the team’s: in this way, a player with excellent percentages in a team with horrible percentages will stand out more than one with the same percentage who plays in a team with a better TS%. We then have a term relating to assists and finally a term relating to the frequency of 3-point shots attempted by the player, compared with that of the League.
NB: in this case too, the TO% must be used as a pure number and not a percentage value, as well as the 3-point frequencies and the shooting percentages.
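To make the structure of this term concrete, here is a rough Python sketch. It only mirrors the description above: the weights `w_ts`, `w_ast` and `w_3p` are placeholders of mine, not Myers' corrective factors, and the exact way the parts are combined is an assumption. What it does show reliably is the unit handling: TO%, TS% and the 3-point frequencies are converted to pure numbers, while Usg% and Ast% stay in their displayed form.

```python
def scoring_term(usg_pct: float, to_pct: float,
                 ts_pct: float, team_ts_pct: float,
                 ast_pct: float,
                 three_freq_pct: float, league_three_freq_pct: float,
                 w_ts: float = 1.0, w_ast: float = 0.0,
                 w_3p: float = 0.0) -> float:
    """Sketch of the offensive-production term (placeholder weights)."""
    # Percentages that must enter the formula as pure numbers.
    to_rate = to_pct / 100.0
    ts = ts_pct / 100.0
    team_ts = team_ts_pct / 100.0
    three_freq = three_freq_pct / 100.0
    league_three_freq = league_three_freq_pct / 100.0

    # Possessions ended with a shot or an assist, spread over team possessions.
    involvement = usg_pct * (1.0 - to_rate)

    # Efficiency relative to the team, plus assist and 3-point parts.
    production = (w_ts * (ts - team_ts)
                  + w_ast * ast_pct
                  + w_3p * (three_freq - league_three_freq))
    return involvement * production


# Same player, two teams: the weaker-shooting team makes him stand out more.
on_poor_team = scoring_term(25, 12, 60, 52, 20, 40, 35)
on_good_team = scoring_term(25, 12, 60, 58, 20, 40, 35)
print(on_poor_team > on_good_team)
```

With the default placeholder weights only the TS% comparison is active, which is enough to see why the term is not linear: usage and efficiency multiply each other instead of being added.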
The last term may seem strange:
It multiplies two quantities that seem to have little to do with each other: total rebounds and assists. In fact, this term seeks to reward all-around players who contribute by doing everything on the court. The most striking example is Westbrook; in essence, this term rewards triple-double players.
Adding all the terms presented you get the raw Box Plus Minus (pay attention to the seventh term, which must be subtracted):
To obtain the pure value it is necessary to calculate a team adjusted coefficient [TeAdC]; each team will have a different value. The formula for its calculation is the following:
As for the coefficient of 1.20, Myers states that teams with a positive Net Rtg (usually leading in the game) play slightly below their potential, while those with a negative Net Rtg (usually trailing) play slightly above their potential (due to the effort made in an attempt to win). Myers therefore wanted the TeAdC to take this into account: multiplying by 1.20 raises positive Net Ratings by 20% and lowers negative Net Ratings by 20%.
The Net Ratings used in the NBA BPM are adjusted: they take into account the strength of schedule of each team. As you know, NBA teams face more opponents from the same conference. Currently, a Net Rtg of +8 in the Western Conference is better than a +8 in the East, because we know that there are more high-level teams on the western side. The analysts then correct the ratings to take this into account.
In LBA, as in Euroleague, there are no such problems. We can use normal Net Ratings without dilemmas.
Then, we find the sum of each player’s raw BPM multiplied by his percentage of minutes played.
The difference between the scaled Net Rating and this minutes-weighted sum of gBPM is divided by 5 to distribute the value across the five players that make up a line-up. In short, TeAdC = (1.20 × TmNetRtg - sum of (percentage of minutes × gBPM)) / 5. The coefficient can be positive or negative depending on team performance.
The final step is to add the Team Adjusted Coefficient to the raw BPM to obtain the Box Plus Minus: BPM = gBPM + TeAdC.
The result can be positive or negative.
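The team adjustment and the final sum can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and each player's share of minutes is passed in directly as a fraction (for example MP divided by one fifth of the team minutes, so that the shares of a roster add up to 5; that definition is my assumption, not something the text spells out).

```python
from typing import Iterable, Tuple


def team_adjusted_coefficient(team_net_rtg: float,
                              roster: Iterable[Tuple[float, float]],
                              scale: float = 1.20,
                              lineup_size: int = 5) -> float:
    """TeAdC from the team Net Rating and (minute_share, raw_bpm) pairs."""
    weighted_raw = sum(share * raw_bpm for share, raw_bpm in roster)
    return (scale * team_net_rtg - weighted_raw) / lineup_size


def box_plus_minus(raw_bpm: float, te_adc: float) -> float:
    """Final BPM: raw value plus the team adjusted coefficient."""
    return raw_bpm + te_adc


# Illustrative roster (made-up numbers): (minute share, raw BPM).
roster = [(0.8, 3.0), (0.8, 1.0), (0.7, 0.5), (0.7, 0.0),
          (0.6, -1.0), (0.5, -0.5), (0.5, -2.0), (0.4, -3.0)]

te_adc = team_adjusted_coefficient(5.0, roster)
print(round(te_adc, 2), round(box_plus_minus(3.0, te_adc), 2))
```

Every player on the same team receives the same coefficient, so the adjustment shifts the whole roster up or down without changing the order of the players within it.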
Offensive Box Plus Minus
The calculation of the offensive BPM is identical to that of the Box Plus Minus. The differences lie in the coefficients (they all have different values) and in the Team Adjusted Coefficient: in this case the Net Rating is replaced by the difference between the team Offensive Rating and the League Offensive Rating.
It is sufficient to follow the BPM calculation procedure, being careful to replace the values of the coefficients, to derive the offensive value.
Defensive Box Plus Minus
This last BPM will be calculated by the simple subtraction between the Box Plus Minus and the Offensive Box Plus Minus, given that the sum of the offensive and defensive contributions provides the total player’s contribution.
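As a tiny illustration of that subtraction (the function name is hypothetical and the numbers are made up):

```python
def defensive_bpm(bpm: float, obpm: float) -> float:
    """DBPM as what remains of the total BPM once the offensive part is removed."""
    return bpm - obpm


# A player with a total BPM of 1.5 and an OBPM of 2.3
# is estimated as a slight negative on defense.
dbpm = defensive_bpm(1.5, 2.3)
print(round(dbpm, 1))
```

Because the DBPM is a remainder, any error in either the BPM or the OBPM ends up in it, which is one more reason to treat it with caution.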
VORP
The Value Over Replacement Player is also simple to obtain: VORP = (BPM - (-2)) × percentage of minutes played × TeGP / total games.
Since the BPM chosen for the Replacement Player is -2, the first subtraction gives the distance between that player and the one being analyzed; the result is then multiplied by the percentage of minutes played and by the ratio between the team games played and the total number of games to play (82 for the NBA, 30 or 34 for Serie A and Euroleague).
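The VORP calculation can be sketched like this. The names are my own, and the share of minutes is passed in as a fraction already computed by the caller; how exactly that share is derived from MP and TeMP is an assumption left outside the function.

```python
REPLACEMENT_BPM = -2.0  # BPM assigned to the Replacement Player


def vorp(bpm: float, minute_share: float,
         team_games: int, season_games: int = 82) -> float:
    """Value Over Replacement Player.

    minute_share: fraction of the available minutes the player was on court.
    season_games: 82 for the NBA; 30 or 34 for Serie A and Euroleague.
    """
    return (bpm - REPLACEMENT_BPM) * minute_share * team_games / season_games


# A BPM of -0.8 is above the replacement level, so the VORP is positive.
print(round(vorp(-0.8, 0.5, team_games=30, season_games=30), 2))

# A player exactly at replacement level sits at 0, whatever his minutes.
print(vorp(-2.0, 0.9, team_games=30, season_games=30))
```

Two players with the same BPM but different shares of minutes get different VORP values, which is the property used in the analysis further on.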
How to read and analyze
We take two teams and analyze players’ BPM: Euroleague 2017/2018, Efes and Real Madrid. Here are the Box Plus Minus and some intermediate values.
Some players have a positive raw contribution, but the corrective coefficient (negative for both teams) leads to a negative Box Plus Minus: in other words, although some players performed well, the team’s results have scaled down their efforts. In the Istanbul team’s ranking, no player has a positive BPM: this makes sense, given that Efes was the worst team of that season.
At Real Madrid, instead, we find several players with a positive BPM: Doncic, the best in this ranking, but also Fernandez, Randolph, Ayon, Campazzo, Thompkins, and Tavares. We also have some players with a Box Plus Minus around zero: as you can guess, averaging a positive BPM is not so easy. Having a value around 0 means being an average player, one who brings his contribution to the team cause. A negative contribution only begins around -2 (the Replacement Player’s value). Players like Causeur or Randle, who sit above that threshold, are therefore still positive contributors.
If we want to look at the offensive part instead, here is what we get:
We notice a situation very similar to the previous one. Here, offensive players like McCollum manage to obtain positive values (or in any case greater than -2) even in very low-ranked teams.
Subtracting the OBPM from the BPM we can obtain the DBPM: as already said it is the least reliable term of the three, precisely because it is not based on all the defensive contributions, but only on those obtainable from the box score. This is a defect, which must always be considered when talking about Box Plus Minus.
In the end, the VORP:
The VORP allows us to skip the reasoning done previously: Causeur’s Box Plus Minus, for example, is equal to -0.8; does this mean that he has contributed positively or negatively to the cause of Real? His value is greater than -2 (the reference value), so he made a small positive contribution. The VORP spares us this numerical comparison by placing Fabien directly on a single comparison scale.
In addition, the VORP takes into account the minutes played, thus scaling down the contribution of players with few minutes. For example, McCollum and Balbay have almost the same BPM, but different minutes played. This cannot be seen directly from the Box Plus Minus, but with the VORP we immediately find that the American, having played more, contributed more substantially to the team’s performances.
In conclusion, we can say that BPM is a fascinating advanced statistic, but one must know its limits in order to use it wisely. The VORP, on the other hand, is a useful statistic to get a first indication of which are the best players in the league.
This Learn a Stat ends here. See you soon, your friendly neighborhood Cappe!