Total Offense Score

The goal of this article is to create a new metric to judge a single player’s offensive production. I know there is a ton out there already (PER, OWS, OBPM, etc.) but I wanted to try my hand at creating one and possibly find ways improve upon them. My metric will mostly be based on total production and how efficient that player was at achieving that production.

Method

I measured total production as points generated (PGEN), which is the sum of the points the player generated either through scoring or passing, for which we will include second assists and free throw assists. The formula is as follows:

\[PGEN = PTS + \frac{AST\:PTS\:CREATED}{AST + FT\:AST} * (AST + 2nd\:AST + FT\:AST)\]

We are getting our stats from nba.com/stats, which includes FT AST in their calculation for AST PTS CREATED but does not include 2nd AST so we find the player’s points per assist and multiply that by the total of the three types to estimate the total points generated through passing.

To measure efficiency I will be using a stat called Individual Offensive Efficiency with Assist Opportunities (IOEwAO) which is PGEN divided by the number or possessions that player terminated (NPTwAO) either by shooting, making a pass that leads to a shot, or turning the ball over. I know those are complicated abbreviations but IOEwAO can differ from just Individual Offensive Efficiency which doesn’t take into account potential assists. I will be using assist opportunities as it is more indicative of how often that player ends a possession.

\[NPTwAO = FGA + .44*FTA + TOV + ASSIST\:OPPORTUNITIES + 2nd\:AST - OREB\] \[IOEwAO = \frac{PGEN}{NPTwAO}\]

You may notice that I included offensive rebounds in the NPTwAO calculation, this is because NPT stands for net possessions terminated and offensive rebound earns the team another possession. Also, FT AST was not included because on nba.com/stats Assist Opportunities and FT AST overlap.

We multiply free throws by 0.44 since not every time does a player shoot two free throws for a foul. 0.44 gives us a better estimate of the actual number of possessions.

Ok so this is the bulk of the information that will be used, now I have to get the data. Using the nba_api repository on github, I wrote a Python Script to scrape the necessary data for every active player in the 2019-20 NBA season into a spreadsheet, and calculated their PGEN, NPTwAO, and IOEwAO along with some other metrics that this article will not focus on.

import pandas as pd

df = pd.read_csv('offensive_efficiency_19_20.csv')
df[['PLAYER_NAME', 'TEAM_ABBREVIATION', 'GP', 'PTS', 'FGA', 'FTA',
    'TOV', 'OREB', 'PGen', 'NPTwAO', 'PGen/G', 'NPTwAO/G', 'IOEwAO']].head(10)
PLAYER_NAME TEAM_ABBREVIATION GP PTS FGA FTA TOV OREB PGen NPTwAO PGen/G NPTwAO/G IOEwAO
0 Aaron Gordon ORL 62 894 767 224 100 107 1501.200000 1296.56 24.212903 21.27 1.157833
1 Aaron Holiday IND 66 627 563 87 88 22 1287.780876 1133.28 19.511831 17.56 1.136331
2 Abdel Nader OKC 55 345 263 66 43 14 449.268293 391.04 8.168514 7.16 1.148906
3 Adam Mokoka CHI 11 32 28 4 2 7 44.500000 33.76 4.045455 3.07 1.318128
4 Admiral Schofield WAS 33 99 92 15 7 7 143.705882 128.60 4.354724 3.96 1.117464
5 Al Horford PHI 67 798 709 80 80 103 1512.297945 1156.20 22.571611 17.59 1.307990
6 Al-Farouq Aminu ORL 18 78 86 29 17 24 137.400000 140.76 7.633333 8.04 0.976130
7 Alec Burks PHI 66 993 768 265 91 48 1522.879808 1302.60 23.073936 20.02 1.169108
8 Alen Smailagic GSW 14 59 40 19 11 10 93.000000 66.36 6.642857 4.74 1.401447
9 Alex Caruso LAL 64 349 291 94 53 17 679.244604 603.36 10.613197 9.66 1.125770

The full spreadsheet contains 41 columns. If you are interested in checking it out, it can be found on my github, but the above columns are what we are going to use in this article.

First let’s find the league average in points per possession. This can be closely estimated by dividing the total points by the total possessions, using the .44 weight for free throws mentioned before.

\[PPP = \frac{PTS}{FGA + .44*FTA + TOV - OREB}\]
ppp = df['PTS'].sum() / (df['FGA'].sum() + .44*df['FTA'].sum() + df['TOV'].sum() - df['OREB'].sum())
print(ppp)
1.0873277667011825

So roughly 1.09 points are scored per possession in the NBA. Having an IOEwAO above this would mean the player produces points at an above average rate, which is good, and should be rewarded in our metric. A below average rate should be punished. To accomplish this, I will take the players PGEN/G and raise it to the power of their IOEwAO divided by PPP which we will call the efficiency factor.

\[PGEN/G^{\frac{IOEwAO}{PPP}}\]

This makes it so a player’s PGEN/G value increases at an exponential rate if it was done on good efficiency, and decreases at an exponential rate on poor efficiency. For example:

print('PGen = 30, IOEwAO = 1, Score =', 30**(1/1.087))
print('PGen = 30, IOEwAO = 1.2, Score =', 30**(1.2/1.087))
PGen = 30, IOEwAO = 1, Score = 22.85057844894064
PGen = 30, IOEwAO = 1.2, Score = 42.724465491574115

Using this, lets see how players stack up against each other. Note that this is a average per game metric which differs from some like OWS and OPBM.

df['Score'] = df['PGen/G'] ** (df['IOEwAO'] / ppp)
df[['PLAYER_NAME', 'NPTwAO/G', 'PGen/G', 'IOEwAO', 'Score']].sort_values('Score', ascending = False).head(25)
PLAYER_NAME NPTwAO/G PGen/G IOEwAO Score
90 Damian Lillard 42.07 52.270929 1.261881 98.650533
364 Mitchell Robinson 5.51 11.190416 2.049158 94.762398
215 James Harden 47.50 56.209223 1.200825 85.596552
383 Nikola Jokic 29.51 38.220744 1.324564 84.630986
424 Rudy Gobert 12.11 19.491754 1.621689 83.896069
312 Kyrie Irving 37.89 46.461379 1.254221 83.747851
221 Jarrett Allen 9.21 15.802198 1.734327 81.657820
325 Luka Doncic 45.48 53.628274 1.196849 80.091785
281 Karl-Anthony Towns 29.36 37.659113 1.309635 79.078980
319 LeBron James 44.95 52.825918 1.196271 78.607847
175 Giannis Antetokounmpo 37.82 45.957049 1.225386 74.717450
76 Clint Capela 11.53 18.265734 1.608916 73.593774
245 John Collins 18.46 26.192835 1.426170 72.465101
474 Trae Young 48.45 55.033549 1.158617 71.573127
133 Domantas Sabonis 24.42 31.984065 1.329774 69.262019
33 Ben Simmons 32.90 39.960157 1.235801 66.119093
23 Anthony Davis 28.05 35.397849 1.273747 65.244666
186 Hassan Whiteside 12.41 18.975417 1.542605 65.070609
42 Bradley Beal 41.69 47.880432 1.169778 64.203828
239 Jimmy Butler 30.16 36.820908 1.253294 63.846615
110 DeMar DeRozan 31.95 38.943015 1.230290 63.028871
444 Stephen Curry 35.49 41.270000 1.210690 62.941262
125 Devin Booker 39.49 45.710849 1.177698 62.804072
31 Bam Adebayo 23.29 30.257050 1.319444 62.653395
282 Kawhi Leonard 35.24 41.560621 1.199544 61.057069

This is the top 25 players ranked by our new score.

Our current formula would tell us that last year, Mitchell Robinson was one of the most productive offensive players in the league. It’s understandable if you don’t believe this, but he did have a superb IOEwAO at 2.005, much better than the rest of the top 25. However, if you notice, he only had about 5.5 possessions terminated a game so it was much easier to reach his high efficiency compared to the number 1 player, Damian Lillard, who had 42 possessions terminated per game.

This is a flaw that we need to account for. To do this I will introduce another equation and variable called possession weight.

\[POS\:WT = 1 - 10^{-(\frac{NPTwAO/G}{Average\:NPTwAO/G})}\]

Now this is a little complicated so I will do my best to explain. Basically possession weight will come out with a value in between 0 and 1. The larger the player’s NPTwAO/G, the closer to 1 and visa versa. The score from above will be multiplied by this weight. Below is a plot of the weights by NPTwAO/G.

import matplotlib.pyplot as plt

npt_avg = df['NPTwAO'].sum() / df['GP'].sum()

fig, ax = plt.subplots(1, 1, figsize=(10, 4))
plt.plot(df['NPTwAO/G'], (1 - 10 ** (-1*df['NPTwAO/G'] / npt_avg)), '.', alpha=.5)
plt.axvline(14.3, color='k', linestyle='dashed', linewidth=1)
# print(df['NPTwAO/G'].mean())
plt.show()

If the player’s NPTwAO/G is equal to the league average, the weight comes out to 0.90, which is the dotted vertical line. High above the average and the score will barely change as the weight get closer to 1, way below and it is affected significantly.

Results

Now let’s look at our new updated rankings.

df['Eff_Factor'] = (df['IOEwAO'] / ppp)
df['Pos_Weight'] =  (1 - 5 ** (-1*df['NPTwAO/G'] / npt_avg))
df['Score'] = df['PGen/G'] ** df['Eff_Factor'] * df['Pos_Weight']


df[['PLAYER_NAME', 'TEAM_ABBREVIATION', 'NPTwAO/G', 'PGen/G', 'IOEwAO', 'Eff_Factor', 'Pos_Weight', 'Score']].to_csv('player_score_results.csv')

df[['PLAYER_NAME', 'TEAM_ABBREVIATION', 'NPTwAO/G', 'PGen/G', 'IOEwAO', 'Eff_Factor', 'Pos_Weight', 'Score']].sort_values('Score', ascending = False).head(25)
PLAYER_NAME TEAM_ABBREVIATION NPTwAO/G PGen/G IOEwAO Eff_Factor Pos_Weight Score
90 Damian Lillard POR 42.07 52.270929 1.261881 1.160534 0.991203 97.782672
215 James Harden HOU 47.50 56.209223 1.200825 1.104382 0.995224 85.187778
312 Kyrie Irving BKN 37.89 46.461379 1.254221 1.153490 0.985920 82.568697
383 Nikola Jokic DEN 29.51 38.220744 1.324564 1.218183 0.963854 81.571881
325 Luka Doncic DAL 45.48 53.628274 1.196849 1.100726 0.994006 79.611701
319 LeBron James LAL 44.95 52.825918 1.196271 1.100194 0.993638 78.107706
281 Karl-Anthony Towns MIN 29.36 37.659113 1.309635 1.204453 0.963238 76.171910
175 Giannis Antetokounmpo MIL 37.82 45.957049 1.225386 1.126971 0.985809 73.657125
474 Trae Young ATL 48.45 55.033549 1.158617 1.065564 0.995709 71.265972
133 Domantas Sabonis IND 24.42 31.984065 1.329774 1.222974 0.935912 64.823166
33 Ben Simmons PHI 32.90 39.960157 1.235801 1.136548 0.975316 64.486989
42 Bradley Beal WAS 41.69 47.880432 1.169778 1.075828 0.990818 63.614334
245 John Collins ATL 18.46 26.192835 1.426170 1.311629 0.874687 63.384272
23 Anthony Davis LAL 28.05 35.397849 1.273747 1.171447 0.957401 62.465277
424 Rudy Gobert UTA 12.11 19.491754 1.621689 1.491445 0.743980 62.416989
125 Devin Booker PHX 39.49 45.710849 1.177698 1.083113 0.988240 62.065480
444 Stephen Curry GSW 35.49 41.270000 1.210690 1.113454 0.981556 61.780342
239 Jimmy Butler MIA 30.16 36.820908 1.253294 1.152637 0.966403 61.701542
110 DeMar DeRozan SAS 31.95 38.943015 1.230290 1.131480 0.972531 61.297541
282 Kawhi Leonard LAC 35.24 41.560621 1.199544 1.103204 0.981029 59.898776
310 Kyle Lowry TOR 33.65 39.849228 1.212036 1.114692 0.977313 59.430490
31 Bam Adebayo MIA 23.29 30.257050 1.319444 1.213474 0.927224 58.093709
73 Chris Paul OKC 30.27 36.582512 1.232470 1.133485 0.966816 57.186114
76 Clint Capela ATL 11.53 18.265734 1.608916 1.479697 0.726716 53.481766
221 Jarrett Allen BKN 9.21 15.802198 1.734327 1.595036 0.645207 52.686162

Based on our new offensive score metric, these were the top 25 offensive players of the 2019-20 season. Note that a few of these players, like Steph Curry and Kyrie Irving missed a lot of games but had high scores the games they did play. Outside of that, I would say this is a pretty reasonable list, to make it to the top you need both volume and efficiency. I also believe playmaking is better rated in this metric than others by including assist opportunities, 2nd assists, and free throw assists.

What I like most about this metric is that volume at the cost of efficiency will hurt your score. For example, lets take an estimated Damian Lillard game where he has 42 NPTwAO and 52 PGen. Now let’s say he continues to take 5 more shots, and makes 2 of them shooting 40% for 4 points, only producing .8 ppp. He would now have 47 NPTwAO and 56 PGen. Let’s look at the difference in final scores between the two instances of the game.

score = 52 ** ((52/42)/ppp) * (1 - 5 ** (-1*42 / npt_avg))
print('Score before shots', score)
score_final = 56 ** ((56/47)/ppp) * (1 - 5 ** (-1*47 / npt_avg))
print('Score after shots', score_final)
Score before shots 89.14031151364419
Score after shots 81.93279421061916

One flaw I have with PER, whose formula can be found here, is that a player can improve their score with volume as long as they are shooting above 30.4% from the field. If I had a player shooting around 35%, I would not want him to increase his volume. My metric makes it so increasing volume is curved by the efficiency factor heavily so that volume is only good when they are producing above league average efficiency.

Flaws

Now I will admit there are some flaws to this metric. Being a purely offensive stat focused metric, it leaves out stats like defensive rebounds, blocks, and steals which could lead to offensive possessions and may unfairly omit a good defender’s actual contribution. It doesn’t adjust for pace so plyers on teams that play faster might get more opportunities to generate points leading to higher scores. Using assist opportunities puts reliance on teammates to make shots, so a player’s playmaking score might be a lot lower on a bad team than it would a good team, as you probably noticed there are only a few instances of a player on a bad team. I didn’t really know how to go about this because leaving out unconverted opportunities misconstrues that player’s usage, but if I were to try and regularize the conversion of assist opportunities it may unfairly punish those who set up their teammates better which is why their team is better in the first place.

This isn’t really a flaw in the stat but another issue is that 2nd assists, free throw assists, and assist opportunities only go back to the 13-14 season when player tracking started so we will be unable to compare offensive scores historically unfortunately, unless we figure out a way to estimate these stats.

Conclusion

I feel this is a good metric to compare offensive performances. This is not to show how skilled a player is necessarily as a lot of it is based on opportunity, but all the time we spend on comparing performance, I think this metric does a better job than most when it comes to offense. It can be used for single game matchups as well as whole seasons differentiating itself to metrics like offensive win shares, and in my opinion pretty unbiased to position.

While there are some flaws and can get a bit complicated, I believe this is a simpler metric to understand than other popular ones that get even more complicated, as it can be boiled down to pretty much volume affected by efficiency. A future task would be to figure out how to incorporate defense which can be extremely hard to measure in numbers.