Posted: Sun Jun 03, 2018 2:04 pm Post subject: Basketball Analytics Discussion Thread
I'd like to start a healthy discussion of basketball analytics. I've generally just been a consumer of it, not really looking at the breakdowns on formulas, but now that I'm looking at some I have questions. Perhaps the answers are simple, or perhaps it will fuel discussion, or maybe I'll be the only person interested in this, but giving it a shot and starting with Usage Rate.
1) Why are FTA multiplied by 0.44?
2) Why use team/player minutes instead of the equivalent for possessions?
3) Why aren't we including assists if the formula includes turnovers?
.44 estimates the number of possessions per free throw. And-1 free throws and technical free throws do not expend a possession. Otherwise, you could determine the number of possessions with FTA x .5 (because you get two free throws for a shooting foul, and that expends the possession). If you did that, then a FG plus an and-1 FT would add up to 1.5 possessions. .44 is not a magical number -- it was derived empirically from league data and is just an estimate.
An assist does not expend a possession. It is the shot that expends the possession. A turnover does expend a possession.
As for the second question, I would need to think through the logic of the team minutes/(5 x player minutes) component. Offhand, I can't remember what that is adjusting for.
1) FTA is multiplied by 0.44 to account for and-ones and fouls on 3s (and technical FTs). This has been standard since Hollinger came up with PER (maybe before that).
2) I don't know this for a fact, but I suspect the reason is simply that it's hard to know how many possessions a player played. Like, if there are 98 possessions in a game and Brook Lopez played 24 minutes, did he play 49 possessions? Probably not, the Lakers probably played slower with Lopez on the court. Getting that information is hard, so minutes played is good enough for an estimate.
3) The reason they don't include assists is because that's simply not what it measures. It estimates how many possessions a player has used. So it's shots, free throws, and turnovers. If you are interested in a measure that includes assists, you can check out Hollinger's player stats, which include assists. (It doesn't include passes that lead to FTs though, so it's still not perfect.) Here's one from 05-06: http://insider.espn.com/nba/hollinger/statistics/_/sort/usageRate/year/2006
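For anyone who wants to see the pieces together, here is a minimal Python sketch of the box-score usage rate being discussed, as I understand the Basketball-Reference style definition. The function and argument names are my own, and `ft_weight` is exposed as a parameter only so the 0.44 convention is visible, not because any site lets you change it.

```python
def usage_rate(fga, fta, tov, mp, tm_fga, tm_fta, tm_tov, tm_mp, ft_weight=0.44):
    """Estimate the percentage of team plays a player used while on the floor."""
    if mp <= 0:
        raise ValueError("player minutes must be positive")
    # Plays the player ended: shots, possession-ending free throws, turnovers.
    player_plays = fga + ft_weight * fta + tov
    # The same count for the whole team over the whole game.
    team_plays = tm_fga + ft_weight * tm_fta + tm_tov
    # tm_mp / 5 is the length of the game in minutes (240 / 5 = 48), so this
    # is the fraction of the game the player was on the court.
    share_of_game = mp / (tm_mp / 5)
    # Scale team plays down to the ones that happened during the player's minutes.
    return 100 * player_plays / (team_plays * share_of_game)


# Made-up box score: 20 FGA, 10 FTA, 3 TOV in 36 minutes of a 48-minute game.
print(round(usage_rate(20, 10, 3, 36, 85, 25, 14, 240), 1))
```

The minutes term is standing in for "team plays while this player was on the court", which the plain box score does not list. Five players who each play the whole game and split the plays evenly come out at 20 percent apiece.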
Thanks for your thoughts. An assist does expend the possession though, someone is scoring. What am I missing?
Joined: 02 May 2005 Posts: 90513 Location: Formerly Known As 24
Posted: Sun Jun 03, 2018 2:25 pm Post subject:
An assist is a pass, and that means someone else has the ball. Think of a possession as belonging to whatever ends it: a shot, free throw, or turnover. Any other action continues the same possession. This way only one player is awarded each possession. _________________ “We must always take sides. Neutrality helps the oppressor, never the victim. Silence encourages the tormentor, never the tormented.” ― Elie Wiesel
1) Thanks for the response. so the .44 is an average they use across the board, but why not use the actual percentage for the specific game(s)?
2) Agreed, but it seems that with as much that is tracked these days, having actual counts of possessions wouldn't be that hard
3) i see what you're saying on assists (and it's hard to include assists because you'll be double counting - player A assists but player B shoots, both get counted), but seems like something is missing if you don't include it.
Guess i'm just thinking there's a better way to go about this.
An assist is a pass, and that means someone else has the ball. Think of a possession as something that ends with the other team getting the ball: a made shot, free throws, missed shot (although an offensive rebound “steals” a possession, hence their value), or turnover.
I'd argue a "steal" steals a possession, hence its value.
An offensive rebound simply extends a possession so it hasn't ended yet. But it'd be a huge effort to adjust USG% to only include missed FGA that result in a defensive rebound, and the difference would be mostly negligible, so what's the point.
Gotcha. I know this is a popular metric, but the more I think about it, the more something just feels like it's missing something.
Well, for #1 and #2, it's useful to understand how basketball reference is calculating these things. I don't know their internal software, but I do know that they host box scores of basically every NBA game ever. It's very, very easy to get from each box score the number of FTs someone shot, multiply it by 0.44, add it to FGA and TO, and then scale it by minutes played. In fact, I think anyone's laptop could probably compile the USG stats of the entire league instantaneously, given basketball reference's internal database.
Implementing #1 and #2 is not nearly as trivial. For example, let's say they want to use the actual number instead of 0.44. Well, how do you figure out how many And-1s someone got? Or how many 3 point fouls, or technical fouls, they shot? You can certainly do data mining of play-by-play data, but that's too much work for something as off-hand as USG%. Likewise for # of possessions, which isn't available on the basic box score. I bet on NBA.com you can pull this info rather easily, but USG% predates NBA.com letting you pull this information, and basketball-reference probably can't bother updating their code to get a marginal improvement in accuracy.
As for #3, absolutely. It's why CP3's USG% has always been so low given how ball dominant he is. The bball-ref version is useful because it's more precise in what it measures, but it doesn't measure "usage" in the way people think of it. It's still a useful shorthand, but it's inaccurate for sure and it causes a lot of confusion since people think it's synonymous with being ball dominant.
Last edited by tox on Sun Jun 03, 2018 2:40 pm; edited 1 time in total
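To make the "actual number instead of 0.44" idea concrete, here is a rough Python sketch of how a per-sample weight could be computed if you already had free-throw trips tallied by type from play-by-play. The trip categories, the names, and the counts in the example are all made up for illustration, and the list is simplified (flagrant and clear-path fouls are ignored).

```python
# For each kind of trip: (free throws attempted, possessions the free throws end).
# An and-1 ends 0 extra possessions because the made shot is already a FGA,
# and a technical ends 0 because the shooting team keeps the ball.
TRIP_TYPES = {
    "two_shot_foul": (2, 1),
    "three_shot_foul": (3, 1),
    "and_one": (1, 0),
    "technical": (1, 0),
}


def empirical_ft_weight(trip_counts):
    """Possessions ended per free throw attempted, for one sample of trips."""
    fta = sum(TRIP_TYPES[kind][0] * n for kind, n in trip_counts.items())
    ended = sum(TRIP_TYPES[kind][1] * n for kind, n in trip_counts.items())
    if fta == 0:
        raise ValueError("no free throws in sample")
    return ended / fta


# Hypothetical sample: 200 free throws in total, 84 of the trips end a possession.
sample = {"two_shot_foul": 80, "three_shot_foul": 4, "and_one": 20, "technical": 8}
print(empirical_ft_weight(sample))
```

With nothing but two-shot fouls the weight is exactly 0.5, and every and-1, technical, or three-shot foul pulls it lower, which is how a league-wide figure ends up below 0.5.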
Joined: 02 May 2005 Posts: 90513 Location: Formerly Known As 24
Posted: Sun Jun 03, 2018 2:39 pm Post subject:
You may want to add something on for your purposes, but usage is really meant to encapsulate a percentage of possessions a player terminates. It’s good for that purpose. A player who passes on a lot of his possessions (regardless of the further actions by the player receiving the pass) should have a lower usage than a guy who shoots or gets fouled or turns it over (although there’s a gray area in the turnover stat that penalizes passers, since that’s the majority of turnovers).
Joined: 10 Apr 2001 Posts: 65135 Location: Orange County, CA
Posted: Sun Jun 03, 2018 2:45 pm Post subject:
Interesting. I don't look this hard into USG, but have general baselines for what franchise player USG looks like vs. a role player. _________________ Resident Car Nut.
Joined: 02 May 2005 Posts: 90513 Location: Formerly Known As 24
Posted: Sun Jun 03, 2018 2:48 pm Post subject:
Fwiw, the .44 possessions per free throw is also carried into true shooting, which is basically usage minus turnovers (unlike usage, it does not assign turnovers to a possession), converted to a 2pt fg% (adding in the value of threes). Add in turnovers and you get the formula for points per possession (which can be turned into a fg% if you wish).
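As a small illustration of that relationship, here is a Python sketch of true shooting and of the turnover-inclusive points per possession described in the post. The function names are mine, and the denominators are the same box-score estimate used for usage, not a play-by-play possession count.

```python
def true_shooting(pts, fga, fta, ft_weight=0.44):
    """Points per scoring attempt, halved so it reads like a 2pt FG%."""
    return pts / (2 * (fga + ft_weight * fta))


def points_per_possession(pts, fga, fta, tov, ft_weight=0.44):
    """Same idea, but turnovers also count as used possessions."""
    return pts / (fga + ft_weight * fta + tov)


# Made-up line: 30 points on 20 FGA and 10 FTA, with 3 turnovers.
print(round(true_shooting(30, 20, 10), 3))
print(round(points_per_possession(30, 20, 10, 3), 3))
```

Dividing the second number by two gives a turnover-adjusted percentage on the same scale as TS%.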
Joined: 02 May 2005 Posts: 90513 Location: Formerly Known As 24
Posted: Sun Jun 03, 2018 2:50 pm Post subject:
It is possible to be a ball dominant low usage player, although they are rare. They are usually correlated but not the same thing.
you're right. i guess my issue with the metric is that it doesn't tell me if the player is effective or not. it doesn't take into account made baskets/free throws and so the metric just looks at the 'black hole-ness' of a player.
i want to ultimately find or create a metric that looks at touches and time of possession, coupled with things such as FGM/FGA/etc.
PER.
Why would you have an issue with a metric for not measuring something it was not designed to measure?
That's like getting mad at a hammer for not being able to pass water through it like a garden hose can.
maybe it's partly semantics of 'usage'.
That’s fair. It’s probably not the best label for the metric although I would argue usage rate correlates highly with “degree of involvement” on the offensive end.
As an aside here, the .44 comes from the 2002 book by John Hollinger in which he published the PER formula. As far as metrics go, this was the dawn of time, sort of like the opening sequences of 2001: A Space Odyssey. I remember that he said that .44 is just where the number fell based on league data. It became a standard conversion.
But I wonder whether anyone has updated the numbers to find out whether .44 is still where the number falls? The game has changed a lot since 2002.
Anyway, for anyone who is in a really nerdy mood, here is 538's discussion of usage rate from a few years ago. There are actually multiple formulas for usage rate, and Hollinger's version does use assists:
the .44 should be more dynamic than it is. and using minutes is strange when the other metrics are based on possessions. it should be using possessions, not minutes.
i look at basketball metrics and generally have issue with them when i actually look at what it's measuring. or maybe i just have an issue with authority.
i'm also looking into offensive rating, and can't help but feel something is off with that one too.
Okay, but let me amplify something Tox said. When you look at stats and metrics, you will find two different, but related sets, as well as references to a third set:
1. There are the old school box score metrics. This includes PER, Usage Rate, ORtg, DRtg, and more. These stats come from a time when only box score data was readily available. Some of them are still good (team ORtg, team DRtg), but some of them are obsolete (PER, individual ORtg, individual DRtg, etc.). It's not that they are wrong, per se, but rather that they are based on limited data. You will see these stats on a lot of web sites because they are easily updated by daily box score data. This is why some of the formulas are so hard to follow at times -- the websites show the formulas in the format used in computer code.
2. There are the newer metrics, such as RPM and the like. These are based on data that was not easily available in the old days, in particular play by play data. These metrics are not perfect and are usually not as generalized, but they are a major step forward.
3. There are proprietary metrics that are usually available only to subscribers (and of course the private metrics that the teams use). You will sometimes see references to these metrics in the media.
With all of this in mind, here are some responses to your comments:
Yes, it would be nice if the .44 was more dynamic, but it is a product of the old school metrics (category 1). It would take a lot of effort to update that number dynamically. I hope someone periodically checks. In any event, it is more of a convention than anything else. If different people used different values, the numbers wouldn't match, and the slight improvement in accuracy wouldn't be terribly important because it's just an estimate anyway. The more accurate data about free throws would require category 3 data.
Using minutes may be appropriate for a category 1 formula. I think that component of the formula involves the percentage of the team's minutes played by the target player (I figured out the logic one time, but I just don't remember now). In the old school world, it would be impossible to determine how many possessions were played by that player. You could only extrapolate from minutes. The category 2 stats do a better job with this because they use the play by play data to calculate +/- data and the like.
Team ORtg is a perfectly fine stat because it is so simple -- points per 100 possessions. Individual ORtg is just a category 1 stat. It was okay ten years ago, but it is limited by its reliance on box score data. It comes from Dean Oliver, who was a contemporary of Hollinger. Basically, think of individual ORtg and DRtg as an alternative version of PER.
Last edited by Aeneas Hunter on Sun Jun 03, 2018 9:05 pm; edited 1 time in total
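For reference, team ORtg as points per 100 possessions can be sketched in a few lines of Python. The possession estimate here is a common simple box-score approximation that I am assuming for illustration. Stat sites use more elaborate versions, and the names are mine.

```python
def estimate_possessions(fga, fta, orb, tov, ft_weight=0.44):
    """Simple box-score estimate of team possessions.

    Offensive rebounds are subtracted because they extend a possession
    instead of starting a new one.
    """
    return fga + ft_weight * fta - orb + tov


def offensive_rating(pts, possessions):
    """Points scored per 100 possessions."""
    return 100 * pts / possessions


# Made-up team line: 85 FGA, 25 FTA, 10 ORB, 14 TOV, 110 points.
poss = estimate_possessions(85, 25, 10, 14)
print(round(poss, 1), round(offensive_rating(110, poss), 1))
```

Note that this is a team possession count, which is why it subtracts offensive rebounds, while the usage formula counts every shot and turnover as a play used.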
Ok, I’ll gripe about stats. Haha. I think the real “true shooting” is eFG% and not TS%.
Sure, if you want to see it that way. TS% is PPP (points per possession) divided by two. PPP is the real stat. It is divided by two because basketball fans have a hard time relating to a PPP of 1.1. If you say that the TS% is 55%, people can wrap their brains around it. PPP is scoring, eFG% is shooting.
Well, PPP traditionally includes turnovers as well.
It's probably most accurate to say TS% is "true efficiency" or something, whereas eFG% is "true shooting"
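To show why eFG% reads as the pure shooting number, here is a minimal Python sketch of it. The function name is mine, and the two stat lines are invented.

```python
def effective_fg_pct(fgm, fg3m, fga):
    """Field goal percentage with each made three counted as 1.5 makes."""
    if fga <= 0:
        raise ValueError("need at least one field goal attempt")
    return (fgm + 0.5 * fg3m) / fga


# 8-of-20 with four threes and 10-of-20 on all twos both produce 20 points
# from the field, so they get the same eFG%.
print(effective_fg_pct(8, 4, 20), effective_fg_pct(10, 0, 20))
```

Free throws never enter the formula, so two players with identical shooting lines get the same eFG% no matter how often they get to the line. That gap is what TS% fills with the 0.44 term.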
Here is an interesting spin on the relevance of usage rates, in the specific context of Lebron James. Frankly, I think a lot of this analysis goes in the "duh" category, but the point is that pairing Lebron with a teammate who has a high usage rate is a mistake. Of course, this is Kevin Pelton, and he is prone to expressing himself in techno-babble:
Quote:
If we look at the new James teammates who have beaten expectations in Cleveland, it's almost exclusively players with low usage rates. Just one of the nine to match or beat their projections (Smith) was forecast for above-average usage, and their average projected usage rate was 15.8 percent.
To some extent, this is a natural outcome. Part of the metric calibration I mentioned above is valuing the tradeoff between usage and efficiency. This relies on measuring the average extent to which players get less efficient in smaller roles, but each individual has a different degree of sensitivity to how much they're being asked to do on offense. Star players are valuable precisely because they see less drop-off than role players in smaller roles. At the same time, they benefit less in terms of efficiency from playing alongside a high-usage teammate like LeBron.
Still, these results seem to offer an important takeaway for any team that signs James this summer (including the Cavaliers). While some additional shot creators are necessary, particularly in a playoff setting, any team with James must be careful not to invest too many resources on players who are best with the ball in their hands. Instead, the focus should be on finding role players whose games will mesh well with LeBron's.
This is similar to what a lot of us said when we traded for Nash a few years back. We never did get to see how that worked out, because Nash was hurt so often. On the other hand, the CP3/Harden pairing worked pretty well.