Clustering NBA Players


Evan Green, Michael Menz, Luke Benz, Gabriel Zanuttini-Frank, and Michael Bogaty




While everybody knows the five positions in basketball, players at the same position can play the game very differently. LeBron James and Harrison Barnes are both nominally small forwards, but they approach that position in very different ways. Currently, there is no good way of describing that difference succinctly. We therefore set out to define player types in the NBA mathematically, based mostly on how players play (focusing on tendencies such as Usage Rate, 3 Point Attempt Rate, and Percentage of Points Scored in the Paint) and to a lesser degree on how good they are (focusing on stats such as FG%, PTS per Minute, and Rebounds Per Minute). We then used hierarchical clustering to form the groups. We chose this method over k-means clustering because k-means results depend on random starting points and we wanted stable results. We can now say that LeBron James is really a Shot Creator, while Harrison Barnes is a Shooter. The player types, their attributes, and representative players are summarized in the table below.

| Player Type | Traits Overrepresented | Traits Underrepresented | Sample Players |
| --- | --- | --- | --- |
| Traditional Bigs | Rebounding and Blocks | 3PT Attempts | Tyson Chandler, Dwight Howard, and DeAndre Jordan |
| Shooters | 3PT Attempts and 3PT% | Points in the Paint | Jared Dudley, Harrison Barnes, and Kyle Korver |
| Shot Creators | Usage %, FT Attempts | % of 3PT that are Assisted | Stephen Curry, LeBron James, and Kemba Walker |
| Defensive Specialists | % shots that are assisted and % of points that are on a fast break | Usage % and FG% | Thabo Sefolosha, Luc Richard Mbah a Moute, and Spencer Hawes |
| Ball Dominant Guards | Assists, Unassisted Points, and Steals | Rebounding | Rajon Rondo, Chris Paul, and Ricky Rubio |
| Skilled Bigs | Rebounding, Usage %, and Mid Range Points | 3PT Attempts | Karl-Anthony Towns, Blake Griffin, and Marc Gasol |
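To make the clustering step more concrete, here is a minimal sketch of hierarchical (agglomerative) clustering in plain Python. It is an illustration under stated assumptions, not our actual pipeline: the article does not specify the distance metric or linkage rule, so Euclidean distance on z-scored stats and average linkage are assumed here, and the six "players" and their numbers are made up. In practice a library routine (for example SciPy's `scipy.cluster.hierarchy.linkage` and `fcluster`) would likely be used instead of this naive loop.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so no single stat dominates the distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def agglomerate(points, n_clusters):
    """Naive average-linkage agglomerative clustering.

    Starts with every point in its own cluster and repeatedly merges the
    two clusters with the smallest mean pairwise distance. No random
    initialization is involved, so repeated runs give the same groups.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return mean(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


# Made-up illustrative rows (NOT real player data):
# (usage rate, 3PT attempt rate, share of points scored in the paint)
players = {
    "big_a": (0.15, 0.01, 0.80),
    "big_b": (0.17, 0.02, 0.75),
    "shooter_a": (0.16, 0.65, 0.10),
    "shooter_b": (0.18, 0.60, 0.15),
    "creator_a": (0.31, 0.35, 0.40),
    "creator_b": (0.30, 0.30, 0.45),
}
names = list(players)
scaled = standardize(list(players.values()))
groups = [sorted(names[i] for i in c) for c in agglomerate(scaled, 3)]
print(groups)
```

Because every merge is determined entirely by the distances, running this twice on the same data returns the same groups, which is the stability property that motivated choosing hierarchical clustering over k-means.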

Once we have these clusters, we can look at other, related questions. Since we clustered every NBA player from the 2011-2012 season through the 2015-2016 season, we can look at how players developed and changed over that time. During this period, Steph Curry rose from a great shooter and promising player to one of the best players in the NBA. Fittingly, his player type changed from Shooter to Shot Creator. While he has been a point guard throughout this transition, his player type reflects how differently he now plays. Kawhi Leonard’s transition similarly captures how his playing style and role have changed. In the 2011 season, he was a Skilled Big. The next two seasons he was a Shooter, and the last two seasons he has been a Shot Creator. This mirrors his progression from a defensive-minded player, to a player who could also be a spot-up shooter, to a player who could handle the ball and create offense. In the opposite direction, we can see the decline in Kevin Garnett’s skills. In 2011 and 2012, he was a Skilled Big, before transitioning to a Traditional Big over the last three seasons. Again, this mirrors our qualitative understanding of how Kevin Garnett has changed.
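As a small illustration of how per-season labels can be turned into the kind of career arcs described above, the sketch below collapses a season-by-season mapping into consecutive "spells" of one player type. The labels are the ones stated in the text for Kawhi Leonard and Kevin Garnett; keying each season by its starting year and the helper name `type_spells` are assumptions made for this example.

```python
from itertools import groupby


def type_spells(season_types):
    """Collapse a {season: player type} mapping into consecutive spells.

    Returns (player_type, first_season, last_season) tuples in
    chronological order.
    """
    ordered = sorted(season_types.items())
    spells = []
    for player_type, run in groupby(ordered, key=lambda item: item[1]):
        seasons = [season for season, _ in run]
        spells.append((player_type, seasons[0], seasons[-1]))
    return spells


# Labels as described in the text; seasons keyed by starting year (assumption).
kawhi = {2011: "Skilled Big", 2012: "Shooter", 2013: "Shooter",
         2014: "Shot Creator", 2015: "Shot Creator"}
garnett = {2011: "Skilled Big", 2012: "Skilled Big", 2013: "Traditional Big",
           2014: "Traditional Big", 2015: "Traditional Big"}

for name, history in {"Kawhi Leonard": kawhi, "Kevin Garnett": garnett}.items():
    print(name, type_spells(history))
```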

Another question we can ask with the help of the player types is whether certain player types complement or detract from the effectiveness of other types of players. Obviously, a lineup’s effectiveness also depends on the skill level of the players in it: a lineup with more skilled players is likely to be better. Thus, we analyzed the success of lineups based on both the skill level and the types of the players involved. We used Plus Minus Per Minute Played as our success measure to avoid favoring lineups that had played more minutes. We also set a minimum-minutes threshold for including a lineup, to avoid having too much noise in the samples. To measure lineup strength, we summed the BPM (Box Plus Minus) of each of the players. We chose BPM because it is a catch-all statistic that attempts to quantify the ability of every player and is easily available from basketball-reference.com. We then tested whether any player types or player type combinations were significant predictors of lineup success. Interestingly, we found that none were! In fact, a simple linear model using Lineup BPM to predict Plus Minus Per Minute was quite successful, with an R^2 of 0.42.
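The lineup model described above can be sketched in a few lines of plain Python. This is only an illustration: the lineups below are made up, the `MIN_MINUTES` cutoff is hypothetical (the threshold we used is not stated here), and the helper names are invented for the example. It shows the three ingredients: summing player BPM into a lineup strength, normalizing plus-minus by minutes, and fitting a one-variable least-squares line with its R^2.

```python
from statistics import mean


def lineup_bpm(player_bpms):
    """Lineup strength: the sum of the players' Box Plus Minus values."""
    return sum(player_bpms)


def plus_minus_per_minute(plus_minus, minutes):
    """Success measure: point differential normalized by minutes played."""
    return plus_minus / minutes


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R^2."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


MIN_MINUTES = 50  # hypothetical cutoff; the actual threshold is not given

# Made-up lineups (NOT real data): (five player BPMs, total plus-minus, minutes)
lineups = [
    ([5.1, 2.0, 1.5, 0.3, -0.4], 60, 400),
    ([1.2, 0.8, -0.5, -1.0, -1.5], -20, 250),
    ([3.0, 2.5, 0.0, -0.2, -0.8], 15, 300),
    ([-1.0, -1.5, -2.0, -0.5, 0.2], -45, 180),
    ([8.0, 4.0, 3.0, 1.0, 0.5], 12, 30),  # dropped below: too few minutes
    ([2.2, 1.1, 0.4, 0.0, -1.2], 30, 220),
]

kept = [lineup for lineup in lineups if lineup[2] >= MIN_MINUTES]
xs = [lineup_bpm(bpms) for bpms, _, _ in kept]
ys = [plus_minus_per_minute(pm, mins) for _, pm, mins in kept]
slope, intercept, r_squared = fit_line(xs, ys)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r_squared:.2f}")
```

Testing whether player types add predictive power would mean adding indicator variables for the types (or type combinations) to this model and checking their significance, which requires a multiple-regression routine beyond this sketch.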

We observe that the residuals are approximately normally distributed (the normal quantile plot of the residuals resembles a straight line), which indicates that a linear fit is appropriate in this case.
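One common way to check residual normality is a normal quantile (Q-Q) comparison: sort the residuals, pair them with the quantiles a standard normal would produce, and see whether the points fall on a straight line. The sketch below is a hedged illustration using simulated residuals rather than the ones from our model; the function names and the use of the Q-Q correlation as a "straightness" score are choices made for this example.

```python
import random
from statistics import NormalDist, mean


def qq_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile."""
    n = len(residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(residuals)))


def straightness(points):
    """Pearson correlation of the Q-Q points; values near 1 mean a straight line."""
    xs, ys = zip(*points)
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Simulated residuals (NOT the article's): one normal, one right-skewed.
normal_like = [rng.gauss(0, 0.1) for _ in range(500)]
skewed = [rng.expovariate(10) - 0.1 for _ in range(500)]

r_normal = straightness(qq_points(normal_like))
r_skewed = straightness(qq_points(skewed))
print(f"normal-like: {r_normal:.3f}, skewed: {r_skewed:.3f}")
```

Normally distributed residuals should score noticeably closer to 1 than skewed ones, which is the numerical counterpart of the plot "resembling a straight line".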

Additionally, there is reason to believe that BPM (or any other single statistic, for that matter) does not perfectly capture a player’s talent, which adds more uncertainty to any attempt to predict lineup success. Furthermore, basketball, like all sports, has a component of inherent randomness that makes a perfect fit impossible to achieve. This makes the R^2 value of 0.42 surprisingly high and gives us confidence in the model.

There are a couple of possible explanations for the fact that player type is not a significant predictor of lineup success. One is that coaches and management are already efficiently accounting for the impacts of the player types. This makes a lot of sense, since a hypothetical lineup made up entirely of traditional big men (think DeAndre Jordan) would probably struggle mightily and would never be played in a real game. Additionally, in the interactive graphic above you can see that teams typically have players spread across the different player types. Thus, there is evidence to suggest that teams already implicitly recognize the existence and value of these player types, even if they haven’t formalized them as we have here.

Another possibility is that our analysis is missing something: there may be player types that would be significant predictors, but we have failed to identify them. However, given how many lineups we looked at and the clustering methods we used, it is unlikely that there are other, more mathematically well-defined clusters that also serve as significant predictors after accounting for lineup skill. We also suspect that our linear model would have a worse fit, and the residuals would not be as nicely normally distributed, if this were the case. Instead, it simply looks like there is a large amount of natural fluctuation around the trend.

In summary, although we see strong mathematical evidence that there are player types in the NBA that transcend the traditional positions and do a better job of describing who a player is and what they are likely to do, we find no evidence that these player types help predict lineup success any better than the talent of the lineup alone. We suspect that this is because coaches and general managers implicitly understand and account for these player types.