Data Storytelling – Numbers Don’t Lie

Over the course of the fall 2024 semester, two classmates (Tom Wiley and Jack Welker) and I created a project showcasing our knowledge of Excel and data collection. We pulled statistics from “BasketballReference.com” for the NBA season up through November 27, 2024 and loaded them into Excel, where we transformed individual player statistics into comparative values. We then combined those comparative values into a composite score from 0 to 5 for each player. Finally, we grouped players into pivot tables by team and pulled the insights listed below.
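For readers who think in code rather than spreadsheets, here is a rough sketch of the kind of transformation described above. It is not our actual Excel workbook: it assumes the comparative values come from min-max scaling each stat to a 0 to 1 range, and that the composite is the plain average of those scaled values multiplied by 5. The player names, stat names, and numbers are all made up for illustration.

```python
# Hypothetical stat lines; names and numbers are invented for illustration.
players = {
    "Player A": {"pts": 30.0, "reb": 5.0, "ast": 6.0},
    "Player B": {"pts": 10.0, "reb": 10.0, "ast": 2.0},
    "Player C": {"pts": 20.0, "reb": 7.5, "ast": 4.0},
}


def comparative_values(players):
    """Min-max scale each stat to 0..1 across all players (assumed method)."""
    stats = list(next(iter(players.values())).keys())
    scaled = {name: {} for name in players}
    for stat in stats:
        values = [line[stat] for line in players.values()]
        low, high = min(values), max(values)
        spread = high - low
        for name, line in players.items():
            # If every player has the same value, fall back to 0.0.
            scaled[name][stat] = (line[stat] - low) / spread if spread else 0.0
    return scaled


def composite_scores(players, top=5.0):
    """Average the scaled stats and stretch the result onto a 0..top scale."""
    scaled = comparative_values(players)
    return {
        name: top * sum(vals.values()) / len(vals)
        for name, vals in scaled.items()
    }


scores = composite_scores(players)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Equal weighting of every stat is the simplest choice here; a weighted average would work the same way with one extra multiplication per stat.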

The Excel sheet above is the data sheet we used for our project. By clicking the link, you can view the sheet but not edit it. Everything below covers the insights we found and the visuals we created using Flourish!

Skewed Values

First, let’s look at teams. The best teams on paper have high values while the bad teams have low values, but what if these values are skewed? 

Many teams rotate players in and out of a game off the bench. These players could be anyone from designated 6th men to players who come in for 30 seconds to commit a foul and go back out.

So how do we account for this?

Tom created a team composite score visual in which the purple bars show scores from ALL players, while the red bars show scores from only the players who clear a minimum of 12 minutes played per game.

The range of the values is 0-45.
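The two sets of bars described above could be computed along these lines. This is a sketch under assumptions: it treats a team's score as the sum of its players' composite scores, which may not match how the chart was actually built, and the roster rows are invented. Only the 12 minutes per game cutoff comes from our project.

```python
MIN_MINUTES = 12.0  # minutes-per-game cutoff used for the red bars

# (team, player, minutes per game, composite score); all rows are invented.
roster = [
    ("Team X", "Starter 1", 34.0, 4.0),
    ("Team X", "Sixth man", 22.0, 3.0),
    ("Team X", "Deep bench", 3.0, 0.5),
    ("Team Y", "Starter 2", 30.0, 3.5),
    ("Team Y", "Deep bench 2", 11.9, 1.0),
]


def team_scores(rows, min_minutes=0.0):
    """Sum composite scores per team, keeping players at or above the cutoff."""
    totals = {}
    for team, _player, minutes, score in rows:
        if minutes >= min_minutes:
            totals[team] = totals.get(team, 0.0) + score
    return totals


all_players = team_scores(roster)                 # purple bars
rotation_only = team_scores(roster, MIN_MINUTES)  # red bars

for team in all_players:
    print(team, all_players[team], rotation_only.get(team, 0.0))
```

Comparing the two numbers for each team shows how much of its score comes from players who barely see the floor.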

Positional Value

Beyond ranking each player individually, we set out to measure broader values on teams. With five players per team on the court at a time, each player is designated one of the following positions: point guard, shooting guard, small forward, power forward, or center.

We can determine the value of each position by averaging the composite scores of all of the players at that position. The position with the highest average composite score brings the most value.

The positions in order from most valuable to least valuable are center, power forward, point guard, shooting guard, and small forward.
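The averaging behind this ranking is simple enough to sketch in a few lines. The position labels and scores below are invented sample data, so the order they produce is not our result; only the method (group by position, average, sort) follows the description above.

```python
from collections import defaultdict
from statistics import mean

# (position, composite score); invented sample data, not our real numbers.
player_scores = [
    ("C", 4.0),
    ("C", 3.0),
    ("PG", 3.0),
    ("PG", 2.0),
    ("SF", 1.5),
]


def position_averages(rows):
    """Group composite scores by position and average each group."""
    by_position = defaultdict(list)
    for position, score in rows:
        by_position[position].append(score)
    return {position: mean(scores) for position, scores in by_position.items()}


averages = position_averages(player_scores)
# Sort positions from highest to lowest average composite score.
ranking = sorted(averages, key=averages.get, reverse=True)
print(ranking)
```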

High Impact Players

Jack researched this insight. We can determine how much influence high-impact players have on their team's wins and losses by comparing box +/- and win share statistics. Box +/- is a statistical metric that estimates a player’s contribution to their team based solely on box score data. Win shares is a composite basketball stat that attempts to capture an individual player’s overall contribution to their team, expressed as a ‘share’ of the team’s total wins over the course of a season. We took the top 20 individual players in the league (high-impact players) and compared the average of their box +/- and win shares. We found that Shai Gilgeous-Alexander had the highest average of the two stats, and at the time this data was collected, his team (the OKC Thunder) was in first place in its division.
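As a rough illustration of the comparison described above, the sketch below averages each player's box +/- and win shares and picks the player with the highest result. The field names `bpm` and `win_shares` are hypothetical, and the players and numbers are invented rather than taken from our sheet.

```python
# Invented sample rows; "bpm" (box +/-) and "win_shares" are hypothetical names.
top_players = [
    {"name": "Player A", "bpm": 10.0, "win_shares": 4.0},
    {"name": "Player B", "bpm": 8.0, "win_shares": 3.0},
    {"name": "Player C", "bpm": 6.0, "win_shares": 5.0},
]


def impact_average(player):
    """Plain average of box +/- and win shares for one player."""
    return (player["bpm"] + player["win_shares"]) / 2


# The player with the highest average of the two stats.
leader = max(top_players, key=impact_average)
print(leader["name"], impact_average(leader))
```

Because the two stats sit on different scales, a plain average lets the larger one carry more weight, which is worth keeping in mind when reading the result.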

Information About Our Project

Featured below is my group’s “About this Project” file, which explains our data lifecycle, the origin of our data, the spreadsheet we used, a guide to how we transformed the data, what we analyzed within it, the insights we pulled from it, and our overall data story with a script of what we presented to our class.
