Ben Brumm
www.databasestar.com
SQL Project: NBA
● questions to answer
This will help you follow along with the YouTube video and perform your own data analysis.
Table of Contents
The Project
Download the Sample Data
Importing Data
Questions
Queries and Results
Question 1
Question 2
Question 3
Question 4
Question 5
Conclusion
The Project
This project is about performing some data analysis on a set of data to answer some questions.
The data set is on the NBA basketball league. This is available from the Kaggle website here:
https://www.kaggle.com/datasets/wyattowalsh/basketball/data
Step 1: Log in to the Kaggle website. You'll need an account to download the data set.
Step 5: Extract the ZIP file, which contains a range of CSV files.
Importing Data
The process to import data is different in each SQL editor and database, but the overall
process is the same:
1. Create a database
2. Start the data import wizard or process
3. Select the CSV file
4. Adjust any options as needed
5. Proceed with the import
I've created a few videos on importing CSV files on my YouTube channel, with more to come.
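If you'd rather script the import than use a wizard, the five steps above can be sketched in Python. This is only an illustration and not the process used in the video: it assumes SQLite (through Python's built-in sqlite3 module) instead of Oracle, the import_csv helper is a hypothetical name, and a small made-up sample stands in for the real CSV file. The column names are the ones used by the queries in this guide.

```python
import csv
import io
import sqlite3


def import_csv(conn, table_name, csv_file):
    """Create table_name from the CSV header row and load every data row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})', reader
    )
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()[0]


# Step 1: create a database (in memory here; pass a file path to keep it).
conn = sqlite3.connect(":memory:")

# Steps 2 and 3: start the import and select the CSV file. The sample below is
# made up; with the real download you would pass something like
# open("game.csv", newline="") instead (the file name is an assumption).
sample = io.StringIO(
    "season_id,team_name_home,wl_home,team_name_away,wl_away,season_type\n"
    "22001,Team A,W,Team B,L,Regular Season\n"
    "22001,Team B,W,Team A,L,Regular Season\n"
)

# Steps 4 and 5: adjust options as needed (none here), then run the import.
row_count = import_csv(conn, "game", sample)
print(row_count)
```

Every column is created without a data type in this sketch, so the values are stored as text. That is enough for the comparisons and SUBSTR calls used in this guide, but you may want explicit types in a real import.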
Questions
As part of the data analysis for this project, we started with a list of questions that we wanted to find
the answer to using SQL.
These are just the questions. If you'd like the answers, which are the SQL queries and the results, refer
to the next section.
1. How many games has each team won and lost in their entire history?
4. Which team has had the best 5-season span? Or, said another way, which team has the best
win-loss record over a 5-year period?
5. Which team has had the biggest increase or decrease in wins from one season to the next?
Queries and Results
The queries have been written for Oracle. Most of them should work in other databases, but
some may need adjustments for vendor-specific features.
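One adjustment you may run into is division. The win percentage calculations in this guide divide one whole number by another, which gives a decimal result in Oracle. Some databases truncate that result to a whole number instead. The sketch below shows the difference using SQLite through Python's sqlite3 module, which is an assumption for illustration and not the database used in this guide. The win and loss totals are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical totals for one team: 7 wins and 3 losses.
wins, losses = 7, 3

# Both operands are integers, so SQLite truncates the quotient.
truncated = conn.execute(
    "SELECT ROUND(? / (? + ?), 3)", (wins, wins, losses)
).fetchone()[0]

# Multiplying by 1.0 first forces decimal division.
decimal = conn.execute(
    "SELECT ROUND(? * 1.0 / (? + ?), 3)", (wins, wins, losses)
).fetchone()[0]

print(truncated, decimal)
```

The first value comes back as zero and the second as 0.7. If your database behaves like SQLite here, multiplying by 1.0 (or casting to a decimal type) inside the win_pct expressions is one way to adapt the queries.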
Question 1
Question:
How many games has each team won and lost in their entire history?
Query:
SELECT
  team_name,
  SUM(wins) AS wins,
  SUM(losses) AS losses
FROM (
  SELECT
    team_name_home AS team_name,
    SUM(CASE WHEN wl_home = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN wl_home = 'L' THEN 1 ELSE 0 END) AS losses
  FROM game
  GROUP BY team_name_home
  UNION ALL
  SELECT
    team_name_away,
    SUM(CASE WHEN wl_away = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN wl_away = 'L' THEN 1 ELSE 0 END) AS losses
  FROM game
  GROUP BY team_name_away
)
GROUP BY team_name
ORDER BY wins DESC;
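To see how the query combines home and away records, here is a minimal sketch that runs the same logic on four made-up games. It assumes SQLite through Python's sqlite3 module and not Oracle. The team names and results are invented, and the subquery alias (combined) and the extra team_name sort are additions for this sketch only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE game (team_name_home TEXT, wl_home TEXT, "
    "team_name_away TEXT, wl_away TEXT)"
)
# Four made-up games: (home team, home result, away team, away result).
conn.executemany("INSERT INTO game VALUES (?, ?, ?, ?)", [
    ("Team A", "W", "Team B", "L"),
    ("Team A", "L", "Team C", "W"),
    ("Team B", "W", "Team A", "L"),
    ("Team C", "W", "Team B", "L"),
])

query = """
SELECT team_name, SUM(wins) AS wins, SUM(losses) AS losses
FROM (
  -- One row per team for games played at home
  SELECT team_name_home AS team_name,
    SUM(CASE WHEN wl_home = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN wl_home = 'L' THEN 1 ELSE 0 END) AS losses
  FROM game
  GROUP BY team_name_home
  UNION ALL
  -- One row per team for games played away
  SELECT team_name_away,
    SUM(CASE WHEN wl_away = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN wl_away = 'L' THEN 1 ELSE 0 END) AS losses
  FROM game
  GROUP BY team_name_away
) AS combined
GROUP BY team_name
ORDER BY wins DESC, team_name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each team appears up to twice in the inner result, once for home games and once for away games, and the outer SUM adds the two rows together. UNION ALL is used so that both rows are kept even if they happen to be identical.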
Result:
… … …
Question 2
Question:
Query:
WITH season_record AS (
  SELECT
    SUBSTR(season_id, 2, 4) AS season,
    team_name_home AS team_name,
    SUM(CASE WHEN wl_home = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN wl_home = 'L' THEN 1 ELSE 0 END) AS losses
  FROM game
  WHERE season_type = 'Regular Season'
  GROUP BY season_id, team_name_home
  UNION ALL
  SELECT
    SUBSTR(season_id, 2, 4),
    team_name_away,
    SUM(CASE WHEN wl_away = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN wl_away = 'L' THEN 1 ELSE 0 END) AS losses
  FROM game
  WHERE season_type = 'Regular Season'
  GROUP BY season_id, team_name_away
)
SELECT
  season,
  team_name,
  SUM(wins) AS wins,
  SUM(losses) AS losses,
  ROUND(SUM(wins) / SUM(wins + losses), 3) AS win_pct
FROM season_record
GROUP BY season, team_name
ORDER BY win_pct DESC;
Result:
Question 3
Question:
Query:
WITH season_record AS (
  SELECT
    SUBSTR(season_id, 2, 4) AS season,
    team_name_home AS team_name,
    SUM(CASE WHEN wl_home = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN wl_home = 'L' THEN 1 ELSE 0 END) AS losses
  FROM game
  WHERE season_type = 'Regular Season'
  GROUP BY season_id, team_name_home
  UNION ALL
  SELECT
    SUBSTR(season_id, 2, 4),
    team_name_away,
    SUM(CASE WHEN wl_away = 'W' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN wl_away = 'L' THEN 1 ELSE 0 END) AS losses
  FROM game
  WHERE season_type = 'Regular Season'
  GROUP BY season_id, team_name_away
)
SELECT
  season,
  team_name,
  SUM(wins) AS wins,
  SUM(losses) AS losses,
  ROUND(SUM(wins) / SUM(wins + losses), 3) AS win_pct
FROM season_record
GROUP BY season, team_name
ORDER BY win_pct ASC;
Result:
Question 4
Question:
Which team has had the best 5-season span? Or, said another way, which team has the best win-loss
record over a 5-year period?
Query:
WITH season_record AS (
  SELECT
    season,
    team_name,
    SUM(wins) AS wins,
    SUM(losses) AS losses
  FROM (
    SELECT
      SUBSTR(season_id, 2, 4) AS season,
      team_name_home AS team_name,
      SUM(CASE WHEN wl_home = 'W' THEN 1 ELSE 0 END) AS wins,
      SUM(CASE WHEN wl_home = 'L' THEN 1 ELSE 0 END) AS losses
    FROM game
    WHERE season_type = 'Regular Season'
    GROUP BY season_id, team_name_home
    UNION ALL
    SELECT
      SUBSTR(season_id, 2, 4),
      team_name_away,
      SUM(CASE WHEN wl_away = 'W' THEN 1 ELSE 0 END) AS wins,
      SUM(CASE WHEN wl_away = 'L' THEN 1 ELSE 0 END) AS losses
    FROM game
    WHERE season_type = 'Regular Season'
    GROUP BY season_id, team_name_away
  )
  GROUP BY season, team_name
),
season_5y AS (
  SELECT
    season,
    team_name,
    wins,
    losses,
    ROUND(wins / (wins + losses), 3) AS win_pct,
    SUM(wins) OVER (
      PARTITION BY team_name
      ORDER BY season ASC
      ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
    ) AS wins_5y,
    SUM(losses) OVER (
      PARTITION BY team_name
      ORDER BY season ASC
      ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
    ) AS losses_5y,
    COUNT(*) OVER (
      PARTITION BY team_name
      ORDER BY season ASC
      ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
    ) AS seasons_included
  FROM season_record
)
SELECT
  season,
  team_name,
  wins_5y,
  losses_5y,
  ROUND(wins_5y / (wins_5y + losses_5y), 3) AS win_pct_5y
FROM season_5y
WHERE seasons_included = 5
ORDER BY win_pct_5y DESC;
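The rolling 5-season window is the key part of this query, so here is a minimal sketch of just that step. It assumes SQLite 3.25 or later (through Python's sqlite3 module) and not Oracle, starts from a made-up season_record table for a single hypothetical team, and multiplies by 1.0 because SQLite would otherwise truncate the division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE season_record "
    "(season TEXT, team_name TEXT, wins INTEGER, losses INTEGER)"
)
# Made-up records for one team over six 82-game seasons.
wins_by_season = {
    "2001": 50, "2002": 60, "2003": 55, "2004": 40, "2005": 45, "2006": 65,
}
conn.executemany(
    "INSERT INTO season_record VALUES (?, 'Team A', ?, ?)",
    [(season, wins, 82 - wins) for season, wins in wins_by_season.items()],
)

# The same frame as in the guide's query: the current row plus the 4 before it.
frame = (
    "PARTITION BY team_name ORDER BY season ASC "
    "ROWS BETWEEN 4 PRECEDING AND CURRENT ROW"
)
query = f"""
WITH season_5y AS (
  SELECT
    season,
    SUM(wins) OVER ({frame}) AS wins_5y,
    SUM(losses) OVER ({frame}) AS losses_5y,
    COUNT(*) OVER ({frame}) AS seasons_included
  FROM season_record
)
SELECT
  season,
  wins_5y,
  losses_5y,
  ROUND(wins_5y * 1.0 / (wins_5y + losses_5y), 3) AS win_pct_5y
FROM season_5y
WHERE seasons_included = 5
ORDER BY win_pct_5y DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the 2005 and 2006 rows are returned, because the first four seasons have fewer than five rows in their window. Note that ROWS counts rows and not years, so if a team had no record for a season, its window would span five recorded seasons and not five calendar years.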
Result:
Question 5
Question:
Which team has had the biggest increase or decrease in wins from one season to the next?
Query:
WITH season_record AS (
  SELECT
    season,
    team_name,
    SUM(wins) AS wins,
    SUM(losses) AS losses
  FROM (
    SELECT
      TO_NUMBER(SUBSTR(season_id, 2, 4)) AS season,
      team_name_home AS team_name,
      SUM(CASE WHEN wl_home = 'W' THEN 1 ELSE 0 END) AS wins,
      SUM(CASE WHEN wl_home = 'L' THEN 1 ELSE 0 END) AS losses
    FROM game
    WHERE season_type = 'Regular Season'
    GROUP BY season_id, team_name_home
    UNION ALL
    SELECT
      TO_NUMBER(SUBSTR(season_id, 2, 4)),
      team_name_away,
      SUM(CASE WHEN wl_away = 'W' THEN 1 ELSE 0 END) AS wins,
      SUM(CASE WHEN wl_away = 'L' THEN 1 ELSE 0 END) AS losses
    FROM game
    WHERE season_type = 'Regular Season'
    GROUP BY season_id, team_name_away
  )
  GROUP BY season, team_name
),
season_with_prev AS (
  SELECT
    season,
    team_name,
    wins,
    -- RANGE matches the row whose season value is exactly one less,
    -- so a team with no record for that season gets NULL
    SUM(wins) OVER (
      PARTITION BY team_name
      ORDER BY season ASC
      RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING
    ) AS wins_prev_season
  FROM season_record
)
SELECT
  season,
  team_name,
  wins,
  wins_prev_season,
  wins - wins_prev_season AS wins_increase
FROM season_with_prev
WHERE wins_prev_season IS NOT NULL
-- ASC lists the biggest decreases first; use DESC to list the biggest increases first
ORDER BY wins_increase ASC;
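The RANGE frame in this query behaves differently from the more common LAG function when a team has a gap in its seasons. The sketch below compares the two on made-up data. It assumes SQLite 3.28 or later (through Python's sqlite3 module) and not Oracle, and the team and win totals are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE season_record (season INTEGER, team_name TEXT, wins INTEGER)"
)
# Made-up data with a gap: there is no row for the 2003 season.
conn.executemany(
    "INSERT INTO season_record VALUES (?, 'Team A', ?)",
    [(2001, 40), (2002, 50), (2004, 30)],
)

query = """
SELECT
  season,
  wins,
  -- Value-based: only a row with season = current season - 1 is included
  SUM(wins) OVER (
    PARTITION BY team_name
    ORDER BY season ASC
    RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING
  ) AS prev_by_range,
  -- Row-based: the previous row, whatever its season is
  LAG(wins) OVER (
    PARTITION BY team_name
    ORDER BY season ASC
  ) AS prev_by_lag
FROM season_record
ORDER BY season
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For the 2004 row, the RANGE version returns NULL (None in Python) because there is no 2003 season, while LAG returns the 50 wins from 2002. With RANGE, the WHERE wins_prev_season IS NOT NULL filter then leaves out comparisons across a gap. A numeric offset in a RANGE frame needs a numeric sort column, which is presumably why this query converts the season with TO_NUMBER.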
Result - Increase:
Result - Decrease:
Conclusion
Hopefully, this guide has been useful to you. If you have any questions, let me know at
[email protected].
Thanks,
Ben Brumm
www.DatabaseStar.com