
BASKETBALL GAME PREDICTOR


Machine Learning Course Project

V Semester, Bachelor of Technology, Information Technology


Indian Institute of Information Technology, Allahabad, Prayagraj

Tushar Kumar (IIT2021203), Yuvraj Jindal (IIT2021161), Parth Garg (IIT2021116), Jinam
Jain (IIT2021180), Rishika Rajput (IIT2021117), Sakshi Khokhar (IIT2021108)

Abstract— The Basketball Game Prediction using machine learning is an application that harnesses advanced algorithms to forecast the winner of games during National Basketball Association (NBA) tournaments. This cutting-edge system analyses a wide variety of factors from historical data, which allows it to make informed predictions about the winner.
This predictor makes use of key inputs including field goals, assists, rebounds, points per game, and many other statistics by incorporating machine learning techniques. These components are important markers for comprehending the dynamics of an NBA game and aid in the development of outcome-estimating predictive models.
By carefully examining past match data, our model's forecast accuracy has been refined. The algorithms have been adjusted to find patterns and correlations between the aforementioned variables and the final result. The model is constantly learning and adapting to new data and trends, improving its accuracy and dependability with every iteration.


1 INTRODUCTION

Predicting the results of sporting events is a logical use case for machine learning. Many professional sports have readily available, generally random, and predictably large data sets. Because basketball is perceived as a sport that is primarily player driven, forecasting the results of NBA games is particularly interesting. The prevailing stigma is that winning games requires a superstar. Teams are adopting and utilizing increasingly sophisticated statistics. In this project, we employ the Random Forest classifier algorithm to attempt to forecast the outcome of a game between two teams, as well as to identify the factors that are truly most crucial to determining the result of a game, without taking individual player statistics into account.


2 BACKGROUND

1. Decision tree: A decision tree is a visual model for decision-making, presenting choices and their potential outcomes in a tree-like structure. Nodes represent decisions or tests, branches depict possible outcomes, and leaves indicate final results. Widely used in machine learning, decision trees offer a clear and interpretable approach to classification and regression tasks across diverse domains.

2. Random Forest: A random forest classifier is an ensemble learning method that combines multiple decision trees for improved accuracy and robustness. It constructs a forest of trees, each trained on a random subset of the data and features. By aggregating predictions, it enhances predictive performance and mitigates overfitting in diverse machine learning applications.
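As a concrete illustration of the ensemble idea above, here is a minimal scikit-learn sketch on randomly generated placeholder data (not the project's NBA data):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder data: 500 fake games described by 8 fake statistics.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # fake win/loss label

# Each tree is fit on a bootstrap sample of the rows and considers a random
# subset of features at each split; class predictions are aggregated by vote.
model = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=42)
model.fit(X, y)
print(model.predict(X[:5]))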
3. OBJECTIVE

The primary objective of a Basketball Game Prediction model using a Random Forest classifier is to accurately forecast the final winner. This predictive tool aims to leverage historical match data and crucial match-specific parameters to provide insights into the potential outcome. By achieving these objectives, the Basketball Game Predictor model endeavors to be a valuable asset in the basketball domain, offering actionable insights and aiding stakeholders in making informed decisions during matches. The model can be useful in betting, fantasy sports, and improving team performance by identifying areas of improvement and developing targeted training strategies.


4. DATASET

We used the Kaggle NBA Games dataset for this project. The dataset was gathered for the purpose of collecting data on NBA matches and was created from the NBA Stats website. We used the Games.csv file, which contains all games from 2004 to the latest update in 2022, with dates, teams, and other data such as the number of points scored.

Dataset link:
https://www.kaggle.com/datasets/nathanlauga/nba-games
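A minimal sketch of loading this file with pandas is shown below; the path and column names are assumptions about the downloaded CSV, not details given in this report, so they should be checked after downloading.

import pandas as pd

# Path below is an assumption; point it at the downloaded Kaggle file.
games = pd.read_csv("games.csv")
print(games.shape)             # number of games and columns
print(games.columns.tolist())  # inspect the available box-score columns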
5. FEATURE SELECTION

The standard NBA box score includes 14 statistics measuring each team's performance over the course of a game. These statistics are:
- Field Goals Made (FGM)
- Field Goals Attempted (FGA)
- 3 Point Field Goals Made (3PM)
- 3 Point Field Goals Attempted (3PA)
- Free Throws Made (FTM)
- Free Throws Attempted (FTA)
- Offensive Rebounds (OREB)
- Defensive Rebounds (DREB)
- Assists (AST)
- Turnovers (TOV)
- Steals (STL)
- Blocks (BLK)
- Personal Fouls (PF)
- Points (PTS)

Using the statistics contained in the box score, we constructed a feature vector for each game, containing the difference in the competing teams' net: [win-lose record, points scored, points allowed, field goals made and attempted, 3-pt made and attempted, free throws made and attempted, offensive and defensive rebounds, turnovers, assists, steals, blocks, and personal fouls].
Initially, we trained and tested all of our learning models on the aforementioned feature vectors. We quickly realized, however, that besides logistic regression, which performed well, all of the other models suffered from overfitting and poor test accuracies. In order to curb our overfitting, we decided to instead construct our models using a small subset of our original features, consisting of the features that best captured a team's ability to win. In choosing a specific set of features to utilize in our learning models, we ran three separate feature selection algorithms in order to determine which features are most indicative of a team's ability to win. Two of the feature selection algorithms used were forward and backward search, in which we utilize 10-fold cross validation and add or remove features one by one in order to determine which features result in the highest prediction accuracies. In addition, we ran a heuristic feature selection algorithm to verify that the features selected tended to be those that are most informative about whether a team will win. The results of the three methods are shown in the table below.

Forward Search         | Backward Search        | Heuristic
-----------------------+------------------------+-----------------------
Points Scored          | Points Scored          | Points Scored
Points Allowed         | Field Goals Attempted  | Field Goals Attempted
Field Goals Attempted  | Defensive Rebounds     | Free Throws Made
Defensive Rebounds     | Assists                | Defensive Rebounds
Assists                | Turnovers              | Assists
Blocks                 | Overall Record         | Overall Record
Overall Record         | Recent Record          | Recent Record

The features selected by backward search were almost the exact same features as those selected by heuristic search. This indicated that the backward search features captured the aspects of a team's play that best indicated whether that team would win, and thus that these features would likely yield good results. Our preliminary results showed that backward search did indeed result in the best cross-validation accuracy. The features selected by backward search also agree with the experts' view of the game: prediction is most accurate when considering the offensive and scoring potential of a team compared to its opponent. Each of the selected statistics is related to scoring, even turnovers and defensive rebounds, as they essentially give the team possession of the ball.
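The report does not show the selection code itself; the following is a minimal sketch of how backward search with 10-fold cross-validation could be run with scikit-learn's SequentialFeatureSelector, using placeholder data in place of the real per-game difference vectors.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector

# Placeholder arrays standing in for the per-game difference features (X) and
# win/loss labels (y); the real arrays would come from the box-score features.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 15))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

rf = RandomForestClassifier(n_estimators=100, random_state=0)

# Backward search: start from all features and drop them one by one, scoring
# each candidate subset with 10-fold cross-validation; direction="forward"
# gives the forward-search variant. Seven features mirrors the table above.
selector = SequentialFeatureSelector(
    rf, n_features_to_select=7, direction="backward", scoring="accuracy", cv=10
)
selector.fit(X, y)
print("selected feature mask:", selector.get_support())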
6. LITERATURE REVIEW

A literature review of a Basketball Game Predictor model utilizing a Random Forest classifier would investigate existing research on forecasting basketball game winners. It would explore the application of machine learning in sports analytics, with a focus on basketball, emphasizing crucial features such as points per game, rebounds, assists, field goal percentage, and many other statistics for predictive modeling. The review would analyze and contrast various machine learning algorithms, discussing their accuracy and limitations in predicting basketball scores. Furthermore, it would consider the feasibility of real-time predictions during NBA matches, addressing challenges and proposing potential avenues for future research in basketball score prediction.


Feature Extraction

This includes the following steps (a short sketch follows the list):

- Identify the features (independent variables) that are likely to have predictive power.
- Remove irrelevant or redundant features that do not contribute to the predictive task.
- Convert categorical variables into a format suitable for machine learning models. This may involve one-hot encoding, label encoding, or other methods depending on the nature of the data.
- Split the dataset into training and test sets.
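A minimal sketch of these steps with pandas and scikit-learn; the column names (HOME_TEAM, AWAY_TEAM, HOME_TEAM_WINS, and the *_DIFF features) are hypothetical placeholders, not names taken from this report.

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical frame: column names are placeholders for the real dataset's.
games = pd.DataFrame({
    "HOME_TEAM": ["LAL", "BOS", "MIA", "LAL"],
    "AWAY_TEAM": ["BOS", "MIA", "LAL", "MIA"],
    "PTS_DIFF": [5.0, -2.0, 3.5, 1.0],
    "REB_DIFF": [2.0, -1.0, 0.5, 4.0],
    "HOME_TEAM_WINS": [1, 0, 1, 1],
})

# One-hot encode the categorical team columns; numeric columns pass through.
X = pd.get_dummies(games.drop(columns=["HOME_TEAM_WINS"]),
                   columns=["HOME_TEAM", "AWAY_TEAM"])
y = games["HOME_TEAM_WINS"]

# Hold out a test set for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)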
Decision tree Algorithm:

The decision tree splits a node into sub-nodes, thereby increasing the purity of the nodes with respect to the target variable. A decision tree is similar to a flowchart where each node represents a test on an attribute (independent variable). A decision tree is basically a graphical representation of every possible solution to a decision-based problem, based on certain conditions.
There are some important terms related to forming a decision tree:

● Entropy: a measure of randomness or unpredictability in the dataset.

● Information Gain: the decrease in the entropy of the dataset after splitting it on the basis of an attribute.

● Root Node: represents the entire population or sample, which further gets divided into two or more homogeneous sets.

● Leaf Node: a node that cannot be further segregated.

● Pruning: the opposite of splitting; removing unwanted branches from the tree.
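As an illustration of the entropy and information gain terms above, here is a small self-contained helper (not taken from the report) that computes both for a candidate split:

import numpy as np

def entropy(labels):
    # H = -sum(p_i * log2(p_i)) over the classes present in `labels`.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(parent, left, right):
    # Decrease in entropy after splitting `parent` into `left` and `right`.
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Example: splitting 6 wins / 4 losses into two purer groups.
parent = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
print(information_gain(parent, parent[:5], parent[5:]))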

Random Forest:

A random forest is made of decision trees. Each decision tree can be thought of as a representation of the training data that is split into subpopulations based on a strong differentiating variable. Because each tree is built on a different subset of the training observations, a random forest can easily handle outliers and can prevent overfitting by randomizing new trees during learning. Our data has plenty of features, and a random forest can help unravel complex unknown interactions between predictor variables.

In terms of feature importance, the popular method for random forests, and decision trees in general, is based on the mean decrease in Gini. This is based on the Gini impurity index, which is computed by:

GI = Σ_{i=1}^{n_c} p_i (1 - p_i)

where n_c is the number of classes present in the output (in our case, two) and p_i represents the probability of class i in the training set. The Gini impurity index is a metric of misclassification error.
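A small self-contained illustration (not from the report) of the Gini impurity formula above for the two-class win/loss case:

import numpy as np

def gini_impurity(labels):
    # GI = sum over classes of p_i * (1 - p_i).
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float((p * (1.0 - p)).sum())

# A node with 7 wins and 3 losses: GI = 0.7*0.3 + 0.3*0.7 = 0.42.
print(gini_impurity(np.array([1] * 7 + [0] * 3)))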
7. METHODOLOGY

The methodology involves data extraction, feature engineering to create relevant predictors, model training using a Random Forest classifier, evaluating model performance, and saving the best-performing model for deployment or further analysis. The code implements a pipeline for developing predictive models that forecast the game winner based on various game-related features and machine learning algorithms. Adjustments to hyperparameters, feature selection, or data processing methods can be made to further enhance model performance.
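The report does not reproduce the pipeline code; the following is a condensed sketch of the steps described above, with placeholder data standing in for the selected features so the snippet runs on its own.

import numpy as np
import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder feature matrix and labels standing in for the selected features.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 7))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train the random forest on the training split.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out games and persist the fitted model.
print("test accuracy:", model.score(X_test, y_test))
joblib.dump(model, "basketball_rf_model.joblib")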
8. RESULTS

▪ We use classification accuracy, which measures the percentage of correct predictions made by the model, to evaluate its performance.

▪ We developed a machine learning model using the random forest algorithm to predict whether a basketball team will win or lose a game. Our model achieved 76.74% accuracy, demonstrating its effectiveness in predicting game outcomes.

▪ Confusion matrix: A confusion matrix is a performance evaluation tool in machine learning that tabulates the true positive, true negative, false positive, and false negative outcomes of a classification algorithm. It provides a clear snapshot of model accuracy, precision, recall, and F1 score, aiding in the assessment of predictive performance and error analysis.

▪ Feature importance refers to the significance of different input factors in determining game outcomes. Analyzing feature importance helps identify key elements like player performance, team statistics, or game context that heavily influence the model's ability to predict basketball winners, aiding in refining the predictive model.
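A short sketch of how these metrics could be produced with scikit-learn; the data and variable names are placeholders, not the project's actual code or results.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Placeholder data again; in the project these would be the selected features.
rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 7))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)
pred = model.predict(X_test)

print("accuracy:", accuracy_score(y_test, pred))            # share of correct predictions
print("confusion matrix:\n", confusion_matrix(y_test, pred))
print("feature importances:", model.feature_importances_)   # mean decrease in impurity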

9. CONCLUSION

We found that a basketball team's win record plays a central role in determining its likelihood of winning future games. Winning teams win more because they have the ingredients for success already on the team. However, we were surprised that removing the winning record significantly changed classification accuracy. If we consider a team's win record as representative of that team's ability to win, then this implies that the score statistics fail to completely represent a team's success on the court. This result points to the need for advanced statistics that go beyond the score in order to potentially improve prediction accuracy for close games and upsets. This need explains the growing popularity of advanced-statistics sports conferences like the MIT Sloan conference.

