
IPL 1st Inning Score Prediction using Machine Learning

The dataset contains ball-by-ball information for the matches played between IPL teams in Seasons 1 to 10, i.e. from 2008 to 2017.
This machine learning model adopts a regression approach to predict the first-innings score of an IPL match.
The dataset can be downloaded from Kaggle.

keyboard_arrow_down Import Necessary Libraries


and mount Google Drive to import the dataset

# Importing Necessary Libraries


import pandas as pd
import numpy as np
np.__version__

'1.25.2'



from sklearn.metrics import mean_absolute_error as mae, mean_squared_error as mse

Mount your Google Drive (or upload the file to the Colab runtime) and make the dataset available as "data.csv".
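If the file lives in Drive, a minimal sketch of mounting it and copying the file into the Colab runtime (the Drive path MyDrive/data.csv is an assumption; adjust it to wherever you stored the file):

# Mount Google Drive (Colab only) and copy the dataset into the runtime
from google.colab import drive
import shutil

drive.mount('/content/drive')
# Assumed location inside Drive; change it to match where you saved data.csv
shutil.copy('/content/drive/MyDrive/data.csv', '/content/data.csv')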

# Mounting GDrive and importing dataset


data = pd.read_csv('/content/data.csv')
print(f"Dataset successfully Imported of Shape : {data.shape}")

Dataset successfully Imported of Shape : (76014, 15)

keyboard_arrow_down Exploratory Data Analysis


# First 5 rows of the dataset
data.head()

mid date venue batting_team bowling_team batsman bowler runs wickets overs runs_last_5 wickets_last_5 striker non-striker total

0 1 2008-04-18 M Chinnaswamy Stadium Kolkata Knight Riders Royal Challengers Bangalore SC Ganguly P Kumar 1 0 0.1 1 0 0 0 222

1 1 2008-04-18 M Chinnaswamy Stadium Kolkata Knight Riders Royal Challengers Bangalore BB McCullum P Kumar 1 0 0.2 1 0 0 0 222

2 1 2008-04-18 M Chinnaswamy Stadium Kolkata Knight Riders Royal Challengers Bangalore BB McCullum P Kumar 2 0 0.2 2 0 0 0 222

3 1 2008-04-18 M Chinnaswamy Stadium Kolkata Knight Riders Royal Challengers Bangalore BB McCullum P Kumar 2 0 0.3 2 0 0 0 222

4 1 2008-04-18 M Chinnaswamy Stadium Kolkata Knight Riders Royal Challengers Bangalore BB McCullum P Kumar 2 0 0.4 2 0 0 0 222

# Describing Numerical Values of the Dataset


data.describe()

mid runs wickets overs runs_last_5 wickets_last_5 striker non-striker total

count 76014.000000 76014.000000 76014.000000 76014.000000 76014.000000 76014.000000 76014.000000 76014.000000 76014.000000

mean 308.627740 74.889349 2.415844 9.783068 33.216434 1.120307 24.962283 8.869287 160.901452

std 178.156878 48.823327 2.015207 5.772587 14.914174 1.053343 20.079752 10.795742 29.246231

min 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 67.000000

25% 154.000000 34.000000 1.000000 4.600000 24.000000 0.000000 10.000000 1.000000 142.000000

50% 308.000000 70.000000 2.000000 9.600000 34.000000 1.000000 20.000000 5.000000 162.000000

75% 463.000000 111.000000 4.000000 14.600000 43.000000 2.000000 35.000000 13.000000 181.000000

max 617.000000 263.000000 10.000000 19.600000 113.000000 7.000000 175.000000 109.000000 263.000000

# Information (not-null count and data type) About Each Column


data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 76014 entries, 0 to 76013
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 mid 76014 non-null int64
1 date 76014 non-null object
2 venue 76014 non-null object
3 batting_team 76014 non-null object
4 bowling_team 76014 non-null object
5 batsman 76014 non-null object
6 bowler 76014 non-null object
7 runs 76014 non-null int64
8 wickets 76014 non-null int64
9 overs 76014 non-null float64
10 runs_last_5 76014 non-null int64
11 wickets_last_5 76014 non-null int64
12 striker 76014 non-null int64
13 non-striker 76014 non-null int64
14 total 76014 non-null int64
dtypes: float64(1), int64(8), object(6)
memory usage: 8.7+ MB

# Number of Unique Values in each column


data.nunique()

mid 617
date 442
venue 35
batting_team 14
bowling_team 14
batsman 411
bowler 329
runs 252
wickets 11
overs 140
runs_last_5 102
wickets_last_5 8
striker 155
non-striker 88
total 138
dtype: int64
# Datatypes of all Columns
data.dtypes

mid int64
date object
venue object
batting_team object
bowling_team object
batsman object
bowler object
runs int64
wickets int64
overs float64
runs_last_5 int64
wickets_last_5 int64
striker int64
non-striker int64
total int64
dtype: object

keyboard_arrow_down Data Cleaning


keyboard_arrow_down Removing Irrelevant Data Columns

# Names of all columns


data.columns

Index(['mid', 'date', 'venue', 'batting_team', 'bowling_team', 'batsman',
       'bowler', 'runs', 'wickets', 'overs', 'runs_last_5', 'wickets_last_5',
       'striker', 'non-striker', 'total'],
      dtype='object')

Here, we can see that the columns ['mid', 'date', 'venue', 'batsman', 'bowler', 'striker', 'non-striker'] won't provide any relevant information for training our model.

irrelevant = ['mid', 'date', 'venue','batsman', 'bowler', 'striker', 'non-striker']


print(f'Before Removing Irrelevant Columns : {data.shape}')
data = data.drop(irrelevant, axis=1) # Drop Irrelevant Columns
print(f'After Removing Irrelevant Columns : {data.shape}')
data.head()

Before Removing Irrelevant Columns : (76014, 15)


After Removing Irrelevant Columns : (76014, 8)
batting_team bowling_team runs wickets overs runs_last_5 wickets_last_5 total

0 Kolkata Knight Riders Royal Challengers Bangalore 1 0 0.1 1 0 222

1 Kolkata Knight Riders Royal Challengers Bangalore 1 0 0.2 1 0 222

2 Kolkata Knight Riders Royal Challengers Bangalore 2 0 0.2 2 0 222

3 Kolkata Knight Riders Royal Challengers Bangalore 2 0 0.3 2 0 222

4 Kolkata Knight Riders Royal Challengers Bangalore 2 0 0.4 2 0 222

keyboard_arrow_down Keeping only Consistent Teams

(teams that have featured in every season, including the current one)

# Define Consistent Teams


const_teams = ['Kolkata Knight Riders', 'Chennai Super Kings', 'Rajasthan Royals',
'Mumbai Indians', 'Kings XI Punjab', 'Royal Challengers Bangalore',
'Delhi Daredevils', 'Sunrisers Hyderabad']

print(f'Before Removing Inconsistent Teams : {data.shape}')


data = data[(data['batting_team'].isin(const_teams)) & (data['bowling_team'].isin(const_teams))]
print(f'After Removing Inconsistent Teams : {data.shape}')
print(f"Consistent Teams : \n{data['batting_team'].unique()}")
data.head()

Before Removing Inconsistent Teams : (76014, 8)


After Removing Inconsistent Teams : (53811, 8)
Consistent Teams :
['Kolkata Knight Riders' 'Chennai Super Kings' 'Rajasthan Royals'
'Mumbai Indians' 'Kings XI Punjab' 'Royal Challengers Bangalore'
'Delhi Daredevils' 'Sunrisers Hyderabad']
batting_team bowling_team runs wickets overs runs_last_5 wickets_last_5 total

0 Kolkata Knight Riders Royal Challengers Bangalore 1 0 0.1 1 0 222

1 Kolkata Knight Riders Royal Challengers Bangalore 1 0 0.2 1 0 222

2 Kolkata Knight Riders Royal Challengers Bangalore 2 0 0.2 2 0 222

3 Kolkata Knight Riders Royal Challengers Bangalore 2 0 0.3 2 0 222

4 Kolkata Knight Riders Royal Challengers Bangalore 2 0 0.4 2 0 222

keyboard_arrow_down Remove First 5 Overs of every match

(rows from the first 5 overs are dropped, since the rolling features runs_last_5 and wickets_last_5 only become meaningful after 5 overs of play)

print(f'Before Removing Overs : {data.shape}')


data = data[data['overs'] >= 5.0]
print(f'After Removing Overs : {data.shape}')
data.head()

Before Removing Overs : (53811, 8)


After Removing Overs : (40108, 8)
batting_team bowling_team runs wickets overs runs_last_5 wickets_last_5 total

32 Kolkata Knight Riders Royal Challengers Bangalore 61 0 5.1 59 0 222

33 Kolkata Knight Riders Royal Challengers Bangalore 61 1 5.2 59 1 222

34 Kolkata Knight Riders Royal Challengers Bangalore 61 1 5.3 59 1 222

35 Kolkata Knight Riders Royal Challengers Bangalore 61 1 5.4 59 1 222

36 Kolkata Knight Riders Royal Challengers Bangalore 61 1 5.5 58 1 222

Before plotting a correlation matrix of the current data, the team names must be converted to numeric values.

keyboard_arrow_down Performing Label Encoding


from sklearn.preprocessing import LabelEncoder, OneHotEncoder

le = LabelEncoder()
data = data.copy()  # work on a copy to avoid SettingWithCopyWarning after the earlier filtering
for col in ['batting_team', 'bowling_team']:
    data[col] = le.fit_transform(data[col])
data.head()

batting_team bowling_team runs wickets overs runs_last_5 wickets_last_5 total

32 3 6 61 0 5.1 59 0 222

33 3 6 61 1 5.2 59 1 222

34 3 6 61 1 5.3 59 1 222

35 3 6 61 1 5.4 59 1 222

36 3 6 61 1 5.5 58 1 222

from seaborn import heatmap


heatmap(data=data.corr(), annot=True)

<Axes: >

data.columns

Index(['batting_team', 'bowling_team', 'runs', 'wickets', 'overs',
       'runs_last_5', 'wickets_last_5', 'total'],
      dtype='object')

# scatter plot between total runs and batting_team


import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plt.scatter(x=data['batting_team'], y=data['total'], color='red')
plt.xlabel('Batting teams')
plt.ylabel('Total Runs')
plt.title('Scatter Plot of Total Runs Scored')
plt.show()

# scatter plot between total runs and bowling_team


import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plt.scatter(x=data['bowling_team'], y=data['total'], color='blue')
plt.xlabel('Bowling teams')
plt.ylabel('Total Runs')
plt.title('Scatter Plot of Total Runs Scored')
plt.show()
# scatter plot between total runs and wickets
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plt.scatter(x=data['wickets'], y=data['total'], color='green')
plt.xlabel('Wickets')
plt.ylabel('Total Runs')
plt.title('Scatter Plot of Total Runs Scored')
plt.show()

keyboard_arrow_down Data Preprocessing and Encoding


keyboard_arrow_down Performing One Hot Encoding and Column Transformation

from sklearn.compose import ColumnTransformer


columnTransformer = ColumnTransformer([('encoder', OneHotEncoder(), [0, 1])],
                                      remainder='passthrough')

data = np.array(columnTransformer.fit_transform(data))

Save the NumPy array in a new DataFrame with the transformed column names

cols = ['batting_team_Chennai Super Kings', 'batting_team_Delhi Daredevils', 'batting_team_Kings XI Punjab',
        'batting_team_Kolkata Knight Riders', 'batting_team_Mumbai Indians', 'batting_team_Rajasthan Royals',
        'batting_team_Royal Challengers Bangalore', 'batting_team_Sunrisers Hyderabad',
        'bowling_team_Chennai Super Kings', 'bowling_team_Delhi Daredevils', 'bowling_team_Kings XI Punjab',
        'bowling_team_Kolkata Knight Riders', 'bowling_team_Mumbai Indians', 'bowling_team_Rajasthan Royals',
        'bowling_team_Royal Challengers Bangalore', 'bowling_team_Sunrisers Hyderabad', 'runs', 'wickets', 'overs',
        'runs_last_5', 'wickets_last_5', 'total']
df = pd.DataFrame(data, columns=cols)
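As an alternative to hard-coding the names, newer scikit-learn releases (1.0+) can derive them from the fitted transformer; a sketch, with the caveat that the generated names reflect the label-encoded categories (e.g. encoder__batting_team_0) rather than the original team strings, and the exact format depends on the scikit-learn version:

# Derive transformed column names from the fitted ColumnTransformer (scikit-learn >= 1.0)
cols_auto = columnTransformer.get_feature_names_out()
df_auto = pd.DataFrame(data, columns=cols_auto)
df_auto.head()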

# Visualize Encoded Data


df.head()

(Wide one-hot-encoded DataFrame: in the first five rows, batting_team_Kolkata Knight Riders and bowling_team_Royal Challengers Bangalore are 1.0 and every other team indicator column is 0.0, followed by the numeric columns runs, wickets, overs, runs_last_5, wickets_last_5 and total.)

5 rows × 22 columns

keyboard_arrow_down Model Building


keyboard_arrow_down Prepare Train and Test Splits
features = df.drop(['total'], axis=1)
labels = df['total']

# Perform 80 : 20 Train-Test split


from sklearn.model_selection import train_test_split
train_features, test_features, train_labels, test_labels = train_test_split(features, labels, test_size=0.20, shuffle=True)
print(f"Training Set : {train_features.shape}\nTesting Set : {test_features.shape}")

Training Set : (32086, 21)


Testing Set : (8022, 21)

keyboard_arrow_down Model Algorithms


Training and testing different machine learning algorithms to select the best one

# Keeping track of model performances


models = dict()
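Every model below is evaluated the same way (train/test R² plus MAE, MSE and RMSE), so a small helper such as the hypothetical evaluate_model sketched here could remove the repetition; it only uses objects already defined above and is not part of the original notebook:

# Hypothetical helper: fit a model, report train/test R^2 and test error metrics,
# and record the test score in the `models` dictionary
def evaluate_model(name, model):
    model.fit(train_features, train_labels)
    train_r2 = model.score(train_features, train_labels) * 100
    test_r2 = model.score(test_features, test_labels) * 100
    preds = model.predict(test_features)
    print(f'{name} -> Train Score : {train_r2:.2f}% | Test Score : {test_r2:.2f}%')
    print(f'MAE : {mae(test_labels, preds):.3f} | MSE : {mse(test_labels, preds):.3f} | RMSE : {np.sqrt(mse(test_labels, preds)):.3f}')
    models[name] = str(test_r2)
    return model

# Example usage: evaluate_model("tree", DecisionTreeRegressor())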

keyboard_arrow_down 1. Decision Tree Regressor

from sklearn.tree import DecisionTreeRegressor


tree = DecisionTreeRegressor()
# Train Model
tree.fit(train_features, train_labels)

▾ DecisionTreeRegressor
DecisionTreeRegressor()

# Evaluate Model
train_score_tree = str(tree.score(train_features, train_labels) * 100)
test_score_tree = str(tree.score(test_features, test_labels) * 100)
print(f'Train Score : {train_score_tree[:5]}%\nTest Score : {test_score_tree[:5]}%')
models["tree"] = test_score_tree

Train Score : 99.98%


Test Score : 86.30%

from sklearn.metrics import mean_absolute_error as mae, mean_squared_error as mse


print("---- Decision Tree Regressor - Model Evaluation ----")
print("Mean Absolute Error (MAE): {}".format(mae(test_labels, tree.predict(test_features))))
print("Mean Squared Error (MSE): {}".format(mse(test_labels, tree.predict(test_features))))
print("Root Mean Squared Error (RMSE): {}".format(np.sqrt(mse(test_labels, tree.predict(test_features)))))

---- Decision Tree Regressor - Model Evaluation ----


Mean Absolute Error (MAE): 3.9104338070306657
Mean Squared Error (MSE): 122.84888431812516
Root Mean Squared Error (RMSE): 11.08372159151091

keyboard_arrow_down Logistic Regression

Note: LogisticRegression is a classifier, not a regressor. It treats every distinct total as a separate class, so its score below is classification accuracy rather than R², and its errors are much larger than those of the regression models.

from sklearn.linear_model import LogisticRegression

# Define and train the Logistic Regression model


logistic_regression = LogisticRegression()
logistic_regression.fit(train_features, train_labels)

/usr/local/lib/python3.10/dist-packages/sklearn/linear_model/_logistic.py:458: ConvergenceWarning: lbfgs failed to converge (status=1):


STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
▾ LogisticRegression
LogisticRegression()

# Evaluate the model on the test set


test_score_logistic = str(logistic_regression.score(test_features, test_labels) * 100)
print(f'Test Score : {test_score_logistic[:5]}%')

# Store the model in the dictionary


models["logistic_regression"] = test_score_logistic

Test Score : 10.69%

print("---- Linear Regression - Model Evaluation ----")


print("Mean Absolute Error (MAE): {}".format(mae(test_labels, logistic_regression.predict(test_features))))
print("Mean Squared Error (MSE): {}".format(mse(test_labels, logistic_regression.predict(test_features))))
print("Root Mean Squared Error (RMSE): {}".format(np.sqrt(mse(test_labels, logistic_regression.predict(test_features)))))

---- Logistic Regression - Model Evaluation ----


Mean Absolute Error (MAE): 15.277736225380204
Mean Squared Error (MSE): 445.39042632759913
Root Mean Squared Error (RMSE): 21.10427507230701

keyboard_arrow_down Ridge Regression

from sklearn.linear_model import Ridge


ridge = Ridge()
ridge.fit(train_features, train_labels)

▾ Ridge
Ridge()

test_score_ridge = str(ridge.score(test_features, test_labels) * 100)


print(f'Test Score : {test_score_ridge[:5]}%')
models["ridge"] = test_score_ridge

Test Score : 65.27%


print("---- Ridge Regression - Model Evaluation ----")
print("Mean Absolute Error (MAE): {}".format(mae(test_labels, ridge.predict(test_features))))
print("Mean Squared Error (MSE): {}".format(mse(test_labels, ridge.predict(test_features))))
print("Root Mean Squared Error (RMSE): {}".format(np.sqrt(mse(test_labels, ridge.predict(test_features)))))

---- Ridge Regression - Model Evaluation ----


Mean Absolute Error (MAE): 13.132425008936426
Mean Squared Error (MSE): 311.5373680439042
Root Mean Squared Error (RMSE): 17.650421186020015

keyboard_arrow_down Linear Regression

from sklearn.linear_model import LinearRegression


linreg = LinearRegression()
# Train Model
linreg.fit(train_features, train_labels)

▾ LinearRegression
LinearRegression()

# Evaluate Model
train_score_linreg = str(linreg.score(train_features, train_labels) * 100)
test_score_linreg = str(linreg.score(test_features, test_labels) * 100)
print(f'Train Score : {train_score_linreg[:5]}%\nTest Score : {test_score_linreg[:5]}%')
models["linreg"] = test_score_linreg

Train Score : 66.15%


Test Score : 64.93%

print("---- Linear Regression - Model Evaluation ----")


print("Mean Absolute Error (MAE): {}".format(mae(test_labels, linreg.predict(test_features))))
print("Mean Squared Error (MSE): {}".format(mse(test_labels, linreg.predict(test_features))))
print("Root Mean Squared Error (RMSE): {}".format(np.sqrt(mse(test_labels, linreg.predict(test_features)))))

---- Linear Regression - Model Evaluation ----


Mean Absolute Error (MAE): 13.136500925128422
Mean Squared Error (MSE): 311.168758694234
Root Mean Squared Error (RMSE): 17.639976153448565

keyboard_arrow_down Random Forest Regression

from sklearn.ensemble import RandomForestRegressor


forest = RandomForestRegressor()
# Train Model
forest.fit(train_features, train_labels)

▾ RandomForestRegressor
RandomForestRegressor()

# Evaluate Model
train_score_forest = str(forest.score(train_features, train_labels)*100)
test_score_forest = str(forest.score(test_features, test_labels)*100)
print(f'Train Score : {train_score_forest[:5]}%\nTest Score : {test_score_forest[:5]}%')
models["forest"] = test_score_forest

Train Score : 99.03%


Test Score : 93.28%

print("---- Random Forest Regression - Model Evaluation ----")


print("Mean Absolute Error (MAE): {}".format(mae(test_labels, forest.predict(test_features))))
print("Mean Squared Error (MSE): {}".format(mse(test_labels, forest.predict(test_features))))
print("Root Mean Squared Error (RMSE): {}".format(np.sqrt(mse(test_labels, forest.predict(test_features)))))

---- Random Forest Regression - Model Evaluation ----


Mean Absolute Error (MAE): 4.456521777908372
Mean Squared Error (MSE): 60.23483851765969
Root Mean Squared Error (RMSE): 7.761110649749796

keyboard_arrow_down Lasso Regression

from sklearn.linear_model import LassoCV


lasso = LassoCV()
# Train Model
lasso.fit(train_features, train_labels)

▾ LassoCV
LassoCV()

# Evaluate Model
train_score_lasso = str(lasso.score(train_features, train_labels)*100)
test_score_lasso = str(lasso.score(test_features, test_labels)*100)
print(f'Train Score : {train_score_lasso[:5]}%\nTest Score : {test_score_lasso[:5]}%')
models["lasso"] = test_score_lasso

Train Score : 65.06%


Test Score : 64.43%

print("---- Lasso Regression - Model Evaluation ----")


print("Mean Absolute Error (MAE): {}".format(mae(test_labels, lasso.predict(test_features))))
print("Mean Squared Error (MSE): {}".format(mse(test_labels, lasso.predict(test_features))))
print("Root Mean Squared Error (RMSE): {}".format(np.sqrt(mse(test_labels, lasso.predict(test_features)))))

---- Lasso Regression - Model Evaluation ----


Mean Absolute Error (MAE): 13.17142720370711
Mean Squared Error (MSE): 319.0856214179204
Root Mean Squared Error (RMSE): 17.862967878208828

keyboard_arrow_down Support Vector Machine

from sklearn.svm import SVR


svm = SVR()
# Train Model
svm.fit(train_features, train_labels)
train_score_svm = str(svm.score(train_features, train_labels)*100)
test_score_svm = str(svm.score(test_features, test_labels)*100)
print(f'Train Score : {train_score_svm[:5]}%\nTest Score : {test_score_svm[:5]}%')
models["svm"] = test_score_svm

print("---- Support Vector Regression - Model Evaluation ----")


print("Mean Absolute Error (MAE): {}".format(mae(test_labels, svm.predict(test_features))))
print("Mean Squared Error (MSE): {}".format(mse(test_labels, svm.predict(test_features))))
print("Root Mean Squared Error (RMSE): {}".format(np.sqrt(mse(test_labels, svm.predict(test_features)))))

keyboard_arrow_down Neural Networks

from sklearn.neural_network import MLPRegressor


neural_net = MLPRegressor(activation='logistic', max_iter=500)
# Train Model
neural_net.fit(train_features, train_labels)

train_score_neural_net = str(neural_net.score(train_features, train_labels)*100)


test_score_neural_net = str(neural_net.score(test_features, test_labels)*100)
print(f'Train Score : {train_score_neural_net[:5]}%\nTest Score : {test_score_neural_net[:5]}%')
models["neural_net"] = test_score_neural_net

print("---- Neural Networks Regression - Model Evaluation ----")


print("Mean Absolute Error (MAE): {}".format(mae(test_labels, neural_net.predict(test_features))))
print("Mean Squared Error (MSE): {}".format(mse(test_labels, neural_net.predict(test_features))))
print("Root Mean Squared Error (RMSE): {}".format(np.sqrt(mse(test_labels, neural_net.predict(test_features)))))

keyboard_arrow_down Best Model Selection


from seaborn import barplot
model_names = list(models.keys())
accuracy = list(map(float, models.values()))
barplot(x=model_names, y=accuracy)

From the plot above, Random Forest performs best, closely followed by Decision Tree and the neural network, so Random Forest is chosen for the final model.
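The same conclusion can be read straight off the models dictionary; a quick sketch (not in the original notebook):

# Pick the model with the highest recorded test score
best_name = max(models, key=lambda name: float(models[name]))
print(f'Best model : {best_name} ({float(models[best_name]):.2f}%)')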

keyboard_arrow_down Predictions
def predict_score(batting_team, bowling_team, runs, wickets, overs, runs_last_5, wickets_last_5, model=forest):
    prediction_array = []
    # Batting Team (one-hot encoded, same column order as the training data)
    if batting_team == 'Chennai Super Kings':
        prediction_array = prediction_array + [1,0,0,0,0,0,0,0]
    elif batting_team == 'Delhi Daredevils':
        prediction_array = prediction_array + [0,1,0,0,0,0,0,0]
    elif batting_team == 'Kings XI Punjab':
        prediction_array = prediction_array + [0,0,1,0,0,0,0,0]
    elif batting_team == 'Kolkata Knight Riders':
        prediction_array = prediction_array + [0,0,0,1,0,0,0,0]
    elif batting_team == 'Mumbai Indians':
        prediction_array = prediction_array + [0,0,0,0,1,0,0,0]
    elif batting_team == 'Rajasthan Royals':
        prediction_array = prediction_array + [0,0,0,0,0,1,0,0]
    elif batting_team == 'Royal Challengers Bangalore':
        prediction_array = prediction_array + [0,0,0,0,0,0,1,0]
    elif batting_team == 'Sunrisers Hyderabad':
        prediction_array = prediction_array + [0,0,0,0,0,0,0,1]
    # Bowling Team (one-hot encoded, same column order as the training data)
    if bowling_team == 'Chennai Super Kings':
        prediction_array = prediction_array + [1,0,0,0,0,0,0,0]
    elif bowling_team == 'Delhi Daredevils':
        prediction_array = prediction_array + [0,1,0,0,0,0,0,0]
    elif bowling_team == 'Kings XI Punjab':
        prediction_array = prediction_array + [0,0,1,0,0,0,0,0]
    elif bowling_team == 'Kolkata Knight Riders':
        prediction_array = prediction_array + [0,0,0,1,0,0,0,0]
    elif bowling_team == 'Mumbai Indians':
        prediction_array = prediction_array + [0,0,0,0,1,0,0,0]
    elif bowling_team == 'Rajasthan Royals':
        prediction_array = prediction_array + [0,0,0,0,0,1,0,0]
    elif bowling_team == 'Royal Challengers Bangalore':
        prediction_array = prediction_array + [0,0,0,0,0,0,1,0]
    elif bowling_team == 'Sunrisers Hyderabad':
        prediction_array = prediction_array + [0,0,0,0,0,0,0,1]
    # Remaining numeric features, then predict
    prediction_array = prediction_array + [runs, wickets, overs, runs_last_5, wickets_last_5]
    prediction_array = np.array([prediction_array])
    pred = model.predict(prediction_array)
    return int(round(pred[0]))
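For reference, a more compact equivalent of predict_score, assuming the same eight teams and the same column order used during training (a sketch, not from the original notebook):

# The eight consistent teams, in the alphabetical order produced by the one-hot encoding
TEAMS = ['Chennai Super Kings', 'Delhi Daredevils', 'Kings XI Punjab',
         'Kolkata Knight Riders', 'Mumbai Indians', 'Rajasthan Royals',
         'Royal Challengers Bangalore', 'Sunrisers Hyderabad']

def predict_score_compact(batting_team, bowling_team, runs, wickets, overs,
                          runs_last_5, wickets_last_5, model=forest):
    # Two one-hot blocks (batting, bowling) followed by the numeric features
    row = ([1 if t == batting_team else 0 for t in TEAMS]
           + [1 if t == bowling_team else 0 for t in TEAMS]
           + [runs, wickets, overs, runs_last_5, wickets_last_5])
    return int(round(model.predict(np.array([row]))[0]))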

keyboard_arrow_down Test 1
Batting Team : Delhi Daredevils
Bowling Team : Chennai Super Kings
Final Score : 147/9

batting_team='Delhi Daredevils'
bowling_team='Chennai Super Kings'
score = predict_score(batting_team, bowling_team, overs=10.2, runs=68, wickets=3, runs_last_5=29, wickets_last_5=1)
print(f'Predicted Score : {score} || Actual Score : 147')

keyboard_arrow_down Test 2
Batting Team : Mumbai Indians
Bowling Team : Kings XI Punjab
Final Score : 176/7

batting_team='Mumbai Indians'
bowling_team='Kings XI Punjab'
score = predict_score(batting_team, bowling_team, overs=12.3, runs=113, wickets=2, runs_last_5=55, wickets_last_5=0)
print(f'Predicted Score : {score} || Actual Score : 176')

keyboard_arrow_down Live* Test 1 (2020 season)


Batting Team : Kings XI Punjab
Bowling Team : Rajasthan Royals
Final Score : 185/4
This test was done before the match; the final score was added afterwards.

# Live Test
batting_team="Kings XI Punjab"
bowling_team="Rajasthan Royals"
score = predict_score(batting_team, bowling_team, overs=14.0, runs=118, wickets=1, runs_last_5=45, wickets_last_5=0)
print(f'Predicted Score : {score} || Actual Score : 185')

keyboard_arrow_down Live Test 2 (2020 Season)


Batting Team : Kolkata Knight Riders
Bowling Team : Chennai Super Kings
Final Score : 172/5

# Live Test
batting_team="Kolkata Knight Riders"
bowling_team="Chennai Super Kings"
score = predict_score(batting_team, bowling_team, overs=18.0, runs=150, wickets=4, runs_last_5=57, wickets_last_5=1)
print(f'Predicted Score : {score} || Actual Score : 172')

keyboard_arrow_down Live Test 3 (2020 Season)


Batting Team : Delhi Daredevils
Bowling Team : Mumbai Indians
Final Score : 110/7

batting_team='Delhi Daredevils'
bowling_team='Mumbai Indians'
score = predict_score(batting_team, bowling_team, overs=18.0, runs=96, wickets=8, runs_last_5=18, wickets_last_5=4)
print(f'Predicted Score : {score} || Actual Score : 110')

keyboard_arrow_down Live Test 4 (2020 Season)


Batting Team : Kings XI Punjab
Bowling Team : Chennai Super Kings
Final Score : 153/9

batting_team='Kings XI Punjab'
bowling_team='Chennai Super Kings'
score = predict_score(batting_team, bowling_team, overs=18.0, runs=129, wickets=6, runs_last_5=34, wickets_last_5=2)
print(f'Predicted Score : {score} || Actual Score : 153')

keyboard_arrow_down Export Model


from joblib import dump

dump(forest, "forest_model.pkl")
dump(tree, "tree_model.pkl")
dump(neural_net, "neural_nets_model.pkl")

