Data-Engineering EINDE
1. Pre-processing
Our first step was to load the data from Kaggle (Mathien, 2016). We used pandas DataFrames so that everything was immediately in a convenient format. The first merge combined the match dataframe with the player dataframe. As this merged dataframe had already been built for the exploratory analysis, we simply reused it. Once these two dataframes were merged, we proceeded to drop the columns that were not needed for machine learning or that had been duplicated by the merge.
Next, we merged the player dataframe with the player-attributes dataframe and again dropped the unnecessary columns. Another problem we came across was that the player attributes were stored as a time series, which gave an abundance of observations per player. We therefore used groupby to take the average of the continuous variables and the mode of the discrete ones. Since this produced two separate dataframes, we merged the continuous and discrete variables back into a single dataframe.
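As an illustration, a minimal sketch of this aggregation step is given below; df_players_merged is the dataframe used in the appendix, the column subsets are only examples, and taking the first mode on ties is an assumption.

# minimal sketch of the per-player aggregation (column subsets are illustrative)
cont_cols = ['overall_rating', 'potential']            # continuous attributes
disc_cols = ['preferred_foot', 'attacking_work_rate']  # discrete attributes

# mean per player for the continuous variables
df_players_cont = df_players_merged.groupby('player_name')[cont_cols].mean()
# mode per player for the discrete variables (take the first mode if there are ties)
df_players_disc = df_players_merged.groupby('player_name')[disc_cols].agg(lambda s: s.mode().iloc[0])

# merge the continuous and discrete summaries back into one dataframe
df_players_agg = df_players_cont.join(df_players_disc)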
After this we were left with two large dataframes that still had to be merged, which was done in the next step. We then dropped the player names, which we had only used for the merge, because the names themselves should not influence the outcome of a match; only the players' attributes should.
Next, we defined our target variable: whether or not the home team would lose. We determined this by comparing the goals scored by the home team with those scored by the away team. Finally, we merged this target one last time with the big dataframe.
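As a minimal sketch, the target can be derived directly from the goal columns of the Kaggle Match table (the dataframe name df_match is illustrative):

# 1 if the home team lost the match, 0 otherwise
df_match['home_team_lost'] = (df_match['home_team_goal'] < df_match['away_team_goal']).astype(int)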
Then the second big task of the data pre-processing started. Our first step was to check where all the NaNs were in our dataset; we also checked for infinities and zeros.
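A quick sanity check of this kind could look as follows; df_COMPLETE is the merged dataframe from the appendix, and the exact checks used may have differed.

import numpy as np

print(df_COMPLETE.isnull().sum())             # NaN's per column
numeric = df_COMPLETE.select_dtypes(include=[np.number])
print(np.isinf(numeric).sum())                # infinities per numeric column
print((numeric == 0).sum())                   # zeros per numeric column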
2. Exploratory data analysis
In this project we worked with data on more than 25,000 matches played in the seasons 2008 to 2016 (Mathien, 2016). The database contained seven tables: Country, League, Match, Player, Player_Attributes, Team and Team_Attributes, and was provided as an SQLite database.
For the exploratory data analysis, three questions were proposed: for each year, what are the most successful teams? What are the attributes of the top 1% of scoring players? And what is the age of each player in every match? To answer these questions, the first step was to copy the dataframes so that the original data would not be corrupted if a mistake was made. Each question was then answered with its own set of steps.
We first determined which teams were the most successful. Based on the final dataframe we obtained, we made a graph showing the points of the best teams of each year from 2008 to 2016.
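As a rough illustration (not necessarily the exact steps used in the report), season points per team can be derived from the goal columns of the Match table and then reduced to the best score per season:

import numpy as np
import pandas as pd

# points from home fixtures: 3 for a win, 1 for a draw, 0 for a loss
home = df_match[['season', 'home_team_api_id', 'home_team_goal', 'away_team_goal']].copy()
home['points'] = np.select([home['home_team_goal'] > home['away_team_goal'],
                            home['home_team_goal'] == home['away_team_goal']], [3, 1], default=0)
home = home.rename(columns={'home_team_api_id': 'team_api_id'})
# points from away fixtures
away = df_match[['season', 'away_team_api_id', 'home_team_goal', 'away_team_goal']].copy()
away['points'] = np.select([away['away_team_goal'] > away['home_team_goal'],
                            away['away_team_goal'] == away['home_team_goal']], [3, 1], default=0)
away = away.rename(columns={'away_team_api_id': 'team_api_id'})
# total points per team per season, then the best score of every season
season_points = pd.concat([home, away]).groupby(['season', 'team_api_id'])['points'].sum()
best_per_season = season_points.groupby('season').max()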
Secondly, we determined the characteristics of the top 1% of scoring players in a similar series of steps.
From the final dataframe obtained in this part, we made a graph of the player names against their overall rating. The graph shows the average overall rating of the ten top scoring players.
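A minimal sketch of such a bar chart with seaborn, assuming a dataframe df_top_scorers that holds the ten top scoring players and their average overall rating (the name is illustrative):

import seaborn as sns
import matplotlib.pyplot as plt

sns.barplot(data=df_top_scorers, x='player_name', y='overall_rating')
plt.xticks(rotation=45)     # keep the player names readable
plt.tight_layout()
plt.show()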
Finally, to calculate the age of each player for each match, we created loops for both the home and the away teams after making the respective copy of the dataframe; the corresponding code is included in the appendix.
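Condensed to a single player slot, the age calculation looks roughly like this (the full loops over all eleven home and away players are reproduced in the appendix):

import pandas as pd

match_date = pd.to_datetime(df_match_processed['date'])
birthday = pd.to_datetime(df_match_processed['birthday_home_player_1'])
# age at the time of the match, in years rounded to one decimal
df_match_processed['age_home_player_1'] = round((match_date - birthday).dt.days / 365, 1)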
3. Machine learning
Four models were built: a logistic regression, a K-nearest neighbours (KNN) classifier, a decision tree, and a random forest evaluated with K-fold cross-validation. The target variable was whether or not the home team would lose. The input variables were the attributes of the players and three bookmaker odds: the odds that the home team would win, that the match would end in a draw, and that the away team would win. For efficiency's sake we first dropped all columns that were not needed; this was done in the pre-processing step, so the resulting dataframe could be used directly for machine learning.
The first step of the machine learning was to create random indices for a training, validation and test set. We used the validation set to make sure we did not overfit the model, and the test set to calculate the AUC and accuracy. After this, we extracted all features and targets from the dataframe. Then we pre-processed the discrete variables, which in practice means that we dummy-encoded (one-hot encoded) all of them, including the discrete input variables and the target variable. After that we pre-processed the continuous variables by scaling them, so that all variables carried equal weight in the machine learning algorithms.
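A minimal sketch of the dummy-encoding and scaling is given below; discrete_columns and continuous_columns are illustrative placeholders for the actual column lists, and the creation of the random training, validation and test indices is illustrated together with the split in the sketch after the next paragraph.

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# illustrative placeholders for the actual column lists
discrete_columns = ['preferred_foot_home_player_1']
continuous_columns = ['overall_rating_home_player_1', 'BWH']

# dummy-encode ("dummify") the discrete variables
Discrete = pd.get_dummies(df_COMPLETE[discrete_columns])

# scale the continuous variables to [0, 1] so they carry equal weight
scaler = MinMaxScaler()
specs_numbers_prep = pd.DataFrame(scaler.fit_transform(df_COMPLETE[continuous_columns]),
                                  columns=continuous_columns, index=df_COMPLETE.index)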
Secondly, we concatenated all the variables: the discrete and continuous inputs as well as the target variable. The first row of the concatenated dataframe consisted entirely of NaNs, so we dropped it. After that we dropped all remaining NaNs, '_0' and 0 values. This was done on all variables together, so that dropping a row from the training data also dropped it from the target variable. We then took the target variable back out of the dataframe and, lastly, created the training, validation and test sets. Once this was done we could start with the machine learning itself.
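Continuing the previous sketch, the concatenation, cleaning and splitting could look as follows; the column name 'target', the split fractions and the random seed are assumptions, not the exact values used.

import numpy as np
import pandas as pd

# put features and target together so rows are dropped consistently
horizontal_stack = pd.concat([specs_numbers_prep, ODDS_coding, Discrete, TARGET], axis=1)
horizontal_stack = horizontal_stack.iloc[1:]    # the first row was entirely NaN
horizontal_stack = horizontal_stack.dropna()    # drop the remaining rows with missing values

# take the target back out of the dataframe ('target' as column name is an assumption)
TARGET_clean = horizontal_stack.pop('target')

# random indices for the test, validation and training sets (fractions and seed are assumptions)
rng = np.random.default_rng(0)
indices = rng.permutation(len(horizontal_stack))
n_test, n_val = int(0.2 * len(indices)), int(0.1 * len(indices))
indices_test_list = indices[:n_test]
indices_val_list = indices[n_test:n_test + n_val]
indices_train_list = indices[n_test + n_val:]

X_test, Y_test = horizontal_stack.iloc[indices_test_list], TARGET_clean.iloc[indices_test_list]
X_val, Y_val = horizontal_stack.iloc[indices_val_list], TARGET_clean.iloc[indices_val_list]
X_train, Y_train = horizontal_stack.iloc[indices_train_list], TARGET_clean.iloc[indices_train_list]
X_train_total, Y_train_total = pd.concat([X_train, X_val]), pd.concat([Y_train, Y_val])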
The literature states that football matches are hard to predict with machine learning; the market predicted roughly 70% of the games correctly in 2019 (Empirics Asia, 2019), which is about the accuracy we reached. However, we are working with imbalanced data: only 28.77% of the matches were lost by the home team, so a model that always predicts "not lost" already reaches roughly 71% accuracy. This led us to look at the AUC instead of the accuracy.
The first model we checked was the logistic regression. This model outperformed the rest by a small margin: its accuracy was 72.24% and its AUC was 65.22%. The area under the curve is displayed in the ROC graph; the closer the ROC curve lies to the top-left corner, the better the model.
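A minimal sketch of how this model and its ROC curve can be produced with scikit-learn (the hyperparameters are illustrative):

import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score, roc_curve

log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train_total, Y_train_total)

probs = log_reg.predict_proba(X_test)[:, 1]
print('accuracy:', accuracy_score(Y_test, log_reg.predict(X_test)))
print('AUC:', roc_auc_score(Y_test, probs))

# ROC curve: the closer it lies to the top-left corner, the better
fpr, tpr, _ = roc_curve(Y_test, probs)
plt.plot(fpr, tpr, label='logistic regression')
plt.plot([0, 1], [0, 1], linestyle='--', label='chance level')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend()
plt.show()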
The second model we checked was KNN. However, this model had no real discriminative power, as it simply predicted the majority class (the home team not losing) for every match, which led to an accuracy of 71.87%. A way to fix this would have been to use a different cut-off value. Another technique that might have been useful is NearMiss undersampling. We would not recommend SMOTE, since that technique oversamples the minority class with synthetic samples and Python had already warned us that the training dataset was large. However, we also looked at the AUC, as this metric is classification-threshold-invariant; the AUC was 63.33%.
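A minimal sketch of the KNN model (the number of neighbours is an illustrative choice):

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_total, Y_train_total)
knn_probs = knn.predict_proba(X_test)[:, 1]
print('KNN accuracy:', accuracy_score(Y_test, knn.predict(X_test)))
print('KNN AUC:', roc_auc_score(Y_test, knn_probs))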
The last technique we tried was a decision tree, which was by far the worst model. It had the same 71.87% accuracy as KNN and the same problem of having no discriminative power, but its AUC was lower than KNN's: only 60.41%. We also tried a random forest with K-fold cross-validation, but it did not have a significantly higher AUC.
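A minimal sketch of the decision tree and the random forest with K-fold cross-validation (tree depth, forest size and number of folds are illustrative):

from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score
from sklearn.metrics import roc_auc_score

tree = DecisionTreeClassifier(max_depth=5)
tree.fit(X_train_total, Y_train_total)
print('tree AUC:', roc_auc_score(Y_test, tree.predict_proba(X_test)[:, 1]))

forest = RandomForestClassifier(n_estimators=100, random_state=0)
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(forest, X_train_total, Y_train_total, cv=kfold, scoring='roc_auc')
print('random forest AUC (K-fold mean):', scores.mean())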
Model / Metric     Logistic Regression   KNN       Decision Tree   Random Forest
Accuracy           72.24%                71.87%    71.87%          -
AUC                65.22%                63.33%    60.41%          -
To compare the models we looked at the AUC values and the ROC graphs. We chose to show only the two models with the highest AUC values, and as you can see, the difference between them is minimal. We eventually went with the logistic regression because it has the highest AUC and, on top of that, the best accuracy. From our experiments with these machine learning models on football matches we conclude that logistic regression worked best for us.
4. References
● Empirics Asia. (2019, December 19). How I Used Machine Learning to Predict Football Games for 24 Months Straight. Retrieved from Empirics Asia: https://empirics.asia/how-i-used-machine-learning-to-predict-football-games-for-24-months-straight/
● Mathien, H. (2016, October 16). European Soccer Database. Retrieved from Kaggle: https://www.kaggle.com/hugomathien/soccer
● Seaborn documentation: https://seaborn.pydata.org/tutorial.html
● Scikit-learn documentation: https://scikit-learn.org/stable/user_guide.htm
5. Appendix
# import libraries
import pandas as pd
import pyodbc as pdb
import numpy as np
import datetime as dt
import shutil
import math
import os
import sqlite3
import matplotlib.pyplot as plt
import seaborn as sns
from heapq import nlargest
#%% data cleaning: we will now clean both datasets so we can cleanly merge them. This step prepares for the ML
# this dataframe was made in the exploratory part; we put the code at the end of the document
# after BWH,BWA and BWD we drop everything we don't understand (IWH till BSA)
df_match_processed.drop(df_match_processed.columns[4:28], axis=1, inplace=True)
#now we remove the age of the players as this is not explanatory in our model
df_match_processed = df_match_processed[df_match_processed.columns.drop(list(df_match_processed.filter(regex='age')))]
#now we groupby so that there is no time-evolution for the players, but an overall mean (continuous) or mode (discrete)
#first of all we groupby the continuous variables
df_players_cont = df_players_merged.groupby('player_name').mean()
df_match_player_merged = df_match_player_merged.rename(columns={"player_name": f"away_player_name_{x}"})  # f-string so the player index x is filled in
#%% names of the players should not influence the match, but their specs should. So we drop all of their names
df_match_player_merged_final = df_match_player_merged.copy()
df_match_player_merged_final.drop(df_match_player_merged_final.columns[4:26], axis=1, inplace=True)
#getting the results back
return Percen
percent_missing = df_COMPLETE.isnull().sum() * 100 / len(df_COMPLETE)  # percentage of missing values per column
#%% Extracting all features and target from the train dataframe
Target = df_COMPLETE.iloc[:,179]  # the target variable is in the last column of our dataframe
#import libraries
from numpy import asarray
from sklearn.preprocessing import MinMaxScaler
#we make the odds a dataframe instead of a tuple
ODDS_coding = pd.DataFrame(data=specs_numbers_prep1, index=np.array(range(1, 21375)), columns=np.array(range(1, 4)))
#%% We put all the variables together and get the NaN's, _0, ... out
horizontal_stack = pd.concat([specs_numbers_prep, ODDS_coding, Discrete, TARGET], axis=1)
horizontal_stack = horizontal_stack[1:]  # the first row consisted entirely of NaN's
horizontal_stack = horizontal_stack[horizontal_stack.isnull().sum(axis=1) < 2]  # drop rows with 2 or more missing values
#%% we check how many of the matches the home teams lost
Procent = sum(TARGET)/19675  # fraction of matches lost by the home team
#creating the test, train, validation data for the input data
X_train=horizontal_stack.iloc[indices_train_list,]
X_test=horizontal_stack.iloc[indices_test_list,]
X_val=horizontal_stack.iloc[indices_val_list,]
#creating the test, train, validation data for the target variable
Y_train=TARGET.iloc[indices_train_list,]
Y_test=TARGET.iloc[indices_test_list,]
Y_val=TARGET.iloc[indices_val_list,]
X_train_total=pd.concat([X_train, X_val]) #concatenate two data frames
Y_train_total=pd.concat([Y_train, Y_val]) #concatenate two data frames
# Now we need to create a player DataFrame with the info we want, like we did before,
# and attach it to the match dataframe for every player slot
for x in range(1, 12):
    df_player_info = df_player[['player_api_id', 'player_name', 'birthday']]
    # player_column_name was not shown in the original fragment; it is reconstructed
    # here for the home players (an analogous loop was run for the away players)
    player_column_name = f'home_player_{x}'
    df_player_info = df_player_info.rename(columns={
        'player_api_id': player_column_name,
        'player_name': f'name_{player_column_name}',
        'birthday': f'birthday_{player_column_name}',
    })
    # (reconstructed) merge the player info onto the match dataframe for this slot
    df_match_processed = df_match_processed.merge(df_player_info, on=player_column_name, how='left')

for x in range(1, 12):
    # convert the birthday columns to datetimes
    df_match_processed[f'birthday_home_player_{x}'] = pd.to_datetime(df_match_processed[f'birthday_home_player_{x}'])
    df_match_processed[f'birthday_away_player_{x}'] = pd.to_datetime(df_match_processed[f'birthday_away_player_{x}'])
    # age at the time of the match, in years rounded to one decimal
    df_match_processed[f'age_home_player_{x}'] = round((pd.to_datetime(df_match_processed['date']) - df_match_processed[f'birthday_home_player_{x}']).dt.days / 365, 1)
    df_match_processed[f'age_away_player_{x}'] = round((pd.to_datetime(df_match_processed['date']) - df_match_processed[f'birthday_away_player_{x}']).dt.days / 365, 1)

# show results
df_match_processed.head()