Project: Predict Football Match Winners With Machine Learning and Python | Foundations of Programming

The document describes a project to predict winners of football/soccer matches using machine learning and Python. It involves collecting historical match data, analyzing it using a Poisson distribution model to calculate the probability of match outcomes, and predicting points and outcomes of matches in the 2022 FIFA World Cup. The model predicted Brazil as the winner. Machine learning soccer predictions have applications for sports betting, marketing, and team management.

Uploaded by

garzonalexa12
Copyright
© All Rights Reserved

Project: Predict Football Match Winners With Machine Learning And Python | Foundations of Programming

👋 Soccer is one of the most popular sports around the world, and many people
enjoy betting on the results of matches. However, predicting the result of a match is
not always easy. Many people call soccer “the unpredictable game” because many
different factors can change the final score of a match.

Objectives:

Develop a simple prediction model using the Python programming language that
is capable of predicting the results of the 2022 FIFA World Cup matches.

Validate, evaluate and adjust the prediction model using machine learning
techniques and data analysis to improve its accuracy and predictability.

Methodology:
Data analysis and machine learning are applied, starting with the collection and
pre-processing of historical data from soccer matches and other factors relevant to
the prediction of results.
The most relevant variables are then selected and a simple prediction model is trained
with the available data. Finally, the model is evaluated and tested to determine its
accuracy in predicting the results of the 2022 FIFA World Cup.

Notes: Before building the predictor, we need to establish the following points:

1. Web-scraping to collect data of past football matches

2. Supervised machine learning using prediction models to predict the results of a
football match on the basis of the collected data

3. Evaluation of the prediction models

Predictor:

The Poisson distribution is a discrete probability distribution that describes the
number of events that occur in a fixed time interval or region of opportunity.

Applied to goals, it gives a probability for every possible scoreline, so it can
separate the three possible outcomes of a match (home win, draw, away win) and works
just as well when only two outcomes are considered.
This is how we can make a basic prediction for a football game winner with the help
of a simple model (in this case, the Poisson distribution).
If we think of a goal as an event that might happen in the 90 minutes of a football
match, we could calculate the probability of the number of goals that could be scored
in a match by Team A and Team B.
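
As a minimal sketch of this idea, the snippet below evaluates the Poisson formula
P(k) = lam^k * exp(-lam) / k! for 0 to 10 goals. The average of 1.5 goals per match
is an assumed, illustrative value and is not taken from the project data; the helper
name poisson_pmf is also hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when lam events are expected."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Assumed average number of goals per match (illustrative value only).
lam = 1.5

# Probability of scoring 0, 1, ..., 10 goals in one match.
goal_probs = [poisson_pmf(k, lam) for k in range(11)]

for k, p in enumerate(goal_probs[:4]):
    print(f"P({k} goals) = {p:.3f}")
```

The probabilities over all goal counts add up to 1, so cutting the range at 10 goals
loses only a negligible amount of probability for typical football averages.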

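Lambda is the single parameter of the Poisson distribution: here, the number of goals
a team is expected to score in a match. To calculate lambda, we need the average goals
scored/conceded by each national team. This leads us to the next point.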

Goals scored/conceded by every national team

After collecting data from all the World Cup matches played from 2002 to 2022, I
could calculate the average goals scored and conceded by each national team.
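
The averaging step could look roughly like the sketch below. The match tuples are
made-up placeholders, not real results, and the function name team_strength is
hypothetical. The project itself keeps these averages in a table called
df_team_strength; plain dictionaries are used here only to keep the example free of
dependencies.

```python
from collections import defaultdict

# Hypothetical sample of results: (home, home_goals, away_goals, away).
matches = [
    ("Team A", 2, 1, "Team B"),
    ("Team B", 0, 0, "Team C"),
    ("Team C", 1, 3, "Team A"),
]


def team_strength(matches):
    """Return {team: (avg_goals_scored, avg_goals_conceded)} per match played."""
    scored = defaultdict(int)
    conceded = defaultdict(int)
    played = defaultdict(int)
    for home, home_goals, away_goals, away in matches:
        scored[home] += home_goals
        conceded[home] += away_goals
        scored[away] += away_goals
        conceded[away] += home_goals
        played[home] += 1
        played[away] += 1
    return {
        team: (scored[team] / played[team], conceded[team] / played[team])
        for team in played
    }


strength = team_strength(matches)
for team, (avg_scored, avg_conceded) in strength.items():
    print(team, avg_scored, avg_conceded)
```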

Results:

1. Any website that publishes match results could be used as the data source.

For instance, we could web-scrape the results of a match from Wikipedia
simply by providing the link to the match scores, such
as https://en.wikipedia.org/wiki/2022_FIFA_World_Cup

2. For performing actual web-scraping, the copied URL would need to be provided
to the web-scraping script or code for extracting the relevant match data.

3. The script would be used to combine all the games in one season into a list or a
.csv file.

4. The copied URL from above would be given as input, along with the id of the
tables containing information about the championship.

5. The compiled list comprising all the matches would be received as output.

6. The information that is unnecessary is omitted, such as the player statistical
data.

7. The information is restricted to contain only match data mapped to team data so
that predictions as to which team will win can be made.

8. The result is stored in a DataFrame that contains the data about matches and teams
(omitting player-specific information).
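
The steps above can be sketched as follows. This is only an illustration under
stated assumptions: the HTML string stands in for a downloaded page (in practice the
page text would be fetched first, for example with urllib.request), and the table id
"matches", the column names, and the class name TableRows are all hypothetical. It
uses only the standard library and writes the filtered rows as CSV text.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every row in the table with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.in_table = False
        self.in_cell = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.table_id:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.rows.append([])
        elif self.in_table and tag in ("td", "th"):
            self.in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_table and self.in_cell:
            self.rows[-1][-1] += data.strip()


# Stand-in for a downloaded page; id and columns are hypothetical.
html = """
<table id="matches">
  <tr><th>home</th><th>score</th><th>away</th><th>top_scorer</th></tr>
  <tr><td>Team A</td><td>2-1</td><td>Team B</td><td>Player X</td></tr>
  <tr><td>Team C</td><td>0-0</td><td>Team D</td><td></td></tr>
</table>
"""

parser = TableRows("matches")
parser.feed(html)
header, *rows = parser.rows

# Keep only match and team columns; drop player-specific data.
keep = [header.index(name) for name in ("home", "score", "away")]
out = io.StringIO()
writer = csv.writer(out)
writer.writerow([header[i] for i in keep])
writer.writerows([[row[i] for i in keep] for row in rows])
print(out.getvalue())
```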

Predicting the group stage


Below is the code I used to predict the number of points each national team would
get in the group stage. It looks intimidating, but it is just the ideas described up
to this point translated into code.
from scipy.stats import poisson

# df_team_strength is assumed to be a pandas DataFrame indexed by team name,
# with the columns 'GoalsScored' and 'GoalsConceded' (averages per match).
def predict_points(home, away):
    if home in df_team_strength.index and away in df_team_strength.index:
        # Expected goals (lambda) for each side
        lamb_home = (df_team_strength.at[home, 'GoalsScored']
                     * df_team_strength.at[away, 'GoalsConceded'])
        lamb_away = (df_team_strength.at[away, 'GoalsScored']
                     * df_team_strength.at[home, 'GoalsConceded'])
        prob_home, prob_away, prob_draw = 0, 0, 0
        for x in range(0, 11):  # number of goals home team
            for y in range(0, 11):  # number of goals away team
                p = poisson.pmf(x, lamb_home) * poisson.pmf(y, lamb_away)
                if x == y:
                    prob_draw += p
                elif x > y:
                    prob_home += p
                else:
                    prob_away += p
        points_home = 3 * prob_home + prob_draw
        points_away = 3 * prob_away + prob_draw
        return (points_home, points_away)
    else:
        # At least one team has no historical data
        return (0, 0)

In plain English, predict_points calculates how many points the home and away
teams would get. To do so, I calculated lambda for each team using the
formula average_goals_scored * average_goals_conceded, where the goals scored are
the team's own average and the goals conceded are the opponent's average.

Then I simulated all the possible scores of a match from 0–0 to 10–10 (that last
score is just the limit of my range of goals). Once I have lambda and the number of
goals x, I use the formula of the Poisson distribution to calculate p, the
probability of that score.

prob_home, prob_draw, and prob_away accumulate the value of p if, say, the
match ends in 1–0 (home wins), 1–1 (draw), or 0–1 (away wins) respectively. Finally,
the points are calculated with the formula below.

points_home = 3 * prob_home + prob_draw


points_away = 3 * prob_away + prob_draw

If we use predict_points to predict the match England vs United States, we’ll get this.

>>> predict_points('England', 'United States')

(2.2356147635326007, 0.5922397535606193)

This means that England would get about 2.24 points, while the USA would get 0.59. I
get decimals because I’m using probabilities rather than actual results.
If we apply this predict_points function to all the matches in the group stage, we’ll
get the 1st and 2nd position of each group, and thus the following matches in the
knockouts.
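
To illustrate how the group table could be built, here is a self-contained sketch.
It restates the logic of predict_points with the standard math module instead of
scipy, and every team name and strength value is a made-up placeholder. The names
pmf, expected_points, and table are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical (avg_goals_scored, avg_goals_conceded) for one group.
strength = {
    "Team A": (2.0, 0.8),
    "Team B": (1.5, 1.0),
    "Team C": (1.0, 1.5),
    "Team D": (0.8, 2.0),
}


def pmf(k, lam):
    """Poisson probability of exactly k goals given an expected value lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def expected_points(home, away, max_goals=10):
    """Expected points for both sides, summing over scores up to max_goals."""
    lam_home = strength[home][0] * strength[away][1]
    lam_away = strength[away][0] * strength[home][1]
    p_home = p_draw = p_away = 0.0
    for x in range(max_goals + 1):
        for y in range(max_goals + 1):
            p = pmf(x, lam_home) * pmf(y, lam_away)
            if x == y:
                p_draw += p
            elif x > y:
                p_home += p
            else:
                p_away += p
    return 3 * p_home + p_draw, 3 * p_away + p_draw


# Every pair of teams in the group meets once.
table = dict.fromkeys(strength, 0.0)
for home, away in combinations(strength, 2):
    pts_home, pts_away = expected_points(home, away)
    table[home] += pts_home
    table[away] += pts_away

# The two teams with the most expected points advance.
ranking = sorted(table, key=table.get, reverse=True)
first, second = ranking[:2]
print(first, second)
```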

I continued with a similar process to predict the knockouts, quarter-finals,
semifinals, and final.
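
One possible way to carry the prediction through the knockout rounds is sketched
below. It assumes that the side with more expected points advances and that ties go
to the first team listed; neither rule is stated in the project. The names
get_winner, play_bracket, and toy_predict are hypothetical, and toy_predict is only
a stand-in with made-up ratings so that the example runs on its own. In the project,
predict_points would be passed instead.

```python
def get_winner(home, away, predict_points):
    """Advance the side with more expected points (ties go to the first team)."""
    points_home, points_away = predict_points(home, away)
    return home if points_home >= points_away else away


def play_bracket(teams, predict_points):
    """Pair neighbouring teams round by round until one team remains.

    Assumes the number of teams is a power of two.
    """
    while len(teams) > 1:
        teams = [
            get_winner(teams[i], teams[i + 1], predict_points)
            for i in range(0, len(teams), 2)
        ]
    return teams[0]


# Stand-in predictor with hypothetical ratings, used only for this sketch.
ratings = {"Team A": 1.9, "Team B": 1.2, "Team C": 1.6, "Team D": 0.9}


def toy_predict(home, away):
    total = ratings[home] + ratings[away]
    return 3 * ratings[home] / total, 3 * ratings[away] / total


champion = play_bracket(["Team A", "Team B", "Team C", "Team D"], toy_predict)
print(champion)
```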

By running the function one more time, I get that the winner is …

Brazil!

Discussion:

Machine learning-based soccer predictions have a wide range of applications,
including sports betting, marketing strategies, and team management. When its
predictions are accurate, machine learning can be a valuable tool
for the world of soccer.

Conclusions:

Predicting the results of soccer matches requires collecting and analyzing large
amounts of data such as player statistics, team tactics, and historical performance.
Machine Learning and Python are powerful tools for predicting results and
performing accurate data analysis, with a wide range of applications in the world of
sports and beyond.
