Wilkinson Draft 2
Sitting on the couch and watching my favorite team play every weekend can bring a full range
of emotions, from excitement to disappointment. This is because soccer games are very
unpredictable and can go in any direction no matter which team you are playing. Because games
are so unpredictable, it affects teams’ tactics for every matchday, makes soccer one of the
hardest sports to bet on, and affects the biggest concern for soccer clubs, which is keeping the
fans happy and engaged with the club. So, using data science, this proposal will develop a model
that tries to identify the factors that predict the outcome of a game so that it can help improve
team performance and tactics. A club would run this model before a game to see whether the
tactics it has prepared for that match are likely to succeed.
This data science problem can be solved using predictive modeling because the model will
analyze previous game data to predict the outcome of future games. Three groups of factors will
be used. The first is previous game data obtained from the scorelines against different clubs.
The second is specific player statistics such as passing, shooting, and defending. The last is
match conditions like the weather. With these factors, a model will be built to help predict the
outcome of a game: winning, drawing, or losing.
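As a minimal sketch of how a scoreline could be turned into the outcome the model predicts, the
Python snippet below labels a match from one club’s point of view. The function name
`label_outcome` and the sample scores are hypothetical and used only for illustration.

```python
def label_outcome(goals_for: int, goals_against: int) -> str:
    """Label a match as 'win', 'draw', or 'loss' from one club's point of view."""
    if goals_for > goals_against:
        return "win"
    if goals_for == goals_against:
        return "draw"
    return "loss"


# Hypothetical scorelines (goals for, goals against), not real match data.
sample_scores = [(2, 1), (0, 0), (1, 3)]
sample_labels = [label_outcome(gf, ga) for gf, ga in sample_scores]
print(sample_labels)
```

Labels like these would serve as the categorical value that the model learns to predict from the
other factors.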
The use we want this model to support depends on the specific audience we are trying to
reach. The first audience is the managers of different clubs from around the world. The next
audience is sports analysts from around the world, to help them in their job, and also gambling
companies, to help them understand the game a bit more. The last audience is the fans, to keep
them engaged with the club and help them understand what the manager and team were trying to
achieve in a specific game. This model would add a lot of value for these audiences if it is able
to accurately predict the outcome of a game. The value added would be more in-depth tactics for
the team for each and every game, help for the club in optimizing its budget and team training to
be as prepared as possible for games, improved betting odds for companies, and better
information to engage the fans.
For this problem a supervised model type will be used. The specific type of model we are
going to use is a Support Vector Machine, because it is helpful in difficult problems where a
clear separation needs to be drawn between the different outcomes of the game: a win, a draw, or
a loss. Since we are using a supervised model type, we need a target variable for the problem,
and the target variable is going to be categorical: the outcome of a game, either win, draw, or
loss. There are two data mining techniques we are going to focus on. The first is finding the key
factors that influence the outcome of the game. The next technique is validating the model’s
efficiency so that the model doesn’t
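One way the validation step could be sketched is k-fold cross-validation, where each group of
matches is held out in turn and scored against labels the model has not seen. The Python sketch
below only scores a simple majority-class baseline, which a trained Support Vector Machine would
need to beat. The SVM itself is not shown, and the function names, the fold count, and the sample
outcomes are all assumptions made for illustration.

```python
from collections import Counter


def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def majority_baseline(train_labels):
    """Return the most common label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def cross_validated_accuracy(labels, k: int = 3) -> float:
    """Average held-out accuracy of the majority-class baseline."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        guess = majority_baseline([labels[i] for i in train_idx])
        correct = sum(1 for i in test_idx if labels[i] == guess)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical outcomes for nine matches, not real results.
outcomes = ["win", "draw", "win", "loss", "win", "draw", "win", "loss", "win"]
baseline_accuracy = cross_validated_accuracy(outcomes, k=3)
print(round(baseline_accuracy, 3))
```

If the real model cannot beat this baseline on held-out matches, that would suggest the chosen
factors are not yet adding predictive value.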
For this problem there are going to be three main resources that we get our data from.
The first resource is historical data from previous games in the different leagues we are
looking at. The second resource is specific player statistics from the individual teams or from
soccer analytics programs. These statistics include player ratings on a scale of 0-10. The third
resource is a meteorologist or weather app, to understand the weather conditions on game days.
There are specific attributes that are going to help our predictive modeling. The first is the
team’s form over the last five games, recorded as wins, draws, or losses. The second is the home
and away advantage for clubs and their rate of performing at home or away. The third is specific
player statistics from games, like goals scored, assists, key passes in the attacking third, any
type of defensive action, and the players’ heat maps. The fourth is the historical results of
games between the two clubs, looking at their previous games to help us determine the outcome.
Finally, there are the weather conditions of games, like the temperature, any precipitation
before, during, or after the game, and the quality of the pitch the game is being played on. This
model will help predict the outcome of games to improve team performance,
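Two of the attributes above, form over the last five games and the home performance rate, could
be computed along the lines of the Python sketch below. The function names, the 3/1/0 points
encoding for form, and the sample results are assumptions chosen for illustration and are not
taken from a real data set.

```python
def form_points(results, window: int = 5) -> int:
    """Points from the most recent `window` results (win=3, draw=1, loss=0)."""
    points = {"win": 3, "draw": 1, "loss": 0}
    return sum(points[r] for r in results[-window:])


def home_win_rate(matches) -> float:
    """Share of home matches won; `matches` holds (venue, result) pairs."""
    home_results = [result for venue, result in matches if venue == "home"]
    if not home_results:
        return 0.0
    return home_results.count("win") / len(home_results)


# Hypothetical recent results, oldest first; not real match data.
recent = ["loss", "win", "win", "draw", "loss", "win"]
matches = [("home", "win"), ("away", "loss"), ("home", "draw"), ("home", "win")]

features = {
    "form_last_5": form_points(recent),
    "home_win_rate": home_win_rate(matches),
}
print(features)
```

Player statistics, head-to-head results, and weather conditions would be added to the same
feature dictionary in a similar way once those data sources are available.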