0% found this document useful (0 votes)
16 views36 pages

Example Maths IA

Here is an example Math IA if you need to see what it may look like

Uploaded by

Ethan Fong
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Brief Introduction of xG

xG (expected goals) is a metric created in 2012 by Opta's Sam Green to measure how good a scoring chance is. It is calculated from the shots a player takes by estimating the likelihood of each shot being a goal. During the 22/23 season, Darwin Nuñez was predicted to have an xG of 23.47 but underperformed, scoring just 13 Premier League goals, 10.47 fewer than expected (Fotmob 2023). Haaland shows a similar pattern: he was expected to have an xG of 30.16 but underperformed slightly by scoring 26 goals. In contrast, other notable players such as Kane and Mbappe outperformed their xG (with Kane scoring 39 more goals than the 32.92 xG expected of him and Mbappe scoring 30 more goals than the 25.75 xG expected of him) (Ogden 2024). This suggests that xG may not always be a reliable indicator, as it is seen to be largely subjective, which makes it hard to assess player performance accurately in a single season (Fotmob 2023).

Expected goals (xG) as a team performance indicator has been questioned for its accuracy in providing an overview of how teams will perform within their respective leagues. On one hand, xG can give moderately accurate predictions of results. For instance, it gives 66% accuracy for home game results and 58% accuracy for away game results, showing that it can be relied on to some extent for predicting results (Football XG 2024). However, xG has difficulty maintaining accuracy on a consistent basis. For example, during the 19/20 season, xG-based results expected Manchester City to win the Premier League by 13 points. Despite this, Liverpool won the Premier League title, even though they had been predicted to score 39 points fewer over the past 2 seasons (MacInnes 2020). This shows the limitations of xG as a reliable measure of team performance, as it is unable to consider other factors, such as the quality of players and the tactical aspects of the team.

Aim: To investigate correlations between the 5 statistics of team performance and xG .


Introduction
Team performance is currently measured using the 5 statistics of GF (Goals For), GA (Goals Against), W (Wins), L (Losses) and Pts (Points). Goals For is the number of goals a team scores against its opponents in a season. Goals Against is the number of goals a team concedes to its opponents in a season. Wins are the number of matches a team wins in a season. Losses are the number of matches a team loses in a season. Lastly, points are the values a team accumulates over a season, which determine its league placement, where a win = 3 points, a draw = 1 point and a loss = 0 points. xG only became mainstream in 2017, when it appeared on Match of the Day's post-match statistical round-up (Willams 2020).
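The points rule above can be checked against a row from the Results tables. The sketch below uses Napoli's figures from this report (28 wins, 4 losses, 90 points); the 38-match season length and the function name `league_points` are assumptions added for illustration.

```python
def league_points(wins: int, draws: int) -> int:
    """Points under the system described above: win = 3, draw = 1, loss = 0."""
    return 3 * wins + 1 * draws

# Napoli's figures as listed in this report: 28 wins and 4 losses.
# Assumption: a 38-match season, so the remaining matches are draws.
matches, wins, losses = 38, 28, 4
draws = matches - wins - losses
napoli_points = league_points(wins, draws)
print(napoli_points)
```

The result can be compared with the 90 points listed for Napoli in the Pts table.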

Hence, the investigation was conducted using the Pearson correlation coefficient and Spearman's rank correlation to measure the relationship between each statistic and xG. Subsequently, the test results for each factor of team performance are compared against xG. The data is presented using scatter plots.

I chose this topic because I am passionate about football and because I am interested in whether xG can give objective, reliable predictions that football fans can use to assess how well their teams will do in their respective leagues. Furthermore, I am interested in unpacking the xG debate: xG is controversial because it is used as a marker of team performance despite its inaccuracies. Hence, my math IA looks to settle the debate by examining the validity of xG in football analysis.

Method of Data Collection


The data was collected from FBref.com. This source is reliable because its data comes from Opta, a company that collects accurate and reliable football data (source):

• 2022-2023 La Liga (FBref.com, 2023a)

• 2022-2023 Premier League (FBref.com, 2023b)


• 2022-2023 Serie A (FBref.com, 2023c)

The factors xG, GF, GA, W, L and Pts were chosen because they are the most widely recognised current statistics for determining team performance. Data was collated using Excel. Data was grouped to compare xG with GF, GA, W, L and Pts, as factors relating to team performance can also influence a team's league standing. Thus, comparing them with xG helps to show whether these factors are good for determining team performance.

The r value shows the strength and direction of the linear relationship between two variables (Turner 2024), whereas the r² value shows how well the data fits the model (Abba 2024).
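For reference, the two quantities can be written out as follows. This is the standard textbook form of the Pearson coefficient for n paired observations; taking x_i to be a team's xG and y_i to be one of the five statistics is a notational assumption, not something stated in the sources cited above.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
r^2 = r \times r
```

For a straight-line fit with one independent variable, r² gives the proportion of the variation in y that is explained by the line, so it always lies between 0 and 1 whatever the sign of r.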

The independent variable of my analysis is xG and the dependent variables are the five statistics of team performance: wins, losses, points, goals against and goals for.

Research Question
Does xG determine team performance across the five factors that influence it?

Hypothesis: xG is a key indicator for all five statistics in determining team performance.

Results
Positive Correlations
Scatter plot of xG and GF of the Top Three European Leagues (Data: FBref.com)
(See Appendix 1 for Raw Data for GF and xG)

The graph seems to show a strong positive correlation between xG and GF.

Pearson Test on xG and GF:

Pearson Test Results


RegEqn: m*x+b M: 0.711
B: 14.8 R2 = 0.829081
R= 0.910539 Linear Equation: 𝑦 = 0.711𝑥 + 14.8
(See Appendix 1 for Raw Data for GF and xG used for the test and Appendix 2 for the

results)

The results suggest that xG and GF have a strong positive linear relationship (r = 0.9105). This suggests that GF increases as xG increases. Furthermore, the r² value of 0.829 shows that there is about an 83% fit in the data.

Spearmans Ranking on xG and GF:


Spearman Rank Results
RegEqn:m*x+b M:1

B:0 r=0.999347

R2= 0.998694
(See Appendix 1 for Raw Data for xG and GF used)

This indicates that xG and GF have a strong positive linear relationship between their ranks, as the r value (0.999) is close to 1. This implies that when xG increases, GF also increases. Further, the r² value shows a 99% fit in the ranked data, strongly supporting this conclusion.
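As a rough illustration of how a Spearman rank test works, the sketch below ranks each variable and then applies the Pearson formula to the ranks. It uses only the four Serie A teams for which both xG and GF appear in the tables of this report, so the result is illustrative and is not the value reported above. The function names `ranks`, `pearson` and `spearman` are hypothetical helpers, not the calculator functions used for the reported results.

```python
from statistics import mean

def ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Inter, Napoli, Sampdoria, Hellas Verona (xG and GF as listed in this report).
xg = [68.0, 64.7, 34.1, 35.8]
gf = [71, 77, 24, 31]
rho = spearman(xg, gf)
print(round(rho, 3))
```

Each team's xG rank stays paired with the same team's GF rank throughout, which is what allows the test to compare the two orderings.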

Across the top three leagues in Europe, there seems to be a correlation between xG and goals scored, as well as team performance, but there are some exceptions.

Ranking XG GF League Standing


1 Inter (68.0) Napoli (77) Napoli
2 Napoli (64.7) Inter (71) Lazio
3 Milan (58.8) Atalanta (66) Inter
Table 1: Comparison of the Top Three rankings of XG, GF and League Standing in Serie A
Of the three teams with the highest xG, Inter and Napoli scored the most goals, which ranked them second and first for GF. This suggests some correlation, with xG being a good determiner of GF and therefore of team performance. However, Milan, despite being in the top three for xG, are in the top three for neither league standing nor GF, which suggests that other factors may affect team performance.
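The top-three comparison in Table 1 can also be expressed as a simple overlap count. The sketch below copies the three columns of Table 1; the helper name `overlap` is an illustrative assumption.

```python
# Columns of Table 1 (Serie A), as listed in this report.
top3_xg = ["Inter", "Napoli", "Milan"]
top3_gf = ["Napoli", "Inter", "Atalanta"]
top3_standing = ["Napoli", "Lazio", "Inter"]

def overlap(a, b):
    """Teams that appear in both top-three lists, in alphabetical order."""
    return sorted(set(a) & set(b))

shared_gf = overlap(top3_xg, top3_gf)
shared_standing = overlap(top3_xg, top3_standing)
print(shared_gf, shared_standing)
```

A count of 2 out of 3 in both comparisons matches the description of Milan as the exception.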

Ranking xG GF League Standing


20 Sampdoria (34.1) Sampdoria (24) Sampdoria
19 Hellas Verona (35.8) Hellas Verona (31) Cremonese
18 Lecce (36.1) Spezia (31) Hellas Verona
Of the three teams with the lowest xG, Sampdoria and Hellas Verona scored the fewest goals, which led to them being placed in the bottom three of the league. This implies a correlation, with xG being an indicator of GF and team performance. The exception is Lecce: despite having one of the three lowest xG values, they were not among the bottom three teams, indicating the influence of external factors on team performance.

Ranking xG GF League Standing


1 Barcelona (75.5) Real Madrid (75) Barcelona
2 Real Madrid (75.5) Barcelona (70) Real Madrid
3 Atlético Madrid (61.9) Atlético Madrid (70) Atlético Madrid
Table 2: Comparison of the Top Three rankings of xG, GF and League standing in La Liga

The results in La Liga indicate that xG is a good determiner of GF and therefore of team performance. The three teams with the highest xG were also the top three for league standing and GF. This suggests that the La Liga results strongly support the hypothesis that xG determines team performance, based on GF.

Ranking xG GF League Standing


20 Mallorca (35.2) Elche (30) Elche
19 Getafe (36.7) Cádiz (30) Espanyol
18 Elche (37.5) Valladolid (33) Valladolid
The results show no correlation between xG and GF in determining team performance, as the teams with the lowest xG are not the lowest ranked for GF and league standing. This implies that the La Liga results do not support the hypothesis that xG can be used to judge team performance based on GF. The exception is Elche, whose results suggest that xG can determine team performance based on GF and league standing.

Ranking xG GF League Standing


1 Manchester City (78.6) Manchester City (94) Manchester City
2 Brighton (73.3) Arsenal (88) Arsenal
3 Newcastle Utd (71.9) Liverpool (75) Manchester Utd
Table 3: Comparison of the Top Three rankings of xG, GF and League Standing in The Premier League

The results in the EPL indicate a low correlation. Of the three teams with the highest xG, only Manchester City placed in the top three for GF and league standing. Their case suggests that xG can predict team performance: the higher the xG, the more goals scored and thus the higher the league standing. However, Brighton and Newcastle Utd, despite being in the top three for xG, are in the top three for neither league standing nor GF, indicating that xG is not an accurate measure of team performance.

Ranking xG GF League Standing


20 Wolves (36.8) Wolves (31) Southampton
19 Southampton (37.8) Everton (34) Leeds United
18 Bournemouth (38.5) Southampton (36) Leicester City

The results show no correlation: xG does not seem to have played a role in determining GF and league standing in the EPL. This is evident from Wolves, who had the lowest xG and GF but were not ranked in the bottom three of the league standings, which demonstrates that xG is an imperfect indicator of team performance based on league standing and GF. The exception is Southampton, who had one of the lowest xG values and were ranked in the bottom three for both league standing and GF.

Overall, xG and GF seem to have played a bigger role in determining team performance in La Liga, as the more goals a team scores, the more likely it is to perform well. This is seen in Barcelona and Atlético Madrid being joint second for goals scored, which allowed them to finish in the top three in La Liga. This could imply that La Liga teams are more expansive in their playstyle, with an emphasis on attacking tactics to improve their league performance. However, for the Premier League and Serie A, the number of goals a team scores does not seem to correlate strongly with its league standing, implying that factors other than GF influence team performance.

Scatter Plot of xG and W of the Top Three European Leagues (Data: FBref.com)

(See Appendix 1 for Raw Data for xG and W used)

xG and W seem to show a moderate positive relationship. This indicates that xG somewhat determines the number of wins a team gets and therefore its team performance.

Pearson Test on xG and W:

Pearson Test Results


RegEqn: m*x+b M: 1.63
B: 27.0 R2 = 0.7063
R = 0.8404 Linear Equation: 𝑦=1.63𝑥+26.99
(See Appendix 1 For Raw Data for xG and W)
The r value (0.84) suggests a strong positive linear relationship between xG and wins. This means that the higher the xG, the more wins a team gets, thus affecting team performance. The r² value (0.71) suggests that the data fits the model.

Spearmans Ranking on xG and W

Spearman Rank Results


RegEqn:m*x+b M:0.999

B:0.00212 r=0.998

R2: 0.997

(See Appendix 1 For Raw Data for xG and W)

The r value (0.998) implies a strong positive linear relationship between the ranks of xG and wins. This could suggest that the higher the xG, the better a team's chances of winning its matches. Furthermore, the r² value implies a good fit of 99%, so the data strongly supports this conclusion.

Ranking xG W League Standing


20 Sampdoria (34.1) Sampdoria (3) Sampdoria
19 Hellas Verona (35.8) Cremonese (5) Cremonese
18 Lecce (36.1) Spezia (6) Hellas Verona
The results show little correlation, so xG is not a consistently reliable indicator of team performance based on league standing and W. Sampdoria support the link: their low xG was accompanied by the fewest wins and the lowest league standing. However, Lecce, despite having one of the three lowest xG values, were not among the lowest for league standing or number of wins, which calls into question the predictive power of xG as a team performance indicator.

Ranking xG W League Standing


1 Inter (68.0) Napoli (28) Napoli
2 Napoli (64.7) Inter (23) Lazio
3 Milan (58.8) Lazio (22) Inter
Table 7: Comparison of the Top Three rankings of xG, W and League Standing in Serie A

This shows a positive correlation, suggesting that xG and the number of wins can determine team performance. Inter and Napoli had the highest xG and were in the top three for number of wins. However, Milan is an outlier, as they are in the top three for neither league standing nor wins. This suggests that other factors, such as draws, which are worth one point each, could have influenced a team's league standing and therefore its measured performance.

Ranking xG W League Standing


20 Mallorca (35.2) Elche (5) Valladolid
19 Getafe (36.7) Espanyol (8) Espanyol
18 Elche (37.5) Getafe (10) Elche
This shows little correlation, so xG is only sometimes an accurate indicator of team performance. Elche support the link, as their low xG was accompanied by the fewest wins and a bottom-three league standing. However, teams like Getafe and Mallorca, despite being ranked among the lowest for xG, were not ranked the lowest for wins and league standing, suggesting that xG fails to consider outside factors when determining team performance.

Ranking xG W League Standing


1 Barcelona (75.5) Barcelona (28) Barcelona
2 Real Madrid (75.5) Real Madrid (24) Real Madrid
3 Atlético Madrid (61.9) Atlético Madrid (23) Atlético Madrid
Table 8: Comparison of the Top Three rankings of xG, W and League Standing in La Liga

This shows a positive correlation, suggesting that xG can predict how well a team does based on its number of wins. Atlético Madrid, Real Madrid and Barcelona all placed in the top three for xG, wins and league standing. This implies that xG is an accurate measure of team performance, as the higher the xG, the more wins a team gets. Consequently, this supports the hypothesis that xG can predict team performance in La Liga based on the number of wins, which in turn places teams in the top three of the league standings.

Ranking xG W League Standing


20 Wolves (36.8) Southampton (6) Southampton
19 Southampton (37.8) Leeds United (7) Leeds United
18 Bournemouth (38.5) Everton (8) Leicester City
This shows little correlation, so xG is only a partly accurate indicator of wins and team performance in the league. Southampton support the link: they had one of the lowest xG values, the fewest wins and the lowest league standing. However, Bournemouth and Wolves, despite their low xG, were not ranked the lowest for wins or league standing. As a result, the hypothesis is rejected.

Ranking xG W League Standing


1 Manchester City (78.6) Manchester City (28) Manchester City
2 Brighton (73.3) Arsenal (26) Arsenal
3 Newcastle Utd (71.9) Manchester Utd (23) Manchester Utd
Table 9: Comparison of the Top Three rankings of xG, W and League Standing in The Premier

League

This shows some positive correlation between xG and the number of wins in determining league standing. Manchester City had the highest xG and the most wins, and finished 1st in the league. However, Brighton had the second highest xG but were neither second in the league nor second for wins. This suggests that xG has inconsistencies as a measure of team performance that can make it unreliable. As a result, xG is difficult to apply to real life, as its unpredictability can lead to overestimations or underestimations of a team's performance in a football season.

Overall, the Serie A and Premier League results suggest that winning is a more important factor than xG, as factors such as draws, worth one point each, can also influence league standings. However, La Liga seems to be the outlier, where xG seems to be more important than wins for placing higher in the league standings. This suggests that wins are mostly a key indicator of team performance in the top three European leagues, allowing teams to do well in the league standings.


Scatter Plot of xG and Pts of the Top Three European Leagues (Data: FBref.com)

(See Appendix 1 For Raw Data for xG and Pts)

The graph shows a moderate positive correlation between xG and Pts. This implies that the higher the xG, the more points a team gets, so xG can influence how many points a football team earns, which determines its success in its national league.

Pearson Test on xG and Pts:


Pearson Test Results
RegEqn: m*x+b M: 0.596
B: 19.3 R2 = 0.7106
R = 0.8430 Linear Equation: 𝑦=0.596𝑥+19.3
(See Appendix 1 For Raw Data for xG and Pts)

The r value (0.84) shows a strong positive linear relationship between xG and Pts. This could suggest that xG can influence the number of points teams get in a season. Further, the r² value of 0.71 suggests that the data supports this conclusion.

Spearmans Ranking on xG and Pts


Spearman Rank Results
RegEqn:m*x+b R= 0.99935

R2:0.9987

(See Appendix 1 For Raw Data for xG and Pts)


The r value (0.999) implies a strong positive linear relationship between the ranks, where an increase in xG means an increase in points. Further, the r² value of 0.999 suggests that the data set strongly supports this conclusion.

Ranking xG Pts League Standing


20 Sampdoria (34.1) Sampdoria (19) Sampdoria
19 Hellas Verona (35.8) Cremonese (27) Cremonese
18 Lecce (36.1) Hellas Verona (31) Hellas Verona

The results indicate some correlation between xG and the number of points a team gets. Sampdoria and Hellas Verona show this: their low xG was matched by a low league standing and a low number of points. However, Lecce, despite having one of the lowest xG values, did not have one of the lowest points totals or league standings, which suggests that xG cannot be generalised as an accurate indicator of team performance.

Ranking xG Pts League Standing


1 Inter (68.0) Napoli (90) Napoli
2 Napoli (64.7) Lazio (74) Lazio
3 Milan (58.8) Inter (72) Inter
Table 13: Comparison of the Top Three rankings based on xG, Pts and League Standing in Serie A

This indicates a moderate positive correlation, with xG being a determiner of team performance based on Pts. Inter and Napoli had two of the highest xG values and placed in the top three for both points and league standing. The exception is Milan, who, despite being in the top three for xG, were in the top three for neither points nor league standing. This leads to hypothesis 3 being accepted, meaning that having more points leads to a better league standing.

Ranking xG Pts League Standing


20 Mallorca (35.2) Elche (25) Elche
19 Getafe (36.7) Espanyol (37) Espanyol
18 Elche (37.5) Valladolid (40) Valladolid
The findings show some correlation, with xG being a determiner of team performance based on Pts. This is supported by Elche being in the bottom three for xG, Pts and league standing. The exceptions are Getafe and Mallorca, who, despite having low xG, were not among the lowest for Pts or league standing, demonstrating the limits of xG as an indicator of team performance based on Pts.

Ranking xG Pts League Standing


1 Barcelona (75.5) Barcelona (88) Barcelona
2 Real Madrid (75.5) Real Madrid (78) Real Madrid
3 Atlético Madrid (61.9) Atlético Madrid (77) Atlético Madrid

Table 14: Comparison of the Top Three rankings based on xG, Pts and League Standing in La Liga

The results indicate a strong positive correlation between xG and Pts. Barcelona, Real Madrid and Atlético Madrid placed in the top three for xG, and also earned the most points and finished in the top three of the league standings. This supports hypothesis 3, that xG is a key indicator of the number of points earned and thus of team performance.

Ranking xG Pts League Standing


20 Wolves (36.8) Southampton (25) Southampton
19 Southampton (37.8) Leeds United (31) Leeds United
18 Bournemouth (38.5) Leicester City (34) Leicester City
The results show little correlation, so xG is only a weak indicator of team performance based on Pts. Southampton support the link, as they had one of the lowest xG values, the fewest points and the lowest league standing. However, Bournemouth and Wolves, despite having some of the lowest xG values, were not ranked the lowest for Pts or league standing, implying that xG lacks reliability as an accurate indicator of team performance based on Pts.

Ranking xG Pts League Standing


1 Manchester City (78.6) Manchester City (89) Manchester City
2 Brighton (73.3) Arsenal (84) Arsenal
3 Newcastle Utd (71.9) Manchester Utd (75) Manchester Utd
Table 15: Comparison of the Top Three rankings based on xG, Pts and League Standing in The

Premier League

This shows a weak positive correlation, as xG is only partly able to indicate the number of points a team earns and so its team performance. Newcastle and Brighton, despite having some of the highest xG values, had neither the highest number of points nor a top-three league standing. However, there is an exception where xG does determine team performance: Manchester City had the highest xG and finished 1st for both points and league standing. This signifies that xG can be an indicator of team performance based on Pts earned, but with anomalies that could undermine it as a measure.

Conclusion of the Positively Correlated Team Performance Indicators Based on xG


Overall, xG seems to show positive relationships with GF, W and Pts, which suggests that a higher xG means higher GF, W and Pts, especially in La Liga, because its emphasis on aggressive tactics allows teams to perform better in the league. The same may apply to Serie A, where a shift away from a more defensive approach indicates the rising importance of attacking tactics in the league. However, for the Premier League, there seems to be inconsistency between xG and the positively correlated team performance indicators. This may reflect the competitiveness of the league, which makes team performance hard to determine, and it suggests a weakness of xG: it is unable to consider external factors (i.e. the playing styles and tactics of teams). The Premier League may therefore have a wider variety of ways of determining team performance that xG is not able to measure.
Negative Correlations
Scatter plot of xG and GA of the Top Three European Leagues (Data: FBref.com)

(See Appendix 1 for Raw Data for xG and GA used)

The graph seems to show a moderate negative correlation.

Pearson Test on xG and GA:


Pearson Test Results
RegEqn: m*x+b M: -0.568
B: 79.0 R2 = 0.3528
R = -0.5940 Linear Equation: 𝑦 = −0.568𝑥 + 79.0
(See Appendix 1 for Raw Data for xG and GA used )

The r value (-0.5940) shows a moderate negative linear relationship between xG and GA, suggesting that as xG increases, GA decreases. Furthermore, the r² value shows a fit of only 35%, suggesting that the data only weakly supports the conclusion that xG affects GA.

Spearman's Ranking on xG and GA

Spearman Rank Results


RegEqn:m*x+b M:0.999

B:0.5776 r=0.998

R2: 0.996

(See Appendix 1 for Raw Data for xG and GA used)


The r value (0.998) implies a strong positive linear relationship between the ranks of xG and GA. This implies that when xG is high, the GA rank increases, which means that goals against decrease. Furthermore, the r² value suggests a good fit of 99%, so the data strongly supports this conclusion.

There seems to be a negative correlation, suggesting that a higher xG means a lower GA. However, there may be exceptions when looking at the top three leagues in Europe.

Ranking XG GA League Standing


1 Inter (68.0) Napoli (28) Napoli
2 Napoli (64.7) Lazio (30) Lazio
3 Milan (58.8) Juventus (33) Inter
This shows a slight correlation, where teams with the highest xG have a lower GA and therefore a higher league standing. This is exemplified by Napoli, whose high xG was accompanied by a low GA, placing them in the top three of the league, which would make xG a dependable indicator of team performance in terms of league standing and GA. However, there is an anomaly: Lazio, despite not having one of the highest xG values, were in the top three for lowest GA and league standing, which suggests that external influences have an impact on team performance.

Ranking XG GA League Standing


20 Sampdoria (34.1) Sampdoria (71) Sampdoria
19 Hellas Verona (35.8) Cremonese (69) Cremonese
18 Lecce (36.1) Salernitana (62) Hellas Verona
Table 4 : Comparison of the Top Three Lowest Ranking of XG, GA and League Standing in Serie A

There seems to be a negative correlation between xG and GA in representing team performance. This is shown by Lecce, who have the highest xG of the three lowest-ranked teams for xG and are not among the three teams with the highest GA. This suggests that a higher xG means a lower GA, which translates to better team performance. Thus, xG is a determiner of team performance for GA, supporting the hypothesis.

Ranking XG GA League Standing


1 Barcelona (75.5) Barcelona (20) Barcelona
2 Real Madrid (75.5) Atlético Madrid (33) Real Madrid
3 Atlético Madrid (61.9) Real Sociedad (35) Atlético Madrid
The results show some correlation between xG and GA in determining league standing and hence team performance. Barcelona ranked highest for xG, had the lowest GA and therefore finished highest in the league. This suggests that the higher the xG, the lower the GA and the better the team's league standing. The exception is Real Sociedad, who, despite not being in the top three for xG or league standing, had one of the three lowest GA, which does not support the hypothesis.

Ranking xG GA League Standing


20 Mallorca (35.2) Espanyol (69) Elche
19 Getafe (36.7) Elche (67) Espanyol
18 Elche (37.5) Almería (65) Valladolid
Table 5 : Comparison of the Top Three Lowest Ranking Teams in terms of xG, GA and League

Standing in La Liga

The findings suggest little correlation between xG and GA in determining team performance in La Liga. This is shown by Elche, who had the highest xG of the three lowest-ranked teams for xG, yet ended up with the second worst GA and placed in the bottom three of the league. This implies that xG is a flawed indicator of team performance in terms of GA in La Liga. Consequently, it suggests that xG is hard to apply in real-life situations for accurately measuring team achievement.

Ranking XG GA League Standing


1 Manchester City (78.6) Manchester City (33) Manchester City
2 Brighton (73.3) Newcastle Utd (33) Arsenal
3 Newcastle Utd (71.9) Arsenal (43) Manchester Utd
The findings show little correlation between xG and GA in determining team performance based on league standing. Manchester City support the link: they ranked highest for xG, had the joint lowest GA and the highest league standing. However, Brighton and Newcastle are exceptions: despite their high xG, neither placed in the top three of the league standings, and only Newcastle were in the top three for lowest GA, suggesting flaws in xG as an overall team performance indicator in the league.

Ranking xG GA League Standing


20 Wolves (36.8) Leeds United (78) Leicester City
19 Southampton (37.8) Southampton (73) Leeds United
18 Bournemouth (38.5) Bournemouth (71) Southampton

Table 6: Comparison of the Top Three Lowest Ranking Teams in terms of xG, GA and League

Standing in The Premier League

The results show some negative correlation between xG and GA, suggesting that the two are somewhat linked in influencing team performance. Southampton had the second lowest xG and were among the bottom three teams for both league standing and GA. However, there are outliers: Wolves had the lowest xG, yet they neither had the highest GA nor were among the bottom three teams. This suggests that xG is unable to assess team performance accurately in terms of GA. Thus, the results reject the idea that having the highest xG gives a team the lowest GA, which would allow for better team performance and league standing.

Scatter Plot of xG and L of the Top Three European Leagues (Data: FBref.com)

(See Appendix 1 For Raw Data for xG and L)

The graph shows a negative correlation, implying that the higher the xG, the fewer losses a team gets.

Pearson Test on XG and L:


Pearson Test Results
RegEqn: m*x+b M: -1.85
B: 77.2 R2 = 0.6139
R = -0.7835 Linear Equation: 𝑦=−1.85𝑥+77.2
(See Appendix 1 For Raw Data for xG and L)
The r value (-0.7835) shows a strong negative linear relationship between xG and L. This could imply that xG is a good determiner of the number of losses a team makes. The r² value (0.6139) shows a 61% fit, which suggests that the data supports the conclusion that a higher xG means a lower number of losses.
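The percentage fit quoted alongside each Pearson test is r² expressed as a percentage, so it is never negative even when r is. The sketch below recomputes it from the r values reported in the five Pearson tables of this report; the helper name `describe` is an illustrative assumption.

```python
# Pearson r values as reported in the Results tables of this report.
reported_r = {
    "GF": 0.910539,
    "W": 0.8404,
    "Pts": 0.8430,
    "GA": -0.5940,
    "L": -0.7835,
}

def describe(r: float) -> tuple[str, float]:
    """Return the direction of the relationship and r squared as a percentage."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return direction, round(100 * r * r, 1)

summary = {name: describe(r) for name, r in reported_r.items()}
for name, (direction, percent) in summary.items():
    print(f"{name}: {direction}, r^2 = {percent}%")
```

The sign of r carries the direction of the relationship, while the percentage only describes how closely the points follow the line.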

Spearman's Ranking on xG and L

Spearman Rank Results
RegEqn: m*x + b
m: 0.996
b: 0.0451
r: 0.9975
r²: 0.995

(See Appendix 1 For Raw Data for xG and L)

The r value (0.9975) of the ranked data suggests a very strong relationship between the xG and L rankings, which is consistent with a team facing fewer losses as its xG increases. Further, the r² value (0.995) suggests that the data strongly support this, as the ranked data have a fit of about 99%.
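A Spearman coefficient is a Pearson coefficient calculated on ranks rather than raw values, which the sketch below illustrates. The function names are hypothetical and the data are again only the pairs visible in this section's tables. Note the ranking direction, which is an assumption here: both variables are ranked from smallest to largest, so a high xG paired with few losses gives a negative coefficient. If xG were instead ranked from highest and losses from lowest, the same relationship would give a positive coefficient.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman coefficient: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# xG and losses for teams named in this section's tables (illustrative subset).
xg = [78.6, 71.9, 75.5, 64.7, 58.8, 38.5, 37.8, 37.5, 35.8, 34.1]
losses = [5, 5, 6, 4, 8, 21, 25, 23, 21, 25]
print(f"rho = {spearman_rho(xg, losses):.4f}")
```

Averaging the ranks of tied values matters for this data because several teams share the same number of losses.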

Ranking xG L League Standing


1 Inter (68.0) Napoli (4) Napoli
2 Napoli (64.7) Lazio (8) Lazio
3 Milan (58.8) Milan (8) Inter
The results show little correlation between xG and L for judging team performance. Napoli supports the relationship: it had the second highest xG, the lowest number of losses and finished first in the league standing. Inter and Milan are the exceptions: Inter had the highest xG but was not among the three teams with the fewest losses, while Milan was third for xG but did not finish in the top three of the league. This suggests inconsistency in xG as an accurate indicator of team performance based on L.

Ranking xG L League Standing


20 Sampdoria (34.1) Sampdoria (25) Sampdoria
19 Hellas Verona (35.8) Hellas Verona (21) Cremonese
18 Lecce (36.1) Cremonese (21) Hellas Verona
Table 10: Comparison of the Top Three Bottom Teams in terms of rankings based on XG, L and

League Standing in Serie A

There seems to be a negative correlation, which suggests that xG does help determine team performance in terms of the number of losses a team incurs. For example, Sampdoria had the lowest xG, the highest number of losses and finished bottom of the league. This suggests that the lower a team's xG, the more losses it is likely to suffer and the worse it is likely to perform in Serie A.

Ranking xG L League Standing


1 Barcelona (75.5) Barcelona (6) Barcelona
2 Real Madrid (75.5) Atletico Madrid (7) Real Madrid
3 Atlético Madrid (61.9) Real Madrid (8) Atletico Madrid
The results show a strong correlation, suggesting that a higher xG is a good performance indicator for losses and league standing. This is supported by Barcelona, which had the highest xG, the lowest number of losses and the best league standing. This emphasises the need for teams to achieve a higher xG to allow for better team performance within the league.

Ranking xG L League Standing


20 Mallorca (35.2) Elche (23) Elche
19 Getafe (36.7) Valladolid (20) Espanyol
18 Elche (37.5) Almería (19) Valladolid
Table 11: Comparison of the Top Three Bottom Teams in terms of rankings based on xG, L and

League Standing in La Liga

The results show some correlation between xG and L, but not one that supports xG as a predictor of team performance. This is shown by Elche, which had the highest xG amongst the three lowest xG teams, yet was placed last in La Liga and had the most losses. This implies that a higher xG does not guarantee that a team will do better in its league standing by losing fewer games, leading to the hypothesis being rejected.

Ranking xG L League Standing


1 Manchester City (78.6) Newcastle Utd (5) Manchester City
2 Brighton (73.3) Manchester City (5) Arsenal
3 Newcastle Utd (71.9) Arsenal (6) Manchester Utd
The results show little evidence that xG is a precise indicator of team performance based on the number of losses and placement within the league. Manchester City had the highest xG, which went with better team performance: a higher league standing and a lower number of losses. Despite this, Brighton and Newcastle Utd both had a high xG, yet Brighton was not in the top three for either league standing or fewest losses, and Newcastle Utd was not in the top three for league standing. This indicates that the highly competitive nature of the Premier League can skew the accuracy of xG as a team performance indicator based on L.

Ranking xG L League Standing


20 Wolves (36.8) Southampton (25) Leicester City
19 Southampton (37.8) Leicester City (22) Leeds United
18 Bournemouth (38.5) Bournemouth (21) Southampton
Table 12: Comparison of the Top Three Bottom Teams in terms of rankings based on xG, L and

League Standing in The Premier League

The findings suggest little correlation between xG and the number of losses. This is exemplified by Bournemouth, which had the highest xG amongst the three lowest xG teams: it was in the top three for most losses but was not in the bottom three of the league standing. Leeds United and Leicester City, by contrast, finished in the bottom three even though neither was among the three lowest xG teams, and Leeds United was not among the three teams with the most losses. This shows a lack of transferability of xG across different situations, where xG is unable to determine team performance within the Premier League, thus rejecting the hypothesis.

Conclusion of the Negatively Correlated Team Performance Indicators Based on xG

In conclusion, there seems to be a negative correlation between xG and team performance indicators like GA and L. This suggests that teams with a higher xG would be more likely to have fewer GA and L. Serie A and La Liga seem to imply a need for teams to be more defensive, keeping losses and goals against low to maintain high team performance, given the emphasis on and rising importance of attacking tactics within these leagues. However, for the Premier League there seems to be no evidence of this negative correlation, which demonstrates the importance of a more holistic approach to team performance: teams within the Premier League need both good defensive and offensive tactics to perform better, due to the highly competitive nature of the league.

Testing and Checking the Validity of the Model of Five Statistics of Team Performance

The equations from the Pearson tests for xG and the five statistics that determine team performance were used to see whether the 22/23 season's xG data match the model's predicted xG. A percentage error within the range −10% ≤ x ≤ 10% was used to judge the agreement between the model's predicted xG and the 22/23 season xG data.
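The check described above can be sketched as follows. The percentage error formula is not stated explicitly in the text; the worked figures in this section are consistent with (actual − predicted) / predicted × 100, so that form is assumed here. The function names are hypothetical, and the Southampton figures (6 wins, actual xG 37.8) are taken from the xG and W example and the ranking tables.

```python
def predicted_xg(m, b, x):
    """Expected xG from the regression line y = m*x + b."""
    return m * x + b


def percentage_error(actual, predicted):
    """Signed percentage error relative to the predicted value (assumed form)."""
    return (actual - predicted) / predicted * 100


def within_tolerance(error, tolerance=10.0):
    """True when the error lies in the range -tolerance <= error <= tolerance."""
    return abs(error) <= tolerance


# Southampton, xG and W model: y = 1.630x + 26.99 with x = 6 wins.
expected = round(predicted_xg(1.630, 26.99, 6), 2)
error = percentage_error(37.8, expected)
print(expected, round(error, 2), within_tolerance(error))
```

Using the absolute value in `within_tolerance` treats overestimates and underestimates symmetrically, so a negative error such as -2.73% counts as inside the range while -15.92% does not.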

Positive Correlation of the Team Performance Indicators


xG and W
Premier League
Middle Rank Team: Brentford

Y = 1.630(15) + 26.99 = 51.4 expected xG

Percentage Error:

9.53% ≤ 10%, which implies that the model is reliable in providing evidence of the positive relationship between xG and W: a higher xG corresponds to more wins and better team performance.

Exception Team: Leicester City

Y = 1.630(9) + 26.99=41.66 expected xG

Percentage Error

21.21% ≥ 10%, which shows the unreliability of xG in predicting a team's W and demonstrates the flaw that xG cannot consider external factors that influence team performance.

Relegated Team: Southampton


Y = 1.630(6) + 26.99 = 36.77

Percentage Error

2.8% ≤ 10%, which shows the accuracy of xG in predicting team performance within the league and makes it a reliable indicator of team performance.

La Liga
Middle Rank Team: Athletic Club

Y = 1.630(14) + 26.99 = 49.81 expected xG

Percentage Error

8.81% ≤ 10%, which implies that xG is trustworthy and a dependable indicator of team performance.

Exception Team: Espanyol

Y = 1.630(8) + 26.99 = 40.03 expected xG

Percentage Error

20.65% ≥ 10%, which indicates the flaws of xG as a team performance indicator and does not support the hypothesis.

Relegated team: Elche

Y = 1.630(5) + 26.99 = 35.14 expected xG

Percentage Error

6.72% ≤ 10%, which shows the preciseness of xG for team performance and supports the hypothesis.

Serie A
Middle Rank Team: Fiorentina

Y = 1.630(15) + 26.99 = 51.4 expected xG

Percentage Error:
7% ≤ 10%, which shows the credibility of xG as an indicator of team performance within league standing.

Exception Team: Inter

Y = 1.630(23) + 26.99 = 64.48 expected xG

Percentage Error:

5.45% ≤ 10%, which demonstrates the worth of xG as an indicator of team performance within league standing.

Relegated team: Cremonese

Y = 1.630(5) + 26.99 = 35.14 expected xG

Percentage Error

10.13% ≥ 10%, which indicates no evidence of a positive relationship between xG and W to accurately determine team performance.

xG and GF
Premier League
Middle Ranked Team: West Ham

Y = 0.71081(42) + 14.7738 = 44.62 expected xG

Percentage Error:

10.26% ≥ 10%, which suggests that xG is inaccurate in determining team performance within the league.

Exception Team: Wolves

Y= 0.71081(31) +14.7738 =36.8 expected xG

Percentage Error:

0% ≤ 10%, which implies that xG is a highly accurate team performance indicator.


Relegated Team: Leicester City

Y = 0.71081(51) + 14.7738 = 51.02 expected xG

Percentage Error:

-1.01% ≥ -10%, which suggests that xG is an accurate team performance indicator.

La Liga
Middle Ranked Team: Athletic Club

Y = 0.71081(47) +14.7738 = 48.18 xG

Percentage Error:

2.4% ≤ 10%, which provides evidence of the reliability of xG as a team performance indicator.

Exception Team: Espanyol

Y= 0.71081(52) +14.7738 = 51.73

Percentage Error:

0.52 ≤ 10% demonstrates the accuracy of xG as a team performance indicator.

Relegated Team: Elche

Y = 0.71081(23) +14.7738 = 31.12

Percentage Error:

20.5% ≥ 10%, which confirms the faultiness of xG as a team performance indicator.
Serie A
Middle Ranked Team: Fiorentina

Y = 0.71081(53) + 14.7738 = 52.44 expected xG

Percentage Error

4.88% ≤ 10%, which establishes the reliability of xG as a good team performance indicator.

Relegated team: Spezia

Y= 0.71081(31) +14.7738 = 36.80 expected

Percentage Error

12.77% ≥ 10%, which shows that xG is not generalisable as a team performance indicator.

Exception Team: Inter Milan

Y= 0.71081(71) +14.7738 = 65.24 expected xG

Percentage Error

4.23% ≤ 10%, which proves the dependability of xG as a team performance indicator and supports the hypothesis.

xG and Pts
Premier League
Middle Ranked Team: Crystal Palace

Y = 0.596 (45) + 19.32 = 46.15 expected xG

Percentage Error:

-14.84% ≤ -10%, which demonstrates the unreliability of xG as a team performance indicator and does not support the hypothesis.

Exception Team: Wolves

Y = 0.596 (41) + 19.32 = 43.77 expected xG

Percentage Error:

-15.92% ≤ -10%, which shows the inaccuracy of xG as a team performance indicator and does not support the hypothesis.

Relegated Team: Leeds United

Y = 0.596 (31) + 19.32 = 37.81

Percentage Error:

25.09% ≥ 10%, which indicates the flaws of xG as a team performance indicator.

La Liga
Middle Ranked Team: Mallorca

Y = 0.596 (50) + 19.32 = 49.13 expected xG

Percentage Error:

-28.35% ≤ -10%, which shows the imprecision of xG as a team performance indicator.

Exception Team: Espanyol

Y = 0.596 (37) + 19.32 = 41.38 expected xG

Percentage Error:

16.72% ≥ 10%, which illustrates the unreliability of xG as a team performance indicator.

Relegated Team: Elche

Y = 0.596 (25) + 19.32 =34.23 expected

Percentage Error:

9.55% ≤ 10%, which shows the reliability of xG as a team performance indicator and supports the positive relationship between xG and Pts.

Serie A
Middle Ranked Team: Torino
Y = 0.596 (53) + 19.32 = 50.92 expected xG

Percentage Error:

-17.91% ≤ -10%, which shows the inability of xG to be a consistent team performance indicator.

Exception Team: Napoli

Y = 0.596 (90) + 19.32 = 72.98 expected xG

Percentage Error

-11.34% ≤ -10%, which shows the inconsistency of xG as a reliable indicator.

Relegated Team: Cremonese

Y = 0.596 (27) + 19.32 = 35.42 expected xG

Percentage Error:

9.26% ≤ 10% shows that xG is an accurate and reliable team performance indicator.

Summary of the Positively Correlated Team Performance Indicators

Percentage Error Results:
The results show a lack of evidence for, and consistency in, xG as a reliable indicator for determining team performance based on GF, W and Pts. This suggests that xG is not a reliable quantitative measure of team performance, owing to the prevalence of external factors, like the quality of players, that can determine team performance within the top three European leagues.
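The inconsistency can be made concrete by counting how many of the worked examples fall inside the ±10% range. The sketch below does this for the nine xG and Pts percentage errors quoted above; the variable names are hypothetical and the same tally could be repeated for the W and GF results.

```python
# Percentage errors quoted in the xG and Pts worked examples above.
pts_errors = {
    "Crystal Palace": -14.84,
    "Wolves": -15.92,
    "Leeds United": 25.09,
    "Mallorca": -28.35,
    "Espanyol": 16.72,
    "Elche": 9.55,
    "Torino": -17.91,
    "Napoli": -11.34,
    "Cremonese": 9.26,
}

TOLERANCE = 10.0  # accepted range is -10% <= error <= 10%

# Teams whose predicted xG was within the accepted range.
within = [team for team, err in pts_errors.items() if abs(err) <= TOLERANCE]

print(f"{len(within)} of {len(pts_errors)} teams within ±{TOLERANCE:.0f}%: {within}")
```

For Pts, only the two relegated sides Elche and Cremonese fall inside the range, which is in line with the conclusion that the model's predictions are not consistent across teams.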

Negative Correlations of Team Performance Indicators


xG and L
Premier League
Middle Rank team: Fulham

Y = -1.853(16) +77.16 = 47.50 expected xG


Percentage Error

-2.73% ≥ -10%, which suggests the adequacy of xG as a team performance indicator.

Exception Team: Leeds United

Y = -1.853(21) +77.16 = 38.23

Percentage Error

23.72% ≥ 10%, which indicates that the accuracy of xG as a team performance indicator is unacceptable.

Relegated Team: Leicester City

Y = -1.853(22) + 77.16 = 36.37 expected xG

Percentage Error

38.85% ≥ 10% demonstrates the unreliability of xG as an team performance indicator.

La Liga
Middle Rank team: Girona

Y = -1.853(15) +77.16 = 49.35 expected xG

Percentage Error

2.53% ≤ 10% indicates the reliability of xG as a team performance indicator.

Exception Team: Espanyol

Y = -1.853(17) +77.16 = 45.64 expected xG

Percentage Error

5.82% ≤ 10%, which shows that xG is a capable and accurate measure of team performance.

Relegated Team: Valladolid


Y = -1.853(20) + 77.16 = 40.08 expected xG

Percentage Error

-3.94% ≥ -10%, which demonstrates the accuracy of xG as a capable team performance indicator.

Serie A
Middle Rank team: Bologna

Y = -1.853(12) +77.16 = 54.91 expected xG

Percentage Error:

-20.23% ≤ -10%, which proves the invalidity of xG as a reliable indicator for team performance.

Exception Team: Milan

Y = -1.853(8) + 77.16 = 62.33 expected xG

Percentage Error:

Relegated Team: Cremonese

Y = - 1.853(21) +77.16 = 38.23 expected xG

Percentage Error

1.22% ≤ 10%, which establishes the validity of xG as a reliable indicator for team performance.

xG and GA
Premier League
Middle Rank Team: Brentford

Y = -0.5675(46) + 78.96 = 52.86 expected xG

Percentage Error
6.5% ≤ 10% proves the validity of xG as a team performance indicator

Exception Team: Wolves

Y = -0.5675(58) + 78.96 = 46.05 expected xG

Percentage Error

-20.08% ≤ -10% indicates that xG is not a good team performance indicator.

Relegated Team: Southampton

Y = -0.5675(73) + 78.96 = 37.5 expected xG

Percentage Error

1.9% ≤ 10% exhibits xG as being a justifiable team performance indicator.

La Liga
Middle Rank Team: Athletic Club

Y = -0.5675(43) + 78.96 = 54.56 expected xG

Percentage Error:

-0.55% ≥ -10%, which indicates the reliability of xG as an indicator for team performance.

Exception Team: Espanyol

Y = -0.5675(69) + 78.96 =39.8 expected xG

Percentage Error

21.35% ≥ 10% indicates that xG is an inaccurate team performance indicator

Relegated Team: Valladolid

Y = -0.5675(67) + 78.96 = 40.94 expected xG

Percentage Error
-8.40% ≥ -10%, which indicates the reliability of xG as a consistent team performance indicator.

Serie A
Middle Rank Team: Roma

Y = -0.5675(38) + 78.96 = 57.4 expected xG

Percentage Error:

0% ≤ 10%, which shows that xG is an indicator that can be used to measure team performance.

Exception Team: Inter

Y = -0.5675(42) + 78.96 = 55.13 expected xG

Percentage Error:

23.34% ≥ 10%, which indicates that xG is not a good indicator of team performance.

Relegated Team: Cremonese

Y = -0.5675(69) + 78.96 = 39.80 expected xG

Percentage Error:

-14.32 ≤ -10% proves the inaccuracy of xG as a team performance indicator.

Summary of the Negatively Correlated Team Performance Indicators

Percentage Error Results:
The results show that xG cannot consistently be applied to predict team performance based on GA and L within the top three European leagues. This suggests that xG is unable to determine how well a team would do within its league, because many teams exceed or fall short of the model's predictions. A likely cause is that xG cannot account for factors such as coaching quality, which influences the quality of a team's training and therefore its performance within the league.
Conclusion
In conclusion, the results suggest that a higher xG can determine team performance, but with varying results. For instance, xG works within La Liga, where it can indicate team performance based on GF, W and Pts. This implies that La Liga teams depend on more expansive playstyles for a better league standing and team performance. However, xG cannot be generalised as a measure of team performance across the top three European leagues. In Serie A, xG predicted only 2/3 of the top teams for W, Pts and GF, which breaks the stereotype of Serie A's dependence on defending and goalkeeping and indicates an attacking component in assessing team performance. Similarly, in the Premier League xG predicted only 1/3 of the top teams in these three factors, implying that a more well-rounded approach to determining team performance is needed, one that emphasises both a better attack and a better defence.

Reflection
Spearman's ranking was used to investigate whether xG would show a positive or negative correlation when assessing team performance, whereas Pearson's test was used to identify the relationship between xG and the team performance indicators and so recognise whether xG can determine team performance.

However, Spearman's ranking can be unreliable because it is not suitable for data with non-monotonic relationships, so the results gathered from it may not be reliable. Furthermore, a limitation of the Pearson coefficient is the possibility of a spurious correlation, which can make two factors appear related when they are not and so make the findings unreliable (Ghouse et al. 2024).

Moreover, to improve the accuracy of my investigation, more seasons and leagues are needed to assess team performance based on xG. This would be done by including two more seasons and leagues such as the Bundesliga and Ligue 1. Another way to improve the study would be to avoid data affected by external issues. For example, COVID-19 led to football games being suspended on 13th March 2020, which would make data on team performance from that season unreliable for this IA (Premier League 2020). Lastly, an alternative statistical test like ANOVA would allow for hypothesis testing between different group means, to determine whether there is a significant difference in how well xG determines a specific team performance indicator for a specific league (Bevans 2024).
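As a rough idea of what the suggested ANOVA would involve, the sketch below computes a one-way ANOVA F statistic by hand. It is only an illustration under stated assumptions: the function name is hypothetical, and the groups are the bottom-three xG values per league quoted in the tables above, chosen because they are available in this section rather than because they are the comparison the extended study would make.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the individual values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Bottom-three xG values per league, as quoted in the tables of this section.
bottom_three_xg = {
    "Serie A": [34.1, 35.8, 36.1],
    "La Liga": [35.2, 36.7, 37.5],
    "Premier League": [36.8, 37.8, 38.5],
}

f_stat = one_way_anova_f(list(bottom_three_xg.values()))
print(f"F = {f_stat:.3f}")
```

The F statistic on its own is not a conclusion: it would still need to be compared with a critical value of the F distribution with (k − 1, N − k) degrees of freedom, or converted to a p-value, before accepting or rejecting the null hypothesis of equal group means.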

Bibliography
Bevans, R. (2024). One-way Anova test | when and how to use it (with examples) [online]. Available from: https://www.scribbr.com/statistics/one-way-anova/#:~:text=The%20null%20hypothesis%20(H0,use%20a%20t%20test%20instead [accessed 26th July 2024].

FBref.com. (2023a). 2022-2023 La Liga stats [online]. Available from: https://fbref.com/en/comps/12/2022-2023/2022-2023-La-Liga-Stats [accessed 16th April 2024].

FBref.com. (2023b). 2022-2023 Premier League stats [online]. Available from: https://fbref.com/en/comps/9/2022-2023/2022-2023-Premier-League-Stats#all_league_structure [accessed 16th April 2024].

FBref.com. (2023c). 2022-2023 Serie A stats [online]. Available from: https://fbref.com/en/comps/11/2022-2023/2022-2023-Serie-A-Stats [accessed 16th April 2024].

Footy Stats. (2024). Darwin Nunez stats – Goals, xG, assists & career stats | FootyStats [online]. Available from: https://footystats.org/players/uruguay/darwin-nunez [accessed 21st July 2024].

Footy Stats. (2024). Erling Haaland stats – Goals, xG, assists & career stats | FootyStats [online]. Available from: https://footystats.org/players/norway/erling-haaland [accessed 21st July 2024].

Football XG. (2024). What are expected Goals (xG)? [online]. Available from: https://footballxg.com/what_are_expected_goals/#:~:text=23%2B00%3A00-,So%20how%20much%20better%20is%20expected%20goals%3F,worse%20on%20the%20home%20results [accessed 21st July 2024].

Ghouse, G., Rehman, A. U. & Bhatti, M. I. (2024). Understanding of causes of spurious associations: Problems and prospects. J Stat Theory Appl 23, 44–66. https://doi.org/10.1007/s44199-024-00072-0

MacInnes, P. (2020). ‘It is beyond the model’: Have Liverpool exposed the limits of xG? [online]. Available from: https://www.theguardian.com/football/2020/aug/09/liverpool-xg-jurgen-klopp [accessed 21st July 2024].

Premier League. (2020). How has the COVID-19 Pandemic affected Premier League matches? [online]. Available from: https://www.premierleague.com/news/1682374 [accessed 26th July 2024].

Whitmore, J. (2023). What is expected goals (xG)? [online]. Available from: https://theanalyst.com/eu/2023/08/what-is-expected-goals-xg/ [accessed 21st July 2024].

Williams, A. (2020). The roots of expected goals (xG) and its journey from “nerd nonsense” to the mainstream [online]. Available from: https://thesefootballtimes.co/2020/04/08/the-roots-of-expected-goals-xg-and-its-journey-from-nerd-nonsense-to-the-mainstream/ [accessed 15th October 2024].

Abba, I. (2023). What is R Squared? R2 value meaning and definition [online]. Available from: https://www.freecodecamp.org/news/what-is-r-squared-r2-value-meaning-and-definition/ [accessed 22nd October 2024].

Turney, S. (2024). Pearson Correlation Coefficient (r) | Guide & Examples [online]. Available from: https://www.scribbr.com/statistics/pearson-correlation-coefficient/#:~:text=The%20Pearson%20correlation%20coefficient%20(r,the%20relationship%20between%20two%20variables.&text=When%20one%20variable%20changes%2C%20the,changes%20in%20the%20same%20direction [accessed 22nd October 2024].

Lusby, J. (2023). He is the Haaland of xG but Liverpool need Darwin Nunez to turn it into goals [online]. Available from: https://www.fotmob.com/topnews/9164-hes-haaland-xg-but-liverpool-need-darwin-nunez-turn-it-into-goals [accessed 22nd October 2024].

Ogden, M. (2024). Haaland scores goals for Man City, so why all the criticism [online]. Available from: https://www.espn.com.sg/soccer/story/_/id/39946776/haaland-scores-goals-man-city-why-all-criticism [accessed 22nd October 2024].
Appendices
Appendix 1 Raw Data for xG, GF, GA, W, L and Pts
