Machine Learning Methods For Predicting League of Legends Game Outcome
Abstract—The video game League of Legends has several professional leagues and tournaments that offer prizes reaching several million dollars, making it one of the most followed games in the Esports scene. This article addresses the prediction of the winning team in professional matches of the game, using only pregame data. We propose to improve the accuracy of the models trained with the features offered by the game application programming interface (API). To this end, new features are built to collect interesting information, such as the skill of a player handling a certain champion, the synergies between players of the same team, or the ability of a player to beat another player. Then, we perform feature selection and train different classification algorithms aiming at obtaining the best model. Experimental results show classification accuracy above 0.70, which is comparable to the results of other proposals presented in the literature, but with the added benefit of using few samples and not requiring the use of external sources to collect additional statistics.

Index Terms—Esports, feature creation, League of Legends (LoL), model ensemble, prediction, video games.

I. INTRODUCTION

In the context of electronic entertainment, the broadcasting of professional video games, known as Esports, is gaining more and more relevance. In these games, the opponents face each other to achieve one of the important prizes at stake and the recognition of the public, reaching an audience of 495 million people and exceeding one billion dollars in revenues by 2020 [1]. In this emerging field, League of Legends (LoL) [2] from Riot Games Inc. is at the top of the most watched games, with 348.8 million hours of viewing in 2019 [1], which gives an idea of the popularity of this game. The creators of the game have made available to developers an application programming interface (API) that allows access to a large amount of detailed data for every game played. This makes research in this area very interesting, for example, to develop predictive models to assist teams with the strategy to follow. However, there are few studies focused on LoL, and there is an almost complete lack of research on the professional levels of the game.

This study focuses on using pregame data (teams, players, champions, etc.) from different LoL leagues and tournaments to predict the winning team of professional games. The main motivation of our pregame approach is its applicability. Predicting the winning team before the match starts is very valuable, for example, to make decisions regarding team composition in terms of players, champions, and roles (recommendation system), or as part of betting systems. Choosing the professional level of the game considerably reduces the number of available observations (compared to the other levels, thousands of games versus hundreds of thousands or even millions). In order to obtain robust models, it is necessary to exploit the particularities of this dataset, such as the fact that the matches are played by a limited number of teams and players. This allows the extraction of features that provide much more relevant information to the predictive model than the mere fact of participating or not in the encounter, as would be the case when using the original set of features provided by the game API.

We propose an approach that, based on the analysis of the original data, detects possible new features that are important for predicting the outcome of a match. These features reflect, for example, the dexterity of a player in handling a certain champion, the synergies produced when two players are on the same team, or the ability of a player to beat another player when they face each other. Subsequently, a preprocessing step is carried out, which includes the creation of these new features and feature selection. Next, several machine learning algorithms are trained, and those that show the best performance are selected to be used in a meta-model to further increase accuracy. The main contribution of this study is the development of a robust model for winner prediction in professional LoL games. Despite having a dataset with limited information, our approach addresses this problem by creating new features, which help select the best predictive algorithms that are then combined into a meta-model.

The remainder of this article is organized as follows. In Section II, a general description of the LoL game is given. Section III reviews the related work. Section IV describes the methods and materials we work with, including the dataset (Section IV-A), preprocessing tasks (Sections IV-B and IV-C), as well as the classification algorithms (Section IV-D). Section V
those games from OP.GG Website [11], as well as information regarding the champions from CHAMPION.GG Website [12]. With all this data, they trained their model, based on a recursive neural network and logistic regression, achieving an accuracy of 72.10%. Ong et al. [13] employed the k-means algorithm to cluster the behavior of players based on their statistics in played games. In this way, they obtained a series of groups that describe different playing styles of those players. Using these clusters as features to define each team, they trained various classification algorithms to predict the winner, achieving an accuracy of 70.4% with support vector machines. The authors used 113 000 game instances, randomly extracted from the game API, in their study. They also used the game API to obtain the player statistics from which they performed the clustering. Finally, with the information available before the start of the match, Chen et al. [14] aimed to detect the player skills that are determinant for the outcome of a match. To do so, they used several skill-based predictive models to decompose player skills into interpretive parts. The impact of these parts on the game outcome is evaluated in statistical terms. This approach achieves an accuracy of 60.24% with a dataset of 231 212 game instances extracted from the game API. Note that [13] and [14] are not peer reviewed.

Other relevant articles use the game Defense of the Ancients 2 (DotA2), which is very similar to LoL in its concept, as a basis for study. This game is more widely studied in the literature and, therefore, has a larger number of articles on winning team prediction. For this game, Conley et al. [15] proposed a hero recommendation engine. For this purpose, they trained a kNN (k-nearest neighbors) algorithm using as features the heroes that compose each team. In this way, they achieved an accuracy of 70% with a dataset of 18 000 instances. Agarwala et al. [16] performed a principal component analysis to extract the interaction between the heroes. With the obtained result, they trained a logistic regression algorithm on a dataset composed of 40 000 instances. Applying their approach, the authors achieved an accuracy of 62%. Hodge et al. [17] presented a study on real-time prediction of the outcome in professional DotA2 matches using in-game information. The authors used standard machine learning, feature creation, and optimization algorithms on a mixed professional and nonprofessional games dataset to create their model. They tested the obtained model in real time during a championship match, reaching an accuracy of 85% after 5 min of gameplay. Lan et al. [18] proposed a model that allows predicting the winning team using data on player behavior during the match (in-game). They first extract the features that define player behavior with a convolutional neural network. These features are modeled as a temporal sequence that is processed by a recurrent neural network. Finally, the outputs of these two networks for each team are combined to forecast the outcome. Thus, the authors used a training set of 20 000 instances to obtain, after the first 20 combats of the match, an accuracy of 87.85%.

Outside the context of multiplayer online battle arena (MOBA) games, there are also articles on other Esports that attempt to predict the match outcome. Thus, Sánchez-Ruiz et al. [19] used a set of numerical matrices, which represent the units' influence on the game map, to predict the outcome of the game StarCraft. The paper by Mamulpet [20] focused on the videogame PlayerUnknown's Battlegrounds and used various artificial neural network techniques to predict the winning player of the game by knowing their starting position. For the game Counter-Strike, Xenopoulos et al. [21] created a probabilistic model to predict the winning team at the beginning of each round. With this model as a basis, they presented a recommender system to guide teams in purchasing equipment. Meanwhile, Ravari et al. [22] studied the prediction of match outcome in the videogame Destiny, which combines the genres of first-person shooter and massively multiplayer online role-playing game. The authors create two sets of predictive models: one set predicts the match outcome for each game mode, while the other predicts the overall match outcome, without considering the game modes. In addition, they also analyze how game performance metrics influence each of the proposed models.

As seen in the related work, in-game approaches generally perform better than pregame approaches. This is because these approaches have much more information than pregame ones (such as experience gained, turrets destroyed, or enemies defeated). Thus, as the game advances, the features allow more accurate prediction of the winning team. Pregame approaches, however, have very limited information. This makes prediction much more complex, and it is challenging to obtain models with good accuracy, but it provides useful information before the game starts, which can be exploited in recommendation or betting systems. The pregame approaches that achieve the best results, Ong et al. [13] and White et al. [10], need to obtain additional information, besides the match information, to train their models. Our proposal, which belongs to the pregame approach, does not need additional data about the players or the champions.

IV. MATERIALS AND METHODS

A. Dataset

The dataset used in this article has been obtained from Kaggle [23]. It consists of observations corresponding to professional games from various leagues and championships played between 2014 and 2018. It includes details regarding both match setup and match progress. The match progress data are discarded, as this work is based on prediction before the match starts (pregame). Once cleaned and prepared, the dataset includes 241 teams, 1470 players, and 139 champions. It has 7583 instances, with a binary class variable indicating the team that wins the game, and 26 features, corresponding to the year, season, league, and type of match, as well as the name of the team playing on each side (2 features, 1 per side), its composition of champions in each of the five roles (10 features, 5 per side), and its composition of players in each of the five roles (10 features, 5 per side). Table I shows the name and content of the class variable and each of these features.

B. Preprocessing

In order to obtain quality data, adapted to the needs of the models to be trained and thus improve their predictive capacity, a series of preprocessing tasks is applied to the dataset.

For the treatment of missing values, an imputation with kNN [24] is performed. This algorithm searches the dataset for
first with a linear kernel and the second with a Gaussian radial basis function kernel.
3) Logistic Regression [32]: It is a statistical model that
uses a logistic function to predict the outcome of a bi-
nary nominal variable. In this case, two variants are used
that provide good results, the generalized linear model
with boosting [33] and the generalized linear model with
elastic-net regularization [34].
4) Naive Bayes Classifier [35]: It is based on the Bayes
theorem and on assuming independence between features
to calculate the probability that an observation belongs to
one class or another.
5) kNN Algorithm [36]: It is based on the idea of searching
for “k” observations in the training data that have the
smallest Euclidean distance to a new observation (nearest
neighbors), assigning it the most repeated class of those
training observations.
6) Neural Networks [37]: They are based on generating an
interconnected group of nodes that attempts to imitate
the biological behavior of the axons of neurons. Each of
these nodes receives as input the output of the nodes of
the previous layers multiplied by a weight. These values
are aggregated at each receiving node and, optionally,
modified or limited by an activation function before propa-
gating the new value to the next neurons. Within the broad
field of neural networks, in our work we use a multilayer
perceptron (MLP) [38], a kind of fully connected feed-
forward neural network composed of multiple layers of
perceptrons.
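As an illustration of how two of the simpler classifiers listed above (the Naive Bayes classifier and kNN) can be fitted in R, the language used in this work (see Section VI-A), the following minimal sketch builds a small synthetic dataset; the e1071 and class packages and all object names are assumptions made for illustration, since the article does not state which implementations were used.

library(e1071)  # naiveBayes()
library(class)  # knn()

# Synthetic stand-in for the preprocessed dataset: two numeric features and
# a binary class 'win' indicating the winning side.
set.seed(1)
n  <- 200
df <- data.frame(f1 = rnorm(n), f2 = rnorm(n))
df$win <- factor(ifelse(df$f1 + df$f2 + rnorm(n, sd = 0.5) > 0, "blue", "red"))
idx      <- sample(n, 0.9 * n)              # 90%/10% split, as in Section VI-A
train_df <- df[idx, ]
test_df  <- df[-idx, ]

# Naive Bayes: Bayes' theorem with a feature-independence assumption.
nb_model <- naiveBayes(win ~ ., data = train_df)
nb_pred  <- predict(nb_model, newdata = test_df)

# kNN: majority class among the k nearest training observations
# (Euclidean distance); k = 16 is the value reported in the Appendix.
knn_pred <- knn(train = train_df[, c("f1", "f2")],
                test  = test_df[, c("f1", "f2")],
                cl    = train_df$win, k = 16)

mean(nb_pred  == test_df$win)   # test accuracy, Naive Bayes
mean(knn_pred == test_df$win)   # test accuracy, kNN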
In this work, we also include two meta-models that are inter-
esting for the results obtained. The first one is a model based
on label fusion with majority voting [39], while the second one
performs stacking [40], where the outputs of a set of classifiers
are used as new features to train a new model.
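The following minimal sketch illustrates both ensemble schemes in R under toy assumptions: the three base classifiers are simulated as noisy copies of the true labels, and a logistic regression stands in for the extreme gradient boosting meta-learner actually used in this work; all names and data here are illustrative, not taken from the article.

# Toy stand-ins: predictions of three base models, simulated as noisy labels.
set.seed(1)
y_test <- factor(sample(c("blue", "red"), 50, replace = TRUE))
noisy  <- function(y, p = 0.3) {
  flip <- runif(length(y)) < p
  factor(ifelse(flip, sample(levels(y), length(y), replace = TRUE),
                as.character(y)), levels = levels(y))
}
p_xgb <- noisy(y_test); p_nb <- noisy(y_test); p_nn <- noisy(y_test)

# 1) Label fusion with majority voting over the base-model predictions.
votes     <- cbind(as.character(p_xgb), as.character(p_nb), as.character(p_nn))
vote_pred <- factor(apply(votes, 1, function(r) names(which.max(table(r)))),
                    levels = levels(y_test))
mean(vote_pred == y_test)   # accuracy of the voting meta-model

# 2) Stacking: base-model outputs become new features for a second-level
#    learner (a logistic regression here, for brevity). In practice the
#    meta-learner is fitted on predictions for the training set, not the
#    test set as in this toy example.
stack_df   <- data.frame(nb = p_nb, nn = p_nn, y = y_test)
meta       <- glm(y ~ nb + nn, data = stack_df, family = binomial)
prob       <- predict(meta, stack_df, type = "response")
stack_pred <- factor(ifelse(prob > 0.5, levels(y_test)[2], levels(y_test)[1]),
                     levels = levels(y_test))
mean(stack_pred == y_test)  # accuracy of the stacking meta-model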
V. METHODOLOGY

As mentioned before, in this work we consider only the state of the game before the battle starts (pregame) to predict the winning team in professional games. With this limited information, both in terms of available features and number of instances, the exploratory analysis is used to guide a search for possible new features that can provide additional relevant information about the class.

In the preprocessing of the data, the new candidate features found in the previous analysis are created, and feature selection is performed to determine which ones are really important, discarding the rest. A set of classifiers is then trained on the final dataset. Finally, voting- and stacking-based meta-models are created from the trained models to improve the final accuracy obtained. A diagram of the methodology followed is shown in Fig. 2.

Fig. 2. Schematic diagram of the steps followed in the methodology.

A. Exploratory Analysis

The analysis of the original dataset revealed the need to perform some preprocessing tasks to prepare it for the classification phase, such as the handling of missing values and the one-hot encoding of categorical features.

On the other hand, to guide the search for new features that capture useful information, a supervised learning algorithm based on decision trees is trained with the aim of determining the importance of the features. The obtained ranking (see Fig. 3) shows that the most relevant features with respect to the class correspond to the roles of the players of both teams, followed by those reflecting the selected champion, and finally those referring to the name of the team playing the match. The rest of the features are detected as not important for the outcome.

Fig. 3. Feature importance of the original dataset.

From these features detected as most relevant, a search for possible new features is carried out. First, the win ratios for each value of these features are calculated (as matches won for that
$$\mathrm{coopPlayer}_{\mathrm{red}} \;=\; \sum_{m \in R_p} \; \sum_{\substack{n \in R_p \\ m \neq n}} \frac{\mathrm{wins}_{mn}}{\mathrm{matches}_{mn}}$$
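Since the text surrounding this equation is not included in this excerpt, the following base-R sketch reflects one plausible reading of it: R_p is taken to be the set of players on the red side, and wins_mn and matches_mn the number of games that players m and n have won and played together; the data structures and names are illustrative only, and a zero-match guard is added simply to keep the sketch safe.

# Illustrative computation of the cooperation feature for one side.
# 'wins' and 'matches' are assumed to be symmetric matrices indexed by player
# name, counting games won / played by each pair of players on the same team.
coop_player <- function(roster, wins, matches) {
  total <- 0
  for (m in roster) {
    for (n in roster) {
      if (m != n && matches[m, n] > 0) {        # skip m = n and unseen pairs
        total <- total + wins[m, n] / matches[m, n]
      }
    }
  }
  total
}

# Toy usage with three players (a real roster has five, one per role).
players <- c("p1", "p2", "p3")
matches <- matrix(5, nrow = 3, ncol = 3, dimnames = list(players, players))
wins    <- matrix(3, nrow = 3, ncol = 3, dimnames = list(players, players))
coop_player(players, wins, matches)   # sum of the pairwise win ratios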
TABLE II
FINAL SET OF SELECTED FEATURES

Table II). All the selected features are from the group of new ones. Among these new features, only the 10 corresponding to how effective a champion is in a certain role have been discarded. Finally, the 28 numerical features are Z-score normalized.

C. Classification

With the preprocessed data, binary classification algorithms are trained to find the patterns present in the training set, allowing generalization to new observations to predict their class. For this training we have selected, from the whole set of algorithms tested in the experimentation, those that obtained the best results, either by themselves or in one of the proposed meta-models.

In order to find the hyperparameters for each algorithm, a grid search has been performed. First, a set of candidate values for the hyperparameters is created. For each possible combination of these values, the model is fitted and the accuracy is estimated using a tenfold cross-validation with five repetitions. Finally, the hyperparameter values with the best metric are chosen. The hyperparameter values chosen for each algorithm can be found in the Appendix.

Among all the trained algorithms based on decision trees, extreme gradient boosting has obtained the best results in this study. This algorithm tends to overfit, so, to avoid this issue, it is especially critical to correctly determine the value of its hyperparameters. Support vector machines performed well in other related work [13]. For this reason, two variants are included in this study. The first one uses a linear kernel and the second variant uses a Gaussian radial basis function kernel with class weights.

As in the previous case, logistic regression is also found in the literature as one of the algorithms with the best predictive capacity for the problem addressed. In this case, the two variants with the best results for this dataset have been selected. One of them is the generalized linear model with boosting. The other variant is the generalized linear model with elastic-net regularization.

The Naive Bayes classifier, despite its simplicity, achieves very good results in this problem. Similarly, a simple classifier such as kNN obtains good accuracy, better than some much more complex algorithms.

Using a neural network, in this case, does not yield results as good as those of the other models presented above. However, its combination in a meta-model with other classifiers adds enough diversity in the prediction to improve the results obtained. A feedforward network is implemented with a single hidden layer and dropout to avoid overfitting. The number of hyperparameters to be adjusted is high, which makes the process of finding their optimal values difficult.

Finally, two meta-models (also known as ensembles) are created combining the previous models and improving their results. The first one uses the models obtained with extreme gradient boosting, the Naive Bayes classifier, and the neural network to create a label fusion with majority voting. This simple meta-model has no hyperparameters. In the second meta-model, a stacking is performed, where the output of the Naive Bayes classifier and the neural network is used as new features to train a model based on the extreme gradient boosting algorithm.

VI. RESULTS

A. Experiments Framework

In the experimentation, a balanced random split with a 90%/10% ratio is used on the dataset. Thus, the training of the algorithms in the modeling phase is performed using 6826 instances, leaving 757 instances for testing. In this case, a tenfold cross-validation with five repetitions was chosen for training the models. These partitions were created using seeds to ensure reproducibility. The metric used for evaluating all models, including those used in the wrapper feature selection, is accuracy. It should also be noted that the experimentation of this work is done in the R programming language [42].

B. Comparative Study

As mentioned before, the test partition was not used for the preprocessing and modeling. This allows an honest estimation of the error to compare the results between classifiers. For this purpose, the accuracy obtained on the test partition for each of the algorithms is computed. Comparing the models (see the first column of Table III), it is easy to note how the meta-models improve the results of the other ones. In particular, the meta-model based on stacking has the highest predictive capacity of the models evaluated.

However, model performance in terms of computation time could be a critical factor for the practical application of the classifiers (e.g., for real-time use during the champion and role selection phase). In that case, if the computation times employed by the algorithms for this dataset with new features and feature selection are compared (see the fourth and fifth columns of Table III), a good choice would be the Naive Bayes classifier. This model
Fig. 8. Ranking of feature importance in the stacking-based meta-model.

a certain role and those that capture how well or poorly a team plays on each side.

VII. CONCLUSION

In this article, we have seen how it is possible to create a classifier for determining the winning team in professional LoL games that, with a limited number of observations and features, obtains good results, in line with those offered by other approaches that have datasets with tens of thousands of instances and with additional information on player performance. To this end, this proposal has been based on the search for and creation of new features that provide additional relevant information to the classifier. To the best of our knowledge, this is the first attempt in the literature to create new features (without relying on external sources) to improve the prediction of the winner in LoL. These new features have been subjected, together with the original ones, to a feature selection process. The result has been used as a final dataset to train different models and assemble a selection of them into a meta-model. The best results were obtained by the meta-model based on stacking, achieving accuracy over 70%. This result is comparable to other approaches from the state of the art, but with the added benefit of using few samples and not requiring the use of external sources to collect additional statistics.

The approach presented in this article can be considered a proof of concept for its application in other videogames (not exclusively MOBAs) or even sports. If the particularities of the dataset allow combining features to create new ones based on win ratios, our methodology can be applied to predict the match outcome.

This proposal opens the door to different lines of future development. The first one is to increase the number of observations in the dataset. Getting more observations would allow studying how this impacts the predictive capacity of the models. It is also interesting to get information about when each match was played. Having temporal information offers a whole range of possibilities, both for the creation of new features that exploit this information (e.g., trends in players to detect periods in which a player performs better or worse), and for using classification algorithms that consider the time factor, such as recurrent neural networks. Another development option is the implementation of a recommendation system for professional teams. This system would advise about which team members should participate,

APPENDIX
CLASSIFIER HYPERPARAMETER VALUES

Extreme Gradient Boosting:
1) Number of trees to be adjusted, nrounds = 14 000;
2) maximum depth of each tree, max_depth = 2;
3) reduction of the step size in each update, eta = 1e-6;
4) reduction of minimum loss required to perform a new partitioning of a node, gamma = 0;
5) ratio of subsample of columns for tree construction, colsample_bytree = 1e-9;
6) minimum of the sum of the instance weight required on a child node, min_child_weight = 1; and
7) ratio of data to be used to generate the trees, subsample = 1e-1.

SVM Linear Kernel:
1) Cost for hyperplane margin infringement, C = 5e-5.

SVM RBF Kernel With Class Weights:
1) Radial kernel coefficient, sigma = 2.8657e-4;
2) cost for hyperplane margin infringement, C = 0.25; and
3) class weight, Weight = 3.

Generalized Linear Model With Boosting:
1) Initial number of boosting iterations, mstop = 200.

Generalized Linear Model With Elastic-Net Regularization:
1) Penalty mix ratio for elastic-net, alpha = 0.55; and
2) penalty value, lambda = 0.09.

Naive Bayes Classifier:
1) Laplace smoothing value, laplace = 0;
2) use of a function to estimate conditional densities for the class of each feature, usekernel = TRUE; and
3) adjustment value for the function of the previous hyperparameter, adjust = 8.

k-Nearest Neighbors:
1) Number of neighbors, k = 16.

Neural Network (MLP):
1) Number of neurons in the hidden layer, size = 256;
2) ratio of information to be discarded after each layer, dropout = 0.4;
3) batch size used in each iteration, batch_size = 2275;
4) learning rate, lr = 3e-4;
5) gradient decay value, rho = 0.9;
6) learning rate decay value, decay = 0.2; and
7) activation function used in neurons, activation = tanh.

Meta-Model With Extreme Gradient Boosting Algorithm:
1) Number of trees to be adjusted, nrounds = 14 000;
2) maximum depth of each tree, max_depth = 2;
3) reduction of the step size in each update, eta = 1e-6;
4) reduction of minimum loss required to perform a new partitioning of a node, gamma = 0;
5) ratio of subsample of columns for tree construction, colsample_bytree = 1e-9;
6) minimum of the sum of the instance weight required on a child node, min_child_weight = 1; and
7) ratio of data to be used to generate the trees, subsample = 0.15.
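As a concrete illustration of how hyperparameter values such as the extreme gradient boosting settings above can be evaluated with the grid search and the tenfold cross-validation with five repetitions described in Sections V-C and VI-A, the sketch below uses the caret package on a synthetic data frame; the article only states that R [42] was used, so the choice of package and all object names are assumptions.

library(caret)     # assumed; provides trainControl() and train()

# Synthetic stand-in for the final preprocessed dataset.
set.seed(1)
train_df <- data.frame(f1 = rnorm(300), f2 = rnorm(300))
train_df$win <- factor(ifelse(train_df$f1 + rnorm(300) > 0, "blue", "red"))

# Tenfold cross-validation with five repetitions (Section VI-A).
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)

# A single-point "grid" holding the extreme gradient boosting values listed
# above; an actual grid search would supply several candidates per column.
xgb_grid <- expand.grid(nrounds          = 14000,
                        max_depth        = 2,
                        eta              = 1e-6,
                        gamma            = 0,
                        colsample_bytree = 1e-9,
                        min_child_weight = 1,
                        subsample        = 0.1)

# Requires the xgboost package; with nrounds = 14 000 this run is slow.
xgb_fit <- train(win ~ ., data = train_df, method = "xgbTree",
                 metric = "Accuracy", trControl = ctrl, tuneGrid = xgb_grid)
xgb_fit$results   # cross-validated accuracy for each grid point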