Experiment-8: Optimal Cricket Team Selection Artificial Intelligence Model
Introduction:
The encyclopaedia defines cricket as "a bat and ball, team game played during the summer in the
British Isles and in several countries influenced by the British, such as Australia, New Zealand, India,
Pakistan, South Africa, and West Indian nations".
Cricket is played between two teams of 11 players on a grassy field, in the centre of which are two
wickets - the equivalent of baseball's 'bases'. Although the game play and rules are very different, the
basic concept of cricket is similar to that of baseball. Teams bat in successive innings and attempt to
score runs, while the opposing team fields and attempts to bring an end to the batting team's
innings. Each team bats an equal number of innings (either one or two, depending on the
conditions chosen before the game), and the team with the most runs wins. A match can last one
day or take as many as five. Cricket is a very popular game and has become an integral part of
the culture. The popularity that the game can bring to a player, and the financial benefits it
can give, attract many aspirants to compete for a place in the team.
Because of this, the selection committee responsible for choosing players for a particular match
faces a tough job. More and more people aspire to be in the team, and although many of them are
good players, the selection committee has to select the best 15 or so. It is not uncommon
to see as many as 300-400 aspirants with varied backgrounds competing for a berth in the team of
11. The competition becomes more severe as the standard of the tournament rises from state to
national, and from national to international level.
Various factors come into play while selecting a team. A human selection committee will invariably
suffer from the shortcomings of unfair or biased judgment, human error, and overlooking of certain
important points. A system is thus required which can effectively take into account all the factors
involved and give the optimal team without human interference. This system should take as input
various performance characteristics such as a player's history, his average score if he is a
batsman, wickets taken and runs scored if he is a bowler, whether he is a wicket keeper, his
performance as a fielder, and so on. Artificial intelligence offers suitable tools for this task. In
this paper, we have explored a genetic algorithm for the selection of a team from a list of
‘probable players’. A system is developed which can consider various factors and give the optimal
team. The results are verified using the case of the Indian cricket team selection for World Cup 2019.
Traditional system:
Cricket is unlike most professional team sports in that the highest-level sides (Test and ODI
national sides) do not have a fixed roster of contracted players. Players qualify for a national
team based solely on long-term residency criteria. A national side for a particular match,
series, or tour is chosen by a panel of selectors, who are senior administrators and
frequently retired players. They choose a squad for a match from the pool of available
first-class players in the country, aiming for the best possible combination of players for
the given match conditions and opposition. A practice match, also known as the selection
match, is played, in which the selectors analyse the performances of the players and select
the best 11 in their opinion.
Drawbacks:
It is observed that established players are frequently selected for the national team,
usually only interrupted by injury or suspension.
On the other hand, lesser known players may be chosen more sporadically, having to
prove themselves worthy of selection to remain in the side, often competing for a
place against other players with similar abilities.
Occasionally, the selectors will choose a relatively unknown player who has
performed well in the domestic circuit, who may only play a single game, or may
perform well enough in his first chance to be selected again.
Sometimes a player has a bad day in the selection match for various reasons but
otherwise possesses great skill; in this way good talent may be missed.
Though rare, politics does occur in cricket as well: selectors may be biased towards a
particular player, who might then get undeserved chances.
PEAS properties:
Performance: The accuracy and level of optimization of the fitness function,
taking into account all the playing factors and team constraints.
Environment: Data of all the players; their batting, bowling and fielding statistics,
injury factors, pitch report, opposition players' statistics.
Environment Properties:
Partially observable: Factors such as a player's mentality and confidence level are
not captured by the statistical data.
Single agent: The environment has a single agent, as input is from one agent.
Deterministic: The environment is deterministic, as the processing steps are known. A
pre-trained genetic algorithm is used.
Episodic: The inputs have no correlation between them; hence, the
environment is episodic.
Discrete: In this GA model, the environment is discrete, as the input is not continuous;
after each input the system creates a new set of results.
Dynamic: The performance of the players keeps changing; hence the environment
changes, making it a dynamic environment.
Proposed Method:
Genetic Algorithm
The genetic algorithm was developed to model some of the processes of nature, and it is
applied in different areas such as biology, engineering, economics, chemistry, and
mathematics. “Genetic Algorithm process was first described by American scientist, John Holland
in early 1960’s”. Many complex real-world optimization problems can be solved using this
method. The main principle of the genetic algorithm is survival of the fittest. The
algorithm repeatedly modifies a population of individual solutions. At each
step it picks individuals at random from the existing population and uses
them to create the next generation. After a considerable number of generations, the population
tends to converge towards an optimal or near-optimal solution. Generally the process begins
with an initial set of random solutions, called the initial population. Each solution in
the population is called a chromosome. In most cases a chromosome is a string
of numbers, and these numbers change over the iterations.
During each iteration, chromosomes are evaluated using a defined fitness function.
The next population is then selected after applying the crossover and mutation operators.
In the two-point crossover method, two strings from the existing population are randomly
selected. Two cut points are then also selected randomly, and everything between the two cut
points is swapped to construct modified chromosomes. In the mutation operator, a random
number in a randomly selected string is changed. These operators are used to modify the
existing population, and the fitness value of each chromosome is calculated after applying
them. Selection, crossover, and mutation are the three main
steps of the genetic algorithm. This process continues until it meets the defined stopping
criteria; after a considerable number of iterations the method tends to converge to an
optimal or near-optimal solution.
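The loop described above can be sketched in Python as follows. This is a minimal illustration on a toy problem (minimising the sum of a list of integers), not the team-selection program itself. The function names, the population size, and the rule of keeping the best individuals from parents plus children are assumptions made for this example.

```python
import random

def two_point_crossover(a, b, rng):
    """Swap the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def mutate(chrom, rng, low, high):
    """Replace one randomly chosen number with a random value."""
    child = list(chrom)
    child[rng.randrange(len(child))] = rng.randint(low, high)
    return child

def run_ga(fitness, length=8, low=0, high=9, pop_size=20,
           generations=50, p_cross=0.6, p_mut=0.05, seed=0):
    """Toy GA that minimises `fitness` (smaller is better)."""
    rng = random.Random(seed)
    # Initial population: random strings of numbers (chromosomes).
    pop = [[rng.randint(low, high) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(pop, 2)  # parents picked at random
            if rng.random() < p_cross:
                a, b = two_point_crossover(a, b, rng)
            children += [mutate(c, rng, low, high) if rng.random() < p_mut
                         else list(c) for c in (a, b)]
        # Survival of the fittest (assumed rule): keep the best pop_size
        # individuals out of parents and children.
        pop = sorted(pop + children, key=fitness)[:pop_size]
    return min(pop, key=fitness)

best = run_ga(sum)
```

Because the best individuals are always carried over in this sketch, the best fitness never gets worse from one generation to the next.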
Problem formulation:
First, all n players in the cricket team pool are categorized as batsmen, fast bowlers, spin
bowlers, all-rounders, and wicket keepers. Then the players’ fitness values are evaluated using
variables such as batting average, batting strike rate, bowling average, number of wickets per
match, win-loss ratio, and experience. Both overall performance and performance at the venue
(example: Australia) are considered when the fitness function is defined. A ranking method is
used to evaluate the fitness values.
For example, if batting average is the variable, a selected player’s mean batting average is
evaluated from his overall batting average and his batting average in Australia. The mean
batting average is calculated in this way for all 30 players, who are then ranked from 1 to 30.
A player with a high mean batting average receives a small rank value. For the mean bowling
average the ranking is reversed relative to the other variables: a lower mean bowling average
receives a smaller rank value. A player who has played more matches has more experience than
others, so his rank for experience is a small number.
The fitness function is defined as the average of the rank values of mean batting average, mean
bowling average, mean strike rate, mean wickets per match, mean win-loss ratio, and experience.
A smaller fitness value therefore indicates a better player.
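The ranking method can be illustrated with a short Python sketch. It assumes that the "mean" of a variable is the simple average of the overall value and the venue value, and that ties are broken by sort order; neither detail is stated in the text. The function names and the three players with their numbers are made up for illustration, and single-valued variables such as experience are left out for brevity.

```python
def rank_values(values, higher_is_better=True):
    """Return ranks (1 = best) for a dict {player: value}."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    return {player: pos for pos, player in enumerate(ordered, start=1)}

def player_fitness(stats, lower_is_better=("bowling_average",)):
    """stats: {variable: {player: (overall, venue)}}.

    Fitness is the average rank across all variables (smaller is better).
    """
    ranks = []
    for variable, per_player in stats.items():
        # Assumption: mean value = average of overall and venue figures.
        means = {p: (overall + venue) / 2
                 for p, (overall, venue) in per_player.items()}
        ranks.append(rank_values(
            means, higher_is_better=variable not in lower_is_better))
    players = ranks[0].keys()
    return {p: sum(r[p] for r in ranks) / len(ranks) for p in players}

# Made-up numbers for three hypothetical players, for illustration only.
stats = {
    "batting_average": {"A": (50.0, 40.0), "B": (30.0, 34.0), "C": (20.0, 10.0)},
    "bowling_average": {"A": (40.0, 44.0), "B": (25.0, 27.0), "C": (30.0, 36.0)},
}
fitness = player_fitness(stats)
# fitness == {"A": 2.0, "B": 1.5, "C": 2.5}
```

Here player B ranks second for batting and first for bowling, so B has the smallest (best) fitness value of 1.5.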
Suppose that the squad of 15 players with the best fitness contains 10 specialist batsmen. Such a
squad is not an appropriate cricket squad: to win matches a team requires batsmen as well as
bowlers and all-rounders. Likewise, if the selected squad consists of 15 bowlers, there are no
batsmen in the team. Such a team is unbalanced and could not realistically win matches.
Therefore, by considering past data, the following constraints were defined so that only
appropriate combinations can be selected for the squad:
4 ≤ Number of Batsmen ≤ 6
1 ≤ Number of Spinners ≤ 3
3 ≤ Number of Fast bowlers ≤ 5
1 ≤ Number of All-rounders ≤ 4
1 ≤ Number of Wicket Keepers ≤ 3
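A constraint check of this kind might look like the following Python sketch. The role names, the `role_of` mapping from player number to role, and the extra requirement that a squad holds 15 distinct players are assumptions for illustration; only the numeric bounds come from the constraints above.

```python
from collections import Counter

# Bounds taken from the constraints above: role -> (minimum, maximum).
ROLE_BOUNDS = {
    "batsman": (4, 6),
    "spinner": (1, 3),
    "fast_bowler": (3, 5),
    "all_rounder": (1, 4),
    "wicket_keeper": (1, 3),
}

def is_balanced(team, role_of, squad_size=15):
    """True if the team has squad_size distinct players (an assumption)
    and every role count lies within its bounds."""
    if len(team) != squad_size or len(set(team)) != squad_size:
        return False
    counts = Counter(role_of[p] for p in team)
    return all(lo <= counts[role] <= hi
               for role, (lo, hi) in ROLE_BOUNDS.items())

# Hypothetical role assignment for players numbered 1..30.
role_of = {}
for first, last, role in [(1, 10, "batsman"), (11, 15, "spinner"),
                          (16, 23, "fast_bowler"), (24, 27, "all_rounder"),
                          (28, 30, "wicket_keeper")]:
    for p in range(first, last + 1):
        role_of[p] = role

balanced = [1, 2, 3, 4, 5, 11, 12, 16, 17, 18, 19, 24, 25, 28, 29]
unbalanced = list(range(1, 16))  # 10 batsmen and 5 spinners
```

With this mapping, `is_balanced(balanced, role_of)` is true and `is_balanced(unbalanced, role_of)` is false.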
Working:
Representation:
After numbering each player, 100 random teams are constructed, where each team contains 15
players. In this scenario a chromosome is a team of 15 players, and it can be represented as a
string of 15 player numbers, as in the following example: 1 3 23 21 4 5 7
29 18 17 5 12 16 22 6, which is the i-th team of the initial population. Out of these 100 teams,
20 teams that satisfy the above-mentioned constraints are randomly selected as the initial
population, so that i varies from 1 to 20. Subsequently the fitness value of each chromosome,
which is the sum of the fitness values of its individual players, is calculated. Crossover and
mutation are the next steps in the genetic algorithm process.
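The representation step can be sketched in Python as below. The function names are hypothetical, `is_valid` stands for whatever constraint check is used, and drawing 15 distinct player numbers per team is an assumption of this sketch.

```python
import random

def random_team(n_players=30, squad_size=15, rng=random):
    """A chromosome: squad_size distinct player numbers from 1..n_players."""
    return rng.sample(range(1, n_players + 1), squad_size)

def initial_population(is_valid, size=20, n_candidates=100, rng=random):
    """Build n_candidates random teams, then randomly pick up to `size`
    of those that satisfy the constraints."""
    candidates = [random_team(rng=rng) for _ in range(n_candidates)]
    valid = [team for team in candidates if is_valid(team)]
    return rng.sample(valid, min(size, len(valid)))

def team_fitness(team, player_fitness):
    """Team fitness is the sum of the individual players' fitness values."""
    return sum(player_fitness[p] for p in team)

# Usage with a placeholder constraint check that accepts every team.
rng = random.Random(1)
population = initial_population(lambda team: True, rng=rng)
dummy_fitness = {p: float(p) for p in range(1, 31)}  # made-up values
scores = [team_fitness(team, dummy_fitness) for team in population]
```

If fewer than 20 of the 100 random teams satisfy the constraints, this sketch returns a smaller population; a real program would probably keep generating teams until 20 valid ones are found.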
Crossover method:
For this problem the two-point crossover method is used. This method randomly selects two cut
points and switches the middle part of two chromosomes to generate new chromosomes. After
switching the middle part, the new chromosomes should also satisfy the constraints. Chromosomes
are selected randomly, and the crossover probability for this problem is kept at 0.6.
The figure demonstrates how to apply the crossover method to two chromosomes of length 15. The
third cell and the 14th cell have been selected as the two cut points, and every number between
those cells is switched to create new chromosomes.
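A two-point crossover of this kind might be written as in the sketch below. Two points are assumptions here, since the text does not settle them: the cut cells themselves are included in the swapped segment, and the parents are kept unchanged when a child violates the constraints. `is_valid` is a hypothetical constraint check supplied by the caller.

```python
import random

def two_point_crossover(parent_a, parent_b, cut1, cut2):
    """Swap cells cut1..cut2 (1-based, inclusive) between two chromosomes."""
    lo, hi = min(cut1, cut2) - 1, max(cut1, cut2)
    child_a = parent_a[:lo] + parent_b[lo:hi] + parent_a[hi:]
    child_b = parent_b[:lo] + parent_a[lo:hi] + parent_b[hi:]
    return child_a, child_b

def crossover_step(parent_a, parent_b, is_valid, p_cross=0.6, rng=random):
    """With probability p_cross, cross the parents at two random cut points.
    Children are kept only if both satisfy the constraints (assumed rule)."""
    if rng.random() >= p_cross:
        return parent_a, parent_b
    cut1, cut2 = rng.sample(range(1, len(parent_a) + 1), 2)
    child_a, child_b = two_point_crossover(parent_a, parent_b, cut1, cut2)
    if is_valid(child_a) and is_valid(child_b):
        return child_a, child_b
    return parent_a, parent_b

# Cut points at the 3rd and 14th cells, as in the description above.
a = list(range(1, 16))    # 1..15
b = list(range(16, 31))   # 16..30
child_a, child_b = two_point_crossover(a, b, 3, 14)
# child_a == [1, 2, 18, 19, ..., 29, 15]
```

Note that swapping a segment can introduce a player who already appears elsewhere in the chromosome, which is one reason the children need to be checked afterwards.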
Mutation method:
In the mutation method, one cell is selected randomly and the number in that cell is changed
randomly to construct a new chromosome. After changing the randomly selected number, the new
chromosome should also satisfy the constraints. The mutation probability for this problem is
kept at 0.05.
The figure shows that the 9th cell has been selected for mutation: before the change the value
of that cell is 10, and after the change it is 30. After applying the mutation operator, the
fitness values of each team are calculated. This process continues until it
meets the stopping criteria.
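The mutation operator could be sketched as follows. The retry limit, the restriction to players not already in the team, and the fallback to the unchanged team are assumptions of this sketch; `is_valid` is again a hypothetical constraint check.

```python
import random

def mutate(team, is_valid, n_players=30, p_mut=0.05, rng=random,
           max_tries=50):
    """With probability p_mut, replace one random cell with a random
    player who is not already in the team."""
    if rng.random() >= p_mut:
        return list(team)
    choices = [p for p in range(1, n_players + 1) if p not in team]
    if not choices:
        return list(team)
    for _ in range(max_tries):
        child = list(team)
        cell = rng.randrange(len(child))
        child[cell] = rng.choice(choices)
        if is_valid(child):  # the mutant must satisfy the constraints
            return child
    return list(team)  # no valid mutant found; keep the original

# Usage: force a mutation (p_mut=1.0) with a check that accepts any team.
rng = random.Random(0)
team = list(range(1, 16))
mutant = mutate(team, lambda t: True, p_mut=1.0, rng=rng)
changed_cells = sum(1 for old, new in zip(team, mutant) if old != new)
```

In this usage example exactly one cell differs between `team` and `mutant`.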
The stopping criterion for this problem is to continue the above process until the difference
between the fitness values of the best player and the worst player is less than 5. Here, 5 is
the smallest value that gives the optimal solution. The selected team is then one in which each
player is closely matched with the other 14 players in the squad, and after a considerable
number of iterations the process converges to the optimal team.
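One possible reading of this stopping rule is sketched below in Python. It assumes the rule is applied to the players of a single team (for instance the best team of the current generation), which the text does not state explicitly. The function name and the fitness values are made up.

```python
def should_stop(team, player_fitness, threshold=5):
    """Stop when the gap between the best and worst player fitness
    within the team is below the threshold."""
    values = [player_fitness[p] for p in team]
    return max(values) - min(values) < threshold

# Made-up rank-based fitness values for three hypothetical players.
example_fitness = {1: 2.0, 2: 4.5, 3: 9.0}
close_team = [1, 2]    # gap 2.5, below the threshold
spread_team = [1, 3]   # gap 7.0, not below the threshold
```

Here `should_stop(close_team, example_fitness)` is true, while `should_stop(spread_team, example_fitness)` is false.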
Results:
The list of 30 probable players announced by the BCCI was taken and fed as input to the
program along with the performance data. The resulting team of 15 players was compared
with the team actually selected, and it was found that 13 players were common to both. The
two players that differed were fresh entrants. The inclusion of fresh entrants, in order to
give encouragement to new and young players, can be accounted for by fixing a few players in
the team or by giving extra fitness points to some players.
During the simulation, the fitness of the teams changed as teams became balanced and
better players were included in the teams. These changes occurred in the following
stages:
1) Random team generation. 15 players were chosen randomly and taken as a team.
2) Teams that did not contain the required number of bowlers, batsmen, and wicket keepers
became extinct, whereas a team having the required number of these players had a better
chance of survival.
3) Balanced teams emerged with the required number of batsmen, bowlers, and wicket keepers.
Achieving balance was no longer the issue, and the performance of the players started
playing a role.
4) Between generations 1-10 the teams were largely unbalanced. Between generations 10-66,
the teams were balanced, but the overall fitness of a team depended on individual player
fitness.
5) From generation 100 onwards (fig 1) and 65 onwards (fig 2) there came a period of
stability. All teams nearly always consisted of the required number of players and had the best
players.
Future work:
The system has considered all the factors that can be explained by the statistical data of the
players. But there are certain factors, such as the mentality of the players, confidence level,
and injuries, which cannot be measured statistically. Even though the number of catches caught
or dropped is available for each player, that variable is not considered, because generally a
player is not dropped from the team on the basis of his fielding ability. If injuries occur,
a set of replacement players can also be selected using this system. As
mentioned earlier, results can be improved by considering players’ performances in
Australia. Even though some people have tried this process before, some important factors, such
as performance in the host country and the classification of bowlers as fast bowlers or spin
bowlers, are missing in their implementations. This method can be further improved by
considering league and club level matches.
Apart from the above limitations and the existing functionality, the following additions can be
implemented in the system:
More complex and efficient fitness functions could be added to evaluate the fitness
of a player with greater accuracy. Several more parameters could be considered
during fitness evaluation to optimize the results.
A functionality could be added to predict the final score of the match.
Conclusion:
A genetic algorithm based method has been developed for selecting an optimal team of cricket
players. Testing has been done by considering the selection process at both league level and
international level. The results indicate good potential for the proposed method.