Unit-III Genetic Algorithm
Optimization refers to finding the values of inputs in such a way that we get
the “best” output values. The definition of “best” varies from problem to
problem, but in mathematical terms, it refers to maximizing or minimizing
one or more objective functions, by varying the input parameters.
What are Genetic Algorithms?
• Nature has always been a great source of inspiration to all mankind.
Genetic Algorithms (GAs) are search-based algorithms built on the
concepts of natural selection and genetics. GAs are a subset of a much
larger branch of computation known as Evolutionary Computation.
• Advantages of GAs
• GAs have various advantages which have made them immensely popular.
These include
• Does not require any derivative information (which may not be available
for many real-world problems).
• Is often faster and more efficient than traditional methods on large, complex search spaces.
• Has very good parallel capabilities.
• Optimizes both continuous and discrete functions and also multi-objective
problems.
• Provides a list of “good” solutions and not just a single solution.
• Always produces an answer to the problem, and that answer improves over time.
• Useful when the search space is very large and there are a large number
of parameters involved.
Limitations of GAs
• GAs are not suited for all problems, especially problems which
are simple and for which derivative information is available.
• Fitness value is calculated repeatedly which might be
computationally expensive for some problems.
• Because GAs are stochastic, there are no guarantees on the
optimality or the quality of the solution.
• If not implemented properly, the GA may not converge to the
optimal solution.
Basic Terminology
• Before beginning a discussion on Genetic Algorithms, it
is essential to be familiar with some basic terminology
which will be used throughout this tutorial.
• Population − It is a subset of all the possible (encoded)
solutions to the given problem. The population for a
GA is analogous to a population of human beings,
except that its members are candidate solutions
rather than people.
• Chromosomes − A chromosome is one such solution to
the given problem.
• Gene − A gene is one element position of a
chromosome.
• Allele − It is the value a gene takes for a particular
chromosome.
• Genotype − Genotype is the population in the computation space. In the
computation space, the solutions are represented in a way which can be easily
understood and manipulated using a computing system.
• Phenotype − Phenotype is the population in the actual real-world solution space,
in which solutions are represented the way they appear in real-world
situations.
• Decoding and Encoding − For simple problems,
the phenotype and genotype spaces are the same. However,
in most of the cases, the phenotype and genotype spaces are
different. Decoding is a process of transforming a solution
from the genotype to the phenotype space, while encoding is a
process of transforming from the phenotype to genotype space.
Decoding should be fast as it is carried out repeatedly in a GA
during the fitness value calculation. For example, consider the
0/1 Knapsack Problem. The Phenotype space consists of
solutions which just contain the item numbers of the items to
be picked. However, in the genotype space it can be
represented as a binary string of length n (where n is the
number of items). A 1 at position x represents that the xth item is
picked while a 0 represents that it is not. This is a case where
genotype and phenotype spaces are different.
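The mapping between the two spaces can be sketched in a few lines. This is only an illustration, assuming 1-based item numbers and the convention that a 1 marks a picked item (as in the Binary Representation section); the function names `encode` and `decode` are hypothetical.

```python
def encode(picked_items, n):
    """Phenotype -> genotype: item numbers (1-based) to a bit list of length n."""
    picked = set(picked_items)
    return [1 if item in picked else 0 for item in range(1, n + 1)]


def decode(bits):
    """Genotype -> phenotype: bit list to the item numbers that are picked."""
    return [item for item, bit in enumerate(bits, start=1) if bit == 1]


genotype = encode([2, 5], n=5)   # items 2 and 5 are picked out of 5 items
phenotype = decode(genotype)     # back to the list of item numbers
```

Both directions are a single pass over the chromosome, which matters because decoding runs every time a fitness value is calculated.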
Fitness Function − A fitness function simply defined is a function which
takes the solution as input and produces the suitability of the solution as
the output. In some cases, the fitness function and the objective
function may be the same, while in others it might be different based on
the problem.
Genetic Operators − These alter the genetic composition of the
offspring. These include crossover, mutation, selection, etc.
• Basic Structure
• The basic structure of a GA is as follows −
• We start with an initial population (which may be generated at
random or seeded by other heuristics) and select parents from this
population for mating. We then apply crossover and mutation operators to
the parents to generate new offspring. Finally, these offspring
replace the existing individuals in the population and the process
repeats. In this way genetic algorithms try to mimic
natural evolution to some extent.
• Each of these steps is covered as a separate chapter later in
this tutorial.
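The loop described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the toy fitness `one_max` (count of 1 bits), the one-point crossover, the bit-flip mutation, and all parameter values are assumptions chosen for brevity.

```python
import random


def one_max(bits):
    """Hypothetical toy fitness: the number of 1 bits in the chromosome."""
    return sum(bits)


def run_ga(fitness, n_genes=20, pop_size=30, generations=50, p_mut=0.02, seed=0):
    """Generational GA over bit strings; n_genes must be at least 2."""
    rng = random.Random(seed)
    # Initial population generated at random.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection weights grow with fitness; +1 keeps every weight positive.
        weights = [fitness(ind) + 1 for ind in population]
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = rng.choices(population, weights=weights, k=2)
            cut = rng.randint(1, n_genes - 1)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each gene flips with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            offspring.append(child)
        population = offspring                      # offspring replace the parents
    return max(population, key=fitness)


best = run_ga(one_max)
```

Passing a fixed `seed` makes a run repeatable, which helps when comparing operator choices.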
Genotype Representation
• One of the most important decisions to make while implementing a genetic
algorithm is deciding the representation that we will use to represent our
solutions. It has been observed that improper representation can lead to poor
performance of the GA.
• Therefore, choosing a proper representation, having a proper definition of the
mappings between the phenotype and genotype spaces is essential for the success
of a GA.
• In this section, we present some of the most commonly used representations for
genetic algorithms. However, representation is highly problem specific and the
reader might find that another representation or a mix of the representations
mentioned here might suit his/her problem better.
• Binary Representation
• This is one of the simplest and most widely used representations in GAs. In
this type of representation the genotype consists of bit strings.
• For some problems when the solution space consists of Boolean decision
variables – yes or no, the binary representation is natural. Take for example
the 0/1 Knapsack Problem. If there are n items, we can represent a solution
by a binary string of n elements, where the xth element tells whether the
item x is picked (1) or not (0).
Integer Representation
For discrete valued genes, we cannot always limit the solution space to
binary ‘yes’ or ‘no’. For example, if we want to encode the four directions –
North, South, East and West, we can encode them as {0,1,2,3}. In such
cases, integer representation is desirable.
• Permutation Representation
• In many problems, the solution is represented by an order of
elements. In such cases, permutation representation is the most
suitable.
• A classic example of this representation is the travelling salesman
problem (TSP). Here the salesman has to take a tour of all the
cities, visiting each city exactly once and come back to the starting
city. The total distance of the tour has to be minimized. The solution
to this TSP is naturally an ordering or permutation of all the cities
and therefore using a permutation representation makes sense for
this problem.
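As a small illustration of a permutation chromosome, the sketch below computes the length of a closed tour. The city coordinates and the use of straight-line distance are assumptions made for the example; `tour_length` is a hypothetical helper name.

```python
import math


def tour_length(tour, coords):
    """Total length of a closed tour; `tour` is a permutation of city indices."""
    total = 0.0
    for i in range(len(tour)):
        here = coords[tour[i]]
        nxt = coords[tour[(i + 1) % len(tour)]]   # wrap around to the start city
        total += math.dist(here, nxt)
    return total


# Four hypothetical cities on the corners of a 4 x 3 rectangle.
coords = [(0, 0), (0, 3), (4, 3), (4, 0)]
around_the_edge = tour_length([0, 1, 2, 3], coords)   # 3 + 4 + 3 + 4
crossing = tour_length([0, 2, 1, 3], coords)          # uses both diagonals
```

Since the TSP is a minimization problem, a GA would treat the shorter tour as the fitter one.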
• Population is a subset of solutions in the current generation. It can
also be defined as a set of chromosomes. There are several things to
be kept in mind when dealing with GA population −
• The diversity of the population should be maintained otherwise it
might lead to premature convergence.
• The population size should not be kept very large as it can cause a
GA to slow down, while a smaller population might not be enough
for a good mating pool. Therefore, an optimal population size needs
to be decided by trial and error.
• The population is usually defined as a two-dimensional array of
size (population size) × (chromosome size).
• Population Initialization
• There are two primary methods to initialize a population in a GA.
They are −
• Random Initialization − Populate the initial population with
completely random solutions.
• Heuristic initialization − Populate the initial population using a
known heuristic for the problem.
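Both methods can be sketched for binary chromosomes as below. The population is a list of lists, one row per chromosome. Filling the remainder of a seeded population with random individuals is an assumption of this sketch, made to keep some diversity; the function names are hypothetical.

```python
import random


def random_population(pop_size, chromosome_size, rng):
    """Random initialization: every gene is drawn uniformly from {0, 1}."""
    return [[rng.randint(0, 1) for _ in range(chromosome_size)]
            for _ in range(pop_size)]


def seeded_population(pop_size, chromosome_size, heuristic_solutions, rng):
    """Heuristic initialization: start from known solutions, pad with random ones."""
    seeds = [list(s) for s in heuristic_solutions[:pop_size]]
    rest = random_population(pop_size - len(seeds), chromosome_size, rng)
    return seeds + rest


rng = random.Random(42)
pop_random = random_population(6, 8, rng)
pop_seeded = seeded_population(6, 8, [[1] * 8, [0] * 8], rng)
```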
• Population Models
• There are two population models widely in use −
• Steady State
• In a steady state GA, we generate one or two offspring in each
iteration and they replace one or two individuals from the
population. A steady state GA is also known as Incremental GA.
• Generational
• In a generational model, we generate ‘n’ offspring, where n is the
population size, and the entire population is replaced by the new one
at the end of the iteration.
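The difference between the two models lies only in the replacement step, as the sketch below tries to show. Here `make_child` stands for any routine that selects parents and applies the operators; replacing the least fit individuals in the steady state model is an assumption of this sketch, since other replacement policies are possible.

```python
def generational_step(population, make_child):
    """Generational model: n new offspring replace the entire population."""
    return [make_child(population) for _ in range(len(population))]


def steady_state_step(population, make_child, fitness, n_children=1):
    """Steady state model: a few offspring replace the least fit individuals."""
    children = [make_child(population) for _ in range(n_children)]
    keep = len(population) - n_children
    survivors = sorted(population, key=fitness, reverse=True)[:keep]
    return survivors + children


# Hypothetical stand-in for selection + crossover + mutation.
def always_ones(population):
    return [1, 1, 1]


population = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]
next_generational = generational_step(population, always_ones)
next_steady = steady_state_step(population, always_ones, fitness=sum)
```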
• Genetic Algorithms - Fitness Function
• The fitness function, simply defined, is a function which takes
a candidate solution to the problem as input and produces as
output how “fit” or how “good” the solution is with respect to the
problem in consideration.
• Calculation of fitness value is done repeatedly in a GA and therefore
it should be sufficiently fast. A slow computation of the fitness value
can adversely affect a GA and make it exceptionally slow.
• In most cases the fitness function and the objective function are the
same as the objective is to either maximize or minimize the given
objective function. However, for more complex problems with
multiple objectives and constraints, an Algorithm Designer might
choose to have a different fitness function.
• A fitness function should possess the following characteristics −
• The fitness function should be sufficiently fast to compute.
• It must quantitatively measure how fit a given solution is or how fit
individuals can be produced from the given solution.
• In some cases, calculating the fitness function directly might not be
possible due to the inherent complexities of the problem at hand. In
such cases, we do fitness approximation to suit our needs.
• The following image shows the fitness calculation for a solution of
the 0/1 Knapsack. It is a simple fitness function which just sums the
profit values of the items being picked (which have a 1), scanning
the elements from left to right till the knapsack is full.
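A fitness function of this kind can be sketched as follows. The profits, weights and capacity are made-up values, and stopping at the first picked item that no longer fits is one reading of "till the knapsack is full"; treat it as an assumption.

```python
def knapsack_fitness(bits, profits, weights, capacity):
    """Sum the profits of picked items, left to right, until the knapsack is full."""
    total_profit = 0
    total_weight = 0
    for bit, profit, weight in zip(bits, profits, weights):
        if bit != 1:
            continue                      # item not picked
        if total_weight + weight > capacity:
            break                         # knapsack is full: stop scanning
        total_weight += weight
        total_profit += profit
    return total_profit


# Hypothetical problem instance with four items.
profits = [10, 5, 15, 7]
weights = [2, 3, 5, 7]
fits_exactly = knapsack_fitness([1, 1, 1, 0], profits, weights, capacity=10)
overfull = knapsack_fitness([1, 0, 1, 1], profits, weights, capacity=10)
```

In the second call the last picked item would push the weight to 14, so only the first two picked items contribute to the fitness.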
• Parent Selection is the process of selecting parents which mate and
recombine to create offspring for the next generation. Parent selection is
crucial to the convergence rate of the GA as good parents drive
individuals to better and fitter solutions.
• However, care should be taken to prevent one extremely fit solution from
taking over the entire population in a few generations, as this leads to the
solutions being close to one another in the solution space thereby leading to
a loss of diversity. Maintaining good diversity in the population is
extremely crucial for the success of a GA. This taking up of the entire
population by one extremely fit solution is known as premature
convergence and is an undesirable condition in a GA.
• Fitness Proportionate Selection
• Fitness Proportionate Selection is one of the most popular ways of parent
selection. In this every individual can become a parent with a probability
which is proportional to its fitness. Therefore, fitter individuals have a
higher chance of mating and propagating their features to the next
generation. Such a selection strategy applies selection pressure
in favour of the fitter individuals in the population, evolving better
individuals over time.
• Consider a circular wheel. The wheel is divided into n pies, where n is the
number of individuals in the population. Each individual gets a portion of
the circle which is proportional to its fitness value.
• Two implementations of fitness proportionate selection are possible −
• Roulette Wheel Selection
• In a roulette wheel selection, the circular wheel is divided as
described before. A fixed point is chosen on the wheel
circumference as shown and the wheel is rotated. The region
of the wheel which comes in front of the fixed point is chosen
as the parent. For the second parent, the same process is
repeated.
• It is clear that a fitter individual has a greater pie on the wheel
and therefore a greater chance of landing in front of the fixed
point when the wheel is rotated. Therefore, the probability of
choosing an individual depends directly on its fitness.
• Implementation wise, we use the following steps −
• Calculate S = the sum of all fitnesses.
• Generate a random number r between 0 and S.
• Starting from the top of the population, keep adding the
fitnesses to the partial sum P while P < r.
• The individual for which P exceeds r is the chosen individual.
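The steps above can be sketched as follows, where the partial sum P is compared against the random number r drawn between 0 and S. The sketch assumes non-negative fitness values with a positive sum; the function name is hypothetical.

```python
import random


def roulette_wheel_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)            # S
    r = rng.random() * total          # random point in [0, S)
    partial = 0.0                     # P
    for individual, fit in zip(population, fitnesses):
        partial += fit
        if partial > r:               # P has passed the random point
            return individual
    return population[-1]             # guard against floating-point rounding


rng = random.Random(1)
draws = [roulette_wheel_select(["a", "b"], [1, 9], rng) for _ in range(2000)]
share_of_b = draws.count("b") / len(draws)   # expected to be close to 0.9
```

Calling the function twice gives the two parents, matching the description of spinning the wheel once per parent.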
Stochastic Universal Sampling (SUS)
• Stochastic Universal Sampling is quite similar to Roulette wheel selection,
however instead of having just one fixed point, we have multiple fixed
points as shown in the following image. Therefore, all the parents are
chosen in just one spin of the wheel. Also, such a setup encourages the
highly fit individuals to be chosen at least once.
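A minimal sketch of SUS is given below, assuming non-negative fitness values with a positive sum. The equally spaced pointers play the role of the multiple fixed points, and a single random offset plays the role of the one spin; the function name is hypothetical.

```python
import random


def sus_select(population, fitnesses, n_parents, rng):
    """Choose n_parents individuals with one spin and equally spaced pointers."""
    total = sum(fitnesses)
    step = total / n_parents                   # distance between pointers
    start = rng.random() * step                # the single random spin
    pointers = [start + i * step for i in range(n_parents)]
    chosen = []
    idx = 0
    partial = fitnesses[0]                     # cumulative fitness up to idx
    for p in pointers:
        # Advance to the individual whose slice of the wheel contains p.
        while partial <= p and idx < len(population) - 1:
            idx += 1
            partial += fitnesses[idx]
        chosen.append(population[idx])
    return chosen


parents = sus_select(["a", "b", "c"], [8, 1, 1], 2, random.Random(0))
```

With fitnesses 8, 1 and 1 and two pointers spaced 5 apart, the first pointer always lands in the slice of "a", so the fittest individual is chosen at least once.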
Inversion Mutation
In inversion mutation, we select a subset of genes like in
scramble mutation, but instead of shuffling the subset, we merely
invert the entire string in the subset.
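Inversion mutation can be sketched as below. Choosing the segment by two random cut points is an assumption of this sketch; `invert_segment` and `inversion_mutation` are hypothetical names.

```python
import random


def invert_segment(chromosome, i, j):
    """Return a copy of the chromosome with genes i..j-1 in reverse order."""
    return chromosome[:i] + chromosome[i:j][::-1] + chromosome[j:]


def inversion_mutation(chromosome, rng):
    """Pick two distinct cut points at random and invert the genes between them."""
    i, j = sorted(rng.sample(range(len(chromosome) + 1), 2))
    return invert_segment(chromosome, i, j)


example = invert_segment([0, 1, 2, 3, 4, 5], 1, 4)   # genes 1, 2, 3 are reversed
mutant = inversion_mutation([0, 1, 2, 3, 4, 5], random.Random(7))
```

Because the genes are only reordered, the result of mutating a permutation chromosome is still a valid permutation.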
Genetic Algorithms - Application Areas
• Optimization − Genetic Algorithms are most commonly used in
optimization problems wherein we have to maximize or minimize a
given objective function value under a given set of constraints. The
approach to solve Optimization problems has been highlighted
throughout the tutorial.
• Economics − GAs are also used to characterize various economic
models like the cobweb model, game theory equilibrium resolution,
asset pricing, etc.
• Neural Networks − GAs are also used to train neural networks,
particularly recurrent neural networks.
• Parallelization − GAs also have very good parallel capabilities,
prove to be a very effective means of solving certain problems, and
also provide a good area for research.
• Image Processing − GAs are also used for various digital image
processing (DIP) tasks, such as dense pixel matching.
• Vehicle routing problems − GAs are applied to routing problems with
multiple soft time windows, multiple depots and a heterogeneous fleet.
• Scheduling applications − GAs are used to solve various
scheduling problems as well, particularly the time tabling
problem.
• Machine Learning − As already discussed, genetics-based
machine learning (GBML) is a niche area in
machine learning.
• Robot Trajectory Generation − GAs have been used to
plan the path which a robot arm takes by moving from
one point to another.
• Parametric Design of Aircraft − GAs have been used to
design aircraft by varying the parameters and evolving
better solutions.