Unit-III Genetic Algorithm

The document provides an overview of genetic algorithms including basic concepts like population, chromosomes, genes, alleles, genotype, phenotype, encoding, decoding, fitness functions and genetic operators. It describes representation methods in genetic algorithms like binary, real-valued, integer and permutation representations. It also discusses population size and initialization methods, and population models like steady state and generational models.


Unit-III

Genetic Algorithms: Basic Genetics, Concepts, Working Principle, Creation of Offsprings, Encoding, Fitness Function, Selection Functions, Genetic Operators - Reproduction, Crossover, Mutation; Genetic Modeling, Benefits.
Genetic Algorithms - Introduction
• Genetic Algorithm (GA) is a search-based optimization technique based on the principles of genetics and natural selection. It is frequently used to find optimal or near-optimal solutions to difficult problems which would otherwise take a very long time to solve, and is widely applied to optimization problems, in research, and in machine learning.
Introduction to Optimization
• Optimization is the process of making something better. Any process can be viewed as a set of inputs producing a set of outputs.

Optimization refers to finding the values of the inputs that give us the "best" output values. The definition of "best" varies from problem to problem, but in mathematical terms it refers to maximizing or minimizing one or more objective functions by varying the input parameters.
What are Genetic Algorithms?
• Nature has always been a great source of inspiration to all mankind.
Genetic Algorithms (GAs) are search-based algorithms based on the concepts of natural selection and genetics. GAs are a subset of a much larger branch of computation known as Evolutionary Computation.
• Advantages of GAs
• GAs have various advantages which have made them immensely popular.
These include
• Does not require any derivative information (which may not be available
for many real-world problems).
• Is often faster and more efficient than traditional methods.
• Has very good parallel capabilities.
• Optimizes both continuous and discrete functions and also multi-objective
problems.
• Provides a list of “good” solutions and not just a single solution.
• Always gets an answer to the problem, which gets better over time.
• Useful when the search space is very large and there are a large number
of parameters involved.
Limitations of GAs
• GAs are not suited for all problems, especially problems which
are simple and for which derivative information is available.
• Fitness value is calculated repeatedly which might be
computationally expensive for some problems.
• Being stochastic, there are no guarantees on the optimality or
the quality of the solution.
• If not implemented properly, the GA may not converge to the
optimal solution.
Basic Terminology
• Before beginning a discussion on Genetic Algorithms, it
is essential to be familiar with some basic terminology
which will be used throughout this tutorial.
• Population − It is a subset of all the possible (encoded)
solutions to the given problem. The population for a
GA is analogous to the population for human beings
except that instead of human beings, we have
Candidate Solutions representing human beings.
• Chromosomes − A chromosome is one such solution to
the given problem.
• Gene − A gene is one element position of a
chromosome.
• Allele − It is the value a gene takes for a particular
chromosome.
• Genotype − Genotype is the population in the computation space. In the computation space, the solutions are represented in a way which can be easily understood and manipulated using a computing system.
• Phenotype − Phenotype is the population in the actual real-world solution space, in which solutions are represented the way they appear in real-world situations.
• Decoding and Encoding − For simple problems, the phenotype and genotype spaces are the same. However, in most cases the phenotype and genotype spaces are different. Decoding is the process of transforming a solution from the genotype to the phenotype space, while encoding is the process of transforming from the phenotype to the genotype space. Decoding should be fast, as it is carried out repeatedly in a GA during the fitness value calculation. For example, consider the 0/1 Knapsack Problem. The phenotype space consists of solutions which just contain the item numbers of the items to be picked. However, in the genotype space a solution can be represented as a binary string of length n (where n is the number of items). A 1 at position x represents that the xth item is picked, while a 0 represents the reverse. This is a case where the genotype and phenotype spaces are different.
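The genotype/phenotype mapping for the knapsack example can be sketched in Python. The function names and item numbering here are illustrative, not part of the original text:

```python
# Decode: binary genotype of length n -> list of picked item indices (phenotype).
def decode(genotype):
    return [i for i, bit in enumerate(genotype) if bit == 1]

# Encode: list of picked item indices -> length-n binary genotype.
def encode(phenotype, n):
    picked = set(phenotype)
    return [1 if i in picked else 0 for i in range(n)]

print(decode([1, 0, 1, 1, 0]))  # → [0, 2, 3]
print(encode([0, 2, 3], 5))     # → [1, 0, 1, 1, 0]
```

Since decoding runs at every fitness evaluation, a simple list comprehension like this keeps it fast.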
• Fitness Function − A fitness function, simply defined, is a function which takes the solution as input and produces the suitability of the solution as the output. In some cases, the fitness function and the objective function may be the same, while in others it might be different based on the problem.
• Genetic Operators − These alter the genetic composition of the offspring. They include crossover, mutation, selection, etc.
• Basic Structure
• The basic structure of a GA is as follows −
• We start with an initial population (which may be generated at random or seeded by other heuristics), and select parents from this population for mating. We apply crossover and mutation operators on the parents to generate new off-springs. Finally, these off-springs replace the existing individuals in the population and the process repeats. In this way, genetic algorithms actually try to mimic human evolution to some extent.
• Each of the following steps is covered as a separate chapter later in this tutorial.
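The loop described above can be sketched as a minimal generational GA. The helper functions here (fitness, selection, crossover, mutation) are placeholder implementations solving a toy "maximize the number of 1s" problem; a real application would supply its own:

```python
import random

def fitness(ind):
    # Toy objective: count the 1-bits.
    return sum(ind)

def select_parent(pop):
    # 3-way tournament: keep the fittest of three random individuals.
    return max(random.sample(pop, 3), key=fitness)

def crossover(p1, p2):
    # One-point crossover producing a single child.
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:]

def mutate(ind, pm=0.05):
    # Flip each bit with a small probability pm.
    return [1 - g if random.random() < pm else g for g in ind]

def run_ga(pop_size=20, length=16, generations=50):
    # Initial population at random, then repeat select -> crossover -> mutate.
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop = [mutate(crossover(select_parent(pop), select_parent(pop)))
               for _ in range(pop_size)]
    return max(pop, key=fitness)

best = run_ga()
print(fitness(best))
```

Each generation fully replaces the previous one, which corresponds to the generational population model discussed later.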
Genotype Representation
• One of the most important decisions to make while implementing a genetic
algorithm is deciding the representation that we will use to represent our
solutions. It has been observed that improper representation can lead to poor
performance of the GA.
• Therefore, choosing a proper representation, having a proper definition of the
mappings between the phenotype and genotype spaces is essential for the success
of a GA.
• In this section, we present some of the most commonly used representations for
genetic algorithms. However, representation is highly problem specific and the
reader might find that another representation or a mix of the representations
mentioned here might suit his/her problem better.
• Binary Representation
• This is one of the simplest and most widely used representations in GAs. In this type of representation the genotype consists of bit strings.
• For some problems when the solution space consists of Boolean decision
variables – yes or no, the binary representation is natural. Take for example
the 0/1 Knapsack Problem. If there are n items, we can represent a solution
by a binary string of n elements, where the xth element tells whether the
item x is picked (1) or not (0).

For other problems, specifically those dealing with numbers, we can


represent the numbers with their binary representation. The problem
with this kind of encoding is that different bits have different
significance and therefore mutation and crossover operators can have
undesired consequences. This can be resolved to some extent by
using Gray Coding, as a change in one bit does not have a massive
effect on the solution.
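A short sketch of the standard reflected binary Gray code (the usual choice; the text does not specify a variant) shows the property the paragraph relies on: adjacent integers differ in exactly one bit, so a single-bit mutation moves the decoded value only a short distance:

```python
def binary_to_gray(n):
    # Reflected binary Gray code of integer n.
    return n ^ (n >> 1)

def gray_to_binary(g):
    # Inverse transform: XOR-fold the shifted value back down.
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

print([binary_to_gray(i) for i in range(4)])  # → [0, 1, 3, 2]
```

Note how 1 (Gray 001) and 2 (Gray 011) differ in only one bit, whereas in plain binary they differ in two.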
• Real Valued Representation
• For problems where we want to define the genes using continuous rather than discrete variables, the real valued representation is the most natural. The precision of these real valued or floating point numbers is, however, limited by the computer.

Integer Representation
For discrete valued genes, we cannot always limit the solution space to binary 'yes' or 'no'. For example, if we want to encode the four directions – North, South, East and West – we can encode them as {0,1,2,3}. In such cases, integer representation is desirable.
• Permutation Representation
• In many problems, the solution is represented by an order of
elements. In such cases permutation representation is the most
suited.
• A classic example of this representation is the travelling salesman
problem (TSP). In this the salesman has to take a tour of all the
cities, visiting each city exactly once and come back to the starting
city. The total distance of the tour has to be minimized. The solution
to this TSP is naturally an ordering or permutation of all the cities
and therefore using a permutation representation makes sense for
this problem.
• Population is a subset of solutions in the current generation. It can
also be defined as a set of chromosomes. There are several things to
be kept in mind when dealing with GA population −
• The diversity of the population should be maintained otherwise it
might lead to premature convergence.
• The population size should not be kept very large as it can cause a
GA to slow down, while a smaller population might not be enough
for a good mating pool. Therefore, an optimal population size needs
to be decided by trial and error.
• The population is usually defined as a two-dimensional array of size population_size × chromosome_size.
• Population Initialization
• There are two primary methods to initialize a population in a GA.
They are −
• Random Initialization − Populate the initial population with
completely random solutions.
• Heuristic initialization − Populate the initial population using a
known heuristic for the problem.
• Population Models
• There are two population models widely in use −
• Steady State
• In steady state GA, we generate one or two off-springs in each
iteration and they replace one or two individuals from the
population. A steady state GA is also known as Incremental GA.
• Generational
• In a generational model, we generate ‘n’ off-springs, where n is the
population size, and the entire population is replaced by the new one
at the end of the iteration.
• Genetic Algorithms - Fitness Function
• The fitness function, simply defined, is a function which takes a candidate solution to the problem as input and produces as output how "fit" or how "good" the solution is with respect to the problem in consideration.
• Calculation of fitness value is done repeatedly in a GA and therefore
it should be sufficiently fast. A slow computation of the fitness value
can adversely affect a GA and make it exceptionally slow.

• In most cases the fitness function and the objective function are the
same as the objective is to either maximize or minimize the given
objective function. However, for more complex problems with
multiple objectives and constraints, an Algorithm Designer might
choose to have a different fitness function.
• A fitness function should possess the following characteristics −
• The fitness function should be sufficiently fast to compute.
• It must quantitatively measure how fit a given solution is or how fit
individuals can be produced from the given solution.
• In some cases, calculating the fitness function directly might not be
possible due to the inherent complexities of the problem at hand. In
such cases, we do fitness approximation to suit our needs.
• As an example, consider the fitness calculation for a solution of the 0/1 Knapsack. It is a simple fitness function which just sums the profit values of the items being picked (which have a 1), scanning the elements from left to right till the knapsack is full.
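That left-to-right scan can be sketched directly. The profit and weight values below are made up purely for illustration:

```python
def knapsack_fitness(chromosome, profits, weights, capacity):
    # Sum profits of picked items left to right, stopping once the next
    # picked item would overfill the knapsack.
    total_profit, total_weight = 0, 0
    for bit, profit, weight in zip(chromosome, profits, weights):
        if bit == 1:
            if total_weight + weight > capacity:
                break  # knapsack full: ignore the rest of the string
            total_weight += weight
            total_profit += profit
    return total_profit

print(knapsack_fitness([1, 0, 1, 1], profits=[10, 5, 8, 7],
                       weights=[3, 2, 4, 5], capacity=8))  # → 18
```

Because this function runs once per individual per generation, keeping it a single linear pass matters for overall GA speed.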
• Parent Selection is the process of selecting parents which mate and recombine to create off-springs for the next generation. Parent selection is very crucial to the convergence rate of the GA, as good parents drive individuals to better and fitter solutions.
• However, care should be taken to prevent one extremely fit solution from
taking over the entire population in a few generations, as this leads to the
solutions being close to one another in the solution space thereby leading to
a loss of diversity. Maintaining good diversity in the population is
extremely crucial for the success of a GA. This taking up of the entire
population by one extremely fit solution is known as premature
convergence and is an undesirable condition in a GA.
• Fitness Proportionate Selection
• Fitness Proportionate Selection is one of the most popular ways of parent
selection. In this every individual can become a parent with a probability
which is proportional to its fitness. Therefore, fitter individuals have a
higher chance of mating and propagating their features to the next
generation. Therefore, such a selection strategy applies a selection pressure
to the more fit individuals in the population, evolving better individuals
over time.
• Consider a circular wheel. The wheel is divided into n pies, where n is the
number of individuals in the population. Each individual gets a portion of
the circle which is proportional to its fitness value.
• Two implementations of fitness proportionate selection are possible −
• Roulette Wheel Selection
• In a roulette wheel selection, the circular wheel is divided as described before. A fixed point is chosen on the wheel circumference and the wheel is rotated. The region of the wheel which comes in front of the fixed point is chosen as the parent. For the second parent, the same process is repeated.
• It is clear that a fitter individual has a greater pie on the wheel
and therefore a greater chance of landing in front of the fixed
point when the wheel is rotated. Therefore, the probability of
choosing an individual depends directly on its fitness.
• Implementation wise, we use the following steps −
• Calculate S = the sum of all fitnesses.
• Generate a random number r between 0 and S.
• Starting from the top of the population, keep adding the fitnesses to the partial sum P, as long as P < r.
• The individual for which P exceeds r is the chosen individual.
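The steps above can be sketched directly in Python. Note the assumption (stated later in this section) that all fitness values are non-negative:

```python
import random

def roulette_wheel_select(population, fitnesses):
    # Spin the wheel: draw a point r in [0, S) and walk the population
    # until the running fitness sum passes it.
    S = sum(fitnesses)
    r = random.uniform(0, S)
    partial = 0.0
    for individual, f in zip(population, fitnesses):
        partial += f
        if partial > r:
            return individual
    return population[-1]  # guard against floating-point round-off

pop = ["a", "b", "c"]
fit = [1.0, 2.0, 7.0]
print(roulette_wheel_select(pop, fit))  # "c" roughly 70% of the time
```

Fitter individuals own a larger slice of [0, S), so they are proportionally more likely to be returned.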
Stochastic Universal Sampling (SUS)
• Stochastic Universal Sampling is quite similar to roulette wheel selection; however, instead of having just one fixed point, we have multiple equally spaced fixed points. Therefore, all the parents are chosen in just one spin of the wheel. Also, such a setup encourages the highly fit individuals to be chosen at least once.

It is to be noted that fitness proportionate selection methods don’t work for


cases where the fitness can take a negative value.
Tournament Selection
• In K-Way tournament selection, we select K individuals from
the population at random and select the best out of these to
become a parent. The same process is repeated for selecting
the next parent. Tournament Selection is also extremely
popular in literature as it can even work with negative fitness
values.
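K-way tournament selection reduces to a few lines; this sketch takes the fitness function as a parameter so it works with negative fitness values as noted above:

```python
import random

def tournament_select(population, fitness_fn, k=3):
    # Sample k individuals at random and keep the best of them.
    contestants = random.sample(population, k)
    return max(contestants, key=fitness_fn)

# All-negative fitness values pose no problem: max() still picks the best.
pop = [-4, -1, -9, -2, -7]
winner = tournament_select(pop, fitness_fn=lambda x: x, k=3)
print(winner)  # the largest of the 3 sampled values
```

Increasing k raises the selection pressure: with k equal to the population size, the best individual always wins.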
Rank Selection
• Rank Selection also works with negative fitness values and is mostly used when the individuals in the population have very close fitness values (this usually happens at the end of the run). Close fitness values lead to each individual having an almost equal share of the pie (as in fitness proportionate selection), so every individual, no matter how fit relative to the others, has approximately the same probability of getting selected as a parent. This in turn leads to a loss in the selection pressure towards fitter individuals, causing the GA to make poor parent selections in such situations.
• In this, we remove the concept of a fitness value while
selecting a parent. However, every individual in the population
is ranked according to their fitness. The selection of the
parents depends on the rank of each individual and not the
fitness. The higher ranked individuals are preferred more than
the lower ranked ones.

Random Selection : In this strategy we randomly select


parents from the existing population. There is no selection
pressure towards fitter individuals and therefore this strategy
is usually avoided.
Genetic Algorithms - Crossover
• The crossover operator is analogous to reproduction and biological crossover. In this, more than one parent is selected and one or more off-springs are produced using the genetic material of the parents. Crossover is usually applied in a GA with a high probability – pc.
• Crossover Operators
• In this section we will discuss some of the most popularly used
crossover operators. It is to be noted that these crossover
operators are very generic and the GA Designer might choose
to implement a problem-specific crossover operator as well.
• One Point Crossover
• In this one-point crossover, a random crossover point is selected and the tails of the two parents are swapped to get new off-springs.
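One-point crossover can be sketched as follows, operating on list-based chromosomes:

```python
import random

def one_point_crossover(parent1, parent2):
    # Pick a cut point (never at the very ends) and swap the tails.
    point = random.randint(1, len(parent1) - 1)
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

c1, c2 = one_point_crossover([0, 0, 0, 0], [1, 1, 1, 1])
print(c1, c2)  # e.g. [0, 0, 1, 1] [1, 1, 0, 0]
```

Multi-point crossover generalizes this by picking several cut points and swapping alternating segments.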
Multi Point Crossover
Multi point crossover is a generalization of the one-point crossover
wherein alternating segments are swapped to get new off-springs.
• Uniform Crossover
• In a uniform crossover, we don't divide the chromosome into segments; rather, we treat each gene separately. In this, we essentially flip a coin for each gene to decide whether or not it'll be included in the off-spring. We can also bias the coin towards one parent, to have more genetic material in the child from that parent.
• Whole Arithmetic Recombination
• This is commonly used for real valued representations and works by taking the weighted average of the two parents using the following formulae −
• Child1 = α.x + (1-α).y
• Child2 = α.y + (1-α).x
• Obviously, if α = 0.5, then both the children will be identical.
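Applied gene by gene to real valued chromosomes, the two formulae look like this (note Child2 uses the weights the other way round, which is what makes α = 0.5 the special symmetric case):

```python
def whole_arithmetic_recombination(x, y, alpha=0.5):
    # Child1 = α·x + (1−α)·y ; Child2 = α·y + (1−α)·x, per gene.
    child1 = [alpha * a + (1 - alpha) * b for a, b in zip(x, y)]
    child2 = [alpha * b + (1 - alpha) * a for a, b in zip(x, y)]
    return child1, child2

c1, c2 = whole_arithmetic_recombination([1.0, 2.0], [3.0, 6.0], alpha=0.5)
print(c1, c2)  # → [2.0, 4.0] [2.0, 4.0]  (identical when α = 0.5)
```

With any other α the children land at mirrored points on the line segment between the parents.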
Genetic Algorithms - Mutation
• Introduction to Mutation
• In simple terms, mutation may be defined as a small random tweak in the
chromosome, to get a new solution. It is used to maintain and introduce
diversity in the genetic population and is usually applied with a low
probability – pm. If the probability is very high, the GA gets reduced to a
random search.
• Mutation is the part of the GA which is related to the “exploration” of the
search space. It has been observed that mutation is essential to the
convergence of the GA while crossover is not.
• Mutation Operators
• In this section, we describe some of the most commonly used mutation
operators. Like the crossover operators, this is not an exhaustive list and the
GA designer might find a combination of these approaches or a problem-
specific mutation operator more useful.
• Bit Flip Mutation
• In this bit flip mutation, we select one or more random bits and flip them.
This is used for binary encoded GAs.
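Bit flip mutation is a one-liner in practice; here each gene is flipped independently with probability pm, one common way of implementing it:

```python
import random

def bit_flip_mutation(chromosome, pm=0.1):
    # Flip each bit with probability pm; 1-gene maps 0→1 and 1→0.
    return [1 - gene if random.random() < pm else gene
            for gene in chromosome]

print(bit_flip_mutation([0, 1, 0, 1, 1], pm=0.1))
```

As the text notes, pm is kept low; setting it near 1 would turn the GA into a random search.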
Random Resetting
Random Resetting is an extension of the bit flip for the integer
representation. In this, a random value from the set of permissible values
is assigned to a randomly chosen gene.
Swap Mutation
In swap mutation, we select two positions on the chromosome at random,
and interchange the values. This is common in permutation based
encodings.
• Scramble Mutation
• Scramble mutation is also popular with permutation
representations. In this, from the entire chromosome, a subset
of genes is chosen and their values are scrambled or shuffled
randomly.

Inversion Mutation
In inversion mutation, we select a subset of genes like in
scramble mutation, but instead of shuffling the subset, we merely
invert the entire string in the subset.
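The three permutation mutations above can be sketched on list-based chromosomes. For clarity, the subset bounds for scramble and inversion are passed in explicitly here; a real GA would draw them at random:

```python
import random

def swap_mutation(chrom):
    # Pick two distinct positions and exchange their values.
    i, j = random.sample(range(len(chrom)), 2)
    chrom = chrom[:]
    chrom[i], chrom[j] = chrom[j], chrom[i]
    return chrom

def scramble_mutation(chrom, start, end):
    # Shuffle the genes in chrom[start:end] in place.
    subset = chrom[start:end]
    random.shuffle(subset)
    return chrom[:start] + subset + chrom[end:]

def inversion_mutation(chrom, start, end):
    # Reverse the genes in chrom[start:end].
    return chrom[:start] + chrom[start:end][::-1] + chrom[end:]

print(inversion_mutation([1, 2, 3, 4, 5], 1, 4))  # → [1, 4, 3, 2, 5]
```

All three preserve the multiset of genes, which is exactly why they are the mutations of choice for permutation encodings such as TSP tours.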
Genetic Algorithms - Application Areas
• Optimization − Genetic Algorithms are most commonly used in
optimization problems wherein we have to maximize or minimize a
given objective function value under a given set of constraints. The
approach to solve Optimization problems has been highlighted
throughout the tutorial.
• Economics − GAs are also used to characterize various economic
models like the cobweb model, game theory equilibrium resolution,
asset pricing, etc.
• Neural Networks − GAs are also used to train neural networks,
particularly recurrent neural networks.
• Parallelization − GAs also have very good parallel capabilities, and
prove to be very effective means in solving certain problems, and
also provide a good area for research.
• Image Processing − GAs are used for various digital image
processing (DIP) tasks as well like dense pixel matching.
• Vehicle routing problems − With multiple soft time
windows, multiple depots and a heterogeneous fleet.
• Scheduling applications − GAs are used to solve various
scheduling problems as well, particularly the time tabling
problem.
• Machine Learning − as already discussed, genetics
based machine learning (GBML) is a niche area in
machine learning.
• Robot Trajectory Generation − GAs have been used to
plan the path which a robot arm takes by moving from
one point to another.
• Parametric Design of Aircraft − GAs have been used to design aircraft by varying the parameters and evolving better solutions.
