Genetic Algorithms
SEMINAR REPORT
Submitted by
PRAVEEN R S
To
Of
November, 2010
DEPARTMENT OF MECHANICAL ENGINEERING
COLLEGE OF ENGINEERING
THIRUVANANTHAPURAM – 16
CERTIFICATE
I express my sincere thanks to Sri. K Sunilkumar (Lecturer & Staff Advisor, Department of
Mechanical Engineering), Prof. Z A Samitha (Professor & Senior Staff Advisor, Department
of Mechanical Engineering), Prof. E Abdul Rasheed (Head of Department, Department of
Mechanical Engineering), Dr. J Letha (Principal, College of Engineering, Trivandrum) for
giving me this opportunity and for their kind cooperation during the course of this work.
I also wish to record my gratitude to all my friends and classmates for their help
and support in carrying out this work successfully. I also thank the Lord Almighty for the
grace, strength and hope to make my endeavour a success.
Praveen R S
Abstract
applies the Principle of survival of the fittest to find better and better solutions. The feasible
solutions from the solution space are evaluated using a fitness function and they are selected
for reproduction on the basis of their fitness value. Reproduction involves crossover and
mutation. Successive generations tend to have a better average fitness value than
the previous generation. The iteration process is continued till the required convergence is
optimum. It has a wide variety of applications in Operations Research related problems
problem, etc.
1. Introduction
mimic the performance of biological systems. Evolutionary computing algorithms are used
for search and optimization applications and also include fuzzy logic, which provides an
approximate reasoning basis for representing uncertain and imprecise knowledge. The no free
lunch theorem states that no search algorithm is better on all problems. All search methods
show on average the same performance over all possible problem instances. The present trend
is to combine these fields into a hybrid in order that the drawbacks of one may be offset by
the merits of another. Neural networks, fuzzy logic and evolutionary computing have shown
capability on many problems, but have not yet been able to solve the really complex
[Figure 1 diagram: Knowledge Based Information Systems divide into Approximate Reasoning Approaches (Probabilistic Models, Multivalued & Fuzzy Logic) and Search/Optimisation Approaches (Neural Networks, Evolutionary Algorithms), with Genetic Programming shown beneath Evolutionary Algorithms.]
Figure 1: The Placement of Genetic Algorithms in the hierarchy of Knowledge Based Information Systems
1.1. Evolutionary Algorithms
evolves from one generation to the next, ultimately arriving at a satisfactory solution to
the problem. These algorithms differ in the way a new population is generated from the
present one, and in the way the members are represented within the algorithm. They are
Genetic Algorithms
analogy with the physical process of annealing. Hill climbing, in essence, finds an
optimum by following the local gradient of the function (thus, such methods are also known as
gradient methods).
Random Search Algorithms - Random searches simply perform random walks of the
problem space, recording the best values found. They do not use any
through the search space using the knowledge gained from previous results in the
search.
Evolutionary algorithms exhibit an adaptive behavior that allows them to handle non-
knowledge of the problem structure. They also are very robust to time-varying behavior,
2. Genetic Algorithms
Genetic Algorithms (GAs) were invented by John Holland in the 1960s and were
developed with his students and colleagues at the University of Michigan in the 1970s.
Holland's original goal was to investigate the mechanisms of adaptation in nature and to
develop methods in which these mechanisms could be imported into computer systems.
Genetic algorithms are search methods that employ processes found in natural biological
to find those that approach some specification or criteria. To do this, the algorithm applies
the principle of survival of the fittest to find better and better approximations. At each
potential solutions (individuals) according to their level of fitness in the problem domain
and breeding them together using operators borrowed from natural genetics. This process
leads to the evolution of populations of individuals that are better suited to their
environment than the individuals that they were created from, just as in natural
adaptation.
The GA will generally include the three fundamental genetic operations of selection,
crossover and mutation. These operations are used to modify the chosen solutions and
select the most appropriate offspring to pass on to succeeding generations. They usually
exhibit a reduced chance of converging to local minima. GAs suffer from the problem of
excessive complexity if used on problems that are too large. Genetic algorithms work on
populations of individuals rather than single solutions, allowing for parallel processing to
be performed when finding solutions to larger and more complex problems.
Standard genetic algorithms are implemented where the initial population of individuals
individuals in the current population are decoded and evaluated according to a fitness
function set for a given problem. The expected number of times an individual is chosen is
performed between two selected individuals by exchanging part of their genomes to form
Every member of a population has a certain fitness value associated with it, which
represents the degree of correctness of that particular solution or the quality of solution it
represents. The initial population of strings is randomly chosen. The GA manipulates the
strings using genetic operators to finally arrive at a quality solution to the given problem.
GAs often converge rapidly to quality solutions. Although they do not guarantee
convergence to the single best solution to the problem, the processing leverage associated
with GAs makes them efficient search techniques. The main advantage of a GA is that it is
string represents a different solution to a given problem. Thus, the possibility of the GA
getting caught in local minima is greatly reduced because the whole space of possible
A GA creates an initial population of feasible solutions (a number of
individuals), initializing them randomly at the beginning of a computation. This initial
population is then compared against the specifications or criteria and the individuals that
are closest to the criteria, that is, those with the highest fitness factor, are then recombined
in a way that guides their search to only the most promising areas of the state or search
Each feasible solution is encoded as a chromosome (string) also called a genotype and
each chromosome is given a measure of fitness (fitness factor) via a fitness (evaluation or
objective) function. The fitness of a chromosome determines its ability to survive and
If the optimization criteria are not met, then the creation of a new generation starts.
Individuals are selected (parents) according to their fitness for the production of
(crossover) at some crossover point (locus). All offspring will be mutated (altering some
genes in a chromosome) with a certain probability. The fitness of the offspring is then
computed. The offspring are inserted into the population replacing the parents, producing
a new generation. This cycle is performed until the optimization criteria are reached. In
some cases, where the parent already has a high fitness factor, it is better not to allow this
parent to be discarded when forming a new generation, but to be carried over. Mutation
ensures the entire state-space will be searched, (given enough time) and it is an effective
[Flowchart: Start -> Provide initial population -> Does the average fitness suit the requirement? If yes: Best individuals -> Solution found. If no: Generate new population (Selection, Recombination, Mutation) and test again.]
Mutation. Starting from an initial population of strings (representing possible solutions), the
GA uses these operators to calculate successive generations. First, pairs of individuals of the
current population are selected to mate with each other to form the offspring, which then
This operator selects chromosomes in the population for reproduction. The more fit
the chromosome, the higher its probability of being selected for reproduction. Thus,
selection is based on the survival-of-the-fittest strategy, but the key idea is to select the
better individuals of the population. After selection of the pairs of parent strings, the
The crossover operator involves the swapping of genetic material (bit-values) between
the two parent strings. This operator randomly chooses a locus (a bit position along the
two chromosomes) and exchanges the sub-sequences before and after that locus between
The two individuals (children) resulting from each crossover operation will now be
subjected to the mutation operator in the final step of forming the new generation. This
operator randomly flips or alters one or more bit values at randomly selected locations in
a chromosome.
The mutation operator enhances the ability of the GA to find a near optimal solution to a
which is needed to make sure that the entire solution space is used in the search for the
best solution. In a sense, it serves as an insurance policy; it helps prevent the loss of
genetic material.
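
The cycle of selection, crossover and mutation described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's own implementation: the "one-max" fitness function (count of 1-bits), the binary tournament used for selection, the parameter values, and the practice of stopping on the best individual rather than the average fitness are all choices made only for the example. The best individual is carried over unchanged, as the text suggests for parents with a high fitness factor.

```python
import random

def one_max(bits):
    """Toy fitness (an assumption for illustration): number of 1-bits."""
    return sum(bits)

def tournament(pop, fitness, rng, size=2):
    """Pick the fitter of `size` distinct randomly drawn individuals."""
    return max(rng.sample(pop, size), key=fitness)

def one_point_crossover(a, b, rng):
    """Swap the sub-sequences after a randomly chosen locus."""
    locus = rng.randrange(1, len(a))
    return a[:locus] + b[locus:], b[:locus] + a[locus:]

def flip_bits(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in bits]

def run_ga(n_bits=20, pop_size=30, generations=60, pc=0.7, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=one_max)
    for _ in range(generations):
        if one_max(best) == n_bits:      # optimization criterion met
            break
        nxt = [best[:]]                  # carry the best parent over unchanged
        while len(nxt) < pop_size:
            p1 = tournament(pop, one_max, rng)
            p2 = tournament(pop, one_max, rng)
            if rng.random() < pc:
                c1, c2 = one_point_crossover(p1, p2, rng)
            else:
                c1, c2 = p1[:], p2[:]
            nxt.append(flip_bits(c1, 1 / n_bits, rng))
            if len(nxt) < pop_size:
                nxt.append(flip_bits(c2, 1 / n_bits, rng))
        pop = nxt
        best = max(pop, key=one_max)
    return best
```

Calling `run_ga()` returns the best bit string found; because the best individual is always copied into the next generation, its fitness never decreases from one generation to the next.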
3. Encoding
structured in the GA and also determines what genetic operators are used [1]. Each
This alphabet could consist of binary digits (0 and 1), floating point numbers, integers,
symbols (e.g., A, B, C, D), matrices, etc. In Holland's original design, the alphabet was
limited to binary digits. Each element of the string represents a particular feature in the
chromosome. The first thing that must be done in any new problem is to generate a code
for this problem. How is one to decide on the correct encoding for one's problem?
problems, strongly advocates using whatever encoding is the most natural for your
problem, and then devising a GA that can use that encoding [2].
One appealing idea is to have the encoding itself adapt so that the GA can make better
use of it. Choosing a fixed encoding ahead of time presents a paradox to the potential
GA user: for any problem that is hard enough that one would want to use a GA, one
doesn't know enough about the problem ahead of time to come up with the best
encoding for the GA. Thus, most research is currently done by guessing at an
appropriate encoding and then trying out a particular version of the GA on it.
The actual value of a solution refers to its phenotype whereas the encoded value refers
to its genotype. Search happens in genotypic space, but selection occurs in phenotypic
space. For example, using a binary coding scheme, (5, 3) can be coded as 101 011, in
which 101 refers to 5 and 011 refers to 3. (5, 3) is the phenotype whereas 101 011 is the
genotype.
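
The mapping between phenotype and genotype in the (5, 3) example can be sketched in a few lines. The function names and the fixed width of three bits per value are assumptions made for this illustration.

```python
def encode(values, bits_per_value):
    """Phenotype -> genotype: each integer becomes a fixed-width bit group."""
    return " ".join(format(v, f"0{bits_per_value}b") for v in values)

def decode(genotype):
    """Genotype -> phenotype: each space-separated bit group becomes an integer."""
    return tuple(int(group, 2) for group in genotype.split())

print(encode((5, 3), 3))   # 101 011
print(decode("101 011"))   # (5, 3)
```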
represent the nodes [3]. In the random key method, we assign each gene a random
number drawn uniformly from [0, 1). To decode the chromosome, we visit the nodes in
Decodes as 3 1 2 4 5
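
A small sketch of random key decoding, assuming that nodes are visited in ascending order of their gene values. The key values below are hypothetical; they were chosen only so that they decode to the ordering 3 1 2 4 5.

```python
def decode_random_keys(keys):
    """Return node labels (1-based) sorted by ascending key value."""
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    return [i + 1 for i in order]

# Hypothetical chromosome: gene i holds the key of node i + 1.
keys = [0.25, 0.40, 0.10, 0.60, 0.85]
print(decode_random_keys(keys))   # [3, 1, 2, 4, 5]
```

Because any list of keys decodes to a valid permutation, offspring produced by ordinary crossover on the keys always represent a feasible tour.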
Nodes that should be early in the tour tend to “evolve” genes closer to 0 and those that
should come later tend to evolve genes closer to 1. Standard crossover techniques will
4. Selection
respect to a particular set of parameters [4]. The fitness function transforms that measure
representing a set of parameters is independent of the evaluation of any other string. The
fitness of that string, however, is always defined with respect to other members of the
current population.
When individuals are modified to produce new individuals, they are said to be breeding.
Selection determines which individuals are chosen for breeding (recombination) and how
many offspring each selected individual produces. The individual (chromosome or string)
is first assigned a grade, known as its fitness, which indicates how good a solution it is. The period in
which the individual is evaluated and assigned a fitness value is known as fitness
assessment. Good chromosomes (those with the highest fitness values) survive and have
offspring, while those chromosomes furthest removed or with the lowest fitness values
are culled. Constraints on the chromosomes can be modeled by penalties in the fitness
Once individuals have had their fitness assessed, they may be selected and bred to form
the next generation in the evolution cycle, through repeated application of some selection
function. This function usually selects one or two individuals from the old population,
copies them, modifies them, and returns the modified copies for addition to the new
This selection method normalizes all the fitnesses in the population. These normalized
fitnesses then become the probabilities that their respective individuals will be selected.
Fitnesses may be transformed in some way prior to normalization. One of the problems
fitnesses are rarely an accurate measure of how “good” an individual really is.
In this technique, individuals are first sorted according to their fitness values, with the
first individual being the worst and the last individual being the best. Each individual is
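then selected with a probability based on some linear function of its sorted rank. This is
commonly done by giving the individual at rank i (i = 1 for the worst, i = ||P|| for the best) the selection probability

$$p_i = \frac{1}{\|P\|}\left(2 - c + (2c - 2)\,\frac{i - 1}{\|P\| - 1}\right)$$

where ||P|| is the size of the population P, and 1 < c < 2 is the selection bias: higher
values of the selective pressure c cause the system to focus more on selecting only the
better individuals. The best individual in the population is thus selected with
probability c/||P||.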
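
Linear ranking can be sketched as below. The sketch assumes the commonly used formula p_i = (2 - c + (2c - 2)(i - 1)/(||P|| - 1)) / ||P|| for the individual at rank i (worst first); the function names are hypothetical.

```python
import random

def linear_rank_probabilities(pop_size, c):
    """Selection probability for ranks 1 (worst) .. pop_size (best)."""
    if pop_size < 2:
        raise ValueError("need at least two individuals")
    return [(2 - c + (2 * c - 2) * (i - 1) / (pop_size - 1)) / pop_size
            for i in range(1, pop_size + 1)]

def linear_rank_select(population, fitness, c, rng):
    """Sort worst-to-best, then draw one individual by rank probability."""
    ranked = sorted(population, key=fitness)
    probs = linear_rank_probabilities(len(ranked), c)
    return rng.choices(ranked, weights=probs, k=1)[0]

probs = linear_rank_probabilities(5, 1.5)
# The worst individual gets (2 - c)/||P|| and the best gets c/||P||.
print(probs)
```

With five individuals and c = 1.5 the probabilities rise linearly from 0.1 for the worst to 0.3 for the best, and they sum to 1 regardless of the raw fitness values.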
Stochastic universal sampling provides zero bias and minimum spread. The individuals
are mapped to contiguous segments of a line, such that each individual's segment is equal
in size to its fitness exactly as in roulette-wheel selection. Here, equally spaced pointers
are placed over the line, as many as there are individuals to be selected. If n is the
number of individuals to be selected, then the distance between the pointers is 1/n and the
position of the first pointer is given by a randomly generated number in the range [0, 1/n].
decreasing order of fitness, and their lengths are proportional to their fitness values. The
initial point (here, 'p') is fixed at random and another point 'q' is fixed such that it is at
a distance of 1/n from p. 'r' is fixed such that qr = pq. The solutions which fall at those
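[Diagram: individuals A to F drawn as contiguous segments of a line, with equally spaced pointers p, q and r marked along it.]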
Figure 3: Stochastic Universal Sampling
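
Stochastic universal sampling can be sketched as follows, assuming the fitness values are first normalized so that the segments cover the interval [0, 1). The function name and the example fitness values are hypothetical.

```python
import random

def stochastic_universal_sampling(fitnesses, n, rng):
    """Return the indices of n individuals chosen by equally spaced pointers."""
    total = sum(fitnesses)
    step = 1.0 / n                     # distance between pointers
    start = rng.random() * step        # first pointer, drawn from [0, 1/n)
    chosen = []
    index = 0
    upper = fitnesses[0] / total       # right edge of the current segment
    for k in range(n):
        pointer = start + k * step
        # Advance until the pointer falls inside the current segment.
        while pointer >= upper and index < len(fitnesses) - 1:
            index += 1
            upper += fitnesses[index] / total
        chosen.append(index)
    return chosen

print(stochastic_universal_sampling([3, 1], 4, random.Random(0)))   # [0, 0, 0, 1]
```

Only one random number is drawn per selection round, which is why each individual is picked a number of times that differs from its expected count by less than one.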
The simplest selection scheme is roulette-wheel selection, also called stochastic sampling
with replacement. Each slot on the wheel represents a chromosome from the parent
generation; the width of each slot represents the relative fitness of a given chromosome.
Then the Roulette wheel is simulated. The largest fitness values tend to be the most likely
resting-places for the marble, since they have larger slots. Consider an example.
Individual   Share of wheel
A            7%
B            24%
C            35%
D            14%
E            20%
Here, C, being the most fit individual, has the greatest probability of being selected.
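
The roulette wheel for this example can be simulated as sketched below, using the shares of the five individuals A to E. The function name is hypothetical, and the use of a cumulative sum with a binary search is one of several equivalent ways to locate the slot.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_select(shares, rng):
    """Spin the wheel once; slot widths are proportional to the shares."""
    names = list(shares)
    edges = list(accumulate(shares[name] for name in names))
    spin = rng.random() * edges[-1]
    # bisect_right finds the first slot whose right edge lies beyond the spin.
    return names[min(bisect_right(edges, spin), len(names) - 1)]

shares = {"A": 7, "B": 24, "C": 35, "D": 14, "E": 20}
rng = random.Random(0)
counts = {name: 0 for name in shares}
for _ in range(10000):
    counts[roulette_select(shares, rng)] += 1
print(counts)
```

Over many spins, the counts approach the shares of the wheel, so C is drawn most often and A least often.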
use it. In truncation selection, individuals are sorted according to their fitness. The next
generation is formed from breeding only the best individuals in the population. One form
of truncation selection, (m,l) selection, works as follows. Let the population size l = km
where k and m are positive integers. The m best individuals in the population are
“selected”. Each individual in this group is then used to produce k new individuals in the
next generation. In a variant form, (m + l) selection, m individuals are “selected” from the
union of the population and the m parents which had created that population previously.
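
The (m, l) form of truncation selection can be sketched as below. The `breed` argument is a hypothetical stand-in for whatever crossover and mutation step produces a new individual from a selected parent; it is not defined in the text.

```python
def truncation_select(population, fitness, m):
    """Return the m best individuals, best first."""
    return sorted(population, key=fitness, reverse=True)[:m]

def next_generation(population, fitness, m, breed):
    """(m, l) selection: each of the m best parents produces k = l / m children."""
    l = len(population)
    if l % m != 0:
        raise ValueError("population size l must be a multiple of m")
    k = l // m
    parents = truncation_select(population, fitness, m)
    return [breed(parent) for parent in parents for _ in range(k)]

# Illustration with integers as individuals and identity breeding.
print(next_generation(list(range(12)), lambda x: x, 3, lambda p: p))
# [11, 11, 11, 11, 10, 10, 10, 10, 9, 9, 9, 9]
```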
This selection mechanism is popular because it is simple, fast, and has well-understood
from the population. These are independent choices: an individual may be chosen more
than once. Then tournament selection selects the individual with the highest fitness in this
pool. Clearly, the larger the value n, the more directed this method is at picking highly fit
individuals. If n = 1, then the method selects individuals totally at random. Popular values
for n include 2 and 7. Two is the standard number for genetic algorithm literature, and is
not very selective. Seven is used widely in the genetic programming literature, and is
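
Tournament selection with a pool of size n can be sketched as follows. The draws are made independently and with replacement, as described above; the function name and the integer population are assumptions for the example.

```python
import random

def tournament_select(population, fitness, n, rng):
    """Draw n individuals independently (repeats allowed) and keep the fittest."""
    pool = [rng.choice(population) for _ in range(n)]
    return max(pool, key=fitness)

rng = random.Random(42)
population = list(range(10))          # fitness of an individual is its own value
picks_2 = [tournament_select(population, lambda x: x, 2, rng) for _ in range(2000)]
picks_7 = [tournament_select(population, lambda x: x, 7, rng) for _ in range(2000)]
print(sum(picks_2) / 2000, sum(picks_7) / 2000)
```

The average fitness of the winners is clearly higher for n = 7 than for n = 2, which illustrates how the pool size controls the selection pressure.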
5. Recombination or Crossover
After selection has been carried out, recombination can occur. Crossovers are (sometimes)
deterministic operators that capture the best features of two parents and pass them to a new
When a population has been entirely replaced by children, the new population is known
as the next generation. The whole process of finding an optimal solution is known as
evolving a solution.
The traditional GA uses 1-point crossover, where the two mating chromosomes are each
cut once at corresponding points and the sections after the cuts are exchanged. The locus
In two-point crossover, chromosomes are regarded as loops formed by joining the ends
together. To exchange a segment from one loop with that from another loop requires the
This form of crossover is different from one-point crossover. Each gene in the offspring is
created by copying the corresponding gene from one or the other parent, chosen according to a
randomly generated crossover mask. Where there is a "1" in the crossover mask, the
gene is copied from the first parent and where there is a "0" in the mask, the gene is
copied from the second parent as shown in figure 7. The process is repeated with the
parents exchanged to produce the second offspring. A new crossover mask is randomly
generated for each pair of parents. Offspring, therefore, contain a mixture of genes from
each parent. The number of effective crossing points is not fixed, but will average L/2
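
Uniform crossover with a mask can be sketched as below. The parent strings and the mask are hypothetical values chosen for the illustration; they are not the ones shown in the report's figure.

```python
import random

def random_mask(length, rng):
    """A fresh crossover mask of 0s and 1s, drawn for each pair of parents."""
    return [rng.randint(0, 1) for _ in range(length)]

def uniform_crossover(parent1, parent2, mask):
    """1 in the mask copies from the first parent, 0 from the second.
    The second child is built the same way with the parents exchanged."""
    child1 = [a if m == 1 else b for a, b, m in zip(parent1, parent2, mask)]
    child2 = [b if m == 1 else a for a, b, m in zip(parent1, parent2, mask)]
    return child1, child2

p1 = [1, 0, 1, 1, 0, 0, 1, 1]
p2 = [0, 0, 0, 1, 1, 0, 1, 0]
mask = [1, 1, 0, 1, 0, 1, 1, 0]
print(uniform_crossover(p1, p2, mask))
# ([1, 0, 0, 1, 1, 0, 1, 0], [0, 0, 1, 1, 0, 0, 1, 1])
```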
Shuffle crossover is related to uniform crossover. A single crossover position (as in single
point crossover) is selected. But before the variables are exchanged, they are randomly
shuffled in both parents. After recombination, the variables in the offspring are
unshuffled. This removes positional bias as the variables are randomly reassigned each
Partially matched crossover (PMX) arose in an attempt to solve the blind Travelling
Salesman Problem (TSP). In the blind TSP, fitness is entirely based on the ordering of the
reproduction. PMX begins by selecting two points -- two and five, in this case -- for its
operation.
notes that the H allele in Chromosome 2 will replace the C allele in Chromosome 1
The same process is accomplished for the other two alleles being swapped, that is,
Offspring 1: ED|HBA|FGC
Offspring 2: GF|CDE|HBA
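
The PMX steps can be sketched as follows. The sketch assumes that the parents of this example are ABCDEFGH and GFHBACDE and that the segment between positions two and five is exchanged; this assumption is consistent with the offspring shown above. The function name is hypothetical.

```python
def pmx(parent1, parent2, cut1, cut2):
    """Partially matched crossover on two permutation strings.
    The segment parent[cut1:cut2] is exchanged between the parents."""
    def build(receiver, donor):
        child = list(receiver)
        child[cut1:cut2] = donor[cut1:cut2]
        # Each donated allele maps to the allele it displaced at the same locus.
        mapping = dict(zip(donor[cut1:cut2], receiver[cut1:cut2]))
        for i in list(range(cut1)) + list(range(cut2, len(child))):
            allele = receiver[i]
            while allele in mapping:     # follow the mapping until no duplicate remains
                allele = mapping[allele]
            child[i] = allele
        return "".join(child)
    return build(parent1, parent2), build(parent2, parent1)

print(pmx("ABCDEFGH", "GFHBACDE", 2, 5))   # ('EDHBAFGC', 'GFCDEHBA')
```

Both results are valid permutations of the eight cities, and they match Offspring 1 and Offspring 2 above once the crossover boundaries are marked.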
Order crossover involves the removal of some alleles and the shifting of others. Given the
crossover points and parent chromosomes as in the PMX example, OX would remove the
Offspring 1: - - |HBA|FG
Then, beginning after the second crossover point, OX shifts alleles to the left (wrapping
around the end of the chromosome if necessary), filling empty alleles and leaving an
Offspring 1: BA|---|FGH
Offspring 2: DE|---|GFC
To finish the process, OX exchanges the alleles within the crossover boundaries, finishing
Offspring 1: BA|CDE|FGH
Offspring 2: DE|HBA|GFC
PMX preserves the absolute position of a city allele within chromosomes, whereas OX
This form of crossover works in an entirely different fashion, by swapping a specific set
Parent 1: ABCDEFGH
Parent 2: GFHBACDE
In generating offspring, CX begins with the first cities of the two parent chromosomes:
Offspring 1: G-------
Offspring 2: A-------
A search of Parent 1 finds the just-introduced G allele in position 7. Another swap occurs:
Offspring 1: G-----D-
Offspring 2: A-----G-
The search-and-swapping process continues until the allele first replaced in Parent 1 (the
A) is found in a swap between chromosomes. CX then fills the remaining empty alleles
from corresponding elements of the parents. The final offspring look like this:
Offspring 1: GECBAFDH
Offspring 2: ABHDECGE
Inversion preserves the nature of a permutation while reordering its elements. Here are
Crossover probability (pc) says how often crossover is performed. If there is no
from the parts of the parents' chromosomes. If crossover probability is 100%, then all
offspring are made by crossover. If it is 0%, the whole generation is made from exact copies of
Crossover is performed in the hope that the new chromosomes will contain the good parts of the old
chromosomes and may therefore be better. However, it is good to
6. Mutation
After recombination offspring undergo mutation. Although it is generally held that crossover
is the main force leading to a thorough search of the problem space, mutations are
probabilistic background operators that try to re-introduce needed chromosome features (bit
or allele) into populations whose features have been inadvertently lost. Mutation can assist by
preventing a (small) population from prematurely converging onto a local minimum and remaining
stuck on this minimum due to a recessive gene that has infected the whole population
(genetic drift). It does this by providing a small element of random search in the vicinity of
the population when it has largely converged. Crossover alone cannot prevent the population
converging on a local minimum. Mutation generally finds better solutions than a crossover-
only regime although crossover gives much faster evolution than a mutation-only population.
As the population converges on a solution, mutation becomes more productive and crossover
less productive. Consequently, it is not a choice between crossover and mutation but, rather
the balance among crossover, mutation and selection that is important. Offspring variables
are mutated by the addition of small random values (size of the mutation step), with low
the number of bits (variables) "n", in the chromosome (dimensions). The more dimensions
one individual has, the smaller the mutation probability is required to be.
A mutation rate m = 1/n produces almost optimal results for a broad class of test functions
where the mutation rate is independent of the size of the population. Varying the mutation
rate by increasing it at the beginning of a search and decreasing it to 1/n at the end as the
This technique is generally used in binary coded chromosomes. The value of a particular
This is a modification of the flip bit technique. Here, a bit position is selected at random
and it is changed to the upper or lower bound of the coding scheme used.
6.1.(iii) Uniform Mutation
This technique is similar to uniform crossover. Here also, there is a mutation mask,
which determines which bit positions should be flipped. A bit is flipped where there is a "1"
in the mask and left as it is where there is a "0" in the mask.
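
Mask-based mutation of a binary chromosome can be sketched as below. The sketch assumes that each mask bit is set independently with a given rate, for instance the rate 1/n mentioned earlier for a chromosome of n bits; the function names and example values are hypothetical.

```python
import random

def random_mutation_mask(n, rate, rng):
    """Mask with a 1 at each position that should be flipped."""
    return [1 if rng.random() < rate else 0 for _ in range(n)]

def apply_mutation_mask(bits, mask):
    """Flip a bit where the mask holds 1; leave it unchanged where it holds 0."""
    return [b ^ m for b, m in zip(bits, mask)]

chromosome = [1, 0, 1, 1, 0]
print(apply_mutation_mask(chromosome, [1, 1, 0, 0, 0]))   # [0, 1, 1, 1, 0]

rng = random.Random(0)
mask = random_mutation_mask(len(chromosome), 1 / len(chromosome), rng)
mutated = apply_mutation_mask(chromosome, mask)
```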
Mutation probability says how often parts of a chromosome are mutated. If there is no
mutation, offspring are taken after crossover without any change. If mutation is performed,
changed.
Mutation is intended to prevent the GA from falling into a local optimum, but it should not occur very
7. Convergence
With a correctly designed and implemented GA, the population will evolve over
successive generations so that the fitness of the best and the average individual in each
generation increases towards the global optimum [5]. Convergence is the progression
towards increasing uniformity. A gene is said to have converged when 95% of the
population share the same value. The population is said to have converged when all of the
genes have converged.
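
The 95% criterion for gene convergence can be checked as sketched below. The sketch assumes that the population counts as converged once every gene has converged; the function names and the example population are hypothetical.

```python
from collections import Counter

def gene_converged(population, locus, threshold=0.95):
    """True when at least `threshold` of the population share one value at `locus`."""
    counts = Counter(individual[locus] for individual in population)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(population) >= threshold

def population_converged(population, threshold=0.95):
    """True when every gene position has converged."""
    return all(gene_converged(population, locus, threshold)
               for locus in range(len(population[0])))

# 96 of 100 individuals share the value at gene 0; all share the value at gene 1.
population = [[1, 0]] * 96 + [[0, 0]] * 4
print(gene_converged(population, 0), population_converged(population))   # True True
```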
At the start of a run, the values for each gene for different members of the population are
randomly distributed giving a wide spread of individual fitnesses. As the run progresses
some gene values begin to predominate. As the population converges the range of
fitnesses in the population reduces. This reduced range often leads to premature
A standard problem with GAs is where the genes from a small number of highly fit, but
not optimal, chromosomes may tend to dominate the population causing it to converge on
a local minimum rather than search for a global minimum. Once the population has
reduced its range of fitnesses due to this convergence, the ability of the GA to continue to
search for better solutions is effectively prevented. Crossovers of chromosomes that are
almost identical produce offspring chromosomes that are almost identical to their parents.
The only saving grace is mutation, which allows a slower, wider search of the search space
to be made.
convergence to occur; because the population is not infinite. In order to make GAs work
effectively on finite populations the selection process of parents must be modified. Ways
of doing this are presented in the next section. The basic idea is to control the number of
reproductive opportunities each individual gets, so that it is neither too large, nor too
small. The effect is to compress the range of fitnesses and prevent any "super-fit"
After many generations, the population will have largely converged but may still not have found the
global maximum. The average fitness will be high and the range of fitness levels quite
small. This means that there is very little gradient in the fitness function. Because of this
slight slope, the population slowly edges towards the global maximum rather than going
to it quickly.
There are three sources named S1, S2 and S3, whose supply quantities are 8, 19 and 17
respectively. There are four destinations D1, D2, D3 and D4 whose demands are 11, 3, 14
and 16 respectively. Transportation cost from every source to every destination is the same.
8.2 Encoding
Prüfer number is an encoding technique used to encode spanning tree representations [6].
It is the sequence of numbers of nodes to which the least valued leaf nodes (dangling
nodes) are connected. If there are 'n' stations in a Transportation problem (including
sources and destinations), the Prüfer number consists of n-2 digits. The steps to find the
Prüfer number for the above spanning tree are shown below.
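
The general procedure for obtaining a Prüfer number can be sketched as follows: repeatedly remove the leaf with the smallest label and record the node it was attached to, until two nodes remain. The tree used here is hypothetical (seven stations labelled 1 to 7) and is not the spanning tree of the report's example; the function name is also an assumption.

```python
import heapq

def prufer_encode(edges, n):
    """Prüfer number of a labelled tree on nodes 1..n given as a list of edges."""
    adjacency = {node: set() for node in range(1, n + 1)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    leaves = [node for node in adjacency if len(adjacency[node]) == 1]
    heapq.heapify(leaves)                        # smallest-labelled leaf first
    code = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        neighbour = next(iter(adjacency[leaf]))  # a leaf has exactly one neighbour
        code.append(neighbour)
        adjacency[neighbour].remove(leaf)
        if len(adjacency[neighbour]) == 1:       # the neighbour has become a leaf
            heapq.heappush(leaves, neighbour)
    return code

# Hypothetical spanning tree on 7 stations.
edges = [(1, 5), (2, 5), (2, 6), (2, 7), (3, 4), (3, 7)]
print(prufer_encode(edges, 7))   # [5, 3, 7, 2, 2]
```

The result has n - 2 = 5 entries, in line with the length stated above.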
8.4 GA Operators
The Prüfer number representations of all the feasible solutions need to be obtained first.
They can be evaluated using the fitness function. The fitness for different feasible
solutions can be obtained by calculating the allocations for links between different
sources and destinations, from which the total cost can be obtained. Here, the fitness
should be evaluated in inverse scale. The solution with least cost must be allotted
9 Conclusion
The genetic algorithm (GA) is a search heuristic that mimics the process of natural
evolution. This heuristic is routinely used to generate useful solutions to optimization and
search problems [7]. Genetic algorithms belong to the larger class of evolutionary
represented in binary as strings of 0s and 1s, but other encodings are also possible. The
evolution usually starts from a population of randomly generated individuals and happens
evaluated, multiple individuals are stochastically selected from the current population
(based on their fitness), and modified (recombined and possibly randomly mutated) to
form a new population. The new population is then used in the next iteration of the
generations has been produced, or a satisfactory fitness level has been reached for the
Initially many individual solutions are randomly generated to form an initial population.
The population size depends on the nature of the problem, but typically contains several
randomly, covering the entire range of possible solutions (the search space).
Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to
be found.
breed a new generation. Individual solutions are selected through a fitness-based process,
where fitter solutions (as measured by a fitness function) are typically more likely to be
selected. Certain selection methods rate the fitness of each solution and preferentially
select the best solutions. Other methods rate only a random sample of the population, as
Most functions are stochastic and designed so that a small proportion of less fit solutions
are selected. This helps keep the diversity of the population large, preventing premature
The next step is to generate a second generation population of solutions from those
mutation.
For each new solution to be produced, a pair of "parent" solutions is selected for breeding
from the pool selected previously. By producing a "child" solution using the above
methods of crossover and mutation, a new solution is created which typically shares many
of the characteristics of its "parents". New parents are selected for each new child, and the
Although reproduction methods that are based on the use of two parents are more
"biology inspired", some research suggests more than two "parents" are better to be used
These processes ultimately result in the next generation population of chromosomes that
is different from the initial generation. Generally the average fitness will have increased
by this procedure for the population, since only the best organisms from the first
generation are selected for breeding, along with a small proportion of less fit solutions.
This generational process is repeated until a termination condition has been reached.
The highest ranking solution's fitness is reaching or has reached a plateau such that
Manual inspection
include timetabling and scheduling problems, and many scheduling software packages are
based on GAs. GAs have also been applied to engineering. Genetic algorithms are often
As a general rule of thumb genetic algorithms might be useful in problem domains that have
a complex fitness landscape as crossover is designed to move the population away from local
optima that a traditional hill climbing algorithm might get stuck in.
References