TECHNICAL SEMINAR REPORT
on
Title of Seminar
submitted in partial fulfillment of the requirements for the award of the
degree of
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
by
Mr. abcd
17J41A05xx
Under the esteemed guidance of
Guide Name
Designation
DEPARTMENT OF COMPUTER SCIENCE AND
ENGINEERING
MALLA REDDY ENGINEERING COLLEGE
(AUTONOMOUS)
(An UGC Autonomous Institution, Approved by AICTE and Affiliated to JNTUH Hyderabad,
Recognized under Section 2(f) & 2(B) of UGC Act 1956, Accredited by NAAC with ‘A’ Grade (II Cycle) and NBA)
Maisammaguda, Dhulapally (Post Via Kompally), Secunderabad-500 100
Website: www.mrec.ac.in E-mail: [email protected]
2017-2021
DECLARATION
I hereby declare that this Technical Seminar Report titled “GENETIC ALGORITHM” is
the original and bona fide work of my own, submitted in partial fulfillment of the
requirements for the award of the degree of Bachelor of Technology in Computer
Science and Engineering at Malla Reddy Engineering College (Autonomous), affiliated
to JNTUH, Hyderabad, under the guidance of PROF. MOHAMMAD INAYATHULLA, and has
not been copied from any earlier reports.
Mr. DARAM VISHNU
VARDHAN
(17J41A0577)
MALLA REDDY ENGINEERING COLLEGE (AUTONOMOUS)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CERTIFICATE
This is to certify that the Seminar titled “Your Title” is a bona fide work done by Mr.
abcd (17J41A05xx), in partial fulfillment of the requirements for the award of the degree
of Bachelor of Technology in Computer Science and Engineering at Malla Reddy
Engineering College (A), Affiliated to JNTUH, Hyderabad, and that this work has not
been submitted for the award of any other degree/diploma of any Institution/University.
Coordinator Internal Guide
Head of the Department
ACKNOWLEDGEMENT
I am extremely thankful to our beloved Chairman and Founder of Malla Reddy Group of
Institutions, Sri Ch. Malla Reddy, for providing the necessary infrastructure facilities
throughout the Technical Seminar.
I express my sincere thanks to our Director, Dr. A. Ramaswamy Reddy, who took keen
interest and encouraged us in every effort during the Technical Seminar.
I also express my gratitude to our Principal, Dr. A. Ravindra, for having provided me
with adequate facilities to pursue my course and Seminar successfully.
I would like to thank Dr. O. Obulesu, Professor and Head, Department of Computer
Science and Engineering, for giving me the opportunity to use all the facilities available
in the department for the successful completion of the Seminar.
I am very grateful to our Seminar Coordinator, Mr. M. Praveen, Assistant Professor, for
extending his support and assisting us throughout our Seminar.
I am very grateful to our Seminar Internal Guide, PROF. MOHAMMAD INAYATHULLA,
for his extensive patience and guidance throughout my Technical Seminar.
I sincerely thank all the staff of our department, for their timely suggestions, healthy
criticism and motivation during the course of the Seminar.
Finally, I express my immense gratitude to all individuals who have directly or
indirectly contributed at the right time to the development and success of my
Seminar.
GENETIC ALGORITHM
Abstract
In computer science and operations research, a genetic
algorithm (GA) is a metaheuristic inspired by the
process of natural selection that belongs to the larger
class of evolutionary algorithms (EA).
Genetic algorithms are commonly used to generate high-
quality solutions to optimization and search problems by
relying on biologically inspired operators such
as mutation, crossover and selection.
Problems which appear to be particularly appropriate
for solution by genetic algorithms
include timetabling and scheduling problems, and many
scheduling software packages are based on GA. GAs
have also been applied to engineering. Genetic
algorithms are often applied as an approach to
solve global optimization problems.
List of acronyms
GA - GENETIC ALGORITHM
EA - EVOLUTIONARY ALGORITHM
PC - PROBABILITY OF CROSSOVER
PM - PROBABILITY OF MUTATION
AGA - ADAPTIVE GENETIC ALGORITHM
CAGA - CLUSTER BASED ADAPTIVE GENETIC ALGORITHM
DNA - DEOXYRIBONUCLEIC ACID
LLGA - LINKAGE LEARNING GENETIC ALGORITHM
GEMGA - GENE EXPRESSION MESSY GENETIC ALGORITHM
mGA - MESSY GENETIC ALGORITHM
CONTENTS:
INTRODUCTION TO DARWIN’S THEORY OF NATURAL SELECTION
INTRODUCTION TO GENETIC ALGORITHM
INTRODUCTION TO OPERATORS USED IN GENETIC ALGORITHM
DIFFERENT VARIANTS OF GENETIC ALGORITHM
LIMITATIONS OF GENETIC ALGORITHM
PROBLEM DOMAINS OF GENETIC ALGORITHM
CONCLUSION
INTRODUCTION TO DARWIN’S THEORY OF NATURAL
SELECTION
fig-1.1
Natural selection is the differential survival and reproduction
of individuals due to differences in phenotype. It is a key
mechanism of evolution, the change in
the heritable traits characteristic of a population over
generations. Charles Darwin popularised the term "natural
selection", contrasting it with artificial selection, which in his
view is intentional, whereas natural selection is not.
Variation exists within all populations of organisms. This
occurs partly because random mutations arise in
the genome of an individual organism, and their offspring can
inherit such mutations. Throughout the lives of the
individuals, their genomes interact with their environments to
cause variations in traits. The environment of a genome
includes the molecular biology in the cell, other cells, other
individuals, populations, species, as well as the abiotic
environment. Because individuals with certain variants of the
trait tend to survive and reproduce more than individuals with
other less successful variants, the population evolves. Other
factors affecting reproductive success include sexual
selection (now often included in natural selection)
and fecundity selection.
Natural selection acts on the phenotype, the characteristics of
the organism which actually interact with the environment,
but the genetic (heritable) basis of any phenotype that gives
that phenotype a reproductive advantage may become more
common in a population. Over time, this process can result in
populations that specialise for particular ecological
niches (microevolution) and may eventually result
in speciation (the emergence of new species, macroevolution).
In other words, natural selection is a key process in the
evolution of a population.
Natural selection is a cornerstone of modern biology. The
concept, published by Darwin and Alfred Russel Wallace in
a joint presentation of papers in 1858, was elaborated in
Darwin's influential 1859 book On the Origin of Species by
Means of Natural Selection, or the Preservation of Favoured
Races in the Struggle for Life. He described natural selection
as analogous to artificial selection, a process by which
animals and plants with traits considered desirable by human
breeders are systematically favoured for reproduction. The
concept of natural selection originally developed in the
absence of a valid theory of heredity; at the time of Darwin's
writing, science had yet to develop modern theories of
genetics. The union of traditional Darwinian evolution with
subsequent discoveries in classical genetics formed
the modern synthesis of the mid-20th century. The addition
of molecular genetics has led to evolutionary developmental
biology, which explains evolution at the molecular level.
While genotypes can slowly change by random genetic drift,
natural selection remains the primary explanation for adaptive
evolution.
Introduction to Genetic
Algorithms
A genetic algorithm is a search heuristic that is
inspired by Charles Darwin’s theory of natural
evolution. This algorithm reflects the process of natural
selection where the fittest individuals are selected for
reproduction in order to produce offspring of the next
generation.
NATURAL SELECTION :
The process of natural selection starts with the selection
of fittest individuals from a population. They produce
offspring which inherit the characteristics of the parents
and will be added to the next generation. If parents have
better fitness, their offspring will be better than parents
and have a better chance at surviving. This process keeps
on iterating and at the end, a generation with the fittest
individuals will be found.
This notion can be applied for a search problem. We
consider a set of solutions for a problem and select the
set of best ones out of them.
Five phases are considered in a genetic algorithm.
• Initial population
• Fitness function
• Selection
• Crossover
• Mutation
Initial Population
The process begins with a set of individuals which is
called a Population. Each individual is a solution to the
problem you want to solve.
An individual is characterized by a set of parameters
(variables) known as Genes. Genes are joined into a
string to form a Chromosome (solution).
In a genetic algorithm, the set of genes of an individual
is represented using a string, in terms of an alphabet.
Usually, binary values are used (string of 1s and 0s). We
say that we encode the genes in a chromosome.
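As an illustrative sketch (not from the report), a binary-encoded initial population can be generated in Python; the population size and chromosome length below are arbitrary choices:

```python
import random

def random_chromosome(length):
    """Encode an individual as a list of random bits (genes)."""
    return [random.randint(0, 1) for _ in range(length)]

def initial_population(pop_size, chrom_length):
    """Create the starting population of candidate solutions."""
    return [random_chromosome(chrom_length) for _ in range(pop_size)]

population = initial_population(pop_size=6, chrom_length=16)
for chrom in population:
    print("".join(map(str, chrom)))
```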
Fitness Function
The fitness function determines how fit an individual
is (the ability of an individual to compete with other
individuals). It gives a fitness score to each individual.
The probability that an individual will be selected for
reproduction is based on its fitness score.
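As a toy illustration of a fitness function (our choice, not the report's), consider the classic OneMax problem, where fitness is simply the number of 1-bits in the chromosome:

```python
def fitness(chromosome):
    """Fitness score: the number of 1-bits (the OneMax toy problem)."""
    return sum(chromosome)

print(fitness([1, 1, 0, 1]))  # → 3
```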
Selection
The idea of selection phase is to select the fittest
individuals and let them pass their genes to the next
generation.
Two pairs of individuals (parents) are selected based
on their fitness scores. Individuals with high fitness have
more chance to be selected for reproduction.
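One common way to implement this, sketched here under the assumption of non-negative fitness scores, is fitness-proportionate (roulette-wheel) selection:

```python
import random

def roulette_select(population, fitnesses):
    """Fitness-proportionate (roulette-wheel) selection of one parent.
    Assumes all fitness scores are non-negative."""
    total = sum(fitnesses)
    pick = random.uniform(0, total)
    running = 0.0
    for chrom, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return chrom
    return population[-1]  # guard against floating-point round-off

pop = [[1, 1, 0], [0, 1, 0], [0, 0, 0]]
parent = roulette_select(pop, [2, 1, 0])
print(parent)
```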
Crossover
Crossover is the most significant phase in a genetic
algorithm. For each pair of parents to be mated,
a crossover point is chosen at random from within the
genes.
For example, consider the crossover point to be 3 as
shown below.
Offspring are created by exchanging the genes of
parents among themselves until the crossover point is
reached.
Mutation
In certain new offspring formed, some of their genes can
be subjected to a mutation with a low random
probability. This implies that some of the bits in the bit
string can be flipped.
Mutation occurs to maintain diversity within the
population and prevent premature convergence.
Termination
The algorithm terminates if the population has
converged (does not produce offspring which are
significantly different from the previous generation).
Then it is said that the genetic algorithm has provided a
set of solutions to our problem.
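The five phases above can be combined into a minimal generational GA. The sketch below solves the toy OneMax problem (maximize the number of 1-bits); all parameter values are illustrative defaults rather than recommendations:

```python
import random

def run_ga(chrom_len=16, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.02):
    """Minimal generational GA for OneMax (maximize the count of 1-bits)."""
    fitness = lambda chrom: sum(chrom)
    # Initial population: random bit strings.
    pop = [[random.randint(0, 1) for _ in range(chrom_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        fits = [fitness(c) for c in pop]

        def select():
            # Tournament selection: the fitter of two random individuals.
            a, b = random.sample(range(pop_size), 2)
            return pop[a] if fits[a] >= fits[b] else pop[b]

        next_pop = []
        while len(next_pop) < pop_size:
            p1, p2 = select(), select()
            if random.random() < crossover_rate:
                # Single-point crossover.
                point = random.randint(1, chrom_len - 1)
                child = p1[:point] + p2[point:]
            else:
                child = p1[:]
            # Bit-flip mutation with a low per-gene probability.
            child = [1 - g if random.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

best = run_ga()
print(sum(best), "ones out of", len(best))
```

A real run stops when the population converges or a generation budget is exhausted; the fixed generation count here stands in for that termination test.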
INTRODUCTION TO OPERATORS OF GENETIC
ALGORITHM
Encoding of a Chromosome
The chromosome should in some way contain
information about the solution which it represents.
The most commonly used encoding is a binary string.
A chromosome could then look like this:
Each chromosome has one binary string. Each bit
in this string can represent some characteristic of
the solution. Or the whole string can represent a
number - this has been used in the basic GA
applet.
Of course, there are many other ways of
encoding. This depends mainly on the solved
problem. For example, one can encode directly
integer or real numbers, sometimes it is useful to
encode some permutations and so on.
Chromosome 1: 1101100100110110
Chromosome 2: 1101111000011110
CROSSOVER
After we have decided what encoding we will use,
we can proceed to crossover. Crossover
selects genes from parent chromosomes and
creates a new offspring. The simplest way to
do this is to choose a crossover point at random,
copy everything before this point
from the first parent, and copy everything after the
crossover point from the second parent.
Crossover can then look like this ( | is the
crossover point):
Chromosome 1: 11011 | 00100110110
Chromosome 2: 11011 | 11000011110
Offspring 1:  11011 | 11000011110
Offspring 2:  11011 | 00100110110
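The single-point crossover shown above can be sketched in Python as follows; the parent strings are taken from the example, but the crossover point is chosen at random:

```python
import random

def single_point_crossover(parent1, parent2):
    """Exchange tails of two equal-length bit strings at a random point."""
    point = random.randint(1, len(parent1) - 1)
    offspring1 = parent1[:point] + parent2[point:]
    offspring2 = parent2[:point] + parent1[point:]
    return offspring1, offspring2

# Parents taken from the example above; the crossover point is random.
c1, c2 = single_point_crossover("1101100100110110", "1101111000011110")
print(c1)
print(c2)
```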
There are other ways to perform crossover; for
example, we can choose multiple crossover points.
Crossover can be rather complicated and depends
heavily on the encoding of the chromosome.
MUTATION
After crossover is performed, mutation takes
place. This is to prevent all solutions in the
population from falling into a local optimum of the
solved problem. Mutation randomly changes the new
offspring. For binary encoding we can flip a
few randomly chosen bits from 1 to 0 or from 0 to
1. Mutation can then look like this:
Original offspring 1: 1101111000011110
Original offspring 2: 1101100100110110
Mutated offspring 1:  1100111000011110
Mutated offspring 2:  1101101100110110
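Bit-flip mutation as illustrated above can be sketched like this; the per-bit mutation probability pm is an arbitrary illustrative value:

```python
import random

def mutate(chromosome, pm=0.05):
    """Flip each bit independently with a low probability pm."""
    return "".join(
        ("1" if gene == "0" else "0") if random.random() < pm else gene
        for gene in chromosome
    )

print(mutate("1101111000011110", pm=0.05))
```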
DIFFERENT VARIANTS OF
GENETIC ALGORITHM
Chromosome representation
The simplest algorithm represents each chromosome as a bit
string. Typically, numeric parameters can be represented
by integers, though it is possible to use floating
point representations. The floating point representation is
natural to evolution strategies and evolutionary programming.
The notion of real-valued genetic algorithms has been offered
but is really a misnomer because it does not really represent
the building block theory that was proposed by John Henry
Holland in the 1970s. This theory is not without support
though, based on theoretical and experimental results (see
below). The basic algorithm performs crossover and mutation
at the bit level. Other variants treat the chromosome as a list
of numbers which are indexes into an instruction table, nodes
in a linked list, hashes, objects, or any other imaginable data
structure. Crossover and mutation are performed so as to
respect data element boundaries. For most data types, specific
variation operators can be designed. Different chromosomal
data types seem to work better or worse for different specific
problem domains.
When bit-string representations of integers are used, Gray
coding is often employed. In this way, small changes in the
integer can be readily effected through mutations or
crossovers. This has been found to help prevent premature
convergence at so-called Hamming walls, in which too many
simultaneous mutations (or crossover events) must occur in
order to change the chromosome to a better solution.
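A minimal sketch of the binary/Gray conversion (these are the standard bitwise formulas, not specific to any GA library):

```python
def binary_to_gray(n):
    """Convert an integer to its Gray-code value (standard XOR formula)."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Invert the Gray encoding back to an ordinary integer."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Adjacent integers differ in exactly one bit under Gray coding,
# so a single bit-flip mutation can step between neighbouring values.
for i in range(5):
    print(i, format(binary_to_gray(i), "03b"))
```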
Other approaches involve using arrays of real-valued
numbers instead of bit strings to represent
chromosomes. Results from the theory of schemata
suggest that in general the smaller the alphabet, the
better the performance, but it was initially surprising to
researchers that good results were obtained from using
real-valued chromosomes. This was explained as the set
of real values in a finite population of chromosomes as
forming a virtual alphabet (when selection and
recombination are dominant) with a much lower
cardinality than would be expected from a floating point
representation.
An expansion of the Genetic Algorithm accessible
problem domain can be obtained through more complex
encoding of the solution pools by concatenating several
types of heterogenously encoded genes into one
chromosome. This particular approach allows for
solving optimization problems that require vastly
disparate definition domains for the problem
parameters. For instance, in problems of cascaded
controller tuning, the internal loop controller structure
can belong to a conventional regulator of three
parameters, whereas the external loop could implement
a linguistic controller (such as a fuzzy system) which has
an inherently different description. This particular form
of encoding requires a specialized crossover mechanism
that recombines the chromosome by section, and it is a
useful tool for the modelling and simulation of complex
adaptive systems, especially evolution processes.
Elitism
A practical variant of the general process of constructing
a new population is to allow the best organism(s) from
the current generation to carry over to the next,
unaltered. This strategy is known as elitist selection and
guarantees that the solution quality obtained by the GA
will not decrease from one generation to the next.
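Elitist selection can be sketched as a thin wrapper around whatever breeding step a GA uses; make_offspring below is a hypothetical stand-in for selection, crossover and mutation:

```python
import random

def next_generation_with_elitism(population, fitness, make_offspring, n_elite=2):
    """Carry the n_elite fittest individuals into the next generation
    unchanged, then fill the rest with freshly bred offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    elite = [chrom[:] for chrom in ranked[:n_elite]]
    offspring = [make_offspring(population)
                 for _ in range(len(population) - n_elite)]
    return elite + offspring

# Hypothetical breeding function standing in for selection, crossover
# and mutation; a real GA would plug its own pipeline in here.
def make_offspring(population):
    return random.choice(population)[:]

pop = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0]]
new_pop = next_generation_with_elitism(pop, fitness=sum,
                                       make_offspring=make_offspring,
                                       n_elite=1)
print(new_pop[0])  # → [1, 1, 1, 1] — the fittest individual survives unchanged
```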
Parallel implementations
Parallel implementations of genetic algorithms come in
two flavors. Coarse-grained parallel genetic algorithms
assume a population on each of the computer nodes and
migration of individuals among the nodes. Fine-grained
parallel genetic algorithms assume an individual on each
processor node which acts with neighboring individuals
for selection and reproduction. Other variants, like
genetic algorithms for online optimization problems,
introduce time-dependence or noise in the fitness
function.
Adaptive GAs
Genetic algorithms with adaptive parameters (adaptive
genetic algorithms, AGAs) are another significant and
promising variant of genetic algorithms. The
probabilities of crossover (pc) and mutation (pm) greatly
determine the degree of solution accuracy and the
convergence speed that genetic algorithms can obtain.
Instead of using fixed values of pc and pm, AGAs utilize
the population information in each generation and
adaptively adjust the pc and pm in order to maintain the
population diversity as well as to sustain the
convergence capacity. In AGA (adaptive genetic
algorithm), the adjustment of pc and pm depends on the
fitness values of the solutions. In CAGA (clustering-
based adaptive genetic algorithm), through the use of
clustering analysis to judge the optimization states of the
population, the adjustment of pc and pm depends on
these optimization states. It can be quite effective to
combine GA with other optimization methods. GA tends
to be quite good at finding generally good global
solutions, but quite inefficient at finding the last few
mutations to find the absolute optimum. Other
techniques (such as simple hill climbing) are quite
efficient at finding absolute optimum in a limited region.
Alternating GA and hill climbing can improve the
efficiency of GA while overcoming the lack of robustness
of hill climbing.
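The GA-plus-hill-climbing hybrid described here can be sketched as a local search applied to the GA's best individual; the fitness function and starting chromosome below are illustrative stand-ins:

```python
import random

def hill_climb(chromosome, fitness, steps=50):
    """Local search: try random single-bit flips, keeping improvements."""
    best = chromosome[:]
    for _ in range(steps):
        i = random.randrange(len(best))
        candidate = best[:]
        candidate[i] = 1 - candidate[i]
        if fitness(candidate) > fitness(best):
            best = candidate
    return best

# Hybrid step: the GA supplies a good starting point, then hill
# climbing finishes the "last few mutations" locally.
fitness = lambda chrom: sum(chrom)   # OneMax stand-in (illustrative)
ga_best = [1, 1, 0, 1, 1, 1, 0, 1]  # pretend this came from a GA run
improved = hill_climb(ga_best, fitness, steps=100)
print(fitness(ga_best), "->", fitness(improved))
```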
This means that the rules of genetic variation may have a
different meaning in the natural case. For instance –
provided that steps are stored in consecutive order –
crossing over may sum a number of steps from maternal
DNA adding a number of steps from paternal DNA and
so on. This is like adding vectors that more probably
may follow a ridge in the phenotypic landscape. Thus,
the efficiency of the process may be increased by many
orders of magnitude. Moreover, the inversion
operator has the opportunity to place steps in
consecutive order or any other suitable order in favour
of survival or efficiency.
A variation, where the population as a whole is evolved
rather than its individual members, is known as gene
pool recombination.
A number of variations have been developed to attempt
to improve performance of GAs on problems with a high
degree of fitness epistasis, i.e. where the fitness of a
solution consists of interacting subsets of its variables.
Such algorithms aim to learn (before exploiting) these
beneficial phenotypic interactions. As such, they are
aligned with the Building Block Hypothesis in adaptively
reducing disruptive recombination. Prominent examples
of this approach include the mGA, GEMGA and LLGA.
LIMITATIONS OF GENETIC
ALGORITHM
There are limitations of the use of a genetic algorithm
compared to alternative optimization algorithms:
• Repeated fitness function evaluation for complex
problems is often the most prohibitive and limiting
segment of artificial evolutionary algorithms. Finding the
optimal solution to complex high-dimensional,
multimodal problems often requires very expensive fitness
function evaluations. In real world problems such as
structural optimization problems, a single function
evaluation may require several hours to several days of
complete simulation. Typical optimization methods
cannot deal with such types of problem. In this case, it
may be necessary to forgo an exact evaluation and use
an approximated fitness that is computationally efficient.
It is apparent that amalgamation of approximate
models may be one of the most promising approaches to
convincingly use GA to solve complex real life problems.
• Genetic algorithms do not scale well with complexity.
That is, where the number of elements which are exposed
to mutation is large there is often an exponential increase
in search space size. This makes it extremely difficult to
use the technique on problems such as designing an
engine, a house or a plane. In order to make such
problems tractable to evolutionary search, they must be
broken down into the simplest representation possible.
Hence we typically see evolutionary algorithms encoding
designs for fan blades instead of engines, building shapes
instead of detailed construction plans, and airfoils instead
of whole aircraft designs. The second problem of
complexity is the issue of how to protect parts that have
evolved to represent good solutions from further
destructive mutation, particularly when their fitness
assessment requires them to combine well with other
parts.
• The "better" solution is only in comparison to other
solutions. As a result, the stop criterion is not clear in
every problem.
• In many problems, GAs have a tendency to converge
towards local optima or even arbitrary points rather than
the global optimum of the problem. This means that it
does not "know how" to sacrifice short-term fitness to
gain longer-term fitness. The likelihood of this occurring
depends on the shape of the fitness landscape: certain
problems may provide an easy ascent towards a global
optimum, others may make it easier for the function to
find the local optima. This problem may be alleviated by
using a different fitness function, increasing the rate of
mutation, or by using selection techniques that maintain a
diverse population of solutions,[14] although the No Free
Lunch theorem[15] proves that there is no general solution
to this problem. A common technique to maintain
diversity is to impose a "niche penalty", wherein any
group of individuals of sufficient similarity (niche radius)
has a penalty added, which will reduce the representation
of that group in subsequent generations, permitting other
(less similar) individuals to be maintained in the
population. This trick, however, may not be effective,
depending on the landscape of the problem. Another
possible technique would be to simply replace part of the
population with randomly generated individuals, when
most of the population is too similar to each other.
Diversity is important in genetic algorithms (and genetic
programming) because crossing over a homogeneous
population does not yield new solutions. In evolution
strategies and evolutionary programming, diversity is not
essential because of a greater reliance on mutation.
• Operating on dynamic data sets is difficult, as genomes
begin to converge early on towards solutions which may
no longer be valid for later data. Several methods have
been proposed to remedy this by increasing genetic
diversity somehow and preventing early convergence,
either by increasing the probability of mutation when the
solution quality drops (called triggered hypermutation), or
by occasionally introducing entirely new, randomly
generated elements into the gene pool (called random
immigrants). Again, evolution strategies and evolutionary
programming can be implemented with a so-called
"comma strategy" in which parents are not maintained and
new parents are selected only from offspring. This can be
more effective on dynamic problems.
• GAs cannot effectively solve problems in which the only
fitness measure is a single right/wrong measure
(like decision problems), as there is no way to converge
on the solution (no hill to climb). In these cases, a random
search may find a solution as quickly as a GA. However,
if the situation allows the success/failure trial to be
repeated giving (possibly) different results, then the ratio
of successes to failures provides a suitable fitness
measure.
• For specific optimization problems and problem instances,
other optimization algorithms may be more efficient than
genetic algorithms in terms of speed of convergence.
Alternative and complementary algorithms
include evolution strategies, evolutionary
programming, simulated annealing, Gaussian
adaptation, hill climbing, and swarm intelligence (e.g.: ant
colony optimization, particle swarm optimization) and
methods based on integer linear programming. The
suitability of genetic algorithms is dependent on the
amount of knowledge of the problem; well known
problems often have better, more specialized approaches.
Problem domains of
genetic algorithm
Problems which appear to be particularly appropriate for
solution by genetic algorithms include timetabling and
scheduling problems, and many scheduling software packages
are based on GAs. GAs have also been applied
to engineering. Genetic algorithms are often applied as an
approach to solve global optimization problems.
As a general rule of thumb genetic algorithms might be useful
in problem domains that have a complex fitness landscape as
mixing, i.e., mutation in combination with crossover, is
designed to move the population away from local optima that
a traditional hill climbing algorithm might get stuck in.
Observe that commonly used crossover operators cannot
change any uniform population. Mutation alone can provide
ergodicity of the overall genetic algorithm process (seen as
a Markov chain).
Examples of problems solved by genetic algorithms include:
mirrors designed to funnel sunlight to a solar
collector, antennae designed to pick up radio signals in
space, walking methods for computer figures, optimal design
of aerodynamic bodies in complex flowfields
In his Algorithm Design Manual, Skiena advises against
genetic algorithms for any task:
it is quite unnatural to model applications in terms of genetic
operators like mutation and crossover on bit strings. The
pseudobiology adds another level of complexity between you
and your problem. Second, genetic algorithms take a very
long time on nontrivial problems. The analogy with evolution
—where significant progress require [sic] millions of years—
can be quite appropriate.
I have never encountered any problem where genetic
algorithms seemed to me the right way to attack it. Further, I
have never seen any computational results reported using
genetic algorithms that have favorably impressed me. Stick
to simulated annealing for your heuristic search voodoo
needs.
— Steven Skiena
CONCLUSION
In 1950, Alan Turing proposed a "learning machine" which
would parallel the principles of evolution.[32] Computer
simulation of evolution started as early as in 1954 with the
work of Nils Aall Barricelli, who was using the computer at
the Institute for Advanced Study in Princeton, New Jersey.
[33][34] His 1954 publication was not widely noticed.
Starting in 1957, the Australian quantitative geneticist Alex
Fraser published a series of papers on simulation of artificial
selection of organisms with multiple loci controlling a
measurable trait. From these beginnings, computer simulation
of evolution by biologists became more common in the early
1960s, and the methods were described in books by Fraser
and Burnell (1970)[36] and Crosby (1973). Fraser's
simulations included all of the essential elements of modern
genetic algorithms. In addition, Hans-Joachim
Bremermann published a series of papers in the 1960s that
also adopted a population of solutions to optimization
problems, undergoing recombination, mutation, and selection.
Bremermann's research also included the elements of modern
genetic algorithms. Other noteworthy early pioneers include
Richard Friedberg, George Friedman, and Michael Conrad.
Many early papers are reprinted by Fogel (1998).
Although Barricelli, in work he reported in 1963, had
simulated the evolution of ability to play a simple game,
[40] artificial evolution only became a widely recognized
optimization method as a result of the work of Ingo
Rechenberg and Hans-Paul Schwefel in the 1960s and early
1970s – Rechenberg's group was able to solve complex
engineering problems through evolution strategies.[41][42]
[43][44] Another approach was the evolutionary programming
technique of Lawrence J. Fogel, which was proposed for
generating artificial intelligence. Evolutionary
programming originally used finite state machines for
predicting environments, and used variation and selection to
optimize the predictive logics. Genetic algorithms in
particular became popular through the work of John
Holland in the early 1970s, and particularly his
book Adaptation in Natural and Artificial Systems (1975). His
work originated with studies of cellular automata, conducted
by Holland and his students at the University of Michigan.
Holland introduced a formalized framework for predicting the
quality of the next generation, known as Holland's Schema
Theorem. Research in GAs remained largely theoretical until
the mid-1980s, when The First International Conference on
Genetic Algorithms was held in Pittsburgh, Pennsylvania.