
UEMH3163/3073 UECS2053/2153 – Artificial Intelligence

Implementing Genetic
Algorithms



Learning Objectives

After completing this lecture, you will be able to:

● Design and implement a complete genetic algorithm
● Select operators and parameters to improve the performance of a genetic algorithm
● Discuss the benefits and limitations of genetic algorithms
● Anticipate issues in practical use of genetic algorithms



Genetic Algorithm Building Blocks

● Problem definition: a chromosome which encodes a solution to the problem
● Initialization procedure: to create the initial population
● Genetic operators: selection, crossover, and mutation
● Objective evaluation: fitness function or fitness score
● Termination condition



Genetic Algorithm

● Initialization: start with a large population of randomly generated chromosomes
● Repeat:
– Evaluate each solution (with a fitness function)
– Keep fitter solutions, eliminate poorer ones (selection)
– Generate new solutions (crossover)
– Add small random variations (mutation)
● Stop when the solution is satisfactory (convergence) or you run out of time (termination condition), as sketched in the code below
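A minimal sketch of this loop in Python, assuming a toy "count the 1-bits" fitness function and purely illustrative parameter values (population size, rates, and tournament size are not prescribed by the lecture):

```python
import random

POP_SIZE, CHROM_LEN, GENERATIONS = 50, 20, 100
CROSSOVER_RATE, MUTATION_RATE = 0.9, 0.01

def fitness(chrom):
    # Toy objective (OneMax): count the number of 1-bits.
    return sum(chrom)

def select(population):
    # Tournament of size 3: fittest of three random individuals.
    return max(random.sample(population, 3), key=fitness)

def crossover(p1, p2):
    # Single-point crossover, applied with probability CROSSOVER_RATE.
    if random.random() < CROSSOVER_RATE:
        point = random.randint(1, CHROM_LEN - 1)
        return p1[:point] + p2[point:], p2[:point] + p1[point:]
    return p1[:], p2[:]

def mutate(chrom):
    # Flip each bit independently with probability MUTATION_RATE.
    return [1 - g if random.random() < MUTATION_RATE else g for g in chrom]

# Initialization: random population of bit strings.
population = [[random.randint(0, 1) for _ in range(CHROM_LEN)]
              for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    next_gen = []
    while len(next_gen) < POP_SIZE:
        c1, c2 = crossover(select(population), select(population))
        next_gen += [mutate(c1), mutate(c2)]
    population = next_gen[:POP_SIZE]
    best = max(population, key=fitness)
    if fitness(best) == CHROM_LEN:      # satisfactory solution found
        break

print("Best solution:", best, "fitness:", fitness(best))
```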



Initialization

● Chromosomes are randomly generated for a population (with size N)
● The chromosomes must contain information (genes) about a solution for the problem being solved
● The chromosomes are encoded in one of several forms (depending on the problem domain)
● There are a few types of encoding methods (covered in the next few slides) which define the mapping between genotype and phenotype



Initialization – Binary Encoding

● Each chromosome is represented using a binary string
● Every gene is represented using the bits 0 or 1
● Each bit or group of bits represents some aspect of the problem (e.g. 'rain' or 'no rain')
● Each gene shows some characteristic of the solution
● Each chromosome represents a value in the search space

Used for (example):
– Knapsack problem: given a fixed capacity and a list of items with value/weight/size, select items to maximize value without exceeding capacity
Encoding:
– Each bit represents whether the corresponding item is in the knapsack (see the fitness sketch below)
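A small sketch of binary encoding for a knapsack instance; the item values, weights, and capacity are made-up illustrative numbers, and giving zero fitness to over-capacity selections is one common (assumed) way of handling infeasible chromosomes:

```python
import random

# Illustrative (made-up) knapsack instance.
VALUES   = [60, 100, 120, 30, 75]
WEIGHTS  = [10, 20, 30, 5, 15]
CAPACITY = 40

def random_chromosome():
    # One bit per item: 1 = item is in the knapsack, 0 = it is not.
    return [random.randint(0, 1) for _ in VALUES]

def fitness(chrom):
    value  = sum(v for v, bit in zip(VALUES, chrom) if bit)
    weight = sum(w for w, bit in zip(WEIGHTS, chrom) if bit)
    # Infeasible selections (over capacity) get zero fitness.
    return value if weight <= CAPACITY else 0

chrom = random_chromosome()
print(chrom, "->", fitness(chrom))
```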



Initialization – Value Encoding

● Each chromosome is represented as a string of some values
● Each gene represents a variable
● A value can be an integer, a real number, a character, or some object

Used for (example):
– Finding neural network weights: given a certain architecture, find the best weights to achieve a certain output
Encoding:
– Real values in the chromosome represent the corresponding neural network weights



Initialization – Permutation Encoding

● Each chromosome represents a sequence of items
● Each gene represents an item
● Useful for ordering problems (problems where the solutions have a specific order)

Used for (example):
– Travelling salesman problem (TSP): given a number of cities and the distances between them, find the shortest sequence of trips which visits all the cities
Encoding:
– The chromosome represents the order in which the cities will be visited (sketched below)
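A small sketch of permutation encoding for a TSP, assuming a made-up 5-city distance matrix; the tour length plays the role of the fitness (lower is better):

```python
import random

# Illustrative (made-up) symmetric distance matrix for 5 cities.
DIST = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]

def random_chromosome(n_cities):
    # A permutation of city indices: the order in which cities are visited.
    tour = list(range(n_cities))
    random.shuffle(tour)
    return tour

def tour_length(tour):
    # Total distance of the round trip, returning to the start city.
    return sum(DIST[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

tour = random_chromosome(len(DIST))
print(tour, "length:", tour_length(tour))   # shorter tours are fitter
```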



Initialization – Tree Encoding

● Each chromosome is a tree of objects (such as functions/commands in a programming language)
● Each gene represents an object in the tree
● Mainly used for evolving programs, i.e. genetic programming

Used for (example):
– Finding a (programming) function which will achieve a certain output for a fixed set of inputs
Encoding:
– The chromosome represents functions arranged in a tree



Initialization

Two methods for initializing the population (both sketched below):

● Random Initialization
– Populate the initial population with completely random solutions
● Heuristic Initialization
– Populate the initial population using a known heuristic (rules learned via experience) for the problem
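A sketch of both initialization styles, assuming a binary encoding; the heuristic version seeds the population from one assumed "greedy" solution and perturbs each copy slightly so the population is not identical (the perturbation rate is an illustrative choice):

```python
import random

def random_init(pop_size, chrom_len):
    # Random initialization: every gene drawn uniformly at random.
    return [[random.randint(0, 1) for _ in range(chrom_len)]
            for _ in range(pop_size)]

def heuristic_init(pop_size, heuristic_solution, noise=0.1):
    # Heuristic initialization: copy a known good solution and flip
    # each gene with a small probability to keep some diversity.
    return [[1 - g if random.random() < noise else g
             for g in heuristic_solution]
            for _ in range(pop_size)]

greedy = [1, 0, 1, 1, 0, 0, 1, 0]   # e.g. the output of a greedy rule
print(random_init(3, 8))
print(heuristic_init(3, greedy))
```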



Search Space

● The population exists within a defined (possibly infinite) search space
● Each individual represents a solution within this search space, with (roughly) one dimension per gene
● The high dimensionality of the search space normally precludes easy visualization
– We can still imagine how a search space 'looks'



Search Space

● A completely random search space would be bad for a GA (and for any other optimization method)
– Inheriting 'good' traits has no benefit
● A single-valley space without local minima is more efficiently solved by gradient-descent-related methods
● A search space with a fairly continuous surface and multiple valleys is well suited to a GA (especially if it is prohibitively large)



Iterative Evolution

Given an initial population, the following is done iteratively:

● Selection
– Evaluate individual fitness and give preference to 'fitter' individuals
● Crossover (mating/recombination/reproduction)
– Two individuals (from the selection step) exchange genes, creating a new (hopefully better) solution
● Mutation
– Random modifications are introduced to individuals



Selection

● Preference should be given to better individuals to pass on their genes to the next generation
● 'Better' is defined by an individual's fitness
● Fitness is determined by an objective/fitness function
● Selection should favour fitter chromosomes, but there are no fixed rules as to how much favouritism should be applied
● No selection strategy consistently performs best for all types of problems



Selection

There are two kinds of selection:

● Parent selection
– Selecting which parents mate and recombine to create offspring for the next generation
● Survivor selection
– Selecting which individuals are to be kicked out and which are to be kept for the next generation



Selection

● The fitness value/score of each individual is the value being optimized (minimized or maximized) by the GA
● In general, fitness scores are used as follows:
– Parent selection: a better fitness score increases the chance of being chosen as a parent
– Survivor selection: a better fitness score increases the odds of surviving
● Over the generations, less-fit individuals die (are removed), leaving each generation better off than the previous generations
● Convergence happens when successive generations no longer improve fitness much



Selection

Example of convergence



Selection

● Maintaining good population diversity is extremely critical for a successful GA
● If the entire population consists of variations of one extremely fit solution (premature convergence), the GA is likely to underperform
● Dilemma: we want fit, but not too fit (both fitness and diversity are important)



Parent Selection

In current practice, the following parent selection strategies are generally used:

● Fitness-proportional selection (only for single-sign fitness)
– Roulette wheel selection
– Stochastic universal sampling
● Tournament selection (handles negative fitness)
● Rank selection (handles negative and low/high-variance fitness)
● Truncation selection
● Steady-state selection (incorporates survivor selection)
● Random selection (pointless)




Parent Selection – Roulette Wheel

● Every chromosome has a slice of the roulette wheel proportional to its fitness; the wheel is then 'spun' to see which chromosome is chosen as a parent
● In general (see the sketch below):
– Calculate the sum of fitness, S
– Generate a random number r between 0 and S
– Loop through the chromosomes, adding each fitness to a running sum until the sum exceeds r, and choose the chromosome at which this happens
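A sketch of roulette wheel selection following the steps above; it assumes all fitness values are non-negative (single-sign), as noted earlier:

```python
import random

def roulette_select(population, fitnesses):
    # Fitness-proportional selection; assumes non-negative fitnesses.
    total = sum(fitnesses)
    r = random.uniform(0, total)          # one "spin" of the wheel
    running = 0.0
    for chrom, fit in zip(population, fitnesses):
        running += fit
        if running >= r:
            return chrom
    return population[-1]                 # guard against rounding error

population = ["A", "B", "C", "D"]
fitnesses  = [1.0, 3.0, 5.0, 1.0]         # "C" occupies half the wheel
print([roulette_select(population, fitnesses) for _ in range(10)])
```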



Parent Selection – SUS

● Stochastic universal sampling (SUS) is similar to roulette wheel selection, but only one random number is used (one spin of the wheel)
● All parents are then chosen at evenly spread intervals around the wheel
● This avoids the excessive bias that roulette wheel selection can suffer when its random values are not properly distributed (sketched below)
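A sketch of SUS with a single random start point and evenly spaced pointers; again it assumes non-negative fitness values:

```python
import random

def sus_select(population, fitnesses, n_parents):
    # One random start, then n_parents equally spaced pointers
    # around the wheel; each pointer picks one parent.
    total = sum(fitnesses)
    step = total / n_parents
    start = random.uniform(0, step)
    pointers = [start + i * step for i in range(n_parents)]

    parents, running, i = [], 0.0, 0
    for p in pointers:
        # Advance until the cumulative fitness covers this pointer.
        while running + fitnesses[i] < p:
            running += fitnesses[i]
            i += 1
        parents.append(population[i])
    return parents

print(sus_select(["A", "B", "C", "D"], [1.0, 3.0, 5.0, 1.0], 4))
```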



Parent Selection – Tournament

● A few chromosomes are chosen at random, and a tournament is then run between them
● The winner (best fitness) is selected as a parent
● Tournament size is an important parameter which changes selection pressure
– A large tournament size disadvantages weak individuals; a small tournament size increases randomness (see the sketch below)
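A sketch of tournament selection; the tournament size k below is an illustrative choice, and negative fitness values are handled naturally because only comparisons are used:

```python
import random

def tournament_select(population, fitnesses, k=3):
    # Pick k individuals at random and return the fittest of them.
    # Larger k -> stronger selection pressure; k = 1 is random selection.
    contenders = random.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]

population = ["A", "B", "C", "D", "E"]
fitnesses  = [2.0, -1.0, 7.0, 4.0, 0.5]   # negative fitness is fine here
print([tournament_select(population, fitnesses, k=3) for _ in range(5)])
```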



Parent Selection – Rank

● When a population has very close fitness values, there is very little selection pressure, making the GA effectively random
● Alternatively, when a population has very different fitness values, there is too much selection pressure, which could lead to premature convergence
● Rank selection assigns the probability of selection based on fitness rank rather than on fitness itself



Parent Selection – Rank

● After fitness is calculated, all individuals are ranked
● Each individual receives a fixed probability of being selected, decreasing with lower rank (e.g. 0.5, 0.25, 0.125, …)
● Selection is then done using one of the fitness-proportionate methods, but the rank-based probability is used instead of the fitness
● Higher fitness still gives preference, but the advantage is now bounded
● Negative fitness values also work with rank selection (sketched below)
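A sketch of rank selection; for simplicity it uses linear ranking weights (1, 2, …, N) rather than the geometric 0.5/0.25/0.125 example above, but any decreasing rank-based weighting works the same way:

```python
import random

def rank_select(population, fitnesses):
    # Selection probability depends on rank, not raw fitness, so negative
    # values and tightly clustered scores are handled gracefully.
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    weights = {idx: rank + 1 for rank, idx in enumerate(order)}  # worst=1, best=N
    total = sum(weights.values())
    r = random.uniform(0, total)
    running = 0.0
    for idx in order:
        running += weights[idx]
        if running >= r:
            return population[idx]
    return population[order[-1]]

population = ["A", "B", "C", "D"]
fitnesses  = [-5.0, 0.01, 0.02, 100.0]    # awkward for fitness-proportional selection
print([rank_select(population, fitnesses) for _ in range(8)])
```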



Parent Selection – Truncation

● A fixed proportion of the fittest individuals are selected for recombination
● Based on animal/plant breeding practices (directed evolution)
● Less sophisticated than the other methods discussed here (except random selection) and not often used in practice



Parent Selection – Steady-State

● Main idea: a large part of the current population should survive
● In every generation, select a few (high-fitness) chromosomes for creating new offspring
● Then select a few (low-fitness) chromosomes to be replaced
● The rest of the population survives to the next generation
● Convergence is slower due to lower turnover; the method may also be more vulnerable to local minima (which can be mitigated by increasing the population)



Survivor Selection

● Good chromosomes (solutions) can be lost when crossover or mutation results in weaker offspring
● These can be rediscovered, but there is no guarantee
● In genetic algorithms, elitism is the practice of copying a small proportion of the fittest chromosomes unchanged (no crossover or mutation) into the next generation
● This can dramatically impact the performance (quality and speed) of the genetic algorithm search process
● Elites remain eligible for selection as parents (see the sketch below)
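A sketch of elitism during generation replacement; make_offspring is a hypothetical callback that performs parent selection, crossover, and mutation to produce one child:

```python
def next_generation_with_elitism(population, fitnesses, make_offspring,
                                 elite_count=2):
    # Copy the elite_count fittest chromosomes unchanged, then fill the
    # rest of the new generation with freshly produced offspring.
    ranked = sorted(zip(population, fitnesses), key=lambda pf: pf[1],
                    reverse=True)
    elites = [chrom for chrom, _ in ranked[:elite_count]]
    offspring = [make_offspring(population, fitnesses)
                 for _ in range(len(population) - elite_count)]
    return elites + offspring    # elites still compete as parents next round
```

Here make_offspring(population, fitnesses) would typically call one of the parent selection routines above, then apply crossover and mutation.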



Survivor Selection

● Survivor selection determines which individuals are kicked out (die) and which are kept (elitism) in the next generation
● Survivor selection strategies:
– Fitness based – this is traditional elitism
– Age based – each individual is allowed to remain for a finite number of generations before it is kicked out
● This allows for multiple 'tries' at reproduction, increasing the chance of passing on good genes



Crossover

● Also known as mating/recombination/reproduction
● Randomly mixes genes between parents (the output of parent selection)
● The parents each provide part (50% or otherwise) of their genes (unique traits/characteristics)
● The new offspring hence inherit both parents' traits in their chromosomes
● This can increase diversity



Crossover General Algorithm

● Input: two parents
● Randomly choose a crossover site
● Exchange genes up to this site
● The two offspring are then put into the next generation of the population



Simple Crossover Methods

● One Point / Single Point Crossover


● Multi Point Crossover
● Uniform Crossover
● Whole Arithmetic Recombination



One Point Crossover

● A random crossover point is selected and the tails of the parents are swapped to get new offspring (sketched below)
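A sketch of one point crossover on binary strings (the same code works for any list-encoded chromosome):

```python
import random

def one_point_crossover(parent1, parent2):
    # Choose a cut point and swap the tails of the two parents.
    point = random.randint(1, len(parent1) - 1)
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

p1 = [0, 0, 0, 0, 0, 0]
p2 = [1, 1, 1, 1, 1, 1]
print(one_point_crossover(p1, p2))   # e.g. ([0, 0, 1, 1, 1, 1], [1, 1, 0, 0, 0, 0])
```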



Multi Point Crossover

● A generalization of one point crossover, where multiple points are used to swap segments of the parents



Uniform Crossover

● Each gene is treated separately: essentially, flip a coin for each gene to see which child it ends up in (the coin can be biased away from 0.5); see the sketch below
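A sketch of uniform crossover; the bias parameter is the (possibly unfair) coin mentioned above:

```python
import random

def uniform_crossover(parent1, parent2, bias=0.5):
    # For each gene, flip a (possibly biased) coin to decide which
    # parent contributes to which child.
    child1, child2 = [], []
    for g1, g2 in zip(parent1, parent2):
        if random.random() < bias:
            child1.append(g1)
            child2.append(g2)
        else:
            child1.append(g2)
            child2.append(g1)
    return child1, child2

print(uniform_crossover([0, 0, 0, 0, 0], [1, 1, 1, 1, 1]))
```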



Whole Arithmetic Recombination

● Commonly used for value encodings (real or integer genes) and works by taking a weighted average of the two parents, with weights α and 1 − α
● If α = 0.5, then both children will be identical (see the sketch below)
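A sketch of whole arithmetic recombination, where each child gene is a weighted average of the parents' genes, child1 = α·p1 + (1 − α)·p2 and child2 = α·p2 + (1 − α)·p1:

```python
def whole_arithmetic_recombination(parent1, parent2, alpha=0.5):
    # child1_i = alpha * p1_i + (1 - alpha) * p2_i
    # child2_i = alpha * p2_i + (1 - alpha) * p1_i
    child1 = [alpha * g1 + (1 - alpha) * g2 for g1, g2 in zip(parent1, parent2)]
    child2 = [alpha * g2 + (1 - alpha) * g1 for g1, g2 in zip(parent1, parent2)]
    return child1, child2

print(whole_arithmetic_recombination([0.1, 0.2, 0.3], [0.9, 0.8, 0.7], alpha=0.5))
# With alpha = 0.5 both children are identical: ([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
```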



Other Crossover Methods

The simple crossover methods make assumptions about chromosome design etc. that may not be suitable for a particular problem. Here are some alternatives:

● Davis' Order Crossover (OX1)
● Partially Mapped Crossover (PMX)
● Order Based Crossover (OX2)
● Shuffle Crossover
● Ring Crossover
● … and many more (custom-designed methods are common)




Mutation

● Mutation is a random change of genes
● In nature, mutation is the result of copying errors in DNA, possibly due to toxins, radiation, or chemical substances
● Most of these changes are negative and may result in illnesses
● However, some may have a neutral or positive impact
● Mutation also contributes significantly to diversity (which is the primary point of its inclusion in a GA)
● Mutation alone is effectively random search



Mutation

● In genetic algorithms, mutation is implemented as a small random tweak to a chromosome
● It is used to maintain and introduce diversity and to try to avoid premature convergence
● It is usually applied with a low probability, to avoid the GA reducing to a random search



Mutation Methods

● Bit Flip Mutation


● Random Resetting
● Swap Mutation
● Scramble Mutation
● Inversion Mutation
● … and many more (custom designs as well)



Bit Flip Mutation

● Select one or more random bits and flip them
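A sketch of bit flip mutation with a per-bit mutation probability (the value of pm below is illustrative):

```python
import random

def bit_flip_mutation(chrom, pm=0.05):
    # Flip each bit independently with a small probability pm.
    return [1 - g if random.random() < pm else g for g in chrom]

print(bit_flip_mutation([0, 1, 1, 0, 1, 0, 0, 1], pm=0.2))
```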



Random Resetting

● Extension of the bit flip for non-binary encodings


● A random value is assigned to a randomly chosen gene



Swap Mutation

● Select two positions on the chromosome at random, and interchange/swap their values (sketched below)
● This is common in permutation (order) based encodings
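A sketch of swap mutation on a permutation-encoded chromosome:

```python
import random

def swap_mutation(chrom):
    # Pick two distinct positions and swap their values
    # (preserves the set of genes, so permutations stay valid).
    mutant = chrom[:]
    i, j = random.sample(range(len(mutant)), 2)
    mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant

print(swap_mutation([0, 1, 2, 3, 4, 5]))   # e.g. [0, 4, 2, 3, 1, 5]
```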



Scramble / Shuffle Mutation

● A subset of genes is chosen and their values are scrambled or shuffled randomly (for permutation-based encodings)



Inversion Mutation

● Select a subset of genes as in scramble/shuffle mutation, but instead of shuffling the subset, merely invert/reverse the entire string within the subset (both variants are sketched below)
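A sketch of both scramble and inversion mutation, which differ only in what is done to the chosen slice:

```python
import random

def scramble_mutation(chrom):
    # Shuffle the genes inside a randomly chosen slice.
    mutant = chrom[:]
    i, j = sorted(random.sample(range(len(mutant)), 2))
    segment = mutant[i:j + 1]
    random.shuffle(segment)
    mutant[i:j + 1] = segment
    return mutant

def inversion_mutation(chrom):
    # Reverse the genes inside a randomly chosen slice.
    mutant = chrom[:]
    i, j = sorted(random.sample(range(len(mutant)), 2))
    mutant[i:j + 1] = reversed(mutant[i:j + 1])
    return mutant

tour = [0, 1, 2, 3, 4, 5, 6, 7]
print(scramble_mutation(tour))
print(inversion_mutation(tour))
```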



Terminating Genetic Algorithms

● The iterative portion of the GA should stop when:
– A fixed number of generations is reached
– The allocated budget (computer time/money) is reached
– The highest-ranking solution's fitness has plateaued (results are not improving for the best solution)
– An expert decides it is time to stop
– Some combination of the above holds



Genetic Algorithm Parameters

● Population and Generation Size (N)


● Crossover Probability (Pc)
● Mutation Probability (Pm)
● Termination Condition

● Often these parameters need 'tuning' based on the results obtained; there is no general theory for calculating 'good' values (hence heuristic selection)



Genetic Algorithm Parameters

Population and Generation Size (N)

● How many chromosomes are in the population
● Too few – the search space is not explored much
● Too many – computation takes too much time
● There is an (unknown) upper bound on N above which problems cannot be solved faster
● A proper choice of N avoids unnecessary computation



Genetic Algorithm Parameters

Crossover Probability (Pc)

● Crossover is done in the hope that children inherit good parts of their parents, resulting in a better solution
● If Pc is 100%, all offspring are the result of crossover
● Any value of Pc under 100% is effectively a form of survivor selection
● There is no optimum value for Pc; it normally depends on heuristics (and the problem)



Genetic Algorithm Parameters

Mutation Probability (Pm)

● Mutation is done to avoid premature convergence (increase diversity) by providing an opportunity for solutions to escape local minima
● If Pm is 100%, all genes/chromosomes are changed
● Mutation should not occur too often, because the GA would then become a random search
● There is no optimum value for Pm; it normally depends on heuristics (and the problem)



Benefits of Genetic Algorithms

● Easy to understand
● Inherently parallel and easily distributed
● Optimizes both continuous and discrete functions, and both single-objective and multi-objective problems
● Good for noisy environments (error-prone data); crossover and mutation increase diversity
● Always produces a solution in a reasonable time (though it may not be optimal), and the solution gets better over time
● Faster and more efficient as compared to traditional methods
● May provide a list of "good" solutions and not just a single solution, which is easy to exploit for previous or alternate solutions
● Flexible in forming building blocks for hybrid applications (mix and match different solutions)
● Useful when the search space is very large and there are a large number of parameters involved
● Has a substantial history and range of use (proven effectiveness)



Limitations of Genetic Algorithms

● GAs are not suited to all problems, especially problems which are simple
● Fitness scores have to be calculated repeatedly (for different chromosomes), which might be computationally expensive for some problems (though we may remember the scores of chromosomes already evaluated)
● Implementation (choosing parameters and operators) is still an art
● Being stochastic (probabilistic), there are no guarantees on the optimality or the quality of the solution
● GAs may not converge to the optimal solution; premature convergence may lead the algorithm to converge on a local optimum



Issues for Practitioners

● Basic implementation decisions
– Representation/encoding
– Population size (N) and crossover/mutation probabilities (Pc, Pm)
– Selection policies
● Termination criterion (when to stop? when does it converge?)
● Performance (how fast is a solution needed?)
● Scalability (how big is the data set?)
● The fitness score must be accurate (a wrong fitness function guarantees bad performance)



Genetic Algorithm Conclusion

● GAs are a powerful, robust optimization and search technique
● GAs converge over successive generations toward a near-global optimum via selection, crossover, and mutation operations
● GAs combine direction (selection and crossover) and chance (mutation) into a single effective and efficient search
● GAs can find good solutions in reasonable time (good enough and fast enough)



End of Lecture

