Intro Genetic Algorithms

This document provides an overview of genetic algorithms. It discusses that genetic algorithms are probabilistic search algorithms that use operations inspired by genetic processes like crossover and mutation to iteratively transform populations of potential solutions. It notes that while the structure is simple, the dynamic behavior is complex. The document then discusses the basic cycle of genetic algorithms and covers standard representations, selection methods, genetic operators like crossover and mutation, and provides an example application to the knapsack problem. It also discusses the schema theorem, which provides a framework for analyzing how well schemas (subsets of potential solutions) will proliferate or be destroyed over generations.

Uploaded by

joseaguilar64
Copyright
© All Rights Reserved

genetic algorithms
[figure: combinatorial problem instance; phenotype and genotype data structures; specific heuristics vs. generic improvement routine (genetic algorithms) for the instance class; acceptable result]

specific heuristics
- specific heuristics require specific research
- heuristics need to be tuned for the instance classes they are to be applied to
generic improvement
- generic improvement saves research and coding time
- generic improvement may treat easy problems inefficiently
- must outperform brute force
- consistently produce acceptable results
settle for a generic improvement
definition
A genetic algorithm is a probabilistic search algorithm
that iteratively transforms a set (called a population)
of mathematical objects
(typically fixed-length binary character strings),
each with an associated fitness value,
into a new population of offspring objects
using the Darwinian principle of natural selection and
using operations
that are patterned after naturally occurring genetic
operations,
such as crossover (sexual recombination) and mutation.
quick overview

developed: USA in the 1970s


early names: J. Holland, K. DeJong, D. Goldberg
typically applied to:
discrete optimization
attributed features:
not too fast
good heuristic for combinatorial problems
special features:
traditionally emphasizes combining information from good
parents (crossover)
many variants, e.g., reproduction models, operators
how do genetic algorithms work?
the structure is relatively simple to comprehend,
but the dynamic behavior is complex
organisms produce offspring
similar to themselves,
but with variations:
random changes (mutation)
combinations of features from each parent (crossover)
some offspring survive and some do not
the better they are adapted to their environment,
the higher their chances of survival
over time, generations become more and more adapted,
because the fittest survive
what are the theoretical foundations if any?
genetic algorithms work by
discovering, emphasizing, and recombining
good building blocks
of solutions in a highly parallel fashion.
Melanie Mitchell, paraphrasing John Holland

using formalism
notion of a building block is formalized as a schema
schemata are propagated or destroyed
according to the laws of probability
basic cycle of simple genetic algorithms
1. select parents for the mating pool
proportional to their fitness,
e.g. by roulette-wheel selection
(size of mating pool = population size)
2. shuffle the mating pool randomly
3. for each consecutive pair apply crossover with probability pc, else copy parents
4. for each offspring apply mutation
(bit-flip with probability pm independently for each bit)
5. replace the population with the resulting offspring
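The five steps above can be sketched in Python as follows. This is a minimal illustration, not the slides' own implementation: the function name, the default values of pc and pm, the uniform fallback when all fitness values are zero, and copying the last individual unchanged when the population size is odd are all assumptions made here.

```python
import random

def simple_ga_generation(population, fitness, pc=0.7, pm=0.01, rng=random):
    """One generation of the simple GA on a list of equal-length 0/1 lists."""
    n = len(population)
    weights = [fitness(ind) for ind in population]
    if sum(weights) <= 0:
        weights = [1.0] * n  # assumption: uniform selection if all fitness is zero
    # 1. roulette-wheel selection; mating pool size = population size
    pool = [list(ind) for ind in rng.choices(population, weights=weights, k=n)]
    # 2. shuffle the mating pool
    rng.shuffle(pool)
    # 3. one-point crossover on consecutive pairs with probability pc, else copy
    offspring = []
    for i in range(0, n - 1, 2):
        a, b = pool[i], pool[i + 1]
        if len(a) > 1 and rng.random() < pc:
            cut = rng.randint(1, len(a) - 1)
            a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
        offspring += [a, b]
    if n % 2 == 1:
        offspring.append(pool[-1])  # assumption: unpaired individual is copied
    # 4. bit-flip mutation, independently for each bit with probability pm
    for child in offspring:
        for j in range(len(child)):
            if rng.random() < pm:
                child[j] = 1 - child[j]
    # 5. the offspring replace the whole population
    return offspring
```

Calling the function repeatedly, feeding each result back in as the next population, gives the generational loop.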
standard representation and selection
phenotype space --encoding (representation)--> genotype space = {0,1}^L
genotype space --decoding (inverse representation)--> phenotype space
example genotypes: 10010001, 10010010, 010001001, 011101001

selection:
main idea: better individuals get higher chance
chances proportional to fitness
implementation: roulette wheel technique
assign to each individual a part of the roulette wheel
spin the wheel n times to select n individuals
example:
fitness(A) = 3, share 3/6 = 50%
fitness(B) = 1, share 1/6 = 17%
fitness(C) = 2, share 2/6 = 33%
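A small sketch of the roulette wheel for the three individuals A, B and C, using Python's standard `random.choices` for the weighted spins. The helper names `wheel_shares` and `spin` are made up for this illustration.

```python
import random

def wheel_shares(fitness):
    """Share of the wheel per individual: its fitness divided by total fitness."""
    total = sum(fitness.values())
    return {name: f / total for name, f in fitness.items()}

def spin(fitness, n, rng=random):
    """Spin the wheel n times; returns n selected names (with replacement)."""
    names = list(fitness)
    return rng.choices(names, weights=[fitness[x] for x in names], k=n)

fitness = {"A": 3, "B": 1, "C": 2}
shares = wheel_shares(fitness)  # A gets 3/6, B gets 1/6, C gets 2/6 of the wheel
mating_pool = spin(fitness, n=3)
```

Because the spins are independent, an individual can be selected several times or not at all, whatever its share.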
example: 0-1 knapsack problem
given n "objects", each with a value and resource claim,
select a subfamily of objects
with maximal value within a resource constraint

$$\max \sum_{j=1}^{n} v_j x_j \quad \text{with } x_j \in \{0,1\}$$
$$\text{subject to } \sum_{j=1}^{n} w_j x_j \le b$$

representation: binary string
fitness:
$$\text{fitness}(x) = \begin{cases} \sum_{j=1}^{n} v_j x_j & \text{if } \sum_{j=1}^{n} w_j x_j \le b \\ 0 & \text{otherwise} \end{cases}$$
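The knapsack fitness can be written directly from the definition. The instance below (three objects, capacity 8) is a made-up example, not taken from the slides.

```python
def knapsack_fitness(x, values, weights, capacity):
    """Total value of the selected objects if they fit, otherwise 0."""
    total_weight = sum(w * xi for w, xi in zip(weights, x))
    if total_weight > capacity:
        return 0  # infeasible selection: resource constraint violated
    return sum(v * xi for v, xi in zip(values, x))

# hypothetical instance: values v_j, resource claims w_j, capacity b
values = [10, 7, 4]
weights = [5, 4, 3]
capacity = 8

feasible = knapsack_fitness([1, 0, 1], values, weights, capacity)    # weight 8, value 14
infeasible = knapsack_fitness([1, 1, 0], values, weights, capacity)  # weight 9, fitness 0
```

Under fitness proportional selection an individual with fitness 0 has no share of the roulette wheel, so infeasible strings can only enter the population through crossover or mutation.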
operators
one-point crossover: randomly chosen position

parents: 1010001110 0011010010

offspring: 1010010010 0011001110


choose a random point on the two parents
split parents at this crossover point
create children by exchanging tails
pc typically in range (0.6, 0.9)
gene-wise mutation:
alter each gene independently with a probability pm
pm is called the mutation rate
typically between 1/pop_size and 1/chromosome_length
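Both operators are short enough to sketch on bit strings. The function names and the optional `point` argument (used to reproduce the parents and offspring shown above) are assumptions of this illustration.

```python
import random

def one_point_crossover(p1, p2, point=None, rng=random):
    """Split both parents at the same point and exchange tails."""
    if point is None:
        point = rng.randint(1, len(p1) - 1)  # random crossover position
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def bit_flip_mutation(bits, pm, rng=random):
    """Flip each gene independently with probability pm (the mutation rate)."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < pm else b
        for b in bits
    )

# the slide's example corresponds to a crossover point after the 5th gene
child1, child2 = one_point_crossover("1010001110", "0011010010", point=5)
```

With the crossover point after the 5th gene, the two children are exactly the offspring listed in the example.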
an example
simple problem: max x^2 over {0, 1, ..., 31}
approach:
representation: binary code, e.g. 01101 ↔ 13
population size: 4
1-point xover, bitwise mutation
roulette wheel selection
random initialisation
we show one generational cycle done by hand
x^2 example: selection
x^2 example: crossover
x^2 example: mutation
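The selection step of the x^2 example can be reproduced numerically. The four-string population below is an assumed initial population chosen for illustration; the slides' own randomly initialised population is not recoverable from the text.

```python
def decode(bits):
    """Binary code to integer, e.g. '01101' -> 13."""
    return int(bits, 2)

def fitness(bits):
    """Fitness of the x^2 problem."""
    return decode(bits) ** 2

# assumed population of size 4 (hypothetical, not from the slides)
population = ["01101", "11000", "01000", "10011"]

fits = [fitness(b) for b in population]        # [169, 576, 64, 361]
total = sum(fits)
average = total / len(population)
selection_prob = [f / total for f in fits]     # share of the roulette wheel
expected_copies = [f / average for f in fits]  # expected count in the mating pool
```

The expected counts sum to the population size, so an individual with above-average fitness is expected to appear more than once in the mating pool.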
schema theorem

objective: support for the effectiveness of the search process


by providing a model for the expectation of the survival
of a group of individuals (called a schema)

the framework was formalized by


J.H. Holland, Adaptation in Natural and Artificial Systems:
an introductory analysis with applications to biology, control and artificial
intelligence. MIT Press, ISBN 0-262-58111-6. (1998, original printing 1975).
popularized by
D.E. Goldberg, Genetic Algorithms in Search, Optimization and
Machine Learning. Addison Wesley, ISBN 0-201-15767-5 (1989)

it applies to the case of a simple genetic algorithm:


binary alphabet;
fixed-length individuals of equal length, L;
fitness proportional selection;
single point crossover;
gene wise mutation.
schema
a schema is a subset of the space of all possible individuals
for which all the genes match the template for schema H.

a template, much like a regular expression,


or a mask, describing a set of strings
the set of strings represented by a given schema
characterizes a set of candidate solutions sharing a property

notation: 0 or 1 represents a fixed bit,


asterisk represents a don't care ("wild card")

(defining) length
the distance between the first and the last fixed bit
(difference between their positions)

order
the number of fixed bits in a schema
examples
for a binary individual with the gene sequence 0 1 1 1 0 0 0,
one (of many) matching schema has the form, * 1 1 * 0 * *
the schema H = [0 1 * 1 *] identifies the chromosome set,
01010
01011
01110
01111
11****00 is the set encoded in 8 bits,
beginning with two ones and ending with two zeros
length =7
order=4
1*01, beginning with 1 and ending with 01
length = 3
order =3
0*101*
length=4
order =4
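The schema notions above (matching, order, defining length, and the set of instances) can be checked with a few helper functions. The names are chosen for this sketch, and schemata are written as strings over 0, 1 and *.

```python
from itertools import product

def matches(schema, individual):
    """True if every fixed bit of the schema agrees with the individual."""
    return len(schema) == len(individual) and all(
        s == "*" or s == g for s, g in zip(schema, individual)
    )

def order(schema):
    """Number of fixed (non-*) bits."""
    return sum(1 for s in schema if s != "*")

def defining_length(schema):
    """Distance between the first and the last fixed bit (0 if none)."""
    fixed = [i for i, s in enumerate(schema) if s != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

def instances(schema):
    """All bit strings represented by the schema."""
    slots = [("0", "1") if s == "*" else (s,) for s in schema]
    return ["".join(bits) for bits in product(*slots)]
```

For instance, `instances("01*1*")` enumerates the four chromosomes listed for H = [0 1 * 1 *], and `defining_length("11****00")` and `order("11****00")` give 7 and 4.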
approximating schema dynamics
let H be a schema
with at least one instance present in generation k
let e(H, k) be the number of instances of H in P(k)
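let f(H,k) be the average fitness of the instances of H in P(k)
$$f(H,k) = \frac{\sum_{x \in H \cap P(k)} f(x)}{e(H,k)}$$
let F(k) be the average fitness of the population P(k) of size N
$$F(k) = \frac{1}{N} \sum_{x \in P(k)} f(x)$$
let m(H, k) be the number of instances of H
in the mating pool of generation k
then the expected value of m(H,k) is
$$E[m(H,k)] = e(H,k)\,\frac{f(H,k)}{F(k)}$$
because the expected number of offspring of x is f(x)/f(pop), with f(pop) = F(k)
(roulette-wheel or fitness proportionate selection)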
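A numeric sketch of these quantities, assuming a hypothetical population of four 5-bit strings with fitness x^2 (neither the population nor the function name comes from the slides).

```python
def schema_stats(population, fitness, schema):
    """Return e(H,k), f(H,k), F(k) and the expected mating-pool count of H."""
    inst = [x for x in population
            if all(s in ("*", g) for s, g in zip(schema, x))]
    e = len(inst)                              # instances of H in P(k)
    f_H = sum(fitness(x) for x in inst) / e    # requires at least one instance
    F = sum(fitness(x) for x in population) / len(population)
    return e, f_H, F, e * f_H / F

def fitness(bits):
    return int(bits, 2) ** 2                   # assumed fitness function: x^2

population = ["01101", "11000", "01000", "10011"]  # hypothetical P(k)

e1, f1, F, m1 = schema_stats(population, fitness, "1****")
e0, f0, _, m0 = schema_stats(population, fitness, "0****")
```

Here 1**** has above-average schema fitness and 0**** below-average, so m1 exceeds e1 and m0 falls short of e0. Since the two schemata partition the population, m1 + m0 equals the population size.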
approximating schema dynamics
schemata with fitness greater (lower) than
the average population fitness are likely
to account for proportionally more (less)
of the population at the next generation
strictly speaking, for accurate estimates of expectation
the population size should be infinite
note that the average fitness of a schema
is never explicitly calculated,
but schema proliferation depends on its value

the expected value of m(H,k) is
$$E[m(H,k)] = e(H,k)\,\frac{f(H,k)}{F(k)}$$
approximating schema dynamics
consider the following individual, h,
two matching schema, H1, H2 and
crossover point between 3rd and 4th gene:
h= 1 0 1 | 1 1 0 0
H1 = * 0 1 | * * * 0
H2 = * 0 1 | * * * *
observations,
- schema H1 will be broken by the location of the crossover operator
unless the second parent is able to repair the disrupted gene.
- schema H2 emerges unaffected and is therefore independent of the
second parent.
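- let p_diff(H, k) be the probability that the second parent
from generation k does not match schema H
under single point crossover, the (lower bound) probability of schema
H surviving at generation k is
$$P(H \text{ survives}) = 1 - P(H \text{ dies}) = 1 - p_c \frac{d(H)}{L-1}\, p_{\text{diff}}(H,k) \ge 1 - p_c \frac{d(H)}{L-1}$$
where d(H) is the defining length of H and L is the length of the individuals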
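The crossover bound can be evaluated for the two schemata H1 and H2 of the example. The function names and the value pc = 0.7 are assumptions of this sketch; p_diff defaults to 1, which gives the weaker bound that does not depend on the second parent.

```python
def defining_length(schema):
    """Distance between the first and the last fixed bit (0 if none)."""
    fixed = [i for i, s in enumerate(schema) if s != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

def crossover_survival_bound(schema, pc, p_diff=1.0):
    """Lower bound 1 - pc * d(H)/(L-1) * p_diff on surviving one-point crossover."""
    L = len(schema)
    return 1.0 - pc * defining_length(schema) / (L - 1) * p_diff

H1 = "*01***0"  # d(H1) = 5: five of the six crossover points fall inside the schema
H2 = "*01****"  # d(H2) = 1: only one crossover point can break it

bound_H1 = crossover_survival_bound(H1, pc=0.7)
bound_H2 = crossover_survival_bound(H2, pc=0.7)
```

The compact schema H2 has the higher survival bound, which is the usual argument that short schemata are disrupted less often by one-point crossover.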
approximating schema dynamics
mutation is applied gene by gene.
in order for schema H to survive,
all non * genes in the schema must remain unchanged
probability of not changing a gene is $(1 - p_m)$
require that all o(H) non * genes survive, which has probability $(1 - p_m)^{o(H)}$
typically the probability of applying the mutation operator is small, $p_m \ll 1$,
thus $(1 - p_m)^{o(H)} \approx 1 - o(H)\,p_m$
under gene wise mutation,
the surviving probability of an order o(H) schema H at generation k is
$$(1 - p_m)^{o(H)} \approx 1 - o(H)\,p_m$$
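A quick numeric check of how close the linear approximation is for a small mutation rate. The values o(H) = 3 and pm = 0.01 are arbitrary choices for illustration.

```python
def mutation_survival(order, pm):
    """Exact probability that all fixed genes of the schema stay unchanged."""
    return (1 - pm) ** order

def mutation_survival_approx(order, pm):
    """First-order approximation, reasonable only when pm << 1."""
    return 1 - order * pm

exact = mutation_survival(3, 0.01)
approx = mutation_survival_approx(3, 0.01)
gap = exact - approx  # non-negative: the approximation is also a lower bound
```

For 0 <= pm <= 1, Bernoulli's inequality guarantees that the linear expression never exceeds the exact value, so using it inside a lower bound is safe.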
the schema theorem
lemma 1: the expected number of instances of H in the mating pool is
$$e(H,k)\,\frac{f(H,k)}{F(k)}$$
lemma 2: the probability that
an instance of H in the mating pool is chosen for crossover (probability p_c)
and neither of its offspring is in H is
$$p_c\,\frac{d(H)}{L-1}\,p_{\text{diff}}(H,k)$$
lemma 3: the probability that
an instance of H in the mating pool remains in H
after the mutation operator is
$$(1 - p_m)^{o(H)}$$

the expected number of chromosomes in P(k+1)
that match schema H is:
$$\left[1 - p_c\,\frac{d(H)}{L-1}\,p_{\text{diff}}(H,k)\right](1 - p_m)^{o(H)}\; e(H,k)\,\frac{f(H,k)}{F(k)}$$
the schema theorem
lemma 1: the expected number of instances of H in the mating pool is
$$e(H,k)\,\frac{f(H,k)}{F(k)}$$
lemma 2: the probability that
an instance of H in the mating pool is chosen for crossover (probability p_c)
and neither of its offspring is in H
is at most
$$p_c\,\frac{d(H)}{L-1}$$
lemma 3: the probability that
an instance of H in the mating pool remains in H
after the mutation operator is approximately
$$1 - o(H)\,p_m$$

the expected number of chromosomes in P(k+1)
that match schema H is at least:
$$\left[1 - p_c\,\frac{d(H)}{L-1} - o(H)\,p_m\right] e(H,k)\,\frac{f(H,k)}{F(k)}$$
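A short sketch of why the additive bound follows from the product of the crossover and mutation survival probabilities. The shorthand $a = p_c\,d(H)/(L-1)$ and $b = o(H)\,p_m$ is introduced here and is not part of the slides; the steps assume $0 \le p_{\text{diff}}(H,k) \le 1$, $0 \le p_m \le 1$ and $0 \le p_c \le 1$, so that $0 \le a \le 1$ because $d(H) \le L-1$.

$$
\begin{aligned}
\left[1 - a\,p_{\text{diff}}(H,k)\right](1-p_m)^{o(H)}
&\ge (1-a)\,(1-p_m)^{o(H)} && \text{since } p_{\text{diff}}(H,k) \le 1 \\
&\ge (1-a)(1-b) && \text{Bernoulli: } (1-p_m)^{o(H)} \ge 1 - o(H)\,p_m,\ 1-a \ge 0 \\
&= 1 - a - b + ab \\
&\ge 1 - a - b && \text{since } ab \ge 0
\end{aligned}
$$

Multiplying both ends by the expected mating-pool count $e(H,k)\,f(H,k)/F(k)$ gives the "at least" form of the theorem.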
the schema theorem
the theorem was a milestone in the development of genetic algorithms,
but it has undesirable assumptions:
only the worst-case scenario is considered,
while ignoring positive effects of the search operators
(this has led to the development of exact schema theorems)
the theorem concentrates on the number of schemata surviving,
not which schemata survive (such considerations have been addressed by
the use of Markov chains to provide models of behavior
associated with specific individuals in the population)
claims of exponential increases in fit schemata
are (unfortunately) misleading!
Goldberg popularized the following result:
$$e(H, k+1) \ge (1 + c)\, e(H, k)$$
where c is the constant by which fit schemata are always fitter than the
population average.
the simple genetic algorithm

has been the subject of many (early) studies
still often used as a benchmark for novel algorithms
shows many shortcomings, e.g.
representation is too restrictive
mutation & crossover only applicable
for bit-string & integer representations
selection mechanism sensitive to converging populations
with close fitness values
generational population model can be "improved"
with explicit survivor selection
therefore,
other crossover operators
more flexible crossover/mutation
other representations
other selection strategy
termination
This generational process is repeated
until a termination condition has been reached.
Common terminating conditions are:
a solution is found that satisfies minimum criteria
fixed number of generations reached
allocated budget (computation time/money) reached
the highest ranking solution's fitness is reaching or has reached a
plateau such that successive iterations no longer produce better
results
manual inspection
combinations of the above
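Several of these conditions can be combined in one check. The sketch below covers the minimum-criteria, generation-budget and plateau conditions; the function name, parameter names and default values are illustrative assumptions, and the fitness is assumed to be maximized.

```python
def should_stop(generation, best_history, max_generations=100,
                target=None, patience=20):
    """best_history holds the best fitness found in each generation so far."""
    # a solution satisfying the minimum criteria has been found
    if target is not None and best_history and best_history[-1] >= target:
        return True
    # fixed number of generations reached
    if generation >= max_generations:
        return True
    # plateau: the last `patience` generations brought no improvement
    if len(best_history) > patience:
        recent = max(best_history[-patience:])
        earlier = max(best_history[:-patience])
        if recent <= earlier:
            return True
    return False
```

A time or money budget could be added as one more condition in the same way.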
