Stochastic Search Methods
Boltzmann probability of the system being in state s at temperature T:

P(s) = exp(-E(s) / (k T)) / Σ_{s' ∈ S} exp(-E(s') / (k T))
Metropolis algorithm
Stochastic algorithm proposed by Metropolis et al. to simulate the structural evolution of a molten substance at a given temperature
Assumptions:
current system state s
temperature T
number of equilibration steps m
Metropolis algorithm (2)
Key step: generate a new system state s_new, evaluate the energy difference ΔE = E(s_new) - E(s), and accept the new state with a probability depending on ΔE
Probability of accepting the new state:

p_accept = 1                 if ΔE < 0
p_accept = exp(-ΔE / T)      otherwise
Metropolis algorithm (3)
Metropolis(s, T, m);
  i := 0;
  while i < m do
    s_new := Perturb(s);
    ΔE := E(s_new) - E(s);
    if (ΔE < 0) or (Random(0,1) < exp(-ΔE/T)) then
      s := s_new;
    i := i + 1;
  end_while;
  Return s;
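A minimal Python sketch of this equilibration loop follows; the energy and perturb callables and the toy 1-D example are illustrative assumptions, not part of the original slides.

import math
import random

def metropolis(s, energy, perturb, T, m, rng=random):
    """One Metropolis equilibration phase: m perturbation steps at fixed T."""
    for _ in range(m):
        s_new = perturb(s)
        delta_e = energy(s_new) - energy(s)
        # Always accept downhill moves; accept uphill moves with prob. exp(-dE/T)
        if delta_e < 0 or rng.random() < math.exp(-delta_e / T):
            s = s_new
    return s

# Illustrative usage: equilibrate on a simple quadratic energy landscape
if __name__ == "__main__":
    energy = lambda x: (x - 3.0) ** 2
    perturb = lambda x: x + random.gauss(0.0, 0.5)
    print(metropolis(0.0, energy, perturb, T=1.0, m=1000))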
Algorithm Simulated annealing
Starting from a configuration s, simulate an
equilibration process for a fixed temperature T
over m time steps using Metropolis(s, T, m)
Repeat the simulation procedure for decreasing temperatures T_init = T_0 > T_1 > ... > T_final
Result: a sequence of annealing configurations with gradually decreasing free energies E(s_0) ≥ E(s_1) ≥ ... ≥ E(s_final)
Algorithm Simulated annealing (2)
Simulated_annealing(T_init, T_final, s_init, m, α);
  T := T_init;
  s := s_init;
  while T > T_final do
    s := Metropolis(s, T, m);
    T := α·T;
  end_while;
  Return s;
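A minimal, self-contained Python sketch of the full procedure with a geometric cooling schedule; the toy energy function and perturbation used in the example are illustrative assumptions.

import math
import random

def simulated_annealing(s, energy, perturb, t_init, t_final, m, alpha, rng=random):
    """Simulated annealing: Metropolis equilibration at each temperature,
    followed by geometric cooling T := alpha * T (0 < alpha < 1)."""
    t = t_init
    while t > t_final:
        for _ in range(m):                       # Metropolis(s, T, m)
            s_new = perturb(s)
            delta_e = energy(s_new) - energy(s)
            if delta_e < 0 or rng.random() < math.exp(-delta_e / t):
                s = s_new
        t = alpha * t                            # cooling step
    return s

# Illustrative usage on a multimodal 1-D function
if __name__ == "__main__":
    energy = lambda x: x * x + 10.0 * math.cos(3.0 * x)
    perturb = lambda x: x + random.gauss(0.0, 0.3)
    best = simulated_annealing(5.0, energy, perturb,
                               t_init=10.0, t_final=0.01, m=200, alpha=0.9)
    print(best, energy(best))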
Simulated annealing as an
optimization process
Solutions to the optimization problem
correspond to system states
System energy corresponds to the objective
function
Searching for a good solution is like finding a
system configuration with minimum free energy
Temperature and equilibration time steps are
parameters for controlling the optimization
process
Annealing schedule
A major factor for the optimization process to
avoid premature convergence
Describes how temperature will be decreased
and how many iterations will be used during
each equilibration phase
Simple cooling plan: T := α·T, with 0 < α < 1, and a fixed number of equilibration steps m
Algorithm characteristics
At high temperatures almost any new solution
is accepted, thus premature convergence
towards a specific region can be avoided
Careful cooling with α between 0.8 and 0.99 will lead to an asymptotic drift towards T_final
On its search for the optimal solution, the algorithm is capable of escaping from local optima
Applications and extensions
Initial success in combinatorial optimization,
e.g. wire routing and component placement
in VLSI design, TSP
Afterwards adopted as a general-purpose
optimization technique and applied in a wide
variety of domains
Variants of the basic algorithm: threshold
accepting, parallel simulated annealing, etc.,
and hybrids, e.g. thermodynamical genetic
algorithm
Evolutionary algorithms (EAs)
Simplified models of biological evolution,
implementing the principles of Darwinian
theory of natural selection (survival of the
fittest) and genetics
Stochastic search and optimization
algorithms, successful in practice
Key idea: computer simulated evolution as a
problem-solving technique
Analogy used
Biological evolution        Computer problem solving
Individual                  Solution to a problem
Chromosome                  Encoding of a solution
Population                  Set of solutions
Crossover, mutation         Search operators
Natural selection           Reuse of good solutions
Fitness                     Quality of a solution
Environment                 Problem to be solved
Evolutionary algorithms and
soft computing
[Diagram: Computational Intelligence (Soft Computing) comprises Neural Networks, Fuzzy Systems and Evolutionary Algorithms; the EA family in turn includes Evolutionary Programming, Evolution Strategies, Genetic Algorithms and Genetic Programming. Source: EvoNet Flying Circus]
Evolutionary cycle
[Diagram: the evolutionary cycle. Selection picks parents from the population, genetic variation (mutation, recombination) produces offspring, and replacement reinserts offspring into the population. Source: EvoNet Flying Circus]
Generic Evolutionary algorithm
Evolutionary_algorithm(t_max);
  t := 0;
  Create initial population of individuals;
  Evaluate individuals;
  result := best_individual;
  while t < t_max do
    t := t + 1;
    Select better solutions to form new population;
    Create their offspring by means of genetic variation;
    Evaluate new individuals;
    if better solution found then result := best_individual;
  end_while;
  Return result;
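The loop above can be sketched in Python as follows; the binary tournament selection, the bit-string representation and the OneMax fitness in the usage example are illustrative assumptions, since the generic algorithm leaves them open.

import random

def evolutionary_algorithm(fitness, create, mutate, crossover,
                           pop_size=50, t_max=100, rng=random):
    """Generic evolutionary loop: select parents, create offspring by
    genetic variation, evaluate them, and remember the best solution found."""
    population = [create(rng) for _ in range(pop_size)]
    result = max(population, key=fitness)
    for _ in range(t_max):
        def select():                       # binary tournament selection
            return max(rng.sample(population, 2), key=fitness)
        offspring = [mutate(crossover(select(), select(), rng), rng)
                     for _ in range(pop_size)]
        population = offspring
        best = max(population, key=fitness)
        if fitness(best) > fitness(result):
            result = best
    return result

# Illustrative usage: maximize the number of ones in a 20-bit string (OneMax)
n = 20
create = lambda rng: [rng.randint(0, 1) for _ in range(n)]
mutate = lambda x, rng: [b ^ (rng.random() < 1.0 / n) for b in x]
crossover = lambda a, b, rng: (lambda c: a[:c] + b[c:])(rng.randint(1, n - 1))
best = evolutionary_algorithm(sum, create, mutate, crossover)
print(best, sum(best))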
Differences among variants of EAs
Original field of application
Data structures used to represent solutions
Realization of selection and variation operators
Termination criterion
Evolution strategies (ES)
Developed in the 1960s and 70s by Ingo Rechenberg and Hans-Paul Schwefel at the Technical University of Berlin
Originally used as a technique for solving
complex optimization problems in engineering
design
Preferred data structures: vectors of real
numbers
Specialty: self-adaptation
Evolutionary experimentation
Pipe-bending experiments (Rechenberg, 1965)
Algorithm details
Encoding of object and strategy parameters:

g = (p, s) = ((p_1, p_2, ..., p_n), (s_1, s_2, ..., s_n))

where p_i represent problem variables and s_i mutation variances to be applied to p_i
Mutation is the major operator for chromosome variation:

g_mut = (p_mut, s_mut) = (p + N_0(s), α(s))
p_mut = (p_1 + N_0(s_1), ..., p_n + N_0(s_n))
s_mut = (α(s_1), ..., α(s_n))

where N_0(s_i) denotes a normally distributed random value with zero mean and variance s_i, and α(·) is the rule that adapts the mutation variances (self-adaptation)
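A small Python sketch of this mutation; the slides leave the adaptation rule α abstract, so the log-normal update used below is an assumption (one common choice in the ES literature), and s_i is treated as a standard deviation for simplicity.

import math
import random

def es_mutate(p, s, tau=None, rng=random):
    """Mutate an ES individual g = (p, s): object variables p_i receive a
    normally distributed offset with spread s_i, and the strategy
    parameters s_i are adapted (self-adaptation)."""
    n = len(p)
    tau = tau if tau is not None else 1.0 / math.sqrt(n)
    p_mut = [pi + rng.gauss(0.0, si) for pi, si in zip(p, s)]        # p_i + N_0(s_i)
    s_mut = [si * math.exp(tau * rng.gauss(0.0, 1.0)) for si in s]   # alpha(s_i), assumed log-normal
    return p_mut, s_mut

# Illustrative usage
print(es_mutate([0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))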
Algorithm details (2)
1/5 success rule: increase the mutation strength if more than 1/5 of the offspring are successful, otherwise decrease it (see the sketch after this slide)
Recombination operators range from
swapping respective components between
two vectors to component-wise calculation
of means
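A minimal sketch of the 1/5 success rule referenced above; the adjustment factor 0.85 is a conventional value from the ES literature and is an assumption here.

def one_fifth_rule(step_size, success_rate, factor=0.85):
    """Adapt the mutation strength after an observation period: enlarge it
    when more than 1/5 of offspring improved on their parents, shrink it
    when fewer did, and leave it unchanged at exactly 1/5."""
    if success_rate > 0.2:
        return step_size / factor   # search too cautious: enlarge steps
    if success_rate < 0.2:
        return step_size * factor   # too many failed moves: shrink steps
    return step_size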
Algorithm details (3)
Selection schemes
(μ + λ)-ES: μ parents produce λ offspring, the best μ out of the μ + λ individuals survive
(μ, λ)-ES: μ parents produce λ offspring, the best μ offspring survive (see the sketch after this slide)
Originally: (1+1)-ES
Advanced techniques: meta-evolution strategies, covariance matrix adaptation ES (CMA-ES)
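The two survivor-selection schemes can be sketched as below, assuming individuals are scored by a fitness function where larger values are better.

def plus_selection(parents, offspring, mu, fitness):
    """(mu + lambda)-ES: the best mu of parents and offspring together survive."""
    return sorted(parents + offspring, key=fitness, reverse=True)[:mu]

def comma_selection(offspring, mu, fitness):
    """(mu, lambda)-ES: the best mu offspring survive; parents are discarded."""
    return sorted(offspring, key=fitness, reverse=True)[:mu]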
Genetic algorithms (GAs)
Developed in the 1970s by John Holland at the University of Michigan and popularized as a universal optimization algorithm
Most remarkable difference between GAs and ES: GAs use a string-based, usually binary parameter encoding, resembling the discrete nucleotide coding on cellular chromosomes
Mutation: flipping bits with a certain probability (see the sketch below)
Recombination is performed by crossover
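A minimal sketch of the bit-flip mutation mentioned above; the default per-bit probability of 1/length is a common rule of thumb and an assumption here.

import random

def bitflip_mutation(bits, p_mut=None, rng=random):
    """Flip each bit independently with probability p_mut (default 1/len(bits))."""
    p = p_mut if p_mut is not None else 1.0 / len(bits)
    return [b ^ (rng.random() < p) for b in bits]

print(bitflip_mutation([1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1]))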
Crossover operator
Models the breaking of two chromosomes and subsequent crosswise restitution observed on natural genomes during sexual reproduction
Exchanges information among individuals
Example: simple (single-point) crossover

Parents                            Offspring
1 0 0 1 0 0 1 | 0 1 0 1 0 1 1      1 0 0 1 0 0 1 | 1 1 1 0 1 0 0
0 0 1 1 0 1 1 | 1 1 1 0 1 0 0      0 0 1 1 0 1 1 | 0 1 0 1 0 1 1
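A minimal Python sketch of single-point crossover; the optional cut argument is an addition here so that the example above (cut after position 7) can be reproduced exactly.

import random

def single_point_crossover(parent1, parent2, cut=None, rng=random):
    """Cut both parents at the same point and swap the tails."""
    assert len(parent1) == len(parent2)
    if cut is None:
        cut = rng.randint(1, len(parent1) - 1)
    return parent1[:cut] + parent2[cut:], parent2[:cut] + parent1[cut:]

# Reproducing the example above
p1 = [1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1]
p2 = [0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0]
print(single_point_crossover(p1, p2, cut=7))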
Selection
Models the principle of survival of the fittest
Traditional approach: fitness-proportionate selection, i.e. probabilistic reproduction of individuals in proportion to their fitness values
Implementation: roulette wheel
Selection (2)
In a population of n individuals, with the sum of fitness values f and average fitness f_avg, the expected number of copies of the i-th individual with fitness f_i equals

f_i / f_avg = n · f_i / f

(the roulette-wheel sketch below implements this scheme)
Alternative selection schemes: rank-based selection, elitist selection, tournament selection, etc.
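A minimal sketch of roulette-wheel (fitness-proportionate) selection; it assumes non-negative fitness values, and the bit-string example is illustrative.

import random

def roulette_wheel_select(population, fitness, rng=random):
    """Select one individual with probability f_i / f (f = sum of fitnesses)."""
    total = sum(fitness(ind) for ind in population)
    r = rng.uniform(0.0, total)
    running = 0.0
    for ind in population:
        running += fitness(ind)
        if running >= r:
            return ind
    return population[-1]   # guard against floating-point round-off

# Illustrative usage: bit strings scored by their number of ones
pop = [[1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 1, 1]]
print(roulette_wheel_select(pop, fitness=sum))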
Algorithm extensions
Encoding of solutions: real vectors, permutations, arrays, etc.
Crossover variants: multiple-point crossover,
uniform crossover, arithmetic crossover,
tailored crossover operators for permutation
problems, etc.
Advanced approaches: meta-GA, parallel
GAs, GAs with subjective evaluation of
solutions, multi-objective GAs
Genetic programming (GP)
An extension of genetic algorithms aimed at evolving computer programs by means of simulated evolution
Proposed by John Koza (Stanford University) in the early 1990s
Computer programs are represented by tree-like symbolic expressions consisting of functions and terminals
Crossover: exchange of subtrees between two parent trees
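A minimal Python sketch of subtree crossover on tree-shaped programs; representing expressions as nested tuples and the two sample expressions are illustrative assumptions.

import random

# A program is a terminal (e.g. 'x' or a number) or a tuple (function, arg1, arg2, ...)

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node of the expression tree."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the node at the given path replaced."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def gp_crossover(parent1, parent2, rng=random):
    """Swap a randomly chosen subtree of parent1 with one of parent2."""
    path1, sub1 = rng.choice(list(subtrees(parent1)))
    path2, sub2 = rng.choice(list(subtrees(parent2)))
    return replace_at(parent1, path1, sub2), replace_at(parent2, path2, sub1)

# Illustrative usage on two small symbolic expressions
expr1 = ('add', ('mul', 'x', 'x'), 3)
expr2 = ('sub', ('add', 'x', 1), ('mul', 2, 'x'))
print(gp_crossover(expr1, expr2, rng=random.Random(1)))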
Genetic programming (2)
Mutation: replacement of a randomly
selected subtree with a new, randomly
created tree
Fitness evaluation: program performance in
solving the given problem
GP is a major step towards automatic computer programming, nowadays capable of producing human-competitive solutions in a variety of application domains
Genetic programming (3)
Applications: symbolic regression, process
and robotics control, electronic circuit
design, signal processing, game playing,
evolution of art images and music, etc.
Main drawback: computational complexity
Advantages of EAs
Robust and universally applicable
Besides the solution evaluation, no additional
information on solutions and search space
properties is required
As population-based methods, they produce alternative solutions
Enable incorporation of other techniques
(hybridization) and can be parallelized
Disadvantages of EAs
Suboptimal methodology: no guarantee of finding the global optimum
Require tuning of several algorithm
parameters
Computationally expensive
Conclusion
Stochastic algorithms are becoming
increasingly popular in solving complex
search and optimization problems in various
application domains, including machine
learning and data analysis
A certain degree of randomness, as involved
in stochastic algorithms, may help
tremendously in improving the ability of a
search procedure to discover near-optimal
solutions
Conclusion (2)
Many stochastic methods are inspired by natural phenomena, either physical or biological processes
Simulated annealing and evolutionary
algorithms discussed in this presentation are
two such examples
Further reading
Corne, D., Dorigo, M. and Glover, F. (eds.) (1999): New Ideas in Optimization, McGraw Hill, London
Eiben, A. E. and Smith, J. E. (2003):
Introduction to Evolutionary Computing,
Springer, Berlin
Freitas, A. A. (2002): Data Mining and
Knowledge Discovery with Evolutionary
Algorithms, Springer, Berlin
Further reading (2)
Jacob, C. (2003): Stochastic Search Methods.
In: Berthold, M. and Hand, D. J. (eds.)
Intelligent Data Analysis, Springer, Berlin
Reeves, C. R. (ed.) (1995): Modern Heuristic
Techniques for Combinatorial Problems,
McGraw Hill, London