
LOCAL SEARCH

林伯慎 Bor-shen Lin


[email protected]
http://www.cs.ntust.edu.tw/~bslin
AGENDA
 Combinatorial Optimization Problems &
Constraint Satisfaction Problems (CSPs)
 Local Search Methods
 Ant colony optimization
 Particle swarm optimization
 Simulated annealing
 Genetic algorithm
CONSTRAINT SATISFACTION PROBLEMS
 Combinatorial optimization problems involve
assigning values to a number of variables.
 A constraint satisfaction problem (CSP) is a
combinatorial optimization problem with a set of
constraints.
 CSPs can be modeled as search problems, and search
methods and heuristics can be used to solve them.
 Example: 8-queens problem
 A queen attacks all squares in the same row, column, and diagonals, so no other queen can be placed on any of them.
8-QUEENS PROBLEM

 An 8-ply deep search tree (DFS)


 Size of search space without constraint: 64·63·…·57
 The branching factor will be smaller if the constraints
are applied
 Place one queen for each row (or column)
 Size of search space: 8·7·…·1
 The choices are even fewer after further
applying the constraints
 Row 5 has 2 choices only
 The search tree is large, but can be significantly
reduced by applying the constraints appropriately
HEURISTIC REPAIR
 Generate a possible solution (randomly or with a heuristic), then repeatedly make changes that reduce the distance of the state from the goal
 e.g. board state randomly assigned as [3, 1, 7, 5, 8, 6, 4, 2]
 Resolve conflicts: the min-conflict heuristic for 8-queens (see the sketch below)
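Below is a minimal sketch of heuristic repair with the min-conflict heuristic. It assumes the board is a list whose index is the column and whose value is the (0-based) row of that column's queen; the function names and step limit are illustrative, not from the slides.

```python
import random

def conflicts(board, col, row):
    """Count queens that would attack a queen placed at (col, row), ignoring column col itself."""
    return sum(1 for c, r in enumerate(board)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts(n=8, max_steps=10000):
    # Start from a random complete assignment: one queen per column.
    board = [random.randrange(n) for _ in range(n)]
    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(board, c, board[c]) > 0]
        if not conflicted:
            return board                               # goal: no queen is attacked
        col = random.choice(conflicted)                # pick one conflicted queen
        # Min-conflict heuristic: move it to the row with the fewest conflicts.
        board[col] = min(range(n), key=lambda r: conflicts(board, col, r))
    return None                                        # no solution within the step limit

print(min_conflicts())
```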
8-QUEENS PROBLEM (CONT’D)

 Most Constrained Variables


 Consider the problem as assigning a value to each of eight variables, a through h
 Work with the variable that has the fewest valid choices remaining
 Example: a = 1, b = 3, c =5 , and the numbers of
choices: d(3), e(3), f(1), g(3), h(3)
→ place a queen in column f (choose variable f) !
 Least Constraining Value
 Assigning a value to a variable that leaves the
greatest number of choices for other variables
EXCHANGING HEURISTICS
 Swap two or more variables at each step until a solution is
found.
 A k-exchange involves swapping the values of k variables.
 k-exchange for TSP
 Remove two edges and replace them with two other edges
 Higher k leads to higher time complexity
 k=3 produces good enough results and can be implemented
efficiently.

Figure cited from Stochastic Local Search: Foundations and Applications, H. H. Hoos and T. Stützle
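A small sketch of the 2-exchange (2-opt) move for the TSP, under the assumption that the tour is a list of city indices and `dist` is a symmetric distance matrix (both names are illustrative):

```python
def two_opt(tour, dist):
    """Keep removing two edges and reconnecting the tour the other way
    (reversing the segment in between) while this shortens the tour."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            # Skip j = n - 1 when i = 0: those two edges share a city.
            for j in range(i + 2, n - (1 if i == 0 else 0)):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Gain of replacing edges (a, b) and (c, d) with (a, c) and (b, d).
                delta = dist[a][c] + dist[b][d] - dist[a][b] - dist[c][d]
                if delta < -1e-9:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour
```

A 3-exchange works the same way but considers triples of edges, which costs more per step but escapes more local optima.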
COMBINATORIAL OPTIMIZATION
PROBLEMS
 Finding the best possible set of values for a group of
variables
 It is useful to find ways to restrict the number of choices available for each variable (e.g. each queen) to avoid the problem of combinatorial explosion
 Examples
 Allocating teachers to classrooms
 Scheduling machines and workers
 Selecting the best routes for buses
 Traveling salesman problem
PROBLEMS IN DFS/BFS/A*
 Large memory required
 Open list for recording and sorting all frontier nodes
 Closed list for tracking all visited nodes

 Computation cost is high
 Sorting the open list according to the evaluation function
 Comparing nodes against the closed list to avoid duplicate states
 Optimality is guaranteed for A*
 but may not be achievable in real time for search trees with a large branching factor
 Stochastic local search
 requires less memory
 but may return only a local optimum
LOCAL SEARCH
 Start from a random state, and make small
changes until a goal state is achieved
 The methods used by local search techniques are
known as meta-heuristics.
 Ant colony optimization
 Particle swarm optimization
 Simulated annealing
 Genetic algorithm
 Local optimization: attempts to optimize a set of values, but will often find a local optimum instead of the global optimum.
ITERATED LOCAL SEARCH
 Attempt to overcome the problem of local optima
by running the optimization procedure
repeatedly, from different initial states.
 Given sufficient iterations, this kind of method will almost always find the global optimum.
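A minimal sketch of this restart idea; `local_search`, `score`, and `random_state` are placeholders for a problem-specific local optimizer, an objective to maximize, and a random initial-state generator:

```python
def iterated_local_search(local_search, score, random_state, restarts=20):
    """Run the local optimization from several random initial states and
    keep the best local optimum found."""
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        state = local_search(random_state())   # climb/descend to a local optimum
        s = score(state)
        if s > best_score:                     # remember the best result over all restarts
            best, best_score = state, s
    return best, best_score
```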
ANT COLONY OPTIMIZATION (ACO)
 Behavior of ants
 Foraging ants leave a trail of pheromones that can
lead other ants to find the food they have found
 The trail of pheromones is renewed regularly
 If a better route is found, the pheromone on the old route gradually fades and the new route becomes the most popular choice
 Cope extremely well with changes in the
environment, such as blockages and new routes
 Adapt to change in real time
 Traveling salesperson problem
 Network routing problem
ANT COLONY OPTIMIZATION
 $p_{xy} = \dfrac{\tau_{xy}^{\alpha}\,\eta_{xy}^{\beta}}{\sum_{z}\tau_{xz}^{\alpha}\,\eta_{xz}^{\beta}}$
 probability of the state transition from state x to state y in the state space
 $\tau_{xy}$: amount of pheromone on the edge from x to y
 $\eta_{xy} = 1/d_{xy}$, where $d_{xy}$ is the distance from x to y
 Ant: computational agent
 moves to candidate states in the state space
 selects its next state according to $p_{xy}$
 Pheromone update
 $\tau_{xy}' = (1-\rho)\,\tau_{xy} + \sum_{k}\Delta\tau_{xy}^{k}$
 $\rho$: evaporation rate of the pheromone
 $\Delta\tau_{xy}^{k} = Q/L_{k}$ if ant k passes through the edge x–y
 $L_{k}$: the tour length of the k-th ant
[Figure: an edge between nodes x and y, labeled with $\tau_{xy}$ and $d_{xy}$]
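A compact sketch of the transition rule and pheromone update for a TSP-style ACO. The distance matrix `dist` and the parameter values (ant count, Q, rho, alpha, beta) are illustrative assumptions; the update formulas follow the definitions above.

```python
import random

def aco_tsp(dist, n_ants=20, n_iter=100, alpha=1.0, beta=2.0, rho=0.5, Q=1.0):
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]                     # pheromone on every edge
    eta = [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)] for i in range(n)]
    best_tour, best_len = None, float("inf")

    for _ in range(n_iter):
        tours = []
        for _ in range(n_ants):
            tour = [random.randrange(n)]
            while len(tour) < n:
                x = tour[-1]
                cand = [y for y in range(n) if y not in tour]
                # p_xy is proportional to tau^alpha * eta^beta over the unvisited cities.
                weights = [tau[x][y] ** alpha * eta[x][y] ** beta for y in cand]
                tour.append(random.choices(cand, weights)[0])
            length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
            tours.append((tour, length))
            if length < best_len:
                best_tour, best_len = tour, length

        # Evaporation, then each ant k deposits Q / L_k on every edge of its tour.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= (1 - rho)
        for tour, length in tours:
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                tau[a][b] += Q / length
                tau[b][a] += Q / length
    return best_tour, best_len
```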


PARTICLE SWARM OPTIMIZATION (PSO)

 Observe a school of fish or a flock of birds
 If a fish has found food, there is likely more food around it
 Every fish moves toward the food somewhat randomly
 Humans are usually influenced by celebrities

[Figure: particles (x) scattered between a known optimum and a heuristic optimum, each moving with a small velocity step dv]
GAME OF LIFE
 A cellular automaton (J. H. Conway, 1970)
 Rules
 Every cell is either Alive or Dead and has 8 neighbors
 An Alive cell
 changes to Dead if too lonely (< 2 alive neighbors) or too crowded (> 3 alive neighbors)
 A Dead cell
 changes to Alive if it has exactly three alive neighbors

[State transition diagram: Alive stays Alive with 2 or 3 neighbors and becomes Dead with < 2 or > 3; Dead becomes Alive with exactly 3 neighbors]
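A small sketch of one update step of these rules, assuming the grid is stored as a set of (row, col) coordinates of Alive cells:

```python
from collections import Counter

def life_step(alive):
    """One generation of the Game of Life; `alive` is a set of (row, col) cells."""
    # For every cell, count how many of its 8 neighbors are alive.
    counts = Counter((r + dr, c + dc)
                     for r, c in alive
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0))
    # Survive with 2 or 3 neighbors; be born with exactly 3 neighbors.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}

# Example: a "blinker" flips between a horizontal and a vertical bar.
blinker = {(1, 0), (1, 1), (1, 2)}
print(life_step(blinker))          # {(0, 1), (1, 1), (2, 1)}
```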
SIMULATION BY GAME OF LIFE

 Cited from Swarm Intelligence, J. Kennedy and R. C. Eberhart

 A simple model of social behavior


 each particle changes its individual state according to
a simple rule
 A group of particles has a specific pattern

 Similar to social behaviors (like stores)


 Evolution of the social pattern
AXELROD’S CULTURAL MODEL
 Individual
 Represented as a string of symbols (features)
 Similarity criterion for interaction
 Probability of interaction is a function of similarity
 agents who are similar to each other are likely to interact
e.g. 42237 vs. 99217: interact with probability 0.4 (2 of 5 features match)
     42237 vs. 98765: probability 0.0 (no interaction)
 When interaction occurs, one non-matching feature is copied, chosen by stochastic simulation (see the sketch at the end of this slide)
e.g. 42237 → 42217
 Cognitions, attitudes, and other arrays of psychological
phenomena are optimized by interaction among individuals
 What are the simulation results?
[Figures: the feature grid before and after the simulation]
 Cited from Swarm Intelligence, J. Kennedy and R. C. Eberhart
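A minimal sketch of one interaction step under these rules, taking the interaction probability to be the fraction of matching features (an assumption consistent with the 0.4 example above):

```python
import random

def interact(a, b):
    """One Axelrod-style interaction: agent a may copy one non-matching feature from b."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    p = matches / len(a)                       # similarity = fraction of matching features
    if random.random() < p:                    # interaction happens with this probability
        diff = [i for i in range(len(a)) if a[i] != b[i]]
        if diff:
            i = random.choice(diff)            # copy one non-matching feature from b
            a = a[:i] + b[i] + a[i + 1:]       # e.g. 42237 may become 42217
    return a

print(interact("42237", "99217"))              # similarity 0.4, so interaction is possible
print(interact("42237", "98765"))              # similarity 0.0, never interacts
```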


DISCUSSIONS

 Form homogeneous groups


 Polarization
 Cultural differences become insurmountable
 Dissimilarity creates boundaries between cultural regions
 Inter-individual similarities do not facilitate convergence

 Reason: interaction is influenced by similarity


 Alternative simulation
 No similarity criterion (unconditional interaction)
 The population converges without polarization
ADAPTIVE CULTURE MODEL
 Universal behaviors of individuals
 Evaluate
 Evaluate stimuli as positive or negative, attractive or repulsive
 Distinguish features of the environment (good/bad)

 Learn to improve the average evaluation

 Compare
 Judge ourselves through comparison with others (neighbors)
 Looks, wealth, humor, intelligence…

 Imitate
 Imitate ONLY those neighbors who are superior to themselves
 Few animals may imitate (humans/birds)

 True imitation: not only imitating a behavior but realizing its


purpose (very rare for animals)
 Adapt to environment challenges
SWARM INTELLIGENCE

 A set of particles as candidate solutions


 Each particle is influenced by
 the very best performance of any member of the entire population (the temporary global optimum $p_g$)
 the best performance of a group of k nearest neighbors (the local optimum $p_l$)
 Updating rule for a particle
 $v_t = v_{t-1} + \varphi_1 (p_g - x_{t-1}) + \varphi_2 (p_l - x_{t-1})$
 $x_t = x_{t-1} + v_t$
 $\varphi_1, \varphi_2$ are random numbers
[Figure: a particle at $x_{t-1}$ receiving a velocity update $dv$ pulled toward $p_g$ and $p_l$]
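A minimal sketch of this update rule in one dimension, maximizing a fitness function f on an interval. Treating the current population best as p_g and the best of the k nearest particles as p_l follows the bullets above; the particle count, iteration count, and bounds are illustrative.

```python
import math
import random

def pso_max(f, n_particles=30, n_iter=200, k=5, lo=-10.0, hi=10.0):
    """Maximize f(x) on [lo, hi] with the velocity/position update above (1-D for brevity)."""
    x = [random.uniform(lo, hi) for _ in range(n_particles)]
    v = [0.0] * n_particles
    for _ in range(n_iter):
        p_g = max(x, key=f)                              # best member of the entire population
        for i in range(n_particles):
            # Local best: the best among the k nearest neighbors of particle i.
            neigh = sorted(range(n_particles), key=lambda j: abs(x[j] - x[i]))[:k]
            p_l = max((x[j] for j in neigh), key=f)
            phi1, phi2 = random.random(), random.random()
            v[i] = v[i] + phi1 * (p_g - x[i]) + phi2 * (p_l - x[i])
            x[i] = min(hi, max(lo, x[i] + v[i]))         # keep the particle inside [lo, hi]
    return max(x, key=f)

print(pso_max(math.sin))   # should end up near one of the maxima of sin (value close to 1)
```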
APPLICATIONS OF SWARM INTELLIGENCE
 Find optimal solutions
 Optimizing parameters as a meta method
 e.g. PSO + ANN (PSO tunes the network weights)
 Evolution for a group of particles
 flocks of birds, schools of fish, … (bees)
 Simple rules for each particle
SIMULATED ANNEALING

 Annealing: heating material to a very high


temperature and then allowing it to cool very
slowly
 Some global energy function E(x) of the variables x needs to be defined. The aim of simulated annealing is to minimize this energy.
 Simple Monte Carlo simulation vs. Metropolis
Monte Carlo simulation
CONCEPT OF GRADIENT DESCENT (GD)

[Figure: a curve f(x); at a point where f'(x) > 0, stepping from x to x + dx with dx < 0 decreases f(x)]
SIMULATED ANNEALING (CONT’D)

 A random start state x is selected.


 A small random change is made
 x’ = x + d.
 If x’ lowers the system energy, it is accepted.
 If it increases the energy, it may be accepted, depending on a probability given by the Boltzmann acceptance criterion
$P(\text{accept} \mid dE) = e^{-dE/T}$

[Figures: left, the energy landscape E(x) with a move from x to x' where dE > 0; right, P(accept | dE) versus dE for high and low T, showing that a higher T gives a higher acceptance probability]
SIMULATED ANNEALING (CONT’D)
 Probability of acceptance: the chance that the ball escapes from a local minimum

 dE ↑ ⇒ P(accept) ↓: more difficult to escape
 T ↑ ⇒ P(accept) ↑: easier to escape
 T = 0 ⇒ P(accept) = 0
 the state no longer escapes from a local minimum
 Cooling schedule
 reducing T from initial temperature
 Tnew = Told – dT
 Tnew = C·Told (0< C < 1)
SIMULATED ANNEALING (CONT’D)
 Has been successfully used for placing VLSI
components, scheduling problems and other large
combinatorial problems where values need to be

assigned to a large number of variables to
maximize (or minimize) some function of those
variables.
 Example: knapsack problem
 A solution is a random bit vector indicating which objects are picked (e.g. (0, 1, 1, 0, 1, 0, 1, 0, 0, 1))
 E(x) can be based on the total value of the picked objects, with the sign flipped so that higher value means lower energy (other measures can also be used)
 Make small random changes to the solution (e.g. flip one bit) and apply the acceptance test (sketched below)
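A minimal sketch of simulated annealing on the knapsack example, treating energy as the negative of the picked value and using a geometric cooling schedule; the item data, temperatures, and step count are illustrative.

```python
import math
import random

def sa_knapsack(values, weights, capacity, T=10.0, cooling=0.999, steps=5000):
    n = len(values)

    def value(sol):
        w = sum(wi for wi, s in zip(weights, sol) if s)
        return 0 if w > capacity else sum(vi for vi, s in zip(values, sol) if s)

    x = [0] * n                                          # start with nothing picked
    best = x[:]
    for _ in range(steps):
        y = x[:]
        y[random.randrange(n)] ^= 1                      # small random change: flip one item
        dE = value(x) - value(y)                         # energy = -value, so dE = V(x) - V(y)
        if dE <= 0 or random.random() < math.exp(-dE / T):
            x = y                                        # accept: better, or worse with P = e^(-dE/T)
        if value(x) > value(best):
            best = x[:]
        T *= cooling                                     # cooling schedule: T_new = C * T_old
    return best, value(best)

vals, wts = [6, 10, 12, 7], [1, 2, 3, 2]
print(sa_knapsack(vals, wts, capacity=5))   # very likely finds the optimum: value 23
```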
DISCUSSION ON SIMULATED ANNEALING
 Compared with gradient descent (GD)
 E(x) always decreases (dx = −ε·E′(x) ⇒ dE < 0)
 arrives at a local optimum
 GD requires x to be continuous and E(x) to be differentiable, since it needs to compute E′(x)
 Simulated annealing allows the state to jump out of
the local optimum when dE > 0
 Simulated annealing allows x to have discrete
variables
GENETIC ALGORITHMS
 A method based on biological evolution.
 Create chromosomes which represent possible
solutions to a problem.
 A fitness value for each chromosome is
determined.
 The best chromosomes in each generation are
bred with each other to produce a new
generation.
 Reproduction is done by applying crossover to two or more chromosomes
 Mutation is applied so as to make random
changes to particular genes.
INTRODUCTION
 Local Search
 Find solutions in an extremely large search space
 Making small changes to potential solutions to a
problem until an optimal solution is identified
 GA is a form of local search
 Based on evolution
 Borrow terms from genetics
 Chromosome, crossover, mutation, fitness
REPRESENTATION
 Chromosome (染色體)
 A string of bits
 Each bit is known as a gene
 Formulated as a solution to a problem

 Population
 A set of chromosomes
 Creatures might have a number of chromosomes
GENETIC ALGORITHM

1. Create a random population of chromosomes


(the first generation)
2. If the termination criteria are satisfied, stop.
3. Determine the fitness of each chromosome.
4. Apply crossover and mutation to selected
chromosomes from current generation to
generate a new population of chromosomes –
the next generation
5. Return to step 2.
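A generic sketch of this loop for bit-string chromosomes, with fitness-proportional selection, single-point crossover, and a small mutation rate; step 2's termination test is simplified to a fixed number of generations, and all rates are illustrative.

```python
import random

def genetic_algorithm(fitness, chrom_len, pop_size=50, generations=100,
                      p_crossover=0.7, p_mutation=0.01):
    # 1. Create a random population of chromosomes (the first generation).
    pop = [[random.randint(0, 1) for _ in range(chrom_len)] for _ in range(pop_size)]
    for _ in range(generations):                       # 2. simplified termination criterion
        # 3. Determine the fitness of each chromosome.
        scores = [fitness(c) for c in pop]

        def select():
            # Fitness-proportional (roulette-wheel) selection.
            return random.choices(pop, weights=scores)[0]

        # 4. Apply crossover and mutation to selected chromosomes.
        nxt = []
        while len(nxt) < pop_size:
            a, b = select(), select()
            if random.random() < p_crossover:
                cut = random.randrange(1, chrom_len)   # single-point crossover
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            nxt.extend([[g ^ 1 if random.random() < p_mutation else g for g in c]
                        for c in (a, b)])
        pop = nxt[:pop_size]                           # 5. next generation, then repeat
    scores = [fitness(c) for c in pop]
    return max(zip(scores, pop))                       # best (fitness, chromosome) found
```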
ALGORITHM – CONT’D
 Size of population
 Should be determined in advance and remains constant
from one generation to the next
 Size of each chromosome
 Should be the same (variable chromosome size is
unusual)
 Fittest chromosomes are selected to mate (with higher
probability) and might generate more offspring
 Objective fitness function is used (in general)
 Subjective fitness function can also be used (e.g. To
generate interesting pictures, human judgment is
important.)
SELECTION
 The probability that a chromosome is selected to
mate is proportional to its fitness value
 P(c is selected) = f(c) / Σn f(cn)
 c is selected by a random test (roulette-wheel selection, sketched below)
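A tiny sketch of the random test behind this selection rule, without relying on library helpers (the chromosome and fitness representations are left abstract):

```python
import random

def roulette_select(population, fitness):
    """Pick a chromosome c with probability f(c) / sum_n f(c_n)."""
    total = sum(fitness(c) for c in population)
    r = random.uniform(0, total)          # random test: a point on the roulette wheel
    acc = 0.0
    for c in population:
        acc += fitness(c)
        if r <= acc:
            return c
    return population[-1]                 # guard against floating-point round-off
```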
CROSSOVER
 Single point crossover
 110100 | 110001001 → 110100000111101
010101 | 000111101 → 010101110001001
 Two-point crossover
 Uniform crossover

 Cloning
 asexual reproduction
 single parent
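Two-point and uniform crossover, sketched for chromosomes represented as lists of bits (the 50/50 gene mix in uniform crossover is the usual illustrative choice):

```python
import random

def two_point_crossover(a, b):
    # Swap the segment between two random cut points.
    i, j = sorted(random.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def uniform_crossover(a, b):
    # Each gene position comes from either parent with equal probability.
    child1, child2 = [], []
    for ga, gb in zip(a, b):
        if random.random() < 0.5:
            child1.append(ga); child2.append(gb)
        else:
            child1.append(gb); child2.append(ga)
    return child1, child2
```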
MUTATION
 GA is similar to Hill Climbing…
 Generating possible solutions to a problem and moving
toward a better solution until no better solution can be
found
 Does not perform well on problems with many local optima
 Mutation is introduced to avoid this.

 Mutation is usually applied with low probability


TERMINATION
 GA is terminated when…
 a limit on the number of generations is reached
 a particular solution is found
 the highest fitness has reached a particular value
 Culling
 All individuals below a given threshold are discarded
 Converges faster than the version without culling
OPTIMIZATION OF A MATH FUNCTION
 Find maximum for sin(x)
 Fitness
 100 when sin(x) = 1, 0 when sin(x) = −1, 50 when sin(x) = 0
 i.e. fitness(x) = 50·(sin(x) + 1)
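A sketch of this fitness under the assumption that x is encoded as a 10-bit chromosome mapped linearly onto [0, 2π]; the encoding is illustrative and plugs directly into a GA over bit strings.

```python
import math

BITS = 10

def decode(chrom):
    """Map a list of 10 bits onto x in [0, 2*pi]."""
    n = int("".join(str(g) for g in chrom), 2)
    return 2 * math.pi * n / (2 ** BITS - 1)

def fitness(chrom):
    # 50*(sin(x) + 1): 100 when sin(x) = 1, 50 when sin(x) = 0, 0 when sin(x) = -1.
    return 50 * (math.sin(decode(chrom)) + 1)
```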
EXAMPLE
SOLVING 8-QUEENS PROBLEM

e.g. parents 32752411 and 24748552, crossed after the third digit, produce the offspring 32748552 (each digit gives the row of the queen in one column)


SAMPLE PROGRAM FOR GA
 Use GA to solve 8-Queens problem
 Among 8 queens there are C(8,2) = 28 pairs, so at most 28 non-conflicting pairs exist
 Use the number of non-conflicting pairs as the fitness metric
 Goal: non-conflicts = 28
e.g. with 3 conflicting pairs, non-conflicts = 25
 Culling: use non-conflicts = 20 as a threshold

 Applicable to 10 queens
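A sketch of this fitness, assuming a chromosome of 8 digits that give the row of the queen in each column (as in the 32752411 example above):

```python
from itertools import combinations

def non_conflicts(chrom):
    """Number of queen pairs (out of C(8,2) = 28) that do not attack each other."""
    safe = 0
    for (c1, r1), (c2, r2) in combinations(enumerate(chrom), 2):
        if r1 != r2 and abs(r1 - r2) != abs(c1 - c2):
            safe += 1
    return safe

print(non_conflicts([3, 2, 7, 5, 2, 4, 1, 1]))   # 32752411: some pairs conflict, fitness < 28
print(non_conflicts([2, 4, 6, 8, 3, 1, 7, 5]))   # a known solution: all 28 pairs are safe
```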
WHY GA WORKS?
 Schema S (pattern of chromosomes)
 **1011****, 010**1****, *****10*00
 dL(S): defining length
 Number of cut positions between the first and last fixed bits of the pattern
 3, 5, 4 respectively
 O(S): order
 Number of fixed bits for the pattern
 4, 4, 4 respectively (for defined bits)
 A shorter, lower-order schema whose fitness is higher than the average fitness
→ has a higher chance to survive!
SCHEMA THEOREM

[Table: fitness values f(C1), …, f(CK) of the chromosomes in the current generation]
 S = 11010******************************
 m(S) = 3 (occurring 3 times: C4, C5, C10)
 f(C4) = 10, f(C5) = 22, f(C10) = 40 → f(S) = (10+22+40)/3 = 24 (average)
SCHEMA THEOREM – CONT’D

 Current generation: C1,C2,..., CK


 average fitness: a ≡ Σn f(Cn) / K
 Probability that Ck is selected
 P(Ck) = f(Ck) / Σn f(Cn)
 There are K selections to produce the next generation. The expected number of times Ck is selected: m'(Ck)
 m'(Ck) = K · P(Ck) = K · f(Ck) / Σn f(Cn) = f(Ck) / a
 e.g. Assume the probability of heads in a coin toss is 0.4. The expected number of heads in 5000 tosses is 5000 × 0.4 = 2000.
SCHEMA THEOREM – CONT’D
 Schema (pattern) S matches chromosomes Cn, so the expected number of chromosomes of schema S is m'(S)
 m'(S) = Σn m'(Cn) = (Σn f(Cn)) / a
 summed over the chromosomes matching pattern S (C4, C5, C10 in the example)
 Average fitness of Schema S is
 f(S) ≡ (Σn f(Cn)) / m(S)
 It can be derived that
 m’(S) = [f(S)/a ] * m(S)
SCHEMA THEOREM – CONT’D
 m_{i+1}(S) = [f_i(S) / a_i] · m_i(S)
 If, in the i-th generation, the average fitness of the chromosomes of Schema S, f_i(S), is higher than the average fitness of all chromosomes, a_i
(i.e. Schema S is a better-than-average pattern of chromosomes)
→ m_{i+1}(S) > m_i(S)
→ the expected number of chromosomes belonging to Schema S will not decrease through selection
→ Schema S has a higher chance to survive
CROSSOVER AND MUTATION
 Crossover (with rate Pc)
 Psurvival(S) ≧ 1- Pc * dL(S)/(L-1)
 For single point crossover, Schema S (with dL(S)
cutting positions) tends to be destroyed if the
cutting position is between the highest bit and the
lowest bit
 The higher the crossover rate, the more probable
that Schema S is destroyed
 Shorter schema (small dL(S)) easy to survive
 Mutation (with rate Pm)
 Psurvival(S) = (1 − Pm)^O(S)
 Schema S can survive if mutation does not occur to
the defined bits
SCHEMA THEOREM

$m(S, i+1) \;\ge\; \dfrac{f(S,i)}{a(i)} \left(1 - p_c\,\dfrac{d_L(S)}{L-1}\right) (1 - p_m)^{O(S)} \, m(S, i)$

Shorter (dL(S) small), lower-order (O(S) small) schemata whose fitness is above the average fitness of the population (f(S,i) > a(i)) will appear with exponentially increasing frequency in subsequent generations.
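As a hypothetical numeric illustration (all values below are made up for the example): with f(S,i)/a(i) = 1.5, p_c = 0.7, d_L(S) = 3, p_m = 0.01, O(S) = 4, and chromosome length L = 35,

$m(S, i+1) \;\ge\; 1.5 \times \left(1 - 0.7 \times \tfrac{3}{34}\right) \times (0.99)^4 \times m(S,i) \;\approx\; 1.5 \times 0.938 \times 0.961 \times m(S,i) \;\approx\; 1.35\, m(S,i)$

so such a schema is expected to grow by roughly 35% per generation, i.e. exponentially.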
