Week-6 - Informed Search and Local Search
oIs it complete?
o No (can get stuck in loops)
o From Iasi heading to Fagaras, it goes to Neamt (a dead end): Iasi → Neamt → Iasi → Neamt → …
o Graph search version, however, is complete (for finite spaces)
oComplete?
o No – can get stuck in loops, e.g., Iasi → Neamt → Iasi → Neamt → …
oTime?
o O(b^m) in the worst case
o m = maximum depth of the search space
o but a good heuristic can give dramatic improvement
oSpace?
o O(b^m) – keeps all nodes in memory
oOptimal?
o No (not guaranteed to return the lowest-cost solution)
oA* expands the fringe node with lowest f value where
o f(n) = g(n) + h(n)
o g(n) is the cost to reach n
o h(n) is an optimistic estimate of the least cost from n to a goal node:
o 0 ≤ h(n) ≤ h*(n)
oA* tree search is optimal
oIts performance depends heavily on the heuristic h
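The f = g + h expansion rule above can be sketched in a few lines. This is a minimal illustrative implementation, not the only way to structure A*; the `neighbors` and `h` callables are assumptions of this sketch.

```python
import heapq

def astar(start, goal, neighbors, h):
    """A* search: always expand the fringe node with lowest f = g + h.

    neighbors(state) yields (next_state, step_cost) pairs;
    h(state) must be admissible (never overestimate) for optimality.
    """
    # Fringe entries: (f, g, state, path); g is the cost incurred so far.
    fringe = [(h(start), 0, start, [start])]
    best_g = {start: 0}  # cheapest known cost to each state (graph-search pruning)
    while fringe:
        f, g, state, path = heapq.heappop(fringe)
        if state == goal:
            return path, g
        for nxt, cost in neighbors(state):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(fringe, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None, float("inf")
```

With h(n) = 0 for all n, this degenerates to uniform-cost search, which is one way to see why A* with an admissible h is still optimal: it never prices a path below its true cost.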
o A* combines greedy search (expand the node that seems closest to the goal, i.e., lowest h(n)) with uniform-cost search (expand the node whose path so far is shortest, i.e., lowest g(n))
oComplete?
o Yes (unless there are infinitely many nodes with f ≤ f(G))
oTime?
o Exponential in path length: O(b^m) in the worst case
oSpace?
o Keeps all generated nodes in memory, so also exponential: O(b^m)
o Space, not time, is the major problem
oOptimal?
o Yes (provided that h is admissible [for tree] or consistent [for graph]).
o Every consistent heuristic is also admissible
oOptimally efficient? (something even stronger: no search algorithm could do better!)
o Yes – no algorithm with the same heuristic is guaranteed to expand fewer nodes
oA* keeps the entire explored region in memory
o => will run out of space before you get bored waiting for the answer
oThere are variants that use less memory
o IDA* works like iterative deepening, except it uses an f-limit instead of a depth limit
o On each iteration, remember the smallest f-value that exceeds the current limit, and use it as the new limit
o Very inefficient when f is real-valued and each node has a unique value
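The iterative-deepening-on-f idea can be sketched as follows; this is an illustrative version (the loop check on the current path is a simplification, not part of the classic formulation):

```python
import math

def ida_star(start, goal, neighbors, h):
    """IDA*: iterative deepening with an f-limit instead of a depth limit.

    Each iteration is a depth-first search that cuts off when f = g + h
    exceeds the limit; the smallest f-value that exceeded the limit
    becomes the limit for the next iteration.
    """
    def dfs(state, g, limit, path):
        f = g + h(state)
        if f > limit:
            return f, None          # report the smallest overflowing f-value
        if state == goal:
            return f, path
        next_limit = math.inf
        for nxt, cost in neighbors(state):
            if nxt in path:          # avoid trivial loops on the current path
                continue
            t, found = dfs(nxt, g + cost, limit, path + [nxt])
            if found is not None:
                return t, found
            next_limit = min(next_limit, t)
        return next_limit, None

    limit = h(start)
    while True:
        limit, found = dfs(start, 0, limit, [start])
        if found is not None:
            return found
        if limit == math.inf:
            return None              # no solution exists
```

Because only the current path is stored, memory is linear in the solution depth; the price is re-expanding nodes on every iteration, which is why a real-valued f with unique node values is the worst case.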
o RBFS is a recursive depth-first search that uses an f-limit equal to the f-value of the best alternative path available from any ancestor of the current node
o When the limit is exceeded, the recursion unwinds but remembers the best reachable f-value on that branch
o SMA* uses all available memory for the queue, minimizing thrashing
o When full, drop worst node on the queue but remember its value in the parent
oOften, admissible heuristics are exact solutions to relaxed problems, in which some restrictions on actions are removed (so new actions become available)
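A standard example: in the 8-puzzle, relaxing "a tile can move only into the adjacent blank" to "a tile can move to any adjacent square" gives the Manhattan-distance heuristic. A small sketch (states assumed to be 9-tuples in row-major order, 0 for the blank):

```python
def manhattan(state, goal):
    """Admissible 8-puzzle heuristic from a relaxed problem: if each tile
    could slide to any adjacent square (ignoring the other tiles), the
    optimal cost would be the sum of the tiles' Manhattan distances
    to their goal positions."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue  # the blank does not count
        g = goal.index(tile)
        total += abs(idx // 3 - g // 3) + abs(idx % 3 - g % 3)
    return total
```

Relaxing further to "a tile can move anywhere" gives the weaker misplaced-tiles count; both are admissible because the relaxed problem can only be cheaper than the real one.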
o Local search: improve a single option until you can’t make it better (no fringe!)
o Abandoning the fringe (the idea of exploring everything) loses the completeness guarantee
o Generally much faster and more memory efficient (but incomplete and suboptimal)
oIn many optimization problems, path is irrelevant; the goal state is the solution
oThen state space = set of “complete” configurations;
find configuration satisfying constraints, e.g., n-queens problem; or, find optimal
configuration, e.g., travelling salesperson problem
oIn such cases, can use iterative improvement algorithms (aka local search): keep a
single “current” state, try to improve it
oMore or less unavoidable if the “state” is yourself (i.e., learning)
oLocal search may be good when
o You don’t know a goal, but can recognize one when you see it
o You only want to find a goal and don’t need to keep track of the sequence of actions
that reached it
“Like climbing Everest in dense fog with amnesia (memory loss) and no map”
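The basic "improve a single option" loop can be sketched as steepest-ascent hill climbing with optional sideways moves (for the shoulders discussed below); the `value` and `neighbors` callables are assumptions of this sketch:

```python
def hill_climb(state, value, neighbors, max_sideways=0):
    """Steepest-ascent hill climbing: repeatedly move to the best
    neighbor; stop at a local maximum. Allowing a limited number of
    sideways (equal-value) moves lets the search cross shoulders."""
    sideways = 0
    while True:
        best = max(neighbors(state), key=value, default=None)
        if best is None or value(best) < value(state):
            return state                 # local maximum: no better neighbor
        if value(best) == value(state):
            sideways += 1
            if sideways > max_sideways:
                return state             # stuck on a plateau
        else:
            sideways = 0
        state = best
```

Note there is no fringe: only the current state is kept, which is exactly what makes the algorithm fast, memory-light, incomplete, and suboptimal.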
oRandom restarts
o find the global optimum
o true, but not that helpful by itself
oRandom sideways moves
o Escape from shoulders
o Loop forever on flat local maxima
oLocal maxima: peaks that are not the highest point in the space (not the global maximum)
oPlateaus: broad flat regions giving the search no guidance (use random walk); e.g., flat local maxima and shoulders
oRidges: flat like plateaus, but with drop-offs to the sides; steps to North, East, South, and West may go down, but a step to NW may go up
o Starting from X, where do you end up ?
o Starting from Y, where do you end up ?
o Starting from Z, where do you end up ?
[Figure: hill climbing on the 8-puzzle – a start state and its successors labeled with objective values (-4, -3, …); the goal state has value 0]
oNo sideways moves – heuristic: number of conflicts (attacking queen pairs):
o Succeeds w/ prob. p = 0.14 (14%)
o 86% of trials get stuck at a local maximum
o So, allow random restarts (about 1/p ≈ 7 trials expected)
o Average number of moves per trial:
o 4 when succeeding, 3 when getting stuck
o Expected total number of moves needed:
o 3(1-p)/p + 4 ≈ 22 moves
oAllowing up to 100 sideways moves:
o Succeeds w/ prob. p=0.94 (94%)
o Average number of moves per trial:
o 21 when succeeding, 65 when getting stuck
o Expected total number of moves needed:
o 65(1-p)/p + 21 ≈ 25 moves
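The expected-moves formula follows from restarting until success: on average (1-p)/p failed trials, each costing the "stuck" move count, plus one successful trial. A quick check of both slide figures:

```python
def expected_moves(p, moves_success, moves_stuck):
    """Expected total moves for random-restart hill climbing:
    (1-p)/p failed trials on average, then one successful trial."""
    return (1 - p) / p * moves_stuck + moves_success

# 8-queens, no sideways moves: p = 0.14, 4 moves on success, 3 when stuck
print(expected_moves(0.14, 4, 3))    # about 22
# With up to 100 sideways moves: p = 0.94, 21 on success, 65 when stuck
print(expected_moves(0.94, 21, 65))  # about 25
```

Note the trade-off: sideways moves raise the success probability dramatically (14% to 94%) yet barely change the expected total work.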
oIn metallurgy, annealing is a technique
involving heating & controlled cooling of a material to
increase size of its crystals & reduce defects
oHeat causes atoms to become unstuck from initial
positions (local minima of internal energy) and wander
randomly through states of higher energy
oSlow cooling gives them more chances of finding
configurations with lower internal energy than initial
one
oSimulated annealing is named after this process of cooling a material slowly to reach an ordered (low-energy) state
oBasic idea:
o Allow “bad” moves occasionally, depending on “temperature”
o Temperature defines how much you are bouncing around
o High temperature => more bad moves allowed, shake the system out of its local
minimum (or maximum)
o Gradually reduce temperature according to some schedule
o Low temperature => fewer bad moves allowed; the search settles into (hopefully) the global optimum
o Sounds pretty weird, doesn’t it?
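The basic idea can be sketched as follows (maximization version; the acceptance rule exp(Δ/T) is the standard Boltzmann criterion, while the particular cooling schedule is left to the caller):

```python
import math
import random

def simulated_annealing(state, value, random_neighbor, schedule):
    """Simulated annealing: always accept uphill moves; accept a downhill
    move with probability exp(delta / T), where T is the current
    temperature given by the cooling schedule. Stops when T reaches 0."""
    t = 0
    while True:
        T = schedule(t)
        if T <= 0:
            return state
        nxt = random_neighbor(state)
        delta = value(nxt) - value(state)
        # High T: exp(delta/T) is near 1 even for bad moves (lots of shaking).
        # Low T: bad moves are almost never accepted (pure hill climbing).
        if delta > 0 or random.random() < math.exp(delta / T):
            state = nxt
        t += 1
```

If the schedule lowers T slowly enough, the theory says the global optimum is reached with probability approaching 1, which is the analogue of slow cooling growing large crystals.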
• Idea: Escape local maxima by allowing downhill moves
• But make them rarer as time goes on
• Why is this different from K local searches in parallel? (e.g., running hill climbing with K random restarts in parallel, not in sequence)
The searches communicate! “Come over here, the grass is greener!” (Without communication, each search may hit a dead end and get stuck in a local maximum.)
• What is the problem?
Concentration in a small region after some iterations. → Stochastic beam search is a solution (choose K
successors at random with probability that is an increasing function of their objective value)
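The stochastic variant just described can be sketched as follows; the objective is assumed to be positive so it can serve directly as a sampling weight (otherwise it would need to be shifted or exponentiated):

```python
import random

def stochastic_beam_search(states, value, neighbors, k, steps):
    """Stochastic beam search: pool all successors of the current K states,
    then sample the next K with probability proportional to their
    objective value, instead of greedily keeping only the top K."""
    for _ in range(steps):
        pool = [s for st in states for s in neighbors(st)]
        if not pool:
            break
        weights = [value(s) for s in pool]   # assumed positive
        states = random.choices(pool, weights=weights, k=k)
    return max(states, key=value)
```

Sampling instead of taking the top K keeps diversity in the beam, which is precisely the cure for the "concentration in a small region" problem above.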
[Figure: local beam search with K = 4 initial states and branching factor b = 2; states marked X are pruned, keeping the best K successors at each step]
States need to be encoded
Initial Population
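For genetic algorithms the encoding is typically a string over a finite alphabet; a common illustration is 8-queens, with one digit per column giving that column's queen row. A minimal sketch of the encoding and a random initial population (function names are this sketch's, not a library's):

```python
import random

def random_individual(n=8):
    """Encode an n-queens state as a string of n digits:
    character i is the row of the queen in column i."""
    return ''.join(str(random.randrange(n)) for _ in range(n))

def initial_population(size, n=8):
    """Random initial population for a genetic algorithm."""
    return [random_individual(n) for _ in range(size)]
```

This string encoding is what makes the later GA operators natural: crossover splices two parent strings at a random cut point, and mutation rewrites a random character.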