Chapter 3 - Problem Solving by Searching (II)
1
Informed search algorithms
• Informed search is a strategy that uses information about the cost that may be incurred to reach the goal state from the current state.
• The information may not be accurate, but it helps the agent make better decisions.
• This information is called heuristic information.
2
Informed Search
• There are several algorithms that belong to this group. Some of these are:
– Best-first search
1. Greedy best-first search
2. A* search
3
Best-first search
Idea: use an evaluation function f(n) for each node
= an estimate of "desirability", based on the heuristic and/or the path cost
Expand the most desirable unexpanded node
The heuristic information gives a clue about which node should be expanded first
This ordering is done during queuing
Note: the node that looks best according to the evaluation function may not actually be best
Implementation:
Order the nodes in the fringe in decreasing order of desirability
(i.e., in increasing order of the evaluation function value)
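As a rough illustration of the implementation note above, here is a minimal Python sketch of best-first search that keeps the fringe in a priority queue ordered by the evaluation function f (smaller f = more desirable). The successors/graph interface and all names are assumptions made for this sketch; they do not come from the slides.

import heapq
import itertools

def best_first_search(start, goal, successors, f):
    # successors(state) -> iterable of (next_state, step_cost)
    # f(state, g)       -> evaluation value; the node with the smallest f is expanded first
    counter = itertools.count()                   # tie-breaker so the heap never compares states
    fringe = [(f(start, 0), next(counter), 0, start, [start])]
    closed = set()                                # repeated-state check
    while fringe:
        _, _, g, state, path = heapq.heappop(fringe)   # pop the most desirable node
        if state == goal:
            return path
        if state in closed:
            continue
        closed.add(state)
        for nxt, cost in successors(state):
            if nxt not in closed:
                g2 = g + cost
                heapq.heappush(fringe, (f(nxt, g2), next(counter), g2, nxt, path + [nxt]))
    return None                                   # no path found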
4
Ethiopia road map with step costs in km
[Figure: road map of Ethiopian cities with step costs on the edges; not reproduced here]
Straight-line distance to Gondar (km):
Gondar 0, Aksum 100, Mekele 150, Lalibela 110, Dessie 210, Bahr Dar 90,
Debre Markos 170, Addis Ababa 321, Jima 300, Dire Dawa 350, Nazarez 340,
Gambela 410, Awasa 500, Nekemt 420
5
Greedy best-first search
Evaluation function f(n) = h(n) (heuristic)
= estimate of the cost from n to the goal
That means the agent always prefers the action that appears best at each step
e.g., hSLD(n) = straight-line distance from n to Gondar
Greedy best-first search expands the node that appears to be closest to the goal
(it tries to minimize the estimated cost to reach the goal)
Example One: Greedy best-first search
Show the flow of the search to move from Gondar to Awasa using the given road map graph
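Reusing the best-first sketch from earlier, greedy best-first search simply sets f(n) = h(n). The road_map adjacency dictionary and the h_sld table of straight-line distances (to Gondar) are assumed to be built from the map slide; they are not reproduced here.

def greedy_best_first(start, goal, road_map, h_sld):
    # road_map: {city: [(neighbour, step_cost_km), ...]}, h_sld: estimated distance to the goal
    successors = lambda city: road_map[city]
    f = lambda city, g: h_sld[city]               # heuristic only; the path cost g is ignored
    return best_first_search(start, goal, successors, f)

# e.g. greedy_best_first("Awasa", "Gondar", road_map, h_sld)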
6
Properties of greedy best-first search
Complete? Only if repeated states are checked; otherwise it can get stuck in loops
Time? O(b^m), but a good heuristic can give dramatic improvement
Space? O(b^m), keeps all nodes in memory
Optimal? No
7
A* search
Idea: avoid expanding paths that are already expensive
Evaluation function f(n) = g(n) + h(n) where
g(n) = cost so far to reach n
h(n) = estimated cost from n to goal
f(n) = estimated total cost of path through n to goal
At every node n, it tries to minimize the estimated total cost of the path to the goal
Example One
Indicate the flow of the search to move from Awasa to Gondar using A*
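With the same skeleton, A* only changes the evaluation function to f(n) = g(n) + h(n). Since straight-line distance is a consistent heuristic, this graph-search version stays optimal; road_map and h_sld (straight-line distances to Gondar) are again assumed to be supplied by the caller, as in the greedy sketch.

def a_star(start, goal, road_map, h_sld):
    successors = lambda city: road_map[city]
    f = lambda city, g: g + h_sld[city]           # cost so far + estimated cost to the goal
    return best_first_search(start, goal, successors, f)

# e.g. a_star("Awasa", "Gondar", road_map, h_sld)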
8
Example Two
9
Admissible heuristics
A heuristic h(n) is admissible if for every node n,
h(n) ≤ h*(n), where h*(n) is the true cost to reach the goal state
from n.
An admissible heuristic never overestimates the cost to reach the
goal, i.e., it is optimistic
Example: hSLD(n) (never overestimates the actual road distance)
Theorem: If h(n) is admissible, A* using TREE-SEARCH is
optimal
10
Optimality of A* (proof)
Suppose some suboptimal goal G2 has been generated and is in the fringe. Let n be an
unexpanded node in the fringe such that n is on a shortest path to an optimal goal G.
We want to prove that the algorithm chooses to expand n rather than G2.
Since G2 is a goal state we have h(G2) = 0, and since G2 is suboptimal,
f(G2) = g(G2) + h(G2) = g(G2) > f*, where f* is the optimal path cost
Now assume, for contradiction, that n is not chosen for expansion before G2, i.e.
f(n) ≥ f(G2)
Since h is admissible, h(n) ≤ h*(n), and since n lies on an optimal path, g(n) + h*(n) = f*, so
f(n) = g(n) + h(n) ≤ g(n) + h*(n) = f*
Combining the two gives
f(G2) ≤ f(n) ≤ f*, i.e. g(G2) ≤ f*
This contradicts g(G2) > f*
Therefore, A* will never select G2 for expansion before n
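For reference, the same argument can also be written directly, without contradiction, as a short chain of inequalities. This is just a compact LaTeX restatement of the steps above, using the fact that n lies on an optimal path, so g(n) + h*(n) = f*.

\begin{align*}
f(n) &= g(n) + h(n) \le g(n) + h^*(n) = f^* && \text{($h$ is admissible, $n$ is on an optimal path)}\\
f(G_2) &= g(G_2) + h(G_2) = g(G_2) > f^* && \text{($G_2$ is a suboptimal goal, $h(G_2) = 0$)}\\
\Rightarrow\ f(n) &\le f^* < f(G_2), \text{ so $A^*$ expands $n$ before it ever selects $G_2$.}
\end{align*}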
11
Find admissible heuristics for the 8-puzzle:
h1(S) = ?
h2(S) = ?
12
Admissible heuristics
E.g., for the 8-puzzle:
h1(n) = number of misplaced tiles (tiles not in their goal square)
h2(n) = total Manhattan distance
h1(S) = ? 8
h2(S) is the number of squares each misplaced tile is away from its desired location. Because
tiles cannot move along diagonals, the distance counted is the sum of the horizontal and
vertical distances. This is sometimes called the city-block distance or Manhattan distance
(summed over all tiles).
h2(S) = ? 3+1+2+2+2+3+3+2 = 18
• Dominance
– If h2(n) ≥ h1(n) for all n (both admissible)
– then h2 dominates h1
– h2 is better for search
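As a concrete sketch of the two heuristics above: states are 9-tuples read row by row, with 0 standing for the blank, and the goal layout below is an assumption for illustration (use whatever goal state your puzzle defines).

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # assumed goal layout, blank in the last square

def h1(state, goal=GOAL):
    # Number of misplaced tiles (the blank is not counted)
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, goal=GOAL):
    # Total Manhattan (city-block) distance of every tile from its goal square
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = goal.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total

# For every state s, h1(s) <= h2(s) and both are admissible, so h2 dominates h1.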
13
Properties of A*
Complete? Yes (unless there are infinitely many nodes
with f ≤ f(G) )
Optimal? Yes (provided that the heuristic is admissible)
Time?
In the best case (if the heuristic equals the actual cost), the number of expansions is
proportional to the depth of the solution node, i.e. it is linear: O(d)
In the worst case, it is proportional to the number of nodes whose f-value is ≤ the f-value of
the solution node: O(b^⌈C*/ε⌉), where C* is the f-value of the solution node and ε is the
minimum step cost
With a good heuristic, A* is computationally efficient, and unlike greedy best-first search it
is also optimal
14
Properties of A*
Space? Keeps all nodes in memory (exponential), i.e. O(b^m)
This is a limitation, just as we saw for greedy best-first search
Hence, it is advisable to use modified versions of the algorithm that reduce the space
complexity
There are two such modifications:
1. Iterative Deepening A* (IDA*) search
2. Simplified Memory-Bounded A* (SMA*) search
15
Iterative Deepening A* (IDA*) Search
It does not keep every visited node in memory
If its heuristic function is admissible, an optimal solution is still guaranteed
It uses an f-cost threshold: if f(node) > threshold, the node is pruned (held back)
After each iteration, the threshold is raised to the smallest f-value that exceeded it and
the search restarts
It terminates when the goal is reached
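A minimal sketch of the IDA* loop just described, assuming the same successors(state) -> (next_state, step_cost) interface and a heuristic function h(state) as in the earlier sketches; the names are illustrative only.

import math

def ida_star(start, goal, successors, h):
    def dfs(path, g, threshold):
        state = path[-1]
        f = g + h(state)
        if f > threshold:
            return f, None                 # pruned: report f so it can seed the next threshold
        if state == goal:
            return f, list(path)
        smallest = math.inf                # smallest f-value that exceeded the threshold below
        for nxt, cost in successors(state):
            if nxt in path:                # avoid cycles along the current path only
                continue
            t, found = dfs(path + [nxt], g + cost, threshold)
            if found is not None:
                return t, found
            smallest = min(smallest, t)
        return smallest, None

    threshold = h(start)
    while True:
        t, found = dfs([start], 0, threshold)
        if found is not None:
            return found
        if t == math.inf:
            return None                    # no solution exists
        threshold = t                      # raise the threshold and restart the search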
16
Simplified Memory Bound A* (SMA*) search
Avoids regenerating repeated states as far as its memory allows
Utilizes all of the allotted memory
Complete if the available memory is sufficient to store the shallowest solution path
17
Game playing (Adversarial Search)
• Outline
– How to make optimal decisions in a two-player game
– The Minimax algorithm
– The α-β pruning algorithm
• In game theory there are always at least two agents participating.
• Several players may take part in a game, each trying to win or to maximize its own objective.
• In this topic, we focus only on two-player games:
– Player 1 wants to maximize his objective function at the end of the game
– Player 2 (the opponent) wants to minimize Player 1's objective function
18
Game playing (Adversarial Search)
• The opponent introduces uncertainty, because one never knows which action the opponent
will choose
• This unpredictable nature makes game playing different from an ordinary search problem
• In most cases, game playing has a very large branching factor, which directly affects the
time and space complexity
• Example: Tic-Tac-Toe
19
Game tree (2-player, deterministic, turns)
20
Components
Initial state (environment + whose turn to move)
Operators (define the legal moves available to the agent)
Terminal test
Utility function (payoff function)
Minimax Algorithm
Perfect play for deterministic games
Idea: choose move to position with highest Minimax value
= best achievable payoff against best play
E.g., 2-ply game:
The algorithm consists of five steps (a minimal sketch follows below):
1. Generate the whole game tree
2. Apply the utility function to each terminal state to get its value
3. Determine the utility of each upper node from the utilities of the nodes below it
(MIN nodes take the minimum, MAX nodes take the maximum)
4. Continue backing values up until the root is reached
5. At the root, MAX chooses the move with the best (highest) value
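A minimal Python sketch of the five steps above (the tree is generated lazily by recursion rather than built up front). The game interface (is_terminal, utility, moves, result) is an assumption made for this sketch, not something defined on the slides.

def minimax_value(state, game, maximizing):
    # Steps 2-4: evaluate terminal states and back the values up the tree
    if game.is_terminal(state):
        return game.utility(state)
    values = [minimax_value(game.result(state, m), game, not maximizing)
              for m in game.moves(state)]
    return max(values) if maximizing else min(values)

def minimax_decision(state, game):
    # Step 5: MAX picks the move whose resulting state has the highest minimax value
    return max(game.moves(state),
               key=lambda m: minimax_value(game.result(state, m), game, False))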
21
Minimax algorithm
23
α-β pruning
Alpha (α): the minimum score that player MAX is guaranteed to attain
Beta (β): the maximum score that player MAX can hope to obtain against a sensible opponent
24
α-β pruning algorithm description
Case one: pruning when the MIN function is called
Consider a MAX node M that is already guaranteed a value of α using the paths to the left of P1,
and assume the utility of Ni is greater than α for every i < k, but the utility of Nk is less
than α. If MAX chooses action P1, MIN will then choose P2 (leading to Nk), minimizing MAX's
utility, which MAX does not want at all. Therefore, the moment this situation happens, there is
no need to investigate the subtrees with roots Ni for k < i ≤ m.
[Diagram: MAX node M applies P1 to reach a MIN node; the MIN node's children (MAX nodes) are
N1 ... Nk ... Nm, with P2 leading to Nk]
25
α-β pruning algorithm description
Case two: pruning when the MAX function is called
Consider a MIN node M that already knows MAX can obtain at most β using the paths to the left of
P1, and assume the utility of Ni is less than β for every i < k, but the utility of Nk is greater
than β. If MIN chooses action P1, MAX will then choose P2 (leading to Nk), maximizing MAX's
utility, which MIN does not want at all. Therefore, the moment this situation happens, there is
no need to investigate the subtrees with roots Ni for k < i ≤ m.
[Diagram: MIN node M applies P1 to reach a MAX node; the MAX node's children (MIN nodes) are
N1 ... Nk ... Nm, with P2 leading to Nk]
26
Example:
Show the utility of each of the nodes and prune unnecessary nodes using the α-β pruning
algorithm for the following state-space tree
[Figure: example game tree; the leaf utility values shown include 3, 4, 4, 7, 6, 9, 25, 30, 12,
-10 and 0 (exact tree structure not reproduced)]
27
The α-β algorithm
28
The α-β algorithm
29
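The α-β pseudocode on these two slides appears as an image and is not reproduced here. Below is one possible Python rendering of the algorithm, reusing the hypothetical game interface from the minimax sketch; the β-cutoff and α-cutoff correspond to the two pruning cases described earlier.

import math

def alpha_beta_value(state, game, maximizing, alpha=-math.inf, beta=math.inf):
    if game.is_terminal(state):
        return game.utility(state)
    if maximizing:
        value = -math.inf
        for m in game.moves(state):
            value = max(value, alpha_beta_value(game.result(state, m), game, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:              # β-cutoff: the MIN node above will never allow this
                break
        return value
    else:
        value = math.inf
        for m in game.moves(state):
            value = min(value, alpha_beta_value(game.result(state, m), game, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:              # α-cutoff: the MAX node above already has something better
                break
        return value

def alpha_beta_decision(state, game):
    return max(game.moves(state),
               key=lambda m: alpha_beta_value(game.result(state, m), game, False))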
[Figures: step-by-step trace of α-β pruning on the example tree over several slides, showing how
the α and β bounds are updated at each node and where subtrees are pruned]
33
Properties of α-β
Prune whenever α ≥ β at a node
Pruning does not affect the final result
Good move ordering improves the effectiveness of pruning
With "perfect ordering," time complexity = O(b^(m/2))
this effectively doubles the depth of search that can be handled
Why is it called α-β?
α is the value of the best (i.e., highest-value) choice found so far at any choice point along
the path for MAX
If v is worse than α, MAX will avoid it
prune that branch
Define β similarly for MIN
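To illustrate the two properties above (pruning does not change the result, and it saves work), here is a small self-contained experiment on a hypothetical toy game: a complete binary tree of depth 4 with fixed leaf utilities, reusing the minimax and α-β sketches from earlier. The ToyGame class and its leaf values are invented purely for this demonstration.

class ToyGame:
    # A state is the tuple of 0/1 moves taken from the root; leaves sit at depth 4.
    LEAVES = [3, 12, 8, 2, 4, 6, 14, 5, 2, 7, 9, 11, 1, 13, 10, 4]
    DEPTH = 4

    def __init__(self):
        self.evaluated = 0                     # counts calls to the utility function

    def moves(self, state):
        return [0, 1]

    def result(self, state, move):
        return state + (move,)

    def is_terminal(self, state):
        return len(state) == self.DEPTH

    def utility(self, state):
        self.evaluated += 1
        index = int("".join(map(str, state)), 2)   # interpret the move path as a binary leaf index
        return self.LEAVES[index]

game = ToyGame()
print("minimax value:", minimax_value((), game, True), "- leaves evaluated:", game.evaluated)

game = ToyGame()
print("alpha-beta value:", alpha_beta_value((), game, True), "- leaves evaluated:", game.evaluated)
# Both runs report the same game value (4); α-β evaluates fewer leaves (12 vs. 16 with this ordering).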
34
Questions???
35