AI Lec3 SearchAgents
Search problems.
Uninformed search.
Heuristic search.
Local search and constraint satisfaction problems.
Game tree search.
[Figure: 8-puzzle start and goal configurations.]
[Figure: 8-puzzle search trees with numbered nodes, panels (a), (b) and (c); one node is marked "Discarded before generating node 7". © 1998 Morgan Kaufman Publishers]
[Figure: search trees over nodes labelled A, B and C.]
[Figure: 8-puzzle search tree, with one branch marked "To the goal".]
[Figure: 8-puzzle search tree in which each node is labelled with its value g + h, from 1+5, 1+3 and 1+5 at depth 1 down to 5+0 at the goal 1 2 3 / 8 _ 4 / 7 6 5.]
Evaluation function: f(n) = g(n) + h(n)
- Sum of:
  - actual path cost g(n) from the start node to node n
  - estimated cost h(n) from node n to the goal
- f(n) is the estimated cost of the cheapest path through node n
Idea: Try to find the cheapest solution.
[Figure: road map of Romania, with the road distance marked on each link between neighbouring cities.]

Straight-line distance to Bucharest:
Arad            366
Bucharest         0
Craiova         160
Dobreta         242
Eforie          161
Fagaras         178
Giurgiu          77
Hirsova         151
Iasi            226
Lugoj           244
Mehadia         241
Neamt           234
Oradea          380
Pitesti          98
Rimnicu Vilcea  193
Sibiu           253
Timisoara       329
Urziceni         80
Vaslui          199
Zerind          374
Initial node: Arad, f = 0 + 366 = 366
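The evaluation f = g + h can be traced on the Arad to Bucharest example with a small best-first search. This is a minimal sketch, not the lecture's own code: the road distances in `ROADS` are the ones from the standard textbook Romania map, only the western part of the map is included (an assumption that does not affect this query), and the names `ROADS`, `H`, `neighbors` and `a_star` are illustrative.

```python
import heapq

# Road distances (assumed from the standard textbook Romania map; partial map).
ROADS = {
    ("Arad", "Zerind"): 75, ("Arad", "Sibiu"): 140, ("Arad", "Timisoara"): 118,
    ("Zerind", "Oradea"): 71, ("Oradea", "Sibiu"): 151,
    ("Sibiu", "Fagaras"): 99, ("Sibiu", "Rimnicu Vilcea"): 80,
    ("Fagaras", "Bucharest"): 211,
    ("Rimnicu Vilcea", "Pitesti"): 97, ("Rimnicu Vilcea", "Craiova"): 146,
    ("Pitesti", "Bucharest"): 101, ("Pitesti", "Craiova"): 138,
    ("Timisoara", "Lugoj"): 111,
}

# Straight-line distances to Bucharest (the heuristic h), for the cities above.
H = {
    "Arad": 366, "Bucharest": 0, "Craiova": 160, "Fagaras": 178,
    "Lugoj": 244, "Oradea": 380, "Pitesti": 98, "Rimnicu Vilcea": 193,
    "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
}


def neighbors(city):
    """Yield (neighbor, road distance) pairs; roads are two-way."""
    for (a, b), cost in ROADS.items():
        if a == city:
            yield b, cost
        elif b == city:
            yield a, cost


def a_star(start, goal):
    """Return (path cost, path) of a cheapest path, or None if there is none."""
    # Frontier entries are (f, g, city, path), ordered by f = g + h.
    frontier = [(H[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, city, path = heapq.heappop(frontier)
        if city == goal:
            return g, path
        if g > best_g.get(city, float("inf")):
            continue  # a cheaper route to this city was found after this push
        for nxt, cost in neighbors(city):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + H[nxt], g2, nxt, path + [nxt]))
    return None


cost, path = a_star("Arad", "Bucharest")
print(cost, path)
# 418 ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest']
```

The first node popped is Arad with f = 0 + 366 = 366, matching the initial node shown above; every generated node stays in memory until the goal is popped.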
We shall assume:
h is admissible, that is, h(n) is never larger than the actual cost of reaching a goal from n.
g is the sum of the costs of the operators along the path.
The cost of each operator is greater than some positive amount, ε.
The number of operators is finite (thus the search tree has a finite branching factor).
Under these conditions, we can always revise h into another admissible
heuristic function so that the f = h + g values along any path in the search
tree are never decreasing (monotonicity). If f∗ is the cost of an optimal
solution, then
[Figure: map of Romania with search contours at f = 380, f = 400 and f = 420.]
[Figure: search tree with two goal nodes, G and G2.]
Complexity:
- Number of nodes expanded is exponential in the length of the solution.
- All generated nodes are kept in memory. (A∗ usually runs out of space long before it runs out of time.)
- With a good heuristic, significant savings are still possible compared to uninformed search methods.
- Admissible heuristic functions that give higher values tend to make the search more efficient.
Memory-bounded extensions to A∗:
- Iterative deepening A∗ (IDA∗)
- Simplified memory-bounded A∗ (SMA∗)
Questions:
How might one come up with good heuristics such as h2 for the
8-puzzle?
Is it possible for a computer to invent such heuristics mechanically?
Relaxed Problem. Given a problem, a relaxed problem is one with fewer
restrictions on the operators.
Strategy. Use the cost of an optimal solution to the relaxed problem as the
heuristic function for the original problem.
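Two relaxations of the 8-puzzle that are commonly used to illustrate this strategy are sketched below. The first lets a tile jump to any square, so its optimal cost is the number of misplaced tiles; the second lets a tile slide onto an occupied adjacent square, so its optimal cost is the sum of Manhattan distances. That h2 refers to the Manhattan-distance heuristic is an assumption (it is the usual textbook naming); the function names and the sample state are illustrative.

```python
# Goal configuration, read row by row; 0 marks the blank square.
GOAL = (1, 2, 3,
        8, 0, 4,
        7, 6, 5)


def h_misplaced(state, goal=GOAL):
    """Relaxation 1: a tile may move to any square in one step."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)


def h_manhattan(state, goal=GOAL):
    """Relaxation 2: a tile may move to any adjacent square, even an occupied one."""
    target = {tile: divmod(i, 3) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue  # the blank is not counted
        row, col = divmod(i, 3)
        goal_row, goal_col = target[tile]
        total += abs(row - goal_row) + abs(col - goal_col)
    return total


# Sample state: 2 8 3 / 1 6 4 / 7 _ 5
START = (2, 8, 3,
         1, 6, 4,
         7, 0, 5)
print(h_misplaced(START), h_manhattan(START))  # 4 5
```

Both values are lower bounds on the true solution length, and the Manhattan value is never smaller than the misplaced-tile count, which is why it tends to guide the search better.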
Problem definition:
A finite set of variables and their domains.
A finite set of conditions on these variables.
A solution is an assignment to these variables that satisfies all the
conditions.
Example: 8-queens problem:
Variables: q1, ..., q8, the position of a queen in column 1, ..., 8.
Domain: the same for all variables, {1, ..., 8}.
Constraints: q1 − q2 ≠ 0, |q1 − q2| ≠ 1, ...
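The constraints listed for q1 and q2 generalise to every pair of columns i < j: qi ≠ qj (different rows) and |qi − qj| ≠ j − i (different diagonals). A minimal checker under that reading is sketched below; the function name `consistent` is illustrative.

```python
from itertools import combinations


def consistent(q):
    """q[i] is the row of the queen in column i + 1; True if no two queens attack."""
    for i, j in combinations(range(len(q)), 2):
        same_row = q[i] == q[j]
        same_diagonal = abs(q[i] - q[j]) == j - i
        if same_row or same_diagonal:
            return False
    return True


print(consistent((1, 5, 8, 6, 3, 7, 2, 4)))  # True: a solution of 8-queens
print(consistent((1, 2, 3, 4, 5, 6, 7, 8)))  # False: all queens on one diagonal
```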
Constraint graph (initial domains):
q1: {1, 2, 3, 4}
q2: {1, 2, 3, 4}
q3: {1, 2, 3, 4}
q4: {1, 2, 3, 4}
© 1998 Morgan Kaufman Publishers
(HKUST) Lecture 3: Search 33 / 77
Running Example: 4-Queens Problem
Constraint graph with q1 = 1:
q1: {1}
q2: {3, 4}
q3: {2, 4}
q4: {2, 3}

Constraint graph with q1 = 2:
q1: {2}
q2: {4}
q3: {1, 3}
q4: {1, 3, 4}
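The domains in these constraint graphs can be reproduced mechanically. The sketch below first removes the values attacked by the chosen queen, then repeatedly deletes any value that has no compatible value left in some other column. Reading the figure as an illustration of this kind of constraint propagation (arc consistency) is an assumption, and the names `forward_check` and `propagate` are illustrative.

```python
def forward_check(n, col, row):
    """Domains of all columns after placing a queen at (col, row) on an n x n board."""
    domains = {}
    for c in range(1, n + 1):
        if c == col:
            domains[c] = {row}
        else:
            domains[c] = {r for r in range(1, n + 1)
                          if r != row and abs(r - row) != abs(c - col)}
    return domains


def propagate(domains):
    """Delete values with no compatible value in some other column, until stable."""
    domains = {c: set(rows) for c, rows in domains.items()}
    changed = True
    while changed:
        changed = False
        for a in domains:
            for b in domains:
                if a == b:
                    continue
                for r in list(domains[a]):
                    supported = any(r != s and abs(r - s) != abs(a - b)
                                    for s in domains[b])
                    if not supported:
                        domains[a].discard(r)
                        changed = True
    return domains


print(forward_check(4, 1, 1))             # {1: {1}, 2: {3, 4}, 3: {2, 4}, 4: {2, 3}}
print(propagate(forward_check(4, 1, 2)))  # {1: {2}, 2: {4}, 3: {1}, 4: {3}}
```

With q1 = 1, propagation empties a domain, so that assignment cannot be extended to a solution; with q1 = 2 it narrows every domain to a single value, giving the solution (2, 4, 1, 3) without any further search.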
[Figure: three successive queens boards, with squares annotated by their number of conflicts. Steps shown: move (1,3) to (1,2); move (2,6) to (2,7); no change.]
Given a training set of inputs Dt = {x1, ..., xn}, assign each input to exactly
one cluster in {C1, ..., CK}.
A CSP?
Variables: c1, ..., cn.
Domains: {1, ..., K} (ci = j means that the ith input xi is assigned to
cluster j).
Constraints?
Intuition: “Similar” inputs should get the same assignment.
c1 = c2 = 1, c3 = c4 = 2, µ1 = 1.5, µ2 = 10.5.
c1 = arg min_i (1 − µi)^2 = 1.
Given c1 = c2 = 1, c3 = c4 = 2,
Algorithm (informal):
1. Initialize µ1, ..., µK.
2. Iterate the following for T steps:
   (a) compute the best assignment c1, ..., cn for the given µ.
   (b) compute the best centroids µ1, ..., µK for the given c.
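The two alternating steps can be sketched for one-dimensional inputs as follows. The data set [1, 2, 10, 11] is an assumption chosen to be consistent with the values µ1 = 1.5 and µ2 = 10.5 quoted above; "best" is taken to mean smallest squared distance for the assignment and the cluster mean for the centroid, and the function names are illustrative.

```python
def assign(xs, mus):
    """Step (a): give each input the 1-based index of its nearest centroid."""
    return [min(range(len(mus)), key=lambda j: (x - mus[j]) ** 2) + 1
            for x in xs]


def centroids(xs, cs, mus):
    """Step (b): move each centroid to the mean of the inputs assigned to it."""
    new_mus = []
    for j in range(1, len(mus) + 1):
        members = [x for x, c in zip(xs, cs) if c == j]
        # An empty cluster keeps its old centroid (a choice made for this sketch).
        new_mus.append(sum(members) / len(members) if members else mus[j - 1])
    return new_mus


def k_means(xs, mus, steps):
    """Alternate steps (a) and (b) for a fixed number of iterations."""
    cs = assign(xs, mus)
    for _ in range(steps):
        cs = assign(xs, mus)
        mus = centroids(xs, cs, mus)
    return cs, mus


cs, mus = k_means([1, 2, 10, 11], [0.0, 5.0], steps=5)
print(cs, mus)  # [1, 1, 2, 2] [1.5, 10.5]
```

Neither step can increase the total squared distance between inputs and their centroids, which is why the iteration settles; the result still depends on the initial centroids.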
[Figure: evaluation plotted against state, with the current state marked.]
Tic-Tac-Toe:
A board with nine squares.
Two players: “X” and “O”; “X” moves first, and then they alternate.
At each step, the player to move chooses an unoccupied square and marks it
with his or her symbol. Whoever gets three in a line wins.
[Figure: partial game tree for tic-tac-toe, with alternating MAX (X) and MIN (O) levels down to TERMINAL positions with utility −1, 0 and +1.]
Minimax Algorithm (With Perfect Decisions)
Assume the two players are MAX (self) and MIN (opponent). To evaluate a node n in a game tree:
1. Expand the entire tree below n.
2. Evaluate the terminal nodes using the given utility function.
3. Select a node that has not been evaluated yet and all of whose children
   have been evaluated. If there is no such node, then return.
4. If the selected node is one at which MIN moves, assign it the
   minimum of the values of its children. If the selected node is one at
   which MAX moves, assign it the maximum of the values of its
   children. Return to step 3.
Example game tree:
MAX: root = 3, with moves A1, A2, A3
MIN: A1 = 3, A2 = 2, A3 = 2
Leaves: A11 = 3, A12 = 12, A13 = 8; A21 = 2, A22 = 4, A23 = 6; A31 = 14, A32 = 5, A33 = 2
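The backed-up values in this tree can be checked with a short recursive version of the procedure. This is a sketch: it evaluates children depth-first instead of repeatedly selecting unevaluated nodes, which gives the same values, and the nested-list tree encoding is an assumption made for the example.

```python
def minimax(node, maximizing):
    """Return the minimax value of a node; a leaf is a number, an inner node a list."""
    if isinstance(node, (int, float)):
        return node  # terminal node: its utility
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Leaves under A1, A2 and A3, as in the example tree.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

print([minimax(subtree, False) for subtree in TREE])  # [3, 2, 2]
print(minimax(TREE, True))                            # 3
```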
w1 f1 + w2 f2 + · · · + wn fn
where the f’s are the features (e.g. number of queens in chess) of the
game position, and the w’s are the weights that measure the importance
of the corresponding features.
Learning good evaluation functions automatically from past
experience is a promising new direction.
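A weighted linear evaluation function of this form is a single dot product. The sketch below uses hypothetical chess features (differences in the number of queens, rooks and pawns between the two sides) and hypothetical weights 9, 5 and 1; neither the features nor the weights come from the lecture.

```python
def evaluate(features, weights):
    """Return w1*f1 + w2*f2 + ... + wn*fn for one game position."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical position: one queen up, rooks level, two pawns down.
features = [1, 0, -2]
weights = [9, 5, 1]
print(evaluate(features, weights))  # 7
```

Learning an evaluation function then amounts to adjusting `weights` so that the score predicts game outcomes well.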
[Figure: tic-tac-toe search tree from the start node, with leaf evaluations computed as differences (e.g. 6 − 5 = 1, 5 − 6 = −1, 4 − 6 = −2), backed-up values 1, −1 and −2, and MAX's move marked.]
The Second Stage of Search
[Figure: second-stage tic-tac-toe search tree from the new start node, with leaf evaluations such as 4 − 2 = 2 and 3 − 3 = 0, backed-up values 1 and 0, and MAX's move marked.]
[Figure: later-stage tic-tac-toe search tree with nodes labelled A, B, C and D; several positions evaluate to −∞, the backed-up values are 1 and −∞, and MAX's move is marked.]
Different approaches:
- Depth-limited search
- Iterative deepening search
- Quiescent search
Quiescent positions are positions that are not likely to have large
variations in evaluation in the near future.
Quiescent search:
- Expansion of nonquiescent positions until quiescent positions are reached.
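[Figure: a path through a game tree with alternating Player and Opponent levels, showing a node m near the top and a node n deeper on the same path.]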
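Full minimax tree:
MAX: root = 3, with moves A1, A2, A3
MIN: A1 = 3, A2 = 2, A3 = 2
Leaves: A11 = 3, A12 = 12, A13 = 8; A21 = 2, A22 = 4, A23 = 6; A31 = 14, A32 = 5, A33 = 2

Same tree with A22 and A23 left unevaluated:
MAX: root = 3, with moves A1, A2, A3
MIN: A1 = 3, A2 ≤ 2, A3 = 2
Leaves: A11 = 3, A12 = 12, A13 = 8; A21 = 2; A31 = 14, A32 = 5, A33 = 2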
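Once A21 = 2 is known, A2 can be worth at most 2 to MAX, which already has 3 available through A1, so A22 and A23 need not be examined. The sketch below assumes the tree illustrates alpha-beta pruning and records which leaves are actually evaluated; the nested-list encoding and the `visited` parameter are illustrative.

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf"), visited=None):
    """Minimax value with alpha-beta pruning; evaluated leaves are appended to visited."""
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # MIN above will never let play reach this node
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # MAX above already has a better alternative
    return value


TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
print(alphabeta(TREE, True, visited=visited))  # 3
print(visited)                                 # [3, 12, 8, 2, 14, 5, 2]
```

The root value is the same as with plain minimax; only the amount of work changes, and how much is saved depends on the order in which children are examined.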
[Figure: tic-tac-toe search tree from the start node with alpha value = −1, a node labelled A, and leaf evaluations 4 − 5 = −1, 5 − 5 = 0 and 6 − 5 = 1.]
Averages may be misleading: minimax would choose the left branch while
averaging would lead to the right branch.
where
- µi is the expected value of the games (tryouts) for the child node Mi.
  For example, in a zero-sum game, it is Wi / Ni, where Wi is the number of