cs188 Su24 Lec03
[Figure: search graph with edges S–A (cost 1), A–G (cost 3), and S–G (cost 5)]
Last time…
g: backward cost (S -> current node)
h: forward cost (heuristic for current node -> goal)
[Figure: the same graph, with the heuristic values h(S), h(A), h(G) still to be chosen]
Q: Where do heuristics come from?
A: We have to create them!
[Figure: h(S) = 7, h(A) = 6, h(G) = 0. At A: f = g + h = 1 + 6 = 7. At G, reached directly from S: f = g + h = 5 + 0 = 5.]
[Figure: h(S) = 1, h(G) = 0. At A: f = g + h = 1 + 3 = 4. At G, reached directly from S: f = g + h = 5 + 0 = 5.]
What’s a better heuristic?
Admissible = never overestimates the cost from any node to the goal
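The two heuristics in the figures can be compared with a minimal A* tree search sketch. The function name `astar_tree_search` and the dictionary encoding of the graph are illustrative assumptions, and the edges are treated as directed away from S.

```python
import heapq

# Graph from the figure: S->A costs 1, A->G costs 3, S->G costs 5.
GRAPH = {"S": [("A", 1), ("G", 5)], "A": [("G", 3)], "G": []}

def astar_tree_search(graph, h, start, goal):
    """A* tree search: always pop the fringe entry with the smallest f = g + h."""
    # Each fringe entry is (f, g, path); there is no closed set in tree search.
    fringe = [(h[start], 0, [start])]
    while fringe:
        f, g, path = heapq.heappop(fringe)
        state = path[-1]
        if state == goal:  # goal test when the node is dequeued
            return g, path
        for nxt, cost in graph[state]:
            g2 = g + cost
            heapq.heappush(fringe, (g2 + h[nxt], g2, path + [nxt]))
    return None

overestimate = {"S": 7, "A": 6, "G": 0}  # h(A) = 6 exceeds the true cost 3
admissible = {"S": 1, "A": 3, "G": 0}    # never exceeds the true cost

print(astar_tree_search(GRAPH, overestimate, "S", "G"))  # (5, ['S', 'G'])
print(astar_tree_search(GRAPH, admissible, "S", "G"))    # (4, ['S', 'A', 'G'])
```

With the overestimating heuristic the direct S–G edge (cost 5) is returned; with the admissible one the cheaper path through A (cost 4) is found.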
Last time…
o Failure to detect repeated states can cause exponentially more work.
o How to implement:
o Tree search + set of expanded states (“closed set”)
o Expand the search tree node-by-node, but…
o Before expanding a node, check to make sure its state has never been expanded before
o If it is not new, skip it; if it is new, add it to the closed set
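The closed-set recipe above can be sketched as follows. Uniform-cost ordering, the name `graph_search`, and the small `DIAMONDS` graph are assumptions chosen for illustration; the graph has two routes into each junction, so repeated states actually occur.

```python
import heapq

def graph_search(graph, start, goal):
    """Uniform-cost search plus a closed set of already-expanded states."""
    closed = set()
    fringe = [(0, start, [start])]  # (g, state, path)
    expanded = 0
    while fringe:
        g, state, path = heapq.heappop(fringe)
        if state == goal:
            return g, path, expanded
        if state in closed:   # not new: skip it
            continue
        closed.add(state)     # new: add to the closed set, then expand
        expanded += 1
        for nxt, cost in graph.get(state, []):
            heapq.heappush(fringe, (g + cost, nxt, path + [nxt]))
    return None

# Hypothetical graph: n0 -> {a0, b0} -> n1 -> {a1, b1} -> n2, all edges cost 1.
DIAMONDS = {
    "n0": [("a0", 1), ("b0", 1)],
    "a0": [("n1", 1)], "b0": [("n1", 1)],
    "n1": [("a1", 1), ("b1", 1)],
    "a1": [("n2", 1)], "b1": [("n2", 1)],
    "n2": [],
}

cost, path, expanded = graph_search(DIAMONDS, "n0", "n2")
print(cost, path, expanded)
```

The second copy of `n1` is popped but skipped, so each state is expanded at most once no matter how many paths lead to it.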
[Figure: graph with states S, A, B, C, G and edge costs 1, 1, 1, 2, 3]
[Figure: the same graph with h(S) = 2, h(A) = 4, h(B) = 1, h(C) = 1, h(G) = 0. At A: f = g + h = 1 + 4 = 5. At B: f = g + h = 1 + 1 = 2. At C: f = g + h = 3 + 1 = 4.]
This heuristic isn’t consistent
[Figure: the same graph and f-values, annotated with the “triangle inequality”]
Consistency (“triangle inequality”): along any arc the heuristic may drop by at most the arc cost, h(A) − h(C) ≤ cost(A, C)
o Graph search:
o A* optimal if heuristic is consistent
o UCS optimal (h = 0 is consistent)
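A minimal sketch of why consistency matters for graph search. The edge assignment below (S→A 1, S→B 1, A→C 1, B→C 2, C→G 3) is an assumed reading of the figure, `h_fixed` is a hypothetical repaired heuristic, and the function names are illustrative.

```python
import heapq

# Assumed reading of the figure: S->A 1, S->B 1, A->C 1, B->C 2, C->G 3.
GRAPH = {
    "S": [("A", 1), ("B", 1)],
    "A": [("C", 1)],
    "B": [("C", 2)],
    "C": [("G", 3)],
    "G": [],
}

def is_consistent(graph, h):
    """Check h(a) - h(c) <= cost(a, c) on every arc a -> c."""
    return all(h[a] - h[c] <= cost for a in graph for c, cost in graph[a])

def astar_graph_search(graph, h, start, goal):
    """A* with a closed set: each state is expanded at most once."""
    closed = set()
    fringe = [(h[start], 0, [start])]  # (f, g, path)
    while fringe:
        f, g, path = heapq.heappop(fringe)
        state = path[-1]
        if state == goal:
            return g, path
        if state in closed:
            continue
        closed.add(state)
        for nxt, cost in graph[state]:
            g2 = g + cost
            heapq.heappush(fringe, (g2 + h[nxt], g2, path + [nxt]))
    return None

h_slide = {"S": 2, "A": 4, "B": 1, "C": 1, "G": 0}  # values from the figure
h_fixed = {"S": 2, "A": 2, "B": 1, "C": 1, "G": 0}  # hypothetical: lowers h(A)

print(is_consistent(GRAPH, h_slide), astar_graph_search(GRAPH, h_slide, "S", "G"))
print(is_consistent(GRAPH, h_fixed), astar_graph_search(GRAPH, h_fixed, "S", "G"))
```

Under these assumptions the figure's heuristic closes C through the more expensive route via B and returns a cost-6 path, while the consistent variant returns the cost-5 path through A.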
[Figure: contours of increasing f: f ≤ 1, f ≤ 2, f ≤ 3, …]
Bonus: Optimality of A* Graph Search
Proof by contradiction:
o New possible problem: some n on the path to G* isn’t in the queue when we need it, because some worse n’ for the same state was dequeued and expanded first (disaster!)
o Take the highest such n in the tree
o Let p be the ancestor of n that was on the queue when n’ was popped
o f(p) ≤ f(n) because consistency makes f nondecreasing along a path
o f(n) < f(n’) because n’ is suboptimal
o So f(p) < f(n’), and p would have been expanded before n’
o Contradiction!
Beyond Pathfinding
A* can be used in a variety of domains
besides path planning
[Figure: a constraint satisfaction problem with N variables (x1, x2, …), a domain D, and constraints]
o Domains:
Explicit:
o Domains:
o Constraints:
Implicit:
Explicit:
Example: Cryptarithmetic
X1
o Variables:
o Domains:
o Constraints:
Example: Sudoku
o Variables:
o Each (open) square
o Domains:
o {1,2,…,9}
o Constraints:
[Figure: search tree over partial assignments, from the empty assignment {} to {WA=g}, {WA=r}, …, {NT=g}, …]
o Ordering:
o Which variable should be assigned next?
o In what order should its values be tried?
Filtering: Forward Checking
o Filtering: Keep track of domains for unassigned variables and cross off bad options
o Forward checking: Cross off values that violate a constraint when added to the existing
assignment
[Figure: map of Australia with regions WA, NT, Q, SA, NSW, V]
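Forward checking on the map-coloring example can be sketched as below. The adjacency table is the usual mainland Australia map restricted to the six regions named in the figure, which is an assumption since the figure itself is not recoverable; `forward_check` and the colour codes r, g, b are illustrative names.

```python
# Assumed adjacency for the six regions in the figure.
NEIGHBORS = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
}

def forward_check(domains, assignment, var, value):
    """Assign var=value and cross that value off every unassigned neighbor.

    Returns the updated domains, or None if some neighbor has no values left.
    """
    new_domains = {v: set(d) for v, d in domains.items()}
    new_domains[var] = {value}
    for n in NEIGHBORS[var]:
        if n not in assignment:
            new_domains[n].discard(value)
            if not new_domains[n]:
                return None  # dead end detected before assigning n
    return new_domains

start = {v: {"r", "g", "b"} for v in NEIGHBORS}
d1 = forward_check(start, {}, "WA", "r")
d2 = forward_check(d1, {"WA": "r"}, "Q", "g")
d3 = forward_check(d2, {"WA": "r", "Q": "g"}, "V", "b")
print(d2["NT"], d2["SA"], d3)
```

After WA=r and Q=g, both NT and SA are down to a single value; assigning V=b then empties SA's domain, so the failure is caught without searching further.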
o Lots of middle ground between arc consistency and n-consistency! (e.g. k=3, called
path consistency)
Ordering
Ordering: Minimum Remaining Values
o Variable Ordering: Minimum remaining values (MRV):
o Choose the variable with the fewest legal values left in its domain
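A minimal sketch of MRV selection. The function name `select_mrv_variable` and the example domains (those left after assigning WA=r and Q=g with forward checking) are assumptions for illustration.

```python
def select_mrv_variable(domains, assignment):
    """Return the unassigned variable with the fewest values left in its domain."""
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=lambda v: len(domains[v]))

# Hypothetical domains after WA=r and Q=g have been assigned.
domains = {
    "WA": {"r"},
    "NT": {"b"},
    "SA": {"b"},
    "Q": {"g"},
    "NSW": {"r", "b"},
    "V": {"r", "g", "b"},
}
assignment = {"WA": "r", "Q": "g"}

chosen = select_mrv_variable(domains, assignment)
print(chosen, domains[chosen])
```

Ties (here NT and SA, with one value each) are broken by dictionary order in this sketch; a tie-breaking rule is a separate design choice.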
o To apply to CSPs:
o Take an assignment with unsatisfied constraints
o Operators reassign variable values
o No fringe! Live on the edge.
o The same appears to be true for any randomly generated CSP except in a narrow range of the ratio of constraints to variables
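One way to realize "take an assignment with unsatisfied constraints and reassign variable values" is the min-conflicts rule, sketched here for N-Queens. The choice of min-conflicts, the function names, the step limit, and the seed are all assumptions rather than something the bullets specify.

```python
import random

def conflicts(cols, row, col):
    """Number of other queens attacking a queen placed at (row, col)."""
    return sum(
        1
        for r, c in enumerate(cols)
        if r != row and (c == col or abs(c - col) == abs(r - row))
    )

def min_conflicts(n, max_steps=10_000, seed=0):
    """Local search: start from a complete assignment and repair conflicts."""
    rng = random.Random(seed)
    cols = [rng.randrange(n) for _ in range(n)]  # one queen per row
    for _ in range(max_steps):
        conflicted = [r for r in range(n) if conflicts(cols, r, cols[r]) > 0]
        if not conflicted:
            return cols  # every constraint is satisfied
        row = rng.choice(conflicted)
        # Reassign this row's queen to a least-conflicted column (random ties).
        scores = [conflicts(cols, row, c) for c in range(n)]
        best = min(scores)
        cols[row] = rng.choice([c for c, s in enumerate(scores) if s == best])
    return None  # gave up: local search is incomplete

solution = min_conflicts(8)
print(solution)
```

There is no fringe here: only the current assignment is stored, which is why memory use stays constant in the number of steps.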
Summary: CSPs
o CSPs are a special kind of search problem:
o States are partial assignments
o Goal test defined by constraints
o Basic solution: backtracking search
o Speed-ups:
o Ordering
o Filtering
o Structure – turns out trees are easy!
o Local search: improve a single option until you can’t make it better (no fringe!)
o Generally much faster and more memory efficient (but incomplete and suboptimal)
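The basic solution named in the summary, backtracking search, can be sketched on map coloring. The adjacency table (six mainland regions), the fixed variable and value order, and the function names are assumptions; no ordering or filtering speed-ups are applied.

```python
# Assumed adjacency for six regions of the Australia map-coloring example.
NEIGHBORS = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
}
COLORS = ["r", "g", "b"]

def consistent(var, value, assignment):
    """A value is allowed if no already-assigned neighbor has it."""
    return all(assignment.get(n) != value for n in NEIGHBORS[var])

def backtracking_search(assignment=None):
    """Depth-first search over partial assignments, one variable at a time."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(NEIGHBORS):
        return dict(assignment)  # goal test: complete and consistent
    var = next(v for v in NEIGHBORS if v not in assignment)
    for value in COLORS:
        if consistent(var, value, assignment):
            assignment[var] = value
            result = backtracking_search(assignment)
            if result is not None:
                return result
            del assignment[var]  # undo and try the next value
    return None

solution = backtracking_search()
print(solution)
```

Ordering and filtering would slot in at the `next(...)` line and inside the value loop respectively.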
Hill Climbing
o Simple, general idea:
o Start wherever
o Repeat: move to the best neighboring state
o If no neighbors better than current, quit
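The loop above can be sketched in a few lines. The one-dimensional `landscape` and the helper names are hypothetical, chosen so that the outcome depends on the starting point.

```python
def hill_climb(start, neighbors, value):
    """Move to the best neighbor until no neighbor is better than the current state."""
    current = start
    while True:
        best = max(neighbors(current), key=value, default=current)
        if value(best) <= value(current):
            return current  # no neighbor is better: quit
        current = best

# Hypothetical landscape: states are indices, value is the height at that index.
landscape = [1, 3, 2, 1, 5, 9, 4]

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]

def value(i):
    return landscape[i]

print(hill_climb(0, neighbors, value))  # 1  (local maximum, height 3)
print(hill_climb(3, neighbors, value))  # 5  (global maximum, height 9)
```

Starting at index 0 the search stops on the small peak, which illustrates why hill climbing is incomplete and suboptimal.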
Simulated Annealing
o Theoretical guarantee:
o Stationary distribution:
o If T is decreased slowly enough, the search will converge to an optimal state!
o Possibly the most misunderstood, misapplied (and even maligned) technique around
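A minimal simulated annealing sketch, assuming a maximization problem. The geometric cooling schedule, the acceptance rule exp(Δ/T), the best-so-far bookkeeping, the toy `landscape`, and all names are assumptions for illustration; the convergence guarantee above concerns much slower schedules than this one.

```python
import math
import random

def simulated_annealing(start, neighbors, value, schedule, steps, seed=0):
    """Random-neighbor search that sometimes accepts downhill moves."""
    rng = random.Random(seed)
    current = best = start
    for t in range(steps):
        temperature = schedule(t)
        if temperature <= 0:
            break
        nxt = rng.choice(neighbors(current))
        delta = value(nxt) - value(current)
        # Always accept uphill moves; accept downhill with probability e^(delta/T).
        if delta > 0 or rng.random() < math.exp(delta / temperature):
            current = nxt
        if value(current) > value(best):
            best = current  # remember the best state seen (an added convenience)
    return best

# Hypothetical landscape: states are indices, value is the height at that index.
landscape = [1, 3, 2, 1, 5, 9, 4]

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]

def value(i):
    return landscape[i]

result = simulated_annealing(0, neighbors, value, lambda t: 2.0 * 0.99 ** t, 2000)
print(result, value(result))
```

At high temperature the search wanders across the valley between the two peaks; as the temperature falls it behaves more and more like hill climbing.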
Example: N-Queens
o Theorem: if the constraint graph has no loops, the CSP can be solved in O(n d^2) time
o Compare to general CSPs, where worst-case time is O(d^n)
o This property also applies to probabilistic reasoning (later): an example of the relation
between syntactic restrictions and the complexity of reasoning
Tree-Structured CSPs
o Algorithm for tree-structured CSPs:
o Order: Choose a root variable, order variables so that parents precede children
o Claim 2: If root-to-leaf arcs are consistent, forward assignment will not backtrack
o Proof: Induction on position
o Why doesn’t this algorithm work with cycles in the constraint graph?
o Note: we’ll see this basic idea again with Bayes’ nets
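A sketch of a tree-structured CSP solver. Only the ordering step is spelled out in the bullets; the backward pruning pass and forward assignment pass below follow the standard formulation and are stated here as assumptions, as are the names and the single shared constraint `ok(parent_value, child_value)`.

```python
from collections import deque

def solve_tree_csp(root, children, domains, ok):
    """Solve a tree CSP; ok(parent_value, child_value) is the binary constraint."""
    # Order: breadth-first from the root, so parents precede children.
    order, parent = [], {root: None}
    queue = deque([root])
    while queue:
        var = queue.popleft()
        order.append(var)
        for child in children.get(var, []):
            parent[child] = var
            queue.append(child)

    doms = {v: list(domains[v]) for v in order}
    # Backward pass: make each parent -> child arc consistent, leaves first.
    for var in reversed(order[1:]):
        p = parent[var]
        doms[p] = [pv for pv in doms[p] if any(ok(pv, cv) for cv in doms[var])]
        if not doms[p]:
            return None
    if not doms[root]:
        return None

    # Forward pass: assign root to leaves; consistent arcs mean no backtracking.
    assignment = {}
    for var in order:
        p = parent[var]
        choices = doms[var] if p is None else [
            v for v in doms[var] if ok(assignment[p], v)
        ]
        assignment[var] = choices[0]
    return assignment

# Hypothetical tree A -> {B, C}, B -> D, with constraint parent < child.
children = {"A": ["B", "C"], "B": ["D"]}
domains = {v: [1, 2, 3] for v in "ABCD"}
print(solve_tree_csp("A", children, domains, lambda pv, cv: pv < cv))
```

Each arc is processed once and compares at most d × d value pairs, which is where the O(n d^2) bound comes from.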
Improving Structure
Nearly Tree-Structured CSPs
Choose a cutset
SA
[Figure: the map-coloring problem split into subproblems M1, M2, M3, M4 over WA, NT, Q, NSW, V, each also containing SA; regions within a subproblem must differ (≠), and neighboring subproblems must agree on shared variables]