Adversarial Search
[Figure: flowchart classifying the environment type by yes/no questions (fully observable? multi-agent? sequential? discrete? turn-taking or semi-dynamic?), with outcomes: game tree search, game matrices, continuous action games.]
CMPT 310 - Blind Search
Adversarial Search
Examine the problems that arise when we try to plan ahead in
a world where other agents are planning against us.
The utility values of the two agents are opposites of each other,
which creates the adversarial situation.
Games – adversary
Solution is strategy (strategy specifies move for every possible opponent reply).
Optimality depends on opponent. Why?
Time limits force an approximate solution
Evaluation function: evaluate “goodness” of game position
MAX moves first and they take turns until the game is over
The winner gets a reward, the loser gets a penalty.
Games as search:
Initial state: e.g. board configuration of chess
Successor function: list of (move,state) pairs specifying legal moves.
Terminal test: Is the game finished?
Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe or chess.
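The four components above can be sketched for tic-tac-toe as follows. This is only an illustration: the board encoding (a 9-tuple holding "X", "O" or None) and all function names are assumptions made for this example, not part of the slides.

```python
from typing import Iterator, Optional, Tuple

Board = Tuple[Optional[str], ...]  # assumed encoding: 9 cells, row-major

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def initial_state() -> Board:
    """Initial state: the empty board."""
    return (None,) * 9

def to_move(board: Board) -> str:
    """X plays MAX and moves first; the players then alternate."""
    return "X" if board.count("X") == board.count("O") else "O"

def successors(board: Board) -> Iterator[Tuple[int, Board]]:
    """Successor function: yields a (move, state) pair for every legal move."""
    player = to_move(board)
    for i, cell in enumerate(board):
        if cell is None:
            yield i, board[:i] + (player,) + board[i + 1:]

def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def terminal_test(board: Board) -> bool:
    """Terminal test: someone has won or the board is full."""
    return winner(board) is not None or None not in board

def utility(board: Board) -> int:
    """Utility of a terminal state for MAX (X): win +1, loss -1, draw 0."""
    return {"X": 1, "O": -1, None: 0}[winner(board)]
```

From the empty board, `successors` yields nine (move, state) pairs, one per free cell.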
Size of search trees
b = branching factor, d = depth of the tree
Chess
b ~ 35
d ~ 100
- search tree has ~ 35^100 ≈ 10^154 nodes (!!)
- completely impractical to search this
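The 10^154 figure can be checked with a few lines of arithmetic. The snippet below is a small sanity check using the approximate chess values quoted above; the variable names are chosen for this example.

```python
import math

b = 35   # approximate branching factor of chess
d = 100  # approximate depth of a game, in plies

# Order of magnitude of b^d via logarithms.
exponent = d * math.log10(b)

# Python integers are arbitrary precision, so b**d can also be computed exactly;
# a number with k digits lies between 10^(k-1) and 10^k.
nodes = b ** d
magnitude = len(str(nodes)) - 1

print(f"35^100 is roughly 10^{magnitude}")
```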
Game-playing emphasizes being able to make optimal decisions in a finite amount of time
Somewhat realistic as a model of a real-world agent
Even if games themselves are artificial
Partial Game Tree for Tic-Tac-Toe
Game tree (2-player, deterministic, turns)
Assumptions:
Max depth = d, b legal moves at each point
E.g., Chess: d ~ 100, b ~35
Criterion    Minimax
Time         O(b^d)
Space        O(bd)
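A minimal sketch of minimax on an explicit game tree may help here. It assumes the tree is given as nested Python lists whose leaves are integer utilities for MAX; this representation and the leaf values are illustrative assumptions, not taken from the slides.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]  # a leaf utility, or a list of subtrees

def minimax(node: Tree, maximizing: bool = True) -> int:
    """Return the minimax value of `node`; MAX and MIN alternate by level."""
    if isinstance(node, int):          # leaf: utility for MAX
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Assumed example: MAX at the root, three MIN nodes, three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = minimax(tree)
```

Every leaf is visited once, which is where the O(b^d) time bound comes from.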
Practical problem with minimax search
Number of game states is exponential in the number of moves.
Solution: Do not examine every node
=> pruning
Remove branches that do not influence final decision
Revisit example …
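Alpha-Beta Example
Do depth-first search until the first leaf. Each step lists the range of possible
values shown in the figure (tree not reproduced), for the root (MAX) and the MIN nodes.
Step 1: root [-∞,+∞], first MIN node [-∞,+∞]
Step 2: root [-∞,+∞], first MIN node [-∞,3]
Step 3: root [-∞,+∞], first MIN node [-∞,3]
Step 4: root [3,+∞], first MIN node [3,3]
Step 5: root [3,+∞], first MIN node [3,3], second MIN node [-∞,2] (this node is worse for MAX)
Step 6: root [3,14], then [3,5], then [3,3]; first MIN node [3,3]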
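The pruning in this example can be reproduced with a short alpha-beta sketch. The nested-list tree and the `visited` bookkeeping are assumptions made for illustration. The leaf values were chosen to be consistent with the bounds 3, 2, 14 and 5 shown in the example; the remaining leaves (12, 8, 4, 6) are assumed.

```python
import math
from typing import List, Optional, Union

Tree = Union[int, List["Tree"]]  # a leaf utility, or a list of subtrees

def alphabeta(node: Tree, alpha: float = -math.inf, beta: float = math.inf,
              maximizing: bool = True, visited: Optional[list] = None) -> float:
    """Minimax value of `node`, skipping branches that cannot change the result."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)       # record which leaves were examined
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:          # MIN above will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:              # MAX above already has something better
            break
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen: list = []
result = alphabeta(tree, visited=seen)
```

On this tree the second MIN node is abandoned after its first leaf (2), so leaves 4 and 6 are never examined, yet the root value is still 3.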
Best-Case
each player’s best move is the left-most alternative (i.e., evaluated first);
alpha-beta then examines only O(b^(d/2)) nodes
in practice, performance is closer to the best case than to the worst case
[Figure: example tree with alternating MIN and MAX levels; node values shown: 5, 6 and 3, 4, 1, 2, 7, 8.]
Final Comments about Alpha-Beta Pruning
Pruning does not affect final results
Standard approach:
- cutoff test (where do we stop descending the tree?)
  - depth limit
  - better: iterative deepening
  - cut off only when no big changes are expected to occur next (quiescence search)
- evaluation function
  - when the search is cut off, we evaluate the current state by estimating its utility using an evaluation function
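A depth-limited search with a cutoff test and an evaluation function might look like the sketch below. The function signature, the callback names, and the toy game (a pile of stones, each player takes 1 or 2, whoever takes the last stone wins) are all assumptions chosen to keep the example small. Iterative deepening and quiescence search are not shown.

```python
from typing import Callable, List

def cutoff_value(state, depth: int, maximizing: bool,
                 successors: Callable, is_terminal: Callable,
                 utility: Callable, evaluate: Callable) -> float:
    """Minimax value of `state`, looking at most `depth` plies ahead."""
    if is_terminal(state):
        return utility(state, maximizing)      # exact value at the end of the game
    if depth == 0:                             # cutoff test: depth limit reached
        return evaluate(state, maximizing)     # heuristic estimate instead
    values = [cutoff_value(s, depth - 1, not maximizing,
                           successors, is_terminal, utility, evaluate)
              for s in successors(state)]
    return max(values) if maximizing else min(values)

# Toy game (assumed): the state is the number of stones left in the pile.
def successors(n: int) -> List[int]:
    return [n - k for k in (1, 2) if n - k >= 0]

def is_terminal(n: int) -> bool:
    return n == 0

def utility(n: int, maximizing: bool) -> float:
    # The player to move at an empty pile did not take the last stone and has lost.
    return -1.0 if maximizing else 1.0

def evaluate(n: int, maximizing: bool) -> float:
    return 0.0   # placeholder heuristic: "unknown" when the search is cut off

deep = cutoff_value(4, 10, True, successors, is_terminal, utility, evaluate)
shallow = cutoff_value(4, 0, True, successors, is_terminal, utility, evaluate)
```

With enough depth the search reaches terminal states and returns exact utilities; with depth 0 it falls back to the evaluation function immediately.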
Static (Heuristic) Evaluation Functions
An Evaluation Function:
estimates how good the current board configuration is for a player.
Typically, one figures how good it is for the player and how good it is for the opponent, and subtracts the
opponent's score from the player's.
Othello: Number of white pieces - Number of black pieces
Chess: Value of all white pieces - Value of all black pieces
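As an illustration of the chess case, the sketch below computes a material difference from a string of piece letters. The piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9) are the conventional ones and are an assumption here, as is the encoding (uppercase for white, lowercase for black, as in FEN piece placement).

```python
# Conventional material values (an assumption; the slides do not give numbers).
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_eval(pieces: str) -> int:
    """Value of all white pieces minus value of all black pieces.

    Uppercase letters are white pieces, lowercase are black; kings, digits
    and separators contribute nothing.
    """
    score = 0
    for ch in pieces:
        value = PIECE_VALUES.get(ch.upper(), 0)
        score += value if ch.isupper() else -value
    return score

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
balanced = material_eval(start)
white_up_a_queen = material_eval("QRPPkrpp")
```

A positive score favours white (MAX in this convention), a negative one favours black.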
Evaluation functions estimate the quality of a given board configuration for the
Max player.
Alpha-Beta is a procedure which can prune large parts of the search tree and
allow search to go deeper
Constraint satisfaction problems (CSPs)
CSP:
state is defined by variables Xi with values from domain Di
goal test is a set of constraints specifying allowable combinations of values
for subsets of variables
Example: Map-Coloring
Domains Di = {red,green,blue}
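One possible way to write a map-coloring CSP down in code is sketched below. It assumes the usual textbook map of Australia (regions WA, NT, SA, Q, NSW, V, T) and the constraint that adjacent regions get different colors; the adjacency list and the helper names are assumptions for this example.

```python
from typing import Dict

VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]

# Every variable has the same domain D_i = {red, green, blue}.
DOMAINS = {v: {"red", "green", "blue"} for v in VARIABLES}

# Binary constraints: each pair of adjacent regions must differ in color.
ADJACENT = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
            ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]

def is_consistent(assignment: Dict[str, str]) -> bool:
    """True if no constraint between two assigned variables is violated."""
    return all(assignment[a] != assignment[b]
               for a, b in ADJACENT
               if a in assignment and b in assignment)

def is_goal(assignment: Dict[str, str]) -> bool:
    """Goal test: every variable has a value and all constraints hold."""
    return len(assignment) == len(VARIABLES) and is_consistent(assignment)

solution = {"WA": "red", "NT": "green", "SA": "blue", "Q": "red",
            "NSW": "green", "V": "red", "T": "green"}
```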
Constraint graph
Binary CSP: each constraint relates two variables
Constraint graph: nodes are variables, arcs are constraints
Varieties of CSPs
Discrete variables
finite domains:
n variables, domain size d → O(d^n) complete assignments
e.g., 3-SAT (NP-complete)
infinite domains:
integers, strings, etc.
e.g., job scheduling, variables are start/end days for each job
need a constraint language, e.g., StartJob1 + 5 ≤ StartJob3
Continuous variables
e.g., start/end times for Hubble Space Telescope observations
linear constraints solvable in polynomial time by linear programming
Varieties of constraints
Unary constraints involve a single variable,
e.g., SA ≠ green
Example: Cryptarithmetic
Variables: F T U W R O X1 X2 X3
Domains: {0,1,2,3,4,5,6,7,8,9} for the letters, {0,1} for the carries X1, X2, X3
Constraints: Alldiff (F,T,U,W,R,O)
O + O = R + 10 · X1
X1 + W + W = U + 10 · X2
X2 + T + T = O + 10 · X3
X3 = F, T ≠ 0, F ≠ 0
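The column constraints above correspond to the puzzle TWO + TWO = FOUR. A brute-force check over all digit assignments is sketched below. It is not an efficient CSP solver: it folds the carry variables X1, X2, X3 into ordinary integer addition, and the function name is an assumption.

```python
from itertools import permutations
from typing import Dict, List

def solve_two_two_four() -> List[Dict[str, int]]:
    """Enumerate all assignments satisfying TWO + TWO = FOUR."""
    solutions = []
    # permutations() yields distinct digits, which enforces Alldiff(F,T,U,W,R,O).
    for t, w, o, f, u, r in permutations(range(10), 6):
        if t == 0 or f == 0:            # no leading zeros
            continue
        two = 100 * t + 10 * w + o
        four = 1000 * f + 100 * o + 10 * u + r
        if two + two == four:           # integer addition handles the carries
            solutions.append({"T": t, "W": w, "O": o, "F": f, "U": u, "R": r})
    return solutions

solutions = solve_two_two_four()
```

One of the assignments it finds is 734 + 734 = 1468.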
SEND+MORE=MONEY
BASE+BALL=GAMES
LOGIC+LOGIC=PROLOG
Real-world CSPs
Assignment problems
e.g., who teaches what class
Timetabling problems
e.g., which class is offered when and where?
Transportation scheduling
Factory scheduling
Standard search formulation
Let’s try the standard search formulation.
We need:
• Initial state: none of the variables has a value (color)
• Successor state: one of the variables without a value will get some value.
• Goal: all variables have a value and none of the constraints is violated.
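[Figure: search tree with N layers. The first layer has N x D nodes (any of the N variables, e.g. WA, NT or T, takes any of its D values), the second has [N x D] x [(N-1) x D], and the bottom layer has N! x D^N leaves. Paths that assign the same values in a different order (WA then NT, or NT then WA) end in equal states.]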
There are N! x D^N leaves in the tree, but only D^N distinct complete assignments: the tree repeats the same states many times.
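To get a feel for the gap between the two counts, the snippet below evaluates both for a small case. The choice N = 7 and D = 3 (a seven-region map with three colors) is an assumption for illustration.

```python
import math

def naive_leaves(n: int, d: int) -> int:
    """Leaves of the naive search tree: N! * D^N."""
    return math.factorial(n) * d ** n

def distinct_assignments(n: int, d: int) -> int:
    """Distinct complete assignments: D^N."""
    return d ** n

n, d = 7, 3
leaves = naive_leaves(n, d)
distinct = distinct_assignments(n, d)
print(leaves, distinct, leaves // distinct)
```

The ratio between the two is exactly N!, the number of orders in which the same N variables can be assigned.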
Backtracking (Depth-First) search
• Special property of CSPs: they are commutative.
This means the order in which we assign variables does not matter:
assigning WA then NT reaches the same state as assigning NT then WA.
• Better search tree: First order variables, then assign them values one-by-one.
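[Figure: search tree with a fixed variable order. The first layer (WA) has D nodes, the second (WA, NT) has D^2, and the bottom layer has D^N leaves.]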
Backtracking example
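A minimal backtracking sketch for map coloring is given below. It assumes the textbook Australia map (the neighbor table is an assumption), a fixed variable order, and a fixed color order; no heuristics or inference are applied.

```python
from typing import Dict, List, Optional

# Assumed adjacency of the Australia map.
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}
COLORS = ["red", "green", "blue"]

def backtrack(assignment: Dict[str, str], order: List[str]) -> Optional[Dict[str, str]]:
    """Depth-first search that assigns one variable per level, in `order`."""
    if len(assignment) == len(order):
        return assignment                        # every variable has a value
    var = order[len(assignment)]                 # next variable in the fixed order
    for color in COLORS:
        # Only try values consistent with the neighbors assigned so far.
        if all(assignment.get(n) != color for n in NEIGHBORS[var]):
            assignment[var] = color
            result = backtrack(assignment, order)
            if result is not None:
                return result
            del assignment[var]                  # undo and try the next value
    return None                                  # no value works: backtrack

solution = backtrack({}, list(NEIGHBORS))
```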
Improving backtracking efficiency
General-purpose methods can give huge gains in speed:
Which variable should be assigned next?
In what order should its values be tried?
Can we detect inevitable failure early?
Most constrained variable
Most constrained variable:
choose the variable with the fewest legal values
Most constraining variable
Tie-breaker among most constrained variables:
choose the variable with the most constraints on remaining variables
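The two variable-ordering rules can be combined into one selection function, sketched below. It assumes `domains` maps each variable to its current set of legal values and that the neighbor table describes the Australia map; both are assumptions for this example.

```python
from typing import Dict, List, Set

# Assumed adjacency of the Australia map.
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}

def select_variable(domains: Dict[str, Set[str]],
                    neighbors: Dict[str, List[str]],
                    assignment: Dict[str, str]) -> str:
    """Most constrained variable first; ties go to the most constraining one."""
    def key(var: str):
        remaining = len(domains[var])                       # fewest legal values
        degree = sum(1 for n in neighbors[var] if n not in assignment)
        return (remaining, -degree)                         # then highest degree
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=key)

full = {v: {"red", "green", "blue"} for v in NEIGHBORS}
first = select_variable(full, NEIGHBORS, {})

# After WA = red and NT = green, SA has a single legal value left.
pruned = {v: set(vals) for v, vals in full.items()}
pruned.update({"WA": {"red"}, "NT": {"green"}, "SA": {"blue"}, "Q": {"red", "blue"}})
second = select_variable(pruned, NEIGHBORS, {"WA": "red", "NT": "green"})
```

With all domains equal, the tie-breaker picks SA (five neighbors); once SA is down to one legal value, the fewest-values rule picks it directly.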
Least constraining value
Given a variable, choose the least constraining value:
the one that rules out the fewest values in the remaining
variables
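Value ordering by the least-constraining-value rule can be sketched as follows. The example state (WA = red, NT = green, choosing a value for Q) and the neighbor table are assumptions based on the usual Australia map.

```python
from typing import Dict, List, Set

NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}

def order_values(var: str, domains: Dict[str, Set[str]],
                 neighbors: Dict[str, List[str]],
                 assignment: Dict[str, str]) -> List[str]:
    """Sort the values of `var` so that the least constraining comes first."""
    def ruled_out(value: str) -> int:
        # Number of neighbor values that choosing `value` would eliminate.
        return sum(1 for n in neighbors[var]
                   if n not in assignment and value in domains[n])
    return sorted(domains[var], key=ruled_out)

# Assumed state after WA = red and NT = green.
domains = {"WA": {"red"}, "NT": {"green"}, "SA": {"blue"},
           "Q": {"red", "blue"}, "NSW": {"red", "green", "blue"},
           "V": {"red", "green", "blue"}, "T": {"red", "green", "blue"}}
ordering = order_values("Q", domains, NEIGHBORS, {"WA": "red", "NT": "green"})
```

Here red is tried before blue for Q, because blue would remove the last remaining value of SA.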
Forward checking
Idea:
Keep track of remaining legal values for unassigned variables
Terminate search when any variable has no legal values
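Forward checking can be sketched as a function that prunes the domains of the unassigned neighbors after each assignment. The function name, the copy-on-write style, and the example sequence (WA = red, Q = green, V = blue on the Australia map) are assumptions for illustration.

```python
from typing import Dict, List, Optional, Set

NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}

def forward_check(var: str, value: str, domains: Dict[str, Set[str]],
                  neighbors: Dict[str, List[str]],
                  assignment: Dict[str, str]) -> Optional[Dict[str, Set[str]]]:
    """Assign var = value and prune neighbors; return None on a dead end.

    `assignment` holds the variables assigned before this step.
    """
    new_domains = {v: set(vals) for v, vals in domains.items()}  # keep input intact
    new_domains[var] = {value}
    for n in neighbors[var]:
        if n not in assignment:
            new_domains[n].discard(value)        # value is no longer legal for n
            if not new_domains[n]:
                return None                      # some variable has no legal values
    return new_domains

start = {v: {"red", "green", "blue"} for v in NEIGHBORS}
d1 = forward_check("WA", "red", start, NEIGHBORS, {})
d2 = forward_check("Q", "green", d1, NEIGHBORS, {"WA": "red"})
d3 = forward_check("V", "blue", d2, NEIGHBORS, {"WA": "red", "Q": "green"})
```

After the third assignment SA has no legal value left, so `d3` is None and the search can stop exploring this branch.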
Constraint propagation
Forward checking propagates information from assigned to
unassigned variables, but doesn't provide early detection for all
failures.
Arc consistency
Simplest form of propagation makes each arc consistent
X → Y is consistent iff
for every value x of X there is some allowed value y of Y
Time complexity: O(n^2 d^3)
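A compact sketch of arc-consistency propagation in the style of the AC-3 algorithm is shown below. The function names, the `allowed` callback, and the example domains (WA = red and Q = green on the Australia map) are assumptions. The O(n^2 d^3) bound reflects O(n^2) arcs, each re-queued at most d times, with an O(d^2) check per revision.

```python
from collections import deque
from typing import Callable, Dict, List, Set

NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}

def revise(domains: Dict[str, Set[str]], x: str, y: str, allowed: Callable) -> bool:
    """Make arc x -> y consistent; return True if domains[x] shrank."""
    removed = False
    for vx in set(domains[x]):                   # iterate over a copy
        if not any(allowed(vx, vy) for vy in domains[y]):
            domains[x].discard(vx)               # no supporting value in y
            removed = True
    return removed

def ac3(domains: Dict[str, Set[str]], neighbors: Dict[str, List[str]],
        allowed: Callable) -> bool:
    """Enforce arc consistency in place; return False if a domain becomes empty."""
    queue = deque((x, y) for x in neighbors for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y, allowed):
            if not domains[x]:
                return False
            # x lost a value, so arcs pointing at x must be rechecked.
            queue.extend((z, x) for z in neighbors[x] if z != y)
    return True

def different(a: str, b: str) -> bool:
    return a != b

domains = {v: {"red", "green", "blue"} for v in NEIGHBORS}
domains["WA"] = {"red"}
domains["Q"] = {"green"}
consistent = ac3(domains, NEIGHBORS, different)
```

In this state NT and SA are both reduced to blue, and since they are neighbors the propagation empties one of them, so the failure is detected before any further assignment is tried.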