Unit 202 Game Playing
Chapter 5
Mausam
(Based on slides of Stuart Russell, Henry
Kautz, Linda Shapiro & UW AI Faculty)
Game Playing
What Kinds of Games?
Mainly games of strategy with the following
characteristics:
Games vs. Search Problems
Two-Player Game
[Flowchart: Opponent’s Move → Game Over? (yes/no) → Generate Successors → Evaluate Successors → Game Over? (yes/no)]
Games as Adversarial Search
• States:
– board configurations
• Initial state:
– the board position and which player will move
• Successor function:
– returns list of (move, state) pairs, each indicating a legal
move and the resulting state
• Terminal test:
– determines when the game is over
• Utility function:
– gives a numeric value in terminal states
(e.g., -1, 0, +1 for loss, tie, win)
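The five components above map onto a small interface. As a minimal sketch, assuming a hypothetical stick-taking game that is not in the slides (players alternately remove 1 or 2 sticks, and whoever takes the last stick wins), they might look like this in Python. All names here are illustrative.

```python
from typing import List, Tuple

# Hypothetical toy game, assumed for illustration only.
# State: (sticks left, player to move), with players "MAX" and "MIN".
State = Tuple[int, str]


def initial_state(sticks: int = 4) -> State:
    """Initial state: the starting position and which player moves first."""
    return (sticks, "MAX")


def successors(state: State) -> List[Tuple[int, State]]:
    """Successor function: list of (move, resulting state) pairs."""
    sticks, player = state
    other = "MIN" if player == "MAX" else "MAX"
    return [(take, (sticks - take, other)) for take in (1, 2) if take <= sticks]


def is_terminal(state: State) -> bool:
    """Terminal test: the game is over when no sticks remain."""
    return state[0] == 0


def utility(state: State) -> int:
    """Utility from MAX's point of view: +1 for a win, -1 for a loss."""
    sticks, player_to_move = state
    if sticks != 0:
        raise ValueError("utility is only defined for terminal states")
    # The player who is NOT to move took the last stick and therefore won.
    return -1 if player_to_move == "MAX" else +1


print(successors(initial_state()))  # [(1, (3, 'MIN')), (2, (2, 'MIN'))]
```

This toy game has no ties, so the utility never takes the value 0 used for a tie in the slide's example.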
Game Tree (2-player, Deterministic, Turns)
[Figure: game tree whose levels alternate between the computer’s turn and the opponent’s turn]
[Figure: minimax animation on a depth-4 binary tree; only the final frame is shown, with each level's values backed up from the level below]
MAX     20
MIN     20 15
MAX     30 20 15 60
MIN     30 25 20 05 10 15 45 60
leaves  80 30 25 35 55 20 05 65 40 10 70 15 50 45 60 75
Minimax Strategy
• Why do we take the min value every other
level of the tree?
Properties of Minimax
• Complete?
– Yes (if tree is finite)
• Optimal?
– Yes (against an optimal opponent)
– No (does not exploit opponent weakness against suboptimal opponent)
• Time complexity?
– O(b^m), for branching factor b and maximum depth m
• Space complexity?
– O(b·m) (depth-first exploration)
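Depth-first minimax fits in a few lines. The sketch below assumes the 16 leaf values from the walkthrough tree (80 30 25 35 ...), arranged as a binary tree with MAX at the root. The nested-list tree encoding and the helper names are illustrative choices, not something the slides prescribe.

```python
from typing import List, Union

# A tree is either a leaf utility (int) or a list of child subtrees.
Tree = Union[int, List["Tree"]]

LEAVES = [80, 30, 25, 35, 55, 20, 5, 65, 40, 10, 70, 15, 50, 45, 60, 75]


def build_binary_tree(leaves: List[int]) -> Tree:
    """Pair up nodes level by level until a single root remains."""
    level: List[Tree] = list(leaves)
    while len(level) > 1:
        level = [[level[i], level[i + 1]] for i in range(0, len(level), 2)]
    return level[0]


def minimax(node: Tree, maximizing: bool) -> int:
    """Return the backed-up value of node; levels alternate MAX and MIN."""
    if isinstance(node, int):  # leaf: return its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


root = build_binary_tree(LEAVES)
print(minimax(root, maximizing=True))  # 20
```

The recursion visits every leaf once, which is the O(b^m) time cost. Only the current path is kept on the call stack, which is why the space cost stays linear in the depth.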
Good Enough?
• Chess:
– branching factor b≈35
• The Universe:
– number of atoms ≈ 10^78
[Figure: the same tree searched left to right, with two questions posed along the way]
• The first min node returns 30, and the next min node has already seen leaf 25. Do we need to check its remaining leaf (35)?
– No - this branch is guaranteed to be worse than what max already has
• Later a max node already has 20, and its next min node has seen leaf 05. Do we need to check the remaining leaf (65)?
– No, for the same reason
Alpha-Beta
• The alpha-beta procedure can speed up a
depth-first minimax search.
• Alpha: a lower bound on the value that a max
node may ultimately be assigned (v ≥ α)
[Figure: alpha-beta animation on the same tree; leaves left to right: 80 30 25 35 55 20 05 65 40 10 70 15 50 45 60 75]
α - the best value for max along the path
β - the best value for min along the path
Each node starts with the α and β of its parent (α=-∞, β=∞ at the root). When β≤α at a node: prune!
• Leaves 80, 30: the first min node gets β=80, then β=30, and returns 30; its max parent sets α=30
• Leaf 25: the second min node has α=30, β=25, so β≤α: prune! (leaf 35 is never examined); the max node returns 30 and the min node above it sets β=30
• Leaves 55, 20: the next min node returns 20 and its max parent sets α=20 (with β=30)
• Leaf 05: the following min node has α=20, β=05, so β≤α: prune! (leaf 65 is never examined); the max node returns 20, the min node above it returns 20, and the root sets α=20
• Leaves 40, 10 and 70, 15: in the root's right subtree the two min nodes return 10 and 15, so their max parent returns 15 and the min node above it gets β=15
• That min node has α=20, β=15, so β≤α: prune! (its second subtree, with leaves 50 45 60 75, is never examined)
• Result: the root value is 20, and 6 of the 16 leaves are pruned
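The α/β bookkeeping in the animation can be reproduced with a short depth-first routine. This is a sketch under assumptions: the nested-list tree encoding, the `visited` list used to record which leaves are examined, and all function names are illustrative rather than taken from the slides.

```python
import math
from typing import List, Optional, Union

Tree = Union[int, List["Tree"]]

LEAVES = [80, 30, 25, 35, 55, 20, 5, 65, 40, 10, 70, 15, 50, 45, 60, 75]


def build_binary_tree(leaves: List[int]) -> Tree:
    """Pair up nodes level by level until a single root remains."""
    level: List[Tree] = list(leaves)
    while len(level) > 1:
        level = [[level[i], level[i + 1]] for i in range(0, len(level), 2)]
    return level[0]


def alphabeta(node: Tree, maximizing: bool,
              alpha: float = -math.inf, beta: float = math.inf,
              visited: Optional[List[int]] = None) -> float:
    """Minimax value of node, skipping children once beta <= alpha."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)  # record every leaf actually examined
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)  # best value for max along the path
            if beta <= alpha:
                break  # prune the remaining children
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)  # best value for min along the path
        if beta <= alpha:
            break  # prune the remaining children
    return value


seen: List[int] = []
print(alphabeta(build_binary_tree(LEAVES), True, visited=seen))  # 20
print(seen)  # [80, 30, 25, 55, 20, 5, 40, 10, 70, 15]
```

The leaves missing from `seen` (35, 65, 50, 45, 60, 75) are the ones marked X in the animation, and the root value matches plain minimax.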
Bad and Good Cases for Alpha-Beta Pruning
• Bad: Worst moves encountered first
4 MAX
+----------------+----------------+
2 3 4 MIN
+----+----+ +----+----+ +----+----+
6 4 2 7 5 3 8 6 4 MAX
+--+ +--+ +--+ +-+-+ +--+ +--+ +--+ +--+ +--+--+
6 5 4 3 2 1 1 3 7 4 5 2 3 8 2 1 6 1 2 4
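To see how much ordering matters, the sketch below runs alpha-beta on the bad-case tree and on a reordered copy. The leaf row is read as 6 5, 4 3, 2 1, 1 3 7, 4 5, 2 3, 8 2, 1 6, 1 2 4, the grouping that agrees with the MAX values above it. The `best_first` helper is a hypothetical oracle that sorts children by their true minimax value; a real program cannot do this and would rely on heuristics instead.

```python
import math

# Bad-case tree from the slide: MAX root, MIN children, MAX grandchildren.
BAD_TREE = [
    [[6, 5], [4, 3], [2, 1]],
    [[1, 3, 7], [4, 5], [2, 3]],
    [[8, 2], [1, 6], [1, 2, 4]],
]


def minimax(node, maximizing):
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_first(node, maximizing):
    """Oracle ordering (assumption): put each node's best child first."""
    if isinstance(node, int):
        return node
    children = [best_first(child, not maximizing) for child in node]
    # MAX wants high values first, MIN wants low values first.
    return sorted(children, key=lambda c: minimax(c, not maximizing),
                  reverse=maximizing)


def alphabeta(node, maximizing, alpha, beta, visited):
    if isinstance(node, int):
        visited.append(node)
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        result = alphabeta(child, not maximizing, alpha, beta, visited)
        if maximizing:
            value = max(value, result)
            alpha = max(alpha, value)
        else:
            value = min(value, result)
            beta = min(beta, value)
        if beta <= alpha:
            break  # prune the remaining children
    return value


def count_leaves(tree):
    """Return (root value, number of leaves examined by alpha-beta)."""
    visited = []
    value = alphabeta(tree, True, -math.inf, math.inf, visited)
    return value, len(visited)


print(count_leaves(BAD_TREE))                    # (4, 20)
print(count_leaves(best_first(BAD_TREE, True)))  # (4, 9)
```

With the slide's ordering nothing is pruned and all 20 leaves are examined. With the oracle ordering the root value is unchanged and only 9 leaves are examined.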
Why O(b^(m/2))?
Let T(m) be time complexity of search for depth m
Normally:
T(m) = b·T(m-1) + c  ⇒  T(m) = O(b^m)
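Unrolling the recurrence for plain minimax makes the bound explicit. This is a short worked step, assuming b > 1, a constant per-node cost c, and T(0) as the cost of evaluating a leaf:

```latex
\begin{aligned}
T(m) &= b\,T(m-1) + c \\
     &= b^{2}\,T(m-2) + c\,(1 + b) \\
     &\;\;\vdots \\
     &= b^{m}\,T(0) + c\,\frac{b^{m} - 1}{b - 1} \;=\; O(b^{m})
\end{aligned}
```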
Node Ordering
Iterative deepening search
Good Enough?
• Chess:
– branching factor b≈35
– game length m≈100
– search space b^(m/2) ≈ 35^50 ≈ 10^77
• The Universe:
– number of atoms ≈ 10^78
– age ≈ 10^18 seconds
– 10^8 moves/sec x 10^78 x 10^18 = 10^104
• The universe can play chess - can we?
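The orders of magnitude on this slide can be checked with logarithms. The snippet below is only an arithmetic check. It reads the slide's product as every atom making 10^8 moves per second for the age of the universe, which is an interpretation rather than something the slide states.

```python
import math

b, m = 35, 100  # branching factor and game length from the slide

# Alpha-beta with good ordering searches roughly b^(m/2) nodes.
log10_search_space = (m / 2) * math.log10(b)

# Exponents from the slide: 10^8 moves/sec, 10^78 atoms, 10^18 seconds.
log10_universe_moves = 8 + 78 + 18

print(f"35^50 is about 10^{log10_search_space:.1f}")      # 10^77.2
print(f"universe budget is 10^{log10_universe_moves}")    # 10^104
```

On these numbers the search space (about 10^77) sits far below the 10^104 budget, which is the sense in which "the universe can play chess".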
Cutting off Search
MinimaxCutoff is identical to MinimaxValue except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval
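The two substitutions can be sketched directly. This is a hedged illustration: the `cutoff` and `evaluate` callables, the depth counter, and the nested-list tree are assumptions made for the example, and the tree reuses the 16 leaf values from the earlier walkthrough.

```python
from typing import Callable, List, Union

Tree = Union[int, List["Tree"]]

LEAVES = [80, 30, 25, 35, 55, 20, 5, 65, 40, 10, 70, 15, 50, 45, 60, 75]


def build_binary_tree(leaves: List[int]) -> Tree:
    """Pair up nodes level by level until a single root remains."""
    level: List[Tree] = list(leaves)
    while len(level) > 1:
        level = [[level[i], level[i + 1]] for i in range(0, len(level), 2)]
    return level[0]


def minimax_cutoff(node: Tree, depth: int, maximizing: bool,
                   cutoff: Callable[[Tree, int], bool],
                   evaluate: Callable[[Tree], float]) -> float:
    """Minimax that stops at cutoff nodes and scores them with evaluate."""
    if isinstance(node, int):
        return node  # a true terminal still returns its exact utility
    if cutoff(node, depth):
        return evaluate(node)  # Eval replaces Utility at the cutoff
    values = [minimax_cutoff(child, depth + 1, not maximizing, cutoff, evaluate)
              for child in node]
    return max(values) if maximizing else min(values)


root = build_binary_tree(LEAVES)
# Cut off at depth 2 with a placeholder Eval that always returns 0.
shallow = minimax_cutoff(root, 0, True, lambda n, d: d >= 2, lambda n: 0)
# Never cut off: this reduces to ordinary minimax.
full = minimax_cutoff(root, 0, True, lambda n, d: False, lambda n: 0)
print(shallow, full)  # 0 20
```

A constant Eval tells the search nothing, so the depth-2 result is 0 while the full search recovers 20. The quality of the cutoff search depends entirely on how well Eval approximates the true backed-up value.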
[Figure: the same tree with search cut off two levels below the root; Eval returns 0 at each of the four cutoff nodes, so every backed-up value above them, including the root, is 0]
Evaluation Functions
Tic Tac Toe
• Let p be a position in the game
• Define the utility function f(p) by
– f(p) =
• largest positive number if p is a win for computer
• smallest negative number if p is a win for opponent
• RCDC – RCDO otherwise
– where RCDC is number of rows, columns and diagonals in
which computer could still win
– and RCDO is number of rows, columns and diagonals in
which opponent could still win.
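The definition of f(p) translates almost line for line into code. In this sketch the board is assumed to be a list of 9 cells in row-major order holding "X", "O" or "." for empty, and `math.inf` stands in for the slide's "largest positive number". Both choices are illustrative.

```python
import math

# The 8 winning lines: 3 rows, 3 columns, 2 diagonals (row-major indices).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def has_won(board, player):
    """True if player occupies all three cells of some line."""
    return any(all(board[i] == player for i in line) for line in LINES)


def open_lines(board, player):
    """Number of lines with no opposing mark, i.e. still winnable by player."""
    other = "O" if player == "X" else "X"
    return sum(all(board[i] != other for i in line) for line in LINES)


def evaluate(board, computer="X", opponent="O"):
    """f(p): +inf / -inf for a decided game, otherwise RCDC - RCDO."""
    if has_won(board, computer):
        return math.inf
    if has_won(board, opponent):
        return -math.inf
    return open_lines(board, computer) - open_lines(board, opponent)


# O in the top-left corner, X in the centre.
board = list("O...X....")
print(evaluate(board))  # 1  (RCDC = 5, RCDO = 4)
```

Here O's corner blocks three of X's lines and X's centre blocks four of O's, so the position scores 5 - 4 = 1 in the computer's favour.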
Sample Evaluations
• X = Computer; O = Opponent
[Figure: two sample tic-tac-toe boards, each scored by counting the rows, cols and diags still open to each player]
Evaluation functions
• For chess/checkers, typically linear weighted sum of features
Eval(s) = w_1 f_1(s) + w_2 f_2(s) + … + w_m f_m(s)
e.g., w_1 = 9 with
f_1(s) = (number of white queens) – (number of black queens),
etc.
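A linear evaluation is just a dot product of weights and feature values. In the sketch below only the queen weight of 9 comes from the slide. The other piece weights are commonly quoted material values and are an assumption, as is the feature vector.

```python
from typing import Sequence


def linear_eval(weights: Sequence[float], features: Sequence[float]) -> float:
    """Eval(s) = w_1 f_1(s) + ... + w_m f_m(s)."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Weights for (queen, rook, bishop, knight, pawn) differences.
# Only the 9 for queens is from the slide; the rest are assumed values.
WEIGHTS = [9, 5, 3, 3, 1]

# Hypothetical position: white is up a rook and two pawns, down a knight.
features = [0, 1, 0, -1, 2]  # each f_i = (white count) - (black count)

print(linear_eval(WEIGHTS, features))  # 4
```

Because every feature is a white-minus-black difference, a positive score favours white and a negative score favours black.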
Example: Samuel’s Checker-Playing Program
• It uses a linear evaluation function
f(n) = w_1 f_1(n) + w_2 f_2(n) + ... + w_m f_m(n)
For example: f = 6K + 4M + U
– K = King Advantage
– M = Man Advantage
– U = Undenied Mobility Advantage (number of
moves that Max has where Min has no jump moves)
Samuel’s Checker Player
• In learning mode
Samuel’s Checker Player
• How does A change its function?
Coefficient replacement
Δ(node) = backed-up value(node) – initial value(node)
if Δ > 0 then terms that contributed positively are
given more weight and terms that contributed
negatively get less weight
if Δ < 0 then terms that contributed negatively are
given more weight and terms that contributed
positively get less weight
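One simple rule with this qualitative behaviour is to nudge each weight in proportion to Δ times its feature value. This is an assumed, delta-rule style sketch and not Samuel's actual procedure. The learning rate, the function name, and the example feature values are all hypothetical.

```python
from typing import List, Sequence


def update_weights(weights: Sequence[float], features: Sequence[float],
                   backed_up: float, initial: float,
                   rate: float = 0.1) -> List[float]:
    """Shift each weight by rate * delta * feature (assumed update rule).

    With delta > 0, a term whose contribution w*f is positive grows in
    magnitude and a term whose contribution is negative shrinks; with
    delta < 0 the roles are reversed.
    """
    delta = backed_up - initial
    return [w + rate * delta * f for w, f in zip(weights, features)]


# Start from the slide's example f = 6K + 4M + U.
weights = [6.0, 4.0, 1.0]
features = [1.0, -1.0, 0.0]  # hypothetical position: K = 1, M = -1, U = 0
initial = sum(w * f for w, f in zip(weights, features))  # 6 - 4 + 0 = 2
backed_up = 4.0  # hypothetical value returned by a deeper search

print(update_weights(weights, features, backed_up, initial))
```

Here Δ = 2 > 0, so the K term (positive contribution) moves from 6 to about 6.2, the M term (negative contribution) moves from 4 to about 3.8, and the U term is unchanged because its feature value is 0.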