Artificial Intelligence: Adversarial Search
Adversarial Search
1
Motivation
GO
chess tic-tac-toe
2
Today
State Space Search for Game Playing
MiniMax
Alpha-beta pruning
Stochastic Games
Where we are today
3
Adversarial Search
5
Types of Games
Perfect Information
In a game with perfect information, agents can see the complete
board: they have all the information about the game state,
including each other's moves.
Examples: Chess, Checkers, Go, etc.
Imperfect Information
The game state is only partially observable; the opponent's
choices are not visible (hidden)
Example: Battleship, Stratego, many card games, etc.
6
Types of Games (II)
Deterministic games
No games of chance (e.g., rolling dice)
Examples: Chess, Tic-Tac-Toe, Go, etc.
Non-deterministic games
Games with unpredictable (random) events (involving chance or luck)
Example: Backgammon, Monopoly, Poker, etc.
7
Types of Games (III)
Zero-Sum Game
If the total gains of one player are added up, and the
total losses are subtracted, they will sum to zero
(example: cutting a cake)
A gain by one player must be matched by a loss by the
other player
One player tries to maximize a single value, the other
player tries to minimize it
Examples: Checkers, Chess, etc.
Non-Zero-Sum Game
Win-Win or Lose-Lose type games
Famous example: The Prisoner’s Dilemma
https://en.wikipedia.org/wiki/Prisoner%27s_dilemma
8
Today
State Space Search for Game Playing
MiniMax
Alpha-beta pruning
Stochastic games
Where we are today
9
Example: Game of Nim
Rules
2 players start with a pile of tokens
move: split any existing pile into two non-empty piles of different sizes
game ends when no pile can be split unevenly
the player who cannot make a move loses
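The rules above can be sketched as a move generator (a minimal sketch in Python; the state is a tuple of pile sizes, and `split_moves` is an illustrative name):

```python
def split_moves(piles):
    """All successor states: one pile is split into two non-empty,
    differently-sized piles (the game is over when this returns [])."""
    moves = []
    for i, pile in enumerate(piles):
        # left runs over the strictly smaller half, so left < pile - left
        # guarantees two non-empty piles of different sizes
        for left in range(1, (pile - 1) // 2 + 1):
            rest = piles[:i] + piles[i + 1:]
            moves.append(tuple(sorted(rest + (left, pile - left))))
    return moves
```

For a starting pile of 7, the legal first moves are (1, 6), (2, 5), and (3, 4); piles of size 1 or 2 cannot be split, so a player facing only such piles has no move and loses.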
10
State Space of Game Nim
start with one pile of tokens
each move divides one pile of tokens into 2 non-empty piles of different sizes
the player without a move left loses the game
12
Exhaustive MiniMax Search
For small games where exhaustive search is feasible
Procedure:
1. build complete game tree
2. label each level according to player’s turn (MAX or MIN)
3. label leaves with a utility function to determine the outcome
of the game
e.g., (0, 1) or (-1, 0, 1)
4. propagate this value up:
if parent=MAX, give it max value of children
if parent=MIN, give it min value of children
5. select the best next move for the player at the root: the move
leading to the child with the highest value (for MAX) or lowest
value (for MIN)
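Steps 1-5 can be sketched as a recursive function (a minimal sketch; `moves` and `utility` are assumed helpers supplied by the game, with utility given from MAX's point of view):

```python
def minimax(state, moves, utility, is_max):
    """Exhaustive MiniMax over the complete game tree."""
    children = moves(state)
    if not children:                       # leaf: label with utility (step 3)
        return utility(state, is_max)
    # recurse with the turn alternating between MAX and MIN (step 2)
    values = [minimax(c, moves, utility, not is_max) for c in children]
    # propagate values up (step 4)
    return max(values) if is_max else min(values)
```

The player at the root then selects the child whose backed-up value is best for them (step 5).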
13
Exhaustive MiniMax for Nim
15
Heuristic Function for 2-player games
simple strategy:
try to maximize the difference between the strength of MAX's
position and that of MIN's
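For tic-tac-toe this strategy gives the classic heuristic e(n) = (winning lines still open for MAX) − (winning lines still open for MIN). A sketch, assuming 'X' is MAX, 'O' is MIN, and the board is a 9-element list with None for empty squares:

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def e(board):
    """e(n) = open lines for X minus open lines for O."""
    def open_for(player):
        # a line is still open for a player if it holds no opponent mark
        return sum(all(board[i] in (player, None) for i in line)
                   for line in LINES)
    return open_for('X') - open_for('O')
```

On an empty board e(n) = 8 − 8 = 0; after X takes the center, e(n) = 8 − 4 = 4, since 4 of O's lines now run through X's mark.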
16
Choosing a Heuristic Function e(n)
17
MiniMax with Fixed Ply Depth
23
Two-ply MiniMax for Opening Move
Tic-Tac-Toe tree at horizon = 2
29
Alpha-Beta Pruning
Optimization over MiniMax that:
ignores (cuts off, prunes) branches of the tree that cannot possibly lead to a better solution
reduces the effective branching factor
allows deeper search with the same effort
30
Alpha-Beta Pruning: Example 1
With minimax, we look at all nodes down to the n-ply horizon
With α-β pruning, we ignore branches that cannot possibly
contribute to the final decision
32
Alpha-Beta Pruning Algorithm
function alphabeta(node, depth, α, β, maximizingPlayer)
    if depth = 0 or node is a terminal node
        return the heuristic value of node
    if maximizingPlayer
        v := -∞
        for each child of node
            v := max(v, alphabeta(child, depth - 1, α, β, FALSE))
            α := max(α, v)
            if β ≤ α
                break (* β cut-off *)
        return v
    else
        v := +∞
        for each child of node
            v := min(v, alphabeta(child, depth - 1, α, β, TRUE))
            β := min(β, v)
            if β ≤ α
                break (* α cut-off *)
        return v

Initial call: alphabeta(origin, depth, -∞, +∞, TRUE)
source: http://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning
33
Example with tic-tac-toe
[Figure: tic-tac-toe game tree with a MAX level on top and a MIN level below]
source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
34
Example with tic-tac-toe
[Figure: at the MIN node the first leaf evaluates to e(n) = 2, so β = 2 (α = -∞) and the node's value ≤ 2]
source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
35
Example with tic-tac-toe
[Figure: the second leaf evaluates to e(n) = 1, so β drops from 2 to 1 and the MIN node's value ≤ 1]
source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
36
Example with tic-tac-toe
[Figure: the MIN node's value 1 is backed up to the MAX root: α = 1, β = +∞, so the root's value ≥ 1]
source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
37
Example with tic-tac-toe
[Figure: the root has α = 1, β = +∞; the first leaf of the right MIN child gives it β = -1]
source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
38
Example with tic-tac-toe
The root already guarantees a value ≥ 1 (α = 1), but the right MIN
child can deliver at most -1 (its β = -1): the bounds are
incompatible, so stop searching the right branch;
the value cannot come from there!
General rule: if a child's β ≤ an ancestor's α, stop the search (prune)
source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
39
Alpha-Beta Pruning: Example 2
[Figure: the Wikipedia alpha-beta example tree (levels MAX-MIN-MAX-MIN over the leaves 5, 6, 7, 4, 3, 6, 6, 7, 5); the backed-up root value for MAX is 6, with pruned branches marked ×]
source: http://en.wikipedia.org/wiki/File:AB_pruning.svg
40
Alpha-Beta Pruning: Example 3
[Figure: the pruning shown in three steps, Step 1 through Step 3]
43
Efficiency of Alpha-Beta Pruning
Depends on the order in which siblings are visited
In the worst case:
alpha-beta provides no pruning
In the best case:
the effective branching factor is reduced to its square root:
O(b^(d/2)) instead of O(b^d) leaf evaluations, so roughly
twice the depth can be searched with the same effort
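The effect of sibling ordering can be seen by counting leaf evaluations (a sketch; the depth-2 tree, a MAX root over three MIN nodes encoded as nested lists, and the leaf values are illustrative):

```python
import math

def ab(node, alpha, beta, maximizing, counter):
    """Alpha-beta over nested lists of leaf values; counts evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1                      # leaf evaluation
        return node
    v = -math.inf if maximizing else math.inf
    for child in node:
        w = ab(child, alpha, beta, not maximizing, counter)
        if maximizing:
            v = max(v, w)
            alpha = max(alpha, v)
        else:
            v = min(v, w)
            beta = min(beta, v)
        if beta <= alpha:
            break                            # cut-off
    return v

def count_leaves(tree):
    counter = [0]
    ab(tree, -math.inf, math.inf, True, counter)
    return counter[0]

best  = [[7, 8, 9], [1, 2, 3], [4, 5, 6]]   # strongest MIN child first
worst = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # strongest MIN child last
```

Both orderings back up the same root value (7), but the best ordering evaluates only 5 of the 9 leaves, while the worst ordering evaluates all 9 with no pruning at all.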
69
Alpha-Beta: Best ordering
best ordering:
1. children of MIN : smallest node first
2. children of MAX: largest node first
71
Today
State Space Search for Game Playing
MiniMax
Alpha-beta pruning
Stochastic Games
Where we are today
77
Backgammon
at a chance node: Expectiminimax(s) = ∑_r P(r) · Expectiminimax(Result(s, r))
r is a possible dice roll (or other random event)
P(r) is the probability of event r
Result(s, r) is the state reached from s when the chance event has outcome r
Note: very expensive due to the high branching factor!
See https://en.wikipedia.org/wiki/Expectiminimax
for the whole algorithm
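The recursion can be sketched with explicit chance nodes (an illustrative encoding, not the full Wikipedia algorithm: a number is a leaf utility, ('max', children) and ('min', children) are player nodes, and ('chance', [(probability, child), ...]) is a random event):

```python
def expectiminimax(node):
    if isinstance(node, (int, float)):       # leaf: utility value
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted sum over the random outcomes r
    return sum(p * expectiminimax(c) for p, c in children)
```

For example, a fair 50/50 event worth 2 or 4 has value 0.5·2 + 0.5·4 = 3, so a MAX player prefers it to a certain payoff of 2.5.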
Today
State Space Search for Game Playing
MiniMax
Alpha-beta pruning
Stochastic Games
Where we are today
86
1992-1994 - Checkers: Tinsley vs. Chinook
Marion Tinsley: world champion for over 40 years
VS
Chinook: developed by Jonathan Schaeffer, professor at the U. of Alberta
87
1997 - Othello: Murakami vs. Logistello
Takeshi Murakami
World Othello (aka Reversi) champion
VS
Logistello
developed by Michael Buro
runs on a standard PC
https://skatgame.net/mburo/log.html
(including source code)
88
1997- Chess: Kasparov vs. Deep Blue
Garry Kasparov
50 billion neurons
2 positions/sec
VS
Deep Blue
32 RISC processors
+ 256 VLSI chess engines
200,000,000 pos/sec
89
2003 - Chess: Kasparov vs. Deep Junior
Garry Kasparov
still 50 billion neurons
still 2 positions/sec
VS
Deep Junior
8 CPU, 8 GB RAM, Win 2000
2,000,000 pos/sec
Available at $100
90
2016 – Go: AlphaGo vs Lee Se-dol
https://www.theverge.com/2016/3/15/11213518/alphago-deepmind-go-match-5-result
91
2017 – AlphaGo Zero & AlphaZero
AlphaGo Zero learned the game by itself, without input from human games
It became better than all previous versions after 40 days of training
In the first three days, AlphaGo Zero played 4.9 million games against itself
See https://arxiv.org/abs/1904.01557
94
Today
State Space Search for Game Playing
MiniMax
Alpha-beta pruning
Stochastic games
Where we are today
95