
Artificial Intelligence:

Adversarial Search

1
Motivation

[images: Go, chess, tic-tac-toe]

2
Today
 State Space Search for Game Playing
 MiniMax
 Alpha-beta pruning
 Stochastic Games
 Where we are today

3
Adversarial Search

 A classical application of heuristic search

 simple games: exhaustively searchable
 complex games: only partial search possible
 additional problem: playing against an opponent
 Here, we look at 2-player adversarial games
 win, lose, or tie

5
Types of Games

 Perfect Information
 A game with perfect information is one in which agents can see the complete board. Agents have all the information about the game and can also observe each other's moves.
 Examples: Chess, Checkers, Go, etc.

 Imperfect Information
 Game state is only partially observable; the opponent's choices are hidden
 Examples: Battleship, Stratego, many card games, etc.

6
Types of Games (II)

 Deterministic games
 No element of chance (e.g., no dice rolls)
 Examples: Chess, Tic-Tac-Toe, Go, etc.

 Non-deterministic games
 Games with unpredictable (random) events (involving chance or luck)
 Examples: Backgammon, Monopoly, Poker, etc.

7
Types of Games (III)
 Zero-Sum Game
 If the total gains of the players are added up and the
total losses are subtracted, they will sum to zero
(example: cutting a cake)
 A gain by one player must be matched by a loss by the
other player
 One player tries to maximize a single value, the other
player tries to minimize it
 Examples: Checkers, Chess, etc.
 Non-Zero-Sum Game
 Win-Win or Lose-Lose type games
 Famous example: The Prisoner’s Dilemma
https://en.wikipedia.org/wiki/Prisoner%27s_dilemma

8
Today
 State Space Search for Game Playing
 MiniMax
 Alpha-beta pruning
 Stochastic games
 Where we are today

9
Example: Game of Nim
 Rules
 2 players start with a pile of tokens
 move: split any existing pile into two non-empty,
differently-sized piles
 the game ends when no pile can be split unevenly
 the player who cannot make a move loses (a successor-generator sketch follows below)

→ Worksheet #2 (“Game of Nim”)

10
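As a minimal sketch (not from the slides; representing a state as a sorted tuple of pile sizes is my own assumption), the move rule can be written as a successor generator in Python:

    def nim_successors(piles):
        # All states reachable in one move: split one pile into two
        # non-empty piles of different sizes.
        for i, pile in enumerate(piles):
            rest = piles[:i] + piles[i + 1:]
            # first < pile - first guarantees the two new piles differ in size
            for first in range(1, (pile + 1) // 2):
                yield tuple(sorted(rest + (first, pile - first)))

    # The player to move loses when no pile can be split,
    # i.e., when every remaining pile has size 1 or 2.
    print(list(nim_successors((7,))))   # [(1, 6), (2, 5), (3, 4)]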
State Space of Game Nim
 start with one pile of tokens
 each move divides one pile of tokens into 2 non-empty
piles of different sizes
 the player without a move left loses the game

source: G. Luger (2005)


11
MiniMax Search
 Game between two opponents, MIN and MAX
 MAX tries to win, and
 MIN tries to minimize MAX’s score
 Existing heuristic search methods do not work
 would require a helpful opponent
 Need to incorporate “hostile” moves into search strategy

12
Exhaustive MiniMax Search
 For small games where exhaustive search is feasible
 Procedure:
1. build complete game tree
2. label each level according to player’s turn (MAX or MIN)
3. label leaves with a utility function to determine the outcome
of the game
 e.g., (0, 1) or (-1, 0, 1)
4. propagate this value up:
 if parent=MAX, give it max value of children
 if parent=MIN, give it min value of children
5. select the best next move for the player at the root: the move
leading to the child with the highest value (for MAX) or lowest
value (for MIN); a Python sketch follows below

13
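A minimal Python sketch of this procedure, assuming hypothetical helpers is_terminal(s), utility(s) (the leaf value from MAX's point of view, e.g., -1, 0, or 1), and successors(s); these names are not from the slides:

    def minimax(state, is_max):
        # Steps 1-4: recursively expand the game tree and back values up.
        if is_terminal(state):
            return utility(state)
        values = [minimax(s, not is_max) for s in successors(state)]
        return max(values) if is_max else min(values)

    def best_move(state, is_max=True):
        # Step 5: choose the child with the highest (MAX) or lowest (MIN) value.
        pick = max if is_max else min
        return pick(successors(state), key=lambda s: minimax(s, not is_max))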
Exhaustive MiniMax for Nim

Bold lines indicate a forced win for MAX

source: G. Luger (2005)


14
n-ply MiniMax with Heuristic
 Exhaustive search for interesting games is rarely
feasible
 Search only to a predefined level
 called n-ply look-ahead
 n is the number of levels
 No exhaustive search
 nodes at the cutoff are evaluated with a heuristic, not a true win/loss value
 the heuristic indicates the best state that can be reached
 the fixed cutoff causes the horizon effect: consequences just beyond the search depth stay invisible
 Games with an opponent
 simple strategy: try to maximize the difference between the
players, using a heuristic function e(n) (a depth-limited sketch follows below)

15
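A sketch of the depth-limited variant, under the same assumptions as the minimax sketch above plus a heuristic function e(s); only the cutoff test changes:

    def minimax_n_ply(state, depth, is_max):
        if is_terminal(state):
            return utility(state)       # true win/loss value
        if depth == 0:                  # horizon reached after n plies
            return e(state)             # heuristic estimate for MAX
        values = [minimax_n_ply(s, depth - 1, not is_max)
                  for s in successors(state)]
        return max(values) if is_max else min(values)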
Heuristic Function for 2-player games

 simple strategy:
 try to maximize the difference between MAX's game and MIN's
game, using a heuristic typically called e(n)

 e(n) is a heuristic that estimates how favorable a node n is for MAX
 e(n) > 0 → n is favorable to MAX
 e(n) < 0 → n is favorable to MIN
 e(n) = 0 → n is neutral

16
Choosing a Heuristic Function e(n)

17
MiniMax with Fixed Ply Depth

Leaf nodes show the actual heuristic value e(n)


→ Worksheet #2 (“MiniMax”)
source: G. Luger (2005)
18
Example: e(n) for Tic-Tac-Toe
 Possible e(n):
e(n) = (number of rows, columns, and diagonals open for MAX)
− (number of rows, columns, and diagonals open for MIN)
e(n) = +∞, if n is a forced win for MAX
e(n) = −∞, if n is a forced win for MIN

Example boards: e(n) = 8 − 8 = 0,  e(n) = 6 − 4 = 2,  e(n) = 3 − 3 = 0 (an implementation sketch follows below)

→ Worksheet #2 (“MiniMax Heuristic for Tic-Tac-Toe”)

23
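A small Python sketch of this heuristic; the board encoding (a 9-element list in row-major order, 'X' for MAX, 'O' for MIN, ' ' for empty) is my own assumption, and the ±∞ forced-win cases are omitted:

    # The 8 winning lines of a 3x3 board: 3 rows, 3 columns, 2 diagonals.
    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

    def e(board):
        # A line is "open" for a player if the opponent has no mark on it.
        def open_for(player):
            return sum(all(board[i] in (player, ' ') for i in line)
                       for line in LINES)
        return open_for('X') - open_for('O')

    print(e([' '] * 9))            # 8 - 8 = 0 (empty board)
    print(e([' ', 'O', ' ',
             ' ', 'X', ' ',
             ' ', ' ', ' ']))      # 6 - 4 = 2, as in the slide's example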
Two-ply MiniMax for Opening Move

Tic-Tac-Toe tree at horizon = 2

source: G. Luger (2005)


25
Two-ply MiniMax: MAX’s possible 2nd moves

source: G. Luger (2005)


26
Two-ply MiniMax: MAX’s move at the end?

→ Worksheet #2 (“Two-ply MiniMax”)


source: G. Luger (2005)
28
Today
 State Space Search for Game Playing
 MiniMax
 Alpha-beta pruning
 Stochastic games
 Where we are today

29
Alpha-Beta Pruning
 An optimization over MiniMax that:
 ignores (cuts off, prunes) branches of the tree
that cannot possibly lead to a better solution
 reduces branching factor
 allows deeper search with same effort

30
Alpha-Beta Pruning: Example 1
 With minimax, we look at all possible nodes at the n-ply depth
 With α-β pruning, we ignore branches that could not possibly
contribute to the final decision

A = min(3, max(5, ?))    C = max(3, min(0, ?), min(2, ?))

 B will be ≥ 5, so we can ignore B’s right branch, because A must be 3
 D will be ≤ 0, but C will be ≥ 3, so we can ignore D’s right branch
 E will be ≤ 2, so we can ignore E’s right branch, because C will be 3

source: G. Luger (2005)


31
Alpha-Beta Pruning Algorithm
 α : lower bound on the final backed-up value
 β : upper bound on the final backed-up value
 Alpha pruning:
 e.g., if a MAX node's α = 6 (value ≥ 6), the search can prune branches
from a MIN descendant that has β ≤ 6 (value ≤ 6)
 rule: if child β ≤ ancestor α → prune; the bounds are incompatible,
so stop searching that branch: the value cannot come from there
 Beta pruning:
 e.g., if a MIN node's β = 6 (value ≤ 6), the search can prune branches
from a MAX descendant that has α ≥ 6 (value ≥ 6)
 rule: if ancestor β ≤ child α → prune

32
Alpha-Beta Pruning Algorithm
function alphabeta(node, depth, α, β, maximizingPlayer)
    if depth = 0 or node is a terminal node
        return the heuristic value of node
    if maximizingPlayer
        v := -∞
        for each child of node
            v := max(v, alphabeta(child, depth - 1, α, β, FALSE))
            α := max(α, v)
            if β ≤ α
                break (* β cut-off *)
        return v
    else
        v := +∞
        for each child of node
            v := min(v, alphabeta(child, depth - 1, α, β, TRUE))
            β := min(β, v)
            if β ≤ α
                break (* α cut-off *)
        return v

Initial call: alphabeta(origin, depth, -∞, +∞, TRUE)

source: http://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning
33
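A direct Python transcription of the pseudocode above; is_terminal(node), children(node), and the heuristic h(node) are assumed helpers, not part of the source:

    import math

    def alphabeta(node, depth, alpha, beta, maximizing_player):
        if depth == 0 or is_terminal(node):
            return h(node)                  # heuristic value of node
        if maximizing_player:
            v = -math.inf
            for child in children(node):
                v = max(v, alphabeta(child, depth - 1, alpha, beta, False))
                alpha = max(alpha, v)
                if beta <= alpha:
                    break                   # beta cut-off
            return v
        else:
            v = math.inf
            for child in children(node):
                v = min(v, alphabeta(child, depth - 1, alpha, beta, True))
                beta = min(beta, v)
                if beta <= alpha:
                    break                   # alpha cut-off
            return v

    # Initial call: alphabeta(origin, depth, -math.inf, math.inf, True)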
Example with tic-tac-toe

max level

min level

source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
34
Example with tic-tac-toe

[figure: after the first leaf is evaluated, e(n) = 2, the MIN node has α = −∞, β = 2, so its value ≤ 2]

source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
35
Example with tic-tac-toe

[figure: the second leaf, e(n) = 1, tightens the MIN node's bound to β = 1, so its value ≤ 1]
source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
36
Example with tic-tac-toe
[figure: the MIN node's value resolves to 1 = min(2, 1); the MAX node above now has α = 1, β = +∞, so its value ≥ 1]

source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
37
Example with tic-tac-toe
[figure: the next MIN node evaluates a leaf with e(n) = −1, giving it α = −∞, β = −1 (value ≤ −1), while the MAX node above still has α = 1 (value ≥ 1)]

source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
38
Example with tic-tac-toe
[figure: child β = −1 ≤ ancestor α = 1, so the bounds are incompatible; stop searching the MIN node's remaining branch, because the value cannot come from there]

source: robotics.stanford.edu/~latombe/cs121/2003/home.htm
39
Alpha-Beta Pruning: Example 2

[figure: worked alpha-beta example on a four-level Max/Min tree; leaf values 5, 6, 7, 4, 3, 6, 6, 7, 5; bounds (≥, ≤) are backed up level by level, pruned branches are marked ×, and the root's final value is 6]

source: http://en.wikipedia.org/wiki/File:AB_pruning.svg
40
Alpha-Beta Pruning: Example 2

source: http://en.wikipedia.org/wiki/File:AB_pruning.svg
41
Alpha-Beta Pruning: Example 3

[figure: alpha-beta pruning exercise worked in three steps (Step 1 through Step 3)]

→ Worksheet #2 (“Alpha-Beta Pruning”)

43
Efficiency of Alpha-Beta Pruning
 Depends on the order of the siblings

 In worst case:
 alpha-beta provides no pruning
 In best case:
 the effective branching factor is reduced to its square root, so the
same effort searches roughly twice as deep

69
Alpha-Beta: Best ordering

Original (arbitrary) game tree

Best ordering for alpha-beta

70
Alpha-Beta: Best ordering
 best ordering:
1. children of MIN: smallest value first
2. children of MAX: largest value first
(a small move-ordering sketch in Python follows below)

71
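As a rough illustration (my own sketch, reusing the assumed children and h helpers from the alpha-beta transcription above), move ordering can be added by sorting children with the static evaluator before recursing:

    def ordered_children(node, maximizing):
        # Examine the most promising moves first: largest h value first
        # at MAX nodes, smallest first at MIN nodes.
        return sorted(children(node), key=h, reverse=maximizing)

Iterating over ordered_children(node, maximizing_player) instead of children(node) inside alphabeta approaches the best case whenever h ranks the moves well.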
Alpha-Beta: Best ordering

8 nodes explored out of 27

76
Today
 State Space Search for Game Playing
 MiniMax
 Alpha-beta pruning
 Stochastic Games
 Where we are today

77
Backgammon

source: Russel & Norvig (2010)


Stochastic (Non-Deterministic) Games
 Search tree for games of chance
 white can calculate its own legal moves
 but it does not know what black will roll...
 Idea: add chance nodes to the search tree
 branches indicate possible dice rolls
 each branch is labeled with the roll and its probability (e.g., 1/6 for a single die roll)


Search Tree for Backgammon
EXPECTIMINIMAX Algorithm
 Calculating EXPECTIMINIMAX
 Like MiniMax, but Chance nodes take the probability-weighted sum of
their children's values:

Expectiminimax(s) = ∑_r P(r) · Expectiminimax(Result(s, r))

 r is a possible dice roll (or other random event)
 P(r) is the probability of the event
 Result(s, r) is the state s with dice-roll result r
 Note: very expensive due to the high branching factor!
 See https://en.wikipedia.org/wiki/Expectiminimax for the whole algorithm
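A rough Python sketch of the recursion; node_type, is_terminal, utility, successors, and outcomes are assumed helpers (not from the slides), where outcomes(s) yields (roll, probability, resulting state) triples:

    def expectiminimax(state):
        if is_terminal(state):
            return utility(state)
        kind = node_type(state)             # 'max', 'min', or 'chance'
        if kind == 'chance':
            # Probability-weighted sum over all random events r,
            # e.g., P(r) = 1/6 for each face of a single die.
            return sum(p * expectiminimax(result)
                       for r, p, result in outcomes(state))
        values = [expectiminimax(s) for s in successors(state)]
        return max(values) if kind == 'max' else min(values)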
Today
 State Space Search for Game Playing
 MiniMax
 Alpha-beta pruning
 Stochastic Games
 Where we are today

86
1992-1994 - Checkers: Tinsley vs. Chinook

Marion Tinsley
World champion for over 40 years

VS

Chinook
Developed by Jonathan Schaeffer, professor at the U. of Alberta

1992: Tinsley beat Chinook 4 games to 2, with 33 draws
1994: 6 draws (the match ended early when Tinsley withdrew for health reasons)

In 2007, Schaeffer announced that checkers was solved: anyone playing against Chinook can at best draw, never win.

Play against Chinook: http://games.cs.ualberta.ca/cgi-bin/player.cgi?nodemo

87
1997 - Othello: Murakami vs. Logistello

Takeshi Murakami
World Othello (aka Reversi) champion

VS

Logistello
developed by Michael Buro
runs on a standard PC
https://skatgame.net/mburo/log.html
(including source code)

Logistello beat Murakami by 6 games to 0

88
1997- Chess: Kasparov vs. Deep Blue
Garry Kasparov
50 billion neurons
2 positions/sec
VS
Deep Blue
32 RISC processors
+ 256 VLSI chess engines
200,000,000 pos/sec

Deep Blue wins by 2 wins, 1 loss, and 3 draws (3.5 to 2.5)

89
2003 - Chess: Kasparov vs. Deep Junior

Garry Kasparov
still 50 billion neurons
still 2 positions/sec

VS

Deep Junior
8 CPUs, 8 GB RAM, Win 2000
2,000,000 pos/sec
Available for $100

Match ends in a 3-3 tie!

90
2016 – Go: AlphaGo vs Lee Sedol

 Go was always considered a much harder game to automate than
chess because of its very high branching factor (about 35 for chess
vs. 250 for Go!)
 In 2016, AlphaGo beat Lee Sedol 4 games to 1 in a
five-game match of Go.
 In 2017, AlphaGo beat Ke Jie, the
world No. 1 ranked player at the time

 AlphaGo uses a Monte Carlo tree search algorithm to find its moves,
based on knowledge previously "learned" by deep learning

https://www.theverge.com/2016/3/15/11213518/alphago-deepmind-go-match-5-result
91
2017 – AlphaGo Zero & AlphaZero
AlphaGo Zero learned the game by itself, without input from human
games
 Became better than all older versions after 40 days of training
 In the first three days, AlphaGo Zero played 4.9 million games
against itself using reinforcement learning

AlphaZero can learn other games, like Chess and Shogi
 In 2018, it beat the then-best chess program, Stockfish 8,
in a 100-game tournament
 Trained using 5,000 tensor processing units (TPUs); during
matches it ran on four TPUs and a 44-core CPU

92
2018 – AlphaZero vs Stockfish 8

Game commentary: https://www.youtube.com/watch?v=nPexHaFL1uo


93
2019 – Deep learning to answer math questions
Ongoing work on solving other problems with a general AI
 In 2019, Google engineers published their work on training a
neural network system to answer math questions, like:

What is the sum of 1+1+1+1+1+1+1?

 The system's answer? 6
 But it did correctly solve 14 out of 40 questions on a standard test

See https://arxiv.org/abs/1904.01557

94
Today
 State Space Search for Game Playing
 MiniMax
 Alpha-beta pruning
 Stochastic games
 Where we are today

95
