Ch5 AdversarialSearch

The chapter discusses adversarial search and game playing. It covers optimal decisions in games, the minimax algorithm, alpha-beta pruning, and properties of minimax searches like completeness and complexity. Examples are provided for tic-tac-toe to illustrate minimax searches and heuristic evaluation functions for cutting off search trees.

Uploaded by

Countess Loly
Copyright © All Rights Reserved

Chapter 5:

Adversarial Search
(Game-Playing)

Ch5: Adversarial Search 1


Outline

❖ Games
❖ Optimal decisions in games
❖ Minimax algorithm
❖ Alpha-Beta Pruning (α-β pruning)
❖ Stochastic Games



Which Problems can we Solve?
❖ The task environments which are suitable for the search
algorithms we’ve looked at so far are:
 Fully observable
 Deterministic
 Sequential
 Static
 Discrete
 Single agent

❖ Here we will consider the situation where other agents are
also acting on (messing with) the world.



Games
❖ Multiagent environments:
 Cooperative
 Competitive (in which the agents’ goals are in conflict) ➔ adversarial search
problems ➔ these problems are known as games
❖ In mathematical game theory (a branch of economics), any multiagent environment
(either cooperative or competitive) is a game provided that the impact of
each agent on the others is significant
❖ In AI, games are usually what game theorists would call deterministic,
turn-taking, two-player, zero-sum games of perfect information.
 Zero-sum ➔ one player’s loss is the other’s gain.
 Perfect information ➔ both players have access to complete information about
the state of the game. No information is hidden from either player.
❖ Examples: chess, checkers, Connect 4, Othello, Go, tic-tac-toe, …



Features of these Games
❖ Fully observable: game state is visible to both players
❖ Deterministic: no element of chance
❖ Sequential: action taken now affects future choices
❖ Static: the world doesn’t change during deliberation
❖ Discrete: the game state can be represented exactly using a
finite representation
❖ Multiagent: two agents whose actions alternate; the utility
values at the end of the game are always equal and opposite
(e.g., +1 and –1)



Game problem formulation
❖ Two players: MAX and MIN
❖ MAX moves first and they take turns until the game is over
 Winner gets reward, loser gets penalty.
 “Zero sum” means the sum of the reward and the penalty is a constant.
❖ Formal definition as a search problem:
 Initial state: Set-up specified by the rules, e.g., initial board configuration of
chess.
 Player(s): Defines which player has the move in a state.
 Actions(s): Returns the set of legal moves in a state.
 Result(s,a): Transition model defines the result of a move.
 Terminal-Test(s): Is the game finished? True if finished, false otherwise.
 Utility function(s,p): Gives numerical value of terminal state s for player p.
▪ Utility values vary from game to game:
▪ E.g., win (+1), lose (-1), and draw (0) in tic-tac-toe.
▪ E.g., win (+1), lose (0), and draw (1/2) in chess ➔ constant sum.
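The formal definition above can be sketched in code. The following is a minimal illustration for tic-tac-toe, assuming MAX plays "X" and MIN plays "O"; the board representation (a tuple of nine cells) and all function names are choices made for this sketch, not part of the original formulation.

```python
from typing import List, Optional, Tuple

State = Tuple[str, ...]  # nine cells, each "X", "O", or " " (assumed representation)

# The eight winning lines: three rows, three columns, two diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

INITIAL_STATE: State = (" ",) * 9  # Initial state: the empty board


def player(s: State) -> str:
    """Player(s): MAX ("X") moves first, so X moves when the counts are equal."""
    return "X" if s.count("X") == s.count("O") else "O"


def actions(s: State) -> List[int]:
    """Actions(s): indices of the empty cells."""
    return [i for i, cell in enumerate(s) if cell == " "]


def result(s: State, a: int) -> State:
    """Result(s, a): the state after the player to move marks cell a."""
    return s[:a] + (player(s),) + s[a + 1:]


def winner(s: State) -> Optional[str]:
    """Return "X" or "O" if that player has completed a line, else None."""
    for i, j, k in LINES:
        if s[i] != " " and s[i] == s[j] == s[k]:
            return s[i]
    return None


def terminal_test(s: State) -> bool:
    """Terminal-Test(s): someone has won or the board is full."""
    return winner(s) is not None or " " not in s


def utility(s: State, p: str) -> int:
    """Utility(s, p): +1 for a win, -1 for a loss, 0 for a draw."""
    w = winner(s)
    if w is None:
        return 0
    return 1 if w == p else -1
```

Because the utilities for the two players always add up to 0 here, this version of tic-tac-toe is zero-sum in the strict sense.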



Game tree

❖ As in a search problem, the initial state, action set, and
transition model define a game tree for the game:
 a tree where the nodes are game states and the edges are
moves.

❖ We draw the tree for the two players, MAX and MIN,
where MAX moves first.

❖ The next slide gives a partial game tree for tic-tac-toe.



Partial game tree for tic-tac-toe

How do we search this tree to find the optimal move?
Terminal states are labeled depending on the winner
(MAX = +1, MIN = -1).
High values are good for MAX and bad for MIN.



Optimal strategies
❖ The key point is that we have to take into account what the other
player is doing.
❖ Rather than the simple path that is a solution in a search
problem, we need a contingent strategy, which specifies
 MAX’s move in the initial state,
 then MAX’s moves in the states resulting from every possible
response by MIN,
 then MAX’s moves in the states resulting from every possible
response by MIN to those moves
 …
❖ This gives us an optimal strategy in the sense that we do as
well as we can against an infallible opponent.



Minimax search
❖ A game tree one move deep (two half-moves, i.e., 2 ply):



Minimax search (Cont’d)
❖ Given a game tree, we determine the optimal strategy by
establishing the minimax value of each node s, which is the utility (for
MAX) of being in the corresponding state,
assuming that both players play optimally from there to the end of the game.
❖ How do we do this?
 The minimax value of a terminal state is just its utility.
 Assume our utility function gives terminal nodes high positive values if
they are good for MAX
 and low values if they are good for MIN.
 Now, look at the leaf nodes and consider which ones MAX wants:
▪ Ones with high values.
 MAX could choose these nodes if it were MAX’s turn to play.
 So, the value of the MAX-node parent of a set of nodes is the max of all
the child values.

Minimax search (Cont’d)

 Similarly, when MIN plays, MIN wants the node with the lowest value.
 So the MIN-node parent of a set of nodes gets the min of all their
values.
❖ That is, given a choice, MAX prefers to move to a state of maximum
value, whereas MIN prefers a state of minimum value.
❖ We back up values until we get to the children of the start node, and
MAX can use this to decide which node to choose.
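The backing-up rule can be written as one recursive definition. The following is a sketch using the function names of the game formulation (PLAYER, ACTIONS, RESULT, TERMINAL-TEST, UTILITY); the notation MINIMAX(s) for the minimax value of state s is an assumption of this sketch.

```latex
\mathrm{MINIMAX}(s) =
\begin{cases}
\mathrm{UTILITY}(s, \mathrm{MAX}) & \text{if } \mathrm{TERMINAL\text{-}TEST}(s) \\
\max_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MAX} \\
\min_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MIN}
\end{cases}
```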



Minimax algorithm

Designed to find the optimal strategy for Max and find best move:

1. Generate the whole game tree, down to the leaves.

2. Apply utility (payoff) function to each leaf.

3. Back up values from the leaves through the branch nodes:
a MAX node computes the max of its child values;
a MIN node computes the min of its child values.

4. At root: choose the move leading to the child of highest value.



Minimax search (Cont’d)
MINIMAX(B) = min(3,12,8) =3
MINIMAX(C) = min(2,4,6) =2
MINIMAX(D) = min(14,5,2) =2



Minimax search (Cont’d)
MINIMAX(root) = max(min(3,12,8), min(2,4,6), min(14,5,2)) = max(3,2,2) =3

❖ There’s an algorithm for this.


Minimax algorithm
❖ Recursive Depth First Search:
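A minimal sketch of minimax as a recursive depth-first search, assuming the game tree is given explicitly as nested lists (a leaf is a utility value, an inner node is a list of children). The tree encoding and the names `minimax` and `best_move` are illustrative assumptions. The sample tree is the one with MIN nodes B, C, and D used in the worked example.

```python
from typing import List, Union

# Assumed encoding: a leaf is an int utility, an inner node is a list of subtrees.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, is_max: bool) -> int:
    """Return the minimax value of node; is_max says whose turn it is there."""
    if isinstance(node, int):          # terminal state: its value is its utility
        return node
    # Depth-first: evaluate every child with the other player to move.
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)


def best_move(root: List[Tree]) -> int:
    """Index of the root child with the highest backed-up value (MAX to move)."""
    values = [minimax(child, False) for child in root]
    return values.index(max(values))


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # MIN nodes B, C, D
print(minimax(tree, True))   # 3
print(best_move(tree))       # 0, i.e. the move leading to B
```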



Properties of minimax
❖ Complete?
 Yes (if tree is finite)
❖ Optimal?
 Yes (against an optimal opponent)
❖ Time complexity?
 O(b^m), where b is the branching factor and m is the maximum depth of the tree
❖ Space complexity?
 O(bm) (depth-first exploration)

❖ For chess, b ≈ 35, m ≈ 100 for "reasonable" games
 exact solution completely infeasible
❖ It is usually impossible to develop the whole search tree.
 Moves must be made in a reasonable amount of time
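To see why an exact solution is infeasible, here is a short worked calculation. With branching factor b and maximum depth m, a full minimax search visits on the order of b^m nodes, while depth-first exploration stores on the order of b·m. The variable names are chosen for this sketch only.

```python
import math

b, m = 35, 100                      # rough chess figures quoted above

time_nodes = b ** m                 # nodes visited by full minimax: b^m
space_nodes = b * m                 # nodes kept in memory by depth-first search: b*m

print(len(str(time_nodes)))         # 155 digits, i.e. about 10^154 nodes
print(round(m * math.log10(b), 1))  # 154.4, the base-10 exponent
print(space_nodes)                  # 3500
```

The memory requirement is tiny; it is the running time that rules out searching the whole tree.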
Solution to the complexity problem

❖ Two solutions:
 Early cutoff of the search tree
▪ Depth-limited minimax search (MINIMAXcutoff).
 Dynamic pruning of redundant branches of the search tree
▪ Procedure: alpha-beta pruning



Cutting off search
❖ Idea:
 Cut off the search tree before the terminal state is reached.
❖ Problem:
 Utility is defined only for terminal states.
❖ Solution:
 Apply a heuristic evaluation function to states in the search,
▪ which estimates the utility of the position.

❖ MinimaxCutoff search is identical to Minimax search except that
1. TERMINAL-TEST(s) is replaced by CUTOFF-TEST(s)
2. UTILITY(s) is replaced by EVAL(s)
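A minimal sketch of the two replacements, assuming an explicit nested-list game tree and a depth limit as the cutoff test. The function `average_of_leaves` is a hypothetical stand-in for EVAL: it peeks at the leaves below a node, which a real evaluation function cannot do, and is used here only so that the example is self-contained.

```python
from typing import Callable, List, Union

Tree = Union[int, List["Tree"]]   # assumed encoding: leaf = utility, list = children


def minimax_cutoff(node: Tree, depth: int, is_max: bool, limit: int,
                   evaluate: Callable[[Tree], float]) -> float:
    """Depth-limited minimax: estimate with evaluate() once depth reaches limit."""
    if isinstance(node, int):      # a true terminal state still gets its exact utility
        return node
    if depth >= limit:             # CUTOFF-TEST(s): stop here and return EVAL(s)
        return evaluate(node)
    values = [minimax_cutoff(child, depth + 1, not is_max, limit, evaluate)
              for child in node]
    return max(values) if is_max else min(values)


def average_of_leaves(node: Tree) -> float:
    """Hypothetical EVAL: the mean of all leaf utilities below node."""
    def leaves(n: Tree) -> List[int]:
        if isinstance(n, int):
            return [n]
        return [x for child in n for x in leaves(child)]
    values = leaves(node)
    return sum(values) / len(values)


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
full = minimax_cutoff(tree, 0, True, 2, average_of_leaves)   # limit below the leaves
cut = minimax_cutoff(tree, 0, True, 1, average_of_leaves)    # cut off at depth 1
print(full)  # 3, the exact minimax value
print(cut)   # about 7.67, an estimate that differs from the exact value
```

The gap between the two results shows that the quality of the decision now depends on how well the evaluation function estimates the true utility.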



Example—Tic-tac-toe.
❖ The evaluation function heuristic
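One commonly used evaluation function for tic-tac-toe, assumed here purely for illustration and not necessarily the one intended above, is the number of lines (rows, columns, diagonals) still open for MAX minus the number still open for MIN. The board encoding (a 9-character string) and the function names are assumptions of this sketch.

```python
# The eight lines of the board, as index triples into a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def open_lines(board: str, player: str) -> int:
    """Count the lines that contain no mark of player's opponent."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def evaluate(board: str) -> int:
    """Assumed heuristic: lines open for MAX ("X") minus lines open for MIN ("O")."""
    return open_lines(board, "X") - open_lines(board, "O")


center = " " * 4 + "X" + " " * 4   # X in the middle cell
corner = "X" + " " * 8             # X in a corner
edge = " " + "X" + " " * 7         # X in the middle of a side
print(evaluate(center), evaluate(corner), evaluate(edge))  # 4 3 2
```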



Example—Tic-tac-toe. (Cont’d)



Example—Tic-tac-toe. (Cont’d)
❖ Unsurprisingly (for anyone who has ever played tic-tac-toe), the move shown is the best move.
❖ So MAX moves and then MIN replies, and then MAX
searches again:



Example—Tic-tac-toe. (Cont’d)

• Here there are two equally good best moves.
• So we can break the tie randomly.
• Then we let MIN move and do the search again.



Example—Tic-tac-toe. (Cont’d)

And so on.



α-β pruning
❖ It is possible to compute the correct minimax decision without looking
at every node in the game tree
❖ Example: do a depth-first search until the first leaf, tracking the range of possible values at each node (B, C, D are the MIN nodes below the root, in the order they are searched).

Step   Root (MAX)   B          C          D
1      [-∞,+∞]      [-∞,+∞]
2      [-∞,+∞]      [-∞,3]
3      [-∞,+∞]      [-∞,3]
4      [3,+∞]       [3,3]
5      [3,+∞]       [3,3]      [-∞,2]
6      [3,14]       [3,3]      [-∞,2]     [-∞,14]
7      [3,5]        [3,3]      [-∞,2]     [-∞,5]
8      [3,3]        [3,3]      [-∞,2]     [2,2]
9      [3,3]        [3,3]      [-∞,2]     [2,2]

Step 5: this node (C) is worse for MAX.


General alpha-beta pruning

❖ α is the value of the best (i.e., highest-value) choice found so far
at any choice point along the path for MAX
❖ If v is worse than α (α > v), MAX will avoid it
 prune that branch
❖ Define β similarly for MIN

[Figure labels: alternating Player and Opponent levels; nodes m and n; value v]



The α-β algorithm
❖ Depth-first search
– only considers nodes along a single path from the root at any time

α = highest-value choice found at any choice point of path for MAX
(initially, α = −infinity)
β = lowest-value choice found at any choice point of path for MIN
(initially, β = +infinity)

❖ Pass current values of α and β down to child nodes during search.
❖ Update values of α and β during search:
➢ MAX updates α at MAX nodes
➢ MIN updates β at MIN nodes
❖ Prune remaining branches at a node when α ≥ β
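The rules above can be sketched as a short recursive function. This is a minimal illustration, assuming the game tree is given as nested lists (a leaf is a utility value, an inner node is a list of children); the encoding and the name `alphabeta` are assumptions of the sketch.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]   # assumed encoding: leaf = utility, list = children


def alphabeta(node: Tree, is_max: bool,
              alpha: float = -math.inf, beta: float = math.inf) -> float:
    """Minimax value of node, skipping branches that cannot affect the result."""
    if isinstance(node, int):
        return node
    if is_max:
        value = -math.inf
        for child in node:
            # Current alpha and beta are passed down to the child.
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)      # MAX updates alpha at MAX nodes
            if alpha >= beta:              # prune the remaining children
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)            # MIN updates beta at MIN nodes
        if alpha >= beta:                  # prune the remaining children
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))  # 3
```

On this tree the second MIN node is abandoned after its first leaf (2), because by then α = 3 ≥ β = 2.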



α-β Example Revisited

Do a depth-first search until the first leaf. Each step lists (α, β) at the root and at the MIN node currently being searched (B, C, D, in the order they are searched).

Step   Root (α, β)   MIN node (α, β)   What happens
1      (−∞, +∞)      B: (−∞, +∞)       Initial values of α, β at the root are passed to kids.
2      (−∞, +∞)      B: (−∞, 3)        MIN updates β, based on kids.
3      (−∞, +∞)      B: (−∞, 3)        MIN updates β, based on kids. No change.
4      (3, +∞)                         3 is returned as node value. MAX updates α, based on kids.
5      (3, +∞)       C: (3, +∞)        α, β passed to kids.
6      (3, +∞)       C: (3, 2)         MIN updates β, based on kids.
7      (3, +∞)       C: (3, 2)         α ≥ β, so prune.
8      (3, +∞)                         2 is returned as node value. MAX updates α, based on kids. No change.
9      (3, +∞)       D: (3, +∞)        α, β passed to kids.
10     (3, +∞)       D: (3, 14)        MIN updates β, based on kids.
11     (3, +∞)       D: (3, 5)         MIN updates β, based on kids.
12     (3, +∞)                         2 is returned as node value.

MAX calculates the same node value, and makes the same move!


The α-β algorithm



Final Comments about Alpha-Beta Pruning
❖ Pruning does not affect final results

❖ Entire subtrees can be pruned.

❖ Good move ordering improves effectiveness of pruning

❖ Repeated states are again possible.
 Store them in memory = transposition table
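The effect of move ordering can be checked by counting the leaves that alpha-beta actually evaluates. The sketch below uses a hypothetical three-level tree (MAX, MIN, MAX) with eight leaves, chosen for illustration, together with its mirror image; the names `alphabeta_count` and `mirror` are assumptions of the sketch.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]   # assumed encoding: leaf = utility, list = children


def alphabeta_count(node: Tree, is_max: bool, alpha: float, beta: float,
                    visited: List[int]) -> float:
    """Alpha-beta search that records every leaf it evaluates in visited."""
    if isinstance(node, int):
        visited.append(node)
        return node
    value = -math.inf if is_max else math.inf
    for child in node:
        child_value = alphabeta_count(child, not is_max, alpha, beta, visited)
        if is_max:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:          # the remaining children are pruned
            break
    return value


def mirror(node: Tree) -> Tree:
    """Reverse the order of the children at every level of the tree."""
    return node if isinstance(node, int) else [mirror(c) for c in reversed(node)]


worst_order = [[[3, 4], [1, 2]], [[7, 8], [5, 6]]]   # best moves explored last
best_order = mirror(worst_order)                     # best moves explored first

for name, tree in [("worst order", worst_order), ("best order", best_order)]:
    visited: List[int] = []
    value = alphabeta_count(tree, True, -math.inf, math.inf, visited)
    print(name, value, len(visited))
# worst order 6 8   (no leaf is pruned)
# best order 6 5    (three of the eight leaves are pruned)
```

Both orderings return the same value, which illustrates that pruning does not affect the final result; only the amount of work changes.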



Example 1
❖ Which nodes can be pruned?

5 6
3 4 1 2 7 8



Example 1 (Cont’d)
Answer: NONE! Because the most favorable nodes for both players
are explored last (i.e., in the diagram, they are on the right-hand
side).
(Tree levels, from the root down: Max, Min, Max.)

5 6
3 4 1 2 7 8



Example 2: the exact mirror image of Example 1

❖ Which nodes can be pruned?

3 4
6 5 8 7 2 1



Example 2 (Cont’d)
Answer: LOTS! Because the most favorable nodes for both players are
explored first (i.e., in the diagram, they are on the left-hand
side).
(Tree levels, from the root down: Max, Min, Max.)

3 4
6 5 8 7 2 1
