
UNIT II Adversarial Search


Adversarial Search

Adversarial Search
• Competitive environments, in which two or more agents have conflicting goals, give rise to adversarial search problems.
• Adversarial search explores environments where other agents are plotting against us.
• Example: game problems

• Minimax search
• Alpha-Beta pruning
Game Theory
• Adversarial agents are part of the environment, a part that makes the environment nondeterministic.
• Games require different search procedures.
• Basically, they are based on the generate-and-test philosophy.
• The generator generates individual moves in the search space, each of which is then evaluated by the tester, and the most promising one is chosen.
• Game playing is the most practical and direct application of the heuristic search problem-solving paradigm.
• It is clear that to improve the effectiveness of a search for problem solving
programs, there are two things that can be done:
 Improve the generate procedure so that only good moves (paths) are
generated.
 Improve the test procedure so that the best moves (paths) will be recognized
and explored first.
Two-player zero-sum games
• Let us consider only two-player, discrete, perfect-information games, such as tic-tac-toe, chess, checkers, etc.
• Discrete because they contain a finite number of states or configurations.
• Perfect-information because both players have access to the same information about the game in progress (card games are not perfect-information games).
• “Zero-sum”: what is good for one player is just as bad for the other; there is no “win-win” outcome.
• Two-player games are easier to imagine and reason about, and more common to play.
• A typical characteristic of these games is to ‘look ahead’ at future positions in order to succeed.
Correspondence with State Space

• There is a natural correspondence between such games and state space problems.
• For example:

  State Space    Game Problem
  states         legal board positions
  operators      legal moves
  goal           winning positions
• A game can be formally defined by: the initial state S0; TO-MOVE(s), the player whose turn it is in state s; ACTIONS(s), the set of legal moves in s; RESULT(s, a), the transition model; IS-TERMINAL(s), the terminal test; and UTILITY(s, p), the utility of terminal state s for player p.
• Game tree: a search tree that follows every sequence of moves all the way to a terminal state.
• The game tree may be infinite if the state space itself is unbounded or if the rules of the game allow for infinitely repeating positions.

• A (partial) game tree for the game of tic-tac-toe.
• The top node is the initial state, and MAX moves first, placing an X in an empty square.
• The figure shows part of the tree, giving alternating moves by MIN (O) and MAX (X), until we eventually reach terminal states, which can be assigned utilities according to the rules of the game.
Correspondence with AND/OR graph
• The correspondence between game trees and AND/OR trees is obvious.
 The moves available to one player from a given position can be represented by OR links.
 The moves available to the opponent are AND links.
• The trees representing games contain two types of nodes:
 MAX nodes (nodes with OR links, maximizing MAX's gain)
 MIN nodes (nodes with AND links, minimizing the opponent's gain)
Contd…
• The leaf nodes are labeled WIN, LOSS or DRAW depending on whether they represent a win, loss or draw position from MAX's viewpoint.
• Each non-terminal node in the game tree can be labeled WIN, LOSS or DRAW by a bottom-up process similar to the "Solve" labeling procedure in AND/OR graphs.
Status Labeling Procedure
• If j is a non-terminal MAX node, then

  STATUS(j) = WIN,  if any of j's successors is a WIN
              LOSS, if all of j's successors are LOSS
              DRAW, if any of j's successors is a DRAW and none is a WIN
Contd…
• If j is a non-terminal MIN node, then

  STATUS(j) = WIN,  if all of j's successors are WIN
              LOSS, if any of j's successors is a LOSS
              DRAW, if any of j's successors is a DRAW and none is a LOSS
Contd…
• The function STATUS(j) should be interpreted as the best terminal
status MAX can achieve from position j, if MAX plays optimally
against a perfect opponent.
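• A minimal sketch of this labeling procedure in Python (an illustrative assumption, not from the slides: a non-terminal node is a ('MAX' or 'MIN', children) pair and a leaf is simply its 'WIN'/'LOSS'/'DRAW' label):

# Illustrative status labeling, from MAX's point of view.
# Leaves are 'WIN', 'LOSS' or 'DRAW'; internal nodes are (player, children).
def status(node):
    if node in ('WIN', 'LOSS', 'DRAW'):          # terminal: label given by the rules of the game
        return node
    player, children = node
    labels = [status(c) for c in children]       # label successors bottom-up first
    if player == 'MAX':
        if 'WIN' in labels:  return 'WIN'        # MAX moves to any winning successor
        if 'DRAW' in labels: return 'DRAW'       # otherwise settles for a draw
        return 'LOSS'                            # all successors are LOSS
    else:                                        # MIN node
        if 'LOSS' in labels: return 'LOSS'       # MIN moves to any losing (for MAX) successor
        if 'DRAW' in labels: return 'DRAW'
        return 'WIN'                             # all successors are WIN

# Example: MIN avoids the WIN branch and MAX avoids the LOSS branch, so the root is a DRAW.
root = ('MAX', [('MIN', ['WIN', 'DRAW']), ('MIN', ['LOSS'])])
print(status(root))                              # -> DRAW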
• Example: Consider the game tree on the next slide.
 Let us denote
  • MAX → X
  • MIN → Y
  • WIN → W
  • DRAW → D
  • LOSS → L
 The status of the leaf nodes is assigned by the rules of the game, whereas the status of non-terminal nodes is determined by the labeling procedure.
 Solving a game tree means labeling the root node WIN, LOSS, or DRAW from the MAX player's point of view.
[Figure: a five-level game tree with MAX (X) to move at the root and levels alternating MIN (Y) and MAX (X). The leaf nodes carry W, D or L labels given by the rules of the game; every non-terminal node is labeled by the status labeling procedure, and the root is labeled W, so MAX has a guaranteed winning strategy.]

• Labeling is done from MAX's point of view.
• Associated with each root label there is an optimal playing strategy which prescribes how that label can be guaranteed regardless of MIN's play.
• The highlighted paths are optimal paths for MAX to play.
• An optimal strategy for MAX is a sub-tree all of whose nodes are labeled WIN. (See the figure on the previous slide.)
The following game tree is generated with the MIN player playing first.
Here MAX may lose if MIN chooses the first path.
[Figure: complete game tree with MIN playing first. Labeling bottom-up, MIN's three moves at the second level are labeled L, W, W for MAX, so the root MIN node is labeled L: by choosing the first branch MIN can force a loss for MAX.]
Examples:
• Checkers: about 10^40 non-terminal nodes; even if 3 billion nodes could be generated each second, about 10^21 centuries would be required.
• Chess: about 10^120 nodes and 10^101 centuries.
• So this approach is not practical.
Evaluation Function
• Having no practical way of evaluating the exact status of successor
game positions, one may naturally use heuristic approximation.
Optimal Decisions in Games
• MINIMAX search algorithm
• Alpha–Beta Pruning
MINIMAX Search Procedure
• For games with multiple outcome scores, we need a slightly more general
algorithm called minimax search.
• Convention:
 A positive number indicates a position in favor of one player.
 A negative number indicates a position in favor of the other.
 0 indicates an even match.
• Minimax operates on a game tree and is a recursive procedure in which a player tries to maximize its own advantage and minimize its opponent's.
• The player hoping for a positive number is called the maximizing player; the opponent is the minimizing player.
• If the player to move is the maximizing player, he is looking for a path leading to a large positive number, and his opponent will try to force the play toward situations with strongly negative static evaluations.
• Values are backed up to the starting position.
• The procedure by which the scoring information passes up
the game tree is called the MINIMAX procedure
• The score at each node is either minimum or maximum of
the scores at the nodes immediately below.
• It is a depth-first, depth-limited search procedure.
• A two-ply game tree. At MAX nodes it is MAX's turn to move; at MIN nodes it is MIN's turn.
• The terminal nodes show the utility values for MAX.
• The other nodes are labeled with their minimax values.
• MAX's best move at the root is a1, because it leads to the state with the highest minimax value, and MIN's best reply is b1, because it leads to the state with the lowest minimax value.
Algorithmic Steps
• If the search limit has been reached, compute the static value of the current position relative to the appropriate player (maximizing or minimizing), as given below. Report the result (value and path).
• If the level is a maximizing level then
 Generate the successors of current position
 Apply MINIMAX to each of these successors
 Return the maximum of the results.
• If the level is minimizing level (minimizer's turn)
 Generate the successors of the current position.
 Apply MINIMAX to each of the successors.
 Return the minimum of the result
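• A compact sketch of these steps, under the assumption (illustrative, not from the slides) that a position is either a number (its static value) or a list of successor positions:

# Minimal minimax sketch: leaves hold their static values.
# In a real game the recursion would also stop at a depth limit and
# call a static evaluation function there.
def minimax(position, maximizing):
    if isinstance(position, (int, float)):
        return position                               # static value at the search limit
    values = [minimax(succ, not maximizing) for succ in position]
    return max(values) if maximizing else min(values) # maximizing or minimizing level

# Two-ply example: MAX to move, three MIN successors.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))                            # -> 3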
Example: Evaluation function for the Tic-Tac-Toe game

• A static evaluation function f applied to a position P is defined as:
- If P is a win for MAX, then f(P) = n (a very large positive number).
- If P is a win for MIN, then f(P) = -n.
- If P is not a winning position for either player, then f(P) = (total number of rows, columns and diagonals that are still open for MAX) - (total number of rows, columns and diagonals that are still open for MIN).
• Consider X for MAX and O for MIN and the following board position P. It is now MAX's turn to move.

• Total number of rows, columns and diagonals that are still open for MAX (thick lines) = 6
• Total number of rows, columns and diagonals that are still open for MIN (dotted lines) = 4
• f(P) = (open lines for MAX) - (open lines for MIN) = 6 - 4 = 2
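• A small sketch of this evaluation function, assuming a 3×3 board stored as a list of nine cells holding 'X', 'O' or ' ' (the representation and the sample position are illustrative):

# f(P) for tic-tac-toe: open lines for MAX (X) minus open lines for MIN (O).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),    # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),    # columns
         (0, 4, 8), (2, 4, 6)]               # diagonals

def open_lines(board, player):
    opponent = 'O' if player == 'X' else 'X'
    # A line is still open for `player` if the opponent has no mark on it.
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))

def f(board):
    return open_lines(board, 'X') - open_lines(board, 'O')

# Example: X in the centre and O on the top edge gives 6 open lines for MAX, 4 for MIN.
P = [' ', 'O', ' ',
     ' ', 'X', ' ',
     ' ', ' ', ' ']
print(open_lines(P, 'X'), open_lines(P, 'O'), f(P))   # -> 6 4 2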
Properties of minimax
• Complete: Yes (if the tree is finite)
• Optimal: Yes (against an optimal opponent)
• Time complexity: O(b^m)
• Space complexity: O(bm) (depth-first exploration)
• For chess, b ≈ 35, m ≈ 100 for "reasonable" games
• → exact solution completely infeasible
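• For a sense of scale: with b ≈ 35 and m ≈ 100, b^m = 35^100 ≈ 10^154 positions.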
Limitations
• Not always feasible to traverse entire tree
• Time limitations
Remarks:
• Since the MINIMAX procedure is a depth-first process, its efficiency can often be improved by using a dynamic branch-and-bound technique in which partial solutions that are clearly worse than known solutions can be abandoned.
• Further, there is another procedure that reduces
- the number of tree branches to be explored, and
- the number of static evaluations to be applied.
• This strategy is called Alpha-Beta pruning.


Alpha-Beta Pruning
• It requires the maintenance of two threshold values.
• One represents a lower bound (α) on the value that a maximizing node may ultimately be assigned (we call this alpha); α means “at least.”
• The other represents an upper bound (β) on the value that a minimizing node may be assigned (we call it beta); β means “at most.”
Pruning
• A MAX node can be pruned as soon as its α value ≥ the β value of its MIN parent (or any MIN ancestor).
• A MIN node can be pruned as soon as its β value ≤ the α value of its MAX parent (or any MAX ancestor).
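A minimal alpha-beta sketch in Python, applying these two pruning rules to the same nested-list trees used in the earlier minimax sketch (the representation and the example values are illustrative assumptions):

import math

# alpha = best value MAX can already guarantee ("at least"),
# beta  = best value MIN can already guarantee ("at most").
def alphabeta(position, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(position, (int, float)):
        return position                           # leaf: static value
    if maximizing:
        value = -math.inf
        for succ in position:
            value = max(value, alphabeta(succ, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:                     # MAX node already >= its MIN parent's beta
                break
        return value
    value = math.inf
    for succ in position:
        value = min(value, alphabeta(succ, True, alpha, beta))
        beta = min(beta, value)
        if beta <= alpha:                         # MIN node already <= its MAX parent's alpha
            break
    return value

# Slide-style example: MAX root A with MIN children B = [2, 7] and C = [1, G];
# 99 stands in for the unexplored leaf G, which is pruned and never examined.
print(alphabeta([[2, 7], [1, 99]], True))         # -> 2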
[Figure: four snapshots (1st step to final step) of alpha-beta search on a small tree with MAX root A, MIN children B and C, leaves D(2) and E(7) under B, and F(1) plus further children under C. After B is resolved to 2, the first leaf F(1) under C shows that C can be worth at most 1, which is less than the 2 MAX already has, so that path is not explored further.]
[Figure: the same game tree up to the 2nd level under the α-β pruning algorithm. After child B is resolved, the MAX root A has a lower bound α ≥ 2 and B has an upper bound β = 2. At the second MIN child C, the first leaf F(1) gives β ≤ 1 ≤ α(A) = 2, so the remaining child G is pruned.]

•There is no need to explore right side of the tree fully as that result
is not going to alter the move decision.
• Given below is a game tree of depth 3 and branching factor 3.
• Find which path should be chosen.

[Figure: a game tree with a MAX root, three MIN nodes at the second level and nine MAX nodes at the third level; the leaf values, left to right, are 8 7 3 9 1 6 2 4 1 1 3 5 3 9 2 6 5 2 1 2 3 9 7 2 16 6 4.]
• The same game tree evaluated using alpha-beta pruning. Note that only 16 static evaluations are made, instead of the 27 required without alpha-beta pruning.
[Figure: the depth-3 tree annotated with α and β values. The root MAX node receives the value 5 through its middle MIN successor (the three MIN nodes are worth 4, 5 and at most 3); the branches cut off by the α/β bounds are never statically evaluated.]
Heuristic Alpha–Beta Tree Search

• With limited computation time, we cut off the search early and apply a heuristic evaluation function to states, effectively treating nonterminal nodes as if they were terminal.
• Replace the UTILITY function with EVAL, which estimates a state’s utility.
• H-MINIMAX(s, d) denotes the heuristic minimax value of state s at search depth d:
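The recurrence itself is not reproduced on the slide; the standard textbook form it refers to is:

H-MINIMAX(s, d) = EVAL(s, MAX),                                              if IS-CUTOFF(s, d)
                  max over a in ACTIONS(s) of H-MINIMAX(RESULT(s, a), d+1),  if TO-MOVE(s) = MAX
                  min over a in ACTIONS(s) of H-MINIMAX(RESULT(s, a), d+1),  if TO-MOVE(s) = MIN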


Evaluation functions

• A heuristic evaluation function EVAL(s, p) returns an estimate of the expected utility of state s to player p.
• For terminal states: EVAL(s, p) = UTILITY(s, p).
• For nonterminal states, the evaluation must lie somewhere between a loss and a win: UTILITY(loss, p) ≤ EVAL(s, p) ≤ UTILITY(win, p).
Cutting off search
• The next step is to modify ALPHA-BETA-SEARCH so that it calls the heuristic EVAL function when it is appropriate to cut off the search. We replace the two lines that mention IS-TERMINAL with the following line:

  if game.IS-CUTOFF(state, depth) then return game.EVAL(state, player), null
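A hedged sketch of depth-limited alpha-beta with such a cutoff test, reusing the nested-list representation of the earlier sketches; the depth test plays the role of game.IS-CUTOFF and toy_eval stands in for game.EVAL (both are illustrative assumptions, not the textbook's code):

import math

def h_alphabeta(position, maximizing, depth, limit, eval_fn,
                alpha=-math.inf, beta=math.inf):
    # Cutoff test: a terminal position or the depth limit is reached.
    if isinstance(position, (int, float)) or depth >= limit:
        return eval_fn(position)                  # heuristic estimate replaces true utility
    if maximizing:
        value = -math.inf
        for succ in position:
            value = max(value, h_alphabeta(succ, False, depth + 1, limit,
                                           eval_fn, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break
        return value
    value = math.inf
    for succ in position:
        value = min(value, h_alphabeta(succ, True, depth + 1, limit,
                                       eval_fn, alpha, beta))
        beta = min(beta, value)
        if beta <= alpha:
            break
    return value

def toy_eval(position):
    # Leaves evaluate to themselves; an unexpanded subtree is crudely estimated as 0.
    return position if isinstance(position, (int, float)) else 0

# With a depth limit of 2, the subtree [5, 9] is cut off and estimated instead of searched.
print(h_alphabeta([[3, [5, 9]], [2, 8]], True, 0, 2, toy_eval))   # -> 2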
MINIMAX(Pos, Depth, Player) returns the best path along with the best value. It uses the following functions:
• GEN(Pos):
  Generates the list of successors (SUCCs) of ‘Pos’.
• EVAL(Pos, Player):
  Returns a number representing the goodness of ‘Pos’ for Player from the current position.
• DEPTH(Pos, Depth):
  A Boolean function that returns true if the search has reached the maximum depth from the current position, and false otherwise.
Function MINIMAX(Pos, Depth, Player)
{ If DEPTH(Pos, Depth) then return ({Val = EVAL(Pos, Player), Path = Nil})
  Else
  { SUCC_List = GEN(Pos);
    If SUCC_List = Nil then return ({Val = EVAL(Pos, Player), Path = Nil})
    Else
    { Best_Val = minus infinity;   /* smaller than any value EVAL can return */
      For each SUCC in SUCC_List DO
      { SUCC_Result = MINIMAX(SUCC, Depth + 1, ~Player);
        /* Negamax convention: the successor's value is computed from the
           opponent's point of view, so it is negated before comparison. */
        NEW_Value = - Val of SUCC_Result;
        If NEW_Value > Best_Val then
        { Best_Val = NEW_Value;
          Best_Path = Add(SUCC, Path of SUCC_Result);
        };
      };
      Return ({Val = Best_Val, Path = Best_Path});
    }
  }
}
