Adversarial Search: Two-Person Games — Russell & Norvig (Text Book) and Patrick Henry Winston (Reference Book)
Game Theory
• Mathematical game theory, a branch of economics, views any multiagent environment as a game, provided that the impact of each agent on the others is "significant", regardless of whether the agents are cooperative or competitive.
• Game playing was one of the first tasks undertaken in AI.
• By 1950, chess had been tackled by Konrad Zuse, Claude Shannon, Norbert Wiener and Alan Turing.
• The state of a game is easy to represent, and agents are usually restricted to a small number of actions whose outcomes are defined by precise rules.
* Environments with many agents are best viewed as economies rather than games
Complexity
• In tic-tac-toe there are nine first moves with 8 possible
responses to each of them, followed by 7 possible
responses to each of these, and so on.
• It follows that there are 9 × 8 × 7 × … × 1 = 9! = 362,880 possible game paths.
• Although it is not impossible for a computer to search this number of paths exhaustively, many important problems (e.g. chess) exhibit factorial or exponential complexity, although on a much larger scale.
• For example, chess has 10^120 possible game paths; checkers has 10^40, some of which may never occur in an actual game.
• These spaces are difficult or impossible to search
exhaustively.
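The factorial bound above is easy to check directly; a quick Python sketch:

```python
import math

# Upper bound on tic-tac-toe game paths: 9 first moves, 8 replies, 7 replies
# to each of those, and so on: 9 * 8 * 7 * ... * 1 = 9!
tic_tac_toe_paths = math.factorial(9)
print(tic_tac_toe_paths)  # 362880

# The bound grows factorially; a few extra moves already make exhaustive
# search impractical.
for moves in (9, 12, 15):
    print(moves, math.factorial(moves))
```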
Game-Tree Sizes
• Sizes of game trees (total number of nodes):
– Nim-5: 28 nodes
– Tic-Tac-Toe: 10^5 nodes
– Checkers: 10^31 nodes
– Chess: 10^123 nodes
– Go: 10^360 nodes
• In practice it is intractable to find a solution
with minimax
Types of games
                        deterministic                               chance
perfect information     chess, checkers, go, othello, Tic-Tac-Toe   backgammon, monopoly
imperfect information                                               bridge, poker, scrabble
Typical case
• Zero-sum: one player’s loss is the other’s gain
• Perfect information: both players have access to complete
information about the state of the game. No information is
hidden from either player.
• No chance (e.g., using dice) involved
• Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim,
Othello
• Not: Bridge, Solitaire, Backgammon, ...
• Imperfect information: game of Bridge, as not all cards
are visible to each player.
• Competitive multiagent environments give rise to adversarial search problems, also known as games
Games vs. search problems
• Problem solving agent is not alone any more
– Multiagent, conflicts
• Default: deterministic, turn-taking, two-player,
zero sum game of perfect information
– Perfect info. vs. imperfect, or probability
• "Unpredictable" opponent specifying a move
for every possible opponent reply
• Time limits unlikely to find goal, must
approximate
Game formalization
• Initial state
• A successor function
– Returns a list of (move, state) pairs
• Terminal test
– Terminal states
• Utility function (or objective function)
– A numeric value for the terminal states
• Game tree
– The state space
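As a concrete illustration, the four components above can be sketched in Python for Nim-5 (assuming, purely for illustration, the variant where players alternately remove one or two sticks from a pile of 5 and the player who takes the last stick loses):

```python
# Game formalization sketch for an assumed Nim-5 variant.
INITIAL_STATE = (5, 'MAX')  # (sticks left, player to move)

def successors(state):
    """Successor function: returns a list of (move, state) pairs."""
    sticks, player = state
    other = 'MIN' if player == 'MAX' else 'MAX'
    return [(take, (sticks - take, other))
            for take in (1, 2) if take <= sticks]

def terminal_test(state):
    """Terminal test: the game ends when no sticks remain."""
    return state[0] == 0

def utility(state):
    """Utility for terminal states: the taker of the last stick loses,
    so the player *to move* at sticks == 0 is the winner."""
    _, player = state
    return 1 if player == 'MAX' else -1
```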
Two-Person Games
• A game can be formally defined as a kind
of search problem with initial state, a set
of operators, a terminal test, and a utility
function
• A search tree may be constructed, with a large number of states
• States at depth d and depth d+1 belong to different players (each level of the tree is one ply)
MiniMax Game tree
• MiniMax is a depth first, depth limited, recursive search
procedure.
• This method is used for playing games in which there are 2
players taking turns to play moves.
• Physically it is just a tree of all possible moves.
• MiniMax game trees are best suited for games in which both players can see the entire game situation.
[Figure: a 2-ply MiniMax tree. The four leaves 2, 7, 1, 8 are grouped under two MIN nodes with values min(2, 7) = 2 and min(1, 8) = 1; the MAX root takes max(2, 1) = 2 — this is the optimal play.]
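The optimal value at the root can be checked directly, assuming the four leaves group as (2, 7) and (1, 8) under the two MIN nodes:

```python
# The 2-ply example tree as nested pairs: the MAX root has two MIN
# children, whose leaves are (2, 7) and (1, 8) respectively.
leaves = [(2, 7), (1, 8)]

# MIN picks the smallest leaf under each of its nodes; MAX then picks
# the largest of those minima.
min_values = [min(pair) for pair in leaves]  # [2, 1]
root_value = max(min_values)                 # 2 -- the optimal play
print(root_value)  # 2
```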
Minimax
• Perfect play for deterministic games: optimal strategy
• Idea: choose move to position with highest minimax value
= best achievable payoff against best play
• E.g., 2-ply game: only two half-moves
Evaluation function
• Evaluation function or static evaluator is used to
evaluate the “goodness” of a game position.
– Contrast with heuristic search, where the evaluation function was a non-negative estimate of the cost from the start node to a goal passing through the given node.
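As an illustrative sketch (not the book's exact function), a common static evaluator for tic-tac-toe scores a position as the number of lines still open to MAX (X) minus the number still open to MIN (O):

```python
# Illustrative static evaluator for tic-tac-toe:
# (lines still open to X) - (lines still open to O).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def evaluate(board):
    """board is a 9-element list of 'X', 'O' or ' '."""
    score = 0
    for line in LINES:
        cells = [board[i] for i in line]
        if 'O' not in cells:
            score += 1   # line still open to X
        if 'X' not in cells:
            score -= 1   # line still open to O
    return score
```

For example, an empty board evaluates to 0 (symmetric), while X alone in the centre evaluates to 8 − 4 = 4.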
Pseudo-code that implements MiniMax
• If the state is terminal, return its utility
• If it is MAX's turn, return the highest MiniMaxValue of the successors of the state
• else, return the lowest MiniMaxValue of the successors of the state
Criterion   Minimax
Optimal?    yes
Time        O(b^m)
Space       O(bm)
Properties of minimax
• Complete? Yes (if tree is finite)
• Optimal? Yes (against an optimal opponent)
• Time complexity? O(b^m)
• Space complexity? O(bm) (depth-first exploration)
• Basic idea: “If you have an idea that is surely bad, don't
take the time to see how truly awful it is.” -- Pat Winston
[Figure: on discovering util(D) = 6, the MIN node B is bounded: MIN ≤ 6 (sibling C not yet examined). Legend distinguishes agent (MAX) and opponent (MIN) nodes.]
Example with MIN
[Figure: MIN nodes with β ≤ 5 above leaves 3, 4, 5, 6; (some of) these still need to be looked at.]
• As soon as the node with value 6 is generated, we know that the alpha value will be larger than 6, so we don't need to generate these nodes (and the subtrees below them).
Example of Alpha-Beta Pruning
[Figure: MAX root with minimax value 3 over three MIN nodes with values 3, ≤ 2, and 2; their leaves are (3, 12, 8), (2, plus pruned leaves that are never examined), and (14, 5, 2).]
Alpha-Beta Pruning
• Alpha = the value of the best choice (i.e.
highest value) we have found so far at any
choice point along the path for MAX.
• Beta (β) = the value of the best choice (i.e. lowest value) we have found so far at any choice point along the path for MIN.
pseudo-code for Alpha-Beta
• We start out with the range of possible scores (as
defined by alpha and beta) going from minus infinity to
plus infinity.
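A minimal Python sketch of alpha-beta with those initial bounds, run on the example tree from the pruning slide (the two 99s stand in for the pruned leaves, whose values are never examined):

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    """node is a number (terminal utility) or a list of children.
    alpha/beta start at minus and plus infinity, as stated above."""
    if isinstance(node, (int, float)):
        visited.append(node)             # record which leaves we examine
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                    # beta cut-off
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True, visited))
            beta = min(beta, value)
            if alpha >= beta:
                break                    # alpha cut-off
        return value

# Example tree: three MIN nodes under a MAX root; the 99s are placeholders
# for the leaves that pruning skips.
tree = [[3, 12, 8], [2, 99, 99], [14, 5, 2]]
visited = []
print(alphabeta(tree, -math.inf, math.inf, True, visited))  # 3
print(visited)  # the 99s never appear: they were pruned
```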