Adversarial Search and Game Playing: Games

The document introduces games and adversarial search. Games arise in multi-agent environments where agents' actions affect one another; competitive multi-agent environments give rise to adversarial search, commonly referred to simply as "games". The document covers why games are studied, the different types of games, and how games relate to search problems, then works through a partial game tree for tic-tac-toe, optimal strategies, and the minimax algorithm for finding the best strategy.


Adversarial Search and Game Playing
Russell and Norvig, Chapter 5

Games

- Games: multi-agent environments.
  - What do other agents do, and how do they affect our success?
  - Cooperative vs. competitive multi-agent environments.
  - Competitive multi-agent environments give rise to adversarial search, a.k.a. games.
- Why study games?
  - Fun!
  - They are hard.
  - They are easy to represent, and the agents are sometimes restricted to a small number of actions.

http://xkcd.com/601/

Relation of Games to Search

- Search: no adversary.
  - The solution is a (heuristic) method for finding a goal.
  - Heuristics and CSP techniques can find the optimal solution.
  - Evaluation function: an estimate of the cost from the start to the goal through a given node.
  - Examples: path planning, scheduling activities.
- Games: adversary.
  - The solution is a strategy (a strategy specifies a move for every possible opponent reply).
  - Time limits force approximate solutions.
  - Examples: chess, checkers, Othello, backgammon.

Types of Games

                         Deterministic                   Chance
  Perfect information    chess, go, checkers, othello    backgammon
  Imperfect information  bridge, hearts                  poker, canasta, scrabble

Our focus: deterministic, turn-taking, two-player, zero-sum games of perfect information.
- Zero-sum game: a participant's gain (or loss) is exactly balanced by the losses (or gains) of the other participant.
- Perfect information: the environment is fully observable.

Game setup

- Two players: MAX and MIN.
- MAX moves first, and the players take turns until the game is over.
- Games as search:
  - Initial state: e.g., the starting board configuration.
  - PLAYER(s): which player has the move in state s.
  - ACTIONS(s): the set of legal moves in state s.
  - RESULT(s, a): the state resulting from taking move a in state s.
  - TERMINAL-TEST(s): is the game over? (terminal states)
  - UTILITY(s, p): the value of terminal state s to player p, e.g., win (+1), lose (-1), and draw (0) in chess.
- Players use the search tree to determine their next move.

Partial Game Tree for Tic-Tac-Toe

(figure not reproduced)
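The formal elements PLAYER, ACTIONS, RESULT, TERMINAL-TEST, and UTILITY can be sketched as plain functions. The tiny "take 1 or 2 sticks" game below is an illustrative stand-in, not a game from the slides; whoever takes the last stick wins.

```python
# Minimal sketch of the formal game elements, using a toy game
# (an assumed example): a state is (sticks_remaining, player_to_move).

MAX_PLAYER, MIN_PLAYER = "MAX", "MIN"

def player(state):
    """PLAYER(s): which player has the move in this state."""
    _sticks, to_move = state
    return to_move

def actions(state):
    """ACTIONS(s): legal moves, i.e. take 1 or 2 sticks (never more than remain)."""
    sticks, _ = state
    return [n for n in (1, 2) if n <= sticks]

def result(state, action):
    """RESULT(s, a): state after the move; the turn passes to the other player."""
    sticks, to_move = state
    nxt = MIN_PLAYER if to_move == MAX_PLAYER else MAX_PLAYER
    return (sticks - action, nxt)

def terminal_test(state):
    """TERMINAL-TEST(s): the game is over when no sticks remain."""
    return state[0] == 0

def utility(state):
    """UTILITY(s): if it is MAX's turn with 0 sticks, MIN took the last
    stick and won, so the value to MAX is -1; otherwise +1."""
    _, to_move = state
    return -1 if to_move == MAX_PLAYER else +1
```

Any concrete game (tic-tac-toe, chess) plugs into minimax through exactly this interface; only the five functions change.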

The Tic-Tac-Toe Search Space

- Is this search space a tree or a graph?
- What is the minimum search depth?
- What is the maximum search depth?
- What is the branching factor?

http://xkcd.com/832/

Optimal strategies

- Find the best strategy for MAX assuming an infallible MIN opponent.
- Assumption: both players play optimally.
- Given a game tree, the optimal strategy can be determined from the minimax value of each node:

  MINIMAX(s) =
    UTILITY(s)                                    if s is terminal
    max_{a in ACTIONS(s)} MINIMAX(RESULT(s, a))   if PLAYER(s) = MAX
    min_{a in ACTIONS(s)} MINIMAX(RESULT(s, a))   if PLAYER(s) = MIN

- Definition: a ply is one player's turn in a two-player game.

Two-Ply Game Tree

(figure not reproduced: the slides annotate a two-ply tree with backed-up values)

- The minimax value at a MIN node is the minimum of the backed-up values, because your opponent will do what is best for them (and worst for you).
- Minimax maximizes the worst-case outcome for MAX.
- The move chosen at the root is the minimax decision.
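As a concrete check, take the leaf values from the standard Russell and Norvig two-ply figure (an assumption here, since the figure itself is not reproduced): the three MIN nodes B, C, D sit above leaves (3, 12, 8), (2, 4, 6), and (14, 5, 2). Backing up the values gives:

```latex
\begin{align*}
\text{MINIMAX}(B) &= \min(3, 12, 8) = 3 \\
\text{MINIMAX}(C) &= \min(2, 4, 6) = 2 \\
\text{MINIMAX}(D) &= \min(14, 5, 2) = 2 \\
\text{MINIMAX}(\text{root}) &= \max(3, 2, 2) = 3
\end{align*}
```

so MAX's minimax decision is the move leading to B, with guaranteed value 3.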

Minimax Algorithm

function MINIMAX-DECISION(state) returns an action
  return arg max_{a in ACTIONS(state)} MIN-VALUE(RESULT(state, a))

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN-VALUE(RESULT(state, a)))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX-VALUE(RESULT(state, a)))
  return v

Properties of Minimax

- Minimax explores the tree using DFS.
- Therefore, with branching factor b and maximum depth m:
  - Time complexity: O(b^m) (bad)
  - Space complexity: O(bm) (good)
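The MINIMAX-DECISION pseudocode above translates almost line for line into Python. This minimal sketch hard-codes a two-ply tree (three MAX actions a1-a3, each leading to a MIN node over three leaf utilities); the leaf values are borrowed from the standard textbook figure, an assumption since the slides' figures are not reproduced.

```python
import math

# Each node is either a terminal utility (an int) or a list of children.
TREE = {"a1": [3, 12, 8], "a2": [2, 4, 6], "a3": [14, 5, 2]}

def max_value(node):
    if isinstance(node, int):        # TERMINAL-TEST: return UTILITY
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child))
    return v

def min_value(node):
    if isinstance(node, int):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child))
    return v

def minimax_decision(tree):
    """Pick the action whose backed-up MIN-VALUE is highest."""
    return max(tree, key=lambda a: min_value(tree[a]))

best = minimax_decision(TREE)   # "a1" on this tree: max(3, 2, 2) = 3
```

Note that the DFS structure of the pseudocode is preserved: each call fully evaluates one subtree before moving to the next, which is what gives the O(bm) space bound.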

Problem of minimax search

- The number of game states is exponential in the number of moves.
  - Solution: do not examine every node.
  - ==> Alpha-beta pruning.

Alpha-Beta Pruning

- α: the highest (i.e., best for MAX) value possible.
- β: the lowest (i.e., best for MIN) value possible.
- Initially, (α, β) = (-∞, +∞).
- Remove branches that do not influence the final decision.
- General idea: you can bracket the highest/lowest value at a node even before all of its successors have been evaluated.

Alpha-Beta Example

(figure sequence not reproduced: the slides step through alpha-beta pruning on a two-ply tree, maintaining at each node a range of possible values. The root starts at [-∞, +∞]; the first MIN node narrows from [-∞, 3] to [3, 3]; the second MIN node stops at [-∞, 2], and since this node is already worse for MAX, its remaining successors are pruned; the third MIN node narrows through [-∞, 14] and [-∞, 5] to [2, 2]; the root finishes at [3, 3], which yields the minimax decision.)

Alpha-Beta Algorithm

function ALPHA-BETA-SEARCH(state) returns an action
  v ← MAX-VALUE(state, -∞, +∞)
  return the action in ACTIONS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN-VALUE(RESULT(state, a), α, β))
    if v ≥ β then return v
    α ← MAX(α, v)
  return v

Alpha-Beta Algorithm (continued)

function MIN-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX-VALUE(RESULT(state, a), α, β))
    if v ≤ α then return v
    β ← MIN(β, v)
  return v

Alpha-beta pruning

- When enough is known about a node n, it can be pruned.
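A Python sketch of the two functions above, run on a hard-coded two-ply tree whose leaf values are borrowed from the standard textbook figure (an assumption, since the slides' figures are not reproduced). A leaf counter makes the pruning visible.

```python
import math

# Nested lists: the root is a MAX node over three MIN nodes; ints are leaves.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
leaves_seen = 0

def max_value(node, alpha, beta):
    global leaves_seen
    if isinstance(node, int):        # TERMINAL-TEST: return UTILITY
        leaves_seen += 1
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:
            return v                 # prune: MIN will never allow this branch
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta):
    global leaves_seen
    if isinstance(node, int):
        leaves_seen += 1
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:
            return v                 # prune: MAX already has a better option
        beta = min(beta, v)
    return v

value = max_value(TREE, -math.inf, math.inf)
```

On this tree, alpha-beta evaluates 7 of the 9 leaves while still returning the minimax value 3: the second MIN node is cut off after its first leaf (2 ≤ α = 3). Good move ordering maximizes exactly these cutoffs, which is why it matters so much in practice.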

Final Comments about Alpha-Beta Pruning

- Pruning does not affect the final result.
- Entire subtrees can be pruned, not just leaves.
- Good move ordering improves the effectiveness of pruning.
- With perfect ordering, the time complexity is O(b^(m/2)):
  - Effective branching factor of sqrt(b).
  - Consequence: alpha-beta pruning can look twice as deep as minimax in the same amount of time.

Is this practical?

- Minimax and alpha-beta pruning still have exponential complexity.
- They may be impractical within a reasonable amount of time.
- SHANNON (1950):
  - Terminate the search at a lower depth.
  - Apply a heuristic evaluation function EVAL instead of the UTILITY function.

Cutting off search

- Change:
    if TERMINAL-TEST(state) then return UTILITY(state)
  into:
    if CUTOFF-TEST(state, depth) then return EVAL(state)
- This introduces a fixed depth limit.
  - The limit is selected so that the time used will not exceed what the rules of the game allow.
- When the cutoff occurs, the evaluation function is applied.

Heuristic EVAL

- Idea: produce an estimate of the expected utility of the game from a given position.
- Performance depends on the quality of EVAL.
- Requirements:
  - EVAL should order terminal nodes in the same way as UTILITY.
  - It should be fast to compute.
  - For non-terminal states, EVAL should be strongly correlated with the actual chance of winning.
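The CUTOFF-TEST change above can be sketched as a depth-limited minimax. The tree encoding (nested lists, integer leaves) and the placeholder EVAL, which just averages a subtree's leaves, are illustrative assumptions, not a method from the slides.

```python
# Depth-limited minimax: TERMINAL-TEST/UTILITY replaced by
# CUTOFF-TEST/EVAL, as described on the slide.
DEPTH_LIMIT = 1   # fixed depth limit, chosen to respect time constraints

def cutoff_test(node, depth):
    """Cut off at true terminals or once the depth limit is reached."""
    return isinstance(node, int) or depth >= DEPTH_LIMIT

def eval_fn(node):
    """Placeholder heuristic (assumed): average of the subtree's leaves.
    A real EVAL would estimate the expected utility of the position."""
    if isinstance(node, int):
        return node
    vals = [eval_fn(c) for c in node]
    return sum(vals) / len(vals)

def max_value(node, depth):
    if cutoff_test(node, depth):
        return eval_fn(node)
    return max(min_value(c, depth + 1) for c in node)

def min_value(node, depth):
    if cutoff_test(node, depth):
        return eval_fn(node)
    return min(max_value(c, depth + 1) for c in node)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
approx = max_value(tree, 0)   # heuristic estimate, not the exact value
```

With DEPTH_LIMIT = 1 the search returns about 7.67 on this tree, while the full minimax value is 3: a crude EVAL trades accuracy for speed, which is why the correlation requirement above matters.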

Heuristic EVAL example

A weighted linear evaluation function:

  Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)

The addition assumes the features are independent.

In chess, for example:

  Eval(s) = w1 · material + w2 · mobility + w3 · king safety + w4 · center control + …

How good are computers?

- Let's look at the state-of-the-art computer programs that play games such as chess, checkers, Othello, and Go.
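The weighted linear form Eval(s) = Σ wi fi(s) above is trivial to sketch. The chess-like feature names, feature values, and weights below are invented numbers purely for illustration, not taken from any real chess program.

```python
def linear_eval(weights, features):
    """Weighted sum of feature values (assumes feature independence)."""
    return sum(w * f for w, f in zip(weights, features))

# Hypothetical feature extractors for a state s: material balance,
# mobility, king safety, center control. All values are made up.
features = [2.0, 5.0, -1.0, 3.0]     # f1(s)..f4(s)
weights  = [9.0, 0.1, 0.5, 0.3]      # w1..w4
score = linear_eval(weights, features)
```

Because the weights are fixed, the hard part in practice is not this sum but choosing features that actually correlate with winning, and tuning the weights (by hand or by learning).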

Checkers

- Chinook: the first program to win the world champion title in a competition against a human (1994).

Chinook

- Components of Chinook:
  - Search (a variant of alpha-beta). The search space has 10^20 states.
  - An evaluation function.
  - An endgame database (for all states with 4 vs. 4 pieces; roughly 40 billion positions).
  - An opening book: a database of opening moves.
- Chinook can determine the final result of the game within the first 10 moves.
- The authors have since shown that several openings lead to a draw.

Jonathan Schaeffer, Neil Burch, Yngvi Bjornsson, Akihiro Kishimoto, Martin Muller, Rob Lake, Paul Lu and Steve Sutphen. "Checkers is Solved," Science, 2007.
http://www.cs.ualberta.ca/~chinook/publications/solving_checkers.html

Chess

- 1997: Deep Blue wins a 6-game match against Garry Kasparov.
  - It searches using iterative-deepening alpha-beta; its evaluation function has over 8000 features; it uses an opening book of 4000 positions and an endgame database.
- FRITZ plays world champion Vladimir Kramnik and wins the 6-game match.

Othello

- The best Othello computer programs can easily defeat the best humans (e.g., Logistello, 1997).

Go

- Go: humans are still much better!

Games that include chance

(backgammon figure not reproduced)

- Possible moves: (5-10, 5-11), (5-11, 19-24), (5-10, 10-16), and (5-11, 11-16).

Games that include chance (continued)

- The game tree now contains chance nodes.
- Possible moves: (5-10, 5-11), (5-11, 19-24), (5-10, 10-16), and (5-11, 11-16).
- Doubles [1,1], …, [6,6] each have probability 1/36; all other rolls have probability 1/18.
- We cannot calculate a definite minimax value, only an expected value.

Expected minimax value

EXPECTIMINIMAX(s) =
  UTILITY(s)                                    if s is terminal
  max_{a} EXPECTIMINIMAX(RESULT(s, a))          if PLAYER(s) = MAX
  min_{a} EXPECTIMINIMAX(RESULT(s, a))          if PLAYER(s) = MIN
  Σ_{r} P(r) · EXPECTIMINIMAX(RESULT(s, r))     if PLAYER(s) = CHANCE

where r is a chance event (e.g., a roll of the dice).

These values can be computed recursively, in a similar way to the MINIMAX algorithm.
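The four cases above can be sketched as one recursive function. The tiny example tree (a sure payoff versus a fair coin flip) is an illustrative assumption, much simpler than the dice distribution on the slide, but it exercises the chance-node case.

```python
def expectiminimax(node):
    """Node is a (kind, payload) pair: a terminal utility, a MAX or MIN
    node over children, or a chance node over (probability, child) pairs."""
    kind, payload = node
    if kind == "leaf":
        return payload
    if kind == "max":
        return max(expectiminimax(c) for c in payload)
    if kind == "min":
        return min(expectiminimax(c) for c in payload)
    if kind == "chance":
        # expected value: weight each outcome by its probability
        return sum(p * expectiminimax(c) for p, c in payload)
    raise ValueError(f"unknown node kind: {kind}")

# MAX chooses between a sure 3 and a fair coin flip over 0 and 10.
tree = ("max", [
    ("leaf", 3),
    ("chance", [(0.5, ("leaf", 0)), (0.5, ("leaf", 10))]),
])
value = expectiminimax(tree)   # max(3, 0.5*0 + 0.5*10) = 5.0
```

For backgammon, the chance node's payload would list the 21 distinct rolls with the 1/36 and 1/18 probabilities noted above.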

TD-Gammon (Tesauro, 1994)

(figure not reproduced: White's turn, with a roll of 4-4)

- A world-class program based on a combination of reinforcement learning, neural networks, and alpha-beta pruning to 3 plies.
- Move analyses by TD-Gammon have led to some changes in accepted strategies.

http://www.research.ibm.com/massive/tdl.html

Summary

- Games are fun!
- They illustrate several important points about AI:
  - Perfection is (usually) unattainable -> approximation.
  - Uncertainty constrains the assignment of values to states.
