Lecture 7
Game Playing
CHAPTER 5
ICE 3201
Bangladesh University of Professionals
Environment Type Discussed in this Lecture
Fully observable: yes
Multi-agent: yes
Sequential: yes
Discrete: yes
Turn-taking: semi-dynamic
⇒ Game Tree Search (the other branches of this taxonomy lead to game matrices and continuous action games)
Adversarial Search
Search – no adversary
Solution is a (heuristic) method for finding a goal
Heuristic techniques can find the optimal solution
Evaluation function: estimate of cost from start to goal through a given node
Examples: path planning, scheduling activities
Games – adversary
Solution is a strategy (a strategy specifies a move for every possible opponent reply)
Optimality depends on the opponent. Why?
Time limits force an approximate solution
Evaluation function: evaluates the “goodness” of a game position
Examples: chess, checkers, Othello, backgammon
Types of Games
MAX moves first and the players take turns until the game is over
Winner gets a reward, loser gets a penalty.
Games as search:
Initial state: e.g. board configuration of chess
Successor function: list of (move,state) pairs specifying legal moves.
Terminal test: Is the game finished?
Utility function: gives the numerical value of terminal states, e.g. win (+1), lose (-1), and draw (0) in tic-tac-toe or chess
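As an illustration, the four components above can be written down for tic-tac-toe. This is a minimal sketch; the board representation and helper names are our own, not the lecture's:

```python
# Board: tuple of 9 cells, each 'X', 'O', or None; MAX plays 'X'.
LINES = [(0,1,2), (3,4,5), (6,7,8),      # rows
         (0,3,6), (1,4,7), (2,5,8),      # columns
         (0,4,8), (2,4,6)]               # diagonals

def successors(board, player):
    """Successor function: list of (move, state) pairs for the legal moves."""
    return [(i, board[:i] + (player,) + board[i+1:])
            for i in range(9) if board[i] is None]

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def terminal_test(board):
    """Terminal test: is the game finished?"""
    return winner(board) is not None or all(c is not None for c in board)

def utility(board):
    """Utility of a terminal state for MAX: win +1, lose -1, draw 0."""
    return {'X': 1, 'O': -1, None: 0}[winner(board)]
```

The initial state is simply the empty board, `(None,) * 9`.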
b = branching factor, d = depth of the search tree
Chess:
b ≈ 35
d ≈ 100
- search tree has ~ 35^100 ≈ 10^154 nodes (!!)
- completely impractical to search this
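The size estimate follows from b^d; a quick check of the arithmetic:

```python
import math

b, d = 35, 100                    # chess branching factor and typical game depth
log10_tree = d * math.log10(b)    # log10 of b**d
print(round(log10_tree))          # 154, i.e. the tree has about 10^154 nodes
```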
Game-playing emphasizes being able to make optimal decisions in a finite amount of time
Somewhat realistic as a model of a real-world agent
Even if games themselves are artificial
Partial Game Tree for Tic-Tac-Toe
Game tree (2-player, deterministic, turns)
MAX to move
Minimax Algorithm
Assumptions:
Max depth = d, b legal moves at each point
Time O(b^d)
Space O(bd)
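The minimax recursion can be sketched on a small explicit tree. The nested-list representation (a leaf is a utility value, an internal node is a list of children) is just for illustration:

```python
def minimax(node, maximizing=True):
    """Backed-up minimax value of a game tree given as nested lists."""
    if isinstance(node, (int, float)):   # terminal state: return its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# MAX at the root, MIN one level below, utilities at the leaves:
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))  # 3: MAX picks the branch whose MIN value is largest
```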
Practical problem with minimax search
=> pruning
Remove branches that do not influence the final decision
Revisit the example …
Alpha-Beta Example
[−∞, +∞]
[−∞, +∞]
Alpha-Beta Example (continued)
[−∞, +∞]
[−∞, 3]
Alpha-Beta Example (continued)
[−∞, +∞]
[−∞, 3]
Alpha-Beta Example (continued)
[3, +∞]
[3, 3]
Alpha-Beta Example (continued)
[3, +∞]
This node is worse for MAX
[3, 3]  [−∞, 2]
Alpha-Beta Example (continued)
[3, 14]
[3, 5]
[3, 3]
[3, 3]
Worst-Case
branches are ordered so that no pruning takes place; in this case alpha-beta gives no improvement over exhaustive search
Best-Case
each player’s best move is the left-most alternative (i.e., evaluated first)
in practice, performance is closer to the best case than the worst case
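A minimal alpha-beta sketch on a game tree given as nested lists (leaves are utilities; the representation is illustrative, not the lecture's code). A branch is abandoned as soon as its [alpha, beta] window becomes empty:

```python
def alphabeta(node, alpha=float('-inf'), beta=float('inf'), maximizing=True):
    """Alpha-beta value of a nested-list game tree; prunes when alpha >= beta."""
    if isinstance(node, (int, float)):   # terminal state: return its utility
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                    # beta cutoff: MIN never allows this branch
        return value
    else:
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:
                break                    # alpha cutoff: MAX never allows this branch
        return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # 3, the same answer full minimax gives
```

On this tree the second MIN node is cut off after its first leaf (value 2 < alpha 3), which is exactly the "[3, 3]  [−∞, 2]" step in the worked example.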
[Figure: example game tree with MAX and MIN levels and leaf values 3 4 1 2 5 6 7 8]
Alpha Beta Pruning
Standard approach:
cutoff test (where do we stop descending the tree?):
depth limit
better: iterative deepening
cutoff only when no big changes are expected to occur next (quiescence search)
evaluation function:
when the search is cut off, we evaluate the current state by estimating its utility using an evaluation function
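The cutoff test and the evaluation function combine into depth-limited minimax. A sketch using a nested-list tree (leaves are true utilities); the leaf-averaging evaluation function is a crude stand-in for a real heuristic:

```python
def cutoff_minimax(node, depth, maximizing, eval_fn):
    """Depth-limited minimax: when the depth limit is reached before a
    terminal state, estimate the value with eval_fn instead of searching on."""
    if isinstance(node, (int, float)):   # terminal state: true utility
        return node
    if depth == 0:                       # cutoff test: stop descending
        return eval_fn(node)             # static evaluation of the position
    values = [cutoff_minimax(c, depth - 1, not maximizing, eval_fn)
              for c in node]
    return max(values) if maximizing else min(values)

def avg_eval(node):
    """Illustrative evaluation only: average of all leaves below the node."""
    leaves, stack = [], [node]
    while stack:
        n = stack.pop()
        if isinstance(n, (int, float)):
            leaves.append(n)
        else:
            stack.extend(n)
    return sum(leaves) / len(leaves)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(cutoff_minimax(tree, 2, True, avg_eval))  # 3: deep enough to be exact
cutoff_minimax(tree, 1, True, avg_eval)         # shallower: only an estimate
```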
Static (Heuristic) Evaluation Functions
An Evaluation Function:
estimates how good the current board configuration is for a player.
Typically, one figures out how good it is for the player and how good it is for the opponent, and subtracts the opponent’s score from the player’s.
Othello: number of white pieces − number of black pieces
Chess: value of all white pieces − value of all black pieces
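The chess material heuristic can be sketched as follows. The piece values are the conventional ones; the function name and list-of-letters representation are ours:

```python
# Conventional material values (king omitted: it is never captured).
PIECE_VALUE = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def material_eval(white_pieces, black_pieces):
    """Value of all white pieces minus value of all black pieces.
    Each argument is a list of piece letters, e.g. ['Q', 'R', 'P']."""
    return (sum(PIECE_VALUE[p] for p in white_pieces)
            - sum(PIECE_VALUE[p] for p in black_pieces))

print(material_eval(['Q', 'R', 'P'], ['R', 'R', 'P']))  # (9+5+1) - (5+5+1) = 4
```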
With iterative deepening, when the clock runs out we use the solution found at the previous depth limit.
The State of Play
Checkers:
Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994.
Chess:
Deep Blue defeated human world champion Garry Kasparov in
a six-game match in 1997.
Othello:
human champions refuse to compete against computers: they
are too good.
Go:
human champions refuse to compete against computers: they are too bad. In Go, b > 300 (!)
See (e.g.) http://www.cs.ualberta.ca/~games/ for more information
Deep Blue