Lecture 6 - Minmax Alpha Beta
Slides were adapted from those by Dan Klein and Pieter Abbeel for CS188 Intro to AI
at UC Berkeley (ai.berkeley.edu)
Table of Contents
• History / Overview
• Minimax for Zero-Sum Games
• α-β Pruning
• Finite lookahead and evaluation
Game Playing State-of-the-Art
• Checkers:
► 1950: First computer player.
► 1994: First computer champion: Chinook ended the 40-year reign of human champion Marion Tinsley, using a complete 8-piece endgame database.
• Axes:
► Deterministic or stochastic?
• Chess, LoL, GTA?
► One, two, or more players?
► Zero sum?
► Perfect information (can you see the state)?
• Go, Poker, Bridge, Mahjong
Value of a State
• Value of a state:
► The best achievable outcome (utility) from that state
[Figure: game tree distinguishing non-terminal states from terminal states; terminal values shown: 2 0 … 2 6 … 4 6]
Adversarial Game Trees
[Figure: adversarial game tree; terminal states: -8 -5 -10 +8]
(Assuming two turns of play; the numbers are for illustration only)
Adversarial Search (Minimax)
• Deterministic, zero-sum games:
► Tic-tac-toe, chess, checkers
► One player maximizes the result; the other minimizes it
• Minimax search:
► A state-space search tree
► Minimax values are computed recursively
[Figure: minimax tree with MIN-level values 2 and 5]
def max-value(state):
    if state is terminal: return its utility
    initialize v = -∞
    for each successor of state:
        v = max(v, min-value(successor))
    return v

def min-value(state):
    if state is terminal: return its utility
    initialize v = +∞
    for each successor of state:
        v = min(v, max-value(successor))
    return v
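The pseudocode above can be made runnable with a minimal sketch. It assumes a hypothetical tree encoding that is not part of the slides: a leaf is a bare number (its utility) and an internal node is a list of successors. The helper `is_terminal` and the two-ply tree shape built from the illustrative values -8, -5, -10, +8 are also assumptions.

```python
import math

def is_terminal(state):
    # Hypothetical convention: a leaf is a bare number, an internal node is a list.
    return not isinstance(state, list)

def max_value(state):
    # MAX picks the successor with the highest minimax value.
    if is_terminal(state):
        return state
    v = -math.inf
    for successor in state:
        v = max(v, min_value(successor))
    return v

def min_value(state):
    # MIN picks the successor with the lowest minimax value.
    if is_terminal(state):
        return state
    v = math.inf
    for successor in state:
        v = min(v, max_value(successor))
    return v

# Assumed two-ply tree: MAX moves at the root, MIN replies.
tree = [[-8, -5], [-10, 8]]
root_value = max_value(tree)
```

Here MIN would answer the left move with -8 and the right move with -10, so MAX prefers the left move.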
Minimax Example
• Up-triangle: MAX player
• Down-triangle: MIN player
[Figure: minimax tree with leaf values 3 12 8 2 4 6 14 5 2]
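The example can be worked by hand. The grouping below is an assumption about the figure: a MAX root over three MIN nodes, each with three leaves in the order listed.

```latex
\begin{aligned}
V(\mathrm{MIN}_1) &= \min(3, 12, 8) = 3 \\
V(\mathrm{MIN}_2) &= \min(2, 4, 6) = 2 \\
V(\mathrm{MIN}_3) &= \min(14, 5, 2) = 2 \\
V(\mathrm{root}) &= \max(3, 2, 2) = 3
\end{aligned}
```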
Minimax Properties
• Minimax play is optimal when both players act rationally (each is a perfect player). What if the opponent is not?
[Figure: MAX root over two MIN nodes; leaf values 10 10 9 100]
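A small sketch of the "what if not?" question, under assumptions not stated on the slide: the tree is a MAX root over two MIN nodes with leaves (10, 10) and (9, 100), and the hypothetical non-rational opponent picks a leaf uniformly at random.

```python
# Assumed tree shape: MAX at the root chooses between two MIN nodes.
left, right = [10, 10], [9, 100]

# Minimax assumes MIN always picks the smallest leaf.
minimax_left, minimax_right = min(left), min(right)
minimax_choice = "left" if minimax_left >= minimax_right else "right"

# Hypothetical non-rational opponent: picks a leaf uniformly at random.
expected_left = sum(left) / len(left)
expected_right = sum(right) / len(right)
expected_choice = "left" if expected_left >= expected_right else "right"
```

Against a perfect MIN player the left move guarantees 10, while against the random opponent the right move has the higher average payoff, so minimax can be overly cautious when the opponent is imperfect.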
Minimax Efficiency
• How efficient is minimax?
► Just like (exhaustive) DFS
► Time: O(b^m), for branching factor b and maximum depth m
► Space: O(bm)
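The exponential time cost comes from counting nodes. As a sketch, assume a uniform tree in which every non-terminal state has b successors and all terminal states sit at depth m; the function name N is introduced here for illustration only.

```latex
N(b, m) = \sum_{d=0}^{m} b^{d} = \frac{b^{m+1} - 1}{b - 1} = O(b^{m}),
\qquad N(3, 2) = 1 + 3 + 9 = 13 .
```

Depth-first search only keeps the current path and the siblings along it in memory, which is why the space cost stays linear in m.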
Minimax Pruning
[Figure: the same tree after pruning; leaves examined: 3 12 8 2 14 5 2]
Alpha-Beta Pruning
• General configuration (MIN version)
► We’re computing the MIN-VALUE at some node n
► We’re looping over n’s children
► n’s estimate of its children’s minimum is dropping
► Who cares about n’s value? MAX
► Let α be the best value that MAX can get so far at any choice point along the current path from the root
► If n’s value becomes worse than α, MAX will avoid it, so we can prune n’s other children (it’s already bad enough that it won’t be played)
[Figure: alternating MAX and MIN levels along the path from the root, with nodes labelled a and n]
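The MIN-version rule above, together with its mirror image for MAX nodes (using a bound β for MIN), can be sketched as follows. This is a minimal illustration, not the course's reference code. The nested-list tree encoding, the `visited` list, and the use of `<=` and `>=` as the cutoff tests are assumptions, and the tree shape is the assumed grouping of the leaves 3 12 8 2 4 6 14 5 2 into three MIN nodes.

```python
import math

def alphabeta(state, alpha, beta, maximizing, visited):
    # Hypothetical convention: a leaf is a bare number, an internal node is a list.
    if not isinstance(state, list):
        visited.append(state)  # record which leaves are actually examined
        return state
    if maximizing:
        v = -math.inf
        for child in state:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:      # MIN above will never allow this node
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in state:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v <= alpha:         # MAX above already has something at least as good
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
value = alphabeta(tree, -math.inf, math.inf, True, visited)
```

On this tree the second MIN node is abandoned as soon as the leaf 2 is seen, because α is already 3, so the leaves 4 and 6 are never examined and the root value is unchanged.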
[Figure: alpha-beta pruning example with α = 3; leaves 3 12 8 2 14 5 2]
• Good child ordering improves the effectiveness of pruning
• Example:
► Suppose we have 100 seconds per move and can explore 10K nodes/sec
► So we can check 1M nodes per move
► α-β reaches about depth 8, which makes a decent chess program
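The arithmetic behind this example can be checked with a short sketch. Two inputs are assumptions not given on the slide: a chess branching factor of about 35, and the standard best-case result that alpha-beta with good child ordering examines roughly b^(d/2) nodes instead of b^d.

```python
import math

seconds_per_move = 100
nodes_per_second = 10_000
budget = seconds_per_move * nodes_per_second  # 1,000,000 nodes per move

b = 35  # assumed chess branching factor, not given on the slide

# Plain minimax visits about b**d nodes, so d is about log_b(budget).
depth_minimax = math.log(budget) / math.log(b)

# Best-case alpha-beta visits about b**(d/2) nodes, doubling the reachable depth.
depth_alphabeta = 2 * depth_minimax
```

Under these assumptions plain minimax reaches a little under depth 4, while well-ordered alpha-beta reaches a little under depth 8, which is consistent with the figure quoted above.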