Lecture7 Part2
[Figure: tic-tac-toe boards in play]
Minimax
[Figure: three finished tic-tac-toe boards with their values: O wins = -1, draw = 0, X wins = 1]
Minimax
PLAYER(s)
PLAYER(empty board) = X
PLAYER(board after X's first move) = O
ACTIONS(s)
ACTIONS(board with two empty squares, O to move) = { O in the first empty square, O in the second empty square }
RESULT(s, a)
RESULT(board, O's move into an empty square) = the same board with that O added
TERMINAL(s)
TERMINAL(board still in play) = false
TERMINAL(board where X has three in a row) = true
UTILITY(s)
UTILITY(board where X has three in a row) = 1
UTILITY(board where O has three in a row) = -1
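The functions above can be sketched in Python for tic-tac-toe. This is a minimal illustration, not the lecture's own code: the board encoding (a 9-character string with "." for an empty square), the lowercase function names, and the helper `winner` are all assumptions made for this example.

```python
# Board encoding (an assumption for this sketch): a 9-character string in
# row-major order, using "X", "O", and "." for an empty square.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def player(s):
    """X moves first, so X is to move whenever both sides have moved equally often."""
    return "X" if s.count("X") == s.count("O") else "O"

def actions(s):
    """The set of empty squares (as indices 0..8) the player to move may take."""
    return {i for i, cell in enumerate(s) if cell == "."}

def result(s, a):
    """The board after the player to move takes square a."""
    if s[a] != ".":
        raise ValueError(f"square {a} is already taken")
    return s[:a] + player(s) + s[a + 1:]

def winner(s):
    """Hypothetical helper: "X" or "O" if that side has three in a row, else None."""
    for i, j, k in LINES:
        if s[i] != "." and s[i] == s[j] == s[k]:
            return s[i]
    return None

def terminal(s):
    """True once someone has won or no empty square is left."""
    return winner(s) is not None or "." not in s

def utility(s):
    """1 if X has won, -1 if O has won, 0 otherwise (meaningful for terminal boards)."""
    return {"X": 1, "O": -1, None: 0}[winner(s)]

empty = "." * 9
print(player(empty))                 # X
print(player(result(empty, 4)))      # O
print(sorted(actions(".XOOXXX.O")))  # [0, 7]
print(utility("O.X.OXXOX"))          # 1
```

Strings are immutable, so `result` returns a new board and leaves the original state unchanged, which is what a recursive search needs.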
[Figure: minimax game tree for a tic-tac-toe position]
Terminal board where X has won: VALUE: 1
State with PLAYER(s) = O: MIN-VALUE: 0
    One move for O leads to MAX-VALUE: 1, where X's reply ends the game with VALUE: 1
    The other move for O leads to MAX-VALUE: 0, where X's reply ends the game with VALUE: 0
Root state with PLAYER(s) = X: MAX-VALUE: 1
    The subtrees below the root include the MIN-VALUE: 0 state above, a terminal board with VALUE: -1, and a state with MAX-VALUE: 0 that ends with VALUE: 0
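[Figure: game tree with numeric values]
MAX over the leaves 5, 3, 9 = 9
MIN over MAX(5, 3, 9) = 9 and MAX(2, 8) = 8 gives 8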
Minimax
• Given a state s:
• MAX picks the action a in ACTIONS(s) that produces the highest value of MIN-VALUE(RESULT(s, a))
• MIN picks the action a in ACTIONS(s) that produces the smallest value of MAX-VALUE(RESULT(s, a))
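As a small worked example of this rule, the sketch below applies it to the numeric tree shown earlier (leaves 5, 3, 9 under one MAX node and 2, 8 under another, with MIN at the root). The nested-list tree encoding and the names `max_choice` and `min_choice` are assumptions made for illustration; an action is simply the index of a child.

```python
# A tree is either an int (a leaf holding its value) or a list of subtrees.

def max_value(node):
    """Value of a node where MAX is to move; a leaf is its own value."""
    if isinstance(node, int):
        return node
    return max(min_value(child) for child in node)

def min_value(node):
    """Value of a node where MIN is to move; a leaf is its own value."""
    if isinstance(node, int):
        return node
    return min(max_value(child) for child in node)

def max_choice(node):
    """Index of the child MAX picks: the one with the highest MIN-VALUE."""
    return max(range(len(node)), key=lambda i: min_value(node[i]))

def min_choice(node):
    """Index of the child MIN picks: the one with the smallest MAX-VALUE."""
    return min(range(len(node)), key=lambda i: max_value(node[i]))

tree = [[5, 3, 9], [2, 8]]   # MIN at the root, MAX at the two inner nodes
print(min_value(tree))       # 8
print(min_choice(tree))      # 1
```

MIN prefers the second subtree because its best outcome for MAX (8) is lower than the first subtree's (9).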
Minimax
function MAX-VALUE(state):
    if TERMINAL(state):
        return UTILITY(state)
    v = -∞
    for action in ACTIONS(state):
        v = MAX(v, MIN-VALUE(RESULT(state, action)))
    return v
Minimax
function MIN-VALUE(state):
    if TERMINAL(state):
        return UTILITY(state)
    v = ∞
    for action in ACTIONS(state):
        v = MIN(v, MAX-VALUE(RESULT(state, action)))
    return v
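The two functions translate almost line for line into Python. The sketch below is one possible rendering for tic-tac-toe, under stated assumptions: the board is a 9-character string with "." for an empty square, the game helpers are compact stand-ins written for this example, and the wrapper `minimax` is a hypothetical name for the step that turns values into a move. The sample positions are chosen so that the results match the values 1, 0, and -1 in the game-tree figure; the exact boards on the slides may differ.

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

# Compact game helpers (assumed encoding: "X", "O", "." in row-major order).
def player(s):
    return "X" if s.count("X") == s.count("O") else "O"

def actions(s):
    return [i for i, cell in enumerate(s) if cell == "."]

def result(s, a):
    return s[:a] + player(s) + s[a + 1:]

def winner(s):
    for i, j, k in LINES:
        if s[i] != "." and s[i] == s[j] == s[k]:
            return s[i]
    return None

def terminal(s):
    return winner(s) is not None or "." not in s

def utility(s):
    return {"X": 1, "O": -1, None: 0}[winner(s)]

def max_value(state):
    """Best value X (MAX) can guarantee from this state."""
    if terminal(state):
        return utility(state)
    v = -math.inf
    for action in actions(state):
        v = max(v, min_value(result(state, action)))
    return v

def min_value(state):
    """Best value O (MIN) can guarantee from this state."""
    if terminal(state):
        return utility(state)
    v = math.inf
    for action in actions(state):
        v = min(v, max_value(result(state, action)))
    return v

def minimax(state):
    """Hypothetical wrapper: the optimal square for the player to move, or None."""
    if terminal(state):
        return None
    if player(state) == "X":
        return max(actions(state), key=lambda a: min_value(result(state, a)))
    return min(actions(state), key=lambda a: max_value(result(state, a)))

root = ".XOOX.X.O"               # X to move
print(max_value(root))           # 1
print(minimax(root))             # 7 (completes the middle column)
print(min_value(".XOOXXX.O"))    # 0 (O to move, can still force a draw)
```

The search visits every reachable position, which is fine for late positions like these but grows quickly from an empty board.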
Optimizations
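[Figure: game tree with MAX at the root and three MIN nodes below it]
Without pruning: leaves (4, 8, 5), (9, 3, 7), (2, 4, 6); MIN values 4, 3, 2; root value 4
With pruning: leaves examined (4, 8, 5), (9, 3), (2); MIN values 4, ≤3, ≤2; root value 4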
Alpha-Beta Pruning
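The slides give only the pruned tree and the name of the technique, so the following is a sketch of the standard alpha-beta formulation rather than the lecture's own code. The nested-list tree, the function name `alphabeta`, and the `visited` list (used only to show which leaves get examined) are assumptions made for this example. `alpha` is the best value MAX can already guarantee, `beta` the best value MIN can already guarantee; once `alpha >= beta`, the remaining children cannot change the result and are skipped.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of node, skipping branches that cannot affect the result.

    node is an int (leaf) or a list of subtrees; visited, if given,
    collects the leaves that were actually examined.
    """
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, v)
            if alpha >= beta:   # MIN above already has a better option
                break
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, v)
        if alpha >= beta:       # MAX above already has a better option
            break
    return v

tree = [[4, 8, 5], [9, 3, 7], [2, 4, 6]]   # MAX at the root, MIN below
seen = []
print(alphabeta(tree, True, visited=seen))  # 4
print(seen)                                 # [4, 8, 5, 9, 3, 2]
```

The result is the same as plain minimax, but the leaves 7, 4, and 6 are never examined: after seeing 3 in the second subtree (and 2 in the third), that subtree's value is at most 3 (or 2), which cannot beat the 4 already available to MAX.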