Overview: This document covers games and optimal decisions in games: perfect vs. imperfect information, minimax, alpha-beta pruning (which improves on minimax by pruning parts of the search tree), heuristics that speed up search by replacing minimax utility values with heuristic evaluations below a cutoff depth, and feature-based evaluation functions together with their linearity and independence limitations.

CS335 Introduction to AI

Francisco Iacobelli
June 25, 2015
Games
Competitive and Perfect Information

- Competitive: commonly zero-sum (whatever one player wins, the other loses)
- Perfect information: players know the results of all previous moves; there is one best way to play for each player
- Imperfect information: players do not know all of the previous moves (they may play simultaneously)
- Simple states for representation (not Robo Soccer, but to be fair...)
Games
Language/Functions

S0 //initial state
player(s) // who’s the player in state s
actions(s) // possible moves from state s
result(s,a) // the state after action a is taken on state s
terminal(s) // returns true if s is a terminal state
utility(s,p) // the objective function in state s for player p
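As a concrete illustration of this interface, here is a sketch in Python for the game of Nim (take 1-3 stones; whoever takes the last stone wins). The game choice, the rules variant, and all names are assumptions for illustration, not from the slides.

```python
# Illustrative Nim instance of the game-description interface.
S0 = (7, "MAX")  # initial state: 7 stones, MAX to move

def player(s):            # whose turn it is in state s
    return s[1]

def actions(s):           # possible moves: take 1, 2 or 3 stones
    stones, _ = s
    return [n for n in (1, 2, 3) if n <= stones]

def result(s, a):         # the state after action a is taken in state s
    stones, p = s
    return (stones - a, "MIN" if p == "MAX" else "MAX")

def terminal(s):          # true if s is a terminal state (no stones left)
    return s[0] == 0

def utility(s, p):        # objective function in terminal state s for player p
    # The player to move in a terminal state did NOT take the last stone,
    # so they lost: -1 for them, +1 for the other player.
    return -1 if player(s) == p else 1
```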
Optimal Decisions
Optimal Players

- A game is not only about finding the best path to a goal
- The other player has a say
- Two players: MAX and MIN
- actions(s) and result(s,a) define a game tree
- Tic Tac Toe: fewer than 9! (362,880) terminal nodes
- Chess: over 10^40 states
- Search tree as theory
Game Tree
Tic Tac Toe

[Figure: partial tic-tac-toe game tree]

MIN plays MAX: a small 2-ply game

[Figure: 2-ply game tree; upward triangles are MAX nodes, downward triangles are MIN nodes]
Minimax
Picking my best move against your best move

minimax(s) =
  utility(s)                                   if terminal(s)
  max_{a ∈ actions(s)} minimax(result(s, a))  if player(s) = MAX
  min_{a ∈ actions(s)} minimax(result(s, a))  if player(s) = MIN

Minimax Algorithm
Recursive

function minimax-decision(state)
  v = max-value(state)
  return the action a in actions(state) whose result has value v
//
function max-value(state)
  if terminal(state) return utility(state)
  v = -infinity
  for a in actions(state) do
    v = max(v, min-value(result(state,a)))
  return v
//
function min-value(state)
  if terminal(state) return utility(state)
  v = +infinity
  for a in actions(state) do
    v = min(v, max-value(result(state,a)))
  return v
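The recursion above can be sketched as runnable Python over an explicit game tree given as nested lists (an int is a terminal utility; a list is a node whose children are the results of the available moves). The tree representation is an illustrative assumption.

```python
# Minimax over a nested-list game tree. MAX and MIN alternate by depth.
def minimax(node, is_max=True):
    if isinstance(node, int):      # terminal(s): return utility(s)
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# Two-ply example: root is MAX, its three children are MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))  # MAX picks the branch whose minimum is largest -> 3
```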
Minimax
Discussion

- Complete depth-first exploration
- Depth m with b legal moves per state: time O(b^m)
- Space complexity (memory): O(bm)
- Chess: b ≈ 35; on average 50 ≤ m ≤ 100
- Impractical for most games, but the basis of other algorithms
Minimax
Multiplayer

- Utility vectors instead of single values

Alpha-Beta Pruning
Intuition

Do we need to expand all nodes?

minimax(root) = max(min(3, 12, 8), min(2, x, y), min(14, 5, 2))
              = max(3, min(2, x, y), 2)
              = max(3, z, 2)
              = 3

Do we need z? No: z = min(2, x, y) ≤ 2 < 3, so the values of x and y cannot change the result.
Alpha-Beta Pruning

Two values:
- α = value of the best (i.e., highest-value) choice found so far for MAX
- β = value of the best (i.e., lowest-value) choice found so far for MIN
- Each node keeps track of its [α, β] values
Alpha-Beta Pruning
Example

Trace on a 2-ply tree: root A is a MAX node; its children B, C, D are MIN nodes with leaf values B: (3, 12, 8), C: (2, x, y), D: (14, 5, 2).

1. Start at A with α = −∞, β = +∞ and descend into B.
2. Leaf 3: not ≤ α, so continue; B's bound becomes β = 3.
3. Leaves 12 and 8 are both ≥ 3, so B's value is 3.
4. Back at A: 3 ≥ α, so update α = 3.
5. Descend into C with α = 3. Leaf 2: 2 ≤ α, so C's value can be at most 2 and MAX will never choose C. Prune the remaining leaves x and y.
6. Descend into D with α = 3. Leaf 14: β = 14; leaf 5: β = 5; leaf 2: 2 ≤ α, so D's value 2 cannot beat α (it is the last leaf anyway).
7. Neither C (≤ 2) nor D (= 2) beats B's 3, so minimax(A) = 3.
Alpha-Beta Pruning
Algorithm

function alpha-beta-search(state)
  v = max-value(state, −∞, +∞)
  return the action in actions(state) with value v
//
function max-value(state, α, β)
  if terminal(state) return utility(state)
  v = -infinity
  for a in actions(state) do
    v = max(v, min-value(result(state,a), α, β))
    if v >= β return v    // prune: MIN will never allow this
    α = max(α, v)
  return v
//
function min-value(state, α, β)
  if terminal(state) return utility(state)
  v = +infinity
  for a in actions(state) do
    v = min(v, max-value(result(state,a), α, β))
    if v <= α return v    // prune: MAX will never allow this
    β = min(β, v)
  return v
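The algorithm can be sketched as runnable Python over a game tree given as nested lists (an int is a terminal utility, a list is an internal node). The tree is the one from the intuition slide, with illustrative values x = 4, y = 6 for the unknown leaves; the code records which leaves were actually evaluated, to show the pruning.

```python
# Alpha-beta pruning over a nested-list game tree.
def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              is_max=True, visited=None):
    if visited is None:
        visited = []
    if isinstance(node, int):          # terminal: record and return utility
        visited.append(node)
        return node
    if is_max:
        v = float("-inf")
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:              # prune: MIN will never allow this
                return v
            alpha = max(alpha, v)
        return v
    v = float("inf")
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v <= alpha:                 # prune: MAX will never allow this
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, visited=seen))   # 3, same answer as plain minimax
print(seen)                            # the x, y leaves (4 and 6) never appear
```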
Alpha-Beta Pruning
Properties

- Pruning does not affect the final outcome
- Sorting moves by expected result improves alpha-beta performance
- Perfect ordering: O(b^(m/2))
- An exercise in metareasoning
Real Time Decisions
Heuristics, welcome back.

What if we change minimax as follows:

- replace utility(s) with eval(s), a heuristic evaluation function
- replace terminal(s) with cutoff(s,d), which decides when to apply eval(s)
- therefore h-minimax(s,d) is now a function of the state s and the depth d explored
Heuristics
Search Faster

h-minimax(s, d) =
  eval(s)                                                if cutoff(s, d)
  max_{a ∈ actions(s)} h-minimax(result(s, a), d + 1)    if player(s) = MAX
  min_{a ∈ actions(s)} h-minimax(result(s, a), d + 1)    if player(s) = MIN
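This depth-limited scheme can be sketched in runnable Python on a tree given as nested lists. The heuristic eval used here (the average of a subtree's leaves) is an illustrative assumption, not an evaluation function from the slides.

```python
# Depth-limited h-minimax: below the cutoff depth, use eval instead of search.
def h_minimax(node, depth, limit, is_max=True):
    if isinstance(node, int):              # a true terminal state
        return node
    if depth >= limit:                     # cutoff(s, d): apply eval(s)
        return eval_avg(node)
    vals = [h_minimax(c, depth + 1, limit, not is_max) for c in node]
    return max(vals) if is_max else min(vals)

def eval_avg(node):                        # toy heuristic: mean of all leaves
    leaves, stack = [], [node]
    while stack:
        n = stack.pop()
        if isinstance(n, int):
            leaves.append(n)
        else:
            stack.extend(n)
    return sum(leaves) / len(leaves)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(h_minimax(tree, 0, 5))  # deep enough to reach terminals: plain minimax
print(h_minimax(tree, 0, 1))  # cutoff at depth 1: max of the leaf averages
```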

Heuristics
Good Evaluation Functions

- A bad evaluation function may result in a loss
- eval(s_w) ≥ eval(s_d) ≥ eval(s_l), where w = win, d = draw, l = loss
- eval(s) should be fast to compute
- eval(s) on non-terminal states should be highly correlated with winning
Heuristics
Combination of Features

eval(s) = w_1·f_1(s) + w_2·f_2(s) + ... + w_n·f_n(s) = Σ_{i=1}^{n} w_i·f_i(s)

Assumption: Each feature is independent of other features

What are good features and weights, say for chess? for
checkers?
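One answer for chess can be sketched in Python: material-difference features with the classic piece weights. Both the features and the weights here are illustrative assumptions (real engines tune many more features).

```python
# Weighted linear evaluation: eval(s) = sum of w_i * f_i(s).
PIECES = ["pawn", "knight", "bishop", "rook", "queen"]
WEIGHTS = [1, 3, 3, 5, 9]          # classic material values (an assumption)

def material_features(ours, theirs):
    # f_i(s): our piece count minus theirs, per piece type
    return [ours[p] - theirs[p] for p in PIECES]

def eval_state(ours, theirs):
    feats = material_features(ours, theirs)
    return sum(w * f for w, f in zip(WEIGHTS, feats))

ours = {"pawn": 8, "knight": 2, "bishop": 2, "rook": 2, "queen": 1}
theirs = {"pawn": 8, "knight": 2, "bishop": 1, "rook": 2, "queen": 1}
print(eval_state(ours, theirs))    # up one bishop -> +3
```

Note that this function scores positions by material only, which is exactly the independence assumption the next slide criticizes.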
Heuristics
Linearity and the independence assumption

[Figure: two chess positions with identical material but very different prospects; White to move]

Both boards would receive the same heuristic value, but they shouldn't!
Heuristics
Cutoff for alpha-beta pruning

The idea is to replace the termination condition in alpha-beta with:

if cutoff(s,d) then return eval(s)

- fixed depth d
- iterative deepening on d
- if s is terminal, cutoff(s,d) returns true
- add quiescence search: apply eval(s) only to quiescent ("quiet") states
- try to prevent the horizon effect: pushing inevitable consequences beyond the search depth
Stochastic Games
Chance plays a part

Backgammon: states and moves depend on a dice roll, which the opponent cannot foresee.
Stochastic Games
Incorporate Chance in the Tree

Chance is represented as chance nodes in the game tree.
Stochastic Games
Expected Value

expectiminimax(s) =
  utility(s)                                           if terminal(s)
  max_{a ∈ actions(s)} expectiminimax(result(s, a))    if player(s) = MAX
  min_{a ∈ actions(s)} expectiminimax(result(s, a))    if player(s) = MIN
  Σ_r P(r) · expectiminimax(result(s, r))              if player(s) = CHANCE
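The chance case can be sketched in runnable Python on a tiny hand-built tree. Here internal nodes are dicts with a "type" and "children"; chance children carry their probabilities. The tree and its probabilities are illustrative assumptions.

```python
# Expectiminimax: like minimax, plus probability-weighted chance nodes.
def expectiminimax(node):
    if isinstance(node, (int, float)):    # terminal(s): return utility(s)
        return node
    kind = node["type"]
    if kind == "max":
        return max(expectiminimax(c) for c in node["children"])
    if kind == "min":
        return min(expectiminimax(c) for c in node["children"])
    # chance node: sum over outcomes r of P(r) * value(result(s, r))
    return sum(p * expectiminimax(c) for p, c in node["children"])

# MAX chooses between a sure 3 and a fair coin flip between 10 and -2.
tree = {"type": "max", "children": [
    3,
    {"type": "chance", "children": [(0.5, 10), (0.5, -2)]},
]}
print(expectiminimax(tree))  # 4.0: the flip is worth 0.5*10 + 0.5*(-2) > 3
```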

Playing Games
Complexity

- Chess: a beginner plans 4-6 ply; Kasparov ≈ 12
- Chess: say O(35^5) for a 5-ply search
- Backgammon: O(b^m · n^m), where n is the number of distinct dice rolls
- Backgammon: b ≈ 20 and n = 21
- Mario Bros: a 22 × 22 area around Mario and 16 possible actions every 40 milliseconds (Togelius, Shaker, Karakovskiy and Yannakakis, 2013)
- Monte Carlo simulations... stay tuned.
Playing Games
Current Status

- Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
- Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searched 200 million positions per second, used approximately 8,000 evaluation features and a database of 700,000 grandmaster games, and undisclosed methods for extending some lines of search up to 40 ply.
- Othello: human champions refuse to compete against computers, which are too good.
- Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
Exercise

Describe and implement:

- state descriptions
- move generators
- terminal tests
- utility functions
- evaluation functions (heuristics)

For: Monopoly, Scrabble, Texas Hold'em
