AI Lec07 Adversarial Search

Introduction to

Artificial Intelligence
Lecture: Adversarial Search
Outline

● Games
● Optimal Decisions in Games
● α-β Pruning
● Imperfect, Real-time Decisions
● Stochastic Games

Multiagent environments

● Each agent needs to consider the actions of other agents and how they affect its own welfare.
● The unpredictability of other agents introduces contingencies into the agent's problem-solving process.
● Game theory views any multiagent environment as a game.
○ The impact of each agent on the others is "significant," regardless of whether the agents are cooperative or competitive.
● Types of games:
○ Perfect information vs. imperfect information
○ Deterministic vs. chance
A sequential game has perfect information if each player, when making any decision, is perfectly informed of all the events that have previously occurred, including the "initialization event" of the game.
Types of game

                      | Deterministic                | Chance
Perfect information   | Chess, Checkers, Go, Othello | Backgammon, Monopoly
Imperfect information |                              | Bridge, poker, Scrabble, nuclear war
Adversarial search

● Adversarial search (often simply called games) covers competitive environments in which the agents' goals are in conflict.
● Zero-sum games of perfect information
○ Deterministic, fully observable, turn-taking, two-player environments
○ The utility values at the end of the game are always equal and opposite.
Primary assumptions

● Two players only, called MAX and MIN.
○ MAX moves first, and then they take turns moving until the game ends.
○ The winner receives a reward; the loser receives a penalty.
● Both players have complete knowledge of the game's state.
● No element of chance.
● Zero-sum games
○ The total payoff to all players is the same for every instance of the game.
● Rational players
○ Each player always tries to maximize his/her own utility.
Games as search

● S0 – Initial state: How the game is set up at the start.
● PLAYER(s): Which player, MAX or MIN, has the move in state s.
● ACTIONS(s): The set of legal moves in state s.
● RESULT(s, a) – Transition model: The state resulting from taking move a in state s.
● TERMINAL-TEST(s): Is the game finished?
● UTILITY(s, p) – Utility function: A numerical value of a terminal state s for a player p.
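As a concrete illustration (not from the slides), the six formal elements above can be realized for a simple Nim variant: players alternately remove 1-3 counters from a pile of 21, and whoever takes the last counter wins. The class and state encoding below are illustrative choices.

```python
# Illustrative sketch of the formal game elements for a simple Nim variant.
# A state is the pair (counters_left, player_to_move).

class NimGame:
    def initial_state(self):                      # S0: how the game starts
        return (21, "MAX")

    def player(self, s):                          # PLAYER(s): who moves in s
        return s[1]

    def actions(self, s):                         # ACTIONS(s): legal moves in s
        return [k for k in (1, 2, 3) if k <= s[0]]

    def result(self, s, a):                       # RESULT(s, a): transition model
        return (s[0] - a, "MIN" if s[1] == "MAX" else "MAX")

    def terminal_test(self, s):                   # TERMINAL-TEST(s): game over?
        return s[0] == 0

    def utility(self, s, p):                      # UTILITY(s, p) at a terminal s:
        winner = "MIN" if s[1] == "MAX" else "MAX"   # the player who just moved
        return 1 if winner == p else -1              # took the last counter

g = NimGame()
s = g.result(g.initial_state(), 3)   # MAX removes 3 counters
print(s)                             # → (18, 'MIN')
```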
Games vs. Search problems

● Complexity: games are usually too hard to solve exactly.
● Time limits: a decision must be made even when computing the optimal decision is infeasible.
● Efficiency: inefficiency is penalized severely.
○ Several interesting ideas on how to make the best possible use of time have been spawned.
Search Tree of Tic-Tac-Toe

[Figure: partial search tree for tic-tac-toe, from the empty board down to terminal states]
Optimal decision in games

● Normal search problem
○ An optimal solution is a sequence of actions leading to a goal state.
● Games
○ A strategy that guarantees a win for a player
○ The optimal strategy can be determined from the minimax value of each node.
● MINIMAX(s) =
    UTILITY(s)                                    if TERMINAL-TEST(s)
    max_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))    if PLAYER(s) = MAX
    min_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))    if PLAYER(s) = MIN
Example

[Figure: a two-ply game tree annotated with MAX's best move, MIN's best move, and the utility values for MAX]
The minimax algorithm

● Compute the minimax decision from the current state


● Use a simple recursive computation of the minimax values of
each successor state
○ The recursion proceeds all the way down to the leaves of
the tree, and then the minimax values are backed up
through the tree as the recursion unwinds.

The minimax algorithm

function MINIMAX-DECISION(state) returns an action
  return argmax_{a ∈ ACTIONS(state)} MIN-VALUE(RESULT(state, a))

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN-VALUE(RESULT(state, a)))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX-VALUE(RESULT(state, a)))
  return v
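The pseudocode above can be transcribed almost line for line into Python and run on a small hand-built game tree. The dict-based tree encoding is an illustrative choice: inner dicts are MIN nodes keyed by action name, and numbers are leaf utilities (the values match a common three-branch lecture example).

```python
# Minimax on a hand-built two-ply tree: dicts are internal nodes, numbers leaves.

def minimax_decision(tree):
    # MAX picks the action whose successor has the highest min-value.
    return max(tree, key=lambda a: min_value(tree[a]))

def max_value(node):
    if isinstance(node, (int, float)):   # TERMINAL-TEST / UTILITY
        return node
    return max(min_value(child) for child in node.values())

def min_value(node):
    if isinstance(node, (int, float)):
        return node
    return min(max_value(child) for child in node.values())

tree = {
    "a1": {"b1": 3, "b2": 12, "b3": 8},
    "a2": {"c1": 2, "c2": 4,  "c3": 6},
    "a3": {"d1": 14, "d2": 5, "d3": 2},
}
print(minimax_decision(tree))   # → a1  (the min-values of a1, a2, a3 are 3, 2, 2)
```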
The minimax algorithm

● A complete depth-first exploration of the game tree


● Completeness
○ Yes (if tree is finite)
● Optimality
○ Yes (against an optimal opponent)
● Time complexity
○ O(b^m), with branching factor b and maximum tree depth m
● Space complexity
○ O(bm) (depth-first exploration)
Problem with minimax search

● The number of game states is exponential in the depth of the tree.
→ Do not examine every node.
● Alpha-beta pruning: prune away branches that cannot possibly influence the final decision.
● Bounded lookahead
○ Limit the depth of each search.
○ This is what chess players do: look ahead a few moves and see what looks best.
Alpha-beta pruning

[Figure: alpha-beta pruning applied step by step to an example game tree]
Alpha-beta pruning

function ALPHA-BETA-SEARCH(state) returns an action
  v ← MAX-VALUE(state, -∞, +∞)
  return the action in ACTIONS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN-VALUE(RESULT(state, a), α, β))
    if v ≥ β then return v
    α ← MAX(α, v)
  return v

function MIN-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX-VALUE(RESULT(state, a), α, β))
    if v ≤ α then return v
    β ← MIN(β, v)
  return v
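A compact Python sketch of the pseudocode above, using the same illustrative dict-based tree encoding (inner dicts are nodes, numbers are leaves). It records which leaves are evaluated so the effect of pruning is visible.

```python
import math

# Alpha-beta search with a `visited` list counting leaf evaluations.

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    if visited is None:
        visited = []
    if isinstance(node, (int, float)):    # leaf: return its utility
        visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node.values():
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:                 # beta cut: MIN would never allow this
                return v
            alpha = max(alpha, v)
    else:
        v = math.inf
        for child in node.values():
            v = min(v, alphabeta(child, alpha, beta, True, visited))
            if v <= alpha:                # alpha cut: MAX already has better
                return v
            beta = min(beta, v)
    return v

tree = {
    "a1": {"b1": 3, "b2": 12, "b3": 8},
    "a2": {"c1": 2, "c2": 4,  "c3": 6},
    "a3": {"d1": 14, "d2": 5, "d3": 2},
}
leaves = []
print(alphabeta(tree, visited=leaves))   # → 3
print(len(leaves))                       # → 7  (c2 and c3 were pruned)
```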
Alpha-beta pruning

● Pruning does not affect the final result.
● Good move ordering improves the effectiveness of pruning.
● Killer move heuristic: try first the moves that produced cutoffs in similar positions.
● A transposition table avoids re-evaluating a state that can be reached by different move orderings.
Heuristic minimax

● Both minimax and alpha-beta pruning search all the way to terminal states.
○ This depth is usually impractical because moves must be made in a reasonable amount of time (~minutes).
● Cut off the search earlier with some depth limit.
● Use an evaluation function.
● H-MINIMAX(s, d) =
    EVAL(s)                                               if CUTOFF-TEST(s, d)
    max_{a ∈ ACTIONS(s)} H-MINIMAX(RESULT(s, a), d+1)     if PLAYER(s) = MAX
    min_{a ∈ ACTIONS(s)} H-MINIMAX(RESULT(s, a), d+1)     if PLAYER(s) = MIN
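H-MINIMAX can be sketched as follows on a small Nim-style counting game (take 1 or 2 counters; whoever takes the last one wins). The game, the depth limit of 3, and the evaluation heuristic are all illustrative assumptions, not from the slides.

```python
def player(s):  return s[1]                     # s = (counters_left, mover)
def actions(s): return [a for a in (1, 2) if a <= s[0]]
def result(s, a): return (s[0] - a, "MIN" if s[1] == "MAX" else "MAX")

def cutoff_test(s, d):                          # CUTOFF-TEST(s, d)
    return s[0] == 0 or d >= 3                  # terminal, or depth limit reached

def eval_fn(s):                                 # EVAL(s), from MAX's viewpoint
    # Heuristic: with moves {1, 2}, the mover loses iff counters % 3 == 0.
    losing_for_mover = s[0] % 3 == 0
    mover_is_max = s[1] == "MAX"
    return -1 if losing_for_mover == mover_is_max else 1

def h_minimax(s, d):
    if cutoff_test(s, d):
        return eval_fn(s)
    vals = [h_minimax(result(s, a), d + 1) for a in actions(s)]
    return max(vals) if player(s) == "MAX" else min(vals)

# MAX to move with 7 counters: taking 1 leaves 6, a losing count for MIN.
best = max(actions((7, "MAX")), key=lambda a: h_minimax(result((7, "MAX"), a), 1))
print(best)   # → 1
```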
Evaluation Functions

● The evaluation function should order the terminal states in the same way as the true utility function does.
○ States that are wins must evaluate better than draws, which in turn must evaluate better than losses.
● The computation must not take too long!
● For nonterminal states, the ordering should be strongly correlated with the actual chances of winning.
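One common way to meet these requirements (a standard technique from the literature, not stated on this slide) is a weighted linear sum of features. The sketch below scores chess positions by material balance, using the traditional illustrative piece values.

```python
# Hypothetical material-balance evaluation function for chess,
# a weighted linear sum of piece-count features from MAX's viewpoint.
MATERIAL = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def eval_material(own_counts, opp_counts):
    """Score a state by material balance; positive favors MAX."""
    own = sum(MATERIAL[p] * n for p, n in own_counts.items())
    opp = sum(MATERIAL[p] * n for p, n in opp_counts.items())
    return own - opp

# 8 pawns + 2 rooks (18) vs. 7 pawns + 1 queen (16):
print(eval_material({"pawn": 8, "rook": 2}, {"pawn": 7, "queen": 1}))  # → 2
```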
Cutting off search

● MINIMAX-CUTOFF is identical to MINIMAX-VALUE except that
○ TERMINAL-TEST is replaced by CUTOFF-TEST
○ UTILITY is replaced by EVAL

if CUTOFF-TEST(state, depth) then return EVAL(state)
Stochastic Games

● Uncertain outcomes are controlled by chance, not by an adversary!
● Why wouldn't we know what the result of an action will be?
○ Explicit randomness: rolling dice
○ Unpredictable opponents: the ghosts respond randomly
○ Actions can fail: when moving a robot, the wheels might slip
Expectimax search

● Values reflect average-case (expectimax) outcomes, not worst-case (minimax) outcomes.
● Expectimax search: compute the average score under optimal play.
○ Max nodes behave as in minimax search.
○ Chance nodes are like min nodes, but their outcome is uncertain.
○ Chance nodes compute expected utilities, i.e., the probability-weighted average of their children's values.
● The underlying uncertain-outcome problems can be formalized as Markov Decision Processes.
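The chance-node rule above can be sketched as follows; the tuple-based tree encoding and the example probabilities are illustrative inventions.

```python
# Expectimax: max nodes take the best child; chance nodes take the
# probability-weighted average of their children.
# Node formats: ("max", [children]) | ("chance", [(p, child), ...]) | leaf value.

def expectimax(node):
    if isinstance(node, (int, float)):             # leaf utility
        return node
    kind, children = node
    if kind == "max":
        return max(expectimax(c) for c in children)
    # chance node: expected utility = sum of p * value over outcomes
    return sum(p * expectimax(c) for p, c in children)

# MAX picks between two chance nodes (e.g., two dice-dependent moves):
tree = ("max", [
    ("chance", [(0.5, 8), (0.5, 2)]),      # expected value 5.0
    ("chance", [(0.75, 12), (0.25, 8)]),   # expected value 11.0
])
print(expectimax(tree))    # → 11.0
```

Note that minimax would prefer the first branch (worst case 2 vs. worst case 8 is actually the second branch here, but with probabilities the averages, not the extremes, decide).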
Expectimax search

● Some pruning is possible in expectimax search, although exact alpha-beta-style cuts require known bounds on the leaf utilities.
● Common techniques for pruning in expectimax include:
○ Depth limiting: bounding the depth of the search tree effectively prunes branches that are too deep to explore in practice.
○ Evaluation function: estimate the value of a state without fully exploring its subtree; if further exploration is unlikely to change the decision, the subtree can be pruned.
○ Probabilistic pruning: branches with very low probability can be pruned, since they contribute little to the overall expectation.
○ Iterative deepening: combined with the techniques above, deeper parts of the tree are explored only when necessary, based on the current state of the search.
References

● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân tạo (Fundamentals of Artificial Intelligence textbook). Khoa Công nghệ Thông tin, Trường ĐH Khoa học Tự nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở Trí tuệ Nhân tạo (Fundamentals of Artificial Intelligence lecture notes). Khoa Công nghệ Thông tin, Trường ĐH Khoa học Tự nhiên, ĐHQG-HCM.
