
SEARCH IN COMPLEX

ENVIRONMENTS
(Slides are based on Artificial
Intelligence A Modern Approach
by Stuart Russell & Peter Norvig)
Dr. K. Venkateswara Rao
Professor CSE
Contents
• Local search algorithms and optimization problems
– Hill-climbing search
– Simulated annealing
– Local beam search
– Evolutionary algorithms
• Optimal decisions in games
– The minimax search algorithm
– Optimal decisions in multiplayer games
• Alpha-Beta pruning
– Move ordering
• Monte Carlo tree search
• Kalman Filter
Systematic Algorithms
• Systematic Algorithms are characterized by Observable,
deterministic, known environments where solution (goal
configuration) and path (a sequence of actions) to the goal are
important.
• Observable: In an observable environment, the agent (the
entity trying to solve the problem) can directly observe the
current state of the environment. This means that the agent
has complete information about the state of the environment
at any given time.
• Deterministic: In a deterministic environment, the outcome
of an action is fully determined by the current state of the
environment and the action taken by the agent. There is no
randomness or uncertainty in the environment's response to
the agent's actions.
Systematic Algorithms
• Known: In a known environment, the agent has complete
knowledge of the environment's rules and dynamics. There
are no hidden states or unknown elements in the
environment.
• Solution as a sequence of actions: In such environments, the
solution to a problem is typically represented as a sequence
of actions that the agent can take to transition from the initial
state of the environment to the goal state.
• Local search, on the other hand, is a problem-solving
technique used in optimization and search problems. It differs
in several ways from the scenario described above, where
environments are observable, deterministic, and known, and a
solution is a sequence of actions.
Local Search
• Local search is an optimization technique used in artificial
intelligence to find solutions to problems by iteratively exploring
the space of possible solutions, focusing on improving the current
solution by making small incremental changes.
• It is particularly useful for problems where the search space is
large and it is impractical to explore all possible solutions.
• Characteristics of local search algorithms
1. Iterative Improvement
2. Exploration of Neighborhoods
3. No Backtracking
4. Heuristic Guidance
5. Stochastic or Deterministic
6. Convergence to Local Optima
Local Search
• Local search algorithms start with an initial solution and
iteratively explore the neighborhood of the current
solution by applying small changes to the current
solution.
• At each iteration, the algorithm evaluates neighboring
solutions and selects the one that offers the most
improvement according to some evaluation function or
objective.
• Local search algorithms often use heuristic information
to guide the search process.
• One of the main limitations of local search algorithms is
that they may converge to local optima, solutions that are
locally optimal but not globally optimal.
Local Search
• Local search algorithms do not backtrack.
– They only move forward by considering neighboring
solutions and selecting the one that offers the most
improvement. This can make them more efficient in large
search spaces but also means they may get stuck in local
optima.
• Local search algorithms can be stochastic or deterministic.
– Stochastic algorithms introduce randomness into the
search process, which can help escape local optima and
explore a broader range of solutions.
– Deterministic algorithms, on the other hand, follow a fixed
set of rules for selecting neighboring solutions and may be
more predictable in their behavior.
Local Search
• Local Search Algorithms are not Systematic. The path
followed by the agent is not retained. Two key
advantages of local search algorithms are:
1. They use very little memory, usually constant.
2. They can often find reasonable solutions in large or
infinite search spaces
• A state space landscape has both location (defined by
state), and elevation (defined by value of the heuristic
cost function (if aim is to find global minimum) or
objective function (if aim is to find global maximum)).
• A complete local search algorithm always finds a goal if
one exists
Local Search Algorithms
1. Hill-climbing search
2. Simulated Annealing
3. Local beam search
4. Evolutionary algorithms
Hill Climbing Search
Hill-Climbing Search
Steepest-Ascent Hill Climbing
• current ← start node
• loop do
– neighbor ← a highest-valued successor of current
– if neighbor.Value <= current.Value then return current.State
else current ← neighbor
• end loop
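A minimal Python sketch of steepest-ascent hill climbing, assuming the caller supplies a start state, a neighbors(state) generator, and a value(state) objective function (all hypothetical names):

def hill_climbing(start, neighbors, value):
    # Steepest-ascent hill climbing: always move to the best neighbor.
    current = start
    while True:
        # Pick the highest-valued successor of the current state.
        best = max(neighbors(current), key=value, default=None)
        if best is None or value(best) <= value(current):
            return current          # local maximum (or plateau) reached
        current = best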
Hill-Climbing Example & Problems
Local Search – Hill Climbing
Ridges Illustration
Variants of Hill Climbing
• Allow backtracking
• Stochastic hill climbing: choose at random from among
the uphill moves. The probability of selection varies
with the steepness of the uphill move. This usually
converges more slowly than steepest ascent.
• First-choice hill climbing: implements stochastic hill
climbing by generating successors randomly until one is
generated that is better than the current state.
• Random restart hill climbing: “If at first you don’t
succeed, try, try again.” It conducts a series of hill-
climbing searches from randomly generated initial states,
until a goal is found.
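As an illustration of the last variant, a small Python sketch of random-restart hill climbing, reusing the hill_climbing function above and assuming hypothetical random_state() and is_goal(state) helpers:

def random_restart_hill_climbing(random_state, neighbors, value, is_goal, max_restarts=100):
    # Run independent hill-climbing searches from random initial states
    # until a goal is found or the restart budget is exhausted.
    best = None
    for _ in range(max_restarts):
        result = hill_climbing(random_state(), neighbors, value)
        if is_goal(result):
            return result
        if best is None or value(result) > value(best):
            best = result
    return best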
Simulated Annealing
• Hill-climbing is incomplete
• Pure random walk, keeping track of the best state found
so far, is complete but very inefficient because it relies
solely on random moves without any intelligent
guidance
• Simulated Annealing is a variant of hill climbing that
combines hill climbing with a random walk in a way that
yields both efficiency and completeness.
• Simulated Annealing comes from the physical process of
annealing that is used to temper or harden metals and
glass by heating them to a high temperature / high energy
levels and then gradually cooling them, thus allowing the
material to reach a low energy crystalline state.
Simulated Annealing
Simulated Annealing Algorithm
• Initialize: Start with an initial solution to the optimization problem.
• Define Temperature Schedule: The algorithm requires a cooling schedule,
which determines how the temperature decreases over time. Initially, the
temperature is high, allowing for more exploration of the solution space, and then
it gradually decreases to focus more on exploitation.
• Iterate: Perform a series of iterations, each consisting of the following steps:
1. Generate a Neighbor
2. Evaluate Neighbor
3. Accept or Reject Neighbor: Compare the cost of the new solution with the
cost of the current solution. If the new solution is better (i.e., has a lower
cost), accept it. If the new solution is worse, accept it with a certain
probability, which depends on the temperature and the difference in cost
between the current and new solutions. This probabilistic acceptance allows
the algorithm to escape local optima.
4. Update Temperature: Adjust the temperature according to the cooling
schedule.
• Termination: Repeat the iterations until a stopping criterion is met. This could
be a maximum number of iterations, reaching a certain temperature threshold, or
finding a solution that meets some desired criteria.
Simulated Annealing
• The probability of moving to a higher energy state, instead of a lower one, is
p = e^(-ΔE/kT)
where ΔE is the positive change in energy level, T is the temperature,
and k is Boltzmann’s constant.
• At the beginning, the temperature is high. As the temperature becomes lower:
– kT becomes lower
– ΔE/kT gets bigger
– (-ΔE/kT) gets smaller
– e^(-ΔE/kT) gets smaller
• As the process continues, the probability of accepting a bad (downhill) move gets
smaller and smaller.
Simulated Annealing Algorithm
• current ← start node; /* Initial state of the problem (input) */
• for each T on the schedule /* need a schedule (input) */
1. next ← randomly selected successor of current
2. evaluate next; if it’s a goal, return it
3. ΔE ← next.Value – current.Value /* already negated */
4. if ΔE > 0
then current ← next /* better than current */
else current ← next with probability e^(ΔE/T)
• ΔE represents the change in the value of the objective function.
• Since the physical relationships no longer apply, drop k. So p = e^(-ΔE/T).
• We need an annealing schedule, which is a sequence of values of
T: T0, T1, T2, ...
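A compact Python sketch of this loop, assuming the caller provides a start state, a random_successor(state) function, a value(state) objective to maximize, and a cooling schedule given as an iterable of temperatures (all hypothetical names):

import math, random

def simulated_annealing(start, random_successor, value, schedule):
    # schedule: iterable of temperatures T0, T1, T2, ... decreasing towards 0
    current = start
    for T in schedule:
        if T <= 0:
            return current
        nxt = random_successor(current)
        delta_e = value(nxt) - value(current)
        if delta_e > 0:
            current = nxt                      # always accept improvements
        elif random.random() < math.exp(delta_e / T):
            current = nxt                      # sometimes accept downhill moves
    return current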
Simulated Annealing
• The problem is to reach the global minimum. The hill-climbing
algorithm halts at a local minimum. Simulated annealing escapes
from the problem of local minima.
Simulated Annealing Applications
• Basic Problems
– Traveling salesman
– Graph partitioning
– Matching problems
– Graph coloring
– Scheduling
• Engineering
– VLSI design
• Placement
• Routing
• Array logic minimization
• Layout
– Facilities layout
– Image processing
– Code design in information theory
Local Beam Search Technique
Local Beam Search Algorithm
1. Initialization: Start with k randomly generated initial solutions or paths,
where k is the beam width or the number of beams; k is an input to the
algorithm.
2. Expansion: For each beam, generate successor solutions by applying possible
operators or actions. These successor solutions represent neighboring states in
the search space.
3. Evaluation: Evaluate the quality of each successor solution using a heuristic
or objective function. This function measures how close a solution is to the
desired goal or optimal solution. If any one is a goal, return solution
(algorithm halts).
4. Selection: Select the top k successor solutions across all beams based on their
evaluation scores. These top k solutions become the new set of beams for the
next iteration.
5. Termination: Repeat steps 2-4 until a termination condition is met. This could
be a maximum number of iterations, reaching a certain quality threshold, or
finding a solution that satisfies specific criteria.
6. Result: Once the termination condition is satisfied, return the best solution
found among all beams.
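A minimal Python sketch of these steps, assuming hypothetical random_state(), neighbors(state), value(state), and is_goal(state) helpers:

def local_beam_search(k, random_state, neighbors, value, is_goal, max_iters=1000):
    # Maintain k states; at each step keep the k best successors of all of them.
    beams = [random_state() for _ in range(k)]
    for _ in range(max_iters):
        successors = [s for b in beams for s in neighbors(b)]
        if not successors:
            break
        for s in successors:
            if is_goal(s):
                return s                       # goal found: halt
        beams = sorted(successors, key=value, reverse=True)[:k]
    return max(beams, key=value)               # best state found among all beams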
Local Beam Search Characteristics
• Parallel Exploration: Local Beam Search explores multiple paths
simultaneously by maintaining multiple beams. This parallel exploration can
help the algorithm to avoid getting stuck in local optima and increase the
chances of finding a good solution.
• Beam Selection: At each iteration, only the top k successor solutions are
selected to become the new set of beams. This selective mechanism helps to
focus the search on the most promising regions of the search space.
• Diversity: Since Local Beam Search maintains multiple beams, it can explore
different regions of the search space concurrently. This diversity in exploration
can be beneficial for finding a variety of solutions or for escaping local optima.
• Stochastic beam search: Instead of choosing the k best nodes from the pool of
candidate successors, stochastic beam search chooses k successors, with the
probability of choosing a given successor being an increasing function of its value.
• Stochastic beam search resembles a process of natural selection: the
successors (offspring) of a state (organism) populate the next generation
according to its value (fitness).
Genetic Algorithms
• Variant of stochastic beam search
• Combines two parent states to generate successors
• Uses one fitness function and two operators called Crossover and
Mutation.
• Start with random population of states
– Representation is serialized (i.e., strings of characters or bits)
– States are ranked with “fitness function”
• Produce new generation
– Select random pair(s) using probability:
• probability ~ fitness
– Randomly choose “crossover point”
• Offspring mix halves
– Randomly mutate bits
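A small Python sketch of one generation of this scheme on bit-string individuals, assuming a caller-supplied fitness(individual) function (hypothetical name):

import random

def next_generation(population, fitness):
    # population: list of equal-length bit strings, e.g. "10110"
    weights = [fitness(ind) for ind in population]
    new_population = []
    for _ in range(len(population)):
        # Select a random pair with probability proportional to fitness.
        p1, p2 = random.choices(population, weights=weights, k=2)
        # Randomly choose a crossover point and mix halves.
        point = random.randint(1, len(p1) - 1)
        child = p1[:point] + p2[point:]
        # Randomly mutate one bit with a small probability.
        if random.random() < 0.1:
            i = random.randrange(len(child))
            child = child[:i] + ('1' if child[i] == '0' else '0') + child[i + 1:]
        new_population.append(child)
    return new_population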
Genetic Algorithm
Genetic Algorithm
Adversarial Search
• Adversarial search, also known as two-player game
search, deals with decision-making in competitive
environments.
• It focuses on scenarios where two or more agents, often
referred to as players, interact in a competitive manner,
each trying to achieve their own objectives while
simultaneously trying to prevent their opponents from
achieving theirs.
• Adversarial search is commonly applied in various
domains, including board games (e.g., tic-tac-toe, chess,
Go), card games (e.g., poker), and other strategic
scenarios.
Terminology
• A zero-sum game is defined as one where the total payoff to all
players is the same for every instance of the game. Chess is zero-
sum because every game has a payoff of either 0 + 1, 1 + 0, or
1/2 + 1/2. “Constant-sum” would have been a better term, but
zero-sum is traditional.
• zero-sum games of perfect information means deterministic, fully
observable environments in which two agents act alternately and
the utility values at the end of the game are always equal and
opposite.
– For example, if one player wins a game of chess, the other
player necessarily loses.
• Heuristic evaluation functions allow us to approximate the true
utility of a state without doing a complete search.
• Pruning allows us to ignore portions of the search tree that make no
difference to the final choice.
Steps in Adversarial Search
1. Game Representation: The first step in adversarial
search is to represent the game environment, including
the initial state, possible actions or moves for each
player, transition rules defining how the game state
changes with each move, and the terminal states that
determine when the game ends (e.g., win, lose, draw).
This representation provides the framework for
analyzing the game.
2. Search Tree Construction: Adversarial search
algorithms construct a search tree to represent all
possible sequences of moves that can be made by the
players. Each level of the tree corresponds to a player's
turn, and the branches represent possible moves.
Steps in Adversarial Search
3. Evaluation Function: An evaluation function is used to
assess the desirability of game states. This function assigns
a numerical value to each game state, indicating how
favorable it is for the player whose turn it is.
4. Search Algorithms: Adversarial search algorithms explore
the search tree to determine the best move for a player at
any given game state. Common search algorithms used in
adversarial search include MiniMax, Alpha-Beta Pruning,
and Monte Carlo Tree Search (MCTS). These algorithms
aim to find the optimal strategy by considering both the
current player's moves and their opponent's potential
responses.
5. Decision Making: The player selects the move that the search
identifies as leading to the best achievable outcome from the
current state.
6. Iterative Deepening: When time is limited, the search can be
deepened incrementally, reusing results from shallower searches
to order moves and to return the best move found so far.
Formal Definition of a Game
• A game can be formally defined as a kind of search problem with
the following elements:
1. S0: The initial state specifies how the game is set up at the
start.
2. PLAYER(s): Defines which player has the move in a state.
3. ACTIONS(s): Returns the set of legal moves in a state.
4. RESULT(s, a): The transition model, which defines the result
of a move.
5. TERMINAL-TEST(s): A terminal test, which is true when the
game is over and false otherwise. States where the game has
ended are called terminal states.
6. UTILITY(s, p): A utility function (also called an objective
function or payoff function) defines the final numeric value
for a game that ends in terminal states for a player p. In chess,
the outcome is a win, loss, or draw, with values +1, 0, or 1/2 .
Optimal Decisions in Games
1. In a normal search problem, the optimal solution would be a
sequence of actions leading to a goal state — a terminal state that
is a win.
2. In adversarial search, MIN has something to say about it.
3. MAX therefore must find a contingent strategy, which specifies
MAX’s move in the initial state, then MAX’s moves in the states
resulting from every possible response by MIN, then MAX’s
moves in the states resulting from every possible response by
MIN to those moves, and so on.
4. Given a game tree, the optimal strategy can be determined from
the minimax value of each node, which we write as
MINIMAX(n).
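For reference, the minimax value is defined recursively (in the notation of the formal game definition above):

MINIMAX(s) =
  UTILITY(s)                                          if TERMINAL-TEST(s)
  max over a in ACTIONS(s) of MINIMAX(RESULT(s, a))   if PLAYER(s) = MAX
  min over a in ACTIONS(s) of MINIMAX(RESULT(s, a))   if PLAYER(s) = MIN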
Game Tree for Tic-Tac-Toe
The initial state, ACTIONS function, and RESULT function define
the game tree for the game — a tree where the nodes are game
states and the edges are moves.
The Minimax Algorithm
MINIMAX Rule application Example
The minimax Algorithm
• The minimax algorithm computes the minimax decision from the
current state. It uses a simple recursive computation of the
minimax values of each successor state, directly implementing the
defining equations.
• The recursion proceeds all the way down to the leaves of the tree,
and then the minimax values are backed up through the tree as the
recursion unwinds.
• The Algorithm returns the action corresponding to the best
possible move, that is, the move that leads to the outcome with the
best utility, under the assumption that the opponent plays to
minimize utility. The functions MAX-VALUE and MIN-VALUE
go through the whole game tree, all the way to the leaves, to
determine the backed-up value of a state.
• The notation argmax_{a ∈ S} f(a) computes the element a of set S that
has the maximum value of f(a).
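A minimal Python sketch of this recursion, assuming the game is supplied through hypothetical actions, result, terminal_test, utility, and player functions matching the formal definition above:

def minimax_decision(state, actions, result, terminal_test, utility, player):
    # Return the action leading to the outcome with the best backed-up value for MAX.
    def value(s):
        if terminal_test(s):
            return utility(s)
        values = [value(result(s, a)) for a in actions(s)]
        return max(values) if player(s) == 'MAX' else min(values)

    return max(actions(state), key=lambda a: value(result(state, a)))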
The minimax Algorithm
MiniMax Procedure Example
MiniMax Procedure – Trace It
Optimal Decisions in Multiplayer Games
• Many popular games allow more than two players. Hence the
minimax idea is to be extended to multiplayer games.
• Replace the single value for each node with a vector of values. For
example, in a three-player game with players A, B, and C, a vector
(vA, vB, vC) is associated with each node. For terminal states, this
vector gives the utility of the state from each player’s viewpoint.
The simplest way to implement this is to have the UTILITY
function return a vector of utilities.
• Now consider non-terminal states. The backed-up value of a node n is
always the utility vector of the successor state with the highest
value for the player choosing at n.
• Multiplayer games usually involve alliances, whether formal or
informal, among the players. Alliances are made and broken as the
game proceeds. The players involved in an alliance will automatically
cooperate to achieve a mutually desirable goal.
Three Player Game - Example
Alpha-Beta Pruning
• The problem with minimax search is that the number of
game states it has to examine is exponential in the depth
of the tree.
• Alpha-Beta Pruning improves the performance of the
Minimax
– The basic idea: “If you have an idea that is surely bad, don’t
take the time to see how truly awful it is.” – Pat Winston
– The trick is that it is possible to compute the correct minimax
decision without looking at every node in the game tree.
• Alpha–beta pruning can be applied to trees of any depth,
and it is often possible to prune entire subtrees rather
than just leaves.
Pruning Example
Alpha Beta Pruning (cutoff)
• Instead of first creating the entire tree (upto depth level) and
then doing propagation of values upwards, interleave the
generation of the tree and do propagation of values.
• Generate the tree depth-first, left to right.
• Propagate final values of nodes as initial estimates for their
parent node.
• The temporary values at Max nodes are Alpha values
• The temporary values at MIN nodes are Beta values.
• If an Alpha value >= the Beta value of a descendant node,
stop generating children of that descendant. This is called a
Beta cut-off.
• If a Beta value <= the Alpha value of a descendant node,
stop generating children of that descendant. This is called an
Alpha cut-off.
Alpha Beta Pruning (cutoff)
Alpha-Beta Pruning Problem – What move should the MAX player choose from
the root? Which nodes need not be examined using Alpha-Beta Pruning?
Alpha-Beta Pruning
Alpha-Beta Pruning
• α = the value of the best (i.e., highest-value) choice we
have found so far at any choice point along the path for
MAX.
• β = the value of the best (i.e., lowest-value) choice we
have found so far at any choice point along the path for
MIN.
• Alpha–beta search updates the values of α and β as it
goes along and prunes the remaining branches at a
node (i.e., terminates the recursive call) as soon as the
value of the current node is known to be worse than
the current α or β value for MAX or MIN,
respectively.
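A compact Python sketch of alpha–beta search following these definitions, using the same hypothetical game interface as the minimax sketch above:

import math

def alpha_beta_decision(state, actions, result, terminal_test, utility):
    def max_value(s, alpha, beta):
        if terminal_test(s):
            return utility(s)
        v = -math.inf
        for a in actions(s):
            v = max(v, min_value(result(s, a), alpha, beta))
            if v >= beta:
                return v              # prune: the MIN ancestor will never allow this
            alpha = max(alpha, v)
        return v

    def min_value(s, alpha, beta):
        if terminal_test(s):
            return utility(s)
        v = math.inf
        for a in actions(s):
            v = min(v, max_value(result(s, a), alpha, beta))
            if v <= alpha:
                return v              # prune: the MAX ancestor will never allow this
            beta = min(beta, v)
        return v

    return max(actions(state),
               key=lambda a: min_value(result(state, a), -math.inf, math.inf))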
Alpha-Beta Search Algorithm
Importance of Order of Generation of Nodes
Alpha-Beta Pruning
Move Ordering Effect on Alpha-Beta Pruning
• The effectiveness of alpha–beta pruning is highly dependent on
the order in which the states are examined.
• Move Ordering suggests that it might be worthwhile to try to
examine first the successors that are likely to be best.
• With perfect move ordering, alpha–beta needs to examine only O(b^(m/2)) nodes
to pick the best move, instead of O(b^m) for minimax.
• The effective branching factor becomes √b instead of b.
• Alpha–beta can solve a tree roughly twice as deep as minimax in
the same amount of time.
• If successors are examined in random order rather than best-first,
the total number of nodes examined will be roughly O(b^(3m/4)) for
moderate b.
• The best moves are often called killer moves and to try them first
is called the killer move heuristic.
• One way to gain information from the current move is with
iterative deepening search.
Monte Carlo Tree Search
• Minimax is not efficient if branching factor is high.
• Alpha-Beta Pruning Algorithm is better than Minimax
• Monte Carlo Tree Search is good even if branching
factor is high.
• Monte Carlo tree search (MCTS) was introduced by
Abramson (1987). Tesauro and Galperin (1997) showed
how a Monte Carlo search could be combined with an
evaluation function for the game of backgammon.
• Kocsis and Szepesvari (2006) refined the approach with
the “Upper Confidence Bounds applied to Trees”
selection mechanism. Chaslot et al. (2008) show how
MCTS can be applied to a variety of games
Monte Carlo Tree Search
• Monte Carlo Tree Search (MCTS) is a probabilistic and
heuristic driven search algorithm that combines the
classic tree search implementations and machine learning
principles of reinforcement learning.
• In MCTS, nodes are the building blocks of the search
tree. These nodes are formed based on the outcome of a
number of simulations. The process of Monte Carlo
Tree Search can be broken down into four distinct
steps/phases.
1. Selection
2. Expansion,
3. Simulation
4. Back propagation
Monte Carlo Tree Search (MCTS)
• MCTS uses an exploration–exploitation trade-off.
• It exploits the actions and strategies that have been found to be the
best so far.
• It continues to explore the local space of alternative decisions
to find out whether they could replace the current best.
• MCTS becomes particularly useful in making optimal
decisions in Artificial Intelligence (AI) problems such as
Othello, Backgammon, Poker, BRIDGE, Tic Tac Toe,
Scrabble, StarCraft II (Video game), Checkers, Chess, Go,
etc.
• This has been used by Artificial Intelligence Programs like
AlphaGo, to play against the world’s top Go players.
Exploration and Exploitation Trade-off
• Exploration
– helps in exploring and discovering the unexplored parts of the
tree
– results in finding a more optimal path.
– expands the tree’s breadth more than its depth.
– can be useful to ensure that MCTS is not overlooking any
potentially better paths.
– becomes inefficient in situations with a large number of steps or
repetitions.
• The inefficiency of Exploration is balanced out by exploitation.
• Exploitation sticks to a single path that has the greatest estimated
value. This is a greedy approach and this will extend the tree’s
depth more than its breadth.
• MCTS dynamically balances exploration and exploitation
Steps in MCTS
Steps in MCTS: Selection
• MCTS algorithm traverses the current tree from the root
node using a specific strategy.
• The strategy uses an evaluation function to optimally
select nodes with the highest estimated value.
• MCTS uses the Upper Confidence Bound (UCB)
formula applied to trees as the strategy in the selection
process to traverse the tree.
• It balances the exploration-exploitation trade-off.
• During tree traversal, a node is selected based on some
parameters that return the maximum value.
• The parameters are characterized by the formula.
Upper Confidence Bound (UCB)
UCB(i) = vi + C * sqrt(ln(N) / ni)
Steps in MCTS: Selection
• where vi is the average value of child i (the exploitation term), N is the
number of visits to the parent node, ni is the number of visits to child i,
and C is an exploration constant (commonly 2 or √2).
• The square root of ln(N)/ni is the exploration term.
• When traversing the tree during the selection process, the child node
that returns the greatest value from the above equation (the choice
is based on the upper confidence bound, UCB) is the one that
gets selected.
• At the beginning, when no node has been explored, the selection is
random because there is no data available to make a more
educated choice.
• When a node is unexplored, i.e. when ni = 0, the second term
becomes ∞ and thus takes the maximum possible value, so the node
automatically becomes a candidate for selection. Thus, the
equation makes sure all children get selected at least once.
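A small Python sketch of this selection rule, assuming each node object has children, visits, and total_value attributes (hypothetical names) and using C = 2 for illustration:

import math

def ucb_score(child, parent_visits, c=2):
    if child.visits == 0:
        return math.inf                       # unexplored children are tried first
    exploitation = child.total_value / child.visits
    exploration = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploitation + exploration

def select_child(node):
    # Choose the child with the highest UCB score.
    return max(node.children, key=lambda ch: ucb_score(ch, node.visits))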
Steps in MCTS: Expansion and Simulation
• Expansion: In this phase, a new child node is added to the selected node,
and the search starts looking one level deeper.
• Simulation / Rollout: In this phase, simulate the game from the
selected child node, continuing the game by making random
choices until it reaches an end state, i.e. a win, loss, or draw. Assign the
following values to these results/outcomes:
• Win = +1 or some +ve value
• Lose = -1 or some -ve value
• Draw = 0
• Rollout(Si)
– Loop:
• if Si is a terminal state: return Value(Si)
• Ai = random(available_actions(Si))
• Si = Simulate(Si, Ai)
• This loop runs until a terminal state is reached.
Steps in MCTS: Backpropagation
• Backpropagation: In this phase, the result found in the simulation phase is
propagated to all the nodes on the traversed path, up to the root node.
This updates the value vi that is then used in the selection formula.
• Backpropagation steps:
• Backpropagate the value from the new node to the root node.
• Increment the number of simulations stored in each node.
• Increment the number of wins if the new node’s simulation
results in a win.
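Putting the four phases together, a condensed and simplified Python sketch of one MCTS iteration; it reuses select_child from the selection sketch above, while available_actions, simulate_move, is_terminal, and rollout_value are hypothetical game-specific helpers:

import random
from dataclasses import dataclass, field

@dataclass
class Node:
    state: object
    parent: "Node" = None
    children: list = field(default_factory=list)
    visits: int = 0
    total_value: float = 0.0

def mcts_iteration(root, available_actions, simulate_move, is_terminal, rollout_value):
    # 1. Selection: descend using UCB until a node with no children (or a terminal state).
    node = root
    while node.children and not is_terminal(node.state):
        node = select_child(node)

    # 2. Expansion: add one child for a random action (if not terminal).
    if not is_terminal(node.state):
        action = random.choice(available_actions(node.state))
        child = Node(state=simulate_move(node.state, action), parent=node)
        node.children.append(child)
        node = child

    # 3. Simulation / rollout: play random moves to a terminal state.
    state = node.state
    while not is_terminal(state):
        state = simulate_move(state, random.choice(available_actions(state)))
    result = rollout_value(state)             # e.g. +1 win, -1 loss, 0 draw

    # 4. Backpropagation: update visit counts and values up to the root.
    while node is not None:
        node.visits += 1
        node.total_value += result
        node = node.parent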
MCTS Flowchart

Source for MCTS example:
https://www.analyticsvidhya.com/blog/2019/01/monte-carlo-tree-search-introduction-algorithm-deepmind-alphago/
Complete Walkthrough with an example
Iteration-1
Iteration-1
Iteration-2
Iteration-3
Iteration-4
Monte Carlo Tree Search Algorithm
Kalman Filter
• Kalman filter is one of the most powerful probabilistic AI
algorithms.
• The Kalman filter algorithm uses statistical methods to predict
and correct the state of a system.
• It was developed in the late 1950s by Rudolf Emil Kalman, a
Hungarian-American electrical engineer and mathematician.
• The Kalman filter works by modeling the state of a system
using a set of probabilities.
• It assumes that the system's state can be represented by a set of
variables called the state vector, which contains all the relevant
information about the system's current state.
• The Kalman filter uses a set of equations and algorithms to make
predictions about the state vector and correct these predictions
based on new observations.
The Prediction Step
• The first step in the Kalman filter algorithm is the
prediction step. In this step, the Kalman filter uses the
information from the past state vector and the current
state vector to predict the next state vector. This is done
using the prediction equation (a standard form is sketched after the
list below).
Where:
• xk|k is the current state vector.
• F is the state transition matrix, which describes how the
state vector changes from one time step to another.
• ẋ is the rate of change of the state vector.
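The slide's equation figure is not reproduced here; as an assumption, a standard textbook form of the prediction step (for the state and its covariance) is:

x̂(k+1|k) = F · x̂(k|k)
P(k+1|k) = F · P(k|k) · Fᵀ + Q

where P is the state covariance and Q the process-noise covariance (symbols introduced here for completeness; they are not defined on the slide).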
The Update Step
• The Kalman filter uses new observations to refine its prediction of the
state vector. This is done using the update equations; a standard
textbook form is sketched below.
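As an assumption (the slide's figure is not reproduced), the standard Kalman update equations are:

K(k) = P(k|k-1) · Hᵀ · (H · P(k|k-1) · Hᵀ + R)⁻¹
x̂(k|k) = x̂(k|k-1) + K(k) · (z(k) - H · x̂(k|k-1))
P(k|k) = (I - K(k) · H) · P(k|k-1)

where z(k) is the new measurement, H the observation matrix, R the measurement-noise covariance, and K(k) the Kalman gain (these symbols are introduced here; they do not appear on the slide).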
Benefits and Applications of Kalman Filter
• One of the main benefits of the Kalman filter is its ability to
handle noisy or incomplete data. The Kalman filter is able to
estimate the state of a system even when the measurements are
uncertain or incomplete.
• Another benefit of the Kalman filter is its ability to model
complex systems using a simple set of equations.
• One of the most common applications of the Kalman filter is in
navigation systems. The Kalman filter is used to estimate and
correct the position, velocity, and acceleration of a vehicle using
GPS measurements and other sensors. The Kalman filter is also
used in robotics to estimate the position, velocity, and orientation
of a robot.
• The Kalman filter is also used in tracking systems, such as
tracking aircraft, missiles, and other objects.
