AI_Unit_3_Notes
Unit No. 3
Heuristic search strategies: Greedy best-first search, A* search, Memory bounded heuristic
search: local search algorithms & optimization problems: Hill climbing search, Simulated
annealing search, Local beam search, Genetic algorithms; Constraint satisfaction problems,
Local search for constraint satisfaction problems. Adversarial search: Games, optimal decisions
& strategies in games, The minimax search procedure, Alpha-beta pruning, Additional
refinements, Iterative deepening.
Heuristic search operates on the search space of a problem to find the best or closest-to-optimal solution using systematic algorithms. In contrast to a brute-force approach, which checks all possible solutions exhaustively, a heuristic search method uses heuristic information to choose the route that seems most promising. Heuristics, in this context, are criteria or rules of thumb that estimate how close a state is to the goal or how promising a path is. Guided by the heuristic, these algorithms balance exploration and exploitation, and thus they can successfully tackle demanding problems. Therefore, they enable an efficient solution-finding process.
Significance of Heuristic Search in AI
The primary benefit of using heuristic search techniques in AI is their ability to handle large
search spaces. Heuristics help to prioritize which paths are most likely to lead to a solution,
significantly reducing the number of paths that must be explored. This not only speeds up the
search process but also makes it feasible to solve problems that are otherwise too complex to
handle with exact algorithms.
A* Search Algorithm
Motivation
To approximate the shortest path in real-life situations, such as maps or games, where there can be many obstacles.
We can consider a 2D grid having several obstacles, where we start from a source cell and try to reach a goal cell.
What is A* Search Algorithm?
The A* Search algorithm is one of the best-known and most popular techniques used in path-finding and graph traversal.
Explanation
Consider a square grid having many obstacles and we are given a starting cell and a target cell.
We want to reach the target cell (if possible) from the starting cell as quickly as possible. Here
A* Search Algorithm comes to the rescue.
What the A* Search Algorithm does is that at each step it picks the node according to a value ‘f’, a parameter equal to the sum of two other parameters, ‘g’ and ‘h’: f(n) = g(n) + h(n). At each step it picks the node/cell having the lowest ‘f’ and processes that node/cell.
We define ‘g’ and ‘h’ as simply as possible below
g = the movement cost to move from the starting point to a given square on the grid, following
the path generated to get there.
h = the estimated movement cost to move from that given square on the grid to the final
destination. This is often referred to as the heuristic, which is nothing but a kind of smart guess.
We really don’t know the actual distance until we find the path, because all sorts of things can
be in the way (walls, water, etc.). There can be many ways to calculate this ‘h’ which are
discussed in the later sections.
Algorithm
We create two lists, an Open List and a Closed List (just like in Dijkstra's algorithm).
// A* Search Algorithm
1. Initialize the open list
2. Initialize the closed list
3. Put the starting node on the open list (you can leave its f at zero)
4. While the open list is not empty:
   a) pop the node q with the lowest f off the open list
   b) generate q's successors and set their parents to q; for each successor, compute f = g + h (if a successor is the goal, stop the search)
   c) skip a successor if a node with the same position and a lower f is already on the open or closed list; otherwise push the successor onto the open list
   d) push q onto the closed list
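A minimal, runnable sketch of this procedure is given below, assuming a 4-connected grid represented as a list of lists with 0 = free cell and 1 = obstacle; the Manhattan-distance heuristic and the small example grid are illustrative assumptions, not part of the original notes.

import heapq, itertools

def a_star(grid, start, goal):
    def h(cell):  # Manhattan distance heuristic (assumes 4-connected grid, unit step cost)
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])
    rows, cols = len(grid), len(grid[0])
    tie = itertools.count()                    # tiebreaker so the heap never compares parents
    open_list = [(h(start), next(tie), 0, start, None)]
    parents, closed, g_best = {}, set(), {start: 0}
    while open_list:
        f, _, g, cell, parent = heapq.heappop(open_list)
        if cell in closed:
            continue
        parents[cell] = parent
        if cell == goal:                       # walk the parent chain back to recover the path
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        closed.add(cell)
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1                     # each move costs 1
                if nxt not in closed and ng < g_best.get(nxt, float("inf")):
                    g_best[nxt] = ng
                    heapq.heappush(open_list, (ng + h(nxt), next(tie), ng, nxt, cell))
    return None                                # goal unreachable

# Example usage on a small grid with a wall of obstacles in the middle row:
print(a_star([[0, 0, 0], [1, 1, 0], [0, 0, 0]], (0, 0), (2, 0)))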
Greedy Best-First Search
Greedy best-first search expands the node that appears closest to the goal, i.e. the node with the lowest heuristic value h(n). Its working on an example graph with start node A and goal node G is traced below (a code sketch follows the example):
1) Starting from source node A, consider its successor nodes along with their heuristic values.
2) As per the best-first search strategy, choose the path with the lowest heuristic value; currently C has the lowest value among those nodes, so we go from A to C.
3) Now from C we have direct paths to F (with a heuristic value of 17) and to E (with a heuristic value of 19), so we go from C to F.
4) Now from F we have a direct path to the goal node G (with a heuristic value of 0), so we go from F to G.
5) The goal node G has now been reached, and the path we follow is A -> C -> F -> G.
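Below is a minimal sketch of greedy best-first search in Python. The frontier is ordered purely by h(n), which is the essential idea; the graph edges beyond A -> C and the heuristic values other than h(F) = 17, h(E) = 19, and h(G) = 0 are hypothetical stand-ins, since the notes do not list the full example graph.

import heapq

def greedy_best_first(graph, h, start, goal):
    # Frontier ordered only by the heuristic h(n); g(n) is ignored,
    # which is exactly what distinguishes greedy best-first from A*.
    frontier = [(h[start], start, [start])]
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nbr in graph.get(node, []):
            if nbr not in visited:
                heapq.heappush(frontier, (h[nbr], nbr, path + [nbr]))
    return None

# Hypothetical reconstruction of the example above (h values for A, B, C are assumed):
graph = {"A": ["B", "C"], "C": ["E", "F"], "F": ["G"]}
h = {"A": 40, "B": 32, "C": 25, "E": 19, "F": 17, "G": 0}
print(greedy_best_first(graph, h, "A", "G"))   # -> ['A', 'C', 'F', 'G']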
Machine Learning: Greedy Best-First Search can be used in machine learning algorithms
to find the most promising path through a search space.
Optimization: Greedy Best-First Search can be used to optimize the parameters of a
system in order to achieve the desired result.
Game AI: Greedy Best-First Search can be used in game AI to evaluate potential moves
and choose the best one.
Navigation: Greedy Best-First Search can be used in navigation to find the shortest path
between two locations.
Natural Language Processing: Greedy Best-First Search can be used in natural language
processing tasks, such as language translation or speech recognition, to generate the
most likely sequence of words.
Image Processing: Greedy Best-First Search can be used in image processing to segment
an image into regions of interest.
Local Search Algorithms
Commonly used local search algorithms include:
1. Hill Climbing
2. Local Beam Search
3. Simulated Annealing
Key Concepts in the State-Space Diagram of Hill Climbing
The following regions of the state-space landscape matter for hill climbing (a code sketch follows this list):
1. Local Maximum: A local maximum is a state better than its neighbors but not the best
overall. While its objective function value is higher than nearby states, a global
maximum may still exist.
2. Global Maximum: The global maximum is the best state in the state-space diagram,
where the objective function achieves its highest value. This is the optimal solution the
algorithm seeks.
3. Plateau/Flat Local Maximum: A plateau is a flat region where neighboring states have
the same objective function value, making it difficult for the algorithm to decide on the
best direction to move.
4. Ridge: A ridge is a higher region with a slope, which can look like a peak. This may cause
the algorithm to stop prematurely, missing better solutions nearby.
5. Current State: The current state refers to the algorithm’s position in the state-space
diagram during its search for the optimal solution.
6. Shoulder: A shoulder is a plateau with an uphill edge, allowing the algorithm to move
toward better solutions if it continues searching beyond the plateau.
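A minimal sketch of steepest-ascent hill climbing is shown below, assuming a generic objective function and neighbor generator supplied by the caller; the one-dimensional objective in the usage lines is purely illustrative.

import random

def hill_climbing(objective, neighbors, state):
    # Repeatedly move to the best neighbor; stop at a local maximum,
    # plateau, or ridge, where no neighbor improves the objective.
    while True:
        best = max(neighbors(state), key=objective, default=state)
        if objective(best) <= objective(state):
            return state            # no uphill move available
        state = best

# Illustrative 1-D example: maximize f(x) = -(x - 3)^2 over a fine grid.
f = lambda x: -(x - 3) ** 2
step = 0.01
nbrs = lambda x: [x - step, x + step]
print(round(hill_climbing(f, nbrs, random.uniform(-10, 10)), 2))  # -> ~3.0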
Simulated Annealing
Problem: Given a cost function f: R^n -> R, find an n-tuple that minimizes the value of f. Note
that minimizing the value of a function is algorithmically equivalent to maximization (since we
can redefine the cost function as 1-f).
Many of you with a background in calculus/analysis are likely familiar with simple optimization
of single-variable functions. For instance, the function f(x) = x^2 + 2x can be optimized by
setting the first derivative equal to zero, obtaining the solution x = -1 and yielding the minimum
value f(-1) = -1. This technique suffices for simple functions with few variables.
case that researchers are interested in optimizing functions of several variables, in which case
the solution can only be obtained computationally.
One excellent example of a difficult optimization task is the chip floor planning problem.
Imagine you’re working at Intel and you’re tasked with designing the layout for an integrated
circuit. You have a set of modules of different shapes/sizes and a fixed area on which the
modules can be placed. There are a number of objectives you want to achieve: maximizing the
ability of wires to connect components, minimizing net area, minimizing chip cost, etc. With
these in mind, you create a cost function, taking a configuration of, say, 1000 variables and
returning a single real value representing the 'cost' of the input configuration. We call this the
objective function, since the goal is to minimize its value.
A naive algorithm would be a complete space search — we search all possible configurations
until we find the minimum. This may suffice for functions of few variables, but the problem we
have in mind would require such a brute-force algorithm to run in O(n!) time.
Due to the computational intractability of problems like these, and other NP-hard problems,
many optimization heuristics have been developed in an attempt to yield a good, albeit
potentially suboptimal, value. In our case, we don’t necessarily need to find a strictly optimal
value — finding a near-optimal value would satisfy our goal. One widely used technique is
simulated annealing, by which we introduce a degree of stochasticity, potentially shifting from
a better solution to a worse one, in an attempt to escape local minima and converge to a value
closer to the global optimum.
Simulated annealing is based on metallurgical practices by which a material is heated to a high
temperature and cooled. At high temperatures, atoms may shift unpredictably, often
eliminating impurities as the material cools into a pure crystal. This process is replicated by the
simulated annealing optimization algorithm, with the energy state corresponding to the current
solution.
In this algorithm, we define an initial temperature, often set to 1, and a minimum temperature,
on the order of 10^-4. The current temperature is multiplied by some fraction alpha and thus
decreased until it reaches the minimum temperature. For each distinct temperature value, we
run the core optimization routine a fixed number of times. The optimization routine consists of
finding a neighboring solution and accepting it with probability e^((f(c) - f(n))/T), where c is the
current solution, n is the neighboring solution, and T is the current temperature. A neighboring
solution is found by applying a slight perturbation to the current solution. This randomness is
useful for escaping the common pitfall of optimization heuristics: getting trapped in local
minima. By potentially accepting a less optimal solution than the current one, with probability
inversely related to the increase in cost, the algorithm is more likely to converge near the global
optimum. Designing a neighbor function is quite tricky and must be done on a case-by-case
basis, but below are some ideas for finding neighbors in locational optimization problems.
One caveat is that we need to provide an initial solution so the algorithm knows where to start.
This can be done in two ways: (1) using prior knowledge about the problem to input a good
starting point and (2) generating a random solution. Although generating a random solution is
worse and can occasionally inhibit the success of the algorithm, it is the only option for
problems where we know nothing about the landscape.
There are many other optimization techniques, although simulated annealing is a useful,
stochastic optimization heuristic for large, discrete search spaces in which optimality is
prioritized over time. Below, I’ve included a basic framework for locational-based simulated
annealing (perhaps the most applicable flavor of optimization for simulated annealing). Of
course, the cost function, candidate generation function, and neighbor function must be
defined based on the specific problem at hand, although the core optimization routine has
already been implemented.
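A minimal sketch of that framework follows, assuming the problem-specific pieces (cost function, random-solution generator, and neighbor function) are supplied by the caller; the traveling-salesman-style usage at the bottom, with random 2-D points and a two-city swap as the perturbation, is an illustrative assumption.

import math, random

def simulated_annealing(cost, random_solution, neighbor,
                        t_init=1.0, t_min=1e-4, alpha=0.9, iters=100):
    current = best = random_solution()
    t = t_init
    while t > t_min:
        for _ in range(iters):                 # core routine at this temperature
            cand = neighbor(current)
            delta = cost(current) - cost(cand)
            # Always accept improvements; accept worse moves with
            # probability e^((f(c) - f(n)) / T) to escape local minima.
            if delta > 0 or random.random() < math.exp(delta / t):
                current = cand
            if cost(current) < cost(best):
                best = current
        t *= alpha                             # geometric cooling schedule
    return best

# Illustrative usage: shorten a random tour over 10 random 2-D points.
pts = [(random.random(), random.random()) for _ in range(10)]
tour_len = lambda tour: sum(math.dist(tour[i], tour[(i + 1) % len(tour)])
                            for i in range(len(tour)))
def swap_two(tour):                            # perturb by swapping two cities
    i, j = random.sample(range(len(tour)), 2)
    t = tour[:]; t[i], t[j] = t[j], t[i]
    return t
print(tour_len(simulated_annealing(tour_len, lambda: pts[:], swap_two)))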
Constraint Satisfaction Problems (CSP)
A constraint satisfaction problem is defined by three components:
1. Variables: The entities whose values need to be determined. For instance, in Sudoku,
each empty cell is a variable.
2. Domains: The range of potential values that a variable can have is represented by its
domain. Depending on the problem, a domain may be finite or infinite. For instance, in
Sudoku, the set of numbers from 1 to 9 can serve as the domain of a variable
representing a cell.
3. Constraints: The rules that govern how variables relate to one another are known as
constraints. Constraints in a CSP restrict the values that variables may take. Unary
constraints, binary constraints, and higher-order constraints are some of the various
sorts of constraints. For instance, in a Sudoku problem, the constraints might be that
each row, column, and 3×3 box can contain only one instance of each number from 1 to 9.
Types of Constraint Satisfaction Problems
CSPs can be classified into different types based on their constraints and problem
characteristics:
1. Binary CSPs: In these problems, each constraint involves only two variables. For
example, in a scheduling problem, the constraint could specify that task A must be
completed before task B.
2. Non-Binary CSPs: These problems have constraints that involve more than two
variables. For instance, in a seating arrangement problem, a constraint could state that
three people cannot sit next to each other.
3. Hard and Soft Constraints: Hard constraints must be strictly satisfied, while soft
constraints can be violated, but at a certain cost. This distinction is often used in real-
world applications where not all constraints are equally important.
Representation of Constraint Satisfaction Problems (CSP)
In Constraint Satisfaction Problems (CSP), the solution process involves the interaction of
variables, domains, and constraints. Below is a structured representation of how CSP is
formulated:
1. Finite Set of Variables (V1, V2, …, Vn):
The problem consists of a set of variables, each of which needs to be assigned a value
that satisfies the given constraints.
2. Non-Empty Domain for Each Variable (D1, D2, …, Dn):
Each variable has a domain, a set of possible values that it can take. For example, in a
Sudoku puzzle, the domain could be the numbers 1 to 9 for each cell.
3. Finite Set of Constraints (C1, C2, …, Cm):
Constraints restrict the possible values that variables can take. Each constraint defines a
rule or relationship between variables.
4. Constraint Representation:
Each constraint Ci is represented as a pair <scope, relation>, where:
Scope: The set of variables involved in the constraint.
Relation: A list of valid combinations of variable values that satisfy the
constraint.
5. Example:
Let’s say you have two variables V1 and V2. A possible constraint could
be V1 ≠ V2, which means the values assigned to these variables must not be
equal.
Detailed Explanation:
o Scope: The variables V1 and V2.
o Relation: A list of valid value combinations where V1 is not equal
to V2.
Some relations might include explicit combinations, while others may rely on abstract relations
that are tested for validity dynamically.
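As an illustration, the <scope, relation> representation above might be encoded as plain Python data structures as in the sketch below; the two variables and their color values are hypothetical, chosen only to keep the example small.

from itertools import product

# A tiny hypothetical CSP: two variables that must receive different values.
variables = ["V1", "V2"]
domains = {"V1": ["red", "green", "blue"], "V2": ["red", "green", "blue"]}

# Constraint as a <scope, relation> pair: here the relation is tested dynamically.
constraints = [(("V1", "V2"), lambda a, b: a != b)]

def satisfies_all(assignment):
    # An assignment is consistent if every constraint whose scope is
    # fully assigned evaluates to True on the assigned values.
    return all(rel(*(assignment[v] for v in scope))
               for scope, rel in constraints
               if all(v in assignment for v in scope))

# Enumerate all complete, consistent assignments (brute force, for illustration).
solutions = [dict(zip(variables, vals))
             for vals in product(*(domains[v] for v in variables))
             if satisfies_all(dict(zip(variables, vals)))]
print(len(solutions))   # -> 6 of the 9 complete assignments satisfy V1 != V2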
CSP Algorithms: Solving Constraint Satisfaction Problems Efficiently
Constraint Satisfaction Problems (CSPs) rely on various algorithms to explore and optimize the
search space, ensuring that solutions meet the specified constraints. Here’s a breakdown of the
most commonly used CSP algorithms:
1. Backtracking Algorithm
The backtracking algorithm is a depth-first search method used to systematically explore
possible solutions in CSPs. It operates by assigning values to variables and backtracking if any
assignment violates a constraint.
2. Forward-Checking Algorithm
The forward-checking algorithm enhances backtracking by looking ahead after each assignment
and pruning the domains of not-yet-assigned variables (a code sketch combining both ideas
follows this subsection).
How it works:
For each unassigned variable, the algorithm keeps track of remaining valid values.
Once a variable is assigned a value, local constraints are applied to neighboring
variables, eliminating inconsistent values from their domains.
If a neighbor has no valid values left after forward-checking, the algorithm backtracks.
This method is more efficient than pure backtracking because it prevents some conflicts before
they happen, reducing unnecessary computations.
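Below is a minimal sketch of backtracking search with forward checking, reusing the variables/domains/constraints representation from the earlier CSP example; the solver interface and helper names are illustrative assumptions.

def consistent(assignment, constraints):
    # Check every constraint whose scope is fully assigned.
    return all(rel(*(assignment[v] for v in scope))
               for scope, rel in constraints
               if all(v in assignment for v in scope))

def backtrack(assignment, domains, variables, constraints):
    if len(assignment) == len(variables):    # all variables assigned: solution
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        # Forward checking: remove now-inconsistent values from the
        # domains of unassigned variables; fail early if a domain empties.
        pruned = {v: [x for x in domains[v]
                      if consistent(dict(assignment, **{v: x}), constraints)]
                  for v in variables if v not in assignment}
        if consistent(assignment, constraints) and all(pruned.values()):
            result = backtrack(assignment, {**domains, **pruned},
                               variables, constraints)
            if result is not None:
                return result
        del assignment[var]                  # backtrack: undo the assignment
    return None

# Reusing the earlier style of example:
variables = ["V1", "V2"]
domains = {"V1": ["red", "green"], "V2": ["red"]}
constraints = [(("V1", "V2"), lambda a, b: a != b)]
print(backtrack({}, domains, variables, constraints))  # -> {'V1': 'green', 'V2': 'red'}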
3. Constraint Propagation Algorithms
Constraint propagation algorithms further reduce the search space by enforcing local
consistency across all variables.
How it works:
Constraints are propagated between related variables.
Inconsistent values are eliminated from variable domains by leveraging information
gained from other variables.
These algorithms refine the search space by making inferences, removing values that
would lead to conflicts.
Constraint propagation is commonly used in conjunction with other CSP algorithms, such
as backtracking, to increase efficiency by narrowing down the solution space early in the search
process.
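As an illustration of constraint propagation, the sketch below enforces arc consistency in the style of AC-3; the dictionary-of-arcs interface and the two-variable example are assumptions chosen to match the earlier CSP sketches.

from collections import deque

def ac3(domains, constraints):
    # constraints: dict mapping ordered variable pairs (x, y) to a predicate.
    queue = deque(constraints.keys())
    while queue:
        x, y = queue.popleft()
        # A value of x is inconsistent if no value of y supports it.
        removed = [vx for vx in domains[x]
                   if not any(constraints[(x, y)](vx, vy) for vy in domains[y])]
        if removed:
            domains[x] = [v for v in domains[x] if v not in removed]
            if not domains[x]:
                return False            # a domain emptied: no solution possible
            # Re-examine arcs pointing at x, since its domain shrank.
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True

# Example: V1 != V2 with V2 already fixed to "red" prunes "red" from V1.
domains = {"V1": ["red", "green"], "V2": ["red"]}
ne = lambda a, b: a != b
print(ac3(domains, {("V1", "V2"): ne, ("V2", "V1"): ne}), domains)
# -> True {'V1': ['green'], 'V2': ['red']}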
Adversarial Search
Adversarial search is a type of search in which we examine the problem that arises when we try
to plan ahead in a world where other agents are planning against us.
o In previous topics, we studied search strategies that involve only a single agent aiming
to find a solution, often expressed in the form of a sequence of actions.
o However, there might be situations where more than one agent is searching for a
solution in the same search space; this commonly occurs in game playing.
o An environment with more than one agent is termed a multi-agent environment, in
which each agent is an opponent of the others and plays against them. Each agent needs
to consider the actions of the other agents and the effect of those actions on its own
performance.
o So, searches in which two or more players with conflicting goals are trying to explore
the same search space for a solution are called adversarial searches, often known
as Games.
o Games are modeled as a search problem together with a heuristic evaluation function;
these are the two main factors which help to model and solve games in AI.
Types of Games in AI:
o Perfect information: A game with perfect information is one in which agents can look at
the complete board. Agents have all the information about the game, and they can see
each other's moves as well. Examples are Chess, Checkers, Go, etc.
o Imperfect information: If agents in a game do not have all the information about the
game and are not aware of what's going on, it is called a game with imperfect
information, such as Battleship, blind tic-tac-toe, Bridge, etc.
o Deterministic games: Deterministic games are those games which follow a strict pattern
and set of rules, with no randomness associated with them.
Examples are Chess, Checkers, Go, tic-tac-toe, etc.
o Non-deterministic games: Non-deterministic games are those games which have various
unpredictable events and a factor of chance or luck. This factor of chance or luck is
introduced by dice or cards. These games are random, and each action's response is not
fixed. Such games are also called stochastic games.
Example: Backgammon, Monopoly, Poker, etc.
Note: In this topic, we will discuss deterministic games, fully observable environment, zero-sum,
and where each agent acts alternatively.
Zero-Sum Game
o Zero-sum games are adversarial searches which involve pure competition.
o In a zero-sum game, each agent's gain or loss of utility is exactly balanced by the losses
or gains of utility of the other agent.
o One player of the game tries to maximize one single value, while the other player tries
to minimize it.
o Each move by one player in the game is called a ply.
Each of the players is trying to figure out the response of its opponent to its actions. This
requires embedded thinking or backward reasoning to solve game problems in AI.
Formalization of the problem: A game can be formally defined with the following elements:
o Initial state: It specifies how the game is set up at the start.
o Player(s): It specifies which player has the move in a given state.
o Action(s): It returns the set of legal moves in a given state.
o Result(s, a): It is the transition model, which specifies the result of a move in the state
space.
o Terminal-Test(s): The terminal test is true if the game is over, else it is false. States
where the game ends are called terminal states.
o Utility(s, p): A utility function gives the final numeric value for a game that ends in
terminal state s for player p. It is also called the payoff function. For Chess, the
outcomes are a win, loss, or draw, with payoff values +1, 0, and ½. For tic-tac-toe, the
utility values are +1, -1, and 0.
Game tree:
A game tree is a tree in which the nodes are game states and the edges are moves by the
players. A game tree involves the initial state, the actions function, and the result function.
Example: Tic-Tac-Toe game tree:
The following figure shows part of the game tree for the tic-tac-toe game. The following are
some key points of the game:
o There are two players, MAX and MIN.
o Players take alternate turns, starting with MAX.
o MAX maximizes the result of the game tree, while MIN minimizes it.
o In the game tree, the optimal leaf node could appear at any depth of the tree.
o Minimax values are propagated up the tree from the terminal nodes.
In a given game tree, the optimal strategy can be determined from the minimax value of each
node, written MINIMAX(n). MAX prefers to move to a state of maximum value and MIN prefers
to move to a state of minimum value, so:
MINIMAX(s) =
  UTILITY(s)                                     if TERMINAL-TEST(s)
  max over actions a of MINIMAX(RESULT(s, a))    if PLAYER(s) = MAX
  min over actions a of MINIMAX(RESULT(s, a))    if PLAYER(s) = MIN
Working of the Min-Max Algorithm (example):
Step 1: The algorithm generates the entire game tree and applies the utility function to the
terminal states. The maximizer starts with a worst-case initial value of -∞ and the minimizer
with +∞.
Step 2: Next, it is the maximizer's turn at the layer above the terminal nodes; each of those
nodes takes the maximum of its children's values, yielding 4 and 6 below node B, and -3 and 7
below node C.
Step 3: In the next step, it is the minimizer's turn, so it will compare those node values with +∞
and take the minimum, producing the values at the next layer:
o For node B= min(4,6) = 4
o For node C= min (-3, 7) = -3
Step 4: Now it is the maximizer's turn, and it will again choose the maximum of all node values
to find the value of the root node. In this game tree there are only 4 layers, so we reach the root
node immediately, but in real games there will be many more layers.
o For node A: max(4, -3) = 4
That was the complete workflow of the minimax two player game.
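A minimal recursive sketch of this procedure is given below; the nested-list encoding of the example tree (leaves 4, 6, -3, 7) follows the node values used above, while the exact tree shape is an assumption.

def minimax(node, maximizing):
    # Leaves carry utility values; internal nodes are lists of children.
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Example tree: A = max( B = min(4, 6), C = min(-3, 7) )
tree = [[4, 6], [-3, 7]]
print(minimax(tree, True))   # -> 4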
Properties of Mini-Max algorithm:
o Complete- The Min-Max algorithm is complete. It will definitely find a solution (if one
exists) in a finite search tree.
o Optimal- The Min-Max algorithm is optimal against an opponent that plays optimally.
o Time Complexity- The time complexity of the Min-Max algorithm is O(b^m), where b is
the branching factor of the game tree and m is the maximum depth of the tree.
o Space Complexity- The space complexity of the Mini-max algorithm is similar to DFS,
which is O(bm).
Limitation of the minimax Algorithm:
The main drawback of the minimax algorithm is that it gets really slow for complex games such
as Chess, Go, etc. These games have a huge branching factor, and the player has many choices
to decide among. This limitation of the minimax algorithm can be alleviated by alpha-beta
pruning, which we discuss in the next topic.
Alpha-Beta Pruning
o Alpha-beta pruning is a modified version of the minimax algorithm. It is an optimization
technique for the minimax algorithm.
o As we have seen in the minimax search algorithm, the number of game states it has to
examine is exponential in the depth of the tree. We cannot eliminate the exponent
entirely, but we can effectively cut it in half. There is a technique by which we can
compute the correct minimax decision without checking each node of the game tree, and
this technique is called pruning. It involves two threshold parameters, Alpha and Beta,
for future expansion, so it is called alpha-beta pruning. It is also called the Alpha-Beta
Algorithm.
o Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes not
only the tree's leaves but also entire subtrees.
o The two-parameter can be defined as:
o Alpha: The best (highest-value) choice we have found so far at any point along
the path of Maximizer. The initial value of alpha is -∞.
o Beta: The best (lowest-value) choice we have found so far at any point along the
path of Minimizer. The initial value of beta is +∞.
o Alpha-beta pruning applied to a standard minimax algorithm returns the same move as
the standard algorithm does, but it removes all the nodes that do not really affect the
final decision yet make the algorithm slow. Pruning these nodes makes the algorithm
fast.
Note: To better understand this topic, kindly study the minimax algorithm.
Condition for Alpha-beta pruning:
The main condition required for alpha-beta pruning is:
1. α >= β
Pseudocode for minimax with alpha-beta pruning:
function minimax(node, depth, alpha, beta, maximizingPlayer)
  if depth == 0 or node is a terminal node then
    return static evaluation of node
  if maximizingPlayer then          // for Maximizer Player
    maxEva = -infinity
    for each child of node do
      eva = minimax(child, depth-1, alpha, beta, false)
      maxEva = max(maxEva, eva)
      alpha = max(alpha, eva)
      if beta <= alpha
        break                       // beta cut-off
    return maxEva
  else                              // for Minimizer player
    minEva = +infinity
    for each child of node do
      eva = minimax(child, depth-1, alpha, beta, true)
      minEva = min(minEva, eva)
      beta = min(beta, eva)
      if beta <= alpha
        break                       // alpha cut-off
    return minEva
Working of Alpha-Beta Pruning (example):
Step 1: The Max player starts the first move from node A, where α = -∞ and β = +∞; these
values of alpha and beta are passed down to node B, and from B to node D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is compared
first with 2 and then with 3, and max(2, 3) = 3 becomes the value of α at node D; the node value
will also be 3.
Step 3: Now the algorithm backtracks to node B, where the value of β will change, as this is
Min's turn. Now β = +∞ is compared with the available subsequent node value, i.e. min(∞, 3) =
3, hence at node B now α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the
values α = -∞ and β = 3 are passed down as well.
Step 4: At node E, Max will take its turn, and the value of alpha will change. The current value
of alpha is compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3, where α >= β,
so the right successor of E will be pruned, and the algorithm will not traverse it. The value at
node E will be 5.
Step 5: In the next step, the algorithm again backtracks the tree, from node B to node A. At
node A, the value of alpha is changed: the maximum available value is 3, as max(-∞, 3) = 3, and
β = +∞. These two values are now passed to the right successor of A, which is node C.
At node C, α=3 and β= +∞, and the same values will be passed on to node F.
Step 6: At node F, the value of α is again compared, first with the left child, which is 0, giving
max(3, 0) = 3, and then with the right child, which is 1, giving max(3, 1) = 3; α remains 3, but the
node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of
beta will change: it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, and again
the condition α >= β is satisfied, so the next child of C, which is G, will be pruned, and the
algorithm will not compute the entire subtree G.
Step 8: C now returns the value 1 to A, and the best value for A is max(3, 1) = 3. The final game
tree shows the nodes that were computed and the nodes that were never computed. The
optimal value for the maximizer is 3 for this example.
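The walkthrough can be checked with the short alpha-beta implementation below; the values of the pruned leaves, marked as assumptions in the comments, are hypothetical, since pruning means they never influence the result.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if not isinstance(node, list):       # leaf: return its utility
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if beta <= alpha:
                break                    # beta cut-off: prune remaining children
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if beta <= alpha:
            break                        # alpha cut-off
    return value

# Example tree from the steps above: A = max(B, C), B = min(D, E), C = min(F, G),
# D = max(2, 3), E = max(5, ?), F = max(0, 1), G = max(?, ?).
# The 9 under E and both leaves of G are hypothetical stand-ins for pruned parts.
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
print(alphabeta(tree, True))             # -> 3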
Move Ordering in Alpha-Beta pruning:
The effectiveness of alpha-beta pruning is highly dependent on the order in which each node is
examined. Move order is an important aspect of alpha-beta pruning.
It can be of two types:
o Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any of
the leaves of the tree and works exactly like the minimax algorithm. In this case, it also
consumes more time because of the alpha-beta bookkeeping; such an ordering is called
worst ordering. Here the best move occurs on the right side of the tree. The time
complexity for such an ordering is O(b^m).
o Ideal ordering: The ideal ordering for alpha-beta pruning occurs when a lot of pruning
happens in the tree, and the best moves occur on the left side of the tree. We apply DFS,
so it searches the left of the tree first and can go twice as deep as the minimax
algorithm in the same amount of time. The complexity for ideal ordering is O(b^(m/2)).
Rules to find good ordering:
Following are some rules to find good ordering in alpha-beta pruning:
o Make the best move occur at the shallowest node.
o Order the nodes in the tree such that the best nodes are checked first.
o Use domain knowledge while finding the best move. For example, in Chess, try the
following order: captures first, then threats, then forward moves, then backward moves.
o We can bookkeep the states, as there is a possibility that states may repeat.
Genetic Algorithms
Fitness Function:
The fitness function evaluates how good a candidate solution is. It guides the selection process
by assigning higher fitness scores to better solutions.
Termination:
Termination criteria determine when the GA should stop running. Common termination
conditions include:
Reaching a maximum number of generations.
Achieving a satisfactory solution or fitness level.
Running out of computation time or resources.
Once the termination criteria are met, the GA concludes, and the best solution found is
returned.
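A minimal sketch of a genetic algorithm with these components is given below; the bit-string encoding, the ones-counting fitness function, and all parameter values are illustrative assumptions.

import random

def genetic_algorithm(fitness, length=20, pop_size=30,
                      generations=100, mutation_rate=0.01):
    # Initial population of random bit strings.
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):                 # termination: max generations
        # Selection: higher-fitness individuals are more likely to reproduce.
        weights = [fitness(ind) + 1e-9 for ind in pop]
        parents = random.choices(pop, weights=weights, k=pop_size)
        # Crossover: single-point recombination of parent pairs.
        pop = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = random.randrange(1, length)
            pop += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        # Mutation: flip each bit with a small probability.
        pop = [[1 - bit if random.random() < mutation_rate else bit
                for bit in ind] for ind in pop]
    return max(pop, key=fitness)                 # best solution found

# Illustrative fitness: number of 1-bits (the "OneMax" toy problem).
best = genetic_algorithm(lambda ind: sum(ind))
print(sum(best), best)   # typically close to 20 ones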