AI (KCS-071) Unit 2
Before it can do this, it needs to decide (or we need to decide on its behalf)
what sorts of actions and states it should consider.
Let us assume that the agent will consider actions at the level of driving from
one major town to another. Each state therefore corresponds to being in a
particular town.
It then uses the solution to guide its actions, doing whatever the solution
recommends as the next thing to do—typically, the first action of the
sequence—and then removing that step from the sequence.
Once the solution has been executed, the agent will formulate a new goal.
• INITIAL STATE - The initial state that the agent starts in.
For example, the initial state for our agent in Romania might be described as In(Arad).
• ACTIONS - A description of the possible actions available to the agent. Given a particular state s,
ACTIONS(s) returns the set of actions that can be executed in s.
For example, from the state In(Arad), the applicable actions are {Go(Sibiu), Go(Timisoara),
Go(Zerind)}.
• TRANSITION MODEL - A description of what each action does. It is specified by a function RESULT(s,
a) that returns the state that results from doing action a in state s.
• PATH COST - A path cost function that assigns a numeric cost to each path.
The problem-solving agent chooses a cost function that reflects its own
performance measure.
The preceding elements define a problem and can be gathered into a single
data structure that is given as input to a problem-solving algorithm.
A solution to a problem is an action sequence that leads from the initial state
to a goal state. Solution quality is measured by the path cost function, and an
optimal solution has the lowest path cost among all solutions.
STATE SPACE : Together, the initial state, actions, and transition model
implicitly define the state space of the problem—the set of all states reachable
from the initial state by any sequence of actions.
The state space forms a directed network or graph in which the nodes are
states and the links between nodes are actions. A path in the state space is a
sequence of states connected by a sequence of actions.
• Action(s): The possible actions include choosing a road to travel from one intersection to
another.
• Result(s, a): For a chosen action 'a' (road taken), the result would be a new state
representing the agent's new location (another intersection).
• Goal Test: A function to determine if the agent has reached the destination, 'Work'.
• Path Cost Function: A function that adds up the distance (or time) to travel from the initial
state to the current state via the chosen paths. The objective is to minimize this cost.
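Putting those components together, here is a minimal Python sketch of the route-finding formulation just described. The road map, intersection names, and distances are illustrative assumptions, not data from the notes.

from typing import Dict, List

# Hypothetical road map: intersection -> {neighbor: road length}
road_map: Dict[str, Dict[str, int]] = {
    "Home": {"A": 4, "B": 3},
    "A":    {"Home": 4, "Work": 6},
    "B":    {"Home": 3, "Work": 8},
    "Work": {},
}

initial_state = "Home"

def actions(s: str) -> List[str]:
    # Possible actions: choose a road leaving intersection s.
    return list(road_map[s].keys())

def result(s: str, a: str) -> str:
    # Taking road a from s puts the agent at intersection a.
    return a

def goal_test(s: str) -> bool:
    # The agent has reached the destination, 'Work'.
    return s == "Work"

def step_cost(s: str, a: str) -> int:
    # Distance of the chosen road; the path cost is the sum of these steps.
    return road_map[s][a]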
• Actions: In this simple environment, each state has just three actions: Left, Right, and Suck.
• Transition model: The actions have their expected effects, except that moving Left in the leftmost square, moving Right in
the rightmost square, and Sucking in a clean square have no effect.
• Goal test: This checks whether all the squares are clean.
• Path cost: Each step costs 1, so the path cost is the number of steps in the path.
• Goal test: This checks whether the state matches the goal
configuration.
• Path cost: Each step costs 1, so the path cost is the number of
steps in the path.
A tree representation of a search problem is called a search tree. The root of
the search tree is the root node, which corresponds to the initial state.
Following are the four essential properties of search strategies/algorithms used to compare their efficiency:
● Completeness: A search algorithm is said to be complete if it guarantees to return a solution whenever at least one solution exists for any random input.
● Optimality: If the solution found by an algorithm is guaranteed to be the best solution (lowest path cost) among all solutions, then it is said to be an optimal solution.
● Time Complexity: Time complexity is a measure of the time an algorithm takes to complete its task.
● Space Complexity: It is the maximum storage space required at any point during the search, as a function of the complexity of the problem.
• Examples: Breadth First Search, Depth First Search, Uniform Cost Search
• Efficiency: By leveraging domain knowledge, informed search strategies can make informed
decisions and focus the search on more relevant areas, leading to faster convergence to a solution.
Disadvantage:
• Heuristic Accuracy: The effectiveness of informed search strategies heavily relies on the quality
and accuracy of the chosen heuristic function. An inaccurate or misleading heuristic can lead to
suboptimal or incorrect solutions.
Complexity: Uninformed search has higher complexity due to the lack of information, affecting both time and space complexity. Informed search has reduced complexity and is typically more efficient in both time and space due to informed decisions.
• If there are more than one solutions for a given problem, then BFS will provide the minimal
solution which requires the least number of steps.
Disadvantages:
•BFS is inefficient in terms of time and space for large search spaces. It requires lots of memory
since each level of the tree must be saved into memory to expand the next level.
•BFS needs lots of time if the solution is far away from the root node.
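As a concrete reference, here is a minimal BFS sketch over an adjacency-dict graph. The graph format and the toy example in the comments are assumptions for illustration.

from collections import deque

def breadth_first_search(graph, start, goal):
    """BFS returns the shallowest path; it stores every generated node,
    which is why its memory use grows level by level."""
    frontier = deque([[start]])          # FIFO queue of partial paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None

# Hypothetical graph: {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
# breadth_first_search(graph, "A", "D") -> ["A", "B", "D"]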
•It is a searching algorithm used for traversing a weighted tree or graph. This algorithm comes into play when a
different cost is available for each edge.
•The primary goal of the uniform cost search is to find a path to the goal node which has the lowest
cumulative cost.
•A uniform cost search algorithm is implemented with a priority queue; it gives maximum priority to the
lowest cumulative cost.
•Uniform cost search is equivalent to BFS algorithm if the path cost of all edges is the same.
•The goal test is applied to a node when it is selected for expansion rather than when it is first generated.
1. Dequeue the node with the lowest path cost from the priority queue.
2. If the dequeued node is the goal state, terminate the search and return the solution.
3. Otherwise, expand the node and enqueue its unvisited neighboring nodes with their updated path
costs.
4. Repeat steps 1–3 until the goal state is found or the priority queue is empty.
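A minimal Python sketch of these steps, using heapq as the priority queue. The weighted-graph format {node: {neighbor: cost}} is an assumed convention.

import heapq

def uniform_cost_search(graph, start, goal):
    """UCS: the priority queue always yields the frontier node with the
    lowest cumulative path cost; the goal test happens at expansion."""
    frontier = [(0, start, [start])]     # (path_cost, node, path)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)   # step 1: cheapest node
        if node == goal:                             # step 2: goal test on expansion
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for neighbor, edge_cost in graph.get(node, {}).items():  # step 3: expand
            if neighbor not in explored:
                heapq.heappush(frontier,
                               (cost + edge_cost, neighbor, path + [neighbor]))
    return None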
•Uniform cost search is optimal because at every state the path with the least
cost is chosen.
Disadvantages:
•It does not care about the number of steps involved in the search and is only
concerned with path cost, due to which this algorithm may get stuck in an
infinite loop (for example, when there are zero-cost action loops).
Optimal: Uniform cost search is always optimal, as it only selects a path with the lowest path cost.
Time Complexity: The time complexity of Uniform Cost Search depends on the number of nodes
and the cost of the lowest-cost path to the goal. If every step costs at least ε and the optimal
solution cost is C*, the worst-case time complexity of Uniform Cost Search is O(b^(1 + ⌊C*/ε⌋)).
Space Complexity: The space complexity of Uniform Cost Search can also be exponential in the
worst case, as it may need to store all the nodes along the lowest-cost paths in memory. By the
same logic, the worst-case space complexity of Uniform Cost Search is O(b^(1 + ⌊C*/ε⌋)).
• It takes less time to reach the goal node than the BFS algorithm (if it traverses in the right direction).
Disadvantages:
•There is the possibility that many states keep reoccurring, and there is no guarantee of finding the
solution.
•The DFS algorithm goes deep down into the search and sometimes may enter an infinite loop.
•Optimality: DFS does not guarantee finding the optimal solution, as it may find a solution at a greater depth
before finding a shorter path. It is non-optimal, as it may take a large number of steps or incur a high cost
to reach the goal node.
•Time Complexity: The time complexity of DFS can vary depending on the search space structure. In the worst
case, it can be O(b^m), where b is the branching factor and m is the maximum depth of the search tree.
•Space Complexity: The space complexity of DFS is O(bm), where b is the branching factor and m is the
maximum depth of the search tree. It stores only the nodes along the current path in memory.
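A minimal recursive DFS sketch; the adjacency-dict graph format is an assumption carried over from the BFS sketch above.

def depth_first_search(graph, node, goal, path=None, visited=None):
    """DFS explores one branch to full depth before backtracking. It keeps
    little in memory but is neither complete on infinite graphs nor optimal."""
    if path is None:
        path, visited = [], set()
    path = path + [node]
    if node == goal:
        return path
    visited.add(node)
    for neighbor in graph.get(node, []):
        if neighbor not in visited:
            found = depth_first_search(graph, neighbor, goal, path, visited)
            if found:
                return found
    return None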
•In this algorithm, a node at the depth limit is treated as if it has no further
successor nodes.
•Standard failure value: It indicates that the problem does not have any solution.
•Cutoff failure value: It indicates that there is no solution for the problem within the given depth limit.
Goal Node: H
This depth (d) will lead to no solution due to
condition of cut-off failure.
Disadvantages:
•It may not be optimal if the problem has more than one solution.
1st Iteration (d = 0) → A
If the goal node is K
•It combines the benefits of the BFS and DFS search algorithms in terms of fast
search and memory efficiency.
Disadvantages:
•The main drawback of IDDFS is that it repeats all the work of the previous
phase
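A minimal sketch of depth-limited search and the iterative-deepening loop around it. The graph format and the max_depth safety cap are illustrative assumptions.

def depth_limited_search(graph, node, goal, limit):
    """DFS that treats nodes at the depth limit as having no successors."""
    if node == goal:
        return [node]
    if limit == 0:
        return None                      # cutoff reached
    for neighbor in graph.get(node, []):
        found = depth_limited_search(graph, neighbor, goal, limit - 1)
        if found:
            return [node] + found
    return None

def iterative_deepening_search(graph, start, goal, max_depth=50):
    """Runs DLS with limits 0, 1, 2, ... — redoing shallow work on each
    iteration, but keeping only DFS-like memory."""
    for depth in range(max_depth + 1):
        result = depth_limited_search(graph, start, goal, depth)
        if result:
            return result
    return None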
•Bidirectional search replaces one single search graph with two smaller subgraphs: one search
starts from the initial vertex and the other starts from the goal vertex.
•Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
• The motivation is that b^(d/2) + b^(d/2) is much less than b^d. Bidirectional search is implemented
by replacing the goal test with a check to see whether the frontiers of the two searches
intersect; if they do, a solution has been found.
•It starts traversing from node 1 in the forward direction and starts from goal
node 16 in the backward direction.
(Figure: example graph with nodes A–G, searched forward from the start and backward from the goal until the two frontiers meet.)
• As the search progresses from both directions, the effective branching factor is reduced, leading to a more efficient
search. Bidirectional search requires less memory.
Disadvantages:
•Bidirectional Search requires storing visited nodes from both directions, leading to increased memory consumption
compared to unidirectional searches. Implementation of the bidirectional search tree is difficult.
•In bidirectional search, one should know the goal state in advance.
• The coordination and synchronization between the two searches introduce additional overhead in terms of implementation
complexity.
•Optimality: Bidirectional search is Optimal if both the forward and backward searches are optimal
•Time Complexity: The time complexity of Bidirectional Search depends on the branching factor, the depth of
the shallowest goal state, and the meeting point of the two searches. In the best case, it can be O(b^(d/2)), where b
is the branching factor and d is the depth of the goal state. The worst-case time complexity of bidirectional search is
O(b^d).
•Space Complexity: The space complexity of Bidirectional Search depends on the memory required to store
visited nodes from both directions. In the best case, it can be O(b^(d/2)), where b is the branching factor and d is
the depth of the goal state. The worst-case space complexity of bidirectional search is O(b^d).
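A minimal two-frontier BFS sketch of this idea. It assumes an undirected adjacency-dict graph so the backward search can reuse the same edges.

from collections import deque

def bidirectional_search(graph, start, goal):
    """Grow a BFS frontier from each end; stop when they intersect,
    then stitch the two half-paths together at the meeting node."""
    if start == goal:
        return [start]
    fwd_parent, bwd_parent = {start: None}, {goal: None}
    fwd_frontier, bwd_frontier = deque([start]), deque([goal])

    def expand(frontier, parents, others):
        node = frontier.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                if neighbor in others:     # frontiers intersect: solution found
                    return neighbor
                frontier.append(neighbor)
        return None

    while fwd_frontier and bwd_frontier:
        meet = expand(fwd_frontier, fwd_parent, bwd_parent) or \
               expand(bwd_frontier, bwd_parent, fwd_parent)
        if meet:
            path, n = [], meet
            while n is not None:           # walk back to the start
                path.append(n)
                n = fwd_parent[n]
            path.reverse()
            n = bwd_parent[meet]
            while n is not None:           # walk forward to the goal
                path.append(n)
                n = bwd_parent[n]
            return path
    return None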
•The heuristic method, however, might not always give the best solution, but it is guaranteed to find a good
solution in reasonable time.
•A heuristic function estimates how close a state is to the goal. It is represented by h(n), and it estimates the cost
of an optimal path between the pair of states. The value of the heuristic function is always positive.
Here h(n) is the heuristic cost and h*(n) is the actual cost of an optimal path. For the heuristic to be admissible,
the heuristic cost should be less than or equal to the actual cost: h(n) ≤ h*(n).
• It is a combination of the depth-first and breadth-first search algorithms. It uses a heuristic
function to guide the search.
•In the best-first search algorithm, we expand the node which is closest to the goal node, where
closeness is estimated by the heuristic function.
•Step 3: Remove from the OPEN list the node n which has the lowest value of f(n), and place it in the CLOSED
list.
•Step 5: Check each successor of node n and find whether any of them is a goal node. If any successor node
is a goal node, then return success and terminate the search; else proceed to Step 6.
•Step 6: For each successor node, the algorithm computes the evaluation function f(n) and then checks whether
the node is already in either the OPEN or CLOSED list. If the node is in neither list, add it to the OPEN list.
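These steps amount to a priority queue ordered by h(n) alone. A minimal sketch, where the heuristic table h is an assumed input:

import heapq

def greedy_best_first(graph, h, start, goal):
    """Best-first search ordered purely by h(n): always expands the OPEN
    node that looks closest to the goal, ignoring path cost so far."""
    open_list = [(h[start], start, [start])]
    closed = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)   # lowest h(n) first
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for neighbor in graph.get(node, []):
            if neighbor not in closed:
                heapq.heappush(open_list,
                               (h[neighbor], neighbor, path + [neighbor]))
    return None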
(Figure: weighted example graph with heuristic values for nodes A–H, e.g., h(H) = 10 and h(G) = 0 at the goal; expanding the lowest-h node at each step, greedy best-first search returns the path A → C → F → G.)
Advantages:
• It uses less memory than other informed search methods like A* as it does not store all the generated nodes.
Disadvantages:
• It is not optimal. It does not guarantee the shortest possible path will be found.
•Time Complexity: The worst-case time complexity of greedy best-first search is O(b^m).
•It has combined features of UCS and greedy best-first search, by which it solves the problem efficiently.
•The A* search algorithm finds the shortest path through the search space using the heuristic function. This search
algorithm provides an optimal result faster.
•The A* algorithm is similar to UCS except that it uses g(n) + h(n) instead of g(n).
•In the A* search algorithm, we use the search heuristic as well as the cost to reach the node.
Step 2: Check if the OPEN list is empty or not; if the list is empty, then return failure and stop.
Step 3: Select the node from the OPEN list which has the smallest value of the evaluation function g + h. If node n is
the goal node, then return success and stop; otherwise:
Step 4: Expand node n, generate all of its successors, and put n into the CLOSED list. For each successor n',
check whether n' is already in the OPEN or CLOSED list; if not, then compute the evaluation function for n' and place
it into the OPEN list.
Step 5: Else, if node n' is already in OPEN or CLOSED, then it should be attached to the back pointer which
reflects the lowest g(n') value.
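A minimal A* sketch combining the UCS priority queue with a heuristic table. The graph and h formats are assumed conventions carried over from the earlier sketches.

import heapq

def a_star(graph, h, start, goal):
    """A* orders nodes by f(n) = g(n) + h(n); with an admissible h, the
    first goal node expanded lies on an optimal path."""
    open_list = [(h[start], 0, start, [start])]    # (f, g, node, path)
    best_g = {start: 0}
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return g, path
        for neighbor, step in graph.get(node, {}).items():
            g2 = g + step
            if g2 < best_g.get(neighbor, float("inf")):   # keep lowest g(n')
                best_g[neighbor] = g2
                heapq.heappush(open_list,
                               (g2 + h[neighbor], g2, neighbor, path + [neighbor]))
    return None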
Path A → F → G
Step 04:
•f(H) = (3+1+3+2) + 3 = 12
•f(J) = (3+1+3+3) + 0 = 10
Since f(J) is least, so it decides to go to node J.
Path:A → F → G → I → J
This is the required shortest path from node A to node J
f(n) = g(n) + h(n), where g(n) is the cost to reach a node from the start state and h(n) is the estimated cost from that node to the goal.
Disadvantages:
•It does not always produce the shortest path, as it is mostly based on heuristics and approximation.
•The main drawback of A* is memory requirement as it keeps all generated nodes in the memory, so it is not
practical for various large scale problems.
•Time Complexity: The time complexity of A* Search depends on the heuristic function,
the branching factor, and the structure of the search space. In the worst case, it can be
exponential.
•Space Complexity: The space complexity of A* Search depends on the size of the priority
queue and the number of nodes stored in memory. In the worst case, it can be exponential.
• A heuristic h(n) is consistent if, for every node n and every successor n’ of n generated by any action
a, the estimated cost of reaching the goal from n is no greater than the step cost of getting to n’ plus
the estimated cost of reaching the goal from n’: h(n) ≤ c(n, a, n’) + h(n’)
This is a form of the general triangle inequality, which stipulates that each side of a triangle
cannot be longer than the sum of the other two sides
(Figure: triangle formed by node n, its successor n′, and the goal G, with sides h(n), c(n, a, n′), and h(n′).)
Every consistent heuristic is also admissible.
If h(n) is consistent, A* using GRAPH-SEARCH is optimal.
The main difference between IDA∗ and standard iterative deepening is that
the cutoff used is the f-cost (g+h) rather than the depth; at each iteration, the
cutoff value is the smallest f-cost of any node that exceeded the cutoff on the
previous iteration.
where h is admissible. Here,
● f(n) = Total cost evaluation function.
● g(n) = The actual cost from the initial node to the current node.
● h(n) = Heuristic estimated cost from the current node to the goal state. it is based on the
approximation according to the problem characteristics.
The f-score estimates the total cost of a solution passing through a given state. It is the combination of the two
components above: f(n) = g(n) + h(n).
Step-by-Step Process of the IDA* Algorithm
1. Initialization: Set the root node as the current node and compute its f-score.
2. Set Threshold: Initialize a threshold based on the f-score of the starting node.
3. Node Expansion: Expand the current node’s children and calculate their f-scores.
4. Pruning: If the f-score exceeds the threshold, prune the node and store it for future exploration.
5. Path Return: Once the goal node is found, return the path from the start node to the goal.
6. Update Threshold: If the goal is not found, increase the threshold based on the minimum pruned
value and repeat the process
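A minimal recursive sketch of this threshold loop. The weighted adjacency-dict graph and the heuristic table h are assumed inputs.

def ida_star(graph, h, start, goal):
    """IDA*: depth-first search cut off by an f-cost (g + h) threshold,
    raised on each iteration to the smallest f that exceeded it."""
    def dfs(node, g, path, threshold):
        f = g + h[node]
        if f > threshold:
            return f, None                 # prune; report the exceeding f-score
        if node == goal:
            return f, path
        minimum = float("inf")
        for neighbor, step in graph.get(node, {}).items():
            if neighbor not in path:       # avoid cycles along the current path
                t, found = dfs(neighbor, g + step, path + [neighbor], threshold)
                if found:
                    return t, found
                minimum = min(minimum, t)  # smallest pruned f seen so far
        return minimum, None

    threshold = h[start]                   # initial threshold: f of the root
    while threshold != float("inf"):
        threshold, found = dfs(start, 0, [start], threshold)
        if found:
            return found
    return None                            # no pruned f left: no solution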
1. Optimal Pathfinding: IDA* guarantees finding the optimal path, as it never overestimates the
cost to the goal.
2. Memory Efficient: It uses limited memory compared to A* by applying depth-first search
techniques.
3. Efficient with Large State Spaces: IDA* handles large graphs efficiently by pruning
unnecessary nodes.
Disadvantages
1. Repeated Node Exploration: The algorithm does not store visited nodes, leading to repeated
exploration.
2. Slower than A*: IDA* can be slower than algorithms like A* due to the repeated exploration of
nodes.
In this way, RBFS remembers the f-value of the best leaf in the forgotten subtree and can therefore
decide whether it’s worth reexpanding the subtree at some later time.
RBFS is somewhat more efficient than IDA∗, but still suffers from excessive node regeneration
(Figure: three snapshots of RBFS on the Romania route-finding example. Backed-up f-values such as 393, 413, 415, 417, 447, 449, and 450 are written beside the nodes; when a subtree is forgotten, the f-value of its best leaf is stored at its root, so the subtree can be re-expanded later if it again becomes the best choice.)
Space Complexity: Its space complexity is linear in the depth of the deepest
optimal solution
IDA∗ and RBFS suffer from using too little memory. Between iterations, IDA∗
retains only a single number: the current f-cost limit. RBFS retains more
information in memory, but it uses only linear space: even if more memory
were available, RBFS has no way to make use of it. Because they forget most
of what they have done, both algorithms may end up reexpanding the same
states many times over. Furthermore, they suffer the potentially exponential
increase in complexity associated with redundant paths in graphs.
Pruning: It eliminates those parts of the search space which do not contain a
better solution.
Disadvantages:
● The Branch and Bound algorithm is limited to small-sized networks. For large
networks, where the solution search space grows exponentially with the scale of the network,
the approach becomes prohibitive.
•Such a structure represents the fact that the problem can be solved along any one of the branches leaving a node.
AND OR GRAPHS
•The AND OR GRAPH (or tree) is useful for representing the solution of problems that can solved by
decomposing them into a set of smaller problems, all of which must then be solved.
•This decomposition, or reduction, generates arcs that we call AND arcs. One AND arc may point to
any number of successor nodes, all of which must be solved in order for the arc to point to a
solution.
• It is optimal when the heuristic function is admissible (never overestimates the true cost).
Disadvantages of the AO* Algorithm
• It can consume a large amount of memory, similar to the A* algorithm.
• The performance of AO* is heavily dependent on the accuracy of the heuristic function. If
the heuristic function is not well-chosen, AO* could perform poorly.
• In this technique, all the solutions are generated and tested for the best solution. It ensures that
the best solution is checked against all possible generated solutions.
• The potential solutions that need to be generated vary depending on the kind of problem. For some
problems the possible solutions may be particular points in the problem space, and for some
problems, paths from the start state.
• This approach is what is known as British Museum algorithm: finding an object in the British
Museum by wandering randomly.
•Systematic generate and test may prove to be ineffective while solving complex problems. But
there is a technique to improve performance in complex cases as well, by combining generate-and-test search with
other techniques so as to reduce the search space. For example, the Artificial Intelligence program
DENDRAL makes use of two techniques: constraint satisfaction, followed by a
generate-and-test procedure that works on the reduced search space, i.e., it yields an effective result
by working on the smaller number of candidate lists generated in the very first step.
Complete: Good generators need to be complete, i.e., they should generate all the possible solutions and cover all the
possible states. In this way, we can guarantee that our algorithm converges to the correct solution at some point in time.
Non Redundant: Good Generators should not yield a duplicate solution at any point of time as it reduces the
efficiency of algorithm thereby increasing the time of search and making the time complexity exponential. In fact, it is
often said that if solutions appear several times in the depth-first search then it is better to modify the procedure to
traverse a graph rather than a tree.
Informed: Good Generators have the knowledge about the search space which they maintain in the form of an array of
knowledge. This can be used to search how far the agent is from the goal, calculate the path cost and even find a way to
reach the goal.
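A minimal generate-and-test sketch with a complete, non-redundant (but uninformed) generator. The toy constraint in the usage comment is an illustrative assumption.

from itertools import product

def generate_and_test(generator, test):
    """Exhaustive generate-and-test: try candidates until one passes."""
    for candidate in generator:
        if test(candidate):
            return candidate
    return None

# Toy use: find digits (x, y) with x + y == 10 and x * y == 21.
# product() enumerates every pair exactly once (complete, non-redundant).
solution = generate_and_test(
    product(range(10), repeat=2),
    lambda c: c[0] + c[1] == 10 and c[0] * c[1] == 21,
)
# -> (3, 7)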
• A salesman has a list of cities, each of which he must visit exactly once. There are direct roads
between each pair of cities on the list. Find the route the salesman should follow for the shortest
possible round trip that both starts and finishes at any one of the cities.
– The traveler needs to visit n cities.
– The distance between each pair of cities is known.
– We want the shortest route that visits all the cities once.
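Generate and test applies directly here: enumerate every candidate tour and keep the best. A brute-force sketch (the city list and distance table are assumed inputs), feasible only for small n since there are (n−1)! tours:

from itertools import permutations

def tsp_brute_force(cities, dist):
    """Enumerate every round trip from a fixed start city; keep the shortest.
    dist is a nested dict: dist[a][b] = road length between cities a and b."""
    start, rest = cities[0], cities[1:]
    best_tour, best_len = None, float("inf")
    for perm in permutations(rest):
        tour = (start,) + perm + (start,)            # round trip
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_len:
            best_tour, best_len = tour, length
    return best_tour, best_len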
State Space Representation: we will represent a state of the problem as a tuple (x, y) where x represents the amount of water in
the 4-gallon jug and y represents the amount of water in the 3-gallon jug. Note that 0 ≤ x ≤ 4, and 0 ≤ y ≤ 3.
Assumptions: the jugs have no measuring marks; we may fill a jug completely from the pump, empty a jug onto the ground, or pour water from one jug into the other until the first is empty or the second is full.
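A minimal BFS sketch over this (x, y) state space. The goal of getting exactly 2 gallons into the 4-gallon jug is the usual textbook target and is an assumption here.

from collections import deque

def water_jug_bfs(goal_x=2):
    """BFS over states (x, y), 0 <= x <= 4 and 0 <= y <= 3, using the
    fill / empty / pour operators listed in the assumptions above."""
    def successors(x, y):
        return {
            (4, y), (x, 3),                              # fill a jug
            (0, y), (x, 0),                              # empty a jug
            (x - min(x, 3 - y), y + min(x, 3 - y)),      # pour 4-gal into 3-gal
            (x + min(y, 4 - x), y - min(y, 4 - x)),      # pour 3-gal into 4-gal
        }
    frontier = deque([[(0, 0)]])
    visited = {(0, 0)}
    while frontier:
        path = frontier.popleft()
        x, y = path[-1]
        if x == goal_x:                                  # goal test
            return path
        for state in successors(x, y):
            if state not in visited:
                visited.add(state)
                frontier.append(path + [state])
    return None

# water_jug_bfs() returns a shortest 6-move plan, e.g.
# [(0,0), (0,3), (3,0), (3,3), (4,2), (0,2), (2,0)]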
(Figure: 8-puzzle search tree. Each state is a 3×3 arrangement of tiles 1–8 around a blank; successors are generated by sliding a tile into the blank square.)
Solution using Informed Search (A*)
•The algorithm does not maintain a search tree, so the data structure for the current node need only
record the state and the value of the objective function.
• Hill climbing is sometimes called greedy local search with no backtracking because it grabs a good
neighbor state without looking ahead beyond the immediate neighbors of the current state.
• Hill climbing often makes rapid progress toward a solution because it is usually quite easy to
improve a bad state.
•Global Maximum: The global maximum is the best possible state in the state-space landscape. It
has the highest value of the objective function.
•Current state: It is the state in the landscape diagram where the agent is currently present.
•Flat local maximum: It is a flat region of the landscape where all the neighbor states of the
current state have the same value.
The heuristic cost function h is the number of pairs of queens that are attacking each
other, either directly or indirectly.
The global minimum of this function is zero, which occurs only at perfect solutions.
Figure shows a state with h=17. The figure also shows the values of all its successors, with the
best successors having h=12.
Unfortunately, hill climbing often gets stuck for the following reasons:
Hill-climbing algorithms typically choose randomly among the set of best successors if there
is more than one.
It takes just 5 steps to reach the state in Figure 4.3(b), which has h=1 and is very nearly a
solution.
14% of the time the hill-climbing algorithm solves the problem, while 86% of the time it gets stuck in a local minimum.
However:
- It takes only 4 steps on average when it succeeds,
- and 3 on average when it gets stuck (in a state space with 8^8 ≈ 17 million states).
Step 2: Loop until a solution is found or there is no new operator (successor function) left to apply.
Else, if it is better than the current state, then assign the new state as the current state.
Else, if it is not better than the current state, then return to Step 2.
Step 5: Exit.
•Step 2: Loop until a solution is found or the current state does not change.
•Let SUCC be a state such that any successor of the current state will be better than it.
•If the current state is the goal state, then return it and quit; else compare it to SUCC.
•If SUCC is better than the current state, then set the current state to SUCC.
•Step 3: Exit.
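A minimal sketch of steepest-ascent hill climbing. The neighbors and value functions, and the toy objective in the comments, are illustrative assumptions.

def steepest_ascent_hill_climbing(initial, neighbors, value):
    """Repeatedly jump to the best neighbor; stop when no neighbor
    improves on the current state (a local or global maximum)."""
    current = initial
    while True:
        succ = max(neighbors(current), key=value, default=None)
        if succ is None or value(succ) <= value(current):
            return current                 # no uphill move left
        current = succ

# Toy objective over positions 0..9 with a local maximum at x = 2
# and the global maximum at x = 8 (illustrative numbers):
# values = {0: 1, 1: 3, 2: 4, 3: 2, 4: 3, 5: 5, 6: 7, 7: 8, 8: 9, 9: 6}
# nbrs = lambda x: [x - 1, x + 1] if 0 < x < 9 else ([1] if x == 0 else [8])
# steepest_ascent_hill_climbing(0, nbrs, values.get) -> 2 (stuck on local max)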
A plateau is a flat area of the state-space landscape. It can be a flat local maximum, from which no
uphill exit exists, or a shoulder, from which progress is possible. A hill-climbing search might get lost
on the plateau. In each case, the algorithm reaches a point at which no progress is being made.
•Solution: We can allow sideways moves in the hope that the plateau is really a shoulder. But if we
always allow sideways moves when there are no uphill moves, an infinite loop will occur whenever the
algorithm reaches a flat local maximum that is not a shoulder. One common solution is to put a limit
on the number of consecutive sideways moves allowed.
- To prevent the algorithm from getting stuck in a loop or revisiting states, a list of previously
visited states can be maintained, which are then avoided in future steps.
- This list is a fixed-length queue called a “tabu list” (add the most recent state to the queue and drop the oldest).
- As the size of the tabu list grows, hill climbing asymptotically becomes “non-redundant” (it won’t
visit the same state twice).
- In practice, a reasonably sized tabu list (say, 100 entries) improves the performance of hill climbing on many problems.
•Different variations
–For each restart: run until termination vs. run for a fixed time
–Run a fixed number of restarts or run indefinitely
• It's highly sensitive to the initial state and can get stuck in local optima.
• It does not maintain a search history, which can cause the algorithm to cycle or loop.
• It can't deal effectively with flat regions of the search space (plateaus) or regions that form a ridge.
● When T is high, the probability of accepting a worse state is higher, allowing the algorithm to explore more freely. As T decreases,
the probability of accepting worse solutions decreases, focusing more on exploitation and less on exploration.
● In the early stages (high temperature), the algorithm explores a wide range of solutions, even accepting worse solutions to escape
local optima. In the later stages (low temperature), it focuses on refining the solution by accepting only better solutions.
Annealing Schedule:
● The schedule controls how the temperature T decreases over time. In the early stages, the temperature is high, allowing more
exploration (accepting worse solutions). Over time, as the temperature lowers, the algorithm becomes more focused on improving
the solution.When the temperature reaches zero, the search ends, and the algorithm returns the current state as the best-found
solution.
● The rate at which the temperature decreases is critical. A slow decrease (cooling schedule) allows more exploration, which may yield
better results but takes longer. A fast decrease leads to quicker convergence but increases the risk of getting stuck in a local
optimum.
Probabilistic Acceptance:
● Unlike greedy algorithms, simulated annealing sometimes accepts worse solutions, making it less likely to get trapped in local minima
and more likely to find a global optimum.
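A minimal simulated-annealing sketch with a geometric cooling schedule. The initial temperature, cooling rate, and stopping threshold are illustrative tuning assumptions.

import math
import random

def simulated_annealing(initial, neighbor, value, t0=100.0, cooling=0.95, t_min=1e-3):
    """Accept any improving move; accept a worse move with probability
    e^(ΔE / T), which shrinks as the temperature T falls."""
    current, T = initial, t0
    while T > t_min:
        candidate = neighbor(current)                # random neighbor state
        delta = value(candidate) - value(current)    # ΔE > 0 means improvement
        if delta > 0 or random.random() < math.exp(delta / T):
            current = candidate                      # accept (possibly worse) move
        T *= cooling                                 # annealing schedule: cool down
    return current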
•Instead of choosing the best k from the pool of candidate successors, stochastic beam
search chooses k successors at random, with the probability of choosing a given successor
being an increasing function of its value. Stochastic beam search bears some resemblance
to the process of natural selection, whereby the “successors” (offspring) of a “state”
(organism) populate the next generation according to their “value” (fitness).
● Belief state
● Initial state
● Actions
● Transition model
● Goal Test
● Path Cost
We say that the agent can coerce the world into state 7.
● A belief state is a set of all possible physical states the agent might be in at a given time. Since the agent has no sensors to detect
its current state, it must consider multiple possibilities simultaneously.
● If the original problem has n physical states, the sensorless problem can have up to 2^n belief states (since the agent could be in any
combination of these physical states). Many belief states might be unreachable, depending on the problem.
2. Initial State:
● The agent’s starting belief state is usually the set of all possible states, since it has no information about where it starts.
● In some problems, the agent might have more knowledge, so its initial belief state could be a smaller subset of possible states.
3. Actions:
● The set of actions the agent can take. In sensorless problems, the agent may be unsure which actions are legal in its current belief
state, since it doesn’t know its exact position.
● There are two possibilities:
○ If illegal actions have no effect: The agent can take the union of all possible actions from every state in the belief state.
This means the agent assumes it can perform any action that’s legal in any of the possible states.
○ If illegal actions are dangerous: The agent can take the intersection of actions, meaning it only performs actions that are
legal in all possible states within the belief state.
5. Goal Test:
● The agent achieves the goal if ALL the possible states in the belief state meet the goal condition.
● The goal is considered achieved only if, regardless of which state the agent is actually in, it has reached the goal. Even if the agent
accidentally reaches the goal in one state, it won’t know unless every possible state in its belief state has also reached the goal.
6. Path Cost:
● The cost of taking an action in a belief state. In sensorless problems, an action might have different costs in different states, which
complicates the calculation of path cost.
● How it works:
○ If the same action has different costs depending on the actual state, the agent needs to account for the range of possible costs.
○ To simplify, it’s often assumed that the action cost is the same across all states in the belief state, so the cost can be directly
transferred from the physical problem.
There are only 12 reachable belief states out of 2^8 = 256 possible belief states.
The incremental belief-state search must find one single solution that satisfies all states in
the belief state.
Efficiency Gains: This ability to prune the search space can significantly speed up the problem-solving process, especially when
dealing with large belief states.
For eg, we might define the local-sensing vacuum world with agent having position
sensor and a local dirt sensor but has no sensor capable of detecting dirt in other
squares.
PERCEPT(s) function returns the percept received in a given state. For example, in
the local-sensing vacuum world, the PERCEPT in state 1 is [A, Dirty].
When observations are partial, it will usually be the case that several states could
have produced any given percept. For example, the percept [A, Dirty] is produced
by state 3 as well as by state 1. Hence, given this as the initial percept, the initial
belief state for the local-sensing vacuum world will be {1, 3}.
The ACTIONS, STEP-COST, and GOAL-TEST are constructed from the underlying
physical problem just as for sensorless problems, but the transition model is a bit
more complicated.
• The prediction stage is the same as for sensorless problems: given the action a in belief state b, the predicted belief
state is b̂ = PREDICT(b, a).
• The observation prediction stage determines the set of percepts o that could be observed in the predicted belief
state:
POSSIBLE-PERCEPTS(b̂) = {o : o = PERCEPT(s) and s ∈ b̂}.
• The update stage determines, for each possible percept, the belief state that would result from the percept. The
new belief state bo is just the set of states in b̂ that could have produced the percept:
bo = UPDATE(b̂, o) = {s : o = PERCEPT(s) and s ∈ b̂}.
Note that each updated belief state bo can be no larger than the predicted belief state b̂.
Putting these three stages together, we obtain the possible belief states resulting from a given action and the
subsequent possible percepts:
RESULTS(b, a) = {bo : bo = UPDATE(PREDICT(b, a), o) and o ∈ POSSIBLE-PERCEPTS(PREDICT(b, a))}.
Game playing - Adversarial Search
In multiagent environments, each agent needs to consider the actions of other agents
and how they affect its own welfare. Competitive environments, in which the agents’
goals are in conflict, give rise to adversarial search problems—often known as
games.
In AI, the most common games are of a rather specialized kind—deterministic, turn-taking, two-player, zero-sum
games of perfect information (such as chess).
In our terminology, this means deterministic, fully observable environments in which two agents act alternately and in which
the utility values at the end of the game are always equal and opposite.
For example, if one player wins the game of chess(+1),the other player necessarily loses(-1). It is this opposition
between the agents’ utility functions that makes the situation adversarial.
● At the end of the game, points are awarded to the winning player and penalties are given to the loser. A game can
be formally defined as a search problem with the following components:
○ S0: The initial state, which specifies how the game is set up at the start.
○ PLAYER(s): Defines which player has the move in a state.
○ ACTIONS(s): Returns the set of legal moves in a state.
○ RESULT(s, a): The transition model, which defines the result of a move.
○ TERMINAL-TEST(s): A terminal test, which is true when the game is over and false otherwise. States where the
game has ended are called terminal states.
○ UTILITY(s, p): A utility function (also called an objective function or payoff function) defines the final numeric
value for a game that ends in terminal state s for a player p. In chess, the outcome is a win, loss, or draw, with
values +1, 0, or ½.
A zero-sum game is defined as one where the total payoff to all players is the same for every instance of the game. Chess is
zero-sum because every game has a payoff of either 0 + 1, 1 + 0, or ½ + ½.
● In a game, on the other hand, MIN has something to say about the outcome. MAX
therefore must find a contingent strategy, which specifies MAX’s move
in the initial state, then MAX’s moves in the states resulting from every
possible response by MIN, then MAX’s moves in the states resulting from
every possible response by MIN to those moves, and so on.
● An optimal strategy leads to outcomes at least as good as any other
strategy when one is playing an infallible opponent.
Min-Max Terminology
•move: a move by both players
•ply: a half-move
•utility function:the function applied to leaf nodes
•backed-up value
–of a max-position : the value of its largest successor
–of a min-position : the value of its smallest successor
•minimax procedure: search down several levels; at the bottom level apply the utility
function, back-up values all the way up to the root node, and that node selects the move.
● This particular game ends after one move each by MAX and MIN.
● The utilities of the terminal states in this game range from 2 to 14.
● The terminal nodes on the bottom level get their utility values from the game’s UTILITY function. The first MIN node, labeled B, has three successor states with values 3, 12, and 8, so its
minimax value is 3.
● Similarly, the other two MIN nodes have minimax value 2. The root node is a MAX node; its successor states have minimax values 3, 2, and 2, so it has a minimax value of 3. We can also
identify the minimax decision at the root: action a1 is the optimal choice for MAX because it leads to the state with the highest minimax value.
● The algorithm computes the minimax decision for the current state and uses a depth-first search algorithm for the
exploration of the complete game tree.
● It operates by recursively using backtracking to simulate all possible moves in the game, effectively searching
through a game tree.The recursion proceeds all the way down to the leaves of the tree, and then the minimax values
are backed up through the tree as the recursion unwinds.
● In the game tree, there are two types of nodes: MAX nodes, where the algorithm selects the move with the maximum
value, and MIN nodes, where the algorithm selects the move with the minimum value.
• Games: It's typically applied to perfect information games like chess, checkers, and tic-tac-toe.
● In the algorithm, the MAX player is often represented with negative infinity (-∞) as the initial worst case, while MIN is
represented with positive infinity (∞).
(Figure: worked minimax example shown over several slides. A three-ply tree has MAX node A at the root, MIN nodes B and C below it, and MAX nodes D, E, F, G above leaf utilities 8, −3, −1, 2, 1, −3, and 4; each slide backs the values up one more level until the root’s minimax value is determined.)
• Optimality: It is guaranteed to find the optimal strategy for both players, but it is not able to exploit the
weaknesses of a suboptimal opponent.
• Time Complexity: The time complexity is O(b^m), where b is the branching factor (the average number of
child nodes for each node in the tree) and m is the maximum depth of the tree.
• Space Complexity: The space complexity is O(bm), because it only needs to store the nodes along the
current path (and their siblings) in memory.
• A game like chess can have around 35 possible moves at any point, leading
to a vast number of possible game states to evaluate.
• When to Prune: A branch is pruned when the minimizer's best option (Beta) is less than the maximizer's best
option (Alpha) since the maximizer will never allow the game to go down a path that could lead to a worse outcome
than it has already found.
• Each node has to keep track of its alpha and beta values. Alpha can be updated only when it’s MAX’s turn and,
similarly, beta can be updated only when it’s MIN’s chance.
• MAX will update only alpha values and MIN player will update only beta values.
• The node values (not the alpha and beta values) are passed to upper nodes as the recursion unwinds back up the tree.
The effectiveness of alpha-beta pruning is highly dependent on the order in which the successors are examined.
It might be worthwhile to try to examine first the successors that are likely to be best. In that case, it turns out
that alpha-beta needs to examine only O(b^(d/2)) nodes to pick the best move, instead of O(b^d) for minimax. This means
that the effective branching factor becomes sqrt(b) instead of b – for chess it comes out to be 6 instead of 35.
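A minimal alpha-beta sketch over the same nested-list tree convention as the minimax sketch above; it returns the same root value while skipping pruned branches.

def alphabeta(state, maximizing, alpha, beta, successors, utility, is_terminal):
    """Minimax with alpha-beta pruning: MAX updates alpha, MIN updates beta,
    and a branch is cut as soon as alpha >= beta."""
    if is_terminal(state):
        return utility(state)
    if maximizing:
        best = float("-inf")
        for s in successors(state):
            best = max(best, alphabeta(s, False, alpha, beta,
                                       successors, utility, is_terminal))
            alpha = max(alpha, best)
            if alpha >= beta:
                break                      # beta cut-off: MIN avoids this branch
        return best
    else:
        best = float("inf")
        for s in successors(state):
            best = min(best, alphabeta(s, True, alpha, beta,
                                       successors, utility, is_terminal))
            beta = min(beta, best)
            if alpha >= beta:
                break                      # alpha cut-off: MAX avoids this branch
        return best

# Initial call uses alpha = -inf and beta = +inf; on the tree from the
# minimax sketch it also returns 3.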
Benefits:
• Efficiency: Alpha-beta pruning can greatly reduce the number of nodes that are explored in the game tree, leading
to faster decision-making.
• Optimality: The final decision made by the alpha-beta pruning is the same as the decision that would be made by
the full minimax algorithm, ensuring optimality is not compromised.
• Widely Applicable: It can be applied to any game tree, not just two-player games, though the algorithm assumes
perfect play on both sides.
Constraint Satisfaction Problem
● Till now we explored the idea that problems can be solved by searching in a space of
states. These states can be evaluated by domain-specific heuristics and tested to see
whether they are goal states. From the point of view of the search algorithm, however,
each state is atomic, or indivisible—a black box with no internal structure.
● CSP describes a way to solve a wide variety of problems more efficiently. We use a
factored representation for each state: a set of variables, each of which has a value.
● A problem is solved when each variable has a value that satisfies all the constraints on the
variable. A problem described this way is called a constraint satisfaction problem.
D is a set of domains, {D1, . . . ,Dn}, one for each variable. Each domain Di consists of a set of allowable values, {v1, . . . ,
vk} for variable Xi.
Each constraint Ci consists of a pair (scope, rel ) where scope is a tuple of variables that participate in the constraint and rel
is a relation that defines the values that those variables can take on.
A relation can be represented as an explicit list of all tuples of values that satisfy the constraint, or as an abstract relation that
supports two operations: testing if a tuple is a member of the relation and enumerating the members of the relation.
For example, if X1 and X2 both have the domain {A, B}, then the constraint saying the two variables must have different values
can be written explicitly as ⟨(X1, X2), {(A, B), (B, A)}⟩ or abstractly as ⟨(X1, X2), X1 ≠ X2⟩.
● An assignment that does not violate any constraints is called a consistent or legal assignment;
● A partial assignment is one that assigns values to only some of the variables.
A. Finite Domains CSPs - They have a limited number of possible values for each variable. The problem space is
manageable but can grow exponentially as the number of variables increases.
If you have n variables, each with a domain of size d, then the total number of possible assignments to all variables is
O(d^n). For example, if each variable can take one of 3 values and there are 5 variables, there are 3^5 = 243 possible
assignments.
Eg- Map coloring: The regions of a map (variables) can be assigned a color (values) from a finite set like {Red, Green,
Blue}.
B. Infinite Domains- Some CSPs deal with variables that can take on values from an infinite set, such as all integers or
strings. It requires constraint language to express relationships, which is hard to enumerate.
● Eg Job Scheduling: The variables might be the start and end days of jobs, which can take on any integer value
representing a day. For instance, a constraint might be "Job 1 must start at least 5 days before Job 3.”
Continuous Variables- These variables can take on any value within a continuous range (e.g., real numbers).
● Eg- In scheduling tasks like Hubble Telescope observations, the start and end times might be represented as continuous
variables because time can be infinitely subdivided (e.g., down to seconds, milliseconds, etc.).
Unary constraints involve a single variable, e.g., SA != green.
Binary constraints involve pairs of variables, e.g., SA != WA.
Preference (soft) constraints are often representable by a cost for each variable assignment -> constrained optimization problems.
(Trace: rows of partial map-coloring assignments, in which one choice leads to an ERROR and backtracking substitutes an alternative value.)
Example: Cryptarithmetic
Cryptarithmetic is a type of constraint satisfaction problem in which each letter and symbol is associated with a unique
digit.
Puzzle: TO + GO = OUT. The solved sum is 21 + 81 = 102.

  T O       2 1
+ G O     + G 1
-------   -------
O U T     1 U 2

Constraints: O must be 1 (the carry out of a two-digit sum), and the units column gives O + O = T, so T = 2.
The tens column then gives G + 2 = U + 10, so G = 9 or G = 8:
G = 9 -> U = 1 (rejected, since O = 1 already)
G = 8 -> U = 0
Ans: O=1, T=2, G=8, U=0
SEND+MORE=MONEY
  S E N D         9 5 6 7
+ M O R E       + 1 0 8 5
-----------     -----------
M O N E Y       1 0 6 5 2

1. M = 1: the carry out of the leftmost column of a sum of two four-digit numbers can be at most 1.
2. S + M (+ carry) must produce that carry into the fifth column; with M = 1 this forces S = 9 and O = 0.
3. Since E ≠ N, the hundreds column must receive a carry of 1.
4. If O = 0, there must be a carry of 1, so 1 + E + 0 = N …(1)
Also N + R (+1) = E + 10 …(2)
From (1) and (2): N + R (+1) = N − 1 + 10, so R (+1) = 9.
R ≠ 9 since S = 9; therefore R = 8 (with a carry of 1 coming from the units column).
5. D + E should be such that it generates a carry, and D + E ≥ 12 (since Y ≠ 0, 1).
Assume Y = 2, so D + E = 12; D ≠ 8, 9, so assume D = 7, which gives E = 5.
6. Check the tens column: N + 8 + 1 = 15, so N = 6 (consistent with N = E + 1 from (1)).

Letter codes: S = 9, E = 5, N = 6, D = 7, M = 1, O = 0, R = 8, Y = 2.
Variables: A, B, C (regions)
Domains: {Red, Green, Blue} (possible colors for each region)
Constraints: A ≠ B, B ≠ C, A ≠ C (no two adjacent regions should have the same color)
DFS Algo
Step 1: A = Red
Step 2: B = Red (No constraint check yet, DFS continues)
Step 3: C = Red (Again, no constraint check)
At this point, DFS has reached a complete assignment: A = Red, B = Red, C = Red. But this violates the constraints (A ≠ B, B ≠ C, A ≠ C). <Backtrack>
Backtrack to C: C = Green, Still, A = Red, B = Red, C = Green violates the constraint A ≠ B.<Backtrack>
Backtrack again, now try a different value for B.
Exploring All Possible Assignments:
Eventually, after a lot of unnecessary exploration, the naive DFS will find valid solutions:
● A = Red, B = Green, C = Blue
● A = Green, B = Red, C = Blue
But notice how it wastes a lot of time trying invalid combinations and continues exploring even when constraints are violated early on. This is
inefficient
Depth-first search for CSPs with single-variable assignments is called backtracking search
The term backtracking search is used for a depth-first search that chooses values for one variable at a time and backtracks
when a variable has no legal values left to assign.
It repeatedly chooses an unassigned variable and then tries all values in the domain of that variable in turn, trying to find a
solution. If an inconsistency is detected, it returns failure, causing the previous call to try another value.
function RECURSIVE-BACKTRACKING(assignment, csp) returns solution or failure # explore possible assignments of values to variables in a recursive manner, checking for consistency at each step.
  if assignment is complete then return assignment
  var ← SELECT-UNASSIGNED-VARIABLE(VARIABLES[csp], assignment, csp)
  for each value in ORDER-DOMAIN-VALUES(var, assignment, csp) do # the Order-Domain-Values function generates the sequence of values to try.
    if value is consistent with assignment given CONSTRAINTS[csp] then
      add {var = value} to assignment
      result ← RECURSIVE-BACKTRACKING(assignment, csp)
      if result ≠ failure then return result
      remove {var = value} from assignment # If no valid assignment is found after trying all possible values for the selected variable, the algorithm removes the most recent variable-value assignment (undoing the last step) and backtracks to try a different value or variable.
  return failure
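A runnable Python version of the same idea, specialized with a hypothetical consistent() predicate; the map-coloring call in the comments matches the A/B/C example above.

def backtracking_search(variables, domains, consistent, assignment=None):
    """Recursive backtracking for a CSP. domains is {var: [values]};
    consistent(var, value, assignment) checks the constraints."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return assignment                       # complete, consistent assignment
    var = next(v for v in variables if v not in assignment)   # select unassigned variable
    for value in domains[var]:                  # order-domain-values (as listed)
        if consistent(var, value, assignment):
            assignment[var] = value
            result = backtracking_search(variables, domains, consistent, assignment)
            if result is not None:
                return result
            del assignment[var]                 # undo and try the next value
    return None

# Map coloring from the example above (A != B, B != C, A != C):
# neq = lambda var, val, asg: all(asg.get(o) != val for o in {"A", "B", "C"} - {var})
# backtracking_search(["A", "B", "C"],
#                     {v: ["Red", "Green", "Blue"] for v in "ABC"}, neq)
# -> {'A': 'Red', 'B': 'Green', 'C': 'Blue'}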
● One way to make better use of constraints during search is called forward checking. Whenever a variable X is assigned,
the forward checking process looks at each unassigned variable Y that is connected to X by a constraint and
deletes from Y ’s domain any value that is inconsistent with the value chosen for X.
● By detecting potential conflicts early, forward checking avoids exploring paths that would eventually fail, reducing
unnecessary search.
In CSPs there is a choice: an algorithm can search (choose a new variable assignment from several possibilities)
or do a specific type of inference called constraint propagation: using the constraints to reduce the number
of legal values for a variable, which in turn can reduce the legal values for other variables, and so on.
Constraint propagation may be intertwined with search, or it may be done as a preprocessing step, before
search starts. Sometimes this preprocessing can solve the whole problem, so no search is required at all
● Arc Consistency: Ensures that for every value in the domain of a variable, there is some compatible value in the domain
of other variables it’s constrained with. If a value for a variable has no valid partner in another variable, that value is
removed from the domain.
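A minimal AC-3 sketch of this idea. The representation of domains as {var: list of values}, neighbors as {var: [constrained vars]}, and satisfies() as a binary-constraint check are illustrative assumptions.

from collections import deque

def ac3(domains, neighbors, satisfies):
    """AC-3 arc consistency: revise each arc (X, Y), removing values of X
    with no compatible value in Y, and re-queue arcs into X when X changes."""
    queue = deque((x, y) for x in domains for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        revised = False
        for vx in list(domains[x]):
            # delete vx if no value in Y's domain is compatible with it
            if not any(satisfies(x, vx, y, vy) for vy in domains[y]):
                domains[x].remove(vx)
                revised = True
        if revised:
            if not domains[x]:
                return False               # a domain emptied: inconsistent CSP
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))   # x changed, so recheck arcs into x
    return True                            # every arc is now consistent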