Topic 01 Searching
COURSE TEACHER: ANITA ALI
Motivation
◦ Searching is the core component of nearly all intelligent systems.
◦ It forms a conceptual backbone of almost any approach to the systematic
exploration of alternatives.
◦ Natural Language Understanding - search for the best interpretation of a text
◦ Learning - search for the best justification of an experience
◦ Planning - search for a series of decisions that best achieves a goal while meeting a certain set of conditions.
State Space Search – Toy Problems
State Space Search - Real World Problems
Problem Formulation
•Given a goal, deciding on which states and actions to consider.
•Huge influence on finding a solution for the problem.
• A problem is defined in terms of its five components:
1. Initial state :
2. Actions:
◦ Set of actions applicable in each state
(Figure: an initial state with the applicable actions UP and Right)
Problem Formulation
3. Transition Model
4. Goal Test
5. Path Cost
Home Activity
What is a Node?
• An abstract representation of a state.
• What is stored in a Node?
  • Parent
  • Current state (in the form of 2D array here)
  • Actions
  • Successors
  • Step cost to each successor
  • Path cost till current state = path cost of parent node + step cost of current node
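One possible way to hold these fields in code is sketched below. The class name `Node`, the helpers `child_node` and `path_to`, and the choice to store only the action and step cost that produced the node (rather than every successor) are illustrative assumptions, not part of the slides.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    state: Any                        # current state, e.g. a 2D array for the 8-puzzle
    parent: Optional["Node"] = None   # node from which this one was generated
    action: Optional[str] = None      # action applied to the parent to get here
    step_cost: float = 0.0            # cost of the step parent -> this node
    path_cost: float = 0.0            # cost of the whole path root -> this node


def child_node(parent: Node, action: str, state: Any, step_cost: float) -> Node:
    # Path cost till current state = path cost of parent + step cost of current node
    return Node(state, parent, action, step_cost, parent.path_cost + step_cost)


def path_to(node: Optional[Node]) -> list:
    # Follow parent links back to the root to recover the sequence of states
    states = []
    while node is not None:
        states.append(node.state)
        node = node.parent
    return states[::-1]


root = Node("S")
a = child_node(root, "Right", "A", 2)
b = child_node(a, "UP", "B", 3)
print(path_to(b), b.path_cost)
```

Keeping the parent link is what lets a search algorithm return the whole path once a goal node is found.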
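Some Terms
◦ Branching Factor (b): maximum number of successors of any node
◦ m: maximum length of any path in the state space (may be infinite)
◦ Depth d: depth of the shallowest goal node
◦ Frontier/Fringe: set of generated nodes waiting for expansion
◦ Explored Set: set of nodes that have already been expanded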
Performance Measures of Search Algorithms
1. Completeness
Is the algorithm guaranteed to find a solution, if there exists one?
2. Optimality
Is the algorithm able to provide an optimal solution?
3. Time Complexity
How long does the algorithm take to find the solution?
4. Space Complexity
How much memory does the algorithm need?
Types of Search Algorithm
1. Uninformed Search (Blind Search/ Brute Force Search)
1. Uninformed Search
Note: Blind doesn’t imply unsystematic
•All non-goal nodes in frontier look equally good
•Nothing is known about states except what is specified in the problem
•Can only differentiate a goal state from a non-goal state
•Cannot tell whether a non-goal state is better than another non-goal
state in reaching the goal
•Traverses state space blindly in the hope of somehow hitting the goal
state
• e.g. BFS, DFS, Iterative Deepening Search, Uniform Cost Search, Bidirectional Search, Depth Limited Search, etc.
Breadth First Search
•Expand the shallowest node first
• Try all nodes at a given level, before proceeding to the next level
•Nodes can be tried in any direction
•Implementation : Frontier is stored in a FIFO Queue
• Put successors at the rear of Queue
•Any new path to a state s already in the frontier or explored set is discarded, because it must be at least as deep as the already discovered path to s
⇒ It always has the shallowest path to every node on the frontier
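The points above can be sketched in a few lines of Python. The graph `EXAMPLE_GRAPH` and its edges are hypothetical (the edges of the lecture's graph are not given in the text), and the goal test is assumed to be applied when a node is generated.

```python
from collections import deque

# Hypothetical directed graph: state -> list of successors (left to right)
EXAMPLE_GRAPH = {
    "S": ["A", "B"], "A": ["C", "D"], "B": ["D", "E"],
    "C": [], "D": ["F"], "E": ["G"], "F": ["G"], "G": [],
}


def bfs(graph, start, goal):
    if start == goal:
        return [start]
    frontier = deque([start])   # FIFO queue: shallowest node is expanded first
    parent = {start: None}      # every state already in the frontier or explored
    while frontier:
        state = frontier.popleft()
        for succ in graph.get(state, []):
            if succ in parent:  # a new path to a known state is discarded
                continue
            parent[succ] = state
            if succ == goal:    # goal test applied on generation
                path = [succ]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            frontier.append(succ)   # successors go to the rear of the queue
    return None


print(bfs(EXAMPLE_GRAPH, "S", "G"))
```

On this graph the call returns `['S', 'B', 'E', 'G']`, a path with the fewest possible edges.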
Example:
Determine path from S to G, returned by BFS on given graph & discuss performance measures.
(Figure: graph with nodes S, A, B, C, D, E, F, G; edges not recoverable)
Solution
Step F Visited Remarks
Activity: Find path from S to G using BFS
Performance Measures
1. Completeness
◦ For finite b, BFS is complete (the shallowest goal lies at some finite depth d, so BFS ultimately reaches it)
Performance Measures
2. Optimality
◦ BFS is optimal when all step costs are equal (as it finds the shallowest goal node)
Performance Measures:
3. Time Complexity
Number of nodes generated $= b + b^2 + b^3 + b^4 + \dots + b^d = \frac{b(b^d - 1)}{b - 1}$
Time Complexity $= O(b^d)$
Note: Assuming that goal test is applied when nodes are generated
If goal test is applied to nodes when they are selected for expansion, then
Time Complexity: $O(b^{d+1})$
Example:
• Consider a Machine Language Translation Model.
Performance Measures:
4. Space Complexity
• Every node generated remains in memory, so space complexity is $O(b^d)$
BFS: Assumption:
Branching factor (b) = 10
When BFS is NOT appropriate?
◦ Space is limited
Depth First Search
• Start from the root node and explore as far as possible along each branch before backtracking.
• For the farthest goal node, it creates the same set of nodes as BFS, only in a
different order
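A minimal recursive sketch of this idea follows. The tree `EXAMPLE_TREE` is hypothetical (the lecture's figure is not reproduced in the text), successors are explored left to right, and states already on the current path are skipped so the search cannot loop.

```python
# Hypothetical tree: state -> list of successors (left to right)
EXAMPLE_TREE = {
    "S": ["A", "B", "C"], "A": ["D", "E"], "B": ["F"], "C": ["G"],
    "D": ["H"], "E": ["I"], "F": ["J"], "G": [], "H": [], "I": [], "J": [],
}


def dfs(graph, state, goal, path=None):
    path = (path or []) + [state]       # states on the current path
    if state == goal:
        return path
    for succ in graph.get(state, []):   # deepest unexplored branch first
        if succ in path:                # skip states already on this path (no loops)
            continue
        found = dfs(graph, succ, goal, path)
        if found is not None:
            return found
    return None                         # dead end: backtrack to the caller


print(dfs(EXAMPLE_TREE, "S", "I"))
```

Here the search goes down S, A, D, H, hits a dead end, backtracks to A, and then finds I via E, returning `['S', 'A', 'E', 'I']`.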
Example:
Determine path from S to I, returned by DFS on given graph & discuss performance measures. Explore arcs
from left to right.
(Figure: graph with nodes S, A, B, C, D, E, F, G, H, I, J; edges not recoverable)
Activity: Find path from S to G using DFS
Performance Measures
1. Completeness
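◦ Complete in a finite state space if repeated states are avoided, as it eventually explores every direction.
◦ Not complete if a path has infinite depth.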
Performance Measures
2. Optimality
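◦ DFS is not optimal: it returns the first goal node it reaches, which need not be the shallowest or cheapest goal node.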
Performance Measures:
3. Time Complexity
• In worst case (farthest node i.e. rightmost leaf), it creates the same set of nodes as BFS does, only in a different order.
4. Space Complexity
(Figure: DFS searching for the path from A to M; legend: explored nodes, frontier, unexplored nodes)
• DFS stores only the current path and the unexpanded siblings along it, so at depth d it needs $O(bd)$ memory
BFS v. DFS
Assumption:
Branching factor (b) = 10
# nodes generated /sec = 1 million
1 Node occupies = 1000 bytes
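Under these assumptions the comparison can be recomputed with a short script. This is only a sketch: it assumes BFS keeps every generated node in memory (levels 1 to d) and that DFS keeps roughly b·d nodes at depth d. The function names are illustrative.

```python
B = 10                      # branching factor
NODES_PER_SEC = 1_000_000   # nodes generated per second
BYTES_PER_NODE = 1000       # memory per node


def bfs_nodes(b, d):
    # Nodes generated by BFS down to depth d: b + b^2 + ... + b^d
    return sum(b ** i for i in range(1, d + 1))


def bfs_cost(d):
    # Returns (nodes, seconds, bytes) for BFS with the goal at depth d
    n = bfs_nodes(B, d)
    return n, n / NODES_PER_SEC, n * BYTES_PER_NODE


def dfs_memory_bytes(d):
    # Assumption: DFS stores about b * d nodes at depth d
    return B * d * BYTES_PER_NODE


for d in (2, 4, 6, 8, 10):
    nodes, seconds, bfs_bytes = bfs_cost(d)
    print(f"d={d:2d} nodes={nodes:>12,} time={seconds:>12.5f}s "
          f"BFS mem={bfs_bytes:>16,} B  DFS mem={dfs_memory_bytes(d):>8,} B")
```

The BFS memory column grows by a factor of about b per level, while the DFS column grows only linearly with d.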
When DFS is NOT appropriate?
• If there are loopy paths (cycles) in state space graph
Iterative Deepening Search- IDS
• Also called progressive deepening search
• The idea is to give the best of both BFS (optimality) and DFS (storage requirement is linear)
IDS Algorithm
1. L = 0
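Only the first step (L = 0) is given in the text. The sketch below assumes the usual continuation: run a depth-limited DFS with limit L, stop if the goal is found, otherwise increase L by one and repeat. The tree `EXAMPLE_TREE`, the function names and the `max_limit` safety cap are assumptions made for illustration.

```python
# Hypothetical binary tree with nodes A to O
EXAMPLE_TREE = {
    "A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
    "D": ["H", "I"], "E": ["J", "K"], "F": ["L", "M"], "G": ["N", "O"],
}


def depth_limited(graph, state, goal, limit, path=()):
    path = path + (state,)
    if state == goal:
        return list(path)
    if limit == 0:                       # depth limit reached: do not expand
        return None
    for succ in graph.get(state, []):
        if succ not in path:             # avoid loops along the current path
            found = depth_limited(graph, succ, goal, limit - 1, path)
            if found is not None:
                return found
    return None


def ids(graph, start, goal, max_limit=50):
    for limit in range(max_limit + 1):   # L = 0, 1, 2, ...
        found = depth_limited(graph, start, goal, limit)
        if found is not None:
            return found
    return None


print(ids(EXAMPLE_TREE, "A", "M"))
```

Each iteration restarts from the root, so memory use stays linear in the depth limit while the shallowest goal is still the first one found.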
IDS : Finding Path from A to M
(Figure: tree with nodes A to O, rooted at A, with H to O at the deepest level)
IDS - Example
Determine path from S to G, returned by IDS on given graph & discuss performance measures. Explore arcs from left to
right.
(Figure: graph containing nodes A, B, C, D, E, F, G, H, I, J; layout not recoverable)
Performance Measures
1. Completeness
Performance Measures
2. Optimality
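Performance Measures:
3. Time Complexity
Proportional to number of nodes generated
Level (depth limit) | Nodes generated in that iteration
1* | $b$
2 | $b + b^2$
3 | $b + b^2 + b^3$
: | :
d-2 | $b + b^2 + \dots + b^{d-2}$
d-1 | $b + b^2 + \dots + b^{d-1}$
d | $b + b^2 + \dots + b^d$
* Level 0 is not included, as root node is not to be generated, it is a part of problem definition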
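Adding up the iterations gives the total number of nodes IDS generates. The derivation below is a sketch in which the symbol $N_{IDS}$ is introduced only for illustration: nodes at depth 1 are generated in all d iterations, nodes at depth 2 in d-1 of them, and so on, down to the nodes at depth d, which are generated once.

```latex
N_{IDS} = d\,b + (d-1)\,b^2 + (d-2)\,b^3 + \dots + 2\,b^{d-1} + 1\,b^{d}
        = \sum_{i=1}^{d} (d - i + 1)\, b^{i}
        = O(b^{d})
```

The last term dominates, which is why the order of growth matches BFS even though shallow levels are regenerated.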
Performance Measures:
3. Time Complexity
• Seems too wasteful
• Examines same nodes over & over
• True for small state spaces
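• But, % of extra (redundant) effort decreases as b increases; for large d, IDS generates about $\left(\frac{b}{b-1}\right)^2 b^d$ nodes
• b=3: $2.25 \cdot 3^d$, 125% more nodes
• b=4: $1.78 \cdot 4^d$, 78% more nodes
• b=5: $1.56 \cdot 5^d$, 56% more nodes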
Performance Measures:
4. Space Complexity
Proportional to number of nodes stored
• Same as DFS : $O(bd)$
Alg. Completeness Optimality Time Complexity Space Complexity
• For large state space, which algorithm (BFS, DFS or IDS) should be preferred?
• Answer:
Self-Checking Exercise
Assumption:
Branching factor (b) = 10
d = 5
Alg. | Nodes generated | Time Complexity
BFS | $10 + 10^2 + 10^3 + 10^4 + 10^5 = 111110$ | $O(b^d)$
IDS | $10 \cdot 5 + 10^2 (5 - 1) + 10^3 (5 - 2) + 10^4 (5 - 3) + 10^5 (5 - 4) = 123450$ | $O(b^d)$
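The two counts in this exercise can be checked with a few lines of Python. The function names are illustrative; the formulas are the ones used in the exercise, with the root node not counted.

```python
def bfs_generated(b, d):
    # b + b^2 + ... + b^d
    return sum(b ** i for i in range(1, d + 1))


def ids_generated(b, d):
    # Nodes at depth i are regenerated in (d - i + 1) iterations
    return sum((d - i + 1) * b ** i for i in range(1, d + 1))


b, d = 10, 5
print("BFS:", bfs_generated(b, d))   # 111110
print("IDS:", ids_generated(b, d))   # 123450
print("IDS / BFS:", round(ids_generated(b, d) / bfs_generated(b, d), 3))
```

For b = 10 the repeated work of IDS adds only about 11% over BFS.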
Uniform Cost Search - UCS
• Maintains frontier as a priority queue ordered as per path costs
• Returns least-cost path i.e. the optimal one.
• i.e. It expands from the frontier the node having the least path cost g(n)
• Doesn’t make use of Visited list as it may result in overlooking the
optimal path
• E.g.
What if UCS is implemented using Visited List?
(Figure: graph with nodes S, B, G and edge costs 6, 2, 1, 3; layout not recoverable)
UCS- Example (Maintain a priority queue)
Determine path from S to G, returned by UCS on given graph & discuss performance measures.
(Figure: weighted graph with nodes S, A, B, C, D, E, F, G, H, I, J; the edge costs cannot be matched to edges from the extracted layout)
Homework: Find path from S to G
What if step costs (ε) are equal?
• Same as BFS
Performance Measures
1. Completeness
• With step costs of 0, for example, it may keep adding nodes to the frontier and get stuck, so completeness needs every step cost to be at least some ε > 0.
Performance Measures
2. Optimality
Performance Measures:
3. Time Complexity
• BFS : $O(b^{d+1})$
• ε = step cost
Note: It also needs some time to look for the highest priority node in Q
Performance Measures:
4. Space Complexity
• BFS : $O(b^{d+1})$
Dynamic Programming Optimality Principle
• If the shortest path from start state S to goal G goes via some intermediate state A, then the paths from S to A and from A to G must also be the shortest ones
• Implementation: If there are two different paths in the frontier leading to the same state, then discard the longer path, or the redundant path if both have the same length
Searching for optimality - Strict Expanded/ Closed List
• Instead of Visited, a list of expanded nodes is maintained
• All such algorithms are also said to use “Strict Expanded List” or
“Closed List”
• Efficient searching
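A minimal sketch of UCS with a strict expanded list is given below. The weighted graph `EXAMPLE_GRAPH` is hypothetical, and the frontier is a binary heap ordered by path cost g(n). A state is added to the expanded list only when it is popped for expansion, and the goal test is applied at that moment too.

```python
import heapq

# Hypothetical weighted graph: state -> list of (successor, step cost)
EXAMPLE_GRAPH = {
    "S": [("A", 1), ("B", 6)],
    "A": [("B", 3)],
    "B": [("G", 2)],
    "G": [],
}


def ucs(graph, start, goal):
    frontier = [(0, start, [start])]   # priority queue of (g, state, path)
    expanded = set()                   # strict expanded (closed) list
    while frontier:
        g, state, path = heapq.heappop(frontier)   # least path cost first
        if state == goal:              # goal test when selected for expansion
            return g, path
        if state in expanded:          # a cheaper path to it was expanded earlier
            continue
        expanded.add(state)
        for succ, cost in graph.get(state, []):
            if succ not in expanded:
                heapq.heappush(frontier, (g + cost, succ, path + [succ]))
    return None


print(ucs(EXAMPLE_GRAPH, "S", "G"))
```

On this graph the result is cost 6 via S, A, B, G. A visited list that marked B as soon as it was first generated (through S to B at cost 6) would have discarded the cheaper route through A and returned the cost-8 path S, B, G.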
UCS- Revisited (Use Strict Expanded List)
UCS- Solution
Step F (priority Queue) Expanded Remarks
Bidirectional Search
Motivation
Informed Search
• Informed: Some additional information (heuristics) other than the problem definition is used to guide the search
Heuristic Function (h)
• g(n) : Actual path cost from the start node to node n
• h(n) : Estimated cost of the cheapest path from n to the goal
(Figure: example graph with nodes S, A, B, C, G annotated with heuristic values and edge costs, with g(C) and h(C) marked)
Note: Heuristic value is computed for each node from domain knowledge (directly given to you)
Best First Search - Greedy Algorithm
• Expands the node that appears to be closest to the goal.
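As a rough sketch, greedy best-first search can be written with a priority queue ordered by h(n) alone. The graph `EXAMPLE_GRAPH` and the heuristic table `H` are hypothetical values chosen for illustration, not taken from the lecture's figures.

```python
import heapq

# Hypothetical weighted graph and heuristic values
EXAMPLE_GRAPH = {
    "S": [("A", 4), ("B", 1)],
    "A": [("G", 4)],
    "B": [("G", 2)],
    "G": [],
}
H = {"S": 3, "A": 1, "B": 2, "G": 0}


def greedy_best_first(graph, h, start, goal):
    frontier = [(h[start], start, [start])]   # ordered by h(n) only
    expanded = set()
    while frontier:
        _, state, path = heapq.heappop(frontier)   # node that looks closest to goal
        if state == goal:
            return path
        if state in expanded:
            continue
        expanded.add(state)
        for succ, _cost in graph.get(state, []):   # step costs are ignored
            if succ not in expanded:
                heapq.heappush(frontier, (h[succ], succ, path + [succ]))
    return None


print(greedy_best_first(EXAMPLE_GRAPH, H, "S", "G"))
```

The search picks A because h(A) is smallest and returns S, A, G with cost 8, although S, B, G costs only 3. Ignoring the path cost so far is what makes the greedy choice non-optimal in this case.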
Travelling Salesman Problem
Straight Line Distance can be a good heuristic ($h_{SLD}$)
Optimality?
(Figure: graph with nodes S, A, B, D, G annotated with heuristic values and edge costs; layout not recoverable)
Homework : Greedy Algorithm
Performance Measures
1. Completeness
Performance Measures
2. Optimality
Performance Measures
3. Time Complexity
• Like BFS & DFS, it may need to expand all the nodes
• It also needs some time to look for the highest priority node in Q.
Performance Measures
4. Space Complexity
• Same as BFS
Admissibility of Heuristics
• A heuristic that never overestimates the cost to reach the goal is called
admissible.
• Admissible heuristics are optimistic: they think the cost of solving the problem is no more than it actually is.
Admissible Heuristic
◦ h(n) ≤ h*(n), where h*(n) is the actual cost of the cheapest path from n to the goal
◦ Hence;
Admissible Heuristics – Some Examples
1. number of misplaced tiles
Start      Goal
5 3 8      1 2 3
2 6        8 4
7 4 1      7 6 5
At least 1 move would be needed to move a misplaced tile towards the goal state, so there is no overestimation: admissible
Admissible Heuristics – Some Examples
2. Sum of Manhattan Distances of all misplaced tiles
Start      Goal
5 3 8      1 2 3
2 6        8 4
7 4 1      7 6 5
h =
h =
Taxicab/ City-block/ rectilinear distance b/w 2 points (x1, y1) & (x2, y2) is |x2 – x1| + |y2 – y1|
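Both heuristics are easy to compute once a state encoding is fixed. The sketch below assumes a flat tuple read row by row, with 0 standing for the blank. The `START` state is a hypothetical example, and `GOAL` assumes the blank sits in the centre of the goal grid. The blank itself is not counted as a tile.

```python
def misplaced_tiles(state, goal):
    # Count tiles (not the blank, 0) that are not in their goal position
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)


def manhattan(state, goal, width=3):
    # Sum over tiles of |row - goal_row| + |col - goal_col|
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        gidx = goal.index(tile)
        total += abs(idx // width - gidx // width) + abs(idx % width - gidx % width)
    return total


# Hypothetical start state; goal assumed to have the blank in the centre
START = (2, 8, 3,
         1, 6, 4,
         7, 0, 5)
GOAL = (1, 2, 3,
        8, 0, 4,
        7, 6, 5)

print(misplaced_tiles(START, GOAL), manhattan(START, GOAL))
```

For this pair the misplaced-tiles value is 4 and the Manhattan value is 5. The Manhattan sum is never smaller than the misplaced-tiles count, because every misplaced tile is at least one move from its goal position.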
A* Search (A star)
g(n) : actual path cost from the start node to node n
h(n) : estimated cost of the cheapest path from n to the goal
• The most widely-known form of best-first search in AI
• UCS finds shortest path to every other node rather than focusing on the goal node
• f(n) = g(n) + h(n) : estimated cost of the cheapest solution through n (estimated total path length)
• In fact it is the best estimate of the total distance to the goal
• Implementation: Maintain priority queue and pull off node with least f(n)
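The implementation note above can be sketched as follows, with the frontier ordered by f(n) = g(n) + h(n). The graph `EXAMPLE_GRAPH` and the heuristic table `H` are hypothetical, and `H` was chosen to be consistent, which this closed-list version relies on to return an optimal path.

```python
import heapq

# Hypothetical weighted graph and a consistent heuristic
EXAMPLE_GRAPH = {
    "S": [("A", 4), ("B", 1)],
    "A": [("G", 4)],
    "B": [("G", 2)],
    "G": [],
}
H = {"S": 3, "A": 1, "B": 2, "G": 0}


def a_star(graph, h, start, goal):
    frontier = [(h[start], 0, start, [start])]   # (f, g, state, path)
    expanded = set()                              # closed list
    while frontier:
        f, g, state, path = heapq.heappop(frontier)   # node with least f(n)
        if state == goal:
            return g, path
        if state in expanded:
            continue
        expanded.add(state)
        for succ, cost in graph.get(state, []):
            if succ not in expanded:
                g2 = g + cost                      # actual path cost so far
                heapq.heappush(frontier, (g2 + h[succ], g2, succ, path + [succ]))
    return None


print(a_star(EXAMPLE_GRAPH, H, "S", "G"))
```

With these values A* returns cost 3 via S, B, G. A search ordered by h alone would be drawn towards A first, but adding g(n) makes the more expensive route through A less attractive.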
(Figure: UCS vs. A* on a graph with nodes A, S, B, G)
A* with Strict Expanded List/Closed List
•Use of Closed list saves us redundant effort of expanding longer,
non-optimal paths.
Example- A*
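(a) Simulate A* to find path from S to G with closed list
(b) Repeat without closed list
(Figure: graph with nodes S, A, B, C, G; the edge costs shown are 1, 100, 1, 2, 2, but the edge layout is not recoverable)
Node | S | A | B | C | G
h | 90 | 100 | 88 | 100 | 0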
(a) With Closed List
Step F (priority Queue) Expanded Remarks
(b) Without Closed List
Step F (priority Queue) Remarks
Example- A*
(a) Simulate A* to find path from S to G without closed list
(b) Repeat with closed list
(Figure: graph with nodes S, A, B, C, D, G annotated with heuristic values and edge costs; layout not recoverable)
Homework: Apply A* on given 8-puzzle
When:
1. h(n) = number of misplaced tiles
2. h(n) = sum of Manhattan distance
Step cost is 1
Start      Goal
2 8 3      1 2 3
1 6 4      8 4
7 5        7 6 5
1. Completeness
• Complete if every step cost is at least some ε > 0 and the branching factor is finite
• Even if h is not admissible, it is able to terminate with a solution path (though not necessarily the optimal one)
• Proof:
• The evaluation function f of nodes expanded must increase
eventually (since paths are longer and more costly) until all the
nodes on a solution path are expanded
• Note: A* is admissible if it uses admissible heuristics.
2. Optimality
• Tree Search
3. Space Complexity
• Exponential
• The space complexity of A∗ often makes it impractical to insist on finding an
optimal solution.
• One can use variants of A∗ that find suboptimal solutions quickly, or
• one can sometimes design heuristics that are more accurate but not strictly
admissible.
• In any case, the use of a good heuristic still provides enormous savings
compared to the use of an uninformed search.
4. Time Complexity
• Exponential unless heuristic is very accurate.
• Because it keeps all generated nodes in memory (as do all GRAPH-SEARCH algorithms), A* usually runs out of space long before it runs out of time.
• Recently developed algorithms have overcome the space problem without sacrificing optimality or completeness, at a small cost in execution time.
The End