Topic 01 Searching

Problem Solving by Searching
COURSE TEACHER: ANITA ALI

1
Motivation
◦ Searching is the core component of nearly all intelligent systems.
◦ It forms a conceptual backbone of almost any approach to the systematic
exploration of alternatives.
◦ Natural Language Understanding - search for the best interpretation of a text
◦ Learning - search for the best justification of an experience
◦ Planning - search for a series of decisions that best achieves a goal while meeting a certain set of conditions.

2
State Space Search – Toy Problems

3
State Space Search - Real World Problems

•Route finding problem
•VLSI Layout


4
State Space Search - Real World Problems

•Robot Navigation
•Automatic assembly sequencing


5
State Space Search - Real World Problems

•Travelling salesman problem


6
Searching
• The process of computing a sequence of actions (a solution) that leads to the goal state.

7
Problem Formulation
•Given a goal, deciding on which states and actions to consider.
•Huge influence on finding a solution for the problem.
• A problem is defined in terms of its five components:
1. Initial state : the state the agent starts in
2. Actions:
◦ Set of actions applicable in each state

8
Problem Formulation
3. Transition Model
◦ It specifies the successor of a given state when an action is applied to it.

State Space ≡ (initial state, actions, transition model)
◦ The set of all states reachable from the initial state using any action sequence.

4. Goal Test
◦ A set of goal states.
9
Problem Formulation
5. Path cost function
◦ A function that assigns a number to each path, indicating how good that path is.
◦ The step cost can be the same for all actions.
◦ The step cost can differ between actions.
◦ A solution with the lowest path cost is regarded as optimal.
(A minimal code sketch of the five components follows this slide.)

10
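To make the five components concrete, here is a minimal Python sketch (not from the original slides); the Problem class and its method names are illustrative assumptions, not a standard API.

```python
# A minimal sketch of the five problem components described above.
# The class/method names are illustrative, not a standard API.

class Problem:
    def __init__(self, initial_state, goal_states):
        self.initial_state = initial_state      # 1. Initial state
        self.goal_states = set(goal_states)     # data for 4. Goal test

    def actions(self, state):
        """2. Actions applicable in the given state."""
        raise NotImplementedError

    def result(self, state, action):
        """3. Transition model: the successor of `state` when `action` is applied."""
        raise NotImplementedError

    def goal_test(self, state):
        """4. Goal test: is this state one of the goal states?"""
        return state in self.goal_states

    def step_cost(self, state, action, next_state):
        """5. Path cost is the sum of step costs; uniform step cost 1 by default."""
        return 1
```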
Home Activity

11
What is a Node?
• An abstract representation of a state.
• What is stored in a Node?
  • Parent
  • Current state (here, in the form of a 2D array)
  • Actions
  • Successors
  • Step cost to each successor
  • Path cost till the current state = path cost of the parent node + step cost of the last action
(See the sketch after this slide.)

12
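A possible Python sketch of the node bookkeeping listed above, building on the hypothetical Problem class sketched earlier; the field and method names are assumptions of this example.

```python
# Sketch of a search-tree node holding the fields listed on the slide.

class Node:
    def __init__(self, state, parent=None, action=None, step_cost=0):
        self.state = state        # current state (e.g., a 2D array for a puzzle)
        self.parent = parent      # parent node (None for the root)
        self.action = action      # action that produced this node
        # path cost till current state = path cost of parent + step cost
        self.path_cost = (parent.path_cost if parent else 0) + step_cost

    def expand(self, problem):
        """Successors: apply each applicable action via the transition model."""
        children = []
        for a in problem.actions(self.state):
            s2 = problem.result(self.state, a)
            children.append(Node(s2, self, a, problem.step_cost(self.state, a, s2)))
        return children

    def path(self):
        """Sequence of states from the root to this node (follow parents)."""
        node, states = self, []
        while node:
            states.append(node.state)
            node = node.parent
        return list(reversed(states))
```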
Some Terms
◦ Branching Factor (b)
◦ Average number of successors of any node, or
◦ Maximum number of successors of any node

13
Some Terms
◦ m
◦ The maximum depth of the state space (i.e., the maximum length of any path; could be infinite)

14
Some Terms
Depth d
◦ The path length from the root to the shallowest goal node

15
Some Terms
Frontier/Fringe
◦ Set of all leaf nodes that are available for expansion.

Explored Set
◦ Set of already expanded nodes.
◦ Also called the closed list.

16
Performance Measures of Search
Algorithms
1. Completeness
Is the algorithm able to find a solution, if one exists?

17
Performance Measures of Search
Algorithms
2. Optimality
Is the algorithm able to provide an optimal solution?

18
Performance Measures of Search
Algorithms
3. Time Complexity
◦ How long does it take to find a solution?
◦ It is proportional to the number of nodes generated before reaching the solution.

19
Performance Measures of Search
Algorithms
4. Space Complexity
◦ How much memory does it need to perform the search?
◦ It is proportional to the number of nodes stored in memory concurrently.

20
Types of Search Algorithm
1. Uninformed Search (Blind Search / Brute Force Search)

2. Informed Search (Heuristic Search), e.g., Best First Search

21
Note: Blind doesn’t imply unsystematic
1. Uninformed Search
•All non-goal nodes in frontier look equally good
•Nothing is known about states except what is specified in the problem
•Can only differentiate a goal state from a non-goal state
•Cannot tell whether a non-goal state is better than another non-goal
state in reaching the goal
•Traverses state space blindly in the hope of somehow hitting the goal
state
• e.g. BFS, DFS, Iterative Deepening Search, Uniform Cost Search, Bidirectional Search, Depth Limited Search, etc.
22
Breadth First Search
•Expand the shallowest node first
• Try all nodes at a given level before proceeding to the next level
•Nodes at a level can be tried in any order
•Implementation: Frontier is stored in a FIFO Queue
• Put successors at the rear of the Queue
•Any new path to a state s already in the frontier or explored set is discarded, because it must be at least as deep as the already discovered path to s
⇒ BFS always has the shallowest path to every node on the frontier (a minimal code sketch follows this slide)

23
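A minimal sketch of BFS as described above (FIFO frontier, repeated states discarded, goal test applied at generation); it assumes the hypothetical Node/Problem classes from the earlier sketches and hashable states.

```python
from collections import deque

def breadth_first_search(problem):
    """BFS: expand the shallowest node first; the frontier is a FIFO queue."""
    node = Node(problem.initial_state)
    if problem.goal_test(node.state):          # goal test applied at generation
        return node
    frontier = deque([node])                   # FIFO queue
    reached = {node.state}                     # states already in frontier/explored
    while frontier:
        node = frontier.popleft()              # shallowest unexpanded node
        for child in node.expand(problem):
            if child.state not in reached:     # discard repeated paths to a state
                if problem.goal_test(child.state):
                    return child               # recover the path via child.path()
                reached.add(child.state)
                frontier.append(child)
    return None                                # no solution found
```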
Example:
Determine path from S to G, returned by BFS on given graph & discuss performance measures.

[Graph figure: nodes S, A, B, C, D, E, F, G with edges; S is the start node and G the goal]

25
Solution
Step F Visited Remarks

26
Activity: Find path from S to G using BFS

29
Performance Measures
1. Completeness
◦ For finite b, BFS is complete (as it ultimately reaches the goal state)

31
Performance Measures
2. Optimality
◦ BFS is optimal (as it finds the shallowest goal node), provided all step costs are equal/uniform

32
Performance Measures:
3. Time Complexity

33
3. Time Complexity

Then the total number of nodes generated is
b + b^2 + b^3 + b^4 + … + b^d

Time Complexity = ?

Note: Assuming that the goal test is applied when nodes are generated
34
3. Time Complexity

Then the total number of nodes generated is
b + b^2 + b^3 + b^4 + … + b^d = b(b^d − 1) / (b − 1)

Time Complexity = O(b^d)

Note: Assuming that the goal test is applied when nodes are generated
35
3. Time Complexity
If the goal test is applied to nodes when they are selected for expansion, then the whole layer of nodes at depth d would be expanded before the goal was detected, and the time complexity would be O(b^(d+1)).

37
Example:
• Consider a machine translation model.

• Input: the French sentence “Jane visite l’Afrique en septembre”

• The following are several possible translations to English:

• “Jane is visiting Africa in September” - Best / most likely translation

• “Jane is going to be visiting Africa in September” - Not a bad translation

• “Jane is going to visit Africa in September” - Not a bad translation

38
Performance Measures:
4. Space Complexity
• Every node generated remains in memory

• So the space complexity is the same as the time complexity

• Space Complexity: O(b^d) – goal test is conducted when a node is generated

• Space Complexity: O(b^(d+1)) – goal test is conducted when a node is expanded

40
BFS: Time & Memory Requirement

Assumptions: Branching factor (b) = 10; nodes generated per second = 1 million; 1 node occupies 1000 bytes

Depth | Nodes  | Time      | Memory
2     | 110    | 0.11 msec | 107 KB
4     | 11,110 | 11 msec   | 10.6 MB
6     | ≈10^6  | 1.1 sec   | 1 GB
8     | ≈10^8  | 2 minutes | 103 GB
10    | ≈10^10 | 3 hours   | 10 TB
12    | ≈10^12 | 13 days   | 1 petabyte

1. Memory is a bigger problem for BFS than its execution time
42
BFS: Time & Memory Requirement

Assumptions: Branching factor (b) = 10; nodes generated per second = 1 million; 1 node occupies 1000 bytes

Depth | Nodes  | Time      | Memory
2     | 110    | 0.11 msec | 107 KB
4     | 11,110 | 11 msec   | 10.6 MB
6     | 10^6   | 1.1 sec   | 1 GB
8     | 10^8   | 2 minutes | 103 GB
10    | 10^10  | 3 hours   | 10 TB
12    | 10^12  | 13 days   | 1 petabyte
14    | 10^14  | 3.5 years | 99 petabytes
16    | 10^16  | 350 years | 10 exabytes

2. Time is also a major issue for large problems
44
When BFS is appropriate?
◦ Space is not a problem

◦ It's necessary to find the solution with the fewest arcs

◦ Although all solutions may not be shallow, at least some are

45
When BFS is NOT appropriate?
◦ Space is limited

◦ All solutions tend to be located deep in the tree

◦ Branching factor is very large

46
Depth First Search
• Start from the root node and explore as far as possible along each branch before backtracking.

• Implementation: Frontier is stored in a LIFO queue (i.e., a stack)

• For the farthest goal node, it creates the same set of nodes as BFS, only in a different order (a minimal code sketch follows this slide)

47
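A minimal sketch of DFS with a LIFO stack, assuming the earlier hypothetical Node/Problem classes; note that, as the slides warn later, this plain version does no cycle checking.

```python
def depth_first_search(problem):
    """DFS: the frontier is a LIFO stack; go deep along a branch, then backtrack."""
    frontier = [Node(problem.initial_state)]           # LIFO stack
    while frontier:
        node = frontier.pop()                          # deepest node on the frontier
        if problem.goal_test(node.state):
            return node
        # reversed() so the leftmost successor is popped (explored) first
        for child in reversed(node.expand(problem)):
            frontier.append(child)
    return None   # note: no cycle checking, so this can loop on loopy state spaces
```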
Example:
Determine path from S to I, returned by DFS on given graph & discuss performance measures. Explore arcs
from left to right.
[Tree figure: root S; children A, B, C; next level D, E, F, G; leaves H, I, J]

49
Activity: Find path from S to G using DFS

53
Performance Measures
1. Completeness

• DFS is complete if the depth is finite in every direction.

• DFS is incomplete for problems with infinite depth.

56
Performance Measures
2. Optimality

• DFS is not optimal , as it may

return the deepest goal node

ignoring the shallowest goal

node.

58
Performance Measures:
3. Time Complexity

• In the worst case (farthest goal node, i.e., the rightmost leaf), it creates the same set of nodes as BFS, only in a different order.

• So, the time complexity remains the same as for BFS.

• Time Complexity: O(b^(d+1)) (goal test applied at the time of node expansion)

60
4. Space Complexity
◦ Path from A to M
◦ Explored nodes
◦ Frontier
◦ Unexplored nodes

Note: Explored nodes with no descendants in the frontier are removed from memory.

61
Depth First Search

62
Depth First Search

63
4. Space Complexity
• The maximum size of the frontier occurs when the goal is the deepest node in the last branch of the search tree (depth d).
• At each of the d levels, the (b−1) unexpanded siblings stay on the frontier.
• Nodes in Frontier ≈ (b−1)·d
• Space Complexity = O(b·d)

64
BFS v. DFS

Assumptions: Branching factor (b) = 10; nodes generated per second = 1 million; 1 node occupies 1000 bytes

Depth | Nodes | Time (DFS/BFS) | Memory (BFS) | Memory (DFS)
12    | 10^12 | 13 days        | 1 petabyte   |
16    | 10^16 | 350 years      | 10 exabytes  |

Note: Space cost is a big advantage of DFS over BFS
66


When DFS is appropriate?
• Space is restricted (complex state representation e.g., robotics)

• There are many solutions, perhaps with long path lengths,

particularly for the case in which all paths lead to a solution

68
When DFS is NOT appropriate?
• If there are loopy paths (cycles) in state space graph

• If you care about optimality

• If there are shallow solutions

69
Iterative Deepening Search- IDS
• Also called progressive deepening search

• The idea is to get the best of both BFS (optimality) and DFS (storage requirement, which is linear for DFS)

70
IDS Algorithm
1. L = 0

2. Apply DFS to maximum depth L. If a path to the goal is found, return it.

3. Else, increment L and go to Step 2. (A code sketch follows this slide.)

71
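A sketch of the IDS loop above as repeated depth-limited DFS, again assuming the earlier hypothetical Node/Problem classes; the depth cap is an assumption of this sketch.

```python
def depth_limited_search(problem, limit):
    """DFS that never expands nodes deeper than `limit`; returns a goal node or None."""
    frontier = [(Node(problem.initial_state), 0)]
    while frontier:
        node, depth = frontier.pop()
        if problem.goal_test(node.state):
            return node
        if depth < limit:
            for child in reversed(node.expand(problem)):
                frontier.append((child, depth + 1))
    return None

def iterative_deepening_search(problem, max_limit=50):
    """Step 1: L = 0; Step 2: DFS to max depth L; Step 3: else increment L and repeat."""
    for limit in range(max_limit + 1):   # max_limit is only a safety cap for this sketch
        result = depth_limited_search(problem, limit)
        if result is not None:
            return result
    return None
```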
IDS : Finding Path from A to M
[Tree figure: complete binary tree with root A; level 1: B, C; level 2: D, E, F, G; level 3: H, I, J, K, L, M, N, O]

72
IDS

73
IDS

74
IDS

75
IDS

76
IDS - Example
Determine path from S to G, returned by IDS on given graph & discuss performance measures. Explore arcs from left to
right.

[Graph figure: nodes S, A, B, C, D, E, F, G, H, I, J with edges; S is the start node and G the goal]

77
Performance Measures
1. Completeness

◦ For finite b, IDS is complete

◦ All the nodes are expanded at each level

83
Performance Measures
2. Optimality

◦ IDS is optimal (as it finds the shallowest goal node)

◦ Provided all step costs are equal/uniform

85
Proportional to the number of nodes generated

Performance Measures:
3. Time Complexity

Level 1* has b nodes and is generated d times; level 2 has b^2 nodes, generated (d−1) times; …; level d−1 has b^(d−1) nodes, generated twice; level d has b^d nodes, generated once.

Total nodes generated = d·b + (d−1)·b^2 + … + 2·b^(d−1) + 1·b^d = O(b^d)

* Level 0 is not included, as the root node is not generated; it is part of the problem definition.
86
Performance Measures:
3. Time Complexity
• Seems too wasteful
• Examines the same nodes over and over
• True for small state spaces
• But the percentage of extra (redundant) effort decreases as b increases
  • b = 3: ≈ 2.25·3^d nodes → 125% more nodes
  • b = 4: ≈ 1.78·4^d nodes → 78% more nodes
  • b = 5: ≈ 1.25·5^d nodes → 25% more nodes

89
Proportional to the number of nodes stored

Performance Measures:
4. Space Complexity

• Same as DFS: O(b·d)

90
Self-Checking Exercise

Alg. | Completeness | Optimality | Time Complexity | Space Complexity
BFS  | Yes          | Yes        | O(b^d)          | O(b^d)
DFS  | Yes          | No         | O(b^(d+1))      | O(b·d)
IDS  | Yes          | Yes        | O(b^d)          | O(b·d)

• For a large state space, which algorithm (BFS, DFS or IDS) should be preferred?

• Answer:

93
Self-Checking Exercise

Assumptions: Branching factor (b) = 10; d = 5

Alg. | Time Complexity | # Nodes generated
BFS  | O(b^d)          |
IDS  | O(b^d)          |

95
Self-Checking Exercise

Assumptions: Branching factor (b) = 10; d = 5

Alg. | Time Complexity | # Nodes generated
BFS  | O(b^d)          | 10 + 10^2 + 10^3 + 10^4 + 10^5 = 111,110
IDS  | O(b^d)          | 10·5 + 10^2·(5−1) + 10^3·(5−2) + 10^4·(5−3) + 10^5·(5−4) = 123,450

Note that 123,450 / 111,110 ≈ 1.11, i.e., IDS generates about 11% more nodes, but the difference decreases for a larger state space.

Note: For a large state space, IDS is preferred.
96


Homework
• Explore following algorithms

• Depth Limited Search

• Breadth Limited Search

97
Uniform Cost Search - UCS
• Maintains the frontier as a priority queue ordered by path cost g(n)
• Returns the least-cost path, i.e., the optimal one
• That is, it expands from the frontier the node having the least path cost g(n)
• Doesn’t use a Visited list, as that may result in overlooking the optimal path (see the sketch and example below)
• E.g.

98
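A sketch of UCS with a priority queue ordered by g(n), using Python's heapq and the earlier hypothetical Node/Problem classes. Instead of a strict Visited list, it keeps the best known cost per state, so a cheaper path found later can still replace a more expensive one; a tie-breaking counter is added because Node objects are not comparable.

```python
import heapq
import itertools

def uniform_cost_search(problem):
    """UCS: always expand the frontier node with the least path cost g(n)."""
    counter = itertools.count()                 # tie-breaker for equal costs
    start = Node(problem.initial_state)
    frontier = [(start.path_cost, next(counter), start)]   # priority queue on g(n)
    best_cost = {start.state: 0}                # cheapest cost found so far per state
    while frontier:
        g, _, node = heapq.heappop(frontier)
        if problem.goal_test(node.state):       # goal test at expansion => optimal
            return node
        if g > best_cost.get(node.state, float("inf")):
            continue                            # stale entry: a cheaper path was found
        for child in node.expand(problem):
            if child.path_cost < best_cost.get(child.state, float("inf")):
                best_cost[child.state] = child.path_cost
                heapq.heappush(frontier, (child.path_cost, next(counter), child))
    return None
```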
What if UCS is implemented using Visited
List?
[Graph figure: nodes S, B, G with edge costs 6, 2, 1, 3]

99
UCS- Example (Maintain a priority queue)
Determine path from S to G, returned by UCS on given graph & discuss performance measures.

[Graph figure: weighted graph over nodes S, A, B, C, D, E, F, G, H, I, J with labeled edge costs; S is the start node and G the goal]

101
Homework: Find path from S to G

102
What if all step costs (ε) are equal?
• UCS behaves the same as BFS

104
Performance Measures
1. Completeness

• UCS is complete, provided that every step cost ≥ ε > 0

• If step costs can be 0 or negative, it can get stuck in infinite loops.

• For example, with zero step costs it may keep adding nodes to the frontier and never terminate.

105
Performance Measures
2. Optimality

• Yes, it is (as it always expands the least-cost node from the priority queue).

106
For comparison, BFS: O(b^(d+1))

Performance Measures:
3. Time Complexity

• Uniform-cost search is guided by path costs rather than depths

• Let C* = cost of the optimal solution and ε = the (minimum) step cost

• C*/ε ≈ effective depth, so the time complexity is O(b^(1 + ⌊C*/ε⌋))

Note: It also needs some time to find the highest-priority node in the queue
107
For comparison, BFS: O(b^(d+1))

Performance Measures:
4. Space Complexity

• Same as its time complexity

109
Dynamic Programming Optimality Principle
• If the shortest path from start state S to goal G goes via some intermediate state A, then the paths from S to A and from A to G must also be shortest paths

• Implementation: If there are two different paths in the frontier leading to the same state, then discard the longer path (or a redundant path with the same path length)

• Gives efficient searching in dense graphs

[Figure: two paths from S to G through intermediate state A]
111
Searching for optimality
- Strict Expanded/ Closed List
• Instead of Visited, a list of expanded nodes is maintained

• An expanded state is never revisited

• All such algorithms are also said to use “Strict Expanded List” or

“Closed List”

• Efficient searching
112
UCS- Revisited (Use Strict Expanded List)

114
UCS- Solution
Step F (priority Queue) Expanded Remarks

115
Bidirectional Search

119
Motivation

◦ b^(d/2) + b^(d/2) << b^d

◦ e.g., 10^8 + 10^8 = 2·10^8 << 10^16

◦ Can use either BFS or UCS

◦ Hard to implement for implicit goals like “goal = checkmate” in chess

120
Informed Search
• Informed: some additional information (a heuristic), beyond the problem description, guides the search toward the paths that look more promising

• A heuristic is a piece of domain-specific knowledge that guides the search
121
Heuristic Function (h)
• g(n) : Actual path cost from the start node to node n
• h(n) : Estimated cost of the cheapest path from n to the goal

[Graph figure: nodes S, A, B, C, G with edge costs and heuristic values h(S)=7, h(B)=5, h(A)=2, h(C)=1, h(G)=0, illustrating g(C) vs. h(C)]

Note: The heuristic value of each node is computed from domain knowledge (it is given directly to you).
122
Best First Search - Greedy Algorithm
• Expands the node that appears to be closest to the goal.

• Implementation: Maintain a priority queue and pull off the node with the least heuristic value h(n). (A code sketch follows this slide.)

• Example: Consider the Travelling Salesman problem

124
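A sketch of greedy best-first search ordered by h(n) alone, assuming the earlier hypothetical Node/Problem classes and a user-supplied heuristic function h(state).

```python
import heapq
import itertools

def greedy_best_first_search(problem, h):
    """Greedy best-first: expand the node whose heuristic value h(n) is smallest."""
    counter = itertools.count()                 # tie-breaker for equal h values
    start = Node(problem.initial_state)
    frontier = [(h(start.state), next(counter), start)]    # priority = h(n) only
    visited = {start.state}
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.goal_test(node.state):
            return node
        for child in node.expand(problem):
            if child.state not in visited:
                visited.add(child.state)
                heapq.heappush(frontier, (h(child.state), next(counter), child))
    return None
```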
Travelling Salesman Problem

Straight-line distance can be a good heuristic (h_SLD)

Optimality? Does it guarantee an optimal solution?
125


Greedy Algorithm - Example
[Graph figure: weighted graph with nodes S, A, B, C, D, G; labeled edge costs and heuristic values h(S)=9, h(A)=2, h(B)=3, h(C)=1, h(D)=4, h(G)=0]
128
Homework : Greedy Algorithm

130
Performance Measures
1. Completeness

• Greedy Search is incomplete

132
Performance Measures
2. Optimality

• No – does not guarantee to provide optimal solution

134
Performance Measures
3. Time Complexity

• Like BFS and DFS, it may need to expand all the nodes

• Time Complexity: O(b^(d+1))

• It also needs some time to find the highest-priority node in the queue.

135
Performance Measures
4. Space Complexity

• Same as BFS

• Space Complexity: O(b^(d+1))

136
Admissibility of Heuristics
• A heuristic that never overestimates the cost to reach the goal is called admissible.

• Admissible heuristics are by nature optimistic, because they think the cost of solving the problem is less than it actually is.

138
Admissible Heuristic
◦ h(n) ≤ h*(n)

◦ h(n) → estimated cost to reach the goal from n

◦ h*(n) → true cost to reach the goal from n

◦ Hence:

◦ h(g) = 0 for any goal node g

◦ h(n) = ∞ if there is no path from n to a goal node

139
Admissible Heuristics – Some Examples
1. Number of misplaced tiles

Start:       Goal:
5 3 8        1 2 3
2 6 _        8 _ 4
7 4 1        7 6 5

At least 1 move would be needed to move each misplaced tile towards its goal position – no overestimation – admissible
140
Admissible Heuristics – Some Examples
2. Sum of Manhattan distances of all misplaced tiles

Start:       Goal:
5 3 8        1 2 3
2 6 _        8 _ 4
7 4 1        7 6 5

Taxicab / city-block / rectilinear distance between two points (x1, y1) and (x2, y2) is |x2 − x1| + |y2 − y1|
(A code sketch of both heuristics follows this slide.)
142
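A sketch of the two admissible 8-puzzle heuristics above; the 3×3 tuple representation, the blank encoded as 0, and the goal board shown on the slide are assumptions of this example.

```python
# 8-puzzle heuristics; a state is a 3x3 tuple of tuples, with 0 marking the blank.

GOAL = ((1, 2, 3),
        (8, 0, 4),
        (7, 6, 5))

def goal_position(tile, goal=GOAL):
    """Row/column of a tile in the goal board."""
    for r, row in enumerate(goal):
        for c, value in enumerate(row):
            if value == tile:
                return r, c

def misplaced_tiles(state, goal=GOAL):
    """h1: number of tiles (blank excluded) not in their goal position."""
    return sum(1 for r in range(3) for c in range(3)
               if state[r][c] != 0 and state[r][c] != goal[r][c])

def manhattan_distance(state, goal=GOAL):
    """h2: sum of |x2 - x1| + |y2 - y1| over all tiles (blank excluded)."""
    total = 0
    for r in range(3):
        for c in range(3):
            tile = state[r][c]
            if tile != 0:
                gr, gc = goal_position(tile, goal)
                total += abs(gr - r) + abs(gc - c)
    return total
```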
A* Search

146
g(n) : actual path cost from the start node to node n
h(n) : estimated cost of the cheapest path from n to the goal

A* Search (A star)
• The most widely-known form of best-first search in AI

• UCS finds the shortest path to every other node rather than focusing on the goal node

• A* uses a heuristic to estimate the remaining path length

f(n) = g(n) + h(n), where

• f(n) : estimated cost of the cheapest solution through n (estimated total path length)
• In fact, it is the best estimate of the total distance to the goal

• Implementation: Maintain a priority queue and pull off the node with the least f(n). (A code sketch follows this slide.)
147
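A sketch of A* ordered by f(n) = g(n) + h(n), assuming the earlier hypothetical Node/Problem classes and a heuristic h(state); it reuses the same best-cost bookkeeping as the UCS sketch.

```python
import heapq
import itertools

def a_star_search(problem, h):
    """A*: always expand the node with the least f(n) = g(n) + h(n)."""
    counter = itertools.count()                 # tie-breaker for equal f values
    start = Node(problem.initial_state)
    frontier = [(h(start.state), next(counter), start)]    # priority = f(n)
    best_g = {start.state: 0}                   # best known g(n) per state
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.goal_test(node.state):
            return node
        if node.path_cost > best_g.get(node.state, float("inf")):
            continue                            # stale entry: cheaper path already found
        for child in node.expand(problem):
            if child.path_cost < best_g.get(child.state, float("inf")):
                best_g[child.state] = child.path_cost
                f = child.path_cost + h(child.state)
                heapq.heappush(frontier, (f, next(counter), child))
    return None
```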
[Figure: comparison of the nodes expanded by UCS vs. A* on a small graph with states A, S, B, G]

148
A* with Strict Expanded List/Closed List
•Use of a Closed list saves us the redundant effort of expanding longer, non-optimal paths.

•However, when a Closed list is used with A*, admissibility of the heuristic alone does not guarantee optimality.

•Consistency of the heuristic is also essential for optimality.


149
Consistent Heuristic
• h is consistent if the heuristic function satisfies the triangle inequality for every node n and its child node n′:

  h(n) ≤ c(n, n′) + h(n′)

[Figure: triangle with nodes n, n′ and goal G; edges labeled c(n, n′), h(n′), h(n)]

• When h is consistent, the f values of nodes expanded by A* are never decreasing
• When A* selects n for expansion, it has already found the shortest path to it
• When h is consistent, every node is expanded only once

Consistency is sometimes also called monotonicity (a small checking sketch follows this slide)
152
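A small sketch of how the triangle-inequality condition could be checked over a given set of states, assuming the hypothetical Problem interface from the earlier sketches; illustrative only.

```python
def is_consistent(problem, h, states):
    """Check h(n) <= c(n, n') + h(n') for each state n in `states` and every successor n'."""
    for n in states:
        for a in problem.actions(n):
            n2 = problem.result(n, a)
            if h(n) > problem.step_cost(n, a, n2) + h(n2):
                return False        # triangle inequality violated at edge (n, n2)
    return True
```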


Consistent (Monotone) Heuristic
If h is consistent, we have
f(n′) = g(n′) + h(n′)
     = g(n) + c(n, a, n′) + h(n′)
     ≥ g(n) + h(n)
     = f(n)
i.e., f(n) is non-decreasing along any path

153
Example - A*
(a) Simulate A* to find the path from S to G with a closed list
(b) Repeat without a closed list

Heuristic values:
Node: S    A    B    C    G
h:    90   100  88   100  0

[Graph figure: nodes S, A, B, C, G with labeled edge costs (1, 2, 2, 1, 100)]
156
(a) With Closed List
Step F (priority Queue) Expanded Remarks

157
(b) Without Closed List
Step F (priority Queue) Remarks

159
Example- A*
(a) Simulate A* to find the path from S to G without a closed list
(b) Repeat with a closed list

[Graph figure: weighted graph with nodes S, A, B, C, D, G; labeled edge costs and heuristic values]
160
Homework: Apply A* on the given 8-puzzle, when:
1. h(n) = number of misplaced tiles
2. h(n) = sum of Manhattan distances
Step cost is 1

Initial State:   Final State:
2 8 3            1 2 3
1 6 4            8 _ 4
7 _ 5            7 6 5

163
1. Completeness
• Complete if all step costs are positive (≥ ε > 0) and the branching factor is finite
• Even if h is not admissible, A* is able to terminate with a solution path (though not necessarily the optimal one)
• Proof sketch:
  • The evaluation function f of expanded nodes must eventually increase (since paths get longer and more costly) until all the nodes on a solution path are expanded
• Note: A* is admissible (i.e., returns an optimal solution) if it uses an admissible heuristic.

168
2. Optimality
• Tree Search

• Admissible heuristic is needed

• Graph Search with reopening closed nodes

• Admissible heuristic is enough

• Graph Search without reopening closed nodes

• Consistent heuristic is needed

169
3. Space Complexity
• Exponential
• The space complexity of A∗ often makes it impractical to insist on finding an
optimal solution.
• One can use variants of A∗ that find suboptimal solutions quickly, or
• one can sometimes design heuristics that are more accurate but not strictly
admissible.
• In any case, the use of a good heuristic still provides enormous savings
compared to the use of an uninformed search.

170
4. Time Complexity
• Exponential unless heuristic is very accurate.

• Computation time is not the main drawback of A*

• Because it keeps all generated nodes in memory (as do all GRAPH-SEARCH algorithms),

• A∗ usually runs out of space long before it runs out of time.

• For this reason, A∗ is not practical for many large-scale problems.

• Recently developed algorithms have overcome the space problem without sacrificing
optimality or completeness, at a small cost in execution time.

171
The End

173
