Parallel Distributed Computing Unit-5

The document discusses search algorithms for discrete optimization problems. It begins with an overview of discrete optimization problems and how they can be formulated as finding minimum-cost paths in graphs. It then discusses sequential search algorithms such as depth-first search, and parallel search algorithms as a way to speed up solving these computationally expensive problems.

Search Algorithms for Discrete Optimization

Problems
Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

To accompany the text “Introduction to Parallel Computing”,


Addison Wesley, 2003.
Topic Overview

• Discrete Optimization – Basics

• Sequential Search Algorithms

• Parallel Depth-First Search

• Parallel Best-First Search

• Speedup Anomalies in Parallel Search Algorithms


Discrete Optimization – Basics

• Discrete optimization forms a class of computationally


expensive problems of significant theoretical and practical
interest.

• Search algorithms systematically search the space of possible


solutions subject to constraints.
Definitions

• A discrete optimization problem can be expressed as a tuple


(S, f ). The set S is a finite or countably infinite set of all solutions
that satisfy specified constraints.

• The function f is the cost function that maps each element in


set S onto the set of real numbers R.

• The objective of a DOP is to find a feasible solution xopt, such


that f (xopt) ≤ f (x) for all x ∈ S.

• A number of diverse problems such as VLSI layouts, robot


motion planning, test pattern generation, and facility location
can be formulated as DOPs.
Discrete Optimization: Example

• In the 0/1 integer-linear-programming problem, we are given


an m × n matrix A, an m × 1 vector b, and an n × 1 vector c.

• The objective is to determine an n × 1 vector x whose elements


can take on only the value 0 or 1.

• The vector must satisfy the constraint

Ax ≥ b

and the function


f (x) = cT x
must be minimized.
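As an illustration, the following brute-force sketch (a hypothetical helper, feasible only for tiny n since it enumerates all 2^n vectors) checks Ax ≥ b for every 0/1 vector and keeps the cheapest one:

```python
from itertools import product

def solve_01_ilp(A, b, c):
    """Brute-force 0/1 ILP: minimize c^T x subject to A x >= b, x in {0,1}^n."""
    n = len(c)
    best_x, best_cost = None, float("inf")
    for x in product((0, 1), repeat=n):
        # Check the constraint A x >= b row by row.
        if all(sum(A[i][j] * x[j] for j in range(n)) >= b[i]
               for i in range(len(b))):
            cost = sum(c[j] * x[j] for j in range(n))
            if cost < best_cost:
                best_x, best_cost = x, cost
    return best_x, best_cost

# Tiny made-up instance: pick the cheapest x with x1 + x2 >= 1 and x2 + x3 >= 1.
A = [[1, 1, 0], [0, 1, 1]]
b = [1, 1]
c = [5, 3, 4]
print(solve_01_ilp(A, b, c))  # -> ((0, 1, 0), 3)
```

Real 0/1 ILP solvers use branch-and-bound over this same state space rather than full enumeration.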
Discrete Optimization: Example

• The 8-puzzle problem consists of a 3 × 3 grid containing eight


tiles, numbered one through eight.

• One of the grid segments (called the “blank”) is empty. A tile


can be moved into the blank position from a position adjacent
to it, thus creating a blank in the tile’s original position.

• The goal is to move from a given initial position to the final


position in a minimum number of moves.
Discrete Optimization: Example
[Figure] An 8-puzzle problem instance: (a) initial configuration; (b) final configuration; and (c) a sequence of moves leading from the initial to the final configuration.
Discrete Optimization Basics

• The feasible space S is typically very large.

• For this reason, a DOP can be reformulated as the problem


of finding a minimum-cost path in a graph from a designated
initial node to one of several possible goal nodes.

• Each element x in S can be viewed as a path from the initial


node to one of the goal nodes.

• This graph is called a state space.


Discrete Optimization Basics

• Often, it is possible to estimate the cost to reach the goal state


from an intermediate state.

• This estimate, called a heuristic estimate, can be effective in


guiding search to the solution.

• If the estimate is guaranteed to be an underestimate, the


heuristic is called an admissible heuristic.

• Admissible heuristics have desirable properties in terms of


optimality of solution (as we shall see later).
Discrete Optimization: Example

An admissible heuristic for the 8-puzzle is as follows:

• Assume that each position in the 8-puzzle grid is represented as


a pair.

• The distance between positions (i, j) and (k, l) is defined as |i −


k| + |j − l|. This distance is called the Manhattan distance.

• The sum of the Manhattan distances between the initial and


final positions of all tiles is an admissible heuristic.
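A minimal sketch of this heuristic, with states represented as row-major 9-tuples and 0 standing for the blank (both representation choices are assumptions, not from the text):

```python
def manhattan_heuristic(state, goal):
    """Sum of Manhattan distances of tiles 1..8 from their goal positions.

    A state is a tuple of 9 entries (row-major 3x3 grid); 0 is the blank,
    which is excluded from the sum.
    """
    # Map each tile to its (row, col) in the goal configuration.
    goal_pos = {tile: (i // 3, i % 3) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        gi, gj = goal_pos[tile]
        total += abs(i // 3 - gi) + abs(i % 3 - gj)
    return total

goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
start = (5, 2, 0, 1, 8, 3, 4, 7, 6)   # an arbitrary illustrative configuration
print(manhattan_heuristic(goal, goal))   # -> 0
print(manhattan_heuristic(start, goal))  # -> 8
```

The heuristic is admissible because every move changes the position of exactly one tile by one grid step, so at least this many moves remain.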
Parallel Discrete Optimization: Motivation

• DOPs are generally NP-hard problems. Does parallelism really


help much?

• For many problems, the average-case runtime is polynomial.

• Often, we can find suboptimal solutions in polynomial time.

• Many problems have smaller state spaces but require real-time


solutions.

• For some other problems, an improvement in objective


function is highly desirable, irrespective of time.
Sequential Search Algorithms

• Is the search space a tree or a graph?

• The space of a 0/1 integer program is a tree, while that of an


8-puzzle is a graph.

• This has important implications for search since unfolding a


graph into a tree can have significant overheads.
Sequential Search Algorithms
[Figure] Two examples of unfolding a graph into a tree.


Depth-First Search Algorithms

• Applies to search spaces that are trees.

• DFS begins by expanding the initial node and generating its


successors. In each subsequent step, DFS expands one of the
most recently generated nodes.

• If a node has no remaining successors, DFS backtracks to the parent and explores an alternate child.

• Often, successors of a node are ordered based on their


likelihood of reaching a solution. This is called directed DFS.

• The main advantage of DFS is that its storage requirement is


linear in the depth of the state space being searched.
Depth-First Search Algorithms
[Figure] States resulting from the first three steps of depth-first search applied to an instance of the 8-puzzle.
DFS Algorithms: Simple Backtracking

• Simple backtracking performs DFS until it finds the first feasible


solution and terminates.

• Not guaranteed to find a minimum-cost solution.

• Uses no heuristic information to order the successors of an


expanded node.

• Ordered backtracking uses heuristics to order the successors of


an expanded node.
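Simple backtracking can be sketched as follows; the toy successor function is purely illustrative:

```python
def dfs_backtrack(node, is_goal, successors, path=None):
    """Simple backtracking: depth-first search that stops at the first
    feasible solution found (not necessarily a minimum-cost one)."""
    if path is None:
        path = [node]
    if is_goal(node):
        return path
    for child in successors(node):
        result = dfs_backtrack(child, is_goal, successors, path + [child])
        if result is not None:
            return result            # first feasible solution wins
    return None                      # subtree exhausted: backtrack

# Toy tree: nodes are integers; children of n are 2n and 2n+1, three levels deep.
succ = lambda n: [2 * n, 2 * n + 1] if n < 8 else []
print(dfs_backtrack(1, lambda n: n == 11, succ))  # -> [1, 2, 5, 11]
```

Ordered backtracking would simply sort the list returned by `successors` on a heuristic before the loop.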
Depth-First Branch-and-Bound (DFBB)

• A DFS technique in which, upon finding a solution, the algorithm updates the current best solution.

• DFBB does not explore paths that are guaranteed to lead to solutions worse than the current best solution.

• On termination, the current best solution is a globally optimal


solution.
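A minimal DFBB sketch on an explicit toy tree; the node names and edge costs are invented for illustration:

```python
def dfbb(node, successors, is_solution, cost=0, best=None):
    """Depth-first branch-and-bound: prune any path whose accumulated cost
    already meets or exceeds the cost of the current best solution."""
    if best is None:
        best = [float("inf"), None]       # [best cost, best solution node]
    if cost >= best[0]:
        return best                       # bound: this subtree cannot improve
    if is_solution(node):
        best[0], best[1] = cost, node     # update the current best solution
        return best
    for child, step_cost in successors(node):
        dfbb(child, successors, is_solution, cost + step_cost, best)
    return best

# Toy search: find the cheapest leaf of a small explicit tree.
tree = {"root": [("a", 3), ("b", 1)],
        "a":    [("a1", 1), ("a2", 5)],
        "b":    [("b1", 4), ("b2", 2)]}
succ = lambda n: tree.get(n, [])
is_leaf = lambda n: n not in tree
print(dfbb("root", succ, is_leaf))  # -> [3, 'b2']
```

On termination every pruned subtree was provably no better than the returned solution, which is why the result is globally optimal.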
Iterative Deepening Search

• Often, the solution may exist close to the root, but on an


alternate branch.

• Simple backtracking might explore a large space before


finding this.

• Iterative deepening sets a depth bound on the space it


searches (using DFS).

• If no solution is found, the bound is increased and the process


repeated.
Iterative Deepening A* (IDA*)

• Uses a bound on the cost of the path as opposed to the depth.

• IDA* defines a function for node x in the search space as l(x) =


g(x) + h(x). Here, g(x) is the cost of getting to the node and
h(x) is a heuristic estimate of the cost of getting from the node
to the solution.

• At each failed step, the cost bound is incremented to that


of the node that exceeded the prior cost bound by the least
amount.

• If the heuristic h is admissible, the solution found by IDA* is


optimal.
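The IDA* scheme above can be sketched as follows (the small weighted graph and heuristic table are illustrative assumptions; the sketch also assumes an acyclic space, since it does no duplicate detection):

```python
def ida_star(start, h, successors, is_goal):
    """Iterative deepening A*: DFS bounded by l(x) = g(x) + h(x); on failure
    the bound rises to the smallest l-value that exceeded it."""
    def search(node, g, bound, path):
        l = g + h(node)
        if l > bound:
            return l, None                 # report by how much we overflowed
        if is_goal(node):
            return l, path
        minimum = float("inf")
        for child, step in successors(node):
            t, found = search(child, g + step, bound, path + [child])
            if found is not None:
                return t, found
            minimum = min(minimum, t)
        return minimum, None

    bound = h(start)
    while True:
        t, found = search(start, 0, bound, [start])
        if found is not None:
            return found
        if t == float("inf"):
            return None                    # space exhausted: no solution
        bound = t                          # raise bound to the least overflow

graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 5)], "B": [("G", 1)]}
h = {"S": 4, "A": 5, "B": 1, "G": 0}.get  # admissible for this graph
succ = lambda n: graph.get(n, [])
print(ida_star("S", h, succ, lambda n: n == "G"))  # -> ['S', 'B', 'G']
```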
DFS Storage Requirements and Data Structures

• At each step of DFS, untried alternatives must be stored for


backtracking.

• If m is the amount of storage required to store a state, and d is


the maximum depth, then the total space requirement of the
DFS algorithm is O(md).

• The state-space tree searched by parallel DFS can be


efficiently represented as a stack.

• Memory requirement of the stack is linear in depth of tree.


DFS Storage Requirements and Data Structures
[Figure] Representing a DFS tree: (a) the DFS tree, in which successor nodes shown with dashed lines have already been explored; (b) the stack storing untried alternatives only; and (c) the stack storing untried alternatives along with their parent. The shaded blocks represent the parent state and the blocks to the right represent successor states that have not been explored.
Best-First Search (BFS) Algorithms

• BFS algorithms use a heuristic to guide search.

• The core data structure is a list, called Open list, that stores
unexplored nodes sorted on their heuristic estimates.

• The best node is selected from the list, expanded, and its off-
spring are inserted at the right position.

• If the heuristic is admissible, the BFS finds the optimal solution.


Best-First Search (BFS) Algorithms

• BFS of graphs must be slightly modified to account for multiple


paths to the same node.

• A closed list stores all the nodes that have been previously
seen.

• If a newly expanded node exists in the open or closed lists with


better heuristic value, the node is not inserted into the open list.
The A* Algorithm

• A BFS technique that uses admissible heuristics.

• Defines function l(x) for each node x as g(x) + h(x).

• Here, g(x) is the cost of getting to node x and h(x) is an


admissible heuristic estimate of getting from node x to the
solution.

• The open list is sorted on l(x).

The space requirement of BFS is exponential in depth!
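A compact A* sketch using a binary heap as the open list and a dict recording the best g(x) per node (the tiny graph and heuristic values are illustrative assumptions):

```python
import heapq

def a_star(start, h, successors, is_goal):
    """A* search: the open list is a priority queue ordered on
    l(x) = g(x) + h(x); `closed` records the best g(x) seen per node."""
    open_list = [(h(start), 0, start, [start])]
    closed = {}
    while open_list:
        l, g, node, path = heapq.heappop(open_list)
        if is_goal(node):
            return g, path
        if node in closed and closed[node] <= g:
            continue                      # a cheaper path was already expanded
        closed[node] = g
        for child, step in successors(node):
            g2 = g + step
            heapq.heappush(open_list, (g2 + h(child), g2, child, path + [child]))
    return None

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)]}
h = {"S": 4, "A": 3, "B": 1, "G": 0}.get  # admissible for this graph
print(a_star("S", h, lambda n: graph.get(n, []), lambda n: n == "G"))
# -> (4, ['S', 'A', 'B', 'G'])
```

Note that the open list can grow with the breadth of the search frontier, which is the exponential space cost mentioned above.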


Best-First Search: Example
[Figure] Applying best-first search to the 8-puzzle: (a) initial configuration; (b) final configuration; and (c) states resulting from the first four steps of best-first search. Each state is labeled with its h-value (that is, the Manhattan distance from the state to the final state).
Search Overhead Factor

• The amount of work done by serial and parallel formulations of


search algorithms is often different.

• Let W be the serial work and WP the parallel work. The search overhead factor s is defined as WP /W .

• An upper bound on the speedup is p × (W/WP ), that is, p/s.
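A quick numeric illustration of this bound (the work counts are made up):

```python
def speedup_bound(p, w_serial, w_parallel):
    """Upper bound on speedup: p / s, where s = W_P / W is the
    search overhead factor."""
    s = w_parallel / w_serial
    return p / s

# 8 processors that collectively expand 1.25x the serial node count:
print(speedup_bound(8, 100_000, 125_000))  # -> 6.4
```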


Parallel Depth-First Search

• How is the search space partitioned across processors?

• Different subtrees can be searched concurrently.

• However, subtrees can be very different in size.

• It is difficult to estimate the size of a subtree rooted at a node.

• Dynamic load balancing is required.


Parallel Depth-First Search
[Figure] The unstructured nature of tree search and the imbalance resulting from static partitioning.
Parallel Depth-First Search: Dynamic Load Balancing

• When a processor runs out of work, it gets more work from


another processor.

• This is done using work requests and responses in message


passing machines and locking and extracting work in shared
address space machines.

• When a processor reaches a final state, all processors terminate.

• Unexplored states can be conveniently stored as local stacks


at processors.

• The entire space is assigned to one processor to begin with.


Parallel Depth-First Search: Dynamic Load Balancing
[Figure] A generic scheme for dynamic load balancing: an active processor does a fixed amount of work and services any pending messages; when it finishes its available work it goes idle, repeatedly selecting a processor, requesting work from it, and servicing pending messages until it gets work (becoming active again) or receives a reject.


Parameters in Parallel DFS: Work Splitting

• Work is split by splitting the stack into two.

• Ideally, we do not want either of the split pieces to be small.

• Select nodes near the bottom of the stack (node splitting), or

• Select some nodes from each level (stack splitting).

• The second strategy generally yields a more even split of the


space.
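Stack splitting can be sketched as below, with the donor's stack modeled as a list of levels, each holding that level's untried alternatives (this representation is an assumption for illustration):

```python
def split_stack(stack):
    """Stack splitting: give away roughly half of the untried alternatives
    at *each* level of the donor's stack. `stack` is a list of levels,
    each level a list of untried nodes; it is mutated in place."""
    donated = []
    for level in stack:
        half = len(level) // 2
        donated.append(level[:half])      # nodes sent to the requester
        del level[:half]                  # nodes kept by the donor
    return donated

stack = [[2, 3, 4], [6, 7], [9]]
gift = split_stack(stack)
print(gift)   # -> [[2], [6], []]
print(stack)  # -> [[3, 4], [7], [9]]
```

Node splitting, by contrast, would take nodes only from the levels near the bottom of the stack, which tends to hand over a less balanced share of the space.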
Parameters in Parallel DFS: Work Splitting
[Figure] Splitting the DFS tree: the two subtrees along with their stack representations are shown in (a) and (b).
Load-Balancing Schemes

• Who do you request work from? Note that we would like to


distribute work requests evenly, in a global sense.

• Asynchronous round robin: Each processor maintains a counter


and makes requests in a round-robin fashion.

• Global round robin: The system maintains a global counter and


requests are made in a round-robin fashion, globally.

• Random polling: Request a randomly selected processor for


work.
Analyzing DFS

• We cannot compute the serial work W or the parallel time analytically. Instead, we quantify the total overhead To in terms of W to compute scalability.

• For dynamic load balancing, idling time is subsumed by


communication.

• We must quantify the total number of requests in the system.


Analyzing DFS: Assumptions

• Work at any processor can be partitioned into independent pieces as long as its size exceeds a threshold ε.

• A reasonable work-splitting mechanism is available.

• If work w at a processor is split into two parts ψw and (1 − ψ)w,


there exists an arbitrarily small constant α (0 < α ≤ 0.5), such
that ψw > αw and (1 − ψ)w > αw.

• The constant α sets a lower bound on the load imbalance from work splitting.
Analyzing DFS

• If processor Pi initially had work wi, then after a single request by processor Pj and a split, neither Pi nor Pj has more than (1 − α)wi work.

• For each load-balancing strategy, we define V (p) as the total number of work requests after which each processor has received at least one work request (note that V (p) ≥ p).

• Assume that the largest piece of work at any point is W .

• After V (p) requests, the maximum work remaining at any processor is less than (1 − α)W ; after 2V (p) requests, it is less than (1 − α)^2 W .

• After (log_{1/(1−α)}(W/ε)) V (p) requests, the maximum work remaining at any processor is below the threshold value ε.

• The total number of work requests is therefore O(V (p) log W ).


Analyzing DFS

If tcomm is the time required to communicate a piece of work, then the communication overhead To is given by

    To = tcomm V (p) log W        (1)

The corresponding efficiency E is given by

    E = 1/(1 + To /W )
      = 1/(1 + (tcomm V (p) log W )/W )
Analyzing DFS: V (P ) for Various Schemes

• Asynchronous Round Robin: V (p) = O(p2) in the worst case.

• Global Round Robin: V (p) = p.

• Random Polling: Worst case V (p) is unbounded. We do


average case analysis.
V (P ) for Random Polling

• Let F (i, p) represent a state in which i of the p processors have


been requested, and p − i have not.

• Let f (i, p) denote the average number of trials needed to change from state F (i, p) to F (p, p) (so that V (p) = f (0, p)). Then

    f (i, p) = (i/p)(1 + f (i, p)) + ((p − i)/p)(1 + f (i + 1, p)),

    ((p − i)/p) f (i, p) = 1 + ((p − i)/p) f (i + 1, p),

    f (i, p) = p/(p − i) + f (i + 1, p).
V (P ) for Random Polling

• We have:

    f (0, p) = p × Σ_{i=0}^{p−1} 1/(p − i)
             = p × Σ_{i=1}^{p} 1/i
             = p × Hp

• As p becomes large, Hp ≈ ln p (the natural logarithm of p). Thus, V (p) = O(p log p).
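The closed form p × Hp can be checked numerically against the recurrence using exact rational arithmetic:

```python
from fractions import Fraction

def f_recurrence(p):
    """Evaluate f(0, p) from the recurrence f(i, p) = p/(p - i) + f(i + 1, p),
    with f(p, p) = 0, using exact rationals (no floating-point error)."""
    total = Fraction(0)
    for i in range(p):
        total += Fraction(p, p - i)
    return total

p = 16
harmonic = sum(Fraction(1, i) for i in range(1, p + 1))  # Hp
print(f_recurrence(p) == p * harmonic)  # -> True
```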
Analysis of Load-Balancing Schemes

If tcomm = O(1), we have

    To = O(V (p) log W )        (2)

• Asynchronous Round Robin: Since V (p) = O(p^2), To = O(p^2 log W ). It follows that:

    W = O(p^2 log(p^2 log W ))
      = O(p^2 log p + p^2 log log W )
      = O(p^2 log p)
Analysis of Load-Balancing Schemes

• Global Round Robin: Since V (p) = O(p), To = O(p log W ). It follows that W = O(p log p).
However, there is contention here! The global counter must be incremented O(p log W ) times in O(W/p) time. From this, we have:

    W/p = O(p log W )        (3)

and hence W = O(p^2 log p).
The worse of these two expressions, W = O(p^2 log p), is the isoefficiency.
Analysis of Load-Balancing Schemes

• Random Polling: Since V (p) = O(p log p), To = O(p log p log W ). Therefore, W = O(p log^2 p).
Analysis of Load-Balancing Schemes: Conclusions

• Asynchronous round robin has poor performance because it


makes a large number of work requests.

• Global round robin has poor performance because of contention at the counter, even though it makes the fewest requests.

• Random polling strikes a desirable compromise.


Experimental Validation: Satisfiability Problem
[Figure] Speedups of parallel DFS using ARR, GRR, and RP load-balancing schemes.
Experimental Validation: Satisfiability Problem
[Figure] Number of work requests generated for RP and GRR and their expected values (O(p log^2 p) and O(p log p), respectively).
Experimental Validation: Satisfiability Problem
[Figure] Experimental isoefficiency curves for RP for different efficiencies (E = 0.64, 0.74, 0.85, and 0.90), plotting W against p log^2 p.


Termination Detection

• How do you know when everyone’s done?

• A number of algorithms have been proposed.


Dijkstra’s Token Termination Detection

• Assume that all processors are organized in a logical ring.

• Assume, for now, that work transfers can only happen from Pi to Pj if j > i.

• Processor P0 initiates a token on the ring when it goes idle.

• Each intermediate processor receives this token and forwards


it when it becomes idle.

• When the token reaches processor P0, all processors are done.
Dijkstra’s Token Termination Detection

Now, let us do away with the restriction on work transfers.

• When processor P0 goes idle, it colors itself green and initiates


a green token.

• If processor Pj sends work to processor Pi and j > i then


processor Pj becomes red.

• If processor Pi has the token and Pi is idle, it passes the token to Pi+1. If Pi is red, then the color of the token is set to red before it is sent to Pi+1. If Pi is green, the token is passed unchanged.

• After Pi passes the token to Pi+1, Pi becomes green.

• The algorithm terminates when processor P0 receives a green


token and is itself idle.
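A minimal single-sweep simulation of the token rules above, assuming all processors are already idle when the token circulates (a simplification of the real asynchronous protocol):

```python
def token_termination(p, transfers):
    """One sweep of Dijkstra's token algorithm on an idle ring of p
    processors. `transfers` is a list of (sender j, receiver i) work
    transfers since the last sweep; a send from Pj to Pi with j > i
    turns Pj red."""
    color = ["green"] * p
    for j, i in transfers:
        if j > i:
            color[j] = "red"          # work moved "backwards" on the ring
    token = "green"                   # P0 initiates a green token
    for k in range(p):                # token visits P0, P1, ..., P(p-1)
        if color[k] == "red":
            token = "red"             # a red processor taints the token
        color[k] = "green"            # processor turns green after passing it
    return token                      # green back at P0 -> safe to terminate

print(token_termination(4, []))        # -> 'green'
print(token_termination(4, [(3, 1)]))  # -> 'red'  (another sweep is needed)
```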
Tree-Based Termination Detection

• Associate weights with individual workpieces. Initially,


processor P0 has all the work and a weight of one.

• Whenever work is partitioned, the weight is split in half, and half is sent along with the work.

• When a processor finishes its work, it returns its weight to its parent.

• Termination is signaled when the weight at processor P0


becomes 1 again.

• Note that underflow and finite precision are important factors


associated with this scheme.
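The weight bookkeeping can be sketched with exact rationals, which sidesteps the underflow issue (the particular transfer sequence below is invented for illustration):

```python
from fractions import Fraction

def simulate_weights():
    """Weight-based termination: splits halve a weight, completions return
    it to the parent; exact rationals avoid the finite-precision and
    underflow problems noted above."""
    w = {0: Fraction(1)}                       # P0 starts with weight 1

    def split(src, dst):                       # src sends half its weight
        w[src] = w[src] / 2
        w[dst] = w.get(dst, Fraction(0)) + w[src]

    def finish(src, parent):                   # src is done: return weight
        w[parent] += w[src]
        w[src] = Fraction(0)

    split(0, 1)          # P0 sends half its work to P1
    split(1, 2)          # P1 sends half of its share to P2
    finish(2, 1)         # P2 completes, returns weight to its parent P1
    finish(1, 0)         # P1 completes, returns weight to P0
    return w[0]          # weight of 1 at P0 signals termination

print(simulate_weights() == 1)  # -> True
```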
Tree-Based Termination Detection
[Figure] Tree-based termination detection. Steps 1–6 illustrate the weights at various processors after each work transfer.
Parallel Formulations of Depth-First Branch-and-Bound

• Parallel formulations of depth-first branch-and-bound search


(DFBB) are similar to those of DFS.

• Each processor has a copy of the current best solution. This is


used as a local bound.

• If a processor detects another solution, it compares the cost


with current best solution. If the cost is better, it broadcasts this
cost to all processors.

• If a processor’s current best solution path is worse than the


globally best solution path, only the efficiency of the search
is affected, not its correctness.
Parallel Formulations of IDA*

Two formulations are intuitive.

• Common Cost Bound: Each processor is given the same cost


bound. Processors use parallel DFS on the tree within the cost
bound. The drawback of this scheme is that there might not be
enough concurrency.

• Variable Cost Bound: Each processor works on a different cost


bound. The major drawback here is that a solution is not
guaranteed to be optimal until all lower cost bounds have
been exhausted.

In each case, parallel DFS is the search kernel.


Parallel Best-First Search

• The core data structure is the Open list (typically implemented


as a priority queue).

• Each processor locks this queue, extracts the best node,


unlocks it.

• Successors of the node are generated, their heuristic functions


estimated, and the nodes inserted into the open list as
necessary after appropriate locking.

• Termination signaled when we find a solution whose cost is


better than the best heuristic value in the open list.

• Since we expand more than one node at a time, we may


expand nodes that would not be expanded by a sequential
algorithm.
Parallel Best-First Search
[Figure] A general schematic for parallel best-first search using a centralized strategy. A global list is maintained at a designated processor; each processor P0 through Pp−1 locks the list, picks the best node, places its generated nodes on the list, unlocks the list, and expands the node to generate successors. The locking operation is used here to serialize queue access by various processors.
Parallel Best-First Search

• The open list is a point of contention.

• Let texp be the average time to expand a single node, and


taccess be the average time to access the open list for a single-
node expansion.

• If there are n nodes to be expanded by both the sequential


and parallel formulations (assuming that they do an equal
amount of work), then the sequential run time is given by
n(taccess + texp ).

• The parallel run time will be at least ntaccess .

• Upper bound on the speedup is (taccess + texp)/taccess .
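A quick numeric illustration of this bound (the timing values are made up):

```python
def central_open_list_speedup(t_access, t_exp):
    """Maximum speedup with a single serialized open list:
    (t_access + t_exp) / t_access, independent of p."""
    return (t_access + t_exp) / t_access

# If a list access takes 1 unit and an expansion 9 units, no number of
# processors can exceed a 10x speedup:
print(central_open_list_speedup(1.0, 9.0))  # -> 10.0
```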


Parallel Best-First Search

• Avoid contention by having multiple open lists.

• Initially, the search space is statically divided across these open


lists.

• Processors concurrently operate on these open lists.

• Since the heuristic values of nodes in these lists may diverge


significantly, we must periodically balance the quality of nodes
in each list.

• A number of balancing strategies based on ring, blackboard,


or random communications are possible.
Parallel Best-First Search
[Figure] A message-passing implementation of parallel best-first search using the ring communication strategy: each processor P0 through Pp−1 maintains a local list and exchanges its best nodes with its neighbors on the ring.
Parallel Best-First Search
[Figure] An implementation of parallel best-first search using the blackboard communication strategy: each processor maintains a local list and exchanges its best nodes with a shared blackboard.
Parallel Best-First Graph Search

• Graph search involves a closed list, where the major operation


is a lookup (on a key corresponding to the state).

• The classic data structure is a hash.

• Hashing can be parallelized by using two functions – the first


one hashes each node to a processor, and the second one
hashes within the processor.

• This strategy can be combined with the idea of multiple open


lists.

• If a node does not exist in a closed list, it is inserted into the


open list at the target of the first hash function.

• In addition to facilitating lookup, randomization also equalizes


quality of nodes in various open lists.
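A minimal sketch of the two-level scheme (the function and class names are hypothetical): the first hash picks a home processor for a node; the second is just the processor-local hash table.

```python
def node_home(state, p):
    """First-level hash: map a search node to its home processor."""
    return hash(state) % p

class ClosedList:
    """Second-level hash: the per-processor closed list is an ordinary
    dict keyed on the state."""
    def __init__(self):
        self.seen = {}

    def insert_if_new(self, state, g):
        """Insert unless the state is already present with a cost at
        least as good; returns True if the caller should open the node."""
        if state in self.seen and self.seen[state] <= g:
            return False
        self.seen[state] = g
        return True

p = 4
closed = [ClosedList() for _ in range(p)]   # one closed list per processor
s = (1, 2, 3, 4, 5, 6, 7, 8, 0)             # an 8-puzzle state as a tuple
home = node_home(s, p)
print(closed[home].insert_if_new(s, 10))    # -> True  (new node: open it)
print(closed[home].insert_if_new(s, 12))    # -> False (worse duplicate)
```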
Speedup Anomalies in Parallel Search

• Since the search space explored by processors is determined


dynamically at runtime, the actual work might vary significantly.

• Executions yielding speedups greater than p by using


p processors are referred to as acceleration anomalies.
Speedups of less than p using p processors are called
deceleration anomalies.

• Speedup anomalies also manifest themselves in best-first


search algorithms.

• If the heuristic function is good, the work done in parallel best-


first search is typically more than that in its serial counterpart.
Speedup Anomalies in Parallel Search
[Figure] The difference in the number of nodes searched by sequential and parallel formulations of DFS. For this example, parallel DFS reaches a goal node after searching fewer nodes than sequential DFS (9 nodes for the two-processor formulation versus 13 for the sequential one).
Speedup Anomalies in Parallel Search
[Figure] A parallel DFS formulation that searches more nodes than its sequential counterpart (12 nodes for the two-processor formulation versus 7 for the sequential one).
