Parallel DFS and BFS, and Parallel Best-First Search
Depth-First Search (DFS)
In DFS, when the edge (v, w) being examined leads to an unvisited vertex w, the search moves to w, which becomes the new vertex being searched from. However, if w is already visited, v remains the vertex being searched from, and another unvisited vertex adjacent to v is chosen as the next vertex to be examined.
This means that the DFS strategy always examines the most recently generated, and hence deepest, unexplored vertex first: the search descends as far as possible before backtracking.
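To make the strategy concrete, here is a minimal sketch of DFS with an explicit stack. This is an illustrative Python fragment only; the adjacency-list graph representation and the names used are assumptions, not part of the original slides.

```python
# A minimal DFS sketch: the deepest (most recently generated)
# unexplored vertex is always examined first.
def dfs(graph, start):
    visited = set()
    stack = [start]            # vertices awaiting examination
    order = []
    while stack:
        v = stack.pop()        # deepest unexplored vertex
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        # Push unvisited neighbours; they will be examined before
        # any shallower alternatives left on the stack.
        for w in graph.get(v, []):
            if w not in visited:
                stack.append(w)
    return order

# Usage on a small hypothetical graph:
g = {'a': ['b', 'c'], 'b': ['d'], 'c': [], 'd': []}
print(dfs(g, 'a'))             # e.g. ['a', 'c', 'b', 'd']
```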
Parallel Depth-First Search
• The critical issue in parallel depth-first search algorithms is the distribution of the search space among the processors.
• Because tree search is unstructured, subtree sizes cannot be predicted in advance, so a static partitioning of the search space leads to load imbalance; this motivates dynamic load balancing.
Dynamic Load Balancing
Important Parameters of Parallel DFS
• Work-Splitting Strategies
• When work is transferred, the donor's stack is split into two stacks, one of
which is sent to the recipient.
• In other words, some of the nodes (that is, alternatives) are removed from
the donor's stack and added to the recipient's stack.
• If too little work is sent, the recipient quickly becomes idle; if too much,
the donor becomes idle.
• Ideally, the stack is split into two equal pieces such that the size of the
search space represented by each stack is the same. Such a split is
called a half-split.
• It is difficult to get a good estimate of the size of the tree rooted at an
unexpanded alternative in the stack.
• However, alternatives near the bottom of the stack (that is, close to the initial node) tend to have larger trees rooted at them, while alternatives near the top of the stack tend to have smaller trees rooted at them.
• To avoid sending very small amounts of work, nodes beyond a specified
stack depth are not given away. This depth is called the cutoff depth.
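As an illustration only, the following sketch splits a donor stack by handing roughly every other eligible node to the recipient, while nodes beyond the cutoff depth stay with the donor. Representing the stack as (node, depth) pairs and the CUTOFF_DEPTH value are assumptions made for the example; real schemes split the untried alternatives at each stack level.

```python
CUTOFF_DEPTH = 10   # assumed value; tuned per application

def split_stack(stack):
    """stack: list of (node, depth) pairs, bottom of the stack first."""
    donor, recipient = [], []
    give = True
    for node, depth in stack:
        if depth >= CUTOFF_DEPTH:
            donor.append((node, depth))        # too deep: never given away
        elif give:
            recipient.append((node, depth))    # roughly half of eligible nodes
            give = False
        else:
            donor.append((node, depth))
            give = True
    return donor, recipient
```

Alternating between donor and recipient gives each side a mix of deep and shallow alternatives, which approximates a half-split when subtree sizes are unknown.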
Load-Balancing Schemes:
• Asynchronous Round Robin (ARR)
• each processor maintains an independent variable, target.
• Whenever a processor runs out of work, it uses target as the label of a
donor processor and attempts to get work from it.
• The value of target is incremented (modulo p) each time a work request
is sent.
• The initial value of target at each processor is set to ((label + 1) modulo
p) where label is the local processor label.
• Here, work requests are generated independently by each processor.
• However, it is possible for two or more processors to request work from
the same donor at nearly the same time.
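A minimal sketch of ARR target selection under these rules; the class and method names are illustrative, not from the slides.

```python
# Asynchronous Round Robin: each processor keeps a private `target`.
class ARRWorker:
    def __init__(self, label, p):
        self.label = label
        self.p = p
        self.target = (label + 1) % p              # initial value of target

    def next_donor(self):
        donor = self.target
        self.target = (self.target + 1) % self.p   # increment (modulo p) per request
        return donor
```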
• Global Round Robin (GRR)
• It uses a single global variable called target.
• This variable can be stored in a globally accessible space in shared
address space machines or at a designated processor in message
passing machines.
• Whenever a processor needs work, it requests and receives the value of target: on shared-address-space machines by locking, reading, and unlocking the variable, and on message-passing machines by sending a request message to the designated processor (say P0).
• The value of target is incremented (modulo p) before responding to the
next request.
• The recipient processor then attempts to get work from a donor processor
whose label is the value of target.
• GRR ensures that successive work requests are distributed evenly over
all processors.
• A drawback of this scheme is the contention for access to target.
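A sketch of GRR on a shared-address-space machine, assuming the shared target is protected by a lock; that lock is exactly the contention point noted above. Names are illustrative.

```python
import threading

# Global Round Robin: a single global `target` behind a lock.
class GRRCounter:
    def __init__(self, p):
        self.p = p
        self.target = 0
        self.lock = threading.Lock()

    def next_donor(self):
        with self.lock:                            # lock, read, increment, unlock
            donor = self.target
            self.target = (self.target + 1) % self.p
        return donor
```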
• Random Polling (RP)
• It is the simplest load-balancing scheme.
• When a processor becomes idle, it randomly selects a donor.
• Each processor is selected as a donor with equal probability, ensuring
that work requests are evenly distributed.
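A sketch of random polling donor selection (assumes p > 1; the function name is illustrative):

```python
import random

# Random Polling: an idle processor selects a donor uniformly at
# random among the other processors.
def next_donor(my_label, p):
    donor = random.randrange(p)
    while donor == my_label:          # never request work from oneself
        donor = random.randrange(p)
    return donor
```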
Best-First Search
• Best-first search ranks unexplored vertices with a heuristic evaluation function; the vertex selected for consideration is the one having the best value of this evaluation function.
• If the selected vertex is a solution, we can quit; otherwise, it is expanded and the newly generated vertices are added to the set of vertices generated so far for the next step of examination.
• For example, this value may be the cost of the partial solution represented by the vertex with respect to the objective function.
• The main disadvantage of best-first search is its memory requirement, which is linear in the size of the search space explored. For problems with a large search-space tree, providing the required memory becomes a problem.
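Putting the above together, here is a minimal sketch of sequential best-first search with the open list kept as a priority queue. The callables expand, evaluate, and is_goal are hypothetical problem-specific assumptions, and lower evaluation values are assumed better.

```python
import heapq
import itertools

def best_first_search(start, expand, evaluate, is_goal):
    tie = itertools.count()                    # tie-breaker for equal values
    open_list = [(evaluate(start), next(tie), start)]
    while open_list:
        _, _, v = heapq.heappop(open_list)     # vertex with the best value
        if is_goal(v):
            return v
        for w in expand(v):                    # generate and evaluate successors
            heapq.heappush(open_list, (evaluate(w), next(tie), w))
    return None
```

Note how the open list only grows as the search proceeds, which is the memory disadvantage described above.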
Execution of Best-First Search
[Figures: step-by-step execution of best-first search]
Parallel Best-First Search
• In the sequential best-first search algorithm, the most promising node is removed from the open list and expanded, and the newly generated nodes are added to the open list (the open list is typically implemented as a priority queue).
• Parallelism in a best-first search can be introduced by expanding the vertices
in parallel.
• Centralized strategy:
• Each processor gets work from a single global open list or queue.
• Suppose p processors are available.
• At each step, instead of expanding the single vertex with the best value of the evaluation function, the p vertices with the p best values are considered for expansion.
• After each expansion, each processor evaluates the vertices it has generated and places them on the queue for the next step of the examination.
• A locking operation is used to serialize access to the queue by the various processors.
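A sketch of this centralized strategy using threads and a lock-protected global open list. Termination detection is deliberately simplified, and expand, evaluate, and is_goal remain hypothetical callables as before.

```python
import heapq
import itertools
import threading

def parallel_best_first(start, expand, evaluate, is_goal, p=4):
    tie = itertools.count()                           # tie-breaker for equal values
    open_list = [(evaluate(start), next(tie), start)] # shared global open list
    lock = threading.Lock()
    result = {}

    def worker():
        while 'solution' not in result:
            with lock:                                # serialized queue access
                if not open_list:
                    return                            # simplified termination test
                _, _, v = heapq.heappop(open_list)
            if is_goal(v):
                result['solution'] = v
                return
            children = [(evaluate(w), w) for w in expand(v)]
            with lock:                                # place new vertices on the queue
                for val, w in children:
                    heapq.heappush(open_list, (val, next(tie), w))

    threads = [threading.Thread(target=worker) for _ in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result.get('solution')
```

Every pop and every push goes through the same lock, which is the serialization the next bullets analyze.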
• Moreover, since the open list is accessed for each node expansion, it must be easily accessible to all processors, which can severely limit performance.
• Even on shared-address-space architectures, contention for the open
list limits speedup.
• Let t_exp be the average time to expand a single node, and t_access be the average time to access the open list for a single node expansion.
• If there are n nodes to be expanded by both the sequential and parallel formulations (assuming that they do an equal amount of work), then the sequential run time is given by n(t_access + t_exp).
• Assume that it is impossible to parallelize the expansion of individual nodes. Then the parallel run time will be at least n·t_access, because the open list must be accessed at least once for each node expanded.
• Hence, an upper bound on the speedup is (t_access + t_exp)/t_access.
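• As a hypothetical illustration of this bound: if t_exp = 9·t_access, the speedup can never exceed (t_access + 9·t_access)/t_access = 10, regardless of the number of processors.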
• One way to avoid the contention due to a centralized open list is to let each
processor have a local open list.
• Initially, the search space is statically divided among the processors by
expanding some nodes and distributing them to the local open lists of
various processors.
• All the processors then select and expand nodes simultaneously.
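A sketch of this initial static division: the root is expanded breadth-first until there is at least one node per processor, and the frontier is dealt round-robin into p local open lists. The callables expand and evaluate are hypothetical, and round-robin dealing is just one simple choice among many.

```python
import heapq

def distribute(start, expand, evaluate, p):
    frontier = [start]
    # Grow the frontier breadth-first until there is >= one node per processor.
    while frontier and len(frontier) < p:
        v = frontier.pop(0)
        frontier.extend(expand(v))
    local_open = [[] for _ in range(p)]
    for i, v in enumerate(frontier):
        # The index i doubles as a tie-breaker for equal evaluations.
        heapq.heappush(local_open[i % p], (evaluate(v), i, v))
    return local_open
```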