PDC Unit-2
UNIT-2
Need for communication and coordination/synchronization
Synchronization:
• Synchronization means managing the sequence of work and the tasks that perform it. It is a critical design consideration
in programs that run tasks in parallel (simultaneously).
• It ensures that tasks are coordinated correctly, and it can be a significant factor in program performance (or the lack of it).
Often, it requires "serializing" parts of the program.
Types of Synchronization:
Barrier:
•A barrier ensures all tasks finish their work before moving forward.
•Each task keeps working until it reaches the barrier. Once all tasks reach it, they synchronize and move to the
next step together.
•Sometimes, a specific task or part of the work must be completed before others continue.
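As a concrete sketch (not from the notes), barrier synchronization can be expressed with Python's threading.Barrier; the worker function and the number of tasks are invented for the example:

import threading

NUM_WORKERS = 4
barrier = threading.Barrier(NUM_WORKERS)    # all NUM_WORKERS tasks must arrive before any continues

def worker(task_id):
    print(f"task {task_id}: doing its share of the work")
    barrier.wait()                           # block here until every task has arrived
    print(f"task {task_id}: all tasks synchronized, moving to the next step")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()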
Lock/Semaphore:
•This is used to control access to shared resources like data or code.
•Only one task can use the resource at a time. A task must "lock" the resource before using it and
"unlock" it afterward.
•If another task tries to access the locked resource, it has to wait. This can either pause the task
(blocking) or allow it to do something else in the meantime (non-blocking).
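A minimal sketch of lock-based mutual exclusion using Python's threading.Lock; the shared counter stands in for any shared resource, and the non-blocking acquire at the end illustrates the "do something else in the meantime" option:

import threading

counter = 0                     # shared resource
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:              # lock before use, automatically unlock afterward
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                  # always 400000, because access was serialized

if lock.acquire(blocking=False):   # non-blocking attempt: returns immediately
    lock.release()
else:
    print("resource busy, doing other work instead of waiting")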
Synchronous Communication Operations:
•These involve two or more tasks that need to communicate while working.
•For example, if a task sends data, it waits for confirmation that the other task received it. This ensures
both tasks are properly coordinated.
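A small sketch of a synchronous (blocking) exchange: the sender hands over data and then waits for an explicit confirmation from the receiver. The one-slot queue and the ack event are choices made for this illustration, not a specific library's protocol:

import threading
import queue

channel = queue.Queue(maxsize=1)   # one-slot channel between the two tasks
ack = threading.Event()

def sender():
    channel.put("payload")         # hand the data to the other task
    ack.wait()                     # block until the receiver confirms receipt
    print("sender: receipt confirmed, continuing")

def receiver():
    data = channel.get()
    print(f"receiver: got {data!r}")
    ack.set()                      # send the confirmation back

t1 = threading.Thread(target=sender)
t2 = threading.Thread(target=receiver)
t1.start(); t2.start()
t1.join(); t2.join()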
• Some problems can be solved in parallel without much coordination. For instance, in image processing,
different parts of an image can be handled by different tasks.
• However, some tasks depend on each other and need to share data or resources. This requires proper
synchronization.
• Tasks may use synchronous communication (waiting for confirmation) or asynchronous communication (not
waiting); this choice determines how tightly they must be coordinated.
• In a parallel system, multiple jobs arrive, and the system decides the order in which they should be executed.
• The goal is to reduce the total time it takes to complete all the jobs (minimize turnaround time).
• Parallel machines are divided into separate, non-overlapping partitions, where different jobs run at the same time.
• Jobs wait in a queue until processors are allocated to them; scheduling decisions are made whenever the system's state changes (for example, when a job arrives or completes).
Goal:
•The aim is to maximize processor usage.
•However, since future jobs and their execution times are unknown, the system uses simple rules (heuristics) to
allocate jobs efficiently at each scheduling step.
How Scheduling Works:
•The scheduler assigns resources (like processors and nodes) to a job based on its requirements, using data
provided by the resource manager.
Scheduling Policies:
1. FCFS (First Come First Serve): Jobs are processed in the order they arrive.
2. Lookahead Optimizing Scheduler: Tries to predict job requirements for better scheduling.
3. Gang Scheduling: Schedules related jobs (from the same group) to run simultaneously.
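To make the turnaround-time goal concrete, here is a tiny single-queue FCFS simulation; the job list and run times are invented for the example:

# Each job: (name, arrival_time, run_time) -- values invented for illustration.
jobs = [("J1", 0, 5), ("J2", 1, 3), ("J3", 2, 8)]

def fcfs_average_turnaround(jobs):
    """Run jobs in arrival order on one partition and return the average turnaround time."""
    clock = 0
    turnarounds = []
    for name, arrival, run in sorted(jobs, key=lambda j: j[1]):  # first come, first served
        clock = max(clock, arrival) + run        # wait for the job to arrive, then run it to completion
        turnarounds.append(clock - arrival)      # turnaround = completion time - arrival time
    return sum(turnarounds) / len(turnarounds)

print(fcfs_average_turnaround(jobs))   # (5 + 7 + 14) / 3 ≈ 8.67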
Independence & Partitioning
•The first step in creating a parallel algorithm is to break the problem into smaller tasks that can run at
the same time.
•These tasks can vary in size or complexity.
Decomposition:
•Tasks can be represented using a task dependency graph, which shows the order in which tasks must
be executed.
•In the graph, nodes represent tasks, and edges show which tasks depend on the results of others.
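A task dependency graph can be written down directly as an adjacency mapping; the sketch below uses Python's standard-library graphlib (Python 3.9+) to produce one legal execution order for a small, made-up graph:

from graphlib import TopologicalSorter

# Each task maps to the set of tasks whose results it needs (its predecessors).
# The task names T1..T4 are invented for this illustration.
dependencies = {
    "T3": {"T1", "T2"},   # T3 can only start after T1 and T2 finish
    "T4": {"T3"},         # T4 can only start after T3 finishes
}

order = list(TopologicalSorter(dependencies).static_order())
print(order)   # e.g. ['T1', 'T2', 'T3', 'T4'] -- tasks with no edge between them (T1, T2) may run in parallel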
Tasks:
•A task is a small unit of work within the system.
•Decomposition divides the main computation into these tasks.
Common Tasks Include:
1. Identifying work that can run in parallel.
2. Assigning tasks to processors.
3. Distributing inputs, outputs, and data among tasks.
4. Managing shared resources.
5. Synchronizing processors to ensure proper task execution.
• Running multiple tasks at the same time helps solve problems faster.
• Tasks can be of any size, but once defined, they are the smallest units that can run in parallel.
• Example (dense matrix-vector multiplication y = A·b): each element of the output vector y is calculated independently of the others.
• This allows the multiplication to be divided into n tasks, where each task computes one element of y from one row of the
matrix and the shared vector b.
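A sketch of that row-wise decomposition in Python: each of the n tasks computes one element of y from one row of A and the shared vector b. The thread pool is just one way to run the tasks concurrently (and in CPython the GIL limits real CPU parallelism, so this only illustrates the structure):

from concurrent.futures import ThreadPoolExecutor

A = [[1, 2], [3, 4]]      # n x n matrix (n = 2, values invented)
b = [5, 6]                # input vector shared by all tasks

def task(i):
    # Task i computes y[i] independently of every other task.
    return sum(A[i][j] * b[j] for j in range(len(b)))

with ThreadPoolExecutor() as pool:
    y = list(pool.map(task, range(len(A))))
print(y)                  # [17, 39]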
Observations: Tasks share data (e.g., the vector b), but there are no control dependencies.
• This means that no task needs to wait for another to complete before starting.
• All tasks perform the same number of operations, making them equal in size.
• A question arises: Is this the maximum number of tasks that can be created for this problem?
• Critical Path Length: The longest path from the start to the end of the task graph. It defines the
minimum time required to complete all tasks in the graph.
• Average Degree of Concurrency: A measure of how many tasks, on average, can be executed
concurrently.
• It is calculated as the total sum of task weights divided by the critical path length.
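In symbols, if W is the total sum of the task weights and L is the critical path length, then

\text{average degree of concurrency} = \frac{W}{L} = \frac{\sum_i w_i}{\text{critical path length}}

For example (invented weights), a graph of four tasks of weight 10 each whose critical path consists of two dependent tasks has W = 40 and L = 20, giving an average degree of concurrency of 40/20 = 2.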
Decomposition Granularity:
• Theoretically, breaking a task into smaller subtasks (finer granularity) can reduce parallel execution time.
• Dividing tasks too finely can result in excessive overhead for task management and coordination.
• Communication and synchronization costs may outweigh the benefits of finer granularity.
Bound on Granularity:
• There is an inherent upper bound to how finely a task can be divided.
• Example:
• For matrix-vector multiplication, there are at most n^2 concurrent tasks (one task per matrix entry).
• Beyond this limit, no further decomposition is possible because the computational work is
inherently limited.
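A sketch of that finest-grained decomposition, with one tiny task per matrix entry (values invented); note that the n^2 partial results must then be combined per row, which is exactly the kind of extra coordination discussed next:

A = [[1, 2], [3, 4]]            # n x n matrix (values invented)
b = [5, 6]
n = len(A)

# One task per matrix entry: task (i, j) computes the single product A[i][j] * b[j].
partial = {(i, j): A[i][j] * b[j] for i in range(n) for j in range(n)}

# The n^2 partial results are reduced row by row; this combining step is pure
# coordination overhead compared with the coarser n-task decomposition.
y = [sum(partial[(i, j)] for j in range(n)) for i in range(n)]
print(y)                        # [17, 39]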
Communication Overhead:
•Concurrent tasks often need to exchange data (e.g., sharing intermediate results).
• This creates a tradeoff between finer decomposition (to maximize concurrency) and the overhead caused by the data exchange between tasks.
• The performance bounds of a parallel system are determined by finding the optimal balance between:
• Limits on granularity (the computational structure restricts how many tasks can run concurrently).
• Communication overhead (the cost of exchanging data between concurrent tasks).
• Optimizing parallel performance requires balancing task size and communication costs, which is often
application-specific.
• This highlights that while parallel computing can significantly speed up processes, it is limited by the inherent
nature of the computation and the associated overheads.
Task Interaction Graph:
Subtasks Exchange Data:
•In a decomposition (dividing a problem into smaller tasks), subtasks often need to communicate data with each
other.
•Example:
•In the decomposition of a dense matrix-vector multiplication:
•If the vector is not replicated across all tasks, subtasks will need to communicate elements of the
vector with one another.
Graph Representation:
•A task interaction graph is a representation of:
•Tasks as nodes.
•Interactions or data exchanges between tasks as edges.
•This graph helps visualize and analyze the communication dependencies among tasks.
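One way to record a task interaction graph in code: nodes are tasks, and each edge carries the amount of data the two tasks must exchange (task names and edge weights are invented):

# Edge (u, v) with weight w: tasks u and v exchange w data items.
interactions = {
    ("T0", "T1"): 4,
    ("T0", "T2"): 2,
    ("T1", "T2"): 6,
}

# Total communication volume per task -- a useful input when deciding which
# tasks to place on the same processor.
volume = {}
for (u, v), w in interactions.items():
    volume[u] = volume.get(u, 0) + w
    volume[v] = volume.get(v, 0) + w
print(volume)                   # {'T0': 6, 'T1': 10, 'T2': 8}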
Importance of the Task Interaction Graph:
• The graph shows which tasks must exchange data, so it matters when tasks are assigned to processors: tasks that
interact heavily should be placed together to keep communication costs low.
Decomposition Techniques:
• There is no single universal method for task decomposition; the right technique depends on the specific problem
being solved.
• Exploratory decomposition: For tasks involving dynamic exploration, like search problems.
• Speculative decomposition: Used when predicting which tasks will be required in the future.
Recursive Decomposition:
• Recursive decomposition is particularly effective for problems that follow the divide-and-conquer strategy,
which breaks a problem into smaller independent subproblems and solves them recursively.
• Steps:
• The problem is divided into smaller parts that can be solved independently.
• Each subproblem is further divided until the tasks become simple enough to be solved directly (base case).
Advantages of Recursive Decomposition:
• Naturally introduces concurrency, as the independent subproblems can be solved in parallel.
• Suited for problems with hierarchical or tree-like structures.
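A minimal sketch of recursive decomposition applied to mergesort: each split creates two independent subproblems, and up to a chosen depth the two halves are handled by separate threads. In CPython the GIL limits real speed-up for CPU-bound work, so this shows the structure of the decomposition rather than a production implementation; all names here are invented:

import threading

def merge(left, right):
    # Combine two sorted lists into one sorted list.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:]); out.extend(right[j:])
    return out

def parallel_mergesort(data, depth=2):
    if len(data) <= 1:                      # base case: solved directly
        return data
    mid = len(data) // 2
    if depth <= 0:                          # stop spawning threads: recurse sequentially
        return merge(parallel_mergesort(data[:mid], 0),
                     parallel_mergesort(data[mid:], 0))
    result = {}
    def solve(key, part):
        result[key] = parallel_mergesort(part, depth - 1)
    t = threading.Thread(target=solve, args=("left", data[:mid]))
    t.start()
    solve("right", data[mid:])              # current thread handles the other half
    t.join()
    return merge(result["left"], result["right"])

print(parallel_mergesort([5, 2, 9, 1, 7, 3]))   # [1, 2, 3, 5, 7, 9]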
Data Decomposition