
Parallel Programming
Advanced Computer Architecture
Submitted to: Dr. Saima Farhan
OUR TEAM

Farheen Fatima

Farzeen Fatima

Khansa
CONTENTS OF THIS PRESENTATION
 Introduction to Parallel Programming
 Why Parallel Programming?
 Parallel Programming Process
 Parallel Programming Models
 Parallel Programming Paradigms
 Tools and Frameworks for Parallel Programming
 Challenges in Parallel Programming
 Applications of Parallel Programming
 Best Practices for Effective Parallel Programming
 Future of Parallel Programming
 Conclusion
INTRODUCTION
Parallel programming is the process of splitting a problem into smaller tasks that can be executed at the same time – in parallel – using multiple computing resources.

It is a key technique for solving complex problems efficiently.

Examples: scientific problems, big data analysis, artificial intelligence, image processing, and weather forecasting.
Example: multiplying two matrices, A × B = C (2×2)

Problem breakdown: each output element C[0][0], C[0][1], C[1][0], C[1][1] is computed as an independent task by its own process. The partial results are then ASSEMBLED into the final matrix C.
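A minimal C++ sketch of this breakdown (assuming one thread per output element; the matrix values are illustrative, not from the slide):

```cpp
#include <array>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    // Illustrative 2x2 input matrices.
    std::array<std::array<int, 2>, 2> A{{{1, 2}, {3, 4}}};
    std::array<std::array<int, 2>, 2> B{{{5, 6}, {7, 8}}};
    std::array<std::array<int, 2>, 2> C{};

    // One task per output element: each computes C[i][j] independently.
    std::vector<std::thread> tasks;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            tasks.emplace_back([&, i, j] {
                C[i][j] = A[i][0] * B[0][j] + A[i][1] * B[1][j];
            });

    // "Assemble": wait for every task, then C holds the full product.
    for (auto& t : tasks) t.join();
    for (auto& row : C) std::cout << row[0] << ' ' << row[1] << '\n';
}
```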
Why Use Parallel Programming?
 Improved Performance: Speeds up computational
tasks by dividing them into smaller, parallelized
subtasks.
 Efficient Resource Utilization: Makes optimal use
of multi-core processors and high-performance
computing (HPC) architectures.
 Handles Large Data: Ideal for processing vast
datasets or performing simulations.
 Essential for Modern Applications: Powering
technologies like machine learning, climate
modeling, and real-time analytics.
Process of parallel programming
Understand the problem

Break the problem down into subproblems

Identify Communications Between Tasks

Synchronize the Sequence of Tasks

Identify Dependencies in the Sequence of Tasks

Perform Load Balancing

Write and debug the code


Parallel Programming Models
Shared Memory
 Shared Address Space: Processes share a common memory space, reading and
writing asynchronously.
 Access Control: Locks and semaphores manage access to shared memory to prevent
race conditions and deadlocks.
 Ease of Development: Programmers don’t need to define data ownership, simplifying
development.
 Performance Challenge: Managing data locality is difficult, leading to inefficiencies in
memory access and cache usage. Poor data locality increases memory usage, which
reduces performance. Controlling data locality is complex and often beyond average
users.
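As a rough illustration of the shared-memory model in C++ (the counter workload and thread count are made up for the sketch): several threads read and write the same address space, and a lock serializes access to prevent a race condition.

```cpp
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    long long counter = 0;      // shared state visible to every thread
    std::mutex counter_lock;    // access control for the shared memory

    auto work = [&](int iterations) {
        for (int i = 0; i < iterations; ++i) {
            std::lock_guard<std::mutex> guard(counter_lock);  // prevents a race condition
            ++counter;
        }
    };

    // Four threads read and write the same address space asynchronously.
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t) threads.emplace_back(work, 100000);
    for (auto& t : threads) t.join();

    std::cout << "counter = " << counter << '\n';  // 400000 with the lock in place
}
```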
Message Passing (MPI)
 A parallel programming approach where separate processes communicate only by
sending messages, not sharing memory. Each set of tasks uses its own local memory
during computation. Multiple tasks can reside on the same physical machine and/or
across an arbitrary number of machines.
 Processes exchange data through communications by sending and receiving
messages.
 Data transfer usually requires cooperative operations to be performed by each
process. For example, a send operation must have a matching receive operation.
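A minimal sketch of a matched send/receive pair using the MPI C API (assumes the program is launched with at least two processes, e.g. mpirun -np 2):

```cpp
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = 0;
    if (rank == 0) {
        value = 42;  // data lives in rank 0's local memory
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);  // send to rank 1, tag 0
    } else if (rank == 1) {
        // The matching receive: without it, the send has no partner.
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::cout << "rank 1 received " << value << '\n';
    }

    MPI_Finalize();
}
```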
Hybrid Model
 Hybrid Model: Combines multiple programming models, like MPI with
shared memory
 Popular Hardware: Ideal for clustered multi-core systems.
 MPI + GPU: MPI tasks run on CPUs, and intensive computations are
offloaded to GPUs.
 Data Exchange: GPU and CPU exchange data using CUDA or similar
technologies.
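A sketch of one common hybrid combination, MPI between processes plus OpenMP shared-memory threads inside each process (the MPI + GPU variant above would instead offload the inner loop to CUDA); the array size is illustrative:

```cpp
#include <mpi.h>
#include <omp.h>
#include <vector>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Each MPI process owns its own chunk of data in local memory...
    std::vector<double> chunk(1000000, 1.0);

    // ...and uses shared-memory threads to process that chunk in parallel.
    double local_sum = 0.0;
    #pragma omp parallel for reduction(+ : local_sum)
    for (long i = 0; i < (long)chunk.size(); ++i) local_sum += chunk[i];

    // MPI then combines the per-process results across the cluster.
    double total = 0.0;
    MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) std::cout << "total = " << total << '\n';

    MPI_Finalize();
}
```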
Parallel Programming Paradigms
Data-Level
 The data-parallel model is one of the simplest parallel algorithm models.
 Tasks to be performed are identified first, then assigned to processes.
 Task assignment can be static (fixed) or semi-static (partially flexible).
 Each process performs the same task (operations are identical), but on
different pieces of data.
 The problem is divided into smaller tasks based on data partitioning.
 Data partitioning ensures:
-> All processes perform similar operations
-> Proper load balancing with uniform data distribution.
Example
1. Problem:
- You have a large array of numbers, e.g., [1, 2, 3, 4, 5, 6, 7, 8].
- The goal is to calculate the total sum of all the numbers.
2. Data Partitioning:
- Divide the array into smaller chunks to distribute the work among multiple
processes.
- For example, with 4 processes:
- Process 1 gets [1, 2]
- Process 2 gets [3, 4]
- Process 3 gets [5, 6]
- Process 4 gets [7, 8]
Example: Cont…
3. Parallel Task Execution:
- Each process performs the same operation (sum calculation) on its
assigned data.
- Process 1 calculates 1 + 2 = 3.
- Process 2 calculates 3 + 4 = 7.
- Process 3 calculates 5 + 6 = 11.
- Process 4 calculates 7 + 8 = 15.
4. Combining Results:
- The partial sums [3, 7, 11, 15] are combined (aggregated) to get the final
result:
3 + 7 + 11 + 15 = 36.
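A small C++ sketch of this exact example (four tasks, each summing its own chunk; std::async stands in for the four processes):

```cpp
#include <future>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> data{1, 2, 3, 4, 5, 6, 7, 8};
    const int num_parts = 4;                      // matches the 4 processes above
    const int chunk = data.size() / num_parts;    // 2 elements per task

    // Each task runs the same operation (a sum) on its own slice of the data.
    std::vector<std::future<int>> partials;
    for (int p = 0; p < num_parts; ++p)
        partials.push_back(std::async(std::launch::async, [&, p] {
            return std::accumulate(data.begin() + p * chunk,
                                   data.begin() + (p + 1) * chunk, 0);
        }));

    // Combine the partial results: 3 + 7 + 11 + 15 = 36.
    int total = 0;
    for (auto& f : partials) total += f.get();
    std::cout << "total = " << total << '\n';
}
```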
Task-Pool-level
 Also known as the Task Pool Model.
 Dynamic mapping of tasks to processors to handle load balancing.
 Used when tasks vary in size and processing time.
 How it Works:
 Tasks are divided into a pool, which is a collection of tasks ready to be
processed.
 Idle processors in the system are assigned tasks from the pool during
runtime.
 This ensures that processors remain active and no processor is underutilized.
 Load Balancing:
Idle processors pull new tasks from the pool as they finish, so the workload stays evenly distributed.
Example
A system needs to process various tasks of different sizes (e.g.,
small data cleanup tasks and large data processing tasks).

Step 1: A pool of tasks is created, and as processors become idle, they pull tasks from the pool to process.

Step 2: This ensures all processors remain busy, and the workload is distributed evenly, leading to efficient processing.
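A rough C++ sketch of a task pool (the queue contents and worker count are illustrative): idle workers pull the next task from a shared queue until it is empty.

```cpp
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

int main() {
    // The task pool: a shared queue of work items of varying size.
    std::queue<std::function<void()>> pool;
    std::mutex pool_lock;
    for (int i = 0; i < 20; ++i)
        pool.push([i] { /* a small or large task would run here */ });

    // Each worker pulls the next task whenever it becomes idle, so faster
    // workers simply take more tasks (dynamic load balancing).
    auto worker = [&] {
        while (true) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> guard(pool_lock);
                if (pool.empty()) return;
                task = std::move(pool.front());
                pool.pop();
            }
            task();
        }
    };

    std::vector<std::thread> workers;
    for (int w = 0; w < 4; ++w) workers.emplace_back(worker);
    for (auto& t : workers) t.join();
    std::cout << "all tasks processed\n";
}
```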
Master-Slave
 Also known as the Manager-Worker Model.
 Roles:
 The master process manages the tasks.
 The slave processes execute the tasks assigned by the master.
 Task Allocation:
 If the task size is known beforehand, the master allocates it to the
appropriate slaves.
 If the task size is unknown beforehand, the master assigns smaller
portions to slaves incrementally.
Responsibilities (of the master):
 Allocates tasks to slave processes.
 Synchronizes the activities of all slave processes.

Common Usage:
 Effective in systems with shared memory or message-passing communication.
 Ensures efficient task distribution and coordination.
Example: process a large collection of images (e.g., apply filters, resize images, etc.)
Master Process Role:
 The master receives the list of images to process.
 It divides the work (e.g., groups of images) and assigns tasks to slave
processes.
 For example:
 Slave 1 processes images 1–100.
 Slave 2 processes images 101–200.
 Slave 3 processes images 201–300.
Slave Process Role:
 Each slave performs the task assigned by the master (e.g., applying
filters to the images).
 Once done, slaves may send their results back to the master.
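A minimal C++ sketch of this manager-worker split for the image example (process_image is a hypothetical stand-in for the real filtering/resizing work):

```cpp
#include <iostream>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real work (filtering / resizing one image).
void process_image(int image_id) { /* ... */ }

// Slave/worker role: process the contiguous range assigned by the master.
void worker(int first, int last) {
    for (int id = first; id <= last; ++id) process_image(id);
}

int main() {
    // Master role: the task size is known up front, so it splits 300 images
    // into fixed ranges and assigns one range to each slave.
    std::vector<std::thread> slaves;
    slaves.emplace_back(worker, 1, 100);     // slave 1: images 1-100
    slaves.emplace_back(worker, 101, 200);   // slave 2: images 101-200
    slaves.emplace_back(worker, 201, 300);   // slave 3: images 201-300

    // Master synchronizes the slaves and waits for their results.
    for (auto& s : slaves) s.join();
    std::cout << "all 300 images processed\n";
}
```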
Pipeline-Based
 Also called the Producer-Consumer Model.
 Data flows through a series of processes arranged in succession.
->How it works:
A single task passes through multiple processes sequentially.
Each process performs its part of the work and then sends the task to the
next process.
->Pipeline Structure:
Processes act as a chain of producers (output generators) and consumers
(input processors).
->Task Mapping:
Uses static mapping, where tasks are assigned to specific processes in the pipeline.
Example
Scenario:
Consider a pipeline for processing and analyzing log
files in a web application. The pipeline involves:
Stage 1: Reading log files (Producer).
Stage 2: Parsing logs to extract relevant information.
Stage 3: Filtering logs for specific events (e.g.,
errors, warnings).
Stage 4: Aggregating statistics (e.g., error counts).
Stage 5: Saving results to a database (Final
Consumer).
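A stripped-down sketch of two adjacent pipeline stages in C++ (a producer pushing log lines and a consumer parsing them; the full pipeline above would chain more stages the same way):

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// The boundary between two stages: the producer pushes, the next stage pops.
std::queue<std::string> stage_queue;
std::mutex mtx;
std::condition_variable cv;
bool done = false;

void producer() {                          // Stage 1: "read" log lines
    for (int i = 0; i < 5; ++i) {
        std::lock_guard<std::mutex> lock(mtx);
        stage_queue.push("log line " + std::to_string(i));
        cv.notify_one();
    }
    { std::lock_guard<std::mutex> lock(mtx); done = true; }
    cv.notify_one();
}

void consumer() {                          // Stage 2: parse / filter each line
    while (true) {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, [] { return !stage_queue.empty() || done; });
        if (stage_queue.empty()) return;   // producer finished, queue drained
        std::string line = stage_queue.front();
        stage_queue.pop();
        lock.unlock();
        std::cout << "parsed: " << line << '\n';
    }
}

int main() {
    std::thread p(producer), c(consumer);
    p.join();
    c.join();
}
```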
Tools and Frameworks for Parallel Programming
MPI
A standardized library for parallel programming.
Enables applications to use multiple processors or computers to work together.
Designed for systems where each processor has its own private memory.
Communication between processors happens by sending and receiving messages.
 Portability: Works across different hardware architectures and operating systems.
 Scalability: Can scale easily to use many processors.
 Flexibility: Provides a rich set of functions for sending and receiving messages and for complex communication patterns.
Threading Building Blocks (TBB)
 Threading Building Blocks (TBB) is a C++ library developed by Intel to help developers write parallel applications.
 It focuses on task-based parallelism, making it easier to divide work across multiple threads.
 TBB allows work to be divided into smaller independent tasks that can be executed in parallel, without requiring manual management of threads.
 It automatically distributes tasks among available threads, ensuring efficient use of resources.
 TBB scales well across multi-core processors, handling dynamic workloads efficiently.
 Cross-Platform: Works across various operating systems like Windows, Linux, and macOS, supporting multi-core CPUs from different manufacturers.
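A minimal TBB sketch using the index-range overload of tbb::parallel_for (link with -ltbb; the per-element work is illustrative):

```cpp
#include <tbb/parallel_for.h>
#include <vector>
#include <iostream>

int main() {
    std::vector<float> data(1000000, 1.0f);

    // TBB splits the index range into chunks and schedules them as tasks
    // across the available worker threads; no manual thread management.
    tbb::parallel_for(std::size_t(0), data.size(), [&](std::size_t i) {
        data[i] = data[i] * 2.0f + 1.0f;   // illustrative per-element work
    });

    std::cout << data[0] << '\n';   // prints 3
}
```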
CUDA
 CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA.
 It allows programmers to use Graphics Processing Units (GPUs) for general-purpose computing.
 Utilizes the power of GPUs to speed up computationally intensive tasks.
 Ideal for tasks like simulations, machine learning, and real-time data processing.
 Integrates with C, C++, and Fortran, making it easy for developers to adopt.
 Can scale efficiently with the increasing number of GPU cores, enhancing performance as hardware improves.
 Manages different types of memory, such as global memory, constant memory, and shared memory, for optimal performance.
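A minimal CUDA C++ sketch (vector addition, compiled with nvcc): each GPU thread computes one output element, with explicit transfers between host memory and global device memory.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

// Kernel: each GPU thread handles one element (same operation, different data).
__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host (CPU) data.
    float *h_a = new float[n], *h_b = new float[n], *h_c = new float[n];
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Global (device) memory, plus explicit CPU <-> GPU transfers.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough thread blocks to cover all n elements.
    int threads = 256, blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(d_a, d_b, d_c, n);

    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    std::printf("c[0] = %f\n", h_c[0]);   // 3.0

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    delete[] h_a; delete[] h_b; delete[] h_c;
}
```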
Challenges
Synchronization: Ensures that tasks accessing shared resources do so in an orderly manner, avoiding conflicts (e.g., using locks, semaphores).

Load Balancing: Distributes work evenly across processors to ensure no processor is overburdened.

Communication Overhead: Processors in parallel systems often need to exchange data, which can cause communication overhead.

Debugging and Profiling: Debugging parallel systems is complex due to issues like race conditions, deadlocks, and non-deterministic behavior.
Applications
Scientific Simulation: Used in weather forecasting, climate modeling, molecular dynamics, and astrophysics.

Big Data Analysis: Involves processing vast amounts of structured and unstructured data, using technologies like Hadoop.

Machine Learning: Training deep learning models; parallel algorithms distribute tasks across multiple GPUs or CPUs for faster results.

Gaming and Graphics: Real-time rendering in video games relies on parallel processing, especially with GPUs.
Best Practices for Effective Parallel Programming
1. Understand the Problem
 Find which parts of your program can run at the same time.
 Check if using parallel programming will actually save time and effort.
2. Reduce Communication Between Tasks
 Limit how often different parts of the program need to share data.
 Use efficient ways to share information, like shared memory or quick messages.
3. Divide Work Evenly
 Make sure all tasks get a fair share of the work so no processor stays idle.
 Use methods to adjust the workload if some tasks take longer than others.
4. Prevent Conflicts
 Use tools like locks or semaphores to avoid errors when multiple tasks try to
change the same data at the same time.
5. Optimize Data Access
 Keep tasks close to the data they need to access to reduce delays.
 Make use of your computer’s memory cache for faster performance.
6. Choose the Right Tools
 Use tools and libraries that match your needs, such as:
OpenMP for shared memory.
MPI for distributed systems.

7.Test Carefully
 Look for errors in how tasks interact, especially with shared data.
 Try running the program on different computers to ensure it works well everywhere.
8.Measure Performance
 Check how much faster the program runs with parallel programming.
 Find the slow parts and improve them.
9.Plan for Growth
 Design the program so it works well even if more processors or cores are added in the
future.
10. Keep Code Simple
 Write small, reusable pieces of code for parallel tasks.
 This makes debugging and updates much easier.
Future of Parallel Programming
1. Faster and Smarter Devices
 Computers will use different processors like CPUs, GPUs, and specialized
chips together to work faster.
2. Quantum Computing
 New kinds of computers, like quantum computers, will need special parallel
programming methods to solve problems quicker.
3. Artificial Intelligence (AI)
 AI and machine learning will rely even more on parallel programming to train
smarter systems faster.
4. Easier Tools
 Tools and languages will improve to make parallel programming simpler for
everyone.
5. Saving Energy
 Parallel programming will focus on doing tasks faster while using less energy,
especially in big data centers.
6. Edge and IoT Devices
 Small devices like smart sensors and IoT gadgets will use parallel
programming to handle tasks quickly.
7. Powerful Supercomputers
 Parallel programming will power supercomputers that can handle extremely
large and complex calculations.
8. Automation
 Future tools will make it easier to write parallel programs by automatically
dividing tasks between processors.
9. New Applications
 Fields like gaming, virtual reality, blockchain, and augmented reality will
heavily use parallel programming.
In conclusion, parallel programming is the key to making computers faster and more efficient by running many tasks at the same time. It is shaping the future of technology, from AI and supercomputers to gaming, IoT, and even quantum computing. As tools and methods improve, parallel programming will become easier and more powerful, helping solve bigger problems and create smarter systems.
