
An Introduction to Parallel Programming

Lecture 1
Why Parallel Computing?

1
INTRODUCTION
WEEK 01
Course Objectives
• Learn how to program parallel processors and systems
• Learn how to think in parallel and write correct parallel programs
• Achieve performance and scalability through an understanding of architecture and software mapping
• Significant hands-on programming experience
  ◦ Develop real applications on real hardware
• Discuss the current parallel computing context
  ◦ What are the drivers that make this course timely?
  ◦ Contemporary programming models and architectures, and where the field is going

3
Why is this Course Important?
• Multi-core and many-core era is here to stay
  ◦ Why? Technology trends
• Many programmers will be developing parallel software
  ◦ But still not everyone is trained in parallel programming
• Learn how to put all these vast machine resources to the best use!
• Useful for
  ◦ Joining the industry
  ◦ Graduate school
• Our focus
  ◦ Teach core concepts
  ◦ Use common programming models
  ◦ Discuss the broader spectrum of parallel computing

4
Roadmap
• Why we need ever-increasing performance.
• Why we’re building parallel systems.
• Why we need to write parallel programs.
• How do we write parallel programs?
• What we’ll be doing.
• Concurrent, parallel, distributed!

5
Parallel and Distributed Computing
• Parallel computing (processing):
  ◦ the use of two or more processors (computers), usually within a single system, working simultaneously to solve a single problem.
• Distributed computing (processing):
  ◦ any computing that involves multiple computers remote from each other that each have a role in a computation problem or information processing.
• Parallel programming:
  ◦ the human process of developing programs that express what computations should be executed in parallel.

6
Parallel Computing
To be run using multiple CPUs:
◦ A problem is broken into discrete parts that can be solved concurrently
◦ Each part is further broken down into a series of instructions
◦ Instructions from each part execute simultaneously on different CPUs (see the sketch after this slide)

Page 7
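To make the decomposition concrete, here is a minimal sketch (not from the original slides) in C with OpenMP, assuming an OpenMP-capable compiler such as gcc with -fopenmp: the loop iterations are the "discrete parts", and OpenMP distributes them across the available cores.

/* Minimal sketch: one problem (summing an array) broken into parts
   that run simultaneously on different CPUs.
   Compile with: gcc -fopenmp sum.c -o sum  (assumes OpenMP support) */
#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 0.5 * i;

    /* OpenMP splits the iterations among the cores; the reduction
       combines each core's partial sum when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f using up to %d threads\n", sum, omp_get_max_threads());
    return 0;
}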
Parallel Computing Example

Page 8
Parallel Computing
The simultaneous use of multiple compute resources to solve a computational problem.
Compute Resources
The compute resources can include:
◦ A single computer with multiple processors/cores
◦ An arbitrary number of computers connected by a network
◦ A combination of both

Page 10
Why we need ever-increasing performance
• Computational power is increasing, but so are our computation problems and needs.
• Problems we never dreamed of have been solved because of past increases, such as decoding the human genome.
• More complex problems are still waiting to be solved.

11
Climate modeling
• The National Oceanic and Atmospheric Administration (NOAA) has more than 20 PB of data and processes 80 TB/day

12
Climate modeling

[Figure: the model grid is divided into blocks; one processor computes one block while another processor computes the adjacent block in parallel.]

Processors in adjacent blocks in the grid communicate their results.

13
Data analysis
• CERN’s Large Hadron Collider (LHC) produces about 15 PB of data per year
• High-energy physics workflows involve a range of both data-intensive and compute-intensive activities.
• The collision data from the detectors on the LHC needs to be filtered to select a few thousand interesting collisions from as many as one billion that may take place each second.
• The WLCG produces a massive sample of billions of simulated beam crossings, trying to predict the response of the detector and compare it to known physics processes and potential new physics signals.

14
Drug discovery
• Computational drug discovery and design (CDDD) based on HPC is a combination of pharmaceutical chemistry, computational chemistry, and biology using supercomputers, and has become a critical technology in drug research and development.

15
Why Parallel Computing?
The Real World is Massively Parallel:
◦ Parallel computing attempts to emulate the natural world
◦ Many complex, interrelated events happening at the same time, yet within a sequence.

Page 12
Why Parallel Computing?
To solve larger, more complex problems:
numerical simulations of complex systems and "Grand Challenge Problems" such as:
◦ weather and climate forecasting
◦ chemical and nuclear reactions
◦ geological, seismic activity
◦ mechanical devices (spacecraft)
◦ electronic circuits
◦ manufacturing processes
Why Parallel Computing?

To provide Concurrency:
Commercial applications require the processing of large amounts of data in sophisticated ways.

Page 15
Why Parallel Computing?

Example applications include:
◦ parallel databases, data mining
◦ web search engines, web-based business services
◦ computer-aided diagnosis in medicine
◦ management of national and multi-national corporations
◦ advanced graphics and virtual reality, particularly in the entertainment industry

Page 16
Why Parallel Computing?
◦ To save time
◦ To solve larger problems
◦ To provide concurrency

Page 17
Why Parallel Computing?

Parallel computing is an attempt to make the best use of an infinite but seemingly limited commodity: time!

Page 18
Who and What?
Top500.org provides statistics on parallel computing users.
Page 20
The Future?
During the past 20 years, the trends indicated by ever faster networks, distributed systems, and multi-processor architectures clearly show that parallelism is the future of computing.
In this same time period, there has been a greater than 500,000x increase in supercomputer performance, with no end currently in sight.
The race is already on for Exascale Computing!
Exaflop = 10¹⁸ calculations per second

Page 21
Towards parallel hardware

26
Why we’re building parallel systems
• Up to now, performance increases have been attributable to increasing density of transistors.
• But there are inherent problems.

27
A little physics lesson
• Smaller transistors = faster processors.
• Faster processors = increased power consumption.
• Increased power consumption = increased heat.
• Increased heat = unreliable processors.

28
Evolution of processors in the last 50 years

29
How small is 5nm?

https://www.tsmc.com/english/dedicatedFoundry/technology/logic/l_5nm
30
An intelligent solution
• Instead of designing and building faster microprocessors, put multiple processors on a single integrated circuit.
• Move away from single-core systems to multicore processors.
• Introducing parallelism!!!

31
Basic Computer Architecture
• Old computers – one unit to execute instructions
• New computers have 4 or more CPU cores (see the sketch after this slide)

32
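As a small illustration (not part of the slides), a C program can ask the operating system how many of those CPU cores it can use; the sysconf constant _SC_NPROCESSORS_ONLN used here is available on Linux and most Unix-like systems.

/* Report how many CPU cores the operating system exposes (POSIX-style). */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    long cores = sysconf(_SC_NPROCESSORS_ONLN);  /* cores currently online */
    if (cores < 1) {
        perror("sysconf");
        return 1;
    }
    printf("This machine exposes %ld CPU core(s)\n", cores);
    return 0;
}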
Memory Cache
• L1 Cache
  ◦ Size is up to 2 MB
  ◦ Typically 100 times faster than RAM
• L2 Cache
  ◦ Size is typically between 256 KB and 8 MB
  ◦ Typically 25 times faster than RAM
• L3 Cache
  ◦ Size is up to 64 MB
  ◦ L3 cache is a general memory pool that the entire chip can make use of

Each core has its own L1 and L2 caches, while the L3 cache, also called the Last Level Cache or LLC, is shared among cores (see the sketch after this slide).
33
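The cache hierarchy matters even for serial code. A hedged sketch in C (the matrix size is arbitrary and timings vary by machine): summing a matrix row by row reuses each cache line it loads, while summing column by column strides through memory and misses in the caches far more often, so the second loop is typically much slower even though it does the same arithmetic.

/* Illustration of cache effects: row-major (cache-friendly) versus
   column-major (cache-hostile) traversal of the same matrix. */
#include <stdio.h>
#include <time.h>

#define N 4096
static double m[N][N];              /* 128 MB, far larger than any cache level */

static double sum_rows(void) {      /* walks memory contiguously */
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += m[i][j];
    return s;
}

static double sum_cols(void) {      /* jumps N*8 bytes per access: many misses */
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += m[i][j];
    return s;
}

int main(void) {
    clock_t t0 = clock();
    double s1 = sum_rows();
    clock_t t1 = clock();
    double s2 = sum_cols();
    clock_t t2 = clock();
    printf("row-major: %.2fs  column-major: %.2fs  (sums %g, %g)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC, s1, s2);
    return 0;
}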
Basic Concepts

High Performance Computing (HPC)
◦ Using the world's fastest computers to solve large/complex computational problems.

Page 23
Basic Concepts
Task
◦ A logically discrete (independent) section of computational work.
◦ A task is typically a program or set of instructions executed by a processor.

Page 24
Basic Concepts

Parallel Task
◦ A task that can be executed by multiple processors safely (yields correct results).

Page 25
Basic Concepts
Parallel Program
◦ A program which consists of multiple tasks running on multiple processors simultaneously.

Page 26
Basic Concepts
Serial Execution
◦ Execution of a program sequentially, one statement at a time. All parallel tasks will have sections that must be executed serially.

Parallel Execution
◦ Execution of a program by more than one task, with each task being able to execute the same or different statement at the same moment in time (simultaneously).

Page 27
Basic Concepts
Node
◦ A standalone "computer in a box".
◦ Usually comprised of multiple processors/cores, memory, network interfaces, etc.

Page 28
Basic Concepts
Communications
◦ The data exchange between parallel tasks.
◦ There are several ways this can be accomplished, such as through a shared memory bus or over a network.

Page 29
Basic Concepts

Synchronization
◦ The coordination of parallel tasks in real time, very often associated with communications.
◦ Synchronization usually involves waiting by at least one task, and can therefore cause a parallel application's execution time to increase (see the sketch after this slide).

Page 30
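A hedged OpenMP sketch of that waiting (not from the slides): no thread may pass the barrier until every thread has reached it, and the critical section admits only one thread at a time, so both constructs make some tasks wait and add to the execution time.

/* Synchronization in shared memory: a barrier and a critical section.
   Compile with: gcc -fopenmp sync.c -o sync */
#include <stdio.h>
#include <omp.h>

int main(void) {
    int done = 0;

    #pragma omp parallel
    {
        int id = omp_get_thread_num();

        printf("thread %d finished phase 1\n", id);

        /* Every thread waits here until all threads have finished phase 1. */
        #pragma omp barrier

        /* Only one thread at a time may update the shared counter. */
        #pragma omp critical
        done++;
    }

    printf("%d threads passed the barrier\n", done);
    return 0;
}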
Basic Concepts
Massively Parallel
◦ Refers to the hardware that comprises a parallel system - having many processors.
◦ The meaning of "many" keeps increasing (now up to six digits!).

Page 31
Basic Concepts
Parallel Computing System
◦ Consists of multiple processors having direct (usually bus-based) access to common physical memory.
◦ All processors communicate with each other using the shared memory.

Page 32
Basic Concepts
Distributed System
◦ Contains multiple processors connected by a communication network.
◦ Refers to network-based memory access for physical memory that is not common.

Page 33
Memory Models
• There are three common kinds of parallel memory models:
  ◦ Shared
  ◦ Distributed
  ◦ Hybrid

46
Shared Memory Model
• All cores share the same pool of memory
• HPC architecture – we talked about the memory available on one node
• Any memory change is seen by all processors

47
Benefits and Drawbacks
• Benefit:
  ◦ Data sharing is fast
• Drawback:
  ◦ Adding more processors may lead to performance issues when accessing the same shared memory resource (memory contention); see the sketch after this slide

48
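One way to see the contention (a hedged sketch, not from the slides): when every thread updates the same shared counter with an atomic operation, they all compete for one memory location, whereas a reduction gives each thread a private copy and merges the copies at the end, so it scales far better.

/* Contention on one shared counter versus a per-thread reduction.
   Compile with: gcc -fopenmp contention.c -o contention */
#include <stdio.h>
#include <omp.h>

#define N 10000000L

int main(void) {
    long contended = 0, reduced = 0;

    double t0 = omp_get_wtime();
    #pragma omp parallel for
    for (long i = 0; i < N; i++) {
        #pragma omp atomic          /* all threads hammer the same location */
        contended++;
    }
    double t1 = omp_get_wtime();

    #pragma omp parallel for reduction(+:reduced)
    for (long i = 0; i < N; i++)
        reduced++;                  /* each thread counts privately */
    double t2 = omp_get_wtime();

    printf("atomic: %.3fs  reduction: %.3fs  (%ld, %ld)\n",
           t1 - t0, t2 - t1, contended, reduced);
    return 0;
}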
Distributed Memory Model
• In a distributed memory model, each core has its own memory
• Processors communicate only through a network connection and/or communication protocol (e.g., MPI)
• Changes to the local memory associated with a processor do not have an impact on other processors
• Remote-memory access must be explicitly managed by the programmer (see the sketch after this slide)

49
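A minimal MPI sketch in C (assuming an MPI installation; the file name and run command are illustrative): each process has its own memory, so the value written by rank 0 reaches rank 1 only through an explicit send and receive.

/* Distributed memory: data moves between processes only via explicit messages.
   Compile with: mpicc send.c -o send     Run with: mpirun -np 2 ./send */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Rank 1 cannot see this variable; it must be sent over the network. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}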
Benefits and Drawbacks
• Biggest benefit is scalability
  ◦ Adding more processors doesn’t result in resource contention as far as memory is concerned
• Biggest drawback
  ◦ Can be tedious to program for distributed memory models
  ◦ All data relocation must be programmed by hand

50
Hybrid Memory Model
• As the name implies, the hybrid memory model is a combination of the shared and distributed memory models
• Most large and fast clusters today use a hybrid memory model
• A certain number of cores share the memory on one node, but are connected to the cores sharing memory on other nodes through a network

51
Benefits and Drawbacks
• Benefit:
  ◦ Scalability
• Drawback:
  ◦ Must know how to program communication between nodes (e.g., MPI); see the sketch after this slide

52
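A hedged outline of that hybrid style (not from the slides): MPI ranks are placed across the nodes and communicate over the network, while OpenMP threads share the memory inside each node. It is compiled with an MPI wrapper compiler plus OpenMP support, for example mpicc -fopenmp.

/* Hybrid model: MPI between nodes, OpenMP threads within each node.
   Compile with: mpicc -fopenmp hybrid.c -o hybrid */
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv) {
    int provided, rank;

    /* Request thread support so MPI and OpenMP can coexist safely. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Shared-memory parallelism inside this process/node. */
    #pragma omp parallel
    printf("rank %d, thread %d of %d\n",
           rank, omp_get_thread_num(), omp_get_num_threads());

    /* Communication between nodes would use MPI calls here (e.g., MPI_Send). */
    MPI_Finalize();
    return 0;
}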
