Slides Chapter 2 - Parallel Programming Platforms

This document provides an introduction to parallel computing concepts. It discusses the key elements of parallel computers including hardware with multiple processors and memories, parallel system software, and parallel application software. It then describes different parallel computing platforms in terms of their logical and physical organization. The logical organization includes control mechanisms, communication models, and logical elements. The physical organization discusses ideal architectures, interconnection networks, evaluation metrics, network topologies, and issues like cache coherence in shared memory systems.

Uploaded by

unicyclehusby

Introduction to Parallel Computing

George Karypis

Parallel Programming Platforms

Elements of a Parallel Computer

Hardware: multiple processors, multiple memories, interconnection network

System Software: parallel operating system, programming constructs to express/orchestrate concurrency

Application Software: parallel algorithms

Goal: utilize the hardware, system, and application software to either

achieve speedup, S = T_s / T_p, where T_s is the serial run time and T_p is the parallel run time, or

solve problems requiring a large amount of memory.
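
As a small illustration of the speedup definition, the following Python sketch computes S from two measured run times. The function name and the timings are hypothetical and only serve the example.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_s / T_p for a serial and a parallel run time."""
    if t_parallel <= 0:
        raise ValueError("parallel run time must be positive")
    return t_serial / t_parallel


# Hypothetical timings: 120 s serially, 15 s in parallel.
print(speedup(120.0, 15.0))
```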

Parallel Computing Platform

Logical Organization

The user's view of the machine as it is being presented via its system software

Physical Organization

The actual hardware architecture

The physical architecture is to a large extent independent of the logical architecture

Logical Organization Elements

Control Mechanism

SISD/SIMD/MIMD/MISD: Single/Multiple Instruction Stream & Single/Multiple Data Stream

SPMD: Single Program Multiple Data
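
The SPMD idea can be sketched in Python as one function that every worker runs on its own part of the data. This is only an illustration: the names `spmd_program` and `run_spmd` are hypothetical, and Python threads stand in for the processors of a real parallel machine.

```python
from concurrent.futures import ThreadPoolExecutor


def spmd_program(rank, size, data):
    """The single program: each worker runs it on a different slice of the data."""
    chunk = data[rank::size]          # cyclic distribution of the data by rank
    return sum(x * x for x in chunk)  # local partial result


def run_spmd(size, data):
    """Launch `size` copies of the same program and combine their partial results."""
    with ThreadPoolExecutor(max_workers=size) as pool:
        partials = list(pool.map(lambda r: spmd_program(r, size, data), range(size)))
    return sum(partials)


print(run_spmd(4, list(range(10))))
```

Each worker executes identical code; only the rank, and therefore the data it touches, differs.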

Logical Organization Elements

Communication Model

Shared-Address Space: UMA/NUMA/ccNUMA

Message-Passing

Physical Organization

Ideal Parallel Computer Architecture

PRAM: Parallel Random Access Machine

PRAM Models: EREW/ERCW/CREW/CRCW

Exclusive/Concurrent Read and/or Write

Concurrent writes are resolved via Common/Arbitrary/Priority/Sum
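
The four resolution rules for concurrent writes can be sketched as a single Python function. The function name is hypothetical, and two choices are assumptions made for the example: "arbitrary" picks the first listed write, and "priority" treats the lowest processor id as the highest priority.

```python
def resolve_concurrent_write(writes, policy):
    """Resolve writes made to one memory cell in the same step.

    writes: list of (processor_id, value) pairs.
    policy: "common", "arbitrary", "priority", or "sum".
    """
    if not writes:
        raise ValueError("no writes to resolve")
    values = [value for _, value in writes]
    if policy == "common":
        # All processors must agree on the value being written.
        if len(set(values)) != 1:
            raise ValueError("common CRCW requires identical values")
        return values[0]
    if policy == "arbitrary":
        # Any single write may succeed; this sketch picks the first one listed.
        return values[0]
    if policy == "priority":
        # Assumption: the lowest processor id has the highest priority.
        return min(writes, key=lambda w: w[0])[1]
    if policy == "sum":
        # The cell receives the sum of all written values.
        return sum(values)
    raise ValueError(f"unknown policy: {policy}")


writes = [(2, 5), (0, 7), (1, 5)]
print(resolve_concurrent_write(writes, "priority"))
print(resolve_concurrent_write(writes, "sum"))
```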

Physical Organization

Interconnection Networks (ICNs)

Provide processor-to-processor and processor-to-memory connections

Networks are classified as:

Static

Consist of a number of point-to-point links (direct network)

Historically used to link processors-to-processors (distributed-memory systems)

Dynamic

The network consists of switching elements that the various processors attach to (indirect network)

Historically used to link processors-to-memory (shared-memory systems)

Static & Dynamic ICNs

Evaluation Metrics for ICNs

Diameter

The maximum distance between any two nodes

Smaller the better

Connectivity

The minimum number of arcs that must be removed to break the network into two disconnected networks

Measures the multiplicity of paths

Larger the better

Bisection width

The minimum number of arcs that must be removed to partition the network into two equal halves

Larger the better

Bisection bandwidth

Applies to networks with weighted arcs; weights correspond to the link width (how much data it can transfer)

The minimum volume of communication allowed between any two halves of a network

Larger the better

Cost

The number of links in the network

Smaller the better
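
To make some of these metrics concrete, the Python sketch below computes diameter, cost, and bisection width for a small static network. The helper names are hypothetical, a 3-dimensional hypercube is used purely as sample input, and the bisection width is found by brute force, so it is only practical for a handful of nodes. The diameter routine assumes a connected network.

```python
from collections import deque
from itertools import combinations


def hypercube(d):
    """Adjacency list of a d-dimensional hypercube (neighbors differ in one bit)."""
    return {u: [u ^ (1 << k) for k in range(d)] for u in range(1 << d)}


def diameter(adj):
    """Largest shortest-path distance, found by a BFS from every node."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


def cost(adj):
    """Number of links; each undirected link appears twice in the adjacency list."""
    return sum(len(neighbors) for neighbors in adj.values()) // 2


def bisection_width(adj):
    """Minimum number of links cut over all splits into two equal halves."""
    nodes = sorted(adj)
    best = None
    for part in combinations(nodes, len(nodes) // 2):
        side = set(part)
        cut = sum(1 for u in side for v in adj[u] if v not in side)
        best = cut if best is None else min(best, cut)
    return best


cube = hypercube(3)
print(diameter(cube), cost(cube), bisection_width(cube))
```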

Metrics and Dynamic Networks

Network Topologies

Bus-Based Networks

Shared medium

Information is being broadcast

Evaluation: Diameter: O(1), Connectivity: O(1), Bisection width: O(1), Cost: O(p)

Network Topologies

Crossbar Networks

Switch-based network

Supports simultaneous connections

Evaluation: Diameter: O(1), Connectivity: O(1)?, Bisection width: O(p)?, Cost: O(p^2)

Network Topologies

Multistage Interconnection Networks

Multistage Switch Architecture

Pass-through

Cross-over

Connecting the Various Stages

Blocking in a Multistage Switch

Routing is done by comparing the bit-level representation of source and destination addresses: a match goes via pass-through, a mismatch goes via cross-over.
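
The bit-comparison rule can be sketched in Python as follows. The sketch assumes an omega-style network with one stage per address bit, in which stage i compares the i-th most significant bits of the two addresses; the function name is hypothetical and the stage wiring itself is not modeled.

```python
def switch_settings(source, dest, num_bits):
    """Return the switch mode used at each stage, most significant bit first."""
    settings = []
    for stage in range(num_bits):
        bit = num_bits - 1 - stage          # stage 0 looks at the most significant bit
        s_bit = (source >> bit) & 1
        d_bit = (dest >> bit) & 1
        settings.append("pass-through" if s_bit == d_bit else "cross-over")
    return settings


# Source 010 to destination 111 in an 8-node network (3 address bits).
print(switch_settings(0b010, 0b111, 3))
```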

Network Topologies

Complete and star-connected networks.

Network Topologies

Cartesian Topologies

Network Topologies

Hypercubes

Network Topologies

Trees

Summary of Performance Metrics


Topology Embeddings

Mapping between networks

Useful in the early days of parallel computing when topology-specific algorithms were being developed.

Embedding quality metrics

dilation: the maximum number of links an edge is mapped to

congestion: the maximum number of edges mapped onto a single link

Mapping a Cartesian Topology onto a Hypercube

Cool things

Mapping a Cartesian Topology onto a Hypercube
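
One common way to carry out such a mapping is with reflected Gray codes, sketched below in Python. This is an illustration under stated assumptions, not necessarily the construction shown on the slides: the mesh has 2^r rows and 2^s columns with no wraparound links, and the function names are hypothetical. The sketch also measures the dilation of the resulting embedding.

```python
def gray(i):
    """Reflected binary Gray code of i; consecutive codes differ in one bit."""
    return i ^ (i >> 1)


def mesh_to_hypercube(i, j, col_bits):
    """Map mesh node (i, j) to a hypercube label: Gray(i) followed by Gray(j)."""
    return (gray(i) << col_bits) | gray(j)


def hamming(a, b):
    """Number of bit positions in which a and b differ (hypercube distance)."""
    return bin(a ^ b).count("1")


def mesh_dilation(row_bits, col_bits):
    """Longest hypercube path that any mesh edge is stretched to."""
    rows, cols = 1 << row_bits, 1 << col_bits
    worst = 0
    for i in range(rows):
        for j in range(cols):
            here = mesh_to_hypercube(i, j, col_bits)
            if i + 1 < rows:
                worst = max(worst, hamming(here, mesh_to_hypercube(i + 1, j, col_bits)))
            if j + 1 < cols:
                worst = max(worst, hamming(here, mesh_to_hypercube(i, j + 1, col_bits)))
    return worst


# A 4 x 8 mesh embedded in a 5-dimensional hypercube.
print(mesh_dilation(2, 3))
```

Because neighboring mesh nodes land on hypercube nodes that differ in a single bit, every mesh edge maps to one hypercube link.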

Routing Mechanisms

Routing: the algorithm used to determine the path that a message will take to go from the source to the destination

Can be classified along different dimensions

minimal vs non-minimal

deterministic vs adaptive

Dimension Ordered Routing

There is a predefined ordering of the dimensions

Messages are routed along the dimensions in that order until they cannot move any further

X-Y routing for meshes

E-cube routing for hypercubes

Example (E-cube): a message from node 010 to node 111 is routed 010 -> 011 -> 111
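
Both dimension-ordered schemes can be sketched in a few lines of Python. The function names are hypothetical. The E-cube sketch corrects differing address bits starting from the least significant one, and the X-Y sketch moves along X until the column matches and then along Y, on a mesh without wraparound links.

```python
def ecube_route(source, dest):
    """Hypercube path that fixes differing bits from least to most significant."""
    path = [source]
    current = source
    diff = source ^ dest   # set bits mark the dimensions still to be traversed
    bit = 0
    while diff:
        if diff & 1:
            current ^= 1 << bit
            path.append(current)
        diff >>= 1
        bit += 1
    return path


def xy_route(source, dest):
    """Mesh path that travels along X first, then along Y."""
    (x, y), (dest_x, dest_y) = source, dest
    path = [(x, y)]
    while x != dest_x:
        x += 1 if dest_x > x else -1
        path.append((x, y))
    while y != dest_y:
        y += 1 if dest_y > y else -1
        path.append((x, y))
    return path


print([format(n, "03b") for n in ecube_route(0b010, 0b111)])
print(xy_route((0, 0), (2, 1)))
```

Since the order of dimensions is fixed, both routes are deterministic and minimal.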

Physical Organization

Cache Coherence in Shared Memory Systems

A certain level of consistency must be maintained for multiple copies of the same data

Required to ensure proper semantics and correct program execution (serializability)

Two general protocols for dealing with it: invalidate & update

Invalidate/Update Protocols

The preferred scheme depends on the characteristics of the underlying application: the frequency of reads/writes to shared variables

Classical trade-off between communication overhead (updates) and idling (stalling in invalidates)

Additional problems with false sharing

Existing schemes are based on the invalidate protocol

A number of approaches have been developed for maintaining the state/ownership of the shared data
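
The trade-off between the two protocols can be illustrated with a deliberately simplified Python model. Everything here is an assumption made for the sketch: a single shared variable, a set of processors holding a valid copy, and two counters (fetches after a missing or invalidated copy, and invalidate/update broadcasts). It is not a model of any real coherence protocol.

```python
def simulate(trace, protocol):
    """Count (misses, coherence_broadcasts) for one shared variable.

    trace: list of (processor_id, op) with op "r" or "w".
    protocol: "invalidate" or "update".
    """
    valid = set()    # processors that currently hold a valid copy
    misses = 0       # fetches needed because the local copy was missing or invalid
    broadcasts = 0   # invalidate or update messages sent on writes
    for proc, op in trace:
        if proc not in valid:
            misses += 1
            valid.add(proc)
        if op == "w" and valid - {proc}:
            broadcasts += 1
            if protocol == "invalidate":
                valid = {proc}   # all other copies become invalid
            # under "update" the other copies stay valid and receive the new value
    return misses, broadcasts


# Processor 0 writes repeatedly while processor 1 never reads again.
repeated_writes = [(0, "r"), (1, "r"), (0, "w"), (0, "w"), (0, "w")]
# Processor 1 reads after every write by processor 0.
producer_consumer = [(0, "r"), (1, "r"), (0, "w"), (1, "r"), (0, "w"), (1, "r")]

for name, trace in [("repeated writes", repeated_writes),
                    ("producer-consumer", producer_consumer)]:
    print(name, simulate(trace, "invalidate"), simulate(trace, "update"))
```

In this toy model, repeated writes favor invalidate (fewer broadcasts), while the producer-consumer pattern favors update (no re-fetches after invalidation).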

Communication Costs in Parallel Systems

Message-Passing Systems
The communication cost of a data-transfer operation depends on:

start-up time: t_s

add headers/trailer, error-correction, execute the routing algorithm, establish the connection between source & destination

per-hop time: t_h

time to travel between two directly connected nodes (node latency)

per-word transfer time: t_w

1/channel-width

Store-and-Forward & Cut-Through Routing
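
The slide text does not give the cost expressions for the two schemes, so the Python sketch below uses the usual textbook forms as an assumption: store-and-forward pays the full message transfer on every one of the l hops, while cut-through pays the per-hop time on each hop and the transfer time once. The function names and parameter values are hypothetical.

```python
def store_and_forward_cost(m, l, t_s, t_h, t_w):
    """Assumed model: t_s + (m * t_w + t_h) * l for m words over l hops."""
    return t_s + (m * t_w + t_h) * l


def cut_through_cost(m, l, t_s, t_h, t_w):
    """Assumed model: t_s + l * t_h + m * t_w for m words over l hops."""
    return t_s + l * t_h + m * t_w


# Hypothetical parameters: 100 words, 4 hops, t_s = 50, t_h = 2, t_w = 1.
print(store_and_forward_cost(100, 4, 50, 2, 1))
print(cut_through_cost(100, 4, 50, 2, 1))
```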

Cut-through Routing Deadlocks

Messages 0, 1, 2, and 3 need to go to nodes A, B, C, and D, respectively

Communication Model Used for this Class

We will assume that the cost of sending a message of size m is:

t_comm = t_s + m t_w

This is true in general because t_s is much larger than t_h, and for most of the algorithms that we will study m t_w is much larger than l t_h, where l is the number of hops.
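
A quick numerical check of this argument, as a Python sketch: it compares a simplified cost that keeps only the start-up and per-word terms with a cost that also includes a per-hop term of l times the per-hop time. The function names and the parameter values are hypothetical and chosen so that the start-up time dominates the per-hop time and the message is long.

```python
def simplified_cost(m, t_s, t_w):
    """Start-up time plus per-word transfer time for m words; hops are ignored."""
    return t_s + m * t_w


def cost_with_hops(m, l, t_s, t_h, t_w):
    """Same as above, plus the per-hop time over l hops."""
    return simplified_cost(m, t_s, t_w) + l * t_h


def relative_error(m, l, t_s, t_h, t_w):
    """Fraction of the full cost that the simplified model leaves out."""
    full = cost_with_hops(m, l, t_s, t_h, t_w)
    return (full - simplified_cost(m, t_s, t_w)) / full


# Hypothetical values: 1000 words, 10 hops, start-up 100, per-hop 1, per-word 1.
print(relative_error(1000, 10, 100, 1, 1))
```

With these values the dropped per-hop term accounts for under one percent of the total.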
