
CS 3006: Parallel Architectures

The document provides an overview of parallel architectures, detailing Flynn's Taxonomy which classifies computer architectures into SISD, SIMD, MISD, and MIMD. It discusses various processor architectures including Symmetric Multiprocessors (SMP), Non-Uniform Memory Access (NUMA), and distributed systems like clusters and grids. Additionally, it covers cloud computing and supercomputers, emphasizing their roles in high-performance computing and resource management.

Parallel Architectures

(CS 3006)

Dr. Muhammad Mateen Yaqoob

Department of AI & DS,


National University of Computer & Emerging Sciences,
Islamabad Campus
Flynn’s Taxonomy
• A specific classification of computing architectures

• Proposed by Michael Flynn (Stanford, 1966)


– Classifies computer systems by the number of
instruction streams and the number of data streams
they process, known as Flynn’s Taxonomy
Flynn’s Taxonomy
1. Single Instruction, Single Data Stream – SISD
• Single processor
• Single instruction stream
• Data stored in single memory
• Deterministic execution

2. Single Instruction, Multiple Data Stream - SIMD
• A parallel processor
• Single instruction: all processing units execute the same
instruction at any given clock cycle
• Multiple data: each processing unit can operate on a
different data element
• Large number of processing elements (with local
memory)
• Examples: GPUs, etc.
2. Single Instruction, Multiple Data Stream - SIMD

Slide Credit: Alex Klimovitski & Dean Macri, Intel Corporation
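The SIMD idea can be sketched in Python with NumPy (an illustration added here, not part of the original slides): one vectorized operation is applied to many data elements at once, and NumPy's compiled loops typically map onto hardware SIMD instructions such as SSE/AVX where the CPU supports them.

```python
import numpy as np

# One (vectorized) instruction applied to many data elements at once.
# NumPy's compiled ufunc loops typically use hardware SIMD instructions
# (e.g. SSE/AVX) where the CPU supports them.
a = np.arange(8, dtype=np.float32)      # eight data elements: 0..7
b = np.full(8, 10, dtype=np.float32)    # eight copies of 10
c = a + b                               # one add applied across all elements
print(c.tolist())                       # values 10.0 through 17.0
```

Each of the eight additions is the "same instruction" of the SIMD definition; the eight array slots are the "multiple data".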


3. Multiple Instruction, Single Data Stream - MISD
• A single sequence of data
• Transmitted to a set of processors
• Each processor executes a different instruction
sequence, using the same data
• Examples: pipelined vector processors, etc.
4. Multiple Instruction, Multiple Data Stream - MIMD

• Most common parallel processor architecture


• Processors simultaneously execute different instructions
on different sets of data
• Examples: multi-cores, SMPs, clusters, grids, clouds
MIMD - Overview
• General purpose processors
• Each can process all instructions necessary
• Further classified by method of processor
communication:
1. Shared Memory
2. Distributed Memory
Taxonomy of Processor Architectures
Symmetric Multiprocessor (SMP)
• Processors share memory (tightly coupled)
• Communicate via shared memory (single bus)
• Same memory access time (any memory region, from
any processor)
• Processors share I/O address space too
SMP Advantages
• Performance
– work can be done in parallel

• Availability
– Failure of a single processor does not halt system

• Incremental growth
– Adding additional processors enhances performance

• Scaling
– Range of products based on number of processors
Symmetric Multiprocessor Organization
Multithreading and Chip Multiprocessors
• Instruction stream divided into smaller streams
called “threads”

• Executed in parallel
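A minimal sketch of the shared-memory threading model (Python's threading module is used purely for illustration): both threads read and write the same list directly, with no explicit data transfer between them.

```python
import threading

# Threads within one process share a single address space, so both workers
# update the same list directly -- no explicit data transfer is needed.
counts = [0, 0]

def worker(idx, n):
    for _ in range(n):
        counts[idx] += 1   # shared memory: each thread writes its own slot

threads = [threading.Thread(target=worker, args=(i, 100_000)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counts)   # [100000, 100000]
```

Giving each thread its own slot avoids a data race; if both threads incremented the same element, the unsynchronized read-modify-write would require a lock.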
Taxonomy of Processor Architectures
Tightly Coupled - NUMA
• Non-Uniform Memory Access (NUMA)
– Access times to different regions of memory differ
SunFire X4600M2 NUMA machine
Non-uniform Memory Access (NUMA)
• Non-uniform memory access
– All processors have access to all parts of memory
– Access time of processor differs depending on
memory region
– Different processors access different regions of
memory at different speeds

• Cache-coherent NUMA (cc-NUMA)


– Cache coherence is maintained among the caches
of the various processors
Motivation (Why NUMA)
• SMP has a practical limit on the number of processors
– Bus traffic limits it to between 16 and 64 processors

• In clusters, each node has its own memory:
– Applications do not see a large global memory
– Coherence is maintained by software, not hardware

• NUMA retains the SMP flavour while enabling large-scale
multiprocessing
CC-NUMA Organization
CC-NUMA Operation
• Each processor has own L1 and L2 cache
• Each node has own main memory
• Nodes connected by some networking facility
• Each processor sees single addressable memory
• Hardware support for read/write to non-local
memories, cache coherency

• Memory request order:
1. L1 cache → L2 cache (local to processor)
2. Main memory (local to node)
3. Remote memory (remote node)
NUMA Pros & Cons
• Effective performance at higher levels of parallelism
than SMP

• No major software changes required

• Performance can break down if there is too much
access to remote memory
Distributed Memory / Message Passing
• Each processor has access to its own memory only

• Data transfer between processors is explicit (via
message-passing functions), e.g., the MPI library

• User has complete control of, and responsibility for, data
placement and management

[Figure: CPU-memory pairs connected by an interconnection network]


Hybrid Systems
• Distributed memory system with multiprocessor
shared memory nodes

• Most common parallel architecture

[Figure: several shared-memory nodes, each with multiple CPUs and a local
memory, attached via network interfaces to an interconnection network]
Taxonomy of Processor Architectures
Distributed Computing
• Using distributed systems to solve large
problems

• Paradigms:
– Cluster computing
– Grid computing
– Cloud computing
Cluster Computing
Clusters - Loosely Coupled
• Collection of independent uni-processor systems or
SMPs
• Interconnected to form a cluster

• Communication via fixed path or network connections

• Not a single shared memory


Introduction to Clusters
• Alternative to SMP
• High performance
• High availability
• A group of interconnected whole computers
• Working together as unified resource
• Illusion of being one big machine
• Each computer called a node
Cluster Benefits
• Scalability
• Superior price/performance ratio
Cluster System Architecture
Cluster Middleware
• Unified image to user
– Single system image
• Single point of entry
• Single file hierarchy
• Single job management system
• Single user interface
• Single I/O space
Cluster vs. SMP
• Both provide multiprocessor support

• SMPs:
– Easier to manage and control
– Closer to single processor systems:
• Scheduling is main difference
• Less physical space required
• Lower power consumption
Cluster vs. SMP
• Clustering:
– Superior incremental scalability
– Superior availability
• Redundancy
Grid Computing
Grid Computing
• Heterogeneous computers over the whole world
providing CPU power and data storage capacity

• Applications can be executed at several locations

• Geographically distributed services

• Coordinated access to shared resources, in contrast to
centralized control

• Uses standard, open, general-purpose protocols and
interfaces

Credits: Grid Computing by Camiel Plevier
Grid Architecture

Autonomous, globally distributed computers/clusters


A typical view of a Grid environment
Grid Information Service
• The Grid Information Service collects the details of the
available Grid resources and passes this information to the
Resource Broker.
• A user submits a computation- or data-intensive application
to the Grid.
• The Resource Broker distributes the jobs in the application to
Grid resources (cluster, PC, supercomputer, database,
instruments, etc.) based on the user’s QoS requirements and
the available Grid resources.
• The Grid resources process the jobs and return the
computation results to the user.
Cloud Computing
What is Cloud Computing?
• Cloud computing is network-based computing that takes
place over the Internet:
– a collection of integrated and networked hardware,
software, and Internet infrastructure (called a platform)

• Hides the complexity and details of the underlying
infrastructure
What is Cloud Computing?
• On-demand services that are always on, available
anywhere, anytime, and from any place

• Pay per use, as needed

• Elastic: scale up and down (capacity and functionality)

• Shared pool of configurable computing resources
Service Models
Cloud Service Models

Adopted from: Effectively and Securely Using the Cloud Computing Paradigm by Peter Mell, Tim Grance
Figure source: https://dachou.github.io/assets/20110326-cloudmodels.png
Cloud Providers
SuperComputers
What is a SuperComputer?
• Typical definition*: a computer that leads the world in
terms of processing capacity and speed of calculation at the
time of its introduction
– Computer speed is measured in FLoating-point
Operations Per Second (FLOPS)

– Currently the LINPACK benchmark is officially used to
determine a computer's speed:
http://www.netlib.org/benchmark/hpl

– Top 500 SuperComputers: a ranked list of general-purpose
systems that are in common use for high-end applications

*https://home.chpc.utah.edu/~thorne/computing/L13_Supercomputing_Part1.pdf
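As a worked example of how FLOPS figures arise (the machine parameters below are hypothetical, not taken from the TOP500 list): theoretical peak performance is the product of the socket count, cores per socket, clock rate, and floating-point operations completed per cycle.

```python
# Theoretical peak FLOPS of a machine (hypothetical figures for illustration):
# peak = sockets * cores_per_socket * clock_rate * flops_per_cycle
sockets = 2
cores_per_socket = 32
clock_hz = 2.5e9          # 2.5 GHz
flops_per_cycle = 16      # e.g. two AVX-512 FMA units per core (assumed)
peak_flops = sockets * cores_per_socket * clock_hz * flops_per_cycle
print(f"{peak_flops / 1e12:.2f} TFLOPS")   # 2.56 TFLOPS
```

Sustained LINPACK performance (Rmax in the TOP500 list) is always below this theoretical peak (Rpeak), since real workloads stall on memory and communication.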
Top 5 of the list (Nov. 2021)
Top 500 SuperComputers - Nov. 2021
Any Questions?
