
CS 3006: Parallel Architectures

The document provides an overview of parallel architectures, detailing Flynn's Taxonomy which classifies computer architectures into SISD, SIMD, MISD, and MIMD. It discusses various processor architectures including Symmetric Multiprocessors (SMP), Non-Uniform Memory Access (NUMA), and distributed systems like clusters and grids. Additionally, it covers cloud computing and supercomputers, emphasizing their roles in high-performance computing and resource management.

Parallel Architectures

(CS 3006)

Dr. Muhammad Mateen Yaqoob

Department of AI & DS,


National University of Computer & Emerging Sciences,
Islamabad Campus
Flynn’s Taxonomy
• A specific classification of computing architectures

• Proposed by Michael Flynn (Stanford, 1966)


– Classifies computer systems by the number of
instruction streams and the number of data streams
they process, known as Flynn’s Taxonomy
Flynn’s Taxonomy
1. Single Instruction, Single Data Stream – SISD
• Single processor
• Single instruction stream
• Data stored in single memory
• Deterministic execution

2. Single Instruction, Multiple Data Stream - SIMD
• A parallel processor
• Single instruction: all processing units execute the same
instruction at any given clock cycle
• Multiple data: each processing unit can operate on a
different data element
• Large number of processing elements (with local
memory)
• Examples: GPUs, etc.
2. Single Instruction, Multiple Data Stream - SIMD

Slide Credit: Alex Klimovitski & Dean Macri, Intel Corporation
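The SIMD idea can be sketched in Python with NumPy (an illustration added here, not part of the original slides): one vectorized operation is applied to many data elements at once, and NumPy's compiled loops typically map onto hardware SIMD instructions such as SSE/AVX where the CPU supports them.

```python
import numpy as np

# One (vectorized) instruction applied to many data elements at once.
# NumPy's compiled ufunc loops typically use hardware SIMD instructions
# (e.g. SSE/AVX) where the CPU supports them.
a = np.arange(8, dtype=np.float32)      # eight data elements: 0..7
b = np.full(8, 10, dtype=np.float32)    # eight copies of 10
c = a + b                               # one add applied across all elements
print(c.tolist())                       # values 10.0 through 17.0
```

Each of the eight additions is the "same instruction" of the SIMD definition; the eight array slots are the "multiple data".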


3. Multiple Instruction, Single Data Stream - MISD
• A single sequence of data
• Transmitted to a set of processors
• Each processor executes a different instruction
sequence, using the same data
• Examples: pipelined vector processors, etc.
4. Multiple Instruction, Multiple Data Stream - MIMD

• Most common parallel processor architecture


• Processors simultaneously execute different instructions
on different sets of data
• Examples: multi-cores, SMPs, clusters, grids, clouds
MIMD - Overview
• General purpose processors
• Each can process all instructions necessary
• Further classified by method of processor
communication:
1. Shared Memory
2. Distributed Memory
Taxonomy of Processor Architectures
Symmetric Multiprocessor (SMP)
• Processors share memory (tightly coupled)
• Communicate via shared memory (single bus)
• Same memory access time (any memory region, from
any processor)
• Processors share I/O address space too
SMP Advantages
• Performance
– work can be done in parallel

• Availability
– Failure of a single processor does not halt system

• Incremental growth
– Adding additional processors enhances performance

• Scaling
– Range of products based on number of processors
Symmetric Multiprocessor Organization
Multithreading and Chip Multiprocessors
• Instruction stream divided into smaller streams
called “threads”

• Executed in parallel
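A minimal sketch of the shared-memory threading model (Python's threading module is used purely for illustration): both threads read and write the same list directly, with no explicit data transfer between them.

```python
import threading

# Threads within one process share a single address space, so both workers
# update the same list directly -- no explicit data transfer is needed.
counts = [0, 0]

def worker(idx, n):
    for _ in range(n):
        counts[idx] += 1   # shared memory: each thread writes its own slot

threads = [threading.Thread(target=worker, args=(i, 100_000)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counts)   # [100000, 100000]
```

Giving each thread its own slot avoids a data race; if both threads incremented the same element, the unsynchronized read-modify-write would require a lock.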
Taxonomy of Processor Architectures
Tightly Coupled - NUMA
• Non-Uniform Memory Access (NUMA)
– Access times to different regions of memory differ
SunFire X4600M2 NUMA machine
Non-uniform Memory Access (NUMA)
• Non-uniform memory access
– All processors have access to all parts of memory
– Access time of processor differs depending on
memory region
– Different processors access different regions of
memory at different speeds

• Cache-coherent NUMA (cc-NUMA)


– Cache coherence is maintained among the caches
of the various processors
Motivation (Why NUMA)
• SMP has a practical limit on the number of processors
– Bus traffic limits it to between 16 and 64 processors

• In clusters, each node has its own memory:
– Applications do not see a large global memory
– Coherence is maintained by software, not hardware

• NUMA retains the SMP flavour while enabling large-scale
multiprocessing
CC-NUMA Organization
CC-NUMA Operation
• Each processor has own L1 and L2 cache
• Each node has own main memory
• Nodes connected by some networking facility
• Each processor sees single addressable memory
• Hardware support for read/write to non-local
memories, cache coherency

• Memory request order:
1. L1 cache → L2 cache (local to processor)
2. Main memory (local to node)
3. Remote memory (remote node)
NUMA Pros & Cons
• Effective performance at higher levels of parallelism
than SMP

• No major software changes required

• Performance can break down if there is too much
access to remote memory
Distributed Memory / Message Passing
• Each processor has access to its own memory only

• Data transfer between processors is explicit (via
message-passing functions), e.g., the MPI library

• User has complete control of, and responsibility for, data
placement and management

[Figure: CPU-memory pairs connected by an interconnection network]


Hybrid Systems
• Distributed memory system with multiprocessor
shared memory nodes

• Most common parallel architecture

[Figure: several shared-memory nodes, each with multiple CPUs and a local
memory, attached via network interfaces to an interconnection network]
Taxonomy of Processor Architectures
Distributed Computing
• Using distributed systems to solve large
problems

• Paradigms:
– Cluster computing
– Grid computing
– Cloud computing
Cluster Computing
Clusters - Loosely Coupled
• Collection of independent uni-processor systems or
SMPs
• Interconnected to form a cluster

• Communication via fixed path or network connections

• Not a single shared memory


Introduction to Clusters
• Alternative to SMP
• High performance
• High availability
• A group of interconnected whole computers
• Working together as unified resource
• Illusion of being one big machine
• Each computer called a node
Cluster Benefits
• Scalability
• Superior price/performance ratio
Cluster System Architecture
Cluster Middleware
• Unified image to user
– Single system image
• Single point of entry
• Single file hierarchy
• Single job management system
• Single user interface
• Single I/O space
Cluster vs. SMP
• Both provide multiprocessor support

• SMPs:
– Easier to manage and control
– Closer to single processor systems:
• Scheduling is main difference
• Less physical space required
• Lower power consumption
Cluster vs. SMP
• Clustering:
– Superior incremental scalability
– Superior availability
• Redundancy
Grid Computing
Grid Computing
• Heterogeneous computers over the whole world
providing CPU power and data storage capacity

• Applications can be executed at several locations

• Geographically distributed services

• Coordinated access to shared resources, in contrast to
centralized control

• Uses standard, open, general-purpose protocols and
interfaces

Credits: Grid Computing by Camiel Plevier
Grid Architecture

Autonomous, globally distributed computers/clusters


A typical view of a Grid environment
Grid Information Service
• The Grid Information Service collects the details of the
available Grid resources and passes this information to the
Resource Broker.
• A user submits a computation- or data-intensive application
to the Grid.
• The Resource Broker distributes the jobs in the application to
Grid resources (cluster, PC, supercomputer, database,
instruments, etc.) based on the user’s QoS requirements and
the available Grid resources.
• The Grid resources process the jobs and return the
computation results to the user.
Cloud Computing
What is Cloud Computing?
• Cloud computing is network-based computing that takes
place over the Internet:
– a collection of integrated and networked hardware,
software, and Internet infrastructure (called a platform)

• Hides the complexity and details of the underlying
infrastructure
What is Cloud Computing?
• On-demand services that are always on, available
anywhere, anytime, and from any place

• Pay per use, as needed

• Elastic: scale up and down (capacity and functionality)

• Shared pool of configurable computing resources
Service Models
Cloud Service Models

Adopted from: Effectively and Securely Using the Cloud Computing Paradigm by Peter Mell, Tim Grance
Figure source: https://dachou.github.io/assets/20110326-cloudmodels.png
Cloud Providers
SuperComputers
What is a SuperComputer?
• Typical definition*: a computer that leads the world in
terms of processing capacity and speed of calculation at the
time of its introduction
– Computer speed is measured in FLoating-point
Operations Per Second (FLOPS)

– Currently the LINPACK benchmark is officially used to
determine a computer's speed:
http://www.netlib.org/benchmark/hpl

– Top 500 SuperComputers: a ranked list of general-purpose
systems that are in common use for high-end applications

*https://home.chpc.utah.edu/~thorne/computing/L13_Supercomputing_Part1.pdf
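As a worked example of how FLOPS figures arise (the machine parameters below are hypothetical, not taken from the TOP500 list): theoretical peak performance is the product of the socket count, cores per socket, clock rate, and floating-point operations completed per cycle.

```python
# Theoretical peak FLOPS of a machine (hypothetical figures for illustration):
# peak = sockets * cores_per_socket * clock_rate * flops_per_cycle
sockets = 2
cores_per_socket = 32
clock_hz = 2.5e9          # 2.5 GHz
flops_per_cycle = 16      # e.g. two AVX-512 FMA units per core (assumed)
peak_flops = sockets * cores_per_socket * clock_hz * flops_per_cycle
print(f"{peak_flops / 1e12:.2f} TFLOPS")   # 2.56 TFLOPS
```

Sustained LINPACK performance (Rmax in the TOP500 list) is always below this theoretical peak (Rpeak), since real workloads stall on memory and communication.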
Top 5 of the list (Nov. 2021)
Top 500 SuperComputers - Nov. 2021
Any Questions?
