
Parallel and Distributed Computing

CS 3006 (BCS-7A | BDS-7A)


Lecture 3
Danyal Farhat
FAST School of Computing
NUCES Lahore
Flynn’s Classical Taxonomy
and Processor-to-Memory
Connection Strategies
Hardware Architecture Classifications
• Flynn’s Classification
Differentiates multiprocessor computers according to the dimensions of
instruction and data streams

• Feng’s Classification
Based mainly on the degree of serial versus parallel processing in the
computer system

• Handler’s Classification
Based on the degree of parallelism and pipelining at various levels of
the system
Flynn’s Classical Taxonomy
• The most widely used parallel computer classification
• Differentiates multiprocessor computers according to the
dimensions of instruction and data streams
Instruction stream: sequence of instructions flowing from memory to the control unit
Data stream: sequence of data flowing between memory and the processing unit
• SISD: Single Instruction stream, Single Data stream
• SIMD: Single Instruction stream, Multiple Data streams
• MISD: Multiple Instruction streams, Single Data stream
• MIMD: Multiple Instruction streams, Multiple Data streams
Processor Organizations
SISD
• A serial (non-parallel) computer
• Single instruction: only one instruction stream is acted on per clock cycle
• Single data: only one data stream is used as input per clock cycle
• Simple, deterministic execution
Example:
• Single-CPU workstations
• Most workstations from HP, IBM and SGI are SISD
machines
SISD (Cont.)
• Performance of a processor can be measured as:
MIPS rate = f × IPC
Million instructions per second (MIPS) is an approximate measure of a
computer's raw processing power
f = clock frequency of the processor (in MHz); IPC = average number of
instructions completed per cycle
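For example, a processor clocked at 2 GHz (f = 2000 MHz) that completes an
average of 1.5 instructions per cycle delivers 2000 × 1.5 = 3000 MIPS.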
How to increase the performance of a uniprocessor?
• Multithreading
• Increasing the clock frequency
• Increasing the number of instructions completed during a
processor cycle (multiple pipelines in a superscalar
architecture and/or out-of-order execution)
SISD – Multithreading
• Run multiple threads on the same core concurrently
• Context switching is implemented in hardware
• Minimum hardware support: replicate the architectural state
Each running thread must have its own context
Multiple register sets in the core
Multiple state registers, for example:
– Program Counter (PC)
– Memory Address Register (MAR)
– Accumulator Register (ACC)
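As a rough conceptual sketch (in C++), the replicated state can be pictured as
one context structure per hardware thread. The register names follow the list
above; the field widths, register-file size, and two-way core are illustrative
assumptions, not a real ISA.

    #include <cstdint>

    // Conceptual sketch only: the architectural state a core replicates
    // per hardware thread. Field widths and register count are assumed.
    struct ThreadContext {
        uint64_t pc;        // Program Counter: address of the next instruction
        uint64_t mar;       // Memory Address Register: address being accessed
        uint64_t acc;       // Accumulator Register: current working value
        uint64_t gpr[16];   // general-purpose register set (size assumed)
    };

    // A two-way multithreaded core keeps one context per hardware thread;
    // a hardware context switch just changes which context issues instructions.
    struct Core {
        ThreadContext hw_thread[2];
        int active;         // index of the thread currently issuing
    };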
SISD – Multithreading (Cont.)
Implicit Multithreading
• Concurrent execution of multiple threads extracted from a
single sequential program
• Managed by the processor hardware
• Improves individual application performance
Explicit Multithreading
• Concurrent execution of instructions from different explicit
threads, either by interleaving instructions from different
threads or by parallel execution on parallel pipelines
SISD-Explicit Multithreading
• Four approaches to explicit multithreading:
Interleaved multithreading (fine-grained): switching can occur at each
clock cycle; with few active threads, performance degrades
Blocked multithreading (coarse-grained): events such as a cache miss
trigger a switch
Simultaneous multithreading (SMT): the execution units of a superscalar
processor receive instructions from multiple threads
Chip multiprocessing: e.g. a dual-core processor (no longer SISD)
• Very Long Instruction Word (VLIW) architectures such as IA-64
pack multiple instructions (to be executed in parallel) into a
single word
SISD-Explicit Multithreading (Cont.)
Interleaved Multithreading (fine-grained):
• Fetch instructions from different threads in consecutive cycles
• In every clock cycle, an instruction is fetched for a (possibly
different) thread, so switching occurs at every clock cycle
• With few active threads, performance degrades
SISD-Explicit Multithreading (Cont.)
Blocked Multithreading (coarse-grained):
• Another thread is started when the current thread blocks
• The processor switches to a different thread when a
long-latency event occurs, such as an L2 cache miss or
waiting for I/O
SISD-Explicit Multithreading (Cont.)
Simultaneous Multithreading (SMT):
• Fetch instructions from different threads in a single cycle
• The execution units of a superscalar processor receive
instructions from multiple threads
• A superscalar processor is a CPU that implements a form of
parallelism called instruction-level parallelism within a single
processor
Intel’s Hyper Threading Technology
• A single physical processor appears as two logical processors
by applying a two-threaded SMT approach
Example: Intel Pentium 4 in 2002
• Each logical processor maintains a complete set of architecture
state (general-purpose registers, control registers, …)
• Logical processors share nearly all other resources, such as
caches, execution units, branch predictors, control logic, and
buses
Intel’s Hyper Threading Technology (Cont.)
• Partitioned resources are recombined when only one thread is
active
• Adds less than 5% to the relative chip size
• Improves performance by 16% to 28%
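To observe the effect from software, the short C++ sketch below prints how
many hardware threads the operating system exposes; on a hyper-threaded
machine this is typically twice the number of physical cores.
std::thread::hardware_concurrency() is standard C++11, but it may return 0
when the count cannot be determined.

    #include <iostream>
    #include <thread>

    int main() {
        // Number of concurrent hardware threads the system reports;
        // with Hyper-Threading this counts logical, not physical, processors.
        unsigned n = std::thread::hardware_concurrency();
        if (n == 0)
            std::cout << "Hardware thread count unknown\n";  // 0 is a legal return value
        else
            std::cout << "Logical processors visible to software: " << n << "\n";
        return 0;
    }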
SIMD
• Homogeneous processing units / processing elements (PEs)
• Single instruction: all processing units execute the same
instruction at any given time
• Multiple data: each processing unit can operate on a different
data set
Example: Add A and B, C and D, X and Z in parallel
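To make the example concrete, here is a minimal C++ sketch using x86 SSE
intrinsics: a single add instruction (_mm_add_ps) performs four additions at
once, which is exactly the single-instruction, multiple-data pattern. It
assumes an x86 CPU with SSE support; the array contents are illustrative.

    #include <immintrin.h>  // x86 SSE intrinsics
    #include <cstdio>

    int main() {
        // Four independent data lanes, one instruction stream.
        float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
        float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
        float c[4];

        __m128 va = _mm_loadu_ps(a);     // load four floats
        __m128 vb = _mm_loadu_ps(b);
        __m128 vc = _mm_add_ps(va, vb);  // ONE instruction, FOUR additions
        _mm_storeu_ps(c, vc);

        for (int i = 0; i < 4; ++i)
            std::printf("c[%d] = %g\n", i, c[i]);
        return 0;
    }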
SIMD (Cont.)
• Each processing element has an associated data memory
So that each instruction is executed on a different set of data by the
different processors
• Used by vector and array processors
Suitable for vector and matrix calculations
• Vector processors act on arrays of similar data (only when
executing in vector mode), and in that mode they are several
times faster than when executing in scalar mode
Example: NEC SX-8 processors run at 2 GHz for vector
operations and 1 GHz for scalar operations
SIMD - Example
• A good example is the processing of pixels on screen
• A sequential processor would examine each pixel one at a time
and apply the processing instruction
• An array or vector processor can process all the elements of an
array simultaneously
• Game consoles and graphics cards make heavy use of such
processors to shift those pixels
• Such designs are usually dedicated to a particular application
and not commonly marketed for general purpose computing
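As a hedged sketch of this idea (not any particular console or GPU API), the
C++ loop below brightens every pixel of a grayscale image. Each iteration is
independent and applies the same operation, so a vectorizing compiler can map
it onto SIMD instructions; the #pragma omp simd hint assumes the code is
built with OpenMP support.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Brighten a grayscale image: the same add-and-clamp is applied to
    // every pixel, and no pixel depends on another -- ideal SIMD work.
    void brighten(std::vector<std::uint8_t>& pixels, int amount) {
        #pragma omp simd
        for (std::size_t i = 0; i < pixels.size(); ++i) {
            int v = pixels[i] + amount;
            pixels[i] = static_cast<std::uint8_t>(v > 255 ? 255 : v);
        }
    }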
MISD
• A single data stream is transmitted to a set of processors, each
of which executes a different instruction sequence
• Each processing unit operates on the data independently via
independent instruction stream
• This structure is not commercially implemented
• An example of use could be multiple cryptography algorithms
attempting to crack a coded message
MISD (Cont.)
• Example: three processors execute three different instructions on the
same data set
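True MISD hardware is essentially nonexistent, but the pattern can be mimicked
in software. In the hedged C++ sketch below, three threads each apply a
different (made-up placeholder) operation to the same shared input, echoing
the code-cracking example above.

    #include <iostream>
    #include <thread>

    int main() {
        const int shared_data = 42;   // the single data stream
        int r1 = 0, r2 = 0, r3 = 0;   // one result per instruction stream

        // Three different "instruction streams" reading the SAME data item.
        std::thread t1([&] { r1 = -shared_data; });               // negate
        std::thread t2([&] { r2 = shared_data * shared_data; });  // square
        std::thread t3([&] { r3 = shared_data / 2; });            // halve

        t1.join(); t2.join(); t3.join();
        std::cout << r1 << " " << r2 << " " << r3 << "\n";  // -42 1764 21
        return 0;
    }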
MIMD
• Multiple instruction: Every processor may execute a different
instruction stream
• Multiple data: Every processor may work with a different data
stream
Examples:
• Most of the current supercomputers
• Grid computers
• Networked parallel computers
• Symmetric Multiprocessor (SMP) computers
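A hedged C++ sketch of the MIMD pattern: each thread below runs its own
instruction stream on its own data, which is how ordinary multithreaded
programs behave on a multicore CPU. The data and operations are illustrative.

    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<int> data_a = {1, 2, 3, 4};  // data stream for thread 1
        std::vector<int> data_b = {5, 6, 7, 8};  // data stream for thread 2
        long sum = 0, product = 1;

        // Different instruction streams (sum vs. product) on different data.
        std::thread t1([&] { sum = std::accumulate(data_a.begin(), data_a.end(), 0L); });
        std::thread t2([&] { for (int x : data_b) product *= x; });

        t1.join(); t2.join();
        std::cout << "sum = " << sum << ", product = " << product << "\n";
        return 0;
    }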
MIMD (Cont.)
MIMD systems are mainly:
Shared Memory (SM) Systems:
• Multiple CPUs, all of which share the same address space
(there is only one memory)
Distributed Memory (DM) Systems:
• Each CPU has its own associated memory
• CPUs are connected by some network (clusters)
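Distributed-memory MIMD systems are usually programmed with message passing.
As a sketch, the classic MPI hello-world below gives each process its own rank
and private address space; it assumes an MPI implementation (e.g. Open MPI or
MPICH) is installed and the program is launched with mpirun.

    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);                // start the MPI runtime

        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // this process's id
        MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes

        // Each process owns its memory; data moves only via explicit messages.
        std::printf("Process %d of %d, each with a private address space\n",
                    rank, size);

        MPI_Finalize();
        return 0;
    }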
MIMD - Shared Memory
• All processors have access to all memory as a global address
space
Uniform Memory Access (UMA)
• The data access time from any processing unit to the shared
memory is constant
• Mostly represented by Symmetric Multiprocessor (SMP)
machines
Non-Uniform Memory Access (NUMA)
• The data access time from a processing unit to the shared
memory is not constant (it depends on which part of memory
is accessed)
Shared Memory Interconnection Network
• The main problem is how to interconnect the CPUs with each
other and with the memory
There are three main network topologies available:
• Crossbar: n² connections; each datapath is dedicated (no
sharing)
• Ω-network (Omega network): n·log₂ n connections; log₂ n
switching stages, with sharing along a path
• Central data bus: 1 connection, shared by all n processors
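To compare the wiring costs, the small C++ sketch below tabulates connection
counts for a few processor counts n, using the formulas from this slide
(n² for a crossbar, n·log₂ n for an Ω-network, a single shared link for a bus).

    #include <cmath>
    #include <cstdio>

    int main() {
        std::printf("%6s %10s %10s %6s\n", "n", "crossbar", "omega", "bus");
        for (int n = 4; n <= 64; n *= 2) {
            int crossbar = n * n;                                         // n^2 dedicated paths
            int omega = n * static_cast<int>(std::lround(std::log2(n)));  // n * log2(n) links
            std::printf("%6d %10d %10d %6d\n", n, crossbar, omega, 1);
        }
        return 0;
    }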
Thank You!
