Parallel Computer Architecture Classification

Flynn's taxonomy classifies computer architectures based on the number of instruction and data streams. The main categories are SISD, SIMD, MIMD, and MISD. Modern classifications focus on how parallelism is achieved, through data or function parallelism at different levels. Performance is measured in metrics like MIPS and MFLOPS, with peak and sustained performance distinguishing maximum and achievable speeds.

Uploaded by

Denis
Copyright
© © All Rights Reserved

Parallel computer architecture classification
Hardware Parallelism
• Computing: execute instructions that operate on data.
[Figure: a computer, fed by instructions and data]

• Flynn’s taxonomy (Michael Flynn, 1966) classifies computer architectures by the number of concurrent instruction streams and the number of data streams they operate on.
Flynn’s taxonomy
• Single Instruction Single Data (SISD)
– Traditional sequential computing systems
• Single Instruction Multiple Data (SIMD)
• Multiple Instructions Multiple Data (MIMD)
• Multiple Instructions Single Data (MISD)

[Figure: computer architectures divided into SISD, SIMD, MIMD, and MISD]


SISD
• At any one time, one instruction operates on one data item
• Traditional sequential architecture
SIMD
• At any one time, one instruction operates on many data items
– Data parallel architecture
– Vector architectures have similar characteristics, but achieve the parallelism with pipelining.
• Array processors
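Array processor (SIMD)
[Figure: array processor. One instruction pointer (IP), memory address register (MAR), memory data register (MDR), and decoder (OP, ADDR) drive N ALUs; ALU i works on its own operands Ai, Bi, Ci in memory.]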
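As a rough illustration of the array-processor idea, the Python sketch below simulates N ALUs that all execute the same decoded instruction, each on its own operand pair. The names `OPS`, `array_processor_step`, and `sisd_add` are hypothetical and exist only for this sketch; real SIMD hardware performs the step in parallel hardware lanes, whereas the list comprehension here only models it.

```python
import operator

# Hypothetical instruction set for this sketch: opcode -> ALU operation.
OPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}

def array_processor_step(opcode, a, b):
    """Simulate one SIMD step: the opcode is decoded once and broadcast
    to every ALU; ALU i computes C_i = A_i op B_i."""
    if len(a) != len(b):
        raise ValueError("operand vectors must have the same length")
    alu_op = OPS[opcode]  # single decode, shared by all lanes
    # Conceptually all lanes run at the same time; Python evaluates them in turn.
    return [alu_op(x, y) for x, y in zip(a, b)]

def sisd_add(a, b):
    """SISD contrast: a separate ADD is issued for each data item."""
    c = []
    for i in range(len(a)):
        c.append(a[i] + b[i])
    return c

if __name__ == "__main__":
    A = [1, 2, 3, 4]
    B = [10, 20, 30, 40]
    print(array_processor_step("ADD", A, B))
    print(sisd_add(A, B))
```

Both functions produce the same result; the difference the sketch tries to show is that the SIMD version needs one instruction for the whole vector, while the SISD version needs one per element.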


MIMD
• Multiple instruction streams operating on multiple data streams
– Classical distributed memory or SMP architectures
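The MIMD pattern can be sketched in Python as independent tasks, each running a different function on different data. This is only a conceptual model: the function names (`total`, `longest_word`, `run_mimd`) are hypothetical, and CPython threads share one interpreter lock, so the example shows the programming pattern rather than true hardware parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

def total(values):
    """Instruction stream 1: sum a list of numbers."""
    return sum(values)

def longest_word(words):
    """Instruction stream 2: find the longest string."""
    return max(words, key=len)

def run_mimd(tasks):
    """Run each (function, data) pair as an independent task.
    Different functions on different data is the MIMD pattern."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(fn, data) for fn, data in tasks]
        return [f.result() for f in futures]  # results in submission order

if __name__ == "__main__":
    results = run_mimd([
        (total, [1, 2, 3, 4]),
        (longest_word, ["cpu", "cluster", "node"]),
    ])
    print(results)
```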
MISD machine
• Not commonly seen.
• A systolic array is one example of an MISD architecture.
Flynn’s taxonomy summary
• SISD: traditional sequential architecture
• SIMD: processor arrays, vector processors
– Parallel computing on a budget: reduced control unit cost
– Many early supercomputers
• MIMD: most general purpose parallel computers today
– Clusters, MPP, data centers
• MISD: not a general purpose architecture.
Flynn’s classification on today’s architectures

• Multicore processors
• Superscalar: pipelined + multiple issue
• SSE (Intel and AMD’s support for performing operations on 2 doubles or 4 floats simultaneously)
• GPU: CUDA architecture
• IBM BlueGene
Modern classification
(Sima, Fountain, Kacsuk)
• Classify based on how parallelism is achieved
– by operating on multiple data: data parallelism
– by performing many functions in parallel: function parallelism
• Also called control parallelism or task parallelism, depending on the level of the function parallelism.

[Figure: parallel architectures divided into data-parallel architectures and function-parallel architectures]
Data parallel architectures
• Vector processors, SIMD (array processors), systolic arrays.
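Vector processor (pipelining)
[Figure: vector processor. One instruction pointer (IP), memory address register (MAR), memory data register (MDR), and decoder (OP, ADDR) feed a single pipelined ALU that streams through the operand vectors A, B, C in memory.]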
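To make the pipelining idea concrete, the sketch below counts cycles for a vector operation under simplifying assumptions that are not in the slides: every pipeline stage takes one cycle, a new element enters the pipeline each cycle, and there are no stalls. The function names are hypothetical.

```python
def pipelined_cycles(n_elements, n_stages):
    """Cycles for a pipelined ALU: the first result appears after
    n_stages cycles, then one more result completes every cycle."""
    if n_elements == 0:
        return 0
    return n_stages + n_elements - 1

def unpipelined_cycles(n_elements, n_stages):
    """Cycles when each element must finish all stages before the
    next element may start."""
    return n_stages * n_elements

if __name__ == "__main__":
    n, k = 64, 4  # illustrative vector length and pipeline depth
    print(pipelined_cycles(n, k), unpipelined_cycles(n, k))
```

Under these assumptions a long vector approaches one result per cycle from a single ALU, which is how a vector processor gets data parallelism without replicating ALUs the way an array processor does.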
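Data parallel architecture: Array processor
[Figure: array processor. One instruction pointer (IP), memory address register (MAR), memory data register (MDR), and decoder (OP, ADDR) drive N ALUs; ALU i works on its own operands Ai, Bi, Ci in memory.]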

Control parallel architectures
[Figure: classification tree of function-parallel architectures]
• Function-parallel architectures divide into instruction level, thread level, and process level parallel architectures
• Instruction level parallel architectures (ILPs): pipelined processors, VLIWs, superscalar processors
• MIMDs: distributed memory MIMD, shared memory MIMD
Classifying today’s architectures
• Multicore processors?
• Superscalar?
• SSE?
• GPU: CUDA architecture?
• IBM BlueGene?
Performance of parallel architectures
• Common metrics
– MIPS: million instructions per second
• MIPS = instruction count / (execution time x 10^6)
– MFLOPS: million floating point operations per second
• MFLOPS = FP ops in program / (execution time x 10^6)

• Which is a better metric?
• The FLOP count is more closely related to the running time of a task in numerical code
– The number of FLOPs in a program is determined by the matrix size
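The two formulas translate directly into code. In the sketch below, `mips` and `mflops` follow the definitions above; `matmul_flops` is an added assumption that counts a naive n x n matrix multiply as 2 n^3 floating point operations (n^3 multiplies and n^3 adds, the usual convention), and the timings in the example are made-up values, not measurements.

```python
def mips(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)

def mflops(fp_ops, execution_time_s):
    """MFLOPS = FP ops in program / (execution time x 10^6)."""
    return fp_ops / (execution_time_s * 1e6)

def matmul_flops(n):
    """Assumed FP operation count for a naive n x n matrix multiply:
    n^3 multiplications plus n^3 additions."""
    return 2 * n ** 3

if __name__ == "__main__":
    # Hypothetical run: 50 million instructions in 2 seconds.
    print(mips(50_000_000, 2.0))
    # Hypothetical run: 1000 x 1000 matrix multiply finishing in 4 seconds.
    print(mflops(matmul_flops(1000), 4.0))
```

Note how the FLOP count depends only on the matrix size, while the instruction count behind MIPS also depends on the instruction set and compiler.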
Performance of parallel architectures
• FLOPS units
– kiloFLOPS (KFLOPS) 10^3
– megaFLOPS (MFLOPS) 10^6
– gigaFLOPS (GFLOPS) 10^9 (single CPU performance)
– teraFLOPS (TFLOPS) 10^12
– petaFLOPS (PFLOPS) 10^15 (we are here right now)
» 10 petaFLOPS supercomputers
– exaFLOPS (EFLOPS) 10^18 (the next milestone)
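A small helper can express a raw FLOPS rate in the largest fitting unit from the list above. The names `UNITS` and `format_flops` are hypothetical, and the two-decimal formatting is just a choice made for this sketch.

```python
# Unit table taken from the list above, largest first.
UNITS = [
    (1e18, "EFLOPS"),
    (1e15, "PFLOPS"),
    (1e12, "TFLOPS"),
    (1e9, "GFLOPS"),
    (1e6, "MFLOPS"),
    (1e3, "KFLOPS"),
]

def format_flops(rate):
    """Return the rate (in FLOPS) scaled to the largest unit that fits."""
    for scale, name in UNITS:
        if rate >= scale:
            return f"{rate / scale:.2f} {name}"
    return f"{rate:.2f} FLOPS"

if __name__ == "__main__":
    print(format_flops(2.5e9))
    print(format_flops(1e16))
```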


Peak and sustained performance
• Peak performance
– Measured in MFLOPS
– Highest possible MFLOPS when the system does nothing but numerical computation
– A rough hardware measure
– Gives little indication of how the system will perform in practice.
Peak and sustained performance
• Sustained performance
– The MFLOPS rate that a program achieves over the entire run.
• Measuring sustained performance
– Using benchmarks
• Peak MFLOPS is usually much larger than sustained MFLOPS
– Efficiency rate = sustained MFLOPS / peak MFLOPS
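The sketch below computes a sustained rate over an entire run and the resulting efficiency rate. It assumes the run is described as a list of (FP operations, seconds) phases, which is a hypothetical input format; the phase numbers and the peak value in the example are invented for illustration.

```python
def sustained_mflops(phases):
    """Sustained MFLOPS over the whole run.
    phases: list of (fp_ops, seconds) pairs, one per phase of the run.
    The rate is total work over total time, not the mean of phase rates."""
    total_ops = sum(ops for ops, _ in phases)
    total_time = sum(seconds for _, seconds in phases)
    return total_ops / (total_time * 1e6)

def efficiency_rate(sustained, peak):
    """Efficiency rate = sustained MFLOPS / peak MFLOPS."""
    if peak <= 0:
        raise ValueError("peak MFLOPS must be positive")
    return sustained / peak

if __name__ == "__main__":
    # Hypothetical run: a fast compute phase and a slow memory-bound phase.
    run = [(8e9, 2.0), (1e9, 8.0)]
    rate = sustained_mflops(run)
    print(rate, efficiency_rate(rate, 4500.0))  # 4500 is an assumed peak
```

In this made-up run the first phase alone reaches 4000 MFLOPS, but the slow second phase pulls the sustained figure far below the peak, which is the gap the efficiency rate captures.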
Measuring the performance of parallel computers
• Benchmarks: programs that are used to measure performance.
– LINPACK benchmark: a measure of a system’s floating point computing power
• Solves a dense N by N system of linear equations Ax=b
• Used to rank supercomputers in the TOP500 list.
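The following is a minimal sketch in the spirit of LINPACK, not the benchmark code itself: it solves a random dense system Ax=b by Gaussian elimination with partial pivoting in pure Python and converts the elapsed time into MFLOPS. The operation count 2/3 N^3 + 2 N^2 is the figure conventionally used for this kind of solve; the function names, random test matrix, and seed are assumptions of the sketch.

```python
import random
import time

def solve_dense(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting.
    Works on a copy, so the caller's matrix and vector are unchanged."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for k in range(n):
        # Partial pivoting: move the largest |entry| of column k to row k.
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def linpack_style_mflops(n, seed=0):
    """Time one N x N solve; return (MFLOPS, max residual |Ax - b|)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    flops = (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return flops / (elapsed * 1e6), residual

if __name__ == "__main__":
    rate, residual = linpack_style_mflops(100)
    print(f"{rate:.1f} MFLOPS, residual {residual:.2e}")
```

The rate printed by interpreted Python will be far below what tuned numerical libraries reach on the same machine, so the number is only useful for seeing how the measurement is put together.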
Other common benchmarks
• Micro benchmark suites
– Numerical computing
• LAPACK
• ScaLAPACK
– Memory bandwidth
• STREAM
• Kernel benchmarks
– NPB (NAS Parallel Benchmarks)
– PARKBENCH
– SPEC
– SPLASH
Summary
• Flynn’s classification
– SISD, SIMD, MIMD, MISD
• Modern classification
– Data parallelism
– Function parallelism
• Instruction level, thread level, and process level
• Performance
– MIPS, MFLOPS
– Peak performance and sustained performance
References
• K. Hwang, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", McGraw Hill, 1993.
• D. Sima, T. Fountain, P. Kacsuk, "Advanced Computer Architectures: A Design Space Approach", Addison Wesley, 1997.
