Lecture 3 Flynn's Classical Taxonomy
• Feng’s Classification
Based mainly on the degree of serial versus parallel processing in the computer system
• Handler’s Classification
Based on the degree of parallelism and pipelining at various system levels
Flynn’s Classical Taxonomy
• The most widely used parallel computer classification
• Classifies multiprocessor computers along the two
dimensions of instruction streams and data streams
Instruction stream: sequence of instructions flowing from memory to the control unit
Data stream: sequence of data flowing between memory and the processing unit
• SISD: Single Instruction stream, Single Data stream
• SIMD: Single Instruction stream, Multiple Data streams
• MISD: Multiple Instruction streams, Single Data stream
• MIMD: Multiple Instruction streams, Multiple Data streams
Processor Organizations
SISD
• A serial (non-parallel) computer
• Single instruction: only one instruction stream is acted on per clock cycle
• Single data: only one data stream is used as input per clock cycle
• Execution is deterministic and easy to reason about
Example:
• Single CPU workstations
• Most workstations from HP, IBM and SGI are SISD
machines
SISD (Cont.)
• Performance of a processor can be estimated as:
MIPS rate = f × IPC
where f is the clock frequency in MHz and IPC is the average number of
instructions completed per cycle. Millions of instructions per second
(MIPS) is an approximate measure of a computer's raw processing power.
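As a quick sanity check on the formula above, a short calculation with hypothetical numbers (the 2 GHz / 1.5 IPC figures are illustrative, not from the lecture):

```python
# MIPS rate = f (in MHz) x IPC, per the formula above.
def mips_rate(freq_mhz: float, ipc: float) -> float:
    """Return the MIPS rate for a clock frequency in MHz and a given IPC."""
    return freq_mhz * ipc

# Hypothetical example: a 2 GHz (2000 MHz) processor completing
# 1.5 instructions per cycle on average.
print(mips_rate(2000, 1.5))  # 2000 MHz x 1.5 IPC = 3000 MIPS
```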
How to increase performance of uniprocessor?
• Multithreading
• Increasing clock frequency
• Increasing the number of instructions completed during a
processor cycle (multiple pipelines in a superscalar
architecture and/or out-of-order execution)
SISD – Multithreading
• Run multiple threads on the same core concurrently
• Context switch implemented in hardware
• Minimum hardware support: replicate architectural state
Each running thread must have its own context:
Multiple register sets in the core
Multiple state registers, such as:
Program Counter (PC)
Memory Address Register (MAR)
Accumulator Register (ACC)
SISD – Multithreading (Cont.)
Implicit Multithreading
• Concurrent execution of multiple threads extracted from a
single sequential program
• Managed by the processor hardware
• Improves individual application performance
Explicit Multithreading
• Concurrent execution of instructions from different explicit
threads, either by interleaving instructions from different
threads or by parallel execution on parallel pipelines
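To make "explicit threads" concrete, here is a minimal sketch using Python's `threading` module: the programmer creates the threads explicitly and the runtime/OS interleaves their execution (the thread names and the `work` function are illustrative, not from the lecture):

```python
import threading

results = {}

def work(name: str, n: int) -> None:
    # Each explicit thread runs its own instruction stream over its own data.
    results[name] = sum(range(n))

# Explicitly create, start, and join four threads.
threads = [threading.Thread(target=work, args=(f"t{i}", 1000)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # each thread contributed its own partial result
```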
SISD-Explicit Multithreading
• Four approaches for explicit multithreading
Interleaved multithreading (fine-grained): switching can occur at each
clock cycle; with few active threads, performance degrades
Blocked multithreading (coarse-grained): events like cache miss
produce switch
Simultaneous multithreading (SMT): execution units of a superscalar
processor receive instructions from multiple threads
Chip multiprocessing: e.g. dual core (not SISD)
• Architectures like Intel's IA-64 follow a Very Long Instruction
Word (VLIW)-style design that packs multiple instructions (to be
executed in parallel) into a single long word
SISD-Explicit Multithreading (Cont.)
Interleaved Multithreading (fine-grained):
• Instructions are fetched from different threads in consecutive cycles
• In every clock cycle an instruction is fetched for a different
thread, i.e., switching occurs at each clock cycle
• With few active threads, performance degrades
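A toy simulation of the fine-grained fetch policy described above, assuming a simple round-robin order over the active threads (the thread contents and opcodes are made up):

```python
# Each list is one thread's instruction stream (illustrative opcodes).
streams = {
    "T0": ["LOAD", "ADD", "STORE"],
    "T1": ["LOAD", "SUB", "STORE"],
}

schedule = []
iters = {name: iter(stream) for name, stream in streams.items()}
while iters:
    # Fine-grained interleaving: fetch from a different thread every cycle.
    for name in list(iters):
        try:
            schedule.append((name, next(iters[name])))  # one fetch per cycle
        except StopIteration:
            del iters[name]  # thread finished: fewer threads left to interleave

print(schedule)  # alternates T0, T1, T0, T1, ...
```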
SISD-Explicit Multithreading (Cont.)
Blocked Multithreading (coarse-grained):
• Another thread is started when the running thread blocks
• The processor switches to a different thread when a long-latency
event occurs (e.g., an L2 cache miss or waiting for I/O)
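By contrast with the fine-grained policy, a coarse-grained scheduler keeps running one thread until it hits a long-latency event. A toy sketch, where a made-up "MISS" marker stands in for an L2 cache miss:

```python
# Each thread's instruction stream; "MISS" marks a long-latency event.
streams = {
    "T0": ["ADD", "MISS", "ADD"],
    "T1": ["SUB", "SUB", "MISS"],
}

schedule = []
pos = {name: 0 for name in streams}
current = "T0"
while any(pos[n] < len(streams[n]) for n in streams):
    if pos[current] >= len(streams[current]):
        # Current thread finished: pick any other runnable thread.
        current = next(n for n in streams if pos[n] < len(streams[n]))
    instr = streams[current][pos[current]]
    pos[current] += 1
    schedule.append((current, instr))
    if instr == "MISS":
        # Blocked multithreading: switch only on a long-latency event.
        others = [n for n in streams if n != current and pos[n] < len(streams[n])]
        if others:
            current = others[0]

print(schedule)  # T0 runs until its miss, then T1 runs until its miss
```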
SISD-Explicit Multithreading (Cont.)
Simultaneous Multithreading (SMT):
• Instructions are fetched from different threads in a single cycle
• The execution units of a superscalar processor receive instructions
from multiple threads
• A superscalar processor is a CPU that exploits instruction-level
parallelism by issuing multiple instructions per cycle within a
single processor
Intel’s Hyper-Threading Technology
• A single physical processor appears as two logical processors
by applying a two-threaded SMT approach
Example: Intel Pentium 4 (2002)
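On an SMT-capable machine, the operating system sees each hardware thread as a logical processor. A quick way to observe this (the result depends entirely on the machine it runs on):

```python
import os

# os.cpu_count() reports logical processors, so on a 2-way SMT
# (Hyper-Threading) machine it is typically twice the number of
# physical cores.
print(os.cpu_count())
```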