cs501 Archintro Pin2

This document discusses the architecture of parallel computers. It covers hardware issues like the number and type of processors, memory hierarchy, and I/O devices. Operating system issues for parallel computers like resource allocation and access to hardware features are also discussed. The document examines programming issues, performance metrics, and different models of parallelism including multi-processors, multi-computers, vector computers, SIMD computers, and the PRAM model. Flynn's taxonomy of computer architectures is introduced.

BY-Pin2

Advanced Computer Architecture

The Architecture of
Parallel Computers

Computer Systems
• No Component Can be Treated In Isolation From the Others
• Application Software
• Operating System
• Hardware Architecture

Hardware Issues
• Number and Type of Processors
• Processor Control
• Memory Hierarchy
• I/O devices and Peripherals
• Operating System Support
• Applications Software Compatibility

Operating System Issues


• Allocating and Managing Resources
• Access to Hardware Features
– Multi-Processing
– Multi-Threading
• I/O Management
• Access to Peripherals
• Efficiency

Applications Issues
• Compiler/Linker Support
• Programmability
• OS/Hardware Feature Availability
• Compatibility
• Parallel Compilers
– Preprocessor
– Precompiler
– Parallelizing Compiler

Architecture Evolution
• Scalar Architecture
• Prefetch (Fetch/Execute Overlap)
• Multiple Functional Units
• Pipelining
• Vector Processors
• Lock-Step Processors
• Multi-Processor

Flynn’s Classification
• Consider Instruction Streams and Data
Streams Separately.
• SISD - Single Instruction, Single Data
Stream
• SIMD - Single Instruction, Multiple Data
Streams
• MIMD - Multiple Instruction, Multiple Data
Streams.
• MISD - (rare) Multiple Instruction, Single
Data Stream
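
The classification above depends only on whether there is one or more than one instruction stream and data stream. A minimal Python sketch of that rule follows; the function name `flynn_class` is hypothetical and not part of the slides.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn class for the given numbers of instruction
    and data streams (each must be at least 1)."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


for streams in [(1, 1), (1, 8), (4, 1), (4, 4)]:
    print(streams, flynn_class(*streams))
```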

SISD
• Conventional Computers.
• Pipelined Systems
• Multiple-Functional Unit Systems
• Pipelined Vector Processors
• Includes most computers encountered in
everyday life

SIMD
• Multiple Processors Execute a Single
Program
• Each Processor operates on its own data
• Vector Processors
• Array Processors
• PRAM Theoretical Model

MIMD
• Multiple Processors cooperate on a single
task
• Each Processor runs a different program
• Each Processor operates on different data
• Many Commercial Examples Exist

MISD
• A Single Data Stream passes through
multiple processors
• Different operations are triggered on
different processors
• Systolic Arrays
• Wave-Front Arrays

Programming Issues
• Parallel Computers are Difficult to Program
• Automatic Parallelization Techniques are
only Partially Successful
• Programming languages are few, not well
supported, and difficult to use.
• Parallel Algorithms are difficult to design.

Performance Issues
• Cycle Time = τ (Clock Rate f = 1/τ)
• Cycles Per Instruction (Average) = CPI
• Instruction Count = Ic
• Time, T = Ic × CPI × τ
• p = Processor Cycles per Instruction, m = Memory
References per Instruction,
k = Memory/Processor cycle ratio
• T = Ic × (p + m × k) × τ
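
As a rough illustration of the two timing formulas above, the following Python sketch evaluates T using CPI = p + m × k. All numeric values are made-up assumptions chosen for easy arithmetic, and the function names are hypothetical.

```python
def effective_cpi(p, m, k):
    """CPI = p + m x k: processor cycles plus memory references
    weighted by the memory/processor cycle ratio."""
    return p + m * k


def cpu_time(ic, cpi, tau):
    """T = Ic x CPI x tau, in seconds."""
    return ic * cpi * tau


# Assumed example values (not from the slides).
ic = 1_000_000        # instructions executed
tau = 1e-8            # cycle time of 10 ns, i.e. a 100 MHz clock
p, m, k = 2, 0.5, 4   # cycles/instr, memory refs/instr, cycle ratio

cpi = effective_cpi(p, m, k)   # 2 + 0.5 * 4 = 4.0
t = cpu_time(ic, cpi, tau)     # 1e6 * 4.0 * 1e-8 = 0.04 s
print(f"CPI = {cpi}, T = {t:.3f} s")
```

With these assumed values, halving k (a faster memory hierarchy) would lower CPI from 4.0 to 3.0 without touching the processor or the instruction count.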

Performance Issues II
• Ic & p affected by processor design and
compiler technology.
• m affected mainly by compiler technology
• τ affected by processor design
• k affected by memory hierarchy structure
and design

Other Measures
• MIPS rate - Millions of instructions per
second
• Clock Rate for similar processors
• MFLOPS rate - Millions of floating point
operations per second.
• These measures are not necessarily directly
comparable between different types of
processors.
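
To see why a MIPS rate may not compare well across processor types, here is a small Python sketch. It uses MIPS = Ic / (T × 10^6), which with T = Ic × CPI × τ reduces to f / (CPI × 10^6). The two machines and all their numbers are hypothetical assumptions, not data from the slides.

```python
def mips_rate(clock_hz, cpi):
    """MIPS = f / (CPI x 10^6), equivalent to Ic / (T x 10^6)."""
    return clock_hz / (cpi * 1e6)


def run_time(ic, cpi, clock_hz):
    """T = Ic x CPI x tau, with tau = 1 / f."""
    return ic * cpi / clock_hz


clock_hz = 100e6  # both hypothetical machines run at 100 MHz

# Machine A: fewer, more complex instructions for the task.
a_ic, a_cpi = 1_000_000, 4.0
# Machine B: more, simpler instructions for the same task.
b_ic, b_cpi = 3_000_000, 1.5

a_mips, a_time = mips_rate(clock_hz, a_cpi), run_time(a_ic, a_cpi, clock_hz)
b_mips, b_time = mips_rate(clock_hz, b_cpi), run_time(b_ic, b_cpi, clock_hz)

print(f"A: {a_mips:.1f} MIPS, {a_time * 1e3:.1f} ms")
print(f"B: {b_mips:.1f} MIPS, {b_time * 1e3:.1f} ms")
```

Under these assumptions machine B reports the higher MIPS rate yet takes longer to finish the same task, because it needs three times as many instructions.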

Parallelizing Code
• Implicitly
– Write Sequential Algorithms
– Use a Parallelizing Compiler
– Rely on compiler to find parallelism
• Explicitly
– Design Parallel Algorithms
– Write in a Parallel Language
– Rely on Human to find Parallelism
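
The contrast between the two approaches can be sketched in Python. In the explicit version below the programmer chooses how to split the work and how to combine partial results. This is only an illustration under stated assumptions: the function names are hypothetical, and threads in CPython generally do not speed up CPU-bound work, so the point is the program structure rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential_sum(data):
    """Implicit route: a plain sequential loop that a parallelizing
    compiler would have to analyze on its own."""
    total = 0
    for x in data:
        total += x
    return total


def parallel_sum(data, workers=4):
    """Explicit route: the programmer partitions the data, runs the
    partial sums concurrently, and combines the results."""
    chunk = max(1, (len(data) + workers - 1) // workers)
    parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(sum, parts))
    return sum(partial)


data = list(range(1, 101))
print(sequential_sum(data), parallel_sum(data))
```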

Multi-Processors
• Multi-Processors generally share memory,
while multi-computers do not.
– Uniform Memory Access (UMA) Model
– Non-Uniform Memory Access (NUMA) Model
– Cache-Only Memory Architecture (COMA)
• MIMD Machines

Multi-Computers
• Independent Computers that Don’t Share
Memory.
• Connected by High-Speed Communication
Network
• More tightly coupled than a collection of
independent computers
• Cooperate on a single problem

Vector Computers
• Independent Vector Hardware
• May be an attached processor
• Has both scalar and vector instructions
• Vector instructions operate in highly
pipelined mode
• Can be Memory-to-Memory or Register-to-
Register

SIMD Computers
• One Control Processor
• Several Processing Elements
• All Processing Elements execute the same
instruction at the same time
• Interconnection network between PEs
determines memory access and PE
interaction
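
The lock-step behavior described above can be mimicked with a short Python simulation: one control routine broadcasts a single operation, and every processing element applies it to its own local operands. The class and function names are hypothetical, and the loop only simulates sequentially what real PEs would do at the same time.

```python
import operator


class ProcessingElement:
    """A PE holding its own local operands."""

    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.result = None

    def execute(self, op):
        # Apply the broadcast instruction to this PE's local data.
        self.result = op(self.a, self.b)


def broadcast(pes, op):
    """Control processor: issue the same instruction to every PE.
    The loop is a sequential stand-in for simultaneous execution."""
    for pe in pes:
        pe.execute(op)


pes = [ProcessingElement(a, b) for a, b in zip([1, 2, 3, 4], [10, 20, 30, 40])]
broadcast(pes, operator.add)
results = [pe.result for pe in pes]
print(results)
```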

The PRAM Model


• SIMD Style Programming
• Uniform Global Memory
• Local Memory in Each PE
• Memory Conflict Resolution
– CRCW - Concurrent Read, Concurrent Write
– CREW - Concurrent Read, Exclusive Write
– EREW - Exclusive Read, Exclusive Write
– ERCW - (rare) Exclusive Read, Concurrent Write
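
The four variants differ only in whether several PEs may read or write the same global memory cell in the same step. A minimal Python sketch, assuming each step is described by the lists of addresses read and written (one entry per accessing PE), names the least permissive variant that would allow that step. The function name `access_pattern` is hypothetical.

```python
from collections import Counter


def access_pattern(reads, writes):
    """Classify one PRAM step by its memory accesses.

    reads, writes: global addresses touched in this step, one
    entry per PE that reads or writes.
    """
    shared_read = any(n > 1 for n in Counter(reads).values())
    shared_write = any(n > 1 for n in Counter(writes).values())
    r = "CR" if shared_read else "ER"
    w = "CW" if shared_write else "EW"
    return r + w


print(access_pattern([0, 1, 2], [3, 4, 5]))  # all addresses distinct
print(access_pattern([0, 0, 2], [3, 4, 5]))  # two PEs read cell 0
print(access_pattern([0, 0], [7, 7]))        # shared read and shared write
```

A step classified as CREW here would be illegal on an EREW machine and would have to be rewritten, for example by having the PEs read the shared cell in separate steps.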

The VLSI Model


• Implement Algorithm as a mostly
combinational circuit
• Determine the area required for
implementation
• Determine the depth of the circuit
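
As one hedged example of these two measurements, consider combining n inputs with a balanced tree of 2-input gates (for instance an n-input AND). The Python sketch below counts the gates, used here only as a crude proxy for area since wiring is ignored, and the depth of the tree. The function name is hypothetical.

```python
def reduction_tree_cost(n):
    """Return (gates, depth) for a balanced tree of 2-input gates
    that combines n inputs into one output."""
    if n < 1:
        raise ValueError("need at least one input")
    gates = n - 1          # each gate reduces the signal count by one
    depth = 0
    width = n
    while width > 1:
        width = (width + 1) // 2   # one tree level halves the signals
        depth += 1
    return gates, depth


for n in (1, 5, 8, 1024):
    print(n, reduction_tree_cost(n))
```

The gate count grows linearly with n while the depth grows only logarithmically, which is the kind of area versus depth trade-off this model is meant to expose.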