Chapter One
Parallel Computer Models
Computer Generations
First Generation (1945-1954):
• Single central processing unit (CPU).
• Performed fixed-point arithmetic; program execution was driven by a program counter
• Used machine or assembly languages
• Subroutine linkage was not implemented
• Built from vacuum-tube and relay-memory technology
• Representative systems: IBM 701, ENIAC, Princeton IAS
Second Generation (1955-1964):
• Floating-point arithmetic, multiplexed memory access, and index registers were introduced
• Subroutine libraries and compilers were implemented
• High-level languages (Fortran, COBOL) were established
• Register transfer language (RTL) was developed
• Representative systems: IBM 7030, Univac LARC, CDC 1604
Third Generation (1965-1974):
• Pipelining and cache memory were introduced
• Integrated circuits (ICs) and microprogrammed control were used to coordinate activities between the CPU and I/O devices for multiple users
• Time-sharing operating systems using virtual memory were developed for maximum use of resources
• Representative systems: IBM 360/370 series, CDC 6600/7600, ASC, PDP-8 series
Elements of Modern Computers
Computing problem
Algorithms and data structures
Hardware resources
Operating system
System software support
Compiler Support
Computing problem:
• A modern computer is an integrated system: machine hardware, an instruction set, system software, application programs, and user interfaces.
• Different types of problems demand different computing resources: numerical problems demand complex mathematical formulations; alphanumerical problems demand efficient transaction processing and large database management; artificial intelligence problems demand logic inference and symbolic manipulation. Some problems require a combination of all of these processes.
Algorithms and data structures:
• To specify the computations involved in the solution, particular data structures and
algorithms are needed.
• In most cases, numerical algorithms are deterministic, whereas symbolic processing may need nondeterministic approaches.
Hardware resources:
• Processors, memory, and peripheral devices form the hardware core of a computer system.
• This includes special hardware interfaces built into I/O devices such as network adapters, modems, workstations, display terminals, printers, and scanners.
Operating System:
• Manages the allocation and deallocation of resources during the execution of user programs.
• Application software and standard benchmark programs must be used for performance evaluation.
• Supports efficient mapping of programs onto the machine through compilation, processor scheduling, and memory management, exploiting parallelism at both compile time and run time.
System software support:
• Programs written in high-level languages must be translated into machine language, which requires good software support.
• Resource binding uses the compiler, assembler, loader, and OS kernel to map the program onto the physical machine for execution.
Compiler Support:
• There are three compiler support approaches: i) Preprocessor, ii) Precompiler, and iii) Parallelizing compiler.
• A preprocessor uses a sequential compiler together with a low-level library to implement high-level parallel constructs.
• A precompiler requires some program flow analysis and limited optimizations toward detecting parallelism, rather than full automation.
• A parallelizing compiler is a fully developed parallelizing/vectorizing compiler that can transform sequential code into parallel constructs (a sketch of such a construct follows this list).
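For illustration only (OpenMP is an assumption here; the source does not name a specific tool), a minimal C sketch of the kind of construct these approaches target. Compiled with an OpenMP-capable compiler (e.g., cc -fopenmp), the pragma asks the compiler to distribute the loop iterations across processors; a fully parallelizing compiler would have to discover the loop's independence in unannotated sequential code on its own.

#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N], c[N];

    /* Sequential initialization. */
    for (int i = 0; i < N; i++) {
        a[i] = i;
        b[i] = 2.0 * i;
    }

    /* Iterations are independent, so the compiler may run them in parallel. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[N-1] = %f\n", c[N - 1]);
    return 0;
}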
Flynn’s Classification of Computer Architectures (Michael Flynn, 1972):
• SISD (Single Instruction stream over a Single Data stream): a conventional sequential machine; one CU issues one IS to one PU, which operates on one DS held in the MU.
• SIMD (Single Instruction stream over Multiple Data streams): one CU broadcasts the same IS to many PEs, each operating on its own DS in its LM, as in array and vector processors.
• MISD (Multiple Instruction streams over a Single Data stream): several PUs apply different instructions to the same data stream; rarely built in practice.
• MIMD (Multiple Instruction streams over Multiple Data streams): multiple processors execute independent instruction streams over independent data streams, as in multiprocessors and multicomputers.
Legend: CU = Control Unit, PU = Processing Unit, MU = Memory Unit, IS = Instruction Stream, DS = Data Stream, PE = Processing Element, LM = Local Memory
System Performance Attributes
Clock Time/Clock Cycle:
• The clock time (cycle time), denoted by τ, is a constant for a given machine but varies from one architecture to another. (Usually measured in ns)
Clock Rate/Cycle Frequency:
• The clock rate, denoted by f, is the inverse of the clock time: f = 1/τ. (Measured in Hz, in practice usually MHz or GHz)
Instruction Count (Ic):
• The instruction count is the size of a program: the number of machine instructions that must be executed to complete the program.
CPI/Cycles per Instruction:
• An instruction consists of several micro-operations (fetch, decode, operand fetch, execute), each taking one or more clock cycles; different machine instructions may therefore require different numbers of clock cycles.
• CPI is the average number of clock cycles needed to execute one instruction.
• If the total number of CPU clock cycles for a program is C and the instruction count is Ic, then:
CPI = C / Ic
Execution Time/CPU Time:
• If Ic is the total number of instructions, CPI the cycles per instruction, and τ the clock cycle time, the total execution time T for a program is:
T = Ic * CPI * τ   or, equivalently,   T = Ic * CPI / f
• Carrying out an instruction involves several phases:
Instruction fetch
Decode
Operand fetch
Execution
Storing results back to memory
• The decode and execution phases are carried out in the CPU; the cycles they consume are processor cycles.
• The remaining three phases require memory access; the cycles they consume are memory cycles.
• A memory cycle is usually k times the processor cycle, where k is the ratio of the memory cycle time to the processor cycle time. The CPI can then be written as:
CPI = p + m * k
• Therefore, the total execution time T is:
T = Ic * (p + m * k) * τ
Where, p = number of processor cycles needed per instruction for decode and execution
m = number of memory references needed per instruction
k = ratio of the memory cycle time to the processor cycle time
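• Worked example (hypothetical numbers, not from the source): suppose p = 4, m = 2, k = 10, Ic = 10⁶, and τ = 2 ns. Then CPI = 4 + 2 * 10 = 24, and T = 10⁶ * 24 * 2 ns = 0.048 s.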
MIPS rate (Million Instructions Per Seconds):
• Evaluated as:
MIPS rate = Ic / (T * 10⁶)
Or, MIPS rate = f / (CPI * 10⁶) = (f * Ic) / (C * 10⁶) [derived from the total execution time, T = Ic * CPI * τ]
Math Problems
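One worked example under the formulas above (all numbers hypothetical, chosen for illustration): let Ic = 2 * 10⁶ instructions, CPI = 2.5, and f = 500 MHz (so τ = 2 ns). Then T = Ic * CPI / f = 0.01 s, and MIPS rate = Ic / (T * 10⁶) = f / (CPI * 10⁶) = 200. A minimal C sketch that evaluates the same formulas:

#include <stdio.h>

int main(void) {
    double Ic  = 2e6;    /* instruction count (hypothetical) */
    double CPI = 2.5;    /* average cycles per instruction (hypothetical) */
    double f   = 500e6;  /* clock rate in Hz, so tau = 1/f = 2 ns */

    double T    = Ic * CPI / f;    /* T = Ic * CPI * tau = Ic * CPI / f */
    double mips = Ic / (T * 1e6);  /* MIPS rate = Ic / (T * 10^6) */

    printf("T = %.4f s, MIPS rate = %.1f\n", T, mips);  /* prints T = 0.0100 s, MIPS rate = 200.0 */
    return 0;
}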
Multiprocessors and Multicomputers
Depending on how memory is accessed, two categories are presented here.
Shared-Memory Multiprocessors
• Uniform Memory Access (UMA) model
• Non-uniform Memory Access (NUMA) model
• Cache Only Memory Architecture (COMA) model
Distributed-Memory Multicomputers
Shared-Memory Multiprocessors
• The Uniform Memory Access (UMA) model:
The physical memory is uniformly shared by all the processors.
All processors have equal access time to all memory words, hence the name uniform memory access.
Each processor may have its own private cache.
The tightly coupled processors and the memory are interconnected by a common bus, a crossbar switch, or a multistage network.
When all the processors have equal access to all the peripherals, the system is called a symmetric multiprocessor.
When only one or a few processors have such access rights, it is called an asymmetric multiprocessor.
The UMA model is well suited to time-sharing applications with multiple users, and it can also be used to speed up the execution of a single large program.
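To make the shared-address-space idea concrete, a minimal sketch (not from the source; assumes a POSIX system, compiled with cc -pthread): all threads run in one address space and read and write the same array, just as the processors of a UMA machine share one physical memory.

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define N 1000

double shared_data[N];  /* one address space, visible to every thread */

void *worker(void *arg) {
    long id = (long)arg;
    /* Each thread fills a disjoint slice of the shared array,
       so no locking is needed for these writes. */
    for (int i = id * (N / NTHREADS); i < (id + 1) * (N / NTHREADS); i++)
        shared_data[i] = 2.0 * i;
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (long id = 0; id < NTHREADS; id++)
        pthread_create(&t[id], NULL, worker, (void *)id);
    for (int id = 0; id < NTHREADS; id++)
        pthread_join(t[id], NULL);
    printf("shared_data[N-1] = %f\n", shared_data[N - 1]);
    return 0;
}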
• The Nonuniform Memory Access (NUMA) model:
In the NUMA model, the access time varies with the location of the memory word.
The shared memory is physically distributed among all the processors as local memories.
Processors are divided into several clusters; the clusters themselves may be UMA or NUMA multiprocessors.
All processors belonging to a given cluster have uniform access to that cluster's shared memory.
All clusters have equal access to the global shared memory, but the access time to a cluster's own memory is shorter than the access time to the global memory.