
Computer Architecture Unit 10

Self Assessment Questions


1. A problem is broken into a discrete series of ______________ .
2. _______________ provides facilities for simultaneous processing of
various sets of data or simultaneous execution of multiple instructions.
3. Parallel processing in a multiprocessor computer is said to be
________________ parallel processing.
4. Parallel processing in a uni-processor computer is said to be
_____________ parallel processing.

10.3 Classification of Parallel Processing


The core elements of parallel processing are CPUs. The essential computing
process is the execution of a sequence of instructions on a set of data. The
term stream is used here to denote a sequence of items (instructions or data)
as processed by a single processor or a multiprocessor. Based on the number of
instruction and data streams that can be processed simultaneously, Flynn
classifies computer systems into four categories. The matrix in figure 10.4
defines the four possible classifications according to Flynn.

Figure 10.4: Flynn’s Classification of Computer System
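For reference, Flynn's matrix arranges these categories by the number of
instruction streams against the number of data streams; this layout is the
standard presentation of the classification:

                         Single Data     Multiple Data
  Single Instruction        SISD             SIMD
  Multiple Instruction      MISD             MIMD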

In this chapter, our main focus will be Single Instruction Multiple Data
(SIMD).
Single Instruction Multiple Data (SIMD)
The term single instruction implies that all processing units execute the
same instruction at any given clock cycle. On the other hand, the term
multiple data implies that each processing unit can work on a different data
element. Generally, this type of machine has one instruction dispatcher, a
very large array of very small-capacity processing units and a very
high-bandwidth internal network. This type is suitable for specialised
problems characterised by a high degree of regularity, for example, image
processing. Figure 10.5 shows a case of SIMD processing.

[Figure 10.5 shows n processing units, P1 to Pn, stepping through the same
instruction sequence in lockstep over time: load A(i), load B(i),
C(i) = A(i)*B(i), store C(i), each unit operating on its own data element i.]

Figure 10.5: SIMD Process

Today, modern microprocessors can execute the same instruction on multiple
data elements. This is called Single Instruction Multiple Data (SIMD). SIMD
instructions handle floating-point numbers and can provide important
speedups in many algorithms. Because the execution units for SIMD
instructions belong to a physical core, as many SIMD instructions can run in
parallel as there are physical cores available. As mentioned, utilising these
vector-processing capabilities in parallel can give significant speedups in
certain algorithms.
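As an illustration of the C(i) = A(i)*B(i) operation from figure 10.5, the
following C sketch multiplies two float arrays element by element using the
SSE intrinsics available on most x86 processors. It is only a sketch: the
function name and arguments are illustrative, it assumes n is a multiple of
four, and a real implementation would add a scalar loop for any remainder.

#include <immintrin.h>   /* SSE intrinsics */

/* Element-wise c[i] = a[i] * b[i], four floats per SIMD instruction.
   Assumes n is a multiple of 4. */
void multiply_simd(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);   /* load A(i..i+3) */
        __m128 vb = _mm_loadu_ps(&b[i]);   /* load B(i..i+3) */
        __m128 vc = _mm_mul_ps(va, vb);    /* C(i) = A(i) * B(i), 4 at once */
        _mm_storeu_ps(&c[i], vc);          /* store C(i..i+3) */
    }
}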
Adding SIMD instructions and hardware to a multi-core CPU is a somewhat more
radical step than adding floating-point capability. Since their inception,
microprocessors have been SISD devices. SIMD is also referred to as vector
processing because its fundamental unit of organisation is the vector. This
is shown in figure 10.6:

Figure 10.6: Scalars and Vectors


A normal CPU operates on scalars, one at a time. A superscalar CPU operates
on multiple scalars at a given moment, but it performs a different operation
in each instruction. A vector processor, on the other hand, lines up an
entire row of scalars of the same type and operates on them as a single
unit. Figure 10.7 shows the difference between SISD and SIMD.

Figure 10.7: SISD vs. SIMD

Modern superscalar SISD machines exploit a property of the instruction
stream called 'instruction-level parallelism'. This means that multiple
instructions can be executed at the same instant on the same data stream. A
SIMD machine, by contrast, exploits a property of the data stream called
'data parallelism'. In this framework, you get data parallelism when you
have a large mass of uniform data on which the same instruction must be
performed. Therefore, a SIMD machine is an entirely separate class of
machine from the normal microprocessor.
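As a small sketch of what 'data parallelism' means in practice, the loop
below (a hypothetical brightness-scaling step of the kind found in image
processing, with illustrative names) has no dependence between iterations:
every iteration applies the same operation to a different pixel. Such loops
map naturally onto SIMD hardware and are what vectorising compilers look for.

/* Data-parallel loop: each iteration is independent and performs the same
   operation on a different data element, so it can be mapped to SIMD lanes
   (or auto-vectorised by the compiler). */
void scale_pixels(unsigned char *pixels, int n, float gain)
{
    for (int i = 0; i < n; i++) {
        float v = pixels[i] * gain;                        /* same operation... */
        pixels[i] = v > 255.0f ? 255 : (unsigned char)v;   /* ...on element i   */
    }
}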
Self Assessment Questions
5. SIMD stands for ________________ .
6. Flynn classified computing architectures into SISD, MISD, SIMD and
_______________________ .
7. SIMD is known as ____________ because its basic unit of
organisation is the vector.
8. Superscalar SISD machines exploit a property of the instruction stream
known as ___________.

Activity 1:
Explore the components of a parallel architecture that are used by an
organisation. Also, find out the type of memory used in that architecture.


10.4 Fine-Grained SIMD Architecture


Fine-grained SIMD architectures are ultimately based on the design scheme of
Steven Unger. They are generally designed for low-level image processing
applications. The features of a fine-grained architecture are as follows:
 Each Processing Element (PE) has minimal complexity and the lowest
feasible degree of autonomy.
 Economic constraints apply to the maximum number of PEs provided.
 The programming model assumes equivalence between the number of PEs and
the number of data items, and hides any mismatch as far as possible.
 The 4-connected nearest-neighbour mesh is used as the basic
interconnection method (a sketch of this addressing scheme follows this
list).
 The usual programming language is a simple extension of a sequential
language with parallel-data additions.
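As a rough sketch of the 4-connected nearest-neighbour mesh mentioned above,
the following C fragment shows how a PE at row r, column c of an N x N array
would address its north, south, east and west neighbours. The names and the
edge-clamping behaviour are illustrative; real machines differ in how they
treat the array edges (clamp, wrap around, or feed in zeros).

/* 4-connected nearest-neighbour mesh: PE(r, c) communicates only with the
   PEs directly north, south, east and west of it. */
#define N 128   /* e.g. a 128 x 128 array */

int pe_index(int r, int c) { return r * N + c; }
int north(int r, int c)    { return pe_index(r > 0     ? r - 1 : r, c); }
int south(int r, int c)    { return pe_index(r < N - 1 ? r + 1 : r, c); }
int west(int r, int c)     { return pe_index(r, c > 0     ? c - 1 : c); }
int east(int r, int c)     { return pe_index(r, c < N - 1 ? c + 1 : c); }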
Although, in practice, no system embodies this concept absolutely, certain
systems come close to it. They include CLIP4, the DAP and the MPP (all
first-generation systems), and the CM1 and the MasPar1 amongst later
embodiments. Other categories deviate somewhat from the classical model.
They are explained as follows:
 Processing element complexity is increased, either so as to operate on
multi-bit numbers directly or by the addition of dedicated arithmetic units.
 Enhanced connectivity arrangements, such as hypercubes and crossbar
switches, are superimposed over the standard mesh.
One of the most important architectural developments that has occurred in
this class of system over time is the incorporation of ever-increasing
amounts of local memory. This reflects the experience of all users that
insufficient memory can have a catastrophic effect on performance,
outweighing, in the worst cases, the advantages of a parallel configuration.
The Massively Parallel Processor (MPP) system is perhaps the most recent
design to have retained the simplicity of the fine-grained approach, and it
is examined in detail in the next section.


10.4.1 An example: The massively parallel processor


MPP is the acronym for Massively Parallel Processor. Although it is not the
most recent example of a fine-grained SIMD system, the MPP illustrates the
principles of this class particularly well. The overall system design is
illustrated in figure 10.8.

Figure 10.8: The MPP Systems

A square array was chosen for the MPP to match the configuration of the
anticipated data sets on which the system was intended to work. The square
array consists of 128 x 128 active processing elements. The MPP was
constructed for (and used by) NASA, with the obvious intention of processing
mainly image data. The size of the array was simply the largest that could
be achieved at the time, given the constraints of then-current technology
and the intended processor design. The result was a system constructed from
88 array cards, each of which supported 24 processor chips (192 processors)
together with their associated local memory.
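These figures are mutually consistent once the four spare columns described
in the next paragraph are included: 88 cards x 192 processors per card =
16,896 PEs, which matches 128 rows x (128 active + 4 spare) columns =
128 x 132 = 16,896. The 192 processors per card spread over 24 chips also
imply 8 PEs per processor chip, though that per-chip figure is an inference
from these numbers rather than something stated here.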
The array incorporates four additional columns of spare (inactive)
processing elements to provide some fault-tolerance. One of the major
system design considerations in highly parallel systems such as MPP is
how to handle the unavoidable device failures. The number of these is
