
Computer Architecture Unit 10

Self Assessment Questions


1. A problem is broken into a discrete series of ______________ .
2. _______________ provides facilities for simultaneous processing of
various sets of data or simultaneous execution of multiple instructions.
3. Parallel processing in a multiprocessor computer is said to be
________________ parallel processing.
4. Parallel processing in a uni-processor computer is said to be
_____________ parallel processing.

10.3 Classification of Parallel Processing


The core elements of parallel processing are CPUs. The essential computing
process is the execution of a sequence of instructions on a set of data. The
term stream is used here to denote a sequence of items (instructions or data)
as processed by a single processor or a multiprocessor. Based on the number of
instruction and data streams that can be processed simultaneously, Flynn
classifies computer systems into four categories. The matrix in figure 10.4
defines the four possible classifications according to Flynn.

Figure 10.4: Flynn’s Classification of Computer System
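For reference, Flynn's matrix arranges these categories by the number of
instruction streams against the number of data streams; this layout is the
standard presentation of the classification:

                         Single Data     Multiple Data
  Single Instruction        SISD             SIMD
  Multiple Instruction      MISD             MIMD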

In this chapter, our main focus will be Single Instruction Multiple Data
(SIMD).
Single Instruction Multiple Data (SIMD)
The term single instruction implies that all processing units execute the
same instruction at any given clock cycle. On the other hand, the term
multiple data implies that each processing unit can work on a different data
element. Generally, this type of machine has one instruction dispatcher, a
very large array of very small-capacity processing units and a very
high-bandwidth internal network. This type is suitable for specialised
problems characterised by a high degree of regularity, for example, image
processing. Figure 10.5 shows a case of SIMD processing.

[Figure 10.5 shows n processing units, P1 to Pn, stepping through the same
instruction sequence in lockstep over time: load A(i), load B(i),
C(i) = A(i)*B(i), store C(i), each unit operating on its own data element i.]

Figure 10.5: SIMD Process

Today, modern microprocessors can execute the same instruction on multiple
data elements. This is called Single Instruction Multiple Data (SIMD). SIMD
instructions handle floating-point numbers and can provide important
speedups in many algorithms. Because the execution units for SIMD
instructions belong to a physical core, as many SIMD instructions can run in
parallel as there are physical cores available. As mentioned, utilising these
vector-processing capabilities in parallel can give significant speedups in
certain algorithms.
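As an illustration of the C(i) = A(i)*B(i) operation from figure 10.5, the
following C sketch multiplies two float arrays element by element using the
SSE intrinsics available on most x86 processors. It is only a sketch: the
function name and arguments are illustrative, it assumes n is a multiple of
four, and a real implementation would add a scalar loop for any remainder.

#include <immintrin.h>   /* SSE intrinsics */

/* Element-wise c[i] = a[i] * b[i], four floats per SIMD instruction.
   Assumes n is a multiple of 4. */
void multiply_simd(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);   /* load A(i..i+3) */
        __m128 vb = _mm_loadu_ps(&b[i]);   /* load B(i..i+3) */
        __m128 vc = _mm_mul_ps(va, vb);    /* C(i) = A(i) * B(i), 4 at once */
        _mm_storeu_ps(&c[i], vc);          /* store C(i..i+3) */
    }
}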
Adding SIMD instructions and hardware to a multi-core CPU is a somewhat more
radical step than adding floating-point capability. Since their inception,
microprocessors have been SISD devices. SIMD is also referred to as vector
processing because its fundamental unit of organisation is the vector. This
is shown in figure 10.6:

Figure 10.6: Scalars and Vectors


A normal CPU operates on scalars, one at a time. A superscalar CPU operates
on multiple scalars at a given moment, but it performs a different operation
in each instruction. A vector processor, on the other hand, lines up an
entire row of scalars of the same type and operates on them as a single
unit. Figure 10.7 shows the difference between SISD and SIMD.

Figure 10.7: SISD vs. SIMD

Modern superscalar SISD machines exploit a property of the instruction
stream called 'instruction-level parallelism'. This means that multiple
instructions can be executed at the same instant on the same data stream. A
SIMD machine, by contrast, exploits a property of the data stream called
'data parallelism'. In this framework, you get data parallelism when you
have a large mass of uniform data on which the same instruction must be
performed. Therefore, a SIMD machine is an entirely separate class of
machine from the normal microprocessor.
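As a small sketch of what 'data parallelism' means in practice, the loop
below (a hypothetical brightness-scaling step of the kind found in image
processing, with illustrative names) has no dependence between iterations:
every iteration applies the same operation to a different pixel. Such loops
map naturally onto SIMD hardware and are what vectorising compilers look for.

/* Data-parallel loop: each iteration is independent and performs the same
   operation on a different data element, so it can be mapped to SIMD lanes
   (or auto-vectorised by the compiler). */
void scale_pixels(unsigned char *pixels, int n, float gain)
{
    for (int i = 0; i < n; i++) {
        float v = pixels[i] * gain;                        /* same operation... */
        pixels[i] = v > 255.0f ? 255 : (unsigned char)v;   /* ...on element i   */
    }
}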
Self Assessment Questions
5. SIMD stands for ________________ .
6. Flynn classified computing architectures into SISD, MISD, SIMD and
_______________________ .
7. SIMD is known as ____________ because its basic unit of
organisation is the vector.
8. Superscalar SISD machines exploit a property of the instruction stream
known as ___________.

Activity 1:
Explore the components of a parallel architecture that are used by an
organisation. Also, find out the type of memory used in that architecture.


10.4 Fine-Grained SIMD Architecture


Fine-grained SIMD architectures are ultimately based on the design scheme of
Steven Unger. They are generally designed for low-level image processing
applications. The features of a fine-grained architecture are as follows:
 Each Processing Element (PE) has minimal complexity and the lowest
feasible degree of autonomy.
 Economic constraints apply to the maximum number of PEs provided.
 The programming model assumes equivalence between the number of PEs and
the number of data items, and hides any mismatch as far as possible.
 The 4-connected nearest-neighbour mesh is used as the basic
interconnection method (a sketch of this addressing scheme follows this
list).
 The usual programming language is a simple extension of a sequential
language with parallel-data additions.
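As a rough sketch of the 4-connected nearest-neighbour mesh mentioned above,
the following C fragment shows how a PE at row r, column c of an N x N array
would address its north, south, east and west neighbours. The names and the
edge-clamping behaviour are illustrative; real machines differ in how they
treat the array edges (clamp, wrap around, or feed in zeros).

/* 4-connected nearest-neighbour mesh: PE(r, c) communicates only with the
   PEs directly north, south, east and west of it. */
#define N 128   /* e.g. a 128 x 128 array */

int pe_index(int r, int c) { return r * N + c; }
int north(int r, int c)    { return pe_index(r > 0     ? r - 1 : r, c); }
int south(int r, int c)    { return pe_index(r < N - 1 ? r + 1 : r, c); }
int west(int r, int c)     { return pe_index(r, c > 0     ? c - 1 : c); }
int east(int r, int c)     { return pe_index(r, c < N - 1 ? c + 1 : c); }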
Although, in practice, no system embodies this concept absolutely, certain
systems come close to it. They include CLIP4, the DAP and the MPP (all
first-generation systems), and the CM1 and the MasPar1 amongst later
embodiments. Other categories deviate somewhat from the classical model.
They are explained as follows:
 Processing element complexity is increased, either so as to operate on
multi-bit numbers directly or by the addition of dedicated arithmetic units.
 Enhanced connectivity arrangements, such as hypercubes and crossbar
switches, are superimposed over the standard mesh.
One of the most important architectural developments that has occurred in
this class of system over time is the incorporation of ever-increasing
amounts of local memory. This reflects the experience of all users that
insufficient memory can have a catastrophic effect on performance,
outweighing, in the worst cases, the advantages of a parallel configuration.
The Massively Parallel Processor (MPP) system is perhaps the most recent
design to have retained the simplicity of the fine-grained approach, and it
is examined in detail in the next section.


10.4.1 An example: The massively parallel processor


MPP is the acronym for Massively Parallel Processor. Although it is not the
most recent example of a fine-grained SIMD system, the MPP illustrates the
principles of this class particularly well. The overall system design is
illustrated in figure 10.8.

Figure 10.8: The MPP Systems

A square array was chosen for the MPP to match the configuration of the
anticipated data sets on which the system was intended to work. The square
array consists of 128 x 128 active processing elements. The MPP was
constructed for (and used by) NASA, with the obvious intention of processing
mainly image data. The size of the array was simply the largest that could
be achieved at the time, given the constraints of then-current technology
and the intended processor design. The result was a system constructed from
88 array cards, each of which supported 24 processor chips (192 processors)
together with their associated local memory.
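These figures are mutually consistent once the four spare columns described
in the next paragraph are included: 88 cards x 192 processors per card =
16,896 PEs, which matches 128 rows x (128 active + 4 spare) columns =
128 x 132 = 16,896. The 192 processors per card spread over 24 chips also
imply 8 PEs per processor chip, though that per-chip figure is an inference
from these numbers rather than something stated here.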
The array incorporates four additional columns of spare (inactive)
processing elements to provide some fault-tolerance. One of the major
system design considerations in highly parallel systems such as MPP is
how to handle the unavoidable device failures. The number of these is
