
UNIT 14: SIMD Architecture

Names of Sub-Units

Parallel Processing, Classification of Parallel Processing, Fine-Grained SIMD Architecture, Coarse-Grained SIMD Architecture.

Overview

This unit begins by discussing the concept of SIMD architecture and parallel processing. Next,
the unit discusses the classification of parallel processing. Further, the unit explains fine-grained
SIMD architecture. Towards the end, the unit discusses coarse-grained SIMD architecture.

Learning Objectives

In this unit, you will learn to:


 Discuss the concept of SIMD architecture
 Explain the concept of parallel processing
 Describe the classification of parallel processing
 Explain the significance of fine-grained SIMD architecture
 Discuss the coarse-grained SIMD architecture

Learning Outcomes

At the end of this unit, you would be able to:


 Evaluate the concept of SIMD architecture
 Assess the concept of parallel processing
 Evaluate the importance of classification of parallel processing
 Determine the significance of fine-grained SIMD architecture
 Explore the coarse-grained SIMD architecture

Pre-Unit Preparatory Material

 http://cs.ucf.edu/~ahmadian/pubs/SIMD.pdf

14.1 INTRODUCTION
SIMD is an abbreviation that stands for single-instruction, multiple-data streams. As shown in
Figure 1, the SIMD parallel computing paradigm consists of two parts: a von Neumann-style front-end
computer and a processor array.
The processor array is a group of synchronised processing units that may conduct the same operation
on many data streams at the same time. While being processed in parallel, the dispersed data is stored
in a small piece of local memory on each processor in the array.
The processor array is coupled to the front end’s memory bus, allowing the front end to access the
local processor memories at random, as though they were just another memory. Figure 1 depicts the SIMD architecture model:

Figure 1: SIMD Architecture Model (a von Neumann front-end computer connected to an array of virtual processors)


On the front end, a typical serial programming language can be used to write and run a programme.
The front end runs the application programme serially, but issues commands to the processor
array so that SIMD tasks can be run in parallel.
The similarity between serial and data-parallel programming is one of the attractive aspects of data
parallelism. The processors’ lock-step operation eliminates the need for explicit synchronisation:
at any given moment, processors either do nothing or perform the same operation.
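To make the lock-step model concrete, consider the following C sketch (an illustration added here, not part of the unit; the function names and the use of SSE intrinsics are assumptions). The scalar loop performs one addition per iteration, while the SIMD loop applies a single instruction to four floats at a time:

    #include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, ... */

    /* Scalar version: one addition per loop iteration. */
    void add_scalar(const float *a, const float *b, float *c, int n) {
        for (int i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }

    /* SIMD version: one instruction adds four floats at once,
       mirroring the lock-step "same operation, different data" model.
       Assumes n is a multiple of 4 and 16-byte-aligned pointers. */
    void add_simd(const float *a, const float *b, float *c, int n) {
        for (int i = 0; i < n; i += 4) {
            __m128 va = _mm_load_ps(&a[i]);
            __m128 vb = _mm_load_ps(&b[i]);
            _mm_store_ps(&c[i], _mm_add_ps(va, vb));
        }
    }

Note that the SIMD loop body is still a single instruction stream; the parallelism comes entirely from each instruction operating on several data elements.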


SIMD architecture exploits parallelism through simultaneous operations across huge amounts of
data. This paradigm works well for problems that require the same operation to be performed on a
large amount of data at once.
In SIMD machines, there are two major configurations. Each CPU has its own local memory in the
first scheme. The interconnectivity network allows processors to communicate with one another. If
the interconnection network does not allow for a direct link between two groups of processors, the
information might be exchanged through an intermediary processor.
Processors and memory modules communicate with each other via the interconnection network in the
second SIMD architecture. Two processors can communicate with one another via intermediate memory
modules or, in some cases, intermediary processors. The BSP (Burroughs’ Scientific Processor) used
the second SIMD scheme.

14.2 PARALLEL PROCESSING


Parallel processing is a set of techniques that allows a computer system to perform multiple
data-processing tasks at the same time in order to boost the system’s computational speed.
A parallel processing system may process several pieces of data at the same time, resulting in a quicker
execution time. For example, the next instruction can be read from memory while an instruction is
being processed in the ALU component of the CPU.
The primary goal of parallel processing is to improve the computer’s processing capability and
throughput, or the amount of processing that can be done in a given amount of time.
A parallel processing system can be achieved by having a multiplicity of functional units that perform
identical or different operations simultaneously, with the data distributed among the multiple
functional units. One method is to divide the execution unit into eight parallel functional units;
the operation performed in each functional unit is indicated in each block, as shown in Figure 2:

Figure 2: Processor with Multiple Functional Units (processor registers feed eight parallel units connected to memory: adder-subtractor, integer multiply, logic unit, shift unit, incrementer, floating-point add-subtract, floating-point multiply and floating-point divide)


The integer multiplier and adder are used to execute arithmetic operations on integer numbers. The
floating-point operations are divided into three circuits that work in tandem. The logic, shift and
increment operations can all run at the same time on distinct data. Because all units are independent
of one another, one number can be shifted while another is incremented.
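As a hypothetical C illustration (not from the unit), the fragment below contains three operations with no data dependences between them; a processor equipped with the functional units of Figure 2 could dispatch them to separate units and execute them simultaneously:

    /* Independent operations: no result feeds another operation,
       so separate functional units can work on them at once. */
    void independent_ops(int x, int y, float f, float g,
                         int *iprod, int *ishift, float *fsum) {
        *iprod  = x * y;    /* integer multiply unit        */
        *ishift = x << 2;   /* shift unit                   */
        *fsum   = f + g;    /* floating-point add-subtract  */
    }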

14.3 CLASSIFICATION OF PARALLEL PROCESSING


Multiprocessing can be defined using Flynn’s classification, which is based on the multiplicity of
instruction streams and data streams in a computer system. An instruction stream is a sequence of
instructions executed by the computer. A data stream is a sequence of data, which includes input data
or temporary results. When designing a programme or concurrent system, several system and memory
architecture styles must be considered. This is critical because a system and memory style that is
ideal for one task may be error-prone for another.
In 1972, Michael Flynn proposed a classification system for distinct types of computer system architecture.
The following are the four different styles defined by this taxonomy:
 SISD (Single Instruction Single Data)
 SIMD (Single Instruction Multiple Data)
 MISD (Multiple Instruction Single Data)
 MIMD (Multiple Instruction Multiple Data)

14.3.1 SISD Architecture


SISD is the abbreviation for “Single Instruction and Single Data Stream.” It depicts the structure of a
single computer, which includes a control unit, a processor unit and a memory unit. Instructions are
performed sequentially; the system may or may not have internal parallel processing capability.
Like classic von Neumann computers, most conventional computers have the SISD architecture. Multiple
functional units or pipeline processing can be used to achieve parallel processing in this instance. The
SISD architecture model is shown in Figure 3:

Figure 3: SISD Architecture Model (the control unit sends an instruction stream, IS, to a single processing unit, which exchanges a data stream, DS, with the memory unit)


The advantages of SISD architecture are as follows:


 It consumes less power.
 A sophisticated communication protocol between several cores is not a concern.

The disadvantages of SISD architecture are as follows:


 SISD architecture, like single-core CPUs, has a speed limit.
 It is unsuitable for larger projects.

14.3.2 SIMD Architecture


The acronym SIMD stands for ‘Single Instruction, Multiple Data Stream.’ It symbolises an organization
with a large number of processing units overseen by a central control unit. The control unit sends the
same instruction to all processors, but they work on separate data. The SIMD architecture model is
shown in Figure 4:

Figure 4: SIMD Architecture Model (the control unit drives processing elements PE 1 to PE n over a control bus; each PE is connected to its own memory module by its own data bus)
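The broadcast behaviour of Figure 4 can be mimicked in plain C as a conceptual sketch (the names and the PE count are assumptions made for illustration):

    #define N_PE 8   /* number of processing elements (assumed) */

    /* The control unit broadcasts one operation; every PE applies it
       to the element in its own local memory. The C loop runs
       sequentially, but on a real SIMD machine all N_PE updates
       would happen in lock step, in a single step. */
    void broadcast_add(int local_mem[N_PE], int operand) {
        for (int pe = 0; pe < N_PE; pe++)
            local_mem[pe] += operand;
    }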


The advantages of SIMD architecture are as follows:
 A single instruction can perform the same operation on multiple data elements.
 By increasing the number of processing cores, the system’s throughput can be boosted.
 The processing speed is faster than that of the SISD design.

The disadvantages of SIMD architecture are as follows:


 Communication between processor cores is more sophisticated.
 The cost is higher than with the SISD design.

14.3.3 MISD Architecture


MISD is an acronym that stands for “Multiple Instruction and Single Data Stream.” Because no real
system has been built using the MISD structure, it is primarily of theoretical importance. Multiple
processing units work on a single data stream in MISD. Each processing unit works independently on
the data via a distinct instruction stream.


Figure 5 depicts the MISD architecture:

Figure 5: MISD Architecture Model (control units C.U.1 to C.U.n issue separate instruction streams, IS 1 to IS n, to processing units P.S.1 to P.S.n, which all operate on a single data stream, DS, drawn from memory units MU1 to MUn)

14.3.4 MIMD Architecture


MIMD (Multiple Instruction, Multiple Data) refers to a parallel architecture, which is the most
fundamental and well-known type of parallel processor. The main goal of MIMD is to achieve parallelism.
The MIMD architecture consists of a group of N tightly coupled processors. Each processor has its
own local memory, which cannot be accessed directly by the other processors.
The processors of the MIMD architecture work independently and asynchronously. Various processors
may be performing various instructions on various pieces of data at any given time. MIMD is further
classified into two broad categories:
 SPMD (Single Program, Multiple Data Streams)
 MPMD (Multiple Program, Multiple Data Streams)
The MIMD architecture is shown in Figure 6:

Figure 6: MIMD Architecture Model (each control unit C.U.1 to C.U.n issues its own instruction stream, IS 1 to IS n, to its processing unit P.S.1 to P.S.n, and each processing unit operates on its own data stream, DS 1 to DS n, held in memory units MU1 to MUn)
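A small POSIX-threads sketch in C (an illustrative assumption, not part of the unit) captures the MIMD idea: two threads execute different instruction streams on different data, independently and asynchronously.

    #include <pthread.h>
    #include <stdio.h>

    /* Each thread is a separate instruction stream working on its
       own data stream -- MIMD in its MPMD flavour. */
    void *sum_task(void *arg) {
        int *v = arg, s = 0;
        for (int i = 0; i < 4; i++) s += v[i];
        printf("sum = %d\n", s);
        return NULL;
    }

    void *max_task(void *arg) {
        int *v = arg, m = v[0];
        for (int i = 1; i < 4; i++) if (v[i] > m) m = v[i];
        printf("max = %d\n", m);
        return NULL;
    }

    int main(void) {
        int a[4] = {1, 2, 3, 4}, b[4] = {7, 5, 9, 6};
        pthread_t t1, t2;
        pthread_create(&t1, NULL, sum_task, a);
        pthread_create(&t2, NULL, max_task, b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }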


Some of the advantages of MIMD are as follows:


 Less contention
 High scalability
 MIMD offers flexibility

Some of the disadvantages of MIMD are as follows:


 Load balancing
 Deadlock situation prone
 Waste of bandwidth

14.4 FINE-GRAINED SIMD ARCHITECTURE


In fine-grained parallelism, a programme is broken down into a large number of small tasks, which are
assigned individually to many processors. The amount of work associated with each parallel task is
small and is evenly distributed among the processors; as a result, fine-grained parallelism makes load
balancing easier. Because each task processes only a small amount of data, a large number of processors
is required to complete the overall processing, which in turn increases the communication and
synchronisation overhead. Fine-grained parallelism is therefore best utilised in architectures that
enable rapid communication, and is best achieved with a shared-memory architecture that has low
communication overhead.
It is difficult for programmers to detect parallelism in a programme; therefore, it is usually the
compiler’s responsibility to detect fine-grained parallelism. An example of a fine-grained system (from
outside the parallel computing domain) is the system of neurons in our brain.
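As a sketch of this idea (the choice of OpenMP is an assumption; the unit names no particular tool), the directive below lets the compiler and runtime split a loop into per-iteration tasks, with schedule(static, 1) dealing the iterations out one at a time:

    #include <omp.h>   /* OpenMP; compile with e.g. -fopenmp */

    /* Fine-grained decomposition: each iteration is a tiny task,
       distributed evenly across the available processors. */
    void scale(float *x, int n, float k) {
        #pragma omp parallel for schedule(static, 1)
        for (int i = 0; i < n; i++)
            x[i] *= k;
    }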

14.5 COARSE-GRAINED SIMD ARCHITECTURE


In coarse-grained parallelism, a programme is partitioned into large tasks, so each processor conducts
a substantial amount of calculation. This may result in a load imbalance, with some tasks processing
the majority of the data while others remain idle. Furthermore, coarse-grained parallelism fails to
fully exploit the parallelism in the programme, because the majority of the computation is executed
sequentially on each processor. The benefit of this type of parallelism is its low communication and
synchronisation overhead, which matters in a message-passing architecture, where data communication
takes a long time.
Medium-grained parallelism is a compromise between fine-grained and coarse-grained parallelism, with
task sizes and communication times that are larger than in fine-grained parallelism but smaller than
in coarse-grained parallelism. This is where most general-purpose parallel computers belong.
As an example of fine-grained parallelism, assume there are 100 processors tasked with processing a
10*10 image. Ignoring the communication overhead, the 100 processors can process the image in one
clock cycle: each processor works on a single pixel and then communicates its result to the others.
Now consider a medium-grained scenario in which the 10*10 image is processed by 25 processors; the
image is processed in four clock cycles. Finally, if we lower the number of processors to two, the
processing takes 50 clock cycles. Each processor must process 50 pixels, which increases the
computation time, but the communication cost falls because fewer processors share data. This last
case is an example of coarse-grained parallelism.
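The cycle counts in these examples follow from a simple division of pixels by processors, as this toy C calculation (added for illustration, under the same no-overhead assumption) confirms:

    #include <stdio.h>

    int main(void) {
        const int pixels = 10 * 10;          /* the 10*10 image      */
        const int procs[] = {100, 25, 2};    /* fine, medium, coarse */
        for (int i = 0; i < 3; i++)
            /* ideal cycles = pixels per processor, no communication */
            printf("P = %3d -> %3d cycles\n",
                   procs[i], pixels / procs[i]);
        return 0;   /* prints 1, 4 and 50 cycles respectively */
    }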


14.6 CONCLUSION

 Parallel processing is a set of techniques that allows a computer system to perform multiple data-
processing tasks at the same time.
 SISD architecture is the structure of a single computer, which includes a control unit, a processor
unit, and a memory unit.
 SIMD symbolises an organization with a large number of processing units overseen by a central
control unit.
 Multiple processing units work on a single data stream in MISD.
 MIMD (Multiple Instruction, Multiple Data) refers to a parallel architecture, which is the most
fundamental and well-known type of parallel processor.

14.7 GLOSSARY

 Parallel processing: It is a set of techniques that allows a computer system to perform multiple
data-processing tasks at the same time.
 SISD architecture: It depicts the structure of a single computer, which includes a control unit, a
processor unit, and a memory unit.
 SIMD architecture: It symbolises an organization with a large number of processing units overseen
by a central control unit.
 MISD architecture: The multiple processing units work on a single data stream in MISD.
 MIMD architecture: It refers to a parallel architecture, which is the most fundamental and well-
known type of parallel processor.

14.8 SELF-ASSESSMENT QUESTIONS

A. Essay Type Questions


1. Explain the concept of SIMD architecture.
2. The primary goal of parallel processing is to improve the computer's processing capability. Describe
the significance of parallel processing.
3. Multiple processing units work on a single data stream in MISD. Discuss the concept of MISD
architecture.
4. Describe the concept of fine-grained SIMD architecture.
5. Coarse-grained type of parallelism has the benefit of low communication and synchronisation costs.
Discuss.

14.9 ANSWERS AND HINTS FOR SELF-ASSESSMENT QUESTIONS

A. Hints for Essay Type Questions


1. SIMD is an abbreviation that stands for single-instruction, multiple-data streams. The SIMD
parallel computing paradigm consists of two parts: a von Neumann-style front-end computer and a
processor array. Refer to Section SIMD Architecture
2. Parallel processing is a set of techniques that allows a computer system to perform multiple data-
processing tasks at the same time in order to boost the system’s computational speed. Refer to
Section Parallel Processing
3. MISD is an acronym that stands for “Multiple Instruction and Single Data Stream.” Refer to
Section Classification of Parallel Processing
4. A programme is broken down into a high number of tiny pieces in fine-grained parallelism tasks.
Many processors are allocated to these duties independently. Refer to Section Fine-Grained SIMD
Architecture
5. In coarse-grained parallelism, a programme is partitioned into large jobs. Processors conduct a
substantial amount of calculation as a result. Refer to Section Coarse-Grained SIMD Architecture

14.10 POST-UNIT READING MATERIAL

 https://www.geeksforgeeks.org/computer-organization-and-architecture-pipelining-set-1-execution-stages-and-throughput/
 https://www.geeksforgeeks.org/difference-between-simd-and-mimd/

14.11 TOPICS FOR DISCUSSION FORUMS

 Discuss the concept of SIMD architecture with your friends and classmates. Also, try to find
some real-world examples of SIMD architecture.
