Unit-5-Parallel Processing

The document discusses parallel processing techniques aimed at increasing computational speed, including pipeline and vector processing. It outlines Flynn's classification of computer architectures, detailing SISD, SIMD, MISD, and MIMD systems, and explains the concept of pipelining as a method to decompose processes into suboperations for concurrent execution. Additionally, it addresses pipeline conflicts and their resolutions, as well as the implementation of instruction pipelines in RISC architectures.

PIPELINE AND VECTOR PROCESSING

Parallel processing:

• Parallel processing is a term used for a large class of techniques that provide simultaneous data-processing tasks for the purpose of increasing the computational speed of a computer system.

 The system may have two or more ALUs so that it can execute two or more instructions at the same time.

 The system may have two or more processors operating concurrently.

 Parallelism can be achieved by having multiple functional units that perform the same or different operations simultaneously.

• Example of parallel processing:

– Multiple functional units: separate the execution unit into eight functional units operating in parallel.

 There are a variety of ways in which parallel processing can be classified:

 Internal organization of the processor

 Interconnection structure between processors

 Flow of information through the system

Architectural Classification:

– Flynn's classification

» Based on the multiplicity of Instruction Streams and Data Streams

» Instruction Stream

• Sequence of Instructions read from memory

» Data Stream

• Operations performed on the data in the processor

 SISD represents an organization containing a single control unit, a processor unit, and a memory unit. Instructions are executed sequentially, and the system may or may not have internal parallel processing capabilities.

 SIMD represents an organization that includes many processing units under the supervision of a common control unit.

 The MISD structure is of only theoretical interest, since no practical system has been constructed using this organization.

 MIMD refers to a computer system capable of processing several programs at the same time.

The main difference between a multicomputer system and a multiprocessor system is that the multiprocessor system is controlled by one operating system that provides interaction between processors, and all the components of the system cooperate in the solution of a problem.

 Parallel processing can be discussed under the following topics:

 Pipeline Processing

 Vector Processing

 Array Processors

PIPELINING:

• Pipelining is a technique of decomposing a sequential process into suboperations, with each subprocess being executed in a special dedicated segment that operates concurrently with all other segments.

• Each segment performs partial processing dictated by the way the task is partitioned.

• The result obtained from each segment is transferred to the next segment.

• The final result is obtained when the data have passed through all segments.

• Each suboperation is performed in a segment within the pipeline. Each segment has one or two registers and a combinational circuit.

OPERATIONS IN EACH PIPELINE STAGE:

• General Structure of a 4-Segment Pipeline

• Space-Time Diagram

The space-time diagram shows 6 tasks T1 through T6 executed in 4 segments.
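The behaviour shown in the space-time diagram can be sketched in a short simulation. The segment count, task labels, and function name below are illustrative, not tied to a particular machine:

```python
# Sketch: print a space-time diagram for a k-segment pipeline executing
# n tasks, one row per clock cycle and one column per segment.

def space_time(k=4, n=6):
    """Return a table: rows are clock cycles, columns are segments."""
    total = k + n - 1                      # cycles to finish all n tasks
    table = []
    for cycle in range(1, total + 1):
        row = []
        for seg in range(1, k + 1):
            task = cycle - seg + 1         # task occupying this segment now
            row.append(f"T{task}" if 1 <= task <= n else "--")
        table.append(row)
    return table

for cycle, row in enumerate(space_time(), start=1):
    print(f"cycle {cycle}: " + " ".join(row))
```

Running this for k = 4 and n = 6 produces 9 rows, matching the k + (n - 1) = 9 clock cycles derived in the speedup discussion below.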

PIPELINE SPEEDUP:

Consider the case where a k-segment pipeline is used to execute n tasks.

 n = 6 in previous example

 k = 4 in previous example

• Pipelined Machine (k stages, n tasks)

 The first task T1 requires k clock cycles to complete its operation, since there are k segments.

 The remaining n - 1 tasks emerge from the pipeline at the rate of one task per clock cycle, requiring n - 1 additional cycles.

 Total clock cycles for n tasks = k + (n - 1) (9 in the previous example)

• Conventional Machine (Non-Pipelined)

 Cycles to complete each task in a non-pipelined machine = k

 For n tasks, nk cycles are required.

• Speedup (S)

 S = Nonpipeline time /Pipeline time

 For n tasks: S = nk/(k+n-1)

 As n becomes much larger than k - 1, (k + n - 1) approaches n; therefore, S ≈ nk/n = k.

PIPELINE AND MULTIPLE FUNCTION UNITS:

Example:

- 4-stage pipeline

- 100 tasks to be executed

- each task takes 4 clock cycles in a non-pipelined system

Pipelined System : k + n - 1 = 4 + 99 = 103 clock cycles

Non-Pipelined System : n*k = 100 * 4 = 400 clock cycles

Speedup : Sk = 400 / 103 ≈ 3.88
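These figures can be checked with a small calculation; the function names are illustrative:

```python
# Sketch: verify the speedup numbers for k = 4 stages and n = 100 tasks.

def pipeline_cycles(k, n):
    return k + n - 1          # first task: k cycles; each later task: 1 more

def nonpipeline_cycles(k, n):
    return n * k              # every task takes all k cycles serially

k, n = 4, 100
speedup = nonpipeline_cycles(k, n) / pipeline_cycles(k, n)
print(pipeline_cycles(k, n), nonpipeline_cycles(k, n), round(speedup, 2))
# → 103 400 3.88
```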

Types of Pipelining:

• Arithmetic Pipeline

• Instruction Pipeline

ARITHMETIC PIPELINE:

 Pipeline arithmetic units are usually found in very high-speed computers.

 They are used to implement floating-point operations.

 We will now discuss the pipeline unit for floating-point addition and subtraction.

 The inputs to the floating-point adder pipeline are two normalized floating-point numbers.

 A and B are the mantissas, and a and b are the exponents.

 Floating-point addition and subtraction can be performed in four segments.

Floating-point adder:

[1] Compare the exponents

[2] Align the mantissa

[3] Add/sub the mantissa

[4] Normalize the result

X = A × 10^a = 0.9504 × 10^3

Y = B × 10^b = 0.8200 × 10^2

1) Compare exponents:

3 - 2 = 1

2) Align mantissas:

X = 0.9504 × 10^3

Y = 0.08200 × 10^3

3) Add mantissas:

Z = 1.0324 × 10^3

4) Normalize result:

Z = 0.10324 × 10^4
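The four segments above can be sketched as a function. The base-10 (mantissa, exponent) representation follows the worked example, and the name fp_add is illustrative:

```python
# Sketch of the four pipeline segments for floating-point addition, on
# (mantissa, exponent) pairs in base 10.

def fp_add(x, y):
    (a_man, a_exp), (b_man, b_exp) = x, y
    # Segment 1: compare the exponents (larger exponent wins)
    if a_exp < b_exp:
        (a_man, a_exp), (b_man, b_exp) = (b_man, b_exp), (a_man, a_exp)
    diff = a_exp - b_exp
    # Segment 2: align the smaller mantissa by shifting it right
    b_man = b_man / (10 ** diff)
    # Segment 3: add the mantissas
    z_man, z_exp = a_man + b_man, a_exp
    # Segment 4: normalize so the mantissa is a fraction less than 1
    while abs(z_man) >= 1:
        z_man /= 10
        z_exp += 1
    return round(z_man, 5), z_exp

print(fp_add((0.9504, 3), (0.8200, 2)))   # → (0.10324, 4)
```

In a real pipeline each segment holds a different pair of operands at once; this sequential version only shows what each segment computes.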

Instruction Pipeline:

 Pipeline processing can occur not only in the data stream but in the instruction stream as well.

 An instruction pipeline reads consecutive instructions from memory while previous instructions are being executed in other segments.

 This causes the instruction fetch and execute segments to overlap and perform simultaneous operations.

Four Segment CPU Pipeline:

 The FI segment fetches the instruction.

 The DA segment decodes the instruction and calculates the effective address.

 The FO segment fetches the operand.

 The EX segment executes the instruction.

INSTRUCTION CYCLE:


Six Phases* in an Instruction Cycle

[1] Fetch an instruction from memory

[2] Decode the instruction

[3] Calculate the effective address of the operand

[4] Fetch the operands from memory

[5] Execute the operation

[6] Store the result in the proper place

* Some instructions skip some phases.

* Effective address calculation can be done as part of the decoding phase.

* Storage of the operation result in a register is done automatically in the execution phase.

==> 4-Stage Pipeline

[1] FI: Fetch an instruction from memory

[2] DA: Decode the instruction and calculate the effective address of the operand

[3] FO: Fetch the operand

[4] EX: Execute the operation

Pipeline Conflicts:

– There are 3 major difficulties:

1) Resource conflicts: caused by memory access from two segments at the same time. Most of these conflicts can be resolved by using separate instruction and data memories.

2) Data dependency: arises when an instruction depends on the result of a previous instruction, but this result is not yet available.

Example: an instruction with register indirect mode cannot proceed to fetch the operand
if the previous instruction is loading the address into the register.

3) Branch difficulties: caused by branch and other instructions (interrupt, return, ...) that change the value of the PC.

Handling Data Dependency:

 This problem can be solved in the following ways:

 Hardware interlocks: a circuit that detects the conflict situation and delays the instruction by sufficient cycles to resolve the conflict.

 Operand forwarding: uses special hardware to detect the conflict and avoid it by routing the data through a special path between pipeline segments.

 Delayed loads: the compiler detects the data conflict and reorders the instructions as necessary to delay the loading of the conflicting data, inserting no-operation (NOP) instructions where needed.
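As a rough illustration of the delayed-load idea, the sketch below shows a compiler-style pass that inserts a NOP when an instruction uses the register written by the immediately preceding load. The (op, dest, srcs) instruction format and all names are illustrative assumptions, not a real compiler API:

```python
# Sketch: insert a NOP between a LOAD and a dependent instruction so the
# loaded value is available before it is used.

def insert_delay_nops(program):
    out = []
    for instr in program:
        op, dest, srcs = instr
        if out:
            prev_op, prev_dest, _ = out[-1]
            if prev_op == "LOAD" and prev_dest in srcs:
                out.append(("NOP", None, ()))   # delay the dependent use
        out.append(instr)
    return out

prog = [("LOAD", "R1", ("M[100]",)),
        ("ADD",  "R2", ("R1", "R3"))]           # uses R1 right after the load
for instr in insert_delay_nops(prog):
    print(instr)
```

Operand forwarding would avoid the wasted cycle entirely by routing the loaded value directly between pipeline segments; delayed loads trade hardware simplicity for an occasional NOP.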

Handling of Branch Instruction:

 Prefetch the target instruction.

 Branch target buffer (BTB) included in the fetch segment of the pipeline.

 Branch prediction.

 Delayed branch.

RISC Pipeline:

 The simplicity of the instruction set is utilized to implement an instruction pipeline using a small number of suboperations, each executed in a single clock cycle.

 Since all operations are performed on registers, there is no need for effective address calculation.

Three Segment Instruction Pipeline:

 I: Instruction Fetch

 A: ALU Operation

 E: Execute Instruction

Delayed Load:

Delayed Branch:

Consider a program consisting of five instructions.
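Since the original five-instruction program is not reproduced here, the following hypothetical sketch illustrates the idea: the compiler moves an instruction that does not affect the branch into the delay slot, replacing the placeholder NOP. The instruction strings and function name are illustrative:

```python
# Sketch: delayed-branch scheduling. The instruction after a branch (the
# delay slot) always executes, so the compiler fills it with useful work.

def fill_delay_slot(program):
    """If the instruction before a BRANCH is independent of it (assumed
    here), move it into the delay slot and drop the placeholder NOP."""
    out = list(program)
    for i, instr in enumerate(out):
        if instr.startswith("BRANCH") and i >= 1 and i + 1 < len(out) and out[i + 1] == "NOP":
            out[i - 1], out[i] = out[i], out[i - 1]   # branch moves up one slot
            del out[i + 1]                            # NOP no longer needed
            break
    return out

naive = ["LOAD R1, A", "INC R2", "BRANCH X", "NOP"]   # delay slot wasted on a NOP
print(fill_delay_slot(naive))
```

The scheduled program is one instruction shorter and has the same architectural effect, because the moved instruction executes while the branch target is being fetched.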
