Unit-5-Parallel Processing

The document discusses parallel processing techniques aimed at increasing computational speed, including pipeline and vector processing. It outlines Flynn's classification of computer architectures, detailing SISD, SIMD, MISD, and MIMD systems, and explains the concept of pipelining as a method to decompose processes into suboperations for concurrent execution. Additionally, it addresses pipeline conflicts and their resolutions, as well as the implementation of instruction pipelines in RISC architectures.

Uploaded by

ShivuAg


PIPELINE AND VECTOR PROCESSING

Parallel processing:
• Parallel processing is a term for a large class of techniques that provide simultaneous data-processing tasks in order to increase the computational speed of a computer system.

– The system may have two or more ALUs so that it can execute two or more instructions at the same time.

– The system may have two or more processors operating concurrently.

– Parallelism can also be achieved with multiple functional units that perform the same or different operations simultaneously.

• Example of parallel processing:

– Multiple functional units:

The execution unit is separated into eight functional units operating in parallel.

• Parallel processing can be classified in a variety of ways, according to:

– Internal organization of the processor

– Interconnection structure between processors

– Flow of information through the system

UNIT-V
Architectural Classification:

– Flynn's classification

» Based on the multiplicity of Instruction Streams and Data Streams

» Instruction Stream

• Sequence of Instructions read from memory

» Data Stream

• Operations performed on the data in the processor

• SISD represents an organization containing a single control unit, a processor unit and a memory unit. Instructions are executed sequentially, and the system may or may not have internal parallel processing capabilities.

• SIMD represents an organization that includes many processing units under the supervision of a common control unit.

• MISD structure is of only theoretical interest, since no practical system has been constructed using this organization.

• MIMD organization refers to a computer system capable of processing several programs at the same time.

The main difference between a multicomputer system and a multiprocessor system is that the multiprocessor system is controlled by one operating system that provides interaction between processors, and all the components of the system cooperate in the solution of a problem.
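Flynn's four categories follow directly from whether there is one or more than one instruction stream and data stream. The sketch below shows that mapping. The function name `flynn_class` and its integer arguments are illustrative assumptions, not part of the original notes.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category for the given stream counts (hypothetical helper)."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("need at least one instruction stream and one data stream")
    # "S" for a single stream, "M" for multiple streams
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"
```

For example, one control unit driving eight processing units would be `flynn_class(1, 8)`, which falls in the SIMD category.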

• Parallel processing can be discussed under the following topics:

– Pipeline processing

– Vector processing

– Array processors

PIPELINING:

• Pipelining is a technique of decomposing a sequential process into suboperations, with each suboperation being executed in a special dedicated segment that operates concurrently with all other segments.

• Each segment performs partial processing dictated by the way the task is partitioned.

• The result obtained from each segment is transferred to the next segment.

• The final result is obtained when the data have passed through all segments.

• Suppose we have to perform the following task:

• Each suboperation is performed in a segment within the pipeline. Each segment has one or two registers and a combinational circuit.
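The task itself is not reproduced in the text above, so the following sketch assumes a hypothetical multiply-add task, `a[i] * b[i] + c[i]`, split into three segments: load, multiply, add. The register names `r1` to `r4` and the function `run_pipeline` are illustrative only. The point is that each segment works from the values latched on the previous clock pulse, so different tasks occupy different segments at the same time.

```python
def run_pipeline(a, b, c):
    """Simulate a 3-segment pipeline computing a[i] * b[i] + c[i] (assumed task)."""
    n = len(a)
    r1 = r2 = r3 = r4 = None  # inter-segment registers, empty at the start
    results = []
    # n tasks need k + n - 1 = 3 + n - 1 clock pulses
    for clock in range(n + 2):
        # Update back to front so every segment reads the values
        # that were latched on the previous clock pulse.
        if r3 is not None:
            results.append(r3 + r4)         # segment 3: add
        if r1 is not None:
            r3, r4 = r1 * r2, c[clock - 1]  # segment 2: multiply, load c
        else:
            r3 = r4 = None
        if clock < n:
            r1, r2 = a[clock], b[clock]     # segment 1: load a and b
        else:
            r1 = r2 = None
    return results
```

With three input triples the loop runs for five clock pulses, and the first result leaves the pipeline on the third pulse.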

OPERATIONS IN EACH PIPELINE STAGE:

• General Structure of a 4-Segment Pipeline

• Space-Time Diagram

The following diagram shows 6 tasks, T1 through T6, executed in 4 segments.
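A space-time diagram of this kind can be regenerated from the rule that segment s holds task (clock - s + 1) in a given clock cycle, assuming every task advances one segment per cycle with no stalls. The helper name `space_time` and its text layout are assumptions made for illustration.

```python
def space_time(k, n):
    """Return one text row per segment for n tasks in a k-segment pipeline."""
    rows = []
    for seg in range(1, k + 1):
        cells = []
        # The pipeline is busy for k + n - 1 clock cycles in total
        for clock in range(1, k + n):
            task = clock - seg + 1  # task occupying this segment in this cycle
            cells.append(f"T{task}" if 1 <= task <= n else "--")
        rows.append(f"Segment {seg}: " + " ".join(cells))
    return rows


if __name__ == "__main__":
    for row in space_time(4, 6):
        print(row)
```

Each row has nine columns for k = 4 and n = 6, one per clock cycle.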

PIPELINE SPEEDUP:

Consider the case where a k-segment pipeline is used to execute n tasks.

• n = 6 in the previous example

• k = 4 in the previous example

• Pipelined machine (k stages, n tasks)

– The first task T1 requires k clock cycles to complete its operation, since there are k segments

– The remaining n - 1 tasks require n - 1 clock cycles

– Clock cycles to complete n tasks = k + (n - 1) (9 in the previous example)

• Conventional machine (non-pipelined)

– Cycles to complete each task in the non-pipelined machine = k

– For n tasks, the number of cycles required is nk

• Speedup (S)

– S = non-pipeline time / pipeline time

– For n tasks: S = nk / (k + n - 1)

– As n becomes much larger than k - 1, k + n - 1 approaches n; therefore S approaches nk / n = k

PIPELINE AND MULTIPLE FUNCTION UNITS:

Example:

- 4-stage pipeline

- 100 tasks to be executed

- 1 task in the non-pipelined system takes 4 clock cycles

Pipelined system: k + n - 1 = 4 + 99 = 103 clock cycles

Non-pipelined system: nk = 100 × 4 = 400 clock cycles

Speedup: Sk = 400 / 103 = 3.88
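The arithmetic in this example can be checked with a few lines of code. The function names below are illustrative, and the sketch assumes, as the notes do, that one non-pipelined task takes k clock cycles of the same length as a pipeline cycle.

```python
def pipeline_cycles(k, n):
    """Clock cycles for n tasks on a k-segment pipeline."""
    return k + n - 1


def nonpipeline_cycles(k, n):
    """Clock cycles for n tasks when each task takes k cycles."""
    return n * k


def speedup(k, n):
    """Speedup S = nk / (k + n - 1)."""
    return nonpipeline_cycles(k, n) / pipeline_cycles(k, n)


if __name__ == "__main__":
    print(pipeline_cycles(4, 100), nonpipeline_cycles(4, 100))
    print(round(speedup(4, 100), 2))
```

Increasing n while holding k fixed moves the result toward k, which is the limiting speedup stated above.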

Types of Pipelining:

• Arithmetic Pipeline

• Instruction Pipeline

ARITHMETIC PIPELINE:

• Pipeline arithmetic units are usually found in very high-speed computers.

• They are used to implement floating-point operations.

• We will now discuss the pipeline unit for floating-point addition and subtraction.

• The inputs to the floating-point adder pipeline are two normalized floating-point numbers.

• A and B are the mantissas, and a and b are the exponents.

• Floating-point addition and subtraction can be performed in four segments.

Floating-point adder:

[1] Compare the exponents

[2] Align the mantissas

[3] Add or subtract the mantissas

[4] Normalize the result

X = A × 10^a = 0.9504 × 10^3

Y = B × 10^b = 0.8200 × 10^2

1) Compare exponents:

3 - 2 = 1

2) Align mantissas:

X = 0.9504 × 10^3

Y = 0.08200 × 10^3

3) Add mantissas:

Z = 1.0324 × 10^3

4) Normalize result:

Z = 0.10324 × 10^4
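The four steps of this worked example can be traced in code. This is a minimal sketch for decimal addition only, written as a plain function rather than as overlapped hardware segments. The name `fp_add` and the use of Python's `decimal` module, chosen to keep the mantissas exact, are assumptions made for illustration.

```python
from decimal import Decimal


def fp_add(A, a, B, b):
    """Add A * 10**a and B * 10**b; return (mantissa, exponent)."""
    # Segment 1: compare the exponents
    diff = a - b
    # Segment 2: align the mantissa of the operand with the smaller exponent
    if diff >= 0:
        B, exp = B.scaleb(-diff), a
    else:
        A, exp = A.scaleb(diff), b
    # Segment 3: add the mantissas
    Z = A + B
    # Segment 4: normalize so that 0.1 <= |Z| < 1
    while abs(Z) >= 1:
        Z, exp = Z.scaleb(-1), exp + 1
    while Z != 0 and abs(Z) < Decimal("0.1"):
        Z, exp = Z.scaleb(1), exp - 1
    return Z, exp
```

Calling `fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2)` follows the same path as the example: the exponent difference is 1, the second mantissa is shifted right by one digit, and the sum is shifted right once more during normalization.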

Instruction Pipeline:

• Pipeline processing can occur not only in the data stream but in the instruction stream as well.

• An instruction pipeline reads consecutive instructions from memory while previous instructions are being executed in other segments.

• This causes the instruction fetch and execute segments to overlap and perform simultaneous operations.

Four Segment CPU Pipeline:

• The FI segment fetches the instruction.

• The DA segment decodes the instruction and calculates the effective address.

• The FO segment fetches the operand.

• The EX segment executes the instruction.

INSTRUCTION CYCLE:

Pipeline processing can also occur in the instruction stream. An instruction pipeline reads consecutive instructions from memory while previous instructions are being executed in other segments.

Six Phases* in an Instruction Cycle

[1] Fetch an instruction from memory

[2] Decode the instruction

[3] Calculate the effective address of the operand

[4] Fetch the operands from memory

[5] Execute the operation

[6] Store the result in the proper place

* Some instructions skip some phases

* Effective address calculation can be done as part of the decoding phase

* Storage of the operation result into a register is done automatically in the execution phase

==> 4-Stage Pipeline

[1] FI: Fetch an instruction from memory

[2] DA: Decode the instruction and calculate the effective address of the operand

[3] FO: Fetch the operand

[4] EX: Execute the operation
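The overlap between instructions in this four-stage pipeline can be tabulated as follows. The sketch assumes an ideal pipeline with no conflicts or branches, so instruction i simply enters FI one cycle after instruction i - 1. The names `STAGES` and `timing_table` are illustrative.

```python
STAGES = ("FI", "DA", "FO", "EX")


def timing_table(num_instructions):
    """Return one row per instruction; column c holds the stage used in clock cycle c + 1."""
    total_cycles = len(STAGES) + num_instructions - 1
    table = []
    for i in range(num_instructions):
        # Instruction i waits i cycles, then moves through one stage per cycle
        idle_after = total_cycles - i - len(STAGES)
        table.append(["--"] * i + list(STAGES) + ["--"] * idle_after)
    return table


if __name__ == "__main__":
    for number, row in enumerate(timing_table(3), start=1):
        print(f"I{number}: " + " ".join(row))
```

Reading down any column shows which stage each instruction occupies in that cycle; in the ideal case no two instructions share a stage.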

Pipeline Conflicts:

– Three major difficulties:

1) Resource conflicts: two segments access memory at the same time. Most of these conflicts can be resolved by using separate instruction and data memories.

2) Data dependency: an instruction depends on the result of a previous instruction, but this result is not yet available.

Example: an instruction with register indirect mode cannot proceed to fetch the operand if the previous instruction is still loading the address into the register.

3) Branch difficulties: branch and other instructions (interrupt, return, ...) that change the value of the PC.

Handling Data Dependency:

• This problem can be solved in the following ways:

– Hardware interlocks: a circuit detects the conflict and delays the instruction by enough clock cycles to resolve it.

– Operand forwarding: special hardware detects the conflict and avoids it by routing the data through a special path between pipeline segments.

– Delayed loads: the compiler detects the data conflict and reorders the instructions as necessary to delay the loading of the conflicting data, inserting no-operation instructions.
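The delayed-load idea can be sketched as a small compiler pass. The instruction format `(opcode, destination, sources)`, the opcode names, and the assumption that a single no-operation slot is enough to cover the load delay are all hypothetical choices made for illustration. A real compiler would first try to move a useful, independent instruction into the slot.

```python
def insert_delayed_loads(program):
    """Insert a NOP after each LOAD whose destination is read by the next instruction."""
    out = []
    for i, (op, dest, srcs) in enumerate(program):
        out.append((op, dest, srcs))
        nxt = program[i + 1] if i + 1 < len(program) else None
        # Conflict: the next instruction needs a register that is still being loaded
        if op == "LOAD" and nxt is not None and dest in nxt[2]:
            out.append(("NOP", None, ()))
    return out


if __name__ == "__main__":
    program = [
        ("LOAD", "R1", ("A",)),
        ("LOAD", "R2", ("B",)),
        ("ADD", "R3", ("R1", "R2")),
        ("STORE", "C", ("R3",)),
    ]
    for instruction in insert_delayed_loads(program):
        print(instruction)
```

In this sample program only the second load is followed by an instruction that reads its destination, so one no-operation is inserted before the add.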

Handling of Branch Instruction:

• Prefetch the target instruction

• Branch target buffer (BTB) included in the fetch segment of the pipeline

• Branch prediction

• Delayed branch

RISC Pipeline:

• The simplicity of the instruction set is utilized to implement an instruction pipeline using a small number of suboperations, each executed in a single clock cycle.

• Since all operations are performed in registers, there is no need for effective address calculation.

Three Segment Instruction Pipeline:

• I: Instruction fetch

• A: ALU operation

• E: Execute instruction

Delayed Load:

Delayed Branch:

Let us consider the program having the following 5 instructions
