
UNIT 6

BCA 2nd semester


Parallel Processing
• Parallel processing is a term used to denote a large class of techniques that
are used to provide simultaneous data-processing tasks for the purpose of
increasing the computational speed of a computer system.
• Instead of processing each instruction sequentially as in a conventional
computer, a parallel processing system is able to perform concurrent data
processing to achieve faster execution time.
• For example, while an instruction is being executed in the ALU, the next
instruction can be read from memory. The system may have two or more
ALUs and be able to execute two or more instructions at the same time.
• Furthermore, the system may have two or more processors operating
concurrently. The purpose of parallel processing is to speed up the computer
processing capability and increase its throughput, that is, the amount of
processing that can be accomplished during a given interval of time.
• The amount of hardware increases with parallel processing, and with it the
cost of the system. However, technological developments have reduced hardware
costs to the point where parallel processing techniques are economically
feasible.
Pipelining
• Pipelining is a technique of decomposing a sequential process into sub-
operations, with each sub-process being executed in a special dedicated
segment that operates concurrently with all other segments.
• A pipeline can be visualized as a collection of processing segments
through which binary information flows.
• Each segment performs partial processing dictated by the way the task is
partitioned. The result obtained from the computation in each segment is
transferred to the next segment in the pipeline. The final result is
obtained after the data have passed through all segments.
• It is characteristic of pipelines that several computations can be in
progress in distinct segments at the same time.
• The overlapping of computation is made possible by associating a
register with each segment in the pipeline. The registers provide
isolation between each segment so that each can operate on distinct data
simultaneously.
Pipelining Example
• The pipeline organization will be demonstrated by means of a simple example.
Suppose that we want to perform the combined multiply and add operations
Ai * Bi + Ci with a stream of numbers.
• Each sub-operation is to be implemented in a segment within a pipeline. Each
segment has one or two registers and a combinational circuit, as shown in the
figure. R1 through R5 are registers that receive new data with every clock
pulse. The multiplier and adder are combinational circuits.
• The sub-operations performed in each segment of the pipeline are as follows:
Segment 1: R1 ← Ai, R2 ← Bi (input Ai and Bi)
Segment 2: R3 ← R1 * R2, R4 ← Ci (multiply and input Ci)
Segment 3: R5 ← R3 + R4 (add Ci to the product)
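• A minimal Python sketch of this example (illustrative code, not from the
original slides): variables stand in for registers R1 through R5, each loop
iteration is one clock pulse, and transfers are ordered back-to-front so every
segment consumes the values latched on the previous cycle.

def pipeline_multiply_add(A, B, C):
    """Simulate the three-segment pipeline computing A[i]*B[i] + C[i]."""
    R1 = R2 = R3 = R4 = R5 = None
    results = []
    n = len(A)
    for clock in range(n + 2):                  # n items take n + 2 clocks to drain
        if R3 is not None:
            R5 = R3 + R4                        # segment 3: add
            results.append(R5)
        R3 = R1 * R2 if R1 is not None else None         # segment 2: multiply
        R4 = C[clock - 1] if 1 <= clock <= n else None   # segment 2: input Ci
        R1, R2 = (A[clock], B[clock]) if clock < n else (None, None)  # segment 1
    return results

print(pipeline_multiply_add([1, 2, 3], [4, 5, 6], [7, 8, 9]))   # [11, 18, 27]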
• A task is defined as the total operation performed in going through all the
segments of the pipeline. The behavior of a pipeline can be illustrated with a
space-time diagram, which shows the segment utilization as a function of time.
With k segments and n tasks, the first task completes after k clock cycles and
one further task completes in each cycle thereafter, so all n tasks finish in
k + n - 1 cycles (9 cycles for k = 4 and n = 6). The space-time diagram of a
four-segment pipeline is given below:
• Figure: Space-time diagram of a four-segment pipeline executing six tasks
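• The diagram can be generated mechanically; the Python sketch below
(illustrative, not from the original slides) prints which task occupies each
segment at every clock cycle.

def space_time_diagram(k, n):
    """Print the space-time diagram of a k-segment pipeline running n tasks.

    Task i occupies segment s during clock cycle i + s - 1, so the last
    task leaves the last segment at cycle n + k - 1.
    """
    cycles = n + k - 1
    print("Clock:  " + " ".join(f"{c:>3}" for c in range(1, cycles + 1)))
    for s in range(1, k + 1):
        row = []
        for c in range(1, cycles + 1):
            i = c - s + 1                       # task in segment s at cycle c
            row.append(f"T{i:<2}" if 1 <= i <= n else "   ")
        print(f"Seg {s}:  " + " ".join(row))

space_time_diagram(k=4, n=6)                    # the 4-segment, 6-task diagram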
Instruction Pipeline
• Pipeline processing can occur not only in the data stream but in the instruction
stream as well. An instruction pipeline reads consecutive instructions from
memory while previous instructions are being executed in other segments.
• This causes the instruction fetch and execute phases to overlap and perform
simultaneous operations.
• Computers with complex instructions require other phases in addition to the fetch
and execute to process an instruction completely. In the most general case, the
computer needs to process each instruction with the following sequence of steps:
1. Fetch the instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.
• The design of an instruction pipeline will be most efficient if the instruction cycle
is divided into segments of equal duration. The time that each step takes to fulfill
its function depends on the instruction and the way it is executed.
Example: Four-Segment Instruction Pipeline
The figure shows the operation of a four-segment instruction pipeline, which
combines the six steps above into four segments:
1. FI: segment 1 that fetches the instruction.
2. DA: segment 2 that decodes the instruction and calculates the effective
address.
3. FO: segment 3 that fetches the operands.
4. EX: segment 4 that executes the instruction.
The space-time diagram for the four-segment instruction pipeline is given
below:
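• The Python sketch below (illustrative, not from the original slides) traces
six instructions through the four segments cycle by cycle, assuming no
hazards; each instruction simply advances one segment per clock.

from collections import deque

def run_pipeline(program):
    """Shift instructions through FI -> DA -> FO -> EX, one segment per clock."""
    stages = deque([None] * 4, maxlen=4)        # index 0 is FI, index 3 is EX
    pending = deque(program)
    cycle = 0
    while pending or any(list(stages)[:3]):     # stop once only EX has retired
        cycle += 1
        stages.appendleft(pending.popleft() if pending else None)  # EX drops off
        named = zip(("FI", "DA", "FO", "EX"), stages)
        print(f"cycle {cycle}: " + "  ".join(f"{n}={i or '-'}" for n, i in named))

run_pipeline(["I1", "I2", "I3", "I4", "I5", "I6"])   # finishes in 6 + 4 - 1 = 9 cycles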
Pipeline Conflicts (Hazards)
• A pipeline hazard occurs when a condition arises that prevents the next
instruction from executing in its designated clock cycle, stalling the
pipeline. In general, there are three major difficulties that cause the
instruction pipeline to deviate from its normal operation:
1. Resource conflicts caused by access to memory by two
segments at the same time. Most of these conflicts can be
resolved by using separate instruction and data memories.
2. Data dependency conflicts arise when an instruction depends
on the result of a previous instruction, but this result is not yet
available.
3. Branch difficulties arise from branch and other instructions that
change the value of PC.
Data Dependency
• A data dependency arises when an instruction depends on the result of a
previous instruction, but that result is not yet available.
• For example, an instruction in the operand-fetch segment may need an operand
that is still being generated by the previous instruction in the execute
segment.
• The most common techniques used to resolve data hazards are:
(a) Hardware interlock - a hardware interlock is a circuit that detects
instructions whose source operands are destinations of instructions farther up
in the pipeline. It then stalls the pipeline for enough clock cycles to delay
the execution of such instructions (a sketch of interlock stalls follows this
list).
(b) Operand forwarding - this method uses special hardware to detect conflicts
in instruction execution and avoid them by routing the data through a special
path between pipeline segments. For example, instead of transferring an ALU
result only into its destination register, the hardware checks whether the
result is needed by the next instruction and, if so, passes it directly to the
ALU input, bypassing the register.
(c) Delayed load - a software solution in which the compiler detects the
conflicts and reorders the instructions, delaying the load of conflicting data
by inserting no-operation (NOP) instructions.
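• A minimal sketch of the hardware-interlock idea (illustrative; the
three-field instruction format name/destination/sources, and the rule of
stalling while the producer sits in FO or EX, are simplifying assumptions for
the demo): an instruction is held in DA while any of its source registers is
still the destination of an older instruction, and the bubble shows up as '-'.

def run_with_interlock(program):
    """Four-segment pipeline where DA stalls on unresolved source operands."""
    fi = da = fo = ex = None
    pending = list(program)
    cycle = 0
    while pending or fi or da or fo:
        cycle += 1
        # Interlock check: does DA's instruction read a register that an
        # older instruction (still in FO or EX) has not yet written?
        stall = da is not None and any(
            older is not None and older[1] in da[2] for older in (fo, ex)
        )
        ex = fo                                 # the oldest instruction always advances
        if stall:
            fo = None                           # insert a bubble behind the stalled DA
        else:
            fo, da, fi = da, fi, None
            if pending:
                fi = pending.pop(0)
        names = [(i[0] if i else "-") for i in (fi, da, fo, ex)]
        print(f"cycle {cycle}: FI={names[0]} DA={names[1]} FO={names[2]} EX={names[3]}")

run_with_interlock([
    ("LOAD R1",   "R1", set()),
    ("ADD R2,R1", "R2", {"R1"}),                # needs R1: stalls until LOAD clears
    ("SUB R3,R4", "R3", {"R4"}),                # independent: proceeds normally
])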
Handling of Branch Instructions
• A branch hazard arises from branch and other instructions that change the
value of the program counter (PC). Conditional branches are especially
troublesome because, until the condition is evaluated, the pipeline cannot
tell whether the branch will be taken. A variety of approaches have been used
to deal with branch hazards; they are described below.
(a) Prefetch branch target - when a conditional branch is recognized, the
target of the branch is prefetched, in addition to the instruction following
the branch. This target is then saved until the branch instruction is
executed. If the branch is taken, the target has already been prefetched.
(b) Branch prediction - uses additional logic to predict the outcome of a
(conditional) branch before it is executed. Popular approaches are: predict
never taken, predict always taken, predict by opcode, the taken/not taken
switch, and the branch history table (a small predictor sketch follows this
list).
(c) Loop buffer - a loop buffer is a small, very-high-speed memory maintained
by the instruction fetch stage of the pipeline and containing the most
recently fetched instructions, in sequence. If a branch is to be taken, the
hardware first checks whether the branch target is within the buffer. If so,
the next instruction is fetched from the buffer.
(d) Delayed branch - this technique is employed in most RISC processors. The
compiler detects branch instructions and rearranges the code so that useful
instructions (or NOPs) fill the slots immediately following a branch, keeping
the pipeline busy regardless of the branch outcome.
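• A small sketch of the taken/not-taken switch mentioned in (b), implemented
here as a two-bit saturating counter per table entry (the table size and the
example branch address are illustrative assumptions, not a specific design):

class BranchPredictor:
    """A tiny branch history table of 2-bit saturating counters."""
    def __init__(self, entries=16):
        self.table = [1] * entries              # counters 0..3; start weakly not-taken
        self.mask = entries - 1

    def predict(self, pc):
        return self.table[pc & self.mask] >= 2  # 2 or 3 means predict taken

    def update(self, pc, taken):
        i = pc & self.mask
        self.table[i] = min(3, self.table[i] + 1) if taken else max(0, self.table[i] - 1)

bp = BranchPredictor()
for taken in [True, True, False, True, True]:   # a loop branch, taken 4 times in 5
    print("predict taken:", bp.predict(0x40), "| actual:", taken)
    bp.update(0x40, taken)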
Vector Processing
• Vector processing is a procedure for speeding the processing of information
by a computer, in which pipelined units perform arithmetic operations on
uniform, linear arrays of data values, and a single instruction involves the
execution of the same operation on every element of the array.
• There is a class of computational problems that are beyond the
capabilities of a conventional computer. These problems are
characterized by the fact that they require a vast number of computations
that will take a conventional computer days or even weeks to complete.
• In many science and engineering applications, the problems can be
formulated in terms of vectors and matrices that lend themselves to
vector processing.
• To achieve the required level of high performance, it is necessary to
utilize the fastest and most reliable hardware and to apply innovative
procedures from vector and parallel processing techniques.
Application Areas of Vector Processing
• Computers with vector processing capabilities are in demand in specialized
applications. The following are representative application areas where vector
processing is of the utmost importance.
- Long-range weather forecasting
- Petroleum explorations
- Seismic data analysis
- Medical diagnosis
- Aerodynamics and space flight simulations
- Artificial intelligence and expert systems
- Mapping the human genome
- Image processing
Vector Operations
• Many scientific problems require arithmetic operations on large arrays of
numbers. These numbers are usually formulated as vectors and matrices of
floating-point numbers.
• A vector is an ordered set of data items, i.e., a one-dimensional array. A
vector V of length n is represented as a row vector by V = [V1, V2, V3, …, Vn]
• A conventional sequential computer is capable of processing operands one at a
time. Consequently, operations on vectors must be broken down into single
computations with subscripted variables. The element Vi of vector V is written
as V(I) and the index I refers to a memory address or register where the
number is stored.
• To examine the difference between a conventional scalar processor and a
vector processor, consider the following Fortran DO loop, a program for adding
two vectors A and B of length 100 to produce a vector C:

      DO 20 I = 1, 100
   20 C(I) = A(I) + B(I)
• A computer capable of vector processing eliminates the overhead associated
with the time it takes to fetch and execute the instructions in the program
loop. It allows the operation to be specified with a single vector instruction
of the form
C(1 : 100) = A(1 : 100) + B(1 : 100)
• The vector instruction includes the initial address of the operands, the
length of the vectors, and the operation to be performed, all in one composite
instruction.
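• A software analogue of this contrast (illustrative; NumPy's vectorized
addition stands in for the single composite vector instruction, while the
explicit loop mirrors the scalar machine):

import numpy as np

a = np.arange(100.0)
b = np.arange(100.0)

# Scalar style: one element per trip through the loop, with loop overhead.
c_scalar = np.empty(100)
for i in range(100):
    c_scalar[i] = a[i] + b[i]

# Vector style: one composite operation over the whole array,
# dispatched to optimized (often SIMD) native code.
c_vector = a + b

assert np.array_equal(c_scalar, c_vector)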
Matrix Multiplication
• Matrix multiplication is one of the most computationally intensive
operations performed in computers with vector processors. An n x m matrix of
numbers has n rows and m columns and may be considered as constituting a set
of n row vectors or a set of m column vectors. Consider, for example, the
multiplication of two 3 x 3 matrices A and B. Each element cij of the product
is the inner product of the ith row of A with the jth column of B, so the
3 x 3 product requires nine inner products.
Inner Product
• In general, the inner product C of two vectors A and B consists of the sum
of k product terms of the form
C = A1B1 + A2B2 + A3B3 + … + AkBk
• In a typical application k may be equal to 100 or even 1000. On a pipeline
vector processor, the products can be accumulated into several interleaved
partial sums (one per segment of the pipelined adder) that are added together
at the end. The inner product calculation on a pipeline vector processor is
shown below:
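• An illustrative Python sketch of the interleaved accumulation (plain floats
stand in for the pipeline hardware; the choice of four partial sums assumes a
four-segment adder):

def pipelined_inner_product(a, b, segments=4):
    """Accumulate products into `segments` interleaved partial sums."""
    partial = [0.0] * segments
    for i, (x, y) in enumerate(zip(a, b)):
        partial[i % segments] += x * y          # products enter adder segments in rotation
    return sum(partial)                         # final combination of the partial sums

a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
print(pipelined_inner_product(a, b))            # 120.0, the sum of Ai * Bi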
Flynn's Classification of Parallel Processing
• There are a variety of ways that parallel processing can be classified. It can be
considered from the internal organization of the processors, from the
interconnection structure between processors, or from the flow of information
through the system.
• One classification introduced by M. J. Flynn considers the organization of a
computer system by the number of instructions and data items that are
manipulated simultaneously.
• The normal operation of a computer is to fetch instructions from memory and
execute them in the processor. The sequence of instructions read from memory
constitutes an instruction stream. The operations performed on the data in the
processor constitute a data stream. Parallel processing may occur in the
instruction stream, in the data stream, or in both.
• Flynn's classification divides computers into four major groups as follows:
1. Single instruction stream, single data stream (SISD)
2. Single instruction stream, multiple data stream (SIMD)
3. Multiple instruction stream, single data stream (MISD)
4. Multiple instruction stream, multiple data stream (MIMD)
• SISD represents the organization of a single computer containing a control
unit, a processor unit, and a memory unit. Instructions are executed
sequentially and the system may or may not have internal parallel processing
capabilities. Parallel processing in this case may be achieved by means of
multiple functional units or by pipeline processing.
• SIMD represents an organization that includes many processing units under
the supervision of a common control unit. All processors receive the same
instruction from the control unit but operate on different items of data. The
shared memory unit must contain multiple modules so that it can
communicate with all the processors simultaneously.
• MISD structure is only of theoretical interest since no practical system has
been constructed using this organization.
• MIMD organization refers to a computer system capable of processing
several programs at the same time. Most multiprocessor and multicomputer
systems can be classified in this category.
