Vector Processing and Pipelining

This document discusses parallel processing and pipelining. It describes four types of parallel processing: SISD, SIMD, MISD, and MIMD. SIMD involves a single control unit and multiple processing units operating on different data simultaneously. Pipelining involves dividing a process into sequential stages and executing each stage concurrently across dedicated hardware. Pipelining can increase throughput by allowing new tasks to begin before previous tasks finish. The document provides an example comparing sequential and pipelined laundry processes. It also discusses pipeline performance metrics like speedup.

Uploaded by

praveenpin2

Chapter 9
Pipeline and Vector Processing

Dr. Bernard Chen Ph.D.
University of Central Arkansas
Spring 2009

Parallel processing
• A parallel processing system is able to perform concurrent data processing to achieve faster execution time
• The system may have two or more ALUs and be able to execute two or more instructions at the same time
• The goal is to increase the throughput: the amount of processing that can be accomplished during a given interval of time

Parallel processing classification
• Single instruction stream, single data stream – SISD
• Single instruction stream, multiple data stream – SIMD
• Multiple instruction stream, single data stream – MISD
• Multiple instruction stream, multiple data stream – MIMD

Single instruction stream, single data stream – SISD
• A single computer containing a control unit, a processor unit, and a memory unit
• Instructions are executed sequentially. Parallel processing may be achieved by means of multiple functional units or by pipeline processing

Single instruction stream, multiple data stream – SIMD
• Represents an organization that includes many processing units under the supervision of a common control unit
• All processors receive the same instruction from the single control unit, but operate on different data
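
The SIMD idea can be illustrated with a minimal Python sketch. This is only an analogy, not a model of real hardware: the names `simd_execute`, `double`, and `lanes` are hypothetical, and the list comprehension stands in for processing units that would actually run at the same time.

```python
# Hypothetical illustration of SIMD: one instruction, many data items.

def simd_execute(instruction, data_lanes):
    """A single control unit broadcasts `instruction`; each processing
    unit applies it to its own data item."""
    return [instruction(item) for item in data_lanes]


def double(x):
    """The one instruction that every processing unit receives."""
    return 2 * x


lanes = [1, 2, 3, 4]            # one data item per processing unit
result = simd_execute(double, lanes)
print(result)                   # [2, 4, 6, 8]
```
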
Multiple instruction stream, single data stream – MISD
• Of theoretical interest only
• Processors receive different instructions, but operate on the same data
Multiple instruction stream, multiple data stream – MIMD
• A computer system capable of processing several programs at the same time
• Most multiprocessor and multicomputer systems can be classified in this category
Pipelining: Laundry Example
• A small laundry has one washer, one dryer, and one operator. It takes 90 minutes to finish one load (the loads are labeled A, B, C, D):
• Washer takes 30 minutes
• Dryer takes 40 minutes
• “Operator folding” takes 20 minutes
Sequential Laundry
[Figure: time axis from 6 PM to midnight. Loads A, B, C, and D run one after another in task order; each load takes 30 + 40 + 20 = 90 min.]
• This operator schedules his loads to be delivered to the laundry every 90 minutes, which is the time required to finish one load. In other words, he will not start a new task until the previous task is done.
• The process is sequential. Sequential laundry takes 6 hours for 4 loads.
Efficiently scheduled laundry: Pipelined Laundry
The operator starts work ASAP
[Figure: time axis from 6 PM to midnight. Loads A, B, C, and D overlap in task order; the stage times along the axis are 30, 40, 40, 40, 40, 20 min.]
• Another operator asks for loads to be delivered to the laundry every 40 minutes.
• Pipelined laundry takes 3.5 hours for 4 loads.
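
The two totals can be checked with a short simulation. This is a minimal sketch: the stage times come from the slides, while the function names and the assumption that a finished load can wait between stages are illustrative choices, not part of the original example.

```python
STAGES = [30, 40, 20]  # washer, dryer, folding, in minutes (from the slides)


def sequential_minutes(loads, stages=STAGES):
    """Each load runs through every stage before the next load starts."""
    return loads * sum(stages)


def pipelined_minutes(loads, stages=STAGES):
    """Each stage starts a load as soon as the load has left the previous
    stage and the stage itself is free (assumed: loads may wait in between)."""
    finish = [0] * len(stages)  # time at which each stage last became free
    for _ in range(loads):
        ready = 0               # time at which this load left the previous stage
        for s, duration in enumerate(stages):
            start = max(ready, finish[s])
            finish[s] = start + duration
            ready = finish[s]
    return finish[-1]


print(sequential_minutes(4))  # 360 minutes = 6 hours
print(pipelined_minutes(4))   # 210 minutes = 3.5 hours
```

The pipelined total is 30 + 4 × 40 + 20 minutes: after the first wash, the 40-minute dryer sets the pace for every load.
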
Pipelining Facts
• Multiple tasks operate simultaneously
• Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload
• Pipeline rate is limited by the slowest pipeline stage
• Potential speedup = number of pipe stages
• Unbalanced lengths of pipe stages reduce speedup
• Time to “fill” the pipeline and time to “drain” it reduce speedup
[Figure: time axis from 6 PM to 9 PM. Loads A, B, C, and D overlap in task order; the stage times along the axis are 30, 40, 40, 40, 40, 20 min. Annotation: the washer waits for the dryer for 10 minutes.]
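
As a worked check of these facts, the laundry numbers give a speedup well below the three-stage potential. The second line is a hypothetical comparison, assuming all three stages were balanced at 30 minutes each; it is not part of the original example.

```latex
\[
\text{Speedup}_{\text{laundry}}
  = \frac{4 \times 90}{30 + 4 \times 40 + 20}
  = \frac{360}{210} \approx 1.71
\]
\[
\text{Speedup}_{\text{balanced}}
  = \frac{4 \times (3 \times 30)}{(3 + 4 - 1) \times 30}
  = \frac{360}{180} = 2
\]
```

Both values fall short of 3 because of the fill and drain time with only 4 loads; the unbalanced stages reduce the laundry figure further.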
9.2 Pipelining
• Pipelining decomposes a sequential process into segments.
• The processor is divided into segment processors, each dedicated to a particular segment.
• Each segment is executed in its dedicated segment processor, which operates concurrently with all other segments.
• Information flows through these multiple hardware segments.
9.2 Pipelining
• Instruction execution is divided into k segments or stages
• An instruction exits pipe stage k-1 and proceeds into pipe stage k
• All pipe stages take the same amount of time, called one processor cycle
• The length of the processor cycle is determined by the slowest pipe stage
[Figure: a pipeline of k segments]
9.2 Pipelining
• Suppose we want to perform the combined multiply and add operations with a stream of numbers:
• Ai * Bi + Ci for i = 1, 2, 3, …, 7
9.2 Pipelining
• The suboperations performed in each segment of the pipeline are as follows:
• Segment 1: R1 ← Ai, R2 ← Bi
• Segment 2: R3 ← R1 * R2, R4 ← Ci
• Segment 3: R5 ← R3 + R4
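
The register transfers can be traced with a small simulation. This is a sketch under stated assumptions: the register names R1 to R5 follow the slide, while the function name, the use of `None` for an empty register, and the sample inputs are illustrative. All registers are loaded together on each clock pulse, so each new value is computed from the old register contents.

```python
def multiply_add_pipeline(a, b, c):
    """Simulate the three-segment pipeline computing a[i] * b[i] + c[i]."""
    n = len(a)
    r1 = r2 = r3 = r4 = r5 = None   # None marks a register with no valid data
    results = []
    for clock in range(n + 2):      # n inputs plus 2 pulses to drain the pipe
        # Next values are computed from the current (old) register contents.
        new_r5 = r3 + r4 if r3 is not None else None        # segment 3
        new_r3 = r1 * r2 if r1 is not None else None        # segment 2
        new_r4 = c[clock - 1] if 1 <= clock <= n else None  # segment 2
        new_r1 = a[clock] if clock < n else None            # segment 1
        new_r2 = b[clock] if clock < n else None            # segment 1
        r1, r2, r3, r4, r5 = new_r1, new_r2, new_r3, new_r4, new_r5
        if r5 is not None:
            results.append(r5)
    return results


print(multiply_add_pipeline([1, 2, 3], [4, 5, 6], [7, 8, 9]))  # [11, 18, 27]
```

The first result appears after 3 clock pulses, and one more result follows on every pulse after that, so 7 input pairs need 7 + 2 = 9 pulses.
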
Pipeline Performance
• n: number of instructions (equivalent to the number of loads in the laundry example)
• k: number of stages in the pipeline (in the laundry example: washing, drying, and folding)
• τ: clock cycle (the slowest task time)
• Tk: total time

\[ T_k = (k + (n - 1))\,\tau \]

\[ \text{Speedup} = \frac{T_1}{T_k} = \frac{nk}{k + (n - 1)} \]
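
The two formulas translate directly into code. A minimal sketch follows; the function names are hypothetical, and the sample values reuse the laundry example (n = 4 loads, k = 3 stages, τ = 40 minutes).

```python
def pipeline_time(n, k, tau):
    """Total time T_k = (k + (n - 1)) * tau for n tasks on a k-stage pipeline."""
    return (k + (n - 1)) * tau


def speedup(n, k):
    """Speedup T_1 / T_k = n*k / (k + (n - 1)); the clock cycle tau cancels."""
    return (n * k) / (k + (n - 1))


print(pipeline_time(4, 3, 40))  # 240
print(speedup(4, 3))            # 2.0
```

This clocked model charges every stage the full 40-minute cycle, so it gives 240 minutes for the laundry example rather than the 210 minutes of the timeline, where the washer and folding steps take less than a full cycle.
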
SPEEDUP
• Consider a k-segment pipeline operating on n data sets. (In the above example, k = 3 and n = 4.)
• It takes k clock cycles to fill the pipeline and get the first result from the output of the pipeline.
• After that, the remaining (n - 1) results come out one per clock cycle.
• It therefore takes (k + n - 1) clock cycles to complete the task.
SPEEDUP
• If we execute the same task sequentially in a single processing unit, it takes (k * n) clock cycles.
• The speedup gained by using the pipeline is:
• S = k * n / (k + n - 1)
SPEEDUP
• S = k * n / (k + n - 1)
• For n >> k (such as 1 million data sets on a 3-stage pipeline), S ≈ k
• So for a large number of data sets, the speedup approaches the number of functional units. This is because the multiple functional units can work in parallel except during the filling and draining cycles.
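
The limit can be made explicit by dividing the numerator and denominator of S by n. The numeric line uses the slide's own example of 1 million data sets on a 3-stage pipeline.

```latex
\[
S = \frac{k\,n}{k + n - 1} = \frac{k}{1 + \dfrac{k - 1}{n}}
\;\longrightarrow\; k \quad \text{as } n \to \infty
\]
\[
k = 3,\; n = 10^{6}: \qquad
S = \frac{3 \times 10^{6}}{10^{6} + 2} \approx 2.999994
\]
```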
Example: 6 tasks, divided into 4 segments
[Figure: space-time diagram. Clock cycles 1 to 9 run along the horizontal axis; tasks T1 to T6 pass through the 4 segments.]
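
A space-time diagram for 6 tasks and 4 segments can be generated with a few lines of Python. This is a sketch, assuming one clock cycle per segment and no stalls; the function name and the `--` marker for an idle segment are illustrative.

```python
def space_time_diagram(n_tasks, k_segments):
    """Return one row per segment, listing the task that occupies the
    segment in each clock cycle ("--" when the segment is idle)."""
    total_cycles = k_segments + n_tasks - 1
    rows = []
    for seg in range(k_segments):
        row = []
        for cycle in range(total_cycles):
            task = cycle - seg  # index of the task reaching `seg` in this cycle
            row.append(f"T{task + 1}" if 0 <= task < n_tasks else "--")
        rows.append(row)
    return rows


diagram = space_time_diagram(6, 4)
for seg, row in enumerate(diagram, start=1):
    print(f"Segment {seg}: " + " ".join(row))
```

Each row is the previous row shifted right by one cycle, and the diagram spans k + n - 1 = 4 + 6 - 1 = 9 clock cycles, which matches the clock-cycle axis of the figure.
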
