
Chapter 6

Pipelining and
Superscalar Techniques.
Dr. Manjunath Kotari
Professor & Head-CSE
Linear Pipeline

• Processing stages are linearly connected
• Perform a fixed function
• Synchronous pipeline
  • Clocked latches between stage i and stage i+1
  • Equal delays in all stages
• Asynchronous pipeline (handshaking)
Latches

[Diagram: S1 → L1 → S2 → L2 → S3; latches L1 and L2 separate adjacent stages]

• The slowest stage determines the pipeline delay.
• Equal delays in all stages → the common stage delay sets the clock period.


Linear Pipeline Processors

• A linear pipeline processor is a cascade of processing stages which are linearly connected to perform a fixed function over a stream of data flowing from one end to the other.
• Applied for
  • Instruction execution
  • Arithmetic computation
  • Memory-access operations
Asynchronous and Synchronous
Models
• Depending on the control of data along the pipeline,
we model linear pipelines in two categories
• Asynchronous
• Synchronous
Asynchronous Model
• Data flow between adjacent stages in an
asynchronous pipeline is controlled by a handshaking
protocol.
• When stage Si is ready to transmit, it sends a ready
signal to stage Si+1
• After stage Si+1 receives the incoming data, it returns
the acknowledge signal to Si
• Advantages
  • Useful in designing communication channels
  • Variable throughput rate (different stages may have different delays)
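
As a minimal sketch of the handshaking protocol just described (the flags and Python lists stand in for control wires and stage buffers; this is an illustration, not hardware detail):

```python
# One-item-at-a-time transfer from stage Si to stage Si+1 via ready/ack.
def transfer(si_output, si1_input):
    while si_output:
        data = si_output.pop(0)
        ready = True                  # Si asserts 'ready' along with the data
        ack = False
        if ready:                     # Si+1 sees 'ready', ...
            si1_input.append(data)    # ... latches the incoming data, ...
            ack = True                # ... and returns 'ack' to Si
        assert ack                    # Si sends the next item only after 'ack'

stage_i, stage_i1 = [10, 20, 30], []
transfer(stage_i, stage_i1)
print(stage_i1)    # [10, 20, 30], delivered at whatever rate Si+1 accepts
```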
Synchronous Model
• Clocked latches are used to interface between
stages.
• The latches are made with master-slave flip-flops
• Upon the arrival of a clock pulse, all latches
transfer data to the next stage simultaneously.
• The pipeline stages are combinational logic
• Equal delays in all stages.
• Specified by reservation table
Reservation Table

One task flowing through a 4-stage linear pipeline:

Time →    1   2   3   4
S1        X
S2            X
S3                X
S4                    X

5 tasks on 4 stages (one new task initiated per cycle):

Time →    1   2   3   4   5   6   7   8
S1        X   X   X   X   X
S2            X   X   X   X   X
S3                X   X   X   X   X
S4                    X   X   X   X   X
Clocking and Timing Control
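
The timing relations behind this slide can be stated as follows (standard definitions; τi is the delay of stage i and d is the latch delay):

```latex
\tau \;=\; \max_{1 \le i \le k}\{\tau_i\} + d,
\qquad
f \;=\; \frac{1}{\tau} \quad\text{(pipeline clock frequency)}
```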
Speedup, Efficiency & Throughput
Efficiency(Ek) & Throughput(Hk)
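
The quantities named in these two slide titles have the following standard forms for a k-stage linear pipeline executing n tasks with clock period τ (the usual textbook formulas, supplied here because the slide showed them only graphically):

```latex
T_k = [\,k + (n-1)\,]\,\tau \quad\text{(total evaluation time)} \\[4pt]
S_k = \frac{n k}{k + (n-1)} \quad\text{(speedup over the non-pipelined time } nk\tau\text{)} \\[4pt]
E_k = \frac{S_k}{k} = \frac{n}{k + (n-1)} \quad\text{(efficiency)} \\[4pt]
H_k = \frac{n}{[\,k + (n-1)\,]\,\tau} = \frac{E_k}{\tau} \quad\text{(throughput)}
```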
Non Linear Pipelines

• Variable functions
• Feed-Forward
• Feedback
3 stages & 2 functions

[Diagram: pipeline stages S1 → S2 → S3, with outputs X and Y tapped from the stages and feed-forward/feedback connections]

• It is a multifunction pipeline.
• Three types of connections:
  • Streamline connections: S1→S2 and S2→S3
  • Feed-forward connection: S1→S3
  • Feedback connections: S3→S2 and S3→S1
Reservation Tables for X & Y

Function X (evaluation time: 8 cycles):

Time →    1   2   3   4   5   6   7   8
S1        X                   X       X
S2            X       X
S3                X       X       X

Function Y (evaluation time: 6 cycles):

Time →    1   2   3   4   5   6
S1        Y               Y
S2                Y
S3            Y       Y       Y
Reservation Tables

• For a static linear pipeline, the reservation table is trivial in the sense that dataflow follows a linear streamline.
• For a dynamic pipeline, a nonlinear pattern is used.
• A static pipeline is specified by a single reservation table.
• A dynamic pipeline may be specified by more than one reservation table.
Reservation Tables

• Each reservation table displays the time-space flow


of data through the pipeline for one function
evaluation.
• Different functions may follow different paths on the
reservation table.
• The number of columns in a table is called the evaluation time of the function.
• The check marks in each row of the table
correspond to the time instants that the particular
stage will be used.
• Multiple check marks in a row mean repeated usage of the same stage in different cycles.
Latency Analysis

• Latency
  • The number of time units between two initiations of a pipeline is called the latency.
  • Latencies must be non-negative integers.
  • A latency of k means two initiations are separated by k clock cycles.
• Collision
• Any attempt by two or more initiations to use the
same pipeline stage at the same time.
• Collision implies resource conflicts between
two initiations in the pipeline.
• Therefore all collisions must be avoided in
scheduling a sequence of pipeline initiations.
• Some latencies will cause collisions, and some
will not.
• Latencies that cause collisions are called
forbidden latencies.
• Latency sequence
  • A sequence of permissible (non-forbidden) latencies between successive task initiations.
• Latency cycle
  • A latency sequence which repeats the same subsequence indefinitely.
• Average latency
  • Obtained by dividing the sum of all latencies by the number of latencies along the cycle.
• Constant cycle
  • A latency cycle which contains only one latency value.
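
A short worked example of these definitions, using cycles that appear later in this chapter:

```latex
\text{Cycle }(1,8):\ \text{average latency} = \tfrac{1+8}{2} = 4.5;
\qquad
\text{Cycle }(3):\ \text{a constant cycle with average latency } 3.
```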
Collision-free scheduling

• Collision vectors
  • The combined set of permissible and forbidden latencies can easily be displayed by a collision vector
  • C = (Cm Cm-1 … C2 C1)
  • Ci = 1 if latency i causes a collision
  • Ci = 0 if latency i is permissible
State diagrams
• Simple cycle
• It is a latency cycle in which each state appears
only once.
• Ex: (3),(6),(1,8),(3,8) and (6,8)
• Greedy cycles
• Greedy cycle is one whose edges are all made
with minimum latencies from their respective
starting states.
• Ex: (1,8),(3)
• MAL (minimal average latency): the minimum average latency obtainable over all latency cycles (here, 3, from the greedy cycle (3))
Nonlinear Pipeline Design

• Latency
The number of clock cycles between two initiations of a
pipeline
• Collision
Resource Conflict
• Forbidden Latencies
Latencies that cause collisions
Nonlinear Pipeline Design cont
• Latency Sequence
A sequence of permissible latencies between successive
task initiations
• Latency Cycle
A sequence that repeats the same subsequence
• Collision vector
C = (Cm, Cm-1, …, C2, C1), m ≤ n - 1
  n = number of columns in the reservation table
  Ci = 1 if latency i causes a collision, 0 otherwise
Collision Vector for Multiply
after Multiply
Forbidden latencies: 1, 2

Collision vector:
(0 0 0 0 1 1) → 11

Maximum forbidden latency = 2 → m = 2

Example
[Diagram: the three-stage pipeline with outputs X and Y (S1 → S2 → S3), as above]
Reservation Tables for X & Y (repeated for the example)

Function X (8 cycles):

Time →    1   2   3   4   5   6   7   8
S1        X                   X       X
S2            X       X
S3                X       X       X

Function Y (6 cycles):

Time →    1   2   3   4   5   6
S1        Y               Y
S2                Y
S3            Y       Y       Y
Collision Vector (X after X)

• Forbidden latencies: 2, 4, 5, 7
• Collision vector = 1011010
Y after Y

Overlapping a second Y initiation at latency 2 (collisions, marked YY, occur in S3 at times 4 and 6):

Time →    1   2   3   4   5   6   7   8
S1        Y       Y       Y       Y
S2                Y       Y
S3            Y       YY      YY      Y

Overlapping a second Y initiation at latency 4 (collisions occur in S1 at time 5 and in S3 at time 6):

Time →    1   2   3   4   5   6   7   8   9   10
S1        Y               YY              Y
S2                Y               Y
S3            Y       Y       YY      Y       Y

Hence latencies 2 and 4 cause collisions for Y after Y.
Collision Vector (Y after Y)

• Forbidden latencies: 2, 4
• Collision vector = 1010
Exercise – Find the collision
vector
[Reservation table over 7 clock cycles, stages A–D: A is used in three cycles, B in two, C in two, and D in one; mark positions as in the original slide]
State Diagram for X

[State diagram for the initial collision vector 1011010; 8+ denotes any latency of 8 or more]

• From the initial state 1011010: latency 1 leads to 1111111; latencies 3 and 6 lead to 1011011; latencies 8+ return to 1011010.
• From 1011011: latencies 3 and 6 loop back to 1011011; latencies 8+ return to 1011010.
• From 1111111: only latencies 8+ are permitted, returning to 1011010.
Cycles

• Simple cycles (each state appears only once):
  (3), (6), (8), (1, 8), (3, 8), and (6, 8)
• Greedy cycles (simple cycles whose edges are all made with minimum latencies from their respective starting states):
  (1, 8) and (3); one of these yields the MAL (here, cycle (3) gives MAL = 3)
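
The collision-vector machinery of the preceding slides can be reproduced in a few lines of Python. This is a sketch using the X reservation table reconstructed above; the function names are ours, and the transition rule is the standard shift-right-and-OR rule:

```python
# Collision-vector analysis for the 3-stage pipeline evaluating function X.
# A reservation table maps each stage to the time steps (1-based) it uses.
X_TABLE = {
    "S1": {1, 6, 8},
    "S2": {2, 4},
    "S3": {3, 5, 7},
}

def forbidden_latencies(table):
    """A latency is forbidden if two initiations would collide in some stage."""
    forbidden = set()
    for times in table.values():
        forbidden |= {b - a for a in times for b in times if b > a}
    return forbidden

def collision_vector(table):
    """Bit Ci (right to left) is 1 iff latency i causes a collision."""
    forb = forbidden_latencies(table)
    m = max(forb)
    return "".join("1" if i in forb else "0" for i in range(m, 0, -1))

def next_state(state, latency, initial):
    """Shift the state right by 'latency' bits, then OR in the initial vector."""
    shifted = ("0" * latency + state)[:len(initial)]
    return "".join("1" if "1" in pair else "0" for pair in zip(shifted, initial))

cv = collision_vector(X_TABLE)
print(sorted(forbidden_latencies(X_TABLE)))   # [2, 4, 5, 7]
print(cv)                                     # 1011010
print(next_state(cv, 3, cv))                  # 1011011 (state reached by latency 3)
```

A greedy scheduler takes, from each state, the smallest latency whose bit is 0; starting from 1011010 that produces the cycle (3), whose average latency 3 equals the MAL here.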
Delay Insertion

• The purpose of delay insertion is to modify the reservation table, yielding a new collision vector.
• This leads to a new, modified state diagram,
• which may produce greedy cycles meeting the lower bound on the MAL.
Instruction Pipeline Design
• Pipelined Instruction Processing
• The fetch stage (F) fetches instructions from a cache
memory, presumably one per cycle.
• The decode stage (D) reveals the instruction function to be performed and identifies the resources needed.
• The issue stage (I) reserves resources and operands
are also read from registers during the issue stage.
• The instructions are executed in one or several
execute stages (E)
• Issue of instructions follows the original program order.
• The shaded boxes in the timing diagram correspond to idle cycles, when instruction issue is blocked due to
  • resource latencies,
  • resource conflicts, or
  • data dependencies.
• The first two load instructions issue on consecutive cycles.
• The add is dependent on both loads, so it waits.
• Timing improves after the instruction issuing order is changed to eliminate unnecessary delays (see the sketch below):
  • The four load operations move to the beginning.
  • The add and multiply instructions are blocked for fewer cycles due to this data prefetching.
  • The reordering should not change the end results.
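
The effect of such reordering can be shown with a toy in-order issue model; the three-field instruction tuples and the latency numbers are assumptions for illustration, not the book's exact example:

```python
# Toy in-order issue: an instruction stalls until its source operands are ready.
LATENCY = {"LOAD": 2, "ADD": 1, "MUL": 3}

def schedule(program):
    """program: list of (op, dest, sources); returns each instruction's issue cycle."""
    ready = {}                    # register -> first cycle its value is usable
    t, issue = 1, []
    for op, dest, srcs in program:
        t = max([t] + [ready.get(r, 1) for r in srcs])   # stall on operands
        issue.append(t)
        ready[dest] = t + LATENCY[op]    # result usable after the op's latency
        t += 1                           # at most one issue per cycle
    return issue

naive   = [("LOAD","R1",[]), ("LOAD","R2",[]), ("ADD","R3",["R1","R2"]),
           ("LOAD","R4",[]), ("LOAD","R5",[]), ("MUL","R6",["R4","R5"])]
hoisted = [("LOAD","R1",[]), ("LOAD","R2",[]), ("LOAD","R4",[]), ("LOAD","R5",[]),
           ("ADD","R3",["R1","R2"]), ("MUL","R6",["R4","R5"])]
print(schedule(naive)[-1], schedule(hoisted)[-1])   # 8 vs 6: hoisting the loads wins
```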
• The MIPS R4000 is a superpipelined 64-bit processor with separate instruction and data caches.
• Instruction execution consists of 8 major steps.
• The single-cycle ALU stage takes slightly more time than each of the cache-access stages.
• Successive instructions execute in an overlapped fashion; the pipeline stages operate simultaneously on a noninterfering basis.
• The internal pipeline clock rate is 100 MHz.
• Load and branch instructions introduce extra delays.
Mechanisms for Instruction Pipelining

• Prefetch Buffers
• Multiple Functional Units
• Internal Data Forwarding
• Hazard Avoidance
Prefetch Buffers

• Three type of buffers can be used to match the


instruction fetch rate to the pipeline consumption
rate.
• Sequential buffer
• Target buffer
• Loop buffer
• In one memory-access time, a block of consecutive instructions is fetched into a prefetch buffer.
1. Sequential instructions are loaded into a pair of sequential buffers.
2. Instructions from a branch target are loaded into a pair of target buffers.
3. Both pairs operate in FIFO fashion.
4. After the branch condition is checked, appropriate instructions are taken from one of the two buffers, and the instructions in the other buffer are discarded (see the sketch below).
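
A sketch of the two buffer paths around a conditional branch (collapsed to one FIFO per path; 'memory' is a hypothetical list of decoded instructions, and the buffer depth is arbitrary):

```python
from collections import deque

def prefetch(memory, pc, target, depth=4):
    """While the branch is unresolved, fill both FIFO paths."""
    sequential = deque(memory[pc:pc + depth])            # fall-through path
    taken_path = deque(memory[target:target + depth])    # branch-target path
    return sequential, taken_path

def resolve(sequential, taken_path, branch_taken):
    """Keep the buffer on the chosen path and discard the other."""
    keep, discard = (taken_path, sequential) if branch_taken else (sequential, taken_path)
    discard.clear()                                      # discarded instructions
    return keep                                          # pipeline consumes from here
```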
• Loop buffer
• Loop buffer operates in two steps.
• It contains instructions sequentially ahead of the current
instruction.
• It recognizes when the target of a branch falls within the
loop boundary.
Multiple functional units
Internal Data Forwarding

• The throughput of a pipelined processor can be


further improved with internal data forwarding
among multiple functional units.
• Store-load forwarding
• Load-load forwarding
• Store-store forwarding
Store-load forwarding
• Given a store (ST M,R1) immediately followed by a load from the same location M, the load operation (LD R2,M) from memory to register R2 can be replaced by the move operation MOVE R2,R1.
Load-load forwarding
• The second of two loads from the same location M is eliminated: the load operation (LD R2,M) is replaced with the move operation (MOVE R2,R1).
Store-store forwarding
• The two stores are executed immediately one after the other.
• The second store overwrites the first
• The first store becomes redundant and thus can be
eliminated without affecting the outcome.
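
All three cases can be expressed as peephole rewrites over adjacent instructions; the (op, operand, operand) tuple encoding below is an assumption for illustration:

```python
# Peephole forwarding: ST stores a register to a memory cell, LD loads one.
def forward(seq):
    out = []
    for ins in seq:
        if out:
            prev = out[-1]
            # store-load:  ST M,R1 ; LD R2,M  ->  ST M,R1 ; MOVE R2,R1
            if prev[0] == "ST" and ins[0] == "LD" and ins[2] == prev[1]:
                ins = ("MOVE", ins[1], prev[2])
            # load-load:   LD R1,M ; LD R2,M  ->  LD R1,M ; MOVE R2,R1
            elif prev[0] == "LD" and ins[0] == "LD" and ins[2] == prev[2]:
                ins = ("MOVE", ins[1], prev[1])
            # store-store: ST M,R1 ; ST M,R2  ->  ST M,R2 (first store is dead)
            elif prev[0] == "ST" and ins[0] == "ST" and ins[1] == prev[1]:
                out.pop()
        out.append(ins)
    return out

print(forward([("ST", "M", "R1"), ("LD", "R2", "M")]))
# -> [('ST', 'M', 'R1'), ('MOVE', 'R2', 'R1')]
```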
Implementing the dot-product operation with internal data
forwarding
Hazard Avoidance

• The read and write of shared variables by different instructions in a pipeline may lead to different results if these instructions are executed out of order.
• Three types of hazards
• Read-after-Write hazard
• Write-after-Read hazard
• Write-after-Write hazard
• Consider two instructions I and J. Instruction J is assumed to logically follow instruction I according to program order.
• If the actual execution order of these two instructions violates the program order, incorrect results may be read or written, thereby producing hazards.
• We use the notation D(I) and R(I) for the domain and range of instruction I.
  • The domain contains the input set (operands) to be used by instruction I.
  • The range corresponds to the output set of instruction I.
Possible hazard conditions:
• R(I) ∩ D(J) ≠ ∅ for a RAW hazard
• R(I) ∩ R(J) ≠ ∅ for a WAW hazard
• D(I) ∩ R(J) ≠ ∅ for a WAR hazard
• The RAW hazard corresponds to the flow dependence
• WAR to the antidependence
• WAW to the output dependence
• Hazards can be detected by special hardware while instructions are being loaded into the buffer (a sketch follows below).
• A special tag bit can be used with each operand register to indicate that it is safe or hazard-prone.
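
The set conditions above translate directly into a hazard check; a sketch with Python sets, where each instruction is described by its (domain, range) pair as defined on the previous slides:

```python
# Detect RAW/WAW/WAR hazards between instructions I and J (I precedes J).
def hazards(I, J):
    D_I, R_I = I          # domain (inputs) and range (outputs) of I
    D_J, R_J = J
    return {
        "RAW": bool(R_I & D_J),   # flow dependence: J reads what I writes
        "WAW": bool(R_I & R_J),   # output dependence: both write the same location
        "WAR": bool(D_I & R_J),   # antidependence: J writes what I reads
    }

# I: R3 <- R1 + R2 ;  J: R4 <- R3 * R5   (RAW hazard on R3)
print(hazards(({"R1", "R2"}, {"R3"}), ({"R3", "R5"}, {"R4"})))
# {'RAW': True, 'WAW': False, 'WAR': False}
```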
Dynamic Instruction Scheduling

• Static scheduling
  • Data dependencies in a sequence of instructions create interlocked relationships among them.
  • These interlocks can be resolved by a compiler that reorders instructions at compile time; dynamic scheduling resolves them in hardware at run time.
Branch handling techniques
• Three basic terms for the analysis of branching effects:
• Branch taken
  • The action of fetching a nonsequential or remote instruction after a branch instruction.
• Branch target
  • The instruction to be executed after a branch is taken.
• Delay slot
  • The number of pipeline cycles wasted between a branch taken and the fetching of its branch target.
  • Denoted by d, with 0 ≤ d ≤ k - 1, where k is the number of pipeline stages (a throughput estimate follows below).
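
A standard way to quantify the branching cost (a textbook-style estimate, not copied from the slides): let p be the probability that an instruction is a branch, q the probability that such a branch is taken, and b the branch penalty in cycles. Each instruction then incurs an average penalty of p·q·b cycles, so for a pipeline with clock rate f the effective throughput approaches

```latex
H_{\mathrm{eff}} \;=\; \frac{f}{1 + p\,q\,b} \qquad (n \to \infty).
```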
Branch Prediction
• Branch can be predicted either based on branch code
types statically or based on branch history during
program execution.
• The static prediction direction is usually wired into the
processor.
• According to past experience, the best performance is
given by predicting taken.
• A dynamic branch strategy uses recent branch history to
predict whether or not the branch will be taken next time
when it occurs.
• To be accurate, one may need to use the entire history of the branch to predict the future choice.
Classification of dynamic branch
strategies
• One class predicts the branch direction based
upon information found at the decode stage.
• The second class uses a cache to store target
addresses at the stage the effective address of
the branch target is computed.
• The third scheme uses a cache to store target
instructions at the fetch stage.
BTB (branch target buffer)

• Used to implement dynamic branch prediction.
• The BTB holds recent branch information, including the address of the branch target used.
• The address of the branch instruction locates its entry in the BTB (a sketch follows below).
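
A minimal sketch of a BTB combined with 2-bit saturating prediction counters (the table layout, counter policy, and method names are common textbook choices assumed here, not taken from the slides):

```python
# Branch-target buffer: maps a branch instruction's address to a 2-bit
# prediction counter and the most recently used target address.
class BTB:
    def __init__(self):
        self.table = {}                        # branch PC -> [counter, target]

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry and entry[0] >= 2:            # counter 2 or 3: predict taken
            return entry[1]                    # next fetch comes from the target
        return None                            # predict not taken (fall through)

    def update(self, pc, taken, target):
        counter, _ = self.table.get(pc, [1, target])
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.table[pc] = [counter, target]     # record recent branch history
```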
Delayed Branches

• The branch penalty is reduced significantly if the delay slot can be shortened or minimized to a zero penalty.
• A delayed branch of d cycles allows at most d - 1 useful instructions to be executed following the branch taken.
Arithmetic Pipeline Design
• Fixed point operations
• Fixed point numbers are represented internally in machines in
• Sign-magnitude
• One’s complement
• Two’s complement notation
• Add, subtract, multiply, and divide are the four primitive arithmetic operations.
• The add or subtract of two n-bit integers produces an n-bit
result
• The multiplication of two n-bit numbers produces a 2n-bit
result
• The division of an n-bit number by another may create an
arbitrarily long quotient and a remainder.
Floating point numbers
• The IEEE 754 floating-point standard
• A floating-point number X is represented by a pair (m, e), where m is the mantissa and e is the exponent.
• The algebraic value is X = m × r^e, where r is the radix (base).
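
For the IEEE 754 single-precision format in particular, the pair (m, e) is packed as a sign bit s, an 8-bit biased exponent e, and a 23-bit fraction f; a normalized value (0 < e < 255) decodes as:

```latex
X \;=\; (-1)^{s} \times (1.f)_2 \times 2^{\,e - 127}
```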
Arithmetic Pipeline Stages

• Arithmetic or logical shifts can be easily implemented


with shift registers.
• High-speed addition requires either the use of a
  • CPA (carry-propagate adder), which adds two numbers and produces an arithmetic sum, or a
  • CSA (carry-save adder), which adds three input numbers and produces one sum output and one carry output.
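
A one-step carry-save addition can be sketched bitwise in Python; the integers stand in for hardware bit-vectors, and a CPA would perform the final two-operand addition:

```python
# Carry-save addition: reduce three operands to a sum word and a carry word
# with no carry propagation across bit positions.
def csa(a: int, b: int, c: int):
    sum_bits = a ^ b ^ c                             # per-bit sum, carries ignored
    carry_bits = ((a & b) | (b & c) | (a & c)) << 1  # per-bit carries, shifted left
    return sum_bits, carry_bits

s, c = csa(5, 6, 7)
assert s + c == 5 + 6 + 7    # the final s + c is the CPA's job
```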
Multiply Pipeline Design

• Consider the multiplication of two 8-bit integers A × B = P, where P is the 16-bit product in double precision.
• This fixed-point multiplication can be written as the summation of eight partial products:
  P = A × B = P0 + P1 + P2 + … + P7 (checked in the sketch below)
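
The eight-term summation can be checked directly in Python; in the pipelined hardware the partial products would be reduced through levels of CSAs ending in a CPA, but the arithmetic identity is simply:

```python
# P = A x B as the sum of eight partial products Pi = (A << i) if bit i of B is set.
def partial_products(a: int, b: int):
    return [(a << i) if (b >> i) & 1 else 0 for i in range(8)]

a, b = 0xB7, 0x5C                             # arbitrary 8-bit operands
assert sum(partial_products(a, b)) == a * b   # 16-bit double-precision product
```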
Pipeline unit for fixed-point multiplication of 8-bit integers

• The arithmetic pipeline has 3 stages.
• The mantissa section can perform floating-point add or multiply operations.
• Stage 1 receives the input operands and returns the computed results.
• Stage 2 contains an array multiplier used to carry out long multiplications.
• Stage 3 contains registers for holding the results.
Convergence Division
• Division can be carried out by repeated multiplications.
• Mantissa division is carried out by a convergence
method.
• This convergence division obtains the quotient Q = M/D of two normalized fractions, 0.5 ≤ M < D < 1, as sketched below.
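
A minimal numeric sketch of the method (a Goldschmidt-style iteration; the choice R = 2 - D and the step count are illustrative):

```python
# Convergence division: multiply numerator and denominator by the same
# factors R so the denominator converges to 1; the numerator tends to Q = M/D.
def convergence_divide(M: float, D: float, steps: int = 5) -> float:
    assert 0.5 <= M < D < 1.0            # normalized fractions, as above
    for _ in range(steps):
        R = 2.0 - D                      # D*R approaches 1 quadratically
        M, D = M * R, D * R
    return M                             # approximately M/D, since D ~ 1

print(convergence_divide(0.6, 0.8))      # ~0.75
```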
