Pipelining - Modified1

Uploaded by

Sasuke Uchiha

Pipelining: Datapath and Hazards

Chapter 6, Computer Organization and Design, David A. Patterson and John L. Hennessy
Introduction
• The performance of a single CPU can be increased by:
– Improving the hardware by introducing faster circuits.
– Arranging the hardware so that more than one operation can be performed at the
same time.
• Because circuit speed is limited and faster circuits are expensive, the second option
is usually the practical one. This second approach is instruction-level parallelism (ILP).
• Pipelining: a technique for implementing instruction-level parallelism within
a single processor.
• The hardware elements of the CPU are arranged so that more than one instruction is
in execution at the same time, which increases overall performance.
• Pipelining attempts to keep every part of the processor busy by dividing each incoming
instruction into a series of sequential steps, each performed by a different processor
unit, so that different steps of different instructions are processed in parallel.
Pipelining: Laundry Example

• A small laundry has one washer, one dryer, and one operator. It takes 90
minutes to finish one load (loads A, B, C, and D are waiting):
– Washer takes 30 minutes
– Dryer takes 40 minutes
– "Operator folding" takes 20 minutes
Sequential Laundry
[Figure: sequential laundry timeline, 6 PM to midnight. Loads A to D in task order; each load takes 30 + 40 + 20 = 90 min and starts only after the previous load finishes.]
• This operator schedules loads to be delivered to the laundry every 90 minutes,
which is the time required to finish one load. In other words, he does not start a
new task until he has finished the previous one.
• The process is sequential. Sequential laundry takes 6 hours for 4 loads.
Efficiently scheduled laundry: Pipelined Laundry

[Figure: pipelined laundry timeline, 6 PM to midnight. Loads A to D overlap in task order; the stage times along the time axis are 30, 40, 40, 40, 40, 20 min.]
• Another operator asks for loads to be delivered to the laundry every 40 minutes, the time of the slowest stage (the dryer).
• Pipelined laundry takes 3.5 hours for 4 loads (30 + 4 × 40 + 20 = 210 minutes).
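The two totals can be checked with a small simulation. This is only a sketch: the function names are made up for illustration, and it assumes that all loads are available at the start and that a finished load may wait between stages until the next machine is free.

```python
def sequential_time(stage_minutes, loads):
    """Each load passes through every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_time(stage_minutes, loads):
    """A load enters a stage as soon as both the load and the stage are free."""
    stage_free_at = [0] * len(stage_minutes)  # when each stage is next available
    finish = 0
    for _ in range(loads):
        ready = 0  # time at which this load is ready for its next stage
        for i, minutes in enumerate(stage_minutes):
            start = max(ready, stage_free_at[i])
            ready = start + minutes
            stage_free_at[i] = ready
        finish = ready
    return finish


STAGES = [30, 40, 20]  # washer, dryer, folding (minutes)
print(sequential_time(STAGES, 4))  # 360 minutes = 6 hours
print(pipelined_time(STAGES, 4))   # 210 minutes = 3.5 hours
```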
Pipelining Facts
• Multiple tasks operate simultaneously
• Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload
• Pipeline rate is limited by the slowest pipeline stage
• Potential speedup = number of pipe stages
• Unbalanced lengths of pipe stages reduce speedup
• Time to "fill" the pipeline and time to "drain" it reduce speedup
[Figure: pipelined laundry timeline from 6 PM (stage times 30, 40, 40, 40, 40, 20 min), annotated "The washer waits for the dryer for 10 minutes".]
Instruction pipeline versus sequential processing

[Figure: two timing diagrams, one for sequential processing and one for the instruction pipeline.]
Instruction pipeline (Contd.)

• Sequential processing is faster for few instructions
Performance of Pipelining system

• Throughput of the instruction pipeline is determined by how often
an instruction exits the pipeline. Pipelining does not decrease the time
for individual instruction execution. Instead, it increases instruction
throughput.
• Machine cycle: the time required to move an instruction one step
further in the pipeline. The length of the machine cycle is
determined by the time required for the slowest pipe stage.
Performance Measurement

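• n: number of instructions (n is equivalent to the number of loads in the laundry example)
• k: number of stages in the pipeline (k corresponds to the stages washing, drying, and folding)
• τ: clock cycle (the clock cycle is the time of the slowest task)
• T_k: total time with a k-stage pipeline; T_1 = nkτ is the total time without pipelining

$$T_k = (k + (n - 1))\,\tau$$

$$\text{Speedup} = \frac{T_1}{T_k} = \frac{n k \tau}{(k + (n - 1))\,\tau} = \frac{n k}{k + (n - 1)}$$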
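As a quick numerical check, the two formulas can be evaluated directly. The function names below are illustrative only. Note that the formula charges every stage a full clock cycle τ, so for the laundry example (k = 3, n = 4, τ = 40 min) it gives 240 minutes rather than the 210 minutes of the timeline, where washing and folding are shorter than τ.

```python
def pipeline_time(n, k, tau):
    """Total time T_k for n instructions on a k-stage pipeline with cycle tau."""
    return (k + (n - 1)) * tau


def speedup(n, k):
    """Speedup T_1 / T_k = n*k / (k + (n - 1)); tau cancels out."""
    return (n * k) / (k + (n - 1))


print(pipeline_time(4, 3, 40))  # 240
print(speedup(4, 3))            # 2.0
# For large n the speedup approaches k, the number of stages.
print(speedup(1_000_000, 5))
```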
Pipeline Datapath for MIPS Instruction
Time Taken by each MIPS Instruction: Sequential Vs Pipeline Execution
Graphical Representation of ILP
Single Cycle Non Pipeline Datapath
Instruction Execution in Single Cycle Datapath Assuming Pipelining
Pipeline Version of Single Cycle Datapath for MIPS
Pipeline Control Issues and Hardware
• Here, the pipeline registers carry control as follows:
• IF/ID: Initializes control by passing the rs, rd, and rt fields of the
instruction, together with the opcode and funct fields, to the control
circuitry.
• ID/EX: Buffers control for the EX, MEM, and WB stages, while executing
control for the EX stage. Control decides what operands will be input to
the ALU, what ALU operation will be performed, and whether or not a
branch is to be taken based on the ALU Zero output.
• EX/MEM: Buffers control for the MEM and WB stages, while executing
control for the MEM stage. The control lines are set for memory read or
write, as well as for data selection for memory write. This stage of
control also contains the branch control logic.
• MEM/WB: Buffers and executes control for the WB stage, and selects
the value to be written into the register file.
Control lines for the final three stages: four of the nine control lines are used in the EX stage. The remaining
five are passed on to the EX/MEM pipeline register, which is extended to hold them; three of these
are used during the MEM stage, and the last two are passed to MEM/WB for use in the WB stage.
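The 4 / 3 / 2 split of the nine control lines can be sketched as follows. The signal names (RegDst, ALUOp1, ALUOp0, ALUSrc, Branch, MemRead, MemWrite, RegWrite, MemtoReg) are an assumption taken from the usual textbook presentation of the MIPS pipeline; the slides themselves only give the counts. The values shown are the usual settings for a lw instruction.

```python
# Assumed grouping of the nine control lines by the stage that consumes them.
EX_LINES = ("RegDst", "ALUOp1", "ALUOp0", "ALUSrc")
MEM_LINES = ("Branch", "MemRead", "MemWrite")
WB_LINES = ("RegWrite", "MemtoReg")


def split_control(control):
    """Return the control lines held by ID/EX, EX/MEM and MEM/WB.

    Each pipeline register drops the lines consumed by the stage before it.
    """
    id_ex = dict(control)
    ex_mem = {name: v for name, v in id_ex.items() if name not in EX_LINES}
    mem_wb = {name: v for name, v in ex_mem.items() if name not in MEM_LINES}
    return id_ex, ex_mem, mem_wb


# Control word for lw, as generated in the ID stage.
lw_control = {
    "RegDst": 0, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1,
    "Branch": 0, "MemRead": 1, "MemWrite": 0,
    "RegWrite": 1, "MemtoReg": 1,
}
id_ex, ex_mem, mem_wb = split_control(lw_control)
print(len(id_ex), len(ex_mem), len(mem_wb))  # 9 5 2
```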
Hazards in Pipelining System: Data & Branch
Overview of Hazards
• Pipelined processors have several problems associated with
controlling the smooth, efficient execution of instructions in the
pipeline. These problems are generally called hazards, and
include the following three types: structural, data, and control hazards.
• Structural Hazards occur when different instructions collide
while trying to access the same piece of hardware in the same
segment of a pipeline. This type of hazard can be alleviated by
having redundant hardware for the segments wherein the
collision occurs. Occasionally, it is possible to insert stalls or
reorder instructions to avoid this type of hazard.
Time Taken by each MIPS Instruction: Sequential Vs Pipeline Execution
Structural Hazard #1: in case of Single Memory
[Figure: pipeline diagram with time (clock cycles) on the horizontal axis and instructions in program order on the vertical axis (Inst 1 to Inst 4 are labelled). Each instruction uses Mem, Reg, ALU, Mem, Reg. One instruction is reading data from memory while a later instruction is reading its instruction from memory.]
• Read same memory twice in same clock cycle
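The collision can be located with a short calculation. This is a sketch under stated assumptions: a classic five-stage pipeline (IF, ID, EX, MEM, WB), one instruction issued per cycle with no stalls, a single memory shared by fetches and data accesses, and only lw and sw touching data memory. The function name is made up for illustration.

```python
def memory_conflicts(program):
    """Find cycles in which a data access and an instruction fetch collide.

    program is a list of mnemonics in program order. Instruction j (0-based)
    is fetched in cycle j + 1 and reaches its MEM stage in cycle j + 4.
    Returns (cycle, index of memory instruction, index of fetched instruction).
    """
    conflicts = []
    for j, op in enumerate(program):
        if op in ("lw", "sw"):
            mem_cycle = j + 4
            fetched = mem_cycle - 1  # instruction whose IF falls in that cycle
            if fetched < len(program):
                conflicts.append((mem_cycle, j, fetched))
    return conflicts


# A load followed by four other instructions: the load's data access in
# cycle 4 collides with the fetch of the instruction three places later.
print(memory_conflicts(["lw", "add", "sub", "or", "and"]))  # [(4, 0, 3)]
```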

Structural Hazard #1: Fix with separate instruction and data memories (I$ and D$)

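[Figure: pipeline diagram with time (clock cycles) on the horizontal axis and instructions in program order (lw, Instr 1 to Instr 4) on the vertical axis. Each instruction uses I$, Reg, ALU, D$, Reg, so an instruction fetch and a data access in the same clock cycle go to different memories.]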
Structural Hazard #2: Registers (1/2)

[Figure: pipeline diagram with time (clock cycles) on the horizontal axis and instructions in program order (lw, Instr 1 to Instr 4) on the vertical axis. Each instruction uses I$, Reg, ALU, D$, Reg.]
• Can we read and write to registers simultaneously?
Structural Hazard #2: Registers (2/2)

• Two different solutions have been used:
(1) RegFile access is very fast: it takes less than half the time of the ALU stage
• Write to registers during the first half of each clock cycle
• Read from registers during the second half of each clock cycle
(2) Build the RegFile with independent read and write ports
• Result:
– Can perform a register read and a register write during the same clock cycle
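Solution (1) can be modelled in a few lines. This is a behavioural sketch only, with a hypothetical class name; it models the ordering within one clock cycle (write first, then read), not the actual timing of the hardware.

```python
class RegisterFile:
    """Register file that writes in the first half-cycle and reads in the second."""

    def __init__(self, size=32):
        self.regs = [0] * size

    def clock_cycle(self, write=None, reads=()):
        """Run one clock cycle.

        write is an optional (register, value) pair; reads is a sequence of
        register numbers. Because the write happens first, a read of the same
        register in the same cycle sees the new value.
        """
        if write is not None:
            reg, value = write
            if reg != 0:  # register 0 is hard-wired to zero in MIPS
                self.regs[reg] = value
        return [self.regs[r] for r in reads]


rf = RegisterFile()
# One instruction writes register 2 in WB while another reads it in ID.
print(rf.clock_cycle(write=(2, 7), reads=(2, 0)))  # [7, 0]
```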
Overview of Hazards
• Data Hazards occur when an instruction depends on the result of a
previous instruction that is still in the pipeline, so the result has not yet been
computed or written back. The simplest remedy inserts stalls in the execution sequence,
which reduces the pipeline's efficiency.
• There are two better remedies for data dependencies:
– First, one can forward a result (for example, the ALU output) directly to the later
instruction that needs it, instead of waiting for it to reach the register file.
– Second, in selected instances, it is possible to restructure the code to eliminate some
data dependencies.
• Control Hazards can result from branch instructions. Here, the branch
outcome and target address might not be ready in time to fetch the next instruction,
which results in stalls (dead segments) in the pipeline that have to be
inserted until the branch is resolved and fetching can resume at the correct
instruction. Control hazards can be mitigated through accurate
branch prediction (which is difficult), and by delayed branch strategies.
Data Hazard
• Definition. A data hazard occurs when the current instruction
requires the result of a preceding instruction, but the preceding
instruction has not advanced far enough through the pipeline to compute the
result and write it back to the register file in time for the current instruction to
read that result from the register file.
• We typically remedy this problem in one of three ways:
• Forwarding: In order to resolve a dependency, one adds special
circuitry to the pipeline, consisting of wires and switches, with
which one forwards or transmits the desired value to the pipeline
segment that needs that value for computation. Although this adds
hardware and control circuitry, the method works because it takes
far less time for the required value(s) to travel through a wire than it
does for a pipeline segment to compute its result.
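The selection logic of such forwarding circuitry can be sketched as below. This is an assumption-laden illustration, not the circuit from the slides: it follows the usual textbook conditions for a five-stage MIPS pipeline, and the field names (RegWrite, Rd) and the function name are chosen here for readability.

```python
def forward_select(src_reg, ex_mem, mem_wb):
    """Decide where an ALU input for the instruction in EX should come from.

    ex_mem and mem_wb describe the instructions one and two stages ahead,
    each as {"RegWrite": bool, "Rd": destination register number}.
    The most recent producer (EX/MEM) takes priority over MEM/WB.
    """
    if ex_mem["RegWrite"] and ex_mem["Rd"] != 0 and ex_mem["Rd"] == src_reg:
        return "EX/MEM"
    if mem_wb["RegWrite"] and mem_wb["Rd"] != 0 and mem_wb["Rd"] == src_reg:
        return "MEM/WB"
    return "REGFILE"


# sub $2,$1,$3 ; and $12,$2,$5 ; or $13,$6,$2
# While `and` is in EX, `sub` sits in EX/MEM.
print(forward_select(2, {"RegWrite": True, "Rd": 2},
                     {"RegWrite": False, "Rd": 0}))   # EX/MEM
# While `or` is in EX, `and` sits in EX/MEM and `sub` in MEM/WB.
print(forward_select(2, {"RegWrite": True, "Rd": 12},
                     {"RegWrite": True, "Rd": 2}))    # MEM/WB
```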
Data Hazard
Example of data hazards in a sequence of MIPS instructions, where the red (blue) arrows indicate dependencies that are problematic (not problematic)
Operand Forwarding

Data Hazard
Data Hazard Solution using Operand Forwarding and Stall
Data Hazard Solution using Operand Forwarding and Stall (Cont.)
Data Hazard
• Code Re-Ordering: Here, the compiler reorders statements in the
source code, or the assembler reorders object code, to place one or
more statements between the instruction that computes the required
operand and the current instruction that uses it. This requires
an "intelligent" compiler or assembler, which must have detailed
information about the structure and timing of the pipeline on which
the data hazard would occur. We call this type of software a hardware-dependent
compiler.
• Stall Insertion: It is possible to insert one or more stalls (no-op
instructions) into the pipeline, which delays the execution of the
current instruction until the required operand is written to the register
file. This decreases pipeline efficiency and throughput, which is
contrary to the goals of pipeline processor design. Stalls are an
expedient method of last resort that can be used when compiler
action or forwarding fails or is not supported in the hardware or
software design.
• Problem: The first instruction (sub), starting on clock cycle 1 (CC1), completes on CC5, when
the result in Register 2 is written to the register file. If we did nothing to resolve data
dependencies, then no instruction that reads Register 2 from the register file could read the
"new" value computed by the sub instruction until CC5. The dependencies in the other
instructions are illustrated by solid lines with arrowheads. If register read and write cannot
occur within the same clock cycle (we will see how this could happen in Section 5.3.4),
then only the fifth instruction (sw) can access the contents of Register 2 in the manner
indicated by the flow of sequential execution in the MIPS code fragment shown previously.
• Solution #1 - Forwarding: The result generated by the sub instruction can be forwarded to
the other stages of the pipeline using special control circuitry (a data bus switchable to any
other segment, which can be implemented via a decoder or crossbar switch). This is
indicated notionally in Figure 5.7 by solid red lines with arrowheads. If the register file can
write in the first half of a cycle and read in the second half of the same cycle, then the access
in CC5 is not problematic, because the add instruction reads the value that sub wrote earlier
in that cycle. Otherwise, we would have to delay the execution of the add
instruction by one clock cycle (see Figure 5.9 for insertion of a stall).
• Solution #2 - Code Re-Ordering: Since Instructions 2 through 5 in the MIPS code
fragment all require Register 2 as an operand, we do not have instructions in that particular
code fragment to put between Instruction 1 and Instruction 2. However, let us assume that
we have other instructions that (a) do not depend on the results of Instructions 1-5, and
(b) themselves induce no dependencies in Instructions 1-5 (for example, they must not write
to register 1, 2, 3, 5, or 6). In that case, we could insert two instructions between Instructions 1 and 2, if
register read and write could occur concurrently. Otherwise, we would have to insert three
such instructions. The latter case is illustrated in the following figure, where the inserted
instructions and their pipeline actions are colored dark green.
Example of code reordering to solve data hazards in a sequence of MIPS instructions
• Solution #3 - Stalls: Suppose that we had no instructions to
insert between Instructions 1 and 2. For example, there
might be data dependencies arising from the inserted
instructions that would themselves have to be repaired.
Alternatively, the program execution order (functional
dependencies) might not permit the reordering of code. In
such cases, we have to insert stalls, also called bubbles,
which are no-op instructions that merely delay the
pipeline execution until the dependencies are no longer
problematic with respect to pipeline timing. This is
illustrated in Figure 5.9 by inserting three stalls between
Instructions 1 and 2.
Example of stall insertion to solve data hazards in a sequence of MIPS instructions

• The insertion of stalls is the least desirable technique because
it delays the execution of an instruction without accomplishing
any useful work (in contrast to code re-ordering).
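The counts quoted above (two or three inserted instructions or stalls) follow from the stage timing, and can be reproduced with a small helper. This is a sketch assuming a five-stage pipeline without forwarding, in which the producer writes the register file in its fifth cycle and the consumer reads it in its second; the function name and parameters are hypothetical.

```python
def stalls_needed(distance, same_cycle_rw=True):
    """Bubbles required between a producer and a consumer without forwarding.

    distance is the consumer's position minus the producer's position in
    program order (1 means the consumer immediately follows the producer).
    If the register file can write and read in the same cycle, the consumer
    must issue at least 3 slots after the producer, otherwise at least 4.
    """
    required_gap = 3 if same_cycle_rw else 4
    return max(0, required_gap - distance)


# `and` directly after `sub` (distance 1):
print(stalls_needed(1, same_cycle_rw=True))   # 2
print(stalls_needed(1, same_cycle_rw=False))  # 3
# `sw` is the fifth instruction (distance 4) and needs no stall either way.
print(stalls_needed(4, same_cycle_rw=False))  # 0
```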
Data Hazards and Stalls
if (ID/EX.MemRead and
((ID/EX.RegisterRt = IF/ID.RegisterRs) or
(ID/EX.RegisterRt = IF/ID.RegisterRt)))
stall the pipeline

• lw $s0, 20($t1) # lw rt, imm(rs)

• sub $t2, $s0, $t3 # sub rd, rs, rt

Note: The first line tests whether the instruction in the EX stage is a load: the only
instruction that reads data memory is a load. The next two lines
check whether the destination register field (rt) of the load in the EX
stage matches either source register of the instruction in the ID
stage. If the condition holds, the instruction in the ID stage stalls for 1 clock cycle.
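The same test, written as a runnable sketch and applied to the lw / sub pair above. The dictionaries stand in for the ID/EX and IF/ID pipeline registers; their layout and the function name are illustrative assumptions.

```python
def must_stall(id_ex, if_id):
    """Load-use hazard check.

    id_ex describes the instruction in EX: {"MemRead": bool, "Rt": register}.
    if_id describes the instruction in ID: {"Rs": register, "Rt": register}.
    """
    return id_ex["MemRead"] and id_ex["Rt"] in (if_id["Rs"], if_id["Rt"])


# lw  $s0, 20($t1)   -> rt = $s0, reads data memory
# sub $t2, $s0, $t3  -> rs = $s0, rt = $t3
lw_in_ex = {"MemRead": True, "Rt": "$s0"}
sub_in_id = {"Rs": "$s0", "Rt": "$t3"}
print(must_stall(lw_in_ex, sub_in_id))  # True: stall one cycle
```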
Stall when a dependent R-format instruction follows a load instruction (pipeline stall)
Control Hazards

• A branch determines the flow of control
– Fetching the next instruction depends on the branch outcome
– The delay in determining the proper instruction to
fetch is called a control hazard or branch hazard
– The pipeline can't always fetch the correct instruction
• The branch is still in its ID stage when the next instruction is fetched
• beq, bne in the MIPS pipeline
Control Hazards Simple Solution Option 1: two Stalls

• Stall on every branch until the new PC value is available
• This would add 2 bubbles (clock cycles) for every branch!
[Figure: pipeline diagram with time (clock cycles) on the horizontal axis. beq (I$, Reg, ALU, D$, Reg, with the EX stage marked) is followed by two nop rows of bubbles and then by the following instructions.]
• Where do we do the compare for the branch?

Control Hazard: Branching
• Optimization #1:
– Insert a special branch comparator in Stage 2 (Dec)
– As soon as the instruction is decoded (i.e. the opcode identifies it
as a branch), immediately make a decision and set the
new value of the PC
– Benefit: since the branch is complete in Stage 2, only one
unnecessary instruction is fetched, so only one no-op is
needed
– Side note: this means that branches are idle in Stages 3, 4, and 5
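The benefit of going from two bubbles to one can be put in numbers with the usual cycles-per-instruction estimate. This is a sketch with assumed inputs: the ideal CPI of 1 and the 20% branch frequency are hypothetical values chosen for illustration, not figures from the slides.

```python
def effective_cpi(branch_fraction, bubbles_per_branch, base_cpi=1.0):
    """Average cycles per instruction when every branch costs fixed bubbles."""
    return base_cpi + branch_fraction * bubbles_per_branch


BRANCH_FRACTION = 0.2  # hypothetical: one instruction in five is a branch
two_stalls = effective_cpi(BRANCH_FRACTION, 2)  # compare done in EX
one_stall = effective_cpi(BRANCH_FRACTION, 1)   # comparator moved to ID
print(two_stalls, one_stall)
```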
Special Branch Comparator with One Clock Cycle Stall

[Figure: pipeline diagram with time (clock cycles) on the horizontal axis. beq (I$, Reg, ALU, D$, Reg, with the ID/RF stage marked) is followed by one nop row of bubbles and then by the following instructions.]
• Branch comparator moved to Decode stage
Control Hazards: Branch Delay Slot

• Optimization #2: Redefine branches
– Old definition: if we take the branch, none of the
instructions after the branch get executed by
accident
– New definition: whether or not we take the
branch, the single instruction immediately following
the branch gets executed (the branch-delay slot)
• Delayed branch means we always execute the
instruction after the branch
• This optimization is used with MIPS.
Example: Nondelayed vs. Delayed Branch
Nondelayed Branch        Delayed Branch
or  $8, $9, $10          add $1, $2, $3
add $1, $2, $3           sub $4, $5, $6
sub $4, $5, $6           beq $1, $4, Exit
beq $1, $4, Exit         or  $8, $9, $10
xor $10, $1, $11         xor $10, $1, $11
Exit:                    Exit:
Notes on Branch-Delay Slot

– Worst-case scenario: put a no-op in the branch-delay slot
– Better case: place some instruction preceding the branch in the
branch-delay slot, as long as the change doesn't affect the logic
of the program
• Re-ordering instructions is a common way to speed up
programs
• The compiler usually finds such an instruction about 50% of the time
• Jumps also have a delay slot …
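Whether a preceding instruction can be moved into the delay slot "without affecting the logic" comes down to a register dependence check. The following is a simplified sketch, assuming each instruction is described only by its destination and source registers and ignoring memory dependences; the function name and tuple layout are hypothetical.

```python
def can_move_to_delay_slot(candidate, skipped, branch_reads):
    """Check whether `candidate` may be moved from before the branch into
    the branch-delay slot.

    candidate and each entry of skipped are (dest, sources) tuples; skipped
    lists the instructions the candidate would be moved past.
    """
    dest, sources = candidate
    if dest in branch_reads:        # branch would compare a stale value
        return False
    for other_dest, other_sources in skipped:
        if dest in other_sources:   # a skipped instruction needs the result
            return False
        if other_dest in sources:   # candidate would read an overwritten value
            return False
        if other_dest == dest:      # final value of dest would change
            return False
    return True


# Instructions from the nondelayed / delayed example.
or_instr = ("$8", ("$9", "$10"))
add_instr = ("$1", ("$2", "$3"))
sub_instr = ("$4", ("$5", "$6"))
beq_reads = ("$1", "$4")

print(can_move_to_delay_slot(or_instr, [add_instr, sub_instr], beq_reads))  # True
print(can_move_to_delay_slot(add_instr, [sub_instr], beq_reads))            # False
```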
Control Hazards: Branch Prediction

• Opt #3: Predict the outcome of a branch, and fix up if the guess is wrong
– Must cancel all instructions in the pipeline that depended on the
wrong guess
– This is called "flushing" the pipeline
• Opt 3.1: Assume branches are NOT taken and
continue execution down the sequential instruction stream. If the branch is
taken, the instructions that are being fetched and decoded must be discarded.
Execution continues at the branch target.
– If branches are untaken half the time, and if it costs little to
discard the instructions, this optimization halves the cost of
control hazards.
• Opt 3.2: Dynamic branch prediction: prediction of
branches at run time using run-time information.
– Uses a branch prediction buffer or branch history table
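A minimal branch history table can be sketched as follows. The slides do not specify the scheme, so this assumes the simplest one: a table indexed by the low-order bits of the branch's word address, holding one bit that records whether the branch was taken last time. The class and function names, and the table size of 16 entries, are illustrative.

```python
class BranchHistoryTable:
    """One-bit predictor: predict whatever the branch did last time."""

    def __init__(self, entries=16):
        self.entries = entries
        self.taken = [False] * entries  # initially predict not taken

    def _index(self, pc):
        # Instructions are word aligned, so drop the two low address bits.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        return self.taken[self._index(pc)]

    def update(self, pc, was_taken):
        self.taken[self._index(pc)] = was_taken


def count_mispredictions(bht, pc, outcomes):
    """Run a sequence of actual outcomes for one branch through the table."""
    misses = 0
    for was_taken in outcomes:
        if bht.predict(pc) != was_taken:
            misses += 1
        bht.update(pc, was_taken)
    return misses


# A loop branch at address 40: taken nine times, then falls through.
bht = BranchHistoryTable()
print(count_mispredictions(bht, 40, [True] * 9 + [False]))  # 2
```

With one bit per entry, a loop branch is mispredicted on its first and last iterations; predictors with more state per entry reduce this, at the cost of extra hardware.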
Exercise 1
• For the following code sequence in MIPS,
– Indicate the dependences
– Indicate the potential hazards and types
– Provide your hazard resolution methods and show how many extra
clock cycles you have to pay.

sub $2, $1, $3     # Register $2 written by sub
and $12, $2, $5    # 1st operand ($2) depends on sub
or  $13, $6, $2    # 2nd operand ($2) depends on sub
add $14, $2, $2    # 1st ($2) & 2nd ($2) depend on sub
sw  $15, 100($2)   # Base ($2) depends on sub
Exercise 2

• Show what happens when the branch is taken in this instruction
sequence, assuming the pipeline is optimized for branches that are
not taken and that we moved the branch execution to the ID stage.
The numbers to the left of the instructions (36, 40, 44, . . . ) are the
addresses of the instructions.

36 sub $10, $4, $8
40 beq $1, $3, 7 # PC-relative branch to 40 + 4 + 7 * 4 = 72
44 and $12, $2, $5
48 or $13, $2, $6
52 add $14, $4, $2
56 slt $15, $6, $7
… …
72 lw $4, 50($7)
https://www.cise.ufl.edu/~mssz/CompOrg/CDA-pipe.html

Thank you
