Chapter 3 PPTV 31 Sem IIv 31

The document discusses instruction pipelining, which improves CPU performance by overlapping the execution of multiple instructions. It describes the basics of pipelining including pipeline hazards like structural hazards, data hazards, and control hazards from branches. Solutions to hazards include forwarding, replicating pipeline stages, and branch prediction.


Chapter 3

Instruction Pipelining

1
Outline
 Basics
 Pipeline Hazards

◦ Structural
◦ Data
◦ Control
 Branch Handling
 Branch Prediction

2
Basics
 As computer systems evolve, greater
performance is achieved through:
◦ Improvements in technology
 Faster circuitry
◦ Organizational enhancements to the CPU
 Use of multiple registers rather than a single accumulator
 Use of cache memory
 Use of instruction pipelining

3
Basics…
 Pipelining
◦ Used on an assembly line in a manufacturing plant
 Products at various stages can be worked on
simultaneously
 New inputs are accepted at one end before previously
accepted inputs appear at the output
◦ Also used in doing laundry
 Stages: washer, dryer, folder, put away

4
Basics…
 Sequential execution of an N-stage task:

◦ Production time: N time units.


◦ Resource needed: one general-purpose machine
◦ Productivity: one product per N time units.

5
Basics…
 Pipelined execution of an N-stage task:

◦ Production time: N time units


◦ Resource needed: N special purpose machines
◦ Productivity: about one product per time unit

6
Basics…
 Instruction pipelining
◦ An implementation technique in which multiple
instructions are overlapped in execution
 A pipelined approach takes much less time overall
 Pipelining paradox
◦ The time for a single load is not shorter
◦ But more loads are finished per hour
 The improvement in throughput decreases the total time
to complete the work

7
Basics…
 Instruction pipelining --- applying the concept
◦ Apply the above concept to instruction execution

 A simple approach
◦ Subdivide instruction processing into two stages
◦ The pipeline has two stages
 Fetch stage
 Execute stage
◦ Fetch stage:
 Fetches an instruction and buffers it --- instruction pre-fetch or fetch
overlap
 Passes the buffered instruction on when the execute stage is free
◦ Execute stage:
 Accepts instructions from the fetch stage and processes them
 Ideally doubles the execution rate

8
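Under the simplifying assumption that fetch and execute each take one time unit, the near-doubling can be checked with a short Python sketch (the helper names are illustrative, not from the slides):

```python
def unpipelined_time(n):
    # Fetch and execute run strictly one after the other:
    # two time units per instruction.
    return 2 * n

def two_stage_time(n):
    # The fetch of instruction i+1 overlaps the execute of
    # instruction i, so only the very first fetch is exposed.
    return n + 1

print(unpipelined_time(100), two_stage_time(100))  # 200 101
```

For large n the two-stage pipeline finishes in roughly half the time, which is the "doubled execution rate" claimed above.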
Basics…
 A simple approach --- drawbacks
◦ Doubling the execution rate is unlikely, because:
 The execution time is generally longer than the fetch time,
so the fetch stage has to wait for some time
 A conditional branch instruction makes the address of
the next instruction to be fetched unknown
 The fetch stage waits until the address is supplied by the
execute stage
 The execute stage may then have to wait while the next
instruction is fetched

9
Basics…
 A typical instruction execution sequence:
◦ Fetch Instruction (FI):
 Fetch the instruction
◦ Decode Instruction (DI):
 Determine the op-code and the operand specifiers
◦ Calculate Operands (CO):
 Calculate the effective addresses
◦ Fetch Operands (FO):
 Fetch the operands
◦ Execute Instruction (EI):
 Perform the operation
◦ Write Operand (WO):
 Store the result in memory

10
Basics…
 Applying pipelining

 Each row shows the progress of an individual instruction

 Assumption
◦ Various stages will be of nearly equal duration
◦ Each instruction goes through all six stages of the pipeline
 Reduces execution time for 9 instructions
◦ From 54 time units to 14 time units

11
Basics…
 Applying pipelining (continued)

 Ideal speedup = 6 times
◦ Equal to the number of stages

12
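The timing claims above can be verified with a short calculation, a sketch in Python with illustrative function names: a k-stage pipeline needs k cycles for the first instruction and one more cycle for each instruction after it.

```python
def sequential_time(n, k):
    # Without pipelining, each of the n instructions occupies
    # all k stages before the next one starts.
    return n * k

def pipelined_time(n, k):
    # The first instruction takes k units; every later instruction
    # completes one unit after its predecessor.
    return k + (n - 1)

n, k = 9, 6
print(sequential_time(n, k))   # 54 time units
print(pipelined_time(n, k))    # 14 time units
# Speedup approaches k (here 6) as n grows:
print(sequential_time(1000, k) / pipelined_time(1000, k))
```

For the 9 instructions in the diagram the actual speedup is 54/14 ≈ 3.9; the ideal speedup of 6 is reached only in the limit of a long instruction stream.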
Basics…
 Typical Pipelining

13
Basics…
 Typical Pipelining:
◦ Effect of a conditional branch on the instruction pipeline
 Assume
◦ Instruction 3 is a conditional branch to instruction 15
◦ The outcome is not known until I3 is executed
◦ During time unit 8, instruction 15 enters the pipeline
◦ No instruction completes during time units 9 through 12
 This is the branch penalty

14
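A minimal way to see where the four lost cycles come from, sketched in Python (the helper name is illustrative): one instruction enters the pipeline in each cycle between the branch's fetch and its resolution, and all of them must be discarded.

```python
def branch_penalty(resolve_stage):
    # One instruction was fetched in each cycle between the branch's
    # own fetch and its resolution; all of them must be flushed.
    return resolve_stage - 1

# Six-stage pipeline FI DI CO FO EI WO: the outcome of I3 is known
# only after EI, the 5th stage, so the four instructions behind it
# (I4..I7) are flushed -- the empty completion slots in units 9-12.
print(branch_penalty(5))  # 4
```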
Basics…(contd)
 Alternative timing diagram for instruction pipelining
operation:
 Each row shows the state of the pipeline at a given point in
time

15
Basics…(contd)
Number of Pipeline Stages
 In general, a larger number of stages gives better
performance
 However:
◦ A larger number of stages increases the overhead in
moving information between stages and
synchronization between stages
◦ The complexity of the CPU grows with the number of
stages
◦ It is difficult to keep a large pipeline at maximum rate
because of pipeline hazards

16
Pipeline Hazard
 Pipeline hazard
◦ A situation that prevents the next instruction in the instruction
stream from executing during its designated clock cycle
◦ The instruction is said to be stalled
 When an instruction is stalled:
◦ All instructions later in the pipeline than the stalled instruction
are also stalled
◦ No new instructions are fetched during the stall
◦ Instructions earlier than the stalled one continue as usual
 Types of hazards (conflicts):
◦ Structural hazards
◦ Data hazards
◦ Control hazards

17
Pipeline Hazard...
 Structural Hazards
◦ The hardware cannot support the combination of instructions that
we want to execute in the same clock cycle
◦ Hardware conflicts caused by the use of the same hardware resource
at the same time (e.g., memory conflicts)

 Penalty: 1 cycle

18
Pipeline Hazard...
 Structural Hazards … (contd)

 In general, the hardware resources in conflict are
duplicated in order to avoid structural hazards
 Functional units (ALU, FP unit) can also be pipelined
themselves in order to support several instructions at
the same time
 Memory conflicts can be solved by
◦ having two separate caches, one for instructions
and the other for operands (Harvard architecture)
◦ keeping as many intermediate results as possible in
the registers

19
Pipeline Hazard...
 Data Hazards
◦ Occur because one step must wait for another to complete
◦ Caused by the pipeline reversing the order of data-dependent
operations (e.g., WRITE/READ conflicts)
 ADD A, R1 ; Mem(A) ← Mem(A) + R1;
 SUB A, R2 ; Mem(A) ← Mem(A) - R2;

20
Pipeline Hazard...
 Data Hazards (contd…)
◦ ADD A, R1 ; Mem(A) ← Mem(A) + R1;
◦ SUB A, R2 ; Mem(A) ← Mem(A) - R2;

◦ Penalty: 2 cycles

21
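The dependence that forces this stall can be sketched as a simple check, here in Python with illustrative names: the later instruction reads a location that the earlier one has not yet written back.

```python
def raw_hazard(writer_dest, reader_sources):
    # Read-after-write hazard: the later instruction reads a location
    # the earlier instruction has not yet written back.
    return writer_dest in reader_sources

# ADD A, R1 writes Mem(A) in its WO stage, but SUB A, R2 would read
# Mem(A) in its FO stage two cycles earlier -- hence the 2-cycle stall.
print(raw_hazard("A", {"A", "R2"}))  # True: SUB must wait
print(raw_hazard("A", {"B", "R2"}))  # False: no dependence, no stall
```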
Pipeline Hazard...
 Data Hazards --- Solutions (contd…)
◦ The penalty due to data hazards can be reduced by a technique called
forwarding (bypassing)
◦ The ALU result is fed back to the ALU input
 If the hardware detects that the value needed for the current operation is the one
produced by the previous operation (but which has not yet been written back), it
selects the forwarded result instead of the value from the register or memory

22
Pipeline Hazard...
 Data Hazards Solutions(contd…)
◦ ADD A, R1 ; Mem(A) ← Mem(A) + R1;
◦ SUB A, R2 ; Mem(A) ← Mem(A) - R2;

◦ Penalty: reduced to 1 cycle

23
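The bypass selection described above can be sketched in a few lines of Python (the names are illustrative, not hardware signal names): if an in-flight instruction has already produced the needed value, take it from the bypass path instead of the stale register/memory copy.

```python
def read_operand(state, name, in_flight):
    # in_flight maps destinations of instructions that have computed a
    # result but not yet written it back, to that fresh result.
    if name in in_flight:
        return in_flight[name]   # take the forwarding (bypass) path
    return state[name]           # normal register/memory read

state = {"A": 10, "R2": 3}
in_flight = {"A": 17}            # ADD has just computed Mem(A) + R1 = 17
# SUB A, R2 uses the forwarded 17 rather than the stale 10:
print(read_operand(state, "A", in_flight) - state["R2"])  # 14
```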
Pipeline Hazard...
 Control Hazards
◦ Arise from the need to make a decision based on the
results of one instruction while others are executing
◦ The flow of instruction addresses is not what the pipeline
expects
◦ Caused by branch instructions, which change the
instruction execution order

24
Pipeline Hazard...
 Instruction Pipelining in Practice

25
Branch Handling
 Stall
◦ Stop the pipeline until the branch instruction reaches the
last stage
◦ Large loss of performance, since 20% - 35% of the
instructions executed are branches (conditional and
unconditional)

26
Branch Handling ...
 Multiple streams
◦ Replicate the initial portions of the pipeline and allow the
pipeline to fetch both possible next instructions
 Using two streams
◦ Implement hardware to deal with both possibilities

27
Branch Handling ...
 Multiple streams --- drawbacks
◦ With multiple pipelines there are contention delays for access
to the registers and to memory
◦ Additional branch instructions may enter the pipeline
 Before the original branch decision is resolved

28
Branch Handling ...
 Pre-fetch branch target
◦ When a conditional branch is recognized, the target of the
branch is pre-fetched, in addition to the instruction
following the branch

29
Branch Handling ...
 Loop buffer:
◦ Use a small, very high-speed memory to keep the n
most recently fetched instructions in sequence. If a
branch is to be taken, the buffer is first checked to see
if the branch target is in it
◦ Similar in principle to a cache dedicated to instructions
 But
 It only retains instructions in sequence
 Much smaller in size and hence lower cost
 Delayed branch:
◦ Re-arrange the instructions so that branching occurs
later than specified

30
Branch Handling ...
 Delayed Branch Example
 The compiler has to find an instruction which can be moved from
its original place to the branch delay slot, and which will be executed
regardless of the outcome of the branch
◦ 60% to 85% success rate

31
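A simplified sketch of what such a compiler pass does, in Python (the `fill_delay_slot` helper and the toy instruction format are hypothetical; a real scheduler must also verify the moved instruction is safe on both branch paths):

```python
def fill_delay_slot(instructions, branch_index):
    # Scan backwards for the nearest instruction whose results the
    # branch does not depend on, and move it into the slot right
    # after the branch, where it executes regardless of the outcome.
    branch = instructions[branch_index]
    for i in range(branch_index - 1, -1, -1):
        cand = instructions[i]
        if not set(cand["writes"]) & set(branch["reads"]):
            return (instructions[:i] + instructions[i + 1:branch_index]
                    + [branch, cand] + instructions[branch_index + 1:])
    return instructions  # no safe candidate: the slot gets a NOP

prog = [
    {"op": "ADD R2,R3", "writes": ["R2"], "reads": ["R2", "R3"]},
    {"op": "LOAD R1",   "writes": ["R1"], "reads": []},
    {"op": "BEQZ R1",   "writes": [],     "reads": ["R1"]},
]
print([ins["op"] for ins in fill_delay_slot(prog, 2)])
# ['LOAD R1', 'BEQZ R1', 'ADD R2,R3']
```

`LOAD R1` cannot move because the branch reads R1, so the pass reaches back to `ADD R2,R3`, which the branch does not depend on.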
Branch Predictions
 When a branch is encountered, a prediction is made and the
predicted path is followed
 The instructions on the predicted path are fetched
 The fetched instructions can also be executed --- called
speculative execution
◦ The results produced by these executions should be
marked as tentative
 When the branch outcome is decided, if the prediction is
correct, the special tags on the tentative results are removed
 If not, the tentative results are discarded and execution
continues on the other path
 Branch prediction can be based on static or dynamic
information

32
Branch Predictions ...
 Static Branch Predictions
◦ Predict always taken
 Assume that the jump will happen
 Always fetch the target instruction

33
Branch Predictions ...
 Static Branch Predictions (contd…)

34
Branch Predictions ...
 Static Branch Predictions (contd…)
◦ Predict never taken
 Assume that the jump will not happen
 Always fetch the next instruction
◦ Predict by operation code
 Some instructions are more likely to result in a
jump than others
 Can achieve up to a 75% success rate

35
Branch Predictions ...
 Dynamic Branch Predictions
◦ Attempt to improve the accuracy of prediction
 By recording the history of conditional branch instructions
 Taken/not-taken switch
 One or more bits associated with each conditional branch
 Reflect the recent history of the instruction
 Where the bits can be stored
 Not with the instruction in main memory
 Kept in temporary high-speed storage instead:
 In the cache along with the instruction, or
 In a small table of recently executed branch instructions
 With one or more bits in each entry

36
Branch Predictions ...
 Taken/not-taken switch (contd…)
 Single bit --- drawback
 Records only whether the last execution of the instruction resulted in a
branch or not
 For a conditional branch that is almost always taken, such as a
loop branch
 With only one bit of history
 An error in prediction occurs twice per loop execution
 Once on entering the loop
 Once on leaving the loop

37
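The two-mispredictions-per-loop behaviour can be reproduced with a few lines of Python (a sketch with illustrative names, assuming the bit simply stores the last outcome):

```python
def one_bit_mispredictions(outcomes, initial=False):
    # The single history bit just stores the previous outcome,
    # which is also the prediction for the next execution.
    last, errors = initial, 0
    for taken in outcomes:
        if taken != last:
            errors += 1   # predicted `last`, actual was `taken`
        last = taken
    return errors

# A loop branch: taken 9 times, then not taken on exit; the loop
# is entered twice.  Every pass costs exactly two mispredictions.
one_pass = [True] * 9 + [False]
print(one_bit_mispredictions(one_pass * 2))  # 4
```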
Branch Predictions ...
 Taken/not Taken switch (contd…)
◦ Using two bits
 Record the result of the last two instances of execution of
the associated instruction
 Represented as a state diagram (a 2-bit saturating counter)
38
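The state diagram can be sketched as a small Python class (a common 2-bit saturating-counter formulation, used here as an illustration): states 0-1 predict not taken, states 2-3 predict taken, and two consecutive wrong guesses are needed before the prediction flips.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict "not taken",
    states 2-3 predict "taken"."""
    def __init__(self, state=3):
        self.state = state           # start in "strongly taken"

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Saturate at the ends so one stray outcome cannot flip
        # a strongly held prediction.
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

p = TwoBitPredictor()
errors = 0
for taken in [True] * 9 + [False] + [True] * 9 + [False]:  # loop run twice
    if p.predict() != taken:
        errors += 1
    p.update(taken)
print(errors)  # 2: only the loop exits are mispredicted, not the re-entry
```

Compare this with the four mispredictions of the single-bit scheme on the same branch pattern.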
Branch Predictions ...
 Dynamic Branch Predictions …
◦ Branch History Table:
 A small cache memory associated with the instruction fetch
stage of the pipeline
 Store information regarding branches in a branch-history table
so as to more accurately predict the branch outcome
 Assume that the branch will do what it did last time

39
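A toy version of such a table, sketched in Python (the class, its default size, and the "not taken" default are illustrative choices, not from the slides): entries are indexed by low-order bits of the branch address and each remembers that branch's last outcome.

```python
class BranchHistoryTable:
    """Toy branch-history table: indexed by low-order bits of the
    branch address; each entry stores that branch's last outcome."""
    def __init__(self, entries=16):
        self.entries = entries
        self.table = {}

    def predict(self, pc):
        # First encounter: default to "not taken" (an arbitrary choice).
        return self.table.get(pc % self.entries, False)

    def update(self, pc, taken):
        self.table[pc % self.entries] = taken

bht = BranchHistoryTable()
print(bht.predict(0x40))   # False: no history for this branch yet
bht.update(0x40, True)
print(bht.predict(0x40))   # True: "do what it did last time"
```

Real tables typically store the 2-bit counters from the previous slide in each entry rather than a single outcome.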
Summary
 Pipelining is a technique that exploits
parallelism among the instructions in a
sequential instruction stream
 Pipelining has the substantial advantage that,
unlike programming a multiprocessor, it is
fundamentally invisible to the programmer

40
