Chapter 3 PPTV 31 Sem IIv 31

The document discusses instruction pipelining, which improves CPU performance by overlapping the execution of multiple instructions. It describes the basics of pipelining including pipeline hazards like structural hazards, data hazards, and control hazards from branches. Solutions to hazards include forwarding, replicating pipeline stages, and branch prediction.


Chapter 3

Instruction Pipelining

1
Outline
 Basics
 Pipeline Hazards

◦ Structural
◦ Data
◦ Control
 Branch Handling
 Branch Prediction

2
Basics
 As computer systems evolve, greater
performance is achieved through:
◦ Improvements in technology
 Faster circuitry
◦ Organizational enhancements to the CPU
 Use of multiple registers rather than a single accumulator
 Use of cache memory
 Use of instruction pipelining

3
Basics…
 Pipelining
◦ Used on an assembly line in a manufacturing plant
 Products at various stages can be worked on
simultaneously
 New inputs are accepted at one end before previously
accepted inputs appear at the output
◦ Also used in doing laundry
 Stages: washer, dryer, folder, put away

4
Basics…
 Sequential execution of an N-stage task:

◦ Production time: N time units.


◦ Resource needed: one general-purpose machine
◦ Productivity: one product per N time units.

5
Basics…
 Pipelined execution of an N-stage task:

◦ Production time: N time units


◦ Resource needed: N special purpose machines
◦ Productivity: about one product per time unit

6
Basics…
 Instruction pipelining
◦ An implementation technique in which multiple
instructions are overlapped in execution
 A pipelined approach takes much less time overall
 Pipelining paradox
◦ The time for a single load is not shorter
◦ But more loads are finished per hour
 The improvement in throughput decreases the total time
to complete the work

7
Basics…
 Instruction pipelining --- applying the concept
◦ Apply the above concept to instruction execution

 A simple approach
◦ Subdivide instruction processing into two stages
◦ The pipeline has two stages
 Fetch stage
 Execute stage
◦ Fetch stage:
 Fetches an instruction and buffers it --- instruction pre-fetch or fetch
overlap
 Passes the buffered instruction on when the execute stage is free
◦ Execute stage:
 Accepts instructions from the fetch stage and processes them
 Ideally doubles the execution rate

8
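Under the simplifying assumption that fetch and execute each take one time unit, the near-doubling can be checked with a short Python sketch (the helper names are illustrative, not from the slides):

```python
def unpipelined_time(n):
    # Fetch and execute run strictly one after the other:
    # two time units per instruction.
    return 2 * n

def two_stage_time(n):
    # The fetch of instruction i+1 overlaps the execute of
    # instruction i, so only the very first fetch is exposed.
    return n + 1

print(unpipelined_time(100), two_stage_time(100))  # 200 101
```

For large n the two-stage pipeline finishes in roughly half the time, which is the "doubled execution rate" claimed above.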
Basics…
 A simple approach --- drawbacks
◦ Doubling the execution rate is unlikely, because:
 The execution time is generally longer than the fetch time,
so the fetch stage has to wait for some time
 A conditional branch instruction makes the address of
the next instruction to be fetched unknown
 The fetch stage waits until the address is supplied by the
execute stage
 The execute stage may then have to wait while the next
instruction is fetched

9
Basics…
 A typical instruction execution sequence:
◦ Fetch Instruction (FI):
 Fetch the instruction
◦ Decode Instruction (DI):
 Determine the op-code and the operand specifiers
◦ Calculate Operands (CO):
 Calculate the effective addresses
◦ Fetch Operands (FO):
 Fetch the operands
◦ Execute Instruction (EI):
 Perform the operation
◦ Write Operand (WO):
 Store the result in memory

10
Basics…
 Applying pipelining

 Each row shows the progress of an individual instruction

 Assumption
◦ Various stages will be of nearly equal duration
◦ Each instruction goes through all six stages of the pipeline
 Reduces execution time for 9 instructions
◦ From 54 time units to 14 time units

11
Basics…
 Applying pipelining (continued)

 Ideal speedup = 6 times
◦ Equal to the number of stages

12
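The timing claims above can be verified with a short calculation, a sketch in Python with illustrative function names: a k-stage pipeline needs k cycles for the first instruction and one more cycle for each instruction after it.

```python
def sequential_time(n, k):
    # Without pipelining, each of the n instructions occupies
    # all k stages before the next one starts.
    return n * k

def pipelined_time(n, k):
    # The first instruction takes k units; every later instruction
    # completes one unit after its predecessor.
    return k + (n - 1)

n, k = 9, 6
print(sequential_time(n, k))   # 54 time units
print(pipelined_time(n, k))    # 14 time units
# Speedup approaches k (here 6) as n grows:
print(sequential_time(1000, k) / pipelined_time(1000, k))
```

For the 9 instructions in the diagram the actual speedup is 54/14 ≈ 3.9; the ideal speedup of 6 is reached only in the limit of a long instruction stream.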
Basics…
 Typical Pipelining

13
Basics…
 Typical Pipelining:
◦ Effect of a conditional branch on the instruction pipeline
 Assume
◦ Instruction 3 is a conditional branch to instruction 15
◦ The outcome is not known until I3 is executed
◦ During time unit 8, instruction 15 enters the pipeline
◦ No instruction completes during time units 9 through 12
 This is the branch penalty

14
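A minimal way to see where the four lost cycles come from, sketched in Python (the helper name is illustrative): one instruction enters the pipeline in each cycle between the branch's fetch and its resolution, and all of them must be discarded.

```python
def branch_penalty(resolve_stage):
    # One instruction was fetched in each cycle between the branch's
    # own fetch and its resolution; all of them must be flushed.
    return resolve_stage - 1

# Six-stage pipeline FI DI CO FO EI WO: the outcome of I3 is known
# only after EI, the 5th stage, so the four instructions behind it
# (I4..I7) are flushed -- the empty completion slots in units 9-12.
print(branch_penalty(5))  # 4
```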
Basics…(contd)
 Alternative timing diagram for instruction pipelining
operation:
 Each row shows the state of the pipeline at a given point in
time

15
Basics…(contd)
Number of Pipeline Stages
 In general, a larger number of stages gives better
performance
 However:
◦ A larger number of stages increases the overhead in
moving information between stages and
synchronization between stages
◦ The complexity of the CPU grows with the number of
stages
◦ It is difficult to keep a large pipeline at maximum rate
because of pipeline hazards

16
Pipeline Hazard
 Pipeline hazard
◦ A situation that prevents the next instruction in the instruction
stream from executing during its designated clock cycle
◦ The instruction is said to be stalled
 When an instruction is stalled:
◦ All instructions later in the pipeline than the stalled instruction
are also stalled
◦ No new instructions are fetched during the stall
◦ Instructions earlier than the stalled one continue as usual
 Types of hazards (conflicts):
◦ Structural hazards
◦ Data hazards
◦ Control hazards

17
Pipeline Hazard...
 Structural Hazards
◦ The hardware cannot support the combination of instructions that
we want to execute in the same clock cycle
◦ Hardware conflicts caused by the use of the same hardware resource
at the same time (e.g., memory conflicts)

 Penalty: 1 cycle

18
Pipeline Hazard...
 Structural Hazards … (contd)

 In general, the hardware resources in conflict are
duplicated in order to avoid structural hazards
 Functional units (ALU, FP unit) can also be pipelined
themselves in order to support several instructions at
the same time
 Memory conflicts can be solved by
◦ having two separate caches, one for instructions
and the other for operands (Harvard architecture)
◦ keeping as many intermediate results as possible in
the registers

19
Pipeline Hazard...
 Data Hazards
◦ Occur because one step must wait for another to complete
◦ Caused by the pipeline reversing the order of data-dependent
operations (e.g., WRITE/READ conflicts)
 ADD A, R1 ; Mem(A) ← Mem(A) + R1;
 SUB A, R2 ; Mem(A) ← Mem(A) - R2;

20
Pipeline Hazard...
 Data Hazards (contd…)
◦ ADD A, R1 ; Mem(A) ← Mem(A) + R1;
◦ SUB A, R2 ; Mem(A) ← Mem(A) - R2;

◦ Penalty: 2 cycles

21
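The dependence that forces this stall can be sketched as a simple check, here in Python with illustrative names: the later instruction reads a location that the earlier one has not yet written back.

```python
def raw_hazard(writer_dest, reader_sources):
    # Read-after-write hazard: the later instruction reads a location
    # the earlier instruction has not yet written back.
    return writer_dest in reader_sources

# ADD A, R1 writes Mem(A) in its WO stage, but SUB A, R2 would read
# Mem(A) in its FO stage two cycles earlier -- hence the 2-cycle stall.
print(raw_hazard("A", {"A", "R2"}))  # True: SUB must wait
print(raw_hazard("A", {"B", "R2"}))  # False: no dependence, no stall
```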
Pipeline Hazard...
 Data Hazards --- Solutions (contd…)
◦ The penalty due to data hazards can be reduced by a technique called
forwarding (bypassing)
◦ The ALU result is fed back to the ALU input
 If the hardware detects that the value needed for the current operation is the one
produced by the previous operation (but which has not yet been written back), it
selects the forwarded result instead of the value from the register or memory

22
Pipeline Hazard...
 Data Hazards Solutions(contd…)
◦ ADD A, R1 ; Mem(A) ← Mem(A) + R1;
◦ SUB A, R2 ; Mem(A) ← Mem(A) - R2;

◦ Penalty: reduced to 1 cycle

23
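The bypass selection described above can be sketched in a few lines of Python (the names are illustrative, not hardware signal names): if an in-flight instruction has already produced the needed value, take it from the bypass path instead of the stale register/memory copy.

```python
def read_operand(state, name, in_flight):
    # in_flight maps destinations of instructions that have computed a
    # result but not yet written it back, to that fresh result.
    if name in in_flight:
        return in_flight[name]   # take the forwarding (bypass) path
    return state[name]           # normal register/memory read

state = {"A": 10, "R2": 3}
in_flight = {"A": 17}            # ADD has just computed Mem(A) + R1 = 17
# SUB A, R2 uses the forwarded 17 rather than the stale 10:
print(read_operand(state, "A", in_flight) - state["R2"])  # 14
```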
Pipeline Hazard...
 Control Hazards
◦ Arise from the need to make a decision based on the
results of one instruction while others are executing
◦ The flow of instruction addresses is not what the pipeline
expects
◦ Caused by branch instructions, which change the
instruction execution order

24
Pipeline Hazard...
 Instruction Pipelining in Practice

25
Branch Handling
 Stall
◦ Stop the pipeline until the branch instruction reaches the
last stage
◦ Large loss of performance, since 20% - 35% of the
instructions executed are branches (conditional and
unconditional)

26
Branch Handling ...
 Multiple streams
◦ Replicate the initial portions of the pipeline and allow the
pipeline to fetch both possible next instructions
 Using two streams
◦ Implement hardware to deal with both possibilities

27
Branch Handling ...
 Multiple streams --- drawbacks
◦ With multiple pipelines there are contention delays for access
to the registers and to memory
◦ Additional branch instructions may enter the pipeline
 Before the original branch decision is resolved

28
Branch Handling ...
 Pre-fetch branch target
◦ When a conditional branch is recognized, the target of the
branch is pre-fetched, in addition to the instruction
following the branch

29
Branch Handling ...
 Loop buffer:
◦ Use a small, very high-speed memory to keep the n
most recently fetched instructions in sequence. If a
branch is to be taken, the buffer is first checked to see
if the branch target is in it
◦ Similar in principle to a cache dedicated to instructions
 But
 It only retains instructions in sequence
 Much smaller in size and hence lower cost
 Delayed branch:
◦ Re-arrange the instructions so that branching occurs
later than specified

30
Branch Handling ...
 Delayed Branch Example
 The compiler has to find an instruction which can be moved from
its original place to the branch delay slot, and which will be executed
regardless of the outcome of the branch
◦ 60% to 85% success rate

31
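A simplified sketch of what such a compiler pass does, in Python (the `fill_delay_slot` helper and the toy instruction format are hypothetical; a real scheduler must also verify the moved instruction is safe on both branch paths):

```python
def fill_delay_slot(instructions, branch_index):
    # Scan backwards for the nearest instruction whose results the
    # branch does not depend on, and move it into the slot right
    # after the branch, where it executes regardless of the outcome.
    branch = instructions[branch_index]
    for i in range(branch_index - 1, -1, -1):
        cand = instructions[i]
        if not set(cand["writes"]) & set(branch["reads"]):
            return (instructions[:i] + instructions[i + 1:branch_index]
                    + [branch, cand] + instructions[branch_index + 1:])
    return instructions  # no safe candidate: the slot gets a NOP

prog = [
    {"op": "ADD R2,R3", "writes": ["R2"], "reads": ["R2", "R3"]},
    {"op": "LOAD R1",   "writes": ["R1"], "reads": []},
    {"op": "BEQZ R1",   "writes": [],     "reads": ["R1"]},
]
print([ins["op"] for ins in fill_delay_slot(prog, 2)])
# ['LOAD R1', 'BEQZ R1', 'ADD R2,R3']
```

`LOAD R1` cannot move because the branch reads R1, so the pass reaches back to `ADD R2,R3`, which the branch does not depend on.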
Branch Predictions
 When a branch is encountered, a prediction is made and the
predicted path is followed
 The instructions on the predicted path are fetched
 The fetched instructions can also be executed --- called
speculative execution
◦ The results produced by these executions should be
marked as tentative
 When the branch outcome is decided, if the prediction is
correct, the special tags on the tentative results are removed
 If not, the tentative results are discarded and execution
continues on the other path
 Branch prediction can be based on static or dynamic
information

32
Branch Predictions ...
 Static Branch Predictions
◦ Predict always taken
 Assume that the jump will happen
 Always fetch the target instruction

33
Branch Predictions ...
 Static Branch Predictions (contd…)

34
Branch Predictions ...
 Static Branch Predictions (contd…)
◦ Predict never taken
 Assume that the jump will not happen
 Always fetch the next instruction
◦ Predict by operation code
 Some instructions are more likely to result in a
jump than others
 Can achieve up to a 75% success rate

35
Branch Predictions ...
 Dynamic Branch Predictions
◦ Attempt to improve the accuracy of prediction
 By recording the history of conditional branch instructions
 Taken/not-taken switch
 One or more bits associated with each conditional branch
 Reflect the recent history of the instruction
 Where the bits can be stored
 Not with the instruction in main memory
 Kept in temporary high-speed storage instead:
 In the cache along with the instruction, or
 In a small table of recently executed branch instructions
 With one or more bits in each entry

36
Branch Predictions ...
 Taken/not-taken switch (contd…)
 Single bit --- drawback
 Records only whether the last execution of the instruction resulted in a
branch or not
 For a conditional branch that is almost always taken, such as a
loop branch
 With only one bit of history
 An error in prediction occurs twice per loop execution
 Once on entering the loop
 Once on leaving the loop

37
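The two-mispredictions-per-loop behaviour can be reproduced with a few lines of Python (a sketch with illustrative names, assuming the bit simply stores the last outcome):

```python
def one_bit_mispredictions(outcomes, initial=False):
    # The single history bit just stores the previous outcome,
    # which is also the prediction for the next execution.
    last, errors = initial, 0
    for taken in outcomes:
        if taken != last:
            errors += 1   # predicted `last`, actual was `taken`
        last = taken
    return errors

# A loop branch: taken 9 times, then not taken on exit; the loop
# is entered twice.  Every pass costs exactly two mispredictions.
one_pass = [True] * 9 + [False]
print(one_bit_mispredictions(one_pass * 2))  # 4
```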
Branch Predictions ...
 Taken/not Taken switch (contd…)
◦ Using two bits
 Record the result of the last two instances of execution of
the associated instruction
 Represented as a state diagram (a 2-bit saturating counter)
38
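The state diagram can be sketched as a small Python class (a common 2-bit saturating-counter formulation, used here as an illustration): states 0-1 predict not taken, states 2-3 predict taken, and two consecutive wrong guesses are needed before the prediction flips.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict "not taken",
    states 2-3 predict "taken"."""
    def __init__(self, state=3):
        self.state = state           # start in "strongly taken"

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Saturate at the ends so one stray outcome cannot flip
        # a strongly held prediction.
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

p = TwoBitPredictor()
errors = 0
for taken in [True] * 9 + [False] + [True] * 9 + [False]:  # loop run twice
    if p.predict() != taken:
        errors += 1
    p.update(taken)
print(errors)  # 2: only the loop exits are mispredicted, not the re-entry
```

Compare this with the four mispredictions of the single-bit scheme on the same branch pattern.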
Branch Predictions ...
 Dynamic Branch Predictions …
◦ Branch History Table:
 A small cache memory associated with the instruction fetch
stage of the pipeline
 Store information regarding branches in a branch-history table
so as to more accurately predict the branch outcome
 Assume that the branch will do what it did last time

39
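A toy version of such a table, sketched in Python (the class, its default size, and the "not taken" default are illustrative choices, not from the slides): entries are indexed by low-order bits of the branch address and each remembers that branch's last outcome.

```python
class BranchHistoryTable:
    """Toy branch-history table: indexed by low-order bits of the
    branch address; each entry stores that branch's last outcome."""
    def __init__(self, entries=16):
        self.entries = entries
        self.table = {}

    def predict(self, pc):
        # First encounter: default to "not taken" (an arbitrary choice).
        return self.table.get(pc % self.entries, False)

    def update(self, pc, taken):
        self.table[pc % self.entries] = taken

bht = BranchHistoryTable()
print(bht.predict(0x40))   # False: no history for this branch yet
bht.update(0x40, True)
print(bht.predict(0x40))   # True: "do what it did last time"
```

Real tables typically store the 2-bit counters from the previous slide in each entry rather than a single outcome.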
Summary
 Pipelining is a technique that exploits
parallelism among the instructions in a
sequential instruction stream
 Pipelining has the substantial advantage that,
unlike programming a multiprocessor, it is
fundamentally invisible to the programmer

40
