Pipelining

The document discusses pipeline hazards and techniques to resolve them. It describes structural, data, and control hazards and how stalls are inserted in a pipeline to handle hazards. It also classifies data hazards (RAW, WAW, and WAR) and explains forwarding, interlocking, and delayed-load techniques for resolving RAW hazards with as few stalls as possible.

Uploaded by

meetachaudhry
Copyright
© Attribution Non-Commercial (BY-NC)

PIPELINING

Topics To Be Covered
Hazards in Pipelining
Types of Hazards
Performance of Pipelines with Hazards
Structural Hazards
Data Hazard Classification
Data Hazard Resolving Techniques
Forwarding
Hardware Interlocks
Delayed Load

Pipeline Hazards
Hazards are the major hurdle in pipelining.

Hazards may make it necessary to stall the pipeline.

A stall is a pipeline bubble that floats through the pipeline, taking up space but carrying no useful work.

Inserting Stall in Pipeline

Clock cycle number

Instr       1       2       3       4       5       6       7       8       9       10
Instr i     IF      ID      EX      MEM     WB
Instr i+1           IF      ID      EX      MEM     WB
Instr i+2                   IF      ID      EX      MEM     WB
Stall                               bubble  bubble  bubble  bubble  bubble
Instr i+3                                   IF      ID      EX      MEM     WB
Instr i+4                                           IF      ID      EX      MEM     WB
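The stall timing in the table above can be reproduced with a small sketch. It assumes the five-stage pipeline from the slides, one instruction issued per cycle, and a single one-cycle stall; the function name `schedule` and its parameters are hypothetical, chosen only for this illustration.

```python
# Minimal sketch (illustrative names): stage timing with one inserted stall.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def schedule(num_instructions, stall_before=None):
    """Return {instruction index: {clock cycle: stage}}.

    Instruction k normally enters IF in cycle k + 1. If stall_before is
    given, that instruction and every later one start one cycle late,
    which is the effect of a single bubble in the pipeline.
    """
    table = {}
    for k in range(num_instructions):
        delay = 1 if stall_before is not None and k >= stall_before else 0
        start = k + 1 + delay
        table[k] = {start + offset: stage for offset, stage in enumerate(STAGES)}
    return table


if __name__ == "__main__":
    # Instr i .. Instr i+4, with the bubble inserted before Instr i+3.
    for index, row in schedule(5, stall_before=3).items():
        print(index, row)
```

With the stall before Instr i+3, that instruction is fetched in cycle 5 and Instr i+4 completes in cycle 10, one cycle later than without the stall.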

Types Of Hazards

Structural Hazards
Data Hazards

Control Hazards

Performance of Pipelines with Stalls

\[
\text{Speedup from pipelining}
= \frac{\text{Average instruction time unpipelined}}{\text{Average instruction time pipelined}}
= \frac{\text{CPI unpipelined} \times \text{Clock cycle time unpipelined}}{\text{CPI pipelined} \times \text{Clock cycle time pipelined}}
\]

The ideal CPI on a pipelined machine is almost always 1. Hence, the pipelined CPI is

\[
\text{CPI pipelined}
= \text{Ideal CPI} + \text{Pipeline stall cycles per instruction}
= 1 + \text{Pipeline stall cycles per instruction}
\]

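If we ignore the cycle time overhead of pipelining and assume the stages are all perfectly balanced, then the cycle times of the two machines are equal and

\[
\text{Speedup} = \frac{\text{CPI unpipelined}}{1 + \text{Pipeline stall cycles per instruction}}
\]

If, in addition, every instruction takes the same number of cycles on the unpipelined machine, and that number equals the pipeline depth, then CPI unpipelined equals the pipeline depth and

\[
\text{Speedup} = \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall cycles per instruction}}
\]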

Thus, if there are no pipeline stalls, this leads to the intuitive result that the speedup is equal to the number of pipeline stages.
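As a quick numerical check of the last formula, the sketch below evaluates the speedup for a few stall rates. It assumes the same simplifications as the slides (balanced stages, no pipelining overhead); the function name `pipeline_speedup` and the sample stall rates are made up for illustration.

```python
# Minimal sketch: speedup = pipeline depth / (1 + stall cycles per instruction).
def pipeline_speedup(pipeline_depth, stall_cycles_per_instruction):
    """Speedup over the unpipelined machine under the slides' assumptions."""
    if pipeline_depth < 1:
        raise ValueError("pipeline depth must be at least 1")
    if stall_cycles_per_instruction < 0:
        raise ValueError("stall cycles per instruction cannot be negative")
    return pipeline_depth / (1 + stall_cycles_per_instruction)


if __name__ == "__main__":
    # Illustrative stall rates for a 5-stage pipeline.
    for stalls in (0.0, 0.25, 1.0):
        print(f"stalls/instr = {stalls}: speedup = {pipeline_speedup(5, stalls)}")
```

For a 5-stage pipeline, an average of 0.25 stall cycles per instruction already lowers the speedup from 5 to 4.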

Structural Hazards
Common instances of structural hazards arise when:
- Some functional unit is not fully pipelined.
- Some resource has not been duplicated enough to allow all combinations of instructions in the pipeline to execute.

Example: Pipeline with Structural Hazard

Clock cycle number

Instr       1       2       3       4       5       6       7       8
Load        IF      ID      EX      MEM     WB
Instr 1             IF      ID      EX      MEM     WB
Instr 2                     IF      ID      EX      MEM     WB
Instr 3                             IF      ID      EX      MEM     WB

After Stall is implemented

Clock cycle number

Instr       1       2       3       4       5       6       7       8       9
Load        IF      ID      EX      MEM     WB
Instr 1             IF      ID      EX      MEM     WB
Instr 2                     IF      ID      EX      MEM     WB
Stall                               bubble  bubble  bubble  bubble  bubble
Instr 3                                     IF      ID      EX      MEM     WB

Simplified Picture

Clock cycle number

Instr       1       2       3       4       5       6       7       8       9
Load        IF      ID      EX      MEM     WB
Instr 1             IF      ID      EX      MEM     WB
Instr 2                     IF      ID      EX      MEM     WB
Instr 3                             stall   IF      ID      EX      MEM     WB

Question: Why would a designer allow a structural hazard?
Answer: To reduce cost.

Thus, if a structural hazard is rare, it may not be worth the cost of avoiding it.
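The structural-hazard example can be checked mechanically. The sketch below assumes that the contested resource is a single memory port shared by instruction fetch (IF) and data access (MEM); the slides do not name the resource, so this is an assumption. The function name `memory_port_conflicts` is hypothetical.

```python
# Minimal sketch, assuming one memory port shared by IF and MEM.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def memory_port_conflicts(uses_data_memory):
    """Find cycles where a MEM access collides with a later instruction's IF.

    uses_data_memory[k] is True if instruction k (fetched in cycle k + 1,
    no stalls) accesses data memory in its MEM stage.
    Returns a list of (cycle, index of MEM instruction, index of IF instruction).
    """
    conflicts = []
    for k, uses_memory in enumerate(uses_data_memory):
        if not uses_memory:
            continue
        mem_cycle = (k + 1) + STAGES.index("MEM")
        fetch_index = mem_cycle - 1  # instruction whose IF falls in mem_cycle
        if fetch_index < len(uses_data_memory):
            conflicts.append((mem_cycle, k, fetch_index))
    return conflicts


if __name__ == "__main__":
    # Load, Instr 1, Instr 2, Instr 3 from the example.
    print(memory_port_conflicts([True, False, False, False]))
```

For the sequence in the example, the only conflict is in cycle 4, between the Load (index 0) and Instr 3 (index 3), which is where the stall is inserted.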

Data Hazard Classification


Consider two instructions i and j, with i occurring before j. The possible data hazards are:

1. RAW (read after write)
2. WAW (write after write)
3. WAR (write after read)

RAW (read after write): j tries to read a source before i writes it, so j incorrectly gets the old value. This corresponds to True Data Dependence.

Example:

Instruction         1       2       3       4       5       6       7       8       9
ADD R1, R2, R3      IF      ID      EX      MEM     WB
SUB R4, R5, R1              IF      IDsub   EX      MEM     WB
AND R6, R1, R7                      IF      IDand   EX      MEM     WB
OR  R8, R1, R9                              IF      IDor    EX      MEM     WB
XOR R10, R1, R11                                    IF      IDxor   EX      MEM     WB

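WAW (write after write): j tries to write an operand before it is written by i.

The writes end up being performed in the wrong order, leaving the value written by i rather than the value written by j in the destination. This corresponds to Output Dependence.
- Only occurs if writes can happen in more than one pipeline stage.
- Not a problem in the linear integer pipeline.

Using a modified pipeline (two memory stages, MEM1 and MEM2, with ALU instructions writing back right after EX), the WAW hazard is shown below:

Instruction         1       2       3       4       5       6
LW  R1, 0(R2)       IF      ID      EX      MEM1    MEM2    WB
ADD R1, R2, R3              IF      ID      EX      WB

Here ADD writes R1 in cycle 5, one cycle before LW writes it in cycle 6, so R1 ends up holding the loaded value instead of the sum.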

WAR (write after read):

j tries to write a destination before it is read by i, so i incorrectly gets the new value. This corresponds to Anti Dependence.
For this to happen, we need a pipeline in which some instructions write results early in the pipeline and other instructions read a source late in the pipeline. This cannot occur in the linear 5-stage instruction pipeline.

Using the modified pipeline, a WAR hazard occurs as shown:

Instruction         1       2       3       4       5       6
SW  R1, 0(R2)       IF      ID      EX      MEM1    MEM2    WB
ADD R2, R3, R4              IF      ID      EX      WB

The hazard arises if SW reads R2 as late as MEM2 (cycle 5), the same cycle in which ADD writes R2 in WB.

RAR (read after read): this case is not a hazard.
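The three definitions above reduce to set intersections over the registers each instruction reads and writes. The sketch below is one way to express that; the function name `classify_hazards` and its arguments are hypothetical. It reports potential hazards (dependences) only: whether a dependence becomes an actual hazard depends on the pipeline, as noted for WAW and WAR.

```python
# Minimal sketch: potential data hazards between an earlier instruction i
# and a later instruction j, from their register read/write sets.
def classify_hazards(i_reads, i_writes, j_reads, j_writes):
    """Return the registers involved in each kind of dependence."""
    i_reads, i_writes = set(i_reads), set(i_writes)
    j_reads, j_writes = set(j_reads), set(j_writes)
    return {
        "RAW": sorted(i_writes & j_reads),   # j reads what i writes
        "WAW": sorted(i_writes & j_writes),  # both write the same register
        "WAR": sorted(i_reads & j_writes),   # j writes what i reads
    }


if __name__ == "__main__":
    # ADD R1, R2, R3 followed by SUB R4, R5, R1
    print(classify_hazards({"R2", "R3"}, {"R1"}, {"R5", "R1"}, {"R4"}))
    # LW R1, 0(R2) followed by ADD R1, R2, R3
    print(classify_hazards({"R2"}, {"R1"}, {"R2", "R3"}, {"R1"}))
    # SW R1, 0(R2) followed by ADD R2, R3, R4
    print(classify_hazards({"R1", "R2"}, set(), {"R3", "R4"}, {"R2"}))
```

Registers read by both instructions (RAR) are deliberately not reported, since that case is not a hazard.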

Data Hazard resolving techniques


RAW hazards can be eliminated using the following techniques:

1. Hardware-based techniques
   a. Forwarding or Bypassing
   b. Interlocking
2. Software-based technique
   a. Delayed Load

Forwarding
Forwarding can be generalized to include passing a result directly to the functional unit that requires it. Consider an example:

Instruction         1       2       3       4       5       6       7
ADD R1, R2, R3      IF      ID      EX      MEM     WB
SUB R4, R5, R1              IF      IDsub   EX      MEM     WB
AND R6, R1, R7                      IF      IDand   EX      MEM     WB

We notice that the result is not actually needed by SUB until after ADD has produced it:

ADD R1, R2, R3      R1 is available at ALUOutput
SUB R4, R5, R1      R1 is needed at ALUInput

If the result can be moved from where the ADD produces it (the EX/MEM register) to where the SUB needs it (the ALU input latch), then the need for a stall can be avoided.

Thus forwarding works as follows:
- The ALU result is automatically fed back to the ALU input latches.
- Control logic is needed to detect whether the fed-back value or the normal input operand should be selected.

Without forwarding, our example executes correctly only with stalls:

Instruction         1       2       3       4       5       6       7       8       9
ADD R1, R2, R3      IF      ID      EX      MEM     WB
SUB R4, R5, R1              IF      stall   stall   IDsub   EX      MEM     WB
AND R6, R1, R7                      stall   stall   IF      IDand   EX      MEM     WB

Using the forwarding paths, the code sequence can be executed without stalls:

Instruction         1       2       3       4       5       6       7
ADD R1, R2, R3      IF      ID      EXadd   MEMadd  WB
SUB R4, R5, R1              IF      ID      EXsub   MEM     WB
AND R6, R1, R7                      IF      ID      EXand   MEM     WB

- The first forwarding is for the value of R1 from EXadd to EXsub.
- The second forwarding is also for the value of R1, from MEMadd to EXand.
- This code can now be executed without stalls.
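The control logic that chooses between a forwarded value and the normal register operand can be sketched as a simple comparison of register numbers. This is an illustrative sketch, not the actual hardware design: the function name `select_alu_operand` is hypothetical, and the use of a MEM/WB pipeline register as the second forwarding source is an assumption based on the usual five-stage organization (the slides name only the EX/MEM register).

```python
# Minimal sketch of forwarding selection for one ALU source operand.
def select_alu_operand(source_reg, ex_mem_dest, mem_wb_dest):
    """Decide where the ALU input for source_reg should come from.

    ex_mem_dest: destination register of the instruction one ahead
                 (its result sits in the EX/MEM register), or None.
    mem_wb_dest: destination register of the instruction two ahead
                 (assumed to sit in a MEM/WB register), or None.
    The most recent producer takes priority.
    """
    if ex_mem_dest is not None and ex_mem_dest == source_reg:
        return "EX/MEM"
    if mem_wb_dest is not None and mem_wb_dest == source_reg:
        return "MEM/WB"
    return "REGFILE"


if __name__ == "__main__":
    # SUB R4, R5, R1 in EX while ADD R1, R2, R3 is one stage ahead.
    print(select_alu_operand("R1", ex_mem_dest="R1", mem_wb_dest=None))
    # AND R6, R1, R7 in EX: SUB (dest R4) is one ahead, ADD (dest R1) two ahead.
    print(select_alu_operand("R1", ex_mem_dest="R4", mem_wb_dest="R1"))
```

The two calls correspond to the first and second forwarding in the example: SUB takes R1 from the stage directly ahead, and AND takes it from two stages ahead.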

Interlocking
Not all potential hazards can be handled by Forwarding.

Consider the following sequence of instructions:


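Instruction         1       2       3       4       5       6       7       8
LW  R1, 0(R1)       IF      ID      EX      MEM     WB
SUB R4, R1, R5              IF      ID      EXsub   MEM     WB
AND R6, R1, R7                      IF      ID      EXand   MEM     WB
OR  R8, R1, R9                              IF      ID      EX      MEM     WB

The result is needed even before it is available: LW delivers R1 only at the end of MEM (cycle 4), but SUB needs it at the start of EXsub (also cycle 4).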


To overcome the above problem, we need to add hardware, called a pipeline interlock, to preserve the correct execution pattern. A pipeline interlock detects a hazard and stalls the pipeline until the hazard is cleared.

The pipeline with a stall and the legal forwarding is:

Instruction         1       2       3       4       5       6       7       8       9
LW  R1, 0(R1)       IF      ID      EX      MEM     WB
SUB R4, R1, R5              IF      ID      stall   EXsub   MEM     WB
AND R6, R1, R7                      IF      stall   ID      EX      MEM     WB
OR  R8, R1, R9                              stall   IF      ID      EX      MEM     WB

* The only necessary forwarding is for R1, from MEM to EXsub.

* There is no need to forward R1 to the AND instruction, because it now gets the value through the register file in ID.

* The CPI for the stalled instruction increases by the length of the stall.
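The hazard check performed by a pipeline interlock for this load case can be sketched as follows. This is a simplified illustration assuming forwarding is available for everything except a load followed immediately by a use; the function name `needs_load_interlock` and its parameters are hypothetical.

```python
# Minimal sketch of load-use hazard detection in the ID stage.
def needs_load_interlock(ex_is_load, ex_dest, id_sources):
    """Return True if the instruction in ID must stall for one cycle.

    ex_is_load: True if the instruction currently in EX is a load.
    ex_dest:    destination register of the instruction in EX.
    id_sources: source registers of the instruction currently in ID.
    """
    return ex_is_load and ex_dest in id_sources


if __name__ == "__main__":
    # LW R1, 0(R1) in EX, SUB R4, R1, R5 in ID: stall required.
    print(needs_load_interlock(True, "R1", ("R1", "R5")))
    # ADD R1, R2, R3 in EX, SUB R4, R5, R1 in ID: forwarding suffices.
    print(needs_load_interlock(False, "R1", ("R5", "R1")))
```

Only the instruction directly after the load is checked, which matches the example: AND and OR are delayed by the same bubble but need no stall of their own.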

Delayed Load

The compiler helps by arranging instructions to avoid pipeline stalls; this is called instruction scheduling.

- The compiler knows the delay slots (the next instruction that may conflict with a load) for typical instruction types.
- It tries to move other instructions that don't conflict into this slot.
- If one can't be found, it inserts a NOP.

Compiler Scheduling Example

A = B + C;
D = E + F;

LW  R1, B
LW  R2, C
ADD R3, R1, R2      <- Need to stall for R2
SW  A, R3
LW  R4, E
LW  R5, F
ADD R6, R4, R5      <- Need to stall for R5
SW  D, R6

Compiler Scheduling Example (after using delayed load)


A = B + C;
D = E + F;

LW  R1, B
LW  R2, C
LW  R4, E
LW  R5, F
ADD R3, R1, R2
SW  A, R3
ADD R6, R4, R5
SW  D, R6

<- Instructions swapped (LW R4, E and LW R5, F moved ahead of the first ADD), no stall
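The benefit of the reordering can be checked by counting load-use stalls in both sequences. The sketch below assumes a pipeline with forwarding, where the only remaining stall is one cycle when an instruction uses the result of the load immediately before it. The tuple encoding of instructions and the names `count_load_use_stalls`, `UNSCHEDULED` and `SCHEDULED` are made up for this illustration.

```python
# Minimal sketch: count one-cycle load-use stalls in a straight-line sequence.
# Each instruction is (opcode, destination register or None, source registers).
UNSCHEDULED = [
    ("LW", "R1", ()),            # LW  R1, B
    ("LW", "R2", ()),            # LW  R2, C
    ("ADD", "R3", ("R1", "R2")), # ADD R3, R1, R2
    ("SW", None, ("R3",)),       # SW  A, R3
    ("LW", "R4", ()),            # LW  R4, E
    ("LW", "R5", ()),            # LW  R5, F
    ("ADD", "R6", ("R4", "R5")), # ADD R6, R4, R5
    ("SW", None, ("R6",)),       # SW  D, R6
]

SCHEDULED = [
    ("LW", "R1", ()),
    ("LW", "R2", ()),
    ("LW", "R4", ()),
    ("LW", "R5", ()),
    ("ADD", "R3", ("R1", "R2")),
    ("SW", None, ("R3",)),
    ("ADD", "R6", ("R4", "R5")),
    ("SW", None, ("R6",)),
]


def count_load_use_stalls(program):
    """Count instructions that read the destination of the preceding load."""
    stalls = 0
    for previous, current in zip(program, program[1:]):
        previous_op, previous_dest, _ = previous
        _, _, current_sources = current
        if previous_op == "LW" and previous_dest in current_sources:
            stalls += 1
    return stalls


if __name__ == "__main__":
    print("unscheduled stalls:", count_load_use_stalls(UNSCHEDULED))
    print("scheduled stalls:  ", count_load_use_stalls(SCHEDULED))
```

Under these assumptions the original order incurs two stalls (for R2 and R5), and the scheduled order incurs none.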

Thank You
