
COE301 Lab 13 Pipelined CPU Design With Data Forwarding

This document discusses pipelining a CPU design to improve performance. It describes how to divide the CPU into 5 stages - fetch, decode, execute, memory, and writeback. Pipeline registers are added between each stage to allow instructions to flow continuously through the pipeline. Data hazards can occur when instructions depend on results that have not been written yet. To address this, forwarding units are implemented which bypass the register file and supply dependent instructions with operands from earlier pipeline stages where the results have already been computed. The document provides examples of data hazards and illustrates how forwarding can resolve them by supplying operands directly from previous instructions still in the pipeline.

Uploaded by

Itz Sami Uddin


13 Pipelined CPU Design with Data Forwarding

13.1 Objectives

After completing this lab, you will:

• Learn how to design a pipelined CPU
• Learn the different types of pipeline hazards
• Implement forwarding to handle data hazards
• Verify the correct operation of your pipelined CPU design

13.2 Pipeline Data Path

The single-cycle data path design can be pipelined into a 5-stage pipeline by introducing registers at
the end of each stage as shown in Figure 13.1.

Figure 13.1: Pipelined Data Path.

It should be observed that the destination register number is also pipelined by saving it across stages,
because the register is written in stage 5. In addition, the incremented value of PC is pipelined
across stages 2 and 3 as it is needed by the Next PC block.
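
As a rough illustration of how instructions flow through the stage registers, the following Python sketch reports which stage each instruction occupies in each clock cycle of an ideal 5-stage pipeline, with no hazards or stalls. The stage labels IF, ID, EX, MEM and WB and all function names are assumptions made for this example, not part of the lab design.

```python
# Assumed labels for stages 1 to 5 of the pipeline.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Clock cycles needed by an ideal pipeline with no stalls."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def stage_at(instr_index, cycle):
    """Stage of instruction instr_index (0-based) in the given cycle (1-based).

    Returns None when the instruction is not in the pipeline in that cycle.
    """
    stage_index = cycle - 1 - instr_index
    if 0 <= stage_index < len(STAGES):
        return STAGES[stage_index]
    return None


def timing_diagram(num_instructions):
    """Return one text row per instruction, one column per clock cycle."""
    last_cycle = total_cycles(num_instructions)
    rows = []
    for i in range(num_instructions):
        cells = [stage_at(i, c) or "." for c in range(1, last_cycle + 1)]
        rows.append(" ".join(f"{cell:>3}" for cell in cells))
    return rows


if __name__ == "__main__":
    print("\n".join(timing_diagram(3)))
```

Each instruction enters the pipeline one cycle after the previous one, so in this idealized model n instructions finish in n + 4 cycles.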

13: Pipelined CPU Design with Data Forwarding Page 1

13.3 Pipelined Control

The control signals are generated by the control unit in the second stage (i.e. the ID stage). In order to
pipeline the control unit, we need to save all the control signals needed by the later stages in
registers. For example, the control signals ExtOp, ALUSrc, ALUCtrl, J, Beq, Bne, MemRead,
MemWrite, Memtoreg and RegWrite need to be saved in the register separating stages 2 and 3.
However, only the control signals MemRead, MemWrite, Memtoreg and RegWrite are saved in the
register separating stages 3 and 4, as the remaining control signals are used in stage 3 and are not
needed in stages 4 and 5. The pipelined data path and control unit are shown in Figure 13.2.
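
The way control signals are dropped from one pipeline register to the next can be sketched in Python as follows. The signal lists for the stage 2/3 and stage 3/4 registers are taken from the text. The list for the stage 4/5 register is an assumption (the text does not state it), and the names ID_EX, EX_MEM, MEM_WB, latch and pipeline_control are hypothetical labels used only for this example.

```python
# Signals saved in the register between stages 2 and 3 (from the text).
ID_EX_SIGNALS = ["ExtOp", "ALUSrc", "ALUCtrl", "J", "Beq", "Bne",
                 "MemRead", "MemWrite", "Memtoreg", "RegWrite"]
# Signals saved in the register between stages 3 and 4 (from the text).
EX_MEM_SIGNALS = ["MemRead", "MemWrite", "Memtoreg", "RegWrite"]
# ASSUMPTION: only the write-back signals are still needed after stage 4.
MEM_WB_SIGNALS = ["Memtoreg", "RegWrite"]


def latch(signals, keep):
    """Model one pipeline register: copy only the signals still needed."""
    return {name: signals[name] for name in keep}


def pipeline_control(decoded):
    """Carry decoded control signals through the three pipeline registers."""
    id_ex = latch(decoded, ID_EX_SIGNALS)
    ex_mem = latch(id_ex, EX_MEM_SIGNALS)
    mem_wb = latch(ex_mem, MEM_WB_SIGNALS)
    return id_ex, ex_mem, mem_wb


# Illustrative (hypothetical) signal values for a load instruction.
LOAD_EXAMPLE = {"ExtOp": 1, "ALUSrc": 1, "ALUCtrl": "ADD", "J": 0, "Beq": 0,
                "Bne": 0, "MemRead": 1, "MemWrite": 0, "Memtoreg": 1,
                "RegWrite": 1}

if __name__ == "__main__":
    for register in pipeline_control(LOAD_EXAMPLE):
        print(sorted(register))
```

In hardware each of these dictionaries corresponds to a group of flip-flops clocked together with the data path registers of the same stage boundary.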

Figure 13.2: Pipeline Data Path and Control logic.

13.4 Pipeline Hazards

Hazards are situations that would cause incorrect execution if the next instruction were launched during
its designated clock cycle. Hazards can be classified into three main types:

1. Structural hazards
• Caused by resource contention
• Two instructions use the same resource during the same cycle

2. Data hazards
• An instruction may compute a result needed by a following instruction
• Hardware can detect dependencies between instructions

3. Control hazards
• Caused by instructions that change control flow (branches/jumps)
• Delays in changing the flow of control

Hazards complicate pipeline control and limit performance. A dependency between instructions
causes a data hazard. One example of a data hazard is the Read After Write (RAW) hazard.
Given two instructions I and J, where I comes before J, instruction J should read an operand
only after it is written by I.

I: add $s1, $s2, $s3 # $s1 is written

J: sub $s4, $s1, $s3 # $s1 is read

The RAW Hazard occurs when J reads the operand before I writes it.

Figure 13.3 shows a sample MIPS program with several RAW hazards. The result of the sub instruction
is needed by the add, or, and, and sw instructions. The add and or instructions will read the old value of
$s2 from the register file, as the value of $s2 has not been updated in the register file yet. During CC5, $s2 is
written at the end of the cycle, so an instruction that reads the register file during CC5 also gets the
old value. Thus, from this example we can see that any dependency between an instruction and any
of the three following instructions will cause a RAW hazard.
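
The rule stated above (a dependency on any of the three preceding instructions is a RAW hazard) can be checked mechanically. The Python sketch below is one possible way to do it. The Instr type, the raw_hazards function and the sample program are assumptions for illustration: the program only mirrors the dependencies on $s2 and $s6 described in the text, and the other register names are invented.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str                 # assembly text, for display only
    dest: Optional[str]       # register written, or None (e.g. sw)
    sources: Tuple[str, ...]  # registers read


def raw_hazards(program: List[Instr], window: int = 3):
    """Return (producer_index, consumer_index, register) for each RAW hazard.

    A hazard is reported when an instruction reads a register whose most
    recent writer is one of the `window` preceding instructions.
    """
    hazards = []
    for j, consumer in enumerate(program):
        for src in dict.fromkeys(consumer.sources):  # drop duplicate sources
            if src == "$0":
                continue  # register $0 is never a real dependency
            # Scan backwards for the most recent writer within the window.
            for i in range(j - 1, max(j - window, 0) - 1, -1):
                if program[i].dest == src:
                    hazards.append((i, j, src))
                    break
    return hazards


# Hypothetical program; only the $s2 and $s6 dependencies follow the text.
SAMPLE = [
    Instr("sub $s2, $t1, $t3", "$s2", ("$t1", "$t3")),
    Instr("add $s4, $s2, $t5", "$s4", ("$s2", "$t5")),
    Instr("or  $s6, $t3, $s2", "$s6", ("$t3", "$s2")),
    Instr("and $s7, $s6, $s2", "$s7", ("$s6", "$s2")),
    Instr("sw  $t8, 10($s2)", None, ("$t8", "$s2")),
]

if __name__ == "__main__":
    for i, j, reg in raw_hazards(SAMPLE):
        print(f"{SAMPLE[j].text!r} needs {reg} from {SAMPLE[i].text!r}")
```

In this sample the sw instruction is four instructions after sub, so its read of $s2 is not reported as a hazard.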

Figure 13.3: Example of RAW hazard.


13.5 Handling RAW Data Hazards

One way of handling RAW data hazards is to stall the pipeline until the required destination register
is updated in the register file. This requires freezing the execution of instructions that have such a
dependency for up to three clock cycles to allow the register file to be updated. Figure 13.4 illustrates an
example of that. Due to the RAW dependency between the add and sub instructions, fetching the
operands of the add instruction has to wait until register $s2 of the sub instruction is updated. This
requires stalling the pipeline for three clock cycles, from CC3 to CC5. Stall cycles delay the
execution of the add instruction and the fetching of the or instruction. The add instruction cannot read
$s2 until the beginning of CC6. The add instruction remains in the Instruction register until CC6 and
the PC register is not modified until the beginning of CC6.
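
The stall behaviour described above can be estimated with a small timing model. The Python sketch below is a simplification under stated assumptions: a 5-stage pipeline without forwarding, operands read in stage 2, results written at the end of stage 5 and readable only from the following cycle, and in-order issue. The function name and the (dest, sources) program encoding are hypothetical.

```python
def cycles_with_stalls(program):
    """Return (total_cycles, stall_cycles) for a list of (dest, sources).

    dest is the register written (or None); sources is a tuple of the
    registers read. Cycles are numbered from 1, as CC1, CC2, ... in the text.
    """
    ready = {}       # register -> first cycle in which its new value is readable
    prev_decode = 1  # so that the first instruction is decoded in CC2
    stalls = 0
    for dest, sources in program:
        earliest = prev_decode + 1  # cannot overtake the previous instruction
        waits = [ready.get(s, 0) for s in sources if s != "$0"]
        decode = max([earliest] + waits)
        stalls += decode - earliest
        if dest is not None and dest != "$0":
            # Write-back happens in cycle decode + 3; readable one cycle later.
            ready[dest] = decode + 4
        prev_decode = decode
    total = prev_decode + 3 if program else 0  # EX, MEM, WB of the last instruction
    return total, stalls


if __name__ == "__main__":
    # sub writes $s2 and the next instruction reads it (other names invented).
    print(cycles_with_stalls([("$s2", ("$t1", "$t3")), ("$s4", ("$s2", "$t5"))]))
```

For the sub/add pair this model gives three stall cycles, with add reading $s2 in CC6, which agrees with the description of Figure 13.4.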

However, instead of stalling the pipeline and wasting clock cycles, RAW hazards can be handled by
observing that the needed data is available in one of the stages 3 to 5 and can be used by forwarding
it to stage 2 instead of waiting until the data is written to the reg file. This idea is illustrated in
Figure 13.5.

Figure 13.4: Pipeline stall due to RAW hazard.

The add instruction takes the content of $s2 from the ALU output. The or instruction takes the
content of $s2 from the output of the data memory (MEM) stage. The and instruction takes the content
of $s2 from the input of the register file in stage 5 and the content of $s6 from the ALU output; both
values are forwarded to stage 2.

To implement forwarding, two multiplexers are added at the inputs of the A & B registers and data
from ALU stage, MEM stage, and WB stage is fed back to these multiplexers as shown in Figure
13.6. Two signals ForwardA and ForwardB control forwarding as shown in Table 13.1.


Figure 13.5: Example of data forwarding.

It should be observed that the instruction currently being decoded is in the Decode stage, the previous
instruction is in the Execute stage, the second previous instruction is in the Memory stage and the
third previous instruction is in the Write Back stage. Thus, the detection of RAW data hazards
and the generation of the forwarding control signals can be done as follows:

If ((Rs != 0) and (Rs == Rd2) and (EX.RegWrite)) ForwardA = 1
Else if ((Rs != 0) and (Rs == Rd3) and (MEM.RegWrite)) ForwardA = 2
Else if ((Rs != 0) and (Rs == Rd4) and (WB.RegWrite)) ForwardA = 3
Else ForwardA = 0

If ((Rt != 0) and (Rt == Rd2) and (EX.RegWrite)) ForwardB = 1
Else if ((Rt != 0) and (Rt == Rd3) and (MEM.RegWrite)) ForwardB = 2
Else if ((Rt != 0) and (Rt == Rd4) and (WB.RegWrite)) ForwardB = 3
Else ForwardB = 0
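
The conditions above can be written as a small executable model, which may help when checking a hardware implementation against expected values. This is a sketch, not the lab's reference design: the function name forward_select and its parameter names are assumptions, and registers are represented by their numbers (for example, $s2 is register 18).

```python
def forward_select(src, rd2, rd3, rd4, ex_regwrite, mem_regwrite, wb_regwrite):
    """Return the forwarding select value (0 to 3) for one source register.

    src is Rs (for ForwardA) or Rt (for ForwardB). rd2, rd3 and rd4 are the
    destination register numbers held in the Execute, Memory and Write Back
    stages. The checks are ordered so that the most recent result wins.
    """
    if src != 0 and src == rd2 and ex_regwrite:
        return 1  # result of the previous instruction (ALU stage)
    if src != 0 and src == rd3 and mem_regwrite:
        return 2  # result of the 2nd previous instruction (MEM stage)
    if src != 0 and src == rd4 and wb_regwrite:
        return 3  # result of the 3rd previous instruction (WB stage)
    return 0      # no hazard: use the value read from the register file


if __name__ == "__main__":
    S2 = 18  # register number of $s2
    forward_a = forward_select(S2, rd2=S2, rd3=0, rd4=0,
                               ex_regwrite=True, mem_regwrite=False,
                               wb_regwrite=False)
    print("ForwardA =", forward_a)
```

The same function is evaluated twice per decoded instruction, once with Rs and once with Rt. The test for register 0 prevents forwarding a value for $0, which always reads as zero.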

The hazard detection and forwarding unit is shown in Figure 13.7.


Table 13.1: Data forwarding signals.

Signal          Explanation
ForwardA = 0    First ALU operand comes from register file = Value of (Rs)
ForwardA = 1    Forward result of previous instruction to A (from ALU stage)
ForwardA = 2    Forward result of 2nd previous instruction to A (from MEM stage)
ForwardA = 3    Forward result of 3rd previous instruction to A (from WB stage)
ForwardB = 0    Second ALU operand comes from register file = Value of (Rt)
ForwardB = 1    Forward result of previous instruction to B (from ALU stage)
ForwardB = 2    Forward result of 2nd previous instruction to B (from MEM stage)
ForwardB = 3    Forward result of 3rd previous instruction to B (from WB stage)
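
The encoding in Table 13.1 corresponds to a 4-input multiplexer in front of each of the A and B registers. A minimal Python model of one such multiplexer is sketched below. The function name and the parameter names (reg_value, alu_result, mem_result, wb_data) are assumptions chosen for readability, not signal names from the lab design.

```python
def forwarding_mux(select, reg_value, alu_result, mem_result, wb_data):
    """Pick the operand written into the A (or B) register.

    select follows Table 13.1: 0 = register file, 1 = ALU stage,
    2 = MEM stage, 3 = WB stage.
    """
    inputs = (reg_value, alu_result, mem_result, wb_data)
    if select not in (0, 1, 2, 3):
        raise ValueError(f"invalid forwarding select: {select}")
    return inputs[select]


if __name__ == "__main__":
    # With select = 1 the stale register file value (7) is bypassed.
    print(forwarding_mux(1, reg_value=7, alu_result=42, mem_result=0, wb_data=0))
```

One instance of this multiplexer is driven by ForwardA and a second one by ForwardB.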

Figure 13.6: Implementation of data forwarding.


Figure 13.7: Hazard detection and forwarding unit.

13.6 In-Lab Tasks

1. Implement RAW hazard detection and the forwarding unit in your pipelined CPU design.
2. Add pipeline registers to your data path CPU design.
3. Add pipeline registers to pipeline the control signals in your CPU design.
4. Verify the correctness of your pipelined CPU design by executing the following instruction
sequence:

ori $s1, $0, 1

addi $s2, $0, 2
xor $s3, $s3, $s3
andi $s4, $0, 0
addi $s5, $s1, 5
add $s6, $s1, $s2
sub $s7, $s1, $s2

How many clock cycles does your pipelined CPU take to execute this program?


5. Add the two multiplexers needed to implement data forwarding.
6. Implement the forwarding unit and add it to your pipelined CPU.
7. Verify the correctness of your pipelined CPU design including data forwarding unit by
executing the following instruction sequence:

ori $s1, $0, 1

addi $s2, $0, 2
ori $s3, $0, 3
sub $s4, $s3, $s1
add $s5, $s4, $s2
or $s6, $s4, $s5
and $s7, $s3, $s4
sw $s7, 10($s4)
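
To know what values to look for when verifying the design, it can help to compute the expected results with a simple instruction-level model that ignores pipelining altogether. The Python sketch below does this for the sequence above. It is only a checking aid under stated assumptions: the tuple encoding of instructions and the run function are hypothetical, only the instructions used in this sequence are supported, and all registers are assumed to start at 0.

```python
MASK = 0xFFFFFFFF  # keep results within 32 bits


def run(program):
    """Execute (op, a, b, c) tuples sequentially; return (registers, memory).

    For ALU instructions a is the destination. For sw the tuple is
    ("sw", source_register, base_register, offset).
    """
    regs = {}  # ASSUMPTION: registers not yet written read as 0
    mem = {}   # byte address -> stored word

    def read(reg):
        return 0 if reg == "$0" else regs.get(reg, 0)

    def write(reg, value):
        if reg != "$0":  # writes to $0 are ignored
            regs[reg] = value & MASK

    for op, a, b, c in program:
        if op == "ori":
            write(a, read(b) | (c & 0xFFFF))  # immediate is zero-extended
        elif op == "addi":
            write(a, read(b) + c)
        elif op == "add":
            write(a, read(b) + read(c))
        elif op == "sub":
            write(a, read(b) - read(c))
        elif op == "or":
            write(a, read(b) | read(c))
        elif op == "and":
            write(a, read(b) & read(c))
        elif op == "sw":
            mem[(read(b) + c) & MASK] = read(a)
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs, mem


FORWARDING_TEST = [
    ("ori", "$s1", "$0", 1),
    ("addi", "$s2", "$0", 2),
    ("ori", "$s3", "$0", 3),
    ("sub", "$s4", "$s3", "$s1"),
    ("add", "$s5", "$s4", "$s2"),
    ("or", "$s6", "$s4", "$s5"),
    ("and", "$s7", "$s3", "$s4"),
    ("sw", "$s7", "$s4", 10),
]

if __name__ == "__main__":
    registers, memory = run(FORWARDING_TEST)
    print(registers)
    print(memory)
```

If the pipelined CPU with forwarding ends with different register or memory contents than this sequential model, a forwarding path or hazard condition is a likely place to look.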
