CA07 2022S3 New
❑ Single cycle
   ➢ Stage latencies: IF (I-MEM) 180 ps, ID (Reg Read) 100 ps, EX (ALU) 160 ps,
      MEM (D-MEM) 200 ps (longest stage), WB (Reg W) 100 ps
   ➢ The clock period must fit the slowest instruction (LW), so a shorter
      instruction such as SW wastes part of every cycle.
❑ Multicycle
   ➢ The clock period is set by the longest stage; each instruction takes only the
      cycles it needs: LW takes 5 (IF ID Exec Mem Wr), SW takes 4 (IF ID Exec Mem),
      and BEQ then begins its IF (see the sketch below).
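A minimal sketch (Python) of the arithmetic behind this recap, assuming the
single-cycle clock period is the sum of the stage latencies, the multicycle clock
period is the longest stage, and the per-instruction cycle counts are those shown
above (5 for lw, 4 for sw).

# Sketch: single-cycle vs. multicycle timing from the stage latencies above.
stage_ps = {"IF": 180, "ID": 100, "EX": 160, "MEM": 200, "WB": 100}

single_cycle_period = sum(stage_ps.values())   # 740 ps: every instruction pays this
multicycle_period = max(stage_ps.values())     # 200 ps: clocked by the longest stage

cycles_needed = {"lw": 5, "sw": 4}             # cycle counts taken from the diagram
for instr, n in cycles_needed.items():
    print(f"{instr}: single-cycle {single_cycle_period} ps, "
          f"multicycle {n * multicycle_period} ps")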
❑ Can we do better?
   ➢ Pipelining: employs more concurrency (i.e., more “work” done in 1 cycle)
   ➢ Laundry analogy:
      ▪ 4 loads → speedup = 8/3.5 ≈ 2.3
      ▪ n → ∞ loads: speedup = 2n / (0.5n + 1.5) → 4 (see the sketch below)
      ▪ In the limit, speedup = number of stages.
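A small sketch (Python) of the laundry speedup formula above, assuming 4 stages of
0.5 hours each, so n sequential loads take 2n hours and n pipelined loads take
0.5n + 1.5 hours.

# Sketch: laundry-analogy speedup, assuming 4 stages of 0.5 h each.
def speedup(n_loads: int) -> float:
    sequential = 2.0 * n_loads            # each load takes 2 h on its own
    pipelined = 0.5 * n_loads + 1.5       # 1.5 h fill time, then 0.5 h per load
    return sequential / pipelined

print(speedup(4))          # 8 / 3.5 ≈ 2.3, as on the slide
print(speedup(1_000_000))  # approaches 4 = number of stages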
Single-cycle vs multi-cycle vs pipeline
❑ Five stages, one step per stage
➢ Each step takes 1 clock cycle → instructions enter and leave the pipeline at the
   rate of one per clock cycle
[Figure: single-cycle vs multicycle vs pipelined timing]
   ➢ Single-cycle implementation: one long clock cycle per instruction; sw wastes
      part of the cycle sized for lw.
   ➢ Multicycle implementation: a short clock; lw takes 5 cycles (IF ID EX MEM WB),
      sw takes 4 (IF ID EX MEM), and the next instruction’s IF fills cycle 10.
   ➢ Pipeline implementation: lw, sw, and an R-type overlap, each occupying
      IF ID EX MEM WB in successive cycles; the pipeline clock is the same as the
      multicycle clock.
Pipeline performance
❑ Ideal pipeline assumptions
➢ Identical operations, e.g. four laundry steps are repeated for all loads
➢ Independent operations, e.g. one laundry load does not depend on another
➢ Uniformly partitionable suboperations (that do not share resources), e.g.
laundry steps have uniform latency.
✓ Latency = execution time (delay or response time) = the total time from start to
finish of ONE instruction
✓ Throughput (or execution bandwidth) = the total amount of work done in a given
amount of time
Example
❑ Assume the execution times for the stages in a RISC-V datapath are
✓ 100ps for register read or write
✓ 200ps for other stages
Single-cycle (Tc = 800 ps): every instruction takes one 800 ps cycle; a shorter
instruction such as sw wastes part of the cycle sized for lw.
Pipelined (Tc = 200 ps): five instructions (lw, sw, then R-types) overlap, each
occupying IF ID EX MEM WB in consecutive cycles (9 cycles in total); the cycles
before the first instruction completes are the pipeline’s fill time.
❑ Time between the 1st and 5th instructions: single-cycle = 3200 ps (4 × 800 ps) vs
   pipelined = 800 ps (4 × 200 ps) → speedup = 4.
   ➢ Total execution time for 5 instructions: 4000 ps vs 1800 ps → speedup ≈ 2.22
      → Why isn’t the speedup 5 (the number of stages)? What’s wrong?
   ➢ Think of real programs, which execute billions of instructions (see the sketch
      below).
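A small sketch (Python) of this comparison, assuming a single-cycle clock of 800 ps
and a pipelined clock of 200 ps with a 5-stage pipeline and no stalls; the speedup
climbs from about 2.22 for 5 instructions toward 800/200 = 4 as the instruction
count grows.

# Sketch: single-cycle vs. pipelined execution time (Tc_single = 800 ps,
# Tc_pipe = 200 ps, 5 stages, no stalls).
STAGES, TC_SINGLE, TC_PIPE = 5, 800, 200

def single_cycle_time(n: int) -> int:
    return n * TC_SINGLE

def pipelined_time(n: int) -> int:
    # (STAGES - 1) fill cycles, then one instruction completes per cycle.
    return (STAGES - 1 + n) * TC_PIPE

for n in (5, 1_000_000):
    s, p = single_cycle_time(n), pipelined_time(n)
    print(f"n={n}: {s} ps vs {p} ps, speedup = {s / p:.2f}")
# n=5 gives 4000 ps vs 1800 ps (speedup 2.22); for large n the speedup approaches
# 800/200 = 4, not 5, because the stages are not perfectly balanced.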
Symbolic representation of 5 stages
IF  = Instruction Fetch   (IMEM read)
ID  = Instruction Decode  (Reg read)
EX  = Execute             (ALU)
MEM = Memory Access       (DMEM)
WB  = Write Back          (Reg write)
Symbolic representation of pipelined RISC-V datapath
[Figure: pipeline diagram for the sequence add t0, t1, t2; or t3, t4, t5;
 sw t0, 4(t3); lw t0, 8(t3)]
   ➢ A row shows one instruction’s resource use over time; a column shows the
      resources used in a particular time slot.
   ➢ t_instruction = 1000 ps, t_cycle = 200 ps.
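A small illustration (Python) of the latency/throughput distinction using the
numbers above: each instruction still takes t_instruction = 1000 ps from start to
finish, but once the pipeline is full one instruction completes every
t_cycle = 200 ps.

# Sketch: latency vs. throughput for the pipelined datapath above.
T_INSTRUCTION_PS = 1000   # latency: start-to-finish time of ONE instruction
T_CYCLE_PS = 200          # one instruction completes per cycle in steady state

latency_ns = T_INSTRUCTION_PS / 1000
throughput_per_ns = 1000 / T_CYCLE_PS     # instructions completed per ns

print(f"latency    = {latency_ns} ns per instruction")          # not reduced by pipelining
print(f"throughput = {throughput_per_ns} instructions per ns")  # vs. 1000/800 = 1.25 for the 800 ps single-cycle design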
Pipelined datapath design
[Figure: single-cycle RISC-V datapath split into five pipeline stages, with
 lw t0, 8(t3) in Instruction Fetch, sw t0, 4(t3) in Instruction Decode,
 slt t6, t0, t3 in ALU Execution, or t3, t4, t5 in Memory Access, and
 add t0, t1, t2 in Write Back]
   ➢ The datapath keeps the single-cycle components (PC, +4 adder, IMEM, register
      file with AddrA/AddrB/AddrD and DataA/DataB/DataD, Imm. Gen, branch comparator,
      ALU, DMEM, and the write-back mux), with pipeline registers inserted between
      the stages.
❖ Now, let’s check the flow of instructions through the pipeline cycle-by-cycle!
IF for Load
[Figure: pipelined datapath with the Instruction Fetch stage active for lw t0, 8(t3)]
The instruction word is fetched from memory and stored in the IF/ID buffer because
it will be needed in the next stage.
ID for Load
[Figure: pipelined datapath with the Instruction Decode stage active for lw t0, 8(t3)]
The 12-bit immediate is fetched from the IF/ID buffer, sign-extended, and stored in
the ID/EX buffer for use in a later stage (see the sketch below). The rs1 and rs2
values are fetched from the register file and also stored in the ID/EX buffer.
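A minimal sketch (Python) of the sign extension done in this stage, assuming the
standard RV32I I-type encoding, where the 12-bit immediate occupies bits 31:20 of
the instruction word.

# Sketch: extract and sign-extend the 12-bit I-type immediate (bits 31:20).
def sign_extend_i_imm(inst: int) -> int:
    imm = (inst >> 20) & 0xFFF    # 12-bit immediate field
    if imm & 0x800:               # sign bit set -> negative value
        imm -= 1 << 12
    return imm

print(sign_extend_i_imm(0x008E2283))  # lw t0, 8(t3)  -> 8
print(sign_extend_i_imm(0xFF8E2283))  # lw t0, -8(t3) -> -8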
EX for Load
[Figure: pipelined datapath with the ALU Execution stage active for lw t0, 8(t3)]
The rs1 value is taken from the ID/EX buffer and passed to the ALU. The ALU result
is stored in the EX/MEM buffer for use as the memory address in the next stage.
MEM for Load
[Figure: pipelined datapath with the Memory Access stage active for lw t0, 8(t3);
 the address held in the EX/MEM buffer is used to read DMEM]
WB for Load
[Figure: pipelined datapath with the Write Back stage active for lw t0, 8(t3)]
The value from data memory is selected and passed back to the register file.
Problem: the write register number (AddrD) reaching the register file at this point
comes from a later instruction, i.e., the wrong register number.
Corrected Datapath for Load
[Figure: corrected pipelined datapath, with the write register number for
 lw t0, 8(t3) carried along to the Write Back stage]
The problem is fixed by passing the write register number through the inter-stage
buffers and feeding it back just in time → this adds 5 more bits to each of the
last three buffers (see the sketch below).
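A minimal sketch (Python) of this fix, using simple record types for the buffers
after ID; the field names (rd, alu_result, mem_data, ...) are illustrative, not the
lecture's notation.

# Sketch: the 5-bit destination-register number rides along in every buffer after
# ID, so WB writes the register of the instruction that is actually retiring.
from dataclasses import dataclass

@dataclass
class IDEX:            # ID/EX buffer: rd is captured here during decode
    rs1_val: int
    imm: int
    rd: int

@dataclass
class EXMEM:           # EX/MEM buffer: rd is just passed along
    alu_result: int
    rd: int

@dataclass
class MEMWB:           # MEM/WB buffer: rd is finally consumed in WB
    mem_data: int
    alu_result: int
    rd: int

def write_back(regfile: list, memwb: MEMWB, mem_to_reg: bool) -> None:
    # WB uses the rd that travelled with the instruction, not the rd of the
    # (different) instruction currently sitting in the ID stage.
    regfile[memwb.rd] = memwb.mem_data if mem_to_reg else memwb.alu_result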
Pipelined control signals
❑ Control signals are derived from the instruction and determined during ID, as in
   the single-cycle implementation.
   ➢ As the instruction moves, its control signals must move with it → extend the
      pipeline registers to include the control signals.
   ➢ Each stage uses some of the control signals (see the sketch below).
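A small sketch (Python) of how the ID-generated control signals could be grouped by
the stage that consumes them and carried forward; the signal names (ALUSrc, ALUOp,
MemRead, RegWrite, ...) follow the usual textbook naming and are assumptions rather
than the exact lecture signals.

# Sketch: control signals grouped by consuming stage; each pipeline register keeps
# only the groups still needed downstream.
EX_CTRL  = ("ALUSrc", "ALUOp")                 # consumed in ALU Execution
MEM_CTRL = ("MemRead", "MemWrite", "Branch")   # consumed in Memory Access
WB_CTRL  = ("RegWrite", "MemtoReg")            # consumed in Write Back

# All signals are produced in ID (values below are the usual settings for lw).
ctrl = {"ALUSrc": 1, "ALUOp": "add", "MemRead": 1, "MemWrite": 0,
        "Branch": 0, "RegWrite": 1, "MemtoReg": 1}

id_ex  = {k: ctrl[k] for k in EX_CTRL + MEM_CTRL + WB_CTRL}  # everything moves to EX
ex_mem = {k: id_ex[k] for k in MEM_CTRL + WB_CTRL}           # EX drops its own group
mem_wb = {k: ex_mem[k] for k in WB_CTRL}                     # only WB signals remain
print(id_ex, ex_mem, mem_wb, sep="\n")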
[Figure: pipeline diagram of the sequence add t0, t1, t2; or t3, t4, t5;
 slt t6, t0, t3; sw t0, 4(t3); lw t0, 8(t3), each passing through IM, Reg, ALU,
 DM, and Reg in successive cycles]
Data hazard example
❑ If the same register is written and read in the same cycle
   ➢ WB must write the value before ID reads it, so the reader gets the new value.
   ➢ This is not a structural hazard, since separate read and write ports allow a
      simultaneous read and write.
[Figure: pipeline diagram of the same sequence; add t0, t1, t2 writes t0 in its WB
 stage while the later instructions slt t6, t0, t3 and sw t0, 4(t3) read t0 in
 their ID stages]
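A small sketch (Python) of when t0 is written and read in this sequence, assuming
the usual 5-stage timing (instruction i, issued one per cycle, reads registers in
ID during cycle i + 1 and writes in WB during cycle i + 4) and the write-before-read
register file described above.

# Sketch: does each reader of t0 see the value written by add t0, t1, t2?
ADD_POSITION = 1
add_wb_cycle = ADD_POSITION + 4                # add writes t0 in cycle 5

readers_of_t0 = {"slt t6, t0, t3": 3, "sw t0, 4(t3)": 4}   # instruction positions
for instr, pos in readers_of_t0.items():
    id_cycle = pos + 1
    ok = id_cycle >= add_wb_cycle              # same cycle is fine: write before read
    print(f"{instr}: reads t0 in cycle {id_cycle} -> "
          f"{'new value' if ok else 'stale value (data hazard)'}")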
Control hazard example
❑ If the beq branch is taken, the wrong instructions will already have been
   fetched, because the branch decision is made only in the MEM stage.
[Figure: pipeline diagram of beq followed by Inst 1–5, in instruction order]
   ➢ Inst 1–3 are fetched regardless of the branch outcome, because the outcome is
      ready only after beq’s MEM stage.
   ➢ Only the later fetches use a PC that has been updated to reflect the branch
      outcome (see the sketch below).
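A tiny sketch (Python) of the fetch timing behind this figure, assuming one
instruction fetched per cycle and the branch outcome becoming available at the end
of beq's MEM stage (cycle 4).

# Sketch: how many instructions are fetched before the beq outcome is known.
branch_resolved_cycle = 4                                 # end of beq's MEM stage
fetch_cycle = {f"Inst {i}": 1 + i for i in range(1, 6)}   # Inst i fetched in cycle 1+i

for name, cycle in fetch_cycle.items():
    blind = cycle <= branch_resolved_cycle
    print(f"{name}: fetched in cycle {cycle} -> "
          f"{'regardless of branch outcome' if blind else 'PC reflects branch outcome'}")
# Inst 1-3 are fetched blindly; from Inst 4 on, the PC reflects the branch outcome.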
Summary
❑ Pipelined processor
➢ Speedup comes from increased throughput; the latency of each instruction does
   not decrease.
➢ Implemented by adding state registers to the single-cycle datapath.
➢ The pipeline registers are also extended to include the control signals.
❑ The basic idea of pipelining is easy, but the devil is in the details
➢ Hazard: a situation in which a planned instruction cannot execute in the
“proper” clock cycle
▪ Structural hazard
▪ Data hazard
▪ Control hazard
➢ Pipeline hazards are serious problems that cannot be ignored