
6 MultiCycle

The document discusses multi-cycle implementations of processor datapaths and control. A multi-cycle approach breaks instructions down into smaller steps that can be performed incrementally across multiple clock cycles. This allows for a shorter clock period than in a single-cycle approach. Common steps include instruction fetch, register fetch, execution, memory access, and writing results. The document provides examples of the steps used for different instruction types and presents a general multi-cycle datapath diagram.

Uploaded by

isaacopine

Multi-cycle

Datapath and control

19/10/2021 CI-0114 Fundamentos de Arquitectura 1


Single-cycle implementation

As we’ve seen, a single-cycle implementation, although easy to
implement, can be very inefficient

In single-cycle, we define a clock cycle to be the length of time needed to
execute a single instruction. So, the lower bound on the clock period is
the length of the most time-consuming instruction

In our previous example, our jump instruction needs only 4ns but our
clock period must be 13ns to accommodate the load word instruction!
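
A minimal sketch of why the slowest instruction sets the single-cycle clock period. Only the 4 ns jump and the 13 ns load word come from the example above; the other latencies are hypothetical placeholders added for illustration.

```python
# Latency of each instruction class in nanoseconds.
# Only "jump" and "lw" come from the slides; the rest are hypothetical.
latency_ns = {
    "jump": 4,
    "lw": 13,
    "sw": 11,      # hypothetical
    "r_type": 9,   # hypothetical
    "beq": 8,      # hypothetical
}

def single_cycle_period(latencies):
    """The clock period must be long enough for the slowest instruction."""
    return max(latencies.values())

period = single_cycle_period(latency_ns)
# Time a jump spends idle in every cycle because the clock is sized for lw.
wasted_on_jump = period - latency_ns["jump"]
print(period, wasted_on_jump)
```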



Multi-cycle implementation

We can get around some of the disadvantages by introducing a little
more complexity to our datapath

Instead of viewing the instruction as one big task that needs to be
performed, in multi-cycle the instructions are broken up into smaller
fundamental steps

As a result, we can shorten the clock period and perform the instructions
incrementally across multiple cycles

What are these fundamental steps? Well, let’s take a look at what our
instructions actually need to do…



R format steps

An instruction is fetched from instruction memory and the PC is incremented

Read two source register values from the register file.

Perform the ALU operation on the register data operands

Write the result of the ALU operation to the register file



LOAD steps

An instruction is fetched from instruction memory and the PC is incremented

Read a source register value from the register file and sign-extend the 16 least
significant bits of the instruction

Perform the ALU operation that computes the sum of the value in the register
and the sign-extended immediate value from the instruction

Access data memory at the address given by the result from the ALU

Write the value read from memory to the register file



STORE steps

An instruction is fetched from instruction memory and the PC is incremented

Read two source register values from the register file and sign-extend the 16
least significant bits of the instruction

Perform the ALU operation that computes the sum of the value in the register
and the sign-extended immediate value from the instruction

Update data memory at the address given by the result from the ALU



BRANCH EQUAL (BEQ) steps

An instruction is fetched from instruction memory and the PC is incremented

Read two source register values from the register file, sign-extend the 16
least significant bits of the instruction, and shift the result left by two bits

The ALU performs a subtract on the data values read from the register file. The
value of PC+4 is added to the sign-extended, left-shifted-by-two immediate
value from the instruction, which gives the branch target address

The Zero result from the ALU is used to decide which adder result should be
used to update the PC
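
The BEQ steps above can be sketched in a few lines. This is an illustrative model only, assuming 32-bit addresses; the function names are hypothetical and not part of the slides.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc_plus_4, imm16):
    """PC+4 plus the sign-extended immediate shifted left by two bits."""
    return (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF

def next_pc_beq(pc_plus_4, rs_val, rt_val, imm16):
    """Pick the branch target when the ALU subtract yields Zero, else PC+4."""
    zero = ((rs_val - rt_val) & 0xFFFFFFFF) == 0
    return branch_target(pc_plus_4, imm16) if zero else pc_plus_4
```

A negative immediate such as 0xFFFF (that is, -1) moves the target one word back from PC+4.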



JUMP steps

An instruction is fetched from instruction memory and the PC is incremented

Concatenate the four most significant bits of PC+4, the 26 least significant bits
of the instruction, and two zero bits. Assign the result to the PC
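
As a small sketch of the concatenation described above (the function name is hypothetical), the jump target can be built with masks and a shift:

```python
def jump_target(pc_plus_4, instruction):
    """Four MSBs of PC+4, then the 26 LSBs of the instruction, then two zero bits."""
    upper = pc_plus_4 & 0xF0000000      # bits 31-28 of PC+4
    index = instruction & 0x03FFFFFF    # bits 25-0 of the instruction
    return upper | (index << 2)         # the shift appends the two zero bits
```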



General steps
So, generally, we can say we need to perform the following steps:
1. Instruction fetch
2. Instruction decode and register fetch
3. Execution, memory address computation, branch completion, or jump completion
4. Memory access or R-type instruction completion
5. Memory read completion



Multi-cycle datapath
Here is a general overview of our new multi-cycle datapath

We now have a single memory element that interacts with both instructions and data

Single ALU unit, no dedicated adders

Several temporary registers



Multi-cycle datapath
[Figure: multi-cycle datapath. Elements: PC, Memory, Instruction Register, Memory Data Register, Register File, temporary registers A and B, sign extender, shift left 2, ALU, and ALUOut. Control signals: IorD, MemRead, MemWrite, IRWrite, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp, MemToReg.]


Multi-cycle datapath

These are some old datapath elements that we are already used to. Note, however, that the Memory element is now pulling double duty as both the Instruction Memory and Data Memory element.



Multi-cycle datapath

Newly added datapath elements.



Multi-cycle datapath
New temporary registers:

Instruction register (IR) – holds the instruction after it has been fetched from memory

Memory data register (MDR) – temporarily holds data read from memory until the next cycle

A – temporarily holds the contents of read register 1 until the next cycle

B – temporarily holds the contents of read register 2 until the next cycle

ALUOut – temporarily holds the result of the ALU until the next cycle

Note: every temporary register is written on every cycle except for the instruction register, which is written only when IRWrite is asserted



Multi-cycle control IorD
[Figure: multi-cycle datapath, illustrating the IorD control signal.]


Multi-cycle control
The IorD control signal.

Deasserted (0): the contents of PC is used as the address for the
memory unit

Asserted (1): The contents of ALUOut is used as the address for
the memory unit



Multi-cycle control RegDst
[Figure: multi-cycle datapath, illustrating the RegDst control signal.]


Multi-cycle control
The RegDst control signal:

Deasserted (0): the register file destination number for the Write
register comes from the rt field

Asserted (1): the register file destination number for the Write
register comes from the rd field



Multi-cycle control MemToReg
[Figure: multi-cycle datapath, illustrating the MemToReg control signal.]


Multi-cycle control
The MemToReg control signal:

Deasserted (0): the value fed to the register file input comes from
ALUout

Asserted (1): the value fed to the register file input comes from
MDR



Multi-cycle control ALUSrc
[Figure: multi-cycle datapath, illustrating the ALUSrcA and ALUSrcB control signals.]


Multi-cycle control
One of the changes we’ve made is that we’re using only a single ALU. We have
no dedicated adders on the side. To implement this change, we need to add
some multiplexors.

ALUSrcA multiplexor chooses between the contents of PC or the contents of
temporary register A as the first operand

ALUSrcB multiplexor chooses between the contents of temporary register B,
the constant 4, the immediate field, or the left-shifted immediate field as the
second operand
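
The two multiplexors can be modelled as a simple selection function. This is a sketch under stated assumptions: the function name is hypothetical, and the ALUSrcB encoding (0 to 3 in the order listed above) follows the 2-bit control signal table given later in these slides.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def alu_operands(alu_src_a, alu_src_b, pc, a, b, imm16):
    """Return the (first, second) ALU operands chosen by the two muxes."""
    first = a if alu_src_a == 1 else pc
    second_choices = {
        0b00: b,                          # temporary register B
        0b01: 4,                          # the constant 4
        0b10: sign_extend16(imm16),       # sign-extended immediate
        0b11: sign_extend16(imm16) << 2,  # sign-extended immediate, shifted left 2
    }
    return first, second_choices[alu_src_b]
```

For example, ALUSrcA = 0 with ALUSrcB = 01 selects PC and 4, which is the pair needed to compute PC + 4.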



Multi-cycle datapath and control
[Figure: complete multi-cycle datapath with the control unit. Besides the datapath elements shown before, it includes the control unit, the PCSrc multiplexor, the jump address path (Instruction[25-0] shifted left by 2 and combined with PC[31-28]), and the ALU Zero output.]

Multi-cycle 1-bit control signal

RegDst
Deasserted: the register file destination number for the Write register comes from the rt field
Asserted: the register file destination number for the Write register comes from the rd field

RegWrite
Deasserted: none
Asserted: the Write register is written with the value of the Write data input

ALUSrcA
Deasserted: the first ALU operand is the PC
Asserted: the first ALU operand is the A register

MemRead
Deasserted: none
Asserted: the content of memory at the location specified by the Address input is put on the Memory data output

MemWrite
Deasserted: none
Asserted: the memory content at the location specified by the Address input is replaced by the value on the Write data input



Multi-cycle 1-bit control signal

MemToReg
Deasserted: the value fed to the register file input comes from ALUOut
Asserted: the value fed to the register file input comes from the Memory data register (MDR)

IorD
Deasserted: the PC is used to supply the address to the memory unit
Asserted: ALUOut is used to supply the address to the memory unit

IRWrite
Deasserted: none
Asserted: the output of the memory is written into the Instruction Register (IR)

PCWrite
Deasserted: none
Asserted: the PC is written; the source is controlled by PCSource

PCWriteCond
Deasserted: none
Asserted: the PC is written if the Zero output from the ALU is also active



Multi-cycle 2-bit control signal

ALUSrcB
00: The second input to the ALU comes from the B register
01: The second input to the ALU is the constant 4
10: The second input to the ALU is the sign-extended lower 16 bits of the Instruction Register (IR)
11: The second input to the ALU is the sign-extended lower 16 bits of the IR shifted left by 2 bits

PCSource
00: Output of the ALU (PC + 4) is sent to the PC for writing
01: The contents of ALUOut (the branch target address) are sent to the PC for writing
10: The jump target address (IR[25-0] shifted left 2 bits and concatenated with PC + 4[31-28]) is sent to the PC for writing



Multi-cycle 4-bit control signal ALU

ALUOp value and the operation the ALU performs:
0000 Add operation
0001
0010 Subtract operation
0011 Shift right arithmetic
0100 And operation
0101 Or operation
0110 Xor operation
0111 Nor operation
1000 Multiplication operation
1010 Division operation



Multi-cycle datapath and control
Ok, so we already observed that our instructions can be roughly broken up into the
following steps:
1. Instruction fetch
2. Instruction decode and register fetch
3. Execution, memory address computation, branch completion, or jump
completion
4. Memory access or R-type instruction completion
5. Memory read completion
Instructions take 3-5 of the steps to complete. The first two are performed
identically in all instructions
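
To see what "3-5 steps" means for performance, here is a rough sketch that computes the average cycles per instruction (CPI). The per-class cycle counts follow the steps in these slides (R-type 4, load 5, store 4, branch 3, jump 3); the instruction mix is a hypothetical example, not data from the course.

```python
# Clock cycles (steps) needed by each instruction class in the multi-cycle design.
CYCLES = {"r_type": 4, "lw": 5, "sw": 4, "beq": 3, "jump": 3}

def average_cpi(mix):
    """Weighted average of cycles per instruction for a given instruction mix."""
    assert abs(sum(mix.values()) - 1.0) < 1e-9, "mix fractions must sum to 1"
    return sum(CYCLES[name] * fraction for name, fraction in mix.items())

# Hypothetical instruction mix, for illustration only.
example_mix = {"r_type": 0.50, "lw": 0.25, "sw": 0.10, "beq": 0.10, "jump": 0.05}
cpi = average_cpi(example_mix)
print(cpi)
```

Each of these cycles is much shorter than a single-cycle clock period, which is where the multi-cycle design can win.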



Instruction fetch step (IF)
First step for all instruction types

IR = Memory[PC];
PC = PC + 4;
Operations:

Send contents of PC to the memory element as the address

Read instruction from memory

Write instruction into IR for use in next cycle

Increment PC by 4
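
The fetch step can be sketched as a function over a small machine state. This is only an illustrative model: the `state` dictionary and the word-addressed `memory` dictionary are assumptions made for the example.

```python
def instruction_fetch(state, memory):
    """IR = Memory[PC]; PC = PC + 4."""
    state["IR"] = memory[state["PC"]]             # read the instruction at address PC
    state["PC"] = (state["PC"] + 4) & 0xFFFFFFFF  # the ALU computes PC + 4
    return state

# Usage: fetch the word stored at address 0x00400000.
state = instruction_fetch({"PC": 0x00400000, "IR": 0}, {0x00400000: 0x02328020})
```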



Instruction fetch step (quiz time!!!)
Signal Value
PCWrite
IorD None

MemRead None

MemWrite None

IRWrite
PCSource None

ALUOp
ALUSrcA
ALUSrcB
RegWrite



Instruction fetch step
Signal Value
PCWrite 1
IorD 0

MemRead 1

MemWrite 0

IRWrite 1
PCSource 00

ALUOp 0000
ALUSrcA 0
ALUSrcB 01
RegWrite 0



Instruction decode + register fetch step
Second step for all instruction types

A = Reg[IR[25-21]];
B = Reg[IR[20-16]];
ALUOut = PC + (sign-extend(IR[15-0]) << 2);
Operations:

Decode instruction

Optimistically read registers

Optimistically compute branch target
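
A sketch of this step in the same style as the register transfers above. The `state` dictionary and the `regs` list are assumptions of the example, and PC is taken to already hold PC + 4 from the fetch step.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def decode_and_register_fetch(state, regs):
    """Read rs and rt into A and B, and compute the branch target into ALUOut."""
    ir = state["IR"]
    rs = (ir >> 21) & 0x1F                 # IR[25-21]
    rt = (ir >> 16) & 0x1F                 # IR[20-16]
    state["A"] = regs[rs]
    state["B"] = regs[rt]
    # Optimistic: the result is only used if the instruction turns out to be a branch.
    state["ALUOut"] = (state["PC"] + (sign_extend16(ir) << 2)) & 0xFFFFFFFF
    return state
```

The work is "optimistic" because it is done before the control unit knows the instruction type; for non-branch instructions the ALUOut value is simply overwritten later.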



Instruction decode + register fetch step

Signal Value
ALUOp
ALUSrcA
ALUSrcB



Instruction decode + register fetch step

Signal Value
ALUOp 0000
ALUSrcA 0
ALUSrcB 11



Execution step
Here is where our instructions diverge

Memory reference (load and store):
● ALUOut = A + sign-extend(IR[15-0]);

Arithmetic-logical operation:
● ALUOut = A op B;

Branch:
● if (A == B) PC = ALUOut;

Jump
● PC = PC[31-28] || (IR[25-0] << 2);
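
The four cases above can be sketched as one dispatch on the opcode. This is an illustrative model, not the course's implementation: the `state` dictionary is an assumption, the opcodes are the standard MIPS values (R-type 000000, lw 100011, sw 101011, beq 000100, j 000010), and only four R-type funct codes are modelled.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

# Subset of R-type funct codes: add, sub, and, or.
R_TYPE_OPS = {
    0x20: lambda a, b: a + b,
    0x22: lambda a, b: a - b,
    0x24: lambda a, b: a & b,
    0x25: lambda a, b: a | b,
}

def execute(state):
    """Third step: what happens depends on the instruction class."""
    ir = state["IR"]
    opcode = (ir >> 26) & 0x3F
    if opcode in (0x23, 0x2B):            # lw / sw: memory address computation
        state["ALUOut"] = (state["A"] + sign_extend16(ir)) & 0xFFFFFFFF
    elif opcode == 0x00:                  # R-type: ALUOut = A op B
        state["ALUOut"] = R_TYPE_OPS[ir & 0x3F](state["A"], state["B"]) & 0xFFFFFFFF
    elif opcode == 0x04:                  # beq: branch completion
        if state["A"] == state["B"]:
            state["PC"] = state["ALUOut"]
    elif opcode == 0x02:                  # j: jump completion
        state["PC"] = (state["PC"] & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
    else:
        raise ValueError(f"unsupported opcode {opcode:06b}")
    return state
```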

Execution: memory reference

Signal Value
ALUOp 0000
ALUSrcA 1
ALUSrcB 10



Execution: arithmetic/logic operation

Signal Value
ALUOp xxxx
ALUSrcA 1
ALUSrcB 00



Execution: branch

Signal Value
ALUOp 0010
ALUSrcA 1
ALUSrcB 00
PCSource 01
PCWriteCond 1



Execution: jump

Signal Value
PCSource 10
PCWrite 1



Memory access / R-type completion step

Memory reference:
● Load: MDR = Memory[ALUOut];
● Store: Memory[ALUOut] = B;

R-type instruction:
● Reg[IR[15-11]] = ALUOut;
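
A sketch of this fourth step, in the same illustrative style. The `state` dictionary, the `memory` dictionary, and the `regs` list are assumptions of the example; the opcodes are the ones used in these slides.

```python
def memory_access_or_rtype_completion(state, memory, regs):
    """Fourth step: load reads memory, store writes memory, R-type writes rd."""
    ir = state["IR"]
    opcode = (ir >> 26) & 0x3F
    if opcode == 0x23:                       # lw: MDR = Memory[ALUOut]
        state["MDR"] = memory[state["ALUOut"]]
    elif opcode == 0x2B:                     # sw: Memory[ALUOut] = B
        memory[state["ALUOut"]] = state["B"]
    elif opcode == 0x00:                     # R-type: Reg[IR[15-11]] = ALUOut
        rd = (ir >> 11) & 0x1F
        regs[rd] = state["ALUOut"]
    else:
        raise ValueError("this instruction class has no fourth step")
    return state
```

Stores and R-type instructions finish here; only loads continue to a fifth step.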



Memory access: LOAD

Signal Value
MemRead 1
IorD 1
IRWrite 0



Memory access: STORE

Signal Value
MemWrite 1
IorD 1



R-type completion

Signal Value
MemToReg 0
RegWrite 1
RegDst 1



Read completion step

Load operation:
● Reg[IR[20-16]] = MDR;
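
The final step of a load is a single register write. A minimal sketch, assuming the same hypothetical `state` dictionary and `regs` list as a software model of the datapath:

```python
def read_completion(state, regs):
    """Fifth step (loads only): Reg[IR[20-16]] = MDR."""
    rt = (state["IR"] >> 16) & 0x1F   # the destination is the rt field, not rd
    regs[rt] = state["MDR"]
    return regs
```

The destination comes from rt because the I-format used by lw has no rd field, which is why RegDst is 0 in this step.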



Read completion

Signal Value
MemToReg 1
RegWrite 1
RegDst 0



Multi-cycle datapath and control
So, now we know what the steps are and what happens in each
step for each kind of instruction in our mini-MIPS instruction set.
To make things clearer, let’s investigate how multi-cycle works for
one particular instruction at a time



Execution of R-format add instruction
R-format instructions require 4 cycles to complete. Let’s imagine that
we’re executing an add instruction

add $s0, $s1, $s2

which has the following example fields:

Op code rs rt rd shamt funct


000000 10001 10010 10000 00000 100000
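
As a quick check of the fields above, this sketch packs them into the 32-bit instruction word. The function name is hypothetical; the register numbers ($s0 = 16, $s1 = 17, $s2 = 18) are the standard MIPS assignments.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-format fields into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $s0, $s1, $s2  ->  rs = $s1 (17), rt = $s2 (18), rd = $s0 (16), funct = 100000
word = encode_r_type(rs=17, rt=18, rd=16, shamt=0, funct=0b100000)
print(f"{word:032b}")
print(f"0x{word:08X}")
```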



Instruction add: step 1
Signal Value
PCWrite 1
IorD 0

MemRead 1

MemWrite 0

IRWrite 1
PCSource 00

ALUOp 0000
ALUSrcA 0
ALUSrcB 01
RegWrite 0



Instruction add: step 2

Signal Value
ALUOp 0000
ALUSrcA 0
ALUSrcB 11



Instruction add: step 3

Signal Value
ALUOp 0000
ALUSrcA 1
ALUSrcB 00



Instruction add: step 4

Signal Value
MemToReg 0
RegWrite 1
RegDst 1



Branch
Branch instructions require 3 cycles to complete. Let’s imagine
that we’re executing a beq instruction

beq $s0, $s1, L1


which has the following fields:

Op code rs rt Immediate
000100 10000 10001 XXXXXXXXXXXXXXXX



Instruction branch: step 1
Signal Value
PCWrite 1
IorD 0

MemRead 1

MemWrite 0

IRWrite 1
PCSource 00

ALUOp 0000
ALUSrcA 0
ALUSrcB 01
RegWrite 0



Instruction branch: step 2

Signal Value
ALUOp 0000
ALUSrcA 0
ALUSrcB 11



Instruction branch: step 3

Signal Value
ALUOp 0010
ALUSrcA 1
ALUSrcB 00
PCSource 01
PCWriteCond 1



Store
Store instructions require 4 cycles to complete. Let’s imagine that
we’re executing a sw instruction

sw $rt, immed($rs)


which has the following fields:

Op code rs rt Immediate
101011 10111 10010 XXXXXXXXXXXXXXXX



Instruction store: step 1
Signal Value
PCWrite 1
IorD 0

MemRead 1

MemWrite 0

IRWrite 1
PCSource 00

ALUOp 0000
ALUSrcA 0
ALUSrcB 01
RegWrite 0



Instruction store: step 2

Signal Value
ALUOp 0000
ALUSrcA 0
ALUSrcB 11



Instruction store: step 3

Signal Value
ALUOp 0000
ALUSrcA 1
ALUSrcB 10



Instruction store: step 4

Signal Value
MemWrite 1
IorD 1



Load
Load instructions require 5 cycles to complete. Let’s imagine that
we’re executing a lw instruction

lw $rt, immed($rs)


which has the following fields:

Op code rs rt Immediate
100011 10111 10010 XXXXXXXXXXXXXXXX



Instruction load: step 1
Signal Value
PCWrite 1
IorD 0

MemRead 1

MemWrite 0

IRWrite 1
PCSource 00

ALUOp 0000
ALUSrcA 0
ALUSrcB 01
RegWrite 0



Instruction load: step 2

Signal Value
ALUOp 0000
ALUSrcA 0
ALUSrcB 11



Instruction load: step 3

Signal Value
ALUOp 0000
ALUSrcA 1
ALUSrcB 10



Instruction load: step 4

Signal Value
MemRead 1
IorD 1
IRWrite 0



Instruction load: step 5

Signal Value
MemToReg 1
RegWrite 1
RegDst 0
