
Mips PDF

The document discusses the design of a single-cycle MIPS processor. It describes the basic components needed for the design including program counter, registers, ALU, and data memory. It provides an abstract view of the design and discusses incrementing the program counter and fetching instructions.

Uploaded by

Pawan Goswami

CSE 675.02: Introduction to Computer Architecture

Designing MIPS Processor (Single-Cycle)
Presentation G

Reading Assignment: 5.1-5.4
Slides by Gojko Babić

Introduction
• We're now ready to look at an implementation of the system that includes a MIPS processor and memory.
• The design will include support for execution of only:
– memory-reference instructions: lw & sw,
– arithmetic-logical instructions: add, sub, and, or, slt & nor,
– control flow instructions: beq & j,
– exception handling: illegal instruction & overflow.
• But that design will provide us with the principles, so many more instructions could easily be added, such as: addu, lb, lbu, lui, addi, addiu, sltu, slti, andi, ori, xor, xori, jal, jr, jalr, bne, beqz, bgtz, bltz, nop, mfhi, mflo, mfepc, mfc0, lwc1, swc1, etc.
g. babic Presentation G 2

Single Cycle Design
• We will first design a simpler processor that executes each instruction in only one clock cycle.
• This is not efficient from a performance point of view, since:
– the clock cycle time (and hence the clock rate) must be chosen such that the longest instruction can be executed in one clock cycle, and
– that makes shorter instructions execute in one unnecessarily long cycle.
• Additionally, no resource in the design may be used more than once per instruction, thus some resources will be duplicated.
• The single-cycle design will require:
– two memories (instruction and data),
– two additional adders.

Elements for Datapath Design

[Figure: the datapath building blocks, with signal widths as labeled]
a. Program counter: 32-bit PC.
b. Register file: three 5-bit register numbers (Read register 1, Read register 2, Write register), 32-bit Write data input, 32-bit Read data 1 and Read data 2 outputs, RegWrite control.
c. ALU: two 32-bit inputs, 4-bit ALU control, 32-bit ALU result, Zero and overflow outputs.
d. Adder: two 32-bit inputs (Add), 32-bit Sum output.
e. Data memory unit: 32-bit Address and Write data inputs, 32-bit Read data output, MemRead and MemWrite controls.
f. Instruction memory: 32-bit Instruction address input, 32-bit Instruction output (labeled MemRead=1, MemWrite=0).
g. Sign-extension unit: 16-bit input, 32-bit output.
h. Shift left 2: 32-bit input, 32-bit output.
Abstract /Simplified View (1st look)

[Figure: abstract view. The PC supplies an Address to the Instruction memory; the Instruction supplies Register # inputs to the Registers; register Data feeds the ALU; the ALU supplies an Address to the Data memory, which exchanges Data with the Registers.]

• This generic implementation:
– uses the program counter (PC) to supply the instruction address,
– gets the instruction from memory,
– reads registers,
– uses the instruction opcode to decide exactly what to do.

Abstract /Simplified View (2nd look)

Figure 5.1

• PC is incremented by 4 by most instructions, and by 4 + 4×offset by taken branch instructions.
• Jump instructions change PC differently (not shown).
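The PC update rule above can be sketched in a few lines of Python. This is only an illustration, not part of the slides: the names `sign_extend16` and `next_pc` are hypothetical, and the sketch assumes the 16-bit offset is sign-extended before it is multiplied by 4 and that the PC wraps at 32 bits.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc: int, offset16: int = 0, branch_taken: bool = False) -> int:
    """Return PC + 4, or PC + 4 + 4*offset for a taken branch (32-bit wrap)."""
    target = pc + 4
    if branch_taken:
        # Assumption: the offset counts words, so it is scaled by 4.
        target += 4 * sign_extend16(offset16)
    return target & 0xFFFFFFFF
```

For instance, a taken branch at address 0x00400000 with offset 3 would continue at 0x00400010, while an offset of 0xFFFF (that is, -1) branches back to the branch instruction itself.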

Our Implementation
• An edge-triggered methodology
• Typical execution:
– read contents of some state elements at the beginning of the clock cycle,
– send values through some combinational logic,
– write results to one or more state elements at the end of the clock cycle.

[Figure 5.5: state element 1 → combinational logic → state element 2, all within one clock cycle]

• An edge-triggered methodology allows a state element to be read and written in the same clock cycle.
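A small software model may help show why reading and writing the same state element in one cycle is safe. The sketch below is an assumption-laden illustration (the class `StateElement` and its methods are hypothetical names): a write only captures the new value, and the value becomes visible at the clock edge.

```python
class StateElement:
    """A clocked register: reads see the old value until the clock edge."""

    def __init__(self, value: int = 0) -> None:
        self.value = value   # value visible during the current cycle
        self._next = value   # captured input, committed at the clock edge

    def read(self) -> int:
        return self.value

    def write(self, value: int) -> None:
        self._next = value   # does not disturb read() during this cycle

    def clock_edge(self) -> None:
        self.value = self._next


pc = StateElement(0x00400000)
pc.write(pc.read() + 4)      # combinational logic: PC + 4
during_cycle = pc.read()     # still the old value, 0x00400000
pc.clock_edge()
after_edge = pc.read()       # now 0x00400004
```

Here the PC is both the source and the destination of the PC + 4 computation within one cycle, which is exactly the case the last bullet describes.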

Incrementing PC & Fetching Instruction

[Figure 5.6, with the Clock input added in red: the PC drives the Read address of the Instruction memory, which outputs the Instruction, and also feeds an Add unit that increments the PC.]
Datapath for R-type Instructions
[Figure: datapath for R-type instructions. Instruction bits I25-21 and I20-16 select Read register 1 and Read register 2; I15-11 selects the Write register. Read data 1 and Read data 2 feed the ALU (4-bit ALU control, Zero output); the ALU result returns to the register file's Write data. RegWrite and Clock control the register write.]

R-type instruction format:
Bits:   31-26    25-21   20-16   15-11   10-6    5-0
Field:  000000   rs      rt      rd      00000   funct

funct values (decimal): add = 32, sub = 34, and = 36, or = 37, nor = 39, slt = 42
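As an illustration of the R-type format, the following sketch extracts the fields with shifts and masks. The function name `decode_r_type` and the returned dictionary layout are hypothetical; the bit positions and funct values are the ones listed on this slide.

```python
# funct values from the slide (decimal)
FUNCT_NAMES = {32: "add", 34: "sub", 36: "and", 37: "or", 39: "nor", 42: "slt"}


def decode_r_type(instruction: int) -> dict:
    """Split a 32-bit R-type instruction word into its fields."""
    opcode = (instruction >> 26) & 0x3F          # bits 31-26
    if opcode != 0:
        raise ValueError("not an R-type instruction")
    funct = instruction & 0x3F                   # bits 5-0
    return {
        "rs": (instruction >> 21) & 0x1F,        # bits 25-21
        "rt": (instruction >> 16) & 0x1F,        # bits 20-16
        "rd": (instruction >> 11) & 0x1F,        # bits 15-11
        "shamt": (instruction >> 6) & 0x1F,      # bits 10-6
        "funct": funct,
        "name": FUNCT_NAMES.get(funct, "unsupported"),
    }
```

For example, the word 0x012A4020 decodes to rs = 9, rt = 10, rd = 8, funct = 32, i.e. an add.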

Complete Datapath for R-type Instructions


Based on the contents of the op-code and funct fields, the Control Unit sets ALU control appropriately and asserts RegWrite, i.e. RegWrite = 1.

[Figure: complete datapath for R-type instructions. The PC, the Add unit and the Instruction memory (instruction fetch) are combined with the register file and ALU: I25-21 and I20-16 select the read registers, I15-11 selects the Write register, and the ALU result is written back under RegWrite and Clock.]
Datapath for LW and SW Instructions
lw and sw instruction format:
Bits:   31-26             25-21   20-16   15-0
Field:  sw or lw opcode   rs      rt      offset

[Figure: datapath for lw and sw. I25-21 (rs) selects Read register 1; I20-16 (rt) selects both Read register 2 and the Write register; I15-0 is sign-extended from 16 to 32 bits and added to Read data 1 by the ALU (4-bit ALU control, Zero output). The ALU result is the data memory Address; Read data 2 is the memory's Write data; the memory's Read data is the register file's Write data. Control lines: RegWrite, MemRead, MemWrite, Clock.]

Control Unit sets:
• ALU control = 0010 (add) for address calculation for both lw and sw
• MemRead=0, MemWrite=1 and RegWrite=0 for sw
• MemRead=1, MemWrite=0 and RegWrite=1 for lw
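The control settings listed above can be exercised with a small behavioral sketch. Everything beyond the signal names is an assumption made for illustration: the function `execute_mem` is hypothetical, the register file is a Python list, and the data memory is modeled as a dictionary keyed by byte address.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


# Control settings from the slide; ALU control 0010 means "add".
LW_SW_CONTROL = {
    "lw": {"ALUControl": 0b0010, "MemRead": 1, "MemWrite": 0, "RegWrite": 1},
    "sw": {"ALUControl": 0b0010, "MemRead": 0, "MemWrite": 1, "RegWrite": 0},
}


def execute_mem(op, regs, memory, rs, rt, offset16):
    """Perform one lw or sw and return the effective address."""
    ctl = LW_SW_CONTROL[op]
    # Address calculation: base register + sign-extended offset.
    address = (regs[rs] + sign_extend16(offset16)) & 0xFFFFFFFF
    if ctl["MemRead"] and ctl["RegWrite"]:
        regs[rt] = memory.get(address, 0)    # lw: memory -> register rt
    if ctl["MemWrite"]:
        memory[address] = regs[rt]           # sw: register rt -> memory
    return address
```

With regs[9] = 0x1000, a sw of register 10 at offset 8 stores to address 0x1008, and a later lw from the same address into register 11 reads the value back.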

Datapath for R-type, LW & SW Instructions

[Figure: combined datapath for R-type, lw and sw. Three multiplexers are added: RegDst selects rt (0) or rd (1) as the Write register; ALUSrc selects Read data 2 (0) or the sign-extended offset (1) as the second ALU input; MemtoReg selects the ALU result (0) or the data memory Read data (1) as the register Write data. The instruction memory is labeled MemRead=1, MemWrite=0.]

Let us determine the setting of control lines for R-type, lw & sw instructions.
Datapath for BEQ Instruction
beq instruction format:
Bits:   31-26   25-21   20-16   15-0
Field:  beq     rs      rt      offset

Branch target = [PC] + 4 + 4×offset

[Figure 5.9, with additions in red: datapath for beq. The 16-bit offset is sign-extended to 32 bits, shifted left 2, and added to PC + 4 (from the instruction datapath) to produce the Branch target. Registers rs and rt are read and compared by the ALU (4-bit ALU control); the Zero output goes to the branch control logic.]

Datapath for R-type, LW, SW & BEQ

[Figure 5.15, with additions in red: combined datapath for R-type, lw, sw and beq. A multiplexer controlled by PCSrc selects PC + 4 (0) or the branch target (1) as the next PC; the branch target comes from a second adder that adds PC + 4 to the sign-extended offset shifted left 2. Instruction fields: [25–21] rs, [20–16] rt, [15–11] rd, [15–0] offset. Other control lines: RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite and the 4-bit ALU control. The instruction memory is labeled MemRead=1, MemWrite=0.]
Control Unit and Datapath

[Figure 5.17, with additions in red: datapath with the control unit. The Control unit takes the opcode, Instruction [31–26], and generates RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc and RegWrite. The ALU control unit takes ALUOp and the funct field, Instruction [5–0], and drives the ALU. PCSrc is derived from Branch and the ALU's Zero output. The register and memory write enables are annotated "Clock anded". The instruction memory is labeled MemRead=1, MemWrite=0.]

Truth Table for (Main) Control Unit


         Input     Output
         Op-code   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-type   000000    1       0       0         1         d        0         0       1       0
lw       100011    0       1       1         1         1        0         0       0       0
sw       101011    d       1       d         0         0        1         0       0       0
beq      000100    d       0       d         0         d        0         1       0       1

(d = don't care)

• ALUOp[1-0] = 00 → signal to the ALU Control unit for the ALU to perform the add function, i.e. set Ainvert=0, Binvert=0 and Operation=10
• ALUOp[1-0] = 01 → signal to the ALU Control unit for the ALU to perform the subtract function, i.e. set Ainvert=0, Binvert=1 and Operation=10
• ALUOp[1-0] = 10 → signal to the ALU Control unit to look at bits I[5-0] and, based on their pattern, to set Ainvert, Binvert and Operation so that the ALU performs the appropriate function, i.e. add, sub, slt, and, or & nor
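One way to read the truth table is as a lookup keyed by op-code. The sketch below is a hypothetical illustration, not the slides' hardware design: `MAIN_CONTROL` and `main_control` are made-up names, "d" entries are represented as `None`, ALUOp1 and ALUOp0 are packed into one 2-bit `ALUOp` value, and an unknown op-code raises an error to stand in for the illegal-instruction exception.

```python
D = None  # "d" (don't care) in the truth table

MAIN_CONTROL = {
    0b000000: dict(name="R-type", RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=D, MemWrite=0, Branch=0, ALUOp=0b10),
    0b100011: dict(name="lw", RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    0b101011: dict(name="sw", RegDst=D, ALUSrc=1, MemtoReg=D, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    0b000100: dict(name="beq", RegDst=D, ALUSrc=0, MemtoReg=D, RegWrite=0,
                   MemRead=D, MemWrite=0, Branch=1, ALUOp=0b01),
}


def main_control(opcode: int) -> dict:
    """Return the control settings for a 6-bit op-code."""
    try:
        return MAIN_CONTROL[opcode]
    except KeyError:
        # Assumption: unsupported op-codes are treated as illegal instructions.
        raise ValueError(f"illegal instruction: op-code {opcode:06b}") from None
```

Note that every row with RegWrite = 0 leaves RegDst and MemtoReg as don't cares, since nothing is written to the register file.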

Truth Table of ALU Control Unit

Input: ALUOp and funct field. Output: ALU control (Ainvert, Binvert, Operation).

ALUOp1  ALUOp0   F5  F4  F3  F2  F1  F0   Ainvert  Binvert  Operation   Function
0       0        d   d   d   d   d   d    0        0        10          add
0       1        d   d   d   d   d   d    0        1        10          sub
1       0        1   0   0   0   0   0    0        0        10          add
1       0        1   0   0   0   1   0    0        1        10          sub
1       0        1   0   0   1   0   0    0        0        00          and
1       0        1   0   0   1   0   1    0        0        01          or
1       0        1   0   1   0   1   0    0        1        11          slt
1       0        1   0   0   1   1   1    1        1        00          nor
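The ALU control truth table translates directly into a small function. This is a sketch under stated assumptions: the name `alu_control` is hypothetical, and the 4-bit result is packed as Ainvert, Binvert, Operation (two bits), from most to least significant bit.

```python
# funct field (F5..F0) -> ALU control (Ainvert, Binvert, Operation)
FUNCT_TO_CONTROL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # sub
    0b100100: 0b0000,  # and
    0b100101: 0b0001,  # or
    0b101010: 0b0111,  # slt
    0b100111: 0b1100,  # nor
}


def alu_control(alu_op: int, funct: int = 0) -> int:
    """Return the 4-bit ALU control for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:
        return 0b0010              # lw/sw: add, funct is a don't care
    if alu_op == 0b01:
        return 0b0110              # beq: subtract, funct is a don't care
    if alu_op == 0b10:
        if funct not in FUNCT_TO_CONTROL:
            raise ValueError(f"unsupported funct {funct:06b}")
        return FUNCT_TO_CONTROL[funct]
    raise ValueError(f"unsupported ALUOp {alu_op:02b}")
```

For example, `alu_control(0b10, 0b101010)` yields 0111 (slt), while `alu_control(0b00)` yields 0010 (add) regardless of the funct field.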

Design of (Main) Control Unit


Op-code bits
5 4 3 2 1 0   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
000000        1       0       0         1         d        0         0       1       0
100011        0       1       1         1         1        0         0       0       0
101011        d       1       d         0         0        1         0       0       0
000100        d       0       d         0         d        0         1       0       1
…

[Figure C.2.5: the inputs Op5 to Op0 feed four AND terms, one per instruction class (R-format, lw, sw, beq); these are ORed to form the outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1 and ALUOp0.]

$\text{RegDst} = \overline{Op5}\,\overline{Op4}\,\overline{Op3}\,\overline{Op2}\,\overline{Op1}\,\overline{Op0}$
$\text{ALUSrc} = Op5\,\overline{Op4}\,\overline{Op3}\,\overline{Op2}\,Op1\,Op0 + Op5\,\overline{Op4}\,Op3\,\overline{Op2}\,Op1\,Op0$
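The AND/OR structure of the control unit can be mimicked with bitwise operators. The sketch below is illustrative only: the slide gives equations for RegDst and ALUSrc, and the remaining outputs here are derived from the truth table under the assumption that every don't care is resolved to 0. The name `main_control_gates` is hypothetical.

```python
def main_control_gates(opcode: int) -> dict:
    """Compute the control outputs from op-code bits with AND/OR terms."""
    op = [(opcode >> i) & 1 for i in range(6)]   # op[0] = Op0 ... op[5] = Op5
    n = [1 - bit for bit in op]                  # complemented inputs

    # One AND term per instruction class.
    r_format = n[5] & n[4] & n[3] & n[2] & n[1] & n[0]     # 000000
    lw = op[5] & n[4] & n[3] & n[2] & op[1] & op[0]        # 100011
    sw = op[5] & n[4] & op[3] & n[2] & op[1] & op[0]       # 101011
    beq = n[5] & n[4] & n[3] & op[2] & n[1] & n[0]         # 000100

    # OR plane; don't cares are taken as 0 (an assumption of this sketch).
    return {
        "RegDst": r_format,
        "ALUSrc": lw | sw,
        "MemtoReg": lw,
        "RegWrite": r_format | lw,
        "MemRead": lw,
        "MemWrite": sw,
        "Branch": beq,
        "ALUOp1": r_format,
        "ALUOp0": beq,
    }
```

Choosing 0 for the don't cares is only one valid option; a hardware design would pick whichever values minimize the logic.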

Datapath for R-type, LW, SW, BEQ & J
j instruction format:
Bits:   31-26   25-0
Field:  j       jump_target

PC ← PC[31-28] || jump_target || 00

[Figure 5.24, with correction in red: datapath for R-type, lw, sw, beq and j. Instruction [25–0] (26 bits) is shifted left 2 to 28 bits (two zeros appended) and concatenated with the upper four PC bits (labeled both PC[31-28] and PC+4 [31–28] in the figure) to form Jump address [31–0]. A second multiplexer, controlled by the new Jump control line, selects between the output of the PCSrc multiplexer (0) and the jump address (1). The Control unit now generates RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc and RegWrite.]
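The concatenation in the jump formula can be written with masks and shifts. The sketch follows the slide's formula, which takes the upper four bits from PC; the function name `jump_address` is hypothetical. Using PC + 4 instead, as one of the figure labels suggests, gives a different result only when PC + 4 crosses a 256 MB boundary.

```python
def jump_address(pc: int, jump_target: int) -> int:
    """Form PC[31-28] || jump_target || 00 as a 32-bit address."""
    upper = pc & 0xF0000000                    # PC[31-28]
    lower = (jump_target & 0x03FFFFFF) << 2    # 26-bit target, two zeros appended
    return upper | lower
```

For example, with PC = 0x00400018 and jump_target = 0x0100000, the new PC is 0x00400000.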

Design of Control Unit (J included)


     Op-code bits
     5 4 3 2 1 0   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
     000000        1       0       0         1         d        0         0       1       0       0
     100011        0       1       1         1         1        0         0       0       0       0
     101011        d       1       d         0         0        1         0       0       0       0
     000100        d       0       d         0         d        0         1       0       1       0
J    000010        d       d       d         0         d        0         d       d       d       1
…

[Figure: the control unit logic extended with a fifth AND term, Jump, next to R-format, lw, sw and beq; inputs Op5 to Op0, outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0 and Jump.]

$\text{Jump} = \overline{Op5}\,\overline{Op4}\,\overline{Op3}\,\overline{Op2}\,Op1\,\overline{Op0}$

No changes in ALU Control unit.
Cycle Time Calculation
• Let us assume that the only delays introduced are by the following tasks:
– Memory access (read and write time = 3 nsec)
– Register file access (read and write time = 1 nsec)
– ALU to perform function (= 2 nsec)
• Under those assumptions, here are the instruction execution times:

          Instr    Reg     ALU     Data     Reg
          fetch    read    oper    memory   write    Total
R-type    3     +  1    +  2             +  1     =  7 nsec
lw        3     +  1    +  2    +  3     +  1     =  10 nsec
sw        3     +  1    +  2    +  3              =  9 nsec
branch    3     +  1    +  2                      =  6 nsec
jump      3                                       =  3 nsec

• Thus the clock cycle time has to be 10 nsec, and clock rate = 1/(10 nsec) = 100 MHz
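The cycle time calculation can be reproduced in a few lines. The delays and the per-instruction steps are the ones assumed on this slide; the variable names (`DELAYS_NS`, `STEPS`, and so on) are hypothetical.

```python
# Assumed delays from the slide, in nanoseconds.
DELAYS_NS = {"memory": 3, "register": 1, "alu": 2}

# Resources used by each instruction class, in order.
STEPS = {
    "R-type": ["memory", "register", "alu", "register"],
    "lw": ["memory", "register", "alu", "memory", "register"],
    "sw": ["memory", "register", "alu", "memory"],
    "branch": ["memory", "register", "alu"],
    "jump": ["memory"],
}

# Execution time of each instruction class.
times_ns = {name: sum(DELAYS_NS[s] for s in steps) for name, steps in STEPS.items()}

# The single-cycle clock must fit the slowest instruction.
cycle_time_ns = max(times_ns.values())
clock_rate_mhz = 1000 / cycle_time_ns    # 1 / (cycle time in ns) = rate in GHz

print(times_ns, cycle_time_ns, clock_rate_mhz)
```

The slowest instruction, lw, fixes the cycle time, so every other instruction also takes 10 nsec even though, for example, a jump needs only 3 nsec.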

Single Cycle Processor: Conclusion


• Single Cycle Problems:
– what if we had a more complicated instruction, like floating point?
– the clock cycle would be much longer,
– which wastes time for shorter and more frequently used instructions, such as add & lw.
• One Solution:
– use a “smaller” cycle time, and
– have different instructions take different numbers of cycles.
• And that is a “multi-cycle” processor.
