
Lecture 9. MIPS Processor Design - Instruction Fetch: Prof. Taeweon Suh Computer Science Education Korea University

This document provides an overview of instruction fetch in a MIPS processor design. It discusses that a processor fetches instructions from memory using the program counter (PC), which is incremented by 4 each time to get the next instruction. It then presents a simple Verilog model for instruction fetch that includes a PC register and adder module to increment the PC. A testbench is also provided to simulate the instruction fetch logic.


2010 R&E Computer System Education & Research

Lecture 9. MIPS Processor Design


Instruction Fetch
Prof. Taeweon Suh
Computer Science Education
Korea University

Introduction
Microarchitecture: how to implement an architecture in hardware
Multiple implementations for a single architecture
  Single-cycle
    Each instruction executes in a single cycle
  Multicycle
    Each instruction is broken up into a series of shorter steps
    We don't cover this in this class
  Pipeline
    Each instruction is broken up into a series of steps
    Multiple instructions execute simultaneously

Figure: levels of abstraction, from highest to lowest, with typical examples
  Application Software: programs
  Operating Systems: device drivers
  Architecture: instructions, registers
  Microarchitecture: datapaths, controllers
  Logic: adders, memories
  Digital Circuits: AND gates, NOT gates
  Analog Circuits: amplifiers, filters
  Devices: transistors, diodes
  Physics: electrons

Korea Univ

Processor Performance
Program execution time
  Execution Time = (#instructions) × (cycles/instruction) × (seconds/cycle)
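As a quick worked example of the execution-time formula, here is a minimal Python sketch. The function name and the workload numbers (instruction count, CPI, clock period) are made up for illustration and do not come from the lecture.

```python
def execution_time(instructions: int, cpi: float, clock_period_s: float) -> float:
    """Execution time = #instructions x cycles/instruction x seconds/cycle."""
    return instructions * cpi * clock_period_s


# Hypothetical workload: 100 billion instructions, CPI of 1 (as in a
# single-cycle design), and a 1 ns clock period (1 GHz).
t = execution_time(100_000_000_000, 1.0, 1e-9)
print(f"{t:.1f} s")  # prints "100.0 s"
```

Halving any one of the three factors halves the execution time, which is why a microarchitecture can trade CPI against clock period.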

Challenge in designing microarchitecture is to satisfy constraints of:
  Cost
  Power
  Performance

Overview
In chapter 4, we are going to implement (design) a MIPS CPU
  The implemented CPU should be able to execute the machine code we have discussed so far
For the sake of your understanding, we simplify the processor system structure

Figure: real PC system vs. simplified system
  Real PC system: the CPU connects over the FSB (Front-Side Bus) to the North Bridge, which connects to main memory (DDR) and, over DMI (Direct Media I/F), to the South Bridge
  Simplified: the MIPS CPU connects to a single memory (instruction, data) through an address bus and a data bus

Our MIPS Model


Our MIPS CPU model has separate connections to instruction memory and data memory
  Actually, this structure is more realistic, as we will see in chapter 5

Figure: the MIPS CPU connects to the instruction memory and to the data memory, each through its own address bus and data bus

Processor
Our MIPS implementation is simplified by implementing only
  Memory-reference instructions: lw, sw
  Arithmetic-logical instructions: add, sub, and, or, slt
  Control-flow instructions: beq, j
Generic implementation steps
  Fetch: use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
  Decoding: decode the instruction (and read registers)
  Execution: execute the instruction

Figure: MIPS CPU with Fetch (PC = PC + 4), Decode, and Execute blocks, connected to the instruction memory and the data memory, each over an address bus and a data bus

Instruction Execution in CPU


Fetch
  Fetch instruction by accessing memory with PC
Decoding
  Extract opcode: determine what operation should be done
  Extract operands: register numbers or immediate from the fetched instruction
  Read registers from register file
Execution
  Use ALU to calculate (depending on instruction class)
    Arithmetic result
    Memory address for load/store
    Branch target address
  Access data memory for load/store
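The decoding step above can be sketched in software. This is a Python model, not the lecture's Verilog, and the `decode` function and its dictionary keys are illustrative names. The bit positions follow the standard MIPS R-, I-, and J-type field layout.

```python
def decode(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into its standard fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,   # bits 31..26: operation
        "rs":     (instr >> 21) & 0x1F,   # bits 25..21: first source register
        "rt":     (instr >> 16) & 0x1F,   # bits 20..16: second source / I-type destination
        "rd":     (instr >> 11) & 0x1F,   # bits 15..11: R-type destination
        "shamt":  (instr >> 6) & 0x1F,    # bits 10..6:  shift amount
        "funct":  instr & 0x3F,           # bits 5..0:   R-type function code
        "imm":    instr & 0xFFFF,         # bits 15..0:  I-type immediate
        "target": instr & 0x3FFFFFF,      # bits 25..0:  J-type jump target
    }


addi = decode(0x20020005)  # addi $2, $0, 5
print(addi["opcode"], addi["rs"], addi["rt"], addi["imm"])  # prints "8 0 2 5"

or_ = decode(0x00E22025)   # or $4, $7, $2
print(or_["opcode"], or_["rs"], or_["rt"], or_["rd"], or_["funct"])  # prints "0 7 2 4 37"
```

Every field is extracted unconditionally. In hardware this is just wiring, and the opcode then decides which fields are meaningful.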

Next Fetch
  PC <- target address or PC + 4

Figure: MIPS CPU with Fetch (PC = PC + 4), Decode, and Execute blocks, connected to the instruction memory and the data memory, each over an address bus and a data bus

Revisiting Logic Design Basics


Combinational logic
  Output is directly determined by input
Sequential logic
  Output is determined not only by input, but also by internal state
  Sequential logic needs state elements to store information
  Flip-flops and latches are used to store the state information
    But avoid using latches in digital design

Combinational Logic Examples


Adder
  Y = A + B
AND gate
  Y = A & B
Multiplexer
  Y = S ? I1 : I0
Arithmetic Logic Unit (ALU)
  Y = F(A, B)
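The four combinational examples can be mimicked as pure functions, since each output depends only on the current inputs. This is a Python sketch rather than hardware. The 32-bit width, the function names, and the string operation codes passed to `alu` are assumptions made for illustration; the operations are the arithmetic-logical ones the lecture implements.

```python
MASK32 = 0xFFFF_FFFF  # assume 32-bit operands


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x8000_0000 else x


def adder(a: int, b: int) -> int:
    return (a + b) & MASK32          # Y = A + B, wrapping at 32 bits


def and_gate(a: int, b: int) -> int:
    return a & b & MASK32            # Y = A & B, bitwise


def mux2(s: int, i0: int, i1: int) -> int:
    return i1 if s else i0           # Y = S ? I1 : I0


def alu(f: str, a: int, b: int) -> int:
    """Y = F(A, B); f is a hypothetical operation selector."""
    if f == "add":
        return adder(a, b)
    if f == "sub":
        return (a - b) & MASK32
    if f == "and":
        return and_gate(a, b)
    if f == "or":
        return (a | b) & MASK32
    if f == "slt":
        return 1 if to_signed(a) < to_signed(b) else 0
    raise ValueError(f"unknown ALU operation: {f}")


print(hex(alu("sub", 3, 5)))         # prints "0xfffffffe"
print(mux2(1, 10, 20))               # prints "20"
```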

State Element (Register)


Register (flip-flop): stores data in a circuit
  Clock signal determines when to update the stored value
  Edge-triggered
    Rising-edge triggered: update when clock changes from 0 to 1
    Falling-edge triggered: update when clock changes from 1 to 0
  Data input determines what (0 or 1) to update to the output

Figure: flip-flop (register) symbol with inputs D and Clk and output Q, with a timing diagram of Clk, D, and Q

State Element (Register)


Register with write control
  Only updates on clock edge when write control input is 1

Figure: register symbol with inputs D, Write, and Clk and output Q, with a timing diagram of Clk, Write, D, and Q
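The behavior of a rising-edge-triggered register with write control can be illustrated with a small software model. This is a Python sketch, not hardware; the class and method names are illustrative. Each call to `step` presents the current levels of Clk, D, and Write.

```python
class Register:
    """Rising-edge-triggered register with a write-control input."""

    def __init__(self, reset_value: int = 0):
        self.q = reset_value   # stored value (output Q)
        self._prev_clk = 0     # clock level seen at the previous step

    def step(self, clk: int, d: int, write: int = 1) -> int:
        """Q changes only on a 0 -> 1 clock edge while write is 1."""
        rising_edge = self._prev_clk == 0 and clk == 1
        if rising_edge and write:
            self.q = d
        self._prev_clk = clk
        return self.q


reg = Register()
reg.step(clk=0, d=7, write=1)  # no edge: Q stays 0
reg.step(clk=1, d=7, write=1)  # rising edge with write = 1: Q becomes 7
reg.step(clk=0, d=9, write=1)  # falling edge: Q holds 7
reg.step(clk=1, d=9, write=0)  # rising edge but write = 0: Q holds 7
print(reg.q)                   # prints "7"
```

With `write` left at its default of 1, the same model behaves like the plain register without write control.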

Clocking Methodology
Virtually all digital systems are essentially synchronous to the clock
Combinational logic sits between state elements (registers)
Combinational logic transforms data during clock cycles
  Between clock edges
  Input from state elements
  Output to the next state elements
Longest delay determines clock period (frequency)

Building a Datapath
Processor is composed of datapath and control
  Datapath
    Elements that process data and addresses in the CPU
      Registers, ALUs, muxes, memories, ...
  Control
    Logic that controls operations
      When to write to a register
      What kind of operation the ALU should do
        Addition, subtraction, exclusive OR, and so on
We will build a MIPS datapath incrementally and provide Verilog code
  We adopt both structural and behavioral modeling
    Behavioral modeling describes what a module does
      For example, the lowest-level modules (such as the ALU and register file) will be designed with behavioral modeling
    Structural modeling describes a module in terms of simpler modules via instantiations
      For example, the top module (such as MIPS_CPU) will be designed with structural modeling

Overview of CPU Design

Figure: the MIPS CPU connects to the instruction memory and to the data memory, each over an address bus and a data bus

Figure: Verilog file hierarchy
  mips_tb.v (testbench): supplies reset and clock
    mips_cpu_mem.v
      mips_cpu.v: fetch (pc), decoding, register file, ALU, memory access
      imem.v (instruction memory): signals Address and Instruction; holds the binary (machine code)
      dmem.v (data memory): signals Address, DataOut, and DataIn; holds the data in your program, stack, and heap

Instruction Fetch
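Figure: instruction fetch inside the MIPS CPU. The PC, a 32-bit register (flip-flops) driven by reset and clock, supplies the 32-bit address to the instruction memory, which returns the instruction. An adder increments the PC by 4 for the next instruction.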

What is PC on reset?
  MIPS initializes the PC to 0xBFC0_0000
  For the sake of simplicity, let's initialize the PC to 0x0000_0000 in our design
How about x86 and ARM?
  x86 reset vector is 0xFFFF_FFF0. BIOS ROM is located there
  ARM reset vector is 0x0000_0000
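The fetch sequence (reset the PC, then add 4 each cycle) can be traced with a few lines of Python. This is a software sketch under simplifying assumptions: one call to `next_pc` stands for one clock cycle, reset is simply an input sampled each cycle, and the names are illustrative rather than taken from the lecture.

```python
RESET_PC = 0x0000_0000  # reset value chosen for this lecture's design


def next_pc(pc: int, reset: bool) -> int:
    """Value the PC register holds after the next clock edge."""
    if reset:
        return RESET_PC
    return (pc + 4) & 0xFFFF_FFFF  # 32-bit adder, wraps on overflow


pc = next_pc(0x1234, reset=True)   # reset overrides whatever the PC held
trace = []
for _ in range(4):
    trace.append(pc)               # address sent to the instruction memory
    pc = next_pc(pc, reset=False)

print([hex(a) for a in trace])     # prints "['0x0', '0x4', '0x8', '0xc']"
```

The step is 4 because each MIPS instruction is 32 bits (4 bytes) and the PC holds a byte address.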


Instruction Fetch Verilog Model


Figure: the PC register (driven by reset and clock) feeds an adder that adds 4 to produce the next PC

pc module:

  `include "delay.v"

  module pc (input             clk, reset,
             output reg [31:0] pc,
             input      [31:0] pcnext);

    // 32-bit PC register with asynchronous reset to 0
    always @(posedge clk, posedge reset)
    begin
      if (reset) pc <= #`mydelay 32'h00000000;
      else       pc <= #`mydelay pcnext;
    end

  endmodule

adder module:

  `include "delay.v"

  module adder (input  [31:0] a, b,
                output [31:0] y);

    assign #`mydelay y = a + b;

  endmodule

mips_cpu module:

  `include "delay.v"

  module mips_cpu (input         clk, reset,
                   output [31:0] pc,
                   input  [31:0] instr);

    wire [31:0] pcnext;

    // instantiate pc and adder modules
    pc    pcreg  (clk, reset, pc, pcnext);
    adder pcadd4 (pc, 32'b100, pcnext);   // pcnext = pc + 4

  endmodule

Memory
As studied in Computer Logic Design, memory is classified into RAM (Random Access Memory) and ROM (Read-Only Memory)
  RAM is classified into DRAM (Dynamic RAM) and SRAM (Static RAM)
  DDR is a DRAM
    Short form of DDR (Double Data Rate) SDRAM (Synchronous DRAM)
    DDR is used as main memory in modern computers
We use a simple Verilog memory model that stores your program, since our focus is on how the CPU works

Simple MIPS Test Code


Figure: example MIPS assembly code, which is assembled into machine code

Instruction Memory Verilog Model

  module imem(input  [6:0]  a,
              output [31:0] rd);

    reg [31:0] RAM[127:0];   // 128 words, each word is 32 bits

    initial
    begin
      $readmemh("memfile.dat", RAM);   // compiled binary file
    end

    assign #1 rd = RAM[a];   // word aligned: data comes out from the address a

  endmodule

Figure: instruction memory with 7-bit input a[6:0] and 32-bit output rd[31:0]

memfile.dat:

  20020005
  2003000c
  2067fff7
  00e22025
  00642824
  00a42820
  10a7000a
  0064202a
  10800001
  20050000
  00e2202a
  00853820
  00e23822
  ac670044
  8c020050
  08000011
  20020001
  ac020054

Depending on your needs, you can increase or decrease the memory size
  Examples
    For 1KB word-addressable memory, reg [31:0] RAM[255:0]
    For 16KB byte-addressable memory, reg [7:0] RAM[16*1024-1:0]
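The sizing examples follow from simple arithmetic, sketched here in Python rather than Verilog. The helper names are illustrative. `word_index` models how a word-addressable memory is indexed from a byte address: it skips the two low byte-offset bits and keeps only as many index bits as the memory has.

```python
WORD_BYTES = 4  # one MIPS word is 32 bits


def words_for(capacity_bytes: int) -> int:
    """Number of 32-bit entries a word-addressable memory needs."""
    return capacity_bytes // WORD_BYTES


def word_index(byte_addr: int, index_bits: int) -> int:
    """Drop the 2 byte-offset bits, then keep index_bits bits of word index."""
    return (byte_addr >> 2) & ((1 << index_bits) - 1)


print(words_for(1024))          # prints "256", i.e. RAM[255:0] of 32-bit words
print(16 * 1024)                # prints "16384" one-byte entries for 16KB byte-addressable
print(word_index(0x10, 7))      # prints "4": byte address 0x10 is the fifth word
print(word_index(0x1FC, 7))     # prints "127": last word of a 128-word memory
```

Addresses beyond the memory wrap around in this model because the upper address bits are ignored.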


MIPS CPU with imem and Testbench

  module mips_cpu_mem(input clk, reset);

    wire [31:0] pc, instr;

    // instantiate processor and memories
    mips_cpu imips_cpu  (clk, reset, pc, instr);
    // pc[8:2]: 7-bit word address, matching imem's 7-bit port a[6:0]
    imem     imips_imem (pc[8:2], instr);

  endmodule

  module mips_tb();

    reg clk;
    reg reset;

    // instantiate device to be tested
    mips_cpu_mem imips_cpu_mem(clk, reset);

    // initialize test
    initial
    begin
      reset <= 1;
      # 32;
      reset <= 0;
    end

    // generate clock to sequence tests
    initial
    begin
      clk <= 0;
      forever #10 clk <= ~clk;
    end

  endmodule

Simulation and Synthesis


Instruction fetch simulation
  Figure: simulation waveform of the instruction fetch
Synthesis
  Try to synthesize pc and adder with Quartus-II