Single Cycle Processor Design: Computer Architecture and Assembly Language

The document outlines the design and components of a single cycle processor, focusing on the MIPS instruction set and its implementation. It discusses the performance implications of single-cycle designs, including advantages and disadvantages, and details the datapath components, instruction formats, and execution steps for various instruction types. Additionally, it covers the control signals required for executing instructions and the integration of data memory into the datapath.
Single Cycle Processor Design

CSE 333
Computer Architecture and Assembly Language

[Adapted from slides of Dr. M. Mudawar, ICS 233, KFUPM]


Outline
 Designing a Processor: Step-by-Step

 Datapath Components and Clocking

 Assembling an Adequate Datapath

 Controlling the Execution of Instructions

 The Main Controller and ALU Controller

 Drawback of the single-cycle processor design


The Performance Perspective
 Recall, performance is determined by:
   Instruction count (I-Count)
   Clock cycles per instruction (CPI)
   Clock cycle time
  Execution Time = I-Count × CPI × Cycle time
 Processor design will affect
   Clock cycles per instruction
   Clock cycle time
 Single cycle datapath and control design:
   Advantage: One clock cycle per instruction
   Disadvantage: long cycle time
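The performance relation recalled above (execution time as the product of instruction count, CPI, and clock cycle time) can be written as a one-line calculation. This is a minimal sketch; the function name and the sample numbers are illustrative assumptions, not figures from the slides.

```python
def execution_time(instruction_count: int, cpi: float, cycle_time_ns: float) -> float:
    """Execution time in nanoseconds = I-Count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical workload: 1,000,000 instructions on a single-cycle design
# (CPI = 1) with an assumed 10 ns clock cycle.
single_cycle_ns = execution_time(1_000_000, 1.0, 10.0)
print(single_cycle_ns)
```

A single-cycle design fixes CPI at 1, so the only remaining lever in this product is the cycle time, which is why the long cycle is its main drawback.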
Designing a Processor: Step-by-Step
 Analyze instruction set => datapath requirements
 The meaning of each instruction is given by the register transfers

 Datapath must include storage elements for ISA registers

 Datapath must support each register transfer

 Select datapath components and clocking methodology

 Assemble datapath meeting the requirements

 Analyze implementation of each instruction


 Determine the setting of control signals for register transfer

 Assemble the control logic


Review of MIPS Instruction Formats
 All instructions are 32 bits wide
 Three instruction formats: R-type, I-type, and J-type
  R-type:  Op6 | Rs5 | Rt5 | Rd5 | sa5 | funct6
  I-type:  Op6 | Rs5 | Rt5 | immediate16
  J-type:  Op6 | immediate26

 Op6: 6-bit opcode of the instruction


 Rs5, Rt5, Rd5: 5-bit source and destination register numbers
 sa5: 5-bit shift amount used by shift instructions
 funct6: 6-bit function field for R-type instructions
 immediate16: 16-bit immediate value or address offset
 immediate26: 26-bit target address of the jump instruction
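The field layout above maps directly onto shifts and masks. The following is a minimal sketch of a field extractor; the function name `decode` and the dictionary keys are illustrative choices, and the sample word is a hand-encoded `add` with rs = 1, rt = 2, rd = 3.

```python
def decode(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into the fields of all three formats."""
    return {
        "op":    (instr >> 26) & 0x3F,   # bits 31..26
        "rs":    (instr >> 21) & 0x1F,   # bits 25..21
        "rt":    (instr >> 16) & 0x1F,   # bits 20..16
        "rd":    (instr >> 11) & 0x1F,   # bits 15..11 (R-type only)
        "sa":    (instr >> 6) & 0x1F,    # bits 10..6  (R-type only)
        "funct": instr & 0x3F,           # bits 5..0   (R-type only)
        "imm16": instr & 0xFFFF,         # bits 15..0  (I-type only)
        "imm26": instr & 0x3FFFFFF,      # bits 25..0  (J-type only)
    }

# op = 0, rs = 1, rt = 2, rd = 3, sa = 0, funct = 0x20 (add)
fields = decode(0x00221820)
print(fields)
```

All fields are extracted unconditionally, which mirrors the hardware: the wires for every field exist for every instruction, and the control signals decide which of them are actually used.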
MIPS Subset of Instructions
 Only a subset of the MIPS instructions is considered
 ALU instructions (R-type): add, sub, and, or, xor, slt
 Immediate instructions (I-type): addi, slti, andi, ori, xori
 Load and Store (I-type): lw, sw
 Branch (I-type): beq, bne
 Jump (J-type): j

 This subset does not include all the integer instructions


 But sufficient to illustrate design of datapath and control
 Concepts used to implement the MIPS subset are used
to construct a broad spectrum of computers
Details of the MIPS Subset
Instruction         Meaning           Format
add  rd, rs, rt     addition          op6 = 0  rs5  rt5  rd5  0  0x20
sub  rd, rs, rt     subtraction       op6 = 0  rs5  rt5  rd5  0  0x22
and  rd, rs, rt     bitwise and       op6 = 0  rs5  rt5  rd5  0  0x24
or   rd, rs, rt     bitwise or        op6 = 0  rs5  rt5  rd5  0  0x25
xor  rd, rs, rt     exclusive or      op6 = 0  rs5  rt5  rd5  0  0x26
slt  rd, rs, rt     set on less than  op6 = 0  rs5  rt5  rd5  0  0x2a
addi rt, rs, im16   add immediate     0x08     rs5  rt5  im16
slti rt, rs, im16   slt immediate     0x0a     rs5  rt5  im16
andi rt, rs, im16   and immediate     0x0c     rs5  rt5  im16
ori  rt, rs, im16   or immediate      0x0d     rs5  rt5  im16
xori rt, rs, im16   xor immediate     0x0e     rs5  rt5  im16
lw   rt, im16(rs)   load word         0x23     rs5  rt5  im16
sw   rt, im16(rs)   store word        0x2b     rs5  rt5  im16
beq  rs, rt, im16   branch if equal   0x04     rs5  rt5  im16
bne  rs, rt, im16   branch not equal  0x05     rs5  rt5  im16
j    im26           jump              0x02     im26
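The encodings in the table above can be exercised with a small encoder. This is a sketch only: the opcode and funct values are copied from the table, while the function names `encode_r`, `encode_i`, and `encode_j` and their argument order are assumptions made for illustration.

```python
# Opcode and funct values copied from the table above.
FUNCT = {"add": 0x20, "sub": 0x22, "and": 0x24, "or": 0x25, "xor": 0x26, "slt": 0x2A}
OPCODE = {"addi": 0x08, "slti": 0x0A, "andi": 0x0C, "ori": 0x0D, "xori": 0x0E,
          "lw": 0x23, "sw": 0x2B, "beq": 0x04, "bne": 0x05, "j": 0x02}

def encode_r(name: str, rd: int, rs: int, rt: int) -> int:
    """R-type: op = 0, sa = 0, operation chosen by the funct field."""
    return (rs << 21) | (rt << 16) | (rd << 11) | FUNCT[name]

def encode_i(name: str, rt: int, rs: int, imm16: int) -> int:
    """I-type: the immediate is truncated to its low 16 bits."""
    return (OPCODE[name] << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)

def encode_j(imm26: int) -> int:
    """J-type: opcode followed by the 26-bit target field."""
    return (OPCODE["j"] << 26) | (imm26 & 0x3FFFFFF)

print(hex(encode_r("add", rd=3, rs=1, rt=2)))
print(hex(encode_i("lw", rt=8, rs=9, imm16=4)))
```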
Register Transfer Level (RTL)
 RTL is a description of data flow between registers
 RTL gives a meaning to the instructions
 All instructions are fetched from memory at address PC
Instruction   RTL Description
ADD           Reg(Rd) ← Reg(Rs) + Reg(Rt); PC ← PC + 4
SUB           Reg(Rd) ← Reg(Rs) - Reg(Rt); PC ← PC + 4
ORI           Reg(Rt) ← Reg(Rs) | zero_ext(Im16); PC ← PC + 4
LW            Reg(Rt) ← MEM[Reg(Rs) + sign_ext(Im16)]; PC ← PC + 4
SW            MEM[Reg(Rs) + sign_ext(Im16)] ← Reg(Rt); PC ← PC + 4
BEQ           if (Reg(Rs) == Reg(Rt))
                   PC ← PC + 4 + 4 × sign_ext(Im16)
              else PC ← PC + 4
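The two extension operations and the BEQ register transfer above can be checked with a few lines of Python. This is a minimal sketch; the helper names follow the RTL notation, and wrapping the PC to 32 bits with a mask is an assumption about how overflow would be handled.

```python
MASK32 = 0xFFFFFFFF

def zero_ext(imm16: int) -> int:
    """Zero-extend a 16-bit field: the upper 16 bits are 0."""
    return imm16 & 0xFFFF

def sign_ext(imm16: int) -> int:
    """Sign-extend a 16-bit field, returned as a signed Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def beq_next_pc(pc: int, rs_val: int, rt_val: int, imm16: int) -> int:
    """BEQ: PC + 4 + 4 x sign_ext(Im16) if the registers are equal, else PC + 4."""
    if rs_val == rt_val:
        return (pc + 4 + 4 * sign_ext(imm16)) & MASK32
    return (pc + 4) & MASK32

# A taken branch with Im16 = 0xFFFE (that is, -2) goes back one instruction
# from the branch itself: 0x100 + 4 - 8.
print(hex(beq_next_pc(0x100, 5, 5, 0xFFFE)))
```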
Instructions are Executed in Steps
 R-type
  Fetch instruction:  Instruction ← MEM[PC]
  Fetch operands:     data1 ← Reg(Rs), data2 ← Reg(Rt)
  Execute operation:  ALU_result ← func(data1, data2)
  Write ALU result:   Reg(Rd) ← ALU_result
  Next PC address:    PC ← PC + 4
 I-type
  Fetch instruction:  Instruction ← MEM[PC]
  Fetch operands:     data1 ← Reg(Rs), data2 ← Extend(imm16)
  Execute operation:  ALU_result ← op(data1, data2)
  Write ALU result:   Reg(Rt) ← ALU_result
  Next PC address:    PC ← PC + 4
 BEQ
  Fetch instruction:  Instruction ← MEM[PC]
  Fetch operands:     data1 ← Reg(Rs), data2 ← Reg(Rt)
  Equality:           zero ← subtract(data1, data2)
  Branch:             if (zero) PC ← PC + 4 + 4×sign_ext(imm16)
                      else PC ← PC + 4
Instruction Execution – cont’d
 LW
  Fetch instruction:   Instruction ← MEM[PC]
  Fetch base register: base ← Reg(Rs)
  Calculate address:   address ← base + sign_extend(imm16)
  Read memory:         data ← MEM[address]
  Write register Rt:   Reg(Rt) ← data
  Next PC address:     PC ← PC + 4
 SW
  Fetch instruction:   Instruction ← MEM[PC]
  Fetch registers:     base ← Reg(Rs), data ← Reg(Rt)
  Calculate address:   address ← base + sign_extend(imm16)
  Write memory:        MEM[address] ← data
  Next PC address:     PC ← PC + 4
 Jump
  Fetch instruction:   Instruction ← MEM[PC]
  Target PC address:   target ← PC[31:28] , Imm26 , ‘00’   (concatenation)
  Jump:                PC ← target
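The jump target is built by concatenating bit fields rather than by addition. A minimal sketch of that concatenation follows; the function name `jump_target` is an illustrative choice.

```python
def jump_target(pc: int, imm26: int) -> int:
    """target = PC[31:28] , Imm26 , '00' (bit concatenation)."""
    upper4 = pc & 0xF0000000               # keep PC[31:28]
    middle = (imm26 & 0x3FFFFFF) << 2      # Imm26 followed by two zero bits
    return upper4 | middle

print(hex(jump_target(0x10000008, 0x100)))
```

Shifting Imm26 left by two positions supplies the two low zero bits, because instructions are 32 bits wide and therefore word aligned.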
Requirements of the Instruction Set
 Memory
 Instruction memory where instructions are stored
 Data memory where data is stored
 Registers
 32 × 32-bit general purpose registers, R0 is always zero
 Read source register Rs
 Read source register Rt
 Write destination register Rt or Rd
 Program counter PC register and Adder to increment PC
 Sign and Zero extender for immediate constant
 ALU for executing instructions
Components of the Datapath
 Combinational Elements
   ALU, Adder
   Immediate extender
   Multiplexers
 Storage Elements
   Instruction memory
   Data memory
   PC register
   Register file
 Clocking methodology
   Timing of reads and writes
[Figure: block symbols of the datapath components. ALU with two 32-bit inputs, an ALU control input, and outputs ALU result (32 bits), zero, and overflow; Extender from 16 to 32 bits controlled by ExtOp; multiplexer with a select input; Instruction Memory (32-bit Address in, 32-bit Instruction out); Data Memory (Address, Data_in, Data_out, MemRead, MemWrite); 32-bit PC register; Register file (5-bit RA, RB, RW; 32-bit BusA, BusB, BusW; Clock; RegWrite)]
MIPS Register File
 Register File consists of 32 × 32-bit registers
   BusA and BusB: 32-bit output busses for reading 2 registers
   BusW: 32-bit input bus for writing a register when RegWrite is 1
   Two registers read and one written in a cycle
 Registers are selected by:
   RA selects register to be read on BusA
   RB selects register to be read on BusB
   RW selects the register to be written
 Clock input
   The clock input is used ONLY during write operation
   During read, register file behaves as a combinational logic block
     RA or RB valid => BusA or BusB valid after access time
[Figure: Register File block with 5-bit inputs RA, RB, RW; 32-bit outputs BusA and BusB; 32-bit input BusW; Clock and RegWrite inputs]
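The read and write behavior of the register file can be modeled in a few lines. This is a behavioral sketch under stated assumptions, not a hardware description: the class name `RegisterFile` and the method names `read` and `clock_edge` are hypothetical, and the rule that register 0 stays zero comes from the requirement that R0 is always zero.

```python
class RegisterFile:
    """Behavioral model of a 32 x 32-bit register file."""

    def __init__(self) -> None:
        self.regs = [0] * 32

    def read(self, ra: int, rb: int) -> tuple:
        """Combinational read: returns (BusA, BusB) with no clock involved."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw: int, bus_w: int, reg_write: bool) -> None:
        """Write happens only on a clock edge, and only when RegWrite is 1."""
        if reg_write and rw != 0:          # register 0 always reads as zero
            self.regs[rw] = bus_w & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=42, reg_write=True)    # written
rf.clock_edge(rw=6, bus_w=7, reg_write=False)    # ignored: RegWrite is 0
print(rf.read(5, 6))
```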
Instruction and Data Memories
 Instruction memory needs only provide read access
   Because datapath does not write instructions
   Behaves as combinational logic for read
   Address selects Instruction after access time
 Data Memory is used for load and store
   MemRead: enables output on Data_out
     Address selects the word to put on Data_out
   MemWrite: enables writing of Data_in
     Address selects the memory word to be written
     The Clock synchronizes the write operation
 Separate instruction and data memories
   Later, we will replace them with caches
[Figure: Instruction Memory block (32-bit Address in, 32-bit Instruction out) and Data Memory block (32-bit Address and Data_in, 32-bit Data_out, with MemRead, MemWrite, and Clock inputs)]


Datapath for R-type Instructions
Format: Op6 | Rs5 | Rt5 | Rd5 | sa5 | funct6

[Figure: R-type datapath. A 30-bit PC, incremented by +1 and extended with ‘00’, addresses the Instruction Memory. The Rs, Rt, and Rd fields of the instruction drive RA, RB, and RW of the Registers block; BusA and BusB feed the ALU; the ALU result returns on BusW. Control inputs: RegWrite, ALUCtrl]

 RA & RB come from the instruction’s Rs & Rt fields
 RW comes from the Rd field
 ALU inputs come from BusA & BusB
 ALU result is connected to BusW
 Control signals
   ALUCtrl is derived from the funct field because Op = 0 for R-type
   RegWrite is used to enable the writing of the ALU result
Datapath for I-type ALU Instructions
Format: Op6 | Rs5 | Rt5 | immediate16

[Figure: I-type ALU datapath. A 30-bit PC, incremented by +1 and extended with ‘00’, addresses the Instruction Memory. Rs drives RA and Rt drives RW of the Registers block; Imm16 passes through the Extender (controlled by ExtOp) to the second ALU input; the ALU result returns on BusW. Control inputs: RegWrite, ALUCtrl, ExtOp]

 RW now comes from Rt, instead of Rd
 Second ALU input comes from the extended immediate
 RB and BusB are not used
 Control signals
   ALUCtrl is derived from the Op field
   RegWrite is used to enable the writing of the ALU result
   ExtOp is used to control the extension of the 16-bit immediate
Combining R-type & I-type Datapaths
[Figure: combined R-type and I-type datapath. Same PC, Instruction Memory, Registers, Extender, and ALU as before, with two multiplexers added: one on RW controlled by RegDst (input 0 = Rt, input 1 = Rd) and one on the second ALU input controlled by ALUSrc (input 0 = BusB, input 1 = extended immediate). Control inputs: RegWrite, ALUCtrl, ExtOp, RegDst, ALUSrc]

 A mux selects RW as either Rt or Rd
 Another mux selects 2nd ALU input as either source register Rt data on BusB or the extended immediate
 Control signals
 ALUCtrl is derived from either the Op or the funct field
 RegWrite enables the writing of the ALU result
 ExtOp controls the extension of the 16-bit immediate
 RegDst selects the register destination as either Rt or Rd
 ALUSrc selects the 2nd ALU source as BusB or extended immediate
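The effect of RegDst, ALUSrc, and ExtOp can be illustrated with a small model of the two multiplexers and the extender. This is a sketch under assumptions: the function names are hypothetical, ExtOp is modeled as the string "sign" or "zero", and the mux polarity (0 selects Rt and BusB, 1 selects Rd and the extended immediate) follows the signal values the slides give for the load instruction.

```python
def mux(select: int, in0, in1):
    """2-to-1 multiplexer: select = 0 passes in0, select = 1 passes in1."""
    return in1 if select else in0

def extend(imm16: int, ext_op: str) -> int:
    """Extend a 16-bit immediate to a 32-bit pattern ('sign' or 'zero')."""
    imm16 &= 0xFFFF
    if ext_op == "sign" and imm16 & 0x8000:
        return imm16 | 0xFFFF0000
    return imm16

def select_operands(rt, rd, bus_b, imm16, reg_dst, alu_src, ext_op):
    """Return (RW, second ALU input) for the given control signals."""
    rw = mux(reg_dst, rt, rd)
    alu_in2 = mux(alu_src, bus_b, extend(imm16, ext_op))
    return rw, alu_in2

# R-type style setting: destination Rd, second operand from BusB
print(select_operands(rt=2, rd=3, bus_b=100, imm16=0xFFFF,
                      reg_dst=1, alu_src=0, ext_op="sign"))
```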
Adding Data Memory to Datapath
 A data memory is added for load and store instructions
[Figure: datapath with Data Memory added. The ALU result drives the Data Memory Address; BusB drives Data_in; a third multiplexer controlled by MemtoReg selects BusW from the ALU result (input 0) or Data_out (input 1). Control inputs: ExtOp, ALUCtrl, ALUSrc, RegDst, RegWrite, MemRead, MemWrite, MemtoReg]

 ALU calculates data memory address
 A 3rd mux selects data on BusW as either ALU result or memory Data_out
 BusB is connected to Data_in of Data Memory for store instructions
 Additional Control signals
   MemRead for load instructions
   MemWrite for store instructions
   MemtoReg selects data on BusW as ALU result or Memory Data_out
Controlling the Execution of Load
[Figure: the datapath with Data Memory, annotated with the control values for a load: ExtOp = sign, ALUCtrl = ADD, ALUSrc = 1, RegDst = 0, RegWrite = 1, MemRead = 1, MemWrite = 0, MemtoReg = 1]

 ExtOp = ‘sign’ to sign-extend Immediate16 to 32 bits
 RegDst = ‘0’ selects Rt as destination register
 ALUSrc = ‘1’ selects extended immediate as second ALU input
 ALUCtrl = ‘ADD’ to calculate data memory address as Reg(Rs) + sign-extend(Imm16)
 MemRead = ‘1’ to read data memory
 MemtoReg = ‘1’ places the data read from memory on BusW
 RegWrite = ‘1’ to write the memory data on BusW to register Rt
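The load settings can be traced through a small model of the datapath in which every selection is driven by the control values listed for this slide. This is a sketch only: the names `LW_CONTROL` and `run_load` are hypothetical, the register file is a plain list, and the data memory is modeled as a dictionary from byte address to word, which is an assumption made for brevity.

```python
# Control values for lw, as listed on the slide.
LW_CONTROL = {"RegDst": 0, "RegWrite": 1, "ExtOp": "sign", "ALUSrc": 1,
              "ALUCtrl": "ADD", "MemRead": 1, "MemWrite": 0, "MemtoReg": 1}

def sign_ext(imm16: int) -> int:
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def run_load(regs, data_mem, rs, rt, rd, imm16, ctrl=LW_CONTROL):
    """Push one lw through the datapath, following each control signal."""
    bus_a, bus_b = regs[rs], regs[rt]                       # register read
    imm = sign_ext(imm16) if ctrl["ExtOp"] == "sign" else imm16 & 0xFFFF
    alu_in2 = imm if ctrl["ALUSrc"] else bus_b              # ALUSrc mux
    alu_result = (bus_a + alu_in2) & 0xFFFFFFFF             # ALUCtrl = ADD
    data_out = data_mem.get(alu_result, 0) if ctrl["MemRead"] else 0
    bus_w = data_out if ctrl["MemtoReg"] else alu_result    # MemtoReg mux
    rw = rd if ctrl["RegDst"] else rt                       # RegDst mux
    if ctrl["RegWrite"] and rw != 0:
        regs[rw] = bus_w
    return regs

regs = [0] * 32
regs[9] = 0x1000                     # base address in register 9
memory = {0x1004: 77}                # hypothetical memory contents
run_load(regs, memory, rs=9, rt=8, rd=0, imm16=4)
print(regs[8])
```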
Adding Jump and Branch to Datapath
[Figure: datapath with jump and branch support. A Next PC block takes Imm26, Imm16, the incremented 30-bit PC, the ALU zero output, and the J, Beq, Bne signals, and produces the 30-bit jump or branch target address and PCSrc; a multiplexer controlled by PCSrc selects the next PC from PC + 1 (input 0) or the target address (input 1). The rest of the datapath (Instruction Memory, Registers, Ext, ALU, Data Memory, and the RegDst, ALUSrc, and MemtoReg multiplexers) is unchanged]

 Next PC computes jump or branch target instruction address
 For Branch, ALU does a subtraction
 Additional Control Signals
   J, Beq, Bne for jump and branch instructions
   Zero condition of the ALU is examined
   PCSrc = 1 for Jump & taken Branch
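The PCSrc condition stated above reduces to a small Boolean expression over J, Beq, Bne, and the ALU zero flag. A minimal sketch follows; the function name `pc_src` is an illustrative choice, and the expression is derived from the fact that the ALU subtracts for branches, so zero = 1 means the two registers are equal.

```python
def pc_src(j: int, beq: int, bne: int, zero: int) -> int:
    """PCSrc = 1 for a jump or a taken branch, else 0."""
    taken_beq = beq and zero          # beq is taken when the operands are equal
    taken_bne = bne and not zero      # bne is taken when they differ
    return int(bool(j or taken_beq or taken_bne))

# Print the cases for each instruction kind and zero flag value.
for name, (j, beq, bne) in {"j": (1, 0, 0), "beq": (0, 1, 0), "bne": (0, 0, 1)}.items():
    for zero in (0, 1):
        print(name, "zero =", zero, "PCSrc =", pc_src(j, beq, bne, zero))
```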
Single-Cycle Datapath + Control
[Figure: complete single-cycle datapath with control. Main Control takes the Op field and produces RegDst, RegWrite, ExtOp, ALUSrc, ALUOp, MemRead, MemWrite, MemtoReg, and J, Beq, Bne; ALU Ctrl takes the func field and ALUOp and produces ALUCtrl. The datapath is the one from the previous slide: Next PC block and PCSrc multiplexer, Instruction Memory, Registers, Ext, ALU, Data Memory]
Drawbacks of Single Cycle Processor
 Long cycle time
   All instructions take as much time as the slowest

Instruction   Steps
ALU           Instruction Fetch | Reg Read | ALU | Reg Write
Load          Instruction Fetch | Reg Read | ALU | Memory Read | Reg Write   (longest delay)
Store         Instruction Fetch | Reg Read | ALU | Memory Write
Branch        Instruction Fetch | Reg Read | ALU
Jump          Instruction Fetch | Decode

 Alternative Solution: Multicycle implementation
   Break down instruction execution into multiple cycles
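The claim that every instruction takes as long as the slowest one can be made concrete with a short calculation. The step lists below follow the slide, but the delay values are purely hypothetical numbers chosen for illustration; the slides give no timings.

```python
# ASSUMED delays in nanoseconds, for illustration only (not from the slides).
DELAY_NS = {"Instruction Fetch": 2, "Reg Read": 1, "Decode": 1, "ALU": 2,
            "Memory Read": 2, "Memory Write": 2, "Reg Write": 1}

# Steps used by each instruction class, as listed on the slide.
STEPS = {
    "ALU":    ["Instruction Fetch", "Reg Read", "ALU", "Reg Write"],
    "Load":   ["Instruction Fetch", "Reg Read", "ALU", "Memory Read", "Reg Write"],
    "Store":  ["Instruction Fetch", "Reg Read", "ALU", "Memory Write"],
    "Branch": ["Instruction Fetch", "Reg Read", "ALU"],
    "Jump":   ["Instruction Fetch", "Decode"],
}

# Latency of each class is the sum of the delays of its steps.
latency = {name: sum(DELAY_NS[s] for s in steps) for name, steps in STEPS.items()}

# The single-cycle clock period must cover the slowest instruction.
cycle_time = max(latency.values())
slowest = max(latency, key=latency.get)
print(latency, cycle_time, slowest)
```

Under these assumed delays the load determines the clock period, so a jump that needs far less time still occupies a full cycle.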
Multicycle Implementation
 Break instruction execution into five steps
   Instruction fetch
   Instruction decode and register read
   Execution, memory address calculation, or branch completion
   Memory access or ALU instruction completion
   Load instruction completion
 One step = One clock cycle (clock cycle is reduced)
 First 2 steps are the same for all instructions

Instruction    # cycles
ALU & Store    4
Load           5
Branch         3
Jump           2
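With the cycle counts above, the average CPI of a multicycle implementation depends on the instruction mix. The sketch below computes it as a weighted average; the cycle counts come from the table, while the instruction mix is a hypothetical example and not data from the slides.

```python
# Cycle counts from the table above.
CYCLES = {"ALU": 4, "Store": 4, "Load": 5, "Branch": 3, "Jump": 2}

# ASSUMED instruction mix in percent, for illustration only.
MIX_PERCENT = {"ALU": 40, "Load": 20, "Store": 10, "Branch": 20, "Jump": 10}

def average_cpi(cycles: dict, mix_percent: dict) -> float:
    """Weighted average of cycles per instruction over the given mix."""
    total = sum(mix_percent.values())
    return sum(cycles[k] * mix_percent[k] for k in mix_percent) / total

cpi = average_cpi(CYCLES, MIX_PERCENT)
print(cpi)
```

The multicycle CPI is greater than 1, so the design pays off only if the shorter clock cycle more than compensates for the extra cycles per instruction.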
Summary
 5 steps to design a processor
 Analyze instruction set => datapath requirements
 Select datapath components & establish clocking methodology
 Assemble datapath meeting the requirements
 Analyze implementation of each instruction to determine control signals
 Assemble the control logic

 MIPS makes Control easier


 Instructions are of same size
 Source registers always in same place
 Immediates are of same size and same location
 Operations are always on registers/immediates

 Single cycle datapath => CPI=1, but Long Clock Cycle
