ECE429 Final Project
TABLE OF CONTENTS
CHAPTER 1: INTRODUCTION
CHAPTER 2: CIRCUIT DESCRIPTION
CHAPTER 3: MEMORY FILE
CHAPTER 4: ARITHMETIC LOGIC UNIT (ALU)
CHAPTER 5: SYNCHRONIZATION
CHAPTER 6: CASE STUDY 1
LIST OF FIGURES
Figure 1: Overview of the primary blocks and signals
Figure 2: Memory file configuration
Figure 3: Block diagram of the ALU circuit
Figure 4: Instruction word contents
Figure 5: RTL simulation result
Figure 6: SimVision waveform for CLA
Figure 7: SimVision waveform for CRA
Figure 8: SimVision waveform for CSA
Figure 9: SimVision waveform for CSeA
Figure 10: cell.rep file generated
Figure 11: timing.rep file generated
Figure 12: Post-synthesis simulation
Figure 13: timing.rep.5.final
Figure 14: Post-P&R simulation
Figure 15: Post-P&R simulation SimVision output
Figure 16: Log file for CLA for new testbench
Figure 17: Log file for CRA for new testbench
Figure 18: Log file for CSA for new testbench
Figure 19: Log file for CSeA for new testbench
LIST OF TABLES
Table 1: Comparator condition
Table 2: Final results
CHAPTER 1: INTRODUCTION
This report describes a case study of a 32-bit pipelined CPU design with a new ALU
architecture. The objective of this project is to understand a 32-bit pipelined Central Processing Unit
(CPU). As the name of the design indicates, the word length of the data used in the circuits is 32 bits.
Furthermore, since the circuit is pipelined, more than one instruction can be in execution at the same
time. The operation of the circuit is synchronized by an externally supplied clock signal. The instruction
signals for addressing the memory file, selecting the Arithmetic Logic Unit (ALU) operands, and
specifying the operation of the ALU are also external. One objective of the project is to synchronize
those signals correctly with the critical data path delay of the circuit, which determines the minimum
operating period.
A * B : multiplication
A + B : addition
A - B : subtraction
B - A : subtraction
A or B : logic OR function
A and B : logic AND function
A xor B : logic XOR function
A xnor B : logic XNOR function
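The eight functions listed above can be sketched as a small behavioral model. This is only an illustration in Python, not the Verilog used in the project; the function name alu_op, the operation labels, and the choice of unsigned operands with two's-complement wrap-around are assumptions made for this sketch.

```python
MASK32 = 0xFFFF_FFFF  # 32-bit word width, as stated in the report


def alu_op(name: str, a: int, b: int) -> int:
    """Behavioral model of one ALU function (labels are hypothetical).

    Add, subtract and the logic functions are truncated to 32 bits;
    multiplication returns the full product, which can be 64 bits wide.
    """
    a &= MASK32
    b &= MASK32
    if name == "mul":
        return a * b  # not truncated: up to 64 bits
    results = {
        "add": a + b,
        "sub_ab": a - b,   # A - B
        "sub_ba": b - a,   # B - A
        "or": a | b,
        "and": a & b,
        "xor": a ^ b,
        "xnor": ~(a ^ b),
    }
    # Masking wraps negative and overflowing values into 32 bits.
    return results[name] & MASK32
```

For example, alu_op("add", 0xFFFF_FFFF, 0x0000_0001) wraps around to 0 in this model, because the carry out of bit 31 is discarded.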
As illustrated in Fig. 1, the operands of the ALU can be selected as follows:
Operand A: Operand A can be selected between read port A of the memory file
and the externally defined data input. The selection is made by the external signal ASEL.
Operand B: Operand B can be selected between read port B of the memory file
and the logic zero value. The selection is made by the external signal BSEL.
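The operand selection amounts to two 2-to-1 multiplexers. The sketch below models them in Python for illustration only. The report does not state the polarity of ASEL and BSEL, so the convention used here (0 selects the memory-file read port, 1 selects the alternative) is an assumption, as are the function and parameter names.

```python
MASK32 = 0xFFFF_FFFF  # 32-bit word width


def select_operands(asel: int, bsel: int, port_a: int, port_b: int, data_in: int):
    """Model of the two operand multiplexers (hypothetical names).

    Assumed polarity: a select value of 0 picks the memory-file read
    port, 1 picks the external data input (A) or logic zero (B).
    """
    operand_a = data_in if asel else port_a
    operand_b = 0 if bsel else port_b
    return operand_a & MASK32, operand_b & MASK32
```

With both selects at 1, the ALU sees the external data on A and zero on B, which is one way a value could be passed through the adder unchanged before being stored.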
Internally, the ALU has three primary operation blocks: the multiplier, the adder, and the logic
function block. These blocks are illustrated in Fig. 3. The multiplier can be implemented as a
32-by-32 array-based multiplier and executes the multiplication function. Notice,
however, that the result of the multiplication operation is 64 bits wide. Therefore, storing
the multiplication result back to the memory file requires two clock cycles, and the
multiplication instruction is executed in 3 clock cycles. This is made possible by pipelining
the multiplier unit so that it produces the 32 least significant bits (LSB) of the result in one
clock cycle and the 32 most significant bits (MSB) in the next cycle. Notice,
however, that the instruction immediately following a multiplication operation should select
the MSB of the multiplier at the output of the ALU. It should also specify the storage address
of the MSB bits in the memory file.
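The two-cycle write-back can be illustrated with a short Python model. It assumes that the 64-bit product is split into two 32-bit words, one written per cycle, and that the operands are unsigned; the function name is hypothetical and the model ignores the pipeline timing itself.

```python
MASK32 = 0xFFFF_FFFF  # 32-bit word width


def multiply_two_words(a: int, b: int):
    """Return (lsb_word, msb_word) of the 64-bit product of two
    32-bit unsigned operands (assumed split: 32 bits per cycle)."""
    product = (a & MASK32) * (b & MASK32)
    lsb_word = product & MASK32          # stored in the first cycle
    msb_word = (product >> 32) & MASK32  # stored by the following instruction
    return lsb_word, msb_word
```

For instance, multiplying 0xFFFF_FFFF by 2 gives 0x1_FFFF_FFFE, so the low word is 0xFFFF_FFFE and the high word is 1; the instruction after the multiplication would have to select that high word and give its storage address.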
A20329707
CHAPTER 5: SYNCHRONIZATION
To better understand the circuit synchronization sequence described below, please refer to
Fig. 1. The operation of the CPU is synchronized by the external clock signal.
RTL Simulation
1. We checked the design for bugs before synthesis and verified its
correctness by running the testbench. The complete CPU design is provided in
Verilog as cpu_xxx.v, together with a testbench for verifying the CPU,
tb_cpu.v, with which we tested the store, read, addition, and
subtraction functions.
Figure 16: Log file for CLA for New test Bench
Figure 17: Log file for CRA for New test Bench
Figure 18: Log file for CSA for New test Bench
Figure 19: Log file for CSeA for New test Bench
Figure 20: Simvision Waveform for CLA for New test Bench
Figure 22: Simvision Waveform for CSA for New test Bench
Figure 23: Simvision Waveform for CSeA for New test Bench
Operation (Post-Synthesis Gate-Level Delay)
2.10    2.11    2.19    1.2
AAAA_AAAA + 5555_5555
2.85    2.90    2.36    1.65
0000_00C8 + 0000_012C
2.45    2.76    2.13    1.70
5 + 0000_000A
2.69    2.59    2.18    1.69
FFFF_FFFF - 0000_0001
2.68    2.4     2.24    2.07
FFFF_FFFF + 0000_0001
2.55    2.32    2.28    1.85
5555_5555 - 5
2.13    2.35    1.89    1.64
AAAA_AAAB + 5555_5555
2.42    2.39    1.8     1.65
f0
A>B
B>A
A=B
The structure of the 32-bit comparator is shown in Fig. 11. The task is to
complete the Verilog code for this structure and include it in the ALU design. The
code should contain three modules: one_bit_comp, mux_4to2, and
tree_comp. The declaration of each module is provided in the file cpu_comp.v, and
they are listed in Fig. The code must be completed in order to finish the new CPU
design.
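One way to picture how the three modules could fit together is the behavioral Python sketch below. It is not the Verilog from cpu_comp.v, and several points are assumptions made for illustration: the output encoding (f1, f0) = (1, 0) for A>B, (0, 1) for B>A and (0, 0) for A=B, the role of mux_4to2 as a selector between the results of the upper and lower halves, and the recursive halving used to build the tree.

```python
def one_bit_comp(a: int, b: int):
    """Compare two single bits; returns (f1, f0).

    Assumed encoding: (1, 0) means a > b, (0, 1) means b > a,
    (0, 0) means a == b.
    """
    return (a & ~b & 1, ~a & b & 1)


def mux_4to2(hi, lo):
    """Take two (f1, f0) pairs and pass on one of them: the result of
    the more significant half decides unless it reports equality."""
    return hi if hi != (0, 0) else lo


def tree_comp(a: int, b: int, width: int = 32):
    """Tree comparator: split the operands in half, compare each half,
    and merge the two results with mux_4to2."""
    a &= (1 << width) - 1
    b &= (1 << width) - 1
    if width == 1:
        return one_bit_comp(a, b)
    half = width // 2
    low_mask = (1 << half) - 1
    hi = tree_comp(a >> half, b >> half, half)
    lo = tree_comp(a & low_mask, b & low_mask, half)
    return mux_4to2(hi, lo)
```

Under the assumed encoding, tree_comp(0x5555_5555, 0x0000_0001) returns (1, 0) and tree_comp(5, 5) returns (0, 0). A 32-bit tree of this shape has five merge levels above the one-bit comparators.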
7.3
After adding the 32-bit comparator to the ALU design, the new ALU looks like
Fig . Note that the two-bit output {f1, f0} must be extended to a 32-bit result.
Moreover, the original 4-to-1 MUX in the ALU is changed to a 5-to-1 MUX,
and its select signal OUTSEL is now 3 bits wide. The ALU design has already
been modified so that the work can focus on the comparator design.
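The extension and the wider output multiplexer can be sketched as follows. This Python model is illustrative only: zero extension of {f1, f0}, the order of the five multiplexer inputs, and the OUTSEL codes 0 to 4 are all assumptions, since the report does not give the actual encoding.

```python
MASK32 = 0xFFFF_FFFF  # 32-bit word width


def extend_cmp(f1: int, f0: int) -> int:
    """Zero-extend the two-bit result {f1, f0} to a 32-bit word
    (assumed: f1 in bit 1, f0 in bit 0, upper 30 bits zero)."""
    return ((f1 & 1) << 1) | (f0 & 1)


def output_mux(outsel: int, mul_lsb: int, mul_msb: int,
               adder: int, logic: int, cmp_word: int) -> int:
    """5-to-1 output multiplexer with a 3-bit select.

    The input order and select codes are hypothetical; codes 5 to 7
    are treated as unused in this sketch.
    """
    inputs = (mul_lsb, mul_msb, adder, logic, cmp_word)
    if not 0 <= outsel < len(inputs):
        raise ValueError("OUTSEL code not used in this sketch")
    return inputs[outsel] & MASK32
```

Three select bits are needed because five inputs no longer fit in the four codes of a 2-bit select.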
7.4
Code Implemented
A new testbench, tb_cpu, was created for the values and function given below.
The newly created testbench was run in the RTL simulator. It completed
successfully and produced error-free output.
CHAPTER 8: RESULT
After simulating the conditions given in the testbench, we obtained the following results in
SimVision.
Location    Operation    f1    f0    Expected
[8][10]     0000_0001    5555_5555    CMP    True
[4][2]      0000_0001    5555_5555    CMP    True
[3][7]      0000_000A    0000_012C    CMP    True