Ca Manual

It is a manual of the CA (COMPUTER ARCHITECTURE) For UET LAHORE Syllabus.

Uploaded by

Ahmad Bagri
Copyright © All Rights Reserved

COMPUTER ARCHITECTURE

Lab Manual

Submitted by:
2020-EE-509
2020 -EE-551

Submitted to:
Engr. Saira Arif

Department of Electrical Engineering


Rachna College of Engineering & Technology, Gujranwala
INSTRUCTOR: ENGR. SAIRA ARIF

NAME: - Ahmad Saadan & Ahmad Tariq

REGISTRATION NO.:- 2020-EE-509 & 2020-EE-551

LAB TITLE: INTRODUCTION TO XILINX AND VIVADO AND CIRCUIT DEVELOPMENT PLATFORM-BASYS-3

1 Circuit Development Platform-Basys-3

The Basys-3 is an entry-level FPGA development board designed exclusively for the Vivado® Design Suite,
featuring the Xilinx® Artix®-7 FPGA architecture. Basys-3 is the newest addition to the popular Basys line of
FPGA development boards for students and beginners just getting started with FPGA technology. The Basys-3
includes the standard features found on all Basys boards: complete ready-to-use hardware, a large collection
of on-board I/O devices, all required FPGA support circuits, and a free version of the development tools, all at a
student-level price point. The important components of the board are shown in Fig. 1.

2 Key Features and Benefits


➢ Pmod ports: 3 Standard 12-pin Pmod ports, 1 dual purpose XADC signal / standard Pmod port
➢ 4-digit 7-segment display
➢ 5 user pushbuttons
➢ 16 user LEDs
➢ 16 user switches
➢ USB HID Host for mice, keyboards and memory sticks
➢ 12-bit VGA output
➢ USB-UART Bridge
➢ Serial Flash
➢ Free WebPACK™ download for standard use.
➢ Designed Exclusively for Vivado Design Suite. Expanded features are available through purchase of
the Design Edition.
➢ Digilent USB-JTAG port for FPGA programming and communication
➢ On-chip analog-to-digital converter (XADC)
➢ Internal clock speeds exceeding 450 MHz
➢ 90 DSP slices
➢ Five clock management tiles, each with a phase-locked loop (PLL)
➢ 1,800 Kbits of fast block RAM
➢ 33,280 logic cells in 5200 slices (each slice contains four 6-input LUTs and 8 flip-flops)
➢ Features the Xilinx Artix-7 FPGA: XC7A35T-1CPG236C

Figure 1 Basys-3 Development Board

3 Combinational Circuit Design


In this lab we are going to implement the simple combinational circuit given in Fig. 2 on our FPGA. For this
purpose, we are going to write the hardware description of the circuit in Verilog. After that, we are going to
use Vivado to synthesize the design and program it onto the FPGA.

3.1 Overview of Vivado Design Suite


The Vivado Design Suite is software developed by Xilinx for the synthesis and implementation of HDL designs
such as the Verilog code used here. For this lab we will use the WebPACK edition of Vivado.

Figure 2 Simple logic circuit implementing y = a.b + c (AND gate a1 produces a.b; OR gate o1 combines it with c)


3.1.1 Creating a new project in Vivado

1. From Quick Start, click on Create Project. A New Project dialog box will appear as shown
in Fig. 3. Click on Next.

Figure 3 Project Dialog Box

2. Write Lab1 as the project name. Click on Next.
3. A Project Type dialog box will appear as shown in Fig. 4. Select RTL Project. Click on Next.

Fig. 4: Project Type Dialog Box

4. The Add Sources dialog box will appear. Click on Next (we will add the sources later). A
constraints file dialog box will appear. Click on Next.
5. A Default Part dialog box will appear. Select the same family, package and speed grade as
shown in Fig. 5. Then, from the parts shown, select xc7a100tcsg324-1 and click on Next.

Fig. 5: Default Board Dialog Box

6. The Project Summary will then appear. Click on Finish. The project will be initialized and the
window shown in Fig. 6 will appear.

Fig. 6: Vivado Design Suite Window

3.1.2 Creating a new Verilog File on Vivado


3.1.2.1 Click on Add Sources in the Project Manager bar, as shown in Fig. 6. A dialog
box will appear. Select Add or create design sources and click on Next.
3.1.2.2 The Add Sources dialog box will appear. Click on Create File. In the dialog box,
write the file name and click on OK and then on Finish. When the Define Module dialog box
appears, just click on OK.
3.1.2.3 In the Sources window, click on Design Sources and then on Lab1. The Lab1.v file will
open in Vivado. Now write the code shown in Listing 1.
module Lab1 (output y,
             input a, b, c
            );
  wire o;               // internal net carrying a AND b
  and a1 (o, a, b);     // AND gate a1: o = a.b
  or  o1 (y, o, c);     // OR gate o1:  y = o + c
endmodule

Listing 1: Writing Verilog Code on Vivado.

3.1.2.4 Click on Schematic to check your gate level design as shown in Fig. 7.
Fig. 7: Vivado Gate Level Schematic

3.1.3 Assigning Package Pins


There are two methods for assigning package pins which are given as follows:

3.1.3.1 Using I/O Planning


3.1.3.1.1 In the Project Manager bar, click on Open Elaborated Design. A dialog box will
appear; click on OK. The I/O Ports window will open as shown in Fig. 8 (if the I/O Ports
window does not appear, go to Layout and select I/O Planning). Assign every I/O
port the I/O standard LVCMOS33 and the same package pins as shown
in Fig. 8. Then save these constraints by pressing CTRL+S.
Fig. 8: Assigning Package pins on Vivado

3.1.3.2 Using Constraints File


When we save the constraints in Section 3.1.3.1 above, we are indirectly creating a constraints file. We
can also create that file directly using the following method:
3.1.3.2.1 Click on Add Sources in the Project Manager, select Add or create constraints,
and then select Create File. The constraints file will be created.
3.1.3.2.2 Type the code shown in Listing 2 and then save the file. The pins will
be assigned in the same manner as in Section 3.1.3.1.
set_property -dict { PACKAGE_PIN M13 IOSTANDARD LVCMOS33 } [get_ports a];
set_property -dict { PACKAGE_PIN L16 IOSTANDARD LVCMOS33 } [get_ports b];
set_property -dict { PACKAGE_PIN J15 IOSTANDARD LVCMOS33 } [get_ports c];
set_property -dict { PACKAGE_PIN M17 IOSTANDARD LVCMOS33 } [get_ports y];

Listing 2: Assigning Package pins using constraints file.

3.1.4 Synthesizing and Implementing the Design


Under the Project Manager bar, click on Run Synthesis. A dialog box will appear; click on OK. After the
synthesis is complete, click on Run Implementation and confirm the dialog box in the same way.

3.1.5 Generating the bit file and Programming the FPGA


3.1.5.1 Under Program and Debug, click on Generate Bitstream. When the bitstream is
generated, connect the FPGA to your computer with the cable. Then, under Open
Hardware Manager, right click on Open Target and then click on Auto Connect, as shown in
Fig. 9.
Fig. 9: Connecting FPGA to the computer.

3.1.5.2 Vivado will take a few seconds to connect to the FPGA. Once connected, click on Program Device.
A dialog box will appear; click on Program to program your FPGA (Fig. 10).
The Verilog design will then be implemented on the FPGA. To check the behavior of the LED, make
the truth table of a.b+c and check all the possible combinations of inputs.

Fig. 10: Programming the Bit stream on the FPGA.
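
As a cross-check for the board test, the expected LED state for every switch combination can be tabulated in software first. The following is a minimal Python sketch and not part of the Vivado flow; the function name lab1_output is an illustrative assumption.

```python
from itertools import product

def lab1_output(a, b, c):
    """Software model of the Lab1 circuit: AND gate a1 followed by OR gate o1."""
    o = a & b        # output of the AND gate
    return o | c     # output of the OR gate, y = a.b + c

# Inputs are enumerated in binary counting order: (0,0,0), (0,0,1), ..., (1,1,1)
truth_table = [(a, b, c, lab1_output(a, b, c)) for a, b, c in product((0, 1), repeat=3)]

if __name__ == "__main__":
    print("a b c | y")
    for a, b, c, y in truth_table:
        print(f"{a} {b} {c} | {y}")
```

The LED should be lit whenever c is 1, or when a and b are both 1.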


3.1.6 Maximum Combinational Delay
Click on Open Synthesized Design. Click on Report timing Summary. A dialog box will appear,
click on Ok. A timing box will open as shown in Fig. 11.

Fig. 11: Timing box at the bottom of the screen.

Scroll down in the box, click on Unconstrained Paths, then None to None, and then Setup. The window
will show the longest path along with its name. In this example, the maximum delay is reported for the path
from a[0] to c[1], with a delay of 6.778 ns, as shown in Fig. 12.

Fig. 12: Maximum delay for a[0] to c[1] with the delay of 6.778ns.

3.1.7 Device Utilization Summary


Click on Reports in the console tab. Select the synth report utilization as shown below. This will give
you the device utilization report.

Fig. 13: Device Utilization Summary.


3.2 Tasks
Implement the circuit shown in Fig. 14 on the FPGA and develop its truth table.

Fig. 14: Circuit to be implemented.

TRUTH TABLE:
a b | y
0 0 | 1
0 1 | 1
1 0 | 1
1 1 | 1

CODE:
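module gatelevel(output y, input a, b);
  wire c, d, e, f;        // internal nets
  assign c = a & b;
  assign d = a ^ b;
  assign e = a & b;
  assign f = ~(d & e);    // d and e are never 1 together, so f is always 1
  assign y = c | f;       // hence y = 1 for every input combination
endmodule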
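
The all-ones truth table can be confirmed independently of the simulator. The sketch below is a plain Python stand-in for the module above (the Python function gatelevel only mirrors the assign statements and is not generated from the Verilog); it evaluates the circuit for all four input pairs.

```python
def gatelevel(a, b):
    """Bit-level model of the task circuit, one line per assign statement."""
    c = a & b
    d = a ^ b
    e = a & b
    f = 1 - (d & e)   # 1-bit NOT of (d AND e)
    return c | f

# Map every input pair (a, b) to the resulting output y
outputs = {(a, b): gatelevel(a, b) for a in (0, 1) for b in (0, 1)}

if __name__ == "__main__":
    for (a, b), y in outputs.items():
        print(a, b, y)
```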

SCHEMATIC:
TESTBENCH:
module gl_tb();
reg a;
reg b;
wire y;
gatelevel uut ( .a(a),.b(b),.y(y));
initial begin
#10 a=1'b0;b=1'b0;
#10 a=1'b0;b=1'b1;
#10 a=1'b1;b=1'b0;
#10 a=1'b1;b=1'b1;
#10 $stop;
end
endmodule

SIMULATION:

LAB TITLE: REVISION OF VERILOG HDL


TOOLS

➢ Basys-3 100T FPGA Board

➢ Xilinx Vivado 2019.2

LAB TASKS

This introductory lab covers the installation and use of the tools and a brief revision of Verilog HDL. The lab is
divided into two parts. In the first part, a tutorial file is provided for installing Xilinx Vivado 2019.2 on your
system; the second part consists of Verilog-based tasks to be performed in Vivado.

DELIVERABLES

➢ Prepare a report for everything that you have done in “Lab Tasks”. Explain all the steps and observations in the report.

TOOLS INSTALLATION

1 Xilinx Vivado Setup can be downloaded from the following URL:


https://getintopc.com/softwares/design/xilinx-vivado-design-suite-2018-free-download/

2 You can use the manual at this link to install Vivado in your system.

3 The tutorial on how to use Xilinx Vivado is provided at this link.

PROBLEM SET

HALF ADDER

➢ A half adder is a combinational arithmetic circuit that adds two bits and produces a sum bit (S) and a carry bit (C) as
the output. If A and B are the input bits, then the sum bit (S) is the XOR of A and B and the carry bit (C) is the AND of
A and B. The half adder is the simplest of all adder circuits, but it has a major disadvantage: it can add only
two input bits (A and B) and has no input for a carry coming from a previous stage. Its truth table, module schematic,
and gate-level realization are shown in Figure 4.

➢ You are requested to write Verilog code for the circuit using structural modeling (using primitives such as AND, OR,
etc., gates). Your inputs are two 2-bit numbers.

➢ Synthesize the circuit for the Basys-3 100T FPGA Board and simulate it by writing a test bench module for the half adder
covering all possible input combinations.

➢ Read the synthesis report of your circuit and extract the useful information related to maximum combinational delay,
resources in the FPGA used (like lookup tables (LUTs), input/output (IOs), etc.), timing information, power usage, etc.
Report this information in your lab report.

Figure 4 Schematic and Truth Table of Half Adder

CODE:
module half_adder(a, b, sum, carry);
input a;
input b;
output sum;
output carry;
assign carry=a&b;
assign sum=a^b;
endmodule
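
To check the simulation waveform against the expected values, the half adder can also be modeled in a few lines of software. This is a minimal Python sketch (the names half_adder and half_adder_table are illustrative assumptions); each row is (a, b, sum, carry).

```python
def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs: sum = a XOR b, carry = a AND b."""
    return a ^ b, a & b

# One row per input combination, in the same order the test bench applies them
half_adder_table = [(a, b) + half_adder(a, b) for a in (0, 1) for b in (0, 1)]

if __name__ == "__main__":
    print("a b | sum carry")
    for a, b, s, c in half_adder_table:
        print(f"{a} {b} |  {s}    {c}")
```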

SCHEMATIC:
TEST BENCH:
module half_adder_tb();
reg a;
reg b;
wire sum;
wire carry;
half_adder uut ( .a(a),.b(b),.sum(sum), .carry(carry));
initial begin
#10 a=1'b0;b=1'b0;
#10 a=1'b0;b=1'b1;
#10 a=1'b1;b=1'b0;
#10 a=1'b1;b=1'b1;
#10 $stop;
end
endmodule
SIMULATION:

FULL ADDER
➢ The main difference between a half adder and a full adder is that the full adder has three inputs and two outputs. The
first two inputs are A and B and the third input is an input carry designated as CIN. Its truth table, module schematic,
and gate-level realization are shown in Figure 5.
➢ You are requested to write Verilog code for the circuit using the half adder module which you developed in Problem
1. Your inputs (A & B) are two 2-bit numbers, while the carry CIN is also 2 bits wide but its MSB is zero.
➢ Synthesize the circuit for the Basys-3 100T FPGA Board and simulate it by writing a test bench module for the full adder
covering all possible input combinations.
➢ Read the synthesis report of your circuit and extract the useful information related to maximum combinational delay,
resources in the FPGA used (like lookup tables (LUTs), input/output (IOs), etc.), timing information, power usage, etc.
Report this information in your lab report.

Figure 5 Schematic and Truth Table of Full Adder

CODE:
module full_adder(a, b, c, sum, carry);
input a;
input b;
input c;
output sum;
output carry;
wire d,e,f;
xor(sum,a,b,c);
and(d,a,b);
and(e,b,c);
and(f,a,c);
or(carry,d,e,f);
endmodule
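
The problem statement asks for a full adder built from the half adder of Problem 1, whereas the listing above uses gate primitives directly. The Python sketch below illustrates the two-half-adder composition and checks that it matches the XOR/majority form used in the listing. It is a software illustration only; the function names are assumptions.

```python
def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b

def full_adder(a, b, cin):
    """Full adder composed of two half adders and an OR gate."""
    s1, c1 = half_adder(a, b)       # first half adder adds a and b
    s, c2 = half_adder(s1, cin)     # second half adder adds the carry-in
    return s, c1 | c2               # the two carries can never both be 1

def full_adder_gates(a, b, c):
    """Same function in the form of the listing: 3-input XOR and majority carry."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)

bits = (0, 1)
full_adder_table = [(a, b, c) + full_adder(a, b, c) for a in bits for b in bits for c in bits]

if __name__ == "__main__":
    for row in full_adder_table:
        print(*row)
```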
SCHEMATIC:

TEST BENCH:
module full_adder_tb();
reg a;
reg b;
reg c;
wire sum;
wire carry;
full_adder uut ( .a(a), .b(b),.c(c),.sum(sum),.carry(carry) );
initial begin
#10 a=1'b0;b=1'b0;c=1'b0;
#10 a=1'b0;b=1'b0;c=1'b1;
#10 a=1'b0;b=1'b1;c=1'b0;
#10 a=1'b0;b=1'b1;c=1'b1;
#10 a=1'b1;b=1'b0;c=1'b0;
#10 a=1'b1;b=1'b0;c=1'b1;
#10 a=1'b1;b=1'b1;c=1'b0;
#10 a=1'b1;b=1'b1;c=1'b1;
#10 $stop;
end
endmodule

SIMULATION:

4-BIT BINARY MULTIPLIER


➢ A binary multiplication is equal to the sum of its partial products. The partial products are
illustrated by the solved example shown in Figure 6.
➢ You are requested to propose a circuit for such a 4-bit multiplier and write Verilog code for the circuit using
any modeling style (primitives, dataflow, or behavioral modeling).
➢ Synthesize the circuit for the Basys-3 100T FPGA Board and Simulate it by writing a test bench module for any
three (3) sample inputs.
➢ Read the synthesis report of your circuit and extract the useful information related to maximum combinational
delay, resources in the FPGA used (like lookup tables (LUTs), input/output (IOs), etc.), timing information,
power usage, etc. Report this information in your lab report.
Figure 6 Solved example of 4-bit multiplier (by partial product method)

CODE:
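module binary_multiplier (input [3:0] a, input [3:0] b, output reg [7:0] p);
  integer i;            // loop index (the loop is unrolled during synthesis)
  reg [7:0] temp;       // running sum of the partial products
  always @* begin
    temp = 8'd0;
    for (i = 0; i <= 3; i = i + 1) begin
      // add the partial product (a shifted left by i) when bit i of b is set
      if (b[i] == 1) temp = temp + ({4'b0000, a} << i);
    end
    p = temp;
  end
endmodule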
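
The partial product method used by the module can be checked exhaustively in software, since a 4-bit multiplier has only 256 input pairs. The following Python sketch mirrors the shift-and-add loop; the function name shift_add_multiply and the width parameter are illustrative assumptions.

```python
def shift_add_multiply(a, b, width=4):
    """Multiply two unsigned width-bit numbers by summing partial products."""
    product = 0
    for i in range(width):
        if (b >> i) & 1:          # bit i of the multiplier is set
            product += a << i     # add the multiplicand shifted left by i
    return product & ((1 << (2 * width)) - 1)   # result fits in 2*width bits

if __name__ == "__main__":
    # The four test bench cases from the lab
    for a, b in [(0b0001, 0b0010), (0b0010, 0b0001), (0b0011, 0b0010), (0b0100, 0b1000)]:
        print(f"{a} x {b} = {shift_add_multiply(a, b)}")
```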
TESTBENCH:
module binary_multiplier_tb;
reg [3:0] a;
reg [3:0] b;
wire [7:0] p;
binary_multiplier UUT (a, b, p);
initial begin
// Test cases
a = 4'b0001; b = 4'b0010; #10;
a = 4'b0010; b = 4'b0001; #10;
a = 4'b0011; b = 4'b0010; #10;
a = 4'b0100; b = 4'b1000; #10;
// End of test
$finish;
end
endmodule

SIMULATION:
Lab Manual

3 EE475L: Computer Architecture

Department of Electrical Engineering


University of Engineering and Technology Lahore
Instructor: Engr. Saira Arif

Name Ahmad Saadan & Ahmad Tariq


Registration No. 2020-EE-509 |2020-EE-551

Lab Title: Finite State Machines

Tools

• Nexys A7 100T FPGA Board


• Xilinx Vivado 2019.2

Introduction

This is the introductory lab which will cover the concepts of Finite State Machines, Datapath and Controller design.

Deliverables

• Read the Tutorial provided with the manual handout (also available here). Complete the
lab and prepare a report for everything that you have done in “Problem Set”. Explain all
the steps and observations in the report.

Problem Set
Bubble sort is an algorithm to sort a list of numbers in ascending or descending order. Following is the algorithm of
bubble sort in C language (adapted from this link):
#include <stdio.h>

int main(void)
{
    /* n = 4 elements, matching the four assignments below */
    int array[4], n = 4, c, d, swap;
    // Initial assignment of unsorted numbers
    array[0] = 1; array[1] = 6; array[2] = 2; array[3] = 9;

    for (c = 0; c < (n - 1); c++)
    {
        for (d = 0; d < (n - c - 1); d++)
        {
            if (array[d] > array[d+1]) /* For decreasing order use < */
            {
                swap       = array[d];
                array[d]   = array[d+1];
                array[d+1] = swap;
            }
        }
    }

    printf("Sorted list in ascending order:\n");

    for (c = 0; c < n; c++) printf("%d\n", array[c]);

    return 0;
}

Listing 1: Code for bubble sort, can be seen in action on http://goo.gl/h6ij6d
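
For a datapath and controller design, it can help to see the same algorithm stepped one state per clock cycle. The Python sketch below is only an illustration under assumed names: the states COMPARE, SWAP, NEXT and DONE and the function bubble_sort_fsm are hypothetical and are not taken from the tutorial. The list and the counters c and d play the role of the datapath registers, and the state variable plays the role of the controller.

```python
def bubble_sort_fsm(values):
    """Sort a list by stepping an explicit state machine, one state per clock cycle."""
    array = list(values)          # datapath: register array
    n = len(array)
    c = d = 0                     # datapath: outer and inner loop counters
    state = "COMPARE" if n > 1 else "DONE"
    cycles = 0
    while state != "DONE":
        cycles += 1
        if state == "COMPARE":    # comparator output selects the next state
            state = "SWAP" if array[d] > array[d + 1] else "NEXT"
        elif state == "SWAP":     # exchange the two neighbouring registers
            array[d], array[d + 1] = array[d + 1], array[d]
            state = "NEXT"
        elif state == "NEXT":     # update the loop counters
            if d + 1 < n - c - 1:         # inner loop continues
                d += 1
                state = "COMPARE"
            elif c + 1 < n - 1:           # start the next outer pass
                c += 1
                d = 0
                state = "COMPARE"
            else:
                state = "DONE"
    return array, cycles

if __name__ == "__main__":
    print(bubble_sort_fsm([1, 6, 2, 9]))
```

Counting the cycles returned by the sketch gives a rough idea of how many clock cycles a hardware implementation with one state per step would need.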
Code

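module shift_register (
  input clk, rst, en,          // clock, reset, and enable inputs
  input [7:0] data_in,         // 8-bit parallel data input
  output reg serial_out        // serial data output
);

  reg [7:0] reg_data;          // internal 8-bit register
  reg loaded;                  // set once data_in has been captured

  // Load and shift must happen in different clock cycles: two non-blocking
  // assignments to reg_data in the same cycle would keep only the last one.
  always @(posedge clk) begin
    if (rst) begin
      reg_data   <= 8'b0;      // reset the register to all zeros
      serial_out <= 1'b0;      // reset the output to zero
      loaded     <= 1'b0;
    end else if (!en) begin
      loaded     <= 1'b0;      // re-arm: the next enabled cycle loads again
    end else if (!loaded) begin
      reg_data   <= data_in;   // first enabled cycle: load the parallel input data
      loaded     <= 1'b1;
    end else begin
      serial_out <= reg_data[7];              // output the MSB of the register
      reg_data   <= {reg_data[6:0], 1'b0};    // shift the register contents left by one bit
    end
  end

endmodule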

Testbench

module shift_register_tb;
  reg clk, rst, en;
  reg [7:0] data_in;
  wire serial_out;
  reg [7:0] expected;          // pattern expected on serial_out, MSB first
  integer k;

  shift_register dut (
    .clk(clk),
    .rst(rst),
    .en(en),
    .data_in(data_in),
    .serial_out(serial_out)
  );

  always #5 clk = ~clk;        // toggle clock every 5 time units

  initial begin
    clk = 0;
    rst = 1;
    en = 0;
    data_in = 8'b0;
    #12 rst = 0;               // deassert reset between clock edges

    // Test 1: Load data and shift. Inputs change on the falling edge so
    // that they are stable at the rising edge sampled by the design.
    @(negedge clk);
    en = 1;
    data_in = 8'b10101010;
    expected = 8'b10101010;
    @(negedge clk);            // first enabled rising edge: data is expected to be loaded

    for (k = 7; k >= 0; k = k - 1) begin
      @(negedge clk);          // one bit is expected per clock, MSB first
      if (serial_out !== expected[k]) begin
        $display("Test 1 failed! Bit %0d: expected %b, actual %b", k, expected[k], serial_out);
        $finish;
      end
    end

    // Test complete
    $display("All tests passed!");
    $finish;
  end
endmodule

Schematic

Simulation

LAB 4: DIGITAL SYSTEMS DESIGN


CODE:
module uart_module (
    input clk,              // system clock
    input rst_n,            // active low reset
    input [7:0] data_in,    // data input from switches
    input load,             // load push button
    input transmit,         // transmit push button
    output reg tx_out       // UART output
);

    // Baud rate divisor for 9600 baud
    // (1042 clock cycles per bit corresponds to a 10 MHz clock: 10,000,000 / 9600 is about 1042)
    parameter BAUD_DIV = 1042;

    // Internal registers
    reg [7:0] data_reg;
    reg load_reg;
    reg transmit_reg;
    reg [10:0] baud_counter;    // 11 bits: BAUD_DIV = 1042 does not fit in 10 bits
    reg tx_reg;

    // Datapath
    always @ (posedge clk or negedge rst_n) begin
        if (~rst_n) begin
            data_reg <= 8'h00;
            load_reg <= 1'b0;
            transmit_reg <= 1'b0;
            baud_counter <= 11'h000;
            tx_reg <= 1'b1;
            tx_out <= 1'b1;     // the UART line idles high
        end else begin
            // Capture the switch value while the load button is pressed
            if (load) begin
                data_reg <= data_in;
                load_reg <= 1'b1;
            end else begin
                load_reg <= 1'b0;
            end

            // Remember that a transmission was requested
            if (transmit) begin
                transmit_reg <= 1'b1;
            end

            // Baud tick: flag it by clearing tx_reg when the counter reaches BAUD_DIV
            if (baud_counter == BAUD_DIV) begin
                baud_counter <= 11'h000;
                tx_reg <= 1'b0;
            end else begin
                baud_counter <= baud_counter + 1;
            end

            // Drive the output low for one clock after a baud tick while transmitting
            if (tx_reg == 1'b0 && transmit_reg) begin
                tx_out <= 1'b0;
                tx_reg <= 1'b1;
            end else begin
                tx_out <= 1'b1;
            end
        end
    end
endmodule
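
BAUD_DIV is the number of system clock cycles per UART bit, so it depends on the clock frequency, which the listing does not state. The Python sketch below shows the arithmetic; the clock frequencies used are assumptions for illustration (1042 corresponds to a 10 MHz clock), and the helper names baud_divisor and counter_bits are hypothetical.

```python
def baud_divisor(clk_hz, baud):
    """Clock cycles per bit, rounded to the nearest integer."""
    return (clk_hz + baud // 2) // baud

def counter_bits(max_count):
    """Minimum register width (in bits) able to hold max_count."""
    return max(1, max_count.bit_length())

if __name__ == "__main__":
    for clk_hz in (10_000_000, 100_000_000):   # assumed clock frequencies
        div = baud_divisor(clk_hz, 9600)
        print(f"{clk_hz} Hz: divisor {div}, counter needs {counter_bits(div)} bits")
```

A counter that has to reach 1042 needs at least 11 bits, which is worth checking against the declared width of baud_counter whenever BAUD_DIV is changed.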

TEST BENCH:
module uart_module_tb();

reg clk;
reg rst_n;
reg [7:0] data_in;
reg load;
reg transmit;
wire tx_out;

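// A small divisor is used in simulation only, so that baud ticks occur within the short run
uart_module #(.BAUD_DIV(4)) uart_inst (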
.clk(clk),
.rst_n(rst_n),
.data_in(data_in),
.load(load),
.transmit(transmit),
.tx_out(tx_out)
);

initial begin
clk = 1'b0;
rst_n = 1'b0;
data_in = 8'h00;
load = 1'b0;
transmit = 1'b0;
#100 rst_n = 1'b1;
#100;
// Load data
data_in = 8'h55;
load = 1'b1;
#100 load = 1'b0;
// Transmit data
transmit = 1'b1;
#100 transmit = 1'b0;
// Wait for transmission
#100;
// End simulation
$finish;
end

always #5 clk = ~clk;

endmodule

SCHEMATIC:

SIMULATION:
Shift Register:

Lab Manual

5 EE475L: Computer Architecture

Department of Electrical Engineering


University of Engineering and Technology Lahore
Instructor: Engr. Saira Arif

Name Ahmad Saadan & Ahmad Tariq


Registration Number 20-EE-509 20-EE-551

Lab Title: Single Cycle RISC-V Processor (Phase-I)


Lab Resources

The Lab Resources will be available at the following link. All reference books, related tutorials, and assessment rubrics
will be updated here: Resources EE475

Tools

• Nexys A7 100T FPGA Board


• Xilinx Vivado 2019.2

Deliverables

Implement a single cycle RV processor that supports all instructions of the RISC-V ISA. The processor must have a fetch
unit, decode logic, functional units, a register file, I/O support and access to memory. You will be implementing the
datapath while designing a single cycle RISC-V processor, which works in five main stages:
• Instruction Fetch
• Operand Fetch
• Execution
• Memory Access
• Write back the result

It must contain a file that will be used as Random Access Memory. The memory is 8 bits wide, but the processor
accesses 32 bits (4 bytes) per operation. The processor has 32 general-purpose registers, while one special-purpose
register (the Program Counter) holds the address of the instruction. Each register must be 32 bits
wide.
Figure 1 Instruction Formats for four different classes of Instruction

The S-type instruction format is used for store instructions. The register rs1 is the base register that is added to the 12-bit
immediate field to form the memory address. (The immediate field is split into a 7-bit piece and a 5-bit piece.) Field
rs2 is the source register whose value should be stored into memory. The SB-type instruction format is used for conditional
branches. The registers rs1 and rs2 are compared. The 12-bit immediate address field is sign-extended, shifted left 1 bit,
and added to the PC to compute the branch target address.
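
The way the immediate is reassembled from its pieces can be sketched in software before writing the Sign_Extend logic. The Python functions below are an illustration only (the function names are assumptions); the bit positions follow the standard RV32I S-type and SB-type encodings, and the two sample instruction words were assembled by hand for this example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit

def s_type_imm(inst):
    """Store offset: imm[11:5] = inst[31:25], imm[4:0] = inst[11:7]."""
    imm = (((inst >> 25) & 0x7F) << 5) | ((inst >> 7) & 0x1F)
    return sign_extend(imm, 12)

def sb_type_offset(inst):
    """Branch offset: the 12-bit field holds imm[12:1]; it is sign-extended and shifted left by 1."""
    field = ((((inst >> 31) & 0x1) << 11)    # imm[12]
             | (((inst >> 7) & 0x1) << 10)   # imm[11]
             | (((inst >> 25) & 0x3F) << 4)  # imm[10:5]
             | ((inst >> 8) & 0xF))          # imm[4:1]
    return sign_extend(field, 12) << 1

if __name__ == "__main__":
    print(s_type_imm(0xFE112E23))      # sw x1, -4(x2)
    print(sb_type_offset(0xFE000EE3))  # beq x0, x0, -4
```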

Datapath and Controller Diagram

The datapath with all necessary multiplexers and all control lines identified is shown in Figure 2. The control lines
are shown in color. The ALU control block has also been added, which depends on the funct3 field and part of the
funct7 field. The complete diagram of the datapath with the controller is shown in Figure 3. The inputs to the control
unit are the 7-bit opcode field and the 3-bit funct3 and 7-bit funct7 fields from the instruction. The outputs of the control unit
consist of two 1-bit signals that are used to control multiplexers (ALUSrc and MemtoReg), three signals for controlling
reads and writes in the register file and data memory (RegWrite, MemRead, and MemWrite), a 1-bit signal used
in determining whether to possibly branch (Branch), and a 4-bit control signal for the ALU (ALUOp). An AND gate
is used to combine the Branch control signal and the Zero output from the ALU; the AND gate output controls the
selection of the next PC.
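
The main control signals can be written down as a lookup table indexed by the opcode. The Python sketch below is a simplified illustration, not the lab's controller: it covers only four instruction classes, omits the ALU control signal, and sets don't-care entries to 0. The names CONTROL_TABLE, control_unit, pc_src and next_pc are assumptions.

```python
# Opcode (instruction bits 6:0) -> 1-bit control signals
CONTROL_TABLE = {
    0b0110011: dict(ALUSrc=0, MemtoReg=0, RegWrite=1, MemRead=0, MemWrite=0, Branch=0),  # R-type
    0b0000011: dict(ALUSrc=1, MemtoReg=1, RegWrite=1, MemRead=1, MemWrite=0, Branch=0),  # load
    0b0100011: dict(ALUSrc=1, MemtoReg=0, RegWrite=0, MemRead=0, MemWrite=1, Branch=0),  # store
    0b1100011: dict(ALUSrc=0, MemtoReg=0, RegWrite=0, MemRead=0, MemWrite=0, Branch=1),  # branch
}

def control_unit(instruction):
    """Look up the control signals from the 7-bit opcode field."""
    return CONTROL_TABLE[instruction & 0x7F]

def pc_src(branch, zero):
    """The AND gate: take the branch only if Branch is set and the ALU reports Zero."""
    return branch & zero

def next_pc(pc, branch_offset, branch, zero):
    """PC multiplexer: branch target when pc_src is 1, otherwise PC + 4."""
    return pc + branch_offset if pc_src(branch, zero) else pc + 4

if __name__ == "__main__":
    print(control_unit(0x00112023))   # sw x1, 0(x2)
    print(next_pc(8, -4, 1, 1))       # taken branch
```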

Figure 2 Datapath of Single Cycle RISC-V Processor

CODE

This section describes the Verilog module for a single-cycle implementation of a RISC-V CPU. The module contains several
sub-modules, each implementing a specific part of the CPU functionality.
The main module, Single_Cycle_Top, has two input ports: clk and rst. The output ports are:
• PC_Top: the current value of the program counter
• RD_Instr: the current instruction being executed, read from the instruction memory
• RD1_Top: the value of the first source register read from the register file
• Imm_Ext_Top: the immediate value, sign-extended to 32 bits
• ALUResult: the result of the ALU operation
• ReadData: the data read from the data memory
• PCPlus4: the value of the program counter incremented by 4
• RD2_Top: the value of the second source register read from the register file or the data read from
the data memory
• SrcB: the second operand of the ALU, selected from either RD2_Top or Imm_Ext_Top depending
on the value of the ALUSrc control signal
• Result: the result of the instruction, written back to the register file
• RegWrite: a control signal that enables register write
• MemWrite: a control signal that enables data memory write
• ALUSrc: a control signal that selects the second operand of the ALU
• ResultSrc: a control signal that selects the result to be written back to the register file
• ImmSrc: a control signal that selects the source of the immediate value
• ALUControl_Top: the control signal for the ALU operation
The main module instantiates the following sub-modules:
• PC_Module: implements the program counter
• PC_Adder: adds 4 to the current value of the program counter to get the next instruction address
• Instruction_Memory: implements the instruction memory
• Register_File: implements the register file
• Sign_Extend: sign-extends the immediate value to 32 bits
• Mux_Register_to_ALU: selects the second operand of the ALU
• ALU: implements the ALU operation
• Control_Unit_Top: generates the control signals for the CPU
• Data_Memory: implements the data memory
• Mux_DataMemory_to_Register: selects the result to be written back to the register file.
Overall, this Verilog module represents a basic implementation of a RISC-V CPU using the single-cycle approach.
TEST BENCH
A test bench for a single-cycle RISC processor would typically involve the following steps:
1. Load an assembly program into the instruction memory of the processor.
2. Set the inputs to the processor such as reset, clock, and any necessary input signals for the program.
3. Run the clock for a number of cycles, allowing the processor to execute the program.
4. Monitor the outputs of the processor, including the values of the registers and any output signals.
5. Compare the expected output values with the actual output values to verify the correctness of the
processor implementation.
The test bench would need to cover a wide range of test cases to ensure that the processor implementation is
correct and robust. Test cases could include various combinations of instructions, different data values,
and edge cases such as overflow conditions or branching. The test bench would need to be carefully
designed to ensure that it thoroughly tests the processor and detects any issues that may arise.

SCHEMATIC

SIMULATION

OPEN ENDED LAB 6 : SINGLE CYCLE RISC-V PROCESSOR


INTRODUCTION:
The data path of a single cycle RISC-V processor is the central hardware unit responsible for
executing instructions and managing data. The name "single cycle" comes from the fact that the
entire instruction execution process completes within one clock cycle.

In this architecture, each instruction is fetched from memory, decoded, and
executed within a single clock cycle. This contrasts with multi-cycle data paths and offers simplicity with a
fixed cycle time. However, the fixed cycle time must be long enough for the slowest instruction, which limits the
processor's clock speed and, consequently, the system's overall performance.

The single cycle RISC-V data path comprises the Instruction Memory (IM), Program Counter (PC), Instruction
Register (IR), Register File (RF), Arithmetic Logic Unit (ALU), Data Memory (DM), and Control Unit (CU),
which together carry out instruction execution and data manipulation.

Execution of an instruction proceeds through a series of sequential steps: instruction fetch, decode, operand
fetch, execute, memory access, and write-back. All of these steps take place within a single clock
cycle.

Despite its simplicity, the single cycle RISC-V data path is a useful starting point for the study of
microprocessors and digital systems. Its uncomplicated design makes it easy to understand, implement,
and verify.

The data path of a single cycle RISC-V processor consists of the following components:

➢ Instruction memory (IM): This component is responsible for storing the instructions that are
fetched by the processor. The instruction memory is typically implemented using SRAM or
DRAM.
➢ Program Counter (PC): This component is responsible for storing the address of the next
instruction to be fetched. The PC is incremented by 4 after each instruction fetch.
➢ Instruction Register (IR): This component is responsible for holding the current instruction that
is being executed by the processor. The instruction register is loaded with the instruction fetched
from the instruction memory.
➢ Register File (RF): This component is responsible for storing the data values that are used by the
processor. The register file typically has 32 general-purpose registers, each of which is 32 bits
wide.
➢ ALU (Arithmetic Logic Unit): This component is responsible for performing arithmetic and logical
operations on the data values stored in the register file. The ALU takes two input values and
produces a single output value.
➢ Data Memory (DM): This component is responsible for storing the data values that are used by
the processor. The data memory is typically implemented using SRAM or DRAM.
➢ Control Unit (CU): This component is responsible for controlling the operation of the processor.
The control unit generates control signals that are used to control the other components of the
data path.
➢ MUX (Multiplexer): It selects between different inputs to provide the required data or control
signals to different components of the data path.
➢ Sign Extend: It extends the sign bit of an immediate value to 32 bits.
➢ Shift Left 1: It shifts the value of a register or immediate value to the left by one bit.

The data path of a single cycle RISC-V processor operates as follows:

➢ Instruction fetch: The processor fetches the instruction from the instruction memory by reading
the instruction at the address stored in the program counter (PC). The PC is incremented by 4 to
point to the next instruction.
➢ Instruction decode: The processor decodes the instruction by examining the opcode and
operands of the instruction.
➢ Operand fetch: The processor fetches the operands of the instruction from the register file or
the data memory.
➢ Execute: The processor performs the operation specified by the instruction using the ALU.
➢ Memory access: If the instruction requires a memory access, the processor accesses the data
memory to read or write the data.
➢ Write back: The processor writes the result of the operation back to the register file.
➢ Control unit: The control unit generates control signals that are used to control the operation of
the data path components.
➢ PC Update: The processor updates the PC with the address of the next instruction.

This completes one cycle of the single cycle RISC-V data path, and the processor proceeds to fetch the
next instruction from the instruction memory.

Note that the single cycle RISC-V data path is simple and straightforward, but it has a long cycle time because
every instruction must pass through all of the steps above within one clock period. This limits the clock speed
of the processor, and hence the overall performance of the system.
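
The sequence of steps described above can be mimicked by a very small software model in which one call to step corresponds to one clock cycle. This Python sketch is an illustration only: instructions are written as tuples rather than 32-bit encodings, only five instructions are supported, and all names (step, run, demo_program) are assumptions rather than parts of the lab design.

```python
def step(state, program):
    """Execute one instruction, i.e. one clock cycle of a single cycle data path."""
    regs, mem, pc = state["regs"], state["mem"], state["pc"]
    op, *args = program[pc // 4]              # instruction fetch (4 bytes per instruction)
    next_pc = pc + 4                          # default PC update
    rd, result = None, None
    if op == "add":                           # decode, operand fetch, execute
        rd, rs1, rs2 = args
        result = (regs[rs1] + regs[rs2]) & 0xFFFFFFFF
    elif op == "addi":
        rd, rs1, imm = args
        result = (regs[rs1] + imm) & 0xFFFFFFFF
    elif op == "lw":                          # memory access (read)
        rd, rs1, imm = args
        result = mem.get(regs[rs1] + imm, 0)
    elif op == "sw":                          # memory access (write)
        rs2, rs1, imm = args
        mem[regs[rs1] + imm] = regs[rs2]
    elif op == "beq":                         # branch is taken when the operands are equal
        rs1, rs2, offset = args
        if regs[rs1] == regs[rs2]:
            next_pc = pc + offset
    else:
        raise ValueError(f"unsupported instruction: {op}")
    if rd:                                    # write back (register x0 stays zero)
        regs[rd] = result
    state["pc"] = next_pc

def run(program, max_cycles=100):
    """Run until the PC leaves the program or max_cycles is reached."""
    state = {"regs": [0] * 32, "mem": {}, "pc": 0}
    cycles = 0
    while 0 <= state["pc"] < 4 * len(program) and cycles < max_cycles:
        step(state, program)
        cycles += 1
    return state, cycles

demo_program = [
    ("addi", 1, 0, 5),    # x1 = 5
    ("addi", 2, 0, 7),    # x2 = 7
    ("add", 3, 1, 2),     # x3 = x1 + x2
    ("sw", 3, 0, 16),     # mem[16] = x3
    ("lw", 4, 0, 16),     # x4 = mem[16]
    ("beq", 3, 4, 8),     # equal, so skip the next instruction
    ("addi", 5, 0, 1),    # skipped
    ("addi", 6, 0, 2),    # x6 = 2
]
final_state, cycles_used = run(demo_program)

if __name__ == "__main__":
    print(final_state["regs"][:8], final_state["pc"], cycles_used)
```

Because every instruction takes exactly one call to step, the number of cycles equals the number of executed instructions, which is the defining property of the single cycle design.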

DIAGRAM:

Figure 7 Data path of Single Cycle RISC-V Processor


RTL DESIGN:

BRANCH EQUAL:

CONCLUSION:
We can conclude that the single cycle RISC-V data path is a simple hardware design that is easy to
understand and implement. Its fixed cycle time limits the clock speed of the processor,
but it allows each instruction to be executed in a single clock cycle.

The single cycle RISC-V data path includes key components such as the Instruction Memory, Program
Counter, Instruction Register, Register File, Arithmetic Logic Unit, Data Memory, and Control Unit. These
components work together to fetch, decode, and execute instructions within a single clock cycle.

LAB 7 – PIPELINED ARCHITECTURE


Pipelining does not reduce the time-to-completion of an individual instruction; rather, it increases the
throughput of the processor. Multiple different operations (for different instructions) are performed
simultaneously, using different hardware resources. Pipelining allows us to run the entire hardware
at a higher operating frequency, which effectively improves the system throughput.
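
The difference between latency and throughput can be made concrete with a small calculation. The Python sketch below uses made-up stage delays (2 ns, 4 ns and 3 ns are assumptions for illustration, not measurements from this lab) and an ideal pipeline without stalls; the function names are likewise hypothetical.

```python
def single_cycle_period(stage_delays_ns):
    """Single cycle design: the clock period must cover all stages in sequence."""
    return sum(stage_delays_ns)

def pipelined_period(stage_delays_ns, register_overhead_ns=0):
    """Pipelined design: the clock period is set by the slowest stage."""
    return max(stage_delays_ns) + register_overhead_ns

def total_time(n_instructions, n_stages, period_ns):
    """Ideal pipeline without stalls: fill once, then one instruction completes per cycle."""
    return (n_stages + n_instructions - 1) * period_ns

stage_delays = [2, 4, 3]   # assumed delays: fetch, decode and execute, memory and writeback

if __name__ == "__main__":
    t_single = single_cycle_period(stage_delays)
    t_pipe = pipelined_period(stage_delays)
    print("latency per instruction (ns):", t_single, "vs", len(stage_delays) * t_pipe)
    print("time for 1000 instructions (ns):",
          total_time(1000, 1, t_single), "vs", total_time(1000, len(stage_delays), t_pipe))
```

With these assumed numbers, a single instruction takes longer in the pipelined design (three cycles of the slowest stage), while a long instruction stream finishes in less than half the time.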
From Single Cycle to Pipelined Architecture
We modify our single cycle implementation into a 3-stage pipelined architecture. For that purpose,
we decompose the single cycle processor into the following three stages.
1) Fetch
2) Decode and Execute
3) Memory and Writeback
Specifically, the pipelined datapath is formed by splitting the single-cycle datapath into three
stages, where each pair of consecutive stages is separated by pipeline registers. For instance, to introduce
the pipeline stage between the fetch phase and the decode & execute phase, two registers (namely the PC
register and the Instruction register) are required, as can be seen from Figure 7.1.
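The two registers between the fetch phase and the decode & execute phase could be sketched as follows. This is a minimal illustration rather than the lab's actual source: the module name, the port names, the active-low synchronous reset, and the choice to reset the instruction register to a NOP are all assumptions.

```systemverilog
// Sketch of the pipeline registers between Fetch and Decode-Execute.
// Module and port names are illustrative (not taken from the lab's source files).
module if_id_regs (
    input  logic        clk,
    input  logic        rst_n,     // active-low synchronous reset (assumed)
    input  logic [31:0] pc_if,     // PC of the instruction being fetched
    input  logic [31:0] instr_if,  // instruction word read from instruction memory
    output logic [31:0] pc_id,     // PC seen by the Decode-Execute stage
    output logic [31:0] instr_id   // instruction seen by the Decode-Execute stage
);
    // RISC-V NOP is encoded as "addi x0, x0, 0"
    localparam logic [31:0] NOP = 32'h0000_0013;

    always_ff @(posedge clk) begin
        if (!rst_n) begin
            pc_id    <= '0;
            instr_id <= NOP;       // start with a harmless instruction in the stage
        end else begin
            pc_id    <= pc_if;     // PC register
            instr_id <= instr_if;  // Instruction register
        end
    end
endmodule
```

On every rising clock edge the Decode-Execute stage receives the instruction fetched in the previous cycle, while the Fetch stage is already reading the next one.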
CODE
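The pipeline_single_cycle Verilog code describes the pipelined version of the single cycle processor, in
which pipeline registers pass data from one stage to the next. The pipeline performs the following
operations, which the three stages listed above group as Fetch, Decode and Execute, and Memory and
Writeback: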
• Fetch: Reads instructions from memory
• Decode: Decodes the instructions and reads operands from register file
• Execute: Performs arithmetic and logical operations
• Memory: Accesses data memory
• Writeback: Writes results back to register file
The pipeline_single_cycle module takes several inputs and produces one output. The inputs are:
• fastclk: A clock signal used for the pipeline stages
• reset: A reset signal that initializes the pipeline stages
• swith_select: A 5-bit input that selects the register to read from
• switch_run: A control signal that starts the pipeline
The output of the module is reg_read_data_1, which is the data read from the selected register.
Inside the module, there are several registers that are used for storing input values and passing them
between stages. The module also contains several combinational logic blocks that perform operations on
the inputs and registers.
The pipeline_single_cycle module has a clock input named fastclk and a reset input named reset. In the
testbench, the always block that follows the initial block toggles fastclk every 1 time unit. The initial
block sets reset to 1 and switch_run to 0, and waits for 4 time units before setting reset to 0. After a
delay of 10 time units, the testbench begins reading data from the selected registers and prints the values
to the console.
A second always block toggles clkread every 320 time units. Two further always blocks are triggered by
clkread: one reads the data from the selected register and prints it to the console on the rising edge of
clkread, and the other reads the data and prints it on the falling edge.
The always block inside the module uses fastclk as the trigger for the pipeline stages. When switch_run
is set to 1, the pipeline begins processing instructions. The module reads the selected register during the
decode stage and passes the data through the pipeline to the writeback stage, where it is written back to
the register file.
Overall, the pipeline_single_cycle module implements a basic pipeline design that fetches, decodes,
executes, accesses memory, and writes back data. The module reads data from a selected register and
prints the values to the console at regular intervals.
TEST BENCH
The SystemVerilog testbench for the "pipeline_single_cycle" module includes a number of signals, an
instantiation of the unit under test (UUT), and a set of always blocks that control the inputs and monitor
the outputs of the UUT.
The signals include the clock signal "fastclk", the reset signal "reset", a 5-bit input signal "swith_select",
a single-bit input signal "switch_run", and an additional clock signal "clkread". The UUT has a single 32-
bit output signal "reg_read_data_1".
The always block with the posedge of the "clkread" signal is responsible for controlling the inputs of the
UUT, setting the "swith_select" signal to different values and monitoring the "reg_read_data_1" output
signal. The values of "reg_read_data_1" are stored in "reg_read_data_2" and "reg_read_data_3" for
comparison purposes. Additionally, this always block sets "switch_run" to 1 for a short period of time and
then back to 0.
The always block with the negedge of "clkread" signal is similar to the previous one, but it prints out the
values of "reg_read_data_1", "reg_read_data_2", and "reg_read_data_3" at each step.
The testbench exercises the "pipeline_single_cycle" module, which implements the pipelined processor
described above. It sets up the inputs for the module and checks that the output values match the expected
values. It also prints some debugging information to help with identifying any issues that may arise.
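The testbench structure described above might look roughly like the sketch below. It is a simplified reconstruction from the description, not the authors' file: the stub module only stands in for the real design so that the sketch compiles on its own, switch_run is simply raised once instead of being pulsed, and the run length of 5000 time units is an arbitrary choice.

```systemverilog
`timescale 1ns/1ps

// Stand-in for the real design so that this sketch compiles on its own.
// It only echoes the selected register index; remove it when using the lab's module.
module pipeline_single_cycle (
    input  logic        fastclk,
    input  logic        reset,
    input  logic [4:0]  swith_select,
    input  logic        switch_run,
    output logic [31:0] reg_read_data_1
);
    always_ff @(posedge fastclk) begin
        if (reset) reg_read_data_1 <= '0;
        else       reg_read_data_1 <= {27'b0, swith_select};
    end
endmodule

module pipeline_single_cycle_tb;
    logic        fastclk = 1'b0;
    logic        clkread = 1'b0;
    logic        reset;
    logic        switch_run;
    logic [4:0]  swith_select = '0;
    logic [31:0] reg_read_data_1;

    pipeline_single_cycle uut (
        .fastclk        (fastclk),
        .reset          (reset),
        .swith_select   (swith_select),
        .switch_run     (switch_run),
        .reg_read_data_1(reg_read_data_1)
    );

    always #1   fastclk = ~fastclk;  // pipeline clock: toggles every time unit
    always #320 clkread = ~clkread;  // slow clock that paces the register reads

    initial begin
        reset      = 1'b1;
        switch_run = 1'b0;
        #4  reset      = 1'b0;       // release reset after 4 time units
        #10 switch_run = 1'b1;       // start the pipeline
    end

    // Step through the register file on each rising edge of clkread ...
    always @(posedge clkread) swith_select <= swith_select + 5'd1;

    // ... and print the value read on each falling edge.
    always @(negedge clkread)
        $display("t=%0t  x%0d = %h", $time, swith_select, reg_read_data_1);

    initial #5000 $finish;
endmodule
```

Because clkread is much slower than fastclk, the pipeline has many cycles to settle between consecutive register reads.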

SCHEMATIC
SIMULATION

Lab 8 – Pipelined Architecture (Resolving Data Hazards)


In the previous lab, the single cycle RISC-V processor was converted to a pipelined processor with the help
of pipeline registers. The pipelined processor handles multiple instructions concurrently, and hazards
arise when the result of one instruction depends on another. These hazards can be classified as data or
control hazards. Data hazards take place when an instruction tries to read a register that has not yet been
updated by a previous instruction. Control hazards, on the other hand, take place when the decision about
which instruction to fetch next has not yet been made. They occur for jumps and taken branches, because
while the jump/branch is being resolved in the execution phase, the subsequent instruction is already
being fetched.

Resolving Data Hazards

In the case of the three stage pipeline, some data hazards can be resolved by forwarding the result of the
Memory-Writeback stage to the Decode-Execute stage, which is done by adding forwarding
multiplexers. Forwarding is used when the destination register in the Memory-Writeback stage matches
either of the source registers in the Decode-Execute stage. This requires two forwarding
multiplexers and a forwarding unit, which takes the instructions in the two pipeline stages as its
inputs and produces the select signals of the two multiplexers as its outputs. This is illustrated in
Figure 8.1.

Illustration of the implementation of the forwarding module


// Check the validity of the source operands from EXE stage
// (an address of zero selects x0, which never needs forwarding)
assign rs1_valid = |exe2fwd.rs1_addr;
assign rs2_valid = |exe2fwd.rs2_addr;
// Hazard detection: the later stage writes a register that the EXE stage reads
assign lsu2rs1_hazard = ((exe2fwd.rs1_addr == lsu2fwd.rd_addr) & lsu2fwd.rd_wr_req) & rs1_valid;
assign lsu2rs2_hazard = ((exe2fwd.rs2_addr == lsu2fwd.rd_addr) & lsu2fwd.rd_wr_req) & rs2_valid;
// Generate the forwarding signals (select inputs of the two forwarding multiplexers)
assign fwd2exe.fwd_lsu_rs1 = lsu2rs1_hazard;
assign fwd2exe.fwd_lsu_rs2 = lsu2rs2_hazard;
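
The excerpt above relies on struct types that are defined elsewhere in the project. For readers who want something that compiles on its own, the same idea can be sketched with plain ports. The module names, port names, and 32-bit data width below are assumptions chosen for illustration and are not the project's actual interfaces.

```systemverilog
// Self-contained sketch of forwarding for a 3-stage pipeline.
// All names are illustrative; widths assume a 32-bit RISC-V core.
module forward_unit (
    input  logic [4:0] rs1_addr_exe,  // source registers in Decode-Execute
    input  logic [4:0] rs2_addr_exe,
    input  logic [4:0] rd_addr_wb,    // destination register in Memory-Writeback
    input  logic       rd_wr_en_wb,   // that instruction writes the register file
    output logic       fwd_rs1,       // select for the rs1 forwarding multiplexer
    output logic       fwd_rs2        // select for the rs2 forwarding multiplexer
);
    // x0 is hard-wired to zero, so it is never forwarded.
    assign fwd_rs1 = rd_wr_en_wb && (rs1_addr_exe != 5'd0)
                                 && (rs1_addr_exe == rd_addr_wb);
    assign fwd_rs2 = rd_wr_en_wb && (rs2_addr_exe != 5'd0)
                                 && (rs2_addr_exe == rd_addr_wb);
endmodule

module forward_muxes (
    input  logic        fwd_rs1,
    input  logic        fwd_rs2,
    input  logic [31:0] rf_rdata1,    // values read from the register file
    input  logic [31:0] rf_rdata2,
    input  logic [31:0] wb_data,      // result held in the Memory-Writeback stage
    output logic [31:0] operand1,     // operands handed to the ALU
    output logic [31:0] operand2
);
    assign operand1 = fwd_rs1 ? wb_data : rf_rdata1;
    assign operand2 = fwd_rs2 ? wb_data : rf_rdata2;
endmodule
```

For a sequence such as "add x5, x1, x2" followed by "sub x6, x5, x3", the second instruction reads x5 while the first is still in Memory-Writeback, so fwd_rs1 is asserted and the ALU receives wb_data instead of the stale register file value.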

Forwarding is not sufficient for load instructions, which can have multi-cycle latency, so their results
cannot be forwarded in time. The only remaining solution is to stall the pipeline until the result has
been written to the register file. When a stage is stalled, all the previous stages must also be stalled in
order to avoid losing instructions. For this purpose, we add stalling capability to the forwarding unit,
making it the forward stall unit. This adds stall signals to all the pipeline registers, as illustrated
in Figure 8.2.
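
One possible way to express the stall condition and a pipeline register that honours it is sketched below. The signal load_pending_wb (a load in Memory-Writeback whose data has not yet arrived) and all module and port names are hypothetical; the lab's forward stall unit may derive the condition differently.

```systemverilog
// Sketch of stall generation for a load whose result is not yet available.
// All names are illustrative assumptions.
module stall_unit (
    input  logic [4:0] rs1_addr_exe,     // source registers in Decode-Execute
    input  logic [4:0] rs2_addr_exe,
    input  logic [4:0] rd_addr_wb,       // destination register in Memory-Writeback
    input  logic       rd_wr_en_wb,      // that instruction writes the register file
    input  logic       load_pending_wb,  // load data has not returned yet (assumed signal)
    output logic       stall
);
    logic rs1_match, rs2_match;

    assign rs1_match = (rs1_addr_exe != 5'd0) && (rs1_addr_exe == rd_addr_wb);
    assign rs2_match = (rs2_addr_exe != 5'd0) && (rs2_addr_exe == rd_addr_wb);

    // Stall only when a dependent instruction is waiting on the pending load.
    assign stall = rd_wr_en_wb && load_pending_wb && (rs1_match || rs2_match);
endmodule

// Pipeline register that holds its value while stall is asserted.
module stallable_reg #(
    parameter int WIDTH = 32
) (
    input  logic             clk,
    input  logic             rst_n,  // active-low synchronous reset (assumed)
    input  logic             stall,
    input  logic [WIDTH-1:0] d,
    output logic [WIDTH-1:0] q
);
    always_ff @(posedge clk) begin
        if (!rst_n)      q <= '0;
        else if (!stall) q <= d;     // keep the old value during a stall
    end
endmodule
```

The same stall signal would gate the PC register and the Fetch to Decode-Execute registers, so that the stalled instruction and the one behind it are both held in place.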

Lab 9 – Pipelined Architecture (Resolving Control Hazards)


Resolving Control Hazards

For taken branches as well as jumps, the following instruction (which has already been fetched) should not
be executed. Rather, it should be flushed from the pipeline while the program counter is updated to the new
address. To do this, we flush the Decode-Execute stage by setting the instruction pipeline register between
the Fetch stage and the Decode-Execute stage to nop. We therefore modify our forward stall module to take
the br_taken flag as an input and to produce the flush signal as an output. These changes can be observed
in Figure 9.3.

Illustration of the implementation for PC updating and fetch stage flushing


// PC update state machine
always_ff @(posedge clk) begin
    if (~rst_n) begin               // rst_n is active low: reset while it is 0
        pc_ff <= '0;
    end else begin
        pc_ff <= pc_next;
    end
end

// Next PC: branch/jump target if taken, PC + 4 if decode is ready, otherwise hold
assign pc_next = exe2if_fb.jump_br_taken ? exe2if_fb.alu_pc
               : id2if_fb_rdy            ? (pc_ff + 32'd4)
               :                           pc_ff;

`ifdef IF2ID_PIPELINE_STAGE
assign if2id_data.instr = exe2if_fb.jump_br_taken
                        ? `INSTR_NOP        // Insert NOP for jump or branch taken
                        : imem2if_rdata_i;
`else
assign if2id_data.instr = imem2if_rdata_i;
`endif
Tasks
- Implement the proposed stall/forwarding/flush strategy to resolve as many hazards as possible.
- Write an assembly program to test some of these hazards and verify the implementation.


Lab 10
Introduction and Interfacing of BASYS3, NEXYS A7 and GPIO testing of Basys-3
Introduction:
The BASYS 3 is one of the best boards on the market for getting started with FPGA. It is an entry-level
development board built around a Xilinx Artix-7 FPGA.

As a complete and ready-to-use digital circuit development platform, it includes enough switches, LEDs, and
other I/O devices to allow a large number of designs to be completed without the need for any additional
hardware. There are also enough uncommitted FPGA I/O pins to allow designs to be expanded using Digilent
Pmods or other custom boards and circuits, and all of this at a student-friendly price point.

The BASYS 3 is designed exclusively for Xilinx’s Vivado Design Suite, and the WebPACK edition is available as a
free download from Xilinx.

Guides and demos are available to help users get started quickly with the BASYS 3. These can be found through the
Support Materials tab.

https://digilent.com/shop/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/
Figure 1: BASYS 3

Assigning Package Pins

There are two methods for assigning package pins which are given as follows:

1. Using I/O Planning:


In the project manager bar, click on Open Elaborated Design. A dialog box will appear; click on Ok. The
I/O ports window will open as shown in Figure 9 (if the I/O ports box does not appear then go to Layout
and select I/O planning). From there, assign the I/O ports, giving all of them the I/O std. of LVCMOS33.
Assign the same Package pins as shown in Figure 9. Then save these constraints by pressing CTRL+S on the
keyboard.

Figure 1: Assigning Package pins on Vivado.

2. Using Constraints File:


When we save the constraints in the above section, we are indirectly creating a constraints file. But we
can create that file directly using the following method:

Click on Sources in the project manager, then select Add or Create Constraints, and then
select Create File. The constraints file will be created.
Type the code shown in Listing 1 and then save the file. The pins will be assigned in
the same manner as in the section above.

set_property -dict { PACKAGE_PIN M18 IOSTANDARD LVCMOS33 } [get_ports a];
set_property -dict { PACKAGE_PIN L18 IOSTANDARD LVCMOS33 } [get_ports b];
set_property -dict { PACKAGE_PIN J18 IOSTANDARD LVCMOS33 } [get_ports c];
set_property -dict { PACKAGE_PIN M19 IOSTANDARD LVCMOS33 } [get_ports y];

Listing 1: Assigning Package pins using constraints file.

Synthesizing and Implementing the Design:


Under the Project Manager bar, click on Run Synthesis. A dialog box will appear; click on Ok.
After the synthesis is complete, click on Run Implementation and then repeat the same
process as before.

Generating the bit file and Programming the FPGA:


Under Program and Debug, click on Generate Bitstream. When the bitstream is generated,
connect the FPGA to your computer with the cable. Then, under Open Hardware Manager, right
click on Open Target and then click on Auto Connect as shown in Figure 2.

Figure 2: Connecting FPGA to the computer

Vivado will take a few seconds to connect to the FPGA. Once done, click on Program device; a
dialog box appears, and there click on Program to program your FPGA (Figure 3). The Verilog code will be
implemented on the FPGA. To check the behavior of the LED, make the truth table of a.b+c and check
all the possible combinations of inputs.
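
A design matching the ports a, b, c, and y used in the constraints could be as small as the sketch below, together with a simulation check of the a.b+c truth table. The module names and the expected-value vector are illustrative and are not taken from the lab's source files.

```systemverilog
// Sketch of the combinational function y = a.b + c.
// Port names follow the constraints listing; module names are illustrative.
module and_or (
    input  logic a,
    input  logic b,
    input  logic c,
    output logic y
);
    assign y = (a & b) | c;
endmodule

// Simulation-only check against the truth table of a.b + c.
module and_or_tb;
    logic a, b, c, y;

    // Bit i holds the expected y for {a, b, c} == i:
    // 000->0, 001->1, 010->0, 011->1, 100->0, 101->1, 110->1, 111->1
    localparam logic [7:0] EXPECTED = 8'b1110_1010;

    and_or dut (.a(a), .b(b), .c(c), .y(y));

    initial begin
        for (int i = 0; i < 8; i++) begin
            {a, b, c} = i[2:0];
            #1;
            if (y !== EXPECTED[i])
                $display("MISMATCH a=%b b=%b c=%b y=%b", a, b, c, y);
            else
                $display("a=%b b=%b c=%b -> y=%b", a, b, c, y);
        end
        $finish;
    end
endmodule
```

On the board, the same eight input combinations can be applied with the switches assigned to a, b, and c while observing the LED assigned to y.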

Figure 3: Programming the Bit-stream on the FPGA

Bit-stream generated successfully:


Code:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

--The IEEE.std_logic_unsigned contains definitions that allow
--std_logic_vector types to be used with the + operator to instantiate a
--counter.
use IEEE.std_logic_unsigned.all;

entity GPIO_demo is
    Port ( SW        : in    STD_LOGIC_VECTOR (15 downto 0);
           BTN       : in    STD_LOGIC_VECTOR (4 downto 0);
           CLK       : in    STD_LOGIC;
           LED       : out   STD_LOGIC_VECTOR (15 downto 0);
           SSEG_CA   : out   STD_LOGIC_VECTOR (7 downto 0);
           SSEG_AN   : out   STD_LOGIC_VECTOR (3 downto 0);
           UART_TXD  : out   STD_LOGIC;
           VGA_RED   : out   STD_LOGIC_VECTOR (3 downto 0);
           VGA_BLUE  : out   STD_LOGIC_VECTOR (3 downto 0);
           VGA_GREEN : out   STD_LOGIC_VECTOR (3 downto 0);
           VGA_VS    : out   STD_LOGIC;
           VGA_HS    : out   STD_LOGIC;
           PS2_CLK   : inout STD_LOGIC;
           PS2_DATA  : inout STD_LOGIC
         );
end GPIO_demo;

architecture Behavioral of GPIO_demo is

component UART_TX_CTRL
Port(
    SEND : in std_logic;
    DATA : in std_logic_vector(7 downto 0);



Lab 11:

Interfacing of on board LED and seven segment of Basys-3

Problem Set:

1. Half adder:

A half adder is a combinational arithmetic circuit that adds two bits and produces a sum bit (S) and a
carry bit (C) as outputs. If A and B are the input bits, then the sum bit (S) is the XOR of A and B and the
carry bit (C) is the AND of A and B. The half adder is the simplest of all adder circuits, but it has a
major disadvantage: it can add only two input bits (A and B) and has no input for a carry from a
previous stage. Its truth table, module schematic, and gate level realization are shown in
Figure 1.

➢ You are requested to write Verilog code for the circuit using structural modeling (using
gate primitives such as AND, OR, etc.). Your inputs are two 1-bit numbers.
➢ Synthesize the circuit for the Nexys A7 100T FPGA Board and simulate it by writing a
testbench module for the half adder covering all possible input combinations.
➢ Read the synthesis report of your circuit and extract the useful information related to
maximum combinational delay, resources used in the FPGA (such as lookup tables (LUTs),
input/outputs (IOs), etc.), timing information, power usage, etc. Report this information in
your lab report.
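
A structural description along the lines requested above might look like the following sketch. It uses only constructs that are also valid in plain Verilog for the design itself; the port names a, b, sum, and carry follow the constraints used later in this lab, while the module names and the testbench are illustrative and are not the submitted code.

```systemverilog
// Structural half adder built from gate primitives.
module half_adder (
    input  wire a,
    input  wire b,
    output wire sum,
    output wire carry
);
    xor g_sum   (sum,   a, b);  // S = A XOR B
    and g_carry (carry, a, b);  // C = A AND B
endmodule

// Testbench covering all four input combinations.
module half_adder_tb;
    logic a, b;
    wire  sum, carry;

    half_adder dut (.a(a), .b(b), .sum(sum), .carry(carry));

    initial begin
        for (int i = 0; i < 4; i++) begin
            {a, b} = i[1:0];
            #10;
            // {carry, sum} must equal the 2-bit arithmetic sum of a and b
            if ({carry, sum} !== (a + b))
                $display("MISMATCH a=%b b=%b sum=%b carry=%b", a, b, sum, carry);
            else
                $display("a=%b b=%b -> sum=%b carry=%b", a, b, sum, carry);
        end
        $finish;
    end
endmodule
```

The four printed rows correspond to the rows of the half adder truth table, with carry set only when both inputs are 1.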

Figure 1: Schematic and Truth Table of Half Adder


CODE:

Output:

Figure 2: Half Adder


Simulation:

Constraints:
When we save the constraints in the I/O planning view, we are indirectly creating a constraints file. But we
can create that file directly using the following method:

Click on Sources in the project manager, then select Add or Create Constraints, and then
select Create File. The constraints file will be created.
Type the following code and then save the file. The pins will be assigned as follows:
# ======= Switches (inputs a, b) and LEDs (outputs carry, sum) =======
set_property PACKAGE_PIN R2 [get_ports a]
set_property PACKAGE_PIN T1 [get_ports b]
set_property PACKAGE_PIN L1 [get_ports carry]
set_property PACKAGE_PIN P1 [get_ports sum]
set_property IOSTANDARD LVCMOS33 [get_ports a]
set_property IOSTANDARD LVCMOS33 [get_ports b]
set_property IOSTANDARD LVCMOS33 [get_ports carry]
set_property IOSTANDARD LVCMOS33 [get_ports sum]

Synthesizing and Implementing the Design:


Under the Project Manager bar, click on Run Synthesis. A dialog box will appear; click on Ok. After the
synthesis is complete, click on Run Implementation and then repeat the same process as before.

Generating the bit file and Programming the FPGA:


Under Program and Debug, click on Generate Bitstream. When the bitstream is generated,
connect the FPGA to your computer with the cable. Then, under Open Hardware Manager, right click on
Open Target and then click on Auto Connect as shown in Figure.
Figure: Connecting FPGA to the computer.
Vivado will take a few seconds to connect to the FPGA. Once done, click on Program device; a
dialog box appears, and there click on Program to program your FPGA (Figure). The Verilog code will be
implemented on the FPGA. To check the behavior of the LEDs, make the truth table of the half adder and
check the sum and carry outputs for all the possible combinations of inputs.

Figure: Programming the Bit stream on the FPGA


Bit-stream generated successfully:

When input A is 1 and B is 0:


When input B is 1 and A is 0:

When input A is 1 and B is also 1:
