Lab N
Lab N
LabN1.v
1. Create a directory for this lab and copy to it the files cpu.v and ram.dat that were
created in the previous lab. Copy also the file LabM10.v and name it LabN1.v.
2. Recall that LabN1 determines the address of the instruction to be executed next (i.e.
PCin) based on whether we are jumping, branching, or continuing sequentially. But
rather than representing this logic behaviourally in a testing module, let us implement
it structurally in a circuit whose output is PCin.
3. But if PCin is an output of a circuit, how can you ever set it in order to fetch the very
first instruction of your program? We clearly need a mechanism to force the CPU to
stop the current program and switch to another. To that end, let us introduce two new
signals: INT, a 1-bit interrupt signal, and entryPoint, a 32-bit signal containing the
address to switch to. Our PCin logic now becomes:
Note that if the interrupt signal INT is set, we fetch the next instruction from address
entryPoint thereby affecting a context switch. With this new scheme, PCin is no
longer set externally and, hence, can indeed be an output of a circuit.
The yPC component takes 8 inputs and determines PCin accordingly. The jump and
branch inputs are simply flags that are set to 1 if the current instruction is a jump or a
branch on equal. Here is the block diagram of the component.
branch
jump
INT
PCp4
zero
jTarget yPC PCin
imm
entryPoint
6. The first mux chooses between sequential processing (i.e. PCp4) and branching. Its
control signal is the and of branch ad zero; i.e. we branch if this is a beq instruc-
tion and its registers are equal. The branch target address, bTarget, is computed
by multiplying imm by 4 and adding the result to PCp4.
7. The second mux chooses between the previous mux output (choiceA) and jumping.
Its control signal is jump. The jump target is a 32-bit address whose high-order 4 bits
come from PCp4 and whose lower 26 bits are the input jTarget.
8. The third mux chooses between the previous mux output and entryPoint. Its cont-
rol signal is INT.
10. Modify LabN1.v so that it instantiates yPC in addition to the five components it
already instantiates. This requires changing PCin from reg to wire and adding
declarations for the new signals.
11. In addition, modify the body of LabN1.v so it starts with a context switch to launch
our program. Here is the new template:
HR/S08 Lab N 3
initial
begin
//------------------------------------Entry point
entryPoint = 128; INT = 1; #1;
//------------------------------------Run program
repeat (43)
begin
//---------------------------------Fetch an ins
clk = 1; #1; INT = 0;
//---------------------------------View results
as before
end
$finish;
end
Notice that the "Set control signals" section must now detect beq and j and set the
two signals branch and jump accordingly. Note also that the “Prepare for the next
ins” section is now empty since its behavioural logic has been promoted to a circuit.
12. Compile and run LabN1. The generated output should be exactly as in the previous
lab. Specifically, The last two lines of the output should be:
yC1
13. Our "Set control signals" section sets eight control signals:
(It also sets the 3-bit op signal but let us ignore that for now.) We seek to automate
the generation of these eight signals by building structural circuits that output them.
14. As a first step toward this goal, let us build the circuit yC1 that takes the opCode as
input (i.e. ins[31:26]) and determines if the instruction is load, store, branch-on-
equal, jump, or R-type, and outputs lw, sw, branch, jump, or rtype accordingly:
15. We generate lw by noting that the opCode of the lw instruction is 100011. Hence, if
we and the 6 bits bits of opCode after negating the ones that are 0 for lw, the result
will only be 1 if opCode were indeed 100011. This leads to the following circuit:
16. The signals sw, branch, and jump can be generated similarly by noting the opCode
of the corresponding instructions.
17. The rtype signal is generated by detecting the case of all 6 bits of opCode being 0.
This can be done in a single instantiation.
18. Add the yC1 module to your cpu.v file and complete its development. Its body should
have about 10 lines. You cannot test this module yet because it generates only two
of the eight control signals needed for the datapath.
yC2
This module represents the second part of the control unit (hence the C in its name).
It takes four of the signals generated by yC1 as input, and generates the six control
signals we need.
20. To build this component, we need to implement the logic of the "Set control signals"
section in hardware. To that end, we switch from sequential, if-then-else thinking, to
parallel, declarative thinking, and ask: What should the value of RegDst be?
21. Looking at all the cases, and manipulating don’t-cares to our advantage, we see that
RegDst has to be 1 for R-type instructions and 0 otherwise. Hence, this signal can
be generated in one line:
22. Consider ALUSrc next. This signal must be 0 for R-types and branches, 1 for loads
and stores and addi, and don’t-care for jumps. Again, we manipulate the don’t-care
and treat it as 1 so as to end up with a simple rule: 0 for R-types and branches and 1
otherwise. This leads to the one-liner:
24. Complete the development of yC2. Its body should have about 6 lines.
LabN2.v
25. Save LabN1.v as LabN2.v and modify it so it instantiates the two parts of the control
unit in addition to the datapath components:
26. In addition, change the declaration of the eight control signals from reg to wire and
remove their references from the Our "Set control signals" section:
initial
begin
//------------------------------------Entry point
entryPoint = 128; INT = 1; #1;
//------------------------------------Run program
repeat (43)
begin
//---------------------------------Fetch an ins
clk = 1; #1; INT = 0;
//---------------------------------View results
as before
end
$finish;
end
Except for op, our CPU has become capable of self-setting the signals it needs.
6 Lab N HR/S08
27. Compile and run LabN2. The generated output should be exactly as in the previous
lab. Specifically, The last two lines of the output should be:
yC3
28. We now turn our attention to the op signal. This is the hardest control signal to gene-
rate because it depends sensitively on the instruction. Indeed, we may need to look
at both the opCode (ins[31:26]) and the fnCode (ins[5:0]) before becoming
able to determine the correct op value.
29. We overcome the above difficulty by dividing the problem into two and building two
back-to-back circuits: The first, yC3, is responsible for non-R-type instructions and
the second, yC4, takes care of R-types. These two circuits interact with each other
through a new 2-bit signal ALUop as shown in this block diagram:
branch
fnCode
rtype
30. The yC3 circuit must generate ALUop as shown in the table below. Note that since j
doesn’t involve the ALU, it doesn’t matter what operation is performed. Note also that
yC3 cannot determine the operation for R-types since it doesn’t see the fnCode.
J j don’t-care xx
R unknown unknown 10
HR/S08 Lab N 7
endmodule
yC4
32. We now turn our attention to the fourth and last part of our control unit, yC4. This unit
sees the fnCode and the ALUop signal generated by yC3 and outputs the 3-bit ALU
signal op. Because of this, it is sometimes referred to as the ALU Control Unit. Here
is the specification of this unit:
As you can see, this unit operates primarily based on ALUop. If this signal is 00 or 01
then yC4 trusts and findings of yC3 and generates op accordingly. But if ALUop is
10 then yC4 knows that this is an R-type instruction and hence generates op based
on the function code.
endmodule
34. Consider the following circuit which is made up of five simple gates:
ALUOp
ALU control block
ALUOp0
ALUOp1
Operation2
F3
Operation
F2 Operation1
F (5– 0)
F1
Operation0
F0
35. Argue that this circuit does indeed behave as required by the table. At first glance, it
may seem impossible that any circuit can sometimes ignore one of its inputs but this
one does: fnCode is ignored if ALUop is 00 or 01.
36. Complete the development of yC4. Its body should have exactly five lines since it is
made up of five primitive gates.
LabN3.v
37. Save LabN2.v as LabN3.v and modify its instantiation section as follows:
38. In addition, change the declaration of the op signal from reg to wire and remove it
from the Our "Set control signals" section:
initial
begin
//------------------------------------Entry point
entryPoint = 128; INT = 1; #1;
//------------------------------------Run program
repeat (43)
begin
//---------------------------------Fetch an ins
clk = 1; #1; INT = 0;
//---------------------------------View results
as before
end
$finish;
end
Notice that the control signal section has become empty. The CPU is now capable of
executing the program without any assistance from external modules.
39. Compile and run LabN3. The generated output should be exactly as in the previous
lab. Specifically, The last two lines of the output should be:
LabN4.v
40. In this task we repackage our components so as to fully separate the concerns. Let
us put all the needed instantiation in one module that represents the CPU chip:
In fact, this module needs not have any output (the program makes changes to the
registers and to memory) but we have declared ins, wb, and rd2 simply to be able
to test it (rd2 helps us test sw).
10 Lab N HR/S08
41. Add the yChip module to your cpu.v file and complete its development. You simply
need to copy all the instantiation lines, along with their corresponding declarations,
from LabN3 to the body of this module.
42. Save LabN3 as LabN4.v and modify it by replacing all the instantiated circuits with an
instantiation of yChip:
module labN;
reg [31:0] entryPoint;
reg clk, INT;
wire [31:0] ins, rd2, wb;
initial
begin
//------------------------------------Entry point
entryPoint = 128; INT = 1; #1;
//------------------------------------Run program
repeat (43)
begin
//---------------------------------Fetch an ins
clk = 1; #1; INT = 0;
//---------------------------------View results
$display("%h: rd2=%2d wb=%2d", ins, rd2, wb);
end
$finish;
end
endmodule
43. Compile and run LabN4. The generated output should be similar to the previous. In
particular, the last two lines of the output should be:
LAB N
Notes
• The CPU built in this Lab communicates with the outside world through four
channels:
§ The clock signal (input)
§ The interrupt signal (input)
§ The entry point signal (input)
§ The BIU (Bus Interface Unit in yIF and yDM) (input and output)
• The clock rate is determined based on the longest path that an instruction takes. For
the subset we considered, this would be the lw instruction.