Finalsol
Write a short VHDL process involving std_logic signals a, b and c that will cause a circuit
synthesizer to generate a latch.
process (a, b) is begin
    if a = b then c <= '0'; elsif a > b then c <= '1'; end if;
end process;
What is the maximum time period for which a flip flop can be metastable?
There is no maximum time.
2. (10 points). Use the Karnaugh map below to find a minimum sum-of-products expression
for Σm(0,1,3,4,5,8,9,12,14). How many simple gates of each type are needed to implement this
expression (without further simplification)? How many LUT4s?
           CD
       00  01  11  10
   00   1   1   1   0
AB 01   1   1   0   0
   11   1   0   0   1
   10   1   1   0   0
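The original solution does not record the final expression, so the sketch below checks one candidate cover read off the map, A'C' + B'C' + A'B'D + ABD'. The candidate and the helper names are assumptions, not part of the original answer. The code only confirms that the candidate matches the listed minterms, with A taken as the most significant bit.

```python
# Minterms taken from the problem statement.
MINTERMS = {0, 1, 3, 4, 5, 8, 9, 12, 14}


def candidate_sop(a, b, c, d):
    """Candidate cover A'C' + B'C' + A'B'D + ABD' (an assumption, not from the source)."""
    return bool(
        (not a and not c)
        or (not b and not c)
        or (not a and not b and d)
        or (a and b and not d)
    )


def truth_set(f):
    """Return the minterm numbers (A is the most significant bit) where f is true."""
    ones = set()
    for m in range(16):
        a, b, c, d = (m >> 3) & 1, (m >> 2) & 1, (m >> 1) & 1, m & 1
        if f(a, b, c, d):
            ones.add(m)
    return ones


sop_matches = truth_set(candidate_sop) == MINTERMS
print(sop_matches)
```

This cover has two 2-literal and two 3-literal product terms feeding one OR. How that translates into a count of "simple gates" depends on the gate definition used in the course. Because the function has only four inputs, it fits in a single LUT4.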
Use the Karnaugh map below to find a minimum product-of-sums expression for
Σm(0,1,9,13,15), Σd(3,4,5,6,8). Make full use of the don’t cares.
           CD
       00  01  11  10
   00   1   1   x   0
AB 01   x   x   0   x
   11   0   1   1   0
   10   x   1   0   0
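The original solution does not record the product-of-sums answer either. As a rough check, the sketch below tests one candidate, (A + C')(B + C')(A' + D), obtained by grouping the zeros together with convenient don't cares. The candidate and the helper names are assumptions. Other covers of the same size exist, for example with (B' + D) in place of (A' + D).

```python
# Required ones and don't cares taken from the problem statement.
ONES = {0, 1, 9, 13, 15}
DONT_CARES = {3, 4, 5, 6, 8}


def candidate_pos(a, b, c, d):
    """Candidate (A + C')(B + C')(A' + D); an assumption, not from the source."""
    return bool((a or not c) and (b or not c) and (not a or d))


def agrees_on_specified(f):
    """True if f matches the specification on every cell that is not a don't care."""
    for m in range(16):
        if m in DONT_CARES:
            continue  # either output value is acceptable here
        a, b, c, d = (m >> 3) & 1, (m >> 2) & 1, (m >> 1) & 1, m & 1
        if f(a, b, c, d) != (m in ONES):
            return False
    return True


pos_ok = agrees_on_specified(candidate_pos)
print(pos_ok)
```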
3. (10 points) The diagram below shows a generic state machine. Assume that the state is
encoded using 5 bits.
[Diagram: generic state machine; the only surviving label is "next state logic".]
4. (10 points) The processor simulation below includes several labeled blanks. Fill in the
correct values in the spaces below.
D. 0004 E. 0004 .
5. (15 points) Write a program in the WASHU-2 assembly language that implements the
following C-style pseudo-code. Note that the lines defining i, j and p are included below. For
each of the branch instructions in your program, also show the actual machine instruction
that is generated by the assembler for that instruction.
while (i != 0) {
if (j > i) *p = *p + j;
i--;
}
        location 0100
loop:   dLoad i
        brZero end      020c
        negate
        add j
        brPos 2         0302
        branch skip     0104
        iLoad p
        add j
        iStore p
skip:   cLoad -1
        add i
        dStore i
        branch loop     01f4
end:    halt
location 0120
i: 20
j: 7
p: 0123
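The machine codes for the branch instructions can be checked mechanically. The sketch below assumes the convention implied by the listing (`brPos 2` gives 0302, and `branch skip` at address 0105 with `skip` at 0109 gives 0104): the high byte is the opcode and the low byte is the 8-bit two's-complement offset from the branch instruction's own address. The function name `encode_branch` is hypothetical.

```python
# Opcode bytes as they appear in the listing and in the brZero summary (02xx).
BRANCH, BR_ZERO, BR_POS = 0x01, 0x02, 0x03


def encode_branch(opcode, branch_addr, target_addr):
    """Encode a relative branch as opcode byte plus 8-bit signed offset.

    Assumes the offset is measured from the address of the branch itself.
    """
    offset = target_addr - branch_addr
    if not -128 <= offset <= 127:
        raise ValueError("target out of range for an 8-bit offset")
    return (opcode << 8) | (offset & 0xFF)


# 'branch skip' sits at 0105 and 'skip' at 0109.
print(format(encode_branch(BRANCH, 0x0105, 0x0109), "04x"))  # 0104
# 'brPos 2' sits at 0104 and skips ahead two words.
print(format(encode_branch(BR_POS, 0x0104, 0x0106), "04x"))  # 0302
```

Backward branches use the same formula. An offset of -1, for instance, encodes as ff in the low byte.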
6. (10 points) Consider an SRAM with 2K words of 32 bits each. How many address bits are
needed to address these words? (Reminder: 1024 = 2^10, 4096 = 2^12, 16,384 = 2^14, 65,536 = 2^16.)
11
Assuming that the central memory array has the same number of rows as it has columns,
how many rows are there?
256
How many of the address bits are used by the row decoder?
8
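The three answers above follow from a short calculation, sketched here. The variable names are illustrative only. The last value, the number of address bits left for column selection, is not asked for in the question but follows from the same arithmetic.

```python
import math

words = 2 * 1024          # 2K words
bits_per_word = 32

# Address bits needed to select one of 2K words.
address_bits = words.bit_length() - 1            # 11

# Total storage bits, arranged as a square array.
total_bits = words * bits_per_word               # 65,536 = 2^16
rows = math.isqrt(total_bits)                    # 256 rows (and 256 columns)

# Bits consumed by the row decoder, and what remains for column selection.
row_address_bits = rows.bit_length() - 1         # 8
column_select_bits = address_bits - row_address_bits  # 3 (8 words per row)

print(address_bits, rows, row_address_bits, column_select_bits)
```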
Draw a diagram of the basic storage cell that is typically used in an SRAM.
7. (10 points) Consider a version of the WASHU-2 processor that is equipped with a 2-way set-
associative cache with 256 rows (so, 512 words altogether) for both instructions and data,
with the partial contents shown below. The least recently used (LRU) bit in the center column
is equal to 0 if the left-hand entry has been least recently used and is equal to 1 if the right-
hand entry has been least recently used.
4a 34 c789 1 0 24 8249
Assume ACC=5 initially. What value is in the ACC after the next instruction executes?
ACC=5+3=8
Show how the fetching and execution of this instruction modifies the cache by marking the
changes in the table above.
Suppose that the instruction at location 244b is 504c. Show how the fetching and execution of
this instruction changes the cache contents, by marking your changes on the diagram above
(be sure to update the LRU bits, if appropriate).
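To locate the cache rows involved, an address can be split into a tag and a row index. The sketch below assumes 16-bit word addresses, with the low 8 bits selecting one of the 256 rows and the high 8 bits forming the tag. It also assumes, following the M[pxxx] notation used in the instruction summaries, that an operand address takes its top four bits from the PC. Both function names are hypothetical.

```python
def split_address(addr, rows=256):
    """Split an address into (tag, row index) for a cache with `rows` rows."""
    index_bits = rows.bit_length() - 1   # 8 bits for 256 rows
    index = addr & (rows - 1)
    tag = addr >> index_bits
    return tag, index


def operand_address(pc, instruction):
    """Assumed pxxx addressing: top 4 bits from the PC, low 12 from the instruction."""
    return (pc & 0xF000) | (instruction & 0x0FFF)


# Fetching the instruction at 244b touches row 4b with tag 24.
fetch_tag, fetch_row = split_address(0x244B)
# Under the pxxx assumption, the operand of 504c fetched at 244b is at 204c.
data_tag, data_row = split_address(operand_address(0x244B, 0x504C))

print(format(fetch_tag, "02x"), format(fetch_row, "02x"))
print(format(data_tag, "02x"), format(data_row, "02x"))
```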
8. (10 points) On a modern processor used in a laptop, approximately how long does it take to
retrieve an instruction from the L1 instruction cache? How long does it take to retrieve an
instruction from main memory, if it’s not present in any of the caches?
It takes about 1 ns to retrieve an instruction from L1 cache and about 100 ns to retrieve it from main
memory.
What is the advantage of having multiple general purpose registers in a processor (as
opposed to the single accumulator used in the WASHU-2)?
It allows programs to avoid many of the load and store operations that would otherwise be required.
When a subprogram is entered, registers can be assigned to each parameter and local variable,
allowing most uses of those parameters and variables to be done using register-to-register
instructions that do not require expensive loads and stores.
List two instructions that could be added to the WASHU-2 instruction set that would
improve the performance of typical programs. For each instruction give an instruction
summary similar to the one shown below for the brZero instruction. Your new instructions
should be natural additions to the existing instruction set. In particular, do not re-use any
instruction codes already assigned.
02xx brZero if ACC=0 then PC=PC+ssxx
9xxx multiply ACC = ACC * M[pxxx]
axxx divide ACC = ACC/M[pxxx]
Explain how these instructions improve the performance of programs running on the
processor. Estimate the amount by which they would improve the performance of programs
using them.
In programs that do integer arithmetic, these instructions can eliminate the need to call special
subprograms. For example, the multiply subprogram used in labs requires over 100 instructions,
taking an average of about 100 ns each. So that’s 10 microseconds per multiply. A multiply
instruction could perform this operation in about 100 ns. So for each multiply instruction in a
program, the improvement is a factor of more than 100. Of course, the overall impact on the program
will be smaller, since most instructions are not multiplications.
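The overall effect can be estimated with Amdahl's law, which the original answer does not name but which expresses the same reasoning. In the sketch below, the fraction of run time spent in the multiply subprogram is a made-up input, and 100 is the per-multiply speedup estimated above.

```python
def overall_speedup(fraction_in_multiply, multiply_speedup=100.0):
    """Amdahl's law: only the multiply portion of the run time gets faster.

    fraction_in_multiply is the share of the original run time spent in the
    multiply subprogram (a hypothetical input, not a measured value).
    """
    remaining = 1.0 - fraction_in_multiply
    return 1.0 / (remaining + fraction_in_multiply / multiply_speedup)


# Illustrative fractions only; real programs would need to be profiled.
for fraction in (0.0, 0.1, 0.5, 0.9):
    print(fraction, round(overall_speedup(fraction), 2))
```

Even if half of the run time were spent multiplying, the overall speedup would stay just under a factor of 2, because the rest of the program is unchanged.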
9. (10 points) Consider the VHDL process shown below. Assume that all signals are 16 bits.
[1] process (clk, a, b, c) begin
[2] x <= a + b;
[3] if rising_edge(clk) then
[4] if a = c then
[5] y <= b + c; z <= c;
[6] elsif b > c then
[7] y <= a - x;
[8] end if; end if; end process;
Suppose that a circuit is synthesized for this process for the FPGA on our prototype board.
How many flip flops are used by this circuit? How many latches?
32 flip flops, no latches
Approximately how many LUT4s are used to evaluate the condition in the if-statement on line
4?
This is an equality comparison on 16-bit values. Each LUT4 can compare two pairs of bits, so about n/2 = 8 LUTs for n = 16.
Approximately how many LUT4s are used to evaluate the condition in the elsif-statement on
line 6?
This is a greater-than comparison on 16 bit signals, so about 16 LUTs.