Assignment-3
Ex1. Processor X has a cache with a miss rate of 8.0%. A cache access takes 0.6 ns,
and a main memory access takes 70 ns. What is the average memory access time (AMAT) of X?
Solution:
The AMAT of X:
AMAT = hit time + miss rate x miss penalty
= 0.6 ns + 8% x 70 ns
= 0.6 ns + 5.6 ns
= 6.2 ns
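The calculation above can be checked with a short script. This is a minimal sketch; the function name `amat` and its parameter names are chosen here for illustration only.

```python
def amat(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time in ns: hit time + miss rate x miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# Ex1 values: 0.6 ns hit time, 8% miss rate, 70 ns main memory access.
amat_x = amat(0.6, 0.08, 70.0)
print(f"AMAT of X = {amat_x:.1f} ns")
```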
Ex2. [Figure: cache diagram]
Solution:
a) 4-way set associative cache (4 blocks per set)
b) 1024 sets
c) 1024 sets x 4 blocks/set = 4096 blocks
d) Block size = Cache size / number of blocks = 32KB / 4096 = 2^15 / 2^12 = 2^3 bytes
= 8 bytes = 2 words
e) 1024 sets = 2^10 => set index = 10 bits
Block size = 8 bytes = 2^3 => byte offset = 3 bits
The 4 blocks within a set are told apart by comparing tags, so the associativity
does not use any address bits.
32-bit address = tag + set index + byte offset
=> Tag = 32 - (10 + 3) = 19 bits
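The block count, block size, and the index and byte offset widths derived above can be reproduced with a few lines of Python. This is a sketch under the assumption of 4-byte words (consistent with "8 bytes = 2 words"); the function name `cache_geometry` and the dictionary keys are illustrative only.

```python
def cache_geometry(cache_bytes, num_sets, ways, word_bytes=4):
    """Derive block count, block size and field widths from the cache parameters."""
    num_blocks = num_sets * ways               # part c
    block_bytes = cache_bytes // num_blocks    # part d
    return {
        "blocks": num_blocks,
        "block_bytes": block_bytes,
        "block_words": block_bytes // word_bytes,   # assumes 4-byte words
        # bit_length() - 1 equals log2 for an exact power of two
        "index_bits": num_sets.bit_length() - 1,
        "offset_bits": block_bytes.bit_length() - 1,
    }


# 32KB cache, 1024 sets, 4 blocks per set
geometry = cache_geometry(32 * 1024, num_sets=1024, ways=4)
print(geometry)
```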
Ex3. Design a 128KB direct-mapped data cache that uses a 32-bit address and 16 bytes
per block. Calculate the following:
(a) How many bits are used for the byte offset?
(b) How many bits are used for the set (index) field?
(c) How many bits are used for the tag?
Solution:
a) Block size = 16 bytes = 2^4 => byte offset = 4 bits
b) The number of blocks = Cache size / block size = 128KB / 16 bytes = 2^17/2^4 = 2^13
blocks => set index = 13 bits
c) Tag bits = 32 - (13+4) = 15 bits
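The three field widths for a direct-mapped cache follow mechanically from the cache size and block size. The sketch below assumes both are powers of two; `direct_mapped_fields` is a name made up for this example.

```python
def direct_mapped_fields(cache_bytes, block_bytes, address_bits=32):
    """Return (byte offset bits, index bits, tag bits) for a direct-mapped cache."""
    offset_bits = block_bytes.bit_length() - 1    # log2(block size)
    num_blocks = cache_bytes // block_bytes       # one block per index value
    index_bits = num_blocks.bit_length() - 1      # log2(number of blocks)
    tag_bits = address_bits - index_bits - offset_bits
    return offset_bits, index_bits, tag_bits


# Ex3: 128KB cache, 16 bytes per block, 32-bit address
ex3_fields = direct_mapped_fields(128 * 1024, 16)
print(ex3_fields)
```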
Ex4. Design an 8-way set associative cache that has 16 blocks and 32 bytes per block.
Assume a 32-bit address. Calculate the following:
(a) How many bits are used for the byte offset?
(b) How many bits are used for the set (index) field?
(c) How many bits are used for the tag?
Solution:
a) Block size = 32 bytes = 2^5 => byte offset = 5 bits
b) Number of sets = Number of blocks / Number of ways = 16 / 8 = 2^1
=> set index = 1 bit
c) Tag bits = 32 – (5+1) = 26 bits
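To see how these field widths are used, the sketch below splits a 32-bit address into tag, set index, and byte offset for the Ex4 cache. The address 0x12345678 is an arbitrary example, not part of the exercise, and `split_address` is a hypothetical helper name.

```python
OFFSET_BITS = 5   # 32-byte blocks
INDEX_BITS = 1    # 16 blocks / 8 ways = 2 sets


def split_address(addr: int):
    """Split a 32-bit address into (tag, set index, byte offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


# Arbitrary example address (an assumption for illustration)
addr = 0x12345678
tag, index, offset = split_address(addr)
print(f"tag = {tag:#x}, set index = {index}, byte offset = {offset}")
```

The tag occupies the remaining 32 - (5 + 1) = 26 high-order bits, so shifting the address right by 6 yields it directly.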
Ex5. This exercise examines how cache capacity affects overall performance. In general,
cache access time is proportional to capacity. Assume that main memory accesses take 70 ns
and that memory accesses are 36% of all instructions. The following table shows data for
the L1 caches attached to each of two processors, P1 and P2.
Processor    L1 miss rate    L1 hit time
P1           8%              0.66 ns
P2           6%              0.90 ns
a) Assuming that the L1 hit time determines the cycle times for P1 and P2, what are their
respective clock rates?
b) What is the AMAT for P1 and P2?
c) Assuming a base CPI of 1.0 without any memory stalls, what is the total CPI for P1
and P2? Which processor is faster?
Solution:
a) Clock cycle time of P1 = 0.66 ns => Clock rate of P1 = 1 / 0.66 ns = 1.515 GHz
Clock cycle time of P2 = 0.90 ns => Clock rate of P2 = 1 / 0.90 ns = 1.111 GHz
b) AMAT = Hit time + Miss rate x Miss penalty
AMAT for P1 = 0.66 ns + 8% x 70 ns = 6.26 ns
AMAT for P2 = 0.90 ns + 6% x 70 ns = 5.10 ns
c)
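Parts a) and b) can be checked numerically. The sketch below covers only those two parts; the dictionary layout and variable names are chosen for illustration.

```python
MAIN_MEMORY_NS = 70.0

# (L1 hit time in ns, L1 miss rate), as used in parts a) and b)
l1_caches = {"P1": (0.66, 0.08), "P2": (0.90, 0.06)}

results = {}
for name, (hit_ns, miss_rate) in l1_caches.items():
    clock_ghz = 1.0 / hit_ns                          # cycle time equals L1 hit time
    amat_ns = hit_ns + miss_rate * MAIN_MEMORY_NS     # hit time + miss rate x penalty
    results[name] = (clock_ghz, amat_ns)
    print(f"{name}: clock rate = {clock_ghz:.3f} GHz, AMAT = {amat_ns:.2f} ns")
```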
Ex6a. A cache has 4 blocks with direct-mapped organization. Given the sequence of block
accesses 0, 4, 0, 6, 4, fill in the table below. The first access is shown as an
example.
Ex6b. A cache has 4 blocks with 2-way set associative organization. Given the sequence
of block accesses 0, 4, 0, 6, 4, 5, 9, fill in the table below. The first access is
shown as an example.
Solution:
a)
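The access sequences of Ex6a and Ex6b can be traced with a small simulation. This is a sketch, not the worked table: it assumes LRU replacement for the set associative case (the exercise text does not name a policy) and maps a block to set number "block address mod number of sets". The function name `simulate` is hypothetical.

```python
from collections import OrderedDict


def simulate(block_addresses, num_blocks, ways):
    """Trace block accesses through a cache, assuming LRU replacement.

    Returns a list of (block address, set index, "hit" or "miss").
    """
    num_sets = num_blocks // ways
    # One OrderedDict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    trace = []
    for block in block_addresses:
        index = block % num_sets
        lines = sets[index]
        if block in lines:
            lines.move_to_end(block)        # mark as most recently used
            outcome = "hit"
        else:
            if len(lines) == ways:
                lines.popitem(last=False)   # evict the least recently used block
            lines[block] = True
            outcome = "miss"
        trace.append((block, index, outcome))
    return trace


# Ex6a: direct mapped is the 1-way case
direct_mapped = simulate([0, 4, 0, 6, 4], num_blocks=4, ways=1)
# Ex6b: 2-way set associative, LRU assumed
two_way = simulate([0, 4, 0, 6, 4, 5, 9], num_blocks=4, ways=2)

for row in direct_mapped:
    print("direct mapped:", row)
for row in two_way:
    print("2-way:", row)
```

Under these assumptions every access in Ex6a misses, because blocks 0 and 4 keep evicting each other from index 0, while in Ex6b only the second access to block 0 hits.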
Midterm
a) CPI = 0.6 x 2 + 0.3 x 3 + 0.1 x 4 = 2.5 cycles/instruction
b) CPU time = IC x CPI / clock rate => 15 s = IC x 2.5 / (2.5 x 10^9 Hz)
=> IC = 15 x 10^9 instructions
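Both midterm results can be verified in a few lines. The sketch reads the instruction mix (fraction of instructions, cycles per instruction), the 2.5 GHz clock rate, and the 15 s CPU time off the solution lines above; the variable names are illustrative.

```python
# (fraction of instructions, cycles per instruction) for each instruction class
instruction_mix = [(0.6, 2), (0.3, 3), (0.1, 4)]

# a) Average CPI is the weighted sum over the instruction classes.
cpi = sum(fraction * cycles for fraction, cycles in instruction_mix)

# b) CPU time = IC x CPI / clock rate, solved for the instruction count IC.
clock_rate_hz = 2.5e9
cpu_time_s = 15.0
instruction_count = cpu_time_s * clock_rate_hz / cpi

print(f"CPI = {cpi:.1f} cycles/instruction")
print(f"IC = {instruction_count:.2e} instructions")
```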