Solution To Assignment of COA On Cache
A CPU has a 32 Kbyte direct mapped cache with a 128-byte block size. Suppose A is a two-
dimensional array of size 512 × 512 with elements that occupy 8 bytes each. Consider the
following two C code segments, P1 and P2.
P1 :
for (i = 0; i < 512; i++) {
    for (j = 0; j < 512; j++) {
        x += A[i][j];
    }
}
P2 :
for (j = 0; j < 512; j++) {
    for (i = 0; i < 512; i++) {
        x += A[i][j];
    }
}
P1 and P2 are executed independently with the same initial state, namely, the array A is not
in the cache and i, j, x are in registers. Let the number of cache misses experienced by P1 be
M1 and that for P2 be M2.
ANS: (c)
EXPLANATION: A 128-byte block holds 128/8 = 16 array elements, so when A[0][0] is first
accessed, 16 elements (A[0][0] to A[0][15]) are brought into the cache. The next 15 accesses,
A[0][1] to A[0][15], are hits, and A[0][16] is a miss again. Therefore there are 15 hits for every
miss, and P1 incurs 512 × 512/16 = 16384 misses (block transfers).
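The counting argument above can be checked with a short simulation. The following C sketch is illustrative only: it assumes P2 is the column-major traversal shown above and uses helper names of our own choosing. It models the 32 Kbyte direct-mapped cache with 128-byte blocks and replays both traversals, reporting M1 and M2.

#include <stdio.h>

#define CACHE_SIZE (32 * 1024)
#define BLOCK_SIZE 128
#define NUM_LINES  (CACHE_SIZE / BLOCK_SIZE)   /* 256 direct-mapped lines */
#define N          512
#define ELEM_SIZE  8

static long tag_store[NUM_LINES];
static int  valid[NUM_LINES];
static long misses;

/* Look up one element address in the direct-mapped cache and count misses. */
static void access_elem(int i, int j)
{
    long addr  = ((long)i * N + j) * ELEM_SIZE;  /* row-major layout */
    long block = addr / BLOCK_SIZE;
    int  line  = block % NUM_LINES;              /* direct-mapped index */
    if (!valid[line] || tag_store[line] != block) {
        misses++;                                /* cold or conflict miss */
        valid[line] = 1;
        tag_store[line] = block;
    }
}

int main(void)
{
    int i, j;

    /* P1: row-major traversal */
    misses = 0;
    for (i = 0; i < NUM_LINES; i++) valid[i] = 0;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            access_elem(i, j);
    printf("M1 = %ld\n", misses);                /* 512*512/16 = 16384 */

    /* P2: column-major traversal */
    misses = 0;
    for (i = 0; i < NUM_LINES; i++) valid[i] = 0;
    for (j = 0; j < N; j++)
        for (i = 0; i < N; i++)
            access_elem(i, j);
    printf("M2 = %ld\n", misses);                /* every access misses */
    return 0;
}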
ANS: (b)
ANS: (a)
ANS: (b)
ANS: (c)
EXPLANATION: The total number of data cache misses is 56, as follows from the given data.
Q.11 Which of the following lines of the data cache will be replaced by new blocks in
accessing the array for the second time?
a) line 4 to line 11 b) line 4 to line 12
c) line 0 to line 7 d) line 0 to line 8 [GATE-2007]
ANS: (a)
EXPLANATION: The lines that will be replaced by the new blocks are lines 4 to 11.
Q.12 Consider a 4-way set associative cache consisting of 128 lines with a line size of 64
words. The CPU generates a 20-bit address of a word in main memory. The numbers of bits in
the TAG, LINE and WORD fields are, respectively,
a) 9, 6, 5 b) 7, 7, 6
c) 7, 5, 8 d) 9, 5, 6 [GATE-2007]
ANS: (b)
EXPLANATION: 7 bits are required for the LINE field because there are 128 = 2^7 lines. Each
line holds 64 = 2^6 words, so the WORD field needs 6 bits. Since a 20-bit word address is
generated, the bits required for the TAG field = 20 − (7 + 6) = 20 − 13 = 7 bits.
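As a quick cross-check, the field widths can be computed mechanically. The following C sketch is a minimal illustration that simply follows the breakdown used in the explanation above, deriving the TAG, LINE and WORD widths from the line count, words per line and address width.

#include <stdio.h>

/* log2 for exact powers of two */
static int log2_exact(unsigned int x)
{
    int bits = 0;
    while (x > 1) { x >>= 1; bits++; }
    return bits;
}

int main(void)
{
    int addr_bits = 20;                 /* word address generated by the CPU */
    int line_bits = log2_exact(128);    /* 128 lines -> 7 bits */
    int word_bits = log2_exact(64);     /* 64 words per line -> 6 bits */
    int tag_bits  = addr_bits - line_bits - word_bits;   /* 20 - 13 = 7 */

    printf("TAG = %d, LINE = %d, WORD = %d\n", tag_bits, line_bits, word_bits);
    return 0;
}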
Q.13 In an instruction execution pipeline, the earliest that the data TLB (Translation
Lookaside Buffer) can be accessed is
a) Before effective address calculation has started
b) During effective address calculation
c) After effective address calculation has completed
d) After data cache lookup has completed
ANS: (b)
EXPLANATION: The data TLB translates the effective (virtual) address, so its lookup can begin,
at the earliest, while the effective address is being calculated; it cannot start before that
calculation has begun.
Common Data for Questions 14, 15 and 16
Consider a machine with a 2-way set associative data cache of size 64 Kbyte and block size
16 byte. The cache is managed using 32 bit virtual addresses and the page size is 4 Kbyte. A
program to be run on this machine begins as follows.
double ARR[1024][1024];
int i, j;
/* Initialize array ARR to 0.0 */
for (i = 0; i < 1024; i++)
    for (j = 0; j < 1024; j++)
        ARR[i][j] = 0.0;
The size of double is 8 bytes. Array ARR is located in memory starting at the beginning of
virtual page 0xFF000 and stored in row-major order. The cache is initially empty and no
pre-fetching is done.
The only data memory references made by the program are those to array ARR.
ANS: (b)
EXPLANATION:
The 32-bit virtual address splits into TAG = 17 bits, SET = 11 bits and BLOCK offset = 4 bits.
From this split and the given conditions, the total size of the tags comes out to be 17 × 2 ×
1024 = 34 Kbit.
Q.15 Which of the following array elements has the same cache index as ARR [0][0]?
a) ARR [0][4] b) ARR [4] [0]
c) ARR [0] [5] d) ARR [5] [0] [GATE-2008]
ANS: (b)
EXPLANATION: The array element ARR[4][0] has the same cache index as ARR[0][0]. Each
row of ARR holds 1024 elements of 8 bytes, i.e. 8 Kbyte, so ARR[4][0] lies 4 × 8 Kbyte =
32 Kbyte beyond ARR[0][0]. One way of the cache spans 2048 sets × 16 bytes = 32 Kbyte, so
two addresses that differ by exactly 32 Kbyte map to the same set.
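A small sketch can confirm which elements share a set with ARR[0][0]. It assumes the base virtual address 0xFF000000 (page 0xFF000 with 4 Kbyte pages) and the cache geometry given above (2048 sets, 16-byte blocks); the helper name set_index is ours.

#include <stdio.h>

#define BASE        0xFF000000UL   /* virtual page 0xFF000 * 4 Kbyte page size */
#define SETS        2048UL         /* 64 Kbyte / (2 ways * 16-byte blocks) */
#define OFFSET_BITS 4              /* 16-byte block */

static unsigned long set_index(int i, int j)
{
    unsigned long addr = BASE + ((unsigned long)i * 1024 + j) * 8;  /* row major, 8-byte doubles */
    return (addr >> OFFSET_BITS) % SETS;
}

int main(void)
{
    printf("ARR[0][0] -> set %lu\n", set_index(0, 0));
    printf("ARR[0][4] -> set %lu\n", set_index(0, 4));   /* different set */
    printf("ARR[4][0] -> set %lu\n", set_index(4, 0));   /* same set as ARR[0][0] */
    printf("ARR[0][5] -> set %lu\n", set_index(0, 5));   /* different set */
    printf("ARR[5][0] -> set %lu\n", set_index(5, 0));   /* different set */
    return 0;
}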
ANS: (b)
***Q.17 For inclusion to hold between two cache levels L1 and L2 in a multilevel cache
hierarchy, which of the following are necessary?
1. L1 must be a write-through cache.
2. L2 must be a write-through cache.
3. The associativity of L2 must be greater than that of L1.
4. The L2 Cache must be at least as large as the L1 cache.
a) 4 b) 1 and 4
c) 1, 2 and 4 d) 1, 2, 3 and 4 [GATE-2008]
ANS: (a)
EXPLANATION: For inclusion to hold in a multilevel cache hierarchy, the L2 cache must be at
least as large as the L1 cache, since every block held in L1 must also be present in L2. Neither
level being write-through nor a higher associativity for L2 is necessary, so only condition 4 is
required.
*Q.18 How many 32K × 1 RAM chips are needed to provide a memory capacity of 256
Kbyte?
a) 8 b) 32
c) 64 d) 128 [GATE-2009]
ANS: (c)
EXPLANATION: As given, the basic RAM chip is 32K × 1 and we have to build a memory of
256K × 8 (256 Kbyte).
Therefore, number of chips required = (256K × 8)/(32K × 1)
= (256 × 1024 × 8)/(32 × 1024 × 1)
= 64 = 8 × 8
i.e. 8 chips in parallel (to supply the 8-bit word) × 8 such banks (to provide 256K words).
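The chip-count arithmetic can also be written out as a tiny C sketch (purely illustrative; variable names are ours).

#include <stdio.h>

int main(void)
{
    long required_bits = 256L * 1024 * 8;   /* 256 Kbyte = 256K x 8 bits */
    long chip_bits     = 32L * 1024 * 1;    /* one 32K x 1 RAM chip */
    long chips         = required_bits / chip_bits;   /* 64 */

    /* 8 chips in parallel give an 8-bit word; 8 such banks give 256K words */
    printf("chips needed = %ld = 8 x 8\n", chips);
    return 0;
}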
*Q.19 A main memory unit with a capacity of 4 megabyte is built using 1M × 1 bit DRAM
chips. Each DRAM chip has 1 k rows of cells with 1 k cells in each row. The time taken for a
single refresh operation is 100 ns. The time required to perform one refresh operation on all
the cells in the memory unit is
a) 100 ns b) 100 × 2^10 ns
c) 100 × 2^20 ns d) 3200 × 2^20 ns [GATE-2010]
ANS: (d)
EXPLANATION: The 4 Mbyte memory needs 4 MB / (1M × 1 bit) = 32 DRAM chips, each
holding 1K × 1K = 2^20 one-bit cells. Taking one refresh operation of 100 ns per cell, the total
number of refresh operations is 32 × 2^20, so the time required to refresh all the cells in the
memory unit is 32 × 2^20 × 100 × 10^-9 s = 3200 × 2^20 ns.
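The same arithmetic, following the interpretation used above (one 100 ns refresh operation per one-bit cell), as a small C sketch with variable names of our own:

#include <stdio.h>

int main(void)
{
    long chips          = (4L << 20) * 8 / (1L << 20);  /* 4 Mbyte / (1M x 1 bit) = 32 */
    long cells_per_chip = 1L << 20;                      /* 1K rows x 1K cells */
    long total_cells    = chips * cells_per_chip;        /* 32 x 2^20 */
    double time_ns      = (double)total_cells * 100.0;   /* 100 ns per refresh */

    printf("total refresh time = %.0f ns = %ld x 2^20 ns\n",
           time_ns, (long)(time_ns / (1L << 20)));       /* 3200 x 2^20 ns */
    return 0;
}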
Q.20 Consider a 4-way set associative cache (initially empty) with total 16 cache blocks. The
main memory consists of 256 blocks and the requests for memory blocks arrive in the following
order:
0, 255, 1, 4, 3, 8, 133, 159, 216, 129, 63, 8, 48, 32, 73, 92, 155
Which one of the following memory blocks will not be in cache if LRU replacement policy is
used?
a) 3 b) 8
c) 129 d) 216 [GATE-2010]
ANS: (d)
There are 16/4 = 4 sets, and memory block b maps to set (b mod 4):
0 mod 4 = 0 (set 0), 255 mod 4 = 3 (set 3), and similarly for the remaining references.
Replaying the reference string with LRU replacement within each set gives the final contents:
Set 0: 8, 48, 32, 92 (blocks 0, 4 and 216 are evicted)
Set 1: 1, 133, 129, 73
Set 2: empty
Set 3: 3, 159, 63, 155 (block 255 is evicted)
Block 216 is therefore not in the cache.
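The replacement trace above can be reproduced with a small LRU simulator. The sketch below is illustrative (data structures and names are ours): it keeps 4 sets of 4 ways, maps block b to set b mod 4, replays the given reference string and prints the final contents of each set; block 216 does not appear.

#include <stdio.h>

#define SETS 4
#define WAYS 4

static int blocks[SETS][WAYS];   /* block number stored in each way, -1 = empty */
static int age[SETS][WAYS];      /* larger value = more recently used */
static int clock_tick;

static void access_block(int b)
{
    int s = b % SETS, w, victim = 0;

    clock_tick++;
    for (w = 0; w < WAYS; w++)                    /* hit: just refresh recency */
        if (blocks[s][w] == b) { age[s][w] = clock_tick; return; }
    for (w = 0; w < WAYS; w++) {                  /* miss: pick an empty or the LRU way */
        if (blocks[s][w] == -1) { victim = w; break; }
        if (age[s][w] < age[s][victim]) victim = w;
    }
    blocks[s][victim] = b;
    age[s][victim]    = clock_tick;
}

int main(void)
{
    int refs[] = {0, 255, 1, 4, 3, 8, 133, 159, 216, 129, 63, 8, 48, 32, 73, 92, 155};
    int i, s, w;

    for (s = 0; s < SETS; s++)
        for (w = 0; w < WAYS; w++)
            blocks[s][w] = -1;

    for (i = 0; i < (int)(sizeof refs / sizeof refs[0]); i++)
        access_block(refs[i]);

    for (s = 0; s < SETS; s++) {                  /* final contents: 216 has been evicted */
        printf("Set %d:", s);
        for (w = 0; w < WAYS; w++)
            if (blocks[s][w] != -1) printf(" %d", blocks[s][w]);
        printf("\n");
    }
    return 0;
}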
Q.21 When there is a miss in L1 cache and a hit in L2 Cache, a block is transferred from L2
cache to L1 cache. What is the time taken for this transfer?
a) 2 ns b) 20 ns
c) 22 ns d) 88 ns [GATE-2010]
ANS: (c)
As already given in the question,
Access time for L1 = 2 ns
Access time for L2 = 20 ns
Now the required transfer time = 20 + 2 = 22 ns.
***Q.22 When there is a miss in both L1 cache and L2 cache, first a block is transferred from
main memory to L2 cache, and then a block is transferred from L2 cache to L1 cache. What
is the total time taken for these transfers?
a) 222 ns b) 880 ns
c) 902 ns d) 968 ns [GATE-2010]
ANS: (a)
*Q.23 An 8 Kbyte direct mapped write back cache is organized as multiple blocks, each of
size 32 byte. The processor generates 32-bit addresses. The cache controller maintains the tag
information for each cache block, comprising the following:
1 Valid bit
1 Modified bit
As many bits as the minimum needed to identify the memory block mapped in the cache.
What is the total size of memory needed at the cache controller to store metadata (tags) for
the cache?
a) 4864 bit b) 6144 bit
c) 6656 bit d) 5376 bit [GATE-2011]
ANS: (d)
Number of blocks in the cache = (8 × 2^10 × 8)/(32 × 8) = 2^8 = 256, so 8 bits are needed to
identify each cache line. The block size is 32 bytes, so 5 bits are needed to identify the byte
within a block. The tag is therefore 32 − (8 + 5) = 19 bits, and together with the valid and
modified bits each block needs 19 + 1 + 1 = 21 bits of metadata. Total metadata = 21 × 256 =
5376 bits.
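The metadata arithmetic in a compact C sketch (illustrative only):

#include <stdio.h>

int main(void)
{
    int addr_bits   = 32;
    int num_blocks  = (8 * 1024) / 32;                       /* 256 = 2^8 lines */
    int offset_bits = 5;                                      /* 32-byte block */
    int index_bits  = 8;                                      /* 256 lines */
    int tag_bits    = addr_bits - index_bits - offset_bits;   /* 19 bits */
    int per_block   = tag_bits + 1 + 1;                       /* tag + valid + modified */

    printf("metadata = %d bits\n", per_block * num_blocks);   /* 21 x 256 = 5376 */
    return 0;
}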
***Q.25 A RAM chip has a capacity of 1024 words of 8 bits each (1K×8). The number of
2×4 decoders with enable line needed to construct a 16K ×16 RAM from 1K×8 RAM is
a) 4 b) 5
c) 6 d) 7 [GATE-2013]
ANS: (b)
EXPLANATION: A 16K × 16 memory needs 14 address bits. Each 1K × 8 chip decodes 10 of
them internally, leaving 4 bits to select one of the 16 rows of chips. Building the required
4-to-16 decoder from 2×4 decoders with enable takes 1 decoder in the first level and 4 in the
second level, i.e. 5 decoders in total.
***Q.27 A 4-way set-associative cache memory unit with a capacity of 16 KB is built using
a block size of 8 words. The word length is 32 bits. The size of the physical address space is 4
GB. The number of bits for the TAG field is _____ [GATE-2014-2]
ANS: (20)
Physical address size = 32 bits
Cache size = 16 Kbyte = 2^14 bytes
Block size = 8 words × 4 bytes = 32 bytes = 2^5 bytes, so the block offset is 5 bits
Number of blocks = 2^14/2^5 = 2^9
Number of sets = 2^9/4 = 2^7, so the set index is 7 bits
TAG = 32 − (7 + 5) = 20 bits.
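The same field split for Q.27, as a short C sketch (illustrative; names are ours):

#include <stdio.h>

int main(void)
{
    int addr_bits   = 32;                                   /* 4 GB physical space */
    int block_bytes = 8 * 4;                                 /* 8 words x 4 bytes = 32 bytes */
    int num_sets    = (16 * 1024) / (block_bytes * 4);       /* 4-way => 128 sets */
    int offset_bits = 0, set_bits = 0, n;

    for (n = block_bytes; n > 1; n >>= 1) offset_bits++;     /* 5 bits */
    for (n = num_sets;    n > 1; n >>= 1) set_bits++;        /* 7 bits */

    printf("TAG = %d bits\n", addr_bits - set_bits - offset_bits);   /* 20 bits */
    return 0;
}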
Q.28 In designing a computer’s cache system, the cache block (or cache line) size is an
important parameter. Which one of the following statements is correct in this context?
a) A smaller block size implies better spatial locality
b) A smaller block size implies a smaller cache tag and hence lower cache tag overhead
c) A smaller block size implies a larger cache tag and hence lower cache hit time
d) A smaller block size incurs a lower cache miss penalty [GATE-2014-2]
ANS: (d)
A smaller block means that less data has to be fetched from the next level of the memory
hierarchy on a miss, so the miss penalty is lower. A smaller block also lets the cache hold more
blocks, but that influences the hit ratio, not the miss penalty.
Q.29 If the associativity of a processor cache is doubled while keeping the capacity and block
size unchanged, which one of the following is guaranteed to be NOT affected?
a) Width of tag comparator
b) Width of set index decoder
c) Width of way selection multiplexor
d) Width of processor to main memory data bus [GATE-2014-2]
ANS: (d)
When the associativity is doubled with the capacity and block size unchanged, the number of
sets is halved, so the set index loses one bit and the tag gains one bit: the width of the TAG
comparator and the width of the set index decoder are both affected. The way-selection
multiplexor also becomes wider because each set now has more ways. Only the width of the
processor to main memory data bus is guaranteed to be NOT affected.
***Q.30 The memory access time is 1 nanosecond for a read operation with a hit in cache, 5
nanoseconds for a read operation with a miss in cache, 2 nanoseconds for a write operation
with a hit in cache and 10 nanoseconds for a write operation with a miss in cache. Execution
of a sequence of instructions involves 100 instruction fetch operations, 60 memory operand
read operations and 40 memory operand write operations. The cache hit-ratio is 0.9. The
average memory access time (in nanoseconds) in executing the sequence of instructions is
__________. [GATE-2014-3]