Solution To Assignment of COA On Cache

Uploaded by samar

Statements for Linked Questions no 6 & 7

A CPU has a 32 Kbyte direct mapped cache with 128-byte block size. Suppose A is a two-dimensional
array of size 512 × 512 with elements that occupy 8 bytes each. Consider the
following two C code segments, P1 and P2.

P1:
for (i = 0; i < 512; i++) {
    for (j = 0; j < 512; j++) {
        x += A[i][j];
    }
}

P2:
for (i = 0; i < 512; i++) {
    for (j = 0; j < 512; j++) {
        x += A[j][i];
    }
}

P1 and P2 are executed independently with the same initial state, namely, the Array A is not
in the cache and i, j, x are in registers. Let the number of cache misses experienced by P1 be
M1 and that for P2 be M2.

Q.6 The value M1 is


a) Zero b) 2048
c) 16384 d) 262144 [GATE-2006]

ANS: (c)

EXPLANATION: Each 128-byte block holds 128/8 = 16 array elements. When A[0][0] is accessed,
the block holding A[0][0] to A[0][15] is brought into the cache, so the next 15 accesses
(A[0][1] to A[0][15]) are hits and the next miss occurs at A[0][16]. P1 therefore has one
miss and 15 hits for every 16 accesses, so M1 = 512 × 512/16 = 16384 block transfers.

Q.7 The value of the ratio M1/M2 is


a) Zero b) 1/16
c) 1/8 d) 16 [GATE-2006]

ANS: (b)

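EXPLANATION: In P2 the next element accessed after A[0][0] is A[1][0], so the elements
A[0][1] to A[0][15] brought into the cache are of no immediate use. Each row occupies
512 × 8 = 4096 bytes = 32 blocks and the cache has only 32 Kbyte/128 byte = 256 lines, so
rows j and j + 8 map to the same line and every block is replaced before it can be reused.
Thus, there will be 262144 (512 × 512) misses and no hits. Therefore,
M1/M2 = 16384/262144 = 1/16.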
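
The two miss counts can be checked with a small simulation. The following is a minimal sketch, not part of the original solution: it assumes array A starts at byte address 0 (block aligned), and the name count_misses and its column_major flag are illustrative.

```c
#include <stddef.h>

#define CACHE_BYTES 32768u                       /* 32 Kbyte cache */
#define BLOCK_BYTES 128u                         /* 128-byte blocks */
#define NUM_LINES   (CACHE_BYTES / BLOCK_BYTES)  /* 256 lines */
#define N           512u                         /* array is N x N */
#define ELEM_BYTES  8u

/* Count misses of a direct mapped cache while summing an N x N row-major
   array assumed to start at byte address 0.
   column_major = 0 models P1 (A[i][j]); column_major = 1 models P2 (A[j][i]). */
unsigned long count_misses(int column_major)
{
    unsigned long held[NUM_LINES];   /* block number + 1 per line, 0 = empty */
    unsigned long misses = 0;

    for (size_t k = 0; k < NUM_LINES; k++)
        held[k] = 0;

    for (unsigned long i = 0; i < N; i++) {
        for (unsigned long j = 0; j < N; j++) {
            unsigned long row = column_major ? j : i;
            unsigned long col = column_major ? i : j;
            unsigned long addr = (row * N + col) * ELEM_BYTES;
            unsigned long block = addr / BLOCK_BYTES;
            size_t line = (size_t)(block % NUM_LINES);

            if (held[line] != block + 1) {   /* miss: load the block */
                held[line] = block + 1;
                misses++;
            }
        }
    }
    return misses;
}
```

Under these assumptions, count_misses(0) gives M1 and count_misses(1) gives M2.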

Common Data for Questions no 8 and 9


Consider two cache organizations:
The first one is 32 Kbyte 2-way set associative with 32 byte block size. The second one is
of the same size but direct mapped. The size of an address is 32 bit in both cases. A 2-to-1
multiplexer has a latency of 0.6 ns while a k-bit comparator has a latency of k/10 ns. The hit
latency of the set associative organization is h1 while that of the direct mapped one is h2.

Q.8 The value of h1 is


a) 2.4 ns b) 2.3 ns
c) 1.8 ns d) 1.7 ns [GATE-2006]

ANS: (a)

EXPLANATION: A 32 Kbyte 2-way set associative cache with 32-byte blocks has
2^15 / (2^5 × 2) = 2^9 sets, so the 32-bit address is divided as follows:

TAG   SET   BLOCK OFFSET
18    9     5

Therefore h1 = 18/10 + 0.6 = 2.4 ns

Q.9 The value of h2 is


a) 2.4 ns b) 2.3 ns
c) 1.8 ns d) 1.7 ns [GATE-2006]

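ANS: (d)

EXPLANATION: The direct mapped cache has 2^15 / 2^5 = 2^10 lines, so the 32-bit address is
divided as follows:

TAG   SET   BLOCK OFFSET
17    10    5

A direct mapped cache needs only the 17-bit tag comparison. There is no way to select, so no
multiplexer lies on the hit path. Therefore, h2 = 17/10
= 1.7 ns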
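
The tag widths used for h1 and h2 follow directly from the cache geometry. The sketch below is illustrative only: the names log2u and tag_bits are not from the original, sizes are assumed to be exact powers of two, and the multiplexer term is left out so that only the comparator width is computed.

```c
/* Integer log2, valid for exact powers of two. */
static unsigned log2u(unsigned long x)
{
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Tag width in bits: address bits minus set index bits minus block offset bits.
   A k-bit tag comparator then costs k/10 ns according to the question. */
unsigned tag_bits(unsigned addr_bits, unsigned long cache_bytes,
                  unsigned long block_bytes, unsigned ways)
{
    unsigned long sets = cache_bytes / (block_bytes * ways);
    return addr_bits - log2u(sets) - log2u(block_bytes);
}
```

For example, tag_bits(32, 32768, 32, 2) corresponds to the 2-way organization and tag_bits(32, 32768, 32, 1) to the direct mapped one.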

Statements for Linked Questions no 10 and 11


Consider a machine with a byte addressable main memory of 2^16 bytes.
Assume that a direct mapped data cache consisting of 32 lines of 64 bytes each is used in the
system. A 50 × 50 two-dimensional array of bytes is stored in the main memory starting
from memory location 1100H. Assume that the data cache is initially empty. The complete
array is accessed twice. Assume that the contents of the data cache do not change in between
the two accesses.

Q.10 How many data cache misses will occur in total?


a) 48 b) 50
c) 56 d) 59 [GATE-2007]

ANS: (c)

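EXPLANATION: Each line holds 64 bytes, so address 1100H = 4352 lies at the start of memory
block 4352/64 = 68, which maps to line 68 mod 32 = 4. The 2500-byte array ends at address
6851 in block 107, so it occupies 40 blocks (68 to 107). The first access misses once per
block: 40 misses. Blocks 68 to 75 and blocks 100 to 107 both map to lines 4 to 11, so after
the first access these lines hold blocks 100 to 107. In the second access, blocks 68 to 75
miss, blocks 76 to 99 hit, and blocks 100 to 107 miss again: 16 misses.
Total = 40 + 16 = 56.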

Q.11 Which of the following lines of the data cache will be replaced by new blocks in
accessing the array for the second time?
a) line 4 to line 11 b) line 4 to line 12
c) line 0 to line 7 d) line 0 to line 8 [GATE-2007]

ANS: (a)
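EXPLANATION: Only lines 4 to 11 are shared by two blocks of the array (blocks 68 to 75 and
blocks 100 to 107), so these are the lines that will be replaced by new blocks during the
second access.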
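
Both answers can be reproduced by stepping through the array byte by byte. This is a minimal sketch under the assumptions stated in the question; the function name simulate and the replaced_in_last output array are illustrative, not from the original.

```c
#define LINES       32u
#define LINE_BYTES  64u
#define ARRAY_BYTES (50u * 50u)   /* 50 x 50 array of bytes */
#define BASE        0x1100u       /* starting address 1100H */

/* Access every byte of the array `passes` times through a direct mapped
   cache and return the total number of misses. On return,
   replaced_in_last[l] is 1 for every line that loaded a new block during
   the final pass and 0 otherwise. */
unsigned simulate(unsigned passes, int replaced_in_last[LINES])
{
    unsigned held[LINES];   /* block number + 1 per line, 0 = empty */
    unsigned misses = 0;

    for (unsigned l = 0; l < LINES; l++)
        held[l] = 0;

    for (unsigned p = 0; p < passes; p++) {
        for (unsigned l = 0; l < LINES; l++)
            replaced_in_last[l] = 0;

        for (unsigned off = 0; off < ARRAY_BYTES; off++) {
            unsigned block = (BASE + off) / LINE_BYTES;
            unsigned line = block % LINES;

            if (held[line] != block + 1) {   /* miss: load the block */
                held[line] = block + 1;
                replaced_in_last[line] = 1;
                misses++;
            }
        }
    }
    return misses;
}
```

Calling simulate(2, flags) returns the total for Q.10, and the entries of flags that are set identify the lines asked for in Q.11.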

Q.12 Consider a 4-way set associative cache consisting of 128 lines with a line size of 64
words. The CPU generates a 20-bit address of a word in main memory. The number of bits in
the TAG, LINE and WORD fields are respectively.
a) 9, 6, 5 b) 7, 7, 6
c) 7, 5, 8 d) 9, 5, 6 [GATE-2007]

ANS: (d)

EXPLANATION: The cache has 128 lines arranged as 4-way sets, so there are 128/4 = 32 = 2^5
sets and the LINE (set) field needs 5 bits. Each line holds 64 = 2^6 words, so the WORD field
needs 6 bits. The CPU generates a 20-bit address for a word in main memory, so the bits
required for the TAG = 20 – (5 + 6) = 20 – 11 = 9 bits.

Q.13 In an instruction execution pipeline, the earliest that the data TLB (Translation
Lookaside Buffer) can be accessed is
a) Before effective address calculation has started
b) During effective address calculation
c) After effective address calculation has completed
d) After data cache lookup has completed

ANS: (b)

EXPLANATION: The earliest point at which the data TLB (Translation Lookaside Buffer) can be
accessed is during effective address calculation.

Common Data for Questions no 14, 15 and 16

Consider a machine with a 2-way set associative data cache of size 64 Kbyte and block size
16 byte. The cache is managed using 32 bit virtual addresses and the page size is 4 Kbyte. A
program to be run on this machine begins as follows.

double ARR[1024][1024];
int i, j;
/* Initialize array ARR to 0.0 */
for (i = 0; i < 1024; i++)
    for (j = 0; j < 1024; j++)
        ARR[i][j] = 0.0;

The size of double is 8 bytes. Array ARR is located in memory starting at the beginning of
virtual page 0xFF000 and stored in row major order. The cache is initially empty and no
pre-fetching is done.
The only data memory references made by the program are those to array ARR.

Q.14 The total size of the tags in the cache directory is


a) 32 Kbit b) 34 Kbit
c) 64 Kbit d) 68 Kbit [GATE-2008]

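ANS: (d)

EXPLANATION: Number of blocks = 64 Kbyte / 16 byte = 2^12 and number of sets = 2^12 / 2 = 2^11,
so the 32-bit address is divided as follows:

TAG   SET   BLOCK OFFSET
17    11    4

Each of the 2 × 2^11 = 4096 blocks stores a 17-bit tag, so the total size of the tags comes
out to be 17 × 2 × 2048 bit = 68 Kbit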

Q.15 Which of the following array elements has the same cache index as ARR [0][0]?
a) ARR [0][4] b) ARR [4] [0]
c) ARR [0] [5] d) ARR [5] [0] [GATE-2008]

ANS: (b)
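EXPLANATION: The block offset (4 bits) and set index (11 bits) together cover 2^15 bytes =
32 Kbyte, so two addresses that differ by a multiple of 32 Kbyte have the same cache index.
One row contains 1024 elements of 8 bytes, i.e., 2^13 bytes = 8 Kbyte, so ARR[4][0] lies
exactly 4 × 8 Kbyte = 32 Kbyte after ARR[0][0] and has the same cache index.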
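
The cache index of each candidate element can be computed directly from its address. This is a minimal sketch: the helper names arr_addr and set_index are illustrative, and the base address 0xFF000000 is derived from virtual page 0xFF000 with 4 Kbyte pages.

```c
#define BLOCK_BYTES 16ul
#define NUM_SETS    2048ul         /* 64 Kbyte / (16 byte x 2 ways) */
#define ARR_BASE    0xFF000000ul   /* start of virtual page 0xFF000, 4 Kbyte pages */

/* Virtual address of ARR[i][j]: row major order, 1024 doubles of 8 bytes per row. */
unsigned long arr_addr(unsigned long i, unsigned long j)
{
    return ARR_BASE + (i * 1024ul + j) * 8ul;
}

/* Set index: block number modulo the number of sets. */
unsigned long set_index(unsigned long addr)
{
    return (addr / BLOCK_BYTES) % NUM_SETS;
}
```

Comparing set_index(arr_addr(0, 0)) with the index of each option shows which element shares a set with ARR[0][0].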

Q.16 The cache hit ratio for this initialization loop is


a) 0% b) 25%
c) 50% d) 75% [GATE-2008]

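ANS: (c)

EXPLANATION: We know that hit ratio is given as

Number of hits/(Number of hits + Number of misses)

Each 16-byte block holds two 8-byte doubles and the array is written sequentially in row
major order. The first access to each block is a miss and the second is a hit, so the
hit ratio = 1/2 = 50%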

***Q.17 For inclusion to hold between two cache levels, L1 and L2, in a multilevel cache
hierarchy, which of the following are necessary?
1. L1 must be a write-through cache.
2. L2 must be a write-through cache.
3. The associativity of L2 must be greater than that of L1.
4. The L2 Cache must be at least as large as the L1 cache.
a) 4 b) 1 and 4
c) 1, 2 and 4 d) 1, 2, 3 and 4 [GATE-2008]

ANS: (a)

EXPLANATION: For inclusion to hold in a multilevel cache hierarchy, every block in L1 must
also be present in L2, so the L2 cache must be at least as large as the L1 cache. Neither
level has to be a write-through cache, and the associativity of L2 does not have to be
strictly greater than that of L1, so statements 1, 2 and 3 are not necessary.

*Q.18 How many 32K × 1 RAM chips are needed to provide a memory capacity of 256
Kbyte?
a) 8 b) 32
c) 64 d) 128 [GATE-2009]

ANS: (c)

EXPLANATION: As given, the basic RAM chip is 32K × 1 and we have to design a RAM of
256 Kbyte = 256K × 8.
Therefore, number of chips required = (256K × 8)/(32K × 1)
= (256 × 1024 × 8)/(32 × 1024 × 1)
= 64 = 8 × 8
That is, 8 rows of chips (to reach 256K words) with 8 chips in each row (to reach the 8-bit
width).
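
The same chip count can be written as a small helper. This is a sketch only: the struct chip_array and the function chips_needed are illustrative names, with rows extending the number of words and columns extending the word width.

```c
/* Layout of a memory built from smaller chips. */
struct chip_array {
    unsigned long rows;    /* chips stacked to reach the word count */
    unsigned long cols;    /* chips side by side to reach the word width */
    unsigned long total;   /* rows * cols */
};

/* Number of chip_words x chip_width chips needed for a
   target_words x target_width memory (rounded up in each dimension). */
struct chip_array chips_needed(unsigned long target_words, unsigned long target_width,
                               unsigned long chip_words, unsigned long chip_width)
{
    struct chip_array a;
    a.rows = (target_words + chip_words - 1) / chip_words;
    a.cols = (target_width + chip_width - 1) / chip_width;
    a.total = a.rows * a.cols;
    return a;
}
```

For this question the call is chips_needed(256 * 1024, 8, 32 * 1024, 1), which gives 8 rows, 8 columns and 64 chips in total.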

*Q.19 A main memory unit with a capacity of 4 megabyte is built using 1M × 1 bit DRAM
chips. Each DRAM chip has 1 k rows of cells with 1 k cells in each row. The time taken for a
single refresh operation is 100 ns. The time required to perform one refresh operation on all
the cells in the memory unit is
a) 100 ns b) 100 × 2^10 ns
c) 100 × 2^20 ns d) 3200 × 2^20 ns [GATE-2010]

ANS: (d)

EXPLANATION: Since the capacity is 4 MB, the number of 1M × 1 bit chips required is

(4 × 2^20 × 8)/(2^20 × 1) = 32 ... (A)

and each chip holds 1K × 1K (rows × cells) = 2^20 cells ... (B)

Taking one 100 ns refresh operation per cell, with the chips refreshed one after another,
the time required to perform one refresh operation on all the cells in the memory unit is
A × B × 100 ns = 32 × 2^20 × 100 ns
= 3200 × 2^20 ns

Q.20 Consider a 4-way set associative cache (initially empty) with total 16 cache blocks. The
main memory consists of 256 blocks and the request for memory blocks is in the following
order:
0, 255, 1, 4, 3, 8, 133, 159, 216, 129, 63, 8, 48, 32, 73, 92, 155
Which one of the following memory blocks will not be in cache if LRU replacement policy is
used?
a) 3 b) 8
c) 129 d) 216 [GATE-2010]

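ANS: (d)

EXPLANATION: With 16 blocks and 4 ways there are 16/4 = 4 sets, and memory block b maps to
set b mod 4:
0 mod 4 = 0 (set 0)
255 mod 4 = 3 (set 3)
The sets for the other requests are determined in the same way. Applying LRU replacement
within each set gives the following final contents:

Set 0: 8, 48, 32, 92 (48 replaced 0, 32 replaced 4, 92 replaced 216)
Set 1: 1, 133, 129, 73
Set 2: (empty)
Set 3: 3, 159, 63, 155 (155 replaced 255)

Block 8 is referenced a second time before set 0 overflows, so 216 is the least recently
used block when 92 arrives and it is not in the cache at the end.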
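
The replacement sequence can be traced with a short LRU simulation. This is a minimal sketch, not the original author's method: the function name resident_after is illustrative, and each set is kept as a small array ordered from least to most recently used.

```c
#define NUM_SETS 4
#define WAYS     4

/* Run the block requests through a 4-way set associative LRU cache with
   4 sets and report whether `query` is resident at the end (1) or not (0). */
int resident_after(const int *requests, int n, int query)
{
    int blocks[NUM_SETS][WAYS];    /* per set: index 0 is LRU, last used is MRU */
    int used[NUM_SETS] = {0};      /* number of valid ways in each set */

    for (int r = 0; r < n; r++) {
        int b = requests[r];
        int s = b % NUM_SETS;
        int pos = -1;

        for (int w = 0; w < used[s]; w++)
            if (blocks[s][w] == b)
                pos = w;           /* hit at position w */

        if (pos < 0) {
            if (used[s] < WAYS)
                pos = used[s]++;   /* miss: fill an empty way */
            else
                pos = 0;           /* miss: evict the LRU block */
        }

        /* Close the gap at pos and place b at the MRU end. */
        for (int w = pos; w + 1 < used[s]; w++)
            blocks[s][w] = blocks[s][w + 1];
        blocks[s][used[s] - 1] = b;
    }

    int s = query % NUM_SETS;
    for (int w = 0; w < used[s]; w++)
        if (blocks[s][w] == query)
            return 1;
    return 0;
}
```

Passing the 17 requests from the question and querying each of the four options shows which block has been evicted.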

Statements for Linked Questions no 21 and 22


The computer system has an L1 cache, an L2 cache and a main memory unit connected as
shown below. The block size in L1 cache is 4 words. The block size in L2 cache is 16 words.
The memory access times are 2 ns, 20 ns and 200 ns, for L1 cache, L2 cache and main memory
unit respectively.

Q.21 When there is a miss in L1 cache and a hit in L2 Cache, a block is transferred from L2
cache to L1 cache. What is the time taken for this transfer?
a) 2 ns b) 20 ns
c) 22 ns d) 88 ns [GATE-2010]

ANS: (c)

As already given in the question,
Memory access time for L1 = 2 ns
Memory access time for L2 = 20 ns
Now the required time of transfer = 20 + 2 = 22 ns

***Q.22 When there is a miss in both L1 cache and L2 cache, first a block is transferred from
main memory to L2 Cache, and then a block is transferred from L2 Cache to L1 cache what
is the total time taken for these transfer?
a) 222 ns b) 880 ns
c) 902 ns d) 968 ns [GATE-2010]

ANS: (a)

As given, memory access time for main memory = 200 ns
Memory access time for L1 = 2 ns
Memory access time for L2 = 20 ns
Total access time = Block transfer time from main memory to L2 cache + Access time of L2 +
Access time of L1
Now, the required time of transfer = 200 + 20 + 2 = 222 ns

*Q.23 An 8 Kbyte direct mapped write back cache is organized as multiple blocks, each of
size 32 byte. The processor generates 32-bit addresses. The cache controller maintains the tag
information for each cache block comprising the following:
1 Valid bit
1 Modified bit
As many bits as the minimum needed to identify the memory block mapped in the cache.
What is the total size of memory needed at the cache controller to store metadata (tags) for
the cache?
a) 4864 bit b) 6144 bit
c) 6656 bit d) 5376 bit [GATE-2011]

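ANS: (d)

Number of blocks in cache = (8 × 2^10 × 8)/(32 × 8) = 2^13/2^5 = 2^8, so 8 bits are needed to
identify each block. The size of a block is 32 bytes, so 5 bits are needed for the byte
offset within a block. Tag = 32 – (8 + 5) = 19 bits. Each block stores 19 + 1 (valid) +
1 (modified) = 21 bits of metadata, so the total size = 2^8 × 21 = 5376 bits.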
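
The metadata total can be expressed as one small function. The sketch below is illustrative: the name metadata_bits and its parameters are not from the original, the cache is assumed direct mapped, and all sizes are assumed to be powers of two.

```c
/* Total metadata bits for a direct mapped cache: every block stores its tag
   plus `status_bits` extra bits (here 1 valid bit + 1 modified bit). */
unsigned long metadata_bits(unsigned addr_bits, unsigned long cache_bytes,
                            unsigned long block_bytes, unsigned status_bits)
{
    unsigned long blocks = cache_bytes / block_bytes;
    unsigned index_bits = 0;
    unsigned offset_bits = 0;

    for (unsigned long x = blocks; x > 1; x >>= 1)
        index_bits++;              /* log2(number of blocks) */
    for (unsigned long x = block_bytes; x > 1; x >>= 1)
        offset_bits++;             /* log2(block size in bytes) */

    unsigned tag = addr_bits - index_bits - offset_bits;
    return blocks * (tag + status_bits);
}
```

For the cache in this question the call is metadata_bits(32, 8192, 32, 2).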

***Q.25 A RAM chip has a capacity of 1024 words of 8 bits each (1K×8). The number of
2×4 decoders with enable line needed to construct a 16K ×16 RAM from 1K×8 RAM is
a) 4 b) 5
c) 6 d) 7 [GATE-2013]

ANS: (b)

RAM chip size = 1K × 8 (1024 words of 8 bits each)
RAM to construct = 16K × 16
Number of chips required = (16K × 16)/(1K × 8) = 16 × 2 (16 rows of chips, with 2 chips
side by side in each row)
So to select one row out of the 16 rows of chips, we need a 4 × 16 decoder.
The available decoder is a 2 × 4 decoder with enable, so the 4 × 16 decoder is built from
four 2 × 4 decoders driving the 16 select lines plus one 2 × 4 decoder driving their
enable inputs.
So we need five 2 × 4 decoders in total to construct the 4 × 16 decoder.
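
The decoder count follows from building a tree in which each 2 × 4 decoder with enable drives four lines of the next level. A minimal sketch of that count, assuming the number of select lines is a power of 4 (the function name decoders_2x4_needed is illustrative):

```c
/* Number of 2 x 4 decoders (with enable) needed to build a decoder with
   `outputs` select lines as a tree. Intended for output counts that are
   powers of 4, such as 4, 16 or 64. */
unsigned decoders_2x4_needed(unsigned outputs)
{
    unsigned count = 0;

    while (outputs > 1) {
        outputs = (outputs + 3) / 4;   /* decoders in this level, 4 lines each */
        count += outputs;
    }
    return count;
}
```

For the 16 select lines needed here, the levels contain 4 decoders and then 1 decoder.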

***Q.27 A 4-way set-associative cache memory unit with a capacity of 16 KB is built using
a block size of 8 words. The word length is 32 bits. The size of the physical address space is 4
GB. The number of bits for the TAG field is _____ [GATE-2014-2]

ANS: (20)
Physical address size = 32 bits (4 GB = 2^32 bytes)
Cache size = 16 Kbyte = 2^14 bytes
Block size = 8 words = 8 × 4 bytes = 32 bytes = 2^5 bytes (where each word = 4 bytes),
therefore byte offset = 5 bits
No. of blocks = 2^14/2^5 = 2^9
No. of sets = 2^9/4 = 2^7, therefore set index = 7 bits
TAG = 32 – (7 + 5) = 20 bits.

Q.28 In designing a computer’s cache system, the cache block (or cache line) size is an
important parameter. Which one of the following statements is correct in this context?
a) A smaller block size implies better spatial locality
b) A smaller block size implies a smaller cache tag and hence lower cache tag overhead
c) A smaller block size implies a larger cache tag and hence lower cache hit time
d) A smaller block size incurs a lower cache miss penalty [GATE-2014-2]

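ANS: (d)
The miss penalty is the time taken to bring a block in from the next level of the memory
hierarchy on a miss. A smaller block means fewer bytes to transfer per miss, so the miss
penalty for the cache is lower. A smaller block size exploits spatial locality less and needs
more tag entries for the same capacity, so options (a), (b) and (c) do not hold.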

Q.29 If the associativity of a processor cache is doubled while keeping the capacity and block
size unchanged, which one of the following is guaranteed to be NOT affected?
a) Width of tag comparator
b) Width of set index decoder
c) Width of way selection multiplexor
d) Width of processor to main memory data bus [GATE-2014-2]

ANS: (d)

When associativity is doubled with capacity and block size unchanged, the number of sets is
halved. The set index therefore loses one bit, which affects the width of the set index
decoder, and the tag gains one bit, which affects the width of the tag comparator. The number
of ways doubles, so the width of the way selection multiplexer is also affected. The width of
the processor to main memory data bus does not depend on the cache organization and is
guaranteed to be NOT affected.

***Q.30 The memory access time is 1 nanosecond for a read operation with a hit in cache, 5
nanoseconds for a read operation with a miss in cache, 2 nanoseconds for a write operation
with a hit in cache and 10 nanoseconds for a write operation with a miss in cache. Execution
of a sequence of instructions involves 100 instruction fetch operations, 60 memory operand
read operations and 40 memory operand write operations. The cache hit-ratio is 0.9. The
average memory access time (in nanoseconds) in executing the sequence of instructions is
__________. [GATE-2014-3]

ANS: (1.68 ns)

Total operations = 100 instruction fetch operations + 60 memory operand read operations
+ 40 memory operand write operations
= 200 operations
Time taken for fetching 100 instructions (equivalent to reads) = 90 × 1 ns + 10 × 5 ns
= 140 ns
Memory operand read operations = 90% (60) × 1 ns + 10% (60) × 5 ns
= 54 ns + 30 ns = 84 ns
Memory operand write operations = 90% (40) × 2 ns + 10% (40) × 10 ns
= 72 ns + 40 ns = 112 ns
Total time taken for the 200 operations = 140 + 84 + 112 = 336 ns
∴ Average memory access time = 336 ns / 200 = 1.68 ns
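
The weighted average above can be written as a single function. This is a sketch under the timings given in the question; the function name average_access_ns is illustrative, and instruction fetches are treated as reads, as in the working above.

```c
/* Average memory access time in nanoseconds for a mix of instruction
   fetches, operand reads and operand writes. Timings are taken from the
   question: read hit 1 ns, read miss 5 ns, write hit 2 ns, write miss 10 ns. */
double average_access_ns(unsigned fetches, unsigned reads, unsigned writes,
                         double hit_ratio)
{
    const double read_hit = 1.0, read_miss = 5.0;
    const double write_hit = 2.0, write_miss = 10.0;

    double per_read = hit_ratio * read_hit + (1.0 - hit_ratio) * read_miss;
    double per_write = hit_ratio * write_hit + (1.0 - hit_ratio) * write_miss;

    double total = (double)(fetches + reads) * per_read   /* fetches count as reads */
                 + (double)writes * per_write;

    return total / (double)(fetches + reads + writes);
}
```

For this question the call is average_access_ns(100, 60, 40, 0.9), which should agree with the hand calculation up to floating point rounding.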
