
MEMORY ORGANIZATION

Caches:

• Tavg

Q1-Q2. There is an instruction QUEUE in the CPU that can hold pre-fetched
instructions. With the INSTRUCTION QUEUE and the GENERAL PURPOSE REGISTERS, the
CPU now makes 25% fewer memory accesses; the internal bus in the CPU circuitry is
self-sufficient for these accesses, so no memory access signal reaches the cache or
main memory. Earlier, the cache hit ratio was 0.8, the cache access time = 100 ns,
and the main memory access time = 500 ns.

Q1. What is the speedup achieved in the average memory access time of the memory
system due to having the INSTRUCTION QUEUE: ___________________________ 1 (no
change in the average memory access time of the memory system; only the number of
accesses changes)

Q2. If earlier a program took 1000 ns to execute (counting only memory access time;
CPU time is negligible), what is the new execution time: _____________ ns 750

Solution:

Let the earlier execution involve x memory accesses, taking 1000 ns in total.

The per-access average time is unchanged, but the number of accesses drops to 0.75x.

New execution time = 0.75 × 1000 ns = 750 ns

Q3-Q5. The empirically analysed probability of accessing main memory block number x
while program A is under execution, for a particular instance of the "VIRTUAL
MEMORY to MAIN MEMORY" mapping, is given below (the per-block probability table did
not survive in this copy):
Q3. Assuming the same "VIRTUAL MEMORY to MAIN MEMORY" mapping and the same
execution conditions for program A, find the hit ratio of the CACHE at this
instance, when the cache (which holds 8 blocks) contains the blocks [1, 2, 3, 8, 5, 6, 7, 9].

What is the hit rate of the cache at the given instance for the given data?

A. 1
B. 0.4
C. 0.8
D. 0.6

Hit rate = (0.1 + 0.2 + 0.3) = 0.6, i.e., option D


Q4.

Suppose that, as per the block replacement policy, only 3 blocks can be allocated in
the cache to a process. After this snapshot of the cache, what is the minimum number
of block misses required to make the cache hit rate greater than 0.6?

A. 4
B. 1
C. 2
D. 3

Q5. Consider a cache block size of 4 words and a memory word size of 16 bytes in a
byte-addressable main memory. Then the number of words in program A is:
A. 4
B. 256
C. 64
D. 16

Q6.
Consider a simultaneous access cache memory system with
Cache access time = 100 ns
Main memory access time = 1000 ns
Hit ratio = h

Which of the following is a correct graph for the variation of Tavg vs h?
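(The graph options are not reproduced in the source. For a simultaneous access
system, Tavg = h × 100 + (1 − h) × 1000 = (1000 − 900h) ns: a straight line falling
from 1000 ns at h = 0 to 100 ns at h = 1.)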


Q7.
Consider a cache with access time = x ns
Main memory with access time = y ns

If the variation of Tavg with the cache hit ratio h is the same in the simultaneous
access arrangement and the hierarchical access arrangement, then which of the
following conditions is necessary:

A. y<x
B.
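Working (the remaining options are not reproduced in the source): with simultaneous
access, Tavg = h·x + (1 − h)·y = y − (y − x)·h; with hierarchical access,
Tavg = h·x + (1 − h)·(x + y) = (x + y) − y·h. Their difference is x·(h − 1), which
vanishes for every h only when x = 0, so the necessary condition is that the cache
access time x be negligible (zero).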

• Multilevel

• Mapping:

Q1. If the probability of accessing any block in the memory system is the same and
independent of the current cache contents, with P(access a block) = 1/1024, then
what is the maximum possible number of tag bits under any kind of cache mapping:

A. 1024
B. 12
C. 10
D. 20
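Working (no solution is given in the source): equal access probability 1/1024
implies 1024 memory blocks, i.e., a 10-bit block number. The tag is largest under
fully associative mapping, where the entire block number serves as the tag: 10 bits,
option C.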
Q2. If the probability of accessing any address in the memory system is the same and
independent of the current cache contents, with P(access an address) = 1/1024, the
cache block size is 4 words, and the memory word size is 4 bytes, then what is the
maximum possible number of tag bits under any kind of cache mapping, where the cache
can hold 4 blocks:

E. 6
F. 64
G. 10
H. 4
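Working (no solution is given in the source): 1024 equiprobable addresses imply a
10-bit address. A block is 4 words × 4 bytes = 16 bytes, giving a 4-bit offset. The
tag is largest under fully associative mapping, where there is no index field:
10 − 4 = 6 bits, option E.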

Q3. Assume all addresses in the memory system are equally probable to be accessed at
any time during program execution, and each access request is independent of the
others. In direct mapping, only 1 tag bit is saved in the tag directory for each
cache block. Then the hit ratio …

1. [NIELIT]

In a particular system it is observed that cache performance improves as a result of
increasing the block size of the cache. The primary reason behind this is:

A. Programs exhibit temporal locality
B. Programs have a small working set
C. Read operations are required more frequently than write operations
D. Programs exhibit spatial locality
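Answer (not given in the source): D. Enlarging the block brings in neighbouring
words on every miss, which pays off precisely because programs exhibit spatial
locality.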

Q1:[nat]
In a two-level memory hierarchy, the access time of cache memory is 12 ns and the
access time of main memory is 1500 ns. The hit ratio is 0.98. The average access
time of the two-level memory system is ______ ns.

Solution:

If nothing is mentioned, assume simultaneous access in the memory hierarchy.

Avg access time = (cache hit ratio × cache access time) + (cache miss rate × hit
rate in 2nd level × 2nd-level access time)

= 0.98 × 12 + 0.02 × 1 × 1500

= 41.76 ns
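A minimal C sketch (illustrative, not from the source) of the two two-level formulas
this document keeps reusing, checked against the question above:

#include <stdio.h>

/* Simultaneous access: on a miss, only the next level's time is paid. */
double amat_simultaneous(double h, double t_cache, double t_mem) {
    return h * t_cache + (1.0 - h) * t_mem;
}

/* Hierarchical access: the cache is always probed first, so a miss pays both. */
double amat_hierarchical(double h, double t_cache, double t_mem) {
    return t_cache + (1.0 - h) * t_mem;
}

int main(void) {
    /* h = 0.98, cache = 12 ns, main memory = 1500 ns, as above */
    printf("simultaneous: %.2f ns\n", amat_simultaneous(0.98, 12, 1500)); /* 41.76 */
    printf("hierarchical: %.2f ns\n", amat_hierarchical(0.98, 12, 1500)); /* 42.00 */
    return 0;
}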

Q2: Which of the following are generally true for an instruction cache and a data cache? [msq]

A. Spatial locality of instruction cache references > spatial locality of data cache references.
B. Temporal locality of instruction cache references < spatial locality of instruction cache references.
C. Data cache hit rate < instruction cache hit rate.
D. Data cache miss rate > data cache hit rate.

Answer: ABCD

• Spatial locality of instruction cache references > spatial locality of data cache
references. In this workload the data cache has essentially no spatial locality,
since the data are accessed randomly, while instruction accesses are highly
sequential: there are very few branches, and even the branches contribute to spatial
locality since they branch no more than 5 instructions back.
• Temporal locality of instruction cache references < spatial locality of
instruction cache references. I-cache temporal locality occurs only when there is a
branch; since branches are rare, it is lower than the spatial locality.
• Data cache hit rate < instruction cache hit rate.
• Data cache miss rate > data cache hit rate. The data cache has a terrible hit
rate: loads are almost always misses because they are random. Stores might always be
hits, but if the loads evict data from other loads, stores might miss too.

Q3:
Consider a memory system with a single L1 cache, where the average access time is
100 ns without L1 and 30 ns with L1. L1 has an access time of 10 ns. What is the hit
ratio of L1 required to have an average access time of 30 ns?
A. 90%
B. 70%
C. 80%
D. 81.8%

Solution:
C. 80%. With hierarchical access, 10 + (1 − h) × 100 = 30, so 1 − h = 0.2 and h = 0.8.

Q:
Consider a memory system with I-cache, D-cache, L2 cache and main memory access
times of 5 ns, 5 ns, 10 ns and 100 ns respectively, and respective hit ratios of
0.85, 0.80, 0.9 and 1. On executing a program, 40% of accesses go to the I-cache
and 60% to the D-cache. What is the average memory access time?

a. 6.6 ns
b. 14.7 ns
c. 8.6 ns
d. 12.2 ns

Solution:
c. 8.6 ns. I-side: 5 + 0.15 × (10 + 0.1 × 100) = 8 ns; D-side: 5 + 0.20 × (10 + 0.1 × 100) = 9 ns;
AMAT = 0.4 × 8 + 0.6 × 9 = 8.6 ns.

Q:
Suppose a processor supports software configuration of its strategy for handling
cache write hits/misses.

1. What strategy should be taken if:

(a) The processor mainly works on applications that require massive memory write
operations and data access.
(b) The processor mainly works on applications that require massive memory write
operations and data access but do not allow data inconsistency.

Answer 1. (a) Write back, to reduce main memory accesses.
(b) Write through, to keep data consistent.

Q:
Which of the following is true? [msq]
I: Write back updating of the cache leads to RAW hazards
II: The write through mechanism reduces the average memory access time
III: Write back and write through mechanisms reduce conflict misses
IV: Write back and write through mechanisms maintain cache coherence

a. I, II, and IV
b. I and IV
c. I, II, and III only
d. Only IV is true

Solution:
b. I and IV

Q:

The following program fragment executes on a single-core machine:

for (int i = 0; i < n; i++)
    sum = sum + A[i];

Which of the following cache updating techniques will result in the least access
time for the variables i and sum respectively?
a. write back for i and write through for sum
b. write through for i and write back for sum
c. write through for both i and sum
d. write back for both i and sum

Solution:
d. write back for both i and sum. Both variables are written on every loop
iteration, and write back keeps those repeated writes in the cache instead of
sending each one to main memory.
Q:
Which of the following statements are true?
a. The main objective for using cache memory is to increase the effective speed of the
memory system.
b. The main objective for using virtual memory is to increase the effective capacity of the
memory system.
c. The size of main memory is larger as compared to cache memory.
d. Main memory is faster as compared to cache memory.
Correct answers are (a), (b) and (c).
(a) is true since the average memory access time of a memory system using cache is
close to the access time of the cache, and the cache is faster than main memory.
(b) is true since in virtual memory the size of a program can be as large as the size of
the secondary memory.
(c) is true since cache memory is more expensive and hence its capacity is smaller
than main memory.
(d) is clearly false.

Q:

Which of the following is true for a memory hierarchy?

a. It tries to bridge the processor-memory speed gap.
b. The memory level closest to the processor has the highest speed.
c. The memory level farthest from the processor has the largest capacity.
d. It is based on the principle of locality of reference.
All the answers are true.
This follows from the basic definition of memory hierarchy design. The fastest and
smallest memory module is closest to the processor. The overall access time of the
memory system is close to the access time of the fastest memory level. This is
achieved by exploiting locality of reference.
Q. Which of the following statements are false?
a. Temporal locality arises because of loops in a program.
b. Spatial locality arises because of loops in a program.
c. Temporal locality arises because of sequential instruction execution.
d. Spatial locality arises because of sequential instruction execution.
Correct answers are (b) and (c).
Temporal locality says that an accessed word will be accessed again in the near
future, and this is due to the presence of loops.
Spatial locality says that if a word is accessed, then words in its neighborhood
will also be accessed in the near future. This happens due to sequential program
execution.

Q:

Hardware cache memories exploit spatial locality of reference

A. by remembering which pieces of data have been accessed recently


B. when data items are re-accessed frequently
C. by remembering which cache blocks (lines) have been written to
D. only if cache block (line) size is greater than 1 byte

Solution:
Answer is D.

Spatial locality refers to data near a recently accessed item being accessed in the
near future. To exploit it, when a datum is accessed, a block of data that includes
the accessed part (called a cache line) is fetched and placed in the cache. If the
cache line size were 1 byte, that would virtually rule out any chance of exploiting
spatial locality in the cache.

A is not correct. There is no "remembering" mechanism in a cache, though something
like it might be used for cache line replacement, as in LRU.

B is for temporal locality.

C relates to write-back caches, which track which lines have been written.

Q:

Consider a cache that works at 5× the speed of main memory, with a hit rate of 75%.
What is the speedup of memory performance if such a cache is used?

Answer: Let the cache access time be t; then the main memory access time is 5t. The
system's average access time is
Ta = 0.75t + 0.25 × 5t = 2.0t
so the performance speedup is 5t / 2.0t = 2.5.

There is a computer that has 64 MB of byte-addressable main memory. Instructions and
data are stored in separate caches, each of which has eight 64 B cache lines. The
data cache uses direct mapping. Now consider the two programs below.
Program A:
int a[64][64];
int sum_array1()
{
int i,j,sum=0;
for(i=0;i<64;i++)
for(j=0;j<64;j++)
sum += a[i][j];
return sum;
}
Program B:
int a[64][64];
int sum_array1()
{
int i,j,sum=0;
for(j=0;j<64;j++)
for(i=0;i<64;i++)
sum += a[i][j];
return sum;
}
Suppose int data is represented in 32-bit 2's complement and i, j, sum are stored
in specific registers. The array is stored in row-major order with start address
320 (decimal) in main memory. Answer the following questions.

1. What are the cache line numbers of the main memory blocks that contain a[0][31]
and a[2][2] respectively? (Cache line numbers start from 0.)
2. What are the data cache hit rates of programs A and B?

Answer 1. a. Address of a[0][31] = 320 + 31 × 4 = 444; memory block 444 / 64 = 6;
cache line = 6 mod 8 = 6.
b. Address of a[2][2] = 320 + (64 × 2 + 2) × 4 = 2376; memory block 2376 / 64 = 37;
cache line = 37 mod 8 = 5.
2. The size of array a is 64 × 64 × 4 = 2^14 B. Since a block is 64 B, it takes 2^8
main memory blocks to store the array. Under row-major traversal (program A) each
block is fetched once and its remaining 15 ints then hit, so 2^8 misses appear in
2^12 accesses: the hit rate is (2^12 − 2^8) / 2^12 = 93.75%. For program B, the
column-wise traversal evicts every block before it is reused, so the hit rate is 0.
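A small C sketch (an illustrative check, not part of the original question) that
simulates this direct-mapped data cache (8 lines × 64 B) over the access patterns of
programs A and B and reports the hit rates:

#include <stdio.h>

#define LINES 8
#define BLOCK 64
#define BASE  320   /* start address of a[64][64], as given */

static long tag[LINES];
static long hits, accesses;

static void access_int(long addr) {
    long block = addr / BLOCK;
    int  line  = (int)(block % LINES);
    accesses++;
    if (tag[line] == block) hits++;   /* block already resident */
    else tag[line] = block;           /* miss: fetch block into its line */
}

static double run(int row_major) {
    hits = accesses = 0;
    for (int i = 0; i < LINES; i++) tag[i] = -1;
    for (int u = 0; u < 64; u++)
        for (int v = 0; v < 64; v++) {
            int r = row_major ? u : v, c = row_major ? v : u;
            access_int(BASE + (long)(r * 64 + c) * 4);   /* a[r][c], 4-byte int */
        }
    return (double)hits / accesses;
}

int main(void) {
    printf("program A hit rate: %.2f%%\n", 100 * run(1));  /* 93.75% */
    printf("program B hit rate: %.2f%%\n", 100 * run(0));  /* 0.00%  */
    return 0;
}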

Q. Assume that a read request takes 50 nsec on a cache miss and 5 nsec on
a cache hit. While running a program, it is observed that 80% of the processor’s
read requests result in a cache hit. The average read access time is …………..
nsec.
Correct answer is 14.
Average read time = 0.80 x 5 + (1 – 0.80) x 50 = 14 nsec
Q. The memory access time is 1 nsec for a read operation with a hit in cache, 5
nsec for a read operation with a miss in cache, 2 nsec for a write operation with
a hit in cache, and 10 nsec for a write operation with a miss in cache. The
execution of a sequence of instructions involves 100 instruction fetch operations,
60 memory operand read operations, and 40 memory operand write operations.
The cache hit ratio is 0.9. The average memory access time (in nanoseconds) in
executing the sequence of instructions is:
a. 1.26
b. 1.68
c. 2.46
d. 4.52
Correct answer is (b).
Total number of read = 100 + 60 = 160
Total number of write = 40.
So, fraction of reads = 160 / (160 + 40) = 0.8
And, fraction of writes = 40 / (160 + 40) = 0.2
Average access time = 0.8 (0.9 x 1 + 0.1 x 5) + 0.2 (0.9 x 2 + 0.1 x 10) = 1.68

Q. Consider a two-level memory hierarchy with separate instruction and data caches
in level 1, and main memory in level 2. The clock cycle time is 1 ns. The miss
penalty is 20 clock cycles for both read and write. 2% of instructions are not found
in the I-cache, and 10% of data references are not found in the D-cache. 25% of the
total memory accesses are for data, and cache access time (including hit detection)
is 1 clock cycle. The average access time of the memory hierarchy will be ………….
nanoseconds.
Average access time = 0.75 (0.98 x 1 + 0.02 x 20) + 0.25 (0.90 x 1 + 0.10 x 20) = 1.76
ns

Direct Mapping

Q:

Consider a direct-mapped cache with 64 blocks and a block size of 16 bytes. Byte
address 1200 will map to block number ………… of the cache.

Correct answer is 11.
We first find the memory block number that byte address 1200 belongs to. Since the
size of a block is 16 bytes:
Byte addresses 0 to 15: block 0
Byte addresses 16 to 31: block 1
Byte addresses 32 to 47: block 2, and so on.
Byte address 1200 will belong to block number: floor(1200/16) = 75. For direct
mapped cache,
Cache block no. = (Memory block no.) MOD (No. of cache blocks) = 75 MOD 64 =
11.

Q. A cache memory system with a capacity of N words and a block size of B words is
to be designed. If it is designed as a direct-mapped cache, the length of the TAG
field is 10 bits. If it is designed as a 16-way set associative cache, the length of
the TAG field will be ………… bits.

Correct answer is 14.
For a 16-way set associative cache, 4 more bits are required in the TAG compared to
direct mapping, since 2^4 = 16: the index shrinks by 4 bits and those bits move into
the tag.

Q. Which of the following statements is true:


a. The implementation of direct mapping technique for cache requires expensive
hardware to carry out division.
b. The set associative mapping requires associative memory
for implementation.
c. A main memory block can be placed in any of the sets in set associative
mapping.
d. None of the above.
Correct answer is (b).
Direct mapping is the easiest to implement. Both fully associative and set
associative mappings require an associative memory. (c) is false: in set associative
mapping, a main memory block maps to exactly one set, though it can be placed in any
of the blocks within that set.

Q. For two direct-mapped cache designs, (a) with a 32-bit address and (b) with a
16-bit address, the following bits of the address are used to access the cache:

       Tag      Index    Offset
  a    31-10    9-5      4-0
  b    15-10    9-4      3-0

1. What is the cache line size (in words)?
2. How many entries does the cache have?
3. What is the ratio of the total bits required for such a cache implementation to
the data storage bits?

Answer: 1. a. The offset is bits 4-0 (5 bits), so the line size is 2^5 bytes = 32 bytes = 8 words.
b. The offset is bits 3-0 (4 bits), so the line size is 2^4 bytes = 16 bytes = 4 words.
2. a. The index is bits 9-5 (5 bits); for a direct-mapped cache this gives 2^5 = 32 entries.
b. The index is bits 9-4 (6 bits); for a direct-mapped cache this gives 2^6 = 64 entries.
3. a. Total bits = 32 entries × (1 valid bit + 22 tag bits + 32 × 8 data bits) = 32 × 279;
data bits = 32 entries × 32 × 8 = 32 × 256; ratio = 279/256 = 1.09.
b. Total bits = 64 entries × (1 valid bit + 6 tag bits + 16 × 8 data bits) = 64 × 135;
data bits = 64 entries × 16 × 8 = 64 × 128; ratio = 135/128 = 1.05.

Q:

Suppose there is a byte-addressable computer with a main memory of size 256 MB and a
cache with 8 lines. Each cache line is 64 B. Assume direct mapping in the memory
hierarchy. Calculate the following items.
1. How many bits do we need at least for the main memory address? The total cache
size (in bits)?
2. The line number corresponding to main memory address 2333 (decimal).

Answer 1. All of the following answers are correct (analysing one scenario is enough).

(a) Assuming the main memory address space is exactly 256 MB, we have a 28-bit main
memory address; the tag is 28 − 3 − 6 = 19 bits, so the total cache size is
8 × (1 + 19 + 512) = 4256 bits.
(b) Assuming a 32-bit main memory address, the total cache size is
8 × (1 + 23 + 512) = 4288 bits.
(c) Assuming a 16-bit main memory address, the total cache size is
8 × (1 + 7 + 512) = 4160 bits.
2. Both of the following answers are correct:
(a) Counting blocks from 1: ⌈2333 / 64⌉ = 37, line number = 37 mod 8 = 5.
(b) Counting blocks from 0: ⌊2333 / 64⌋ = 36, line number = 36 mod 8 = 4.

Q2: A cache splits a 32-bit address as follows:

bits 0 - 3 = offset
bits 4 - 14 = index
bits 15 - 31 = tag

What is the size of the cache? How much space is required to store the tags for
the cache?

Solution:

Size of a cache line: 2^(offset bits) = 2^4 = 16 bytes
Number of cache lines: 2^(index bits) = 2^11 = 2048
Total cache size: 16 × 2048 = 32 KB
Total tag size: 17 × 2048 = 34 Kbits
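A small C sketch (illustrative, not from the source) that splits a 32-bit address
into these tag/index/offset fields:

#include <stdio.h>
#include <stdint.h>

/* Field widths from the question: 4 offset bits, 11 index bits, 17 tag bits. */
#define OFFSET_BITS 4
#define INDEX_BITS  11

int main(void) {
    uint32_t addr   = 0xDEADBEEF;                     /* example address */
    uint32_t offset = addr & ((1u << OFFSET_BITS) - 1);
    uint32_t index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    printf("tag=0x%X index=0x%X offset=0x%X\n", tag, index, offset);
    return 0;
}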


Set Associative

Q:
A computer uses a 2-way set associative cache of size 128 KB with a block (line)
size of 32 B. The cache accepts 32-bit addresses of the form b31 b30 … b2 b1 b0,
where b31 is the most significant address bit and b0 the least significant. Which
bits are used by the cache controller for indexing into the cache directory?

Solution:
block size = 32 B → 5 offset bits

cache size = 128 KB

number of cache lines = cache size / block size = 128 KB / 32 B = 4096

number of cache sets = number of cache lines / associativity = 4096 / 2 = 2048 → 11 set bits

So the 32-bit address divides as:
tag = b31-b16 (16 bits), set = b15-b5 (11 bits), offset = b4-b0 (5 bits)

A method of addressing a cache directory, comprising:

computing a parity of an address tag field within a presented address;

computing an index including:

the computed parity within a bit of the index; and

an index field within the presented address within remaining bits of the index.

The index field of an address maps to low order cache directory address lines.
The remaining cache directory address line, the highest order line, is indexed
by the parity of the address tag for the cache entry to be stored to or
retrieved from the corresponding cache directory entry. Thus, even parity
address tags are stored in cache directory locations with zero in the most
significant index/address bit, while odd parity address tags are stored in cache
directory locations with one in the most significant index/address bit. The
opposite arrangement (msb 1=even parity; msb 0=odd parity) may also be
employed, as may configurations in which parity supplies the least significant bit
rather than the most significant bit.
Now, the MSB of the set (s) field is used as a parity bit, and the remaining set
field bits are used for indexing along with the parity bit.

So clearly b15 b14 b13 b12 b11 b10 b9 b8 b7 b6 b5 are used for indexing.

Q:
Which of the following statements is/are FALSE?
I: In set associative mapping, if the set size is reduced to 1, it reduces to fully
associative mapping.
II: In set associative mapping, if only one set is present, it reduces to direct
mapping.

a. I only
b. II only
c. Both I and II
d. None

Solution:
c. Both I and II. It is the other way around: a set size of 1 block gives direct
mapping, and a single set gives fully associative mapping.

Q:

A byte-addressable computer uses 32-bit addresses to access main memory. Suppose the
data cache has a size of 4 KiB and is 8-way set-associative, with a block size of 16 B.
1. How many bits are in the Tag, Index and Offset fields?
2. Calculate the total cache size (in bits) if it uses write back and LRU
replacement.

Answer 1. The number of blocks is 4 KiB / 16 B = 2^8. Therefore there are
2^8 / 8 = 2^5 sets, needing 5 index bits. Since the computer is byte-addressable,
the offset is log2(16 B) = 4 bits. The tag is 32 − 5 − 4 = 23 bits.
2. (23 tag bits + 16 × 8 data bits + 1 valid bit + 1 dirty bit + 3 LRU bits) × 2^8
= 39936 bits.

Q:

Several parameters impact the overall size of the page table. Listed below are the
key page table parameters.

Table 2 (Q6): Virtual Address Size: 32 bits | Page Size: 16 KiB | Page Table Entry Size: 4 bytes

1. Given the parameters shown above, calculate the total page table size for a
system running 5 applications that each utilize half of the available memory (half
of the 32-bit virtual address space for each running application).
2. A cache designer wants to increase the size of a virtually indexed, physically
tagged cache. Given the page size shown above, is it possible to make a 64 KiB
direct-mapped cache, assuming 2 words per block?
3. How could the designer increase the data size of the cache?

Answer 1. The virtual page number is 32 − log2(16384) = 32 − 14 = 18 bits, so a full
page table has 2^18 entries. All five page tables would require
5 × (2^18 × 4)/2 bytes = 2621440 B.
2. The page offset consists of address bits 13 down to 0, so the LSB of the physical
tag is address bit 14. A 64 KiB direct-mapped cache with 2 words per block has
8-byte blocks and thus 64 KiB / 8 B = 8192 blocks; its index field would span
address bits 16 down to 3 (13 index bits, 1 word-offset bit, 2 byte-offset bits).
The index would therefore reach past bit 13 into the physical tag, so such a cache
is not possible with this page size.
3. The designer would instead need to make the cache 4-way set associative to
increase its size to 64 KiB.

Q:
A computer system has an 8K-word cache organized in a block-set-associative manner,
with 8 blocks per set and 32 words per block. The number of bits in the SET and WORD
fields of the main memory address format is:
a. 6,4
b. 4,5
c. 5,5
d. 8,5
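Solution (not given in the source): number of cache blocks = 8192 words / 32 words
per block = 256; number of sets = 256 / 8 = 32 = 2^5, so SET = 5 bits; 32 words per
block gives WORD = 5 bits. Answer: c. 5, 5.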

Q:
The main memory of a computer has N blocks while the cache has N/m blocks. If the
cache uses a set associative mapping scheme with 4 blocks per set, then block k of
main memory maps to the set:

a. (k mod (N/m) ) of the cache


b. (k mod (N/4m) ) of the cache
c. (4k mod 4m) of the cache
d. (k mod 4m) of the cache
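Solution (not given in the source): the cache holds N/m blocks, and with 4 blocks
per set it has N/(4m) sets; block k therefore maps to set k mod (N/4m). Answer: b.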
Q:
Consider a 256 KB 4-way set associative cache with a block size of 64 bytes. Main
memory is 2 GB. The number of bits used for tag, set and word respectively will be:

(tag, set, word) in that order

a) 15, 10, 6
b) 9, 16, 6
c) 10, 15, 6
d) 8, 17, 6

Answer: a) 15, 10, 6. (2 GB memory → 31-bit address; lines = 256 KB / 64 B = 4096;
sets = 4096 / 4 = 1024 → 10 set bits; 6 offset bits; tag = 31 − 10 − 6 = 15.)

Q:
Consider a system with the following sequence of memory block accesses: 16, 0, 16,
0, 4, 0, and 4. The cache has 16 blocks and main memory has 64 blocks. In this case,
which of the mapping techniques below results in the least number of misses together
with the least hardware complexity (i.e., the fewest tag comparators)?

a. Direct mapping
b. 2-way set associative
c. 4-way set associative
d. Fully associative

Solution:
2-way set associative
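Reasoning (not worked in the source): with direct mapping, blocks 16 and 0 both map
to line 0 (16 mod 16 = 0) and keep evicting each other, giving 5 misses in the 7
accesses. With 2-way set associative mapping there are 8 sets (2 comparators): 16
and 0 both map to set 0 but fit in its two ways, and 4 maps to set 4, so only the 3
compulsory misses occur. 4-way and fully associative mapping also give 3 misses but
need 4 and 16 comparators respectively, so 2-way is the least complex option that
achieves the minimum misses.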

Q: Consider a system with a block size of p words. What is the range of words
present in the kth block of main memory?

a. k*p to (k+1)*p - 1
b. k*p^2 to (k+1)*p^2
c. k*2^p to (k+1)*2^p - 1
d. k*2^p to (k)*2^(p+1) - 1
Solution:
k*p to (k+1)*p - 1

Q:
A computer has a 512 KB, 8-way set associative data cache with a block size of 32 B.
The processor sends 32-bit addresses to the cache controller. Each cache tag
directory entry contains, in addition to the address tag, 2 valid bits, 1 modified
bit and 1 replacement bit.

A. What will be the number of tag bits?

a. 11
b. 16
c. 21
d. 12

Solution:
b. 16 (lines = 512 KB / 32 B = 2^14; sets = 2^14 / 8 = 2^11 → 11 set bits; offset =
5 bits; tag = 32 − 11 − 5 = 16)

B. What will be the size of the cache tag directory?
a. 40 KB
b. 320 KB
c. 192 KB
d. 32 KB

Solution:
a. 40 KB (each entry is 16 + 2 + 1 + 1 = 20 bits; 2^14 entries × 20 bits = 320 Kbits = 40 KB)
Q:
Consider a 2-way set associative cache with 4 blocks. The memory blocks are
requested in the order:

4, 6, 3, 8, 5, 6, 0, 15, 6, 17, 20, 15, 0, 8

If LRU is used for block replacement, then memory block 17 will be in cache block
____ and set number ____ (assuming the first block starts from 0).

Answer: cache block number 3 and set number 1. (There are two sets, 0 and 1:
even-numbered memory blocks map to set 0, odd-numbered to set 1. In set 1 the
accesses are 3, 5, 15, 17, 15, so 15 evicts 3 from cache block 2 and 17 evicts 5
from cache block 3.)

Q. A computer system uses 32-bit memory addresses and it has a main memory consisting
of 1G bytes. It has a 4K-byte cache organized in the block-set-associative manner, with 4
blocks per set and 64 bytes per block.
(a) Calculate the number of bits in each of the Tag, Set, and Word fields of the
memory address.
(b) Assume that the cache is initially empty. Suppose that the processor fetches 1088
words of four bytes each from successive word locations starting at location 0. It then
repeats this fetch sequence nine more times. If the cache is 10 times faster than the
memory, estimate the improvement factor resulting from the use of the cache.
Assume that the LRU algorithm is used for block replacement.

Solution:

Here number of words that are in 1 block = Block size / Word size

= 64 B / 4 B = 16
Number of cache lines = Cache size / Block size

= 4 KB / 64 B

= 64

Number of sets hence = 64 / 4 = 16 sets

Also, the number of words that can be accommodated in the cache until it is full
= 4 KB / 4 B = 1024

Now we trace the access sequence to determine whether each word access results in a
hit or a miss:

a) For word number 0 - 15 , block number = 0 ..

Thus set number = block number % number of sets = 0 mod 16 = 0

b) For word number 16 - 31 , block number = 1

Thus set number = 1

Thus continuing in the same manner till word no 1023 (at this point cache is full) , let us
mention the block number contained by each set :

Set number 0 : 0 16 32 48

Set number 1 : 1 17 33 49

Set number 2 : 2 18 34 50

and so on till .....

Set number 15 : 15 31 47 63

Now when word numbers 1024-1039 come, i.e. block number 64 comes, the cache is full,
so a capacity miss occurs and LRU replacement comes into the picture. Block number 0
in set 0 is least recently used, so it is replaced. Set 0 blocks now appear as: 64 16 32 48

Similarly, set 1 will be updated when block number 65 is accessed: 65 17 33 49
Set 2 will be updated when block number 66 is accessed: 66 18 34 50
Set 3 will be updated when block number 67 is accessed: 67 19 35 51

In this way all words (0 - 1087) are covered, touching 68 blocks.

Thus the number of cache misses in the 1st iteration = 64 (compulsory) + 4 (capacity) = 68

In the subsequent nine iterations, where words (0 - 1087) are accessed again, sets
4 - 15 remain unaffected: the words belonging to those sets are already in the cache
and those sets never face replacement. So from now on we focus on the first 4 sets,
where replacement occurs on every reuse.

When word numbers 0-15 come, i.e. block number 0 comes, the cache is full, so a miss
occurs and LRU replaces block number 16 in set 0 (the least recently used). Set 0
now appears as: 64 0 32 48

Similarly set 1 is updated to: 65 1 33 49
set 2 is updated to: 66 2 34 50
set 3 is updated to: 67 3 35 51

Then when block number 16 comes, set 0 is updated to: 64 0 16 48
block number 17 comes, set 1 is updated to: 65 1 17 49
block number 18 comes, set 2 is updated to: 66 2 18 50
block number 19 comes, set 3 is updated to: 67 3 19 51

In this way further misses keep occurring in these sets: each of the 4 sets sees 5
misses per iteration (blocks 0, 16, 32, 48, 64 for set 0, and so on).

Hence the number of misses per iteration = 4 × 5 = 20

The number of misses is the same for each subsequent iteration.

Hence the total number of misses in 10 iterations = 68 (first iteration) + 9 × 20
(next 9 iterations) = 68 + 180 = 248 misses

Thus speedup obtained = Time without cache / Time with cache.

Counting block accesses (68 per iteration, 10 iterations) and taking main memory
access time = 10 × cache access time = 10t (as given):

Time without cache = 68 × 10 × 10t = 6800t

Time with cache = [68 × 11t]                      (first iteration: every access
                                                   misses, costing cache + memory time)
                + 9 × [20 × 11t + 48 × 1t]        (each later iteration: 20 misses
                                                   and 48 hits)
                = 748t + 9 × 268t = 3160t

Speedup = 6800t / 3160t = 2.15

Thus the performance improvement (speedup) = 2.15
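As a check, here is a short C sketch (not part of the original solution) that
simulates the 16-set, 4-way LRU cache over the 68 blocks touched per pass and
reproduces the miss count and speedup; the block-level timing model (11t per miss,
1t per hit, 10t per access without a cache) follows the solution above:

#include <stdio.h>

#define SETS 16
#define WAYS 4

int main(void) {
    int  tag[SETS][WAYS], age[SETS][WAYS];
    long clock = 0, misses = 0, accesses = 0;

    for (int s = 0; s < SETS; s++)
        for (int w = 0; w < WAYS; w++) { tag[s][w] = -1; age[s][w] = 0; }

    for (int pass = 0; pass < 10; pass++)
        for (int b = 0; b < 68; b++) {          /* words 0..1087 span blocks 0..67 */
            int s = b % SETS, hit = -1, victim = 0;
            accesses++;
            for (int w = 0; w < WAYS; w++)
                if (tag[s][w] == b) hit = w;
            if (hit < 0) {                      /* miss: fill LRU (or empty) way */
                misses++;
                for (int w = 1; w < WAYS; w++)
                    if (age[s][w] < age[s][victim]) victim = w;
                tag[s][victim] = b;
                hit = victim;
            }
            age[s][hit] = ++clock;              /* mark as most recently used */
        }

    long hits = accesses - misses;
    printf("misses = %ld\n", misses);           /* 248 */
    printf("speedup = %.2f\n",
           (10.0 * accesses) / (11.0 * misses + 1.0 * hits)); /* 2.15 */
    return 0;
}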

Types of Misses

Q:
Which one of the following is correct?
A. Compulsory misses can be reduced by increasing the total cache
size.
B. Capacity misses can be reduced by increasing the block size.
C. Conflict misses may be increased by increasing the value of
associativity.
D. Compulsory misses can be reduced by increasing the cache block
size.
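Answer (not given in the source): D. Compulsory (cold-start) misses occur on the
first access to a block; a larger block brings in more neighbouring words per miss,
so fewer first accesses miss. A larger total cache reduces capacity misses, not
compulsory ones, and higher associativity reduces (not increases) conflict misses.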
Q:
Which of the following statements is false for cache misses?
a. Compulsory misses can be reduced by decreasing the cache block size.
b. Capacity misses can be reduced by decreasing the total size of the cache.
c. Conflict misses can be reduced by decreasing the value of cache associativity.
d. Compulsory misses can be reduced by prefetching cache blocks.

Correct answers are (a), (b) and (c).
This follows from the definitions of compulsory, capacity and conflict misses: each
of (a)-(c) states the opposite of the actual remedy.

Q:
Match the following:
A. Capacity Misses      I. Fully Associative cache
B. Compulsory Misses    II. Set Associative cache
C. Conflict Misses      III. Direct Mapped cache
                        IV. Cold Start cache

a. A-III, B-IV, C-II
b. A-I, B-I, C-II and III
c. A-I, B-IV, C-II and III
d. A-I, B-II, C-III

Solution:
c. A-I, B-IV, C-II and III

Q. How can the cache miss rate be reduced?


a. By using larger block size
b. By using larger cache size
c. By reducing the cache associativity
d. None of the above
Correct answers are (a) and (b).
If the cache block size is increased, the number of cache misses is generally
reduced. The same holds for larger caches, which can store more blocks. However, if
the associativity is reduced, there is less choice in cache block placement, which
can increase the number of misses.

Q: Consider a physical memory of 1MB size and a direct mapped cache of 8KB size with
block size 32 bytes.Let there be a sequence of block accesses given by
2,216,100,256,2048,728,256,216
Find the number of compulsory, capacity, conflict miss.

Solution:
In one transfer, 32 bytes of data move between main memory and the cache. There are
1 MB / 8 KB = 128 main memory blocks that map to each cache block, so we need
log2(128) = 7 tag bits.

• The main memory address splits into three fields: tag : index : offset

For this configuration we have a 20-bit physical address: 7 tag bits, 8 index bits
(2^8 = 256 entries in the cache, each of size 32 bytes, totalling 8 KB) and 5 offset
bits.

Number of blocks in the cache = 8 KB / 32 B = 256

To find which cache entry a memory block maps to, we take the block number mod 256:

2    -> 2    Compulsory miss
216  -> 216  Compulsory miss
100  -> 100  Compulsory miss
256  -> 0    Compulsory miss
2048 -> 0    Compulsory miss
728  -> 216  Compulsory miss
256  -> 0    Conflict miss
216  -> 216  Conflict miss

Total: 6 compulsory misses, 2 conflict misses, 0 capacity misses.

Multilevel-cache:

Q. Suppose that in 1000 memory references there are 40 misses in the L1 cache and 10
misses in the L2 cache. If the miss penalty of L2 is 200 clock cycles, the hit time
of L1 is 1 clock cycle, and the hit time of L2 is 15 clock cycles, the average
memory access time will be ……………….. clock cycles.

L1 hit ratio = (1000 − 40) / 1000 = 0.96
Local L2 hit ratio = (40 − 10) / 40 = 0.75 (of the 40 references that miss in L1, 30 hit in L2)
Average access time = 0.96 × 1 + 0.04 × [0.75 × 15 + 0.25 × 200] = 0.96 + 0.04 × 61.25 = 3.41

Q:
In a two-level cache system, the access times of the L1 and L2 caches are 1 and 8
clock cycles respectively. The miss penalty from the L2 cache to main memory is 18
clock cycles. The miss rate of the L1 cache is twice that of L2. The average memory
access time of the cache system is 2 cycles. The miss rates of the L1 and L2 caches
respectively are:
a. 0.130 and 0.065
b. 0.056 and 0.111
c. 0.0892 and 0.1784
d. 0.1784 and 0.0892
Correct answer is (a).
Let the miss rate of the L2 cache be x, so the miss rate of the L1 cache is 2x. Then
AMAT = (1 − 2x) · 1 + 2x · [(1 − x) · 8 + x · 18] = 2 (given)
which simplifies to 20x² + 14x − 1 = 0. Solving, x ≈ 0.065, so the miss rates are
0.130 and 0.065.
Q:
Suppose that in 250 memory references there are 30 misses in the first-level cache
and 10 misses in the second-level cache. Assume the miss penalty from the L2 cache
to memory is 50 cycles, the hit time of L2 is 10 cycles, and the hit time of L1 is 5
cycles. If there are 1.25 memory references per instruction, the average stall
cycles per instruction is __________

Answer:

Memory stall cycles arise when the CPU has to wait for memory. Here we assume no
stall cycles on an L1 hit, so the CPU normally accesses L1 only. When something is
not found there, we charge the L1 miss penalty (not a separate L2 miss penalty,
because that is folded into the L1 miss penalty):

Miss penalty of L1 = (hit time in L2) + (local miss rate in L2) × (miss penalty of L2)
= 10 cycles + (10/30) × 50 cycles
= 80/3 cycles

Memory stall cycles per instruction
= (memory references per instruction) × (miss rate in L1) × (miss penalty of L1)
= 1.25 × (30/250) × (80/3)
= 4
Q:
Consider n unique memory block accesses on a fully associative cache with k
blocks..What is the necessary relation required to ensure maximum number of
misses using FIFOreplacement policy ?

a. n=k+1
b. n<k
c. n=k
d. n<=k

Solution:
a. n = k + 1. If the n blocks are accessed repeatedly in cyclic order, FIFO evicts
each block just before it is needed again, so every access misses.

Q: A basic computer system has a single-core processor where memory operations take
40% of execution time. An enhancement called L1 cache speeds up 60% of memory
operations by a factor of 4. Another enhancement called L2 cache speeds up half of
the remaining 40% of memory operations by a factor of 2. What is the overall speedup
of the system? (Round off to 3 decimal digits)

Solution:
1.282
New time = 0.60 (non-memory) + 0.24/4 (L1-accelerated) + 0.08/2 (L2-accelerated)
+ 0.08 (unaffected memory) = 0.60 + 0.06 + 0.04 + 0.08 = 0.78; speedup = 1/0.78 = 1.282.
