Test 6 PracticeQuestion Cachememory 1 Updated

The document discusses cache memory and hierarchy of memories. It defines key concepts like hit time, miss penalty, miss rate, average memory access time, locality of reference, and how cache memory reduces average memory access time. Several examples are provided to calculate performance metrics like average memory access time.

Uploaded by

anik.additional


Practice problems: Cache memory

1. What is Hierarchy of Memories?

Programmers want memory to be fast, large, and cheap. One may ask whether a memory that is both very
fast and very large is really needed, since a program accesses only a small portion of its address space at any time.
Architects address these conflicting demands with a hierarchy of memories, with
the fastest, smallest, and most expensive memory per bit at the top of the hierarchy and the slowest, largest,
and cheapest per bit at the bottom. To optimize cost and speed, a combination of memory devices is
used: a small amount of expensive but fast memory is placed close to the processor, whereas a large
amount of cheaper but slower memory is placed farther from the processor. The hierarchy of memories
gives the programmer the illusion that main memory is nearly as fast as the top of the hierarchy and nearly
as big and cheap as the bottom of the hierarchy. To the CPU, memory appears nearly as fast as the
most expensive level and as big as the cheapest level.

2. What is principle of locality? What is locality of reference? Define temporal locality and spatial locality.
• Principle of locality (locality of reference): programs access a small proportion of their address space at any time
• Temporal locality
  o Items accessed recently are likely to be accessed again soon
  o e.g., instructions in a loop
• Spatial locality
  o Items near those accessed recently are likely to be accessed soon
  o e.g., sequential instruction access, array data
3. Cache memory:
• Reduces the frequency of access to RAM while the CPU is running a program
• Reduces the average memory access time of a program
• Reduces program execution time
• Gives the programmer the illusion that main memory is nearly as fast as the top of the hierarchy
and nearly as big and cheap as the bottom of the hierarchy.

4. How does CPU use cache memory? Explain briefly

List, in sequential order, the steps by which the CPU accesses cache memory while it runs a program.

A program (instructions and data) is loaded into RAM. Cache memory is usually much smaller than
RAM, so only a fraction of the information (instructions and data) in RAM can be held in
cache memory at one time. In a computer with cache memory, the CPU first searches the cache for the next
instruction or data item it requires. If the item is found, it is called a cache hit and the CPU reads the
instruction/data from the cache. If the item is not found in the cache, it is called a cache miss. On a cache
miss, the CPU accesses RAM, and a block of instructions/data that includes the requested item is
copied from RAM to cache memory, exploiting spatial locality of reference.
The block is only a few bytes in size, e.g., 4 B, 8 B, 16 B, 32 B, and so on. This block transfer
increases the probability that the following instructions will be read from the cache
instead of from RAM.
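The hit/miss steps above can be sketched as a small simulation. This is a minimal sketch, not a model of any real cache: the function name `count_hits` is hypothetical, and it assumes the cache is large enough that no block is ever replaced.

```python
def count_hits(addresses, block_size):
    """Count cache hits and misses for a sequence of byte addresses.

    Assumption (not from the text): the cache never evicts a block,
    so a set of block numbers is enough to model its contents.
    """
    cached_blocks = set()
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size      # block that contains this address
        if block in cached_blocks:
            hits += 1                   # cache hit: read from the cache
        else:
            misses += 1                 # cache miss: copy the whole block from RAM
            cached_blocks.add(block)
    return hits, misses


# 16 sequential byte addresses with 4-byte blocks: one miss per block,
# then the next three accesses hit because of spatial locality.
hits, misses = count_hits(range(16), block_size=4)
print(hits, misses)  # 12 4
```

With a block size of 1 byte the same access pattern misses every time, which is the case the block transfer is meant to avoid.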
5. Define Hit rate, Hit time, Miss rate, miss penalty, average memory access time, and memory stall cycles.
• Hit Rate (Hit Ratio): The fraction of memory accesses that are found in the cache.
• Hit Time (HT): The hit time is how long it takes data to be sent from the cache to the processor.
This is usually fast, on the order of 1-3 clock cycles.
• Miss Penalty (MP): The miss penalty is the time to copy data from main memory to the cache.
This often requires dozens of clock cycles (at least).
• Miss Rate (MR): The miss rate is the fraction of memory accesses that miss in the cache. MR = 1 – Hit Ratio
• Average Memory Access Time (AMAT):
AMAT = Hit time + (Miss rate × Miss penalty)
This averages the access time over cache hits and cache misses.
• Memory Stall Cycles = Number of Memory Accesses × Miss rate × Miss penalty
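The AMAT formula can be written as a one-line function. This is a small sketch; the name `amat` and the sample values (hit time 5 ns, hit ratio 80%, miss penalty 30 ns) are chosen for illustration.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same unit as hit_time and miss_penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


hit_ratio = 0.80                                  # illustrative value
result = amat(5, 1 - hit_ratio, 30)               # times in ns
print(round(result, 3))                           # 11.0
```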
6. What is cache miss? What is the consequence of cache miss? Explain.
When the CPU searches the cache for an instruction or data item and does not find it, this is called a cache miss. On a
cache miss, the CPU accesses RAM, and a block of instructions/data that includes the requested item
is copied from RAM to cache memory, exploiting spatial locality of
reference. The block is only a few bytes in size, e.g., 4 B, 8 B, 16 B, 32 B, and so on. This block transfer
increases the probability that the following instructions will be read from the cache instead of
from RAM.
On a cache miss, a new block is always copied from RAM to the cache. If the cache is full when
the miss occurs, the new block replaces a block that was transferred
earlier.
7. What are the performance measures of cache memory?
• Hit Time (HT): The hit time is how long it takes data to be sent from the cache to the processor.
This is usually fast, on the order of 1-3 clock cycles.
• Miss Penalty (MP): The miss penalty is the time to copy data from main memory to the cache.
This often requires dozens of clock cycles (at least).
• Miss Rate (MR): The miss rate is the fraction of memory accesses that miss in the cache. MR = 1 – Hit Ratio
The average memory access time, or AMAT, can then be computed.
AMAT = Hit time + (Miss rate × Miss penalty)
This averages the access time over cache hits and cache misses.
8. What is the justification of block transfer from RAM to cache in the event of cache miss? Explain
Instructions in user programs usually follow the principle of spatial locality: an instruction from a
nearby RAM location is likely to be read next. So on a cache miss, a block of instructions
that includes the one the CPU is searching for is copied from RAM to the cache, so that the CPU can read
the following few instructions from the cache instead of accessing RAM again.
9. Parameters that matter:
Hit Time (HT)
Miss Rate (MR)
Miss Penalty (MP)
[Figure: CPU, Cache (access time 5ns), RAM (access time 30ns)]
AMAT = Hit Time + Miss Rate x Miss Penalty
Suppose a program (all instructions are register based) is initially loaded into RAM. The hit ratio of the
cache is 80%. Calculate the average memory access time.
Solution: Miss Rate = 1 – 0.8 = 0.2
AMAT = Hit Time + Miss Rate x Miss Penalty
Average Memory Access Time: 5ns + 0.2×30ns = 11ns

10. If a memory system consists of a single external cache with an access time of 20 ns and a hit rate of
0.92, and a main memory with an access time of 60 ns, what is the average memory access time
(AMAT) of this system?
AMAT = Hit time + Miss rate × Miss penalty

AMAT = 20 + (0.08x60) = 24.8 ns

11. If Hit ratio of Cache is 80%, Hit Time (HT) = 5ns and Miss Penalty (MP) = 30ns, calculate AMAT.
Miss Rate = 1- 0.8 = 0.2
AMAT = Hit Time + Miss Rate x Miss Penalty
= 5ns +0.2×30ns = 11ns
12. Hit time is also important for performance. Average memory access time (AMAT):
AMAT = Hit time + Miss rate × Miss penalty

Example
CPU with 1ns clock, hit time = 1 cycle, miss penalty = 20 cycles, cache miss rate = 5%

AMAT = 1 + 0.05 × 20 = 2 cycles = 2ns

That is, 2 cycles per memory access

13. Processor clock cycle: 200 ns, Miss Penalty of 50 clock cycles, Miss Ratio of 0.02 misses/instruction,
and Hit time of 1 clock cycle

AMAT = Hit time + Miss rate × Miss penalty

= 1+ 0.02 × 50 = 2 clock cycles = 400 ns

Which improvement would be best?

A) 190 ns clock:

AMAT = 1+ 0.02 × 50 = 2 clock cycles = 380 ns

B) MP (Miss Penalty) of 40 clock cycles:

AMAT = 1+ 0.02 × 40 = 1.8 clock cycles = 360 ns

C) MR(Miss Ratio) of 0.015 misses/instruction:

AMAT = 1+ 0.015 × 50 = 1.75 clock cycles = 350 ns

Option C gives the lowest AMAT (350 ns), so it is the best of the three improvements.
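The comparison in problem 13 can be checked by converting each option to nanoseconds. This is a sketch under the problem's own numbers; the function name `amat_ns` and the dictionary labels are made up for illustration.

```python
def amat_ns(hit_cycles, miss_rate, penalty_cycles, clock_ns):
    """AMAT in nanoseconds: (hit time + miss rate x miss penalty) x clock period."""
    return (hit_cycles + miss_rate * penalty_cycles) * clock_ns


options = {
    "baseline": amat_ns(1, 0.02, 50, 200),
    "A: 190 ns clock": amat_ns(1, 0.02, 50, 190),
    "B: 40-cycle miss penalty": amat_ns(1, 0.02, 40, 200),
    "C: 0.015 miss ratio": amat_ns(1, 0.015, 50, 200),
}

for name, value in options.items():
    print(f"{name}: {value:.0f} ns")

# The option with the smallest AMAT is the best improvement.
best = min(options, key=options.get)
print("best:", best)
```

Working in nanoseconds matters here because option A changes the clock period, so comparing cycle counts alone would hide its effect.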

14. A machine has a base CPI of 2 clock cycles. Measurements obtained show that the instruction miss rate
is 12% and the data miss rate is 6%, and that on average, 30% of all instructions contain one data
reference. The miss penalty for the cache is 10 cycles. What is the total CPI?
Solution:
Please note that Base CPI is the CPI when every memory access hits in the cache (an ideal cache), so it
already includes the cache hit time. The cycles lost to cache misses are added on top of it.
Average CPI = 2.0 + instruction miss cycles + data miss cycles
= 2.0 + 0.12 x 10 + 0.30 x 0.06 x 10 = 2.0 + 1.2 + 0.18
= 3.38

15. Memory Stall Cycles = Number of Memory Accesses × Miss rate × Miss penalty

16. Machine has a base CPI of 1 clock cycles. Measurements obtained show that the instruction miss rate is
15% and the data miss rate is 6%, and that on average, 40% of all instructions contain one data
reference. The miss penalty for the cache is 20 cycles. What is the total CPI?
Solution:

Average CPI = Base CPI + instruction miss cycles + data miss cycles
I-cache miss rate = 15%
D-cache miss rate = 6%
Miss penalty = 20 cycles
Base CPI (ideal cache) = 1
Load & stores are 40% of instructions
Miss cycles per instruction
I-cache: 1 x 0.15 × 20 = 3
D-cache: 1x 0.40 × 0.06 × 20 = 0.48
Actual CPI = 1 + 3 + 0.48 = 4.48
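The CPI calculation used in problems 14 and 16 follows one pattern, sketched below. The function name `effective_cpi` and its parameter names are hypothetical; it assumes one instruction fetch per instruction and at most one data access per instruction, as in the problems.

```python
def effective_cpi(base_cpi, i_miss_rate, d_miss_rate, data_ref_fraction, miss_penalty):
    """Base CPI plus memory stall cycles per instruction.

    Every instruction is fetched once, and a fraction `data_ref_fraction`
    of instructions also makes one data access.
    """
    i_stall = 1.0 * i_miss_rate * miss_penalty
    d_stall = data_ref_fraction * d_miss_rate * miss_penalty
    return base_cpi + i_stall + d_stall


print(round(effective_cpi(1, 0.15, 0.06, 0.40, 20), 2))  # problem 16
print(round(effective_cpi(2, 0.12, 0.06, 0.30, 10), 2))  # problem 14
```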

17. Suppose our processor has separate L1 instruction cache and data cache. Our CPI-base is 2 clock cycles,
whereas memory accesses take 100 cycles. Our Instruction cache miss rate is 3% while our Data cache
miss rate is 10%. 40% of our instructions are loads or stores.
a. What is our processor's CPI stall?

Average CPI = CPI base + L1 inst miss cycles + L1 data miss cycles
= 2 + 1 ×0.03 ×100 + 0.4×0.1×100
= 2 +0.07 × 100 = 9 cycles

To improve the performance of our processor, we add a unified L2 cache between the L1 caches and
memory. Our L2 cache has a hit time of 10 cycles and a global miss rate of 2%.
b. What is our new Average CPI?

CPI stall = CPI base + L1 inst miss cycles + L1 data miss cycles + L2 inst miss cycles + L2 data miss
cycles
= 2 + (1 ×0.03 × 10) + (0.4 ×0.1 × 10) + (1×0.02 × 100) + (0.4 ×0.02 × 100)
= 2 + 0.3 + 0.4 + 2 + 0.8
= 2 + 3.5 = 5.5 cycles
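Both parts of problem 17 can be reproduced with the sketch below. It follows the convention used in the solution above: an L1 miss costs the L2 hit time, and the global L2 miss rate is applied to all accesses (1 instruction fetch plus 0.4 data accesses per instruction). The function name `cpi_with_l2` is hypothetical.

```python
def cpi_with_l2(base_cpi, l1_i_mr, l1_d_mr, data_frac,
                l2_hit_cycles, l2_global_mr, mem_cycles):
    """Average CPI with split L1 caches and a unified L2 cache."""
    accesses = (1.0, data_frac)        # instruction fetches, data accesses per instruction
    l1_rates = (l1_i_mr, l1_d_mr)
    # Each L1 miss pays the L2 hit time.
    l1_stall = sum(n * mr * l2_hit_cycles for n, mr in zip(accesses, l1_rates))
    # A global L2 miss rate is the fraction of ALL accesses that reach memory.
    l2_stall = sum(n * l2_global_mr * mem_cycles for n in accesses)
    return base_cpi + l1_stall + l2_stall


# Part a: no L2, every L1 miss goes to memory (100 cycles).
cpi_l1_only = 2 + 1 * 0.03 * 100 + 0.4 * 0.10 * 100
# Part b: L2 with a 10-cycle hit time and 2% global miss rate.
cpi_l1_l2 = cpi_with_l2(2, 0.03, 0.10, 0.4, 10, 0.02, 100)

print(round(cpi_l1_only, 2), round(cpi_l1_l2, 2))  # 9.0 5.5
```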
18. Assume
1. I-cache miss rate 3%.
2. D-cache miss rate 5%.
3. 40% of instructions reference data.
4. Miss penalty of 50 cycles.
5. Base CPI is 2.
• What is the CPI including the misses? (Average CPI)
• How much slower is the machine when misses are taken into account? (Average CPI / base CPI)
• Redo the above if the I-miss penalty is reduced to 10 cycles (D-miss penalty still 50).
• With the I-miss penalty back to 50, what is the performance if the CPU (and the caches) are 100 times faster?
19. Assume
o 5% I-cache misses.
o 10% D-cache misses.
o 1/3 of the instructions access data.
• What is the CPI if the miss penalty is 12 clock cycles?
• What is the CPI if the miss penalty is 24 clock cycles?

20. Suppose our processor has separate L1 instruction cache and data cache. Our CPIbase is 2 clock cycles,
whereas memory accesses take 100 cycles. Our I-cache miss rate is 3% while our D-cache miss rate is
10%. 40% of our instructions are loads or stores.

• a. What is our processor's Average CPI?

Average CPI = CPIbase + L1 inst miss cycles + L1 data miss cycles
= 2 + 1 x 0.03 x 100 + 0.4 x 0.1 x 100
= 2 + 0.07 x 100 = 9 cycles
To improve the performance of our processor, we add a unified L2 cache between the L1 caches
and memory. Our L2 cache has a hit time of 10 cycles and a global miss rate of 2%.
• b. What is our new Average CPI?
Average CPI=CPIbase + L1 inst miss cycles + L1 data miss cycles + L2 inst miss cycles + L2
data miss cycles
= 2 + (1 x 0.03 x 10) + (0.4 x 0.1 x 10) + (1 x 0.02 x 100 ) + (0.4 x 0.02 x 100)
= 2 + 0.3 + 0.4 + 2 + 0.8
= 2 + 3.5 = 5.5 cycles

21. Assume a memory access to main memory on a cache "miss" takes 30 ns and a cache "hit" takes 3 ns. If
80% of the processor's memory requests result in a cache "hit", what is the average memory access time?
Solution:
Method 1 (weighted average; a miss costs 3 ns + 30 ns = 33 ns):
(0.8 × 3 ns) + (0.2 × 33 ns) = 2.4 ns + 6.6 ns
= 9 ns

Method 2 (AMAT formula):
AMAT = 3 + 0.2 x 30 = 9ns
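Both calculations give 9 ns because the weighted average reduces algebraically to the AMAT formula. A short derivation, where the symbol h for the hit ratio is introduced here for illustration and HT and MP are the hit time and miss penalty:

```latex
\begin{aligned}
\text{AMAT} &= h \cdot HT + (1-h)\,(HT + MP) \\
            &= HT + (1-h)\,MP \\
            &= 3 + 0.2 \times 30 = 9\ \text{ns}
\end{aligned}
```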

22. I-cache miss rate = 2%

D-cache miss rate = 4%
Miss penalty = 100 cycles
Base CPI (ideal cache) = 2
Load & stores are 36% of instructions
Solution:

Miss cycles per instruction

I-cache: 0.02 × 100 = 2
D-cache: 0.36 × 0.04 × 100 = 1.44
Actual CPI = 2 + 2 + 1.44 = 5.44

Ideal CPI = Base CPI = 2

Speed up = 5.44/2 =2.72

Ideal CPU is 2.72 times faster

23. CPU with 1ns clock, hit time = 1 cycle, miss penalty = 20 cycles, I-cache miss rate = 5%
Calculate: Average memory access time (AMAT)
Solution:

AMAT = Hit time + Miss rate × Miss penalty

AMAT = 1 + 0.05 × 20 = 2 cycles = 2ns
2 cycles per instruction fetch
24.

25. Suppose our processor has separate L1 instruction cache and data cache. Our CPI base is 2 clock cycles,
whereas memory accesses take 100 cycles. Our instruction miss rate is 3% while our data miss rate is
10%. 40% of our instructions are loads or stores.

a. What is our processor's Average CPI?

Average CPI = CPI base + L1 inst miss cycles + L1 data miss cycles
= 2 + 1 x 0.03 x 100 + 0.4 x 0.1 x 100
= 2 + 0.07 x 100 = 9 cycles
To improve the performance of our processor, we add a unified L2 cache between the L1 caches and
memory. Our L2 cache has a hit time of 10 cycles and a global miss rate of 2%.
b. What is our new Average CPI?
Average CPI = CPI base + L1 inst miss cycles + L1 data miss cycles + L2 inst miss cycles + L2 data
miss cycles
= 2 + (1 x 0.03 x 10) + (0.4 x 0.1 x 10) + (1 x 0.02 x 100 ) + (0.4 x 0.02 x 100)
= 2 + 0.3 + 0.4 + 2 + 0.8
= 2 + 3.5 = 5.5 cycles

Question 2:
You want your AMAT to be <=2 cycles. You have two levels of cache.
L1 Hit Time is 1 cycle; L1 miss rate is 10%;
L2 Hit Time is 3 cycles; L2 Miss Penalty is 100 cycles
What must you optimize your L2 miss rate to be?
Let x be the L2 miss rate.
2 = 1 + 0.1 × (3 + 100x)
1 = 0.3 + 10x
x = 0.07, so the L2 miss rate must be at most 7%
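The same equation can be solved for the L2 miss rate in general. This is a sketch, assuming the two-level form AMAT = L1 hit time + L1 miss rate × (L2 hit time + L2 miss rate × L2 miss penalty) used above; the function name `max_l2_miss_rate` is hypothetical.

```python
def max_l2_miss_rate(target_amat, l1_hit, l1_miss_rate, l2_hit, l2_penalty):
    """Largest L2 miss rate x that satisfies
    l1_hit + l1_miss_rate * (l2_hit + x * l2_penalty) <= target_amat."""
    return ((target_amat - l1_hit) / l1_miss_rate - l2_hit) / l2_penalty


x = max_l2_miss_rate(target_amat=2, l1_hit=1, l1_miss_rate=0.10,
                     l2_hit=3, l2_penalty=100)
print(f"{x:.2%}")  # 7.00%
```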
26. Assume that 33% of the instructions in a program are data accesses. The cache hit ratio is 97% and the
hit time is one cycle, but the miss penalty is 20 cycles.
AMAT = Hit time + (Miss rate x Miss penalty)

27. Assume
• CPI = 1.0 when cache has 100% hit rate
• data accesses are only performed in loads and stores
• load/stores make up 50% of all instructions
• miss penalty = 50 clock cycles
• miss rate = 1%
• How much faster is an ideal machine?

Each instruction makes 1 instruction fetch + 0.5 data accesses = 1.5 memory accesses on average.

CPU time ideal = IC x 1.0 x clock cycle time

CPU time this machine = IC x (1.0 + 1.50 x 0.01 x 50) x clock cycle time

= IC x 1.75 x clock cycle time

ideal machine is 1.75 times faster (75%)

28. Assume that 33% of the instructions in a program are data accesses. The cache hit ratio is 97% and the
hit time is one cycle, but the miss penalty is 20 cycles.

Solution:
Average Memory Access Time (AMAT) = Hit time + (Miss rate x Miss penalty)

Miss rate = 1 – 0.97 = 0.03

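AMAT = 1 + (0.03 × 20) = 1.6 cycles

The 33% data-access fraction does not enter the AMAT formula. It is used for the memory stall cycles per
instruction caused by data accesses: 0.33 × 0.03 × 20 = 0.198 cycles.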

29. Suppose our processor has separate L1 instruction cache and data cache. Our CPI-base is 2 clock cycles,
whereas memory accesses take 100 cycles. Our Instruction cache hit rate is 97% while our Data cache
hit rate is 90%. 40% of our instructions are loads or stores.

[Stalled means the processor was not making forward progress with instructions, and usually happens
because it is waiting on memory I/O.]
Alternate statement:

Suppose our processor has separate L1 instruction cache and data cache. Our CPI-base is 2 clock cycles,
whereas memory accesses take 100 cycles. Our Instruction cache miss rate is 3% while our Data cache
miss rate is 10%. 40% of our instructions are loads or stores.

a. What is our processor's CPI stall?

CPI stall = CPI base + L1 inst miss cycles + L1 data miss cycles
= 2 + 1 ×0.03 ×100 + 0.4×0.1×100
= 2 +0.07 × 100 = 9 cycles

30.
You want your Average Memory Access Time (AMAT) to be <=2 cycles. You have two levels of cache.
L1 Hit Time is 1 cycle; L1 miss rate is 10%;
L2 Hit Time is 3 cycles; L2 Miss Penalty is 100 cycles
What must you optimize your L2 miss rate to be?
2 = 1 +0.1× (3 + x ×100)
x = 0.07 = 7% miss rate

31. Assume
CPI = 1.0 when cache has 100% hit rate
Miss penalty = 50 clock cycles
Miss rate = 1%
How much faster is an ideal machine?

CPU time ideal = IC × 1.0 × clock cycle time

CPU time this machine = IC × (1.0 + 0.01 × 50) × clock cycle time = IC × 1.5 × clock cycle time
ideal machine is 1.5 times faster (50%)

32. A machine has a base CPI of 2 clock cycles. Measurements obtained show that the instruction miss
rate is 12% and the data miss rate is 6%, and that on average, 30% of all instructions contain one data
reference. The miss penalty for the cache is 10 cycles. What is the total CPI?

Effective CPI = 2.0 + instruction miss cycles + data miss cycles

= 2.0 + 0.12×10 + 0.30×0.06×10 = 2.0 + 1.2 + 0.18 = 3.38

33. I-cache miss rate = 2%

D-cache miss rate = 4%
Miss penalty = 100 cycles
Base CPI (ideal cache) = 2
Load & stores are 36% of instructions
Miss cycles per instruction
I-cache: 0.02 × 100 = 2
D-cache: 0.36 × 0.04 × 100 = 1.44
Actual CPI = 2 + 2 + 1.44 = 5.44
Ideal CPU is 5.44/2 =2.72 times faster

34. Assume

I-cache miss rate 3%.

D-cache miss rate 5%.
40% of instructions reference data.
miss penalty of 50 cycles.
Base CPI is 2.

What is the CPI including the misses?

How much slower is the machine when misses are taken into account?
Redo the above if the I-miss penalty is reduced to 10 (D-miss still 50)
With I-miss penalty back to 50, what is performance if CPU (and the caches) are 100 times faster
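The first three questions of problem 34 can be worked with the sketch below, which allows separate instruction-miss and data-miss penalties. The function name `cpi_with_misses` is hypothetical, and the last question (a CPU 100 times faster) is left as an exercise.

```python
def cpi_with_misses(base_cpi, i_miss_rate, d_miss_rate, data_fraction,
                    i_penalty, d_penalty):
    """Base CPI plus instruction-miss and data-miss stall cycles per instruction."""
    return (base_cpi
            + i_miss_rate * i_penalty
            + data_fraction * d_miss_rate * d_penalty)


BASE_CPI = 2
equal_penalty = cpi_with_misses(BASE_CPI, 0.03, 0.05, 0.40, 50, 50)      # 2 + 1.5 + 1.0
reduced_i_penalty = cpi_with_misses(BASE_CPI, 0.03, 0.05, 0.40, 10, 50)  # 2 + 0.3 + 1.0

print(f"CPI = {equal_penalty:.2f}, slowdown = {equal_penalty / BASE_CPI:.2f}x")
print(f"CPI = {reduced_i_penalty:.2f}, slowdown = {reduced_i_penalty / BASE_CPI:.2f}x")
```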

35. Assume
5% I-cache misses.
10% D-cache misses.
1/3 of the instructions access data.
The CPI = 4 if the miss penalty is 0. (A miss penalty of 0 is not realistic, of course.)
What is the CPI if the miss penalty is 12?
What is the CPI if we upgrade to a double speed cpu+cache, but keep a single speed memory (i.e., a 24
clock miss penalty)?
How much faster is the double speed machine? It would be double speed if the miss penalty were 0 or if
there was a 0% miss rate.

----

Problem:
HT(L1) = 2ns ;
HT(L2) = 10ns ;
Miss Ratio(L1) = 6%,
Miss Ratio (L2) = 2%
RAM Access time = 100ns.
Calculate average memory access time.

If Average Memory Access Time = 3.358 ns, calculate what percentage of instructions the CPU reads from L1
cache memory.

Miss rate of L-1 = x%

Hit ratio of L-1 = (100 – x)%

For the following

Problem:
HT(L1) = 2ns ; Hit Ratio(L1) = 70%
HT(L2) = 10ns ; Hit Ratio(L2) = 80%
HT(L3) = 20ns ;
Global Miss Rate = 4% and RAM Access time = 100ns
Calculate: Average Memory Access Time

Assume:
• L1 I-cache miss rate 3%
• L1 D-cache miss rate 8%
• 30% of instructions reference data
• L2 miss rate 4%
• L2 time of 10 clock cycles
• Memory access time 90 clock cycles
• Base CPI of 2.5
• CPU Clock rate 3GHz
Which of the following implementations will be better?
a. A faster L2 of 6 clock cycles
b. A larger L2 of miss rate 3%
Average CPI = base CPI + L-1 Instruction miss cycles + L-1 Data Miss cycles + L-2 instruction miss cycles + L-2
Data Miss Cycles

base CPI = 2.5

L-1 Instruction miss cycles = Instruction count x Instruction MR x L-1 Miss Penalty = 1 x 0.03 x 10

L-1 data miss cycles = Data access count x data MR x L-1 Miss Penalty = (1 x0.3)x 0.08 x 10

L-2 Instruction miss cycles = Instruction count x L-2 MR x L-2 Miss Penalty = 1 x 0.04 x 90

L-2 data miss cycles = data access count x L-2 MR x L-2 Miss Penalty = (1x0.3) x 0.04 x 90

Average CPI =

Case-a
• L1 I-cache miss rate 3%
• L1 D-cache miss rate 8%
• 30% of instructions reference data
• L2 miss rate 4%
• L2 time of 6 clock cycles
• Memory access time 90 clock cycles
• Base CPI of 2.5
• CPU Clock rate 3GHz

Average CPI =

Case-b
• L1 I-cache miss rate 3%
• L1 D-cache miss rate 8%
• 30% of instructions reference data
• L2 miss rate 3%
• L2 time of 10 clock cycles
• Memory access time 90 clock cycles
• Base CPI of 2.5
• CPU Clock rate 3GHz

Average CPI =

Which of the following implementation will be better?

a. A faster L2 of 6 clock cycles
b. A larger L2 of miss rate 3%
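One way to compare the two options is to evaluate the Average CPI formula above for each case. The sketch below follows the formula as written in this problem (the L2 miss rate applies to every instruction fetch and data access); the function name `average_cpi` and the parameter names are hypothetical.

```python
def average_cpi(base_cpi, l1_i_mr, l1_d_mr, data_frac, l2_mr, l2_cycles, mem_cycles):
    """Base CPI + L1 miss cycles (penalty = L2 time) + L2 miss cycles (penalty = memory time)."""
    l1_stall = (1 * l1_i_mr + data_frac * l1_d_mr) * l2_cycles
    l2_stall = (1 + data_frac) * l2_mr * mem_cycles
    return base_cpi + l1_stall + l2_stall


params = dict(base_cpi=2.5, l1_i_mr=0.03, l1_d_mr=0.08, data_frac=0.30,
              l2_mr=0.04, l2_cycles=10, mem_cycles=90)

original = average_cpi(**params)
case_a = average_cpi(**{**params, "l2_cycles": 6})    # faster L2
case_b = average_cpi(**{**params, "l2_mr": 0.03})     # larger L2

print(f"original: {original:.3f}")
print(f"case a:   {case_a:.3f}")
print(f"case b:   {case_b:.3f}")
print("better:", "a" if case_a < case_b else "b")
```

Because the clock rate is the same in every case, the lower CPI directly identifies the better implementation.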

Assume:
• L1 I-cache miss rate 3%
• L1 D-cache miss rate 8%
• 40% of instructions reference data
• L2 miss rate 4%
• L2 time of 10 clock cycles
• Memory access time 80 clock cycles
• Base CPI of 2
• CPU Clock rate 4GHz
Which of the following implementations will be better?
a. A faster L2 of 5 clock cycles
b. A larger L2 of miss rate 2%

Assume
• L1 I-cache miss rate 8%
• L1 D-cache miss rate 10%
• Load Instructions: 15% of total instruction count
• Store Instructions: 25% of total instruction count
• Data ref = 40%
• L2 miss rate 6%
• L2 time of 15ns = 15ns / 0.25ns = 60 cycles
• Memory access time 100ns = 100ns/0.25ns = 400 cycles
• Base CPI of 2 (2 cycles = 2 x 0.25 ns = 0.5 ns)
• Clock rate 4GHz
CPU clock period = 1/(4 x 10^9) s = 0.25 x 10^-9 s = 0.25 ns
a) How many instructions per second does this machine execute?

Average Memory Access time (ns)

Number of instructions per second this machine executes = 1 / average memory access time (ns)

1 / (20 x 10^-9) =

Average CPI (no of clock cycles)

Number of instructions per second this machine executes = 4GHz / average CPI = 4 x 10^9 / average CPI

b) How many instructions per second would this machine execute if the L2 cache were eliminated?
Average CPI
No of instructions per second does this machine execute = 4GHz /average CPI

c) How many instructions per second would this machine execute if both caches were eliminated?
Average CPI
No of instructions per second does this machine execute = 4GHz /average CPI

d) How many instructions per second would this machine execute if the L2 cache had a 0% miss
rate (L1 as originally specified)?
e) How many instructions per second would this machine execute if both L1 caches had a 0% miss
rate?
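Parts (a) and (e) can be sketched with the relation instructions per second = clock rate / average CPI. The sketch assumes the same convention as the earlier CPI formula (L1 misses pay the L2 time, and the L2 miss rate applies to all accesses); other readings of "L2 miss rate" give different numbers. All names are hypothetical.

```python
CLOCK_HZ = 4e9
CYCLE_NS = 1e9 / CLOCK_HZ              # 0.25 ns per clock cycle


def instructions_per_second(average_cpi, clock_hz=CLOCK_HZ):
    """Instructions per second = clock rate / average CPI."""
    return clock_hz / average_cpi


l2_cycles = 15 / CYCLE_NS              # 15 ns  -> 60 cycles
mem_cycles = 100 / CYCLE_NS            # 100 ns -> 400 cycles
data_frac = 0.15 + 0.25                # loads + stores

# Part (a): base CPI + L1 miss cycles + L2 miss cycles
cpi_a = (2
         + (0.08 + data_frac * 0.10) * l2_cycles
         + (1 + data_frac) * 0.06 * mem_cycles)

# Part (e): both L1 caches never miss, so only the base CPI remains.
cpi_e = 2

print(f"(a) CPI = {cpi_a:.1f}, {instructions_per_second(cpi_a):.3e} instructions/s")
print(f"(e) CPI = {cpi_e}, {instructions_per_second(cpi_e):.3e} instructions/s")
```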

Problem: For Intel Core i7-965 processor, assume that

HT(L1) = 4ns ; Miss Ratio(L1) = 10%
HT(L2) = 10ns ; Miss Ratio(L2) = 6%
HT(L3) = 20ns ; Miss Ratio(L3) = 2%
RAM Access time = 100ns
Calculate: Average Memory Access Time

Assume
• L1 I-cache miss rate 5%
• L1 D-cache miss rate 8%
• Load Instructions: 25% of total instruction count
• Store Instructions: 10% of total instruction count
• L2 miss rate 4%
• L2 time of 15ns
• Memory access time 90ns
• Base CPI of 2
• Clock rate 3GHz
f) How many instructions per second does this machine execute?
g) How many instructions per second would this machine execute if the L2 cache were eliminated?
h) How many instructions per second would this machine execute if both caches were eliminated?
i) How many instructions per second would this machine execute if the L2 cache had a 0% miss
rate (L1 as originally specified)?
j) How many instructions per second would this machine execute if both L1 caches had a 0% miss
rate?

Problem: For Intel Core i7-965 processor, assume that

HT(L1) = 5ns ; Miss Ratio(L1) = 10%
HT(L2) = 10ns ; Miss Ratio(L2) = 8%
HT(L3) = 15ns ; Miss Ratio(L3) = 2%
RAM Access time = 100ns
Calculate: Average Memory Access Time
AMAT = HT(L1) + MR(L1) x MP(L1)
= HT(L1) + MR(L1) x [HT(L2) + MR(L2) x MP(L2)]
= HT(L1) + MR(L1) x [HT(L2) + MR(L2) x {HT(L3) + MR(L3) x MP(L3)}]
= 5ns + 0.1 x [10ns + 0.08 x {15ns + 0.02 x 100ns}]
= 5ns + 0.1 x [10ns + 0.08 x 17ns]
= 5ns + 0.1 x 11.36ns
= 6.136 ns
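The nested expansion above generalizes to any number of cache levels: the miss penalty of each level is the AMAT of the level below it. This is a minimal sketch, assuming each miss ratio is local to its level; the function name `amat_multilevel` is hypothetical.

```python
def amat_multilevel(levels, memory_time):
    """AMAT for a cache hierarchy.

    levels: list of (hit_time, local_miss_rate) pairs, ordered from L1 outward.
    memory_time: RAM access time, i.e. the miss penalty of the last cache level.
    """
    penalty = memory_time
    # Work from the last cache level back to L1.
    for hit_time, miss_rate in reversed(levels):
        penalty = hit_time + miss_rate * penalty
    return penalty


# HT(L1)=5, MR(L1)=10%; HT(L2)=10, MR(L2)=8%; HT(L3)=15, MR(L3)=2%; RAM=100 (all in ns)
three_level = amat_multilevel([(5, 0.10), (10, 0.08), (15, 0.02)], 100)
print(round(three_level, 3))  # 6.136

# The same function covers a two-level hierarchy, e.g. HT(L1)=2, MR(L1)=6%; HT(L2)=10, MR(L2)=2%.
two_level = amat_multilevel([(2, 0.06), (10, 0.02)], 100)
print(round(two_level, 3))    # 2.72
```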
