Memory Hierarchy
Processors
Memory
Communication
Memory / Storage Evaluation
Costs
Capacity
Speed
Reliability
Volatility
Memory/storage hierarchies
Balancing performance with cost
Small memories are fast but expensive
Large memories are slow but cheap
Exploit locality to get the best of both worlds
locality = re-use/nearness of accesses
allows most accesses to use small, fast memory
[Figure: axes labeled Capacity and Performance]
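A minimal sketch of why locality lets a small memory serve most accesses. The LRU replacement policy, the capacity of 16, and both traces are assumptions made for illustration; they do not come from the slides.

```python
from collections import OrderedDict
import random


def hit_rate(trace, capacity):
    """Fraction of accesses served by a small memory managed with LRU."""
    cache = OrderedDict()
    hits = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used entry
            cache[addr] = True
    return hits / len(trace)


# Hypothetical traces of 10,000 accesses each (fixed seed for repeatability).
rng = random.Random(0)
local_trace = [rng.randrange(8) for _ in range(10_000)]      # keeps re-using 8 addresses
spread_trace = [rng.randrange(1000) for _ in range(10_000)]  # spread over 1000 addresses

print("high locality:", hit_rate(local_trace, capacity=16))
print("low locality: ", hit_rate(spread_trace, capacity=16))
```

With the same 16-entry memory, the trace that re-uses a few addresses hits almost always, while the spread-out trace rarely hits.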
An Example Memory Hierarchy
Levels nearer L0 are smaller, faster, and costlier (per byte) storage devices.
L0: registers. CPU registers hold words retrieved from L1 cache.
L1: on-chip L1 cache (SRAM). L1 cache holds cache lines retrieved from the L2 cache memory.
L2: off-chip L2 cache (SRAM). L2 cache holds cache lines retrieved from main memory.
From lecture-9.ppt
Main Memory
Most of the main memory in a general-purpose
computer is made up of RAM integrated-circuit
chips, but a portion of the memory may be
constructed with ROM chips.
[Figure: trend plotted by Year, annotated 2X in 1.5 yrs]
Introduction
Exploiting the Memory Hierarchy
Not all stored data is equally important.
Put important data in the upper ranges
of the memory / storage hierarchy.
Put unimportant data in the lower
ranges.
The Principle of Locality
Programs access a relatively small portion of the address space at
any instant of time. (This is kind of like real life: we all have a
lot of friends, but at any given time most of us can only keep
in touch with a small group of them.)
Two Different Types of Locality:
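Temporal Locality (Locality in Time): If an item is referenced, it
will tend to be referenced again soon.
Spatial Locality (Locality in Space): If an item is referenced, items
whose addresses are close by tend to be referenced soon.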
[Figure: the processor exchanges data with the upper-level memory (Blk X), which exchanges blocks with the lower-level memory (Blk Y)]
Memory Hierarchy Terms
The goal of the memory hierarchy is to keep the
contents that are needed now at or near the top of
the hierarchy
We discuss the performance of the memory hierarchy
using the following terms:
Hit – when the datum being accessed is found at the current
level
Miss – when the datum being accessed is not found and the next
level of the hierarchy must be examined
Hit rate – fraction of all memory accesses that are hits
Miss rate – fraction of all memory accesses that are misses
NOTE: hit rate = 1 – miss rate, miss rate = 1 – hit rate
Hit time – time to access this level of the hierarchy
Miss penalty – time to access the next level
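The terms above can be sketched in a few lines. The counter values are hypothetical and chosen only for illustration.

```python
def rates(hits, misses):
    """Return (hit rate, miss rate) for one level of the hierarchy."""
    accesses = hits + misses
    if accesses == 0:
        raise ValueError("no memory accesses recorded")
    return hits / accesses, misses / accesses


# Hypothetical counters: 950 hits and 50 misses out of 1000 accesses.
hit_rate, miss_rate = rates(hits=950, misses=50)
print(hit_rate, miss_rate)
```

The two rates always sum to 1, matching the NOTE above.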
Hit Rate and Miss Penalty
Hit rate: fraction of accesses found in that level
So high that we usually talk about the miss rate instead
Miss rate fallacy: miss rate is to average memory access time
as MIPS is to CPU performance (a proxy, not the real measure)
Average memory-access time
= Hit time + Miss rate x Miss penalty
(ns or clocks)
Miss penalty: time to replace a block from lower level,
including time to replace in CPU
access time: time to lower level
= f(latency to lower level)
transfer time: time to transfer block
=f(BW between upper & lower levels, block size)
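A rough sketch of how these quantities combine. The slide only says that access time depends on latency and that transfer time depends on bandwidth and block size; the simple additive model below and all numbers are assumptions for illustration.

```python
def miss_penalty(latency_ns, block_bytes, bandwidth_bytes_per_ns):
    """Assumed model: access time (latency) plus transfer time (block size / bandwidth)."""
    return latency_ns + block_bytes / bandwidth_bytes_per_ns


def amat(hit_time_ns, miss_rate, penalty_ns):
    """Average memory-access time = hit time + miss rate x miss penalty."""
    return hit_time_ns + miss_rate * penalty_ns


# Hypothetical lower level: 60 ns latency, 64-byte blocks, 16 bytes/ns bandwidth.
penalty = miss_penalty(latency_ns=60.0, block_bytes=64, bandwidth_bytes_per_ns=16.0)
print("miss penalty (ns):", penalty)
print("AMAT (ns):", amat(hit_time_ns=1.0, miss_rate=0.05, penalty_ns=penalty))
```

Under this model, a larger block lengthens the transfer time and so raises the miss penalty, even if it lowers the miss rate.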
Single Cache
The average read access time = Hit Ratio * Time
taken in case of hit + (1 - Hit Ratio) * Time taken
in case of miss
Average access time = H1*T1 + (1-H1)*T2
Two level Cache
Average access time = [H1*T1] + [(1-H1)*H2*T2] + [(1-H1)(1-H2)*Hm*Tm]
Example 1
Assume that for a certain processor, a read request takes
50 nanoseconds on a cache miss and 5 nanoseconds on
a cache hit. Suppose while running a program, it was
observed that 80% of the processor’s read requests
result in a cache hit. The average read access time in
nanoseconds is____________.
(A) 10
(B) 12
(C) 13
(D) 14
Solution 1
Hit Ratio=0.8
Time taken in case of hit=5ns
Time taken in case of miss=50ns
The average read access time= Hit Ratio*Time taken in
case of hit +(1-Hit Ratio)*Time taken in case of miss
The average read access time in nanoseconds
= 0.8 * 5 + (1-0.8)*50
= 0.8 * 5 + 0.2*50
= 14 ns, so the answer is (D)
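The arithmetic can be checked with a short script. The function name is made up for illustration; the formula and the numbers are the ones from the solution.

```python
def average_read_time(hit_ratio, hit_time_ns, miss_time_ns):
    """Hit Ratio * time on a hit + (1 - Hit Ratio) * time on a miss."""
    return hit_ratio * hit_time_ns + (1 - hit_ratio) * miss_time_ns


result = average_read_time(hit_ratio=0.8, hit_time_ns=5, miss_time_ns=50)
print(round(result, 6))  # prints 14.0
```

Rounding hides the tiny floating-point error in 1 - 0.8.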
Example 2
Consider a system with 2 level caches. Access times of
Level 1 cache, Level 2 cache and main memory are 1
ns, 10ns, and 500 ns, respectively. The hit rates of
Level 1 and Level 2 caches are 0.8 and 0.9,
respectively. What is the average access time of the
system ignoring the search time within the cache?
(A) 13.0 ns
(B) 12.8 ns
(C) 12.6 ns
(D) 12.4 ns
Solution 2
Average access time = [H1*T1] + [(1-H1)*H2*T2] + [(1-H1)(1-H2)*Hm*Tm]
where,
H1 = Hit rate of level 1 cache = 0.8
T1 = Access time for level 1 cache = 1 ns
H2 = Hit rate of level 2 cache = 0.9
T2 = Access time for level 2 cache = 10 ns
Hm = Hit rate of Main Memory = 1
Tm = Access time for Main Memory = 500 ns
So, Average Access Time = (0.8 * 1) + (0.2 * 0.9 * 10) + (0.2 * 0.1 * 1 * 500)
= 0.8 + 1.8 + 10
= 12.6 ns, so the answer is (C)
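The same calculation extends to any number of levels. This is a sketch: the function name and the list-of-pairs layout are assumptions, and, as in the question, search time within a cache is ignored.

```python
def average_access_time(levels):
    """levels: (hit rate, access time in ns) pairs, nearest to the CPU first.

    The last level is expected to have hit rate 1 (it always holds the data).
    """
    total = 0.0
    reach_prob = 1.0  # probability that every earlier level missed
    for hit_rate, time_ns in levels:
        total += reach_prob * hit_rate * time_ns
        reach_prob *= 1 - hit_rate
    return total


# L1, L2 and main memory from Example 2.
levels = [(0.8, 1), (0.9, 10), (1.0, 500)]
print(round(average_access_time(levels), 6))  # prints 12.6
```

Each term is the chance of reaching a level, times the chance of hitting there, times that level's access time.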
Example 3
A computer system has an L1 cache, an L2 cache, and a main memory unit
connected as shown below. The block size in L1 cache is 4 words. The
block size in L2 cache is 16 words. The memory access times are 2
nanoseconds, 20 nanoseconds, and 200 nanoseconds for L1 cache, L2
cache, and main memory unit, respectively.
[Figure: processor, cache, write buffer, and lower-level memory]
Causes of misses
Compulsory
First reference to a block
Capacity
Blocks discarded because the cache cannot hold every block
the program needs, and later retrieved
Conflict
Program makes repeated references to multiple
addresses from different blocks that map to the same
location in the cache
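A small simulation can sort misses into these three causes. This is a sketch under stated assumptions: the cache is direct-mapped with one block per line, and a miss counts as capacity if a fully associative LRU cache of the same size would also miss, otherwise as conflict. That comparison is a common textbook convention, not something stated on the slide.

```python
from collections import OrderedDict


def classify_misses(block_trace, num_lines):
    """Count hits and the three kinds of misses for a direct-mapped cache."""
    direct = [None] * num_lines  # block b can only live in line b % num_lines
    full = OrderedDict()         # fully associative LRU cache of the same size
    seen = set()                 # blocks referenced at least once
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}
    for block in block_trace:
        line = block % num_lines
        direct_hit = direct[line] == block
        full_hit = block in full
        # Keep the fully associative reference cache up to date.
        if full_hit:
            full.move_to_end(block)
        else:
            if len(full) >= num_lines:
                full.popitem(last=False)  # evict the least recently used block
            full[block] = True
        if direct_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1     # first reference to the block
        elif full_hit:
            counts["conflict"] += 1       # only the mapping caused the miss
        else:
            counts["capacity"] += 1       # the cache is simply too small
        direct[line] = block
        seen.add(block)
    return counts


# Blocks 0 and 4 both map to line 0 of a 4-line cache and keep evicting each other.
print(classify_misses([0, 4, 0, 4, 0, 4], num_lines=4))
# prints {'hit': 0, 'compulsory': 2, 'capacity': 0, 'conflict': 4}
```

Three of the four lines stay empty in this trace, yet every access misses because both blocks compete for the same location.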
Memory Access Time
Average Memory Access Time = Hit Time + Miss Rate * Miss Penalty