CMP3010L08 Memory
Dina Tantawy
Computer Engineering Department
Cairo University
Agenda
• Introduction
• The principle of locality
• Temporal locality
• Spatial locality
• Memory Hierarchy
• The basics of caches
• Direct mapped caches
Is it a “Computing” Machine ??
• Compute
• Load/store data to memory
Principle of locality
States that programs access a relatively small portion of their address space at any instant of time.
• Temporal locality: an item that has been referenced recently will tend to be referenced again soon.
• Spatial locality: items whose addresses are near a recently referenced item will tend to be referenced soon.
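A small illustration (not from the slides) of both kinds of locality, using a plain C loop that sums an array:

#include <stdio.h>

/* Hypothetical example: summing an array.
 * - sum and i are reused on every iteration      -> temporal locality
 * - a[0], a[1], a[2], ... are accessed in order  -> spatial locality */
int main(void) {
    int a[1024];
    for (int i = 0; i < 1024; i++) a[i] = i;

    long sum = 0;
    for (int i = 0; i < 1024; i++)
        sum += a[i];              /* consecutive addresses: spatial locality */
    printf("sum = %ld\n", sum);
    return 0;
}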
The basic structure of memory hierarchy
Memory Hierarchy : Principles of Operation
• A memory hierarchy can consist of multiple levels, but data is copied
between only two adjacent levels at a time.
• Upper Level (Cache) : the one closer to the processor
• Smaller, faster, and uses more expensive technology
• Lower Level (Memory): the one further away from the processor
• Bigger, slower, and uses less expensive technology
• Block (line): The minimum unit of information that can be either present
or not present in a cache.
Memory Hierarchy : Terminologies
• Hit: data appears in some block in the upper level
– Hit Rate: the fraction of memory accesses found in the upper level
– Hit Time: time to access the upper level, which consists of cache access time + time to determine hit/miss
• Miss: data has to be retrieved from a block in the lower level
– Miss Rate = 1 – (Hit Rate)
– Miss Penalty: time to replace a block in the upper level + time to deliver the block to the processor
The Basics of Caches
• How do we know if a data item is in the cache?
• How do we find it?
Direct Mapped Cache
Direct-mapped cache: a cache structure in which each memory location is mapped to exactly one location in the cache.
Direct Mapped Cache
Which block should I search?
Cache index = (Block address) mod (Number of blocks in cache)
e.g., address 11001 = 25, and 25 mod 8 = 1, so it maps to cache index 1 in an 8-block cache.
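A minimal sketch (not from the slides) of the mapping rule; the address 25 and the 8-block cache size are taken from the example above:

#include <stdio.h>

/* Direct-mapped placement: index = (block address) mod (number of blocks).
 * When the number of blocks is a power of two, this is simply the
 * low-order bits of the block address. */
int main(void) {
    unsigned block_addr = 25;   /* binary 11001, as in the example */
    unsigned num_blocks = 8;    /* 8-block direct-mapped cache     */
    unsigned index = block_addr % num_blocks;
    printf("block %u maps to cache index %u\n", block_addr, index);   /* -> 1 */
    return 0;
}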
Direct Mapped Cache
• Because each cache location can contain the contents of a number of
different memory locations, how do we know whether the data in
the cache corresponds to a requested word?
• Tag: A field in a table used for a memory hierarchy that contains the
address information required to identify whether the associated block
in the hierarchy corresponds to a requested word.
Direct Mapped Cache
• How to recognize that a cache block does not have valid information?
• Add a valid bit to each cache entry; it indicates whether the entry contains a valid block (all valid bits start as 0).
Accessing Caches
[Worked example: a sequence of memory references is stepped through the cache one at a time; the outcomes are miss, miss, hit, hit, miss, miss, hit, miss, hit.]
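The reference addresses behind this sequence appear only in the figures, so the sketch below assumes the classic 8-block, one-word-per-block example with word addresses 22, 26, 22, 26, 16, 3, 16, 18, 16, which reproduces the same miss/hit pattern:

#include <stdio.h>
#include <stdbool.h>

#define NUM_BLOCKS 8            /* direct-mapped, one word per block (assumed) */

int main(void) {
    unsigned tag[NUM_BLOCKS];
    bool valid[NUM_BLOCKS] = { false };

    unsigned refs[] = { 22, 26, 22, 26, 16, 3, 16, 18, 16 };   /* assumed sequence */
    int n = sizeof refs / sizeof refs[0];

    for (int i = 0; i < n; i++) {
        unsigned addr  = refs[i];
        unsigned index = addr % NUM_BLOCKS;   /* low-order address bits */
        unsigned t     = addr / NUM_BLOCKS;   /* remaining upper bits   */

        bool hit = valid[index] && tag[index] == t;
        printf("addr %2u -> index %u : %s\n", addr, index, hit ? "HIT" : "MISS");

        valid[index] = true;                  /* on a miss, bring the block in */
        tag[index]   = t;
    }
    return 0;
}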
Definition of a Cache Block
• Cache Block: the cache data that has its own cache tag
• Example:
• 4-byte direct-mapped cache: block size = 1 byte
• Takes advantage of Temporal Locality: if a byte is referenced, it will tend to be referenced again soon.
• Does not take advantage of Spatial Locality: if a byte is referenced, its adjacent bytes will tend to be referenced soon.
• To take advantage of Spatial Locality: increase the block size
[Diagram: a 4-entry direct-mapped cache; each entry holds a Valid bit, a Cache Tag, and one byte of Cache Data (Byte 0 – Byte 3).]
Example: 1 KB Direct Mapped Cache with 32-Byte Blocks
• For a 2^N byte cache, the address is split as follows:
• The uppermost (32 – N) bits are always the Cache Tag
• The lowest M bits are the Byte Select (Block Size = 2^M) – bytes within the block
• For this 1 KB cache with 32-byte blocks (32 lines):
– Bits 31–10: Cache Tag (example: 0x50), stored as part of the cache “state”
– Bits 9–5: Cache Index (example: 0x01)
– Bits 4–0: Byte Offset (example: 0x00)
[Diagram: the cache array holds 32 lines, each with a Valid bit, a Cache Tag, and 32 bytes of Cache Data; line 0 holds Bytes 0–31, line 1 holds Bytes 32–63 (tag 0x50 in the example), ..., line 31 holds Bytes 992–1023.]
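A sketch (not part of the slides) of how these three fields could be extracted from a 32-bit byte address; the address below is chosen so that its fields match the example values (tag 0x50, index 0x01, offset 0x00):

#include <stdio.h>

/* 1 KB direct-mapped cache with 32-byte blocks:
 *   1 KB / 32 B = 32 lines  -> 5 index bits  (bits 9..5)
 *   32-byte block           -> 5 offset bits (bits 4..0)
 *   tag = remaining bits 31..10                           */
int main(void) {
    unsigned addr   = 0x00014020u;           /* hypothetical address */
    unsigned offset =  addr        & 0x1Fu;  /* bits 4..0   -> 0x00  */
    unsigned index  = (addr >> 5)  & 0x1Fu;  /* bits 9..5   -> 0x01  */
    unsigned tag    =  addr >> 10;           /* bits 31..10 -> 0x50  */
    printf("tag=0x%X index=0x%X offset=0x%X\n", tag, index, offset);
    return 0;
}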
Direct-Mapped Cache
[Diagram: the address is split into a Tag (t bits), an Index (k bits), and a Byte Offset (b bits); the cache holds 2^k lines, each with a Valid bit (V), a Tag, and a Data Block. The stored tag of the indexed line is compared (=) with the address tag to decide hit or miss.]
Block size tradeoff:
• Larger blocks exploit spatial locality, lowering the miss rate at first.
• With a fixed cache size, larger blocks mean fewer blocks, which compromises temporal locality and increases the miss penalty and, eventually, the miss rate.
• Average Access Time = Hit Time + Miss Rate × Miss Penalty
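A tiny worked example of the formula above, with hypothetical numbers (1-cycle hit time, 5% miss rate, 100-cycle miss penalty):

#include <stdio.h>

int main(void) {
    double hit_time = 1.0, miss_rate = 0.05, miss_penalty = 100.0;   /* hypothetical */
    double amat = hit_time + miss_rate * miss_penalty;               /* = 6 cycles   */
    printf("average access time = %.1f cycles\n", amat);
    return 0;
}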
Cache Size
• For the following situation:
• 32-bit byte addresses
• A direct-mapped cache
• The cache size is 2^n blocks, so n bits are used for the index
• The block size is 2^m words, so m bits are used for the word within the block, and two bits are used for the byte part of the address.
• The size of the tag field is 32 – (n + m + 2).
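A sketch applying this formula with hypothetical parameters; the total-size expression is the standard extension that also counts the data, tag, and valid bits per block:

#include <stdio.h>

int main(void) {
    int n = 10;   /* 2^10 = 1024 blocks (hypothetical)  */
    int m = 0;    /* 2^0  = 1 word (32 bits) per block  */

    int tag_bits = 32 - (n + m + 2);                    /* formula from the slide */
    /* Each block stores 2^m * 32 data bits, tag_bits of tag, and 1 valid bit. */
    long total_bits = (1L << n) * ((1L << m) * 32 + tag_bits + 1);

    printf("tag field  = %d bits\n", tag_bits);
    printf("total size = %ld bits (%ld Kbit)\n", total_bits, total_bits / 1024);
    return 0;
}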
Exercise
What happens if we replace a block that already has data?
Read and Write Policies
• Cache read is much easier to handle than cache write:
• Instruction cache is much easier to design than data cache
• Cache write:
• How do we keep data in the cache and memory consistent?
Read and Write Policies
• Two write options when the data block is in the cache:
• Write Through: write to cache and memory at the same time.
• Isn’t memory too slow for this?
• Write Back: write to cache only. Write the cache block to memory
when that cache block is being replaced on a cache miss.
• Need a “dirty” bit for each cache block
• Control can be complex
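A minimal sketch (not from the slides) contrasting the two policies on a write hit; the CacheLine structure, function names, and memory_write stub are made up for illustration:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {                 /* hypothetical one-line cache entry */
    bool     valid, dirty;
    uint32_t tag, data;
} CacheLine;

static void memory_write(uint32_t addr, uint32_t value) {   /* stand-in for DRAM */
    printf("DRAM[0x%08X] <- 0x%08X\n", (unsigned)addr, (unsigned)value);
}

/* Write through: update the cache and memory together (no dirty bit needed). */
static void write_hit_through(CacheLine *line, uint32_t addr, uint32_t value) {
    line->data = value;
    memory_write(addr, value);
}

/* Write back: update only the cache and mark the line dirty; memory is
 * updated later, when the line is replaced.                              */
static void write_hit_back(CacheLine *line, uint32_t value) {
    line->data  = value;
    line->dirty = true;
}

/* A write-back cache must flush a dirty line to memory before replacing it. */
static void evict(CacheLine *line, uint32_t addr) {
    if (line->valid && line->dirty)
        memory_write(addr, line->data);
    line->valid = line->dirty = false;
}

int main(void) {
    CacheLine line = { .valid = true };
    write_hit_through(&line, 0x1000, 0xAB);   /* memory updated immediately */
    write_hit_back(&line, 0xCD);              /* memory not touched yet     */
    evict(&line, 0x1000);                     /* dirty data flushed here    */
    return 0;
}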
Write Buffer for Write Through
[Diagram: the processor writes into the cache and into a write buffer; the write buffer then writes the data to DRAM, so the processor does not stall waiting for the slow memory write.]
Write Miss Policies
• Write allocate (also called fetch on write): the data at the missed-write location is loaded into the cache, followed by a write-hit operation. In this approach, write misses are like read misses.
• No-write allocate (also called write around): the data at the missed-write location is not loaded into the cache; the write goes directly to the lower level.
Write Miss Policies
• Both write-through and write-back policies can use either of these write-miss policies, but usually they are paired in this way: a write-back cache typically uses write allocate, and a write-through cache typically uses no-write allocate.
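A short sketch (hypothetical helpers and stubs, not from the slides) of how the two write-miss policies differ:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct { bool valid; uint32_t tag, data; } CacheLine;   /* hypothetical */

static uint32_t memory_read(uint32_t addr)           { return addr ^ 0xFFu; }  /* stub */
static void memory_write(uint32_t addr, uint32_t v)  { (void)addr; (void)v; }  /* stub */

/* Write allocate: fetch the block on a write miss (like a read miss),
 * then perform the write as a hit.                                     */
static void write_miss_allocate(CacheLine *line, uint32_t tag,
                                uint32_t addr, uint32_t value) {
    line->data  = memory_read(addr);
    line->tag   = tag;
    line->valid = true;
    line->data  = value;
}

/* No-write allocate (write around): bypass the cache and write memory directly. */
static void write_miss_no_allocate(uint32_t addr, uint32_t value) {
    memory_write(addr, value);
}

int main(void) {
    CacheLine line = {0};
    write_miss_allocate(&line, 0x50, 0x1000, 0xAB);
    write_miss_no_allocate(0x2000, 0xCD);
    printf("line: valid=%d tag=0x%X data=0x%X\n",
           line.valid, (unsigned)line.tag, (unsigned)line.data);
    return 0;
}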
Thank you