TM103 Chapter 6
Memory
In addition, you will learn about some types of cache mapping schemes,
and their influence on the performance of the whole computer system.
This lecture gives a thorough presentation of direct mapping, associative
mapping, and set-associative mapping techniques for cache.
Introduction
Types of Memory
The Memory Hierarchy
Cache Memory
RAM
- RAM is a read-write memory.
- RAM is the memory to which computer specifications refer; if you
buy a computer with 4GB of memory, it has 4GB of RAM.
- RAM is also the “main memory” we have continually referred to
throughout this text.
- RAM is used to store programs and data that the computer needs
when executing programs.
- RAM is volatile: it loses data once the power is turned off.
There are five basic types of ROM: ROM, PROM, EPROM,
EEPROM, and flash memory.
Flash memory
• Flash memory is essentially EEPROM
• It can be written or erased in blocks, removing the one-byte-at-a-time
limitation.
• Flash memory is faster than EEPROM.
November 27, 2023 TM103 - Arab Open University 19
The Memory Hierarchy
Locality of Reference
Locality of Reference is the tendency of programs to access data and instructions
that are close, in memory address or in time, to those accessed recently.
“Once the data is located, then the data, and a number of its nearby data
elements are fetched into cache memory”. This is done in the hope that the
extra data will be referenced in the near future, which in most cases happens,
since programs tend to exhibit locality.
The advantage is that after a miss, there is a high probability of achieving
several hits in cache on the newly retrieved block.
Example: In the absence of branches, the PC in MARIE is incremented by
one after each instruction fetch.
The locality principle provides the opportunity for a system to use a small
amount of very fast memory to effectively accelerate the majority of
memory accesses.
Typically, only a small portion of the entire memory space is being
accessed at any given time, and the system copies values from a slower
memory to a smaller but faster memory that resides higher in the hierarchy.
This results in a memory system that can store a large amount of
information in a large but low-cost memory, yet provide nearly the same
access speeds that would result from using the fast but expensive memory.
Cache Memory
• Introduction
• Cache Mapping Schemes
- Direct mapping
- Fully associative
- Set Associative
How, then, does the CPU locate data when it has been copied into cache?
The CPU uses a specific mapping scheme.
A mapping scheme “converts” a main memory address into a cache
location by giving special significance to the bits in the main memory address.
We first divide the bits into distinct groups we call fields.
Depending on the mapping scheme, we may have two or three fields.
It determines where the data is placed when it is originally copied into
cache.
It also provides a method for the CPU to find previously copied data when
searching cache.
Y = X mod N
where X is the main memory block number, N is the number of blocks in cache,
and Y (the remainder of the division X/N) is the cache block to which block X maps.
Example: A cache memory contains 10 blocks. How will the following main
memory blocks: 0, 6, 10, 15, 25, 32 be mapped to the cache?
Here N = 10 (cache blocks 0 through 9)
• Block 0 will be placed in cache block: 0 mod 10 = 0
• Block 6 will be placed in cache block: 6 mod 10 = 6
• Block 10 will be placed in cache block: 10 mod 10 = 0
• Block 15 will be placed in cache block: 15 mod 10 = 5
• Block 25 will be placed in cache block: 25 mod 10 = 5
• Block 32 will be placed in cache block: 32 mod 10 = 2
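The mapping above can be reproduced with a short Python sketch (the helper function `cache_block` is a name introduced here for illustration):

```python
def cache_block(x, n):
    """Direct mapping: main memory block x maps to cache block x mod n."""
    return x % n

N = 10  # cache blocks 0..9
for x in [0, 6, 10, 15, 25, 32]:
    print(f"Block {x} -> cache block {cache_block(x, N)}")
```

Running it lists exactly the placements worked out above, including blocks 15 and 25 both landing in cache block 5.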
Cache Mapping Schemes – Direct Mapped Cache
In the last example, blocks 5, 15, 25, 35, … are all placed in Block 5 in cache!
How does the CPU know which block actually resides in cache block 5 at any
given time?
A tag identifies each block that is copied to cache.
This tag is stored with the block, inside the cache.
A valid bit is also added to each cache block to indicate whether the block holds valid data.
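As an illustrative sketch (the class and its names are not from the text, and real caches store data alongside the tag), a direct-mapped lookup using a tag and a valid bit might look like this in Python:

```python
class DirectMappedCache:
    """Toy direct-mapped cache: each slot stores a valid bit and a tag."""
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.valid = [False] * num_blocks   # valid bit per cache block
        self.tags = [None] * num_blocks     # tag stored with each block

    def access(self, block_number):
        """Return True on a hit; on a miss, load the block and return False."""
        index = block_number % self.num_blocks   # direct mapping: X mod N
        tag = block_number // self.num_blocks    # identifies which block resides here
        if self.valid[index] and self.tags[index] == tag:
            return True
        self.valid[index] = True
        self.tags[index] = tag
        return False
```

With 10 cache blocks, accessing main memory block 15 misses, a second access hits, and accessing block 25 afterwards evicts it, since both map to slot 5.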
Cache Mapping Schemes – Direct Mapped Cache
To perform direct mapping, the binary main memory address is partitioned into
three fields:
1) Offset (Word) field
• Uniquely identifies an address within a specific block (a unique word)
• The number of words/bytes in each block dictates the number of bits in the
offset field
Example: If a block of memory contains 8 = 2^3 words, we need 3 bits in the offset
field to identify (address) one of these 8 words in the block.
2) Block field
• It must select a unique block of cache
• The number of blocks in cache dictates the number of bits in the block field
Example: If a cache contains 16 = 2^4 blocks, we need 4 bits in the block field to identify (address) one of these 16 blocks.
3) Tag field
• Whatever is left over!
• Do not forget that: when a block of memory is copied to cache, this tag is
stored with the block and uniquely identifies this block.
Cache Mapping Schemes – Direct Mapped Cache
Example 1: Assume a byte-addressable memory consists of 2^14 bytes, cache has 16
blocks, and each block has 8 bytes. How many bits do we have in the tag, block and
offset fields?
• The number of memory blocks is: 2^14/8 = 2^14/2^3 = 2^11 blocks
- Each main memory address requires 14 bits. These are divided into three
fields as follows:
- We have 8 = 2^3 words in each block, so we need 3 bits to identify one of
these words: the rightmost 3 bits reflect the offset field.
- We have 16 = 2^4 blocks in cache. We need 4 bits to select a specific block
in cache, so the block field consists of the middle 4 bits.
- The remaining 7 bits make up the tag field (14 – (4 + 3)).
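The field division in Example 1 can be sketched in Python (the address layout follows the example; `split_address` and the sample address are introduced here for illustration):

```python
# Field widths from Example 1: 14-bit addresses, 16 cache blocks, 8-byte blocks
offset_bits = 3                            # 8 = 2**3 bytes per block
block_bits = 4                             # 16 = 2**4 cache blocks
tag_bits = 14 - block_bits - offset_bits   # the 7 leftover bits form the tag

def split_address(addr):
    """Split a 14-bit main memory address into (tag, block, offset)."""
    offset = addr & 0b111            # rightmost 3 bits
    block = (addr >> 3) & 0b1111     # middle 4 bits
    tag = addr >> 7                  # remaining 7 bits
    return tag, block, offset
```

For instance, `split_address(6844)` returns tag 53, cache block 7, and offset 4.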
Example 2: Assume a byte-addressable memory consists of 2^20 bytes, cache has 32
blocks, and each block has 16 bytes. How many memory blocks are there, how is the
main memory address divided, and to which cache block does address 0DB63 map?
Answer:
a. 2^20/2^4 = 2^16 blocks
b. 20-bit addresses with 11 bits in the tag field, 5 in the block field, and 4 in the word
field
c. 0DB63 = 00001101101 10110 0011, which is Block 22
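Part (c) can be checked with a few lines of Python, using the field widths from part (b):

```python
addr = 0x0DB63                  # the 20-bit address from part (c)
offset = addr & 0xF             # low 4 bits: word field
block = (addr >> 4) & 0b11111   # next 5 bits: block field
tag = addr >> 9                 # remaining 11 bits: tag field
print(block)                    # prints 22
```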
Summary:
• Direct mapped cache maps main memory blocks in a
modular fashion to cache blocks. The mapping depends
on:
- The number of bits in the main memory address (how many
addresses exist in main memory)
- The number of blocks in cache (which determines the size of
the block field)
- How many addresses (either bytes or words) are in a block (which
determines the size of the offset field)
Summary:
• Direct mapped cache is not as expensive as other caches
because the mapping scheme does not require any
searching.
- Each main memory block has a specific location to which it maps
in cache.
- A main memory address is converted to a cache address.
- The block field identifies one unique cache block.
- The CPU knows “a priori” the cache block number in which it may
find needed data.
Figure 6.4 shows the general format of the fully associative mapping
scheme.
Answer (fully associative mapping, for the memory of Example 1):
• We have 8 = 2^3 words in each block. The word (offset) field
consists of 3 bits.
• The remaining 14 - 3 = 11 bits make up the tag field.
To wrap things up, let us have an example that covers the three schemes.
Example:
Suppose a byte-addressable memory contains 1MB and the
cache consists of 32 blocks, where each block contains 16
bytes.
Using the schemes below, specify their different fields, and
determine where the main memory address 326A0 (base 16) maps to
in cache by specifying either the cache block or cache set:
• A direct mapping scheme is used.
• A fully associative mapping scheme is used.
• A 4-way set associative mapping scheme is used.
Fully associative:
- We have 16 = 2^4 words in each block, so we need 4 bits to
identify one of these words: the rightmost 4 bits reflect the
offset field.
- The remaining 20 - 4 = 16 bits make up the tag field.
4-way set associative:
- There are 32/4 = 8 = 2^3 sets in cache, so the set field is 3 bits and the
tag field is 20 - (3 + 4) = 13 bits.
- The main memory address 326A0 maps to cache set 010 (binary) = 2 (decimal). We
cannot know which block in the set is addressed; the set still needs to
be searched before the desired data can be found (by comparing the
tag of the address to all tags in cache set 2).
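The set computation in this last answer can be checked with a short Python sketch:

```python
addr = 0x326A0                 # 20-bit main memory address from the example
offset = addr & 0xF            # 4-bit offset field (16 bytes per block)
# 4-way set associative: 32 blocks / 4 blocks per set = 8 sets -> 3-bit set field
set_index = (addr >> 4) & 0b111
tag = addr >> 7                # the remaining 13 bits form the tag
print(set_index)               # prints 2
```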
End of chapter 6