
Chapter 4

Cache Memory

Computer Architecture
Memory Connection to CPU

• Assignment solution:
Assume that a computer system needs 512 bytes of RAM and 512 bytes of ROM. The available capacity of a RAM chip is 128 x 8 and the available capacity of a ROM chip is 512 x 8. Each RAM and ROM chip has 2 chip select lines (one active high and the other active low).
Solution
▪ Required RAM module: 512 x 8-bit
▪ Available RAM chips: 128 x 8-bit
▪ Required RAM chips: 512 / 128 = 4 chips
▪ Required ROM module: 512 x 8-bit
▪ Available ROM chips: 512 x 8-bit
▪ Required ROM chips: 512 / 512 = 1 chip (see the sketch below)
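
To make the chip-count arithmetic explicit, here is a minimal Python sketch; the chips_needed helper is a hypothetical name, not part of the assignment.

def chips_needed(required_words, chip_words):
    # Each chip already matches the full 8-bit data width, so we only
    # divide the required capacity by the per-chip capacity, rounding up.
    return -(-required_words // chip_words)  # ceiling division

print(chips_needed(512, 128))  # RAM: 4 chips of 128 x 8
print(chips_needed(512, 512))  # ROM: 1 chip of 512 x 8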
Outline

▪ Introduction
▪ Memory Types
▪ The Memory Hierarchy
▪ Cache Memory (Mapping function)
▪ Cache Replacement policies
▪ References
4.1 Introduction

• Memory lies at the heart of the stored-program computer.
• In the previous chapter, we studied the components from which memory is built.
• In this chapter, we focus on memory organization.
• A clear understanding of these ideas is essential for the analysis of system performance.
4.2 Types of Memory

• There are two kinds of main memory: Random Access Memory (RAM) and Read Only Memory (ROM).
• There are two types of RAM: dynamic RAM (DRAM) and static RAM (SRAM).
• Dynamic RAM consists of capacitors that slowly leak their charge over time. Thus, they must be refreshed every few milliseconds to prevent data loss.
• DRAM is “cheap” memory owing to its simple design.
4.2 Types of Memory

• SRAM consists of circuits similar to the D flip-flop.
• SRAM is very fast memory, and it doesn’t need to be refreshed like DRAM does. It is used to build cache memory, which we will discuss.
• ROM also does not need to be refreshed.
• ROM is used to store permanent, or semi-permanent, data that persists even while the system is turned off.
4.3 The Memory Hierarchy

• Generally speaking, faster memory is more expensive than slower memory.
• To provide the best performance at the lowest cost, memory is organized in a hierarchical fashion.
• Small, fast storage elements are kept in the CPU; larger, slower main memory is accessed through the data bus.
• Larger, (almost) permanent storage in the form of disk and tape drives is still further from the CPU.
4.3 The Memory Hierarchy

• To access a particular piece of data, the CPU first sends a request to its nearest memory, usually cache memory.
• If the data is not in cache, then main memory is queried. If the data is not in main memory, then the request goes to disk.
• Once the data is located, the data and a number of its nearby data elements are fetched into cache memory.
4.3 The Memory Hierarchy

• This leads us to some definitions.
– A hit is when data is found at a given memory level.
– A miss is when it is not found.
– The hit rate is the percentage of time data is found at a given memory level.
– The miss rate is the percentage of time it is not.
– Miss rate = 1 - hit rate.
– The hit time is the time required to access data at a given memory level.
4.4 Cache Memory

• The purpose of cache memory is to speed up accesses by storing recently used data closer to the CPU, instead of storing it in main memory.
• Although cache is much smaller than main memory, its access time is a fraction of that of main memory.
• Unlike main memory, which is accessed by address, cache is typically accessed by content; hence, it is often called content addressable memory.
• Because of this, a single large cache memory isn’t always desirable; it takes longer to search.
4.4 Cache Memory

• The “content” that is addressed in content addressable cache memory is a subset of the bits of a main memory address called a field.
• The fields into which a memory address is divided provide a many-to-one mapping between larger main memory and the smaller cache memory, where many blocks of main memory map to a single block of cache.
• A tag field in the cache block distinguishes one cached memory block from another.
4.4 Cache Memory

• The simplest cache mapping scheme is direct mapped cache.
• In a direct mapped cache consisting of N blocks of cache, block X of main memory maps to cache block Y = X mod N.
• Thus, if we have 10 blocks of cache, cache block 7 may hold blocks 7, 17, 27, 37, . . . of main memory, as sketched below.
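
A minimal Python sketch of the placement rule Y = X mod N, using the 10-block example above (the variable names are illustrative, not from the text):

N = 10  # number of cache blocks
for X in (7, 17, 27, 37):
    print(f"main memory block {X} -> cache block {X % N}")
# Every iteration prints cache block 7, matching the example.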
Direct Mapping Scheme

• The diagram below is a schematic of what cache looks like.
• Block 0 contains multiple words from main memory, identified with the tag 00000000. Block 1 contains words identified with the tag 11110101.
Direct Mapping Scheme

• The size of each field into which a memory address is divided depends on the size of the cache.
• Suppose our memory consists of 2^14 words, cache has 16 = 2^4 blocks, and each block holds 8 = 2^3 words.
– Thus, memory is divided into 2^14 / 2^3 = 2^11 blocks.
• For our field sizes, we know we need 4 bits for the block, 3 bits for the word, and the tag is what’s left over: 14 - 4 - 3 = 7 bits.
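
Assuming the sizes quoted above, the field widths can be checked with a short Python sketch (the variable names are my own):

import math

address_bits = 14      # memory of 2^14 words
cache_blocks = 16      # 2^4 cache blocks
words_per_block = 8    # 2^3 words per block

word_bits = int(math.log2(words_per_block))        # 3
block_bits = int(math.log2(cache_blocks))          # 4
tag_bits = address_bits - block_bits - word_bits   # 7
print(tag_bits, block_bits, word_bits)             # 7 4 3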
Direct Mapping Scheme

• As an example, suppose a program generates the hexadecimal address 1AA. In 14-bit binary, this number is: 00000110101010.
• The first 7 bits of this address go in the tag field, the next 4 bits go in the block field, and the final 3 bits indicate the word within the block, as the sketch below shows.
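
A sketch of that decomposition in Python, using standard bit operations (the field names follow the 7/4/3 split above):

addr = 0x1AA                  # 00000110101010 in 14-bit binary
word = addr & 0b111           # low 3 bits: 010
block = (addr >> 3) & 0b1111  # next 4 bits: 0101
tag = addr >> 7               # remaining 7 bits: 0000011
print(f"tag={tag:07b} block={block:04b} word={word:03b}")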
Fully Associative Scheme

• Instead of placing memory blocks in specific cache locations based on memory address, we could allow a block to go anywhere in cache.
• In this way, cache would have to fill up before any blocks are evicted.
• This is how fully associative cache works.
• A memory address is partitioned into only two fields: the tag and the word.
Fully Associative Scheme

• Suppose we have 14-bit memory addresses and a cache with 16 blocks, each block of size 8. The field format of a memory reference is an 11-bit tag followed by a 3-bit word field (14 - 3 = 11), as sketched below.
• When the cache is searched, all tags are searched in parallel to retrieve the data quickly.
• This requires special, costly hardware.
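
For comparison with the direct-mapped split, a one-step Python sketch of the fully associative decomposition (assuming the 11-bit tag and 3-bit word derived above):

addr = 0x1AA
word = addr & 0b111  # low 3 bits select the word
tag = addr >> 3      # remaining 11 bits form the tag
print(f"tag={tag:011b} word={word:03b}")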
Set Associative Scheme

• Set associative cache combines the ideas of direct mapped cache and fully associative cache.
• An N-way set associative cache mapping is like direct mapped cache in that a memory reference maps to a particular location in cache.
• Unlike direct mapped cache, a memory reference maps to a set of several cache blocks, similar to the way in which fully associative cache works.
• Instead of mapping anywhere in the entire cache, a memory reference can map only to a subset of the cache slots.
Set Associative Scheme

• The number of cache blocks per set in set associative cache varies according to overall system design.
• For example, a 2-way set associative cache can be conceptualized as shown in the schematic below.
• Each set contains two different memory blocks.
Set Associative Scheme

• In set associative cache mapping, a memory reference is divided into three fields: tag, set, and word, as shown below.
• As with direct-mapped cache, the word field chooses the word within the cache block, and the tag field uniquely identifies the memory address.
• The set field determines the set to which the memory block maps.
Set Associative Scheme

• Suppose we have a main memory of 2^14 bytes.
• This memory is mapped to a 2-way set associative cache having 16 blocks, where each block contains 8 words.
• Since this is a 2-way cache, each set consists of 2 blocks, and there are 8 sets.
• Thus, we need 3 bits for the set, 3 bits for the word, giving 8 leftover bits for the tag, as the sketch below shows.
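
A sketch of the 8/3/3 set-associative split in Python (illustrative names again; the address is the earlier example, reused for continuity):

addr = 0x1AA
word = addr & 0b111              # 3 word bits
set_index = (addr >> 3) & 0b111  # 3 set bits: one of 8 sets
tag = addr >> 6                  # remaining 8 tag bits
print(f"tag={tag:08b} set={set_index:03b} word={word:03b}")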
Cache Replacement Policies

• With fully associative and set associative cache, a replacement policy is invoked when it becomes necessary to evict a block from cache.
• An optimal replacement policy would be able to look into the future to see which blocks won’t be needed for the longest period of time.
• Although it is impossible to implement an optimal replacement algorithm, it is instructive to use it as a benchmark for assessing the efficiency of any other scheme we come up with.
Cache Replacement Policies

• The replacement policy that we choose depends upon the locality that we are trying to optimize; usually, we are interested in temporal locality.
• A Least Recently Used (LRU) algorithm keeps track of the last time that a block was accessed and evicts the block that has been unused for the longest period of time.
• The disadvantage of this approach is its complexity: LRU has to maintain an access history for each block, which ultimately slows down the cache.
Cache Replacement Policies

• First-In, First-Out (FIFO) is a popular cache replacement policy.
• In FIFO, the block that has been in the cache the longest would be selected to be removed from cache memory, regardless of when it was last used.
• A random replacement policy picks a block at random and replaces it with a new block. A small LRU sketch follows below.
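
A minimal LRU sketch in Python, using an OrderedDict to keep blocks in access order (this is one possible implementation, not the text’s own):

from collections import OrderedDict

class LRUCache:
    def __init__(self, num_blocks):
        self.blocks = OrderedDict()   # tags, oldest access first
        self.capacity = num_blocks

    def access(self, tag):
        if tag in self.blocks:
            self.blocks.move_to_end(tag)      # hit: now most recently used
            return "hit"
        if len(self.blocks) == self.capacity:
            self.blocks.popitem(last=False)   # evict least recently used
        self.blocks[tag] = True               # fill the freed block
        return "miss"

cache = LRUCache(2)
print([cache.access(t) for t in (1, 2, 1, 3, 2)])
# ['miss', 'miss', 'hit', 'miss', 'miss']; a FIFO policy would instead
# evict block 1 on the access to 3, so the final access to 2 would hit.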
Cache Dirty Blocks

• Cache replacement policies must also take into account dirty blocks, those blocks that have been updated while they were in the cache.
• Dirty blocks must be written back to memory. A write policy determines how this will be done.
• There are two types of write policies: write through and write back.
• Write through updates cache and main memory simultaneously on every write.
Cache Dirty Blocks

• Write back (also called copyback) updates memory only when the block is selected for replacement.
• The disadvantage of write through is that memory must be updated with each cache write, which slows down the access time on updates.
• This slowdown is usually negligible, because the majority of accesses tend to be reads, not writes.
• The advantage of write back is that memory traffic is minimized, but its disadvantage is that memory does not always agree with the value in cache, causing problems in systems with many concurrent users. The two policies are contrasted in the sketch below.
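
A small Python sketch contrasting the two write policies with a dirty bit (the data structures are hypothetical, chosen only for illustration):

def write(line, memory, addr, value, policy="write_back"):
    line["data"] = value
    if policy == "write_through":
        memory[addr] = value     # memory updated on every write
    else:
        line["dirty"] = True     # deferred until the block is evicted

def evict(line, memory, addr):
    if line.get("dirty"):
        memory[addr] = line["data"]  # copyback of the dirty block
        line["dirty"] = False

memory = {0: 0}
line = {"data": 0, "dirty": False}
write(line, memory, 0, 42)  # write back: memory[0] is still 0
evict(line, memory, 0)      # now memory[0] == 42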
References

[1] Linda Null, The Essentials of Computer Organization and Architecture, Pearson Education.
[2] William Stallings, Computer Organization and Architecture: Designing for Performance, 10th edition, Pearson Education.
[3] M. Morris Mano, Computer System Architecture, 3rd edition, Pearson/PHI, India.
