Cache Memory Mapping

This document discusses cache memory and its organization. It summarizes that: 1) Programs tend to access the same localized areas of memory repeatedly over short periods due to properties like loops and subroutines, known as locality of reference. 2) Cache memory is a small, fast memory placed between the CPU and main memory to take advantage of locality of reference. Frequently accessed instructions and data are stored in cache for faster access times. 3) When the CPU requests data from memory, the cache is checked first. If found, the data is read from cache. If not found, a block of data containing the requested word is transferred from main memory to cache.


464 CHAPTER TWELVE Memory Organization

If unwanted words have to be deleted and new words inserted at the same time, there is a need for a special register to distinguish between active and inactive words. This register, sometimes called a tag register, would have as many bits as there are words in the memory. For every active word stored in memory, the corresponding bit in the tag register is set to 1. A word is deleted from memory by clearing its tag bit to 0. Words are stored in memory by scanning the tag register until the first 0 bit is encountered. This gives the first available inactive word and a position for writing a new word. After the new word is stored in memory it is made active by setting its tag bit to 1. An unwanted word when deleted from memory can be cleared to all 0's if this value is used to specify an empty location. Moreover, deleted words, which have a tag bit of 0, must be masked (together with the K bits) with the argument word so that only active words are compared.
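The insert and delete mechanics described above can be sketched in a few lines. This is our own illustrative sketch (the class and method names are ours, not the book's): one tag bit per memory word, with insertion scanning for the first 0 bit.

```python
# Sketch of the tag-register scheme described above (names are ours).
# One tag bit per memory word: 1 = active, 0 = inactive/deleted.

class TaggedMemory:
    def __init__(self, size):
        self.words = [0] * size   # memory contents
        self.tags = [0] * size    # tag register, one bit per word

    def insert(self, word):
        """Scan the tag register until the first 0 bit; store there."""
        for i, tag in enumerate(self.tags):
            if tag == 0:
                self.words[i] = word
                self.tags[i] = 1  # make the new word active
                return i
        raise MemoryError("no inactive word available")

    def delete(self, i):
        """Delete a word by clearing its tag bit; clearing the word
        itself lets all-0's denote an empty location."""
        self.tags[i] = 0
        self.words[i] = 0

mem = TaggedMemory(4)
a = mem.insert(0o3450)   # first 0 bit is at location 0
b = mem.insert(0o6710)   # next inactive word is location 1
mem.delete(a)            # location 0 becomes inactive again
c = mem.insert(0o1234)   # scan finds location 0 first and reuses it
```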

12-5 Cache Memory

Analysis of a large number of typical programs has shown that the references to memory at any given interval of time tend to be confined within a few localized areas in memory. This phenomenon is known as the property of locality of reference. The reason for this property may be understood by considering that a typical computer program flows in a straight-line fashion with program loops and subroutine calls encountered frequently. When a program loop is executed, the CPU repeatedly refers to the set of instructions in memory that constitute the loop. Every time a given subroutine is called, its set of instructions is fetched from memory. Thus loops and subroutines tend to localize the references to memory for fetching instructions. To a lesser degree, memory references to data also tend to be localized. Table-lookup procedures repeatedly refer to that portion of memory where the table is stored. Iterative procedures refer to common memory locations, and arrays of numbers are confined within a local portion of memory. The result of all these observations is the locality of reference property, which states that over a short interval of time, the addresses generated by a typical program refer to a few localized areas of memory repeatedly, while the remainder of memory is accessed relatively infrequently.

If the active portions of the program and data are placed in a fast small memory, the average memory access time can be reduced, thus reducing the total execution time of the program. Such a fast small memory is referred to as a cache memory. It is placed between the CPU and main memory as illustrated in Fig. 12-1. The cache memory access time is less than the access time of main memory by a factor of 5 to 10. The cache is the fastest component in the memory hierarchy and approaches the speed of CPU components.


The fundamental idea of cache organization is that by keeping the most frequently accessed instructions and data in the fast cache memory, the average memory access time will approach the access time of the cache. Although the cache is only a small fraction of the size of main memory, a large fraction of memory requests will be found in the fast cache memory because of the locality of reference property of programs.

The basic operation of the cache is as follows. When the CPU needs to access memory, the cache is examined. If the word is found in the cache, it is read from the fast memory. If the word addressed by the CPU is not found in the cache, the main memory is accessed to read the word. A block of words containing the one just accessed is then transferred from main memory to cache memory. The block size may vary from one word (the one just accessed) to about 16 words adjacent to the one just accessed. In this manner, some data are transferred to cache so that future references to memory find the required words in the fast cache memory.
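The read sequence just described — examine the cache, and on a miss transfer the block surrounding the requested word — can be sketched as follows. The block size and the dictionaries standing in for the two memories are illustrative choices of ours, not the book's:

```python
BLOCK_SIZE = 4  # illustrative; the text says 1 to about 16 words

main_memory = {addr: addr * 2 for addr in range(64)}  # dummy contents
cache = {}

def read(addr):
    """Return (word, hit). On a miss, transfer the whole block
    containing addr from main memory into the cache."""
    if addr in cache:
        return cache[addr], True          # hit: read from fast memory
    base = addr - (addr % BLOCK_SIZE)     # start of the enclosing block
    for a in range(base, base + BLOCK_SIZE):
        cache[a] = main_memory[a]         # block transfer
    return cache[addr], False

w1, hit1 = read(10)   # miss: block 8..11 is brought into the cache
w2, hit2 = read(11)   # hit: the neighbor came along with the block
```

The second read hits precisely because the first miss pulled in adjacent words — the mechanism by which block transfer exploits locality.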
The performance of cache memory is frequently measured in terms of a quantity called hit ratio. When the CPU refers to memory and finds the word in cache, it is said to produce a hit. If the word is not found in cache, it is in main memory and it counts as a miss. The ratio of the number of hits divided by the total CPU references to memory (hits plus misses) is the hit ratio. The hit ratio is best measured experimentally by running representative programs in the computer and measuring the number of hits and misses during a given interval of time. Hit ratios of 0.9 and higher have been reported. This high ratio verifies the validity of the locality of reference property.
The average memory access time of a computer system can be improved considerably by use of a cache. If the hit ratio is high enough so that most of the time the CPU accesses the cache instead of main memory, the average access time is closer to the access time of the fast cache memory. For example, a computer with cache access time of 100 ns, a main memory access time of 1000 ns, and a hit ratio of 0.9 produces an average access time of 200 ns. This is a considerable improvement over a similar computer without a cache memory, whose access time is 1000 ns.
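The 200 ns figure follows if a miss costs the cache check plus the main-memory access. A quick check of the arithmetic (the formula below is our reading of the example, not stated explicitly in the text):

```python
def avg_access_time(hit_ratio, t_cache, t_main):
    """Average access time when every reference first probes the cache;
    a miss then pays the main-memory access on top of the cache check."""
    return hit_ratio * t_cache + (1 - hit_ratio) * (t_cache + t_main)

t = avg_access_time(0.9, 100, 1000)  # the example in the text: 200.0 ns
```

With a hit ratio of 0.9: 0.9 x 100 + 0.1 x (100 + 1000) = 90 + 110 = 200 ns, matching the text.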
The basic characteristic of cache memory is its fast access time. Therefore, very little or no time must be wasted when searching for words in the cache. The transformation of data from main memory to cache memory is referred to as a mapping process. Three types of mapping procedures are of practical interest when considering the organization of cache memory:

1. Associative mapping
2. Direct mapping
3. Set-associative mapping

To help in the discussion of these three mapping procedures we will use a specific example of a memory organization as shown in Fig. 12-10. The main memory can store 32K words of 12 bits each. The cache is capable of storing 512 of these words at any given time.

[Figure: the CPU connected to a main memory of 32K x 12 and a cache memory of 512 x 12.]

Figure 12-10 Example of cache memory

For every word stored in cache, there is a duplicate copy in main memory. The CPU communicates with both memories. It first sends a 15-bit address to cache. If there is a hit, the CPU accepts the 12-bit data from cache. If there is a miss, the CPU reads the word from main memory and the word is then transferred to cache.

Associative Mapping
The fastest and most flexible cache organization uses an associative memory. This organization is illustrated in Fig. 12-11. The associative memory stores both the address and content (data) of the memory word. This permits any location in cache to store any word from main memory. The diagram shows three words presently stored in the cache. The address value of 15 bits is shown as a five-digit octal number and its corresponding 12-bit word is shown as a four-digit octal number. A CPU address of 15 bits is placed in the argument register and the associative memory is searched for a matching address.

CPU address (15 bits) -> Argument register

Address   Data
01000     3450
02777     6710
22345     1234

Figure 12-11 Associative mapping cache (all numbers in octal)
If the address is found, the corresponding 12-bit data is read and sent to the CPU. If no match occurs, the main memory is accessed for the word. The address-data pair is then transferred to the associative cache memory. If the cache is full, an address-data pair must be displaced to make room for a pair that is needed and not presently in the cache. The decision as to which pair is replaced is determined from the replacement algorithm that the designer chooses for the cache. A simple procedure is to replace cells of the cache in round-robin order whenever a new word is requested from main memory. This constitutes a first-in first-out (FIFO) replacement policy.
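An associative cache with round-robin (FIFO) replacement can be sketched with an ordered dictionary — any main-memory word may occupy any cache cell, and the oldest pair is displaced when the cache is full. The class name and capacity are our illustrative choices:

```python
from collections import OrderedDict

class AssociativeCache:
    """Sketch of associative mapping: the cache stores (address, data)
    pairs, so any main-memory word can sit in any cache cell.
    Replacement is FIFO, as in the round-robin scheme above."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.cells = OrderedDict()  # address -> data, in insertion order

    def read(self, addr, main_memory):
        if addr in self.cells:
            return self.cells[addr], True    # associative match: hit
        data = main_memory[addr]             # miss: go to main memory
        if len(self.cells) >= self.capacity:
            self.cells.popitem(last=False)   # displace oldest pair (FIFO)
        self.cells[addr] = data
        return data, False

main = {0o01000: 0o3450, 0o02777: 0o6710, 0o22345: 0o1234, 0o05000: 0o0001}
c = AssociativeCache(capacity=3)
for a in (0o01000, 0o02777, 0o22345):
    c.read(a, main)          # fill the three cells, as in Fig. 12-11
c.read(0o05000, main)        # cache full: 01000, the oldest pair, is displaced
```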
Direct Mapping
Associative memories are expensive compared to random-access memories because of the added logic associated with each cell. The possibility of using a random-access memory for the cache is investigated in Fig. 12-12. The CPU address of 15 bits is divided into two fields. The nine least significant bits constitute the index field and the remaining six bits form the tag field. The figure shows that main memory needs an address that includes both the tag and the index bits. The number of bits in the index field is equal to the number of address bits required to access the cache memory.

In the general case, there are 2^k words in cache memory and 2^n words in main memory. The n-bit memory address is divided into two fields: k bits for the index field and n - k bits for the tag field. The direct mapping cache organization uses the n-bit address to access the main memory and the k-bit index to access the cache. The internal organization of the words in the cache memory is as shown in Fig. 12-13(b). Each word in cache consists of the data word and its associated tag. When a new word is first brought into the cache, the tag bits are stored alongside the data bits.

[Figure: the 15-bit main memory address (octal 00000 to 77777, memory 32K x 12, data 12 bits) is split into a 6-bit tag and a 9-bit index; the 9 index bits address the 512 x 12 cache (address 9 bits, data 12 bits).]

Figure 12-12 Addressing relationships between main and cache memories.

(a) Main memory:

Memory address   Memory data
00000            1220
00777            2340
01000            3450
01777            4560
02000            5670
02777            6710

(b) Cache memory:

Index address   Tag   Data
000             00    1220
777             02    6710

Figure 12-13 Direct mapping cache organization.
When the CPU generates a memory request, the index field is used for the address to access the cache. The tag field of the CPU address is compared with the tag in the word read from the cache. If the two tags match, there is a hit and the desired data word is in cache. If there is no match, there is a miss and the required word is read from main memory. It is then stored in the cache together with the new tag, replacing the previous value. The disadvantage of direct mapping is that the hit ratio can drop considerably if two or more words whose addresses have the same index but different tags are accessed repeatedly. However, this possibility is minimized by the fact that such words are relatively far apart in the address range (multiples of 512 locations in this example).

To see how the direct-mapping organization operates, consider the numerical example shown in Fig. 12-13. The word at address zero is presently stored in the cache (index = 000, tag = 00, data = 1220). Suppose that the CPU now wants to access the word at address 02000. The index address is 000, so it is used to access the cache. The two tags are then compared. The cache tag is 00 but the address tag is 02, which does not produce a match. Therefore, the main memory is accessed and the data word 5670 is transferred to the CPU. The cache word at index address 000 is then replaced with a tag of 02 and data of 5670.
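This numerical walk-through can be reproduced in a few lines. The sketch below is ours (the helper names are not from the book); it splits the 15-bit address into a 9-bit index and 6-bit tag as in Fig. 12-12:

```python
INDEX_BITS = 9                    # 512-word cache
INDEX_MASK = (1 << INDEX_BITS) - 1

def split(addr):
    """Split a 15-bit address into (tag, index) per Fig. 12-12."""
    return addr >> INDEX_BITS, addr & INDEX_MASK

# cache[index] = (tag, data); start with the word from address 00000
cache = {0o000: (0o00, 0o1220)}
main_memory = {0o02000: 0o5670}

def read(addr):
    tag, index = split(addr)
    entry = cache.get(index)
    if entry is not None and entry[0] == tag:
        return entry[1], True        # tags match: hit
    data = main_memory[addr]         # miss: fetch from main memory
    cache[index] = (tag, data)       # replace previous tag and data
    return data, False

data, hit = read(0o02000)  # index 000, address tag 02 vs cached tag 00: miss
```

Address 02000 maps to index 000 but carries tag 02, so the cached (00, 1220) entry is overwritten — exactly the replacement described above.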
The direct mapping example just described uses a block size of one word. The same organization but using a block size of 8 words is shown in Fig. 12-14.
SECTION 12-5 Cache Memory 469

[Figure: the 15-bit address is divided into a 6-bit tag, a 6-bit block field, and a 3-bit word field. Cache index addresses 000-007 form block 0 (e.g., data 3450 at index 000 and 6578 at index 007, both under tag 01), 010-017 form block 1, and so on up to 770-777 for block 63 (tag 02, with data 6710 at index 777).]

Figure 12-14 Direct mapping cache with block size of 8 words.

The index field is now divided into two parts: the block field and the word field. In a 512-word cache there are 64 blocks of 8 words each, since 64 x 8 = 512. The block number is specified with a 6-bit field and the word within the block is specified with a 3-bit field. The tag field stored within the cache is common to all eight words of the same block. Every time a miss occurs, an entire block of eight words must be transferred from main memory to cache memory. Although this takes extra time, the hit ratio will most likely improve with a larger block size because of the sequential nature of computer programs.
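The field arithmetic for the 8-word-block organization can be checked directly. This is our own sketch of the split described above (the function name is ours):

```python
def split_block_address(addr):
    """Split a 15-bit address into (tag, block, word) fields:
    6-bit tag, 6-bit block number, 3-bit word-within-block."""
    word = addr & 0o7            # low 3 bits: word within the block
    block = (addr >> 3) & 0o77   # next 6 bits: block number (0..63)
    tag = addr >> 9              # high 6 bits: tag
    return tag, block, word

# 64 blocks of 8 words each fill the 512-word cache
assert 64 * 8 == 512

tag, block, word = split_block_address(0o02777)  # tag 02, block 77, word 7
```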

Set-Associative Mapping
It was mentioned previously that the disadvantage of direct mapping is that two words with the same index in their address but with different tag values cannot reside in cache memory at the same time. A third type of cache organization, called set-associative mapping, is an improvement over the direct-mapping organization in that each word of cache can store two or more words of memory under the same index address. Each data word is stored together with its tag and the number of tag-data items in one word of cache is said to form a set. An example of a set-associative cache organization for a set size of two is shown in Fig. 12-15. Each index address refers to two data words and their associated tags. Each tag requires six bits and each data word has 12 bits, so the word length is 2(6 + 12) = 36 bits. An index address of nine bits can accommodate 512 words. Thus the size of cache memory is 512 x 36. It can accommodate 1024 words of main memory since each word of cache contains two data words. In general, a set-associative cache of set size k will accommodate k words of main memory in each word of cache.
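The sizing arithmetic for the two-way example checks out as follows (constant names are ours):

```python
TAG_BITS, DATA_BITS, SET_SIZE, INDEX_BITS = 6, 12, 2, 9

word_length = SET_SIZE * (TAG_BITS + DATA_BITS)  # 2(6 + 12) = 36 bits
cache_words = 2 ** INDEX_BITS                    # 9 index bits: 512 words
main_words_held = cache_words * SET_SIZE         # 1024 main-memory words
```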

Index   Tag   Data   Tag   Data
000     01    3450   02    5670
777     02    6710   00    2340

Figure 12-15 Two-way set-associative mapping cache.

The octal numbers listed in Fig. 12-15 are with reference to the main memory contents illustrated in Fig. 12-13(a). The words stored at addresses 01000 and 02000 of main memory are stored in cache memory at index address 000. Similarly, the words at addresses 02777 and 00777 are stored in cache at index address 777. When the CPU generates a memory request, the index value of the address is used to access the cache. The tag field of the CPU address is then compared with both tags in the cache to determine if a match occurs. The comparison logic is done by an associative search of the tags in the set similar to an associative memory search: thus the name "set-associative." The hit ratio will improve as the set size increases because more words with the same index but different tags can reside in cache. However, an increase in the set size increases the number of bits in words of cache and requires more complex comparison logic.
When a miss occurs in a set-associative cache and the set is full, it is necessary to replace one of the tag-data items with a new value. The most common replacement algorithms used are: random replacement, first-in, first-out (FIFO), and least recently used (LRU). With the random replacement policy the control chooses one tag-data item for replacement at random. The FIFO procedure selects for replacement the item that has been in the set the longest. The LRU algorithm selects for replacement the item that has been least recently used by the CPU. Both FIFO and LRU can be implemented by adding a few extra bits in each word of cache.
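LRU replacement for one set can be sketched with an ordered dictionary, moving an entry to the back on every use so the front is always the least recently used item. The class name and interface are our illustrative choices:

```python
from collections import OrderedDict

class LRUSet:
    """One set of a set-associative cache with LRU replacement:
    when the set is full, the least recently used tag-data item
    is evicted, as described above."""
    def __init__(self, set_size):
        self.set_size = set_size
        self.items = OrderedDict()  # tag -> data, least recently used first

    def access(self, tag, fetch_data):
        if tag in self.items:
            self.items.move_to_end(tag)     # mark as most recently used
            return self.items[tag], True
        if len(self.items) >= self.set_size:
            self.items.popitem(last=False)  # evict least recently used
        self.items[tag] = fetch_data()      # miss: bring the item in
        return self.items[tag], False

s = LRUSet(set_size=2)
s.access(0o01, lambda: 0o3450)  # miss
s.access(0o02, lambda: 0o5670)  # miss: set is now full
s.access(0o01, lambda: None)    # hit: tag 01 becomes most recently used
s.access(0o03, lambda: 0o1234)  # miss: tag 02, least recently used, is evicted
```

The "few extra bits" mentioned in the text correspond here to the ordering the dictionary maintains.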

Writing into Cache


An important aspect of cache organization is concerned with memory write requests. When the CPU finds a word in cache during a read operation, the main memory is not involved in the transfer. However, if the operation is a write, there are two ways that the system can proceed.

The simplest and most commonly used procedure is to update main memory with every memory write operation, with cache memory being updated in parallel if it contains the word at the specified address. This is called the write-through method. This method has the advantage that main memory always contains the same data as the cache. This characteristic is important in systems with direct memory access transfers. It ensures that the data residing in main memory are valid at all times, so that an I/O device communicating through DMA would receive the most recent updated data.

The second procedure is called the write-back method. In this method only the cache location is updated during a write operation. The location is then marked by a flag so that later, when the word is removed from the cache, it is copied into main memory. The reason for the write-back method is that during the time a word resides in the cache, it may be updated several times; however, as long as the word remains in the cache, it does not matter whether the copy in main memory is out of date, since requests for the word are filled from the cache. It is only when the word is displaced from the cache that an accurate copy need be rewritten into main memory. Analytical results indicate that the number of memory writes in a typical program ranges between 10 and 30 percent of the total references to memory.
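The write-back method can be sketched in a few lines — a flag (commonly called a dirty bit, though the text just says "flag") defers the main-memory update until the word is displaced. The class and names are our own illustrative sketch:

```python
class WriteBackCache:
    """Sketch of the write-back method: writes update only the cache
    and set a flag; main memory is updated when the word is displaced."""
    def __init__(self):
        self.cells = {}  # addr -> [data, modified_flag]

    def write(self, addr, data):
        self.cells[addr] = [data, True]   # mark the location as modified

    def evict(self, addr, main_memory):
        data, modified = self.cells.pop(addr)
        if modified:
            main_memory[addr] = data      # accurate copy rewritten only now

main = {0o100: 0}
c = WriteBackCache()
c.write(0o100, 7)
c.write(0o100, 8)      # updated again; main memory is still stale
stale = main[0o100]    # out of date, but requests are filled from the cache
c.evict(0o100, main)   # displacement copies the final value back
```

Note that two writes cost only one main-memory update — the saving the text attributes to write-back.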

Cache Initialization
One more aspect of cache organization that must be taken into consideration
is the problem of initialization. The cache is initialized when power is applied
to the computer or when the main memory is loaded with a complete set of
programs from auxiliary memory. After initialization the cache is considered to be empty, but in effect it contains some nonvalid data. It is customary to include with each word in cache a valid bit to indicate whether or not the word contains valid data.

The cache is initialized by clearing all the valid bits to 0. The valid bit of a particular cache word is set to 1 the first time this word is loaded from main memory and stays set unless the cache has to be initialized again. The introduction of the valid bit means that a word in cache is not replaced by another word unless the valid bit is set to 1 and a mismatch of tags occurs. If the valid bit happens to be 0, the new word automatically replaces the invalid data. Thus the initialization condition has the effect of forcing misses from the cache until it fills with valid data.

12-6 Virtual Memory

In a memory hierarchy system, programs and data are first stored in auxiliary memory. Portions of a program or data are brought into main memory as they are needed by the CPU. Virtual memory is a concept used in some large computer systems that permit the user to construct programs as though a large memory space were available, equal to the totality of auxiliary memory.
