
Unit 6: Memory & Input/Output Systems

Module 3: Cache Memory – Principle of Locality, Organization, Mapping Functions

Instructor
Dr. R. S. Khule,
Department of Information Technology,
Matoshri College of Engineering and Research Centre,
Nashik.
Recap

• We have discussed various types of semiconductor memory in detail
Contents

• Cache Memory –
 Principle of Locality
 Organization
 Mapping functions
Module Objectives

• Understand the significance of cache memory


• Understand Cache organization
• Know how a memory address is mapped into a cache
memory address.
• Understand the Cache mapping techniques and list their
merits and demerits
• Understand the performance parameters of cache
Module Outcomes

• Explain the significance of cache memory


• Explain Cache organization
• Know how a memory address is mapped into a
cache memory address.
• Explain Cache mapping techniques
• Compare the Cache mapping techniques
• Know the performance parameters of cache
Cache Memories

 Processor is much faster than the main memory.


 As a result, the processor has to spend much of its time
waiting while instructions and data are being fetched from
the main memory.
 This is a major obstacle to achieving good performance.
 Speed of the main memory cannot be increased beyond a
certain point.
 Cache memory is an architectural arrangement which makes
the main memory appear faster to the processor than it
really is.
 Cache memory is based on the property of computer
programs known as “locality of reference”.
Cache memories

[Figure: Processor ↔ Cache ↔ Main memory]

• When the processor issues a Read request, a block of words is transferred
from the main memory to the cache, one word at a time.
• Subsequent references to the data in this block of words are found in the
cache.
• At any given time, only some blocks in the main memory are held in the
cache. Which blocks in the main memory are in the cache is determined
by a “mapping function”.
• When the cache is full, and a block of words needs to be transferred
from the main memory, some block of words in the cache must be
replaced. This is determined by a “replacement algorithm”.
Cache Memory
• Cache memory is a special, very high-speed memory. It is
used to speed up the CPU and keep pace with its high speed.
Cache memory is costlier than main memory or disk
memory, but more economical than CPU registers. Cache memory
is an extremely fast memory type that acts as a buffer
between RAM and the CPU. It holds frequently requested
data and instructions so that they are immediately available
to the CPU when needed.
• Cache memory is used to reduce the average time to access
data from the main memory. The cache is a smaller, faster
memory which stores copies of the data from frequently used
main memory locations. There are several independent caches
in a CPU, which store instructions and data.
• If the processor finds that the memory location is in the
cache, a cache hit has occurred and data is read from the cache.
• If the processor does not find the memory location in the
cache, a cache miss has occurred. For a cache miss, the cache
allocates a new entry and copies in data from main memory;
then the request is fulfilled from the contents of the cache.
The performance of cache memory is frequently measured in
terms of a quantity called the hit ratio.
Hit ratio = hits / (hits + misses) = no. of hits / total accesses
We can improve cache performance by using a larger cache block
size and higher associativity, and by reducing the miss rate,
the miss penalty, and the time to hit in the cache.
Levels of memory:
Level 1 or Registers –
Registers hold the data and instructions the CPU is working on at that
moment. Commonly used registers include the accumulator, the program
counter, and address registers.
Level 2 or Cache memory –
It is the fastest memory after the registers; data is temporarily stored
in it for faster access.
Level 3 or Main memory –
It is the memory on which the computer currently works. It is limited in
size, and once power is off, the data no longer stays in this memory.
Level 4 or Secondary memory –
It is external memory which is not as fast as main memory, but data stays
in it permanently.
Cache Performance:
When the processor needs to read or write a location in main memory, it first
checks for a corresponding entry in the cache.
Locality of Reference
 Locality of reference refers to a phenomenon in which a computer
program tends to access the same set of memory locations over a particular
time period. In other words, locality of reference refers to the tendency
of a computer program to access instructions whose addresses are
near one another. The property of locality of reference is mainly exhibited
by loops and subroutine calls in a program.
• In the case of loops, the central processing unit
repeatedly refers to the set of instructions that constitute
the loop.
• In the case of subroutine calls, the same set of instructions
is fetched from memory every time the subroutine is called.
• References to data items also get localized, which means the
same data item is referenced again and again.
• When the CPU wants to read or fetch data or an instruction,
it first accesses the cache memory, since the cache is near the
CPU and provides very fast access. If the required data or
instruction is found there, it is fetched; this situation is
known as a cache hit. If the required data or instruction is not
found in the cache memory, the situation is known as a cache
miss. The main memory is then searched for the required data or
instruction, and once found it is handled in one of two ways:

1. The first way is for the CPU to fetch the required data or instruction
and simply use it. But when the same data or instruction is required again,
the CPU has to access the same main memory location once more, and we
already know that main memory is the slowest to access.
2. The second way is to also store the data or instruction in the cache
memory, so that if it is needed again in the near future it can be fetched
much faster.
Cache Operation
• Cache operation is based on the principle of locality of reference.
There are two forms of locality by which data or instructions fetched
from main memory get stored in cache memory:
• Temporal Locality –
Temporal locality means that the data or instruction currently being
fetched may be needed again soon. So we should store that data or
instruction in the cache memory, so that we can avoid searching
main memory again for the same item.
• When the CPU accesses a main memory location to read required data or an
instruction, the item also gets stored in the cache memory, based on the fact
that the same data or instruction may be needed in the near future. This is
known as temporal locality: if some data is referenced, there is a high
probability that it will be referenced again in the near future.
• Spatial Locality –
Spatial locality means that instructions or data near the current memory
location being fetched may be needed in the near future. This is slightly
different from temporal locality: here we are talking about nearby memory
locations, while in temporal locality we were talking about the actual
memory location that was being fetched.
• Sequentiality or Sequential Locality –
Given that a reference has been made to a particular location s, it is
likely that within the next several references a reference to location
s + 1 will be made. Sequentiality is a restricted type of spatial
locality and can be regarded as a subset of it.
Cache Performance
• The performance of the cache is measured in terms of the hit ratio. When
the CPU refers to memory and finds the required data or instruction in the
cache memory, it is known as a cache hit. If the desired data or instruction
is not found in the cache memory and the CPU refers to the main memory to
find it, it is known as a cache miss.
• Hits + Misses = Total CPU references
• Hit Ratio (h) = Hits / (Hits + Misses)

• Consider a memory system with two levels: cache and main memory. If Tc is
the time to access the cache memory and Tm is the time to access the main
memory (on a miss, the cache is checked first and then main memory), then
we can write:
• Tavg = average time to access memory = h * Tc + (1 - h) * (Tm + Tc)
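As a quick worked illustration with assumed values (not from the slides):
take Tc = 10 ns, Tm = 100 ns, and hit ratio h = 0.9. Then
Tavg = 0.9 * 10 + (1 - 0.9) * (100 + 10) = 9 + 11 = 20 ns,
so even a 90% hit ratio keeps the average access time well below the
100 ns main-memory access time.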
Cache Organization
• Cache is close to CPU and faster than main memory. But at
the same time is smaller than main memory. The cache
organization is about mapping data in memory to a location
in cache.
• A Simple Solution:
One way to perform this mapping is to use the last few bits of the
long memory address as the short cache address, and place the data
at that cache location.
• Problems With the Simple Solution:
The problem with this approach is that we lose the information in the
high-order bits and have no way to tell which high-order address bits
a cached entry belongs to.
Solution is a Tag:
To handle the above problem, more information is stored in the cache to
tell which block of memory is stored in each cache location. We store
this additional information as a Tag.
What is a Cache Block?
Programs exhibit spatial locality: once a location is retrieved, it is
highly probable that nearby locations will be retrieved in the near
future. So a cache is organized in the form of blocks. Typical cache
block sizes are 32 bytes or 64 bytes.
Cache Mapping
• Mapping functions determine how memory blocks are placed
in the cache. The memory system has to quickly determine
if a given address is in the cache.
• There are three popular methods of mapping addresses
to cache locations
– Fully Associative – Search the entire cache for an address
– Direct – Each address has a specific place in the cache
– Set Associative – Each address can be in any of a small set
of cache locations
Cache Lines
• The cache memory is divided into blocks or lines.
Currently lines can range from 16 to 64 bytes
• Data is copied to and from the cache one line at a
time
• The lower log2(line size) bits of an address specify a
particular byte in a line

An address is split as:   [ Line | Offset ]
Line Example
Consider RAM addresses 0110010100 through 0110011111. With a line size
of 4, the offset is log2(4) = 2 bits: the lower 2 bits of an address
specify which byte in the line, so addresses that differ only in those
2 bits belong to the same line.
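A minimal C sketch of this byte/line split, using the address 0110010101
from the example above (the standalone program and variable names are
illustrative, not part of the original slides):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t addr   = 0x195;       /* 0110010101 in binary        */
        uint32_t offset = addr & 0x3;  /* lower 2 bits: byte in line  */
        uint32_t line   = addr >> 2;   /* remaining bits: line number */
        printf("line=%u offset=%u\n", line, offset);  /* line=101 offset=1 */
        return 0;
    }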
Direct Mapping
• Each location in RAM has one specific place in cache
where the data will be held
• Consider the cache to be like an array. Part of the address
is used as index into the cache to identify where the data
will be held
• Since a data block from RAM can only be in one specific
line in the cache, it must always replace the one block
that was already there.
• There is no need for a replacement algorithm
Direct Cache Addressing
• The lower log2(line size) bits define which byte in
the block
• The next log2(number of lines) bits defines which
line of the cache
• The remaining upper bits are the tag field

An address is split as:   [ Tag | Line | Offset ]


Cache Constants
• cache size / line size = number of lines
• log2(line size) = bits for offset
• log2(number of lines) = bits for cache index
• remaining upper bits = tag address bits
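These constants can be computed mechanically. Below is a minimal C sketch,
assuming the 32-bit addresses, 32 KB cache, and 64-byte lines of the
example that follows; the helper name log2u and the program structure are
illustrative:

    #include <stdio.h>

    #define ADDRESS_BITS 32
    #define CACHE_SIZE   (32 * 1024)   /* bytes */
    #define LINE_SIZE    64            /* bytes */

    /* log2 for exact powers of two */
    static unsigned log2u(unsigned x) {
        unsigned bits = 0;
        while (x > 1) { x >>= 1; bits++; }
        return bits;
    }

    int main(void) {
        unsigned lines      = CACHE_SIZE / LINE_SIZE;  /* number of lines */
        unsigned offsetBits = log2u(LINE_SIZE);        /* bits for offset */
        unsigned indexBits  = log2u(lines);            /* bits for index  */
        unsigned tagBits    = ADDRESS_BITS - indexBits - offsetBits;

        printf("lines=%u offset=%u index=%u tag=%u\n",
               lines, offsetBits, indexBits, tagBits);  /* 512, 6, 9, 17 */
        return 0;
    }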
Example direct address
Assume you have
• 32 bit addresses (can address 4 GB)
• 64 byte lines (offset is 6 bits)
• 32 KB of cache
• Number of lines = 32 KB / 64 = 512
• Bits to specify which line = log2(512) = 9

[ Tag: 17 bits | Line: 9 bits | Offset: 6 bits ]
Example Address
• Using the previous direct mapping scheme
with 17 bit tag, 9 bit index and 6 bit offset
01111101011101110001101100111000
splits into:
01111101011101110 | 001101100 | 111000
(Tag | Index | Offset)

• Compare the tag field of line 001101100 (decimal 108) with the value
01111101011101110. If it matches, return byte 111000 (decimal 56) of
the line.
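The same field extraction can be sketched in C (an illustrative fragment,
not part of the original slides; 0x7D771B38 is the 32-bit address above
written in hexadecimal):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t addr = 0x7D771B38u;  /* 01111101011101110001101100111000 */

        uint32_t offset = addr & 0x3F;          /* lower 6 bits            */
        uint32_t index  = (addr >> 6) & 0x1FF;  /* next 9 bits: cache line */
        uint32_t tag    = addr >> 15;           /* remaining upper 17 bits */

        printf("tag=%u index=%u offset=%u\n", tag, index, offset);
        /* prints: tag=64238 index=108 offset=56, matching the slide */
        return 0;
    }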
How many bits are in the tag, line and offset fields?

Direct mapping, 24-bit addresses, 64K bytes of cache, 16-byte cache lines.
A. tag=4, line=16, offset=4
B. tag=4, line=14, offset=6
C. tag=8, line=12, offset=4
D. tag=6, line=12, offset=6
Associative Mapping
• In associative cache mapping, the data from any
location in RAM can be stored in any location in
cache
• When the processor wants an address, all tag fields
in the cache are checked to determine if the data is
already in the cache
• Each tag line requires circuitry to compare the
desired address with the tag field
• All tag fields are checked in parallel
Associative Cache Mapping

• The lower log2(line size) bits define which byte in the block
• The remaining upper bits are the tag field
• For a 4 GB address space with 128 KB cache
and 32 byte blocks:

[ Tag: 27 bits | Offset: 5 bits ]
Example Address

• Using the previous associative mapping scheme with 27-bit tag and
5-bit offset

01111101011101110001101100111000
splits into:
011111010111011100011011001 | 11000
(Tag | Offset)

• Compare all tag fields with the value 011111010111011100011011001.
If a match is found, return byte 11000 (decimal 24) of the line
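In software, the fully associative search can be modeled as the C loop
below (an illustrative sketch: real hardware compares all tag fields in
parallel, and the cache array and names here are hypothetical). It uses
the 27-bit tag / 5-bit offset split from this example:

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_LINES 4096  /* 128 KB cache / 32-byte blocks */

    struct Line { bool valid; uint32_t tag; uint8_t data[32]; };
    static struct Line cache[NUM_LINES];

    /* Fully associative lookup: a block can sit in any line,
       so every tag must be examined. */
    bool lookup(uint32_t addr, uint8_t *out) {
        uint32_t offset = addr & 0x1F;  /* lower 5 bits: byte in block */
        uint32_t tag    = addr >> 5;    /* remaining 27 bits: tag      */

        for (int i = 0; i < NUM_LINES; i++) {
            if (cache[i].valid && cache[i].tag == tag) {
                *out = cache[i].data[offset];  /* cache hit  */
                return true;
            }
        }
        return false;                          /* cache miss */
    }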
How many bits are in the tag and offset fields?

Associative mapping, 24-bit addresses, 128K bytes of cache, 64-byte
cache lines.
A. tag=20, offset=4
B. tag=19, offset=5
C. tag=18, offset=6
D. tag=16, offset=8
Set Associative Mapping

• Set associative mapping is a mixture of direct and associative mapping
• The cache lines are grouped into sets
• The number of lines in a set can vary from 2 to 16
• A portion of the address is used to specify which
set will hold an address
• The data can be stored in any of the lines in the set
Set Associative Mapping

• When the processor wants an address, it indexes to the set and then
searches the tag fields of all lines in the set for the desired address
• n = cache size / line size = number of lines
• b = log2(line size) = bits for offset
• w = number of lines per set
• s = n / w = number of sets
Example Set Associative
Assume you have
• 32 bit addresses
• 32 KB of cache 64 byte lines
• Number of lines = 32 KB / 64 = 512
• 4 way set associative
• Number of sets = 512 / 4 = 128
• Set bits = log2(128) = 7

[ Tag: 19 bits | Set: 7 bits | Offset: 6 bits ]
Example Address

• Using the previous set-associative mapping with 19-bit tag, 7-bit
index and 6-bit offset

01111101011101110001101100111000
splits into:
0111110101110111000 | 1101100 | 111000
(Tag | Index | Offset)

• Compare the tag fields of the four lines in set 1101100 (lines
110110000 to 110110011) with the value 0111110101110111000. If a match
is found, return byte 111000 (decimal 56) of that line
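A corresponding C sketch of the 4-way set-associative lookup, using the
19/7/6 bit split above (again a hypothetical software model with
illustrative names; hardware searches the ways of a set in parallel):

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_SETS 128  /* 512 lines / 4 ways */
    #define WAYS     4

    struct Line { bool valid; uint32_t tag; uint8_t data[64]; };
    static struct Line cache[NUM_SETS][WAYS];

    /* Set-associative lookup: the index selects one set, then only
       the WAYS lines of that set are searched for a matching tag. */
    bool lookup(uint32_t addr, uint8_t *out) {
        uint32_t offset = addr & 0x3F;         /* lower 6 bits  */
        uint32_t set    = (addr >> 6) & 0x7F;  /* next 7 bits   */
        uint32_t tag    = addr >> 13;          /* upper 19 bits */

        for (int w = 0; w < WAYS; w++) {
            if (cache[set][w].valid && cache[set][w].tag == tag) {
                *out = cache[set][w].data[offset];  /* hit  */
                return true;
            }
        }
        return false;                               /* miss */
    }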
How many bits are in the tag, set and offset fields?

2-way set associative mapping, 24-bit addresses, 128K bytes of cache,
16-byte cache lines.
A. tag=8, set=12, offset=4
B. tag=16, set=4, offset=4
C. tag=12, set=8, offset=4
D. tag=10, set=10, offset=4
Summary
• In this module, we have discussed cache memory: the principle of
locality, cache organization, and the mapping functions
Assignment

1. What is cache memory?
2. Explain cache organization.
3. Explain how a memory address is mapped into a cache memory address.
4. Explain the cache mapping techniques.
5. Compare the cache mapping techniques.
• In the next session, i.e. Module 4 of Unit 6, we will discuss write
policies, multilevel caches, and cache coherence.
Thank You
