
Computer Organization and Architecture
Chapter 04: Cache Memory
NLU-FIT

KEY POINTS

 Computer memory is organized into a hierarchy. At the highest level (closest to the processor) are the processor registers. Next comes one or more levels of cache. When multiple levels are used, they are denoted L1, L2, and so on. The hierarchy continues with external memory, with the next level typically being a fixed hard disk, and one or more levels below that consisting of removable media such as optical disks and tape.
 As one goes down the memory hierarchy, one finds decreasing cost per bit, increasing capacity, and slower access time.
 If the cache is designed properly, then most of the time the processor will request memory words that are already in the cache.


4.1. Computer Memory System Overview

4.1.1. Characteristics of Memory Systems

Table 4.1. Key Characteristics of Computer Memory Systems


4.1.1. Characteristics of Memory Systems

 An obvious characteristic of memory is its capacity. For internal memory, this is typically expressed in terms of bytes (1 byte = 8 bits) or words. Common word lengths are 8, 16, and 32 bits. External memory capacity is typically expressed in terms of bytes.
 A related concept is the unit of transfer. For internal memory, the unit of transfer is equal to the number of electrical lines into and out of the memory module.

4.1.1. Characteristics of Memory Systems

 The unit of transfer may be equal to the word length, but is often larger, such as 64, 128, or 256 bits.
 Consider three related concepts for internal memory:
• Word: The “natural” unit of organization of memory. The size of the word is typically equal to the number of bits used to represent an integer and to the instruction length.
• Addressable units: In some systems, the addressable unit is the word. However, many systems allow addressing at the byte level. In any case, the relationship between the length in bits A of an address and the number N of addressable units is 2^A = N.
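
As a quick check of the relationship 2^A = N, the short C program below (illustrative only, not part of the text) prints the number of addressable units for a few address lengths:

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        /* N = 2^A: an address of A bits selects one of N addressable units. */
        for (int a = 16; a <= 32; a += 8) {
            uint64_t n = (uint64_t)1 << a;
            printf("A = %2d bits -> N = %llu addressable units\n",
                   a, (unsigned long long)n);
        }
        return 0;
    }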


4.1.1. Characteristics of Memory Systems

• Unit of transfer: For main memory, this is the number of bits read out of or written into memory at a time.
 Another distinction among memory types is the method of accessing units of data. These include the following:
• Sequential access:
  Memory is organized into units of data, called records.
  Access must be made in a specific linear sequence.
  Access time depends on the location of the data and the previous location.
  e.g., tape

4.1.1. Characteristics of Memory Systems

• Direct access:
  Individual blocks have a unique address.
  Access is by jumping to the vicinity plus a sequential search.
  Access time depends on the location and the previous location.
  e.g., disk
• Random access:
  Individual addresses identify locations exactly.
  Access time is independent of location or previous access.
  e.g., RAM


4.1.1. Characteristics of Memory Systems

• Associative:
  Data is located by a comparison with the contents of a portion of the store.
  Access time is independent of location or previous access.
  e.g., cache
 From a user’s point of view, the two most important characteristics of memory are capacity and performance.
 Three performance parameters are used:
• Access time:
  Time between presenting the address and getting the valid data.

4.1.1. Characteristics of Memory Systems

• Memory cycle time:
  Time may be required for the memory to “recover” before the next access.
  Cycle time is access time plus recovery time.
• Transfer rate:
  This is the rate at which data can be transferred into or out of a memory unit.
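
To tie these parameters together: for random-access memory the transfer rate is roughly 1/(cycle time). A small worked example in C, with all timing figures assumed for illustration:

    #include <stdio.h>

    int main(void) {
        /* All figures are assumed for illustration only. */
        double access_ns   = 60.0;                    /* access time */
        double recovery_ns = 40.0;                    /* recovery time */
        double cycle_ns    = access_ns + recovery_ns; /* cycle = access + recovery */
        double rate        = 1e9 / cycle_ns;          /* one word per cycle */
        printf("cycle time = %.0f ns -> transfer rate = %.1f Mwords/s\n",
               cycle_ns, rate / 1e6);
        return 0;
    }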


4.1.2. The Memory Hierarchy

 The design constraints on a computer’s memory can be summed up by three questions: How much? How fast? How expensive?
 The way out of this dilemma is not to rely on a single memory component or technology, but to employ a memory hierarchy.
 A typical hierarchy is illustrated in Figure 4.1. As one goes down the hierarchy, the following occur:

4.1.2. The Memory Hierarchy

a. Decreasing cost per bit
b. Increasing capacity
c. Increasing access time
d. Decreasing frequency of access of the memory by the processor

Figure 4.1. The Memory Hierarchy


4.2. Cache Memory Principles

 The concept is illustrated in Figure 4.3.
 There is a relatively large and slow main memory together with a smaller, faster cache memory.
 The cache contains a copy of portions of main memory. When the processor attempts to read a word of memory, a check is made to determine if the word is in the cache.
• If so, the word is delivered to the processor.
• If not, a block of main memory, consisting of some fixed number of words, is read into the cache and then the word is delivered to the processor.

4.2. Cache Memory Principles

Figure 4.3. Cache and Main Memory


4.2. Cache Memory Principles

Figure 4.4. Cache/Main Memory Structure

4.2. Cache Memory Principles

 Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address.
 For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each.
 That is, there are M = 2^n / K blocks in main memory.
 The cache consists of m blocks, called lines. Each line contains K words, plus a tag of a few bits.
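
To make the block arithmetic concrete, here is a minimal sketch (the address length n and block size K are assumed values):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        int n = 24;                          /* n-bit addresses (assumed) */
        uint64_t words = (uint64_t)1 << n;   /* 2^n addressable words */
        int K = 4;                           /* K words per block (assumed) */
        uint64_t M = words / (uint64_t)K;    /* M = 2^n / K blocks */
        printf("2^%d words, K = %d -> M = %llu blocks\n",
               n, K, (unsigned long long)M);
        return 0;
    }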


4.2. Cache Memory Principles

 Each line also includes control bits (not shown), such as a bit to indicate whether the line has been modified since being loaded into the cache.
• In referring to the basic unit of the cache, the term line is used, rather than the term block, for two reasons:
  (1) to avoid confusion with a main memory block, which contains the same number of data words as a cache line;
  (2) because a cache line includes not only K words of data, just as a main memory block, but also includes tag and control bits.

4.2. Cache Memory Principles

 The length of a line, not including tag and control bits, is the line size.
 The number of lines is considerably less than the number of main memory blocks (m < M).
 At any time, some subset of the blocks of memory resides in lines in the cache.
 If a word in a block of memory is read, that block is transferred to one of the lines of the cache.
 Because there are more blocks than lines, an individual line cannot be uniquely and permanently dedicated to a particular block.
 Thus, each line includes a tag that identifies which particular block is currently being stored. The tag is usually a portion of the main memory address.
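
The line bookkeeping just described might be modeled in C as follows; the field widths and names are assumptions for illustration, not a definitive layout:

    #include <stdint.h>
    #include <stdbool.h>

    #define K 4  /* words per line (assumed) */

    /* One cache line: K data words plus the tag and control bits. */
    struct cache_line {
        uint32_t tag;      /* identifies which main memory block is stored */
        bool     valid;    /* control bit: the line holds a real block */
        bool     dirty;    /* control bit: modified since being loaded */
        uint32_t data[K];  /* the K data words of the block */
    };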


4.2. Cache Memory Principles

 Cache operation (a code sketch of this flow follows below):
• The CPU requests the contents of a memory location.
• Check the cache for this data.
• If present, get it from the cache (fast).
• If not present, read the required block from main memory into the cache.
• Then deliver from the cache to the CPU.
• The cache includes tags to identify which block of main memory is in each cache slot.
 Figure 4.5 illustrates the read operation. The processor generates the read address (RA) of a word to be read.
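
A hedged sketch of the hit/miss flow in C, using a direct-mapped placement for concreteness (mapping functions are covered in Section 4.3.3; all sizes and names are assumed):

    #include <stdint.h>
    #include <string.h>
    #include <stdbool.h>

    #define NUM_LINES   16384  /* assumed */
    #define BLOCK_WORDS 4      /* assumed */

    struct line { uint32_t tag; bool valid; uint32_t data[BLOCK_WORDS]; };
    static struct line cache[NUM_LINES];
    extern uint32_t main_memory[];  /* backing store (assumed to exist) */

    /* Read one word at read address RA: check the cache first; on a miss,
     * fetch the containing block from main memory, then deliver the word. */
    uint32_t read_word(uint32_t ra) {
        uint32_t word  = ra % BLOCK_WORDS;   /* word offset within the block */
        uint32_t block = ra / BLOCK_WORDS;   /* main memory block number */
        uint32_t index = block % NUM_LINES;  /* line this block maps to */
        struct line *l = &cache[index];
        if (!(l->valid && l->tag == block)) {    /* miss: load the block */
            memcpy(l->data, &main_memory[block * BLOCK_WORDS], sizeof l->data);
            l->tag   = block;  /* software keeps the whole block number;
                                  hardware would store only the high-order bits */
            l->valid = true;
        }
        return l->data[word];                    /* deliver to the CPU */
    }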

4.2. Cache Memory Principles

Figure 4.5. Cache Read Operation


4.2. Cache Memory Principles

 Figure 4.6 shows an organization that is typical of contemporary caches.
• The cache connects to the processor via data, control, and address lines.
• The data and address lines also attach to data and address buffers, which attach to a system bus from which main memory is reached.
• When a cache hit occurs, the data and address buffers are disabled and communication is only between processor and cache, with no system bus traffic.
• When a cache miss occurs, the desired address is loaded onto the system bus and the data are returned through the data buffer to both the cache and the processor.

4.2. Cache Memory Principles

Figure 4.6. Typical Cache Organization


4.3. Elements of Cache Design

 Although there are a large number of cache implementations, there are a few basic design elements that serve to classify and differentiate cache architectures.

4.3.1. Cache Addresses

 Almost all nonembedded processors, and many embedded processors, support virtual memory.
 In essence, virtual memory is a facility that allows programs to address memory from a logical point of view, without regard to the amount of main memory physically available.
 When virtual memory is used, the address fields of machine instructions contain virtual addresses.
 For reads from and writes to main memory, a hardware memory management unit (MMU) translates each virtual address into a physical address in main memory.


4.3.1. Cache Addresses

 When virtual addresses are used, the system designer may choose to place the cache, as shown in Figure 4.7:
• between the processor and the MMU, or
• between the MMU and main memory.
 A logical cache, also known as a virtual cache, stores data using virtual addresses. The processor accesses the cache directly, without going through the MMU.
 A physical cache stores data using main memory physical addresses.

4.3.1. Cache Addresses

Figure 4.7. Logical and Physical Caches


4.3.1. Cache Addresses

 One obvious advantage of the logical cache is that cache access speed is faster than for a physical cache, because the cache can respond before the MMU performs an address translation.
 The disadvantage has to do with the fact that most virtual memory systems supply each application with the same virtual memory address space.
• That is, each application sees a virtual memory that starts at address 0.
• Thus, the same virtual address in two different applications refers to two different physical addresses.
• The cache memory must therefore be completely flushed with each application context switch, or extra bits must be added to each line of the cache to identify which virtual address space the address refers to.

4.3.2. Cache Size

4.3.3. Mapping Function

 Because there are fewer cache lines than main memory blocks, an algorithm is needed for mapping main memory blocks into cache lines.
 Further, a means is needed for determining which main memory block currently occupies a cache line.
 Three techniques can be used: direct, associative, and set-associative.

4.3.3. Mapping Function

 For all three cases, the example includes the following elements:
• The cache can hold 64 KBytes.
• Data are transferred between main memory and the cache in blocks of 4 bytes each.
• This means that the cache is organized as 16K = 2^14 lines of 4 bytes each.
• The main memory consists of 16 MBytes, with each byte directly addressable by a 24-bit address (2^24 = 16M).
• Thus, for mapping purposes, we can consider main memory to consist of 4M blocks of 4 bytes each.
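
These figures can be checked with a few lines of C (a sketch whose constants mirror the example above):

    #include <stdio.h>

    int main(void) {
        unsigned long cache_bytes = 64UL * 1024;          /* 64-KByte cache */
        unsigned long block_bytes = 4;                    /* 4-byte blocks */
        unsigned long lines = cache_bytes / block_bytes;  /* 16K = 2^14 lines */
        unsigned long mem_bytes = 1UL << 24;              /* 16-MByte main memory */
        unsigned long blocks = mem_bytes / block_bytes;   /* 4M blocks */
        printf("lines = %lu, blocks = %lu\n", lines, blocks);
        return 0;
    }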


4.3.3. Mapping Function

 DIRECT MAPPING: The simplest technique, known as direct mapping, maps each block of main memory into only one possible cache line.

Figure 4.8a. Direct Mapping

4.3.3. Mapping Function

 The mapping is expressed as

  i = j modulo m

where
  i = cache line number
  j = main memory block number
  m = number of lines in the cache

 Figure 4.8a shows the mapping for the first m blocks of main memory.
 Each block of main memory maps into one unique line of the cache. The next m blocks of main memory map into the cache in the same fashion; that is, block Bm of main memory maps into line L0 of the cache, block Bm+1 maps into line L1, and so on.
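
A minimal sketch of the i = j modulo m rule, continuing the 64-KByte example (the sample address is arbitrary):

    #include <stdio.h>

    #define M_LINES     (1u << 14)  /* m = 2^14 lines (64-KByte cache, 4-byte blocks) */
    #define BLOCK_BYTES 4u

    int main(void) {
        unsigned address = 0x16339Cu;         /* arbitrary 24-bit byte address */
        unsigned j   = address / BLOCK_BYTES; /* main memory block number */
        unsigned i   = j % M_LINES;           /* i = j modulo m: cache line */
        unsigned tag = j / M_LINES;           /* high-order bits kept as the tag */
        printf("block j = %u -> line i = %u, tag = %u\n", j, i, tag);
        return 0;
    }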


4.3.3. Mapping Function

 The direct mapping technique is simple and inexpensive to implement.
 Its main disadvantage is that there is a fixed cache location for any given block.
• Thus, if a program happens to reference words repeatedly from two different blocks that map into the same line, then the blocks will be continually swapped in the cache, and the hit ratio will be low.

4.3.3. Mapping Function

 ASSOCIATIVE MAPPING: Associative mapping overcomes the disadvantage of direct mapping by permitting each main memory block to be loaded into any line of the cache (Figure 4.8b).
 In this case, the cache control logic interprets a memory address simply as a Tag and a Word field.
• The Tag field uniquely identifies a block of main memory.
• To determine whether a block is in the cache, the cache control logic must simultaneously examine every line’s tag for a match.


4.3.3. Mapping Function

Figure 4.8b. Associative Mapping

 With associative mapping, there is flexibility as to which block to replace when a new block is read into the cache.
 The principal disadvantage of associative mapping is the complex circuitry required to examine the tags of all cache lines in parallel.
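
In hardware, all tags are compared simultaneously; software can only approximate this with a scan. A sketch with assumed structures:

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_LINES 16384  /* assumed */

    struct line { uint32_t tag; bool valid; };
    static struct line cache[NUM_LINES];

    /* Return the index of the line whose tag matches, or -1 on a miss. */
    int associative_lookup(uint32_t tag) {
        for (int i = 0; i < NUM_LINES; i++)
            if (cache[i].valid && cache[i].tag == tag)
                return i;
        return -1;
    }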

4.3.3. Mapping Function

 SET-ASSOCIATIVE MAPPING: Set-associative mapping is a compromise that exhibits the strengths of both the direct and associative approaches while reducing their disadvantages.
 In this case, the cache consists of a number of sets, each of which consists of a number of lines.
 The relationships are

  m = v * k
  i = j modulo v


4.3.3. Mapping Function

where
  i = cache set number
  j = main memory block number
  m = number of lines in the cache
  v = number of sets
  k = number of lines in each set

 This is referred to as k-way set-associative mapping. With set-associative mapping, block Bj can be mapped into any of the lines of set (j modulo v).
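
A minimal sketch of this arithmetic, assuming a 2-way (k = 2) cache built from the running example's 2^14 lines:

    #include <stdio.h>

    int main(void) {
        unsigned m = 1u << 14;  /* total lines (assumed, from the running example) */
        unsigned k = 2;         /* lines per set: 2-way set associative (assumed) */
        unsigned v = m / k;     /* m = v * k  ->  number of sets */
        unsigned j = 123456;    /* arbitrary main memory block number */
        unsigned i = j % v;     /* i = j modulo v: the set this block may occupy */
        printf("v = %u sets; block %u -> set %u (any of its %u lines)\n",
               v, j, i, k);
        return 0;
    }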

4.3.3. Mapping Function

Figure 4.8c. v Associative-Mapped Caches


4.3.3. Mapping Function

 Figure 4.8c illustrates this mapping for the first v blocks of main memory.
• As with associative mapping, each word maps into multiple cache lines. For set-associative mapping, each word maps into all the cache lines in a specific set, so that main memory block B0 maps into set 0, and so on.
• Thus, the set-associative cache can be physically implemented as v associative caches.

4.3.4. Replacement Algorithms

 Once the cache has been filled, when a new block is brought into the cache, one of the existing blocks must be replaced.
 For direct mapping:
• There is no choice.
• Each block maps to only one line.
• Replace that line.
 For the associative and set-associative techniques, a replacement algorithm is needed. To achieve high speed, such an algorithm must be implemented in hardware. A number of algorithms have been tried.


4.3.4. Replacement Algorithms

 Probably the most effective is least recently used (LRU):
• Replace that block in the set that has been in the cache longest with no reference to it.
• LRU is the most popular replacement algorithm.
 Another possibility is first-in-first-out (FIFO):
• Replace that block in the set that has been in the cache longest.
• FIFO is easily implemented as a round-robin or circular buffer technique.
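
For a two-way set-associative cache, LRU needs only one bit per set. A hedged sketch (structure and names are assumptions for illustration):

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_SETS 8192  /* assumed */

    struct set2 {
        uint32_t tag[2];
        bool     valid[2];
        uint8_t  lru;      /* index (0 or 1) of the least recently used line */
    };
    static struct set2 sets[NUM_SETS];

    /* On a hit to `line`, the other line becomes the least recently used. */
    static void mark_used(struct set2 *s, int line) {
        s->lru = (uint8_t)(1 - line);
    }

    /* On a miss, replace the least recently used line of the set. */
    static int choose_victim(const struct set2 *s) {
        return s->lru;
    }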

4.3.4. Replacement Algorithms

 Another possibility is least frequently used (LFU):
• Replace that block in the set that has experienced the fewest references.
• LFU could be implemented by associating a counter with each line.
 A technique not based on usage is to pick a line at random from among the candidate lines.
• Simulation studies have shown that random replacement provides only slightly inferior performance to an algorithm based on usage.


Computer Organization and Architecture

 Reference: William Stallings, Computer Organization and Architecture: Designing for Performance, 8th Edition, Prentice Hall, Upper Saddle River, NJ 07458.
