
Lec 8-1

The document discusses memory organization, specifically focusing on cache types such as direct mapped, set associative, and fully associative caches. It explains how memory blocks are mapped to cache lines, calculates hit and miss ratios, and describes the impact of cache associativity on performance. Additionally, it covers replacement algorithms like FIFO and LRU for managing cache entries.


Lecture 8

Memory Organization

Example

Consider a direct mapped cache with 8 cache blocks (0-7). The memory block requests arrive in the order:

3, 5, 2, 8, 0, 6, 3, 9, 16, 20, 17, 25, 18, 30, 24, 2, 63, 5, 82, 17, 24

Which of the following memory blocks will not be in the cache at the end of the sequence?
a. 3
b. 18
c. 20
d. 30
Also, calculate the hit ratio and miss ratio.

Chapter 8 <2>
Solution:
There are 8 blocks in the cache memory, numbered 0 to 7. In direct mapping, each block of main memory maps to exactly one line of cache memory. The line number is given by:

Cache line number = Block number mod Number of lines in cache
[Table omitted in the original: stepping through the sequence with line = block mod 8 leaves the cache holding block 24 in line 0, 17 in line 1, 82 in line 2, 3 in line 3, 20 in line 4, 5 in line 5, 30 in line 6, and 63 in line 7. Only the second access to block 3, the second access to block 5, and the final access to block 24 hit.]
Solution:
Of the given options, only block 18 is not present in the cache at the end of the sequence.
Option (b) is correct.
Hit ratio = 3/21
Miss ratio = 18/21
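The worked answer can be reproduced with a short simulation. The following is a minimal sketch (the function name and structure are mine, not the lecture's), modeling a direct mapped cache as one resident block number per line:

```python
def simulate_direct_mapped(requests, num_lines=8):
    """Replay a block-request trace; return (hits, misses, final contents)."""
    cache = [None] * num_lines          # cache[line] = resident block number
    hits = misses = 0
    for block in requests:
        line = block % num_lines        # direct mapping: line = block mod 8
        if cache[line] == block:
            hits += 1
        else:
            misses += 1
            cache[line] = block         # no victim choice in direct mapping
    return hits, misses, cache

trace = [3, 5, 2, 8, 0, 6, 3, 9, 16, 20, 17, 25, 18, 30, 24, 2,
         63, 5, 82, 17, 24]
hits, misses, cache = simulate_direct_mapped(trace)
print(hits, misses)      # 3 18 -> hit ratio 3/21, miss ratio 18/21
print(18 in cache)       # False: block 2 later evicted block 18 from line 2
```

Blocks 3, 20, and 30 all survive in the final cache, which confirms option (b).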
Example

Consider a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K) words, and assume that the main memory is addressable by a 16-bit address. Main memory is 64K words, which will be viewed as 4K blocks of 16 words each.
Solution:
The 4 low-order bits of the 16-bit address select one of the 16 words in a block; the remaining 12 bits are divided differently under each mapping.

(1) Direct Mapping: 7 bits select one of the 128 cache lines, leaving a 5-bit tag (5 tag | 7 line | 4 word).

(2) Associative Mapping: a block may reside in any cache line, so no line field is needed; all 12 high-order bits form the tag (12 tag | 4 word).

(3) Set-Associative Mapping: with N blocks per set there are 128/N sets, so log2(128/N) bits select the set and the rest form the tag. For example, with 2 blocks per set the split is 6 tag | 6 set | 4 word.
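The bit-field arithmetic can be written out explicitly. This sketch assumes a 2-way organization for the set-associative case, which the original slide does not specify:

```python
from math import log2

ADDR_BITS = 16          # main memory addressable by a 16-bit address
BLOCKS = 128            # cache blocks
WORDS_PER_BLOCK = 16

word_bits = int(log2(WORDS_PER_BLOCK))            # 4 bits select the word

# (1) Direct mapping: a line field plus whatever remains as the tag.
line_bits = int(log2(BLOCKS))                     # 7 bits select the line
direct_tag = ADDR_BITS - line_bits - word_bits    # 5-bit tag

# (2) Fully associative: no line field; everything above the offset is tag.
assoc_tag = ADDR_BITS - word_bits                 # 12-bit tag

# (3) Set associative with N ways (N = 2 assumed here for illustration).
N = 2
set_bits = int(log2(BLOCKS // N))                 # 6 bits select the set
sa_tag = ADDR_BITS - set_bits - word_bits         # 6-bit tag

print(direct_tag, line_bits, word_bits)   # 5 7 4
print(assoc_tag, word_bits)               # 12 4
print(sa_tag, set_bits, word_bits)        # 6 6 4
```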
Direct Mapped Cache Hardware
Example 8.4 CACHE FIELDS
To what cache set in Figure 8.5 does the word at address 0x00000014 map? Name another address that maps to the same set.
Solution:
The address is 00000000000000000000000000010100 in binary. The two least significant bits are the byte offset, and the next three bits (101) are the set bits, so the word maps to set 5. Words at addresses 0x34, 0x54, 0x74, ..., 0xFFFFFFF4 all map to this same set.

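Example 8.4 can be checked directly in code. A minimal sketch, assuming the Figure 8.5 organization (8 sets and one-word blocks, so 2 offset bits and 3 set bits); the helper name is mine:

```python
def set_index(addr, set_bits=3, offset_bits=2):
    """Extract the set field from a byte address."""
    return (addr >> offset_bits) & ((1 << set_bits) - 1)

print(set_index(0x00000014))                 # 5
# Any address that differs only in its tag bits lands in the same set:
print(set_index(0x34), set_index(0x54), set_index(0xFFFFFFF4))   # 5 5 5
```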
Direct Mapped Cache

[Figure: a 2^30-word main memory mapped onto a 2^3-word direct mapped cache. Address bits [4:2] select the set, so addresses 0x00...00, 0x00...04, ..., 0x00...1C map to sets 0 (000) through 7 (111); the pattern then repeats, with 0x00...20 mapping back to set 0, 0x00...24 to set 1, and so on up through mem[0xFF...FC].]
Direct Mapped Cache Hardware

[Figure omitted in the original.]
N-Way Set Associative Cache

• An N-way set associative cache reduces conflicts by providing N blocks in each set where data mapping to that set might be found.
• Each memory address still maps to a specific set, but it can map to any one of the N blocks in the set. Hence, a direct mapped cache is another name for a one-way set associative cache.
• N is also called the degree of associativity of the cache.
• Figure 8.9 shows the hardware for a C = 8-word, N = 2-way set associative cache. The cache now has only S = 4 sets rather than 8.
• Thus, only log2(4) = 2 set bits rather than 3 are used to select the set.
• The tag increases from 27 to 28 bits.
• Each set contains two ways or degrees of associativity. Each way consists of a data block and the valid and tag bits.
N-Way Set Associative Cache

[Figure: the memory address divides into a 28-bit tag, 2 set bits, and a byte offset (00). Way 1 and Way 0 each hold a valid bit, a 28-bit tag, and a 32-bit data block. Two tag comparators produce Hit1 and Hit0; a multiplexer driven by the hit signals selects the matching way's data to produce Hit and Data.]
• Set associative caches generally have lower miss rates than direct mapped caches of the same capacity, because they have fewer conflicts.
• However, set associative caches are usually slower and somewhat more expensive to build because of the output multiplexer and additional comparators.
Types of Misses
• Compulsory: first time the data is accessed
• Capacity: cache too small to hold all the data of interest
• Conflict: data of interest maps to the same location in the cache
• Miss penalty: the time it takes to retrieve a block from the lower level of the hierarchy
Direct Mapped Cache Hardware
Example 8.5 CACHE FIELDS
Find the number of set and tag bits for a direct mapped cache with 1024 (2^10) sets and a one-word block size. The address size is 32 bits.
Solution:
A cache with 2^10 sets requires log2(2^10) = 10 set bits. The two least significant bits of the address are the byte offset, and the remaining 32 − 10 − 2 = 20 bits form the tag.

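The same arithmetic generalizes to any direct mapped cache. A small helper (mine, for illustration) applied to Example 8.5's numbers:

```python
from math import log2

def cache_fields(addr_bits, num_sets, words_per_block):
    """Return (tag_bits, set_bits, offset_bits) for a byte-addressed cache."""
    offset_bits = 2 + int(log2(words_per_block))  # 2 bits pick the byte in a word
    set_bits = int(log2(num_sets))
    return addr_bits - set_bits - offset_bits, set_bits, offset_bits

print(cache_fields(32, 1024, 1))   # (20, 10, 2): 20 tag, 10 set, 2 offset bits
```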
Direct Mapped Cache Performance

# MIPS assembly code
      addi $t0, $0, 5
loop: beq  $t0, $0, done
      lw   $t1, 0x4($0)
      lw   $t2, 0xC($0)
      lw   $t3, 0x8($0)
      addi $t0, $t0, -1
      j    loop
done:

[Figure: after the loop, sets 1, 2, and 3 hold mem[0x00...04], mem[0x00...08], and mem[0x00...0C] with their valid bits set; the other sets remain invalid.]

Miss Rate = ?
Direct Mapped Cache Performance

The loop issues 15 loads in total (three per iteration for five iterations); only the first access to each of the three addresses misses.

Miss Rate = 3/15 = 20%
Temporal Locality
Compulsory Misses
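The 3/15 figure can be verified by replaying the loop's data addresses through an 8-set direct mapped cache with one-word blocks (set = address bits [4:2]). The sketch below is illustrative, not part of the lecture:

```python
def count_misses(addrs, num_sets=8):
    """Direct mapped, one-word blocks: a miss whenever the set's tag differs."""
    cache = {}                          # set index -> resident tag
    misses = 0
    for a in addrs:
        s, tag = (a >> 2) % num_sets, a >> 5
        if cache.get(s) != tag:
            misses += 1
            cache[s] = tag
    return misses

trace = [0x4, 0xC, 0x8] * 5             # three loads per iteration, 5 iterations
print(count_misses(trace), len(trace))  # 3 15 -> 20% miss rate
```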
Direct Mapped Cache: Conflict

# MIPS assembly code
      addi $t0, $0, 5
loop: beq  $t0, $0, done
      lw   $t1, 0x4($0)
      lw   $t2, 0x24($0)
      addi $t0, $t0, -1
      j    loop
done:

[Figure: addresses 0x00...04 and 0x00...24 both map to set 1, so each load evicts the block the other just brought in; only set 1 is ever valid.]

Miss Rate = ?
Direct Mapped Cache: Conflict

The two loads map to the same set (their addresses differ only in the tag), so every access evicts the block the other load just fetched.

Miss Rate = 10/10 = 100%
Conflict Misses
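Replaying the conflicting pair through the same one-word-block, 8-set model shows the thrashing (an illustrative sketch, not from the lecture):

```python
def count_misses(addrs, num_sets=8):
    """Direct mapped, one-word blocks: a miss whenever the resident tag differs."""
    cache = {}                          # set index -> resident tag
    misses = 0
    for a in addrs:
        s, tag = (a >> 2) % num_sets, a >> 5
        if cache.get(s) != tag:
            misses += 1
            cache[s] = tag
    return misses

# 0x4 and 0x24 share set 1 but have different tags, so they evict each other:
trace = [0x4, 0x24] * 5
print(count_misses(trace), len(trace))  # 10 10 -> 100% miss rate
```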
N-Way Set Associative Performance

# MIPS assembly code
      addi $t0, $0, 5
loop: beq  $t0, $0, done
      lw   $t1, 0x4($0)
      lw   $t2, 0x24($0)
      addi $t0, $t0, -1
      j    loop
done:

[Figure: a 2-way set associative cache with 4 sets, all ways initially invalid.]

Miss Rate = ?
N-Way Set Associative Performance

With two ways per set, mem[0x00...04] and mem[0x00...24] coexist in set 1, one in each way, so only the first access to each address misses.

Miss Rate = 2/10 = 20%
Associativity reduces conflict misses.
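The same trace through a 2-way, 4-set cache with LRU replacement confirms the 2/10 rate (a sketch; names are mine):

```python
def count_misses_2way(addrs, num_sets=4):
    """2-way set associative, one-word blocks, LRU within each set."""
    sets = [[] for _ in range(num_sets)]   # each set: block numbers, MRU last
    misses = 0
    for a in addrs:
        block = a >> 2                     # one-word blocks
        ways = sets[block % num_sets]
        if block in ways:
            ways.remove(block)             # hit: refresh its recency below
        else:
            misses += 1
            if len(ways) == 2:
                ways.pop(0)                # evict the least recently used way
        ways.append(block)
    return misses

print(count_misses_2way([0x4, 0x24] * 5))  # 2: both blocks fit in set 1
```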
Fully Associative Cache

• A fully associative cache contains a single set with B ways, where B is the number of blocks.
• A memory address can map to a block in any of these ways.
• A fully associative cache is another name for a B-way set associative cache with one set.
• Figure 8.11 shows the SRAM array of a fully associative cache with eight blocks. Upon a data request, eight tag comparisons must be made, because the data could be in any block.
Cache Organization Recap
• Capacity: C
• Block size: b
• Number of blocks in cache: B = C/b
• Number of blocks in a set: N
• Number of sets: S = B/N

Organization            Number of Ways (N)    Number of Sets (S = B/N)
Direct Mapped           1                     B
N-Way Set Associative   1 < N < B             B/N
Fully Associative       B                     1
• Increasing the associativity, N, usually reduces the miss rate caused by conflicts. But higher associativity requires more tag comparators.
• Increasing the block size, b, takes advantage of spatial locality to reduce the miss rate. However, it decreases the number of sets in a fixed-size cache and therefore could lead to more conflicts. It also increases the miss penalty.
Replacement Algorithms
• When a new block is brought into the cache, one of the existing blocks must be replaced.
• For direct mapping, there is only one possible line for any particular block, so no choice is possible.
• For the associative and set associative techniques, a replacement algorithm is needed.
1- First-In-First-Out (FIFO)
• Replace the block in the set that has been in the cache longest.
• FIFO is easily implemented.
• Hit ratio = probability that a word is found in the cache.
2- Least Recently Used (LRU) (time factor)
• Replace the block in the set that has been in the cache longest with no reference to it.
• In a two-way set associative cache, this is easily implemented: each line includes a USE bit.
• When a line is referenced, its USE bit is set to 1 and the USE bit of the other line in that set is set to 0.
• When a block is to be read into the set, the line whose USE bit is 0 is replaced.
• Because we assume that more recently used memory locations are more likely to be referenced again, LRU should give the best hit ratio.
LRU Replacement
# MIPS assembly
lw $t0, 0x04($0)
lw $t1, 0x24($0)
lw $t2, 0x54($0)

[Figure: a 2-way set associative cache with 4 sets; each set holds a USE bit (U) plus valid, tag, and data fields per way, all initially invalid.]
LRU Replacement

[Figure: (a) After the first two loads, set 1 (01) holds mem[0x00...24] in Way 1 and mem[0x00...04] in Way 0, with U = 0 marking Way 0 as least recently used. (b) The load from 0x54 also maps to set 1, so mem[0x00...54] replaces mem[0x00...04] in Way 0, and U becomes 1.]
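The USE-bit bookkeeping in this example can be sketched as follows (the dictionary layout and helper are mine; 'u' names the way to replace next):

```python
def access(set_state, tag):
    """One lookup in a 2-way set; set_state = {'ways': [t0, t1], 'u': victim}."""
    if tag in set_state['ways']:
        used = set_state['ways'].index(tag)   # hit in this way
    else:
        used = set_state['u']                 # miss: fill the LRU way
        set_state['ways'][used] = tag
    set_state['u'] = 1 - used                 # the other way is now LRU
    return set_state

set1 = {'ways': [None, None], 'u': 0}
for addr in (0x04, 0x24, 0x54):               # all three map to set 1
    access(set1, addr >> 4)                   # illustrative tag = high bits
print(set1)  # {'ways': [5, 2], 'u': 1}: 0x54's block replaced 0x04's in Way 0
```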
3- Least Frequently Used (LFU) (number of usages)
• Replace the block in the set that has experienced the fewest references.
• LFU can be implemented by associating a counter with each line.
4- Random Replacement
• Pick a line at random from among the candidate lines.
• Simulation studies have shown that random replacement gives only slightly inferior performance compared to an algorithm based on usage.
Cache Summary
• What data is held in the cache?
  • Recently used data (temporal locality)
  • Nearby data (spatial locality)
• How is data found?
  • The set is determined by the address of the data
  • The word within the block is also determined by the address
  • In associative caches, the data could be in one of several ways
• What data is replaced?
  • The least recently used way in the set
