Lec 8-1
Memory Organization
Example
Consider a direct mapped cache with 8 cache
blocks (0-7). If the memory block requests are
in the order:
3, 5, 2, 8, 0, 6, 3, 9, 16, 20, 17, 25, 18, 30, 24, 2,
63, 5, 82, 17, 24
Which of the following memory blocks will not
be in the cache at the end of the sequence?
a. 3
b. 18
c. 20
d. 30
Also, calculate the hit ratio and miss ratio.
Chapter 8 <2>
Example
Solution-
There are 8 blocks in cache memory numbered
from 0 to 7.
In direct mapping, a particular block of main
memory is mapped to a particular line of cache
memory.
The line number is given by-
Cache line number = Block number modulo
Number of lines in cache
Out of the given options, only block 18 is not present in the cache at the end of the sequence (it was evicted from line 2 when block 2 was requested again).
Option (b) is correct.
Hit ratio = 3/21 ≈ 14.3%
Miss ratio = 18/21 ≈ 85.7%
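As a check, the request sequence can be replayed with a short Python sketch (the helper name and structure are ours, not from the slides):

```python
# Direct-mapped cache with 8 lines; a block maps to line (block mod 8).
def simulate_direct_mapped(requests, num_lines=8):
    cache = [None] * num_lines    # one resident block number per cache line
    hits = 0
    for block in requests:
        line = block % num_lines
        if cache[line] == block:
            hits += 1             # block already resident: hit
        else:
            cache[line] = block   # miss: fetched block replaces the old one
    return hits, len(requests) - hits, cache

requests = [3, 5, 2, 8, 0, 6, 3, 9, 16, 20, 17, 25, 18, 30, 24, 2,
            63, 5, 82, 17, 24]
hits, misses, final = simulate_direct_mapped(requests)
print(hits, misses)   # 3 18  -> hit ratio 3/21, miss ratio 18/21
print(final)          # [24, 17, 82, 3, 20, 5, 30, 63] -- block 18 was evicted
```

The final cache contents confirm that blocks 3, 20, and 30 survive while block 18 does not.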
Example
Consider a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K) words, and assume that the main memory is addressable by a 16-bit address. Main memory is 64K words, which will be viewed as 4K blocks of 16 words each.
Solution:
(1) Direct Mapping:
The 16-bit address divides into a 5-bit tag, a 7-bit block field (2^7 = 128 cache blocks), and a 4-bit word field (2^4 = 16 words per block).
(2) Associative Mapping:
A main-memory block can be placed in any of the 128 cache blocks, so there is no block field; the 16-bit address divides into a 12-bit tag and a 4-bit word field.
(3) Set-Associative Mapping:
With 2 blocks per set, the 128 cache blocks form 64 sets; the 16-bit address divides into a 6-bit tag, a 6-bit set field, and a 4-bit word field.
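The three field breakdowns can be derived mechanically. The sketch below assumes 2 blocks per set for the set-associative case, a parameter the slide text does not state:

```python
# Field widths for the 2K-word cache / 64K-word main memory example.
from math import log2

ADDRESS_BITS = 16          # 64K words of main memory
WORDS_PER_BLOCK = 16       # -> 4-bit word field
CACHE_BLOCKS = 128

word_bits = int(log2(WORDS_PER_BLOCK))

# (1) Direct mapping: 128 lines -> 7-bit block field, tag = 16 - 7 - 4 = 5
block_bits = int(log2(CACHE_BLOCKS))
direct = (ADDRESS_BITS - block_bits - word_bits, block_bits, word_bits)

# (2) Associative mapping: no block field, tag = 16 - 4 = 12
assoc = (ADDRESS_BITS - word_bits, word_bits)

# (3) Set-associative (2 blocks/set assumed): 64 sets -> 6-bit set field
set_bits = int(log2(CACHE_BLOCKS // 2))
set_assoc = (ADDRESS_BITS - set_bits - word_bits, set_bits, word_bits)

print(direct, assoc, set_assoc)   # (5, 7, 4) (12, 4) (6, 6, 4)
```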
Direct Mapped Cache Hardware
Example 8.4 CACHE FIELDS
To what cache set in Figure 8.5 does the word
at address 0x00000014 map?
Name another address that maps to the same
set.
Solution:
The address is 0000 0000 0000 0000 0000 0000 0001 0100 in binary. The two least significant bits are the byte offset, and the next three bits (101) are the set bits, so the word maps to set 5. Words at addresses 0x34, 0x54, 0x74, . . . , 0xFFFFFFF4 all map to this same set.
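The mapping can be reproduced with a one-line helper (the function name is ours, for illustration):

```python
# Set index for a byte address in Figure 8.5's cache:
# 8 sets, one-word (4-byte) blocks.
def cache_set(addr, num_sets=8, block_bytes=4):
    return (addr // block_bytes) % num_sets

print(cache_set(0x14))                         # 5
# Addresses differing only in the tag alias to the same set every
# num_sets * block_bytes = 32 bytes:
aliases = [0x14 + 0x20 * k for k in range(4)]  # 0x14, 0x34, 0x54, 0x74
print([cache_set(a) for a in aliases])         # [5, 5, 5, 5]
```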
Direct Mapped Cache

Address          Data               Set Number
11...11111100    mem[0xFF...FC]     7 (111)
11...11111000    mem[0xFF...F8]     6 (110)
11...11110100    mem[0xFF...F4]     5 (101)
11...11110000    mem[0xFF...F0]     4 (100)
11...11101100    mem[0xFF...EC]     3 (011)
11...11101000    mem[0xFF...E8]     2 (010)
11...11100100    mem[0xFF...E4]     1 (001)
11...11100000    mem[0xFF...E0]     0 (000)
...
00...00100100    mem[0x00...24]     1 (001)
00...00100000    mem[0x00...20]     0 (000)
00...00011100    mem[0x00...1C]     7 (111)
00...00011000    mem[0x00...18]     6 (110)
00...00010100    mem[0x00...14]     5 (101)
00...00010000    mem[0x00...10]     4 (100)
00...00001100    mem[0x00...0C]     3 (011)
00...00001000    mem[0x00...08]     2 (010)
00...00000100    mem[0x00...04]     1 (001)
00...00000000    mem[0x00...00]     0 (000)
An N-way set associative cache
reduces conflicts by providing N
blocks in each set where data
mapping to that set might be found.
Each memory address still maps to a
specific set, but it can map to any one
of the N blocks in the set. Hence, a
direct mapped cache is another name
for a one-way set associative cache.
N is also called the degree of
associativity of the cache.
Figure 8.9 shows the hardware for a C = 8-word, N = 2-way set associative cache. The cache now has only S = 4 sets rather than 8; thus, only log2 4 = 2 set bits rather than 3 are used to select the set.
N-Way Set Associative Cache

[Figure: two-way set associative cache hardware. The 32-bit memory address splits into a 28-bit tag, a 2-bit set field, and a 2-bit byte offset. Each of the two ways stores a valid bit, a 28-bit tag, and 32-bit data; two 28-bit comparators generate Hit1 and Hit0, which drive a multiplexer that selects the 32-bit hit data.]
Set associative caches generally
have lower miss rates than direct
mapped caches of the same capacity,
because they have fewer conflicts.
However, set associative caches are
usually slower and somewhat more
expensive to build because of the
output multiplexer and additional
comparators.
Types of Misses
Compulsory: first time data
accessed
Capacity: cache too small to
hold all data of interest
Conflict: data of interest maps to
same location in cache
Direct Mapped Cache Hardware
Example 8.5 CACHE FIELDS
Find the number of set and tag bits for a direct mapped cache with 1024 (2^10) sets and a one-word block size. The address size is 32 bits.
Solution:
A cache with 2^10 sets requires log2(2^10) = 10 set bits.
The two least significant bits of the address are the byte offset, and the remaining 32 − 10 − 2 = 20 bits form the tag.
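The same arithmetic as a small sketch (the helper name is ours):

```python
# Set/tag/offset widths for Example 8.5: 32-bit addresses, 1024 sets,
# one-word (4-byte) blocks.
def field_widths(address_bits, num_sets, block_bytes):
    set_bits = num_sets.bit_length() - 1       # log2 for powers of two
    offset_bits = block_bytes.bit_length() - 1
    tag_bits = address_bits - set_bits - offset_bits
    return tag_bits, set_bits, offset_bits

print(field_widths(32, 1024, 4))   # (20, 10, 2)
```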
Direct Mapped Cache Performance

# MIPS assembly code
      addi $t0, $0, 5
loop: beq  $t0, $0, done
      lw   $t1, 0x4($0)
      lw   $t2, 0xC($0)
      lw   $t3, 0x8($0)
      addi $t0, $t0, -1
      j    loop
done:

Cache contents after the first iteration (address fields: Tag 00...00, Set 001, Byte Offset 00):
V  Tag      Data             Set Number
0                            7 (111)
0                            6 (110)
0                            5 (101)
0                            4 (100)
1  00...00  mem[0x00...0C]   3 (011)
1  00...00  mem[0x00...08]   2 (010)
1  00...00  mem[0x00...04]   1 (001)
0                            0 (000)

Miss Rate = ?
The three words are loaded into sets 1, 2, and 3 on the first iteration; every later access hits.

Miss Rate = 3/15 = 20%
Temporal Locality
Compulsory Misses
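The access pattern can be replayed with a simple cache model (a sketch; byte addresses and 8 one-word sets, as in the figure):

```python
# Replay the loop's data accesses through a direct-mapped cache with
# 8 one-word sets (word = 4 bytes).
def count_misses(addresses, num_sets=8):
    cache = {}                          # set index -> resident tag
    misses = 0
    for addr in addresses:
        s = (addr // 4) % num_sets
        tag = addr // (4 * num_sets)
        if cache.get(s) != tag:
            misses += 1                 # compulsory or conflict miss
            cache[s] = tag
    return misses, len(addresses)

# Each of the 5 iterations loads words 0x4, 0xC, 0x8.
misses, total = count_misses([0x4, 0xC, 0x8] * 5)
print(misses, total)   # 3 15  -> 20% miss rate (compulsory misses only)
```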
Direct Mapped Cache: Conflict

# MIPS assembly code
      addi $t0, $0, 5
loop: beq  $t0, $0, done
      lw   $t1, 0x4($0)
      lw   $t2, 0x24($0)
      addi $t0, $t0, -1
      j    loop
done:

mem[0x00...04] and mem[0x00...24] both map to set 1 (001) but carry different tags, so each lw evicts the word loaded by the other.

Miss Rate = ?
Every access finds the other word's tag in set 1 and misses.

Miss Rate = 10/10 = 100%
Conflict Misses
N-Way Set Associative Performance

# MIPS assembly code (the same conflicting loop as before)
      addi $t0, $0, 5
loop: beq  $t0, $0, done
      lw   $t1, 0x4($0)
      lw   $t2, 0x24($0)
      addi $t0, $t0, -1
      j    loop
done:

In a two-way set associative cache, mem[0x00...04] and mem[0x00...24] both map to set 1 but can occupy different ways, so only the first access to each word misses.

Miss Rate = 2/10 = 20%
Associativity reduces the conflict misses to compulsory misses.
Fully Associative Cache
A fully associative cache contains a single set with B ways, where B is the number of blocks, so a memory block may be placed anywhere in the cache. It has no conflict misses, but the B tag comparators make the hardware expensive, so fully associative caches are usually small.
Cache Organization Recap
Capacity: C
Block size: b
Number of blocks: B = C/b

Organization        Ways (N)     Sets (S)
Direct Mapped       1            B
Set Associative     1 < N < B    B/N
Fully Associative   B            1
Increasing the associativity, N, usually
reduces the miss rate caused by conflicts.
But higher associativity requires more tag
comparators.
Increasing the block size, b, takes advantage of spatial locality to reduce the miss rate. However, it decreases the number of sets in a fixed-sized cache and therefore could lead to more conflicts. It also increases the miss penalty.
Replacement Algorithms
When a new block is brought into the
cache, one of the existing blocks
must be replaced.
For direct mapping, there is only one
possible line for any particular block,
and no choice is possible.
For the associative and set
associative techniques, a
replacement algorithm is needed.
1- First-In-First-Out (FIFO)
Replace the block that has been in the set longest, regardless of how recently it was used. FIFO is easily implemented as a round-robin (circular buffer) technique.
2- Least Recently Used (LRU)
(time factor)
Replace the block in the set that has been in the cache longest with no reference to it.
For example, in a two-way set associative cache this is easily implemented: each line includes a USE bit. When a line is referenced, its USE bit is set to 1 and the USE bit of the other line in that set is set to 0. When a block is to be read into the set, the line whose USE bit is 0 is replaced.
Because we are assuming that more recently used memory locations are more likely to be referenced, LRU should give the best hit ratio.
LRU Replacement
# MIPS assembly
lw $t0, 0x04($0)
lw $t1, 0x24($0)
lw $t2, 0x54($0)

All three addresses map to set 1 of the two-way set associative cache. The first two loads fill both ways of set 1; the third load replaces the least recently used block, mem[0x00...04], with mem[0x00...54].
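The eviction can be traced with a small LRU model (a sketch; it assumes a 4-set, two-way cache of one-word blocks, i.e. C = 8 words, N = 2, as in Figure 8.9):

```python
# LRU trace for lw 0x04, 0x24, 0x54 in a two-way set associative cache.
def lru_trace(addresses, num_sets=4, ways=2, block_bytes=4):
    sets = [[] for _ in range(num_sets)]   # each list ordered LRU -> MRU
    for addr in addresses:
        s = (addr // block_bytes) % num_sets
        tag = addr // (block_bytes * num_sets)
        if tag in sets[s]:
            sets[s].remove(tag)            # hit: will re-append as MRU
        elif len(sets[s]) == ways:
            sets[s].pop(0)                 # full set: evict the LRU block
        sets[s].append(tag)
    return sets

final = lru_trace([0x04, 0x24, 0x54])
print(final[1])   # [2, 5]: tags of mem[0x24] and mem[0x54] remain;
                  # mem[0x04] (tag 0) was least recently used and evicted
```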
4- Random Replacement
Pick a block at random from among the candidate lines. Simulation studies have shown that random replacement provides only slightly inferior performance to LRU.
Cache Summary
What data is held in the cache?
Recently used data (temporal locality)
Nearby data (spatial locality)
How is data found?
Set is determined by address of data
Word within block also determined by
address
In associative caches, data could be in
one of several ways
What data is replaced?
Least-recently used way in the set