CPU Logic Design 11 Cache
Li Yamin
Department of Computer Science
Faculty of Computer and Information Sciences
Hosei University, Tokyo 184-8584 Japan
http://cis.k.hosei.ac.jp/~yamin/
[Figure: CPU and memory curves plotted by year, 1980 to 2002, on a logarithmic vertical axis from 1 to 100,000]
Memory hierarchy
[Figure: memory hierarchy ordered by speed and size; levels shown include main memory, disk, and tape (slowest, biggest)]
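[Figure: processor connected to memory through an instruction cache and a data cache; the PC addresses (A) the instruction cache, whose data output (DO) feeds the IR; the data address (DA) addresses the data cache, with data-in (DI) and data-out (DO) paths between processor, caches, and memory]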
Cache parameters
  Size
    Cache Size
    Block Size
  Cache Placement
    Physical Address Cache
    Virtual Address Cache
  Mapping Algorithms
    Direct Mapping
    Set-Associative Mapping
    Fully Associative Mapping
  Memory Update Mechanism
    Write Back
    Write Through
  Cache Write Miss
    Write Allocate
    No-Write Allocate
  Replacement Algorithms
    Random Replacement Algorithm
    LRU Replacement Algorithm
[Figure: block placement; a cache shown above a memory whose blocks are numbered 0 to 31]
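[Figure: direct-mapped cache organization. The address consists of a block address (Tag <21>, Index <8>) and a block offset <5>. Each of the 256 blocks holds Valid <1>, Tag <21>, and Data <256>. A comparator (=?) checks the stored tag against the address tag, a mux selects the 64-bit data out, and a write buffer holds data in.]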
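The address split of a direct-mapped cache with a 21-bit tag, an 8-bit index, and a 5-bit block offset can be sketched as follows. This is a minimal illustration, not the slide's hardware: the names `split_address` and `DirectMappedCache` are hypothetical, and the model tracks only valid bits and tags, not data.

```python
# Field widths from the figure: tag <21>, index <8>, block offset <5>.
TAG_BITS, INDEX_BITS, OFFSET_BITS = 21, 8, 5


def split_address(addr):
    """Split an address into (tag, index, offset) using the widths above."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = (addr >> (OFFSET_BITS + INDEX_BITS)) & ((1 << TAG_BITS) - 1)
    return tag, index, offset


class DirectMappedCache:
    """Hypothetical model: 256 blocks, each with a valid bit and a tag."""

    def __init__(self):
        self.valid = [False] * (1 << INDEX_BITS)
        self.tags = [0] * (1 << INDEX_BITS)

    def access(self, addr):
        """Return True on a hit; on a miss, fill the block and return False."""
        tag, index, _ = split_address(addr)
        if self.valid[index] and self.tags[index] == tag:
            return True
        # Miss: the block at this index is simply overwritten,
        # since a direct-mapped cache has exactly one candidate block.
        self.valid[index] = True
        self.tags[index] = tag
        return False
```

Two addresses that differ only in their tag bits map to the same block and evict each other, which is the conflict behavior that set-associative mapping reduces.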
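[Figure: set-associative cache organization. The address consists of a block address (Tag <22>, Index <7>) and a block offset <5>. The cache has 128 blocks per way and two tag comparators (=?), each followed by a mux; a final mux selects the 64-bit data out. A write buffer sits between the cache and the lower-level memory.]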
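A set-associative lookup with a 22-bit tag, a 7-bit index, and a 5-bit block offset might be modeled as below. This is a sketch under assumptions: two ways per set (matching the two comparators in the figure), LRU replacement within a set, and a hypothetical class name `TwoWayCache`. Data storage is omitted.

```python
TAG_BITS, INDEX_BITS, OFFSET_BITS = 22, 7, 5
WAYS = 2  # assumption: two ways per set


class TwoWayCache:
    """Hypothetical two-way set-associative cache model (tags only)."""

    def __init__(self):
        # Each set is a list of tags ordered from least to most recently used.
        self.sets = [[] for _ in range(1 << INDEX_BITS)]

    def access(self, addr):
        """Return True on a hit; on a miss, install the block and return False."""
        index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
        tag = (addr >> (OFFSET_BITS + INDEX_BITS)) & ((1 << TAG_BITS) - 1)
        ways = self.sets[index]
        if tag in ways:
            # Hit: move the tag to the most recently used position.
            ways.remove(tag)
            ways.append(tag)
            return True
        if len(ways) == WAYS:
            ways.pop(0)  # set is full: evict the least recently used tag
        ways.append(tag)
        return False
```

With two ways, two blocks that share an index can reside in the cache at the same time; a third block with the same index forces an eviction.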
Random Strategy:
To spread allocation uniformly, candidate blocks are
randomly selected.
Some systems generate pseudo-random block
numbers to get reproducible behavior, which is
particularly useful when debugging hardware.
A virtue of random replacement is that it is simple to
build in hardware.
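The reproducibility point can be illustrated in software with a seeded pseudo-random generator. This is a minimal sketch, not a hardware design: the class name `RandomReplacementSet` and the choice to fill empty ways before evicting are assumptions made for illustration.

```python
import random


class RandomReplacementSet:
    """Hypothetical model of one cache set with random replacement."""

    def __init__(self, ways, seed=None):
        self.tags = [None] * ways       # None marks an empty way
        self.rng = random.Random(seed)  # same seed gives the same victim sequence

    def access(self, tag):
        """Return True on a hit; on a miss, replace a way and return False."""
        if tag in self.tags:
            return True
        if None in self.tags:
            victim = self.tags.index(None)  # assumption: fill empty ways first
        else:
            victim = self.rng.randrange(len(self.tags))  # random victim
        self.tags[victim] = tag
        return False


def trace(seed, refs, ways=4):
    """Run a reference sequence and return the list of hit/miss results."""
    cache_set = RandomReplacementSet(ways, seed)
    return [cache_set.access(t) for t in refs]
```

Calling `trace` twice with the same seed and reference sequence yields the same hit/miss pattern, which is what makes a failing run repeatable during debugging.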
[Figure: array contents; first row 1 0 2 3 0 3 1 2, all other rows 0]
[Figure: cache lookup datapath; word address 2AE1 with select signals S1 and S0; two parallel paths, each with a comparator (COMP), a Select signal, a MUX, and an OutputEnable signal]
Sub-block placement (one valid bit V per sub-block):
Tag   V  V  V  V
100   1  1  1  1
300   1  1  0  0
200   0  1  0  1
204   0  0  0  0
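With sub-blocks, a hit requires both a tag match and a set valid bit for the requested sub-block. The sketch below uses the tag and valid-bit values from the table; looking blocks up by tag alone (ignoring the index) and the function name `is_hit` are simplifying assumptions.

```python
# Tag -> valid bits for sub-blocks 0..3, copied from the table.
blocks = {
    100: [1, 1, 1, 1],
    300: [1, 1, 0, 0],
    200: [0, 1, 0, 1],
    204: [0, 0, 0, 0],
}


def is_hit(tag, sub_block):
    """Hit only if the tag is present and that sub-block's valid bit is 1."""
    valid = blocks.get(tag)
    return valid is not None and valid[sub_block] == 1
```

For example, tag 300 matches a stored block, but an access to its sub-block 2 still misses because that valid bit is 0; only the missing sub-block needs to be fetched.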
Example:
Tc = 1ns, Tm = 50ns, h = 95%, m = 1 - h = 5%
T = Tc + m × Tm = 1 + 0.05 × 50 = 1 + 2.5 = 3.5ns
Speedup = Tm/T = 50/3.5 = 14.29 (1429%)
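The example can be checked numerically with a few lines of code. The function name `average_access_time` is hypothetical; it assumes the miss rate is 1 - h and that every miss costs the full memory access time Tm on top of Tc.

```python
def average_access_time(tc, tm, hit_rate):
    """Average access time T = Tc + m * Tm, with miss rate m = 1 - h."""
    miss_rate = 1.0 - hit_rate
    return tc + miss_rate * tm


# Values from the example: Tc = 1 ns, Tm = 50 ns, h = 95%.
t = average_access_time(1.0, 50.0, 0.95)
speedup = 50.0 / t  # compared with accessing memory directly on every reference

print(f"T = {t:.1f} ns, speedup = {speedup:.2f}")
```

Changing `hit_rate` shows how sensitive the average is to misses: because Tm is 50 times Tc, each percentage point of miss rate adds 0.5 ns.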