Memory Hierarchy
Memory density and capacity have grown along with CPU power and
complexity, but memory speed has not kept pace.
Why Care About Memory Hierarchy?
Processor-DRAM Performance Gap grows 50% / year
[Figure: performance (log scale, 1 to 1000) versus time, 1980 to 2000.
CPU curve (“Moore’s Law”): 60%/year (2X/1.5 years).
DRAM curve: 9%/year (2X/10 years).]
The Need for a Memory Hierarchy
The widening speed gap between CPU and main memory
Examples of locality:
        Spatial           Temporal
Data    arrays            loop counters
Code    no branch/jump    loop
Memory Hierarchy: Idea
Temporal locality: keep recently accessed data
items closer to processor.
Spatial locality: move contiguous words in
memory to upper levels of hierarchy.
Use smaller and faster memory technologies
closer to the processor
If hit rate is high enough, hierarchy has access
time close to the highest (fastest) level and size
equal to the lowest (largest) level.
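The two kinds of locality can be measured on an address trace. The sketch below is only an illustration: the function names, the traces, and the block size of 4 words are assumptions, not part of the slides. It reports how often an access reuses an earlier address (temporal) and how often it falls in the same block as the previous access (spatial).

```python
def reuse_fraction(trace):
    """Fraction of accesses that touch an address seen earlier (temporal locality)."""
    seen = set()
    reused = 0
    for addr in trace:
        if addr in seen:
            reused += 1
        seen.add(addr)
    return reused / len(trace)


def same_block_fraction(trace, block_words):
    """Fraction of accesses in the same block as the previous access (spatial locality)."""
    same = 0
    for prev, cur in zip(trace, trace[1:]):
        if prev // block_words == cur // block_words:
            same += 1
    return same / (len(trace) - 1)


# Hypothetical word-address traces, chosen only for illustration.
array_walk = list(range(100, 116))   # a[0..15] read once, in order
loop_counter = [500] * 16            # the same word read 16 times
```

With these traces, the array walk never reuses an address (reuse fraction 0.0) but stays in the same 4-word block for 12 of 15 consecutive pairs (0.8), while the loop counter reuses its address on 15 of 16 accesses (0.9375).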
Memory Hierarchy: Terminology
Hit: data appears in some block in the upper level.
Hit rate: the fraction of memory accesses found in the
upper level.
Analogy: fraction of time you find the book on desk.
Miss: data is not at upper level and needs to be
retrieved from a block in the lower level.
Miss rate: 1 – Hit rate
Analogy: fraction of time you need to go to shelves for
the book.
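Hit rate and miss rate can be counted directly on an address trace. The following is a minimal sketch, assuming an upper level that holds a fixed number of blocks and evicts the least recently used one; the replacement policy, the function name, and the parameters are illustrative assumptions, since the slides do not specify them.

```python
from collections import OrderedDict


def simulate_upper_level(trace, capacity_blocks, block_words):
    """Return (hit_rate, miss_rate) for a trace of word addresses.

    Assumption: LRU replacement over whole blocks.
    """
    upper = OrderedDict()   # block number -> True, ordered oldest to newest use
    hits = 0
    for addr in trace:
        block = addr // block_words
        if block in upper:
            hits += 1
            upper.move_to_end(block)        # mark as most recently used
        else:
            if len(upper) >= capacity_blocks:
                upper.popitem(last=False)   # evict the least recently used block
            upper[block] = True             # bring the block into the upper level
    hit_rate = hits / len(trace)
    return hit_rate, 1.0 - hit_rate
```

For example, a sequential walk over 32 words with 4-word blocks misses once per block and hits on the other three words, so simulate_upper_level(list(range(32)), 2, 4) gives a hit rate of 0.75 and a miss rate of 0.25.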
Memory Hierarchy: Terminology (2)
(Average) Hit time: time to access the upper level,
which consists of the
time to determine hit/miss + memory access time.
Analogy: time to find and pick up book from desk.
(Average) Miss penalty: time to replace a block in
the upper level + time to deliver the block to the
processor.
Analogy: time to go to shelves, find needed book, and
return to your desk.
Hit time << Miss penalty.
Current Memory Hierarchy
Levels, from closest to the processor to farthest:
Processor
Register
Primary cache (L1)
Secondary cache (L2)
Main memory
Magnetic disk (secondary memory)
Size increases moving down the list; speed and cost per bit
increase moving up the list.
Current Memory Hierarchy
[Figure: block diagram. Processor (control, datapath, regs), L1 $,
L2 cache, main memory, secondary memory.]
Average Memory Access Time
• Average memory-access time
= Hit time + Miss rate × Miss penalty
• Miss penalty: time to fetch a block from the lower
memory level
– access time: function of latency
– transfer time: function of bandwidth between levels
• Transfer one “cache line/block” at a time
• Transfer at the size of the memory-bus width
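The formula above can be written as a short calculation. This is a sketch under stated assumptions: the miss penalty is modeled as access time plus block size divided by bus width (one bus-width transfer per cycle), and all function and parameter names are hypothetical.

```python
def miss_penalty(access_cycles, block_bytes, bus_bytes_per_cycle):
    """Miss penalty in cycles: latency to reach the lower level plus
    the time to transfer one block over the bus (assumed additive)."""
    transfer_cycles = block_bytes / bus_bytes_per_cycle
    return access_cycles + transfer_cycles


def amat(hit_time, miss_rate, penalty):
    """Average memory-access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * penalty
```

For example, with a hit time of 1 clk, a miss rate of 2%, and a miss penalty of 300 clks, the average is 1 + 0.02 × 300 = 7 clks. Under the assumed penalty model, a 100-cycle access time with a 64-byte block on an 8-byte-per-cycle bus gives a penalty of 100 + 8 = 108 cycles.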
Memory Hierarchy Performance
[Figure: L1 (first-level cache), L2, and L3 (third-level cache),
labeled on-die, followed by main memory (DRAM). Latencies shown:
1 clk at the first-level cache, 300 clks at main memory.]
Relationship of Caches and Pipeline
[Figure: pipelined datapath (pipeline registers ID/EX, EX/MEM, MEM/WB)
in which instruction fetch goes through an instruction cache (I-$) and
data accesses go through a data cache (D-$), both backed by memory.]
Hardware cache memories
Cache memories are small, fast SRAM-based memories
managed automatically in hardware
– Hold frequently accessed blocks of main memory
CPU looks first for data in L1, then in main memory
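The lookup order described above (L1 first, then main memory) can be sketched in a few lines. This is an illustrative model only: the class name, the 4-word block size, and the absence of any capacity limit or eviction are assumptions, not a description of real hardware.

```python
class TinyCache:
    """Toy L1 model: check the cache first, fall back to main memory on a miss."""

    def __init__(self, main_memory, block_words=4):
        self.main_memory = main_memory   # list of words standing in for DRAM
        self.block_words = block_words
        self.blocks = {}                 # block number -> list of words (no eviction)
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        block_no, offset = divmod(addr, self.block_words)
        if block_no in self.blocks:
            self.hits += 1               # found in L1
        else:
            self.misses += 1             # not in L1: fetch the whole block
            start = block_no * self.block_words
            self.blocks[block_no] = self.main_memory[start:start + self.block_words]
        return self.blocks[block_no][offset]
```

Reading address 5 from a fresh cache misses and brings in words 4 through 7, so a following read of address 6 hits without touching main memory.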
Typical system structure:
[Figure: CPU chip containing the register file, ALU, L1 cache, and
bus interface; the bus interface connects over a bus to main memory.]