Lecture 18 Memory Hierarchy
Disk: 4x in 3 years, 2x in 10 years

Processor-Memory Performance Gap
[Figure: performance (log scale; ticks 1, 10, 100) versus time (1980 to 2000) for processor and DRAM. Processor: 2X/1.5 yr. DRAM: 9%/yr (2X/10 yrs). The processor-memory performance gap grows 50% / year.]

cs 152 L16.3 DAP Fa97, U.CB

DRAM Statistics
Year     Size      Cycle Time
1980     64 Kb     250 ns
1983     256 Kb    220 ns
1986     1 Mb      190 ns
1989     4 Mb      165 ns
1992     16 Mb     145 ns
1995     64 Mb     120 ns
Ratio    1000:1!   2:1!
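The 1000:1 and 2:1 ratios in the DRAM statistics can be checked directly from the first and last rows (1980: 64 Kb, 250 ns; 1995: 64 Mb, 120 ns). The following is a minimal Python sketch, not part of the original slides. It assumes binary multiples (1 Mb = 1024 Kb), and the function names are illustrative only.

```python
# DRAM generations from the slide: (year, capacity in Kb, cycle time in ns).
# Assumption: 1 Mb = 1024 Kb (binary multiples).
DRAM_GENERATIONS = [
    (1980, 64, 250),
    (1983, 256, 220),
    (1986, 1024, 190),
    (1989, 4096, 165),
    (1992, 16384, 145),
    (1995, 65536, 120),
]


def overall_ratio(first, last):
    """Return (capacity growth, cycle-time improvement) between two rows."""
    _, size0, cycle0 = first
    _, size1, cycle1 = last
    return size1 / size0, cycle0 / cycle1


def annual_rate(ratio, years):
    """Convert a total improvement ratio into a compound annual rate."""
    return ratio ** (1.0 / years) - 1.0


first, last = DRAM_GENERATIONS[0], DRAM_GENERATIONS[-1]
years = last[0] - first[0]
size_ratio, speed_ratio = overall_ratio(first, last)

print(f"capacity:   {size_ratio:.0f}:1 over {years} years "
      f"({annual_rate(size_ratio, years):.0%}/yr)")
print(f"cycle time: {speed_ratio:.2f}:1 over {years} years "
      f"({annual_rate(speed_ratio, years):.1%}/yr)")
```

Under these assumptions the capacity ratio is 1024:1 (five successive 4x steps, one every 3 years), which the slide rounds to 1000:1, while cycle time improves by only about 2:1 over the same 15 years.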
Impact on Performance
° Suppose a processor executes at
   • Clock Rate = 200 MHz (5 ns per cycle)
   • CPI = 1.1
   • 50% arith/logic, 30% ld/st, 20% control
° Suppose that 10% of memory operations get 50 cycle miss penalty
° CPI = ideal CPI + average stalls per instruction
      = 1.1 + ( 0.30 (memory access/ins) x 0.10 (miss/memory access) x 50 (cycle/miss) )
      = 1.1 cycle + 1.5 cycle
      = 2.6
° 58% of the time the processor is stalled waiting for memory!
° a 1% instruction miss rate would add an additional 0.5 cycles to the CPI!
[Pie chart: Ideal CPI (1.1) 35%, Inst Miss (0.5) 16%, DataMiss (1.6) 49%]

The Goal: illusion of large, fast, cheap memory
° Fact: Large memories are slow, fast memories are small
° How do we create a memory that is large, cheap and fast (most of the time)?
   • Hierarchy
   • Parallelism
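The CPI arithmetic on the Impact on Performance slide can be reproduced with a short Python sketch. It is not from the original slides. The function and parameter names are hypothetical, and it assumes one instruction fetch per instruction and the same 50-cycle penalty for instruction and data misses.

```python
def cpi_with_stalls(ideal_cpi, mem_ops_per_instr, data_miss_rate,
                    miss_penalty, instr_miss_rate=0.0):
    """Return (total CPI, data stall cycles, instruction stall cycles).

    Stall cycles are per instruction. Assumes one instruction fetch per
    instruction and an identical miss penalty for fetches and data accesses.
    """
    data_stalls = mem_ops_per_instr * data_miss_rate * miss_penalty
    instr_stalls = instr_miss_rate * miss_penalty
    return ideal_cpi + data_stalls + instr_stalls, data_stalls, instr_stalls


# Values from the slide: ideal CPI 1.1, 30% ld/st, 10% of memory
# operations miss, 50-cycle miss penalty.
cpi, data_stalls, instr_stalls = cpi_with_stalls(1.1, 0.30, 0.10, 50)
stall_fraction = data_stalls / cpi

print(f"CPI = {cpi:.1f}")                       # 1.1 + 1.5
print(f"stalled on memory: {stall_fraction:.0%}")

# Adding a 1% instruction miss rate on top of the data misses.
cpi2, _, instr_stalls2 = cpi_with_stalls(1.1, 0.30, 0.10, 50,
                                         instr_miss_rate=0.01)
print(f"extra CPI from instruction misses = {instr_stalls2:.1f}")
```

With the slide's numbers this gives a CPI of 2.6, of which 1.5 cycles (about 58%) are memory stalls, and a 1% instruction miss rate contributes a further 0.5 cycles per instruction.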
[Figure: processor (Control, Datapath) connected to a series of Memory blocks; surviving labels include Level, Memory, and Cache.]
[Figure: probability of reference plotted over the address space, from 0 to 2^n - 1.]
[Figure: SRAM 32K × 8 block symbol with Address, Chip select, Output enable, Write enable, and Din[7–0] inputs and Dout[7–0] output; the data buses are 8 bits wide.]
[Figure: three-state buffers, each with Select (Enable), Data (In), and Out, sharing one output line.]
[Figure: SRAM array built from D latches (D, C, Enable, Q) in rows 0 to 3, selected by a 2-to-4 decoder on Address, with Write enable, Din, and Dout.]
32K X 8 SRAM
[Figure: Address [14–6] drives a 9-to-512 decoder that selects one of 512 rows in eight 512 × 64 SRAM arrays; Address [5–0] controls a Mux on the 64 outputs of each array, producing Dout.]

Dynamic RAM (DRAM)
[Figure: DRAM cell with Word line, Pass transistor, Capacitor, and Bit line.]

4M X 1 DRAM
[Figure: Address feeds an 11-to-2048 row decoder for a 2048 × 2048 array.]
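The address split in the 32K X 8 SRAM figure (Address [14–6] to the 9-to-512 decoder, Address [5–0] to the Mux) can be illustrated with a small Python sketch. It is not from the original slides, and `split_address` is a hypothetical helper. For the 4M X 1 DRAM, the slide labels only the 11-to-2048 row decoder; the 11-bit column field below is an assumption based on the 2048 × 2048 array.

```python
def split_address(addr, row_bits, col_bits):
    """Split an address into (row, column) fields.

    The upper row_bits select a row (decoder input); the lower col_bits
    select a column within that row (mux select).
    """
    total_bits = row_bits + col_bits
    if not 0 <= addr < (1 << total_bits):
        raise ValueError(f"address must fit in {total_bits} bits")
    row = addr >> col_bits
    col = addr & ((1 << col_bits) - 1)
    return row, col


# 32K x 8 SRAM: Address[14-6] -> 9-to-512 decoder, Address[5-0] -> Mux.
SRAM_ROW_BITS, SRAM_COL_BITS = 9, 6

# 4M x 1 DRAM: 11-to-2048 row decoder (from the slide);
# 11 column bits are assumed from the 2048-wide array.
DRAM_ROW_BITS, DRAM_COL_BITS = 11, 11

sram_row, sram_col = split_address(0x7FFF, SRAM_ROW_BITS, SRAM_COL_BITS)
print(sram_row, sram_col)  # highest SRAM address: row 511, column 63

dram_row, dram_col = split_address(2048 * 5 + 7, DRAM_ROW_BITS, DRAM_COL_BITS)
print(dram_row, dram_col)  # row 5, column 7
```

The bit counts are consistent with the capacities: 9 + 6 = 15 address bits cover 32K locations, and 11 + 11 = 22 bits cover 4M locations.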