Lecture4
SRAM - Basics
Jaydeep P. Kulkarni
Slides are adapted from those of Prof. Kaushik Roy, Prof. Chris Kim, Prof. Saibal
Mukhopadhyay, and Prof. Jan Rabaey
Semiconductor Memory Trends
From [Itoh01]
Variation in Process Parameters
[Figure: number of dopant atoms (log scale, 10 to 10000) versus technology node (1000, 500, 250, 130, 65, 32 nm). Source: Intel]
[Figure panels: Line Edge Roughness (LER); Random Dopant Fluctuation (RDF)]
• LER, RDF induced device mismatch
• Inter- and intra-die variations
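To see why a falling dopant count matters, the sketch below assumes the number of dopant atoms in a channel follows Poisson statistics, so the relative spread is 1/sqrt(N). The Poisson model and the function name are illustrative assumptions, not taken from the slides; the dopant counts are the axis values of the plot.

```python
import math

def relative_dopant_fluctuation(mean_dopants: float) -> float:
    """Relative spread (sigma / mean) of a Poisson-distributed dopant count.

    Assumption: dopant placement is random, so the count is Poisson and
    sigma = sqrt(mean).
    """
    if mean_dopants <= 0:
        raise ValueError("mean_dopants must be positive")
    return 1.0 / math.sqrt(mean_dopants)

# Axis values from the dopant-count plot.
for n in (10000, 1000, 100, 10):
    print(f"{n:>6} dopants -> sigma/mean = {relative_dopant_fluctuation(n):.1%}")
```

Under this assumption the spread grows from 1% at 10000 atoms to 10% at 100 atoms, which is one way to read the RDF-induced mismatch bullet.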
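Parametric Failures: Read Failure
[Figure: 6T SRAM cell with transistors PL, PR, AXL, AXR, NL, NR, bit lines BL and BR, and word line WL, storing VL = '1' and VR = '0'; the transistors are annotated with +∆ or -∆ shifts]
[Figure: voltage versus time for WL, VL, and VR during a read; VR rises from '0' to VREAD, shown against the trip point VTRIPRD]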
Read failure => Flipping of Cell Data while Reading
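A minimal sketch of how such a failure could be estimated statistically. It assumes, from the VREAD and VTRIPRD labels in the figure, that the cell flips when the '0' node voltage during a read exceeds the trip point. The Gaussian spread and all function names are hypothetical.

```python
import random

def read_fails(v_read: float, v_trip: float) -> bool:
    """Assumed failure condition: the '0' node rises above the trip point."""
    return v_read > v_trip

def estimate_read_failure_rate(v_read_mean: float, v_trip_mean: float,
                               sigma: float, trials: int = 10000,
                               seed: int = 0) -> float:
    """Monte Carlo estimate with independent Gaussian variation on both voltages.

    sigma is a hypothetical stand-in for the effect of device mismatch.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        v_read = rng.gauss(v_read_mean, sigma)
        v_trip = rng.gauss(v_trip_mean, sigma)
        failures += read_fails(v_read, v_trip)
    return failures / trials
```

For example, `estimate_read_failure_rate(0.2, 0.4, 0.05)` uses made-up voltages in volts; a larger `sigma` gives a higher estimated failure rate.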
Mechanisms of Parametric Failures
[Figure: voltage waveforms for the failure mechanisms. Access Failure: BL, BR, and WL, with ∆MIN marked. Read Failure: WL, VL, and VR. Remaining panels: WL, VL, VR, and VDDH]
Tall-cell
Thin-cell
M. Ishida, IEDM 98
Tall Cell Layout
Thin Cell Layout
From [Itoh01]
Array-Structured Memory Architecture
Problem: ASPECT RATIO or HEIGHT >> WIDTH
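[Figure: array-structured memory. A row decoder driven by address bits A_K, A_(K+1), ..., A_(L-1) selects one of 2^(L-K) word lines; each word line spans M·2^K storage cells on the bit lines; a column decoder driven by A_0, ..., A_(K-1) selects the appropriate M-bit word for the input-output]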
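The aspect-ratio problem can be illustrated with a short calculation. The sketch assumes the array is folded as in the figure: of the L address bits, K go to the column decoder, giving 2^(L-K) rows of M·2^K cells. The function and parameter names are illustrative.

```python
def array_shape(address_bits: int, column_bits: int, word_bits: int) -> tuple[int, int]:
    """Return (rows, columns) of the cell array.

    address_bits = L, column_bits = K, word_bits = M in the figure's notation.
    """
    if not 0 <= column_bits <= address_bits:
        raise ValueError("column_bits must lie between 0 and address_bits")
    rows = 2 ** (address_bits - column_bits)
    cols = word_bits * 2 ** column_bits
    return rows, cols

# 1024 words of 8 bits: no folding gives a very tall array,
# folding with K = 4 gives a far more balanced one.
print(array_shape(10, 0, 8))  # (1024, 8)
print(array_shape(10, 4, 8))  # (64, 128)
```

The total number of cells is the same in both cases; only the split of the address between row and column decoders changes.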
[Figure: memory partitioned into blocks; the address is split into row address, column address, and block address, with I/O]
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block => power savings
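A minimal sketch of the address split behind the second advantage. The field order (block bits most significant, column bits least significant) and the function names are assumptions; the slides only name the three fields.

```python
def split_address(addr: int, row_bits: int, col_bits: int, block_bits: int) -> tuple[int, int, int]:
    """Split a flat address into (block, row, column) fields.

    Assumed layout, most to least significant: block | row | column.
    """
    col = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    block = (addr >> (col_bits + row_bits)) & ((1 << block_bits) - 1)
    return block, row, col

def block_enables(block: int, block_bits: int) -> list[int]:
    """One-hot enable vector: only the addressed block is activated."""
    return [int(i == block) for i in range(1 << block_bits)]

block, row, col = split_address(0b10_0110_011, row_bits=4, col_bits=3, block_bits=2)
print(block, row, col)          # 2 6 3
print(block_enables(block, 2))  # [0, 0, 1, 0]
```

Only one entry of the enable vector is set, so the word lines and sense circuits of the other blocks stay idle.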
Memory Architecture: Decoders
[Figure, left: N words of M bits (Word 0 to Word N-1), each row of storage cells enabled by its own select signal S0 to S_(N-1); input-output of M bits]
Intuitive architecture for N x M memory
Too many select signals:
N words == N select signals
[Figure, right: the same array with a decoder, driven by address bits A0 to A_(K-1), generating the word selects; input-output of M bits]
Decoder reduces the number of select signals
K = log2 N
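The relation K = log2 N and the decoder's job can be sketched as follows. This is a behavioral model only, with illustrative names; it says nothing about the circuit implementation.

```python
def address_bits(num_words: int) -> int:
    """Number of address bits K needed to select one of num_words words."""
    if num_words < 1:
        raise ValueError("num_words must be at least 1")
    return (num_words - 1).bit_length()

def decode(address: int, num_words: int) -> list[int]:
    """Behavioral decoder: turn a K-bit address into N one-hot select signals."""
    if not 0 <= address < num_words:
        raise ValueError("address out of range")
    return [int(i == address) for i in range(num_words)]

print(address_bits(1024))  # 10 address lines replace 1024 external selects
print(decode(2, 4))        # [0, 0, 1, 0]
```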
Hierarchical Decoders
Multi-stage implementation improves performance
[Figure: NAND decoder using 2-input pre-decoders. Address bits A0, A1 and A2, A3 (true and complement) are pre-decoded in pairs, and the pre-decoded lines are combined to drive the word lines WL0, WL1, ...]
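A behavioral sketch of the two-stage idea for a 4-bit address: each pair of address bits is pre-decoded into four one-hot lines, and each word line combines one line from each pre-decoder. Function names are illustrative, and active-high logic is used for readability rather than the NAND polarity of the figure.

```python
def predecode_pair(a_low: int, a_high: int) -> list[int]:
    """First stage: one-hot 2-to-4 pre-decode of two address bits."""
    index = (a_high << 1) | a_low
    return [int(i == index) for i in range(4)]

def word_lines(address: int) -> list[int]:
    """Second stage: 16 word lines, each a 2-input AND of pre-decoded lines."""
    a0, a1, a2, a3 = [(address >> i) & 1 for i in range(4)]
    low = predecode_pair(a0, a1)    # decodes A0, A1
    high = predecode_pair(a2, a3)   # decodes A2, A3
    return [high[i >> 2] & low[i & 3] for i in range(16)]

print(word_lines(5).index(1))  # 5
```

Each final gate needs only two inputs instead of four, which is the source of the performance benefit the slide mentions.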
Dynamic Decoders
[Figure: dynamic decoders with precharge devices clocked by φ, inputs A0, A1 and their complements, supplies VDD and GND, and word-line outputs WL0 to WL3]
[Figure: 2-input NOR decoder with inputs A0 and A1 generating the select signals S0 to S3]
Advantages:
1. Speed (tpd does not add to overall memory access time)
2. Only one extra transistor in signal path
Disadvantage: large transistor count
Tree based column decoder
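[Figure: tree decoder connecting bit lines BL0 to BL3 to the data line D through pass transistors controlled by A0, A1 and their complements]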
Number of devices drastically reduced
Delay increases quadratically with # of sections; prohibitive for large decoders
Solutions: buffers, progressive sizing, combination of tree and pass transistor approaches
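The device count and the quadratic delay claim can be checked with a small model. The sketch assumes a binary tree with one pass transistor per branch at each level, and models the signal path as a chain of identical RC sections evaluated with the Elmore delay. The R and C values are placeholders, not figures from the slides.

```python
def tree_device_count(address_bits: int) -> int:
    """Pass transistors in a binary tree decoder for 2**address_bits bit lines.

    Level i (counted from the data line) holds 2**i devices.
    """
    return sum(2 ** level for level in range(1, address_bits + 1))

def elmore_chain_delay(sections: int, r_ohm: float, c_farad: float) -> float:
    """Elmore delay of a chain of identical RC sections.

    The node after section i sees resistance i * R, so the delay is
    R * C * (1 + 2 + ... + sections) = R * C * sections * (sections + 1) / 2.
    """
    return r_ohm * c_farad * sections * (sections + 1) / 2

print(tree_device_count(2))  # 6 devices for the 4 bit lines in the figure
for k in (1, 2, 4, 8):
    # Hypothetical 10 kOhm and 2 fF per section.
    print(k, elmore_chain_delay(k, 10e3, 2e-15))
```

Doubling the number of series sections roughly quadruples the delay, which is why buffers or progressive sizing are listed as solutions.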
Sense Amplifiers
$t_p = \dfrac{C \cdot \Delta V}{I_{av}}$
C is large and I_av is small, so make ∆V as small as possible.
[Figure: sense amplifier (s.a.) taking a small transition at its input and producing the output]
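A quick numeric illustration of the delay relation t_p = C·∆V / I_av. The bit-line capacitance and cell current below are made-up round numbers chosen only to show the scaling with ∆V.

```python
def sense_delay(capacitance_f: float, delta_v: float, i_avg_a: float) -> float:
    """Time to develop a swing delta_v on capacitance C with average current I_av."""
    if i_avg_a <= 0:
        raise ValueError("i_avg_a must be positive")
    return capacitance_f * delta_v / i_avg_a

C_BITLINE = 1e-12   # hypothetical 1 pF bit line
I_CELL = 100e-6     # hypothetical 100 uA average cell current

full_swing = sense_delay(C_BITLINE, 1.0, I_CELL)    # about 10 ns
small_swing = sense_delay(C_BITLINE, 0.1, I_CELL)   # about 1 ns
print(f"full swing:  {full_swing * 1e9:.1f} ns")
print(f"small swing: {small_swing * 1e9:.1f} ns")
```

With C and I_av fixed by the array, sensing a tenth of the swing takes a tenth of the time, which is the motivation for the sense amplifier.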
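[Figure: differential sense amplifier with transistors M1 to M5, differential inputs bit and its complement, sense enable SE, internal node y, and output Out]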
Differential Sensing ― SRAM
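[Figure (a): bit-line pair BL, with precharge PC to VDD and equalization EQ; SRAM cell i selected by WL_i; differential sense amplifier on node pair x, enabled by SE, producing Output]
[Figure (b): two differential stages built from M1 to M5 on node pairs x and y, each enabled by SE and supplied from VDD, producing Output]
(a) SRAM sensing scheme (b) two stage differential amplifier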
Latch-Based Sense Amplifier
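[Figure: latch-based sense amplifier connected to the bit-line pair BL, with equalize signal EQ, supply VDD, and sense-enable signals SE]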