C411L23MemoryCore 1

The document discusses memory cell designs focusing on SRAM and DRAM technologies, detailing their structures, read/write operations, and performance considerations. It highlights the importance of cell sizing, voltage ratios, and the use of sense amplifiers for efficient operation. Additionally, it covers various types of memory cells, including 6-transistor SRAM, 3-transistor DRAM, and their respective layouts and functionalities.

Uploaded by

bhaliyamansing81

CMPEN 411

VLSI Digital Circuits


Spring 2011

Lecture 23: Memory Cell Designs


SRAM, DRAM

[Adapted from Rabaey's Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]

Sp11 CMPEN 411 L23 S.1


Heads-up
• IBM's Kerry Bernstein gives a talk Thursday at 4 PM in IST 333
  ◦ To prepare for his talk, go to the ANGEL system and find the file "New dimensions in performance" under "interesting reading materials"
• To make up the last cancelled lecture:
  ◦ Kerry Bernstein's talk "Microarchitecture's Race for Performance and Power" (PSU talk, 11/2004); slides and videos are online in the ANGEL system under "Interesting Reading Materials"
• DAC Young Student Scholarship
  ◦ www.dac.com

Sp11 CMPEN 411 L23 S.2


Review: Basic Building Blocks
• Datapath
  ◦ Execution units
    - Adder, multiplier, divider, shifter, etc.
  ◦ Register file and pipeline registers
  ◦ Multiplexers, decoders
• Control
  ◦ Finite state machines (PLA, ROM, random logic)
• Interconnect
  ◦ Switches, arbiters, buses
• Memory
  ◦ ROM, Caches (SRAMs), CAM, DRAMs, buffers

Sp11 CMPEN 411 L23 S.3


2D 4x4 SRAM Memory Bank
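[Figure: 2D 4x4 SRAM memory bank. Labels: read precharge, bit line precharge, enable; address bits A1 and A2 with word lines WL[0] to WL[3]; bit line pairs BL/!BL (BLi, BLi+1); 2-bit words; address bit A0 into the column decoder; sense amplifiers; write circuitry; clocking and control.]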

Sp11 CMPEN 411 L23 S.4


6-Transistor SRAM Storage Cell

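[Figure: 6-transistor SRAM cell. Two cross-coupled inverters (M1/M2 and M3/M4) hold !Q and Q; access transistors M5 and M6, gated by WL, connect !Q to !BL and Q to BL. In the state shown, Q = 1 and !Q = 0, so M1 and M4 are on and M2 and M3 are off.]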

Sp11 CMPEN 411 L23 S.5


SRAM Cell Analysis (Read)
[Figure: read of a cell storing Q=1, !Q=0 with WL=1. Both bit lines (capacitance Cbit each) are precharged to 2.5V; !BL discharges from 2.5V toward 0 through M5 and M1, while BL stays at 2.5V.]

• Read-disturb (read-upset): the voltage rise on !Q must be limited to prevent read upsets while simultaneously maintaining acceptable circuit speed and area
  ◦ M1 must be stronger than M5 when storing a 1 (as shown)
  ◦ M3 must be stronger than M6 when storing a 0
Sp11 CMPEN 411 L23 S.6
Read Voltage Ratios
[Plot: voltage rise on !Q (0 to 1.2 V) versus Cell Ratio (CR, 0 to 3), for VDD = 2.5V, VTn = 0.4V]

CR is the Cell Ratio, CR = (W1/L1)/(W5/L5) ≥ 1.2

• Keep cell size minimal while maintaining read stability
  ◦ Make M1 minimum size and increase the L of M5 (to make it weaker)
    - increases load on WL
  ◦ Make M5 minimum size and increase the W of M1 (to make it stronger)
• Similar constraints on (W3/L3)/(W6/L6) when storing a 0
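
The dependence of the !Q voltage rise on the cell ratio can be sketched numerically. The model below is an assumption, not taken from the slide: it equates the current of a velocity-saturated M5 with that of M1 in the linear region at node !Q. `vdsat_n = 0.63` V is an assumed device value; VDD and VTn are the slide's values. Under these assumptions the rise falls below the 0.4V threshold somewhere between CR = 1.0 and CR = 1.2, in line with the CR ≥ 1.2 guideline.

```python
import math


def read_voltage_rise(cr, vdd=2.5, vtn=0.4, vdsat_n=0.63):
    """Voltage rise on !Q during a read, as a function of cell ratio cr.

    Assumed model (not from the slide): M5 is velocity-saturated, M1 is in
    the linear region, and their currents are equal at node !Q.
    vdsat_n is an assumed device parameter.
    """
    a = vdd - vtn
    root = math.sqrt(vdsat_n ** 2 * (1 + cr) + (cr * a) ** 2)
    return (vdsat_n + cr * a - root) / cr


# Tabulate the rise for a few cell ratios.
for cr in (0.5, 1.0, 1.2, 2.0, 3.0):
    print(f"CR = {cr:3.1f}  ->  rise on !Q = {read_voltage_rise(cr):.3f} V")
```

A larger CR (stronger M1 relative to M5) gives a smaller rise, which is the trend the plot on this slide shows.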

Sp11 CMPEN 411 L23 S.7


SRAM Cell Analysis (Write)
[Figure: write of a 0 into a cell storing Q=1, !Q=0 with WL=1. !BL is held at 2.5V and BL is driven to 0V, so Q is pulled from 1 toward 0 through M6. Each bit line has capacitance Cbit.]

• The !Q side of the cell cannot be pulled high enough to ensure writing of a 0 (because M1 is on and sized to protect against read upset). So the new value of the cell has to be written through M6.
  ◦ M6 must be able to overpower M4 when storing a 1 and writing a 0
  ◦ M5 must be able to overpower M2 when storing a 0 and writing a 1
Sp11 CMPEN 411 L23 S.8
Write Voltage Ratios
[Plot: write voltage VQ (0 to 0.5 V) versus Pull-up Ratio (PR, 0 to 2), for VDD = 2.5V, |VTp| = 0.4V, µp/µn = 0.5]

PR is the Pull-up Ratio, PR = (W4/L4)/(W6/L6) ≤ 1.8

• Keep cell size minimal while allowing writes
  ◦ Make M4 and M6 minimum size
Sp11 CMPEN 411 L23 S.9
Cell Sizing and Performance
• Keeping cell size minimal is critical for large SRAMs
  ◦ Minimum sized pull-down FETs (M1 and M3)
    - Requires longer than minimum channel length, L, pass transistors (M5 and M6) to ensure proper CR
    - But up-sizing L of the pass transistors increases the capacitive load on the word lines and limits the current discharged on the bit lines, both of which can adversely affect the speed of the read cycle
  ◦ Minimum width and length pass transistors
    - Boost the width of the pull-downs (M1 and M3)
    - Reduces the loading on the word lines and increases the storage capacitance in the cell – both are good! – but cell size may be slightly larger
• Performance is determined by the read operation
  ◦ To accelerate the read time, SRAMs use sense amplifiers (so that the bit line doesn't have to make a full swing)

Sp11 CMPEN 411 L23 S.10


6-T SRAM Layout
[Figure: 6-T SRAM cell layout showing VDD, GND, M1 to M6, Q, !Q, WL, BL and !BL]

• Simple and reliable, but big
  ◦ signal routing and connections to two bit lines, a word line, and both supply rails
• Area is dominated by the wiring and contacts
• Other alternatives to the 6-T cell include the resistive load 4-T cell and the TFT cell, neither of which is available in a standard CMOS logic process

Sp11 CMPEN 411 L23 S.11


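Multiple Read/Write Port Storage Cell
[Figure: dual-port cell. The cross-coupled inverters (M1/M2 and M3/M4) hold !Q and Q; two pairs of access transistors (M5/M6 and M7/M8), gated by word lines WL1 and WL2, connect the cell to bit line pairs BL1/!BL1 and BL2/!BL2.]

• To avoid read upset, the widths of M1 and M3 will have to be sized up by a factor equal to the number of simultaneously open read ports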
Sp11 CMPEN 411 L23 S.12
Resistance-load SRAM Cell
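[Figure: resistance-load SRAM cell with load resistors RL to VDD, transistors M1 to M4, storage nodes Q and !Q, word line WL, and bit lines BL and !BL]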

Sp11 CMPEN 411 L23 S.13


Remove R
[Figure: the same cell with the load resistors removed, leaving M1 to M4, WL, BL and !BL]

Sp11 CMPEN 411 L23 S.14


Remove R
[Figure: the cell reduced to M2, M3 and M4, with WL]

Further remove one transistor

Sp11 CMPEN 411 L23 S.15


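3-Transistor DRAM Cell
[Figure: 3-transistor DRAM cell and waveforms. M1, gated by WWL, connects BL1 to storage node X (capacitance Cs); M2 is gated by X; M3, gated by RWL, connects to BL2. Write waveform: WWL and BL1 are driven and X reaches VDD-VT. Read waveform: RWL is asserted and BL2 drops by ∆V from VDD-VT.]

• Write: Cs is charged (or discharged) by asserting WWL and BL1
  ◦ Value stored at node X when writing a 1 is VWWL - VTn
• Read: Cs is "sensed" by asserting RWL and observing BL2
  ◦ Read is non-destructive and inverting (ratioless)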
Sp11 CMPEN 411 L23 S.16
3-Transistor DRAM Cell
[Figure: 3-transistor DRAM cell and write/read waveforms, same as on the previous slide]

• Refresh: read stored data, put its inverse on BL1 and assert WWL (need to do this every 1 to 4 msec)
• Note the VT drop at X: how to fix it?

Sp11 CMPEN 411 L23 S.17


3-T DRAM Layout
[Figure: 3-T DRAM cell layout showing BL2, BL1, GND, RWL, WWL, M1, M2 and M3]

• Fewer contacts & wires
• Total cell area is 576 λ² (compared to 1,092 λ² for the 6-T SRAM cell)
• No special processing steps are needed (so compatible with logic CMOS process)
• Can use bootstrapping (raise VWWL to a value higher than VDD) to eliminate threshold drop when storing a "1"
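
A minimal sketch of the threshold drop and of how bootstrapping removes it, assuming VDD = 2.5V and VTn = 0.4V as used elsewhere in this lecture and ignoring body effect. The function names are hypothetical.

```python
def stored_high_level(v_wwl, vdd=2.5, vtn=0.4):
    """Voltage left on node X after writing a 1 through the NMOS write transistor.

    The pass transistor stops conducting once X reaches v_wwl - vtn, and X
    cannot exceed the bit line level vdd. Body effect is ignored (assumption).
    """
    return min(vdd, v_wwl - vtn)


def min_bootstrap_voltage(vdd=2.5, vtn=0.4):
    """Smallest WWL level that lets X charge all the way to vdd."""
    return vdd + vtn


print("WWL = VDD       -> X =", round(stored_high_level(2.5), 3), "V")
print("WWL = VDD + VTn -> X =", round(stored_high_level(min_bootstrap_voltage()), 3), "V")
```

With WWL at VDD the stored 1 is a threshold below VDD; raising WWL by at least one threshold restores the full level.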

Sp11 CMPEN 411 L23 S.18


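1-Transistor DRAM Cell
[Figure: 1-transistor DRAM cell and waveforms. M1, gated by WL, connects BL (capacitance CBL) to storage node X (capacitance Cs). Write: WL is raised and X charges to VDD-VT for a 1. Read: BL starts at VDD/2 and the voltage swing is small until sensing.]

• Write: Cs is charged (or discharged) by asserting WL and BL
• Read: Charge redistribution occurs between CBL and Cs
• Read is destructive, so must refresh after read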
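
The size of the read signal follows from charge conservation between Cs and CBL. The sketch below uses assumed capacitances (30 fF cell, 1 pF bit line) that are illustrative only and not from the slide; VDD, VT and the VDD/2 precharge level follow the lecture.

```python
def bitline_swing(v_cell, v_pre, c_s, c_bl):
    """Bit line voltage change after WL connects Cs to the precharged bit line.

    Charge conservation: c_s*v_cell + c_bl*v_pre = (c_s + c_bl)*v_final.
    """
    return (v_cell - v_pre) * c_s / (c_s + c_bl)


VDD, VT = 2.5, 0.4
C_S, C_BL = 30e-15, 1e-12   # assumed: 30 fF storage cell, 1 pF bit line
V_PRE = VDD / 2             # bit line precharged to VDD/2

dv_one = bitline_swing(VDD - VT, V_PRE, C_S, C_BL)   # cell holds a 1
dv_zero = bitline_swing(0.0, V_PRE, C_S, C_BL)       # cell holds a 0

print(f"reading a 1: {dv_one * 1e3:+.1f} mV")
print(f"reading a 0: {dv_zero * 1e3:+.1f} mV")
```

Because CBL is much larger than Cs, the swing is only tens of millivolts in either direction, and the stored charge is disturbed by the read.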

Sp11 CMPEN 411 L23 S.19


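Sense Amp Operation
[Figure: VBL versus t. Starting from VPRE, the bit line moves slightly once the word line is activated, then is driven to V(1) or V(0) once the sense amp is activated.]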

Sp11 CMPEN 411 L23 S.20


1-T DRAM Cell Observations
• Cell is single ended (complicates the design of the sense amp)
• Cell requires a sense amp for each bit line due to charge redistribution based read
  ◦ BLs are precharged to VDD/2 (not VDD as with SRAM design)
  ◦ all previous designs used SAs for speed, not functionality
• Cell read is destructive; refresh must follow to restore data
• Cell requires an extra capacitor (CS) that must be explicitly included in the design
  ◦ May not be compatible with logic CMOS process
• A threshold voltage is lost when writing a 1 (can be circumvented by bootstrapping the word lines to a higher value than VDD)

Sp11 CMPEN 411 L23 S.21


1-T DRAM (3-D capacitor)

Non-CMOS
Source: IBM
Sp11 CMPEN 411 L23 S.22
Peripheral Memory Circuitry
• Row and column decoders
• Read bit line precharge logic
• Sense amplifiers
• Timing and control

• Speed
• Power consumption
• Area – pitch matching

Sp11 CMPEN 411 L23 S.23


2D 4x4 __RAM Memory
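[Figure: 2D 4x4 memory array. Labels: read precharge, bit line precharge, enable; address bits A1 and A2 with word lines WL[0] to WL[3]; bit line pairs BL/!BL (BLi, BLi+1); 2-bit words; address bit A0 into the column decoder; sense amplifiers; write circuitry; clocking and control.]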

Sp11 CMPEN 411 L23 S.24


2D 4x4 ___RAM Memory
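[Figure: 2D 4x4 memory array. Labels: read precharge, bit line precharge, enable; address bits A1 and A2 with word lines WL[0] to WL[3]; single-ended bit lines BL0 to BL3; 2-bit words; sense amplifiers; write circuitry; address bit A0 into the column decoder; clocking, control, and refresh.]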

Sp11 CMPEN 411 L23 S.25


Row Decoders
• Collection of 2^M complex logic gates organized in a regular, dense fashion
• (N)AND decoder for 8 address bits
    WL(0) = !A7 & !A6 & !A5 & !A4 & !A3 & !A2 & !A1 & !A0
    ...
    WL(255) = A7 & A6 & A5 & A4 & A3 & A2 & A1 & A0
• NOR decoder for 8 address bits
    WL(0) = !(A7 | A6 | A5 | A4 | A3 | A2 | A1 | A0)
    ...
    WL(255) = !(!A7 | !A6 | !A5 | !A4 | !A3 | !A2 | !A1 | !A0)
• Goals: Pitch matched, fast, low power
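
The two formulations above are logically equivalent (De Morgan). A small behavioral sketch, with hypothetical function names, that evaluates both for every 8-bit address and checks that each one drives exactly one word line high:

```python
def and_decoder(addr, n_bits=8):
    """WL(row) = AND over all bits of (Ai if row bit i is 1 else !Ai)."""
    lines = []
    for row in range(2 ** n_bits):
        wl = 1
        for i in range(n_bits):
            a = (addr >> i) & 1
            wl &= a if (row >> i) & 1 else 1 - a
        lines.append(wl)
    return lines


def nor_decoder(addr, n_bits=8):
    """WL(row) = NOR over all bits of (!Ai if row bit i is 1 else Ai)."""
    lines = []
    for row in range(2 ** n_bits):
        any_high = 0
        for i in range(n_bits):
            a = (addr >> i) & 1
            any_high |= (1 - a) if (row >> i) & 1 else a
        lines.append(1 - any_high)
    return lines


# Both styles select the same single word line for every address.
for addr in range(256):
    wl = and_decoder(addr)
    assert wl == nor_decoder(addr)
    assert sum(wl) == 1 and wl[addr] == 1
print("AND and NOR decoders agree for all 256 addresses")
```

This models only the logic function; the speed, area and power trade-offs between the two styles depend on the circuit implementation.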


Sp11 CMPEN 411 L23 S.26
Dynamic Decoders
[Figure: two dynamic decoders generating WL0 to WL3 from A0, !A0, A1, !A1 and the clock φ, each with precharge devices. Left: 2-input NOR decoder. Right: 2-input NAND decoder.]

Which one is faster? Smaller? Low power?


Sp11 CMPEN 411 L23 S.27
Pass Transistor Based Column Decoder
[Figure: a 2-input NOR decoder driven by A0 and A1 generates select signals S0 to S3; each select signal gates the pass transistors that connect one bit line pair (BL0/!BL0 to BL3/!BL3) to data_out and !data_out.]

• Read: connect BLs to the Sense Amps (SA)
• Write: drive one of the BLs low to write a 0 into the cell
• Fast since there is only one transistor in the signal path. However, there is a large transistor count ((K+1) × 2^K + 2 × 2^K)
  ◦ For K = 2 → 3 × 2^2 (decoder) + 2 × 2^2 (PTs) = 12 + 8 = 20
Sp11 CMPEN 411 L23 S.28
Tree Based Column Decoder
[Figure: binary tree of pass transistors controlled by A0, !A0, A1 and !A1, connecting bit line pairs BL0/!BL0 to BL3/!BL3 to data_out and !data_out]

• Number of transistors = 2 × 2 × (2^K − 1)
  ◦ for K = 2 → 2 × 2 × (2^2 − 1) = 4 × 3 = 12
• Delay increases quadratically with the number of sections (K) (so prohibitive for large decoders)
  ◦ can fix with buffers, progressive sizing, combination of tree and pass transistor approaches
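
The transistor-count formulas for the pass transistor based and tree based column decoders can be compared directly. This sketch only evaluates the two expressions given in the lecture; the function names are hypothetical.

```python
def pass_transistor_decoder_count(k):
    """(K+1) * 2^K for the NOR predecoder plus 2 * 2^K pass transistors."""
    return (k + 1) * 2 ** k + 2 * 2 ** k


def tree_decoder_count(k):
    """2 * 2 * (2^K - 1) pass transistors, no predecoder."""
    return 2 * 2 * (2 ** k - 1)


for k in range(1, 7):
    print(f"K = {k}: pass transistor = {pass_transistor_decoder_count(k):4d}, "
          f"tree = {tree_decoder_count(k):4d}")
```

For K = 2 this gives 20 and 12, matching the worked numbers. The tree saves transistors, but places K pass transistors in series in the signal path.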
Sp11 CMPEN 411 L23 S.29
Bit Line Precharge Logic
[Figure: precharge circuits on BL and !BL controlled by !PC, with and without an equalization transistor]

• First step of a Read cycle is to precharge (PC) the bit lines to VDD
  ◦ every differential signal in the memory must be equalized to the same voltage level before Read
• Turn off PC and enable the WL
  ◦ the grounded PMOS load limits the bit line swing (speeding up the next precharge cycle)

equalization transistor - speeds up equalization of the two bit lines by allowing the capacitance and pull-up device of the nondischarged bit line to assist in precharging the discharged line

Sp11 CMPEN 411 L23 S.30


Sense Amplifiers
[Figure: sense amplifier (SA) block with input and output]

• Amplification – resolves data with small bit line swings (in some DRAMs required for proper functionality)
• Delay reduction – compensates for the limited drive capability of the memory cell to accelerate BL transition
    tp = (C × ∆V) / Iav
  C is large and Iav is small, so make ∆V as small as possible
• Power reduction – eliminates a large part of the power dissipation due to charging and discharging bit lines
• Signal restoration – for DRAMs, need to drive the bit lines full swing after sensing (read) to do data refresh
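
A quick numerical illustration of tp = (C × ∆V) / Iav. The bit line capacitance and cell current below are assumed values chosen for illustration and are not from the lecture.

```python
def bitline_delay(c_bl, delta_v, i_av):
    """tp = (C * dV) / Iav, the time for the cell to move the bit line by dV."""
    return c_bl * delta_v / i_av


C_BL = 1e-12    # assumed bit line capacitance: 1 pF
I_AV = 50e-6    # assumed average cell current: 50 uA

full_swing = bitline_delay(C_BL, 2.5, I_AV)    # no sense amp: full VDD swing
small_swing = bitline_delay(C_BL, 0.1, I_AV)   # sense amp resolves 100 mV

print(f"full swing  (2.5 V): {full_swing * 1e9:.1f} ns")
print(f"small swing (0.1 V): {small_swing * 1e9:.1f} ns")
```

Since C and Iav are largely fixed by the array and the cell, reducing the swing the cell must produce is the main lever on delay.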
Sp11 CMPEN 411 L23 S.31
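Differential Sense Amplifier
[Figure: differential sense amplifier. M1 and M2 are driven by bit and !bit, M3 and M4 are the load devices tied to VDD, and M5 is gated by the sense enable SE; the outputs are nodes y and Out.]

Directly applicable to SRAMs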

Sp11 CMPEN 411 L23 S.32


Differential Sensing ― SRAM
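[Figure: (a) SRAM sensing scheme: bit lines BL and !BL with precharge (PC) and equalization (EQ) devices tied to VDD, SRAM cell i selected by WLi, and a differential sense amp enabled by SE that produces Output. (b) Two stage differential amplifier: stages built from M1 to M5, enabled by SE, with internal nodes x and y and their complements.]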

Sp11 CMPEN 411 L23 S.33


Reliability and Yield
• Memories operate under low signal-to-noise conditions
  ◦ word line to bit line coupling can vary substantially over the memory array
    - folded bit line architecture (routing BL and !BL next to each other ensures a closer match between parasitics and bit line capacitances)
  ◦ interwire bit line to bit line coupling
    - transposed (or twisted) bit line architecture (turn the noise into a common-mode signal for the SA)
  ◦ leakage (in DRAMs) requiring refresh operation
• Memories suffer from low yield due to high density and structural defects
  ◦ increase yield by using error correction (e.g., parity bits) and redundancy
• Memories are susceptible to soft errors due to alpha particles and cosmic rays
Sp11 CMPEN 411 L23 S.34
Redundancy in the Memory Structure

[Figure: memory array with a redundant row and redundant columns; a fuse bank sits alongside the row address and column address inputs]

Sp11 CMPEN 411 L23 S.35


Row Redundancy
[Figure: row redundancy. The functional address feeds the normal wordline decoders, which drive the normal wordlines, and is also compared (== ?) against fused repair addresses; each comparator drives a redundant wordline, and an Enable signal goes to the normal wordline decoders.]

Sp11 CMPEN 411 L23 S.36


Column Redundancy
[Figure: eight normal data columns, each paired with a fuse and a data line, plus one redundant data column]

Sp11 CMPEN 411 L23 S.37


Error-Correcting Codes

Example: Hamming Codes

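[Figure: Hamming code example; e.g., if B3 flips, the check bits evaluate to 3, which identifies the flipped bit]

2^k ≥ m + k + 1, where m is the number of data bits and k is the number of check bits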


For 64 data bits, needs 7 check bits
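
A minimal sketch of the check-bit bound and of single-error location with a Hamming code. It assumes the standard layout with check bits at power-of-two positions and positions numbered from 1; whether the slide's B3 uses the same numbering is an assumption. Function names are hypothetical.

```python
def min_check_bits(m):
    """Smallest k with 2^k >= m + k + 1 for m data bits."""
    k = 0
    while 2 ** k < m + k + 1:
        k += 1
    return k


def hamming_encode(data_bits):
    """Place data bits at non-power-of-two positions and fill in parity bits."""
    m = len(data_bits)
    k = min_check_bits(m)
    n = m + k
    code = [0] * (n + 1)                 # index 0 unused, positions 1..n
    bits = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):              # not a power of two: data position
            code[pos] = next(bits)
    for i in range(k):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p and pos != p:
                parity ^= code[pos]
        code[p] = parity                 # even parity over covered positions
    return code[1:]


def syndrome(codeword):
    """XOR of the positions holding a 1; 0 means no single-bit error."""
    s = 0
    for pos, bit in enumerate(codeword, start=1):
        if bit:
            s ^= pos
    return s


code = hamming_encode([1, 0, 1, 1])
corrupted = code.copy()
corrupted[3 - 1] ^= 1                    # flip the bit at position 3
print(min_check_bits(64), syndrome(code), syndrome(corrupted))   # prints: 7 0 3
```

The syndrome of a clean codeword is 0, and after a single flip it equals the position of the flipped bit, so the bit can be inverted back.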
Sp11 CMPEN 411 L23 S.38
Performance and area overhead for ECC

Sp11 CMPEN 411 L23 S.39


Redundancy and Error Correction

Sp11 CMPEN 411 L23 S.40


Soft Errors
• Nonrecurrent and nonpermanent errors from
  ◦ alpha particles (from the packaging materials)
  ◦ neutrons from cosmic rays
• As feature size decreases, the charge stored at each node decreases (due to a lower node capacitance and lower VDD) and thus Qcritical (the charge necessary to cause a bit flip) decreases, leading to an increase in the soft error rate (SER)

[Plot, from Semico Research Corp.: System FITs (log scale, 1 to 10000) versus Process Technology (0.25, 0.18, 0.13, 0.09, 0.05)]

MTBF (hours), from Actel:

                            .13 µm    .09 µm
Ground-based                   895       448
Civilian Avionics System       324       162
Military Avionics System        18         9

Sp11 CMPEN 411 L23 S.41


Scary Fact
• Avionics system in civilian aviation: an altitude of 30,000 feet and a route crossing the north pole both increase the neutron flux. If an avionics board uses four 1M 130nm SRAM-based FPGAs, it would be subject to 0.074 upsets per day = 324 hours between upsets, or 3 million FITs. Assume one such system on board each commercial aircraft, 4,000 civilian flights per day, and a 3 hour average flight time. Nearly 37 aircraft each day will experience a neutron-induced SRAM-based FPGA configuration failure during their flight.
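
The arithmetic behind these figures can be checked in a few lines. The inputs are the numbers quoted above; FIT is taken as failures per 10^9 hours of operation.

```python
upsets_per_day = 0.074        # quoted upset rate for the avionics board
flights_per_day = 4000        # quoted number of civilian flights per day
hours_per_flight = 3          # quoted average flight time

hours_between_upsets = 24 / upsets_per_day
fits = 1e9 / hours_between_upsets              # failures per 1e9 hours
flight_hours_per_day = flights_per_day * hours_per_flight
affected_flights = flight_hours_per_day / hours_between_upsets

print(f"hours between upsets: {hours_between_upsets:.0f}")
print(f"FITs: {fits / 1e6:.2f} million")
print(f"expected affected flights per day: {affected_flights:.0f}")
```

The results are about 324 hours, about 3 million FITs, and about 37 flights per day, consistent with the quoted values.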

Sp11 CMPEN 411 L23 S.42


Modeling of a particle strike

Sp11 CMPEN 411 L23 S.43


A SPICE simulation for SRAM

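[Figure: SPICE simulation of an SRAM cell hit by a particle strike. Waveforms for BL, !BL and WL; after the strike the stored nodes flip (1->0 and 0->1).]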

Sp11 CMPEN 411 L23 S.44


On-chip Memory: ITRS roadmap
[Plot: % die utilization (0 to 100) split into Area Reused Logic, Area New Logic, and Area Memory, by technology node and year: 180nm/'99, 130nm/'02, 100nm/'05, 70nm/'08, 50nm/'11, 35nm/'14]

Sp11 CMPEN 411 L23 S.45


State of Art

Sp11 CMPEN 411 L23 S.46


State of Art

Sp11 CMPEN 411 L23 S.47
