Unit 5 VLSI
[Figure: write-cycle timing diagram showing the WRITE and DATA signals, with the intervals "write access", "data valid", and "data written" marked.]
[Figure 12-3: Architectures for an N-word memory (where each word is M bits). (a) Intuitive architecture for an N x M memory, with one select signal (S_0 to S_{N-1}) per word. (b) A decoder reduces the number of address bits to K = log2 N (A_0 to A_{K-1}).]
We assume that this module is a single-port memory. In other words, only one select signal S_i can be high at any time. For simplicity, let us temporarily assume that each storage cell is a D flip-flop and that the select signal is used to activate (clock) the cell. While this approach is relatively simple and works well for very small memories, one runs into a number of problems when trying to use it for larger memories.
Assume that we would like to implement a memory that holds 1 million (N = 10^6) 8-bit (M = 8) words. The reader should be aware that 1 million is a simplification of the actual memory size, since memory dimensions always come in powers of two. In this particular case, the actual number of words equals 2^20 = 1024 x 1024 = 1,048,576. For ease of use, it is common practice to denote such a memory as a 1 Mword unit.
When implementing this structure using the strategy of Figure 12-3a, we quickly realize that 1 million select signals are needed, one for every word. Since these signals are normally provided from off-chip or from another part of the chip, this translates into insurmountable wiring and/or packaging problems. A decoder is inserted to reduce the number of select signals (Figure 12-3b). A memory word is selected by providing a binary encoded address word (A_0 to A_{K-1}). The decoder translates this address into N = 2^K select lines, only one of which is active at a time. This approach reduces the number of external address lines from 1 million to 20 (log2 2^20) in our example, which virtually eliminates the wiring and packaging problems. The decoder is typically designed so that its dimensions are matched to the size of the storage cell and so that the connections between the two, namely the S signals in Figure 12-3b, do not produce any overhead. The value of this particular approach can be appreciated by interpreting Figure 12-3b as a physical floor plan of the memory module. By performing the pitch matching between decoder and memory core, the S wires can be very short, and no large routing channel is required.
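The saving in select lines can be illustrated with a short behavioral sketch. This is only an illustration of the counting argument, not a circuit description; the function names `address_bits` and `decode` are hypothetical and chosen for this example.

```python
def address_bits(num_words: int) -> int:
    """Return K, the number of address lines needed to select one of num_words words."""
    if num_words < 1:
        raise ValueError("a memory needs at least one word")
    # (num_words - 1).bit_length() equals ceil(log2(num_words)) for num_words >= 1
    return (num_words - 1).bit_length()


def decode(address: int, k: int) -> list[int]:
    """Behavioral K-to-2**K decoder: exactly one select line S_i is high."""
    n = 1 << k
    if not 0 <= address < n:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(n)]


if __name__ == "__main__":
    n_words = 2 ** 20  # the 1 Mword example from the text
    print("select lines without decoder:", n_words)
    print("address lines with decoder:  ", address_bits(n_words))
    print("S_0..S_3 for address 2:      ", decode(2, 2))
```

The one-hot output of `decode` mirrors the single-port assumption: for any valid address, exactly one select line is high.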
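[Figure 12-4: Array-structured memory organization. The row address (A_K to A_{L-1}) selects one word line; each row holds M x 2^K bits; the column address (A_0 to A_{K-1}) drives the column decoder, which sits below the sense amplifiers/drivers; the input-output is M bits wide.]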
While this resolves the select problem, it does not address the issue of the memory aspect ratio. Evaluation of the dimensions of the storage array of our token example shows that its height is approximately 128,000 times larger than its width (2^20/2^3), assuming that the basic storage cell is approximately square, which is almost always the case. Obviously, this results in a design that cannot be implemented. Besides the bizarre shape factor, the resulting design is also extremely slow. The vertical wires connecting the storage cells to the input/outputs become excessively long. Remember that the delay of an interconnect line increases at least linearly with its length.
To address this problem, memory arrays are organized so that the vertical and horizontal dimensions are of the same order of magnitude; thus, the aspect ratio approaches unity. Multiple words are stored in a single row and are selected simultaneously. To route the correct word to the input/output terminals, an extra piece of circuitry called the column decoder is needed. The concept is illustrated in Figure 12-4. The address word is partitioned into a column address (A_0 to A_{K-1}) and a row address (A_K to A_{L-1}). The row address enables one row of the memory for R/W, while the column address picks one particular word from the selected row.
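The aspect-ratio argument and the address partitioning can be checked numerically. The sketch below assumes square storage cells (as the text does) and uses hypothetical helper names `array_shape` and `split_address`; the choice of which K values to try is an assumption made for illustration.

```python
def array_shape(address_bits: int, word_bits: int, column_bits: int) -> tuple[int, int]:
    """Return (rows, columns) of the cell array when column_bits of the
    address are handled by the column decoder."""
    rows = 2 ** (address_bits - column_bits)   # one word line per row
    cols = word_bits * 2 ** column_bits        # M * 2**K cells per row
    return rows, cols


def split_address(address: int, column_bits: int) -> tuple[int, int]:
    """Partition an address into (row, column).
    The low K bits (A_0..A_{K-1}) form the column address,
    the remaining bits (A_K..A_{L-1}) form the row address."""
    column = address & ((1 << column_bits) - 1)
    row = address >> column_bits
    return row, column


if __name__ == "__main__":
    L, M = 20, 8  # 1 Mword of 8-bit words, as in the running example
    for k in (0, 7, 8, 9):
        rows, cols = array_shape(L, M, k)
        print(f"K={k}: {rows} rows x {cols} columns, height/width = {rows / cols:g}")
    print(split_address(0b10110110, 4))
```

With K = 0 the sketch reproduces the height-to-width ratio of 2^20/2^3 quoted above; moving a few address bits to the column decoder brings the ratio close to unity.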
[Figure: hierarchical memory organization with blocks 0 to P-1, a block selector, a global amplifier/driver, a global data bus, and the I/O terminals.]
12.2 The Memory Core

This section concentrates on the design of the memory core and its composing cells for the different memory types. While the major issue in designing large memories is to keep the size of the cell as small as possible, this should be done so that other important design-quality measures such as speed and reliability are not fatally affected.

[The remainder of this introduction is illegible in this copy. The surviving fragments mention process tolerances, operating temperatures, extensive simulation and design optimization, and the nonvolatile, read-write, and associative memory types.]

12.2.1 Read-Only Memories

Although the idea of a memory that can only be read and never altered might seem odd at first glance, read-only memories have a large number of potential applications. Programs for processors with fixed applications, such as washing machines, calculators, and game machines, need only reading once they are developed and debugged. Fixing the contents at manufacturing time leads to small and fast implementations.

ROM Cells: An Overview

The fact that the contents of a ROM cell are permanently fixed considerably simplifies its design. The cell should be designed so that a 0 or 1 is presented to the bit line upon activation of its word line. Figure 12-9 shows several ways to accomplish this.

[Figure 12-9: Different approaches for implementing the ROM cell. (a) Diode ROM. (b) MOS ROM 1. (c) MOS ROM 2. Each panel shows a 1 cell and a 0 cell between a word line WL and a bit line BL.]
Consider first the simplest cell, the diode-based ROM cell shown in Figure 12-9a. Assume that the bit line BL is resistively clamped to ground; that is, BL is pulled low through a resistor connected to ground when it lacks any other excitations or inputs. This is exactly what happens in the 0 cell. Since no physical connection between BL and WL exists, the value of BL is low independent of the value of WL. On the other hand, when a high voltage V_WL is applied to the word line of the 1 cell, the diode is enabled and the bit line is pulled up to V_WL - V_D(on), resulting in a 1 on the bit line. In summary, the presence or absence of a diode between WL and BL differentiates between ROM cells storing a 1 or a 0, respectively.

The disadvantage of the diode cell is that it does not isolate the bit line from the word line. All current required to charge the bit-line capacitance, which can be quite large, must be provided through the word line and its drivers. Therefore, this approach only works for small memories. A better approach is to use an active device in the cell, as is shown in Figure 12-9b. The operation is identical to that of the diode cell: the diode is replaced by the gate-source connection of an NMOS transistor, whose drain is connected to the supply voltage. All output-driving current is provided by the MOS transistor in the cell. The word-line driver is only responsible for charging and discharging the word-line capacitance. This improved isolation comes at the penalty of a more complex cell and a larger area. The latter is primarily caused by the extra supply contact.

Figure 12-10 shows an example of a 4 x 4 array. Notice how the area overhead of the supply lines is reduced by sharing them between neighboring cells. This requires the mirroring of the odd cells around the horizontal axis, an approach that is used extensively in memory cores of all styles.

[Figure 12-10: A 4 x 4 OR ROM cell array, with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], shared V_DD supply lines, and pull-down loads.]
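The way a ROM array encodes data through the presence or absence of a device can be captured in a small behavioral model. The sketch below assumes the OR-type organization described for the diode and MOS ROM 1 cells (bit line pulled low by default, pulled high by any enabled cell). The array contents in `ROM_CELLS` and the function name `read_word` are hypothetical and do not reproduce the contents of any figure.

```python
# 1 = a device (diode or NMOS transistor) connects WL and BL at this position,
# 0 = no device. Rows correspond to word lines, columns to bit lines.
# These contents are made up for illustration.
ROM_CELLS = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]


def read_word(cells: list[list[int]], row: int) -> list[int]:
    """Raise one word line and return the resulting bit-line values.

    Each bit line is pulled low by its load unless at least one cell on that
    line has a device AND its word line is high (a wired OR).
    """
    if not 0 <= row < len(cells):
        raise ValueError("word line index out of range")
    word_lines = [1 if r == row else 0 for r in range(len(cells))]
    n_bits = len(cells[0])
    return [
        int(any(wl and cells[r][c] for r, wl in enumerate(word_lines)))
        for c in range(n_bits)
    ]


if __name__ == "__main__":
    for wl in range(len(ROM_CELLS)):
        print(f"WL[{wl}] high -> BL = {read_word(ROM_CELLS, wl)}")
```

Because only one word line is high at a time, the word read out equals the device pattern of the selected row; cells on unselected rows never contribute to the OR.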
12.2.3 Read-Write Memories (RAM)

Providing a memory cell with roughly equal read and write performance requires a more complex cell structure. While the contents of the ROM and NVRWM memories are ingrained in the cell topology or programmed into the device characteristics, storage in RAM memories is based on either positive feedback or capacitive charge, similar to the ideas introduced in Chapter 6. These circuits would be perfectly suitable as R/W memory cells, but they tend to consume too much area. In this section, we introduce a number of simplifications that trade off area for either performance or electrical reliability. They are labeled as either SRAMs or DRAMs, depending on the storage concept used.
Static Random-Access Memory (SRAM)

The generic SRAM cell is introduced in Figure 12-26. It turns out to be quite similar to the static latch shown in Figure 7-21. It requires six transistors per bit. Access to the cell is enabled by the word line, which replaces the clock and controls the two pass transistors M5 and M6, shared between the read and write operations. In contrast to the ROM cells, two bit lines transferring both the stored signal and its inverse are required. Although providing both polarities is not a necessity, doing so improves the noise margins during both read and write operations, as will become apparent in the subsequent analysis.
Problem 12.5 CMOS SRAM Cell

Does the SRAM cell presented in Figure 12-26 consume standby power? Explain. Draw an equivalent pseudo-NMOS implementation. How about the standby power in that case?
Operation of SRAM Cell

The SRAM cell should be sized as small as possible to achieve high memory densities. Reliable operation of the cell, however, imposes some sizing constraints. To understand the operation of the memory cell, let us consider the read and write operations in sequence. While doing so, we also derive the transistor-sizing constraints.
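[Figure 12-26: six-transistor CMOS SRAM cell, with word line WL, supply V_DD, transistors M1 to M6, and complementary bit lines BL and BL-bar.]

[Figure 12-27: Simplified model of CMOS SRAM cell during read (Q = 1, V_precharge = V_DD). Both bit lines, with capacitance C_bit, are precharged to V_DD.]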
which simplifies to

$$\Delta V = \frac{V_{DSATn} + CR\,(V_{DD} - V_{Tn}) - \sqrt{V_{DSATn}^{2}\,(1 + CR) + CR^{2}\,(V_{DD} - V_{Tn})^{2}}}{CR} \tag{12.3}$$

where CR is called the cell ratio and is defined as

$$CR = \frac{W_1/L_1}{W_5/L_5} \tag{12.4}$$

The value of the voltage rise ΔV as a function of CR for our 0.25-µm technology is plotted in Figure 12-28. To keep the node voltage from rising above the transistor threshold (of about 0.4 V), the cell ratio must be greater than 1.2. For large memory arrays, it is desirable to keep the cell size minimal while maintaining read stability. If the transistor M1 is minimum sized, the access pass transistor M5 has to be made weaker by increasing its length. This is undesirable, because it adds to the load of the bit line. A preferred solution is to minimize the size of the pass transistor and increase the width of the NMOS pull-down M1 to meet the stability constraint. This slightly increases the minimum size of the cell. The designer must perform careful simulations to guarantee cell stability across all process corners [Preston01].
[Figure 12-28: Voltage rise inside the cell upon read versus cell ratio (ratio of M1/M5). Vertical axis: voltage rise [V], 0 to 1.2; horizontal axis: cell ratio (CR), 0 to 3. The voltage inside the cell does not rise above the threshold for CR > 1.2.]
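The read-stability constraint can be explored numerically by evaluating the closed form of Equation (12.3), ΔV = [V_DSATn + CR (V_DD - V_Tn) - sqrt(V_DSATn^2 (1 + CR) + CR^2 (V_DD - V_Tn)^2)] / CR. The sketch below is a rough check, not a substitute for circuit simulation. The device parameters are assumptions: V_Tn = 0.4 V is taken from the threshold quoted in the text, while V_DD = 2.5 V and V_DSATn = 0.63 V are assumed typical values for a 0.25-µm process and are not stated in this section.

```python
import math

# Assumed device parameters (not given in this section; see lead-in).
V_DD = 2.5      # supply voltage [V], assumed
V_TN = 0.4      # NMOS threshold [V], "about 0.4 V" in the text
V_DSATN = 0.63  # NMOS velocity-saturation voltage [V], assumed


def read_voltage_rise(cr: float) -> float:
    """Voltage rise inside the cell during a read, per Equation (12.3)."""
    if cr <= 0:
        raise ValueError("cell ratio must be positive")
    overdrive = V_DD - V_TN
    root = math.sqrt(V_DSATN ** 2 * (1 + cr) + cr ** 2 * overdrive ** 2)
    return (V_DSATN + cr * overdrive - root) / cr


if __name__ == "__main__":
    for cr in (0.5, 1.0, 1.2, 1.5, 2.0, 3.0):
        dv = read_voltage_rise(cr)
        status = "below threshold" if dv < V_TN else "above threshold"
        print(f"CR = {cr:3.1f}: delta V = {dv:.3f} V ({status})")
```

Under these assumed parameters the computed rise crosses the 0.4 V threshold between CR = 1.0 and CR = 1.2, which is consistent with the CR > 1.2 guideline stated in the text.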
The preceding analysis presents the worst case. The second bit line BL clamps Q to V_DD, which makes the inadvertent toggling of the cross-coupled inverter pair difficult. This demonstrates one of the major advantages of the dual bit line architecture.

Beyond adjusting the size of the cell transistors, the erroneous toggling can be prevented by precharging the bit lines to another value, such as V_DD/2. This effectively makes it impossible to reach the switching threshold of the connecting inverter. Precharging to the midpoint of the voltage range has some performance benefits as well, since it limits the voltage swing on the bit lines.
Chapter 12: Designing Memory and Array Structures
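[Equation (12.5): illegible in this copy apart from a k_n term, a V_DSAT term, and a factor of 1/2.]

Solving for the node voltage leads to [expression missing in this copy].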
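[Figure: simplified SRAM cell model with word line WL, supply V_DD, transistors M1, M4, M5, and M6, storage nodes at Q = 1 and Q-bar = 0, and the two bit lines driven to complementary values (1 and 0).]

[Figure 12-33: Three-transistor dynamic memory cell and the signal waveforms during read and write. The cell uses bit lines BL1 and BL2, word lines WWL and RWL, transistors M1 to M3, and storage capacitance C_S; the waveforms are annotated with the level V_DD - V_T.]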
The former approach necessitates careful transistor sizing and causes static power consumption. Therefore, the precharged approach is generally preferable. The series connection of M2 and M3 pulls BL2 low when a 1 is stored. BL2 remains high in the opposite case. Notice that the cell is inverting; that is, the inverse value of the stored signal is sensed on the bit line. The most common approach to refreshing the cell is to read the stored data, put its inverse on BL1, and assert WWL in consecutive order.

The cell complexity is substantially reduced with respect to the static cell. This is illustrated by the example layout of Figure 12-34. The total area of the cell is 576 λ^2, compared to the 1092 λ^2 of the SRAM cell of Figure 12-31. These numbers do not take into account the potential area reduction obtained by sharing with neighboring cells. The area reduction is mainly due to the elimination of contacts and devices.

Further simplifications in the cell structure are possible at the expense of a more complex circuit operation. For instance, bit lines BL1 and BL2 can be merged into a single wire. The read and write cycles can proceed as before. The read-sense-write refresh cycle must be altered considerably, since the data value read from the cell is the complement of the stored value. This requires the bit line to be driven to both values in a single cycle. Another option is to merge the RWL and the WWL lines. Once again, this does not significantly change the cell operation. A read operation is automatically accompanied by a refresh of the cell contents. A careful control of the word-line voltage is necessary to prevent a writing of the cell before the actual value is read during refresh.

Finally, the following interesting properties of the 3T cell are worth mentioning:

1. In contrast to the SRAM cell, no constraints exist on the device ratios. This is a common property of dynamic circuits. The choice of device sizes is solely based on performance and reliability considerations. Observe that this statement is not valid when a static bit line load approach is employed.
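The inverting readout and the read-invert-write refresh sequence of the 3T cell can be summarized in a purely logical model. This is a minimal sketch that assumes the precharged BL2 approach and ignores all analog effects (charge leakage, the degraded V_DD - V_T level, timing); the class name `ThreeTransistorCell` and its methods are hypothetical.

```python
class ThreeTransistorCell:
    """Logic-level model of a 3T dynamic memory cell (no analog effects)."""

    def __init__(self) -> None:
        self.stored = 0  # logical value held on the storage capacitance C_S

    def write(self, bl1: int) -> None:
        """Assert WWL: the value on BL1 is transferred onto C_S."""
        if bl1 not in (0, 1):
            raise ValueError("bit-line value must be 0 or 1")
        self.stored = bl1

    def read(self) -> int:
        """Assert RWL with BL2 precharged high and return the value on BL2.

        The series connection of M2 and M3 pulls BL2 low when a 1 is stored,
        so the sensed value is the inverse of the stored value.
        """
        return 0 if self.stored == 1 else 1

    def refresh(self) -> None:
        """Read the cell, put the inverse of the sensed value on BL1, assert WWL."""
        sensed = self.read()
        self.write(1 - sensed)


if __name__ == "__main__":
    cell = ThreeTransistorCell()
    cell.write(1)
    print("stored 1, BL2 reads:", cell.read())
    cell.refresh()
    print("after refresh, stored:", cell.stored)
```

The double inversion in `refresh` (inverting read, then writing the inverse) is what restores the original stored value.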
[Figure 12-34: example layout of the three-transistor dynamic memory cell, showing GND, BL1, BL2, RWL, WWL, transistors M1 to M3, and the polysilicon, metal 1, and metal 2 layers.]
The operational concepts of the one-transistor (1T) DRAM cell are extremely simple. During a write cycle, the data value is placed on the bit line BL, and the word line WL is raised. Depending on the data value, the cell capacitance is either charged or discharged. Before a read operation is performed, the bit line is precharged to a voltage V_PRE. Upon asserting the word line, a charge redistribution takes place between the bit line and the storage capacitance. This results in a voltage change on the bit line, the direction of which determines the value of the data stored. The magnitude of the swing is given by the expression

$$\Delta V = V_{BL} - V_{PRE} = (V_{BIT} - V_{PRE})\,\frac{C_S}{C_S + C_{BL}} \tag{12.8}$$

where C_BL is the bit line capacitance, V_BL the potential of the bit line after the charge redistribution, and V_BIT the initial voltage over the cell capacitance C_S. As the cell capacitance is normally one or two orders of magnitude smaller than the bit line capacitance, this voltage change is very small, typically around 250 mV for state-of-the-art memories [Itoh90]. The ratio C_S/(C_S + C_BL) is called the charge-transfer ratio and ranges between 1% and 10%.

Amplification of ΔV to the full voltage swing is necessary if functionality is to be achieved. This observation marks a first major difference between the 1T and 3T, as well as other, DRAM cells.
1. A 1T DRAM requires the presence of a sense amplifier for each bit line to be functional. This is a result of the charge-redistribution-based readout. The read operation of all cells discussed previously relies on current sinking. A sense amplifier is only needed to speed up the readout, not for functionality considerations. It is also worth noticing that the DRAM memory cells are single ended, in contrast to the SRAM cells, which present both the data value and its complement on the bit lines. This complicates the design of the sense amplifier, as will be discussed in the section on periphery.
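The size of the read signal that the sense amplifier has to detect follows directly from the charge-redistribution expression of Equation (12.8), ΔV = (V_BIT - V_PRE) C_S / (C_S + C_BL). The sketch below evaluates it for one set of component values. All numbers (C_S = 30 fF, C_BL = 1 pF, V_PRE = 1.25 V, and stored levels of 0 V and 2.5 V) are assumptions chosen for illustration and are not taken from the text; the function names are hypothetical.

```python
def charge_transfer_ratio(c_s: float, c_bl: float) -> float:
    """Fraction C_S / (C_S + C_BL) of the initial voltage difference seen on the bit line."""
    return c_s / (c_s + c_bl)


def bitline_swing(v_bit: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after charge redistribution, per Equation (12.8)."""
    return (v_bit - v_pre) * charge_transfer_ratio(c_s, c_bl)


if __name__ == "__main__":
    # Assumed, illustrative values (see lead-in).
    C_S = 30e-15   # storage capacitance [F]
    C_BL = 1e-12   # bit-line capacitance [F]
    V_PRE = 1.25   # bit-line precharge voltage [V]

    print(f"charge-transfer ratio: {charge_transfer_ratio(C_S, C_BL):.1%}")
    for label, v_bit in (("stored 0", 0.0), ("stored 1", 2.5)):
        dv = bitline_swing(v_bit, V_PRE, C_S, C_BL)
        print(f"{label}: delta V = {dv * 1e3:+.1f} mV")
```

With these assumed values the swing is only a few tens of millivolts, and its sign encodes the stored value, which illustrates why a sense amplifier on every bit line is needed for functionality and not merely for speed.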