
Unit 5 Vlsi

Memory architecture

12.1 Introduction

A large portion of the silicon area of many contemporary digital designs is dedicated to the storage of data values and program instructions. More than half of the transistors in today's high-performance microprocessors are devoted to cache memories, and this ratio is expected to further increase. The situation is even more dramatic at the system level. High-performance workstations and computers contain several Gbytes of semiconductor memory, a number that is continuously rising. With the introduction of semiconductor audio (MP3) and video players (MPEG4), the demand for nonvolatile storage has skyrocketed. Obviously, dense data-storage circuitry is and will be one of the primary concerns of a digital circuit or system designer. The total market share for semiconductor memories is expected to be over $45 billion in 2003, which is twice as large as it was in 1998.

In Chapter 7, we introduced means of storing Boolean values based on either positive feedback or capacitive storage. While semiconductor memories are built on the same concepts, the simple use of a register cell as a means for mass storage leads to excessive area requirements. Memory cells are therefore combined into large arrays, which minimizes the overhead caused by peripheral circuitry and increases the storage density. The sheer size and complexity of these array structures introduces a variety of design problems, some of which are discussed in this chapter.
We first introduce the basic memory architectures and their essential building blocks. Next, we analyze the different memory cells and their properties. The cell structure and topology are mainly driven by the available technology and are somewhat out of the control of the digital designer. On the other hand, the peripheral circuitry has a tremendous impact on the robustness, performance, and power consumption of the memory unit. Therefore, a careful analysis of the options and considerations of the periphery design is appropriate. Reliability and power dissipation are two other large concerns of the semiconductor memory designer, and they are discussed in separate sections.

An interesting aspect of Chapter 12 is that it applies a large number of the circuit techniques introduced in the earlier chapters. In a sense, one can consider memory design as a study of high-performance, high-density, low-power circuit design. This becomes quite clear from the two case studies that conclude the chapter.

12.1.1 Memory Classification

Electronic memories come in many different formats and styles. The type of memory unit that is preferable for a given application is a function of the required memory size, the time it takes to access the stored data, the access patterns, the application, and the system requirements.

Size. Depending upon the level of abstraction, different means are used to express the size of a memory unit. The circuit designer tends to define the size of the memory in terms of bits that are equivalent to the number of individual cells (flip-flops or registers) needed to store the data. The chip designer expresses the memory size in bytes (groups of 8 or 9 bits) or its multiples: kilobytes (Kbyte), megabytes (Mbyte), gigabytes (Gbyte), and ultimately terabytes (Tbyte). The system designer likes to quote the storage requirement in terms of words, which represent a basic computational entity. For instance, a group of 32 bits represents a word in a computer that operates on 32-bit data.

Timing Parameters. The timing properties of a memory are illustrated in Figure 12-1. The time it takes to retrieve (read) from the memory is called the read-access time, which is equal to the delay between the read request and the moment the data is available at the output. This time is different from the write-access time, which is the time elapsed between a write request and the final writing of the input data into the memory. Finally, another important parameter is the (read or write) cycle time of the memory, which is the minimum time required between successive reads or writes. This time is normally greater than the access time for reasons that become apparent later in the chapter. Read and write cycles do not necessarily have the same length, but their lengths are considered equal for simplicity of system design.

Figure 12-1 Memory-timing definitions (READ, WRITE, and DATA waveforms, annotated with read cycle, write cycle, read access, write access, data valid, and data written).

Function. Semiconductor memories are most often classified on the basis of memory functionality, access patterns, and the nature of the storage mechanism. A distinction is made between read-only (ROM) and read-write (RWM) memories. The RWM structures have the advantage of offering both read and write functionality with comparable access times and are the most flexible memories. Data are stored either in flip-flops or as a charge on a capacitor. As in the classification introduced in the discussion on sequential circuitry, these memory cells are called static and dynamic, respectively. The former retain their data as long as the supply voltage is retained, while the latter need periodic refreshing to compensate for the charge loss caused by leakage. Since RWM memories use active circuitry to store the information, they belong to the class of memories called volatile memories, in which the data is lost when the supply voltage is turned off.

Read-only memories, on the other hand, encode the information into the circuit topology, for example, by adding or removing transistors. Since this topology is hard wired, the data cannot be modified; it can only be read. Furthermore, ROM structures belong to the class of the nonvolatile memories. Disconnection of the supply voltage does not result in a loss of the stored data.
The most recent entry in the field are memory modules that can be classified as nonvolatile, yet offer both read and write functionality. Typically, their write operation takes substantially more time than the read. We call them nonvolatile read-write (NVRWM) memories. Members of this family are the EPROM (erasable programmable read-only memory), E²PROM (electrically erasable programmable read-only memory), and flash memories. The emergence of novel, cheap, and dense nonvolatile technologies over the last decade has made this approach to storage the fastest growing in the memory arena.
Access Pattern. A second memory classification is based on the order in which data can be accessed. Most memories belong to the random-access class, which means memory locations can be read or written in a random order. One would expect memories of this class to be called RAM modules (random-access memory). For historical reasons, this name has been reserved for the random-access RWM memories, probably because the RAM acronym is more easily pronounced than the awkward RWM. Be aware that most ROM or NVRWM units also provide random access, but the acronym RAM should not be used for them.

Some memory types restrict the order of access, which results in either faster access times, smaller area, or a memory with a special functionality. Examples of such are the serial memories: the FIFO (first-in first-out), LIFO (last-in first-out, most often used as a stack), and the shift register. Video memories are an important member of this class. In video processing, data is acquired and outputted serially, and random access is not required. Contents-addressable memories (CAM) represent another important class of nonrandom access memories. Instead of using an address to locate the data, a CAM (also called an associative memory) uses a word of data itself as input in a query-style format. When the input data matches a data word stored in the memory array, a MATCH flag is raised. The MATCH signal remains low if no data stored in the memory corresponds to the input word. Associative memories are an important component of the cache architecture of many microprocessors.

An overview of the memory classes, as introduced earlier, is given in Figure 12-2. Implementations for each of the mentioned memory structures are discussed in subsequent sections.
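The query-style access of a CAM can be illustrated with a small behavioral model. This is only a sketch in Python, not a circuit description; the class name `ContentAddressableMemory` and its methods are hypothetical names chosen for illustration.

```python
from typing import List, Optional


class ContentAddressableMemory:
    """Toy behavioral model of a CAM: lookup is by stored data, not by address."""

    def __init__(self, num_words: int) -> None:
        # None marks a location that has not been written yet.
        self.words: List[Optional[int]] = [None] * num_words

    def write(self, location: int, word: int) -> None:
        self.words[location] = word

    def query(self, word: int) -> bool:
        """Return the MATCH flag: True if any stored word equals the query word."""
        return any(stored == word for stored in self.words)


cam = ContentAddressableMemory(num_words=4)
cam.write(0, 0xA5)
cam.write(2, 0x3C)
print(cam.query(0x3C))  # the word is stored, so MATCH is raised
print(cam.query(0xFF))  # the word is not stored, so MATCH stays low
```

In hardware all stored words are compared against the input in parallel; the sequential scan used here only mimics the resulting MATCH flag.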
Figure 12-2 Semiconductor memory classification.

  RWM, random access:      SRAM, DRAM
  RWM, non-random access:  FIFO, LIFO, shift register, CAM
  NVRWM:                   EPROM, E²PROM, FLASH
  ROM:                     mask-programmed, programmable (PROM)

Input/Output Architecture. A final classification of semiconductor memories is based on the number of data input and output ports. While a majority of the memory units present only a single port that is shared between input and output, memories with higher bandwidth requirements often have multiple input and output ports and are thus called multiport memories. Examples of the latter are the register files used in RISC (reduced instruction set computer) microprocessors. Adding more ports tends to complicate the design of the memory cell.

Application. Before the turn of the century, most large-size memories were packaged as stand-alone ICs. With the advent of the system-on-a-chip and the integration of multiple functions on a single die, an ever larger fraction of memory is now integrated on the same die as the logic functionality. Memories of this type are called embedded. The colocation of memory with logic has a severe impact on the memory design: it affects not only the choice of memory technology, but also the underlying circuit techniques.

Beyond semiconductor memories, other storage technologies such as magnetic and optical disks provide massive amounts of storage (gigabytes and more) in a cost-effective way, although they tend to be slower. These secondary or tertiary storage units are beyond the scope of this textbook.

12.1.2 Memory Architectures and Building Blocks

When implementing an N-word memory where each word is M bits wide, the most intuitive approach is to arrange the memory words in a linear fashion, as shown in Figure 12-3a. The length of a word varies between 1 and 128 bits. One word at a time is selected for reading or writing with the aid of a select bit ($S_0$ to $S_{N-1}$), if we assume that this module is a single-port memory. In other words, only one signal $S_i$ can be high at any time. For simplicity, let us temporarily assume that each storage cell is a D flip-flop and that the select signal is used to activate (clock) the cell. While this approach is relatively simple and works well for very small memories, one runs into a number of problems when trying to use it for larger memories.

Figure 12-3 Architectures for N-word memory (where each word is M bits): (a) intuitive architecture for an N × M memory, with one select signal $S_0 \ldots S_{N-1}$ per word; (b) a decoder reduces the number of address bits to $K = \log_2 N$ ($A_0$ to $A_{K-1}$).
Assume that we would like to implement a memory that holds 1 million ($N = 10^6$) 8-bit ($M = 8$) words. The reader should be aware that 1 million is a simplification of the actual memory size, since memory dimensions always come in powers of two. In this particular case, the actual number of words equals $2^{20} = 1024 \times 1024 = 1{,}048{,}576$. For ease of use, it is common practice to denote such a memory as a 1 Mword unit.
When implementing this structure using the strategy of Figure 12-3a, we quickly realize that 1 million select signals are needed, one for every word. Since these signals are normally provided from off-chip or from another part of the chip, this translates into insurmountable wiring and/or packaging problems. A decoder is inserted to reduce the number of select signals (Figure 12-3b). A memory word is selected by providing a binary encoded address word ($A_0$ to $A_{K-1}$). The decoder translates this address into $N = 2^K$ select lines, only one of which is active at a time. This approach reduces the number of external address lines from 1 million to 20 ($\log_2 2^{20}$) in our example, which virtually eliminates the wiring and packaging problems. The decoder is typically designed so that its dimensions are matched to the size of the storage cell and the connections between the two, in particular the S signals in Figure 12-3b, do not produce any overhead. The value of this approach can be appreciated by interpreting Figure 12-3b as a physical floor plan of the memory module. By performing the pitch matching between decoder and memory core, the S wires can be very short, and no large routing channel is required.
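The decoder arithmetic above can be checked with a short behavioral sketch. The function names `address_bits` and `decode` are hypothetical; the model only reproduces the counting argument (K address lines for N = 2^K one-hot select lines) and says nothing about the circuit implementation.

```python
from typing import List


def address_bits(num_words: int) -> int:
    """Number of address lines K needed so that 2**K equals num_words."""
    k = num_words.bit_length() - 1
    if num_words < 1 or (1 << k) != num_words:
        raise ValueError("number of words must be a power of two")
    return k


def decode(address: int, num_words: int) -> List[int]:
    """Return the select lines S_0 .. S_{N-1}; exactly one of them is 1."""
    if not 0 <= address < num_words:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(num_words)]


print(address_bits(2**20))  # address lines needed for the 1 Mword example
print(decode(5, 8))         # one-hot select lines of a small 8-word memory
```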
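Figure 12-4 Array-structured memory organization. The row decoder (address bits $A_K$ to $A_{L-1}$) drives $2^{L-K}$ word lines; the $M \cdot 2^K$ bit lines connect the storage cells to the sense amplifiers/drivers and to the column decoder (address bits $A_0$ to $A_{K-1}$), which selects the M input-output bits.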
While this resolves the select problem, it does not address the issue of the memory aspect ratio. Evaluation of the dimensions of the storage array of our token example shows that its height is approximately 128,000 times larger than its width ($2^{20}/2^3$), assuming the shape of the basic storage cell is approximately square, which is almost always the case. Obviously, this results in a design that cannot be implemented. Besides the bizarre shape factor, the resulting design is also extremely slow. The vertical wires connecting the storage cells to the input/outputs become excessively long. Remember that the delay of an interconnect line increases at least linearly with its length.

To address this problem, memory arrays are organized so that the vertical and horizontal dimensions are of the same order of magnitude; thus, the aspect ratio approaches unity. Multiple words are stored in a single row and are selected simultaneously. To route the correct word to the input/output terminals, an extra piece of circuitry called the column decoder is needed. The concept is illustrated in Figure 12-4. The address word is partitioned into a column address ($A_0$ to $A_{K-1}$) and a row address ($A_K$ to $A_{L-1}$). The row address enables one row of the memory for R/W, while the column address picks one particular word from the selected row.
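The aspect-ratio argument can be reproduced numerically. The sketch below assumes square storage cells, as the text does; the helper `organize` and its parameter `words_per_row` are hypothetical names introduced only for this illustration.

```python
from typing import Tuple


def organize(num_words: int, bits_per_word: int, words_per_row: int) -> Tuple[int, int]:
    """Return (rows, columns) of the cell array when several words share one row."""
    rows = num_words // words_per_row
    columns = bits_per_word * words_per_row
    return rows, columns


def aspect_ratio(rows: int, columns: int) -> float:
    """Height-to-width ratio of the core, assuming square cells."""
    return rows / columns


# Linear organization: one 8-bit word per row.
linear = organize(2**20, 8, words_per_row=1)
print(linear, aspect_ratio(*linear))

# Array organization: 256 words per row, selected by a column decoder.
array = organize(2**20, 8, words_per_row=256)
print(array, aspect_ratio(*array))
```

With one word per row the ratio is $2^{20}/2^3 = 131{,}072$, which is the "approximately 128,000" quoted in the text; placing 256 words in each row brings the ratio down to 2.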

Example 12.1 Memory Organization

An alternative choice would be to organize the memory core of our example as an array of 4000 by 2000 cells (to be more precise, 4096 × 2048), which approaches a square aspect ratio. Each of the 4000 rows stores 256 8-bit words. This results in a row address of 12 bits, while the column address measures 8 bits. It can be verified that the total address space still equals 20 bits.
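The partitioning of the 20-bit address in this example can be sketched as follows. The constants follow the example (8 column bits, 12 row bits); placing the column address in the low-order bits follows the $A_0$ to $A_{K-1}$ convention used for Figure 12-4, and the function name `split_address` is a hypothetical choice.

```python
from typing import Tuple

COLUMN_BITS = 8   # selects one of 256 words within a row
ROW_BITS = 12     # selects one of 4096 rows


def split_address(address: int) -> Tuple[int, int]:
    """Split a 20-bit word address into (row, column) fields."""
    if not 0 <= address < (1 << (ROW_BITS + COLUMN_BITS)):
        raise ValueError("address does not fit in 20 bits")
    column = address & ((1 << COLUMN_BITS) - 1)  # low-order bits A_0 .. A_7
    row = address >> COLUMN_BITS                 # high-order bits A_8 .. A_19
    return row, column


print(split_address((5 << 8) | 17))  # word 17 of row 5
print(split_address(2**20 - 1))      # last word of the last row
```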
Figure 12-4 introduces commonly used terminology. The horizontal select line that enables a single row of cells is called the word line, while the wire that connects the cells in a single column to the input/output circuitry is named the bit line.

The area of large memory modules is dominated by the size of the memory core. Thus, it is crucial to keep the size of the basic storage cell as small as possible. We could use one of the register cells introduced in Chapter 7 to implement a R/W memory. Such a cell easily requires more than 10 transistors per bit, and employing it in a large memory results in excessive area requirements. Semiconductor memory cells therefore reduce the cell area by trading off some desired properties of digital circuits, such as noise margin, logic swing, input/output isolation, fan-out, or speed. While a degradation of some of those properties is allowable within the confined domain of the memory core where noise levels can be tightly controlled, this is not acceptable when interfacing with the external or surrounding circuitry. The desired digital signal properties must be recovered with the aid of peripheral circuitry.
For example, it is common to reduce the voltage swing on the bit lines to a value substantially below the supply voltage. This reduces both the propagation delay and the power consumption. A careful control of the cross talk and other disturbances is possible within the memory array, ensuring that sufficient noise margin is obtained even for these small signal swings. Interfacing to the external world, on the other hand, requires an amplification of the internal swing to a full rail-to-rail amplitude. This is achieved by the sense amplifiers shown in Figure 12-4. The design of those peripheral circuits is discussed in Section 12.3. Relaxation of bounds on a number of the coveted digital properties makes it possible to reduce the transistor count of a single memory cell to between one and six transistors!
The architecture of Figure 12-4 works well for memories up to a range of 64 Kbits to 256 Kbits. Larger memories start to suffer from a serious speed degradation as the length, capacitance, and resistance of the word and bit lines become excessively large. Larger memories have consequently gone one step further and added one extra dimension to the address space, as illustrated in Figure 12-5.

The memory is partitioned into P smaller blocks. The composition of each of the individual blocks is identical to the one of Figure 12-4. A word is selected on the basis of the row and column addresses that are broadcast to all the blocks. An extra address word, called the block address, selects one of the P blocks to be read or written. This approach has a dual advantage.

1. The length of the local word and bit lines (that is, the length of the lines within the blocks) is kept within bounds, resulting in faster access times.
2. The block address can be used to activate only the addressed block. Nonactive blocks are put in power-saving mode with sense amplifiers and row and column decoders disabled. This results in a substantial power saving that is desirable, since power dissipation is a major concern in very large memories.
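The extra address dimension can be sketched as a three-field address. The field widths below (2 block bits, 10 row bits, 6 column bits) and the ordering of the fields are assumptions made purely for illustration; the text does not specify them. The names `decode_hierarchical` and `block_enables` are likewise hypothetical.

```python
from typing import List, NamedTuple


class DecodedAddress(NamedTuple):
    block: int
    row: int
    column: int


def decode_hierarchical(address: int, block_bits: int, row_bits: int,
                        column_bits: int) -> DecodedAddress:
    """Split an address into block, row, and column fields (block in the top bits)."""
    column = address & ((1 << column_bits) - 1)
    row = (address >> column_bits) & ((1 << row_bits) - 1)
    block = (address >> (column_bits + row_bits)) & ((1 << block_bits) - 1)
    return DecodedAddress(block, row, column)


def block_enables(block: int, num_blocks: int) -> List[bool]:
    """Only the addressed block is activated; all others stay in power-saving mode."""
    return [i == block for i in range(num_blocks)]


decoded = decode_hierarchical(0b10_0000000011_000101, block_bits=2,
                              row_bits=10, column_bits=6)
print(decoded)
print(block_enables(decoded.block, num_blocks=4))
```

The row and column fields are broadcast to every block, while the enable list models the block selector that wakes up a single block.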
Figure 12-5 Hierarchical memory architecture. The block selector enables a single memory block at a time. (Row, column, and block addresses and control circuitry feed blocks 0 to P-1, which share a global data bus and a global amplifier/driver connected to the I/O.)
The design of the timing and control circuitry is an integral, but often overlooked, part of the memory design. It has a major impact on both the reliability and the performance of the memory, and it is a demanding task that requires extensive simulation and design optimization over a range of manufacturing process tolerances and operating temperatures.

12.2 The Memory Core

This section concentrates on the design of the memory core and its composing cells for the different memory types. While the major issue in designing large memories is to keep the size of the cell as small as possible, this should be done so that other important design-quality measures such as speed and reliability are not fatally affected. Read-only memories are presented first, followed by nonvolatile and read-write memories. The section is concluded with a short discussion of contents-addressable (associative) memories.

12.2.1 Read-Only Memories

While the idea of a memory that can only be read might seem odd at first, a second glance reveals a large number of potential applications. Programs for processors with fixed applications such as washing machines, calculators, and game machines, once developed and debugged, need only reading. Fixing the contents at manufacturing time leads to small and fast implementations.

ROM Cells: An Overview

The fact that the contents of a ROM cell are permanently fixed considerably simplifies its design. The cell should be designed so that a 0 or 1 is presented to the bit line upon activation of its word line. Figure 12-9 shows several ways to accomplish this.

Figure 12-9 Different approaches for implementing 1 and 0 ROM cells: (a) diode ROM; (b) MOS ROM 1; (c) MOS ROM 2.

Consider first the simplest cell, which is the diode-based ROM cell shown in Figure 12-9a. Assume that the bit line BL is resistively clamped to ground; that is, BL is pulled low through a resistor connected to ground, lacking any other excitations or inputs. This is exactly what happens in the 0 cell. Since no physical connection between BL and WL exists, the value of BL is low, independent of the value of WL. On the other hand, when a high voltage $V_{WL}$ is applied to the word line of the 1 cell, the diode is enabled, and the bit line is pulled up to $V_{WL} - V_{D(on)}$, resulting in a 1 on the bit line. In summary, the presence or absence of a diode between WL and BL differentiates between ROM cells storing a 1 or a 0, respectively.

The disadvantage of the diode cell is that it does not isolate the bit line from the word line. All the current required to charge the bit-line capacitance, which can be quite large for larger memories, must be provided through the word line and its drivers; therefore, this approach only works for small memories. A better approach is to use an active device in the cell, as shown in Figure 12-9b. The diode is replaced by the gate-source connection of an NMOS transistor, whose drain is connected to the supply voltage. The operation is identical to that of the diode cell with one major difference: All output-driving current is provided by the MOS transistor in the cell. The word-line driver is only responsible for charging and discharging the word-line capacitance. The improved isolation comes at the penalty of a more complex cell and a larger area, caused primarily by the extra supply contact. This approach requires that a supply rail be distributed throughout the memory array. An example of a 4 × 4 array is shown in Figure 12-10. Notice how the overhead of the supply lines is reduced by sharing them between neighboring cells. This requires the mirroring of the odd cells around the horizontal axis, an approach that is extensively used in memory cores of all styles.

Figure 12-10 A 4 × 4 OR ROM cell array (word lines WL[0] to WL[3], bit lines BL[0] to BL[3], shared $V_{DD}$ supply lines, and pull-down loads on the bit lines).
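The behavior of such an OR-type ROM array can be summarized in a few lines of Python: a bit line goes high only if a cell device connects it to an activated word line, and it is otherwise held low by its pull-down load. The array contents below are hypothetical and are not taken from Figure 12-10; `read_bit_lines` is an illustrative name.

```python
from typing import List

# A 1 marks a cell that contains a diode or transistor (stores a 1),
# a 0 marks a cell without a device (stores a 0). Rows are word lines.
ROM_CONTENTS = [
    [1, 0, 1, 1],  # WL[0]
    [0, 1, 1, 0],  # WL[1]
    [1, 0, 0, 1],  # WL[2]
    [0, 0, 1, 1],  # WL[3]
]


def read_bit_lines(rom: List[List[int]], word_lines: List[int]) -> List[int]:
    """Return BL[0..]: each bit line is the OR over all activated cells in its column."""
    num_columns = len(rom[0])
    return [
        int(any(wl and row[col] for wl, row in zip(word_lines, rom)))
        for col in range(num_columns)
    ]


print(read_bit_lines(ROM_CONTENTS, [0, 1, 0, 0]))  # activating WL[1] reads row 1
print(read_bit_lines(ROM_CONTENTS, [0, 0, 0, 0]))  # no word line active: all low
```

The OR structure is also why only one word line may be active at a time in normal operation: with two active rows the bit lines would show the OR of both stored words.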
12.2.3 Read-Write Memories (RAM)

Providing a memory cell with roughly equal read and write performance requires a more complex cell structure. While the contents of the ROM and NVRWM memories are ingrained in the cell topology or programmed into the device characteristics, storage in RAM memories is based on either positive feedback or capacitive charge, similar to the ideas introduced in Chapter 7. These circuits would be perfectly suitable as R/W memory cells, but they tend to consume too much area. In this section, we introduce a number of simplifications that trade off area for either performance or electrical reliability. They are labeled as either SRAMs or DRAMs, depending on the storage concept used.
Static Random-Access Memory (SRAM)

The generic SRAM cell is introduced in Figure 12-26. It turns out to be quite similar to the static SR latch shown in Figure 7-21. It requires six transistors per bit. Access to the cell is enabled by the word line, which replaces the clock and controls the two pass transistors $M_5$ and $M_6$, shared between the read and write operation. In contrast to the ROM cells, two bit lines transferring both the stored signal and its inverse are required. Although providing both polarities is not a necessity, doing so improves the noise margins during both read and write operations, as will become apparent in the subsequent analysis.

Problem 12.5 CMOS SRAM Cell

Does the SRAM cell presented in Figure 12-26 consume standby power? Explain. Draw an equivalent pseudo-NMOS implementation. How about the standby power in that case?

Operation of SRAM Cell. The SRAM cell should be sized as small as possible to achieve high memory densities. Reliable operation of the cell, however, imposes some sizing constraints.

To understand the operation of the memory cell, let us consider the read and write operations in sequence. While doing so, we also derive the transistor-sizing constraints.

Figure 12-26 Six-transistor CMOS SRAM cell.

Example 12.8 CMOS SRAM Read Operation

Assume that a 1 is stored at $Q$. We further assume that both bit lines are precharged to 2.5 V before the read operation is initiated. The read cycle is started by asserting the word line, enabling both pass transistors $M_5$ and $M_6$ after the initial word-line delay. During a correct read operation, the values stored in $Q$ and $\overline{Q}$ are transferred to the bit lines by leaving $BL$ at its precharge value and by discharging $\overline{BL}$ through $M_1$–$M_5$. A careful sizing of the transistors is necessary to avoid accidentally writing a 1 into the cell. This type of malfunction is frequently called a read upset.

This is illustrated in Figure 12-27. Consider the $\overline{BL}$ side of the cell. The bit line capacitance for larger memories is in the pF range. Consequently, the value of $\overline{BL}$ stays at the precharged value $V_{DD}$ upon enabling of the read operation ($WL \rightarrow 1$). The series combination of the two NMOS transistors $M_1$ and $M_5$ pulls $\overline{BL}$ down towards ground. For a small-sized cell, we would like to have these transistors sized as close to minimum as possible, which would result in a very slow discharge of the large bit line capacitance. As the difference between $BL$ and $\overline{BL}$ builds up, the sense amplifier is activated to accelerate the reading process.

Initially, upon the rise of $WL$, the intermediate node between these two NMOS transistors, $\overline{Q}$, is pulled up toward the precharge value of $\overline{BL}$. This voltage rise of $\overline{Q}$ must stay low enough not to cause a substantial current through the $M_3$–$M_4$ inverter, which in the worst case could flip the cell. It is necessary to keep the resistance of transistor $M_5$ larger than that of $M_1$ to prevent this from happening.
The boundary constraints on the device sizes can be derived by solving the current equation at the maximum allowed value of the voltage ripple $\Delta V$. We ignore the body effect on transistor $M_5$ for simplicity and write

$$k_{n,M_5}\left[(V_{DD} - \Delta V - V_{Tn})\,V_{DSATn} - \frac{V_{DSATn}^2}{2}\right] = k_{n,M_1}\left[(V_{DD} - V_{Tn})\,\Delta V - \frac{\Delta V^2}{2}\right] \qquad (12.2)$$

which simplifies to

$$\Delta V = \frac{V_{DSATn} + CR\,(V_{DD} - V_{Tn}) - \sqrt{V_{DSATn}^2\,(1 + CR) + CR^2\,(V_{DD} - V_{Tn})^2}}{CR} \qquad (12.3)$$

where $CR$ is called the cell ratio and is defined as

$$CR = \frac{W_1/L_1}{W_5/L_5} \qquad (12.4)$$

Figure 12-27 Simplified model of CMOS SRAM cell during read ($Q = 1$, $V_{precharge} = V_{DD}$).

The value of the voltage rise $\Delta V$ as a function of $CR$ for our 0.25-μm technology is plotted in Figure 12-28. To keep the node voltage from rising above the transistor threshold (of about 0.4 V), the cell ratio must be greater than 1.2. For large memory arrays, it is desirable to keep the cell size minimal while maintaining read stability. If the transistor $M_1$ is minimum-sized, the access pass transistor $M_5$ has to be made weaker by increasing its length. This is undesirable, because it adds to the load of the bit line. A preferred solution is to minimize the size of the pass transistor and increase the width of the NMOS pull-down $M_1$ to meet the stability constraint. This slightly increases the minimum size of the cell. The designer must perform careful simulations to guarantee cell stability across all process corners [Preston01].
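Equation (12.3) is easy to evaluate numerically. The sketch below uses $V_{DD}$ = 2.5 V and $V_{Tn}$ = 0.4 V as quoted in the text; the velocity-saturation voltage $V_{DSATn}$ = 0.63 V is an assumed value for a generic 0.25-μm process and is not given in this section. The function name `read_voltage_rise` is hypothetical.

```python
import math

V_DD = 2.5      # supply and precharge voltage in volts (from the text)
V_TN = 0.4      # NMOS threshold in volts (from the text, "about 0.4 V")
V_DSATN = 0.63  # NMOS velocity-saturation voltage in volts (assumed value)


def read_voltage_rise(cr: float, v_dd: float = V_DD, v_tn: float = V_TN,
                      v_dsat: float = V_DSATN) -> float:
    """Voltage rise on the internal 0 node during a read, per Eq. (12.3)."""
    overdrive = v_dd - v_tn
    root = math.sqrt(v_dsat**2 * (1 + cr) + cr**2 * overdrive**2)
    return (v_dsat + cr * overdrive - root) / cr


for cr in (0.5, 1.0, 1.2, 2.0, 3.0):
    print(f"CR = {cr:3.1f}  ->  delta V = {read_voltage_rise(cr):.3f} V")
```

Under these assumed parameters the rise is about 0.45 V at CR = 1 and about 0.39 V at CR = 1.2, which is consistent with the statement that the cell ratio must exceed 1.2 to stay below the 0.4 V threshold.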

Figure 12-28 Voltage rise inside the cell upon read versus cell ratio (ratio of $M_1$/$M_5$). Axes: voltage rise [V] versus cell ratio (CR). The voltage inside the cell does not rise above the threshold for CR > 1.2.
The preceding analysis presents the worst case. The second bit line $BL$ clamps $Q$ to $V_{DD}$, which makes the inadvertent toggling of the cross-coupled inverter pair difficult. This demonstrates one of the major advantages of the dual bit line architecture.

Beyond adjusting the size of the cell transistors, the erroneous toggling can be prevented by precharging the bit lines to another value, such as $V_{DD}/2$. This effectively makes it impossible to reach the switching threshold of the connecting inverter. Precharging to the midpoint of the voltage range has some performance benefits as well, since it limits the voltage swing on the bit lines.

Example 12.9 CMOS SRAM Write Operation

In this example, we derive the device constraints necessary to ensure a correct write operation. Assume that a 1 is stored in the cell ($Q = 1$). A 0 is written in the cell by setting $\overline{BL}$ to 1 and $BL$ to 0, which is identical to applying a reset pulse to an SR latch. This causes the flip-flop to change state if the devices are sized properly.

During the initiation of a write, the schematic of the SRAM cell can be simplified to the model of Figure 12-29. It is reasonable to assume that the gates of transistors $M_1$ and $M_4$ stay at $V_{DD}$ and GND, respectively, as long as the switching has not commenced. While this condition is violated once the flip-flop starts toggling, the simplified model is more than accurate enough for hand-analysis purposes.

Note that the $\overline{Q}$ side of the cell cannot be pulled high enough to ensure the writing of a 1. The sizing constraint, imposed by the read stability, ensures that this voltage is kept below 0.4 V. Therefore, the new value of the cell has to be written through transistor $M_6$.

A reliable writing of the cell is ensured if we can pull node $Q$ low enough, that is, below the threshold value of the transistor $M_1$. The conditions for this to occur can be derived by writing out the dc current equations at the desired threshold point, as follows:

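$$k_{n,M_6}\left[(V_{DD} - V_{Tn})\,V_Q - \frac{V_Q^2}{2}\right] = k_{p,M_4}\left[(V_{DD} - |V_{Tp}|)\,V_{DSATp} - \frac{V_{DSATp}^2}{2}\right] \qquad (12.5)$$

Solving for $V_Q$ leads to

$$V_Q = V_{DD} - V_{Tn} - \sqrt{(V_{DD} - V_{Tn})^2 - 2\,\frac{\mu_p}{\mu_n}\,PR\left[(V_{DD} - |V_{Tp}|)\,V_{DSATp} - \frac{V_{DSATp}^2}{2}\right]} \qquad (12.6)$$

where $PR$ is the pull-up ratio of the cell, the size ratio between the PMOS pull-up $M_4$ and the NMOS pass transistor $M_6$: $PR = (W_4/L_4)/(W_6/L_6)$.

Figure 12-29 Simplified model of CMOS SRAM cell during write ($Q = 1$).

(In principle, it is sufficient to pull $Q$ below the switching threshold of the inverter formed by $M_1$ and $M_2$ to ensure the initiation of the switching action. For noise margin purposes, it is safer to require that $Q$ is pulled below the threshold of $M_1$.)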
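Equation (12.6) can be evaluated in the same way as the read constraint. In the sketch below only $V_{DD}$ = 2.5 V and the 0.4 V threshold come from the text; the PMOS parameters ($|V_{Tp}|$ = 0.4 V, $|V_{DSATp}|$ = 1 V) and the mobility ratio $\mu_p/\mu_n$ = 30/115 are assumed values for a generic 0.25-μm process. The function name `write_node_voltage` is hypothetical.

```python
import math

V_DD = 2.5               # supply voltage in volts (from the text)
V_TN = 0.4               # NMOS threshold in volts (from the text)
V_TP_ABS = 0.4           # |V_Tp| in volts (assumed)
V_DSATP_ABS = 1.0        # |V_DSATp| in volts (assumed)
MOBILITY_RATIO = 30 / 115  # mu_p / mu_n (assumed)


def write_node_voltage(pr: float) -> float:
    """Voltage reached at node Q while writing a 0, per Eq. (12.6)."""
    overdrive = V_DD - V_TN
    pull_up = (V_DD - V_TP_ABS) * V_DSATP_ABS - V_DSATP_ABS**2 / 2
    discriminant = overdrive**2 - 2 * MOBILITY_RATIO * pr * pull_up
    if discriminant < 0:
        # The PMOS pull-up is too strong for the pass transistor to overpower.
        raise ValueError("no solution: pull-up ratio too large")
    return overdrive - math.sqrt(discriminant)


for pr in (0.5, 1.0, 1.5, 2.0):
    print(f"PR = {pr:3.1f}  ->  V_Q = {write_node_voltage(pr):.3f} V")
```

With these assumed values, $V_Q$ stays below the 0.4 V threshold as long as the pull-up ratio is smaller than roughly 1.8; a different process would shift that bound.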

[…] technologies is achieved by using devices with higher thresholds. The resistor current should be at least two orders of magnitude larger to accomplish that goal, or $I_{load} > 10^{-13}$ A. This puts an upper limit on the resistor value.
devices are
The realization that the pull-upsix-transistor memory cellof Figure 12-26. Instead
of nei
resulted in arevised version of
the
pull-up transistors are realized as parasitic devin
devices, the These PMOS thin-filin tro
traditional, expensive PMOS athin-film technology.
using
deposited on top of the cell structure respect to normaldevices and are characterized
bva
properties with a5V gate
sistors (TFTS) have inferior the ON and OFF modes respectively for
10A and 10- in
current of approximately complimentary nature of the cell results in
an
Ootani90]. The
source voltage .[Sasaki90, to leakage and soft errors, yet at a lower standby
increased cell reliability with less sensitivity
current compared to the resistive load cell. transistors are additional features that
polysilicon and thin-film
Note that a high-resistivity Therefore, embedded SRAM cells, such as
those
available in a standard logic process.
are not
conventional 6T cell of Figure 12-26.
used in microprocessor caches, stick to the
Dynamic Random-Access Memory (DRAM)
While discussing the resistive-load SRAM cell, we noted that the only function of the load
resistors is to replenish the charge lost by leakage. One option is to eliminate these loads
completely and compensate for the charge loss by periodically rewriting the cell contents. This
refresh operation, which consists of a read of the cell contents followed by a write operation,
should occur often enough that the contents of the memory cells are never corrupted by the
leakage. Typically, refresh should occur every 1 to 4 ms. For larger memories, the reduction in
cell complexity more than compensates for the added system complexity imposed by the refresh
requirement. These memories are called dynamic, since the underlying concept of these cells is
based on charge storage on a capacitor.
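
The link between leakage and the required refresh rate can be made concrete with a first-order estimate: a capacitor C that loses charge through a constant leakage current I changes its voltage by ΔV after a time t = C·ΔV / I. The sketch below applies this relation. The capacitance, tolerable voltage loss, and leakage current are hypothetical values picked only to land in the millisecond range mentioned above, and the constant-current leakage model and the function names are assumptions as well.

```python
def retention_time(c_storage, delta_v_allowed, i_leak):
    """Time in seconds for a constant leakage current to shift the stored
    voltage by delta_v_allowed (first-order model: t = C * dV / I)."""
    return c_storage * delta_v_allowed / i_leak


def refresh_interval_ok(t_refresh, c_storage, delta_v_allowed, i_leak):
    """True if the cell is rewritten before leakage corrupts its contents."""
    return t_refresh < retention_time(c_storage, delta_v_allowed, i_leak)


if __name__ == "__main__":
    # Hypothetical cell: 30 fF storage node, 0.2 V tolerable loss, 1 pA leakage.
    c_s, dv, i_leak = 30e-15, 0.2, 1e-12
    print(f"retention time: {retention_time(c_s, dv, i_leak) * 1e3:.1f} ms")
    for t_refresh in (1e-3, 4e-3, 8e-3):
        ok = refresh_interval_ok(t_refresh, c_s, dv, i_leak)
        print(f"refresh every {t_refresh * 1e3:.0f} ms -> safe: {ok}")
```

In a real memory the leakage current depends strongly on temperature and on the stored voltage, so the refresh interval is chosen with a generous margin relative to such an estimate.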
Three-Transistor Dynamic Memory Cell  The first kind of dynamic cell is obtained by eliminating
the load resistors in the schematic of Figure 12-32. The four-transistor cell can be further
simplified by observing that the cell stores both the data value and its complement; hence, it
contains redundancy. Eliminating one more device (e.g., M1) removes this redundancy and results in
the three-transistor (3T) cell of Figure 12-33 [Regitz70]. This cell formed the core of the first
popular MOS semiconductor memories such as the first 1-Kbit memory from Intel [Hoff70].
While replaced by more area-efficient cells in the very large memories of today, it is still the
cell of choice in many memories embedded in application-specific integrated circuits. This can be
attributed to its relative simplicity in both design and operation.
The cell is written to by placing the appropriate data value on BL1 and asserting the write-word
line (WWL). The data is retained as charge on capacitance C_S once WWL is lowered. When
reading the cell, the read-word line (RWL) is raised. The storage transistor M2 is either on or off
depending upon the stored value. The bit line BL2 is either clamped to V_DD with the aid of a load
device, for example, a grounded PMOS or saturated NMOS transistor, or is precharged to either
V_DD or V_DD - V_T.

Figure 12-33 Three-transistor dynamic memory cell and the signal waveforms during read and
write.

The former approach necessitates careful transistor sizing and causes static power
consumption. Therefore, the precharged approach is generally preferable. The series connection
of M2 and M3 pulls BL2 low when a 1 is stored. BL2 remains high in the opposite case.
Notice that the cell is inverting; that is, the inverse value of the stored signal is sensed on the
bit line. The most common approach to refreshing the cell is to read the stored data, put its
inverse on BL1, and assert WWL in consecutive order.
The cell complexity is substantially reduced with respect to the static cell. This is illustrated
by the example layout of Figure 12-34. The total area of the cell is 576 λ², compared to
the 1092 λ² of the SRAM cell of Figure 12-31. These numbers do not take into account the
potential area reduction obtained by sharing with neighboring cells. The area reduction is mainly
due to the elimination of contacts and devices.
Further simplifications in the cell structure are possible at the expense of a more complex
circuit operation. For instance, bit lines BL1 and BL2 can be merged into a single wire. The read
and write cycles can proceed as before. The read-sense-write refresh cycle must be altered
considerably, since the data value read from the cell is the complement of the stored value. This
requires the bit line to be driven to both values in a single cycle. Another option is to merge the
RWL and the WWL lines. Once again, this does not significantly change the cell operation. A
read operation is automatically accompanied by a refresh of the cell contents. A careful control
of the word-line voltage is necessary to prevent a writing of the cell before the actual value is
read during refresh.
Finally, the following interesting properties of the 3T cell are worth mentioning:
1. In contrast to the SRAM cell, no constraints exist on the device ratios. This is a common
property of dynamic circuits. The choice of device sizes is solely based on performance
and reliability considerations. Observe that this statement is not valid when a static bit line
load approach is employed.

Figure 12-34 Example layout of three-transistor dynamic memory cell.


2. In contrast to other DRAM cells, reading the 3T cell contents is nondestructive; that is, the
data value stored in the cell is not affected by a read.
3. No special process steps are needed. The storage capacitance is nothing more than the gate
capacitance of the readout device. This is in contrast with the other DRAM cells, discussed
next, and makes the 3T cell attractive for embedded memory applications.
4. The value stored on the storage node X when writing a 1 equals V_WWL - V_Tn. This threshold
loss reduces the current flowing through M2 during a read operation and increases the read
access time. To prevent this, some designs bootstrap the word-line voltage, or in other
words, raise V_WWL to a value higher than V_DD.
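
The write, read, and refresh behavior of the 3T cell described above can be summarized in a small behavioral model. This is only a logic-level sketch: the class and method names are hypothetical, the precharged bit-line scheme is assumed for BL2, and the voltages passed to `stored_high_level` are illustrative rather than taken from the text.

```python
class ThreeTransistorCell:
    """Logic-level model of a 3T dynamic cell with a precharged read bit line."""

    def __init__(self):
        self.x = 0  # logic value held on storage node X

    def write(self, bl1):
        """Place the data on BL1 and pulse WWL: X takes the value of BL1."""
        self.x = 1 if bl1 else 0

    def read(self):
        """Precharge BL2 high and raise RWL.

        The series connection of M2 and M3 pulls BL2 low only when a 1 is
        stored, so the sensed value is the inverse of the stored one. The
        storage node itself is not disturbed (nondestructive read).
        """
        bl2 = 1                 # precharged level
        if self.x == 1:
            bl2 = 0             # M2 is on, BL2 is discharged
        return bl2

    def refresh(self):
        """Read the cell, put the inverse of the sensed value on BL1, assert WWL."""
        sensed = self.read()
        self.write(1 - sensed)


def stored_high_level(v_wwl, v_tn, v_dd):
    """Voltage on X after writing a 1 through the NMOS write transistor."""
    return min(v_dd, v_wwl - v_tn)


if __name__ == "__main__":
    cell = ThreeTransistorCell()
    cell.write(1)
    print("sensed on BL2:", cell.read())          # inverse of the stored 1
    cell.refresh()
    print("stored after refresh:", cell.x)
    print("X without bootstrapping:", stored_high_level(2.5, 0.4, 2.5))
    print("X with bootstrapped WWL:", stored_high_level(3.0, 0.4, 2.5))
```

The last two lines illustrate property 4: with the word line at V_DD the stored 1 is one threshold below the supply, while a word line raised by at least V_Tn above V_DD restores the full level.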
One-Transistor Dynamic Memory Cell  Another dramatic reduction in cell complexity can be
obtained by a further sacrifice in some of the cell properties. The resulting structure, called a
one-transistor DRAM cell (1T), is undoubtedly the most pervasive dynamic DRAM cell in
commercial memory design (footnote 6). A schematic is shown in Figure 12-35 [Dennard68].

Footnote 6. A DRAM cell containing only two transistors can also be conceived. It offers no
substantial advantages over the 3T or 1T cell and is, therefore, only rarely used.

Figure 12-35 One-transistor dynamic RAM cell and the corresponding signal waveforms during
read and write.

Its basic operational concepts are extremely simple. During a write cycle, the data value is
placed on the bit line BL, and the word line WL is raised. Depending on the data value, the cell
capacitance is either charged or discharged. Before a read operation is performed, the bit line is
precharged to a voltage V_PRE. Upon asserting the word line, a charge redistribution takes place
between the bit line and storage capacitance. This results in a voltage change on the bit line, the
direction of which determines the value of the data stored. The magnitude of the swing is given by
the expression

$$\Delta V = V_{BL} - V_{PRE} = (V_{BIT} - V_{PRE})\,\frac{C_S}{C_S + C_{BL}} \tag{12.8}$$
where C_BL is the bit line capacitance, V_BL the potential of the bit line after the charge
redistribution, and V_BIT the initial voltage over the cell capacitance C_S. As the cell capacitance
is normally one or two orders of magnitude smaller than the bit line capacitance, this voltage
change is very small, typically around 250 mV for state-of-the-art memories [Itoh90]. The ratio
C_S/(C_S + C_BL) is called the charge-transfer ratio and ranges between 1% and 10%.
Amplification of ΔV to the full voltage swing is necessary if functionality is to be
achieved. This observation marks a first major difference between the 1T and 3T, as well as
other, DRAM cells.
1. A 1T DRAM requires the presence of a sense amplifier for each bit line to be functional.
This is a result of the charge-redistribution-based readout. The read operation of all cells
discussed previously relies on current sinking. A sense amplifier is only needed to speed
up the readout, not for functionality considerations. It is also worth noticing that the
DRAM memory cells are single ended in contrast to the SRAM cells, which present both
the data value and its complement on the bit lines. This complicates the design of the sense
amplifier, as will be discussed in the section on periphery.
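
To get a feel for why a sense amplifier is indispensable, the charge-redistribution expression for the bit-line swing can be evaluated numerically. The sketch below does so for one set of hypothetical values (C_S = 50 fF, C_BL = 1 pF, V_PRE = 1.25 V, and stored levels of 0 V and 1.9 V); these numbers and the function names are assumptions for illustration only.

```python
def charge_transfer_ratio(c_s, c_bl):
    """Fraction of the initial voltage difference that appears on the bit line."""
    return c_s / (c_s + c_bl)


def bitline_swing(v_bit, v_pre, c_s, c_bl):
    """Bit-line voltage change after charge redistribution:
    dV = (V_BIT - V_PRE) * C_S / (C_S + C_BL)."""
    return (v_bit - v_pre) * charge_transfer_ratio(c_s, c_bl)


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the order of magnitude.
    c_s, c_bl, v_pre = 50e-15, 1e-12, 1.25
    print(f"charge-transfer ratio: {charge_transfer_ratio(c_s, c_bl):.1%}")
    for label, v_bit in (("0", 0.0), ("1", 1.9)):
        dv = bitline_swing(v_bit, v_pre, c_s, c_bl)
        print(f"stored {label}: dV = {dv * 1e3:+.1f} mV")
```

For these assumed values the swing is a few tens of millivolts, negative for a stored 0 and positive for a stored 1, which is far too small to be used directly as a logic level.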
