Lecture 18-20 Design For Testability-Memory Testing and MBIST
Sources:
i. Book :Bushnell and Agrawal
ii. NPTEL Lecture notes
iii. Book: Miron Abramovici
Memory Testing
Introduction
•So far, VLSI testing has been discussed only in the context of circuits composed of logic gates and flip-flops.
•However, memory blocks form a very important part of digital circuits but are not
composed of logic gates and flip-flops. This necessitates different fault models and test
techniques for memory blocks.
•In memory technology, the capacity quadruples roughly every 3 years, which leads to a
decrease in memory price per bit stored.
•High storage capacity is obtained by increasing density, which implies a decrease in the size of
the circuit (capacitor) used to store a bit. New materials with a high dielectric
constant, such as barium strontium titanate, are being investigated because they allow greater
capacitance to be maintained in the same physical space.
•Further, for faster access to the memory, various methods are being developed, including
fast page mode (FP), extended data output (EDO), synchronous DRAM
(SDRAM), double data rate (DDR), etc.
•Unlike general logic circuits, faulty memory chips are generally not discarded.
•Multiple faults will be present in almost any memory chip. If every chip with a defect were
discarded, the yield of memory chips would be nearly 0%. During manufacturing test, the faults
must therefore not only be detected; their locations (in terms of cell number) must also be diagnosed.
•As almost all memories will have faults in some cells, there are redundant (extra) cells in
the memory. Once a fault is diagnosed, the corresponding cell is disconnected and a new
fault free cell is connected in the appropriate position. This replacement is achieved by
blowing fuses (using laser) to reroute defective cells to normal spare cells.
•The sole function of a cell is to store one bit of information, which is implemented using a
capacitor; when the capacitor is charged it represents 1 and when there is no charge it
represents 0. No logic gates are involved in a memory cell. Using logic gates (in flip-flops)
instead of capacitors to store bit information would lead to a very large area.
•The above two points (repair through redundant cells and capacitor-based storage) are what
differentiate testing of memory from testing of logic gate circuits.
•New fault models and test procedures are required for testing memories. In this lecture we
will study the most widely used fault models and test techniques for such fault models in
memories.
Memory fault models
• When data is to be read from the memory, first the row and column decoders determine
the location (i.e., the cell) from the address (sent in the address bus) that needs to be
accessed.
• Based on the address in the row and column decoders the cell of the appropriate row and
column gets connected to the sense amplifier, which sends the data out.
• Similar situation (for accessing the required cells) holds when data is to be written in the
memory, however, in case of writing, special driver circuitry writes the values in the
cells from the data bus.
It may be noted that from the testing perspective we would only check if
•Required value (0/1) can be written to a cell
•The stored value can be read from a cell
•The proper cell is accessed, i.e., the row and column decoder do not have faults.
The row and column decoders are digital circuits implemented using logic gates (which are
different from memory cell implementation).
In testing of memory, we do not consider the decoders as gate level digital circuits nor the
sense amplifier and driver as analog circuits. For the decoders, we test the functionality
whether they can access the desired cells based on the address in the address bus. For the
amplifier and driver we check if they can pass the values to and from the cells correctly.
The following faults called “reduced functional faults” are sufficient for functional memory
testing
•Stuck-at fault
•Transition fault
•Coupling fault
•Neighborhood pattern sensitive fault
•Address decoder faults
March Test Notations
Stuck-at fault
[Figure: State diagram for a s-a-0 memory cell and a s-a-1 memory cell]
Stuck-at-fault in memory is the one in which the logic value of a cell (or line in the
sense amplifier or driver) is always 0 or always 1.
S0 is the state where the cell contains 0, while S1 is the state where it contains 1.
w1 (w0) indicates value of 1 (0) being written.
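The stuck-at behaviour described above can be sketched as a tiny cell model. This is only an illustration: the class name `Cell`, its `stuck` parameter and the helper `detects_stuck_at` are hypothetical names, not part of the lecture.

```python
class Cell:
    """One memory cell; `stuck` is None (fault-free), 0 (s-a-0) or 1 (s-a-1)."""

    def __init__(self, stuck=None):
        self.stuck = stuck
        self.value = 0 if stuck is None else stuck

    def write(self, bit):
        # A stuck-at cell ignores every write and stays in its stuck state.
        if self.stuck is None:
            self.value = bit

    def read(self):
        return self.value


def detects_stuck_at(cell):
    """Write and read back both values; any mismatch reveals a stuck-at fault."""
    for bit in (0, 1):
        cell.write(bit)
        if cell.read() != bit:
            return True
    return False
```

Writing both w0 and w1 is needed because a s-a-0 cell still reads back correctly after w0, and a s-a-1 cell after w1.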
Transition fault
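A transition fault is one in which a cell fails to make a 0 to 1 transition or a 1 to 0
transition when it is written.
[Figure: State diagram for a memory cell with a transition fault]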
In other words, in the widely used coupling fault model it is assumed that any “two” cells
can couple and normal behavior changes in these two cells; it is called 2-coupling fault
model.
So if there are n cells in a memory then there can be nC2 = n(n-1)/2 2-coupling faults.
To reduce the number of 2-coupling faults further from nC2, we assume that only
neighboring cells (decided by a threshold distance) can be involved in the fault. We consider two
types of coupling faults, namely (i) inversion coupling faults and (ii) idempotent coupling faults.
Inversion coupling faults
In a 2-inversion coupling fault, say cfinv(i,j), involving cells i and j, a transition (0 to 1 or
1 to 0) in memory cell j causes an unwanted change in memory cell i. Memory cell i is
the coupled cell (where the fault occurs) and memory cell j is the coupling cell. The two
possible 2-inversion coupling faults involving cells i, j (denoted as cfinv(i,j)) are
• Rising: <↑|↕> (implying 0 to 1 change in cell j complements the content of
cell i)
• Falling: <↓|↕> (implying 1 to 0 change in cell j complements the content of
cell i)
[Figure: The state diagram for two cells i and j under normal condition.]
State S00 implies that both the cells have 0 values
The self loop at state S00, marked w0@i, implies that if 0 is written to cell i then the same state is
retained; another transition, w0@j, is associated with the same self loop, which implies that if 0 is
written to cell j then S00 is retained.
If we write 1 to cell j (from state S00 ) i.e., w1@j, we go to state S01; this is indicated by the
transition from S00 to S01 marked w1@j.
Faulty States
Idempotent coupling faults
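In a 2-idempotent coupling fault, say cfid(i,j), involving cells i and j, a transition (0 to 1
or 1 to 0) in memory cell j sets the value in memory cell i to 0 or 1. The four
possible 2-idempotent coupling faults involving cells i, j (denoted as cfid(i,j)) are
• Rising-0: <↑|0> (0 to 1 change in cell j sets the content of cell i to 0)
• Rising-1: <↑|1> (0 to 1 change in cell j sets the content of cell i to 1)
• Falling-0: <↓|0> (1 to 0 change in cell j sets the content of cell i to 0)
• Falling-1: <↓|1> (1 to 0 change in cell j sets the content of cell i to 1)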
[Figure: The state machine for two cells i and j under rising-1 idempotent coupling fault cfid(i,j).]
We note that under normal condition if we write 1 to cell j (from state S00) we go to state
S01; however, under rising-1 cfid(i,j) we go to state S11. This situation is similar to the
inversion coupling fault.
However, unlike the inversion coupling fault, in the rising-1 idempotent coupling fault we do not
have a faulty transition from S10 to S01.
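The difference between the two fault types can be sketched as a next-state function for the pair of cells. The function name `write_j` and the fault labels `"cfinv_rising"` and `"cfid_rising_1"` are hypothetical; a state is the tuple (value of cell i, value of cell j), matching S00, S01, S10, S11 above.

```python
def write_j(state, bit, fault=None):
    """Return the new (i, j) state after writing `bit` to the coupling cell j.

    fault is None (normal), "cfinv_rising" or "cfid_rising_1".
    """
    i, j = state
    rising = (j == 0 and bit == 1)
    if rising and fault == "cfinv_rising":
        i = 1 - i   # rising transition in j complements the coupled cell i
    elif rising and fault == "cfid_rising_1":
        i = 1       # rising transition in j forces the coupled cell i to 1
    return (i, bit)
```

From S00 both faults lead to S11 on w1@j, but from S10 only the inversion fault produces the faulty move to S01; the idempotent fault leaves cell i at 1.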
Bridging fault
A bridging fault is a short circuit between two or more cells. As in the case of coupling
faults, to keep the number of faults within a practical number, it is assumed that only two
cells can be involved in a bridging fault. There are two types of bridging faults
• AND bridging fault ANDbf(i,j) (involving cells i and j), which results in the values in cells
i and j being the logic AND of the values in these cells under normal condition. An AND
bridging fault is represented by <vi, vj | vi AND vj, vi AND vj>, where the first two places
represent the values in cells i and j under normal condition and the two values
following "|" represent the values in cells i and j under the AND bridging fault.
<0,0|0,0>, <0,1|0,0>, <1,0|0,0>, <1,1|1,1> are the four types of AND bridging faults
possible.
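• OR bridging fault ORbf(i,j) (involving cells i and j), which results in the values in cells i
and j being the logic OR of the values in these cells under normal condition.
<0,0|0,0>, <0,1|1,1>, <1,0|1,1>, <1,1|1,1> are the four types of OR bridging faults
possible.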
Neighborhood pattern sensitive coupling faults
One of the most important kinds of fault in memory, and one that differs from logic gate
circuits, is the neighborhood pattern sensitive fault (NPSF). As memory cells are very close
to each other, the cells behave normally except for certain patterns in the neighborhood
cells. For example, if a cell i has 0 and all the neighboring cells have 1, then the value of
cell i may be pulled up to 1.
Given a cell, there can be a very large number of neighborhood combinations.
However, for all practical cases two types of neighborhoods are used in fault
models: Type-1 and Type-2.
Type-1 neighborhood
The black colored cell is the one under test and the four cells around it (filled by small
check boxes) are called neighborhood cells. Patterns in the neighborhood cells cause
faults in the cell under test.
Type-2 neighborhood
More complex than the Type-1 neighborhood.
• Active NPSF (ANPSF)
The value in the cell under test changes due to a change in ONE cell of the neighborhood
(type-1 or type-2 depending on the one being used); all other cells of the neighborhood
hold fixed values. An ANPSF is represented as <vcut; v0, v1, v3, v4 | fe>, where vcut is
the value in the cell under test, v0, v1, v3, v4 represent the values in the neighboring cells (at
cell no. 0,1,3,4 respectively) including the one which changes, and fe represents the fault
effect in the cell under test. For example, <1; 0, ↓, 0, 0 | 0> represents the ANPSF where the
cell under test initially has value of 1, the pattern made by neighboring cells is 0000
(values at cell no. 0,1,3,4 respectively) and the fault effect at the cell under test is 0 when a 1 to 0
transition is made in cell 1.
• Passive NPSF (PNPSF)
PNPSF implies that a certain neighborhood pattern prevents the cell under test from
changing its value. A PNPSF is represented as <vcut; v0, v1, v3, v4 | fe>, where vcut is
the value in the cell under test, v0, v1, v3, v4 represent the values in the neighboring
cells and fe represents the fault effect in the cell under test. There can be three types of
fe for a PNPSF:
o ↑|0 : cell under test cannot be changed from 0 to 1 (initial value of cell
under test is 0)
o ↓|1 : cell under test cannot be changed from 1 to 0 (initial value of cell
under test is 1)
o ↕|x : cell under test cannot be changed regardless of content.
Testing of memory faults
1. Write 0 to all the memory cells;
2. In decreasing order of address of the memory cells, read the cells (expected
value 0) and write 1 to the cells;
3. In increasing order of address of the memory cells, read the cells (expected value
1) and write 0 to the cells;
4. In decreasing order of address of the memory cells, read the cells (expected
value 0);
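The March steps can be sketched on a simulated memory as follows. This is a minimal illustration, not a production test: the `Memory` class with its `stuck_at` fault map and the function `march_test` are hypothetical names, and Step 1 is assumed to write 0 to every cell in increasing address order.

```python
class Memory:
    """Bit-oriented memory model; stuck_at maps address -> stuck value."""

    def __init__(self, size, stuck_at=None):
        self.stuck_at = dict(stuck_at or {})
        self.cells = [self.stuck_at.get(a, 0) for a in range(size)]

    def __len__(self):
        return len(self.cells)

    def write(self, addr, bit):
        if addr not in self.stuck_at:   # stuck cells ignore writes
            self.cells[addr] = bit

    def read(self, addr):
        return self.cells[addr]


def march_test(memory):
    """Run the four March steps; return the sorted addresses that failed a read."""
    n = len(memory)
    failures = set()

    def read_expect(addr, expected):
        if memory.read(addr) != expected:
            failures.add(addr)

    for a in range(n):              # Step 1 (assumed order): write 0 everywhere
        memory.write(a, 0)
    for a in reversed(range(n)):    # Step 2: decreasing order, read 0 then write 1
        read_expect(a, 0)
        memory.write(a, 1)
    for a in range(n):              # Step 3: increasing order, read 1 then write 0
        read_expect(a, 1)
        memory.write(a, 0)
    for a in reversed(range(n)):    # Step 4: decreasing order, read 0
        read_expect(a, 0)
    return sorted(failures)
```

For example, `march_test(Memory(8, {3: 0, 5: 1}))` reports addresses 3 and 5: the s-a-1 cell fails the read of 0 in Step 2 and the s-a-0 cell fails the read of 1 in Step 3.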
March test obviously tests s-a-0 and s-a-1 faults in the cells because 0 and 1 in
each cell is written and read back.
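Transition faults are tested as well: Step 1 writes 0 to the cells, Step 2 writes 1 to the cells and
Step 3 reads the value 1 back from the cells. So, Step 1 through Step 3 test for the absence of the
↑|0 fault. In a similar manner, Step 3 and Step 4 (write 0, then read 0) test for the absence of the
↓|1 fault.
[…] descending order of memory address of cells, both i and j are either visited before or after cell
k.
[Figure: Cell traversal order for cells k, j and i when each cell is written with 1]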
March Test: Coupling Faults
In Step 1 of the March test all the cells i, j, k are written with 0. Following that, in Step 2, all the
cells (in the order k, j, i) are written with 1 (after successful reading of 0 from the cells). It may
be noted that cell k is written with 1 first; as cell i is coupled with cell k having fault <↑|↕>, the
0 to 1 transition in cell k inverts the content of cell i. Following that, cell j is written with 1; as
cell i is also coupled with cell j having fault <↑|↕>, the 0 to 1 transition in cell j inverts the
content of cell i again. Now when cell i is read, the value obtained is 0, which wrongly indicates the
absence of the two coupling faults (i) rising cfinv(i,j) and (ii) rising cfinv(i,k). In other words,
“rising cfinv(i,k)” and “rising cfinv(i,j)” mask each other.
Inverting rising coupling fault <↑|↕> between cell i and j: (i) Cell j is to be written with a
0 and read back, (ii) value at cell i is to be read and remembered,
(iii) cell j is to be written with a 1 and read back, and (iv) value at cell i is to be read and
checked that it is same as the one remembered (i.e., no inversion has happened).
Inverting falling coupling fault <↓|↕> between cell i and j: (i) Cell j is to be written with a
1 and read back, (ii) value at cell i is to be read and remembered, (iii) cell j is to be written with
a 0 and read back, and (iv) value at cell i is to be read and checked that it is same as the one
remembered (i.e., no inversion has happened).
Idempotent Rising-0 coupling fault <↑|0> between cell i and j: (i) Cell j is to be written
with a 0 and read back, (ii) cell i is to be written with 1 and read back, (iii) cell j is to be written
with a 1 and read back, and (iv) value at cell i is to be read and checked to be 1.
Idempotent Rising-1 coupling fault <↑|1> between cell i and j: (i) Cell j is to be written
with a 0 and read back, (ii) cell i is to be written with 0 and read back, (iii) cell j is to be written
with a 1 and read back, and (iv) value at cell i is to be read and checked to be 0.
Idempotent Falling-0 coupling fault <↓|0> between cell i and j: (i) Cell j is to be
written with a 1 and read back, (ii) cell i is to be written with 1 and read back, (iii) cell j is to be
written with a 0 and read back, and (iv) value at cell i is to be read and checked to be 1.
Idempotent Falling-1 coupling fault <↓|1> between cell i and j: (i) Cell j is to be written
with a 1 and read back, (ii) cell i is to be written with 0 and read back, (iii) cell j is to be written
with a 0 and read back, and (iv) value at cell i is to be read and checked to be 0.
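The four-step procedures for a cell pair can be sketched with one helper that prepares the coupling cell j, loads the coupled cell i, triggers the transition in j and re-reads i. The class `CoupledMemory`, its parameters and the function `coupling_fault_present` are hypothetical names for illustration; the helper writes a known value to cell i instead of reading and remembering it, which serves the same purpose for the inversion faults.

```python
class CoupledMemory:
    """Memory with at most one 2-coupling fault from cell j (coupling) to cell i (coupled).

    kind:   None, "inv" (inversion) or "id" (idempotent)
    edge:   "rising" (0 to 1 in j) or "falling" (1 to 0 in j)
    forced: value forced into cell i by an idempotent fault
    """

    def __init__(self, size, i=None, j=None, kind=None, edge=None, forced=None):
        self.cells = [0] * size
        self.i, self.j = i, j
        self.kind, self.edge, self.forced = kind, edge, forced

    def write(self, addr, bit):
        old = self.cells[addr]
        self.cells[addr] = bit
        if self.kind and addr == self.j:
            trigger = (0, 1) if self.edge == "rising" else (1, 0)
            if (old, bit) == trigger:
                if self.kind == "inv":
                    self.cells[self.i] ^= 1
                else:
                    self.cells[self.i] = self.forced

    def read(self, addr):
        return self.cells[addr]


def coupling_fault_present(mem, i, j, edge, victim_value):
    """Steps (i)-(iv): set j before the edge, load i, apply the edge to j, check i."""
    before, after = (0, 1) if edge == "rising" else (1, 0)
    mem.write(j, before)
    mem.write(i, victim_value)
    mem.write(j, after)
    return mem.read(i) != victim_value
```

The value loaded into cell i matters for idempotent faults: a fault that forces 1 is only visible when cell i holds 0 before the transition, which is why the Rising-1 and Falling-1 procedures write 0 to cell i.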
March Test: Bridging faults
Like coupling faults, not all bridging faults can be detected by March tests.
<0,0|0,0>, <0,1|0,0>, <1,0|0,0>, <1,1|1,1> are the four types of AND bridging faults
possible.
This implies that cells i, j which are involved in a bridging fault must be written with the four
combinations of values 00, 01, 10 and 11.
In a March test, no cell pair takes on all four combinations 00, 01, 10 and 11. So to test bridging
faults the following test pattern sequences are required.
AND bridging fault ANDbfi, j (involving cells i and j ):
(i) write 0 in cell i and 0 in cell j and read back the values (which must remain same),
(ii) write 0 in cell i and 1 in cell j and read back the values,
(iii) write 1 in cell i and 0 in cell j and read back the values, and
(iv) write 1 in cell i and 1 in cell j and read back the values.
It may be noted that the above four test pattern sequences are enough to test the OR bridging fault
also, because we write all possible combinations in the two cells (involved in the fault) and read back
to check if they retain their values.
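The four write-and-read-back patterns can be sketched as below. The functions `bridged_pair` and `failing_patterns` are hypothetical names; the model simply applies the AND or OR of the two written values to both cells.

```python
def bridged_pair(vi, vj, kind=None):
    """Values actually held by cells i and j after writing (vi, vj)."""
    if kind == "AND":
        return (vi & vj, vi & vj)
    if kind == "OR":
        return (vi | vj, vi | vj)
    return (vi, vj)


def failing_patterns(kind=None):
    """Write all four combinations to (i, j); list those that do not read back."""
    return [(vi, vj)
            for vi in (0, 1)
            for vj in (0, 1)
            if bridged_pair(vi, vj, kind) != (vi, vj)]
```

Under this model both the AND and the OR bridging fault are exposed by the patterns 01 and 10, while 00 and 11 read back unchanged, which is consistent with the same four patterns covering both fault types.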
March Test: Address decoder faults
A small variation of the March test can test all four address decoder faults. The test
sequence (of the modified March test) that tests all four address decoder faults is
as follows
•In increasing order of address of the memory cells, read the value of the
memory cells and write the complement value in the cell. If 1 is read at cell 0,
value of 0 is written to cell 0; following that the same procedure is followed for
cell 1 and so on for the entire memory.
•In decreasing order of address of the memory cells, read the cells (match
with expected value) and write complement value in the cell.
The basic principle is that as the memory writing and examination operation moves
through memory, any address decoder fault that causes unexpected accesses of memory
locations will cause those locations to be written to an unexpected value. As the test
proceeds, it will discover those locations and report a fault.
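The principle can be sketched with a simulated decoder in which a faulty address reaches the wrong cell. Everything here is an assumption made for illustration: the function name `decoder_march`, the `address_map` fault model (address -> cell actually accessed) and the choice of an all-0 initial memory so that the expected read values are known.

```python
def decoder_march(n, address_map=None):
    """Run the two read-and-complement passes; return addresses whose read mismatched."""
    amap = address_map or {}
    cells = [0] * n                     # assumed initial content: all 0

    def cell_of(addr):
        return amap.get(addr, addr)     # fault-free addresses map to themselves

    errors = []
    expected = 0
    for ascending in (True, False):
        order = range(n) if ascending else reversed(range(n))
        for a in order:
            if cells[cell_of(a)] != expected:
                errors.append(a)
            cells[cell_of(a)] = 1 - expected    # write the complement value
        expected = 1 - expected
    return errors
```

With `address_map={2: 3}` (address 2 wrongly accesses cell 3), the increasing pass fails at address 3, which finds the value already complemented, and the decreasing pass fails at address 2 for the same reason.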
Memory Testing
Fault types
Permanent -- System is broken and stays broken
the same way indefinitely
Transient -- Fault temporarily affects the system
behavior, and then the system reverts to the good
machine -- time dependency, caused by
environmental condition
Intermittent -- Sometimes causes a failure,
sometimes does not
Failure Mechanisms
Permanent faults:
Missing/Added Electrical Connection
Broken Component (IC mask defect or silicon-
to-metal connection)
Burnt-out Chip Wire
Corroded connection between chip & package
Chip logic error (Pentium division bug)
Failure Mechanisms (Continued…)
Transient Faults:
Cosmic Ray
An α particle (ionized helium atom)
Air pollution (causes wire short/open)
Humidity (temporary short)
Temperature (temporary logic error)
Pressure (temporary wire open/short)
Vibration (temporary wire open)
Power Supply Fluctuation (logic error)
Electromagnetic Interference (coupling)
Static Electrical Discharge (change state)
Ground Loop (misinterpreted logic value)
Failure Mechanisms (Continued)
Intermittent Faults:
Loose Connections
Aging Components (changed logic delays)
Hazards and Races in critical timing paths (bad
design)
Resistor, Capacitor, Inductor variances (timing
faults)
Physical Irregularities (narrow wire -- high
resistance)
Electrical Noise (memory state changes)
Physical Failure-Mechanisms
Corrosion
Electromigration
Bonding Deterioration -- Au package wires interdiffuse
with Al chip pads
Ionic Contamination -- Na+ diffuses through package
and into FET gate oxide
Alloying -- Al migrates from metal layers into Si
substrate
Radiation and Cosmic Rays -- 8 MeV, collides with Si
lattice, generates electron-hole pairs, causes soft memory error
MATS+ march test algorithm
• MATS stands for Modified Algorithmic Test Sequence.
• MATS is the shortest march test for unlinked SAFs in the memory cell array and
read/write logic circuitry.
The fault is detected by march element M2 as it moves from the highest memory address downward and
expects to read a 1 in cell (2, 1), but instead gets a 0.
MATS+ detection of cell (2, 1) SA1 fault.
The fault is detected by march element M1 as it moves from the lowest memory address upward and
expects to read a 0 in cell (2, 1), but instead gets a 1.
MATS+ detection of cell (2, 1) multiple address decoder faults.
This is multiple fault type C, which is a combination of address decoder faults 2 and 4. Since
all writes to cell (2, 1) have no effect, and any read of cell (2, 1) produces a random result,
the defective cell will be detected either by march element M1 when it reads cell (2, 1) (if
the read returns a 1 when a 0 was expected), or by march element M2 when it reads cell (2,
1) (if the read returns a 0 when a 1 was expected.)
In Figure (e), march element M1 writes a 1 to cell (2, 1), but that instead has the effect of
writing cell (3, 1). This is detected when element M1 operates on cell (3, 1), because it first
reads a 0 (but gets an unexpected 1), and then it writes a 1 to the cell. If, instead, the
address of cell (3, 1) mapped into an access of cell (2, 1), then march element M2 would
detect this error as it descended from highest to lowest addresses in memory. It would
expect to read a 1 from cell (2, 1), but would get a 0 instead.
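The detection behaviour described above can be sketched in code, assuming the usual MATS+ element sequence {M0: (w0) in any order; M1: ascending (r0, w1); M2: descending (r1, w0)} and a linear address in place of the (row, column) cell coordinates. The function name `mats_plus` and the `stuck_at` fault map are hypothetical.

```python
def mats_plus(n, stuck_at=None):
    """Return (element, address) of the first failing read, or None if the memory passes."""
    stuck = stuck_at or {}
    cells = [stuck.get(a, 0) for a in range(n)]

    def write(addr, bit):
        if addr not in stuck:           # a stuck cell ignores writes
            cells[addr] = bit

    for a in range(n):                  # M0: write 0 (order is irrelevant)
        write(a, 0)
    for a in range(n):                  # M1: ascending, read 0 then write 1
        if cells[a] != 0:
            return ("M1", a)
        write(a, 1)
    for a in reversed(range(n)):        # M2: descending, read 1 then write 0
        if cells[a] != 1:
            return ("M2", a)
        write(a, 0)
    return None
```

In this sketch a SA1 cell is caught by M1 (a 0 is expected but a 1 is read) and a SA0 cell by M2 (a 1 is expected but a 0 is read), matching the two cases described for cell (2, 1).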
March tests detecting CFsts
• The march tests are appropriate for SRAM testing. However, for DRAM
testing, a neighborhood pattern sensitive fault (NPSF) testing model is
more appropriate, since it provides better DRAM fault coverage.
• Since the operation count is much longer for NPSF tests than for march
tests, the benefit of BIST is greater when the NPSF tests are
implemented on-chip.
• However, no NPSF test can detect address decoder faults, whereas all
march tests can. Therefore, an appropriate scheme would be to put
both test algorithms in the BIST hardware.
Address decoder faults
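From the context of memory testing four types of faults are considered in the address decoder
(for both reading and writing):
1. No cell is accessed for a certain address.
2. No address can access a certain cell.
3. With a certain address, multiple cells are accessed simultaneously.
4. A certain cell can be accessed with multiple addresses.
Because there are as many cells as addresses, none of the above faults can stand alone. When fault 1
occurs, either fault 2 or 3 must also occur. With fault 2, at least fault 1 or 4 must occur; with fault 3,
at least fault 1 or 4; with fault 4, at least fault 2 or 3. These four fault combinations are shown in the figure.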
Functional Model
Simplified Functional Model
Fig. shows a simplified functional model,
consisting of an address decoder, a memory
cell array, and read/write logic.
The advantage of functional models is that
they have enough detail of data paths and
adjacent wiring runs in the memory to
adequately model the coupling faults, which
must be tested.
Other Models:
• Logic gate model: not used
• Electrical model: allows detailed fault
localization but is very costly
• Geometrical model: needs knowledge of
the chip layout and inductive fault analysis;
very costly and time-consuming
Basics of memory BIST
• For a March test, an address generator (increasing and decreasing
order) and a data reader cum writer are required.
• So, BIST for a March test will simply be an LFSR and a data reader
cum writer. As in the case of logic BIST, the LFSR should have a
primitive polynomial (so that it generates all numbers from 1 to
2^n - 1), and along with this the LFSR for memory BIST should have the
following features
[Figure: Up/Down LFSR built from D flip-flops X0, X1 and X2 with multiplexers, controlled by an Up/Down (L-2-R/R-2-L) select signal]
Up / Down LFSR Pattern Sequences
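An up/down pattern sequence can be sketched in software with a 3-bit LFSR. This is an illustrative model, not the circuit of the figure: the feedback chosen here (XOR of bits 0 and 1, corresponding to the primitive polynomial x^3 + x + 1) and the names `step_up`, `step_down` and `sequence` are assumptions.

```python
N = 3
MASK = (1 << N) - 1


def step_up(s):
    """Forward step: shift right and feed back the XOR of bits 0 and 1."""
    feedback = (s ^ (s >> 1)) & 1
    return (s >> 1) | (feedback << (N - 1))


def step_down(s):
    """Reverse step: undo step_up by recovering the bit that was shifted out."""
    low = ((s >> (N - 1)) ^ s) & 1
    return ((s << 1) & MASK) | low


def sequence(step, seed=1):
    """Collect one full period (2^N - 1 states) starting from `seed`."""
    out, s = [], seed
    for _ in range(MASK):
        out.append(s)
        s = step(s)
    return out
```

Starting from 1, `sequence(step_up)` gives 1, 4, 2, 5, 6, 7, 3 and `sequence(step_down)` visits the same states in the opposite order, 1, 3, 7, 6, 5, 2, 4. The all-zero state is never reached by this plain LFSR, so an address generator built this way would need extra logic to cover address 0.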