
Lin 1993


PIPELINED TEST

A New Framework for Designing Built-in Test Multichip Modules With Pipelined Test Strategy

TING-TING Y. LIN
HUOY-YU LIOU
University of California at San Diego

MULTICHIP MODULES interconnect multiple bare dies by means of a stack of conductive and dielectric thin films. As the next generation of electronic packaging, MCMs offer tremendous advantages such as reduced time delays between chips, less electrical noise and cross talk, simplified power distribution, and small size. However, large I/O lead counts and high-density interconnections decrease testing throughput and increase testing cost.

Traditional hierarchical testing, which involves testing chips individually before assembly and then testing the assembled module to catch any errors introduced during packaging, relies on bed-of-nails fixtures and hand-held diagnostic probes. These methods become impractical and costly with new technologies such as MCMs and surface-mounted devices. Incomplete or unavailable test vectors from chip manufacturers and the internal module's low observability contribute to these problems. Built-in test, where the circuit or system under test includes a small circuit for testing, represents a new approach in testing. Examples of well-known BIT techniques include the scan design and built-in logic block observer methods. [1,2]

Scan-design methods involve disconnecting the memory elements and/or the flip-flops from the combinational logic. The overwhelming number of test outputs generated by a relatively large circuit makes the scan method cumbersome. Signature analysis, a popular data-compaction solution, uses a linear-feedback shift register to receive and modify output data. After a long sequence of test patterns, the residue (or signature) in the shift register of a faulty circuit differs from the signature of a good circuit. Therefore, combining the boundary-scan and built-in self-test techniques provides an alternate method for testing complex circuits at the board level [1] more efficiently.

BILBO combines the basic features of scan design with those of signature analysis. [2] The shift registers form feedback paths by XORing some outputs from the flip-flops and connecting them back to some of the flip-flop inputs. A given width of BILBO that implements a primitive polynomial has the corresponding XOR patterns. One combination of the control signals configures BILBO into a multiple-input shift register for compacting circuit responses when the shift registers contain extra control ports. Today's computer designs contain internal VLSI bus paths with widths of 16 or even 32 bits, while the BILBO design retains a bandwidth of 8 bits. Therefore, we need to redesign the BILBO to accommodate

38    0740-7475/93/1200-0038$03.00 © 1993 IEEE    IEEE DESIGN & TEST OF COMPUTERS


the new wider bus paths. Bhavsar proposed a family of concatenating polydividers with primitive characteristic polynomials to resolve the unextendable BILBO problem for packaged chips. [3]

To minimize hardware overhead and design time while maintaining certain state and fault coverages, we recommend a bytewise cascadable built-in tester macro cell with an optimum primitive characteristic polynomial. Keeping the CBIT cell in a design library allows circuit and system designers to easily construct the necessary feedback path for their BIST circuitry. Previous work on circular self-testing paths (CSTP) [4] accomplished cascadability, however, by simply connecting registers in a circuit to form a closed loop with a feedback polynomial of $x^k + 1$. With a nonprimitive feedback characteristic polynomial, the CSTP approach is a special application of the CBIT. The performance of CSTP suffers as a result of its nonprimitive polynomial, and it requires a sufficiently long testing time before the aliasing probability approximates the asymptotic value $2^{-N}$, where N is the input width of the circuit under test. [5]

To further improve testing time, we propose a novel approach based on CBITs that allows concurrent MCM testing. We refer to this strategy as the loop testing architecture. LTA uses CBITs in a pipe interwoven with high-I/O-count chips on MCMs. Simulation results show that this guarantees high test coverage with the use of maximum-length pseudorandom sequences for test pattern generation. The aliasing probability compares favorably to that provided by a twofold multi-input linear-feedback shift register with only a fraction of the area necessary.

MCMs require significant exhaustive testing. The original test vectors for chips with high I/O counts from different manufacturers may not be available for the functional testing of the assembled module. [6] In this case, parallel-pipelined exhaustive testing using LTA becomes imperative for the MCM designers to achieve better fault coverage more efficiently than boundary scan. Arrays of CBITs provided to the MCM, either in the form of a small chip on the same substrate or off the MCM test circuitry, allow chips without BIST circuitry to use LTA. Chips with existing on-chip BIST structures easily support LTA.

Cascadable built-in testing structure

The goal of our CBIT design is to provide a macro cell in the design library that expedites the BIT design process. We cascade CBIT cells to form a CBIT suite, using multiplexers and XORs placed in strategic locations to construct different feedback paths. This generates primitive polynomials in a multiple-byte configuration. A CBIT suite, with feedback connections that represent a primitive polynomial, acts as a maximum-length PRS generator. [7] Not only does CBIT perform test pattern generation and signature analysis, it also permits cascadability to generate a maximal-length PRS. In performing signature analysis, a primitive characteristic polynomial gives a quicker convergence to the smaller asymptotic aliasing probability for a given test length. [8]

The CBIT design. A modified 8-bit BILBO forms a CBIT cell. It consists of three control signals [9] ($C_x$, $C_y$, and $C_z$), eight parallel inputs (Dbus), eight parallel outputs (Qbus), an LFSR consisting of eight flip-flops, and XORs that provide the feedback path of the LFSR. Two serial data ports, Scan-in and Scan-out, form the scan path. Finally, Feedback-in and Feedback-out provide the cascading links among CBITs. Figure 1a shows the 8-bit CBIT cell, while Figure 1b represents a 16-bit CBIT suite configured from two CBIT cells. [9]

Use of the feedback pattern and the generating polynomial guarantees maximum-length PRS generation in both the 8-bit and 16-bit cases. Notice that in the 16-bit case, the feedback path for the least significant CBIT suite differs from the path of the most significant CBIT suite (Figure 1b). To guarantee the maximum randomness and quick convergence to the asymptotic value of the aliasing probability, the generating polynomial for the 16-bit CBIT must be prime. [8] In general, cascading CBITs makes extended-length MISRs fit the increasing size of the data buses without redesigning the detail of the BIST circuitry. This speeds up the design modification cycle by making the original designs more testable.

Operation modes provided by CBIT. CBITs provide three modes of operation (Figure 2): parallel-register, scan path, and MISR. During normal operation, the parallel-register mode remains active, and CBITs form pipelined parallel registers in the data path. During initialization and signature readout, the scan-path mode becomes active. The Scan-in port shifts in nonzero seeds and the Scan-out port reads out signatures. In addition, a scan path can form through the pipe to read out signatures

DECEMBER 1993 39

in the intermediate stages. Configuration of the CBITs for testing occurs in the MISR mode, which concurrently performs pseudoexhaustive test pattern generation for the succeeding CUT and output signature analysis for the previous CUT. The combinations of the control signals $C_x$, $C_y$, and $C_z$ provide three major operations, as summarized in Table 1. As shown in the last three rows, the combinations of $C_y$ and $C_z$ enable CBIT cascading.

Table 1. Control signals and the corresponding CBIT operation mode settings.

Control signals (Cx Cy Cz)    Configuration
(1 1 1)                       Parallel register mode
(0 1 -)                       Scan path mode
(1 - -)                       MISR mode
(1 0 1)                         Most significant byte for cascading
(1 1 0)                         Least significant byte for cascading
(1 0 0)                         Single-byte MISR

Figure 1. An 8-bit CBIT/MISR with generating polynomial = 2 + xd + 2 + J (a); 16-bit cascaded CBIT/MISR with generating polynomial = x f 6 + x f 4+ x f 3 +x9 + xd + 2 + I (b).
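The three operation modes can be illustrated with a small behavioral model. The sketch below is only an assumption-laden toy, not the gate-level CBIT: the class `CbitCell`, the `Mode` enum, and the helper functions are hypothetical names, the feedback is modeled in Galois style, and the polynomial $x^8 + x^4 + x^3 + x^2 + 1$ is a commonly used primitive polynomial chosen for illustration rather than the one in Figure 1.

```python
from enum import Enum

WIDTH = 8
MASK = (1 << WIDTH) - 1
# x^8 + x^4 + x^3 + x^2 + 1: a well-known primitive polynomial, chosen here
# for illustration only; it is NOT taken from Figure 1.
POLY = 0x11D


class Mode(Enum):
    PARALLEL = "parallel-register"
    SCAN = "scan-path"
    MISR = "MISR"


class CbitCell:
    """Behavioral toy model of one 8-bit cell (hypothetical, not gate-level)."""

    def __init__(self, seed=0):
        self.state = seed & MASK

    def clock(self, mode, d=0, scan_in=0):
        """Apply one clock edge; return the Scan-out bit in scan mode, else None."""
        if mode is Mode.PARALLEL:
            self.state = d & MASK          # plain pipeline register: Q <= D
            return None
        if mode is Mode.SCAN:
            scan_out = (self.state >> (WIDTH - 1)) & 1
            self.state = ((self.state << 1) | (scan_in & 1)) & MASK
            return scan_out
        # MISR mode: shift with polynomial feedback, then XOR in the D bus.
        shifted = self.state << 1
        if shifted & (1 << WIDTH):
            shifted ^= POLY
        self.state = (shifted ^ d) & MASK
        return None


def scan_load(cell, value):
    """Shift an 8-bit seed in through Scan-in, most significant bit first."""
    for i in reversed(range(WIDTH)):
        cell.clock(Mode.SCAN, scan_in=(value >> i) & 1)


def period(seed=1):
    """Number of clocks for an input-free MISR to return to its seed."""
    cell = CbitCell(seed)
    steps = 0
    while True:
        cell.clock(Mode.MISR, d=0)
        steps += 1
        if cell.state == seed:
            return steps


def signature(words, seed=1):
    """Compact a stream of 8-bit response words into one signature."""
    cell = CbitCell(seed)
    for w in words:
        cell.clock(Mode.MISR, d=w)
    return cell.state
```

With all-zero inputs and a nonzero seed, this model cycles through every nonzero state before repeating, which is the maximum-length behavior the text attributes to a primitive polynomial. A zero seed with zero inputs stays at zero, which is why the scan path loads nonzero seeds.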

Figure 2. Parallel-in/parallel-out register/buffer mode (a); scan-in/scan-out shift register mode (b); parallel-in/parallel-out LFSR mode (c).

Pipelining for self-testing

The horizontal extension of the CBITs accommodates large-I/O MCM testing. Further reduction in testing time results when several functional blocks in an MCM form a pipe so that the blocks are tested concurrently. (We use functional blocks to mean the CUTs in a SUT, and modules to mean CUTs/SUTs with BIT circuits.) Each pipe consists of one zero-stage CBIT suite and subsequent stages, each made up of a block and a CBIT set.

Clusters of functional blocks possessing similar numbers of inputs/outputs form a pipe. Pipes constructed according to their functionality and data width improve efficiency. We then construct CBIT suites to match the data width of each pipe. For CUTs with very limited outputs (for example, encoders), we can cluster a greater number of CUTs for the CBIT suites to analyze at each stage.
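The clustering idea above can be sketched in a few lines. This is a simplified, hypothetical heuristic rather than the authors' procedure: it assumes each CBIT cell covers one byte, sizes a suite by rounding the bus width up to whole cells, and groups blocks that need the same suite size into one candidate pipe. The block names and widths are invented for illustration.

```python
import math
from collections import defaultdict

CELL_WIDTH = 8  # one CBIT cell covers one byte


def cells_needed(bus_width):
    """CBIT cells required to cover a bus of the given bit width."""
    return max(1, math.ceil(bus_width / CELL_WIDTH))


def cluster_into_pipes(blocks):
    """Group blocks (name -> output bus width) by CBIT suite size."""
    pipes = defaultdict(list)
    for name, width in blocks.items():
        pipes[cells_needed(width)].append(name)
    return dict(pipes)


# Hypothetical functional blocks and their output widths in bits.
example = {"alu0": 8, "alu1": 8, "cache": 8, "encoder": 3, "multiplier": 16}
pipes = cluster_into_pipes(example)
for suite_cells, names in sorted(pipes.items()):
    print(f"{suite_cells}-cell CBIT suite: {names}")
```

A fuller version would also weigh functionality and balance pipe lengths, as the text goes on to describe; grouping by width alone is only the first cut.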
Alternatively, the partition/segmentation process discussed by Srinivasan et al. [10] constructs several shorter or narrower pipes. Figure 3a shows construction of pipes for a data path in the SUT, while Figure 3b shows the pipe construction for a control path (which usually has nonuniform I/O bit width or branched signal flows). The requirements on the state and fault coverage and the aliasing probability determine the proper length of any given pipe.

Our previous work [9] shows preliminary results. Following formation of the pipes, rearrangement of the number of stages in each pipe occurs so that most of the pipes finish self-testing simultaneously. Normally, existing data paths with pipelining form natural self-testing pipes. When the pipe becomes too long and requires decomposition, only the zero-stage CBIT suite adds to the second pipe. Creating all pipes under this guideline after the rearrangement phase gives the maximum parallelism.

For high fan-in CUTs, decomposing the original network into segments with fewer fan-ins [10] reduces computational effort. The controllability, detectability, and observability measures of a segmented circuit remain the same as for the unsegmented CUT. [11] Adopting algorithms proposed by Yeh et al. [12] allows grouping of segments into clusters. Often, these clusters identify natural pipes.

Figure 3. LTA pipe for CUTs with: homogeneous data width (primarily data paths) (a); heterogeneous data width (primarily control paths) (b).

[...] LTA. Because the cumulative test results degenerate over multiple stages, [...] MISR operation. Further cascading of [...] for the least significant CBIT. Daisy-chaining the Scan-in and Scan-out ports of the CBITs at each stage gives a scan path for the initialization and scans out the final signature of each CBIT. The last single CBIT suite of a pipe connects to the zero-stage CBIT. Thus, for each pipe, we have double-length CBIT suites for signature analysis, resulting in a smaller aliasing probability for the whole pipe. Figure 4 shows the equivalent data flow when the CBITs are paired to do a double-length signature analysis. The grayed functional blocks (F1, F2, and F4) show the paired testing flow when we cascade two CBITs [...] able aliasing probability.

Figure 4. Data path of the m-stage pipelined extended CBIT testing.

[...] polynomial of the LFSR/MISR shall not be 0 for any state of the PRS.

2. Multiple inputs (seeds) to the LFSR/MISR still traverse all the states of the PRS; we do not consider the degeneration or missing of some states in


the PRS because of special combinations or sequences of the seeds.

These assumptions are validated by Kim et al., [13] who discuss the existence of properties pertaining to the randomness of the patterns generated by a MISR that exist even if the inputs are not equally probable.

To justify the pipelined LTA approach, we must prove two things regarding the dual use of the intermediate CBITs. First, we must show that these CBITs perform effectively as TPGs with patterns generating maximum-length PRS at each stage. The pseudorandom property of the generating polynomial of the CBIT in the MISR mode supports this. (The percentage of the corresponding maximum-length PRS indicates the performance level.) Second, we must show that the limited output patterns of these functional blocks do not disturb the randomness of the signature, so that the aliasing probability remains acceptably small after a number of stages.

Properties of the pseudorandom test pattern generation from CBIT. The construction of CBITs by LFSRs with primitive and irreducible characteristic polynomials gives them the following major properties:

■ Every element (or state) $a$ in the PRS generated by the LFSR has a complementary element (or state) $\bar{a}$ in the same sequence such that $a + \bar{a} = 0$ (N-bit-wide 0s), where "+" represents the operation on the complementary elements of the PRS as defined by the characteristic polynomial of the LFSR.
■ For the cyclic PRS, more than one input seed will either decompose the original maximum-length cycle into several subcycles or merge at least two subcycles together.
■ The total number of (distinct) states of all the subcycles (if decomposed by multiple seeds) is $2^N - 1$ for an N-stage LFSR. In this manner, we exclude the 0 state from the PRS, and it forms a trivial cycle (0 → 0) for the LFSR.

Cascaded CBITs generate test patterns for different functional blocks in a pipe, giving rise to the following observations:

■ Each CUT with N-bit-wide input buses needs $2^N - 1$ different test patterns to finish the exhaustive self-testing.
■ To exhaustively test the paired CUTs (each with no more than N inputs), one pair of CUTs should generate $2^{2N} - 1$ test patterns.

For example, given an 8-input CUT, we need $2^8 - 1$ test patterns. Assuming that there is no correlation between the two neighboring CUTs for one pair of cascaded CBIT suites, we need one maximum-length cycle of the 8-bit-wide PRS to fully test one 8-bit CUT. However, because of the equally distributed ones in the 16-bit-wide PRS, [7] we should fully test two 8-bit CUTs before the extended PRS reaches its maximum-length period (which is $2^{16} - 1$). The actual number of test patterns needed to fully test m CUTs simultaneously using m cascaded CBITs depends on the characteristic polynomial of the extended CBITs and the input seeds. But in general, we have the following relation:

$$(2^N - 1) \le L \le (2^{mN} - 1) \qquad (1)$$

where L represents the test length needed to test m CUTs exhaustively given m cascaded CBIT suites. Therefore, CBITs perform effectively as TPGs when we choose an appropriate test length according to Equation 1.

Aliasing probability for single-stage MISRs. A special case of the generalized LFSR [5] occurs with CBITs in the MISR mode. A generalized m-stage, N-input signature analyzer with linear feedback patterns built over the Galois field GF($2^N$), GLFSR(m, N), becomes known as an N-input MISR when m = 1. Therefore, when the CBITs have characteristic polynomials designed to be prime, maximum-length test patterns result, and the asymptotic aliasing probability converges quickly.

Theorem 2 by Karpovsky et al. [14] provides a general formula for calculating the aliasing probability for a one-stage, N-bit-wide MISR:

where L is the test length and the remaining coefficients are the Walsh transforms of the error probabilities from an N-bit-output CUT. Error detection does not occur if $p_i = 0$, but always results when $p_i = 1$.

Some closed forms of $P_{al}$ exist with additional conditions. Two of these closed-form expressions for $P_{al}$ result from setting the bit-error transition probability p to 0.5:

■ When the test length L is $m(2^N - 1)$ (where $m \ge 1$ is an integer), the aliasing probability $P_{al}$ is given by

where $m \ge 1$ is an integer for the independent error model.
■ When the number of test patterns is an arbitrary positive integer L and the probability of an output bit being wrong is 0.5, $P_{al}$ is

for the $2^N$-ary symmetric channel error model.

Notice that for both cases when $2^N \gg 1$, $P_{al}$ converges to an asymptotic value $2^{-N}$. [When the test length L is less than one maximum length for the N-bit-wide MISR for the independent error model (for example, $2^N - 1$), we use Equation 2 to calculate the exact aliasing probability of the MISR.]
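The asymptotic value $2^{-N}$ can be checked empirically with a small Monte Carlo run. The sketch below is an illustration under stated assumptions, not the paper's simulation: it models a Galois-style MISR, draws every error word uniformly at random (a rough stand-in for the symmetric channel model with bit-error probability 0.5), and uses N = 4 with the primitive polynomial $x^4 + x + 1$ to keep the run short. Because a MISR is linear, the error stream alone, compacted from a zero seed, decides whether a faulty signature equals the good one. The function names are hypothetical.

```python
import random


def misr_signature(words, poly, width, seed=0):
    """Compact a word stream with a Galois-style MISR of the given width."""
    mask = (1 << width) - 1
    state = seed & mask
    for w in words:
        state <<= 1
        if state >> width:        # a bit was shifted out: apply feedback
            state ^= poly
        state = (state ^ w) & mask
    return state


def aliasing_fraction(width, poly, length, trials, rng):
    """Fraction of nonzero error streams that compact to the all-zero signature."""
    faulty = 0
    aliased = 0
    for _ in range(trials):
        errors = [rng.randrange(1 << width) for _ in range(length)]
        if not any(errors):
            continue              # an all-zero error stream is a fault-free run
        faulty += 1
        if misr_signature(errors, poly, width) == 0:
            aliased += 1
    return aliased / faulty


# x^4 + x + 1 is primitive; N = 4 keeps the run short.
estimate = aliasing_fraction(width=4, poly=0b10011, length=30,
                             trials=20000, rng=random.Random(1))
print(f"estimated {estimate:.4f} vs 2^-4 = {2 ** -4:.4f}")
```

Under this uniform error assumption the estimate should land close to $2^{-4}$; other error models, such as the independent model with short test lengths, would need the exact expressions referenced in the text.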

Figure 5. LTA capability for testing: module functionality (a); module functionality and interconnections (b).

Aliasing probability in the pipelining scheme. In the multistage pipelining MCM testing scheme, we calculate the aliasing probability for the kth stage as

$$
\begin{aligned}
P_k \ (\text{aliasing probability at kth stage})
&= 1 - (\text{nonaliasing probability over } k \text{ stages}) \\
&= 1 - (1 - P_{al})^k \\
&= 1 - \left[ 1 - k \times P_{al} + \frac{k \times (k-1)}{2!} P_{al}^2 - \cdots \right] \\
&= \sum_{i=1}^{k} (-1)^{i-1} \left[ \frac{k!}{i!\,(k-i)!} \right] P_{al}^{\,i}
\end{aligned}
$$
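As a numeric sanity check on this expansion, the closed form, the binomial sum, and the first-order term $k \times P_{al}$ can be compared directly. The sketch below assumes $P_{al} \approx 2^{-N}$ (the asymptotic single-stage value) with N = 16 and k = 6, mirroring the six-stage, 16-bit setting of the experiments; the function names are illustrative only.

```python
from math import comb


def pk_closed_form(p_al, k):
    """P_k = 1 - (1 - P_al)^k."""
    return 1.0 - (1.0 - p_al) ** k


def pk_binomial_sum(p_al, k):
    """P_k as the alternating binomial sum over i = 1..k."""
    return sum((-1) ** (i - 1) * comb(k, i) * p_al ** i
               for i in range(1, k + 1))


def pk_first_order(p_al, k):
    """Leading term of the sum: k * P_al."""
    return k * p_al


N, K = 16, 6                      # assumed values for illustration
p_al = 2.0 ** -N                  # assumed asymptotic single-stage value
closed = pk_closed_form(p_al, K)
series = pk_binomial_sum(p_al, K)
first = pk_first_order(p_al, K)
print(f"closed form {closed:.6e}, sum {series:.6e}, k*P_al {first:.6e}")
```

For small $P_{al}$ the higher-order terms are negligible, so the first-order term already tracks the closed form to within a tiny relative error.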

Let $2^N \gg 1$ (generally true for all CBIT suites). We can then approximate $P_{al}$ of the N-input MISRs as $2^{-N}$ for all the values of m or L in Equations 2 and 3. $P_k$ then simplifies to

$$P_k = \sum_{i=1}^{k} (-1)^{i-1} \left[ \frac{k!}{i!\,(k-i)!} \right] \left( \frac{1}{2^N} \right)^i \qquad (4)$$

By ignoring the contribution from the terms smaller than $2^{-N}$, the aliasing probability for the kth-stage pipelined CBIT converges to

$$P_k = k \times 2^{-N} \qquad (5)$$

Equation 5 gives the asymptotic value for both the symmetric channel error model with any test length and the independent error model with a test length of at least one maximum length. When the number of stages k is much smaller than the maximum length of the PRS generated by the MISR (for example, $2^N - 1$), the aliasing probability at the kth-stage MISR in the pipelining scheme is of the order of magnitude $O(2^{-N})$. The simulation result in Lin and Ka~eff, where the aliasing frequency and probability both stay constant at $O(2^{-16})$ over six stages in the pipelining path, also validates this result.

Thus, the randomness of the signature is preserved in the case of a limited number of multiple inputs. Also, the aliasing probability is sufficiently small given that the number of stages k is small compared with the maximum length of the PRS.

Other LTA applications

LTA has several additional applications. It can test interconnections, and it is effective in locating faults in intermediate CBITs.

Capability for testing interconnections. We view interconnections among the MCMs as simplified CUTs with compatible data paths, and the whole system as many CUTs (including the interconnections) requiring testing under the LTA scheme. The pipelining scheme tests interconnections among MCMs by integrating two sets of CBITs next to the I/O pins in each module. The first CBIT set operates in the MISR mode, validates the input before the signals reach the internal logic, and serves as the TPG for the internal logic blocks. The second set of CBITs also operates in the MISR mode, examining the output from the internal logic circuitry, and serves as the TPG for the interconnection to the next module.

Figure 5a shows one CBIT suite placed at the primary outputs of each CUT, with the zero-stage CBIT suite added to generate the pseudorandom test pattern for the first CUT. Cascading neighboring CBITs (as shown in Figure 3a) tests the modular functionality of each CUT. However, this implementation cannot test the interconnections between the CUTs. Figure 5b shows an extra CBIT suite inserted near the primary inputs of each CUT. Therefore, we always have a CBIT suite testing either a functional block or an interconnection pattern between two CUTs.

This implementation provides a general approach that can test fault patterns in all permutations, including multiple stuck-at faults, bridging or coupling, and pattern-sensitive faults. The N-bit-wide interconnection network often realizes fewer than $2^N - 1$ different patterns for


implementing signal links between any two CUTs. However, our N-bit-wide CBIT suite can generate $2^N - 1$ different test patterns to exercise the N-bit-wide interconnection exhaustively.

In scheduling the testing for the interconnections, extra modes become unnecessary. In addition, timing conflicts do not exist because we use two sets of CBITs near the I/O pins that transform the interconnections into another type of CUT directly. This pipelining scheme allows for concurrent testing of both the modules and interconnections. Adapting LTA for interconnection testing by inserting one extra CBIT suite near the input ports (making the interconnections observable) results in area overhead. In contrast, pipeline testing for the modules requires only one CBIT suite at the outputs of each module. Testing the interconnections and the module functionality concurrently requires an extra CBIT suite but saves separate modes for reconfiguring the SUT to test the interconnections. Therefore, simultaneous testing yields significant time savings with only a small area penalty. Furthermore, the placement of CBIT sets holds when we move the I/O ports to the center of the modules.

Fault location in intermediate CBITs. CUT signatures are read out when we configure CBITs in the scan-path mode. Generally, a wrong final signature indicates that faults exist in the test pipe. However, there exists the possibility of faults from different CUTs in a pipe canceling each other and producing a good signature at the last stage. Therefore, we need to know the exact test length applied to each CBIT suite. This results in observable signatures at the intermediate stages and facilitates fault location. By locating faults in specific CUTs, we achieve better fault diagnosis.

Examples

We developed two experiments to demonstrate the effectiveness of the proposed LTA. The first involves testing a homogeneous processor environment consisting of SN74LS181 ALUs. The second encompasses a heterogeneous MCM system with several types of components. We transformed both of these systems into test pipes.

Six-stage ALU pipes

Six ALUs (SN74LS181) form a pipe with 16-bit CBIT suites inserted between the ALUs. Each 16-bit CBIT suite acts as a TPG for the 14-bit-input ALU. The 8-bit output of the ALU feeds into a CBIT suite configured for signature analysis. In this experiment, we develop two pipes based on LTA: one implements a primitive characteristic polynomial between the looped CBIT pairs, while the other directly connects the feedback lines without changing the feedback pattern of each CBIT suite. We also reconstruct the straight pipe from our previous experiments to provide a baseline comparison.

Randomness of the TPG. We measure the randomness of the TPG process at each stage of the three pipes for various test lengths. This allows us to evaluate the effectiveness of the CBITs as TPGs while operating in the MISR mode, as well as the impact of pipe length on the test pattern generation. For an N-bit-wide CBIT suite, the randomness measure is 100% if $2^N$ test patterns are generated. Figure 6 shows the randomness measure for each stage of the three different pipes. In all three configurations, the randomness levels off after the first stage, indicating that the length of the pipe does not affect the random TPG process.

Our previous experiments showed that the required test length L for the two N-bit CUTs under LTA testing (using Equation 1) is smaller than $2^{2N} - 1$. The simulation result in Figure 6 validates this observation by showing that we can exhaustively test all ALUs in each cascading stage of the pipe when L is about four times the maximum length of the N-bit-wide CBIT suite. That is, instead of $2^{32} - 1$ (or even $2^{28} - 1$) test patterns for the two ALUs, we require only $4 \times 2^{16}$ test patterns to give 100% randomness for the two ALUs at each cascading stage.

The cascaded CBIT suite that implements LTA outperforms the straight pipe by producing the best random patterns. As we increase the test length, the CBIT suites eventually generate a 16-bit-wide maximum-length PRS. This validates our earlier assumption that multiple inputs to the PRS generators still produce the maximum-length PRS. [7] Regardless of how the 8-bit-wide outputs are connected to the CBIT suite (at the higher, lower, or even the middle byte), after a sufficiently long test length (four times the maximum length), 100% randomness of the 16-bit-wide PRS is still possible.

Aliasing probability of signature analysis. We introduce faults by requiring a single stuck-at4 fault at the first-stage ALU in each pipe. After a specified number of test patterns, we compare the ALU signatures with known good signatures. Aliasing occurs when a faulty pipe produces the same signature as a fault-free pipe.

We compare the aliasing probability at the last stage of the three pipes, which have a stuck-at4 fault at the least significant bit of the first-stage ALU output. As shown in Figure 7, the aliasing probability of a 16-bit CBIT suite approaches $O(2^{-16})$ as the test length increases in all three pipes. The straight pipe tends to alias early, at shorter test lengths. In contrast, a CBIT suite implementing LTA does not exhibit aliasing effects until a sufficiently long test time has elapsed. In this case, aliasing occurs at the sixth stage after 100 tests in the primitive LTA pipe. With test lengths smaller than the length of one maximum-length PRS, the aliasing probability becomes more pronounced. Figure 7b shows that aliasing occurs at the later stages before it shows up in the earlier stages. For the first stage, aliasing occurs after more than 1,000 tests. However, the sixth stage shows aliasing after fewer than 10 tests. This implies that a warm-up period reduces the aliasing probability at each stage for small test lengths of the pipelined LTA.
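The randomness measure used in these experiments can be approximated in a few lines. The sketch below assumes the measure is the percentage of the $2^N$ possible patterns that appear at least once during the test; that normalization is an assumption read from the text, and the exact definition used for Figure 6 may differ. The MISR model and the polynomial $x^8 + x^4 + x^3 + x^2 + 1$ are illustrative choices, not the CBIT design.

```python
def misr_states(inputs, poly=0x11D, width=8, seed=1):
    """Yield the MISR state after each input word (Galois-style feedback)."""
    mask = (1 << width) - 1
    state = seed & mask
    for w in inputs:
        state <<= 1
        if state >> width:
            state ^= poly
        state = (state ^ w) & mask
        yield state


def randomness_measure(patterns, width):
    """Percentage of the 2^width possible patterns that appear at least once."""
    return 100.0 * len(set(patterns)) / (1 << width)


# Free-running case: all-zero inputs for one full cycle of an 8-bit register.
free_running = list(misr_states([0] * 255))
print(f"{randomness_measure(free_running, 8):.2f}% of 8-bit patterns seen")
```

A free-running register seeded with a nonzero value never visits the all-zero state, so under this definition it tops out just below 100%; with CUT responses feeding the D inputs, the zero state becomes reachable as well.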


In general, CBIT suites implementing LTA exhibit smaller aliasing probabilities than straight-pipe CBIT suites. When we use only one CBIT suite in the last stage for comparison, all aliasing probabilities stay at $O(2^{-16})$ (see Figure 7c). If we read the contents of the two CBIT suites as the complete signature, the aliasing probabilities for both pipes implementing LTA become $O(2^{-32})$ in our single-stuck-fault simulation, a negligible value compared to $O(2^{-16})$. This results from the CBIT suite's extended width of 2N (2 × 16 in this case). Note that in Figures 7a and 7c the cascaded CBITs with nonprimitive characteristic polynomials give the same asymptotic value of the aliasing probability as that of the primitive feedback polynomials discussed by Damiani et al. [8]

Area overhead and testing time. The area overhead for implementing LTA consists of the extra wiring required to cascade the CBIT suites with additional XORs (to implement the primitive generating polynomial). As mentioned earlier, constructing the scan path with cascaded CBITs eliminates the need for extra circuitry. The testing time for the LTA pipe is the same as that of the straight pipe. However, with some additional wiring and extended signatures, LTA pipes provide extra observability at each stage in the pipe and a much lower aliasing probability.

Figure 6. Randomness measure of the CBITs as TPGs over six stages in the ALU pipes (ML = $2^{16}$): straight pipe (a); cascaded CBITs with nonprimitive polynomial (b); cascaded CBITs with primitive polynomial (c).

Pipes with ALU, caches, and RAM (P-pipe)

Our second experiment involves an MCM (consisting of one SN74LS181 ALU, one 8-bit RAM, and two 16×8 data caches) placed in a four-stage testing pipe. A 16-bit CBIT suite placed at the inputs of the ALU acts as the TPG. We insert four 8-bit CBIT cells between the CUTs. Demultiplexing the test patterns from the 8-bit CBIT connected to the inputs of the RAM tests both Address and Data inputs exhaustively. We refer to this as the "straight P-pipe." [15] Adding an extra connection between neighboring CBITs so that paired 8-bit-wide CBITs can perform 16-bit signature analysis for two CUTs simultaneously (Figure 3b) results in the "cascaded P-pipe."
Randomness of the TPG. Figure 8 shows the randomness measure of each stage with different test lengths for the straight P-pipe. In Figure 8a, later-stage CBITs reach 100% randomness when the input test length becomes greater than $2^8$ for the 8-bit analysis. In Figure 8b, the zero-stage CBIT gives 100% randomness for the 14-bit-wide input bus to the ALU after the input test length becomes greater than four times the maximum length. This finding is consistent with the previous experiment, in which warming up the zero-stage CBIT improves the quality of the TPG. Figure 9 shows the behavior of the cascaded P-pipe, which is similar to that of the straight P-pipe. This reaffirms our earlier assumption that the maximum-length PRS is generated once the test length is at least four times the maximum length for a given data width of the CBIT suites.

In Figures 8a and 9a, LTA requires just four times the maximum-length PRS generated by one CBIT suite to test neighboring CUTs ($4 \times 2^8$), as opposed to the length it requires from a double-width CBIT suite ($2^{16}$). The cascaded CBITs show a quicker rate of convergence than a straight pipe. Again, this is similar to the results of the previous ALU pipes.
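The size of this saving is simple arithmetic. The sketch below compares the two test lengths quoted in the text, $4 \times 2^N$ for a pair of CUTs under LTA against $2^{2N}$ for a double-width suite, for N = 8 and N = 16; the factor of four is the empirical value reported above, and the function names are illustrative.

```python
def lta_test_length(n_bits, factor=4):
    """Test length reported for LTA: a small multiple of one N-bit cycle."""
    return factor * (1 << n_bits)


def double_width_test_length(n_bits):
    """Test length of one full cycle of a 2N-bit-wide suite (about 2^(2N))."""
    return 1 << (2 * n_bits)


for n in (8, 16):
    lta = lta_test_length(n)
    wide = double_width_test_length(n)
    print(f"N={n}: LTA {lta} patterns, double-width {wide} patterns, "
          f"ratio {wide // lta}x")
```

The ratio grows as $2^N / 4$, so the advantage widens quickly with bus width.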

Figure 7. Aliasing frequency in ALU pipes: at the sixth stage for the ALU pipes (a); cascaded CBITs with primitive polynomial (b); test length = 4 M I over six stages (c).

Figure 8. Randomness measure of the CBITs as TPGs over four stages for the straight P-pipe: test lengths are multiples of 256 ($l = m \times 2^8$) (a); test lengths are multiples of 16,384 ($l = m \times 2^{14}$) (b).

Aliasing probability of signature analysis. We inject the single stuck-at-1 fault to the least significant bit of the ALU's output. We calculate the aliasing probability by comparing signatures from the faulty pipe with those from a fault-free pipe. Figure 10a shows the aliasing frequencies per injected fault of the final signatures in the last stage CBITs of the two P-pipes. Reading the signatures bytewise from each CBIT cell allows us to calculate the aliasing frequency. We also analyze complete signatures for the extended CBIT pairs. All aliasing frequencies per injected fault of the 8-bit-wide signatures converge to the asymptotic value ($2^{-8}$ for the 8-bit case). However, with a test length smaller than one cycle of the maximum-length PRS ($2^8$), the cascaded P-pipe gives a smaller bytewise aliasing frequency than the straight P-pipe. Aliasing does not occur for the extended signatures until the test length reaches one maximum length for the 16-bit signature analysis ($2^{16}$ or 65,536).

Figure 10b shows the aliasing frequencies for the intermediate stages in the P-pipe with cascaded CBITs. The last stage still gives the worst analysis result of the ALU pipes. Once again, warming up the P-pipes seems to improve the test quality. For comparison, we show the aliasing from extended signature analysis [$O(2^{-16}) \approx 1.52 \times 10^{-5}$ for the complete 16-bit-wide CBIT suite].
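The asymptotic aliasing value of roughly $2^{-k}$ for a $k$-bit signature can be checked exhaustively on a toy case. The sketch below is an illustration under stated assumptions, not the article's experiment: it models a generic 4-bit MISR with feedback polynomial $x^4 + x + 1$ and enumerates every nonzero error sequence of three words. Because a MISR is linear, the faulty signature equals the fault-free signature XORed with the signature of the error sequence alone, so aliasing occurs exactly when that error signature is zero. The function names are hypothetical.

```python
from itertools import product


def misr_step(state, data, taps, width):
    """One MISR clock: LFSR shift with feedback, then XOR the parallel input."""
    msb = (state >> (width - 1)) & 1
    state = (state << 1) & ((1 << width) - 1)
    if msb:
        state ^= taps
    return state ^ data


def signature(words, taps, width):
    """Compress a sequence of words into a signature, starting from state 0."""
    state = 0
    for word in words:
        state = misr_step(state, word, taps, width)
    return state


def aliasing_counts(width, length, taps):
    """Return (aliased, total) over all nonzero error sequences of given length."""
    aliased = 0
    total = 0
    for errors in product(range(1 << width), repeat=length):
        if not any(errors):
            continue  # the all-zero sequence is the fault-free case
        total += 1
        if signature(errors, taps, width) == 0:
            aliased += 1
    return aliased, total


if __name__ == "__main__":
    aliased, total = aliasing_counts(width=4, length=3, taps=0x3)
    print(aliased, total, aliased / total, 2 ** -4)
    print("16-bit asymptote:", 2 ** -16)
```

This counts all error sequences as equally likely, which is a simplification. The article instead injects one specific stuck-at fault and measures the aliasing frequency over test lengths, so its curves approach the asymptote only gradually.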
Figure 9. Randomness measure of the CBITs as TPGs over four stages for the cascaded P-pipe: test lengths are multiples of 256 ($l = m \times 2^8$) (a); test lengths are multiples of 16,384 ($l = m \times 2^{14}$) (b).

In these experiments, we analyze the randomness of the TPG and the aliasing problem for multiple-stage parallel signature analyzers. As the input test length traverses the entire cycle of the maximum-length PRS provided by the MISRs, it visits all states in the PRS at least once. Thus we have 100% randomness of the maximum-length PRS during the TPG process. This also applies to the downstream stages in one pipe. For a well-partitioned pipelined testing path, the aliasing probability is of the same order as that for the serial signature analyzer.

Area overhead and testing time. To implement LTA in the cascaded P-pipe from the straight P-pipe, the only extra wiring needed is that which connects the CBIT suites. To have a primitive polynomial for the extended CBIT suite, we provide spare XOR gates for the most and least significant suite configurations. Again, we require only two modes for initializing the CBITs and MISR mode analysis.

Figure 10. Aliasing frequency: fourth stage for two P-pipes (a); cascaded CBITs with primitive polynomial in the P-pipe (b).

Performance comparison with other approaches

Two aspects make LTA with CBITs comparable with other testing approaches: testing time and area overhead. These approaches include the IEEE 1149.1 boundary scan standard (JTAG) and a pipelined BIST with conflict scheduling. We calculate testing time by adding the set-up time ($T_{\text{setup}}$), the module testing time ($T_{\text{module}}$), and the readout time ($T_{\text{readout}}$). Note that we normalize these times to the average testing time per module. Adding 40 to 80% for wiring to the required hardware components provides an estimate of the area overhead.

LTA does not require different CBIT cells in the design library to test different widths of the data paths. For a wider data bus, we can cascade the CBITs to get an extended PSA without downgrading the quality of the signature analysis. LTA thus eliminates the hardware penalty required by different sizes of BILBOs. (The only hardware overhead needed by LTA is the zero-stage CBIT for a new pipe since the wiring for the cascaded case is negligible.) According to previous performance analyses of CBITs versus the boundary-scan method,15 CBIT takes less than 10% of the testing time and requires less than twice the area of boundary-scan designs. In both cases, the fault coverage is 100%. In our examples, we saw that in the six stages of the 16-bit pipelined CBIT suites, the aliasing frequency and probability stay as low as $O(2^{-16})$ for a sufficiently long test length. Therefore, with a limited area penalty and an order of magnitude improvement of the total testing time, LTA drastically reduces the cost of MCM testing in today's competitive market.

Comparing LTA to boundary scan. The boundary-scan approach needs two separate modes and carefully selected test patterns to test the processor for the interconnection failure.1 When the bit width of the communicating data path increases, the boundary scan requires more complicated test patterns and test cycles.

The original test vectors may not be



available for boundary-scan testing in MCMs using automatic test pattern generation.6 Therefore, an N-input CUT requires $O(C \times 2^N)$ test patterns for pseudorandom testing. $C \ge 1$ is a constant given by a statistical estimation on a specific test pattern generation technique.16 The optimized value for $C$ is 1. However, to have confidence that the most difficult to detect faults are covered, a larger $C$ is required.16 The total time needed for one N-input CUT under boundary scan testing is given by

where $t_{\text{setup}}$, $t_{\text{module}}$, and $t_{\text{readout}}$ are the scan-in, one execution, and scan-out time for one CUT. However, LTA gives

where $4 \times 2^N$ is the maximum given by Equation 1 when $m = 2$ and $k$ is the total number of stages of a pipe implementing LTA. Thus, LTA saves time for scan-in, scan-out, and pseudoexhaustive testing per module/CUT, especially when $4 \ll k \ll 2^N$. For example, if $t_{\text{module}} = t_0$ clock cycles, $t_{\text{setup}} = 16$ clock cycles, and $t_{\text{readout}} = 16$ clock cycles for a 16-bit input/16-bit output CUT, then the boundary scan will need $65{,}536\,C\,(32 + t_0)$ clock cycles to finish the pseudoexhaustive testing. LTA requires $32 + 32{,}768\,t_0$ clock cycles for eight stages.

LTA does not require extra testing time in a separate mode for interconnection testing. (In contrast, boundary scan does.) So, the total testing time per module and its interconnection remains

since LTA can test both the interconnection and the processor logic at the same time. Here, $t_{\text{interconnect}}$ represents the time for signals to transfer through the interconnection network. Its value is negligible compared to the other $t$ terms in the formula. However, if the interconnections have a long propagation delay, we cannot ignore $t_{\text{interconnect}}$. For boundary scan, the total testing time per CUT with its interconnections is given by

$(C \times 2^N)\,[t_{\text{setup-all}} + t_{\text{module}} + (t_{\text{interconnect}} \times O) + t_{\text{readout-all}}]$

where $t_{\text{setup-all}}$ and $t_{\text{readout-all}}$ represent the sums of scan-in time and scan-out time, respectively. The time savings over boundary scan again indicates the efficiency of LTA during interconnection testing.

Considering the area overhead, both LTA and boundary scan need five extra electrical pads for one processing element. However, the control logic of boundary scan requires four I/O pins (test-mode select, test-reset, scan-in, and scan-out).

Due to the XOR gates required to implement the extended generating polynomial, LTA consumes more area than boundary scan does. Our previous experiment15 in testing the MCM shows that LTA takes 4,240 transistors in the four-stage straight P-pipe case and boundary scan takes 3,040 transistors in total. Therefore, in this example LTA consumes about 39% more area than the boundary scan. However, with limited area overhead LTA provides a superior BIST implementation with improved state coverage and exponentially lower aliasing probability. Furthermore, LTA provides the extensibility for the PSAs in terms of the CBIT implementation and also the pipelining for several CUTs to be tested concurrently.

Comparing LTA to pipelined BIST with conflict scheduling. Other pipelined BIST approaches in Abadir and Breuer17 alternate modes of TPG and PSA in one LFSR circuit. Krasniewski and Albicki's implementation18 gives one-stage analysis for all the CUTs in a pipelined data path by centralizing the TPG and distributing the PSA (for example, using one set of BILBOs as TPGs for all pipes with a one-stage BILBO-CUTs-BILBO structure and separating outputs to several PSAs). Establishing conflict tables makes sure that the LFSRs perform TPG and PSA separately in different testing schedules. We call this approach pipelined BIST with conflict scheduling. The total averaged testing time for one N-input CUT is given by

Here $2^N$ represents the number of test patterns17 and $D$ represents the latency between the current TPG and its immediate predecessor. The minimum (or optimized value) for $D$ is one clock cycle. Usually $D$ is greater than one clock cycle because generation of the new test pattern cannot occur until the system loads the previous pattern to the CUT/kernel as soon as the bus is available.

By comparing Equations 7 and 8, we see that the conflict table approach requires more testing time than LTA, even though we set the optimized pipelining schedule for the best value of $D$. This occurs because our LTA approach uses the fundamental characteristics of the MISRs to operate simultaneously as TPG and PSA. Thus LTA eliminates the waiting time for the available register and bus to generate a separate test pattern for the CUT/kernel. In addition, the possibility that additional MISR circuits or interconnections are required by the conflict scheduling to separate TPG and PSA modes does not occur in LTA. A 16-input/16-output CUT needs a testing time of at least $65{,}567 + t_0$ clock cycles (calculated from Equation 8 with $t_0 = t_{\text{module}}$), which is greater than the $32 + 32{,}768\,t_0$ clock cycles given by the 8-stage LTA ($t_0$ is at most one clock cycle).
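The worked numbers in this comparison can be reproduced with a short calculation. The closed forms below are assumptions inferred only from the numeric examples in the text ($65{,}536\,C\,(32 + t_0)$ for boundary scan, $32 + 32{,}768\,t_0$ for eight-stage LTA, and $65{,}567 + t_0$ for conflict scheduling with $D = 1$). They are not the article's own equations, and all function and parameter names are hypothetical.

```python
def boundary_scan_cycles(n_inputs, t_setup, t_module, t_readout, c=1):
    """Assumed form: every one of C * 2^N patterns pays scan-in, run, scan-out."""
    return c * 2 ** n_inputs * (t_setup + t_module + t_readout)


def lta_cycles(n_inputs, stages, t_setup, t_module, t_readout):
    """Assumed form: one set-up and readout, plus 4 * 2^N / k module executions."""
    return t_setup + t_readout + (4 * 2 ** n_inputs // stages) * t_module


def conflict_scheduling_cycles(n_inputs, t_setup, t_module, t_readout, d=1):
    """Assumed form: one pass through the CUT plus (2^N - 1) patterns spaced D apart."""
    return t_setup + t_readout + t_module + (2 ** n_inputs - 1) * d


def extra_area_fraction(lta_transistors, scan_transistors):
    """Relative area increase of LTA over boundary scan."""
    return (lta_transistors - scan_transistors) / scan_transistors


if __name__ == "__main__":
    t0 = 1  # module execution time in clock cycles (at most one per the text)
    print("boundary scan:", boundary_scan_cycles(16, 16, t0, 16))
    print("LTA, 8 stages:", lta_cycles(16, 8, 16, t0, 16))
    print("conflict scheduling:", conflict_scheduling_cycles(16, 16, t0, 16))
    print("extra area:", round(extra_area_fraction(4240, 3040), 3))
```

Under these assumed forms with $t_0 = 1$ and $C = 1$, LTA needs under 2% of the boundary-scan cycle count, in line with the "less than 10% of the testing time" figure cited from the earlier performance analysis, and the transistor counts give an area increase of roughly 39%.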

Due to the dual TPG/PSA mode provided by LTA, we reduce the exhaustive testing time by cutting the time required for scheduling conflicts on one MISR. In addition, the horizontal (for bit-size changes) and vertical (for multiple CUTs to be tested in a pipe) extensibility given by LTA provides the best utilization of the parallel testing. Therefore, we can perform pipelining and achieve parallelism on one system with minimal design modification and optimal test scheduling.

Our LTA scheme requires less testing time compared to boundary scan and pipelining with conflict scheduling without losing the effectiveness of the test coverage. We anticipate no significant area overhead (except for the spare XOR paths for cascadability) in LTA. Furthermore, we believe that by rearranging the placement of the CBIT circuits and including test scheduling, we can gain high testability and observability of the permanent faults in both the processor and interconnection.

WE PROPOSED A design to test MCM modules configured in a pipelined fashion. Cascading the CBITs produces high test coverage with 100% randomness in the TPG process and low aliasing probability in signature analysis. In addition, the CBIT circuit can serve as a switching device for module reconfiguration.

We also introduced LTA as a way to reduce aliasing probability. When compared to the GLFSR approach,5 LTA gives a similar aliasing probability as the twofold GLFSR. LTA implementation also works when the I/O ports are moved to the center of the chip area in the future system design.

LTA exhibits greater efficiency when we partition the MCM testing sessions into several subcircuits for parallel testing. Yeh et al. discuss partitioning algorithms using netlist as inputs,12 while Srinivasan et al.10 propose partitioning at a higher level. Seth and Agrawal discuss in-depth analysis of testability of partitioned CUTs.11

Our future work will emphasize integrating partitioning and clustering algorithms with LTA, to allow automation of hierarchical functional test methodology for MCMs. We expect that modifying LTA and CBIT circuitry to accommodate the interconnection reconfiguration and self-purging in the MCM will improve fault tolerance.

Acknowledgment
We thank Hany Chen of Cadence Design Systems Inc. for his valuable comments in clarifying some ideas in this work.

References
1. C. M. Maunder and R. E. Tulloss, “Testability on TAP,” IEEE Spectrum, Feb. 1992, pp. 34-37.
2. B. J. Koenemann, J. Mucha, and G. Zwiehoff, “Built-in Logic Block Techniques,” Proc. Int’l Test Conf., IEEE Computer Society Press, Los Alamitos, Calif., 1979, pp. 37-41.
3. D. K. Bhavsar, “Concatenable Polydividers: Bit-Sliced LFSR Chips for Board Self-Test,” Proc. Int’l Test Conf., IEEE CS Press, 1985, pp. 88-93.
4. S. Pilarski, A. Krasniewski, and T. Kameda, “Estimating Testing Effectiveness of the Circular Self-Test Path Technique,” IEEE Trans. CAD, Vol. 11, No. 10, Oct. 1992, pp. 1301-1316.
5. D. K. Pradhan and S. K. Gupta, “A New Framework for Designing and Analyzing BIST Techniques and Zero Aliasing Compression,” IEEE Trans. Computers, Vol. 40, No. 6, 1991, pp. 743-763.
6. C. A. Pina, “Implementation of an MCM Brokerage Service,” Proc. IEEE MCM Conf., IEEE CS Press, 1993, pp. 46-51.
7. S. Golomb, Shift Register Sequences, Aegean Park Press, Laguna Hills, Calif., 1982.
8. M. Damiani et al., “Aliasing in Signature Analysis Testing with Multiple Input Shift Registers,” IEEE Trans. CAD, Vol. 9, No. 12, Dec. 1990, pp. 1344-1353.
9. T. T. Lin and C. Kaseff, “Performance Evaluation of Cascadable Built-In Tester for Large I/O Multichip Modules,” Proc. of the Fourth Ann. IEEE Int’l ASIC Conf., IEEE CS Press, 1991, pp. 93.1-93.4.
10. R. Srinivasan, S. K. Gupta, and M. A. Breuer, “An Efficient Partitioning Strategy for Pseudo-Exhaustive Testing,” Proc. 30th IEEE Design Automation Conf., IEEE CS Press, 1993, pp. 525-530.
11. S. C. Seth and V. D. Agrawal, “A New Model for Computation of Probabilistic Testability in Combinational Circuits,” Integration, the VLSI Journal, North-Holland Publishing Co., Amsterdam, Vol. 7, No. 1, April 1989, pp. 49-75.
12. C. W. Yeh, C. K. Cheng, and T. T. Lin, “A Probabilistic Multicommodity-Flow Solution to Circuit Clustering Problems,” Proc. IEEE/ACM Int’l Conf. CAD, IEEE CS Press, 1992, pp. 428-431.
13. K. Kim, D. S. Ha, and J. G. Tront, “On Using Signature Registers as Pseudorandom Pattern Generators in Built-In Self-Testing,” IEEE Trans. CAD, Vol. 7, No. 8, Aug. 1988, pp. 919-928.
14. M. J. Karpovsky, S. K. Gupta, and D. K. Pradhan, “Aliasing and Diagnosis Probability in MISR and STUMPS Using a General Error Model,” Proc. 1991 Int’l Test Conf., IEEE CS Press, pp. 828-839.
15. T. T. Lin, J. Comito, and C. Kaseff, “Evaluation of Test Strategies for Multichip Modules,” Proc. 5th Ann. IEEE Int’l ASIC Conf., IEEE CS Press, 1992, pp. 234-237.
16. J. A. Abraham and V. K. Agarwal, “Test Generation for Digital Systems,” Fault-Tolerant Computing: Theory and Techniques, D. K. Pradhan, ed., Prentice Hall, Englewood Cliffs, N.J., 1986, pp. 1-94.
17. M. S. Abadir and M. A. Breuer, “A Knowledge-Based System for Designing Testable VLSI Chips,” IEEE Design & Test of Computers, Vol. 2, No. 4, Aug. 1985, pp. 56-68.
18. A. Krasniewski and A. Albicki, “Automatic Design of Exhaustively Self-Testing Chips with BILBO Modules,” Proc. Int’l Test Conf., IEEE CS Press, 1985, pp. 362-371.



electrical and computer engineering at the University of California, San Diego. Her research interests include fault-tolerant computing, design automation, VLSI testing, fault modeling, and system reliability. She received a BS from National Chiao-Tung University, Hsinchu, Taiwan, Republic of China and a PhD in computer engineering from Carnegie Mellon University. Lin is a member of the IEEE, the Computer Society, and Sigma Xi.

computer engineering at the University of California, San Diego. Her research interests include fault-tolerant computing in self-test design methodology for VLSI circuits, and fault modeling of free-space optical interconnections. Previously she was a senior member of the technical staff at Cadence Design Systems in San Jose, California. Liou received her BS in physics from National Taiwan University, Taipei, an MS in physics from the University of Pittsburgh, and an MSEE from the University of California, San Diego. She is a student member of IEEE and the Computer Society.

Send correspondence to Ting-Ting Y. Lin at The Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA 92093; [email protected].

