Minimizing Test Time by Exploiting Parallelism in Macro Test
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY ROURKELA. Downloaded on October 13,2021 at 14:37:55 UTC from IEEE Xplore. Restrictions apply.
After a test plan (at the device boundary) and a test pattern set have been generated for a macro in the design, the test pattern insertion process takes care of writing out the test plan for every test pattern in the set. The result of this process is called the macro test specification. Finally, a complete device test specification is produced by merging all macro test specifications. This process is called test assembly [Be 90]. Test pattern insertion and test assembly have been visualized in Figure 1.

[Figure 1: Test pattern insertion and test assembly.]

... recently been taken into production. With the parallel test assembly algorithm, reductions of up to 40-50% have been reached compared to traditional techniques where macros were tested sequentially.

2 Test assembly

Test plan generation

As mentioned in the introduction, the test plan generation process determines the access protocol (test plan) required to access a macro from the device pinning. This
Paper 20.3
to the bits that must be applied, patM1[5..6] gives a reference to the bits that must be observed. Note that it is possible to observe the result at the scan-out port at steps 12 and 13 instead of steps 13 and 14 in the test plan of Figure 1 (by shifting the observe window). Since this would reduce test time only in specific examples, and for reasons of simplicity, this optimization will not be taken into account.

... x 14 steps + 3 patterns x 8 steps = 52 cycles or test specification steps (Figure 3).

[Figure 3: Test Plan M1 and Test Plan M2 executed one after the other; control ports c1..c4; axis: cycles, 0 to 52.]
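The sequential test time quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the pattern count of 2 for the 14-step test plan is an assumption inferred from the stated total of 52 cycles, since the first factor is truncated in the text.

```python
def sequential_cycles(plans):
    """Total cycles when test plans are executed one after another.

    `plans` is a list of (number_of_patterns, steps_per_test_plan) pairs.
    """
    return sum(patterns * steps for patterns, steps in plans)


# 14-step test plan: the pattern count 2 is an ASSUMPTION inferred from the total.
# 8-step test plan: 3 patterns, as stated in the text.
total = sequential_cycles([(2, 14), (3, 8)])
print(total)  # 52
```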
[Figure 4: Parallel test assembly. Panels: Test Plan M1 (14 steps), Test Plan M2 (8 steps), Test Specification; axis: cycles, 0 to 27.]

... as resources may significantly determine the amount of parallelism that can be exploited, and therefore the test time reduction that can be reached.

To minimize test time, the test assembly process needs to allocate the macro test specification steps to time slots in a way that the total number of required time slots (number of device test specification steps or: clock cycles required) is minimized. This allocation needs to be done in a way that the functionality of the test plans is maintained, and without the occurrence of conflicting resource conditions.

A classification of different levels of parallelism in VLSI testing can now be made by considering different choices that can be made to specify resources.

Test plan parallelism

At a first level, only hardware structures are considered as resources. If no hardware structures are shared between two test plans, they can be executed in parallel. The level at which parallelism is considered will be called the test plan level, since resources are specified for complete test plans. The relative time steps (or test plan steps) at which specific hardware structures are used are not (or: need not be) taken into consideration. For example, in [Ki 82, Cr 88] register segments that act as a TPG (test pattern generator) or SA (signature analyzer) in BIST are considered to be resources. This means that tests which require the same register segment in their execution cannot be applied in parallel.

Example

Consider the example design in Figure 2. The test plans for macros M1 and M2 have been given in Tables 1 and 2. From Figure 2, it can easily be seen that patterns can be applied to M1 and M2 in parallel by shifting patterns in and out concurrently for both macros. This merged test plan is given in Table 3. This merge of test plans can be done since M1 and M2 do not share registers for applying or observing test patterns. Therefore, registers which are used to apply or observe test patterns in a test plan are sufficient to be considered as resources. In the test plan of Table 1 patterns are applied via register R2 and observed via register R3. R2 and R3 are considered to be the resources for this test plan. In the same way, R1 and R4 are the resources for the test plan of Table 2. Since no resources are shared, the test plans can be merged. Macros M1 and M2 can now be tested by executing the merged test plan n1 times, and the test plan of M2 n2-n1 times.

          controls   data-ins      data-outs
cycles    c1..c4     sdi           sdo
1-4       S---       patM1[1-4]    -
5-8       SS--       patM2[1-4]    -
9         NNNN       -             -
10        --SS       -             patM2[5]
11-12     ---S       -             patM2[6-7]
13        ---S       -             patM1[5]
14        ----       -             patM1[6]

Table 3: Merged test plan for M1 and M2.

Note the don’t care values given for control ports c1..c4. Control ports c1..c4 can be merged into a single port with modes scan (S) and normal (N) by choosing N for the don’t cares on cycle 9, and S for the don’t cares of the remaining cycles. For readability reasons, don’t cares of further test specification examples have been filled in.

Test plan step parallelism

A lower level at which parallelism can be considered is the test plan step level. At this level, a resource consists of a hardware structure in combination with the time step at which the hardware structure is required. Thus, hardware structures are specified for test plan steps instead of for complete test plans. Now that (relative) times are known at which hardware structures are used, test plans can be scheduled to use shared hardware structures at different times during their execution [Sa 88]. This level also creates the possibility to exploit pipelining between two consecutive test plan executions [Ab 86], i.e., the execution of a test plan can safely be started before a previous execution of the same test plan for another pattern has been finished.

Examples

Consider the example design in Figure 2. The test plan for macro M1 has been given in Table 1. We consider parallelism at the test plan step level. A resource is considered to be a scan register in combination with the time interval in which the scan register is used (contains test data). Since the scan-in steps (1-8) are compatible with the scan-out steps (10-14), pipelining can be ex-
ploited between consecutive test plan executions. A test specification for M1 according to this semi-pipelining protocol is given in Table 4.

                   controls   data-ins   data-outs
repeat    cycles   c1..c4     sdi        sdo
1         4        SSSS
          4        SSSS
n1-1      1        NNNN
          3        SSSS
          1        SSSS
          1        SSSS
          3        SSSS
1         1        NNNN
          3        SSSS
          2        SSSS

Table 4: Test specification for M1 (semi-pipelining).

... that it can only be exploited if the registers contain a hold mode. However, in this case test time can also be reduced by performing the shift-in, apply and shift-out cycles for M1 and M2 simultaneously (see Table 3). After executing this parallel scheme n1 times, all test patterns for M1 have been applied and the n2-n1 remaining patterns for M2 can be applied according to one of the schemes described above. With this approach, hold modes are not required for the scan registers. Table 6 gives an assembled test specification for M, in which parallelism at the test plan step level has been exploited.
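The cycle count implied by the row structure of Table 4 can be tallied in a short sketch. This is a minimal illustration, not the authors' tool: the function names are hypothetical, and the default values (8 scan-in steps, 1 apply step, 5 scan-out steps) are taken from the step ranges 1-8, 9 and 10-14 quoted for the test plan of M1.

```python
def non_pipelined_cycles(n, scan_in=8, scan_out=5, apply=1):
    """Every pattern runs the complete test plan: scan-in, apply, scan-out."""
    return n * (scan_in + apply + scan_out)


def semi_pipelined_cycles(n, scan_in=8, scan_out=5, apply=1):
    """Cycle count following the three repeat groups of Table 4."""
    if n < 1:
        raise ValueError("at least one pattern is required")
    first = scan_in                                     # initial scan-in (4 + 4 cycles)
    middle = (n - 1) * (apply + max(scan_in, scan_out)) # apply, then overlapped scan-out/scan-in
    last = apply + scan_out                             # final apply and scan-out (1 + 3 + 2 cycles)
    return first + middle + last


for n in (1, 3, 10):
    print(n, non_pipelined_cycles(n), semi_pipelined_cycles(n))
```

With these defaults the semi-pipelined count is 9n + 5 cycles against 14n cycles without pipelining, so the saving grows with the number of patterns.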
Example

First, an example in which a part of a test pattern is saved after it has been applied to a macro and is used again as a part of a successor pattern that must be applied to the macro. Again, consider Figure 2. Let n1 = 3 and the test pattern set for M1 be patM1[14] = {1101LH, 0101HH, 0110LL}. Note from Figure 2 that for M1 the applying register R2 is 4 scan cells wide whereas the observing register R3 is only 2 scan cells wide. Thus, R3 can observe a new response pattern from M1 every 2 cycles. Without knowledge of the test data contents it takes 4 cycles to fill R2 with a new pattern for M1. However, if the contents of the test data are taken into consideration, a possible overlap between the test patterns can be exploited. Assume R2 is filled with test data for M1. Also assume that R2 contains a mode in which data can be applied to M1 in a manner that the contents of R2 are not lost. This mode will be called the apply/hold mode (A). Now, overlap between test patterns can be exploited by shifting two instead of four times after each pattern apply cycle. The test specification for M1 due to this parallelism at the test specification level is given in Table 7.

... fault simulation which faults in M2 are covered by this test pattern set. For M2, test pattern generation is now only required for the remaining (not yet covered) faults. If M1 and M2 are identical, considering parallelism at the test specification step level provides an effective approach for parallel testing [Me 90].

Figure 5: Example design (n2 > n1).
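The overlap between consecutive stimuli for M1 in the example above can be computed mechanically. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the first four symbols of each pattern are taken as the stimulus held in R2, and a shift is assumed to drop the leading bit and append a new trailing bit (the actual shift direction is not given in the text).

```python
def shifts_needed(current, nxt):
    """Smallest number of shifts that turns register contents `current` into `nxt`.

    ASSUMPTION: each shift drops the leading bit and appends a new trailing bit.
    """
    width = len(current)
    for k in range(width + 1):
        # After k shifts the last (width - k) old bits lead the register.
        if current[k:] == nxt[:width - k]:
            return k
    return width  # not reached: k == width always matches


patterns = ["1101LH", "0101HH", "0110LL"]    # pattern set for M1 from the text
stimuli = [p[:4] for p in patterns]          # the 4 bits applied via R2
shifts = [shifts_needed(a, b) for a, b in zip(stimuli, stimuli[1:])]
print(shifts)  # [2, 2]
```

Both transitions need only two shifts, which agrees with the statement that two instead of four shifts suffice after each apply cycle. The sketch looks at the stimuli only; the two cycles needed to shift the response out of R3 are not modelled.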
case: input) is given by the variable type. The time step at which hw is required is given by time. State gives the state of hw (in this case: the value of the control port at time step time). A resource can now be considered a ...

Note that the compatibility relation is not transitive, i.e., if x ~ y and y ~ z then no knowledge exists whether x ~ z.

We can use the compatibility relation to determine compatibility for every pair of test plan executions. The results of this analysis can be presented in a test compatibility graph (TCG) [Ki 82]. Every node of the TCG represents a test plan execution. An edge occurs between two nodes if the test plan executions are compatible. Concerns about the way in which test plan executions are merged and the reductions in test time if specific merge algorithms are used lead to different TCG representations, TCG labelings, and test scheduling algorithms. Test scheduling algorithms aim at finding a schedule for test plan executions which gives a minimum test time, given a certain protocol for merging test plan executions.

4 Implementation considerations

For an initial implementation of a parallel assembly algorithm in the Macro Test tools available within Philips, we have chosen to first exploit parallelism at the test plan level and the test plan step level. As a first approach, we have developed an assembly algorithm that generates a test specification for a number of macros, given generated test plans and test pattern sets for every macro. An advantage of this approach is that a significant reduction in test time is reached without making any design modifications.

Algorithm

We chose to implement an algorithm which is able to merge test plan executions according to the principles given in Tables 3 and 4. Thus, if possible, patterns will be applied to several macros in parallel and the shift-in of new patterns will be done concurrently with the shift-out of previously applied patterns. For every test plan, resources are specified by the test plan generation ...

... apply or scan-out.

For control ports, the value of a port on a time step needs to be specified (the state). If two test plans require incompatible values on a control port during scan-in, apply or scan-out, the test plans are incompatible.

Using the resource specifications, compatibility can be determined between every pair of test plans, and visualised in a test compatibility graph (TCG). Using the TCG, a schedule of test plans with minimal or close to minimal test time can be determined. To perform this scheduling, an (approximate) measure for the reduction in test time is required for our method of merging test plans.

Test time costs

Let $n_{t_i}$ be the number of test patterns that need to be applied through a test plan $t_i$. The number of shift operations between two consecutive applications of test patterns is given by the shift length $\delta_{t_i}$. Let $\delta^{in}_{t_i}$ specify the number of shift operations required in $t_i$ to shift in a pattern and $\delta^{out}_{t_i}$ specify the number of shift operations required in $t_i$ to shift out a response. An approximate equation for the test costs $C(t_i)$, i.e., the number of cycles required to execute a test plan, can now be given [Oo 91]:

$$C(t_i) \approx n_{t_i} \cdot \delta_{t_i}.$$

If semi-pipelining is not being exploited, the shift length can be given by:

$$\delta_{t_i} = \delta^{in}_{t_i} + \delta^{out}_{t_i}.$$

If shifting is done according to the semi-pipelined protocol of Table 4, the shift length becomes:

$$\delta_{t_i} = \max(\delta^{in}_{t_i}, \delta^{out}_{t_i}).$$
The test time reduction if two scan test plans $t_1$ and $t_2$ are executed in parallel according to the protocol given in Table 3 is approximately equal to: ...

These cost criteria can be used as a basis for different kinds of (heuristic) scheduling algorithms as have been given in [Ki 82, Cr 88, Jo 90, Oo 91].

To be able to exploit the advantages of partitioned testing as described in [Cr 88], our scheduling algorithm takes for every test plan a parameter which specifies whether an interruption between consecutive test plan executions is allowed or not. For example, for RAMs, interruption is only allowed if the contents of the RAM are not lost, or if a mechanism exists for restoring the RAM to its state before the interruption.

Further reducing test time

After the resource compatibility graph has been created, an analysis can be done of which further test time reductions can be reached by removing incompatibilities between test plans. This can be done in two steps.

First, a new test plan generation process for a macro can be directed to try to find different resources (e.g., different scan registers in which a response pattern can be observed) to be used in a macro test plan. In this way, "expensive" incompatibilities between test plans can be removed. However, the attempt to remove an incompatibility may introduce new incompatibilities between other test plans. A different approach may be to generate all possible test plans for a macro. The test assembly algorithm can then choose the optimal test plan for the assembly process to obtain a minimal test time. Note that this further reduction in test time is reached without making any design modifications.

Second, modifications can be made to testability hardware structures that are used. Analysis of occurring resource sharing conflicts may lead to the conclusion that design modifications can further reduce test time. A trade-off between additional testability hardware and a reduced test time can now be made.

Macro test optimizations

As mentioned, inputs for the algorithm described here are a set of generated test plans with corresponding test pattern sets. The result of the algorithm is a schedule of test plans. Execution of the test plans according to this schedule leads to minimal or close to minimal test time. The schedule consists of a number of parts. In each part, a new set of (assembled) test patterns is applied to a number of macros in parallel according to a new (assembled) test plan. We can now extend the concept of a macro to a virtual macro. A virtual macro can be tested by executing its assembled test plan for every pattern in its assembled test pattern set. Note that with the introduction of virtual macros, the direct relationship which existed between macros and a block of hardware is lost. Executing a test plan for a virtual macro will test part of the hardware in one or more macros. Virtual macros are merely the result of an optimization process (in this case: for test time). Since both the input and the output of the assembly process are (virtual) macros, several optimization processes can now be performed independently.

5 Results

To demonstrate the validity of the approach presented in this paper, we take a digital signal processor (DSP1) IC produced by Philips Semiconductors. This DSP1 contains about 150,000 transistors. The IC is partitioned into 18 macros. For every macro, the control class of the test plan, the number of test patterns $n_{t_i}$ and shift length $\delta_{t_i}$ have been given in Table 8. Also, the macro test plans have been divided into control classes. All test plans of a class are tested using compatible control, i.e., the values that need to be applied to the control ports are compatible between the test plans. Thus, test plans of different control classes cannot be executed in parallel. Since for all test plans, control values turned out to be compatible between scan-in and scan-out, the shift lengths are specified for the semi-pipelining shift protocol. All reductions presented here are compared to the semi-pipelining protocol.

Table 8: Macro data for the DSP1.

For every control class, the test incompatibility graph (TIG) has been given in Figure 6.
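How control classes and shared registers give rise to a test incompatibility graph, and how such a graph can drive a schedule, can be illustrated with a small sketch. Everything in it is hypothetical: the plan names, register names, costs and the greedy grouping heuristic are assumptions chosen for illustration and are not the scheduling algorithm used in the Macro Test tools.

```python
from itertools import combinations


def incompatible(a, b):
    """Plans conflict if their control classes differ or they share a resource."""
    return a["control_class"] != b["control_class"] or bool(a["resources"] & b["resources"])


def build_tig(plans):
    """Test incompatibility graph: plan name -> set of conflicting plan names."""
    graph = {name: set() for name in plans}
    for x, y in combinations(plans, 2):
        if incompatible(plans[x], plans[y]):
            graph[x].add(y)
            graph[y].add(x)
    return graph


def greedy_groups(plans, graph):
    """Pack mutually compatible plans into groups, most expensive plan first."""
    groups = []
    for name in sorted(plans, key=lambda p: -plans[p]["cost"]):
        for group in groups:
            if not (graph[name] & set(group)):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical test plans: control class, registers used, approximate cost.
plans = {
    "t1": {"control_class": 1, "resources": {"R1"}, "cost": 100},
    "t2": {"control_class": 1, "resources": {"R2"}, "cost": 80},
    "t3": {"control_class": 1, "resources": {"R1", "R3"}, "cost": 60},
    "t4": {"control_class": 2, "resources": {"R4"}, "cost": 40},
}
graph = build_tig(plans)
groups = greedy_groups(plans, graph)
print(groups)  # [['t1', 't2'], ['t3'], ['t4']]
```

Here t4 is isolated purely by its control class, while t1 and t3 conflict only because both use R1; a different test plan for either would remove that edge.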
[Figure 6: Test incompatibility graphs for control classes 1, 2 and 3.]

The results of the scheduling process are given in Table 9. The schedule consists of 17 assembled test plans $s_j$. The number of test patterns $n_{s_j}$ and shift length $\delta_{s_j}$ have been given for every assembled test plan. The test time dropped to 466889 cycles, i.e., a reduction of 16%.

j     macro components     $n_{s_j}$   $\delta_{s_j}$
1     6 7 13 17 18         20          202
2     6 13 17 18           30          163
3     6 13 17              30          163
4     5 13 17              20          163
5     13 17                92          163
6     12 17                10          192
7     17                   1254        163
8     14 16                150         158
9     16                   1564        91
10    2 10 11              80          110
11    1 10 11              21          110
12    10 11                262         72
13    8 9 11               20          73
14    11                   220         55
15    15                   76          96
16    3                    64          91
17    4                    20          94
parallel time                          466889

Table 9: Test schedule for the DSP1.

Analysis results of removing resource sharing conflicts to further reduce test time are given in Table 10. The first line of this table shows that a reduction of 50% would be reached if no incompatibilities existed, i.e., all test plans can be put in one control class and no primary data ports, scan input registers and scan output registers are shared between test plans. Thus, for this IC, 50% is the maximum reduction that can be reached with this method of parallel testing. The second line of Table 10 gives the test time reduction if only incompatibilities due to sharing of data ports and scan registers are removed.

test time   reduction   number of registers shared   removed incompatibility
276224      50%                                      all
325151      41%                                      all reg. sharing
332937      40%         18                           16 - 17
456839      17%         9                            14 - 17
459443      17%         19                           15 - 16
459517      17%         18                           15 - 17
465049      16%         1                            3 - 4
466091      16%         1                            1 - 2
466869      16%         9                            8 - 10
466889      16%         1                            5 - 6
466889      16%         10                           9 - 10
466899      16%         13                           12 - 13

Table 10

As can be seen from Table 10, a reduction of 40% can be reached if the incompatibility between test plans 16 and 17 (the test plans for the two RAMs) is removed. Choosing different test plans for the RAMs leads to the new test schedule given in Table 11. Note that the reduction of 40% in test time has been reached without making any design modifications.

j     macro components     $n_{s_j}$   $\delta_{s_j}$
1     6 7 13 14 16 17 18   20          202
2     6 13 14 16 17 18     30          172
3     6 13 14 16 17        30          172
4     5 13 14 16 17        20          172
5     13 14 16 17          50          172
6     13 16 17             42          172
7     12 16 17             10          192
8     16 17                1254        172
9     16                   258         91
10    2 10 11              80          110
11    1 10 11              21          110
12    10 11                262         72
13    8 9 11               20          73
14    11                   220         55
15    15                   76          96
16    3                    64          91
17    4                    20          94
parallel time                          335721

Table 11: Test schedule after the incompatibility between macros 16 and 17 has been removed.

Application of the parallel assembly algorithm to a second industrial device (here called DSP2) has led to similar promising results. A general overview of results for both ICs is given in Table 12.
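The parallel test times of Tables 9 and 11 can be tallied from the per-plan pattern counts and shift lengths. In the sketch below the (patterns, shift length) pairs are transcribed from those tables; the function names are hypothetical, and charging one extra apply cycle per pattern is an assumption, not a statement from the text, although it happens to reproduce both reported totals.

```python
# (number of patterns, shift length) per assembled test plan, j = 1..17
TABLE_9 = [(20, 202), (30, 163), (30, 163), (20, 163), (92, 163), (10, 192),
           (1254, 163), (150, 158), (1564, 91), (80, 110), (21, 110),
           (262, 72), (20, 73), (220, 55), (76, 96), (64, 91), (20, 94)]
TABLE_11 = [(20, 202), (30, 172), (30, 172), (20, 172), (50, 172), (42, 172),
            (10, 192), (1254, 172), (258, 91), (80, 110), (21, 110),
            (262, 72), (20, 73), (220, 55), (76, 96), (64, 91), (20, 94)]


def approx_time(schedule):
    """Approximate cost: patterns times shift length, summed over all plans."""
    return sum(n * d for n, d in schedule)


def time_with_apply(schedule):
    """ASSUMPTION: one additional apply cycle per pattern."""
    return sum(n * (d + 1) for n, d in schedule)


print(approx_time(TABLE_9), time_with_apply(TABLE_9))    # 462956 466889
print(approx_time(TABLE_11), time_with_apply(TABLE_11))  # 333244 335721
```

The totals 466889 and 335721 match the parallel times listed in Tables 9 and 11, and the pure shift-length approximation stays within about 1% of them.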
IC      protocol                                               test time (cycles)   reduction
DSP1    sequential                                             10996...
        semi-pipelined                                         ...
        parallel (initially)                                   466889
        parallel after removing a resource sharing conflict    335721
        potential                                              ...
DSP2    sequential                                             1170360
        semi-pipelined                                         587076               0%
        parallel                                               298894               49%
        potential                                              136932               77%

Table 12: Test time reduction results.

6 Conclusion

In this paper, a classification of different possibilities for reducing test time of devices by exploiting parallelism in Macro Test has been presented. It has been shown that the choice of what to consider as resources for testing greatly determines the complexity of the test scheduling algorithm that needs to be used and the amount of parallelism (and therefore the reduction in test time) that can be exploited. Also, impacts of test plan generation and test control structures on parallelism are described. Application of one of the techniques of parallel testing to industrial devices has led to test time reductions of 40-50% without making any design modifications.

Acknowledgements

This work was carried out in JESSI project AC6.

The authors would like to thank the members of group Van Utteren at Philips Research Labs for the many valuable discussions.

References

[Ab 86] Magdy S. Abadir, Melvin A. Breuer, "Test Schedules for VLSI Circuits Having Built-In Test Hardware", IEEE Transactions on Computers, April 1986, pp. 361-367.

[Ba 90] Robert W. Bassett et al., "Low-cost testing of high-density logic components", IEEE Design and Test of Computers, April 1990, pp. 15-28.

[Be 86] Frans Beenker et al., "Macro Testing: Unifying IC and Board Test", IEEE Design & Test of Computers, December 1986, pp. 26-32.

[Be 90] Frans Beenker, Rob Dekker, Rudi Stans, Max van der Star, "Implementing Macro Test in Silicon Compiler design", IEEE Design & Test of Computers, April 1990, pp. 41-51.

[Be 84] R.G. Bennetts, Design of Testable Logic Circuits, Addison-Wesley, 1984.

[Bo 92] Frank Bouwman et al., "Macro Testability; The Results of Production Device Applications", Proceedings IEEE International Test Conference, 1992, pp. 232-241.

[Cr 88] Gary L. Craig, Charles R. Kime, Kewal K. Saluja, "Test Scheduling and Control for VLSI Built-In Self-Test", IEEE Transactions on Computers, September 1988, pp. 1099-1109.

[De 88] Rob Dekker, Frans Beenker, Loek Thijssen, "Fault Modeling and Test Algorithm Development for Static Random Access Memories", Proceedings IEEE International Test Conference, 1988, pp. 343-352.

[Fe 90] Sheng Feng, Yashwant K. Malaiya, "Optimization of Test Parallelism with Limited Hardware Overhead", Microelectronics Reliability, Vol. 31, 1991, pp. 271-276.

[Gu 91] Rajesh Gupta, "Advanced Serial Scan Design for Testability", CEng Technical Report 91-10, University of Southern California, 1991.

[Jo 90] Wen-Ben Jone, C.A. Papachristou, M. Pereira, "A Scheme for Overlaying Concurrent Testing of VLSI Circuits", 26th ACM/IEEE Design Automation Conference, 1989, pp. 531-536.

[Ki 82] Charles R. Kime, Kewal K. Saluja, "Test Scheduling in Testable VLSI Circuits", Proceedings International Symposium on Fault-Tolerant Computing, 1982, pp. 406-412.

[Le 90] Sunggu Lee, Kang G. Shin, "Design for Test Using Partial Parallel Scan", IEEE Transactions on Computer-Aided Design, February 1990, pp. 203-211.

[Ma 93] Erik Jan Marinissen, Krijn Kuiper, Clemens Wouters, "Testability and Test Protocol Expansion in Hierarchical Macro Testing", Proceedings IEEE 3rd European Test Conference, 1993.

[Me 90] R. Mehtani et al., "Macro-Testability and the VSP", Proceedings IEEE International Test Conference, 1990, pp. 739-748.

[Mo 91] Sean P. Morley, Ralph A. Marlett, "Selectable Length Partial Scan: A Method to Reduce Vector Length", Proceedings IEEE International Test Conference, 1991, pp. 385-392.

[Na 92] Sridhar Narayanan, Charles Njinda, Melvin Breuer, "Optimal Sequencing of Scan Registers", Proceedings IEEE International Test Conference, 1992, pp. 293-302.

[Oo 91] Steven Oostdijk, Frans Beenker, Loek Thijssen, "A Model for Test-Time Reduction of Scan-Testable Circuits", Proceedings IEEE 2nd European Test Conference, 1991, pp. 243-252.

[Sa 88] John Sayah, Charles R. Kime, "Test Scheduling for High Performance VLSI System Implementations", Proceedings IEEE International Test Conference, 1988, pp. 421-430.