A Robust Scan Insertion Methodology
Reecha Jajodia and Paridhi Agrawal, Freescale Semiconductor - July 08, 2014
In the modern era, where meeting high performance and low power targets for any complex SoC
(System on Chip) is very tough, testing the SoC has become even more challenging. The purpose of
several DFT (Design for Testability) tests is to validate that the product hardware contains no
manufacturing defects.
Good quality scan testing is a mandatory requirement for qualification of all production devices.
With lower technology nodes and increasing design complexity, stitching the flip flops into scan
chains to enable DFT ATPG (Automatic Test Pattern Generation)/LBIST (Logic Built in Self Test)
testing poses a number of challenges.
At a broad level, the major challenges during scan insertion include:
1. An increasing number of clock domains, with a large number of IPs (Intellectual Properties)
operating at different frequencies integrated in a single SoC.
2. The LBIST requirement as a part of safety.
3. Concatenation of scan chains at different levels to enable different modes.
4. Special care-abouts for connecting scan chains inside hardened IPs.
Special care must be taken at each level to ensure that scan stitching is robust.
Here we elaborate on Robust Scan Insertion infrastructure and methodology that takes care of the
above mentioned challenges upfront during the design phase.
Two best practices are followed in designs that have multiple clock domains:
a. Flops operating in different clock domains must be stitched into separate chains.
Problem: The clock trees of two separate functional clock domains are built separately. If flops
from both domains are stitched into a single chain, this can cause hold violations during scan shift.
b. All the flops in the same clock domain must not be stitched into a single chain.
Problem: Consider a design of 20k flops with 4 clock domains, each consisting of roughly 5k
flops. If all 5k flops in one clock domain are stitched together, the chain length becomes 5k,
so 5k shift cycles are required to load the data during testing. With a shift clock period of
50ns, a single shift-in of data takes 5k x 50ns = 0.25ms. This would severely impact the test
time and the overall product cycle time. Thus there is a need to break the chains into smaller
groups of flops to obtain an optimized chain length for the design. For the above design, we
would break the single chain of each clock domain into about 25 chains of 200 flops each. This
gives a shift time of 200x50ns=10us (<< 0.25ms).
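The arithmetic above can be reproduced with a minimal sketch. The function names are
hypothetical and chosen only for illustration; the 50ns shift period, the 5k flops per clock
domain and the 200-flop target chain length are the figures used in the text.

```python
SHIFT_PERIOD_NS = 50  # shift clock period used in the text


def shift_time_us(chain_length, period_ns=SHIFT_PERIOD_NS):
    """Time in microseconds to shift one full load of data through a chain."""
    return chain_length * period_ns / 1000.0


def chains_needed(flops_in_domain, max_chain_length):
    """Smallest number of chains such that no chain exceeds max_chain_length."""
    return -(-flops_in_domain // max_chain_length)  # ceiling division


flops_per_domain = 5000   # roughly 5k flops in one clock domain
target_length = 200       # desired flops per chain

print(shift_time_us(flops_per_domain))                 # single 5k chain: 250.0 us (0.25 ms)
print(chains_needed(flops_per_domain, target_length))  # 25 chains
print(shift_time_us(target_length))                    # 200-flop chain: 10.0 us
```

The same two functions can be used to explore other trade-offs between chain count and shift
time for a given shift period.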
With the chains split in this way, we need 25 scan-in and 25 scan-out pads for testing. This
poses another challenge, because tester channels (the pads used during testing) are limited for
DFT testing and multi-site testing is required. This leads to the inclusion of EDT (Embedded
Deterministic Test) logic in the design, with a decompressor at the input end and a compressor
at the output end. The EDT logic takes a few inputs from the pads (tester channels) and the PRPG
(Pseudo Random Pattern Generator) and feeds multiple scan-in chains in the SOG. Similarly, it
takes the scan-outs from multiple SOG chains and compresses them into a few outputs that are
observed on the tester.
The EDT architecture depends on the ratio of external channels to internal chains and on the
number of clock domains. While generating the EDT logic, if the tool encounters a negative edge
triggered flip flop at the start of a chain, it inserts a lockup cell in the EDT logic to avoid a
DRC (design rule check) violation. Over multiple iterations of scan insertion, the tool may place
a positive edge triggered flop instead of the negative edge triggered flop at the start of that
chain. The lockup cell introduced in the EDT logic for the first condition is then no longer
valid. Similarly, if in the next scan insertion run the tool places a negative edge triggered
flop instead of a positive edge triggered flop at the start of the chain, the EDT logic for that
chain will not have a lockup cell, because in the previous iteration the chain started with a
positive edge triggered flop. In both cases the EDT logic has to be regenerated.
In order to avoid EDT regeneration with every synthesis/scan insertion iteration, the robust scan
architecture suggests that the first and the last flop of every scan chain should be positive
edge triggered.
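The rule can be checked mechanically after each scan insertion run. The sketch below is a
simplified model and not the behavior of any particular tool: the chain representation (a list
of instance name and clock edge pairs) and both function names are assumptions made for
illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (instance name, clock edge), edge is "pos" or "neg".
Flop = Tuple[str, str]


def endpoints_are_positive(chain: List[Flop]) -> bool:
    """True if the chain is non-empty and its first and last flops are +ve edge triggered."""
    return bool(chain) and chain[0][1] == "pos" and chain[-1][1] == "pos"


def start_polarity_changed(old_chain: List[Flop], new_chain: List[Flop]) -> bool:
    """True if the first flop's edge differs between two scan insertion runs.

    Following the reasoning in the text, this is the situation in which the
    lockup cell decision made inside the EDT logic no longer matches the chain.
    """
    return old_chain[0][1] != new_chain[0][1]


run1 = [("u_core/ff_a", "neg"), ("u_core/ff_b", "pos"), ("u_core/ff_c", "pos")]
run2 = [("u_core/ff_b", "pos"), ("u_core/ff_a", "neg"), ("u_core/ff_c", "pos")]

print(endpoints_are_positive(run1))        # False: chain starts on a -ve edge flop
print(endpoints_are_positive(run2))        # True: both ends are +ve edge flops
print(start_polarity_changed(run1, run2))  # True: start polarity changed between runs
```

If every chain passes the first check in every run, the second condition can never occur,
which is why the rule removes the need to regenerate the EDT logic.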
LBIST is the safety feature provided for in-field testing of the chip. The time available for this
testing (~20ms) is much shorter than the production test time. Thus the LBIST partition of the
design should either:
i. Shift at a higher frequency: This is not viable, since shifting the whole design at high
frequency leads to power related issues.
ii. Have smaller chain lengths: This reduces the time required for shift during LBIST,
significantly reducing test time. But it further complicates scan insertion by breaking down
the chains for LBIST (to as few as 50 flops per chain). This chain configuration would in
turn need a high compression ratio for scan, which is not supported by many of the low cost
testers available today. This introduces a component called concat chains, which is integrated
as a part of the LBIST controller IP. The primary purpose of this module is to concatenate n
smaller LBIST chains to form m longer EDT scan chains (m << n) required during ATPG scan. The
concatenation is done per clock domain: two chains of different clock domains are never
concatenated to form an EDT chain. This is taken care of inside the concat block during LBIST
controller generation. Fig(2) gives a basic block diagram of the concat block and its integration.
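The grouping performed by the concat block can be sketched as follows. In practice the concat
block is produced during LBIST controller generation; the greedy packing below, the
max_edt_length parameter and all names are assumptions used only to illustrate the per clock
domain rule.

```python
from collections import defaultdict
from typing import List, NamedTuple


class LbistChain(NamedTuple):
    name: str
    clock_domain: str
    length: int  # number of flops in the LBIST chain


def concat_chains(chains: List[LbistChain], max_edt_length: int) -> List[List[LbistChain]]:
    """Group n short LBIST chains into m longer EDT chains.

    Chains are only concatenated with chains of the same clock domain, and a
    new EDT chain is started when adding a chain would exceed max_edt_length.
    """
    by_domain = defaultdict(list)
    for chain in chains:
        by_domain[chain.clock_domain].append(chain)

    edt_chains = []
    for domain in sorted(by_domain):
        current, current_len = [], 0
        for chain in by_domain[domain]:
            if current and current_len + chain.length > max_edt_length:
                edt_chains.append(current)
                current, current_len = [], 0
            current.append(chain)
            current_len += chain.length
        if current:
            edt_chains.append(current)
    return edt_chains


# 8 LBIST chains in clk_a and 4 in clk_b, 50 flops each (illustrative numbers).
lbist = [LbistChain(f"a{i}", "clk_a", 50) for i in range(8)]
lbist += [LbistChain(f"b{i}", "clk_b", 50) for i in range(4)]

edt = concat_chains(lbist, max_edt_length=200)
for group in edt:
    print(group[0].clock_domain, [c.name for c in group], sum(c.length for c in group))
```

With these numbers, twelve 50-flop LBIST chains become three 200-flop EDT chains, and no EDT
chain mixes clock domains.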
With a large number of clock domains in the design, clock domain crossing and clock skew are a
major concern.
i. While stitching flops into a single chain, whenever the chain crosses from a +ve edge
triggered flop to a -ve edge triggered flop, a lockup latch must be inserted.
ii. If, due to some design constraint, flops of 2 clock domains must be merged into a single scan
chain, lockup latches must be added.
iii. As discussed above, LBIST chains are concatenated during scan. To make scan robust, chains
of different clock domains must not be concatenated. This avoids hold violations during shift
due to clock skew.
iv. Inside the EDT logic, there is a large amount of combinational logic at the compactor end. So
at each scan out going from the EDT logic to the tester pads, a trailing edge flop is generally
introduced to ease timing.
v. In EDT bypass or single chain mode during burn-in testing, all the EDT chains are concatenated
to form a single chain. To avoid hold violations, lockup latches must be inserted during
concatenation in this mode under the following conditions:
1. When flops of different clock domains are concatenated.
2. When flops of a hard macro are concatenated with flops of another hard macro or with soft
flops, irrespective of the clock domain.
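The lockup latch conditions listed above can be collected into a single predicate. This is a
minimal sketch under an assumed data model: the ScanElement fields and both function names are
hypothetical, and the rules encoded are only the ones stated in the text.

```python
from typing import List, NamedTuple, Optional


class ScanElement(NamedTuple):
    name: str
    clock_domain: str
    edge: str                         # "pos" or "neg"
    hard_macro: Optional[str] = None  # macro instance name, None for a soft flop


def needs_lockup(src: ScanElement, dst: ScanElement) -> bool:
    """True if a lockup latch is required between two adjacent scan elements."""
    if src.clock_domain != dst.clock_domain:
        return True   # flops of different clock domains
    if src.hard_macro != dst.hard_macro:
        return True   # hard macro to another hard macro, or hard macro to/from soft flop
    if src.edge == "pos" and dst.edge == "neg":
        return True   # +ve edge flop followed by -ve edge flop
    return False


def lockup_positions(chain: List[ScanElement]) -> List[int]:
    """Indices i such that a lockup latch goes between chain[i] and chain[i + 1]."""
    return [i for i in range(len(chain) - 1) if needs_lockup(chain[i], chain[i + 1])]


chain = [
    ScanElement("u_a/ff0", "clk_a", "pos"),
    ScanElement("u_a/ff1", "clk_a", "neg"),            # +ve to -ve crossing
    ScanElement("u_b/ff0", "clk_b", "pos"),            # clock domain change
    ScanElement("u_mem/so0", "clk_b", "pos", "u_mem"), # soft flop to hard macro
    ScanElement("u_b/ff1", "clk_b", "pos"),            # hard macro to soft flop
    ScanElement("u_b/ff2", "clk_b", "pos"),            # same domain, same edge
]

print(lockup_positions(chain))  # [0, 1, 2, 3]
```

A -ve to +ve transition within the same clock domain returns False here because the text does
not list it as a condition that requires a lockup latch.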
Certain IPs are plugged into the design as hard macros. These may have built-in scan chains with
a fixed length and a fixed polarity of the flops at the start and end of the chains. As the DFT
engineer cannot change anything inside the hard IP, special care is taken inside the SOG to make
these scan chains compatible with the scan architecture of the rest of the design.
The hard macro chains must meet the following requirements:
1. The maximum LBIST chain length is limited by the length of the scan chain inside the hard
macro (as hard macro chains cannot be broken further).
2. The chains must comply with EDT requirements, i.e. the 1st flop of the chain should be +ve
edge triggered.
3. During concatenation of LBIST chains into EDT chains, and further concatenation of EDT chains
for EDT bypass mode and burn-in mode, hold violations should not occur due to clock skew inside
the hard IP.
The challenges are:
1. There are large numbers of hard macros to be stitched along with the SOG.
2. We have no control over the polarity of the flops at the start and end of hard macro chains.
3. Being a hardened IP, there might be non-deterministic clock delays (clock skews) in the scan
path, which can cause hold violations during concatenation.
Solutions
1. To ensure that nothing breaks in scan due to uncertainties in the hard macro, a scan bypass
wrapper is implemented over the hard IP, which allows the hard macro chains to be bypassed
during DFT testing.
2. +ve flops and lockup latches are inserted in the scan bypass wrapper in the SOG to make sure
nothing breaks during concatenation.
3. While defining the maximum chain length during LBIST controller generation, the length of the
hard macro chains is taken into account.
4. To ensure that no hard macro chain is concatenated with an SOG chain, care is taken during
LBIST controller generation (which actually concatenates LBIST chains to make EDT chains
inside the concat block) so that the hard macro chains are treated and concatenated
independently of the SOG chains.
5. Preferably, the hard IP chains are stitched in RTL to avoid added complexity. LBIST chains
are connected to the hard macro in RTL itself. It is good practice to use the initial set of
chains for this connection in RTL, because if the number of chains changes during the design
cycle due to a change in the flop count of any clock domain, the integration of the hard
macro chains remains intact.
Figure 4 illustrates one type of scan bypass wrapper (the scan bypass wrapper consists of the
logic other than the hard macro chain).
Depending on the polarity of the first and last flops of the hard macro chain, an input +ve flop,
an input lockup latch, an output +ve flop and an output lockup latch are inserted.
The table shown in Figure 5 states the logic of the scan bypass wrapper.
Configuring the scan wrapper IP as per the above table satisfies all the required conditions:
1. Since the first flop is always +ve edge triggered, there is no issue in connecting it directly
to EDT chains.
2. These chains can be connected as explicit LBIST chains.
3. The clock skew inside the hard IP will not cause any hold violation during concatenation of
these chains with SOG chains, because proper lockup latches are inserted at the start and end of
the chains.
4. ATPG DRC violation K21 is implicitly taken care of, as the first and last flops are always
+ve flops.
5. No special care has to be taken for hard macro chains during LBIST controller generation
(except for the maximum length), as these can be merged with SOG chains with no issues.
6. In case there is a need to bypass the hard macro chains, one +ve flop is added to abide by
EDT rules.
Extra flops are added to the hard macro chains depending on the polarity of the start and end
flops. This increases the maximum length of the LBIST chains, which is generally restricted by
the length of the hard macro chains.
Scan bypass wrappers can be coded in different customized ways to meet the challenges encountered
during scan stitching of the circuit.
Conclusion
This article describes the hands-on challenges faced while stitching flops in complex DUTs, the
care-abouts to be observed during the process to enable robust scan across all process, voltage
and temperature corners, and feasible solutions to the challenges encountered in enabling DFT
ATPG/LBIST testing. These problems should be considered during stitching, and proper lockup
latches should be inserted to make timing in the scan modes more robust and to ensure decent
ATPG coverage.