Scan Insertion
Ans: Design-for-test (DFT) means designing for testability: test considerations are merged into
the design early in the design process instead of being added afterward.
What is Testability?
Ans: Testability is a design attribute that measures how easy it is to create a program to
comprehensively test a manufactured design's quality.
Ans: Functional testing verifies that your circuit performs as it was intended to perform. The
number of possible input combinations grows exponentially with the number of inputs, so in
practice it may not be possible to test all of them.
Ans: Manufacturing testing verifies that your chip does not have manufacturing defects by
focusing on circuit structure rather than functional behavior. Manufacturing defects include
problems such as
Manufacturing defects might remain undetected by functional testing yet cause undesirable
behavior during circuit operation.
Important formulas
Test Application Time Reduction = (Length of longest scan chain in scan mode) / (Length of
longest scan chain in ScanCompression_mode)
Scan Test Data Volume = 3 * Length of longest scan chain * number of scan chains
Scan Compression Test Data Volume = Length of longest scan chain * (Number of scan inputs
+ 2 * Number of scan outputs)
Test Data Volume Reduction = (Scan Test Data Volume) / (Scan Compression Test Data
Volume)
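As a rough worked example of the formulas above, the following Python sketch plugs in made-up numbers: 8 scan chains of 10,000 flops in plain scan mode, and 250-flop internal chains with 8 scan inputs and 8 scan outputs in compression mode. The function names and all numbers are illustrative assumptions, not values from any tool.

```python
def scan_test_data_volume(longest_chain, num_chains):
    # Scan Test Data Volume = 3 * longest chain length * number of chains
    return 3 * longest_chain * num_chains


def compression_test_data_volume(longest_chain, scan_inputs, scan_outputs):
    # Compression volume = longest chain length * (inputs + 2 * outputs)
    return longest_chain * (scan_inputs + 2 * scan_outputs)


def reduction(scan_mode_value, compression_mode_value):
    # Both reduction formulas are plain ratios: scan mode / compression mode
    return scan_mode_value / compression_mode_value


# Hypothetical design numbers, chosen only for illustration.
scan_volume = scan_test_data_volume(longest_chain=10_000, num_chains=8)
comp_volume = compression_test_data_volume(
    longest_chain=250, scan_inputs=8, scan_outputs=8
)
time_reduction = reduction(10_000, 250)
volume_reduction = reduction(scan_volume, comp_volume)

print(scan_volume, comp_volume, time_reduction, volume_reduction)
```

With these assumed numbers, both the test application time and the test data volume shrink by the same factor, because the pin counts were kept equal in both modes.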
What is Controllability?
Ans: How easy it is to control the inputs of any cell inside the design from the input pads.
What is Observability?
Ans: How easy it is to observe the outputs of any cell inside the design from the output pads.
Ans: Scan design is the most popular DFT technique and has high fault coverage results. The
idea is to control and observe the values in all the design's storage elements so you can make the
sequential circuit's test generation and fault simulation tasks as simple as those of a
combinational circuit.
Ans: Full scan is a scan design methodology that replaces all memory elements in the design
with their scannable equivalents and then stitches them into scan chains. Full scan can assure the
high quality of the product. The idea is to control and observe the values in all the design's
storage elements so you can make the sequential circuit's test generation and fault simulation
tasks as simple as those of a combinational circuit.
• Highly automated process. Using scan insertion tools, the process for inserting full scan
circuitry into a design is highly automated, thus requiring very little manual effort.
• Highly effective, predictable method. Full scan design is a highly effective, well-understood,
and well-accepted method for generating high-test coverage for your design.
• Assured quality. Full scan assures quality because it tests most of the silicon.
Ans: Full scan design makes all storage elements scannable, which costs silicon area and
timing. Because of this, some designs may not have the luxury of a full scan design. This is
where partial scan comes in: a scan design methodology in which only a fraction of the storage
elements in the design are replaced by their scannable equivalents and stitched into scan chains.
By carefully choosing which storage elements to replace, we can increase the testability of the
design with minimal impact on the design's area or timing. If the design cannot accommodate the
extra delay added in a critical path (due to the mux added in the storage element), we can exclude
those critical flip-flops from the scan chain using partial scan.
Ad Hoc DFT
Ad hoc DFT implies using good design practices to enhance a design's testability, without
making major changes to the design style. Some ad hoc techniques include:
Using these practices throughout the design process improves the overall testability of the
design.
Structured DFT
Structured DFT provides a more systematic and automatic approach to enhancing design
testability. Structured DFT's goal is to increase the controllability and observability of a
circuit. Various methods exist for accomplishing this. The most common methods are:
• The scan design technique, which modifies the internal sequential circuitry of the design.
• The Built-in Self-Test (BIST) method, which inserts a device's testing function within the
device itself.
• The boundary scan, which increases board testability by adding circuitry to a chip.
“Before Scan” design is difficult to initialize to a known state, making it difficult to both control
the internal circuitry and observe its behavior using the primary inputs and outputs of the design.
In a "Scan design" scan memory elements (scan flops) replace the original memory elements
(normal flops) imparting controllability and observability to the design (prime requirement for
the design being testable), when shifting is enabled.
What are the techniques used to reduce pattern count without losing coverage ?
Ans: The number of test patterns that you need to achieve full coverage depends on the design
size. Different ATPG tools offer different compression and pattern ordering techniques to help
reduce pattern count.
Fault models beyond stuck-at typically require pattern counts that are much larger than those for
stuck-at only.
For pattern reduction, the first step is chain balancing during stitching (scan insertion). If the
chains are balanced, the tool needs to insert fewer dummy patterns to reach the required flop.
We can also add compression on the chains when the number of device pins is constrained. For
example, with a compression factor of 2, each scan chain is divided into 2 inside the device,
reducing the chain length (flops per scan chain).
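The effect of balancing and compression on chain length can be sketched in Python. This is a simplified model: `balanced_chain_lengths` and `longest_chain_with_compression` are hypothetical helpers, and the model assumes that a compression factor of N simply splits every external chain into N internal chains, as described above.

```python
def balanced_chain_lengths(total_flops, num_chains):
    """Spread flops over chains so that lengths differ by at most one."""
    base, extra = divmod(total_flops, num_chains)
    return [base + 1 if i < extra else base for i in range(num_chains)]


def longest_chain_with_compression(total_flops, external_chains, factor):
    """Longest internal chain when each external chain is split `factor` ways."""
    return max(balanced_chain_lengths(total_flops, external_chains * factor))


# Hypothetical design: 10,000 scan flops and 4 external scan chains.
print(balanced_chain_lengths(10, 3))                 # [4, 3, 3]
print(longest_chain_with_compression(10_000, 4, 1))  # 2500
print(longest_chain_with_compression(10_000, 4, 2))  # 1250
```

The longest chain sets the shift time per pattern, which is why an unbalanced chain wastes tester cycles on every pattern.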
14. If one needs to do synthesis/STA with scan-replaced FFs (not stitched) and needs to
generate timing and other reports, what should be the values of the SE, SI and SO pins since
the design is not stitched?
Ans: We need not constrain the SE, SI and SO pins for synthesis/STA of a scan-replaced but
not stitched design. But we will not be able to do any test-related STA.
15. Can you briefly describe the points to be considered, while re-ordering the scan chain in
Physical Design?
each active edge of each clock is considered to be in a separate clock domain. Both edges of a
clock and clocks with different timings may be used to control edge-triggered scan flip flops of a
scan chain.
In order to construct functional scan chains, two consecutive scan flip flops A and B (A serially
driving B)
1) must be clocked at the same time or
2) B must be clocked before A.
In the first case, we say that A and B have compatible clock domains.
In the second case, we say that A and B have incompatible clock domains.
The precedence relationship between scan flip-flops imposed by clock domain timings is
translated at the scan segment level. Capture and launch times for a scan segment are
respectively deduced from the capture time of its first scan cell (driven by its scan input) and the
launch time of its last scan cell (driving its scan output). Therefore, the precedence relationship
between scan segments can be established, and thus respected during scan segments reordering.
User-specified scan segment positions are respected during scan reordering unless they violate
clock domain timing constraints.
The last constraint, minimizing clock domain traversals, takes priority over physical design
information because we want our approach to be minimally intrusive in terms of adding
synchronization latches. Only scan segments with compatible clock domains are reordered.
Reordering a set of scan segments with compatible clock domains consists of:
1. Identifying and marking the set of clusters containing the scan segments.
2. Determining the entry and exit points between which the scan segments are going to be
reordered.
3. Ordering the previously identified clusters between the entry point and exit points.
4. Reordering scan segments within each of the ordered clusters.
24. Why first negative edge flops followed by positive edge flops in the scan chain?
Ans: It is not always necessary to have both negative and positive edge-triggered flops in a scan
chain. Actually we can have three combinations:
1) All positive
2) All negative
3) Negative followed by positive
Positive followed by negative is not used, because at the boundary between the positive and
negative flops the data will not be captured correctly: launch and capture cannot both happen
properly within a single clock pulse. We would require a lock-up latch.
The rule is that there should not be 2 shifts during one clock period. If you put a +ve edge flop
followed by a -ve edge flop, there is a chance of 2 shifts in one clock period (if the clock skew
between the 2 clocks is small). If you put a -ve edge flop and then a +ve edge flop, there is no
chance of that, because the +ve edge comes in the next period. If your design needs a +ve edge
flop followed by a -ve edge flop, then you need a lock-up latch (if the skew is small).
29. How the compression technique factor affects the Number of scan chains? Is number of
Clock domains also a factor?
Clock domains are a factor, yes, but sometimes people will combine clock domains into the same
scan chain. That's not uncommon, and will work if the clock skew is managed, and the tool puts
lockup latches in there.
Compression affects the number of scan chains, of course, since more compression generally
uses fewer external scan chains.
32. Once a die is tested, do the pins used for scan testing still need to be brought out when the
die is packaged as an IC? Does leaving them out have any big advantage?
No - you don't have to bring them out, but then you can't re-test w/ scan at the package level.
Normally, folks don't have dedicated scan in/out pins anyway, they share them with mission-
mode pins, so they end up getting bonded out anyway.
34. Can a C2 violation occur for a set/reset signal? I am getting this violation for a signal
that is identified as a set/reset signal by the tool when the "analyze control signals -auto"
command was used.
Yes, C2 can happen for set/reset signals. For both Mentor and Synopsys tools, at least, set/reset
signals are considered clocks. This DRC violation says that there is a defined clock that does not
actually do any work in the circuit (maybe it is replaced in scan mode). The fix may be as
simple as taking that clock out of your ATPG scripts.
35. The time at which the scan chain is put in functional mode can vary depending on the
test we are carrying. Given this, how can there be a common test mode pin for all the scan
chains?
Test mode pins are typically not the same as scan-enable pins. One or more scan-enable pins
(signals) are used to toggle between functional mode and scan mode. These are what you seem to
be referring to. Typically different scan-enable signals are needed for at-speed testing to handle
things like multi-cycle paths and inter-clock domain paths.
Test mode pins are typically used to put the circuit in test mode and are therefore generally
global (static) signals. For example, a test mode pin could be used to share pins between their
functional use and as scan I/O.
I am assuming that the test mode pin (irrespective of the number of scan chains) is used to
control unwieldy circuits during testing.
36. One thing I am not able to completely appreciate is whether there is an issue while
sharing functional pin for testing. Does it in anyway reduce the coverage?
Not if it is handled properly. You need to ensure that a test mode exists where the functional
paths to the shared I/O are accessible. For example, you may have a test mode where scan testing
is performed with the shared I/O connected up to scan chains and a separate test mode with the
shared I/O in their normal functional setting where they can be tested say with boundary scan.
37. What is the command to be used in RTL Compiler to add a mux at the PI, which is used
as a shared scan enable signal, with test_mode as its select?
define_dft test_mode -name test_mode -active high TM
insert_dft test_point -location -type control_node -node scanenable -test_control test_mode
38. when doing DFT scan insertion which of the following is true or is a better approach?
1. Up to three additional pins are required to implement this type of scan. Only the SCAN
ENABLE pin must be dedicated; the remainder of the pins(scan in, scan out) can be shared with
primary inputs and outputs.
2. Up to four additional pins are required to implement this type of scan. Only the TEST MODE
pin must be dedicated; the remainder of the pins(scan en, scan in , scan out) can be shared with
primary inputs and outputs.
First, you will of course generally use more than one scan chain and often need more than one
scan enable (SE) signal, so your 3 and 4 pin statements don't really hold true. The real question
you're asking is whether the SE signal(s) must be dedicated, or whether a TM signal can be used so
that the SE signal can be shared. The answer is that a TM signal can indeed be used to share the SE
signal(s). This is generally the preferred solution, as very often the design requires other internal
test settings which must be controlled by a dedicated TM signal.
Thus one can easily see the response of the machine without having to go through the state
machine in its originally specified way. Thus we become independent of the state machine in
some way.
Thus using scan we 'reduce' the sequential machine problem down to a 'combinational' problem.
By definition, Full Scan means that ALL flip-flops in the design are converted into scan flops.
When the scan-enable signal is inactive, the flip-flops accept data from their functional inputs
and the circuit behaves in its intended sequential nature. When the scan-enable signal is active,
all flip-flops accept data from their scan input, providing full control on the values that get
loaded into them. In this mode, all sequential depth is removed leaving only a combinational
circuit to test.
Also, is the scan clock different from the normal clock used during normal functionality?
Are there issues in scan testing when the clock is generated internally (say, using a PLL)?
Yes, we need to create separate scan chains for each clock domain.
The same clocks can be used as scan clocks, as this avoids extra pins.
After going through some theory on DFT, I found the following answers:
1) The functional clock is bypassed for scan testing, so clocks in multiple domains can be
clubbed into a single chain with a single clock if DC testing is the only target.
2) For the PLL, the answer is the same: since the internal clock is bypassed and the scan clock is
used, the PLL remains inactive during scan testing.
41. By full scan methodology do we mean that every single flop in the design is a part of the
scan chain? And if we have multiple scan chains instead of one, as it is in present designs,
can it still be called full scan methodology?
In a perfect world, full scan means every flip-flop, but in the real world, many flops can be
unscanned, and the design is still considered full scan. In some cases, the ATPG tool can test
through unscanned flops without a major impact to fault coverage. Designs using one or many
scan chains are equally valid as full scan designs.
Apart from the conventional mux FF scan architecture, there are many others like the Level
Sensitive scan and the clocked scan etc. How are these better or worse than the muxed FF
technique?
LSSD is really a sub-specialty in the industry as a whole; only a few companies use it, but it is
effective. For scan purposes, it does not suffer from the hold time issues that mux-scan normally
does, but area-wise, it's not as good.
Clocked scan uses a separate scan clock for each flop. I've never seen it used in industry, but that's
just me. The problem with it is that you must route two clock trees around the chip instead of one,
a virtual show-stopper in these days of congested routing.
1) How many scan ports do you have available? This will determine, in part, the number of scan
chains. The scan chains need to be balanced in length to be as efficient as possible, for test-
time/data volume.
2) Clocks - will you use flops from different clock domains in the same chain? If so, the clock
domains need to be *reasonably* skew balanced for the chains to work properly. Also, lockup-
latches will be necessary where the scan path crosses the clock domains for safe operation (if
stitching.
3) Are there negative edge flops in the design? If so the tool will always place the negedge flops,
as a group, ahead of the posedge flops.
Why first negative edge flops followed by positive edge flops in the scan chain?
Well that's so the chain will shift data in and out properly. The idea is that each bit of the data
shifting into the chain should traverse the chain one flop at a time.
Given a clock that is defined with an off-state of 0, a positive edge comes before a negative edge
in time, right?
Now imagine in a scan chain, a posedge FF followed by a negedge FF. During any given clock
period, the data that is latched into the posedge FF will also be latched into the negedge FF as
well - in the same period. This is called a feed-through, and is generally not an optimal situation
for the ATPG.
However, if the insertion tool puts the negedge flops all grouped at the front of the chain, then
the boundary between the negedge flops and posedge flops will be a negedge FF followed by a
posedge FF. Since the positive edge of the clock comes before the negative edge, the data will be
properly held up at the negedge FF until the next clock period.
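The feed-through can be reproduced with a small behavioral model. The Python sketch below is a toy under stated assumptions, not a timing-accurate simulation: it assumes an ideal return-to-zero clock (posedge first, then negedge, no skew), zero-delay flops, and a scan input held stable for the whole period. `shift_one_period` is a hypothetical helper name.

```python
def shift_one_period(state, edges, scan_in):
    """Advance a scan chain by one return-to-zero clock period.

    state: current flop values, index 0 is closest to scan-in.
    edges: "pos" or "neg" for each flop, in the same order as state.
    """
    state = list(state)
    for edge in ("pos", "neg"):   # the posedge occurs first, then the negedge
        before = list(state)      # values visible just before this edge
        for i, flop_edge in enumerate(edges):
            if flop_edge == edge:
                state[i] = scan_in if i == 0 else before[i - 1]
    return state


# posedge FF followed by negedge FF: the new bit reaches both flops in one period.
print(shift_one_period([0, 0], ["pos", "neg"], scan_in=1))  # [1, 1]
# negedge FF followed by posedge FF: the new bit moves one flop per period.
print(shift_one_period([0, 0], ["neg", "pos"], scan_in=1))  # [1, 0]
```

In the first case the bit traverses two flops in a single period, which is the feed-through described above; in the second case it advances one flop per period as intended.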
I am doing scan stitching for a block which contains say two instances of prestitched blocks.
I need to connect top-level scan ports to scan ports of these blocks instances.
For example:
The top block contains 5 chains, of which 3 chains are stitched for top-level logic.
For the remaining scan ports, I need to connect the scan ports of the stitched submodule
instances
B/C/D/M1_inst1
X/Y/Z/M1_inst2
I need to connect the top level block scan ports to scan port of the submodule inst
B/C/D/M1_inst1.
As shown below
Scan_in[3] to B/C/D/M1_inst1.scan_in[0]
Scan_en to B/C/D/M1_inst1.scan_in[0]
Scan_out[3] to B/C/D/M1_inst1.scan_out[0]
Similarly for other instance.
The requirement is to maintain the hierarchy.
In DC, you will need a DFT Compiler license to stitch it properly, as it does more than just
connect the scan chains. It checks for any DRC errors, so your chains are intact. The DFT
Compiler documentation asks you to create a CTL model of the sub-blocks, so I am not sure if it
is applicable to your implementation.
Without DFT compiler, you can try to hook it up manually, then try to extract the chain using an
ATPG tool to see if the chains are intact.
Assuming that CLK is a Return-to-Zero clock (0->1->0 pulse), you would stitch the negedge
CLK domain flip-flops before posedge CLK domain flip-flops, i.e., negedge CLK FFs are closer
to scan input, and posedge CLK FFs are closer to scan output.
RCO_CLK domain can be stitched at either end of the chain. However, if CLK has a
significantly larger clock tree than RCO_CLK clock tree, then it is better to put RCO_CLK
domain FFs at the end of the chain. Otherwise, you may need to skew your clock timing on the
ATE.
49. I would like to know about scan pin sharing with bidirectional pins.
I added control logic for these pins using OR and AND gates to make them work as inputs and
outputs respectively. But during scan chain tracing it gives an error, as it is not able to trace from
bidirectional pins shared as scan outputs.
After constraining these pins to Z, DFTAdvisor traced the scan chains.
By adding control logic, you're basically modifying the design, which may not be the correct way
to go about it, as it may affect the functionality.
In any case, after you added the control logic, did you constrain the control logic inputs during
ATPG?
In a scan chain, if some flip-flops are +ve edge triggered and the remaining flip-flops are -ve edge
triggered, how does it behave?
For designs with both positive and negative clocked flops, the scan insertion tool will always
route the scan chain so that the negative clocked flops come before the positive edge flops in the
chain. This avoids the need of lockup latch.
For the same clock domain the negedge flops will always capture the data just captured into the
posedge flops on the posedge of the clock.
For the multiple clock domains, it all depends upon how the clock trees are balanced. If the clock
domains are completely asynchronous, ATPG has to mask the receiving flops
Why do we avoid latches in the design, even if they add only cell delay?
Are there any timing-related issues?
Whenever a latch is enabled, it passes whatever is on its D input to the Q output. If a
glitch arrives on D while the latch is enabled, the latch passes it to Q, and glitches always
create problems downstream.
Latches are fast and consume less power and less area than flops, but glitches come along
with these advantages; that's why we go for flops.
Also, latches are not DFT friendly, and it is very difficult to perform static timing analysis with
latches in your design.
While attempting to do scan insertion for a design, I am finding many uncontrollable clock
and uncontrollable asynchronous signals like set and reset violations.
Can anyone guide me on how to avoid these violations?
Uncontrollable clocks are those that come out of combinational logic (dividers, etc.).
When doing DFT, all the controls of the FFs must be with respect to the TEST EN signal,
in the sense that once TEST EN = 1, the entire chip should automatically be in scan mode.
For this, all the clocks of the FFs should be controllable externally (at pin level).
Hi,
Suppose we have FF1, which is scannable with the scan clock, and its output is connected to FF2,
which is non-scannable. I mean FF1 is connected to the scan clock and FF2 is connected to the
normal clock. If we increase the sequential depth, how will it affect test coverage?
Ans: Sequential depth is related to capture (launch-on-capture if at-speed).
When we want to capture a value related to the input of a non-scannable flop, we need an extra
clock pulse, so by increasing the sequential depth to 2 you can capture the response of that
non-scan flop via a scannable flop; hence test coverage will improve.
Answer2: During placement, the optimization may make the scan chain difficult to route due to
congestion. Hence the tool will re-order the chain to reduce congestion.
This sometimes increases hold time problems in the chain. To overcome these, buffers may have
to be inserted into the scan path. The tool may not be able to maintain the scan chain length
exactly, and it cannot swap cells from different clock domains.
Because of scan chain reordering, patterns generated earlier are of no use. But this is not a problem,
as ATPG can be redone by reading the new netlist.
Software reset is a reset managed by software by writing a special register bit dedicated for this
purpose; it's usually a synchronous reset.
Hardware reset is the traditional reset activated by setting the IC reset pin to '1' or '0'
(depending on the convention: active low or active high reset), it can be also generated inside the
IC.
If your design is predominantly edge-triggered, use the multiplexed flip-flop, clocked scan, or
clocked LSSD scan style.
If your design has a mix of latches and flip-flops, use the clocked scan or LSSD scan style.
You can convert the unused flops back to normal flops in DC, and the tool does that
automatically. The command for the same is "set_dft_insertion_configuration -unscan true".
All the flops which are not on the scan chains will be converted back to normal flops.
can anyone tell me for a design how can we decide how many scan chains we require?...
another question is in a particular scan chain how many flops should be put? is it
dependent on the technology or is it dependent on designer or on the customer
requirement?
The number of scan chains depends on the clock domains and the maximum number of flip-flops
per chain. Don't mix different clock domains and clock edges, to avoid timing issues.
A longer chain will increase test time; keep it < 1000k FFs per chain.
There are a number of factors to decide the number of scan chains and length of each scan chain.
1. Look for your ATE specification to find how many scan chains are supported. If length of
scan chain is very long, then the tester time increases and hence your cost
2. If you are using scan compression, then your scan compression methodology will determine
the maximum length of each scan chain
Positive skew is similar to useful skew: it is nothing but delaying the capture clock by the skew
value relative to the launch clock. Whenever you delay the capture clock, you gain that much
margin, which reduces your setup violation. To visualize it, think of it as having a longer clock
period, in which case the setup violation shrinks. It is a similar concept.
Similar to my earlier answer, here you are reducing the capture clock delay.
Positive skew is bad for hold? Negative skew is good for hold?
What are the measures taken in the Design achieving better Yield?
• Create more powerful stringent run set files with pessimistic spacing/short rules.
• Check for the areas where the design is prone to lithographic issues, like sharp cuts and
try to re-route it.
• For via-reliability issues, use redundant vias, to reduce the chances for via-breakage.
• To design for yield enhancement, build in optimal redundancy, such as repairable
memories.
• Optimal placement of decoupling capacitors reduces power surges.
• Doubling the width of the non-critical nets and clock nets can increase the yield.
• Ensure that the poly-orientation is maintained.
What are the measures or precautions to be taken in the Design when the chip has both
analog and digital portions?
Designing for Optimal integration of Analog and Digital
• As today's IC has analog components also inbuilt, some design practices are required for
optimal integration.
• Ensure in the floor planning stage that the analog block and the digital block are not
sitting close-by, to reduce the noise.
• Ensure that there exists separate ground for digital and analog ground to reduce the noise.
• Place appropriate guard-rings around the analog-macro.
• Incorporating in-built DAC-ADC converters, allows us to test the analog portion using
digital testers in an analog loop-back fashion.
• Perform techniques like clock-dithering for the digital portion.
What do local skew, global skew, and useful skew mean?
Local skew: The difference between the clock arrival time at the launching flop and the clock
arrival time at the destination flip-flop of a timing path.
Global skew: The difference between the earliest-reaching flip-flop and the latest-reaching
flip-flop for the same clock domain.
Useful skew: Useful skew is the concept of delaying the clock path of the capturing flip-flop. This
approach helps in meeting the setup requirement within the launch and capture timing path, but
the hold requirement must still be met for the design.
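A small numeric sketch may help show why delaying the capture clock helps setup but hurts hold. It uses the common single-cycle slack expressions, with skew defined as capture clock arrival minus launch clock arrival. The function names and all delay values (in picoseconds) are made-up assumptions for illustration.

```python
def setup_slack(period, skew, clk_to_q, logic_max, t_setup):
    # Data must arrive t_setup before the capture edge at (period + skew).
    return (period + skew) - (clk_to_q + logic_max + t_setup)


def hold_slack(skew, clk_to_q_min, logic_min, t_hold):
    # Data must not change until t_hold after the capture edge at skew.
    return (clk_to_q_min + logic_min) - (t_hold + skew)


# Hypothetical path, all values in picoseconds.
for skew in (0, 150):
    s = setup_slack(period=1000, skew=skew, clk_to_q=100, logic_max=950, t_setup=50)
    h = hold_slack(skew=skew, clk_to_q_min=50, logic_min=100, t_hold=50)
    print(f"skew={skew}: setup slack={s}, hold slack={h}")
```

With these assumed numbers, adding 150 ps of skew turns the setup slack from negative to positive while pushing the hold slack negative, which is the trade-off that has to be checked when useful skew is applied.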
Latches are level-sensitive and flip-flops are edge-sensitive. The difference between latch-based
and flop-based design is that a latch allows time borrowing, which a traditional flop does not.
That makes latch-based design more efficient. But at the same time, latch-based design is more
complicated and has more issues in min timing (races). Its STA with time borrowing in deep
pipelining can be quite complex.
What is the minimum set of modes in which STA should be qualified for a chip?
1. Scan Shift mode
2. Scan Capture mode
3. MBIST mode
4. Functional modes for Each Interface
5. Boundary scan mode
6. Scan-compression mode
For which process corners, at minimum, should STA be qualified?
1. Fast corner
2. Slow corner
3. Typical corner
What are the minimum timing modes in which STA should be qualified?
1. Normal delay mode (without applying derating)
2. On-chip variation mode (derating applied)
3. SI mode (signal integrity crosstalk impact on STA)
Ans: A virtual clock is mainly used to model the I/O timing specification. It represents the clock
with respect to which the output/input pads pass data.
What are the various timing paths which I should take care of in my STA runs?
Ans:
1. Timing path starting from an input port and ending at an output port (purely combinational
path).
2. Timing path starting from an input port and ending at a register.
3. Timing path starting from a register and ending at an output port.
4. Timing path starting from a register and ending at a register.
The presence of feedback loops should be avoided at any stage of the design, by periodically
checking for it, using the lint or synthesis tools. The presence of the feedback loop causes races
and hazards in the design, and leads to unpredictable logic behavior. Since the loops are
delay-dependent, they cannot be tested with any ATPG algorithm. Hence, combinatorial loops
should be avoided in the logic.
When are DFT and Formal verification used?
Ans: DFT:
• Tests for manufacturing defects like stuck-at "0" or "1".
• Tests against a set of rules followed during the initial design stage.
Formal verification:
• Verification of the operation of the design, i.e., to see if the design follows the spec.
• Gate netlist == RTL?
• Uses mathematical techniques to check for equivalence.
Contamination delay tells you if you meet the hold time of a flip flop. To understand this better
please look at the sequential circuit below.
The contamination delay of the data path in a sequential circuit is critical for the hold time at the
flip flop where it is exiting, in this case R2.
Mathematically, th(R2) <= tcd(R1) + tcd(CL2)
Contamination delay is also called tmin and Propagation delay is also called tmax in many data
sheets.
Ques: Given two ASICs, one has a setup violation and the other has a hold violation. How can
they be made to work together without modifying the design?
Ans: Slow the clock down on the one with setup violations,
and add redundant logic in the path where you have hold violations.
Ques: Suggest some ways to increase clock frequency?
• FPGA designers may be unfamiliar with scan since FPGA testing has already been done by
the FPGA manufacturer. ASIC designers do not have this luxury and must handle all the
manufacturing test details themselves.
Ques: Explain setup time and hold time. What will happen if there is a setup time or
hold time violation, and how can it be overcome?
Ans: Setup time is the amount of time before the clock edge that the input signal needs to be
stable to guarantee it is accepted properly on the clock edge.
Hold time is the amount of time after the clock edge that the same input signal has to be held
before changing, to make sure it is sensed properly at the clock edge.
Whenever there are setup or hold time violations in any flip-flop, it enters a state where its
output is unpredictable; this state is known as the metastable state (quasi-stable state). At the end
of the metastable state, the flip-flop settles down to either '1' or '0'. This whole process is known
as metastability.
Ques: What is skew, what are problems associated with it and how to minimize it?
Ans: In circuit design, clock skew is a phenomenon in synchronous circuits in which the clock
signal (sent from the clock circuit) arrives at different components at different times.
This is typically due to two causes. The first is a material flaw, which causes a signal to travel
faster or slower than expected. The second is distance: if the signal has to travel the entire length
of a circuit, it will likely (depending on the circuit's size) arrive at different parts of the circuit at
different times. Clock skew can cause harm in two ways. Suppose that a logic path travels
through combinational logic from a source flip-flop to a destination flip-flop. If the destination
flip-flop receives the clock tick later than the source flip-flop, and if the logic path delay is short
enough, then the data signal might arrive at the destination flip-flop before the clock tick,
destroying there the previous data that should have been clocked through. This is called a hold
violation because the previous data is not held long enough at the destination flip-flop to be
properly clocked through. If the destination flip-flop receives the clock tick earlier than the
source flip-flop, then the data signal has that much less time to reach the destination flip-flop
before the next clock tick. If it fails to do so, a setup violation occurs, so-called because the new
data was not set up and stable before the next clock tick arrived. A hold violation is more serious
than a setup violation because it cannot be fixed by increasing the clock period.
Clock skew, if done right, can also benefit a circuit. It can be intentionally introduced to decrease
the clock period at which the circuit will operate correctly, and/or to increase the setup or hold
safety margins. The optimal set of clock delays is determined by a linear program, in which a
setup and a hold constraint appears for each logic path. In this linear program, zero clock skew is
merely a feasible point.
Clock skew can be minimized by proper routing of the clock signal (clock distribution tree) or by
inserting variable delay buffers so that the clock arrives at all clock inputs at the same time.
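The effect of skew on the two checks can be sketched with the usual single-path slack expressions. This is a simplified model under stated assumptions: skew is defined here as (destination clock arrival) minus (source clock arrival), and all delay values are made-up examples.

```python
def setup_slack(period, skew, t_clk_to_q, t_logic_max, t_setup):
    """Setup slack for one register-to-register path (negative = violation).

    skew = destination clock arrival - source clock arrival (assumed convention).
    """
    return (period + skew) - (t_clk_to_q + t_logic_max + t_setup)


def hold_slack(skew, t_clk_to_q_min, t_logic_min, t_hold):
    """Hold slack for the same path; note the clock period does not appear."""
    return (t_clk_to_q_min + t_logic_min) - (t_hold + skew)


if __name__ == "__main__":
    # Hypothetical path, all times in ns.
    for skew in (0.0, 1.5, -1.5):
        s = setup_slack(period=10.0, skew=skew, t_clk_to_q=1.0,
                        t_logic_max=7.0, t_setup=1.0)
        h = hold_slack(skew=skew, t_clk_to_q_min=1.0,
                       t_logic_min=0.5, t_hold=0.5)
        print(f"skew={skew:+.1f}  setup slack={s:+.1f}  hold slack={h:+.1f}")
```

With these numbers, a late destination clock (positive skew) breaks hold and an early destination clock (negative skew) breaks setup. Because the period is absent from the hold expression, slowing the clock cannot repair a hold violation.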
Ques : What are the things to be considered in scan stitching? I know a little about this;
can you explain it in more detail?
Ans : Scan stitching can be done in one of three ways:
1) Use the synthesis tool (DFT compiler or equivalent)
2) A DFT scan tool (such as DFT Architect)
3) The place and route tool
Ques : Apart from the conventional mux FF scan architecture, there are many others like
the Level Sensitive scan and the clocked scan etc. How are these better or worse than the
muxed FF technique?
Ans : LSSD is really a sub-specialty in the industry as a whole; only a few companies use it, but
it is effective. For scan purposes, it does not suffer from the hold time issues that mux-scan
normally does, but area-wise, it's not as good.
Clock-scan uses a separate scan-clock for each flop - I've never seen it used in industry, but that's
just me. The problem with it is that you must route two clock trees around the chip instead of one
- a virtual show-stopper in these days of congested routing.
Ques : By full scan methodology do we mean that every single flop in the design is a part of
the scan chain? And if we have multiple scan chains instead of one, as it is in present
designs, can it still be called full scan methodology?
Ans : In a perfect world, full scan means every flip-flop, but in the real world, many flops can be
unscanned, and the design is still considered full scan. In some cases, the ATPG tool can test
through unscanned flops without a major impact to fault coverage. Designs using one or many
scan chains are equally valid as full scan designs.
Ques : What is a BUS Primitive and clock_PO pattern?
Ans : A bus primitive is just a DFT model of a bus - a net that has more than one driver. It's
important that you constrain it during test.
A clock PO pattern is a pattern that measures a primary output that has connectivity to a clock.
So if a clock signal propagates through combinational logic to a primary output (PO), an ATPG
vector can be created to measure the results of that propagation.
Ques : Say some flops in my design work at a lower frequency. In that case,
how can we take care of the lower-frequency flops when we do at-speed testing?
Ans : It depends upon whether you have independent scan clocks to control the different clock
domains. If so you can generate patterns that cover all the domains, and you just need to mask
the boundaries between domains.
But that's not the normal case. Many times people will use one scan clock to drive the whole
circuit - and in this case, you will need to generate patterns for each clock domain separately,
while masking or black boxing all the other domains.
Ques : What needs to be taken care of in scan stitching to get good coverage?
Ans : If you are using Mentor DFTAdvisor or Synopsys DFT Compiler, cleaning up pre-stitch
DRC errors and most of the warnings (especially clock warnings) will generally lead to good fault
coverage.
If coverage is still low after cleaning up DRC errors/warnings, then there may be issues inherent to
the design that cause low coverage (redundant logic, complex reconvergent fanouts, black
boxes, constrained nets, etc.).
Both Mentor and Synopsys tools provide ways to analyze low fault coverage in their ATPG
tools. Also, some RTL analysis tools may be useful to find these kinds of problems
(see http://www.dftdigest.com/miscellaneous/rtl-design-for-test/ ).
Ans : If the reset is asynchronous (and properly bypassed during scan), you can declare the reset
pin as a clock during ATPG, and ATPG will toggle it accordingly to get faults on reset pin.
If the reset is synchronous, you can treat the reset pin as a normal data pin, and ATPG should be
able to cover faults on the reset.
Be careful, however, if you run transition fault ATPG. Reset usually cannot toggle at-speed, so
you may not want to declare the reset as a clock when running transition fault ATPG.
You can also try to run the patterns that toggle the reset as a clock pin at a reduced speed on the
tester, if you worry about transition fault coverage on reset.
Ans : A pattern fault is a fault model created by the IBM TestBench folks (the tool was later acquired by Cadence
and became Encounter Test). Instead of using a standard fault model like stuck-at, transition,
etc., you can use truth tables to describe what the fault free and faulty behavior is for each gate
type.
The advantage is that you can use pattern faults to guide ATPG to generate patterns that would
not be generated with other fault models. For example, in a 2 input AND gate, to cover single
stuck-at faults, you need only 3 patterns, 01, 10, and 11. If you want to force ATPG to generate a
pattern for the 00 case in every AND gate, you can define a pattern fault with 00 as its
sensitization.
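The AND gate claim above can be checked by brute force. The sketch below is only an illustration: it models the six single stuck-at faults on a 2-input AND gate (pins named a, b, y here for convenience) and lists which faults a set of input patterns detects. It does not use any Encounter Test syntax.

```python
# Every single stuck-at fault on the gate: (pin, stuck value).
FAULTS = [(pin, value) for pin in ("a", "b", "y") for value in (0, 1)]


def and_gate(a, b, fault=None):
    """Evaluate a 2-input AND gate, optionally with one stuck-at fault injected."""
    if fault is not None:
        pin, value = fault
        if pin == "a":
            a = value
        elif pin == "b":
            b = value
    y = a & b
    if fault is not None and fault[0] == "y":
        y = fault[1]
    return y


def detected_faults(patterns):
    """Faults whose faulty output differs from the good output for some pattern."""
    return {f for f in FAULTS for (a, b) in patterns
            if and_gate(a, b) != and_gate(a, b, f)}


if __name__ == "__main__":
    print(sorted(detected_faults([(0, 1), (1, 0), (1, 1)])))  # all six faults
    print(sorted(detected_faults([(0, 0)])))                  # only y stuck-at-1
```

The 00 pattern adds no stuck-at coverage, which is why a stuck-at ATPG run has no reason to generate it unless a pattern fault asks for it.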
A pattern fault is a mechanism used by Encounter Test to model static or dynamic defects that
are not easily represented, or may be impossible to represent, by stuck-at pin faults. A pattern fault is
basically a statement of
Ques : What is the use of compressor? Why can’t we use compressor for RAM?
Ans : You're getting your test methodology mixed, I think, but you can test a ROM with a
compressor - also known as a MISR. That's done all the time. The idea is that the contents of the
ROM are simply read out, and fed into the MISR (multiple input signature register), and the resulting
signature is read out and compared with the expected signature.
It's not so common to do that with RAMs, since both the write and read data are generated
algorithmically.
However if your question is more general, the use of a compressor is usually to turn a large set of
data into a smaller one, which is very handy for when access to the circuit, or bandwidth, is
limited.
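The ROM read-out scheme described above can be sketched as follows. This is a minimal software model of a MISR, not a description of any specific tool or silicon implementation; the register width, tap positions, and ROM contents are arbitrary assumptions chosen for illustration.

```python
def misr_signature(words, width=8, taps=(7, 5, 4, 3), seed=0):
    """Compact a sequence of data words into one signature.

    Each step shifts the register like an LFSR (feedback = XOR of the tap
    bits), then XORs the incoming word into the register in parallel.
    The tap set is an assumed example and includes the top bit.
    """
    mask = (1 << width) - 1
    state = seed & mask
    for word in words:
        feedback = 0
        for tap in taps:
            feedback ^= (state >> tap) & 1
        state = ((state << 1) | feedback) & mask   # LFSR shift
        state ^= word & mask                       # fold in the parallel inputs
    return state


if __name__ == "__main__":
    rom = [0x3C, 0xA5, 0x00, 0xFF, 0x12, 0x7E]     # hypothetical ROM contents
    expected = misr_signature(rom)                 # computed once from the good ROM

    faulty = list(rom)
    faulty[2] ^= 0x01                              # single-bit defect in one word
    print(hex(expected), hex(misr_signature(faulty)))
```

Only the final signature has to be compared with the expected value, so the whole ROM is checked with one short read-out. Different error patterns can in principle alias to the same signature, which is the usual trade-off of compaction.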
Ans : The relationship between cost and yield is pretty complex stuff, and depends on the
technology node, the manufacturer, and many other factors.
The best one can say is that they are normally inversely proportional - the better your yield, the
lower the cost, because the fixed cost of putting wafers through the fab will result in more
devices going to the customer.
However, I can write a test that gives me 100% yield (no test at all), but I'll get a large
percentage of my devices returned, and lose a ton of respect (and market share).
What are problems associated with skew and how to minimize it?
Ans: Skew is the difference in insertion delay to registers. If the skew is too large, then you fail
timing.
Ans: Scan design rules require that registers have the functionality, in test mode, to be a cell
within a large shift register. This enables data to get into and out of the chip. The following
violations prevent a register from being scannable:
Uncontrollable Clocks
This violation can be caused by undefined or unconditioned clocks. DFT Compiler considers a
clock to be controlled only if both of these conditions are true:
• It is forced to a known state at time = 0 in the clock period (which is the same as the “clock off
state” in TetraMAX).
This violation applies only when the scan style is set to level-sensitive scan design (LSSD). For a
latch to be scannable, the latch must be forced to hold its value at the beginning of the cycle,
when the clock is inactive. This violation can be caused by undefined or unconditioned clocks.
This violation indicates that there are registers that cannot be controlled by ATPG. If the
violation is not corrected, these registers will be unscannable and fault coverage will be reduced.
Asynchronous pins of a register must be capable of being disabled by an input of the design. If
they cannot be disabled, this is reported as a violation. This violation can be caused by
asynchronous control signals (such as the preset or clear pin of the flip-flop or latch) that are not
properly conditioned before you run DFT Compiler.
What are the violations which prevent data capture?
• Three-State Contention
• Black Boxes
An active (or sensitizable) feedback loop reduces the fault coverage that ATPG can achieve by
increasing the difficulty of controlling values on paths containing parts of the loop.
Black Boxes
Logic that drives or is driven by black boxes cannot be tested because it is unobservable or
uncontrollable. This violation can drastically reduce fault coverage, because the logic that
surrounds the black box is unobservable or uncontrollable.
Which scan styles are supported in your technology library?
Ans: To make it possible to implement internal scan structures in the scan style you select,
appropriate scan cells must be present in the technology libraries specified in
the target_library variable.
Use of sequential cells that do not have a scan equivalent always results in a loss of fault
coverage in full-scan designs.
Ans: If your design is predominantly edge-triggered, use the multiplexed flip-flop, clocked scan,
or clocked LSSD scan style.
If your design has a mix of latches and flip-flops, use the clocked scan or LSSD scan style.
Ans: The quality and accuracy of the scan and nonscan sequential cell models in the Synopsys
technology library affect the behavior of DFT Compiler. Incorrect or incomplete library models
can cause incorrect results during test design rule checking.
DFT Compiler requires a complete functional model of a scan cell to perform test design rule
checking. The Library Compiler UNIGEN model supports complete functional modeling of all
supported scan cells. However, the usual sequential modeling syntax of Library Compiler
supports only complete functional modeling for multiplexed flip-flop scan cells.
When the technology library does not provide a functional model for a scan cell, the cell is a
black box for DFT Compiler.
Please let me know the launch and capture paths in Launch On Capture (LOC) and
Launch On Shift (LOS) in at-speed mode?
Ans: For an at-speed fault there is a requirement that we launch the fault. Launching means
that we create a transition at the fault site. So for a slow-to-rise fault, the launch changes
the value from 0 to 1. Once the fault has been launched, the changed value is captured at speed
in a scan flop.
Now there are two ways to launch a fault: either while the scan chain is still being loaded, or
once the scan chain load is over.
If the scan chain is still being loaded (LOS), then the last shift is used to launch the fault, and then in
capture mode the transition at the fault location is captured. The thing to note here is that the last-but-one
shift should put a value of 0 at the fault site, and with the last shift the value at the fault site
should change to 1.
In launch on capture, at the end of the shift the fault site will have a value of 0, and then
there will be two clock pulses: one to change the fault site to 1 and the other to capture
the transition.
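The difference between the two schemes comes down to which clock pulse does the launch and what scan enable is doing at that moment. The sketch below lists the pulses for one pattern as (scan_enable, role) pairs. It is an illustrative model only; the function name and role labels are hypothetical and no tester or ATPG format is implied.

```python
def transition_test_cycles(scheme, chain_length):
    """Return one (scan_enable, role) tuple per clock pulse of a pattern.

    In both schemes the launch-to-capture interval is the at-speed one.
    """
    if scheme == "LOS":
        # The last shift pulse itself creates the transition (scan enable still 1).
        cycles = [(1, "shift")] * (chain_length - 1)
        cycles.append((1, "launch"))
        cycles.append((0, "capture"))
    elif scheme == "LOC":
        # The chain is fully loaded first, then two functional pulses follow.
        cycles = [(1, "shift")] * chain_length
        cycles.append((0, "launch"))
        cycles.append((0, "capture"))
    else:
        raise ValueError("scheme must be 'LOS' or 'LOC'")
    return cycles


if __name__ == "__main__":
    for scheme in ("LOS", "LOC"):
        print(scheme, transition_test_cycles(scheme, chain_length=4))
```

In the LOS list, scan enable has to switch from 1 to 0 between the launch and capture pulses, that is, within one at-speed period. In the LOC list it switches before the launch pulse.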
What is compression ratio and how do you fix the compression ratio for a design?
Ans: The compression ratio in DFT is basically used for TAT and TDV:
TAT : Test application time
TDV : Test data volume (size of the patterns)
It is the reduction in these two numbers when compared to a design which has just the scan chains
and no compression techniques.
The scan compression technique available with most of the commercial tools today is to have
multiple scan chains inside the core with a limited number of top-level scan ports used for
loading and unloading these chains. So you need hardware in your design to support this:
there will be a decompressor to feed many internal chains from a limited number of top-level scan
input ports, and a compressor to unload the values from many internal scan chains to a limited
number of top-level scan outputs.
The TAT and TDV reduction is achieved by the smaller number of cycles needed to load the internal chains. In
its most crude form, the compression ratio for such a technique is
# internal chains = # external chains * compression ratio * β
where β accounts for some pattern inflation. You need to know this β and can then control
the compression ratio.
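The TAT and TDV reductions can be estimated with the expressions listed under "Important formulas" at the top of this document. The sketch below simply codes those expressions; the design numbers (flop count, chain counts, port counts) are invented for illustration.

```python
def scan_test_data_volume(longest_chain, num_chains):
    """Scan Test Data Volume = 3 * longest chain * number of scan chains."""
    return 3 * longest_chain * num_chains


def compression_test_data_volume(longest_chain, num_scan_inputs, num_scan_outputs):
    """Compression TDV = longest chain * (scan inputs + 2 * scan outputs)."""
    return longest_chain * (num_scan_inputs + 2 * num_scan_outputs)


def tat_reduction(longest_chain_scan, longest_chain_compressed):
    """Test application time reduction = ratio of the longest chain lengths."""
    return longest_chain_scan / longest_chain_compressed


if __name__ == "__main__":
    # Hypothetical design: 100,000 flops, 10 scan-in/scan-out pairs.
    scan_len = 100_000 // 10        # 10 plain scan chains
    comp_len = 100_000 // 200       # 200 internal chains behind the same 10 ports

    tdv_scan = scan_test_data_volume(scan_len, 10)
    tdv_comp = compression_test_data_volume(comp_len, 10, 10)
    print("TAT reduction:", tat_reduction(scan_len, comp_len))
    print("TDV reduction:", tdv_scan / tdv_comp)
```

These figures ignore pattern inflation, so the reduction seen on a real design would typically be lower than this estimate.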
What are the factors which decide the number of scan chains?
Ans: Factors such as availability of chip I/O pins, available tester channels, and on-chip routing
congestion caused by chaining storage elements in test mode limit the number of scan chains.
LOS is combinational ATPG algorithm and LOC is Sequential ATPG algorithm. Why?
Ans: LOC is sequential, since it is essentially a double capture, and the ATPG tool needs to be
able to store the state of the circuit after the last shift and first clock pulse of the capture in order
to know what is expected after the second capture clock.
LOS is essentially the same as a simple stuck-at, since there is only one clock pulse during the
capture. No second state needs to be stored by the ATPG to determine how the circuit will react
after the second clock pulse.
In scan chains, if some flip-flops are +ve edge triggered and the remaining flip-flops are -ve edge triggered, how
does the chain behave?
For designs with both positive and negative clocked flops, the scan insertion tool will always
route the scan chain so that the negative clocked flops come before the positive edge flops in the
chain. This avoids the need for a lockup latch.
Within the same clock domain, a negedge flop placed after a posedge flop would capture the data just captured into
the posedge flop on the posedge of the clock, so the bit would pass through both flops in one cycle.
For multiple clock domains, it all depends upon how the clock trees are balanced. If the clock
domains are completely asynchronous, ATPG has to mask the receiving flops.
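The reason for placing negative-edge flops first can be seen in a small behavioral model of scan shift. This is a sketch under simplifying assumptions: an ideal clock whose cycle is a rising edge followed by a falling edge, zero skew, and scan-in data held stable for the whole cycle. The function name is hypothetical.

```python
def shift_chain(edges, bits):
    """Shift `bits` into a scan chain and return the final flop values.

    edges: trigger type of each flop, 'pos' or 'neg', listed from the
           scan-in side to the scan-out side.
    """
    state = [0] * len(edges)
    for bit in bits:
        for active in ("pos", "neg"):          # rising edge, then falling edge
            # Value at each flop's scan input just before this edge.
            scan_inputs = [bit] + state[:-1]
            state = [scan_inputs[i] if edges[i] == active else state[i]
                     for i in range(len(edges))]
    return state


if __name__ == "__main__":
    print(shift_chain(["neg", "pos"], [1, 0]))   # each flop holds its own bit
    print(shift_chain(["pos", "neg"], [1, 0]))   # both flops hold the last bit
```

With the posedge flop first, the negedge flop copies the freshly shifted bit half a cycle later, so the two flops act as one scan position. That is the case a lockup latch would otherwise have to fix.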
Why do we avoid latches in the design, even though they add only a cell delay?
Are there any timing-related issues?
Whenever a latch is enabled, it passes whatever is on its D input to the Q output. If
a glitch arrives on D while the latch is enabled, it passes through to Q, and glitches always create
problems downstream.
Latches are fast and consume less power and area than flops, but glitches come along
with these advantages; that's why we go for flops.
Also, latches are not DFT friendly, and it is very difficult to perform static timing analysis with
latches in your design.
While attempting to do scan insertion for a design, I am finding many uncontrollable clock
and uncontrollable asynchronous signals like set and reset violations.
Can anyone guide me on how to avoid these violations?
Uncontrollable clocks are those which come out of combinational logic (dividers, etc.).
When doing DFT, all the controls of the FFs must be conditioned on the TEST EN signal,
in the sense that once TEST EN = 1, the entire chip should automatically be in scan mode.
For this, all the clocks of the FFs should be controllable externally (at pin level).
Answer2: During placement, the optimization may make the scan chain difficult to route due to
congestion. Hence the tool will re-order the chain to reduce congestion.
This sometimes increases hold time problems in the chain. To overcome these, buffers may have
to be inserted into the scan path. The tool may not be able to maintain the scan chain length exactly. It
cannot swap cells from different clock domains.
Because of scan chain reordering, patterns generated earlier are of no use. But this is not a problem,
as ATPG can be redone by reading the new netlist.
Ques : During the process of ATPG, I encountered a term called clocked PO pattern.
Could someone throw some light on what are these patterns ?
Ans : Clock PO patterns are special patterns meant to test primary output values when those
primary outputs are connected, directly or indirectly, to one of the scan clocks (usually through
combinational logic or just buffers).
What are the design guidelines for getting good fault coverage and error-free DFT?
Ans:
• Use a fully synchronous design methodology, using ‘pos-edge’ only if possible.
• Generated or gated clocks should be properly planned, documented and collected in one module
at the top level.
• For gated clocks or derived clocks: a test mode should be implemented that will drive all the
Flip-Flop (FF) clocks from a single test clock. Also, clock skew for this test clock should be
properly balanced so there are no hold violations on any of the registers, both during scan shift
and normal mode.
• Provide proper synchronization for signals crossing clock domains, different edges of the same
clock, or asynchronous inputs. Such un-testable synchronization logic should be isolated in a
separate module. Create patterns for this asynchronous logic separately.
• A lock-up latch should be inserted in a signal crossing a clock domain or clock edge, if all the
FFs are to be part of the same scan chain.
• Don’t use clocks in combinational logic or as data/set/reset input to a FF.
• All the asynchronous set/resets should be directly controllable from the primary inputs of the
chip. If there are any internally derived set/resets, it should be possible to disable them using
one or more primary inputs.
• Don’t use asynchronous set/resets in combinational logic or as data input to a FF.
• Avoid using registers with both set and reset functionality.
• Avoid latches in your design. If there are any latches in the design, care should be taken to
make them transparent during the scan test.
• Avoid internal tri-state buses (instead consider a MUX architecture). If not, then implement bus
control logic to ensure one and only one driver (no bus conflicts and no floating buses) is active
on the bus during scan test.
• For external tri-states, bring out 3 signals for each tri-state pad: input, output and enable.
• No combinational feedback loops (especially when integrating the sub-blocks).
• Use ‘logic_high’ and ‘logic_low’ port signals for all the sub-blocks instead of VDD/VSS or
‘tie_1/tie_0’ cells. Put ‘don’t_touch’ on these ports so that they will not be removed and can be
used to connect the ‘tie_off’ nets generated during synthesis. These ‘logic_high’ and ‘logic_low’
ports can be connected to a ‘tie_macro’ (these macros will contain scan-able FFs) at the top level
that makes all these nets testable.
• Default values on the buses should use alternate ‘logic_high’ and ‘logic_low’ connections to
increase fault coverage.
• Balance the scan chains to be of approximately equal length and limit this length.
• Make sure the ‘scan_enable’ signal is properly buffered so that scan tests can be run at higher
frequencies.
• Use only cells (including memories) which have the appropriate DFT/ATPG models available.
• Provide for debug capability of a memory failure as part of the memory test methodology.
• Disable the memory during scan shift and bypass mode to allow fault coverage around the
memory I/O.
• Provide a power-down mode for the memory, analog and other power-consuming blocks during
the IDDQ test.
• Verify there are no hold time violations on any registers in both scan shift and normal mode
after test insertion.
• Plan chip-level scan issues before starting the block-level design.
• Properly review the test methodology of the memory, analog or other custom blocks. For analog IP
blocks or hardened IP blocks (with no internal scan chains), provide a bypass option so the
surrounding logic can be tested.
• Properly review all the different test modes of the chip and document in detail the
primary pin assignments during these test modes.
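The guideline on balancing scan chains can be illustrated with a simple length calculation. This sketch only splits a flop count evenly across a given number of chains. It ignores clock domains, clock edges, and placement, which a real scan insertion tool has to respect, and the function name is hypothetical.

```python
def balanced_chain_lengths(num_flops, num_chains):
    """Split num_flops across num_chains so lengths differ by at most one."""
    base, extra = divmod(num_flops, num_chains)
    return [base + 1 if i < extra else base for i in range(num_chains)]


if __name__ == "__main__":
    lengths = balanced_chain_lengths(1003, 4)
    print(lengths, "longest =", max(lengths))
```

Shift time per pattern is set by the longest chain, so an unbalanced split wastes tester cycles on the shorter chains.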