Emailing DFT Interview Preparation
● What is DFT?
● Testing scheme
● Fault Models
● Boundary Scan Testing (JTAG)
● TAP controller
● Scan
● MBIST
● Asked Interview Questions
What is DFT?
DFT as practiced in the industry today, at least for digital circuits, is predicated on
a structural test paradigm. Structural tests make no direct attempt to determine
if the overall functionality of the circuit is correct. Instead, they try to make sure
that the circuit has been assembled correctly from some low-level building
blocks as specified in a structural netlist. For example, are all specified logic
gates present, operating correctly and connected correctly? The stipulation is
that if the netlist is correct and structural testing has confirmed the correct
assembly of circuit elements, then the circuit should be functioning correctly.
Design for Test (DFT) is a process that incorporates rules and techniques in the
design of a product to make testing easier. Design for Test is used to manage
complexity, minimize development time and reduce manufacturing costs.
Board-level faults such as solder-joint defects (shorts and opens) and
component defects (wrong device, missing device, wire-bond failure and stuck
pins) make up over 95 percent of the failures found. A structured technique such
as boundary-scan testing allows pin-level testing to detect these failures easily.
Design cycle times have shortened significantly over the years while test
program development time has increased, necessitating that companies adopt
structured, repeatable methodologies. This led to industry-wide standards
such as IEEE 1500 and IEEE 1687.
The following sections cover the different techniques used in DFT for
structural testing of the different types of circuits in a chip (combinational,
sequential or memory).
Testing Scheme
There are some general points to be kept in mind while developing a testing
scheme -
● Perfect Test
○ Detect all defects
○ Pass all functionally good devices
○ Fail all functionally bad devices
● Real Test
○ Based on analyzable fault models
○ Some good chips are rejected (yield loss)
○ Some bad chips pass test (test escape)
FAULT MODELS
Basic fault models include:
Static Faults: Faults that give incorrect values at any speed and are sensitized by
performing only one operation.
Dynamic Faults: Faults that appear only at speed and are sensitized by performing
multiple operations sequentially.
● Transition Fault (Delay Fault): The signal eventually assumes the correct
value, but more slowly (or, rarely, more quickly) than normal.
● Small-delay-defect model
Fault sets can be collapsed into smaller sets in the following ways.
Equivalence collapsing
It is possible that two or more faults produce the same faulty behavior for all input
vectors. These faults are called equivalent faults.
Dominance collapsing
Fault F is said to dominate F’ if every test for F’ also detects F. In this case, F can be
removed from the fault list.
Take a two-input NAND gate as an example. The set of all input values that can test
the output's SA0 is {00,01,10}. The set of all input values that can test the first input's
SA1 is {01}. In this case, the output SA0 fault is dominant and can be removed from the
fault list.
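The NAND example can be checked with a small fault simulation. This is a minimal sketch; the helper names (nand, detecting_tests) are made up for illustration, and a fault is modelled simply as a function that replaces the good gate.

```python
from itertools import product


def nand(a, b):
    """Fault-free two-input NAND gate."""
    return 0 if (a and b) else 1


def detecting_tests(faulty_gate):
    """Return the input vectors on which the faulty gate differs from the good gate."""
    return {
        f"{a}{b}"
        for a, b in product((0, 1), repeat=2)
        if faulty_gate(a, b) != nand(a, b)
    }


# Output stuck-at-0: the gate drives 0 whatever the inputs are.
tests_out_sa0 = detecting_tests(lambda a, b: 0)
# First input stuck-at-1: the gate sees a = 1 regardless of the applied value.
tests_in1_sa1 = detecting_tests(lambda a, b: nand(1, b))

print(sorted(tests_out_sa0))           # ['00', '01', '10']
print(sorted(tests_in1_sa1))           # ['01']
# Every test for the input fault also detects the output fault (dominance).
print(tests_in1_sa1 <= tests_out_sa0)  # True
```

Because the test set of the input fault is a subset of the test set of the output fault, keeping only the input fault in the list loses no coverage.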
Functional collapsing
Two faults are functionally equivalent if they produce identical faulty functions. In other
words, two faults are functionally equivalent if we cannot distinguish them at the primary
outputs (PO) with any input test vector.
Suppose an SoC designer uses 4 different chips from different companies, puts them in a
system and makes a product. The issue then is how to test each part of the system.
Earlier, probe-based testing was used, but that is no longer practical because designs are
getting more and more compact and components are placed much closer together.
Therefore we need a common way to test the different parts while they reside in the
system.
JTAG ARCHITECTURE
In the JTAG standard we have 4 mandatory pins (TDI, TDO, TMS and TCK) and one optional
pin (TRST). The functions of the pins are as below -
● TDI - To serially shift data into the chip.
● TDO - To serially shift the output data out, which we compare with the golden
expected output to tell whether the test has passed or not.
● TMS - To switch between different modes (states). This becomes clearer with
the TAP controller.
● TCK - Test clock. The TAP controller changes state on its rising edge.
● TRST - (Optional) To reset the JTAG TAP controller and bring it back to the reset state.
1.) Shift operation - To shift data into the chip, the data needs to be fed serially. At the
same time, the contents of the internal flops become available serially on the TDO pin,
where they are compared with the golden expected output.
2.) Update operation - After completing the shift operation we need to update the
state of the internal flops. We cannot do that on every TCK cycle because the shift is
serial: if we kept updating, every flop would toggle many times and the data would
not be stable until the full shift completes, which can easily damage the chip.
3.) Capture operation - To capture the state of the internal flops so that the data can
be read out serially on the TDO pin during the shift operation.
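The three operations can be sketched with a toy register that has a serial shift stage and a separate update stage. This is only an illustrative model; the class and method names (ScanRegister, capture_op, shift_op, update_op) are hypothetical and the bit ordering is an assumption.

```python
class ScanRegister:
    """Toy JTAG data register: a shift stage plus an update (hold) stage."""

    def __init__(self, width):
        self.shift = [0] * width   # serial stage, changes on every shift clock
        self.update = [0] * width  # parallel stage that drives the internal logic

    def capture_op(self, parallel_in):
        # Capture: load the observed values into the shift stage in parallel.
        self.shift = list(parallel_in)

    def shift_op(self, tdi):
        # Shift: the last bit leaves on TDO while tdi enters at the front.
        tdo = self.shift[-1]
        self.shift = [tdi] + self.shift[:-1]
        return tdo

    def update_op(self):
        # Update: copy the fully shifted value to the outputs in a single step.
        self.update = list(self.shift)


reg = ScanRegister(4)
reg.capture_op([1, 0, 1, 1])
# Shift a new pattern in while the captured value is read out on TDO.
captured = [reg.shift_op(bit) for bit in [1, 1, 0, 0]]
outputs_during_shift = list(reg.update)
reg.update_op()

print(captured)              # [1, 1, 0, 1]
print(outputs_during_shift)  # [0, 0, 0, 0]
print(reg.update)            # [0, 0, 1, 1]
```

The update stage stays at its old value for the whole shift, which is why the logic driven by the register does not see the intermediate toggling.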
TAP CONTROLLER
It controls the JTAG operation. It is basically a 16-state Finite State Machine (FSM) whose
transitions are controlled by the TMS signal, as shown in the figure below. The TAP
controller can change state only at the rising edge of TCK, and the next state is determined
by the logic level of TMS and the present state.
The figure above shows a very basic top-level view of the TAP controller. TMS, TCK and
the optional TRST signal go to a 16-state FSM, which produces various control
signals depending upon the FSM's state. These output signals include dedicated
control signals for the Instruction Register (IR): captureIR, shiftIR, updateIR;
generic control signals for all Data Registers (DR): captureDR, shiftDR, updateDR;
and other control signals.
1.) Test-Logic Reset: It resets the JTAG circuits. Whenever the (optional) TRST
signal is asserted, the controller goes back to this state. Also notice that, whatever
state the TAP controller is in, it goes back to this state if TMS is held at 1 for 5
consecutive TCK cycles. Thus even without the TRST signal we can still
reset the circuit.
2.) Run-Test/Idle: This is a state in which the FSM is waiting for some test
operations to complete.
3.) Select-DR/Scan and Select-IR/Scan: These are temporary states that allow
the test data sequence for the corresponding register (the IR in the Select-IR/Scan
state and the selected DR in the Select-DR/Scan state) to be initiated.
4.) Capture-DR and Capture-IR: In these states, data can be loaded in parallel
into the corresponding register.
5.) Shift-DR and Shift-IR: In these states, the required test data is loaded (or
unloaded) serially into (or from) the corresponding register. If you refer to
Figure-2, when the TAP controller is in one of these states, it stays there as long
as TMS=0. For each clock cycle, one data bit is shifted into (or out of) the
selected register through TDI (or TDO).
6.) Exit1-DR and Exit1-IR: All parallel-loaded (from the Capture-DR and
Capture-IR states) or serially loaded (from the Shift-DR and Shift-IR states) data
are held in the register in these states.
7.) Pause-DR and Pause-IR: The FSM pauses its function here to wait for
some external operation.
8.) Exit2-DR and Exit2-IR: These states represent the end of the Pause-DR and
Pause-IR operation, and allow the TAP controller to go back to the Shift-DR and
Shift-IR states for more data to be shifted in (or shifted out).
9.) Update-DR and Update-IR: The test data stored in the first flop of the register
(typically every register has two flops for each bit, we will discuss this
later) is loaded into the second flop in these states.
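The state descriptions above can be collected into a transition table. The sketch below models only the next-state logic, with one table lookup per TCK rising edge. The dictionary name TAP and the helper step are hypothetical, and the state names follow the hyphenated IEEE 1149.1 spelling.

```python
# TAP[state] = (next state if TMS = 0, next state if TMS = 1)
TAP = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def step(state, tms_bits):
    """Advance the FSM by one TCK rising edge per TMS bit."""
    for tms in tms_bits:
        state = TAP[state][tms]
    return state


# Holding TMS = 1 for five clocks reaches Test-Logic-Reset from every state.
print(all(step(s, [1] * 5) == "Test-Logic-Reset" for s in TAP))  # True
# Four clocks are not always enough, for example when starting in Shift-DR.
print(step("Shift-DR", [1] * 4))                                 # Select-IR-Scan
# From Run-Test/Idle, TMS = 1, 0, 0 enters Shift-DR.
print(step("Run-Test/Idle", [1, 0, 0]))                          # Shift-DR
```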
Once the patterns are generated, the expected response of the circuit for
each pattern is obtained in pre-silicon. The expected responses along with
the patterns are then stored in the memory of the Automatic Test Equipment
(ATE). In post-silicon, the manufactured chip is tested using the ATE, which
applies each pattern and compares the chip's response with the expected
response to give a pass or fail status.
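The flow can be sketched as a simple compare loop. Everything here is a stand-in: the circuit (a AND b) OR c, the function names good_chip, bad_chip and run_on_ate, and the three patterns are all hypothetical, chosen only to show stored expected responses being compared with a device's actual responses.

```python
def good_chip(pattern):
    """Hypothetical fault-free circuit: (a AND b) OR c."""
    a, b, c = pattern
    return (a & b) | c


def bad_chip(pattern):
    """Same circuit with input a stuck at 0."""
    _, b, c = pattern
    return (0 & b) | c


def run_on_ate(patterns, expected, device):
    """Apply each stored pattern and return the indices of mismatching responses."""
    failures = []
    for index, (pattern, golden) in enumerate(zip(patterns, expected)):
        if device(pattern) != golden:
            failures.append(index)
    return failures


patterns = [(1, 1, 0), (0, 1, 0), (1, 0, 1)]
# Pre-silicon: expected responses come from simulating the good design.
expected = [good_chip(p) for p in patterns]

# Post-silicon: each manufactured device is checked against the stored data.
print(run_on_ate(patterns, expected, good_chip))  # []
print(run_on_ate(patterns, expected, bad_chip))   # [0]
```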
BIST
BIST is a design for testability technique that places the testing functions
physically within the circuit under test (CUT). The basic BIST architecture
requires the addition of 3 hardware blocks to a digital circuit: a test pattern
generator, a response analyzer and a test controller.
The test pattern generator generates the test patterns for the CUT. Examples of
pattern generators are a ROM with stored patterns, a counter and a linear
feedback shift register (LFSR). A typical response analyzer is a comparator with
stored responses or an LFSR used as a signature analyzer. It compacts and
analyzes the test responses to determine correctness of CUT. A test control
block is necessary to activate the test and analyze the responses.
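As a rough sketch of these blocks, the code below uses a 4-bit LFSR as the pattern generator and a second LFSR with a serial input as the signature analyzer. The feedback polynomial x^4 + x^3 + 1, the CUT function and all names are assumptions made for illustration.

```python
def lfsr_step(state):
    """One step of a 4-bit LFSR with feedback from bits 3 and 2 (x^4 + x^3 + 1)."""
    feedback = ((state >> 3) ^ (state >> 2)) & 1
    return ((state << 1) | feedback) & 0xF


def lfsr_patterns(seed=0b0001, count=15):
    """Yield `count` pseudo-random 4-bit test patterns."""
    state = seed
    for _ in range(count):
        yield state
        state = lfsr_step(state)


def signature(bits, state=0):
    """Compact a serial response stream into a 4-bit signature."""
    for bit in bits:
        feedback = ((state >> 3) ^ (state >> 2) ^ bit) & 1
        state = ((state << 1) | feedback) & 0xF
    return state


def cut(pattern, c_stuck_at_0=False):
    """Hypothetical circuit under test: (a AND b) XOR (c OR d)."""
    a, b, c, d = [(pattern >> i) & 1 for i in (3, 2, 1, 0)]
    if c_stuck_at_0:
        c = 0
    return (a & b) ^ (c | d)


patterns = list(lfsr_patterns())
print(len(set(patterns)))  # 15, every non-zero 4-bit pattern appears once

golden = signature(cut(p) for p in patterns)
observed = signature(cut(p, c_stuck_at_0=True) for p in patterns)
print("PASS" if observed == golden else "FAIL")  # FAIL
```

Only the 4-bit golden signature has to be stored on chip, not the full response stream. The price is a small chance that a faulty response stream compacts to the same signature.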
Four primary parameters must be considered in developing a BIST methodology
for embedded systems.
1.) Fault Coverage: This is the fraction of faults of interest that can be exposed
by the test patterns produced by the pattern generator and detected by the output
response monitor. When the response bit stream contains errors, there is still a
chance that the computed signature matches the golden signature, so that a faulty
circuit is reported as fault free. This undesirable property is called masking or
aliasing.
2.) Test set size: This is the number of test patterns produced by the test
pattern generator, and is closely related to fault coverage. Generally, large test
sets imply high fault coverage.
3.) Hardware overhead: The extra hardware required for BIST is considered to
be overhead. In most embedded systems, high hardware overhead is not
acceptable.
4.) Performance overhead: This refers to the impact of BIST hardware on normal
circuit performance, such as its worst-case (critical) path delays. Overhead of this
type is sometimes more important than hardware overhead.
DRAWBACKS OF BIST:
3.) Performance overhead: Extra path delays are added due to BIST.
4.) Yield loss: Yield loss increases due to increased chip area.
5.) Design effort and time: Design effort and time increase due to the design of
BIST.
BENEFITS OF BIST:
1.) It reduces testing and maintenance cost, as it requires simpler and less
expensive ATE. At-speed testing on an ATE at around 50 Gbps or 100 Gbps
becomes very costly. BIST avoids that cost by running the test inside the chip.
2.) BIST significantly reduces the cost of automatic test pattern generation
(ATPG).
MBIST
Typically, we see a 4X increase in memory size every 3 years to cater to the
needs of new generation IoT devices. Deep submicron devices contain a
large number of memories, which demand small area and fast access time.
Hence an automated testing strategy for such designs is required to reduce
ATE (Automatic Test Equipment) time and cost.
● Stuck At Fault
● Transition Fault
● Coupling Fault
● Neighbourhood pattern sensitive fault (NPSF)
● Address decoder fault
MBIST MODEL
MBIST Algorithms
Memories are tested with special algorithms which detect the faults
occurring in memories. A number of different algorithms can be used to test
RAMs and ROMs. These algorithms can detect multiple failures in memory
with a minimum number of test steps and test time -
Checkerboard Algorithm
The 1s and 0s are written into alternate memory locations of the cell array in
a checkerboard pattern. The algorithm divides the cells into two alternate
groups such that every neighboring cell is in a different group. The
checkerboard pattern is mainly used for activating failures resulting from
leakage, shorts between cells, and SAF.
Algorithm Steps:
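A common formulation, assumed here rather than taken from a specific tool, is: write the checkerboard pattern, read it back, then repeat with the inverted pattern. The sketch below runs it on a toy memory model with an optional stuck-at cell. The class Memory and the function checkerboard_test are hypothetical names.

```python
class Memory:
    """Toy rows x cols bit array with optional stuck-at cells."""

    def __init__(self, rows, cols, stuck=None):
        self.rows, self.cols = rows, cols
        self.cells = [[0] * cols for _ in range(rows)]
        self.stuck = stuck or {}  # {(row, col): forced value}

    def write(self, row, col, value):
        self.cells[row][col] = self.stuck.get((row, col), value)

    def read(self, row, col):
        return self.stuck.get((row, col), self.cells[row][col])


def checkerboard_test(mem):
    """Write and read back the checkerboard pattern, then its inverse."""
    failures = set()
    for phase in (0, 1):  # phase 1 is the inverted pattern
        for r in range(mem.rows):
            for c in range(mem.cols):
                mem.write(r, c, (r + c + phase) % 2)
        for r in range(mem.rows):
            for c in range(mem.cols):
                if mem.read(r, c) != (r + c + phase) % 2:
                    failures.add((r, c))
    return sorted(failures)


print(checkerboard_test(Memory(4, 4)))                       # []
print(checkerboard_test(Memory(4, 4, stuck={(1, 2): 0})))    # [(1, 2)]
```

Running both phases matters: a cell stuck at 0 is only exposed in the phase that expects a 1 at that location.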
March Algorithm
A march test applies patterns that “march” up and down the memory
addresses while writing values to and reading values from known memory
locations. It targets various faults like Stuck-At, Transition, Address faults,
Inversion and Idempotent coupling faults.
Algorithm Steps:
[Figure: march elements applied in increasing and decreasing address order]
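As one concrete instance, the sketch below runs the widely used March C- sequence on a toy RAM with injected faults. The choice of March C- is an assumption, since the text does not name a specific march variant. The names FaultyRam, MARCH_C_MINUS and march are hypothetical.

```python
UP, DOWN = 1, -1

# March C-: {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}
MARCH_C_MINUS = [
    (UP,   [("w", 0)]),
    (UP,   [("r", 0), ("w", 1)]),
    (UP,   [("r", 1), ("w", 0)]),
    (DOWN, [("r", 0), ("w", 1)]),
    (DOWN, [("r", 1), ("w", 0)]),
    (UP,   [("r", 0)]),
]


class FaultyRam:
    """Toy bit-wide RAM with optional stuck-at and rising-transition faults."""

    def __init__(self, size, stuck_at=None, no_rise=()):
        self.size = size
        self.data = [0] * size
        self.stuck_at = stuck_at or {}  # {address: forced value}
        self.no_rise = set(no_rise)     # cells that cannot go from 0 to 1

    def write(self, addr, value):
        if addr in self.no_rise and self.data[addr] == 0 and value == 1:
            return  # transition fault: the cell keeps its old value
        self.data[addr] = value

    def read(self, addr):
        return self.stuck_at.get(addr, self.data[addr])


def march(ram, elements=MARCH_C_MINUS):
    """Run the march elements and return the addresses whose reads mismatched."""
    failures = set()
    for direction, ops in elements:
        addresses = range(ram.size) if direction == UP else range(ram.size - 1, -1, -1)
        for addr in addresses:
            for op, value in ops:
                if op == "w":
                    ram.write(addr, value)
                elif ram.read(addr) != value:
                    failures.add(addr)
    return sorted(failures)


print(march(FaultyRam(8)))                                # []
print(march(FaultyRam(8, stuck_at={3: 1}, no_rise=[5])))  # [3, 5]
```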
Memory repair is implemented in two steps. The first step is to analyze the
failures diagnosed by the MBIST Controller during the test for repairable
memories, and the second step is to determine the repair signature to repair
the memories. All the repairable memories have repair registers which hold
the repair signature.
INTERVIEW QUESTIONS
Q. Explain scan testing?
Q. Why do we need updateDR in JTAG arch, why only captureDR and shiftDR
cannot work?
Q. What is the purpose of DFT? Why can we not work only with functional
testing?
Q. What are the major cost factors of DFT and what do you do to lower costs?
Q. Write Verilog code for a RAM, let the address space be a parameter.
Note: Some scripting questions are usually asked in DFT interviews, so knowing
either Tcl or Perl will be helpful. The questions are usually basic in
nature, such as regex searches.
Q. In an error log, there are different types of errors present like errorA,
errorB ….. errorX, each on a new line in a form like -
errorA: xxxxxx
errorB: xxxxxxxx
Count the errors of each type present in the log and print the counts as output.
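One possible answer is sketched below in Python rather than Tcl or Perl. The same regex idea carries over to either language. The sample log lines and the function name count_errors are made up for illustration.

```python
import re
from collections import Counter


def count_errors(log_text):
    """Count lines of the form 'errorX: message', grouped by error type."""
    pattern = re.compile(r"^\s*(error\w+)\s*:", re.MULTILINE)
    return Counter(pattern.findall(log_text))


sample = """errorA: timeout
errorB: bad checksum
info: retrying
errorA: timeout again
"""

for name, count in sorted(count_errors(sample).items()):
    print(f"{name}: {count}")
# errorA: 2
# errorB: 1
```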