Design For Testability Notes 1738476661
System on Chip (SOC): When all the blocks such as memory, ALU, RAM,
ROM, microprocessor, microcontroller etc. are integrated on a single
chip, it is called an SOC. It mainly consists of 4 blocks.
[Figure: SOC block diagram; surviving labels: I/O, Memories]
Goal of DFT:
The goal of DFT is to identify defects and separate the faulty
parts.
It is not to identify where the defect is or how the defect occurred.
• DFT engineers provide inputs in the form of test vectors to
the test engineering team, who apply them to the manufactured
ICs to prove whether each IC passes or fails.
• To test digital logic, we use the Scan method.
• For I/O defects we use BSCAN (Boundary Scan).
• For memory we use MBIST (Memory BIST).
• For analog we use ADFT (Analog DFT).
There is no generic way to test analog (there is no fixed
methodology to test analog).
Design Structures:
1) Digital: (i) Combinational (ii) Sequential logic
2) Inputs & Outputs (I/O)
3) Memories
4) Analog
Inputs & Outputs:
An input pad takes a signal from the external world and passes it
to the internal logic. The result is then sent to the external
world through an output pad.
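Bidirectional Pad:
• It has logic inside the chip, controlled by oe:
➢ If oe=1, then the pad works as an output pad, i.e., dout is driven
to the external world.
➢ If oe=0, then the pad works as an input pad, i.e., the signal goes
into the chip (din).
[Figure: bidirectional pad shown for oe = 1 and oe = 0, with dout, din and the external world]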
Tri - State Pad:
➢ When oe=1, the output (dout) goes to the external world.
➢ When oe=0, it creates a high impedance state, i.e., it
completely blocks signals going to the i/p (or) o/p.
[Figure: tri-state pad with oe = 1 driving dout to the external world]
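The pad behaviour described above can be sketched as a small truth-function model. This is only an illustration: the names `HIGH_Z`, `tri_state_pad` and `bidirectional_pad` are hypothetical, and reporting `din` as `None` while the pad is driving out is an assumption made here, not something the notes specify.

```python
HIGH_Z = "Z"  # illustrative marker for the high impedance state


def tri_state_pad(dout, oe):
    """Value seen by the external world for a tri-state pad."""
    # oe = 1: dout is driven out; oe = 0: the pad is high impedance.
    return dout if oe == 1 else HIGH_Z


def bidirectional_pad(dout, oe, external):
    """Return (value on the pad, din seen inside the chip).

    Assumption: din is reported as None while the pad works as an output.
    """
    if oe == 1:
        return dout, None          # pad works as an output pad
    return external, external      # pad works as an input pad


print(tri_state_pad(dout=1, oe=1))
print(tri_state_pad(dout=1, oe=0))
print(bidirectional_pad(dout=1, oe=0, external=0))
```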
DFT Role :
[Figure: DFT supplies test vectors; each manufactured IC is checked against its specification (A+B in the example) and is marked pass if it matches, otherwise fail]
IC Classification:
ASIC:
• An ASIC is an IC for a specific functionality, e.g., for a mobile or a laptop.
• The same design is used to manufacture millions of products.
• When we need higher volumes of ICs we go for an ASIC.
FPGA:
• It supports different applications (like adder, subtractor,
multiplier...), i.e., it is reconfigurable.
• Whereas an ASIC is an application specific IC, i.e., an
adder ASIC only performs addition.
• Suppose an adder is designed with both an ASIC and an FPGA.
• Then PPA is better in the ASIC than in the FPGA.
ASIC flow:
1) Design architecture
2) RTL design/verification
3) Logic synthesis
4) DFT/scan insertion (SCAN, JTAG, MBIST) and ATPG
5) Incremental synthesis
6) Floor planning, P&R
7) Timing analysis
8) Tape out
Defects in Silicon
• The IC manufacturing process is defect-prone.
• The database given for manufacturing and the manufactured
chip should be functionally equal.
• We need to prove whether they are equal or not.
➢ If they are equal, the chip is fault free.
➢ Otherwise, there is a defect or fault.
• Manufacturing defects such as opens and shorts are caused by
impurities: shorts by additive (extra) material, opens by
subtractive (missing) material.
Testing:
• Testing is a process of sorting DUTs into pass and fail to
determine their acceptability.
• If we apply 100 patterns as stimulus, we should know what the
o/p is for pattern #1, for pattern #2, and so on up to pattern #100.
• First, we should have the expected o/p's.
• Then apply the stimulus (I/P pattern) and compare.
Suppose the expected o/p's are:
I/P     O/P
#1      1
#2      0
#3      0
.       .
.       .
#100    1
If any one pattern fails, then it is a defective chip and it is
separated.
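The pass/fail sorting described above can be sketched in a few lines. This is a minimal illustration, not a tester program: the function name `sort_dut` and the 4-pattern example data are hypothetical.

```python
def sort_dut(expected, observed):
    """Compare observed o/p's with expected o/p's, pattern by pattern.

    Returns ("PASS" or "FAIL", list of failing pattern numbers, 1-based).
    """
    failing = [
        number
        for number, (exp, obs) in enumerate(zip(expected, observed), start=1)
        if exp != obs
    ]
    return ("PASS" if not failing else "FAIL"), failing


# Hypothetical expected o/p's for 4 patterns.
expected = [1, 0, 0, 1]

print(sort_dut(expected, [1, 0, 0, 1]))  # every pattern matches
print(sort_dut(expected, [1, 1, 0, 1]))  # pattern #2 mismatches, chip is separated
```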
Multisite testing
1) One ATE tests several (usually identical) devices at the same
time.
2) It is used for both probe (wafer level testing) and package test
(taking a die from the wafer and putting it in a package).
3) The DUT interface board has >1 sockets.
4) More instruments are added to the ATE to handle multiple devices
simultaneously.
5) Usually 2 or 4 DUTs are tested at a time; for memory chips,
usually 32 or 64 at a time.
6) Suppose we have 100 million ICs.
7) Let us assume it takes 1 min to test one IC; then testing
100 million ICs takes 100 million minutes.
8) Suppose a mobile has 3 ICs; for 100 million mobiles, it will
take 300 million minutes.
9) Therefore, to increase the testing speed, instead of testing one
IC at a time, you can test 2 ICs at a time. This is called Dual Site Testing.
10) 4 ICs at a time – Quad site testing.
11) 8 ICs at a time – Octal site testing.
If we test 8 ICs in a minute, then we need
100 million / 8 = 12.5 million minutes instead of 100 million
minutes.
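The test-time arithmetic for multisite testing can be checked with a short sketch. The helper name `total_test_minutes` is hypothetical, and it assumes every site takes the same time and that all sites run fully in parallel.

```python
import math


def total_test_minutes(num_ics, minutes_per_ic=1.0, sites=1):
    """Total tester time when `sites` ICs are tested at the same time."""
    insertions = math.ceil(num_ics / sites)  # number of tester runs needed
    return insertions * minutes_per_ic


ICS = 100_000_000  # 100 million ICs, 1 minute each

for name, sites in [("single", 1), ("dual", 2), ("quad", 4), ("octal", 8)]:
    print(name, total_test_minutes(ICS, sites=sites))
```

With 8 sites this gives 12.5 million minutes for 100 million ICs.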
[Figure: wafer test flow. Surviving labels: wafer slicing, wafer test on the ATE (ATE software tools), wafer map (pass/fail), inking, dicing & binning, good chips]
• Here, the ATE applies test patterns to the wafer and the failing
dies are identified. Using the fail information from the ATE,
the wafer map is used to ink the dies that failed.
• Sorting is then done by dicing and binning: good dies are
diced (cut and sent to packaging) and bad (inked) dies are
moved to the bin.
Manufacturing Test:
1) A speck of dust on a wafer is sufficient to kill a chip.
2) The yield of any chip is <100%.
Chips must be tested after manufacturing, before delivery to
customers, so that only good parts are shipped.
3) Manufacturing testers are very expensive, so:
a. Minimize time on the tester.
b. Select test vectors carefully.
What is DFT?
• It is a technique of adding testability features to an IC design.
• It covers developing and applying manufacturing tests.
• The manufacturing tests help to separate defective
components from the healthy ones.
• For example, let us assume an AND gate IC.
• During manufacturing, test logic is also inserted.
• Test patterns are applied through the test logic to check the
functionality of the AND gate. After verification, if everything
is ok, the manufacturer disables the test logic.
What are DFT Goals ?
❖ Quality: A high degree of confidence during testing
❖ Test cost reduction
❖ Time to volume reduction through test automation.
Ad-Hoc DFT:
Enhances a design's testability without major changes to the
design style.
❖ Minimizing Redundant logic
❖ Minimizing Asynchronous logic
❖ Isolating clocks from the logic
❖ Adding internal control and observation points.
However, structured DFT techniques yield greater results.
Structured DFT:
1) More systematic and automatic approach.
2) Scan Design
3) BIST
4) Boundary Scan
• The goal is to increase the controllability & observability of a
circuit.
• With Ad-Hoc DFT we cannot test each and every block, but
with structured DFT we can test each and every block, e.g.:
to test digital logic – use Scan methodology
to test memories – MBIST
to test I/O's – JTAG (or) Boundary Scan.
• Suppose we want to manufacture an AND gate.
• It has 3 nodes (A, B, Y). During manufacturing there are
two possibilities for each node:
1) stuck at – 0
2) stuck at – 1
• Each node has a possibility of 2 defects.
Therefore, a simple AND gate with 3 nodes has 6 possible defects.
No. of nodes = 3
No. of faults per node = 2
Total no. of defects = 3*2 = 6
Patterns (AB) that detect each fault:
A s@ 0 = 11
A s@ 1 = 01
B s@ 0 = 11
B s@ 1 = 10
Y s@ 0 = 11
Y s@ 1 = 00, 01, 10 (here, 00 is redundant)
Therefore, the patterns needed to detect all 6 defects = 01, 10, 11.
Similarly, for OR the quality patterns are
00, 01, 10 (here, 11 is redundant).
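The fault table above can be reproduced with a small stuck-at fault simulation. This is a sketch under simple assumptions: a 2-input gate with nodes A, B and Y, one fault at a time, and hypothetical helper names (`simulate`, `detecting_patterns`, `essential_patterns`).

```python
from itertools import product
from operator import and_, or_


def simulate(op, a, b, fault=None):
    """Evaluate a 2-input gate; fault is (node, stuck_value) or None."""
    node, value = fault if fault else (None, None)
    if node == "A":
        a = value
    if node == "B":
        b = value
    y = op(a, b)
    return value if node == "Y" else y


def detecting_patterns(op):
    """Map each single stuck-at fault to the AB patterns that expose it."""
    table = {}
    for node, value in product("ABY", (0, 1)):
        table[(node, value)] = [
            f"{a}{b}"
            for a, b in product((0, 1), repeat=2)
            if simulate(op, a, b) != simulate(op, a, b, (node, value))
        ]
    return table


def essential_patterns(op):
    """Patterns that are the only way to detect at least one fault."""
    table = detecting_patterns(op)
    return sorted({pats[0] for pats in table.values() if len(pats) == 1})


print(detecting_patterns(and_))
print(essential_patterns(and_))  # patterns needed for the AND gate
print(essential_patterns(or_))   # patterns needed for the OR gate
```

For both gates the essential patterns already detect all 6 faults, which is why the fourth pattern is redundant.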
Q: DFT helps to
a) Verify chip's functionality.
b) Reduce debug time.
c) Faster ramp to volume production.
Ans: b & c.
Definitions:
1) Defect: An imperfection or flaw that occurs in a particular
DUT.
2) Fault: A representation/model of that defect.
3) Failure: Non-performance of the intended function of the
device, mainly due to the defect.
DFT Definitions:
A Design is Testable:
• All internal nodes of interest are simultaneously controllable
(to a desired logic value) and observable.
Design for testability:
• A methodology (or) a collection of methodologies, which
results in the creation of a testable design.
Diagnosis:
• To locate the cause of misbehavior after the incorrect
behavior is detected.
Functional vs Structural ATPG:
[Figure: 64-bit adder with two 64-bit inputs and a 64-bit output]
If we go for carry
Total no. of nodes (per bit) = 13.
13x2 = 26 faults.
For 64 bits, total no. of faults = 26x64 = 1664.
For the 64-bit adder, we have 1664+640 = 2304 faults.
Functional vs Structural testing
• If we go by the functional way of testing:
• Functional ATPG generates a complete set of tests for the circuit
i/p – o/p combinations.
The 64-bit adder has 129 i/p's and 65 o/p's, so we need
2^129 = 680,564,733,841,876,926,926,749,214,863,536,422,912
patterns.
Using a 1 GHz ATE (high speed ATE), this would take 2.15x10^22
years (which is not feasible).
• Whereas with structural test:
➢ No redundant adder hardware, 64 bit slices.
➢ Each with 36 faults (using fault equivalence).
At most 64x36 = 2304 faults.
Using a 1 GHz ATE – 0.000002304 sec (about 2.3 μs).
Industries prefer structural testing.
• Suppose with structural testing we got 99.5% coverage.
For the remaining 0.5% coverage, go for functional testing.
• Designers give a small set of functional tests, which is augmented
with structural tests to boost coverage to 98+%.
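The numbers in this comparison can be checked with a few lines of arithmetic. The sketch assumes one pattern is applied per ATE clock cycle, one test per fault for the structural case, and a 365-day year; the variable names are illustrative.

```python
ATE_HZ = 10**9                      # 1 GHz ATE: one pattern per nanosecond
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

# Functional test: every combination of the 129 inputs.
functional_patterns = 2**129
functional_years = functional_patterns / ATE_HZ / SECONDS_PER_YEAR

# Structural test: 64 bit slices with 36 faults each, one test per fault.
structural_tests = 64 * 36
structural_seconds = structural_tests / ATE_HZ

print(f"{functional_patterns:,} patterns")
print(f"{functional_years:.3e} years")
print(structural_tests, "tests,", structural_seconds, "seconds")
```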
Design Environment Setup
➢ Link library
➢ Target library
➢ Search path
➢ Tool Invocation
➢ Netlist
➢ Constraints
[Figure: flow RTL → Synthesis → Scan Insertion]
Synthesis → Netlist
module ICG (clk, en, te, out_clk);
input clk, en, te;
output out_clk;
wire clken, clk_n, clk_la;
// clock is enabled by the functional enable (en) or the test enable (te)
OR2X4_LVT cg_or (.A1(en), .A2(te), .Y(clken));
INVX1_LVT cg_inv (.A(clk), .Y(clk_n));
// latch is clocked by the inverted clock
LATCHX1_LVT cg_la (.clk(clk_n), .D(clken), .Q(clk_la));
AND2X1_LVT cg_and (.A1(clk), .A2(clk_la), .Y(out_clk));
endmodule
Whenever we give a netlist to the tool, for every cell reference
(OR2X4_LVT, …) the tool looks up the cell in the library (link
library).
• If the tool does not find, say, the latch cell in the link library,
it reports an error such as Black Box (or) cell not found.
• Eg: I have a 40nm synthesized netlist and want to convert it to a
28nm netlist to do Scan Insertion.
The i/p is the 40nm netlist, so the 40nm library is needed as the
link library (read).
The o/p is the 28nm netlist, so the 28nm library is needed as the
target library (write).
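The link-library lookup can be pictured with a toy check that collects the cell references in a netlist and reports the ones missing from a library. This is only an illustration of the idea, not how any synthesis tool is implemented; the regular expression, the function names and the example library contents are assumptions.

```python
import re

NETLIST = """
module ICG (clk, en, te, out_clk);
input clk, en, te;
output out_clk;
OR2X4_LVT cg_or (.A1(en), .A2(te), .Y(clken));
INVX1_LVT cg_inv (.A(clk), .Y(clk_n));
LATCHX1_LVT cg_la (.clk(clk_n), .D(clken), .Q(clk_la));
AND2X1_LVT cg_and (.A1(clk), .A2(clk_la), .Y(out_clk));
endmodule
"""


def referenced_cells(netlist):
    """Cell names used in instantiations of the form: CELL inst ( ..."""
    # Assumption: library cell names are upper case, as in this netlist.
    return set(re.findall(r"^\s*([A-Z][A-Z0-9_]*)\s+\w+\s*\(", netlist, re.MULTILINE))


def unresolved_cells(netlist, link_library):
    """Cells the 'tool' would report as black box / cell not found."""
    return referenced_cells(netlist) - set(link_library)


# Hypothetical link library that is missing the latch cell.
link_library = {"OR2X4_LVT", "INVX1_LVT", "AND2X1_LVT"}
print(unresolved_cells(NETLIST, link_library))
```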
Link Library:
• The tool uses the link lib to read the i/p netlist: for each
cell (like OR, AND, NOT, LATCH…) the tool looks into the
link library.
Target Lib:
• It is used when the tool writes the o/p netlist. Depending on
the constraints (or) requirements such as area optimization,
power optimization, performance optimization etc., the tool
writes the netlist by taking all the cells from the target library.
Eg: (i/p) 50 MHz with 40nm to (o/p) 100 MHz with 40nm.
The link lib is used to read the i/p and the target lib is used to
write the o/p netlist. When the technology is different, the two
libraries are different.
Search Path:
• It is the path where all the libraries are kept; it tells the
tool where to look for the target lib.
Single Stuck-at fault:
• Only 1 node is faulty in the entire design at any one point of time.
It is permanently S@0 (or) S@1; it does not toggle.
• The fault can be either at the i/p (or) the o/p of a gate.
[Figure: single-input gate with i/p A and o/p Y; the vectors below correspond to an inverter]
Fault     Detecting vector
A s@ 0    A = 1
A s@ 1    A = 0
Y s@ 0    A = 0
Y s@ 1    A = 1
Here, we need only 2 vectors (A = 0 and A = 1), not 4.
Testability:
[Figure: test points. A controllability test point (MUX, Vcc, primary input, test mode) and an observation test point routed to a primary output]
[Figure: scan flop. A MUX selected by test mode (scan mode) chooses between the functional data input (0) and the scan input (1) at the D pin of the flop]
[Figure: scan chain. Scan flops between blocks of combinational logic, from primary inputs to primary outputs; with Scan en = 1, Scan in = 100101011 is shifted in on Scan clk]
[Figure: scan cells with separate Sys clk and Scan clk, and a latch-based scan cell with master and slave latches, Data and Scan in inputs, and A clk / B clk]
Clock
Any signal that can change the state of a flipflop (or) register is
called a clock (set (or) reset are also treated as clocks).
Scan Golden Rules (or) DRC’s: All DRCs are netlist based.
DFT rule 1:
[Figure: two flops with I/P 1, I/P 2, clk and O/P, shown before and after adding Test mode control]
DFT Rule: 3
[Figure: the flop reset pin (R) is driven through a MUX selected by Test mode: input 1 is the reset from the port, input 0 is the reset generated by combo logic]
Asynchronous SET/RESET pins of flipflops must be controlled by
a port level RESET (primary i/p) in Scan Test mode.
DFT Rule: 4
[Figure: gated clock built with a latch; surviving labels: clk, Enable, Data, Latch, Latch O/P, Gated clock]
DFT Rule: 6
• Do not replace the flipflops of a shift register structure with
equivalent scan flops.
• i.e., if there is no combinational logic to test, there is no
need for scan flops.
• Similarly, if a functional block contains shift registers and
no combo logic, there is no need to convert the flipflops of
the shift registers into a scan chain.
• If the functional block has combo logic and a 32-bit shift reg,
then only the first (0th) flop of the shift reg needs to be
converted to a scan flop to test the combo logic; the remaining
31 flops remain as a normal shift-reg.
[Figure: combo logic feeding a 32-bit shift register (flops 0 to 31), before and after conversion; after conversion only flop 0 has Si and se]
DFT Rule: 7
[Figure: two flops with Data, Q and clk pins, with Test mode control]
DFT Rule: 8
• Bypass the memory in scan test mode.
➢ All the paths ending at a memory cell are not observable.
➢ All the paths starting from a memory are not controllable.
[Figure: memory block between two blocks of combo logic, bypassed under control of Test Mode]
DFT Rule: 9
The scan enable signal must be buffered adequately.
1) The scan enable signal that causes all flip flops in the
design to be connected to form the scan shift register, has
to be fed to all flip flops in the design. This signal will be
heavily loaded.
2) The problem of buffering this signal is identical to that of
clock buffering.
3) The drive strength of scan enable port on each block of the
design must be set to a realistic value when the design is
synthesized.
4) If this port is left unconstrained during synthesis, it could
result in silicon failure.
DFT Rule: 10
• Avoid multicycle paths as much as possible .
(Ideally Zero)
DFT Rule: 11
• Negative edge flops should be placed at the start of the scan
chain.
[Figure: 4-flop scan chain (D Q D Q D Q D Q) on a common clk; value loaded: 1101]
Loading 1010 takes 4 clock pulses. In the trace below the 2nd and
3rd flops change on the –ve edge:
               F1 F2 F3 F4
Initial        x  x  x  x
1st +ve edge   0  x  x  x
1st -ve edge   0  0  x  x
2nd +ve edge   1  0  x  x
2nd -ve edge   1  1  0  x
3rd +ve edge   0  1  0  0
3rd -ve edge   0  0  1  0
4th +ve edge   1  0  1  1
4th -ve edge   1  1  0  1
After 4 clk pulses the value loaded is 1101, not 1010. This is
because the –ve edge flops are placed in the middle of the chain.
They must be at the beginning.
[Figure: 4-flop scan chain with the –ve edge flops first; value loaded: 1010]
               F1 F2 F3 F4
Initial        x  x  x  x
1st +ve edge   x  x  x  x
1st -ve edge   0  x  x  x
2nd +ve edge   0  x  x  x
2nd -ve edge   1  0  x  x
3rd +ve edge   1  0  0  x
3rd -ve edge   0  1  0  x
4th +ve edge   0  1  1  0
4th -ve edge   1  0  1  0
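The two traces for this rule can be reproduced with a small shift simulation. It is a sketch under stated assumptions: the scan-in bit is held stable for a whole clock cycle, every flop triggered by an edge samples the value its predecessor held just before that edge, and `shift_in` / `show` are hypothetical helper names.

```python
def shift_in(edges, bits):
    """Shift `bits` into a scan chain, one bit per clock pulse.

    edges: one character per flop, "+" for +ve edge, "-" for -ve edge.
    Unknown values (x) are represented by None.
    """
    state = [None] * len(edges)
    for bit in bits:
        for edge in ("+", "-"):          # each pulse: +ve edge, then -ve edge
            old = list(state)            # values just before this edge
            for i, flop_edge in enumerate(edges):
                if flop_edge == edge:
                    state[i] = bit if i == 0 else old[i - 1]
    return state


def show(state):
    return "".join("x" if v is None else str(v) for v in state)


scan_in = [0, 1, 0, 1]  # bit applied during the 1st, 2nd, 3rd and 4th pulse

print(show(shift_in("+--+", scan_in)))  # -ve edge flops in the middle
print(show(shift_in("--++", scan_in)))  # -ve edge flops at the start
print(show(shift_in("++++", scan_in)))  # all +ve edge flops, for reference
```

With the –ve edge flops in the middle, a bit passes through two flops within one clock pulse, so the chain ends up holding 1101 instead of 1010.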
Scan Types
1) Full scan
2) Partial scan
3) Partition scan
(1) Full Scan:
Full scan is a scan design methodology that replaces all memory
elements in the design with their scannable equivalents and then
stitches them into scan chains.
[Figure: full scan design from Scan IN to Scan out]
Benefits:
1) Highly automated process (the tool generates fewer patterns for
the best coverage).
2) Highly effective, predictable method.
3) Easy to use.
4) Assured quality.
(2) Partial Scan:
[Figure: partial scan design from Scan IN to Scan out]
Benefits:
1) Reduced impact on area.
2) Reduced impact on timing.
3) More flexibility between overhead and fault coverage.
Full vs Partial Scan
(3) Partition Scan:
[Figure: design split into Partition A, Partition B and Partition C between the primary inputs and the primary outputs]
Benefits :
• Improves test coverage and runtime.
• Less power consumption, since each block is tested separately.
Lockup Latches
• Block A follows the guideline (placing –ve edge flops
first); similarly, Block B also follows the guideline.
• But when both blocks are combined, the rule is violated.
In this situation lockup latches are used.
[Figure: scan chains of Block A and Block B connected in series, with scan input and clk]
• Even though we give clk pulses to all the flops, there
is clock skew (i.e., a difference between the expected and the
actual arrival time): the 1st flop receives the clk first and a
later flop receives it last. In this situation a lockup latch is
introduced.
• The lockup latch has an active low enable.
• It holds the data for half a cycle.
• In industry, lockup latches are inserted at the end of
every scan chain within the block itself.
Clock Gating:
[Figure: clk gated by En through a Latch]
• A latch is placed to reduce glitches.
• En (enable) acts as the data pin of the latch; here the latch clk
is –ve edge (or) –ve level triggered.
• The target is to get a clean gated clk at the o/p.
• A latch with a gate is called an ICG (Integrated Clock Gate). It
comes as one macro cell, i.e., it is not one latch and one
separate AND gate combined.
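Why the latch gives a clean gated clock can be seen in a small sampled-waveform sketch. It models the enable being captured only while clk is low and compares that with a plain AND of clk and en. The waveforms and the function names `naive_gate` and `icg` are hypothetical, and timing inside a sample is ignored.

```python
def naive_gate(clk, en):
    """Gated clock from a plain AND gate, with no latch."""
    return [c & e for c, e in zip(clk, en)]


def icg(clk, en, te=None):
    """Latch-based clock gate: the latch is transparent while clk is low."""
    te = te or [0] * len(clk)
    q = 0                      # latch output, assumed 0 at start
    out = []
    for c, e, t in zip(clk, en, te):
        if c == 0:             # latch open: follow (en OR te)
            q = e | t
        out.append(c & q)      # latch closed while clk is high: q is held
    return out


# en rises and falls in the middle of a high phase of clk.
clk = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
en  = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]

print(naive_gate(clk, en))  # chopped pulses (glitches)
print(icg(clk, en))         # one full-width clock pulse
```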
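[Figure: 3-flop scan chain (D Q D Q D Q) loading 101; as the traces show, the first two flops shift together on Clk 1 and the last flop is on Clk 2]
When clk 2 arrives earlier than clk 1, all the data is loaded
correctly:
x x x
x x x
1 x x
1 x x
0 1 x
0 1 1
1 0 1
When clk 1 arrives earlier than clk 2, the last flop captures the
wrong data (100 is loaded instead of 101):
x x x
1 x x
1 x x
0 1 x
0 1 1
1 0 1
1 0 0
We know that if clk 2 arrives earlier than clk 1 the data is loaded
correctly. What if we insert a lockup latch here?
[Figure: the same chain with a lockup latch before the last flop; Clk 1, Clk 2]
D Q  D Q  Latch  D Q
x x x x
1 x x x
1 x x x
1 x x x
0 1 x x
0 1 x x
0 1 1 x
1 0 1 x
1 0 1 1
With the latch, the three flops hold 1 0 1 even though clk 1 arrives
first.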
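The effect of clock skew and of the lockup latch can be reproduced with a small event-ordered simulation. It is a sketch under assumptions: flops 1 and 2 are clocked by clk 1, flop 3 by clk 2, the lockup latch updates on the falling edge (it is transparent while the clock is low), and `load_chain` is a hypothetical name.

```python
def load_chain(bits, first_clock, lockup=False):
    """Shift `bits` into a 3-flop chain; return the final [f1, f2, f3].

    first_clock: "clk1" or "clk2", whichever rising edge arrives first.
    lockup: insert a lockup latch between flop 2 and flop 3.
    Unknown values (x) are represented by None.
    """
    f1 = f2 = f3 = latch = None
    order = ["clk1", "clk2"] if first_clock == "clk1" else ["clk2", "clk1"]
    for bit in bits:
        for clock in order:
            if clock == "clk1":
                f1, f2 = bit, f1                 # flops 1 and 2 shift together
            else:
                f3 = latch if lockup else f2     # flop 3 samples its input
        if lockup:
            latch = f2                           # falling edge: latch takes f2
    return [f1, f2, f3]


data = [1, 0, 1]  # bits in the order they are shifted in
print(load_chain(data, first_clock="clk2"))               # loaded correctly
print(load_chain(data, first_clock="clk1"))               # last flop is wrong
print(load_chain(data, first_clock="clk1", lockup=True))  # fixed by the latch
```

The latch holds the previous value of flop 2 for half a cycle, so flop 3 still sees the old data when its late clock arrives.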
TCL
Scan insertion:
STEP -1: Invoke the compiler
# dc_shell
To read the netlist we should know the file location; the netlist is in .vs format (Verilog). If we
know the file location, use the following command to read the netlist.
STEP -3: Now we need to select the current design or module from the netlist on which to do the Scan
Insertion
The synthesis team will provide the required libraries. To use the libraries, we need to
enter the following command
dc_shell # link
(The tool shows 1/0 after every command/statement execution: 1 for proper execution, 0 for
an error in execution.)
STEP -6: Now configure the flip flops as scan flops that use a multiplexer
STEP -8: Now the tool has to write out the Verilog netlist for the design, because scan flops have
been added
To write the netlist we need to specify the write command, the file format and the file location, as
shown in the command below
Here the clock and reset ports already exist, so we need not create them. We just
define these ports
STEP -9: Define the clock and reset ports with the following command
Here, we need to create and define Scan Input, Scan Output, Scan Enable, and Test mode.
These ports did not exist previously.
To create a port, we must give its direction, i.e., whether the created
port is an input or an output.
Here Scan Input, Scan Enable and Test mode are inputs and Scan Out is an
output.
STEP -10: Create and define the Scan Input with the following commands
STEP -11: Create and define the Scan Enable with the following commands
STEP -13: Create and define the Scan Out with the following commands
STEP-14: Now we have to define the scan paths and the scan path count
dc_shell # create_test_protocol
After creating a test protocol, we can't change any ports/signals. If you want to change any
signals/ports you need to remove the test protocol.
dc_shell # remove_test_protocol
Now you can edit. After completing the changes, if you want to see what is in the netlist, just
enter the command below
dc_shell # create_test_protocol
[Figure: three flip-flops (FF0, FF1, FF2); the clock and reset of FF2 are driven by the other flops]
Here there are 2 violations, because for all the flops the clock and reset should be controlled from
the port/top level. Here the clock and reset of flip-flop 2 are driven from other flops;
that is why violations are reported.
D3 ---> Reset Violation
D8 ---> (reset)
If test mode = 1, the clk from the top level goes to FF2 as its clk and the reset from the top level
goes to FF2 as its reset. Therefore, the violations are removed.
Up to this point the violations are overcome in theory; now we fix them using DFT
Compiler.
>> After defining the clock and reset as test data, Test Mode must be 1 to pass the top-level clk
and reset to FF2. For this:
dc_shell # set_dft_signal -view spec -type TestMode -port Testmode -active_state 1
>> Then preview_dft ---> It gives a report with the no. of chains, scan style etc.
dc_shell # preview_dft
>> Now we have to do scan insertion and scan stitching. For this:
dc_shell # insert_dft