Inserting Permanent Fault Input Dependence On PTM To Improve Robustness Evaluation
Rafael B. Schivittz (1), Rafaél Fritz (1), Denis T. Franco (1,2), Lirida Naviner (3), Cristina Meinhardt (1), Paulo F. Butzen (1)
(1) Programa de Pós-Graduação em Engenharia de Computação – PPGComp – Universidade Federal do Rio Grande – FURG
(2) Universidade Federal de Pelotas – UFPEL
(3) Institut TELECOM, Télécom-ParisTech, LTCI-CNRS, COMELEC, Paris, France
Abstract— Many of the challenges of nanometer CMOS seriously compromise the gains attained with technology scaling, mainly impacting yield and circuit reliability. To cope with these problems, new design methodologies are necessary to improve the robustness of circuits. Given the overheads associated with traditional fault-tolerant approaches, alternative solutions based on partial fault tolerance and fault avoidance are also being considered. These approaches apply fault tolerance to a restricted part of the circuit, or harden individual cells, allowing reliability improvements while limiting the associated overheads. In this context, a fast and accurate evaluation of circuit reliability is fundamental to a reliability-aware automated design flow, where the synthesis tool can rapidly cycle through several circuit configurations to assess the best option. This work presents a methodology to calculate circuit reliability based on the Probabilistic Transfer Matrix (PTM) method, using a probabilistic model for stuck-on faults that considers a fault probability for each input vector. The work shows that considering the same error probability for all input vectors underestimates the input influence on the overall circuit reliability. The proposed gate reliability model, associated with the PTM method, can provide more accurate circuit reliability results. Results obtained with the proposed approach show a difference of up to 15% when compared with the traditional application of the PTM method with equal input vector probabilities, when applied to a set of combinational circuits.

Keywords— CMOS, Stuck-On faults, PTM, EDA.

I. INTRODUCTION

Device scaling has been the main strategy adopted by the semiconductor industry to increase the performance of integrated circuits (ICs). Continuous scaling has emphasized several aspects neglected in earlier technology nodes. Circuit reliability has been pointed out as one of the major challenges in deep sub-micron CMOS circuits [1].

In nanoscale designs, many factors associated with technology scaling, such as manufacturing precision limitations, supply voltage reduction, higher operating frequency and power dissipation, have increased the need for reliability. These factors increase the circuit fault probability, mainly for permanent faults generated during the fabrication process steps. One of these faults is the Stuck-On Fault (SOnF). SOnF is a permanent fault that occurs in transistors. A transistor with a Stuck-On fault will drive current independently of the signal applied to its gate terminal.

Many techniques have been proposed to mitigate the problems generated by continuous scaling. These techniques are usually based on redundancy in time, hardware and/or information [2][3]. However, these techniques impose penalties on the original circuit characteristics. For example, hardware redundancy techniques increase the circuit area. Triple Modular Redundancy (TMR) is one of the most adopted techniques, producing a circuit that is robust to any fault in one of the three modules but has an area penalty of more than 3X. Thus, TMR adoption should be carefully explored.

The main challenge is to identify the point at which the reliability advantages outweigh the disadvantages in other circuit characteristics. In some cases, these techniques reduce the scaling gains due to the higher complexity of reliability approaches. To deal with this balance, probabilistic methods are one of the better options [4]. At circuit level, the behavior of the logic gates can be modeled by a matrix. One of the most adopted techniques is the probabilistic transfer matrix (PTM), which represents the expected circuit output for each input combination. In a PTM, a matrix M is composed of i rows and j columns, where the (i, j)th entry represents the probability of occurrence of output j given input i, denoted by p(j | i).

Although most PTM implementations consider the same probability q for all input vectors of a gate, there is a different fault probability for each input combination and also a different fault probability for each kind of fault. A single q value can underestimate the fault probability in a PTM implementation. Assuming the same value q for all input combinations can mask the influence of the input on different kinds of faults. In this context, the main contribution of this work is to show the input influence on robustness to Stuck-On faults and to present a methodology to model SOnF over a PTM, making the robustness evaluation of a circuit more accurate.

This work is organized as follows. Section II presents background on SOnF and probabilistic transfer matrices. Section III presents the proposed PTM model considering SOnF and input dependence. Section IV presents the results of this work, including a library of modified PTMs for gates and the application of this library to
Authorized licensed use limited to: UNIVERSIDADE FEDERAL DE SANTA CATARINA. Downloaded on June 27,2022 at 16:41:27 UTC from IEEE Xplore. Restrictions apply.
compare the robustness of a circuit with the traditional PTM. Finally, Section V presents the final remarks.

II. BACKGROUND

This section introduces the behavior of Stuck-On faults. It also shows how probabilistic transfer matrices represent the expected output of the gates and how they are used to compute the reliability of a circuit.

A. Stuck-On faults

In a circuit, problems such as manufacturing issues, aging effects or even single events can make a transistor remain permanently in the on state, which characterizes a Stuck-On fault. A transistor that remains permanently on can affect the expected logic gate behavior, competing with its complementary transistors to control the output for some input combinations.

This work focuses on the effects of Stuck-On faults on combinational circuits. To exemplify the SOnF behavior, Fig. 1 shows the circuit of an Inverter logic gate and its outputs in correct operation (OUT) and when a SOnF occurs in transistor N1 of the pull-down network (OUT'). The effects of this fault are visible when the input vector A=0 is applied, creating a short circuit between VDD and GND and causing an error state in the output with SOnF (OUT'). Conversely, when the input is A=1, the circuit with SOnF behaves as expected, because there is no fault in the pull-up network and the faulty transistor is the one responsible for creating the path between GND and OUT.

Transfer Matrix). In PTM, as well as in ITM, row indices represent all input combinations, and column indices represent possible output values.

For example, consider the probabilistic transfer matrices shown in Fig. 2. In these examples, the gate PTMs provide the correct output value with probability q, where q represents the reliability factor of the gate. The error probability of the gate is represented by p=1-q. In general, it is possible to use any fixed probability distribution for the rows of the matrices, depending on the reliability factor of the technology.

The difference between ITM and PTM is that the ITM represents an error-free gate, while the PTM represents the probability of error in the gate. In other words, the ITM is a PTM with reliability factor q equal to 1.

Fig. 2. ITMs and PTMs for three different logic gates

The generation of the global circuit PTM involves the combination of gates and interconnections. With the information about the reliability of each gate, it is possible to compute the reliability of the whole circuit. The global PTM is obtained by dividing the circuit into levels and combining the PTMs of the gates present in the circuit. To compute the global PTM of a circuit, two rules have to be respected:
different number/order of fanins and fanouts. In this case an ITM of interconnection is used to map the behavior [5].

Fig. 3. Example of circuit level PTM computation.

III. PROPOSED METHOD TO INSERT STUCK-ON FAULT INPUT DEPENDENCE ON PTM

This section presents the method proposed in this paper to evaluate SOnF and input vector dependence in different transistor arrangements. It also shows that different input vectors and transistor arrangements have different probabilities of producing a correct output.

The traditional PTM models use the reliability of the gate as the probability that the output is correct for each input combination [6][7][8][9]. However, the use of PTM can be modified to explore the circuit and the different possibilities of an error, taking the input vectors into account [10][11]. In this context, this work proposes a new methodology to insert the input influence into the traditional robustness evaluation with PTM models when considering the effect of Stuck-On faults on combinational circuits. The new method can be applied to all traditional CMOS gates composed of a single stage. Gates with multiple stages have to be evaluated using the traditional circuit PTM computation method described in Section II.

The main goal of this work is to model the circuit reliability considering the SOnF probability of each input combination. The methodology starts by defining a Stuck-On model based on the transistor arrangement and the input dependence. The next subsection details the modelling of Stuck-On faults. Then, the input dependence is transposed to the PTM, resulting in a new PTM model that takes into account the input dependence for SOnF. Finally, this section presents a case study showing the difference between the traditional PTM methodology and the proposed method when applied to a NAND2 gate.

A. Modelling Stuck-On Faults

Consider a CMOS gate, where C is the circuit description composed of t transistors and I inputs. The inputs generate 2^I combinations and each combination is represented as X_i. The set of all transistors in the netlist is named T. The circuit is described by two graphs, named G1 and G0. The graph G1 is the one that sets the output to '1' (pull-up network) and the graph G0 sets the output to '0' (pull-down network). In the ITM, an input combination X_i has its normal output already defined, as shown in the examples presented in Fig. 2.

To define the error probability under SOnF for each combination in X, this work models the circuit considering the ITM for each input. If the expected output is '1', only faults present in the pull-down network can affect the output of the gate. Otherwise, if the expected output is '0', only faults present in transistors of the pull-up network can affect the output of the gate, and the analysis to compute the reliability considers the graph G1.

Furthermore, in each input combination there are transistors that are already in the conductive state. These transistors are not included in the reliability computation because a SOnF in a transistor that is already conducting does not affect the circuit output. This allows the graph to be simplified by removing the transistors in a conductive state, so that the reliability is computed only with the transistors that are critical for the related input combination. To do so, this work presents a probabilistic SOnF model that determines the error probability for each input considering the transistor arrangements in the circuit. This model determines the behavior for series, parallel and combined series/parallel transistor arrangements.

1) Serial arrangements

Serial arrangements consist of more than one transistor connected in series. For a SOnF to appear between the two terminals of this arrangement, all transistors must present a SOnF. This occurs because the stuck-on fault is only propagated to the output if there is a path able to propagate the signal between the terminals. For example, considering Fig. 4, the fault will only be propagated if both transistors T1 AND T2 are Stuck-On, creating a conductive path.

Fig. 4. Serial arrangement of transistors

To compute the error probability in this transistor arrangement, consider that, for each transistor t_j with j = 1…n, where n is the number of transistors in the serial arrangement, the error probability P(e) is modeled as the fault probability in t_j and t_{j+1} through t_n, as shown in Eq. (1). The probability of a transistor failing is given by P(t_j). The statistical operation that represents the and relation is the intersection, as presented in Eq. (2). Eq. (3) shows the error probability of the serial arrangement presented in Fig. 4.

$P(e) = P(t_1)\ \text{and}\ P(t_2)\ \text{and}\ \dots\ \text{and}\ P(t_n)$   (1)

$P(e) = P(t_1) \cap P(t_2) \cap \dots \cap P(t_n)$   (2)

$P(e) = P(T_1) \cap P(T_2)$   (3)

2) Parallel arrangements

This arrangement consists of more than one transistor connected in parallel. For a SOnF to appear between the two terminals of this arrangement, at least one transistor must present a SOnF. This occurs because the stuck-on fault is only propagated to the output if there is a path able to propagate the signal between the terminals. For example, considering Fig. 5,
the error is propagated if transistor T1 or T2 is Stuck-On, creating a conductive path.

Fig. 5. Parallel arrangement of transistors

To compute the error probability in this transistor arrangement, consider that, for each transistor t_j with j = 1…n, where n is the number of transistors in the parallel arrangement, the error probability P(o) is modeled as the fault probability in t_j or t_{j+1} through t_n, as shown in Eq. (4). The probability of a transistor failing is given by P(t_j). The statistical operation that represents the or relation is the union, as presented in Eq. (5). Eq. (6) shows the error probability of the parallel arrangement presented in Fig. 5.

$P(o) = P(t_1)\ \text{or}\ P(t_2)\ \text{or}\ \dots\ \text{or}\ P(t_n)$   (4)

$P(o) = P(t_1) \cup P(t_2) \cup \dots \cup P(t_n)$   (5)

$P(o) = P(T_1) \cup P(T_2)$   (6)

3) Serial/Parallel arrangements

In this arrangement, the methodology needs to simplify the graph in order to obtain the error probability of the arrangement. Consider Fig. 6(a) as an example. In this case, the first simplification occurs in the parallel arrangements and the circuit becomes as in Fig. 6(b). In the arrangement shown in Fig. 6(b), it is only necessary to compute the reliability of the transistors in series, as explained before.

Fig. 6. Serial and parallel arrangement of transistors (a); Parallel simplification in transistor arrangement (b).

4) NAND2 Case Study

The traditional PTM of a NAND2 is independent of the transistor arrangement and is obtained considering only the NAND2 function. The traditional PTM is shown in Fig. 7(b). To exemplify the proposed methodology, consider the NAND2 gate described in Fig. 7(a). Unlike the traditional PTM, the PTM considering SOnF and input dependence is related to the transistor arrangement, which results in the new PTM shown in Fig. 7(c). It is possible to note that, for each input combination, there is a different value of q due to the transistor arrangement in the circuit, which causes a different possibility of propagating a wrong output value. The error propagation condition is shown in Fig. 7(d) for each input combination.

(b) Traditional PTM:
Input   Out=0   Out=1
00      p       q
01      p       q
10      p       q
11      q       p

(c) PTM considering SOnF and input dependence: rows 00, 01, 10 and 11 use the input-dependent error probabilities defined in (d).

(d) Probability of error propagation:
p00 = P(T3) AND P(T4)
p01 = P(T3)
p10 = P(T4)
p11 = P(T1) OR P(T2)

Fig. 7. NAND2 gate schematic (a); Traditional PTM (b); PTM considering SOnF and input dependence (c); Probability of error propagation (d).

B. PTM considering SOnF model

To validate the proposed method and allow its adoption, this work applied the traditional PTM method and the proposed method to a set of the most frequently used combinational standard cells from a commercial library.

The library with all the information about SOnF for a set of combinational logic gates is generated using an automatic generator of PTMs considering SOnF and input dependence. The entire process to compute the reliability of a gate under Stuck-On faults is described in Algorithm 1. In the rest of this work, the PTM generated considering the input influence is called PTM modified. The algorithm inputs are the circuit description (C), the list of inputs (I) and the reliability value of the technology (q). The algorithm output is the PTM modified of the gate.

The algorithm starts by organizing the circuit description into the two graphs (G0 and G1) adopted in the methodology (line 1). Then, the list of inputs (I) is used to compute the number of input combinations of the circuit (line 2). For each combination in I, the graph is simplified by eliminating the transistors in the conductive state (line 3). After that (line 4), the serial and parallel arrangements of transistors are computed based on the graph information in order to obtain the total reliability of the graph. After establishing all arrangements and computing the whole equation representing the reliability for the input combination X_i, the circuit reliability is computed and the PTM modified is generated.

Algorithm: PTM_Creator (C, I, q)
Input data:
    I  // List of Inputs
    C  // Circuit description Netlist
    q  // Gate Reliability
1   G = create_graphs(C);
2   For each input combination {
3       G = graph_simplification(I);
4       reliability_computation(G, q); }
Output data:
    PTM matrix of circuit C
Algorithm 1. Methodology to create the PTM of a gate.

The main step in the Algorithm is in line 4. In this step, the methodology creates the PTM considering the reliability of the
arrangement found in the graph analyzed. This work uses graph simplification to produce the equation to be solved by combining the serial and parallel arrangements. This method searches for parallel arrangements and simplifies the transistors in parallel, computing their reliability. After removing the parallel arrangements, the next step is to simplify the serial arrangements that can be simplified. This process is repeated while there are removable nodes in the graph. At the end of this process, the PTM modified is finished and available to the user.

IV. EXPERIMENTAL RESULTS

This section presents the experimental results separated into three parts. The first one shows the error probability computed for all single-stage logic gates from the Nangate FreePDK45 Open Cell Library [12]. The second part explores logic gates whose reliability equals the reliability of the technology used. This value could be misunderstood as meaning that the result is equal to the traditional PTM approach, which considers the same value of q for all input vectors. A deeper analysis examines this coincidence and shows the difference for several input vectors. Finally, the constructed library of logic gate PTMs is applied to two different circuits to compare our method with the traditional PTM values used in related works.

A. Standard cell modified PTM

Fig. 8 shows the error probability obtained with the proposed method (red columns) and the error probability considering the traditional PTM (blue line), using an arbitrary technology reliability q = 0.99 for the evaluated logic gates. The gates are organized by number of transistors, from left to right in the figure, starting with the 2 transistors of the INV up to the 12 transistors of the OAI33. In Fig. 8 it is possible to notice that the complementary gates, i.e., AOI-OAI and NAND-NOR, have the same error probability. Another observation is that NAND2, NOR2, INV, AOI211 and OAI211 have the same error probability in both methods. On the other hand, NAND4 and NOR4 present a smaller error probability with the proposed method than with the traditional method. This occurs because they have more transistors in series, and this kind of arrangement makes the error probability smaller.

The main difference in gate reliability when considering input dependence is that, when there are transistors in series in many of the input combinations, the reliability is increased. On the other hand, circuits with many parallel arrangements present a smaller reliability because this arrangement is extremely sensitive to a Stuck-On fault, which propagates if any of the transistors presents a fault. In NAND and NOR arrangements, only one network presents a parallel arrangement. AOI and OAI gates can present parallel arrangements in both networks, which reduces the gate reliability.

Analyzing the gates in Fig. 8, it is possible to notice that some gates have the same error probability in both methods. To explore these gates, consider the NAND2 gate. Fig. 7(a) presents the transistor arrangement, and Fig. 7(c) and Fig. 7(d) present the difference between the PTM considering input vector dependence and the traditional PTM. The same arbitrary reliability value q=0.99 is applied, and Fig. 9(a) and Fig. 9(c) show the values computed for the NAND2 gate for the PTM with input dependence and the traditional PTM, respectively. Comparing each row of the PTMs, it is possible to notice that the rows '00' and '11' present different reliabilities. However, when the gate reliability is computed (PTM*ITM), the value is the same and equal to 0.99. The same comparison is done for an AOI21 in Fig. 9(b) for the PTM modified and Fig. 9(d) for the traditional PTM. In AOI21, only three of the eight reliability values are the same as in the traditional PTM. This difference does not cause a difference in logic gate reliability but, as explored in the next section, it is observed in circuit analysis. The PTM modified of other gates is not presented in the figure because they have more input combinations and it would be impractical due to space limitations.

(a) NAND2, PTM considering SOnF
Input   Out=0    Out=1
00      0.0001   0.9999
01      0.0100   0.9900
10      0.0100   0.9900
11      0.9801   0.0199

(b) AOI21, PTM considering SOnF
Input   Out=0      Out=1
000     0.010099   0.989901
001     0.019900   0.980100
010     0.019900   0.980100
011     0.980100   0.019900
100     0.990000   0.010000
101     0.990000   0.010000
110     0.990000   0.010000
111     0.999801   0.000199

(c) NAND2, traditional PTM
Input   Out=0    Out=1
00      0.0100   0.9900
01      0.0100   0.9900
10      0.0100   0.9900
11      0.9900   0.0100

(d) AOI21, traditional PTM
Input   Out=0      Out=1
000     0.010000   0.990000
001     0.010000   0.990000
010     0.010000   0.990000
011     0.990000   0.010000
100     0.990000   0.010000
101     0.990000   0.010000
110     0.990000   0.010000
111     0.990000   0.010000

Fig. 9. NAND2 gate PTM considering SOnF (a); AOI21 PTM considering SOnF (b); NAND2 traditional PTM (c); AOI21 traditional PTM (d).
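The per-input values of Fig. 9(a) and Fig. 9(b) can be reproduced with a minimal sketch of the series/parallel model of Section III. The sketch assumes that stuck-on faults in different transistors are independent, each with probability p = 1 - q, so that the AND relation becomes a product and the OR relation becomes one minus the product of the complements. The network encoding, the function names and the AOI21 input order (A, B1, B2, with output NOT(A OR (B1 AND B2))) are assumptions made for this illustration and are not taken from the original tool.

```python
from itertools import product
from math import prod

# Hypothetical network encoding used only in this sketch:
#   ("n", "A")  NMOS driven by input A (conducts when A == 1)
#   ("p", "A")  PMOS driven by input A (conducts when A == 0)
#   ("series", [...]) / ("parallel", [...])  combinations of sub-networks


def conduct_prob(net, inputs, p):
    """Probability that `net` conducts when every transistor that is off
    is stuck-on with independent probability p."""
    kind = net[0]
    if kind in ("n", "p"):
        level = inputs[net[1]]
        is_on = (level == 1) if kind == "n" else (level == 0)
        # A transistor that already conducts contributes 1.0, which is
        # equivalent to removing it from the simplified graph.
        return 1.0 if is_on else p
    parts = [conduct_prob(sub, inputs, p) for sub in net[1]]
    if kind == "series":      # all transistors must conduct (AND)
        return prod(parts)
    if kind == "parallel":    # at least one must conduct (OR)
        return 1.0 - prod(1.0 - x for x in parts)
    raise ValueError(f"unknown network kind: {kind!r}")


def sonf_errors(names, pull_up, pull_down, q):
    """Map each input vector to (expected output, SOnF error probability)."""
    p = 1.0 - q
    errors = {}
    for bits in product((0, 1), repeat=len(names)):
        inputs = dict(zip(names, bits))
        # Fault-free behavior (p = 0): output is 1 iff the pull-up conducts.
        expected = 1 if conduct_prob(pull_up, inputs, 0.0) == 1.0 else 0
        # Only the network opposing the expected output can cause an error.
        opposing = pull_down if expected == 1 else pull_up
        key = "".join(map(str, bits))
        errors[key] = (expected, conduct_prob(opposing, inputs, p))
    return errors


def to_ptm(errors):
    """Rows are (P(out=0), P(out=1)) for each input vector."""
    return {k: ((e, 1.0 - e) if out == 1 else (1.0 - e, e))
            for k, (out, e) in errors.items()}


def gate_reliability(errors):
    """Mean probability of a correct output over equiprobable inputs."""
    return sum(1.0 - e for _, e in errors.values()) / len(errors)


NAND2_UP = ("parallel", [("p", "A"), ("p", "B")])
NAND2_DOWN = ("series", [("n", "A"), ("n", "B")])
AOI21_UP = ("series", [("p", "A"), ("parallel", [("p", "B1"), ("p", "B2")])])
AOI21_DOWN = ("parallel", [("n", "A"), ("series", [("n", "B1"), ("n", "B2")])])

for label, names, up, down in [("NAND2", ["A", "B"], NAND2_UP, NAND2_DOWN),
                               ("AOI21", ["A", "B1", "B2"], AOI21_UP, AOI21_DOWN)]:
    errs = sonf_errors(names, up, down, q=0.99)
    print(label)
    for key, (p_out0, p_out1) in to_ptm(errs).items():
        print(f"  {key}  {p_out0:.6f}  {p_out1:.6f}")
```

Under these assumptions, `gate_reliability` applied to the NAND2 errors averages to 0.99 even though rows '00' and '11' differ from the traditional PTM, which is the coincidence discussed in the text.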
traditional PTM and from proposed method. This choice will allow the isolation of the input vector influence on the circuit reliability.

The C17 circuit presented in Fig. 10 is composed of six NAND2 gates. In this analysis, all input vectors are considered to have the same probability of occurrence. The interesting point is that NAND2 presents the same error probability in both methods, but a different error probability for some input vectors. The difference in circuit error probability between the two methods is up to 15% in C17. This is the consequence of the difference in the reliability of some input vectors of the NAND2 gate. This difference is propagated through the PTM operations used to compute the circuit reliability.

Fig. 10. ISCAS C17 circuit using NAND2 gates.

The multiplexer shown in Fig. 11 is composed of NAND2, NOR2, and Inverter gates. As the circuit is formed by cells with the same error probability in both methods, the difference in circuit error probability arises because the gates present different error probabilities for some input vectors, and this difference produces a different error probability when the PTM operations are applied to the circuit arrangement.

Fig. 11. Multiplexer circuit.

The results obtained for both circuits are presented in Table I. A different technology reliability, q=0.95, is also used to complement the analysis and verify whether the difference between the two methods remains the same [9].

Considering q=0.99, the difference in error probability for circuit C17 between the traditional PTM and the PTM modified is 18%. This means that the PTM modified increases the error probability by 18%. The multiplexer error probability using the PTM modified is 13% higher than with the traditional PTM values.

Considering q=0.95, the error probability of circuit C17 using the PTM modified is 23% higher. For the multiplexer, the error probability using the PTM modified is 10% higher than with the traditional PTM values.

TABLE I. CIRCUITS ERROR PROBABILITY CONSIDERING TRADITIONAL PTM AND PTM WITH INPUT VECTORS DEPENDENCE

              Error probability using      Error probability using
              Traditional PTM (%)          PTM modified (%)
Circuit       q=0.99      q=0.95           q=0.99      q=0.95
C17           4.80        21.61            5.71        24.97
Multiplexer   5.13        21.39            5.81        23.54

V. FINAL REMARKS

Circuit reliability estimation is an important aspect that has to be considered in modern circuit design to avoid the use of fault tolerance techniques that can reduce the gains achieved by technology scaling. This work presents a method to compute logic gate reliability using a probabilistic model for stuck-on faults that explores the transistor arrangement and the input vector influence to compute the fault probability for each input combination. The results show that considering the same error probability for all input vectors underestimates the input influence on the overall circuit reliability. The proposed method can provide more accurate results in circuit reliability analysis.

ACKNOWLEDGMENT

This work is supported by Coordenação de Aperfeiçoamento de Pessoal de Ensino Superior – CAPES – Brazil.

REFERENCES

[1] Borkar, S. et al. "Design and reliability challenges in nanometer technologies". DAC 2004. pp. 75.
[2] Fang, L.; Hsiao, M. S. Bilateral testing of nano-scale fault-tolerant circuits. Journal of Electronic Testing, Springer, v. 24, n. 1-3, p. 285–296, 2008.
[3] Vial, J. et al. Using TMR architectures for yield improvement. DFTVS, 2008. p. 7–15.
[4] Naviner, L. A. et al. Efficient computation of logic circuits reliability based on probabilistic transfer matrix. DTIS 2008.
[5] Beg, Azam, and Walid Ibrahim. "On teaching circuit reliability." FIE 2008. 38th Annual. IEEE, 2008.
[6] Ketan N. Patel, Igor L. Markov, and John P. Hayes. Evaluating circuit reliability under probabilistic gate-level fault models. IWLS 2003.
[7] S. Krishnaswamy, G.F. Viamontes, I.L. Markov, and J.P. Hayes. Accurate reliability evaluation and enhancement via probabilistic transfer matrices. DATE, 2005.
[8] Xiao, J., et al. A method of gate-level circuit reliability estimation based on iterative PTM model. IEEE PRDC, 2011.
[9] Singh, N. S. S., et al. "Sensitivity analysis of probability transfer matrix (PTM) on same functionality circuit architectures." (CSPA), 2012 IEEE 8th International Colloquium on. IEEE, 2012.
[10] Grandhi, S., Spagnol, C., & Popovici, E. Reliability analysis of logic circuits using probabilistic techniques. PRIME. 2014.
[11] Krishnaswamy, Smita, et al. "Probabilistic transfer matrices in symbolic reliability analysis of logic circuits." TODAES. 2008.
[12] NanGate FreePDK45 Open Cell Library, available at: http://nangate.com