Burn-In
Henry W. Block and Thomas H. Savits
1. BACKGROUND AND SIMPLE EXAMPLES

Many manufacturers and users of electronic components and systems, as a matter of course, subject these systems and/or components to initial testing for a fixed period of time under conditions which range from typical to those which approximate a worst-case scenario. A typical regimen is to introduce for a period of time some vibration and temperature elevation for a device. In a particular context this is sometimes known as "shake and bake." At the end of this period, those components and/or systems which do not survive this period of testing may be discarded (scrapped), analyzed for defects and/or repaired. Those which survive this period may be sold, placed into service or subjected to further testing. Although these procedures have a variety of names depending on the area of application, we use the term burn-in to describe them all. The time period will be called the burn-in period. We illustrate some of these ideas with the following three examples.

Example 1. Rawicz (1986) considers 30-watt long-life lamps manufactured by the Pacific Lamp Corporation (Vancouver, Canada) which were designed "for 5,000 hours of constant work in severe environmental conditions at 120 V." These are installed on billboards where it is difficult and expensive to replace them. It turns out that a certain small percentage of these lamps tend not to last the requisite 5,000 hours but fail relatively early. Obviously it would be beneficial if this subpopulation of lamps could be identified and eliminated before being placed on a billboard. The procedure recommended involves stressing all of the lamps at a high voltage (240 V) for a short period of time, which causes the weak lamps to fail rather quickly while the stronger lamps do not fail during this period. The lamps which do not fail are the lamps potentially capable of surviving the 5,000 hours of constant work. Often the burn-in weakens the surviving devices. In this particular application, however, the surprising result is that the surviving lamps are actually improved. This was thought to occur since the high thermal treatment seemed to relax structural stresses caused by the fabrication process.

Example 2. In the AT&T Reliability Manual (Klinger, Nakada and Menendez, 1990) an electronics switching system (the 5ESS Switch) is discussed. Immediately after manufacture this system is operated at room temperature (25°C) for 12 hours, during which "volume-call" testing is performed; that is, 1,000 calls are simulated and passed through each of the five to eight modules of the switch. The system is then subjected for up to 48 hours to the high temperature (50°C) which can occur within the switch if the air conditioning should fail. The first part of this procedure is to find and eliminate early system failures, and the second part simulates use in an extreme case which might occur. The objective of the second part is to accelerate aging, so that weak systems fail. It also provides data which can be used to see how this equipment compares to certain standards set for it.

Example 3. Jensen and Petersen (1982) consider a piece of measuring equipment made up of approximately 4,000 components. They focus on several critical types of these components. One of these, called an IC-memory circuit, accounts for 35 of the 4,000 components. The bimodal Weibull distribution (i.e., a mixture of two Weibulls) is used to model this type of component and has the following survival function:

    F̄(t) = p exp[−(t/η1)^β1] + (1 − p) exp[−(t/η2)^β2].

From the data, the values p = 0.015, β1 = 0.25, η1 = 30, β2 = 1 and η2 = 10 have been determined, but an explicit method is not given.

We illustrate the results of Block, Mi and Savits (1993) (which is discussed in Section 4.2) to obtain the optimal burn-in time for a reasonable cost function (we use CF1 of Section 3.2 in this example). Assume that we would like to plan a burn-in for components of this type so that those surviving burn-in should function for a mission time of τ = 60 units. If a circuit fails before the end of burn-in a cost c0 = q0·C, where 0 < q0 < 1, is incurred. If it fails after burn-in but before the mission time is over, a cost of C is incurred. If an item survives burn-in and the mission time, a gain of K = kC is obtained. For illustrative purposes, we choose q0 = 0.5 and k = 0.05.

We apply Theorem 2.1 of Block, Mi and Savits (1993). Let f be the density of the bimodal Weibull given above. It is not hard to show that g(t) = f(t + τ)/f(t) is increasing in t (either directly or by standard results) and goes from 0 (as t → 0) to 1 (as t → ∞). By the cited results an optimal burn-in time 0 < b* < ∞ exists and satisfies

    g(b*) = (C − c0)/(C + K).

For the values above we obtain the equation g(b*) = 0.476, and solving graphically yields b* = 102.9.
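The defining equation can also be solved numerically rather than graphically. The short sketch below (not part of the original article) codes the bimodal Weibull density with the parameter values above and finds the root of g(b) = 0.476 with a standard root finder; the bracketing interval is an illustrative choice.

```python
# Sketch: numerically recover b* for Example 3 (p = 0.015, beta1 = 0.25,
# eta1 = 30, beta2 = 1, eta2 = 10, tau = 60, q0 = 0.5, k = 0.05).
import numpy as np
from scipy.optimize import brentq

p, b1, e1, b2, e2, tau = 0.015, 0.25, 30.0, 1.0, 10.0, 60.0

def density(t):
    """Density of the bimodal Weibull mixture."""
    f1 = (b1 / e1) * (t / e1) ** (b1 - 1) * np.exp(-(t / e1) ** b1)
    f2 = (b2 / e2) * (t / e2) ** (b2 - 1) * np.exp(-(t / e2) ** b2)
    return p * f1 + (1 - p) * f2

g = lambda t: density(t + tau) / density(t)   # g(t) = f(t + tau)/f(t)
target = (1.0 - 0.5) / (1.0 + 0.05)           # (C - c0)/(C + K) = 0.476...

b_star = brentq(lambda t: g(t) - target, 1.0, 500.0)  # bracket is illustrative
print(round(b_star, 1))   # approximately 102.9, the value quoted above
```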
Even though we present Example 2 as an example of burn-in, in the AT&T Reliability Manual (Klinger, Nakada and Menendez, 1990), Example 2 is called a system reliability audit. Other terms which are often used are "screen" and "environmental stress screening" (ESS). The AT&T Manual (Klinger, Nakada and Menendez, 1990, page 52) defines a screen to be an application of some stress to 100% of the product to remove (or reduce the number of) defective or potentially defective units. Fuqua (1987, pages 11 and 44) concurs with the 100% but states that this may be an inspection and stress is not required. Fuqua (1987, page 11) describes ESS as a series of tests conducted under environmental stresses to disclose latent part and workmanship defects. Nelson (1990, page 39) is more specific and describes ESS as involving accelerated testing under a combination of random vibration and thermal cycling and shock.

Burn-in is described by the AT&T Manual (Klinger, Nakada and Menendez, 1990, page 52) as one effective method of screening (implying 100%) using two types of stress (temperature and electric field). Nelson (1990, page 43) describes burn-in as running units under design or accelerated conditions for a suitable length of time. Tobias and Trindade (1995, page 297) restrict burn-in to high stress only and require that it be done prior to shipment. Bergman (1985, page 15) defines burn-in in a more general way as a pre-usage operation of components performed in order to screen out the substandard components, often in a severe environment. Jensen and Petersen (1982) have more or less the same definition as Bergman.

For the purposes of this paper we use the term burn-in in a general way, similar to the usage of Jensen and Petersen (1982) and of Bergman (1985). We think of it as some pre-usage operation which involves usage under normal or stressed conditions. It can involve either 100% of the product or some smaller subgroup (especially in the case of complex systems as in Example 2) and it is not limited to eliminating weak components.

Many of the traditional engineering ideas concerning burn-in are discussed in the handbook of Jensen and Petersen (1982). This book is intended as a handbook for small or moderate-size electronics firms in order to develop a burn-in program. Consequently the book should be viewed in this spirit. Emphasis is on easy-to-apply methods and on graphical techniques. One important contribution of the book is to popularize the idea that components and systems to which burn-in is applied have lifetimes which can be modeled as mixtures of statistical distributions. Specifically, components come from either "freak" or "main" populations and their lifetimes can be modeled as mixtures of Weibull distributions. Systems are assumed to inherit this dichotomous behavior, but the weaker population is called an "infant mortality" population. This population arises partly because of defects introduced by the manufacturing process.

Most reliability books familiar to the statistics community do not discuss burn-in. We mention three applied reliability books which discuss this topic. The first of these is the book by Tobias and Trindade (1995), which has a section on burn-in covering some basics. An engineering reliability book by Fuqua (1987) delineates the uses of burn-in (see Section 2.4 and Chapter 14) for electronic systems at the component, module (intermediate between component and system) and system level. Most useful is the AT&T Reliability Manual (Klinger, Nakada and Menendez, 1990), which discusses a particular burn-in distribution used at AT&T along with a variety of burn-in procedures and several examples of burn-in. Two papers which review the engineering literature on burn-in are Kuo and Kuo (1983) and Leemis and Beneke (1990).

2. BURN-IN DISTRIBUTIONS

For which components or systems is burn-in effective? Another way of posing this question is by asking, "For which distributions (which model the lifetimes of components or systems) is burn-in effective?" First, it seems reasonable to rule out classes of distributions which model wearout. The reason for this is that objects which become more prone to failure throughout their life will not benefit from burn-in, since burn-in stochastically weakens the residual lifetime. Consequently, distributions which have increasing failure rate or other similar aging properties are generally not candidates for burn-in.

For burn-in to be effective, lifetimes should have high failure rates initially and then improve. Since those items which survive burn-in have the same failure rate as the original, but shifted to the left, burn-in, in effect, eliminates that part of the lifetime where there is a high initial chance of failure. The class of lifetimes having bathtub-shaped failure rates has this property. For this type of distribution the failure rate starts high (the infancy period), then decreases to approximately a constant (the middle life) and then increases as it wears out (old age). As suggested by the parenthetical remarks, this distribution is thought to describe human life and other biological lifetimes. Certain other mechanical and electronic lifetimes also can be approximated by these distributions.
This type of distribution would seem to be appropriate for burn-in, since burn-in eliminates the high-failure infancy period, leaving a lifetime which begins near its former middle life (see Figure 1).

It turns out that there are reasons why many systems and components have bathtub-shaped failure rates. As described by Jensen and Petersen (1982), many industrial populations are heterogeneous and there are only a small number of different subpopulations. Although members of these subpopulations do not, strictly speaking, have bathtub-shaped failure rates, sampling from them produces a mixture of these subpopulations and these mixtures often have bathtub-shaped failure rates. For a simple example, assume that there are two subpopulations of components each of which is exponential, one with a small mean and one with a large mean. Sampling produces a distribution with decreasing failure rate which is a special case of the bathtub failure rate. An intuitive explanation of why this occurs is easy to give. Initially the higher failure rate of the weaker subpopulation dominates until this subpopulation dies out. After that, the lower failure rate of the stronger subpopulation takes over so that the failure rate decreases from the higher to the lower level. This type of idea, about the eventual domination of the strongest subpopulation, carries through for very general mixtures. See Block, Mi and Savits (1993, Section 4). A subjectivist explanation of the fact that mixing exponentials produces a decreasing failure rate distribution was given by Barlow (1985), who argued that even though a model may be exponential, information may change our opinion about the failure rate.

The mixture of two exponentials mentioned above produces a special case of the bathtub failure rate where no wearout is evident. Models of this type with no wearout are thought to be sufficient for modeling the lifetimes of certain electronic components, since these components tend to become obsolete before they wear out. Mixing two distributions which are more complex than exponentials yields distributions with more typical bathtub-shaped failure rates, as can be seen in the following example. A typical bathtub curve is given in Figure 8.2 of Tobias and Trindade (1995, page 238), which we reproduce in Figure 1.

[Fig. 1. Burn-in improvement example (K = 1,000 hours; PPM/K = parts per million per 1,000 hours).]

This distribution is realized as a mixture of a lognormal and a Weibull distribution (both of which are used to model defectives) and another distribution (which models the population of normal devices),

    F(t) = 0.0028 Φ(ln(t/2,700)/0.8)
         + 0.001 [1 − exp(−(t/400)^0.5)]
         + 0.997 [1 − exp(−10⁻⁷ t)(1 − Φ(ln(t/975,000)/0.8))],

where Φ is the standard normal cdf.
Notice that the left tail of the distribution is very steep. This tail represents the period where many failures occur. Burn-in is utilized in order to remove this part of the tail. The dotted line represents the resulting distribution after a burn-in of several hours at an accelerated temperature. The point at which the curve flattens out and stops decreasing is at about 20K. This is called the first change point.

Many papers have appeared in the statistical literature providing models and formulas for bathtub-shaped failure rates. See Rajarshi and Rajarshi (1988) for a review of this topic and many references. One easy way of obtaining some of these is by mixing standard life distributions such as the exponential, gamma and Weibull. See Vaupel and Yashin (1985) for some illustrations of various distributions or Mi (1991) for an example of a simple mixture of gammas which has a bathtub-shaped failure rate. The AT&T Reliability Manual (Klinger, Nakada and Menendez, 1990) gives another model (called the AT&T model) for the failure rate of an electronics component. The early part of the failure rate is modeled by a Weibull with decreasing failure rate, and the latter part is modeled by an exponential (i.e., constant). It does not have a part describing wearout since the manual claims that the AT&T electronic equipment tends not to wear out before it is replaced. The AT&T model has been used extensively by Kuo and various co-authors (e.g., see Chien and Kuo, 1992) to study optimal burn-in for integrated circuit systems. This model is also called the Weibull–exponential model in the statistical literature (e.g., see Boukai, 1987).

Since mixtures are emphasized in this review we point out one apparent anomaly mentioned by Gurland and Sethuraman (1994). In that paper it is observed that when even strongly increasing failure rate distributions are mixed with certain other distributions, their failure rate tends to decrease after a certain point. This is not surprising in the light of the previously mentioned result of Block, Mi and Savits (1993), which gives that asymptotically the failure rate of a mixture tends to the asymptotic failure rate of the strongest component of the mixture. Since the failure rate of the strongest component is the smallest, the failure rate of the mixture is often eventually decreasing to this smallest value.

Most definitions of bathtub-shaped failure rates assume the failure rate decreases to some change point t1, then remains constant to a second change point t2, then increases. The case t1 = t2 (i.e., no constant portion) is often adequate as an assumption in some theoretical results. We give the definition below.

Definition 1. A random lifetime X with distribution function F(t), survival function F̄(t) = 1 − F(t), density f(t) and failure rate r(t) = f(t)/F̄(t) is said to have a bathtub-shaped failure rate if there exist points 0 ≤ t1 ≤ t2 ≤ ∞, called change points, such that r(t) is

    decreasing for 0 ≤ t < t1,
    constant for t1 ≤ t < t2,
    increasing for t2 ≤ t < ∞.

We have restricted the above definition to continuous lifetimes, but discrete lifetimes can be handled similarly (see Mi, 1993, 1994c). Further, we often shorten the phrase bathtub-shaped failure rate to bathtub failure rate or even bathtub distribution. A bathtub curve is called degenerate if either the decreasing or increasing part is not present (i.e., it is either always increasing or always decreasing).

3. OPTIMAL BURN-IN

In this section we consider some basic criteria for determining the optimal burn-in time for a lifetime. In general, we consider lifetimes with a bathtub-shaped failure rate having change points t1 and t2 (see Definition 1). As exemplified in Figure 1, burn-in often takes place at or before the first change point t1. In fact, in the following, various optimality criteria lead to such a burn-in time. In Section 3.1 we focus on performance-based criteria. The more realistic situation involving cost structures is considered in Section 3.2, and these are based in part on the criteria of Section 3.1.

3.1 Performance-Based Criteria

In this section we consider performance-based criteria in which the more general assumption of a cost structure is not made. Many of these criteria are basic concepts which can and should be incorporated into a general cost structure. Cost structures are considered in Section 3.2.

The paper of Watson and Wells (1961) was one of the first statistical papers to study the question of burn-in. These authors were interested in conditions under which the mean residual life (after burn-in) was larger than the original mean lifetime. Maximizing the mean residual life is one of the criteria we examine in this section. We now list several criteria for determining burn-in. Criteria C1, C2 and C4 deal with only one component. Criterion C3 deals with components which are replaced at failure with other identical components.
C1. Let τ be a fixed mission time and let F̄ be the survival function of a lifetime. Find b which maximizes F̄(b + τ)/F̄(b), that is, find b such that, given survival to time b, the probability of completing the mission is as large as possible.

C2. Let X be a lifetime. Find the burn-in time b which maximizes E[X − b | X > b], that is, find the burn-in time which gives the largest mean residual life.

C3. Let {N_b(t), t ≥ 0} be a renewal process of lifetimes which are burned in for b units of time (i.e., where F is the original lifetime distribution and the interarrival distribution has survival function F̄_b(t) = F̄(b + t)/F̄(b)). For fixed mission time τ, find b which minimizes E[N_b(τ)], which is the mean number of burned-in components which fail during the mission time τ.

The next criterion involves the α-percentile residual life function. The α-percentile residual life is defined by

    q_α(b) = F_b⁻¹(α) = inf{x ≥ 0 : F̄_b(x) ≤ 1 − α}

(see Joe and Proschan, 1984, for further details).

C4. For a fixed α, 0 < α < 1, find the burn-in time b which maximizes τ = q_α(b), that is, find the burn-in time which gives the maximal warranty period τ for which at most a fraction α of items will fail.

Criterion C2 has been studied by several authors. The first of these, Watson and Wells (1961), examined various parametric distributions. Lawrence (1966) obtained bounds on the mean residual life. Park (1985) gave some results on the mean residual life for a bathtub distribution. One result was that the optimal burn-in time b* occurs before the first change point t1. Mi (1994b) obtained the same result for criteria C1 and C3, that is, b* ≤ t1. Launer (1993) introduced criterion C4 and also showed that the optimal b* occurs before t1. This type of result is important since it provides an upper bound for burn-in.

The fact that optimal burn-in for a bathtub distribution takes place before the first change point is not unusual. In fact, it is intuitive that burn-in should occur before this change point since this is where the failure rate of such a lifetime stops improving. We shall see in Section 3.2 that the result also holds true for many cost structures.

In another direction, Mi (1994b) compared optimal burn-in times for two mission times τ1 ≤ τ2 for criterion C1. He showed the intuitive result that b*2 ≤ b*1. An extension to random mission times was also considered.

In criterion C3, a burned-in unit that failed during field use was replaced with another burned-in unit. If instead of replacing this unit, a minimal repair is performed (see Barlow and Proschan, 1965), then the total number of minimal repairs is a nonhomogeneous Poisson process with mean function −ln[F̄(b + τ)/F̄(b)]. Thus if we want to minimize the expected number of minimal repairs in the interval [0, τ], it suffices to maximize the quantity F̄(b + τ)/F̄(b). But this is just criterion C1.

3.2 Cost Functions and Burn-in

Several cost functions have been proposed to deal with burn-in. A discussion of many of these is given in the review papers of Kuo and Kuo (1983) and Leemis and Beneke (1990). Also see Nguyen and Murthy (1982). In this section we discuss a few of the recent models involving cost functions for burn-in. In all cases we are interested in finding the burn-in time which minimizes the cost. Cost functions CF1 and CF4 are used in subsequent sections. In general these cost functions build upon and elaborate the criteria of Section 3.1. Cost function CF1 is basic, while CF2 and CF4 incorporate C2; CF3 uses C1.

CF1. A component or system with lifetime X is burned-in for time b. If it fails to survive b units of time a cost c0 is incurred. If it survives b units of time, then it incurs a second cost C, C > c0, if it does not survive past an additional mission time τ, or it incurs a gain of K if it does survive τ. Consequently, if F is the distribution function of the component or system, the expected cost as a function of b is

    c1(b) = c0 F(b) + C [F(b + τ) − F(b)] − K F̄(b + τ).

CF2. If instead of a mission time after the burn-in we consider a gain proportional to the mean residual life (with proportionality constant K), the expected cost becomes

    c2(b) = c0 F(b) − K [∫_b^∞ F̄(t) dt] / F̄(b).

The next criteria involve costs for in-shop repair. If a device fails burn-in, it is scrapped at a cost cs > 0 and another unit is burned-in. This process is continued until a unit survives burn-in time b. A device which survives burn-in is then put into field use. The cost for burn-in is assumed to be proportional to the time it takes to obtain a unit which survives burn-in, with proportionality constant c0. Mi (1994a) derives the expression for the expected cost as

    k(b) = c0 [∫_0^b F̄(t) dt] / F̄(b) + cs F(b) / F̄(b).
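As an illustration of CF1 (ours, not from the paper), the sketch below minimizes c1(b) on a grid for the bimodal Weibull of Example 3, with C = 1, c0 = 0.5, K = 0.05 and τ = 60 as in Section 1. Since c1′(b) = (C + K)f(b + τ) − (C − c0)f(b), the minimizer is exactly the b* solving g(b*) = (C − c0)/(C + K), so the grid search should land near b* = 102.9.

```python
# Sketch: grid-minimize the CF1 cost for the bimodal Weibull of Example 3.
import numpy as np

p, b1, e1, b2, e2 = 0.015, 0.25, 30.0, 1.0, 10.0
C, c0, K, tau = 1.0, 0.5, 0.05, 60.0

def cdf(t):
    surv = p * np.exp(-(t / e1) ** b1) + (1 - p) * np.exp(-(t / e2) ** b2)
    return 1.0 - surv                      # F(t) = 1 - survival

b = np.linspace(0.0, 300.0, 3001)          # burn-in candidates, step 0.1
cost = c0 * cdf(b) + C * (cdf(b + tau) - cdf(b)) - K * (1.0 - cdf(b + tau))
print(b[np.argmin(cost)])                  # close to b* = 102.9
```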
The complete cost also includes additive field costs, and this is reflected in the following cost functions.

CF3. In this case, after a burned-in item is obtained, a cost of C is incurred if the burned-in device does not survive the mission time τ and a gain of K if it survives the mission. Thus the total cost function is given by

    c3(b) = k(b) + C [F(b + τ) − F(b)] / F̄(b) − K F̄(b + τ) / F̄(b),

where k(b) is as above.

CF4. If instead of a mission time, a gain is taken proportional to the mean residual time, the cost function in CF3 is modified to

    c4(b) = k(b) − K [∫_b^∞ F̄(t) dt] / F̄(b).

The cost function CF1 was introduced by Clarotti and Spizzichino (1990). These authors obtained conditions for an optimal burn-in time b* and applied their results to a mixed exponential model. See also Section 4.2, where an extension of the mixed exponential model to a general mixture model is discussed. The cost function CF2 is a variant of CF1. The cost functions CF3 and CF4 are discussed in Mi (1991, 1995). As in Section 3.1, the respective authors show that the optimal burn-in time b* satisfies b* ≤ t1 for cost functions CF2–CF4, where t1 is the first change point for the assumed bathtub distribution.

4. MIXTURE MODELS

In this section we consider recent mixture models. This is the typical model described in Section 2 to which burn-in is applicable. In both Arjas, Hansen and Thyregod (1991) and Block, Mi and Savits (1993) an underlying mixture distribution is used to model the life of components. The latter paper discusses burn-in applications although the former paper does not.

The paper of Arjas, Hansen and Thyregod (1991) discussed in Section 4.1 is an interesting mix of modeling and estimation and uses ideas and techniques from the reliability theory, life testing (engineering reliability) and survival analysis literature. The methods developed are applied to an example involving printed circuit boards. In Section 4.2 we discuss results of Block, Mi and Savits (1993). A more general mixture model than in Arjas, Hansen and Thyregod (1991) is examined. A recent paper of Spizzichino (1995) discusses another model for mixtures in heterogeneous populations.

4.1 A Reliability Growth Model

Arjas, Hansen and Thyregod (1991) consider a renewal process approach to reliability growth where heterogeneity of the underlying part structure is shown to translate into renewal intensity behavior. Although burn-in per se is not discussed in this paper, the lifetimes discussed are of the type to which burn-in is typically applied. This section also provides a background for Section 4.2, which considers mixed lifetimes.

The basic process involves the lifetimes of parts placed in two or more sockets where, upon failure, a failed part is replaced by a new part of the same type. The first and subsequent lifetimes for one socket are designated by X1, X2, .... These lifetimes are assumed independent. The lifetimes are also assumed to come from a heterogeneous population. It is natural to model these lifetimes using a random hazard rate so that the distribution of the lifetime can be written as a mixed exponential, that is,

    P(X_k > x) = ∫_0^∞ e^(−λx) dφ(λ),

where φ is the distribution of the random hazards. The aim of the paper is to study the renewal process of one socket or the superimposed renewal process of several. As is well known, this mixed distribution has a decreasing failure rate.

If N(t) is the renewal process for one socket, it can be shown that V(t) = E[N(t)] is concave and that the rate of occurrence of failures for the renewal process, v(t) = (d/dt)V(t), is decreasing. Various results for this type of renewal process can be obtained, and comparisons can be made with processes where sockets are minimally repaired rather than replaced. For minimal repair, the associated process is the nonhomogeneous Poisson process. (See Block and Savits, 1995, for many comparisons of this type.)

Parametric estimation is considered by these authors for the bimodal (i.e., mixture of two) exponential case. The bimodal Weibulls (and exponentials) are the principal examples of the Jensen and Petersen (1982) monograph on burn-in. The distribution for the life length of the part is the three-parameter mixture of two exponential distributions with distribution function

    F(x) = π [1 − exp(−λ0 x)] + (1 − π) [1 − exp(−λ1 x)],   x > 0.
It is assumed that inferior parts cannot be distinguished from a standard part. Two cases are considered: (a) the case where sockets are observed individually and (b) the case where sockets are only observed as aggregated data. In case (a), maximum likelihood estimation is straightforward. Right censoring is permitted and the likelihood or log-likelihood function is standard. In case (b), times between failures are not independent and so either (1) an approximation by a corresponding nonhomogeneous Poisson process is used or (2) it is assumed, in the case when the number of failures is less than the number of sockets, that each socket has experienced at most one failure and so the techniques of (a) apply.

An example is given where the system is a printed circuit board consisting of 560 parts (sockets) and there are 3,481 systems from which data was collected for five years. Maximum likelihood estimates were obtained computationally for the three parameters and were used to estimate the cumulative number of occurrences of failures V̄(t) = E[N̄(t)], where N̄(t) is the superimposed renewal process. The model can be assessed graphically by calculating N̄0(t), the counting process obtained as the sum of the individual system processes, and then using the Nelson–Aalen estimate

    V̂_N−A(t) = Σ_{s ≤ t} ΔN̄0(s) / R(s),

where 0 < T01 < T02 < ··· < T0,N̄0(t) < t are all of the failure times, ΔN̄0(s) is 1 for each T0i and R(s) denotes the number of active systems older than s. This yields Figure 2, which compares these two estimates. The step curve comes from V̂_N−A(t) and the smooth curve comes from V̄(t). Confidence bounds are also obtained in this paper using several methods.
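A sketch of the Nelson–Aalen-type computation follows. The data are hypothetical (event_times[i] holds the failure times of system i on its own age scale, and obs_end[i] its follow-up time), and the helper names are ours.

```python
# Sketch: Nelson-Aalen-type estimate of V(t) from superimposed system data.
import numpy as np

event_times = [[0.7, 2.9, 4.1], [1.3], [0.2, 3.8], [2.2, 4.9], []]  # hypothetical
obs_end = np.array([5.0, 5.0, 4.0, 5.0, 3.0])                       # follow-up times

all_events = np.sort(np.concatenate([np.asarray(t) for t in event_times]))

def V_hat(t):
    """Sum of 1/R(s) over failure times s <= t, where R(s) counts the
    systems still under observation at age s."""
    total = 0.0
    for s in all_events[all_events <= t]:
        total += 1.0 / np.sum(obs_end >= s)
    return total

print([round(V_hat(t), 3) for t in (1.0, 2.0, 3.0, 4.0, 5.0)])
```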
4.2 A General Mixture Model

As mentioned in Section 1, one explanation for a bathtub-shaped failure rate that is often given by engineers is that it is due to mixtures of populations, some weak and some strong. In Block, Mi and Savits (1993), a general mixture model was investigated. A goal of that paper was to determine optimal burn-in for the cost function CF1 of Clarotti and Spizzichino (1990). Some results of independent interest, however, were also obtained. They are summarized below.

For the general mixture model, it is assumed that each member of the subpopulation, indexed by λ ∈ S, has a positive density f(t; λ) on (0, ∞). The density of the resulting mixed population is then given by

    (4.1)   f(t) = ∫_S f(t; λ) P(dλ),

where P is the mixing distribution.

The first results concern the monotonicity of the ratio g(t) = f(t + τ)/f(t) for a fixed mission time τ > 0. This is a new type of aging property that seems appropriate for burn-in since it is related to a notion of beneficial aging. More specifically, if we require the ratio f(t + τ)/f(t) to be increasing in t > 0 for each τ > 0, then f must be log-convex and hence belongs to the class of distributions which have a decreasing failure rate. Furthermore, certain bathtub failure rates which can be realized as mixtures have this monotonicity property.

Before we can state this result, we recall the definition of reverse regular of order 2 (RR2). A nonnegative function k(x, y) on A × B is said to be RR2 if

    k(x1, y1) k(x2, y2) ≤ k(x1, y2) k(x2, y1)

whenever x1 < x2 in A and y1 < y2 in B. Alternatively, we require that the ratio

    k(x + Δ, y) / k(x, y)

be decreasing in y ∈ B for each x ∈ A and Δ > 0.

The following is a preservation result for a monotonicity property with a fixed mission time τ. Let the family of positive densities {f(t; λ): λ ∈ S} be RR2 on (0, ∞) × S and let τ > 0 be a fixed mission time. Suppose the ratio

    g(t; λ) = f(t + τ; λ) / f(t; λ)

is increasing in t > 0 for each λ ∈ S. Then, for the mixture density f given in (4.1), the ratio

    g(t) = f(t + τ) / f(t) = [∫_S f(t + τ; λ) P(dλ)] / [∫_S f(t; λ) P(dλ)]

is increasing in t > 0. A more general result that does not require the RR2 condition is given in Block, Mi and Savits (1993, Theorem 3.1).

A second result of interest in the paper of Block, Mi and Savits (1993) pertains to the limiting behavior of the failure rate for the mixed population. Heuristically, it states that the failure rate of the mixture tends to that of the strongest subpopulation. Under certain technical conditions it is shown that the failure rate of the mixed population converges to a constant α as t → ∞. Here α = inf{a(λ): λ ∈ S} and a(λ) = lim_{t→∞} r(t; λ), with r(t; λ) the failure rate of the λ-subpopulation. (The discrete version is considered in Mi, 1994c.)

Clarotti and Spizzichino (1990) also show for the mixture of exponentials model that if one mixing distribution P1 is less than P2 in the sense of likelihood ratio ordering, then the optimal burn-in times b*_i for the cost function CF1 are ordered as b*1 ≤ b*2. The same result also holds for the general mixture model. See Block, Mi and Savits (1993) for details.

5. COMPONENT VERSUS SYSTEM BURN-IN

In this section we deal with the important issue of the stage at which burn-in is most effective. Consider a system composed of individual components. Is it better to burn in all the components or is it better to assemble the components and burn in the system? If there are modules and subassembly systems, similar questions can be asked. The component level is usually the least expensive stage at which to consider burn-in. Assembly of even burned-in components usually introduces defects, so burn-in at higher levels would seem to have some value. In this section we consider some preliminary work in which this question is considered, but under the simplifying assumption that no defects are introduced upon assembly. By a system here we mean a coherent system in the sense of Barlow and Proschan (1981, page 6).

There are three possible actions we want to consider which constitute different methods for burning-in the system:

(i) Burn in component i for a time βi, i = 1, ..., n, and then assemble the system with the burned-in components.
(ii) Burn in component i for a time βi, i = 1, ..., n, assemble the system with the burned-in components and then perform an additional burn-in of the system for a time b.
(iii) Assemble the system with new components and then burn in the system for a time period b.

Since (i) and (iii) are special subcases of (ii), we can do no better than (ii). However, is it possible that we can do just as well with one of the other two actions?

In Block, Mi and Savits (1994, 1995), this question was considered for three different criteria: (1) maximizing the probability that the system will survive a fixed mission time (or warranty period) τ; (2) maximizing the system mean residual life; and (3) maximizing the α-percentile (system) residual life τ = q_α(b) for a fixed α, 0 < α < 1. In each case it was shown that one can do as well with burn-in at the component level only.

This result can be extended to criteria which have a type of monotonicity property. More specifically, the result can be shown to hold for any criterion determined by a functional φ defined on the class of life distributions which is monotone in stochastic order; that is, in the case of maximizing (minimizing) the objective function φ, we require that φ(F) ≤ (≥) φ(G) whenever F ≤st G [i.e., F̄(t) ≤ Ḡ(t) for all t ≥ 0]. Thus, for such a criterion, burn-in at the system level is precluded by effective burn-in at the component level.
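Returning to the limiting-failure-rate result of Block, Mi and Savits (1993) quoted above, it is easy to see numerically in the simplest case. The sketch below (ours, with illustrative rates) computes the failure rate of an equal-weight mixture of two exponentials and watches it decrease from the average rate toward the smallest rate.

```python
# Sketch: the failure rate of a two-point exponential mixture converges to
# that of the strongest subpopulation (here lambda = 0.05).  The mixing
# weight and rates are illustrative choices.
import numpy as np

pi_, lam0, lam1 = 0.5, 1.0, 0.05

def failure_rate(t):
    surv = pi_ * np.exp(-lam0 * t) + (1 - pi_) * np.exp(-lam1 * t)
    dens = pi_ * lam0 * np.exp(-lam0 * t) + (1 - pi_) * lam1 * np.exp(-lam1 * t)
    return dens / surv

for t in (0.0, 1.0, 5.0, 20.0, 100.0):
    print(t, round(failure_rate(t), 4))
# r(0) = 0.525, the average rate; r(t) decreases to 0.05 as t grows.
```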
It should be noted that this result does not apply to the cost function criteria considered in Section 3.2 since they are not monotone in stochastic order. Also, if the act of assembling the components degrades the system, an additional burn-in at the system level might be required. Whitbeck and Leemis (1989) have considered a model for dealing with this problem.

6. TOTAL TIME ON TEST (TTT)

In this section we describe a primarily graphical technique which has been useful in burn-in. One consequence of this technique is in obtaining approximate burn-in times.

In a life test, failure times are observed until all or some portion of the items fail. A way to summarize the behavior is through the total time on test (TTT) statistics. Let 0 = x(0) < x(1) < x(2) < ··· < x(n) be an ordered sample from a continuous lifetime distribution with finite mean. In this section, to avoid technical problems, we assume the distribution function F is strictly increasing on [0, ∞). The TTT statistics are defined by

    T_i = Σ_{j=1}^{i} (n − j + 1)(x(j) − x(j−1))

and

    u_i = T_i / n   for i = 1, ..., n.

Notice that u_n = x̄_n. Moreover, if F_n is the empirical distribution function and F_n⁻¹(x) = inf{t : F_n(t) ≥ x}, then F_n⁻¹(i/n) = x(i) for i = 1, ..., n. Consequently,

    ∫_0^{F_n⁻¹(i/n)} F̄_n(x) dx = u_i,   i = 1, ..., n.
OPTIMAL CONTROL
This suggests a distributional analog called the TTT
transform, traditionally denoted by HF −1
. It is de- A theory of sequential burn-in has been proposed
fined by in Spizzichino (1991), and some extensions of this
Z F−1 t have been initiated by the same author and some
−1 of his colleagues. This extends the previous ma-
HF t = F̄u du:
0 terial which deals with mainly one component or
The scaled TTT transform is given by system, or components which are independent and
−1
HF t −1
HF t identically distributed. The more general situation
φF t = −1
= ; where the components are not assumed indepen-
HF 1 µ
dent is treated by Spizzichino and colleagues, who
where µ is the mean of F. Although these con- assume components are exchangeable. A mixture
cepts were discussed earlier, one of the first sys- model for strong and weak exchangeable compo-
tematic expositions was given in Barlow and Campo nents has been proposed by Spizzichino (1995). We
(1975). give a brief introduction to this work. Several rep-
One of the principle uses of the TTT concept has resentative papers are contained in Barlow, Clarotti
been in obtaining approximate optimal solutions for and Spizzichino (1993).
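The whole TTT procedure for burn-in fits in a few lines. In the sketch below (ours), a sample is drawn from the bimodal Weibull of Example 3, the scaled TTT plot is formed, and M(u) is maximized over the plotted points. The cost constants c0, cs and K are illustrative assumptions; in practice one would read the maximizing point off the plot, as in Figure 3.

```python
# Sketch: TTT-plot estimate of the CF4-optimal burn-in time.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
weak = rng.random(n) < 0.015               # weight of the first subpopulation
x = np.sort(np.where(weak, 30.0 * rng.weibull(0.25, n),
                     10.0 * rng.weibull(1.0, n)))

gaps = np.diff(np.concatenate(([0.0], x)))
T = np.cumsum((n - np.arange(n)) * gaps)   # TTT statistics
phi = T / T[-1]                            # scaled TTT plot ordinates
u = np.arange(1, n + 1) / n

c0, cs, K = 1.0, 2.0, 5.0                  # illustrative cost constants
mu = x.mean()
alpha = (-cs + K * mu) / ((c0 + K) * mu)
M = (alpha - phi[:-1]) / (1.0 - u[:-1])    # drop u = 1 to avoid dividing by 0
i = np.argmax(M)
print(x[i])     # the ordered value x_(i): the estimate of the optimal burn-in
```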
As background we mention a paper of Marcus and Blumenthal (1974), who considered a sequential burn-in procedure. The stopping rule they suggested is as follows: observe failure times and stop when the time between failures exceeds a fixed value. This is reasonable for a lifetime which has a high initial failure rate that becomes smaller. Properties of this rule are studied and tables for its use are given.

In Spizzichino (1991), failure times of n components which are assumed to be exchangeable are observed. One burn-in time is chosen initially and if all components survive it, they are put into field operation. If there is a failure before this time, a new additional burn-in time is chosen (which may depend on the first failure) and the procedure repeats. A cost structure based on the one in Clarotti and Spizzichino (1990), that is, CF1 from Section 3, is given. A sequential burn-in strategy is defined and this is shown to be optimal. A particular case is mentioned where the exchangeable distribution is a mixture of exponentials. This case is further explored in Costantini and Spizzichino (1991), where a strategy is proposed for reducing this to an optimal stopping problem for a two-dimensional Markov process. Further details are given in Costantini and Spizzichino (1990) and in Caramellino and Spizzichino (1996).

A related approach for optimal screening (a type of burn-in) is given in Iovino and Spizzichino (1993). A general unifying model is proposed by Spizzichino (1993). Some very recent research on optimal burn-in of software is given in Barlow, Clarotti and Spizzichino (1994).

8. DISCUSSION AND AREAS FOR DEVELOPMENT

In this review of recent developments in burn-in we have discussed a variety of problems. We recapitulate some of these ideas in this section along with some future research directions.

A basic assumption on a lifetime for which burn-in is appropriate is that it has a bathtub-shaped failure rate. This type of lifetime often arises because a population consists of a mixture of weak and strong subpopulations. One question for which a satisfactory answer has not been determined is for which mixtures the failure rate has a bathtub shape.

As described in Section 3, the intuitive result that burn-in should occur before the first change point of a bathtub failure rate has been demonstrated for a wide variety of criteria and cost functions, but in an ad hoc way. The authors are currently working on a unified result for an even broader class of objective functions.

The handbook of Jensen and Petersen (1982) presents a wide array of graphical and heuristic statistical techniques for burn-in. Many of these are applied to mixtures which model weak and strong components.
At the time the book was written, statistical techniques and procedures for mixtures were less well understood than they are at the present time. It would be useful if many of the intuitively plausible and useful techniques given in this handbook were updated and put on a firmer statistical foundation. One example of this is the paper of Arjas, Hansen and Thyregod (1991) (see Section 4.1), who develop estimation techniques for renewal processes where the underlying distribution is a mixture of exponentials.

The material of Section 7 on sequential burn-in and optimal control appears to be a fruitful area of research. It seems evident that this direction should be expanded and further investigated.

The development of new ideas on burn-in goes hand-in-hand with developments in accelerated life testing. In fact, burn-in is most often accomplished in an accelerated environment. A related topic is degradation, in which instead of the lifetime of a component the emphasis is on a measure of the quality of the component as it wears out. If the environment is accelerated, the question of burn-in in conjunction with this accelerated degradation becomes of interest. For recent developments on accelerated degradation, see Nelson (1990) and Meeker and Escobar (1993).

An area of reliability where burn-in techniques might be applicable, and vice versa, is the topic of software reliability modeling. In this area one problem involves removing errors (bugs) from the software. An assumption which is made is that when bugs are detected and removed no new bugs are introduced. In this case the software is improved since the number of bugs remaining is decreased. Consequently, the rate at which bugs are discovered is decreased. This rate is analogous to the left tail of a failure rate with infant mortality present. Since the time at which the testing should stop is of interest, and this is analogous to the burn-in period for a lifetime of the type discussed in this paper, there should be some transfer between the ideas of both of these fields. To date there have been a few applications of burn-in ideas to finding the time at which to stop testing the software. The paper of Barlow, Clarotti and Spizzichino (1994) has been mentioned. See also Section 6 of Singpurwalla and Wilson (1994), who review the optimal testing of software.

ACKNOWLEDGMENTS

We would like to thank Leon Gleser, Rob Kass and Jie Mi for helpful comments as well as a referee for a careful reading and perceptive comments. David Coit supplied us with the Fuqua reference and some recent papers and we thank him. Discussions with Allan Sampson, Leon Gleser and David Coit were also helpful.

The work of the authors was partially supported by NSA Grant MDA-904-90-H-4036 and by NSF Grant DMS-92-03444.

REFERENCES

Arjas, E., Hansen, C. K. and Thyregod, P. (1991). Estimation of mean cumulative number of failures in a repairable system with mixed exponential component lifetimes. Technometrics 33 1–12.
Barlow, R. E. (1985). A Bayes explanation of an apparent failure rate paradox. IEEE Transactions on Reliability R-34 107–108.
Barlow, R. E. and Campo, R. (1975). Total time on test processes and applications to failure data analysis. In Reliability and Fault Tree Analysis (R. E. Barlow, J. B. Fussell and N. D. Singpurwalla, eds.) 451–481. SIAM, Philadelphia.
Barlow, R. E., Clarotti, C. A. and Spizzichino, F., eds. (1993). Reliability and Decision Making. Chapman and Hall, New York.
Barlow, R. E., Clarotti, C. A. and Spizzichino, F. (1994). Optimal burn-in of software. Unpublished report.
Barlow, R. E. and Proschan, F. (1965). Mathematical Theory of Reliability. Wiley, New York.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing. To Begin With, Silver Spring, MD.
Bergman, B. (1985). On reliability theory and its applications. Scand. J. Statist. 12 1–41.
Bergman, B. and Klefsjo, B. (1985). Burn-in models and TTT transforms. Quality and Reliability International 1 125–130.
Block, H. W., Mi, J. and Savits, T. H. (1993). Burn-in and mixed populations. J. Appl. Probab. 30 692–702.
Block, H. W., Mi, J. and Savits, T. H. (1994). Some results on burn-in. Statist. Sinica 4 525–534.
Block, H. W., Mi, J. and Savits, T. H. (1995). Burn-in at the component and system level. In Lifetime Data: Models in Reliability and Survival Analysis (N. P. Jewell, A. C. Kimber, M.-L. T. Lee and G. A. Whitmore, eds.) 53–58. Kluwer, Dordrecht.
Block, H. W. and Savits, T. H. (1995). Comparisons of maintenance policies. In Stochastic Orders and Their Applications (M. Shaked and J. G. Shanthikumar, eds.) Chap. 15. Academic Press, New York.
Boukai, B. (1987). Bayes sequential procedure for estimation and for determination of burn-in time in a hazard rate model with an unknown change-point parameter. Sequential Anal. 6 37–53.
Caramellino, L. and Spizzichino, F. (1996). WBF properties and statistical monotonicity of the Markov process associated to Schur-constant survival functions. J. Multivariate Anal. 56 153–163.
Chien, W.-T. K. and Kuo, W. (1992). Optimal burn-in simulation on highly integrated circuit systems. IIE Transactions 24(5) 33–43.
Clarotti, C. A. and Spizzichino, F. (1990). Bayes burn-in decision procedures. Probability in the Engineering and Informational Sciences 4 437–445.
Costantini, C. and Spizzichino, F. (1990). Optimal stopping of the burn-in of conditionally exponential components. Unpublished report.
Costantini, C. and Spizzichino, F. (1991). Optimal stopping of life testing: use of stochastic orderings in the case of conditionally exponential lifetimes. In Stochastic Orders and Decision Making Under Risk (K. Mosler and M. Scarsini, eds.) 95–103. IMS, Hayward, CA.
Fuqua, N. B. (1987). Reliability Engineering for Electronic Design. Dekker, New York.
Gurland, J. and Sethuraman, J. (1994). Reversal of increasing failure rates when pooling failure data. Technometrics 36 416–418.
Iovino, M. G. and Spizzichino, F. (1993). A probabilistic approach for an optimal screening problem. J. Ital. Statist. Soc. 2 309–335.
Jensen, F. and Petersen, N. E. (1982). Burn-in. Wiley, New York.
Joe, H. and Proschan, F. (1984). Percentile residual life functions. Oper. Res. 32 668–678.
Klinger, D. J., Nakada, Y. and Menendez, M. A., eds. (1990). AT&T Reliability Manual. Van Nostrand Reinhold, New York.
Kuo, W. and Kuo, Y. (1983). Facing the headaches of early failures: a state-of-the-art review of burn-in decisions. Proceedings of the IEEE 71 1257–1266.
Launer, R. L. (1993). Graphical techniques for analyzing failure data with percentile residual-life functions. IEEE Transactions on Reliability R-42 71–75.
Lawrence, M. J. (1966). An investigation of the burn-in problem. Technometrics 8 61–71.
Leemis, L. M. and Beneke, M. (1990). Burn-in models and methods: a review. IIE Transactions 22 172–180.
Marcus, R. and Blumenthal, S. (1974). A sequential screening procedure. Technometrics 16 229–234.
Meeker, W. Q. and Escobar, L. A. (1993). A review of recent research and current issues in accelerated testing. Internat. Statist. Rev. 61 147–168.
Mi, J. (1991). Burn-in. Ph.D. dissertation, Dept. Mathematics and Statistics, Univ. Pittsburgh.
Mi, J. (1993). Discrete bathtub failure rate and upside down failure rate. Naval Research Logistics 40 361–371.
Mi, J. (1994a). Burn-in and maintenance policies. Adv. in Appl. Probab. 26 207–221.
Mi, J. (1994b). Maximization of a survival probability and its applications. J. Appl. Probab. 31 1026–1033.
Mi, J. (1994c). Limiting behavior of mixtures of discrete lifetime distributions. Technical report, Dept. Statistics, Florida International Univ.
Mi, J. (1995). Minimizing some cost functions related to both burn-in and field use. Oper. Res. To appear.
Nelson, W. (1990). Accelerated Testing. Wiley, New York.
Nguyen, D. G. and Murthy, D. N. P. (1982). Optimal burn-in time to minimize cost for products sold under warranty. IIE Transactions 14 167–174.
Park, K. S. (1985). Effect of burn-in on mean residual life. IEEE Transactions on Reliability R-34 522–523.
Rajarshi, S. and Rajarshi, M. B. (1988). Bathtub distributions: a review. Comm. Statist. 17 2597–2621.
Rawicz, A. H. (1986). Burn-in of incandescent sign lamps. IEEE Transactions on Reliability R-35 375–376.
Singpurwalla, N. and Wilson, S. (1994). Software reliability modeling. Internat. Statist. Rev. 62 289–317.
Spizzichino, F. (1991). Sequential burn-in procedures. J. Statist. Plann. Inference 29 187–197.
Spizzichino, F. (1993). A unifying model for optimal design of life testing and burn-in. In Reliability and Decision Making (R. E. Barlow, C. A. Clarotti and F. Spizzichino, eds.) 189–210. Chapman and Hall, New York.
Spizzichino, F. (1995). A probabilistic model for heterogeneous populations and related burn-in design problems. Scientific note, Dept. Mathematics, Univ. Rome "Guido Castelnuovo."
Tobias, P. A. and Trindade, D. C. (1995). Applied Reliability. Van Nostrand, New York.
Vaupel, J. W. and Yashin, A. I. (1985). Heterogeneity's ruse: some surprising effects of selection on population dynamics. Amer. Statist. 39 176–185.
Watson, G. S. and Wells, W. T. (1961). On the possibility of improving mean useful life of items by eliminating those with short lives. Technometrics 3 281–298.
Whitbeck, C. W. and Leemis, L. M. (1989). Component vs. system burn-in techniques for electronic equipment. IEEE Transactions on Reliability 38 206–209.
written on the topic; unfortunately, that which is known is subject to debate. The result is that BS have adopted a limited view of burn-in and have refrained from a discussion of its foundational issues. Our commentary—actually an article—is written with the hope of filling these gaps and providing an alternative perspective on burn-in; in the sequel we provide some new results on mixtures of distributions that are germane to burn-in.

"Burn-in" is commonly used in engineering reliability, statistical simulation and medical sensitivity testing. In this article we discuss the philosophical underpinnings of burn-in, and make three claims. Our first claim is that the main purpose served by burn-in is psychological, that is, relating to belief. Our second claim is that burn-in is dictated by the interaction between predictive failure rates and utilities. Consequently, burn-in may be performed even if the predictive failure rate is increasing and the utility of the time on test decreasing. An example is the burn-in phase of statistical simulation, which mirrors burn-in testing of engineering components. Our third claim is that the famous "bathtub" curve of reliability and biometry rarely has a physical reality. Rather, as shown in Theorem 2, it is the manifestation of one's uncertainty.

2. INTRODUCTION AND OVERVIEW

2.1 Background

What is "burn-in"? The answer depends on whom you ask: an engineer, a simulator or a survivor (biostatistician). Each explains burn-in differently. Our goal is to argue, using a minimum of mathematics, that there is a unifying theme underlying burn-in and, therefore, that there must be a single answer to the question that is posed.

First, let us see how engineers view burn-in. To an engineer, burn-in is a procedure for eliminating "weak" items from a population (cf. Block, Mi and Savits, 1993). The population is assumed to consist of two homogeneous subpopulations: "weak" and "strong." Burn-in is achieved by testing each item in the population for the burn-in period, and commissioning to service those items that survive the test. The items that fail the test are judged weak.

To a simulator, burn-in is the time phase during which an algorithm, such as a "Gibbs sampler," attains its theoretical convergence (usually the weak convergence of distributions); see, for example, Besag, Green, Higdon and Mengersen (1995). Biostatisticians do not use the term burn-in, but the notion of "sensitivity testing" a new drug for a short period of time parallels the thinking of engineers.

2.2 Misconceptions about Burn-in

There appear to be at least two misconceptions about the engineer's view of burn-in. The first is that items that are judged to have exponential life distributions (or distributions that have an increasing failure rate) should not be subjected to a burn-in (cf. Clarotti and Spizzichino, 1990). The second misconception is that the sole purpose of burn-in is the elimination of weak items from a population.

The causes of the first misconception are a failure to appreciate the role of utility in burn-in and a failure to distinguish between what Barlow (1985) refers to as the "model failure rate" and the "predictive failure rate." Burn-in decisions should be based on the predictive failure rate, not the model failure rate. In fact, if the predictive life distribution is a mixture of exponential distributions, then burn-in must be contemplated; it should be performed if the costs of testing compensate for the avoidance of risk of in-service failures.

The cause of the second misconception is a failure to appreciate the fact that, fundamentally, there are two reasons for performing a burn-in test: psychological (i.e., those pertaining to belief) and physical (i.e., those pertaining to a change in the physical or the chemical composition of an item).

2.3 Objectives

The aim of this article is to argue that the two reasons given above cover the entire spectrum of burn-in, be it in engineering reliability, in simulation or in sensitivity testing. Also, fundamentally, since the concepts of physics, chemistry and biology influence belief (or psychology), there is only one reason for burn-in, namely, psychological. In what follows (see Section 4), we will attempt to justify our point of view. We will also point out that in one's day-to-day life, the psychology of burn-in is routinely practiced. In Section 5 we give examples of circumstances which provide a physical motivation for burn-in. Section 6 explores the role of utility in burn-in, and Section 7, entitled "An anatomy of failure rates with decreasing segments," leads us to a discussion of optimal burn-in times. A consequence of the material of Section 7 is our claim (Theorem 2) that the famous "bathtub curve" of reliability is rarely a physical reality; rather, it is often the manifestation of one's subjective belief. This may come as a surprise to many.

3. NOTATION AND TERMINOLOGY

Suppose that T, the time to failure of an item, has a distribution function F(t) = P(T ≤ t) and a survival function F̄(t) = 1 − F(t).
Assume that f(·), the probability density function of F(·), exists. If F(·) is indexed by a parameter u so that P(T ≤ t | u) = F(t | u), then h(t | u), the model failure rate function of F(·), is defined as

    (1)   h(t | u) = f(t | u) / F̄(t | u),   t ≥ 0.

The function F is said to have an increasing (decreasing) model failure rate if h(t | u) is monotonically increasing (decreasing) in t, where we use increasing (decreasing) in place of nondecreasing (nonincreasing) throughout. The function F is said to have a constant model failure rate if h(t | u) is constant in t. It is well known that h(t | u) is constant in t if and only if F̄(t | u) = e^(−θt), an exponential distribution.

In keeping with our claim to use a minimum of mathematics, we will call h(t | θ) a bathtub curve if it satisfies the following definition.

A function g(t) is said to be a bathtub curve if there exists a point u > 0 such that g(t) is strictly decreasing for t < u and strictly increasing for t > u.

In the context of burn-in, which is de facto a limited life test, F having an increasing (decreasing) model failure rate implies that burn-in results in a depletion (enhancement) of useful life. When F has a constant model failure rate, burn-in results in neither a depletion nor an enhancement of useful life.

Since the parameter u is always unknown, we need to specify a distribution for it; let the density of this distribution be denoted π(u). Then h(t), the predictive failure rate function of F, is given as

    (2)   h(t) = [∫ f(t | u)π(u) du] / [∫ F̄(t | u)π(u) du]
               = ∫ h(t | u) [π(u)F̄(t | u) / ∫ F̄(t | u)π(u) du] du.

Thus

    (3)   h(t) = ∫ h(t | u) π(u | t) du,

where π(u | t) denotes the density of the distribution of u given that T ≥ t.

Note that, contrary to what many believe, h(t) ≠ ∫ h(t | u)π(u) du; also, if π(u) is degenerate, the model and predictive failure rates agree.

We conclude this section with a statement of the following important closure (under mixtures) theorem.

Theorem 1 [Barlow and Proschan (1975), page 103]. If the model failure rate h(t | u) is decreasing in t, then the predictive failure rate h(t) is decreasing in t, for any π(u).
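Equations (2)–(3) and Theorem 1 are easy to check numerically. In the sketch below (ours), the model is exponential, h(t | θ) = θ, and a gamma prior for θ is added purely for illustration; the predictive failure rate then has the closed form h(t) = a/(b + t), which is strictly decreasing even though every model failure rate is constant.

```python
# Sketch: predictive failure rate (2) for an exponential model mixed over
# a gamma(a, b) prior -- an illustrative assumption, not from the article.
import numpy as np
from scipy.integrate import quad
from scipy.stats import gamma

a, b = 2.0, 1.0   # gamma prior shape and rate (illustrative)

def h(t):
    num = quad(lambda th: th * np.exp(-th * t) * gamma.pdf(th, a, scale=1/b),
               0, np.inf)[0]
    den = quad(lambda th: np.exp(-th * t) * gamma.pdf(th, a, scale=1/b),
               0, np.inf)[0]
    return num / den

for t in (0.0, 0.5, 2.0, 10.0):
    print(t, round(h(t), 4), round(a / (b + t), 4))   # the two columns agree
```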
A consequence of this theorem is the result that if ∫ F̄(t | u)π(u) du, the predictive life distribution, is a mixture of exponential distributions, then the predictive failure rate will be strictly decreasing.

4. THE PSYCHOLOGICAL ASPECT OF BURN-IN

In this section, we argue that burn-in is a process of learning, where by learning we mean a reduction of uncertainty. The optimal burn-in time is the time at which the amount of information that is gleaned from the test balances the costs of the test, where costs include the depletion of useful life. We are prompted to make this claim as a consequence of observing that engineers subject every item that they use to a short life test prior to commissioning. This is true even of items that have an increasing model failure rate. For such items, burn-in would deplete useful life. When asked why every item is subjected to a burn-in, the answer has been that burn-in gives a "warm feeling" or "confidence" about an item's survivability. Thus engineering practice is contrary to the statistical literature, which seems to imply that only items having a strictly decreasing model failure rate should be subjected to a burn-in.

How can one explain engineers' actions which are contrary to the literature? Our explanation is that, with burn-in, we are learning by observing, so burn-in must be contemplated whenever we have uncertainty about the model failure rate, be it increasing, constant or decreasing. The depletion of useful life which occurs is the price that we pay for additional knowledge about the failure rate. The optimal burn-in time represents the optimal trade-off between knowledge and cost, and it may be greater than zero if the predictive failure rate is decreasing or has a decreasing segment.

5. THE PHYSICAL ASPECT OF BURN-IN

Is uncertainty about the model failure rate the only circumstance under which a burn-in should be contemplated? The answer is no, because burn-in may also be done in those situations wherein the act of using the item physically enhances its survivability. Examples include the work hardening of ductile materials and the self-sharpening of drill bits. Under such circumstances the model failure rate is decreasing, and one would contemplate a burn-in, even if the model failure rate were known with certainty. The predictive failure rate is of course decreasing by Theorem 1.

To summarize, burn-in should be contemplated for all items whose predictive failure rate is either
where P(u_i | t) is the probability mass of u_i given that T ≥ t.

Suppose now that each h(t | u_i) is decreasing in t. Then, by the closure under mixtures theorem (Theorem 1), h(t) is also decreasing in t. In Figure 2 we illustrate this phenomenon via model failure rates that are constant. In fact, the closure under mixtures theorem was motivated by the physical mixing of constant model failure rates.

7.2.2. Subjective or psychological mixing. The notion of what we refer to as "subjective mixing" parallels that of physical mixing, except that now one does not conceptualize a mixing process that is prompted by physically putting together several heterogeneous items. Rather, one acts as a subjectivist (in the sense of de Finetti and Savage), and mixes over the different model failure rates that are suggested by an unknown u, via a prior π(u) over u. Specifically, suppose that an item has a model failure rate h(t | u) with u unknown. Let π(u) reflect one's subjective opinion about the different values of u; that is, π(u) is the prior on u. Suppose that h(t | u) is decreasing in t for all values of u. Then, by the closure under mixtures theorem, the predictive failure rate of the item, h(t), will also decrease in t. For example, suppose that u = θ and that F̄(t | θ) = exp(−t^θ), θ ≥ 0, t > 0, a Weibull distribution with shape parameter θ. If θ = 1, h(t | θ) = 1, a constant, whereas if θ < 1, h(t | θ) decreases in t. Thus if π(θ) has support (0, 1), then h(t) decreases in t; see Figure 3.

Thus in the two scenarios of physical and subjective mixing, the predictive failure rate is decreasing in t, suggesting that burn-in should be contemplated.
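For a concrete version of the Weibull example above (ours), take π(θ) uniform on (0, 1) — an illustrative choice of prior — and compute the predictive failure rate h(t) from equation (2) by numerical integration; the printed values decrease in t, as Theorem 1 guarantees.

```python
# Sketch: predictive failure rate for the Weibull subjective-mixing
# example, with a uniform(0, 1) prior on the shape theta (illustrative).
import numpy as np
from scipy.integrate import quad

def h(t):
    dens = quad(lambda th: th * t ** (th - 1) * np.exp(-t ** th), 0.0, 1.0)[0]
    surv = quad(lambda th: np.exp(-t ** th), 0.0, 1.0)[0]
    return dens / surv

for t in (0.5, 1.0, 2.0, 5.0, 20.0):
    print(t, round(h(t), 4))   # h(t) decreases in t
```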
τ ≤ T ≤ τ + s (where s can be viewed as the mission time) and let −K be the reward if T > τ + s. Then:

Theorem 3 [Clarotti and Spizzichino (1990)].

(i) Burn in indefinitely, iff lim_{t→∞} g(t) < (C − c1)/(C + K) = v.
(ii) Do not burn in, iff g(0) ≥ v.
(iii) Burn in for time τ > 0, iff g(τ) ≡ v.

Note that the indefinite burn-in of (i) above is different from the indefinite burn-in of eternal happiness discussed above. The former is based on costs of testing and in-service failure; the latter assumes that the costs of burn-in are zero.

9. CONCLUDING COMMENTS

Let us return to the original question: "What is burn-in?" We argue that it is primarily a mechanism for learning.

The model failure rate describes the physical process of aging. The predictive failure rate describes our changing beliefs about an item as we observe it surviving. Since burn-in is performed for a psychological purpose, it is only natural to base burn-in calculations upon the predictive failure rate. The model and predictive failure rates may have very different forms. Indeed, while the famous bathtub curve rarely has a physical motivation, it arises quite naturally in our minds.

ACKNOWLEDGMENTS

The authors would like to thank Dennis Lindley and Al Marshall for their helpful comments on an earlier draft of this paper. Supported by Army Research Office Grant DAAH04-93-G-0020 and Air Force Office of Scientific Research Grant AFOSR-F-49620-95-1-0107.

REFERENCES

Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability: Probability Models. Holt, Rinehart & Winston, New York.
Besag, J., Green, P., Higdon, D. and Mengersen, K. (1995). Bayesian computation and stochastic systems (with discussion). Statist. Sci. 10 3–66.
Gurland, J. and Sethuraman, J. (1995). How pooling data may reverse increasing failure rates. J. Amer. Statist. Assoc. 90 1416–1423.