
Statistical Science

1997, Vol. 12, No. 1, 1–19

Burn-In
Henry W. Block and Thomas H. Savits

Abstract. A survey of recent research in burn-in is undertaken. The emphasis is on mixture models, criteria for optimal burn-in and burn-in at the component or system level.

Key words and phrases: Bathtub failure rate, mixture models, TTT plot, TTT transform, cost functions.

0. INTRODUCTION

Burn-in is a widely used engineering method to eliminate weak items from a standard population. The standard population usually consists of various engineering systems composed of items or parts, or components which are assembled together into systems. The components operate for a certain amount of time until they fail, as do the systems composed of these components. The systems might be electronic systems such as circuit boards and the components would be various types of chips and printed circuits. A typical mechanical system is an air conditioner and the components are condenser, fan, circuits and so forth. Usually within any population of components there are strong components with long lifetimes and weak components with much shorter lifetimes. To ensure that only the strong components reach the customer, a manufacturer will subject all of the components to tests simulating typical or even severe use conditions. In theory, the weak components will fail, leaving only the strong components. This type of testing can also be carried out on systems after they are assembled in order to determine the weak or strong systems or to uncover defects incurred during assembly. These tests are known as burn-in. One important issue is to determine the optimal time for burn-in. Burn-in is more typically applied to electronic than to mechanical systems.

We give a survey of recent burn-in research with emphasis on mixture models (which are used to describe populations with weak and strong components), criteria for optimal burn-in and whether it is better to burn in at the system or component level. After some background, we give a brief description of the types of statistical distributions which model the lifetimes of components for which burn-in is relevant. The remainder of the paper is devoted to explicating recent promising developments in burn-in. Because of the authors' interests, most emphasis will be placed on probability modeling for burn-in, but some statistical topics will also be covered. We will not review the fairly extensive engineering literature on burn-in since this has been done in several review articles which we cite at the end of Section 1.

Section 1 contains several illustrative examples and an introduction to some references for additional background on burn-in. The distributions which are used to describe the lifetimes of components which can benefit from burn-in are given in Section 2. An important family of distributions is one in which the failure rate functions have a bathtub shape. In particular, distributions which arise as mixtures are singled out for emphasis since many bathtub-shaped failure rates arise in this way. In Section 3, various criteria are described which have been used to determine optimal burn-in times. Section 3.1 considers general criteria and Section 3.2 covers various cost structures. Section 4 discusses two recent mixture models. The first of these (Section 4.1) examines a typical heterogeneous population to which burn-in is often applied and how this translates into renewal intensity behavior. The second of these proposes a general mixture model. A related result involves the asymptotic failure rate of a mixture model in terms of the asymptotic failure rates of the components of the mixture. The question of whether it is better to burn in at the component or the system level is discussed in Section 5. In Section 6, we consider an important tool, the TTT transform, which is used for approximating burn-in times. Section 7 gives a brief introduction to some recent sequential burn-in procedures involving optimal control. Section 8 gives a discussion with an indication of some future research directions.

Henry W. Block and Thomas H. Savits are Professors, Department of Statistics, University of Pittsburgh, Pittsburgh, Pennsylvania 15260 (e-mail: [email protected], [email protected]).


1. BACKGROUND AND SIMPLE EXAMPLES

Many manufacturers and users of electronic components and systems, as a matter of course, subject these systems and/or components to initial testing for a fixed period of time under conditions which range from typical to those which approximate a worst-case scenario. A typical regimen is to introduce for a period of time some vibration and temperature elevation for a device. In a particular context this is sometimes known as "shake and bake." At the end of this period, those components and/or systems which do not survive this period of testing may be discarded (scrapped), analyzed for defects and/or repaired. Those which survive this period may be sold, placed into service or subjected to further testing. Although these procedures have a variety of names depending on the area of application, we use the term burn-in to describe them all. The time period will be called the burn-in period. We illustrate some of these ideas with the following three examples.

Example 1. Rawicz (1986) considers 30-watt long-life lamps manufactured by the Pacific Lamp Corporation (Vancouver, Canada) which were designed "for 5,000 hours of constant work in severe environmental conditions at 120 V." These are installed on billboards where it is difficult and expensive to replace them. It turns out that a certain small percentage of these lamps tend not to last the requisite 5,000 hours but fail relatively early. Obviously it would be beneficial if this subpopulation of lamps could be identified and eliminated before being placed on a billboard. The procedure recommended involves stressing all of the lamps at a high voltage (240 V) for a short period of time, which causes the weak lamps to fail rather quickly while the stronger lamps do not fail during this period. The lamps which do not fail are the lamps potentially capable of surviving the 5,000 hours of constant work. Often the burn-in weakens the surviving devices. In this particular application, however, the surprising result is that the surviving lamps are actually improved. This was thought to occur since the high thermal treatment seemed to relax structural stresses caused by the fabrication process.

Example 2. In the AT&T Reliability Manual (Klinger, Nakada and Menendez, 1990) an electronics switching system (the 5ESS Switch) is discussed. Immediately after manufacture this system is operated at room temperature (25°C) for 12 hours, during which "volume-call" testing is performed; that is, 1,000 calls are simulated and passed through each of the five to eight modules of the switch. The system is then subjected for up to 48 hours to the high temperature (50°C) which can occur within the switch if the air conditioning should fail. The first part of this procedure is to find and eliminate early system failures, and the second part simulates use in an extreme case which might occur. The objective of the second part is to accelerate aging, so that weak systems fail. It also provides data which can be used to see how this equipment compares to certain standards set for it.

Example 3. Jensen and Petersen (1982) consider a piece of measuring equipment made up of approximately 4,000 components. They focus on several critical types of these components. One of these, called an IC-memory circuit, accounts for 35 of the 4,000 components. The bimodal Weibull distribution (i.e., a mixture of two Weibulls) is used to model this type of component and has the following survival function:

F̄(t) = p exp[−(t/η1)^β1] + (1 − p) exp[−(t/η2)^β2].

From the data, the values p = 0.015, β1 = 0.25, η1 = 30, β2 = 1 and η2 = 10 have been determined, but an explicit method is not given.

We illustrate the results of Block, Mi and Savits (1993) (which is discussed in Section 4.2) to obtain the optimal burn-in time for a reasonable cost function (we use CF1 of Section 3.2 in this example). Assume that we would like to plan a burn-in for components of this type so that those surviving burn-in should function for a mission time of τ = 60 units. If a circuit fails before the end of burn-in a cost c0 = q0 C, where 0 < q0 < 1, is incurred. If it fails after burn-in but before the mission time is over, a cost of C is incurred. If an item survives burn-in and the mission time, a gain of K = kC is obtained. For illustrative purposes, we choose q0 = 0.5 and k = 0.05.

We apply Theorem 2.1 of Block, Mi and Savits (1993). Let f be the density of the bimodal Weibull given above. It is not hard to show that g(t) = f(t + τ)/f(t) is increasing in t (either directly or by standard results) and goes from 0 (as t → 0) to 1 (as t → ∞). By the cited results an optimal burn-in time 0 < b* < ∞ exists and satisfies

g(b*) = (C − c0)/(C + K).

For the values above we obtain the equation g(b*) = 0.476, and solving graphically yields b* = 102.9.
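To make the calculation concrete, the following is a minimal numerical sketch (ours, not part of the original analysis) that solves the optimality equation by bisection rather than graphically. It assumes the bimodal Weibull parameters and cost choices quoted above (p = 0.015, β1 = 0.25, η1 = 30, β2 = 1, η2 = 10, τ = 60, q0 = 0.5, k = 0.05); the function names and the use of bisection are our own.

```python
# Solve g(b) = (C - c0)/(C + K) for the bimodal Weibull of Example 3.
# Parameter values are those quoted in the text; C is normalized to 1.
import math

p, b1, e1, b2, e2 = 0.015, 0.25, 30.0, 1.0, 10.0
tau = 60.0
q0, k = 0.5, 0.05
target = (1.0 - q0) / (1.0 + k)            # (C - c0)/(C + K) with C = 1

def density(t):
    """Density of the two-component (bimodal) Weibull mixture."""
    w1 = p * (b1 / e1) * (t / e1) ** (b1 - 1) * math.exp(-((t / e1) ** b1))
    w2 = (1 - p) * (b2 / e2) * (t / e2) ** (b2 - 1) * math.exp(-((t / e2) ** b2))
    return w1 + w2

def g(t):
    """Ratio g(t) = f(t + tau)/f(t); increasing from 0 to 1 for this model."""
    return density(t + tau) / density(t)

lo, hi = 1e-6, 1000.0                      # bracket for the root of g(b) = target
for _ in range(100):                       # bisection on the increasing function g
    mid = 0.5 * (lo + hi)
    if g(mid) < target:
        lo = mid
    else:
        hi = mid

print(f"target ratio = {target:.3f}, optimal burn-in b* ~ {0.5 * (lo + hi):.1f}")
# The text reports g(b*) = 0.476 and b* = 102.9; this sketch should give a
# value close to that.
```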

Even though we present Example 2 as an example of burn-in, in the AT&T Reliability Manual (Klinger, Nakada and Menendez, 1990), Example 2 is called a system reliability audit. Other terms which are often used are "screen" and "environmental stress screening" (ESS). The AT&T Manual (Klinger, Nakada and Menendez, 1990, page 52) defines a screen to be an application of some stress to 100% of the product to remove (or reduce the number of) defective or potentially defective units. Fuqua (1987, pages 11 and 44) concurs with the 100% but states that this may be an inspection and stress is not required. Fuqua (1987, page 11) describes ESS as a series of tests conducted under environmental stresses to disclose latent part and workmanship defects. Nelson (1990, page 39) is more specific and describes ESS as involving accelerated testing under a combination of random vibration and thermal cycling and shock.

Burn-in is described by the AT&T Manual (Klinger, Nakada and Menendez, 1990, page 52) as one effective method of screening (implying 100%) using two types of stress (temperature and electric field). Nelson (1990, page 43) describes burn-in as running units under design or accelerated conditions for a suitable length of time. Tobias and Trindade (1995, page 297) restrict burn-in to high stress only and require that it be done prior to shipment. Bergman (1985, page 15) defines burn-in in a more general way as a pre-usage operation of components performed in order to screen out the substandard components, often in a severe environment. Jensen and Petersen (1982) have more or less the same definition as Bergman.

For the purposes of this paper we use the term burn-in in a general way, similar to the usage of Jensen and Petersen (1982) and of Bergman (1985). We think of it as some pre-usage operation which involves usage under normal or stressed conditions. It can involve either 100% of the product or some smaller subgroup (especially in the case of complex systems as in Example 2) and it is not limited to eliminating weak components.

Many of the traditional engineering ideas concerning burn-in are discussed in the handbook of Jensen and Petersen (1982). This book is intended as a handbook for small or moderate-size electronics firms in order to develop a burn-in program. Consequently the book should be viewed in this spirit. Emphasis is on easy-to-apply methods and on graphical techniques. One important contribution of the book is to popularize the idea that components and systems to which burn-in is applied have lifetimes which can be modeled as mixtures of statistical distributions. Specifically, components either come from "freak" or "main" populations and their lifetimes can be modeled as mixtures of Weibull distributions. Systems are assumed to inherit this dichotomous behavior, but the weaker population is called an "infant mortality" population. This population arises partly because of defects introduced by the manufacturing process.

Most reliability books familiar to the statistics community do not discuss burn-in. We mention three applied reliability books which discuss this topic. The first of these is the book by Tobias and Trindade (1995), which has a section on burn-in covering some basics. An engineering reliability book by Fuqua (1987) delineates the uses of burn-in (see Section 2.4 and Chapter 14) for electronic systems at the component, module (intermediate between component and system) and system level. Most useful is the AT&T Reliability Manual (Klinger, Nakada and Menendez, 1990), which discusses a particular burn-in distribution used at AT&T along with a variety of burn-in procedures and several examples of burn-in. Two papers which review the engineering literature on burn-in are Kuo and Kuo (1983) and Leemis and Beneke (1990).

2. BURN-IN DISTRIBUTIONS

For which components or systems is burn-in effective? Another way of posing this question is by asking, "For which distributions (which model the lifetimes of components or systems) is burn-in effective?" First, it seems reasonable to rule out classes of distributions which model wearout. The reason for this is that objects which become more prone to failure throughout their life will not benefit from burn-in since burn-in stochastically weakens the residual lifetime. Consequently, distributions which have increasing failure rate or other similar aging properties are generally not candidates for burn-in.

For burn-in to be effective, lifetimes should have high failure rates initially and then improve. Since those items which survive burn-in have the same failure rate as the original, but shifted to the left, burn-in, in effect, eliminates that part of the lifetime where there is a high initial chance of failure. The class of lifetimes having bathtub-shaped failure rates has this property. For this type of distribution the failure rate starts high (the infancy period), then decreases to approximately a constant (the middle life) and then increases as it wears out (old age). As suggested by the parenthetical remarks, this distribution is thought to describe human life and other biological lifetimes. Certain other mechanical and electronic lifetimes also can be approximated by these distributions.

Fig. 1. Burn-in improvement example (K = 1,000 hours; PPM/K = parts per million per 1,000 hours).

This type of distribution would seem to be appropriate for burn-in, since burn-in eliminates the high-failure infancy period, leaving a lifetime which begins near its former middle life (see Figure 1).

It turns out that there are reasons why many systems and components have bathtub-shaped failure rates. As described by Jensen and Petersen (1982), many industrial populations are heterogeneous and there are only a small number of different subpopulations. Although members of these subpopulations do not strictly speaking have bathtub-shaped failure rates, sampling from them produces a mixture of these subpopulations and these mixtures often have bathtub-shaped failure rates. For a simple example, assume that there are two subpopulations of components each of which is exponential, one with a small mean and one with a large mean. Sampling produces a distribution with decreasing failure rate which is a special case of the bathtub failure rate. An intuitive explanation of why this occurs is easy to give. Initially the higher failure rate of the weaker subpopulation dominates until this subpopulation dies out. After that, the lower failure rate of the stronger subpopulation takes over so that the failure rate decreases from the higher to the lower level. This type of idea, about the eventual domination of the strongest subpopulation, carries through for very general mixtures. See Block, Mi and Savits (1993, Section 4). A subjectivist explanation of the fact that mixing exponentials produces a decreasing failure rate distribution was given by Barlow (1985), who argued that even though a model may be exponential, information may change our opinion about the failure rate.

The mixture of two exponentials mentioned above produces a special case of the bathtub failure rate where no wearout is evident. Models of this type with no wearout are thought to be sufficient for modeling the lifetimes of certain electronic components, since these components tend to become obsolete before they wear out. Mixing two distributions which are more complex than exponentials yields distributions with more typical bathtub-shaped failure rates, as can be seen in the following example. A typical bathtub curve is given in Figure 8.2 of Tobias and Trindade (1995, page 238) which we reproduce in Figure 1.

This distribution is realized as a mixture of a lognormal and a Weibull distribution (both of which are used to model defectives) and another distribution (which models the population of normal devices),

F(t) = 0.0028 Φ(ln(t/2,700)/0.8) + 0.001 [1 − exp(−(t/400)^0.5)] + 0.997 {1 − exp(−10^−7 t)[1 − Φ(ln(t/975,000)/0.8)]},

where Φ is the standard normal cdf. Notice that the left tail of the distribution is very steep. This tail represents the period where many failures occur. Burn-in is utilized in order to remove this part of the tail. The dotted line represents the resulting distribution after a burn-in of several hours at an accelerated temperature. The point at which the curve flattens out and stops decreasing is at about 20K. This is called the first change point.
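The simple two-exponential example above is easy to check numerically. The following is a minimal sketch (ours, with a hypothetical mixing weight and rates) that evaluates the mixture failure rate and shows it decreasing from near the weighted average of the two rates down toward the rate of the stronger subpopulation.

```python
# Failure rate of a mixture of two exponential subpopulations.
# pi_weak, lam_weak and lam_strong below are assumed illustrative values.
import math

pi_weak, lam_weak, lam_strong = 0.10, 1.0, 0.01

def survival(t):
    return pi_weak * math.exp(-lam_weak * t) + (1 - pi_weak) * math.exp(-lam_strong * t)

def density(t):
    return (pi_weak * lam_weak * math.exp(-lam_weak * t)
            + (1 - pi_weak) * lam_strong * math.exp(-lam_strong * t))

def failure_rate(t):
    return density(t) / survival(t)

for t in [0.0, 1.0, 5.0, 20.0, 100.0]:
    print(f"t = {t:6.1f}   r(t) = {failure_rate(t):.4f}")
# r(0) is the weighted average of the two rates; as t grows the weak
# subpopulation dies out and r(t) decreases toward lam_strong.
```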

Many papers have appeared in the statistical literature providing models and formulas for bathtub-shaped failure rates. See Rajarshi and Rajarshi (1988) for a review of this topic and many references. One easy way of obtaining some of these is by mixing standard life distributions such as the exponential, gamma and Weibull. See Vaupel and Yashin (1985) for some illustrations of various distributions or Mi (1991) for an example of a simple mixture of gammas which has a bathtub-shaped failure rate. The AT&T Reliability Manual (Klinger, Nakada and Menendez, 1990) gives another model (called the AT&T model) for the failure rate of an electronics component. The early part of the failure rate is modeled by a Weibull with decreasing failure rate, and the latter part is modeled by an exponential (i.e., constant). It does not have a part describing wearout since the manual claims that the AT&T electronic equipment tends not to wear out before it is replaced. The AT&T model has been used extensively by Kuo and various co-authors (e.g., see Chien and Kuo, 1992) to study optimal burn-in for integrated circuit systems. This model is also called the Weibull-exponential model in the statistical literature (e.g., see Boukai, 1987).

Since mixtures are emphasized in this review we point out one apparent anomaly mentioned by Gurland and Sethuraman (1994). In that paper it is observed that when even strongly increasing failure rate distributions are mixed with certain other distributions, their failure rate tends to decrease after a certain point. This is not surprising in the light of the previously mentioned result of Block, Mi and Savits (1993), which gives that asymptotically the failure rate of a mixture tends to the asymptotic failure rate of the strongest component of the mixture. Since the failure rate of the strongest component is the smallest, the failure rate of the mixture is often eventually decreasing to this smallest value.

Most definitions of bathtub-shaped failure rates assume the failure rate decreases to some change point (t1), then remains constant to a second change point (t2), then increases. The case t1 = t2 (i.e., no constant portion) is often adequate as an assumption in some theoretical results. We give the definition below.

Definition 1. A random lifetime X with distribution function F(t), survival function F̄(t) = 1 − F(t), density f(t) and failure rate r(t) = f(t)/F̄(t) is said to have a bathtub-shaped failure rate if there exist points 0 ≤ t1 ≤ t2 ≤ ∞, called change points, such that

r(t) is decreasing for 0 ≤ t < t1, constant for t1 ≤ t < t2, and increasing for t2 ≤ t < ∞.

We have restricted the above definition to continuous lifetimes, but discrete lifetimes can be handled similarly (see Mi, 1993, 1994c). Further we often shorten the phrase bathtub-shaped failure rate to bathtub failure rate or even bathtub distribution. A bathtub curve is called degenerate if either the decreasing or increasing part is not present (i.e., it is either always increasing or always decreasing).

3. OPTIMAL BURN-IN

In this section we consider some basic criteria for determining the optimal burn-in time for a lifetime. In general, we consider lifetimes with a bathtub-shaped failure rate having change points t1 and t2 (see Definition 1). As exemplified in Figure 1, burn-in often takes place at or before the first change point t1. In fact, in the following, various optimality criteria lead to such a burn-in time. In Section 3.1 we focus on performance-based criteria. The more realistic situation involving cost structures is considered in Section 3.2 and these are based in part on the criteria of Section 3.1.

3.1 Performance-Based Criteria

In this section we consider the problem of performance-based criteria in which the more general assumption of a cost structure is not made. Many of these criteria are basic concepts which can and should be incorporated into a general cost structure. Cost structures are considered in Section 3.2.

The paper of Watson and Wells (1961) was one of the first statistical papers to study the question of burn-in. These authors were interested in conditions under which the mean residual life (after burn-in) was larger than the original mean lifetime. Maximizing the mean residual life is one of the criteria we examine in this section. We now list several criteria for determining burn-in. Criteria C1, C2 and C4 deal with only one component. Criterion C3 deals with components which are replaced at failure with other identical components.

C1. Let τ be a fixed mission time and let F̄ be the survival function of a lifetime. Find b which maximizes F̄(b + τ)/F̄(b), that is, find b such that, given survival to time b, the probability of completing the mission is as large as possible.

C2. Let X be a lifetime. Find the burn-in time b which maximizes E[X − b | X > b], that is, find the burn-in time which gives the largest mean residual life.

C3. Let {N_b(t), t ≥ 0} be a renewal process of lifetimes which are burned in for b units of time (i.e., where F is the original lifetime distribution and the interarrival distribution has survival function F̄_b(t) = F̄(b + t)/F̄(b)). For fixed mission time τ, find b which minimizes E[N_b(τ)], which is the mean number of burned-in components which fail during the mission time τ.

The next criterion involves the α-percentile residual life function. The α-percentile residual life is defined by

q_α(b) = F_b^{-1}(α) = inf{x ≥ 0 : F̄_b(x) ≤ 1 − α}

(see Joe and Proschan, 1984, for further details).

C4. For a fixed α, 0 < α < 1, find the burn-in time b which maximizes τ = q_α(b), that is, find the burn-in time which gives the maximal warranty period τ for which at most α% of items will fail.

Criterion C2 has been studied by several authors. The first of these, Watson and Wells (1961), examined various parametric distributions. Lawrence (1966) obtained bounds on the mean residual life. Park (1985) gave some results on the mean residual life for a bathtub distribution. One result was that the optimal burn-in time b* occurs before the first change point t1. Mi (1994b) obtained the same result for criteria C1 and C3, that is, b* ≤ t1. Launer (1993) introduced criterion C4 and also showed that the optimal b* occurs before t1. This type of result is important since it provides an upper bound for burn-in.

The fact that optimal burn-in for a bathtub distribution takes place before the first change point is not unusual. In fact, it is intuitive that burn-in should occur before this change point since this is where the failure rate of such a lifetime stops improving. We shall see in Section 3.2 that the result also holds true for many cost structures.

In another direction, Mi (1994b) compared optimal burn-in times for two mission times τ1 ≤ τ2 for criterion C1. He showed the intuitive result that b*_2 ≤ b*_1. An extension to random mission times was also considered.

In criterion C3, a burned-in unit that failed during field use was replaced with another burned-in unit. If instead of replacing this unit, a minimal repair is performed (see Barlow and Proschan, 1965), then the total number of minimal repairs is a nonhomogeneous Poisson process with mean function −ln[F̄(b + τ)/F̄(b)]. Thus if we want to minimize the expected number of minimal repairs in the interval [0, τ], it suffices to maximize the quantity F̄(b + τ)/F̄(b). But this is just criterion C1.

3.2 Cost Functions and Burn-in

Several cost functions have been proposed to deal with burn-in. A discussion of many of these is given in the review papers of Kuo and Kuo (1983) and Leemis and Beneke (1990). Also see Nguyen and Murthy (1982). In this section we discuss a few of the recent models involving cost functions for burn-in. In all cases we are interested in finding the burn-in time which minimizes the cost. Cost functions CF1 and CF4 are used in subsequent sections. In general these cost functions build upon and elaborate the criteria of Section 3.1. Cost function CF1 is basic, while CF2 and CF4 incorporate C2; CF3 uses C1.

CF1. A component or system with lifetime X is burned-in for time b. If it fails to survive b units of time a cost c0 is incurred. If it survives b units of time, then it incurs a second cost C, C > c0, if it does not survive past an additional mission time τ or it incurs a gain of K if it does survive τ. Consequently, if F is the distribution function of the component or system the expected cost as a function of b is

c1(b) = c0 F(b) + C[F(b + τ) − F(b)] − K F̄(b + τ).

CF2. If instead of a mission time after the burn-in we consider a gain proportional to the mean residual life (with proportionality constant K), the expected cost becomes

c2(b) = c0 F(b) − K ∫_b^∞ F̄(t) dt / F̄(b).

The next criteria involve costs for in-shop repair. If a device fails burn-in, it is scrapped at a cost cs > 0 and another unit is burned-in. This process is continued until a unit survives burn-in time b. A device which survives burn-in is then put into field use. The cost for burn-in is assumed to be proportional to the time it takes to obtain a unit which survives burn-in with proportionality constant c0. Mi (1994a) derives the expression for the expected cost as

k(b) = c0 ∫_0^b F̄(t) dt / F̄(b) + cs F(b)/F̄(b).
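As an illustration of how the cost functions CF1 and CF2 defined above can be evaluated in practice, the following sketch (ours) computes c1(b) and c2(b) on a grid for a hypothetical bathtub-like mixture lifetime and locates the minimizing burn-in times. The mixture of a shape-0.5 and a shape-2 Weibull and the cost constants c0, C, K and τ are assumed values, not taken from the paper.

```python
# Grid search for the burn-in times minimizing CF1 and CF2 under an
# illustrative bathtub-like mixture lifetime (hypothetical parameters).
import math

def surv(t):
    weak = 0.1 * math.exp(-((t / 1.0) ** 0.5))      # "defective" subpopulation
    strong = 0.9 * math.exp(-((t / 100.0) ** 2.0))   # "main" subpopulation
    return weak + strong

c0, C, K, tau = 0.5, 1.0, 0.05, 10.0                 # assumed cost constants

def cf1(b):
    F = lambda t: 1.0 - surv(t)
    return c0 * F(b) + C * (F(b + tau) - F(b)) - K * surv(b + tau)

def tail_integral(b, upper=600.0, n=4000):
    # trapezoidal approximation of the integral of surv on [b, upper]
    h = (upper - b) / n
    total = 0.5 * (surv(b) + surv(upper))
    total += sum(surv(b + i * h) for i in range(1, n))
    return total * h

def cf2(b):
    return c0 * (1.0 - surv(b)) - K * tail_integral(b) / surv(b)

grid = [0.5 * i for i in range(0, 201)]              # burn-in times 0, 0.5, ..., 100
b1_opt = min(grid, key=cf1)
b2_opt = min(grid, key=cf2)
print(f"CF1 minimized near b = {b1_opt}, CF2 minimized near b = {b2_opt}")
```

For this sketch both minimizers fall early in life, consistent with the general result quoted above that the optimal burn-in time occurs before the first change point.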

The complete cost also includes additive field costs, and this is reflected in the following cost functions.

CF3. In this case, after a burned-in item is obtained, a cost of C is incurred if the burned-in device does not survive the mission time τ and a gain of K if it survives the mission. Thus the total cost function is given by

c3(b) = k(b) + C [F(b + τ) − F(b)]/F̄(b) − K F̄(b + τ)/F̄(b),

where k(b) is as above.

CF4. If instead of a mission time, a gain is taken proportional to the mean residual time, the cost function in CF3 is modified to

c4(b) = k(b) − K ∫_b^∞ F̄(t) dt / F̄(b).

The cost function CF1 was introduced by Clarotti and Spizzichino (1990). These authors obtained conditions for an optimal burn-in time b* and applied their results to a mixed exponential model. See also Section 4.2, where an extension of the mixed exponential model to a general mixture model is discussed. The cost function CF2 is a variant of CF1. The cost functions CF3 and CF4 are discussed in Mi (1991, 1995). As in Section 3.1, the respective authors show that the optimal burn-in time b* satisfies b* ≤ t1 for cost functions CF2-CF4, where t1 is the first change point for the assumed bathtub distribution.

4. MIXTURE MODELS

In this section we consider recent mixture models. This is the typical model described in Section 2 to which burn-in is applicable. In both Arjas, Hansen and Thyregod (1991) and Block, Mi and Savits (1993) an underlying mixture distribution is used to model the life of components. The latter paper discusses burn-in applications although the former paper does not.

The paper of Arjas, Hansen and Thyregod (1991) discussed in Section 4.1 is an interesting mix of modeling and estimation and uses ideas and techniques from the reliability theory, life testing (engineering reliability) and survival analysis literature. The methods developed are applied to an example involving printed circuit boards. In Section 4.2 we discuss results of Block, Mi and Savits (1993). A more general mixture model than in Arjas, Hansen and Thyregod (1991) is examined. A recent paper of Spizzichino (1995) discusses another model for mixtures in heterogeneous populations.

4.1 A Reliability Growth Model

Arjas, Hansen and Thyregod (1991) consider a renewal process approach to reliability growth where heterogeneity of the underlying part structure is shown to translate into renewal intensity behavior. Although burn-in per se is not discussed in this paper, the lifetimes discussed are of the type to which burn-in is typically applied. This section also provides a background for Section 4.2, which considers mixed lifetimes.

The basic process involves the lifetimes of parts placed in two or more sockets where, upon failure, a failed part is replaced by a new part of the same type. The first and subsequent lifetimes for one socket are designated by X1, X2, .... These lifetimes are assumed independent. The lifetimes are also assumed to come from a heterogeneous population. It is natural to model these lifetimes using a random hazard rate so that the distribution of the lifetime can be written as a mixed exponential, that is,

P{X_k > x} = ∫_0^∞ e^{−λx} dφ(λ),

where φ is the distribution of the random hazards. The aim of the paper is to study the renewal process of one socket or the superimposed renewal process of several. As is well known, this mixed distribution has a decreasing failure rate.

If N(t) is the renewal process for one socket, it can be shown that V(t) = E N(t) is concave and the rate of occurrence of failures for the renewal process, v(t) = (d/dt) V(t), is decreasing. Various results for this type of renewal process can be obtained, and comparisons can be made with processes where sockets are minimally repaired rather than replaced. For minimal repair, the associated process is the nonhomogeneous Poisson process. (See Block and Savits, 1995, for many comparisons of this type.)

Parametric estimation is considered by these authors for the bimodal (i.e., mixture of two) exponential case. The bimodal Weibulls (and exponentials) are the principal examples of the Jensen and Petersen (1982) monograph on burn-in. The distribution for the life length of the part is the three-parameter mixture of two exponential distributions with distribution function

F(x) = π(1 − exp(−λ0 x)) + (1 − π)(1 − exp(−λ1 x)), x > 0.

It is assumed that inferior parts cannot be distinguished from a standard part.
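A short simulation sketch (ours, with hypothetical rates and mixing weight) illustrates the behavior described above: with mixed-exponential part lifetimes, the estimated V(t) = E N(t) for one socket is concave, so its increments decrease over time.

```python
# Simulate the one-socket renewal process whose interarrival distribution is
# F(x) = pi*(1 - exp(-lam0*x)) + (1 - pi)*(1 - exp(-lam1*x)).
# The values of pi_inferior, lam0 and lam1 below are assumed for illustration.
import random

random.seed(0)
pi_inferior, lam0, lam1 = 0.05, 2.0, 0.05   # 5% "inferior" parts with high rate
horizon, n_sockets = 100.0, 20000

def part_lifetime():
    lam = lam0 if random.random() < pi_inferior else lam1
    return random.expovariate(lam)

grid = [10.0 * i for i in range(1, 11)]
counts = [0] * len(grid)
for _ in range(n_sockets):
    t = part_lifetime()
    while t <= horizon:
        for j, g in enumerate(grid):
            if t <= g:
                counts[j] += 1
        t += part_lifetime()

V = [c / n_sockets for c in counts]          # estimate of V(t) = E N(t)
prev = 0.0
for g, v in zip(grid, V):
    print(f"t = {g:5.1f}   V(t) ~ {v:.3f}   increment ~ {v - prev:.3f}")
    prev = v
# The increments shrink with t, reflecting the decreasing rate of
# occurrence of failures v(t).
```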

bN−A t‘ and the smooth curve comes from V̄t‘.


Fig. 2. Comparing two estimates; the step curve comes from V

Two cases are considered: (a) the case where sockets are observed individually and (b) the case where sockets are only observed as aggregated data. In case (a), the maximum likelihood estimation is straightforward. Right censoring is permitted and the likelihood or log-likelihood function is standard. In case (b), times between failures are not independent and so either (1) an approximation by a corresponding nonhomogeneous Poisson process is used or (2) it is assumed, in the case when the number of failures is less than the number of sockets, that each socket has experienced at most one failure and so the techniques of (a) apply.

An example is given where the system is a printed circuit board consisting of 560 parts (sockets) and there are 3,481 systems from which data was collected for five years. Maximum likelihood estimates were obtained computationally for the three parameters and were used to estimate the cumulative number of occurrence of failures V̄(t) = E N̄(t), where N̄(t) is the superimposed renewal process. The model can be assessed graphically by calculating N̄0(t), the counting process obtained as the sum of the individual system processes, and then using the Nelson-Aalen estimate

V̂_N−A(t) = Σ_{s ≤ t} ΔN̄0(s)/R(s),

where 0 < T'_1 < T'_2 < ··· < T'_{N̄0(t)} < t are all of the failure times, ΔN̄0(s) is 1 for each T'_i and R(s) denotes the number of active systems older than s. This yields Figure 2, which compares these two estimates. The step curve comes from V̂_N−A(t) and the smooth curve comes from V̄(t). Confidence bounds are also obtained in this paper using several methods.

4.2 A General Mixture Model

As mentioned in Section 1, one explanation for a bathtub-shaped failure rate that is often given by engineers is that it is due to mixtures of populations, some weak and some strong. In Block, Mi and Savits (1993), a general mixture model was investigated. A goal of that paper was to determine optimal burn-in for the cost function CF1 of Clarotti and Spizzichino (1990). Some results of independent interest, however, were also obtained. They are summarized below.

For the general mixture model, it is assumed that each member of the subpopulation, indexed by λ ∈ S, has a positive density f(t; λ) on (0, ∞). The density of the resulting mixed population is then given by

(4.1)   f(t) = ∫_S f(t; λ) P(dλ),

where P is the mixing distribution.

The first results concern the monotonicity of the ratio g(t) = f(t + τ)/f(t) for a fixed mission time τ > 0. This is a new type of aging property that seems appropriate for burn-in since it is related to a notion of beneficial aging. More specifically, if we require the ratio f(t + τ)/f(t) to be increasing in t > 0 for each τ > 0, then f must be log-convex and hence belongs to the class of distributions which have a decreasing failure rate. Furthermore, certain bathtub failure rates which can be realized as mixtures have this monotonicity property.

Before we can state this result, we recall the definition of reverse regular of order 2 (RR2). A nonnegative function k(x, y) on A × B is said to be RR2 if

k(x1, y1) k(x2, y2) ≤ k(x1, y2) k(x2, y1)

whenever x1 < x2 in A and y1 < y2 in B. Alternatively, we require that the ratio

k(x + Δ, y)/k(x, y)

be decreasing in y ∈ B for each x ∈ A and Δ > 0.

The following is a preservation result for a monotonicity property with a fixed mission time τ. Let the family of positive densities {f(t; λ) : λ ∈ S} be RR2 on (0, ∞) × S and let τ > 0 be a fixed mission time. Suppose the ratio

g(t; λ) = f(t + τ; λ)/f(t; λ)

is increasing in t > 0 for each λ ∈ S. Then, for the mixture density f given in (4.1), the ratio

g(t) = f(t + τ)/f(t) = ∫_S f(t + τ; λ) P(dλ) / ∫_S f(t; λ) P(dλ)

is increasing in t > 0. A more general result that does not require the RR2 condition is given in Block, Mi and Savits (1993, Theorem 3.1).

A second result of interest in the paper of Block, Mi and Savits (1993) pertains to the limiting behavior of the failure rate for the mixed population. Heuristically, it states that the failure rate of the mixture tends to the strongest subpopulation. Under certain technical conditions it is shown that the failure rate of the mixed population converges to a constant α as t → ∞. Here α = inf{a(λ) : λ ∈ S} and a(λ) = lim_{t→∞} r(t; λ) with r(t; λ) the failure rate of the λ-subpopulation. (The discrete version is considered in Mi, 1994c.)

Clarotti and Spizzichino (1990) also show for the mixture of exponentials model that if one mixing distribution P1 is less than P2 in the sense of likelihood ratio ordering, then the optimal burn-in times b*_i for the cost function CF1 are ordered as b*_1 ≤ b*_2. The same result also holds for the general mixture model. See Block, Mi and Savits (1993) for details.
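A small numerical check (ours, with assumed parameters) of the limiting result stated above: two increasing-failure-rate gamma subpopulations, with asymptotic failure rates 1.0 and 0.2, are mixed, and the failure rate of the mixture is evaluated over time.

```python
# Failure rate of a half-weighted mixture of two gamma(shape 3) lifetimes.
# Each subpopulation is IFR with failure rate tending to its rate parameter;
# the weights and rates (0.9 on rate 1.0, 0.1 on rate 0.2) are hypothetical.
import math

def gamma3_surv(t, lam):
    x = lam * t
    return math.exp(-x) * (1.0 + x + 0.5 * x * x)

def gamma3_dens(t, lam):
    return lam ** 3 * t ** 2 * math.exp(-lam * t) / 2.0

def mixture_rate(t, lam_a=1.0, lam_b=0.2, w=0.9):
    dens = w * gamma3_dens(t, lam_a) + (1 - w) * gamma3_dens(t, lam_b)
    surv = w * gamma3_surv(t, lam_a) + (1 - w) * gamma3_surv(t, lam_b)
    return dens / surv

for t in [1.0, 3.0, 10.0, 30.0, 100.0, 300.0]:
    print(f"t = {t:6.1f}   mixture failure rate = {mixture_rate(t):.4f}")
# The rate rises at first, falls once the weaker (rate 1.0) subpopulation dies
# out, and then converges to 0.2, the smallest of the two asymptotic rates,
# as the limit theorem quoted above predicts.
```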

5. COMPONENT VERSUS SYSTEM BURN-IN

In this section we deal with the important issue of at which stage burn-in is most effective. Consider a system composed of individual components. Is it better to burn in all the components or is it better to assemble the components and burn in the system? If there are modules and subassembly systems similar questions can be asked. The component level is usually the least expensive stage at which to consider burn-in. Assembly of even burned-in components usually introduces defects, so burn-in at higher levels would seem to have some value. In this section we consider some preliminary work in which this question is considered, but under the simplifying assumption that no defects are introduced upon assembly. By a system here we mean a coherent system in the sense of Barlow and Proschan (1981, page 6).

There are three possible actions we want to consider which constitute different methods for burning-in the system:

(i) Burn in component i for a time βi, i = 1, ..., n, and then assemble the system with the burned-in components.
(ii) Burn in component i for a time βi, i = 1, ..., n, assemble the system with the burned-in components and then perform an additional burn-in of the system for a time b.
(iii) Assemble the system with new components and then burn in the system for a time period b.

Since (i) and (iii) are special subcases of (ii), we can do no better than (ii). However, is it possible that we can do just as well with one of the other two actions?

In Block, Mi and Savits (1994, 1995), this question was considered for three different criteria: (1) maximizing the probability that the system will survive a fixed mission time (or warranty period) τ; (2) maximizing the system mean residual life; and (3) maximizing the α-percentile (system) residual life τ = q_α(b) for a fixed α, 0 < α < 1. In each case it was shown that one can do as well with burn-in at the component level only.

This result can be extended to criteria which have a type of monotonicity property. More specifically, the result can be shown to hold for any criterion determined by a functional φ defined on the class of life distributions which is monotone in stochastic order, that is, in the case of maximizing (minimizing) the objective function φ, we require that φ(F) ≤ (≥) φ(G) whenever F ≤_st G [i.e., F̄(t) ≤ Ḡ(t) for all t ≥ 0]. Thus, for such a criterion, burn-in at the system level is precluded by effective burn-in at the component level.

It should be noted that this result does not apply to the cost function criteria considered in Section 3.2 since they are not monotone in stochastic order. Also, if the act of assembling the components degrades the system, an additional burn-in at the system level might be required. Whitbeck and Leemis (1989) have considered a model for dealing with this problem.

6. TOTAL TIME ON TEST (TTT)

In this section we describe a primarily graphical technique which has been useful in burn-in. One consequence of this technique is in obtaining approximate burn-in times.

In a life test, failure times are observed until all or some portion of the items fail. A way to summarize the behavior is through the total time on test (TTT) statistics. Let 0 = x_(0) < x_(1) < x_(2) < ··· < x_(n) be an ordered sample from a continuous lifetime distribution with finite mean. In this section, to avoid technical problems, we assume the distribution function F is strictly increasing on (0, ∞). The TTT statistics are defined by

T_i = Σ_{j=1}^{i} (n − j + 1)(x_(j) − x_(j−1))

and

u_i = T_i/n for i = 1, ..., n.

Notice that u_n = x̄_n. Moreover, if F_n is the empirical distribution function and F_n^{-1}(x) = inf{t | F_n(t) ≥ x}, then F_n^{-1}(i/n) = x_(i) for i = 1, ..., n. Consequently,

∫_0^{F_n^{-1}(i/n)} F̄_n(x) dx = u_i, i = 1, ..., n.

This suggests a distributional analog called the TTT transform, traditionally denoted by H_F^{-1}. It is defined by

H_F^{-1}(t) = ∫_0^{F^{-1}(t)} F̄(u) du.

The scaled TTT transform is given by

φ_F(t) = H_F^{-1}(t)/H_F^{-1}(1) = H_F^{-1}(t)/µ,

where µ is the mean of F. Although these concepts were discussed earlier, one of the first systematic expositions was given in Barlow and Campo (1975).

One of the principal uses of the TTT concept has been in obtaining approximate optimal solutions for age replacement and also in obtaining approximate optimal burn-in times. We briefly describe the procedure for burn-in and note that the procedure for age replacement is similar.

We consider the cost function CF4 of Section 3 as an example and describe how an optimal burn-in time b* can be obtained using the TTT transform. This example is taken from Mi (1991). The cost function can be written as

c4(b) = −cs + [c0 ∫_0^b F̄(t) dt + cs − K ∫_b^∞ F̄(t) dt] / F̄(b).

The optimal burn-in is obtained by minimizing this function. Letting u = F(b), minimizing the above is equivalent to maximizing

M_F(u) = [α − φ_F(u)] / (1 − u),

where α = (−cs + Kµ)/[(c0 + K)µ] and µ is the mean of F. The function M_F(u) is the slope of the line segment connecting the points (1, α) and (u, φ_F(u)). Consequently we need only find the point on the graph of φ_F, the scaled TTT transform, for which the above slope is largest.

If n items with lifetime X are put on test, a TTT plot (i.e., the graph of φ_{F_n}) can be obtained. Since the TTT transform is the asymptotic version of the TTT plot, an estimate of the optimal burn-in can be obtained. If the point (i/n, T_i/T_n) maximizes M_{F_n}(u), then x_(i) = F_n^{-1}(i/n) is the ordered value giving an estimate of the optimal burn-in. We illustrate this in Figure 3. Other similar burn-in applications can be found in Bergman and Klefsjo (1985). See also the review article by Bergman (1985), which gives other applications of the TTT transform.
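The following sketch (ours) carries out the TTT-plot estimate just described for the cost function CF4. The sample is simulated from a hypothetical weak/strong exponential mixture, and the cost constants c0, cs and K are assumed values; only the formulas for T_i, α and M(u) come from the text above.

```python
# TTT-plot estimate of the optimal burn-in time for cost function CF4.
import random

random.seed(1)
n = 500
sample = sorted(
    random.expovariate(2.0) if random.random() < 0.15 else random.expovariate(0.05)
    for _ in range(n)
)

# TTT statistics T_i; the scaled TTT plot consists of the points (i/n, T_i/T_n).
T, total, prev = [], 0.0, 0.0
for j, x in enumerate(sample, start=1):
    total += (n - j + 1) * (x - prev)
    prev = x
    T.append(total)
Tn = T[-1]

c0, cs, K = 1.0, 0.5, 10.0                 # assumed cost constants
mu = Tn / n                                # sample mean estimates mu
alpha = (-cs + K * mu) / ((c0 + K) * mu)

# Maximize M(u_i) = (alpha - T_i/T_n) / (1 - i/n) over the plot points i < n.
best_i = max(range(1, n), key=lambda i: (alpha - T[i - 1] / Tn) / (1.0 - i / n))
print(f"alpha = {alpha:.3f}; estimated optimal burn-in b* ~ x_({best_i}) = {sample[best_i - 1]:.3f}")
```

With these assumed values the maximizing plot point falls well inside the sample, so the estimated burn-in removes most of the short-lived subpopulation while leaving the long-lived items in service.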

Fig. 3. Estimate of the optimal burn-in.

7. SEQUENTIAL BURN-IN AND OPTIMAL CONTROL

A theory of sequential burn-in has been proposed in Spizzichino (1991), and some extensions of this have been initiated by the same author and some of his colleagues. This extends the previous material, which deals with mainly one component or system, or components which are independent and identically distributed. The more general situation where the components are not assumed independent is treated by Spizzichino and colleagues, who assume components are exchangeable. A mixture model for strong and weak exchangeable components has been proposed by Spizzichino (1995). We give a brief introduction to this work. Several representative papers are contained in Barlow, Clarotti and Spizzichino (1993).

As background we mention a paper of Marcus and Blumenthal (1974), who considered a sequential burn-in procedure. The stopping rule they suggested is as follows: observe failure times and stop when the time between failures exceeds a fixed value. This is reasonable for a lifetime which has a high initial failure rate that becomes smaller. Properties of this rule are studied and tables for its use are given.

In Spizzichino (1991), failure times of n components which are assumed to be exchangeable are observed. One burn-in time is chosen initially and if all components survive it, they are put into field operation. If there is a failure before this time, a new additional burn-in time is chosen (which may depend on the first failure) and the procedure repeats. A cost structure based on the one in Clarotti and Spizzichino (1990), that is, CF1 from Section 3, is given. A sequential burn-in strategy is defined and this is shown to be optimal. A particular case is mentioned where the exchangeable distribution is a mixture of exponentials. This case is further explored in Costantini and Spizzichino (1991), where a strategy is proposed for reducing this to an optimal stopping problem for a two-dimensional Markov process. Further details are given in Costantini and Spizzichino (1990) and in Caramellino and Spizzichino (1996).

A related approach for optimal screening (a type of burn-in) is given in Iovino and Spizzichino (1993). A general unifying model is proposed by Spizzichino (1993). Some very recent research on optimal burn-in of software is given in Barlow, Clarotti and Spizzichino (1994).

8. DISCUSSION AND AREAS FOR DEVELOPMENT

In this review of recent developments in burn-in we have discussed a variety of problems. We recapitulate some of these ideas in this section along with some future research directions.

A basic assumption on a lifetime for which burn-in is appropriate is that it has a bathtub-shaped failure rate. This type of lifetime often arises because a population consists of a mixture of weak and strong subpopulations. One question for which a satisfactory answer has not been determined is for which mixtures the failure rate has a bathtub shape. As described in Section 3, the intuitive result that burn-in should occur before the first change point of a bathtub failure rate has been demonstrated for a wide variety of criteria and cost functions, but in an ad-hoc way. The authors are currently working on a unified result for an even broader class of objective functions.

The handbook of Jensen and Petersen (1982) presents a wide array of graphical and heuristic statistical techniques for burn-in. Many of these are applied to mixtures which model weak and strong components.

At the time the book was written, statistical techniques and procedures for mixtures were less well understood than they are at the present time. It would be useful if many of the intuitively plausible and useful techniques given in this handbook were updated and put on a firmer statistical foundation. One example of this is the paper of Arjas, Hansen and Thyregod (1991) (see Section 4.1), who develop estimation techniques for renewal processes where the underlying distribution is a mixture of exponentials.

The material of Section 7 on sequential burn-in and optimal control appears to be a fruitful area of research. It seems evident that this direction should be expanded and further investigated.

The development of new ideas on burn-in goes hand-in-hand with developments in accelerated life testing. In fact, burn-in is most often accomplished in an accelerated environment. A related topic is degradation, in which instead of the lifetime of a component the emphasis is on a measure of the quality of the component as it wears out. If the environment is accelerated, the question of burn-in in conjunction with this accelerated degradation becomes of interest. For recent developments on accelerated degradation, see Nelson (1990) and Meeker and Escobar (1993).

An area of reliability where burn-in techniques might be applicable and vice versa is the topic of software reliability modeling. In this area one problem involves removing errors (bugs) from the software. An assumption which is made is that when bugs are detected and removed no new bugs are introduced. In this case the software is improved since the number of bugs remaining is decreased. Consequently, the rate at which bugs are discovered is decreased. This rate is analogous to the left tail of a failure rate with infant mortality present. Since the time at which the testing should stop is of interest, and this is analogous to the burn-in period for a lifetime of the type discussed in this paper, there should be some transfer between the ideas of both of these fields. To date there have been a few applications of burn-in ideas to finding the time at which to stop testing the software. The paper of Barlow, Clarotti and Spizzichino has been mentioned. See also Section 6 of Singpurwalla and Wilson (1994), who review the optimal testing of software.

ACKNOWLEDGMENTS

We would like to thank Leon Gleser, Rob Kass and Jie Mi for helpful comments as well as a referee for a careful reading and perceptive comments. David Coit supplied us with the Fuqua reference and some recent papers and we thank him. Discussions with Allan Sampson, Leon Gleser and David Coit were also helpful.

The work of the authors was partially supported by NSA Grant MDA-904-90-H-4036 and by NSF Grant DMS-92-03444.

REFERENCES

Arjas, E., Hansen, C. K. and Thyregod, P. (1991). Estimation of mean cumulative number of failures in a repairable system with mixed exponential component lifetimes. Technometrics 33 1-12.
Barlow, R. E. (1985). A Bayes explanation of an apparent failure rate paradox. IEEE Transactions on Reliability R-34 107-108.
Barlow, R. E. and Campo, R. (1975). Total time on test processes and applications to failure data analysis. In Reliability and Fault Tree Analysis (R. E. Barlow, J. B. Fussell and N. D. Singpurwalla, eds.) 451-481. SIAM, Philadelphia.
Barlow, R. E., Clarotti, C. A. and Spizzichino, F., eds. (1993). Reliability and Decision Making. Chapman and Hall, New York.
Barlow, R. E., Clarotti, C. A. and Spizzichino, F. (1994). Optimal burn-in of software. Unpublished report.
Barlow, R. E. and Proschan, F. (1965). Mathematical Theory of Reliability. Wiley, New York.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing. To Begin With, Silver Spring, MD.
Bergman, B. (1985). On reliability theory and its applications. Scand. J. Statist. 12 1-41.
Bergman, B. and Klefsjo, B. (1985). Burn-in models and TTT transforms. Quality and Reliability International 1 125-130.
Block, H. W., Mi, J. and Savits, T. H. (1993). Burn-in and mixed populations. J. Appl. Probab. 30 692-702.
Block, H. W., Mi, J. and Savits, T. H. (1994). Some results on burn-in. Statist. Sinica 4 525-534.
Block, H. W., Mi, J. and Savits, T. H. (1995). Burn-in at the component and system level. In Lifetime Data: Models in Reliability and Survival Analysis (N. P. Jewell, A. C. Kimber, M.-L. T. Lee and G. A. Whitmore, eds.) 53-58. Kluwer, Dordrecht.
Block, H. W. and Savits, T. H. (1995). Comparisons of maintenance policies. In Stochastic Orders and Their Applications (M. Shaked and J. G. Shanthikumar, eds.) Chap. 15. Academic Press, New York.
Boukai, B. (1987). Bayes sequential procedure for estimation and for determination of burn-in time in a hazard rate model with an unknown change-point parameter. Sequential Anal. 6 37-53.
Caramellino, L. and Spizzichino, F. (1996). WBF properties and statistical monotonicity of the Markov process associated to Schur-constant survival functions. J. Multivariate Anal. 56 153-163.
Chien, W.-T. K. and Kuo, W. (1992). Optimal burn-in simulation on highly integrated circuit systems. IIE Transactions 24(5) 33-43.
Clarotti, C. A. and Spizzichino, F. (1990). Bayes burn-in decision procedures. Probability in the Engineering and Informational Sciences 4 437-445.
Costantini, C. and Spizzichino, F. (1990). Optimal stopping of the burn-in of conditionally exponential components. Unpublished report.
Costantini, C. and Spizzichino, F. (1991). Optimal stopping of life testing: use of stochastic orderings in the case of conditionally exponential lifetimes. In Stochastic Orders and Decision Making Under Risk (K. Mosler and M. Scarsini, eds.) 95-103. IMS, Hayward, CA.

Fuqua, N. B. (1987). Reliability Engineering for Electronic Design. Dekker, New York.
Gurland, J. and Sethuraman, J. (1994). Reversal of increasing failure rates when pooling failure data. Technometrics 36 416-418.
Iovino, M. G. and Spizzichino, F. (1993). A probabilistic approach for an optimal screening problem. J. Ital. Statist. Soc. 2 309-335.
Jensen, F. and Petersen, N. E. (1982). Burn-in. Wiley, New York.
Joe, H. and Proschan, F. (1984). Percentile residual life functions. Oper. Res. 32 668-678.
Klinger, D. J., Nakada, Y. and Menendez, M. A., eds. (1990). AT&T Reliability Manual. Van Nostrand Reinhold, New York.
Kuo, W. and Kuo, Y. (1983). Facing the headaches of early failures: a state-of-the-art review of burn-in decisions. Proceedings of the IEEE 71 1257-1266.
Launer, R. L. (1993). Graphical techniques for analyzing failure data with percentile residual-life functions. IEEE Transactions on Reliability R-42 71-75.
Lawrence, M. J. (1966). An investigation of the burn-in problem. Technometrics 8 61-71.
Leemis, L. M. and Beneke, M. (1990). Burn-in models and methods: a review. IIE Transactions 22 172-180.
Marcus, R. and Blumenthal, S. (1974). A sequential screening procedure. Technometrics 16 229-234.
Meeker, W. Q. and Escobar, L. A. (1993). A review of recent research and current issues in accelerated testing. Internat. Statist. Rev. 61 147-168.
Mi, J. (1991). Burn-in. Ph.D. dissertation, Dept. Mathematics and Statistics, Univ. Pittsburgh.
Mi, J. (1993). Discrete bathtub failure rate and upside down failure rate. Naval Research Logistics 40 361-371.
Mi, J. (1994a). Burn-in and maintenance policies. Adv. in Appl. Probab. 26 207-221.
Mi, J. (1994b). Maximization of a survival probability and its applications. J. Appl. Probab. 31 1026-1033.
Mi, J. (1994c). Limiting behavior of mixtures of discrete lifetime distributions. Technical report, Dept. Statistics, Florida International Univ.
Mi, J. (1995). Minimizing some cost functions related to both burn-in and field use. Oper. Res. To appear.
Nelson, W. (1990). Accelerated Testing. Wiley, New York.
Nguyen, D. G. and Murthy, D. N. P. (1982). Optimal burn-in time to minimize cost for products sold under warranty. IIE Transactions 14 167-174.
Park, K. S. (1985). Effect of burn-in on mean residual life. IEEE Transactions on Reliability R-34 522-523.
Rajarshi, S. and Rajarshi, M. B. (1988). Bathtub distributions: a review. Comm. Statist. 17 2597-2621.
Rawicz, A. H. (1986). Burn-in of incandescent sign lamps. IEEE Transactions on Reliability R-35 375-376.
Singpurwalla, N. and Wilson, S. (1994). Software reliability modeling. Internat. Statist. Rev. 62 289-317.
Spizzichino, F. (1991). Sequential burn-in procedures. J. Statist. Plann. Inference 29 187-197.
Spizzichino, F. (1993). A unifying model for optimal design of life testing and burn-in. In Reliability and Decision Making (R. E. Barlow, C. A. Clarotti and F. Spizzichino, eds.) 189-210. Chapman and Hall, New York.
Spizzichino, F. (1995). A probabilistic model for heterogeneous populations and related burn-in design problems. Scientific note, Dept. Mathematics, Univ. Rome "Guido Castelnuovo."
Tobias, P. A. and Trindade, D. C. (1995). Applied Reliability. Van Nostrand, New York.
Vaupel, J. W. and Yashin, A. I. (1985). Heterogeneity's ruse: some surprising effects of selection on population dynamics. Amer. Statist. 39 176-185.
Watson, G. S. and Wells, W. T. (1961). On the possibility of improving mean useful life of items by eliminating those with short lives. Technometrics 3 281-298.
Whitbeck, C. W. and Leemis, L. M. (1989). Component vs. system burn-in techniques for electronic equipment. IEEE Transactions on Reliability 38 206-209.

Comment: “Burn-In” Makes Us Feel Good

Nicholas J. Lynn and Nozer D. Singpurwalla

Nozer D. Singpurwalla is Professor, Department of Operations Research and Department of Statistics, George Washington University, Washington, DC 20052-0001 (e-mail: [email protected]). Nicholas Lynn works with Professor Singpurwalla.

1. PREAMBLE

Block and Savits, henceforth BS, have made many contributions to the mathematics of burn-in and are eminently qualified to put together a review article on this topic. Indeed, what they provide here is an authoritative survey of the technical aspects of the subject. All those who work in reliability should thank them for this and their other writings in this arena. Our intent here is not to challenge BS on the mathematics of burn-in, which undoubtedly is their territory. Rather, we take exception to their interpretation and their view of burn-in. Our main concern is that BS view burn-in as a mathematical rather than as an engineering problem. The authors are not to be faulted for this because their perspective of burn-in is, regrettably, guided by engineers who do reliability rather than by engineers who do engineering! Consequently, this survey does a good job of reporting that which is known and
written on the topic; unfortunately, that which is known is subject to debate. The result is that BS have adopted a limited view of burn-in and have refrained from a discussion of its foundational issues. Our commentary—actually an article—is written with the hope of filling these gaps and providing an alternative perspective on burn-in; in the sequel we provide some new results on mixtures of distributions that are germane to burn-in.

“Burn-in” is commonly used in engineering reliability, statistical simulation and medical sensitivity testing. In this article we discuss the philosophical underpinnings of burn-in, and make three claims. Our first claim is that the main purpose served by burn-in is psychological, that is, relating to belief. Our second claim is that burn-in is dictated by the interaction between predictive failure rates and utilities. Consequently, burn-in may be performed even if the predictive failure rate is increasing and the utility of the time on test decreasing. An example is the burn-in phase of statistical simulation, which mirrors burn-in testing of engineering components. Our third claim is that the famous “bathtub” curve of reliability and biometry rarely has a physical reality. Rather, as shown in Theorem 2, it is the manifestation of one’s uncertainty.

2. INTRODUCTION AND OVERVIEW

2.1 Background

What is “burn-in”? The answer depends on whom you ask: an engineer, a simulator or a survivor (biostatistician). Each explains burn-in differently. Our goal is to argue, using a minimum of mathematics, that there is a unifying theme underlying burn-in and, therefore, there must be a single answer to the question that is posed.

First, let us see how engineers view burn-in. To an engineer, burn-in is a procedure for eliminating “weak” items from a population (cf. Block, Mi and Savits, 1993). The population is assumed to consist of two homogeneous subpopulations: “weak” and “strong.” Burn-in is achieved by testing each item in the population for the burn-in period, and commissioning to service those items that survive the test. The items that fail the test are judged weak.

To a simulator, burn-in is the time phase during which an algorithm, such as a “Gibbs sampler,” attains its theoretical convergence (usually the weak convergence of distributions); see, for example, Besag, Green, Higdon and Mengersen (1995). Biostatisticians do not use the term burn-in, but the notion of “sensitivity testing” a new drug for a short period of time parallels the thinking of engineers.

2.2 Misconceptions about Burn-in

There appear to be at least two misconceptions about the engineer’s view of burn-in. The first is that items that are judged to have exponential life distributions (or distributions that have an increasing failure rate) should not be subjected to a burn-in (cf. Clarotti and Spizzichino, 1990). The second misconception is that the sole purpose of burn-in is the elimination of weak items from a population.

The causes of the first misconception are a failure to appreciate the role of utility in burn-in and a failure to distinguish between what Barlow (1985) refers to as the “model failure rate” and the “predictive failure rate.” Burn-in decisions should be based on the predictive failure rate, not the model failure rate. In fact, if the predictive life distribution is a mixture of exponential distributions, then burn-in must be contemplated; it should be performed if the costs of testing compensate for the avoidance of risk of in-service failures.

The cause of the second misconception is a failure to appreciate the fact that, fundamentally, there are two reasons for performing a burn-in test: psychological (i.e., those pertaining to belief) and physical (i.e., those pertaining to a change in the physical or the chemical composition of an item).

2.3 Objectives

The aim of this article is to argue that the two reasons given above cover the entire spectrum of burn-in, be it in engineering reliability, in simulation or in sensitivity testing. Also, fundamentally, since the concepts of physics, chemistry and biology influence belief (or psychology), there is only one reason for burn-in, namely, psychological. In what follows (see Section 4), we will attempt to justify our point of view. We will also point out that in one’s day-to-day life, the psychology of burn-in is routinely practiced. In Section 5 we give examples of circumstances which provide a physical motivation for burn-in. Section 6 explores the role of utility in burn-in, and Section 7, entitled “An anatomy of failure rates with decreasing segments,” leads us to a discussion of optimal burn-in times. A consequence of the material of Section 7 is our claim (Theorem 2) that the famous “bathtub curve” of reliability is rarely a physical reality; rather, it is often the manifestation of one’s subjective belief. This may come as a surprise to many.

3. NOTATION AND TERMINOLOGY

Suppose that T, the time to failure of an item, has a distribution function F(t) = P(T ≤ t) and a survival function F̄(t) = 1 − F(t). Assume that
f·‘, the probability density function of F·‘, ex- R A consequence of this theorem is the result that if
ists. If F·‘ is indexed by a parameter u so that FtŽu‘πu‘ du, the predictive life distribution, is a
PT ≤ tŽu‘ = FtŽu‘, then htŽu‘, the model failure mixture of exponential distributions, then the pre-
rate function of F·‘, is defined as dictive failure rate will be strictly decreasing.
ftŽu‘
(1) htŽu‘ = ; t ≥ 0:
FtŽu‘ 4. THE PSYCHOLOGICAL ASPECT OF BURN-IN
The function F is said to have an increasing (de- In this section, we argue that burn-in is a process
creasing) model failure rate if htŽu‘ is monoton- of learning, where by learning we mean a reduc-
ically increasing (decreasing) in t, where we use tion of uncertainty. The optimal burn-in time is the
increasing (decreasing) in place of nondecreasing time at which the amount of information that is
(nonincreasing) throughout. The function F is said gleaned from the test balances the costs of the test,
to have a constant model failure rate if htŽu‘ is where costs include the depletion of useful life. We
constant in t. It is well known that htŽu‘ is con- are prompted to make this claim as a consequence
stant in t if and only if FtŽu‘ = e−θt , an exponential of observing that engineers subject every item that
distribution. they use to a short life test prior to commissioning.
In keeping with our claim to use a minimum of This is true even of items that have an increasing
mathematics, we will call htŽθ‘ a bathtub curve if model failure rate. For such items, burn-in would
it satisfies the following definition. deplete useful life. When asked why every item is
A function gt‘ is said to be a bathtub curve if subjected to a burn-in, the answer has been that
there exists a point u > 0 such that gt‘ is strictly burn-in gives a “warm feeling” or “confidence” about
decreasing for t < u and strictly increasing for an item’s survivability. Thus engineering practice is
t > u. contrary to the statistical literature, which seems
In the context of burn-in, which is de facto a lim- to imply that only items having a strictly decreas-
ited life test, F having an increasing (decreasing) ing model failure rate should be subjected to a
model failure rate implies that burn-in results in a burn-in.
depletion (enhancement) of useful life. When F has How can one explain engineers’s actions which
a constant model failure rate, burn-in results in nei- are contrary to the literature? Our explanation is
ther a depletion nor an enhancement of useful life. that, with burn-in, we are learning by observing, so
Since the parameter u is always unknown, we burn-in must be contemplated whenever we have
need to specify a distribution for it; let the density uncertainty about the model failure rate, be it in-
of this distribution be denoted πu‘. Then ht‘, the creasing, constant or decreasing. The depletion of
predictive failure rate function of F, is given as useful life which occurs is the price that we pay for
R additional knowledge about the failure rate. The op-
ftŽu‘πu‘du
ht‘ = R timal burn-in time represents the optimal trade-off
FtŽu‘πu‘ du between knowledge and cost, and it may be greater
(2)  
Z πu‘FtŽu‘ than zero if the predictive failure rate is decreasing
= htŽu‘ R du: or has a decreasing segment.
FtŽu‘πu‘ du
Thus
Z 5. THE PHYSICAL ASPECT OF BURN-IN
(3) ht‘ = htŽu‘πuŽt‘ du;
Is uncertainty about the model failure rate the
where πuŽt‘ denotes the density of the distribution only circumstance under which a burn-in should be
of u given that T ≥ t. contemplated? The answer is no, because burn-in
Note Rthat, contrary to what many believe, may also be done in those situations wherein the act
ht‘ 6= htŽu‘πu‘ du; also, if πu‘ is degener- of using the item physically enhances its survivabil-
ate, the model and predictive failure rates agree. ity. Examples include the work hardening of ductile
We conclude this section with a statement of the materials and the self-sharpening of drill bits. Un-
following important closure (under mixtures) theo- der such circumstances the model failure rate is de-
rem. creasing, and one would contemplate a burn-in, even
if the model failure rate were known with certainty.
Theorem 1 [Barlow and Proschan (1975) page The predictive failure rate is of course decreasing
103]. If the model failure rate htŽu‘ is decreasing by Theorem 1.
in t; then the predictive failure rate ht‘ is decreas- To summarize, burn-in should be contemplated
ing in t; for any πu‘. for all items whose predictive failure rate is either
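As a small numerical illustration of equation (2), consider a two-point mixture of exponential distributions; the rates and prior weights below are arbitrary choices made only for this sketch, not figures taken from the paper. The predictive failure rate h(t) declines toward the smaller rate, whereas the naive average of the model failure rates, ∫ h(t | u) π(u) du, stays constant.

import numpy as np

# Two-point prior: "weak" items with rate 1.0 (probability 0.2) and
# "strong" items with rate 0.1 (probability 0.8).  Each model failure
# rate h(t|u) is the constant u; h(t) below is equation (2).
rates = np.array([1.0, 0.1])
prior = np.array([0.2, 0.8])

def predictive_failure_rate(t):
    f = np.sum(prior * rates * np.exp(-rates * t))   # mixed density
    s = np.sum(prior * np.exp(-rates * t))           # mixed survival function
    return f / s

naive = np.sum(prior * rates)                        # integral of h(t|u) pi(u) du
for t in [0.0, 1.0, 5.0, 20.0]:
    print(f"t = {t:5.1f}   h(t) = {predictive_failure_rate(t):.4f}   "
          f"average of model rates = {naive:.4f}")
# h(t) falls from 0.28 toward 0.10 although every model failure rate is
# constant, and for t > 0 it differs from the 0.28 average of model rates.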
4. THE PSYCHOLOGICAL ASPECT OF BURN-IN

In this section, we argue that burn-in is a process of learning, where by learning we mean a reduction of uncertainty. The optimal burn-in time is the time at which the amount of information that is gleaned from the test balances the costs of the test, where costs include the depletion of useful life. We are prompted to make this claim as a consequence of observing that engineers subject every item that they use to a short life test prior to commissioning. This is true even of items that have an increasing model failure rate. For such items, burn-in would deplete useful life. When asked why every item is subjected to a burn-in, the answer has been that burn-in gives a “warm feeling” or “confidence” about an item’s survivability. Thus engineering practice is contrary to the statistical literature, which seems to imply that only items having a strictly decreasing model failure rate should be subjected to a burn-in.

How can one explain engineers’ actions which are contrary to the literature? Our explanation is that, with burn-in, we are learning by observing, so burn-in must be contemplated whenever we have uncertainty about the model failure rate, be it increasing, constant or decreasing. The depletion of useful life which occurs is the price that we pay for additional knowledge about the failure rate. The optimal burn-in time represents the optimal trade-off between knowledge and cost, and it may be greater than zero if the predictive failure rate is decreasing or has a decreasing segment.

5. THE PHYSICAL ASPECT OF BURN-IN

Is uncertainty about the model failure rate the only circumstance under which a burn-in should be contemplated? The answer is no, because burn-in may also be done in those situations wherein the act of using the item physically enhances its survivability. Examples include the work hardening of ductile materials and the self-sharpening of drill bits. Under such circumstances the model failure rate is decreasing, and one would contemplate a burn-in, even if the model failure rate were known with certainty. The predictive failure rate is of course decreasing by Theorem 1.

To summarize, burn-in should be contemplated for all items whose predictive failure rate is either monotonically decreasing or has a decreasing segment. Burn-in should be performed if the costs of testing compensate for the avoidance of risk, either because of our added knowledge or the physical enhancement of survivability, both of which make us feel good; hence the title of this paper.

6. THE ROLE OF UTILITY IN BURN-IN

Implicit in everything we have said above is the assumption that the event of interest is failure and that there is a positive utility associated with survival. Thus, neglecting costs associated with testing, a reduction in the predictive failure rate corresponds to an increase in the expected life and, therefore, the expected utility. Thus burn-in is only considered when the predictive failure rate is decreasing or has a decreasing segment.

However, the term “failure rate” is misleading, since we never stipulate that T is the time to failure. Indeed, T may represent the time to any event of interest, and the utility associated with the time before that event’s occurrence may be negative. One example arises in statistical simulation, where an algorithm, such as the Gibbs sampler, is subjected to a burn-in to ensure its (weak) convergence. The idea here is that the algorithm experiences a phenomenon that is akin to work hardening, in the sense that each run is a stepping stone toward convergence. However, in this example, we define T to be the time until a specified convergence criterion is met; the utility associated with this time is negative. Furthermore, the model failure rate is increasing, since convergence becomes increasingly likely with each step of the algorithm.

Should we perform burn-in in this case? The answer to this question comes from a consideration of the costs. We conclude that burn-in should be contemplated whenever the predictive failure rate has an increasing segment; when the predictive failure rate is increasing, we will burn in indefinitely (i.e., until convergence is achieved). Burn-in will not be performed when the predictive failure rate is decreasing. These conclusions are opposite to those of Sections 4 and 5. Indeed, the simulation problem may be thought of as the dual (or “mirror image”) of the usual engineering problem.

This scenario raises two important issues: (i) that, in essence, burn-in is a decision problem and cannot be answered without consideration of utility; and (ii) that the material of this paper is not restricted to the analysis of failure—rather, it applies to any situation where we have uncertainty surrounding the time of an event’s occurrence.
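The simulation analogy can be made concrete with a toy sketch: the chain below is a deliberately simple AR(1) process rather than any particular Gibbs sampler, with numbers chosen only to exaggerate the transient, and it shows why the early draws of a simulation are discarded as burn-in before averages are formed.

import numpy as np

rng = np.random.default_rng(0)

# Toy Markov chain (AR(1)) whose stationary distribution is normal with
# mean 0; started far from stationarity, its early draws are unrepresentative.
phi, n_iter, burn_in = 0.9, 5_000, 500
x = np.empty(n_iter)
x[0] = 500.0                      # deliberately poor starting value
for i in range(1, n_iter):
    x[i] = phi * x[i - 1] + rng.normal()

print("mean of all draws            :", round(x.mean(), 3))
print("mean after discarding burn-in:", round(x[burn_in:].mean(), 3))
# Discarding the first 500 draws removes the transient caused by the
# starting value, so the second average is much closer to the true mean 0.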
7. AN ANATOMY OF FAILURE RATES WITH DECREASING SEGMENTS

7.1 Decreasing Failure Rates

We start off by asking the question, “What causes F to have a monotonically decreasing predictive failure rate?” Three reasons come to mind. These are (i) the physics of failure of an item, (ii) the physical mixing of several items, each having a decreasing but known model failure rate and (iii) the subjective (or psychological) mixing of a decreasing but unknown model failure rate. One may claim that (ii) above is a special case of (iii); however, it is helpful to distinguish between the two. We first elaborate on each of these and then address the question of a bathtub failure rate.

7.2 The Physics of Failure

The best examples of items whose time-to-failure distribution F has a monotonically decreasing model failure rate are those which experience work hardening. Examples include the curing of concrete slabs and the self-sharpening of drill bits. In all the above cases, the chemical bonds which hold together the atoms of a material strengthen over time or with use, making their failure increasingly unlikely over time.

In Figure 1 we illustrate several forms of decreasing model failure rates for an item and the resulting predictive failure rate, which by the closure under mixture theorem (Theorem 1), must also be decreasing. If π(u) were degenerate at u∗, then there would be only one model failure rate (corresponding to u∗) and the corresponding predictive failure rate would be h(t | u∗).

Fig. 1. Monotonically decreasing model and predictive failure rates.

7.2.1 Physical mixing. By “physical mixing” we mean the act of physically putting together several probabilistically heterogeneous items that are otherwise indistinguishable, and inquiring about the stochastic behavior of an item picked (at random) from the mixture. For example, suppose that a bin contains n items, with the ith item having model failure rate h(t | u_i), i = 1, 2, …, n, where u_i is assumed known. Suppose that the n items are otherwise indistinguishable, so that the model failure rate of an item picked at random from the bin is unknown. However, if the u_i’s have a probability mass function P(u_i), then the predictive failure rate of the item picked at random is given by the discrete mixture

(4)    h(t) = \sum_{i=1}^{n} h(t \mid u_i)\,P(u_i \mid t),

where P(u_i | t) is the probability mass of u_i given that T ≥ t.

Suppose now that each h(t | u_i) is decreasing in t. Then, by the closure under mixtures theorem (Theorem 1), h(t) is also decreasing in t. In Figure 2 we illustrate this phenomenon via model failure rates that are constant. In fact, the closure under mixtures theorem was motivated by the physical mixing of constant model failure rates.

Fig. 2. Monotonically decreasing predictive failure rate under physical mixing.
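A minimal numerical rendering of equation (4), with three invented rates and prior masses, makes the learning interpretation explicit: surviving to time t shifts the posterior mass P(u_i | t) toward the sturdier items, and the predictive failure rate falls accordingly.

import numpy as np

# Equation (4) for a bin holding three kinds of items with known constant
# model failure rates u_i.  For constant rates, the survival function is
# exp(-u_i * t), so P(u_i | t) is proportional to P(u_i) * exp(-u_i * t).
u = np.array([2.0, 0.5, 0.1])      # model failure rates (illustrative values)
p = np.array([0.3, 0.3, 0.4])      # prior masses P(u_i)

def posterior_mass(t):
    w = p * np.exp(-u * t)         # P(u_i) times the survival function at t
    return w / w.sum()

for t in [0.0, 1.0, 5.0, 20.0]:
    post = posterior_mass(t)
    h = np.sum(u * post)           # equation (4)
    print(f"t = {t:5.1f}   P(u_i | t) = {np.round(post, 3)}   h(t) = {h:.3f}")
# As t grows, the posterior mass migrates to the smallest rate and the
# predictive failure rate h(t) decreases from 0.79 toward 0.1.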
7.2.2 Subjective (or psychological) mixing. The notion of what we refer to as “subjective mixing” parallels that of physical mixing, except that now one does not conceptualize a mixing process that is prompted by physically putting together several heterogeneous items. Rather, one acts as a subjectivist (in the sense of de Finetti and Savage), and mixes over the different model failure rates that are suggested by an unknown u, via a prior π(u) over u. Specifically, suppose that an item has a model failure rate h(t | u) with u unknown. Let π(u) reflect one’s subjective opinion about the different values of u; that is, π(u) is the prior on u. Suppose that h(t | u) is decreasing in t for all values of u. Then, by the closure under mixtures theorem, the predictive failure rate of the item h(t) will also decrease in t. For example, suppose that u = θ and that F̄(t | θ) = exp(−t^θ), θ ≥ 0, t > 0, a Weibull distribution with shape parameter θ. If θ = 1, h(t | θ) = 1, a constant, whereas if θ < 1, h(t | θ) decreases in t. Thus if π(θ) has support (0, 1], then h(t) decreases in t; see Figure 3.

Fig. 3. Monotonically decreasing predictive failure rate under subjective mixing over the Weibull shape parameter θ.

Thus in the two scenarios of physical and subjective mixing, the predictive failure rate is decreasing in t, suggesting that burn-in should be contemplated.
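The Weibull example can be checked numerically. In the sketch below a uniform prior on (0, 1] stands in for π(θ), and the integrals in equation (2) are approximated on a grid; both choices are for illustration only.

import numpy as np

# Subjective mixing over the Weibull shape parameter theta:
# survival  Fbar(t | theta) = exp(-t**theta),
# density   f(t | theta)    = theta * t**(theta - 1) * exp(-t**theta).
theta = np.linspace(0.001, 1.0, 4000)    # grid standing in for a uniform prior on (0, 1]

def predictive_h(t):
    fbar = np.exp(-t ** theta)
    f = theta * t ** (theta - 1.0) * fbar
    return f.mean() / fbar.mean()        # equation (2) with equal prior weights

for t in [0.25, 0.5, 1.0, 2.0, 5.0, 10.0]:
    print(f"t = {t:5.2f}   predictive h(t) = {predictive_h(t):.4f}")
# Each model failure rate theta * t**(theta - 1) is nonincreasing for
# theta in (0, 1], so, as Theorem 1 promises, the printed values decrease.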
7.3 Bathtub Failure Rates

We now ask the question, “What causes F to have a decreasing and then increasing failure rate?” It is difficult to think of an example from the physical sciences for which one could come up with a convincing argument about the changing behavior of chemical bonds. That is, the bonds must initially strengthen with use and then weaken. In the biological context, it has been conjectured that the immune system initially improves with age but then gets worse, and so a use of the bathtub curve in human mortality tables has a biological justification. However, the most convincing argument—at least to us—is that of mixing, either due to physical or, more likely, subjective causes. As an example of the above suppose that F is a Rayleigh distribution, truncated at the left at zero, so that h(t | θ) = 2t + θ, with θ unknown. Let π(θ) have support [0, ∞). Then it can be shown (see Theorem 2 below) that the predictive failure rate of F initially decreases and then increases, like a bathtub curve (see Figure 4). Gurland and Sethuraman (1994, 1995) discuss other cases wherein the mixture of increasing model failure rates could result in decreasing predictive failure rates. Their results suggest that in the presence of uncertainty, it is unusual for the predictive failure rate to be increasing.

Fig. 4. Subjective mixing of increasing model failure rates resulting in bathtub-shaped predictive failure rate.

The example depicted in Figure 4 suggests the following theorem, which is a generalization of the situation discussed.

Theorem 2. Suppose that h(t | θ) = α(t) + θ, where θ ≥ 0 is unknown and α(·) is convex. Let π(θ) describe our uncertainty about θ, and let Var(θ | t) denote the variance of θ given T > t. Then h(t) has a bathtub shape if

    Var(θ | 0) > (d/dt)α(0),

in which case the minimum occurs when Var(θ | t) = (d/dt)α(t).

This result follows from the fact that h(t) = α(t) + E[θ | t], where E[θ | t] is a decreasing, convex function of t. The above theorem, as well as the example, show that the popular bathtub curve of reliability is not necessarily physically realistic. Rather, it is a consequence of belief produced by the process of subjectively mixing increasing model failure rates having certain properties. Note how the shape of the predictive failure rate is directly linked to our uncertainty, via the prior variance.
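One way to see the fact quoted above is the following short calculation, in which A(t) denotes the integral of α over [0, t]; it is offered as a sketch, and the original argument may differ in detail.

\begin{align*}
\bar F(t) &= \int \exp\{-A(t) - \theta t\}\,\pi(\theta)\,d\theta = e^{-A(t)}\,M(t),
  \qquad M(t) := \int e^{-\theta t}\,\pi(\theta)\,d\theta,\\
h(t) &= -\frac{d}{dt}\log \bar F(t) = \alpha(t) - \frac{M'(t)}{M(t)}
      = \alpha(t) + E[\theta \mid T > t],\\
\frac{d}{dt}E[\theta \mid T > t]
  &= -\frac{M''(t)}{M(t)} + \Bigl(\frac{M'(t)}{M(t)}\Bigr)^{2}
   = -\operatorname{Var}(\theta \mid T > t),\\
h'(t) &= \alpha'(t) - \operatorname{Var}(\theta \mid T > t).
\end{align*}

In particular, h decreases at the origin exactly when Var(θ | 0) > α′(0), and any stationary point satisfies Var(θ | t) = (d/dt)α(t), which is the condition displayed in Theorem 2.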
8. THE OPTIMAL BURN-IN TIME

The question, “When should we burn-in?”, leads us naturally to the issue of an optimal burn-in time. To address this issue, let us first put into perspective the circumstances under which the predictive failure rate has a decreasing segment. These are (i) mixing due to uncertainty about constant, increasing or decreasing model failure rates and (ii) model failure rates which are strictly decreasing because of physical circumstances, but about which we are certain.

Under (i) above, burn-in can be viewed as a process of learning, that is, a reduction of uncertainty about T. To see this, suppose that the optimal burn-in time is τ ≥ 0. When the burn-in test shows that T > τ, our predictive ability about T sharpens (via added knowledge about u). If the burn-in test shows that T ≤ τ, then F is degenerate at some t ∈ [0, τ], and the item tested is declared a weak one. Thus for predictive failure rates given by mixtures, be they decreasing or bathtub, burn-in gives us added knowledge. The price we pay for this knowledge is the cost of testing and the depletion of useful life if the model failure rate is increasing in t. The optimal τ is a trade-off between the costs and the utility of reduced uncertainty (see Theorem 3, below). Clearly, burn-in should not be done if (i) the predictive failure rate h(t) is increasing in t or (ii) our trade-off calculations show that τ = 0; see Theorem 2.

8.1 The Scenario of Indefinite Burn-in: Eternal Happiness

A situation of interest is that of h(t | u) strictly decreasing in t, with π(u) being degenerate. If the costs of burn-in are zero, then τ → ∞, because burn-in enhances useful life. This implies that indefinite burn-in leads to eternal happiness! However, since u is an abstraction (just a Greek symbol to de Finetti), a degenerate π(u) is not realistic, and thus eternal happiness is a myth. If the costs of burn-in are greater than zero, then τ is the time at which the costs of burn-in and the utility of enhanced life due to burn-in balance out.

The above matters are summarized and quantified via the following theorem due to Clarotti and Spizzichino (1990)—extended further by Block, Mi and Savits (1993).

Suppose that F has a density f, and suppose that g(t) ≡ f(t + s)/f(t) increases in t for all s > 0. Let c_1 denote the cost if T < τ, let C be the cost if τ ≤ T ≤ τ + s (where s can be viewed as the mission time) and let −K be the reward if T > τ + s. Then:

Theorem 3 [Clarotti and Spizzichino (1990)].

(i) Burn in indefinitely, iff

    \lim_{t \to \infty} g(t) < \frac{C - c_1}{C + K} = v.

(ii) Do not burn in, iff g(0) ≥ v.

(iii) Burn in for time τ > 0, iff g(τ) = v.

Note that the indefinite burn-in of (i) above is different from the indefinite burn-in of eternal happiness discussed above. The former is based on costs of testing and in-service failure; the latter assumes that the costs of burn-in are zero.
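As an illustration of how the rule might be applied, the sketch below evaluates Theorem 3 for the two-point exponential mixture used earlier, with cost figures that are entirely invented. A mixture of exponentials has a log-convex density, so g(t) = f(t + s)/f(t) is indeed increasing, as the theorem requires.

import numpy as np

# Decision rule of Theorem 3 for an illustrative mixture of two exponentials
# (the weak/strong population used earlier) and made-up costs.
rates, prior = np.array([1.0, 0.1]), np.array([0.2, 0.8])
s = 1.0                    # mission time
c1, C, K = 1.0, 10.0, 2.0  # costs: c1 if T < tau, C if tau <= T <= tau + s, -K if T > tau + s
v = (C - c1) / (C + K)

def f(t):                  # predictive density of the mixture
    return np.sum(prior * rates * np.exp(-rates * t))

def g(t):                  # g(t) = f(t + s) / f(t), increasing for this log-convex f
    return f(t + s) / f(t)

g_inf = np.exp(-rates.min() * s)       # limit of g(t) as t -> infinity
if g(0.0) >= v:
    print("do not burn in")
elif g_inf < v:
    print("burn in indefinitely")
else:                                  # solve g(tau) = v by bisection
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < v else (lo, mid)
    print(f"burn in for tau = {0.5 * (lo + hi):.2f}  (v = {v:.3f})")

With these particular numbers g(0) falls below v while the limiting value of g exceeds it, so case (iii) applies and the bisection returns a burn-in time of roughly two time units.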
9. CONCLUDING COMMENTS

Let us return to the original question: “What is burn-in”? We argue that it is primarily a mechanism for learning.

The model failure rate describes the physical process of aging. The predictive failure rate describes our changing beliefs about an item as we observe it surviving. Since burn-in is performed for a psychological purpose, it is only natural to base burn-in calculations upon the predictive failure rate. The model and predictive failure rates may have very different forms. Indeed, while the famous bathtub curve rarely has a physical motivation, it arises quite naturally in our minds.

ACKNOWLEDGMENTS

The authors would like to thank Dennis Lindley and Al Marshall for their helpful comments on an earlier draft of this paper. Supported by Army Research Office Grant DAAH04-93-G-0020 and Air Force Office of Scientific Research Grant AFOSR-F-49620-95-1-0107.

REFERENCES

Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability: Probability Models. Holt, Rinehart & Winston, New York.
Besag, J., Green, P., Higdon, D. and Mengersen, K. (1995). Bayesian computation and stochastic systems (with discussion). Statist. Sci. 10 3–66.
Gurland, J. and Sethuraman, J. (1995). How pooling data may reverse increasing failure rates. J. Amer. Statist. Assoc. 90 1416–1423.
