Optimal Variability Sensitive Conditionbased Maintenance With A Cox PH-chen2011
To cite this article: Nan Chen , Yong Chen , Zhiguo Li , Shiyu Zhou & Cris Sievenpiper (2011)
Optimal variability sensitive condition-based maintenance with a Cox PH model, International
Journal of Production Research, 49:7, 2083-2100, DOI: 10.1080/00207541003694811
Downloaded by [Northwestern University] at 06:49 23 December 2014
International Journal of Production Research
Vol. 49, No. 7, 1 April 2011, 2083–2100
1. Introduction
Despite continuing improvements in quality and reliability, systems in production and
service industries are still subject to deterioration and failure during use. Preventive
maintenance therefore remains necessary to reduce unexpected system failures, and it has
attracted extensive research (see, e.g., Valdez-Flores and Feldman 1989, Wang 2002 for
reviews).
Because of the rapid development of information and computer technology, a huge
amount of data, such as in-process sensing signals (e.g., vibration, acoustic emission),
usage patterns, and system event logs, are often collected electronically during the
operation of many systems. It is generally believed that this data provides rich information
regarding system working conditions. For example, a faulty detector in a computed
tomography machine will eventually lead to a ‘scan abort’ failure. However, before the
failure, a faulty detector can cause a series of other events such as analogue-to-digital
converter error, communication error, and software error. By observing these preceding
events, we can predict the occurrence of the key failure and accordingly prevent its
occurrence or minimise its damages.
Among various preventive maintenance policies, the one that suggests system
inspection and maintenance actions based on the on-line observations of system conditions
is called condition based maintenance (CBM). In many cases, CBM provides better
performance than time based maintenance and the broad availability of data provides
great opportunities for establishing optimal CBM policies. Thus, CBM has drawn
significant attention in recent years. Jardine et al. (2006) provided an excellent review of
condition based maintenance. It was noted that a critical task in CBM is to identify a
failure prognostic model that describes both the system degradation process and the
impact of maintenance policies.
In CBM, according to whether the health states of systems are directly observable, they
maintenance policies. Tapiero and Venezia (1979) treated the variance of the cost as a risk
factor to study a maintenance problem. Rangan and Grace (1988) used variance
optimisation criterion to obtain the optimal replacement cycle for systems. However, both
works only considered periodic replacement policy. Instead, Chen and Jin (2003)
considered the cost-variability-sensitive criterion on different policies, such as age
replacement and periodic replacement under minimal repair. They provided the conditions
under which the variability-sensitive policies have a finite optimal solution. However, how
to incorporate variability-sensitive policy into CBM is still an open question. It was
realised that the optimal policy is more difficult to obtain compared with the
variability-neutral policy due to the complexity introduced by the variance of cost in the
objective function.
In this paper, we want to identify the optimal preventive maintenance policies that can
minimise the average maintenance cost and cost variability of a given system where a PH
model is used to describe its health evolution process. Because of the complexity
introduced by relaxing assumptions on the PH model and adding cost variability in the
objective function, it may be very difficult, if not impossible, to derive the objective
function analytically. Consequently, classical numerical algorithms cannot be used to find
the optimal solution. Therefore, we propose to use a simulation based methodology to
optimise the maintenance policies. Compared with traditional optimisation of maintenance
policies, simulation based methodology does not rely on restrictive (sometimes
unrealistic) assumptions about system health evolution, and therefore can be applied in
broader areas. In this paper, we propose to use a simulation model to replicate the
evolution of the system health, based on which different maintenance policies are
evaluated and compared. An improved optimisation algorithm is also presented to
increase the convergence speed of the search process.
The rest of the paper is organised as follows. In Section 2, detailed formulation of the
problem is given. In Section 3, the simulation model of system degradation and condition
based maintenance is developed, and the optimisation framework ANP-SS is presented in
detail. In Section 4, a case study based on real world data is presented to illustrate the
effectiveness of our methods. Based on the case study, some general practical implications
will be discussed. Finally, we conclude the paper in Section 5 and discuss potential future
research directions.
where $P$ is the maintenance policy to be optimised, chosen from the set of all feasible
policies; $I$ is the inspection interval, taking positive values, which is optimised jointly
with $P$; $\lambda$ is the factor that adjusts the weight of cost variability in the objective
function; and $C$ is the random variable denoting the total cost incurred during a
pre-specified time frame, say $T$. The expectation and variance of $C$ are denoted by $E(C)$
and $\mathrm{Var}(C)$, respectively. The cost of preventive maintenance is $C_p$; the cost of
emergency replacement (when the system fails between two successive inspections) is $C_f$;
and the cost of inspection is $C_I$. Usually we have $C_f > C_p > C_I$. Furthermore, $N_{PR}$,
$N_{ER}$, and $N_{IP}$ are the random variables denoting the numbers of preventive maintenance
actions, emergency replacements, and inspections completed within the total time $T$,
respectively. In this paper, we consider the set of hazard rate control limit policies, i.e.:

$$\left\{ D(t, g) \;\middle|\; D(t, g) = \begin{cases} 1, & h(t \mid Z(t)) > g \\ 0, & \text{otherwise} \end{cases},\ t = kI\ (k = 1, 2, 3, \ldots);\ g \in \mathbb{R}^{+} \right\}, \tag{2}$$
where $D(t, g)$ is the decision made at each inspection time $t$, which equals 1 when
immediate preventive maintenance action is taken and 0 when no action is enforced; $g$ is
the hazard threshold; and $h(t \mid Z(t))$ is the system hazard rate at time $t$ with
observed covariates $Z(t)$. According to the proportional hazard (PH) model (Cox 1972), we
have:

$$h(t \mid Z(t)) = h_0(t) \exp\left(\gamma^{T} Z(t)\right) = h_0(t) \exp\left[\sum_{k=1}^{p} \gamma_k Z_k(t)\right], \tag{3}$$

where $h_0(t)$ is the baseline hazard rate function; $Z(t) = (Z_1(t), \ldots, Z_p(t))^{T}$ are the
observations of system conditions; and the vector $\gamma = (\gamma_1, \ldots, \gamma_p)^{T}$ is the
coefficient vector. In this paper, we assume that the PH model is explicitly known, either
from engineering knowledge or from estimation based on historical data.
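As a minimal sketch of how the control limit rule in (2) combines with the PH model in (3), the following Python fragment evaluates the hazard from a baseline function and a covariate vector and then applies the threshold. The function names, the constant baseline hazard, and all numerical values are hypothetical and chosen only for illustration; they are not taken from the case study.

```python
import math

def ph_hazard(t, covariates, coefficients, baseline_hazard):
    """Hazard h(t | Z) = h0(t) * exp(sum_k coefficient_k * Z_k)."""
    linear_term = sum(g * z for g, z in zip(coefficients, covariates))
    return baseline_hazard(t) * math.exp(linear_term)

def control_limit_decision(t, covariates, coefficients, baseline_hazard, threshold):
    """Return 1 (preventive maintenance) if the hazard exceeds the threshold, else 0."""
    hazard = ph_hazard(t, covariates, coefficients, baseline_hazard)
    return 1 if hazard > threshold else 0

# Hypothetical inputs: constant baseline hazard, two binary covariates.
example_baseline = lambda t: 0.05
example_coefficients = [math.log(2.0), 1.0]
example_decision = control_limit_decision(
    3.0, [1, 0], example_coefficients, example_baseline, threshold=0.08
)
```

With these illustrative inputs the first covariate doubles the baseline hazard, so the hazard of about 0.1 exceeds the threshold of 0.08 and the rule calls for preventive maintenance.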
Even though the observations Z(t) may contain some time-varying variables, it is often
expensive and impractical to implement continuous monitoring (Jardine et al. 2006).
Instead, we assume the system is inspected periodically at fixed interval I, where condition
data Z(t) will be collected and system health h(t|Z(t)) will be updated. Clearly, the
frequency of inspection has some impact on the maintenance policies. For example, if the
inspection frequency is below a certain level, then the probability that the system will fail
between two inspections will increase, and thus the total emergency replacement cost will
increase. On the other hand, if the inspection frequency is too high, although it can update
the system condition promptly, the cost incurred by frequent inspection will increase.
Obviously, there is a trade-off between the inspection cost and the emergency replacement
cost; and it is desired to find a good inspection interval that can balance these two costs to
achieve the optimal results. In fact, the problem of identifying the optimal inspection
interval under some simple system degradation model has been investigated by some
researchers (e.g., Hosseini et al. 2000, Grall et al. 2002a, b). Motivated by these
observations, we will jointly optimise the inspection interval I and the maintenance
policies P.
It is worth noting that our methodology does not require a closed-form expression
of the objective function. Thus, the proposed methodology can be extended easily to other
more complicated maintenance policies. However, in this paper we limit our scope to the
control limit policy for illustration purposes. The major assumptions and settings in our
problem formulation are summarised as follows:
(1) The system degradation process can be described by a proportional hazard (PH)
model, with covariates observable at inspection.
follows a unit exponential distribution (Leemis et al. 1990). Therefore, to generate the
failure time, we can first generate a unit exponential random variable $u$ and then solve
the equation $H(t) = u$ to obtain the corresponding failure time $t$. However, the baseline
hazard function and the covariate functions can be very complex, making the integral
equation in (4) difficult to solve analytically. For complicated models, numerical methods
must therefore be used.
Fortunately, in many engineering applications, the baseline hazard function can be well
approximated by the hazard function of a Weibull distribution with shape parameter $\beta$
and scale parameter $\alpha$, so that the baseline function is $h_0(t) = \alpha \beta t^{\beta - 1}$.
In this case, it is possible to generate the failure time more efficiently. Because the
covariates are updated at a fixed interval and are treated as constant between two
successive inspections, it follows from (3) that the exponent part of the hazard function
changes only at inspections and remains constant otherwise. In other words, writing $c_i$
for the scaled exponent part on the $i$th inspection interval, it has the form:

$$\alpha \exp\left(\sum_{k=1}^{p} \gamma_k Z_k(t)\right) = \begin{cases} c_0, & 0 \le t < t_1 \\ c_1, & t_1 \le t < t_2 \\ \ \vdots & \end{cases}$$
For illustration, a typical cumulative hazard function curve is shown as the dashed line
in Figure 1. The baseline cumulative hazard function is also depicted as the solid line for
comparison.
[Figure 1: plot of log(hazard) against Time.]
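It can be noted that $H(t)$ in this case is a stepwise invertible function. By solving the
equation $H(t) = u$, we can obtain:

$$t = \begin{cases}
\left(\dfrac{u}{c_0}\right)^{1/\beta}, & 0 \le u < c_0 t_1^{\beta} \\[2ex]
\left(\dfrac{u - c_0 t_1^{\beta}}{c_1} + t_1^{\beta}\right)^{1/\beta}, & c_0 t_1^{\beta} \le u < \left[c_0 t_1^{\beta} + c_1 (t_2^{\beta} - t_1^{\beta})\right] \\[1ex]
\quad \vdots & \quad \vdots \\[1ex]
\left(\dfrac{u - \left[c_0 t_1^{\beta} + c_1 (t_2^{\beta} - t_1^{\beta}) + \cdots + c_{n-1} (t_n^{\beta} - t_{n-1}^{\beta})\right]}{c_n} + t_n^{\beta}\right)^{1/\beta}, & \left[c_0 t_1^{\beta} + c_1 (t_2^{\beta} - t_1^{\beta}) + \cdots + c_{n-1} (t_n^{\beta} - t_{n-1}^{\beta})\right] \le u < \left[c_0 t_1^{\beta} + c_1 (t_2^{\beta} - t_1^{\beta}) + \cdots + c_n (t_{n+1}^{\beta} - t_n^{\beta})\right]
\end{cases} \tag{7}$$

where $u$ is a random variable following the unit exponential distribution and $\beta$ is the
Weibull shape parameter.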
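The inversion in (7) can be sketched in a few lines of Python. The sketch assumes that $c_i$ is the constant multiplying $t^{\beta}$ in the cumulative hazard on $[t_i, t_{i+1})$ with $t_0 = 0$, that the last supplied constant applies to all later times, and that every $c_i$ is positive. The function names `failure_time` and `draw_failure_time` and their argument layout are illustrative choices rather than part of the original method description.

```python
import math
import random

def failure_time(u, c, inspection_times, shape):
    """Solve H(t) = u for a piecewise Weibull-type cumulative hazard.

    c[i] multiplies t**shape on [t_i, t_{i+1}), where t_0 = 0 and
    inspection_times = [t_1, t_2, ...]; the last c[i] holds for all later t.
    """
    knots = [0.0] + list(inspection_times)
    if not c or len(c) > len(knots):
        raise ValueError("need between 1 and len(inspection_times) + 1 constants")
    cumulative = 0.0  # value of H at the left end of the current interval
    for i, ci in enumerate(c):
        left = knots[i]
        last = (i + 1 == len(c))
        # Increase of H over the current interval (unbounded for the last one).
        step = math.inf if last else ci * (knots[i + 1] ** shape - left ** shape)
        if u < cumulative + step:
            return ((u - cumulative) / ci + left ** shape) ** (1.0 / shape)
        cumulative += step

def draw_failure_time(c, inspection_times, shape, rng=random):
    """Draw a unit exponential variate and map it to a failure time."""
    return failure_time(rng.expovariate(1.0), c, inspection_times, shape)
```

For example, with `shape = 1`, `c = [1.0, 2.0]` and a single inspection at time 1, the cumulative hazard is $t$ before the inspection and $1 + 2(t - 1)$ afterwards, so $u = 2$ maps to a failure time of 1.5.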
[Flowchart fragment. Decision nodes: 'Failure occurred?' and 'Maintenance?'. Actions: 'Emergent repair' and 'Replace the component'.]
function should be unbiased to ensure convergence. Suppose under a given CBM policy,
we run the simulation n times, and obtain the total costs C1, C2, . . . , Cn. Then the objective
function can be estimated by:
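$$\hat{f} = \frac{1}{n} \sum_{i=1}^{n} C_i^2 + \frac{\lambda - 1}{n - 1} \sum_{i=1}^{n} (C_i - \bar{C})^2, \quad \text{where } \bar{C} = \frac{1}{n} \sum_{i=1}^{n} C_i. \tag{8}$$

$$E(\hat{f}) = [E(C)]^2 + \lambda \mathrm{Var}(C).$$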
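A short sketch of this estimator in Python is given below. It assumes the objective has the form $[E(C)]^2 + \lambda \mathrm{Var}(C)$ and uses the estimator $\frac{1}{n}\sum_i C_i^2 + \frac{\lambda - 1}{n - 1}\sum_i (C_i - \bar{C})^2$: the first term has expectation $[E(C)]^2 + \mathrm{Var}(C)$, and the second term adds $(\lambda - 1)\mathrm{Var}(C)$ through the usual unbiased sample variance. The function name `objective_estimate` and the sample costs are hypothetical.

```python
def objective_estimate(costs, weight):
    """Estimate [E(C)]^2 + weight * Var(C) from replicated total costs."""
    n = len(costs)
    if n < 2:
        raise ValueError("at least two replications are needed")
    mean = sum(costs) / n
    mean_of_squares = sum(c * c for c in costs) / n
    sum_sq_dev = sum((c - mean) ** 2 for c in costs)
    # mean_of_squares is unbiased for [E(C)]^2 + Var(C);
    # sum_sq_dev / (n - 1) is unbiased for Var(C).
    return mean_of_squares + (weight - 1.0) / (n - 1) * sum_sq_dev

# Hypothetical total costs from three simulation replications.
example_value = objective_estimate([1.0, 2.0, 3.0], weight=20.0)
```

With a weight of 1 the estimate reduces to the sample mean of the squared costs, which is a quick sanity check on any implementation.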
To avoid gradient estimation, which is often very time-consuming for simulation-based
procedures, we adopt and improve the gradient free optimisation method nested partition
(NP) (Shi and Chen 2000) in this paper. The idea of NP is as follows. In each iteration, a
region is selected as the most promising region. This region is then partitioned into $M$
subregions, and all the other regions are aggregated into one region. Each of these $M + 1$
disjoint regions is sampled and evaluated through some performance function. The
region with the highest score is selected as the most promising region in the next
iteration. A brief description of the procedure is given in the Appendix.
It is also worth noting that the NP method is most effective with finite or countable
sample space. In our application, we first discretise the continuous sample space to a
discrete and countable sample space at a given precision before applying the optimisation
methods. We also propose some improvements on the original NP framework. The
method we use here is called adaptive nested partition with sequential selection, or
ANP-SS for short. Simply speaking, we improve the estimation of the promising index,
and choose the most promising region more efficiently in each iteration. In the following,
we will focus on describing the changes we made on the original NP framework.
In this way, each promising region can be partitioned into M subregions. The next step
would be drawing random samples from these regions.
In the general case, the feasible region in the sample space may have a complex shape.
In the case where the feasible region is convex and defined by a set of linear constraints, a
procedure called MIX-D can be used to draw samples approximately uniformly from the
region (Pichitlamken and Nelson 2003). In this paper, we use stratified sampling, which
takes samples at each dimension separately, and then combines them together to obtain
the final samples from the solution space. To be specific, we denote
$\sigma(k) = \{x \mid l_i \le x_i \le u_i,\ 1 \le i \le n\}$ as the space to be sampled. For
dimension $i$, we draw $m(k)$ random samples uniformly from the range $l_i \le x_i \le u_i$,
denoted as $x_{ij}$, $j = 1, 2, \ldots, m(k)$. After obtaining the samples in each dimension,
we can combine them to obtain the samples in the original space by:
Since the samples in each dimension are independent of those in the other dimensions,
uniform sampling in each dimension guarantees that $x$ is uniformly distributed in the
original space.
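The per-dimension sampling described above can be sketched as follows for a box-shaped region. The function name `sample_box` and the example bounds are assumptions made for illustration; the sketch simply pairs the $j$th draw of every dimension to form the $j$th point.

```python
import random

def sample_box(lower, upper, m, rng=random):
    """Draw m points from the box {x : lower[i] <= x[i] <= upper[i]}.

    Each coordinate is sampled independently and uniformly; the j-th draws
    of all coordinates are then combined into the j-th point.
    """
    per_dimension = [
        [rng.uniform(lo, hi) for _ in range(m)] for lo, hi in zip(lower, upper)
    ]
    return [tuple(draws[j] for draws in per_dimension) for j in range(m)]

# Hypothetical two-dimensional region: inspection interval and hazard threshold.
example_points = sample_box([1.0, 0.0], [10.0, 1.0], m=5, rng=random.Random(1))
```

Passing a seeded `random.Random` instance keeps the draws reproducible across runs, which is convenient when comparing policies on common random numbers.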
$$S_{ij}^2 = \frac{1}{n_0 - 1} \sum_{\tau=1}^{n_0} \left( Y_{i\tau} - Y_{j\tau} - \bar{Y}_i(n_0) + \bar{Y}_j(n_0) \right)^2, \tag{11}$$

where $\bar{Y}_i(\tau)$ is the sample mean using the first $\tau$ samples from the simulation
results $Y_i$. With the initial estimates of variance, we can determine the procedure
[Equation fragment from the screening step: a minimum over $j \in Q_{\mathrm{old}},\ j \neq i$ of sums indexed from $\tau = 1$.]
Step 4: Stopping rule. If any of the following three criteria is satisfied, then the sequential
selection is terminated, and the most promising region is returned:
• If $n_0 > \max\{N_i\}$, then select the solution with the smallest $\bar{Y}_i(n_0)$, and the
corresponding subregion as the most promising region $\sigma(k+1)$.
• Adaptive partitioning. If $x_i \in \sigma(k) = \{x \mid l_w \le x_w \le u_w,\ 1 \le w \le n\}$,
$\forall i \in Q$, and:

$$\max_{i, j \in Q} \left( x_{is} - x_{js} \right) \le \max_{w = 1, 2, \ldots, M} \left( r_{w+1} - r_w \right),$$

where $r_w$ are the subregion boundaries defined in (10), then the most promising
region is constructed as $\sigma(k+1) = \{x \mid l_w \le x_w \le u_w,\ w \ne s \text{ and } l'_s \le x_s \le u'_s\}$,
where $s$ is the dimension chosen as the partitioning dimension, and:

$$l'_s = \frac{1}{|Q|} \sum_{j \in Q} x_{js} - \frac{1}{2} \max_{w = 1, 2, \ldots, M} \left( r_{w+1} - r_w \right) \quad \text{and} \quad u'_s = \frac{1}{|Q|} \sum_{j \in Q} x_{js} + \frac{1}{2} \max_{w = 1, 2, \ldots, M} \left( r_{w+1} - r_w \right), \tag{14}$$

where $|Q|$ is the number of remaining elements in the set $Q$, and $x_{js}$ is the value
of $x_j$ in the $s$th dimension.
• Run an additional simulation for each $x_i$, $\forall i \in Q$, and set $\tau = \tau + 1$. If
$\tau = \max\{N_i\} + 1$, then select the solution with the smallest $\bar{Y}_i(\tau)$, and the
corresponding subregion as the most promising region $\sigma(k+1)$; otherwise, go to
Step 3 for further screening.
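The adaptive partitioning test and the bounds in (14) can be illustrated with the sketch below. It assumes the surviving samples already lie in the current region, that `boundaries` holds the cut-points $r_1, \ldots, r_{M+1}$ along the partitioning dimension `s`, and that the new interval is not clipped to the old bounds, since no clipping is stated in the text. The function name `adaptive_region` is hypothetical.

```python
def adaptive_region(samples, lower, upper, s, boundaries):
    """Build the re-centred most promising region, or return None.

    samples    : surviving points (the set Q), each a sequence of coordinates
    lower/upper: bounds of the current region, one entry per dimension
    s          : index of the partitioning dimension
    boundaries : subregion cut-points r_1 <= ... <= r_{M+1} along dimension s
    """
    max_width = max(b - a for a, b in zip(boundaries, boundaries[1:]))
    values = [x[s] for x in samples]
    # The largest pairwise difference along dimension s is max - min.
    if max(values) - min(values) > max_width:
        return None  # samples too spread out: continue the sequential selection
    centre = sum(values) / len(values)
    new_lower, new_upper = list(lower), list(upper)
    new_lower[s] = centre - 0.5 * max_width
    new_upper[s] = centre + 0.5 * max_width
    return new_lower, new_upper

# Hypothetical case: two survivors straddle the cut-point at 0.5.
example_region = adaptive_region(
    [(0.4, 1.0), (0.6, 2.0)], [0.0, 0.0], [1.0, 5.0], s=0, boundaries=[0.0, 0.5, 1.0]
)
```

In this example the two survivors fall in different original subregions, yet the re-centred interval from 0.25 to 0.75 along dimension 0 covers both of them.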
The first three steps in the above procedure are the same as those given in
Pichitlamken (2002). However, adaptive partitioning is added to the stopping rule in Step
4 in order to improve the original method. In the original method, the sequential selection
is stopped when all the remaining samples are in the same subregion. In the proposed
method, by contrast, the sequential selection is stopped when the remaining samples are
close enough to each other to form a new subregion as the most promising region. The
reason for adaptive partitioning is that the cut-points that define the subregions are
selected arbitrarily, without any consideration of the structure of the objective function.
It is therefore possible that the subregion selection is inappropriate, which can lead to
incorrect selection of the most promising region, as demonstrated in Figure 3.
From Figure 3, we can observe that when the original partitioning line is close to the
minimum solution we try to find, and if we select the most promising region from the
original subregions, then it is very likely we will miss the optimal solution in our most
promising region, which will cause inefficiency. Instead, if we find from the remaining
samples that they are close to each other, as illustrated by the black crosses in the figure,
although not in one subregion, we can repartition the original region to have one
subregion cover this area, and accordingly this subregion will become the most promising
region in the next iteration. Through this strategy, we can reduce the number of iterations
and evaluations of the objective function, and thus increase the efficiency and effectiveness
of the optimisation scheme.
After pattern identification and model selection, we can select several important events
as predictor events, and use them as covariates to predict the failure distribution (Li et al.
2007). The model estimated from the data is shown below:
to estimate the distributions of their occurrence time. It is observed that, not all the
predictor events would happen before the critical failures. In other words, some predictor
event occurrence times would be censored by the failures. Therefore, it is necessary to take
this censored data into consideration when estimating the distribution of the occurrence
time of predictor events. Figure 4 illustrates the empirical survival function of the predictor
events (considering censored data) and their corresponding estimated survival function
using exponential distribution.
From Figure 4, we can find that the censoring indeed has a large influence on the
estimation of the distribution. If we ignore the censored data, and only use the completely
observed data to test the goodness of fit of the estimated distribution, the test may reject
the hypothesis that the data comes from the tested distribution. However, by considering
the censoring in the data, the p-value of the goodness of fit tests would be improved, as
shown in Table 1.
Figure 4. Comparison of the empirical distribution and the estimated exponential distribution (survival probability against time).
From Table 1, we can observe that the goodness-of-fit $\chi^2$ tests do not reject the
hypothesis that the data come from these estimated distributions. Therefore, we will use
these estimated distributions to generate the occurrence times of predictor events during
simulation.
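One standard way to account for censoring when fitting an exponential distribution is the censored-data maximum likelihood estimate, in which the rate equals the number of observed events divided by the total time at risk. The sketch below illustrates that estimate; it is not claimed to be the exact estimation procedure used in the case study, and the function names and data are hypothetical.

```python
import math

def exponential_rate_mle(times, observed):
    """Censored-data MLE of an exponential rate.

    times    : time at which each unit's event occurred or was censored
    observed : True if the event was observed, False if censored
    """
    events = sum(1 for flag in observed if flag)
    exposure = sum(times)  # censored units still contribute time at risk
    if events == 0 or exposure <= 0:
        raise ValueError("need at least one observed event and positive exposure")
    return events / exposure

def exponential_survival(t, rate):
    """Survival function S(t) = exp(-rate * t) of the fitted distribution."""
    return math.exp(-rate * t)

# Hypothetical data: the second occurrence time is censored by a failure.
example_rate = exponential_rate_mle([2.0, 4.0, 6.0], [True, False, True])
```

Dropping the censored observation here would give a rate of 2/8 instead of 2/12, which shows how ignoring censoring overstates how quickly the predictor events occur.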
With the distribution of the covariates and the PH model as shown in (16), we can use
the methodology described in Section 3.1 to generate the failure time based on (7).
Additionally, with the estimated baseline hazard function parameters (shape $\beta = 1.558$
and scale $\alpha = 0.0315$), we can plug them into (7) to obtain the failure time according
to the distribution implied by the PH model.
After identifying the PH model, we can use simulation optimisation to find the optimal
policies. Suppose we use $\lambda = 20$ in the objective function as the weight of cost
variance, and the corresponding costs for preventive maintenance and emergency
replacement are $C_p = 200$ and $C_f = 800$. To illustrate the effectiveness of our simulation
optimisation framework, we first set the inspection interval to 1 (month), and use our
method to find the optimal hazard threshold. The simulation length is 100 (months), and no
inspection cost is considered during this validation process. For comparison, we also use
10,000 replications to estimate the objective function under different hazard thresholds,
and the result is shown in Figure 5.
From Figure 5, we can find that the variability sensitive policy is more conservative
and results in smaller hazard threshold. It can also be observed that with little sacrifice in
mean cost, the variability of the maintenance cost can be greatly reduced. The optimal
hazard threshold for variability sensitive policy is identified as 0.11 from the graph.
Alternatively, if we use the optimisation technique introduced in Section 3.2, we can
quickly find the optimal value as 0.11, which is consistent with Figure 5. Our framework is
more efficient when the decision variables are multi-dimensional, in which case the
computation load is exponentially increased for grid evaluations. As an example, the
proposed optimisation can find the solution for the two-dimensional problem in around 15
hours, while the grid evaluation will take more than 7 days on the same computer with the
same problem settings. To illustrate the advantages of condition based maintenance over
time based maintenance, we also compare our optimal policy with the optimal periodic
maintenance policy. Numerical results show that the optimal periodic maintenance
policy has an objective function value about 10% higher than that of the optimal CBM
policy. Since, in general, the degradation process of the CT system does not follow a
Markov process, many maintenance policies based on the Markovian property are not
applicable here.
For the multi-dimensional problem of finding the optimal inspection interval and
hazard threshold combination to minimise the objective function, we can still achieve
satisfactory results by applying the optimisation algorithm we proposed. We change the
Covariate    Estimate    p-value
ZA           2.0783      0.9073
ZB           4.3755      0.2251
ZC           7.275       0.4615
[Figure 5: log value of mean cost, variability, and total cost against hazard threshold, with the optimal variability sensitive policy and the optimal variability neutral policy marked.]
0 19.37 1 0.19
10 19.49 1 0.19
20 19.60 5 0.86
30 19.61 7 0.94
40 19.63 10 0.91
50 19.64 10 0.98
60 19.65 10 0.87
70 19.66 10 0.93
80 19.67 10 0.96
90 19.68 10 0.91
risk of unexpected failure and thus increases the overall maintenance cost. In contrast, the
optimal hazard threshold also increases as a general trend, but with small fluctuations. The
main reason is that during the optimisation, if the difference between objective function
values is within the indifference zone, the search will be terminated, and the solution which
may not necessarily be the precisely optimal one, will be returned. Therefore, if the hazard
thresholds have their objective function values very close to each other within a certain
range, the solutions found by the optimisation algorithm are likely to fluctuate within this
range.
Acknowledgement
The financial support of this work is provided by NSF grants #0757683 and #0758178, and GE
Healthcare.
References
Baruah, P. and Chinnam, R.B., 2005. HMMs for diagnostics and prognostics in machining
processes. International Journal of Production Research, 43 (6), 1275–1293.
Bloch-Mercier, S., 2002. A preventive maintenance policy with sequential checking procedure
for a Markov deteriorating system. European Journal of Operational Research, 142 (3),
548–576.
Chen, C.T., Chen, Y.W., and Yuan, J., 2003. On a dynamic preventive maintenance policy for a
system under inspection. Reliability Engineering and System Safety, 80 (1), 41–47.
Chen, Y. and Jin, J., 2003. Cost-variability-sensitive preventive maintenance considering
management risk. IIE Transactions, 35 (12), 1091–1101.
Cox, D.R., 1972. Regression models and life-tables. Journal of the Royal Statistical Society Series
B-Statistical Methodology, 34 (2), 187–220.
Dieulle, L., et al., 2003. Sequential condition-based maintenance scheduling for a deteriorating
system. European Journal of Operational Research, 150 (2), 451–461.
Dong, M. and He, D., 2007. Hidden semi-Markov model-based methodology for multi-sensor
equipment health diagnosis and prognosis. European Journal of Operational Research, 178 (3),
858–878.
Grall, A., Berenguer, C., and Dieulle, L., 2002a. A condition-based maintenance policy for
stochastically deteriorating systems. Reliability Engineering and System Safety, 76 (2),
167–180.
Grall, A., et al., 2002b. Continuous-time predictive-maintenance scheduling for a deteriorating
system. IEEE Transactions on Reliability, 51 (2), 141–150.
Hormann, W. and Leydold, J., 2000. Automatic random variate generation for simulation input.
In: Proceedings of the 2000 winter simulation conference, vol. 1, 10–13 December, Orlando,
Florida, 675–682.
Hosseini, M.M., Kerr, R.M., and Randall, R.B., 2000. An inspection model with minimal and major
maintenance for a system with deterioration and Poisson failure. IEEE Transactions on
Reliability, 49 (1), 88–98.
Jardine, A.K.S., Banjevic, D., and Makis, V., 1997. Optimal replacement policy and the structure
of software for condition-based maintenance. Journal of Quality in Maintenance Engineering,
3 (2), 109–119.
Jardine, A.K.S., Lin, D., and Banjevic, D., 2006. A review on machinery diagnostics and
prognostics implementing condition-based maintenance. Mechanical Systems and Signal
Processing, 20 (7), 1483–1510.
Kumar, D. and Westberg, U., 1997. Maintenance scheduling under age replacement policy
using proportional hazards model and TTT-plotting. European Journal of Operational Research,
99 (3), 507–515.
Leemis, L., Shih, L.H., and Reynertson, K., 1990. Variate generation for accelerated life and
proportional hazards models with time dependent covariates. Statistics and Probability Letters,
10 (4), 335–339.
Leemis, L., 1999. Simulation input modelling. In: Proceedings of the 31st conference on winter
simulation: simulation – a bridge to the future. vol. 1, 5–8 December, Phoenix, Arizona, 14–23.
Li, Z., et al., 2007. Failure event prediction using the Cox proportional hazard model driven by
frequent failure signatures. IIE Transactions, 39 (3), 303–315.
Liao, H.T., Elsayed, E.A., and Chan, L.Y., 2006. Maintenance of continuously monitored degrading
systems. European Journal of Operational Research, 175 (2), 821–835.
Makis, V. and Jardine, A.K.S., 1992. Optimal replacement in the proportional hazards model.
INFOR, 30 (1), 172–183.
Percy, D.F. and Kobbacy, A.H., 2000. Determining economical maintenance intervals. International
Journal of Production Economics, 67 (1), 87–94.
Pichitlamken, J., 2002. A combined procedure for optimization via simulation. Dissertation (PhD).
Department of Industrial Engineering and Management Sciences, Northwestern University,
Evanston, Illinois.
Pichitlamken, J. and Nelson, B., 2003. A combined procedure for optimization via simulation. ACM
Transactions on Modeling and Computer Simulation, 13 (2), 155–179.
Rangan, A. and Grace, R.E., 1988. A non-Markov model for the optimum replacement of
self-repairing systems subject to shocks. Journal of Applied Probability, 25 (2), 375–382.
Shi, L. and Chen, C., 2000. A new algorithm for stochastic discrete resource allocation optimization.
Discrete Event Dynamic Systems, 10 (3), 271–294.
Swisher, J.R. and Jacobson, S.H., 1999. A survey of ranking, selection, and multiple comparison
procedures for discrete-event simulation. In: Proceedings of the 31st conference on
winter simulation: simulation – a bridge to the future, vol. 1, 5–8 December, Phoenix, Arizona,
492–501.
Tapiero, C.S. and Venezia, I., 1979. A mean variance approach to the optimal machine maintenance
and replacement. The Journal of the Operational Research Society, 30 (5), 457–466.
Valdez-Flores, C. and Feldman, R.M., 1989. A survey of preventive maintenance models for
stochastically deteriorating single-unit systems. Naval Research Logistics, 36 (4), 419–446.
Wang, H., 2002. A survey of maintenance policies of deteriorating systems. European Journal of
Operational Research, 139 (3), 469–489.
Wang, W., 2000. A model to determine the optimal critical level and the monitoring intervals in
condition-based maintenance. International Journal of Production Research, 38 (6), 1425–1436.
Appendix
The framework of the NP method proposed by Shi and Chen (2000) is summarised below. Denote
$\Theta$ as the feasible region, and $\sigma(k)$ as the most promising region in the $k$th iteration.
Stage 1: Initialisation.
Set $k = 0$, and choose the whole sample space as the most promising region, $\sigma(0) = \Theta$.
Stage 2: Partitioning.
Partition $\sigma(k)$ into $M$ subregions, $\sigma_1(k), \sigma_2(k), \ldots, \sigma_M(k)$, and aggregate
all the other regions into one region $\sigma_{M+1}(k)$.
Stage 3: Sampling.
Randomly draw $m(k)$ samples in each region.
Stage 4: Evaluation and selection.
Evaluate and estimate the objective function at these samples through simulation. Based
on the estimates, choose the most promising region for the next step, $\sigma(k+1)$.
Stage 5: If $\sigma(k+1)$ is not fully contained in $\sigma(k)$, then backtracking is needed, and
$\sigma(k+1)$ is set to $\sigma(k-1)$, which is the super region of $\sigma(k)$. Otherwise, Stage 2 to
Stage 4 will be repeated until $\sigma(k+1)$ is a singleton, which cannot be further partitioned.
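The five stages can be sketched for a one-dimensional integer decision variable as follows. This is a simplified illustration, not the ANP-SS procedure: it assumes a minimisation problem, uses the smallest sampled objective value as the promising index of each region, draws samples with replacement, and adds an iteration budget (`max_iter`) as a safeguard. All function names and default values are hypothetical.

```python
import random

def split(lo, hi, m):
    """Split the integer interval [lo, hi] into at most m non-empty parts."""
    size = hi - lo + 1
    parts = min(m, size)
    bounds = [lo + (size * k) // parts for k in range(parts + 1)]
    return [(bounds[k], bounds[k + 1] - 1) for k in range(parts)]

def best_sample(objective, candidates, n_samples, rng):
    """Promising index of a region: smallest objective value among random samples."""
    return min(objective(rng.choice(candidates)) for _ in range(n_samples))

def nested_partition(objective, lo, hi, m=3, n_samples=5, max_iter=200, rng=random):
    """Search the integers lo..hi for a minimiser with a basic NP loop."""
    stack = [(lo, hi)]  # Stage 1: stack[-1] is the most promising region
    for _ in range(max_iter):
        cur_lo, cur_hi = stack[-1]
        if cur_lo == cur_hi:
            return cur_lo  # singleton region: cannot be partitioned further
        subregions = split(cur_lo, cur_hi, m)  # Stage 2
        scores = [best_sample(objective, range(a, b + 1), n_samples, rng)
                  for a, b in subregions]  # Stages 3 and 4
        # Surrounding region: all feasible points outside the current region.
        outside = [x for x in range(lo, hi + 1) if x < cur_lo or x > cur_hi]
        outside_score = (best_sample(objective, outside, n_samples, rng)
                         if outside else float("inf"))
        best = min(range(len(subregions)), key=scores.__getitem__)
        if outside_score < scores[best]:
            stack.pop()  # Stage 5: backtrack to the super region
        else:
            stack.append(subregions[best])
    return stack[-1][0]  # budget exhausted: lower end of the current region

# Hypothetical deterministic objective with its minimum at 37.
example_solution = nested_partition(lambda x: (x - 37) ** 2, 0, 100, rng=random.Random(0))
```

Because the regions are compared through random samples, the returned point is not guaranteed to be the exact minimiser. In a simulation setting `objective` would itself be a noisy estimate such as the replicated cost estimator.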