Prediction of Passive Maintenance Opportunity Windows on Bottleneck Machines in Complex Manufacturing Systems

Xi Gu
Department of Mechanical Engineering,
University of Michigan,
2350 Hayward Street,
Ann Arbor, MI 48109
e-mail: [email protected]

Xiaoning Jin
Department of Mechanical Engineering,
University of Michigan,
2350 Hayward Street,
Ann Arbor, MI 48109
e-mail: [email protected]

Jun Ni
Fellow ASME
Department of Mechanical Engineering,
University of Michigan,
2350 Hayward Street,
Ann Arbor, MI 48109
e-mail: [email protected]

In this paper, we investigate hidden opportunities for performing proper maintenance tasks during production time without causing production losses. One of the maintenance opportunities on a machine is when the machine is starved or blocked due to the occurrence of random failures on its upstream or downstream machines. Such failure-induced starvation or blockage time is defined as a passive maintenance opportunity window (PMOW), and is predicted on the bottleneck machines in manufacturing systems with different configurations. The effectiveness of the PMOW prediction algorithm is validated through case studies in both simulations and an automotive assembly plant. [DOI: 10.1115/1.4029906]

Keywords: manufacturing system, real-time maintenance policy, maintenance opportunity window
1 Introduction

Maintenance is one of the most important operations in manufacturing systems, and it is the number one system operational cost factor [1]. There are different types of maintenance tasks performed in manufacturing systems, including (1) repair when machines break down; (2) regularly scheduled preventive maintenance tasks; and (3) incidental maintenance tasks that are urgent but take a short time, such as adding coolant, changing tools, and inspections [2]. Development of effective maintenance policies can help manufacturing enterprises reduce maintenance cost by making better use of their maintenance resources.

Although maintenance can keep the machines and equipment operating in good condition, arbitrarily stopping machines for maintenance could interrupt regular production and affect system throughput. Therefore, a conflict arises between the production manager and the maintenance manager: the former wants to keep machines operating to satisfy the daily production target but with little concern about the machine health condition, while the latter wants sufficient stoppage time to perform adequate maintenance tasks [2]. A traditional way to resolve the conflict is to schedule maintenance tasks during nonproduction shifts or weekends [3]. Such a policy sometimes fails to meet the required system performance in terms of cost-effectiveness and production efficiency. First, it introduces extra overtime labor cost. Usually the salaries during weekends and holidays are one and a half to two times regular salaries. Second, as manufacturing systems become more and more complex, there are so many maintenance tasks that not all of them can be completed in nonproduction time. Third, this policy is usually static and does not respond to the real-time system conditions. In order to reduce overtime costs and make more efficient use of maintenance resources, mathematical models should be developed to capture the interdependence between production and maintenance [4]. Chang et al. [5] defined maintenance opportunity windows (MOWs) as the hidden opportunities for maintenance during production time while the production continuity is still guaranteed. Such time windows are provided by the material handling devices in manufacturing systems, such as bins, shelves, and conveyors. These devices function as buffers to store work-in-process (WIP) jobs and decouple machine downtimes [6].

We further classify MOWs into two different types. For the first type, one machine can be strategically shut down for maintenance during its production time if the short-term system production requirement can be satisfied by utilizing the inventories in the machine's downstream buffers. Such maintenance opportunities are defined as active MOWs. The second type of MOWs comes from a machine's blockage and starvation time that is induced by the propagation of the downtime of other machines in the system. We define such MOWs as PMOWs.

Maintenance policies in multicomponent systems have been studied for decades. Cho and Parlar [7] reviewed papers investigating the policies in the systems where components may or may not depend on each other, economically or stochastically, and Nicolai and Dekker [8] further considered the structural dependence. Some group maintenance and opportunistic maintenance policies were reviewed by Wang [9]. However, most of the literature focuses on the reliability of individual components and makes assumptions about their interdependency. There is a lack of study on the analytical relationship among these components. For manufacturing systems, Ambani et al. [10] used a continuous time Markov chain model to develop condition-based maintenance policies in serial production lines. Lee et al. [11] investigated the optimal inspection policy for a manufacturing system with two machines in series or in parallel. The systems considered in Refs. [10,11] contain no buffers, which makes their application limited. The real-time buffer levels directly affect how the downtime of one machine could propagate to its surrounding machines [12], and hence the existence of buffers plays an important role in developing short-term maintenance policies [13]. Chang et al. used a continuous flow model to investigate the maintenance opportunities in serial lines in Ref. [5], and further accounted for

1 Corresponding author.
Contributed by the Manufacturing Engineering Division of ASME for publication in the JOURNAL OF MANUFACTURING SCIENCE AND ENGINEERING. Manuscript received September 30, 2013; final manuscript received February 19, 2015; published online March 12, 2015. Assoc. Editor: Jaime Camelio.

Journal of Manufacturing Science and Engineering JUNE 2015, Vol. 137 / 031017-1
Copyright © 2015 by ASME
Fig. 3 An example of a complex manufacturing system

Fig. 4 The state of the bottleneck machine in lines $l_{i,(1)}$ and $l_{i,(2)}$ when these lines are considered independently (a) and jointly (b)
failure event may cause multiple idle durations on the bottleneck machine. Naturally, one may consider grouping these separate idle durations together and strategically shutting the bottleneck machine down for $|PMOW_i|$. The following proposition shows that such grouping will not bring additional idle time to the bottleneck machine, as long as it is shut down before it is affected by the failure event.

PROPOSITION 1. Let $j_1 = \min_{k=1,\ldots,K_i} \{k : \Delta T_i > \Delta T_{i,(k)}\}$, then for any $T_0 \le T^{cons}_{i,(j_1)}$, if the bottleneck machine is shut down during time $(T_0, T_0 + |PMOW_i|)$, it will have no additional idle time afterward if no future failure occurs.

Proof. See the Appendix.

Note that the maintenance crews can only move the entire or partial idle durations on the bottleneck machine earlier; otherwise the total length of PMOW will be reduced. For example, if a PMOW starts 5 min later but the maintenance crews cannot get prepared in 7 min, at least 2 min of the maintenance opportunities will be wasted.

3.2 Prediction of PMOWs Under Multiple Failures in a Serial Line. In Sec. 3.1, PMOWs have been predicted under a single failure. However, it is possible that during the downtime of one machine, another random failure occurs. In this section, we

The second case occurs when the first idle duration on machine $M_b$ is caused by $F_2$. Then, this idle duration will delay the propagation for $F_1$, and this delay is calculated as

$$D(F_1) = \left[\, T_2 + T^{cons}_{i_2},\; T_2 + \Delta T_{i_2} + T^{res}_{i_2} \right) \tag{13}$$

$PMOW(T_2)$ is then updated as

$$PMOW(T_2) = \left[\, T_2 + T^{cons}_{i_2},\; T_2 + \Delta T_{i_2} + T^{res}_{i_2} \right) \cup \left[\, T_1 + T^{cons}_{i_1} + |D(F_1)|,\; T_1 + \Delta T_{i_1} + T^{res}_{i_1} \right) \tag{14}$$

To conclude the two cases, the PMOW after the second failure $F_2$ can be updated as

$$PMOW(T_2) = \bigcup_{k=1,2} \left[\, T_k + T^{cons}_{i_k} + |D(F_k)|,\; T_k + \Delta T_{i_k} + T^{res}_{i_k} \right) \tag{15}$$

where

$$D(F_1) = \begin{cases} \left[\, T_2 + T^{cons}_{i_2},\; T_2 + \Delta T_{i_2} + T^{res}_{i_2} \right) & T_2 + T^{cons}_{i_2} < T_1 + T^{cons}_{i_1} \\ \emptyset & T_2 + T^{cons}_{i_2} \ge T_1 + T^{cons}_{i_1} \end{cases}$$

$$D(F_2) = \begin{cases} \emptyset & T_2 + T^{cons}_{i_2} < T_1 + T^{cons}_{i_1} \\ \left[\, T_1 + T^{cons}_{i_1},\; T_1 + \Delta T_{i_1} + T^{res}_{i_1} \right) \cap [T_2, +\infty) & T_2 + T^{cons}_{i_2} \ge T_1 + T^{cons}_{i_1} \end{cases}$$
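The two-failure update in Eqs. (13)-(15) can be illustrated with a minimal Python sketch. It assumes half-open intervals $[a, b)$ and treats $|D(F_k)|$ as the length of the delay interval. The names `Failure`, `clip_from`, and `pmow_two_failures`, as well as the numerical values, are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # half-open interval [start, end)


@dataclass
class Failure:
    t: float       # occurrence time T_k
    dt: float      # downtime Delta T_{i_k}
    t_cons: float  # T^cons_{i_k}
    t_res: float   # T^res_{i_k}


def length(iv: Optional[Interval]) -> float:
    """Length of an interval; None stands for the empty set."""
    return 0.0 if iv is None else max(0.0, iv[1] - iv[0])


def clip_from(iv: Interval, t0: float) -> Optional[Interval]:
    """Intersection of iv with [t0, +inf); None if the result is empty."""
    start = max(iv[0], t0)
    return (start, iv[1]) if start < iv[1] else None


def pmow_two_failures(f1: Failure, f2: Failure) -> List[Interval]:
    """PMOW(T_2) following the case definitions of D(F_1) and D(F_2)."""
    arrive1 = f1.t + f1.t_cons
    arrive2 = f2.t + f2.t_cons
    if arrive2 < arrive1:
        # F_2 reaches the bottleneck first and delays F_1 (Eq. (13)).
        d1: Optional[Interval] = (arrive2, f2.t + f2.dt + f2.t_res)
        d2: Optional[Interval] = None
    else:
        # F_1 reaches the bottleneck first and delays F_2.
        d1 = None
        d2 = clip_from((arrive1, f1.t + f1.dt + f1.t_res), f2.t)
    windows = []
    for f, d in ((f1, d1), (f2, d2)):
        start = f.t + f.t_cons + length(d)  # left end in Eq. (15)
        end = f.t + f.dt + f.t_res          # right end in Eq. (15)
        if start < end:                     # drop empty windows
            windows.append((start, end))
    return sorted(windows)


# Illustrative values only (arbitrary time units).
f1 = Failure(t=0, dt=100, t_cons=60, t_res=20)
f2 = Failure(t=30, dt=150, t_cons=50, t_res=10)
print(pmow_two_failures(f1, f2))
```

In this example the impact of $F_1$ reaches the bottleneck first, so the idle window caused by $F_2$ is pushed back by the length of the first window.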
where the order of $F_{(k)}$'s satisfies $T_{(k)} + T^{cons}_{i_{(k)}} + |D(F_{(k)})| \le T_{(k+1)} + T^{cons}_{i_{(k+1)}} + |D(F_{(k+1)})|$, such that the impact of $F_{(k)}$ propagates to $M_b$ before that of $F_{(k+1)}$.

Then, when a new failure $F_{n+1}$ occurs at time $T_{n+1}$, $PMOW(T_{n+1})$ can be updated through the following steps.

(1) Arrange the $n+1$ failures in an ascending order of the time when their impacts propagate to the bottleneck machine $M_b$.
  (a) Initially, set the delay of propagation for the new failure $F_{n+1}$ as $D(F_{n+1}) = \emptyset$ and its order index as $j = 1$;
  (b) Determine whether the impact of failure $F_{n+1}$ propagates to $M_b$ before that of failure $F_{(j)}$:
    If $T_{n+1} + T^{cons}_{i_{n+1}} + |D(F_{n+1})| < T_{(j)} + T^{cons}_{i_{(j)}} + |D(F_{(j)})|$, $F_{n+1}$ impacts $M_b$ before $F_{(j)}$. Go to step (c).
    If $T_{n+1} + T^{cons}_{i_{n+1}} + |D(F_{n+1})| \ge T_{(j)} + T^{cons}_{i_{(j)}} + |D(F_{(j)})|$, $F_{n+1}$ impacts $M_b$ after $F_{(j)}$. Update the delay of propagation for $F_{n+1}$ as $D(F_{n+1}) = D(F_{n+1}) \cup H_s(F_{(j)}) \cap [T_{n+1}, +\infty)$ and set $j = j + 1$. If $j = n + 1$, $F_{n+1}$ is the last failure that impacts $M_b$; go to step (c). Otherwise repeat step (b) to compare $F_{n+1}$ with the updated $F_{(j)}$.
  (c) Arrange the $n+1$ failures in the ascending order of the time when their impacts propagate to $M_b$, as

$$F'_{(k)} = \begin{cases} F_{(k)} & k = 1, \ldots, j-1 \\ F_{n+1} & k = j \\ F_{(k-1)} & k = j+1, \ldots, n+1 \end{cases}$$

  The "=" here means that the parameters of the updated failure $F'$'s, such as $T'$, $\Delta T'$, $T^{cons\prime}$, and $T^{res\prime}$, are the same as those of the corresponding failure $F$'s.
(2) Update the delay of propagation for the failures that impact $M_b$ after $F_{n+1}$, as $D(F'_{(k)}) = \bigcup_{l=1,\ldots,k-1} H_s(F'_{(l)}) \cap [T'_{(k)}, +\infty)$ for $k = j+1, \ldots, n+1$.
(3) Update the new PMOW as $PMOW(T_{n+1}) = \bigcup_{k=1,\ldots,n+1} H_s(F'_{(k)})$. Set $F_{(k)} = F'_{(k)}$ $(k = 1, \ldots, n+1)$ so that it can be written as

$$PMOW(T_{n+1}) = \bigcup_{k=1,\ldots,n+1} H_s(F_{(k)}) \tag{17}$$

Equation (17) has the same format as Eq. (16), which completes the mathematical induction. Therefore, these steps can be used recursively to update the PMOW when a new failure occurs. Moreover, among these three steps, step 1 is the key. Once the sequence of the failures (in terms of when their impacts propagate to the bottleneck machine) is obtained, it is not difficult to update the delay of propagation and the PMOW.

Fig. 5 Prediction of PMOW under multiple failures in a complex system

the latter updates the PMOWs when sequential failures occur in a serial line. The integration of the two scenarios leads to the prediction of PMOW in a more general yet more complicated scenario, i.e., PMOWs under multiple failures in a complex system. Figure 5 illustrates the idea of such integration: once a new machine failure occurs in a complex system, it can be regarded as multiple equivalent failures that occur simultaneously in equivalent serial lines; and then, for each of the equivalent failures, PMOWs can be updated.

Assume at time $T_1$, the first failure $F_1$ occurs on machine $M_{i_1}$, which is connected to the bottleneck machine through $K_{i_1}$ equivalent serial lines. Therefore, it can be regarded as $K_{i_1}$ equivalent failures occurring simultaneously at time $T_1$, and the PMOW can be predicted based on the analysis in Sec. 3.1. Let $F_{1,k}$ denote an equivalent failure of $F_1$ in the $k$th line, and order them by $F_{(k)} = F_{1,(k)}$ $(k = 1, \ldots, K_{i_1})$ such that $T^{cons}_{1,(k)} \le T^{cons}_{1,(k+1)}$. From Eq. (9), PMOW at time $T_1$ can be calculated as

$$PMOW(T_1) = \bigcup_{k=1,\ldots,K_{i_1}} H_c(F_{(k)}) \tag{18}$$

where $H_c(F_{(k)}) = \left[\, T_{g(k)} + T^{cons}_{i_{g(k)},h(k)} + |D(F_{(k)})|,\; T_{g(k)} + \Delta T_{i_{g(k)},h(k)} + T^{res}_{i_{g(k)},h(k)} \right)$ is the PMOW caused by failure $F_{(k)}$; and $D(F_{(k)}) = \bigcup_{l=1,\ldots,k-1} H_c(F_{(l)}) \cap [T_{(k)}, +\infty)$ is the delay of propagation for failure $F_{(k)}$. $g(k)$ and $h(k)$ are the machine and line indices, such that the equivalent failure $F_{(k)}$ comes from the effect of the actual failure $F_{g(k)}$ in line $h(k)$, i.e., $F_{(k)} = F_{g(k),h(k)}$. Note that $g(k) = 1$ for all $k = 1, \ldots, K_{i_1}$.

Then, we assume that at time $T_2$, a second failure $F_2$ occurs on machine $M_{i_2}$, which is connected to the bottleneck machine through $K_{i_2}$ lines. The procedure to update PMOW is outlined as follows:

(1) Decompose failure $F_2$ into its equivalent failure $F_{2,(k)}$'s $(k = 1, \ldots, K_{i_2})$ such that $T^{cons}_{i_2,(1)} \le \cdots \le T^{cons}_{i_2,(K_{i_2})}$.
(2) For each $F_{2,(k)}$ $(k = 1, \ldots, K_{i_2})$, apply steps 1–3 in Sec. 3.2 to update $PMOW(T_{2,(k)})$, which is the PMOW caused jointly by $F_{1,(1)}, F_{1,(2)}, \ldots, F_{1,(K_{i_1})}$ and $F_{2,(1)}, F_{2,(2)}, \ldots, F_{2,(k)}$.
(3) Finally, the PMOW at time $T_2$ can be updated as

$$PMOW(T_2) = PMOW(T_{2,(K_{i_2})}) = \bigcup_{k=1,\ldots,K_{i_1}+K_{i_2}} H_c(F_{(k)}) \tag{19}$$
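The recursive update in steps (1)-(3) of Sec. 3.2, which the procedure above applies once per equivalent failure, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes $H_s(F)$ has the same form as the windows in Eq. (15), i.e., $[T + T^{cons} + |D(F)|,\; T + \Delta T + T^{res})$, represents each delay $D(F)$ as a list of disjoint half-open intervals, and uses hypothetical names (`Failure`, `merge`, `add_failure`) and illustrative numbers.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Interval = Tuple[float, float]  # half-open interval [start, end)


@dataclass
class Failure:
    t: float       # occurrence time T
    dt: float      # downtime Delta T
    t_cons: float  # T^cons
    t_res: float   # T^res


def merge(intervals: Iterable[Interval]) -> List[Interval]:
    """Union of intervals as a sorted list of disjoint intervals."""
    out: List[Interval] = []
    for s, e in sorted(iv for iv in intervals if iv[0] < iv[1]):
        if out and s <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], e))
        else:
            out.append((s, e))
    return out


def clip_from(intervals: Iterable[Interval], t0: float) -> List[Interval]:
    """Intersection of a union of intervals with [t0, +inf)."""
    return merge((max(s, t0), e) for s, e in intervals)


def total(intervals: List[Interval]) -> float:
    """Total length |D(F)| of a list of disjoint intervals."""
    return sum(e - s for s, e in intervals)


def arrival(f: Failure, delay: List[Interval]) -> float:
    """Time at which the impact of f reaches the bottleneck."""
    return f.t + f.t_cons + total(delay)


def window(f: Failure, delay: List[Interval]) -> Interval:
    """Assumed form of H_s(F); may be empty (start >= end)."""
    return (arrival(f, delay), f.t + f.dt + f.t_res)


def add_failure(ordered, new: Failure):
    """ordered: list of (Failure, delay) sorted by arrival at M_b."""
    # Step 1: find the position j of the new failure in the sequence.
    delay: List[Interval] = []
    j = 0
    while j < len(ordered):
        f_j, d_j = ordered[j]
        if arrival(new, delay) < arrival(f_j, d_j):
            break
        delay = merge(delay + clip_from([window(f_j, d_j)], new.t))
        j += 1
    updated = ordered[:j] + [(new, delay)]
    # Step 2: recompute delays of the failures that come after the new one.
    for f_k, _ in ordered[j:]:
        earlier = [window(f, d) for f, d in updated]
        updated.append((f_k, clip_from(earlier, f_k.t)))
    # Step 3: the PMOW is the union of all windows (Eq. (17)).
    pmow = merge(window(f, d) for f, d in updated)
    return updated, pmow


# Illustrative values only (arbitrary time units).
state, pmow = add_failure([], Failure(0, 100, 60, 20))
state, pmow = add_failure(state, Failure(30, 150, 50, 10))
state, pmow = add_failure(state, Failure(40, 60, 10, 0))
print(pmow)
```

For a complex system, one possible use of this sketch is to call `add_failure` once for each equivalent failure of a new machine failure, in ascending order of $T^{cons}$, which mirrors step (2) of the procedure leading to Eq. (19).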
                        $|PMOW_2|$ (s)
                     2%                5%                10%
  0      0        0       0         0       0         0       0
 50      0        0       0         0       0         0.06    0.95
100      0        0       0         0       0         1.44    5.91
150      0        1.73    2.62      5.91    7.32     23.34   19.96
200     50       50.40    4.04     54.23   10.37     70.87   23.55
250    100      100.31    3.87    104.10   10.13    119.65   23.55
300    150      150.19    3.93    153.95   10.11    170.59   24.83
350    200      200.16    3.96    203.84   10.19    220.80   22.66
400    250      250.22    3.96    254.03   10.18    269.82   25.56
450    300      300.21    3.97    304.01   10.20    320.23   23.00
500    350      350.24    3.99    353.97   10.16    369.90   23.55
prediction algorithm is applicable to the systems where the variations of the machine cycle times are small.

Then, we study the WIP [22] on the bottleneck machine $M_6$ under different $\Delta T_2$'s. The results when $\Delta T_2$ = 150, 250, 350, and 450 s are shown in Fig. 10 (the standard deviation on the processing time is assumed to be 2% of its mean). These results demonstrate that if $\Delta T_2$ is within 150 s, there is almost no PMOW on $M_6$. If $\Delta T_2$ increases to 250 s, which is between $\Delta T_{2,1}$ and $\Delta T_{2,2}$, there will be one idle duration on $M_6$, which results from line $l_{2,1}$. However, if $\Delta T_2$ is greater than $\Delta T_{2,2}$, such as 350 and 450 s, $M_6$ will have two separate idle durations (the third one is negligible), where the first one is caused in $l_{2,2}$ and the second is in $l_{2,1}$. When $\Delta T_2$ continues to increase, the first idle duration will keep growing while the length of the second idle duration remains constant.

Moreover, this case study indicates that the PMOW prediction tool is also applicable to the case when the downtime $\Delta T_i$ is not deterministic. If the probability distribution of $\Delta T_i$ is known, then the probability distribution of PMOW can be predicted accordingly. The optimal maintenance policy under such probabilistic PMOWs will be investigated in the future.

4.2 Case Study 2: PMOW Prediction Under Multiple Failures in a Real Manufacturing Plant. The PMOW prediction tool has also been implemented in a real automotive assembly plant to validate its effectiveness in practice. The system in this case study consists of three groups of machines and conveyors between successive groups, as shown in Fig. 11(a). The bottleneck machine $M_b$ is the first machine in group 3. Moreover, if a failure occurs, maintenance will be carried out on that machine and the whole group of machines will be stopped until the maintenance is completed. In this case study, we only consider the PMOWs caused by failures in groups 1 and 2. This system can be modeled as a three-machine–two-buffer (3M2B) system, as shown in Fig. 11(b), with the system parameters in Table 4. The cycle time of machine $M_i$ ($i$ = 1, 2) in Fig. 11(b) is equal to the summation of the cycle times of all the machines in group $i$. Moreover, the cycle time of the buffers will be included in the calculation of $T^{res}$.

In order to show how the PMOW is updated under multiple failures, machine $M_{16}$ is shut down for 15 min, starting at 1:00 p.m., and machine $M_{14}$ is shut down for 10 min, starting at 1:20 p.m. These two events are regarded as failures $F_1$ and $F_2$, respectively. The factory information system (FIS) updates the WIP of all buffers and machines every 5 min, as shown in Fig. 12. Based on the real-time information at 1:00 p.m., the PMOW under $F_1$ can be predicted as [1:25:00, 1:26:13), indicating a 73-s idle duration on the bottleneck machine $M_b$, starting from 1:25 p.m. When the second failure $F_2$ occurs at time 1:20 p.m., the PMOW can be updated as [1:25:00, 1:26:13) $\cup$ [1:32:13, 1:41:13), indicating that another 9-min idle duration on $M_b$ will start at 1:32 p.m. The real-time plant floor data after 1:20 p.m. (also shown
in Fig. 12) are utilized to validate the prediction. From the WIP records in FIS, there are indeed two idle durations on $M_b$, as predicted. Moreover, with a detailed observation of the WIP in buffer $B_2$, one can obtain a more accurate estimation of the starting/completion time of the two idle durations. For the first idle duration, it shows that at 1:20 p.m. buffer $B_2$ has four parts, which will allow $M_b$ to process for an additional 4 min before it gets starved; and at 1:25 p.m. $B_2$ has one new part, which will be delivered to $M_b$ shortly. Similarly, the WIP in $B_2$ is 2 at 1:30 p.m., indicating that the second idle duration will start at around 1:32 p.m.; while $B_2$ has a large WIP at 1:40 p.m. so that the second idle duration on $M_b$ will end soon. These results validate the effectiveness of the PMOW prediction algorithm under multiple failures.

Table 4 Parameters of the 3M2B system

Machine/buffer      M1     B1     M2     B2     Mb
Capacity            6      11     7      24     1
Cycle time (min)    4.44   0.71   6.23   3.53   1

Fig. 12 PMOW prediction and validation using real-time data from FIS

5 Conclusions

In this paper, an analytical model has been developed to predict the PMOWs on the bottleneck machine under multiple machine failures in a complex manufacturing system. First, through the proposed technique of generating EPSLs, the critical downtime for each machine in a complex system is calculated. Then, a PMOW prediction algorithm for a complex system is developed when a single failure occurs on one machine. It is found that, in a complex system, the total idle time of the bottleneck machine equals the difference between the machine's actual downtime caused by the failure and its critical downtime. Moreover, the PMOW prediction algorithm under multiple failures has also been investigated. Case studies in simulations and a real automotive plant have been conducted to demonstrate the methodology and insights.

The effectiveness of the PMOW prediction algorithm has also been validated when it is applied to systems with small variations in processing time. The analytical model also provides a sound basis for research on systems with more uncertainties and variations, such as large variation in machine processing time and probabilistic downtime caused by the failure. Such randomness will be taken into consideration in our future work, to make the PMOW prediction algorithm more robust for real plant implementation.

The future work related to the maintenance opportunity window technique includes: (1) estimating PMOWs analytically when the cycle times of machines are not constant; (2) developing methods to correct PMOW based on the feedback information; and (3) integrating maintenance opportunities into the design of manufacturing systems.

Acknowledgment

This work was supported by National Science Foundation (Grant No. 0825789) and the Industry/University Cooperative Research Center for Intelligent Maintenance Systems (NSF Grant No. 1134676).