A Machine Learning Approach For Predictive Maintenance For Mobile Phones Service Providers
Abstract The problem of predictive maintenance is a crucial one for every technological company. This is particularly true for mobile phone service providers, as mobile phone networks require continuous monitoring. The ability to foresee malfunctions is crucial to reduce maintenance costs and the loss of customers. In this paper we describe a preliminary study on predicting failures in a mobile phone network based on the analysis of real data. A ridge regression classifier has been adopted as the machine learning engine, and interesting and promising conclusions were drawn from the experimental data.
1 Introduction
A large portion of the total operating costs of any industry or service provider is devoted to keeping machinery and instruments in good condition, aiming to ensure minimal disruption of the production line. It has been estimated that maintenance costs are in the range of 15-60% of the cost of goods produced [14]. Moreover, about one third of maintenance costs is spent on unnecessary maintenance; as an example, for the U.S. industry alone this amounts to $60 billion spent each year on unnecessary work. On the other hand, ineffective maintenance can cause further losses in the production line when a failure does occur.
Anna Corazza
DIETI, Università di Napoli Federico II e-mail: [email protected]
Francesco Isgrò
DIETI, Università di Napoli Federico II e-mail: [email protected]
Luca Longobardo
DIETI, Università di Napoli Federico II e-mail: [email protected]
Roberto Prevete
DIETI, Università di Napoli Federico II e-mail: [email protected]
Predictive maintenance [14, 10] attempts to minimise the costs due to failures via regular monitoring of the condition of the machinery and instruments. The observations return a set of features from which it is possible to infer whether the apparatus is likely to fail in the near future. The nature of the features depends, of course, on the apparatus being inspected. How far in advance the failure can be predicted also depends on the problem, although we can state, as a general rule, that the sooner a failure can be predicted, the better in terms of effective maintenance.
In general the prediction is based on some empirical rule [23, 17, 19], but over the last decade there has been some work devoted to applying machine learning techniques [6, 22, 5] to the task of predicting possible failures of the apparatus. For instance, a Bayesian network has been adopted in [9] for a prototype system designed for the predictive maintenance of non-critical apparatus (e.g., elevators). In [12] different kinds of analysis for dimensionality reduction and support vector machines [7] have been applied to rail networks. Time series analysis has been adopted in [13] for link quality prediction in wireless networks. In a recent work [18], the use of multiple classifiers for providing different performance estimates has been proposed.
An area where disruption of service can have a huge impact on the company's sales and/or customer satisfaction is that of mobile phone service providers [8, 4]. The context considered in this work is the predictive maintenance of a national mobile phone network, that is, being able to foresee well in advance whether a cell of the network is going to fail. This is very important, as the failure of a cell can have a huge impact on the users' quality of experience [11], and preventing such failures makes it less likely that users decide to change service provider.
In this paper we present a preliminary analysis of the use of a machine learning paradigm for the prediction of a failure of a cell in a mobile phone network. The aim is to predict the failure far enough in advance that no disruption of the service will occur, say, at least a few hours ahead. A failure is reported by one of a set of features that are measured every quarter of an hour. The task is then to predict the status of the feature reporting the failure within a certain amount of time. As in many other predictive maintenance problems, we are dealing with a very large number of sensors [15].
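To make this prediction setting concrete, the following sketch shows one possible way to build labelled examples from such quarter-hour measurements: the features observed at time t are paired with the alarm flag observed ∆ minutes later. The column names (cell_id, timestamp, alarm) are hypothetical, since the actual feature names are not disclosed here.

```python
import pandas as pd

def build_examples(df: pd.DataFrame, delta_minutes: int):
    """Pair the features measured at time t with the alarm status at t + delta.

    Assumes one row per (cell_id, timestamp), a 0/1 `alarm` column and
    measurements every 15 minutes; all column names are hypothetical.
    """
    steps = delta_minutes // 15                     # quarter-hour steps to look ahead
    df = df.sort_values(["cell_id", "timestamp"])
    # For every cell, the target is the alarm flag `steps` rows in the future.
    df["target"] = df.groupby("cell_id")["alarm"].shift(-steps)
    df = df.dropna(subset=["target"])               # rows with no future label
    feature_cols = [c for c in df.columns
                    if c not in ("cell_id", "timestamp", "alarm", "target")]
    return df[feature_cols], df["target"].astype(int)
```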
The paper is organised as follows. The next section describes the data we used and reports some interesting properties of the data that have been helpful in designing the machine learning engine. The proposed failure prediction model is discussed in Section 3, together with some experimental results. Section 4 is left to some final remarks.
2 Data analysis
Fig. 3 Illustration of the probabilities of getting an alarm for the cells close to a cell signaling an
alarm, after 15 and 180 minutes
The set of features obtained after the feature expansion phase is vastly overabundant. For this reason a further step of feature selection becomes necessary.
A process of automatic feature selection was chosen to increase portability and maintain a data-oriented approach. In particular, we used an algorithm for L1L2 regularization implemented in the “l1l2py”¹ Python package. This algorithm combines the classic shrinkage methods given by Ridge Regression and Lasso Regression [20].
We consider a regression problem where the output y is reconstructed from the features x_i, i ∈ [1, p], by combining them with coefficients β = {β_i}. Ridge Regression uses an L2 norm to constrain the size of the regression coefficients by reducing their absolute values:
$$\hat{\beta}^{\mathrm{ridge}} = \arg\min_{\beta}\Big(y - \sum_{j=1}^{p} x_j \beta_j\Big)^{2} + \lambda \sum_{j=1}^{p} \beta_j^{2} \qquad (1)$$
Lasso Regression, on the other hand, uses an L1 norm, which forces sparsity by driving some of the coefficients to exactly zero:
$$\hat{\beta}^{\mathrm{lasso}} = \arg\min_{\beta}\Big(y - \sum_{j=1}^{p} x_j \beta_j\Big)^{2} + \lambda \sum_{j=1}^{p} |\beta_j| \qquad (2)$$
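The experiments described below rely on the l1l2py implementation of this combined penalty. As a rough, hedged sketch of the same idea with a more widely available tool, scikit-learn's ElasticNet (which also mixes the L1 and L2 penalties of Eqs. (1) and (2)) can be used to keep only the features whose coefficients are not shrunk to zero; the regularisation values below are placeholders, not those used in this work.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import StandardScaler

def select_features(X: np.ndarray, y: np.ndarray, alpha: float = 0.1,
                    l1_ratio: float = 0.5) -> np.ndarray:
    """Return the indices of the features kept by a combined L1/L2 penalty.

    `alpha` and `l1_ratio` are illustrative values, not tuned for this task.
    """
    X_std = StandardScaler().fit_transform(X)         # penalties assume comparable scales
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000).fit(X_std, y)
    return np.flatnonzero(model.coef_)                # features not zeroed by the L1 term
```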
The final step is the actual experimental assessment of the classifier. First of all, we have to decide how to address the critical issues that emerged from the data analysis and were pointed out in the preceding section: how to manage undefined values, and how to split the data into training and test sets while reducing the imbalance between positive and negative examples.
With regard to the issue of undefined values, we decided to operate a fixed substitution of the most frequent of such values (INF), based on the average value assumed by the considered feature, according to the scheme in Table 1.
Table 1 Substitution of undefined values in the features which assume such values.
Feature Substitution value
FEATURE 6 120
FEATURE 3 120
FEATURE 5 120
FEATURE 1 120
FEATURE 7 -10
The number of occurrences of the other undefined values is relatively negligible, and the tuples containing those values were simply dropped.
1 http://slipguru.disi.unige.it/Software/L1L2Py/
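A minimal pandas sketch of the substitution step described above is given below. The replacement values follow Table 1; how INF is actually encoded in the raw data is not stated, so both a string sentinel and infinite floats are assumed.

```python
import numpy as np
import pandas as pd

# Substitution values from Table 1.
SUBSTITUTIONS = {"FEATURE 6": 120, "FEATURE 3": 120, "FEATURE 5": 120,
                 "FEATURE 1": 120, "FEATURE 7": -10}

def clean_undefined(df: pd.DataFrame) -> pd.DataFrame:
    """Replace the INF sentinel with the per-feature values of Table 1,
    then drop the few remaining tuples containing undefined values."""
    df = df.replace(["INF", np.inf, -np.inf], np.nan)   # assumed encodings of INF
    for col, value in SUBSTITUTIONS.items():
        if col in df.columns:
            df[col] = df[col].fillna(value)
    return df.dropna()
```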
In order to fix the balance between positive and negative samples we kept all the available positive samples, which were the class with the smaller number of occurrences N_p, and randomly chose N_n = 4N_p negative examples.
The splitting of the data into training and test sets has been solved by a temporal partitioning: we selected the first 2/3 of the month for training, and the remainder of the data was used as the test set. We could proceed this way because the positive examples have a nearly uniform distribution in the data. Therefore, even though we applied the split without considering the frequencies of positive and negative examples, we obtained an acceptable balance for both sets. Furthermore, we want to underline how fundamental it is to sample the data composing the training set carefully, because including tuples related in some way to the occurrence of an alarm results in an artificially inflated performance.
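The two choices just described, keeping all N_p positive examples together with N_n = 4N_p randomly sampled negatives, and splitting by time rather than at random, can be sketched as follows (the timestamp and target column names are again hypothetical).

```python
import pandas as pd

def rebalance(df: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    """Keep every positive example and draw four negatives per positive."""
    pos = df[df["target"] == 1]
    neg = df[df["target"] == 0].sample(n=4 * len(pos), random_state=seed)
    return pd.concat([pos, neg]).sort_values("timestamp")

def temporal_split(df: pd.DataFrame):
    """First two thirds of the observed period for training, the rest for testing."""
    t0, t1 = df["timestamp"].min(), df["timestamp"].max()
    cutoff = t0 + (t1 - t0) * 2 / 3
    return df[df["timestamp"] <= cutoff], df[df["timestamp"] > cutoff]
```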
In the experiments different time shifts ∆ have been considered. An analysis of the results highlighted a few interesting points.
First of all, we tested both generic and location-based models. The former does not consider geographical information, while the latter is trained separately for each location. The classification results have shown that geographical information is crucial to the classification: a single generic model for the whole area fails to capture the variety of underlying key factors specific to each geographic subarea. One example is illustrated in Figure 4, where we compare the results, in terms of ROC curves, of a sample of these two kinds of model. Training strongly geo-localised models resulted, in some of the best cases, in AUC values (the area under the ROC curve) of 0.7-0.8.
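As a hedged illustration of how such a comparison can be obtained, the sketch below trains the ridge regression classifier mentioned in the abstract (here scikit-learn's RidgeClassifier, with an illustrative regularisation value) on a training split and reports the area under the ROC curve on the corresponding test split; it would be run once on the whole data for the generic model and once per location for the geo-localised ones.

```python
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import roc_auc_score

def ridge_auc(X_train, y_train, X_test, y_test, alpha: float = 1.0) -> float:
    """Train a ridge classifier and return the AUC of its ROC curve on the test set."""
    clf = RidgeClassifier(alpha=alpha).fit(X_train, y_train)
    scores = clf.decision_function(X_test)      # signed distances used as ranking scores
    return roc_auc_score(y_test, scores)
```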
Another point regards the inverse proportionality between classification performance and the time shift: performance decreases as the time shift between the observations and the alarm increases. In fact, a regular loss in performance can be observed when the time shift grows from a quarter of an hour up to 6-7 hours; beyond that, performance is essentially close to a random guess.
Last, we ran some tests to analyse how performance changes with the introduction of automatic feature selection. Models built directly with all the features produced by the feature expansion phase have been compared with models where an l1l2py feature selection step was applied. The performance of the system with feature selection shows a constant (although relatively small) improvement with respect to the one without it. One example of this is shown in Figure 5.
We noticed that the set of features chosen by the feature selection step changes depending on the location of the considered cell. However, the final features are always correlated with the two that showed the largest coefficient of linear correlation with the output, namely FEATURE 7 and FEATURE 2. Such an analysis can also help the service provider to identify the most likely causes of malfunctioning.
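A simple way to reproduce this last analysis is to rank the features by the absolute value of their linear (Pearson) correlation with the alarm label, as in the short sketch below.

```python
import pandas as pd

def rank_by_correlation(X: pd.DataFrame, y: pd.Series) -> pd.Series:
    """Rank features by |Pearson correlation| with the alarm label."""
    corr = X.apply(lambda col: col.corr(y))      # Pearson correlation by default
    return corr.abs().sort_values(ascending=False)
```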
Fig. 4 Comparison between the ROC curves generated on the test set by a model trained and tested on the whole Italian area (top) and by a model specific to a single location (bottom). The time shift for prediction is 3 hours.
Acknowledgements
The research presented in this paper was partially supported by the national projects CHIS - Cultural Heritage Information System (PON) and BIG4H - Big Data Analytics for E-Health Applications (POR).
References
1. Amato, F., De Pietro, G., Esposito, M., Mazzocca, N.: An integrated framework for securing
semi-structured health records. Knowledge-Based Systems 79, 99–117 (2015)
2. Amato, F., Moscato, F.: A model driven approach to data privacy verification in e-health systems. Transactions on Data Privacy 8(3), 273–296 (2015)
3. Amato, F., Moscato, F.: Exploiting cloud and workflow patterns for the analysis of composite
cloud services. Future Generation Computer Systems (2016)
4. Asghar, M.Z., Fehlmann, R., Ristaniemi, T.: Correlation-Based Cell Degradation Detection for Operational Fault Detection in Cellular Wireless Base-Stations, pp. 83–93. Springer International Publishing, Cham (2013)
5. Barber, D.: Bayesian Reasoning and Machine Learning. Cambridge University Press (2012)
6. Bishop, C.M.: Pattern recognition and machine learning. Springer (2006)
7. Cortes, C., Vapnik, V.: Support-vector networks. Machine learning 20(3), 273–297 (1995)
8. Damasio, C., Frölich, P., Nejdl, W., Pereira, L., Schroeder, M.: Using extended logic programming for alarm-correlation in cellular phone networks. Applied Intelligence 17(2), 187–202 (2002)
9. Gilabert, E., Arnaiz, A.: Intelligent automation systems for predictive maintenance: A case
study. Robotics and Computer-Integrated Manufacturing 22(5), 543–549 (2006)
10. Grall, A., Dieulle, L., Berenguer, C., Roussignol, M.: Continuous-time predictive-maintenance
scheduling for a deteriorating system. IEEE Transactions on Reliability 51(2), 141–150 (2002)
11. Jain, R.: Quality of experience. IEEE MultiMedia 11(1), 96–95 (2004)
12. Li, H., Parikh, D., He, Q., Qian, B., Li, Z., Fang, D., Hampapur, A.: Improving rail network velocity: A machine learning approach to predictive maintenance. Transportation Research Part C: Emerging Technologies 45, 17–26 (2014). Advances in Computing and Communications and their Impact on Transportation Science and Technologies
13. Millan, P., Molina, C., Medina, E., Vega, D., Meseguer, R., Braem, B., Blondia, C.: Tracking
and predicting link quality in wireless community networks. In: 2014 IEEE 10th International
Conference on Wireless and Mobile Computing, Networking and Communications (WiMob),
pp. 239–244 (2014)
14. Mobley, R.K.: An introduction to predictive maintenance, 2nd edn. Butterworth-Heinemann
(2002)
15. Patwardhan, A., Verma, A.K., Kumar, U.: A Survey on Predictive Maintenance Through Big
Data, pp. 437–445. Springer International Publishing, Cham (2016)
16. Schapire, R.E.: Explaining AdaBoost, pp. 37–52. Springer Berlin Heidelberg, Berlin, Heidelberg (2013)
17. Scheffer, C., Girdhar, P.: Practical machinery vibration analysis and predictive maintenance.
Elsevier (2004)
18. Susto, G.A., Schirru, A., Pampuri, S., McLoone, S., Beghi, A.: Machine learning for predictive
maintenance: A multiple classifier approach. IEEE Transactions on Industrial Informatics
11(3), 812–820 (2015)
19. Swanson, D.C.: A general prognostic tracking algorithm for predictive maintenance. In:
Aerospace Conference, 2001, IEEE Proceedings., vol. 6, pp. 2971–2977. IEEE (2001)
20. Tibshirani, R.: Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society. Series B 58(1), 267–288 (1996)
21. Tychonoff, A., Arsenin, V.: Solution of ill-posed problems. Winston & Sons, Washington
(1977)
22. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)
23. Zhou, X., Xi, L., Lee, J.: Reliability-centered predictive maintenance scheduling for a continuously monitored system subject to degradation. Reliability Engineering & System Safety 92(4), 530–534 (2007)