Anomaly Detection Analysis and Prediction-2019
July 3, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2921912
ABSTRACT Anomaly detection has attracted considerable attention from the research community in the past
few years due to the advancement of sensor monitoring technologies, low-cost solutions, and high impact
in diverse application domains. Sensors generate a huge amount of data while monitoring the physical
spaces and objects. These huge collected data streams can be analyzed to identify unhealthy behaviors.
It may reduce functional risks, avoid unseen problems, and prevent downtime of the systems. Many research
methodologies have been designed and developed to determine such anomalous behaviors in security and
risk analysis domains. In this paper, we present the results of a systematic literature review about anomaly
detection techniques except for these dominant research areas. We focus on the studies published from
2000 to 2018 in the application areas of intelligent inhabitant environments, transportation systems, health
care systems, smart objects, and industrial systems. We have identified a number of research gaps related
to the data collection, the analysis of imbalanced large datasets, limitations of statistical methods to process
the huge sensory data, and few research articles in abnormal behavior prediction in real scenarios. Based
on our analysis, researchers and practitioners can acquaint themselves with the existing approaches, use
them to solve real problems, and/or further contribute to developing novel techniques for anomaly detection,
prediction, and analysis.
INDEX TERMS Statistical learning, machine learning, intelligent environments, smart objects, intelligent
transportation systems, industrial systems.
There are several survey papers about anomaly detection and analysis with a focus on security and
risk analysis (e.g., intrusion detection for computer network systems [2], [3], fraud detection [4],
and credit risk analysis [5]). In this systematic literature review, we focus on investigating and
analyzing the existing approaches for detecting anomalous behavior in areas such as smart
environments, transportation networks, health care systems, smart objects, and industrial systems.
We investigate these selected areas to provide compact information about the developed methods,
since these areas have received less attention in the anomaly detection domain than intrusion
detection in networks and fraud detection. To select and apply an appropriate anomaly detection
technique, a number of factors should be considered, including the nature of the generated sensory
data stream, the type of anomaly, and the availability of training data. We present a systematic
flow-chart of an anomaly detection system in Figure 1.
availability of the data and its annotation, we may describe it as supervised, unsupervised, or
semi-supervised. Such information helps engineers choose an appropriate anomaly detection
technique. In supervised training, the data is available with class labels; this is the simplest
form of learning to detect the anomalous behavior of the system. In unsupervised learning, we have
data but without any concrete output (i.e., class labels). Similarly, in semi-supervised learning,
we have a limited number of examples with class labels while the rest of the data is unlabeled.
In our results section, we detail which method can be beneficial in each of the above-mentioned
data availability scenarios for building an anomaly detection system. The details of each anomaly
type are as follows.
A. POINT ANOMALY
A point anomaly is an observation in the data stream that is far from the rest of the data. It is
also known as an "outlier" [6]. For example, a satellite transmits data to its base station. This
data has a regular shape with a certain rise and fall in the value. At some point, an unusually
high rise or low fall in the data can indicate abnormal behavior, which is known as an anomaly.
Such anomalies need to be recognized before processing or doing further analysis on the data.
Similarly, the temperature data of a building has a regular shape, and an unusual rise or fall in
the value can be an abnormal value from the sensor. Figure 2 illustrates a point anomaly.
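As a minimal sketch of how such a point anomaly could be flagged, the snippet below scores each reading by its distance from the median, scaled by the median absolute deviation (a modified z-score). The function name `point_anomalies`, the 3.5 cut-off, and the sample temperature readings are illustrative assumptions and are not taken from the reviewed studies.

```python
from statistics import median


def point_anomalies(values, threshold=3.5):
    """Return the indices of readings whose modified z-score exceeds `threshold`."""
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)  # median absolute deviation
    if mad == 0:
        # At least half of the readings equal the median; the score is undefined.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical building temperature readings with one spike at index 4.
readings = [21.0, 21.4, 20.9, 21.2, 35.0, 21.1, 20.8, 21.3]
print(point_anomalies(readings))  # prints [4]
```

The median and MAD are used here instead of the mean and standard deviation because a single extreme reading inflates the latter and can mask itself.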
FIGURE 4. Heart rate monitoring signal with the entire data patterns over
consecutive time intervals.
In Figure 4, it can be seen that the observation pattern is anomalous around intervals 3 and 4 as
compared to the rest of the signal. All kinds of anomalies are associated with a time unit, and it
is possible to have seasonal trends in data streams [9]. For instance, the heart rate signal can be
abnormal because the reading was taken right after running or a gym activity. Similarly, the sales
of air conditioners increase during the summer and decline at the end of the summer season. This
seasonality effect can also be observed in electricity consumption during hot days. When applying
an anomaly detection method, some approaches remove the seasonal trends during the analysis, while
others have a specific process for seasonal patterns [10], [11].
The rest of the paper is organized as follows: Section II introduces the developed protocol for the
systematic literature review. Section III provides the search keywords used to find the relevant
literature. Section IV describes the inclusion and exclusion criteria used to identify the relevant
articles. Section V provides the quality assessment details, followed by Section VI, which provides
the details of our contribution.
III. SEARCH KEYWORDS
We obtained the search keywords from our research questions to find the relevant literature.
Table 2 lists our search keywords.
TABLE 2. The list of search keywords.
IV. INCLUSION AND EXCLUSION CRITERIA
The following criteria are defined to include the identified relevant articles in this review.
• The articles published in the mentioned databases in Section II-B to maintain and ensure the quality.
• The articles focusing on anomaly detection, analysis, and prediction in intelligent inhabitant
environments, transportation systems, healthcare systems, smart objects, and industrial systems.
• The developed methods are evaluated on real datasets.
• All articles should be online to ensure the paper accessibility.
• All articles should be written in English language.
• Peer reviewed papers.
The following papers were excluded:
• All articles that do not meet the inclusion criteria.
• All articles that are related to anomaly detection in the security and risk management domain
because it is not the focus of this literature survey.
V. QUALITY ASSESSMENT
The quality of the search results is ensured after extracting the information from the selected
digital libraries. Both authors read the abstracts of the papers from the search lists and decided
which papers should be included or excluded. We found that most of the research studies are
evaluated on synthetic datasets, while our selection criteria are based on real datasets in a
real-system environment to ensure engineering practicability. In total, 70 articles passed the
criteria and are included in our literature review. The papers selected by both authors were
merged, and conflicting views on the articles were resolved through discussion. After finalizing
the list, the papers were divided into two groups and analyzed. This categorization is based on
the technical approaches from the domains of statistical learning and machine learning. The
technical insight is constructed to provide compact information to the reader about the developed
methods. Our contributions and findings are reported in the following section.
VI. OUR CONTRIBUTIONS
In the past, the research community has conducted several anomaly detection reviews, ranging from
comprehensive reviews to reviews of certain application domains [1], [13], [14]. We found credible
literature review papers [13], [14] that discuss the available techniques and the challenges of
detecting different kinds of anomalies. This systematic literature review builds upon these two
works by significantly expanding them in several directions. The new directions in our systematic
literature review include recently developed approaches from the last decades, such as the
categorization of developed methods into statistical, machine learning, and deep learning methods
in different domains, and an informative perspective for choosing the appropriate technique based
on our analysis. Furthermore, this literature survey provides up-to-date information and identifies
the research gaps. It can provide the details for detecting anomalies more efficiently and indicate
where to invest energy to overcome the existing challenges. To the best of our knowledge, we could
not find a systematic literature survey including emerging fields in intelligent environment,
intelligent transport system, smart objects, healthcare system, and industrial system in the
context of anomaly detection, analysis, and prediction.
VII. RESULTS
In this section, the reported results are based on the developed protocol. Each subsection explains
the domain together with the set of techniques developed in the areas of statistical and machine
learning.
A. ANOMALOUS BEHAVIOR IN INTELLIGENT INHABITANT ENVIRONMENT
In an intelligent inhabitant environment, embedded sensor technology plays a major role in
monitoring the occupants' behavior. Embedded sensors are one of the best solutions for providing
the elderly population with a level of independence and comfort in their homes rather than
requiring them to reside at health care centers [15]. The inhabitant interacts with the household
objects, and the embedded sensors generate time-series data to recognize the performed activities.
The generated sensor data is very sparse because sensor values change only when the inhabitant
interacts with objects. A robust anomaly detection model is essential in any intelligent
environment [16] to deal with abnormal situations properly. Anomalous behavior detection is
relatively new in intelligent environments and is currently being explored in the research
community. The following literature was found and characterized according to the defined research
questions.
1) STATISTICAL METHODS
Among the statistical methods, a dissimilarity-based model is developed in [17], where an index is
used to measure the degree of resemblance between normal and abnormal behavior. The authors
performed the experiments on two different datasets with single inhabitants at home. In the
literature, we can find several dissimilarity measures such as the Hamming distance, the Manhattan
distance, and cosine similarity. The selection of the distance measure plays an important role and
can be based on the nature of the data. In intelligent environments, the sensors mostly send binary
information about objects, which are either in a functional or a rest state. For such binary data,
a suitable distance measure can be selected from the list of 76 distances described in [23]. All
these distance measures can be used for binary data. A percentile-based statistical method is
developed to extract the parameters of the normal behavior of user interaction from the available
labeled data (i.e., supervised learning) [18]. Any deviation from these interactions is considered
anomalous behavior. The reported study is based on four smart homes of senior citizens. The smart
homes are equipped with motion and door sensors to recognize the behavior of the inhabitants. The
developed method is evaluated over monitoring data of 650 days. The probabilistic Gaussian Mixture
Model (GMM) [19] is trained over the historical data to learn the normal patterns of each activity
and to provide an opinion on whether an activity is normal or abnormal. Such approaches have a primary
advantage in detecting abnormality even when such patterns have not been seen previously. The
experiments were carried out on lifestyle reassurance data from a commercial system for 12 months.
Markov models are successful at predicting sequential data. A hierarchical hidden Markov model
(HMM) [20] is proposed for predicting abnormal human behavior in a smart home setting. A meal
preparation activity was considered for up to 14 days. The system was trained on the normal
patterns of the occupants, and any deviation is considered abnormal behavior. Another variation of
the Markov model [21] is developed by introducing the concept of the switching hidden semi-Markov
model (S-HSMM). The authors consider six household activities and train the model on the normal
behavior of the activities. If an activity duration is long in comparison to the trained model,
then it is declared abnormal. Furthermore, the system provides a signal to an alert generation
system. Another approach, based on a Bayesian model [22], is also developed for anomaly detection
and estimates probabilistic features in terms of likelihood. Real-time sensory data streams of
three smart homes were collected for 14, 25, and 21 days. In Table 3, we present our findings
based on statistical methods in intelligent inhabitant environment.
2) MACHINE LEARNING METHODS
Machine learning methods are designed and developed to detect anomalies. One of the most popular
and effective methods in anomaly detection is the one-class Support Vector Machine (SVM) [16].
A one-class SVM has the capability to separate the positive data points from anomalous data points
in the feature space. The best characteristic of the one-class SVM is that it can be trained over
the data of a single class. In the absence of anomalous data, it still has a minimum distance from
the presented class. Consequently, everything outside the positive class boundary in the feature
space is considered an anomaly. The one-class SVM approach is validated using real-world sensor
data captured from three publicly available smart home datasets. Variations of the SVM classifier
[24], [25] also contribute to recognizing the anomalous behavior of the inhabitant in smart
environments and are evaluated on datasets collected in real life. A support vector description
(SVD) [25] based approach is developed to detect abnormal behavior for elderly people living alone
at home. SVD belongs to the family of dimensionality reduction techniques and has the ability to
include regularization terms. These regularization terms consequently eliminate the problem of
overfitting. A novel scheme is constructed to represent human activities in the form of a state
transition table, which is classified by a multi-class SVM [24]. By nature, the SVM is a binary
classifier, but it can be utilized in a multi-class setting through one of two approaches (i.e.,
one-vs-rest/all or one-vs-one). The multi-class SVM is able to recognize collective anomalies. The
authors performed the experiments over data streams collected from wearable gadgets to detect
abnormal behavior. A non-linear regression method [27] was developed to reduce the false positive
rate. The authors trained a one-class support vector machine over the normal activities to filter
out most of the normal instances, and suspicious instances are passed to kernel non-linear
regression for further detection. The evaluation is based on real-world data consisting of 431
traces of normal daily activities and 112 anomalous traces. We also found that the feature
reduction method principal component analysis (PCA) combined with a fuzzy rule-based system [26],
[29] is successfully applied to recognize the anomalous behavior of the user inside the home
environment. The fuzzy rule-based system is based on three steps. First, it checks the inputs and
observes the confidence limits. In the second phase, rules are fired according to the confidence
limits of the inputs.
Finally, values are defuzzified to rank the data point with a degree of abnormal behavior. The
authors performed the experiments over three case studies to evaluate the performance of the
developed model. During our defined search period, we found a couple of deep learning models in the
intelligent inhabitant domain for detecting anomalous behavior. Most of them work on synthetic
datasets. A Convolutional Neural Network (CNN) in a combined scheme with a Recurrent Neural Network
(RNN) and an SVM [28] is developed over a small real dataset to find the anomalous behavior of
residents. The authors train the CNN model over the raw signals with activity classes and perform
prediction with the SVM. They also transform the raw data stream into a spectrogram and utilize the
RNN. They report that both models in their proposed scheme are good enough to detect anomalous
behavior. Table 4 presents our findings in the literature based on machine learning methods in
intelligent inhabitant environment.
B. ANOMALOUS BEHAVIOR IN INTELLIGENT TRANSPORTATION SYSTEM
The application area of intelligent transportation systems is wide and diverse. During the search
process of the systematic literature review, the following methods for detecting anomalies were
found.
1) STATISTICAL METHODS
In intelligent transportation systems, a huge amount of multi-dimensional time-series data is
generated from a number of components and attached sensors. Due to the high-dimensional feature
spaces, linear models are unable to model the complex association. The research community
successfully applied Principal Component Analysis (PCA) to reduce the feature space and developed a
kernel feature space to detect the anomalous behavior of the system [30]. The authors performed the
experiments to detect anomalies on telemetry data obtained from the International Space Station.
The kernel trick is mostly utilized in machine learning methods, although it comes from statistical
methods. In the kernel trick, the feature space is transformed into a higher-dimensional space
where the data points are linearly separable. Such methods are applicable to collective anomalies,
where multiple sources generate a complete picture to detect the anomalous behavior of the system.
A k-means clustering model combined with a GMM [31] was developed to detect anomalies in road
traffic data. The authors provide a rich model that is capable of learning a high-dimensional
feature space over the normal behavior of the vehicles and detecting any deviation from it as an
abnormal situation. Furthermore, they also provide visual analytics to understand the behavior and
to explain the abnormal behavior events. Visual analytics based methods provide a transparent way
to analyze the data. During their research, the authors also consider the feedback provided by
expert analysts from Volvo Group Trucks Technology. This feedback includes information to
understand the normal and abnormal behaviors. For this purpose, they analyzed the behavior of the
driver 3 to 10 seconds before the crash as well as online and offline queries posed to detect the
abnormal behavior. A structured sparse subspace learning algorithm [32] is developed to detect
anomalous behavior. In this method, a structured norm is imposed on the projection
coefficient matrix to achieve structured sparsity and help identify anomaly sources. The authors
performed the experiments over two real-world flight datasets from the UAV Laboratories at the
University of Minnesota. For the experiments, only the part of the flight data from takeoff to
landing is used, and the developed method outperformed state-of-the-art algorithms. In Table 5, we
present our findings in the literature based on statistical methods in intelligent transport
system.
2) MACHINE LEARNING METHODS
Machine learning models play an important role in detecting and predicting abnormal behavior in
intelligent transport systems. In the literature, we found that appropriate data selection from the
databases is an important step because these databases have been acquiring data over several years.
A decision tree algorithm (i.e., C4.5 from the decision tree family) [33] is applied to a new
representation of the unstructured data streams in the form of reports. The new representation is
named the feature space and utilizes a fusion model to define abnormal behavior. The authors
extracted the new representation over 5 years of historical operational data of the A-320. They
also discuss the semantic meaning of the extracted model and its implications. Proper labeling
would allow them to predict problems in advance before they happen. Furthermore, it may help to
generate alerts for dangerous situations when any important parameters indicate abnormal behavior.
Another research study developed kernel learning mechanisms based on two kernels [34]. One is
designed over discrete sequences while the other is designed for continuous time series data. The
authors evaluated their study on synthetic as well as four real-world datasets to detect the
anomalous behavior of aircraft. Recently, the Extreme Learning Machine (ELM) was introduced by the
research community as an effective breakthrough [35]. This approach addresses the challenge of the
high computational time of training over large datasets. It also provides good generalization for
developing a scalable anomaly detection mechanism for very large datasets. The authors performed
the experiments over a real aviation safety benchmark problem that contains data on 43000 flights,
including radar information, flight trajectories, and the distance of nearby aircraft.
Deep learning models have brought significant improvement in many application areas, specifically
image recognition, speech classification, and natural language processing. Such models have the
potential to detect anomalies, and researchers have already designed and developed solutions based
on deep learning methods. A reinforcement learning method [36] was developed to detect the motor
anomaly of an unmanned aerial vehicle over the temperature data stream. Reinforcement learning
algorithms are based on a reward mechanism. The authors controlled the motor temperature by
increasing the speed if the temperature range is appropriate. When the temperature increases, they
decelerate to avoid motor malfunction, which is one of the major causes of drone crashes.
A regression-based model [37] is developed to find the correlation between events and resource
metrics in log files in order to find contextual anomalies in an air traffic control system.
Furthermore, the authors also provide the details of actionable improvement by including a change
detection algorithm and the use of time windows on contextual anomalies. They performed the
experiments over three failure settings. In the first, the system seems to be working but the
services are unresponsive. In the second, the system is running but unresponsive due to
indefinitely waiting for an event to occur. In the third, it crashes and terminates unexpectedly.
A support vector machine (SVM) algorithm [38] is successfully applied to provide real-time warnings
and instructions to drivers about road anomalies. A deep Convolutional Neural Network (CNN) is
developed and trained over images of railway track to detect anomalies [39]. CNN-based methods can
be distinguished by network architecture, activation function, and weight update mechanism. The
authors performed the experiments over a real dataset of 85 miles of continuous track images under
a supervised training mechanism. In [40], the designed autoencoder is based on an unsupervised deep
learning model. The authors built a stacked denoising autoencoder model to learn a robust feature
space in an unsupervised fashion. Then, they used the output of the autoencoder as the input to an
anomaly detection algorithm based on the Gaussian distribution. They performed the experiments on
the turbofan gas turbine engine of a civil plane to detect the anomalous behavior of the engine.
Table 6 presents our findings based on machine learning methods in intelligent transport systems.
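The last stage of the scheme described for [40], Gaussian-based detection over learned features, can be sketched as follows. This is only an illustration under stated assumptions: the feature rows stand in for autoencoder outputs (no autoencoder is trained here), the Gaussian is assumed to have independent dimensions, and the threshold rule (the lowest log-density seen on normal data) is a hypothetical choice. None of these details come from [40].

```python
import math
from statistics import mean, pstdev


def fit_diagonal_gaussian(rows):
    """Estimate a (mean, standard deviation) pair for every feature column."""
    return [(mean(col), pstdev(col)) for col in zip(*rows)]


def log_density(row, params):
    """Log-density of `row` under independent per-feature Gaussians."""
    total = 0.0
    for x, (mu, sigma) in zip(row, params):
        total += -0.5 * math.log(2 * math.pi * sigma ** 2)
        total -= (x - mu) ** 2 / (2 * sigma ** 2)
    return total


# Hypothetical 2-D feature vectors extracted from normal engine behavior.
normal_features = [(0.9, 2.1), (1.1, 1.9), (1.0, 2.0),
                   (0.8, 2.2), (1.2, 1.8), (1.0, 2.1)]
params = fit_diagonal_gaussian(normal_features)

# Assumed rule: anything less likely than the least likely normal sample is anomalous.
threshold = min(log_density(row, params) for row in normal_features)


def is_anomalous(row):
    return log_density(row, params) < threshold


print(is_anomalous((1.0, 2.0)))  # prints False
print(is_anomalous((3.0, 0.0)))  # prints True
```

In practice the threshold would usually be tuned on held-out data, since taking the minimum over the training set is sensitive to a single unusual normal sample.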
C. ANOMALOUS BEHAVIOR IN SMART OBJECTS
Smart objects are a fast-growing area that connects multiple objects together and enables
communication between them. They collect valuable data that can be a source of information and
knowledge for a wide range of applications. During our research survey, we found the following
statistical and machine learning literature that is aligned with our research questions and search
criteria.
1) STATISTICAL METHODS
Statistical methods are successfully applied to discover anomalous behavior in many application
domains. One of the simplest methods is thresholding [41], which specifies the behavior of the
objects when a certain threshold is crossed during the monitoring phase of the smart objects. The
experiments are performed on an intelligent trash bin equipped with a gas sensor to detect the
state of the food. Such methods are easy to implement and require very low computation power to
identify abnormal behavior of the objects. Similarly, another simple and powerful method, ANOVA, is
used to find anomalous behavior. ANOVA is based on the analysis of variance to determine whether
the variations between groups are significantly different from each other. In [42], the authors
developed an ANOVA-based technique to recognize the abnormal behavior of a vehicle. Researchers
also reported that when anomalous behavior exists, it affects multiple monitoring data series, and
correlation-based methods are developed to detect the abnormal behavior of the system. In [43], a
latent correlation based method is developed to detect the anomalous behavior of a concrete pump
truck. Such a method is capable of detecting anomalies efficiently over the monitoring data. The
experiments were performed on 270 concrete pump trucks in a supervised learning fashion to detect
anomalous behavior. Developing an unsupervised learning method is another elegant approach that can
reduce the burden of annotating the data as normal or abnormal behavior. In this domain, an
unsupervised expectation maximization method is developed to identify anomalous behavior [44]. The
developed method places a constraint to reduce the high variances of the mixture model in order to
make the correct prediction. The experiments were performed over a smart nursing home with light
switches. Table 7 presents our findings based on statistical methods in the area of smart objects.
2) MACHINE LEARNING METHODS
Among machine learning approaches, rule-based and clustering approaches are developed for anomaly
detection and prediction in smart object systems. In the case of rule-based approaches, the
researchers have developed hybrid approaches in combination with fuzzy logic [45]. They
calculated the degree of belief to handle the ambiguities,
imprecision, and vagueness. They predict anomalous behavior by monitoring the water level through
temperature and rain gauge sensors. In [46], rules are considered together with a novel approach to
verify the correctness of the rules based on domain knowledge. A school building is considered with
a couple of sensors to recognize the anomalous patterns in energy consumption, with the objective
of a comfortable life with less energy consumption. One of the most prominent advantages of
rule-based models is the human-understandable output. This feature is more likely to be acceptable
in application areas where practitioners want to get a human-understandable format from machine
learning methods. In [47], the authors introduced a temporal clustering approach to detect
anomalous behavior in city parking in San Francisco over data from 8200 parking sensors. Clustering
methods associate the data points into groups in such a way that all data points with similar
characteristics are placed in one group. Recently, a new cluster heat map approach [48] was also
developed to monitor the energy usage of devices and user contexts in an indoor office environment.
The study comprised 127 sensors deployed in two-floor buildings and measuring total power, reactive
power, phase angle, voltage, current, light value, temperature, and motion readings. Similarly,
temporal and spatial-temporal methods are also developed to detect abnormal behavior in
environmental datasets [49]. Visualization techniques are also developed over the clustering
methods to identify and predict the anomalous behavior of the system under observation [48], [50].
In Table 8, we present our findings based on machine learning methods in the area of smart objects.
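The clustering idea described above (group similar data points, then treat points that fit no group as suspicious) can be sketched as follows. Plain one-dimensional k-means is used here as a stand-in; the reviewed studies [47], [48] use temporal clustering and cluster heat maps, which are not reproduced. The function names, the initial centroids, the sample readings, and the threshold rule are all hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on scalar readings, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid when a group receives no points.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def distance_to_nearest(value, centroids):
    """Anomaly score: distance from a reading to its closest cluster centre."""
    return min(abs(value - c) for c in centroids)


# Hypothetical sensor readings that form two groups of normal behavior.
readings = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids = kmeans_1d(readings, centroids=[0.0, 6.0])

# Assumed rule: flag readings farther from every centre than any training reading was.
threshold = max(distance_to_nearest(v, centroids) for v in readings)

new_readings = [1.1, 3.0, 5.1]
flagged = [v for v in new_readings if distance_to_nearest(v, centroids) > threshold]
print(flagged)  # prints [3.0]
```

The number of clusters and the starting centroids are fixed by hand to keep the example deterministic; a real deployment would need to choose them from the data.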
D. ANOMALOUS BEHAVIOR IN HEALTHCARE SYSTEMS
Anomaly detection, analysis, and prediction are considered a revolution that redefines healthcare
systems. In such systems, a clear impact can be seen in healthcare management and wellness, in
enhancing the quality of life, and in remote monitoring for chronic disease patients. Such systems
pose a huge challenge in reducing the number of false alarms generated. In our systematic
literature survey, we found that there are sufficient approaches and methods to identify the
anomalous behavior of the sensors, humans, or machines in the healthcare environment.
1) STATISTICAL METHODS
Among statistical methods, a large number of methods are developed, ranging from the simple root
mean square [51] to complex Markov chain models [52]. All developed methods depend on the nature of
the time-series data and the availability of such data with annotation of normal and abnormal
behavior patterns. A root mean square graph [51] was developed over accelerometer sensor data to
find seizure complexities in the form of contextual anomalies. To monitor anomalous cardiac
behavior, dynamic time warping and density functions [53] were developed over the
Photoplethysmogram (PPG) signals of a real dataset. We also found ARIMA models used to detect point
anomalies in a decision support system over electronic health records. In [52] and [54], the
authors developed methods for recognizing anomalous behavior in sleep patterns and blood glucose
level using the data of wearable technology (e.g., Fitbit) and medical devices (i.e., insulin
tolerance test). A spectral coherence analysis [55] is also developed to find the discrepancies in
the accelerometer sensory data. In Table 9, we present our findings in the literature based on
statistical methods in healthcare systems.
2) MACHINE LEARNING METHODS
In the machine learning literature that we found, most of the developed models depend on domain
knowledge. Therefore, data-driven approaches [57], incremental learning methods [58], graph-based
approaches [59], and transfer learning [60] are successfully applied in real-life scenarios.
A data-driven approach [57] is developed over binary data to detect contextual anomalies over the
tap sensor in a home setting. In the incremental learning method [58], the developed model keeps
updating its learning on the arrival of new cases or data patterns. Incremental learning plays an
important role in the health care domain when the whole data is not available at the time of
training machine learning models. Another perspective is that we can bypass the limited memory and
processing power of the system by learning the models in incremental phases. Incremental learning
methods based on regression and a feedback mechanism were developed in the past for detecting
anomalous patterns in electronic health records [58]. In graph-based methods, we can assert the
healthcare workflows and dependencies for patient care. A patient flow-based anomaly detection
technique is developed [59] to find anomalous behavior in electronic medical health records.
Another approach is based on transfer learning, with the objective of classifying new instances in
an unsupervised fashion while the model is trained on the data of other sources. For instance, such
methods are reported in the literature [60] to monitor the ECG signal and detect anomalous
behavior. Table 10 presents our findings based on machine learning methods in healthcare systems.
E. ANOMALOUS BEHAVIOR IN INDUSTRIAL SYSTEM
In industrial systems, the design and development of anomaly detection methods are crucial to
reduce the chance of unexpected failure of a system. We found that the developed anomaly detection
methods are successfully applied to predictive and proactive maintenance. Such methods are widely
used to improve productivity, reduce the downtime of the machines, and perform root cause analysis
of the faults.
1) STATISTICAL METHODS
The time and frequency properties of sensory data provide valuable information for building time-frequency logic. Time-domain features of a signal (for example, mean, standard deviation, or variance) can convey certain information about the behavior of the system. In complex scenarios, more advanced properties of the signal are analyzed. For instance, frequency-based signal properties (for example, Fourier transformation or wavelet transformation) can provide information individually or in combination with time-domain features to understand the behavior of the system [62]. Industrial systems are complex and large, and a huge number of sensors are placed to monitor spaces and objects in order to provide a better solution for abnormal behavior prediction. For such scenarios, correlation-based methods [63] were developed and proved to be a more effective way to find anomalies. Correlation-based methods can give a faithful representation of the system because the correlations physically reflect its mechanisms and operating conditions. These methods have several advantages: they are comparatively simple to implement and make real-time analysis easy. Similarly, a density-based model [64] was developed to find anomalous behavior in a solar power generation system by processing the electric data. In Table 11, we present our findings in the literature based on statistical methods in industrial systems.
2) MACHINE LEARNING METHODS
In industrial systems, we found one-class support vector machines [66], clustering and rule-based techniques [67], probabilistic neural networks [68], and an extension of neural networks in the form of extreme learning machines [69]. The one-class support vector machine is a well-known method for detecting anomalous system behavior in many application domains; we discussed the significance of its working in Section VII-A (i.e., intelligent inhabitant environment). It places a boundary around the seen data and treats all events or data points that fall outside the boundary as anomalous behavior of the system. Clustering methodologies group objects with similar characteristics in an unsupervised fashion. After such automatic grouping, if the system cannot place a new data point into one of the existing clusters, it raises an alert for an anomalous case. Rule-based approaches have also been combined with clustering approaches [67] to provide excellent coverage and a low false alarm rate. Artificial neural networks
are complex architectures that mimic the human brain in solving a diverse range of problems, including anomaly detection. In the literature, we found that probabilistic neural networks [68], which belong to the stochastic neural network group, provide excellent solutions for detecting anomalies in a thermal power plant. A conditional Gradient Boosting Decision Tree (GBDT) [70] was developed to detect early anomalies in the wind turbine bolt breaking problem. It is an ensemble learning classifier that generates multiple trees and uses them to make the final decision. It can also work under a large amount of noise, where normal and abnormal conditions cannot be separated by a hyperplane. Table 12 presents our findings based on machine learning methods in industrial systems.
VIII. DISCUSSION
A. Q1. WHAT KIND OF SENSORS ALONG WITH GENERATED DATA STREAMS ARE USED TO DETECT ANOMALIES?
In the literature review, we found a large number of sensors used to detect anomalous behavior in different application domains. In intelligent inhabitant environments, state-change sensors, environmental sensors, and wearable sensors are used to avoid privacy issues. These sensors provide a more comfortable level of privacy compared to video cameras and collect data continuously, generating temporal data streams. In intelligent transport systems, system logs are processed together with sensor data to find anomalous behavior. In such systems, both hardware sensory information and software logs (i.e., soft sensors) are used to provide a complete picture of the system. Smart objects have gained a lot of popularity in the last decade, and many everyday objects are fitted with low-cost sensors to enable inter-object and Internet communication. In this domain, data coming from both hardware and soft sensors are used to predict the anomalous behavior of such systems. In healthcare systems, different kinds of sensors are considered, from patients' wearable sensors to sensors embedded in healthcare instruments and electronic health records. Industrial systems are considered complex systems and involve legacy software and equipment. We found that sensors are attached to mechanical objects, sense the environment, and measure different parameters of running machines to detect and predict anomalous behavior. Most of the provided datasets are already cleaned, so minimal preprocessing is required. We present the different kinds of sensors along with the domain knowledge in Table 13.
TABLE 13. The sensors used in different domains to detect the anomalous behavior of the system. Where IIE = intelligent inhabitant environment, ITS = intelligent transport system, SO = smart objects, HCS = health care system, IS = industrial system.
B. Q2. WHAT ARE THE METHODS USED IN DIFFERENT DOMAINS FOR DETECTING ANOMALOUS BEHAVIOR?
We found different types of anomaly detection methods and extracted the information according to our developed protocol. The detailed information is presented as follows.
1) STATISTICAL METHODS
To detect anomalies, simple methods such as setting a threshold based on the variance or calculating a percentile also work well on real data. Among statistical techniques, correlation-based methods provide information about the relationships among data sets. Such relationships can reveal anomalous patterns in the dataset. Several variations of correlation-based anomaly detection have been established in the literature. Among these variations, the most common is the dissimilarity measure, which relies on distance-based indexes to measure dissimilarities. Gaussian mixture model based approaches are probabilistic and assume that all data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters. This model has been successfully applied in intelligent inhabitant environments as well as in intelligent transport systems. Such a statistical technique may also be useful in industrial systems, where the data are likewise complex and enormous. We also observe that dimensionality reduction techniques such as PCA and structured subspace learning are used in intelligent transport systems; they could be beneficial for reducing dimensionality in other domains and can be combined with machine learning methods. Such hybrid approaches can bring fruitful results across domains.
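The two simplest detectors named in this subsection, a variance-based threshold and a percentile cut-off, can be sketched as follows. This is only an illustration and not a method taken from any reviewed paper: the function names, the multiplier k = 3, the 95th percentile, and the sample readings are all assumptions chosen for the example.

```python
import statistics


def variance_threshold_anomalies(values, k=3.0):
    """Return indices of values more than k standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # constant signal: nothing deviates
    return [i for i, v in enumerate(values) if abs(v - mean) > k * std]


def percentile_anomalies(values, upper_pct=95):
    """Return indices of values above the given percentile (1..99)."""
    # With n=100, statistics.quantiles returns the 99 cut points P1..P99.
    cut_points = statistics.quantiles(values, n=100, method="inclusive")
    threshold = cut_points[upper_pct - 1]
    return [i for i, v in enumerate(values) if v > threshold]


# Hypothetical temperature readings with one obvious spike at index 10.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2,
            20.1, 19.7, 20.0, 20.4, 35.0, 20.1]

print(variance_threshold_anomalies(readings, k=3.0))
print(percentile_anomalies(readings, upper_pct=95))
```

Both functions flag only the spike in this toy series. In practice the multiplier and the percentile would have to be tuned per sensor, and a single extreme value inflates the standard deviation, which is one reason percentile-based cut-offs are sometimes preferred on short windows.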
TABLE 15. The machine learning methods found during our systematic literature review. Where IIE = intelligent inhabitant environment, ITS = intelligent transport system, SO = smart objects, HCS = health care system, IS = industrial system.
information about the instances that are positive but falsely considered as negative. Precision is defined as the number of class members classified correctly over the total number of instances classified as class members. Recall is defined as the number of class members classified correctly over the total number of class members. In an anomaly detection system, both high precision and high recall are desired. In such a situation, the F-measure is used to give equal importance to precision and recall [72].
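The definitions of precision, recall, and F-measure given here translate directly into a few lines of code. The sketch below is illustrative only: the function name precision_recall_f1 and the toy label vectors are assumptions, with 1 marking an anomalous instance and 0 a normal one. The F-measure computed is the balanced F1 score, that is, the harmonic mean of precision and recall.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive (anomalous) label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false alarms
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # missed anomalies

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical ground truth and detector output (1 = anomaly, 0 = normal).
y_true = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

In this toy case there are three true positives, one false alarm, and one missed anomaly, so precision, recall, and F1 all equal 0.75. On heavily imbalanced data these measures are more informative than plain accuracy, because a detector that never raises an alert can still score high accuracy.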
TABLE 16. The deep learning methods found during our systematic literature review. Where IIE = intelligent inhabitant environment, ITS = intelligent transport system, SO = smart objects, HCS = health care system, IS = industrial system.
1) INTELLIGENT INHABITANT ENVIRONMENT
Anomalous behavior detection is relatively new in intelligent inhabitant environments and is currently being explored in the research community. Many challenges lie ahead in detecting abnormal behavior. For instance, we found many methods that deal with a single inhabitant in a smart environment. In real life, many family members interact with the objects at home. Robust techniques are required to attribute each behavior to the right person. Another challenge is imbalanced data, which arises because interactions with some objects are frequent while interactions with others are very rare. An example is a ‘cleaning activity’ that may take place once per week, compared to the more frequent activity ‘preparing meal’. Finding abnormal behavior during home cleaning is difficult for a classifier. Nevertheless, the methods developed for intelligent inhabitant environments provide promising results, but working in real-life scenarios requires more powerful methods to deal with these challenges.
2) INTELLIGENT TRANSPORT SYSTEM
In the literature, most of the work related to the detection of anomalous behaviors was done for avionics systems. The datasets are complex and high dimensional. The data complexity makes it difficult for the learning task to distinguish between normal and abnormal behavior. Moreover, high-dimensional data is compute intensive and demands big-data infrastructure to process it within defined time intervals. Such complex data structures can be processed by deep learning models. However, such models require large computational resources for training and are challenging to integrate with existing running systems. New representational techniques are required to reduce the dimensionality of the collected data with minimum loss of information.
3) SMART OBJECTS
Today we are surrounded by smart objects that ensure the proper working of many systems. This is a breakthrough in technology and makes it possible for objects to communicate with humans as well as with other system devices to provide information about their working status. During our search, we found simple and effective methods to recognize anomalous behavior. In a network of smart objects, a big challenge is the uncertainty in interpreting the contextual and collective behavior of objects, which demands new methods to determine abnormal behavior. Another foreseen challenge is the privacy and security of such systems while storing and sending data to the cloud infrastructure.
4) HEALTHCARE SYSTEM
In the literature, we found methods developed for cardiac behavior monitoring, sleep monitoring, gait freezing, seizure complexities, and decision support systems. Such systems also pose a great challenge because sensors are not placed only in fixed locations. They range from patient-worn sensors or other devices that capture the values of vital parameters to electronic medical records and medical tests. In this domain, more effort is required to understand the nature and importance of the data before developing the anomaly detection system. Furthermore, in data-driven approaches, researchers have developed methods to detect anomalies in electronic medical records based on past patient cases. In data-driven approaches, evaluation and development require close interaction with domain experts. Another challenge is trust in the developed methods, since they are not understandable by healthcare experts. Health practitioners regard the developed system as a black box. This area also demands human understandability to assure the correct behavior of the system.
5) INDUSTRIAL SYSTEM
In industrial systems, the robust development of anomaly detection methods is crucial to reduce the chance of a sudden system stoppage. We found that “rule-based” approaches are among the most notable approaches for detecting anomalous behavior in industrial systems. Such techniques generate human-understandable profiles that are easily understood by industrial engineers. Current challenges in industrial systems are (1) to develop methods that can do root-cause analysis; (2) to provide preventive solutions that detect anomalous behavior before it stops the industrial units.
IX. RESEARCH GAPS
Research on anomalous behavior detection outside of security and surveillance systems still needs to solve many existing challenges.
• It has been found that real system data is hard to obtain and that accessing the data of a running system requires a long procedure. A big gap exists in formalizing a way to access data logs and sensory data streams in order to build a model and validate it in real-life settings.
• During the analysis, we found that most of the studies relate to the normal behavior of the system. The most commonly developed methods are trained on normal behavior, and any deviation from it is considered anomalous behavior. More accurate and robust models are required to deal with complex real scenarios. A close relationship is required to explain the importance and promising results to the concerned management individuals.
• The complexities of the data include imbalanced datasets, unexpected noise, and redundancy in the data. Well-designed methods are needed to curate the datasets for extracting meaningful information and knowledge. In this situation, even a small system can require heavy computation to process the complex data. The existing cloud
computing technology can be exploited for this work to obtain fruitful results in time.
• Many statistical methods were well proven for detecting anomalies in past decades. At the moment, these methods face a big challenge due to the exponential increase in the deployment of sensors on objects. Currently, working systems are equipped with many hardware and software sensors (i.e., system logs) to monitor the systems/environment. When all sources of information are considered, the number of parameters increases exponentially, and statistical methods are not able to deal with such data.
• Machine learning methods are developed to deal with the high dimensionality of the data and detect anomalous behavior of the system. Deep learning, a subset of the machine learning domain, has had great success in many areas (e.g., computer vision and speech processing) in producing more accurate results for complex problems. A gap exists in applying these new models and assessing their ability in the anomaly detection domain, especially for intelligent transport, industrial, and smart object systems.
• Most of the research work done in the last decades is about the detection of anomalies. Anomaly prediction and prevention is a largely unexplored area in the research community. Predicting anomalies in advance could contribute a lot. There is a need to develop and/or adapt methods that can protect systems proactively and can do root cause analysis.
• We found a gap in visualizing anomalies for analysis. New methods and approaches are needed to present the system in an intuitive way for analysis. Such gaps ought to be investigated, which could contribute to the field of anomaly detection systems.
• Our analysis could not find any work based on fusion techniques. Such techniques can provide a robust platform to fuse sensory data streams and assist the analysis of anomalous behavior.
• The research community is putting a lot of effort into developing accurate methods. We could not find any work based on ensemble learning that can reduce the false positive rate of the system.
X. CONCLUSIONS
Over the years, anomaly detection has been an active area of research and has attracted a lot of interest from researchers in all application areas. Anomalous behavior recognition can reduce functional risks, avoid unseen problems, and prevent downtime of systems. In this paper, we provide the results of a systematic literature review from the perspective of statistical and machine learning models. Our review provides a complete overview of the developed approaches, their characteristics, and performance measures. Consequently, this information can help the research community obtain detailed information about up-to-date approaches and methodologies in the anomaly detection domain. We also observe that further development of new methods and techniques is required to process the diverse data streams generated from IoT devices, healthcare systems, intelligent environments, and complex industrial systems.
REFERENCES
[1] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection for discrete sequences: A survey,” IEEE Trans. Knowl. Data Eng., vol. 24, no. 5, pp. 823–839, May 2012.
[2] I. Butun, S. D. Morgera, and R. Sankar, “A survey of intrusion detection systems in wireless sensor networks,” IEEE Commun. Surveys Tuts., vol. 16, no. 1, pp. 266–282, 1st Quart., 2014.
[3] P. Garcia-Teodoro, J. Diaz-Verdejo, G. Maciá-Fernández, and E. Vázquez, “Anomaly-based network intrusion detection: Techniques, systems and challenges,” Comput. Secur., vol. 28, nos. 1–2, pp. 18–28, 2009.
[4] A. Abdallah, M. A. Maarof, and A. Zainal, “Fraud detection system: A survey,” J. Netw. Comput. Appl., vol. 68, pp. 90–113, Jun. 2016.
[5] M. Ahmed, A. N. Mahmood, and M. R. Islam, “A survey of anomaly detection techniques in financial domain,” Future Gener. Comput. Syst., vol. 55, pp. 278–288, Feb. 2016.
[6] C. C. Aggarwal, “Outlier analysis,” in Data Mining. Cham, Switzerland: Springer, 2015, pp. 237–263.
[7] X. Ding, Y. Li, A. Belatreche, and L. P. Maguire, “Novelty detection using level set methods,” IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 3, pp. 576–588, Mar. 2015.
[8] X. Song, M. Wu, C. Jermaine, and S. Ranka, “Conditional anomaly detection,” IEEE Trans. Knowl. Data Eng., vol. 19, no. 5, pp. 631–645, May 2007.
[9] A. V. Metcalfe and P. S. Cowpertwait, Introductory Time Series With R. New York, NY, USA: Springer-Verlag, 2009.
[10] S. V. Kumar and L. Vanajakshi, “Short-term traffic flow prediction using seasonal ARIMA model with limited input data,” Eur. Transp. Res. Rev., vol. 7, no. 3, p. 21, 2015.
[11] Q. Wang, X. Song, and R. Li, “A novel hybridization of nonlinear grey model and linear ARIMA residual correction for forecasting US shale oil production,” Energy, vol. 165, pp. 1320–1331, Dec. 2018.
[12] D. Budgen and P. Brereton, “Performing systematic literature reviews in software engineering,” in Proc. 28th Int. Conf. Softw. Eng., 2006, pp. 1051–1052.
[13] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,” ACM Comput. Surv., vol. 41, no. 3, p. 15, 2009.
[14] Z. Niu, S. Shi, J. Sun, and X. He, “A survey of outlier detection methodologies and their applications,” in Proc. Int. Conf. Artif. Intell. Comput. Intell. Berlin, Germany: Springer, 2011, pp. 380–387.
[15] M. Fahim, I. Fatima, S. Lee, and Y.-K. Lee, “EEM: Evolutionary ensembles model for activity recognition in smart homes,” Appl. Intell., vol. 38, no. 1, pp. 88–98, 2013.
[16] V. R. Jakkula and D. J. Cook, “Detecting anomalous sensor events in smart home data for enhancing the living experience,” in Proc. Artif. Intell. Smarter Living, 2011, pp. 1–2.
[17] S. M. Mahmoud, A. Lotfi, and C. Langensiepen, “Abnormal behaviours identification for an elder’s life activities using dissimilarity measurements,” in Proc. 4th Int. Conf. Pervasive Technol. Rel. Assistive Environ., 2011, p. 25.
[18] P. Cuddihy, J. Weisenberg, C. Graichen, and M. Ganesh, “Algorithm to automatically detect abnormally long periods of inactivity in a home,” in Proc. 1st ACM SIGMOBILE Int. Workshop Syst. Netw. Support Healthcare Assist. Living Environ., 2007, pp. 89–94.
[19] F. Cardinaux, S. Brownsell, M. Hawley, and D. Bradley, “Modelling of behavioural patterns for abnormality detection in the context of lifestyle reassurance,” in Proc. Iberoamerican Congr. Pattern Recognit. Berlin, Germany: Springer, 2008, pp. 243–251.
[20] W. Kang, D. Shin, and D. Shin, “Detecting and predicting of abnormal behavior using hierarchical Markov model in smart home network,” in Proc. IEEE 17th Int. Conf. Ind. Eng. Eng. Manage. (IE&EM), Oct. 2010, pp. 410–414.
[21] T. V. Duong, H. H. Bui, D. Q. Phung, and S. Venkatesh, “Activity recognition and abnormality detection with the switching hidden semi-Markov model,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. (CVPR), vol. 1, Jun. 2005, pp. 838–845.
[22] F. J. Ordóñez, P. de Toledo, and A. Sanchis, “Sensor-based Bayesian detection of anomalous living patterns in a home setting,” Pers. Ubiquitous Comput., vol. 19, no. 2, pp. 259–270, 2015.
[23] S.-S. Choi, S.-H. Cha, and C. C. Tappert, “A survey of binary similarity and distance measures,” J. Syst., Inform., vol. 8, no. 1, pp. 43–48, 2010.
[24] A. Palaniappan, R. Bhargavi, and V. Vaidehi, “Abnormal human activity recognition using SVM based approach,” in Proc. Int. Conf. Recent Trends Inf. Technol. (ICRTIT), Apr. 2012, pp. 97–102.
[25] J. H. Shin, B. Lee, and K. S. Park, “Detection of abnormal living patterns for elderly living alone using support vector data description,” IEEE Trans. Inf. Technol. Biomed., vol. 15, no. 3, pp. 438–448, May 2011.
[26] S. M. Mahmoud, A. Lotfi, and C. Langensiepen, “User activities outlier detection system using principal component analysis and fuzzy rule-based system,” in Proc. 5th Int. Conf. Pervasive Technol. Rel. Assistive Environ., 2012, p. 26.
[27] J. Yin, Q. Yang, and J. J. Pan, “Sensor-based abnormal human-activity detection,” IEEE Trans. Knowl. Data Eng., vol. 20, no. 8, pp. 1082–1090, Aug. 2008.
[28] N. Han, S. Gao, J. Li, X. Zhang, and J. Guo, “Anomaly detection in health data based on deep learning,” in Proc. Int. Conf. Netw. Infrastruct. Digit. Content (IC-NIDC), Aug. 2018, pp. 188–192.
[29] E. Hüllermeier, “Does machine learning need fuzzy logic?” Fuzzy Sets Syst., vol. 281, pp. 292–299, Dec. 2015.
[30] R. Fujimaki, T. Yairi, and K. Machida, “An approach to spacecraft anomaly detection problem using kernel feature space,” in Proc. 11th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2005, pp. 401–410.
[31] M. Riveiro, M. Lebram, and M. Elmer, “Anomaly detection for road traffic: A visual analytics framework,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 8, pp. 2260–2270, Aug. 2017.
[32] Y. F. He, Y. Peng, S. J. Wang, D. T. Liu, and P. H. W. Leong, “A structured sparse subspace learning algorithm for anomaly detection in UAV flight data,” IEEE Trans. Instrum. Meas., vol. 67, no. 1, pp. 90–100, Jan. 2018.
[33] J. M. Pena, F. Famili, and S. Létourneau, “Data mining to detect abnormal behavior in aerospace data,” in Proc. 6th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2000, pp. 390–397.
[34] S. Das, B. L. Matthews, A. N. Srivastava, and N. C. Oza, “Multiple kernel learning for heterogeneous anomaly detection: Algorithm and aviation safety case study,” in Proc. 16th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2010, pp. 47–56.
[35] V. M. Janakiraman and D. Nielsen, “Anomaly detection in aviation data using extreme learning machines,” in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2016, pp. 1993–2000.
[36] H. Lu, Y. Li, S. Mu, D. Wang, H. Kim, and S. Serikawa, “Motor anomaly detection for unmanned aerial vehicles using reinforcement learning,” IEEE Internet Things J., vol. 5, no. 4, pp. 2315–2322, Aug. 2017.
[37] M. Farshchi, I. Weber, R. Dellacorte, A. Pecchia, M. Cinque, J.-G. Schneider, and J. Grundy, “Contextual anomaly detection for a critical industrial system based on logs and metrics,” in Proc. 14th Eur. Dependable Comput. Conf. (EDCC), Sep. 2018, pp. 140–143.
[38] B. Bose, J. Dutta, S. Ghosh, P. Pramanick, and S. Roy, “D&RSense: Detection of driving patterns and road anomalies,” in Proc. 3rd Int. Conf. Internet Things, Smart Innov. Usages (IoT-SIU), 2018, pp. 1–7.
[39] X. Gibert, V. M. Patel, and R. Chellappa, “Deep multitask learning for railway track inspection,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 1, pp. 153–164, Jan. 2017.
[40] H. Luo and S. Zhong, “Gas turbine engine gas path anomaly detection using deep learning with Gaussian distribution,” in Proc. Prognostics Syst. Health Manage. Conf. (PHM-Harbin), Feb. 2017, pp. 1–6.
[41] J. Amores, P. Maes, and J. Paradiso, “Bin-ary: Detecting the state of organic trash to prevent insalubrity,” in Proc. ACM Int. Joint Conf. Pervasive Ubiquitous Comput. ACM Int. Symp. Wearable Comput., 2015, pp. 313–316.
[42] M. L. Han, J. Lee, A. R. Kang, S. Kang, J. K. Park, and H. K. Kim, “A statistical-based anomaly detection method for connected cars in Internet of Things environment,” in Proc. Int. Conf. Internet Vehicles. Cham, Switzerland: Springer, 2015, pp. 89–97.
[43] J. Ding, Y. Liu, L. Zhang, J. Wang, and Y. Liu, “An anomaly detection approach for multiple monitoring data series based on latent correlation probabilistic model,” Appl. Intell., vol. 44, no. 2, pp. 340–361, 2016.
[44] C.-W. Ho, C.-T. Chou, Y.-C. Chien, and C.-F. Lee, “Unsupervised anomaly detection using light switches for smart nursing homes,” in Proc. IEEE 14th Int. Conf. Pervasive Intell. Comput., 14th Int. Conf. Dependable, Auton. Secure Comput. 2nd Int. Conf. Big Data Intell. Comput. Cyber Sci. Technol. Congr. (DASC/PiCom/DataCom/CyberSciTech), Aug. 2016, pp. 803–810.
[45] R. U. Islam, M. S. Hossain, and K. Andersson, “A novel anomaly detection algorithm for sensor data under uncertainty,” Soft Comput., vol. 22, no. 5, pp. 1623–1639, 2016.
[46] Y. Sun, T.-Y. Wu, X. Li, and M. Guizani, “A rule verification system for smart buildings,” IEEE Trans. Emerg. Topics Comput., vol. 5, no. 3, pp. 367–379, Feb. 2017.
[47] Y. Zheng, S. Rajasegarar, C. Leckie, and M. Palaniswami, “Smart car parking: Temporal clustering and anomaly detection in urban car parking,” in Proc. IEEE 9th Int. Conf. Intell. Sensors, Sensor Netw. Inf. Process. (ISSNIP), Apr. 2014, pp. 1–6.
[48] D. Kumar, J. C. Bezdek, S. Rajasegarar, M. Palaniswami, C. Leckie, J. Chan, and J. Gubbi, “Adaptive cluster tendency visualization and anomaly detection for streaming data,” ACM Trans. Knowl. Discovery Data (TKDD), vol. 11, no. 2, p. 24, 2016.
[49] L.-J. Chen, Y.-H. Ho, H.-H. Hsieh, S.-T. Huang, H.-C. Lee, and S. Mahajan, “ADF: An anomaly detection framework for large-scale PM2.5 sensing systems,” IEEE Internet Things J., vol. 5, no. 2, pp. 559–570, Apr. 2018.
[50] M. Bobin, M. Boukallel, M. Anastassova, and M. Ammi, “Study of a smart cup for home monitoring of the arm and hand of stroke patients,” in Proc. 18th Int. ACM SIGACCESS Conf. Comput. Accessibility, 2016, pp. 305–306.
[51] T. R. Burchfield and S. Venkatesan, “Accelerometer-based human abnormal movement detection in wireless sensor networks,” in Proc. 1st ACM SIGMOBILE Int. Workshop Syst. Netw. Support Healthcare Assist. Living Environ., 2007, pp. 67–69.
[52] Y. Zhu, “Automatic detection of anomalies in blood glucose using a machine learning approach,” J. Commun. Netw., vol. 13, no. 2, pp. 125–131, 2011.
[53] C. Puri, A. Ukil, S. Bandyopadhyay, R. Singh, A. Pal, and K. Mandana, “iCarMa: Inexpensive cardiac arrhythmia management—An IoT healthcare analytics solution,” in Proc. 1st Workshop IoT-Enabled Healthcare Wellness Technol. Syst., 2016, pp. 3–8.
[54] K. Tonchev, P. Koleva, A. Manolova, G. Tsenov, and V. Poulkov, “Non-intrusive sleep analyzer for real time detection of sleep anomalies,” in Proc. 39th Int. Conf. Telecommun. Signal Process. (TSP), Jun. 2016, pp. 400–404.
[55] T. T. Pham, D. N. Nguyen, E. Dutkiewicz, A. L. McEwan, and P. H. Leong, “Wearable healthcare systems: A single channel accelerometer based anomaly detector for studies of gait freezing in Parkinson’s disease,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2017, pp. 1–5.
[56] S. Ray and A. Wright, “Detecting anomalies in alert firing within clinical decision support systems using anomaly/outlier detection techniques,” in Proc. 7th ACM Int. Conf. Bioinf., Comput. Biol., Health Inform., 2016, pp. 185–190.
[57] A. Manashty, J. Light, and U. Yadav, “Healthcare event aggregation lab (HEAL), a knowledge sharing platform for anomaly detection and prediction,” in Proc. 17th Int. Conf. E-Health Netw., Appl. Services (HealthCom), Oct. 2015, pp. 648–652.
[58] K. Raghuraman, M. Senthurpandian, M. Shanmugasundaram, Bhargavi, and V. Vaidehi, “Online incremental learning algorithm for anomaly detection and prediction in health care,” in Proc. Int. Conf. Recent Trends Inf. Technol. (ICRTIT), 2014, pp. 1–6.
[59] H. Zhang, S. Mehotra, D. Liebovitz, C. A. Gunter, and B. Malin, “Mining deviations from patient care pathways via electronic medical record system audits,” ACM Trans. Manage. Inf. Syst., vol. 4, no. 4, p. 17, 2013.
[60] K. Li, N. Du, and A. Zhang, “Detecting ECG abnormalities via transductive transfer learning,” in Proc. ACM Conf. Bioinf., Comput. Biol. Biomed., 2012, pp. 210–217.
[61] M. Hauskrecht, I. Batal, M. Valko, S. Visweswaran, G. F. Cooper, and G. Clermont, “Outlier detection for patient monitoring and alerting,” J. Biomed. Inform., vol. 46, pp. 47–55, Feb. 2013.
[62] L. V. Nguyen, J. Kapinski, X. Jin, J. V. Deshmukh, K. Butts, and T. T. Johnson, ‘‘Abnormal data
classification using time-frequency temporal logic,’’ in Proc. 20th Int. Conf. Hybrid Syst., Comput.
Control, 2017, pp. 237–242.
[63] P. Zhao, M. Kurihara, J. Tanaka, T. Noda, S. Chikuma, and T. Suzuki, ‘‘Advanced correlation-based
anomaly detection method for predictive maintenance,’’ in Proc. IEEE Int. Conf. Prognostics Health
Manage. (ICPHM), Jun. 2017, pp. 78–83.
[64] Y. Akiyama, Y. Kasai, M. Iwata, E. Takahashi, F. Sato, and M. Murakawa, ‘‘Anomaly detection of
solar power generation systems based on the normalization of the amount of generated electricity,’’
in Proc. IEEE 29th Int. Conf. Adv. Inf. Netw. Appl. (AINA), Mar. 2015, pp. 294–301.
[65] D. Zang, J. Liu, and H. Wang, ‘‘Markov chain-based feature extraction for anomaly detection in
time series and its industrial application,’’ in Proc. Chin. Control Decis. Conf. (CCDC), Jun. 2018,
pp. 1059–1063.
[66] J. Carino, D. Zurita, A. Picot, M. Delgado, J. Ortega, and R. Romero-Troncoso, ‘‘Novelty detection
methodology based on multi-modal one-class support vector machine,’’ in Proc. IEEE 10th Int. Symp.
Diag. Electr. Mach., Power Electron. Drives (SDEMPED), Sep. 2015, pp. 184–190.
[67] B. Narayanaswamy, B. Balaji, R. Gupta, and Y. Agarwal, ‘‘Data driven investigation of faults in
HVAC systems with model, cluster and compare (MCC),’’ in Proc. 1st ACM Conf. Embedded Syst.
Energy-Efficient Buildings, 2014, pp. 50–59.
[68] A. Hajdarevic, I. Dzananovic, L. Banjanovic-Mehmedovic, and F. Mehmedovic, ‘‘Anomaly detection in
thermal power plant using probabilistic neural network,’’ in Proc. 38th Int. Conv. Inf. Commun.
Technol., Electron. Microelectron. (MIPRO), May 2015, pp. 1118–1123.
[69] W. Yan, ‘‘One-class extreme learning machines for gas turbine combustor anomaly detection,’’ in
Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2016, pp. 2909–2914.
[70] C.-W. Wu and M. Chen, ‘‘Early anomaly detection in wind turbine bolts breaking problem—Methodology
and application,’’ in Proc. IEEE 3rd Int. Conf. Big Data Anal. (ICBDA), Mar. 2018, pp. 402–406.
[71] M. A. Hayes and M. A. Capretz, ‘‘Contextual anomaly detection in big sensor data,’’ in Proc. IEEE
Int. Congr. Big Data (BigData Congr.), Jun. 2014, pp. 64–71.
[72] C. Goutte and E. Gaussier, ‘‘A probabilistic interpretation of precision, recall and f-score, with
implication for evaluation,’’ in Proc. Eur. Conf. Inf. Retr. Berlin, Germany: Springer, 2005,
pp. 345–359.

MUHAMMAD FAHIM received the B.S. degree (Hons.) from Gomal University, Pakistan, in 2007, the M.S.
degree from the National University of Computer and Emerging Sciences (NUCES), Pakistan, in 2009, and
the Ph.D. degree from Kyung Hee University, South Korea, in 2014, where he was a Postdoctoral Fellow
with the Department of Computer Engineering. He served as an Assistant Professor with the Department of
Computer and Software Engineering, Faculty of Engineering and Natural Sciences, Istanbul Sabahattin Zaim
University, Istanbul, Turkey, for three years, where he also leads the Machine Learning Research
Laboratory. He is currently with the Institute of Information Systems, Innopolis University, Russia, as
an Assistant Professor. His research interests include activity recognition in smart homes and
smartphone, wearable computing, digital signal processing, machine learning, and abnormal behavior
recognition in intelligent environments.

ALBERTO SILLITTI received the Ph.D. degree in electrical and computer engineering from the University
of Genoa, Italy, in 2005. He has been involved in several EU funded projects related to open source
software, services architectures, and agile methods in which he applies noninvasive measurement
approaches. He is currently a Full Professor with the Faculty of Computer Science and Engineering,
Innopolis University, Russia, where he leads the Cyber-Physical Systems Lab and is also the Director of
the Institute of Information Systems. He is also an Associate Dean for consulting activities. He has
authored more than 200 papers published in international conferences and journals. His research
interests include open source development, agile methods, empirical software engineering, noninvasive
measurement, software quality, cyber-physical systems, and mobile and web services, with a focus on
mobile and energy-aware software development and quality for cyber-physical systems. He has served as a
member for the program committee of several international conferences, as a Program Chair for OSS in
2007 and 2019, XP in 2010 and 2011, SEDA in 2012, 2013, and 2014, and a General Chair for SEDA in 2018.