IEEE Xplore ®

Notice to Reader

“Anomaly Detection Based on Multidimensional Data Processing for Protecting Vital Devices in
6G-Enabled Massive IIoT”
by Guangjie Han, Juntao Tu, Li Liu, Miguel Martínez-García, Yan Peng
published in IEEE Internet of Things Journal
vol. 8, no. 7, pp. 5219–5229, April 1, 2021
Digital Object Identifier: 10.1109/JIOT.2021.3051935
The corresponding author of this article is Yan Peng (e-mail: [email protected]).

We regret any inconvenience this may have caused.


Honggang Wang
Editor-in-Chief
IEEE Internet of Things Journal

Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY KURUKSHETRA. Downloaded on March 24,2022 at 11:59:04 UTC from IEEE Xplore. Restrictions apply.
IEEE INTERNET OF THINGS JOURNAL, VOL. 8, NO. 7, APRIL 1, 2021 5219

Anomaly Detection Based on Multidimensional Data Processing for Protecting Vital Devices in 6G-Enabled Massive IIoT

Guangjie Han, Senior Member, IEEE, Juntao Tu, Li Liu, Miguel Martínez-García, Member, IEEE, and Yan Peng

Abstract—As a result of the increasing deployment of Industrial-Internet-of-Things (IIoT) architectures, large volumes of multidimensional data are continuously generated. An important issue with these data is that higher dimensionality increases the degree of fragmentation. Furthermore, data sets collected by IIoT nodes often display outliers, which are usually caused by anomalous events or errors. These outliers, which can prevent the normal operation of the system, contain considerable valuable information. Thus, methodologies that can quantify the obtained information to protect the high-priority IIoT nodes are crucial. This study aims at developing such a method driven by sixth-generation (6G) networks. The proposed algorithm uses a multidimensional data relationship diagram to characterize the spatiotemporal correlations among heterogeneous data. Then, an autoregressive exogenous model is used to eliminate the effects of noise on sensor data, and to help in detecting anomalies. Finally, the algorithm produces a Cumulative Coefficient of Value (CCoV), to identify high-value sensing devices and enable massive Internet of Things (IoT) with 6G, using the characteristic patterns hidden within the data. The experimental results demonstrate that the proposed method can effectively handle the effects of the ubiquitous interference noise in complex industrial environments. Moreover, the method yields effective anomaly detection and compensates for some of the shortcomings in traditional methods.

Index Terms—Artificial intelligence, Industrial Internet of Things (IIoT), multidimensional data processing (MDP), sixth-generation (6G) networks.

Manuscript received September 28, 2020; revised November 10, 2020 and November 23, 2020; accepted January 9, 2021. Date of publication January 18, 2021; date of current version March 24, 2021. This work was supported in part by the National Key Research and Development Program under Grant 2017YFE0125300; in part by the Jiangsu Key Research and Development Program under Grant BE2019648; and in part by the Project of Shenzhen Science and Technology Innovation Committee under Grant JCYJ20190809145407809. (Corresponding author: Guangjie Han.)
Guangjie Han, Juntao Tu, and Li Liu are with the College of Internet of Things Engineering, Hohai University, Changzhou 213022, China (e-mail: [email protected]; [email protected]; [email protected]).
Miguel Martínez-García is with the Department of Aeronautical and Automotive Engineering, Loughborough University, Loughborough LE11 3TU, U.K. (e-mail: [email protected]).
Yan Peng is with the School of Artificial Intelligence, Shanghai University, Shanghai 200000, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/JIOT.2021.3051935
2327-4662 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

PRESENTLY, the operation of production and social interaction activities is increasingly data driven and automated. Higher automation in the industrial manufacturing process can greatly improve the efficiency and customization of the production line [1]. Furthermore, a vast array of sensors is being embedded in cities, homes, factories and even humans. Data fusion and analysis are realized through ad hoc cloud and edge computing centers. All this has resulted in an explosive growth in the degree of interconnectivity and in the amount of data being processed; the so-called Internet of Things (IoT) is inevitably increasing the speed and stability requirements in network transmission, and the necessity of adequate network management to ensure the protection of the nodes within a network. The emergence of new technologies, such as sixth-generation (6G) networks [2], can effectively mitigate some of these problems, and makes the realization of the IoT paradigm more feasible. In the future, in the field of industry, 6G will fully realize the Industry 4.0 revolution that started with 5G. Nevertheless, a better usage and protection of the relevant nodes in a network, from a computational perspective, is of paramount importance. Toward this aim, and within the Industrial IoT (IIoT) [5] framework, machine learning (ML) strategies [6]–[8] are often used to evaluate the degree of correlation of the observations, to extract features from the input multidimensional data, and to predict the likelihood of a system failure.

Given the ever-increasing scale of the networks within the 6G-enabled massive IoT [9], and the multidimensional and heterogeneous characteristics of the resulting data, anomaly detection has simultaneously become highly important and challenging. The safe and efficient operation of a fully automated industrial production system [10] relies on the quality of the sensor data, collected in the sensing layer, and on the intelligent system that extracts the latent information within it. In an emergency, the control systems typically must produce a quick response. However, a large number of outliers often appear in these data, and it is necessary to distinguish the outliers [11], caused by anomalies, from errors [12], caused by node failure.

Furthermore, in an actual industrial production environment, it is often impossible to perfectly label the generated heterogeneous data. Sampling errors and the presence of instrument defects are other factors that cannot be ignored. Thus, unsupervised or semisupervised algorithms, which work with incompletely labeled data, are necessary. In addition, in real industrial cases it may be difficult for the safety personnel to travel to the incident area to perform maintenance and replacement when problems are detected. Once a smart device with high-value information fails to function properly, it often affects the entire system. Thus, it is necessary to identify which nodes have the highest relevance for the network, so that preventive protection with redundant or backup systems can be implemented.

To address these problems, we propose an anomaly detection method based on multidimensional data processing (MDP) to analyze the streamed data and to determine potential sources of anomalous behavior within complex large networks. The distributed processing and online detection can be realized through edge computing devices in 6G-enabled massive IoT. Using the results of the anomaly detection, the algorithm extracts the nodes with high-value information by computing a proposed measure: the cumulative coefficient of value (CCoV) of each node. The main contributions of this article are summarized as follows.
1) An effective data analysis model, to reveal the spatiotemporal correlation of the IIoT nodes, is designed for 6G-enabled massive IIoT. After training the autoregressive exogenous (ARX) model, the environmental and system parameter values that eliminate the interference caused by industrial noise are obtained.
2) A threshold-free anomaly detection method is used to process the multidimensional heterogeneous data. The source of the detected outliers is validated, yielding a more accurate diagnosis.
3) A new measure (CCoV) is proposed through edge computing devices driven by 6G networks. The CCoV is calculated based on the results of the anomaly detection process, and quantifies the degree of relative importance of the IIoT nodes.
4) The performance of the anomaly detection method is evaluated in real data sets and synthetic data through extensive simulations. The results demonstrate that the proposed algorithm displays better performance under noisy conditions than a number of existing algorithms [high reliable data verification (HRDV), memory efficient incremental local outlier factor (MiLOF), and recursive PCA (R-PCA)]. The CCoV can also extract key nodes more accurately when enabling massive IoT with 6G.

The remainder of this article is organized as follows. Section II surveys related work. Section III presents the approach used to process multidimensional data, detect outliers, identify the source of outliers and extract key nodes via CCoV. The experimental results are presented in Section IV, while discussion and conclusions are drawn in Section V.

II. RELATED WORK

To improve the accuracy of decision systems that classify anomalous behavior, it is necessary to obtain reliable and informative sensor data in 6G-enabled massive IoT. In many instances, and even within normal system behavior, outliers (i.e., “data objects that are significantly different from other objects” [12]) are present in the data. Thus, the nature of what needs to be detected (i.e., anomalies due to system failure) and what corrupts the data (the outliers) appears very similar at first. Hence, it is necessary to distinguish whether the occurrence of an outlier is caused by an anomalous event or an error (e.g., system errors and sensor misreadings). In the existing literature, most outlier detection solutions can be divided into those based on statistics and ML, those based on clustering methods, and those dependent on a knowledge base.

A. Outlier Detection Based on Statistics and Machine Learning

At present, principal component analysis (PCA) is widely used as a statistics-based method in the field of anomaly detection. In [13], an abnormality detection method was proposed based on a combination of conventional PCA [14] and the squared prediction error. However, the squared-prediction-error scoring mechanism cannot handle a large number of outliers. Yu et al. [15] proposed a cluster-based data analysis framework using R-PCA, to gather relevant sensor data collected by the nodes and thereby reduce the computational and processing burdens. However, this method may result in the loss of information. In general, the ability to reduce the number of dimensions in the data is the biggest advantage of PCA. However, it should not be overlooked that this algorithm often destroys the integrity of the information in the data. An equivalent method to PCA that does not assume linearity is the use of autoencoders [16], although this method presents the same limitation of loss of information.

B. Outlier Detection Based on Clustering

Liu et al. [17] proposed a clustering method based on the local outlier factor to detect outliers. This method can be used to process data with imperfect tags, while incorporating a limited set of exception samples into learning. However, as the number of dimensions of the data increases, the inherent flaws of this algorithm are gradually exposed. Salehi et al. [18] proposed a MiLOF detection algorithm for data streams. The algorithm has lower computational complexity and higher precision when dealing with large data streams. The main limitation of this algorithm is that, because of periodic and seasonal changes present in the data, it often displays a high false-positive rate.

C. Outlier Detection Through Knowledge Base

Yu et al. [19] proposed an anomaly detection method based on knowledge reuse that mixes historical data with target data. Knowledge is transmitted from the already marked historical data (as an auxiliary method), to facilitate unsupervised outlier detection of the target data sets. Because of its sensitivity to unlabeled data, the processing efficiency of this method is related to the quality of the data. Deng et al. [20] proposed a tensor-based method that preserves the structural information of the original data. This kind of Tucker-based approach can cope with high-dimensional data and effectively solve the problem of the curse of dimensionality. The disadvantage of this approach is that the selection of the model parameters must be rigorous.

HAN et al.: ANOMALY DETECTION BASED ON MULTIDIMENSIONAL DATA PROCESSING 5221

In [21], by exploiting the distribution of observations, the concept of the cumulative degree of discrepancy was proposed to diagnose the source of anomalies based on a statistical methodology. According to the spatiotemporal correlation of an event, Zhang et al. [22] proposed a fault-tolerant detection algorithm. In this approach, distributed fusion trees are constructed, and each node sends the collected data to the corresponding nearest root node, enabling robust fault-tolerant detection of single/multiple events. Furthermore, Cao et al. [23] proposed a distributed fault-tolerant algorithm for event detection based on event-based spatiotemporal correlation. Their algorithm uses the statistical characteristics of the randomness in the events to analyze the occurrence of errors in the time series data. However, one disadvantage is that as the number of error nodes increases, the event detection rate decreases.

D. Protecting IIoT Devices From Outliers

In realistic situations in which the IIoT is used, it is impossible to perfectly label large amounts of sensor data. How to process these data, which are often unbalanced, and how to ensure the efficiency of reliable outlier detection are important issues. In [24], the corresponding training set is obtained by sampling previously marked data. In practice, especially for high-dimensional real data sets, the proportion of marked data seriously affects the accuracy of the sampling. In [25], a detection method based on a support vector data description was proposed to detect local and global outliers. The algorithm has a higher detection rate for completely unknown data, but at the expense of the rate of false positives and the extent to which data are discarded. Lee and Kim [26] proposed a detection algorithm based on HRDV. The proposed algorithm eliminates erroneous readings and improves normal readings to improve detection reliability.

E. Limitations of Traditional Methods

In the traditional methods, the extraction and analysis of valuable information hidden within the data are neglected. It is well known that the more valuable information an IIoT node has, the more important it is for the entire system to run smoothly. Therefore, administrators need to identify key nodes globally. For that, Yildiz et al. [27] proposed a network traffic-based approach to identify key nodes. This method measures the importance of a node by calculating the ratio of node traffic to total traffic for each node. However, this method only pays attention to the amount of data transmitted by each node, ignoring the value of the latent information in the data. This is because, in a steady state system, there is no equivalence between large flows and large amounts of information. Dagdeviren et al. [28] used a count-based approach for each node to identify important nodes. This method calculates the priority based on the frequency of attacks to which the nodes are subject. However, when an attacker adopts an indiscriminate attack, because the method only counts the number of attacks, the accuracy of the selection process cannot be guaranteed. A summary of previous algorithms reveals that most of the anomaly detection methods are divided into two main steps. First, they often process sample data to determine the upper and lower bounds of a trusted data set. Second, the received data are compared to a threshold in order to determine whether a particular data point is an outlier. There are a number of limitations in threshold-based approaches.
1) A large amount of reliable historical data is required as a sample, and the attributes of these data need to be marked by relevant experts. Carrying out this labeling work is clearly impractical for 6G-enabled massive IIoT.
2) The comprehensiveness of the sample data has a crucial impact on the accuracy of the test. Existing methods still have many shortcomings in dealing with the relationship between local and global outliers.
3) The thresholds may be dependent on the various operational regimes of the whole system. In some cases these threshold values are arbitrary.

Besides effectively processing multidimensional and heterogeneous data, the proposed algorithm can also extract valuable system information. More importantly, it can accurately identify the source of the outliers. When faced with imbalanced data sets, the proposed method has a high detection rate. At the same time, by considering the common influence of various factors (such as noise and system parameters), vital nodes can be selected more accurately. In addition, the proposed method has a lower computational complexity, and a lower energy consumption, than other existing methods.

III. PROPOSED METHOD

In this section, we present an improved three-layer structure for an IIoT system model. The problem of dimensionality caused by the explosive growth of data in 6G-enabled massive IIoT is also considered. This model is suitable for IIoT nodes that are deployed in large numbers. As discussed, traditional anomaly detection methods based on thresholds often have various limitations in handling large sensor data. Therefore, this section presents a new outlier detection method driven by 6G networks that uses the spatiotemporal correlation of adjacent nodes to verify the source of potential outliers. Thus, a hypothetical control center can obtain high-value data, after eliminating error data due to performance defects or failures in the nodes themselves.

A. System Model

As compared to an actual deployment environment, the system models representing IoT technology are typically too simplistic. In general, IIoT nodes transmit industrial data through neighbor nodes or cluster head nodes. The data redundancy in the nodes tends to increase when these are closer to the network center. Once the cluster head nodes have been damaged, industrial equipment in this area is very likely to enter a state of collapse. The existence of these defects usually leads to high latency and a high failure rate for the entire industrial IoT system. In this study, a system model that is more suitable for 6G-enabled massive IIoT is proposed. The entire model uses the typical three-tiered architecture: 1) a perception layer; 2) a network layer; and 3) an application layer.


Fig. 1. Diagram depicting the IIoT model for generating multidimensional data.

However, we improve the way in which the devices transmit data in the perception layer. The transfer of data is managed by software-defined networking (SDN).

The IIoT nodes belong to the sensing layer and are deployed in the corresponding industrial production areas. However, because of the large number of nodes, wide distribution range, and long periods of operation between human checks, the traditional methods for collecting and processing data result in extremely high time-associated costs. Thus, it is difficult to ensure that the central system makes adequate decisions in a timely manner. Additionally, these approaches do not meet the low-latency requirements of real-time automated production. Therefore, we divide the entire industrial production area into multiple subareas, based on the characteristics of the monitored objects and the differences in the environments in which they are deployed. As illustrated in Fig. 1, the IIoT nodes are assumed to be densely deployed fixed smart devices with limited power, storage, and computing power. Each subarea is equipped with an edge server driven by 6G networks that manages the IIoT devices in the area and provides them with a unique ID. The IIoT devices use wireless communication to directly transmit the collected heterogeneous information to a specified edge server for processing. The edge server detects abnormalities in the data, and uploads the reliable data to the cloud server through an adjacent SDN-based IIoT gateway, to provide a basis for system analysis and decision making.

B. Outlier Detection via Multidimensional Data Processing

Reflecting an actual environment, the following assumptions are made in this study. Each type of IIoT node is equipped with various sensors to monitor different processes. When the IIoT devices collect sensing data, these are synchronized and registered. The collected multidimensional data vector can be expressed as $D_i = [X_1, X_2, \cdots, X_n]^T$, $n \le W_m$, where $D_i$ is the data matrix that is acquired by the IIoT node $i$; $W_m$ is the highest dimensionality of the collected data; and $X$ is a data vector representing a time series $X(t) = (x_1, x_2, \cdots, x_t)$. Here, $x_t$ represents data acquired at time $t$.

Fig. 2. Diagram representing the multidimensional data relational model.

In this article, a correlation graph is used to characterize the correlation of the data collected by the IIoT nodes in the same subarea. The graph is expressed as the tuple

$$G = \{V, E, S_v, S_e\} \tag{1}$$

where $V$ is a set of readings from the nodes; $E$ is the set of neighbors of the nodes; and $S_v$ and $S_e$, respectively, represent the set of the states of the nodes and their neighbors' readings. When a reading is abnormal, the corresponding element in $S_e$ is set to 1; otherwise, the reading is normal and the element is set to 0. Note that the nodes deployed within the sensor's perceived radius are neighbor nodes.

As shown in Fig. 2, the correlation between the data streams of the IIoT nodes is characterized by a relational model. In the figure, the colors indicate the properties of the data (for example, red for temperature, blue for humidity, and green for light intensity). Correlated data with the same dimensions are connected by solid lines and dashed lines are used otherwise. This model can effectively distinguish multidimensional data.

It is assumed that there is noise in the environment in which the IIoT nodes are located that is independent and identically distributed through time. In the same physical system, a linear relationship exists between the system inputs and outputs measured by the nodes. The presence of noise makes it difficult to obtain real values directly during the monitoring process. The ARX model [6] focuses on the characteristics of the inputs and outputs, and does not explore the hidden variables of the system. Therefore, the data streams measured by the node and its neighbors are denoted as the input and output of the ARX model to obtain the closest match to the true data values. By training the model using the historical data, the corresponding values of the environmental and system parameters are obtained. The formula for the ARX model is

$$A(z)X_i(t) = B(z)X_j(t-k) + \varepsilon_{ij}(t) \tag{2}$$

where $X_i$ and $X_j$ represent the data flow between the IIoT nodes; $A(z)$ and $B(z)$ denote the model transformation parameters in the Z-transform process; $t$ is the time window of the sample data; and $\varepsilon_{ij}(t)$ represents the independent and identically distributed random variables that model the noise. To facilitate the calculation, $A(z)$ and $B(z)$ are transformed, and the original linear expression is rewritten as the following equation:

$$X_i(t) = \left[-\vec{X}_i, \vec{X}_j\right] \cdot \theta_{ij}(t) \tag{3}$$

where $\theta_{ij}(t)$ are the system parameters. Before normal operation, the historical data of the IIoT nodes are collected to create a training set. The training process is given by the following relation:

$$\hat{\theta}_{ij} = \arg\min_{\theta_{ij} \in \mathbb{R}^{m+n}} \frac{1}{T_0} \sum_{t=1}^{T_0} \left( X_i(t) - \hat{X}_i(t \mid \theta_{ij}) \right)^2 \tag{4}$$

where $\hat{\theta}_{ij}$ are the trained system parameters. The estimated value of the reading $\hat{X}_{ij}(t)$ can be obtained for the corresponding IIoT node. Next, the residual, that is, the difference between the estimated value and the measured value, is calculated. The equation to compute the residual is

$$e_{ij}(t) = X_i(t) - \hat{X}_{ij}(t). \tag{5}$$

Therefore, the set of residuals obtained from historical data can be expressed as follows:

$$e = \{e_{ij}(1), e_{ij}(2), \ldots, e_{ij}(T_0)\} \tag{6}$$

where $T_0$ represents the time interval of the collected data. Data set $e$ is used as a training set. Outlier detection is performed on the data collected by each of the IIoT nodes, using a one-class support vector machine (OCSVM). Hence, it can be judged whether the state of each IIoT node at each time point is abnormal. The function used to determine a state is defined as follows:

$$S_{ij} = \begin{cases} 0, & \text{if } f(e_{ij}(t)) \ge 0 \\ 1, & \text{otherwise} \end{cases} \tag{7}$$

where $f$ denotes the decision function of the trained OCSVM. The number of occurring outliers is recorded by the statistical quantity $N$. Note that the training set is guaranteed to be positive. The OCSVM maps the data samples to the high-dimensional feature space through its kernel function, producing a better aggregation. The complexity of the calculation depends on the number of support vectors, and not on the dimensionality of the sample space, mitigating the effects of the curse of dimensionality.

However, the traditional OCSVM simply takes the original data as input samples, and ignores the reliability, periodicity, and other interference problems that may exist in the data itself. Therefore, considering the characteristics of the industrial environment, we modify the method. The proposed method makes full use of the temporal and spatial correlation of the data, and uses the ARX model and residual calculation to minimize the impact of noise on the original data. These ensure the quality of the sample data, and improve the robustness toward noise and the accuracy of the classification. The process is summarized in Algorithm 1.

Algorithm 1 Outlier Detection of Sensing Data
Input: Industrial data from IIoT devices
Output: States of data $S_e$
 1: Train ARX model coefficient $\theta_{ij}(t)$ according to (4);
 2: for each $X(t)$, $t \in [1, T_0]$ do
 3:     Compute $e_{ij}(t)$ according to (5);
 4: end for
 5: Train the OCSVM model based on $e_{ij}(t)$;
 6: $t = T_0$;
 7: repeat
 8:     $t = t + 1$;
 9:     Compute values using (5);
10:     Get the states of the data points according to (7);
11:     if $S_{ij}(t) = 1$ then
12:         Record the number of outliers $N = N + 1$;
13:     end if
14: until 1;

C. Source Identification of Outliers

The previous section has described how the data collected by the IIoT nodes are processed to detect outliers. However, there are many reasons for abnormal data, such as system failures due to external attacks, exhausted power supplies, etc. These sources are often grouped into two categories: 1) events and 2) errors. For the control system, the data generated by errors have no value and therefore need to be excluded. The outliers generated by an event usually contain a lot of valuable information. An IIoT node with higher-value information is more critical and hence more worth protecting. To prioritize the protection of high-impact and vulnerable IIoT nodes, accurate identification of the source of outliers is required.

We assume that the IIoT nodes are densely deployed within the corresponding area, and that the number of IIoT nodes that are generally in an error state is significantly smaller than the number of neighbor nodes operating normally. Furthermore, the effects of external interference are assumed to be relatively limited. Therefore, the source of the outliers can be identified and analyzed by the fact that the data collected by adjacent devices have a certain level of correlation in time and space. Let $P_i(T)$ denote the probability of the data from IIoT node $i$ at time $T$ being classified as an outlier. $P_i(T)$ reflects the frequency of the occurrence of abnormal data in continuous time. The quantity $N$ is used to denote the number of abnormal data samples that appear in the data streams. If the reading satisfies the criteria given in [29] for continuous time, $N$ increases and exhibits an exponential relationship with respect to $P_i(T)$. However, if the specified conditions are not met, $N$, $P_i(T-1)$, and $P_i(T)$ are reset to 0. Thus, $P_i(T)$ can be expressed as follows:

$$P_i(T) = \frac{1 - P_i(T-1)e^{-N}}{1 + P_i(T-1)e^{-N}}. \tag{8}$$
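The ARX fit of (3)-(4), the residual of (5), and the recursive update of (8) can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the model order is fixed to a first-order ARX with delay $k = 1$ purely to keep the least-squares problem two-dimensional, the function names are hypothetical, the boolean `criteria_met` stands in for the criteria of [29] (which are not reproduced here), and the OCSVM step is omitted.

```python
import math


def fit_arx_first_order(xi, xj):
    """Least-squares estimate of (a1, b1) in the assumed first-order model
    x_i(t) = -a1 * x_i(t-1) + b1 * x_j(t-1) + noise, cf. (3)-(4)."""
    if len(xi) != len(xj) or len(xi) < 3:
        raise ValueError("need two equally long series with at least 3 samples")
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(1, len(xi)):
        p1, p2 = -xi[t - 1], xj[t - 1]  # regressor [-x_i(t-1), x_j(t-1)]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * xi[t]
        r2 += p2 * xi[t]
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("regressors are collinear; parameters not identifiable")
    # Solve the 2x2 normal equations by Cramer's rule.
    a1 = (r1 * s22 - r2 * s12) / det
    b1 = (r2 * s11 - r1 * s12) / det
    return a1, b1


def arx_residuals(xi, xj, a1, b1):
    """Residuals e(t) = x_i(t) - x_hat(t) as in (5), for t = 1 .. len(xi) - 1."""
    return [xi[t] - (-a1 * xi[t - 1] + b1 * xj[t - 1]) for t in range(1, len(xi))]


def update_outlier_probability(p_prev, n, criteria_met=True):
    """Recursive update (8); resets to 0 when the (assumed) criteria fail."""
    if not criteria_met:
        return 0.0
    decay = p_prev * math.exp(-n)
    return (1.0 - decay) / (1.0 + decay)


# Noise-free toy data generated with a1 = -0.5 and b1 = 2.0 (illustrative only).
xj = [1.0, 0.0, 2.0, 1.0, 3.0, 0.0, 1.0, 2.0]
xi = [1.0]
for t in range(1, len(xj)):
    xi.append(0.5 * xi[t - 1] + 2.0 * xj[t - 1])

a1, b1 = fit_arx_first_order(xi, xj)
residuals = arx_residuals(xi, xj, a1, b1)
print(a1, b1, max(abs(e) for e in residuals))
```

On noise-free data the fitted parameters reproduce the generating values and the residuals vanish; with real sensor data the residuals would instead form the training set $e$ of (6).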


The origin of the outliers is determined by the spatial correlation and by the statistical characteristics of the data set. The degree of consistency $R_i(T)$ reflects the overall data similarity between node $i$ and its neighboring nodes, and is given by the expression

$$R_i(T) = \frac{1}{n}\sum_{j=1}^{n} r_{i,j}(T). \tag{9}$$

The degree of proximity $r_{i,j}$ between node $i$ and its neighbor node $j$ is calculated as follows:

$$r_{i,j} = \frac{\min_{1\le j\le n}\left(x_i^{(T)}, x_j^{(T)}\right)}{\max_{1\le j\le n}\left(x_i^{(T)}, x_j^{(T)}\right)}. \tag{10}$$

Sample data $x_i^{(T)}$ needs to satisfy the condition $c_{\min} \le x_i^{(T)} \le c_{\max}$, where $c_{\min}$ and $c_{\max}$ represent the lower and upper confidence limits of the sampled data. The specific formula is given in [29]. The cumulative degree of difference is given by the expression

$$G(x_i, T) = \sum_{m=1}^{k} g(x_i, T-m). \tag{11}$$

The average and standard deviation of the sensor data are denoted by $\mu_0$ and $\sigma_0$, respectively. The occurrence of an event can be modeled as a Bernoulli random variable. Recall that the number of IIoT nodes that are damaged, due to failure or external attacks, is assumed to be much smaller than the number of normal nodes. Using the correlation of time and space between nodes and the Grubbs standard [13], if $P_i(T)$ satisfies

$$|P_i(T) - \mu_0| < K_G(\delta)\,\sigma_0 \tag{12}$$

then the occurrence of outliers is accompanied by the occurrence of an event at the node, whose state is consistent with the states of the surrounding normal nodes. Otherwise, the state of the node is considered to be inconsistent with the states of the neighboring nodes. Such a state may be caused by a failure of the node itself or by an external attack. The outliers generated in this case are marked as error data and discarded.

D. CCoV

Once deployed, nodes remain unattended for a long time in 6G-enabled massive IIoT. Moreover, the key nodes play a vital role in the stable operation of the entire system. Thus, it is relevant to identify these nodes. A quantitative formula for the criticality of the nodes is given in this section. In combination with the actual situation, the nodes with high priority can be identified, so that backup nodes can be activated and their protection increased.

The frequency of the outliers in the data collected by node $i$, denoted by $f_i$, is

$$f_i = \frac{N_i}{M_i} \tag{13}$$

where $N_i$ represents the number of detected outliers and $M_i$ is the number of data samples collected by node $i$. These IIoT nodes are usually heterogeneous smart devices. Different monitoring objects imply that the amount of data collected by the nodes can differ. The size of the traffic is also an important indicator of the criticality of a node. Let $\gamma_i$ be defined as the saturation loss, that is, the amount of data sent from node $i$ as a percentage of the total amount of data in the subarea per unit time. The formula for $\gamma_i$ can be written as

$$\gamma_i = \frac{M_i}{F_S}. \tag{14}$$

Let $s$ denote the total number of nodes in the subarea. Then, the formula for the total traffic $F_S$ in the subarea is

$$F_S = \sum_{i=1}^{s} M_i. \tag{15}$$

Depending on the conditions of deployment and the differences between the monitored objects, the system environment is usually divided into two categories: 1) stable and 2) unstable. In a stable system, small fluctuations in data can produce very useful information. In an unstable system, sometimes the opposite is true. If the measurement scales of the two data sets differ too much or the dimensions of the data are different, it is not appropriate to directly use the standard deviation for comparison. Hence, the effects of measurement scales and dimensions should be eliminated at this stage. The coefficient of variation (Cv) can do this and can objectively reflect the valuable information hidden in the process of data change. The standard deviation of the data in the $q$th dimension for node $i$ is calculated as follows:

$$\sigma_{i,q} = \sqrt{\frac{\sum_{k=1}^{a}\left(x_{k,q} - \mu_{i,q}\right)^{2}}{a}} \tag{16}$$

where parameter $a$ indicates the number of data samples after the error data have been removed. In addition, $\mu_{i,q}$ represents the average value of the data in the $q$th dimension for node $i$, and it is given by the relation

$$\mu_{i,q} = \frac{\sum_{k=1}^{a} x_{k,q}}{a}. \tag{17}$$

The equations above can be combined to obtain an expression for the coefficient of variation $Cv_{i,q}$

$$Cv_{i,q} = \frac{\sigma_{i,q}}{\left|\mu_{i,q}\right|} \tag{18}$$

where $Cv_{i,q}$ represents the Cv of the data in the $q$th dimension for node $i$. Note that data classified as errors are not included in this calculation. The entire Cv for node $i$ is obtained by

$$Cv_i = \sum_{q=1}^{W_m} Cv_{i,q} \quad (1 \le q \le W_m). \tag{19}$$

Finally, based on the frequency of outliers, the saturation loss, and the Cv, the CCoV is calculated for the nodes in the same region. The formula for calculating the CCoV $L_i$ is given by

$$L_i = f_i\,\gamma_i\,Cv_i. \tag{20}$$
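As an illustration only, the chain of definitions in (13)–(20) can be sketched in Python. The function names, the argument layout, and the example numbers below are assumptions made for this sketch and are not taken from the authors' implementation; the sketch also assumes that the error data have already been removed and that no dimension has a zero mean.

```python
import math


def coefficient_of_variation(samples):
    """Cv of one data dimension: population std (16) over |mean| (17), as in (18).

    Assumes `samples` is non-empty and its mean is non-zero.
    """
    a = len(samples)
    mu = sum(samples) / a
    sigma = math.sqrt(sum((x - mu) ** 2 for x in samples) / a)
    return sigma / abs(mu)


def ccov(n_outliers, n_samples, total_traffic, clean_data):
    """CCoV of one node, L_i = f_i * gamma_i * Cv_i, as in (20).

    clean_data: one list of samples per dimension, error data already removed
    (hypothetical layout chosen for this sketch).
    """
    f_i = n_outliers / n_samples          # outlier frequency, (13)
    gamma_i = n_samples / total_traffic   # saturation loss, (14)
    cv_i = sum(coefficient_of_variation(dim) for dim in clean_data)  # (19)
    return f_i * gamma_i * cv_i


# Hypothetical node: 5 outliers in 100 samples, subarea traffic of 400 samples,
# two dimensions (e.g., temperature and humidity).
data = [[20.0, 22.0, 24.0], [50.0, 50.0, 50.0]]
L_i = ccov(n_outliers=5, n_samples=100, total_traffic=400, clean_data=data)
```

Because $M_i$ cancels between $f_i$ and $\gamma_i$, the result equals $(N_i / F_S)\,Cv_i$, which is the form used when the expressions are combined.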


Fig. 3. False-positive rate for different values of δ.

TABLE I
SIMULATION PARAMETERS

Algorithm 2 CCoV
Input: Values from IIoT node $N_i$, $M_i$, $x_{k,q}$, $a$
Output: CCoV for devices
1: for each node i, i ∈ [1, R] do
2: Compute $f_i$ using (13);
3: Compute $\gamma_i$ using (14);
4: Compute $\sigma_{i,q}$ using (16);
5: Compute $\mu_{i,q}$ using (17);
6: Substitute parameters $\sigma_{i,q}$ and $\mu_{i,q}$ into (18);
7: Compute the CCoV using (20);
8: end for
9: i = 0;
10: repeat
11: i = i + 1;
12: Compute values using (13) to (20);
13: Get values of each IIoT node;
14: until 1;
15: Select IIoT nodes according to the values;
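The final selection step of Algorithm 2 is not specified in detail. One possible reading, sketched below in Python, is to normalize the CCoV values and keep the highest-ranked nodes. The min-max normalization, the `top_k` parameter, and the node identifiers are assumptions made for illustration, not part of the published algorithm.

```python
def select_priority_nodes(ccov_by_node, top_k):
    """Min-max normalize CCoV values and return the top_k node ids.

    ccov_by_node: dict mapping a node id to its CCoV value (hypothetical layout).
    Returns (selected ids in descending CCoV order, normalized values).
    """
    lo = min(ccov_by_node.values())
    hi = max(ccov_by_node.values())
    span = hi - lo
    normalized = {
        node: (value - lo) / span if span > 0 else 0.0
        for node, value in ccov_by_node.items()
    }
    ranked = sorted(normalized, key=normalized.get, reverse=True)
    return ranked[:top_k], normalized


# Hypothetical CCoV values for three nodes in one subarea.
scores = {"n1": 0.002, "n2": 0.010, "n3": 0.006}
selected, normalized = select_priority_nodes(scores, top_k=2)
```

A fixed threshold on the normalized value would be an equally plausible selection rule; the ranking above is used only because it needs no extra tuning parameter beyond `top_k`.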

By substituting (13)–(19) into (20), the final expression for the CCoV is obtained:

$$L_i = \frac{N_i}{F_S}\sum_{q=1}^{W_m}\frac{\sigma_{i,q}}{\left|\mu_{i,q}\right|} = \frac{N_i}{\sum_{i=1}^{s} M_i}\sum_{q=1}^{W_m}\left(\frac{1}{\left|\sum_{k=1}^{a} x_{k,q}\right|}\sqrt{a\sum_{k=1}^{a}\left(x_{k,q}-\mu_{i,q}\right)^{2}}\right). \tag{21}$$

We observe in (21) that when $N_i$ is larger or $M_i$ is smaller, the CCoV tends to increase. Thus, $L_i$ and $N_i$ are positively correlated, and $L_i$ and $M_i$ are negatively correlated. In fact, it is intuitive that, when the proportion of outliers is higher, the frequency of unknown conditions in the area is higher and their duration is longer. Such information is precisely what the control center would like to flag in time. In addition, from a practical point of view, when parameter $a$ is larger, it indicates that there are more data errors. When too many data errors occur, the corresponding data set is demoted. The overall process for calculating the cumulative coefficient is shown in Algorithm 2.

In summary, a system supervisor can understand the importance of the IIoT nodes in each subarea by calculating the cumulative key coefficients. Hence, the security administrator can extract the higher priority smart devices, and then reserve backup nodes and provide better protection where needed. In this manner, important sensory data is transmitted to the decision system in a timely and complete manner in 6G-enabled massive IIoT.

IV. EXPERIMENTS AND RESULTS

A. Data Sets and Parameter Settings

In this section, first, the effect of the parameter settings on the experimental results is analyzed. To evaluate the accuracy of anomaly detection, the detection rate [28] and the false-positive rate [29] are used as performance indicators. The detection rate is the ratio of the number of detected outliers to the number of actual outliers. The false-positive rate is the ratio of the number of nonoutlier data diagnosed as outliers to the total number of nonoutlier data. This metric reflects the reliability of the classifier. A higher detection rate and a lower false-positive rate indicate better performance of the algorithm.

To verify the effectiveness of the proposed algorithm, this study used real data sets [27] and simulation data sets to conduct experiments. The mathematical models are implemented in MATLAB and PyCharm. It is assumed that the deployed IIoT nodes were equipped with temperature or humidity sensors, and collected the same number of data samples per unit time. The simulation area was a 50 m² surface, the effective sensing radius was limited to 15 m, and the sampling frequency was 100 Hz. In addition, we added simulated environment noise following a Gaussian distribution to the data sets [30], [31]; the impact of the noise is much lower than the impact of an event. The faulty nodes, event nodes, and normal nodes were densely deployed within the same area. This made it possible to analyze the impact of the parameters on the source of the outliers. Finally, the CCoV for each IIoT node was obtained. Some important experimental parameters are listed in Table I.

B. Choosing the Optimal Parameter Values in MDP

The proposed method employs spatiotemporal correlations to classify the source of the outliers as either event or error. This correlation is related to the observed value of the neighboring nodes. The observed conditions are strictly controlled by the parameter δ in the Grubbs standard. To obtain the best results from the algorithm, a small false alarm rate and a high detection rate are required. As shown in Table I, the parameter δ in the


experiments was varied within the values 0.2, 0.4, 0.6, and 0.8; the number of selected data samples K ranged from 100 to 1000; and the percentage of injected outliers was 10%. The false-positive rate is the proportion of negative samples predicted to be positive. The detection rate is the proportion of samples correctly predicted by the model. It can be seen from the experimental results, shown in Figs. 3 and 4, that as the detection rate increases from 0.83 to 0.96, the false-positive rate decreases from 0.24 to 0.06 when δ is 0.2.

Fig. 4. Detection rate for different values of δ.

The receiver operating characteristic (ROC) curve represents the tradeoff between the detection rate and the false alarm rate and is usually displayed on a 2-D plot, as in Fig. 5(a). The area under the curve (AUC) is a measure of the performance of the outlier detection algorithm. The AUC is illustrated in Fig. 5(a); for a nearly ideal ROC curve, it is close to 1.

Fig. 5. ROC curve. (a) Trade-off between the detection rate and the false alarm rate. (b) Determination of parameters.

The purpose of finding the optimal parameter δ is to balance the relationship between the detection rate and the false-positive rate. This will ensure that the experiment is carried out under optimal conditions. Thus, in order to facilitate the visual observation of the change, the ROC curve was used in the experiment. As shown in Fig. 6 and Table II, when parameter δ is equal to 0.4, the area takes the maximum value of 0.8860, and the performance of the algorithm is optimal.

Fig. 6. ROC curve for different values of δ.

TABLE II
SIMULATION PARAMETERS

C. Comparison of Algorithm Performance

In the experiments, three algorithms, HRDV, MiLOF, and R-PCA, were used for comparing performance with the proposed approach. For each, the accuracy and false-positive rate of outlier detection were compared. As shown in Figs. 7 and 8, the simulation experiments were performed using the real data set of a rock fall prediction system [23]. The attributes of the data in this data set consist of temperature and humidity. In Fig. 7(a), the results on the left are the true states of the data, while the predicted values of anomaly detection are displayed in the figures on the right. The red data points are outliers and the green data points are normal values.

To further compare the performance of the algorithms, the first 100 data samples were used to train the models, and the remaining 500 data points were used as test samples. The levels of outliers range from 5% to 25%. The results in Fig. 9 show that the proposed algorithm can maintain the stability and accuracy of detection for both low and high levels of outliers. However, when the proportion of outliers exceeds 20%, the performance of the other three algorithms (HRDV, MiLOF, and R-PCA) shows different degrees of degradation. In Fig. 9, the detection rate of MDP remains above 95%.

In order to further test the efficacy of our proposed method, we selected nine real data sets and three synthetic data sets for experimental evaluation. The characteristics of all test sets and the AUC results for the different methods are summarized in Table III [32]–[34]. The table shows the most important test set characteristics: number of objects (#n), number of attributes (#m), and number of outliers (#o).

In the lab data set, the corresponding devices were deployed in a real environment to monitor some objects. Each device in the setup had multiple sensors. Each sensor collected time-stamped information as well as humidity, temperature, light, and voltage values. There were a total of 54 subjects within the scope of monitoring, and the time span was 35 days. A total of 2.3 million data samples were collected using the TinyDB in-network query processing system, which was built on the TinyOS platform.

Considering the industrial environment in which IIoT nodes are deployed, measurement data sets are often contaminated by


TABLE III
AUC RESULTS FOR THE DIFFERENT METHODS

Fig. 7. Anomaly detection results on real data set (percentage of outliers: 10%). (a) MiLOF. (b) HRDV. (c) R-PCA.

Fig. 8. Anomaly detection results on real data set via MDP (percentage of outliers: 10%).

Fig. 9. Comparison of the detection rate for different percentages of outliers.

Fig. 10. Average detection rates for data containing noise.

noise. A data set contaminated by noise was hence used in the experiments. The noise source was Gaussian white noise that was independently and identically distributed. The sample was continuously subjected to ten abnormality detections within the same sampling period to obtain an average detection rate. Fig. 10 displays the robustness of the four anomaly detection algorithms to noise. The results show that the proposed algorithm is less sensitive to noise than the other algorithms. Moreover, its detection rate can be stabilized above 90%. This indicates that the proposed algorithm is more robust than the compared algorithms.

In the previous experiments, 100 IIoT nodes were selected and the CCoV results were given. To compare the results more intuitively, we normalized the values. As shown in Fig. 11, the corresponding CCoV can be obtained, quantifying the value of the information collected by each node. We note that outliers which are generally caused by errors contain zero information. These data points need to be removed from the data sets to avoid unnecessary bias in the results. Therefore, such outliers are not included in the CCoV calculation.

Without loss of generality, the solution proposed in this article directly transfers the heterogeneous information collected by the IIoT devices to a specified edge server for processing. In addition, the algorithm reduces the uncertainty caused by noisy readings, data loss, outliers, and redundancy. According to a survey, in an actual industrial production environment, the error tolerance of IIoT equipment should be kept at 5%. The average detection rate of the proposed algorithm can be stably maintained at approximately 95%. Such a fault-tolerance rate can effectively keep the risk within a controllable range.
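For reference, the two performance indicators used throughout this section can be sketched in Python from their definitions: the detection rate is the fraction of actual outliers that were detected, and the false-positive rate is the fraction of nonoutlier samples flagged as outliers. The function name and the label vectors below are hypothetical and serve only as a minimal example; the sketch assumes that both classes are present in the ground truth.

```python
def detection_metrics(actual, predicted):
    """Return (detection rate, false-positive rate) for boolean outlier labels.

    actual[i] is True when sample i is a real outlier; predicted[i] is True
    when the detector flagged sample i. Assumes at least one outlier and one
    nonoutlier in `actual`.
    """
    true_pos = sum(1 for a, p in zip(actual, predicted) if a and p)
    false_pos = sum(1 for a, p in zip(actual, predicted) if not a and p)
    n_outliers = sum(1 for a in actual if a)
    n_normal = len(actual) - n_outliers
    return true_pos / n_outliers, false_pos / n_normal


# Hypothetical labels for ten samples: three real outliers, three flagged samples.
actual = [True, True, False, False, False, True, False, False, False, False]
predicted = [True, False, False, True, False, True, False, False, False, False]
detection_rate, false_positive_rate = detection_metrics(actual, predicted)
```

In this example two of the three real outliers are detected and one of the seven normal samples is flagged, so the detection rate is 2/3 and the false-positive rate is 1/7.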


In addition, industrial environments are often characterized by complex factors, such as signal fading and shadowing, the presence of highly reflective surfaces, and interference, all of which can affect the quality of the data. Therefore, in future work, more in-depth research could be conducted on unreliable communication links and the loss of some sensor data.

Fig. 11. CCoV. (a) Number of nodes is 100. (b) Number of nodes is 500.

V. CONCLUSION

In this study, a new monitoring model has been developed for classifying anomalies in 6G-enabled massive industrial IoT. The proposed method exploits the spatiotemporal correlations among a multiplicity of nodes. By training an ARX model, the corresponding system parameters and deviations are obtained, thereby significantly reducing the interference of industrial noise in outlier detection. Moreover, an unsupervised learning method is adopted to enable a large amount of unlabeled multidimensional data to be uniformly processed through edge computing devices driven by 6G. To tackle the difficult task of discriminating between outliers and system errors, the Grubbs standard method was adopted. Furthermore, the new measure of CCoV was proposed; the relevant measurement data are extracted to quantify the value of the information produced by the IIoT nodes. The CCoV provides the basis for identifying and protecting high-priority smart devices. Experiments on real data sets and synthetic data demonstrate that the proposed algorithm has a higher detection rate and a lower false-positive rate than the traditional methods (HRDV, MiLOF, and R-PCA). In complex industrial environments, it exhibits better resistance to noise and increased robustness, and is more suitable for practical applications in 6G-enabled massive IIoT.

REFERENCES

[1] M. Giordani, M. Polese, M. Mezzavilla, S. Rangan, and M. Zorzi, “Toward 6G networks: Use cases and technologies,” IEEE Commun. Mag., vol. 58, no. 3, pp. 55–61, Mar. 2020.
[2] F. Tariq, M. R. A. Khandaker, K.-K. Wong, M. A. Imran, M. Bennis, and M. Debbah, “A speculative study on 6G,” IEEE Wireless Commun., vol. 27, no. 4, pp. 118–125, Aug. 2020.
[3] H. Viswanathan and P. E. Mogensen, “Communications in the 6G era,” IEEE Access, vol. 8, pp. 57063–57074, 2020.
[4] L. Liu, G. Han, Y. He, and J. Jiang, “Fault-tolerant event region detection on trajectory pattern extraction for industrial wireless sensor networks,” IEEE Trans. Ind. Informat., vol. 16, no. 3, pp. 2072–2080, Mar. 2020.
[5] G. Han, H. Guan, J. Wu, S. Chan, L. Shu, and W. Zhang, “An uneven cluster-based mobile charging algorithm for wireless rechargeable sensor networks,” IEEE Syst. J., vol. 13, no. 4, pp. 3747–3758, Dec. 2019.
[6] H. Zhang and Z. Li, “Anomaly detection approach for urban sensing based on credibility and time-series analysis optimization model,” IEEE Access, vol. 7, pp. 49102–49110, Apr. 2019.
[7] H. Fei and G. Li, “Abnormal data detection algorithm for WSN based on k-means clustering,” Comput. Eng., vol. 41, no. 7, pp. 124–128, Jul. 2015.
[8] L. Zhou, K.-H. Yeh, G. Hancke, Z. Liu, and C. Su, “Security and privacy for the industrial Internet of Things: An overview of approaches to safeguarding endpoints,” IEEE Signal Process. Mag., vol. 35, no. 5, pp. 76–87, Sep. 2018.
[9] E. Sisinni, A. Saifullah, S. Han, U. Jennehag, and M. Gidlund, “Industrial Internet of Things: Challenges, opportunities, and directions,” IEEE Trans. Ind. Informat., vol. 14, no. 11, pp. 4724–4734, Nov. 2018.
[10] B. Wu, X. Yan, Y. Wang, and C. G. Soares, “An evidential reasoning-based CREAM to human reliability analysis in maritime accident process,” Risk Anal., vol. 37, no. 10, pp. 1936–1957, Oct. 2017.
[11] H. C. Mandhare and S. R. Idate, “A comparative study of cluster based outlier detection, distance based outlier detection and density based outlier detection techniques,” in Proc. Int. Conf. Intell. Comput. Control Syst. (ICICCS), Madurai, India, 2017, pp. 931–935.
[12] L. Martí, N. Sanchez-Pi, J. M. Molina, and A. C. B. Garcia, “Anomaly detection based on sensor data in petroleum industry applications,” Sensors, vol. 15, no. 2, pp. 2774–2797, Jan. 2015.
[13] K. K. L. B. Adikaram, M. A. Hussein, M. Effenberger, and T. Becker, “Data transformation technique to improve the outlier detection power of Grubbs’ test for data expected to follow linear relation,” J. Appl. Math., vol. 2015, pp. 1–9, Jan. 2015.
[14] X. Xiao and Y. Zhou, “Two-dimensional quaternion PCA and sparse PCA,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 7, pp. 2028–2042, Jul. 2019.
[15] T. Yu, X. Wang, and A. Shami, “Recursive principal component analysis-based data outlier detection and sensor data aggregation in IoT systems,” IEEE Internet Things J., vol. 4, no. 6, pp. 2207–2216, Dec. 2017.
[16] M. Martínez-García, Y. Zhang, J. Wan, and J. McGinty, “Visually interpretable profile extraction with an autoencoder for health monitoring of industrial systems,” in Proc. IEEE 4th Int. Conf. Adv. Robot. Mechatronics (ICARM), Toyonaka, Japan, 2019, pp. 649–654.
[17] B. Liu, Y. Xiao, P. S. Yu, Z. Hao, and L. Cao, “An efficient approach for outlier detection with imperfect data labels,” IEEE Trans. Knowl. Data Eng., vol. 26, no. 7, pp. 1602–1616, Jul. 2014.
[18] M. Salehi, C. Leckie, J. C. Bezdek, T. Vaithianathan, and X. Zhang, “Fast memory efficient local outlier detection in data streams,” IEEE Trans. Knowl. Data Eng., vol. 28, no. 12, pp. 3246–3260, Dec. 2016.
[19] W. Yu, Z. Ding, C. Hu, and H. Liu, “Knowledge reused outlier detection,” IEEE Access, vol. 7, pp. 43763–43772, 2019.
[20] X. Deng, P. Jiang, X. Peng, and C. Mi, “An intelligent outlier detection method with one class support tucker machine and genetic algorithm toward big sensor data in Internet of Things,” IEEE Trans. Ind. Electron., vol. 66, no. 6, pp. 4672–4683, Jun. 2019.
[21] X. Xie, B. Wang, T. Wan, and W. Tang, “Multivariate abnormal detection for industrial control systems using 1D CNN and GRU,” IEEE Access, vol. 8, pp. 88348–88359, 2020.


[22] S.-K. Zhang, Y.-H. Wang, Z.-M. Cui, and J.-X. Fan, “Event region fault tolerant detection algorithm based on aggregation tree,” J. Commun., vol. 30, no. 10, pp. 1770–1776, Oct. 2007.
[23] D.-L. Cao, J.-N. Cao, and B.-H. Jin, “A fault-tolerant algorithm for event region detection in wireless sensor networks,” Chin. J. Comput., vol. 30, no. 10, pp. 1770–1776, Oct. 2007.
[24] Y. J. Lee, Y.-R. Yeh, and Y.-C. F. Wang, “Anomaly detection via online oversampling principal component analysis,” IEEE Trans. Knowl. Data Eng., vol. 25, no. 7, pp. 1460–1470, Jul. 2013.
[25] B. Liu, Y. Xiao, L. Cao, Z. Hao, and F. Deng, “SVDD-based outlier detection on uncertain data,” Knowl. Inform. Syst., vol. 34, no. 3, pp. 597–618, 2013.
[26] D.-W. Lee and J.-H. Kim, “High reliable in-network data verification in wireless sensor networks,” EURASIP J. Wireless Commun. Netw., vol. 2005, no. 4, pp. 462–472, 2005.
[27] H. U. Yildiz, B. Tavli, B. O. Kahjogh, and E. Dogdu, “The impact of incapacitation of multiple critical sensor nodes on wireless sensor network lifetime,” IEEE Wireless Commun. Lett., vol. 6, no. 3, pp. 306–309, Jun. 2017.
[28] O. Dagdeviren, V. K. Akram, and B. Tavli, “Design and evaluation of algorithms for energy efficient and complete determination of critical nodes for wireless sensor network reliability,” IEEE Trans. Rel., vol. 68, no. 1, pp. 280–290, Mar. 2019.
[29] R. Niu and P. K. Varshney, “Distributed detection and fusion in a large wireless sensor network of random size,” EURASIP J. Wireless Commun. Netw., vol. 2005, no. 4, pp. 462–472, 2005.
[30] S. Wu and S. Wang, “Information-theoretic outlier detection for large-scale categorical data,” IEEE Trans. Knowl. Data Eng., vol. 25, no. 3, pp. 589–602, Mar. 2013.
[31] C. Alippi, S. Ntalampiras, and M. Roveri, “Path planning for autonomous underwater vehicles: An ant colony algorithm incorporating alarm pheromone,” IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 8, pp. 1213–1226, Aug. 2013.
[32] S. Xie and Z. Chen, “Anomaly detection and redundancy elimination of big sensor data in Internet of Things,” 2017. [Online]. Available: https://arxiv.org/abs/1703.03225.
[33] U. Lee, E. Magistretti, M. Gerla, P. Bellavista, and A. Corradi, “Dissemination and harvesting of urban data using vehicular sensing platforms,” IEEE Trans. Veh. Technol., vol. 58, no. 2, pp. 882–901, Feb. 2009.
[34] S. Mukhopadhyay, D. Panigrahi, and S. Dey, “Model based error correction for wireless sensor networks,” IEEE Trans. Mobile Comput., vol. 8, no. 4, pp. 528–543, Sep. 2008.

Guangjie Han (Senior Member, IEEE) received the Ph.D. degree from Northeastern University, Shenyang, China, in 2004.
He is currently a Professor with the Department of Information and Communication System, Hohai University, Changzhou, China. In February 2008, he finished his work as a Postdoctoral Researcher with the Department of Computer Science, Chonnam National University, Gwangju, South Korea. From October 2010 to October 2011, he was a Visiting Research Scholar with Osaka University, Suita, Japan. From January 2017 to February 2017, he was a Visiting Professor with the City University of Hong Kong, Hong Kong. He has over 400 peer-reviewed journal and conference papers, in addition to 160 granted and pending patents. His H-index is 44 and i10-index is 151 in Google Citation (Google Scholar). His papers have been cited more than 7500 times. His current research interests include Internet of Things, Industrial Internet, machine learning, artificial intelligence, mobile computing, security, and privacy.
Dr. Han has been awarded the 2020 IEEE SYSTEMS JOURNAL Annual Best Paper Award and the 2017–2019 IEEE ACCESS Outstanding Associate Editor Award. He has also served as a Chair of organizing and technical committees in many international conferences. He has served on the Editorial Boards of up to ten international journals, including the IEEE NETWORK, IEEE SYSTEMS JOURNAL, IEEE/CAA JOURNAL OF AUTOMATICA SINICA, IEEE ACCESS, and Telecommunication Systems. He has guest edited a number of special issues in IEEE journals and magazines, including the IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, IEEE Communications, IEEE WIRELESS COMMUNICATIONS, IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, and Computer Networks. He is a Fellow of the U.K. Institution of Engineering and Technology (FIET).

Juntao Tu received the B.S. degree from the Hohai University, Changzhou, China, in 2018, where he is currently pursuing the M.S. degree with the College of Internet of Things Engineering.
His research interests include localization for the Internet of Things, wireless sensor networks, and mobile computing and security.

Li Liu received the Ph.D. degree from Hohai University, Nanjing, China, in 2019.
He is currently a Lecturer with the College of Internet of Things Engineering, Hohai University, Changzhou, China. His research interests include wireless sensor networks, Industrial Internet of Things, and machine learning.

Miguel Martínez-García (Member, IEEE) received the B.Sc. degree in mathematics and the M.Sc. degree in advanced mathematics and mathematical engineering from the Polytechnic University of Catalonia (UPC), Barcelona, Spain, in 2013 and 2014, respectively, and the Ph.D. degree in engineering from the University of Lincoln, Lincoln, U.K., in 2018.
He is a Lecturer of Human–Machine Systems with Loughborough University, Loughborough, U.K. He has also worked as a Researcher with the University of Lincoln and the Advanced Virtual Reality Research Center, Loughborough University, since 2017. His research interests include human–machine integration, machine learning, artificial intelligence, intelligent signal processing, and complex systems, with particular focus on the analysis of nonlinear signals representing phenomena of interest between humans and machines.

Yan Peng received the Ph.D. degree in pattern recognition and intelligent systems from Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China, in 2009.
She is currently a Professor with Shanghai University, Shanghai, China, where she acts as the Dean of the School of Artificial Intelligence. Her research interests include modeling and control of unmanned surface vehicles, field robotics, and locomotion systems.
