A Survey On Machine Learning For Data Fusion

This paper provides a comprehensive survey on the application of machine learning techniques in data fusion, highlighting its advantages over traditional probabilistic methods. It reviews existing literature, categorizes various fusion methods, and evaluates their performance based on proposed criteria. Additionally, the paper identifies open issues and suggests future research directions in the field of data fusion using machine learning.

Information Fusion 57 (2020) 115–129

Contents lists available at ScienceDirect

Information Fusion
journal homepage: www.elsevier.com/locate/inffus

Full Length Article

A survey on machine learning for data fusion


Tong Meng a, Xuyang Jing a, Zheng Yan a,b,∗, Witold Pedrycz c
a State Key Laboratory on Integrated Services Networks, School of Cyber Engineering, Xidian University, China
b Department of Communications and Networking, Aalto University, Finland
c Department of Electrical & Computer Engineering, University of Alberta, Canada

Article info

Keywords: Data fusion, Machine learning, Fusion methods, Fusion criteria

Abstract

Data fusion is a prevalent way to deal with imperfect raw data for capturing reliable, valuable and accurate information. Compared with a range of classical probabilistic data fusion techniques, machine learning, which automatically learns from past experience without explicit programming, remarkably renovates fusion techniques by offering strong computing and predicting abilities. Nevertheless, the literature still lacks a thorough review of the recent advances of machine learning for data fusion. Therefore, it is beneficial to review and summarize the state of the art in order to gain a deep insight into how machine learning can benefit and optimize data fusion. In this paper, we provide a comprehensive survey on data fusion methods based on machine learning. We first offer a detailed introduction to the background of data fusion and machine learning in terms of definitions, applications, architectures, processes, and typical techniques. Then, we propose a number of requirements and employ them as criteria to review and evaluate the performance of existing fusion methods based on machine learning. Through the literature review, analysis and comparison, we finally come up with a number of open issues and propose future research directions in this field.

1. Introduction

In the era of information explosion, huge volumes of data are created, collected and processed. We can extract valuable information from data to look for the rules of the world and to discover the nature of things. Instead of relying on experience or intuition, we are more likely, and feel more confident, to draw a conclusion or make a decision on the basis of real-world data. However, big data also comes with difficulties and challenges in data-driven service provision because of its “5V” characteristics: Volume, Variety, Velocity, Veracity and Value. Traditional data processing techniques in the literature can hardly meet the demands of the new era of big data. How to capture reliable, valuable and accurate information from massive data is one of the most significant research topics nowadays.

The cyber world brings us an overwhelming amount of data to handle. However, raw data captured from various environments are heterogeneous, complex, imperfect, and of a huge scale, which makes it challenging to transform them into useful information. Many data processing technologies, including but not limited to data preprocessing, data storage, data transfer, data fusion, data analysis and information retrieval, aim to solve these problems and stem from diverse processing theories. In this paper, we focus on data fusion. It is a technology that merges data to obtain more consistent, informative and accurate information than the original raw data, which are mostly uncertain, imprecise, inconsistent, conflicting and the like. A variety of data fusion methods have been designed in different application fields. Generally, data fusion is widely used in wireless sensor networks, image processing, radar systems, object tracking, target detection and identification, intrusion detection, situation assessment, etc. [1].

Traditional data fusion techniques include probabilistic fusion (e.g., Bayesian fusion), evidential belief reasoning fusion (e.g., Dempster-Shafer theory), and rough set-based fusion, etc. [2]. In recent years, the development of sensors, processing hardware and many other data processing technologies has brought new development opportunities to data fusion. As a technique with strong abilities to compute and classify data, machine learning is highly expected to improve the overall performance of data fusion algorithms.

Machine learning is a technique that lets the computer “learn” from provided data without thorough and explicit programming for every problem. It aims at modeling profound relationships in data inputs and reconstructs a knowledge scheme. The result of learning can be used for estimation, prediction, and classification. The name “machine learning” was first proposed in 1959 [3]. Decades later, advances in the computational ability of digital computers have notably improved the performance of machine learning. Machine learning enables classification and


∗ Corresponding author at: Xidian University, 119 POX, No. 2 South Taibai Road, Xi’an 710071, China.
E-mail address: zheng.yan@aalto.fi (Z. Yan).

https://doi.org/10.1016/j.inffus.2019.12.001
Received 24 May 2019; Received in revised form 30 September 2019; Accepted 9 December 2019
Available online 10 December 2019
1566-2535/© 2019 Elsevier B.V. All rights reserved.
T. Meng, X. Jing and Z. Yan et al. Information Fusion 57 (2020) 115–129

prediction based on known data and can achieve high accuracy and reliability, which makes it more likely to inform a correct decision. In recent years, machine learning has been applied to data fusion to improve its performance and offer satisfactory fusion results.

Several surveys on data fusion have been published in recent years with different emphases. Alam et al. [4] completed a literature review on data fusion in IoT, which covers mathematical fusion methods such as probabilistic methods, artificial intelligence, and theory of belief in the domain of IoT. Focusing on IoT narrows down the review, while data fusion with machine learning covers a wide area. Gite et al. [5] and Snidaro et al. [58] focused on data fusion models used in context-aware systems. Pires et al. [6] summarized the state of the art of data fusion techniques for sensors embedded in mobile devices. Navarro-Arribas and Torra [7] reviewed information fusion approaches for achieving data privacy. Faouzi et al. [8] concentrated on the application of data fusion models in intelligent transportation systems. Corona et al. [9] studied information fusion methods for computer security. Yao et al. [10] gave an overview of web information fusion and integration. Ding et al. [76] reviewed data fusion methods in the Internet of Things, mainly focusing on secure and privacy-preserving fusion. The above surveys have different focuses from the survey presented in this paper.

On the other hand, some works provide an overview of machine learning in specific application scenarios, especially in environments related to big data processing. For example, Liao et al. [11] surveyed machine learning applications and achievements in the past decade (2000–2011). Rudin and Wagstaff [12] reviewed the advances of machine learning in real-world problems of science and society. Qiu et al. [13] studied machine learning for big data processing. They pointed out five significant issues in learning from big data through a literature review. Zhang et al. [14] reviewed representative works on deep learning in big data.

In summary, many surveys on data fusion and machine learning exist, written from various viewpoints. However, given the fast growth of artificial intelligence-based fusion models and their excellent properties, a survey specific to data fusion based on machine learning is still lacking. Although Alam et al. [4] provided a review of data fusion techniques with artificial intelligence, they only paid attention to the literature on data fusion in the Internet of Things. Their review is limited with regard to the scope of models. A horizontal comparison with detailed analysis is still missing. Considering the recent advances of machine learning, it becomes essential to comprehend the elementary knowledge, current application state and future trends of this field with the help of a thorough survey.

In this paper, we perform a thorough survey on data fusion techniques with machine learning. We first comprehensively introduce basic definitions and background knowledge about machine learning and data fusion. Then, we indicate critical challenges of data fusion and propose a number of criteria for data fusion. We give an in-depth overview of data fusion techniques based on machine learning by commenting on the performance of each reviewed work against these criteria. Through analysis, discussion and comparison, we find some open problems, which further allow us to indicate several research directions to motivate future research in this promising field. In particular, the main contributions of this paper are described below:

• We sum up a group of main challenges that data fusion might face. Then, we propose a thorough list of requirements as uniform criteria that can serve as a measure to evaluate the performance of data fusion methods based on machine learning.
• We review the literature on data fusion based on machine learning in various application scenarios, and discuss the advantages and weaknesses of each work in detail according to the proposed criteria. In each review, we especially comment on how a machine learning method can improve fusion performance.
• Based on the completed review and in-depth analysis, some significant open issues and valuable future research directions are presented, which are useful as a reference for researchers and practitioners in this field.

The remainder of the paper is organized as follows. We provide an overview of background knowledge of data fusion and machine learning in Section 2. To review the literature comprehensively with a uniform measure, we propose a number of criteria on data fusion in Section 3. Section 4 reviews the recent literature on data fusion with machine learning, which is categorized into three classes: signal level data fusion, feature level data fusion and decision level data fusion. All the works are reviewed with respect to their model structures, application background and technical advantages. In addition, we discuss their performance with the help of the proposed criteria. We also summarize the overall comparison of all the reviewed models/methods in this section. In Section 5, we point out open issues and propose future research directions in this research field based on the result of the literature review. Finally, conclusions are provided in the last section.

2. Overview of data fusion and machine learning

This section provides background information and concepts related to data fusion. It also specifies the challenges of data fusion and gives a brief introduction to machine learning and its common models.

2.1. Data fusion

White [15] defined data fusion in the book “Data Fusion Lexicon” as “a process dealing with the association, correlation, and combination of data and information from single and multiple sources to achieve refined position and identity estimates, and complete and timely assessments of situations and threats, and their significance. The process is characterized by continuous refinements of its estimates and assessments, and the evaluation of the need for additional sources, or modification of the process itself, to achieve improved results.” Hall et al. [1] thought that “information fusion is the study of efficient methods for automatically or semi-automatically transforming information from different sources and different points in time into a representation that provides effective support for human or automated decision making.”

For easy understanding, we introduce the most important elements of data fusion:

• Data sources: Single or multiple data sources from different positions and at different points of time are involved in data fusion.
• Operation: One needs an operation that combines data and refines information, which can be described as “transforming”.
• Purpose: The goal of fusion is to gain improved information with a lower probability of error in detection or prediction and with superior reliability. Example purposes in actual applications are decision making, entity identification, situation estimation, and so on.

The benefit brought by the fusion of multi-source data is quite obvious. Even in a static single-source system, fusing replicated samples can yield a more accurate observation. On the other hand, especially in wireless sensor networks, distributed data fusion reduces the redundancy of data, which reduces time and resource consumption and the frequency of data collisions during data transmission. Moreover, in all data fusion applications, data is transformed into a modality with more value and higher quality, which enables a data fusion system to reconstruct the full view of an observed phenomenon. For instance, the coverage of the data is greatly extended in both the temporal and spatial dimensions. In other models, appropriate handling of redundant data can help acquire improved, accurate and reliable information with little imperfection.

Researchers began working in this field in the 1960s, initially as a part of data processing. Later, in the 1970s [1], the US Department of Defense (DoD)


Fig. 1. Joint Directors of Laboratories (JDL) architecture [15].
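
As an illustration of the layered structure in Fig. 1, the following minimal Python sketch chains toy versions of Levels 0 to 3 described in Section 2.2.1. It is not taken from the JDL specification: all function names, the `Track` type, the 1-D position model and the distance threshold `near` are assumptions made for illustration, and Level 4 together with the supporting components (sources, HCI, data management) is omitted.

```python
from dataclasses import dataclass
from enum import IntEnum
from statistics import mean


class JDLLevel(IntEnum):
    """The five JDL processing levels named in the text."""
    SOURCE_PREPROCESSING = 0
    OBJECT_REFINEMENT = 1
    SITUATION_REFINEMENT = 2
    THREAT_REFINEMENT = 3
    PROCESS_REFINEMENT = 4  # management level, not modeled in this sketch


@dataclass
class Track:
    """Hypothetical per-object state: an id and a fused 1-D position."""
    object_id: str
    position: float


def level0_preprocess(readings):
    """Level 0: clean raw samples (toy rule: drop missing values)."""
    return {oid: [v for v in vals if v is not None]
            for oid, vals in readings.items()}


def level1_object_refinement(clean):
    """Level 1: combine each object's samples into one estimate (plain mean)."""
    return [Track(oid, mean(vals)) for oid, vals in clean.items() if vals]


def level2_situation_refinement(tracks, near=1.0):
    """Level 2: relate objects to each other (toy rule: pairs closer than `near`)."""
    return [(a.object_id, b.object_id)
            for i, a in enumerate(tracks)
            for b in tracks[i + 1:]
            if abs(a.position - b.position) < near]


def level3_threat_refinement(close_pairs):
    """Level 3: map the situation to an impact label (toy rule)."""
    return "alert" if close_pairs else "normal"


def run_pipeline(readings):
    """Run Levels 0-3 in order and return the tracks, relations and assessment."""
    clean = level0_preprocess(readings)
    tracks = level1_object_refinement(clean)
    pairs = level2_situation_refinement(tracks)
    return tracks, pairs, level3_threat_refinement(pairs)
```

For example, `run_pipeline({"a": [1.0, None, 3.0], "b": [2.5, 2.5], "c": [10.0]})` fuses object `a` to position 2.0, relates it to `b` because the two lie within the assumed threshold, and reports `"alert"`.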

applied this technology to military use for defense and monitoring. So far, in the military domain, there have been many applications including target identification and tracking, land, ocean and airspace surveillance, radar tracking, remote sensing, and so on. Beyond that, data fusion models are nowadays widely used in nonmilitary applications, e.g., fault detection in various machines, intrusion detection, malware detection, review ranking, vehicle monitoring and prediction in traffic systems, environmental monitoring, pattern recognition, face identification, and so on [16,55,59,62,64,75,77–82].

Along with the multiple application scenarios in which data fusion is used, the term data fusion also has many extended forms, each with its own practical meaning. For example, multi-source/multi-sensor data fusion relates to data from multiple sources as opposed to data from a single source. Image fusion focuses on the fusion of images. Information fusion concentrates on data that has already been processed, which is different from raw data fusion. Decision fusion specifically describes information at a high semantic level for making a decision. These terms might be used interchangeably with “data fusion” in some particular situations.

2.2. Architecture and classification of data fusion

Raw data collected by collectors is usually not directly applicable for prediction or other applications for many reasons, such as data incompleteness, data conflict and data inconsistency. Therefore, methods are needed to deal with data imperfection. Furthermore, raw data cannot be turned into high-value information in a single step. We need a hierarchical transformation to manipulate data systematically. Since a data fusion system is complex and consists of a number of parts that process data, we need unified expressions and terminology to describe each part’s functionalities and characteristics. A good and concise architecture can also help researchers and developers communicate easily, which will promote the development of the research field. Herein, we introduce some widely used data fusion architectures, which include Joint Directors of Laboratories (JDL) [15], the Luo and Kay architecture [19] and the Dasarathy’s architecture [20].

2.2.1. Joint directors of laboratories (JDL)

JDL was first proposed by the US Department of Defense (DoD) in 1986 and mainly aims at military usage. However, it can also easily be adapted to nonmilitary use. In order to utilize the architecture extensively, many revised or extended versions of the JDL data fusion model appeared later on, which makes it fit many application scenarios. In this paper, we only introduce the original JDL for easy understanding. The JDL data fusion model is a functional model, which describes a series of concepts and functions to identify each process in a data fusion system. Fig. 1 shows the JDL data fusion architecture. There are five levels of data processing (level 0 – source preprocessing, level 1 – object refinement, level 2 – situation refinement, level 3 – threat refinement, and level 4 – process refinement) and three supporting components (sources, human-computer interaction (HCI), and data management) in the JDL architecture, as follows.

• Level 0 – source preprocessing: This is the lowest data processing level, which mainly deals with raw data at the signal or pixel level. Level 0 needs to prepare data well for the next steps. Thus, its primary mission is to transform and assign data to a proper level for further processing. This step can noticeably reduce system load and lets Levels 1-3 concentrate on the data corresponding to their own responsibilities without disturbance.
• Level 1 – object refinement: This step is responsible for outputting the identification information of individual objects. It focuses on identifying a particular entity. In this level, all static information about an entity’s location, direction, state and other attributes is collected and combined into a consistent pattern. Then, the system can get a comprehensive view of the entity in both the temporal and spatial dimensions for further estimation.
• Level 2 – situation refinement: Based on the individual entity’s information gained from the previous level, this level broadens the horizon of investigation to the environment of the entity. The relationships between various entities form an environment and a situation, which is the main concern of Level 2. The relations between entities are defined based on communications and are tightly connected with the environment.
• Level 3 – threat refinement: The current situation assessment from Level 2 allows Level 3 to address threat and impact. Level 3 predicts risks, vulnerabilities and operational probability. Because judgement is based on highly uncertain information, processing in Level 3 becomes quite difficult.
• Level 4 – process refinement: This level is the management part of all the processing levels. It monitors the other levels in real time, records the performance of the system and makes decisions to improve system efficiency. For example, in this level, the system can find out what kind of information is currently scarce, approve each level’s work in terms of getting source data or satisfying other particular needs, and direct the whole system.
• Sources: This component is the base of the whole system. It can take many forms such as sensors (local sensors or distributed sensors), databases, a priori knowledge, and so on.
• Human-computer interaction (HCI): This component is indispensable for smooth system execution. It allows human operations on the system, including commands, information inquiries, messages about system results and decisions, and so on. In fact, HCI realizes reciprocal assistance between human and computer.
• Data management: This component stores data in different forms, containing raw data and information. Different processing levels interact with data management frequently. Its responsibilities include but are not limited to data retrieval, data storage, data security, and data compression. The large amount of data involved and the need for rapid interaction make data management a tough task.

2.2.2. Luo and Kay’s architecture

Luo and Kay studied multi-sensor integration and fusion [19,69]. They provided a new general architecture of multi-sensor integration


Fig. 2. Luo and Kay’s architecture [19].
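
The cascade of fusion nodes in Fig. 2 (Section 2.2.2), where the outputs of sensors 1 and 2 are fused into $x_{1,2}$, which is then fused with sensor 3 into $x_{1,2,3}$, and so on, can be sketched as a left fold over the sensor outputs. The source does not specify what a fusion node computes; the sketch below assumes, purely for illustration, that each sensor reports a `(value, variance)` pair with independent errors and that a node applies inverse-variance weighting. The names `fuse` and `cascade` are hypothetical.

```python
from functools import reduce


def fuse(a, b):
    """One fusion node: combine two (value, variance) estimates.

    Assumed rule (not from the paper): inverse-variance weighting,
    which is appropriate when the two errors are independent.
    """
    (xa, va), (xb, vb) = a, b
    var = 1.0 / (1.0 / va + 1.0 / vb)   # fused variance is below both inputs
    value = var * (xa / va + xb / vb)   # precision-weighted mean
    return (value, var)


def cascade(estimates):
    """Chain fusion nodes: x_{1,2} = fuse(x_1, x_2), x_{1,2,3} = fuse(x_{1,2}, x_3), ..."""
    return reduce(fuse, estimates)
```

With three assumed sensor outputs `[(10.0, 4.0), (12.0, 4.0), (11.0, 2.0)]`, the first node gives $x_{1,2}$ with variance 2.0 and the second node gives $x_{1,2,3}$ with variance 1.0, so the uncertainty shrinks at every stage under these assumptions.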

based on the abstraction level of the integrated data, as illustrated in Fig. 2.

In the Luo and Kay’s architecture, raw data come from sensors and are fused in the nodes of an information system. For example, data from sensors 1 and 2 can be fused as data $x_{1,2}$. After that, the output data $x_{1,2}$ will be further fused in the next fusion node with data from sensor 3, turning into data $x_{1,2,3}$. Similarly, data $x_{1,2,\ldots,n}$ from the last fusion node is the highest fusion result. The authors summarized four levels, from low to high, to represent data in different fusion processes: signal level, pixel level, feature level and symbol level. The different levels deal with different input data patterns, are applied in various systems for a variety of purposes, and provide distinct degrees of improvement in information quality.

• Signal level: Raw data captured from sensors are input into fusion models and combined directly. The fusion models corresponding to this process belong to the category of signal level data fusion. After this fusion process, data come out with higher accuracy, less noise or refined features. If raw data are commensurate or in the same pattern, they can be fused at this level. Signal level fusion sometimes occurs in real-time fusion scenarios or may be an additional step in the preprocessing of signals. Researchers sometimes also call these models “low level fusion” or “raw data fusion”.
• Pixel level: This is a special case of signal level fusion, specifically for image processing. Fusion at the pixel level benefits image processing applications such as segmentation.
• Feature level: At the feature level, features or characteristics rather than raw data take part in the fusion process. Sensor data are often preprocessed into certain necessary features before fusion is conducted. As an output, we can obtain refined characteristics or features in other patterns for achieving other targets, or data at a higher level, the decision level. Feature level data fusion is also known as “medium level fusion” or “characteristic level fusion”.
• Symbol level: Symbol level data fusion has a more common name, decision level data fusion. It deals with information that has been refined from sensor data and already represents some determinations of a task. Usually, a global and accurate decision is required from data fusion. Symbol level data fusion is also known as “high level fusion”. Compared to low-level fusion, symbol level fusion methods often generate a preliminary classification and can fuse different types of data to obtain accurate fusion results.

The Luo and Kay’s architecture provides a hierarchical fusion scheme to transform data from a raw state to a form of high quality. Data sets captured from sensors follow the order of processing to become useful information for the purpose of assisting decision making or estimation.

2.2.3. Dasarathy’s architecture

Based on the Luo & Kay’s three-layer (data-feature-decision) fusion architecture, Dasarathy extended it into five fusion processes with respect to I/O characterization in 1997 [69]. Dasarathy thought that some ambiguous conditions in the three-layer architecture called for a more precise definition. Thus, he reformed the old architecture from an I/O perspective and classified data fusion models into five categories: Data In-Data Out (DAI-DAO) Fusion, Data In-Feature Out (DAI-FEO) Fusion, Feature In-Feature Out (FEI-FEO) Fusion, Feature In-Decision Out (FEI-DEO) Fusion and Decision In-Decision Out (DEI-DEO) Fusion, shown in Fig. 3. The new classification defined in the Dasarathy architecture considers the nature of input data and output data, which reduces the ambiguity of the three-layer architecture.

Fig. 3. Dasarathy’s architecture [69].

• Data In-Data Out Fusion: This type of fusion processes input data to make them more accurate or polished. It is the most elementary and basic layer in the fusion family. DAI-DAO fusion takes place immediately after raw data are captured from an environment. Its typical applications include signal processing and image processing.
• Data In-Feature Out Fusion: In this type of fusion, data sets are first integrated and extracted into some abstract information, called features. Some simple and intuitive results can be gained from raw data by applying DAI-FEO fusion.
• Feature In-Feature Out Fusion: Most feature fusion algorithms belong to this category, with feature inputs and also feature outputs. Unlike data inputs, feature inputs often show some refined characteristics, which have already been preliminarily extracted.
• Feature In-Decision Out Fusion: The majority of fusion algorithms fall into this category. Most of them are for the purpose of classification, which is a typical case of decision. With feature inputs, a sequence of decisions can be obtained. Another example of this type of fusion is pattern recognition. Features transmitted from multiple sensors are recognized with a priori knowledge to form a decision.
• Decision In-Decision Out Fusion: As the highest fusion level in the Dasarathy’s architecture, DEI-DEO fusion combines decisions from low-level or local fusion nodes into a global decision, which comprehensively considers the information of all low-level or local decisions.

There are also many other data fusion architectures, such as the Bowman Df&Rm architecture [17,18], the Durrant-Whyte architecture [20], the Pau architecture [67], the Laas architecture [68], and so on. They specify the data fusion process from different views, and each proposed architecture has its own advantages and characteristics in comprehending or modeling


particular applications. Steinberg et al. [21] and Ayed et al. [22] compared these architectures in detail.

2.3. Data fusion challenges

Although various data fusion models have been proposed to address specific demands in many concrete applications, data fusion still confronts a number of challenges in maximizing its advantages [24]. Most of these challenges result from the complexity of the application environments where sensors are located, the variety of data that should be combined, and so on. In this subsection, we list some of them below.

(1) Data imperfection: This is a common problem and a main issue that all data fusion methods are expected to settle. The data captured by sensors are often imprecise, uncertain, ambiguous, vague, and incomplete. Usually, we can improve data quality by modeling its imperfection and making use of other available information and powerful mathematical tools. Data imperfection will seriously affect fusion quality if precise and useful data cannot be extracted by data fusion.
(2) Data inconsistency: There are uncertainties caused by inherent noise in measurements, sensors and environments. This noise leads to outliers or disorder in the data, which are collectively known as data inconsistency. Data inconsistency has a severely negative effect on data fusion if a fusion model cannot distinguish the causes of the noise. Data fusion techniques should overcome this problem by eliminating the influence of data inconsistency. In addition, there are some spurious data caused by lasting or dynamic failures, which are difficult to model and predict in usual ways.
(3) Data confliction: This issue often appears in a system applying belief functions or Dempster-Shafer theory. When some problems that should be treated independently are erroneously integrated, a representation error occurs.
(4) Data alignment/registration and correlation: Data captured from different sensors with different frames must be aligned into a common frame before they are fused, which is called data alignment or data registration. Over- or under-confidence will result if errors occur in this process. There are also some other challenges, such as data correlation, which appears mostly in a distributed environment when the same set of data is computed or fused more than once, mainly because of cyclic paths in the topology; this is called the data incest phenomenon. Correlated data often markedly affect a fusion system, causing seriously biased estimation, if the correlation cannot be eliminated well by data fusion algorithms.
(5) Data type heterogeneity: Data are captured by sensors in different environments, so they might belong to quite different types. Just like people’s eyes, nose and mouth, sensors serve different purposes. Data fusion methods should be able to integrate different types of data to describe the whole status of an object.
(6) Fusion location: This is also an outstanding problem in wireless sensor networks and other distributed fusion environments. Data

2.4. Machine learning

Machine learning is one of the hottest research topics because of the massive impact of AlphaGo and other artificial intelligence applications. Machine learning is a subfield of computer science and artificial intelligence. It describes a field that utilizes particular algorithms to make computer systems “learn” from given data without specific programming. Specifically, it is a process that lets computer systems or machines see, know, learn and predict the world like a human being. “Machine learning is the study of making machines acquire new knowledge, new skills, and reorganize existing knowledge” [70,72–74]. In the early days of machine learning, people carried out research to let a machine study, gain skills and build its own knowledge world automatically. After that, Samuel proposed the term “machine learning” explicitly in 1959 [3], which evolved from artificial intelligence study fields such as pattern recognition and computational learning theory. The main idea of a machine learning method is to give the computer the ability to acquire experience and adjust itself accordingly without too much human intervention. It is suitable for solving problems that are difficult to program or model.

Data plays an important role in machine learning. Data patterns determine learning results and effects. Machine learning first needs some data inputs, which are also known as samples, training sets and instances. With the help of the provided data sets, a machine reconstructs their internal relationships, which is the result of “learning” (also known as ‘training’), and presents the acquired knowledge by means of specific output forms like recognition, classification and prediction (known as ‘testing’). More concretely, regression models produce a numerical variable; classification models produce a categorical variable, and so on.

Machine learning methods are usually divided into three classes based on whether a given data set has labels about its attributes for learning: unsupervised learning, supervised learning and semi-supervised learning [23].

If the attributes of input data sets and output data sets are completely labeled, the goal of machine learning algorithms is to construct a model that maps input to output, which is called supervised learning. Representative applications of supervised learning include classification, regression, and so on. Two typical supervised learning algorithms are introduced below:

• Support Vector Machine (SVM): SVM is a typical supervised learning model for binary classification. Given a series of labeled training data sets, with each data item marked as belonging to one or the other of two categories, an SVM training algorithm constructs and trains a model that can assign new data to one category or the other, making it a non-probabilistic binary linear classifier. An SVM works out a hyperplane or a set of hyperplanes in the feature space between classes. SVM models are particularly suitable for classifying inconsistent sensor data with high dimension features.
• Neural Network (NN): Also known as Artificial Neural Network (ANN), NN is a large-scale intricate network constituted of a set of layers including an input layer, a couple of hidden layers and an output layer. Each layer has many nerve cells. The inputs of a current layer’s nerve cells are the outputs of a former layer’s nerve
can be fused in a central node or a local node. The first manner
cells. Given with training data sets, NN learns specific parameters of
costs more bandwidth and time. With the later manner, we can
the whole network with feed-forward or feedback. Due to the com-
reduce communication burden, but we have to give up data ac-
plex structure of NN, it is often trapped with long runtime and local
curacy certainly because of the information loss of local fusion.
minima problem. There are also some derivative NNs, such as Deep
How to balance fusion cost and fusion quality is a tough issue.
Neural Network (DNN), Convolutional Neural Network (CNN), and
(7) Dynamic fusion: The complexity of data fusion is caused by not
so on.
only data type and collection environment, but also its timeli-
ness. To estimate a system state, especially for a time-varying In unsupervised learning, there are no labels given with datasets.
system, data might be significant only in a limited time period. Algorithms often extract features and patterns by themselves. Usually,
This challenge should be dealt with well in a real-time applica- according to similarity or distance of data inputs, the models build
tion environment. Fusion node should be able to distinguish the profound association with the help of internalized heuristics. Clus-
right order of data and its validation. tering methods are representative unsupervised learning algorithms.
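Distance-based clustering of this kind can be illustrated with k-means, which the survey singles out as a widely used method. The following is a minimal Python sketch of the standard Lloyd iteration, not the specific algorithm of any cited work. The function names (`kmeans`, `squared_distance`) and the `init` and `seed` parameters are assumptions introduced here for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0, init=None):
    """Illustrative Lloyd-style k-means (hypothetical helper).

    points: list of equal-length tuples; k: number of clusters.
    init:   optional list of k starting centers; if omitted, k input
            points are drawn at random, as in the description above.
    Returns (centers, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    centers = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels


if __name__ == "__main__":
    readings = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(readings, k=2))
```

With random initialization the result can depend on the starting centers, which is one reason practical systems often run the procedure several times or use a more careful seeding scheme.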


Compared to supervised learning, or more exactly classification, which deals with pre-defined labels, clustering does not have any advice or guidance. A data clustering model groups data by putting objects with similar attributes in the same group (i.e., a cluster). Some typical clustering algorithms have been widely used in various applications [56,57,60,61]: for example, connectivity models such as hierarchical clustering, which are constructed based on distance connectivity; centroid models such as k-means algorithms, which use a single vector to describe a class; and distribution models such as expectation-maximization algorithms, which manipulate data with statistical distributions. Because k-means is a commonly used machine learning algorithm that is applied in many data fusion methods, we discuss it in detail.
• K-means might be the most extensively used clustering method, in which a structure in data is revealed by minimizing a given objective function. With n data points positioned in a d-dimensional space, k points are randomly chosen as the initial cluster centers. The distance between every data point and its nearest center is calculated. The objective of the optimization is to minimize the distance and the local squared-error distortion by repeatedly recalculating the cluster centers and reassigning the data points. K-means belongs to variance-based clustering. In fact, clustering is an NP-hard problem, thus there is no general solution. There are some representative efficient models for solving the k-means problem, such as those presented in [70] and [71].
When machine learning is based on a training data set that has incomplete labels, it is called semi-supervised learning. In this case, labeled data inputs play a leading role in forming a decision boundary, while a large set of unlabeled data inputs also helps improve the accuracy of the decision boundary and the stability of the whole model.

3. Criteria of machine learning for data fusion

In this section, we list the criteria that a data fusion model or algorithm should satisfy in order to employ them as evaluation metrics to review the literature in the next section. In what follows, data fusion model, method and algorithm are used interchangeably with the same or similar meaning if not specially annotated. Facing the challenges mentioned in Section 2.3, we propose a list of criteria to comprehensively and thoroughly evaluate the performance of data fusion.
(1) Efficiency (Ef): Efficiency is used to evaluate whether a data fusion model makes use of resources economically. In most application scenarios, system resources are limited in terms of computation, bandwidth, storage space and many other aspects. Processing as much data as possible, in as short a time interval as possible, with as few system resources as possible should be a universal goal of a fusion model. The efficiency reflected by execution time should be evaluated to demonstrate the advance of a model through comparison with other models.
(2) Quality (Q): Obviously, it is the most important criterion for evaluating a fusion model. What is the direct impact on a fusion algorithm? To which degree does the model improve information accuracy? Quality is the core of data fusion. In a specific application scenario, there should be corresponding assessment metrics. Quality should be inspected by checking whether the above questions are answered with sufficient evidence, e.g., experimental results and reasonable explanations. We divide the literature that deals with fusion quality into two types: works with an ideal fusion result and works without one. For the former, we use Root Mean Squared Error (RMSE) to measure the bias between a calculation result and observation, which directly describes fusion quality:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{M_1} \cdots \sum_{j=1}^{M_k} \left[ R(x_i, \cdots, x_j) - F(x_i, \cdots, x_j) \right]^2}{M_1 M_2 \cdots M_k}},$$

where $M_1, M_2, \cdots, M_k$ refer to the k dimensions of data sets in an application environment, R denotes the ideal result and F stands for the corresponding calculated fusion result. A smaller RMSE means a lower bias between the ideal result and the fusion result, and hence better fusion quality.
For studies that do not have reference data for comparison, each issue deserves a specific analysis. For example, in image fusion, we evaluate fusion quality with the help of the Structural Similarity Index (SSIM) of the original images a, b and the fusion result image f [65,66]:

$$Q(a, b, f) = \lambda_a Q_0(a, f) + \lambda_b Q_0(b, f)$$

$$Q_0(a, b) = \frac{4 \sigma_{ab}\, \bar{a}\, \bar{b}}{\left( \bar{a}^2 + \bar{b}^2 \right) \left( \sigma_a^2 + \sigma_b^2 \right)}$$

where $\bar{a}$ is the mean value of a, $\sigma_a^2$ is the variance of a, and $\sigma_{ab}$ is the covariance of a and b. Here $Q_0$ is written for the image pair (a, b); $Q_0(a, f)$ and $Q_0(b, f)$ follow by substituting the corresponding pair. For simple calculation, we use a sliding window to divide the whole problem. We define $\lambda_a(w)$ and $\lambda_b(w)$ as below:

$$\lambda_a(w) = \frac{s(a|w)}{s(a|w) + s(b|w)}, \qquad \lambda_b(w) = 1 - \lambda_a(w)$$

where $s(a|w)$ can be any statistical characteristic of image a in window w, such as variance or marginal information. Thus,

$$Q(a, b, f) = |W|^{-1} \sum_{w \in W} \left( \lambda_a(w)\, Q_0(a, f|w) + \lambda_b(w)\, Q_0(b, f|w) \right)$$

The value of Q lies in [-1, 1]. The closer Q is to 1, the better the fusion quality of an algorithm.
(3) Stability (St): Stability is used to evaluate a fusion model's ability to keep working well in a stable manner in different situations. What we need is not just a disposable system with expensive costs in installation and debugging. A steady model can persistently achieve high performance. With few abnormal situations, expenses for exception handling and routine maintenance are also saved in practice. In the literature, multiple testing data sets were adopted to examine the stability of a fusion model [25–27].
(4) Robustness (R): Robustness evaluates the strength of a fusion model to resist disturbance. When the underlying environment changes, fusion quality should still be ensured. For example, in a radar system, raw data captured from sensors are not stable all the time. It is highly expected that a fusion algorithm effectively removes outliers, noise and communication errors as well as it can. If the fusion model can overcome this problem with a stable fusion result, the model is robust.
(5) Extensibility (Ex): Extensibility means that a data fusion model can be easily further improved and widely used in many situations. For similar application environments with similar targets, the model can be applied in a generic and pervasive way. Extensibility is a valuable feature for wide adoption of a data fusion model in practice.
(6) Privacy (P): In some application scenarios, data used for fusion may be sensitive and private, which induces security requirements on the fusion model. We use privacy to describe such a demand. In an environment where non-public data sets are processed, data should be protected during fusion to avoid any leakage of sensitive information in subsequent steps. Which encryption algorithm or privacy protection scheme should be applied, and how to manage procedures including but not limited to encryption, fusion, transmission, decryption and storage, are the key objectives of privacy protection.
(7) Tested with real world data sets (Re): In solid research, experiments are indispensable to test the performance of a model, prove its effectiveness, and show its advantages. Obviously, experimental results are more persuasive if researchers use data sets captured from real application scenarios. It is highly preferred that all experiments are done in practice rather than in a simulated environment.
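The two quality measures defined under criterion (2) can be made concrete with a short Python sketch. It rests on several assumptions that are not in the original text: signals or images are passed as flat lists of numbers (so the nested sums of the RMSE collapse into a single sum), the variance is used as the statistic s, and a single window covering the whole input replaces the sliding window. The function names are hypothetical.

```python
import math


def rmse(reference, fused):
    """Root mean squared error between an ideal result R and a fusion result F."""
    if not reference or len(reference) != len(fused):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((r - f) ** 2 for r, f in zip(reference, fused))
    return math.sqrt(total / len(reference))


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Population covariance; covariance(x, x) is the variance of x."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def q0(x, y):
    """Quality index Q0 for one pair of signals, following the formula above."""
    mx, my = mean(x), mean(y)
    denom = (mx ** 2 + my ** 2) * (covariance(x, x) + covariance(y, y))
    if denom == 0:
        raise ValueError("Q0 is undefined for constant or zero-mean inputs")
    return 4 * covariance(x, y) * mx * my / denom


def fusion_quality(a, b, f):
    """Q(a, b, f) over a single window, with variance as the statistic s."""
    s_a, s_b = covariance(a, a), covariance(b, b)
    lam_a = s_a / (s_a + s_b)  # assumes at least one source is non-constant
    return lam_a * q0(a, f) + (1 - lam_a) * q0(b, f)


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    b = [1.5, 2.5, 2.5, 4.5]
    f = [(u + v) / 2 for u, v in zip(a, b)]  # naive averaging fusion
    print(rmse(a, f), fusion_quality(a, b, f))
```

A windowed version would apply `fusion_quality` to each window and average the results, as in the last formula of criterion (2).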


Table 1
Summary and comparison of machine learning methods for data fusion.

| References | Fusion types | Application scenarios | Machine learning methods | Challenges to overcome | Ef | Q | St | R | Ex | P | Re |
|---|---|---|---|---|---|---|---|---|---|---|---|
| [28] | signal | Motor fault detection | SVM | Dynamic fusion | N | H | N | N | Y | N | Y |
| [29] | signal | Distributed data fusion | SVM | Fusion location | Y | L | N | N | Y | N | N |
| [30] | signal | Biometric Fusion | SVM | Data imperfection | Y | H | Y | Y | N | N | Y |
| [40] | signal | WSN | BP neural network | Data type | Y | N | Y | N | N | N | N |
| [41] | signal | WSN | SMPSO-BP neural network | Data imperfection | Y | N | Y | N | Y | N | N |
| [32] | signal | Navigation system | Elman neural network | Data imperfection | Y | H | N | N | N | N | Y |
| [34] | signal | Drum level measurement | RBF neural network | Data imperfection | Y | H | Y | Y | N | N | N |
| [31] | signal | Multi-objectives real-time tracking | k-central clustering | Data association | N | H | N | Y | N | N | N |
| [39] | signal | High resolution radar system | Clustering | Dynamic fusion | Y | N | Y | Y | N | N | Y |
| [51] | signal | Radar data fusion | Cell clustering | Data imperfection | N | L | N | Y | N | N | N |
| [36] | signal | Multi-target tracking | K-means | Data imperfection | N | H | Y | N | N | N | Y |
| [26] | signal | WSN anomaly detection | K-Means | Fusion Location | Y | H | Y | Y | N | N | Y |
| [33] | signal | WSN | Un-even clustering/Simulated annealing algorithm | Data imperfection | Y | N | N | Y | Y | N | N |
| [38] | signal | WSN | K-means | Data imperfection | Y | H | Y | N | N | N | N |
| [35] | signal | Reputation generation | Clustering | Data imperfection | Y | H | Y | N | Y | N | Y |
| [37] | signal | Unknown system | Clustering/MLP | Data inconsistency | N | H | N | Y | Y | N | Y |
| [44] | feature | Land cover classification | SVM | Data imperfection | N | H | Y | N | Y | N | Y |
| [43] | feature | Embedded real-time fusion | ANN | Data imperfection | Y | H | Y | N | Y | N | Y |
| [43] | feature | Embedded real-time fusion | SVM | Data imperfection | Y | H | Y | N | Y | N | Y |
| [43] | feature | Embedded real-time fusion | NBC | Data imperfection | Y | H | Y | N | Y | N | Y |
| [42] | feature | Meta-search engine | Ranking SVM | Data imperfection | N | H | Y | N | N | N | Y |
| [27] | feature | Tool wear estimation | Artificial Neural Network | Dynamic fusion | Y | N | N | Y | N | N | Y |
| [50] | feature | Gesture Recognition | SVM | Data imperfection-Classifier | Y | H | Y | N | N | N | N |
| [52] | feature | Moving target indication | Self-organizing clustering | Data imperfection | N | N | N | L | N | N | Y |
| [25] | feature | Intrusion detection | Fuzzy clustering | Data imperfection | N | H | Y | N | Y | N | Y |
| [45] | decision | Remote sensing data fusion | SVM | Data imperfection | N | H | N | N | N | N | Y |
| [48] | decision | Intrusion detection | Neural Network | Data imperfection | Y | H | Y | N | N | N | Y |
| [46] | decision | Intrusion detection | Clustering | Data imperfection | N | L | N | N | Y | N | N |
| [47] | decision | Intrusion detection | K-Means & v-SVC | Data imperfection | N | H | Y | N | N | N | Y |
| [49] | decision | Nuclear power crack detection | Clustering | Data imperfection | N | H | Y | Y | N | N | Y |

Ef: Efficiency, Q: Quality, R: Robustness, St: Stability, Ex: Extensibility, P: Privacy, Re: Tested with real world data sets.
Y: Concluded or did well or discussed theoretically; N: Not mentioned.
H: Did well especially in Q; L: Concluded but analysis was not adequate in terms of Q.

4. Machine learning for data fusion

In this section, we review the state of the art of machine learning for data fusion by classifying the current works into three categories: signal level data fusion, feature level data fusion and decision level data fusion. In each category, we review the literature based on the type of machine learning. For each work, we summarize its main contributions and characteristics, and comment on its performance based on the proposed criteria. At the end, we summarize and compare all the reviewed works in Table 1.

4.1. Signal level data fusion

According to the Luo & Kay architecture, the lowest level of data fusion is signal level fusion. From raw data inputs captured by sensors, it produces data outputs with high accuracy, high reliability and little noise, or extracts feature outputs that directly reflect an aspect of the observation. Signal level models are often applied in signal fusion, image fusion (also known as pixel fusion) and other similar scenarios.
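Several of the signal level works reviewed next rely on an SVM classifier. As a reminder of how such a classifier separates two classes with a hyperplane, the sketch below trains a linear SVM by sub-gradient descent on the regularized hinge loss. This is a generic textbook technique chosen for brevity, not the training procedure of any reviewed paper. The function names, the learning rate `lr`, the regularization weight `lam` and the toy data are all assumptions made for illustration.

```python
def train_linear_svm(samples, labels, lr=0.1, lam=0.01, epochs=50):
    """Fit w, b for the decision rule sign(w . x + b); labels must be +1 or -1.

    Illustrative only: per-sample sub-gradient steps on
    lam/2 * ||w||^2 + max(0, 1 - y * (w . x + b)).
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization step: shrink the weights slightly every time.
            w = [wi * (1 - lr * lam) for wi in w]
            if margin < 1:
                # The sample violates the margin: push the hyperplane away.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


if __name__ == "__main__":
    # Hypothetical two-feature readings: +1 = normal, -1 = faulty.
    samples = [(2, 2), (3, 3), (2, 3), (3, 2),
               (-2, -2), (-3, -3), (-2, -3), (-3, -2)]
    labels = [1, 1, 1, 1, -1, -1, -1, -1]
    w, b = train_linear_svm(samples, labels)
    print(w, b, predict(w, b, (2.5, 2.5)))
```

Kernel SVMs, such as the radial basis variant mentioned later in this section, replace the inner product with a kernel function; that extension is not shown here.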

4.1.1. Signal level data fusion based on supervised learning

Fig. 4. The model structure of [28].

As a representative supervised machine learning algorithm, SVM provides a proper fusion function at the signal level. Banerjee et al. [28] proposed a hybrid method for fault detection based on multi-sensor data fusion with SVM, Short-Time Fourier Transform (STFT) and a time duration based observer model. The system classifies the state of a system into three kinds: healthy, degraded and failed. The specific scheme of the SVM based fault classifier is described in Fig. 4. Raw data from sensors are first preprocessed with STFT, which mainly separates them based on the frequency level and amplitude of a signal. Then, an SVM classifier, which has been previously trained with labeled data, transforms the input signals into a high dimensional feature space and separates them in a linear way into original signals and signals with fault. After


that, a sensing system with a time duration based observer receives signals from the classifiers and judges which state the system is in with the help of a threshold. The threshold is a tolerance level of the system. Crossing a safety valve means that a signal gives an unwanted response, which leads to a state change in a finite state model. At last, the output of the system is divided into three states: healthy system (i.e., the state of the system does not change in the observing phase), degraded system (i.e., the change of the system is within a tolerance level), and failed system (i.e., some signals do cross the safety valve). The proposed model can monitor the working state of a motor over a certain interval of time and raise a prior alarm if any unwanted situation happens. Since the sensors capture data that vary nonlinearly, SVM, as an excellent nonlinear pattern recognition tool, especially in dynamic procedures, ensures both accuracy and performance in fault diagnosis. Classification accuracy and average classification performance are improved compared with the system without fusion. Experiments with one to ten sensors in the system showed good fusion performance, which implies sound model extensibility. The experiments were performed on a practical system. However, efficiency, stability, robustness and privacy were not mentioned in this work.
In distributed data fusion systems, a disturbing problem often exists. Raw signals obtained from sensors are usually stored as a large set of samples, while the transmission bandwidth from sensors to a fusion center is not adequately large and usually has a distinct limit. This makes it difficult to transmit these data sets for the next processing step within the available time limit before the data expire. Challa et al. [29] optimized a Bayesian approach to data fusion with SVM, which is used as a technique for compressing information. It minimizes the objective function of SVM to transform input signals into a small set of signals called support vectors, which is described as its approximation function. Other non-support vectors are discarded since the related signals do not contain useful information. Correspondingly, a kernel dictionary of the SVM is given for model modification to achieve sound efficiency in different practical application environments. This model was tested in a density estimator system, where it showed excellent performance in data compression. Thus, it performs well in both fusion efficiency and extensibility. On the other hand, it requires many training samples to certify its strength, thus it does not perform very well in terms of robustness and stability. Through experimental result analysis, we think fusion quality should be further improved.
Fusion based on SVM could overcome fusion challenges regarding imperfect data. Fahmy et al. [30] proposed an improved SVM-based data fusion algorithm. It applies SVM to biometric fusion, fusing iris and fingerprint data to gain high accuracy. However, the performance of a traditional linear SVM is not good enough. The authors studied in depth a technique called score normalization. Although some previously published literature assumed that this procedure is not necessary in statistical learning fusion such as SVM, this work showed that this viewpoint is wrong. Essentially, score normalization is an important internal part of the SVM procedure that helps transform raw data of individual factors into a uniform pattern. The normalization method improves the efficiency and robustness of the traditional SVM model. Moreover, time consumption is reduced in both the training phase and the testing phase because SVM can deal with the result directly. A number of score normalization methods were introduced and tested based on a Radial Basis SVM with the CASIA and FVC2004 databases, which proves the high fusion quality and stability of the enhanced SVM model. However, other criteria were not discussed in this work.
For signal level data fusion in WSN, the Back Propagation Neural Network (BP-NN) is a typical solution. However, a BP-NN based fusion model often has a long convergence time, which causes low fusion efficiency and a short life cycle of nodes. Shi et al. and Tan et al. improved BP algorithm-based WSN data fusion from two viewpoints, respectively [40,41]. Tan et al. applied WSN data fusion to forest fire monitoring [40]. They took advantage of the Levenberg-Marquardt algorithm to reduce the time and energy consumption of a classical BP-NN. Simulation results indicate improved efficiency compared with ordinary BP-NN algorithms. Correspondingly, Shi et al. optimized the classical BP-NN with the Speed-constrained Multi-objective Particle Swarm Optimization (SMPSO) algorithm. Their method [41] reaches convergence in the fewest iteration steps compared with a classical BP-NN algorithm and an improved BP-NN algorithm, which shows its efficiency. Simulation results also proved that the proposed algorithm is adaptive in a large-scale network.
Many application environments, such as human motion analysis and human-machine interfaces, have a crucial need for precise location. Multiple sensors set in different locations capture position information of a target from different views. Fusion models are expected to resolve the imperfection of these data sets to get complete knowledge of the location of the target. Kolanowski et al. [32] proposed a navigation system based on an Elman Artificial Neural Network (ANN), which is good at resolving nonlinear problems, especially in prediction. The system first uses an Attitude and Heading Reference System (AHRS) to analyze data sets from sensors. The input and output data sets are used to train the Elman ANN. The Elman ANN model has 9 input neurons and 3 output neurons. There is at least one hidden layer between the input layer and the output layer. There is also a context-sensitive layer, connected only to the hidden layer, that stores the information of the previous hidden layer. The context-sensitive layer can be seen as a representation of feedback. The authors also changed the number of neurons in the feedback loop to achieve better performance. Experimental results with the Elman ANN show few errors compared with AHRS, which indicates that the Elman ANN is an efficient alternative for position detection. The reduction of trigonometric operations and matrix operations improves the time cost. Thus, this system achieves Efficiency and Quality. However, this work does not discuss the other criteria proposed in Section 3.
Tong et al. [34] proposed an information fusion model for boiler drum water level measurement. There is a crucial need for precise water level measurement in the drum because an imbalance between boiler load and feed-water leads to serious consequences. Differential pressure level measurement is a convenient and efficient method for this problem. However, it is not robust when facing disturbances. A Radial Basis Function (RBF) neural network model, which is expected to be able to map highly nonlinear models, was designed to fuse such attributes as operating pressure, operating temperature, water inflow, and so on. Compared with BP-NN, RBF networks solve more problems, such as local optimization. With an improved gradient descent algorithm, the RBF neural network can correct the error of drum level measurement well. Simulation results show that with a two-step training method, both the number of errors in the output and the training time are reduced very quickly, which indicates the high efficiency and quality of this fusion model. The authors tested the model with 20 samples; the accuracy of the testing results (i.e., the maximum level error is less than 1 millimeter) shows its sound performance regarding Robustness and Stability. But Extensibility and Privacy were not considered in the paper. Neural networks show their strong ability in dealing with a nonlinear problem that is difficult to describe directly as a function.

4.1.2. Signal level data fusion based on unsupervised learning

For both military and nonmilitary usage, multi-radar data fusion is an important technique for target identification and tracking with high accuracy. Shu et al. [31] focused on discriminating and tracking multiple objectives in real time. In target observation fields, multi-sensor multi-target tracking is difficult to realize because it requires both discriminating one target among many from the data observed by the same sensor and combining data from different sensors with regard to one target. A K-central clustering method was utilized to optimize this target identification and tracking model to find the path of a target. Given a large set of real-time sensor data without labels, the algorithm is expected to cluster them into valuable categories, which are also the batches of targets. K-central clustering chooses a


center of each cluster and adjusts the assignment of points so that the sum of the distances between the centers and the other points is minimized. The simulation results indicate that the k-central clustering method solves the data association problem efficiently and gains better tracking results compared to the original filtering methods, which demonstrates good fusion quality. Efficiency, Stability, Extensibility and Privacy were not considered in this paper. The experiments were carried out in MATLAB, not in a real environment. The model is robust in dealing with ambiguous and noisy data.
In a high-resolution radar system, there is a special requirement on the efficiency of data processing on account of the large scale of raw data and the need for real-time fusion in target monitoring or tracking. Li and Wang [39] proposed a fast data fusion algorithm based on clustering. This algorithm divides raw data into clusters based on single dimensional distance. The authors also analyzed the calculation complexity of the proposed algorithm as O(m*n). Experiments showed that the model markedly improves fusion efficiency compared to K-means, hierarchical clustering and some other data fusion algorithms in the same application environment. In particular, the authors considered serious noise in the data collection of the radar system. To enhance the robustness of the algorithm, noise removal is performed at the end of the algorithm.
Similarly, Wang et al. [36] proposed a hierarchical clustering algorithm based on the K-means method for multi-target tracking. In this paper, target tracking problems with targets detected by multiple radars were described in detail. For example, the target route is irregular, and radar tracks are not uniform in time or have no common interval. To solve these problems, a hierarchical clustering model was built. After data preprocessing, the Hausdorff distance, which describes the level of similarity between tracking data sets, was defined and calculated. Data sets with Hausdorff distance become a class and constitute a cluster search tree. According to the clustering algorithm, similar classes are merged into a new class to build the hierarchical clustering tree. At last, an improved K-means algorithm was designed to deal with the final clustering, which is also the most important fusion process. Tests with real radar data showed the effectiveness, stability and high tracking accuracy of the algorithm.
As one of the hot topics in WSN, anomaly detection is attracting more and more attention. There are a number of distinctions between WSN and ordinary networks, which might lead to many serious problems if we simply transplant traditional outlier detection techniques into the WSN environment. Firstly, WSN has severe resource constraints, especially in battery life, computational capacity and communication overhead, which makes it hard to afford expensive or complicated computation. It also has a high demand for online and real-time detection without prior knowledge because of the characteristics of data in WSN (distributed streaming data). Guo et al. [26] proposed an anomaly detection model to solve the above issues. A lightweight data fusion algorithm named Piecewise Aggregate Approximation (PAA) was proposed to compress raw data collected by sensors, which greatly reduces transmission overhead. Then K-Means, an unsupervised detection algorithm, improved with an Artificial Immune System (AIS), completes the classification of normal and abnormal data, namely outlier detection. Compared with other WSN detection algorithms, this model not only consumes less energy and time, but also offers a higher detection rate and a lower false alarm rate. Thorough comparison and analysis of experimental results show the comprehensiveness of this work. Besides, experiments based on virtual and real data demonstrate the stability and effectiveness of the model.
To gain good fusion efficiency, routing protocol design becomes an important issue in WSN. An appropriate routing protocol considers many factors, such as the topology of the whole network, the capability of fusion nodes, and the time limits of valid signals captured by sensors. Xiao and Liu [33] provided a routing protocol based on Un-even clustering and a simulated annealing algorithm. Compared to the classical protocol LEACH (Low Energy Adaptive Clustering Hierarchy), two obvious differences are un-even initial clustering and dynamic time interval for cluster

with the simulated annealing algorithm. Sensor members transmit their data sets to their corresponding cluster head in the next phase. Cluster heads execute data fusion and send the fused information to their next hop. A threshold on the residual energy of the cluster head is used to examine in real time whether it is suitable to continue. If it is not, a new cluster head is chosen by the base station immediately and a new round begins. The proposed protocol prolongs the lifetime of the whole network and reduces total energy consumption. It improves the performance of a WSN with a distributed data fusion function from the view of resource consumption. Thus, fusion efficiency is improved a lot. Experiments were performed with a sensor simulation tool without real environment tests. Other criteria were not mentioned in this work.
The weighted algorithm based on fuzzy logic is a classical data fusion algorithm. Due to their excellent performance in calculating weighting factors and dealing with imprecise data, fusion algorithms based on fuzzy logic have received much attention. However, raw data in WSN do not suit the traditional weighted fuzzy logic algorithm ideally because invalid data appear frequently during data collection in a real-world environment, which might lead to serious measurement deviation. Wang et al. proposed an improved fusion method with k-means clustering [38] aiming to solve this problem. The K-means clustering method is applied to preprocess raw data before calculating the weighting factors. They divided raw data into different clusters, and the erroneous data with high variance are arranged into specific clusters. Thus, fusion quality can be improved by reducing the weights of data in the clusters that contain erroneous or useless data. Experiments with simulated data sets showed better fusion accuracy compared to traditional weighted fuzzy logic algorithms and two other fusion models. Theoretically, the method achieves better fusion efficiency and quality, and is also robust to noise. It is a pity that this method was not evaluated with real-world data sets.
Along with the great development of the Internet and e-commerce, online shopping has become more and more popular. Consumers need to acquire as much information as possible about the products they are interested in, including the opinions of other consumers. Yan et al. [35] proposed an algorithm for reputation generation and recommendation provision based on opinion mining and fusion. Opinions are first filtered to eliminate unrelated or spam opinions. Then, similar opinions are fused and clustered into a specific opinion set. A number of opinion clusters are then generated. In addition, the votes on or citations of original opinions are also properly fused into the main opinion clusters. The scale of the raw data set is greatly reduced for generating a reputation value with high efficiency. Experimental results based on real-world data from both Chinese and English Amazon websites show the accuracy (quality) and stability of the algorithm. The authors also discussed the generality of the algorithm by indicating that it can be applied to generate the reputation of many different entities through opinion fusion. It can also support the ways in which people express their attitudes, such as votes and comments in natural languages. Thus, this fusion model performs well in terms of extensibility.
Alyannezhadi et al. [37] proposed a data fusion algorithm based on clustering for uncertain systems. Systems with a number of unidentified characteristics or mathematical models are usually called unknown systems. In this case, we do not know the explicit patterns of the system, which makes it hard for researchers to process data. In [37], a data fusion algorithm was proposed, which contains three parts: clustering, prediction and updating. In the clustering part, subsets of raw data are generated, and then a multi-layer perceptron (MLP) is trained with the data to optimize its prediction ability. It is worth noting that the data in the training sets are timely. At last, fusion results are updated in the whole system. In unknown systems, a prominent problem is data inconsistency and uncertainty, which is also the main problem solved by this model. Experiment results with real data sets of temperature from five Internet companies show the
head reselection. At the start of the protocol, the base station clusters elimination of data inconsistency and also the robustness of the algo-
all nodes based on their position information and energy information rithm. This algorithm is also possible to be applied into other known or


unknown multi-sensor data fusion scenarios, which shows its potential for extensibility. Efficiency, Stability and Privacy were not mentioned in the paper.

4.2. Feature level data fusion

In feature level data fusion, data inputs can be either raw data or features extracted already. As an output, we can obtain refined characteristics or features in the form of other patterns that can be applied to other targets, or data at a higher level, i.e., decisions. Information derived from this process is more polished and comprehensive in showing various characteristics of data compared with signal level data fusion. In what follows, we review the recent advances in feature level data fusion.

4.2.1. Feature level data fusion based on supervised learning

SVM performs well in feature fusion [44]. Pouteau et al. proposed an SVM-based selective fusion algorithm for solving a land cover classification problem [44]. The authors compared a variety of previous fusion models in this field and stated that SVM achieves the best performance because of its ability to process data from both mono-source and multi-source inputs. Most simple multi-source fusion models applied in remote sensing may suffer degraded accuracy in some scenarios with classes utilized by a non-relevant source. On the contrary, selective SVM can deal with this by integrating mono-source classification and multi-source fusion. Experiments with real data sets showed the effectiveness and stability of the algorithm [44]. Moreover, its application is not limited to tropical rainforest classification, as tested in this paper. It is applicable to other remote sensing problems with multi-sensory and Geographic Information System (GIS) data, which implies good extensibility of the algorithm. However, other criteria were not discussed in this work.

Starzacher and Rinner proposed an embedded real-time multisensory data fusion scheme based on ANN, SVM and NBC (Naïve Bayes Classifiers) [43]. In an embedded real-time environment, resources are scarce at each data processing node. However, there is a strict requirement on the processing time of an applied fusion algorithm because of the high speed and instantaneity of the data stream. The embedded multi-sensor fusion system proposed in [43] includes several sensor nodes distributed in three layers, a single center node and an assisting sensor node that helps a single node make decisions. Three fusion methods were tested on an embedded test platform with four real-world datasets. Classification execution time and classification rates were used to measure the performance of the models. Experimental results showed that SVM has the lowest classification time and that all three algorithms perform better than the classical methods. On the other hand, classification rates are influenced by many factors. As a whole, these three fusion methods perform reasonably well in the embedded system.

Ranking SVM, which transforms a learning-to-rank problem into a formalized binary classification solved by SVM, has become a hot topic nowadays. Cao et al. [42] applied ranking SVM to a meta-search engine based on fusion. The meta-search engine in this paper is a cross-media engine, which supports both text-based retrieval and content-based retrieval. The meta-search engine is expected to distribute the requests from users to several member search engines and then merge the results into a single list. The key point in this engine is “result fusion”, which integrates the results from all member search engines and produces a comprehensive rank list. Existing studies in this field often give different engines a common weight, ignoring the specific condition and performance of each single member search system. This paper solved this problem with the help of supervised learning to obtain appropriate fusion weights. The ranking SVM model transforms the ranking problem into a binary classification problem by modifying a function form. For a document from the result sets, the algorithm first selects features and builds training sets based on users’ orders. Then the constraint relationships and a linear final merged function help in training the weights of the features. The final score of a new testing set is also computed by the function above. In simulations, the authors used many parameters to evaluate the precision of the fusion model. Results showed that the ranking SVM obtains higher accuracy and better performance than other methods in terms of all assessment measurements. The efficiency of the model was not mentioned in this paper, which requires further study. The experiments were conducted on a large amount of data from the WikipediaMM2008 database.

A typical Artificial Neural Network-based sensor fusion method was developed in an online tool wear estimation environment [27]. In a manufacturing process, monitoring tool wear plays an important role in avoiding degradation of product quality caused by serious tool wear. The great demand for online tool wear estimation leads to research on data fusion. This paper provides a classical neural network-based fusion model including data preprocessing, feature extraction and feature fusion. Training data sets and testing data sets with tool wear conditions obtained from an optical microscope were used to train the neural networks offline. Thus, the system can provide tool wear estimation as soon as the features of the tool are given online. Different feature groups were extracted and tested in order to assess and acquire the best estimation result. Tests based on training data sets generated from both a laboratory and an industrial environment with different noise levels showed the practicability and effectiveness of the system. Therefore, this method is robust and effective.

4.2.2. Feature level data fusion based on unsupervised learning

Intrusion Detection Systems (IDS) discriminate attacks and maintain system stability. However, the alerts detected by IDSs have many kinds of problems. They are large in scale or inferior in quality, which consumes many system resources and takes a long time to deal with. In some conditions, up to 99% of alerts detected by IDSs are false or repetitive. Many models have been provided to resolve these problems. Xiao et al. [25] proposed a hierarchical fusion system with four fusion layers to process alerts. Fig. 5 shows the architecture of the alert fusion model. After alert pretreatment, data sets first come into primary alert reduction. This module compares some important attributes, such as protocol type, source IP, target IP, and so on, of different alerts that arrive during a temporal window. When all attributes are the same in two alerts, which means the alerts are repetitive, these two alerts are first combined. Then, the alert verification module is responsible for validating the authenticity of alerts and eliminating false alerts. Alert verification compares alerts with regard to both the information of the alert itself and its target machine in order to achieve high fusion quality. This module periodically scans the protected network environment to gain high efficiency. Many false and irrelevant alerts are eliminated in this way and the burden on services can also be reduced. Next, fuzzy clustering methods are used to classify alerts, which mainly groups the alerts based on attack scenarios. The model groups the alerts into clusters by their target IP. Alerts with the same target IP are clustered into one group. Then, the fuzzy similarity matrix of each group is generated. At last, the alerts are divided by the fuzzy clustering model with the help of an appropriate threshold. Based on attack knowledge, alerts in the same class are correlated and attack scenarios are then constructed. Experiments based on two test data sets showed the performance in redundant alert reduction, so the quality of the system is good. Efficiency, Robustness and Privacy were not mentioned in the paper. Tests over two real-world datasets showed the stability of the system. Moreover, the system can work with any effective alert detection method, so it also has good extensibility.

4.3. Decision level data fusion

In order to further fuse some information that has already been generated to reveal some decisions of a task, we come to the highest level: decision level data fusion. We need not only the decision derived from


Fig. 5. Alert fusion model of [25].
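
The first stage of the alert fusion model captioned above (primary alert reduction, followed by grouping on target IP) can be sketched roughly as follows. This is a minimal illustration and not the implementation of [25]: the `Alert` fields, the 60-second window, the choice to measure the window from the first alert of a merged run, and the function names `reduce_alerts` and `group_by_target` are all assumptions made for the example. The verification and fuzzy clustering stages are not modeled.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert record; only the attributes named in the text are kept.
    timestamp: float  # seconds
    protocol: str
    source_ip: str
    target_ip: str


def reduce_alerts(alerts, window=60.0):
    """Merge alerts whose compared attributes are identical and which arrive
    within `window` seconds of the first alert of the same merged run."""
    merged = []      # list of [representative alert, number of merged alerts]
    last_seen = {}   # attribute key -> index of the latest run in `merged`
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.protocol, alert.source_ip, alert.target_ip)
        idx = last_seen.get(key)
        if idx is not None and alert.timestamp - merged[idx][0].timestamp <= window:
            merged[idx][1] += 1          # repetitive alert: fold into the run
        else:
            last_seen[key] = len(merged)
            merged.append([alert, 1])    # start a new run
    return [(alert, count) for alert, count in merged]


def group_by_target(merged):
    """Group reduced alerts by target IP, the pre-grouping used before clustering."""
    groups = defaultdict(list)
    for alert, count in merged:
        groups[alert.target_ip].append((alert, count))
    return dict(groups)
```

In this sketch the merge count is kept next to each representative alert so that later stages can still see how many raw alerts a reduced alert stands for.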

single perspective, but also the one with a global view. Thus, decision level fusion often appears right before final decisions are made. Compared to low-level fusion, decision fusion methods often generate a preliminary classification and can fuse different types of data and obtain accurate fusion results.

4.3.1. Decision level data fusion based on supervised learning

Bigdeli et al. [45] proposed a typical decision fusion model based on multiple SVMs and Naïve Bayes. Fusion of light detection and ranging (LIDAR) and hyperspectral data was discussed in the field of remote sensing data from multiple sensors. Fig. 6 shows the classifier fusion system proposed in this paper. Firstly, a set of features, which contain valuable information to distinguish objects in the next steps, are extracted from LIDAR data and hyperspectral data, respectively. After that, a one-against-one multi-class SVM method based on a radial basis function (RBF) kernel is utilized to classify the features captured in the previous phase. SVM classifiers are used in each feature space. At last, a classical fusion method, the Naïve Bayes model, fuses the outputs of the single classifiers. The authors used overall accuracy and the kappa coefficient as the evaluation metrics of model performance. The proposed model shows better results than the use of original LIDAR data, hyperspectral data or any other simple integrated model of these two kinds of data. Experimental results based on the data sets captured around the University of Houston campus from an official mapping organization showed that this method effectively maximizes the advantageous attributes of LIDAR and hyperspectral data through feature extraction, feature classification and decision fusion. A detailed fusion performance comparison and evaluation analysis were given in this paper, which is a marked advantage. However, Efficiency, Robustness, Extensibility, Privacy and Stability were not mentioned in this paper.

Fig. 6. Flowchart of the proposed fusion method in [45].

As a preliminary version of [47], Giorgio et al. provided a similar system of anomaly-based intrusion detection in [48]. It has multiple classifiers. Features in each traffic connection and data packet are subdivided into three groups (intrinsic features, traffic features and content features) based on the characteristics of the feature. Each feature set maps to a corresponding classifier. Feature sets can mostly describe normal and abnormal network patterns so that the classifiers can distinguish attack patterns by training with a large group of given data sets. The authors implemented a three-layer neural network as the classifier and applied five different fusion rules to verify system effectiveness. Results showed that the Multiple Classifier System provides a trade-off between detection rate, false alarm rate and generalization abilities compared to the approach with an individual classifier that deals with all extracted features. In addition, the A-Posteriori DCS fusion technique can provide the best overall performance in terms of false alarm rate, error rate and average cost.

4.3.2. Decision level data fusion based on unsupervised learning

Fessi et al. [46] proposed a data fusion model based on clustering for intrusion detection to resolve the weaknesses of some existing studies on clustering, such as the inability to detect composite attacks and constructive attacks, the neglect of efficiency and excessive human intervention. The architecture of the intrusion detection system is described in Fig. 7. It is a centralized system that contains sensors as observers to detect data sets, a global analyzer containing a data fusion component, a response module for activating actions and a database. A number of analyzers inside the global analyzer are set to detect different events about attacks with different methods based on misuse detection or anomaly detection. An efficiency factor of each analyzer is used to evaluate its accuracy, performance and robustness. Some partial decisions are made by a number of analyzers and are then sent to the fusion component to gain a global security view of the whole system. Both the events sniffed by the analyzers and the efficiency factor of each


Fig. 7. Fusion architecture discussed in [46].
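
To make the role of the per-analyzer efficiency factor in the architecture captioned above more concrete, the sketch below fuses partial decisions by an efficiency-weighted vote. Note the substitution: the model of [46] feeds events and efficiency factors into a clustering operation, whereas this example deliberately swaps in a much simpler weighted vote, so it should be read only as an illustration of how such a factor can weight partial decisions. The report format `(event_id, label, efficiency)` and the function name `fuse_partial_decisions` are hypothetical.

```python
from collections import defaultdict


def fuse_partial_decisions(reports):
    """Fuse partial decisions by an efficiency-weighted vote (a stand-in for
    the clustering step of [46], not a reproduction of it).

    reports: iterable of (event_id, label, efficiency) tuples, one per
             partial decision issued by an analyzer.
    Returns {event_id: (winning_label, share_of_total_weight)}.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for event_id, label, efficiency in reports:
        scores[event_id][label] += efficiency   # more efficient analyzers weigh more

    fused = {}
    for event_id, by_label in scores.items():
        winner = max(by_label, key=by_label.get)
        total = sum(by_label.values())
        share = by_label[winner] / total if total else 0.0
        fused[event_id] = (winner, share)
    return fused
```

The returned share gives a crude indication of how contested a fused decision was; a response module could, for example, act only on decisions above some threshold, which is again an assumption of this sketch rather than part of the surveyed design.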

analyzer are taken into account by the clustering operation. A data fusion clustering model partitions events from the analyzers into new clusters based on attack behaviors. As a whole, the adaptivity of this model to different attack scenes and composite attacks, which is mainly realized by the settings of the analyzers, is remarkable. The authors illustrated the function of the proposed algorithm with an example, but they did not set up any simulations or experiments to prove its performance. Moreover, other properties, such as Robustness, Stability and tests based on real-world data sets, were not mentioned.

Intrusion detection systems fall into two main categories, anomaly-based IDS and signature-based IDS. Anomaly-based IDSs model the normal network, and malicious behaviors can be detected because their features differ from the normal model. An important advantage of anomaly-based IDS is its ability to detect unknown intrusions, but its high false alarm rate cannot be ignored. Giacinto et al. [47] proposed a multiple modular system with one-class classifiers to implement anomaly-based detection. The authors divided network connections into groups according to the service of each connection. In other words, each group describes a set of similar packets in view of “service”. Three one-class classifier algorithms were applied to realize the classification. In each group, extracted features are classified and compared with the normal model, and then decisions from the classifiers are generated and fused into an overall conclusion. Another peculiarity of this system is that it subdivides the false alarm rate among distributed modules. Thus, people can adjust the similarity threshold in the detection, which further affects the detection rate. Experimental results showed that the multiple modular system can provide a higher detection rate than a single classifier that deals with all features. The experiments were based on the DARPA 1998 dataset, which is a popular real data set. However, Privacy was not discussed in this work although there is a strong need to protect the security and privacy of the data used in intrusion detection.

Clustering is also used for decision fusion in the last step of the fusion process [49]. Chen et al. proposed a deep learning-based nuclear power crack detection algorithm [49]. Nuclear power crack inspection is an important component of nuclear applications in case of incidents. Some vision-based crack detection algorithms were proposed, but there are still open issues in detecting tiny cracks and noisy patterns. This paper solved this problem with a Naïve Bayes and clustering-based fusion model. With the former modules’ crack detection results aggregated in tubelets, Naïve Bayes discards false positive tubelets and the clustering model groups the tubelets of a whole crack by Euclidean distance in order to make the final decision. This algorithm was tested with real crack datasets. Experiments showed its improved effectiveness compared with past methods. Thus, it has sound Quality and Stability. With the outstanding advantage in detecting robust and noisy patterns, this algorithm performs quite well regarding Robustness. Due to the computational consumption of some parts of the algorithm, Efficiency is not ideal, which becomes a part of future work. Other criteria were not included in this paper.

4.4. Comparison and discussion

In Section 4, we comprehensively review the existing works about machine learning for data fusion. To conclude, we compare all the models/methods/algorithms involved in this section in Table 1 with regard to their fusion types, application scenarios, applied machine learning methods, main challenges to overcome, and satisfaction of the proposed criteria. The notations used to evaluate the performance of data fusion are introduced below.

• Efficiency (Ef)
  - Yes (Y): The algorithm provides highly efficient data fusion or there are discussions on fusion efficiency in experiments and evaluation.
  - No (N): The algorithm does not promote efficiency or efficiency was not discussed.
• Quality (Q)
  - High (H): The algorithm improves quality as the main concern and provides detailed evaluation to prove its effectiveness or enough experiment results to show good data fusion quality.
  - Low (L): The algorithm intends to deal with low fusion quality. Nevertheless, performance analysis is too rough or experiment results are not adequate. Alternatively, there is no significant performance gain.
  - No (N): Fusion quality was not discussed or not obviously promoted.
• Stability (St)
  - Yes (Y): The algorithm performs well in a stable way, which is supported with experimental results.
  - No (N): The algorithm is not stable or this property was not concerned in the paper.
• Robustness (R)
  - Yes (Y): The algorithm performs well in a fluctuant environment with the support of experimental results, or robustness was only theoretically discussed.
  - No (N): The algorithm is not robust or this property was not concerned in the paper.
• Extensibility (Ex)
  - Yes (Y): The algorithm can be applied into other application scenarios theoretically or illustrated with experiments.
  - No (N): This property was not concerned.


• Privacy (P)
  - Yes (Y): The algorithm can ensure data security in data fusion, data privacy was taken into consideration, or this problem was concerned theoretically.
  - No (N): This problem was not considered in the study.
• Tested with real world data sets (Re)
  - Yes (Y): The proposed model was tested with data sets captured from real-world environments or in practice.
  - No (N): The data sets used in experiments were all simulated, the authors did not talk about the sources of the data sets, or no data-based experiments were provided at all.

Based on Table 1, we summarize our review as below.

Among all the studies reviewed in this section, the methods of signal level data fusion are distinctly dominant, accounting for nearly half of all the reviewed papers [26,28–41,51]. Some works fused features extracted from raw data to acquire better fusion quality [25,27,42–44,50,52]. In [45–49], researchers extracted information and fused decisions at a high level.

During the survey, we observe that the application environments of data fusion with machine learning are varied. Representative fusion scenarios include, but are not limited to, WSN systems [29,33,38,40,41], radar tracking and remote systems [31,36,45,51,52], intrusion detection [25,46–48], reputation generation [35], mechanical engineering scenarios [27,28,34], and so on. More and more machine learning-based fusion is needed in all kinds of fields. Most of the reviewed works solved the “data imperfection” problem in data fusion. Beyond that, some works applied in distributed systems and WSN address the location fusion problem with SVM and K-Means [29,38]. We hold the opinion that machine learning methods cannot solve all challenges of data fusion, such as data conflict, due to the limitations caused by their nature.

Data fusion models are based on many typical machine learning methods. Supervised learning methods such as SVM [28–30,42–45,47,50] and NN [27,32,34,40,41,48] were widely applied. Correspondingly, clustering models [25,31,33,35,37,39,46,49,51,52] and K-Means [26,36,38,47] were also adopted to improve fusion effectiveness and performance. SVM is good at dealing with high-dimensional data, while NN is more adept at learning from imperfect and uncertain data or when a system is difficult to describe with a linear formula. There is no direct relationship between fusion types and machine learning methods. Usually machine learning methods are good at handling classification problems during the fusion process.

Many of the data fusion models treat fusion quality as the most important requirement without any discussion of fusion efficiency [25,28,31,36,37,42,44–47,49,51,52]. Among the models that mainly concern fusion quality, most works provided a detailed performance evaluation to exhibit their significant improvement in quality. Experiments were usually performed to show the advantages of the proposed models by comparing them with the results of other previous models. However, fusion efficiency was paid little attention. In some existing signal level fusion models, efficiency was discussed in distributed fusion applications [29,33].

We also find that most fusion models perform well in terms of stability, which shows their strong ability to operate steadily in actual applications. However, few existing works were concerned with robustness and extensibility, and few experiments verified the performance of data fusion on these two aspects. In addition, few existing works considered the security of training sets, even in the field of intrusion detection. Security and privacy issues require urgent investigation in some specific data fusion fields. We also note that many existing works only focus on achieving a single research objective without comprehensively fulfilling all performance requirements and criteria.

Besides, more than half of the reviewed works evaluated the performance of their proposed models with data sets captured from real application environments. However, some experiments were conducted in simulated environments due to multiple reasons and the difficulties of testing in practice. Only a few works studied their models in reality and most of them are related to computer science. Some works did not even state the source of the data they used for experiments.

5. Open issues and future research directions

Based on the detailed survey reported in Section 4, we further indicate a number of open issues and suggest some future research directions.

5.1. Open issues

First, the machine learning methods used for data fusion lack diversity. As we discussed in Section 4.4, most of the machine learning models mentioned for data fusion are based on SVM, clustering and neural networks, which are classical methods and simple neural networks. SVM and clustering methods often aim at classifying with high accuracy. NN is suitable for describing uncertain complex systems. Nevertheless, the power of machine learning methods should go far beyond this. As one example, deep learning is considered a significant research field in artificial intelligence for the next 10 years. Deep learning describes the techniques that simulate the complex neural systems of humans. Compared with simple neural networks, more hidden layers inserted into the network would give the system better accuracy and learning quality. The lack of deep learning methods for data fusion motivates us to explore new ideas.

Second, researchers pay little attention to fusion efficiency. According to Table 1, past work focuses more on fusion quality than on fusion efficiency. Some works did not discuss or evaluate this important property at all. The most obvious disadvantage of machine learning methods is their computational complexity and huge consumption of computing and system resources. Machine learning often needs large sets of data for training, which also brings difficulty to actual applications. Since miniature devices, which cannot afford complicated computation due to limited resources, will bring many specific needs in the future, the study of optimizing the efficiency of data fusion models becomes necessary.

Third, a comprehensive treatment of data fusion is missing. Based on Table 1, few works discussed Robustness and Extensibility. Some works did not verify with experimental results whether their models are stable in an unsteady environment. These requirements should be fundamental for a fusion model. Some works consider little about the models’ effectiveness in practical use. Taking Robustness as an example, data with serious imprecision, inconsistency and noise often occur, and a model that cannot handle this circumstance well will be limited in practice. A similar argument applies to Extensibility. Simply improving data fusion accuracy and quality while ignoring other properties will lead to an imperfect model, so a comprehensive model that satisfies all expected criteria should be urgently studied.

Finally, few existing works take account of data privacy and security. Machine learning methods have a great need to deal with large-scale data sets to ensure learning quality and fusion accuracy. However, using original data in machine learning could cause leakage of sensitive information. This problem can be particularly acute in Internet-related applications such as intrusion detection, attack analysis, and location tracking. Private information about the identities and positions of data providers could be disclosed if the proposed model cannot manage it well.

5.2. Future research directions

Based on the open issues indicated above, we propose some potential future research directions.

The first direction is to explore more application scenarios for machine learning based data fusion. After the great development of machine learning for data fusion in recent decades, it is gratifying to see a wide range


of models applied to different scenarios, such as intrusion detection, target identification and tracking for military and nonmilitary utilization, human-computer interaction, navigation and geographic utilization, and so on. Moreover, there are many other application scenarios that are expected to use machine learning based data fusion methods. The strong ability of machine learning in nonlinear mapping provides additional opportunities for data fusion. Supervised learning models represented by SVM and Random Forest do well with high-dimensional data and their flexibility makes them suitable for solving more problems. ANN models are especially good at modeling multifarious nonlinear networks that are difficult to describe with functions directly. With a growing demand for IoT and smart devices, there are more industrial fields with numerous data sets that can be advanced by applying machine learning based data fusion methods.

Another future research direction is the use of more complex and large-scale learning techniques in data fusion. As discussed above, we place expectations on deep learning, which combines supervised learning and unsupervised learning to construct a learning hierarchy, namely the network. Especially in some scenarios that involve a large amount of data, deep learning can gain much better performance and prediction precision than past learning algorithms [63]. According to [4], some efficient models have appeared to deal with fusion problems with deep learning. In [53], a deep belief network based data fusion scheme was proposed for ball screw fault detection. Nevertheless, some challenges might be introduced at the same time. The effectiveness of deep learning can only be ensured with massive data and high resource consumption. How to ensure the applicability of deep learning based fusion models in small devices and how to make a trade-off between fusion efficiency and quality are additional issues that should be solved. Apart from the issues mentioned above, we are also looking forward to research on deep composite intelligent applications.

There is also a serious security need in fusion models. Information privacy urgently needs to be protected in both the fusion process and the machine learning process. Experiments involved in the above reviewed works are mostly performed with testing data sets. It will be extremely dangerous to transplant a model into actual use directly because of the exposure of all data sets. Without any security protection, sensitive information can be recovered and acquired from fusion results. Besides, a central device that performs fusion might become vulnerable to attacks. We need fallback or other solutions to make the model more practicable. Trustworthy data fusion with security and privacy protection is highly required.

a concise and comprehensive reference for researchers and practitioners in the field of machine learning for data fusion.

Declaration of competing interest

None.

Acknowledgments

This work is sponsored by the NSFC (grants 61672410, 61802293 and U1536202), Academy of Finland (grants 308087 and 314203), National Postdoctoral Program for Innovative Talents (grant BX20180238), the Project funded by China Postdoctoral Science Foundation (grant 2018M633461), the Fundamental Research Funds for the Central Universities (grant JB191504), the Shaanxi Innovation Team project (grant 2018TD-007), and the 111 project (grant B16037).

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.inffus.2019.12.001.

References

[1] D.L. Hall, J. Llinas, An introduction to multisensor data fusion, Proc. IEEE 85 (1) (1997) 6–23.
[2] C. Federico, A review of data fusion techniques, Sci. World J. (2013) 1–19.
[3] A.L. Samuel, Some studies in machine learning using the game of checkers. I, Comput. Games I (1988) 335–365.
[4] F. Alam, R. Mehmood, I. Katib, N.N. Albogami, A. Albeshri, Data fusion and IoT for smart ubiquitous environments: a survey, IEEE Access 5 (2018) 9533–9554.
[5] S. Gite, H. Agrawal, On context awareness for multisensor data fusion in IoT, Springer India 381 (2016) 85–93.
[6] I.M. Pires, N.M. Garcia, N. Pombo, F. Flórez-Revuelta, From data acquisition to data fusion: a comprehensive review and a roadmap for the identification of activities of daily living using mobile devices, Sensors 16 (2) (2016) 184.
[7] G. Navarro-Arribas, V. Torra, Information fusion in data privacy: a survey, Inf. Fusion 13 (4) (2012) 235–244.
[8] N. Faouzi, H. Leung, A. Kurian, Data fusion in intelligent transportation systems: progress and challenges – A survey, Inf. Fusion 12 (1) (2012) 4–10.
[9] I. Corona, G. Giacinto, C. Mazzariello, F. Roli, C. Sansone, Information fusion for computer security: state of the art and open issues, Inf. Fusion 10 (4) (2009) 274–284.
[10] J. Yao, V. Raghavan, Z. Wu, Web information fusion: a review of the state of the art, Inf. Fusion 9 (4) (2008) 446–449.
[11] S. Liao, et al., Data mining techniques and applications – a decade review from 2000 to 2011, Expert Syst. Appl. 39 (12) (2012).
[12] C. Rudin, K.L. Wagstaff, Machine learning for science and society, Mach. Learn. 95
At last, data fusion model performance evaluation should be more (1) (2014) 1–9.
well-founded. As mentioned in Section 4.4, there are some works that [13] J. Qiu, et al., A survey of machine learning for big data processing, EURASIP J. Adv.
did not testify their model with real data sets. Laere [54] studied the Signal Process. (1) (2016) 2016.
[14] Q. Zhang, et al., A survey on deep learning for big data, Inf. Fusion 42 (2018)
state and difficulties of information fusion performance evaluation in 146–157.
reality. The author took an overview of 52 data fusion publications, only [15] F.E. White, Data Fusion Lexicon, (1991).
6% works evaluate the model in real scenarios. Laere also explained the [16] Z. Yan, J. Liu, L.T. Yang, W. Pedrycz, Data fusion in heterogeneous networks, Inf.
Fusion 53 (2020) 1–3.
difficulties, which impede data fusion research, and gave suggestions. [17] C.L. Bowman and M.S. Murphy, Description of the VERAC NSource tracker/
Further research should improve the quality of model performance eval- correlator, Naval Res Lab. Report R-01O-80, (1980).
uation. A more holistic model evaluation should be conducted to prove [18] C.L. Bowman, C.L. Morefield, Multisensor fusion of target attributes and kinemat-
ics, in: Proceedings of Decision and Control including the Symposium on Adaptive
the effectiveness of data fusion based on machine learning. Processes, 1980 19th IEEE Conference on IEEE, 1981.
[19] R. Luo, M. Kay, Multisensor integration and fusion: issues and approaches, SPIE
Sensor Fusion 931 (1988) 42–49.
[20] B. Dasarathy, Sensor fusion potential exploitation-innovative architectures and il-
6. Conclusions lustrative applications, Proc. IEEE 85 (1) (1997) 24–38.
[21] Alan N Steinberg, C.L. Bowman, F.E. White, Revisions to the JDL data fusion model,
This paper has made a comprehensive review on the literature about Proc. SPIE - Int. Soc. Opt. Eng. 3719 (1999) 430–441.
[22] S. Ayed, H. Trichili, A.M. Alimi, Data fusion architectures: a survey and compari-
machine learning for data fusion. We first provided basic background
son, in: Proceedings of International Conference on Intelligent Systems Design and
knowledge about data fusion and machine learning. We further pro- Applications, 2016, pp. 277–282.
posed a number of criteria to evaluate the works reviewed in this paper [23] X. Jing, Z. Yan, P. Witold, security data collection and data analytics in the Internet:
a survey, IEEE Commun. Surv. Tutor. 21 (1) (2018) 586–618.
for the purpose of commenting their pros and cons remarkably. We care-
[24] B. Khaleghi, et al., Multisensor data fusion: a review of the state-of-the-art, Inf. Fu-
fully reviewed the recent literature based on the level of fusion taken sion 14 (1) (2013) 28–44.
apart in and the type of machine learning, and then used a table to sum- [25] S. Xiao, Y. Zhang, X. Liu, J. Gao, Alert fusion based on cluster and correlation anal-
marize our main review results. On the basis of our survey, we went ysis, in: Proceedings of International Conference on Convergence and Hybrid Infor-
mation Technology, 2008, pp. 163–168.
ahead to specify a number of open issues and proposed some future re- [26] X. Guo, D. Wang, F. Chen, An anomaly detection based on data fusion algorithm in
search directions that deserve further investigation. This study provides wireless sensor networks, Int. J. Distrib. Sens. Netw. (2015) 1–10 2015.


[27] N. Ghosh, et al., Estimation of tool wear during CNC milling using neural network-based sensor fusion, Mech. Syst. Signal Process. 21 (1) (2017) 466–479.
[28] T.P. Banerjee, S. Das, Multi-sensor data fusion using support vector machine for motor fault detection, Inf. Sci. 217 (24) (2012) 96–107.
[29] S. Challa, M. Palaniswami, A. Shilton, Distributed data fusion using support vector machines, Int. Conf. Inf. Fusion 2 (6) (2013) 881–885.
[30] M.S. Fahmy, Biometric fusion using enhanced SVM classification, in: Proceedings of International Conference on Intelligent Information Hiding and Multimedia Signal Processing IEEE, 2008, pp. 1043–1048.
[31] H. Shu, Y. Wang, J. Jiang, Multi-radar data fusion algorithm based on K-central clustering, in: Proceedings of International Conference on Fuzzy Systems and Knowledge Discovery, 2007, pp. 617–621.
[32] K. Kolanowski, A. Swietlika, R. Kapela, J. Pochmara, A. Rybarczyk, Multisensor data fusion using Elman neural networks, Appl. Math. Comput. 319 (2017) 236–244.
[33] L. Xiao, Q. Liu, A data fusion using un-even clustering for WSN, in: Proceedings of International Conference on Advanced Intelligence and Awareness Internet, 2012, pp. 216–219.
[34] W. Tong, B. Li, X. Jin, Y. Yang, Q. Zhang, A study on model of multisensor information fusion and its application, in: Proceedings of International Conference on Machine Learning and Cybernetics, 2006, pp. 3073–3077.
[35] Z. Yan, X. Jing, W. Pedrycz, Fusing and mining opinions for reputation generation, Inf. Fusion 36 (2017) 172–184.
[36] H. Wang, et al., An algorithm based on hierarchical clustering for multi-target tracking of multi-sensor data fusion, in: Proceedings of Control Conference, IEEE, 2016, pp. 5106–5111.
[37] M.M. Alyannezhadi, A.A. Pouyan, V. Abolghasemi, An efficient algorithm for multisensory data fusion under uncertainty condition, J. Electric. Syst. Inf. Technol. (2016).
[38] F. Wang, et al., An improved fusion method of fuzzy logic based on K-means clustering in WSN, J. North Univ. China 35 (6) (2014) 699–703.
[39] Z. Li, X. Wang, High resolution radar data fusion based on clustering algorithm, in: Proceedings of IEEE International Workshop on Database Technology and Applications, 2010, pp. 1–4.
[40] J. Tan, H. Gan, WSN data fusion scheme based on improved BP neural network, J. Residual. Sci. Technol. 13 (7) (2016).
[41] S. Li, et al., WSN data fusion approach based on improved BP algorithm and clustering protocol, in: Proceedings of 2015 27th Chinese Control and Decision Conference (CCDC) IEEE, 2015.
[42] Y. Cao, T.J. Huang, Y.H. Tian, A ranking SVM based fusion model for cross-media meta-search engine, Front. Inf. Technol. Electr. Eng. 11 (11) (2011) 903–910.
[43] A. Starzacher, B. Rinner, Embedded realtime feature fusion based on ANN, SVM and NBC, in: Proceedings of IEEE International Conference on Information Fusion, 2009, pp. 482–489.
[44] R. Pouteau, S. Benoît, SVM selective fusion (self) for multi-source classification of structurally complex tropical rainforest, IEEE J. Sel. Topic. Appl. Earth Observ. Remote Sens. 5 (4) (2012) 1203–1212.
[45] B. Bigdeli, F. Samadzadegan, P. Reinartz, A decision fusion method based on multiple support vector machine system for fusion of hyperspectral and LIDAR data, Int. J. Image Data Fusion 5 (3) (2014) 196–209.
[46] B.A. Fessi, S. BenAbdallah, Y. Djemaiel, N. Boudriga, A clustering data fusion method for intrusion detection system, in: Proceedings of 11th IEEE International Conference on Computer and Information Technology, 2011, pp. 539–545.
[47] G. Giacinto, R. Perdisc, M Del Rio, F. Roli, Intrusion detection in computer networks by a modular ensemble of one-class classifiers, Inf. Fusion 9 (1) (2008) 69–82.
[48] G. Giorgio, F. Roli, L. Didaci, Fusion of multiple classifiers for intrusion detection in computer networks, Pattern Recognit. Lett. 24 (12) (2003) 1795–1803.
[49] C. Fu, R. Mohammad, NB-CNN: deep learning-based crack detection using convolutional neural network and naïve Bayes data fusion, IEEE Trans. Ind. Electron. 65 (5) (2018) 4392–4400.
[50] Z. He, Accelerometer based gesture recognition using fusion features and SVM, J. Softw. 6 (6) (2011) 1042–1049.
[51] H. Shu, The application of cell-based clustering algorithm dealing with radar data fusion, in: Proceedings of 2008 Congress on Image and Signal Processing, 2008.
[52] D. Qiu, et al., The study of self-organizing clustering neural networks and applications in data fusion, in: Proceedings of IEEE World Congress on Intelligent Control & Automation, 2008.
[53] L. Zhang, H. Gao, A deep learning-based multi-sensor data fusion method for degradation monitoring of ball screws, in: Proceedings of Prognostics Syst. Health Manage. Conf. (PHM-Chengdu), 2016, pp. 1–6.
[54] J. Laere, Challenges for IF performance evaluation in practice, in: Proceedings of 12th International Conference on Information Fusion, 2009.
[55] D.L. Hall, J. Llinas, An introduction to multisensor data fusion, Proc. IEEE 85 (1) (2002) 6–23.
[56] E. Soltanmohammadi, M. Naraghi-Pour, Context-based unsupervised data fusion for decision making, in: Proceedings of International Conference on Machine Learning, 2015, pp. 2076–2084.
[57] K. Lin, T. Liu, H. Ge, A clustering hierarchy based on data fusion in wireless sensor networks, in: Proceedings of International Conference on Computational Intelligence & Software Engineering, 2009, pp. 1–4.
[58] L. Snidaro, J. Garcia, J. Llinas, Context-based information fusion: a survey and discussion, Inf. Fusion 25 (2015) 16–31.
[59] R. Nowak, R. Biedrzyck, J. Misiurewicz, Machine learning methods in data fusion systems, in: Proceedings of 19th International Radar Symposium, 2012, pp. 400–405.
[60] K. Julisch, Clustering intrusion detection alarms to support root cause analysis, ACM Trans. Inf. Sys. Secur. 6 (4) (2013) 443–471.
[61] C. Völker, P. Shokouhi, Data aggregation for improved honeycomb detection in concrete using machine learning–based algorithms, in: Proceedings of International Symposium Non-Destructive Testing in Civil Engineering, 2015, pp. 30–47.
[62] SB. Ayed, H. Trichili, AM. Alimi, Data fusion architectures: a survey and comparison, in: Proceedings of International Conference on Intelligent Systems Design & Applications, 2016, pp. 277–282.
[63] M. Gheisari, G. Wang, A survey on deep learning in big data, in: Proceedings of IEEE International Conference on Computational Science and Engineering, 2017, pp. 173–180.
[64] C.L. Bowman, C.L. Morefield, Multisensor fusion of target attributes and kinematics, in: Proceedings of Decision and Control including the Symposium on Adaptive Processes, 1980 19th IEEE Conference on IEEE, 1981.
[65] Gemma Piella, New quality measures for image fusion, in: Proceedings of the 7th International Conference on Information Fusion, 2004, pp. 542–546.
[66] Z. Wang, A.C. Bovik, A universal image quality index, IEEE Signal Process. Lett. 9 (3) (2002) 81–84.
[67] L.F. Pau, Sensor data fusion, J. Intell. Robot. Syst. (1988) 103–116.
[68] E. Wilfried, A review on system architectures for sensor fusion applications, Springer, 2007.
[69] Ren C. Luo, C.C. Yih, K.L. Su, Multisensor fusion and integration: approaches, applications, and future research directions, IEEE Sens. J. 2 (2) (2002) 107–119.
[70] T. Kanungo, D.M. Mount, et al., An efficient k-means clustering algorithm: analysis and implementation, IEEE Trans. Pattern Anal. Mach. Intell. 24 (7) (2002) 881–892.
[71] J. Matousek, On approximate geometric k-clustering, Disc. Comput. Geometry 24 (1) (2000) 61–84.
[72] S.S. Liu, L.F. Zhang, Z. Yan, Predict pairwise trust based on machine learning in online social networks: a survey, IEEE Access 6 (1) (2018) 51297–51318.
[73] L.F. Wei, W.Q. Luo, J. Weng, Y.J. Zhong, X.Q. Zhang, Z. Yan, Machine learning-based malicious application detection of android, IEEE Access 5 (1) (2017) 25591–25601.
[74] H.Q. Lin, G. Liu, Z. Yan, Detection of application-layer tunnels with rules and machine learning, in: The 12th International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage (SpaCCS2019), 2019, pp. 441–455.
[75] J.Z. Wang, Z. Yan, L.T. Yang, B.X. Huang, An approach to rank reviews by fusing and mining opinions based on review pertinence, Inf. Fusion 23 (2015) 3–15.
[76] W.X. Ding, X.Y. Jing, Z. Yan, L.T. Yang, A survey on data fusion in Internet of Things: towards secure and privacy-preserving fusion, Inf. Fusion 51 (2019) 129–144.
[77] X.Y. Jing, Z. Yan, X.Q. Liang, W. Pedrycz, Network traffic fusion and analysis against DDoS flooding attacks with a novel reversible sketch, Inf. Fusion 51 (2019) 100–113.
[78] G.Q. Li, Z. Yan, Y.L. Fu, H.L. Chen, Data Fusion for Network Intrusion Detection: A Review, Security and Communication Networks, 2018 2018.
[79] Z. Yan, J. Liu, L.T. Yang, N. Chawla, Big data fusion in Internet of Things, Inf. Fusion 40 (2018) 32–33.
[80] Z. Yan, J. Liu, A.V. Vasilakos, L.T. Yang, Trustworthy data fusion and mining in Internet of Things, Future Generat. Comput. Syst. 49 (2015) 45–46.
[81] J. Liu, Z. Yan, L.T. Yang, Fusion - an aide to data mining in Internet of Things, Inf. Fusion 23 (2015) 1–2.
[82] X.Y. Jing, J.J. Zhao, Q.H. Zheng, Z. Yan, W. Pedrycz, A reversible sketch-based method for detecting and mitigating amplification attacks, J. Netw. Comput. Appl. 142 (2019) 15–24.
