Fault Detection in Wireless Sensor Networks Through SVM Classifier

The document discusses fault detection in wireless sensor networks using support vector machines (SVM) classification. It describes different types of faults that can occur in wireless sensor networks, including hardware, software, communication and data faults. The paper then presents SVM as a machine learning technique for fault detection and classification in wireless sensor networks and compares it to other recent techniques.


This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JSEN.2017.2771226, IEEE Sensors
Journal

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JUNE 2017 1

Fault detection in Wireless Sensor Networks through SVM classifier
Salah Zidi, Tarek Moulahi, and Bechir Alaya

S. Zidi is with the Department of Management Information System, CBE, Qassim University, e-mail: [email protected]
T. Moulahi and B. Alaya are with Qassim University.
Manuscript received April 17, 2017; revised May 20, 2017.

Abstract: Wireless Sensor Networks (WSNs) are prone to many failures, such as hardware failures, software failures, and communication failures. Fault detection in WSNs is a challenging problem due to the limited resources of sensors and the variety of deployment fields. Furthermore, detection has to be precise to avoid negative alerts, and rapid to limit loss. The use of machine learning seems to be one of the most convenient solutions for detecting failures in WSNs. In this paper, the Support Vector Machines (SVM) classification method is used for this purpose. Based on statistical learning theory, SVM is used in our context to define a decision function. Being light in terms of required resources, this decision function can easily be executed at cluster heads to detect anomalous sensors. The effectiveness of SVM for fault detection in WSNs is shown through an experimental study comparing it to the latest techniques for the same application.

Keywords: WSNs, fault detection, machine learning, SVM, classification.

I. INTRODUCTION

Wireless Sensor Networks (WSNs) are sets of autonomous devices collaborating through wireless channels. In the last decade, WSNs have attracted the attention of the research community as well as industry. This is due to their capability to collect, process and communicate data smartly, and also to their low cost and wide range of applications [1]. This type of network is an interface between the physical and digital worlds. Sensors collect data from the fields where they are deployed and send it back to the sink node. The disadvantages of sensors are their constraints in energy, storage and processing capacity.

In most cases sensors are deployed in unmonitored or hazardous fields, such as forests, highways and volcanoes [1]. In addition, the sensor, as an electronic device, is susceptible to breakdown. WSNs are prone to many failures, which can be classified into three types [2]:
• Hardware failures,
• Software failures,
• Communication failures.
Hardware failures may happen due to a problem in the sensor hardware units: sensing unit, power unit, location unit, or processing unit. Software failures can occur due to a problem in sensor programs. Communication failures can happen due to problems in the sensor transceiver.

Faults in WSNs can also be classified according to the data sent by the sensor [2], e.g.:
• Offset fault: occurs when a constant is added to the expected data, which can be due to bad calibration of the sensing unit.
• Gain fault: occurs when, over a period of time, the change rate of sensed data differs from the expectation.
• Stuck-at fault: occurs when the variation of the sensed data series is zero.
• Out of bounds: occurs when sensed data values are outside the bounds of normal operation.
Many other faults regarding sensed data can occur, such as data loss, aggregation error and calibration fault. Battery failure is also one of the important causes of error [2]. This failure can lead to a malfunctioning of sensors, i.e., of the whole network.

On one hand, faults linked to data can occur simultaneously or separately. On the other hand, they can happen continuously over a period of time or instantly. It is more difficult to deal with simultaneous and instant occurrences of faults. In this paper, we introduce another type of fault, not previously taken into consideration, which we call random fault. It includes these cases of fault occurrence.

In light of the above, we may conclude that discovering failures is of major importance to guarantee the normal functioning of WSNs. Fault detection in WSNs can be considered a challenging problem due to the characteristics of sensors and the fields where they are deployed. Moreover, the detection of anomalies, as in other domains, should be rapid, to limit loss, and precise, to distinguish between normal and faulty status.

To deal with WSN failures and errors, many research efforts have been made. The proposed techniques are either centralized [3], distributed [4] or hybrid [5]. They are based on statistics [6], on neighbors [7], on self-detection [8] or on machine learning [9]. These techniques are discussed further in the related work section.

The use of machine learning seems to be one of the most convenient solutions for detecting faults in WSNs. As a data mining technique, classification is the most adequate and most used technique for decision making assistance, and it is applied in automatic systems diagnosis [10], [11]. These approaches consist of algorithmic categorizations of objects or data. Moreover, data learning algorithms make it possible to learn automatically, to recognize complex models, and to make intelligent decisions. According to the information available on the data, there are three classes of learning techniques:
• Supervised learning: The classes are predetermined and the examples are known. The classification is based on the labeled data.

1558-1748 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

• Unsupervised learning: Also called clustering, this learning is unsupervised because it processes unlabeled data. It is used to classify a set of objects without prior expertise.
• Semi-supervised learning: This type of learning uses both labeled and unlabeled data.

In this research, the SVM technique is applied to classify received sensor data and to detect faults based on kernel functions. Used for other complex problems such as medical risk management [23], [19], automatic system identification [21], and image processing [20], SVM has presented attractive results, mainly for multidimensional data. This technique is discussed in detail in Section 4.

The rest of this paper is organized as follows: Section 2 is a literature review regarding diagnostics and fault detection in WSNs. Section 3 describes the problem statement. In Section 4, the proposed solution is outlined. The validation of our contribution is performed in Section 5. Finally, Section 6 is a recap of this paper.

II. RELATED WORK

As mentioned in the introduction, many research efforts have been made to deal with fault detection in WSNs. The proposed techniques are either centralized [3], distributed [4] or hybrid [5]. They are based on statistics [6], on neighbors [7], on self-detection [8] or on machine learning [9].

In this section, the most important contributions proposed by the research community concerning fault detection in WSNs are outlined and discussed. This literature review is based mainly on a recent and important published survey on this topic [2].

In [12], the authors propose a centralized technique to detect faults in WSNs. This technique is based on a statistical approach and uses Hidden Markov Models (HMMs). As a supervised machine learning proposition, the collected data was divided into two sets: a training set and a test set. The proposed method performed well in real scenarios where faults are essentially offset faults, stuck-at faults and gain faults.

A distributed fault detection scheme based on recurrent neural networks has been proposed in [13]. This scheme is built on spatially organized distributed echo state networks (SODESN) [14]. As a distributed technique, the fault detection is based on the collaborative work of many sensors. Before the detection phase, a spatio-temporal correlation is performed between different sensors. Next, this learning model is used for detecting failed sensors. The authors show that SODESN performs well even in the case of multiple faults. This result has been confirmed by testing SODESN with real scenarios.

The naive Bayes classifier is a probabilistic classifier in machine learning. This technique is used for determining classifiers. As a data mining tool, the Bayes classifier has been used to detect anomalous sensor nodes and distinguish them from normal ones [5], [15], [16]. In [16], the authors make use of this type of tool to deal with failures in WSNs. The proposed technique is called Fault Detection Scheme (FDS). FDS works at two levels: the first is in the node itself; the second is performed at a higher level, i.e., cluster heads or gateways. FDS is evaluated through a simulation study and compared to Fault Detection in Wireless Sensor Networks (FDWSN) [17]. FDS outperforms FDWSN in terms of detection accuracy and false alarm rate. The main advantage of FDS is that it simultaneously considers the sensed data and the remaining energy in nodes. This makes the decision more realistic, although FDS validation was performed only by simulation.

Cloud-based techniques are another type of fault detection in WSNs. This type of solution exploits the cloud in order to cope with the limited resources of sensors. The basic idea of this approach is to transmit collected data to cloud storage. Next, MapReduce is used for parallelizing the fault detection task. This significantly decreases the time of fault detection [2]. The cloud approach has been used in [18]. Indeed, the authors coupled the big sensor data sets provided by gathering with the advantages of the cloud, especially its computation potential, massive storage, and software services. The main aim of the proposed technique is to detect errors in data as well as anomalous nodes quickly. Data gathering is performed first in the sensor node, and the data is next collected at each cluster. Finally, this data is sent to the base station, which then sends it to the cloud. The detection technique is performed in two phases at the cloud. Error detection is done in the first phase depending on three inputs: (1) the graph of the WSN, (2) the fault patterns, and (3) the collected data. In the second phase, the location of the error is determined through an error localization algorithm.

To be efficient, fault detection has to be fast. In addition, it is more useful if the detection happens beforehand, i.e., by predicting faults. In the context of WSNs, there are some particularities that complicate this task:
• Energy and resource limitation: On one hand, proposed algorithms must not be greedy in memory and computational complexity. On the other hand, these algorithms have to detect the failure quickly.
• The sensors can be deployed in different environments. Therefore, the operating conditions, sensor outputs and failure types change according to the environment.
• Non-stationary data: The material states of the sensors and other network components change from one year to the next. Accordingly, normal operating intervals can also change.
For the first point, it is difficult to satisfy the two constraints at once. Therefore, the appropriate solution is to predict the failure, so that, even if the algorithm is not fast enough, there will be sufficient time to react. The last two points lead to a dynamic operating environment. Consequently, it is recommended to adapt the proposed solution to dynamic situations (non-stationary data and operating conditions).

The techniques outlined previously in this section do not sufficiently satisfy the constraints of WSNs. Therefore, it is recommended to use new data analysis approaches to detect failures while taking into consideration the particularities of WSNs. By comparing our problem with other industrial identification [21], diagnosis [22], and risk management problems [23], regression methods and dynamic classification approaches seem to be able to overcome these constraints. Indeed, classification approaches based on data learning and statistical learning


techniques can quickly detect the data belonging to the classes. Furthermore, these techniques can be adapted to dynamic classes.

In the rest of this paper, our contribution is compared mainly to the most recent research works. These works, outlined and discussed previously in this section, are HMM [12], SODESN [14], Bayes [16] and Cloud [18].

III. PROBLEM STATEMENT

This section is divided into two subsections. First, a fault taxonomy according to the gathered data is presented. Next, the problem specificities are outlined.

A. Fault taxonomy in WSNs

As presented in the introduction, there are several causes of faults in WSNs. These causes are either linked to the collected data or due to the sensor system functionality [2]. We assume that the second type of cause can be included in the first type. Indeed, a calibration fault, battery failure or hardware failure can affect the quality and the nature of sensed data. A description of faults is therefore outlined according to the collected data.

In what follows, gathered data is modeled by the triplet d(n, t, f(t)), where f(t) is the data sensed by the node n during the time t. f(t) can be modeled by the expression α + βx + η, where α is the offset, β is the gain, x is the non-faulty data gathered by the node at t, and η describes the noise in the data [2].

1) Offset fault: This fault occurs when a constant is added to the expected data, which can happen due to bad calibration of the sensing unit. This fault can be modeled by equation 1.

$$x' = \alpha + x + \eta \tag{1}$$

Where: α is a constant value added to the normal reading.

2) Gain fault: This fault occurs when, over a period of time, the change rate of sensed data differs from the expectation. A gain fault can be modeled by equation 2.

$$x' = \beta x + \eta \tag{2}$$

Where: β is a constant value multiplying the normal reading.

3) Stuck-at fault: This fault occurs when the variation of the sensed data series is zero. It can be either transient or persistent. A stuck-at fault can be modeled by equation 3.

$$x' = \alpha \tag{3}$$

Where: α is the sensed value.

4) Out of bounds fault: Let [θ1, θ2] be an interval describing the possible normal values of a type of data. An out of bounds fault happens when a sensed data value x' ∈ f(t) is such that x' < θ1 or x' > θ2.

5) Data loss fault: This type of fault can be simply described as "the missing of data during a time series for a node". This means that the sensed data is a null value.

6) Random fault: To the best of our knowledge, this type of fault has never been treated in previous research. We introduce the random fault as an instant error, where data is perturbed for an instant. It can be defined as several fast negative or positive peaks which can affect the data of one or more sensors. As shown in Fig. 1, these perturbations are very fast. In other words, this perturbation corresponds to one of the previously described faults occurring at random. The performance of previous techniques is measured separately for each type of fault. In addition to doing the same in this paper, we also measure the performance of our technique in the case of an unknown fault among the previous list.

Fig. 1. Illustration of random fault: Collected data over time.

B. Challenges of fault detection in WSNs

Fault detection in WSNs is a challenging problem for many reasons:
• The limited resources of sensor nodes do not support the use of ordinary techniques, especially those with expensive computation.
• Sensors can be deployed in hazardous and varied types of fields, e.g., indoor, forests, highways, volcanoes.
• The detection has to be precise. Indeed, the detection should tell the difference between normal functioning and faulty status, e.g., the detection method should distinguish between the presence of fire and a gain fault.
• The detection should also be rapid to limit the loss, e.g., receiving wrong data from a faulty sensor can lead to a bad result.

To deal with this challenging problem, classification methods can be used. Classification appears to be an appropriate technique for decision making. In the next section, an overview of statistical learning as a classification approach is presented before giving our contribution.

IV. STATISTICAL LEARNING FOR FAULT DETECTION

In this section, an overview of statistical learning theory is given, and the proposed scheme is presented.

A. Overview of statistical learning theory

Statistical learning theory is based on empirical risk minimization for conceptual model study. Proposed by Vapnik [26]


as a statistical inference problem based on a finite number of observations, it has been used for supervised binary classification.

Vapnik proposed a region separation algorithm called SVM (Support Vector Machines). It consists in finding the optimal hyperplane that separates the data of two classes.

The principle of this technique consists in defining a decision function f : X → {−1, 1}, given a simple set of data {(x_i, y_i); x_i ∈ X and y_i ∈ {−1, 1}}. For each new point x ∈ X, this decision function predicts the class it belongs to ((−1) or (+1)). This decision is made by minimizing the structural risk, which can be estimated by the empirical risk.

1) Case of linear classification: Suppose that we have the following empirical data (x_1, y_1) . . . (x_i, y_i) . . . (x_m, y_m) ∈ R × {±1}. In the case of linear classification, the SVM algorithm computes a hyperplane that best separates the samples of the two classes. In that case, the function f is linear in x_i with the following general form: f(x_i) = ⟨w, x_i⟩ + b. As shown in Fig. 2, there is an infinity of hyperplanes that can separate these data. However, only one of them is optimal, which is the hyperplane that passes in the middle. This hyperplane satisfies the following condition:

$$y_i(\langle w, x_i \rangle + b) \geq 1 \quad \text{for } i = 1 \ldots m.$$

Fig. 2. Linear separation

The optimization problem can be described as follows: min_{x∈X} f(x) under the constraint g_i(x) ≤ 0. To solve this problem, the Lagrangian method can be used. The dual problem then becomes:

$$\begin{cases} \max_{\alpha_i} \sum_{i}^{L} \alpha_i - \frac{1}{2} \sum_{i,j}^{L} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle \\ \alpha_i \geq 0 \\ \sum_{i}^{L} \alpha_i y_i = 0 \end{cases} \tag{4}$$

Where:
α_i are the Lagrangian parameters,
α_i ≠ 0 for y_i(⟨w, x_i⟩ + b) = 1,
α_i = 0 for y_i(⟨w, x_i⟩ + b) > 1.

So, we obtain the hyperplane weight vector w = Σ_{i=1}^{L} α_i y_i x_i and the offset b = 1 − ⟨w, x_i⟩, where x_i is a support vector of the known class (here its class is 1), and we obtain the hyperplane function:

$$f(x) = \langle w, x \rangle + b = \sum_{i \in SV} \alpha_i y_i \langle x_i, x \rangle + b \tag{5}$$

2) Case of nonlinear classification: In the case of nonlinear classification, the separating hyperplane of the previous section is not valid. Thus, non-linear SVM has to be applied. The basic idea is to find a higher-dimensional space where the projections of the examples are linearly separable (as presented in Fig. 3). This space is a Hilbert space H based on a scalar product that can be replaced by a kernel function of the starting space (the space of observations).

We suppose:

$$\phi : \mathbb{R}^P \to H ; \quad x_i \mapsto \phi(x_i)$$

By replacing the scalar product ⟨φ(x_i), φ(x_j)⟩ by a kernel function K(x_i, x_j), the optimization problem becomes:

$$\begin{cases} \max_{\alpha_i} \sum_{i}^{L} \alpha_i - \frac{1}{2} \sum_{i,j}^{L} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \\ \sum_{i}^{L} \alpha_i y_i = 0 \\ C \geq \alpha_i \geq 0 \end{cases} \tag{6}$$

Where C is the tolerance constant. The decision function will be:

$$f(x) = \langle w, x \rangle + b = \sum_{i \in SV} \alpha_i y_i K(x_i, x) + b \tag{7}$$

In our case, the Gaussian kernel function has been used, as follows:

$$K(x, x') = e^{-\frac{\|x - x'\|}{2\sigma^2}} \tag{8}$$

Fig. 3. Non-linear separation

B. Proposed scheme

The proposed solution is realized in two phases. The first phase is performed in anticipated time while the second one is performed in real time.

1) Phase 1: Anticipated time

Classification is based on data learning. This makes it possible to use expertise in making decisions. In our case of WSN fault detection, the data is the most important element in the process. It can give the knowledge needed to resolve different problems affecting WSNs. We need to use this expertise to deal with the challenges outlined in the previous section. Therefore, a learning phase is completed as described in Fig. 4. This phase is realized in the anticipated time. The aim is to establish


a decision function that can be used in real time to classify any new observation (data). Our proposed data learning solution is based on the statistical learning technique. In the learning phase, the Lagrangian coefficients are determined. Consequently, the support vectors and the decision function can be calculated. Next, this decision function (or separation function) is deployed in the cluster head for classifying data. A labeled dataset is used as a learning database. It is composed of a set of normal data and a set of faulty data.

Fig. 4. Learning System

2) Phase 2: Real time

For each new data measurement V_t, a new observation vector is constructed by a data preparation block. This observation vector is composed of the three last data measurements (V_t, V_{t−1}, V_{t−2}). We do not need a data extraction block for our fault detection problem. We use only three successive measurements for 2 sensors, and SVM is an appropriate technique for multidimensional classification. Then, the decision function is applied to this new observation. It belongs to the first class (normal functionality) if the result is positive; otherwise it is considered a faulty case. The algorithm used in the cluster head consists of a simple application of the decision function. So, the decision process is not computationally expensive. This makes our technique very efficient for sensors as resource-constrained nodes.

V. EXPERIMENTAL RESULTS

This section deals with the evaluation of our proposed fault detection technique. First, a description of the dataset used is given. Next, the results of our scheme are presented, analyzed and compared with other techniques.

A. Data set

The labeled WSN dataset that we have prepared consists of a set of sensor measurements into which we have injected different types of faults.

• Original dataset:
The labeled dataset used in this work is based on an existing dataset published in 2010 by researchers at the University of North Carolina at Greensboro [24]. Using TelosB motes, these researchers collected the data from a simple single-hop and a multi-hop WSN. Their data consists of humidity and temperature measurements taken every 5 seconds during 6 hours. For this original dataset, the researchers introduced an event in the network to collect different measures. They introduced steam of hot water to increase the humidity and the temperature. The aim of their research was to classify data into two classes: the normal data class and the class of anomalies.

• Prepared dataset:
For our dataset we have used only the outdoor data collected from the multi-hop WSN. We have developed a new approach of fault detection based on a classification and data learning method. To evaluate this approach, we have prepared our dataset, composed of a set of observations of dimension 12. Each vector contains measurements at 3 successive instants (t_0, t_1, t_2). For each instant, we consider 2 temperature measurements and 2 humidity measurements. Consequently, we prepared 4688 observations (data examples or vectors). Then, we introduced a set of random faults. With different rates of faults (50%, 40%, 30%, 20%, and 10%) and different types of data faults (offset, gain, stuck-at, out of bounds and random fault), we have prepared 21 datasets. Different values of β have been applied. Each dataset is divided into two parts: 2/3 of the observations (the first part) are used as a learning dataset and the remaining 1/3 of the observations are used for testing.

This labeled dataset can be used by researchers in different fields, such as machine learning and fault detection in WSNs.

B. Results

We searched experimentally for the optimal parameter values. The cross-validation method was applied. After several tests with different parameter values, the best learning rates were obtained with σ = 0.5 and C = 1. In the Matlab toolbox, the parameter C is called BoxContraintValue; its default value is 1. To evaluate our proposed technique, we compared it with the most recent fault detection techniques in WSNs. This comparison is based essentially on two metrics. The first metric is the Detection Accuracy (DA) [2], which is defined as below:

$$DA = \frac{\text{Number of faulty nodes detected}}{\text{Total number of faulty nodes}} \tag{9}$$

The second comparison metric is called False Positive Rate (FPR) [2], which is defined as below:

$$FPR = \frac{\text{Number of non-faulty nodes detected as faulty}}{\text{Total number of fault-free nodes}} \tag{10}$$

Finally, the comparison is also performed according to the use of sensor resources, i.e., the weight of each technique.

• Comparison according to DA
The DA of our proposed technique is the average of the DA values obtained after injecting each type of fault into our dataset. These faults, described previously, are: offset, stuck-at, out of bounds, gain and random fault. A comparison of DA according to the fault type is given in Fig. 9.

For the random fault, we have considered an out of bounds case which can randomly affect any sensor at any time. We do not apply successive faults. Only one fault is injected at a random instant. This type of fault is different from the other 4 faults. It is more difficult to detect any type of


instantaneous problem. We have injected 50% examples of random faults in our dataset for different sensors at different instants. For different out of bounds values, we had attractive learning results. The error rate did not exceed 1%. Table I summarizes the results for the different fault types.

Fig. 6, 7, and 8 show the comparison of our technique to others in terms of DA. To measure the results of our technique compared to other methods, we use the Hausdorff distance [25]. The Hausdorff distance, as a mathematical tool, provides the difference between two sets of data, as given in the following equation. These measurements, as well as the improvement ratio, are given in Table II.

$$d_H(X, Y) = \max\left\{ \sup_{x \in X} \inf_{y \in Y} d(x, y),\; \sup_{y \in Y} \inf_{x \in X} d(x, y) \right\}$$

Where:
X and Y are two non-empty subsets,
sup represents the supremum and inf the infimum.

Fig. 5. System fault detection

TABLE I. RESULTS OF DATA LEARNING RATE

Fault rate 50%
β               2        4        6        8        10
Offset fault    90.08%   99.90%   99.90%   100%     99.70%
Gain fault      99.60%   99.90%   100%     100%     100%
Stuck-at fault  99.80%
Out of bounds   100%
Random fault    99.50%

Fault rate 40%
β               2        4        6        8        10
Offset fault    93.60%   99.80%   99.80%   99.90%   99.40%
Gain fault      99.40%   99.60%   99.80%   99.80%   99.80%
Stuck-at fault  99.80%
Out of bounds   99.90%

Fault rate 30%
β               2        4        6        8        10
Offset fault    99.70%   99.80%   99.80%   99.40%   99.40%
Gain fault      99.30%   99.50%   99.80%   99.80%   99.80%
Stuck-at fault  99.90%
Out of bounds   99.90%

Fault rate 20%
β               2        4        6        8        10
Offset fault    99.80%   99.90%   100%     99.90%   99.90%
Gain fault      99.20%   99.50%   99.70%   99.80%   99.80%
Stuck-at fault  99.90%
Out of bounds   100%

Fault rate 10%
β               2        4        6        8        10
Offset fault    97.10%   99.90%   99.90%   99.80%   99.80%
Gain fault      99.70%   99.70%   99.70%   99.70%   99.80%
Stuck-at fault  100%
Out of bounds   99.90%

TABLE II. DA COMPARISON

            HMM      SODESN   Bayes    Cloud
SVM (dH)    12.08    32.69    8.75     54.31
SVM (%)     3.85     9.09     2.25     19.86

According to the results shown in Table II, which are deduced from Fig. 6, 7, and 8, we can see that our technique shows a significant improvement in terms of DA. This enhancement is 2.25% compared to Bayes and 19.86% compared to Cloud. This comparison is confirmed by the Hausdorff distance, which is respectively 8.75 and 54.31 compared to Bayes and Cloud. Our contribution is most significant for high fault probabilities (30% to 50%). However, it provides a DA similar to that of existing fault detection techniques for low fault probabilities (5% to 10%). This makes the SVM classifier method the most effective fault detection technique in WSNs when sensors are deployed in hazardous or unmonitored fields.

Fig. 6. Detection accuracy

When the rate (the number) of faults increases, there is more risk of having faulty data which closely resembles the data of normal functioning. In the field of classification and data analysis, these data are called bordering data. Often, these data are the cause of learning errors or recognition failures. For a learning base retrieved in a random way from the original database, this effect has even more influence on the results of classification and therefore also on the fault detection. For these reasons the detection rate decreases when the fault rate increases. Nevertheless, our statistical


learning approach has shown more resistance to this effect. It was able to improve detection in certain cases, and in all cases its accuracy did not drop as much as that of the other approaches.

Fig. 7. Detection accuracy of SVM compared to HMM and Bayes

Fig. 8. Detection accuracy of SVM

Fig. 9. Detection accuracy of SVM according to fault type

• Comparison according to FPR

In Fig. 10, a comparison of FPR between our technique (SVM), Bayes, Cloud, SODSEN and HMM is given. It is clear that SVM provides the best value of FPR. Indeed, the use of SVM yields a significant improvement in FPR compared to the others. This improvement ranges from 72% compared to Cloud to 95% compared to HMM.

Fig. 10. False positive rate

• Comparison according to sensor resource use

The use of SVM to detect faults in WSNs does not require any overhead on sensors. Indeed, the whole process is executed at the sink node, where there is no problem of resource limitation. After the decision function is established, it is sent from the sink node to the cluster head. We can therefore conclude that our technique, like the Cloud technique, makes only light use of sensor resources. In contrast, in the other techniques (Bayes, HMM and SODSEN), algorithms are executed at the cluster head and on the sensors to carry out fault detection. This makes our technique very efficient for sensors, which are resource-constrained nodes.

VI. CONCLUSION

In this paper, a classification approach has been proposed for fault detection in WSNs. Our proposed solution is based on the SVM technique. In addition to its proven performance in several areas, this technique is well suited to multidimensional data learning. By using kernel functions, this method adapts well to nonlinear classification cases such as our case of fault detection. Indeed, the random factor and the nonlinear distribution of data did not prevent the detection rates of our proposed solution from exceeding 99% in most cases.

This research work has been preceded by a dataset preparation. To prove these attractive results, our approach has been applied to real data extracted from a database already published and used by researchers. This database contains a set of sensor measurements. We have injected different types of faults into these data. We plan to publish the WSN fault detection dataset prepared and used in this paper. Compared with other approaches proposed for the same problem, the results of our solution are clearly more attractive. Based on the Hausdorff distance, we have presented, in the experimental results section, a significant improvement compared to


other methods. Classification techniques can also be applied to non-stationary or dynamic data, which can be very useful for preventing the occurrence of faults. In the same context of prevention, future work concerns a dynamic classification approach that follows sensor behavior through its data, with the aim of predicting faults as early as possible. Indeed, predicting faults prevents errors more effectively than discovering them after they happen.

ACKNOWLEDGMENT

The authors thank and acknowledge the scientific research deanship at Qassim University for their financial support during the academic year 2016/2017 under research grant reference number 1250-CBE-2016.

REFERENCES

[1] Jennifer Yick, Biswanath Mukherjee and Dipak Ghosal. Wireless sensor network survey, Computer Networks 52, 2292–2330, (2008).
[2] Muhammed, Thaha, and Riaz Ahmed Shaikh. An analysis of fault detection strategies in wireless sensor networks, Journal of Network and Computer Applications 78 (2017): 267-287.
[3] Panda, Rama Ranjan, Bhabani Sankar Gouda, and Trilochan Panigrahi. Efficient fault node detection algorithm for wireless sensor networks. High Performance Computing and Applications (ICHPCA), 2014 International Conference on. IEEE, 2014.
[4] Feng, Zhen, Jing Qi Fu, and Yang Wang. Weighted distributed fault detection for wireless sensor networks Based on the distance. Control Conference (CCC), 2014 33rd Chinese. IEEE, 2014.
[5] Titouna, Chafiq, Makhlouf Aliouat, and Mourad Gueroui. Outlier detection approach using bayes classifiers in wireless sensor networks. Wireless Personal Communications 85.3 (2015): 1009-1023.
[6] Jin, Xiaohang, et al. Kuiper test and autoregressive model-based approach for wireless sensor network fault diagnosis. Wireless Networks 21.3 (2015): 829-839.
[7] Yuvaraja, M., and M. Sabrigiriraj. Fault detection and recovery scheme for routing and lifetime enhancement in WSN. Wireless Networks (2015): 1-11.
[8] Panda, Meenakshi, and Pabitra Mohan Khilar. Energy efficient distributed fault identification algorithm in wireless sensor networks. Journal of Computer Networks and Communications 2014 (2014).
[9] Ghorbel, Oussama, et al. Distributed and efficient one-class outliers detection classifier in wireless sensors networks. International Conference on Wired/Wireless Internet Communication. Springer International Publishing, 2015.
[10] Theljani, Foued, et al. Convex hull based clustering algorithm. International Journal of Artificial Intelligence 10.S13 (2013): 51-70.
[11] Theljani, Foued, et al. Systems monitoring based on dynamic classification with SVDD. Systems, Signals & Devices (SSD), 2013 10th International Multi-Conference on. IEEE, 2013.
[12] Warriach, Ehsan Ullah, and Kenji Tei. Fault detection in wireless sensor networks: A machine learning approach. Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on. IEEE, 2013.
[13] Obst, Oliver. Distributed fault detection in sensor networks using a recurrent neural network. Neural Processing Letters 40.3 (2014): 261-273.
[14] Obst, Oliver. Distributed fault detection using a recurrent neural network. Proceedings of the 2009 International Conference on Information Processing in Sensor Networks. IEEE Computer Society, 2009.
[15] Lau, Bill CP, Eden WM Ma, and Tommy WS Chow. Probabilistic fault detector for wireless sensor network. Expert Systems with Applications 41.8 (2014): 3703-3711.
[16] Titouna, Chafiq, Makhlouf Aliouat, and Mourad Gueroui. FDS: fault detection scheme for wireless sensor networks. Wireless Personal Communications 86.2 (2016): 549-562.
[17] Lee, Myeong-Hyeon, and Yoon-Hwa Choi. Fault detection of wireless sensor networks. Computer Communications 31.14 (2008): 3469-3475.
[18] Yang, Chi, et al. A time efficient approach for detecting errors in big sensor data on cloud. IEEE Transactions on Parallel and Distributed Systems 26.2 (2015): 329-339.
[19] Salah Zidi, Bechir Alaya, Tarek Moulahi, & Lamri Laouamer. Formal Concept Analysis and Statistical learning theory for aiding detection and classification of epidemics. Wulfenia Journal, volume 24, issue 1, pp. 63-78 (2017).
[20] Huang, Y., Wu, D., Zhang, Z., Chen, H., & Chen, S. EMD-based pulsed TIG welding process porosity defect detection and defect diagnosis using GA-SVM. Journal of Materials Processing Technology, 239, 92-102 (2017).
[21] Tarhouni, M., Zidi, S., Laabidi, K., & Ksouri-Lahmari, M. (2012). Least squares support kernel machines (LS-SKM) for identification. International Journal of Modelling, Identification and Control, 17(1), 68-77.
[22] Theljani F, Laabidi K, Zidi S, Ksouri M. Tennessee Eastman Process diagnosis based on dynamic classification with SVDD. Journal of Dynamic Systems, Measurement, and Control. 2015 Sep 1;137(9):091006.
[23] Zidi S, Julien T, Mjirda A, Maaloul F. Textual extraction and classification for medical risk management: A new Risk Management Platform to manage undesired medical events. In Advanced Logistics and Transport (ICALT), 2015 4th International Conference on 2015 May 20 (pp. 235-239). IEEE.
[24] Shan Suthaharan, Mohammed Alzahrani, Sutharshan Rajasegarar, Christopher Leckie and Marimuthu Palaniswami, Labelled Data Collection for Anomaly Detection in Wireless Sensor Networks, in Proceedings of the Sixth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP 2010), Brisbane, Australia, Dec 2010.
[25] Tarek Moulahi, Sami Touil, Salem Nasri and Hervé Guyennet, Reliable relay-based broadcasting through formal concept analysis for WSNs. Security and Communication Networks 9.13 (2016): 2042-2050.
[26] Vapnik, Vladimir Naumovich. Statistical learning theory. Vol. 1. New York: Wiley, 1998.
