A Novel Data Aggregation Scheme Based On Self Organized Map For WSN

The paper presents a novel data aggregation scheme for wireless sensor networks (WSN) that utilizes a self-organized map (SOM) neural network to reduce redundant data and eliminate outliers, thereby enhancing network performance and energy efficiency. By employing cosine similarity for clustering based on data density and interquartile analysis for outlier removal, the proposed scheme significantly outperforms existing data aggregation methods in terms of data reduction rate, network lifetime, and energy efficiency. Extensive simulations demonstrate the effectiveness of the approach in optimizing data transmission and resource utilization within WSNs.

The Journal of Supercomputing

https://doi.org/10.1007/s11227-018-2642-9

A novel data aggregation scheme based on self-organized map for WSN

Ihsan Ullah1 · Hee Yong Youn2

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
Wireless sensor networks allow efficient data collection and transmission in IoT environments. Since such a network usually consists of a large number of sensor nodes, a significant amount of redundant data and outliers is generated, which deteriorates the network performance. In this paper, a novel data aggregation scheme is proposed which is based on the self-organized map neural network to reduce redundant data and eliminate outliers. In addition, cosine similarity is used to improve the clustering of sensor nodes based on the density and similarity of the data, and interquartile analysis is adopted to remove outliers. This significantly reduces the energy consumption and enhances the network performance. Extensive simulation with real datasets shows that the proposed scheme consistently outperforms the existing representative data aggregation schemes in terms of data reduction rate, network lifetime, and energy efficiency.

Keywords Data aggregation · Data clustering · Cosine similarity · SOM neural network · Network lifetime · Wireless sensor network

1 Introduction

A wireless sensor network (WSN) consists of a large number of sensor nodes and is widely used in various application domains. A WSN allows timely and accurate detection of critical events in a target area where access is usually limited. Unlike a wired network, a WSN is limited in network lifetime and in the communication range of the sensor nodes. To overcome these limitations, the sensor nodes need to collaborate with each other in querying the environment and then collect the data and transmit them to the base station (BS). The sensor nodes are usually distributed in high density to reliably monitor the target [1–3]. However, densely deployed sensor nodes generate redundant data, which leads to energy waste and network congestion. Moreover, the data may contain outliers caused by an unstable sensing or communication environment, which reduce the accuracy of the data. Redundancy and outliers eventually degrade the performance of the WSN, and thus proper data aggregation and removal of outliers are imperative.

* Ihsan Ullah
* Hee Yong Youn
1 Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Korea
2 College of Software, Sungkyunkwan University, Suwon, Korea

Data aggregation is the technique of consolidating the data in an energy-efficient way. It can greatly enhance the performance of a WSN in terms of energy efficiency and network throughput [4–6]. The data collected by adjacent sensors usually display high spatial and temporal correlation, and a significant amount of them are redundant [7–9]. Such data do not provide any new information but waste the network resources. Based on the spatial correlation model, sensor nodes which are spatially close to each other usually hold more similar data than distant ones. Therefore, clustering them based on data density and similarity is important for data aggregation. Numerous data aggregation techniques have been proposed to reduce the data collected from sensor nodes before they are transmitted to the sink [9–13], with data aggregation occurring at the cluster head (CH) [14–19]. Building on these existing schemes, the proposed scheme attempts to improve the clustering of sensor nodes and the aggregation of data using machine learning techniques. Data aggregation at the CH can be efficient only when similar data are grouped and processed together, and the performance and resource utilization improve if the outliers and redundant data are reduced [20, 21]. The performance is further enhanced if an efficient algorithm for dimensionality reduction of the data is embodied at the CH.
In this paper, a novel data aggregation scheme is therefore proposed to maximize the performance of WSN. The proposed scheme utilizes the density and similarity between the data of neighboring sensor nodes to construct the clusters, and the cosine similarity function is used to observe the data relation between the sensor nodes and consolidate the data at the CH. Also, interquartile (IRQ) analysis is used to remove the outliers and redundant data [22–25]. Subsequently, the data are processed using an unsupervised machine learning technique, the self-organized map (SOM), which effectively maps complex high-dimensional data onto low-dimensional ones [26, 27]. Due to its unsupervised nature, SOM is widely used for exploratory analysis and low-dimensional visualization of complex high-dimensional datasets [28–31]. The SOM algorithm is used to recluster the data after the outliers and redundant data are eliminated and before the data are forwarded to the BS. Computer simulation reveals that the proposed scheme considerably outperforms three representative data aggregation schemes with respect to data reduction rate, energy efficiency, network lifetime, number of live nodes, and clustering accuracy of the sensor nodes. The main contributions of the paper are summarized as follows.

• Clustering of sensor nodes based on data density and similarity is shown to be effective for data aggregation. In this paper, cosine similarity is used to observe the data relation between the sensor nodes, and interquartile (IRQ) analysis is adopted to remove outliers and redundant data.
• The efficiency of a two-phase data aggregation approach is identified, in which outliers and redundant data are eliminated first and SOM-based data reclustering is then carried out to convert the complex nonlinear high-dimensional data to two-dimensional ones.
• A Gaussian neighborhood function is used to indicate how strongly the neighbor neurons are connected around the winner during the learning process.

The rest of the paper is organized as follows: In Sect. 2, the work related to data aggregation for WSN is discussed. The proposed SOM-based data aggregation (SOMDA) scheme is presented in Sect. 3. Section 4 discusses the simulation results, and Sect. 5 concludes the paper.

2 Related work

The issue of data aggregation in WSN has been addressed by a number of researchers. The adaptive sampling approach (ASAP) [32] was developed to estimate the data similarity based on the relationship between the time series data of two sensor nodes without considering the magnitude. Here a minimum number of messages is used to collect the data, and the multivariate normal (MVN) predicts, at the base node, the value of the nodes not sampled. In [33], back-propagation networks data aggregation (BPNDA) was proposed to aggregate the data using neural networks. It adopts the low-energy adaptive clustering hierarchy (LEACH) [34] strategy to construct the clusters of sensor nodes, and a three-layer back-propagation (BP) neural network is used to eliminate the data redundancy and improve the accuracy of data. Here the input layer neurons are located at the cluster members, while the hidden layer neurons and the output layer neurons are located at the CH.
Liu et al. [35] proposed an approach called energy-efficient data collection (EEDC), which observes the spatial and temporal correlation between the sensed data using a piecewise linear approximation technique. By exploring the spatial correlation of the sensed data, it dynamically partitions the sensor nodes into clusters so that the sensors in the same cluster have similar surveillance time series. EEDC tries to minimize the frequency of data transmissions under a given bound on the estimation accuracy, and balances the energy consumption by introducing a randomized scheduling technique. In [36], the back-propagation network (BPN) technique was proposed for data aggregation in WSN. It efficiently reduces the dimension of data and improves the data aggregation process using multilayer perceptrons trained in a supervised manner. Through the BPN technology, the sink nodes handle various signal sources for data fusion in recognition and classification.
Efficient data collection aware of spatiotemporal correlation (EAST) [37] exploits the spatiotemporal correlation between the data and forwards imperative data to the BS through a short path. EAST clusters the member nodes by using a spatial correlation approach, while the representative nodes use the temporal suppression technique. Here data prediction decreases the network traffic by estimating the upcoming data from the current data log. In [38], autoregressive integrated moving average (ARIMA) was proposed to predict the data from the previous log of sensed data. In [39], a scheme called dual prediction framework (DPF) was proposed, which uses least mean square (LMS) to filter the data. It requires no prior model and allows the nodes to work independently, and the LMS-based prediction was hierarchically extended to perform a joint prediction over a block of readings from neighboring nodes. Redundancy elimination for accurate data aggregation (READA) [10] was proposed to exploit the spatial correlation between the data in WSN. It eliminates redundant data by applying a grouping and compression approach and determines the outliers using the least square extrapolation technique. In [11], a secure data aggregation technique named polynomial regression-based secure data aggregation (PRDA) was presented, which preserves the privacy of the sensor data during data aggregation. PRDA performs data aggregation using polynomial coefficients which represent the sensor data. We next present the proposed scheme.

3 The proposed scheme

In this section, the proposed SOMDA scheme for data aggregation is presented. It clusters the sensor nodes based on the density and similarity of their data, and filters the data at the CH using an unsupervised machine learning technique, the self-organizing map (SOM). SOM is employed to transform the high-dimensional data into low-dimensional data and remove the outliers and redundancy. SOMDA minimizes the energy consumption of sensor nodes and thus prolongs the lifetime of the WSN.

3.1 Design goals

The design goal of the proposed scheme is to efficiently aggregate and transmit the data to the BS so that the communication cost can be reduced. If all sensed data are transmitted to the BS for aggregation, network congestion will occur. With the proposed cluster-based aggregation, the sensed data are transmitted to the CHs, which aggregate them before forwarding them to the BS.
SOMDA operates in two phases. In the first phase, clusters are constructed based on the density and data similarity of the nodes by applying cosine similarity and data density correlation degree (DDCD) [15]. The node having the highest degree and enough energy announces itself as the CH. The nodes which hear the announcement are associated with the CH of the highest similarity. After the construction of the clusters, each CH identifies multiple paths towards the BS in the backbone and then selects the best path for data transmission. In the second phase, the SOM algorithm is used to recluster the data to convert the complex nonlinear high-dimensional data to two-dimensional ones. In the long run, this decreases the data transmission cost and energy consumption.

3.2 Operation of SOMDA

3.2.1 Clustering of sensor nodes

In the clustering of WSN, the sensor nodes are usually grouped according to a set of rules. Here the data density correlation degree (DDCD) technique [15] is extended with the cosine similarity function to construct clusters of sensor nodes that are similar in terms of data. The topology of the WSN with the proposed scheme is modeled by an undirected graph G = (S, E). Here S is a set of sensor nodes and E is a set of edges. With cluster-based data aggregation, each CH consolidates the data sent from its member nodes and forwards them to the sink (BS) in one hop or multiple hops as shown in Fig. 1.
Assume that sensor node s0 has n neighboring nodes, (s1, s2, …, sn), which are within the radio coverage of s0. The data object of s0 is D0, and those of the neighboring nodes are D1, D2, …, Dn. Among the n data objects, if there are N (≤ n) data objects whose distance to D0 is less than ε, then the data density correlation degree of s0 to the CH whose data objects are in the ε-neighborhood of D0 is defined as follows [15]:
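$$
sim(s) =
\begin{cases}
0, & N < minPts \\
a_1\left(1 - \dfrac{1}{\exp(N - minPts)}\right) + a_2\left(1 - \dfrac{d_\Delta}{\varepsilon}\right) + a_3\left(1 - \dfrac{d}{\varepsilon}\right), & N \ge minPts
\end{cases}
\tag{1}
$$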

Here minPts is the threshold of the number of neighbor nodes, ε is the data threshold, dΔ is the distance between D0 and the center of the data objects which are in the ε-neighborhood of D0, and d is the average distance between the N data objects and s0. a1, a2, a3 are the weights, and a1 + a2 + a3 = 1.

Fig. 1 Clustering of sensor nodes
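
The correlation degree can be sketched in a few lines of Python. This is only an illustration under stated assumptions: readings are scalars with absolute difference as the distance, the "center" of the ε-neighborhood is taken as the mean of the neighboring readings, the weights are equal placeholders, and the formula used is a1(1 − 1/exp(N − minPts)) + a2(1 − dΔ/ε) + a3(1 − d/ε) for N ≥ minPts. The function name `ddcd` is hypothetical.

```python
import math


def ddcd(d0, neighbors, eps, min_pts, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Data density correlation degree of a node with reading d0 (sketch)."""
    a1, a2, a3 = weights  # assumed equal weights, a1 + a2 + a3 = 1
    # Neighbor readings that fall inside the eps-neighborhood of d0.
    close = [x for x in neighbors if abs(x - d0) < eps]
    n = len(close)
    if n < min_pts or n == 0:
        return 0.0
    center = sum(close) / n                      # assumed: mean of close readings
    d_delta = abs(d0 - center)                   # distance from d0 to that center
    d_avg = sum(abs(x - d0) for x in close) / n  # average distance to d0
    return (a1 * (1 - 1 / math.exp(n - min_pts))
            + a2 * (1 - d_delta / eps)
            + a3 * (1 - d_avg / eps))


# Too few close neighbors gives degree 0; a dense, tight neighborhood scores higher.
sparse = ddcd(20.0, [20.1, 19.9, 25.0], eps=0.5, min_pts=3)
dense = ddcd(20.0, [20.0, 20.0, 20.0], eps=0.5, min_pts=3)
```

With these inputs the sparse node falls below the minPts threshold, while the dense node earns the full a2 and a3 terms because all neighbors coincide with its own reading.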
It is possible that a node has the same data similarity with two clusters, as q1 in Fig. 1, which makes it hard to decide the proper cluster. Therefore, the cosine similarity function is used to estimate the level of similarity between the data of a CH and the target node. Each node is associated with the CH whose cosine similarity with it is highest.
Let $A = [a_1, a_2, \cdots, a_n]$, $B = [b_1, b_2, \cdots, b_n]$, and $C = [c_1, c_2, \cdots, c_n]$ denote the datasets associated with CH1, CH3, and q1, respectively, as shown in Fig. 1. For this, the DDCD scheme [15] is extended by employing the cosine similarity function, cs(a, b), to accurately measure the similarity between two objects as follows [24].

$$
cs(A, C) = \frac{\sum_{i=1}^{n} (a_i \cdot c_i)}{\sqrt{\sum_{i=1}^{n} a_i^{2}}\,\sqrt{\sum_{i=1}^{n} c_i^{2}}} \tag{2}
$$

$$
cs(B, C) = \frac{\sum_{i=1}^{n} (b_i \cdot c_i)}{\sqrt{\sum_{i=1}^{n} b_i^{2}}\,\sqrt{\sum_{i=1}^{n} c_i^{2}}} \tag{3}
$$

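With normalization, the datasets in the above equations can be expressed as:

$$
A_{nr} = \left[\frac{a_1}{\sqrt{\sum_{i=1}^{n} a_i^{2}}},\ \frac{a_2}{\sqrt{\sum_{i=1}^{n} a_i^{2}}},\ \ldots,\ \frac{a_n}{\sqrt{\sum_{i=1}^{n} a_i^{2}}}\right] \tag{4}
$$

$$
B_{nr} = \left[\frac{b_1}{\sqrt{\sum_{i=1}^{n} b_i^{2}}},\ \frac{b_2}{\sqrt{\sum_{i=1}^{n} b_i^{2}}},\ \ldots,\ \frac{b_n}{\sqrt{\sum_{i=1}^{n} b_i^{2}}}\right] \tag{5}
$$

and

$$
C_{nr} = \left[\frac{c_1}{\sqrt{\sum_{i=1}^{n} c_i^{2}}},\ \frac{c_2}{\sqrt{\sum_{i=1}^{n} c_i^{2}}},\ \ldots,\ \frac{c_n}{\sqrt{\sum_{i=1}^{n} c_i^{2}}}\right] \tag{6}
$$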
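Equations (2) and (3) are rewritten as follows, where $a_{nr,i}$, $b_{nr,i}$, and $c_{nr,i}$ denote the ith elements of the normalized datasets:

$$
cs(A, C) = \sum_{i=1}^{n} (a_{nr,i} \cdot c_{nr,i}) \tag{7}
$$

$$
cs(B, C) = \sum_{i=1}^{n} (b_{nr,i} \cdot c_{nr,i}) \tag{8}
$$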

The highest similarity between cs(C, A) and cs(C, B) can be measured as:

$$
cs(High) = \max\left[cs(C, A),\ cs(C, B)\right] \tag{9}
$$
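
The cluster choice for an undecided node can be illustrated with a short Python sketch. The readings for CH1, CH3, and q1 are made-up values, and the function names and the zero-vector convention are assumptions for illustration rather than details from the scheme.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention for an all-zero vector
    return dot / (norm_u * norm_v)


def pick_cluster_head(node_data, ch_data):
    """Return the id of the CH whose data is most similar to node_data."""
    return max(ch_data, key=lambda ch: cosine_similarity(node_data, ch_data[ch]))


# Hypothetical readings for the two candidate CHs and the undecided node q1.
heads = {"CH1": [21.0, 40.0, 3.0], "CH3": [35.0, 10.0, 9.0]}
q1 = [20.5, 41.0, 2.8]
best = pick_cluster_head(q1, heads)
```

Here q1 is almost parallel to the CH1 readings, so it joins CH1 even if a density-based score alone could not separate the two clusters.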


3.2.2 Clustering of data

The proposed SOMDA scheme reclusters the collected data at each cluster. It consists of a two-layer neural network: input and output. All the received data of dimension d, represented as the input neural layer $X = (x_1, x_2, \ldots, x_d)^T$, are fully connected to the output neural layer $Y = (y_1, y_2, \ldots, y_m)$ as shown in Fig. 2. Here m is the order of the neural map in the output layer, and T represents the number of iterations of the learning process.
The synaptic weight vector, $W_k = (w_{k,1}, w_{k,2}, \ldots, w_{k,m})$, is the directed links between the input layer X and output layer Y, where $k \in \{1, 2, \ldots, m^2\}$ expresses the index of the kth node of the output layer. The training process of SOMDA iteratively updates the synaptic weights of the winner and its neighbor neurons. At each training step, a sample vector, $x_{i,d}$, is randomly selected from the input dataset. As training progresses, the algorithm calculates the Euclidean distance between every weight vector and the input vector $x_d$. The node whose weight vector is closest to the input vector is tagged as the best-matching unit (BMU), $j^{*}$.

$$
j^{*} = \min_{j}\left(\sqrt{\sum_{i=0}^{d}\left(x_i - w_{im}\right)^{2}}\right) \tag{10}
$$

Fig. 2 Diagram of the SOM network
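
A small Python sketch of the best-matching-unit search is given here for illustration. It assumes each output neuron holds a weight vector of the same dimension as the input and returns the index of the closest neuron (the arg min of the Euclidean distance); `best_matching_unit` and the sample weights are illustrative names and values.

```python
import math


def best_matching_unit(x, weights):
    """Index of the neuron whose weight vector is closest to x (Euclidean)."""
    def dist(w):
        return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))
    return min(range(len(weights)), key=lambda j: dist(weights[j]))


# Three hypothetical neurons with two-dimensional weight vectors.
weights = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.9]]
bmu = best_matching_unit([0.1, 1.0], weights)
```

The third neuron is selected because its weight vector lies closest to the input.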


To detect and remove outliers from the data, the interquartile (IRQ) analysis is used [25].

$$
IRQ = Q_3 - Q_1 \tag{11}
$$

where $Q_1$ and $Q_3$ represent the first quartile and third quartile, respectively, of the sample data as shown in Fig. 3. Before determining whether a data point is an outlier, the admissible range of the data is identified as follows. Here $U_r$ and $L_r$ denote the upper and lower bounds, respectively, used to identify the outliers.

$$
L_r = Q_1 - (1.5 \times IRQ) \tag{12}
$$

$$
U_r = Q_3 + (1.5 \times IRQ) \tag{13}
$$

Hypotheses:

$$
\emptyset : x_i < L_r \ \text{or}\ x_i > U_r
$$

$$
\partial : x_i = y_m
$$
If the hypothesis $\emptyset$ is accepted, then $x_i$ is an outlier. To measure the redundancy of an input data point, $x_i$, the cosine similarity function is used as follows.

$$
cs(X, Y) = \frac{\sum_{i=1}^{n}\sum_{k=1}^{m} (x_i \cdot y_{i+k})}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{k=1}^{m} y_k^{2}}} \tag{14}
$$

If $cs(x_i, y_m) = 1$, then hypothesis $\partial$ is accepted and data point $x_i$ is redundant. Consequently, the outliers and redundant data are deleted by the following equation.

$$
Y^{*} = Y - \emptyset - \partial \tag{15}
$$
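
The two filtering hypotheses can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: quartiles are taken from `statistics.quantiles` with its default method (other quartile conventions give slightly different bounds), redundancy is tested with a small tolerance around a cosine similarity of 1, and all function names and sample readings are assumptions.

```python
import math
import statistics


def iqr_bounds(values):
    """Lower and upper outlier bounds: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def remove_outliers(values):
    """Keep only the readings that lie inside the IQR bounds."""
    lower, upper = iqr_bounds(values)
    return [v for v in values if lower <= v <= upper]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def drop_redundant(vectors, tol=1e-9):
    """Keep a vector only if its cosine similarity to every kept one is below 1."""
    kept = []
    for v in vectors:
        if all(cosine_similarity(v, k) < 1 - tol for k in kept):
            kept.append(v)
    return kept


# Hypothetical temperature readings with one faulty value.
readings = [20.1, 20.3, 19.8, 20.0, 20.2, 35.0, 19.9, 20.1]
cleaned = remove_outliers(readings)
unique = drop_redundant([[1.0, 2.0], [2.0, 4.0], [2.0, 1.0]])
```

Note that a cosine similarity of 1 flags any two parallel vectors as redundant, so [1, 2] and [2, 4] collapse into one entry in this sketch.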
Subsequently, the winner node, $y_{j^{*}}$, is promoted by adjusting its matching weight, $w_{j^{*}}$, toward the nearest input vector, $x_i$. To bring the neurons closest to the input vector nearer to it, not only $w_{j^{*}}$ is adjusted but also the weights of all the nodes in the neighborhood of $y_{j^{*}}$. Furthermore, all the neurons close to each other are arranged in the two-dimensional grid as shown in Fig. 4.
The synaptic weight of each excited neuron is adjusted as follows.

$$
w_j(t + 1) = w_j(t) + \alpha(t) \cdot h_{ci}(t)\left[x_i - w_j(t)\right] \tag{16}
$$

Fig. 3 Diagram of the interquartile technique

Fig. 4 Grid representation of the neurons in SOM. a Rectangular grid, b hexagonal grid

where $\alpha$ and t represent the learning rate factor and the iteration of the training process, respectively. The Gaussian neighborhood function, $h_{ci}(t)$, indicates how strongly the neighbor neurons are connected around the winner during the learning process as shown in Fig. 4. It is specified as:

$$
h_{ci}(t) = \exp\left(-\frac{\lVert r_c - r_i \rVert^{2}}{2\sigma^{2}(t)}\right) \tag{17}
$$

where $r_c$ and $r_i$ denote the positions of the winner neuron_c and neuron_i on the SOM grid, and $\lVert r_c - r_i \rVert$ is the distance between them. The following is the procedure of the proposed SOMDA scheme.
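
As a rough illustration of the weight update and the Gaussian neighborhood function only (it is not the authors' SOMDA listing), a generic SOM training loop can be sketched in Python. The rectangular grid, the linear decay of the learning rate and neighborhood width, the random initialization, and all names and default values are assumptions.

```python
import math
import random


def gaussian_neighborhood(rc, ri, sigma):
    """h_ci = exp(-||rc - ri||^2 / (2 * sigma^2)) for grid positions rc and ri."""
    d2 = sum((a - b) ** 2 for a, b in zip(rc, ri))
    return math.exp(-d2 / (2 * sigma ** 2))


def train_som(data, rows, cols, iterations=200, alpha0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols SOM on data; returns grid positions and weights."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    for t in range(iterations):
        # Assumed schedules: linear decay of learning rate and neighborhood width.
        frac = 1 - t / iterations
        alpha = alpha0 * frac
        sigma = max(sigma0 * frac, 1e-3)
        x = rng.choice(data)  # randomly selected sample vector
        # Winner: neuron whose weight vector has the smallest distance to x.
        c = min(range(len(grid)),
                key=lambda j: sum((xi - wi) ** 2 for xi, wi in zip(x, weights[j])))
        for j, pos in enumerate(grid):
            h = gaussian_neighborhood(grid[c], pos, sigma)
            # w_j(t+1) = w_j(t) + alpha(t) * h_ci(t) * (x - w_j(t))
            weights[j] = [w + alpha * h * (xi - w) for w, xi in zip(weights[j], x)]
    return grid, weights


# Hypothetical two-dimensional data forming two groups.
samples = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
grid, weights = train_som(samples, rows=2, cols=2)
```

The winner receives the full update (h = 1), while neurons farther away on the grid are pulled toward the sample less strongly as the neighborhood narrows.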


Table 1 Specification of datasets

Datasets    Attributes    Instances
D1          14            9300
D2          04            131,500

Fig. 5 Simulation environments of the SOMDA. a D1, b D2

3.2.3 Energy efficiency

The proposed SOMDA scheme attempts to accomplish energy-efficient data aggregation using a machine learning technique. It substantially reduces the amount of data, which has a great impact on network traffic as well as energy efficiency. Here the first-order radio model [40] is adopted to model the energy consumption of a sensor node for receiving and transmitting data. Based on this model, a node needs 𝜀elec = 50 nJ/bit for running the circuitry and 𝜀amp = 100 pJ/bit/m2 for the transmitting amplifier. Therefore, the energy consumption for receiving one bit of data, er, is given by 𝜀elec. The energy consumption for transmitting one bit of data to a neighbor node, et, is given by 𝜀elec + 𝜀amp × d2, where d is the distance between the sender and receiver nodes; the squared distance follows from the per-m2 unit of 𝜀amp in the first-order radio model.
Assume that the data traffic from node_i to node_j per unit time is $f_{i,j}$. The energy consumption of node_i for receiving data from and transmitting data to node_j is $e_r(i, j)$ and $e_t(i, j)$, respectively, which can be expressed as follows [41].

$$
e_t(i, j) = \sum_{i,j \in V,\ j \in S_i} e_t \times f_{i,j} \tag{18}
$$

$$
e_r(i, j) = \sum_{i,j \in V,\ j \in S_i} e_r \times f_{i,j} \tag{19}
$$

Fig. 6 U-matrix view of the datasets showing the weight distances between the neighbor neurons. a D1, b D2

Fig. 7 Number of data points associated with each neuron and cluster. a D1, b D2

Fig. 8 Weight plan for each input vector of the datasets. a D1, b D2
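
The energy accounting can be illustrated with a small Python sketch. It assumes the usual form of the first-order radio model, in which the receive cost per bit is the electronics energy and the transmit cost per bit adds an amplifier term that scales with the squared distance (matching the pJ/bit/m2 unit of the amplifier constant). The link flows, distances, and function names are hypothetical.

```python
E_ELEC = 50e-9    # J per bit for running the circuitry (50 nJ/bit)
E_AMP = 100e-12   # J per bit per m^2 for the transmit amplifier (100 pJ/bit/m^2)


def rx_energy_per_bit():
    """Energy for receiving one bit."""
    return E_ELEC


def tx_energy_per_bit(distance_m):
    """Energy for transmitting one bit over distance_m meters.

    Assumed standard form: the amplifier cost grows with distance squared.
    """
    return E_ELEC + E_AMP * distance_m ** 2


def total_tx_energy(flows, distances):
    """Sum of per-bit transmit cost times traffic over all links (i, j)."""
    return sum(tx_energy_per_bit(distances[link]) * bits
               for link, bits in flows.items())


def total_rx_energy(flows):
    """Sum of per-bit receive cost times traffic over all links (i, j)."""
    return sum(rx_energy_per_bit() * bits for bits in flows.values())


# Hypothetical traffic (bits per unit time) and link lengths (meters).
flows = {(1, 2): 1000, (2, 3): 500}
distances = {(1, 2): 10.0, (2, 3): 20.0}
tx_joules = total_tx_energy(flows, distances)
rx_joules = total_rx_energy(flows)
```

Because the transmit cost is weighted by traffic, removing redundant readings and outliers before forwarding lowers both sums directly.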

4 Performance evaluation

In this section, the proposed scheme is evaluated by computer simulation. The simulation uses the MATLAB toolbox developed for SOM and a simulator for WSN clustering [42, 43] to evaluate the effectiveness of the proposed scheme in terms of data reduction rate, energy efficiency, and network lifetime. Table 1 gives the characteristics of the two datasets that are tested in the simulation [44, 45], and Fig. 5 depicts the setting of the SOMDA simulation with the two datasets.
Three representative data aggregation algorithms are compared with the proposed scheme: low-energy adaptive clustering hierarchy (LEACH) [34], perceptron neural network based data aggregation (PNNDA) [46], and polynomial regression-based secure data aggregation (PRDA) [11]. The simulation consists of two phases: clustering of the sensor nodes and data processing with SOM. In the cluster construction phase, the sensor nodes with the largest energy are selected as CHs, and then each node is associated with the CH having the maximum cosine similarity with it. In the second phase, the CH runs the SOM algorithm on the collected data to cluster and reduce the multidimensional data. Seven metrics are examined to evaluate the performance of the different data aggregation schemes: the elimination rate of the redundant and outlier data, data reduction rate, energy consumption of the sensor nodes, number of live nodes in each round, network lifetime, clustering accuracy, and data similarity accuracy. The simulation results are explained in the following.
In Fig. 6 the U-matrix views of the neurons are shown to identify the clusters of the data with the two datasets. The figures show the distance between each neuron and its neighbor neurons. The hexagons represent the neurons and the lines are the connections between the neighbor neurons. The darker the shade, the larger the distance between the neurons. Accordingly, a group of lighter segments surrounded by darker segments indicates that the data have been clustered, and the neurons in the lighter segments contain similar data.
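
How such a U-matrix is obtained from trained weights can be sketched in Python. For simplicity this sketch assumes a rectangular grid with four-neighbor connectivity and row-major weight order, whereas the figures use hexagonal cells; `u_matrix` and the toy weights are illustrative.

```python
import math


def u_matrix(weights, rows, cols):
    """Mean weight distance from each neuron to its grid neighbors.

    weights is a flat, row-major list of weight vectors (assumed layout).
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Up, down, left, right neighbors that lie inside the grid.
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= r + dr < rows and 0 <= c + dc < cols]
            if nbrs:
                total = sum(dist(weights[r * cols + c], weights[nr * cols + nc])
                            for nr, nc in nbrs)
                u[r][c] = total / len(nbrs)
    return u


# A 1 x 3 map: two similar neurons and one distant neuron.
u = u_matrix([[0.0], [0.0], [5.0]], rows=1, cols=3)
```

Large values mark boundaries between groups of neurons, which correspond to the darker segments described above.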
Figure 7 shows the data points associated with each neuron for the two datasets. The neurons adjacent to each other in the topology need to be close to each other in the input space so that the high-dimensional input space can be visualized in the two-dimensional topology. For example, the two neurons at the bottom right of Fig. 7a are far from the others, which means that the data points of the two neurons are very similar to each other compared to those of other neurons. Ideally, the data are equally distributed across the neurons. In Fig. 7a, more data are located on the middle neurons. Similarly, Fig. 7b shows the result of clustering with D2.

Fig. 9 Comparison of the amount of data removed

Fig. 10 Comparison of energy consumption with the datasets. a D1, b D2

Figure 8 shows the weight plan for each element of the input vector. Here the darkness represents the size of the weight. If the connection patterns of two inputs are similar, the input vectors are highly related to each other. Figure 8a represents the weight plan of the 14 vectors of D1. In this case, input vector_5, 12, 13, and 14 are more similar to each other than to the other input vectors, and input vector_3, 6, 10, and 11 look similar. Figure 8b presents the weight plan of the four input vectors of D2. Here vector_2 and 3 seem similar to each other.
Let $V = [v_1, v_2, \cdots, v_n]$ denote the input vectors. The cosine similarity function of Eqs. (2) and (3) is then used to measure the similarity between the vectors, which is in the range between 0 and 1.

$$
cs(v_i, v_{i+j}) = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} (v_i \cdot v_{i+j})}{\sqrt{\sum_{i=1}^{n} v_i^{2}}\,\sqrt{\sum_{i=1}^{n} v_{i+j}^{2}}} \quad (i, j = 1, 2, 3, \ldots, n) \tag{20}
$$
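
A pairwise comparison of input vectors can be sketched in Python as below. The three sample vectors are made up, and the function names are illustrative; the values stay between 0 and 1 here because the sample data are non-negative, which is an assumption about the sensor readings.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """Matrix of cosine similarities between every pair of input vectors."""
    n = len(vectors)
    return [[cosine_similarity(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical attribute vectors: the first two are proportional to each other.
vectors = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 0.0, 1.0]]
matrix = similarity_matrix(vectors)
```

Entries close to 1 indicate attributes that carry nearly the same information, which is the kind of relation read off the weight plans in Fig. 8.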

Fig. 11 Comparison of the network lifetime with the datasets. a D1, b D2

Figure 9 compares the amount of redundant data removed by the four different schemes with the two datasets. As shown in the figure, the removal with the proposed SOMDA scheme is greater than with the other three schemes. The rate of data removal is greatly influenced by the characteristics of the datasets. The more outliers and redundant data a dataset contains, the larger the data reduction. Note that D2 has more such data than D1 because D2 is much larger (131,500 instances versus 9300).
Figure 10 shows the average energy consumption for the aggregation and transmission of the data to the sink for different rounds. Energy consumption increases as the size of the dataset grows. Therefore, the energy consumption for D2 is larger than for D1 as shown in Fig. 10a and b. Observe from the figures that the proposed SOMDA scheme consistently outperforms the other three schemes, while energy consumption grows with the number of rounds of clustering and data aggregation.
Figure 11 compares the network lifetimes for the four data aggregation schemes, where network lifetime is defined as the percentage of live nodes for a varying number of rounds. It depends on the number of live nodes and the connectivity among them as the rounds proceed. Notice that the proposed scheme consistently allows a longer network lifetime than the other schemes.

Fig. 12 Comparison of live nodes at each round with the datasets. a D1, b D2

Figure 12 shows the comparison of depleted nodes in each round with the four schemes. Notice that the other schemes experience more dead nodes at each round than the proposed SOMDA scheme, in which the nodes are associated with the CH based on cosine similarity and thus the energy consumption of the clusters can be minimized. Figure 13 compares the data similarity. With the help of the cosine similarity function, the proposed scheme achieves higher similarity of the data compared to the other schemes. Figure 14 shows the accuracy of the clustering of sensor nodes. Here accuracy is obtained by merging the data similarity between the CH and the member nodes. Again, the proposed scheme displays much higher clustering accuracy than the other schemes.

Fig. 13 Comparison of data similarity accuracy

Fig. 14 Comparison of clustering accuracy

5 Conclusion

In this paper, a novel scheme called SOMDA (SOM-based data aggregation) has been proposed to achieve energy-efficient data aggregation for WSN. The proposed scheme utilizes the density and similarity between the data of neighboring sensor nodes to construct the clusters, and the cosine similarity function is used to observe the data relation between the sensor nodes and consolidate the data at the CH. Also, interquartile (IRQ) analysis is used to remove the outliers and redundant data. Subsequently, the data are processed using an unsupervised machine learning technique, the self-organized map (SOM), which effectively maps complex high-dimensional data onto two-dimensional ones. Thus, SOMDA effectively reduces the energy consumption and network traffic, and as a result extends the lifetime of the network. Computer simulation reveals that the proposed scheme considerably outperforms the existing data aggregation schemes with respect to data reduction rate, energy efficiency, network lifetime, number of live nodes, and clustering accuracy of the sensor nodes.
Data aggregation at the CH can be effective only when similar data are grouped and processed together. The performance of data aggregation can be improved if the detrimental influence of the outliers and redundant data is reduced. In the future, the performance of the proposed scheme will be further enhanced by employing efficient classification and dimensionality reduction techniques. The proposed scheme will also be extended by using different machine learning techniques to include the covariance of the data in clustering.

Acknowledgements This work was partly supported by Institute for Information & communications
Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2016-0-00133,
Research on Edge computing via collective intelligence of hyperconnection IoT nodes), Korea, under the
National Program for Excellence in SW supervised by the IITP (Institute for Information & communi-
cations Technology Promotion) (2015-0-00914), Basic Science Research Program through the National
Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology
(2016R1A6A3A11931385, Research of key technologies based on software defined wireless sensor net-
work for realtime public safety service, 2017R1A2B2009095, Research on SDN-based WSN Supporting
Real-time Stream Data Processing and Multiconnectivity), the second Brain Korea 21 PLUS project, and
Samsung Electronics.

References
1. Oliveira LM, Rodrigues JJ (2011) Wireless sensor networks: a survey on environmental monitoring. JCM 6(2):143–151
2. Yick J, Mukherjee B, Ghosal D (2008) Wireless sensor network survey. Comput Netw 52(12):2292–2330
3. Ullah I, Youn HY (2018) Statistical multipath queue-wise preemption routing for zigbee-based WSN. Wirel Pers Commun 100:1537–1551
4. Abid B, Nguyen TT, Seba H (2015) New data aggregation approach for time-constrained wireless sensor networks. J Supercomput 71(5):1678–1693
5. Huang C-F, Lin W-C (2016) Data collection for multiple mobile users in wireless sensor networks. J Supercomput 72(7):2651–2669
6. Rawat P, Singh KD, Chaouchi H, Bonnin JM (2014) Wireless sensor networks: a survey on recent developments and potential synergies. J Supercomput 68(1):1–48
7. Vuran MC, Akyildiz IF (2006) Spatial correlation-based collaborative medium access control in wireless sensor networks. IEEE/ACM Trans Netw 14(2):316–329
8. Yoon S, Shahabi C (2007) The clustered aggregation (CAG) technique leveraging spatial and temporal correlations in wireless sensor networks. ACM Trans Sens Netw TOSN 3(1):3
9. Lee S, Chung T (2004) Data aggregation for wireless sensor networks using self-organizing map. In: International Conference on AI, Simulation, and Planning in High Autonomy Systems, Springer, Berlin, pp 508–517


10. Khedo K, Doomun R, Aucharuz S (2010) Reada: redundancy elimination for accurate data aggregation in wireless sensor networks. Wirel Sens Netw 2(04):300
11. Ozdemir S, Xiao Y (2011) Polynomial regression based secure data aggregation for wireless sensor networks. In: IEEE, pp 1–5
12. Bahi JM, Makhoul A, Medlej M (2012) An optimized in-network aggregation scheme for data collection in periodic sensor networks. In: International Conference on Ad-Hoc Networks and Wireless, Springer, Berlin, pp 153–166
13. Cui J (2016) Data aggregation in wireless sensor networks. Networking and Internet Architecture. INSA Lyon
14. Jadhav NH, Kashid DN, Kulkarni SR (2014) Subset selection in multiple linear regression in the presence of outlier and multicollinearity. Stat Methodol 19:44–59
15. Yuan F, Zhan Y, Wang Y (2014) Data density correlation degree clustering method for data aggregation in WSN. IEEE Sens J 14(4):1089–1098
16. Toloueiashtian M, Motameni H (2018) A new clustering approach in wireless sensor networks using fuzzy system. J Supercomput 74(2):717–737
17. Rostami AS, Badkoobe M, Mohanna F, Hosseinabadi AAR, Sangaiah AK (2018) Survey on clustering in heterogeneous and homogeneous wireless sensor networks. J Supercomput 74(1):277–323
18. Kuila P, Jana PK (2014) Approximation schemes for load balanced clustering in wireless sensor networks. J Supercomput 68(1):87–105
19. Diwakaran S, Perumal B, Vimala Devi K (2018) A cluster prediction model-based data collection for energy efficient wireless sensor network. J Supercomput. https://doi.org/10.1007/s11227-018-2437-z
20. Lee KY, Suh Y-K (2018) A pattern-based outlier region detection method for two-dimensional arrays. J Supercomput. https://doi.org/10.1007/s11227-018-2418-2
21. Kuna HD, García-Martinez R, Villatoro FR (2014) Outlier detection in audit logs for application systems. Inf Syst 44:22–33
22. Subhashini R, Kumar VJS (2010) Evaluating the performance of similarity measures used in document clustering and information retrieval. In: IEEE, pp 27–31
23. Wan X, Wang W, Liu J, Tong T (2014) Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Med Res Methodol 14(1):135. https://doi.org/10.1186/1471-2288-14-135
24. Cosine similarity function - Wikipedia [Internet]. [cited 2018 Feb 23]. Available from: https://en.wikipedia.org/wiki/Cosine_similarity
25. How to Calculate Outliers, by interquartile range [Internet]. wikiHow. https://www.wikihow.com/Calculate-Outliers
26. Kumar DI, Kounte MR (2016) Comparative study of self-organizing map and deep self-organizing
map using MATLAB. In: IEEE, pp 1020–1023
27. Kohonen T (2013) Essentials of the self-organizing map. Neural Netw. 37:52–65
28. Faigl J, Hollinger GA (2018) Autonomous data collection using a self-organizing map. IEEE Trans
Neural Netw Learn Syst 29(5):1703–1715. https://doi.org/10.1109/TNNLS.2017.2678482
29. Aghajari E, Chandrashekhar GD (2017) Self-organizing map based extended fuzzy C-means
(SEEFC) algorithm for image segmentation. Appl Soft Comput 54:347–363
30. Isa D, Kallimani V, Lee LH (2009) Using the self organizing map for clustering of text documents.
Expert Syst Appl 36(5):9584–9591
31. Ganegedara H, Alahakoon D (2012) Redundancy reduction in self-organising map merging for scal-
able data clustering. In: IEEE, pp 1–8
32. Gedik B, Liu L, Philip SY (2007) ASAP: an adaptive sampling approach to data collection in sensor
networks. IEEE Trans Parallel Distrib Syst 18(12):1766–1783
33. Sun L-Y, Cai W, Huang X-X (2010) Data aggregation scheme using neural networks in wireless
sensor networks. In: IEEE, pp V1-725
34. Bo W, Han-ying H, Wen F (2007) A pseudo LEACH algorithm for wireless sensor networks. In:
IMECS, pp 1366–1370
35. Liu C, Wu K, Pei J (2007) An energy-efficient data collection framework for wireless sensor net-
works by exploiting spatiotemporal correlation. IEEE Trans Parallel Distrib Syst 18(7):1010–1023
36. Sung W-T (2009) Employed BPN to multi-sensors data fusion for environment monitoring services.
In: International Conference on Autonomic and Trust Computing, pp 149–63

13
I. Ullah, H. Y. Youn

37. Villas LA, Boukerche A, Guidoni DL, De Oliveira HA, De Araujo RB, Loureiro AA (2013) An
energy-aware spatio-temporal correlation mechanism to perform efficient data collection in wireless
sensor networks. Comput Commun 36(9):1054–1066
38. Li G, Wang Y (2013) Automatic ARIMA modeling-based data aggregation scheme in wireless sen-
sor networks. EURASIP J Wirel Commun Netw. 2013(1):85
39. Santini S, Romer K (2006) An adaptive strategy for quality-based data reduction in wireless sensor
networks. In: Proceedings of the 3rd International Conference on Networked Sensing Systems, pp
29–36
40. Yin Y, Liu F, Zhou X, Li Q (2015) An efficient data compression model based on spatial clustering
and principal component analysis in wireless sensor networks. Sensors 15(8):19443–19465
41. Lin H, Bai D, Gao D, Liu Y (2016) Maximum data collection rate routing protocol based on topol-
ogy control for rechargeable wireless sensor networks. Sensors 16(8):1201
42. Cluster with Self-Organizing Map Neural Network-MATLAB & Simulink –MathWorks. https://
kr.mathworks.com/help/nnet/ug/cluster-with-self-organizing-map-neural-network.html
43. Comparison OF LEACH EAMMH SEP TEEN Protocols (Contact for codes in WSN)–File
Exchange–MATLAB Central [Internet]. http://kr.mathworks.com/matlabcentral/fileexchange/46199
-comparison-of-leach-eammh-sep-teen-protocols–contact-for-codes-in-wsn-
44. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets/Air+quality
45. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets/3D+Road+Netwo
rk+(North+Jutland%2C+Denmark)
46. Hevin Rajesh D, Paramasivan B (2015) Data aggregation framework for clustered sensor networks
using multilayer perceptron neural network. Int J Adv Res Comput Eng Technol (IJARCET) 4(4).
https://ijarcet.org/wp-content/uploads/IJARCET-VOL-4-ISSUE-4-1156-1160.pdf

