A Novel Data Aggregation Scheme Based On Self Organized Map For WSN
https://doi.org/10.1007/s11227-018-2642-9
Abstract
Wireless sensor networks allow efficient data collection and transmission in IoT environments. Since such a network usually consists of a large number of sensor nodes, a significant amount of redundant data and outliers is generated, which deteriorates the network performance. In this paper, a novel data aggregation scheme is proposed which is based on a self-organized map neural network to reduce redundant data and eliminate outliers. In addition, cosine similarity is used to improve the clustering process of sensor nodes based on the density and similarity of the data, and interquartile analysis is adopted to remove outliers. This significantly reduces the energy consumption and enhances the network performance. Extensive simulation with real datasets shows that the proposed scheme consistently outperforms the existing representative data aggregation schemes in terms of data reduction rate, network lifetime, and energy efficiency.
Ihsan Ullah · Hee Yong Youn
Affiliations: (1) Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Korea; (2) College of Software, Sungkyunkwan University, Suwon, Korea
1 Introduction
A wireless sensor network (WSN) consists of a large number of sensor nodes, and it is widely used in various application domains. A WSN allows timely and accurate detection of critical events in a target area where access is usually limited. Unlike a wired network, a WSN is limited in network lifetime and in the communication range of the sensor nodes. To overcome these limitations, the sensor nodes need to collaborate with each other in querying the environment, and then collect the sensed data and transmit them to the base station (BS). The sensor nodes are usually deployed in high density to reliably monitor the target [1–3]. However, densely deployed sensor nodes generate redundant data, which leads to energy waste and network congestion. Moreover, the data may contain outliers caused by an unstable sensing or communication environment, which reduce the accuracy of the data. Redundant data and outliers eventually degrade the performance of the WSN, and thus proper data aggregation and removal of outliers are imperative.
Data aggregation is the technique of consolidating data in an energy-efficient way. It can greatly enhance the performance of WSN in terms of energy efficiency and network throughput [4–6]. The data collected by adjacent sensors usually display high spatial and temporal correlation, and a significant amount of them are redundant [7–9]. Such data provide no additional information but waste network resources. According to the spatial correlation model, sensor nodes that are spatially close to each other usually hold more similar data than distant ones. Therefore, clustering them based on data density and similarity is important for data aggregation. Numerous data aggregation techniques have been proposed to reduce the data collected from sensor nodes before they are transmitted to the sink [9–13], with data aggregation occurring at the cluster head (CH) [14–19]. Building on these existing schemes, the proposed scheme attempts to improve the clustering of sensor nodes and the aggregation of data using machine learning techniques. Data aggregation at the CH is efficient only when similar data are grouped and processed together, and the performance and resource utilization improve if outliers and redundant data are reduced [20, 21]. The performance is further enhanced if an efficient algorithm for dimensionality reduction is employed at the CH.
In this paper, therefore, a novel data aggregation scheme is proposed to maximize the performance of WSN. The proposed scheme utilizes the density and similarity between the data of neighboring sensor nodes to construct the clusters, and the cosine similarity function is used to observe the data relation between the sensor nodes and consolidate the data at the CH. Also, interquartile (IRQ) analysis is used to remove the outliers and redundant data [22–25]. Subsequently, the data are processed using an unsupervised machine learning technique, the self-organized map (SOM), which effectively maps complex high-dimensional data to low-dimensional ones [26, 27]. Due to its unsupervised nature, SOM is widely used for exploratory analysis and low-dimensional visualization of complex high-dimensional datasets [28–31]. The SOM algorithm is used to recluster the data after the outliers and redundant data have been eliminated and before the data are forwarded to the BS. Computer simulation reveals that the proposed scheme considerably outperforms three representative data aggregation schemes with respect to data reduction rate, energy efficiency, network lifetime, number of live nodes, and clustering accuracy of the sensor nodes. The main contributions of the paper are summarized as follows.
• Clustering of sensor nodes based on data density and similarity is shown to be effective for data aggregation. In this paper, cosine similarity is used to observe the data relation between the sensor nodes, and interquartile (IRQ) analysis is adopted to remove outliers and redundant data.
• The efficiency of a two-phase data aggregation approach is identified, in which outliers and redundant data are eliminated first and SOM-based data reclustering is then applied.
The rest of the paper is organized as follows. Section 2 discusses the work related to data aggregation for WSN. The proposed SOM-based data aggregation (SOMDA) scheme is presented in Sect. 3. Section 4 discusses the simulation results, and Sect. 5 concludes the paper.
2 Related work
In WSN, the issue of data aggregation has been addressed by a number of researchers. The adaptive sampling approach (ASAP) [32] was developed to estimate data similarity based on the relationship between the time series data of two sensor nodes without considering the magnitude. Here a minimum number of messages is used to collect the data, and a multivariate normal (MVN) model predicts, at the base node, the values of the nodes that are not sampled. In [33], back-propagation networks data aggregation (BPNDA) was proposed to aggregate the data using neural networks. It adopts the low-energy adaptive clustering hierarchy (LEACH) [34] strategy to construct the clusters of sensor nodes, and a three-layer back-propagation (BP) neural network is used to eliminate data redundancy and improve the accuracy of the data. Here the input layer neurons are located at the cluster members, while the hidden layer neurons and the output layer neurons are located at the CH.
Liu et al. [35] proposed an approach called energy-efficient data collection (EEDC), which observes the spatial and temporal correlation between the sensed data using a piecewise linear approximation technique. By exploring the spatial correlation of the sensed data, it dynamically partitions the sensor nodes into clusters so that the sensors in the same cluster have similar surveillance time series. EEDC tries to minimize the frequency of data transmissions under a given accuracy bound, and balances the energy consumption by introducing a randomized scheduling technique. In [36], the back-propagation network (BPN) technique was proposed for data aggregation in WSN. It efficiently reduces the dimension of the data and improves the data aggregation process using multilayer perceptrons trained in a supervised manner. Through the BPN technique, the sink nodes handle various signal sources for data fusion in recognition and classification.
Efficient data collection aware of spatiotemporal correlation (EAST) [37] exploits the spatiotemporal correlation between the data and forwards imperative data to the BS through a short path. EAST clusters the member nodes by using a spatial correlation approach, while the representative nodes use the temporal suppression technique. Here data prediction decreases the network traffic by estimating the upcoming data from the current data log. In [38], autoregressive integrated moving average (ARIMA) modeling was proposed to predict the data from the previous log of sensed data. The authors of [39] proposed a scheme called dual prediction framework (DPF), which uses least mean square (LMS) to filter the data. It requires no prior model and allows the nodes
3 Proposed scheme
In this section, the proposed SOMDA scheme for data aggregation is presented. It clusters the sensor nodes based on data density and similarity to aggregate the data, and filters the data at the CH using an unsupervised machine learning technique, the self-organizing map (SOM). SOM is employed to transform the high-dimensional data into low-dimensional data and remove the outliers and redundancy. SOMDA minimizes the energy consumption of the sensor nodes and thus prolongs the lifetime of the WSN.
3.1 Design goals
The design goal of the proposed scheme is to efficiently aggregate and transmit the data to the BS so that the communication cost is reduced. If all sensed data are transmitted to the BS for aggregation, network congestion will occur. With the proposed cluster-based aggregation, the sensed data are transmitted to the CHs, which aggregate them before forwarding to the BS.
SOMDA operates in two phases. In the first phase, clusters are constructed based on the density and data similarity of the nodes by applying cosine similarity and the data density correlation degree (DDCD) [15]. The node having the highest degree and enough energy announces itself as the CH. The nodes that hear the announcement associate with the CH of the highest similarity. After the construction of the clusters, each CH identifies multiple paths towards the BS in the backbone and then selects the best path for data transmission. In the second phase, the SOM algorithm is used to recluster the data, converting the complex nonlinear high-dimensional data to two-dimensional ones. In the long run, this decreases the data transmission cost and energy consumption.
3.2 Operation of SOMDA
In the clustering of WSN, the sensor nodes are usually grouped according to a set of rules. Here the data density correlation degree (DDCD) technique [15] is extended with the cosine similarity function to construct clusters of sensor nodes that are similar in terms of data. The topology of the WSN with the proposed scheme is modeled by an undirected graph G = (S, E). Here S is a set of sensor nodes and E is a set of edges. With cluster-based data aggregation, each CH consolidates the data sent from its member nodes and forwards them to the sink (BS) in one hop or multiple hops as shown in Fig. 1.
Assume that sensor node s0 has n neighboring nodes, s1, s2, …, sn, which are within the radio coverage of s0. The data object of s0 is D0, and those of the neighboring nodes are D1, D2, …, Dn. Among the n data objects, if there are N (≤ n) data objects whose distance to D0 is less than ε, then the data density correlation degree of s0 to the CH whose data objects are in the ε-neighborhood of D0 is defined as follows [15]:
$$
sim(s) = \begin{cases} 0, & N < minPts \\ a_1\left(1 - \dfrac{1}{\exp(N - minPts)}\right) + a_2\left(1 - \dfrac{d_\Delta}{\varepsilon}\right) + a_3(\ldots), & N \ge minPts \end{cases} \tag{1}
$$
Here minPts is the threshold of the number of neighbor nodes, ε is the data threshold, dΔ is the distance between D0 and the center of the data objects which are in the
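$$
A_{nr} = \left[ \frac{a_1}{\sqrt{\sum_{i=1}^{n} a_i^2}},\ \frac{a_2}{\sqrt{\sum_{i=1}^{n} a_i^2}},\ \ldots,\ \frac{a_n}{\sqrt{\sum_{i=1}^{n} a_i^2}} \right] \tag{4}
$$
$$
B_{nr} = \left[ \frac{b_1}{\sqrt{\sum_{i=1}^{n} b_i^2}},\ \frac{b_2}{\sqrt{\sum_{i=1}^{n} b_i^2}},\ \ldots,\ \frac{b_n}{\sqrt{\sum_{i=1}^{n} b_i^2}} \right] \tag{5}
$$
and
$$
C_{nr} = \left[ \frac{c_1}{\sqrt{\sum_{i=1}^{n} c_i^2}},\ \frac{c_2}{\sqrt{\sum_{i=1}^{n} c_i^2}},\ \ldots,\ \frac{c_n}{\sqrt{\sum_{i=1}^{n} c_i^2}} \right] \tag{6}
$$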
Equations (2) and (3) are rewritten as:
$$
cs(A, C) = \sum_{i=1}^{n} \left( a_{nr} \cdot c_{nr} \right) \tag{7}
$$
$$
cs(B, C) = \sum_{i=1}^{n} \left( b_{nr} \cdot c_{nr} \right) \tag{8}
$$
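The normalization of Eqs. (4)–(6) and the dot products of Eqs. (7)–(8) can be illustrated with a minimal Python sketch. It assumes, for illustration only, that A and B are the data vectors of two candidate CHs and C is the data vector of a member node; the function names and the sample readings are hypothetical and do not come from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length, as in Eqs. (4)-(6)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return [v / norm for v in vec]


def cosine_similarity(u, v):
    """Dot product of the two normalized vectors, as in Eqs. (7)-(8)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    u_nr, v_nr = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u_nr, v_nr))


def most_similar_head(member, heads):
    """Return the key of the candidate CH with the highest cosine similarity."""
    return max(heads, key=lambda name: cosine_similarity(member, heads[name]))


# Hypothetical sensor readings (assumed values, for illustration only).
A = [20.0, 21.0, 19.5]
B = [5.0, 30.0, 12.0]
C = [20.5, 20.0, 20.0]
print(most_similar_head(C, {"A": A, "B": B}))
```

Because the vectors are normalized first, the comparison depends only on the direction of the data vectors and not on their magnitude.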
The highest similarity between cs(C, A) and cs(C, B) can be measured as:
3.2.2 Clustering of data
The proposed SOMDA scheme reclusters the collected data at each cluster. It consists of a two-layer neural network: input and output. All the received data of dimension d, represented as the input neural layer $X = (x_1, x_2, \ldots, x_d)^T$, are fully connected to the output neural layer $Y = (y_1, y_2, \ldots, y_m)$ as shown in Fig. 2. Here m is the order of the neural map in the output layer, and T represents the number of iterations of the learning process.
The synaptic weight vector, $W_k = (w_{k,1}, w_{k,2}, \ldots, w_{k,m})$, represents the directed links between the input layer X and the output layer Y, where $k \in \{1, 2, \ldots, m^2\}$ is the index of the kth node of the output layer. The training process of SOMDA iteratively updates the synaptic weights of the winner neuron and its neighbor neurons. At each training step, a sample vector, $x_{i,d}$, is randomly selected from the input dataset. As training progresses, the algorithm calculates the Euclidean distance between every weight vector and the input vector $x_d$. The node whose weight vector is closest to the input vector is tagged as the best-matching unit (BMU), $j^{*}$.
$$
j^{*} = \min_{j} \left( \sqrt{\sum_{i=0}^{d} \left( x_i - w_{im} \right)^2} \right) \tag{10}
$$
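The BMU search of Eq. (10) can be sketched in a few lines of Python. This is a minimal illustration that returns the index of the neuron minimizing the Euclidean distance together with that distance; the function names and the sample weights are hypothetical.

```python
import math


def euclidean(x, w):
    """Euclidean distance between an input vector and a weight vector."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))


def best_matching_unit(x, weights):
    """Return (index, distance) of the neuron whose weight vector is closest to x."""
    distances = [euclidean(x, w) for w in weights]
    j_star = min(range(len(weights)), key=distances.__getitem__)
    return j_star, distances[j_star]


# Hypothetical weight vectors of three output neurons (assumed values).
weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
j_star, dist = best_matching_unit([0.9, 1.2], weights)
print(j_star, dist)
```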
To detect and remove outliers from the data, interquartile (IRQ) analysis is used [25].
$$
IRQ = Q_3 - Q_1 \tag{11}
$$
where Q1 and Q3 represent the first quartile and third quartile, respectively, of the sample data as shown in Fig. 3. Before determining whether a data point is an outlier, the admissible range of the data is first identified as follows. Here Ur and Lr denote the upper and lower bounds, respectively, beyond which a data point is identified as an outlier.
$$
L_r = Q_1 - (1.5 \times IRQ) \tag{12}
$$
$$
U_r = Q_3 + (1.5 \times IRQ) \tag{13}
$$
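The bounds of Eqs. (11)–(13) can be sketched as follows. This is a minimal illustration, assuming the quartiles are computed with Python's `statistics.quantiles` (its default "exclusive" method; other quartile conventions give slightly different bounds). The function name and the sample readings are hypothetical, and only outlier removal is shown, not redundancy elimination.

```python
import statistics


def split_outliers(samples):
    """Split samples into (kept, outliers) using the 1.5 x IRQ rule."""
    q1, _, q3 = statistics.quantiles(samples, n=4)  # Q1, median, Q3
    irq = q3 - q1                                   # Eq. (11)
    lower = q1 - 1.5 * irq                          # Eq. (12)
    upper = q3 + 1.5 * irq                          # Eq. (13)
    kept = [x for x in samples if lower <= x <= upper]
    outliers = [x for x in samples if x < lower or x > upper]
    return kept, outliers


# Hypothetical temperature readings with one faulty value (assumed data).
readings = [20.1, 20.4, 19.8, 20.0, 20.3, 35.0, 19.9, 20.2]
kept, outliers = split_outliers(readings)
print(kept, outliers)
```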
Hypotheses:
$x_i < L_r$ or $x_i > U_r$
$\partial : x_i = y_m$
Fig. 4 Grid representation of the neurons in SOM. a Rectangular grid, b hexagonal grid
where α and t represent the learning rate factor and the iteration of the training process, respectively. The Gaussian neighborhood function, $h_{ci}(t)$, indicates how strongly the neighbor neurons are connected around the winner during the learning process as shown in Fig. 4. It is specified as:
$$
h_{ci}(t) = \exp\left( -\frac{\lVert r_c - r_i \rVert^2}{2\sigma^2(t)} \right) \tag{17}
$$
where $r_c$ and $r_i$ denote the positions of the winner neuron c and neuron i on the SOM grid, and $\lVert r_c - r_i \rVert^2$ is the squared distance between them. The following is the procedure of the proposed SOMDA scheme.
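The neighborhood-weighted training described above can be illustrated with a minimal Python sketch. It is a generic SOM loop and not the authors' SOMDA procedure: it assumes the standard Kohonen update $w_i(t+1) = w_i(t) + \alpha(t)\, h_{ci}(t)\, (x - w_i(t))$, a rectangular grid, and exponentially decaying α and σ. The decay schedule, the default parameters, and all names are assumptions made for illustration.

```python
import math
import random


def train_som(data, rows, cols, iterations=500, alpha0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns (grid positions, weight vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Neurons are stored in row-major order; weights start at random samples.
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [list(rng.choice(data)) for _ in positions]

    for t in range(iterations):
        # Assumed schedules: learning rate and radius shrink over time.
        decay = math.exp(-t / iterations)
        alpha = alpha0 * decay
        sigma = sigma0 * decay

        x = rng.choice(data)  # randomly selected sample vector
        # Winner (BMU): smallest squared Euclidean distance to x.
        c = min(range(len(weights)),
                key=lambda k: sum((xi - wi) ** 2 for xi, wi in zip(x, weights[k])))

        for i, w in enumerate(weights):
            # Gaussian neighborhood of Eq. (17), measured on the grid.
            grid_d2 = ((positions[c][0] - positions[i][0]) ** 2
                       + (positions[c][1] - positions[i][1]) ** 2)
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            for d in range(dim):
                w[d] += alpha * h * (x[d] - w[d])
    return positions, weights


# Hypothetical two-dimensional data forming two groups (assumed values).
data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
positions, weights = train_som(data, rows=2, cols=2, iterations=200)
print(weights)
```

Since each update moves a weight vector part of the way towards a sample, the trained weights stay inside the range spanned by the data.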
3.2.3 Energy efficiency
Fig. 6 U-matrix view of the datasets showing the weight distances between the neighbor neurons. a D1,
b D2
$$
e_t(i, j) = \sum_{i,j \in V,\ j \in S_i} e_t \times f_{i,j} \tag{18}
$$
Fig. 7 Number of data points associated with each neuron and cluster. a D1, b D2
$$
e_r(i, j) = \sum_{i,j \in V,\ j \in S_i} e_r \times f_{i,j} \tag{19}
$$
Fig. 8 Weight plan for each input vector of the datasets. a D1, b D2
4 Performance evaluation
of the two datasets that are tested in the simulation [44, 45], and Fig. 5 depicts the setting of the SOMDA simulation with the two datasets.
Three representative data aggregation algorithms are compared with the proposed scheme: low-energy adaptive clustering hierarchy (LEACH) [34], perceptron neural network-based data aggregation (PNNDA) [46], and polynomial regression-based secure data aggregation (PRDA) [11]. The simulation consists of two phases: clustering of the sensor nodes and data processing with SOM. In the cluster construction phase, the sensor nodes with the largest energy are selected as CHs, and then each remaining node is associated with the CH having the maximum cosine similarity with it. In the second phase, the CH runs the SOM algorithm on the collected data to cluster and reduce the multidimensional data. Seven metrics are examined to evaluate the performance of the different data aggregation schemes: the elimination rate of redundant and outlier data, data reduction rate, energy consumption of the sensor nodes, number of live nodes in each round, network lifetime, clustering accuracy, and data similarity accuracy. The simulation results are explained in the following.
Figure 6 shows the U-matrix views of the neurons, which identify the clusters of the data for the two datasets. The figures show the distance between each neuron and its neighbor neurons. The hexagons represent the neurons, and the lines are the connections between neighbor neurons. The darker the shade, the larger the distance between the neurons. Accordingly, a group of lighter segments surrounded by darker segments indicates that the data have been clustered, and the neurons in the lighter segments contain similar data.
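The idea behind a U-matrix can be illustrated with a minimal Python sketch. It is a simplified variant and not the plotting routine used for Fig. 6: it assumes a rectangular grid with 4-connected neighbors (the figure uses hexagons), weights stored in row-major order, and one averaged distance per neuron instead of one value per connection. The function name and the sample weights are hypothetical.

```python
import math


def u_matrix(weights, rows, cols):
    """Average weight distance from each neuron to its 4-connected grid neighbors."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            w = weights[r * cols + c]
            neighbors = [(r + dr, c + dc)
                         for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                         if 0 <= r + dr < rows and 0 <= c + dc < cols]
            if neighbors:  # a 1 x 1 map has no neighbors; leave 0.0
                total = sum(dist(w, weights[nr * cols + nc]) for nr, nc in neighbors)
                u[r][c] = total / len(neighbors)
    return u


# Hypothetical 1 x 3 map: two similar neurons and one distant neuron (assumed values).
print(u_matrix([[0.0], [0.0], [10.0]], rows=1, cols=3))
```

Large values mark neurons that are far from their neighbors in the input space (the darker regions of a U-matrix plot), while small values mark neurons inside a cluster.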
Figure 7 shows the number of data points associated with each neuron for the two datasets. The neurons adjacent to each other in the topology need to be close to each other in the input space so that the high-dimensional input space can be visualized in the two-dimensional topology. For example, the two neurons at the bottom right of Fig. 7a are far from the others, which means that the data points of the two neurons are very similar to each other compared to those of the other neurons. Ideally, the data are equally distributed across the neurons. In Fig. 7a, more data are located on the middle neurons. Similarly, Fig. 7b shows the result of clustering with D2.
Figure 8 shows the weight plan for each element of the input vector. Here the darkness represents the size of the weight. If the connection patterns of two inputs are similar, the input vectors are highly related to each other. Figure 8a represents the weight plans of the 14 vectors of D1. In this case, input vectors 5, 12, 13, and 14 are more similar to each other than to the other input vectors, and input vectors 3, 6, 10, and 11 look similar. Figure 8b presents the weight plans of the four input vectors of D2. Here vectors 2 and 3 seem similar to each other.
Let $V = [v_1, v_2, \cdots, v_n]$ denote the input vectors. The cosine similarity function of Eqs. (2) and (3) is then used to measure the similarity between the vectors, which is in the range between 0 and 1.
$$
cs(v_i, v_{i+j}) = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} (v_i \cdot v_{i+j})}{\sqrt{\sum_{i=1}^{n} (v_i)^2}\ \sqrt{\sum_{i=1}^{n} (v_{i+j})^2}} \quad (i, j = 1, 2, 3, \ldots, n) \tag{20}
$$
Figure 9 compares the amount of redundant data removed by the four schemes with the two datasets. As shown in the figure, the proposed SOMDA scheme removes more data than the other three schemes. The rate of data removal is greatly influenced by the characteristics of the datasets: the more outliers and redundant data a dataset contains, the larger the data reduction. Note that D2 has more such data than D1, as D2 is much larger (131,500 data versus 9300).
Figure 10 shows the average energy consumption for the aggregation and transmission of the data to the sink over different rounds. Energy consumption increases as the size of the dataset grows. Therefore, the energy consumption for D2 is larger than that for D1, as shown in Fig. 10a and b. Observe from the figures that the proposed SOMDA scheme consistently outperforms the other three schemes, while energy consumption grows as the rounds of clustering and data aggregation proceed.
Figure 11 compares the network lifetimes of the four data aggregation schemes, where network lifetime is defined as the percentage of live nodes for a varying number of rounds. It depends on the number of live nodes and the connectivity among them as the rounds proceed. Notice that the proposed scheme consistently allows a longer network lifetime than the other schemes.
Fig. 12 Comparison of live nodes at each round with the datasets. a D1, b D2
Figure 12 compares the depleted nodes in each round for the four schemes. Notice that the other schemes experience more dead nodes than the proposed SOMDA scheme at each round. In SOMDA, the nodes are associated with the CH based on cosine similarity, and thus the energy consumption of the clusters can be minimized. Figure 13 compares the data similarity. With the help of the cosine similarity function, the proposed scheme achieves higher data similarity than the other schemes. Figure 14 shows the accuracy of the clustering of sensor nodes. Here accuracy is obtained by merging the data similarity between the CH and the member nodes. Again, the proposed scheme displays much higher clustering accuracy than the other schemes.
5 Conclusion
neighborhood sensor nodes to construct the cluster, and the cosine similarity function is used to observe the data relation between the sensor nodes and consolidate the data at the CH. Also, interquartile (IRQ) analysis is used to remove the outliers and redundant data. Subsequently, the data are processed using an unsupervised machine learning technique, the self-organized map (SOM), which effectively maps complex high-dimensional data to two-dimensional ones. Thus, SOMDA effectively reduces the energy consumption and network traffic, and as a result extends the lifetime of the network. Computer simulation reveals that the proposed scheme considerably outperforms the existing data aggregation schemes with respect to data reduction rate, energy efficiency, network lifetime, number of live nodes, and clustering accuracy of the sensor nodes.
Data aggregation at the CH can be effective only when similar data are grouped and processed together. The performance of data aggregation can be improved if the detrimental influence of the outliers and redundant data is reduced. In the future, the performance of the proposed scheme will be further enhanced by employing efficient classification and dimensionality reduction techniques. The proposed scheme will also be extended by using different machine learning techniques to include the covariance of the data in clustering.
Acknowledgements This work was partly supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2016-0-00133, Research on Edge computing via collective intelligence of hyperconnection IoT nodes), Korea, under the National Program for Excellence in SW supervised by the IITP (Institute for Information & communications Technology Promotion) (2015-0-00914), Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2016R1A6A3A11931385, Research of key technologies based on software defined wireless sensor network for realtime public safety service, 2017R1A2B2009095, Research on SDN-based WSN Supporting Real-time Stream Data Processing and Multiconnectivity), the second Brain Korea 21 PLUS project, and Samsung Electronics.
References
1. Oliveira LM, Rodrigues JJ (2011) Wireless sensor networks: a survey on environmental monitoring.
JCM 6(2):143–151
2. Yick J, Mukherjee B, Ghosal D (2008) Wireless sensor network survey. Comput Netw
52(12):2292–2330
3. Ullah I, Youn HY (2018) Statistical multipath queue-wise preemption routing for zigbee-based
WSN. Wirel Pers Commun. 100:1537–1551
4. Abid B, Nguyen TT, Seba H (2015) New data aggregation approach for time-constrained wireless
sensor networks. J Supercomput 71(5):1678–1693
5. Huang C-F, Lin W-C (2016) Data collection for multiple mobile users in wireless sensor networks. J
Supercomput 72(7):2651–2669
6. Rawat P, Singh KD, Chaouchi H, Bonnin JM (2014) Wireless sensor networks: a survey on recent
developments and potential synergies. J Supercomput 68(1):1–48
7. Vuran MC, Akyildiz IF (2006) Spatial correlation-based collaborative medium access control in wireless sensor networks. IEEE/ACM Trans Netw 14(2):316–329
8. Yoon S, Shahabi C (2007) The clustered aggregation (CAG) technique leveraging spatial and temporal correlations in wireless sensor networks. ACM Trans Sens Netw TOSN 3(1):3
9. Lee S, Chung T (2004) Data aggregation for wireless sensor networks using self-organizing map.
In: International Conference on AI, Simulation, and Planning in High Autonomy Systems, Springer,
Berlin, pp 508–517
10. Khedo K, Doomun R, Aucharuz S (2010) Reada: redundancy elimination for accurate data aggregation in wireless sensor networks. Wirel Sens Netw 2(04):300
11. Ozdemir S, Xiao Y (2011) Polynomial regression based secure data aggregation for wireless sensor networks. In: IEEE, pp 1–5
12. Bahi JM, Makhoul A, Medlej M (2012) An optimized in-network aggregation scheme for data collection in periodic sensor networks. In: International Conference on Ad-Hoc Networks and Wireless, Springer, Berlin, pp 153–166
13. Cui J (2016) Data aggregation in wireless sensor networks. Networking and Internet Architecture.
INSA Lyon
14. Jadhav NH, Kashid DN, Kulkarni SR (2014) Subset selection in multiple linear regression in the
presence of outlier and multicollinearity. Stat Methodol 19:44–59
15. Yuan F, Zhan Y, Wang Y (2014) Data density correlation degree clustering method for data aggregation in WSN. IEEE Sens J 14(4):1089–1098
16. Toloueiashtian M, Motameni H (2018) A new clustering approach in wireless sensor networks using fuzzy system. J Supercomput 74(2):717–737
17. Rostami AS, Badkoobe M, Mohanna F, Hosseinabadi AAR, Sangaiah AK (2018) Survey on clustering in heterogeneous and homogeneous wireless sensor networks. J Supercomput 74(1):277–323
18. Kuila P, Jana PK (2014) Approximation schemes for load balanced clustering in wireless sensor networks. J Supercomput 68(1):87–105
19. Diwakaran S, Perumal B, Vimala Devi K (2018) A cluster prediction model-based data collection for energy efficient wireless sensor network. J Supercomput. https://doi.org/10.1007/s11227-018-2437-z
20. Lee KY, Suh Y-K (2018) A pattern-based outlier region detection method for two-dimensional
arrays. J Supercomput. https://doi.org/10.1007/s11227-018-2418-2
21. Kuna HD, García-Martinez R, Villatoro FR (2014) Outlier detection in audit logs for application
systems. Inf Syst 44:22–33
22. Subhashini R, Kumar VJS (2010) Evaluating the performance of similarity measures used in document clustering and information retrieval. In: IEEE, pp 27–31
23. Wan X, Wang W, Liu J, Tong T (2014) Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Med Res Methodol 14(1):135. https://doi.org/10.1186/1471-2288-14-135
24. Cosine similarity function - Wikipedia [Internet]. [cited 2018 Feb 23]. Available from: https://en.wikipedia.org/wiki/Cosine_similarity
25. How to Calculate Outliers, by interquartile range [Internet]. wikiHow. https://www.wikihow.com/Calculate-Outliers
26. Kumar DI, Kounte MR (2016) Comparative study of self-organizing map and deep self-organizing
map using MATLAB. In: IEEE, pp 1020–1023
27. Kohonen T (2013) Essentials of the self-organizing map. Neural Netw. 37:52–65
28. Faigl J, Hollinger GA (2018) Autonomous data collection using a self-organizing map. IEEE Trans
Neural Netw Learn Syst 29(5):1703–1715. https://doi.org/10.1109/TNNLS.2017.2678482
29. Aghajari E, Chandrashekhar GD (2017) Self-organizing map based extended fuzzy C-means
(SEEFC) algorithm for image segmentation. Appl Soft Comput 54:347–363
30. Isa D, Kallimani V, Lee LH (2009) Using the self organizing map for clustering of text documents.
Expert Syst Appl 36(5):9584–9591
31. Ganegedara H, Alahakoon D (2012) Redundancy reduction in self-organising map merging for scalable data clustering. In: IEEE, pp 1–8
32. Gedik B, Liu L, Philip SY (2007) ASAP: an adaptive sampling approach to data collection in sensor
networks. IEEE Trans Parallel Distrib Syst 18(12):1766–1783
33. Sun L-Y, Cai W, Huang X-X (2010) Data aggregation scheme using neural networks in wireless
sensor networks. In: IEEE, pp V1-725
34. Bo W, Han-ying H, Wen F (2007) A pseudo LEACH algorithm for wireless sensor networks. In:
IMECS, pp 1366–1370
35. Liu C, Wu K, Pei J (2007) An energy-efficient data collection framework for wireless sensor networks by exploiting spatiotemporal correlation. IEEE Trans Parallel Distrib Syst 18(7):1010–1023
36. Sung W-T (2009) Employed BPN to multi-sensors data fusion for environment monitoring services.
In: International Conference on Autonomic and Trust Computing, pp 149–63
37. Villas LA, Boukerche A, Guidoni DL, De Oliveira HA, De Araujo RB, Loureiro AA (2013) An
energy-aware spatio-temporal correlation mechanism to perform efficient data collection in wireless
sensor networks. Comput Commun 36(9):1054–1066
38. Li G, Wang Y (2013) Automatic ARIMA modeling-based data aggregation scheme in wireless sensor networks. EURASIP J Wirel Commun Netw 2013(1):85
39. Santini S, Romer K (2006) An adaptive strategy for quality-based data reduction in wireless sensor
networks. In: Proceedings of the 3rd International Conference on Networked Sensing Systems, pp
29–36
40. Yin Y, Liu F, Zhou X, Li Q (2015) An efficient data compression model based on spatial clustering
and principal component analysis in wireless sensor networks. Sensors 15(8):19443–19465
41. Lin H, Bai D, Gao D, Liu Y (2016) Maximum data collection rate routing protocol based on topology control for rechargeable wireless sensor networks. Sensors 16(8):1201
42. Cluster with Self-Organizing Map Neural Network-MATLAB & Simulink –MathWorks. https://kr.mathworks.com/help/nnet/ug/cluster-with-self-organizing-map-neural-network.html
43. Comparison OF LEACH EAMMH SEP TEEN Protocols (Contact for codes in WSN)–File Exchange–MATLAB Central [Internet]. http://kr.mathworks.com/matlabcentral/fileexchange/46199-comparison-of-leach-eammh-sep-teen-protocols–contact-for-codes-in-wsn-
44. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets/Air+quality
45. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets/3D+Road+Network+(North+Jutland%2C+Denmark)
46. Hevin Rajesh D, Paramasivan B (2015) Data aggregation framework for clustered sensor networks
using multilayer perceptron neural network. Int J Adv Res Comput Eng Technol (IJARCET) 4(4).
https://ijarcet.org/wp-content/uploads/IJARCET-VOL-4-ISSUE-4-1156-1160.pdf