1. Introduction
Wireless Sensor Networks (WSNs) have exploded in popularity in the last few years. Part of this growth is due to the popularization of the Internet of Things (IoT), where connectivity, sensitivity, interaction, and energy are elements of the systems in a WSN. In a WSN, a node is defined as the minimal functional unit of a network and is composed of a sensor/actuator, a central processing unit (CPU), a memory bank, a wireless transceiver, and a power source. As a unit, the node suffers depletion of its internal battery as a result of sensing, processing, and data transmission and reception.
Interaction through wireless transmission in a WSN raises issues such as link viability, time to establish communication, data loss due to competition for and overuse of a wireless channel, data loss due to simultaneous transmission attempts, data loss due to repeated network flooding, and data loss due to transmission range.
Network scalability problems are caused by the birth, reboot, and death of one or several nodes in the network. Link problems in WSNs include neighbor discovery, message loss, latency, and congestion. WSNs also have routing problems such as communication path and loop discovery [1]. In general, WSNs have a wide range of problems, although most of them have been addressed through communication protocols.
The aforementioned problems have been approached through flat and hierarchical routing structures. In a flat routing structure, all the nodes in the network play the same role in end-to-end routing protocols. In hierarchical routing structures, nodes are classified by functionality and the network is divided into groups or clusters, each of which chooses a leading node called the Cluster Head (CH) node. The CH node coordinates activities inside and outside the cluster with non-CH nodes.
The main feature of flat routing structures is the ability to establish communication between any two nodes in the network without the participation of a central node. These networks, also called ad hoc networks, can operate in isolation, without connection to network infrastructure such as the Internet.
A hierarchical routing structure organizes large-scale ad hoc networks into groups or clusters, with the objective of improving network efficiency beyond the attainable level of flat routing structures. Although hierarchical routing can increase control traffic, its topology allows for data traffic to be confined within each cluster. Nodes inside each cluster can organize to optimize communications and reduce interference caused by simultaneous data transmission.
Cluster-based hierarchical routing presents some advantages in scalability and communication efficiency. Protocols based on this schema have been used to achieve routing efficiency in connection to node energy levels. That is, nodes with higher energy reserves are candidates to become CH nodes, and those with lower energy levels are used to monitor the environment. In this type of routing, CH nodes have specific functions to improve the scalability, lifetime, and energy efficiency of the network.
Cluster-based routing structures represent a simpler approach to the issues of WSNs, lowering the complexity of flat routing structures [2]. However, cluster-based routing schemes are, in general, comprised of a cluster-formation and a CH-node selection mechanism. The behavior of the latter has an impact on the WSN’s general performance since a high degree of variability in this mechanism creates a proportional variability of the network’s response regarding delay, jitter, and throughput, and an inversely proportional response in energy levels.
The cluster formation mechanism creates clusters of varying sizes and, as a consequence, of varying density. This cluster formation method has an impact on the behavior of the network, which responds according to its cluster formation type. For example, a heterogeneous cluster formation causes data traffic within the clusters to become heterogeneous, that is, having varying dataflow responses in the majority of network nodes. This flow variability causes the network to show different delay and jitter values, limiting the usefulness of resource-intensive applications.
Given the problems found in the cluster formation and CH node selection mechanisms, our goal is to propose a WSN communications protocol that uses a hierarchical routing schema called H-kdtree. Its routing algorithm is based on the k-d tree algorithm, which partitions an area at the median of the data along one of its dimensions. Additionally, the H-kdtree protocol proposes a low-variability CH node generation mechanism, with a positive impact on delay, jitter, and throughput, compared with the low-energy adaptive clustering hierarchy protocol (LEACH) and low-energy adaptive clustering hierarchy-centralized (LEACH-C).
As a comparison, we have selected the LEACH and LEACH-C protocols, on account of their being the most widely used hierarchical clustering protocols and also the most discussed in the literature. Clustering protocols work through one or several metrics that provide the necessary ability to manage network traffic efficiently and to improve the experience of a user or machine in different network environments. These improvements are usually in one or several network metrics such as load balancing, energy consumption, scalability, latency reduction, data traffic maximization, and error and data loss minimization. Quality of Service (QoS) is the improvement in one or several of these network metrics.
This article analyzes the performance of LEACH, LEACH-C, and the proposed H-kdtree protocol by measuring the following metrics: delay, jitter, throughput, packet delivery ratio (PDR), and average network energy.
This article is structured as follows: Section 2 includes a review of literature related to cluster formation and CH node selection. The fundamental basis for LEACH, LEACH-C and k-d tree is described in Section 3. Protocol considerations and a description of the configuration and data transmission phases are discussed in Section 4. Section 5 includes parameters, metrics, and results of the simulation of the proposed protocol. Finally, Section 6 presents the conclusions.
4. Proposed Protocol
This section describes in detail the hierarchical k-dimensional tree algorithm (H-kdtree). Unlike conventional WSN routing protocols like LEACH, HEED, TEEN, etc. [47,48], which use a few variations on Equation (1) to form clusters and select CH nodes according to nodes’ residual energy and their distance to the Sink/BS node, H-kdtree uses the one-dimensional clustering principle taken from the k-d tree algorithm. This algorithm generates a hierarchical two-hop network topology similar to LEACH’s.
Next, we will explain the clustering mechanism of the k-d tree algorithm intuitively, using the data from Table 1.
The data in Table 1 are divided into two clusters, starting the data partition with the x dimension and splitting at its median value. For the data in Table 1, we have a median value of 53.5 for the x dimension, as shown in Figure 5.
In Figure 5, the algorithm shows the formation of two clusters. Each of the clusters found contains three variables, and each cluster has its own limits on the y dimension. The structure in Figure 5 is divided in the same way but alternating the dimension, which in this case is y, obtaining a new structure with four clusters, as shown in Figure 6.
Based on the structure obtained so far, as shown in Figure 6, we change dimension and begin to create new partitions, as shown in Figure 7. The process shown so far is repeated iteratively until the stop condition is met, which is either the desired number of clusters or the minimum group condition.
Algorithm 2 has the k variable as the input parameter. The k variable represents the number of clusters desired (this is a parameter similar to that of the k-means algorithm). If k = 3, the first iteration obtains the data in Figure 5. Up to this point, we have two clusters. In the next iteration, we would obtain four clusters, but since the goal is to obtain three clusters, we take the clusters obtained so far, select the cluster with the highest node count, and partition only this cluster. In this way, we obtain the desired three clusters. In case the two clusters obtained in the first iteration have an equal number of nodes, one of them is chosen randomly.
4.1. Protocol Considerations
The maximum number of clusters that can be obtained is the total number of nodes divided by four. The minimum cluster condition regarding the number of nodes is three nodes and a CH node. In case we have a remainder smaller than four, the remaining nodes are added to the nearest cluster. In other words, the last partition of the current dimension in the algorithm is not performed.
The working principle of the H-kdtree protocol determines the following condition and consideration for the management of the complete network: Every action in the network is centralized and managed by the Sink/BS node, and all nodes in the network are within the range of the Sink/BS node.
Every node has a minimum energy threshold. This threshold is a function of the power supply voltage of each node. Every node sends the Sink/BS node a “Death” message when its power reserve reaches 3% of the minimum operational threshold, informing the Sink/BS node that it is dead.
4.2. Configuration Phase
The Sink/BS node begins by flooding the network with a broadcast “Hello” message, to which every node in the network replies with an acknowledgment message reporting its position and energy level.
With the information obtained in the flooding process, H-kdtree begins the cluster formation process, based on the k-d tree algorithm. Once the clusters are formed, the following step is to select the CH nodes, based on the energy levels obtained during the flooding process. The node with the highest energy is selected as CH node. If two or more nodes have the same energy level, one of them is randomly selected to become a CH node.
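The CH selection rule stated above (highest energy wins, ties resolved at random) can be sketched as follows. The function name `select_cluster_head` and the representation of a cluster as a list of `(node_id, energy)` pairs are assumptions made for this illustration.

```python
import random


def select_cluster_head(cluster, rng=random):
    """Return the id of the node with the highest energy in `cluster`.

    `cluster` is a list of (node_id, energy) pairs; if several nodes share
    the highest energy level, one of them is chosen at random.
    """
    top_energy = max(energy for _, energy in cluster)
    candidates = [node_id for node_id, energy in cluster
                  if energy == top_energy]
    return rng.choice(candidates)


# Hypothetical energy readings collected during the flooding process.
cluster = [(1, 0.5), (2, 0.9), (3, 0.7)]
print(select_cluster_head(cluster))
```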
At this point, the Sink/BS node already has information of every cluster, with their respective CH node and member nodes. The next step is configuring static routes. The Sink/BS node sends the CH nodes the static routes information, which is then forwarded to the rest of the nodes in each cluster. The result is the typical LEACH hierarchical routing, using a two-hop topology.
4.3. Transmission Phase
The data transmission phase is divided into rounds. Every round has N time slots, where N is the number of non-CH nodes in the network. During this period, the Sink/BS node sends a “Request” packet to the first CH node. Once the CH node receives this “Request” packet, it organizes a scheduled transmission with the nodes in its cluster, using TDMA as the access method. This process is repeated for all the clusters in the network.
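As an illustration of the round structure just described, the sketch below assigns one TDMA slot per non-CH node, visiting the clusters one after another. The function name `tdma_schedule` and the dictionary layout (CH id mapped to its member ids) are hypothetical; the source does not specify the order in which clusters or members are served.

```python
def tdma_schedule(clusters):
    """Build one round of slots: one slot per non-CH node.

    `clusters` maps each CH node id to the list of its member node ids.
    Returns a list of (slot, ch_id, member_id) tuples.
    """
    schedule = []
    slot = 0
    for ch_id, members in clusters.items():
        # The Sink/BS node queries this CH, which then polls its members.
        for member_id in members:
            schedule.append((slot, ch_id, member_id))
            slot += 1
    return schedule


# Hypothetical topology: two clusters with three members each.
clusters = {"CH1": ["n1", "n2", "n3"], "CH2": ["n4", "n5", "n6"]}
schedule = tdma_schedule(clusters)
print(len(schedule))
```

Because every member owns a distinct slot, no two nodes transmit at the same time within a round.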
The “Death” packet informs the Sink/BS node that a node has just died. Nodes in a cluster transmit the “Death” packet to the Sink/BS node using their respective time slot. If a CH node dies, it transmits its “Death” packet to the Sink/BS node when queried by the Sink/BS node. At the end of each round, the Sink/BS node reviews which nodes sent a “Death” packet. If a “Death” packet has arrived by the end of a round, the Sink/BS node begins the configuration phase again, as shown in Figure 8.
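The on-demand behavior described above can be sketched as a simple loop: the configuration phase is repeated only after a round in which at least one “Death” packet was received. The energy model (a fixed cost per round), the function name `reconfiguration_rounds`, and the sample values are assumptions made for this illustration only.

```python
def reconfiguration_rounds(initial_energy, cost_per_round, threshold, max_rounds):
    """Return the rounds after which the configuration phase is re-run.

    `initial_energy` maps node ids to energy units. A node emits a "Death"
    packet in the round where its energy drops to `threshold` or below.
    """
    energy = dict(initial_energy)  # work on a copy
    alive = set(energy)
    reconfigured_at = []
    for rnd in range(1, max_rounds + 1):
        death_packets = set()
        for node in alive:
            energy[node] -= cost_per_round
            if energy[node] <= threshold:
                death_packets.add(node)
        alive -= death_packets
        if not alive:
            break  # no nodes left, nothing to reconfigure
        if death_packets:
            # Reactive step: reconfigure only when a node has died.
            reconfigured_at.append(rnd)
    return reconfigured_at


# Hypothetical energy budget, in arbitrary integer units.
rounds = reconfiguration_rounds({"a": 5, "b": 5, "c": 3, "d": 2},
                                cost_per_round=1, threshold=0, max_rounds=10)
print(rounds)
```

In rounds without a “Death” packet the topology is left untouched, which is the property the protocol relies on to keep the topology stable across rounds.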
Figure 9 below shows the H-kdtree protocol’s algorithm. Algorithm 2 shows the cluster formation process as a complement to the processes described in this section.
Algorithm 2 Cluster formation based on the k-d tree algorithm.
Require: Matrix mat obtained in the flooding process, with four fields per node. The variable dim is the column of mat that corresponds to the dimension to be selected.
Ensure: List with the vectors of the positions selected in each cluster
function cluster_formation(mat, dim)
    c1 ← empty vector    # Variable c1 stores the positional values of the first cluster
    c2 ← empty vector    # Variable c2 stores the positional values of the second cluster
    med ← median(mat[, dim])    # Calculate the median of the selected partition
    # Traverse the data in the selected dimension
    for (i in 1:length(mat[,1])) do
        # Subdivide the dimension into two clusters, depending on the median value
        if mat[i, dim] ≤ med then
            append i to c1
        else
            append i to c2
        end if
    end for
    return list(c1, c2)
5. Simulation and Results Analysis
For the simulation, we used NS-2 version 2.35, simulating LEACH, LEACH-C and H-kdtree in the same network environment to make a comparison and obtain metrics in the same simulator. The LEACH, LEACH-C and H-kdtree algorithms were implemented using R version 3.4.3. Implementing the algorithms using R allowed us to generate scripts in “.tcl”, which were embedded in the main script to configure position and initial energy of the nodes, static routing between nodes, CH nodes and Sink/BS node, along with traffic generated in each time slot, and planned information transmission in TDMA.
5.1. Simulation Parameters
There were two simulation scenarios for LEACH, LEACH-C and H-kdtree. Scenario 1 consisted of a random deployment of sensors. Scenario 2 was a deterministic deployment with higher sensor density in the central zone, as shown in Figure 10.
The use of random and deterministic node deployment scenarios aimed at abstracting network traffic behavior to evaluate QoS. Node deployment was done in a 100 m × 100 m area, maintaining the same density in both scenarios. The deterministic scenario aimed to evaluate network traffic in a scenario with higher density in its central area, with the objective of analyzing the influence of clustering in both types of scenarios.
Table 2 shows the simulation parameters used. These parameters are used in the literature mainly to evaluate the performance of LEACH and LEACH-C [49,50,51,52].
Hierarchical protocols present two types of networks, according to their energy: homogeneous and heterogeneous networks. In homogeneous networks, the initial energy level is the same for all the nodes in the network. In heterogeneous networks, the nodes in the network have different initial energy values.
In the scenarios shown in Figure 10, the network is divided into two energy levels. This energy division is represented by the parameter m, which is used to calculate cluster energy $E_{cluster}$ [53]. $E_{cluster}$ can be calculated as follows:

$$E_{cluster} = n \cdot E_0 \cdot (1 + \alpha \cdot m)$$

where $E_0$ is the initial energy of a regular node, $n$ is the number of clusters, and m is the percentage of nodes in the network with an advanced energy level. The quantity of advanced energy is represented by $\alpha$. H-kdtree uses the simulation parameter k to obtain the depth with which the nodes in the network will be partitioned. This partition is similar to LEACH’s p parameter, which is used to estimate the expected number of CH nodes.
5.2. Simulation Metrics
To assess data traffic performance and QoS in the proposed scenarios, we used the following performance metrics.
5.2.1. End-to-End Delay (EED)
It is the time elapsed from when a packet is sent by a node until the packet is received by the Sink/BS node, taking into account the latencies experienced along its path, including the latency of the CH node [54]. It is calculated as follows:

$$EED = T_{received} - T_{sent}$$

where $T_{received}$ is the time when the Sink/BS node receives a data packet, and $T_{sent}$ is the time when a non-CH node sends that data packet.
5.2.2. Throughput
This is the number of bits that can be transmitted by each node to the Sink/BS node in a period of time [55]. The sum of the throughput of each node in the network is known as network throughput. The throughput is obtained by dividing the total number of packets received (by the Sink/BS node) by the total time for each round.
5.2.3. Packet Delivery Ratio (PDR)
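This is the ratio between the number of data packets received by the Sink/BS node and the number of data packets sent by the network nodes [56]. The PDR value can be obtained by the following equation:

$$PDR = \frac{\text{data packets received by the Sink/BS node}}{\text{data packets sent by the network nodes}}$$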
5.2.4. Jitter
Jitter can estimate the instability of a communication link. It is the variability in the time needed by a packet to reach the previously transmitted packet [57]. It is calculated by:
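The metrics defined in Sections 5.2.1 to 5.2.4 can be sketched in Python as follows. The function names and sample values are hypothetical. The delay, throughput, and PDR functions follow the verbal definitions given in the text; the jitter function is an assumption, since it uses one common formulation (the mean absolute difference between the delays of consecutive packets) that may differ from the exact expression used by the authors.

```python
def end_to_end_delay(sent_time, received_time):
    """Delay between a non-CH node sending and the Sink/BS node receiving."""
    return received_time - sent_time


def throughput(packets_received, round_time):
    """Packets received by the Sink/BS node divided by the round duration."""
    return packets_received / round_time


def packet_delivery_ratio(packets_received, packets_sent):
    """Ratio of packets received by the Sink/BS node to packets sent."""
    return packets_received / packets_sent


def jitter(delays):
    """ASSUMED definition: mean absolute difference of consecutive delays."""
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Hypothetical send and receive timestamps, in seconds.
sent = [0.0, 1.0, 2.0]
received = [0.5, 1.75, 2.5]
delays = [end_to_end_delay(s, r) for s, r in zip(sent, received)]
print(delays, jitter(delays))
```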
5.2.5. Auxiliary Metrics
Other performance metrics used in hierarchical routing protocols are summarized below. These metrics synthesize the results in terms of node extinction per round. The metrics evaluated for the protocols are:
First node died (FND): It is the number of rounds in the network until the first node has depleted its energy and died.
Half of nodes died (HND): It is the number of rounds in the network until half of the nodes in the network have depleted their energy and died.
Last node died (LND): It is the number of rounds in the network until all nodes in the network have depleted their energy and died.
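The three auxiliary metrics listed above can be derived from a record of the round in which each node died. The sketch below is illustrative only: the function name `lifetime_metrics`, the input format, and the choice to round half of the nodes up for odd node counts are assumptions.

```python
import math


def lifetime_metrics(death_rounds, total_nodes):
    """Return (FND, HND, LND) from the rounds at which nodes died.

    `death_rounds` holds one entry per dead node. A metric is None when the
    corresponding event has not happened yet in the observed rounds.
    """
    ordered = sorted(death_rounds)
    half = math.ceil(total_nodes / 2)  # assumption: round up for odd counts
    fnd = ordered[0] if ordered else None
    hnd = ordered[half - 1] if len(ordered) >= half else None
    lnd = ordered[-1] if ordered and len(ordered) == total_nodes else None
    return fnd, hnd, lnd


# Hypothetical death rounds for a four-node network.
print(lifetime_metrics([12, 30, 30, 45], total_nodes=4))
```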
5.3. Results and Discussion
In the results obtained, one of the most stable parameters found in the proposed H-kdtree protocol is related to the formation of CH nodes, as shown in Figure 11. In this section, we analyze the impact of low variability in CH node formation, in relation to the following performance metrics: delay, throughput, and jitter, and their results are interpreted as QoS.
Regarding CH node formation in each round, we observed that LEACH and LEACH-C reduce the formation of CH nodes as nodes die. On the contrary, H-kdtree increases CH node formation in the network because of the minimum nodes per cluster value: as the number of nodes goes down, H-kdtree tends to maintain its k value by iterating more times, which tends to comply with the minimum nodes per cluster condition. This behavior can be seen in Figure 11 and Figure 12 after round 80.
Regarding energy levels, we did not find a significant variation or tendency in H-kdtree compared to LEACH and LEACH-C, as shown in Figure 13. The reason is that the energy that LEACH and LEACH-C use in node formation is offset by the energy that H-kdtree uses in selecting CH nodes. The latter is more stable in terms of variations and allows for a more stable behavior in the data transmission phase.
Figure 12 shows node death compared to energy. H-kdtree resulted in a lower number of dead nodes in both scenarios, compared to LEACH and LEACH-C.
The scenarios allow us to assess QoS features from a hierarchical point of view for LEACH, LEACH-C and H-kdtree, as the protocols share a clustering topology with a two-hop distance to the Sink/BS node.
Regarding delay, the scenario with random node deployment shows more delay than the deterministic scenario with more density in its central area. H-kdtree maintains a stable number of CH nodes for the maximum possible number of rounds. This simplifies the work of the Sink/BS node, as each node has an identifiable death threshold that, when reached, triggers the transmission of a “Death” packet to the Sink/BS node to report its death. This feature enables H-kdtree to maintain a network topology for the maximum possible number of rounds. This is not possible for LEACH and LEACH-C, because their network topology changes in every round.
In the rounds we evaluated, H-kdtree changed the topology of the network five times in the random scenario and nine times in the deterministic scenario, as shown in Figure 12. LEACH and LEACH-C changed their topology 100 times in both scenarios (the topology changes in each round).
CH nodes do not transmit sensory data. CH nodes compile packets from each cluster and retransmit them to the Sink/BS node. For this reason, if the network topology remains constant, delay, jitter, and throughput metrics will not vary, as these three metrics are a function of time. On the other hand, TDMA divides the network nodes into time slots and ensures no packet loss due to simultaneous transmission.
In both scenarios, H-kdtree shows the lowest values for delay and jitter, due to H-kdtree’s low variability of topology, compared to LEACH and LEACH-C. This is shown in Figure 14 and Figure 15.
The stability of H-kdtree allows data traffic to remain constant, with very low variability in the rounds we evaluated as compared to LEACH and LEACH-C, in both scenarios. The results shown in Figure 14, Figure 15 and Figure 16 support our recommendation of H-kdtree for multimedia applications, due to its stability in delay, jitter, and throughput metrics.
5.4. Results Summary
Figure 17 shows a general overview of the distribution symmetry of metrics, showing that LEACH and LEACH-C tend toward a symmetrical distribution of CH nodes in both the random and deterministic scenarios (this is due to the random function used in their algorithm). This result does not occur with H-kdtree. The four anomalous values shown in LEACH and LEACH-C are the minimum and maximum values in the observed rounds, outside the first and third quartile.
The anomalous values for H-kdtree in the deterministic scenario are in the last rounds. Regarding CH node formation, H-kdtree shows low variability, with most of the data in the first and second quartile. Our interpretation is that at least 75% of CH nodes formed showed low variability along the observed rounds.
Figure 17a and Table 3 show the low variability in the number of CH nodes. H-kdtree kept the number of CH nodes below the number of CH nodes generated by LEACH for the first 75 rounds (Q3). H-kdtree showed a reduction of 14.21% for the random scenario and of 30.14% for the deterministic scenario, as compared to LEACH.
Regarding average energy consumption for all nodes, as shown in Figure 17b and Table 4, the energy behavior was similar in both H-kdtree and LEACH during the observed rounds. LEACH-C presents an energy efficiency more than 40% above that of LEACH and H-kdtree.
Figure 17c and Table 5 show a proportional relationship between CH node formation and the number of dead nodes along the observed rounds.
To compare the results and quantify them as percentages, we used normalized averages. This will allow us to estimate the performance improvement for the metrics in the protocols, taking as a reference point the results for LEACH in the random scenario.
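As a rough illustration of how a percentage improvement relative to a reference can be computed, the sketch below normalizes an average against a reference average. The function name `percent_change` and the sample numbers are hypothetical, and the exact normalization used by the authors is not stated in the text; this is one plausible reading.

```python
def percent_change(value, reference):
    """Change of `value` relative to `reference`, in percent.

    Negative results indicate a reduction with respect to the reference.
    """
    return (value - reference) / reference * 100


# Hypothetical averages (not results from the article): a metric of 50
# against a reference of 200 is a 75% reduction.
print(percent_change(50, 200))
```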
Figure 17c and Table 5 show that between quartiles Q1 and Q3, which correspond to 25% and 75% of the number of observed rounds, H-kdtree presented the lowest number of dead nodes. In its final stage, H-kdtree shows the highest number of dead nodes due to the on-demand CH node generation mechanism, compensated with energy expenditure after the Q3 quartile. Although H-kdtree does not provide an optimal node energy distribution, the on-demand CH node selection mechanism provides a significant improvement in QoS, as measured in delay, jitter, and throughput.
The variability in delay, jitter, and throughput of the proposed H-kdtree protocol as compared to LEACH and LEACH-C is very low in the random and deterministic scenarios, as shown in Figure 18. This is because the level of dispersion of values from the central trend in LEACH and LEACH-C is quite noticeable. These observations allow us to estimate QoS both quantitatively and qualitatively, supporting applications with higher demands.
With the level of delay variability shown in Figure 18a and Table 6, H-kdtree is able to provide QoS in networks with hierarchical topologies. Compared to LEACH, H-kdtree showed a reduction in delay of 87.72% in the random scenario and of 95.39% in the deterministic scenario. With respect to LEACH-C, H-kdtree presented a reduction of 82.095% for the random scenario and 93.1% for the deterministic scenario.
The jitter response shown in Figure 18b and Table 7, interpreted as temporal variability in packet transmission, is also due to using TDMA for medium access. The level of jitter reduction found was 76.52% in the random scenario and 74.4% in the deterministic scenario. The values for delay, jitter, and variance were so low during the observed rounds that we can estimate that H-kdtree can guarantee the requirements for multimedia applications. The reason for this is that H-kdtree’s on-demand CH node selection mechanism is able to manage WSN resources efficiently.
The results obtained show that H-kdtree is able to provide QoS in applications with high restrictions in bandwidth and delay, at the expense of energy consumption. On the other hand, LEACH and LEACH-C are able to adapt to energy fluctuations in the network but are not capable of supporting multimedia applications or time restrictions, on account of their high variability.
Regarding bandwidth and the amount of data that it can transport per round, H-kdtree showed an increase of 48.96% in the random scenario and of 39.37% in the deterministic scenario, as shown in Figure 18c and Table 8. This increase compensates for and justifies the energy requirements for transmitting data packets to the Sink/BS node.
Among the metrics for hierarchical protocols, we took into account the metrics related to node death, included in Table 9.
5.5. Other Tests Performed
As a complement to the results obtained, we performed tests with 100, 300, and 400 nodes on areas of proportional size, maintaining the same node density as the 200-node tests. These tests were performed in a scenario with random node deployment. Energy assignment for 100, 300, and 400 nodes was also proportional to the 200-node tests. For these tests, we only took into account the average values of metrics, using the same metrics as the 200-node tests.
In the evaluation of a random scenario with 100, 200, 300, and 400 nodes, the average number of CH nodes in the scenario is not relevant. However, the lower variability shown in the variance confirms H-kdtree’s characteristic on-demand CH node selection mechanism, maintaining its variance 83% below LEACH and LEACH-C, as shown in Table 10.
Regarding energy value, results show that H-kdtree maintains energy levels that are very close to those of LEACH, and therefore does not show an improvement in this area. However, H-kdtree shows a significant improvement in QoS as compared to LEACH and LEACH-C, as shown in Table 11.
Table 12 shows that the use of an on-demand mechanism implies that the protocol only reacts to a change requested by the network. In the case of H-kdtree’s CH node selection mechanism, this means that it will only be used as a response to receiving a “Death” packet. Note that the number of nodes close to death is the number of CH nodes. Node death in H-kdtree is stepped, and in LEACH and LEACH-C it is incremental.
H-kdtree shows an improvement in delay reduction, with values over 60% as compared to LEACH and LEACH-C, as shown in Table 13.
H-kdtree shows a 95% jitter reduction as compared to LEACH and LEACH-C, as shown in Table 14.
H-kdtree shows a 50% throughput increase as compared to LEACH and LEACH-C, as shown in Table 15.
The metrics observed in Table 16 show that the stability period in the half-life of network nodes for the proposed H-kdtree protocol is longer than in LEACH and shorter than in LEACH-C. After 50% of the rounds, we found node death in H-kdtree to be stepped, while maintaining low variability in delay and jitter. This was not only due to its reactive mechanism but also because of its stability derived from using TDMA for medium access. The WSN we studied did not present node mobility: all nodes maintained their positions. This characteristic was used by H-kdtree and its cluster formation mechanism, which is based on the k-d tree algorithm and adds more stability by keeping the majority of nodes in the same clusters after each configuration phase.
The results on PDR showed that TDMA-based packet transmission planning resulted in no packet loss for LEACH, LEACH-C and H-kdtree, in all scenarios with 100, 200, 300, and 400 nodes.
6. Conclusions and Future Work
The H-kdtree protocol has two main contributions. First, the cluster formation method based on the k-d tree algorithm partitions the sensor node deployment area in a two-hop hierarchical topology. Second, it is a WSN protocol that provides QoS in support of services with stricter resource demands while keeping energy usage at a level similar to the LEACH protocol.
The proposed H-kdtree protocol was based on the k-d tree algorithm, using spatial partitioning to organize nodes in a two-dimensional space (x and y). The average energy results obtained with LEACH-C exceed those of LEACH and H-kdtree by 42%. The partitions found become clusters, creating a network topology that is able to provide QoS for the longest possible time with energy requirements similar to those of LEACH. H-kdtree is characterized by keeping the number of CH nodes stable for the longest number of rounds, maintaining a constant network topology and, as a consequence, low variability in delay, jitter, and throughput metrics. Although these metrics are a function of time, they depend on the variability of the number of CH nodes.
The H-kdtree protocol has three main processes. First, the protocol uses a two-hop network topology that is not altered in every round. Then, during the data transmission phase, the “Death” packet allows H-kdtree to implement a reactive mechanism that only returns to the configuration phase when a node requires it by sending the “Death” packet. This means that the configuration phase is only repeated on demand. Finally, the minimum group condition allows network traffic to be more homogeneous, which is reflected in delay, jitter, and throughput and, as a consequence, in improved QoS.
The set of experiments performed in random scenarios with 100, 200, 300, and 400 nodes, and a deterministic scenario with 200 nodes, helped us compare LEACH and LEACH-C with the proposed H-kdtree protocol. The conclusion is that the H-kdtree protocol fulfilled the objective of addressing existing problems in cluster generation mechanisms by reducing the variability in CH node formation: with the same resources used in LEACH and LEACH-C, H-kdtree improved delay and jitter by 60% and 95%, respectively, and throughput by over 50%, while keeping energy usage at the same level as LEACH.
Additional experiments will be required to measure H-kdtree’s performance in additional scenarios, incrementing the number of rounds, varying density in environments with heterogeneous node-energy levels, and proposing optimization mechanisms for CH node selection to maximize energy levels in the network. Additionally, with the QoS results obtained, it will be necessary to perform traffic analysis with multimedia data.