
Received December 27, 2020, accepted January 10, 2021, date of publication January 13, 2021, date of current version January 20, 2021.

Digital Object Identifier 10.1109/ACCESS.2021.3051360

Q-Learning-Based Data-Aggregation-Aware
Energy-Efficient Routing Protocol for
Wireless Sensor Networks
WAN-KYU YUN AND SANG-JO YOO, (Member, IEEE)
Department of Information and Communication Engineering, Inha University, Incheon 402-751, South Korea
Corresponding author: Sang-Jo Yoo ([email protected])
This work was supported by the Inha University Research Grant.

ABSTRACT The energy consumption of the routing protocol can affect the lifetime of a wireless sensor
network (WSN) because tiny sensor nodes are usually difficult to recharge after they are deployed. Generally,
to save energy, data aggregation is used to minimize and/or eliminate data redundancy at each node and
reduce the amount of the overall data transmitted in a WSN. Furthermore, energy-efficient routing is widely
used to determine the optimal path from the source to the destination, while avoiding the energy-short
nodes, to save energy for relaying the sensed data. In most conventional approaches, data aggregation and
routing path selection are considered separately. In this study, we consider the degrees of the possible
data aggregation of neighbor nodes when a node needs to determine the routing path. We propose a
novel Q-learning-based data-aggregation-aware energy-efficient routing algorithm. The proposed algorithm
uses reinforcement learning to maximize the rewards, defined in terms of the efficiency of the sensor-
type-dependent data aggregation, communication energy and node residual energy, at each sensor node
to obtain an optimal path. We used sensor-type-dependent aggregation rewards. Finally, we performed
simulations to evaluate the performance of the proposed routing method and compared it with that of
the conventional energy-aware routing algorithms. Our results indicate that the proposed protocol can
successfully reduce the amount of data and extend the lifetime of the WSN.

INDEX TERMS Wireless sensor networks, routing, data aggregation, Q-learning, network lifetime.

The associate editor coordinating the review of this manuscript and approving it for publication was Hongwei Du.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 9, 2021 10737
W.-K. Yun, S.-J. Yoo: Q-DAEER Protocol for WSNs

I. INTRODUCTION
A wireless sensor network (WSN) can be defined as a self-configured and infrastructure-less wireless network used to monitor and record the physical conditions of an environment and store the collected data at a central location. WSNs have received considerable attention for multiple types of applications because of their low cost, small size and applicability in diverse fields such as healthcare, military and underwater monitoring [1]. Recently, the device, network and data management technologies for WSNs have been extended to other fields such as smart factories, where sensor nodes are deployed to collect data on products and machines for smart factory operations. In smart cities, WSNs can be deployed to create an efficient service delivery platform for public and municipal workers and to manage the city resources efficiently [2], [3].
In a WSN, many sensor nodes are deployed over a wide area to collect observation data and send them to a sink (or server). Therefore, multi-hop transmission is required to deliver the collected data successfully to the sink located beyond the transmission range of the source sensor node. This requires a collecting sensor node to calculate the optimal route to the sink. Energy efficiency is a primary challenge to the successful application of WSNs because nodes have limited energy and cannot be recharged easily after they have been deployed. Furthermore, because energy is mostly consumed by the radio device, an energy-efficient design of the routing algorithm for communication is essential. Most of the ongoing research on energy-aware routing has two objectives: to minimize the overall energy consumption on the routing path and maintain even residual energy levels. Because the overall energy consumption depends on the distance between nodes and the number of intermediate nodes, the minimum hop count path or shortest distance path is generally used for WSN routing. The residual energy level of each node or
power drain rate is also considered to avoid path disconnection and network partition. These measures can prolong the network lifetime because energy is dissipated more equally among all nodes [4], [5].
Because the data being collected by multiple sensors in a given area are based on common phenomena, there is likely to be some redundancy in the source data. Data aggregation as a form of "in-network processing" in WSNs is widely used to collect data in an energy-efficient manner by eliminating redundancy and minimizing the number of transmissions or data size. In many WSN applications, the actual measured raw data at each sensor node may not need to be delivered in the exact same form to the sink. The raw data can be abstracted or compressed in networks. Depending on the monitoring purposes of applications, various aggregation techniques can be used, such as abstracting as {mean, variance}, maximum value, minimum value, lossy compression, feature domain reduction and data prediction. The efficiency of data aggregation increases when the correlation among the data collected by various sensors is high [6], [7].
Various machine learning technologies have been used to effectively capture the dynamic features such as node topology changes, restricted energy conditions, event detection and communication costs of WSNs for their energy-efficient operation. Among them, reinforcement learning (RL) is particularly suitable for problems that include a long- versus short-term reward trade-off. It provides a framework for a system to learn from its previous interactions with its environment and to select its actions efficiently in the future. RL-based routing protocols can determine the optimal path as an adaptive method for complex network conditions and quality of service requirements [8]–[10].
Most previous studies on energy-efficient routing path selection typically consider communication energy with hop counts and the distance to the sink node to reduce the overall network-wide energy consumption and/or the residual energy level at each sensor node to distribute the energy burden equally. However, distributing the possible routes to reduce the overhead of specific sensor nodes may conflict with the objective of minimizing the network-wide energy consumption. Notably, the optimization goals do not consider the possibility of data aggregation through the path. Furthermore, data aggregation and routing path selection are considered separately in conventional approaches [11]–[14].
In this article, we propose an RL-based energy-aware routing algorithm for obtaining a global optimum path to minimize the overall energy consumption and prolong the lifetime of the WSN. We define the degrees of the possible data aggregation of neighbor nodes when a node needs to determine the routing path. Because data from various sensor types (e.g., vibration measurement sensor and temperature sensor) may not show strong correlation, they cannot be aggregated together. Therefore, we define sensor-type-dependent aggregation rewards. We propose a novel Q-learning-based data-aggregation-aware energy-efficient routing (Q-DAEER) algorithm, in which each sensor node is reinforced to determine the optimum path that can maximize the rewards by considering the sensor-type-dependent data aggregation level of the neighbor node, the residual energy, communication cost with distance and hop count to the sink. In this way, the sensor nodes can determine the optimum next hop node using their updated Q-values based on the rewards.
This article is organized as follows: In Section II, we review the existing energy-aware routing protocols for the WSN. In Section III, we present our proposed system model for WSN routing. In Section IV, we discuss the Q-DAEER algorithm. We present the simulation results in Section V and conclude this article in Section VI.

II. RELATED WORK
Routing is essential in WSNs to support reliable data transfer, achieve low latency and provide energy-efficient operation. Wireless communications consume a significant amount of power for transmitting sensed data from sensor nodes to sink nodes. However, the power consumption has become a limiting factor because most sensor nodes are powered by batteries. Sensor nodes used in wireless networks have limited computational capability and cannot have full information about networks, so it is very difficult for nodes to calculate the optimum route to the destination quickly. Even when a node is able to obtain the optimum routing path, the path may not remain optimum over time owing to various types of changes in the sensing environment, for example, node movement, unstable wireless channel conditions and the dynamic energy status of sensor nodes. Conventional ad hoc routing protocols can be classified into proactive and reactive protocols [15]. In proactive routing, routes are computed even when they are not needed and stored in a routing table at every node. Therefore, the routing table maintenance overhead is large and limits the scalability of this routing protocol. In reactive routing, routes are computed only when they are needed, and sensor nodes store routes only for their neighbors. However, this protocol may increase latency for sensed data delivery. To overcome these problems, many studies on finding the optimum routing path with low energy consumption are underway.
Mohemed et al. [16] addressed the hole problem in WSNs using two distributed, energy-efficient and connectivity-aware routing protocols. They used two different protocols in local and global environments. This technique can decrease the overhead of topology reformation and prolong the network lifetime. Razaque et al. [17] presented the combined protocol of low-energy adaptive-clustering hierarchy (LEACH) and power-efficient gathering in sensor information systems (PEGASIS), named P-LEACH. This protocol can improve the performance by considering the limitation of cluster-based routing in LEACH and static routing in PEGASIS. Khan et al. [18] addressed the problem of sensor node movement in wireless body area sensor networks using a dynamic routing algorithm. Owing to diverse activities of humans, the positions of sensor nodes on the human body change every second. Therefore, packet and energy losses
occur during transmission when nodes use the static routing algorithm. The authors solved this problem using the information of the residual energies of nodes, hop count to sink distance and throughput when nodes select the next hop node to forward data. Baker et al. [19] applied the GreeDi routing protocol to the ad hoc on-demand distance vector (AODV) in vehicular ad hoc networks (VANET), named GreeAODV, to achieve an energy-efficient routing protocol in the next hop selection. They modeled city map-based VANET scenarios and demonstrated that the proposed algorithm was better than the original AODV. Oubbati et al. [20] proposed an energy-efficient routing protocol, named energy connectivity-aware data delivery, in the flying ad hoc network. They ensured the connectivity of the proposed routing protocol by using the information on unmanned aerial vehicles (UAVs), such as their speed and location, to minimize the packet loss caused by the movement of UAVs.
There are some studies on maximizing data aggregation and network lifetime. Oubbati et al. [21] addressed the trade-off between efficient data aggregation and total link cost minimization. They used a comprehensive weight, named weighted data aggregation routing strategy, for solving the trade-off. By overlapping the paths of the nodes in a cluster-based WSN, they maximized the efficiency of data aggregation and prolonged the network lifetime. Ardakani et al. [22] presented a data-aggregation-aware efficient-routing algorithm in which the mobile agent received data from sensor nodes and aggregated and transmitted the data to the sink. They solved the delay and packet loss in routing protocols using the movement scheme of the mobile agents. Haseeb et al. [23] addressed the security issues in applying the conventional routing algorithm to a large-area Internet of things. They proposed light-weight structure-based data aggregation routing, which is a secure protocol that uses in-route data aggregation for routing data in the conventional routing protocols. Yazici et al. [24] presented a fusion-based framework to reduce the amount of data to be transmitted over the wireless multimedia sensor network by intra-node processing. They designed a sensor node to detect objects using machine learning techniques and proposed a method for increasing the accuracy while reducing the data amount. For sensor network routing, a new cluster-based routing algorithm that consumes less power was presented. Clustering is one of the important techniques for topology control, effective data aggregation and energy-efficient routing in WSNs.
Many researchers have applied machine learning techniques to obtain the optimal routing path with low overhead and cost. Chang et al. [25] applied the k-means algorithm and a genetic algorithm for multi-objective optimization. The sensor nodes in the network were clustered using the k-means algorithm. They constructed a fitness function of the genetic algorithm to maximize the network lifetime. Thangaramya et al. [26] presented a neuro-fuzzy-based energy-efficient clustering algorithm. In neuro-fuzzy, they used a membership function comprising the communication distance and energy information of nodes to use the energy-efficient clusters to minimize packet loss. Guo et al. [27] proposed an energy-efficient routing protocol based on a reinforcement learning algorithm. The nodes were reinforced to calculate the optimal routing path using a reward policy to maximize the energy efficiency and lifetime of the network. Wang et al. [28] used the ant colony optimization (ACO) algorithm to address routing in mobile-sink wireless sensor networks. They proposed an improved ACO algorithm that considered not only the time and energy but also the distance between the selected cluster head (CH) and a mobile sink to calculate the optimum mobility trajectory.
El Alami and Najid [29] proposed the LEACH-based fuzzy cluster head selection algorithm. They determined the chance value using a membership function that consists of residual energy, expected efficiency and the closeness to the base station. The nodes with the higher chance values are selected as CHs in a round. Lee and Teng [30] improved the LEACH algorithm using fuzzy logic in mobile sensor networks. Because changes in node location cause packet losses, they used a membership function composed of the residual energy, movement speed and pause time of nodes. Using the membership function, the chance values of all nodes are calculated to elect the CH nodes. El Alami and Najid [31] proposed an enhanced clustering hierarchy (ECH) approach to achieve energy efficiency in WSNs by using a sleeping-waking mechanism for overlapping and neighboring nodes. Thus, data redundancy is minimized and network lifetime is maximized. Sert and Yazıcı [32] proposed the modified clonal selection algorithm (CLONALG-M), applied to determine the approximate form of the output membership functions to improve the performance of rule-based fuzzy routing. The fuzzy approach is superior to well-defined methodologies, especially where boundaries between clusters are unclear. They derived the optimal solution by using the initial membership function and iterative experiments.
Some studies have focused on data aggregation-based energy-efficient routing in WSNs. Routing sensed data with in-network aggregation provides a better solution in terms of a reduced number of messages, a high aggregation rate and reliable transmission. Zhang et al. [33] proposed a data aggregation mechanism supported by dynamic routing. Nodes in the network select as the next hop the neighbor node that has the minimum value of a function composed of the residual energy, hop count and remaining buffer size. Li et al. [13] presented differentiated data aggregation routing (DDAR), which makes different QoS (Quality of Service) routes to the sink node based on an aggregation threshold and aggregation deadline. Most conventional data aggregation-based routing algorithms use a tree structure or hierarchical clustering architecture to aggregate the data and to find the optimum route to the sink. However, they have not considered network-wise data aggregation possibilities and the corresponding energy consumption for different sensor types, which depend on the type-dependent neighbor
relationship and aggregation degrees of paths. To capture network-wise dynamics, a machine-learning-based adaptive routing path evaluation mechanism is required. In this article, we propose a Q-learning-based routing algorithm to obtain the best next-hop node to maximize the efficiency of in-network processing. In addition, the network-wise energy consumption for communication and the residual energy of every intermediate node are also considered.

III. PROPOSED MODEL
A. NETWORK MODEL
In this study, we assume that various types of sensors, such as temperature sensors, humidity sensors and photosensors, are deployed in a field, as depicted in Fig. 1. Each sensor type has different sensing intervals based on various operating requirements. A sensor node stores its observed data and any received data from its one-hop neighbor nodes in its buffers. Each node maintains multiple sensor-type-dependent buffers. The same-sensor-type data among neighbor nodes have strong correlation. Therefore, the data of the same-type sensors can be aggregated at each node before being forwarded, as depicted in Fig. 1 [34]. Each sensor node periodically forwards its stored data to one of its one-hop neighbor nodes based on the proposed reinforcement-learning-based routing algorithm; eventually, the data are delivered to the sink node.

FIGURE 1. WSN model with multiple sensor types.

A sink node periodically broadcasts a Hello packet with an incremental sequence number and an initial hop count value of zero. As in the publish/subscribe model in the WSN [35], a sink node declares its interest in the Hello packet. When a sensor node receives a Hello packet, it increases the hop count by 1 and rebroadcasts it to its neighbors. When a sensor node receives a Hello packet that has the same sequence number but a larger hop count, it simply discards the packet. With this operation, all sensor nodes in the network always know the minimum hop count to the sink node. The proposed Q-DAEER is designed to apply to the flat network as in Fig. 1. However, the concept of Q-DAEER can be extended to the cluster-based hierarchical network architecture for inter-cluster routing between cluster heads.

B. FUNCTIONAL MODEL
A schematic of the proposed method is depicted in Fig. 2. To reduce the energy consumption for environment sensing, sensors periodically sense the environment based on a predefined sensing schedule for each sensor type. When the sensing timer expires, the sensing module collects the data from the environment and saves them in its sensor-type queue. Each node can receive any sensor-type data from its neighbor nodes through a transceiver and stores the data in the queue for the corresponding sensor type. Data collection at each node can be performed during a predefined waiting time for each sensor type. Depending on the latency requirement for each sensor type, the waiting time at the queue can be determined. When the waiting timer expires, the stored data in the queue are passed to the aggregation module. In the aggregation module, all raw data of each sensor type measured by the node itself and collected from neighbor nodes are aggregated using the aggregation model described in Section III.D. The aggregated data for each sensor type are forwarded to the best neighbor node, which is determined using the proposed Q-learning algorithm (see Section IV). After the neighbor node receives the data, it responds with an ACK (acknowledgement) packet, which carries the status information of the data aggregation degree, hop count to the sink node, energy-related values and the location of the node. Based on the response, the sending node calculates the reward to update the Q-table for the corresponding sensor type.

FIGURE 2. Schematic of the proposed system.

C. SENSING AND DATA TRANSMISSION MODEL
In this section, we introduce the WSN sensing and data transmission model of the proposed system. In a WSN, the sensor node is composed of a sensor part for monitoring the surrounding environment and a transceiver part for transmitting and receiving data. It is assumed that each sensor node does not continuously sense the surrounding environment, and the required sensing time and sensing interval for each sensor type are predetermined. The sensing start time at each node does not need to be synchronized with other nodes so that
an asynchronous sensing method is used. On the other hand, WSN transceivers generally use multi-mode (e.g., active, idle and sleep) operation for energy efficiency, which raises the issue of synchronizing transceiver wakeup times with neighbor nodes. In the synchronous transceiver wakeup method, complex clock synchronization implementation and high control packet overhead exist. In the asynchronous method, there is high overhead for obtaining the wakeup schedules of neighboring nodes in advance, and packet delivery latency can be higher than that of the synchronous method.
In Fig. 3, it is assumed that each sensor node is equipped with one sensor type. Notation $s_i^t$ represents sensor node $i$ with sensor type $t$. A sensor node can have multiple types of sensors, as $s_i^{t_1,\cdots,t_k}$. There are $K$ different sensor types in the WSN, and each node has $K$ queues to separately store data for various sensor types. Note that even if the sensor node has only one sensor, it should have $K$ queues because it can be used as a relay node for any type of data. Fig. 3 shows the process of performing data aggregation on the routing path to the sink node. It was assumed that node $s_i^{t_1}$ is determined as the next node on the path to the sink node by the previous nodes. As depicted in Fig. 3, at the $n$th time step, sensor node $s_i^{t_1}$ measures the environment and has the observed data of sensor type $t_1$, $OD_i^{t_1}(n)$. It also receives aggregated data for each sensor type from its neighbor nodes. $AD_j^{t_1}(n)$ indicates the aggregated data of type $t_1$ at time step $n$ from neighbor node $j$. During time step $n$, node $s_i^{t_1}$ stores all data (the received aggregated data and its local observed data) in sensor type queues $Q_i^t(n)$, $t = t_1, \cdots, t_K$. At the end of time step $n$, the node aggregates the stored data as $AD_i^t(n)$, $t = t_1, \cdots, t_K$, and then it forwards the aggregated data of each type to the selected neighbor nodes.

FIGURE 3. Data aggregation and transmission system model.

FIGURE 4. Data aggregation models (a) Representative aggregation (b) Lossy compressive aggregation (c) Lossless aggregation.

Fig. 4 illustrates the sensing and transmission of data in the proposed system model. Generally, to save energy, instead of continuous sensing, sensor nodes in the WSN sense the environment at a predefined sensing interval. In our model, we defined the sensing time and sensing interval for each sensor type $t$ as $ST^t$ and $SI^t$, respectively. For data aggregation for in-network processing, each node must wait for a certain amount of time to possibly receive the same type of data from the neighbor nodes. A longer waiting time for data aggregation results in larger latency for data delivery to the sink node. Because the level of time delay required for each sensor-type data may be different, the waiting time is set differently for each type in this model. $WT^t$ represents the waiting time for sensor type $t$ data aggregation. Typically, $WT^t$ is larger than $ST^t$ and, during sensing interval $SI^t$, we have multiple $WT^t$ time steps. All nodes need not be time synchronized; they can start their schedules independently at any time. As depicted in Fig. 4, at the $n$th waiting time step, if there is a scheduled sensing time, the $s_i^t$ node measures the environment during $ST^t$ and obtains data $OD_i^t(n)$. The node will wait until the waiting timer expires to receive aggregated data from its neighbors. In Fig. 4, $s_i^t$ receives $AD_a^t(n)$ and $AD_b^t(n)$ from nodes $a$ and $b$, respectively. At the end of $WT^t(n)$, $s_i^t$ aggregates all stored data of its type $t$ queue $Q_i^t(n)$ and sends them to the next neighbor. When $s_i^t$ receives aggregated data from the neighbor before the next sensing time, the node will wait for aggregated data from neighbors until the waiting timer expires.
The queue state and aggregated data size of the $s_i^t$ sensor node at time step $n$ are computed as follows:
$$Q_i^t(n) = OD_i^t(n) + \sum_{j \in N_i} AD_j^t(n) \tag{1}$$
$$AD_i^t(n) = DA\{Q_i^t(n)\} \tag{2}$$
where $N_i$ is the set of neighbor nodes of node $i$, and $DA\{\}$ is the data aggregation function (explained in Section III.D).
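The bookkeeping in Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The class name `TypeQueues`, the sensor-type labels, the bit counts and the capping function used as a stand-in for $DA\{\}$ are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from collections import defaultdict
from typing import Callable, Dict


class TypeQueues:
    """Per-sensor-type queue sizes of one node during one waiting time step."""

    def __init__(self) -> None:
        self.size: Dict[str, float] = defaultdict(float)

    def observe(self, t: str, od: float) -> None:
        # OD_i^t(n): locally sensed data; simply not called if no sensing is scheduled
        self.size[t] += od

    def receive(self, t: str, ad: float) -> None:
        # AD_j^t(n): aggregated data received from a neighbor j in N_i
        self.size[t] += ad

    def flush(self, t: str, da: Callable[[float], float]) -> float:
        # Q_i^t(n) per Eq. (1); the queue is emptied when the waiting timer expires
        q = self.size.pop(t, 0.0)
        # AD_i^t(n) per Eq. (2)
        return da(q)


# Hypothetical example: sizes in bits, DA{} assumed to cap output at 128 bits.
queues = TypeQueues()
queues.observe("temperature", 100.0)
queues.receive("temperature", 80.0)   # from neighbor a
queues.receive("temperature", 120.0)  # from neighbor b
queues.receive("humidity", 60.0)      # relayed type, no local sensor needed

cap = lambda q: min(q, 128.0)
ad_temp = queues.flush("temperature", cap)
ad_hum = queues.flush("humidity", cap)
```

Keeping one entry per sensor type mirrors the requirement that every node holds $K$ queues, even when it only relays a given type.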

In Eq. (1), if there is no scheduled sensing time for type $t$ at time step $n$, then $OD_i^t(n) = 0$.

FIGURE 5. Schematic of sensing and data forwarding procedures (type t data only).

The required energy for data transmission is generally proportional to the size of the aggregated data and the distance between the sender and receiver if the sensor nodes can control the transmission power. The required reception energy depends on the size and decoding of the data. The required energy for data aggregation is proportional to the queue state [36].
The total transmission energy required by node $i$ at the $n$th time step is
$$E_i^{TX}(n) = \sum_{\forall t} \frac{AD_i^t(n)}{B} \left\{ P_{txElec} + P_{amp} \left( \frac{d_{i-n_t^*}}{d_{max}} \right)^{\beta} \right\} \tag{3}$$
where $B$ is the nominal bit rate; $P_{txElec}$ is the transmission power; $P_{amp}$ is the amplifier power; $d_{max}$ is the maximum distance for communication at each node; $d_{i-n_t^*}$ is the distance between node $i$ and the selected next neighbor node for type $t$ using the proposed routing algorithm, and $\beta$ is the path loss exponent ($\beta = 2$ for free space).
The total reception energy required by node $i$ at the $n$th time step is
$$E_i^{RX}(n) = \sum_{\forall t} \left( \frac{AD_i^t(n)}{B} P_{rxElec} + AD_i^t(n) E_{decBit} \right) \tag{4}$$
where $P_{rxElec}$ is the reception power, and $E_{decBit}$ is the decoding energy per bit.
The total energy required for data aggregation by node $i$ at the $n$th time step is
$$E_i^{DA}(n) = \sum_{\forall t} Q_i^t(n) E_{aggBit} \tag{5}$$
where $E_{aggBit}$ is the data aggregation energy per bit.

D. DATA AGGREGATION MODEL
Owing to the high node density in sensor networks, similar data are sensed by many nodes, which results in redundancy in the sensed data. Using data aggregation techniques, temporal and spatial redundancies can be reduced while routing packets from the source to the sink [37]–[39].
In this study, we consider three different types of data aggregation models. The first is a representative aggregation model, in which the sink node receives only a representative value. Typical mathematical functions are the sum, average, maximum, minimum or median. In this model, regardless of the cumulative queue state size, the aggregated data can have a unit packet size, as depicted in Fig. 5(a). The second model is the lossy compressive aggregation model. In this model, the sensed data from multiple sensors can be represented by a feature vector of limited size, to which various types of dimension reduction techniques with information loss can be applied. As depicted in Fig. 5(b), when the queue state is less than the feature vector size of the transformed domain, the data in the queue are transmitted without further aggregation. The third model is lossless aggregation, in which the sink node can reconstruct the raw data from the aggregated data without any loss. In this study, we modeled this type of aggregation using a log function, as depicted in Fig. 5(c). The three different data aggregation models are represented mathematically as follows:
$$DA_{model1}\{Q_i^t(n)\} = \begin{cases} U_{m1}^t & \text{if } Q_i^t(n) > 0 \\ 0 & \text{if } Q_i^t(n) = 0 \end{cases} \tag{6}$$
$$DA_{model2}\{Q_i^t(n)\} = \begin{cases} U_{m2}^t & \text{if } U_{m2}^t < Q_i^t(n) \\ Q_i^t(n) & \text{if } 0 < Q_i^t(n) < U_{m2}^t \\ 0 & \text{if } Q_i^t(n) = 0 \end{cases} \tag{7}$$
$$DA_{model3}\{Q_i^t(n)\} = \begin{cases} U_{m3}^t \times \log_2 (DP_i(n) + 1) & \text{if } 0 < DP_i(n) \\ 0 & \text{if } DP_i(n) = 0 \end{cases} \tag{8}$$
where $U_{m1}^t$, $U_{m2}^t$ and $U_{m3}^t$ are the unit packet sizes for the first, second and third models, respectively; $DP_i(n)$ is the number of aggregated data packets in the queue of node $i$. The data aggregation model is designed based on the WSN application objectives and sensor data types. It means that the actual shapes of models can be different depending on the real applications and used aggregation methods. Table 1 lists the system model parameters defined in this study.

TABLE 1. System model parameters.

FIGURE 6. Q-learning model for the proposed system.
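To make the aggregation models of Eqs. (6)-(8) and the energy terms of Eqs. (3)-(5) concrete, the following Python sketch evaluates them for one node. It is an illustration under stated assumptions rather than the authors' implementation: all function names and numeric values (unit size, bit rate, powers, distances) are hypothetical and are not taken from Table 1, and sizes are assumed to be in bits so that $AD/B$ is a transmission time.

```python
import math
from typing import Dict


def da_model1(q: float, unit: float) -> float:
    """Eq. (6): representative aggregation; one unit packet if the queue is non-empty."""
    return unit if q > 0 else 0.0


def da_model2(q: float, unit: float) -> float:
    """Eq. (7): lossy compressive aggregation; output capped at the unit size.
    Eq. (7) leaves q == unit unspecified; returning `unit` there is an assumption."""
    return min(q, unit) if q > 0 else 0.0


def da_model3(dp: int, unit: float) -> float:
    """Eq. (8): lossless aggregation; size grows as log2(DP + 1)."""
    return unit * math.log2(dp + 1) if dp > 0 else 0.0


def tx_energy(ad: Dict[str, float], dist: Dict[str, float], bit_rate: float,
              p_tx_elec: float, p_amp: float, d_max: float, beta: float = 2.0) -> float:
    """Eq. (3): airtime AD/B times electronics plus distance-scaled amplifier power."""
    return sum(size / bit_rate * (p_tx_elec + p_amp * (dist[t] / d_max) ** beta)
               for t, size in ad.items())


def rx_energy(ad: Dict[str, float], bit_rate: float,
              p_rx_elec: float, e_dec_bit: float) -> float:
    """Eq. (4): reception airtime energy plus per-bit decoding energy."""
    return sum(size / bit_rate * p_rx_elec + size * e_dec_bit for size in ad.values())


def da_energy(queues: Dict[str, float], e_agg_bit: float) -> float:
    """Eq. (5): aggregation energy proportional to the queue state."""
    return sum(q * e_agg_bit for q in queues.values())


# Hypothetical values for illustration only.
UNIT = 128.0
queue_bits = 300.0
ad_by_model = {
    "model1": da_model1(queue_bits, UNIT),
    "model2": da_model2(queue_bits, UNIT),
    "model3": da_model3(3, UNIT),  # three packets waiting in the queue
}
e_tx = tx_energy({"temperature": 500.0}, {"temperature": 5.0},
                 bit_rate=1000.0, p_tx_elec=0.02, p_amp=0.08, d_max=10.0)
```

Because Eq. (3) scales with $AD_i^t(n)$, a model that shrinks the aggregated size directly lowers the transmission energy of the forwarding node, which is the coupling the routing reward is meant to exploit.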

IV. Q-LEARNING-BASED DATA-AGGREGATION-AWARE ENERGY-EFFICIENT ROUTING PROTOCOL
Reinforcement learning methods are essential to solve optimal control problems using on-line measurements by interacting with an environment. The objective of RL is to maximize the reward of an agent by taking a series of actions in response to a dynamic environment. RL can be applied to the WSN routing problem because it can capture the dynamics of the network and environment conditions efficiently, in which the action at each sensor node is the selection of the next node for forwarding the sensing data to the sink node. Q-learning is a model-free value-based RL algorithm that is used to obtain the optimal action-selection policy using a Q value function. The Q value (quality value) represents how useful a given action is in gaining some future reward. Q-learning uses temporal differences (TD) to estimate the expected Q value through episodes with no prior knowledge of the environment. Q-learning is defined using an agent, a set of states $S$ and a set of actions $A$. By performing an action $a \in A$, the agent transitions from one state to another. The agent in state $s$ interacts with the environment through action $a$ to learn the environment and, depending on the outcome, acquires reward $r$. The decision goal for selecting one of the actions in the given state is to maximize the expected sum of weighted rewards, which include the current immediate reward and future discounted rewards [40].
In the proposed Q-learning system for WSN routing, the agent is considered as a network-wide data flow. In the conventional single-agent approach, a centralized network controller acts as an agent that can observe the global conditions of the entire network and control the packet transmission at each sensor node. This central agent approach requires a large overhead and makes it difficult to know the status of the entire network in real time. In the proposed system, there is no explicit central agent; instead, cooperative information exchange among neighbor nodes ensures that each node
VOLUME 9, 2021 10743

W.-K. Yun, S.-J. Yoo: Q-DAEER Protocol for WSNs

knows the network-wide state transition behaviors. As shown follows:

in Fig. 6, the flow of data in the WSN is an agent, and each
sensor node represents a state. When the type t waiting timer at∗ |s = argmax Qt (s, a) (13)
a
of sensor node i expires, it must select the next neighbor
node to forward the aggregated data of type t. In this case, As depicted in Fig. 6, after taking the action (forwarding the
the current state is si ; the actions at the current state are the aggregated data of type t) in the current state s, the agent
list of neighbor nodes; the next state will be node sj , to which state changes to the new state s0 (the receiving sensor node
the aggregated data of type t are forwarded. The states and of the forwarded data); the rewards are given to the current
actions are defined as follows: state s; the Q-table of the action taken for state s is updated.
Because our Q-learning process is not controlled centrally
S = {s1 , s2 , · · · , sN }
and is performed in a distributed manner at each sensor node,
A = {A1 , A2 , · · · , AN } ,

Ai = aj = sj |sj ∈ Nsi (9) the current state node s does not have the Q-table of the
where N is the number of sensor nodes and Nsi is the set of next state to update its Q-table using Eq. (10). In addition,
neighbor nodes of node si . state s does not explicitly know the reward for the action
In Q-learning, the Q-table helps in finding the best action taken. In the data-aggregation-aware energy-efficient routing,
for each state, in which the action value function Q (s, a) reward R for the action in Eq. (10) represents the effectiveness
returns the expected sum of the current and future rewards of data aggregation and energy efficiency at the next node
when action a is performed at state s. This function can be selection, and it is computed at the next state (next node).
estimated through iterative update using the Bellman equa- Therefore, in this study, when the next node responds the
tion. receipt of the aggregated data to the sender it also includes
Suppose that the agent selects action a in state s, observes its maximum Q-values and the computed reward R.
reward R and enters new state s0 . Then the action value Because the agent acts based on the Q-value updated after
function (Q-value), Q (s, a), is updated as follows: the reward, it is essential to set the reward policy to determine
an optimum solution for the Q-learning algorithm. We define
Q (s, a) = (1 − α) Q (s, a) + α R + γ · Q s0 , a

(10) reward R for the proposed routing algorithm as a function of
rewards for the data aggregation degree, node energy status
where α is the learning rate and γ is the discount factor for
and hop count to the sink node. The data aggregation reward
the future reward. t , is defined as in Eq. (14), and it is computed
for type t, rDA
To achieve balance between exploitation and exploration,
by the next node s0 after it sends the received ADts (n) data to
the epsilon-greedy strategy is generally used to select action
its queue Qts0 (n) and aggregates the queued data of type t into
a∗ in state s, as in Eq. (11). The epsilon-greedy strategy,
ADts0 (n).
in which epsilon refers to the probability of choosing to
explore, exploits most of the time with a small chance of  t
Qs0 (n) Qts0 (n)
exploring:  − 1 if − 1 < rDA
max
ADts0 (n)
 t

t ADs0 (n)
 rDA = (14)
argmax Q (s, a) with probablity 1 − Qt 0 (n)
max
else st max

a∗ |s = a (11) rDA
 − 1 ≥ rDA
ADs0 (n)
any action a with probability
where rDAmax is the maximum reward for data aggregation.
In Q-DAEER, we perform data-type-dependent action 0
selection and Q-table updating. Fig. 6 depicts a Q-learning In s , when the data aggregation degree (ratio between the
raw and aggregated data sizes) for type t is high, reward rDA t
scenario for WSN routing. In state si (sensor node i), suppose
is also high. The data aggregation reward is type dependent.
i (n)
the waiting timer for type t1 expires so that the data in Qt1 t can be computed
aggregate into ADt1 (n). In Fig. 6, the agent takes the best When node s forwards the type t data, rDA
i
action that has the maximum action value for type t1 of the directly. However, the aggregation rewards for other t 0 types
current Q-table. The best action for the given state can be cannot be computed directly because node s did not send
different for each data type t. other types of data at this time step. In this study, we estimate
The action value of action a in state s is represented as a the expected rewards for other types. The estimation of the
expected reward for other t 0 types, r̂DA t 0 , is simply defined as
vector, as in Eq. (12), to capture sensor-data-type-dependent 0
t at node s0 . The data aggregation reward
expected rewards for each action: the most recent rDA
vector (RDA ) for all data types is defined using (15).
Q (s, a)
 t1 
 Qt2 (s, a)  Qts0 (n)
 
t
Q (s, a) = 

..

(12) r =
  DA
ADts0 (n) 
 .  
 ..


QtK (s, a) RDA =   .  (15)
t 0 −

Q0 n 
 
where K is the number of sensor types. The best action for  t0 t0
r̂DA = rDA n− = st 0
type t data forwarding in the given state s is defined as ADs0 n−


FIGURE 7. Example scenario for the proposed Q-DAEER learning process.

where t is the data type that node s sent and $n^-$ is the most recent time step at which node s' computed $r_{DA}^{t'}$.

We also define a type-independent energy status reward. The energy status reward ($R_E$) is defined as follows:

$$R_E = \frac{E_{s'}^r(n)}{E_{s'}^r(0)} - \left( \frac{d_{s-s'}}{d_{max}} \right)^{\beta} \tag{16}$$

where $E_{s'}^r(n)$ and $E_{s'}^r(0)$ are the residual energies of the next node s' at the nth and 0th time steps, respectively; $d_{s-s'}$ is the estimated distance between nodes s and s' (estimated at node s' using any distance estimation technique); $d_{max}$ is the maximum transmission range of the sensor nodes; and β is the path loss exponent (in free space, β = 2). When the remaining energy of the next state node is relatively large and the distance between the next and current state nodes is short (which means that the energy requirement for transmission is low), the action selection is efficient in terms of energy. Consequently, the energy status reward increases. This reward policy can reduce the energy consumption of the entire network and increase the network lifetime by evenly distributing the energy consumption among the nodes.

To forward data to the sink, the reward should be smaller than the maximum Q-value of the parent hop count node. However, with a fixed reward for all nodes in the network, nodes that are far from the sink have a higher probability of forwarding data backward. An additional discount factor for the reward of the nodes is therefore necessary to prevent backward forwarding. Reward R for action a in state s is finally computed as follows:

$$R = \begin{cases} \eta^{H_s} \times \left( R_{DA} + R_E \times \vec{1} \right) & \text{if } s' \text{ is not a sink} \\ R_s \times \vec{1} & \text{else} \end{cases} \tag{17}$$

where $H_s$ is the hop count of node s, $\vec{1}$ is the K-dimensional vector with all 1s, $R_s$ is the sink node reward and η is the discount factor for the reward in the range [0, 1].

When node s receives reward R, it needs to update its Q-table. To update its action value function Q(s, a), it requires Q(s', a) of the next state node. As explained previously, in our proposed mechanism, when the next node s' receives an aggregated data packet, it replies with an ACK packet, in which the reward vector R of Eq. (17) and the max Q(s', a) vector are included. Therefore, node s can update its Q-table based on the ACK packet information. The max Q(s', a) vector includes the maximum Q-value for each data type at the next node s' as follows:

$$\max Q(s', a) = \begin{bmatrix} \max_{\forall a} Q^{t_1}(s', a) \\ \max_{\forall a} Q^{t_2}(s', a) \\ \vdots \\ \max_{\forall a} Q^{t_K}(s', a) \end{bmatrix} \tag{18}$$

The general Q-table update rule of Eq. (10) can be represented in vector form as follows:

$$Q(s, a) = (1 - \alpha)\, Q(s, a) + \alpha \left( R + \gamma \cdot \max Q(s', a) \right) \tag{19}$$

Fig. 7 illustrates a scenario for the proposed Q-DAEER learning procedure.

1) At node s, the waiting timer of type $t_1$ expires at time step n, and node s aggregates the data in queue $Q_s^{t_1}(n)$ into $AD_s^{t_1}(n)$.
2) Node s selects action $a_2$ (node s'), which has the maximum Q-value for type $t_1$ in the Q-table of state s.
3) Based on action $a_2$, node s forwards the aggregated data to node s'.
4) Node s' calculates the reward vector R.
5) Node s' derives the max Q(s', a) vector from its Q-table in the form of Eq. (18).
6) Node s' replies with an ACK that includes R and max Q(s', a).
7) Node s updates the Q(s, a) vector using Eq. (19).
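The seven steps above can be sketched in Python as a single learning step between a sender s and a next node s'. This is only an illustrative sketch under stated assumptions: the dictionary-based Q-table layout (action to per-type values), all function names, the two type labels, and every numeric value (α = 0.5, γ = 0.9, η = 0.9, energies, distances, the reused 0.5 reward for type t2) are hypothetical and are not taken from the study's simulation settings.

```python
import random


def da_reward(queue_bits, aggregated_bits, r_max):
    """Eq. (14): aggregation ratio minus one, capped at r_max."""
    return min(queue_bits / aggregated_bits - 1.0, r_max)


def energy_reward(e_now, e_init, dist, d_max, beta=2.0):
    """Eq. (16): residual energy ratio minus normalized distance to the power beta."""
    return e_now / e_init - (dist / d_max) ** beta


def reward_vector(r_da, r_e, hop_count, eta, next_is_sink=False, sink_reward=0.0):
    """Eq. (17): per-type reward, discounted by eta to the power of the hop count of s."""
    if next_is_sink:
        return {t: sink_reward for t in r_da}
    discount = eta ** hop_count
    return {t: discount * (r + r_e) for t, r in r_da.items()}


def max_q_vector(q_table, types):
    """Eq. (18): maximum Q-value over all actions, for each data type."""
    return {t: max((q[t] for q in q_table.values()), default=0.0) for t in types}


def select_action(q_table, data_type, epsilon=0.0, rng=random):
    """Eqs. (11) and (13): epsilon-greedy choice on the Q-values of one data type."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q_table))  # explore: any neighbor
    return max(q_table, key=lambda a: q_table[a][data_type])  # exploit


def update_q(q_sa, reward, max_q_next, alpha, gamma):
    """Eq. (19): element-wise update of the Q(s, a) vector."""
    return {t: (1 - alpha) * q_sa[t] + alpha * (reward[t] + gamma * max_q_next[t])
            for t in q_sa}


# Hypothetical example with two sensor types and two neighbors of node s.
types = ("t1", "t2")
q_s = {"s1": {"t1": 0.2, "t2": 0.5}, "s2": {"t1": 0.6, "t2": 0.1}}
q_next = {"s3": {"t1": 1.0, "t2": 0.4}, "sink": {"t1": 0.8, "t2": 0.9}}

action = select_action(q_s, "t1")                      # steps 1-3 at node s
r_da = {"t1": da_reward(3000.0, 1000.0, r_max=5.0),    # computed for the sent type
        "t2": 0.5}                                     # most recent value reused (assumed)
r_e = energy_reward(1.8, 2.0, dist=75.0, d_max=150.0)
reward = reward_vector(r_da, r_e, hop_count=2, eta=0.9)  # step 4 at node s'
ack_max_q = max_q_vector(q_next, types)                  # step 5 at node s'
q_s[action] = update_q(q_s[action], reward, ack_max_q,   # steps 6-7: ACK, then update
                       alpha=0.5, gamma=0.9)
print(action, q_s[action])
```

Keeping the reward and the max Q vector as the only contents of the ACK means node s never needs a copy of the Q-table of s', which is the point of the distributed design described above.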

Table 2 shows the complexity and overhead analysis of the
proposed algorithm compared with two other WSN routing
methods. The first compared algorithm is the shortest path
routing using the proposed data aggregation model at each
node on the path. The second one is the shortest path routing
without data aggregation. The analysis has been conducted
in terms of complexity, queue management overhead, control
message overhead and time delay.

V. SIMULATION RESULTS AND PERFORMANCE EVALUATION

In this section, we evaluate and analyze the performance of the proposed Q-DAEER routing protocol in terms of its energy consumption, network lifetime, average hop count and decrease in data size. We implemented the simulation environments using MATLAB R2019a to compare the proposed routing algorithm with the conventional routing algorithms. The simulation parameters and values used in this study are listed in Table 3. We used a random-type grid topology for the WSN, in which sensor nodes were deployed in the form of a grid, as depicted in Fig. 8 (an example topology), and each sensor node had only a single-type sensor module that was randomly selected. The characteristics of the three types of sensor modules are summarized in Table 4. A total of 77 sensor nodes were deployed in the sensing area. The initial energy level of the nodes followed a uniform distribution over [2 J, 2.5 J]. The maximum transmission range of the nodes was assumed to be 150 distance units (du). The unit packet sizes for data aggregation model-1, model-2 and model-3 given by Eqs. (6)–(8) were proportional to the observed data size of each sensor type. The transmission, amplification and reception powers were 200 mW, 500 mW and 200 mW, respectively. The nominal bit rate for the nodes was 6 Mbps, and the energy consumptions for decoding and data aggregation were 40 nJ and 20 nJ per bit, respectively. The observed packet sizes, sensing intervals and waiting timers of all sensor types are listed in Table 4.

FIGURE 8. Energy consumption for nodes (a) Energy consumed per time unit (tu) (b) Average energy consumed.

We implemented the two conventional energy-aware WSN routing algorithms shown in Table 2 for performance comparison. In the shortest path routing (SPR) without data aggregation, to minimize energy consumption, a sensor node in the network selects the next neighbor node that has the least hop count to the sink. This results in a minimum distance between the source node and the sink node. In the shortest path routing with data aggregation (SPRwDA), when a sensor node receives the aggregated data from other nodes, the node waits until the waiting timer expires to minimize the transmission overhead. Then it aggregates all received and locally observed data together using the proposed aggregation procedure, and it forwards the aggregated data to the next node using the shortest path routing.

We performed the simulation until half of the one-hop neighbors of the sink were dead or some nodes in the network were isolated so that they could not transmit data to the sink. We compared the performances in terms of network-level energy consumption, number of dead nodes, network lifetime, average hop count and decrease in data size. Network-level energy consumption is the sum of the energies consumed by all the sensor nodes. The number of dead nodes represents the number of sensor nodes with depleted energies. Network lifetime indicates the elapsed time until half of the one-hop neighbors of the sink are dead or some nodes in the network are isolated so that they cannot transmit data to the sink. Average hop count is the average of the hop counts required to reach the sink node, which also approximately represents the delay from the data source to the sink node. Decrease in data size represents the amount by which the data size is reduced owing to data aggregation along the routing path. It represents the efficiency of data aggregation of a routing algorithm. In the simulation study, model-1, model-2 and model-3 represent the data aggregation models given by Eqs. (6), (7) and (8), respectively.
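The stopping rule that defines network lifetime can be checked with a short graph routine. The sketch below is written in Python rather than the MATLAB used in the study, and it rests on assumptions that the text does not state: "half" is read as "at least half", the sink itself is treated as always alive, connectivity is tested with a breadth-first search over alive nodes only, and the node names and topologies are hypothetical.

```python
from collections import deque


def reachable_from_sink(adjacency, alive, sink):
    """Return the set of nodes connected to the sink through alive nodes only."""
    seen = {sink}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor in alive and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def network_expired(adjacency, alive, sink):
    """True when at least half of the sink's one-hop neighbors are dead,
    or when some alive node can no longer reach the sink."""
    sink_neighbors = adjacency[sink]
    dead = sum(1 for node in sink_neighbors if node not in alive)
    if sink_neighbors and 2 * dead >= len(sink_neighbors):
        return True
    connected = reachable_from_sink(adjacency, alive, sink)
    return any(node not in connected for node in alive)


# Hypothetical topology: sink S with three neighbors; node d hangs off node a.
adjacency = {
    "S": ["a", "b", "c"],
    "a": ["S", "d"],
    "b": ["S"],
    "c": ["S"],
    "d": ["a"],
}
print(network_expired(adjacency, {"a", "b", "c", "d"}, "S"))  # all nodes alive
print(network_expired(adjacency, {"b", "c", "d"}, "S"))       # a is dead, so d is isolated
```

Network lifetime is then the first time step at which such a check returns true, which is why depleting a single relay near the sink can end the run even when most nodes still have energy.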


TABLE 2. Complexity and overhead analysis.

TABLE 3. Simulation parameters.

FIGURE 9. Wireless sensor network simulation environment.

TABLE 4. Sensor type dependent parameters for simulation.

The comparative results of network-level energy consumption are depicted in Fig. 9. Fig. 9(a) shows the results at each time step tu (time unit). In SPR and SPRwDA, the energy consumption at every time step is almost constant because they use the shortest routing path, which is determined only by the current network topology. Because SPRwDA uses the proposed data aggregation method before forwarding data at each node, its energy consumption is lower than that of SPR. In the proposed Q-DAEER method, the energy consumption of each sensor node in the WSN is dynamic owing to the policy-based dynamic reward update rule. Initially, the energy consumption of the proposed method is higher than that of the conventional algorithms because each node needs to learn the optimal path. However, after learning, the nodes spent the least energy for all three data aggregation models. Fig. 9(b) shows the total average energy consumption over all time steps. The proposed algorithm consumed the least energy compared with the two other algorithms. In data aggregation model-1, the efficiency of data aggregation is the highest, so its average energy consumption was the lowest among all the models. For the three data aggregation models, the proposed Q-DAEER can reduce energy consumption by 67%∼32% compared with SPR and by 25%∼5% compared with SPRwDA.

The comparisons of the numbers of dead sensor nodes over time and the average network lifetime are shown in Fig. 10. In the case of SPR, the number of dead nodes increases faster than in the other methods owing to its high energy consumption. Data aggregation model-3 exhibits earlier node deaths than the other two models because, as shown in Fig. 9(b), model-3 consumes more energy than the other models. Fig. 10(b) depicts the network lifetimes, measured until half of the nodes near the sink are dead or some of the nodes are isolated. In data aggregation model-1, the network lifetime using the proposed method is approximately 6.8∼2.5 and 1.55∼1.29 times longer than those of SPR and SPRwDA, respectively, for the three data aggregation models.

FIGURE 10. Numbers of dead sensor nodes and network lifetimes (a) Number of dead sensors per time unit (tu) (b) Network lifetimes.

Fig. 11 shows the average hop count of data packets from the data source node to the sink node. The average hop count at each time unit is depicted in Fig. 11(a). In SPR and SPRwDA, because each sensor node forwards data to the neighbor that is closest to the sink node, the average hop count is almost constant and lower than that of the proposed Q-DAEER regardless of the data aggregation model. However, near the end of the simulation, the average hop counts of SPR and SPRwDA increase slightly because some nodes die owing to the depletion of their energies. In contrast, the proposed Q-DAEER method shows a higher initial average hop count because of reinforcement learning: in Q-learning, before the Q-table is stabilized and used, the agent needs to explore more paths. The average hop count in the proposed method decreases significantly after the initial learning period. Each sensor node dynamically learns the optimal routing path in terms of not only the hop count but also the energy consumption and data aggregation degree on the path. The Q-DAEER algorithm may choose longer paths to obtain higher expected rewards by achieving more data aggregation and energy saving. Therefore, for the three data aggregation models, the average hop count of Q-DAEER is approximately 25%∼35% higher than those of SPR and SPRwDA.

FIGURE 11. Comparison of hop count averages (a) Average hop count per time unit (tu) (b) Average hop count.

A comparison of the decrease in data sizes in the network is presented in Fig. 12. Fig. 12(a) shows the decrease in the data size at each time unit. Because SPR does not perform data aggregation, its reduced data size is zero. In SPRwDA, the reduction in the data size is almost the same at each time step for roughly the first half of the network lifetime; afterward, it increases suddenly. Because SPRwDA utilizes the shortest path, the energy of some nodes close to the sink node is depleted, eventually causing these nodes to stop functioning. This causes data from sensor nodes to concentrate in the remaining nodes, which can significantly reduce the data size. Therefore, the decrease in the data size increases in the second half of the simulation. However, as shown in Fig. 9, this accelerates the energy shortage among the overloaded nodes and shortens the network lifetime. In the proposed Q-DAEER algorithm, the rewards that are given by the neighbor nodes consider the energy level and degree of data aggregation so that nodes always dynamically determine the best path. The results indicate that the proposed algorithm can obtain a more optimal path to improve energy and data aggregation efficiency compared with the conventional method. As depicted in Fig. 12(b), the proposed algorithm achieved an approximately 20%∼10% higher data reduction ratio compared with SPRwDA for the three aggregation models.

FIGURE 12. Comparison of data size reduction due to data aggregation (a) Per time unit (tu) decrease in data size due to aggregation (b) Average decrease in data size.

In addition to the grid topology used in the previous experiments, we applied a random sensor deployment topology, and we also verified the scalability of the proposed algorithm by increasing the number of nodes to 100 and 400. Table 5 shows the experimental results with the compared methods. As we can see, the proposed Q-DAEER method consumed less energy and achieved a longer network lifetime for both random and grid topologies, even under dense node conditions.

TABLE 5. The results of grid and random topologies for 100 and 400 sensor node cases.

VI. CONCLUSION

In this article, we proposed a Q-learning-based data-aggregation-aware energy-efficient routing (Q-DAEER) algorithm. To calculate the best path to maximize the lifetime and minimize the energy consumption of the network, we defined a reward policy that considered the energy level, distance, hop count and the degree of data aggregation at each node. For efficient data aggregation at each node with different sensor types, we presented a data aggregation and system model in which sensor-type-dependent queue management and transmission schedule control were used. The reward functions defined in this study captured the changes in the node energy, neighbor relationship and type-dependent data aggregation dynamics of each node. In the proposed Q-DAEER algorithm, we incorporated a data-type-dependent action selection and Q-table updating algorithm. To demonstrate the applicability of the proposed algorithm to various data aggregation scenarios, we defined three different data aggregation models. We compared the performance of the proposed algorithm with that of the conventional routing protocols in terms of energy consumption, network lifetime, average hop count and degree of data aggregation. The results indicate that the proposed algorithm can obtain a more optimal path to improve energy and data aggregation efficiencies when compared with the conventional method. We demonstrated that the proposed Q-DAEER protocol can successfully reduce the overall data transmission load and extend the lifetime of the wireless sensor network.

REFERENCES

[1] Y. Jin, K. S. Kwak, and S.-J. Yoo, "A novel energy supply strategy for stable sensor data delivery in wireless sensor networks," IEEE Syst. J., vol. 14, no. 3, pp. 3418–3429, Sep. 2020.
[2] S. Wang, J. Wan, D. Li, and C. Xhang, "Implementing smart factory of industrie 4.0: An outlook," Int. J. Distrib. Sensor Netw., vol. 12, pp. 1–10, Jan. 2016.


[3] T.-H. Kim, C. Ramos, and S. Mohammed, "Smart city and IoT," Future Gener. Comput. Syst., vol. 76, pp. 159–162, Nov. 2017.
[4] K. Cengiz and T. Dag, "Energy aware multi-hop routing protocol for WSNs," IEEE Access, vol. 6, pp. 2622–2633, 2018.
[5] J. Huang, Y. Hong, Z. Zhao, and Y. Yuan, "An energy-efficient multi-hop routing protocol based on grid clustering for wireless sensor networks," Cluster Comput., vol. 20, no. 4, pp. 3071–3083, Jun. 2017.
[6] P. Guo, J. Cao, and X. Liu, "Lossless in-network processing in WSNs for domain-specific monitoring applications," IEEE Trans. Ind. Informat., vol. 13, no. 5, pp. 2130–2139, Oct. 2017.
[7] S. A. Putra, B. R. Trilaksono, A. Harsoyo, and A. I. Kistijantoro, "Multiagent system in-network processing in wireless sensor network," Int. J. Electr. Eng. Informat., vol. 10, no. 1, pp. 94–107, Mar. 2018.
[8] Z. A. Khan and A. Samad, "A study of machine learning in wireless sensor network," Int. J. Comput. Netw. Appl., vol. 4, no. 4, pp. 105–112, Aug. 2017.
[9] D. P. Kumar, T. Amgoth, and C. S. R. Annavarapu, "Machine learning algorithms for wireless sensor networks: A survey," Inf. Fusion, vol. 49, pp. 1–25, Sep. 2019.
[10] Z. Mammeri, "Reinforcement learning based routing in networks: Review and classification of approaches," IEEE Access, vol. 7, pp. 55916–55950, 2019.
[11] J. Wang, Y. Gao, W. Liu, A. K. Sangaiah, and H.-J. Kim, "Energy efficient routing algorithm with mobile sink support for wireless sensor networks," Sensors, vol. 19, pp. 1–19, Jan. 2019.
[12] S. K. Das and S. Tripathi, "Intelligent energy-aware efficient routing for MANET," Wireless Netw., vol. 24, no. 4, pp. 1139–1159, May 2018.
[13] X. Li, W. Liu, M. Xie, A. Liu, M. Zhao, N. N. Xiong, M. Zhao, and W. Dai, "Differentiated data aggregation routing scheme for energy conserving and delay sensitive wireless sensor networks," Sensors, vol. 18, no. 7, pp. 1–29, Jul. 2018.
[14] S. Sasirekha and S. Swamynathan, "Cluster-chain mobile agent routing algorithm for efficient data aggregation in wireless sensor network," J. Commun. Netw., vol. 19, no. 4, pp. 392–401, Aug. 2017.
[15] Y. Bai, Y. Mai, and N. Wang, "Performance comparison and evaluation of the proactive and reactive routing protocols for MANETs," in Proc. Wireless Telecommun. Symp. (WTS), Chicago, IL, USA, Apr. 2017, pp. 1–5.
[16] R. E. Mohemed, A. I. Saleh, M. Abdelrazzak, and A. S. Samra, "Energy-efficient routing protocols for solving energy hole problem in wireless sensor networks," Comput. Netw., vol. 114, pp. 51–66, Feb. 2017.
[17] A. Razaque, M. Abdulgader, C. Joshi, F. Amsaad, and M. Chauhan, "P-LEACH: Energy efficient routing protocol for wireless sensor networks," in Proc. IEEE Long Island Syst., Appl. Technol. Conf. (LISAT), Apr. 2016, pp. 1–5.
[18] R. A. Khan, Q. Xin, and N. Roshan, "RK-energy efficient routing protocol for wireless body area sensor networks," Wireless Pers. Commun., vol. 116, pp. 1–13, Aug. 2020.
[19] T. Baker, M. J. García-Campos, D. G. Reina, S. Toral, H. Tawfik, D. Al-Jumeily, and A. Hussain, "GreeAODV: An energy efficient routing protocol for vehicular ad hoc networks," in Proc. Int. Conf. Intell. Comput., Jul. 2018, pp. 670–681.
[20] O. S. Oubbati, M. Mozaffari, N. Chaib, P. Lorenz, M. Atiquzzaman, and A. Jamalipour, "ECaD: Energy-efficient routing in flying ad hoc networks," Int. J. Commun. Syst., vol. 32, no. 18, p. e4156, Dec. 2019.
[21] O. A. Mahdi, A. W. A. Wahab, M. Y. I. Idris, A. A. Znaid, Y. R. B. Al-Mayouf, and S. Khan, "WDARS: A weighted data aggregation routing strategy with minimum link cost in event-driven WSNs," J. Sensors, vol. 2016, pp. 1–12, Sep. 2016.
[22] S. P. Ardakani, J. Padget, and M. De Vos, "A mobile agent routing protocol for data aggregation in wireless sensor networks," Int. J. Wireless Inf. Netw., vol. 24, no. 1, pp. 27–41, Mar. 2017.
[23] K. Haseeb, N. Islam, T. Saba, A. Rehman, and Z. Mehmood, "LSDAR: A light-weight structure based data aggregation routing protocol with secure Internet of Things integrated next-generation sensor networks," Sustain. Cities Soc., vol. 54, pp. 1–9, Mar. 2020.
[24] A. Yazici, M. Koyuncu, S. A. Sert, and T. Yilmaz, "A fusion-based framework for wireless multimedia sensor networks in surveillance applications," IEEE Access, vol. 7, pp. 88418–88434, 2019.
[25] Y. Chang, X. Yuan, B. Li, D. Niyato, and N. Al-Dhahir, "Machine-learning-based parallel genetic algorithms for multi-objective optimization in ultra-reliable low-latency WSNs," IEEE Access, vol. 7, pp. 4913–4926, 2019.
[26] K. Thangaramya, K. Kulothungan, R. Logambigai, M. Selvi, S. Ganapathy, and A. Kannan, "Energy aware cluster and neuro-fuzzy based routing algorithm for wireless sensor networks in IoT," Comput. Netw., vol. 151, pp. 211–223, Mar. 2019.
[27] W. Guo, C. Yan, and T. Lu, "Optimizing the lifetime of wireless sensor networks via reinforcement-learning-based routing," Int. J. Distrib. Sensor Netw., vol. 15, no. 2, Feb. 2019, Art. no. 155014771983354.
[28] J. Wang, J. Cao, R. S. Sherratt, and J. H. Park, "An improved ant colony optimization-based approach with mobile sink for wireless sensor networks," J. Supercomput., vol. 74, no. 12, pp. 6633–6645, Dec. 2018.
[29] H. El Alami and A. Najid, "Energy-efficient fuzzy logic cluster head selection in wireless sensor networks," in Proc. Int. Conf. Inf. Technol. Org. Develop. (IT OD), Rabat, Morocco, Mar. 2016, pp. 1–7.
[30] J.-S. Lee and C.-L. Teng, "An enhanced hierarchical clustering approach for mobile sensor networks using fuzzy inference systems," IEEE Internet Things J., vol. 4, no. 4, pp. 1095–1103, Aug. 2017.
[31] H. El Alami and A. Najid, "ECH: An enhanced clustering hierarchy approach to maximize lifetime of wireless sensor networks," IEEE Access, vol. 7, pp. 107142–107153, 2019.
[32] S. A. Sert and A. Yazici, "Optimizing the performance of rule-based fuzzy routing algorithms in wireless sensor networks," in Proc. IEEE Int. Conf. Fuzzy Syst. (FUZZ-IEEE), New Orleans, LA, USA, Jun. 2019, pp. 1–6.
[33] J. Zhang, Q. Wu, F. Ren, T. He, and C. Lin, "Effective data aggregation supported by dynamic routing in wireless sensor networks," in Proc. IEEE Int. Conf. Commun., Cape Town, South Africa, May 2010, pp. 1–6.
[34] R. Rajagopalan and P. K. Varshney, "Data-aggregation techniques in sensor networks: A survey," IEEE Commun. Surveys Tuts., vol. 8, no. 4, pp. 48–63, 4th Quart., 2006.
[35] A. Ganesh, "Publish/subscribe model in a wireless sensor network," U.S. Patent 7 590 098, Sep. 15, 2009.
[36] H. Karl and A. Willig, Protocols and Architectures for Wireless Sensor Networks. Hoboken, NJ, USA: Wiley, 2007.
[37] M. Dagar and S. Mahajan, "Data aggregation in wireless sensor network: A survey," Int. J. Inf. Comput. Technol., vol. 3, no. 3, pp. 167–174, 2013.
[38] X. Xu, R. Ansari, A. Khokhar, and A. V. Vasilakos, "Hierarchical data aggregation using compressive sensing (HDACS) in WSNs," ACM Trans. Sensor Netw., vol. 11, no. 3, pp. 1–25, May 2015.
[39] F. Marcelloni and M. Vecchio, "An efficient lossless compression algorithm for tiny nodes of monitoring wireless sensor networks," Comput. J., vol. 52, no. 8, pp. 969–987, Nov. 2009.
[40] Q. Yang, S.-J. Jang, and S.-J. Yoo, "Q-learning-based fuzzy logic for multi-objective routing algorithm in flying ad hoc networks," Wireless Pers. Commun., vol. 113, no. 1, pp. 115–138, Jul. 2020.

WAN-KYU YUN received the B.S. degree from the Information and Communication Engineering Department, Inha University, South Korea, where he is currently pursuing the M.S. degree with the Multimedia Network Laboratory. His research interests include wireless sensor networks, the Internet of Things, machine learning, and deep learning.

SANG-JO YOO (Member, IEEE) received the B.S. degree in electronic communication engineering from Hanyang University, Seoul, South Korea, in 1988, and the M.S. and Ph.D. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology, in 1990 and 2000, respectively.

From 1990 to 2001, he was a Member of Technical Staff with the Korea Telecom Research and Development Group, where he was involved in communication protocol conformance testing and network design fields. From 1994 to 1995 and 2007 to 2008, he was a Guest Researcher with the National Institute Standards and Technology, USA. Since 2001, he has been with Inha University, where he is currently a Professor with the Information and Communication Engineering Department. His current research interests include cognitive radio network protocols, adhoc wireless networks, MAC and routing protocol design, wireless networks, QoS, and wireless sensor networks.

10750 VOLUME 9, 2021
