
Computer Networks: Feng Shan, Weifa Liang, Jun Luo, Xiaojun Shen

This document summarizes a research paper about maximizing network lifetime for time-sensitive data gathering in wireless sensor networks. The paper aims to construct a routing tree rooted at the base station that guarantees to forward sensed data from any sensor along the shortest path, while maximizing network lifetime. It shows that finding such an optimal tree is an NP-hard problem. The paper then presents a novel heuristic called the top-down algorithm to construct the routing tree layer by layer, and a distributed refinement algorithm to improve load balancing. Extensive simulations show the proposed approach achieves around 85% of the optimal network lifetime.

Computer Networks 57 (2013) 1063–1077

Contents lists available at SciVerse ScienceDirect

Computer Networks
journal homepage: www.elsevier.com/locate/comnet

Network lifetime maximization for time-sensitive data gathering in wireless sensor networks
Feng Shan a,e,⇑, Weifa Liang b, Jun Luo c, Xiaojun Shen d
a School of Computer Science and Engineering, Southeast University, Jiangsu, Nanjing 210096, China
b Research School of Computer Science, Australian National University, Canberra, ACT 0200, Australia
c School of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
d School of Computing and Engineering, University of Missouri-Kansas City, Kansas City, MO 64110, USA
e Key Laboratory of Computer Network and Information Integration, Ministry of Education, Nanjing 210096, China

Article info

Article history:
Received 12 July 2012
Received in revised form 25 October 2012
Accepted 11 December 2012
Available online 20 December 2012

Keywords:
Wireless sensor networks
Network lifetime prolongation
Energy optimization
Load-balanced spanning tree
Network flow
Algorithm design

Abstract

Energy-constrained sensor networks have been widely deployed for environmental monitoring and security surveillance purposes. Since sensors are usually powered by energy-limited batteries, in order to prolong the network lifetime, most existing research focuses on constructing a load-balanced routing tree rooted at the base station for data gathering. However, this may result in a long routing path from some sensors to the base station. Motivated by the need of some mission-critical applications that require all sensed data to be received by the base station with minimal delay, this paper aims to construct a routing tree such that the network lifetime is maximized while keeping the routing path from each sensor to the base station minimized. This paper shows that finding such a tree is NP-hard. Thus a novel heuristic called top-down algorithm is presented, which constructs the routing tree layer by layer such that each layer is optimally extended, using a network flow model. A distributed refinement algorithm is then devised that dramatically improves on the load balance for the routing tree produced by the top-down algorithm. Finally, extensive simulations are conducted. The experimental results show that the top-down algorithm with balance-refinement delivers a shortest routing tree whose network lifetime achieves around 85% of the optimum.

© 2012 Elsevier B.V. All rights reserved.

⇑ Corresponding author at: School of Computer Science and Engineering, Southeast University, Jiangsu, Nanjing 210096, China.
E-mail addresses: [email protected] (F. Shan), [email protected].edu.au (W. Liang), [email protected] (J. Luo), [email protected] (X. Shen).

1389-1286/$ - see front matter © 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.comnet.2012.12.005

1. Introduction

Recent advances in electronic and communication technologies make it possible to build a large scale Wireless Sensor Network (WSN) with hundreds of thousands of sensors. Due to their wide range of applications, from environmental monitoring to mission-critical surveillance [19], WSNs have received tremendous attention, and data gathering, as their fundamental function, has been extensively studied in the past several years. Sensors in most WSNs are powered by energy-limited batteries, and sometimes it is impossible to recharge or replace these batteries when the network is deployed in harsh or inaccessible environments such as battlefields or nuclear-polluted regions. Therefore, energy conservation in this type of network is of paramount importance in order to prolong the network lifetime. Most existing research has focused on maximizing the network lifetime by constructing a load-balanced routing tree for data gathering. However, such a tree may contain long routing paths from some sensors to the base station. In order to meet the need of mission-critical applications that require all sensors to send their data to the base station with minimal delay, this paper aims to construct a routing tree rooted at the base station that guarantees to forward the sensed data from
any sensor along a shortest path while maximizing network lifetime. As we here deal with time-sensitive data gathering, we expect to collect the detailed data from all sensors without any aggregation during the data transfer. We thus assume the energy consumption of each node (sensor) is proportional to the number of descendants of the node in the routing tree.

This optimization problem is also applicable to non-mission-critical but large scale WSNs. This is because wireless communication (particularly multi-hop relay communication) is unreliable, and long routing paths may cause frequent and repeated re-transmissions that lead to network failures. Therefore, a shorter routing path is highly desirable.

1.1. Related work

Data gathering in a WSN means collecting sensed data from every sensor and forwarding the data to the base station. The most popular paradigm of data gathering is in-network processing that constructs a routing spanning tree rooted at the base station (also referred to as a sink). Depending on the application, two major models are used in data relaying, namely the aggregated relay and non-aggregated relay models. Most routing trees adopt the aggregated relay model. Under this model, each relay node aggregates all data received from its children and its own data into a fixed-size message. The aggregated data are then transmitted to its parent. This type of data gathering is applicable to applications such as the database queries AVG, MIN, MAX, COUNT, and so on. In the non-aggregated relay model, the length of a message transmitted by a relay node depends not only on the length of its own sensed data but also on the lengths of the messages received from its children. We refer to this latter one as the message-length dependent data gathering [15].

An extensively studied data gathering problem under both aggregated and non-aggregated models is to find a routing tree that minimizes the total energy consumption or minimizes the maximum energy consumption among individual nodes. Heinzelman et al. [9] initiated this study under the aggregation model and proposed the clustering protocol LEACH that groups nodes into a number of clusters in a self-organizing manner. Then, a cluster-head serves as the local 'base station' to aggregate the messages gathered from its members and forward the result to the sink directly. Lindsey and Raghavendra [17] presented an improved protocol called PEGASIS, in which all nodes in a cluster form a chain and one of them is chosen as the head responsible for reporting the aggregated result to the base station. Kalpakis et al. [13] attacked this problem by formulating it as an integer program and gave a heuristic solution.

A number of research papers have been published [2,3,7,13,15,22,23] that use various energy saving or balancing models. For example, Goel and Estrin [7] addressed the problem of minimizing the total transmission energy consumption, assuming that the aggregation cost at each relay node is a concave, non-decreasing function. They proposed a hierarchical matching algorithm that delivers an approximate solution within a logarithmic factor of the optimum. Cristescu et al. [3] studied the data correlation problem with an objective of minimizing the total transmission energy consumption. They assumed that each node is cognizant of which nodes it should be merged with so that the merged message has a minimal length. They showed that the data correlation problem is NP-complete, and provided an integer program solution, using the Slepian-Wolf coding approach. Rickenbach and Wattenhofer [21] studied the same problem and provided an improved solution with an approximation ratio of $2(1 + \sqrt{2})$, using the shallow light tree concept [14]. Buragohain et al. [2] studied the min-max model for the network lifetime maximization problem. Instead of minimizing the total energy consumption, they focused on minimizing the maximum energy consumption among the sensors. They showed that finding an optimal routing tree under this model is NP-complete, and proposed a heuristic solution. Liang and Liu [15] also independently showed its NP-completeness and devised several heuristics that trade off between different energy optimization metrics. Intanagonwiwat et al. [12,11] studied the general data gathering issue by incorporating the semantics of an aggregation query into building an energy efficient routing tree that may not necessarily be a spanning tree. For example, they proposed a data dissemination scheme called directed diffusion with opportunistic aggregation [11], where data is opportunistically aggregated at relaying nodes on a low-latency tree. They also explored a greedy aggregation approach [12] that adjusts aggregation points to increase path sharing and thereby reduce the energy consumption.

With different objectives, a number of algorithms have been proposed to produce different routing (spanning) trees. For example, a breadth-first search tree in the Tiny AGgregation service (TAG) [18] aims at minimizing the transmission delay from each sensor to the root, while a degree-constrained spanning tree [2,25,24] focuses on minimizing the maximum energy consumption by any node. Another kind of spanning tree [21] makes a trade-off between the energy cost of a minimum spanning tree and the energy cost of a shortest path tree. It seeks a fair balance between the total energy consumption and the maximum energy consumption among the sensors within each data gathering session. Wu et al. [25] considered the network lifetime maximization problem with the same assumption used in [2] that the size of forwarded data from each relay node is identical and the energy consumption at each node is proportional to the number of its children. They generalized the original algorithm for degree-constrained spanning trees [5] to an algorithm for routing trees in sensor networks by incorporating the residual energy into the design. Wu et al. [24] later extended their results to a routing forest instead of a routing tree.

In addition to the above mentioned tree construction algorithms, special efforts have also been made by researchers for constructing (energy) load-balanced routing trees. For example, Hsiao et al. [10] introduced the dynamic load-balanced tree for a grid topology of wireless access networks and developed a distributed algorithm. Dai and Han [4] introduced a hierarchy-balanced tree and made use of the Chebyshev sum as a measuring criterion
for top-level load-balance trees. Liang and Liu [15] considered the construction of the spanning tree dynamically with an aim to balance the transmission load among the sensors according to their residual energy so that the network lifetime can be prolonged. Yan et al. [26] extended the load-balanced tree concept by introducing the dynamic load-balanced tree, in which the load-balanced tree is dynamically constructed per data gathering session. However, the time and energy overhead incurred by this approach is too high to be acceptable for mission-critical data gathering applications. Liang et al. [16] recently considered the network lifetime maximization problem for the non-aggregation model and showed its NP-hardness. They provided an approximate solution of $\Omega(\log n / \log\log n)$, by reducing the problem to a bottleneck spanning tree problem.

However, almost all of these existing algorithms produce load-balancing or energy-saving trees without taking into account the cost of transmission delays. That is, they allow a balanced tree to be 'slim', which means the data from some leaf nodes will take a much longer journey to reach the tree root (the base station) than the average. This may cause the failure of the entire network in an unreliable wireless communication environment or lead to an intolerably long delay for mission-critical applications.

1.2. Contributions

The major contribution made by this paper is to address the need for time-sensitive data gathering to guarantee minimum delay when maximizing network lifetime. Unlike previous works, this paper assumes that each sensor must send its data to the base station within the minimum number of hops (a shortest path). It appears to be the first time that this type of optimization problem has been formulated, and it has many potential applications such as disaster relief and military response. The contributions of this paper to the new optimization problem can be summarized as follows.

First, it shows that finding a maximum network lifetime shortest path routing tree is equivalent to constructing a node load-balanced distance spanning tree, which is NP-hard. Second, a novel top-down heuristic is presented, which makes use of the network flow technique by optimally constructing the balanced spanning tree layer by layer. A distributed implementation of the proposed algorithm is also given. Third, to further improve the load-balanced spanning tree, a distributed load-balance refinement algorithm is proposed which effectively improves on the load balance for the routing tree produced by the 'top-down' algorithm. Finally, the performance of the proposed heuristics is evaluated through extensive simulations. The experimental results demonstrate that the proposed algorithms are very promising and deliver near optimal routing trees.

The rest of the paper is organized as follows. The system model and the optimization problem are defined in Section 2. The proof of NP-hardness of the problem is given in Section 3, and the top-down distance tree algorithm and the balance-refine algorithm are proposed in Section 4 and Section 5, respectively. The performance evaluation is given in Section 6. The conclusion is presented in Section 7.

2. System and problem formulation

A wireless sensor network can be modeled as an undirected, connected graph $M = (N \cup \{s\}, L)$, where $N$ is a set of $n$ stationary, identical sensor nodes randomly deployed in a monitoring region, $s$ is the sink node, and $L$ is the set of links among the sensors and between the sensors and the sink. There is a link between sensors $u$ and $v$ if and only if they are within transmission range of each other. For ease of presentation, we do not distinguish a node in the graph and its corresponding sensor. We treat the sink as a special sensor that receives sensed data. We assume that every sensor has a limited initial energy $IE$ while the sink has an unlimited energy supply. For the distributed implementations, we assume each node has no knowledge of the topology of the entire network but initially has its local knowledge, which includes its own ID number and its neighbors' ID numbers, which can be easily obtained by a local broadcast and acknowledgment messages. For the centralized algorithm, we assume the topology is known for the computation, as other works do. It must be mentioned that although we consider only the single-sink data gathering problem in this work, the developed algorithms and techniques can be easily extended to solve the similar problem in a wireless sensor network with multiple sinks.

We will attempt to design an algorithm to construct a shortest routing tree rooted at $s$ for a given network $M(N \cup \{s\}, L)$ such that the network lifetime is maximized under the non-aggregated relay model. In the rest of this paper, unless otherwise specified, we assume that a routing tree is a shortest path tree and the length of a routing path is the number of links in the path. We further assume that the transmission delay of a message from its source to the destination (the sink) is proportional to the length of its routing path. Since all the data generated from the sensors must go through the nodes adjacent to the sink, these adjacent nodes (the children of the sink in a shortest routing tree) will consume much more energy than others. Balancing the load among them is the key to prolonging the network lifetime. The optimal balancing is obtained if the number of nodes in the maximum branch (subtree) rooted at a child node of $s$, denoted by NMB, is minimized. In order to formally define the optimization problem dealt with by this paper, we introduce the following notations and assumptions.

(a) For simplicity, this paper only takes into account the energy consumption by each sensor for transmission and reception. This is because radio frequency transmission is the dominant energy consumption in wireless networks [20]. Let $e_t$ and $e_r$ (usually $e_t > e_r$) denote the amounts of energy consumed by a sensor node for transmitting and receiving one bit of data, respectively.
(b) For a given data gathering session, we assume that the size of sensed data by any sensor is identical to a fixed length $l$.
(c) Given a routing tree $T$ rooted at the sink, for each node $v$ in $T$, let $nd(v)$ denote the number of nodes in the subtree rooted at $v$ including node $v$ itself. The amount of energy consumed by node $v$ per data gathering session, $e_c(v)$, can be calculated by

$$e_c(v) = (nd(v) - 1) \cdot l \cdot e_r + nd(v) \cdot l \cdot e_t = l \cdot (nd(v) \cdot (e_r + e_t) - e_r).$$

Obviously, a node in each data gathering session consumes more energy than any of its descendants.
(d) Let $Adj(s) = \{v_1, v_2, \ldots, v_d\}$ denote the set of neighboring nodes of sink $s$ in $M(N \cup \{s\}, L)$. Obviously, these nodes must also be the children of $s$ in any shortest path tree.
(e) Suppose $T$ is a routing tree. We define $NMB(T) = \max_{v \in Adj(s)} \{nd(v)\}$, where NMB stands for the Number of nodes in the Maximum Branch.
(f) Suppose $T$ is a routing tree. Let node $v_k \in Adj(s)$ be such a child of $s$ that $nd(v_k) = \max_{v \in Adj(s)} \{nd(v)\} = NMB(T)$. Then, $e_c(v_k) = \max_{v \in N} \{e_c(v)\}$. Therefore, the lifetime of $M(N \cup \{s\}, L)$ for data gathering can be defined by

$$Life(N, L, T) = \frac{IE}{e_c(v_k)}, \qquad (1)$$

where $v_k \in Adj(s)$ with $nd(v_k) = NMB(T)$.

Definition 1. Given a sensor network $M(N \cup \{s\}, L)$, the problem of maximum lifetime shortest routing tree for data gathering is to construct a shortest path routing tree $T$ rooted at $s$ such that $Life(N, L, T)$ is maximized, subject to meeting the assumptions from (a) to (f). We refer to tree $T$ as the maximum lifetime shortest routing tree.

Clearly, this problem is equivalent to finding a routing tree $T$ such that $NMB(T)$ is minimized.

3. NP-Hardness

In this section we show that, given a graph $M = (N \cup \{s\}, L)$, finding the maximum lifetime shortest routing tree is NP-hard.

Definition 2. An $h$-layer graph is a graph $G(V, E)$, where the node set $V$ is partitioned into disjoint subsets $V_0, V_1, \ldots, V_h$, and the nodes in $V_i$ are in layer $i$, $0 \le i \le h$. Moreover, $V_0 = \{s\}$ consists of a single node called the source (or root). A node in layer $i$, $1 \le i \le h$, may be adjacent only to nodes in layer $i - 1$ or layer $i + 1$. There are no edges within each layer or connecting nodes of non-adjacent layers. An $h$-layer graph, $h \ge 1$, is called a multi-layer graph.

Let $dist(a, b)$ denote the distance from node $a$ to node $b$ in a graph, that is, the number of edges used by a shortest path from $a$ to $b$. Obviously, in an $h$-layer graph, $V_i$ consists of all nodes whose distance to $s$ is $i$, $1 \le i \le h$.

Definition 3. Given a connected graph $G(V, E)$ and a node $s \in V$, its distance graph $G' = (V', E')$ rooted at $s$ is a subgraph of $G$ such that $V' = V$ and $E' = \{(u, v) \mid (u, v) \in E \text{ and } dist(u, s) + 1 = dist(v, s)\}$.

Obviously, the distance graph $G'(V', E')$ is a multi-layer graph that can be easily obtained by performing a breadth-first search starting from node $s$.

Definition 4. Given a connected graph $G(V, E)$ and a node $s \in V$, a spanning tree $T$ rooted at $s$ is called a distance spanning tree if the length of any path in $T$ from $v \in V$ to $s$ is equal to $dist(v, s)$.

Definition 5. Given a connected graph $G(V, E)$ and a node $s \in V$, a distance spanning tree $T$ rooted at $s$ is called node balanced if the value of $NMB(T)$ is minimized.

Obviously, any distance spanning tree in $G(V, E)$ rooted at $s$ is a subgraph of the distance graph $G'(V', E')$ of $G$. It is also clear that, given a sensor network $M(N \cup \{s\}, L)$, any shortest path routing tree is a distance spanning tree rooted at $s$ and vice versa. Therefore, following the discussion in Section 2, finding the maximum lifetime shortest routing tree in $M(N \cup \{s\}, L)$ is equivalent to finding a node load-balanced distance spanning tree in its distance graph. Let us denote this problem by NBDT, namely, the Node Balanced Distance Tree.

Theorem 1. Given a connected graph $G(V, E)$ and a node $s \in V$, the NBDT problem is NP-hard even for the special case where $|Adj(s)| = 2$.

Proof. We reduce the set cover problem [6] to the NBDT problem. An instance of the set cover problem can be stated as follows. Given a positive integer $k$ ($\le m$) and a family $F = \{S_1, S_2, \ldots, S_m\}$ of $m$ sets whose union consists of $n$ different elements, $e_1, e_2, \ldots, e_n$, is there a collection $C$ of $k$ sets selected from the $m$ sets such that $\bigcup_{S_i \in C} S_i = \{e_1, e_2, \ldots, e_n\}$?

Let $U = \bigcup_{S_i \in C} S_i = \{e_1, e_2, \ldots, e_n\}$. We now transform this instance to an instance of the NBDT problem in a 4-layer graph $G$ whose construction is illustrated in Fig. 1 as follows.

(1) The layer $V_1 = \{u_1, u_2\}$ contains two nodes and both are adjacent to $s$ so that $|Adj(s)| = 2$.
(2) For layer $V_2$, we create a node $S_i$ for every set $S_i$ in $F$ and connect each of them to both nodes $u_1$ and $u_2$ by edges $(u_1, S_i)$ and $(u_2, S_i)$, $1 \le i \le m$. Moreover, we create $mn + 2k - m$ additional nodes for layer $V_2$, and connect each of them to $u_2$ only.
(3) For layer $V_3$, we create $n$ nodes, $e_1, e_2, \ldots, e_n$, that correspond to the $n$ different elements in $U$. Node $e_j$ is connected to node $S_i$ in $V_2$ by an edge $(S_i, e_j)$ if and only if $e_j \in S_i$, $1 \le j \le n$ and $1 \le i \le m$.
(4) For layer $V_4$, we create $m - 1$ nodes for each node $e_j \in V_3$ and connect each of them to node $e_j$ by an edge, $1 \le j \le n$.
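The construction in steps (1) to (4) can be sketched in Python to make the node counts concrete. This is a minimal illustration under stated assumptions, not code from the paper: the function names `build_reduction_graph` and `bfs_layers`, the node labels (`"u1"`, `("S", i)`, `("pad", p)`, `("e", j)`, `("leaf", j, q)`), and the adjacency-dict representation are all hypothetical choices made here. Elements are assumed to be numbered 1 to n.

```python
from collections import deque


def build_reduction_graph(sets, n, k):
    """Build the 4-layer graph for a set cover instance (illustrative sketch).

    sets: list of m Python sets over elements 1..n; k: requested cover size.
    Returns an undirected adjacency dict; all node labels are hypothetical.
    """
    m = len(sets)
    adj = {}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # Step (1): layer V1 holds u1 and u2, both adjacent to the root s.
    link("s", "u1")
    link("s", "u2")
    # Step (2): one node per set S_i, adjacent to both u1 and u2 ...
    for i in range(m):
        link("u1", ("S", i))
        link("u2", ("S", i))
    # ... plus mn + 2k - m additional nodes adjacent to u2 only.
    for p in range(m * n + 2 * k - m):
        link("u2", ("pad", p))
    # Step (3): element node e_j is adjacent to S_i iff e_j is in S_i.
    for i, members in enumerate(sets):
        for j in members:
            link(("S", i), ("e", j))
    # Step (4): m - 1 nodes in V4 hang off every element node.
    for j in range(1, n + 1):
        for q in range(m - 1):
            link(("e", j), ("leaf", j, q))
    return adj


def bfs_layers(adj, root):
    """Hop distance of every node from root; layer V_i is {v : dist[v] == i}."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        a = queue.popleft()
        for b in adj[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist


# Small instance: m = 3 sets over n = 3 elements, asking for a cover of size k = 2.
demo = build_reduction_graph([{1, 2}, {2, 3}, {3}], n=3, k=2)
print(len(demo) - 1)  # number of nodes other than s
```

For this instance the graph has 2 + 13 + 3 + 6 = 24 nodes besides $s$, so splitting them evenly between the two branches under $u_1$ and $u_2$ would give each branch 12 nodes.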
Fig. 1. An illustration of the 4-layer graph $G$ transformed from an instance of the set cover problem.

The transformation takes time polynomial in $n$ and $m$. The number of nodes in $G$ is calculated as follows. There are two nodes in $V_1$, $mn + 2k$ nodes in $V_2$, $n$ nodes in $V_3$, and $n(m - 1)$ nodes in $V_4$. Therefore, the total number of nodes in graph $G$ is $2 + mn + 2k + n + n(m - 1) = 2(mn + k + 1)$. Thus, any distance spanning tree $T$ must have $NMB(T) \ge mn + k + 1$. In the following we show that there is a solution to the instance of the set cover problem if and only if there is a distance spanning tree $T$ in $G$ such that $NMB(T) = mn + k + 1$.

We show the only if part first. Suppose there is a solution to the instance of the set cover problem and $C$ is the collection of $k$ sets such that $\bigcup_{S_i \in C} S_i = \{e_1, e_2, \ldots, e_n\}$. We construct a distance spanning tree $T$ as follows.

(1) Connect $s$ to $u_1$ and $u_2$.
(2) Connect $u_1$ to $S_i \in V_2$ by edge $(u_1, S_i)$ if $S_i \in C$.
(3) Connect $u_2$ to each remaining node in $V_2$. Because there are $m - k$ sets that are not in $C$, $u_2$ has $(m - k) + (mn + 2k - m) + 1 = mn + k + 1$ nodes in its subtree, including $u_2$ itself.
(4) For element $e_j \in U$, find a set $S_i \in C$ such that $e_j \in S_i$. Then, connect the corresponding node $S_i \in V_2$ to node $e_j \in V_3$. Because the collection $C$ covers all elements, this can be done. Obviously, all nodes in $V_3$ become descendants of $u_1$.
(5) Include all edges between $V_3$ and $V_4$. All nodes in $V_4$ become descendants of $u_1$ in the tree.

Following the above construction, it is clear that each of the nodes $u_1$ and $u_2$ has exactly $mn + k + 1$ nodes in its subtree. Therefore, $NMB(T) = mn + k + 1$.

We then show the if part. Suppose there is a distance spanning tree $T$ such that $NMB(T) = mn + k + 1$. Then, the subtree rooted at $u_1$ must connect to at least $k$ nodes in $V_2$; otherwise, it would have fewer than $mn + k + 1$ nodes even if all nodes in layers $V_3$ and $V_4$ are its descendants. However, $u_1$ cannot connect to more than $k$ nodes in $V_2$ either. Suppose, for the sake of contradiction, that $u_1$ connects to more than $k$ nodes in $V_2$. To be balanced, the tree rooted at $u_2$ must contain at least one node $x$ in layer $V_3$, while node $x$ must connect the $(m - 1)$ nodes in $V_4$ that are adjacent to $x$ only. Then, the tree rooted at $u_2$ must have at least $1 + (mn + 2k - m) + 1 + 1 + (m - 1) = (mn + k + 1) + (k + 1)$ nodes, contradicting that the tree is balanced. Therefore, in the balanced tree, node $u_1$ must connect to exactly $k$ nodes in $V_2$. Moreover, the $k$ nodes must connect to all $n$ nodes in $V_3$ and then all nodes in $V_4$, in order for $u_1$ to have $(mn + k + 1)$ nodes in its subtree. Obviously, these $k$ nodes in $V_2$ correspond to the $k$ sets that cover all $n$ elements in $U$. The theorem then follows. □

4. Top-down distance tree algorithm

As discussed in Section 2, finding the maximum lifetime shortest routing tree in $M(N \cup \{s\}, L)$ is equivalent to finding a node load-balanced distance spanning tree in its distance graph. For ease of presentation, in this section, the notation $M(N \cup \{s\}, L)$ is also used to represent its distance graph if no ambiguity arises. Denote by $V_l$ the set of nodes in layer $l$ and by $E_l$ the set of edges between layer $l$ and layer $l + 1$ in $M(N \cup \{s\}, L)$. Let $T_l$ be a distance spanning tree up to layer $l$ and $nd_l(v_k)$ the number of descendants of node $v_k$ in $T_l$, $1 \le l \le h$. Equivalently, $T_l$ can be viewed as the subtree of a distance spanning tree that consists of all nodes from layer 0 up to layer $l$.

4.1. A centralized algorithm

Given a distance spanning tree $T_l$, $1 \le l \le h - 1$, we can extend it to layer $l + 1$ by linking each node $v \in V_{l+1}$ to a node $u \in V_l$ through an edge $(u, v) \in E_l$. Starting from $T_1$, the top-down heuristic uses the network flow technique to repeatedly and optimally extend tree $T_l$ to $T_{l+1}$, $1 \le l \le h - 1$, until all the nodes in $N$ are included in the tree.

Definition 6. A distance spanning tree $T_{l+1}$ is said to be optimally extended from $T_l$, $1 \le l \le h - 1$, if $NMB(T_{l+1})$ $(= \max_{1 \le k \le d} \{nd_{l+1}(v_k)\})$ is minimized among all possible extensions from layer $l$ to layer $l + 1$, where $nd_{l+1}(v_k)$ denotes the number of nodes in the subtree of $T_{l+1}$ rooted at $v_k$, $v_k \in Adj(s)$, $k = 1, 2, \ldots, d$.

In the following we explain how this heuristic optimally extends tree $T_l$ to $T_{l+1}$, $1 \le l \le h - 1$. Because the network flow technique is the key technique employed in this algorithm, we first describe how to construct the flow network according to tree $T_l$, $1 \le l \le h - 1$.

4.1.1. Constructing a flow network N(B) from the given $T_l$

Let $A(v) = v_k$ denote that $v \in N$ is a descendant of $v_k \in Adj(s)$ in $T_l$, and let $V_{l+1} = \{x_1, x_2, \ldots, x_m\}$ be the set of nodes in layer $l + 1$. The construction of the flow network $N(B)$ is given below.

As part of $N(B)$, construct a directed bipartite graph $G(X, Y, E)$, where $X = V_{l+1} = \{x_1, x_2, \ldots, x_m\}$, $Y = Adj(s) =$
$\{v_1, v_2, \ldots, v_d\}$, and $E = \{(x, v) \mid x \in X, v \in Y, \exists u \text{ such that } (x, u) \in E_l \text{ and } A(u) = v\}$, where node $v \in Adj(s)$ is an ancestor of node $u \in V_l$ in $T_l$. The meaning of edge $(x, v)$ is that $x$ can become a descendant of $v$ through edge $(x, u)$. There can be multiple edges from $x$ to different nodes in set $Y$. The capacity of every edge in $G(X, Y, E)$ is set to 1. Add to $N(B)$ a source node $s^*$ and a directed edge $(s^*, x)$ for every node $x \in X$, with capacity $c(s^*, x) = 1$. Add to $N(B)$ a directed edge $(s^*, v_k)$ for every node $v_k \in Y$, with capacity $c(s^*, v_k) = nd_l(v_k)$. Add to $N(B)$ a sink node $t$ and a directed edge $(v, t)$ for every node $v \in Y$, with capacity $c(v, t) = B$, where $B$ is an adjustable integer whose meaning will be clear later. Fig. 2 shows an example of the construction of $N(B)$.

Fig. 2. An illustration of the construction of flow network $N(B)$.

Lemma 1. Given a positive integer $B$ as the edge capacity in $N(B)$ with $1 \le B \le |N|$, there exists a maximum integral flow $f$ in $N(B)$ that saturates all outgoing edges from $s^*$ if and only if there is a distance spanning tree $T_{l+1}$ extended from $T_l$ such that $\max_{1 \le k \le d} \{nd_{l+1}(v_k)\} \le B$.

Proof. We first show the only if part. Let $f$ be a maximum flow in $N(B)$ that saturates all outgoing edges from $s^*$. We can extend $T_l$ in the following way. Because $f(s^*, x_i) = 1$, every node $x_i$ must have exactly one outgoing edge $(x_i, v_k)$ with flow $f(x_i, v_k) = 1$, $1 \le k \le d$, $1 \le i \le m$. Since $(x_i, v_k) \in E$, by the construction of $N(B)$, there is a node $u \in V_l$ such that $(x_i, u) \in E_l$ and $A(u) = v_k$. We thus connect $x_i$ to node $u$ in $T_{l+1}$. Obviously, $x_i$ becomes a descendant of $v_k$ through edge $(x_i, u)$. By doing so, we connect every $x_i \in V_{l+1}$, $1 \le i \le m$, to a node in $V_l$, which effectively extends $T_l$ to $T_{l+1}$. Moreover, for each node $v_k$, $1 \le k \le d$, we have $f(v_k, t) = \sum_{i=1}^{m} f(x_i, v_k) + f(s^*, v_k)$. Because $\sum_{i=1}^{m} f(x_i, v_k)$ is equal to the number of nodes in layer $l + 1$ that become new descendants of $v_k$ and $f(s^*, v_k) = nd_l(v_k)$ is the number of descendants of $v_k$ from layer 1 to layer $l$, we have $f(v_k, t) = nd_{l+1}(v_k)$ in the extended $T_{l+1}$. Therefore, $nd_{l+1}(v_k) = f(v_k, t) \le B$.

We then prove the if part. Suppose $T_{l+1}$ is a distance spanning tree extended from $T_l$ such that $\max_{1 \le k \le d} \{nd_{l+1}(v_k)\} \le B$. According to $T_{l+1}$, we assign a flow $f$ in $N(B)$ as follows: assign $f(s^*, x_i) \leftarrow 1$ and $f(x_i, v_k) \leftarrow 1$ if $x_i$ is connected to a node $u \in V_l$ in $T_{l+1}$ and $A(u) = v_k$; assign $f(s^*, v_k) \leftarrow nd_l(v_k)$ and $f(v_k, t) \leftarrow nd_{l+1}(v_k)$; and assign all other edges zero flow, $1 \le i \le m$ and $1 \le k \le d$. Obviously, this flow assignment saturates all outgoing edges from $s^*$, and because $\max_{1 \le k \le d} \{nd_{l+1}(v_k)\} \le B$, the amount of flow assigned on any edge is no greater than its capacity. Moreover, we assign $f(x_i, v_k) = 1$ only if $x_i$ is a descendant of $v_k$ in $T_{l+1}$; otherwise $f(x_i, v_k) = 0$. So, $\sum_{i=1}^{m} f(x_i, v_k)$ is equal to the number of newly added descendants from layer $l + 1$ to $v_k$. Therefore, we have

$$f(v_k, t) = nd_{l+1}(v_k) = \sum_{i=1}^{m} f(x_i, v_k) + nd_l(v_k) = \sum_{i=1}^{m} f(x_i, v_k) + f(s^*, v_k). \qquad (2)$$

Thus, at any node except $s^*$ and $t$, the total incoming flow is equal to the total outgoing flow. Therefore, the assigned flow is a valid flow and it saturates all outgoing edges from $s^*$. Lemma 1 then follows. □

Corollary 1. Finding an optimally extended distance spanning tree $T_{l+1}$ from tree $T_l$ is equivalent to finding a minimum integer $B$ in the flow network $N(B)$ such that a maximum flow can saturate all outgoing edges from $s^*$, $1 \le B \le |N|$.

Proof. This corollary follows from Lemma 1; the proof is omitted. □

We now determine the minimum integer $B$ in the flow network $N(B)$ such that a maximum flow can saturate all outgoing edges from $s^*$. Following the construction of $N(B)$, to saturate all outgoing edges from $s^*$ in $N(B)$, there is such an integer $B$ that satisfies the following inequality for a flow $f$:

$$\max_{1 \le k \le d} \{nd_l(v_k)\} \le B \le \max_{1 \le k \le d} \{nd_l(v_k)\} + m,$$

where $m = |V_{l+1}|$. The smallest value of $B$ can be found by algorithm Smallest_B using binary search as follows.

Algorithm 1. Smallest_B($N(B)$, $T_l$, $m$)

 1: lower_bound ← $\max_{1 \le k \le d}\{nd_l(v_k)\}$; /* This was known when $T_l$ was produced */
 2: upper_bound ← lower_bound + $m$;
 3: $c$ ← $\sum_{k=1}^{d} nd_l(v_k)$; /* The total capacity on all of $(s^*, v_k)$, $1 \le k \le d$ */
 4: while lower_bound ≠ upper_bound do
 5:     $b$ ← $\lfloor$(upper_bound + lower_bound)/2$\rfloor$;
 6:     $B$ ← $b$; /* Set $B = b$ in $N(B)$ */
 7:     Find a max flow $f$ in $N(B)$ from $s^*$ to $t$;
 8:     if $|f| = m + c$ then
 9:         /* $f$ saturates all out edges from $s^*$ */
10:         upper_bound ← $b$;
11:     else
12:         lower_bound ← $b + 1$;
13:     end if
14: end while
15: $B$ ← upper_bound;
16: return $B$.

Theorem 2. Algorithm Smallest_B correctly finds the smallest integer $B$ for the flow network $N(B)$ such that the maximum flow saturates all outgoing edges from $s^*$.

Proof. It is clear that the value of the smallest $B$ in $N(B)$ is between the initial lower bound and the upper bound, namely, $\max_{1 \le k \le d}\{nd_l(v_k)\} \le B \le m + \max_{1 \le k \le d}\{nd_l(v_k)\}$. The while loop in the Smallest_B algorithm reduces the search space by at least a half after each iteration, while guaranteeing that the smallest $B$ is still within the reduced interval. Therefore, when the length of the interval becomes 0, the smallest $B$ is found. □

4.1.2. The top-down distance tree algorithm

The following top-down distance tree heuristic algorithm constructs a distance spanning tree by repeating optimal layer extensions until all the nodes in $N$ are included in the tree.

Algorithm 2. Top-down-distance-tree($M(N \cup \{s\}, L)$)

1: Construct the tree $T_1$ by including all edges of $(s, v_k)$, $1 \le k \le d$;
2: for $l$ ← 1 to $h - 1$ do
3:     Construct the network graph $N(B)$ having $T_l$, $E_l$ and $V_{l+1}$;
4:     $m$ ← $|V_{l+1}|$;
5:     $B$ ← Smallest_B($N(B)$, $T_l$, $m$);
6:     Find a maximum flow $f$ in $N(B)$ from $s^*$ to $t$;
7:     Convert the flow $f$ into the extension of $T_l$ by the nodes of $V_{l+1}$, following the steps given in the proof of Lemma 1.
8: end for

The correctness of algorithm Top-down-distance-tree is obvious. In the rest of this paper, we refer to this algorithm as the ‘top-down’ algorithm for short.

4.2. Overview of the distributed implementation

We assume that the sensor network is a synchronous network, in which each node starts a message transmission at the beginning of a time unit and finishes the transmission at the end of the time unit. All local computation can be done in the same time unit as well. In other words, we assume that local computation takes no time.

The distributed implementation of the ‘top-down’ algorithm consists of $h - 1$ iterations. Within each iteration, it extends the current tree one layer further. We now consider one layer expansion by considering a subgraph $G_{l,l+1} = (V_l \cup V_{l+1}, (V_l \times V_{l+1}) \cap L)$. Note that $G_{l,l+1}$ may not be connected. The flow network $N'(B) = (V_l \cup V_{l+1} \cup V_1, E')$ based on $G_{l,l+1}$ is constructed as follows. There is a source $s^*$ and a destination $t$ in $N'(B)$ such that $s^*$ connects to all nodes in $V_{l+1}$ while all nodes $v_i \in V_1$ $(= Adj(s))$ connect to node $t$. There is a directed edge from a node $y \in V_l$ to a node $v_i \in V_1$ if $y$ is a descendant of $v_i$ in $T_l$, and the capacity of the edge is assigned as $B - nd_l(v_i)$. Meanwhile, the capacity of each edge from a node $v \in V_1$ to $t$ is assigned an integer $B - nd_l(v)$ and the capacity of each edge from $s^*$ to $x \in V_{l+1}$ is assigned an integer 1. Each edge $(u, v)$ from a node $u \in V_{l+1}$ to a node $v \in V_l$ in $(V_l \times V_{l+1}) \cap L$ is assigned a capacity of 1. The task is to find a maximum flow from $s^*$ to $t$ distributively to saturate all edges starting from $s^*$. Clearly, the flow network $N'(B)$ is equivalent to $N(B)$ defined in the previous section.

We first embed the graph $N'(B)$ into the communication network $M(N \cup \{s\}, L)$ as follows. The subgraph $G_{l,l+1}$ can be embedded into the original sensor network $M$ easily. We embed each edge from a node $y \in V_l$ to a node $v \in V_1$ into node $y$ and each edge from $v \in V_1$ to the virtual node $t$ into node $v$. Similarly, we embed each edge from the virtual node $s^*$ to a node $x \in V_{l+1}$ into node $x$, and embed the virtual node $t$ into the base station.

To simulate an edge in $N'(B)$ from a node $y \in V_l$ to $v \in V_1$ (or $t$), or from $s^*$ to a node $x \in V_{l+1}$ in the real communication topology $M$, we use the unique path in the partial BFS tree $T^{BFS}_l$ or $T^{BFS}_{l+1}$ for this purpose. Thus, although $G_{l,l+1}$ may not be connected, the messages sent by the nodes in it can be collected using trees $T^{BFS}_l$ or $T^{BFS}_{l+1}$. In other words, each message transfer between two neighboring nodes in $N'(B)$ can be emulated in at most $O(l)$ time with $O(l)$ messages along the unique path in the partial tree $T^{BFS}_l$ or $T^{BFS}_{l+1}$. In the distributed implementation, each child $v \in Adj(s)$ of the root $s$ broadcasts its identity and its number of descendants $nd_l(v)$ to its descendants in $V_l$ using $T^{BFS}_l$. Thus, each node in $V_l$ is labeled with one of the children of node $s$. Consequently, if finding a maximum flow in network $N'(B)$ from $s^*$ to $t$ takes $O(t_{l,l+1})$ time and $O(m_{l,l+1})$ messages, it takes $O(l \cdot t_{l,l+1})$ time and $O(l \cdot m_{l,l+1})$ messages in the original communication network $M(N \cup \{s\}, L)$.

4.3. Distributed implementation

We here give a distributed implementation of the proposed ‘top-down’ algorithm, and we state the result by the following theorem.
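As a centralized illustration of Algorithm 1 (Smallest_B) and the flow network $N(B)$ of Section 4.1, the following is a minimal sketch, not the distributed procedure analyzed in this paper. It assumes a dict-of-dicts capacity map, the hypothetical node labels "s*" and "t" for the virtual source and sink, and a plain Edmonds-Karp augmenting-path routine swapped in for the Goldberg-Tarjan algorithm [8]. The inputs `nd` (current branch sizes $nd_l(v_k)$) and `reach` (for each new node $x_i$, the branches it can join) are illustrative names.

```python
from collections import deque


def max_flow(cap, s, t):
    """Edmonds-Karp max flow; `cap` is a residual capacity map and is mutated."""
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Walk back from t to s to collect the path edges.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] = cap[v].get(u, 0) + push
        flow += push


def build_network(nd, reach, B):
    """Build N(B): s*->x (1), x->v (1), s*->v (nd_l(v)), v->t (B)."""
    cap = {"s*": {}, "t": {}}
    for v, load in nd.items():
        cap[v] = {"t": B}
        cap["s*"][v] = load
    for x, branches in reach.items():
        cap["s*"][x] = 1
        cap[x] = {v: 1 for v in branches}
    return cap


def smallest_B(nd, reach):
    """Binary search for the least B whose max flow saturates all edges out of s*."""
    m = len(reach)           # m = |V_{l+1}|
    c = sum(nd.values())     # total capacity on the edges (s*, v_k)
    lower = max(nd.values())
    upper = lower + m
    while lower != upper:
        b = (lower + upper) // 2
        if max_flow(build_network(nd, reach, b), "s*", "t") == m + c:
            upper = b        # b is feasible; try smaller values
        else:
            lower = b + 1    # b is too small
    return upper


if __name__ == "__main__":
    # Two branches of size 1; four new nodes that can join either branch.
    print(smallest_B({"A": 1, "B": 1}, {x: {"A", "B"} for x in "cdef"}))
```

Rebuilding the network for every probe keeps the sketch short; it assumes every new node can reach at least one branch, so that the upper bound is always feasible.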

Theorem 3. Given a wireless sensor network $M = (N \cup \{s\}, L)$, there is a distributed implementation of the ‘top-down’ algorithm for finding a maximum lifetime shortest routing tree, which takes $O(h \cdot |N|^2 \cdot \log |N|)$ time and uses $O(|N|^2 \cdot |L|)$ messages.

Proof. Given the communication network $M(N \cup \{s\}, L)$, the distributed construction of a BFS tree in $M$ rooted at the base station takes $O(h)$ time and $O(|L| + |N|^{1.6})$ messages by a distributed algorithm in [1], where $h$ is the depth of the BFS tree. The construction of flow network $N'(B)$ and its embedding into the communication network $M$ takes $O(|N|)$ time and $O(|L|)$ messages because the degree of each node is no more than $|N|$. Finding a maximum flow from $s^*$ to $t$ in $N'(B)$ takes $O(t_{l,l+1}) = O((|V_l| + |V_{l+1}|)^2)$ time and uses $O(m_{l,l+1}) = O((|V_l| + |V_{l+1}|)^2 \cdot |(V_l \times V_{l+1}) \cap L|)$ messages by Goldberg and Tarjan’s distributed algorithm [8], which takes $O(n'^2)$ time and uses $O(n'^2 m')$ messages in a graph with $n'$ nodes and $m'$ edges (see Th. 6.3 in [8]). Following algorithm Smallest_B, there are at most $\lceil \log B \rceil$ calls of the s-t maximum flow algorithm in order to find a load-balanced matching for all nodes in $V_{l+1}$, where $1 \le B \le |N|$. Thus, each layer extension of the routing tree in the original sensor network $M$ takes $O(l \cdot \log |N| \cdot t_{l,l+1})$ time and uses $O(l \cdot \log |N| \cdot m_{l,l+1})$ messages.

The algorithm for finding a maximum lifetime shortest routing tree needs $h - 1$ iterations. The total time for the routing tree construction thus is $\sum_{l=1}^{h-1} O(l \cdot \log |N| \cdot t_{l,l+1}) = O(h \cdot |N|^2 \cdot \log |N|)$ and the number of messages used is $\sum_{l=1}^{h-1} O(l \cdot \log |N| \cdot m_{l,l+1}) = O(|N|^2 \cdot |L| \cdot \log |N|)$. The theorem then follows. □

5. The balance-refine algorithm

Although the ‘top-down’ algorithm constructs a load-balanced spanning tree optimally layer by layer, the balance load (NMB) of the tree could be further improved. Fig. 3 shows such an example, where the sink has two children A and B in layer 1, and both A and B connect to 4 nodes, c, d, e, f, in layer 2. So, the ‘top-down’ algorithm extends the tree to layer 2 by assigning nodes c and e to A and nodes d and f to B. It is a perfectly balanced tree up to layer 2. However, if nodes c and e can reach many nodes below layer 2, but nodes d and f connect to no nodes below layer 2 at all, then the result will be the unbalanced tree shown in Fig. 3c. This is because, within each iteration, the ‘top-down’ algorithm has no way to use the connection information below the layer it is dealing with.

Fig. 3. An example of the spanning tree constructed by the ‘top-down’ algorithm that needs an improvement.

In this section, we introduce a distributed balance-refine algorithm to refine the tree produced by the ‘top-down’ algorithm, layer by layer, through changing the connection between two adjacent layers such that the load balance among the children of the root can be further improved. Specifically, for layer $l$ and layer $l + 1$ ($1 \le l \le h - 1$), we first remove all the edges between these two layers in the current tree so that the tree is partitioned into two parts: the upper part is the tree containing all nodes up to layer $l$, and the lower part consists of a set of subtrees whose roots are the nodes in layer $l + 1$. As an example, Fig. 4a shows the two parts after removing all tree edges between layers 1 and 2 of the spanning tree of Fig. 3c. After removing these edges, we re-connect the two parts into a new tree using all available edges in the set $E_l = (V_l \times V_{l+1}) \cap L$, which includes previously non-tree edges as well as tree edges, such that the new tree has a better load balance. Fig. 4b shows a possible result of re-connecting the two parts of Fig. 4a, which has a better load balance. Note that the removing and re-connecting procedure on two adjacent layers is essentially different from expanding the current tree to include the nodes in one more layer. Because the lower part includes the subtrees rooted at the nodes in the next layer, and different subtrees contain different numbers of nodes, finding an optimal connection such that the new tree has the smallest NMB number becomes difficult and can be easily shown to be NP-complete.

Fig. 4. An illustration of reconnecting layers 1 and 2 of the tree in Fig. 3c.

5.1. Modeling of the re-connecting problem

Recall that $V_1 = Adj(s) = \{v_1, v_2, \ldots, v_d\}$. We aim to re-connect layer $l$ and layer $l + 1$ ($1 \le l \le h - 1$) of the current tree such that $NMB(T) = \max_{v_i \in V_1}\{nd(v_i)\}$ in the new tree $T$ is improved, where $nd(v_i)$ is the number of nodes in the branch rooted at $v_i$. To help readers understand the following distributed algorithm, we use a daily life example, the school admission process, as an illustration of the proposed distributed algorithm.

We view the subtree rooted at $v_i$ as a college $i$ ($1 \le i \le d$) and the nodes in this subtree as the students admitted to this college. The node $v_i$ plays the role of admission officer. A node $y$ in layer $l$ is considered to be a recruiter for college $i$ if $y$ belongs to the subtree rooted at $v_i$. Since we have removed the edge connections between layer $l$ and layer $l + 1$, the tree in the upper part represents

the current enrollments to the $d$ colleges. Connecting a subtree rooted at node $x$ in layer $l + 1$ to a node $y$ in layer $l$ is equivalent to admitting all students in the subtree of $x$ to the college to which node $y$ belongs. Our objective is to make the enrollments among the $d$ colleges as balanced as possible. The decision on whether to admit students of a subtree is not made by the recruiter but by the admission officer only. This is because there may be many recruiters for the same college and they have no way to communicate directly with each other. We can expect that a recruiter passes an application from a subtree to its branch root (admission officer) and gets a reply back in exactly $2l$ time units, assuming that it takes one time unit for a node to send a message to a neighboring node. Fig. 5 illustrates this model by an example. Note that a college does not admit students individually but admits students in groups. The links (edges) between layer $l$ and layer $l + 1$ define all possible ways that student groups can apply for colleges. It is very likely that a student group (a subtree) may have links to multiple recruiters of the same college. For example, in Fig. 5, $u_3$ has links to $B_2$ and $B_3$, two recruiters for college B. In this case, the student group chooses exactly one recruiter to communicate with.

Fig. 5. An illustration of the college admission modeling.

5.2. A distributed algorithm for re-connecting two layers

In this subsection, we present a distributed algorithm for re-connecting layer $l$ and layer $l + 1$ ($1 \le l \le h - 1$). For the sake of convenience, we make the following reasonable assumptions.

(1) Each college admission officer knows the number of students that have been recruited by the college in the upper part. A variable $enrollment(i)$ is used to denote the enrolled number of students in college $i$, which is updated immediately whenever the college has admitted new students.
(2) There are $k$ $(= |V_{l+1}|)$ subtrees in the lower part whose roots are $u_1, u_2, \ldots, u_k$. We also assume that the subtree rooted at $u_j$ has $s_j$ students, $1 \le j \le k$, and node $u_j$ knows the value $s_j$.
(3) The communication between a recruiter of a college and its admission officer is through a message which is forwarded along the unique path between them in the tree of the upper part.
(4) In case a student group has links to multiple recruiters of the same college, one of the recruiters is chosen to pass the message between the student group and the admission officer of that college, and the other recruiters do not communicate with the student group.

Given the above assumptions, it can be seen that exactly $l$ time units are used for a one-way transmission between a student group and a college admission officer, and $2l$ time units for a round trip communication. Now, we are ready to introduce the re-connection algorithm consisting of the following three stages.

5.2.1. The first stage: start stage

The re-connection procedure starts when the sink sends a message start(nmb, layer $l$) to nodes in $V_1 = \{v_1, v_2, \ldots, v_d\}$, where nmb is the NMB value of the current spanning tree which is to be improved. Upon receiving the message, each node $v_i$, the admission officer for college $i$, broadcasts a message recruit(nmb, layer $l$, $enrollment(i)$, college $i$) to its recruiters, $1 \le i \le d$. Then, each recruiter of college $i$ broadcasts this message to all student groups the recruiter is responsible for communicating with. After receiving this message, each node enters the second stage.

5.2.2. The second stage: admission stage

In this stage, each student group applies to a college and the college selects only one student group to admit, and this interaction repeats every $2l$ time units until all student groups are admitted or one student group sends an abort message. The detailed description of this stage is as follows.

First, each student group $u_j$ takes the following actions, $1 \le j \le k = |V_{l+1}|$.

(1) Upon receiving all recruit(nmb, layer $l$, $enrollment(i)$, college $i$) messages, it updates its local variable $enrollment(i)$. If $u_j$ has not been admitted by any college, then it does the following:
    (a) Compute $D_{ij} = enrollment(i) + s_j$. A college $i$ is qualified to be considered by $u_j$ if $D_{ij} \le nmb$;
    (b) If no college is qualified, then send an abort message to one of the recruiters it communicates with and stop communication until the third stage;
    (c) If one or more colleges are qualified, find the one with the minimum $D_{ij}$, and send a message apply(college $i$, group $j$, $D_{ij}$, urgent) to the recruiter of college $i$, where urgent is a boolean variable which is ‘true’ if only college $i$ is qualified;
(2) Upon receiving a termination message, it enters the third stage;
(3) Upon receiving an abort message, it resets the parent of $u_j$ to be the initial parent, and then enters the third stage;
(4) Upon receiving an admitted message from college $i$, it does the following:

    (a) Send a message done to all recruiters that node $u_j$ communicates with;
    (b) Connect $u_j$ to the recruiter of college $i$. The recruiter node becomes the parent of $u_j$ in the new tree.

Second, the main role of each recruiter in layer $l$ is to pass messages between the college admission officer and student groups. Additional actions taken by a recruiter for college $i$, $1 \le i \le d$, are:

(1) Upon receiving the done message from a student group $u_j$, it stops communication with $u_j$ until the third stage;
(2) Upon receiving done messages from all student groups with which the recruiter is responsible for communication, it sends a done message to college admission officer $v_i$.

Third, each college admission officer $v_i$, $1 \le i \le d$, takes the following actions.

(1) Upon receiving all apply messages, it admits those student groups whose parameter urgent in the apply message is ‘true’, and updates local variable $enrollment(i)$. If $enrollment(i) > nmb$, then it sends an abort message to the sink and stops communication until the third stage. Otherwise, it sends a message admitted($enrollment(i)$, college $i$, group $j$) to each admitted applicant $u_j$;
(2) Among the student group applications with the message apply(college $i$, group $j$, $D_{ij}$, urgent = ‘false’), it does the following:
    (a) Find a student group $u_j$ with the largest $D_{ij}$ such that $D_{ij} \le nmb$;
    (b) Update local variable $enrollment(i) \leftarrow D_{ij}$;
    (c) Send a message admitted($enrollment(i)$, college $i$, group $j$) to student group $u_j$;
    (d) Broadcast a message recruit(nmb, layer $l$, $enrollment(i)$, college $i$) to all recruiters;
(3) Upon receiving any abort message from a student group, it forwards it to the sink and stops communication until the third stage;
(4) Upon receiving done messages from all recruiters, it sends a done($enrollment(i)$) message to the sink.

Finally, the sink (the tree root) takes the following actions.

(1) Upon receiving an abort message from any $v_i$, $1 \le i \le d$, it broadcasts an abort message to all student groups through its recruiters along the tree paths;
(2) Upon receiving done($enrollment(i)$) messages from all $v_i$, $1 \le i \le d$, it updates the value of nmb to $\max_{1 \le j \le d}\{enrollment(j)\}$ and broadcasts a message termination(nmb) to all recruiters and student groups.

Note that a college admission officer does not need to send a denial message to a student group applicant. This implies that if a student group does not receive an admitted message within $2l$ time units, this group can apply to another college in the next round once it receives a new recruit message.

5.2.3. The third stage: transfer stage

The admission stage ends with either a termination message or an abort message from the sink. If it is the former, a student group is assigned to a new parent which may be different from the one prior to the reconnection; otherwise, every student group keeps its original parent. In either case, the third stage gives each student group a chance to change its parent node, that is, to transfer to another college if this leads to an improvement in the balance of enrollment, or to stay as is. This stage consists of three steps, as follows.

In the first step, each student group $u_j$, $1 \le j \le k$, identifies a college $i$ from all communicating colleges that has the smallest value of $enrollment(i)$. Suppose $u_j$ is currently enrolled in college $i'$. Let $D_{ij} = enrollment(i) + s_j$. If $i \ne i'$, then student group $u_j$ sends a transfer request transfer($enrollment(i')$, college $i$, group $j$, $D_{ij}$) to college $i$.

In the second step, upon having received all transfer messages, each admission officer $v_i$, $1 \le i \le d$, identifies a transfer message with the largest $D_{ij}$ such that $D_{ij} \le nmb$, accepts the transfer, updates local variable $enrollment(i) \leftarrow D_{ij}$, and sends a message admitted($enrollment(i)$, college $i$, group $j$) to $u_j$.

In the third step, each student group informs its previous college if it has been transferred to another college. Each college admission officer then updates its college enrollment accordingly and notifies the sink. Finally, the sink computes the new nmb value and broadcasts it to the entire tree.

For convenience, we will also refer to the balance-refine algorithm as ‘refinement’ for short.

5.3. Correctness and termination

The proposed distributed algorithm for re-connecting two adjacent layers forms the base of algorithm balance-refine, which consists of $h - 1$ iterations; the distributed algorithm for reconnecting two adjacent layers is invoked within each iteration.

The correctness of the proposed distributed algorithm is obvious. It remains to show that it will terminate. Clearly, it terminates in its first stage (the ‘start’ stage) and in its third stage (the ‘transfer’ stage). We now show that it terminates in its second stage (the ‘admission’ stage), too.

Lemma 2. The admission stage of the distributed algorithm either terminates or aborts within no more than $k$ rounds, where $k = |V_{l+1}|$ is the number of student groups with roots at layer $l + 1$, $1 \le l \le h - 1$.

Proof. According to the proposed distributed algorithm, within each round, a college admission officer either receives no applications and so admits no student group, or accepts at least one student group. Therefore, in each round of recruiting, if student group $j$ does not receive any admitted message from the college $i$ it applied to, then

there must be at least one student group other than group $j$ that has been accepted by college $i$ in this round. Since there are only $k$ student groups, one student group cannot be denied more than $k - 1$ times. Therefore, within $k$ rounds, the admission stage will either terminate with all student groups admitted or abort without admitting any student groups. □

Theorem 4. Given a shortest distance routing tree, the time and message complexities of the balance-refine algorithm are $O(hn)$ and $O(hn^2)$ respectively, where $h$ is the height of the tree and $n = |N|$ is the number of sensor nodes in the network $M(N \cup \{s\}, L)$.

Proof. The number of iterations in algorithm balance-refine is $h - 1$. The time complexity of each iteration is dominated by the admission stage. According to Lemma 2, there are at most $k$ rounds and each round costs $2l$ time units. The time complexity of the balance-refine algorithm thus is $\sum_{i=1}^{h-1} O(i \, n_i) \le h \sum_{i=1}^{h-1} O(n_i) \le O(hn)$, where $n_i = |V_i|$.

The message complexity is dominated by the admission stage of the re-connecting procedure. Following Lemma 2, in the admission stage of reconnecting layer $l$ and layer $l + 1$ ($1 \le l \le h - 1$), there are at most $k$ rounds of recruiting, while in each round at most $k$ apply messages are sent. Every apply message is transmitted to a college admission officer through $l$-hop relays and is then answered by either an admitted message or a new recruit message. Thus, the message complexity related to apply messages is $O(k^2 l)$. As for the recruit messages, whenever a college admission officer accepts a student group, the value of $enrollment(i)$ increases, and the officer then broadcasts this updated value to all student groups. The message complexity related to recruit messages thus is $O(k^2 l)$, too. Therefore, the message complexity within each iteration is $O(k^2 l)$. The number of messages required by the balance-refine algorithm is $\sum_{i=1}^{h-1} O(i \, n_i^2) \le h \sum_{i=1}^{h-1} O(n_i^2) \le O(hn^2)$. □

6. Performance evaluation

In this section we evaluate the performance of the proposed algorithm in terms of NMB and the network lifetime, through experimental simulations. We also investigate, by simulation, the impact of several network parameters on the performance, such as network density, network size, and the location of the sink. We refer to the proposed algorithm as the ‘top-down with refinement’ algorithm, since it invokes two algorithms: ‘top-down’ and ‘refinement’.

In the default setting, we consider a sensor network consisting of 100–450 sensors randomly deployed in a 200 m × 200 m square region with the sink randomly deployed in a centered square area of 200/3 m × 200/3 m. We also consider other cases in which sensors are randomly distributed in a square area with the length of each side ranging from 100 m to 300 m with an increment of 50 m while keeping the node density unchanged. We assume that all sensors are identical; this implies that every sensor has the same initial energy capacity of 0.5 J, an identical transmission radius of $R = 30$ m, and the same sampling rate. Within each session each sensor sends the same amount of sensed data (80 bits) to the sink without any aggregation. Since, as mentioned in Section 2, the dominant energy consumption in wireless networks is radio communication, we focus on the communication energy consumption and ignore the other energy consumption. The amounts of energy consumed by a sensor for transmitting or receiving 1 bit of data are computed as follows [9]:

$$e_t = a + bR^2, \qquad (3)$$

$$e_r = c, \qquad (4)$$

where $R$ is the transmission radius, $a = 45 \times 10^{-9}$ J/bit, $b = 10 \times 10^{-12}$ J/bit/m$^2$, and $c = 135 \times 10^{-9}$ J/bit [9]. The value in each figure is the mean of the simulation results of 200 random network topology instances.

To evaluate the performance of the proposed algorithms, an existing ‘node-centric’ algorithm in [4] is employed as our benchmark. The ‘node-centric’ algorithm proceeds iteratively. Initially, the tree contains only the root, i.e., the sink. Within each iteration, it first selects a branch with the lightest load, and then grafts onto this branch the unassigned/unmarked border node generating the heaviest load. Notice that the ‘node-centric’ algorithm focuses only on the load balance among the nodes without incorporating the constraint of the distance from each node to the sink. Also, it must be mentioned that the performance comparison between the proposed algorithm and the ‘node-centric’ algorithm is unfair, because the latter does not guarantee shortest path routing, and may produce a very ‘slim’ tree to achieve load balance. The purpose of making such a comparison is to observe the difference and the trade-off between load balance and data routing delay in the routing trees delivered by both algorithms.

6.1. A lower bound

In this subsection, we will establish a lower bound on NMB and use this lower bound to evaluate the performance of our algorithm. A trivial lower bound on NMB is $|N| / |V_1|$, where $V_1 = Adj(s)$ is the set of nodes forwarding sensed data to the sink directly. Obviously, no solution could be better than this bound. This lower bound however is too loose and may be far below any optimal solution. We observe that if all routing paths must be shortest, some nodes may only be able to reach one particular branch node $x$ in $V_1$ under this requirement. Based on this observation, for each node $x \in V_1$, we define a set $U(x) = \{v \mid v \in N$, and $v$ can only reach $x \in V_1$ by any shortest path$\}$. Since $NMB(T) \ge \max_{x \in V_1}\{|U(x)|\}$ for any routing tree $T$, $\max_{x \in V_1}\{|U(x)|\}$ can be used as another lower bound on NMB. We now generalize this idea further. Let $P$ be any subset of $V_1$, and define $Q(P)$ to be the set of nodes in $M$ that can only reach the nodes in $P$ by any shortest routing, i.e., $Q(P) = \{v \mid v \in N$, and $v$ can only reach one or more nodes in $P$ by a shortest path$\}$. Then, for any routing tree $T$, $NMB(T) \ge |Q(P)| / |P|$. Since we could not exhaust all possible

sets of $P$, we will select a few of them in our simulations as follows.

We first compute the coordinates $(\theta_v, r_v)$ of each node $v \in V_1$ in a polar coordinate system with node $s$ being the center, where $\theta_v$ is the positive angle from the horizontal line $(s, +\infty)$ and $r_v$ is the distance between $s$ and $v$. Then, for each $v_k \in V_1$, $1 \le k \le d$, we compute a set $P_k$ where $P_k = \{v \mid v \in V_1$ and $\theta_{v_k} \le \theta_v < \theta_{v_k} + 45^\circ$ and $r_v > \frac{1}{2} R\}$. In other words, $P_k$ is the set of nodes in $V_1$ that are within the sector of $45^\circ$ from $v_k$ and more than $R/2$ away from the sink. Now, another lower bound on $NMB(T)$ is derived, which is $\max_{1 \le k \le d}\{|Q(P_k)| / |P_k|\}$. Since the optimal value of $NMB(T)$ is no less than any lower bound, we use the largest one among these three lower bounds as the benchmark to measure how close the results produced by our algorithms are to the achievable optimal results. That is, the following lower bound will be used:

$$LB(M) = \max\left\{ \frac{|N|}{|V_1|},\ \max_{x \in V_1}\{|U(x)|\},\ \max_{1 \le k \le d}\left\{\frac{|Q(P_k)|}{|P_k|}\right\} \right\}.$$

6.2. Balanced trees delivered by different algorithms

We first observe the differences between balanced routing trees produced by the ‘top-down with refinement’ algorithm and by the ‘node-centric’ algorithm through a concrete example. Fig. 6a shows a randomly generated sensor network and Fig. 6b is the distance graph of the network. The sink is marked by a small circle. Fig. 6c and d are the routing trees produced by the ‘node-centric’ algorithm and our algorithm respectively. It can be seen that the tree produced by the ‘node-centric’ algorithm is more balanced among its branches; however, the shape of the tree is ‘slim’, and quite a few nodes in it have long paths to reach the sink. The tree produced by the proposed algorithm is ‘fat’ and slightly less balanced, but it allows every node to reach the sink along the shortest path from the node.

As shown by Fig. 7, the average message latency in the routing tree produced by the ‘node-centric’ algorithm is much longer than that in the routing tree produced by our proposed algorithm. Moreover, the gap in the average message delay between them increases with the growth of the node density. In fact, the trees produced by the ‘top-down’ algorithm with or without balance refinement guarantee that the routing path from each node to the sink is the shortest one. Also, the one with load-balance refinement has a

[Figure 6: four panels, each plotted on axes ranging from −100 to 100. (a) The original sensor network M. (b) The distance graph of M. (c) The balance tree produced by the node-centric heuristic. (d) The balance tree produced by the proposed algorithm.]

Fig. 6. The load-balanced trees produced by different algorithms in a sensor network of 150 nodes randomly deployed in a 200 m × 200 m square region.

Fig. 7. The average message latency between any node and the sink in a spanning routing tree for a sensor network deployed within a 200 m × 200 m square.

Fig. 8. The size of the largest branch of balanced trees (NMB) delivered by different algorithms for networks deployed in a 200 m × 200 m square with different node densities.

Fig. 9. The ratio of achieved network lifetime over its upper bound in a sensor network deployed in a 200 m × 200 m square region with various node densities.

Fig. 10. The network lifetime delivered by the ‘top-down’ algorithm with balance refinement and its ratio to the upper bound for sensor networks deployed in a 200 m × 200 m square with different node densities, where the sink is located at one of the four corners.
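
The NMB metric named in the caption of Fig. 8 (the size of the largest branch of a routing tree) can be illustrated with a minimal sketch. It is not taken from the paper: it assumes that a "branch" is the subtree hanging off one child of the sink, and the function name, the `parent` map, and the 7-node example tree are all hypothetical.

```python
from collections import Counter


def largest_branch_size(parent, sink):
    """Return the size of the largest branch of a routing tree.

    `parent` maps every non-sink node to its parent in the tree.
    Assumption: a branch is the subtree rooted at one child of the sink.
    """
    branch_sizes = Counter()
    for node in parent:
        # Climb towards the sink until we reach the child of the sink
        # that roots the branch containing this node.
        root = node
        while parent[root] != sink:
            root = parent[root]
        branch_sizes[root] += 1
    return max(branch_sizes.values(), default=0)


# Hypothetical 7-node tree: sink 0 has two children, nodes 1 and 2.
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 3, 6: 2}
nmb = largest_branch_size(parent, sink=0)
```

In this toy tree the branch rooted at node 1 holds nodes 1, 3, 4 and 5, while the branch rooted at node 2 holds only nodes 2 and 6, so the largest branch has four nodes. A better balanced tree over the same nodes would lower that value.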

much better load balance, which can be observed from Fig. 8.

Fig. 8 plots the NMB curves of the different algorithms for networks deployed in a 200 m × 200 m square region with different node densities, from which it can be seen that the ‘top-down with refinement’ algorithm outperforms the ‘top-down’ algorithm significantly. The curve of the ‘top-down with refinement’ algorithm is parallel to the curve of the lower bound, which means its performance is stable. Since the curve of the optimal solution lies inside the gap, our solution is less than six nodes from the optimal solution. In contrast, the NMB obtained by the ‘node-centric’ algorithm tends to become worse as the network size grows. Notice that the value of NMB(T) of a tree T delivered by the ‘node-centric’ algorithm may even be below the lower bound, because the tree is not necessarily a shortest path tree. However, as seen from Fig. 8, NMB(T) of the ‘node-centric’ algorithm becomes worse when the node density reaches 300 or above.

6.3. Evaluation of network lifetime of different routing trees by different algorithms

We then study the performance of the ‘top-down’ algorithm and the ‘top-down’ algorithm with balance refinement against an upper bound on the maximum network lifetime defined in Section 2.

Since the network lifetime is inversely proportional to the value of NMB, a lower bound on NMB can be converted to an upper bound on the maximum network lifetime, and this upper bound is no smaller than the maximum network lifetime. Table 1 illustrates the net-

Table 1
The network lifetime delivered by different algorithms in a network deployed in a 200 m × 200 m square region with various node densities.

Algorithm                   Number of nodes
                             100   150   200   250   300   350   400   450
Top down                     800   822   819   828   857   855   866   878
Top down with refinement     908  1054  1119  1130  1207  1209  1227  1253
Upper bound                 1057  1222  1282  1307  1427  1430  1451  1491
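
As a quick cross-check of Table 1, the following sketch (not part of the original paper) divides each achieved lifetime by the corresponding upper bound. The numbers are transcribed from the table; the variable and function names are illustrative only.

```python
# Values transcribed from Table 1 (network lifetime per node count).
node_counts = [100, 150, 200, 250, 300, 350, 400, 450]
top_down = [800, 822, 819, 828, 857, 855, 866, 878]
top_down_refined = [908, 1054, 1119, 1130, 1207, 1209, 1227, 1253]
upper_bound = [1057, 1222, 1282, 1307, 1427, 1430, 1451, 1491]


def ratios(achieved, bound):
    """Element-wise ratio of achieved lifetime to the upper bound."""
    return [a / b for a, b in zip(achieved, bound)]


plain_ratio = ratios(top_down, upper_bound)
refined_ratio = ratios(top_down_refined, upper_bound)

for n, p, r in zip(node_counts, plain_ratio, refined_ratio):
    print(f"{n:>4} nodes: top-down {p:.1%}, with refinement {r:.1%}")
```

For these eight configurations the refined algorithm reaches roughly 84% to 87% of the upper bound, while the plain ‘top-down’ algorithm reaches roughly 59% to 76%. Because the upper bound is no smaller than the true optimum, the ratio to the optimal lifetime can only be higher than these figures.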

work lifetimes achieved by different algorithms in the same sensor network, in comparison with an upper bound on the maximum network lifetime.

Fig. 9 demonstrates that the ‘top-down’ algorithm with load-balance refinement can achieve a network lifetime no less than 85% of the maximum possible one (the upper bound on the maximum network lifetime). As the best achievable network lifetime for the network is in the middle of the gap, the proposed algorithm may achieve 92% of the maximum network lifetime.

Extensive simulations have also been conducted to evaluate the network lifetime when the sink is located at one of the four corners, rather than at the center of the monitoring area. Because of symmetry, we assume that the sink is located at the upper-right corner in our experiments. Fig. 10 shows that the network lifetime delivered by the ‘top-down’ algorithm with balance refinement achieves 86% of the upper bound of the maximum network lifetime.

We finally evaluate the performance of our proposed algorithm by varying the network size while keeping the node density fixed at 1/200 (nodes/m²). The performance of the ‘top-down’ algorithm with load-balance refinement is shown in Fig. 11, from which it can be seen that the achieved network lifetime decreases as the square size increases. The figure also shows that the ‘top-down’ algorithm with load-balance refinement always delivers a solution that is very close to the upper bound of the maximum network lifetime, i.e., the network lifetime delivered by it is no less than 82% of the optimal one.

Fig. 11. The network lifetime delivered by the ‘top-down with refinement’ algorithm and its ratio to the upper bound of the optimal network lifetime for sensor networks deployed in a square with various edge lengths while the node density 1/200 (nodes/m²) is fixed.

7. Conclusion and future work

In this paper we studied the network lifetime maximization problem for time-sensitive data gathering applications, through constructing a load-balanced shortest distance routing tree. We first formulated the problem as an optimization problem and showed its NP-hardness. We then devised a novel ‘top-down’ algorithm that constructs a balanced shortest distance routing tree layer by layer, and each layer extension is optimal from the current tree. A distributed implementation of the ‘top-down’ algorithm has also been given. To further improve the performance of the ‘top-down’ algorithm, we proposed a ‘balance-refine’ distributed algorithm. Finally, we conducted extensive simulations to evaluate the performance of the proposed algorithms. The simulation results have shown that the network lifetime delivered by the ‘top-down’ algorithm with balance refinement is no less than 85% of the optimal network lifetime.

There are several interesting directions for future work. For example: designing an approximation algorithm to construct the balanced shortest path (routing) tree, which may have potential applications in many other research areas; building the balanced shortest routing tree for sensor networks in which sensors have non-uniform energy levels; and relaxing the shortest path constraints, that is, sacrificing message delays from a small portion of nodes to improve the balance of the routing tree and the time complexity of the algorithm.

Acknowledgments

The authors would like to thank the anonymous referees and the editor for their constructive comments and valuable suggestions, which have helped improve the quality and presentation of the paper. The work of Feng Shan was done during his visit to the University of Missouri-Kansas City, USA. His research is supported by the National Key Basic Research Program of China under Grant No. 2010CB328104, the China Scholarship Council, NSFC under Grants No. 61070161, No. 61003257, No. 61202449, No. 61272531 and No. 61272054, the China National Key Technology R&D Program under Grants No. 2010BAI88B03 and No. 2011BAK21B02, SRFDP under Grant No. 20110092130002, the China National Science and Technology Major Project under Grant No. 2010ZX01044-001-001, and the Jiangsu Provincial Key Laboratory of Network and Information Security under Grant No. BM2003201.

References

[1] B. Awerbuch, R.G. Gallager, A new distributed algorithm to find breadth first search trees, IEEE Transactions on Information Theory IT-33 (1987) 315–322.
[2] C. Buragohain, D. Agrawal, S. Suri, Power aware routing for sensor databases, in: Proceedings of INFOCOM’05, IEEE, 2005.
[3] R. Cristescu, B. Beferull-Lozano, M. Vetterli, On network correlated data gathering, in: Proceedings of INFOCOM’04, IEEE, 2004.
[4] H. Dai, R. Han, A node-centric load balancing algorithm for wireless sensor networks, in: Proceedings of GLOBECOM’03, IEEE, 2003.
[5] M. Fürer, B. Raghavachari, Approximate the minimum-degree Steiner tree to within one of optimal, Journal of Algorithms 17 (1994) 409–423.
[6] M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman Company, NY, 1979.
[7] A. Goel, D. Estrin, Simultaneous optimization for concave costs: single sink aggregation or single source buy-at-bulk, in: Proceedings of SODA’03, ACM-SIAM, 2003.
[8] A.V. Goldberg, R.E. Tarjan, A new approach to the maximum-flow problem, Journal of ACM 35 (1988) 921–940.
[9] W. Heinzelman, Application-Specific Protocol Architectures for Wireless Networks, Ph.D. Thesis, Massachusetts Institute of Technology, 2000.

[10] P.H. Hsiao, A. Hwang, H.T. Kung, D. Vlah, Load-balancing routing for wireless access networks, in: Proceedings of INFOCOM’01, IEEE, 2001.
[11] C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, F. Silva, Directed diffusion for wireless sensor networking, IEEE/ACM Transactions on Networking 11 (2003) 2–16.
[12] C. Intanagonwiwat, D. Estrin, R. Govindan, J. Heidemann, Impact of network density on data aggregation in wireless sensor networking, in: Proceedings of ICDCS’02, IEEE, 2002.
[13] K. Kalpakis, K. Dasgupta, P. Namjoshi, Efficient algorithms for maximum lifetime data gathering and aggregation in wireless sensor networks, Computer Networks 42 (2003) 697–716.
[14] S. Khuller, B. Raghavachari, N. Young, Balancing minimum spanning and shortest path trees, in: Proceedings of SODA’93, ACM-SIAM, 1993.
[15] W. Liang, Y. Liu, Online data gathering for maximizing network lifetime in sensor networks, IEEE Transactions on Mobile Computing 6 (2007) 2–11.
[16] J. Liang, J. Wang, J. Cao, J. Chen, M. Lu, An efficient algorithm for constructing maximum lifetime tree for data gathering without aggregation in wireless sensor networks, in: Proceedings of INFOCOM’10, IEEE, 2010.
[17] S. Lindsey, C.S. Raghavendra, PEGASIS: power-efficient gathering in sensor information systems, in: Proceedings of Aerospace Conference, IEEE, 2002.
[18] S. Madden, M.J. Franklin, J.M. Hellerstein, W. Hong, TAG: a Tiny AGgregation service for ad-hoc sensor networks, in: Proceedings of OSDI’02, ACM, 2002.
[19] A. Mainwaring, J. Polastre, R. Szewczyk, D. Culler, J. Anderson, Wireless sensor networks for habitat monitoring, in: Proceedings of ACM International Workshop on Wireless Sensor Networks and Applications, ACM, 2002.
[20] G.J. Pottie, Wireless sensor networks, in: Proceedings of Information Theory Workshop, IEEE, 1998.
[21] P. von Rickenbach, R. Wattenhofer, Gathering correlated data in sensor networks, in: Proceedings of DIALM-POMC, ACM, 2004.
[22] M.A. Sharaf, J. Beaver, A. Labrinidis, P.K. Chrysanthis, Balancing energy efficiency and quality of aggregate data in sensor networks, Journal of VLDB (2004).
[23] A. Singh, M. Woo, C.S. Raghavendra, Power-aware routing in mobile ad hoc networks, in: Proceedings of MobiCom, ACM/IEEE, 1998.
[24] Y. Wu, Z. Mao, S. Fahmy, N.B. Shroff, Constructing maximum-lifetime data gathering forests in sensor networks, IEEE/ACM Transactions on Networking 18 (2010) 1571–1584.
[25] Y. Wu, S. Fahmy, N.B. Shroff, On the construction of a maximum-lifetime data gathering tree in sensor networks: NP-completeness and approximation algorithm, in: Proceedings of INFOCOM’08, IEEE, 2008.
[26] T. Yan, Y. Bi, L. Sun, H. Zhu, Probability based dynamic load-balancing tree algorithm for wireless sensor networks, in: Proceedings of ICCNMC 05, LNCS, 2005.

Feng Shan received his B.S. degree in Computer Science from Hohai University, Nanjing, China, in 2008. He is currently pursuing the Ph.D. degree in computer science and engineering at Southeast University, Nanjing, China. He joined the Department of Computer Networking, University of Missouri-Kansas City, Kansas City, MO, United States, from 2010 to 2012 as a visiting scholar. His research interests are in the areas of wireless sensor networks, algorithms, and wireless multi-hop networks.

Weifa Liang (M’99–SM’01) received the PhD degree from the Australian National University in 1998, the ME degree from the University of Science and Technology of China in 1989, and the BSc degree from Wuhan University, China in 1984, all in computer science. He is currently an Associate Professor in the Research School of Computer Science at The Australian National University. His research interests include design and analysis of energy-efficient routing protocols for wireless ad hoc and sensor networks, information processing in wireless sensor networks, cloud computing, design and analysis of parallel and distributed algorithms, combinatorial optimization, and graph theory. He is a senior member of the IEEE.

Jun Luo received his M.S. degree in Software Engineering from National University of Defense Technology, China, in 1989, and the B.S. degree in Computer Science from Wuhan University, China, in 1984. He is currently a full professor at the School of Computer in National University of Defense Technology, China. His research interests are in ad hoc and sensor networks, design of energy-efficient protocols for wireless networks, and operating systems.

Xiaojun Shen received his Ph.D. degree in computer science from the University of Illinois at Urbana–Champaign, in 1989. He received his M.S. degree in computer science from the Nanjing University of Science and Technology, China, in 1982, and his B.S. degree in numerical analysis from the Tsinghua University, Beijing, China, in 1968. He is currently a professor in the School of Computing and Engineering at the University of Missouri-Kansas City. He is a senior IEEE member. His current research interests include computer algorithms and computer networking with focus on routing and scheduling.
