Distributed Sensor Network Localization From Local Connectivity: Performance Analysis For The HOP-TERRAIN Algorithm
information and no distance information, we have constant $P_{i,j}$'s for all $(i,j) \in E$ (see Figure 1).

The sensor localization algorithms can be classified into two different categories. For the connectivity-based model, also known as the range-free model, only the connectivity information is available. Formally,
\[
P_{i,j} = \begin{cases} r & \text{if } (i,j) \in E, \\ \ast & \text{otherwise,} \end{cases}
\]
where a $\ast$ denotes that $d_{i,j} > r$.

For the range-based model, also known as the range-aware model, the distance measurement is available but may be corrupted by noise or have limited accuracy:
\[
P_{i,j} = \begin{cases} [d_{i,j} + z_{i,j}]_+ & \text{if } (i,j) \in E, \\ \ast & \text{otherwise,} \end{cases}
\]
where $z_{i,j}$ models the measurement noise (in the noiseless case $z_{i,j} = 0$), possibly a function of the distance $d_{i,j}$, and $[a]_+ \equiv \max\{0, a\}$. Common examples are the additive Gaussian noise model, where the $z_{i,j}$'s are i.i.d. Gaussian random variables, and the multiplicative noise model, where $P_{i,j} = [(1 + N_{i,j}) d_{i,j}]_+$ for independent Gaussian random variables $N_{i,j}$.

In distributed sensor network localization, not all the information is available at each node. Given the graph $G(n, r) = (V, E, P)$ with associated proximity measurements for each edge in $E$, we assume that each of the nodes is aware of the proximity measurements between itself and its adjacent neighbors and each of the anchors is also aware of

According to the three phase classification presented in Table 1, this is closely related to the first two phases of the robust positioning algorithm. This algorithm uses a slightly different method for computing the shortest paths, which is compared in detail later in this section. Hence, throughout this paper, we refer to this algorithm as Hop-TERRAIN, which denotes the first two steps of the robust positioning algorithm in [SLR02].

Distributed shortest paths. The goal of the first step is for each of the unknown nodes to estimate the distances between itself and the anchors. These approximate distances will be used in the next triangulation step to derive an estimated position. The shortest path between an unknown node $i$ and an anchor $a$ in the graph $G$ provides an estimate for the Euclidean distance $d_{i,a} = \|x_i - x_a\|$, and for a carefully chosen radio range $r$ this shortest path estimation is close to the actual distance, as will be shown in Lemma 3.1.

Formally, the shortest path between an unknown node $i$ and an anchor $a$ in the graph $G = (V, E, P)$ is defined as a path between the two nodes such that the sum of the proximity measures of its constituent edges is minimized. We denote by $\hat{d}_{i,a}$ the computed shortest path, which provides the initial estimate for the distance between the node $i$ and the anchor $a$. When only the connectivity information is available and the corresponding graph $G = (V, E, P)$ is defined as in the connectivity-based model, the shortest path $\hat{d}_{i,a}$ is equivalent to the minimum number of hops between a node $i$ and an anchor $a$ multiplied by the radio range $r$.

In order to find the minimum number of hops from an unknown node $i \in V_u$ to an anchor $a \in V_a$ in a distributed
way, we use a method similar to DV-hop [NN03]. Each unknown node maintains a table $\{x_a, h_a\}$ which is initially empty, where $x_a \in \mathbb{R}^d$ refers to the position of the anchor $a$ and $h_a$ to the number of hops from the unknown node to the anchor $a$. First, each of the anchors initiates a broadcast containing its known location and a hop count of 1. All of the one-hop neighbors surrounding the anchor, on receiving this broadcast, record the anchor's position and a hop count of 1, and then broadcast the anchor's known position and a hop count of 2. From then on, whenever a node receives a broadcast, it does one of two things. If the broadcast refers to an anchor that is already in the record and the hop count is larger than or equal to what is recorded, then the node does nothing. Otherwise, if the broadcast refers to an anchor that is new or has a hop count that is smaller, the node updates its table with this new information and broadcasts the new information after incrementing the hop count by one. When every node has computed the hop count to all the anchors, the number of hops is multiplied by the radio range $r$ to estimate the distances between the node and the anchors. Note that not all the hop counts to all the anchors are necessary to start triangulation. A node can start triangulation as soon as it has estimated distances to $d + 1$ anchors. There is an obvious trade-off between the number of communications and performance.

The above step of computing the minimum number of hops is the same distributed algorithm as described in DV-hop. The main difference is that, instead of multiplying the number of hops by a fixed radio range $r$, DV-hop multiplies the number of hops by an average hop distance. The average hop distance is computed from the known pairwise distances between anchors and the number of hops between the anchors. While numerical simulations show that the average hop distance provides a better estimate, the difference between the computed average hop distance and the radio range $r$ becomes negligible as $n$ grows large.

We are interested in a scalable system of $n$ unknown nodes for large values of $n$. As $n$ grows large, it is reasonable to assume that the average number of connected neighbors for each node should stay constant. This happens, in our model, if we choose the radio range $r = C/n^{1/d}$. However, the number of hops is well defined only if the graph $G$ is connected. If $G$ is not connected, there might be a set of unknown nodes that are connected to too few anchors, resulting in an under-determined series of equations in the triangulation step. In the unit square, assuming sensor positions are drawn uniformly at random as defined in the previous section, the graph is connected, with high probability, if $\pi r^2 > (\log n + c_n)/n$ for $c_n \to \infty$ [GK98]. A similar condition can be derived for generic $d$ dimensions as $C_d r^d > (\log n + c_n)/n$, where $C_d \le \pi$ is a constant that depends on $d$. Hence, we focus on the regime where the average number of connected neighbors is slowly increasing with $n$, namely, $r = \alpha (\log n/n)^{1/d}$ for some positive constant $\alpha$ such that the graph is connected with high probability.

As will be shown in Lemma 3.1, the key observation of the shortest paths step is that the estimation is guaranteed to be arbitrarily close to the correct distance for a properly chosen radio range $r = \alpha (\log n/n)^{1/d}$ and large enough $n$. Moreover, this distributed shortest paths algorithm can be done efficiently with total complexity of $O(n\,m)$.

Triangulation using least squares. In the second step, each unknown node $i \in V_u$ uses a set of estimated distances $\{\hat{d}_{i,a} : a \in V_a\}$ together with the known positions of the anchors to perform a triangulation. The resulting estimated position is denoted by $\hat{x}_i$. For each node, the triangulation consists of solving a single instance of a least squares problem ($Ax = b$), and this process is known as Lateration [SRB01, LR03].

For an unknown node $i$, the position vector $x_i$ and the anchor positions $\{x_a : a \in \{1, \ldots, m\}\}$ satisfy the following series of equations:
\[
\begin{aligned}
\|x_1 - x_i\|^2 &= d_{i,1}^2, \\
&\;\;\vdots \\
\|x_m - x_i\|^2 &= d_{i,m}^2.
\end{aligned}
\]
This set of equations can be linearized by subtracting each line from the next line:
\[
\begin{aligned}
\|x_2\|^2 - \|x_1\|^2 + 2(x_1 - x_2)^T x_i &= d_{i,2}^2 - d_{i,1}^2, \\
&\;\;\vdots \\
\|x_m\|^2 - \|x_{m-1}\|^2 + 2(x_{m-1} - x_m)^T x_i &= d_{i,m}^2 - d_{i,m-1}^2.
\end{aligned}
\]
By reordering the terms, we get a series of linear equations for node $i$ in the form $A x_i = b_0^{(i)}$, for $A \in \mathbb{R}^{(m-1) \times d}$ and $b_0^{(i)} \in \mathbb{R}^{m-1}$ defined as
\[
A \equiv \begin{bmatrix} 2(x_1 - x_2)^T \\ \vdots \\ 2(x_{m-1} - x_m)^T \end{bmatrix},
\qquad
b_0^{(i)} \equiv \begin{bmatrix} \|x_1\|^2 - \|x_2\|^2 + d_{i,2}^2 - d_{i,1}^2 \\ \vdots \\ \|x_{m-1}\|^2 - \|x_m\|^2 + d_{i,m}^2 - d_{i,m-1}^2 \end{bmatrix}.
\]
Note that the matrix $A$ does not depend on the particular unknown node $i$, and all of its entries are known exactly to all the nodes after the distributed shortest paths step. However, the vector $b_0^{(i)}$ is not available at node $i$, since the $d_{i,a}$'s are not known. Hence we use an estimation $b^{(i)}$, which is defined from $b_0^{(i)}$ by replacing $d_{i,a}$ by $\hat{d}_{i,a}$ everywhere. Then, the optimal estimation $\hat{x}_i$ of $x_i$ that minimizes the mean squared error is found in closed form using a standard least squares approach:
\[
\hat{x}_i = (A^T A)^{-1} A^T b^{(i)}. \tag{1}
\]
For bounded $d = O(1)$, a single least squares has complexity $O(m)$, and applying it $n$ times results in the overall complexity of $O(n\,m)$. No communication between the nodes is necessary for this step.

2.3 Main results

Our main result establishes that Hop-TERRAIN [SLR02] achieves an arbitrarily small error for a radio range $r = \alpha (\log n/n)^{1/d}$ with a large enough constant $\alpha$, when we have only the connectivity information as in the case of the connectivity-based model. The same bound holds immediately for the range-based model, when we have approximate measurements for the distances, and the same algorithm can be applied without any modification. The extra information can be readily incorporated into the algorithm to compute better estimates for the actual distances between the unknown nodes and the anchors.
Define
\[
r_0 \equiv 8\sqrt{3}\, d^{3/2} \left( \frac{\log n}{n} \right)^{1/d}, \tag{2}
\]
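The two steps of Hop-TERRAIN described in Section 2.2 can be sketched in a few lines of Python. This is a minimal single-machine simulation under stated assumptions, not the authors' implementation: the function names `flood_hop_counts` and `laterate`, the adjacency-list input format, and the choice to let every node (anchors included) relay messages are all hypothetical. The first function mimics the table update rule of the hop-count flooding, and the second builds $A$ and $b^{(i)}$ as defined above and solves the least squares problem of Eq. (1) with NumPy.

```python
from collections import deque

import numpy as np


def flood_hop_counts(neighbors, anchor_ids):
    """Simulate the hop-count flooding on one machine.

    neighbors[v] lists the nodes within radio range of node v (assumed input
    format). Returns one table per node mapping anchor id -> minimum hop count.
    """
    tables = [dict() for _ in neighbors]
    # Each queued message is (sender, anchor, hop count the receiver would record).
    queue = deque((a, a, 1) for a in anchor_ids)
    while queue:
        sender, anchor, hops = queue.popleft()
        for v in neighbors[sender]:
            if v == anchor:
                continue  # the anchor itself needs no entry for its own position
            if anchor in tables[v] and hops >= tables[v][anchor]:
                continue  # known anchor and no improvement: do nothing
            tables[v][anchor] = hops             # new anchor or smaller hop count
            queue.append((v, anchor, hops + 1))  # rebroadcast, incremented by one
    return tables


def laterate(anchor_positions, distances):
    """Least squares position from m anchor positions (m x d) and m distances."""
    x = np.asarray(anchor_positions, dtype=float)
    dist = np.asarray(distances, dtype=float)
    sq_norms = np.sum(x**2, axis=1)
    A = 2.0 * (x[:-1] - x[1:])  # rows are 2 (x_k - x_{k+1})^T
    b = sq_norms[:-1] - sq_norms[1:] + dist[1:] ** 2 - dist[:-1] ** 2
    # For full column rank A this coincides with (A^T A)^{-1} A^T b.
    x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x_hat
```

In the connectivity-based model, a node would pass `r * hops` for each anchor in its table as the `distances` argument. With exact distances, `laterate` recovers the true position whenever $A$ has full column rank, which requires at least $d + 1$ anchors in general position.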
[Figure 3 plots: Average Error versus $\alpha$, two panels, legend entries n = 5,000 and n = 10,000.]
Figure 3: Average distance between the correct position $\{x_i\}$ and estimation $\{\hat{x}_i\}$ using Hop-TERRAIN as a function of $\alpha$, for $r = \alpha \sqrt{\log n/n}$ with $n$ sensors in the unit square under connectivity based model (above) and range based model (below).

[Figure 4 plot: unit square showing unknown sensors and deterministic anchors.]
Figure 4: 200 nodes randomly placed in the unit square and 3 anchors in fixed positions. The radio range is $r = 0.8 \sqrt{\log n/n}$.

approaches 1 as the number of sensors $n$ goes to infinity. Given a matrix $A \in \mathbb{R}^{m \times n}$, the spectral norm of $A$ is denoted by $\|A\|_2$, and the Frobenius norm is denoted by $\|A\|_F$. For a vector $a \in \mathbb{R}^n$, $\|a\|$ denotes the Euclidean norm. Finally, we

enough radio range $r$. Define
\[
\tilde{r}_0(\beta) \equiv 8\sqrt{d} \left( \frac{(1 + \beta) \log n}{n} \right)^{1/d}. \tag{5}
\]
For simplicity we denote $\tilde{r}_0(0)$ by $\tilde{r}_0$.

Lemma 3.1. (Bound on the shortest paths) Under the hypothesis of Theorem 2.1, w.h.p., the shortest paths between all pairs of nodes in the graph $G(V, E, P)$ are uniformly bounded by
\[
\hat{d}_{i,j}^2 \le \left( 1 + \frac{\tilde{r}_0}{r} \right) d_{i,j}^2 + O(r),
\]
for $r > r_0$, where $\tilde{r}_0$ is defined as in Eq. (5) and $r_0$ as in Eq. (2).

The proof of this lemma is given in Section 3.1. Since $d_{i,j}^2 \le d$ for all $i$ and $j$, we have
\[
\|b_0^{(i)} - b^{(i)}\| = \left( \sum_{k=1}^{m-1} \left( d_{i,k+1}^2 - d_{i,k}^2 - \hat{d}_{i,k+1}^2 + \hat{d}_{i,k}^2 \right)^2 \right)^{1/2}
\]
Figure 5: Location estimation using Hop-TERRAIN.

3.1 Proof of the bound on the shortest paths

In this section, we prove a slightly stronger version of Lemma 3.1. We will show that under the hypothesis of Theorem 2.1, for any $\beta \ge 0$ and all sensors $i \ne j$, there exists a constant $C$ such that, with probability larger than $1 - \frac{C n^{-\beta}}{(1+\beta) \log n}$, the shortest paths between all the pairs of nodes are uniformly bounded by
\[
\hat{d}_{i,j}^2 \le \left( 1 + \frac{\tilde{r}_0(\beta)}{r} \right) d_{i,j}^2 + O(r), \tag{7}
\]
for $r > r_0$, where $\tilde{r}_0(\beta)$ is defined as in Eq. (5) and $r_0$ as in Eq. (2). In particular, Lemma 3.1 follows if we set $\beta = 0$.

We start by applying the bin-covering technique to the random geometric points in $[0, 1]^d$ in a similar way as in [MP05]. We cover the space $[0, 1]^d$ with a set of non-overlapping hypercubes whose edge lengths are $\delta$. Thus there are in total $\lceil 1/\delta \rceil^d$ bins, each of volume $\delta^d$. In formula, bin $(i_1, \ldots, i_d)$ is the hypercube $[(i_1 - 1)\delta, i_1 \delta) \times \cdots \times [(i_d - 1)\delta, i_d \delta)$, for $i_k \in \{1, \ldots, \lceil 1/\delta \rceil\}$ and $k \in \{1, \ldots, d\}$. When $n$ nodes are deployed in $[0, 1]^d$ uniformly at random, we say a bin is empty if there is no node inside the bin. We want to ensure that, with high probability, there are no empty bins. For a given $\delta$, define a parameter $\beta \equiv (\delta^d n/\log n) - 1$. Since a bin is empty with probability $(1 - \delta^d)^n$, we apply the union bound over all the $\lceil 1/\delta \rceil^d$ bins to get
\begin{align}
\mathbb{P}(\text{no bins are empty}) &\ge 1 - \left\lceil \frac{1}{\delta} \right\rceil^d (1 - \delta^d)^n \tag{8} \\
&\ge 1 - \frac{C n}{(1 + \beta) \log n} \left( 1 - \frac{(1 + \beta) \log n}{n} \right)^n \tag{9} \\
&\ge 1 - \frac{C n^{-\beta}}{(1 + \beta) \log n}, \tag{10}
\end{align}
where in (9) we used the fact that there exists a constant $C$ such that $\lceil 1/\delta \rceil^d \le C/\delta^d$, and in (10) we used $(1 - 1/n)^n \le e^{-1}$, which is true for any positive $n$.

Assuming that, with high probability, no bins are empty, we first show that the shortest paths are bounded by a function $F(d_{i,j})$ that only depends on the distance between the two nodes. Let $\hat{d}_{i,j}$ denote the shortest path between nodes $i$ and $j$ and $d_{i,j}$ denote the Euclidean distance. Define a

\[
\hat{d}_{i,j} \le F(d_{i,j}). \tag{11}
\]
First, in the case when $d_{i,j} \le r$, by definition of the connectivity based model, nodes $i$ and $j$ are connected by an edge in the graph $G$, whence $\hat{d}_{i,j} = r$.

Next, assume that the bound in Eq. (11) is true for all pairs $(l, m)$ with $d_{l,m} \le r + (k - 1)(r - 2\sqrt{d}\,\delta)$. We consider two nodes $i$ and $j$ at distance $d_{i,j} \in L_k$. Consider a line segment $l_{i,j}$ in a $d$-dimensional space with one end at $x_i$ and the other at $x_j$, where $x_i$ and $x_j$ denote the positions of nodes $i$ and $j$, respectively. Let $p$ be the point on the line $l_{i,j}$ which is at distance $r - \sqrt{d}\,\delta$ from $x_i$. Then, we focus on the bin that contains $p$. By the assumption that no bins are empty, we know that we can always find at least one node in the bin. Let $k$ denote any one of those nodes in the bin. Then we use the following inequality, which is true for all nodes $(i, k, j)$:
\[
\hat{d}_{i,j} \le \hat{d}_{i,k} + \hat{d}_{k,j}.
\]
Each of these two terms can be bounded using the triangle inequality. To bound the first term, note that two nodes in the same bin are at most distance $\sqrt{d}\,\delta$ apart. Since $p$ and $x_k$ are in the same bin and $p$ is at distance $r - \sqrt{d}\,\delta$ from node $x_i$ by construction, we know that $d_{i,k} \le \|x_i - p\| + \|p - x_k\| \le r$, whence $\hat{d}_{i,k} = r$. Analogously for the second term, $d_{k,j} \le \|x_j - p\| + \|p - x_k\| \le r + (k - 1)(r - 2\sqrt{d}\,\delta)$, which implies that $\hat{d}_{k,j} \le F(d_{k,j}) = kr$. Hence, we proved that if Eq. (11) is true for pairs $(i, j)$ such that $d_{i,j} \le r + (k - 1)(r - 2\sqrt{d}\,\delta)$, then $\hat{d}_{i,j} \le (k + 1) r$ for pairs $(i, j)$ such that $d_{i,j} \in L_k$. By induction, this proves that the bound in Eq. (11) is true for all pairs $(i, j)$.

Let $\mu = \frac{r}{2\sqrt{d}} \left( \frac{n}{(1 + \beta) \log n} \right)^{1/d}$. Then, the function $F(y)$ can be easily upper bounded by an affine function $F_a(y) = (1 + 1/(\mu - 1))\, y + r$. Hence we have the following bound on the shortest paths between any two nodes $i$ and $j$:
\[
\hat{d}_{i,j} \le \left( 1 + \frac{1}{\frac{r}{2\sqrt{d}} \left( \frac{n}{(1 + \beta) \log n} \right)^{1/d} - 1} \right) d_{i,j} + r. \tag{12}
\]
Figure 6 illustrates the comparison of the upper bounds $F(d_{i,j})$ and $F_a(d_{i,j})$, and the trivial lower bound $\hat{d}_{i,j} \ge d_{i,j}$, in a simulation with parameters $d = 2$, $n = 6000$ and $r = \sqrt{64 \log n/n}$. The simulation data shows the distribution of shortest paths between all pairs of nodes with respect to the actual pairwise distances, which confirms that the shortest paths lie between the analytical upper and lower bounds. While the gap between the upper and lower bound is seemingly large, in the regime where $r = \alpha \sqrt{\log n/n}$ with a constant $\alpha$, the vertical gap $r$ vanishes as $n$ goes to infinity and the slope of the affine upper bound can be made arbitrarily close to 1 by increasing the radio range $r$ or, equivalently, taking large enough $\alpha$. The bound on the squared shortest paths $\hat{d}_{i,j}^2$ can be derived from the bound on the shortest
paths in Eq. (12) after some calculus.
\begin{align}
\hat{d}_{i,j}^2 &\le \left\{ \frac{\mu}{\mu - 1}\, d_{i,j} + r \right\}^2 \tag{13} \\
&= \frac{\mu^2}{(\mu - 1)^2}\, d_{i,j}^2 + r^2 + 2\, \frac{\mu}{\mu - 1}\, d_{i,j}\, r \tag{14} \\
&= \left( 1 + \frac{2\mu - 1}{(\mu - 1)^2} \right) d_{i,j}^2 + O(r) \tag{15} \\
&\le \left( 1 + \frac{4}{\mu} \right) d_{i,j}^2 + O(r), \tag{16}
\end{align}
where in (15) we used the fact that $(\mu/(\mu - 1))\, d_{i,j} = O(1)$, which follows from the assumptions $(r/(4\sqrt{d}))^d > (1 + \beta) \log n/n$ and $d = O(1)$. In (16), we used the inequality $(2\mu - 1)/(\mu - 1)^2 \le 4/\mu$, which is true for $\mu \ge 2 + \sqrt{3}$. Substituting $\mu$ in the formula, this finishes the proof of the desired bound in Eq. (7).

Note that although, for the sake of simplicity, we focus on the $[0, 1]^d$ hypercube, our analysis easily generalizes to any bounded convex set and to the homogeneous Poisson process model with density $\rho = n$. The homogeneous Poisson process model is characterized by the probability that there are exactly $k$ nodes appearing in any region with volume $A$: $\mathbb{P}(k_A = k) = \frac{(\rho A)^k}{k!} e^{-\rho A}$. Here, $k_A$ is a random variable defined as the number of nodes in a region of volume $A$.

[Figure 6 plot: $\hat{d}_{i,j}$ versus $d_{i,j}$; legend entries: simulation data, affine upper bound, upper bound, lower bound.]
Figure 6: Comparison of upper and lower bound of shortest paths $\{\hat{d}_{i,j}\}$ with respect to the correct distance $\{d_{i,j}\}$ computed for $n = 6000$ sensors in the 2-dimensional square $[0, 1]^2$ under connectivity based model.

3.2.1 Deterministic Model

By putting the sensors in the mentioned positions, the $d \times d$ matrix $A$ will be Toeplitz and have the following form:
\[
A = 2 \begin{bmatrix}
1 & -1 & 0 & \cdots & 0 \\
0 & 1 & -1 & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 1 & -1 \\
0 & \cdots & 0 & 0 & 1
\end{bmatrix}.
\]
We can easily find the inverse of the matrix $A$:
\[
A^{-1} = \frac{1}{2} \begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
0 & 1 & 1 & \cdots & 1 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 1 & 1 \\
0 & \cdots & 0 & 0 & 1
\end{bmatrix}.
\]
Note that the maximum singular value of $A^{-1}$ and the minimum singular value of $A$ are related as follows:
\[
\sigma_{\min}(A) = \frac{1}{\sigma_{\max}(A^{-1})}. \tag{18}
\]
To find the maximum singular value of $A^{-1}$, we need to calculate the maximum eigenvalue of $A^{-1} (A^{-1})^T$, which has the form
\[
A^{-1} (A^{-1})^T = \frac{1}{4} \begin{bmatrix}
d & d - 1 & d - 2 & \cdots & 1 \\
d - 1 & d - 1 & d - 2 & \cdots & 1 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
2 & \cdots & 2 & 2 & 1 \\
1 & \cdots & 1 & 1 & 1
\end{bmatrix}.
\]
Using the Gershgorin circle theorem (see appendix B), we can find an upper bound on the maximum eigenvalue of $A^{-1} (A^{-1})^T$:
\[
\lambda_{\max}\left( A^{-1} (A^{-1})^T \right) \le \frac{d^2}{4}. \tag{19}
\]
Hence, by combining (17) and (19) we get
\[
\|(A^T A)^{-1} A^T\|_2 \le \frac{d}{2}. \tag{20}
\]

3.2.2 Random Model

Let the symmetric matrix $B$ be defined as $A^T A$. The diagonal entries of $B$ can be written as
\[
b_{i,i} = 4 \sum_{k=1}^{m-1} (x_{k,i} - x_{k+1,i})^2, \tag{21}
\]