
Algorithmica (2012) 64:643–672

DOI 10.1007/s00453-011-9606-2

A Simple Ant Colony Optimizer for Stochastic Shortest Path Problems

Dirk Sudholt · Christian Thyssen

Received: 30 December 2010 / Accepted: 12 December 2011 / Published online: 30 December 2011
© Springer Science+Business Media, LLC 2011

Abstract Ant Colony Optimization (ACO) is a popular optimization paradigm inspired by the capabilities of natural ant colonies of finding shortest paths between
their nest and a food source. This has led to many successful applications for various
combinatorial problems. The reason for the success of ACO, however, is not well
understood and there is a need for a rigorous theoretical foundation.
We analyze the running time of a simple ant colony optimizer for stochastic short-
est path problems where edge weights are subject to noise that reflects delays and
uncertainty. In particular, we consider various noise models, ranging from general,
arbitrary noise with possible dependencies to more specific models such as indepen-
dent gamma-distributed noise. The question is whether the ants can find or approx-
imate shortest paths in the presence of noise. We characterize instances where the
ants can discover the real shortest paths efficiently. For general instances we prove
upper bounds for the time until the algorithm finds reasonable approximations. Con-
trariwise, for independent gamma-distributed noise we present a graph where the ant
system needs exponential time to find a good approximation. The behavior of the ant
system changes dramatically when the noise is perfectly correlated as then the ants
find shortest paths efficiently. Our results shed light on trade-offs between the noise
strength, approximation guarantees, and expected running times.

Keywords Ant colony optimization · Combinatorial optimization · Running time analysis · Shortest path problems · Stochastic optimization

A preliminary version of this work appeared in [27].


D. Sudholt (✉)
CERCIA, School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK
e-mail: [email protected]

C. Thyssen
Fakultät für Informatik, LS 2, Technische Universität Dortmund, 44221 Dortmund, Germany
e-mail: [email protected]

1 Introduction

Ant Colony Optimization (ACO) is a modern optimization paradigm inspired by the foraging behavior of ant colonies. When an ant searches the environment for
food, it deposits pheromones on the ground while moving around. Ants can smell
pheromones and they tend to be attracted by pheromone trails. If an ant finds a food
source, it tends to walk back on its own trail, depositing more pheromones on it.
Other ants are then attracted towards this trail. If the path is short, the pheromone
trail is reinforced quickly. This way, if several ants find different paths from the nest
to a food source, it is likely that the whole colony converges to the shortest path. This
remarkable behavior is an example of swarm intelligence where simple agents are
capable of solving complex problems without centralized control.
The collective intelligence of ant colonies has inspired the ACO paradigm [19].
The basic idea is that artificial ants search for good candidate solutions, guided by
artificial pheromones. Artificial ants thereby perform a random walk through a graph
and the next edge to be chosen is selected according to pheromones that represent
the attractiveness of an edge. This leads to a construction procedure for problems that
can be represented as finding good paths in graphs; examples are shortest paths or
the TSP [17]. Furthermore, the path formation of ants can be used to construct solu-
tions for various combinatorial problems via so-called construction graphs. In such a
graph the choice which edges are taken by an artificial ant are mapped to decisions
about components of a candidate solution. This makes ACO a general paradigm for
the design of metaheuristics. These algorithms have been applied to many problems
from various domains, such as the Quadratic Assignment Problem, network routing
problems, and scheduling problems. For details and further applications see the book
by Dorigo and Stützle [18].
Metaheuristics such as ACO algorithms are popular for practitioners as they are
easy to implement and they usually produce good solutions in short time. They often
produce better solutions than problem-specific approximation algorithms with proven
performance guarantees and they are applicable in settings where there is not enough
knowledge, time, or expertise to design a problem-specific algorithm. For some prob-
lems like the sequential ordering problem or open shop scheduling ACO algorithms
are regarded as state-of-the-art [18].
Despite many successful applications and empirical studies on benchmarks, the
success behind ACO is not well understood. There is little insight into how, when,
and why the collective intelligence of an artificial ant colony efficiently finds good
solutions for a particular problem. In particular, studies making rigorous formal state-
ments about ACO are very rare. Many design questions for ACO systems remain
unanswered and finding appropriate parameter settings is often done by trial and er-
ror. Leading researchers have therefore called out to theoretical computer science to
provide analyses that give insight into the dynamic behavior and the performance of
ACO [16].
A number of researchers have followed this call and established a rapidly grow-
ing theory of ACO algorithms. The motivation is to assess the performance of ACO
on interesting and well-understood problems, using techniques from the analysis of
randomized algorithms. The goal is to shed light on the working principles of ACO,
identify situations where ACO algorithms perform well and where they do not, and
to give hints on how to design better ACO algorithms.
The theory of ACO started with investigations of ACO on simple example func-
tions [15, 24, 35, 37]. The methods developed in these works have then enabled re-
searchers to perform analyses for more complex settings. This includes hybridiza-
tions with local search [36], broader function classes [31], problems from combi-
natorial optimization, and different pheromone update schemes that lead to systems
which are harder to analyze [38, 43]. Concerning combinatorial optimization, Neu-
mann and Witt [34] considered ACO for finding minimum spanning trees for two
different construction graphs. Attiratanasunthron and Fakcharoenphol [2] as well as
the present authors [26, 44] considered the classical problem of finding shortest paths
in graphs. Zhou [45] as well as Kötzing et al. [30] analyzed ACO for the TSP and
Kötzing et al. [29] considered ACO for the minimum cut problem.
In this work, we take a further step and consider combinatorial problems in the
presence of uncertainty. In particular, we focus on the performance of ACO for
stochastic shortest path problems. Shortest path problems closely reflect the biolog-
ical inspiration for ACO and they represent a fundamental problem in computer sci-
ence and many other areas. Algorithmic research on these problems is still an active
field [1, 3, 8, 40]. Shortest paths have also been investigated in the context of other
metaheuristics as described in the following.

1.1 Related Work

Shortest path problems have been investigated for various evolutionary algorithms.
First studies focused on evolutionary algorithms that only use mutation, without
crossover [4, 12, 42]. Remarkably, it turned out that the all-pairs shortest path
problem is a rare example where using crossover significantly speeds up optimiza-
tion [10, 11, 13]. Also results for NP-hard multi-objective shortest path problems
have been obtained. Horoba [25] proved that a simple multi-objective evolutionary
algorithm using only mutation and a mechanism to maintain diversity in the pop-
ulation represents a fully polynomial randomized approximation scheme (FPRAS).
Neumann and Theile [33] extended this result towards the use of crossover, showing
that crossover leads to improved running times. In addition, their algorithm works
with a smaller population size, compared to [25].
The first investigation of ACO for shortest path problems was made by Attiratanasunthron and Fakcharoenphol [2].
They considered the single-destination shortest path problem (SDSP) on directed
acyclic graphs where one is looking for shortest paths from all vertices to a single
destination vertex. In their algorithm n-ANT, on each vertex an artificial ant heads out
in search for the destination. For each vertex v the best path found so far is recorded
and the pheromones on the edges leaving v are updated according to the best-so-far
path. The update also involves a parameter 0 ≤ ρ ≤ 1 called evaporation rate that de-
termines the strength of a pheromone update. Note that all ants update disjoint sets of
edges. The collaborative effort of all ants leads to the construction of shortest paths.
Ants whose shortest paths contain only few edges tend to find their shortest paths
first. By marking the respective edges at their start vertices with pheromones they
pave the way for other ants whose shortest paths contain more edges. This implies
that shortest paths propagate through the graph, similar to the algorithmic idea of the
Bellman-Ford algorithm [9, Sect. 24.1].
Attiratanasunthron and Fakcharoenphol proved a bound of O(mΔℓ log(Δℓ)/ρ) for the expected optimization time (i.e., the expected number of iterations) of n-ANT for the SDSP that holds for every directed acyclic graph with m edges, maximum outdegree Δ, and maximum number of edges ℓ on any shortest path. These results were extended and improved in [26]. The authors considered a modified variant of n-ANT called MMASSDSP that—unlike n-ANT—can deal with arbitrary directed graphs containing cycles. We gave an improved running time bound for MMASSDSP on the SDSP of O(Δℓ² + ℓ log(Δℓ)/ρ) that holds under some mild conditions on the
graph. The same bound on the number of iterations also holds for the all-pairs shortest
path problem (APSP) and an algorithm MMASAPSP using distinct ants and distinct
pheromones for each destination. We also showed that a simple interaction mecha-
nism between ants heading for different destinations yields a significant speed-up.
The resulting bound of O(n³ log³ n) constructed solutions was—at that time—lower
than the worst-case expected optimization time of evolutionary algorithms using only
mutation [12] as well as evolutionary algorithms with crossover [13]. For the lat-
ter an upper bound of O(n^3.25 log^0.75 n) holds and examples are given where this is
tight [11]. Only after our previous work was published, Doerr et al. [14] presented an
evolutionary algorithm with a modified parent selection that has a better upper bound
of O(n³ log n) constructed solutions.
In an independent line of research, Borkar and Das [6] presented convergence
proofs and empirical results for an ACO system that contains an additional learn-
ing component. In terms of shortest paths they only considered specific classes of
graphs containing layers of nodes such that the edge set contains exactly all pairs of
nodes in subsequent layers. This is referred to as multi-stage shortest path problem.
Kolavali and Bhatnagar [28] extended this work towards four variants of the basic
ACO algorithm from Borkar and Das [6].

1.2 Our Contribution

In this work we extend our previous work [26] on the SDSP towards a stochastic
variant of the SDSP on directed acyclic graphs. The motivation is to investigate the
robustness of ACO in stochastic settings and to see under which conditions the ants
are still able to find shortest paths efficiently. Several different variants of stochastic
shortest path problems have been investigated in the literature on optimization (see,
e.g., [32, 39]). One variant is to find a path with the least expected time (LET) [5, 41].
Another problem is to maximize the probability of arriving at the destination within
a given time bound [7, 22].
We consider a setting where noise is added to edges. The noise is non-negative
and the task is to find or approximate the real shortest paths, i.e., the shortest paths
without noise. The main question is in which settings the ants are still able to locate
shortest paths efficiently, while being subjected to noise. The reader might think of
noise reflecting errors of measurement that occur when trying to evaluate the quality
of a candidate solution. In the special case where the expected noisy path length
and the real path length differ by a fixed factor, for all paths, the task of finding
the shortest real path is equivalent to finding the path with the least expected time.
This then yields an instance of the LET problem. The described property holds for
some of the investigated settings, hence our analyses address special cases of the LET
problem. To our knowledge this is the first running time analysis of a randomized
search heuristic on a stochastic combinatorial problem.
We describe our results and give an outline of this paper (for an overview of our
theorems see Table 1). In Sect. 2 we formally introduce our setting, the problem, and
the ant system. Section 3 presents general upper bounds for the time until a reason-
able approximation is found. This includes graphs with gaps between the shortest-
path lengths and the lengths of non-optimal paths that allow the ants to efficiently
compute the real shortest paths. For arbitrary weights an upper bound for general
and possibly dependent noise is presented. This result is refined for general indepen-
dent noise. Section 4 deals with the gamma distribution, which is a common choice
for modeling noise. In Sect. 5 we consider independent gamma-distributed noise and
prove an exponential lower bound for the time until a good approximation is found
on a constructed graph. Section 6 is dedicated to the case that the noise values on
different edges are strongly correlated. We show that the negative result for the graph
considered in Sect. 5 breaks down. This demonstrates that correlations can make a
difference between polynomial and exponential times for finding a good approxima-
tion. We also prove a general upper bound for finding shortest paths under gamma-
distributed noise that holds for graphs with gaps between the shortest-path lengths
and those of non-optimal paths. In particular, there is a trade-off between the size of
the gap and the upper bound on the expected running time. We conclude in Sect. 7.

2 Problem and Algorithm

Consider a weighted directed acyclic graph G = (V, E, w) with V = {1, . . . , n}, E = {e1, . . . , em} ⊆ V × V, and w : E → R₀⁺. We are interested in finding a shortest path from each source v ∈ V to a single destination. Throughout this work, ℓ denotes the maximum number of edges on any shortest path to the destination. Similarly, L denotes the maximum number of edges on any path to the destination.
We consider a sequence p = (v0 , . . . , vs ) of vertices vi ∈ V with (vi−1 , vi ) ∈ E
for each 1 ≤ i ≤ s as a path from v0 to vs with s edges. We also consider the corre-
sponding sequence p = ((v0 , v1 ), . . . , (vs−1 , vs )) of edges (vi−1 , vi ) ∈ E as a path.
In the remainder of the paper we utilize both representations for convenience. We define the length w(p) of a path p = (e1, . . . , es) as w(p) := Σ_{i=1}^{s} w(ei) if it ends with the destination and w(p) := ∞ otherwise. By deg(v) we denote the outdegree
of v and by Δ we denote the maximum outdegree of any vertex in the graph.
We investigate a stochastic version of the described shortest path problem. The
term “stochastic” means that whenever we evaluate the length of a path p at time t
we do not get the real length w(p) but a noisy length w̃(p, t). The resulting random
variables w̃(p, 1), w̃(p, 2), . . . for a fixed path are i.i.d.
Each random variable w̃(p, t) is determined as follows. Assume a family
(η(e, p, t))e∈E of nonnegative random variables η(e, p, t) ≥ 0. The noisy length

Table 1 Overview of the theorems. η(e) = η(e, p, t) and η = η(p, t) are shorthands for noise added to an edge e, after being multiplied with the edge's weight w(e) (see Sect. 2). Independence ("ind.") refers to independence across edges. L denotes the maximum number of edges on any path to the destination, optv is the length of a shortest path from v, ηmax := max_{1≤i≤m} E(η(ei)), and w̃max := max_{e∈E} E(η(e)) · w(e). The constraints on the parameters of the algorithm τmin, τmax and ρ (see Sect. 2) and the parameters of the gamma-distributed noise k and θ are given in the corresponding theorem.

Theorem 1 · Graph: instances with gaps (a) · Goal: shortest paths · Noise: η(e) ≥ 0 · Time bound: O((L log n)/τmin · α/(α − 1) + L log(τmax/τmin)/ρ) in expectation

Theorem 2 · Graph: all instances · Goal: multiplicative error (1 + α · ηmax)^L, α > 1 · Noise: η(e) ≥ 0 · Time bound: O((L log n)/τmin · α/(α − 1) + L log(τmax/τmin)/ρ) in expectation

Theorem 3 · Graph: all instances · Goal: additive error (L + 1)²/2 · w̃max · Noise: η(e) ≥ 0 ind. · Time bound: O((L log n)/τmin + L log(τmax/τmin)/ρ) in expectation

Theorem 4 · Graph: example graph Gn,kθ,ε · Goal: multiplicative error (1 + r), r > 0 arbitrary · Noise: η(e) ∼ Γ(k, θ) ind. · Time bound: Ω(n/τmin + √n log(τmax/τmin)/ρ) w.h.p.

Theorem 5 · Graph: example graph Gn,kθ,ε · Goal: multiplicative error (1 + ε), ε ≤ 1 with ε/(kθ) = 1/e − Ω(1) · Noise: η(e) ∼ Γ(k, θ) ind. · Time bound: Ω(e^{c√n}) w.h.p., c > 0 constant

Theorem 6 · Graph: example graph Gn,kθ,ε · Goal: shortest paths · Noise: η ∼ Γ(k, θ) · Time bound: O(n/τmin + (kθ/ε)^k · e^{ε/θ}/τmin + n log(τmax/τmin)/ρ) in expectation

Theorem 7 · Graph: instances with gaps (b) · Goal: shortest paths · Noise: η ∼ Γ(k, θ) · Time bound: O(((z + 1)^k L log n)/τmin + L log(τmax/τmin)/ρ) in expectation

(a) Length of every non-optimal path from v at least (1 + α · E(η(optv))) · optv, α > 1
(b) Length of every non-optimal path from v at least (1 + kθ/z) · optv, z ∈ N

w̃(e, t) of an edge e ∈ p is then computed as (1 + η(e, p, t)) · w(e). The noisy length w̃(p, t) of a path p = (e1, . . . , es) is then defined as

    Σ_{i=1}^{s} (1 + η(ei, p, t)) · w(ei).

Note that w̃(e, t) ≥ w(e) holds for each edge e and w̃(p, t) ≥ w(p) holds for each
path p. Also note that the strength of the noise depends on the weight of the corre-
sponding edge.
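To make the evaluation procedure concrete, the following minimal Python sketch (ours, not from the paper) computes one noisy evaluation of a path; gamma-distributed noise, as studied in Sect. 4, is used purely as an illustrative choice of distribution.

    import random

    def noisy_length(edge_weights, k=1, theta=1.0):
        # One noisy evaluation w~(p, t) of a path p, given as its list of
        # edge weights w(e_i): each edge contributes (1 + eta) * w(e_i)
        # with a fresh nonnegative noise sample eta, here Gamma(k, theta).
        return sum((1.0 + random.gammavariate(k, theta)) * w for w in edge_weights)

    # A path of three unit-weight edges: the noisy length is always at
    # least the real length 3 and has expectation 3 * (1 + k * theta).
    print(noisy_length([1.0, 1.0, 1.0]))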
We already mentioned that noisy path length evaluations are independent across time, i.e., η(e, p, t) and η(e, p, t′) are i.i.d. random variables for all t ≠ t′. Similarly, we assume that η(e, p, t) and η(e, p′, t) are i.i.d. for all p′ ≠ p. Two ants, constructing paths p and p′, respectively, may experience different noise on the same edge, in the same iteration. (Many analyses also apply to noise models where all ants in one iteration experience the same noise: η(e, p, t) = η(e, p′, t) for all p ≠ p′.) When speaking of independent noise, in the following we refer to independence across edges, i.e., η(e, p, t) and η(e′, p, t) are independent. While independence across time and across paths/ants is always assumed, our results cover noise models with non-independent noise across edges as well as models where η(e, p, t) and η(e′, p, t) are i.i.d. for all edges e ≠ e′.

Algorithm 1 Path Construction from u to v
1: Initialize i ← 0, p0 ← u, V1 ← {p ∈ V | (p0, p) ∈ E}
2: while pi ≠ v and Vi+1 ≠ ∅ do
3:   i ← i + 1
4:   Choose pi ∈ Vi with probability τ((pi−1, pi)) / Σ_{p∈Vi} τ((pi−1, p))
5:   Vi+1 ← {p ∈ V | (pi, p) ∈ E}
6: end while
7: return (p0, . . . , pi)

Algorithm 2 MMASSDSP
1: Initialize pheromones τ and best-so-far paths p1∗, . . . , pn∗
2: for t = 1, 2, . . . do
3:   for u = 1 to n do in parallel
4:     Construct a simple path pu from u to the destination w.r.t. τ
5:     Sample w̃(pu, t) from the distribution underlying the noisy path length of pu
6:     if w̃(pu, t) ≤ w̃(pu∗, tu∗) then pu∗ ← pu; tu∗ ← t end if
7:     Update pheromones τ on all edges (u, ·) w.r.t. pu∗
8:   end for
9: end for

We consider the ant system MMASSDSP introduced in [26]. To ease the presenta-
tion, we describe a simplification of the system for directed acyclic graphs as the re-
sults in this work will be limited to acyclic graphs. If the underlying graph is acyclic,
the ant system can be formulated as follows (see Algorithms 1 and 2). In every iter-
ation, from each vertex, an ant starts and tries to construct a path to the destination.
It does so by performing a random walk through the graph guided by pheromones.
Pheromones are positive real values associated with the edges of a graph. They are
denoted by a function τ : E → R⁺. The next edge is always chosen with a probabil-
ity that is proportional to the amount of pheromone on the edge. If the ant gets stuck
because there are no more outgoing edges, the length of the resulting path is ∞ by
definition of w. This can only happen if there are vertices from which the destination
is not reachable. The path construction is detailed in Algorithm 1.
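To complement the pseudocode, here is a minimal Python sketch of the path construction; the adjacency-list and pheromone representations (adj, tau) are our own choices, not notation from the paper.

    import random

    def construct_path(u, dest, adj, tau):
        # Random walk as in Algorithm 1: from the current vertex the next
        # vertex is chosen with probability proportional to the pheromone
        # on the corresponding edge. adj[x] lists the successors of x and
        # tau[(x, y)] holds the pheromone value of edge (x, y).
        path = [u]
        while path[-1] != dest and adj.get(path[-1]):
            cur = path[-1]
            succs = adj[cur]
            nxt = random.choices(succs, weights=[tau[(cur, s)] for s in succs])[0]
            path.append(nxt)
        return path  # ends at dest, or is stuck (length ∞ by definition of w)

A walk that gets stuck corresponds to the case Vi+1 = ∅ in Algorithm 1.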
Once an ant with start vertex u has created a path, its best-so-far path pu∗ is updated
with the newly constructed path pu if the latter is not worse than the former. Note that
this decision is made according to the noisy path lengths w̃(pu∗ , tu∗ ) and w̃(pu , t), tu∗
being the iteration in which pu∗ was stored as best-so-far path. This means that we
store the noisy length of pu∗ at the time tu∗ and use this value for the comparison,
instead of re-evaluating the noisy path length of pu∗ at time t. In the following, we
always use short-hands w̃(pu∗ ) := w̃(pu∗ , tu∗ ) and w̃(pu ) := w̃(pu , t), t referring to
the current iteration. We use similar abbreviations for η(e, p, t) where appropriate.
Finally, the ant updates pheromones on the edges leaving its start vertex according
to the best-so-far path pu∗ and the following formula. Pheromones on an edge e are
denoted by τ (e). Initially we set τ (e) = 1/ deg(v) for all e = (v, ·) as well as pv∗ = (),
tv∗ = 0 and w̃(pv∗ , 0) = ∞ for all v ∈ V . Using equal pheromones for all edges (v, ·)
implies that in the first iteration all ants make uniform random decisions at all ver-
tices. The evaporation factor 0 ≤ ρ ≤ 1 as well as τmin and τmax are parameters of the
algorithm. The pheromone update formula is given by

    τ(e) ← min{(1 − ρ) · τ(e) + ρ, τmax}   if e = (u, v) ∈ pu∗,
    τ(e) ← max{(1 − ρ) · τ(e), τmin}       if e = (u, v) ∉ pu∗.

The so-called pheromone borders τmin and τmax ensure that the pheromone for each
edge is always bounded away from 0, so that there is always a chance of reverting
a decision once made. As in [26] we fix τmax := 1 − τmin and only vary τmin , sub-
ject to the constraints 0 < τmin ≤ 1/Δ. The latter inequality is required to make the
initialization work properly. Note that the algorithm can be parallelized easily as the
path constructions are independent and the pheromone update concerns disjoint sets
of edges. The complete algorithm is described in Algorithm 2.
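A direct Python transcription of the update rule, using the same dictionary-based representation as in the sketch above (again our own, purely illustrative):

    def update_pheromones(u, best_path_edges, adj, tau, rho, tau_min, tau_max):
        # Pheromone update on all edges leaving u: edges of the best-so-far
        # path are rewarded, all other edges evaporate, and the borders
        # tau_min and tau_max keep every value bounded away from 0 and 1.
        for s in adj[u]:
            e = (u, s)
            if e in best_path_edges:
                tau[e] = min((1 - rho) * tau[e] + rho, tau_max)
            else:
                tau[e] = max((1 - rho) * tau[e], tau_min)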
In the theory of randomized search heuristics often the number of iterations is
taken as performance measure. As we have n ants constructing solutions in one it-
eration, it makes sense to also consider the number of constructed solutions. This
holds in particular when comparing our results to other search heuristics. For more
detailed runtime measures as common in the theory of algorithms the number of op-
erations needed for path constructions and pheromone updates has to be taken into
account. The probabilities for choosing specific edges only need to be computed once
in each iteration. The effort for each vertex is proportional to its outdegree, hence the
total effort for computing the probabilities is O(|E|). One path construction can be
implemented such that it runs in time Õ(L) (Õ hiding logarithmic factors). Hence,
one iteration of the ant system can be implemented using Õ(|E| + n · L) = Õ(n2 )
elementary operations.
In the following, we bound the number of iterations until the subsequent goal is
met. Let optv denote the length w(p) of a shortest path p from v ∈ V to n. Let Pv
denote the set containing all paths from v ∈ V to n and Pv (α) the set containing each
path p from v ∈ V to n with w(p) ≤ α · optv where α ∈ R with α ≥ 1. Call a vertex
v α-approximated if and only if w(pv∗ ) ≤ α · optv . Call a vertex v permanently α-
approximated if and only if w̃(pv∗ ) ≤ α · optv . The difference between these notions
can be explained as follows. If v is α-approximated, the length of the best-so-far path
might be larger than an α-approximation as we store the noisy length of the best-so-
far path, w̃(pv∗ ). The ant may then later switch to another path with smaller noisy
length, but larger real length, so that v loses its α-approximation property. On the
other hand, if a vertex is permanently α-approximated, then it will always remain
permanently α-approximated.
We say that MMASSDSP has found an α-approximation if all v ∈ V are α-approx-
imated at the same time. Note that we require the best-so-far paths to reflect good
approximations. It may happen that ants create paths with better real length, but if
their noisy length is worse than the current best-so-far path length, this better path will
be lost immediately. Therefore, it makes sense to base the goal of the optimization on
best-so-far paths as this is much more permanent information. Also we should expect
that a well-performing ACO system is able to store information about good solutions
in the pheromones.
Note that with a 1-approximation all shortest paths have been found. However,
in general it does not make sense to hope for a permanent 1-approximation as the
probability of experiencing a noise value of 0 might be 0.
3 General Upper Bounds

We begin with a very general upper bound for the optimization time holding for all
noise distributions. Our arguments are based on the following statements that are
implicitly proven in [26]. For the sake of completeness, we give a proof.

Lemma 1 The ant system MMASSDSP features the following properties on directed
acyclic graphs with destination n.
1. The probability of an ant at u ≠ n with deg(u) > 1 following the edge e = (u, v) is at most τ(e) and at least τ(e)/2 ≥ τmin/2.
2. If the pheromone on an edge has not been increased for at least T ∗ :=
ln(τmax /τmin )/ρ iterations, the pheromone on the edge is τmin .
3. Consider a set Sv of paths from any vertex v ∈ V to the destination such that every edge leaving Sv has pheromone τmin. If every path in Sv has at most ℓ − 1 edges and τmin ≤ 1/(Δℓ), then the probability of an ant following a path in Sv from v to the destination is at least 1/e.

Proof The first statement follows from Lemma 1 in [26]. It says that probabilities
and pheromones are closely related as they differ only by a factor of at most 2. In
many cases, like at initialization, the pheromones on edges (v, ·) for some vertex v
sum up to 1. If v has degree larger than 2, this sum can exceed 1 if pheromones
on several edges are set to τmin by the max-term in the pheromone update formula.
Taking the maximum can be thought of as pheromones being artificially raised after
being multiplied with 1 − ρ. This can increase the sum of pheromones beyond 1.
However, a simple calculation shows that if τmin ≤ 1/Δ then the sum of pheromones
is always bounded by 2. This directly implies the claimed lower bound τ (e)/2; the
upper bound τ (e) follows as the sum of pheromones is always at least 1.
The second statement holds since any initial pheromone τ(e) attains the value τmin after not having been increased for at least ln(τmax/τmin)/ρ iterations. The reason is that the pheromone is always multiplied by 1 − ρ (unless the lower border τmin is hit) and (1 − ρ)^{ln(τmax/τmin)/ρ} · τ(e) ≤ e^{−ln(τmax/τmin)} · τ(e) ≤ τmin.
The third statement holds since at every vertex v the probability of choosing a "wrong" edge (i.e. an edge leaving Sv) is at most deg(v) · τmin. This follows from the assumption on the pheromones and the first statement. The probability of not choosing any wrong edge on a path of at most ℓ − 1 edges is then at least (1 − deg(v) · τmin)^{ℓ−1} ≥ (1 − 1/ℓ)^{ℓ−1} ≥ 1/e since τmin ≤ 1/(deg(v) · ℓ) by assumption. □

Intuitively, our ant system should be able to identify the real shortest paths in case
there is a certain gap between weights of all shortest paths and all non-optimal paths.
The following theorem gives insight into how large this gap has to be. The variable
α describes a trade-off between the strength of the preconditions and the expected
running time.

Theorem 1 Consider a graph G = (V , E, w) with weight w(e) > 0 and noise η(e) =
η(e, p, t) ≥ 0 for each e ∈ E. Choose τmin ≤ 1/(ΔL). Let optv denote the real length
of a shortest path from v and η(optv ) denote the random noise on all edges of this
path. If for each vertex v ∈ V and some α > 1 it holds that every non-optimal path
has length at least (1 + α · E(η(optv ))) · optv then in expectation MMASSDSP finds a
1-approximation after O((L log n)/τmin · α/(α − 1) + L log(τmax /τmin )/ρ) iterations.

For τmin = 1/(ΔL) and α = 1 + Ω(1), the running time bound simplifies to O(ΔL² log n + (L log n)/ρ). If additionally ρ = Ω(1/(ΔL)), we get O(ΔL² log n).

Proof Adapting notation from [2], we call a vertex v optimized if the ant starting
at v has found a shortest path and w̃(pv∗ ) < (1 + α · E(η(optv ))) · optv . Along with
our condition on path lengths, we have that then the ant at v will never accept a non-
optimal path as its best-so-far path. Hence, it will always reinforce some shortest path
with pheromones. We call vertex v processed if it is optimized and all pheromones
on edges (v, ·) that are not part of shortest paths from v have decreased to τmin .
Due to Lemma 1 this happens at most ln(τmax /τmin )/ρ iterations after v has become
optimized.
We define a partition of V according to the maximum number of edges on any
path to the destination. Let

Vi := {v ∈ V | ∃p ∈ Pv : |p| ≥ i and ∀p ∈ Pv : |p| ≤ i}

with 0 ≤ i ≤ L where |p| denotes the number of edges of p. (Recall that Pv denotes
the set containing all paths from v to the destination, see Sect. 2.) Then V0 , . . . , VL
form a partition of V . Note that for each (u, v) ∈ E there are indices i and j with
u ∈ Vi , v ∈ Vj , and i > j .
Consider a vertex u ∈ Vi . Assume that for each index 0 ≤ j < i each vertex v ∈ Vj
is processed. We estimate the expected time until u becomes processed. For each
edge (v, ·), v ∈ Vj , 0 ≤ j < i, that is not part of a shortest path from v we have
τ ((v, ·)) = τmin . Denote by au the ant starting at vertex u. The probability of ant au
choosing the first edge (u, v), v ∈ Vj , 0 ≤ j < i, of a path from Pu (1) (i.e. a shortest
path) is lower bounded by τmin /2 due to the first statement of Lemma 1. Invoking
Lemma 1 with Sv = Pv (1), the set of all shortest paths, the probability of ant au
continuing from v on some path from Pv (1) to the destination is at least 1/e. Together,
the probability of ant au finding some shortest path is at least τmin /(2e).
In addition to finding such a path p = (e1 , . . . , ek ), the noisy length evaluation of
p must not be too poor. The vertex u becomes optimized if the noise sampled on optu
is less than α · E(η(optu )). By Markov’s inequality this happens with probability at
least 1 − 1/α. Hence, the probability of optimizing vertex u ∈ Vi is lower bounded
by τmin /(2e) · (1 − 1/α). The expected waiting time until this happens is at most
O(1/τmin · α/(α − 1)).
By standard arguments, the expected time until the last vertex in Vi has become
optimized is bounded by O((log n)/τmin · α/(α − 1)). After an additional waiting
time of T ∗ = ln(τmax /τmin )/ρ iterations all pheromones for edges leaving Vi have
been adapted appropriately and we continue our considerations with the set Vi+1 .
Summing up the expected times for all L sets yields the claimed upper bound. □

Note that the statement of Theorem 1 can be improved by replacing L with ℓ ≤ L and adapting the partition V0, . . . , VL towards edges on shortest paths. We refrain from such a modification to be consistent with upcoming results.
The condition on the length of non-optimal paths is stronger for vertices that are
“far away” from the destination. Imagine a multigraph where some vertex v has two
edges e1 , e2 of different weight such that e1 is part of a shortest path and both edges
lead to the same vertex w. In order for each non-optimal path to have length at least
(1 + α · E(η(optv ))) · optv , it must be that e2 has a large weight. This effect be-
comes more pronounced, the larger the real length of the shortest path from w is.
This stronger requirement makes sense because the ants must still be able to distin-
guish short paths from long paths efficiently.
In the case of arbitrary weights the situation becomes more complicated. If second-
best paths are only slightly longer than shortest paths, it may be difficult for artificial
ants to distinguish between these paths in the presence of noise. In this case we cannot
always rely on the optimality of sub-paths as done in Theorem 1. The next theorem
provides a trade-off between the desired approximation ratio and the required opti-
mization time according to the variable α.

Theorem 2 Consider a graph G = (V, E, w) with weight w(e) > 0 and noise η(e) = η(e, p, t) ≥ 0 for each e ∈ E. Choose τmin ≤ 1/(ΔL) and α > 1 and let χ = (1 + α · ηmax)^L, where ηmax := max_{1≤i≤m} E(η(ei)). Then in expectation MMASSDSP finds a χ-approximation in O((L log n)/τmin · α/(α − 1) + L log(τmax/τmin)/ρ) iterations.

The running time bound can be simplified in the same way as described after Theorem 1. Note that the approximation ratio χ ≤ e^{α·ηmax·L} converges to 1 if ηmax = o(1/(α · L)). However, the approximation ratio quickly deteriorates with larger noise.

Proof Recall the partition V0, . . . , VL from Theorem 1. Consider a vertex u ∈ Vi. Assume that for each index 0 ≤ j < i each vertex v ∈ Vj has been permanently χ^{j/L}-approximated for at least T∗ = ln(τmax/τmin)/ρ iterations. We estimate the expected time until u becomes permanently χ^{i/L}-approximated. For each edge (v, ·), v ∈ Vj, 0 ≤ j < i, that is not extendable to a path from Pv(χ^{j/L}) we have τ((v, ·)) = τmin by the second statement of Lemma 1. Denote by au the ant starting at vertex u. The probability of ant au choosing the first edge (u, v), v ∈ Vj, 0 ≤ j < i, of a path from Pu(1) (i.e. a shortest path) is lower bounded by τmin/2 due to the first statement of Lemma 1. Invoking Lemma 1 with Sv = Pv(χ^{j/L}), the probability of ant au continuing from v on some path from Pv(χ^{j/L}) to the destination is at least 1/e. Together, the probability of ant au finding some path p from Pu(χ^{j/L}) is at least τmin/(2e).

In addition to finding such a path p = (e1, . . . , ek), the noisy length evaluation of p must not be too poor. The vertex u becomes permanently χ^{i/L}-approximated if w̃(p) ≤ χ^{i/L} · optu. We have

    χ^{i/L} · optu = χ^{i/L} · (w(e1) + optv)
                   ≥ χ^{i/L} · (w(e1) + (1/χ^{j/L}) · Σ_{i=2}^{k} w(ei))
                   = (χ^{i/L} − χ^{(i−j)/L}) · w(e1) + χ^{(i−j)/L} · Σ_{i=1}^{k} w(ei)
                   ≥ χ^{1/L} · Σ_{i=1}^{k} w(ei).

Hence,

    Prob(w̃(p) ≥ χ^{i/L} · optu) ≤ Prob(Σ_{i=1}^{k} (1 + η(ei)) · w(ei) ≥ χ^{1/L} · Σ_{i=1}^{k} w(ei))
                                = Prob(Σ_{i=1}^{k} η(ei) · w(ei) ≥ (χ^{1/L} − 1) · Σ_{i=1}^{k} w(ei)).

By Markov's inequality, this probability is at most

    E(Σ_{i=1}^{k} η(ei) · w(ei)) / ((χ^{1/L} − 1) · Σ_{i=1}^{k} w(ei)) = (Σ_{i=1}^{k} E(η(ei)) · w(ei)) / (α · ηmax · Σ_{i=1}^{k} w(ei)) ≤ 1/α < 1.

Therefore we have w̃(p) < χ^{i/L} · optu with probability at least 1 − 1/α > 0. Hence, the probability of vertex u ∈ Vi becoming permanently χ^{i/L}-approximated is lower bounded by τmin/(2e) · (1 − 1/α). The expected waiting time until this happens is at most O(1/τmin · α/(α − 1)).

By standard arguments, the expected time until the last vertex in Vi has become permanently χ^{i/L}-approximated is bounded by O((log n)/τmin · α/(α − 1)). After an additional waiting time of T∗ iterations all pheromones for edges leaving Vi have been adapted appropriately and we continue our considerations with the set Vi+1. Summing up the expected times for all L sets yields the claimed upper bound. □

In the following we assume that the random variables η(e), e ∈ E, are indepen-
dent. Each time a new path is constructed, new random variables η(e) are used to
derive its noisy weight; all η(e)-values are independent for each edge, each ant and
each iteration. This means that in one iteration different ants may experience differ-
ent noise, even if they follow the same edges. Recall, however, that the length of a
best-so-far path is not re-evaluated whenever it is compared against a new path.
In contrast to Theorem 2 we formulate the result in terms of additive errors in-
stead of multiplicative approximation ratios. While the approximation guarantee in
Theorem 2 depends exponentially on L, the additive error in the upcoming Theo-
rem 3 only depends quadratically on L. We use the following lemma, which is an
immediate implication of Theorem 1 in [23].

Lemma 2 [23] Let X1, . . . , Xn be arbitrary nonnegative independent random variables, with expectations μ1, . . . , μn, respectively. Let X = Σ_{i=1}^{n} Xi, and let μ denote the expectation of X (hence, μ = Σ_{i=1}^{n} μi). Let μmax := max{μ1, . . . , μn}. Then for every α > 0,

    Prob(X ≤ μ + αμmax) ≥ min{α/(1 + α), 1/13}.

Theorem 3 Consider a graph G = (V, E, w) with weight w(e) > 0 where the noise variables η(e) = η(e, p, t) ≥ 0 for each e ∈ E are independent. Choose τmin ≤ 1/(ΔL) and let w̃max := max_{e∈E} E(η(e)) · w(e). MMASSDSP finds an approximation within an additive error of (L + 1)²/2 · w̃max within O((L log n)/τmin + L log(τmax/τmin)/ρ) expected iterations.

Proof The basic proof idea is the same as in Theorem 2. Recall the partition V0, . . . , VL from Theorem 1 and consider a vertex u ∈ Vi. Assume that for each index 0 ≤ j < i each vertex v ∈ Vj has been permanently approximated within an additive error of (j + 1)²/2 · w̃max for at least T∗ = ln(τmax/τmin)/ρ iterations. We estimate the expected time until u becomes permanently approximated within an additive error of (i + 1)²/2 · w̃max. For i = L this implies the claim.

Let Pv⁺(ε) := Pv(1 + ε/optv) for all v ∈ V and ε ≥ 0. For each edge (v, ·), v ∈ Vj, 0 ≤ j < i, that is not extendable to a path from Pv⁺((j + 1)²/2 · w̃max) we have τ((v, ·)) = τmin by the second statement of Lemma 1. Denote by au the ant starting at vertex u. The probability of ant au choosing the first edge (u, v), v ∈ Vj, 0 ≤ j < i, of a path from Pu⁺(0) (i.e. a shortest path) is lower bounded by τmin/2 due to the first statement of Lemma 1. Invoking Lemma 1 with Sv = Pv⁺((j + 1)²/2 · w̃max), the probability of ant au continuing from v on some path from Pv⁺((j + 1)²/2 · w̃max) to the destination is at least 1/e. Together, the probability of ant au finding some path p = (e1, . . . , ek) from Pu⁺((i + 1)²/2 · w̃max) is at least τmin/(2e).

In addition to finding such a path p, the noisy length evaluation of p must not be too poor. Let u be the first and v be the second vertex on the path. The subpath (e2, . . . , ek) lying in Pv⁺((j + 1)²/2 · w̃max) implies that

    Σ_{i=2}^{k} w(ei) ≤ optv + (j + 1)²/2 · w̃max.    (1)

The vertex u becomes permanently approximated within an additive error of (i + 1)²/2 · w̃max if w̃(p) ≤ optu + (i + 1)²/2 · w̃max. Using (i + 1)²/2 − i²/2 = i + 1/2 and (1), we have

    optu + (i + 1)²/2 · w̃max = w(e1) + optv + (i + 1)²/2 · w̃max
      ≥ w(e1) + (optv + (j + 1)²/2 · w̃max) + (i + 1/2) · w̃max
      ≥ Σ_{r=1}^{k} w(er) + (i + 1/2) · w̃max.

Hence,

    Prob(w̃(p) ≤ optu + (i + 1)²/2 · w̃max)
      ≥ Prob(Σ_{r=1}^{k} (1 + η(er)) · w(er) ≤ Σ_{r=1}^{k} w(er) + (i + 1/2) · w̃max)
      ≥ Prob(Σ_{r=1}^{k} η(er) · w(er) ≤ (k + 1/2) · w̃max)
      ≥ Prob(Σ_{r=1}^{k} η(er) · w(er) ≤ Σ_{r=1}^{k} E(η(er)) · w(er) + (1/2) · w̃max)

as E(η(er)) · w(er) ≤ w̃max for every edge er. Invoking Feige's inequality (Lemma 2) with α = 1/2, we have that

    Prob(Σ_{r=1}^{k} η(er) · w(er) ≤ Σ_{r=1}^{k} E(η(er)) · w(er) + (1/2) · w̃max) ≥ 1/13.

We conclude that w̃(p) ≤ optu + (i + 1)²/2 · w̃max holds with probability at least 1/13. The probability of vertex u ∈ Vi becoming permanently approximated within an additive error of (i + 1)²/2 · w̃max is at least τmin/(26e). The expected waiting time until this happens is at most O(1/τmin).

By standard arguments, the expected time until the last vertex in Vi has become permanently approximated within an additive error of (i + 1)²/2 · w̃max is bounded by O((log n)/τmin). After an additional waiting time of T∗ = ln(τmax/τmin)/ρ iterations all pheromones for edges leaving Vi have been adapted appropriately and we continue our considerations with the set Vi+1. Summing up the expected times for all L sets yields the claimed upper bound. □

In the proofs of Theorems 2 and 3 we argue that a bound on the noisy path length
implies the same bound for the real path length. This way of reasoning may seem
wasteful as in many noise models the real path length will be shorter than the noisy
path length. However, there are examples where ants may switch from a short real
path with noise to a longer path without noise and both paths have almost the same
noisy path length.
There is a family of graphs parameterized by n, L, and an arbitrary value μ ≥ 0 where we cannot hope for the ant system to achieve an approximation ratio 1 + ε with 0 ≤ ε < μ. Consider the graph G = (V, E) with V = {1, . . . , n} and E = {(i, i + 1) | n − L ≤ i < n} ∪ {(n − L, n)}. We have w(e) = 1/L and Prob(η(e) = μ) = 1 for e = (i, i + 1), n − L ≤ i < n, as well as w(e) = 1 + ε and Prob(η(e) = 0) = 1 for e = (n − L, n). See Fig. 1 for an illustration. Consider p = (n − L, . . . , n) and p′ = (n − L, n). Hence, w(p) = 1 < 1 + ε = w(p′) and Prob(w̃(p) = 1 + μ > 1 + ε = w̃(p′)) = 1. When ant an−L follows p′ for the first time it replaces the current best-so-far path pn−L∗ and the approximation ratio remains 1 + ε forever.

Fig. 1 Example graph with parameters n, μ, L, deterministic noise, and destination n

Since deterministic noise is independent, we can use Theorem 3. In our example we have w̃max = μ/L, hence the ant system finds an approximation within an additive error of ((L + 1)²/(2L)) · μ in polynomial expected time. This corresponds to a multiplicative error of 1 + ((L + 1)²/(2L)) · μ, in comparison to the lower bound 1 + ε. Our upper bound is asymptotically tight if L is a constant. In general, as ε can be made arbitrarily close to μ, the additive terms in the two bounds are basically off by a factor of around L/2. It is an open problem to determine whether this factor is really necessary or whether it is just an artifact of our proof.
This example also shows that the considered setting is too general since determin-
istic noise can transform any problem instance into any other instance with larger or
equal weights. This illustrates the major obstacle on the way to stronger and more
detailed results. For arbitrary noise, a guarantee on the noisy best-so-far path length
may not contain much reliable information—there can be an arbitrarily large discrep-
ancy between noisy and real path lengths. Better results on what approximation ratio
is achievable within a certain expected time require further restrictions on the noise
model or the considered instances.
In the following, we will consider gamma-distributed noise η(e, p, t) ∼ Γ (k, θ )
for all edges e. In this case for every path p it holds E(w̃(p)) = (1 + kθ ) · w(p),
hence we are looking for least expected time (LET) paths.

4 The Gamma Distribution

Preparing for the upcoming analyses, we discuss the gamma distribution many of
our results are based on. The gamma distribution was introduced for modeling stochastic travel times in [22]. The motivation is due to the following observation: in collision-free traffic, the cars arriving at a particular landmark follow a Poisson process. As the gamma distribution models the time between events in a Poisson process, travel times follow a gamma distribution [39].
Note that we do not directly use a gamma-distributed random variable X as edge
weight. Instead, we use (1 + X) · w(e) as noisy length for the edge e. The addition
of 1 can be seen as incorporating a minimum travel time. We multiply by the real
weight of the edge to make delays proportional to the length of an edge. Besides
being motivated by physical models, the gamma distribution also has nice structural
properties that make it well suited for a theoretical analysis.

The gamma distribution is parameterized by a shape parameter k ∈ R⁺ and a scale parameter θ ∈ R⁺. A gamma-distributed random variable X is denoted by X ∼ Γ(k, θ). Its probability density function is given by

    fX(x) = 0                                     if x ≤ 0,
    fX(x) = x^{k−1} · e^{−x/θ} / (θ^k · Γ(k))     if x > 0,

where Γ denotes the gamma function Γ(k) = ∫₀^∞ t^{k−1} e^{−t} dt. For k ∈ N, the probability density function simplifies to

    fX(x) = x^{k−1} · e^{−x/θ} / (θ^k · (k − 1)!)

for x > 0. In the special case k = 1, the gamma distribution equals the exponential distribution with mean θ. In general, with k ∈ N the gamma distribution reflects the sum of k such exponentially distributed random variables. This distribution is also known as Erlang distribution. The expectation of X is E(X) = kθ; the variance is kθ². The cumulative distribution function for k ∈ N and x ≥ 0 is

    FX(x) = Prob(X ≤ x) = 1 − e^{−x/θ} · Σ_{i=0}^{k−1} (x/θ)^i / i!.

The gamma distribution exhibits the following properties.

1. The sum of m independent gamma-distributed random variables with the same scale parameter is again gamma distributed: if for 1 ≤ i ≤ m we have independent variables Xi ∼ Γ(ki, θ) then

    Σ_{i=1}^{m} Xi ∼ Γ(Σ_{i=1}^{m} ki, θ).

2. Scaling a gamma-distributed random variable results in another gamma-distributed random variable: if X ∼ Γ(k, θ) and α > 0 then αX ∼ Γ(k, αθ).

The "invariance" with respect to summation makes sense as we sum over sums of i.i.d. exponentially distributed variables. This property will prove useful for estimating the total noise on a path of edges with equal weights. The "invariance" with respect to scaling implies that the absolute added noise on an edge e, w(e) · Γ(k, θ), is again a gamma-distributed random variable according to Γ(k, w(e) · θ).

Considering the gamma distribution in noise models also makes sense because of the following. Assume we would try to reduce the (unweighted) noise Γ(k, θ) by sampling an edge s times, for some integer s, and taking the average noise. The resulting distribution of the added noise is then again gamma-distributed according to Γ(ks, θ/s). This does not change the expected noise, as kθ = ks · θ/s, but it leads to a reduction of variance from kθ² to kθ²/s. Note that these arguments also hold if the noisy length of an edge e is given by a function w(e) + Γ(k, θ) for a deterministic weight w(e). One important conclusion is that results for arbitrary gamma-distributed noise also apply when this resampling strategy is employed. Conditions on the expected noise kθ apply for both settings in an equal manner.
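Both closure properties and the variance reduction of the resampling strategy are easy to check empirically. A small numpy experiment (our own illustration, not part of the paper):

    import numpy as np

    rng = np.random.default_rng(0)
    k, theta, s, N = 2, 1.5, 4, 200_000

    # Averaging s independent Gamma(k, theta) samples is distributed as
    # Gamma(k*s, theta/s): the mean k*theta is unchanged, while the
    # variance drops from k*theta**2 to k*theta**2 / s.
    avg = rng.gamma(k, theta, size=(N, s)).mean(axis=1)
    direct = rng.gamma(k * s, theta / s, size=N)

    print(avg.mean(), direct.mean(), k * theta)        # all approx. 3.0
    print(avg.var(), direct.var(), k * theta**2 / s)   # all approx. 1.125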

The following lemma provides estimates for the cumulative distribution function
of a gamma-distributed random variable and will prove useful later on. We present a
self-contained proof as we did not find appropriate references in the literature. The
proof technique used is well known.

Lemma 3 Consider a gamma-distributed random variable X with X ∼ Γ(k, θ) where k ∈ N and θ ∈ R⁺. Then for every x ∈ R⁺

    ((x/θ)^k / k!) · e^{−x/θ} ≤ Prob(X ≤ x) ≤ (x/θ)^k / k!.

Proof Consider the n-th order Taylor approximation to f at a,

    Tn(x) = Σ_{i=0}^{n} (f^{(i)}(a) / i!) · (x − a)^i,

where f^{(i)} denotes the i-th derivative of f. In general, we have f(x) = Tn(x) + Rn(x) where

    Rn(x) = (f^{(n+1)}(ξ) / (n + 1)!) · (x − a)^{n+1}

denotes the Lagrange form of the remainder term for some ξ ∈ [−x/θ, 0]. In particular, putting f := e^x, a := −x/θ, and n := k − 1 we have

    e^0 = Σ_{i=0}^{k−1} (e^{−x/θ} / i!) · (0 − (−x/θ))^i + (e^ξ / k!) · (0 − (−x/θ))^k.

Hence, we conclude

    Prob(X ≤ x) = 1 − Σ_{i=0}^{k−1} ((x/θ)^i / i!) · e^{−x/θ} = 1 − (e^0 − (e^ξ / k!) · (x/θ)^k)

for some ξ ∈ [−x/θ, 0]. We derive

    (e^{−x/θ} / k!) · (x/θ)^k ≤ Prob(X ≤ x) ≤ (e^0 / k!) · (x/θ)^k. □

We also prove the following tail bound. Again, the proof is self-contained. A sim-
ilar tail bound can be found in [21, Chap. 1].

Lemma 4 Consider a gamma-distributed random variable X with X ∼ Γ(k, θ) where k ∈ N and θ ∈ R⁺. Then for every x ∈ R⁺ with x ≥ kθ = E(X),

    Prob(X > x) ≤ (k / (k − 1)!) · e^{−x/θ} · (x/θ)^{k−1}.

Fig. 2 Example graph Gn,kθ,ε from Definition 1 for n = 10 with Wi := 2n(2 + 2kθn)^{i+1}

Proof Using x/θ ≥ k ≥ 1,

    Σ_{i=0}^{k−1} (x/θ)^i / i! = Σ_{i=0}^{k−1} (x/θ)^{k−1} / (i! · (x/θ)^{k−i−1})
                              ≤ (x/θ)^{k−1} · Σ_{i=0}^{k−1} 1 / (i! · k^{k−i−1})
                              ≤ (x/θ)^{k−1} · Σ_{i=0}^{k−1} 1 / (k − 1)!
                              = (k / (k − 1)!) · (x/θ)^{k−1}.

Plugging this into the cumulative distribution function yields the claimed bound. □
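Analogously, the tail bound of Lemma 4 can be checked for x ≥ kθ (again our own scipy-based sketch):

    import math
    from scipy.stats import gamma

    k, theta = 3, 2.0
    for x in (6.0, 10.0, 20.0):  # all satisfy x >= k * theta = E(X)
        bound = k / math.factorial(k - 1) * math.exp(-x / theta) * (x / theta) ** (k - 1)
        tail = gamma.sf(x, a=k, scale=theta)  # Prob(X > x)
        assert tail <= bound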

5 A Lower Bound for Independent Noise

In this section we will establish a lower bound on the random time until a good
approximation is found. The result holds for a worst-case graph that will be described
in the following (for an example, see Fig. 2). First of all, the graph contains the
following subgraph, referred to as right part. The right part has a unique source and
a unique sink. There are two paths to traverse the right part: a chain of n/2 vertices
and a single edge that directly connects the source with the sink. The edge weight for
the long edge is chosen such that the edge is by a factor of (1 + ε) more expensive
than the whole chain.
The main observation now is as follows. When an ant traverses the chain, the
noisy length of the chain will be concentrated around the expectation as the length of
the chain is determined by the sum of many i.i.d. variables. Contrarily, the variance
for the noisy length of the long edge will be quite large. This means that, although in
expectation taking the chain leads to a shorter noisy path, chances are high that the ant
experiences a small noise when taking the long edge. If the value of ε is not too large,
this will result in the ant storing the long edge as best-so-far path and reinforcing it
with pheromones. As long as this best-so-far path is maintained, the approximation
ratio is no better than 1 + ε.
To establish our lower bound, we will prove that after some time, with high prob-
ability the best-so-far path of the ant starting at the source of the right part will be the
long edge. Furthermore we will show that the stored noisy length of the best-so-far
path will be relatively small. This implies that the ant will not accept the chain as
best-so-far path with high probability. This will establish the claimed lower bound on
the approximation ratio.
However, the construction of the worst-case instance is not yet complete. As men-
tioned, the ant starting in the source of the right part needs some time in order to
traverse the long edge and sample it with a noise value that is significantly smaller
than its expectation. In other words, we need some time to trick the ant into believing
that the long edge leads to a shorter path. Therefore, we extend our graph by a sub-
graph where the ants will need a specific minimum amount of time in order to find a
good approximation. This subgraph is prepended to the right part, so that it does not
change the behavior of ants starting at the source of the right part. We give a formal
definition for our graph. Figure 2 gives an illustration for n = 10.

Definition 1 Let n ∈ N, k ∈ N, θ ∈ R⁺, and ε ∈ R⁺. W.l.o.g. we assume that n/2 is an integer. Let Gn,kθ,ε = (V, E, w) with the vertices V = {u1, . . . , un/2 = v0, v1, . . . , vn/2} and the following edges. We have (ui, ui+1), 1 ≤ i ≤ n/2 − 1, and (vi, vi+1), 0 ≤ i ≤ n/2 − 1, with weight 1. Furthermore we have (ui, un/2), 1 ≤ i ≤ n/2 − 2, with weight Wi := 2n(2 + 2kθn)^{i+1} and (v0, vn/2) with weight (1 + ε) · n/2. The destination is given by the vertex vn/2.

With regard to Fig. 2, we refer to edges with weight Wi or (1 + ε) · n/2 as upper edges and to the remaining edges as lower edges (or the lower path in the latter case). We also speak of a left part containing all vertices ui and a right part containing all vertices vi (un/2 = v0 belongs to both parts).
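For concreteness, here is a small Python constructor for Gn,kθ,ε (our own encoding of Definition 1: vertex n/2 + i stands for vi, so un/2 = v0 is vertex n/2 and the destination vn/2 is vertex n):

    def build_example_graph(n, k, theta, eps):
        # Returns the weighted edge set of G_{n,k*theta,eps} as a dict
        # mapping (tail, head) to the edge weight.
        assert n % 2 == 0
        h = n // 2
        E = {}
        for i in range(1, n):                # lower chains u_1..u_{n/2} = v_0..v_{n/2}
            E[(i, i + 1)] = 1.0
        for i in range(1, h - 1):            # upper edges (u_i, u_{n/2}) of the left part
            E[(i, h)] = 2 * n * (2 + 2 * k * theta * n) ** (i + 1)
        E[(h, n)] = (1 + eps) * h            # long upper edge (v_0, v_{n/2}), weight (1+eps)*n/2
        return E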
The first step for proving a lower bound is to show that, given enough time, the ant starting at v0 will eventually favor the upper path. The condition ε/(kθ) = 1/e − Ω(1) can be read as a trade-off between the expected noise strength kθ and the gap between the edge weights on the upper and lower paths, respectively. The larger the expected noise kθ, the larger the gap can be.

Lemma 5 Consider the graph Gn,kθ,ε = (V, E, w) with independent noise η(e) = η(e, p, t) ∼ Γ(k, θ) for each e ∈ E. Let ε = O(1), k = o(log n), k ∈ N, and ε/(kθ) = 1/e − Ω(1). Choose the parameters 1/poly(n) ≤ τmin ≤ 1/2 and ρ ≥ 1/poly(n). Then with probability 1 − exp(−Ω(√n)) after κn/τmin iterations, we have a situation where for the ant starting from v0 the following holds:
1. the ant's best-so-far path is the upper path and
2. the probability of changing the best-so-far path within a specific iteration towards the lower path is exp(−Ω(n)),
where κ > 0 is an arbitrary constant.

Proof Consider the ant av0 and the edge (v0, vn/2). Since τ((v0, vn/2)) ≥ τmin, the probability of the ant following this edge is lower bounded by τmin/2 due to Lemma 1. Consider the first κn/τmin iterations and assume w.l.o.g. that this number is an integer. We count the number of times the ant follows the path. Define X = Σ_{i=1}^{κn/τmin} Xi where the Xi are independent Bernoulli-distributed random variables with Prob(Xi = 1) = τmin/2 and Prob(Xi = 0) = 1 − τmin/2. It is obvious that the real number of times the ant chooses the upper edge stochastically dominates X. We have E(X) = κn/τmin · τmin/2 = κn/2. Then the probability of following this path less than κn/4 times is at most

    Prob(X < κn/4) ≤ Prob(X ≤ (1 − 1/2) · E(X)) ≤ e^{−(1/2)² · E(X)/2} = e^{−κn/16}

due to a well-known Chernoff bound. Hence, with probability at least 1 − exp(−Ω(n)) the ant chooses the upper edge at least κn/4 times during the first κn/τmin iterations of the phase.
We now argue that there is a threshold b such that with high probability the ant finds a path of noisy length at most b when following the upper path. Contrarily, experiencing a noisy length at most b when following the lower path has probability exp(−Ω(n)). This proves both statements.

The noise on the upper path is distributed according to U ∼ Γ(k, (1 + ε) · n/2 · θ) and the noise on the lower path is distributed according to L ∼ Γ(n/2 · k, θ). Hence, the noisy length of the upper path is (1 + ε) · n/2 + Ui and the noisy length of the lower path is n/2 + Li, where Ui and Li denote the i-th instantiation of U and L, respectively.

The precondition ε/(kθ) ≤ 1/e − Ω(1) implies the existence of a constant c, 0 < c < 1, such that ε ≤ kθ · (2c − 1)/e. We choose the threshold

    b := n/2 + n/2 · kθ · c/e.
First of all, consider Prob((1 + ) · n/2 + min{Ui } ≤ b) with 1 ≤ i ≤ κn/4. In
order for the above event to happen, we must have one instantiation Ui where
Ui ≤ b − (1 + ε) · n/2 or, equivalently,
 
n c
Ui ≤ · kθ · − ε .
2 e
We estimate the probability of this event. Due to Lemma 3 and the inequality n! ≤ nn ,
    
n c 1 n/2 · (kθ · c/e − ε) k − n/2·(kθ·c/e−ε)
Prob Ui ≤ · kθ · − ε ≥ · · e (1+ε)n/2·θ
2 e k! (1 + ε)n/2 · θ
 
1 kθ · c/e − ε k − kθ·c/e−ε
= · · e (1+ε)θ
k! (1 + ε)θ
and using kθ c/e − ε ≥ kθ (1 − c)/e we bound this as
 
1 k(1 − c) k − kθ·c/e−ε
≥ · · e (1+ε)θ .
k! (1 + ε) · e

The e-term is bounded from below by e−k/e since


kθ c/e − ε kθ c/e k
≤ ≤ .
(1 + ε)θ θ e
Algorithmica (2012) 64:643–672 663

Using k! · ek/e ≤ 2k k for all k ∈ N, we conclude


    k
n c 1 1−c 1
Prob Ui ≤ · kθ · − ε ≥ · ≥ √ ,
2 e 2 (1 + ε) · e 2 n

where the last inequality holds due to k = o(log n), if n is not too small.
Hence,

Prob((1 + ) · (n/2) + min{Ui } ≤ b)


 κ·(n/4)
= 1 − 1 − Prob(U1 ≤ b − (1 + ) · (n/2))
 
1 κ·(n/4)
≥1− 1− √
2 n
 √ 
= 1 − exp −Ω( n) .

The probability that an ant on the chain finds a path with costs greater than the threshold b is given by Prob(n/2 + Li > b). By Lemma 3 and n! ≥ (n/e)^n,

Prob(n/2 + Li ≤ b) = Prob(Li ≤ b − n/2)
≤ (1/(n/2 · k)!) · ((b − n/2)/θ)^(n/2·k)
= (n/2 · kc/e)^(n/2·k) / (n/2 · k)!
≤ ((n/2 · kc/e) / (n/2 · k/e))^(n/2·k)
= c^(n/2·k) = e^(−Ω(n)).

This completes the proof. 
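The core effect in this proof — a single sample of the upper path with unusually small noise beats every sample of the lower path — can also be illustrated numerically. The following sketch is not part of the proof; all parameter values, including the choice of c, are assumptions made purely for illustration:

# Monte Carlo illustration of Lemma 5: the minimum over kappa*n/4 noisy
# samples of the upper path falls below the threshold b with high
# probability, while a noisy sample of the lower path almost never does.
import numpy as np

rng = np.random.default_rng(1)
n, k, theta, eps, kappa = 200, 2, 1.0, 0.1, 1.0
c = 0.7                                   # a constant with eps <= k*theta*(2c-1)/e
b = n / 2 + (n / 2) * k * theta * c / np.e

m = int(kappa * n / 4)                    # number of samples of the upper path
upper = (1 + eps) * n / 2 + rng.gamma(k, (1 + eps) * (n / 2) * theta, size=(20_000, m))
lower = n / 2 + rng.gamma((n // 2) * k, theta, size=20_000)

print("Prob(min noisy upper length <= b):", np.mean(upper.min(axis=1) <= b))
print("Prob(noisy lower length <= b):   ", np.mean(lower <= b))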

The following lemma establishes a lower bound on the time until shortest paths
are found for the left part of the graph. This will also give a lower bound for the time
until a good approximation is reached for the left part and, as a consequence, also for
the whole graph.

Lemma 6 Consider the graph Gn,kθ,ε = (V, E, w) with independent noise η(e) = η(e, p, t) ∼ Γ(k, θ) for each e ∈ E. Let ε ≤ 1 and k ∈ N. Choose 1/poly(n) ≤ τmin ≤ 1/2 and 1/poly(n) ≤ ρ ≤ 1/2. With probability 1 − exp(−Ω(√n/log n)) MMASSDSP does not find a 2-approximation within n/(6τmin) + √n · ln(τmax/τmin)/ρ iterations.

Proof In [26] the expected optimization time of the ant system MMASSDSP on a graph very similar to the left part of G, in a setting without noise, is bounded below by Ω(n/τmin + n/(ρ log(1/ρ))). We partly base our analysis on this proof and focus on the left part of G. A common trait of the graph in [26] without noise
and the left part of our graph with noise is that the shortest path through the subgraph
is clearly to follow the chain of lower edges to vertex un/2 . In addition, when starting
from vertex ui the second best choice is to take the edge (ui , un/2 ) as the weights for
the upper edges increase from left to right. As a consequence, many ants are tricked
into placing pheromones on their first upper edges, which makes it hard for other ants
to follow the chain of lower edges.
We first prove that (ui , un/2 ) is the second best choice in the noisy setting, with
overwhelming probability. This enables us to argue with basically the same ordering
of paths as in the deterministic setting of our prior work. Then we prove that the ant
system needs at least the claimed amount of time to subsequently find good approx-
imate paths from right to left. The basic idea is that at the beginning pheromones
are laid such that the probability of choosing an edge of the chain is 1/2. For ver-
tices whose ants have chosen the direct edge to un/2 , this probability steadily drops
over time, until it reaches the lower pheromone border τmin. In this situation, after a
first phase of √n · ln(τmax/τmin)/ρ iterations, there is still a linear number of vertices
for which no good approximation has been found. The remaining time until these
vertices become well approximated is at least n/(6τmin ) with the claimed probability.
On Second-Best Paths. First observe that, by Lemma 4, if η(e) ∼ Γ(k, θ) we have Prob(η(e) ≥ kθn) ≤ e^(−kn) · (kn)^(k−1) · k/(k − 1)! ≤ e^(−Ω(n)). Assuming that this event does not happen for any edge during the considered time period of n/(6τmin) + √n · T∗ with T∗ = ln(τmax/τmin)/ρ iterations only introduces an exponentially small error. Thus,
with overwhelming probability we can assume for every path p

w(p) ≤ w̃(p) < (1 + kθ n)w(p) = βw(p),

where β := 1 + kθn. For 1 ≤ i ≤ n/2 − 2 and 0 ≤ j ≤ n/2 − i − 2 let pi,j denote the path that starts at ui and follows j lower edges before taking the upper edge to un/2:

pi,j := ((ui, ui+1), ..., (ui+j−1, ui+j), (ui+j, un/2)).
Recall that (uj, un/2) has weight Wj = 2n(2 + 2kθn)^(j+1) = 2n(2β)^(j+1). Also note that every path from un/2 to the destination has real length between n/2 and (1 + ε)n/2 ≤ n as ε ≤ 1. For every i and every j, j′ with j < j′ ≤ n/2 − i − 2 and for every two paths p∗, p∗∗ from un/2 to the destination we have (with ◦ denoting the concatenation of paths)

w̃(pi,j ◦ p∗) < β · w(pi,j ◦ p∗)
≤ β · (j + 2n(2β)^(i+j+1) + n) ≤ 2n(2β)^(i+j+2)
≤ w(pi,j′ ◦ p∗∗) ≤ w̃(pi,j′ ◦ p∗∗).

Hence, an ant will always prefer pi,j over pi,j′; in particular, pi,0 is the second best path. Note that the second best path for each vertex in the left part has approximation ratio at least 2. Hence, with the claimed probability the ants on all vertices in the left part need to find shortest paths to un/2 in order to arrive at a 2-approximation.
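This ordering can be sanity-checked numerically on a small instance. The sketch below is illustrative only; the instance size and noise parameters are assumed values, and W mirrors the weights Wj defined above:

# Check for a small instance that even the maximal noisy length (factor beta)
# of p_{i,j} stays below the noise-free length of p_{i,j+1}, so an ant always
# prefers paths that follow more lower edges.
n, k, theta = 20, 2, 1
beta = 1 + k * theta * n

def W(j):
    # weight of the upper edge (u_j, u_{n/2})
    return 2 * n * (2 * beta) ** (j + 1)

i = 1
for j in range(n // 2 - i - 2):
    max_noisy_pij = beta * (j + W(i + j) + n)            # j lower edges, upper edge, suffix <= n
    min_real_pij_next = (j + 1) + W(i + j + 1) + n // 2  # real length of p_{i,j+1} plus suffix >= n/2
    assert max_noisy_pij < min_real_pij_next
print("second-best ordering verified on the sample instance")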
Having established this relation, we can re-use some arguments from Theorem 4 in [26]. However, as the mentioned result only makes a statement about an expected time, more effort is needed to turn this into a high-probability result.
If τmin ≥ 1/√n the following simple argument proves the claim. The ant at the leftmost vertex u1 has to make a correct decision between two edges n/2 − 2 times in order to find a 2-approximation. Even when the pheromones are best possible, τmin is still so large that the probability of always choosing the right edge is at most (1 − τmin/2)^(n/2−2) = e^(−Ω(√n)). This proves the claim. In the remainder of the proof we hence assume τmin < 1/√n.
As described before, we first investigate a first phase of √n · ln(τmax/τmin)/ρ iterations, followed by a second phase of n/(6τmin) steps.
Analysis of Phase 1. Call a vertex wrong if the best-so-far path starts with the
upper edge. By Chernoff bounds, with probability 1 − exp(−Ω(n)) we initially have
at least 4/9 · n/2 wrong vertices. Also observe that for a wrong vertex the probability
of taking the first edge of a shortest path is initialized with 1/2 and it decreases over
time towards τmin as long as the vertex remains wrong (cf. the upper bound on the
probability from the first statement of Lemma 1). After T ∗ iterations, the border τmin
is reached.
Call a vertex optimized if the corresponding ant has found a shortest path to un/2. Consider a vertex v with at least 8 log(1/ρ) + κ√n/log n wrong successors on its shortest path, for some small constant κ > 0. In order to optimize v, the ant starting at v has to make the correct decisions for all wrong successors. Lemma 4 in [44] states that the probability of optimizing v within 1/ρ − 1 ≥ 1/(2ρ) iterations is exp(−Ω(√n/log n)). The intuitive reason is that even if successors of v become optimized, the pheromones need some time to adapt (recall that the probability of choosing a correct edge is at most 1/2). Note that poly(n) · exp(−Ω(√n/log n)) is still of order exp(−Ω(√n/log n)) as we can decrease the hidden constant in the exponent to account for arbitrary polynomials poly(n).
Taking the union bound for all vertices v, we have that during 1/(2ρ) iterations with probability 1 − exp(−Ω(√n/log n)) only 8 log(1/ρ) + κ√n/log n wrong vertices are corrected.

After 2ρ√n · T∗ = 2√n · ln(τmax/τmin) such phases of 1/(2ρ) iterations each, which together cover the √n · T∗ iterations of Phase 1, with probability 1 − exp(−Ω(√n/log n)) at most 2√n · ln(τmax/τmin) · (8 log(1/ρ) + κ√n/log n) ≤ 1/18 · n/2 wrong vertices have been corrected, where the inequality holds if κ is chosen small enough and n is large enough.
Analysis of Phase 2. Assume the described "typical" events in Phase 1 have happened. Then we are in a situation where there are still 4/9 · n/2 − 1/18 · n/2 = 7/18 · n/2 wrong vertices left. For these vertices the probability of taking the lower edge has decreased to τmin as pheromone has been decreased for more than T∗ iterations. Now, if v has i wrong successors on its shortest path, the probability of optimizing v in the next iteration is at most τmin^i by the first statement of Lemma 1.
The following argument is borrowed from Theorem 17 in [20]. Imagine a 0–1-string of unbounded length where each bit is set to 1 independently with probability τmin. Then the random number of ones before the first zero follows the same geometric distribution as the number of optimized vertices (in fact, the former stochastically dominates the latter as the probabilities are not exactly τmin^i but smaller). The probability of optimizing 7/18 · n/2 wrong vertices in n/(6τmin) iterations is thus bounded by the probability of having at least 7/18 · n/2 = 7/36 · n ones among the first n/(6τmin) + 7/36 · n bits of the 0–1-string. Recall τmin ≤ 1/√n, hence the expected number of ones is n/6 + O(n · τmin) = n/6 + O(√n). This is by a constant factor smaller than 7/36 · n, if n is large enough. By a straightforward application of a Chernoff bound, the mentioned probability is exp(−Ω(n)).
Adding up all error probabilities, with probability 1 − exp(−Ω(√n/log n)) not all vertices have been optimized after n/(6τmin) + √n · T∗ iterations.
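The counting argument of Phase 2 can be visualized with a quick simulation; the sketch below is for illustration only and uses assumed values for n and τmin:

# Phase 2: the number of ones among the first n/(6*tau_min) + 7n/36 bits of a
# 0-1-string with one-probability tau_min concentrates around n/6 and thus
# almost never reaches the 7n/36 ones needed to optimize all wrong vertices.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
tau_min = 1 / np.sqrt(n)
bits = int(n / (6 * tau_min) + 7 * n / 36)
ones = rng.binomial(bits, tau_min, size=10_000)
print("expected ones:", round(bits * tau_min), " needed:", round(7 * n / 36))
print("Prob(enough ones):", np.mean(ones >= 7 * n / 36))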

Note that we can easily achieve an arbitrarily bad approximation ratio in the specified time period by increasing all weights Wi by some arbitrarily large factor. In fact, we have proven the following theorem stating that we cannot hope to achieve any specific approximation ratio within less than n/(6τmin) + √n · ln(τmax/τmin)/ρ iterations.

Theorem 4 For every r ∈ R+ there is a graph with n vertices and independent noise η(e) = η(e, p, t) ∼ Γ(k, θ) for each edge e, k ∈ N, such that with overwhelming probability MMASSDSP with 1/poly(n) ≤ ρ ≤ 1/2 and 1/poly(n) ≤ τmin ≤ 1/2 does not find a (1 + r)-approximation within the first n/(6τmin) + √n · ln(τmax/τmin)/ρ iterations.

The results for MMASSDSP in [26] state that the algorithm can indeed find
all shortest paths when given a little more time: an upper bound of O(n/τmin +
n log(τmax /τmin )/ρ) holds without noise. For stochastic shortest paths, the situation
is much different. Putting Lemmas 5 and 6 on the different parts of G together, we
obtain that the waiting time to obtain any approximation ratio better than 1 + ε is
exponential.

Theorem 5 Consider the graph Gn,kθ,ε = (V, E, w) with independent noise η(e) = η(e, p, t) ∼ Γ(k, θ) for each e ∈ E. Let ε ≤ 1, k = o(log n), k ∈ N, and ε/(kθ) ≤ 1/e − Ω(1). Choose the parameters 1/poly(n) ≤ τmin ≤ 1/2 and 1/poly(n) ≤ ρ ≤ 1/2. Then with probability 1 − exp(−Ω(√n/log n)) MMASSDSP does not achieve an approximation ratio better than (1 + ε) within the first e^(c√n) iterations where c > 0 is a small enough constant.

In particular, we get the following.

Corollary 1 Under the conditions given in Theorem 5, with overwhelming probability MMASSDSP does not find a (1 + kθ/3)-approximation within e^(c√n) iterations.

Proof of Theorem 5 Let δ < ε ≤ 1. According to Lemma 6 the algorithm does not find a 2-approximation, and hence no (1 + δ)-approximation, within the first n/(6τmin) + √n · T∗ iterations where T∗ = ln(τmax/τmin)/ρ. By the first statement of Lemma 5, after this time period the ant starting at the source of the right part has stored the upper path as best-so-far path. Furthermore, the probability that this best-so-far path is changed in one iteration is at most exp(−Ω(n)) by the second statement of Lemma 5. Recall that the upper path has weight (1 + ε) · (n/2) in Gn,kθ,ε while the lower path has weight n/2. Also recall that a (1 + δ)-approximation requires all best-so-far paths to be (1 + δ)-approximations of the shortest paths. Hence, by the union bound the probability that the ant av0 achieves a (1 + δ)-approximation within e^(cn) iterations is still exp(−Ω(n)) if c is small enough.

The result from Theorem 5 is due to the fact that the ant at v0 cannot store a (1 + δ)-approximation as best-so-far path. It can easily construct a real shortest path, but it does not recognize it as being a shortest path. Our negative results can easily be extended
towards more relaxed notions of approximations that accept an approximate path if it
is only found temporarily. Replace v0 by a chain of Ω(n) vertices such that all edges
leading to un/2 = v0 lead to the start of the chain instead and all edges leaving v0
leave from the end of the chain. All edges on the chain receive a tiny weight. In this
setting, all ants of the chain are basically in the same situation as the single ant on
v0 in the original graph. In order to achieve a (1 + δ)-approximation, all ants on the
chain must choose to follow their shortest paths via v0 , v1 , . . . , vn/2 . The probability
that in one iteration all ants decide to do so is exponentially small.

6 Perfectly Correlated Noise

In many stochastic optimization settings noise is not independent, but correlated.


In this section we look at a setting that is opposed to independent noise: we assume the same noise for all edges, i.e., there is a single value η = η(p, t) for each path p (i.e., for each ant) and each time t such that the noisy length of an edge e is given by w̃(e) := (1 + η)w(e). The reader might
think of ants reflecting traveling agents and each agent traveling at an individual
speed. Formally, we may still think of gamma-distributed η-values for all edges:
η(e1 , p, t), . . . , η(em , p, t) ∼ Γ (k, θ ) for k ∈ N and θ ∈ R+ , but they are all equal:
η(e1 , p, t) = · · · = η(em , p, t) = η(p, t) = η. Note that this corresponds to a perfect
correlation among the delays. The noisy length w̃(p) of a path p = (e1 , . . . , es ) then
equals

w̃(p) = Σ_{i=1}^{s} (1 + η(ei)) · w(ei) = (1 + η) · w(p).
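The difference to independent noise is easy to see in simulation. The following sketch (with assumed parameter values, chosen only for demonstration) samples the noisy length of a chain of s unit-weight edges under both noise models:

# Independent vs. perfectly correlated gamma noise on a path of s unit edges.
# Both models have the same expected noisy length, but under correlated noise
# the relative order of path lengths is preserved within one evaluation.
import numpy as np

rng = np.random.default_rng(3)
s, k, theta = 100, 2, 1.0
w = np.ones(s)

independent = np.sum((1 + rng.gamma(k, theta, size=(10_000, s))) * w, axis=1)
correlated = (1 + rng.gamma(k, theta, size=10_000)) * w.sum()

print(f"independent: mean {independent.mean():.1f}, std {independent.std():.1f}")
print(f"correlated:  mean {correlated.mean():.1f}, std {correlated.std():.1f}")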
When reconsidering the graph Gn,kθ,ε from Definition 1, we expect strongly cor-
related noise to be helpful as the noise values for the chain of lower edges in the right
part are likely to show similar deviations from their expectation. This enables the
ants to sample the lower path with small total noise. In fact, with perfectly correlated
noise we prove that the ants indeed can observe the shorter paths and find shortest
paths efficiently. The following theorem states an upper bound that depends on the
noise parameters and the value of ε that determines the gap between the weights on
the upper and lower paths in G, respectively.

Theorem 6 Consider the graph Gn,kθ,ε = (V, E, w) with the same noise η = η(p, t) ∼ Γ(k, θ) for each e ∈ E. Choose τmin ≤ 1/(2n); then in expectation MMASSDSP finds a 1-approximation after at most O(n/τmin + (kθ/ε)^k · e^(ε/θ)/τmin + n log(τmax/τmin)/ρ) iterations.

Proof We first consider the right part and estimate the expected time until the ant at
v0 has sampled a shortest path with such a low noise that the noisy length of its best-
so-far path is less than the real length of the sub-optimal path. By Lemma 1 the ant
follows the optimal path with probability at least τmin /(2e). The noisy length w̃(p)
of the shortest path is (1 + X)n/2 with X ∼ Γ (k, θ ). The real length of the upper
path is (1 + ε)n/2. By Lemma 3 we have

Prob((1 + X)n/2 ≤ (1 + ε)n/2) = Prob(X ≤ ε) ≥ ((ε/θ)^k / k!) · e^(−ε/θ) ≥ (ε/(kθ))^k · e^(−ε/θ).
k! kθ

The expected waiting time for following the shortest path and sampling it with low noise is O(1/τmin · (kθ/ε)^k · e^(ε/θ)). After T∗ = ln(τmax/τmin)/ρ more steps we have τ((v0, vn/2)) = τmin and this property is maintained forever. In particular, we have found a 1-approximation for all vertices in the right part.
A similar argument holds for the left part. We show that the ants in the left part
find shortest paths from right to left. Consider the ant starting from un/2−i with 1 ≤
i ≤ n/2. Assume that all ants at un/2−i+1 , . . . , un/2 have already found their shortest
paths and sampled it with such a low noise that they will stick to their shortest paths
forever. Also assume that the pheromones on their incorrect edges have decreased
to τmin . Under these assumptions the ant at un/2−i follows the optimal path with
probability at least τmin /(2e). The noisy length of the shortest path from un/2−i is
(1 + X)(i + n/2) ≤ (1 + X)n with X ∼ Γ (k, θ ) and the noisy length w̃(p) of a path
p starting with an upper edge is larger than 2n(1 + kθ n) + n/2. Using Lemma 4,

Prob((1 + X)n ≤ 2n(1 + kθ n) + n/2)


= Prob(X ≤ 3/2 + 2kθ n)
≥ Prob(X ≤ kθ n) ≥ 1 − exp(−Ω(n)).

This yields an expected number of O(1/τmin ) steps until the desired property
also holds for the best-so-far path from un/2−i . After T ∗ more steps we have
τ ((un/2−i , un/2 )) = τmin . Summing up all expected waiting times for the right part
and for all vertices un/2−1 , . . . , u1 in the left part yields the claimed time bound. 
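The gamma lower bound used at the start of the proof can be checked directly; the values for k, θ, and ε in the sketch below are assumptions made for illustration:

# Exact Prob(X <= eps) for X ~ Gamma(k, theta) vs. the lower bound
# (eps/(k*theta))^k * exp(-eps/theta) used in the proof of Theorem 6.
from scipy.stats import gamma
import numpy as np

k, theta, eps = 2, 1.0, 0.5
exact = gamma.cdf(eps, a=k, scale=theta)
bound = (eps / (k * theta)) ** k * np.exp(-eps / theta)
print(f"exact: {exact:.4f}, lower bound: {bound:.4f}")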

In particular, we conclude the following special case.

Corollary 2 For k = O(1) and ε = Ω(θ/n) Theorem 6 yields a polynomial upper bound, while Theorem 5 yields an exponential lower bound for the expected running time of MMASSDSP.

This example demonstrates that correlated noise really helps the ants to locate
shortest paths on the considered graph.

Finally, we give a general upper bound for perfectly correlated noise. We consider graphs with a gap between the weights of optimal and non-optimal paths.
In contrast to the general upper bound from Theorem 1 we have a smoother trade-off
between the gap size and the expected optimization time. This is expressed by the
parameter z in the following statement.

Theorem 7 Consider a graph G = (V, E, w) with weights w(e) > 0 and with the same noise η = η(p, t) ∼ Γ(k, θ) for each e ∈ E. Choose τmin ≤ 1/(ΔL) and let optv denote the real length of a shortest path from v. If for each vertex v ∈ V and some z ∈ N it holds that every non-optimal path has length at least (1 + kθ/z) · optv, then in expectation MMASSDSP finds a 1-approximation after O(((z + 1)^k · L log n)/τmin + L log(τmax/τmin)/ρ) iterations.

Proof The proof is very similar to the proof of Theorem 1. We call a vertex v optimized if w̃(pv∗) < (1 + kθ/z) · optv and processed if all pheromones on edges that are not part of shortest paths have decreased to τmin. Call an iteration good if the global noise is less than kθ/z. Define η1, ..., ηk ∼ Γ(1, θ) and note that η has the same distribution as Σ_{i=1}^{k} ηi. In particular, we have η < kθ/z if ηi < θ/z for all i. The probability for a good iteration is at least

Prob(η < kθ/z) ≥ ∏_{i=1}^{k} Prob(ηi < θ/z) ≥ (1 − e^(−1/z))^k ≥ (1 − 1/(1 + 1/z))^k = 1/(z + 1)^k.

By Chernoff bounds, the probability that in (z + 1)^k · (c log n)/τmin iterations we have at least 2 ln n · 2e/τmin good iterations is at least 1 − 1/n, if c > 0 is a constant chosen large enough. In the following we assume that this happens.
Recall the partition V0 , . . . , VL of V from Theorem 1. Assume that all vertices
in V0 , . . . , Vi−1 are processed and fix a vertex u ∈ Vi . The probability of finding a
shortest path from u is at least τmin /(2e). The probability that this happens at least
once in 2 ln n · 2e/τmin good iterations is
 
1 − (1 − τmin/(2e))^(2 ln n · 2e/τmin) ≥ 1 − e^(−2 ln n) = 1 − 1/n^2.

So, with a failure probability of at most 1/n + |Vi| · 1/n^2 ≤ 2/n all vertices in Vi become optimized within (z + 1)^k · (c log n)/τmin iterations. Repeating these arguments in case of a failure establishes a bound of O((z + 1)^k · (log n)/τmin) on the expected time until all vertices in Vi are optimized. Summing up all these times and adding terms ln(τmax/τmin)/ρ for optimized vertices to become processed yields the claimed bound.
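Again the good-iteration probability can be checked numerically; the parameters in the sketch below are assumed values chosen for illustration:

# Prob(eta < k*theta/z) for eta ~ Gamma(k, theta) vs. the bound 1/(z+1)^k
# from the proof of Theorem 7.
from scipy.stats import gamma

k, theta, z = 3, 2.0, 4
exact = gamma.cdf(k * theta / z, a=k, scale=theta)
bound = 1 / (z + 1) ** k
print(f"Prob(eta < k*theta/z) = {exact:.4f} >= {bound:.4f}")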
7 Conclusions

We have presented a first analysis of ACO for a stochastic combinatorial problem:


finding shortest paths with noisy edge weights. Different settings of (nonnegative)
noisy edge weights have been examined: general noise, independent noise, indepen-
dent gamma-distributed noise, and perfectly correlated noise. We have characterized
on which instances ACO can still find shortest paths in case there is a gap between
the shortest-path length and the lengths of non-optimal paths. For general weights
we have given general upper bounds on the approximation ratio that can be obtained
efficiently.
For gamma-distributed noise we have constructed a setting where noise and the
different variances for upper and lower paths trick the ants into following the path
with the larger variance, but larger real length. The ants can be seen to become risk-
seeking in this scenario. The expected time until a good approximation is found then
becomes exponential. Another insight is that this effect vanishes when considering
perfectly correlated noise. Many results have established trade-offs between the ex-
pected running times on the one hand and the expected noise strength or approxima-
tion guarantees on the other hand.
Future work could deal with other variants of stochastic shortest path problems.
Also other noise functions could be investigated, including functions with negative
noise or noise with moderate correlation. It is also not clear whether and in which set-
tings it is possible to prove better general upper bounds that avoid the term (L + 1)/2
in the approximation ratio of Theorem 3 or the level-wise blow-up of the approxi-
mation ratio in Theorem 2, respectively. Finally, an open question is whether ACO
shows similar degrees of robustness for other stochastic problems.
Acknowledgements The authors thank Samitha Samaranayake and Sébastien Blandin from UC Berke-
ley for references and discussions on stochastic shortest path problems and Thomas Sauerwald for pointing
us to Feige [23]. Dirk Sudholt was partly supported by EPSRC grant EP/D052785/1 and a postdoctoral fel-
lowship from the German Academic Exchange Service while visiting the International Computer Science
Institute in Berkeley, CA, USA.

References
1. Abraham, I., Fiat, A., Goldberg, A.V., Werneck, R.F.F.: Highway dimension, shortest paths, and prov-
ably efficient algorithms. In: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algo-
rithms (SODA’10), pp. 782–793. SIAM, Philadelphia (2010)
2. Attiratanasunthron, N., Fakcharoenphol, J.: A running time analysis of an ant colony optimization
algorithm for shortest paths in directed acyclic graphs. Inf. Process. Lett. 105(3), 88–92 (2008)
3. Bast, H., Funke, S., Sanders, P., Schultes, D.: Fast routing in road networks with transit nodes. Science
316(5824), 566 (2007)
4. Baswana, S., Biswas, S., Doerr, B., Friedrich, T., Kurur, P.P., Neumann, F.: Computing single source
shortest paths using single-objective fitness functions. In: Proceedings of the International Workshop
on Foundations of Genetic Algorithms (FOGA’09), pp. 59–66. ACM, New York (2009)
5. Bertsekas, D.P., Tsitsiklis, J.N.: An analysis of stochastic shortest path problems. Math. Oper. Res.
16(3), 580–595 (1991)
6. Borkar, V., Das, D.: A novel ACO algorithm for optimization via reinforcement and initial bias.
Swarm Intell. 3(1), 3–34 (2009)
7. Boyan, J.A., Mitzenmacher, M.: Improved results for route planning in stochastic transportation. In:
Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’01), pp. 895–
902. SIAM, Philadelphia (2001)
8. Chan, T.M.: More algorithms for all-pairs shortest paths in weighted graphs. In: Proceedings of
the Annual ACM Symposium on Theory of Computing (STOC’07), pp. 590–598. ACM, New York
(2007)
9. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. MIT
Press, Cambridge (2001)
10. Doerr, B., Johannsen, D.: Edge-based representation beats vertex-based representation in shortest path
problems. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO’10),
pp. 759–766. ACM, New York (2010)
11. Doerr, B., Theile, M.: Improved analysis methods for crossover-based algorithms. In: Proceedings of
the Genetic and Evolutionary Computation Conference (GECCO’09), pp. 247–254. ACM, New York
(2009)
12. Doerr, B., Happ, E., Klein, C.: A tight analysis of the (1+1)-EA for the single source shortest path
problem. In: Proceedings of the IEEE Congress on Evolutionary Computation (CEC’07), pp. 1890–
1895. IEEE Press, New York (2007)
13. Doerr, B., Happ, E., Klein, C.: Crossover can provably be useful in evolutionary computation. In:
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO’08), pp. 539–546.
ACM, New York (2008)
14. Doerr, B., Johannsen, D., Kötzing, T., Neumann, F., Theile, M.: More effective crossover operators
for the all-pairs shortest path problem. In: Proceedings of the International Conference on Parallel
Problem Solving from Nature (PPSN’10), pp. 184–193. Springer, Berlin (2010)
15. Doerr, B., Neumann, F., Sudholt, D., Witt, C.: Runtime analysis of the 1-ANT ant colony optimizer.
Theor. Comput. Sci. 412(17), 1629–1644 (2011)
16. Dorigo, M., Blum, C.: Ant colony optimization theory: a survey. Theor. Comput. Sci. 344(2–3), 243–
278 (2005)
17. Dorigo, M., Gambardella, L.M.: Ant colony system: a cooperative learning approach to the traveling
salesman problem. IEEE Trans. Evol. Comput. 1(1), 53–66 (1997)
18. Dorigo, M., Stützle, T.: Ant Colony Optimization, 1st edn. MIT Press, Cambridge (2004)
19. Dorigo, M., Maniezzo, V., Colorni, A.: The ant system: an autocatalytic optimizing process. Technical
Report 91-016 Revised, Politecnico di Milano (1991)
20. Droste, S., Jansen, T., Wegener, I.: On the analysis of the (1+1) evolutionary algorithm. Theor. Com-
put. Sci. 276(1–2), 51–81 (2002)
21. Dubhashi, D., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms.
Cambridge University Press, Cambridge (2009)
22. Fan, Y.Y., Kalaba, R.E., Moore, J.E.: Arriving on time. J. Optim. Theory Appl. 127(3), 497–513
(2005)
23. Feige, U.: On sums of independent random variables with unbounded variance and estimating the
average degree in a graph. SIAM J. Comput. 35(4), 964–984 (2006)
24. Gutjahr, W.J., Sebastiani, G.: Runtime analysis of ant colony optimization with best-so-far reinforce-
ment. Methodol. Comput. Appl. Probab. 10(3), 409–433 (2008)
25. Horoba, C.: Exploring the runtime of an evolutionary algorithm for the multi-objective shortest path
problem. Evol. Comput. 18(3), 357–381 (2010)
26. Horoba, C., Sudholt, D.: Running time analysis of ACO systems for shortest path problems. In:
Proceedings of the International Workshop on Engineering Stochastic Local Search Algorithms
(SLS ’09), pp. 76–91. Springer, Berlin (2009)
27. Horoba, C., Sudholt, D.: Ant colony optimization for stochastic shortest path problems. In: Pro-
ceedings of the Genetic and Evolutionary Computation Conference (GECCO’10), pp. 1465–1472.
ACM, New York (2010)
28. Kolavali, S.R., Bhatnagar, S.: Ant colony optimization algorithms for shortest path problems. In:
Altman, E., Chaintreau, A. (eds.) Network Control and Optimization, pp. 37–44. Springer, Berlin
(2009)
29. Kötzing, T., Lehre, P.K., Oliveto, P.S., Neumann, F.: Ant colony optimization and the minimum cut
problem. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO’10),
pp. 1393–1400. ACM, New York (2010)
30. Kötzing, T., Neumann, F., Röglin, H., Witt, C.: Theoretical properties of two ACO approaches for
the traveling salesman problem. In: Proceedings of the International Conference on Ant Colony Opti-
mization and Swarm Intelligence (ANTS’10), pp. 324–335. Springer, Berlin (2010)
31. Kötzing, T., Neumann, F., Sudholt, D., Wagner, M.: Simple Max-Min ant systems and the optimiza-
tion of linear pseudo-Boolean functions. In: Proceedings of the 11th Workshop on Foundations of
Genetic Algorithms (FOGA 2011), pp. 209–218. ACM, New York (2011)
32. Miller-Hooks, E.D., Mahmassani, H.S.: Least expected time paths in stochastic, time-varying trans-
portation networks. Transp. Sci. 34(2), 198–215 (2000)
33. Neumann, F., Theile, M.: How crossover speeds up evolutionary algorithms for the multi-criteria
all-pairs-shortest-path problem. In: Proceedings of the International Conference on Parallel Problem
Solving from Nature (PPSN’10), pp. 667–676. Springer, Berlin (2010)
34. Neumann, F., Witt, C.: Ant colony optimization and the minimum spanning tree problem. Theor.
Comput. Sci. 411(25), 2406–2413 (2010)
35. Neumann, F., Witt, C.: Runtime analysis of a simple ant colony optimization algorithm. Algorithmica
54(2), 243–255 (2009)
36. Neumann, F., Sudholt, D., Witt, C.: Rigorous analyses for the combination of ant colony optimization
and local search. In: Proceedings of the International Conference on Ant Colony Optimization and
Swarm Intelligence (ANTS’08), pp. 132–143. Springer, Berlin (2008)
37. Neumann, F., Sudholt, D., Witt, C.: Analysis of different MMAS ACO algorithms on unimodal func-
tions and plateaus. Swarm Intell. 3(1), 35–68 (2009)
38. Neumann, F., Sudholt, D., Witt, C.: A few ants are enough: ACO with iteration-best update. In: Pro-
ceedings of the Genetic and Evolutionary Computation Conference (GECCO’10), pp. 63–70. ACM,
New York (2010)
39. Nikolova, E., Brand, M., Karger, D.R.: Optimal route planning under uncertainty. In: Proceedings
of the International Conference on Automated Planning and Scheduling (ICAPS’06), pp. 131–141.
AAAI Press, Menlo Park (2006)
40. Orlin, J.B., Madduri, K., Subramani, K., Williamson, M.: A faster algorithm for the single source
shortest path problem with few distinct positive lengths. J. Discrete Algorithms 8(2), 189–198 (2010)
41. Papadimitriou, C.H., Yannakakis, M.: Shortest paths without a map. Theor. Comput. Sci. 84(1), 127–
150 (1991)
42. Scharnow, J., Tinnefeld, K., Wegener, I.: The analysis of evolutionary algorithms on sorting and
shortest paths problems. J. Math. Model. Algorithms 3(4), 349–366 (2004)
43. Sudholt, D.: Using Markov-chain mixing time estimates for the analysis of ant colony optimization.
In: Proceedings of the 11th Workshop on Foundations of Genetic Algorithms (FOGA 2011), pp. 139–
150. ACM, New York (2011)
44. Sudholt, D., Thyssen, C.: Running time analysis of ant colony optimization for shortest path problems.
J. Discrete Algorithms. doi:10.1016/j.jda.2011.06.002 (2011, to appear)
45. Zhou, Y.: Runtime analysis of an ant colony optimization algorithm for TSP instances. IEEE Trans.
Evol. Comput. 13(5), 1083–1092 (2009)
