Nonconvex Optimization for Communication Systems
Mung Chiang
Electrical Engineering Department
Princeton University, Princeton, NJ 08544, USA
[email protected]
Summary. Convex optimization has provided both a powerful tool and an intriguing mentality to the analysis and design of communication systems over the last
few years. A main challenge today lies in the nonconvex problems in these applications.
This paper presents an overview of some of the important nonconvex optimization
problems in point-to-point and networked communication systems. Three typical applications are covered: Internet congestion control through nonconcave network utility maximization, wireless network power control through geometric and sigmoidal
programming, and DSL spectrum management through distributed nonconvex optimization. A variety of nonconvex optimization techniques are showcased: from
standard dual relaxation to sum-of-squares programming through successive SDP
relaxation, signomial programming through successive GP relaxation, and leveraging
the specific structures in problems for efficient and distributed heuristics.

Key words: Nonconvex optimization, Geometric programming, Semidefinite programming, Sum of squares, Duality, Network utility maximization,
TCP/IP, Wireless network, Power control.

1 Introduction
There have been two major waves in the history of optimization theory: the
first started with linear programming (LP) and the simplex method in the late 1940s, and
the second with convex optimization and the interior point method in the late 1980s.
Each has been followed by a transforming period of appreciation-application
cycles: as more people appreciate the use of LP/convex optimization, more
look for their formulations in various applications; then more work on its theory, efficient algorithms, and software; the more powerful the tools become;
then more people appreciate its usage. Communication systems benefit significantly from both waves, including multicommodity flow solutions (e.g., the
Bellman-Ford algorithm) from LP, and basic network utility maximization
and robust transceiver design from convex optimization.

Chiang: Nonconvex Optimization for Communication Systems

Much of the current research frontier is about the potential of the third
wave, on nonconvex optimization. If one word is used to differentiate between easy and hard problems, convexity is probably the watershed. But if
a longer description length is allowed, many useful conclusions can be drawn
even for nonconvex optimization. Indeed, convexity is a very disturbing watershed, since it is not a topological invariant under change of variable (e.g.,
see geometric programming) or higher-dimension embedding (e.g., see sum of
squares method). A variety of approaches have been proposed, from nonlinear
transformation to turn an apparently nonconvex problem into a convex problem, to characterization of attraction regions and systematically jumping out
of a local optimum, from successive convex approximation to dualization, from
leveraging the specific structures of the problems (e.g., Difference of Convex
functions, concave minimization, low rank nonconvexity) to developing more
efficient branch-and-bound procedures.
Researchers in communications and networking have been examining nonconvex optimization using domain-specific structures in important problems
in the areas of wireless networking, Internet engineering, and communication
theory. Perhaps four typical topics best illustrate the variety of challenging
issues arising from nonconvex optimization in communication systems:

• Nonconvex objective to be minimized. An example is congestion control for inelastic applications.
• Nonconvex constraint set. An example is power control in low SIR regimes.
• Integer constraints. Two important examples are single path routing and multiuser detection.
• Constraint sets that are convex but require an exponential number of inequalities to explicitly describe. An example is optimal scheduling.

This chapter overviews the latest results in recent publications about the
first two topics, with a particular focus on showing the connections between
the engineering intuitions about important problems in communication systems and the state-of-the-art algorithms in nonconvex optimization theory.

2 Internet Congestion Control


2.1 Introduction
Basic network utility maximization
Since the publication of the seminal paper [24] by Kelly, Maulloo, and Tan in
1998, the framework of Network Utility Maximization (NUM) has found many
applications in network rate allocation algorithms and Internet congestion
control protocols (e.g., surveyed in [32, 47]). It has also led to a systematic
understanding of the entire network protocol stack in the unifying framework
(e.g., surveyed in [11, 31]). By allowing nonlinear concave utility objective

Special Volume

functions, NUM substantially expands the scope of the classical LP-based Network Flow Problems.
Consider a communication network with $L$ links, each with a fixed capacity of $c_l$ bps, and $S$ sources (i.e., end users), each transmitting at a source rate of $x_s$ bps. Each source $s$ emits one flow, using a fixed set $L(s)$ of links in its path, and has a utility function $U_s(x_s)$. Each link $l$ is shared by a set $S(l)$ of sources. Network Utility Maximization (NUM), in its basic version, is the following problem of maximizing the total utility of the network $\sum_s U_s(x_s)$, over the source rates $x$, subject to linear flow constraints $\sum_{s: l \in L(s)} x_s \le c_l$ for all links $l$:
\[
\begin{array}{ll}
\mbox{maximize} & \sum_s U_s(x_s) \\
\mbox{subject to} & \sum_{s \in S(l)} x_s \le c_l, \quad \forall l, \\
& x \succeq 0
\end{array}
\tag{1}
\]
where the variables are $x \in \mathbb{R}^S$.
There are many nice properties of the basic NUM model due to several
simplifying assumptions of the utility functions and flow constraints, which
provide the mathematical tractability of problem (1) but also limit its applicability. In particular, the utility functions {Us } are often assumed to be
increasing and strictly concave functions.
Assuming that Us (xs ) becomes concave for large enough xs is reasonable,
because the law of diminishing marginal utility eventually will be effective.
However, Us may not be concave throughout its domain. In his seminal paper
published a decade ago, Shenker [45] differentiated inelastic network traffic
from elastic traffic. Utility functions for elastic traffic were modeled as strictly
concave functions. While inelastic flows with nonconcave utility functions represent important applications in practice, they have received little attention,
and rate allocation among them has scarcely any mathematical foundation,
except for three recent publications [28, 12, 15] (see also earlier work in [54, 29, 30]
related to the approach in [28]).
In this section, we investigate the extension of the basic NUM to maximization of nonconcave utilities, as in the approach of [15]. We provide a
centralized algorithm for off-line analysis and establishment of a performance
benchmark for nonconcave utility maximization when the utility function is a
polynomial or signomial. Based on the semialgebraic approach to polynomial
optimization, we employ convex sum-of-squares (SOS) relaxations solved by
a sequence of semidefinite programs (SDP), to obtain increasingly tighter upper bounds on total achievable utility for polynomial utilities. Surprisingly, in
all our experiments, a very low order and often a minimal order relaxation
yields not just a bound on attainable network utility, but the globally maximized network utility. When the bound is exact, which can be proved using
a sufficient test, we can also recover a globally optimal rate allocation.


Canonical distributed algorithm


A reason that the assumption of utility function concavity is upheld in
almost all papers on NUM is that it leads to three highly desirable mathematical properties of the basic NUM:

• It is a convex optimization problem, therefore the global optimum can be computed (at least in centralized algorithms) in worst-case polynomial-time complexity [5].
• Strong duality holds for (1) and its Lagrange dual problem. Zero duality gap enables a dual approach to solve (1).
• Optimization of a separable objective function over linear constraints can be conducted by distributed algorithms based on the dual approach.

Indeed, the basic NUM (1) is such a nice optimization problem that
its theoretical and computational properties have been well studied since the
1960s in the field of monotropic programming, e.g., as summarized in [41].
For network rate allocation problems, a dual-based distributed algorithm has
been widely studied (e.g., in [24, 32]), and is summarized below.
Zero duality gap for (1) states that solving the Lagrange dual problem
is equivalent to solving the primal problem (1). The Lagrange dual problem
is readily derived. We first form the Lagrangian of (1):
\[
L(x, \lambda) = \sum_s U_s(x_s) + \sum_l \lambda_l \Big( c_l - \sum_{s \in S(l)} x_s \Big)
\]
where $\lambda_l \ge 0$ is the Lagrange multiplier (link congestion price) associated with the linear flow constraint on link $l$. Additivity of total utility and linearity of flow constraints lead to a Lagrangian dual decomposition into individual source terms:
\[
L(x, \lambda) = \sum_s \Big[ U_s(x_s) - \Big( \sum_{l \in L(s)} \lambda_l \Big) x_s \Big] + \sum_l c_l \lambda_l
= \sum_s L_s(x_s, \lambda^s) + \sum_l c_l \lambda_l
\]
where $\lambda^s = \sum_{l \in L(s)} \lambda_l$. For each source $s$, $L_s(x_s, \lambda^s) = U_s(x_s) - \lambda^s x_s$ only depends on local $x_s$ and the link prices $\lambda_l$ on those links used by source $s$.
The Lagrange dual function $g(\lambda)$ is defined as the maximized $L(x, \lambda)$ over $x$. This net utility maximization obviously can be conducted distributively by each source, as long as the aggregate link price $\lambda^s = \sum_{l \in L(s)} \lambda_l$ is available to source $s$, where source $s$ maximizes a strictly concave function $L_s(x_s, \lambda^s)$ over $x_s$ for a given $\lambda^s$:
\[
x_s^*(\lambda^s) = \mathop{\rm argmax} \left[ U_s(x_s) - \lambda^s x_s \right], \quad \forall s. \tag{2}
\]


The Lagrange dual problem is
\[
\begin{array}{ll}
\mbox{minimize} & g(\lambda) = L(x^*(\lambda), \lambda) \\
\mbox{subject to} & \lambda \succeq 0
\end{array}
\tag{3}
\]
where the optimization variable is $\lambda$. Any algorithm that finds a pair of primal-dual variables $(x, \lambda)$ that satisfy the KKT optimality condition would solve (1) and its dual problem (3). One possibility is a distributed, iterative subgradient method, which updates the dual variables $\lambda$ to solve the dual problem (3):
\[
\lambda_l(t+1) = \Big[ \lambda_l(t) - \alpha(t) \Big( c_l - \sum_{s \in S(l)} x_s(\lambda^s(t)) \Big) \Big]^+, \quad \forall l \tag{4}
\]
where $t$ is the iteration number and $\alpha(t) > 0$ are step sizes. Certain choices of step sizes, such as $\alpha(t) = \alpha_0 / t$, $\alpha_0 > 0$, guarantee that the sequence of dual variables $\lambda(t)$ will converge to the dual optimal $\lambda^*$ as $t \to \infty$. The primal variable $x(\lambda(t))$ will also converge to the primal optimal variable $x^*$. For a primal problem that is a convex optimization, the convergence is towards the global optimum.
The sequence of the pair of algorithmic steps (2,4) forms a canonical distributed algorithm that globally solves network utility optimization problem (1) and the dual (3) and computes the optimal rates $x^*$ and link prices $\lambda^*$.
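As a rough illustration of how the source step (2) and the link price step (4) interact, the following Python sketch runs the canonical distributed algorithm on a toy instance. Everything specific here is an assumption made for the example and is not taken from the text: the two-link, three-source topology, the capacities, the logarithmic utilities $U_s(x_s) = \log x_s$ (for which step (2) has the closed form $x_s = 1/\lambda^s$), and a small constant step size used in place of the diminishing rule mentioned above.

```python
import math

# Assumed toy network: links[l] lists the sources that use link l.
# Source 0 crosses both links; sources 1 and 2 each use a single link.
links = [[0, 1], [0, 2]]
capacity = [1.0, 2.0]
num_sources = 3

def source_rate(price):
    """Step (2) for U_s(x) = log(x): the maximizer of log(x) - price * x is 1 / price."""
    return 1.0 / max(price, 1e-9)  # guard against a zero aggregate price

def dual_subgradient(step=0.1, iterations=2000):
    lam = [1.0 for _ in links]  # initial link prices (arbitrary positive start)
    rates = [0.0] * num_sources
    for _ in range(iterations):
        # Each source adds up the prices of the links on its path, then solves (2).
        path_price = [sum(lam[l] for l, users in enumerate(links) if s in users)
                      for s in range(num_sources)]
        rates = [source_rate(p) for p in path_price]
        # Each link applies update (4): its price rises when load exceeds capacity.
        for l, users in enumerate(links):
            load = sum(rates[s] for s in users)
            lam[l] = max(0.0, lam[l] - step * (capacity[l] - load))
    return rates, lam

rates, prices = dual_subgradient()
print([round(x, 4) for x in rates], [round(p, 4) for p in prices])
```

For this instance the KKT conditions can be solved by hand, giving $x_1 = 1 - 1/\sqrt{3}$ for the source that crosses both links, which the iteration should approach. Note that each link only needs its own load and each source only needs its own path price, which is the sense in which the algorithm is distributed.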
Nonconcave Network Utility Maximization

It is known that for many multimedia applications, user satisfaction may assume a nonconcave shape as a function of the allocated rate. For example, the utility for streaming applications is better described by a sigmoidal function: with a convex part at low rate and a concave part at high rate, and a single inflection point $x^0$ (with $U_s''(x^0) = 0$) separating the two parts. The concavity assumption on $U_s$ is also related to the elasticity assumption on rate demands by users. When demands for $x_s$ are not perfectly elastic, $U_s(x_s)$ may not be concave.
Suppose we remove the critical assumption that $\{U_s\}$ are concave functions, and allow them to be any nonlinear functions. The resulting NUM becomes a nonconvex optimization problem and is significantly harder to analyze and solve, even by centralized computational methods. In particular, a local optimum may not be a global optimum and the duality gap can be strictly positive. The standard distributed algorithms that solve the dual problem may produce infeasible or suboptimal rate allocations.
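A small numerical sketch can make the strictly positive duality gap concrete. The instance below is hypothetical and chosen only for illustration: two sources share one link of capacity 1.5, each with the convex utility $U(x) = x^2$, and rates are additionally restricted to $[0, 1]$ so that the per-source maximization inside the dual function stays bounded (this box constraint is an assumption not present in problem (1)). Both the primal and the dual optimum are found by brute-force grid search rather than by any algorithm from the text.

```python
# Two sources share one link of capacity 1.5; each has the convex (hence
# nonconcave) utility U(x) = x**2. Rates are boxed to [0, 1] so that the
# per-source maximization in the dual stays bounded (an added assumption).
STEPS = 100                      # grid resolution: rates are k / STEPS
CAP = 150                        # capacity 1.5 expressed in grid units
grid = [k / STEPS for k in range(STEPS + 1)]

def utility(x):
    return x * x

# Primal optimum by exhaustive search over feasible grid points.
primal = max(utility(grid[i]) + utility(grid[j])
             for i in range(STEPS + 1)
             for j in range(STEPS + 1)
             if i + j <= CAP)

def dual_function(price):
    # g(price) = sum over sources of max_x [U(x) - price * x] + price * capacity
    best_net = max(utility(x) - price * x for x in grid)
    return 2 * best_net + price * CAP / STEPS

# Dual optimum by search over a grid of prices in [0, 3].
dual = min(dual_function(k / STEPS) for k in range(3 * STEPS + 1))

print("primal optimum:", primal)
print("dual optimum:  ", dual)
print("duality gap:   ", dual - primal)
```

At the dual-optimal price each source is indifferent between sending at rate 0 and rate 1, so the rates read off from the dual are either (1, 1), which violates the capacity, or a combination that is feasible but suboptimal. This is the behavior the paragraph above warns about.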
Despite such difficulties, there have been two very recent publications on distributed algorithms for nonconcave utility maximization. In [28], a self-regulation heuristic is proposed to avoid the resulting oscillation in rate allocation and is shown to converge to an optimal rate allocation asymptotically when the proportion of nonconcave utility sources vanishes. In [12], a set of sufficient conditions and necessary conditions is presented under which the canonical distributed algorithm still converges to the globally optimal solution. However, these conditions may not hold in many cases. These two approaches illustrate the choice between admission control and capacity planning to deal with nonconvexity (see also the discussion in [23]). But neither approach provides a theoretically polynomial-time and practically efficient algorithm (distributed or centralized) for nonconcave utility maximization.

[Figure 1: plot of utility U(x) versus rate x.]
Fig. 1. Some examples of utility functions $U_s(x_s)$: they can be concave or sigmoidal as shown in the graph, or any general nonconcave function. If the bottleneck link capacity used by the source is small enough, i.e., if the dotted vertical line is pushed to the left, a sigmoidal utility function effectively becomes a convex utility function.

In this section, we remove the concavity assumption on utility functions,
thus turning NUM into a nonconvex optimization problem with a strictly positive duality gap. Such problems in general are NP hard, thus extremely unlikely to be polynomial-time solvable even by centralized computations. Using
a family of convex semidefinite programming (SDP) relaxations based on the
sum-of-squares (SOS) relaxation and the Positivstellensatz Theorem in real
algebraic geometry, we apply a centralized computational method to bound
the total network utility in polynomial-time. A surprising result is that for all
the examples we have tried, wherever we could verify the result, the tightest
possible bound (i.e., the globally optimal solution) of NUM with nonconcave
utilities is computed with a very low order relaxation. This efficient numerical method for off-line analysis also provides the benchmark for distributed
heuristics.
These three different approaches: proposing distributed but suboptimal
heuristics (for sigmoidal utilities) in [28], determining optimality conditions
for the canonical distributed algorithm to converge globally (for all nonlinear
utilities) in [12], and proposing efficient but centralized method to compute
the global optimum (for a wide class of utilities that can be transformed into
polynomial utilities) in [15] and this section, are complementary in the study
of distributed rate allocation by nonconcave NUM.
2.2 Global maximization of nonconcave network utility
Sum-of-squares method
We would like to bound the maximum network utility by $\gamma$ in polynomial time and search for a tight bound. Even without link capacity constraints, maximizing a polynomial is already an NP-hard problem, but it can be relaxed into an SDP [46]. This is because testing if the bounding inequality $\gamma \ge p(x)$ holds, where $p(x)$ is a polynomial of degree $d$ in $n$ variables, is equivalent to testing the nonnegativity of $\gamma - p(x)$, which can be relaxed into testing if $\gamma - p(x)$ can be written as a sum of squares (SOS): $\gamma - p(x) = \sum_{i=1}^r q_i(x)^2$ for some polynomials $q_i$, where the degree of $q_i$ is less than or equal to $d/2$. This is referred to as the SOS relaxation. If a polynomial can be written as a sum of squares, it must be nonnegative, but not vice versa. Conditions under which this relaxation is tight have been studied since Hilbert. Determining if a sum of squares decomposition exists can be formulated as an SDP feasibility problem, and is thus polynomial-time solvable.
Constrained nonconcave NUM can be relaxed by a generalization of the Lagrange duality theory, which involves nonlinear combinations of the constraints instead of linear combinations in the standard duality theory. The key result is the Positivstellensatz, due to Stengle [48], in real algebraic geometry, which states that for a system of polynomial inequalities, either there exists a solution in $\mathbb{R}^n$ or there exists a polynomial which is a certificate that no solution exists. This infeasibility certificate was recently shown to be also computable by an SDP of sufficient size [38, 37], a process that is referred to as the sum-of-squares method and automated by the software SOSTOOLS [39] initiated by Parrilo in 2000. For a complete theory and many applications of SOS methods, see [38] and references therein.
Furthermore, the bound $\gamma$ itself can become an optimization variable in
the SDP and can be directly minimized. A nested family of SDP relaxations,
each indexed by the degree of the certificate polynomial, is guaranteed to
produce the exact global maximum. Of course, given the problem is NP hard,
it is not surprising that the worst-case degree of certificate (thus the number
of SDP relaxations needed) is exponential in the number of variables. What
is interesting is the observation that in applying SOSTOOLS to nonconcave
utility maximization, a very low order, often the minimum order relaxation
already produces the globally optimal solution.
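To make the mechanics of an SOS bound tangible, here is a deliberately tiny Python sketch for the unconstrained, univariate, quadratic case. It is not the SOSTOOLS workflow: for a quadratic $p(x) = a x^2 + b x + c$ in the monomial basis $[1, x]$ the Gram matrix of $\gamma - p(x)$ is unique, so the SDP feasibility test reduces to a closed-form positive semidefiniteness check on a 2x2 matrix, and the minimization over $\gamma$ is replaced here by plain bisection. The function names and the example polynomial are assumptions made for the illustration.

```python
def gram_matrix(gamma, a, b, c):
    """Gram matrix Q with gamma - (a x^2 + b x + c) = [1, x] Q [1, x]^T."""
    return [[gamma - c, -b / 2.0],
            [-b / 2.0, -a]]

def is_psd_2x2(q, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    det = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    return q[0][0] >= -tol and q[1][1] >= -tol and det >= -tol

def tightest_sos_bound(a, b, c, lo=-100.0, hi=100.0, iters=200):
    """Bisection on gamma: smallest value for which gamma - p(x) is SOS.

    Assumes the bound lies in [lo, hi] and that a <= 0, so that the set of
    feasible gamma is an interval extending to the right.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_psd_2x2(gram_matrix(mid, a, b, c)):
            hi = mid
        else:
            lo = mid
    return hi

# p(x) = 2x - x^2, whose true maximum is 1 at x = 1.
bound = tightest_sos_bound(a=-1.0, b=2.0, c=0.0)
print(round(bound, 6))
```

In this toy case the SOS bound coincides with the true maximum because $1 - p(x) = (1 - x)^2$ is itself a square. In higher degree and with several variables the Gram matrix is no longer unique, which is why a semidefinite program is needed to search over it.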
Application of SOS method to nonconcave NUM
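Using sum-of-squares and the Positivstellensatz, we set up the following problem whose objective value converges to the optimal value of problem (1), where $\{U_s\}$ are now general polynomials, as the degree of the polynomials involved is increased.
\[
\begin{array}{ll}
\mbox{minimize} & \gamma \\
\mbox{subject to} & \gamma - \sum_s U_s(x_s) - \sum_l \lambda_l(x) \big( c_l - \sum_{s \in S(l)} x_s \big) \\
& \quad - \sum_{j,k} \lambda_{jk}(x) \big( c_j - \sum_{s \in S(j)} x_s \big) \big( c_k - \sum_{s \in S(k)} x_s \big) \\
& \quad - \cdots - \lambda_{12 \ldots n}(x) \big( c_1 - \sum_{s \in S(1)} x_s \big) \cdots \big( c_n - \sum_{s \in S(n)} x_s \big) \\
& \mbox{is SOS}, \\
& \lambda_l(x), \lambda_{jk}(x), \ldots, \lambda_{12 \ldots n}(x) \mbox{ are SOS.}
\end{array}
\tag{5}
\]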

The optimization variables are $\gamma$ and all of the coefficients in polynomials $\lambda_l(x), \lambda_{jk}(x), \ldots, \lambda_{12 \ldots n}(x)$. Note that $x$ is not an optimization variable; the constraints hold for all $x$, therefore imposing constraints on the coefficients. This formulation uses Schmüdgen's representation of positive polynomials over compact sets [44] (see footnote 1). Two alternative representations are discussed in [15].
Let D be the degree of the expression in the first constraint in (5). We
refer to problem (5) as the SOS relaxation of order D for the constrained
NUM. For a fixed D, the problem can be solved via SDP. As D is increased,
the expression includes more terms, the corresponding SDP becomes larger,
and the relaxation gives tighter bounds. An important property of this nested
family of relaxations is guaranteed convergence of the bound to the global
maximum.
Regarding the choice of degree $D$ for each level of relaxation, clearly a polynomial of odd degree cannot be SOS, so we need to consider only the cases where the expression has even degree. Therefore, the degree of the first non-trivial relaxation is the smallest even number greater than or equal to the degree of $\sum_s U_s(x_s)$, and the degree is increased by 2 for the next level.
A key question now becomes: How do we find out, after solving an SOS relaxation, if the bound happens to be exact? Fortunately, there is a sufficient test that can reveal this, using the properties of the SDP and its dual solution. In [19, 26], a parallel set of relaxations, equivalent to the SOS ones, is developed in the dual framework. The dual of checking the nonnegativity of a polynomial over a semi-algebraic set turns out to be finding a sequence of moments that represent a probability measure with support in that set. To be a valid set of moments, the sequence should form a positive semidefinite moment matrix. Then, each level of relaxation fixes the size of this matrix, i.e., considers moments up to a certain order, and therefore solves an SDP. This is equivalent to fixing the order of the polynomials appearing in SOS relaxations. The sufficient rank test checks a rank condition on this moment matrix and recovers (one or several) optimal $x^*$, as discussed in [19].

Footnote 1. Schmüdgen's representation applies when $\gamma - \sum_s U_s(x_s)$ is strictly positive on the feasible set. Therefore the convergence is asymptotic in theory; however, in practice finite convergence is observed most of the time. If we were to use Stengle's Positivstellensatz, we would have finite convergence but could not have $\gamma$ as an optimization variable, and at each relaxation level would have to use a bisection on $\gamma$. For computational convenience, we choose Schmüdgen's form.

In summary, we have the following algorithm for centralized computation of a globally optimal rate allocation to nonconcave utility maximization, where the utility functions can be written as or converted into polynomials.
Algorithm 1. Sum-of-squares for nonconcave utility maximization.
1) Formulate the relaxed problem (5) for a given degree $D$.
2) Use SDP to solve the $D$th order relaxation, which can be conducted using SOSTOOLS [39].
3) If the resulting dual SDP solution satisfies the sufficient rank condition, the $D$th order optimizer $\gamma^*(D)$ is the globally optimal network utility, and a corresponding $x^*$ can be obtained (see footnote 2).
4) Otherwise, increase $D$ to $D + 2$, i.e., the next higher order relaxation, and repeat.
In the following subsection, we give examples of the application of SOS relaxation to the nonconcave NUM. We also apply the above sufficient test to check if the bound is exact, and if so, we recover the optimum rate allocation $x^*$ that achieves this tightest bound.
2.3 Numerical Examples and Sigmoidal Utilities
Polynomial utility examples
First, consider quadratic utilities, i.e., $U_s(x_s) = x_s^2$, as a simple case to start
with (this can be useful, for example, when the bottleneck link capacity limits
sources to their convex region of a sigmoidal utility). We present examples
that are typical, in our experience, of the performance of the relaxations.
Example 1. A small illustrative example. Consider the simple 2 link, 3 user network shown in Figure 2, with $c = [1, 2]$. The optimization problem is
\[
\begin{array}{ll}
\mbox{maximize} & \sum_s x_s^2 \\
\mbox{subject to} & x_1 + x_2 \le 1 \\
& x_1 + x_3 \le 2 \\
& x_1, x_2, x_3 \ge 0.
\end{array}
\tag{6}
\]

[Figure 2: two links with capacities c1 and c2, carrying flows x1, x2, x3.]
Fig. 2. Network topology for example 1.

Footnote 2. Otherwise, $\gamma^*(D)$ may still be the globally optimal network utility but is only provably an upper bound.

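The first level relaxation with $D = 2$ is
\[
\begin{array}{ll}
\mbox{minimize} & \gamma \\
\mbox{subject to} & \gamma - (x_1^2 + x_2^2 + x_3^2) + \lambda_1 (x_1 + x_2 - 1) + \lambda_2 (x_1 + x_3 - 2) \\
& \quad - \lambda_3 x_1 - \lambda_4 x_2 - \lambda_5 x_3 - \lambda_6 (x_1 + x_2 - 1)(x_1 + x_3 - 2) \\
& \quad + \lambda_7 x_1 (x_1 + x_2 - 1) + \lambda_8 x_2 (x_1 + x_2 - 1) + \lambda_9 x_3 (x_1 + x_2 - 1) \\
& \quad + \lambda_{10} x_1 (x_1 + x_3 - 2) + \lambda_{11} x_2 (x_1 + x_3 - 2) + \lambda_{12} x_3 (x_1 + x_3 - 2) \\
& \quad - \lambda_{13} x_1 x_2 - \lambda_{14} x_1 x_3 - \lambda_{15} x_2 x_3 \mbox{ is SOS}, \\
& \lambda_i \ge 0, \quad i = 1, \ldots, 15.
\end{array}
\tag{7}
\]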

The first constraint above can be written as $x^T Q x$ for $x = [1, x_1, x_2, x_3]^T$ and an appropriate $Q$. For example, the (1,1) entry, which is the constant term, reads $\gamma - \lambda_1 - 2\lambda_2 - 2\lambda_6$, the (2,1) entry, coefficient of $x_1$, reads $\lambda_1 + \lambda_2 - \lambda_3 + 3\lambda_6 - \lambda_7 - 2\lambda_{10}$, and so on. The expression is SOS if and only if $Q \succeq 0$. The optimal $\gamma$ is 5, which is achieved by, e.g., $\lambda_1 = 1$, $\lambda_2 = 2$, $\lambda_3 = 1$, $\lambda_8 = 1$, $\lambda_{10} = 1$, $\lambda_{12} = 1$, $\lambda_{13} = 1$, $\lambda_{14} = 2$ and the rest of the $\lambda_i$ equal to zero. Using the sufficient test (or, in this example, by inspection) we find the optimal rates $x^0 = [0, 1, 2]$.
In this example, many of the $\lambda_i$ could be chosen to be zero. This means
not all product terms appearing in (7) are needed in constructing the SOS
polynomial. Such information is valuable from the decentralization point of
view, and can help determine to what extent our bound can be calculated in
a distributed manner. This is a challenging topic for future work.
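The multipliers quoted for Example 1 can be sanity-checked numerically without an SDP solver. The sketch below evaluates the expression in the SOS constraint of (7) at random points, using the signs implied by the constant term and $x_1$ coefficient of $Q$ stated in the text (that sign pattern is a reconstruction and should be read as an assumption), and separately brute-forces problem (6) on a coarse grid. The helper names and the grid spacing are illustrative choices.

```python
import random

def certificate(x1, x2, x3, gamma, lam):
    """Expression in the SOS constraint of (7); lam maps multiplier index -> value."""
    g1 = x1 + x2 - 1.0   # equals -(c1 - x1 - x2)
    g2 = x1 + x3 - 2.0   # equals -(c2 - x1 - x3)

    def m(i):
        return lam.get(i, 0.0)  # unspecified multipliers are zero

    return (gamma - (x1**2 + x2**2 + x3**2)
            + m(1) * g1 + m(2) * g2
            - m(3) * x1 - m(4) * x2 - m(5) * x3
            - m(6) * g1 * g2
            + m(7) * x1 * g1 + m(8) * x2 * g1 + m(9) * x3 * g1
            + m(10) * x1 * g2 + m(11) * x2 * g2 + m(12) * x3 * g2
            - m(13) * x1 * x2 - m(14) * x1 * x3 - m(15) * x2 * x3)

# Multipliers quoted in the text for gamma = 5; all others are zero.
lam = {1: 1.0, 2: 2.0, 3: 1.0, 8: 1.0, 10: 1.0, 12: 1.0, 13: 1.0, 14: 2.0}

random.seed(0)
worst = max(abs(certificate(random.uniform(-5, 5), random.uniform(-5, 5),
                            random.uniform(-5, 5), 5.0, lam))
            for _ in range(1000))

# Independent brute-force check of problem (6) on a grid with spacing 0.05.
N = 20
best_value, best_x = max(
    ((i * i + j * j + k * k) / (N * N), (i / N, j / N, k / N))
    for i in range(N + 1) for j in range(2 * N + 1) for k in range(2 * N + 1)
    if i + j <= N and i + k <= 2 * N)

print(worst, best_value, best_x)
```

With these multipliers every coefficient of the expression cancels, so it is the zero polynomial, which is trivially a sum of squares. That is why `worst` stays at the level of floating-point rounding, and it certifies $\gamma = 5$ as an upper bound that the grid search then shows to be attained.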
Example 2. Larger tree topology. As a larger example, consider the network shown in Figure 3 with 7 links. There are 9 users, with the following routing table that lists the links on each user's path.

User    x1    x2      x3    x4    x5    x6      x7    x8   x9
Links   1,2   1,2,4   2,3   4,5   2,4   6,5,7   5,6   7    5

For $c = [5, 10, 4, 3, 7, 3, 5]$, we obtain the bound $\gamma = 116$ with $D = 2$, which turns out to be globally optimal, and the globally optimal rate vector can be recovered: $x^0 = [5, 0, 4, 0, 1, 0, 0, 5, 7]$. In this example, exhaustive search is too computationally intensive, and the sufficient condition test plays an important role in proving the bound was exact and in recovering $x^0$.
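The reported allocation for Example 2 can be checked directly: any feasible point whose utility equals an upper bound proves that the bound is exact. The short Python sketch below recomputes the link loads and the total quadratic utility from the routing table and capacities given above; the dictionary layout is just one convenient, assumed encoding of that data.

```python
# Routing table from Example 2: links (1-indexed) on each user's path.
routes = {
    1: [1, 2], 2: [1, 2, 4], 3: [2, 3], 4: [4, 5], 5: [2, 4],
    6: [6, 5, 7], 7: [5, 6], 8: [7], 9: [5],
}
capacity = {1: 5, 2: 10, 3: 4, 4: 3, 5: 7, 6: 3, 7: 5}
rates = {1: 5, 2: 0, 3: 4, 4: 0, 5: 1, 6: 0, 7: 0, 8: 5, 9: 7}

# Aggregate load on each link: sum of the rates of the users routed over it.
load = {l: sum(rates[u] for u, path in routes.items() if l in path)
        for l in capacity}

feasible = (all(load[l] <= capacity[l] for l in capacity)
            and all(r >= 0 for r in rates.values()))
total_utility = sum(r ** 2 for r in rates.values())

print(load, feasible, total_utility)
```

The check confirms that the allocation respects every capacity and that its utility matches the bound, so no feasible allocation can do better.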
[Figure 3: tree network with seven links of capacities c1, ..., c7.]
Fig. 3. Network topology for example 2.

Example 3. Large m-hop ring topology. Consider a ring network with $n$ nodes, $n$ users and $n$ links where each user's flow starts from a node and goes clockwise through the next $m$ links, as shown in Figure 4 for $n = 6$, $m = 2$. As a large example, with $n = 25$, $m = 2$ and capacities chosen randomly from a uniform distribution on $[0, 10]$, using relaxation of order $D = 2$ we obtain the exact bound $\gamma = 321.11$ and recover an optimal rate allocation. For $n = 30$, $m = 2$, and capacities randomly chosen from $[0, 15]$, it turns out that the $D = 2$ relaxation yields the exact bound 816.95 and a globally optimal rate allocation.

[Figure 4: ring network with six links of capacities c1, ..., c6.]
Fig. 4. Network topology for example 3.

Sigmoidal utility examples


Now consider sigmoidal utilities in a standard form:
\[
U_s(x_s) = \frac{1}{1 + e^{-(a_s x_s + b_s)}}
\]
where $\{a_s, b_s\}$ are constant integers. Even though these sigmoidal functions are not polynomials, we show the problem can be cast as one with polynomial cost and constraints, with a change of variables.
Example 4. Sigmoidal utility. Consider the simple 2 link, 3 user example shown in Figure 2 for $a_s = 1$ and $b_s = -5$.
The NUM problem is to
\[
\begin{array}{ll}
\mbox{maximize} & \sum_s \frac{1}{1 + e^{-(x_s - 5)}} \\
\mbox{subject to} & x_1 + x_2 \le c_1 \\
& x_1 + x_3 \le c_2 \\
& x \succeq 0.
\end{array}
\tag{8}
\]
Let $y_s = \frac{1}{1 + e^{-(x_s - 5)}}$, then $x_s = -\log\left(\frac{1}{y_s} - 1\right) + 5$. Substituting for $x_1, x_2$ in the first constraint, arranging terms and taking exponentials, then multiplying the sides by $y_1 y_2$ (note that $y_1, y_2 > 0$), we get
\[
(1 - y_1)(1 - y_2) \ge e^{(10 - c_1)} y_1 y_2,
\]
which is polynomial in the new variables $y$. This applies to all capacity constraints, and the non-negativity constraints for $x_s$ translate to $y_s \ge \frac{1}{1 + e^{5}}$. Therefore the whole problem can be written in polynomial form, and SOS methods apply. This transformation renders the problem polynomial for general sigmoidal utility functions, with any $a_s$ and $b_s$.
We present some numerical results, using a small illustrative example. Here SOS relaxations of order 4 ($D = 4$) were used. For $c_1 = 4$, $c_2 = 8$, we find $\gamma = 1.228$, which turns out to be a global optimum, with $x^0 = [0, 4, 8]$ as the optimal rate vector. For $c_1 = 9$, $c_2 = 10$, we find $\gamma = 1.982$ and $x^0 = [0, 9, 10]$. Now placing a weight of 2 on $y_1$, while the other $y_s$ have weight one, we obtain $\gamma = 1.982$ and $x^0 = [9, 0, 1]$.
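The unweighted results of Example 4 and the change of variables can be reproduced with a few lines of Python. The sketch below evaluates the sigmoidal utility with $a_s = 1$ and an offset of 5 (that is, $b_s = -5$, matching problem (8)) at the two reported rate vectors, and spot-checks that the polynomial constraint in $y$ agrees with the original capacity constraint in $x$. The function names and the test points are assumptions chosen for illustration, and the test points deliberately avoid the boundary $x_1 + x_2 = c_1$, where rounding could flip the comparison.

```python
import math

def sigmoid_utility(x, a=1.0, b=-5.0):
    """U(x) = 1 / (1 + exp(-(a x + b))), with a = 1, b = -5 as in problem (8)."""
    return 1.0 / (1.0 + math.exp(-(a * x + b)))

def total_utility(rates):
    return sum(sigmoid_utility(x) for x in rates)

def capacity_ok_in_y(y1, y2, c):
    """Transformed constraint (1 - y1)(1 - y2) >= exp(10 - c) * y1 * y2."""
    return (1 - y1) * (1 - y2) >= math.exp(10 - c) * y1 * y2

u_small = total_utility([0, 4, 8])     # reported optimum for c = [4, 8]
u_large = total_utility([0, 9, 10])    # reported optimum for c = [9, 10]
print(round(u_small, 3), round(u_large, 3))

# The polynomial constraint in y should agree with x1 + x2 <= c1 in x.
c1 = 4.0
points = [(0.0, 3.0), (1.0, 2.5), (2.0, 2.5), (3.0, 3.0)]
agreement = [
    capacity_ok_in_y(sigmoid_utility(x1), sigmoid_utility(x2), c1) == (x1 + x2 <= c1)
    for x1, x2 in points
]
print(agreement)
```

Because the map from $x_s$ to $y_s$ is strictly increasing, feasibility in one set of variables carries over to the other, which is what the agreement check illustrates.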
In general, if $a_s \ne 1$ for some $s$, however, the degree of the polynomials in the transformed problem may be very high. If we write the general problem as
\[
\begin{array}{ll}
\mbox{maximize} & \sum_s \frac{1}{1 + e^{-(a_s x_s + b_s)}} \\
\mbox{subject to} & \sum_{s \in S(l)} x_s \le c_l, \quad \forall l, \\
& x \succeq 0,
\end{array}
\tag{9}
\]
each capacity constraint after transformation will be
\[
\prod_s (1 - y_s)^{r_{ls} \prod_{k \ne s} a_k} \ge \exp\Big( - \prod_s a_s \Big( c_l + \sum_s r_{ls} b_s / a_s \Big) \Big) \prod_s y_s^{r_{ls} \prod_{k \ne s} a_k},
\]
where $r_{ls} = 1$ if $l \in L(s)$ and equals 0 otherwise. Since the product of the $a_s$ appears in the exponents, $a_s > 1$ significantly increases the degree of the polynomials appearing in the problem and hence the dimension of the SDP in the SOS method.
It is therefore also useful to consider alternative representations of sigmoidal functions such as the following rational function:
\[
U_s(x_s) = \frac{x_s^n}{a + x_s^n},
\]
where the inflection point is $x^0 = \left( \frac{a(n-1)}{n+1} \right)^{1/n}$ and the slope at the inflection point is $U_s'(x^0) = \frac{(n-1)(n+1)}{4n} \left( \frac{n+1}{a(n-1)} \right)^{1/n}$. Let $y_s = U_s(x_s)$; the NUM problem in this case is equivalent to
\[
\begin{array}{ll}
\mbox{maximize} & \sum_s y_s \\
\mbox{subject to} & x_s^n y_s - x_s^n + a y_s = 0, \quad \forall s, \\
& \sum_{s \in S(l)} x_s \le c_l, \quad \forall l, \\
& x \succeq 0
\end{array}
\tag{10}
\]
which again can be accommodated in the SOS method and be solved by Algorithm 1.
The benefit of this choice of utility function is that the largest degree of
the polynomials in the problem is n + 1, therefore growing linearly with n.
The disadvantage compared to the exponential form for sigmoidal functions
is that the location of the inflection point and the slope at that point cannot
be set independently.
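The stated inflection point of the rational sigmoid and the polynomial equality used in (10) can both be checked numerically. In the Python sketch below, the closed form for the second derivative was derived by hand for this illustration, and the parameter values $n = 3$, $a = 8$ and the sample rate are arbitrary assumptions.

```python
def rational_utility(x, n, a):
    """U(x) = x^n / (a + x^n)."""
    return x ** n / (a + x ** n)

def second_derivative(x, n, a):
    """Closed form: U''(x) = n a x^(n-2) [(n-1) a - (n+1) x^n] / (a + x^n)^3."""
    return n * a * x ** (n - 2) * ((n - 1) * a - (n + 1) * x ** n) / (a + x ** n) ** 3

def inflection_point(n, a):
    return (a * (n - 1) / (n + 1)) ** (1.0 / n)

n, a = 3, 8.0            # illustrative parameter values (assumed)
x0 = inflection_point(n, a)

# Convex just below x0, concave just above it.
curvature_below = second_derivative(0.9 * x0, n, a)
curvature_above = second_derivative(1.1 * x0, n, a)

# The substitution y = U(x) satisfies the polynomial equality used in (10).
x = 1.7
y = rational_utility(x, n, a)
residual = x ** n * y - x ** n + a * y

print(x0, curvature_below, curvature_above, residual)
```

The term $x^n y$ in the equality is the one of highest degree, $n + 1$, which is the linear growth in $n$ noted above.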
2.4 Alternative representations for convex relaxations to nonconcave NUM
The SOS relaxation we used in the last two sections is based on Schmüdgen's representation for positive polynomials over compact sets described by other polynomials. We now briefly discuss two other representations of relevance to the NUM, that are interesting from both theoretical (e.g., interpretation) and computational points of view.
LP relaxation
Exploiting linearity of the constraints in NUM and with the additional assumption of nonempty interior for the feasible set (which holds for NUM), we can use Handelman's representation [18] and refine the Positivstellensatz condition to obtain the following convex relaxation of the nonconcave NUM problem:
\[
\begin{array}{ll}
\mbox{minimize} & \gamma \\
\mbox{subject to} & \gamma - \sum_s U_s(x_s) = \sum_{\alpha \in \mathbb{N}^L} \lambda_\alpha \prod_{l=1}^L \big( c_l - \sum_{s \in S(l)} x_s \big)^{\alpha_l}, \quad \forall x, \\
& \lambda_\alpha \ge 0, \quad \forall \alpha,
\end{array}
\tag{11}
\]
where the optimization variables are $\gamma$ and $\lambda_\alpha$, and $\alpha$ denotes an ordered set of integers $\{\alpha_l\}$.
Fixing $D$ where $\sum_l \alpha_l \le D$, and equating the coefficients on the two sides of the equality in (11), yields a linear program (LP). (Note that there are no SOS terms, therefore no semidefiniteness conditions.) As before, increasing the degree $D$ gives higher order relaxations and a tighter bound.
We provide a pricing interpretation for problem (11). First, normalize each capacity constraint as $1 - u_l(x) \ge 0$, where $u_l(x) = \sum_{s \in S(l)} x_s / c_l$. We can interpret $u_l(x)$ as link usage, or the probability that link $l$ is used at any given point in time. Then, in (11), we have terms linear in $u$ such as $\lambda_l (1 - u_l(x))$, in which $\lambda_l$ has a similar interpretation as in concave NUM, as the price of using link $l$. We also have product terms such as $\lambda_{jk} (1 - u_j(x))(1 - u_k(x))$, where $\lambda_{jk} u_j(x) u_k(x)$ indicates the probability of simultaneous usage of links $j$ and $k$, for links whose usage probabilities are independent (e.g., they do not share any flows). Products of more terms can be interpreted similarly.
While the above price interpretation is not complete and does not justify all
the terms appearing in (11) (e.g., powers of the constraints; product terms for
links with shared flows), it does provide some useful intuition: this relaxation
results in a pricing scheme that provides better incentives for the users to
observe the constraints, by putting additional reward (since the corresponding
term adds positively to the utility) for simultaneously keeping two links free.
Such incentive helps tighten the upper bound and eventually achieve a feasible
(and optimal) allocation.
This relaxation is computationally attractive since we need to solve only
LPs instead of the previous SDPs at each level. However, significantly more
levels may be required [27].
Relaxation with no product terms
Putinar [40] showed that a polynomial positive over a compact set (see footnote 3) can
be represented as an SOS-combination of the constraints. This yields the
following convex relaxation for the nonconcave NUM problem:
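$$
\begin{array}{ll}
\text{minimize} & \gamma \\
\text{subject to} & \gamma - \sum_s U_s(x_s) = \sum_{l=1}^{L} \lambda_l(x) \Big( c_l - \sum_{s \in S(l)} x_s \Big), \quad \forall x, \\
& \lambda(x) \text{ is SOS},
\end{array}
\tag{12}
$$
where the optimization variables are the coefficients in $\lambda_l(x)$. Similar to the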
SOS relaxation (5), fixing the order D of the expression in (12) results in an
SDP. This relaxation has the nice property that no product terms appear: the
relaxation becomes exact with a high enough D without the need of product
terms. However, this degree might be much higher than what the previous
SOS method requires.
2.5 Concluding Remarks and Future Directions
We consider the NUM problem in the presence of inelastic flows, i.e., flows
with nonconcave utilities. Despite its practical importance, this problem has
not been studied widely, mainly because it is a nonconvex problem.
There has been no effective mechanism, centralized or distributed, to compute the globally optimal rate allocation for nonconcave utility maximization
problems in networks. This limitation has made performance assessment and
design of networks that include inelastic flows very difficult.
Footnote 3: With an extra assumption that always holds for linear constraints, as in NUM problems.
To address this problem, we employed convex SOS relaxations, solved by a
sequence of SDPs, to obtain high-quality, increasingly tight upper bounds on
total achievable utility. In practice, the performance of our SOSTOOLS-based
algorithm was surprisingly good, and bounds obtained using a polynomial-time (and indeed a low-order and often minimal-order) relaxation were found
to be exact, achieving the global optimum of nonconcave NUM problems.
Furthermore, a dual-based sufficient test, if successful, detects the exactness
of the bound, in which case the optimal rate allocation can also be recovered.
This performance of the proposed algorithm brings up a fundamental question
on whether there is any particular property or structure in nonconcave NUM
that makes it especially suitable for SOS relaxations.
We further examined the use of two more specialized polynomial representations, one that uses products of constraints with constant multipliers,
resulting in LP relaxations; and at the other end of the spectrum, one that uses a
linear combination of constraints with SOS multipliers. We expect these relaxations to give higher order certificates, thus their potential computational
benefits need to be examined further. We also show that they admit economic
interpretations (e.g., prices, incentives) that provide some insight on how the
SOS relaxations work in the framework of link congestion pricing for the simultaneous usage of multiple links.
An important research issue to be further investigated is decentralization methods for rate allocation among sources with nonconcave utilities. The
proposed algorithm here is not easy to decentralize, given the products of the
constraints or polynomial multipliers that destroy the separable structure of
the problem. However, when relaxations become exact, the sparsity pattern
of the coefficients can provide information about partially decentralized computation of optimal rates. For example, if after solving the NUM off-line we
obtain an exact bound, and the coefficient of the cross-term $x_i x_j$ turns out
to be zero, then users $i$ and $j$ do not need to communicate with each other
to find their optimal rates. An interesting next step in this area of research is
to investigate a distributed version of the proposed algorithm through limited
message passing among clusters of network nodes and links.

3 Wireless Network Power Control


3.1 Introduction
Due to the broadcast nature of radio transmission, data rates and other Quality of Service (QoS) in a wireless network are affected by interference. This is
particularly important in CDMA systems where users transmit at the same
time over the same frequency bands and their spreading codes are not perfectly orthogonal. Transmit power control is often used to tackle this problem
of signal interference. We study how to optimize over the transmit powers to
create the optimal set of Signal-to-Interference Ratios (SIR) on wireless links.
Optimality here can be with respect to a variety of objectives, such as maximizing a system-wide efficiency metric (e.g., the total system throughput), or
maximizing a Quality of Service (QoS) metric for a user in the highest QoS
class, or maximizing a QoS metric for the user with the minimum QoS metric
value (i.e., a maxmin optimization).
While the objective represents a system-wide goal to be optimized, individual users' QoS requirements also need to be satisfied. Any power allocation
must therefore be constrained by a feasible set formed by these minimum
requirements from the users. Such a constrained optimization captures the
tradeoff between user-centric constraints and some network-centric objective.
Because a higher power level from one transmitter increases the interference
levels at other receivers, there may not be any feasible power allocation to
satisfy the requirements from all the users. Sometimes an existing set of requirements can be satisfied, but when a new user is admitted into the system,
either there is no longer any feasible power control solution, or the maximized objective is reduced due to the tightening of the constraint set, leading to the need
for admission control and admission pricing, respectively.
Because many QoS metrics are nonlinear functions of SIR, which is in turn
a nonlinear (and neither convex nor concave) function of transmit powers, in
general power control optimization or feasibility problems are difficult nonlinear optimization problems that may appear to be NP-hard. This
section shows that, when SIR is much larger than 0 dB, a class of nonlinear
optimization called Geometric Programming (GP) can be used to efficiently
compute the globally optimal power control in many of these problems, and
efficiently determine the feasibility of user requirements by returning either a
feasible (and indeed optimal) set of powers or a certificate of infeasibility. This
also leads to an effective admission control and admission pricing method.
The key observation is that, despite the apparent nonconvexity, through a log
change of variables the GP technique turns these constrained power control
optimization problems into convex optimization, which is intrinsically tractable despite
its nonlinearity in objective and constraints. However, when SIR is comparable
to or below 0 dB, the power control problems are truly nonconvex with no
efficient and global solution methods. In this case, we present a heuristic that
is provably convergent and empirically almost always computes the globally
optimal power allocation by solving a sequence of GPs through the approach
of successive convex approximations.
The GP approach reveals the hidden convexity structure, which implies
efficient solution methods and the global optimality of any local optimum in
power control problems with nonlinear objective functions. It clearly differentiates the tractable formulations in the high-SIR regime from the intractable
ones in the low-SIR regime. Power control by GP is applicable to formulations
in both cellular networks with single-hop transmission between mobile users
and base stations, and ad hoc networks with multihop transmission among
the nodes, as illustrated through several numerical examples in this section.
Traditionally, GP is solved by centralized computation through the highly efficient interior point methods. In this section we present a new result on how
GP can be solved distributively with message passing, which has independent
value to general maximization of coupled objectives, and apply it to power
control problems with a further reduction of message passing overhead by
leveraging the specific structures of power control problems.
3.2 Geometric Programming
GP is a class of nonlinear, nonconvex optimization problems with many useful
theoretical and computational properties. It was invented in 1967 by Duffin,
Peterson, and Zener [14], and much of the development up to the early 1980s was
summarized in [1]. Since a GP can be turned into a convex optimization
problem, a local optimum is also a global optimum, Lagrange duality gap
is zero under mild conditions, and a global optimum can be computed very
efficiently. Numerical efficiency holds both in theory and in practice: interior
point methods applied to GP have provably polynomial time complexity [35],
and are very fast in practice with high-quality software downloadable from
the Internet (e.g., the MOSEK package). Convexity and duality properties
of GP are well understood, and large-scale, robust numerical solvers for GP
are available. Furthermore, special structures in GP and its Lagrange dual
problem lead to distributed algorithms, physical interpretations, and computational acceleration beyond the generic results for convex optimization. A
detailed tutorial of GP and comprehensive survey of its recent applications to
communication systems can be found in [10]. This subsection contains a brief
introduction of GP terminology.
There are two equivalent forms of GP: standard form and convex form.
The first is a constrained optimization of a type of function called posynomial,
and the second form is obtained from the first through a logarithmic change
of variable.
We first define a monomial as a function $f : \mathbb{R}^n_{++} \to \mathbb{R}$:
$$
f(x) = d\, x_1^{a^{(1)}} x_2^{a^{(2)}} \cdots x_n^{a^{(n)}},
$$
where the multiplicative constant $d \ge 0$ and the exponential constants
$a^{(j)} \in \mathbb{R}$, $j = 1, 2, \ldots, n$. A sum of monomials, indexed by $k$ below, is called a
posynomial:
$$
f(x) = \sum_{k=1}^{K} d_k\, x_1^{a_k^{(1)}} x_2^{a_k^{(2)}} \cdots x_n^{a_k^{(n)}},
$$
where $d_k \ge 0$, $k = 1, 2, \ldots, K$, and $a_k^{(j)} \in \mathbb{R}$, $j = 1, 2, \ldots, n$, $k = 1, 2, \ldots, K$.
For example, $2 x_1^{-\pi} x_2^{0.5} + 3 x_1 x_3^{100}$ is a posynomial in $x$, $x_1 - x_2$ is not a posynomial, and $x_1 / x_2$ is a monomial, thus also a posynomial.

Minimizing a posynomial subject to posynomial upper bound inequality
constraints and monomial equality constraints is called GP in standard form:
$$
\begin{array}{ll}
\text{minimize} & f_0(x) \\
\text{subject to} & f_i(x) \le 1, \quad i = 1, 2, \ldots, m, \\
& h_l(x) = 1, \quad l = 1, 2, \ldots, M,
\end{array}
\tag{13}
$$
where $f_i$, $i = 0, 1, \ldots, m$, are posynomials: $f_i(x) = \sum_{k=1}^{K_i} d_{ik}\, x_1^{a_{ik}^{(1)}} x_2^{a_{ik}^{(2)}} \cdots x_n^{a_{ik}^{(n)}}$,
and $h_l$, $l = 1, 2, \ldots, M$, are monomials: $h_l(x) = d_l\, x_1^{a_l^{(1)}} x_2^{a_l^{(2)}} \cdots x_n^{a_l^{(n)}}$.


GP in standard form is not a convex optimization problem, because posynomials are not convex functions. However, with a logarithmic change of the
variables and multiplicative constants: $y_i = \log x_i$, $b_{ik} = \log d_{ik}$, $b_l = \log d_l$,
and a logarithmic change of the functions' values, we can turn it into the
following equivalent problem in $y$:
$$
\begin{array}{ll}
\text{minimize} & p_0(y) = \log \sum_{k=1}^{K_0} \exp(a_{0k}^T y + b_{0k}) \\
\text{subject to} & p_i(y) = \log \sum_{k=1}^{K_i} \exp(a_{ik}^T y + b_{ik}) \le 0, \quad i = 1, 2, \ldots, m, \\
& q_l(y) = a_l^T y + b_l = 0, \quad l = 1, 2, \ldots, M.
\end{array}
\tag{14}
$$
This is referred to as GP in convex form, which is a convex optimization
problem since it can be verified that the log-sum-exp function is convex [5].
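The change of variables behind (14) can be checked numerically. The following is a minimal Python sketch, not code from the chapter: the posynomial, the test points, and the helper names `posynomial` and `convex_form` are illustrative assumptions. It confirms that $p(y) = \log f(e^y)$, spot-checks midpoint convexity of $p$ at one pair of points, and shows that a posynomial need not be convex in $x$.

```python
import math


def posynomial(terms, x):
    """Evaluate f(x) = sum_k d_k * prod_j x_j ** a_k[j] for x > 0."""
    total = 0.0
    for d, exps in terms:
        value = d
        for xj, a in zip(x, exps):
            value *= xj ** a
        total += value
    return total


def convex_form(terms, y):
    """Evaluate p(y) = log sum_k exp(a_k^T y + b_k), where b_k = log d_k."""
    z = [math.log(d) + sum(a * yj for a, yj in zip(exps, y)) for d, exps in terms]
    m = max(z)  # shift by the maximum for numerical stability
    return m + math.log(sum(math.exp(zk - m) for zk in z))


# Hypothetical posynomial: f(x) = 2 x1^-1 x2^0.5 + 3 x1 x2^2
terms = [(2.0, [-1.0, 0.5]), (3.0, [1.0, 2.0])]

# The two forms agree under y = log x.
x = [1.5, 0.7]
y = [math.log(v) for v in x]
gap = abs(convex_form(terms, y) - math.log(posynomial(terms, x)))

# Midpoint convexity spot check in y: p((y1 + y2)/2) <= (p(y1) + p(y2))/2.
y1, y2 = [-1.0, 2.0], [3.0, -0.5]
mid = [(a + b) / 2 for a, b in zip(y1, y2)]
midpoint_ok = convex_form(terms, mid) <= (convex_form(terms, y1) + convex_form(terms, y2)) / 2

# The single-term posynomial sqrt(x) is concave in x, hence not convex.
sqrt_term = [(1.0, [0.5])]
not_convex_in_x = posynomial(sqrt_term, [5.0]) > (
    posynomial(sqrt_term, [1.0]) + posynomial(sqrt_term, [9.0])
) / 2

print(gap, midpoint_ok, not_convex_in_x)
```

A single midpoint test is only a sanity check, not a proof; the convexity of the log-sum-exp function is the property that actually guarantees it.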
In summary, GP is a nonlinear, nonconvex optimization problem that can
be transformed into a nonlinear, convex problem. GP in standard form can
be used to formulate network resource allocation problems with nonlinear objectives under nonlinear QoS constraints. The basic idea is that resources are
often allocated proportional to some parameters, and when resource allocations are optimized over these parameters, we are maximizing an inverted
posynomial subject to lower bounds on other inverted posynomials, which is
equivalent to GP in standard form.
SP/GP, SOS/SDP
Note that, although a posynomial may be a non-convex function, it becomes
a convex function after the log transformation, as shown in an example in
Figure 5. Compared to the (constrained or unconstrained) minimization of
a polynomial, the minimization of a posynomial in GP relaxes the integer
constraint on the exponential constants but imposes a positivity constraint on
the multiplicative constants and variables. There is a sharp contrast between
these two problems: polynomial minimization is NP-hard, but GP can be
turned into convex optimization with provably polynomial-time algorithms
for a global optimum.
In an extension of GP called Signomial Programming to be discussed later
in this section, the restriction of non-negative multiplicative constants is removed. This results in a general class of nonlinear and truly non-convex problems that is simultaneously a generalization of GP and polynomial minimization over the positive quadrant, as summarized in the comparison Table 1.

Fig. 5. A bi-variate posynomial before (left graph) and after (right graph) the log
transformation. A non-convex function is turned into a convex one.

            GP                   PMoP                 SP
$c$         $\mathbb{R}_+$       $\mathbb{R}$         $\mathbb{R}$
$a^{(j)}$   $\mathbb{R}$         $\mathbb{Z}_+$       $\mathbb{R}$
$x_j$       $\mathbb{R}_{++}$    $\mathbb{R}_{++}$    $\mathbb{R}_{++}$

Table 1. Comparison of GP, constrained polynomial minimization over the positive
quadrant (PMoP), and Signomial Programming (SP). All three types of problems
minimize a sum of monomials subject to upper bound inequality constraints on sums
of monomials, but have different definitions of monomial: $c \prod_j x_j^{a^{(j)}}$, as shown in the
table. GP is known to be polynomial-time solvable, but PMoP and SP are not.

The objective function of Signomial Programming can be formulated as
minimizing a ratio between two posynomials, which is not a posynomial (since
posynomials are closed under positive multiplication and addition but not
division). As shown in Figure 6, a ratio between two posynomials is a nonconvex function both before and after the log transformation. Although it
does not seem likely that Signomial Programming can be turned into a convex optimization problem, there are heuristics to solve it through a sequence
of GP relaxations. However, due to the absence of algebraic structures found
in polynomials, such methods for Signomial Programming currently lack a
theoretical foundation of convergence to global optimality. This is in contrast
to the sum-of-squares method [38], which uses a nested family of SDP relaxations to solve constrained polynomial minimization problems as explained in
the last section.


Fig. 6. Ratio between two bi-variate posynomials before (left graph) and after (right
graph) the log transformation. It is a non-convex function in both cases.

3.3 Power Control by Geometric Programming: Convex Case


Various schemes for power control, centralized or distributed, have been extensively studied since the 1990s based on different transmission models and application needs, e.g., in [2, 16, 34, 43, 50, 55]. This subsection summarizes the
new approach of formulating power control problems through GP. The key advantage is that globally optimal power allocations can be efficiently computed
for a variety of nonlinear system-wide objectives and user QoS constraints,
even when these nonlinear problems appear to be nonconvex optimization problems.
Basic model
Consider a wireless (cellular or multihop) network with n logical transmitter/receiver pairs. Transmit powers are denoted as P1 , . . . , Pn . In the cellular
uplink case, all logical receivers may reside in the same physical receiver, i.e.,
the base station. In the multihop case, since the transmission environment
can be different on the links comprising an end-to-end path, power control
schemes must consider each link along a flow's path.
Under Rayleigh fading, the power received from transmitter $j$ at receiver
$i$ is given by $G_{ij} F_{ij} P_j$, where $G_{ij} \ge 0$ represents the path gain (it may also
encompass antenna gain and coding gain) that is often modeled as proportional to $d_{ij}^{-\gamma}$, where $d_{ij}$ denotes distance and $\gamma$ is the power fall-off factor, and
the $F_{ij}$ model Rayleigh fading and are independent and exponentially distributed
with unit mean. The distribution of the received power from transmitter $j$ at
receiver $i$ is then exponential with mean value $E[G_{ij} F_{ij} P_j] = G_{ij} P_j$. The SIR
for the receiver on logical link $i$ is:
$$
\mathrm{SIR}_i = \frac{P_i G_{ii} F_{ii}}{\sum_{j \neq i}^{N} P_j G_{ij} F_{ij} + n_i}
\tag{15}
$$
where $n_i$ is the noise power for receiver $i$.


The constellation size $M$ used by a link can be closely approximated for
MQAM modulations as follows: $M = 1 + \frac{-\phi_1}{\ln(\phi_2 \mathrm{BER})} \mathrm{SIR}$, where BER is the
bit error rate and $\phi_1$, $\phi_2$ are constants that depend on the modulation type.
Defining $K = \frac{-\phi_1}{\ln(\phi_2 \mathrm{BER})}$ leads to an expression of the data rate $R_i$ on the
$i$th link as a function of the SIR: $R_i = \frac{1}{T} \log_2(1 + K\,\mathrm{SIR}_i)$, which can be
approximated as
$$
R_i = \frac{1}{T} \log_2(K\,\mathrm{SIR}_i)
\tag{16}
$$
when $K\,\mathrm{SIR}$ is much larger than 1. This approximation is reasonable either
when the signal level is much higher than the interference level or, in CDMA
systems, when the spreading gain is large. For notational simplicity in the rest
of this section, we redefine $G_{ii}$ as $K$ times the original $G_{ii}$, thus absorbing
constant $K$ into the definition of SIR.
The aggregate data rate for the system can then be written as
$$
R_{system} = \sum_i R_i = \frac{1}{T} \log_2 \Big[ \prod_i \mathrm{SIR}_i \Big].
$$
So in the high SIR regime, aggregate data rate maximization is equivalent
to maximizing a product of SIRs. The system throughput is the aggregate
data rate supportable by the system given a set of users with specified QoS
requirements.
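The relations above can be sketched in a few lines of Python. This is an illustrative sketch only: the gain matrix, powers, noise levels, and the value of $T$ are hypothetical, each fading term $F_{ij}$ is replaced by its unit mean, and the function names are not from the chapter. It compares the exact rate with the high-SIR approximation (16) and checks that the sum of approximate rates equals $\frac{1}{T} \log_2$ of the product of SIRs.

```python
import math


def sir(P, G, noise):
    """SIR_i as in (15), with each fading term F_ij replaced by its unit mean."""
    n = len(P)
    result = []
    for i in range(n):
        interference = sum(G[i][j] * P[j] for j in range(n) if j != i)
        result.append(G[i][i] * P[i] / (interference + noise[i]))
    return result


def rate_exact(s, T, K=1.0):
    """R = (1/T) log2(1 + K * SIR)."""
    return math.log2(1.0 + K * s) / T


def rate_high_sir(s, T, K=1.0):
    """High-SIR approximation (16): R = (1/T) log2(K * SIR)."""
    return math.log2(K * s) / T


# Hypothetical two-link instance.
T = 1e-6
G = [[1.0, 0.01], [0.02, 1.0]]
P = [1.0, 1.0]
noise = [0.01, 0.01]

s = sir(P, G, noise)

# Sum of approximate rates versus (1/T) log2 of the SIR product.
r_sum = sum(rate_high_sir(si, T) for si in s)
product = 1.0
for si in s:
    product *= si
r_prod = math.log2(product) / T

# Relative error of the approximation at high SIR and at SIR = 1 (0 dB).
rel_err_high = 1.0 - rate_high_sir(s[0], T) / rate_exact(s[0], T)
rel_err_0db = 1.0 - rate_high_sir(1.0, T) / rate_exact(1.0, T)

print(s, r_sum, r_prod, rel_err_high, rel_err_0db)
```

At 0 dB the approximate rate is zero while the exact rate is not, which is the regime in which the GP formulation built on (16) no longer applies.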
Outage probability is another important QoS parameter for reliable communication in wireless networks. A channel outage is declared and packets are
lost when the received SIR falls below a given threshold $\mathrm{SIR}_{th}$, often computed from the BER requirement. Most systems are interference dominated
and the thermal noise is relatively small, thus the $i$th link outage probability
is
$$
P_{o,i} = \mathrm{Prob}\{\mathrm{SIR}_i \le \mathrm{SIR}_{th}\}
        = \mathrm{Prob}\Big\{ G_{ii} F_{ii} P_i \le \mathrm{SIR}_{th} \sum_{j \neq i} G_{ij} F_{ij} P_j \Big\}.
$$
The outage probability can be expressed as $P_{o,i} = 1 - \prod_{j \neq i} \frac{1}{1 + \frac{\mathrm{SIR}_{th} G_{ij} P_j}{G_{ii} P_i}}$
[25], which means that the upper bound $P_{o,i} \le P_{o,i,\max}$ can be written as an
upper bound on a posynomial in $P$:
$$
\prod_{j \neq i} \Big( 1 + \frac{\mathrm{SIR}_{th} G_{ij} P_j}{G_{ii} P_i} \Big) \le \frac{1}{1 - P_{o,i,\max}}.
\tag{17}
$$
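The closed-form outage expression can be sanity-checked by simulation. The sketch below is a hypothetical illustration rather than the author's code: the gains, powers, threshold, and function names are assumptions, and noise is neglected as in the interference-dominated model above. It compares the closed form with a Monte Carlo estimate using unit-mean exponential fading, and evaluates the posynomial side of (17).

```python
import random


def outage_closed_form(i, P, G, sir_th):
    """P_o,i = 1 - prod_{j != i} 1 / (1 + SIR_th * G_ij * P_j / (G_ii * P_i))."""
    prod = 1.0
    for j in range(len(P)):
        if j != i:
            prod *= 1.0 / (1.0 + sir_th * G[i][j] * P[j] / (G[i][i] * P[i]))
    return 1.0 - prod


def outage_monte_carlo(i, P, G, sir_th, trials=100000, seed=0):
    """Estimate Prob{G_ii F_ii P_i <= SIR_th * sum_{j != i} G_ij F_ij P_j}."""
    rng = random.Random(seed)
    n = len(P)
    outages = 0
    for _ in range(trials):
        signal = G[i][i] * rng.expovariate(1.0) * P[i]
        interference = sum(
            G[i][j] * rng.expovariate(1.0) * P[j] for j in range(n) if j != i
        )
        if signal <= sir_th * interference:
            outages += 1
    return outages / trials


def outage_posynomial(i, P, G, sir_th):
    """Left-hand side of (17)."""
    prod = 1.0
    for j in range(len(P)):
        if j != i:
            prod *= 1.0 + sir_th * G[i][j] * P[j] / (G[i][i] * P[i])
    return prod


# Hypothetical three-link instance; only row 0 of G matters for link 0.
G = [[1.0, 0.1, 0.2], [0.1, 1.0, 0.1], [0.2, 0.1, 1.0]]
P = [1.0, 1.0, 1.0]
sir_th = 0.5
p_max = 0.15

closed = outage_closed_form(0, P, G, sir_th)
simulated = outage_monte_carlo(0, P, G, sir_th)
constraint_ok = outage_posynomial(0, P, G, sir_th) <= 1.0 / (1.0 - p_max)

print(closed, simulated, constraint_ok)
```

The constraint test on the posynomial is equivalent to `closed <= p_max`, which is why the outage requirement fits the GP standard form (13).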

Cellular wireless networks


We first present how GP-based power control applies to cellular wireless networks with one-hop transmission from N users to a base station. These results
extend the scope of power control by the classical solution in CDMA systems
that equalizes SIRs, and those by the iterative algorithms (e.g., in [2, 16, 34])
that minimize total power (a linear objective function) subject to SIR constraints.
We start the discussion on the suite of power control problem formulations with a simple objective function and basic constraints. The following
constrained problem of maximizing the SIR of a particular user $i^*$ is a GP:
$$
\begin{array}{ll}
\text{maximize} & R_{i^*}(P) \\
\text{subject to} & R_i(P) \ge R_{i,\min}, \quad \forall i, \\
& P_{i1} G_{i1} = P_{i2} G_{i2}, \\
& 0 \le P_i \le P_{i,\max}, \quad \forall i.
\end{array}
$$
The first constraint, equivalent to $\mathrm{SIR}_i \ge \mathrm{SIR}_{i,\min}$, sets a floor on the SIR
of other users and protects these users from user $i^*$ increasing her transmit
power excessively. The second constraint reflects the classical power control
criterion in solving the near-far problem in CDMA systems: the expected
received power from one transmitter $i1$ must equal that from another $i2$. The
third constraint reflects regulatory or system limitations on transmit powers. All
constraints can be verified to be inequality upper bounds on posynomials in
the transmit power vector $P$.
Alternatively, we can use GP to maximize the minimum rate among all
users. The maxmin fairness objective:
$$
\text{maximize}_{P} \ \min_i \{R_i\}
$$
can be accommodated in GP-based power control because it can be turned
into equivalently maximizing an auxiliary variable $t$ such that $\mathrm{SIR}_i(P) \ge
\exp(t)$, $\forall i$, which has posynomial objective and constraints in $(P, t)$.
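For reference, the maxmin reformulation can be written out in the standard form (13). The sketch below assumes the non-fading SIR, $\mathrm{SIR}_i(P) = G_{ii} P_i / (\sum_{j \neq i} G_{ij} P_j + n_i)$, and introduces $\tau = \exp(t)$; the symbol $\tau$ and the choice to show only the power constraints are illustrative assumptions made here.

```latex
\begin{array}{ll}
\text{minimize}   & \tau^{-1} \\
\text{subject to} & \tau \, G_{ii}^{-1} P_i^{-1} \Big( \sum_{j \neq i} G_{ij} P_j + n_i \Big) \le 1, \quad \forall i, \\
                  & P_i \, P_{i,\max}^{-1} \le 1, \quad \forall i.
\end{array}
```

The objective is a monomial in $(P, \tau)$ and every constraint is an upper bound on a posynomial, so the problem is a GP. Because the rate in (16) is increasing in the SIR, maximizing the minimum SIR also maximizes the minimum rate.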
Example 5. A small illustrative example. A simple system comprised
of five users is used for a numerical example. The five users are spaced at
distances $d$ of 1, 5, 10, 15, and 20 units from the base station. The power fall-off
factor $\gamma = 4$. Each user has a maximum power constraint of $P_{\max} = 0.5$ mW.
The noise power is 0.5 $\mu$W for all users. The SIR of all users, other than the
user we are optimizing for, must be greater than a common threshold SIR level
$\beta$. In different experiments, $\beta$ is varied to observe the effect on the optimized
user's SIR. This is done independently for the near user at $d = 1$, a medium
distance user at $d = 15$, and the far user at $d = 20$. The results are plotted in
Figure 7.
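The setup of this example is easy to reproduce up to the point where a GP solver is needed. The following Python sketch computes only the path gains and the SIRs when every user transmits at maximum power, which illustrates the near-far effect that the optimization has to manage. It assumes that the path gain equals $d^{-\gamma}$ exactly (proportionality constant 1), that fading is replaced by its mean, and that the noise power is 0.5 microwatt; the function name `uplink_sir` is hypothetical and no GP is solved here.

```python
import math

distances = [1.0, 5.0, 10.0, 15.0, 20.0]   # units from the base station
gamma = 4.0                                # power fall-off factor
p_max = 0.5e-3                             # maximum transmit power in watts (0.5 mW)
noise = 0.5e-6                             # noise power in watts (assumed 0.5 microwatt)

# Assumed path gain model: G_j = d_j ** -gamma for the link from user j to the base station.
gains = [d ** -gamma for d in distances]


def uplink_sir(powers, gains, noise):
    """SIR of each user at the common base station receiver."""
    received = [p * g for p, g in zip(powers, gains)]
    sirs = []
    for i, r in enumerate(received):
        interference = sum(rj for j, rj in enumerate(received) if j != i)
        sirs.append(r / (interference + noise))
    return sirs


sir_all_max = uplink_sir([p_max] * len(distances), gains, noise)
sir_db = [10.0 * math.log10(s) for s in sir_all_max]

for d, s in zip(distances, sir_db):
    print(f"d = {d:4.0f}: SIR = {s:7.2f} dB")
```

With equal transmit powers the near user dominates and the distant users fall far below 0 dB, so any threshold constraint on the other users forces the near user to back off.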
[Figure 7 plot, "Optimized SIR vs. Threshold SIR": optimized SIR (dB) versus threshold SIR (dB), with one curve each for the near, medium, and far users.]

Fig. 7. Constrained optimization of power control in a cellular network (Example 5).

Several interesting effects are illustrated. First, when the required threshold SIR in the constraints is sufficiently high, there is no feasible power control
solution. At moderate threshold SIR, as $\beta$ is decreased, the optimized SIR initially increases rapidly. This is because the optimized user is allowed to increase its own power
by the sum of the power reductions in the four other users, and the noise is
relatively insignificant. At low threshold SIR, the noise becomes more significant and the power trade-off from the other users less significant, so the curve
starts to bend over. Eventually, the optimized user reaches its upper bound
on power and cannot utilize the excess power allowed by the lower threshold
SIR for other users. This is exhibited by the transition from a sharp bend in
the curve to a much shallower sloped curve.
We now proceed to show that GP can also be applied to the problem
formulations with an overall system objective of total system throughput,
under both user data rate constraints and outage probability constraints.
The following constrained problem of maximizing system throughput is a
GP:
$$
\begin{array}{ll}
\text{maximize} & R_{system}(P) \\
\text{subject to} & R_i(P) \ge R_{i,\min}, \quad \forall i, \\
& P_{o,i}(P) \le P_{o,i,\max}, \quad \forall i, \\
& 0 \le P_i \le P_{i,\max}, \quad \forall i,
\end{array}
\tag{18}
$$
where the optimization variables are the transmit powers $P$. The objective is
equivalent to minimizing the posynomial $\prod_i \mathrm{ISR}_i$, where ISR is $1/\mathrm{SIR}$. Each ISR
is a posynomial in $P$ and the product of posynomials is again a posynomial.
The first constraint is from the data rate demand Ri,min by each user. The
second constraint represents the outage probability upper bounds Po,i,max .
These inequality constraints put upper bounds on posynomials of P, as can
be readily verified through (16) and (17). Thus (18) is indeed a GP, and
efficiently solvable for global optimality.
There are several obvious variations of problem (18) that can be solved
by GP, e.g., we can lower bound $R_{system}$ as a constraint and maximize $R_{i^*}$
for a particular user $i^*$, or have a total power $\sum_i P_i$ constraint or objective
function.
The objective function to be maximized can also be generalized to a
weighted sum of data rates: $\sum_i w_i R_i$, where $w \succeq 0$ is a given weight vector.
This is still a GP because maximizing $\sum_i w_i \log \mathrm{SIR}_i$ is equivalent to maximizing $\log \prod_i \mathrm{SIR}_i^{w_i}$, which is in turn equivalent to minimizing $\prod_i \mathrm{ISR}_i^{w_i}$. Now use
auxiliary variables $\{t_i\}$, and minimize $\prod_i t_i^{w_i}$ over the original constraints in
(18) plus the additional constraints $\mathrm{ISR}_i \le t_i$ for all $i$. This is readily verified
to be a GP in $(P, t)$, and is equivalent to the original problem.
Generalizing the above discussions and observing that high-SIR assumption is needed for GP formulation only when there are sums of log(1 + SIR)
in the optimization problem, we have the following summary.
Proposition 1 In the high-SIR regime, any combination of objectives (A)-(E) and constraints (a)-(e) in Table 2 (pick any one of the objectives and any
subset of the constraints) is a power control optimization problem that can be
solved by GP, i.e., can be transformed into a convex optimization with efficient algorithms to compute the globally optimal power vector. When objectives
(C)-(D) or constraints (c)-(d) do not appear, the power control optimization
problem can be solved by GP in any SIR regime.
In addition to efficient computation of the globally optimal power allocation with nonlinear objectives and constraints, GP can also be used for
admission control based on the feasibility study described in [10], and for determining which QoS constraint is a performance bottleneck, i.e., met tightly at
the optimal power allocation (see footnote 4).
Footnote 4: This is because most GP solution algorithms solve both the primal GP and its
Lagrange dual problem, and by the complementary slackness condition, a resource
constraint is tight at the optimal power allocation when the corresponding optimal
dual variable is non-zero.

Table 2. Suite of Power Control Optimization solvable by GP

Objective Function                                   Constraints
(A) Maximize $R_{i^*}$ (specific user)               (a) $R_i \ge R_{i,\min}$ (rate constraint)
(B) Maximize $\min_i R_i$ (worst-case user)          (b) $P_{i1} G_{i1} = P_{i2} G_{i2}$ (near-far constraint)
(C) Maximize $\sum_i R_i$ (total throughput)         (c) $\sum_i R_i \ge R_{system,\min}$ (total throughput constraint)
(D) Maximize $\sum_i w_i R_i$ (weighted rate sum)    (d) $P_{o,i} \le P_{o,i,\max}$ (outage prob. constraint)
(E) Minimize $\sum_i P_i$ (total power)              (e) $0 \le P_i \le P_{i,\max}$ (power constraint)

Extensions
In wireless multihop networks, system throughput may be measured either by
end-to-end transport layer utilities or by link layer aggregate throughput. GP
application to the first approach has appeared in [9], and to the second
approach in [10]. Furthermore, delay and buffer overflow properties can also
be accommodated in the constraints or objective function of GP-based power
control.
3.4 Power Control by Geometric Programming: Non-convex Case
If we maximize the total throughput $R_{system}$ in the medium to low SIR case,
i.e., when SIR is not much larger than 0 dB, the approximation of $\log(1 + \mathrm{SIR})$
as $\log \mathrm{SIR}$ does not hold. Unlike SIR, which is an inverted posynomial, $1 + \mathrm{SIR}$ is
not an inverted posynomial. Instead, $\frac{1}{1 + \mathrm{SIR}}$
is a ratio between two posynomials:
$$
\frac{f(P)}{g(P)} = \frac{\sum_{j \neq i} G_{ij} P_j + n_i}{\sum_{j} G_{ij} P_j + n_i}.
\tag{19}
$$
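The identity behind (19) is straightforward to confirm numerically. The following minimal Python sketch uses a hypothetical two-link, low-SIR instance (gains, powers, and noise are illustrative assumptions, with fading replaced by its mean) and checks that $1/(1 + \mathrm{SIR}_i)$ equals the ratio of the two posynomials.

```python
def mean_sir(i, P, G, noise):
    """SIR_i with fading replaced by its unit mean."""
    interference = sum(G[i][j] * P[j] for j in range(len(P)) if j != i)
    return G[i][i] * P[i] / (interference + noise[i])


def ratio_of_posynomials(i, P, G, noise):
    """f(P) / g(P) from (19): interference plus noise over total received power plus noise."""
    n = len(P)
    f = sum(G[i][j] * P[j] for j in range(n) if j != i) + noise[i]
    g = sum(G[i][j] * P[j] for j in range(n)) + noise[i]
    return f / g


# Hypothetical low-SIR instance: cross gains are comparable to direct gains.
G = [[1.0, 0.8], [0.9, 1.0]]
P = [1.0, 1.0]
noise = [0.1, 0.1]

s0 = mean_sir(0, P, G, noise)
lhs = 1.0 / (1.0 + s0)
rhs = ratio_of_posynomials(0, P, G, noise)

print(s0, lhs, rhs)
```

Here the SIR is close to 0 dB, so replacing $\log(1 + \mathrm{SIR})$ by $\log \mathrm{SIR}$ would be a poor approximation and the ratio form has to be handled directly.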

Minimizing or upper bounding a ratio between two posynomials belongs to
a truly nonconvex class of problems known as Complementary GP [1, 10] that
is an intrinsically intractable NP-hard problem. An equivalent generalization
of GP is Signomial Programming (SP) [1, 10]: minimizing a signomial subject
to upper bound inequality constraints on signomials, where a signomial $s(x)$ is
a sum of monomials, possibly with negative multiplicative coefficients: $s(x) = \sum_{i=1}^{N} c_i g_i(x)$,
where $c \in \mathbb{R}^N$ and $g_i(x)$ are monomials (see footnote 5).
Footnote 5: An SP can always be converted into a Complementary GP, because an inequality
in SP, which can be written as $f_{i1}(x) - f_{i2}(x) \le 1$, where $f_{i1}$, $f_{i2}$ are posynomials,
is equivalent to an inequality $\frac{f_{i1}(x)}{1 + f_{i2}(x)} \le 1$ in Complementary GP.
Successive convex approximation method
Consider the following nonconvex problem:
$$
\begin{array}{ll}
\text{minimize} & f_0(x) \\
\text{subject to} & f_i(x) \le 1, \quad i = 1, 2, \ldots, m,
\end{array}
\tag{20}
$$

where $f_0$ is convex without loss of generality (see footnote 6), but the $f_i(x)$'s, $\forall i$, are nonconvex. Since directly solving this problem is NP-hard, we want to solve it by
a series of approximations $\tilde{f}_i(x) \approx f_i(x)$, $\forall x$, each of which can be optimally
solved in an easy way. It is known [33] that if the approximations satisfy the
following three properties, then the solutions of this series of approximations
converge to a point satisfying the necessary optimality Karush-Kuhn-Tucker
(KKT) conditions of the original problem:
(1) $f_i(x) \le \tilde{f}_i(x)$ for all $x$,
(2) $f_i(x_0) = \tilde{f}_i(x_0)$ where $x_0$ is the optimal solution of the approximated
problem in the previous iteration,
(3) $\nabla f_i(x_0) = \nabla \tilde{f}_i(x_0)$.
Given a method to approximate $f_i(x)$ with $\tilde{f}_i(x)$, $\forall i$, around some
point of interest $x_0$, the following algorithm describes the generic successive
approximation approach and outputs a vector
that satisfies the KKT conditions of the original problem.
Algorithm 2. Successive approximation to a nonconvex problem.
1) Choose an initial feasible point $x^{(0)}$ and set $k = 1$.
2) Form an approximated problem of (20) based on the previous point $x^{(k-1)}$.
3) Solve the $k$-th approximated problem to obtain $x^{(k)}$.
4) Increment $k$ and go to step 2 until convergence to a stationary point.
Single condensation method. Complementary GPs involve upper bounds on
the ratio of posynomials as in (19); they can be turned into GPs by approximating the denominator of the ratio of posynomials, $g(x)$, with a monomial
$\tilde{g}(x)$, but leaving the numerator $f(x)$ as a posynomial.
Lemma 1 Let $g(x) = \sum_i u_i(x)$ be a posynomial. Then
$$
g(x) \ge \tilde{g}(x) = \prod_i \Big( \frac{u_i(x)}{\alpha_i} \Big)^{\alpha_i}.
\tag{21}
$$
If, in addition, $\alpha_i = u_i(x_0)/g(x_0)$, $\forall i$, for any fixed positive $x_0$, then $\tilde{g}(x_0) =
g(x_0)$, and $\tilde{g}(x)$ is the best local monomial approximation to $g(x)$ near $x_0$ in
the sense of first order Taylor approximation.
Proof. The arithmetic-geometric mean inequality states that $\sum_i \alpha_i v_i \ge \prod_i v_i^{\alpha_i}$,
where $v \succ 0$ and $\alpha \succeq 0$, $\mathbf{1}^T \alpha = 1$. Letting $u_i = \alpha_i v_i$, we can write this basic
inequality as $\sum_i u_i \ge \prod_i \big( \frac{u_i}{\alpha_i} \big)^{\alpha_i}$. The inequality becomes an equality if we
let $\alpha_i = u_i / \sum_i u_i$, $\forall i$, which satisfies the condition that $\alpha \succeq 0$ and $\mathbf{1}^T \alpha = 1$.
It can be readily verified that the best local monomial approximation of $g(x)$
near $x_0$ is $\tilde{g}(x)$.
Footnote 6: If $f_0$ is nonconvex, we can move the objective function to the
constraints by introducing auxiliary scalar variable $t$ and writing
minimize $t$ subject to the additional constraint $f_0(x) - t \le 0$.
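Lemma 1 can be illustrated with a short numerical check. The sketch below is a hypothetical example, not code from the chapter: the posynomial $g(x) = 1 + 2x_1 + 0.5x_2$, the expansion point, and the helper names are assumptions. It builds the condensed monomial $\tilde{g}$ with weights $\alpha_i = u_i(x_0)/g(x_0)$, then checks that $\tilde{g}(x_0) = g(x_0)$ and that $\tilde{g}(x) \le g(x)$ at randomly sampled points.

```python
import random


def monomial(d, exps, x):
    """Evaluate d * prod_j x_j ** exps[j]."""
    value = d
    for xj, a in zip(x, exps):
        value *= xj ** a
    return value


def posynomial(terms, x):
    """Evaluate a sum of monomials given as (coefficient, exponents) pairs."""
    return sum(monomial(d, exps, x) for d, exps in terms)


def condense(terms, x0):
    """AM-GM monomial approximation (21) of a posynomial around x0.

    Returns (coefficient, exponents) of
    g~(x) = prod_i (u_i(x) / alpha_i) ** alpha_i with alpha_i = u_i(x0) / g(x0).
    """
    g0 = posynomial(terms, x0)
    alpha = [monomial(d, exps, x0) / g0 for d, exps in terms]
    coeff = 1.0
    exponents = [0.0] * len(x0)
    for (d, exps), a in zip(terms, alpha):
        coeff *= (d / a) ** a
        for j, e in enumerate(exps):
            exponents[j] += a * e
    return coeff, exponents


# Hypothetical denominator posynomial g(x) = 1 + 2 x1 + 0.5 x2.
g_terms = [(1.0, [0.0, 0.0]), (2.0, [1.0, 0.0]), (0.5, [0.0, 1.0])]
x0 = [1.0, 2.0]
coeff, exponents = condense(g_terms, x0)

# Equality at the expansion point.
gap_at_x0 = posynomial(g_terms, x0) - monomial(coeff, exponents, x0)

# Lower bound elsewhere (small relative slack for floating-point rounding).
rng = random.Random(1)
samples = [[rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)] for _ in range(1000)]
is_lower_bound = all(
    monomial(coeff, exponents, x) <= posynomial(g_terms, x) * (1.0 + 1e-12)
    for x in samples
)

print(coeff, exponents, gap_at_x0, is_lower_bound)
```

Because the condensed function is a single monomial, dividing a posynomial numerator by it yields another posynomial, which is what allows each iteration of the condensation method to be a GP.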
Proposition 2 The approximation of a ratio of posynomials $f(x)/g(x)$ with
$f(x)/\tilde{g}(x)$, where $\tilde{g}(x)$ is the monomial approximation of $g(x)$ using the
arithmetic-geometric mean approximation of Lemma 1, satisfies the three conditions for the convergence of the successive approximation method.
Proof. Conditions (1) and (2) are clearly satisfied since $g(x) \ge \tilde{g}(x)$ and
$\tilde{g}(x_0) = g(x_0)$ (Lemma 1). Condition (3) is easily verified by taking derivatives
of $g(x)$ and $\tilde{g}(x)$.
Double condensation method. Another choice of approximation is to make
a double monomial approximation for both the denominator and numerator
in (19). However, in order to satisfy the three conditions for the convergence
of the successive approximation method, a monomial approximation for the
numerator $f(x)$ should satisfy $f(x) \le \tilde{f}(x)$.
Applications to power control
Figure 8 shows a block diagram of the approach of GP-based power control
for general SIR regime. In the high SIR regime, we need to solve only one GP.
In the medium to low SIR regimes, the power control problems are truly nonconvex
and cannot be turned into a convex formulation; we solve them through a series of
GPs.

[Figure 8 block diagram. High SIR: Original Problem -> Solve 1 GP. Medium to Low SIR: Original Problem -> SP -> Complementary GP (Condensed) -> Solve 1 GP.]

Fig. 8. GP-based power control for general SIR regime.

GP-based power control problems in the medium to low SIR regimes become SP (or, equivalently, Complementary GP), which can be solved by the
single or double condensation method. We focus on the single condensation
method here. Consider a representative problem formulation of maximizing
total system throughput in a cellular wireless network subject to user rate
and outage probability constraints in problem (18), which can be explicitly
written out as:
$$
\begin{array}{ll}
\text{minimize} & \prod_{i=1}^{N} \frac{1}{1 + \mathrm{SIR}_i} \\
\text{subject to} & (2^{T R_{i,\min}} - 1) \frac{1}{\mathrm{SIR}_i} \le 1, \quad i = 1, \ldots, N, \\
& (\mathrm{SIR}_{th})^{N-1} (1 - P_{o,i,\max}) \prod_{j \neq i}^{N} \frac{G_{ij} P_j}{G_{ii} P_i} \le 1, \quad i = 1, \ldots, N, \\
& P_i (P_{i,\max})^{-1} \le 1, \quad i = 1, \ldots, N.
\end{array}
\tag{22}
$$
All the constraints are posynomials. However, the objective is not a posynomial, but a ratio between two posynomials as in (19). This power control
problem can be solved by the condensation method by solving a series of
GPs. Specifically, we have the following single-condensation algorithm:
Algorithm 3. Single condensation GP power control.
1) Evaluate the denominator posynomial of the objective function in (22)
with the given P.
2) Compute for each term $i$ in this posynomial,
$$\alpha_i = \frac{\text{value of } i\text{th term in posynomial}}{\text{value of posynomial}}.$$
3) Condense the denominator posynomial of the (22) objective function
into a monomial using (21) with weights $\alpha_i$.
4) Solve the resulting GP using an interior point method.
5) Go to step 1 using $P$ of step 4.
6) Terminate the $k$th loop if $\| P^{(k)} - P^{(k-1)} \| \le \epsilon$, where $\epsilon$ is the error
tolerance for exit condition.
Because the condensed denominator never exceeds the original denominator posynomial, condensing the objective in the above problem gives an upper bound on the minimized objective in (22), i.e., an underestimate of the achievable throughput. Each GP in the condensation iteration loop tries
to improve the accuracy of the approximation to a particular minimum in the
original feasible region. All three conditions for convergence are satisfied, and
the algorithm is convergent. Empirically, through extensive numerical experiments, we observe that it almost always computes the globally optimal power
allocation.
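The loop structure of the single condensation method can be sketched on a one-variable toy problem. This is only an illustration under stated assumptions: the ratio $(x^2+4)/(x+1)$ is a hypothetical example, not problem (22), and the condensed GP in step 4 is solved by a closed-form stationarity condition (valid for this toy problem only) rather than by an interior point method. The function names are illustrative.

```python
import math

def objective(x):
    """Toy ratio of posynomials f(x)/g(x) with f = x^2 + 4 and g = x + 1."""
    return (x ** 2 + 4.0) / (x + 1.0)

def denominator_terms(x):
    """Monomial terms of the denominator posynomial g(x) = x + 1."""
    return [x, 1.0]

def solve_condensed_gp(a):
    """Minimize (x^2 + 4) * x**(-a) over x > 0 for 0 < a < 1.

    Setting the derivative to zero gives (2 - a) * x^2 = 4 * a. This closed form
    is an assumption of the toy problem and replaces the interior point solver.
    """
    return math.sqrt(4.0 * a / (2.0 - a))

def single_condensation(x, eps=1e-10, max_iter=200):
    for k in range(1, max_iter + 1):
        terms = denominator_terms(x)          # step 1: evaluate the denominator
        g_val = sum(terms)
        alpha = [u / g_val for u in terms]    # step 2: weights of each term
        # step 3: g_tilde(x) = c * x**alpha[0]; the constant c does not move the argmin
        x_new = solve_condensed_gp(alpha[0])  # step 4: solve the condensed GP
        if abs(x_new - x) <= eps:             # step 6: exit test
            return x_new, k
        x = x_new                             # step 5: iterate from the new point
    return x, max_iter

x_star, num_gps = single_condensation(5.0)
```

For this toy problem the stationary point of the original ratio solves $x^2 + 2x - 4 = 0$, i.e. $x = \sqrt{5} - 1$, so the limit of the iteration can be compared against it directly.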
Example 6. Single condensation example. We consider a wireless cellular
network with 3 users. Let $T = 10^{-6}$ s, $G_{ii} = 1.5$, and generate $G_{ij}$, $i \ne j$,
as independent random variables uniformly distributed between 0 and 0.3.
Threshold SIR is $\mathrm{SIR}_{th} = 10$ dB, and minimal data rate requirements are
100 kbps, 600 kbps and 1000 kbps for logical links 1, 2 and 3 respectively.
Maximal outage probabilities are 0.01 for all links, and maximal transmit
powers are 3mW, 4mW and 5mW for link 1, 2 and 3, respectively. For each
instance of SP power control (22), we pick a random initial feasible power
vector $P$ uniformly between 0 and $P_{\max}$. Figure 9 compares the maximized
total network throughput achieved over five hundred sets of experiments with
different initial vectors. With the (single) condensation method, SP converges
to different optima over the entire set of experiments, achieving (or coming
very close to) the global optimum at 5290 bps 96% of the time and a local
optimum at 5060 bps 4% of the time. The average number of GP iterations
required by the condensation method over the same set of experiments is 15
if an extremely tight exit condition is picked for SP condensation iteration:
$\epsilon = 1 \times 10^{-10}$. This average can be substantially reduced by using a larger $\epsilon$,
e.g., increasing $\epsilon$ to $1 \times 10^{-2}$ requires on average only 4 GPs.
[Figure 9: plot titled "Optimized total system throughput"; vertical axis: total system throughput achieved (ticks from 5050 to 5300); horizontal axis: experiment index (0 to 500).]

Fig. 9. Maximized total system throughput achieved by the (single) condensation
method for 500 different initial feasible vectors (Example 6). Each point represents
a different experiment with a different initial power vector.

We have thus far discussed a power control problem (22) where the objective function needs to be condensed. The method is also applicable if some
constraint functions are signomials and need to be condensed [51].
3.5 Distributed Implementation
A limitation for GP-based power control in ad hoc networks without base
stations is the need for centralized computation (e.g., by interior point methods). The GP formulations of power control problems can also be solved by
a new method of distributed algorithm for GP. The basic idea is that each
user solves its own local optimization problem and the coupling among users
is taken care of by message passing among the users. Interestingly, the special structure of coupling for the problem at hand (all coupling among the
logical links can be lumped together using interference terms) allows one to


further reduce the amount of message passing among the users. Specifically,
we use a dual decomposition method to decompose a GP into smaller subproblems whose solutions are jointly and iteratively coordinated by the use
of dual variables. The key step is to introduce auxiliary variables and to add
extra equality constraints, thus transferring the coupling in the objective to
coupling in the constraints, which can be solved by introducing consistency
pricing (in contrast to congestion pricing). We illustrate this idea through
an unconstrained GP followed by an application of the technique to power
control.
Distributed algorithm for GP
Suppose we have the following unconstrained standard form GP in $x \succ 0$:
$$\text{minimize} \quad \sum_i f_i(x_i, \{x_j\}_{j \in I(i)}) \tag{23}$$
where $x_i$ denotes the local variable of the $i$th user, $\{x_j\}_{j \in I(i)}$ denote the
coupled variables from other users, and $f_i$ is either a monomial or posynomial.
Making a change of variable $y_i = \log x_i$, $\forall i$, in the original problem, we obtain
$$\text{minimize} \quad \sum_i f_i(e^{y_i}, \{e^{y_j}\}_{j \in I(i)}).$$
We now rewrite the problem by introducing auxiliary variables $y_{ij}$ for the
coupled arguments and additional equality constraints to enforce consistency:
$$
\begin{array}{ll}
\text{minimize} & \sum_i f_i(e^{y_i}, \{e^{y_{ij}}\}_{j \in I(i)}) \\
\text{subject to} & y_{ij} = y_j, \quad \forall j \in I(i), \ \forall i.
\end{array}
\tag{24}
$$
Each $i$th user controls the local variables $(y_i, \{y_{ij}\}_{j \in I(i)})$. Next, the Lagrangian
of (24) is formed as
$$
L(\{y_i\}, \{y_{ij}\}; \{\gamma_{ij}\}) = \sum_i f_i(e^{y_i}, \{e^{y_{ij}}\}_{j \in I(i)}) + \sum_i \sum_{j \in I(i)} \gamma_{ij} (y_j - y_{ij}) = \sum_i L_i(y_i, \{y_{ij}\}; \{\gamma_{ij}\})
$$
where
$$
L_i(y_i, \{y_{ij}\}; \{\gamma_{ij}\}) = f_i(e^{y_i}, \{e^{y_{ij}}\}_{j \in I(i)}) + \Big( \sum_{j : i \in I(j)} \gamma_{ji} \Big) y_i - \sum_{j \in I(i)} \gamma_{ij} y_{ij}. \tag{25}
$$
The minimization of the Lagrangian with respect to the primal variables
$(\{y_i\}, \{y_{ij}\})$ can be done simultaneously and distributively by each user in
parallel. In the more general case where the original problem (23) is constrained, the additional constraints can be included in the minimization at
each $L_i$.


In addition, the following master Lagrange dual problem has to be solved
to obtain the optimal dual variables or consistency prices $\{\gamma_{ij}\}$:
$$\max_{\{\gamma_{ij}\}} \ g(\{\gamma_{ij}\}) \tag{26}$$
where
$$g(\{\gamma_{ij}\}) = \sum_i \min_{y_i, \{y_{ij}\}} L_i(y_i, \{y_{ij}\}; \{\gamma_{ij}\}).$$
Note that the transformed primal problem (24) is convex with zero duality
gap; hence the Lagrange dual problem indeed solves the original standard GP
problem. A simple way to solve the maximization in (26) is with the following
subgradient update for the consistency prices:
$$\gamma_{ij}(t+1) = \gamma_{ij}(t) + \delta(t) \big( y_j(t) - y_{ij}(t) \big). \tag{27}$$
Appropriate choice of the stepsize $\delta(t) > 0$, e.g., $\delta(t) = \delta_0 / t$ for some constant
$\delta_0 > 0$, leads to convergence of the dual algorithm.
Summarizing, the $i$th user has to: i) minimize the function $L_i$ in (25)
involving only local variables, upon receiving the updated dual variables
$\{\gamma_{ji}, j : i \in I(j)\}$, and ii) update the local consistency prices $\{\gamma_{ij}, j \in I(i)\}$
with (27), and broadcast the updated prices to the coupled users.
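The two-step procedure above can be sketched on a toy instance. The following assumes a hypothetical two-user problem with $f_1 = x_1 + 1/x_2$ and $f_2 = x_2 + 1/x_1$ (not from the text), chosen because the local minimizations of (25) then have closed forms; the closed forms require the prices to stay negative, which holds for the hypothetical starting values used here. Function names and default values are illustrative.

```python
import math

def local_minimizers(gamma_own, gamma_received):
    """Minimize L_i = e^{y_i} + e^{-y_ij} + gamma_received*y_i - gamma_own*y_ij.

    gamma_own is user i's consistency price gamma_ij; gamma_received is gamma_ji,
    sent by the other user. Both must be negative in this toy problem.
    """
    y_i = math.log(-gamma_received)   # stationarity: e^{y_i} + gamma_ji = 0
    y_ij = -math.log(-gamma_own)      # stationarity: -e^{-y_ij} - gamma_ij = 0
    return y_i, y_ij

def distributed_gp(gamma12=-0.5, gamma21=-2.0, delta0=0.5, iters=2000):
    for t in range(1, iters + 1):
        y1, y12 = local_minimizers(gamma12, gamma21)   # user 1 solves its L_1
        y2, y21 = local_minimizers(gamma21, gamma12)   # user 2 solves its L_2
        step = delta0 / t                              # diminishing stepsize
        gamma12 += step * (y2 - y12)                   # price update (27), user 1
        gamma21 += step * (y1 - y21)                   # price update (27), user 2
    y1, y12 = local_minimizers(gamma12, gamma21)
    y2, y21 = local_minimizers(gamma21, gamma12)
    x = (math.exp(y1), math.exp(y2))
    consistency_gap = (y12 - y2, y21 - y1)
    return x, consistency_gap

x, consistency_gap = distributed_gp()
total_cost = x[0] + 1.0 / x[1] + x[1] + 1.0 / x[0]
```

The toy objective $x_1 + 1/x_1 + x_2 + 1/x_2$ is minimized at $x_1 = x_2 = 1$ with value 4, so the iterates and the consistency gaps $y_{ij} - y_j$ can be checked against that solution. Each user touches only its own variables and the one price it receives, which is the point of the decomposition.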
Applications to power control
As an illustrative example, we maximize the total system throughput in the
high SIR regime with constraints local to each user. If we directly applied the
distributed approach described in the last subsection, the resulting algorithm
would require knowledge by each user of the interfering channels and interfering transmit powers, which would translate into a large amount of message
passing. To obtain a practical distributed solution, we can leverage the structures of power control problems at hand, and instead keep a local copy of
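each of the effective received powers $P_{ij}^R = G_{ij} P_j$. Again using problem (18)
as an example formulation and assuming high SIR, we can write the problem
as follows (after the log change of variable):
$$
\begin{array}{ll}
\text{minimize} & \sum_i \log \Big( G_{ii}^{-1} \exp(-\tilde{P}_i) \Big( \sum_{j \ne i} \exp(\tilde{P}_{ij}^R) + \sigma^2 \Big) \Big) \\
\text{subject to} & \tilde{P}_{ij}^R = \tilde{G}_{ij} + \tilde{P}_j, \\
& \text{Constraints local to each user, e.g., (a), (d) and (e) in Table 2.}
\end{array}
\tag{28}
$$
The partial Lagrangian is
$$
L = \sum_i \log \Big( G_{ii}^{-1} \exp(-\tilde{P}_i) \Big( \sum_{j \ne i} \exp(\tilde{P}_{ij}^R) + \sigma^2 \Big) \Big) + \sum_i \sum_{j \ne i} \gamma_{ij} \Big( \tilde{P}_{ij}^R - \big( \tilde{G}_{ij} + \tilde{P}_j \big) \Big),
\tag{29}
$$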


and the local $i$th Lagrangian function in (29) is distributed to the $i$th user,
from which the dual decomposition method can be used to determine the optimal power allocation $P^*$. The distributed power control algorithm is summarized as follows.
Algorithm 4. Distributed power allocation update to maximize $R_{system}$.
At each iteration $t$:
1) The $i$th user receives the term $\big( \sum_{j \ne i} \gamma_{ji}(t) \big)$ involving the dual variables
from the interfering users by message passing and minimizes the following local
Lagrangian with respect to $\tilde{P}_i(t)$, $\{\tilde{P}_{ij}^R(t)\}_j$ subject to the local constraints:
$$
L_i \big( \tilde{P}_i(t), \{\tilde{P}_{ij}^R(t)\}_j ; \{\gamma_{ij}(t)\}_j \big) = \log \Big( G_{ii}^{-1} \exp(-\tilde{P}_i(t)) \Big( \sum_{j \ne i} \exp(\tilde{P}_{ij}^R(t)) + \sigma^2 \Big) \Big) + \sum_{j \ne i} \gamma_{ij} \tilde{P}_{ij}^R(t) - \Big( \sum_{j \ne i} \gamma_{ji}(t) \Big) \tilde{P}_i(t).
$$
2) The $i$th user estimates the effective received power from each of the
interfering users $P_{ij}^R(t) = G_{ij} P_j(t)$ for $j \ne i$, updates the dual variable by
$$
\gamma_{ij}(t+1) = \gamma_{ij}(t) + (\delta_0 / t) \big( \tilde{P}_{ij}^R(t) - \log G_{ij} P_j(t) \big),
\tag{30}
$$
and then broadcasts them by message passing to all interfering users in the
system.
Example 7. Distributed GP power control. We apply the distributed algorithm to solve the above power control problem for three logical links with
$G_{ij} = 0.2$, $i \ne j$, $G_{ii} = 1$, $\forall i$, maximal transmit powers of 6mW, 7mW and
7mW for link 1, 2 and 3 respectively. Figure 10 shows the convergence of the
dual objective function towards the globally optimal total throughput of the
network. Figure 11 shows the convergence of the two auxiliary variables in
link 1 and 3 towards the optimal solutions.
[Figure 10: plot of the dual objective function (vertical axis, scale x 10^4, ticks from 0.4 to 2.2) versus iteration (horizontal axis, 50 to 200).]

Fig. 10. Convergence of the dual objective function through distributed algorithm
(Example 7).

[Figure 11: plot titled "Consistency of the auxiliary variables"; curves log(P_2), log(P^R_12/G_12) and log(P^R_32/G_32) versus iteration (horizontal axis, 50 to 200).]

Fig. 11. Convergence of the consistency constraints through distributed algorithm
(Example 7).

3.6 Concluding Remarks and Future Directions
Power control problems with nonlinear objective and constraints may seem to
be difficult, NP-hard problems to solve for global optimality. However, when
SIR is much larger than 0dB, GP can be used to turn these problems into
intrinsically tractable convex formulations, accommodating a variety of possible combinations of objective and constraint functions involving data rate,
delay, and outage probability. Then interior point algorithms can efficiently
compute the globally optimal power allocation even for a large network. Feasibility analysis of GP naturally leads to admission control and pricing schemes.
When the high SIR approximation cannot be made, these power control problems become SP and may be solved by the heuristic of the condensation method
through a series of GPs. Distributed optimal algorithms for GP-based power
control in multihop networks can also be carried out through message passing.
Several interesting research issues remain to be further explored, in particular, reduction of SP solution complexity (e.g., by using the high-SIR approximation to obtain the initial power vector and by solving the series of GPs
only approximately except the last GP), and combination of SP solution and
distributed algorithm for distributed power control in the low SIR regime.


4 DSL Spectrum Management


4.1 Introduction
Digital Subscriber Line (DSL) technologies transform traditional voice-band
copper channels into high bandwidth data pipes, which are capable of delivering data rates up to several Mbps per twisted-pair over a distance of about 10
kft. The major obstacle for performance improvement in modern DSL systems
(e.g., ADSL and VDSL) is the crosstalk, which is the interference generated
between different lines in the same binder. The crosstalk is typically 10-20
dB larger than the background noise, and direct crosstalk cancellation (e.g.,
[6, 17]) may not be feasible in many cases due to the complexity issues or as
a result of unbundling. To mitigate the detriments caused by crosstalk, static
spectrum management that mandates spectrum mask or flat power backoff
across all frequencies (i.e., tones) has been implemented in the current system.
Dynamic spectrum management (DSM) techniques, on the other hand,
can significantly improve data rates over the current practice of static spectrum management. Within the current capability of the DSL modems, each
modem has the capability to shape its own power spectrum density (PSD)
across different tones, but can only treat crosstalk as background noise (i.e.,
no signal level coordination, such as vector transmission or iterative decoding, is allowed), and each modem is inherently a single-input-single-output
communication system. The objective would be to optimize the PSD of all
users on all tones (i.e., continuous power loading or discrete bit loading), such
that they are compatible with each other and the system performance (e.g.,
weighted rate sum as discussed below) is maximized.
Compared to power control in wireless networks treated in the last section,
the channel gains are not time-varying in DSL systems, but the problem dimension increases tremendously because there are many tones (or frequency
carriers) over which transmission takes place. Nonconvexity still remains a
major technical challenge, and high SIR approximation in general cannot be
made. However, utilizing the specific structures of the problem (e.g., the channel gain values), an efficient and distributed heuristic is shown to perform
close to the optimum in many realistic DSL network scenarios.
This section develops, analyzes, and simulates a new algorithm for spectrum management in frequency selective interference channels for DSL, called
Autonomous Spectrum Balancing (ASB). It is autonomous (a distributed algorithm across the users without explicit information exchange) with linear complexity, is provably convergent, and comes close to the globally optimal
rate region in practice, thus overcoming bottlenecks in the state-of-the-art algorithms in DSM, such as IW, OSB, and ISB summarized below.
Let K be the number of tones and N the number of users (lines). The
iterative waterfilling (IW) algorithm [56] is among the first DSM algorithms proposed. In IW, each user views any crosstalk experienced as additive
Gaussian noise, and seeks to maximize its data rate by waterfilling over the
aggregated noise plus interference. No information exchange is needed among
users, and all the actions are completely autonomous. IW leads to a great
performance improvement over the static approach, and enjoys a low complexity that is
linear in N. However, the greedy nature of IW leads to a performance far
from optimal in the near-far scenarios such as mixed CO/RT deployment and
upstream VDSL.
To address this, an optimal spectrum balancing (OSB) algorithm [8] has
been proposed, which finds the best possible spectrum management solution
under the current capabilities of the DSL modems. OSB avoids the selfish
behaviors of individual users by aiming at the maximization of a total weighted
sum of users' rates, which corresponds to a boundary point of the achievable
rate region. On the other hand, OSB has a high computation complexity that
is exponential in N , which quickly leads to intractability when N is larger
than 6. Moreover, it is a completely centralized algorithm where a spectrum
management center at the central office needs to know the global information
(i.e., all the noise PSDs and crosstalk channel gains in the same binder) to
perform the algorithm.
As an improvement to the OSB algorithm, an iterative spectrum balancing
(ISB) algorithm [7] has been proposed, which is based on a weighted sum
rate maximization similar to OSB. Different from OSB, ISB performs the
optimization iteratively through users, which leads to a quadratic complexity
in N. Close-to-optimal performance can be achieved by the ISB algorithm in
most cases. However, each user still needs to know the global information as in
OSB, thus ISB is still a centralized algorithm and considered to be impractical
in many cases.
This section presents the ASB algorithm [21], which further reduces the
complexity from the ISB algorithm, and achieves close-to-optimal performance similar to ISB and OSB. The basic idea is to use the concept of a reference line
to mimic a typical victim line in the current binder. By setting the power
spectrum level to protect the reference line, a good balance between selfish
and global maximizations can be achieved. The ASB algorithm enjoys a linear
complexity in N and K, and can be implemented in a completely autonomous
way. We prove the convergence of ASB for both the 2-user and N-user cases, under
both sequential and parallel updates.
Table 3 compares various aspects of different DSM algorithms. Utilizing
the structures of the DSL problem, in particular, the lack of channel variation
and user mobility, is the key to providing a linear-complexity, distributed, convergent, and almost optimal solution to this coupled nonconvex optimization
problem.

Table 3. Comparison of different DSM algorithms

Algorithm   Operation     Complexity   Performance    Reference
IW          Autonomous    O(KN)        Suboptimal     [56]
OSB         Centralized   O(K e^N)     Optimal        [8]
ISB         Centralized   O(K N^2)     Near optimal   [7]
ASB         Autonomous    O(KN)        Near optimal   [21]

4.2 System Model
Using the notation as in [8, 7], we consider a DSL bundle with $\mathcal{N} = \{1, ..., N\}$
modems (i.e., lines, users) and $\mathcal{K} = \{1, ..., K\}$ tones. Assume discrete multitone (DMT) modulation is employed by all modems; transmission can then be
modeled independently on each tone as
$$y_k = H_k x_k + z_k.$$
The vector $x_k = \{x_k^n, n \in \mathcal{N}\}$ contains transmitted signals on tone $k$, where
$x_k^n$ is the signal transmitted onto line $n$ at tone $k$. $y_k$ and $z_k$ have similar
structures. $y_k$ is the vector of received signals on tone $k$. $z_k$ is the vector of
additive noise on tone $k$ and contains thermal noise, alien crosstalk, single-carrier modems, radio frequency interference etc. $H_k = [h_k^{n,m}]_{n,m \in \mathcal{N}}$ is the
$N \times N$ channel transfer matrix on tone $k$, where $h_k^{n,m}$ is the channel from
TX $m$ to RX $n$ on tone $k$. The diagonal elements of $H_k$ contain the direct channels whilst the off-diagonal elements contain the crosstalk channels. We
denote the transmit power spectrum density (PSD) $s_k^n = E\{|x_k^n|^2\}$. In the last
section's notation for single-carrier systems, we would have $s_k^n = P_n$, $\forall k$. For
convenience we denote the vector containing the PSD of user $n$ on all tones
as $s^n = \{s_k^n, k \in \mathcal{K}\}$. We denote the DMT symbol rate as $f_s$.
Assume that each modem treats interference from other modems as noise.
When the number of interfering modems is large, the interference can be
well approximated by a Gaussian distribution. Under this assumption the
achievable bit loading of user $n$ on tone $k$ is
$$
b_k^n = \log \left( 1 + \frac{1}{\Gamma} \, \frac{s_k^n}{\sum_{m \ne n} \alpha_k^{n,m} s_k^m + \sigma_k^n} \right),
\tag{31}
$$
where $\alpha_k^{n,m} = |h_k^{n,m}|^2 / |h_k^{n,n}|^2$ is the normalized crosstalk channel gain, and
$\sigma_k^n$ is the noise power density normalized by the direct channel gain $|h_k^{n,n}|^2$.
Here $\Gamma$ denotes the SINR-gap to capacity, which is a function of the desired
BER, coding gain and noise margin [49]. Without loss of generality, we assume
$\Gamma = 1$. The data rate on line $n$ is thus
$$
R^n = f_s \sum_{k \in \mathcal{K}} b_k^n.
\tag{32}
$$
Each modem $n$ is typically subject to a total power constraint $P^n$, due to the
limitations on each modem's analog frontend:
$$
\sum_{k \in \mathcal{K}} s_k^n \le P^n.
\tag{33}
$$
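The quantities in (31) to (33) can be evaluated directly. The following is a minimal sketch, assuming a base-2 logarithm (so that the bit loading is in bits), nested Python lists indexed as `s[n][k]`, `alpha[n][m][k]` and `sigma[n][k]`, and hypothetical numbers for a two-line, two-tone system; the function names and the symbol rate value are illustrative, not from the text.

```python
import math

def bit_loading(s, alpha, sigma, gap=1.0):
    """Bit loading b[n][k] per (31); gap is the SINR gap (1.0 as assumed in the text)."""
    num_users, num_tones = len(s), len(s[0])
    b = [[0.0] * num_tones for _ in range(num_users)]
    for n in range(num_users):
        for k in range(num_tones):
            # Crosstalk from every other line, treated as noise.
            interference = sum(alpha[n][m][k] * s[m][k]
                               for m in range(num_users) if m != n)
            sinr = s[n][k] / (interference + sigma[n][k])
            b[n][k] = math.log2(1.0 + sinr / gap)
    return b

def data_rates(b, fs):
    """Data rate of each line per (32): symbol rate times total bits per symbol."""
    return [fs * sum(row) for row in b]

def power_feasible(s, p_max):
    """Total power constraint (33) for every line."""
    return all(sum(row) <= p + 1e-12 for row, p in zip(s, p_max))

# Hypothetical two-line, two-tone example.
s = [[1.0, 3.0], [1.0, 0.0]]
alpha = [[None, [1.0, 0.5]], [[0.0, 0.0], None]]   # alpha[n][n] is never read
sigma = [[1.0, 1.0], [1.0, 1.0]]
b = bit_loading(s, alpha, sigma)
rates = data_rates(b, fs=4000.0)                   # hypothetical symbol rate
```

Line 1 suffers crosstalk from line 2 on the first tone only, so its SINR there drops from 1 to 0.5, while line 2 sees no crosstalk at all in this example.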

4.3 Spectrum Management Problem Formulation


The spectrum management problem is defined as follows:
$$
\begin{array}{ll}
\text{maximize} & R^1 \\
\text{subject to} & R^n \ge R^{n,target}, \quad \forall n > 1, \\
& \sum_{k \in \mathcal{K}} s_k^n \le P^n, \quad \forall n.
\end{array}
\tag{34}
$$
Here $R^{n,target}$ is the target rate constraint of user $n$. In other words, we
try to maximize the achievable rate of user 1, under the condition that all
other users achieve their target rates $R^{n,target}$. The mutual interference in
(31) causes Problem (34) to be coupled across users on each tone, and the
individual total power constraint causes Problem (34) to be coupled across
tones as well. Moreover, the objective function in Problem (34) is non-convex
due to the coupling of interference, and the convexity of the rate region cannot
be guaranteed in general.
However, it has been shown in [57] that the duality gap of Problem (34)
goes to zero when the number of tones K gets large (e.g.,
for VDSL), thus Problem (34) can be solved by the dual decomposition method,
which brings the complexity as a function of K down to linear. Moreover, a
frequency-sharing property ensures the rate region is convex with large enough
K, and each boundary point of the rate region can be
achieved by a weighted rate maximization as follows [8]:
$$
\begin{array}{ll}
\text{maximize} & R^1 + \sum_{n > 1} w^n R^n \\
\text{subject to} & \sum_{k \in \mathcal{K}} s_k^n \le P^n, \quad \forall n \in \mathcal{N},
\end{array}
\tag{35}
$$
such that the nonnegative weight coefficient $w^n$ is adjusted to ensure that the
target rate constraint of user $n$ is met. Without loss of generality, here we
define $w^1 = 1$. By changing the rate constraints $R^{n,target}$ for users $n > 1$ (or
equivalently, changing the weight coefficients $w^n$ for $n > 1$), every boundary
point of the convex rate region can be tracked.
We observe that at the optimal solutions of (34), each user chooses a
PSD level that leads to a good balance between maximizing its own rate
and minimizing the damage it causes to the other users. To accurately
calculate the latter, the user needs to know the global information of the noise
PSDs and crosstalk channel gains. However, if we aim at a less aggressive
objective and only require that each user give enough protection to the other users
in the binder while maximizing its own rate, then global information may
not be needed. Indeed, we can introduce the concept of a reference line,
a virtual line that represents a typical victim in the current binder. Then
instead of solving (34), each user tries to maximize the achievable data rate
on the reference line, subject to its own data rate and total power constraint.
Define the rate of the reference line to user $n$ as
$$
R^{n,ref} = \sum_{k \in \mathcal{K}} \tilde{b}_k^n = \sum_{k \in \mathcal{K}} \log \left( 1 + \frac{\tilde{s}_k}{\tilde{\alpha}_k^n s_k^n + \tilde{\sigma}_k} \right).
$$
The coefficients $\{\tilde{s}_k, \tilde{\sigma}_k, \tilde{\alpha}_k^n, \forall k, n\}$ are parameters of the reference line and
can be obtained from field measurement. They represent the conditions of a
typical victim user in an interference channel (here a binder of DSL lines),
and are known to the users a priori. They can be further updated on a much
slower timescale through channel measurement data. User $n$ then wants to
solve the following problem local to itself:
$$
\begin{array}{ll}
\text{maximize} & R^{n,ref} \\
\text{subject to} & R^n \ge R^{n,target}, \\
& \sum_{k \in \mathcal{K}} s_k^n \le P^n.
\end{array}
\tag{36}
$$
By using Lagrangian relaxation on the rate target constraint in Problem
(36) with a weight coefficient (dual variable) $w^n$, the relaxed version of (36)
is
$$
\begin{array}{ll}
\text{maximize} & w^n R^n + R^{n,ref} \\
\text{subject to} & \sum_{k \in \mathcal{K}} s_k^n \le P^n.
\end{array}
\tag{37}
$$
The weight coefficient $w^n$ needs to be adjusted to enforce the rate constraint.


4.4 ASB Algorithms
We first introduce the basic version of the ASB algorithm (ASB-I), where each
user $n$ chooses the PSD $s^n$ to solve (36), and updates the weight coefficient $w^n$
to enforce the target rate constraint. Then we introduce a variation of the ASB
algorithm (ASB-II) that enjoys even lower computational complexity
and provable convergence.
ASB-I

For each user $n$, we replace the original optimization (36) with the Lagrange
dual problem
$$
\max_{\lambda^n \ge 0, \ \sum_{k \in \mathcal{K}} s_k^n \le P^n} \ \sum_{k \in \mathcal{K}} \max_{s_k^n} J_k^n \big( w^n, \lambda^n, s_k^n, s_k^{-n} \big),
\tag{38}
$$
where
$$
J_k^n \big( w^n, \lambda^n, s_k^n, s_k^{-n} \big) = w^n b_k^n + \tilde{b}_k^n - \lambda^n s_k^n.
\tag{39}
$$
By introducing the dual variable $\lambda^n$, we decouple (36) into several smaller
subproblems, one for each tone, and define $J_k^n$ as user $n$'s objective function
on tone $k$. The optimal PSD that maximizes $J_k^n$ for given $w^n$ and $\lambda^n$ is
$$
s_k^{n,I} \big( w^n, \lambda^n, s_k^{-n} \big) = \arg \max_{s_k^n \in [0, P^n]} J_k^n \big( w^n, \lambda^n, s_k^n, s_k^{-n} \big),
\tag{40}
$$
which can be found by solving the first order condition, $\partial J_k^n \big( w^n, \lambda^n, s_k^n, s_k^{-n} \big) / \partial s_k^n = 0$, which leads to
$$
\frac{w^n}{s_k^{n,I} + \sum_{m \ne n} \alpha_k^{n,m} s_k^m + \sigma_k^n} - \frac{\tilde{\alpha}_k^n \tilde{s}_k}{\big( \tilde{s}_k + \tilde{\alpha}_k^n s_k^{n,I} + \tilde{\sigma}_k \big) \big( \tilde{\alpha}_k^n s_k^{n,I} + \tilde{\sigma}_k \big)} - \lambda^n = 0.
\tag{41}
$$
Note that (41) can be simplified into a cubic equation which has three
solutions. The optimal PSD can be found by substituting these three solutions
back into the objective function $J_k^n \big( w^n, \lambda^n, s_k^n, s_k^{-n} \big)$, as well as checking the
boundary solutions $s_k^n = 0$ and $s_k^n = P^n$, and picking the one that yields the
largest value of $J_k^n$.
The user then updates $\lambda^n$ to enforce the power constraint, and updates
$w^n$ to enforce the target rate constraint. The complete algorithm is given as
follows, where $\epsilon_\lambda$ and $\epsilon_w$ are small stepsizes for updating $\lambda^n$ and $w^n$.
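Algorithm 5. Autonomous Spectrum Balancing.
repeat
    for each user $n = 1, ..., N$
        repeat
            for each tone $k = 1, ..., K$, find
                $s_k^{n,I} = \arg \max_{s_k^n \ge 0} J_k^n$
            $\lambda^n = \big[ \lambda^n + \epsilon_\lambda \big( \sum_k s_k^{n,I} - P^n \big) \big]^+$;
            $w^n = \big[ w^n + \epsilon_w \big( R^{n,target} - \sum_k b_k^n \big) \big]^+$;
        until convergence
    end
until convergence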

ASB-II with Frequency-Selective Waterfilling


To obtain the optimal PSD in ASB-I (for fixed $\lambda^n$ and $w^n$), we have to solve for
the roots of a cubic equation. To reduce the computational complexity and
gain more insight into the solution structure, we assume that the reference
line operates in the high SIR regime whenever it is active: if $\tilde{s}_k > 0$, then
$\tilde{s}_k \gg \tilde{\sigma}_k \gg \tilde{\alpha}_k^n s_k^n$ for any feasible $s_k^n$, $n \in \mathcal{N}$ and $k \in \mathcal{K}$. This assumption is
motivated by our observations on optimal solutions in DSL type of interference
channels. It means that the reference PSD is much larger than the reference
noise, which is in turn much larger than the interference from user $n$. Then
on any tone $k \in \bar{\mathcal{K}} = \{k \mid \tilde{s}_k > 0, k \in \mathcal{K}\}$, the reference line's achievable rate is
$$
\log \left( 1 + \frac{\tilde{s}_k}{\tilde{\alpha}_k^n s_k^n + \tilde{\sigma}_k} \right) \approx \log \left( \frac{\tilde{s}_k}{\tilde{\sigma}_k} \right) - \frac{\tilde{\alpha}_k^n s_k^n}{\tilde{\sigma}_k},
$$
and user $n$'s objective function on tone $k$ can be approximated by
$$
J_k^{n,II,1} \big( w^n, \lambda^n, s_k^n, s_k^{-n} \big) = w^n b_k^n - \lambda^n s_k^n - \frac{\tilde{\alpha}_k^n s_k^n}{\tilde{\sigma}_k} + \log \left( \frac{\tilde{s}_k}{\tilde{\sigma}_k} \right).
$$
The corresponding optimal PSD is
$$
s_k^{n,II,1} \big( w^n, \lambda^n, s_k^{-n} \big) = \left[ \frac{w^n}{\lambda^n + \tilde{\alpha}_k^n / \tilde{\sigma}_k} - \sum_{m \ne n} \alpha_k^{n,m} s_k^m - \sigma_k^n \right]^+.
\tag{42}
$$
This is a waterfilling type of solution and is intuitively satisfying: the PSD
should be smaller when the power constraint is tighter (i.e., $\lambda^n$ is larger), or
the interference coefficient to the reference line $\tilde{\alpha}_k^n$ is higher, or the noise level
on the reference line $\tilde{\sigma}_k$ is smaller, or there is more interference plus noise
$\sum_{m \ne n} \alpha_k^{n,m} s_k^m + \sigma_k^n$ on the current tone. It is different from the conventional
waterfilling in that the water level in each tone is not only determined by the
dual variables $w^n$ and $\lambda^n$, but also by the parameters of the reference line,
$\tilde{\alpha}_k^n / \tilde{\sigma}_k$.
On the other hand, on any tone where the reference line is inactive, i.e.,
$k \in \bar{\mathcal{K}}^C = \{k \mid \tilde{s}_k = 0, k \in \mathcal{K}\}$, the objective function is
$$
J_k^{n,II,2} \big( w^n, \lambda^n, s_k^n, s_k^{-n} \big) = w^n b_k^n - \lambda^n s_k^n,
$$
and the corresponding optimal PSD is
$$
s_k^{n,II,2} \big( w^n, \lambda^n, s_k^{-n} \big) = \left[ \frac{w^n}{\lambda^n} - \sum_{m \ne n} \alpha_k^{n,m} s_k^m - \sigma_k^n \right]^+.
\tag{43}
$$
This is the same solution as the iterative waterfilling.

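The choice of optimal PSD in ASB-II can be summarized as the following:
$$
s_k^{n,II} \big( w^n, \lambda^n, s_k^{-n} \big) =
\begin{cases}
\left[ \dfrac{w^n}{\lambda^n + \tilde{\alpha}_k^n / \tilde{\sigma}_k} - \sum_{m \ne n} \alpha_k^{n,m} s_k^m - \sigma_k^n \right]^+, & k \in \bar{\mathcal{K}}, \\
\left[ \dfrac{w^n}{\lambda^n} - \sum_{m \ne n} \alpha_k^{n,m} s_k^m - \sigma_k^n \right]^+, & k \in \bar{\mathcal{K}}^C.
\end{cases}
\tag{44}
$$
This is essentially a waterfilling type of solution, with different water levels
for different tones (frequencies). We call it frequency selective waterfilling.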
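The per-tone rule (44) is simple enough to evaluate directly. The following is a minimal sketch for a single user with given dual variables; the function name `asb2_psd`, the argument layout, and the three-tone numbers are hypothetical and chosen only to make the different water levels visible.

```python
def asb2_psd(w, lam, interference, sigma, ref_active, alpha_ref, sigma_ref):
    """Evaluate (44) for one user: one PSD value per tone, for given w and lam > 0.

    interference[k] is the crosstalk sum on tone k, sigma[k] the user's own
    normalized noise; alpha_ref and sigma_ref are the reference line parameters.
    """
    psd = []
    for k in range(len(sigma)):
        # The reference line lowers the water level only on tones where it is active.
        penalty = alpha_ref[k] / sigma_ref[k] if ref_active[k] else 0.0
        water_level = w / (lam + penalty)
        psd.append(max(water_level - interference[k] - sigma[k], 0.0))
    return psd

# Hypothetical three-tone example: the reference line is active on tones 0 and 2.
psd = asb2_psd(w=1.0, lam=1.0,
               interference=[0.05, 0.05, 0.05], sigma=[0.05, 0.05, 0.05],
               ref_active=[True, False, True],
               alpha_ref=[1.0, 1.0, 3.0], sigma_ref=[1.0, 1.0, 1.0])
```

With identical interference and noise on all tones, the tone where the reference line is inactive receives the most power, and the tone with the largest reference-line coupling receives the least.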
4.5 Convergence Analysis
In this subsection, we show the convergence for both ASB-I and ASB-II, for
the case where users fix their weight coefficients $w^n$, which is also called Rate
Adaptive (RA) spectrum balancing [49] that aims at maximizing users' rates
subject to power constraint.$^7$
Convergence in the Two-user case
The first result is on the convergence of the ASB-I algorithm, with fixed $w = (w^1, w^2)$ and $\lambda = (\lambda^1, \lambda^2)$.

Proposition 3 The ASB-I algorithm converges in a two-user case under fixed
$w$ and $\lambda$, if users start from initial PSD values $(s_k^1, s_k^2) = (0, P^2)$ or $(s_k^1, s_k^2) = (P^1, 0)$ on all tones.

The proof of Proposition 3 uses supermodular game theory [53] and strategy
transformation similar to [20].
Now consider the ASB-II algorithm where two users sequentially optimize
their PSD levels under fixed values of $w$, but adjust $\lambda$ to enforce the power
constraint. The following lemma will be useful in proving the main convergence
results.
Lemma 1. Consider any non-decreasing function $f(x)$ and non-increasing
function $g(x)$, where there exists a unique $x^*$ such that $f(x^*) = g(x^*)$, $\partial f(x) / \partial (x) |_{x = x^*} > 0$ and $\partial g(x) / \partial (x) |_{x = x^*} < 0$. Then
$$x^* = \arg \min_x \{ \max \{ f(x), g(x) \} \}.$$
Proof. For any $\Delta x > 0$,
$$f(x^* + \Delta x) > f(x^*) = g(x^*) > g(x^* + \Delta x),$$
and
$$\max \{ f(x^* + \Delta x), g(x^* + \Delta x) \} = f(x^* + \Delta x) > f(x^*) = \max \{ f(x^*), g(x^*) \}.$$
Similarly for any $\Delta x < 0$,
$$f(x^* + \Delta x) < f(x^*) = g(x^*) < g(x^* + \Delta x),$$
and
$$\max \{ f(x^* + \Delta x), g(x^* + \Delta x) \} = g(x^* + \Delta x) > g(x^*) = \max \{ f(x^*), g(x^*) \}.$$
It is then clear that
$$x^* = \arg \min_x \{ \max \{ f(x), g(x) \} \}.$$

$^7$ The second main category of spectrum balancing problem is Fixed Margin (FM),
which is concerned with finding a minimal power allocation such that minimum
target data-rates for each user are satisfied. For example, problem (34) is a mixed
RA/FM problem.


Denote $s_k^{n,t}$ as the PSD of user $n$ in tone $k$ after iteration $t$, where $\sum_k s_k^{n,t} = P^n$ is satisfied for any $n$ and $t$. One iteration is defined as one round of updates
of all users. We can show that
Proposition 4 The ASB-II algorithm globally converges to the unique fixed
point in a two-user system under fixed $w$, if $\max_k \alpha_k^{2,1} \max_k \alpha_k^{1,2} < 1$.
The convergence result of iterative waterfilling in the two user case [56] is
a special case of Proposition 4 by setting $\tilde{s}_k = 0$, $\forall k$.
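Proof. Define $\bar{\alpha}_k^n$ as the equivalent interference channel gain from user $n$ to
the reference line,
$$
\bar{\alpha}_k^n =
\begin{cases}
\tilde{\alpha}_k^n, & k \in \bar{\mathcal{K}}, \\
0, & k \in \bar{\mathcal{K}}^C,
\end{cases}
$$
and we can simplify (44) as
$$
s_k^{n,t+1} = \left[ \frac{w^n}{\lambda^{n,t+1} + \bar{\alpha}_k^n / \tilde{\sigma}_k} - \alpha_k^{n,m} s_k^{m,t} - \sigma_k^n \right]^+, \quad \forall n, m \in \{1, 2\}, \ m \ne n, \ \forall k, t,
$$
where $\lambda^{n,t+1}$ is chosen such that $\sum_k s_k^{n,t+1} = P^n$.
Define $[x]^+ = \max(x, 0)$ and $[x]^- = \max(-x, 0)$, then it is clear that
$$
\sum_k \big[ s_k^{n,t'} - s_k^{n,t} \big]^+ = \sum_k \big[ s_k^{n,t'} - s_k^{n,t} \big]^-, \quad \forall n, t, t',
\tag{45}
$$
since the total power constraint is always satisfied after any iteration. Also
define
$$
f^{n,t}(x) = \sum_k \left[ \frac{w^n}{x + \bar{\alpha}_k^n / \tilde{\sigma}_k} - \alpha_k^{n,m} s_k^{m,t} - \sigma_k^n - s_k^{n,t} \right]^-,
$$
$$
g^{n,t}(x) = \sum_k \left[ \frac{w^n}{x + \bar{\alpha}_k^n / \tilde{\sigma}_k} - \alpha_k^{n,m} s_k^{m,t} - \sigma_k^n - s_k^{n,t} \right]^+,
$$
and it is clear that $f^{n,t}(x)$ ($g^{n,t}(x)$, respectively) is non-decreasing (non-increasing) in $x$, and strictly increasing (strictly decreasing) at $x = \lambda^{n,t+1}$.
Also $f^{n,t}(\lambda^{n,t+1}) = g^{n,t}(\lambda^{n,t+1})$. Then by Lemma 1,
$$
\max \{ f^{n,t}(x), g^{n,t}(x) \} \ge \max \{ f^{n,t}(\lambda^{n,t+1}), g^{n,t}(\lambda^{n,t+1}) \}, \quad \forall x.
$$
Take $x = \lambda^{n,t}$, we have
$$
\max \{ f^{n,t}(\lambda^{n,t}), g^{n,t}(\lambda^{n,t}) \} \ge \max \{ f^{n,t}(\lambda^{n,t+1}), g^{n,t}(\lambda^{n,t+1}) \}.
$$
This leads to
$$
\begin{align}
& \max \Big\{ \sum_k \big[ s_k^{1,t+1} - s_k^{1,t} \big]^+, \ \sum_k \big[ s_k^{1,t+1} - s_k^{1,t} \big]^- \Big\} \tag{46} \\
&= \max \big\{ f^{1,t}(\lambda^{1,t+1}), g^{1,t}(\lambda^{1,t+1}) \big\} \nonumber \\
&\le \max \big\{ f^{1,t}(\lambda^{1,t}), g^{1,t}(\lambda^{1,t}) \big\} \nonumber \\
&= \max \Big\{ \sum_k \Big[ \frac{w^1}{\lambda^{1,t} + \bar{\alpha}_k^1 / \tilde{\sigma}_k} - \alpha_k^{1,2} s_k^{2,t} - \sigma_k^1 - s_k^{1,t} \Big]^-, \ \sum_k \Big[ \frac{w^1}{\lambda^{1,t} + \bar{\alpha}_k^1 / \tilde{\sigma}_k} - \alpha_k^{1,2} s_k^{2,t} - \sigma_k^1 - s_k^{1,t} \Big]^+ \Big\} \nonumber \\
&= \max \Big\{ \sum_k \alpha_k^{1,2} \big[ s_k^{2,t-1} - s_k^{2,t} \big]^+, \ \sum_k \alpha_k^{1,2} \big[ s_k^{2,t-1} - s_k^{2,t} \big]^- \Big\} \nonumber \\
&\le \max_k \alpha_k^{1,2} \cdot \max \Big\{ \sum_k \big[ s_k^{2,t} - s_k^{2,t-1} \big]^+, \ \sum_k \big[ s_k^{2,t} - s_k^{2,t-1} \big]^- \Big\} \nonumber \\
&\le \max_k \alpha_k^{1,2} \cdot \max_k \alpha_k^{2,1} \cdot \max \Big\{ \sum_k \big[ s_k^{1,t} - s_k^{1,t-1} \big]^+, \ \sum_k \big[ s_k^{1,t} - s_k^{1,t-1} \big]^- \Big\} \nonumber \\
&< \max \Big\{ \sum_k \big[ s_k^{1,t} - s_k^{1,t-1} \big]^+, \ \sum_k \big[ s_k^{1,t} - s_k^{1,t-1} \big]^- \Big\}. \tag{47}
\end{align}
$$
The last inequality is due to the fact that $\max_k \alpha_k^{1,2} \max_k \alpha_k^{2,1} < 1$. This shows
that the algorithm is a contraction mapping from any initial PSD values, thus
globally converges to a unique fixed point [4].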
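The contraction behavior can be observed numerically. The following is a minimal sketch of sequential two-user ASB-II updates, assuming hypothetical channel values that satisfy the condition of Proposition 4, with the power constraint enforced by bisection on the dual variable (one of several possible ways to pick it). Here `penalty[k]` stands for the reference-line term on tone $k$ (zero where the reference line is inactive); all names and numbers are illustrative.

```python
def waterfill(w, p_max, inn, penalty):
    """PSD [w/(lam + penalty_k) - inn_k]^+ with lam bisected so the PSD sums to p_max."""
    def psd(lam):
        return [max(w / (lam + c) - v, 0.0) for v, c in zip(inn, penalty)]

    lo, hi = 0.0, 1.0
    while sum(psd(hi)) > p_max:      # grow hi until the power budget is met
        hi *= 2.0
    for _ in range(200):             # total power is non-increasing in lam
        mid = 0.5 * (lo + hi)
        if sum(psd(mid)) > p_max:
            lo = mid
        else:
            hi = mid
    return psd(hi)

def asb2_two_user(alpha12, alpha21, sigma1, sigma2, penalty1, penalty2,
                  w=(1.0, 1.0), p_max=(1.0, 1.0), rounds=40):
    num_tones = len(sigma1)
    s1 = [p_max[0] / num_tones] * num_tones   # flat initial PSDs
    s2 = [p_max[1] / num_tones] * num_tones
    changes = []
    for _ in range(rounds):
        # User 1 reacts to user 2's current PSD, then user 2 reacts to user 1.
        inn1 = [a * s + n for a, s, n in zip(alpha12, s2, sigma1)]
        new_s1 = waterfill(w[0], p_max[0], inn1, penalty1)
        inn2 = [a * s + n for a, s, n in zip(alpha21, new_s1, sigma2)]
        new_s2 = waterfill(w[1], p_max[1], inn2, penalty2)
        changes.append(sum(abs(a - b) for a, b in zip(s1 + s2, new_s1 + new_s2)))
        s1, s2 = new_s1, new_s2
    return s1, s2, changes

alpha12 = [0.3, 0.2, 0.1]   # hypothetical crosstalk gains into user 1
alpha21 = [0.2, 0.4, 0.1]   # hypothetical crosstalk gains into user 2
sigma = [0.1, 0.1, 0.1]
penalty = [0.5, 0.0, 0.5]   # reference line active on tones 0 and 2 only
s1, s2, changes = asb2_two_user(alpha12, alpha21, sigma, sigma, penalty, penalty)
```

In this example the product of the largest crosstalk gains is 0.12, and the recorded per-round PSD changes shrink rapidly toward zero while both power budgets stay tight.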

Convergence in the N -user case


We further extend the convergence results to a system with an arbitrary number
N > 2 of users. We consider both sequential and parallel PSD updates of the users.
In the more realistic but harder-to-analyze parallel updates, time is divided
into slots, and each user n updates its PSD simultaneously in each time slot
according to (44) based on the PSDs in the previous slot, where $\lambda^n$ is
adjusted such that the power constraint is satisfied.
Proposition 5. Assume $\max_{n,m,k} \alpha_k^{n,m} < \frac{1}{N-1}$. Then the ASB-II algorithm
globally converges (to the unique fixed point) in an N-user system under fixed
w, with either sequential or parallel updates.
Proposition 5 contains the convergence of iterative waterfilling in an N-user
case with sequential updates (proved in [13]) as a special case of ASB
convergence with sequential or parallel updates. Moreover, the convergence
proof for the parallel updates turns out to be simpler than the one for
sequential updates. The proof extends that of Proposition 4, and can be found
in [21].
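
The difference between sequential and parallel updates can be made concrete with a small experiment. The Python sketch below is a hedged illustration, not the ASB-II implementation from [21]. It assumes the same generic update form $s_k^n = [w_n/(\lambda^n + c_k^n) - \mathrm{load}_k^n]^+$ with $\lambda^n$ found by bisection, and it draws hypothetical crosstalk gains that are deliberately kept well inside the condition of Proposition 5. The names `solve_psd` and `iterate` are introduced here for illustration.

```python
import numpy as np

def solve_psd(w, c, load, power, steps=200):
    """Assumed per-user update: s_k = [w / (lam + c_k) - load_k]^+ with sum_k s_k = power."""
    def psd(lam):
        return np.maximum(w / (lam + c) - load, 0.0)

    if psd(0.0).sum() <= power:              # power constraint inactive
        return psd(0.0)
    lo, hi = 0.0, 1.0
    while psd(hi).sum() > power:
        hi *= 2.0
    for _ in range(steps):                   # bisection on lam
        mid = 0.5 * (lo + hi)
        if psd(mid).sum() > power:
            lo = mid
        else:
            hi = mid
    return psd(hi)

def iterate(alpha, noise, c, w, power, parallel, rounds=100):
    """alpha[n, m, k]: crosstalk gain from user m into user n on tone k (zero when m == n)."""
    n_users, n_tones = noise.shape
    s = np.repeat(power[:, None] / n_tones, n_tones, axis=1)   # flat initial PSDs
    for _ in range(rounds):
        prev = s.copy()
        for n in range(n_users):
            # Parallel: every user sees the previous slot. Sequential: the latest PSDs.
            seen = prev if parallel else s
            load = (alpha[n] * seen).sum(axis=0) + noise[n]
            s[n] = solve_psd(w[n], c[n], load, power[n])
    return s, np.abs(s - prev).sum()          # final PSDs and last-round change

rng = np.random.default_rng(1)
N, K = 4, 16                                  # users and tones (hypothetical)
alpha = rng.uniform(0.0, 0.5 / (N - 1), (N, N, K))   # max alpha < 1/(N-1)
alpha[np.arange(N), np.arange(N), :] = 0.0    # no self-crosstalk
noise = rng.uniform(0.01, 0.05, (N, K))
c = rng.uniform(0.5, 2.0, (N, K))             # stands for the reference-line ratio
w = np.ones(N)
power = np.ones(N)

s_seq, res_seq = iterate(alpha, noise, c, w, power, parallel=False)
s_par, res_par = iterate(alpha, noise, c, w, power, parallel=True)
print("last-round change:", res_seq, res_par)
print("largest PSD difference:", np.abs(s_seq - s_par).max())
```

In this setting both update orders should settle on the same PSDs, consistent with the uniqueness of the fixed point stated in Proposition 5.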


[Figure 12: topology diagram with nodes labeled CO, RT1, RT2, RT3 and CP, and distance labels 5 km, 2 km, 4 km, 3 km, 4 km, 3.5 km and 3 km.]

Fig. 12. An example of mixed CO/RT deployment topology (Example 8).

4.6 Simulation Results


Example 8. Mixed CO-RT DSL. Here we summarize a typical numerical
example comparing the performance of the ASB algorithms with IW, OSB and
ISB. We consider a standard mixed Central Office (CO) and Remote Terminal
(RT) deployment. A four-user scenario has been selected to make a comparison
with the highly complex OSB algorithm possible. As depicted in Figure 12,
the scenario consists of one CO-distributed line and three RT-distributed lines.
The target rates on RT1 and RT2 have both been set to 2 Mbps. For a
variety of different target rates on RT3, the CO line attempts to maximize
its own data rate, either by transmitting at full power in IW or by setting its
corresponding weight $w_{co}$ to unity in OSB, ISB and ASB. This produces the
rate regions shown in Figure 13, which show that ASB achieves near-optimal
performance similar to that of OSB and ISB, and a significant gain over IW,
even though both ASB and IW are autonomous. For example, with a target rate
of 1 Mbps on CO, the rate on RT3 reaches 7.3 Mbps under the ASB algorithm,
which is a 121% increase compared with the 3.3 Mbps achieved by IW. We have also
performed extensive simulations (more than 10,000 scenarios) with different
CO and RT positions, line lengths and reference line parameters. We found
that the performance of ASB is very insensitive to the definition of the
reference line: with a single choice of the reference line we observe good
performance in a broad range of scenarios.
4.7 Concluding Remarks and Future Directions
Dynamic spectrum management techniques can greatly improve the performance
of DSL lines by inducing cooperation among interfering users in the
same binder. For example, the iterative waterfilling (IW) algorithm is a completely
autonomous DSM algorithm with linear complexity in the number of users
and number of tones, but its performance can be far from optimal in the
mixed CO/RT deployment scenario. The optimal spectrum balancing (OSB)
and iterative spectrum balancing (ISB) algorithms achieve optimal and close
to optimal performance, respectively, but have high complexity in terms of
the number of users and are completely centralized.

[Figure 13: plot of CO rate (Mbps) versus RT3 rate (Mbps), with RT1 and RT2 at 2 Mbps; curves for Optimal Spectrum Balancing, Iterative Spectrum Balancing, Autonomous Spectrum Balancing and Iterative Waterfilling.]

Fig. 13. Rate regions obtained by ASB, IW, OSB, and ISB.

This section surveys an autonomous dynamic spectrum management (DSM)
algorithm called autonomous spectrum balancing (ASB). ASB utilizes the concept
of a reference line, which mimics a typical victim line in the binder. By
setting the power spectrum level to protect the reference line, a good balance
between selfish and global maximization can be achieved. Compared
with IW, OSB and ISB, the ASB algorithm enjoys completely autonomous
operation and low (linear) complexity in both the number of users and the number
of tones. Simulations show that the ASB algorithm achieves close to optimal
performance and is robust to the choice of reference line parameters.

Acknowledgment
The author would like to acknowledge collaborations with Raphael Cendrillon,
Maryam Fazel, Prashanth Hande, Jianwei Huang, Daniel Palomar, and Chee
Wei Tan on the publications related to this survey [12, 15, 51, 21], as well
as helpful general discussions on related topics with Stephen Boyd, Robert
Calderbank, John Doyle, David Julian, Jang-Won Lee, Ying Li, Steven Low,
Daniel O'Neill, Asuman Ozdaglar, Pablo Parrilo, Ness Shroff, R. Srikant, and
Ao Tang.


References
1. M. Avriel, Ed. Advances in Geometric Programming, Plenum Press, New York,
1980.
2. N. Bambos, Toward power-sensitive network architectures in wireless communications: Concepts, issues, and design aspects. IEEE Pers. Comm. Mag., vol.
5, no. 3, pp. 50-59, 1998.
3. D. P. Bertsekas, Nonlinear Programming, Athena Scientific, 1999.
4. D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Prentice Hall, 1989.
5. S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University
Press, 2004.
6. R. Cendrillon, G. Ginis, and M. Moonen, Improved linear crosstalk precompensation for downstream VDSL, in Proceedings of IEEE International Conference
on Acoustics, Speech and Signal Processing (ICASSP), 2004, pp. 1053-1056.
7. R. Cendrillon and M. Moonen, Iterative spectrum balancing for digital subscriber lines, in IEEE International Communications Conference (ICC), 2005.
8. R. Cendrillon, W. Yu, M. Moonen, J. Verlinden, and T. Bostoen, Optimal
multi-user spectrum management for digital subscriber lines, accepted by IEEE
Transactions on Communications, 2005.
9. M. Chiang, Balancing transport and physical layers in wireless multihop networks: Jointly optimal congestion control and power control, IEEE J. Sel. Area
Comm., vol. 23, no. 1, pp. 104-116, Jan. 2005.
10. M. Chiang, Geometric programming for communication systems, Foundations
and Trends in Communications and Information Theory, vol. 2, no. 1-2, pp. 1-156, Aug. 2005.
11. M. Chiang, S. H. Low, R. A. Calderbank, and J. C. Doyle, Layering as optimization decomposition, To appear in Proceedings of IEEE, 2006.
12. M. Chiang, S. Zhang, and P. Hande, Distributed rate allocation for inelastic
flows: Optimization framework, optimality conditions, and optimal algorithms,
Proc. IEEE Infocom, Miami, FL, March 2005.
13. S. T. Chung, Transmission schemes for frequency selective gaussian interference
channels, Ph.D. dissertation, Stanford University, 2003.
14. R. J. Duffin, E. L. Peterson, and C. Zener, Geometric Programming: Theory
and Applications. Wiley, 1967.
15. M. Fazel and M. Chiang, Nonconcave network utility maximization by sum of
squares programming, Proc. IEEE CDC, Dec. 2005.
16. G. Foschini and Z. Miljanic, A simple distributed autonomous power control
algorithm and its convergence, IEEE Trans. Veh. Tech., vol. 42, no. 4, 1993.
17. G. Ginis and J. Cioffi, Vectored transmission for digital subscriber line systems, IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp.
1085-1104, 2002.
18. D. Handelman, Representing polynomials by positive linear functions on compact convex polyhedra, Pacific J. Math., vol. 132, pp. 35-62, 1988.
19. D. Henrion, J.B. Lasserre, Detecting global optimality and extracting solutions
in GloptiPoly, Research report, LAAS-CNRS, 2003.
20. J. Huang, R. Berry, and M. L. Honig, A game theoretic analysis of distributed
power control for spread spectrum ad hoc networks, Proc. IEEE ISIT, July
2005.

21. J. Huang, R. Cendrillon, and M. Chiang, Autonomous Spectrum Balancing in
DSL Interference Channels, Proc. IEEE ISIT, July 2006.
22. D. Julian, M. Chiang, D. O'Neill, and S. Boyd, QoS and fairness constrained
convex optimization of resource allocation for wireless cellular and ad hoc networks. Proc. IEEE Infocom, New York, NY, June 2002.
23. F. P. Kelly, Models for a self-managed Internet, Philosophical Transactions of
the Royal Society, A358, 2335-2348, 2000.
24. F. P. Kelly, A. Maulloo, and D. Tan, Rate control for communication networks:
shadow prices, proportional fairness and stability, Journal of Operations Research Society, vol. 49, no. 3, pp.237-252, March 1998.
25. S. Kandukuri and S. Boyd, Optimal power control in interference limited fading
wireless channels with outage probability specifications, IEEE Trans. Wireless
Comm., vol. 1, no. 1, pp. 46-55, Jan. 2002.
26. J.B. Lasserre, Global optimization with polynomials and the problem of moments, SIAM J. Optim., vol. 11, no. 3, pp. 796-817, 2001.
27. J.B. Lasserre, Polynomial programming: LP-relaxations also converge, SIAM
J. Optimization, vol. 15, no. 2, pp. 383-393, 2004.
28. J. W. Lee, R. Mazumdar, and N. B. Shroff, Non-convex optimization and rate
control for multi-class services in the Internet, IEEE/ACM Trans. Networking,
vol. 13, no. 4, pp. 827-840, Aug. 2005.
29. J. W. Lee, R. Mazumdar, and N. B. Shroff, Downlink power allocation for
multi-class CDMA wireless networks, IEEE/ACM Trans. Networking, vol. 13, no. 4,
pp. 854-867, Aug. 2005.
30. J. W. Lee, R. Mazumdar, and N. B. Shroff, Opportunistic power scheduling
for multi-server wireless systems with minimum performance constraints, IEEE
Trans. Wireless Comm., vol.5, no. 5, May 2006.
31. X. Lin, N. B. Shroff, and R. Srikant, A tutorial on cross-layer optimization in
wireless networks, IEEE J. Sel. Area Comm., Aug. 2006.
32. S. H. Low, A duality model of TCP and queue management algorithms,
IEEE/ACM Trans. Networking, vol. 11, no. 4, pp. 525-536, Aug. 2003.
33. B. R. Marks and G. P. Wright, A general inner approximation algorithm for
nonconvex mathematical program, Operations Research, 1978.
34. D. Mitra, An asynchronous distributed algorithm for power control in cellular
radio systems, Proceedings of 4th WINLAB Workshop Rutgers University, NJ,
1993.
35. Yu. Nesterov and A. Nemirovsky, Interior Point Polynomial Methods in Convex
Programming, SIAM Press, 1994.
36. D. Palomar and M. Chiang, Alternative decompositions for distributed maximization of network utility: Framework and applications, Proc. IEEE Infocom,
Barcelona, Spain, April 2006.
37. P. A. Parrilo, Structured semidefinite programs and semi-algebraic geometry
methods in robustness and optimization, PhD thesis, Caltech, May 2002.
38. P. A. Parrilo, Semidefinite programming relaxations for semi-algebraic problems, Math. Program., vol. 96, pp.293-320, 2003.
39. S. Prajna, A. Papachristodoulou, P. A. Parrilo, SOSTOOLS:
Sum of squares optimization toolbox for Matlab, available from
http://www.cds.caltech.edu/sostools, 2002-04.
40. M. Putinar, Positive polynomials on compact semi-algebraic sets, Indiana
University Mathematics Journal, vol. 42, no. 3, pp. 969-984, 1993.


41. R. T. Rockafellar, Network Flows and Monotropic Programming, Athena Scientific, 1998.
42. R. T. Rockafellar, Lagrange multipliers and optimality, SIAM Review, vol.
35, pp. 183-283, 1993.
43. C. Saraydar, N. Mandayam, and D. Goodman, Pricing and power control in
a multicell wireless data network, IEEE J. Sel. Areas Comm., vol. 19, no. 10,
pp. 1883-1892, 2001.
44. K. Schmüdgen, The k-moment problem for compact semialgebraic sets, Math.
Ann., vol. 289, pp. 203-206, 1991.
45. S. Shenker, Fundamental design issues for the future Internet, IEEE J. Sel.
Area Comm., vol. 13, no. 7, pp. 1176-1188, Sept. 1995.
46. N. Z. Shor, Quadratic optimization problems, Soviet J. Comput. Systems Sci.,
vol 25, pp. 1-11, 1987.
47. R. Srikant, The Mathematics of Internet Congestion Control, Birkhäuser, 2004.
48. G. Stengle, A Nullstellensatz and a Positivstellensatz in semialgebraic geometry, Math. Ann., vol. 207, pp.87-97, 1974.
49. T. Starr, J. Cioffi, and P. Silverman, Understanding Digital Subscriber Line Technology. Prentice Hall, 1999.
50. C. Sung and W. Wong, Power control and rate management for wireless multimedia CDMA systems, IEEE Trans. Comm. vol. 49, no. 7, pp. 1215-1226,
2001.
51. C. W. Tan, D. Palomar, and M. Chiang, Solving non-convex power control
problems in wireless networks: Low SIR regime and distributed algorithms,
Proc. IEEE Globecom, St. Louis, MO, Nov. 2005.
52. C. W. Tan, D. Palomar and M. Chiang, Distributed Optimization of Coupled Systems with Applications to Network Utility Maximization, Proc. IEEE
ICASSP 2006, Toulouse, France, May 2006.
53. D. M. Topkis, Supermodularity and Complementarity. Princeton University
Press, 1998.
54. M. Xiao, N. B. Shroff, and E. K. P. Chong, Utility based power control (UBPC)
in cellular wireless systems, IEEE/ACM Trans. Networking, vol. 11, no. 10, pp.
210-221, March 2003.
55. R. Yates, A framework for uplink power control in cellular radio systems,
IEEE J. Sel. Areas Comm., vol. 13, no. 7, pp. 1341-1347, 1995.
56. W. Yu, G. Ginis, and J. Cioffi, Distributed multiuser power control for digital
subscriber lines, IEEE Journal on Selected Areas in Communications, vol. 20,
no. 5, pp. 1105-1115, June 2002.
57. W. Yu, R. Lui, and R. Cendrillon, Dual optimization methods for multiuser
orthogonal frequency division multiplex systems, in Proceedings of IEEE Globecom, vol. 1, 2004, pp. 225-229.
