Penalty Methods, Barrier Methods and Augmented Lagrangians: Problem Modifiers in Model
p(x) = \frac{1}{2} \|g^+(x)\|^2 = \frac{1}{2} \sum_{i=1}^{m} (\max(0, g_i(x)))^2.   (4.3)
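The quadratic penalty above is easy to experiment with. The following sketch (a hypothetical one-dimensional instance, not from the text; the names `penalty` and `penalized_minimizer` are ours) minimizes F(x) = x^2 subject to x >= 1 and shows the penalized minimizers approaching the constrained solution x^* = 1, with the penalty value tending to zero, as the parameter c grows.

```python
# A minimal sketch of the quadratic penalty (4.3) on a hypothetical
# one-dimensional problem: minimize F(x) = x^2 subject to g(x) = 1 - x <= 0.
def penalty(x, g):
    """p(x) = (1/2) * sum_i max(0, g_i(x))**2, as in (4.3)."""
    return 0.5 * sum(max(0.0, gi(x)) ** 2 for gi in g)

def penalized_minimizer(c):
    # For this instance, x^2 + c * p(x) has the closed-form unconstrained
    # minimizer x = c / (c + 2) when c > 0 (set 2x - c*(1 - x) to zero).
    return c / (c + 2.0)

g = [lambda x: 1.0 - x]        # feasible set {x : x >= 1}
for c in (1.0, 10.0, 1000.0):
    x_c = penalized_minimizer(c)
    print(c, x_c, penalty(x_c, g))   # x_c -> 1 and p(x_c) -> 0 as c grows
```

Note that each penalized minimizer is slightly infeasible (x_c < 1); the penalty method approaches the solution from outside the feasible set.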
The last two inequalities yield (c_{\nu+1} - c_\nu) p(x^{\nu+1}) \le (c_{\nu+1} - c_\nu) p(x^\nu), which
shows that, for all \nu \ge 0,

p(x^{\nu+1}) \le p(x^\nu),

or it would contradict p(x^\nu) \le p(x^0), for all \nu \ge 0, which follows from
(4.7) if condition (ii) holds. This proves the existence of an accumulation
point \bar{x}, and we let K \subseteq N_0 = \{0, 1, 2, \ldots\} be the index set of a subsequence
converging to \bar{x}. From continuity of F and from (4.8) we obtain

\lim_{\nu \to \infty, \, \nu \in K} (F(x^\nu) + c_\nu p(x^\nu)) = q^* \le F(x^*).
Q is robust and that int Q = \{x \in IR^n \mid g_i(x) < 0, \ i = 1, 2, \ldots, m\}, then
the function

q(x) = -\sum_{i=1}^{m} \frac{1}{g_i(x)}

is a barrier function for Q.
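To make the barrier idea concrete, here is a small sketch (our own one-dimensional example, not from the text) using this inverse barrier on F(x) = x^2 over Q = {x : x >= 1}: the unconstrained minimizers of F + c q stay in int Q and approach the boundary solution x^* = 1 as the barrier parameter c decreases.

```python
# Inverse-barrier sketch (hypothetical 1-D instance): minimize F(x) = x^2
# over Q = {x : x >= 1}, i.e. g(x) = 1 - x < 0 on int Q, with barrier
# q(x) = -1/g(x) = 1/(x - 1) and a decreasing sequence of parameters c.
def barrier(x, g):
    """q(x) = -sum_i 1/g_i(x), defined on int Q where every g_i(x) < 0."""
    return -sum(1.0 / gi(x) for gi in g)

def barrier_minimizer(c):
    # Stationarity of x^2 + c/(x - 1) on x > 1 reads 2*x*(x - 1)**2 = c;
    # the left-hand side is increasing there, so bisection finds the root.
    lo, hi = 1.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 2.0 * mid * (mid - 1.0) ** 2 < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

g = [lambda x: 1.0 - x]
for c in (1.0, 0.1, 1e-6):
    print(c, barrier_minimizer(c))   # interior iterates approaching x* = 1
```

Note the direction of the parameter: the barrier method drives c down toward zero from inside the feasible set, whereas the penalty method drives its parameter up while approaching from outside.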
F(x^*) \le F(x^\nu) \le F(x^\nu) + c_\nu q(x^\nu),   (4.12)

and use the continuity of F and the robustness of Q to conclude that, for
every \epsilon > 0, there exists a point \bar{x} \in int Q such that F(\bar{x}) \le F(x^*) + \epsilon. Hence

F(x^*) + \epsilon + c_\nu q(\bar{x}) \ge F(\bar{x}) + c_\nu q(\bar{x}) \ge F(x^\nu) + c_\nu q(x^\nu),
Chapters 7 and 8). For simplicity of presentation we deal here only with
equality constrained problems of the form
where F : IR^n -> IR and h_i : IR^n -> IR, for i = 1, 2, \ldots, m, are given functions
and Q \subseteq IR^n is a given subset. The Lagrangian of this problem is defined
as
L(x, \pi) = F(x) + \sum_{i=1}^{m} \pi_i h_i(x),   (4.17)
for every x \in IR^n and every \pi = (\pi_i) \in IR^m. The dual function associated
with this problem is

g(\pi) = \min_{x} L(x, \pi),   (4.18)
where \nabla^2 F and \nabla^2 h_i are the Hessian matrices of the respective functions.
The local duality theorem requires that this matrix be positive definite,
i.e., that \langle x, L^* x \rangle > 0, for every x \in IR^n such that x \neq 0.
Theorem 4.3.1 (Local Duality Theorem) Suppose that x^* is a local
minimum point of the optimization problem (4.14)-(4.15) with a minimal
value F(x^*) = r^* and a Lagrange multiplier vector \pi^*. Suppose also that
x^* is a regular point of the constraints (4.15) and that the corresponding
Hessian of the Lagrangian L^* = \nabla^2 L(x^*) is positive definite. Then the
dual problem (4.19)-(4.20) has a local maximum at \pi^* with a maximal
value g(\pi^*) = r^*, and x^* as the corresponding point to \pi^* in (4.18).
A similar result can be obtained for problems having inequality constraints
in addition to the equality constraints. If appropriate convexity
assumptions are added, then the local extrema (mentioned in the theorem)
can be replaced by global extrema. Finally, it is not necessary to include
all the constraints of a problem in the definition of the dual functional
g(w). Local duality can be defined with respect to any subset of functional
constraints; this is called partial duality. For example, the constraints
hi(x) = 0 might be separated into two subsets, easy and hard constraints.
The hard ones can be dualized, i.e., removed from the constraint set and
incorporated into the Lagrangian, while the easy ones remain as constraints.
The primal-dual approach for constrained optimization problems aims,
in view of this duality theorem, at solving the dual unconstrained problem
(4.19)-(4.20). This is done by an iterative scheme which alternates
between the minimization of the Lagrangian in (4.18) and the application
of a steepest ascent iteration to the dual problem.
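This alternation can be sketched on a hypothetical equality-constrained quadratic problem (our own example; the names `x_of_pi` and `step` are ours, and the closed-form inner minimization is a convenience of this instance): minimize the Lagrangian over x, then take a fixed-stepsize steepest ascent step on the dual, whose gradient is the constraint residual.

```python
# Primal-dual alternation sketch on the hypothetical problem
#   minimize (1/2)||x||^2   subject to   <a, x> = b.
# Minimizing L(x, pi) = (1/2)||x||^2 + pi * (<a, x> - b) over x gives
# x(pi) = -pi * a, and the dual gradient is the residual <a, x(pi)> - b.
a, b = [3.0, 4.0], 5.0            # so ||a||^2 = 25

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def x_of_pi(pi):
    return [-pi * ai for ai in a]

pi, step = 0.0, 0.05              # step < 2/||a||^2 keeps the ascent stable
for _ in range(500):
    x = x_of_pi(pi)                   # (a) minimize the Lagrangian
    pi += step * (dot(a, x) - b)      # (b) steepest ascent on the dual
print(pi, x_of_pi(pi))   # pi -> -b/||a||^2 = -0.2 and x -> [0.6, 0.8]
```

The stepsize bound comes from the curvature of this particular dual function; in general the pure dual-ascent scheme needs such care, which is one motivation for the augmented Lagrangians discussed next.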
where h : IR^n -> IR^m is the vector of functions h = (h_i); thus (4.28)
represents the same equality constraints as (4.15). For this problem to be
equivalent to the original problem, the function f must have the property
that f(0) = 0, and the scalar parameter c must be positive. The Lagrangian of
this equivalent problem, called the augmented Lagrangian, has the form
Thus, keeping the original constraints (4.15) and (4.28) in the equivalent
problem, in addition to having them built into (4.27), really means that a
penalty term of the form (1 / c) f (ch(x)) has been added to the Lagrangian
rather than to the original objective function F(x). To generate an algorithmic
scheme an iterative process is used, which alternates between the
minimization of the augmented Lagrangian (4.30) and the application of
an ascent iteration to the dual problem.
The method was originally proposed with the function f(z) = \frac{1}{2}\|z\|^2
and was later extended to functions of the form f(z) = \sum_{i=1}^{m} \phi(z_i), where
\phi : IR -> IR belongs to the class of penalty functions Pe defined as follows.
Definition 4.4.1 (The Class Pe of Penalty Functions) The function
\phi : IR -> IR belongs to the class Pe if it is twice continuously differentiable
and satisfies

\phi(0) = 0 \quad and \quad \frac{d\phi}{dt}(0) = 0.   (4.36)

The gradient of the dual function is then the constraint residual evaluated
at the minimizer x(\pi) of the augmented Lagrangian,

\nabla g(\pi) = h(x(\pi)).   (4.37)
\beta_i(\pi, c) = \int_0^1 \frac{d^2\phi}{dt^2}(\alpha c h_i(x(\pi))) \, d\alpha,   (4.38)

where the function under the integral sign is the second derivative of \phi at
the point t = \alpha c h_i(x(\pi)). Let B(\pi, c) = diag(\beta_1(\pi, c), \beta_2(\pi, c), \ldots, \beta_m(\pi, c))
be the diagonal m x m matrix with the \beta_i's on its diagonal. If \phi(t) = \frac{1}{2}t^2
we get \beta_i(\pi, c) = 1 for all i, and B(\pi, c) is the identity matrix. In this
case, the augmented Lagrangian algorithmic scheme, given next, becomes
the classic quadratic method of multipliers.
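The quadratic case can be sketched on the same kind of toy problem as before (a hypothetical instance, min (1/2)||x||^2 s.t. <a, x> = b; the closed-form inner minimization is a convenience of this example, not part of the general scheme): each pass minimizes the augmented Lagrangian and then updates the multipliers by the residual scaled by c.

```python
# Quadratic method of multipliers (phi(t) = t^2/2) on the hypothetical
# problem: minimize (1/2)||x||^2 subject to <a, x> = b.  The augmented
# Lagrangian is (1/2)||x||^2 + pi*(<a,x> - b) + (c/2)*(<a,x> - b)**2.
a, b, c = [3.0, 4.0], 5.0, 1.0
na2 = sum(ai * ai for ai in a)        # ||a||^2 = 25

def argmin_aug_lagrangian(pi):
    # Stationarity: x + (pi + c*(s - b)) * a = 0 with s = <a, x>;
    # solving for s gives the inner minimizer in closed form.
    s = na2 * (c * b - pi) / (1.0 + c * na2)
    lam = pi + c * (s - b)
    return [-lam * ai for ai in a], s

pi = 0.0
for _ in range(50):
    x, s = argmin_aug_lagrangian(pi)  # minimize the augmented Lagrangian
    pi += c * (s - b)                 # quadratic multiplier update
print(pi, x)   # pi -> -0.2 and x -> [0.6, 0.8], the constrained optimum
```

Unlike plain dual ascent, this scheme converges for any fixed c > 0 on this instance; larger c only speeds up the multiplier iteration.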
the ascent nature of (4.40) can be guaranteed. This formula can be simplified
in the following way. By using a first-order Taylor expansion formula
for the function d\phi/dt around the point t = 0, and using the fact that
(d\phi/dt)(0) = 0, we get

\pi_i^{\nu+1} = \pi_i^\nu + \frac{d\phi}{dt}(c_\nu h_i(x(\pi^\nu))), \quad i = 1, 2, \ldots, m.   (4.42)
Finally, it is worth noting that with constant parameters c_\nu = c > 0, for all
\nu \ge 0, the sequence of matrices \{B(\pi^\nu, c)\}_{\nu=0}^{\infty} tends to the identity matrix
as x(\pi^\nu) -> x^*, assuming that \frac{d^2\phi}{dt^2}(0) = 1, thus making (4.40) a fixed-stepsize,
steepest ascent iteration as \pi^\nu tends to \pi^*, the optimal solution
of the dual problem.
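The behavior of the \beta_i's is easy to observe numerically. The sketch below evaluates the integral in (4.38) by a midpoint rule for the illustrative choice phi(t) = cosh(t) - 1 (our own example; it satisfies phi(0) = 0, (dphi/dt)(0) = 0 and (d^2 phi/dt^2)(0) = 1, but it is not claimed to be a member singled out by the text), and shows beta tending to 1 as the scaled residual t = c * h_i(x(pi)) tends to zero.

```python
# beta_i(pi, c) from (4.38) for the illustrative penalty phi(t) = cosh(t) - 1,
# whose second derivative is phi''(t) = cosh(t); here t stands for the
# scaled constraint value c * h_i(x(pi)).
import math

def beta(t, n=10000):
    # midpoint rule for integral_0^1 phi''(alpha * t) d(alpha)
    return sum(math.cosh((k + 0.5) / n * t) for k in range(n)) / n

for t in (2.0, 0.5, 0.01):
    print(t, beta(t))   # decreases toward phi''(0) = 1 as t -> 0
```

For this phi the integral has the closed form sinh(t)/t, which the midpoint rule reproduces; substituting phi(t) = t^2/2 instead gives beta identically 1, recovering the identity matrix B of the quadratic case.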
We wish now to analyze the convergence of Algorithm 4.4.1 when applied
to the original problem (4.27)-(4.29) for the case of linear equality
constraints, i.e., h(x) = Ax - b, for some given m x n matrix A and given
vector b \in IR^m. It turns out that this algorithm is then closely related to
the proximal minimization algorithm (Algorithm 3.1.2) that we studied earlier
(see Chapter 3). Specifically, the augmented Lagrangian algorithm is
then equivalent to the proximal minimization algorithm applied to the dual
problem (4.31)-(4.32). Let us assume here that the minimum in (4.39) is
always uniquely attained, and introduce the new vector variable z = Ax - b.
Then the Lagrangian of this problem takes the form

L(x, \pi) = F(x) + \langle \pi, Ax - b \rangle.   (4.47)
Combining this with (4.46)-(4.49) and with the fact that \mu^* = \pi^{\nu+1}, we
conclude that