
from "Pattern Recognition and Machine Learning"

by Christopher M. Bishop
Springer Verlag, New York, 2006

Appendix E. Lagrange Multipliers


Lagrange multipliers, also sometimes called undetermined multipliers, are used to
find the stationary points of a function of several variables subject to one or more
constraints.
Consider the problem of finding the maximum of a function f(x₁, x₂) subject to
a constraint relating x₁ and x₂, which we write in the form

g(x₁, x₂) = 0.    (E.1)

One approach would be to solve the constraint equation (E.1) and thus express x₂ as
a function of x₁ in the form x₂ = h(x₁). This can then be substituted into f(x₁, x₂)
to give a function of x₁ alone of the form f(x₁, h(x₁)). The maximum with respect
to x₁ could then be found by differentiation in the usual way, to give the stationary
value x₁*, with the corresponding value of x₂ given by x₂* = h(x₁*).
One problem with this approach is that it may be difficult to find an analytic
solution of the constraint equation that allows x 2 to be expressed as an explicit func-
tion of x 1 . Also, this approach treats x 1 and x 2 differently and so spoils the natural
symmetry between these variables.
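This substitution strategy can be sketched in a few lines of symbolic algebra. The following is an illustrative sketch (assuming the SymPy library, which is not part of the original text), applied to the example treated later in this appendix, f(x₁, x₂) = 1 − x₁² − x₂² with constraint x₁ + x₂ − 1 = 0:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = 1 - x1**2 - x2**2                            # function to maximize
# Solve the constraint g(x1, x2) = x1 + x2 - 1 = 0 for x2 = h(x1)
h = sp.solve(sp.Eq(x1 + x2 - 1, 0), x2)[0]       # h(x1) = 1 - x1
f_of_x1 = f.subs(x2, h)                          # f(x1, h(x1)), a function of x1 alone
x1_star = sp.solve(sp.diff(f_of_x1, x1), x1)[0]  # stationary value of x1
x2_star = h.subs(x1, x1_star)                    # corresponding value of x2
print(x1_star, x2_star)                          # 1/2 1/2
```

Here the constraint happens to have an explicit analytic solution for x₂; as the next paragraph notes, that is exactly what may fail in general.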
A more elegant, and often simpler, approach is based on the introduction of a
parameter λ called a Lagrange multiplier. We shall motivate this technique from
a geometrical perspective. Consider a D-dimensional variable x with components
x₁, . . . , x_D. The constraint equation g(x) = 0 then represents a (D − 1)-dimensional
surface in x-space, as indicated in Figure E.1.
We first note that at any point on the constraint surface the gradient ∇g(x) of
the constraint function will be orthogonal to the surface. To see this, consider a point
x that lies on the constraint surface, and consider a nearby point x + ε that also lies
on the surface. If we make a Taylor expansion around x, we have

g(x + ε) ≃ g(x) + εᵀ∇g(x).    (E.2)

Because both x and x + ε lie on the constraint surface, we have g(x) = g(x + ε) and
hence εᵀ∇g(x) ≃ 0. In the limit ‖ε‖ → 0 we have εᵀ∇g(x) = 0, and because ε is
then parallel to the constraint surface g(x) = 0, we see that the vector ∇g is normal
to the surface.

Figure E.1 A geometrical picture of the technique of Lagrange multipliers in which we seek to maximize a function f(x), subject to the constraint g(x) = 0. If x is D-dimensional, the constraint g(x) = 0 corresponds to a subspace of dimensionality D − 1, indicated by the red curve. The problem can be solved by optimizing the Lagrangian function L(x, λ) = f(x) + λg(x).

Next we seek a point x* on the constraint surface such that f(x) is maximized.
Such a point must have the property that the vector ∇f(x) is also orthogonal to the
constraint surface, as illustrated in Figure E.1, because otherwise we could increase
the value of f(x) by moving a short distance along the constraint surface. Thus ∇f
and ∇g are parallel (or anti-parallel) vectors, and so there must exist a parameter λ
such that

∇f + λ∇g = 0    (E.3)

where λ ≠ 0 is known as a Lagrange multiplier. Note that λ can have either sign.
At this point, it is convenient to introduce the Lagrangian function defined by

L(x, λ) = f(x) + λg(x).    (E.4)

The constrained stationarity condition (E.3) is obtained by setting ∇ₓL = 0. Fur-
thermore, the condition ∂L/∂λ = 0 leads to the constraint equation g(x) = 0.
Thus to find the maximum of a function f(x) subject to the constraint g(x) = 0,
we define the Lagrangian function given by (E.4) and we then find the stationary
point of L(x, λ) with respect to both x and λ. For a D-dimensional vector x, this
gives D + 1 equations that determine both the stationary point x* and the value of λ.
If we are only interested in x*, then we can eliminate λ from the stationarity equa-
tions without needing to find its value (hence the term 'undetermined multiplier').
As a simple example, suppose we wish to find the stationary point of the function
f(x₁, x₂) = 1 − x₁² − x₂² subject to the constraint g(x₁, x₂) = x₁ + x₂ − 1 = 0, as
illustrated in Figure E.2. The corresponding Lagrangian function is given by

L(x, λ) = 1 − x₁² − x₂² + λ(x₁ + x₂ − 1).    (E.5)

The conditions for this Lagrangian to be stationary with respect to x₁, x₂, and λ give
the following coupled equations:

−2x₁ + λ = 0    (E.6)
−2x₂ + λ = 0    (E.7)
x₁ + x₂ − 1 = 0.    (E.8)
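The coupled equations (E.6)-(E.8) are linear and can be checked symbolically; a minimal sketch, assuming SymPy:

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam')
# Stationarity of L(x, lam) = 1 - x1**2 - x2**2 + lam*(x1 + x2 - 1)
eqs = [-2*x1 + lam,   # dL/dx1 = 0, equation (E.6)
       -2*x2 + lam,   # dL/dx2 = 0, equation (E.7)
       x1 + x2 - 1]   # dL/dlam = 0, equation (E.8)
sol = sp.solve(eqs, [x1, x2, lam])
print(sol)            # {x1: 1/2, x2: 1/2, lam: 1}
```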
Figure E.2 A simple example of the use of Lagrange multipliers in which the aim is to maximize f(x₁, x₂) = 1 − x₁² − x₂² subject to the constraint g(x₁, x₂) = 0 where g(x₁, x₂) = x₁ + x₂ − 1. The circles show contours of the function f(x₁, x₂), and the diagonal line shows the constraint surface g(x₁, x₂) = 0.

Solution of these equations then gives the stationary point as (x₁*, x₂*) = (1/2, 1/2),
and the corresponding value for the Lagrange multiplier is λ = 1.
So far, we have considered the problem of maximizing a function subject to an
equality constraint of the form g(x) = 0. We now consider the problem of maxi-
mizing f(x) subject to an inequality constraint of the form g(x) ≥ 0, as illustrated
in Figure E.3.
There are now two kinds of solution possible, according to whether the con-
strained stationary point lies in the region where g(x) > 0, in which case the con-
straint is inactive, or whether it lies on the boundary g(x) = 0, in which case the
constraint is said to be active. In the former case, the function g(x) plays no role
and so the stationary condition is simply ∇f(x) = 0. This again corresponds to
a stationary point of the Lagrange function (E.4) but this time with λ = 0. The
latter case, where the solution lies on the boundary, is analogous to the equality con-
straint discussed previously and corresponds to a stationary point of the Lagrange
function (E.4) with λ ≠ 0. Now, however, the sign of the Lagrange multiplier is
crucial, because the function f(x) will only be at a maximum if its gradient is ori-
ented away from the region g(x) > 0, as illustrated in Figure E.3. We therefore have
∇f(x) = −λ∇g(x) for some value of λ > 0.

Figure E.3 Illustration of the problem of maximizing f(x) subject to the inequality constraint g(x) ≥ 0.

For either of these two cases, the product λg(x) = 0. Thus the solution to the

problem of maximizing f(x) subject to g(x) ≥ 0 is obtained by optimizing the
Lagrange function (E.4) with respect to x and λ subject to the conditions

g(x) ≥ 0    (E.9)
λ ≥ 0    (E.10)
λg(x) = 0    (E.11)

These are known as the Karush-Kuhn-Tucker (KKT) conditions (Karush, 1939; Kuhn
and Tucker, 1951).
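The KKT conditions can be verified numerically on the running example, now with the inequality constraint g(x₁, x₂) = x₁ + x₂ − 1 ≥ 0. A sketch assuming SciPy (the SLSQP solver is an illustrative choice, not prescribed by the text):

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: 1 - x[0]**2 - x[1]**2   # function to maximize
g = lambda x: x[0] + x[1] - 1         # inequality constraint g(x) >= 0
res = minimize(lambda x: -f(x),       # maximize f by minimizing -f
               x0=[1.0, 0.0],
               constraints=[{'type': 'ineq', 'fun': g}],
               method='SLSQP')
x = res.x                             # constrained maximum, approx (1/2, 1/2)
# The unconstrained maximum (0, 0) violates g(x) >= 0, so the constraint is
# active here: g(x) = 0 at the solution and the multiplier is strictly positive.
grad_f = np.array([-2*x[0], -2*x[1]])
grad_g = np.array([1.0, 1.0])
lam = -grad_f[0] / grad_g[0]          # from grad f(x) = -lam * grad g(x)
# KKT check: g(x) >= 0, lam >= 0, and lam * g(x) = 0 (here lam is approx 1)
```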
Note that if we wish to minimize (rather than maximize) the function f(x) sub-
ject to an inequality constraint g(x) ≥ 0, then we minimize the Lagrangian function
L(x, λ) = f(x) − λg(x) with respect to x, again subject to λ ≥ 0.
Finally, it is straightforward to extend the technique of Lagrange multipliers to
the case of multiple equality and inequality constraints. Suppose we wish to maxi-
mize f(x) subject to gⱼ(x) = 0 for j = 1, . . . , J, and hₖ(x) ≥ 0 for k = 1, . . . , K.
We then introduce Lagrange multipliers {λⱼ} and {μₖ}, and then optimize the La-
grangian function given by

L(x, {λⱼ}, {μₖ}) = f(x) + Σⱼ λⱼgⱼ(x) + Σₖ μₖhₖ(x)    (E.12)

where the sums run over j = 1, . . . , J and k = 1, . . . , K, subject to μₖ ≥ 0 and
μₖhₖ(x) = 0 for k = 1, . . . , K. Extensions to constrained functional derivatives
(Appendix D) are similarly straightforward. For a more detailed discussion
of the technique of Lagrange multipliers, see Nocedal and Wright (1999).
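The general Lagrangian (E.12) can be assembled mechanically from the constraint lists. A sketch assuming SymPy, with `lagrangian` a hypothetical helper name, checked against the equality-constrained example solved earlier in this appendix:

```python
import sympy as sp

def lagrangian(f, eq_cons, ineq_cons, lams, mus):
    """Form L(x, {lam_j}, {mu_k}) = f + sum_j lam_j*g_j + sum_k mu_k*h_k, as in (E.12)."""
    return (f + sum(l * g for l, g in zip(lams, eq_cons))
              + sum(m * h for m, h in zip(mus, ineq_cons)))

x1, x2, lam = sp.symbols('x1 x2 lam')
# One equality constraint g(x1, x2) = x1 + x2 - 1 = 0, no inequality constraints
L = lagrangian(1 - x1**2 - x2**2, [x1 + x2 - 1], [], [lam], [])
# Stationarity with respect to x1, x2 and lam recovers equations (E.6)-(E.8)
sol = sp.solve([sp.diff(L, v) for v in (x1, x2, lam)], [x1, x2, lam])
print(sol)   # {x1: 1/2, x2: 1/2, lam: 1}
```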