Lagrange Multipliers Can Fail To Determine Extrema: Acknowledgment
Acknowledgment. The authors wish to thank George Andrews for his support and encouragement.
References
1. R. L. Graham, D. E. Knuth, and O. Patashnik, Concrete Mathematics, Addison-Wesley, 1989, pp. 283–286.
2. Joe Roberts, Elementary Number Theory, MIT Press, 1977, p. 16.
Then if the geometry is right, a constrained extremum must occur at a point (x₀, y₀)
among the solutions to (1). Since this set is often finite, the location of the extrema can
be determined by surveying all possibilities. But to be assured that the method succeeds,
we must know that the geometry is right; that is, the set defined by g(x, y) = k
is a smooth curve in the plane. Here the Implicit Function Theorem is useful; it guarantees
that a level set g(x, y) = k is a smooth curve with nonvanishing tangent vector
in a neighborhood of a point (a, b) if ∇g(a, b) ≠ 0. Thus, when seeking constrained
extrema, we should also examine all critical points of g(x, y).
© THE MATHEMATICAL ASSOCIATION OF AMERICA
Consider the very simple situation where f(x, y) = x + y and g(x, y) = x² + y².
If the constrained set is defined by g(x, y) = 1, then g’s gradient never vanishes on it
and the method works beautifully. But suppose that the constraint is g(x, y) = 0, so the constrained
set is the single point defined by x² + y² = 0. Trivially, the function f has 0 as both its
maximum and minimum value. But ∇f(0, 0) = (1, 1) and ∇g(0, 0) = (0, 0), so there
is no value of λ for which ∇f(0, 0) = λ∇g(0, 0). Thus, in this example, by neglecting
g’s critical point we miss f’s extrema.
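The two cases above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (grad_f, grad_g, multiplier) are illustrative and do not come from the article. It asks whether some λ satisfies ∇f = λ∇g at a given point.

```python
import math

def grad_f(x, y):
    # f(x, y) = x + y
    return (1.0, 1.0)

def grad_g(x, y):
    # g(x, y) = x^2 + y^2
    return (2.0 * x, 2.0 * y)

def multiplier(x, y, tol=1e-12):
    """Return lambda with grad f = lambda * grad g at (x, y), or None if none exists."""
    fx, fy = grad_f(x, y)
    gx, gy = grad_g(x, y)
    # The two gradients are parallel exactly when this 2x2 determinant vanishes.
    if abs(fx * gy - fy * gx) > tol:
        return None
    # grad f is never zero here, so a vanishing grad g admits no multiplier.
    if abs(gx) <= tol and abs(gy) <= tol:
        return None
    return fx / gx if abs(gx) > tol else fy / gy

s = 1.0 / math.sqrt(2.0)
on_circle = [(s, s), (-s, -s)]      # Lagrange candidates on g = 1
lams = [multiplier(x, y) for x, y in on_circle]
at_origin = multiplier(0.0, 0.0)    # the only point of g = 0
```

On the circle g = 1 the sketch finds a multiplier at each of the two candidate points, while at the origin (the whole constraint set when g = 0) it reports that none exists.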
For a prettier and less trivial example, let us minimize f(x, y) = x on the piriform
curve defined by g(x, y) = 0, where g(x, y) = y² + x⁴ − x³. As the plot below shows,
this curve has a singular point at the origin, where it is not smooth. At any such
singular point, ∇g = (0, 0), as predicted by the Implicit Function Theorem. (If the
gradient were nonzero, the level set would be locally a smooth curve.) Although f’s
minimum clearly occurs at (0, 0), this point does not satisfy the Lagrange condition
∇f(0, 0) = λ∇g(0, 0) for any value of λ.
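A short sketch of the piriform example, assuming only the Python standard library; the function names are hypothetical. Since ∇f = (1, 0), the condition ∇f = λ∇g forces the second component of ∇g to vanish, so y = 0, and the constraint then leaves only x = 0 or x = 1.

```python
def g(x, y):
    # Piriform constraint function: the curve is g(x, y) = 0.
    return y**2 + x**4 - x**3

def grad_g(x, y):
    return (4 * x**3 - 3 * x**2, 2 * y)

GRAD_F = (1, 0)  # gradient of f(x, y) = x

def lagrange_multiplier(x, y):
    """Return lambda with grad f = lambda * grad g at (x, y), or None."""
    gx, gy = grad_g(x, y)
    # (1, 0) = lambda * (gx, gy) forces gy = 0 and gx != 0.
    if gy != 0 or gx == 0:
        return None
    return GRAD_F[0] / gx

# With y = 0 the constraint reduces to x**4 - x**3 = 0, so x = 0 or x = 1.
points_on_axis = [(0, 0), (1, 0)]
report = {p: lagrange_multiplier(*p) for p in points_on_axis}

# On the curve, y**2 = x**3 * (1 - x) must be nonnegative; scan a grid of x values.
xs = [i / 100 for i in range(-200, 201)]
feasible = [x for x in xs if x**3 * (1 - x) >= 0]
```

Under these assumptions the Lagrange condition picks out only (1, 0), where f attains its maximum, while the grid scan suggests that points of the curve have 0 ≤ x ≤ 1, so the minimum at the singular point (0, 0) is the one that goes undetected.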
We recall the simple geometric argument that justifies using Lagrange multipliers
to find constrained extrema of f(x, y) at a nonsingular point (a, b) of g(x, y), that is,
a point where ∇g(a, b) ≠ 0.
By the gradient condition, the Implicit Function Theorem asserts that the constrained
set g(x, y) = k can be represented locally near (a, b) as a parametrized curve r(t)
with r(t₀) = (a, b) and r′(t₀) ≠ 0. Therefore g(r(t)) = k, and so (by the Chain Rule)
∇g(r(t)) · r′(t) = 0 at t = t₀. Since the function f(r(t)) has an extremum at t₀,
we also have (again by the Chain Rule) ∇f(r(t)) · r′(t) = 0 at t = t₀. Because both vectors
∇g(r(t₀)) and ∇f(r(t₀)) are perpendicular to the nonzero vector r′(t₀), they must
be parallel. So there exists a scalar λ such that ∇f(r(t₀)) = λ∇g(r(t₀)). This completes
the argument in the case of two variables. A similar argument applies in higher
dimensions.
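For reference, the two chain-rule identities and the conclusion can be set side by side. This is only a restatement of the argument in display form (the boldface notation is a presentational choice, not the article's), written so that the two places where nonvanishing is used are visible.

```latex
\begin{align*}
\frac{d}{dt}\, g(\mathbf{r}(t))\Big|_{t=t_0}
  &= \nabla g(\mathbf{r}(t_0)) \cdot \mathbf{r}'(t_0) = 0, \\
\frac{d}{dt}\, f(\mathbf{r}(t))\Big|_{t=t_0}
  &= \nabla f(\mathbf{r}(t_0)) \cdot \mathbf{r}'(t_0) = 0, \\
\mathbf{r}'(t_0) \neq \mathbf{0},\ \nabla g(\mathbf{r}(t_0)) \neq \mathbf{0}
  &\;\Longrightarrow\;
  \nabla f(\mathbf{r}(t_0)) = \lambda\, \nabla g(\mathbf{r}(t_0))
  \text{ for some } \lambda \in \mathbb{R}.
\end{align*}
```

If ∇g vanishes at the point, the last implication is unavailable, since no multiple of the zero vector can equal a nonzero ∇f.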
The same sort of example also works in three dimensions, where the geometry
for Lagrange multipliers requires that the constrained set g(x, y, z) = k be a C¹
surface. Suppose that we require the extrema of f(x, y, z) = x + y on the set defined
by g(x, y, z) = x² + y² = 0 in R³. The constrained set is now the z-axis, and
f(0, 0, z) = 0 at each point of this line. Thus, both the maximum and minimum
of f are 0 on this set. However, the equation ∇f(x, y, z) = λ∇g(x, y, z) yields
(1, 1, 0) = λ(2x, 2y, 0), which is satisfied at no point of the constrained set. To locate
f’s extrema, we must also consider g’s critical points (0, 0, z), namely, the z-axis.
The moral here is that the geometry matters, and Lagrange multipliers can fail to
identify the proper candidate points if ∇g = 0. Therefore, the correct procedure is to
consider all points satisfying the equations (1) and also all the critical points of g (i.e.,
those for which ∇g = 0). This additional consideration is not sufficiently emphasized
References
1. Jerrold Marsden, Anthony Tromba, and Alan Weinstein, Basic Multivariable Calculus, Springer-Verlag, 1993.
2. James Stewart, Calculus (Early Transcendentals), 3rd ed., Brooks/Cole, 1995.
Off on a Tangent
Russell A. Gordon ([email protected]) and Brian C. Dietel (dietelbc@whitman.edu),
Whitman College, Walla Walla, WA 99362
It was noted in [1] that if a is any positive number other than 1, the tangent line to the
curve y = a^x at the point (1/ln a, e) goes through the origin. The interesting feature
here is that the y-coordinate is independent of a. We will show that this result is not as
unique as it initially appears; there are many families of curves whose tangent lines at a
fixed y-coordinate go through the origin. Shifting attention to the x-coordinate, it turns
out that families of curves whose tangent lines at a fixed x-coordinate go through the
origin have some interesting properties. In particular, these functions form the kernel
of a linear transformation.
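The opening observation about y = a^x is easy to test numerically. The sketch below uses only the standard library; the function name tangent_y_intercept is hypothetical. For a few bases a, it computes the point of tangency at x = 1/ln a and the y-intercept of the tangent line there.

```python
import math

def tangent_y_intercept(a):
    """Return (y0, b): the height of y = a**x at x = 1/ln(a) and the
    y-intercept b of the tangent line at that point."""
    x0 = 1.0 / math.log(a)
    y0 = a ** x0                 # should equal e for every admissible a
    slope = y0 * math.log(a)     # d/dx a**x = a**x * ln(a)
    return y0, y0 - slope * x0

results = {a: tangent_y_intercept(a) for a in (0.5, 2.0, 10.0)}
```

Up to rounding, each entry of results should have first component e and second component 0, in line with the claim that the tangent line passes through the origin at height e regardless of a.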
The tangent line to the graph of a differentiable function f at the point (c, f(c)) with c ≠ 0
goes through the origin if and only if f′(c) = f(c)/c. By shifting, scaling, and tilting
a given graph, it is possible to make the modified graph have a tangent line go through
the origin at any given value of x or y. These geometric operations on a graph can be
performed by modifying the original function by linear factors. This is the main idea
used in the next paragraph.
Let (a, b) be an open interval that contains the number 1. Suppose that f is differentiable
and strictly increasing on (a, b) and that f′(1) ≤ e. (There is nothing special
about the choice of the numbers 1 and e; they could be replaced with any positive real
numbers.) For each c ≠ 0, define a function g_c on the open interval with endpoints ca
and cb by

    g_c(x) = f(x/c) + ((e − f′(1))/c) x + f′(1) − f(1).

Since each function g_c is strictly monotone, it is clear that g_c(x) = e only when x = c.
Since g_c′(c) = e/c, the tangent line to the graph of y = g_c(x) at the point (c, e) goes
through the origin; the y-coordinate is independent of the parameter c. Therefore,
many functions generate families of curves whose tangent lines when y = e go through
the origin. The uniqueness of the function e^x lies in the fact that the linear “correcting
factor” is 0. The reader may find it interesting to choose several functions for f (such
as x² or sin x) and look at the families generated by this method.
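As one way to carry out the suggested experiment, here is a minimal sketch for f(x) = x², reading the definition as g_c(x) = f(x/c) + ((e − f′(1))/c) x + f′(1) − f(1). The helper make_g and its argument names are assumptions for illustration, and the derivative of f is supplied by hand rather than computed.

```python
import math

def make_g(f, df, c):
    """Build g_c and its derivative from f and its derivative df."""
    k = (math.e - df(1.0)) / c          # coefficient of the linear correcting term
    const = df(1.0) - f(1.0)
    g = lambda x: f(x / c) + k * x + const
    dg = lambda x: df(x / c) / c + k    # chain rule applied to f(x / c)
    return g, dg

# f(x) = x**2 is strictly increasing near 1 and f'(1) = 2 <= e.
f = lambda x: x * x
df = lambda x: 2.0 * x

checks = []
for c in (-3.0, 0.5, 2.0):
    g, dg = make_g(f, df, c)
    # Expect g_c(c) = e and c * g_c'(c) = e, i.e. g_c'(c) = g_c(c) / c.
    checks.append((c, g(c), c * dg(c)))
```

For each c, both computed values should agree with e up to rounding, which is the condition for the tangent line at (c, e) to pass through the origin. Swapping in math.sin and math.cos for f and df gives the sin x family.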
Let f be a strictly monotone differentiable function and suppose that the tangent
line to the graph of y = f(x) at the point (c, d) goes through the origin. By the properties
of inverse functions, the tangent line to the graph of y = f⁻¹(x) at the point