Convexity-Print Version PDF

The document discusses convexity and quasiconvexity. It defines convex combinations and convex sets. A set is convex if the straight line between any two points in the set is contained in the set. Concave and convex functions are defined based on their subgraphs and epigraphs being convex sets. A function is concave if its subgraph is convex, and convex if its epigraph is convex. Characterizations are given for concave and convex functions using inequalities involving convex combinations of points. The concavity or convexity of differentiable functions can be determined from properties of the first and second derivatives. Examples are provided to illustrate the definitions and characterizations.

Convexity and Quasiconvexity

Convex Combinations and Convex Sets.


Definition. Given any finite collection of points $x_1, \dots, x_m \in \mathbb{R}^n$, a point $z \in \mathbb{R}^n$ is
said to be a convex combination of the points $\{x_1, \dots, x_m\}$ if there is some $\lambda \in \mathbb{R}^m$
satisfying
1. $\lambda_i \ge 0$, $i = 1, \dots, m$, and
2. $\sum_{i=1}^{m} \lambda_i = 1$,
such that $z = \sum_{i=1}^{m} \lambda_i x_i$. A subset $D$ of $\mathbb{R}^n$ is convex if every convex combination
of any two points in $D$ is also in $D$.

• Thus a set D is convex if the line segment joining any two points in D is completely
contained in D, i.e. if for all x and y in D and all λ ∈ (0, 1) it is the case that
λx + (1 − λ)y is an element of D.
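As a minimal sketch of the definition, the Python snippet below forms a convex combination and tests membership in two sets. The helper names (`convex_combination`, `in_unit_disc`, `in_annulus`) and the two example sets are illustrative assumptions, not part of the notes. One failing pair of points is enough to show a set is not convex; passing pairs can only suggest convexity.

```python
from typing import List, Sequence


def convex_combination(points: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Return sum_i weights[i] * points[i] after validating the weights."""
    if len(points) != len(weights):
        raise ValueError("need exactly one weight per point")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to 1")
    dim = len(points[0])
    return [sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim)]


def in_unit_disc(p: Sequence[float]) -> bool:
    """Membership in the closed unit disc, a convex set."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0


def in_annulus(p: Sequence[float]) -> bool:
    """Membership in the ring 0.5 <= |p| <= 1, a nonconvex set."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0


z = convex_combination([(1.0, 0.0), (0.0, 1.0)], [0.25, 0.75])
mid = convex_combination([(1.0, 0.0), (-1.0, 0.0)], [0.5, 0.5])
print(z, in_unit_disc(z))       # the combination stays inside the disc
print(mid, in_annulus(mid))     # the midpoint falls in the hole of the ring
```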

Figure 1: The sets represented by (a), (b) and (c) are convex, while (d), (e) and (f) illustrate nonconvex sets.

Concave and Convex Functions.


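Definition. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function.
• The subgraph of $f$, denoted $\operatorname{sub} f$, is the set

$\operatorname{sub} f = \{(x, y) \in D \times \mathbb{R} \mid f(x) \ge y\}.$

• The epigraph of $f$, denoted $\operatorname{epi} f$, is the set

$\operatorname{epi} f = \{(x, y) \in D \times \mathbb{R} \mid f(x) \le y\}.$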


• The subgraph of a function is the area lying below the graph of a function.
• On the other hand, the epigraph of a function is the area lying above the graph of
the function.

Figure 2: The subgraph and epigraph of f.

Definition. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function.

• We say that f is concave on D if sub f is a convex set.
• We say that f is convex on D if epi f is a convex set.

• Note that concave and convex functions are required to have convex domains.
• The following theorem provides an alternative definition of concave and convex
functions.

Theorem 1. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function. Then

1. f is concave iff for all x, y ∈ D and λ ∈ (0, 1), we have

f(λx + (1 − λ)y) ≥ λf(x) + (1 − λ)f(y).

2. f is convex iff for all x, y ∈ D and λ ∈ (0, 1), we have

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y).

• So a function is concave iff the function's value at a convex combination of any
two points is at least as great as the same convex combination of the function's
values at each point.
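The inequality in Theorem 1 can be probed numerically. The sketch below is an assumption-laden illustration rather than a proof: it samples random pairs on an interval and reports the first violation of the concavity inequality it finds. The function name `violates_concavity`, the tolerance, and the sample size are all choices made here. Finding a violation disproves concavity, while finding none only fails to disprove it.

```python
import math
import random


def violates_concavity(f, lo, hi, trials=10_000, tol=1e-9, seed=0):
    """Search [lo, hi] for x, y and lam with
    f(lam*x + (1-lam)*y) < lam*f(x) + (1-lam)*f(y).
    Return a counterexample (x, y, lam), or None if none was sampled."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        lam = rng.uniform(0.0, 1.0)
        z = lam * x + (1.0 - lam) * y
        if f(z) < lam * f(x) + (1.0 - lam) * f(y) - tol:
            return (x, y, lam)
    return None


# sqrt is concave on [0, 10]; t -> t*t is strictly convex, hence not concave.
print(violates_concavity(math.sqrt, 0.0, 10.0))
print(violates_concavity(lambda t: t * t, -1.0, 1.0))
```

To test convexity instead, the same routine can be applied to −f, in line with the relationship between concave and convex functions.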

Definition. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function.

• We say f is strictly concave if for all x, y ∈ D with x ≠ y, and all λ ∈ (0, 1),
we have
f(λx + (1 − λ)y) > λf(x) + (1 − λ)f(y).
• We say f is strictly convex if for all x, y ∈ D with x ≠ y, and all λ ∈ (0, 1), we
have
f(λx + (1 − λ)y) < λf(x) + (1 − λ)f(y).

Figure 3: A function f is concave iff the secant line connecting any two points on the graph of f lies on or below the graph.

Theorem 2. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function. Then

1. f is concave iff the function −f is convex.
2. f is strictly concave iff the function −f is strictly convex.

• The previous result allows us to easily apply all results about concave functions
to convex functions.
• Another valuable property of concave functions is that they behave well under
addition and scalar multiplication by positive numbers.

Theorem 3. Let $D$ be a convex subset of $\mathbb{R}^n$. Let $f_i : D \to \mathbb{R}$ be concave functions
and let $a_i$ be positive numbers, $i = 1, \dots, k$. Then

$a_1 f_1 + \cdots + a_k f_k$

is a concave function.
Proof. Simply apply the definition of a concave function.
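Spelled out, the proof is one chain of inequalities. Write $g = a_1 f_1 + \cdots + a_k f_k$ (the name $g$ is introduced here only for illustration). Applying the inequality of Theorem 1 to each $f_i$ and using $a_i > 0$ gives, for all $x, y \in D$ and $\lambda \in (0, 1)$:

```latex
\begin{align*}
g(\lambda x + (1-\lambda) y)
  &= \sum_{i=1}^{k} a_i f_i(\lambda x + (1-\lambda) y) \\
  &\ge \sum_{i=1}^{k} a_i \bigl[ \lambda f_i(x) + (1-\lambda) f_i(y) \bigr] \\
  &= \lambda g(x) + (1-\lambda) g(y).
\end{align*}
```

So $g$ is concave by Theorem 1. The positivity of the $a_i$ is what keeps the direction of the inequality: multiplying by a negative number would reverse it.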

• An identical result holds for convex functions.

• The assumption of convexity has two important implications.
• First, every concave function must also be continuous, except possibly at the
boundary points of its domain.
• Second, every concave function is differentiable “almost everywhere”.

Theorem 4. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a concave or
convex function. Then
1. If D is open, f is continuous on D.
2. If D is not open, f is continuous on int D.
3. If D is open, f is differentiable “almost everywhere” on D and the derivative
Df of f is continuous at all points where it exists.

• For a discussion of the meaning of “almost everywhere” see Sundaram, pp. 182-183.

Convexity and the Properties of the Derivative.

• We can characterize the concavity or convexity of a differentiable function using
the first derivative.

Theorem 5. Let $D$ be an open convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a $C^1$
function. Then
1. f is concave iff Df(x)(y − x) ≥ f(y) − f(x) for all x, y ∈ D.
2. f is convex iff Df(x)(y − x) ≤ f(y) − f(x) for all x, y ∈ D.

• Note that, if we let z = y − x, we can rewrite (1) to say f is concave iff
Df(x)z + f(x) ≥ f(x + z) for all x and z such that x and x + z are both in D.
• Thus a function is concave iff the tangent line lies on or above the graph of the
function.

Figure 4: A function is concave iff the tangent line lies above the graph of the function.

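The tangent-line condition of Theorem 5 is easy to check on a grid in one dimension. The following sketch is illustrative only: the helper `tangent_gap`, the grid, and the two test functions (log and t², with hand-coded derivatives) are assumptions chosen for the example.

```python
import math


def tangent_gap(f, df, x, y):
    """Value at y of the tangent line drawn at x, minus f(y)."""
    return f(x) + df(x) * (y - x) - f(y)


grid = [0.1 * k for k in range(1, 51)]  # sample points in (0, 5]

# For the concave function log, the gap should never be negative.
log_gaps = [tangent_gap(math.log, lambda t: 1.0 / t, x, y)
            for x in grid for y in grid]

# For the convex function t*t, the gap should never be positive.
sq_gaps = [tangent_gap(lambda t: t * t, lambda t: 2.0 * t, x, y)
           for x in grid for y in grid]

print(min(log_gaps) >= -1e-12)
print(max(sq_gaps) <= 1e-12)
```

A small tolerance is used because the gap is exactly zero when x = y and floating-point rounding can push it slightly to either side.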
• In the next theorem, the concavity or convexity of a $C^2$ function is characterized
using the second derivative.
• The theorem also gives a sufficient condition which can be used to identify
strictly concave and strictly convex functions.

Theorem 6. Let $D$ be an open convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a $C^2$ function. Then

1. f is concave iff $D^2 f(x)$ is a negative semidefinite matrix for all x ∈ D.
2. f is convex iff $D^2 f(x)$ is a positive semidefinite matrix for all x ∈ D.
3. If $D^2 f(x)$ is a negative definite matrix for all x ∈ D, then f is strictly concave.
4. If $D^2 f(x)$ is a positive definite matrix for all x ∈ D, then f is strictly convex.

• It is important to note that parts (3) and (4) of the theorem are only sufficient
conditions. For example, part (3) does not say that if f is strictly concave on D,
then $D^2 f(x)$ is a negative definite matrix for all x ∈ D.

• The next example illustrates this point.

Example 1. Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = -x^4$ and $g(x) = x^4$
respectively.
• Then f is strictly concave on ℝ, while g is strictly convex on ℝ.
• However $f''(0) = g''(0) = 0$, so that $f''(0)$ is not negative definite and $g''(0)$ is not
positive definite.

• Our next example illustrates the importance of the theorem for simplifying the
identification of concavity in practice.

Example 2. Let $f : \mathbb{R}^2_{++} \to \mathbb{R}$ be given by $f(x, y) = x^a y^b$, $a, b > 0$.

• For given a and b, this function is concave if, for any $(x, y)$ and $(\hat{x}, \hat{y})$ in $\mathbb{R}^2_{++}$
and any $\lambda \in (0, 1)$, we have

$[\lambda x + (1 - \lambda)\hat{x}]^a [\lambda y + (1 - \lambda)\hat{y}]^b \ge \lambda x^a y^b + (1 - \lambda)\hat{x}^a \hat{y}^b.$

• Similarly f is convex if, for all $(x, y)$ and $(\hat{x}, \hat{y})$ in $\mathbb{R}^2_{++}$ and any $\lambda \in (0, 1)$, we
have

$[\lambda x + (1 - \lambda)\hat{x}]^a [\lambda y + (1 - \lambda)\hat{y}]^b \le \lambda x^a y^b + (1 - \lambda)\hat{x}^a \hat{y}^b.$

• Compare checking for convexity of f using these inequalities to checking using
the second derivative test.

• The latter only requires us to identify the definiteness of the following matrix:

$$D^2 f(x, y) = \begin{pmatrix} a(a-1)x^{a-2}y^{b} & ab\,x^{a-1}y^{b-1} \\ ab\,x^{a-1}y^{b-1} & b(b-1)x^{a}y^{b-2} \end{pmatrix}.$$

The determinant of this matrix is

$$ab(1 - a - b)\,x^{2(a-1)} y^{2(b-1)},$$

which is positive if a + b < 1, zero if a + b = 1 and negative if a + b > 1.

• Furthermore, if a, b < 1 the diagonal terms are negative and so f is a strictly
concave function if a + b < 1 and concave if a + b = 1. If a + b > 1, then
$D^2 f(x, y)$ is indefinite and f is neither concave nor convex.
• In summary, a Cobb-Douglas production function on $\mathbb{R}^2_{++}$ is concave iff it
exhibits constant or decreasing returns to scale.
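The second derivative test of Example 2 can be reproduced numerically at a single point. The sketch below is an illustration under stated assumptions: the function names, the evaluation point (2, 3), the tolerance, and the three (a, b) pairs are all chosen here, and a check at one point does not establish definiteness on all of $\mathbb{R}^2_{++}$.

```python
def cobb_douglas_hessian(x, y, a, b):
    """Entries (fxx, fxy, fyy) of the Hessian of x**a * y**b for x, y > 0."""
    fxx = a * (a - 1) * x ** (a - 2) * y ** b
    fxy = a * b * x ** (a - 1) * y ** (b - 1)
    fyy = b * (b - 1) * x ** a * y ** (b - 2)
    return fxx, fxy, fyy


def classify_symmetric_2x2(fxx, fxy, fyy, tol=1e-12):
    """Classify [[fxx, fxy], [fxy, fyy]] from its determinant and trace."""
    det = fxx * fyy - fxy ** 2
    if det > tol:
        return "negative definite" if fxx < 0 else "positive definite"
    if det < -tol:
        return "indefinite"
    # det is numerically zero: one eigenvalue is 0, the other equals the trace.
    # (The zero matrix is both; it is reported as negative semidefinite.)
    return "negative semidefinite" if fxx + fyy <= 0 else "positive semidefinite"


point = (2.0, 3.0)
for a, b in [(0.3, 0.3), (0.5, 0.5), (0.8, 0.8)]:
    print(a + b, classify_symmetric_2x2(*cobb_douglas_hessian(*point, a, b)))
```

The three cases correspond to a + b < 1, a + b = 1 and a + b > 1 in the example.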

• We now present some results which indicate the importance of convexity for
optimization theory.
• But first some terminology.

Definition.
• We refer to a maximization problem as a convex maximization problem if the
constraint set is convex and the objective function is concave.
• Similarly, we refer to a minimization problem as a convex minimization problem
if the constraint set is convex and the objective function is convex.
• More generally, we refer to an optimization problem as a convex optimization
problem if it is either of the above.

• The first result establishes that in convex optimization problems, all local optima
must also be global optima.
• Thus, to find a global optimum in such problems, it is sufficient to identify a
local optimum.

Theorem 7. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be concave. Then

1. Any local maximum of f is a global maximum of f.
2. The set arg max{f(x) | x ∈ D} of maximizers of f on D is either empty or
convex.

• Similar results hold for convex minimization problems.
• The second part of the result means that we cannot have multiple isolated points
as maximizers.
• For example, in the utility maximization problem with two perfect substitutes,
either the solution is a unique corner solution or there are infinitely many solutions
along the budget constraint.
• The second result shows that if a strictly convex optimization problem has a
solution, then the solution is unique.

Theorem 8. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be strictly concave.
Then the set arg max{f(x) | x ∈ D} of maximizers of f on D is either empty or
contains a single point.
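The argument behind Theorem 8 is short enough to sketch. Suppose, for contradiction, that $x \ne y$ were both maximizers with common value $M = f(x) = f(y)$ (the symbol $M$ and the midpoint weight $\tfrac12$ are choices made here for illustration). Since $D$ is convex, the midpoint $\tfrac12 x + \tfrac12 y$ lies in $D$, and strict concavity gives

```latex
\[
f\left(\tfrac12 x + \tfrac12 y\right) > \tfrac12 f(x) + \tfrac12 f(y) = M,
\]
```

which contradicts the maximality of $M$. Hence there is at most one maximizer.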

• We can combine this result with the Weierstrass theorem to establish the existence
of a unique global optimum in a convex optimization problem in which the
objective function is continuous and strictly concave (strictly convex for a
minimization problem) and the constraint set is nonempty and compact.

Quasiconcave and Quasiconvex Functions.

• We have seen that convexity has powerful implications for optimization problems.
However, convexity is a very restrictive assumption, which matters
when we come to applications.
• For example, we saw that the Cobb-Douglas production function $f(x, y) = x^a y^b$
($a, b > 0$) is not concave unless $a + b \le 1$.
• So, we will now look at optimization under a weakening of the condition of
convexity, called quasiconvexity.

Definition. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function.

• The upper contour set of f at $a \in \mathbb{R}$, denoted $U_f(a)$, is the set

$U_f(a) = \{x \in D \mid f(x) \ge a\}.$

• The lower contour set of f at $a \in \mathbb{R}$, denoted $L_f(a)$, is the set

$L_f(a) = \{x \in D \mid f(x) \le a\}.$

Definition. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function.

• We say that f is quasiconcave on D if $U_f(a)$ is a convex set for all $a \in \mathbb{R}$.
• We say that f is quasiconvex on D if $L_f(a)$ is a convex set for all $a \in \mathbb{R}$.

• Thus a function is quasiconcave if its upper contour sets are convex sets.
• Similarly, a function is quasiconvex if its lower contour sets are convex sets.
• As is the case with concave and convex functions, it is also true for quasiconcave
and quasiconvex functions that a relationship exists between the value of a
function at two points and the value of the function at a convex combination.

Figure 5: The level sets of a strictly quasiconcave function (a > b). The upper contour set of f at a is $U_f(a) = \{\hat{x} \in D \mid f(\hat{x}) \ge a\}$.

• The following theorem provides two alternative definitions of quasiconcavity.

Theorem 9. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function. Then
the following statements are equivalent.
1. f is quasiconcave on D.
2. For all x, y ∈ D and all λ ∈ (0, 1),
f(x) ≥ f(y) implies f(λx + (1 − λ)y) ≥ f(y).
3. For all x, y ∈ D and all λ ∈ (0, 1),
f(λx + (1 − λ)y) ≥ min{f(x), f(y)}.

• A similar result holds for quasiconvex functions, with the inequalities reversed
and “min” replaced with “max”.

Definition. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function.
• We say f is strictly quasiconcave if for all x, y ∈ D with x ≠ y, and all λ ∈ (0, 1),
we have
f(λx + (1 − λ)y) > min{f(x), f(y)}.
• We say f is strictly quasiconvex if for all x, y ∈ D with x ≠ y, and all λ ∈ (0, 1),
we have
f(λx + (1 − λ)y) < max{f(x), f(y)}.

Theorem 10. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function. Then
1. f is quasiconcave iff the function −f is quasiconvex.
2. f is strictly quasiconcave iff the function −f is strictly quasiconvex.

Quasiconvexity as a Generalization of Convexity.

• It is straightforward to show that the set of all quasiconcave functions contains
the set of all concave functions, and similarly for quasiconvex functions.

Theorem 11. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a function. Then

1. If f is concave on D, then it is also quasiconcave on D.
2. If f is convex on D, then it is also quasiconvex on D.

• The following example demonstrates how to check directly for quasiconvexity
and shows that the converse of the above result is false.

Example 3. Let $f : \mathbb{R} \to \mathbb{R}$ be any increasing (that is, nondecreasing) function. Then f is
both quasiconcave and quasiconvex.

• To show this, consider any x, y ∈ ℝ and any λ ∈ (0, 1). Assume, without loss
of generality, that x > y. Then

x > λx + (1 − λ)y > y.

• Since f is increasing, we have

f(x) ≥ f(λx + (1 − λ)y) ≥ f(y).

• Since f(x) = max{f(x), f(y)}, the first inequality shows that f is quasiconvex.
• Similarly, since f(y) = min{f(x), f(y)}, the second inequality shows that f is
quasiconcave.
• Since it is always possible to choose a nondecreasing function f that is neither
concave nor convex on ℝ (say $f(x) = x^3$), we have shown that not every
quasiconcave function is concave and not every quasiconvex function is convex.
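The claims about $f(x) = x^3$ can be checked with a few lines of Python. This is a sketch under assumptions: the helper names, the interval [−2, 2], the sample size and the tolerance are chosen here for illustration. The sampled checks only fail to find a violation of the min and max inequalities of Theorem 9, whereas the two explicit counterexamples (both with λ = 1/2) do settle that the function is neither concave nor convex.

```python
import random


def cube(t):
    return t ** 3


def sample_triples(lo, hi, trials=5000, seed=0):
    """Yield (x, y, z) where z is a random convex combination of x and y."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        lam = rng.uniform(0.0, 1.0)
        yield x, y, lam * x + (1.0 - lam) * y


def looks_quasiconcave(f, lo, hi, tol=1e-9):
    """True if no sampled triple violates f(z) >= min(f(x), f(y))."""
    return all(f(z) >= min(f(x), f(y)) - tol for x, y, z in sample_triples(lo, hi))


def looks_quasiconvex(f, lo, hi, tol=1e-9):
    """True if no sampled triple violates f(z) <= max(f(x), f(y))."""
    return all(f(z) <= max(f(x), f(y)) + tol for x, y, z in sample_triples(lo, hi))


# Midpoint of 0 and 2: cube(1) = 1 is below the average (0 + 8) / 2 = 4.
not_concave = cube(0.5 * 0.0 + 0.5 * 2.0) < 0.5 * cube(0.0) + 0.5 * cube(2.0)
# Midpoint of -2 and 0: cube(-1) = -1 is above the average (-8 + 0) / 2 = -4.
not_convex = cube(0.5 * -2.0 + 0.5 * 0.0) > 0.5 * cube(-2.0) + 0.5 * cube(0.0)

print(looks_quasiconcave(cube, -2.0, 2.0), looks_quasiconvex(cube, -2.0, 2.0))
print(not_concave, not_convex)
```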

• The next theorem elaborates on the relationship between concave and quasiconcave
functions.

Theorem 12. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a quasiconcave
function.
1. If $\varphi : \mathbb{R} \to \mathbb{R}$ is an increasing function, then the composition $\varphi \circ f$ is a
quasiconcave function from D to ℝ.
2. In particular, any increasing transform of a concave function results in a
quasiconcave function.

• The converse of this theorem is not true. That is, we cannot say that every
quasiconcave function is an increasing transformation of some concave function. See
Sundaram, pp. 207-209, for two concrete examples of quasiconcave functions that
are not increasing transformations of any concave function.

Quasiconvexity and the Properties of the Derivative.

• As with concavity, we can characterize the quasiconcavity of a differentiable
function using the first derivative.

Theorem 13. Let $D$ be an open convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a $C^1$
function. Then
1. f is quasiconcave iff f(y) ≥ f(x) implies Df(x)(y − x) ≥ 0 for all x, y ∈ D.
2. f is quasiconvex iff f(y) ≤ f(x) implies Df(x)(y − x) ≤ 0 for all x, y ∈ D.

• Condition (1) is illustrated in the figure. If we think of $Df(x)^T$ as the gradient
vector ∇f(x), then the theorem says that, whenever f(y) ≥ f(x), the angle between
the gradient and the vector y − x is acute (or right).

Figure 6: The condition (1) says that the angle between the vector y − x and ∇f(x) is acute.

• We can also test for quasiconcavity using the second derivative.

Theorem 14. Let $D$ be an open convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be a $C^2$
function. Consider the bordered Hessian

$$H = \begin{pmatrix} 0 & f_1 & \cdots & f_n \\ f_1 & f_{11} & \cdots & f_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ f_n & f_{n1} & \cdots & f_{nn} \end{pmatrix}.$$

Let $H^r$ denote the $r$th order leading principal submatrix of $H$.

1. If f is quasiconcave on D, then, for all x ∈ D, $(-1)^{r-1}|H^r| \ge 0$ for $r = 2, \dots, n + 1$.
2. If f is quasiconvex on D, then, for all x ∈ D, $|H^r| \le 0$ for $r = 2, \dots, n + 1$.
3. If $(-1)^{r-1}|H^r| > 0$ for all $r = 2, \dots, n + 1$ and all x ∈ D, then f is quasiconcave on D.
4. If $|H^r| < 0$ for all $r = 2, \dots, n + 1$ and all x ∈ D, then f is quasiconvex on D.
• Part (3) requires the signs of the leading principal minors to alternate, starting
with negative for the 2 × 2 matrix $H^2$.
• Compare this theorem with the corresponding theorem on concavity.
• There are two important differences.
– First, in Theorem 6, a weak inequality, i.e. the negative semidefiniteness of $D^2 f$,
was both necessary and sufficient to establish concavity. In the result above,
however, the weak inequality is only a necessary condition for quasiconcavity.
The sufficient condition involves a strict inequality.
– Second, the theorem does not give a test for strict quasiconcavity.
Example 4. Let $f : \mathbb{R}^2_{++} \to \mathbb{R}$ be given by $f(x, y) = x^a y^b$, $a, b > 0$.
• We saw that f is strictly concave if a + b < 1, concave if a + b = 1, and
neither concave nor convex if a + b > 1.
• We will show that f is quasiconcave for all a, b > 0.
• To show this directly, using the definition of quasiconcavity, requires us to prove
that
$$[\lambda x + (1 - \lambda)\hat{x}]^a [\lambda y + (1 - \lambda)\hat{y}]^b \ge \min\{x^a y^b, \hat{x}^a \hat{y}^b\}$$
holds for all $(x, y) \ne (\hat{x}, \hat{y})$ in $\mathbb{R}^2_{++}$ and for all $\lambda \in (0, 1)$.
• Compare checking for quasiconcavity of f using this inequality to checking using
the second derivative test.
• We have to show that $|H^2(x, y)| < 0$ and $|H^3(x, y)| > 0$ for all $(x, y) \in \mathbb{R}^2_{++}$,
where
$$H^2(x, y) = \begin{pmatrix} 0 & a x^{a-1} y^{b} \\ a x^{a-1} y^{b} & a(a-1) x^{a-2} y^{b} \end{pmatrix},$$
$$H^3(x, y) = \begin{pmatrix} 0 & a x^{a-1} y^{b} & b x^{a} y^{b-1} \\ a x^{a-1} y^{b} & a(a-1) x^{a-2} y^{b} & ab\,x^{a-1} y^{b-1} \\ b x^{a} y^{b-1} & ab\,x^{a-1} y^{b-1} & b(b-1) x^{a} y^{b-2} \end{pmatrix}.$$
• Calculating the determinants we find
$$|H^2(x, y)| = -a^2 x^{2(a-1)} y^{2b} < 0,$$
$$|H^3(x, y)| = ab(a + b)\,x^{3a-2} y^{3b-2} > 0,$$
for all $(x, y) \in \mathbb{R}^2_{++}$.
• Thus f is quasiconcave on $\mathbb{R}^2_{++}$.
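The two determinants in Example 4 can be cross-checked numerically. The sketch below is illustrative: the function name `bordered_hessian_minors`, the evaluation point and the exponents a = 2, b = 3 (a case with a + b > 1, where f is not concave) are assumptions made here. It builds the bordered Hessian entries from the partial derivatives, expands the 3 × 3 determinant along the first row, and compares the results with the closed forms stated in the example.

```python
import math


def bordered_hessian_minors(x, y, a, b):
    """Return (|H^2|, |H^3|) for f(x, y) = x**a * y**b at a point with x, y > 0."""
    f = x ** a * y ** b
    f1, f2 = a * f / x, b * f / y           # first partial derivatives
    f11 = a * (a - 1) * f / x ** 2          # second partial derivatives
    f12 = a * b * f / (x * y)
    f22 = b * (b - 1) * f / y ** 2
    h2 = -f1 * f1                           # det [[0, f1], [f1, f11]]
    # det [[0, f1, f2], [f1, f11, f12], [f2, f12, f22]], first-row expansion
    h3 = -f1 * (f1 * f22 - f12 * f2) + f2 * (f1 * f12 - f11 * f2)
    return h2, h3


x, y, a, b = 1.5, 0.7, 2.0, 3.0
h2, h3 = bordered_hessian_minors(x, y, a, b)
h2_formula = -(a ** 2) * x ** (2 * (a - 1)) * y ** (2 * b)
h3_formula = a * b * (a + b) * x ** (3 * a - 2) * y ** (3 * b - 2)

print(h2 < 0 < h3)
print(math.isclose(h2, h2_formula), math.isclose(h3, h3_formula))
```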

Quasiconvexity and Optimization.

• Unlike concave and convex functions:
– Quasiconcave and quasiconvex functions are not necessarily continuous on
the interior of their domains.
– Quasiconcave functions can have local maxima that are not global maxima,
and quasiconvex functions can have local minima that are not global
minima.
– First order conditions are not sufficient to identify even local optima under
quasiconvexity.
• The following example illustrates these points.

Example 5. Let $f : \mathbb{R} \to \mathbb{R}$ be given by

$$f(x) = \begin{cases} x^3, & x \le 1 \\ 1, & x \in (1, 2] \\ x^3, & x > 2 \end{cases}$$

Since f is increasing (in the weak sense), it is both quasiconcave and quasiconvex on ℝ.

• Clearly, f has a discontinuity at x = 2.
• Also, f is constant on the open interval (1, 2), so that every point in this interval
is a local maximizer and local minimizer of f.
• However, no point in (1, 2) is either a global maximizer or a global minimizer.
• Finally, $f'(0) = 0$, although 0 is not a local maximum or local minimum.
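Each bullet of Example 5 can be confirmed by evaluating the function directly. The snippet below is a small sketch; the step size `h`, the sample points and the variable names are choices made here for illustration.

```python
def f(x):
    """Piecewise function from Example 5."""
    if x <= 1:
        return x ** 3
    if x <= 2:
        return 1.0
    return x ** 3


h = 1e-6

# Jump at x = 2: f(2) = 1, while values just to the right are close to 8.
jump_at_2 = f(2 + h) - f(2)

# f is flat on (1, 2), so interior points are local maximizers and minimizers.
flat_on_interval = f(1.25) == f(1.5) == f(1.75) == 1.0

# None of them is global: f takes smaller and larger values elsewhere.
not_global = f(0) < f(1.5) < f(3)

# Central difference at 0 equals h**2 in exact arithmetic, so f'(0) = 0,
# yet f is below f(0) on the left and above it on the right.
slope_at_0 = (f(h) - f(-h)) / (2 * h)
passes_through_0 = f(-h) < f(0) < f(h)

print(jump_at_2, flat_on_interval, not_global, slope_at_0, passes_through_0)
```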

• Another important distinction between convexity and quasiconvexity is that, while
a strictly concave function cannot be even weakly convex, a strictly quasiconcave
function can also be strictly quasiconvex.
• For example, any strictly increasing function on ℝ is both strictly quasiconvex
and strictly quasiconcave. This can be shown by modifying Example 3.

• We saw that local maxima of quasiconcave functions need not be global maxima.
• However, when the function is strictly quasiconcave, there is a result identical to
that for strictly concave functions.

Theorem 15. Let $D$ be a convex subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}$ be strictly
quasiconcave. Then
1. Any local maximum of f is a global maximum of f.
2. The set arg max{f(x) | x ∈ D} of maximizers of f on D is either empty or a
singleton.

• A similar result holds for strictly quasiconvex functions in minimization problems.
• This is significant because it says that the weaker property of strict quasiconcavity
is enough to guarantee uniqueness of the solution (if there is one).
