
Convex Sets and Jensen's Inequality

The document defines convex sets and convex functions. It states that a function f is convex if its epigraph is a convex set. Jensen's inequality is introduced, which states that for a convex function f and weights w1,...wn, f(w1x1 + ... + wnxn) ≤ w1f(x1) + ... + wnf(xn). The inequality is proved using induction. Examples are given to illustrate Jensen's inequality and how it can be used to solve optimization problems.

Uploaded by

Nou Channarith

Convex Sets

and Jensen’s Inequality

ANDREW D SMITH
School of Mathematics and Statistics
University College Dublin

Definition of Convex Sets: A set $A \subset \mathbb{R}^n$ is convex if:

• For any vectors a, b ∈ A
• For any λ ∈ [0, 1]
• The point λa + (1 − λ)b ∈ A.

This says that if two points a and b lie in the set, then so does the
straight line segment connecting a to b.
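The definition can be probed numerically. The sketch below is an illustration only, not part of the original slides: it samples pairs of points and looks for a segment that leaves the set. The helper names (`looks_convex`, `in_disk`, `in_annulus`, `sample_square`) are hypothetical, and a random search can only find counterexamples; it cannot prove convexity.

```python
import random

def looks_convex(contains, sample, trials=10_000, seed=0):
    """Search for a counterexample to convexity by random sampling.

    contains(p) tests membership; sample(rng) returns a random candidate point.
    Returns False as soon as a point on a segment between two members lies
    outside the set, and True if no such point was found.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = sample(rng), sample(rng)
        if not (contains(a) and contains(b)):
            continue  # only pairs of members are relevant
        lam = rng.random()
        point = tuple(lam * ai + (1 - lam) * bi for ai, bi in zip(a, b))
        if not contains(point):
            return False
    return True

def sample_square(rng):
    # Candidate points from the square [-1, 1] x [-1, 1].
    return (rng.uniform(-1, 1), rng.uniform(-1, 1))

def in_disk(p):
    # Closed unit disk: convex.
    return p[0] ** 2 + p[1] ** 2 <= 1.0

def in_annulus(p):
    # Unit disk with a hole of radius 1/2: not convex.
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0

print("disk:", looks_convex(in_disk, sample_square))
print("annulus:", looks_convex(in_annulus, sample_square))
```

For the annulus, any two members on opposite sides of the hole give a segment that passes through it, which is exactly the failure the definition rules out.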
Which of these sets are convex?

Definition of Convex Functions: A function $f : \mathbb{R} \to \mathbb{R}$ is convex if the epigraph of $f$ is a convex set.
The epigraph is the set of points lying on or above the graph of $f$:

$$\operatorname{epi} f = \{(x, y) \in \mathbb{R}^2 : y \ge f(x)\}$$


A function f is concave if −f is convex, or equivalently, if the subgraph (the set of points lying on or below the graph) is a convex set.

Similar definitions apply if f is defined on a sub-interval of $\mathbb{R}$, or if f is defined on (a convex subset of) $\mathbb{R}^n$.

Proposition: A function $f : \mathbb{R} \to \mathbb{R}$ is convex if and only if, for all x, y and 0 ≤ λ ≤ 1:

f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y)

Proof We prove this in two stages. Firstly, we show that this inequality
implies that the epigraph is convex (the ‘if’ part), and then
that a convex epigraph implies this inequality (the ‘only if’ part).
If. Suppose that the inequality holds. We need to show that the
epigraph is convex.
Suppose then that the vectors (x, a) and (y, b) are in the epigraph,
which is equivalent to:

a ≥ f (x)

b ≥ f (y)

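We then consider an intermediate point (λx + (1 − λ)y, λa + (1 − λ)b) with 0 ≤ λ ≤ 1.
Since λ and 1 − λ are non-negative, the bounds on a and b give the first line below, and the assumed inequality gives the second:

λa + (1 − λ)b ≥ λf (x) + (1 − λ)f (y)

≥ f (λx + (1 − λ)y)

Thus, the intermediate point lies in the epigraph of f. This proves that the epigraph is convex.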

Only if. Suppose that the epigraph of f (x) is convex. Then we need to prove that f (x) satisfies the inequality.
Let us then pick x, y in the domain of f . Then the vectors (x, f (x))
and (y, f (y)) lie in the epigraph of f .
By hypothesis, the epigraph is convex and so, for 0 ≤ λ ≤ 1 the
following point is in the epigraph:

(λx + (1 − λ)y, λf (x) + (1 − λ)f (y))

By definition of the epigraph, this implies that:

λf (x) + (1 − λ)f (y) ≥ f (λx + (1 − λ)y)

Therefore, any convex function satisfies the inequality claimed.

We have therefore proved that the inequality holds for convex functions, and only for convex functions.

Examples of Convex Functions

• $y = x^2$
• $y = 10^x$
• $y = 1/x$ for $x > 0$
• $y = \sqrt{1 + x^2}$
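As a rough numerical companion to the proposition, the sketch below tests the inequality f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) at random points for the four functions listed above. It reads the last example as √(1 + x²), which is how the function is used in a later example. The function name `violates_convexity`, the sampling intervals and the tolerance are assumptions made for illustration; passing the test is evidence, not a proof.

```python
import math
import random

def violates_convexity(f, lo, hi, trials=10_000, seed=0, tol=1e-9):
    """Return True if some sampled (x, y, lam) breaks
    f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y) beyond a small tolerance."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, lam = rng.uniform(lo, hi), rng.uniform(lo, hi), rng.random()
        lhs = f(lam * x + (1 - lam) * y)
        rhs = lam * f(x) + (1 - lam) * f(y)
        if lhs > rhs + tol * max(1.0, abs(rhs)):
            return True
    return False

# name -> (function, left end of interval, right end of interval)
examples = {
    "x^2": (lambda x: x * x, -2.0, 2.0),
    "10^x": (lambda x: 10.0 ** x, -2.0, 2.0),
    "1/x": (lambda x: 1.0 / x, 0.1, 4.0),
    "sqrt(1+x^2)": (lambda x: math.sqrt(1.0 + x * x), -2.0, 2.0),
}

for name, (f, lo, hi) in examples.items():
    print(name, "violation found:", violates_convexity(f, lo, hi))

# A concave function for contrast: sin on [0, 3] fails the test.
print("sin violation found:", violates_convexity(math.sin, 0.0, 3.0))
```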

[Figure: four panels plotting the example functions: $y = x^2$ and $y = 10^x$ for −2 ≤ x ≤ 2, $y = 1/x$ for 0 < x ≤ 4, and $y = \sqrt{1 + x^2}$ for −2 ≤ x ≤ 2.]

Jensen’s Inequality:
Let $f(x)$ be a convex function and let $w_1, w_2, \ldots, w_n$ be weights with

• $w_j \ge 0$
• $\sum_{j=1}^{n} w_j = 1$

Then, for arbitrary $x_1, x_2, \ldots, x_n$, Jensen’s inequality states:

$$f(w_1 x_1 + w_2 x_2 + \ldots + w_n x_n) \le w_1 f(x_1) + w_2 f(x_2) + \ldots + w_n f(x_n)$$

Proof We proceed by induction on n, the number of weights.

If n = 1 then equality holds and the inequality is trivially true.
Let us suppose, inductively, that Jensen’s inequality holds for n = k − 1. We seek to prove the inequality when n = k.
Let us then suppose that $w_1, w_2, \ldots, w_k$ are weights with

• $w_j \ge 0$
• $\sum_{j=1}^{k} w_j = 1$

If $w_k = 1$ then all the other weights are zero and the inequality reduces to $f(x_k) \le f(x_k)$, which is trivially true, so we concentrate on the case $w_k < 1$. The rescaled weights $w_j / (1 - w_k)$, for $1 \le j \le k - 1$, are non-negative and sum to 1. Applying the inductive hypothesis to the k − 1 points $x_1, x_2, \ldots, x_{k-1}$ with these weights:

$$f\left(\frac{w_1}{1 - w_k} x_1 + \frac{w_2}{1 - w_k} x_2 + \ldots + \frac{w_{k-1}}{1 - w_k} x_{k-1}\right) \le \frac{w_1 f(x_1) + w_2 f(x_2) + \ldots + w_{k-1} f(x_{k-1})}{1 - w_k}$$

Trivially we also have:

$$f(x_k) \le f(x_k)$$

Taking a weighted average of the last two formulas with weights $1 - w_k$ and $w_k$ respectively, we have:

$$(1 - w_k)\, f\left(\frac{w_1}{1 - w_k} x_1 + \frac{w_2}{1 - w_k} x_2 + \ldots + \frac{w_{k-1}}{1 - w_k} x_{k-1}\right) + w_k f(x_k) \le w_1 f(x_1) + w_2 f(x_2) + \ldots + w_{k-1} f(x_{k-1}) + w_k f(x_k)$$

The point $w_1 x_1 + w_2 x_2 + \ldots + w_k x_k$ is itself a weighted average, with weights $1 - w_k$ and $w_k$, of the point inside the bracket and $x_k$. So by the convexity of f we can bound the left hand side from below:

$$f(w_1 x_1 + w_2 x_2 + \ldots + w_k x_k) \le (1 - w_k)\, f\left(\frac{w_1}{1 - w_k} x_1 + \frac{w_2}{1 - w_k} x_2 + \ldots + \frac{w_{k-1}}{1 - w_k} x_{k-1}\right) + w_k f(x_k)$$

Combining these last two inequalities, we have proved the inequality when n = k:

$$f(w_1 x_1 + w_2 x_2 + \ldots + w_k x_k) \le w_1 f(x_1) + w_2 f(x_2) + \ldots + w_{k-1} f(x_{k-1}) + w_k f(x_k)$$

By induction we have proved Jensen’s inequality for arbitrary positive integers n.
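Jensen’s inequality is easy to spot-check numerically. The sketch below is an illustration with hypothetical helper names (`jensen_gap`, `random_weights`): it computes the difference between the right and left hand sides, which should be non-negative for a convex function. The choice of exp as the convex function is an assumption made for the example.

```python
import math
import random

def jensen_gap(f, xs, ws):
    """Return sum(w_j f(x_j)) - f(sum(w_j x_j)); non-negative when f is convex."""
    assert all(w >= 0 for w in ws) and abs(sum(ws) - 1.0) < 1e-12
    mean = sum(w * x for w, x in zip(ws, xs))
    return sum(w * f(x) for w, x in zip(ws, xs)) - f(mean)

def random_weights(n, rng):
    """n non-negative weights that sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

rng = random.Random(1)
xs = [rng.uniform(-3, 3) for _ in range(6)]
ws = random_weights(6, rng)
print("gap for exp:", jensen_gap(math.exp, xs, ws))

# The two cases of equality: equal points, or a single non-zero weight.
print("equal points:", jensen_gap(math.exp, [2.0] * 4, [0.25] * 4))
print("one weight:", jensen_gap(math.exp, [1.0, 2.0, 3.0], [0.0, 1.0, 0.0]))
```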

When does Equality Hold?

Equality holds in Jensen’s inequality if either:
• All the $x_j$ are equal
• All but one of the $w_j$ are zero.
If the function f (x) is strictly convex then these are essentially the only cases
of equality: equality forces all the $x_j$ that carry a positive weight $w_j$ to be equal.

Example Problem
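Show that:

$$\sqrt{1^2 + 1} + \sqrt{2^2 + 1} + \ldots + \sqrt{n^2 + 1} \ge \frac{n}{2}\sqrt{n^2 + 2n + 5}$$

Solution: Apply Jensen’s inequality to the convex function $f(x) = \sqrt{1 + x^2}$ at the points $x_j = j$ for $1 \le j \le n$, each with weight $1/n$. Then

$$\frac{\sqrt{1^2 + 1} + \sqrt{2^2 + 1} + \ldots + \sqrt{n^2 + 1}}{n} \ge \sqrt{1 + \left(\frac{1 + 2 + \ldots + n}{n}\right)^2} = \sqrt{1 + \frac{(n + 1)^2}{4}} = \frac{1}{2}\sqrt{(n + 1)^2 + 4}$$

Multiplying by n, and noting that $(n + 1)^2 + 4 = n^2 + 2n + 5$, we obtain the result we set out to prove.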
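The inequality can be checked directly for small n. This is a minimal sketch with hypothetical function names (`lhs`, `rhs`, `holds`); the small tolerance is an assumption added because the two sides coincide at n = 1 and floating-point rounding should not decide the comparison.

```python
import math

def lhs(n):
    """sqrt(1^2 + 1) + sqrt(2^2 + 1) + ... + sqrt(n^2 + 1)."""
    return sum(math.sqrt(j * j + 1) for j in range(1, n + 1))

def rhs(n):
    """(n / 2) * sqrt(n^2 + 2n + 5)."""
    return n / 2 * math.sqrt(n * n + 2 * n + 5)

def holds(n, tol=1e-12):
    # Allow for rounding error relative to the size of the terms.
    return lhs(n) >= rhs(n) - tol * max(1.0, rhs(n))

for n in (1, 2, 5, 10, 100):
    print(n, lhs(n), rhs(n), holds(n))
```

At n = 1 there is a single point, so Jensen’s inequality holds with equality and both sides equal √2.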

Example: Suppose a, b and c are positive real numbers with:

$$\frac{1}{a} + \frac{1}{b} + \frac{1}{c} = a + b + c$$

Find the minimal value of this expression.
Solution By Jensen’s inequality applied to the convex function
f (x) = 1/x, for arbitrary a, b, c > 0:

$$\left(\frac{a + b + c}{3}\right)^{-1} \le \frac{1}{3}\left(\frac{1}{a} + \frac{1}{b} + \frac{1}{c}\right)$$

We can therefore conclude that:

$$\left(\frac{1}{a} + \frac{1}{b} + \frac{1}{c}\right)(a + b + c) \ge 9$$

In this example, we are told the two factors are equal and positive;
therefore:

$$\frac{1}{a} + \frac{1}{b} + \frac{1}{c} = a + b + c \ge 3$$

Equality holds when a = b = c = 1.
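The bound can be explored numerically. The approach below is not from the slides: for given a and b, the constraint rearranges to c − 1/c = k with k = 1/a + 1/b − a − b, whose positive root is c = (k + √(k² + 4))/2. The names `third_variable` and `common_value`, and the sampling range, are assumptions made for this sketch.

```python
import math
import random

def third_variable(a, b):
    """Positive c solving 1/a + 1/b + 1/c = a + b + c for given a, b > 0."""
    k = 1 / a + 1 / b - a - b          # the constraint reads c - 1/c = k
    return (k + math.sqrt(k * k + 4)) / 2

def common_value(a, b):
    """The shared value a + b + c once c is chosen to satisfy the constraint."""
    return a + b + third_variable(a, b)

rng = random.Random(0)
pairs = [(rng.uniform(0.05, 5), rng.uniform(0.05, 5)) for _ in range(100_000)]
print("smallest sampled value:", min(common_value(a, b) for a, b in pairs))
print("value at a = b = 1:", common_value(1.0, 1.0))
```

The sampled values stay at or above 3, and the value 3 is attained at a = b = c = 1, as the solution states.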

Example: Suppose $\{x_i : 1 \le i \le n\}$ are non-negative real numbers with

$$\sum_{i=1}^{n} x_i = 1$$

What is the lowest possible value of:
(1) $\sum_{i=1}^{n} x_i^2$
(2) $\sum_{i=1}^{n} \sqrt{x_i}$?
For the first problem, we note by the convexity of $x^2$ that

$$\left(\frac{x_1 + x_2 + \ldots + x_n}{n}\right)^2 \le \frac{x_1^2 + x_2^2 + \ldots + x_n^2}{n}$$

Therefore,

$$\sum_{i=1}^{n} x_i^2 \ge \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2 = \frac{1}{n}$$

Equality holds when all the $x_i$ are equal to $1/n$.
For the second problem, we cannot apply Jensen’s inequality because $\sqrt{x}$ is concave, not convex.
However, we note that for $0 \le x \le 1$:

$$\sqrt{x} \ge x$$

Therefore:

$$\sum_{i=1}^{n} \sqrt{x_i} \ge \sum_{i=1}^{n} x_i = 1$$

Equality holds when one of the $x_i = 1$ and all the others are zero.
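Both lower bounds can be compared against random points satisfying the constraint. The sketch below is illustrative only; the helper names and the way points are sampled (normalised exponential variates) are assumptions, and random sampling only shows that no sampled point beats the bounds.

```python
import math
import random

def sum_of_squares(xs):
    return sum(x * x for x in xs)

def sum_of_roots(xs):
    return sum(math.sqrt(x) for x in xs)

def random_simplex_point(n, rng):
    """Random non-negative numbers summing to 1 (normalised exponentials)."""
    raw = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

n = 5
rng = random.Random(0)
samples = [random_simplex_point(n, rng) for _ in range(10_000)]

print("min sum of squares:", min(sum_of_squares(xs) for xs in samples), "bound:", 1 / n)
print("min sum of roots:", min(sum_of_roots(xs) for xs in samples), "bound:", 1.0)

# The two extremal points named in the text.
print(sum_of_squares([1 / n] * n), sum_of_roots([1.0] + [0.0] * (n - 1)))
```

The two minima sit at opposite extremes of the constraint set: the centre for the convex function and a corner for the concave one.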

Intersections of Convex Sets:

The intersection of a collection of convex sets is still convex: if two points lie in every set of the collection, then so does the line segment connecting them.

Corollary The maximum of convex functions is convex (because the epigraph of the maximum is the intersection of the epigraphs).

Example: Arithmetic-Geometric Mean Inequality


Let $a_1, a_2, \ldots, a_n \ge 0$. Then:

$$\sqrt[n]{\prod_{j=1}^{n} a_j} \le \frac{1}{n}\sum_{j=1}^{n} a_j$$

In this expression, the left hand side is the geometric mean and the
right hand side is the arithmetic mean.
Proof If any of the $a_j$ are zero then the left hand side is zero and the result holds trivially.
So let us suppose all the $a_j$ are strictly positive. Then we can write
$a_j = 10^{x_j}$ for some (positive or negative) $x_j$.
Then applying Jensen’s inequality to the convex function $10^x$, with
weights equal to $1/n$, we have:

$$10^{(x_1 + x_2 + \ldots + x_n)/n} \le \frac{1}{n}\sum_{j=1}^{n} 10^{x_j}$$

The left hand side is $\sqrt[n]{a_1 a_2 \cdots a_n}$ and the right hand side is the arithmetic mean of the $a_j$, so this is the inequality we set out to prove.

Note: Equality holds when all the $a_j$ are equal.
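The substitution used in the proof translates directly into code. In the sketch below, the geometric mean is computed as 10 raised to the average of the base-10 logarithms, mirroring $a_j = 10^{x_j}$. The function names are hypothetical and the sample data are arbitrary.

```python
import math

def arithmetic_mean(a):
    return sum(a) / len(a)

def geometric_mean(a):
    """n-th root of the product, computed as 10 ** mean(log10(a_j))."""
    if any(v == 0 for v in a):
        return 0.0  # the trivial case handled first in the proof
    return 10 ** (sum(math.log10(v) for v in a) / len(a))

for data in ([1.0, 10.0, 100.0], [2.0, 2.0, 2.0], [0.0, 3.0, 9.0]):
    print(data, geometric_mean(data), arithmetic_mean(data))
```

Working with logarithms also avoids overflow when the product of many terms would be too large to represent directly.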

Applications of AM ≥ GM.
Problem AMGM #1
If $\{b_1, b_2, \ldots, b_n\}$ is a permutation of the sequence $\{a_1, a_2, \ldots, a_n\}$
of positive real numbers, then show that:

$$\frac{a_1}{b_1} + \frac{a_2}{b_2} + \ldots + \frac{a_n}{b_n} \ge n$$

Problem AMGM #2 Let a, b, c be positive real numbers. Show that:

$$a^3 + b^3 + c^3 \ge a^2 b + b^2 c + c^2 a$$

Hint: Start by showing:

$$\frac{a^3 + a^3 + b^3}{3} \ge a^2 b$$

Problem AMGM #3 Let a, b, c be positive real numbers. Show that:

$$\frac{c}{a} + \frac{a}{b + c} + \frac{b}{c} \ge 2$$

Hint Add 1 to each side and apply AM ≥ GM.

Example Problem Let $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ be real sequences, satisfying:

$$\sum_{i=1}^{n} |x_i|^p = 1, \qquad \sum_{i=1}^{n} |y_i|^q = 1$$

Here, the exponents p, q > 1 satisfy:

$$\frac{1}{p} + \frac{1}{q} = 1$$

Prove that

$$-1 \le \sum_{i=1}^{n} x_i y_i \le 1$$

Solution: Without loss of generality, we may assume that $x_i > 0$ and $y_i > 0$, otherwise we could increase the absolute value of the sum by replacing each x and y by their absolute value.
We apply Jensen’s inequality to the convex function $f(z) = z^p$,
writing:

$$w_i = y_i^q, \qquad z_i = \frac{x_i}{y_i^{q-1}}$$

These weights sum to 1 by hypothesis, and $\sum_i w_i z_i = \sum_i x_i y_i$.
Jensen’s inequality then implies:

$$\left[\sum_{i=1}^{n} x_i y_i\right]^p \le \sum_{i=1}^{n} y_i^q \frac{x_i^p}{y_i^{p(q-1)}} = \sum_{i=1}^{n} x_i^p = 1$$

In the middle step, the y’s cancel because the exponent is zero:

$$q - p(q - 1) = pq\left(\frac{1}{p} - 1 + \frac{1}{q}\right) = 0$$

Taking the pth root of the previous inequality gives the result we set
out to prove.

Remark. This result is more often stated in the equivalent form:

$$\left|\sum_{i=1}^{n} x_i y_i\right| \le \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p} \left(\sum_{i=1}^{n} |y_i|^q\right)^{1/q}$$

It is known as Hölder’s Inequality.
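A quick numerical check of Hölder’s inequality is sketched below. The helper names (`p_norm`, `holder_gap`) are hypothetical; the conjugate exponent is computed as q = p/(p − 1), which follows from 1/p + 1/q = 1. The equality example uses $y_i = x_i^{p-1}$, a standard case of equality that is not discussed in the slides.

```python
import random

def p_norm(v, p):
    """(sum |v_i|^p)^(1/p)."""
    return sum(abs(t) ** p for t in v) ** (1 / p)

def holder_gap(x, y, p):
    """Right hand side minus left hand side of Hölder's inequality (p > 1)."""
    q = p / (p - 1)  # conjugate exponent, so that 1/p + 1/q = 1
    inner = abs(sum(a * b for a, b in zip(x, y)))
    return p_norm(x, p) * p_norm(y, q) - inner

rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(8)]
y = [rng.uniform(-1, 1) for _ in range(8)]
for p in (1.25, 2.0, 5.0):
    print(p, holder_gap(x, y, p))

# Equality case for p = 3: y_i = x_i^(p-1) = x_i^2.
print(holder_gap([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], 3.0))
```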



Unit Balls and Duality


Hölder’s inequality is a special case of a profound result in the theory
of convex sets.
Let a unit ball $B \subset \mathbb{R}^n$ be a closed, bounded, convex set containing
a neighborhood of the origin. An example of such a unit ball is:

$$B = \left\{x \in \mathbb{R}^n : \sum_{i=1}^{n} |x_i|^p \le 1\right\}$$

Then the dual ball, $B'$, is defined by:

$$B' = \{y \in \mathbb{R}^n : x \cdot y \le 1, \ \forall x \in B\}$$

Hölder’s inequality identifies the dual ball in our example: it is the set of y with $\sum_{i=1}^{n} |y_i|^q \le 1$, where $1/p + 1/q = 1$. These are
shown in $\mathbb{R}^2$ for p = 5 and q = 1.25.
[Figure: the unit ball and its dual ball in $\mathbb{R}^2$ for p = 5 and q = 1.25, each plotted on axes running from −1.5 to 1.5.]

Generalising Factorials to Non-Integers


The factorial function n! is defined for non-negative integers n by:

0! = 1

1! = 1

2! = 2

n! = n × (n − 1)!

Question: Is there a natural generalisation of n! to non-integer n?


We could generalise n! by choosing an arbitrary function for 0 <
n < 1, for example 1, and then applying the recurrence relation for
larger n.
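This piecewise construction can be sketched in a few lines. The function name `naive_factorial` is hypothetical, and extending to −1 < x < 0 by running the recurrence backwards is an assumption made here to cover the range of the plot.

```python
import math

def naive_factorial(x):
    """Piecewise generalisation: 1 on [0, 1), extended by x! = x * (x - 1)!."""
    if x <= -1:
        raise ValueError("only defined here for x > -1")
    if x < 0:
        # Assumption: run the recurrence backwards, x! = (x + 1)! / (x + 1).
        return naive_factorial(x + 1) / (x + 1)
    result = 1.0
    while x >= 1:
        result *= x
        x -= 1
    return result  # the remaining x lies in [0, 1), where the value is 1

for x in (0, 0.5, 1, 1.5, 2.5, 5):
    print(x, naive_factorial(x))

# On (1, 2) this function equals x, so its logarithm is concave there.
mid = math.log10(naive_factorial(1.5))
ends = 0.5 * (math.log10(naive_factorial(1.1)) + math.log10(naive_factorial(1.9)))
print("log-convex at 1.5?", mid <= ends)
```

The last check shows why this choice is unsatisfying: the logarithm of the resulting function fails the convexity inequality between integers.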
Plotting this on a logarithmic scale, we get the following function:

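[Figure: the generalised factorial defined this way, plotted for −1 ≤ x ≤ 5 on a logarithmic vertical axis running from 0.1 to 100.]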

Convex Generalised Factorial


Let us suppose we want the generalised factorial to be convex when
plotted on a logarithmic scale (unlike the plot above).
In other words, we want to define x! for x > −1 such that

• $x! = x \times (x - 1)!$
• $x! = 10^{f(x)}$ where f (x) is a convex function.

In particular, taking 0 < x < 1, the definition of convexity implies:

$$f(n + x) \le (1 - x) f(n) + x f(n + 1)$$

$$f(n) \le x f(n - 1 + x) + (1 - x) f(n + x)$$

Raising 10 to the power of each side, we have:

$$(n + x)! \le (n!)^{1 - x} \, [(n + 1)!]^{x}$$

$$n! \le [(n - 1 + x)!]^{x} \, [(n + x)!]^{1 - x}$$

Now using the recurrence relation for the factorial, we have:

$$(n + x)! \le (n + 1)^x \times n!$$

$$n! \le (n + x)^{-x} \times (n + x)!$$

Putting these together, we have upper and lower bounds for (n + x)!:

$$(n + x)^x \, n! \le (n + x)! \le (n + 1)^x \times n!$$

Dividing through by (1 + x)(2 + x) . . . (n + x) we have:

$$\frac{(n + x)^x \, n!}{\prod_{j=1}^{n} (j + x)} \le x! \le \frac{(n + 1)^x \times n!}{\prod_{j=1}^{n} (j + x)}$$

This gives upper and lower bounds for x!. The chart below shows
the upper and lower bounds for 0 ≤ x ≤ 1 and n = 1 (red), n = 5
(orange), n = 10 (green) and the limit of large n (black):

[Figure: upper and lower bounds for x! on 0 ≤ x ≤ 1, with a vertical axis running from 0.8 to 0.95.]
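The bounds can be evaluated directly. In the sketch below, the function name `factorial_bounds` is hypothetical, and the ratio n!/∏(j + x) is accumulated as a product of j/(j + x) to avoid overflow for large n. The comparison with `math.gamma(x + 1)` relies on the standard fact, not stated in the slides, that the gamma function satisfies the same recurrence and log-convexity.

```python
import math

def factorial_bounds(x, n):
    """Lower and upper bounds for x! (0 <= x <= 1) using an integer n >= 1."""
    ratio = 1.0
    for j in range(1, n + 1):
        ratio *= j / (j + x)  # builds n! / ((1 + x)(2 + x)...(n + x))
    return (n + x) ** x * ratio, (n + 1) ** x * ratio

x = 0.5
for n in (1, 5, 10, 10_000):
    lower, upper = factorial_bounds(x, n)
    print(n, lower, upper)

print("gamma(1.5):", math.gamma(1.5), "sqrt(pi)/2:", math.sqrt(math.pi) / 2)
```

The ratio of the upper to the lower bound is ((n + 1)/(n + x))^x, which tends to 1 as n grows, so the two bounds squeeze x! to a single value.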

There is much more that can be said about this function, including
that $\left(\tfrac{1}{2}\right)! = \tfrac{\sqrt{\pi}}{2}$, but this will have to wait for another time!
