
Jensen

Uploaded by

arnavsinhal13
Copyright
© © All Rights Reserved

Jensen’s Inequality

Jensen’s inequality applies to convex functions. Intuitively, a function is
convex if it is “upward bending”; for example, $f(x) = x^2$ is convex. To make
this precise, consider two real numbers $x_1$ and $x_2$. The function $f$ is
convex if the line segment between the points $\langle x_1, f(x_1) \rangle$ and
$\langle x_2, f(x_2) \rangle$ never falls below the graph of $f$. To make this
even more precise, consider $p \in [0, 1]$ and the weighted average
$p x_1 + (1 - p) x_2$. This is a number between $x_1$ and $x_2$. For a given
function $f$ we can take the same weighted average of $f(x_1)$ and $f(x_2)$.
As $p$ goes from 1 to 0, the pairs
$\langle p x_1 + (1 - p) x_2,\; p f(x_1) + (1 - p) f(x_2) \rangle$
trace a line segment from the point $\langle x_1, f(x_1) \rangle$ to the point
$\langle x_2, f(x_2) \rangle$. Convexity requires that this segment lie
everywhere on or above the graph of $f$.

Definition: A function $f$ from the reals to the reals is convex if
for every $x_1$ and $x_2$ and every $p \in [0, 1]$ we have

$$p f(x_1) + (1 - p) f(x_2) \ge f(p x_1 + (1 - p) x_2).$$
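The definition can be spot-checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper name `is_convex_on_samples`, the sample grid, and the tolerance are all illustrative assumptions. Passing the check on a finite grid does not prove convexity; failing it does disprove it.

```python
def is_convex_on_samples(f, xs, ps, tol=1e-9):
    """Check p*f(x1) + (1-p)*f(x2) >= f(p*x1 + (1-p)*x2) on a finite grid.

    Hypothetical helper: it only tests the sampled x1, x2, p, so True means
    "no counterexample found", not a proof of convexity.
    """
    for x1 in xs:
        for x2 in xs:
            for p in ps:
                chord = p * f(x1) + (1 - p) * f(x2)   # point on the segment
                curve = f(p * x1 + (1 - p) * x2)      # point on the graph
                if chord < curve - tol:               # segment dips below graph
                    return False
    return True

xs = [i / 2 for i in range(-10, 11)]   # sample points in [-5, 5]
ps = [i / 10 for i in range(11)]       # weights 0, 0.1, ..., 1

square_ok = is_convex_on_samples(lambda x: x * x, xs, ps)       # x^2
neg_square_ok = is_convex_on_samples(lambda x: -x * x, xs, ps)  # -x^2
```

Here `square_ok` comes out true, while `neg_square_ok` is false because, for example, the segment between $x_1 = -5$ and $x_2 = 5$ lies below the graph of $-x^2$ at the midpoint.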

If $f$ is twice differentiable then $f$ is convex if and only if
$d^2 f/dx^2 \ge 0$ everywhere.


Now consider a probability distribution P on a set M and a function X
assigning real values X(m) for m ∈ M .

Theorem 1 (Jensen’s Inequality) If $f$ is convex then for any distribution
$P$ on $M$ we have the following.

$$E_{m \sim P}[f(X(m))] \ge f(E_{m \sim P}[X(m)])$$

Usually the right-hand side above, $f$ of an expectation, is simpler than
the left-hand side, the expectation of $f$. Jensen’s inequality is used to
bound the “complicated” expression $E[f(X)]$ from below by the simpler
expression $f(E[X])$. Often these expressions are actually very close to each
other. (Assuming that these expressions are equal is called the mean field
approximation.)
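To make the comparison between $E[f(X)]$ and $f(E[X])$ concrete, here is a small Python sketch on a three-point distribution with the convex function $f = \exp$. The helper names, probabilities, and values are illustrative assumptions, not taken from the notes. It suggests, for this example only, that the two sides are far apart when the values of $X$ are spread out and nearly equal when they are tightly concentrated.

```python
import math

def expectation(probs, values):
    """Expectation of a finite distribution given as parallel lists."""
    return sum(p * v for p, v in zip(probs, values))

def jensen_sides(f, probs, xs):
    """Return the pair (E[f(X)], f(E[X])) for a finite distribution."""
    lhs = expectation(probs, [f(x) for x in xs])
    rhs = f(expectation(probs, xs))
    return lhs, rhs

probs = [0.2, 0.5, 0.3]  # illustrative probabilities summing to 1

# Widely spread values: the two sides differ noticeably.
wide_lhs, wide_rhs = jensen_sides(math.exp, probs, [-1.0, 0.5, 2.0])

# Tightly concentrated values: the two sides nearly agree.
tight_lhs, tight_rhs = jensen_sides(math.exp, probs, [0.64, 0.65, 0.66])

print(wide_lhs - wide_rhs, tight_lhs - tight_rhs)
```

In both cases the left-hand side is at least the right-hand side, as the theorem requires; only the size of the gap changes.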
We prove Jensen’s inequality only for the case where $M$ is a finite set
$\{m_1, \ldots, m_k\}$. Let $x_i$ abbreviate $X(m_i)$ and $p_i$ abbreviate
$P(m_i)$. First consider the case where $M$ contains only two elements. In
this case $p_2 = 1 - p_1$, so the definition of convexity gives the following.

$$\begin{aligned}
E_{m \sim P}[f(X(m))] &= p_1 f(x_1) + p_2 f(x_2) \\
&\ge f(p_1 x_1 + p_2 x_2) \\
&= f(E_{m \sim P}[X(m)])
\end{aligned}$$

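Note that the definition of convexity is simply the statement that Jensen’s
inequality holds for two-point distributions. We prove Jensen’s inequality for
finite $M$ by induction on the number of elements of $M$. Suppose $M$ contains
$k \ge 3$ elements, ordered so that $p_1 + p_2 > 0$, and assume that Jensen’s
inequality holds for distributions on $k - 1$ points. We now have the
following, where the first inequality follows from the definition of convexity
and the second from the induction hypothesis, applied to the distribution on
$k - 1$ points that merges $m_1$ and $m_2$ into a single point of probability
$p_1 + p_2$.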

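$$\begin{aligned}
E_{m \sim P}[f(X(m))]
&= p_1 f(x_1) + p_2 f(x_2) + p_3 f(x_3) + \cdots + p_k f(x_k) \\
&= (p_1 + p_2) \left( \frac{p_1}{p_1 + p_2} f(x_1) + \frac{p_2}{p_1 + p_2} f(x_2) \right) + p_3 f(x_3) + \cdots + p_k f(x_k) \\
&\ge (p_1 + p_2) f\left( \frac{p_1}{p_1 + p_2} x_1 + \frac{p_2}{p_1 + p_2} x_2 \right) + p_3 f(x_3) + \cdots + p_k f(x_k) \\
&\ge f\left( (p_1 + p_2) \left( \frac{p_1 x_1}{p_1 + p_2} + \frac{p_2 x_2}{p_1 + p_2} \right) + p_3 x_3 + \cdots + p_k x_k \right) \\
&= f(p_1 x_1 + p_2 x_2 + p_3 x_3 + \cdots + p_k x_k) \\
&= f(E_{m \sim P}[X(m)])
\end{aligned}$$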
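The induction can be unrolled into a loop that repeatedly merges the first two points into their weighted average. The Python sketch below is an illustration under assumptions, not part of the original proof: the function names and the example distribution are hypothetical, and the code assumes the first two probabilities always have a positive sum. Each merge applies the two-point convexity step once, so the recorded values should never increase, and the final value is $f(E[X])$.

```python
def merge_first_two(probs, xs):
    """Replace the first two points by their probability-weighted average.

    Assumes probs[0] + probs[1] > 0 (illustrative restriction).
    """
    p = probs[0] + probs[1]
    x = (probs[0] * xs[0] + probs[1] * xs[1]) / p
    return [p] + probs[2:], [x] + xs[2:]

def jensen_chain(f, probs, xs):
    """Values of sum_i p_i f(x_i) after each merge; the last one is f(E[X])."""
    chain = [sum(p * f(x) for p, x in zip(probs, xs))]
    while len(probs) > 1:
        probs, xs = merge_first_two(probs, xs)
        chain.append(sum(p * f(x) for p, x in zip(probs, xs)))
    return chain

# Hypothetical four-point distribution with the convex function f(x) = x^2.
chain = jensen_chain(lambda x: x * x, [0.1, 0.2, 0.3, 0.4], [1.0, -2.0, 3.0, 0.5])
print(chain)  # starts at E[f(X)] and ends at f(E[X])
```

The first entry of `chain` is $E[f(X)]$ and the last is $f(E[X])$, so a non-increasing chain is exactly the sequence of inequalities in the derivation above.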

The definition of convexity generalizes to the case where $f$ is a function
from vectors to reals and $x_1$ and $x_2$ are taken to be vectors. Jensen’s
inequality also generalizes to the case where $X(m)$ is a vector. In this case
$E_{m \sim P}[X(m)]$ is an average vector. In the vector case the above
definitions and derivations go through unchanged.
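As a rough illustration of the vector case, the Python sketch below checks Jensen’s inequality for the squared Euclidean norm, which is a convex function from vectors to reals. The helper names, the probabilities, and the three example vectors are assumptions chosen for illustration only.

```python
def sq_norm(v):
    """Squared Euclidean norm, a convex function from vectors to reals."""
    return sum(c * c for c in v)

def mean_vector(probs, vectors):
    """Componentwise expectation of a finite distribution over vectors."""
    dim = len(vectors[0])
    return [sum(p * v[i] for p, v in zip(probs, vectors)) for i in range(dim)]

probs = [0.25, 0.25, 0.5]                        # illustrative distribution
vectors = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]  # X(m) for each of three m

lhs = sum(p * sq_norm(v) for p, v in zip(probs, vectors))  # E[f(X)]
rhs = sq_norm(mean_vector(probs, vectors))                 # f(E[X])
print(lhs, rhs)
```

For these values the average vector is $(-0.25, 1)$, and the expectation of the squared norm exceeds the squared norm of the average vector.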

1 Problems
1. Which of the following functions are convex? (Hint: compute the second
derivative.)
• $x^3$ (on all reals)

• $2x^2 - 3x + 1$ (on all reals)

• $x^2 - \ln x$ (on positive reals only)

• $\ln\left(\frac{1}{1 + e^x}\right)$ (on all reals)

2. Show that if $f$ and $g$ are convex then so are $f + g$ and $\max(f, g)$.


3. Use Jensen’s inequality to show that if $f$ is convex and $a_i > 0$ then we
have the following.

$$\sum_i a_i f(x_i) \ge \left( \sum_i a_i \right) f\left( \frac{\sum_i a_i x_i}{\sum_i a_i} \right)$$
