
Notes on Jensen’s inequality

There are various concrete representations of Jensen’s inequality.

Jensen’s inequality in Andrew Ng’s Lecture Notes

Let f be a convex function, and let X be a random variable. Then:

E[f (X)] ≥ f (EX)

Moreover, if f is strictly convex, then E[f (X)] = f (EX) holds true if and only if
X = E[X] with probability 1 (i.e., if X is a constant).

Jensen’s inequality also holds for concave functions f, but with the direction of all the
inequalities reversed (E[f (X)] ≤ f (EX), etc.).

For an interpretation of the theorem, consider the figure below.

[Figure: a convex function f (solid line), with a and b marked on the x-axis and f(a), f(b), f(E[X]), E[f(X)] marked on the y-axis.]

Here, f is a convex function shown by the solid line. Also, X is a random variable that
has a 0.5 chance of taking the value a, and a 0.5 chance of taking the value b (indicated
on the x-axis). Thus, the expected value of X is given by the midpoint between a and b.
We also see the values f (a), f (b) and f (E[X]) indicated on the y-axis. Moreover, the
value E[f (X)] is the midpoint on the y-axis between f (a) and f (b). From our
example, we see that because f is convex, it must be the case that E[f (X)] ≥ f (EX).
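
As a quick numerical check of this picture (not part of the original notes), the sketch below takes the convex function f(x) = x² with a = 1 and b = 3, both illustrative choices, and verifies E[f(X)] ≥ f(E[X]) for the two-point distribution described above.

```python
# Two-point illustration of Jensen's inequality: X equals a or b, each with probability 0.5.
# f(x) = x**2 is the convex function chosen for this example.

def f(x):
    return x ** 2

a, b = 1.0, 3.0
dist = {a: 0.5, b: 0.5}                            # distribution of X

E_X = sum(x * px for x, px in dist.items())        # E[X] = midpoint of a and b
E_fX = sum(f(x) * px for x, px in dist.items())    # E[f(X)] = midpoint of f(a) and f(b)

print(E_fX, f(E_X))       # 5.0 and 4.0
assert E_fX >= f(E_X)     # Jensen: E[f(X)] >= f(E[X])
```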

Jensen’s inequality in David McAllester’s Lecture Notes

Consider a probability distribution P on a set M and a function X assigning real values
X(m) for m ∈ M. If f is convex, then for any distribution P on M we have the following:

E_{m∼P}[f(X(m))] ≥ f(E_{m∼P}[X(m)])
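
A minimal sketch of this formulation, assuming a small made-up set M, a distribution P on it, real values X(m), and f = exp as the convex function (all choices are illustrative):

```python
import math

# Hypothetical finite set M, distribution P on M, and real values X(m).
M = ["m1", "m2", "m3"]
P = {"m1": 0.2, "m2": 0.5, "m3": 0.3}
X = {"m1": -1.0, "m2": 0.0, "m3": 2.0}

f = math.exp  # a convex function

lhs = sum(P[m] * f(X[m]) for m in M)    # E_{m~P}[f(X(m))]
rhs = f(sum(P[m] * X[m] for m in M))    # f(E_{m~P}[X(m)])
assert lhs >= rhs
print(lhs, rhs)
```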

Jensen’s inequality in Richard Yida Xu’s Lecture Notes

If Φ is a convex function and 0 < t < 1, then

Φ((1 − t) x1 + t x2) ≤ (1 − t) Φ(x1) + t Φ(x2)


With ∑_{i=1}^{n} p_i = 1, we can generalize the above inequality:

Φ(p_1 x_1 + p_2 x_2 + ... + p_n x_n) ≤ p_1 Φ(x_1) + p_2 Φ(x_2) + ... + p_n Φ(x_n)

that is,

Φ(∑_{i=1}^{n} p_i x_i) ≤ ∑_{i=1}^{n} p_i Φ(x_i)

If both x_i and f(x_i) are in the domain of Φ, we can replace x_i with f(x_i) and still get

Φ(∑_{i=1}^{n} p_i f(x_i)) ≤ ∑_{i=1}^{n} p_i Φ(f(x_i))
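
A small check of these weighted forms (a sketch: Φ(x) = |x| is used as the convex function, and the weights p_i are random numbers normalized to sum to 1):

```python
import random

def phi(x):
    return abs(x)    # a convex function

random.seed(0)
n = 5
raw = [random.random() for _ in range(n)]
p = [r / sum(raw) for r in raw]                   # weights p_i >= 0 summing to 1
x = [random.uniform(-3.0, 3.0) for _ in range(n)]

lhs = phi(sum(pi * xi for pi, xi in zip(p, x)))   # Phi(sum_i p_i x_i)
rhs = sum(pi * phi(xi) for pi, xi in zip(p, x))   # sum_i p_i Phi(x_i)
assert lhs <= rhs + 1e-12
print(lhs, rhs)
```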

For the continuous case with ∫_{x∈S} p(x) dx = 1, if both x and f(x) are in the domain of Φ, we get

Φ(∫_{x∈S} f(x) p(x) dx) ≤ ∫_{x∈S} Φ(f(x)) p(x) dx

In other words, the above inequality is

Φ(E[f(x)]) ≤ E[Φ(f(x))]
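
A numerical sketch of this continuous form, approximating the integrals with a Riemann sum over S = [0, 1] with the uniform density p(x) = 1, f(x) = x, and Φ(y) = y²; all of these choices are illustrative, not from the notes:

```python
# Riemann-sum check of Phi(integral f(x) p(x) dx) <= integral Phi(f(x)) p(x) dx on S = [0, 1].

N = 100_000
dx = 1.0 / N
xs = [(i + 0.5) * dx for i in range(N)]   # midpoints of the grid cells

p = lambda x: 1.0          # uniform density on [0, 1]
f = lambda x: x
phi = lambda y: y ** 2     # convex

lhs = phi(sum(f(x) * p(x) * dx for x in xs))   # Phi(E[f(x)]) ~ 0.25
rhs = sum(phi(f(x)) * p(x) * dx for x in xs)   # E[Phi(f(x))] ~ 1/3
assert lhs <= rhs
print(lhs, rhs)
```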

Jensen’s inequality from Wikipedia

Form involving a probability density function

Suppose Ω is a measurable subset of the real line and f(x) is a non-negative function
such that

∫_{−∞}^{∞} f(x) dx = 1

In probabilistic language, f is a probability density function.

Then Jensen’s inequality becomes the following statement about convex integrals:

If g is any real-valued measurable function and φ is convex over the range of g, then

φ(∫_{−∞}^{∞} g(x) f(x) dx) ≤ ∫_{−∞}^{∞} φ(g(x)) f(x) dx.

If g(x) = x, then this form of the inequality reduces to a commonly used special case:

φ(∫_{−∞}^{∞} x f(x) dx) ≤ ∫_{−∞}^{∞} φ(x) f(x) dx.
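
One familiar reading of this special case: taking φ(x) = x² gives (E[X])² ≤ E[X²], i.e. the variance of X is non-negative. Below is a Monte Carlo sketch of that instance; the exponential density and the sample size are arbitrary choices, not from the article:

```python
import random

random.seed(0)
# X ~ Exp(1), i.e. density f(x) = exp(-x) for x >= 0; E[X] = 1 and E[X^2] = 2.
samples = [random.expovariate(1.0) for _ in range(100_000)]

phi = lambda x: x ** 2   # convex

mean = sum(samples) / len(samples)                       # estimate of E[X]
mean_phi = sum(phi(x) for x in samples) / len(samples)   # estimate of E[X^2]

assert phi(mean) <= mean_phi   # phi(E[X]) <= E[phi(X)], i.e. Var(X) >= 0
print(phi(mean), mean_phi)
```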

Alternative finite form

Let Ω = {x_1, ..., x_n}, and take µ to be the counting measure on Ω; then the general
form reduces to a statement about sums:

φ(∑_{i=1}^{n} g(x_i) f(x_i)) ≤ ∑_{i=1}^{n} φ(g(x_i)) f(x_i)

provided that f(x_i) = λ_i ≥ 0 and

λ_1 + ⋯ + λ_n = 1

Gibbs’ inequality

If p(x) is the true probability distribution for x, and q(x) is another distribution, then
applying Jensen’s inequality to the random variable Y(x) = q(x)/p(x) and the convex
function φ(y) = −log(y) (with all expectations taken with respect to p) gives

E[φ(Y)] ≥ φ(E[Y])

Therefore:

KL(p(x) ∥ q(x)) = ∫ p(x) log(p(x)/q(x)) dx
                = −∫ p(x) log(q(x)/p(x)) dx
                ≥ −log(∫ p(x) (q(x)/p(x)) dx)
                = −log(∫ q(x) dx)
                = 0,

a result called Gibbs’ inequality.

It shows that the average message length is minimized when codes are assigned on the
basis of the true probabilities p rather than any other distribution q. The quantity that is
non-negative is called the Kullback–Leibler divergence of q from p.

Since −log(x) is a strictly convex function for x > 0, equality holds if and only if
p(x) equals q(x) almost everywhere.
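
A small numerical sketch (not from the notes): computing KL(p∥q) for two made-up discrete distributions, confirming it is positive when q differs from p and zero when q equals p:

```python
import math

def kl(p, q):
    # Kullback-Leibler divergence KL(p || q) for discrete distributions given as lists.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]   # the "true" distribution
q = [0.4, 0.4, 0.2]   # another distribution on the same support

print(kl(p, q))       # > 0, since q differs from p
print(kl(p, p))       # == 0, the equality case q = p
assert kl(p, q) >= 0 and abs(kl(p, p)) < 1e-12
```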

Notes:
Compared with the other notes, the versions from Richard Yida Xu and Wikipedia
better match the derivation of EM and the KL divergence.

Reference

David McAllester, Jensen’s Inequality (http://ttic.uchicago.edu/~dmcallester/ttic101-07/lectures/jensen/jensen.pdf)

Richard YiDa Xu, Expectation-Maximization (http://www-staff.it.uts.edu.au/~ydxu/ml_course/em.pdf)

Jensen’s inequality, Wikipedia (https://en.wikipedia.org/wiki/Jensen’s_inequality)
