
IV: Expectation

A modern crash course in intermediate Statistics and Probability

Paul Rognon

Barcelona School of Economics


Universitat Pompeu Fabra
Universitat Politècnica de Catalunya

1 / 19
Expectation of a random variable

If X is a random variable, then the expectation of X, if it exists, is

$$E(X) := \begin{cases} \sum_{x} x \, f_X(x) & \text{for discrete variables} \\[4pt] \int_{\mathbb{R}} x \, f_X(x)\,dx & \text{for continuous variables} \end{cases}$$

For example:
• if X ∼ Bern(p) then E(X) = p,
• if X ∼ N(0, 1) then E(X) = 0.

Properties of the expectation

• Linearity: $E\left(\sum_i a_i X_i\right) = \sum_i a_i E(X_i)$.
• If X, Y are independent, then E(XY) = E(X)E(Y).
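As a quick illustration (my addition, not part of the original slides), the following Python sketch estimates these quantities by Monte Carlo; the sample size and the choice p = 0.3 are arbitrary.

```python
# Monte Carlo sketch of the examples and properties above.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
p = 0.3  # arbitrary choice for the Bernoulli parameter

x = rng.binomial(1, p, size=n)    # X ~ Bern(p)
y = rng.standard_normal(size=n)   # Y ~ N(0, 1), independent of X

print(x.mean())                   # ≈ p
print(y.mean())                   # ≈ 0
# Linearity: E(2X + 3Y) = 2 E(X) + 3 E(Y)
print((2 * x + 3 * y).mean(), 2 * x.mean() + 3 * y.mean())
# Independence: E(XY) = E(X) E(Y)
print((x * y).mean(), x.mean() * y.mean())
```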

2 / 19
Higher order moments
k-th moment of X (k ≥ 1)
$E(X^k)$ is the k-th moment, if it exists.
$E\left[(X - E(X))^k\right]$ is the k-th central moment, if it exists.

Variance
The variance var(X) is the second central moment:
• $\mathrm{var}(X) = E\left[(X - E(X))^2\right] = E(X^2) - (E(X))^2$
• $\mathrm{var}(aX + b) = a^2\,\mathrm{var}(X)$
• if $X_1, \ldots, X_k$ are independent, then $\mathrm{var}\left(\sum_i a_i X_i\right) = \sum_i a_i^2\,\mathrm{var}(X_i)$

The standard deviation is defined as $\mathrm{sd}(X) = \sigma_X = \sqrt{\mathrm{var}(X)}$.

Expectation of g(X)
For any measurable function g, if the expectation E(g(X)) exists,

$$E(g(X)) = \int_{\mathbb{R}} g(x) f_X(x)\,dx \quad \text{(continuous)} \qquad \text{or} \qquad E(g(X)) = \sum_{x} g(x) f_X(x) \quad \text{(discrete)}$$
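A small numerical check of the E(g(X)) formula (my addition): for g(x) = x² and X ∼ N(0, 1), both the integral and a Monte Carlo average should return E(X²) = var(X) = 1.

```python
# Check E(g(X)) = ∫ g(x) f_X(x) dx two ways for g(x) = x², X ~ N(0, 1).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

g = lambda x: x**2
integral, _ = quad(lambda x: g(x) * norm.pdf(x), -np.inf, np.inf)

rng = np.random.default_rng(0)
mc = g(rng.standard_normal(1_000_000)).mean()

print(integral, mc)  # both ≈ 1
```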
3 / 19
Exercises

Compute expectations of:


a. an exponential variable Exp(λ)
b. U(0, 1)

Compute the variance of:


a. U(0, 1)
b. Exp(λ)

4 / 19
Covariance
The covariance cov(X, Y) between two random variables X and Y is defined by:

$$\mathrm{cov}(X, Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)$$

It is a measure of dependence:
• X ⊥⊥ Y ⟹ cov(X, Y) = 0
• cov(X, Y) ≠ 0 ⟹ X, Y are not independent

The correlation ρ(X, Y) is a standardized covariance and measure of dependence:

$$\rho(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}} \in [-1, 1].$$

Nota Bene: If X, Y are independent then cov(X, Y) = ρ(X, Y) = 0, but the converse is typically not true. Exceptions: X, Y both binary, or (X, Y) jointly Gaussian.
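The classic counterexample behind the Nota Bene, as a short simulation (my addition): take X ∼ N(0, 1) and Y = X². Then cov(X, Y) = E(X³) = 0 even though Y is a deterministic function of X, hence certainly not independent of it.

```python
# Uncorrelated but dependent: X ~ N(0, 1), Y = X².
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
y = x**2                           # fully determined by x

print(np.cov(x, y)[0, 1])          # ≈ 0
print(np.corrcoef(x, y)[0, 1])     # ≈ 0
```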
5 / 19
Properties of the covariance

• var(X) = cov(X, X) ≥ 0
• Symmetry: cov(X, Y) = cov(Y, X)
• cov(X + a, Y + b) = cov(X, Y)
• Bilinearity: cov(aX + bY, Z) = a cov(X, Z) + b cov(Y, Z)
• var(X + Y) = var(X) + var(Y) + 2 cov(X, Y), and
  var(X − Y) = var(X) + var(Y) − 2 cov(X, Y)
• more generally, $\mathrm{var}\left(\sum_i a_i X_i\right) = \sum_i a_i^2\,\mathrm{var}(X_i) + 2\sum_{i<j} a_i a_j\,\mathrm{cov}(X_i, X_j)$
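A quick numerical check of the last identity (my addition); the coefficients and the dependence between X and Y below are arbitrary choices.

```python
# Verify var(aX + bY) = a² var(X) + b² var(Y) + 2ab cov(X, Y).
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
y = 0.5 * x + rng.standard_normal(1_000_000)  # correlated with x
a, b = 2.0, -3.0

lhs = np.var(a * x + b * y)
rhs = (a**2 * np.var(x) + b**2 * np.var(y)
       + 2 * a * b * np.cov(x, y, bias=True)[0, 1])
print(lhs, rhs)  # agree up to Monte Carlo error
```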

6 / 19
Exercise

There are two coins: one fair coin and one coin where head has
probability 2/3. We pick at random one coin and toss it twice. Let X be
the Bernoulli variable such that X = 1 if we picked the fair coin, T1 and
T2 be the results of the two tosses. We define T1 and T2 as Bernoulli
random variables that take the value 1 if the result is head, 0 if tail.

1. Compute var(T1 ) and var(T2 ).


2. Compute cov(T1 , T2 ), are T1 and T2 independent?
3. We add gains to our experiment. If we get head in the first toss, we
earn 1/2, and if we get head in the second toss, we earn 1/4. Let G be
the total gain over the two tosses. Compute var(G).
4. We repeat the complete experiment with gain 3 times. Compute the
variance of the total gain over the three runs.

7 / 19
Sample moments
Let $X_1, \ldots, X_n$ be independent copies of X (a random sample).

Sample mean: $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$
Sample variance: $S_n = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2$

The sample mean is an estimator of the population mean, and the sample variance is an estimator of the population variance.

Expectation and variance of sample moments
If $\forall i$, $E(X_i) = \mu$ and $\mathrm{var}(X_i) = \sigma^2$, then
• $E(\bar{X}_n) = \mu$, $\mathrm{var}(\bar{X}_n) = \frac{\sigma^2}{n}$,
• $E(S_n) = \sigma^2$.

Distribution under normality

If $\forall i$, $X_i \sim N(\mu, \sigma^2)$, then $\bar{X}_n \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$ and $(n-1)\frac{S_n}{\sigma^2} \sim \chi^2_{n-1}$.
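A simulation sketch of these facts (my addition); the parameters µ = 2, σ = 3, n = 10 are arbitrary.

```python
# Check E(X̄) = µ, var(X̄) = σ²/n, and E(S_n) = σ² over many samples.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 2.0, 3.0, 10, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
s2 = samples.var(axis=1, ddof=1)     # ddof=1 gives the 1/(n-1) estimator

print(xbar.mean(), mu)               # E(X̄) ≈ µ
print(xbar.var(), sigma**2 / n)      # var(X̄) ≈ σ²/n
print(s2.mean(), sigma**2)           # E(S_n) ≈ σ²
```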

8 / 19
Moments of a random vector
Mean vector
Let X = (X1 , . . . , Xp ) be a vector of random variables with joint
distribution fX (x).
The expectation µ = E(X) of X is a vector whose i-th entry is $\mu_i = E(X_i)$.
The expectation is linear: for $A \in \mathbb{R}^{m \times p}$, $E(AX) = A\,E(X)$.

The (variance-)covariance matrix
The (variance-)covariance matrix var(X) of X is defined as:

$$\mathrm{var}(X) = E\left[(X - \mu)(X - \mu)^T\right]$$

var(X) is a positive semidefinite p × p matrix Σ whose (i, j)-th entry is

$$\Sigma_{ij} = \mathrm{cov}(X_i, X_j)$$

For $A \in \mathbb{R}^{m \times p}$, $\mathrm{var}(AX) = A\,\mathrm{var}(X)\,A^T$.
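A numpy check of var(AX) = A var(X) Aᵀ (my addition); Σ and A below are arbitrary example matrices.

```python
# Sample covariance of AX should approximate A Σ Aᵀ.
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])       # example covariance matrix (assumption)
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])          # example m x p matrix with m = p = 2

X = rng.multivariate_normal(mean=[0, 0], cov=Sigma, size=500_000)
AX = X @ A.T                         # each row is A x^(i)

print(np.cov(AX, rowvar=False))      # ≈ A Σ Aᵀ
print(A @ Sigma @ A.T)
```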

9 / 19
Exercise

In our coin tosses example, we define the random vector T as


T = (T1 , T2 )

1. Find the mean vector of T .


2. Find the covariance matrix of T .
3. Use properties of the covariance matrix to recover the variance of G .

10 / 19
Multivariate sample moments
Suppose that $\mathbf{X} \in \mathbb{R}^{n \times p}$ is a matrix of n replications of a random vector $X = (X_1, \ldots, X_p)$. Denote the rows of $\mathbf{X}$ by $x^{(1)}, \ldots, x^{(n)}$.

• The sample mean vector is
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x^{(i)} = \frac{1}{n}\mathbf{1}_n^T \mathbf{X} \in \mathbb{R}^p.$$
• The sample covariance (variance) matrix is
$$S = \frac{1}{n-1}\sum_{i=1}^{n} (x^{(i)} - \bar{x})(x^{(i)} - \bar{x})^T.$$
The diagonal entries are sample variances of the $X_i$ and the off-diagonal entries are sample covariances (estimators of $\mathrm{cov}(X_i, X_j)$).

If $\mathbf{X}$ is centred so that $\bar{x} = 0_p$, then this simplifies to

$$S = \frac{1}{n-1}\sum_{i=1}^{n} x^{(i)} (x^{(i)})^T = \frac{1}{n-1}\mathbf{X}^T\mathbf{X}.$$
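A short check (my addition) that the centred formula S = XᵀX/(n−1) agrees with numpy's built-in sample covariance; the mixing matrix below is an arbitrary example.

```python
# Centre the data, then compare XᵀX/(n-1) with np.cov.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3)) @ np.array([[1.0, 0.2, 0.0],
                                               [0.0, 1.0, 0.5],
                                               [0.0, 0.0, 1.0]])
Xc = X - X.mean(axis=0)                  # centre so that x̄ = 0
S = Xc.T @ Xc / (X.shape[0] - 1)

print(np.allclose(S, np.cov(X, rowvar=False)))  # True
```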
11 / 19
Basic inequalities
Markov's inequality
If X ≥ 0 and E(X) < ∞, then for every t > 0

$$P(X \geq t) \leq \frac{E(X)}{t}.$$

Chebyshev's inequality (follows from Markov's inequality)

Let E(X) = µ and var(X) = σ², then for every t > 0

$$P(|X - \mu| \geq t) \leq \frac{\sigma^2}{t^2}$$

and, in particular, $P\left(\left|\frac{X - \mu}{\sigma}\right| \geq t\right) \leq \frac{1}{t^2}$.

Jensen's inequality
If $g: \mathbb{R} \to \mathbb{R}$ is convex, then $E(g(X)) \geq g(E(X))$.
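An empirical sanity check of the three inequalities (my addition), using X ∼ Exp(1) so that µ = σ² = 1, and g(x) = x² for Jensen.

```python
# Compare empirical tail probabilities with the Markov/Chebyshev bounds.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(1.0, size=1_000_000)
t = 2.0

print((x >= t).mean(), x.mean() / t)           # Markov: P(X≥t) ≤ E(X)/t
print((np.abs(x - 1) >= t).mean(), 1 / t**2)   # Chebyshev: ≤ σ²/t²
print((x**2).mean(), x.mean()**2)              # Jensen: E(X²) ≥ (E X)²
```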
12 / 19
Exercise

1. We say X has an exponential distribution if $f_X(x) = \lambda e^{-\lambda x}\,\mathbb{1}_{\mathbb{R}_+}(x)$.
   Give a lower bound for P(X ≥ 1/3).
2. Using Markov's inequality, prove Chebyshev's inequality.
3. Let Z ∼ N(0, 1). Give an upper bound on P(|Z| ≥ 1.64).
4. Let Z ∼ N(2, 1). Give a lower bound for E(Z²).

13 / 19
Conditional expectation

Let X, Y have joint distribution $f_{X,Y}(x, y)$ and conditional $f_{X|Y}(x|y)$.

Then the conditional expectation E(X | Y = y) is the expectation of X with respect to the conditional distribution X | Y = y:

$$E(X \mid Y = y) = \begin{cases} \sum_{x} x \, f_{X|Y}(x|y) & \text{for discrete variables} \\[4pt] \int_{\mathbb{R}} x \, f_{X|Y}(x|y)\,dx & \text{for continuous variables.} \end{cases}$$

Note that when Y is not set at a fixed value y, E(X | Y) is a function of Y, and so a random variable!

Example: two binary variables


Suppose $f_{X,Y}(0, 0) = 0.4$, $f_{X,Y}(0, 1) = 0.2$, $f_{X,Y}(1, 0) = 0.1$, $f_{X,Y}(1, 1) = 0.3$. Find E(X | Y = 0).
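A worked version of the example (my addition): since $f_Y(0) = 0.4 + 0.1 = 0.5$, we get $E(X \mid Y = 0) = (0 \cdot 0.4 + 1 \cdot 0.1)/0.5 = 0.2$. The same computation in Python:

```python
# E(X | Y = y) = Σ_x x f(x, y) / f_Y(y), from the joint pmf table.
f = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}

y = 0
f_y = sum(p for (xv, yv), p in f.items() if yv == y)   # f_Y(0) = 0.5
e_x_given_y = sum(xv * p for (xv, yv), p in f.items() if yv == y) / f_y

print(e_x_given_y)  # 0.1 / 0.5 = 0.2
```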

14 / 19
Properties of conditional expectation

Let r be a measurable real function.


• E(r (X )|X ) = r (X )
• If X and Y are independent, E(r (X )|Y ) = E(r (X ))
• E(r (Y )X |Y ) = r (Y )E(X |Y ).
• E(a X + b Y |Z ) = a E(X |Z ) + b E(Y |Z )
 
• E E(X |Y ) = E(X )
 
• more generally, E E(r (X , Y )|Y ) = E(r (X , Y ))
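A simulation sketch of the tower property (my addition), for an assumed toy model Y ∼ Bern(1/2) and X | Y = y ∼ N(y, 1), where E(X | Y) = Y.

```python
# Tower property: the mean of E(X|Y) equals the mean of X.
import numpy as np

rng = np.random.default_rng(0)
y = rng.binomial(1, 0.5, size=1_000_000)
x = rng.normal(loc=y, scale=1.0)      # X | Y = y ~ N(y, 1)

e_x_given_y = y.astype(float)         # in this model E(X|Y) = Y exactly
print(e_x_given_y.mean(), x.mean())   # both ≈ E(X) = 0.5
```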

15 / 19
Exercise

Recall our coin tosses example. There are two coins: one fair coin and
one coin where head has probability 2/3. We pick at random one coin
and toss it twice.
Let X be the Bernoulli variable such that X = 1 if we picked the fair
coin, T1 and T2 be the results of the two tosses. We define T1 and T2 as
Bernoulli random variables that take the value 1 if the result is head, 0 if
tail.
If we get head in the first toss, we earn 1/2, and if we get head in the second
toss, we earn 1/4. Let G be the total gain over the two tosses.

1. Find the distribution of E(T1 |X )


2. Find the distribution of E(X |T1 )
3. Verify that E(E(X |T1 )) = E(X )
4. Find E(G |X = 0)

16 / 19
Moment generating function
The moment generating function, when it exists, is defined as:

$$M_X(t) = E(e^{tX}) = \begin{cases} \sum_i e^{t x_i}\, P(X = x_i) & \text{for a discrete variable} \\[4pt] \int_{\mathbb{R}} e^{tx} f_X(x)\,dx & \text{for a continuous variable} \end{cases}$$

We have:
$\frac{dM_X}{dt}(0) = E(X)$, and
more generally, $M_X^{(k)}(0) = E[X^k]$.

If $X = (X_1, \ldots, X_p)$ is a random vector we define

$$M_X(t) = E(\exp(\langle X, t \rangle)) = \int_{\mathbb{R}^p} \exp(\langle x, t \rangle) f_X(x)\,dx.$$

Derivatives evaluated at $t = 0_p$ give the corresponding moments.


NB: For a continuous variable, MX is the Laplace transform of the density.
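A symbolic sketch with sympy (my addition): deriving the MGF of Exp(λ), which converges for t < λ, and recovering the mean 1/λ from the derivative at 0.

```python
# MGF of the exponential distribution and its first moment.
import sympy as sp

t, x = sp.symbols("t x", real=True)
lam = sp.symbols("lambda", positive=True)

M = sp.integrate(sp.exp(t * x) * lam * sp.exp(-lam * x),
                 (x, 0, sp.oo), conds="none")   # valid for t < lambda
M = sp.simplify(M)
print(M)                                        # lambda/(lambda - t)
print(sp.diff(M, t).subs(t, 0))                 # 1/lambda, the mean
```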
17 / 19
Characteristic function
The characteristic function is defined for all ω as:

$$\varphi_X(\omega) = E(e^{i\omega X}) = \begin{cases} \sum_k e^{i\omega x_k}\, P(X = x_k), & \text{if } X \text{ is discrete} \\[4pt] \int_{-\infty}^{\infty} e^{i\omega x} f_X(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$

Properties
• if $X_1, \ldots, X_n$ are independent random variables and $S = \sum_i X_i$, then:

$$\varphi_S(\omega) = \prod_{k=1}^{n} \varphi_{X_k}(\omega)$$

• $\varphi_X$ uniquely specifies the probability law of X: two random variables have the same characteristic function if and only if they have the same distribution function.

NB: For a continuous variable, ϕX is the Fourier transform of the density.
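A quick Monte Carlo illustration of the product property (my addition); the distributions of X and Y and the frequency ω = 1.7 are arbitrary choices.

```python
# For independent X, Y, the characteristic function of S = X + Y factorises.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(1.0, size=1_000_000)
y = rng.binomial(1, 0.3, size=1_000_000)
w = 1.7                                    # an arbitrary frequency

phi = lambda z: np.exp(1j * w * z).mean()  # Monte Carlo estimate of E(e^{iωZ})
print(phi(x + y))
print(phi(x) * phi(y))                     # ≈ the same complex number
```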


18 / 19
Exercise

1. Find the moment generating function and the characteristic function of a Bernoulli random variable with parameter p.
2. Find the moment generating function and the characteristic function of a random variable with exponential distribution with parameter λ.
3. Use the moment generating function to recover the mean of a Bernoulli random variable with parameter p and the mean of a random variable with exponential distribution with parameter λ (1/λ).

19 / 19
