
Mathematics for Economics and Finance (Fall 2023)

Problem Set 10 Solutions: Probability & Statistics


Professor: Norman Schürhoff
You do not need to hand in any solutions!
1 December 2023

1 Joint density, marginal density, conditional density


The conditional density of Y given X = x is
\[
f_{Y|X}(y \mid x) = \frac{2y + 4x}{1 + 4x},
\]
and the marginal density of X is
\[
f_X(x) = \frac{1 + 4x}{3},
\]
for 0 < x < 1 and 0 < y < 1. Find
1. the joint density $f_{X,Y}(x, y)$,
2. the marginal density of Y, $f_Y(y)$, and
3. the conditional density of X given Y = y, $f_{X|Y}(x \mid y)$.

Answer:
1. From Definition 4.18 in the lecture notes we know that the conditional density of a random variable Y given another random variable X = x is
\[
f_{Y|X}(y \mid x) =
\begin{cases}
\dfrac{f_{X,Y}(x, y)}{f_X(x)}, & \text{if } f_X(x) > 0; \\
0, & \text{otherwise.}
\end{cases}
\]
Since $f_X(x) = \frac{1+4x}{3} > 0$ for all $x \in (0, 1)$, the joint density of $X, Y$ is
\[
f_{X,Y}(x, y) = f_{Y|X}(y \mid x)\, f_X(x) = \frac{2y + 4x}{1 + 4x} \cdot \frac{1 + 4x}{3} = \frac{2y + 4x}{3}.
\]
The joint cumulative distribution function is the integral of this density,
\[
F_{X,Y}(x, y) = P(X \leq x, Y \leq y) = \int_0^y \int_0^x f_{X,Y}(a, b)\, da\, db = \int_0^y \int_0^x \frac{2b + 4a}{3}\, da\, db = \frac{y^2 x + 2x^2 y}{3}.
\]
2. The marginal density of Y is obtained by integrating out x,
\[
f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx = \int_0^1 \frac{2y + 4x}{3}\, dx = \frac{2y + 2}{3}.
\]
The corresponding marginal distribution of Y is then
\[
F_Y(y) = P(Y \leq y) = \int_0^y f_Y(b)\, db = \int_0^y \frac{2b + 2}{3}\, db = \frac{y^2 + 2y}{3}.
\]
3. Finally, since $f_Y(y) = \frac{2y+2}{3} > 0$ for all $y \in (0, 1)$, the conditional density of X given Y = y is
\[
f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = \frac{(2y + 4x)/3}{(2y + 2)/3} = \frac{y + 2x}{y + 1}.
\]
The corresponding conditional cumulative distribution function is
\[
F_{X|Y}(x \mid y) = P(X \leq x \mid Y = y) = \int_0^x f_{X|Y}(a \mid y)\, da = \int_0^x \frac{y + 2a}{y + 1}\, da = \frac{yx + x^2}{y + 1}.
\]
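As a numerical sanity check (not part of the original solution; the helper names are illustrative), the short Python sketch below verifies that the derived joint density integrates to one on the unit square, recovers $f_Y$ by integrating out $x$, and confirms that the conditional density of X given Y = y integrates to one.
\begin{verbatim}
# Numerical sanity check for Problem 1 (uses the densities derived above).
from scipy import integrate

f_joint = lambda x, y: (2 * y + 4 * x) / 3          # f_{X,Y}(x, y) on (0,1)^2
f_Y = lambda y: (2 * y + 2) / 3                      # marginal density of Y
f_X_given_Y = lambda x, y: (y + 2 * x) / (y + 1)     # conditional of X given Y = y

# Joint density integrates to 1 over the unit square.
total, _ = integrate.dblquad(lambda y, x: f_joint(x, y), 0, 1, 0, 1)
print(total)                                          # ~1.0

# Marginal of Y recovered by integrating out x, checked at y = 0.3.
marg, _ = integrate.quad(lambda x: f_joint(x, 0.3), 0, 1)
print(marg, f_Y(0.3))                                 # both ~0.8667

# Conditional density of X given Y = 0.3 integrates to 1.
cond, _ = integrate.quad(lambda x: f_X_given_Y(x, 0.3), 0, 1)
print(cond)                                           # ~1.0
\end{verbatim}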

2 Conditional moments

For random variables $X, Y \sim N(0, \sigma^2)$ determine
1. $E(X \mid X^2)$,
2. $E(X \mid XY)$,
3. $E(X^2 + Y^2 \mid X + Y)$.

Answer:
1. Since the normal distribution is symmetric around zero, knowing $X^2 = x$ one can immediately tell that $X = \sqrt{x}$ with probability 1/2 or $X = -\sqrt{x}$ with probability 1/2. Thus
\[
E(X \mid X^2 = x) = P(X = \sqrt{x}) \cdot \sqrt{x} + P(X = -\sqrt{x}) \cdot (-\sqrt{x}) = \tfrac{1}{2}\sqrt{x} + \tfrac{1}{2}(-\sqrt{x}) = 0.
\]
Another way to show this is to notice that $-X \sim N(0, \sigma^2)$, hence
\[
E(X \mid X^2) = E\big(-X \mid (-X)^2\big) = E(-X \mid X^2) = -E(X \mid X^2) \;\Longrightarrow\; E(X \mid X^2) = 0.
\]
2. One may note that $(-X, -Y)$ has the same distribution as $(X, Y)$. Therefore
\[
E(X \mid XY) = -E(-X \mid XY) = -E\big(-X \mid (-X)(-Y)\big) = -E(X \mid XY),
\]
because $(-X, -Y)$ have the same distribution as $(X, Y)$. Thus $E(X \mid XY) = -E(X \mid XY)$, which gives $E(X \mid XY) = 0$.
 
3. The sign-flip argument from part 2 does not apply here, since $(-X)^2 + (-Y)^2 = X^2 + Y^2$; the symmetry only implies that $E(X^2 + Y^2 \mid X + Y)$ is an even function of $X + Y$. Indeed, since $X^2 + Y^2 \geq 0$, its conditional expectation cannot be zero. Assuming X and Y are independent, let $S = X + Y \sim N(0, 2\sigma^2)$. Then $E(X \mid S) = S/2$ and $\mathrm{Var}(X \mid S) = \sigma^2 - \frac{\mathrm{Cov}(X, S)^2}{\mathrm{Var}(S)} = \sigma^2 - \frac{\sigma^4}{2\sigma^2} = \frac{\sigma^2}{2}$, so $E(X^2 \mid S) = \mathrm{Var}(X \mid S) + \big(E(X \mid S)\big)^2 = \frac{\sigma^2}{2} + \frac{S^2}{4}$, and by symmetry the same holds for $Y$. Hence
\[
E(X^2 + Y^2 \mid X + Y) = \sigma^2 + \frac{(X + Y)^2}{2}.
\]
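The following Monte Carlo sketch (illustrative only; it assumes X and Y are independent $N(0, \sigma^2)$, consistent with part 3 above) approximates each conditional expectation by averaging over draws that fall in a thin slice around the conditioning value.
\begin{verbatim}
# Monte Carlo check for Problem 2 (illustrative; X, Y independent N(0, sigma^2)).
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 1.5, 2_000_000
X = rng.normal(0, sigma, n)
Y = rng.normal(0, sigma, n)

# E(X | X^2 = x): average X over draws with X^2 close to x = 1.0.
sel = np.abs(X**2 - 1.0) < 0.01
print(X[sel].mean())                       # ~0

# E(X | XY = c): average X over draws with XY close to c = 0.5.
sel = np.abs(X * Y - 0.5) < 0.01
print(X[sel].mean())                       # ~0

# E(X^2 + Y^2 | X + Y = s): compare with sigma^2 + s^2/2 at s = 1.0.
sel = np.abs(X + Y - 1.0) < 0.01
print((X[sel]**2 + Y[sel]**2).mean(), sigma**2 + 1.0**2 / 2)   # both ~2.75
\end{verbatim}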

3 Estimators
Let $X_1, \ldots, X_n$ be i.i.d. $N(\mu, \sigma^2)$.
1. Show that the sample variance $S^2$ is an unbiased estimator of $\sigma^2$.
2. Compute the MSE of this estimator. (Hint: $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$ and $\mathrm{Var}(\chi^2_{n-1}) = 2(n-1)$.)
3. An alternative estimator of $\sigma^2$ is the maximum likelihood estimator $\hat{\sigma}^2$; show that $\hat{\sigma}^2 = \frac{n-1}{n} S^2$.
4. Is $\hat{\sigma}^2$ a biased estimator? What is the variance of $\hat{\sigma}^2$?
5. Show that $\hat{\sigma}^2$ has a smaller MSE than $S^2$. Explain why.

Answer:

1. First,
\[
E(\bar{X}) = E\Big(\frac{1}{n}\sum_{i=1}^n X_i\Big) = \frac{1}{n}\, n\, E(X_1) = \mu;
\]
\[
\mathrm{Var}(\bar{X}) = \mathrm{Var}\Big(\frac{1}{n}\sum_{i=1}^n X_i\Big) = \frac{1}{n^2}\,\mathrm{Var}\Big(\sum_{i=1}^n X_i\Big) = \frac{1}{n^2}\, n\, \mathrm{Var}(X_1) = \frac{\sigma^2}{n};
\]
\[
E(\bar{X}^2) = \mathrm{Var}(\bar{X}) + (E\bar{X})^2 = \frac{\sigma^2}{n} + \mu^2.
\]
Therefore,
\begin{align*}
E(S^2) &= E\Big\{\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2\Big\} \\
&= \frac{1}{n-1}\, E\Big\{\sum_{i=1}^n \big(X_i^2 - 2X_i\bar{X} + \bar{X}^2\big)\Big\} \\
&= \frac{1}{n-1}\Big\{\sum_{i=1}^n E(X_i^2) - 2E\Big(\bar{X}\underbrace{\textstyle\sum_{i=1}^n X_i}_{= n\bar{X}}\Big) + \sum_{i=1}^n E(\bar{X}^2)\Big\} \\
&= \frac{1}{n-1}\big\{n E(X_1^2) - 2n E(\bar{X}^2) + n E(\bar{X}^2)\big\} \\
&= \frac{1}{n-1}\Big\{n(\sigma^2 + \mu^2) - n\Big(\frac{\sigma^2}{n} + \mu^2\Big)\Big\} \\
&= \sigma^2.
\end{align*}
So the sample variance is an unbiased estimator of $\sigma^2$.
2. We get
\[
\mathrm{Var}\Big(\frac{(n-1)S^2}{\sigma^2}\Big) = 2(n-1) \;\Rightarrow\; \frac{(n-1)^2}{\sigma^4}\,\mathrm{Var}(S^2) = 2(n-1) \;\Rightarrow\; \mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1}.
\]
Since $S^2$ is unbiased, $\mathrm{MSE}(S^2) = \mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1}$.

3. Compute the MLE $\hat{\sigma}^2$. The likelihood is
\[
L(\mu, \sigma^2 \mid X) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\Big\{-\frac{1}{2}\sum_{i=1}^n \frac{(x_i - \mu)^2}{\sigma^2}\Big\};
\qquad
\ln L = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2}\sum_{i=1}^n \frac{(x_i - \mu)^2}{\sigma^2}.
\]
FOC:
\[
\frac{\partial \ln L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^n (x_i - \mu) = 0 \quad (1)
\]
\[
\frac{\partial \ln L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (x_i - \mu)^2 = 0 \quad (2)
\]
From (1), $\hat{\mu} = \bar{X}$; substituting into (2) gives
\[
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{X})^2 = \frac{n-1}{n}\, S^2.
\]

4. $E(\hat{\sigma}^2) = E\big(\frac{n-1}{n} S^2\big) = \frac{n-1}{n}\sigma^2$ (clearly, it is biased).
Bias: $E(\hat{\sigma}^2 - \sigma^2) = \frac{n-1}{n}\sigma^2 - \sigma^2 = -\frac{1}{n}\sigma^2$ (note: as $n \to \infty$, the bias $\to 0$).
\[
\mathrm{Var}(\hat{\sigma}^2) = \mathrm{Var}\Big(\frac{n-1}{n} S^2\Big) = \Big(\frac{n-1}{n}\Big)^2 \mathrm{Var}(S^2) = \frac{2(n-1)\sigma^4}{n^2}.
\]
5. The MSE of $\hat{\sigma}^2$ is given as
\[
E(\hat{\sigma}^2 - \sigma^2)^2 = \mathrm{Var}(\hat{\sigma}^2) + \big(\mathrm{Bias}(\hat{\sigma}^2)\big)^2 = \frac{2(n-1)\sigma^4}{n^2} + \frac{1}{n^2}\sigma^4 = \frac{2n-1}{n^2}\sigma^4.
\]
Comparing $\mathrm{MSE}(\hat{\sigma}^2)$ with $\mathrm{MSE}(S^2)$, we find
\[
\frac{2n-1}{n^2}\sigma^4 < \frac{2}{n-1}\sigma^4 \;\Rightarrow\; \mathrm{MSE}(\hat{\sigma}^2) < \mathrm{MSE}(S^2).
\]
This shows there is a trade-off between bias and variance: $\hat{\sigma}^2$ accepts a small bias in exchange for a lower variance, and on net its MSE is smaller.
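A small simulation sketch (not part of the original solution; sample size and parameter values are arbitrary) illustrating parts 1, 4, and 5: it checks the unbiasedness of $S^2$, the downward bias of $\hat{\sigma}^2$, and the MSE ranking.
\begin{verbatim}
# Simulation check for Problem 3 (illustrative): unbiasedness of S^2, bias of
# the MLE sigma_hat^2 = (n-1)/n * S^2, and the MSE comparison.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma2, n, reps = 2.0, 4.0, 10, 200_000

X = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
S2 = X.var(axis=1, ddof=1)           # sample variance S^2
sig2_hat = X.var(axis=1, ddof=0)     # MLE, equals (n-1)/n * S^2

print(S2.mean(), sigma2)                              # ~4.0 : S^2 is unbiased
print(sig2_hat.mean(), (n - 1) / n * sigma2)          # ~3.6 : MLE is biased down
print(S2.var(), 2 * sigma2**2 / (n - 1))              # Var(S^2) = 2 sigma^4/(n-1)
print(((S2 - sigma2)**2).mean(), 2 * sigma2**2 / (n - 1))             # MSE(S^2)
print(((sig2_hat - sigma2)**2).mean(), (2*n - 1) * sigma2**2 / n**2)  # smaller MSE
\end{verbatim}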

4 MLE, Fisher Information Matrix


$X_1, \ldots, X_n$ is a sample from a normal distribution with mean $\theta_1$ and variance 1, $Y_1, \ldots, Y_n$ is a sample from a normal distribution with mean $\theta_2$ and variance 2, and $Z_1, \ldots, Z_n$ is a sample from a normal distribution with mean $\theta_1 + \theta_2$ and variance 4. All three samples are independent.
1. Find the maximum likelihood estimators of θ1 and θ2 .
2. Find the Fisher Information Matrix.
3. Use it to find the asymptotic distribution of the MLEs.

Answer:
1. The log-likelihood function is
\[
\ln L(\theta_1, \theta_2 \mid X, Y, Z) = \underbrace{c}_{\text{constant}} - \frac{1}{2}\sum_{i=1}^n \Big[(X_i - \theta_1)^2 + \frac{1}{2}(Y_i - \theta_2)^2 + \frac{1}{4}(Z_i - \theta_1 - \theta_2)^2\Big].
\]
The FOC are
\[
\frac{\partial}{\partial \theta_1} \ln L = n\Big[(\bar{X} - \theta_1) + \frac{1}{4}(\bar{Z} - \theta_1 - \theta_2)\Big] = 0 \;\Rightarrow\; 4\bar{X} + \bar{Z} = 5\theta_1 + \theta_2,
\]
\[
\frac{\partial}{\partial \theta_2} \ln L = n\Big[\frac{1}{2}(\bar{Y} - \theta_2) + \frac{1}{4}(\bar{Z} - \theta_1 - \theta_2)\Big] = 0 \;\Rightarrow\; 2\bar{Y} + \bar{Z} = \theta_1 + 3\theta_2.
\]
Solving for $\hat{\theta}_1$ and $\hat{\theta}_2$,
\[
\hat{\theta}_1 = \frac{6\bar{X} - \bar{Y} + \bar{Z}}{7}, \qquad
\hat{\theta}_2 = \frac{-2\bar{X} + 5\bar{Y} + 2\bar{Z}}{7}.
\]

2. Take n = 1. The vector of scores is
\[
\frac{\partial \ln f(\theta_1, \theta_2)}{\partial \theta}
= \frac{\partial}{\partial \theta}\Big(\underbrace{c}_{\text{constant}} - \frac{1}{2}\Big[(X - \theta_1)^2 + \frac{1}{2}(Y - \theta_2)^2 + \frac{1}{4}(Z - \theta_1 - \theta_2)^2\Big]\Big)
= \begin{pmatrix}
(X - \theta_1) + \frac{1}{4}(Z - \theta_1 - \theta_2) \\[2pt]
\frac{1}{2}(Y - \theta_2) + \frac{1}{4}(Z - \theta_1 - \theta_2)
\end{pmatrix}.
\]
The Fisher Information Matrix:
\[
I(\theta_1, \theta_2) = -E\Big[\frac{\partial^2 \ln f(\theta_1, \theta_2)}{\partial \theta\, \partial \theta'}\Big]
= -E\begin{pmatrix} -1 - \frac{1}{4} & -\frac{1}{4} \\[2pt] -\frac{1}{4} & -\frac{1}{2} - \frac{1}{4} \end{pmatrix}
= \begin{pmatrix} \frac{5}{4} & \frac{1}{4} \\[2pt] \frac{1}{4} & \frac{3}{4} \end{pmatrix}.
\]

3. The inverse of the information matrix:
\[
I(\theta_1, \theta_2)^{-1}
= \begin{pmatrix} \frac{5}{4} & \frac{1}{4} \\[2pt] \frac{1}{4} & \frac{3}{4} \end{pmatrix}^{-1}
= \frac{1}{\frac{15}{16} - \frac{1}{16}} \begin{pmatrix} \frac{3}{4} & -\frac{1}{4} \\[2pt] -\frac{1}{4} & \frac{5}{4} \end{pmatrix}
= \frac{8}{7} \begin{pmatrix} \frac{3}{4} & -\frac{1}{4} \\[2pt] -\frac{1}{4} & \frac{5}{4} \end{pmatrix}
= \frac{1}{7} \begin{pmatrix} 6 & -2 \\ -2 & 10 \end{pmatrix}.
\]
The asymptotic distribution of the MLE:
\[
\sqrt{n}\begin{pmatrix} \hat{\theta}_1 - \theta_1 \\ \hat{\theta}_2 - \theta_2 \end{pmatrix}
\;\xrightarrow{d}\; N\big(0, I^{-1}\big) = N\Big(0, \frac{1}{7}\begin{pmatrix} 6 & -2 \\ -2 & 10 \end{pmatrix}\Big).
\]
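The closed-form MLEs and the asymptotic covariance can be checked by simulation; the sketch below (illustrative, with arbitrary parameter values) compares the sample covariance of $\sqrt{n}(\hat{\theta} - \theta)$ across replications with $I^{-1}(\theta) = \frac{1}{7}\begin{pmatrix} 6 & -2 \\ -2 & 10 \end{pmatrix}$.
\begin{verbatim}
# Simulation check for Problem 4 (illustrative): closed-form MLEs and the
# asymptotic covariance I^{-1} = (1/7) [[6, -2], [-2, 10]].
import numpy as np

rng = np.random.default_rng(0)
theta1, theta2, n, reps = 1.0, -0.5, 200, 10_000

X = rng.normal(theta1, 1.0, size=(reps, n))
Y = rng.normal(theta2, np.sqrt(2.0), size=(reps, n))
Z = rng.normal(theta1 + theta2, 2.0, size=(reps, n))
Xb, Yb, Zb = X.mean(axis=1), Y.mean(axis=1), Z.mean(axis=1)

theta1_hat = (6 * Xb - Yb + Zb) / 7
theta2_hat = (-2 * Xb + 5 * Yb + 2 * Zb) / 7

# sqrt(n) * (theta_hat - theta) should have covariance close to I^{-1}.
dev = np.sqrt(n) * np.column_stack([theta1_hat - theta1, theta2_hat - theta2])
print(np.cov(dev, rowvar=False))
print(np.array([[6, -2], [-2, 10]]) / 7)
\end{verbatim}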

5 Cramér-Rao lower bound


Consider the density function $f(x; \theta) = \frac{e^{-x+\theta}}{(1 + e^{-x+\theta})^2}$, $x \in \mathbb{R}$, with $EX = \theta$ and $\mathrm{Var}\,X = \frac{\pi^2}{3}$. Consider the following estimator of $\theta$: $\hat{\theta} = \bar{X}$.

1. Show that it is unbiased.


2. Show that it is not the best unbiased estimator (BUE) of θ. Hint: compute the Cramér-Rao lower bound.

Answer:
1. Unbiasedness:
\[
E(\hat{\theta}) = E(\bar{X}) = E\Big(\frac{\sum_{i=1}^n X_i}{n}\Big) = \frac{n\, EX}{n} = EX = \theta.
\]
2. We have $-\frac{\partial^2}{\partial \theta^2} \ln f(x; \theta) = 2 f(x; \theta)$ (obtained by direct computation!). Thus
\[
I(\theta) = E\Big(-\frac{\partial^2}{\partial \theta^2} \ln f(x; \theta)\Big) = 2\, E f(x; \theta) = 2\int_{-\infty}^{\infty} f(x; \theta) \cdot f(x; \theta)\, dx = 2\int_{-\infty}^{\infty} f^2(x; \theta)\, dx.
\]
Substituting $t = e^{-x+\theta}$ (so that $dt = -t\, dx$) and integrating by parts,
\[
2\int_{-\infty}^{\infty} \frac{\big(e^{-x+\theta}\big)^2}{(1 + e^{-x+\theta})^4}\, dx
= 2\int_0^{\infty} \frac{t}{(1+t)^4}\, dt
= 2\int_0^{\infty} t\, d\Big(-\frac{(1+t)^{-3}}{3}\Big)
= \Big[-\frac{2t}{3(1+t)^3}\Big]_0^{\infty} + 2\int_0^{\infty} \frac{(1+t)^{-3}}{3}\, dt
= 0 - 2\Big[\frac{(1+t)^{-2}}{6}\Big]_0^{\infty} = \frac{1}{3}.
\]

The lower bound on the variance of unbiased estimators of $\theta$ therefore equals $\frac{1}{n} I^{-1}(\theta) = \frac{3}{n}$. The estimator $\hat{\theta} = \bar{X}$ would be the best unbiased (efficient) estimator if its variance equaled the Cramér-Rao lower bound. However, $\mathrm{Var}(\hat{\theta}) = \frac{\pi^2}{3n} > \frac{3}{n}$, thus the estimator $\hat{\theta}$ is not the best unbiased estimator of $\theta$.
Remark: Note that the lower bound might not be achievable, i.e. the best unbiased (efficient) estimator might not exist. Also note that, strictly speaking, the above analysis does not allow us to conclude anything about the optimality of $\hat{\theta}$ within the class of all unbiased estimators.
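Since $f(x; \theta)$ is the logistic density with location $\theta$ and unit scale, the comparison can be checked numerically; the sketch below (illustrative) recomputes $I(\theta) = 2\int f^2 = 1/3$ by quadrature and compares $\mathrm{Var}(\bar{X}) = \pi^2/(3n)$ with the Cramér-Rao bound $3/n$ by simulation.
\begin{verbatim}
# Numerical check for Problem 5 (illustrative): f is the logistic(theta, 1)
# density, so Var(X) = pi^2/3; compare Var(X_bar) with the CR bound 3/n.
import numpy as np
from scipy import integrate

theta, n, reps = 0.7, 25, 200_000
rng = np.random.default_rng(0)

# Fisher information of a single observation: I(theta) = 2 * integral of f^2.
f = lambda x: np.exp(-(x - theta)) / (1 + np.exp(-(x - theta)))**2
I1, _ = integrate.quad(lambda x: 2 * f(x)**2, -np.inf, np.inf)
print(I1)                                      # ~0.3333

xbar = rng.logistic(loc=theta, scale=1.0, size=(reps, n)).mean(axis=1)
print(xbar.mean())                             # ~theta : unbiased
print(xbar.var(), np.pi**2 / (3 * n), 3 / n)   # pi^2/(3n) exceeds the bound 3/n
\end{verbatim}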
