
Mathematical Statistics II (MS2)
Assignment
Instructors: A. Hadji. TAs: S. Rice.
18 February 2022
The assignment is due on Thursday, March 4 at 23:59. You should hand in a PDF produced with LaTeX containing the theoretical answers, together with an R script. Before you hand in your assignment, please rename the files as "LastName FirstName AssignmentMS2.R" and "LastName FirstName AssignmentMS2.pdf".

We look at the probit model. Let the y_i be binary random variables: y_i ∈ {0, 1} for all i ∈ {1, ..., n}. Write x_i = (x_{0i}, x_{1i}, ..., x_{qi}) ∈ R^{q+1} for the regression variables (with x_{0i} = 1 for all i). In the following, we write y = (y_1, ..., y_n) and X for the matrix of covariates. We consider the model

Y_i ∼ Bernoulli(p_i),   p_i = Φ(x_i^T β),

where Φ is the cdf of the standard normal distribution. We wish to estimate β ∈ R^{q+1}. In the following, we use the dataset bank from the package gclus, which describes characteristics of bank notes and whether they are counterfeit or genuine.
> install.packages("gclus")
> require(gclus)
> data(bank)
> y = bank[, 1]
> X = bank[, 2:5]
Let the prior on β be a flat prior P(β) ∝ 1. This is an improper prior, but the posterior will be a proper distribution.
1. Give the form of the posterior distribution. Is this model easy to use?
Use the R function glm to get the MLE β̂. Give the asymptotic variance Σ̂ of this estimator.
> mle = summary(glm(y ~ X, family = binomial(link = "probit")))
> betahat = mle$coefficients[, 1]
> sigma.asymp = mle$cov.unscaled
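Since the gclus data may not be installed everywhere, the same glm call can be illustrated on simulated probit data; all names below (x1, fit.sim, betahat.sim, ...) are illustrative and not part of the assignment.

```r
# Illustration of the glm probit fit on simulated data (no gclus needed).
# beta.true and the *.sim names are illustrative, not part of the assignment.
set.seed(1)
n <- 500
x1 <- rnorm(n)
beta.true <- c(-0.5, 1.2)
y.sim <- rbinom(n, 1, pnorm(beta.true[1] + beta.true[2] * x1))

fit.sim <- summary(glm(y.sim ~ x1, family = binomial(link = "probit")))
betahat.sim <- fit.sim$coefficients[, 1]   # MLE of (beta_0, beta_1)
sigma.sim <- fit.sim$cov.unscaled          # estimated asymptotic covariance
```

On simulated data the MLE lands close to beta.true, which is a quick way to convince yourself the call is set up correctly before using the bank data.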

2. We propose to generate a posterior sample of β using a random walk Metropolis-Hastings algorithm.

• Initialization: β^(0) = β̂
• At iteration k:
  – Generate β′ ∼ N_{q+1}(β^(k−1), τ²Σ̂)
  – Compute the acceptance probability

    α(β^(k−1), β′) = min(1, P(β′ | y) / P(β^(k−1) | y))

  – Generate U ∼ U(0, 1)
    ∗ If U < α(β^(k−1), β′), let β^(k) = β′.
    ∗ Else, let β^(k) = β^(k−1).
(a) Implement this algorithm.
(b) Run the algorithm for 10,000 iterations, using the values τ = 0.1, 1, 10.
For each value of τ and each i, plot the trajectory of (β_i^(k)), the histogram of the (β_i^(k)) over the last 9,000 iterations, and the autocorrelations (use the R function acf). Which value of τ seems best?
(c) Give a Monte Carlo estimate of the posterior mean and variance of β.
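One way the sampler of question 2 could be sketched in R: under the flat prior the log-posterior equals the probit log-likelihood up to a constant, and the acceptance ratio is computed on the log scale for numerical stability. The names log.post and rw.mh are illustrative, not required by the assignment.

```r
# Random walk Metropolis-Hastings sketch for the probit posterior under a
# flat prior; log.post and rw.mh are illustrative names.
log.post <- function(beta, Xmat, y) {
  p <- pnorm(Xmat %*% beta)
  sum(dbinom(y, size = 1, prob = p, log = TRUE))
}

rw.mh <- function(y, Xmat, beta0, Sigma, tau, n.iter = 10000) {
  q1 <- length(beta0)
  draws <- matrix(NA_real_, n.iter, q1)
  beta <- beta0
  lp <- log.post(beta, Xmat, y)
  L <- t(chol(tau^2 * Sigma))        # Cholesky factor for the Gaussian proposal
  for (k in 1:n.iter) {
    prop <- as.vector(beta + L %*% rnorm(q1))
    lp.prop <- log.post(prop, Xmat, y)
    # Accept with probability min(1, posterior ratio), on the log scale
    if (log(runif(1)) < lp.prop - lp) { beta <- prop; lp <- lp.prop }
    draws[k, ] <- beta
  }
  draws
}
```

With the assignment's data this would be called along the lines of rw.mh(y, cbind(1, as.matrix(X)), betahat, sigma.asymp, tau = 1), since glm adds an intercept column.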

3. Alternative method. We propose another representation of the model, which includes a latent variable.

Y_i = 1{Z_i > 0}
Z_i ∼ N(x_i^T β, 1)

(a) Show that this corresponds to the same model.


(b) Deduce that

    Z_i | Y_i = y_i, β ∼ N_+(x_i^T β, 1) if y_i = 1, and N_−(x_i^T β, 1) if y_i = 0,

where N_+(µ, 1) and N_−(µ, 1) denote the normal distribution with mean µ and variance 1 truncated respectively to the left and to the right of 0.

(c) Prove that the truncated normal distributions can be simulated in R using the following code:
> xp = qnorm(runif(1) * pnorm(mu) + pnorm(-mu)) + mu
> xm = qnorm(runif(1) * pnorm(-mu)) + mu
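As a quick numerical sanity check (not a substitute for the proof asked in (c)), the two one-liners can be vectorised and the sign of the draws verified; mu = 1.3 is an arbitrary illustrative value.

```r
# Sanity check: N_+ draws should all be positive, N_- draws all negative.
# mu = 1.3 is an arbitrary illustrative value.
set.seed(42)
mu <- 1.3
u <- runif(1000)
xp <- qnorm(u * pnorm(mu) + pnorm(-mu)) + mu   # N_+(mu, 1): support (0, Inf)
xm <- qnorm(u * pnorm(-mu)) + mu               # N_-(mu, 1): support (-Inf, 0)
```

The argument of qnorm lies in (Φ(−µ), 1) for xp and in (0, Φ(−µ)) for xm, which is exactly what confines the draws to the positive and negative half-lines.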

(d) Show that

    β | X, Y = y, Z = z ∼ N_{q+1}((X^T X)^{−1} X^T z, (X^T X)^{−1})

(e) Deduce a Gibbs sampler to generate β from the posterior distribution.


(f) Run your Gibbs sampler for 10,000 iterations and display all useful plots.
(g) Compare the execution time of the two methods. You can use either system.time() or proc.time(), depending on which suits your code structure best.
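Putting 3(b)–3(d) together, the Gibbs sampler of 3(e) could be sketched as follows (a sketch only; gibbs.probit and its variable names are illustrative):

```r
# Gibbs sampler sketch for the probit model with latent variables: alternate
# z_i | y_i, beta (truncated normals, as in 3(b)/3(c)) and beta | z (the
# Gaussian of 3(d)). Names are illustrative.
gibbs.probit <- function(y, Xmat, n.iter = 10000) {
  n <- nrow(Xmat); q1 <- ncol(Xmat)
  XtX.inv <- solve(t(Xmat) %*% Xmat)
  L <- t(chol(XtX.inv))                # to draw from N(m, (X'X)^{-1})
  beta <- rep(0, q1)
  draws <- matrix(NA_real_, n.iter, q1)
  for (k in 1:n.iter) {
    mu <- as.vector(Xmat %*% beta)
    u <- runif(n)
    # Truncated normal draws from 3(c), vectorised over i
    z <- ifelse(y == 1,
                qnorm(u * pnorm(mu) + pnorm(-mu)) + mu,  # N_+(mu_i, 1)
                qnorm(u * pnorm(-mu)) + mu)              # N_-(mu_i, 1)
    m <- as.vector(XtX.inv %*% t(Xmat) %*% z)
    beta <- m + as.vector(L %*% rnorm(q1))
    draws[k, ] <- beta
  }
  draws
}
```

On the assignment's data this would be something like gibbs.probit(y, cbind(1, as.matrix(X))), wrapped in system.time() for part (g).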
