Assignment
Hadji
18 February 2022 TAs: S. Rice
Mathematical Statistics II
The assignment is due on Thursday, March 4 at 23:59. Hand in a PDF written in LaTeX containing
the theoretical answers, together with an R script. Before you hand in your assignment, please rename
the files as “LastName FirstName AssignmentMS2.R” and “LastName FirstName AssignmentMS2.pdf”.
We look at the probit model. Let $y$ be a binary random variable: $y_i \in \{0, 1\}$ for all $i \in \{1, \dots, n\}$.
Denote by $x_i = (x_{0i}, x_{1i}, \dots, x_{qi}) \in \mathbb{R}^{q+1}$ the regression variables (with $x_{0i} = 1$ for all $i$).
In the following, we write $y = (y_1, \dots, y_n)$ and $X$ for the matrix of covariates. We consider the model:
$Y_i \sim \mathrm{Bernoulli}(p_i)$
$p_i = \Phi(x_i^T \beta)$
where $\Phi$ is the cdf of the standard normal distribution. We wish to estimate $\beta \in \mathbb{R}^{q+1}$. In the
following, we use the dataset bank from the package gclus, which describes characteristics of bank
notes and whether they are counterfeit or genuine.
> install.packages("gclus")    # the package name must be quoted
> require(gclus)
> data(bank)
> y = bank[, 1]
> X = as.matrix(bank[, 2:5])   # a numeric matrix, so that it can be used in the formula y ~ X
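To make the model concrete, the following minimal sketch simulates data from a probit model. It uses synthetic covariates rather than the bank data, and the names `X_sim`, `y_sim` and `beta_true`, as well as the numerical values, are assumptions made only for this illustration.

```r
# Hypothetical illustration on synthetic data (not the bank data set).
set.seed(1)
n <- 500
q <- 2

# Design matrix with an intercept column x0 = 1 and q standard normal covariates.
X_sim <- cbind(1, matrix(rnorm(n * q), nrow = n))

beta_true <- c(-0.5, 1, 2)               # arbitrary values chosen for the example
p <- pnorm(drop(X_sim %*% beta_true))    # p_i = Phi(x_i' beta)
y_sim <- rbinom(n, size = 1, prob = p)   # Y_i ~ Bernoulli(p_i)

table(y_sim)
```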
Let the prior on $\beta$ be a flat prior $P(\beta) \propto 1$. This prior is improper, but the posterior is
a proper distribution.
1. Give the form of the posterior distribution. Is this model easy to use?
Use the R function glm to get the MLE $\hat\beta$. Give the asymptotic variance $\hat\Sigma$ of this estimator:
> mle = summary(glm(y ~ X, family = binomial(link = "probit")))
> betahat = mle$coefficients[, 1]
> sigma.asymp = mle$cov.unscaled
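As a small check of what these objects contain, the sketch below fits a probit model to synthetic data (not the bank data) and compares `cov.unscaled` with `vcov`. For the binomial family the dispersion is fixed at 1, so the two matrices should coincide. The variable names and simulated values are assumptions made for this example.

```r
# Hypothetical illustration on synthetic data.
set.seed(2)
n <- 400
x1 <- rnorm(n)
y_sim <- rbinom(n, size = 1, prob = pnorm(0.3 + 0.8 * x1))

fit <- glm(y_sim ~ x1, family = binomial(link = "probit"))
s <- summary(fit)

betahat_sim <- s$coefficients[, 1]     # point estimates, same values as coef(fit)
sigma_sim <- s$cov.unscaled            # estimated asymptotic covariance matrix
se_from_cov <- sqrt(diag(sigma_sim))   # should match the "Std. Error" column

cbind(estimate = betahat_sim, std_error = se_from_cov)
```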
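• Initialization: $\beta^{(0)} = \hat\beta$
• At iteration $k$:
– Generate $\beta' \sim \mathcal{N}_{q+1}(\beta^{(k-1)}, \tau^2 \hat\Sigma)$
– Compute the acceptance probability
$$\alpha(\beta^{(k-1)}, \beta') = \min\left(1, \frac{P(\beta' \mid y)}{P(\beta^{(k-1)} \mid y)}\right)$$
– Generate $U \sim \mathcal{U}(0,1)$
∗ If $U < \alpha(\beta^{(k-1)}, \beta')$, let $\beta^{(k)} = \beta'$.
∗ Else, let $\beta^{(k)} = \beta^{(k-1)}$.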
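The steps above describe a random-walk Metropolis-Hastings sampler, and one possible way to code them is sketched below. This is only an illustration under stated assumptions: the function names `log_post` and `mh_probit` are hypothetical, the design matrix passed in must already contain the intercept column, the posterior ratio is evaluated on the log scale for numerical stability, and the usage part runs on synthetic data rather than the bank data.

```r
# Log-posterior under the flat prior: equal to the probit log-likelihood
# up to an additive constant.
log_post <- function(beta, y, X) {
  eta <- drop(X %*% beta)
  sum(y * pnorm(eta, log.p = TRUE) + (1 - y) * pnorm(-eta, log.p = TRUE))
}

# Random-walk Metropolis-Hastings with proposal N(beta^(k-1), tau^2 * Sigma).
# X must be a numeric matrix that already includes the intercept column.
mh_probit <- function(y, X, beta0, Sigma, tau, n_iter = 10000) {
  d <- length(beta0)
  L <- t(chol(tau^2 * Sigma))            # L %*% t(L) = tau^2 * Sigma
  chain <- matrix(NA_real_, nrow = n_iter + 1, ncol = d)
  chain[1, ] <- beta0                    # row 1 holds beta^(0)
  lp_cur <- log_post(beta0, y, X)
  accepted <- 0
  for (k in seq_len(n_iter)) {
    prop <- chain[k, ] + drop(L %*% rnorm(d))
    lp_prop <- log_post(prop, y, X)
    # U < min(1, ratio) is equivalent to log(U) < log(ratio) since U < 1.
    if (log(runif(1)) < lp_prop - lp_cur) {
      chain[k + 1, ] <- prop
      lp_cur <- lp_prop
      accepted <- accepted + 1
    } else {
      chain[k + 1, ] <- chain[k, ]
    }
  }
  list(chain = chain, accept_rate = accepted / n_iter)
}

# Usage on synthetic data (hypothetical values).
set.seed(3)
n <- 200
X_sim <- cbind(1, rnorm(n))
y_sim <- rbinom(n, size = 1, prob = pnorm(drop(X_sim %*% c(0.2, 1))))
fit <- summary(glm(y_sim ~ X_sim - 1, family = binomial(link = "probit")))
out <- mh_probit(y_sim, X_sim, fit$coefficients[, 1], fit$cov.unscaled,
                 tau = 1, n_iter = 2000)
out$accept_rate
```

With the objects from the bank example, a call might look like `mh_probit(y, cbind(1, as.matrix(X)), betahat, sigma.asymp, tau = 1)`, since `glm(y ~ X)` adds the intercept itself while the sampler needs it as an explicit column.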
(a) Implement this algorithm.
(b) Run the algorithm for 10,000 iterations, using the values $\tau = 0.1, 1, 10$.
For each value of $\tau$ and each $i$, plot the trajectory of $(\beta_i^{(k)})$, the histogram of the
$(\beta_i^{(k)})$ over the last 9,000 iterations, and the autocorrelations (use the R function
acf). Which value of $\tau$ seems best?
(c) Give a Monte Carlo estimate of the posterior mean and variance of β.
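The diagnostics and estimates requested in (b) and (c) could be produced along the following lines. To keep the sketch self-contained, the matrix `chain` is a toy stand-in (two autocorrelated AR(1) series) rather than real sampler output. The names `chain`, `kept` and `burn_in` are assumptions for this illustration, with one row per iteration and one column per coefficient.

```r
# Toy chain standing in for MCMC output (hypothetical): one AR(1) series per coordinate.
set.seed(4)
n_iter <- 10000
chain <- cbind(
  as.numeric(arima.sim(list(ar = 0.9), n = n_iter)),
  as.numeric(arima.sim(list(ar = 0.5), n = n_iter))
)

burn_in <- 1000
kept <- chain[(burn_in + 1):n_iter, , drop = FALSE]   # last 9,000 iterations

# One row of plots per coefficient: trajectory, histogram, autocorrelations.
op <- par(mfrow = c(ncol(chain), 3))
for (i in seq_len(ncol(chain))) {
  plot(chain[, i], type = "l", xlab = "iteration", ylab = paste0("beta[", i, "]"))
  hist(kept[, i], main = "", xlab = paste0("beta[", i, "]"))
  acf(kept[, i], main = "")
}
par(op)

post_mean <- colMeans(kept)   # Monte Carlo estimate of the posterior mean
post_var <- cov(kept)         # Monte Carlo estimate of the posterior covariance matrix
```

A slowly decaying autocorrelation function, as in the first toy column, suggests poor mixing, which is one way to compare the values of tau.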
3. Alternative method. We propose another representation of the model, which includes a latent
variable:
$Y_i = \mathbb{1}_{Z_i > 0}$
$Z_i \sim \mathcal{N}(x_i^T \beta, 1)$
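The link between this latent representation and the probit model can be checked by simulation: if Z follows a normal distribution with mean x'β and unit variance, then P(Z > 0) equals Φ(x'β). The sketch below is an empirical illustration only, and the vectors `x` and `beta` contain hypothetical values.

```r
set.seed(5)
n_sim <- 1e5
x <- c(1, 0.5, -1)       # hypothetical covariate vector with x0 = 1
beta <- c(0.2, 1, 0.3)   # hypothetical coefficient vector
mu <- sum(x * beta)      # linear predictor x' beta

z <- rnorm(n_sim, mean = mu, sd = 1)   # latent variable Z ~ N(x' beta, 1)
y <- as.integer(z > 0)                 # Y = 1 if Z > 0, else 0

# The empirical frequency of Y = 1 should be close to Phi(x' beta).
c(empirical = mean(y), theoretical = pnorm(mu))
```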
(c) Prove that the truncated normal distribution can be simulated in R using the
following code:
> xp = qnorm(runif(1) * pnorm(mu) + pnorm(-mu)) + mu
> xm = qnorm(runif(1) * pnorm(-mu)) + mu
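Before writing the proof, the code can be sanity-checked numerically. The sketch below is a vectorised version of the two lines above. It assumes that `xp` is meant to follow N(mu, 1) truncated to the positive half-line and `xm` the same distribution truncated to the negative half-line, and compares sample means with the standard truncated normal mean formulas. The value of `mu` is arbitrary, and this check does not replace the proof.

```r
set.seed(6)
mu <- 0.7       # arbitrary value for the illustration
n_sim <- 1e5

# Vectorised versions of the two sampling lines.
xp <- qnorm(runif(n_sim) * pnorm(mu) + pnorm(-mu)) + mu
xm <- qnorm(runif(n_sim) * pnorm(-mu)) + mu

# Means of N(mu, 1) truncated to (0, Inf) and to (-Inf, 0).
mean_xp_theory <- mu + dnorm(mu) / pnorm(mu)
mean_xm_theory <- mu - dnorm(mu) / pnorm(-mu)

rbind(
  positive = c(sample = mean(xp), theory = mean_xp_theory),
  negative = c(sample = mean(xm), theory = mean_xm_theory)
)
```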