
MTH707: Problem Set

Instructor: Dootika Vats

April 4, 2025

1 Chapter 1
1. Let F be the target distribution with density f , and let P (x, ·) be a Markov kernel
with transition density k(x, y). We write the one-step marginal distribution as

    F P (dy) = ∫_X F (dx) P (x, dy) = ∫_X f (x) k(x, y) dx dy .

Although we think of f (x)k(x, y) as the joint density of the current state and the next
state, why is it not actually written as a joint density?

2. Consider an autoregressive process of order 1 (AR(1)). That is, for ρ such that |ρ| < 1
and a given X0 ,

    Xt+1 = ρXt + ϵt+1 ,

where ϵt ∼ iid N (0, σ²). What is the Markov transition kernel for this Markov chain?
What is the Markov transition density?
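The AR(1) update is easy to simulate directly. The sketch below is in Python for illustration (the course's coding questions use R); the values ρ = 0.5, σ = 1, the run length, and the seed are arbitrary choices. The long-run sample variance should approach σ²/(1 − ρ²), the stationary variance that reappears in Problem 6.

```python
import random

def ar1_chain(rho=0.5, sigma=1.0, x0=0.0, n=10_000, seed=1):
    """Simulate X_{t+1} = rho * X_t + eps_{t+1}, with eps ~ iid N(0, sigma^2)."""
    random.seed(seed)
    x = x0
    chain = []
    for _ in range(n):
        x = rho * x + random.gauss(0.0, sigma)
        chain.append(x)
    return chain

chain = ar1_chain()
mean = sum(chain) / len(chain)
var = sum((c - mean) ** 2 for c in chain) / len(chain)
# With rho = 0.5, sigma = 1, the stationary variance sigma^2/(1 - rho^2) = 4/3,
# so var should be close to 1.33 for a long run.
```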

3. For an X0 , consider the following Markov chain:

(a) Draw U ∼ U (0, 1).

(b) If U < 1/2, set Xn+1 = Xn .

(c) Else draw Xn+1 ∼ N (Xn , 1).

Write the Markov transition kernel for this chain. Does this kernel define a measure
that is absolutely continuous with respect to the Lebesgue measure?

4. Consider the following Markov chain: X0 ∼ λ, where λ is a probability distribution.
Set Xt+1 = Xt for all t ≥ 0.

(a) Is this a Markov chain?

(b) What is the Markov transition kernel?

(c) What is a stationary distribution for this kernel?

5. Ideal Slice Sampler: Consider a distribution F with density f . Consider the following
“Slice Sampler”:

(i) Draw Un+1 ∼ U (0, f (Xn ))

(ii) Draw Xn+1 ∼ Uniform(C), where the set C is {x : Un+1 ≤ f (x)}

Answer the following questions:

(a) Write down the Markov transition kernel for this chain.

(b) Write down the Markov transition density for this chain.

(c) Show that F is the stationary distribution for this kernel.
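For a concrete instance, when f (x) ∝ exp(−x²/2) the slice C = {x : u ≤ f (x)} is the interval [−√(−2 log u), √(−2 log u)], so both steps can be drawn exactly. A minimal Python sketch (illustrative only; the seed and run length are arbitrary):

```python
import math, random

def slice_sampler_gaussian(n=20_000, x0=0.0, seed=7):
    """Ideal slice sampler for f(x) proportional to exp(-x^2 / 2).

    Step (i): U ~ Uniform(0, f(x)).
    Step (ii): X ~ Uniform(C), where C = {x : u <= f(x)} is here the interval
    [-sqrt(-2 log u), sqrt(-2 log u)].
    """
    random.seed(seed)
    x, out = x0, []
    for _ in range(n):
        u = random.uniform(0.0, math.exp(-x * x / 2.0))
        u = max(u, 1e-300)                   # guard against log(0)
        a = math.sqrt(-2.0 * math.log(u))    # slice C = [-a, a]
        x = random.uniform(-a, a)
        out.append(x)
    return out

xs = slice_sampler_gaussian()
# Stationarity of F = N(0, 1): sample mean near 0, sample variance near 1.
```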

6. Show that the AR(1) Markov chain satisfies detailed balance with respect to
F = N (0, σ²/(1 − ρ²)) for |ρ| < 1.
7. For the Markov proposal kernel Q(x, ·) = U (x − h, x + h), where h > 0, show that
q(x, y) = q(y, x).

8. Show that a Markov chain transition kernel P is F -communicating for all A ∈ B(X )
iff P is F -irreducible.

9. Let q(x, y) be the proposal density of the Metropolis-Hastings algorithm. Show
that if q(x, y) > 0 for all x, y ∈ X , then P is F -irreducible.

10. Will the following target and proposal distributions in a Metropolis-Hastings
algorithm lead to a reducible or an F -irreducible Markov chain? Explain intuitively,
and mathematically wherever possible:

(a) F = U (0, 1) and Q(x, ·) = U (x − 2, x + 2)

(b) F = U (0, 1) and Q(x, ·) = U (x − .2, x + .2)

(c) F = N (0, 1) and Q(x, ·) = td for d ≥ 1

(d) F = N (0, 1) and Q(x, ·) = U (0, 1)

(e) F = N (0, 1) and Q(x, ·) = U (x − 2, x + 2)

11. What happens to the Metropolis-Hastings algorithm when Q(x, ·) = F ? Describe
mathematically and intuitively.

12. For a symmetric proposal and for c ≥ 0, consider the acceptance probability

    αc (x, y) = f (y) / (f (x) + f (y) + c) .

Show that αc (x, y) will yield an F -symmetric Markov chain. Will this acceptance
probability be useful in practice?

13. Rejection sampling chains: Consider the accept-reject MCMC algorithm with an
independent proposal distribution: q(x, y) = q(y). For c > 0, define the set

    S = {x : f (x) ≤ cq(x)} .

Consider accepting a draw with probability

    α(x, y) = 1                                   if x ∈ S ,
              cq(x) / f (x)                       if x ∉ S, y ∈ S ,
              min{1, [f (y) q(x)] / [f (x) q(y)]} if x ∉ S, y ∉ S .

Show that this procedure yields an F -invariant Markov chain. In the context of iid
accept-reject samplers, what are the potential advantages of this sampler?

14. Code: Consider the target distribution F = N (0, 1) and proposal Q(x, ·) = N (x, h)
(this is obviously not a realistic scenario). Run an MH algorithm with starting value
X0 = 0 and obtain 1000 samples for each of h = 1, 5, 10. What are the estimated
acceptance probabilities for each h? Which h seems best from the trace plots?
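A minimal Python sketch of this experiment (the course's code questions use R; this is an illustrative translation, with h read as the proposal variance and the run length and seed chosen arbitrarily):

```python
import math, random

def rwm_normal(h=1.0, x0=0.0, n=5_000, seed=3):
    """Random-walk Metropolis for F = N(0, 1) with proposal Q(x, .) = N(x, h).

    Returns the chain and the observed acceptance rate."""
    random.seed(seed)
    log_f = lambda z: -z * z / 2.0           # log target, up to a constant
    x, chain, accepts = x0, [], 0
    for _ in range(n):
        y = random.gauss(x, math.sqrt(h))    # h is the proposal variance
        if random.random() < math.exp(min(0.0, log_f(y) - log_f(x))):
            x, accepts = y, accepts + 1
        chain.append(x)
    return chain, accepts / n

for h in (1, 5, 10):
    _, rate = rwm_normal(h=h)
    print(h, round(rate, 2))   # acceptance rate falls as h grows
```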

15. Code: Repeat the previous problem for target distribution F = N100 (0, I100 ), where
I100 is the 100 × 100 identity matrix. That is, the target distribution is a standard
multivariate normal of 100 dimensions. Have the acceptance probabilities for each h
changed? Why? Can you find a better h?

16. Code: Consider target F = N (0, Σ) where Σ is the diagonal matrix with elements
(1, 2, 5, 10, 100). Using Q(x, ·) = N5 (x, hI5 ), tune h to obtain roughly 30% acceptance
rate.

17. Code: For the same target as above, now use Q(x, ·) = N5 (x, Ω), where Ω is a
diagonal matrix with entries (h1 , h2 , h3 , h4 , h5 ). Tune all choices to obtain 30%
acceptance while also retaining good behavior of the Markov chain.

2 Chapter 2
1. For what values of β > 1 is f (x) ∝ e^(−|x|^β) such that

    lim_{|x|→∞} |∇ log f (x)| / |x| = ∞ ?

2. Code: Implement random-walk Metropolis for Bayesian logistic regression for the
dataset Pima.tr in the MASS library in R. Tune this to obtain 30% acceptance.

3. Code: For the same dataset and model, implement the MALA algorithm and tune it
to obtain 60% acceptance. Compare the performance of this algorithm with the RWM
in the previous problem.
4. Code: Implement MALA for f (x) ∝ e^(−x⁴) with various starting values and tuning.
What do you observe?
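A hedged Python sketch of one such run, using ∇ log f (x) = −4x³ and a step size h = 0.1 chosen arbitrarily (the behavior for distant starting values is exactly what the problem asks you to investigate):

```python
import math, random

def mala_quartic(h=0.1, x0=0.0, n=5_000, seed=11):
    """MALA for f(x) proportional to exp(-x^4); grad log f(x) = -4 x^3.

    Proposal: y = x + (h/2) * grad log f(x) + sqrt(h) * Z, Z ~ N(0, 1),
    followed by a Metropolis-Hastings accept-reject correction."""
    random.seed(seed)
    log_f = lambda z: -z ** 4
    grad = lambda z: -4.0 * z ** 3

    def log_q(x, y):
        # log density (up to a constant) of proposing y from x
        m = x + 0.5 * h * grad(x)
        return -(y - m) ** 2 / (2.0 * h)

    x, chain, acc = x0, [], 0
    for _ in range(n):
        y = random.gauss(x + 0.5 * h * grad(x), math.sqrt(h))
        log_alpha = log_f(y) + log_q(y, x) - log_f(x) - log_q(x, y)
        if random.random() < math.exp(min(0.0, log_alpha)):
            x, acc = y, acc + 1
        chain.append(x)
    return chain, acc / n

chain, rate = mala_quartic()
```

Rerunning with starting values far in the tails (where the gradient −4x³ is enormous) illustrates the instability the problem is probing.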

5. Unadjusted Langevin Algorithm: Write down the discretized Langevin dynamics
for the F = N (5, 20) distribution. For h = .001, .01, .1, 1, 10:

(a) Implement this discretized process (without accept-reject) to obtain 10⁴ samples
for each h.

(b) Plot the density estimate of the samples for each value of h and compare with the
truth.

(c) What do you observe?

This algorithm is often called the ‘Unadjusted Langevin Algorithm’.
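Since ∇ log f (x) = −(x − 5)/20 for F = N (5, 20), the discretized dynamics are Xn+1 = Xn + (h/2)(−(Xn − 5)/20) + √h Zn+1. A minimal Python sketch (illustrative; the course's code questions use R, and the seed is arbitrary):

```python
import math, random

def ula_normal(h=0.1, x0=0.0, n=10_000, seed=5, mu=5.0, v=20.0):
    """Unadjusted Langevin algorithm for F = N(mu, v).

    grad log f(x) = -(x - mu) / v; no accept-reject step is applied, so the
    chain targets F only approximately (bias grows with h)."""
    random.seed(seed)
    x, out = x0, []
    for _ in range(n):
        x = x + 0.5 * h * (-(x - mu) / v) + math.sqrt(h) * random.gauss(0.0, 1.0)
        out.append(x)
    return out

# One run per step size, as in parts (a)-(b):
runs = {h: ula_normal(h=h) for h in (0.001, 0.01, 0.1, 1, 10)}
```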

6. Repeat the above, but for MALA (that is, discretized Langevin dynamics with
accept-reject).

7. Suppose µσ is the density of a symmetric distribution with mode at 0. Show that

    ∫_R [1 / (1 + e^(−z ∇ log f (x)))] µσ (z) dz = 1/2 .


8. Show that for g(t) = √t,

    g( e^((y−x) ∇ log f (x)) ) µσ (y − x)

is proportional to the density of the MALA proposal.

3 Chapter 3
No problems for this chapter.

4 Chapter 4
1. Recall that

    αB (x, y) = f (y) q(y, x) / [f (x) q(x, y) + f (y) q(y, x)] .

Show that αB (x, y) ≤ αMH (x, y). That is, Barker’s acceptance probability is less than
or equal to the MH acceptance probability.

2. For i = 1, . . . , d, let Pi be F -invariant Markov transition kernels. Let Pm and Pc be
the mixture and composition kernels, respectively.

(a) Show that Pm and Pc are F -invariant.

(b) If one Pi is F -irreducible, show that both Pm and Pc are F -irreducible.

3. Suppose Pi is F -symmetric for i = 1, . . . , d. Prove that P = PRS is also F -symmetric.

4. Suppose Pi is F -symmetric for i = 1, . . . , d. Note that Pc = P1 P2 . . . Pd is not
F -symmetric in general. However, show that Ps = P1 P2 . . . Pd Pd−1 . . . P2 P1 is
F -symmetric.

5. Recall the component-wise algorithm for a joint target density f (x1 , . . . , xd ). Let
f (xi | x(i) ) be the full conditionals for the components and let q((xi , x(i) ), yi ) be the
respective proposal densities. Recall that the acceptance probability for each component
is

    α(xi , yi ) = min{ 1, [f (yi | x(i) ) q((yi , x(i) ), xi )] / [f (xi | x(i) ) q((xi , x(i) ), yi )] } .

However, in practice, the joint target density is used in the acceptance probability.
That is, the following is evaluated:

    α(xi , yi ) = min{ 1, [f (yi , x(i) ) q((yi , x(i) ), xi )] / [f (x) q((xi , x(i) ), yi )] } .

Why are the two equivalent? What is the advantage of using the second over the first?

6. Consider a target distribution F defined on X with density f (x) = mf˜(x), where
m > 0 is unknown, and f˜ is known. Suppose g(x) is an independent proposal density
with support Y such that

    sup_{x∈Y} f˜(x) / g(x) ≤ M .

Consider an accept-reject MCMC algorithm with acceptance probability

    αI (x, y) = f˜(y) / (M g(y)) .

(a) What is the Markov chain transition kernel, P ?

(b) Show that P is F -invariant.

(c) What conditions are required for P to be F -irreducible?

(d) Notice that αI (x, y) does not depend on the current state x. Does this imply that
X1 , X2 , . . . , Xn drawn from P are independent? Why or why not?

(e) Would this sampler work well if X is high-dimensional? Why or why not?

7. Consider a three-component target density f (x, y, z). Suppose the following three
conditional densities are available to sample from:

    f (z|x, y), f (x|y, z), f (y|z) .

(a) Consider the component-wise algorithm: given (Xn , Yn , Zn ) = (xn , yn , zn ), the
next update is

i. Yn+1 ∼ f (y | zn )

ii. Xn+1 ∼ f (x | yn+1 , zn )

iii. Zn+1 ∼ f (z | xn+1 , yn+1 )

Write the Markov transition density of a deterministic scan combination of these
component-wise updates.

(b) Show that the Markov chain is invariant for the joint density f (x, y, z).

8. Bayesian linear regression: Consider the likelihood

    Yi | β, σ² ∼ ind N (xiᵀβ, σ²) ,

where Yi ∈ R and xi ∈ Rᵖ. We have priors

    β ∼ N (0, σ² Ip )  and  σ² ∼ Inverse Gamma(a, b) .

Find the full conditionals of β and σ², and construct a deterministic scan Gibbs
sampler, writing down its transition density.
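As a hedged illustration of what such a Gibbs sampler looks like, the Python sketch below specializes to p = 1, where both full conditionals reduce to scalar draws (they follow by completing the square and are stated in the docstring). The simulated data, the hyperparameters a = b = 2, and the seeds are arbitrary choices, not part of the problem.

```python
import random

def gibbs_linreg(x, y, a=2.0, b=2.0, n_iter=3_000, seed=13):
    """Deterministic-scan Gibbs sampler for the p = 1 case of the model:
    y_i ~ N(x_i * beta, sigma^2), beta ~ N(0, sigma^2), sigma^2 ~ IG(a, b).

    Full conditionals (by completing the square):
      beta    | sigma^2 ~ N(Sxy / (Sxx + 1), sigma^2 / (Sxx + 1))
      sigma^2 | beta    ~ IG(a + (n + 1)/2, b + (SSR + beta^2)/2)
    """
    random.seed(seed)
    n = len(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    beta, sigma2 = 0.0, 1.0
    betas, sigma2s = [], []
    for _ in range(n_iter):
        beta = random.gauss(sxy / (sxx + 1.0), (sigma2 / (sxx + 1.0)) ** 0.5)
        ssr = sum((yi - xi * beta) ** 2 for xi, yi in zip(x, y))
        # If G ~ Gamma(shape, rate 1), then c / G ~ InverseGamma(shape, c).
        g = random.gammavariate(a + (n + 1) / 2.0, 1.0)
        sigma2 = (b + (ssr + beta * beta) / 2.0) / g
        betas.append(beta)
        sigma2s.append(sigma2)
    return betas, sigma2s

# hypothetical data with true beta = 2, sigma = 1
random.seed(99)
x = [random.gauss(0, 1) for _ in range(200)]
y = [2.0 * xi + random.gauss(0, 1) for xi in x]
betas, sigma2s = gibbs_linreg(x, y)
```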

9. Bayesian Gaussian Mixture Model: Let xi ∈ Rᵈ be the observed data points,
assumed to come from a mixture of K Gaussian components:

    xi | zi , µ, Σ ∼ N (µzi , Σzi ) ,

where zi ∈ {1, . . . , K} is the latent cluster assignment. Each cluster assignment zi
follows a categorical distribution based on the mixture weights:

    zi | π ∼ Categorical(π) ,

where π = (π1 , . . . , πK ) are the mixture weights. A Dirichlet prior is placed on the
mixture proportions:

    π ∼ Dirichlet(α1 , . . . , αK ) ,

where the αk are concentration parameters. We assume a Normal-Inverse-Wishart
prior on each component:

    µk | Σk ∼ N (µ0 , Σk /κ0 )  and  Σk ∼ Inverse-Wishart(ν0 , Λ0 ) ,

where µ0 , κ0 , ν0 , and Λ0 are hyperparameters.

Show that the full conditionals are:

(a) Sample Cluster Assignments zi : For each data point i, update zi from

    P (zi = k | xi , µk , Σk , π) ∝ πk N (xi | µk , Σk ) .

(b) Sample Mixture Weights π: Given cluster counts nk (the number of data points in
cluster k), update

    π ∼ Dirichlet(α1 + n1 , . . . , αK + nK ) .

(c) Sample Component Means µk and Covariances Σk : Given the subset Xk of data
points assigned to cluster k, compute the sufficient statistics (sample mean x̄k and
scatter Sk ) and update

    µk | Σk , Xk ∼ N (µ*k , Σk /κ*k ) ,

where

    µ*k = (κ0 µ0 + nk x̄k ) / (κ0 + nk ) ,  κ*k = κ0 + nk ,

and

    Σk | Xk ∼ Inverse-Wishart(ν0 + nk , Λ*k ) ,

where

    Λ*k = Λ0 + Sk + [κ0 nk / (κ0 + nk )] (x̄k − µ0 )(x̄k − µ0 )ᵀ .

10. Bayesian Robust Regression with t-Likelihood: A heavy-tailed alternative to
normal regression using a latent variance. The likelihood is

    yi | xi , β, σ², ν ∼ Tν (xiᵀβ, σ²) ,

where Tν is a Student-t distribution. Consider an augmented variable model where a
new variable is included to allow for a Gibbs sampler:

    yi | xi , β, λi , σ² ∼ N (xiᵀβ, λi σ²) ,
    λi ∼ Inverse-Gamma(ν/2, ν/2) .

This construction implies that the marginal distribution of yi | xi , β, σ² is Tν , and thus
adding λi does not change the marginal model.

Assume priors:

    β ∼ N (0, s0 Ip )  and  σ² ∼ Inverse Gamma(a0 , b0 ) .

(a) Show that the distribution of yi | xi , β, σ² is Tν (xiᵀβ, σ²).

(b) Find the full conditionals of β, σ², and λ = (λ1 , . . . , λn ) to construct a Gibbs
sampler.

(c) Write the Markov transition density of a deterministic scan Gibbs sampler.

11. Code: For the 100-dimensional multivariate normal target from Chapter 1, write a
deterministic scan component-wise MCMC algorithm with h = 1 in the proposal
distribution. Store the acceptance probability for each component in the vector
accept.vec and, at the end of the program, output summary(accept.vec).
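One deterministic scan sweep per iteration looks like this (Python for illustration; the problem itself asks for R output via summary(accept.vec)). Since the target factorizes across coordinates, each coordinate's acceptance ratio involves only that coordinate:

```python
import math, random

def componentwise_mh(d=100, h=1.0, n=2_000, seed=29):
    """Deterministic scan component-wise MH for N_d(0, I_d).

    Each sweep proposes y_i ~ N(x_i, h) for i = 1, ..., d in turn; the
    returned vector holds the observed acceptance rate per coordinate."""
    random.seed(seed)
    x = [0.0] * d
    accepts = [0] * d
    for _ in range(n):
        for i in range(d):
            y = random.gauss(x[i], math.sqrt(h))
            # target factorizes, so only coordinate i enters the MH ratio
            if random.random() < math.exp(min(0.0, (x[i] ** 2 - y ** 2) / 2.0)):
                x[i] = y
                accepts[i] += 1
    return [a / n for a in accepts]

accept_vec = componentwise_mh()
```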

12. Code: Consider the bivariate normal target distribution

    F = N ( (2, −2)ᵀ , ( 1  ρ ; ρ  1 ) ) .

Implement an MH sampler (with h chosen so that the acceptance probability is around
35%) and a Gibbs sampler for ρ = 0, ρ = .5, and ρ = .99. For each value of ρ, run the
Markov chain for 1000 steps and store the output of the chain in n × 2 matrices
chain.mh and chain.gibbs.

For each value of ρ, output the marginal density plots and compare with the true
marginal distributions. Also compare the trace plots of MH and Gibbs. For which ρ is
MH better, and for which does Gibbs seem better?
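The Gibbs half of this comparison can be sketched as follows (Python for illustration; the exercise itself asks for R). It uses the standard bivariate normal full conditionals X | Y = y ∼ N(2 + ρ(y + 2), 1 − ρ²) and Y | X = x ∼ N(−2 + ρ(x − 2), 1 − ρ²); the seed is arbitrary:

```python
import math, random

def gibbs_bvn(rho, n=1_000, seed=17, mu=(2.0, -2.0)):
    """Deterministic-scan Gibbs sampler for the bivariate normal target with
    mean (2, -2), unit variances, and correlation rho.

    Full conditionals: X | Y = y ~ N(mu1 + rho*(y - mu2), 1 - rho^2),
    and symmetrically for Y | X."""
    random.seed(seed)
    s = math.sqrt(1.0 - rho * rho)   # conditional standard deviation
    x, y = mu
    chain = []
    for _ in range(n):
        x = random.gauss(mu[0] + rho * (y - mu[1]), s)
        y = random.gauss(mu[1] + rho * (x - mu[0]), s)
        chain.append((x, y))
    return chain

chain_gibbs = gibbs_bvn(rho=0.5)
```

As ρ approaches 1 the conditional variance 1 − ρ² shrinks, which is what drives the slow mixing the problem asks you to observe.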

13. Bayesian logistic regression: The Bayesian logistic regression model is one of the
most popular models in MCMC and has been given special attention. Let Y1 , . . . , Ym
be the observed binary responses. For i = 1, . . . , m, let xi = (xi1 , . . . , xip )ᵀ denote the
vector of covariates for the ith response. For β ∈ Rᵖ, the Bayesian logistic regression
setup is

    Yi ∼ Bernoulli( 1 / (1 + exp(−xiᵀβ)) ) .

We assume the following multivariate normal prior on β: β ∼ N (0, 100 I10 ), where I10
is the 10 × 10 identity matrix.

Load the following data in R:

install.packages("mcmc")  # install package
library(mcmc)
data(logit)  # loads the dataset. Do ?logit to learn about the data

(a) Write a Metropolis-Hastings algorithm to sample from the posterior distribution
of β for the above data set.

(b) You may notice that the above algorithm exhibits high autocorrelation. A Gibbs
sampler using data augmentation was introduced for this model by Polson, Scott, and
Windle. An open version of the paper is available here:
https://arxiv.org/pdf/1205.0310.pdf

The Gibbs sampler is presented at the bottom of Page 6. Describe the sampler
in your own words and explain the justification of this algorithm presented in the
paper.

(c) Implement the Gibbs sampling algorithm and compare the performance with
Metropolis-Hastings for the logit dataset.

14. Metropolis-within-Gibbs: We have learned component-wise MCMC updates, and
within that we have learned Gibbs sampling, which requires the full conditional
distribution of each component to be available. But what if, for one or more of the
components, the full conditional distribution is not available?

When at least one of the fi (xi |x(i) ) is not available to sample from, we use a general
proposal qi , as described before, to update that component, and use Gibbs sampling
updates for all other components.

For i = 1, . . . , m, let ti denote the observed failure time for lamp i (data on m lamps
are collected). Suppose

    Ti | λ, β ∼ Weibull(λ, β) ,

where λ > 0 is the scale parameter and β > 0 is the shape parameter. In a Bayesian
paradigm, we further assume prior distributions on these parameters:

    λ ∼ Gamma(a0 , b0 )  and  β ∼ Gamma(a1 , b1 ) .

The resulting posterior distribution is complicated, and its normalizing constant is not
known:

    f (λ, β | T) ∝ λ^(m+a0−1) β^(m+a1−1) (∏_{i=1}^m ti)^(β−1) exp{−λ ∑_{i=1}^m ti^β} exp{−b1 β} exp{−b0 λ} .

It can also be shown that

    λ | β, T ∼ Gamma( m + a0 , b0 + ∑_{i=1}^m ti^β ) .

However, β | λ, T does not have a closed-form expression:

    f (β | λ, T) ∝ β^(m+a1−1) (∏_{i=1}^m ti)^(β−1) exp{−λ ∑_{i=1}^m ti^β} exp{−b1 β} .

In this case, we can implement the following (deterministic scan) Metropolis-within-Gibbs
sampler:

(a) λn+1 ∼ λ | βn , T

(b) Propose Y ∼ Q((λn+1 , βn ), ·) and draw U ∼ U [0, 1].

(c) If U ≤ α((λn+1 , βn ), Y ), where

    α((λn+1 , βn ), y) = min{ 1, [f (y|λn+1 , T) q((y, λn+1 ), βn )] / [f (βn |λn+1 , T) q((βn , λn+1 ), y)] } ,

then set βn+1 = Y .

(d) Else set βn+1 = βn .

The Markov transition density for this sampler is

    k((λ, β), (λ′ , β ′ )) = f (λ′ |β) p((λ′ , β), (λ′ , β ′ )) ,

where p is the MH transition density for the β-update.

Implement the above sampler to sample from the posterior distribution with randomly gen-
erated data.
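A hedged Python sketch of steps (a)–(d) (the course uses R; the random-walk proposal for β, its step size 0.3, the unit hyperparameters, and the simulated data with λ = 1, β = 2 are all arbitrary illustrative choices):

```python
import math, random

def mwg_weibull(t, a0=1.0, b0=1.0, a1=1.0, b1=1.0,
                step=0.3, n_iter=3_000, seed=23):
    """Metropolis-within-Gibbs sketch for the Weibull posterior above, with
    density f(t) = lambda * beta * t^(beta - 1) * exp(-lambda * t^beta).

    (a)     lambda | beta ~ Gamma(m + a0, rate b0 + sum t_i^beta)  [exact draw]
    (b)-(d) beta updated by a random-walk MH step on its full conditional."""
    random.seed(seed)
    m = len(t)
    sum_log_t = sum(math.log(ti) for ti in t)

    def log_cond_beta(beta, lam):
        if beta <= 0:
            return -math.inf
        return ((m + a1 - 1) * math.log(beta) + (beta - 1) * sum_log_t
                - lam * sum(ti ** beta for ti in t) - b1 * beta)

    lam, beta = 1.0, 1.0
    lams, betas, acc = [], [], 0
    for _ in range(n_iter):
        # (a) exact Gamma draw; gammavariate takes (shape, scale = 1/rate)
        rate = b0 + sum(ti ** beta for ti in t)
        lam = random.gammavariate(m + a0, 1.0 / rate)
        # (b)-(d) symmetric random-walk proposal for beta
        prop = random.gauss(beta, step)
        log_alpha = log_cond_beta(prop, lam) - log_cond_beta(beta, lam)
        if random.random() < math.exp(min(0.0, log_alpha)):
            beta, acc = prop, acc + 1
        lams.append(lam)
        betas.append(beta)
    return lams, betas, acc / n_iter

# randomly generated data with lambda = 1, beta = 2: T = (-log U)^(1/2)
random.seed(41)
data = [(-math.log(random.random())) ** 0.5 for _ in range(200)]
lams, betas, rate = mwg_weibull(data)
```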

5 Chapter 5

6 Chapter 7
1. Recall that an F -Harris ergodic Markov chain on X is uniformly ergodic if, for some
M < ∞ and some t ∈ [0, 1),

    ∥P n (x, ·) − F (·)∥ ≤ M tn .

(a) Suppose the full support X is small. That is, there exists an ϵ > 0 and a measure
Q such that for all x ∈ X and for all A ∈ B(X )

P (x, A) ≥ ϵ Q(A) .

Show that since X is small, P is uniformly ergodic with t = (1 − ϵ) and M = 1.

(b) In how many steps, n∗ , will the above Markov chain be within δ TV-distance of
stationarity?

(c) Consider the target distribution F being a Pareto(α, β) and consider an independent
proposal distribution Pareto(α, λ). Show that if λ ≤ β, then the Markov chain is
uniformly ergodic. For what value of λ is n∗ the smallest?

(d) Show that if λ > 2β, then the asymptotic variance for estimating the mean of a
Pareto distribution using the above independent MH algorithm is ∞.

2. Consider an Independence MH for target F = Exp(λ), and Q = Exp(θ).

(a) For what values of θ is the chain uniformly ergodic?

(b) For λ = 5 and θ = 3, how long will a Markov chain take to be within .01 TV
distance of the target?

3. Show that if X is finite, then every F -irreducible, F -invariant, aperiodic, recurrent
Markov chain is uniformly ergodic.

4. Suppose X is compact and let P be an M-H kernel whose proposal q is absolutely
continuous with respect to the Lebesgue measure and satisfies q(x, y) > 0 on X . If
f (x) ≤ k on X , then show that P is uniformly ergodic.

5. Let PRSGS denote the Markov transition kernel of a 2-component random scan Gibbs
sampler that updates the first component with probability r. Show that
PRSGS² ≥ r(1 − r)PDUGS .

Hint: You can use the MTDs to show this. Then

    kRSGS ((x, y), (u, v)) kRSGS ((u, v), (x′ , y ′ ))
      = [ r fX|Y (u|y)δy (v) + (1 − r)fY |X (v|x)δx (u) ] [ r fX|Y (x′ |v)δv (y ′ ) + (1 − r)fY |X (y ′ |u)δu (x′ ) ] .

6. Using the above result, show that if PDUGS is uniformly ergodic, then so is PRSGS .
(This is true even outside of the two-variable case.)
(This is true even outside of the two variable case).
