Chapter 15 Bayesian Inference for Gaussian Process (Lecture on 02/23/2021)
When the data $(y, x)$ exhibit a highly nonlinear relationship, one way to build a regression model is to assume $y = f(x) + \epsilon$, where $f$ is an unknown function. We use a GP as the prior for $f$.
Assume $f \sim GP(\mu, C_\nu(\cdot, \cdot; \theta))$, which means that

$$\left(f(x_1), \cdots, f(x_n)\right)^T \sim MVN(\mu \mathbf{1}, H) \tag{15.1}$$

where $H = (H_{ij})_{i,j=1}^n$ and $H_{ij} = C_\nu(x_i, x_j; \theta)$. A commonly used covariance kernel is the Matern covariance kernel, defined as
$$C_\nu(d; \phi, \sigma^2) = \sigma^2 \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\sqrt{2\nu}\, \phi d\right)^\nu K_\nu\left(\sqrt{2\nu}\, \phi d\right) \tag{15.2}$$

where $d = \|x_i - x_j\|$ is the distance between inputs, $\phi$ is the decay parameter, $\nu$ is the smoothness parameter, and $K_\nu$ is the modified Bessel function of the second kind.
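As a quick check that is not spelled out in the notes: taking $\nu = 1/2$ in (15.2) and using the identity $K_{1/2}(x) = \sqrt{\pi/(2x)}\, e^{-x}$ together with $\Gamma(1/2) = \sqrt{\pi}$ recovers the exponential kernel used below:

$$C_{1/2}(d; \phi, \sigma^2) = \sigma^2 \frac{2^{1/2}}{\sqrt{\pi}} (\phi d)^{1/2} \sqrt{\frac{\pi}{2\phi d}}\, e^{-\phi d} = \sigma^2 e^{-\phi d}.$$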
The model is therefore

$$y = f(x) + \epsilon, \qquad \epsilon \overset{i.i.d.}{\sim} N(0, \tau^2), \quad f \sim GP(\mu, C_\nu) \tag{15.3}$$

and the GP prior on $f$ implies

$$\left(f(x_1), \cdots, f(x_n)\right)^T \sim MVN(\mu \mathbf{1}, \Sigma) \tag{15.4}$$
where $\Sigma = \sigma^2 H(\phi)$. For simplicity, assume we work with the exponential covariance function

$$C_\nu(x_i, x_j; \phi, \sigma^2) = \sigma^2 \exp(-\phi \|x_i - x_j\|) \tag{15.5}$$

Then $H(\phi)$ is an $n \times n$ matrix whose $(i, j)$th entry is $\exp(-\phi \|x_i - x_j\|)$.
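To make (15.5) concrete, here is a minimal sketch of building $H(\phi)$ in Python; the function name `exp_cov` and the example inputs are illustrative, not from the lecture.

```python
import numpy as np

def exp_cov(x, phi):
    """H(phi) with entries exp(-phi * ||x_i - x_j||); x has shape (n, p)."""
    diffs = x[:, None, :] - x[None, :, :]   # (n, n, p) pairwise differences
    d = np.linalg.norm(diffs, axis=-1)      # Euclidean distances ||x_i - x_j||
    return np.exp(-phi * d)                 # (n, n) matrix, ones on the diagonal

# Example: five one-dimensional inputs on [0, 1]
x = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
H = exp_cov(x, phi=2.0)
```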
Let $\mu \sim P_\mu(\mu) = N(\mu | a_\mu, b_\mu)$, $\sigma^2 \sim P_{\sigma^2}(\sigma^2) = IG(\sigma^2 | a_\sigma, b_\sigma)$, $\tau^2 \sim P_{\tau^2}(\tau^2) = IG(\tau^2 | a_\tau, b_\tau)$, and $\phi \sim P_\phi(\phi) = Unif(a_\phi, b_\phi)$ be the prior distributions for the model parameters. Ultimately the model can be written hierarchically as:
$$\begin{aligned}
y &\sim N\left(\left(f(x_1), \cdots, f(x_n)\right)^T, \tau^2 I\right) \\
\left(f(x_1), \cdots, f(x_n)\right)^T &\sim MVN(\mu \mathbf{1}, \Sigma) \\
\mu &\sim N(\mu | a_\mu, b_\mu), \quad \sigma^2 \sim IG(\sigma^2 | a_\sigma, b_\sigma) \\
\tau^2 &\sim IG(\tau^2 | a_\tau, b_\tau), \quad \phi \sim Unif(a_\phi, b_\phi)
\end{aligned} \tag{15.6}$$
For notational simplicity, denote $\theta = \left(f(x_1), \cdots, f(x_n)\right)^T$; our job is to estimate the posterior distribution of $\mu, \tau^2, \sigma^2, \phi$ and $\theta$.

There are issues with directly building a sampler for $\mu, \tau^2, \sigma^2, \phi$ and $\theta$ together: it is hard for such a chain to converge. Therefore, in practice, we usually marginalize out $\theta$ first.
Since $y \sim N(\theta, \tau^2 I)$ and $\theta \sim N(\mu \mathbf{1}, \Sigma)$, marginally $y \sim N(\mu \mathbf{1}, \Sigma + \tau^2 I)$, and mixing will be better after this marginalization. The marginal posterior is
$$\begin{aligned}
p(\mu, \tau^2, \sigma^2, \phi | y) \propto\ & N(y | \mu \mathbf{1}, \tau^2 I + \sigma^2 H(\phi))\, IG(\sigma^2 | a_\sigma, b_\sigma)\, IG(\tau^2 | a_\tau, b_\tau) \\
& \times Unif(\phi | a_\phi, b_\phi)\, N(\mu | a_\mu, b_\mu)
\end{aligned} \tag{15.7}$$
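Every update below needs to evaluate (15.7), typically on the log scale. Here is a minimal sketch, assuming $b_\mu$ is the prior variance of $\mu$ (consistent with the completion of squares in (15.8) below) and that $IG(a, b)$ uses shape $a$ and scale $b$; all function and variable names (`log_marginal_posterior`, `H_of_phi`, `hyp`) are illustrative.

```python
import numpy as np
from scipy import stats

def log_marginal_posterior(mu, tau2, sigma2, phi, y, H_of_phi, hyp):
    """Unnormalized log of (15.7); hyp is a dict of prior hyperparameters."""
    # Zero density outside the prior support.
    if not (hyp["a_phi"] < phi < hyp["b_phi"]) or tau2 <= 0 or sigma2 <= 0:
        return -np.inf
    n = y.shape[0]
    cov = tau2 * np.eye(n) + sigma2 * H_of_phi(phi)
    loglik = stats.multivariate_normal.logpdf(y, mean=mu * np.ones(n), cov=cov)
    # The uniform prior on phi only adds a constant on its support.
    logprior = (stats.norm.logpdf(mu, hyp["a_mu"], np.sqrt(hyp["b_mu"]))
                + stats.invgamma.logpdf(sigma2, hyp["a_sigma"], scale=hyp["b_sigma"])
                + stats.invgamma.logpdf(tau2, hyp["a_tau"], scale=hyp["b_tau"]))
    return loglik + logprior
```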
Now we consider the full conditional of each parameter. If a full conditional has a closed form, the parameter can be sampled with a Gibbs step; otherwise it is sampled using a Metropolis-Hastings step.
Firstly,

$$\begin{aligned}
p(\mu | \text{rest}, y) &\propto N(y | \mu \mathbf{1}, \tau^2 I + \sigma^2 H(\phi)) \times N(\mu | a_\mu, b_\mu) \\
&\propto \exp\left\{-\frac{(y - \mu \mathbf{1})^T (\tau^2 I + \sigma^2 H(\phi))^{-1} (y - \mu \mathbf{1})}{2}\right\} \times \exp\left\{-\frac{(\mu - a_\mu)^2}{2 b_\mu}\right\} \\
&\propto \exp\left\{-\frac{1}{2}\left[\mu^2 \left(\mathbf{1}^T (\sigma^2 H(\phi) + \tau^2 I)^{-1} \mathbf{1} + \frac{1}{b_\mu}\right) - 2\mu \left(\mathbf{1}^T (\sigma^2 H(\phi) + \tau^2 I)^{-1} y + \frac{a_\mu}{b_\mu}\right)\right]\right\}
\end{aligned} \tag{15.8}$$

Completing the square shows that the full conditional of $\mu$ is $N(a_{\mu|\cdot}, b_{\mu|\cdot})$, where

$$b_{\mu|\cdot} = \frac{1}{\mathbf{1}^T (\sigma^2 H(\phi) + \tau^2 I)^{-1} \mathbf{1} + \frac{1}{b_\mu}}, \qquad a_{\mu|\cdot} = b_{\mu|\cdot} \left[\mathbf{1}^T (\sigma^2 H(\phi) + \tau^2 I)^{-1} y + \frac{a_\mu}{b_\mu}\right] \tag{15.9}$$
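The closed form (15.9) translates directly into a Gibbs draw. This sketch assumes the inverse $(\sigma^2 H(\phi) + \tau^2 I)^{-1}$ has already been computed for the current parameter values; variable names are illustrative.

```python
import numpy as np

def draw_mu(y, cov_inv, a_mu, b_mu, rng):
    """Gibbs draw of mu from (15.9); cov_inv = (sigma^2 H(phi) + tau^2 I)^{-1}."""
    one = np.ones(y.shape[0])
    b_post = 1.0 / (one @ cov_inv @ one + 1.0 / b_mu)    # b_{mu|.} in (15.9)
    a_post = b_post * (one @ cov_inv @ y + a_mu / b_mu)  # a_{mu|.} in (15.9)
    return rng.normal(a_post, np.sqrt(b_post))
```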
In addition,

$$p(\phi, \tau^2, \sigma^2 | y, \mu) \propto N(y | \mu \mathbf{1}, \tau^2 I + \sigma^2 H(\phi))\, IG(\sigma^2 | a_\sigma, b_\sigma)\, IG(\tau^2 | a_\tau, b_\tau)\, Unif(\phi | a_\phi, b_\phi) \tag{15.10}$$

$\phi, \tau^2, \sigma^2$ do not have closed-form full conditionals, so they need to be updated using Metropolis-Hastings.
Should we update $\phi, \tau^2, \sigma^2$ all together, or one at a time? The answer is case by case; updating $\sigma^2, \tau^2$ together and updating $\phi$ separately using Metropolis-Hastings tends to give better mixing, as in the sketch below.
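A generic random-walk Metropolis-Hastings step, sketched below, can implement either blocking choice; the helper name and step-size choice are illustrative, not values from the lecture.

```python
import numpy as np

def mh_update(current, logpost, step, rng):
    """One random-walk MH step with a symmetric Gaussian proposal."""
    proposal = current + step * rng.standard_normal(np.shape(current))
    log_ratio = logpost(proposal) - logpost(current)
    if np.log(rng.uniform()) < log_ratio:
        return proposal   # accept
    return current        # reject: keep the current value
```

With this helper, one iteration updates $(\sigma^2, \tau^2)$ as a two-dimensional block (e.g., proposing on the log scale so the values stay positive) and then updates $\phi$ alone, each time plugging the other parameters' current values into the log posterior of (15.7).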
Once posterior samples of $\mu, \phi, \tau^2, \sigma^2$ are obtained, the next step is to get the posterior samples of $\theta$. Note that

$$\begin{aligned}
p(\theta | \text{rest}, y) &\propto N(y | \theta, \tau^2 I)\, N(\theta | \mu \mathbf{1}, \sigma^2 H(\phi)) \\
&\propto \exp\left\{-\frac{(y - \theta)^T (y - \theta)}{2 \tau^2}\right\} \times \exp\left\{-\frac{(\theta - \mu \mathbf{1})^T (\sigma^2 H(\phi))^{-1} (\theta - \mu \mathbf{1})}{2}\right\} \\
&\propto \exp\left\{-\frac{1}{2}\left[\theta^T \left(\frac{I}{\tau^2} + \frac{H(\phi)^{-1}}{\sigma^2}\right) \theta - 2 \theta^T \left(\frac{y}{\tau^2} + \frac{H(\phi)^{-1} \mathbf{1} \mu}{\sigma^2}\right)\right]\right\}
\end{aligned} \tag{15.11}$$
so $\theta | \text{rest}, y \sim MVN(\mu_{\theta|\cdot}, \Sigma_{\theta|\cdot})$, where

$$\Sigma_{\theta|\cdot}(\phi, \mu, \sigma^2, \tau^2) = \left(\frac{I}{\tau^2} + \frac{H(\phi)^{-1}}{\sigma^2}\right)^{-1}, \qquad \mu_{\theta|\cdot}(\phi, \mu, \sigma^2, \tau^2) = \Sigma_{\theta|\cdot} \left(\frac{y}{\tau^2} + \frac{H(\phi)^{-1} \mathbf{1} \mu}{\sigma^2}\right) \tag{15.12}$$
For the $l$th post burn-in sample of the parameters, denoted $(\phi_l, \mu_l, \sigma_l^2, \tau_l^2)$, we can obtain a sample of $\theta$ by drawing from $N\left(\mu_{\theta|\cdot}(\phi_l, \mu_l, \sigma_l^2, \tau_l^2), \Sigma_{\theta|\cdot}(\phi_l, \mu_l, \sigma_l^2, \tau_l^2)\right)$.