5 3-2 Spatial Environmental Data Model Selection Long-Range Dependencies

5/16/2021 Sensing and Analyzing global patterns of dependence | Module 5: Environmental Data and Gaussian Processes | Data Analysis: Statistical Modeling and Computation in Applications | edX

2. Model Selection

Model Selection
... to fit Gaussian processes on a variety of data without even that much prior knowledge. It's still good to know what these different kernels do, so that you can already come up with a good set of candidate kernels. But other than that, you can actually fit the rest to the data at hand that you have.

Initially, let's recall our setup where we have a pair of multivariate Gaussian random variables $\mathbf{X}_1 \in \mathbb{R}^{d}$ and $\mathbf{X}_2 \in \mathbb{R}^{N-d}$. These two random variables are used to represent the temperature at two sets of cities: $\mathbf{X}_1$ are the cities for which we do not have temperature measurements, and $\mathbf{X}_2$ are the cities for which we do have temperature measurements. In addition, we also have access to the means of both of these random variables, which are denoted by $\mu_1$ and $\mu_2$ respectively; these will be the mean temperature at each of the cities.

The random variables are associated with physical locations represented by the variables $\mathbf{Z}_1 \in \mathbb{R}^{M \times d}$ and $\mathbf{Z}_2 \in \mathbb{R}^{M \times (N-d)}$, where we have assumed that we are working on an $M$-dimensional space; typically, $M = 2$ for spatial data. Further, we have selected a covariance function $k(z_i, z_j)$ that serves as proxy for the relation between two random variables as a function of their spatial locations. We use this kernel function to construct a covariance matrix so that $\Sigma_{ij} = \mathrm{cov}(X_i, X_j) = k(z_i, z_j)$. Thus, we build the matrix:

$$
\boldsymbol{\Sigma} =
\begin{bmatrix}
\Sigma_{11} \in \mathbb{R}^{d \times d} & \Sigma_{12} \in \mathbb{R}^{d \times (N-d)} \\
\Sigma_{21} \in \mathbb{R}^{(N-d) \times d} & \Sigma_{22} \in \mathbb{R}^{(N-d) \times (N-d)}
\end{bmatrix}
$$

In the previous sections we have shown that the distribution of the random variable $\mathbf{X}_1$ conditioned on $\mathbf{X}_2 = \mathbf{x}_2$ is a Normal distribution with

$$
\mu_{\mathbf{X}_1 | \mathbf{X}_2} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (\mathbf{x}_2 - \mu_2)
$$

$$
\Sigma_{\mathbf{X}_1 | \mathbf{X}_2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}.
$$
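
As a minimal sketch of the conditional mean and covariance formulas above, the following NumPy code builds the covariance blocks from a squared-exponential kernel and conditions on the observed cities. The coordinates, mean temperatures, observed values, length scale `ell`, and the small diagonal jitter are made-up illustrative values, and the helper names `rbf_kernel` and `condition_gaussian` are hypothetical, not part of the course material. For convenience the code stores one location per row, which is the transpose of the $M \times d$ layout used in the text.

```python
import numpy as np

def rbf_kernel(Za, Zb, ell=1.0):
    """Squared-exponential kernel between rows of Za (n, M) and Zb (m, M)."""
    sq_dists = np.sum((Za[:, None, :] - Zb[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * ell ** 2))

def condition_gaussian(mu1, mu2, S11, S12, S22, x2):
    """Mean and covariance of X1 given X2 = x2 for a joint Gaussian."""
    # Solve linear systems instead of forming the inverse of S22 explicitly.
    mean = mu1 + S12 @ np.linalg.solve(S22, x2 - mu2)
    cov = S11 - S12 @ np.linalg.solve(S22, S12.T)
    return mean, cov

# Illustrative data (assumed): one location per row, M = 2.
Z1 = np.array([[0.5, 0.5]])                           # city without a measurement
Z2 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # cities with measurements
mu1 = np.array([15.0])                                # mean temperature, unobserved city
mu2 = np.array([14.0, 16.0, 15.0])                    # mean temperatures, observed cities
x2 = np.array([15.0, 17.0, 16.0])                     # observed temperatures

jitter = 1e-8 * np.eye(len(Z2))                       # numerical stabilizer (assumption)
S11 = rbf_kernel(Z1, Z1)
S12 = rbf_kernel(Z1, Z2)
S22 = rbf_kernel(Z2, Z2) + jitter

cond_mean, cond_cov = condition_gaussian(mu1, mu2, S11, S12, S22, x2)
print("conditional mean:", cond_mean)
print("conditional covariance:", cond_cov)
```

Because every observed city is warmer than its mean, the predicted temperature at the unobserved city is pulled above its own mean, and the conditional variance is smaller than the prior variance $k(z, z) = 1$.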

The main running assumption in this process is to model the variables to be measured, such as temperature, as a jointly Normally distributed random variable with correlations determined as a function of location through the kernel function $k(z_i, z_j)$. Once the means have been specified, we may predict the unobserved random variables by computing the marginal distributions conditioned on the observed variables.


Here, we study the fundamental question of how to select this kernel function. One could create a countable
number of models, such as Gaussian processes with different kernels, or use the same kernel with a set of
different parameters. However, from these sets of kernels, how do we specify and select which model for the
kernel is best?
We present two possible approaches for such a problem of model selection. These are not an exhaustive
exposition of the possible approaches but are useful for the problems at hand. The interested reader may wish to
consult the literature of Gaussian processes and model selection for further approaches.
We will proceed by first constructing an additional abstraction: we will consider the parameters of the kernel function to be some generic value $\theta$. That is, for example in the case of the kernel function

$$
k(y_i, y_j) = \exp\left( -\frac{\| y_i - y_j \|^2}{2 \ell^2} \right).
$$

We can say that $\theta = \{\ell\}$, and our objective is to find the "best" $\theta$ in some particular sense that will be defined later.
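
To make the role of $\theta = \{\ell\}$ concrete, here is a small sketch that evaluates the kernel above for a fixed pair of points and a few length scales. The function name `sq_exp_kernel`, the dictionary representation of $\theta$, and the two points are assumptions chosen for illustration only.

```python
import numpy as np

def sq_exp_kernel(yi, yj, theta):
    """k(y_i, y_j) = exp(-||y_i - y_j||^2 / (2 * ell^2)), with theta = {"ell": ...}."""
    ell = theta["ell"]
    diff = np.asarray(yi, dtype=float) - np.asarray(yj, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * ell ** 2))

# Two assumed locations whose squared distance is 2.
yi, yj = [0.0, 0.0], [1.0, 1.0]

# A larger length scale means the same two points are modeled as more correlated.
values = {ell: sq_exp_kernel(yi, yj, {"ell": ell}) for ell in (0.5, 1.0, 2.0)}
for ell, k in values.items():
    print(f"ell = {ell}: k = {k:.4f}")
```

Each choice of $\ell$ therefore defines a different model of how quickly dependence decays with distance, which is what model selection has to decide between.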
The two approaches we will explore are:
1. Estimate generalization error: cross-validation, leave-one-out, or k-fold. This defines a "good model" as one that best predicts data that we have not seen before, i.e., generalization. This approach corresponds to the classical tension between having a model that fits the data well, and at the same time, generalizes to unobserved data.
2. Maximize the log marginal likelihood of the data, $p(y | X, \theta)$, with respect to $\theta$. Here we assume we have a probabilistic model, where we compute how likely the data is that we have seen, under the chosen model. Alternatively, in short, how well the model fits the data as measured by a normalized probability. This approach balances fitting power and the simplicity of the model.
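
The two criteria can be sketched side by side on a grid of candidate length scales. This is an illustrative sketch under several assumptions that are not in the text above: a zero-mean Gaussian process, an added Gaussian noise variance `noise_var` on the diagonal (which also keeps the matrix well conditioned), synthetic one-dimensional data, and hypothetical helper names. The log marginal likelihood uses the standard Gaussian expression $\log p(y | X, \theta) = -\tfrac{1}{2} y^\top K^{-1} y - \tfrac{1}{2} \log |K| - \tfrac{n}{2} \log 2\pi$.

```python
import numpy as np

def rbf_kernel(Xa, Xb, ell):
    """Squared-exponential kernel between rows of Xa and Xb."""
    sq = np.sum((Xa[:, None, :] - Xb[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * ell ** 2))

def log_marginal_likelihood(X, y, ell, noise_var=0.01):
    """log p(y | X, theta) for a zero-mean GP with Gaussian noise (assumed)."""
    n = len(y)
    K = rbf_kernel(X, X, ell) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    # 0.5 * log|K| equals the sum of the logs of the Cholesky diagonal.
    return -0.5 * y @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2 * np.pi)

def kfold_cv_error(X, y, ell, k=5, noise_var=0.01):
    """Mean squared prediction error over k held-out folds (k = n is leave-one-out)."""
    n = len(y)
    sq_errors = []
    for test_idx in np.array_split(np.arange(n), k):
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        K_tt = rbf_kernel(X[train_idx], X[train_idx], ell) + noise_var * np.eye(len(train_idx))
        K_st = rbf_kernel(X[test_idx], X[train_idx], ell)
        pred = K_st @ np.linalg.solve(K_tt, y[train_idx])  # conditional mean, zero prior mean
        sq_errors.append((pred - y[test_idx]) ** 2)
    return float(np.mean(np.concatenate(sq_errors)))

# Synthetic data (assumed): noisy samples of a smooth function.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)

candidates = [0.05, 0.3, 1.0, 3.0, 10.0]
lml = {ell: log_marginal_likelihood(X, y, ell) for ell in candidates}
cv = {ell: kfold_cv_error(X, y, ell) for ell in candidates}

best_lml = max(lml, key=lml.get)   # approach 2: highest log marginal likelihood
best_cv = min(cv, key=cv.get)      # approach 1: lowest held-out error
print("best ell by log marginal likelihood:", best_lml)
print("best ell by 5-fold cross-validation:", best_cv)
```

A grid search is used here only for simplicity; with a differentiable kernel, the log marginal likelihood could instead be maximized with a gradient-based optimizer. The two criteria need not select the same $\ell$.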
