
Data Analysis: Statistical Modeling and Computation in Applications (MITx 6.419x)
Course / Module 5: Environmental Data and Gaussian Processes / Spatial Prediction

3. How to Model Covariance Matrices

Video transcript (excerpt):

…methods we have been talking about earlier in this course. So now, in summary, again, I have two interpretations. One is that I just get my prediction as a weighted average of the measurement values in the neighborhood, weighted by a function of the distance. Or I could view this as, essentially, a linear regression with nonlinear features, and the features are given by this kernel matrix. Both of these views are valuable, and we will see both of these having effects in what is going to come next in our lecture here.

Video note: At 17:42, $\alpha$ should be referred to as a column vector.

In all of the previous content, we have assumed that the covariance matrix is available. However, we will often be in a situation where we have made some observations, $\mathbf{x}_2$, but do not have the covariances for these observations.

We can instead make an assumption about the covariances.


For example, going back to the case of City 1 and City 2, one can assume that the covariance between the two cities is determined by their distance. In this case,

$$\mathsf{Cov}(X_1, X_2) = k(Z_1, Z_2),$$

where $k(\cdot, \cdot)$ is some covariance function, for example,

$$k(Z_1, Z_2) = \exp\left(-\frac{\|Z_1 - Z_2\|^2}{2\ell^2}\right),$$


where $\ell$ is a parameter to be estimated, called the length-scale. Covariance functions often go by the name of kernels as well; however, kernels are a broader class of functions, and not all kernels are covariance functions.

Recall that $Z_1$ and $Z_2$ are points in some space. In our case, $Z_1$ and $Z_2$ are the physical locations of City 1 and City 2. Thus, $\|Z_1 - Z_2\|$ measures the Euclidean distance in this space between the two cities.
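To make this concrete, here is a minimal sketch of the squared-exponential covariance function in Python with NumPy. This is not part of the course materials; the locations and the length-scale in the usage line are arbitrary numbers chosen for illustration.

```python
import numpy as np

def sq_exp_kernel(z1, z2, length_scale=1.0):
    """Squared-exponential covariance k(z1, z2) = exp(-||z1 - z2||^2 / (2 l^2))."""
    d = np.linalg.norm(np.asarray(z1, dtype=float) - np.asarray(z2, dtype=float))
    return np.exp(-d**2 / (2.0 * length_scale**2))

# Assumed covariance between two cities whose locations are 3 distance units apart,
# for a hypothetical length-scale of 2.0.
print(sq_exp_kernel([0.0, 0.0], [3.0, 0.0], length_scale=2.0))  # ~0.325
```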

The figure below shows an example of generating a covariance matrix from a covariance function. The matrix on
the right is the generated covariance matrix. The plot on the left is a random draw from a multivariate Normal
distribution using this covariance matrix.

Figure 33: An example of a covariance matrix generated by 25 Normal random variables located at equal intervals over the real line.
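The same kernel can be used to fill in a full covariance matrix and to draw a random vector from the corresponding multivariate Normal distribution. The sketch below assumes 25 equally spaced locations, a unit length-scale, and a fixed random seed; these are illustrative choices, not the exact setup behind the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Equally spaced locations on the real line.
N = 25
Z = np.linspace(0.0, 10.0, N)
ell = 1.0  # assumed length-scale

# Covariance matrix built entry-wise from the squared-exponential kernel,
# with a tiny jitter on the diagonal for numerical stability.
Sigma = np.exp(-(Z[:, None] - Z[None, :]) ** 2 / (2.0 * ell**2))
Sigma += 1e-9 * np.eye(N)

# One random draw from the zero-mean multivariate Normal with this covariance.
x = rng.multivariate_normal(mean=np.zeros(N), cov=Sigma)
print(Sigma.shape, x.shape)  # (25, 25) (25,)
```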


We can now ask what happens if we take some of these points on the left-hand plot and, instead of drawing them from the multivariate Normal distribution, set them to our observed measurements and condition the remaining points on those set values. We do this by first partitioning the vector into the $N - d$ observed values, $\mathbf{x}_2$, and the $d$ unobserved values, $\mathbf{x}_1$, such that the mean becomes

$$\mu_{\mathbf{X}} = \begin{bmatrix} \mu_1 \in \mathbb{R}^{d} \\ \mu_2 \in \mathbb{R}^{N-d} \end{bmatrix}.$$

Note that although we have placed the means for the observed values at the end of this vector, they can be
located anywhere on the above plot. The order of the values in the mean vector does not need to correspond to
the spatial or temporal order of the values.
We then compute the full covariance matrix using the covariance function:
$$\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} \in \mathbb{R}^{d \times d} & \boldsymbol{\Sigma}_{12} \in \mathbb{R}^{d \times (N-d)} \\ \boldsymbol{\Sigma}_{21} \in \mathbb{R}^{(N-d) \times d} & \boldsymbol{\Sigma}_{22} \in \mathbb{R}^{(N-d) \times (N-d)} \end{bmatrix}.$$

Finally, we condition the means and covariances for the unobserved values on the observed values:
$$\mu_{\mathbf{X}_1 \mid \mathbf{X}_2} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (\mathbf{x}_2 - \mu_2)$$

$$\Sigma_{\mathbf{X}_1 \mid \mathbf{X}_2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}.$$

You will notice that to do this, we need some values for $\mu_1$ and $\mu_2$, but we only have the observations $\mathbf{x}_2$. Thus, we need to make an assumption on the means. A common (but not necessary) assumption is that $\mu_{\mathbf{X}} = 0$, so that both mean vectors are also zero.


Thus we get

$$\mu_{\mathbf{X}_1 \mid \mathbf{X}_2} = \Sigma_{12} \Sigma_{22}^{-1} \mathbf{x}_2.$$
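The conditioning formulas translate directly into a few lines of linear algebra. Below is a minimal sketch in NumPy; the helper name `gp_condition` and its argument layout are my own choices, not part of the course code.

```python
import numpy as np

def gp_condition(Sigma, obs_idx, x2, mu=None):
    """Condition a jointly Gaussian vector on its observed entries.

    Sigma   : (N, N) full covariance matrix
    obs_idx : indices of the N - d observed entries
    x2      : observed values at obs_idx
    mu      : (N,) mean vector; defaults to zero
    Returns the conditional mean and covariance of the remaining d entries.
    """
    N = Sigma.shape[0]
    mu = np.zeros(N) if mu is None else np.asarray(mu, dtype=float)
    pred_idx = np.setdiff1d(np.arange(N), obs_idx)

    S11 = Sigma[np.ix_(pred_idx, pred_idx)]
    S12 = Sigma[np.ix_(pred_idx, obs_idx)]
    S22 = Sigma[np.ix_(obs_idx, obs_idx)]

    # mu_{X1|X2} = mu_1 + S12 S22^{-1} (x2 - mu_2)
    mu_cond = mu[pred_idx] + S12 @ np.linalg.solve(S22, x2 - mu[obs_idx])
    # Sigma_{X1|X2} = S11 - S12 S22^{-1} S21  (S21 = S12.T since Sigma is symmetric)
    Sigma_cond = S11 - S12 @ np.linalg.solve(S22, S12.T)
    return mu_cond, Sigma_cond
```

Solving the linear system with `np.linalg.solve` rather than explicitly inverting $\Sigma_{22}$ is a deliberate choice for numerical stability.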

The important thing here is that we not only get a mean for each predicted value, but we can also get the variances from the diagonal of $\Sigma_{\mathbf{X}_1 \mid \mathbf{X}_2}$. Each variance acts as a measure of the uncertainty in that prediction. It gauges how accurate the prediction is, and shows the degree to which an actual observation could depart from the predicted mean.
To visualize this uncertainty, a band can be drawn around the predicted mean that extends one or two standard
deviations above and below the mean. An example is shown below for three sets of observations.

Figure 34: An example of estimation on a line segment, with $N - d = 2$ (left), $N - d = 3$ (middle), and $N - d = 5$ (right) observations shown as black points. The predicted mean is shown as a black line, and the grey band is two standard deviations away from the mean, using the computed standard deviation for the prediction.
You will notice that as more observations are added, the band shrinks in size. This shows that more data makes
for a more precise prediction.
You will also notice that in the above figure there is no hint of the discretization that you might expect from a finite number of prediction variables. Instead of choosing some large $d$ and computing the full $d \times d$ covariance matrix, we instead set $d = 1$. This results in just a single prediction, but we are free to move the location of this prediction around. The above plot is made by scanning this single prediction along the $x$-axis and plotting the mean and standard deviation as a function of the prediction location $x$.
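A sketch of this scanning procedure is shown below, assuming the squared-exponential kernel from earlier (so the prior variance at every location is $k(z, z) = 1$). The observed locations and values, the length-scale, and the prediction grid are all made-up numbers for illustration.

```python
import numpy as np

ell = 1.0  # assumed length-scale

def k(a, b):
    # Squared-exponential kernel, broadcast over 1D location arrays.
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * ell**2))

# N - d observed locations and values (hypothetical numbers).
z_obs = np.array([1.0, 4.0, 7.0])
x_obs = np.array([0.5, -0.3, 1.1])

# Scan a single (d = 1) prediction location along the x-axis.
z_grid = np.linspace(0.0, 10.0, 200)
K22 = k(z_obs, z_obs) + 1e-9 * np.eye(len(z_obs))
K12 = k(z_grid, z_obs)  # cross-covariances, one row per prediction location

mean = K12 @ np.linalg.solve(K22, x_obs)                         # predicted mean at each grid point
var = 1.0 - np.sum(K12 * np.linalg.solve(K22, K12.T).T, axis=1)  # predictive variance
std = np.sqrt(np.maximum(var, 0.0))

lower, upper = mean - 2 * std, mean + 2 * std  # two-standard-deviation band
```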

Quantifying Uncertainty 1
If one could select the next point to take a measurement, is the mid-point between two existing points the best
place to make this measurement? That is, the best place in terms of the maximal reduction of the shaded area.
Yes
No
Not Necessarily


Quantifying Uncertainty 2
Is the shaded gray area symmetric? That is, the same width above and below the black line?
Yes
No




Quantifying Uncertainty 3
Is it possible that at some point in our estimation, when observing a new point on the line, the width of the shaded
gray area increases for some values of the x-axis?
Yes
No


The figure below shows another example of the effect of spatial correlations, for temperature measurements. The top image shows a diagram of a classroom, where the locations of the temperature sensors are drawn as black hexagons. The two bottom images show the corresponding estimates of the means and variances. This is a two-dimensional version of the example shown above.
Note in particular that the area that is least densely covered by sensors is precisely the area where the estimates have the highest variance. This is expected: since the covariance decreases with distance, the areas that are effectively farthest from a sensor will be the ones with the highest uncertainty about their estimated values. Of course, those are not necessarily the precise points that are farthest in distance, as they need to be weighted by their corresponding variances.

Figure 35: An example of estimation on a 2D spatial dataset.
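A two-dimensional version of the same computation might look like the sketch below; the sensor coordinates, temperature readings, reference mean, and length-scale are all hypothetical values, not the classroom data behind the figure.

```python
import numpy as np

ell = 2.0  # assumed length-scale, in the same units as the room coordinates

def k(A, B):
    # Squared-exponential kernel on 2D locations (one location per row).
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * ell**2))

# Hypothetical sensor locations (x, y) in a room and their temperature readings.
Z_obs = np.array([[1.0, 1.0], [8.0, 1.5], [4.0, 6.0]])
temps = np.array([21.3, 22.1, 20.8])
ref = temps.mean()      # work with deviations from a reference mean
x_obs = temps - ref

# Prediction grid covering the room.
gx, gy = np.meshgrid(np.linspace(0, 10, 50), np.linspace(0, 8, 40))
Z_grid = np.column_stack([gx.ravel(), gy.ravel()])

K22 = k(Z_obs, Z_obs) + 1e-9 * np.eye(len(Z_obs))
K12 = k(Z_grid, Z_obs)

mean_map = (K12 @ np.linalg.solve(K22, x_obs)).reshape(gx.shape) + ref
var_map = (1.0 - np.sum(K12 * np.linalg.solve(K22, K12.T).T, axis=1)).reshape(gx.shape)
# var_map is largest in the regions farthest (in kernel terms) from any sensor.
```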

