Lecture 2

This document discusses linear discrete inverse problems and parameter estimation using least squares methods. It provides background on least squares problems, how they relate to maximum likelihood estimates, and how to assess goodness of fit. It explains that for over-determined problems with independent Gaussian errors, minimizing the weighted prediction error gives the least squares solution. It also discusses how the chi-square statistic can be used to test if a model fit is consistent with the estimated data uncertainties and describes how data errors propagate into uncertainties on the estimated model parameters.

Linear discrete inverse problems

(parameter estimation)

Least squares and all that....


Least squares problems

Least squares is the basis of many parameter estimation and data fitting procedures.

A concise tutorial can be found in Chapter 15 of the book Numerical Recipes (Press et al., 1992, Cambridge Univ. Press), available for free online at http://www.nrbook.com

A good explanation of the essentials is given in Aster et al. (2005).

2
Linear discrete inverse problems

Can a and b be resolved?

Under-determined

Over-determined

Even-determined

3
Over-determined: Linear discrete inverse problem

To find the best-fit model we can minimize the prediction error of the solution.

But the data contain errors. Assuming these are independent and normally distributed, we weight each residual inversely by the standard deviation of the corresponding (known) error distribution.

We can then obtain a least squares solution by minimizing the weighted prediction error of the solution.

4
Over-determined: Linear discrete inverse problem

We seek the model vector m which minimizes the weighted prediction error (compare with the maximum likelihood estimate).

Note that this is a quadratic function of the model vector.

Solution: differentiate with respect to m and solve for the model vector which gives a zero gradient. This gives…

This is the least-squares solution.

A solution to the normal equations:
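The equations on this slide were images in the original; a standard reconstruction, consistent with the generalized inverse quoted later in the deck:

    $$ E(m) = (d - Gm)^T C_D^{-1} (d - Gm), \qquad \nabla_m E = -2\,G^T C_D^{-1}(d - Gm) = 0, $$
    $$ G^T C_D^{-1} G\, m_{LS} = G^T C_D^{-1} d \quad\Longrightarrow\quad m_{LS} = (G^T C_D^{-1} G)^{-1} G^T C_D^{-1} d. $$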

5
Over-determined: Linear discrete inverse problem

How does the least-squares solution compare to the standard equations of linear regression?

Given N data yi with independent, normally distributed errors of standard deviation σi, what are the expressions for the model parameters m = [a, b]T?
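The expressions were given as images in the original; the standard weighted-regression formulas (assuming the straight-line model yi = a + b xi, cf. Numerical Recipes ch. 15) are:

    $$ w_i = 1/\sigma_i^2,\quad S = \sum w_i,\; S_x = \sum w_i x_i,\; S_y = \sum w_i y_i,\; S_{xx} = \sum w_i x_i^2,\; S_{xy} = \sum w_i x_i y_i, $$
    $$ a = \frac{S_{xx} S_y - S_x S_{xy}}{S\,S_{xx} - S_x^2}, \qquad b = \frac{S\,S_{xy} - S_x S_y}{S\,S_{xx} - S_x^2}. $$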

6
Linear discrete inverse problem: Least squares

What happens in the under- and even-determined cases?

Under-determined, N=1:

The matrix has a zero determinant and a zero eigenvalue. What is m? An infinite number of solutions exist.

Even-determined, N=2:

What is the prediction error r = d − Gm? The prediction error is zero!


7
Example: Over-determined, Linear discrete inverse
problem

The Ballistics example

Given the data and noise, calculate G.

Is the data fit good enough?

And how do errors in the data propagate into the solution?
8
The two questions in parameter
estimation

We have our fitted model parameters

…but we are far from finished!

We need to:

1. Assess the quality of the data fit. Goodness of fit: does the model fit the data to within the statistical uncertainty of the noise?

2. Estimate how errors in the data propagate into the model. What are the errors on the model parameters?

9
Goodness of fit

Once we have our least squares solution mLS, how do we know whether the fit is good enough given the errors in the data?

Use the prediction error at the least squares solution!

If the data errors are Gaussian, this is a chi-square statistic.
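The statistic itself was an image in the original slides; the standard form is:

    $$ \chi^2 = \sum_{i=1}^{N} \frac{\bigl(d_i - (G\,m_{LS})_i\bigr)^2}{\sigma_i^2}. $$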

10
Why do we always assume errors are Gaussian?
Probability density functions of 5 random variables

[Figure: empirical probability density functions of five random variables X1–X5, each estimated from 100000 deviates; the distribution of the sum X = Σ Xi approaches a Gaussian, illustrating the Central Limit Theorem.]
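A quick numerical illustration of the figure's point (a sketch with my own variable names, not from the original slides):

    % Central Limit Theorem demo: the sum of n independent uniform
    % deviates approaches a Gaussian shape as n grows.
    ndev = 100000;                      % deviates per variable, as in the figure
    ns = [1 2 5];                       % how many uniforms to sum
    for k = 1:numel(ns)
        X = sum(rand(ns(k), ndev), 1);  % sum of ns(k) U(0,1) variables
        subplot(1, 3, k);
        histogram(X, 50, 'Normalization', 'pdf');
        title(sprintf('sum of %d uniform deviates', ns(k)));
    end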

11
Mathematical Background:
Probability distributions

A random variable x follows a Normal or Gaussian probability density function; a random vector follows its multivariate analogue (both are written out below).

If the data covariance matrix is diagonal, then the data errors are independent.
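The density formulas were images in the original slides; the standard forms are:

    $$ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), $$
    $$ f(\mathbf{d}) = \frac{1}{(2\pi)^{N/2}|C_D|^{1/2}}\exp\!\left(-\tfrac{1}{2}(\mathbf{d}-\boldsymbol{\mu})^T C_D^{-1} (\mathbf{d}-\boldsymbol{\mu})\right). $$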

12
Mathematical Background:
Probability distributions

Expectation operator

Expectation of a Gaussian random variable

Variance

Covariance

If X and Y are independent, then Cov(X, Y) = 0.
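The definitions themselves were images in the original slides; the standard forms, including the Gaussian case, are:

    $$ E[X] = \int x\,f(x)\,dx, \qquad E[X] = \mu \ \text{for } X \sim N(\mu, \sigma^2), $$
    $$ \mathrm{Var}(X) = E\bigl[(X - E[X])^2\bigr] = \sigma^2, \qquad \mathrm{Cov}(X, Y) = E\bigl[(X - E[X])(Y - E[Y])\bigr]. $$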

13
Mathematical Background:
Probability distributions

Multi-dimensional Gaussian

Expectation of a Gaussian random vector

Covariance matrix

$$ C = \begin{bmatrix} \sigma_{1,1} & \sigma_{1,2} & \sigma_{1,3} & \sigma_{1,4} \\ \sigma_{2,1} & \sigma_{2,2} & \sigma_{2,3} & \sigma_{2,4} \\ \sigma_{3,1} & \sigma_{3,2} & \sigma_{3,3} & \sigma_{3,4} \\ \sigma_{4,1} & \sigma_{4,2} & \sigma_{4,3} & \sigma_{4,4} \end{bmatrix} $$

Correlation matrix (see below)

Independence between X and Y

Positive or negative correlation
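For reference (a standard definition, not shown in the original slides), the correlation matrix has entries

    $$ \rho_{i,j} = \frac{\sigma_{i,j}}{\sqrt{\sigma_{i,i}\,\sigma_{j,j}}} \in [-1, 1], $$

so ρij = 0 for independent variables, and its sign indicates positive or negative correlation.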

14
Background: Chi-square distribution
If x follows a standard Normal distribution, what distribution does the square of x follow?

Answer: a chi-square distribution with 1 degree of freedom.

If x1 and x2 are standard Normal, what distribution does y = x1² + x2² follow?

Answer: a chi-square distribution with 2 degrees of freedom.
15
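For reference, the χ²ν probability density (a standard formula, not shown in the original slides):

    $$ f(y;\nu) = \frac{y^{\nu/2-1} e^{-y/2}}{2^{\nu/2}\,\Gamma(\nu/2)}, \qquad y \ge 0. $$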
Goodness of fit
For Gaussian data errors the weighted prediction error is a sum of squares of Gaussian random variables, hence it has a chi-square probability density function with N − M degrees of freedom.

[Figure: χ² probability density with the p = 0.95 and p = 0.05 tail points marked.]

ndf   χ²(5%)   χ²(50%)   χ²(95%)
  5     1.15     4.35     11.07
 10     3.94     9.34     18.31
 20    10.85    19.34     31.41
 50    34.76    49.33     67.50
100    77.93    99.33    124.34

The test provides a means of testing the assumptions that went into producing the least squares solution. It gives the likelihood that the fit actually achieved is reasonable.
16
Example: Goodness of fit

The Ballistics problem

Given data and noise

How many degrees of freedom? ν = N − M = 10 − 3 = 7

In practice, p-values between 0.1 and 0.9 are plausible.
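A sketch of the corresponding test in base MATLAB (the χ² value is an assumed example; gammainc(x/2, ν/2) is the χ² CDF, so no toolbox is needed):

    % p-value for a chi-square statistic with nu degrees of freedom.
    nu   = 7;                          % N - M = 10 - 3 for the ballistics problem
    chi2 = 8.0;                        % assumed example value of the statistic
    p    = 1 - gammainc(chi2/2, nu/2); % probability of exceeding chi2 by chance
    fprintf('p = %.3f\n', p);          % plausible fit if roughly 0.1 < p < 0.9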


17
Variance ratio

Another way of looking at the chi-square statistic is as the variance ratio of two distributions.

Given two sets of random samples, did they come from the same distribution or not?

The ratio of the sample variance to the assumed variance (scaled by the number of samples) follows a chi-square distribution.

Chi-square tables then tell us the likelihood that the two sets of observables come from the same distribution.

18
Goodness of fit

For Gaussian data errors the chi-square statistic has a chi-square distribution with ν = N − M degrees of freedom.

ndf   χ²(5%)   χ²(50%)   χ²(95%)
  5     1.15     4.35     11.07
 10     3.94     9.34     18.31
 20    10.85    19.34     31.41
 50    34.76    49.33     67.50
100    77.93    99.33    124.34

Exercise:

If I fit 7 data points with a straight line and get χ² = …, what would you conclude?

If I fit 102 data points with a straight line and get χ² = …, what would you conclude?

If I fit 52 data points with a straight line and get χ² = …, what would you conclude?

19
Goodness of fit

For Gaussian data errors the chi-square statistic has a chi-square distribution with ν = N − M degrees of freedom.

ndf   χ²(5%)   χ²(50%)   χ²(95%)
  5     1.15     4.35     11.07
 10     3.94     9.34     18.31
 20    10.85    19.34     31.41
 50    34.76    49.33     67.50
100    77.93    99.33    124.34

What could be the cause if:

the prediction error is much too large? (poor data fit)
- Truly unlikely data errors
- Errors in the forward theory
- Under-estimated data errors

the prediction error is too small? (too good a data fit)
- Truly unlikely data errors
- Over-estimated data errors
- Fraud!

20
Solution error
Once we have our least squares solution mLS and we know that the data fit is acceptable, how do we find the likely errors in the model parameters arising from errors in the data?

mLS = G−g d

The data set we actually observed is only one realization of


the many that could have been observed

d′ = d + ε
m′LS = mLS + εm

m′LS = G−g d′
mLS + εm = G−g (d + ε)

The effect of adding noise to the data is to add noise to the solution:

εm = G−g ε

The model noise is a linear combination of the data noise!
21
Solution error: Model Covariance

Multivariate Gaussian data error distribution.

How do we turn this into a probability distribution for the model errors?

We know that the solution error is a linear combination of the data error:

εm = G−g ε

The covariance of any linear combination Ad of Gaussian-distributed random variables d is Cov(Ad) = A CD AT.

So we have the covariance of the model parameters:
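The resulting expression (a standard reconstruction; it follows by substituting G−g for A above):

    $$ C_M = G^{-g} C_D \,(G^{-g})^T = (G^T C_D^{-1} G)^{-1}. $$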

22
Solution error: Model Covariance

The model covariance for a least squares problem depends on the data errors and not on the data itself! G is controlled by the design of the experiment.

mLS = G−g d is the least squares solution.

The data error distribution gives a model error distribution!

23
Solution error: Model Covariance

For the special case of independent data errors, CD = σ²I.

Independent data errors → correlated model errors.

For the linear regression problem:
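A standard reconstruction of this special case (with CD = σ²I the weighted expression reduces to the unweighted one):

    $$ C_M = \sigma^2 (G^T G)^{-1}. $$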

24
Confidence intervals by projection (1-D)
The model covariance CM is a symmetric M × M matrix.

In the multi-dimensional model space, the value of Δ² follows a χ² distribution with M degrees of freedom.

Projecting onto the mi axis, the 1-D confidence interval becomes mi ± Δ √((CM)ii), where Δ² follows a χ²₁ distribution.

e.g. for 90% interval on m1
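The interval itself was an image in the original; the standard form, using the 90% point of χ²₁ (Δ² = 2.71, so Δ ≈ 1.645):

    $$ m_1 \pm 1.645\,\sqrt{(C_M)_{11}}. $$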

Note this is the 90% confidence interval on m1 alone. The joint (m1, m2) 90% confidence ellipse is wider than this.
25
Example: Model Covariance and confidence intervals

For the ballistics problem:

95% confidence interval for parameter i: mi ± 1.96 √((CM)ii)

26
Confidence intervals by projection
The M-dimensional confidence ellipsoid can be projected onto any subset (or combination) of ν parameters to obtain the corresponding confidence ellipsoid.

Full M-dimensional ellipsoid

Projected ν-dimensional ellipsoid

Projected model vector

Projected covariance matrix

Chosen percentage point of the χ²ν distribution

Example: find the 90% confidence ellipse for (x, y) from a 3-D (x, y, z) ellipsoid.

Can you see that this procedure gives the same formula for the 1-D case obtained previously?
27
Recap: Goodness of fit and model covariance

Once a best-fit solution has been obtained, we test goodness of fit with a chi-square test (assuming Gaussian statistics).

If the model passes the goodness-of-fit test we may proceed to evaluating model covariance (if not, then your estimated data errors are probably too small).

Evaluate the model covariance matrix.

Plot the model, or projections of it onto chosen subsets of parameters.

Calculate confidence intervals using the projection equation, where Δ² follows a χ²₁ distribution.


28
Robust data fitting with the L1 norm

Least squares solutions are not robust to outliers.

Minimize the L1 misfit ||d − Gm||1 instead.

We can calculate an L1 solution with the IRLS (iteratively reweighted least squares) algorithm, in which a diagonal weighting matrix that depends on the current model is updated at each iteration; see the sketch below.

See section 2.4 of Aster et al. (2005).
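A minimal IRLS sketch (my own function and variable names; the 1/|ri| weighting with a small tolerance is the usual scheme):

    % Iteratively Reweighted Least Squares for min ||d - G*m||_1.
    function m = irls_l1(G, d, niter, tol)
        m = (G' * G) \ (G' * d);             % start from the L2 solution
        for k = 1:niter
            r = d - G * m;                    % current residuals
            w = 1 ./ max(abs(r), tol);        % L1 weights, guarded near r = 0
            W = diag(w);                      % diagonal weighting matrix
            m = (G' * W * G) \ (G' * W * d);  % weighted least squares update
        end
    end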

29
Monte Carlo error propagation

It is possible to define an approximate p statistic for the L1 solution and hence test the goodness of fit of the solution. However, there is no analytical solution for the error propagation.

…but Monte Carlo error propagation can be used.

1. Calculate the data prediction from the solution

2. Add a random realization of noise to the data and repeat the IRLS algorithm

3. Repeat Q times and generate difference vectors (a sketch follows below)
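A sketch of the procedure (assumes the hypothetical irls_l1 from the previous slide, an L1 solution mL1, and a vector sigma of data standard deviations):

    % Monte Carlo error propagation for the L1/IRLS solution.
    dpre = G * mL1;                             % 1. data predicted by the solution
    Q = 1000;                                   % number of noise realizations
    dm = zeros(numel(mL1), Q);
    for q = 1:Q
        dq = dpre + sigma .* randn(size(dpre)); % 2. add a noise realization
        mq = irls_l1(G, dq, 50, 1e-8);          %    and repeat IRLS
        dm(:, q) = mq - mL1;                    % 3. difference vectors
    end
    Cm_mc = cov(dm');                           % empirical model covariance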

30
Monte Carlo error propagation

For the ballistics problem we get

Compare to the LS solution without the outlier.

31
What if we do not know the errors on the data?

Both chi-square goodness-of-fit tests and model covariance calculations require knowledge of the variance of the data.

What can we do if we do not know σ?

Consider the case of independent data errors:

CD = σ²I

An estimate of σ² can then be calculated from the least squares solution itself:
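The estimator itself was an image in the original; the standard unbiased choice is

    $$ s^2 = \frac{1}{N-M}\sum_{i=1}^{N}\bigl(d_i - (G\,m_{LS})_i\bigr)^2, \qquad C_M \approx s^2 (G^T G)^{-1}. $$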

So we can still estimate model errors using the calculated data errors, but we can no longer claim anything about goodness of fit.

32
Example: Over-determined, Linear discrete inverse
problem

MATLAB exercise
Generate data with Gaussian noise for a linear regression problem and invert for the best-fitting gradient and intercept.

1. Generate xi points randomly between 0 and 10

2. Calculate data yi

3. Add N[0,σ] noise to data yi

4. Calculate G matrix

5. Use MATLAB matrix routines to solve the normal equations

6. Plot the data, the data errors, and the least squares solution (a solution sketch follows below)
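One possible solution sketch (parameter values are placeholders):

    % Linear regression y = a + b*x with synthetic Gaussian noise.
    N = 50; a = 1.0; b = 2.0; sigma = 0.5;    % placeholder values
    x = 10 * rand(N, 1);                       % 1. random x in [0, 10]
    y = a + b * x;                             % 2. noise-free data
    y = y + sigma * randn(N, 1);               % 3. add N(0, sigma) noise
    G = [ones(N, 1), x];                       % 4. G matrix for m = [a; b]
    m = (G' * G) \ (G' * y);                   % 5. solve the normal equations
    errorbar(x, y, sigma * ones(N, 1), 'o');   % 6. data with error bars
    hold on;
    xf = [0; 10];
    plot(xf, m(1) + m(2) * xf, 'r-');          %    and the least squares fit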
33
Model Resolution matrix

If we obtain a solution to an inverse problem, we can ask what its relationship is to the true solution.

mest = G−g d
But we know

d = Gmtrue
and hence

mest = G−g Gmtrue = Rmtrue

The matrix R = G−g G measures how 'good an inverse' G−g is.

The matrix R shows how the elements of mest are built from linear combinations of the true model mtrue. Hence R measures the amount of blurring produced by the inverse operator.
For the least squares solution we have

$$ G^{-g} = (G^T C_D^{-1} G)^{-1} G^T C_D^{-1} \quad\Longrightarrow\quad R = I $$

34
Example: Model resolution in a tomographic
experiment

mest = Rmtrue

If the calculated model resolution matrix looked like this:

$$ R = \begin{bmatrix} 0.75 & -0.25 & 0.25 & 0.25 \\ -0.25 & 0.75 & 0.25 & 0.25 \\ 0.25 & 0.25 & 0.75 & -0.25 \\ 0.25 & 0.25 & -0.25 & 0.75 \end{bmatrix} $$

What units do the elements of R have?

Spike test: [Figure: true model and recovered model]

35
Data Resolution matrix

If we obtain a solution to an inverse problem, we can ask how its predictions compare to the data.

dpre = Gmest
But we know

mest = G−g dobs


and hence

dpre = GG−g dobs = Ddobs

The matrix D is analogous to the model resolution matrix R, but measures how well the model produced by G−g can reproduce the data. If D = I then the data are fit exactly and the prediction error d − Gm is zero.

36
Recap: Linear discrete inverse problems

The least squares solution minimizes the prediction error.

Goodness-of-fit criteria tell us whether the least squares model adequately fits the data, given the level of noise: chi-square with N − M degrees of freedom.

The covariance matrix describes how noise propagates from the data to the estimated model: chi-square with M degrees of freedom gives confidence intervals.

The resolution matrix describes how the estimated model relates to the true model.
37
