Fitting

[Cover figures: a theory-versus-observation comparison of the CMB temperature power spectrum, $l(l+1)C_l^{TT}/2\pi$ ($\mu$K$^2$) versus multipole $l$, with data from WMAP, BICEP, QUAD, CBI, ACT, and SPT; and a Bayesian-analysis contour plot in the $(\Omega_{k,0}, \Omega_{\Lambda,0})$ plane.]
José-Alberto Vázquez
ICF-UNAM / Kavli-Cambridge
In progress
August 12, 2021
1 Curve Fitting
Data are often given for discrete values along a continuum. However, you may require estimates at points between the discrete values. Two general approaches exist:

1. Where any individual data point may be incorrect, we make no effort to intersect every point. Rather, the curve is designed to follow the pattern of the points taken as a group: least-squares regression.

2. Where the data are known to be very precise, the basic approach is to fit a curve or a series of curves that pass directly through each of the points: interpolation.
Two types of applications are generally encountered when fitting experimental data: trend
analysis and hypothesis testing.
• Trend analysis may be used to predict or forecast values of the dependent variable. This
can involve extrapolation beyond the limits of the observed data or interpolation within
the range of the data.
1.1 Simple Statistics

The arithmetic mean $\bar{y}$ of a sample is the sum of the individual data points $y_i$ divided by the number of points $n$,

$$ \bar{y} = \frac{\sum y_i}{n}. $$
The most common measure of spread for a sample is the standard deviation $s_y$ about the mean,

$$ s_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n-1}}, $$
or the variance,

$$ s_y^2 = \frac{\sum (y_i - \bar{y})^2}{n-1}, $$
with n − 1 degrees of freedom.
A computationally convenient formulation is

$$ s_y^2 = \frac{\sum y_i^2 - \left(\sum y_i\right)^2/n}{n-1}. $$
Notice that it does not require precomputation of ȳ.
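As a quick illustration, here is a minimal Python sketch of these formulas; the sample array `y` is made up for the example:

```python
import numpy as np

y = np.array([6.495, 6.595, 6.615, 6.635, 6.485, 6.555])  # illustrative sample

n = len(y)
ybar = y.sum() / n                                  # arithmetic mean

s2_direct = ((y - ybar)**2).sum() / (n - 1)         # variance about the mean
s2_alt = (np.sum(y**2) - y.sum()**2 / n) / (n - 1)  # no precomputation of ybar
sy = np.sqrt(s2_direct)                             # standard deviation

print(ybar, sy, s2_direct, s2_alt)                  # the two variances agree
```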
1.2 The Normal Distribution
The distribution, that is, the shape with which the data are spread around the mean, is another important characteristic. A histogram provides a simple visual representation of the distribution.

The probability that the true mean of $y$, $\mu$, falls within the bound from $L$ to $U$ is $1 - \alpha$. Higher moments, such as the skewness and the kurtosis, further characterize the shape of the distribution.
1.3 Interpolation
You will frequently have occasion to estimate intermediate values between precise data points.
For n + 1 data points, there is one and only one polynomial of order n that passes through all
the points.
$$ f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n, $$
Polynomial interpolation consists of determining the unique nth-order polynomial that fits
n + 1 data points.
The simplest form of interpolation is to connect two data points with a straight line.
$$ f_1(x) = f(x_0) + \frac{f(x_1) - f(x_0)}{x_1 - x_0}\,(x - x_0), \qquad (1.2) $$

where the notation $f_1(x)$ designates a first-order interpolating polynomial.
If three data points are available, the interpolation can be accomplished with a second-order polynomial (also called a quadratic polynomial or a parabola).
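A minimal sketch of Eq. (1.2); the sample points (estimating $\ln 2$ from $\ln 1$ and $\ln 6$) are illustrative:

```python
def linear_interp(x, x0, y0, x1, y1):
    """First-order interpolating polynomial, Eq. (1.2)."""
    return y0 + (y1 - y0) / (x1 - x0) * (x - x0)

# estimate ln(2) from ln(1) = 0 and ln(6) = 1.791759
print(linear_interp(2.0, 1.0, 0.0, 6.0, 1.791759))  # 0.3583518 (true: ln 2 = 0.6931)
```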
1.4 Lagrange Polynomial
The second-order Newton polynomial, $f_2(x) = b_0 + b_1(x - x_0) + b_2(x - x_0)(x - x_1)$, can be expressed in the standard form

$$ f_2(x) = a_0 + a_1 x + a_2 x^2, \qquad (1.4) $$
where

$$ a_0 = b_0 - b_1 x_0 + b_2 x_0 x_1, \qquad (1.5) $$

$$ a_1 = b_1 - b_2 x_0 - b_2 x_1, \qquad (1.6) $$

$$ a_2 = b_2, \qquad (1.7) $$

and the $b$ coefficients are the divided differences

$$ b_0 = f(x_0), \qquad (1.8) $$

$$ b_1 = \frac{f(x_1) - f(x_0)}{x_1 - x_0}, \qquad (1.9) $$

$$ b_2 = \frac{\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} - \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}}{x_2 - x_0}. \qquad (1.10) $$
For the first-order case, the Lagrange form is obtained by introducing the weighting functions

$$ L_0(x) = \frac{x - x_1}{x_0 - x_1} \qquad \text{and} \qquad L_1(x) = \frac{x - x_0}{x_1 - x_0}, \qquad (1.11) $$

and defining

$$ f_1(x) = L_0(x) f(x_0) + L_1(x) f(x_1). \qquad (1.12) $$
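The same weighting idea extends to higher orders. A short sketch of the general Lagrange form (the product formula below is the standard extension of Eq. (1.11), not derived above):

```python
import numpy as np

def lagrange_interp(x, xi, yi):
    """Evaluate the Lagrange interpolating polynomial through (xi, yi) at x."""
    total = 0.0
    n = len(xi)
    for i in range(n):
        # L_i(x): product of (x - x_j)/(x_i - x_j) over all j != i
        Li = 1.0
        for j in range(n):
            if j != i:
                Li *= (x - xi[j]) / (xi[i] - xi[j])
        total += Li * yi[i]
    return total

xi = np.array([1.0, 4.0, 6.0])
yi = np.log(xi)
print(lagrange_interp(2.0, xi, yi))  # second-order estimate of ln(2)
```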
For cases where the order of the polynomial is unknown, the Newton method has advantages because of the insight it provides into the behavior of the different-order formulas. The Lagrange version is somewhat easier to program; because it does not require computation and storage of divided differences, the Lagrange form is often used when the order of the polynomial is known a priori. (See also Hermite interpolation.)
1.5 Splines
In the previous sections, nth-order polynomials were used to interpolate between n + 1 data
points. An alternative approach is to apply lower-order polynomials to subsets of data points.
Such connecting polynomials are called spline functions.
$$ f_i(x) = a_i x^2 + b_i x + c_i, $$
For n + 1 data points, there are n intervals and, consequently, 3n unknown constants to
evaluate. Therefore, 3n equations or conditions are required to evaluate the unknowns. These
are:
1. The function values of adjacent polynomials must be equal at the interior knots:

$$ a_{i-1} x_{i-1}^2 + b_{i-1} x_{i-1} + c_{i-1} = f(x_{i-1}), \qquad a_i x_{i-1}^2 + b_i x_{i-1} + c_i = f(x_{i-1}), $$

for $i = 2$ to $n$. Because only interior knots are used, each of these equations provides $n - 1$ conditions, for a total of $2n - 2$ conditions.
2. The first and last functions must pass through the end points:

$$ a_1 x_0^2 + b_1 x_0 + c_1 = f(x_0), \qquad a_n x_n^2 + b_n x_n + c_n = f(x_n), \qquad (1.20) $$

which adds two conditions, for a total of $2n - 2 + 2 = 2n$ conditions.
1.6 Quadratic Splines
The remaining conditions come from requiring the first derivatives of adjacent polynomials to be equal at the interior knots ($n - 1$ conditions) and from one final assumption, typically that the second derivative is zero at the first point, which implies

$$ a_1 = 0. \qquad (1.22) $$
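A sketch that assembles these $3n$ conditions into a linear system and solves it with NumPy; the knot data and the coefficient ordering are choices of this example:

```python
import numpy as np

def quadratic_spline(x, y):
    """Solve for the coefficients (a_i, b_i, c_i) of n quadratic splines."""
    n = len(x) - 1                      # number of intervals
    A = np.zeros((3 * n, 3 * n))
    r = np.zeros(3 * n)
    row = 0
    # unknown layout: [a_0, b_0, c_0, a_1, b_1, c_1, ...] (zero-indexed splines)
    for k in range(1, n):               # continuity at interior knots
        for i in (k - 1, k):            # both adjacent splines pass through (x_k, y_k)
            A[row, 3*i:3*i+3] = [x[k]**2, x[k], 1.0]
            r[row] = y[k]
            row += 1
    A[row, 0:3] = [x[0]**2, x[0], 1.0]; r[row] = y[0]; row += 1   # left end point
    A[row, -3:] = [x[n]**2, x[n], 1.0]; r[row] = y[n]; row += 1   # right end point
    for k in range(1, n):               # first derivatives match at interior knots
        A[row, 3*(k-1):3*(k-1)+2] = [2*x[k], 1.0]
        A[row, 3*k:3*k+2] = [-2*x[k], -1.0]
        row += 1
    A[row, 0] = 1.0                     # Eq. (1.22): first spline has no quadratic term
    return np.linalg.solve(A, r).reshape(n, 3)

x = np.array([3.0, 4.5, 7.0, 9.0])      # illustrative knots
y = np.array([2.5, 1.0, 2.5, 0.5])
print(quadratic_spline(x, y))
```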
1.7 Least-Squares Regression
Where substantial error is associated with data, polynomial interpolation is inappropriate and
may yield unsatisfactory results when used to predict intermediate values.
A more appropriate strategy for such cases is to derive an approximating function that fits
the shape or general trend of the data without necessarily matching the individual points.
One way to do this is to derive a curve that minimizes the discrepancy between the data points and the curve. A technique for accomplishing this objective is called least-squares regression.

The simplest example of a least-squares approximation is fitting a straight line to a set of paired observations,

$$ y = a_0 + a_1 x + e. $$
The error, or residual, is the discrepancy between the true value of $y$ and the approximate value predicted by the linear equation.
Minimize the sum of the squares of the residuals between the measured y and the y calculated
with the linear model
$$ S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_{i,\text{measured}} - y_{i,\text{model}}\right)^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2. $$

To determine $a_0$ and $a_1$, differentiate with respect to each coefficient:

$$ \frac{\partial S_r}{\partial a_0} = -2 \sum \left(y_i - a_0 - a_1 x_i\right), \qquad (1.23) $$

$$ \frac{\partial S_r}{\partial a_1} = -2 \sum \left[\left(y_i - a_0 - a_1 x_i\right) x_i\right]. \qquad (1.24) $$
Setting the derivatives equal to zero yields

$$ 0 = \sum y_i - \sum a_0 - \sum a_1 x_i, \qquad (1.25) $$

$$ 0 = \sum y_i x_i - \sum a_0 x_i - \sum a_1 x_i^2. \qquad (1.26) $$

Now, realizing that $\sum a_0 = n a_0$, we can write

$$ n a_0 + a_1 \sum x_i = \sum y_i, \qquad (1.27) $$

$$ a_0 \sum x_i + a_1 \sum x_i^2 = \sum y_i x_i. \qquad (1.28) $$
These are called the normal equations; they can be solved simultaneously to give

$$ a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} $$

and

$$ a_0 = \bar{y} - a_1 \bar{x}. $$
The standard error of the estimate,

$$ s_{y/x} = \sqrt{\frac{S_r}{n-2}}, $$

quantifies the spread around the regression line. This concept can be used to quantify the “goodness” of our fit.
The difference between the two quantities, $S_t - S_r$, where $S_t = \sum (y_i - \bar{y})^2$, quantifies the improvement or error reduction due to describing the data in terms of a straight line rather than as an average value:

$$ r^2 = \frac{S_t - S_r}{S_t}, $$
where $r^2$ is called the coefficient of determination and $r$ is the correlation coefficient. For a perfect fit, $S_r = 0$ and $r = 1$, signifying that the line explains 100 percent of the variability of the data. For $r = 0$, $S_r = S_t$ and the fit represents no improvement.
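A compact sketch combining the normal-equation solution with these goodness-of-fit measures; the data array is illustrative:

```python
import numpy as np

def linfit(x, y):
    """Least-squares straight line y = a0 + a1*x from the normal equations."""
    n = len(x)
    a1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / \
         (n * np.sum(x**2) - np.sum(x)**2)
    a0 = np.mean(y) - a1 * np.mean(x)
    Sr = np.sum((y - a0 - a1 * x)**2)        # sum of squared residuals
    St = np.sum((y - np.mean(y))**2)         # total spread about the mean
    s_yx = np.sqrt(Sr / (n - 2))             # standard error of the estimate
    r2 = (St - Sr) / St                      # coefficient of determination
    return a0, a1, s_yx, r2

x = np.arange(1.0, 8.0)
y = np.array([0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5])
print(linfit(x, y))
```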
For computer implementation, $r$ can be written as

$$ r = \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{\left[n \sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n \sum y_i^2 - \left(\sum y_i\right)^2\right]}}. $$
In some cases, techniques such as polynomial regression are appropriate. For others, transformations can be used to express the data in a form that is compatible with linear regression. One example is the exponential model

$$ y = a_1 e^{b x}. $$
Another is the power model

$$ y = a_2 x^{b}. $$

Both can be linearized by taking logarithms and then fit with linear regression, as sketched below.
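A short sketch of the linearization idea for the exponential model, fitting $\ln y = \ln a_1 + b x$ as a straight line; the synthetic data and noise level are assumptions of the example:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 20)
y = 2.0 * np.exp(0.5 * x) * (1 + 0.02 * rng.standard_normal(20))  # noisy y = 2 e^{0.5x}

# linearize: ln(y) = ln(a1) + b*x, then fit a straight line
b, ln_a1 = np.polyfit(x, np.log(y), 1)   # highest-degree coefficient first
print(np.exp(ln_a1), b)                  # recovered a1 ~ 2, b ~ 0.5
```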
The least-squares procedure can be readily extended to fit the data to a higher-order polynomial.
For example, suppose that we fit a second-order polynomial or quadratic
$$ y = a_0 + a_1 x + a_2 x^2 + e. $$

In this case the sum of the squares of the residuals is

$$ S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)^2. $$
Following the same procedure, take the derivative with respect to each coefficient:

$$ \frac{\partial S_r}{\partial a_0} = -2 \sum \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right), \qquad (1.29) $$

$$ \frac{\partial S_r}{\partial a_1} = -2 \sum \left[\left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right) x_i\right], \qquad (1.30) $$

$$ \frac{\partial S_r}{\partial a_2} = -2 \sum \left[\left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right) x_i^2\right]. \qquad (1.31) $$

Setting these equal to zero gives the normal equations

$$ n a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i, \qquad (1.32) $$

$$ \left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum y_i x_i, \qquad (1.33) $$

$$ \left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum y_i x_i^2. \qquad (1.34) $$
The coefficients of the unknowns can be calculated directly from the observed data. In general, for a polynomial of order $m$, we have

$$ y = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m + e. $$

(py: the quadratic case in HW.)
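A sketch of general polynomial regression that builds and solves the normal equations directly (np.polyfit would do the same job); the degree and data are illustrative:

```python
import numpy as np

def polyreg(x, y, m):
    """Least-squares polynomial of order m via the normal equations."""
    # A[j,k] = sum(x^(j+k)), rhs[j] = sum(y * x^j)
    A = np.array([[np.sum(x**(j + k)) for k in range(m + 1)]
                  for j in range(m + 1)])
    rhs = np.array([np.sum(y * x**j) for j in range(m + 1)])
    return np.linalg.solve(A, rhs)        # coefficients a_0 ... a_m

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])
print(polyreg(x, y, 2))                   # quadratic fit: a0, a1, a2
```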
1.8 Padé
A rational function $r$ of degree $N$ has the form

$$ r(x) \equiv \frac{p(x)}{q(x)}, $$

where $p(x)$ and $q(x)$ are polynomials whose degrees sum to $N$. Rational functions whose numerator and denominator have the same or nearly the same degree generally produce approximations superior to polynomial methods for the same amount of computational effort. We write

$$ r(x) = \frac{p(x)}{q(x)} = \frac{p_0 + p_1 x + \cdots + p_n x^n}{q_0 + q_1 x + \cdots + q_m x^m}, $$

which is used to approximate a function $f$ on a closed interval $I$ containing zero.
The Padé approximation technique, which is the extension of Taylor polynomial approximation to rational functions, chooses the $N + 1$ parameters so that $f^{(k)}(0) = r^{(k)}(0)$ for each $k = 0, 1, \ldots, N$. When $n = N$ and $m = 0$, the Padé approximation is just the $N$th Maclaurin polynomial.
Consider the difference

$$ f(x) - r(x) = f(x) - \frac{p(x)}{q(x)} = \frac{f(x)\,q(x) - p(x)}{q(x)} = \frac{f(x) \sum_{i=0}^{m} q_i x^i - \sum_{i=0}^{n} p_i x^i}{q(x)}, $$

and suppose $f$ has the Maclaurin series expansion $f(x) = \sum_{i=0}^{\infty} a_i x^i$. Then

$$ f(x) - r(x) = \frac{\sum_{i=0}^{\infty} a_i x^i \sum_{i=0}^{m} q_i x^i - \sum_{i=0}^{n} p_i x^i}{q(x)}. $$
So, requiring the first $N + 1$ coefficients of the numerator above to vanish, the rational function for Padé approximation results from the solution of the $N + 1$ linear equations

$$ \sum_{i=0}^{k} a_i q_{k-i} = p_k, \qquad k = 0, 1, \ldots, N, $$

in the $N + 1$ unknowns $q_1, q_2, \ldots, q_m, p_0, p_1, \ldots, p_n$ (with the conventions $q_0 = 1$, $p_k = 0$ for $k > n$, and $q_k = 0$ for $k > m$).
(here: do it) The Maclaurin series expansion for $e^{-x}$ is $\sum_{n=0}^{\infty} \frac{(-1)^n}{n!} x^n$. To find the Padé approximation to $e^{-x}$ of degree 5 with $n = 3$ and $m = 2$, we need to choose $p_0, p_1, p_2, p_3, q_1, q_2$ from these equations, which gives

$$ r(x) = \frac{1 - \frac{3}{5}x + \frac{3}{20}x^2 - \frac{1}{60}x^3}{1 + \frac{2}{5}x + \frac{1}{20}x^2}. $$
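A sketch that sets up and solves the Padé linear system numerically; the unknown ordering and the helper name `pade` are choices of this example:

```python
import numpy as np
from math import factorial

def pade(a, n, m):
    """Padé coefficients (p, q) with q0 = 1 from Maclaurin coefficients a."""
    N = n + m
    A = np.zeros((N + 1, N + 1))
    rhs = np.zeros(N + 1)
    # unknowns: p_0..p_n, then q_1..q_m; equation k: p_k - sum a_{k-i} q_i = a_k
    for k in range(N + 1):
        if k <= n:
            A[k, k] = 1.0                         # coefficient of p_k (p_k = 0 for k > n)
        for i in range(1, min(k, m) + 1):
            A[k, n + i] = -a[k - i]               # coefficient of q_i
        rhs[k] = a[k]                             # contribution of q_0 = 1
    sol = np.linalg.solve(A, rhs)
    return sol[:n + 1], np.concatenate(([1.0], sol[n + 1:]))

# Maclaurin coefficients of e^{-x}: (-1)^k / k!
a = np.array([(-1)**k / factorial(k) for k in range(6)])
p, q = pade(a, 3, 2)
print(p)   # ~ [1, -3/5, 3/20, -1/60]
print(q)   # ~ [1, 2/5, 1/20]
```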
(hw: Determine all degree-3 Padé approximations for $f(x) = x \ln(x + 1)$. Compare the results at $x_i = 0.2i$ for $i = 1, 2, 3, 4, 5$ with the actual values $f(x_i)$.)

Although the rational-function approximation gives results superior to the polynomial approximation of the same degree, the approximation has a wide variation in accuracy. This variation in accuracy is expected, because the Padé approximation is based on a Taylor polynomial representation, and the Taylor representation itself has a wide variation of accuracy.
To obtain more uniformly accurate approximations, the polynomials can be expressed in terms of Chebyshev polynomials $T_k(x)$:

$$ r(x) = \frac{\sum_{k=0}^{n} p_k T_k(x)}{\sum_{k=0}^{m} q_k T_k(x)}, \qquad \text{where } N = n + m \text{ and } q_0 = 1. $$
2 Orthogonal Polynomials and Least Squares Approximation
Suppose $f \in C[a, b]$ and that a polynomial $P_n(x)$ of degree at most $n$ is required that will minimize the error

$$ \int_a^b \left[f(x) - P_n(x)\right]^2 dx. \qquad (2.1) $$

Write

$$ P_n(x) = \sum_{k=0}^{n} a_k x^k \qquad (2.2) $$

and define

$$ e = e(a_0, a_1, \ldots, a_n) = \int_a^b \left( f(x) - \sum_{k=0}^{n} a_k x^k \right)^2 dx. \qquad (2.3) $$
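Setting $\partial e / \partial a_j = 0$ (a step not carried out above) gives the normal equations $\sum_k a_k \int_a^b x^{j+k}\,dx = \int_a^b x^j f(x)\,dx$. A sketch for $f(x) = \sin(\pi x)$ on $[0, 1]$, where the matrix of monomial inner products is the Hilbert matrix, assuming SciPy for the quadrature:

```python
import numpy as np
from scipy.integrate import quad

def cont_lsq(f, a, b, n):
    """Continuous least-squares polynomial coefficients a_0..a_n on [a, b]."""
    # normal equations: sum_k a_k * int x^(j+k) dx = int x^j f(x) dx
    H = np.array([[(b**(j+k+1) - a**(j+k+1)) / (j+k+1)
                   for k in range(n + 1)] for j in range(n + 1)])
    rhs = np.array([quad(lambda x, j=j: x**j * f(x), a, b)[0]
                    for j in range(n + 1)])
    return np.linalg.solve(H, rhs)

coef = cont_lsq(lambda x: np.sin(np.pi * x), 0.0, 1.0, 2)
print(coef)   # quadratic least-squares approximation to sin(pi x) on [0, 1]
```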
3 Fourier Approximations
Fourier approximation represents a systematic framework for using trigonometric series to model oscillating or periodic data. We will use the cosine form

$$ f(t) = A_0 + C_1 \cos(\omega_0 t + \theta). $$
3.1 Least-Squares Fit of a Sinusoid

The model above can be expressed equivalently as

$$ f(t) = A_0 + A_1 \cos(\omega_0 t) + B_1 \sin(\omega_0 t), $$

where a least-squares fit of this model to $N$ data points gives the coefficients

$$ A_0 = \frac{\sum y}{N}, \qquad (3.1) $$

$$ A_1 = \frac{2}{N} \sum y \cos(\omega_0 t), \qquad (3.2) $$

$$ B_1 = \frac{2}{N} \sum y \sin(\omega_0 t). \qquad (3.3) $$
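A sketch of Eqs. (3.1)–(3.3); these formulas assume $N$ equally spaced samples spanning a whole number of periods, and the signal here is synthetic:

```python
import numpy as np

N = 64
T = 1.0                                    # fundamental period (assumed known)
w0 = 2 * np.pi / T
t = np.arange(N) * T / N                   # N equally spaced samples over one period
y = 1.7 + 1.0 * np.cos(w0 * t + 0.5)       # A0 = 1.7, C1 = 1.0, theta = 0.5

A0 = np.sum(y) / N                         # Eq. (3.1)
A1 = 2.0 / N * np.sum(y * np.cos(w0 * t))  # Eq. (3.2)
B1 = 2.0 / N * np.sum(y * np.sin(w0 * t))  # Eq. (3.3)

C1 = np.hypot(A1, B1)                      # amplitude
theta = np.arctan2(-B1, A1)                # phase: A1 cos + B1 sin = C1 cos(w0 t + theta)
print(A0, C1, theta)                       # recovers 1.7, 1.0, 0.5
```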