Lecture 8

Regression

Uploaded by

getasew

ISyE 6416: Computational Statistics

Spring 2017

Lecture 10: Spline

Prof. Yao Xie

H. Milton Stewart School of Industrial and Systems Engineering


Georgia Institute of Technology
Motivation: non-linear regression
- Bone mineral density versus age, for males versus females.
- To deal with non-linearity: split the data into a number of parts; perform a regression on each part.
- Split either at evenly spaced "knots", or at known locations based on external information.
Piecewise constant model
Piecewise linear model
Continuous piecewise linear model
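A continuous piecewise linear model can be fitted by ordinary least squares. The sketch below is one possible way to do it, assuming a hinge basis (1, x, and max(x - k, 0) for each knot k); the basis choice, the function names, and the data are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def hinge_design(x, knots):
    """Design matrix with columns 1, x, and one hinge max(x - k, 0) per knot."""
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(columns)

def fit_continuous_piecewise_linear(x, y, knots):
    """Least-squares coefficients; each hinge changes the slope at its knot
    while keeping the fitted curve continuous there."""
    coef, *_ = np.linalg.lstsq(hinge_design(x, knots), y, rcond=None)
    return coef

# Hypothetical noise-free data with slope changes at x = 1 and x = 2.
x = np.linspace(0.0, 3.0, 31)
knots = [1.0, 2.0]
y = 1.0 + 2.0 * x - 3.0 * np.maximum(x - 1.0, 0.0) + 1.5 * np.maximum(x - 2.0, 0.0)

coef = fit_continuous_piecewise_linear(x, y, knots)
fitted = hinge_design(x, knots) @ coef
```

Dropping the hinge columns and fitting a separate line on each part would give the discontinuous piecewise linear model instead.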
Spline
- A spline is a piecewise polynomial function.
- A cubic spline is a piecewise polynomial of degree 3.
- Fit piecewise continuous splines to noisy data.

https://www.youtube.com/watch?v=LgodR7pwwW8
Quadratic splines
Cubic splines
Formal definition
- Assume the values $f(x_i) = f_i$ of the function $f(x)$ are given at the points $x_0 < x_1 < \dots < x_n$.
- A cubic interpolating spline $s(x)$ is a function on the interval $[x_0, x_n]$ satisfying
  - $s(x)$ is a cubic polynomial on each node-to-node interval $[x_i, x_{i+1}]$
  - $s(x_i) = f_i$ at each node $x_i$
  - the second-order derivative $s''(x)$ exists and is continuous throughout the entire interval $[x_0, x_n]$
  - at the terminal nodes, $s''(x_0) = s''(x_n) = 0$
- Cubic splines are derived from the physical laws that govern the bending of thin beams.
- They are an approximate solution of the minimum-energy bending equation, valid when the amount of bending is small.
Properties of spline

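- There is exactly one function $s(x)$ on $[x_0, x_n]$ satisfying these properties.
- Intuitively, these requirements lead to a well-defined mathematical problem.
- For $n$ intervals (knots $x_0, \dots, x_n$), there are $4n$ parameters: four coefficients per cubic piece.
- At the same time, there are
  - $2n$ zeroth-order conditions: each piece matches $f_i$ at both ends of its interval
  - $n - 1$ first-order conditions: $s'(x)$ is continuous at the interior knots
  - $n + 1$ second-order conditions: $s''(x)$ is continuous at the $n - 1$ interior knots, plus the two end conditions
Number of conditions = number of parameters (a necessary condition for a unique solution).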
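As a quick check of the count, here is the bookkeeping written out, assuming the $n + 1$ second-order conditions consist of continuity of $s''$ at the $n - 1$ interior knots plus the two end conditions $s''(x_0) = s''(x_n) = 0$:

```latex
\underbrace{2n}_{\text{each piece matches } f_i \text{ at both ends}}
\;+\; \underbrace{(n-1)}_{s' \text{ continuous at interior knots}}
\;+\; \underbrace{(n-1) + 2}_{s'' \text{ continuous, } s''(x_0)=s''(x_n)=0}
\;=\; 4n .
```

For instance, with $n = 3$ intervals there are $12$ coefficients and $6 + 2 + 4 = 12$ conditions.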
Computation for a spline
- inter-knot distances $h_i = x_{i+1} - x_i$
- second-order derivatives $\sigma_i = s''(x_i)$ ($n + 1$ parameters that parameterize the cubic spline function)
- we can derive the following
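$$M\sigma = Qf$$

$$
M = \begin{bmatrix}
\frac{1}{3}(h_0 + h_1) & \frac{h_1}{6} & 0 & \cdots & 0 & 0 \\
\frac{h_1}{6} & \frac{1}{3}(h_1 + h_2) & \frac{h_2}{6} & \cdots & 0 & 0 \\
0 & \frac{h_2}{6} & \frac{1}{3}(h_2 + h_3) & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \frac{1}{3}(h_{n-3} + h_{n-2}) & \frac{h_{n-2}}{6} \\
0 & 0 & 0 & \cdots & \frac{h_{n-2}}{6} & \frac{1}{3}(h_{n-2} + h_{n-1})
\end{bmatrix}
$$

$\sigma = [\sigma_1, \dots, \sigma_{n-1}]^T$, $f = [f_0, f_1, \dots, f_n]^T$

$$
Q = \begin{bmatrix}
\frac{1}{h_0} & -\frac{1}{h_0} - \frac{1}{h_1} & \frac{1}{h_1} & & \\
 & \frac{1}{h_1} & -\frac{1}{h_1} - \frac{1}{h_2} & \frac{1}{h_2} & \\
 & & \ddots & \ddots & \ddots \\
 & & \frac{1}{h_{n-2}} & -\frac{1}{h_{n-2}} - \frac{1}{h_{n-1}} & \frac{1}{h_{n-1}}
\end{bmatrix}
\in \mathbb{R}^{(n-1) \times (n+1)}
$$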
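The two matrices can be assembled directly from the knot locations. The following is a minimal NumPy sketch under the definitions above; the function name `build_M_Q` and the knot values are hypothetical, and dense arrays are used only for readability.

```python
import numpy as np

def build_M_Q(x):
    """Assemble M ((n-1) x (n-1)) and Q ((n-1) x (n+1)) for knots x_0 < ... < x_n."""
    x = np.asarray(x, dtype=float)
    h = np.diff(x)                      # h[i] = x[i+1] - x[i], i = 0..n-1
    n = len(h)                          # number of intervals
    M = np.zeros((n - 1, n - 1))
    Q = np.zeros((n - 1, n + 1))
    for r in range(n - 1):              # row r belongs to interior knot i = r + 1
        i = r + 1
        M[r, r] = (h[i - 1] + h[i]) / 3.0
        if r + 1 < n - 1:               # symmetric off-diagonal entries h_i / 6
            M[r, r + 1] = M[r + 1, r] = h[i] / 6.0
        Q[r, i - 1] = 1.0 / h[i - 1]
        Q[r, i] = -1.0 / h[i - 1] - 1.0 / h[i]
        Q[r, i + 1] = 1.0 / h[i]
    return M, Q

# Hypothetical, unevenly spaced knots and data values.
x = np.array([0.0, 1.0, 3.0, 4.0, 6.0])
f = np.sin(x)
M, Q = build_M_Q(x)
sigma_interior = np.linalg.solve(M, Q @ f)   # sigma_1, ..., sigma_{n-1}
```

Each row of `Q @ f` is a difference of neighbouring slopes, so `Q` maps any linear data to zero and the resulting spline is then the straight line itself.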
Solving the linear system of equations
- Matrix $M$ is symmetric, positive definite, and tridiagonal
- Cholesky ($LDL^T$) factorization

$$M = LDL^T$$

where

$$
L = \begin{bmatrix}
1 & & & 0 \\
a_1 & 1 & & \\
 & \ddots & \ddots & \\
0 & & a_{n-2} & 1
\end{bmatrix}
$$

and $D$ is a diagonal matrix.
This enables an efficient solution of the linear system:

$$\sigma = M^{-1} Q f = (L^T)^{-1} D^{-1} L^{-1} Q f$$

Because $L$ is bidiagonal and $D$ is diagonal, applying $L^{-1}$, $D^{-1}$, and $(L^T)^{-1}$ by substitution has $O(n)$ complexity.
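The factorization and the two substitution sweeps can be sketched as follows. This is an illustrative implementation, assuming the matrix is passed as its diagonal and off-diagonal vectors; the function name and the test values are hypothetical.

```python
import numpy as np

def ldlt_tridiagonal_solve(diag, off, rhs):
    """Solve M z = rhs for symmetric positive definite tridiagonal M in O(m).

    diag: the m diagonal entries of M; off: the m - 1 off-diagonal entries.
    """
    m = len(diag)
    D = np.empty(m)                      # diagonal of D
    a = np.empty(m - 1)                  # sub-diagonal of L
    D[0] = diag[0]
    for k in range(1, m):                # factorization M = L D L^T
        a[k - 1] = off[k - 1] / D[k - 1]
        D[k] = diag[k] - a[k - 1] * off[k - 1]
    z = np.array(rhs, dtype=float)
    for k in range(1, m):                # forward substitution with L
        z[k] -= a[k - 1] * z[k - 1]
    z /= D                               # diagonal scaling with D
    for k in range(m - 2, -1, -1):       # back substitution with L^T
        z[k] -= a[k] * z[k + 1]
    return z

# Hypothetical inter-knot distances h_0..h_4 give a 4 x 4 matrix M.
h = np.array([1.0, 2.0, 1.0, 2.0, 1.0])
diag = (h[:-1] + h[1:]) / 3.0
off = h[1:-1] / 6.0
rhs = np.array([0.3, -1.2, 0.7, 2.0])
sigma = ldlt_tridiagonal_solve(diag, off, rhs)
```

Only three vectors of length about n are stored, which is where the linear cost comes from; a dense solver would need the full matrix.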


Final expressions for splines

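$$
s_i(x) = \frac{\sigma_i}{6 h_i} (x_{i+1} - x)^3 + \frac{\sigma_{i+1}}{6 h_i} (x - x_i)^3
+ \left( \frac{f_{i+1}}{h_i} - \frac{\sigma_{i+1} h_i}{6} \right) (x - x_i)
+ \left( \frac{f_i}{h_i} - \frac{\sigma_i h_i}{6} \right) (x_{i+1} - x),
\qquad i = 0, 1, \dots, n - 1.
$$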
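The piecewise expression translates into a short evaluation routine. The sketch below solves $M\sigma = Qf$ with a dense solver for brevity and sets $\sigma_0 = \sigma_n = 0$; the function name and the sample data are assumptions made for illustration.

```python
import numpy as np

def natural_cubic_spline(x, f):
    """Return (s, sigma): a callable spline s(t) and the second derivatives at the knots."""
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    h = np.diff(x)
    n = len(h)
    M = (np.diag((h[:-1] + h[1:]) / 3.0)
         + np.diag(h[1:-1] / 6.0, 1) + np.diag(h[1:-1] / 6.0, -1))
    slopes = np.diff(f) / h
    sigma = np.zeros(n + 1)                            # sigma_0 = sigma_n = 0
    sigma[1:-1] = np.linalg.solve(M, np.diff(slopes))  # Q f = differences of slopes

    def s(t):
        t = np.asarray(t, dtype=float)
        # Index i of the interval [x_i, x_{i+1}] containing each t.
        i = np.clip(np.searchsorted(x, t, side="right") - 1, 0, n - 1)
        left, right = x[i + 1] - t, t - x[i]
        return ((sigma[i] * left**3 + sigma[i + 1] * right**3) / (6.0 * h[i])
                + (f[i + 1] / h[i] - sigma[i + 1] * h[i] / 6.0) * right
                + (f[i] / h[i] - sigma[i] * h[i] / 6.0) * left)

    return s, sigma

# Hypothetical knots and values.
x = np.array([0.0, 1.0, 2.5, 3.0, 4.5])
f = np.sin(x)
s, sigma = natural_cubic_spline(x, f)
values_at_knots = s(x)            # reproduces f up to rounding
midpoint_values = s((x[:-1] + x[1:]) / 2.0)
```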
Minimum energy property

- Why splines? For any other twice continuously differentiable function $g(x)$ that interpolates the same values, $g(x_i) = f_i$,

$$\int_{x_0}^{x_n} [g''(x)]^2 \, dx \ge \int_{x_0}^{x_n} [s''(x)]^2 \, dx$$
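A short sketch of why the inequality holds, assuming $g$ interpolates the same values $f_i$ as $s$ and using $s''(x_0) = s''(x_n) = 0$ together with the fact that $s'''$ is constant on each interval:

```latex
\int_{x_0}^{x_n} (g'')^2 \, dx
  = \int_{x_0}^{x_n} (s'')^2 \, dx
  + \int_{x_0}^{x_n} (g'' - s'')^2 \, dx
  + 2 \int_{x_0}^{x_n} s'' (g'' - s'') \, dx ,
\\[4pt]
\int_{x_0}^{x_n} s'' (g'' - s'') \, dx
  = \Big[ s'' (g' - s') \Big]_{x_0}^{x_n}
  - \sum_{i=0}^{n-1} s_i''' \Big[ g - s \Big]_{x_i}^{x_{i+1}}
  = 0 .
```

The boundary term vanishes because of the end conditions, and each bracket in the sum vanishes because $g - s$ is zero at every knot, so the cross term drops out and the remaining extra term is non-negative.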
Error bound
Suppose that $f(x)$ is twice continuously differentiable and $s(x)$ is the spline interpolating $f(x)$ at the knots $x_0 < x_1 < \dots < x_n$. If $h = \max_{0 \le i \le n-1} (x_{i+1} - x_i)$, then

$$
\max_{x_0 \le x \le x_n} |f(x) - s(x)| \le h^{3/2} \left[ \int_{x_0}^{x_n} f''(y)^2 \, dy \right]^{1/2} .
$$

f (x) = sin(2x)/x.
Problem with fitting a global polynomial
Runge's example

$$f(x) = \frac{1}{1 + x^2}$$

High-order interpolation using a global polynomial often exhibits these oscillations.

- $f(x)$ interpolated using a 15th-order polynomial based on equidistant sample points.
- $f(x)$ interpolated using a cubic spline based on 15 equidistant samples.
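The comparison can be reproduced numerically. The sketch below makes several assumptions that the slides do not state: the interval is taken to be $[-5, 5]$, both interpolants use the same 15 equidistant samples (so the global polynomial has degree 14), and the polynomial is evaluated in barycentric form for numerical stability.

```python
import numpy as np

def runge(x):
    return 1.0 / (1.0 + x**2)

def barycentric_interpolate(xk, fk, t):
    """Evaluate the global interpolating polynomial through (xk, fk) at points t."""
    w = np.array([1.0 / np.prod(xk[j] - np.delete(xk, j)) for j in range(len(xk))])
    diff = t[:, None] - xk[None, :]
    hit = diff == 0.0                    # evaluation points that coincide with a node
    diff[hit] = 1.0                      # placeholder to avoid division by zero
    terms = w / diff
    p = (terms @ fk) / terms.sum(axis=1)
    rows, cols = np.nonzero(hit)
    p[rows] = fk[cols]                   # at a node the polynomial equals the data
    return p

def natural_cubic_spline(xk, fk, t):
    """Evaluate the natural cubic spline through (xk, fk) at points t."""
    h = np.diff(xk)
    n = len(h)
    M = (np.diag((h[:-1] + h[1:]) / 3.0)
         + np.diag(h[1:-1] / 6.0, 1) + np.diag(h[1:-1] / 6.0, -1))
    sigma = np.zeros(n + 1)
    sigma[1:-1] = np.linalg.solve(M, np.diff(np.diff(fk) / h))   # M sigma = Q f
    i = np.clip(np.searchsorted(xk, t, side="right") - 1, 0, n - 1)
    left, right = xk[i + 1] - t, t - xk[i]
    return ((sigma[i] * left**3 + sigma[i + 1] * right**3) / (6.0 * h[i])
            + (fk[i + 1] / h[i] - sigma[i + 1] * h[i] / 6.0) * right
            + (fk[i] / h[i] - sigma[i] * h[i] / 6.0) * left)

xk = np.linspace(-5.0, 5.0, 15)          # assumed interval and sample count
fk = runge(xk)
t = np.linspace(-5.0, 5.0, 2001)

poly_err = np.max(np.abs(barycentric_interpolate(xk, fk, t) - runge(t)))
spline_err = np.max(np.abs(natural_cubic_spline(xk, fk, t) - runge(t)))
print(f"max error, global polynomial: {poly_err:.3g}")
print(f"max error, cubic spline:      {spline_err:.3g}")
```

Under these assumptions the polynomial error is dominated by oscillations near the ends of the interval, while the spline error stays small across the whole range.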
Example

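The equation for solving $\sigma$ becomes

$$
\begin{bmatrix} 2.0 & 0.4 \\ 0.4 & 1.6 \end{bmatrix}
\begin{bmatrix} \sigma_1 \\ \sigma_2 \end{bmatrix}
=
\begin{bmatrix} 0.5 \\ 0.4 \end{bmatrix}
\;\Rightarrow\; \sigma_1 = 0.2105, \quad \sigma_2 = 0.1974
$$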

$$
\begin{aligned}
S_0(x) &= 0.0877 (x - 0.9)^3 + 3.736 (x - 0.9) + 3.25 (1.3 - x) \\
S_1(x) &= 0.0585 (1.9 - x)^3 + 0.0548 (x - 1.3)^3 + 3.0636 (x - 1.3) + 2.4790 (1.9 - x) \\
S_2(x) &= 0.1645 (2.1 - x)^3 + 10.5 (x - 1.9) + 9.2434 (2.1 - x)
\end{aligned}
$$
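The arithmetic of this example can be checked in a few lines. The sketch below solves the 2 x 2 system as stated and recomputes the cubic-term coefficients $\sigma_i / (6 h_i)$; the knot locations 0.9, 1.3, 1.9, 2.1 are read off the shifted terms in $S_0$, $S_1$, $S_2$ and are otherwise an assumption.

```python
import numpy as np

A = np.array([[2.0, 0.4],
              [0.4, 1.6]])
b = np.array([0.5, 0.4])
sigma1, sigma2 = np.linalg.solve(A, b)
print(f"{sigma1:.4f} {sigma2:.4f}")      # 0.2105 0.1974

knots = np.array([0.9, 1.3, 1.9, 2.1])   # inferred from the pieces S_0, S_1, S_2
h = np.diff(knots)                        # inter-knot distances h_0, h_1, h_2

cubic_coefficients = {
    "sigma1 / (6 h0)": sigma1 / (6.0 * h[0]),
    "sigma1 / (6 h1)": sigma1 / (6.0 * h[1]),
    "sigma2 / (6 h1)": sigma2 / (6.0 * h[1]),
    "sigma2 / (6 h2)": sigma2 / (6.0 * h[2]),
}
for name, value in cubic_coefficients.items():
    print(f"{name} = {value:.4f}")
```

The four printed coefficients correspond to the cubic terms of the three pieces, with $\sigma_0 = \sigma_3 = 0$ removing the remaining two.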
Nonlinear regression

- Given responses $y_i$ and variables $x_i$,

$$y_i = f(x_i) + \epsilon_i, \qquad i = 0, \dots, n$$

$f$: unknown regression function


Nonlinear regression
- Given weights $w_0, w_1, \dots, w_n$, $w_i > 0$, minimize

$$
J_\alpha(s) = \alpha \sum_{i=0}^{n} w_i [y_i - s(x_i)]^2 + (1 - \alpha) \int_{x_0}^{x_n} [s''(x)]^2 \, dx
$$

- $\alpha \in (0, 1)$ sets the tradeoff between smoothness of $s$ and goodness of fit
Matrix-vector parameterization

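- One can show

$$\int_{x_0}^{x_n} s''(x)^2 \, dx = \sigma^T M \sigma$$

so that, using $M\sigma = Qf$,

$$J_\alpha(f) = \alpha (y - f)^T W (y - f) + (1 - \alpha) f^T Q^T M^{-1} Q f$$

where $W = \mathrm{diag}\{w_0, \dots, w_n\}$
- spline function $s$ parameterized by $f$
- solution

$$\hat{f} = [\alpha W + (1 - \alpha) Q^T M^{-1} Q]^{-1} \alpha W y$$

- one can show

$$\hat{\sigma} = [\alpha M + (1 - \alpha) Q W^{-1} Q^T]^{-1} \alpha Q y$$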
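The closed-form solution for the fitted values can be evaluated directly. This is a minimal dense-matrix sketch; the function names, the synthetic data, and the choice $\alpha = 0.9$ are illustrative assumptions, and the second derivatives are recovered afterwards from $M\sigma = Qf$.

```python
import numpy as np

def spline_matrices(x):
    """M ((n-1) x (n-1)) and Q ((n-1) x (n+1)) for knots x_0 < ... < x_n."""
    h = np.diff(np.asarray(x, dtype=float))
    n = len(h)
    M = (np.diag((h[:-1] + h[1:]) / 3.0)
         + np.diag(h[1:-1] / 6.0, 1) + np.diag(h[1:-1] / 6.0, -1))
    Q = np.zeros((n - 1, n + 1))
    r = np.arange(n - 1)
    Q[r, r] = 1.0 / h[:-1]
    Q[r, r + 1] = -1.0 / h[:-1] - 1.0 / h[1:]
    Q[r, r + 2] = 1.0 / h[1:]
    return M, Q

def smoothing_spline(x, y, w, alpha):
    """Fitted values f_hat and second derivatives sigma_hat at the knots."""
    M, Q = spline_matrices(x)
    W = np.diag(w)
    K = Q.T @ np.linalg.solve(M, Q)                   # Q^T M^{-1} Q
    f_hat = np.linalg.solve(alpha * W + (1.0 - alpha) * K, alpha * (W @ y))
    sigma_hat = np.linalg.solve(M, Q @ f_hat)         # from M sigma = Q f
    return f_hat, sigma_hat

def objective(f, x, y, w, alpha):
    """J_alpha evaluated for the natural spline with knot values f."""
    M, Q = spline_matrices(x)
    sigma = np.linalg.solve(M, Q @ f)
    return alpha * np.sum(w * (y - f) ** 2) + (1.0 - alpha) * sigma @ M @ sigma

# Hypothetical noisy data with unit weights.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 21)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
w = np.ones_like(x)

f_hat, sigma_hat = smoothing_spline(x, y, w, alpha=0.9)
```

Values of $\alpha$ close to 1 keep `f_hat` near the observations, while values close to 0 push the fit towards a straight line, for which the penalty term is zero.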


Cross validation
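- For notational convenience, we reformulate the optimization problem as

$$
J_\lambda(s) = \sum_{i=0}^{n} w_i [y_i - s(x_i)]^2 + \lambda \int_{x_0}^{x_n} [s''(x)]^2 \, dx,
\qquad \lambda = (1 - \alpha)/\alpha
$$

- Define the leave-one-out fit, for $0 \le k \le n$,

$$
h_\lambda^{(-k)}(x) = \arg\min_{s} \sum_{i=0,\, i \ne k}^{n} w_i [y_i - s(x_i)]^2 + \lambda \int_{x_0}^{x_n} [s''(x)]^2 \, dx
$$

- Define the cross-validation criterion function

$$\mathrm{CV}(\lambda) = \sum_{k=0}^{n} [y_k - h_\lambda^{(-k)}(x_k)]^2$$

One can show

$$\mathrm{CV}(\lambda) = \sum_{k=0}^{n} \frac{[y_k - \hat{f}(\lambda)_k]^2}{[1 - [S(\lambda)]_{kk}]^2}$$

Generalized CV (GCV): replace $[S(\lambda)]_{kk}$ by its average, since an individual $[S(\lambda)]_{kk}$ can get close to 1.

$$\mathrm{GCV}(\lambda) = \sum_{k=0}^{n} \frac{[y_k - \hat{f}(\lambda)_k]^2}{\left[ 1 - \frac{\mathrm{Tr}(S(\lambda))}{n+1} \right]^2}$$

where

$$S(\lambda) = [W + \lambda Q^T M^{-1} Q]^{-1} W$$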
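Both criteria follow from the smoother matrix $S(\lambda)$. The sketch below computes them with dense matrices and also includes a brute-force leave-one-out loop for comparison, implemented by setting the weight of the held-out point to zero; the function names, the synthetic data, and the grid of $\lambda$ values are hypothetical.

```python
import numpy as np

def penalty_matrix(x):
    """K = Q^T M^{-1} Q for knots x_0 < ... < x_n."""
    h = np.diff(np.asarray(x, dtype=float))
    n = len(h)
    M = (np.diag((h[:-1] + h[1:]) / 3.0)
         + np.diag(h[1:-1] / 6.0, 1) + np.diag(h[1:-1] / 6.0, -1))
    Q = np.zeros((n - 1, n + 1))
    r = np.arange(n - 1)
    Q[r, r] = 1.0 / h[:-1]
    Q[r, r + 1] = -1.0 / h[:-1] - 1.0 / h[1:]
    Q[r, r + 2] = 1.0 / h[1:]
    return Q.T @ np.linalg.solve(M, Q)

def cv_and_gcv(x, y, w, lam):
    """CV and GCV from the smoother matrix S = (W + lam K)^{-1} W."""
    W = np.diag(w)
    S = np.linalg.solve(W + lam * penalty_matrix(x), W)
    resid = y - S @ y
    cv = np.sum((resid / (1.0 - np.diag(S))) ** 2)
    gcv = np.sum(resid ** 2) / (1.0 - np.trace(S) / len(y)) ** 2
    return cv, gcv

def cv_leave_one_out(x, y, w, lam):
    """Brute-force CV: refit n + 1 times, each time with one weight set to zero."""
    K = penalty_matrix(x)
    total = 0.0
    for k in range(len(y)):
        wk = np.array(w, dtype=float)
        wk[k] = 0.0                       # point k no longer enters the data term
        Wk = np.diag(wk)
        f_minus_k = np.linalg.solve(Wk + lam * K, Wk @ y)
        total += (y[k] - f_minus_k[k]) ** 2
    return total

# Hypothetical noisy observations.
rng = np.random.default_rng(1)
x = np.linspace(-4.0, 4.0, 25)
y = np.exp(-x**2 / 2.0) + 0.1 * rng.standard_normal(x.size)
w = np.ones_like(x)

lams = [0.01, 0.05, 0.1, 0.5]
gcv_values = [cv_and_gcv(x, y, w, lam)[1] for lam in lams]
best_lam = lams[int(np.argmin(gcv_values))]
```

The shortcut needs one solve per $\lambda$, whereas the brute-force loop needs $n + 1$ of them, which is the practical reason for preferring the closed-form expressions when scanning over $\lambda$.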
[Figure: left panel "Generalized Cross Validation", GCV(λ) curve with the horizontal axis running from 0 to 0.2; right panel "Noisy Observations", noisy observations and the fitted spline plotted against x from -4 to 4.]
Bi-cubic interpolation
