Function and Data Approximation
Ani Okropiridze, Mariami Mamageishvili, Anano Tamarashvili
November 28, 2024
Content
1 Interpolation
2 Introduction to Fourier Transform
3 Phase Space and the Fourier Transform
4 Conclusion
5 Least Square Approximation
6 Importance of Approximation in Real-World Applications
7 References
Interpolation
Theorem
Suppose that f is defined and continuous on [a, b]. For each ε > 0, there exists a polynomial P(x) such that |f(x) − P(x)| < ε for all x in [a, b].
Interpolation
Lagrange interpolation
There is an explicit formula, called the Lagrange interpolating formula, for writing down a polynomial of degree d = n − 1 that interpolates the points. Suppose that we are given three points (x1, y1), (x2, y2), (x3, y3). Then the polynomial

P(x) = y1 ((x − x2)(x − x3)) / ((x1 − x2)(x1 − x3)) + y2 ((x − x1)(x − x3)) / ((x2 − x1)(x2 − x3)) + y3 ((x − x1)(x − x2)) / ((x3 − x1)(x3 − x2))

passes through all three points and has degree at most 2.
Interpolation
In general, suppose that we are presented with n points (x1, y1), . . . , (xn, yn). For each k between 1 and n, define the degree n − 1 polynomial

L_k(x) = ((x − x1) ··· (x − x_{k−1})(x − x_{k+1}) ··· (x − xn)) / ((x_k − x1) ··· (x_k − x_{k−1})(x_k − x_{k+1}) ··· (x_k − xn)).

Then L_k(x_k) = 1 and L_k(x_j) = 0 for j ≠ k, so the interpolating polynomial is

P(x) = y1 L_1(x) + · · · + yn L_n(x).
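A minimal NumPy sketch of this construction (illustrative only; the function name and sample data are not from the slides):

```python
import numpy as np

def lagrange_interpolate(xs, ys, x):
    """Evaluate the degree n-1 Lagrange interpolating polynomial at x."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(len(xs)):
        # L_k(x): product over j != k of (x - x_j) / (x_k - x_j)
        Lk = np.ones_like(x)
        for j in range(len(xs)):
            if j != k:
                Lk *= (x - xs[j]) / (xs[k] - xs[j])
        total += ys[k] * Lk
    return total

# The polynomial reproduces the data points exactly:
xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 2.0]
print(lagrange_interpolate(xs, ys, np.array(xs)))   # -> [1. 3. 2.]
```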
Interpolation
Theorem
Let (x1 , y1 ), . . . , (xn , yn ) be n points in the plane with distinct xi . Then there
exists one and only one polynomial P of degree n − 1 or less that satisfies
P (xi ) = yi for i = 1, . . . , n.
Interpolation
Assume that the data points come from a function f (x), so that our goal is to
interpolate (x1 , f (x1 )), ..., (xn , f (xn )).
Definition
Denote by f[x1, . . . , xn] the coefficient of the x^{n−1} term in the (unique)
polynomial that interpolates (x1 , f (x1 )), . . . , (xn , f (xn )).
Interpolation
f[x_k] = f(x_k)

f[x_k, x_{k+1}] = (f[x_{k+1}] − f[x_k]) / (x_{k+1} − x_k)

f[x_k, x_{k+1}, x_{k+2}] = (f[x_{k+1}, x_{k+2}] − f[x_k, x_{k+1}]) / (x_{k+2} − x_k)

f[x_k, x_{k+1}, x_{k+2}, x_{k+3}] = (f[x_{k+1}, x_{k+2}, x_{k+3}] − f[x_k, x_{k+1}, x_{k+2}]) / (x_{k+3} − x_k).
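A short sketch (hypothetical helper, not from the slides) that builds the coefficients f[x_1], f[x_1, x_2], . . . , f[x_1, . . . , x_n] with exactly this recursion:

```python
import numpy as np

def divided_differences(xs, fs):
    """Return the divided differences f[x1], f[x1,x2], ..., f[x1,...,xn]."""
    xs = np.asarray(xs, dtype=float)
    table = np.asarray(fs, dtype=float)      # zeroth column: f[x_k] = f(x_k)
    coeffs = [table[0]]
    for level in range(1, len(xs)):
        # each pass replaces entry k by f[x_k, ..., x_{k+level}] via the recursion above
        table = (table[1:] - table[:-1]) / (xs[level:] - xs[:-level])
        coeffs.append(table[0])
    return np.array(coeffs)

print(divided_differences([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 8.0, 27.0]))  # -> [0. 1. 3. 1.]
```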
Interpolation
Interpolation Error
Assume that P (x) is the (degree n − 1 or less) interpolating polynomial fitting the
n points (x1 , y1 ) , . . . , (xn , yn ). The interpolation error is
f(x) − P(x) = ((x − x1)(x − x2) ··· (x − xn) / n!) · f^(n)(c),

where c lies between the smallest and largest of the numbers x, x1, . . . , xn.
Interpolation
Fact
f [x1 , . . . , xn ] = f [σ(x1 ), . . . , σ(xn )] for any permutation σ of the xi .
Fact
P(x) can be written in Newton's divided-difference form:

P(x) = c0 + c1(x − x1) + c2(x − x1)(x − x2) + · · · + c_{n−1}(x − x1)(x − x2) ··· (x − x_{n−1}),

where the coefficients are c_i = f[x1, . . . , x_{i+1}] for i = 0, . . . , n − 1.
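Combining the two facts, the sketch below (again illustrative) evaluates the Newton form by nested multiplication, reusing the divided_differences helper from the previous sketch:

```python
def newton_eval(xs, coeffs, x):
    """Evaluate P(x) = c0 + c1(x-x1) + ... + c_{n-1}(x-x1)...(x-x_{n-1}) by nesting."""
    result = coeffs[-1]
    for c, xk in zip(coeffs[-2::-1], xs[-2::-1]):
        result = result * (x - xk) + c
    return result

# Interpolate f(x) = x^3 at x = 0, 1, 2, 3; the Newton form reproduces f exactly.
xs = [0.0, 1.0, 2.0, 3.0]
coeffs = divided_differences(xs, [v**3 for v in xs])
print(newton_eval(xs, coeffs, 1.5))   # -> 3.375 == 1.5**3
```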
Interpolation
Runge Phenomenon

Interpolating at many equally spaced nodes can fail badly for some functions: the interpolating polynomial oscillates with large error near the ends of the interval, and the error grows as the number of nodes increases. This is the Runge phenomenon, and it motivates a more careful choice of interpolation nodes.
Interpolation
Chebyshev's theorem

Chebyshev interpolation chooses the interpolation nodes so as to control the size of the interpolation error

((x − x1)(x − x2) ··· (x − xn) / n!) · f^(n)(c)

on the interpolation interval. Let's fix the interval to be [−1, 1]. The numerator of the above error term is a degree n polynomial in x and has some maximum absolute value on [−1, 1]. We therefore have to consider the minimax problem: choose the nodes x1, . . . , xn so that this maximum is as small as possible.
Interpolation
Theorem
The choice of real numbers −1 ≤ x1, . . . , xn ≤ 1 that makes the value of

max_{−1 ≤ x ≤ 1} |(x − x1) ··· (x − xn)|

as small as possible is

x_i = cos((2i − 1)π / (2n)) for i = 1, . . . , n,

and the minimum value is 1/2^{n−1}. In fact, the minimum is achieved by

(x − x1) ··· (x − xn) = (1/2^{n−1}) T_n(x),

where T_n(x) denotes the degree n Chebyshev polynomial.
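A small numerical check of the theorem (illustrative sketch): the Chebyshev nodes keep max |(x − x1)···(x − xn)| near the minimum 1/2^{n−1}, while equally spaced nodes do much worse.

```python
import numpy as np

def chebyshev_nodes(n):
    """Chebyshev nodes x_i = cos((2i - 1) pi / (2n)), i = 1..n, on [-1, 1]."""
    i = np.arange(1, n + 1)
    return np.cos((2 * i - 1) * np.pi / (2 * n))

n = 10
x = np.linspace(-1, 1, 5001)
for name, nodes in [("Chebyshev", chebyshev_nodes(n)),
                    ("equispaced", np.linspace(-1, 1, n))]:
    w = np.prod(x[:, None] - nodes[None, :], axis=1)   # (x - x1)...(x - xn) on a fine grid
    print(name, np.max(np.abs(w)))
# The Chebyshev value is close to the theoretical minimum 1 / 2**(n - 1) ≈ 0.00195.
```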
Interpolation
Chebyshev Polynomials
Fact
The Chebyshev polynomials are defined by T_n(x) = cos(n arccos x) on [−1, 1] and satisfy the recursion T_{n+1}(x) = 2x T_n(x) − T_{n−1}(x). The T_n's are polynomials: since T_3 = 2x T_2 − T_1 is a polynomial combination of T_1 and T_2, T_3 is also a polynomial, and the same argument goes for all T_n. The first few Chebyshev polynomials (see Figure 3.9) are

T_0(x) = 1
T_1(x) = x
T_2(x) = 2x² − 1
T_3(x) = 4x³ − 3x
Interpolation
Fact
deg(T_n) = n, and the leading coefficient is 2^{n−1}.
This is clear for n = 1 and n = 2, and the recursion relation extends the fact to all
n.
Fact
T_n(1) = 1 and T_n(−1) = (−1)^n. Both are clear for n = 1 and 2. In general,

T_{n+1}(1) = 2(1)T_n(1) − T_{n−1}(1) = 2(1) − 1 = 1

T_{n+1}(−1) = 2(−1)T_n(−1) − T_{n−1}(−1) = −2(−1)^n − (−1)^{n−1} = (−1)^{n−1}(2 − 1) = (−1)^{n−1} = (−1)^{n+1}.
Interpolation
Fact
The maximum absolute value of Tn (x) for −1 ≤ x ≤ 1 is 1. This follows
immediately from the fact that Tn (x) = cos(y) for some y.
Fact
All zeros of T_n(x) are located between −1 and 1. See Figure 3.10. In fact, the zeros are the solutions of 0 = cos(n arccos x). Since cos y = 0 if and only if y is an odd integer multiple of π/2, we find that n arccos x = (2i − 1)π/2, that is,

x = cos((2i − 1)π / (2n)) for i = 1, . . . , n,

exactly the Chebyshev nodes in the theorem above.
Interpolation
Fact
T_n(x) alternates between −1 and 1 a total of n + 1 times. In fact, this happens at cos(0), cos(π/n), cos(2π/n), . . . , cos((n − 1)π/n), cos(π).
Interpolation
Change of interval
To use Chebyshev nodes on a general interval [a, b] instead of [−1, 1]:

Stretch the points by the factor (b − a)/2 (the ratio of the two interval lengths).
Translate the points by (b + a)/2 to move the center from 0 to the midpoint of [a, b].

On the interval [a, b], the Chebyshev nodes therefore become

x_i = (b + a)/2 + ((b − a)/2) cos((2i − 1)π / (2n)) for i = 1, . . . , n.
Interpolation
Cubic Splines
Interpolation
Properties of Splines
Assume that we are given the n data points (x1 , y1 ), ..., (xn , yn ), where the xi are
distinct and in increasing order.
A cubic spline S(x) through the data points (x1 , y1 ), ..., (xn , yn ) is a set of cubic
polynomials
S_1(x) = y_1 + b_1(x − x_1) + c_1(x − x_1)² + d_1(x − x_1)³ on [x_1, x_2]
S_2(x) = y_2 + b_2(x − x_2) + c_2(x − x_2)² + d_2(x − x_2)³ on [x_2, x_3]
⋮
S_{n−1}(x) = y_{n−1} + b_{n−1}(x − x_{n−1}) + c_{n−1}(x − x_{n−1})² + d_{n−1}(x − x_{n−1})³ on [x_{n−1}, x_n]
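The slides do not show code for splines; as an illustration, SciPy's CubicSpline builds exactly such a piecewise cubic (the "natural" end conditions S''(x1) = S''(xn) = 0 are one common choice):

```python
import numpy as np
from scipy.interpolate import CubicSpline

xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])     # distinct, increasing x_i
ys = np.array([1.0, 2.0, 0.5, 3.0, 2.5])

spline = CubicSpline(xs, ys, bc_type="natural")

print(spline(1.5))           # value of the cubic piece between x_2 and x_3
print(spline(xs) - ys)       # ~0: the spline passes through every data point
```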
Interpolation
Definition
A radial basis function is a real-valued function φ whose value depends only on the distance between its input and a fixed point: either the origin, so that

φ(x) = φ̂(∥x∥),

or some other fixed point c, called a center, so that

φ(x) = φ̂(∥x − c∥).

Any function satisfying this property is a radial function. The distance is usually Euclidean distance, although other metrics are sometimes used. Radial basis functions are often used as a collection {φ_k}_k which forms a basis for some function space of interest, hence the name.
Interpolation
A function of the form

φ_c(x) = φ(∥x − c∥)

is said to be a radial kernel centered at c ∈ V. A radial function and the associated radial kernels are said to be radial basis functions if, for any finite set of nodes

{x_k}_{k=1}^{n} ⊆ V,

all of the following conditions are true:

The kernels φ_{x_1}, φ_{x_2}, . . . , φ_{x_n} are linearly independent.
Interpolation
The kernels φ_{x_1}, φ_{x_2}, . . . , φ_{x_n} form a basis for a Haar space, meaning that the interpolation matrix (given below) is non-singular:

\[
\begin{bmatrix}
\varphi(\lVert x_1 - x_1 \rVert) & \varphi(\lVert x_2 - x_1 \rVert) & \cdots & \varphi(\lVert x_n - x_1 \rVert) \\
\varphi(\lVert x_1 - x_2 \rVert) & \varphi(\lVert x_2 - x_2 \rVert) & \cdots & \varphi(\lVert x_n - x_2 \rVert) \\
\vdots & \vdots & \ddots & \vdots \\
\varphi(\lVert x_1 - x_n \rVert) & \varphi(\lVert x_2 - x_n \rVert) & \cdots & \varphi(\lVert x_n - x_n \rVert)
\end{bmatrix}
\]
Interpolation
Examples
Commonly used types of radial basis functions include (writing r = ∥x − x_i∥ and using ε to indicate a shape parameter that can be used to scale the input of the radial kernel):

Gaussian: φ(r) = e^{−(εr)²}

Multiquadric: φ(r) = √(1 + (εr)²)

Inverse multiquadric: φ(r) = 1 / √(1 + (εr)²)
Interpolation
Radial basis functions are typically used to build up function approximations of the form

y(x) = Σ_{i=1}^{N} w_i φ(∥x − x_i∥),

where the weights w_i can be determined, for example, by requiring that y reproduce the data at the centers x_i.
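A one-dimensional sketch (illustrative, using the Gaussian kernel from the examples above): the weights w_i are found by solving the interpolation system built from the kernel matrix.

```python
import numpy as np

def rbf_interpolate(centers, values, query, eps=1.0):
    """Gaussian-RBF interpolation: y(x) = sum_i w_i * phi(|x - x_i|)."""
    centers = np.asarray(centers, dtype=float)
    phi = lambda r: np.exp(-(eps * r) ** 2)
    # Interpolation matrix A[j, k] = phi(|x_j - x_k|); solve A w = values
    A = phi(np.abs(centers[:, None] - centers[None, :]))
    w = np.linalg.solve(A, np.asarray(values, dtype=float))
    q = np.asarray(query, dtype=float)
    return phi(np.abs(q[:, None] - centers[None, :])) @ w

xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 2.0, 1.5]
print(rbf_interpolate(xs, ys, np.array(xs)))   # ~ [1.0, 0.0, 2.0, 1.5]: data reproduced
```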
Introduction to Fourier Transform
Euler’s Formula
Let’s revisit one of the most iconic and fundamental formulas in mathematics:
Euler’s formula, which elegantly connects complex exponentials with
trigonometric functions:
eiθ = cos θ + i sin θ
This relationship allows us to express complex numbers in polar form, linking their
magnitude and phase to trigonometric terms.
For example, the complex number z = eiπ corresponds to z = −1, which also
gives us the celebrated identity:
eiπ + 1 = 0
Introduction to Fourier Transform
An n-th root of unity is a complex number z satisfying

z^n = 1.

On the real number line, there are only two roots of unity, −1 and 1. However, in the complex plane, there are many more. For example, the imaginary unit i is a 4th root of unity, because

i⁴ = (i²)² = (−1)² = 1.

An n-th root of unity is called primitive if it is not a k-th root of unity for any k < n. For instance, −1 is a primitive second root of unity but a non-primitive fourth root of unity. For any integer n, the complex number

ω_n = e^{−i2π/n}

is a primitive n-th root of unity.
Introduction to Fourier Transform
Lemma
Let ω be a primitive nth root of unity and k be an integer. Then

Σ_{j=0}^{n−1} ω^{jk} = n if k/n is an integer, and 0 otherwise.
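A quick numerical check of the Lemma (illustrative):

```python
import numpy as np

n = 8
omega = np.exp(-2j * np.pi / n)          # a primitive n-th root of unity
for k in range(2 * n):
    s = sum(omega ** (j * k) for j in range(n))
    print(k, np.round(s, 10))            # 8 when k is a multiple of n (k = 0, 8), ~0 otherwise
```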
Introduction to Fourier Transform
For example, the Lemma shows that the DFT of x = [1, 1, . . . , 1] is y = [√n, 0, . . . , 0].

In matrix terms, this definition says

\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{bmatrix}
=
\begin{bmatrix} a_0 + i b_0 \\ a_1 + i b_1 \\ a_2 + i b_2 \\ \vdots \\ a_{n-1} + i b_{n-1} \end{bmatrix}
=
\frac{1}{\sqrt{n}}
\begin{bmatrix}
\omega^0 & \omega^0 & \cdots & \omega^0 \\
\omega^0 & \omega^1 & \cdots & \omega^{n-1} \\
\omega^0 & \omega^2 & \cdots & \omega^{2(n-1)} \\
\vdots & \vdots & \ddots & \vdots \\
\omega^0 & \omega^{n-1} & \cdots & \omega^{(n-1)^2}
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_{n-1} \end{bmatrix}.
\]
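A small sketch (illustrative) that builds this Fourier matrix and checks the all-ones example:

```python
import numpy as np

def fourier_matrix(n):
    """Unitary DFT matrix with entries omega^{jk} / sqrt(n), omega = e^{-i 2 pi / n}."""
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(-2j * np.pi / n) ** (j * k) / np.sqrt(n)

n = 4
F = fourier_matrix(n)
print(np.round(F @ np.ones(n), 10))             # -> [2, 0, 0, 0] = [sqrt(n), 0, ..., 0]
print(np.allclose(F.conj().T @ F, np.eye(n)))   # True: F is unitary
```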
Introduction to Fourier Transform
Number of Operations
Definition
The magnitude of a complex vector v is the real number

∥v∥ = √(v̄ᵀ v).

A matrix F satisfying F̄ᵀ F = I is unitary. A unitary matrix, like the Fourier matrix, is the complex version of a real orthogonal matrix. If F is unitary, then

∥Fv∥² = v̄ᵀ F̄ᵀ F v = v̄ᵀ v = ∥v∥².
Applying the Discrete Fourier Transform is a matter of multiplying by the n × n
matrix Fn , and therefore requires O(n2 ) operations (specifically n2 multiplications
and n(n − 1) additions). The inverse Discrete Fourier Transform, which is applied
by multiplication by Fn−1 , is also an O(n2 ) process.
Fast Fourier Transform, developed from DFT requires significantly fewer
operations.
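The sketch below (illustrative; it reuses the fourier_matrix helper from the earlier snippet) checks that the FFT produces the same transform as the O(n²) matrix product; scipy.fft uses the unnormalized convention, hence the division by √n.

```python
import numpy as np
from scipy.fft import fft

n = 8
x = np.random.rand(n)
y_matrix = fourier_matrix(n) @ x       # O(n^2): multiply by the n x n Fourier matrix
y_fft = fft(x) / np.sqrt(n)            # O(n log n): FFT, rescaled to the unitary convention
print(np.allclose(y_matrix, y_fft))    # True
```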
Introduction to Fourier Transform
Parseval’s Theorem states that the total energy of a signal in the time domain is
equal to its total energy in the frequency domain.
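A discrete illustration (sketch): with the unitary scaling, the energy of a vector equals the energy of its DFT.

```python
import numpy as np
from scipy.fft import fft

x = np.random.rand(64)
y = fft(x, norm="ortho")                                 # orthonormal (unitary) DFT
print(np.sum(np.abs(x) ** 2), np.sum(np.abs(y) ** 2))    # the two energies agree
```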
Introduction to Fourier Transform
Interpretation:
The Fourier Transform and its inverse form a pair of operations.
The Fourier Inversion Theorem guarantees that no information is lost when
transforming a function into the frequency domain and then back to the time
domain.
Introduction to Fourier Transform
Convolution Theorem
The Convolution Theorem states that the Fourier transform of the convolution of
two functions is the product of their Fourier transforms.
Convolution of Two Functions
The convolution of two functions f (t) and g(t) is defined as:
(f ∗ g)(t) = ∫_{−∞}^{∞} f(τ) g(t − τ) dτ
Convolution Theorem
If F(ω) = F{f(t)} and G(ω) = F{g(t)}, then

F{(f ∗ g)(t)} = F(ω) · G(ω).
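A discrete check of the theorem (illustrative sketch): for the DFT, circular convolution in the time domain becomes pointwise multiplication of the transforms.

```python
import numpy as np
from scipy.fft import fft, ifft

n = 16
f = np.random.rand(n)
g = np.random.rand(n)

conv_direct = np.array([sum(f[j] * g[(k - j) % n] for j in range(n)) for k in range(n)])
conv_fft = ifft(fft(f) * fft(g)).real

print(np.allclose(conv_direct, conv_fft))   # True
```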
Phase Space and the Fourier Transform
The phase space is a crucial concept in both classical and quantum mechanics,
where the state of a system is described by both position and momentum (or time
and frequency in signal processing).
Phase Space and the Fourier Transform
Heisenberg Principle

A signal and its Fourier transform cannot both be sharply concentrated: the more localized a function is in time, the more spread out its transform is in frequency. This trade-off is the Fourier-analytic form of the Heisenberg uncertainty principle.
Conclusion
Conclusion
Key Points
Fourier transform provides a decomposition of functions into sinusoidal
components.
Theorems such as Parseval’s theorem, the convolution theorem, and the
Fourier inversion theorem offer fundamental insights.
Multivariate Fourier transforms extend these concepts to higher-dimensional
spaces.
Phase space allows us to study signals in both time and frequency
simultaneously.
Conclusion
Coding Time
Create a Signal
Let’s create two sine waves with given frequencies and combine them into one
signal. We will use two frequencies: 27Hz and 35Hz.
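The original code is not included in the extracted slides; a sketch of the described signal (the sampling rate and duration are assumptions):

```python
import numpy as np

sample_rate = 1000                       # samples per second (assumed)
t = np.arange(0, 1, 1 / sample_rate)     # one second of samples

wave1 = np.sin(2 * np.pi * 27 * t)       # 27 Hz sine wave
wave2 = np.sin(2 * np.pi * 35 * t)       # 35 Hz sine wave
signal = wave1 + wave2                   # combined signal
```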
Conclusion
Visualization
Conclusion
Doing FFT
Using SciPy's built-in discrete Fourier transform routines, we move the signal from the time domain to the frequency domain.
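A sketch of this step (the variables signal and sample_rate come from the snippet above, not from the slides):

```python
import numpy as np
from scipy.fft import fft, fftfreq

n = len(signal)
spectrum = fft(signal)                   # frequency-domain representation
freqs = fftfreq(n, d=1 / sample_rate)    # frequency (Hz) of each FFT bin

# The magnitude spectrum has its largest peaks at +-27 Hz and +-35 Hz
peaks = freqs[np.argsort(np.abs(spectrum))[-4:]]
print(sorted(abs(p) for p in peaks))     # ~ [27, 27, 35, 35]
```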
Least Square Approximation
Ax = b
where:
A ∈ Rm×n is a matrix of coefficients
x ∈ Rn is the vector of unknowns
b ∈ Rm is the vector of observations
If m > n, the system is overdetermined and is typically inconsistent, so that in general there is no exact solution.
An alternative in this situation is to find a vector x that comes the closest to
being a solution.
This special x will be called the least squares solution, which minimizes the
error, defined as the Euclidean distance between the predicted and actual values.
Least Square Approximation
Geometric Interpretation

Writing the columns of A as v1, v2, . . . , vn, the product Ax is the linear combination

Ax = x1 v1 + x2 v2 + · · · + xn vn,

so the set {Ax | x ∈ Rⁿ} of all such combinations is the subspace (plane) spanned by the columns of A.
Least Square Approximation
Orthogonality
If b lies outside of the plane defined by v1 and v2 , there will be no solution. The
least squares solution x̄ makes the combination vector Ax̄ the one in the plane Ax
that is nearest to b in the sense of Euclidean distance.
Least Square Approximation
Normal equation
We now require the residual b − Ax̄ to be orthogonal to the set of all vectors Ax:

(b − Ax̄) ⊥ {Ax | x ∈ Rⁿ}

(Ax)ᵀ(b − Ax̄) = 0 for all x in Rⁿ

xᵀAᵀ(b − Ax̄) = 0 for all x in Rⁿ.

This means that the n-dimensional vector Aᵀ(b − Ax̄) is perpendicular to every vector x in Rⁿ, including itself, so

Aᵀ(b − Ax̄) = 0

AᵀA x̄ = Aᵀb.

This system of equations is known as the normal equations. Solving it gives the least squares solution x̄ that minimizes the Euclidean length of the residual r = b − Ax̄.
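A small sketch (illustrative data): solving the normal equations directly agrees with NumPy's dedicated least squares routine, which is the numerically preferred way to compute x̄.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))          # 6 equations, 3 unknowns: overdetermined
b = rng.standard_normal(6)

xbar = np.linalg.solve(A.T @ A, A.T @ b)             # normal equations
xbar_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)   # dedicated least squares solver

print(np.allclose(xbar, xbar_lstsq))     # True
print(np.linalg.norm(b - A @ xbar))      # Euclidean length of the residual r = b - A xbar
```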
Least Square Approximation
Error Measures
If the residual is the zero vector, then we have solved the original system Ax = b
exactly.
If not, the Euclidean length of the residual vector is a backward error measure of
how far x̄ is from being a solution. There are at least three ways to express the
size of the residual.

The Euclidean length of the residual vector (the 2-norm):

∥r∥₂ = √(r1² + · · · + rm²)

The squared error:

SE = r1² + · · · + rm²

The root mean squared error:

RMSE = √(SE/m) = ∥r∥₂ / √m
Least Square Approximation
STEP 1. CHOOSE A MODEL. Select a model with unknown parameters, such as the line y = c1 + c2 t.
STEP 2. FORCE THE MODEL TO FIT THE DATA. Substitute the data points into the model. Each data point creates an equation whose unknowns are the parameters, such as c1 and c2 in the line model. This results in a system Ax = b, where the unknown x represents the unknown parameters.
STEP 3. SOLVE THE NORMAL EQUATIONS. The least squares solution for the parameters will be found as the solution to the system of normal equations AᵀAx = Aᵀb.
Least Square Approximation
Example
Find the best line y = c1 + c2 t for the four data points (−1, 1), (0, 0), (1, 0), (2, −2). We follow the three steps: (1) Choose the model y = c1 + c2 t.
(2) Forcing the model to fit the data yields

c1 + c2(−1) = 1
c1 + c2(0) = 0
c1 + c2(1) = 0
c1 + c2(2) = −2
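Carrying out step (3) for this example as a sketch:

```python
import numpy as np

A = np.array([[1, -1],
              [1,  0],
              [1,  1],
              [1,  2]], dtype=float)
b = np.array([1, 0, 0, -2], dtype=float)

c1, c2 = np.linalg.solve(A.T @ A, A.T @ b)          # solve A^T A x = A^T b
print(c1, c2)                                       # c1 = 0.2, c2 = -0.9: best line y = 0.2 - 0.9 t
print(np.linalg.norm(b - A @ np.array([c1, c2])))   # 2-norm of the residual
```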
Least Square Approximation
Least Squares Approximation is a method for finding the best-fitting curve (or line) to a set of points by minimizing the sum of the squared differences between the observed values and the values predicted by the curve. The idea is to minimize the error between the actual data points and the predicted ones. If we have a set of data (x1, y1), (x2, y2), . . . , (xn, yn) and want to fit a line, curve, or any function f(x), we look for the function that minimizes the sum of squared residuals (differences between actual and predicted values):

min Σ_{i=1}^{n} (yi − f(xi))²,  where the residuals are ri = yi − f(xi).
Least Square Approximation
Definition: The method of linear least squares minimizes the sum of the squared
differences between observed values and the values predicted by a linear function.
Use Case: It’s widely used for fitting a straight line to data points. If your data
points are (xi , yi ), the goal is to find the line y = ax + b that best fits the data.
Equation: The error function (the sum of squared residuals) is

R(a, b) = Σ_i (yi − (a xi + b))².
Solution: The optimal values of a and b minimize the sum of squares of these
residuals, and solving this involves matrix operations.
Least Square Approximation
Objective: Given a set of data points {(xi, yi)}_{i=1}^{n}, you want to find the best-fitting line y = ax + b that minimizes the sum of squared errors. The error is defined as:

E(a, b) = Σ_{i=1}^{n} (yi − (a xi + b))²

To minimize E(a, b), we take the partial derivatives of E with respect to a and b and set them to zero:

∂E/∂a = −2 Σ_{i=1}^{n} xi (yi − (a xi + b)) = 0

∂E/∂b = −2 Σ_{i=1}^{n} (yi − (a xi + b)) = 0
Least Square Approximation
Rearranging these two conditions gives the normal equations for a and b:

a Σ_{i=1}^{n} xi² + b Σ_{i=1}^{n} xi = Σ_{i=1}^{n} xi yi

a Σ_{i=1}^{n} xi + n b = Σ_{i=1}^{n} yi
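A direct sketch of these formulas (illustrative helper name); on the earlier four-point example it recovers the same best line.

```python
import numpy as np

def fit_line(xs, ys):
    """Solve the two normal equations above for the best-fit line y = a x + b."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    M = np.array([[np.sum(xs ** 2), np.sum(xs)],
                  [np.sum(xs),      len(xs)   ]])
    rhs = np.array([np.sum(xs * ys), np.sum(ys)])
    a, b = np.linalg.solve(M, rhs)
    return a, b

print(fit_line([-1, 0, 1, 2], [1, 0, 0, -2]))   # ~ (-0.9, 0.2): the line y = 0.2 - 0.9 t
```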
Least Square Approximation
Definition: The constrained least squares (CLS) problem minimizes the sum of
squared errors, similar to the linear least squares, but with additional constraints
on the solution.
Use Case: It is useful when the solution must satisfy certain conditions, such as
equality constraints. Given a system of data points and constraints, CLS finds the
best solution that fits both the data and the constraints.
Least Square Approximation
Equation: The objective function for the constrained least squares problem is

minimize ∥Ax − b∥₂²  subject to  Cx = d,

where:
A ∈ Rm×n is the data matrix,
x ∈ Rn is the solution vector,
b ∈ Rm is the observation vector,
C ∈ Rp×n is the constraint matrix,
d ∈ Rp is the vector of constraints.
Solution: The constrained least squares problem can be solved using the Karush-Kuhn-Tucker (KKT) conditions. The optimality condition is given by

[ 2AᵀA  Cᵀ ] [ x ]   [ 2Aᵀb ]
[  C    0  ] [ z ] = [   d  ]

where z is the vector of Lagrange multipliers.
Least Square Approximation
Objective: The goal is to minimize the objective function ∥Ax − b∥₂² while ensuring that the solution satisfies the equality constraint Cx = d.
KKT System: The KKT system combines both the objective function and the constraints into one system, and solving it yields the optimal values of x and the Lagrange multipliers z:

[ x ]   [ 2AᵀA  Cᵀ ]⁻¹ [ 2Aᵀb ]
[ z ] = [  C    0  ]    [   d  ]
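A sketch (illustrative helper and toy data) that solves the KKT system directly:

```python
import numpy as np

def constrained_least_squares(A, b, C, d):
    """Minimize ||Ax - b||^2 subject to Cx = d by solving the KKT system above."""
    n, p = A.shape[1], C.shape[0]
    KKT = np.block([[2 * A.T @ A, C.T],
                    [C,           np.zeros((p, p))]])
    rhs = np.concatenate([2 * A.T @ b, d])
    sol = np.linalg.solve(KKT, rhs)
    return sol[:n], sol[n:]              # x and the Lagrange multipliers z

# Toy example: fit 3 parameters to 5 observations while forcing them to sum to 1
rng = np.random.default_rng(1)
A, b = rng.standard_normal((5, 3)), rng.standard_normal(5)
C, d = np.ones((1, 3)), np.array([1.0])
x, z = constrained_least_squares(A, b, C, d)
print(x, C @ x)                          # C @ x ~ [1.0]: the constraint is satisfied
```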
Least Square Approximation
Nonlinear least squares fits a model f(x, θ) that depends nonlinearly on the parameters θ by minimizing the sum of squared residuals

Σ_{i=1}^{n} (yi − f(xi, θ))²,

where:
xi are the input data points,
yi are the observed data points,
f (xi , θ) is the nonlinear function parameterized by θ,
θ is the vector of parameters to be optimized.
Solution: Nonlinear least squares problems are typically solved using iterative methods. One popular method is the Gauss-Newton method, which approximates the nonlinear function by a linear one using a first-order Taylor expansion:

f(xi, θ) ≈ f(xi, θk) + Ji(θ − θk)
where Ji is the Jacobian matrix of partial derivatives of f (xi , θ) with respect to θ.
Least Square Approximation
At each iteration, the parameter vector is updated as

θ_{k+1} = θ_k + Δθ.
Jacobian Matrix: The Jacobian matrix J consists of the partial derivatives of the
model function with respect to the parameters:
J_{ij} = ∂f(xi, θ) / ∂θj
At each step, the linearized problem is solved, and the parameters are updated
until convergence.
Convergence: The iterative process continues until the parameter updates ∆θ
are sufficiently small, indicating that the solution has converged.
Least Square Approximation
Gauss-Newton step:
1 Form the Jacobian of the residuals, J_{ij} = ∂ri(θ)/∂θj.
2 Solve the normal equation for the update Δθ:

AᵀA Δθ = Aᵀr,

where A = J is the Jacobian of the linearized problem and r is the residual vector.
Least Square Approximation
Hessian Approximation: In Newton's method, the Hessian matrix H = Σ_{i=1}^{m} ∇²ri involves second-order derivatives. In Gauss-Newton, we approximate the Hessian by neglecting the second-order term:

H ≈ JᵀJ,

and the resulting update is θ_{k+1} = θ_k + v_k.
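A minimal Gauss-Newton sketch (illustrative; the function names and the exponential test model are assumptions, and no damping or line search is included, so the starting guess should be reasonably close):

```python
import numpy as np

def gauss_newton(f, jac, theta0, xs, ys, iters=20, tol=1e-10):
    """Fit y ~ f(x, theta) by repeatedly solving J^T J dtheta = J^T r."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iters):
        r = ys - f(xs, theta)                       # residuals
        J = jac(xs, theta)                          # Jacobian of f w.r.t. theta, shape (m, p)
        dtheta = np.linalg.solve(J.T @ J, J.T @ r)
        theta = theta + dtheta
        if np.linalg.norm(dtheta) < tol:            # stop when the update is sufficiently small
            break
    return theta

# Example: recover (a, b) in y = a * exp(b x) from noiseless data
f = lambda x, th: th[0] * np.exp(th[1] * x)
jac = lambda x, th: np.column_stack([np.exp(th[1] * x), th[0] * x * np.exp(th[1] * x)])
xs = np.linspace(0, 1, 20)
ys = f(xs, [2.0, 0.5])
print(gauss_newton(f, jac, [1.8, 0.6], xs, ys))     # ~ [2.0, 0.5]
```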
Least Square Approximation
Exponential least squares fits a model of the form

y = a e^{bx}

to a set of data points (xi, yi), where a and b are parameters to be determined, x is the independent variable, and y is the dependent variable.

Objective: In exponential least squares, the objective is to minimize the sum of squared residuals between the observed values yi and the values predicted by the exponential model. The residual for each point is the difference between the observed value and the predicted value, and the error function to minimize is:

E(a, b) = Σ_{i=1}^{n} (yi − a e^{b xi})²
Least Square Approximation
Solving: To solve this nonlinear problem, we first linearize the exponential model by taking the natural logarithm of both sides:
ln(y) = ln(a) + bx
Letting Y = ln(y) and A = ln(a), the equation becomes:
Y = A + bx
This is now a linear model in terms of A and b, which we can solve using the
method of linear least squares. Once we find A and b, we can recover a by
exponentiating A:
a = eA
Least Square Approximation
Steps
1 Transform the Data: Take the natural logarithm of the dependent variable
y to linearize the equation.
2 Apply Linear Least Squares: Solve the linearized equation Y = A + bx
using linear least squares techniques.
3 Recover Parameters: Once the least squares method gives estimates for A
and b, recover a as a = eA .
4 Evaluate the Fit: Use the original model y = aebx with the obtained
parameters to evaluate the fit to the data.
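A sketch of these four steps on synthetic data (the values a = 2, b = 0.8 are illustrative):

```python
import numpy as np

xs = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
ys = 2.0 * np.exp(0.8 * xs)               # synthetic data from y = 2 e^{0.8 x}

Y = np.log(ys)                            # step 1: transform the data, Y = ln y
b, A = np.polyfit(xs, Y, 1)               # step 2: linear least squares for Y = A + b x
a = np.exp(A)                             # step 3: recover a = e^A
print(a, b)                               # ~ (2.0, 0.8)

# step 4: evaluate the fit of y = a e^{b x} against the data
print(np.max(np.abs(a * np.exp(b * xs) - ys)))   # ~ 0 for this noiseless example
```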
Importance of Approximation in Real-World Applications
References