Numerical Analysis Formulae
February 2021
1 Interpolation
Notation: f : [a, b] → R will be the continuous function to be interpolated. P_n refers to the vector space of polynomials of degree ≤ n. p ∈ P_n is the interpolating polynomial (spline) that interpolates f at x_0, x_1, ..., x_n (i.e. p(x_i) = f(x_i) = f_i at all x_i). ‖f‖ = sup_{x∈[a,b]} |f(x)| refers to the supremum norm of the function.
Note that the Lagrange basis polynomials l_i satisfy
\[ \sum_{i=0}^{n} l_i(x) = 1 \quad \forall x \]
• Uniqueness Theorem
Given n + 1 interpolation points (and the values of f at these points), there is a unique polynomial interpolant from P_n that can do the interpolation.
• Weierstrass theorem
Given any continuous f : [a, b] → R and any ε > 0, there exists a polynomial p such that ‖f − p‖ < ε.
• Error in polynomial interpolation
If f is (n + 1)-times differentiable and p ∈ P_n interpolates f at (distinct) x_0, x_1, ..., x_n, then for any x ∈ [a, b] there exists ζ ∈ [a, b] such that
\[ f(x) - p(x) = \frac{f^{(n+1)}(\zeta)}{(n+1)!} \prod_{i=0}^{n} (x - x_i) \]
• Runge's Example
Equally-spaced polynomial interpolation is a recipe for disaster (overfitting). The Chebyshev interpolation points
\[ x_i = \frac{b+a}{2} + \frac{b-a}{2}\cos\frac{(n-i)\pi}{n} \]
are much better. Equally spaced splines are fine. (See the sketch below.)
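A minimal numpy sketch of this comparison (the test function 1/(1 + 25x²), the degree, and the function names are my choices, not from these notes):

```python
import numpy as np

def cheb_points(a, b, n):
    """Chebyshev interpolation points on [a, b] as given above."""
    i = np.arange(n + 1)
    return (b + a) / 2 + (b - a) / 2 * np.cos((n - i) * np.pi / n)

def max_interp_error(f, nodes, a, b):
    """Interpolate f at the given nodes by a degree-n polynomial and
    estimate the sup-norm error on a fine grid."""
    coeffs = np.polyfit(nodes, f(nodes), len(nodes) - 1)
    xs = np.linspace(a, b, 2001)
    return np.max(np.abs(f(xs) - np.polyval(coeffs, xs)))

f = lambda x: 1 / (1 + 25 * x**2)   # Runge's function
a, b, n = -1.0, 1.0, 10
equi = np.linspace(a, b, n + 1)
cheb = cheb_points(a, b, n)
print("equispaced error:", max_interp_error(f, equi, a, b))
print("Chebyshev  error:", max_interp_error(f, cheb, a, b))
```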
• Newton's recipe
\[ f[x_i] = f(x_i), \qquad f[x_i, x_{i+1}, \dots, x_j] = \frac{f[x_{i+1}, x_{i+2}, \dots, x_j] - f[x_i, x_{i+1}, \dots, x_{j-1}]}{x_j - x_i} \]
\[ p(x) = \sum_{i=0}^{n} f[x_0, x_1, \dots, x_i]\,(x - x_0)(x - x_1)\cdots(x - x_{i-1}) \]
(A short sketch follows below.)
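A short sketch of this recipe (function names are mine): build the divided-difference table with the recurrence above, then evaluate the Newton form of p by nesting:

```python
import numpy as np

def divided_differences(x, f):
    """Return [f[x0], f[x0,x1], ..., f[x0,...,xn]] via the recurrence above."""
    c = np.array(f, dtype=float)               # start with f[x_i] = f(x_i)
    n = len(x)
    for j in range(1, n):
        # after pass j, c[i] = f[x_{i-j}, ..., x_i] for every i >= j
        for i in range(n - 1, j - 1, -1):
            c[i] = (c[i] - c[i - 1]) / (x[i] - x[i - j])
    return c                                    # c[i] = f[x_0, ..., x_i]

def newton_eval(x_nodes, coeffs, x):
    """Evaluate p(x) = sum_i f[x0..xi](x-x0)...(x-x_{i-1}) by Horner-like nesting."""
    p = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        p = p * (x - x_nodes[i]) + coeffs[i]
    return p

xs = np.array([0.0, 1.0, 2.0, 4.0])
fs = xs**3 - 2 * xs                             # sample data from a cubic
c = divided_differences(xs, fs)
print(newton_eval(xs, c, 3.0), 3.0**3 - 2 * 3.0)  # interpolant reproduces the cubic
```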
1.2 Splines
• Basic idea
Make a piecewise polynomial function over each sub-interval through the interpolation points, so that the spline and its first (degree − 1) derivatives are continuous at each knot.
• Linear spline
Just join each interpolation point by a line segment.
• Error in linear spline interpolation
For equally spaced interpolation points with segment size h,
\[ \|f - p\| \le \frac{h^2}{8} \|f''\| \]
• Natural cubic spline (equally spaced knots with spacing h)
Writing σ_i for the second derivative of the spline at knot x_i, the natural boundary condition is
\[ \sigma_0 = \sigma_n = 0 \]
and on each sub-interval the spline piece is
\[ s_i(x) = a_i(x - x_i)^3 + b_i(x - x_i)^2 + c_i(x - x_i) + d_i \quad \forall x \in [x_i, x_{i+1}] \]
with coefficients
\[ a_i = \frac{\sigma_{i+1} - \sigma_i}{6h}, \qquad b_i = \frac{\sigma_i}{2}, \qquad c_i = \frac{f_{i+1} - f_i}{h} - \frac{h}{6}(\sigma_{i+1} + 2\sigma_i), \qquad d_i = f(x_i) = f_i \]
• Error bounds for cubic spline interpolation:
\[ \|f - p\| \le h^2 \|f'' - p''\| \]
\[ \|f - p\| \le h^4 \|f''''\| \quad \text{(NOT TRUE FOR NATURAL CUBIC SPLINES!)} \]
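For reference, SciPy's CubicSpline with bc_type='natural' builds this kind of natural cubic spline; a minimal sketch (the sample function sin is my choice):

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.linspace(0.0, np.pi, 9)             # equally spaced knots
y = np.sin(x)
s = CubicSpline(x, y, bc_type='natural')   # natural spline: s''(x0) = s''(xn) = 0
xs = np.linspace(0.0, np.pi, 500)
print("max |f - s| :", np.max(np.abs(np.sin(xs) - s(xs))))
print("sigma_0, sigma_n:", s(x[0], 2), s(x[-1], 2))   # second derivatives at the ends
```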
2 Quadrature
2.1 Basic Idea
We will estimate the integral I_f = ∫_a^b f(x) dx by the integral of the interpolant, I_p. We know that we can write p(x) = ∑_i f(x_i) L_i(x), where the L_i's don't depend on f at all (only on the interpolation points). So we can write I_p = ∑_{i=0}^{n} f(x_i) w_i, where the w_i are 'weights' whose values do not depend on f at all (w_i = ∫_a^b L_i(x) dx).
Properties of weights
• w_i = w_{n−i}
• ∑_{i=0}^{n} w_i = ∫_0^n 1 dx = n
• For the composite trapezoidal rule C_{p,1} with m sub-intervals of width h,
\[ |I_f - C_{p,1}| \le m \cdot \frac{1}{12} \|f''\| h^3 = \frac{b-a}{12} \|f''\| h^2 \]
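A minimal sketch of the composite trapezoidal rule the bound refers to, assuming m equal sub-intervals of width h = (b − a)/m (names and the test integral are mine):

```python
import numpy as np

def composite_trapezoid(f, a, b, m):
    """Composite trapezoidal rule with m sub-intervals of width h = (b-a)/m."""
    x = np.linspace(a, b, m + 1)
    y = f(x)
    h = (b - a) / m
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

exact = 1 - np.cos(1.0)                    # integral of sin on [0, 1]
for m in (4, 8, 16):
    err = abs(composite_trapezoid(np.sin, 0.0, 1.0, m) - exact)
    print(m, err)                          # error shrinks roughly by 4x as h halves
```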
Simpson's rule
\[ \int_a^b f(x)\,dx \approx \frac{b-a}{6}\left(f(a) + 4f\!\left(\frac{a+b}{2}\right) + f(b)\right) \]
3 Ordinary Differential Equations
• For a higher-order ODE, set some of the y_i's to be the derivatives of the other variables, i.e. rewrite the equation as a first-order system.
• Peano's theorem (existence):
Suppose f : R² → R is continuous on (−δ, δ) × (y_0 − η, y_0 + η). Then there exists an ε > 0 and a function y : (−ε, ε) → R such that y'(t) = f(t, y(t)) and y(0) = y_0.
• Global error:
Notation: y_n → y_{n,h} (to explicitly denote the dependence on h). We define e_{n,h} = y(t_n) − y_{n,h}. The global error is defined as e_{global,h} = max_{n=0,1,2,...,⌊T/h⌋} |e_{n,h}|. We say that a method converges if lim_{h→0⁺} e_{global,h} = 0.
Further, if e_{global,h} ≤ Ch^n for all h < h_0 (i.e. for all small enough h), then we say the method is of global order n.
• For Euler's method, the local order is 1 and the global order is also 1:
\[ e_{trunc} \le Ch^2 \implies e_{global} \le \frac{C}{L}\left(e^{LT} - 1\right) h \quad \text{on the domain } [0, T] \]
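A minimal Euler's method sketch (names and the test problem y' = −y are mine); the error at the final time shrinks roughly linearly in h, matching global order 1:

```python
import numpy as np

def euler(f, y0, T, h):
    """Euler's method: y_{n+1} = y_n + h f(t_n, y_n) on [0, T]."""
    n_steps = round(T / h)
    t, y = 0.0, y0
    for _ in range(n_steps):
        y = y + h * f(t, y)
        t = t + h
    return y

# y' = -y, y(0) = 1, exact solution exp(-T); error decreases like O(h)
for h in (0.1, 0.05, 0.025):
    print(h, abs(euler(lambda t, y: -y, 1.0, 1.0, h) - np.exp(-1.0)))
```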
• For the trapezoidal method, the local order is 2 and the global order is also 2:
\[ e_{trunc} \le Ch^3 \implies e_{global} \le \frac{Ch^2}{L} \exp\!\left(\frac{TL}{1 - Lh/2}\right) \quad \text{on the domain } [0, T] \]
– Adams-Bashforth (explicit s-step): replace f in the integral by its interpolant at the previous nodes to get
\[ y_{n+s} - y_{n+s-1} \approx h \sum_{k=0}^{s-1} \underbrace{\frac{1}{h}\int_{0}^{h} \prod_{\substack{j=0 \\ j \ne k}}^{s-1} \frac{t - t_j}{t_k - t_j}\, dt}_{\beta_k} \, f_{n+k} \]
• Local truncation order of a general s-step method
A general s-step method ∑_{k=0}^{s} α_k y_{n+k} = h ∑_{k=0}^{s} β_k f_{n+k} is of order at least p iff
\[ \sum_{k=0}^{s} \alpha_k = 0 \quad \text{and} \quad \sum_{k=0}^{s} k^m \alpha_k = m \sum_{k=0}^{s} k^{m-1} \beta_k \;\; \text{for } m = 1, 2, \dots, p, \]
and of order exactly p if, in addition,
\[ \sum_{k=0}^{s} k^{p+1} \alpha_k \ne (p+1) \sum_{k=0}^{s} k^{p} \beta_k \]
(A numerical check follows below.)
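A quick numerical check of these conditions (the 2-step Adams-Bashforth coefficients used here are standard, not taken from these notes):

```python
# Order conditions for the 2-step Adams-Bashforth method
#   y_{n+2} - y_{n+1} = h * (3/2 f_{n+1} - 1/2 f_n)
# i.e. alpha = (0, -1, 1), beta = (-1/2, 3/2, 0).
alpha = [0.0, -1.0, 1.0]
beta = [-0.5, 1.5, 0.0]
s = 2

def moment(m):
    """Left- and right-hand sides of the m-th order condition."""
    lhs = sum(k**m * alpha[k] for k in range(s + 1))
    rhs = m * sum(k**(m - 1) * beta[k] for k in range(s + 1)) if m > 0 else 0.0
    return lhs, rhs

for m in range(0, 4):
    lhs, rhs = moment(m)
    print(m, lhs, rhs, "holds" if abs(lhs - rhs) < 1e-12 else "fails")
# holds for m = 0, 1, 2 and fails at m = 3, so the method has order p = 2
```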
4 Solving Systems of Linear Equations
• Simple Systems to solve (Ax = b):
– Diagonal Matrix: If A = diag(a_1, a_2, ..., a_n) then A^{-1} = diag(1/a_1, 1/a_2, ..., 1/a_n), so x = diag(1/a_1, 1/a_2, ..., 1/a_n) b.
– Unitary Matrix: In this case, A^{-1} = A^T (A^† in the complex case). So we can directly obtain x = A^T b.
– Upper triangular matrix (lower is similar): Back-Substitution Method: first solve the equation a_{n,n} x_n = b_n, then substitute into a_{n-1,n-1} x_{n-1} + a_{n-1,n} x_n = b_{n-1} and solve for x_{n-1}, and so on, all the way back up to x_1. (See the sketch below.)
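A minimal back-substitution sketch for the upper triangular case (illustrative only; names are mine):

```python
import numpy as np

def back_substitution(U, b):
    """Solve Ux = b for upper triangular U, working from the last row upwards."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0, -1.0],
              [0.0, 3.0,  2.0],
              [0.0, 0.0,  4.0]])
b = np.array([1.0, 11.0, 8.0])
print(back_substitution(U, b), np.linalg.solve(U, b))   # both give the same x
```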
• Gaussian Elimination:
– In the simplest case, no pivoting is required, i.e. the diagonal entries are non-zero throughout the procedure. In this course, we only reduce A to upper triangular form and then solve via back-substitution.
– Multipliers: m_{ij} = A_{ij}^{(i)} / A_{ii}^{(i)}, where the superscript (i) denotes that these values are as calculated at the i-th step.
– Pivoting: Exchanging rows before we start elimination so that no diagonal entry is ever zero.
– Theorem: No pivoting is needed if each k-th order leading submatrix
\[ \Delta_k = \begin{pmatrix} A_{11} & A_{12} & \dots & A_{1k} \\ A_{21} & A_{22} & \dots & A_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ A_{k1} & A_{k2} & \dots & A_{kk} \end{pmatrix} \]
is non-singular. (A sketch of the elimination step follows below.)
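A sketch of the elimination step without pivoting, forming the multipliers described above (assumes no zero pivot is encountered; names are mine):

```python
import numpy as np

def gaussian_elimination(A, b):
    """Reduce Ax = b to upper triangular form Ux = c without pivoting."""
    U = A.astype(float).copy()
    c = b.astype(float).copy()
    n = len(c)
    for i in range(n - 1):                 # i-th elimination step
        for j in range(i + 1, n):
            m = U[j, i] / U[i, i]          # multiplier m_ji
            U[j, i:] -= m * U[i, i:]
            c[j] -= m * c[i]
    return U, c

A = np.array([[4.0, 2.0, 1.0],
              [2.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])
b = np.array([7.0, 9.0, 9.0])
U, c = gaussian_elimination(A, b)
print(np.allclose(np.linalg.solve(A, b), np.linalg.solve(U, c)))   # True
```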
• LU factorisation
– A = LU, where L is unit lower triangular (its sub-diagonal entries are the multipliers m_{ij} from elimination) and U is the upper triangular matrix produced by Gaussian elimination.
• Cholesky Factorisation
– A = BB^T, where A should be symmetric positive definite, and B should be lower triangular with all diagonal entries positive.
– Positive definite: A symmetric matrix is positive definite iff all its eigenvalues are real and positive, iff y^T A y > 0 for all non-zero column vectors y.
– If A is positive definite, then the B we get is unique.
– If we try Cholesky factorisation on a matrix with a zero or negative eigenvalue, we will get a diagonal entry of B that is either 0 or imaginary respectively; this gives a fast test for positive definiteness. (See the sketch below.)
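This fast test is what numpy's Cholesky routine gives directly: np.linalg.cholesky raises LinAlgError when the (symmetric) matrix is not positive definite. A minimal sketch:

```python
import numpy as np

def is_positive_definite(A):
    """Attempt A = B B^T with B lower triangular; failure means A is not SPD."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite(np.array([[4.0, 1.0], [1.0, 3.0]])))   # True
print(is_positive_definite(np.array([[1.0, 2.0], [2.0, 1.0]])))   # False (has eigenvalue -1)
```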
• QR factorisation
– We factor A = QR, where Q is orthogonal, i.e. Q^{-1} = Q^T (or equivalently all its columns are orthonormal vectors), and R is upper triangular.
– This factorisation is unique if we demand that the diagonal entries of R are positive.
– Columns of Q are obtained by Gram-Schmidt orthonormalisation of the columns of the original A.
– R_{ki} = ⟨q_k, A_i⟩ if k < i, and R_{ii} = ‖A_i − ∑_{j=1}^{i−1} ⟨q_j, A_i⟩ q_j‖; all other entries are 0 (upper triangular). (See the sketch below.)
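A classical Gram-Schmidt sketch of this QR factorisation (for illustrating the formulas only; in practice modified Gram-Schmidt or Householder reflections are preferred for numerical stability):

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR via Gram-Schmidt: R_ki = <q_k, A_i> for k < i, R_ii = norm of the remainder."""
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    R = np.zeros((n, n))
    for i in range(n):
        v = A[:, i].astype(float)
        for k in range(i):
            R[k, i] = Q[:, k] @ A[:, i]
            v -= R[k, i] * Q[:, k]
        R[i, i] = np.linalg.norm(v)
        Q[:, i] = v / R[i, i]
    return Q, R

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(2)))   # True True
```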
• Norm Equivalence: Two norms ‖·‖_1 and ‖·‖_2 are considered equivalent iff ∃ m, M > 0 such that m‖x‖_1 ≤ ‖x‖_2 ≤ M‖x‖_1 for all x. Note that this happens to be a reflexive, symmetric and transitive relation.
Theorem: in a finite-dimensional vector space, all norms are equivalent.
• The l_p norm:
\[ \|x\|_{l_p} := \left( \sum_{k=1}^{n} |x_k|^p \right)^{1/p} \]
• Subordinate matrix norm: ‖A‖ := sup_{x≠0} ‖Ax‖/‖x‖, for a given vector norm. Properties:
– A subordinate matrix norm is a matrix norm (it satisfies ‖AB‖ ≤ ‖A‖ ‖B‖).
– The subordinate matrix norm of the identity matrix is 1.
– The matrix norm subordinate to the l_2 norm of any unitary matrix is 1.
– If A is unitary and B is arbitrary then ‖AB‖_2 = ‖BA‖_2 = ‖B‖_2.
– For a diagonal matrix A = diag(a_1, a_2, ..., a_n), ‖A‖_2 = max{|a_i|}.
• Normal Matrices: A†A = AA†. The characterisation theorem states that every normal matrix can be written as U D U†, where U is unitary and D is diagonal with entries the eigenvalues of the original A. Note that Hermitian (A = A†) matrices are normal.
• Thus, for normal matrices, ‖A‖_2 is the largest |λ| as λ ranges over all eigenvalues. We define ρ(A) = max{|λ| : λ is an eigenvalue of A}; then ‖A‖_2 = ρ(A). Note that ρ(A) is only equal to ‖A‖_2 if A is normal, not in general.
• For any generic matrix norm, we can prove that ‖A‖ ≥ ρ(A). Similarly, we can prove that for any matrix A and any ε > 0, there is a subordinate matrix norm such that ‖A‖ ≤ ρ(A) + ε.
• We can prove the following results to easily get the subordinate norms:
– ‖A‖_1 = max_{1≤j≤n} ∑_{i=1}^{n} |A_{ij}|, i.e. the maximum column sum of moduli.
– ‖A‖_2 = ρ(A) (for normal matrices).
– ‖A‖_∞ = max_{1≤i≤n} ∑_{j=1}^{n} |A_{ij}|, i.e. the maximum row sum norm.
– lim_{k→∞} A^k = 0 iff ρ(A) < 1 (powers).
• Geometric series of matrices:
\[ (I - A)^{-1} = \sum_{k=0}^{\infty} A^k \]
The series converges iff ρ(A) < 1, and in this case the left-hand side is well defined as well.
• If A is invertible, then all matrices in the open ball of radius 1/‖A^{-1}‖ around A are invertible.
• Condition number of A is defined as κ(A) = ‖A‖ ‖A^{-1}‖ ≥ 1. (See the sketch below.)
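numpy computes the condition number directly; a small illustration (the matrix is my example):

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 1.0001]])      # nearly singular, hence ill-conditioned
print(np.linalg.cond(A))                        # kappa(A) in the 2-norm, roughly 4e4
print(np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2))   # same value by definition
```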
• Splitting methods: write A = M − N with M easy to invert, and iterate
\[ x^{(k+1)} = M^{-1} N x^{(k)} + M^{-1} b \]
which converges (for any starting vector) iff ρ(M^{-1}N) < 1.
• For a matrix A with real eigenvalues (e.g. a real symmetric matrix) λ_1 ≤ λ_2 ≤ ··· ≤ λ_n, Richardson's method converges iff all eigenvalues have the same sign and
\[ 0 < \alpha < \frac{2}{\lambda_n} \quad \text{or} \quad \frac{2}{\lambda_1} < \alpha < 0 \]
• Jacobi method: M = D, N = D − A, so the iteration matrix is J = M^{-1}N = I − D^{-1}A.
• Gauss-Seidel method: M = D + L_0, N = −U_0 (with L_0 and U_0 the strictly lower and strictly upper triangular parts of A), with iteration matrix G = M^{-1}N.
• Comparison of Jacobi and Gauss-Seidel (Stein-Rosenberg): if J ≥ 0 entrywise, then one of the following holds:
– 0 < ρ(G) < ρ(J) < 1
– ρ(G) = ρ(J) = 1
– ρ(G) > ρ(J) > 1
So whenever the Jacobi method converges (for J ≥ 0), Gauss-Seidel will converge even faster. (See the sketch below.)
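A sketch comparing the two splittings on a small diagonally dominant system (the matrix and iteration count are my choices):

```python
import numpy as np

def iterate(M, N, b, x0, steps):
    """Generic splitting iteration x <- M^{-1}(N x + b)."""
    x = x0.copy()
    for _ in range(steps):
        x = np.linalg.solve(M, N @ x + b)
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
D = np.diag(np.diag(A))
L0 = np.tril(A, -1)                  # strictly lower part
U0 = np.triu(A, 1)                   # strictly upper part
x_exact = np.linalg.solve(A, b)
x0 = np.zeros(3)

x_jacobi = iterate(D, D - A, b, x0, 10)       # Jacobi: M = D, N = D - A
x_gs     = iterate(D + L0, -U0, b, x0, 10)    # Gauss-Seidel: M = D + L0, N = -U0
print(np.linalg.norm(x_jacobi - x_exact), np.linalg.norm(x_gs - x_exact))
```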
• Relaxed Gauss-Seidel Method
Writing A = D − E − F with L = D^{-1}E, U = D^{-1}F and relaxation parameter α:
\[ M = \frac{1}{\alpha} D - E, \qquad N = \left(\frac{1}{\alpha} - 1\right) D + F \]
\[ M^{-1}N = G_\alpha = \left(\frac{1}{\alpha} I - L\right)^{-1} \left(\left(\frac{1}{\alpha} - 1\right) I + U\right) \]
5 Solving Nonlinear Equations
• Contraction Mapping: A function g : [a, b] → R is said to be a contraction mapping if there exists 0 < L < 1 such that for any x, y ∈ [a, b], |g(x) − g(y)| ≤ L|x − y|. If the range of g is also contained in [a, b], then we know that g has a fixed point, and if such a g is also a contraction, then fixed-point iteration will converge to the UNIQUE fixed point, for any initial guess x_0 ∈ [a, b].
• Let ξ be a fixed point of g : [a, b] → R which is continuous in some neighbourhood [ξ − h, ξ + h] and has a derivative at ξ with |g'(ξ)| < 1. Then g will be a continuous contraction in some neighbourhood of ξ, and we can converge to ξ by fixed-point iteration starting anywhere in this neighbourhood. However, if |g'(ξ)| > 1 then we can never converge to ξ by the fixed-point iteration method.
• Relaxation iteration: To find the zeros of a function f, we can equivalently try to find the fixed points of g(x) = x − λf(x), where λ is an arbitrary non-zero parameter (i.e. x_{k+1} = x_k − λf(x_k)).
• If we pick λ having the same sign as f'(ξ) (where f(ξ) = 0) and with small enough magnitude, then by the convergence theorem for fixed-point iteration |1 − λf'| < 1, so the relaxation iteration will converge.
• Convergence theorem for relaxation iteration: If we have f : [a, b] → R which has a zero at ξ, and if f' is continuous in a neighbourhood of ξ with f'(ξ) ≠ 0, then there exist λ and δ > 0 such that the relaxation iteration with this λ converges for any initial guess δ-close to ξ.
• Error sequence: The sequence defined as e_k = |x_k − ξ| for any iteration sequence (x_k) is called the error sequence.
If there exist p ≥ 1 and a non-negative constant C such that |e_{k+1}| < C|e_k|^p (with C < 1 if p = 1), then we say that the sequence of iterates converges with order at least p.
• Simple iteration for fixed point converges with order 1 with C = L < 1
(contraction constant).
• Improved Relaxation iteration: Rather than having a constant λ, we allow λ = λ(x_k). This will converge whenever |1 − λ'f − λf'| = |1 − λf'| < 1 (since f(ξ) = 0, the λ'f term drops at the root).
• The smaller |1 − λf'| is, the faster the method will converge. In particular, we can choose λ(x_k) = 1/f'(x_k) to make this quantity zero (the minimum), and this leads to the Newton-Raphson iteration
\[ x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \]
• … for x ∈ [ξ − δ, ξ + δ]. Then for any x_0 ∈ [ξ − h, ξ + h], where h = min(δ, 1/M), Newton's method with initial guess x_0 converges with order at least 2 to ξ.
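A minimal Newton-Raphson sketch (tolerance, iteration cap and the example f(x) = x² − 2 are mine):

```python
def newton_raphson(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x_{k+1} = x_k - f(x_k)/f'(x_k) until |f(x_k)| is small."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        x = x - fx / fprime(x)
    return x

# square root of 2 as the positive root of f(x) = x^2 - 2
print(newton_raphson(lambda x: x * x - 2, lambda x: 2 * x, 1.0))   # ~1.41421356...
```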
• Bisection Method: Guess an interval [a, b] within which the root should lie, i.e. f(a)f(b) < 0. Compute f((a+b)/2). If it is zero, we have found a root and we are done. Otherwise, either f(a)·f((a+b)/2) < 0 or f(b)·f((a+b)/2) < 0; accordingly, halve the interval and iterate with the new interval. (See the sketch below.)
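A corresponding bisection sketch (assumes f(a)f(b) < 0 on the initial bracket; tolerance and example are mine):

```python
def bisection(f, a, b, tol=1e-12):
    """Halve the bracket [a, b] with f(a) f(b) < 0 until it is shorter than tol."""
    fa = f(a)
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fm == 0:
            return m                     # landed exactly on a root
        if fa * fm < 0:
            b = m                        # root lies in [a, m]
        else:
            a, fa = m, fm                # root lies in [m, b]
    return (a + b) / 2

print(bisection(lambda x: x * x - 2, 1.0, 2.0))    # ~1.41421356...
```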
• Secant Method: Replace f'(x_k) in Newton's method by the approximation
\[ \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}} \]
We can show that if f'(ξ) ≠ 0 then this method is at least first order with C = 2/3.