Exam TANA09 2022 January
Matematiska institutionen
Beräkningsmatematik/Fredrik Berntsson
Allowed:
1. Pocket calculator
Good luck!
(5p) 1: a) Let a = 0.008755661 be an exact value. Round the value a to 5 correct decimals
to obtain an approximate value ā. Also give a bound for the relative error in
ā.
b) We want to store the number x = 117.2277634 on a computer using the floating
point system (10, 5, −10, 10). What approximate number x̄ would actually be
stored on the machine?
c) Explain why the formula $y = \sqrt{1 + x} - 1$ can give poor accuracy when evaluated,
for small x, on a computer. Also propose an alternative formula that can
be expected to work better.
d) Let $y = e^{-2x}$, where x = 0.95 ± 0.02. Compute the approximate value ȳ and
give an error bound.
(3p) 4: A quadratic Bézier curve is given by the expression
$$p(t) = (1 - t)^2 P_1 + 2t(1 - t) P_2 + t^2 P_3, \quad 0 \le t \le 1,$$
where P1 , P2 and P3 are control points. Suppose we want to combine two quadratic
Bézier curves into one single curve. For this purpose we choose five control points as
follows
[Figure: the five control points P1, P2, P3, P4 and P5.]
The point P3 is common for both curves. We have chosen $P_2 = (2, 6)^T$, $P_3 = (3, 5)^T$
and $P_5 = (6, 1)^T$. Find coordinates for the point P4 such that the tangent direction
of the combined curve is continuous at the point P3 and that the tangent is vertical
at the endpoint P5 . Motivate your choice for P4 carefully.
c) Explain what is meant by a matrix norm being induced from a vector norm.
Also show that if A and B are matrices then, for an induced norm,
$\|AB\| \le \|A\| \|B\|$.
(4p) 6: Suppose $A \in \mathbb{R}^{m \times n}$, m > n. The least squares method can be used to minimize
$\|Ax - b\|_2$.
Do the following:
to the measurements by using the least squares method. Clearly show what
the matrix A and the right hand side b are for this particular case.
b) Let A be an m × n, m > n, matrix, and let A = Q1 R be the reduced QR
decomposition. Give the dimensions for Q1 and R. Also give a formula for
computing the solution to the least squares problem Ax = b using the reduced
QR decomposition. Finally estimate the amount of arithmetic work required to
compute the least squares solution (not counting the work needed to compute
the QR decomposition itself).
c) Show that if $\|\cdot\|$ is an induced norm and Q is orthogonal then $\|AQ\| = \|A\|$.
Use the table to determine C and p. Also estimate the value of h needed for the
error to be of magnitude $10^{-3}$.
(3p) 8: a) Suppose the n × n matrix A has rank k < n and that the linear system
of equations Ax = b has a solution. Use the singular value decomposition
$A = U \Sigma V^T$ to give a general formula for all solutions x of the system Ax = b.
Clearly motivate your answer.
b) Let A be an m × n matrix, m > n. Show how the singular value decomposition
$A = U \Sigma V^T$ can be used for solving the minimization problem
$$\min_{\|x\|_2 = 1} \|Ax\|_2.$$
Give both the minimizer x and the minimum in terms of singular values and
singular vectors.
Answers
(5p) 1: For a) we obtain the approximate value ā = 0.00876 which has 5 correct decimal
digits. The absolute error is at most $|\Delta a| \le 0.5 \cdot 10^{-5}$ and thus the relative error is
bounded by $|\Delta a|/|a| \le 0.5 \cdot 10^{-5}/0.00876 \le 0.58 \cdot 10^{-3}$.
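The rounding and the two error bounds in a) can be checked with exact decimal arithmetic. The following is a minimal sketch using Python's standard `decimal` module; the variable names are illustrative and not part of the exam.

```python
from decimal import Decimal, ROUND_HALF_UP

a = Decimal("0.008755661")                       # exact value from the problem
# Round to 5 decimals (quantum 0.00001).
a_bar = a.quantize(Decimal("0.00001"), rounding=ROUND_HALF_UP)

abs_bound = Decimal("0.5e-5")                    # bound for rounding to 5 decimals
abs_err = abs(a - a_bar)                         # actual absolute error
rel_bound = abs_bound / a_bar                    # bound on the relative error

print(a_bar, abs_err, rel_bound)
```

The actual absolute error is smaller than the bound, as expected, since the bound covers the worst case for any number rounded to 5 decimals.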
In b) we rewrite the number as $x = 1.172277634 \cdot 10^2$ to see that $\bar{x} = 1.17228 \cdot 10^2$
is actually stored on the computer.
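The storage step in b) can be sketched in code. This assumes that (10, 5, −10, 10) means base 10, five digits after the leading digit, and exponents between −10 and 10, which is consistent with the stored value above; round-to-nearest is also an assumption. The helper name `fl` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def fl(x: Decimal, t: int = 5, e_min: int = -10, e_max: int = 10) -> Decimal:
    """Round x to a normalised base-10 number d0.d1...dt * 10**e (sketch)."""
    e = x.adjusted()                      # exponent of the leading digit
    if not e_min <= e <= e_max:
        raise OverflowError("exponent outside the assumed range")
    quantum = Decimal(1).scaleb(e - t)    # weight of the last stored digit
    return x.quantize(quantum, rounding=ROUND_HALF_EVEN)

x = Decimal("117.2277634")
x_bar = fl(x)
print(x_bar)
```

For this x the leading digit has exponent 2, so the last stored digit has weight 10^(2−5), and the result agrees with the stored value given in the answer.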
For c) Since $\sqrt{1 + x} \approx 1$ for small x, catastrophic cancellation will occur when
$\sqrt{1 + x} - 1$ is computed, resulting in a large relative error in the result. A better
formula would be
$$\sqrt{1 + x} - 1 = \frac{(\sqrt{1 + x} - 1)(\sqrt{1 + x} + 1)}{\sqrt{1 + x} + 1} = \frac{x}{\sqrt{1 + x} + 1}.$$
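The effect of the cancellation can be observed in ordinary double precision. The sketch below compares both formulas against a two-term Taylor expansion, x/2 − x²/8, which is used here as a reference value; the test point x = 10⁻¹² and the function names are chosen for illustration only.

```python
import math

def naive(x: float) -> float:
    # Subtracts two nearly equal numbers when x is small.
    return math.sqrt(1.0 + x) - 1.0

def stable(x: float) -> float:
    # Algebraically equivalent form without the subtraction.
    return x / (math.sqrt(1.0 + x) + 1.0)

x = 1e-12
reference = x / 2 - x**2 / 8          # Taylor expansion, accurate for small x

naive_err = abs(naive(x) - reference) / reference
stable_err = abs(stable(x) - reference) / reference
print(naive_err, stable_err)
```

The naive formula loses accuracy already when 1 + x is formed, because most digits of x are rounded away; the rewritten formula never forms a small difference of large terms.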
$$\bar{p}_2(x) = p_2(x) + \Delta f_1 \frac{(x - 0.8)(x - 1.0)}{(0.6 - 0.8)(0.6 - 1.0)}.$$
The function |(x − 0.8)(x − 1.0)| has a local maximum at x = 0.9 and also a local
maximum at x = 0.6. The largest absolute value is achieved for x = 0.6, which means
that
$$|\bar{p}_2(x) - p_2(x)| \le |\Delta f_1| \left| \frac{(0.6 - 0.8)(0.6 - 1.0)}{(0.6 - 0.8)(0.6 - 1.0)} \right| = |\Delta f_1| \le 0.03.$$
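The bound can be checked numerically by sampling the factor that multiplies Δf1. The sketch below assumes that the interval of interest is [0.6, 1.0], as the discussion of the maxima suggests; the grid size is arbitrary.

```python
def basis(x: float) -> float:
    # Factor multiplying the perturbation in the expression above.
    return (x - 0.8) * (x - 1.0) / ((0.6 - 0.8) * (0.6 - 1.0))

# Sample the assumed interval [0.6, 1.0] on a uniform grid.
grid = [0.6 + 0.4 * i / 1000 for i in range(1001)]
max_basis = max(abs(basis(x)) for x in grid)

delta_f1 = 0.03                       # bound on the perturbation, from the answer
bound = delta_f1 * max_basis
print(max_basis, abs(basis(0.9)), bound)
```

The interior local maximum at x = 0.9 is much smaller than the value at the endpoint x = 0.6, so the endpoint determines the bound.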
$f(x) = e^x - 3x = a - 3x = a - b = c$
The error propagation formula gives us
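$$|\Delta f| \lesssim \left|\frac{\partial f}{\partial a}\right| |\Delta a| + \left|\frac{\partial f}{\partial b}\right| |\Delta b| + \left|\frac{\partial f}{\partial c}\right| |\Delta c| = |1||\Delta a| + |1||\Delta b| + |1||\Delta c|.$$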
µ(|a| + |b| + |c|) ≈ µ(|1| + |3x| + |1|) ≈ 2µ,
where we have used $e^x \approx 1$, $f(x) = c \approx 1$ since x is small. There is no cancellation
present in these calculations. Everything turns out fine and both the absolute and
relative errors are bounded by 2µ (since the function value f (x) ≈ 1).
(3p) 4: We note that for the tangent to be vertical at P5 the x-coordinate needs to be the
same at P4 and P5 . Thus $P_4 = (6, \alpha)^T$ for some real number α. In order to get
a continuous tangent direction at P3 we need the vector $P_3 - P_2 = (1, -1)^T$ to
be parallel with $P_4 - P_3 = (3, \alpha - 5)^T$, which only works out if α = 2. Thus
$P_4 = (6, 2)^T$.
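The choice of P4 can be verified with a few lines of code. The sketch relies on the standard fact that a quadratic Bézier curve leaves its first control point in the direction of the second and arrives at its last control point from the direction of the second; the helper names are illustrative.

```python
def sub(p, q):
    # Componentwise difference p - q of two points in the plane.
    return (p[0] - q[0], p[1] - q[1])

def cross(u, v):
    # z-component of the cross product; zero means u and v are parallel.
    return u[0] * v[1] - u[1] * v[0]

P2, P3, P4, P5 = (2, 6), (3, 5), (6, 2), (6, 1)

t_in = sub(P3, P2)     # tangent direction of the first curve at P3
t_out = sub(P4, P3)    # tangent direction of the second curve at P3
t_end = sub(P5, P4)    # tangent direction of the second curve at P5

print(cross(t_in, t_out), t_end)
```

A zero cross product together with a positive dot product shows that the two tangents at P3 point the same way, and a zero x-component in `t_end` confirms the vertical tangent at P5.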
(3p) 5: For a) we just observe that one of the multipliers (i.e. ℓ32 = 1.8) is larger than one.
Thus pivoting wasn’t used correctly.
For b) The multipliers are m3 = 0.6/3 = 0.2 and m4 = −1.8/3 = −0.6. Therefore
the Gauss transformation is
$$M = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0.2 & 1 & 0 \\ 0 & -0.6 & 0 & 1 \end{pmatrix}.$$
For c) A matrix norm is induced if its definition is based on a vector norm, i.e.
$$\|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|}.$$
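(4p) 6: For a) we note that each data point (xi , yi ) gives one row of the overdetermined
system Ax = b. The model is $y = c_1 + c_2 e^{-x} + c_3 \sin(\pi x) + c_4 x \sin(\pi x)$. Thus the
system Ax = b is
$$\begin{pmatrix} 1 & e^{-x_1} & \sin(\pi x_1) & x_1 \sin(\pi x_1) \\ 1 & e^{-x_2} & \sin(\pi x_2) & x_2 \sin(\pi x_2) \\ \vdots & \vdots & \vdots & \vdots \\ 1 & e^{-x_m} & \sin(\pi x_m) & x_m \sin(\pi x_m) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}.$$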
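Assembling the system row by row can be sketched as follows. The measurement values `xs` and `ys` below are hypothetical placeholders, since the exam's data table is not reproduced here; only the structure of A and b follows the answer.

```python
import math

def design_row(x: float) -> list[float]:
    # One row of A: the four model functions evaluated at x.
    return [1.0, math.exp(-x), math.sin(math.pi * x), x * math.sin(math.pi * x)]

# Hypothetical measurements (not from the exam).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1.2, 2.1, 0.9, -0.4, 1.1]

A = [design_row(x) for x in xs]   # m x 4 matrix, one row per data point
b = list(ys)                      # right hand side: the measured values

for row, rhs in zip(A, b):
    print(row, rhs)
```

Each unknown coefficient corresponds to one column of A, so adding a term to the model adds a column, while adding a measurement adds a row.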
For b) The dimensions are m × n for Q1 and n × n for R. The formula is
$x = R^{-1}(Q_1^T b)$, and the matrix-vector multiply $y = Q_1^T b$ requires approximately mn
multiplications and additions. Since R is triangular, computing $R^{-1} y$ by backward
substitution requires $n^2/2$ multiplications and additions. So the operation count is
$2mn + n^2 \approx 2mn$ if $m \gg n$.
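The two steps, forming y = Q1ᵀb and then solving Rx = y by backward substitution, can be sketched in plain Python. The small factors `Q1` and `R` below are a made-up example with orthonormal columns, not data from the exam, and the function name is hypothetical.

```python
def solve_ls_qr(Q1, R, b):
    """Least squares solution from a reduced QR factorisation (sketch)."""
    m, n = len(Q1), len(R)
    # y = Q1^T b: about m*n multiplications and additions.
    y = [sum(Q1[i][j] * b[i] for i in range(m)) for j in range(n)]
    # Backward substitution for R x = y: about n^2/2 multiplications and additions.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = y[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / R[i][i]
    return x

# Hypothetical 3 x 2 example: A = Q1 R = [[3, 3], [4, 4], [0, 2]].
Q1 = [[0.6, 0.0], [0.8, 0.0], [0.0, 1.0]]
R = [[5.0, 5.0], [0.0, 2.0]]
b = [1.0, 2.0, 2.0]

x = solve_ls_qr(Q1, R, b)
print(x)
```

The inverse of R is never formed explicitly; the triangular structure is what keeps the second step at roughly n² operations.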
Finally, for c) we have
$$\|AQ\| = \max_{x \ne 0} \frac{\|AQx\|}{\|x\|} = \{\text{set } y = Qx \text{ and note } \|x\| = \|Qx\| = \|y\|\} = \max_{y \ne 0} \frac{\|Ay\|}{\|y\|} = \|A\|.$$
(3p) 8: For a) we note that if rank(A) = k then $\{v_{k+1}, \dots, v_n\}$ is a basis for null(A) and
$\{v_1, \dots, v_k\}$ is a basis for its orthogonal complement $(\mathrm{null}(A))^\perp$. Thus for every x
we can write
$$x = x_1 + x_2 = \Big(\sum_{i=1}^{k} c_i v_i\Big) + \Big(\sum_{i=k+1}^{n} c_i v_i\Big).$$
In order to determine x1 we compute
$$Ax = A(x_1 + x_2) = Ax_1 + 0 = \sum_{i=1}^{k} c_i \sigma_i u_i = b = \sum_{i=1}^{n} (u_i^T b) u_i,$$
where $u_i^T b = 0$ for i = k + 1, . . . , n, since it is given that a solution exists. Thus
$$x_1 = \sum_{i=1}^{k} \frac{u_i^T b}{\sigma_i} v_i \quad \text{and} \quad x_2 = \sum_{i=k+1}^{n} c_i v_i,$$
where the coefficients $c_{k+1}, \dots, c_n$ are arbitrary.
since $\sigma_n$ is the smallest singular value, with equality if $y = e_n$, which means that
$x = V e_n = v_n$.