2021 04 26 MidtermExam Key
26 April 2020
Instructions:
Answer any four (4) of the following questions.
1. (a) (13 pts) Find the IEEE standard single-precision format representation of the decimal number -47.125.
Answer We begin by converting the integral and fractional parts to binary form: $(47)_{10} = (101111)_2$ and $(0.125)_{10} = (0.001)_2$. It follows that $(47.125)_{10} = (101111.001)_2$. Next, we normalize the number by shifting the binary point five places to the left, so that the only bit to its left is a 1, and multiplying the result by the power of 2 that compensates for the shift. This results in the representation $1.01111001000000000000000 \times 2^5$. Finally, we compute the characteristic, $c$, by setting $c - 127 = 5$; that is, $c = 132$. Thus, in the IEEE single-precision format, the number $-47.125$ is represented with $s = 1$, $c = (132)_{10} = (10000100)_2$, and $1.f = 1.01111001000000000000000$. The number is then represented as
\[ (11000010001111001000000000000000)_2 = (\mathrm{C23C8000})_{16}. \]
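The encoding can be cross-checked mechanically. The following is a minimal sketch using Python's standard `struct` module; the helper names `float_to_ieee_hex` and `split_fields` are illustrative assumptions, not part of the exam.

```python
import struct


def float_to_ieee_hex(value):
    """Return the 32-bit IEEE 754 pattern of value as 8 uppercase hex digits."""
    # ">f" packs a big-endian single-precision float (4 bytes).
    return struct.pack(">f", value).hex().upper()


def split_fields(hex_word):
    """Split a 32-bit pattern into (sign, characteristic, fraction) bit strings."""
    bits = format(int(hex_word, 16), "032b")
    return bits[0], bits[1:9], bits[9:]


word = float_to_ieee_hex(-47.125)
sign, characteristic, fraction = split_fields(word)
print(word, sign, characteristic, fraction)
```

The three bit strings correspond to $s$, $c$, and $f$ in the answer above.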
(b) (12 pts) Determine the decimal number that has the hexadecimal
IEEE format representation C1DA0000.
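Answer The hexadecimal number C1DA0000 in binary is
\[ (11000001110110100000000000000000)_2. \]
Reading off the fields gives $s = 1$, $c = (10000011)_2 = (131)_{10}$, and $1.f = (1.1011010\ldots0)_2 = 1.703125$. The exponent is $c - 127 = 4$, so the number is $-1.703125 \times 2^4 = -27.25$.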
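As a hedged cross-check of this decoding, the sketch below converts the hexadecimal word back to a decimal value in two ways: with the standard `struct` module, and by reading the sign, characteristic, and fraction fields by hand. The function names are illustrative assumptions, and the field-based version handles normalized numbers only.

```python
import struct


def ieee_hex_to_float(hex_word):
    """Interpret 8 hex digits as a big-endian IEEE 754 single-precision float."""
    return struct.unpack(">f", bytes.fromhex(hex_word))[0]


def decode_by_fields(hex_word):
    """Decode via (-1)^s * (1 + f) * 2^(c - 127); normalized numbers only."""
    bits = format(int(hex_word, 16), "032b")
    sign = -1.0 if bits[0] == "1" else 1.0
    c = int(bits[1:9], 2)              # characteristic (biased exponent)
    f = int(bits[9:], 2) / 2 ** 23     # fractional part of the mantissa
    return sign * (1.0 + f) * 2.0 ** (c - 127)


value_struct = ieee_hex_to_float("C1DA0000")
value_fields = decode_by_fields("C1DA0000")
print(value_struct, value_fields)
```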
2. Consider the equation $x^4 - 3x^2 - 3 = 0$.
(a) (15 pts) Use the Modified Newton’s iteration method to determine a solution accurate to within $10^{-3}$ for the given equation on the interval [1,2]. Use $x_0 = 1.5$.
Answer $f(x) = x^4 - 3x^2 - 3$, $f'(x) = 4x^3 - 6x$, $f''(x) = 12x^2 - 6$.
The modified Newton’s formula is
\[ x_k = x_{k-1} - \frac{f'(x_{k-1}) f(x_{k-1})}{[f'(x_{k-1})]^2 - f(x_{k-1}) f''(x_{k-1})}, \quad k = 1, 2, 3, \ldots \]
Let
\[ N_k = f'(x_{k-1}) f(x_{k-1}) \quad \text{and} \quad D_k = [f'(x_{k-1})]^2 - f(x_{k-1}) f''(x_{k-1}). \]
With $x_0 = 1.5$, we have the following table:

k   N_k              D_k              x_k            |x_k - x_{k-1}|
1   -21.0937500000   118.6875000000   1.6777251185   0.1777251185
2   -31.0701541316   175.6639319168   1.8545978196   0.1768727011
3   -21.4127068002   259.5164991845   1.9371078214   0.0825100017
4    -3.0847945310   311.4869602033   1.9470112681   0.0099034468
5    -0.0355592492   318.3894611900   1.9471229529   0.0001116848
6    -0.0000043972   318.4681694564   1.9471229667   0.0000000138
7    -0.0000000000   318.4681791881   1.9471229667   0.0000000000

The stopping criterion $|x_k - x_{k-1}| < 10^{-3}$ is first satisfied at $k = 5$, so $x \approx 1.9471$.
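The iteration in the table can be reproduced with a short script. This is a sketch under stated assumptions: the function names, the `max_iter` safeguard, and the choice of $|x_k - x_{k-1}| < 10^{-3}$ as the stopping test are illustrative and not taken from the exam.

```python
def f(x):
    return x**4 - 3 * x**2 - 3


def df(x):
    return 4 * x**3 - 6 * x


def d2f(x):
    return 12 * x**2 - 6


def modified_newton(x0, tol=1e-3, max_iter=50):
    """Iterate x_k = x_{k-1} - N_k / D_k until successive iterates differ by < tol."""
    x_prev = x0
    for k in range(1, max_iter + 1):
        n_k = df(x_prev) * f(x_prev)
        d_k = df(x_prev) ** 2 - f(x_prev) * d2f(x_prev)
        x_new = x_prev - n_k / d_k
        if abs(x_new - x_prev) < tol:
            return x_new, k
        x_prev = x_new
    raise RuntimeError("no convergence within max_iter iterations")


root, iterations = modified_newton(1.5)
print(root, iterations)
```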
(b) (10 pts) How many iterations would you need to perform in order to obtain the same degree of accuracy as in (a) if the Bisection method were used instead?
Answer We require
\[ \frac{b - a}{2^{k+1}} < 10^{-3}. \]
With $a = 1$ and $b = 2$, we have
\[ \frac{1}{2^{k+1}} < 10^{-3} \implies 2^{k+1} > 10^3, \quad \text{or} \quad k > \frac{\ln 10^3}{\ln 2} - 1 \approx 8.9657. \]
Therefore, at least 9 iterations of the Bisection method are required to obtain the same degree of accuracy as with the Modified Newton’s method.
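The iteration count follows directly from the bound $(b - a)/2^{k+1} < \text{tol}$. A minimal sketch, with the helper name `min_bisection_iterations` chosen here purely for illustration:

```python
import math


def min_bisection_iterations(a, b, tol):
    """Smallest non-negative integer k with (b - a) / 2**(k + 1) < tol."""
    bound = math.log((b - a) / tol) / math.log(2) - 1   # k must exceed this
    return max(0, math.floor(bound) + 1)


k_needed = min_bisection_iterations(1.0, 2.0, 1e-3)
print(k_needed)
```

Taking `floor(bound) + 1` rather than a ceiling keeps the inequality strict when the bound happens to be a whole number.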
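3. Let $f(x) = x - \tan x$.
(a) (5 pts) Verify that $f(x) = 0$ has a solution in the interval $[\frac{9\pi}{16}, \frac{23\pi}{16}]$.
Answer The function $f(x) = x - \tan x$ is continuous on $[\frac{9\pi}{16}, \frac{23\pi}{16}]$, because this interval lies strictly between the singularities of $\tan x$ at $\frac{\pi}{2}$ and $\frac{3\pi}{2}$. Moreover, $f(\frac{9\pi}{16}) \approx 6.7945 > 0$ and $f(\frac{23\pi}{16}) \approx -0.5113 < 0$. By the Intermediate Value Theorem, there is a number $\xi \in (\frac{9\pi}{16}, \frac{23\pi}{16})$ where $f(\xi) = 0$.
(b) (10 pts) Use the Bisection algorithm to approximate the solution of $f(x) = 0$ to within $10^{-2}$.
Answer: Start with $[a, b] = [\frac{9\pi}{16}, \frac{23\pi}{16}]$, where $f(a) > 0 > f(b)$, and at each step keep the half-interval on which $f$ changes sign. The successive midpoints are 3.1416, 3.8288, 4.1724, 4.3442, 4.4301, 4.4731, 4.4946, and 4.4838; $f$ is negative only at 4.4946. After these eight bisections the bracketing interval is $[4.4838, 4.4946]$, and its midpoint $4.4892$ lies within $\frac{7\pi/8}{2^{9}} \approx 0.0054 < 10^{-2}$ of the solution.
(c) (10 pts) How many iterations using the Bisection algorithm will guarantee an approximation to the solution of $f(x) = 0$ in the interval $[\frac{9\pi}{16}, \frac{23\pi}{16}]$ with an absolute error less than $10^{-4}$?
Answer: With $b - a = \frac{7\pi}{8}$, we require
\[ \frac{7\pi/8}{2^{k+1}} < 10^{-4} \implies k > \frac{\ln\left(\frac{7\pi}{8} \times 10^{4}\right)}{\ln 2} - 1 \approx 13.7466. \]
Therefore, at least 14 iterations are required.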
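A numerical check is useful here. Evaluating $f$ at the endpoints gives $f(9\pi/16) > 0$ and $f(23\pi/16) < 0$, so the Bisection algorithm applies on this interval. The sketch below is one possible implementation; the function names and the stopping rule (stop once half the bracket width is below the tolerance) are assumptions made for illustration.

```python
import math


def f(x):
    return x - math.tan(x)


def bisect(func, a, b, tol):
    """Halve [a, b] until its midpoint is within tol of a root.

    Returns (midpoint, number_of_halvings). Requires a sign change on [a, b].
    """
    fa = func(a)
    if fa * func(b) >= 0:
        raise ValueError("func(a) and func(b) must have opposite signs")
    halvings = 0
    while (b - a) / 2 >= tol:
        mid = (a + b) / 2
        fm = func(mid)
        halvings += 1
        if fm == 0:
            return mid, halvings
        if fa * fm < 0:
            b = mid              # sign change lies in the left half
        else:
            a, fa = mid, fm      # sign change lies in the right half
    return (a + b) / 2, halvings


lo, hi = 9 * math.pi / 16, 23 * math.pi / 16
approx, k_coarse = bisect(f, lo, hi, 1e-2)
fine, k_fine = bisect(f, lo, hi, 1e-4)
print(approx, k_coarse, fine, k_fine)
```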
(a) (15 pts) Determine a function g(x) suitable for fixed-point iteration
for the given equation and show that g(x) satisfies the conditions
of the Fixed-Point Theorem.
Answer: There are many ways to change the equation to the fixed point form $x = g(x)$ using algebraic manipulations. One way consists of the following sequence of rearrangements:
\[ 18x^2 = x^4 + 45 \implies x^2 = \frac{x^4 + 45}{18} \implies x = \pm\left(\frac{x^4 + 45}{18}\right)^{1/2}, \]
\[ x = g(x) = \left(\frac{x^4 + 45}{18}\right)^{1/2}. \]
Observe that $g(1) = \sqrt{\frac{46}{18}} \in (1, 2)$ and $g(2) = \sqrt{\frac{61}{18}} \in (1, 2)$.
Moreover,
\[ g'(x) = \frac{2x^3}{\sqrt{18}\,(x^4 + 45)^{1/2}} > 0, \quad \forall x \in [1, 2], \]
Moreover,
Equations (1) and (2) imply that the fixed point iteration
\[ x_n = g(x_{n-1}), \quad n = 1, 2, \ldots \]
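The behaviour of this $g$ can be explored numerically. The sketch below assumes the interval $[1, 2]$ (suggested by the values $g(1)$ and $g(2)$ above), a starting point $x_0 = 1.5$, and a tolerance of $10^{-10}$; none of these is stated in the surviving text, and the function names are illustrative. It samples $g'$ on a grid as a rough check that $|g'(x)| < 1$, then runs the iteration.

```python
import math


def g(x):
    return math.sqrt((x**4 + 45) / 18)


def g_prime(x):
    return 2 * x**3 / (math.sqrt(18) * math.sqrt(x**4 + 45))


def fixed_point(func, x0, tol=1e-10, max_iter=200):
    """Iterate x_n = func(x_{n-1}) until successive iterates differ by < tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = func(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")


# Rough numerical check of the contraction condition on a grid over [1, 2].
grid = [1 + i / 1000 for i in range(1001)]
max_slope = max(abs(g_prime(x)) for x in grid)

p, steps = fixed_point(g, 1.5)
print(max_slope, p, steps)
```

A grid check is only suggestive; a proof of the Fixed-Point Theorem conditions still needs an analytic bound on $g'$.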
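That is
\[
\begin{pmatrix} 6 & -1 & 2 & 2 \\ -1 & 5 & 1 & -2 \\ -2 & -3 & -10 & 4 \\ -3 & 1 & 3 & 8 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 6 & -1 & 2 & 2 \\ -1 & 5 & 1 & -2 \\ -2 & -3 & -10 & 4 \\ -3 & 1 & 3 & 8 \end{pmatrix}
\]
Compute the multipliers
\[ m_{21} = -\frac{a_{21}}{a_{11}} = \frac{1}{6}, \quad m_{31} = -\frac{a_{31}}{a_{11}} = \frac{2}{6}, \quad m_{41} = -\frac{a_{41}}{a_{11}} = \frac{3}{6} \]
and then perform the following row operations
\[ R_2 \leftarrow R_2 + m_{21} R_1, \quad R_3 \leftarrow R_3 + m_{31} R_1, \quad R_4 \leftarrow R_4 + m_{41} R_1. \]
This gives
\[
A =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ -\frac{1}{6} & 1 & 0 & 0 \\ -\frac{2}{6} & 0 & 1 & 0 \\ -\frac{3}{6} & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 6 & -1 & 2 & 2 \\ 0 & \frac{29}{6} & \frac{8}{6} & -\frac{10}{6} \\ 0 & -\frac{10}{3} & -\frac{28}{3} & \frac{14}{3} \\ 0 & \frac{1}{2} & 4 & 9 \end{pmatrix}
\]
Next, the operations
\[ R_3 \leftarrow R_3 + m_{32} R_2, \quad R_4 \leftarrow R_4 + m_{42} R_2, \]
where $m_{32} = -\frac{a_{32}}{a_{22}} = \frac{20}{29}$ and $m_{42} = -\frac{a_{42}}{a_{22}} = -\frac{3}{29}$, yield
\[
A =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ -\frac{1}{6} & 1 & 0 & 0 \\ -\frac{2}{6} & -\frac{20}{29} & 1 & 0 \\ -\frac{3}{6} & \frac{3}{29} & 0 & 1 \end{pmatrix}
\begin{pmatrix} 6 & -1 & 2 & 2 \\ 0 & \frac{29}{6} & \frac{8}{6} & -\frac{10}{6} \\ 0 & 0 & -\frac{244}{29} & \frac{102}{29} \\ 0 & 0 & \frac{112}{29} & \frac{266}{29} \end{pmatrix}
\]
Finally, the operation $R_4 \leftarrow R_4 + m_{43} R_3$, where $m_{43} = -\frac{a_{43}}{a_{33}} = \frac{28}{61}$, eliminates the remaining subdiagonal entry.
Thus,
\[ A = LU \]
where
\[
L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -\frac{1}{6} & 1 & 0 & 0 \\ -\frac{2}{6} & -\frac{20}{29} & 1 & 0 \\ -\frac{3}{6} & \frac{3}{29} & -\frac{28}{61} & 1 \end{pmatrix},
\quad
U = \begin{pmatrix} 6 & -1 & 2 & 2 \\ 0 & \frac{29}{6} & \frac{8}{6} & -\frac{10}{6} \\ 0 & 0 & -\frac{244}{29} & \frac{102}{29} \\ 0 & 0 & 0 & \frac{658}{61} \end{pmatrix}
\]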
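The factorization can be verified in exact arithmetic. The following sketch uses Python's `fractions.Fraction` and the same multiplier convention $m_{ij} = -a_{ij}/a_{jj}$ as the answer; the function names are illustrative assumptions, and no pivoting is performed, so a zero pivot would raise `ZeroDivisionError`.

```python
from fractions import Fraction


def lu_no_pivoting(a):
    """Return (L, U) with unit-diagonal L such that L * U equals a."""
    n = len(a)
    u = [[Fraction(v) for v in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = -u[i][j] / u[j][j]      # multiplier m_ij
            l[i][j] = -m                # L stores the negated multiplier
            u[i] = [u_ik + m * u_jk for u_ik, u_jk in zip(u[i], u[j])]
    return l, u


def matmul(p, q):
    """Plain triple-loop matrix product for small matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


A = [[6, -1, 2, 2],
     [-1, 5, 1, -2],
     [-2, -3, -10, 4],
     [-3, 1, 3, 8]]
L, U = lu_no_pivoting(A)
for row in L:
    print([str(v) for v in row])
for row in U:
    print([str(v) for v in row])
```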
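(b) (12 pts) Use the result obtained in (a) to solve the system $Ax = b$ where $b = (21, 26, -7, -6)^T$.
Answer Using the result from part (a), we have
\[ Ax = LUx = b. \]
This gives
\[ Ly = b \quad \text{and} \quad Ux = y, \]
where $y = (y_1, y_2, y_3, y_4)^T$. Solving $Ly = b$; that is,
\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ -\frac{1}{6} & 1 & 0 & 0 \\ -\frac{2}{6} & -\frac{20}{29} & 1 & 0 \\ -\frac{3}{6} & \frac{3}{29} & -\frac{28}{61} & 1 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}
=
\begin{pmatrix} 21 \\ 26 \\ -7 \\ -6 \end{pmatrix},
\]
using forward substitution yields
\[ y = \begin{pmatrix} 21 \\ \frac{59}{2} \\ \frac{590}{29} \\ \frac{658}{61} \end{pmatrix}. \]
Finally, we solve
\[ Ux = y \]
using back substitution to get
\[ x = \begin{pmatrix} 5 \\ 7 \\ -2 \\ 1 \end{pmatrix}. \]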
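The two triangular solves can be checked in exact arithmetic. In this sketch the entries of $L$, $U$, and $b$ are copied from the answer, while the function names are illustrative assumptions.

```python
from fractions import Fraction

F = Fraction
L = [[F(1), F(0), F(0), F(0)],
     [F(-1, 6), F(1), F(0), F(0)],
     [F(-2, 6), F(-20, 29), F(1), F(0)],
     [F(-3, 6), F(3, 29), F(-28, 61), F(1)]]
U = [[F(6), F(-1), F(2), F(2)],
     [F(0), F(29, 6), F(8, 6), F(-10, 6)],
     [F(0), F(0), F(-244, 29), F(102, 29)],
     [F(0), F(0), F(0), F(658, 61)]]
b = [F(21), F(26), F(-7), F(-6)]


def forward_substitution(lower, rhs):
    """Solve lower * y = rhs for a lower-triangular matrix, top row first."""
    y = []
    for i, row in enumerate(lower):
        known = sum(row[j] * y[j] for j in range(i))
        y.append((rhs[i] - known) / row[i])
    return y


def back_substitution(upper, rhs):
    """Solve upper * x = rhs for an upper-triangular matrix, bottom row first."""
    n = len(rhs)
    x = [F(0)] * n
    for i in reversed(range(n)):
        known = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rhs[i] - known) / upper[i][i]
    return x


y = forward_substitution(L, b)
x = back_substitution(U, y)
print([str(v) for v in y], [str(v) for v in x])
```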
(a) (10 pts) Show that the coefficient matrix of the given linear system
is positive definite.
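Answer The coefficient matrix is
\[ A = \begin{pmatrix} 6 & -2 & 3 \\ -2 & 8 & 1 \\ 3 & 1 & 7 \end{pmatrix}. \]
Let $x = (x_1, x_2, x_3)^T \in \mathbb{R}^3$. Then
\[
x^T A x = (x_1, x_2, x_3) \begin{pmatrix} 6 & -2 & 3 \\ -2 & 8 & 1 \\ 3 & 1 & 7 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= 6x_1^2 - 4x_1 x_2 + 6x_1 x_3 + 8x_2^2 + 2x_2 x_3 + 7x_3^2
\]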
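As an independent check, and using a different technique from the quadratic-form argument above, Sylvester's criterion says that a symmetric matrix is positive definite exactly when all of its leading principal minors are positive. The sketch below computes those minors with integer arithmetic; the helper `det` is an illustrative assumption suited only to small matrices.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


A = [[6, -2, 3],
     [-2, 8, 1],
     [3, 1, 7]]

# Sylvester's criterion assumes symmetry, so verify it first.
is_symmetric = all(A[i][j] == A[j][i] for i in range(3) for j in range(3))

# Leading principal minors: determinants of the top-left k x k blocks.
leading_minors = [det([row[:k] for row in A[:k]]) for k in range(1, 4)]
print(is_symmetric, leading_minors)
```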
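(c) (5 pts) Use the result obtained in (b) to solve the linear system.
Answer Using the result from part (b), we have
\[ Ax = LL^T x = b. \]
This gives
\[ Ly = b \quad \text{and} \quad L^T x = y, \]
where $y = (y_1, y_2, y_3)^T$. We first solve $Ly = b$; that is,
\[
\begin{pmatrix} \sqrt{6} & 0 & 0 \\ -\frac{2}{\sqrt{6}} & \sqrt{\frac{44}{6}} & 0 \\ \frac{3}{\sqrt{6}} & 2\sqrt{\frac{6}{44}} & \sqrt{\frac{109}{22}} \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
=
\begin{pmatrix} 11 \\ -9 \\ 9 \end{pmatrix},
\]
by forward substitution. Finally, we solve
\[ L^T x = y \]
using back substitution to get
\[ x = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}. \]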
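The Cholesky solve can be reproduced in floating point. In this sketch the matrix $A$ and right-hand side $b = (11, -9, 9)^T$ are taken from the answer; the function names are illustrative assumptions, and the code assumes $A$ is symmetric positive definite (otherwise `math.sqrt` fails on a negative argument).

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_with_cholesky(l, b):
    """Solve L y = b by forward substitution, then L^T x = y by back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        # Row i of L^T is column i of L, hence l[k][i].
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


A = [[6.0, -2.0, 3.0],
     [-2.0, 8.0, 1.0],
     [3.0, 1.0, 7.0]]
b = [11.0, -9.0, 9.0]
L = cholesky(A)
x = solve_with_cholesky(L, b)
print(L, x)
```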