Optimization, Autumn 2023
Prof. Dr. Lin Himmelmann

Exercise 9
Task 1
Consider the function f : ℝ² → ℝ defined by f(x, y) = (x² − 2xy + x)². We would like to determine a point at which the function f takes on its minimum value. Starting from x0 = (x0, y0) = (2, 2), calculate the next iteration point x1 = (x1, y1) according to each of the following methods:
a) the gradient method with step-size control by successive halving;
b) the gradient method with step-size control by parabola fitting;
c) Newton's method;
d) Broyden's method.
Task 2
Calculate the first two terms we get by applying Aitken's acceleration method to the zero-convergent sequence starting with 100, 10, 2, 1/2, . . .
Task 3
The function f : ℝ² → ℝ defined by

f(x, y) = x⁴ + y⁴

clearly attains its minimum at (0, 0). In this task we investigate the behaviour of Newton's method when determining this minimum, starting from an arbitrary point (x0, y0) ≠ (0, 0).
a) Determine the first iteration point (x1, y1) as a function of the starting point (x0, y0).
b) Determine a general formula for the n-th iteration point (xn, yn) as a function of (x0, y0).
c) What can be said about the speed of convergence of Newton's method in this example?
d) Apply Aitken's acceleration method to the sequence found in b). (Use the starting point (x0, y0) = (1, 1) for simplicity.) What do you find? Can you explain your result?
Task 4
For f : ℝ² → ℝ defined by

f(x, y) = x⁴ + y⁴

as in Task 3, perform two steps of Broyden's method starting from (x0, y0) = (1, 1). Determine the matrix (A1)⁻¹ used as an approximation for the inverse of the Hessian matrix in the second iteration step, and compare it with the exact inverse of the Hessian matrix at that point.
Task 5∗
Study the implementation variants of Newton's and Broyden's method available from the homepage (exact derivatives and approximate derivatives), and try them out on the Bazarra-Shetty function and other examples.
Task 6∗
Study the implementation of Aitken’s acceleration method available from the homepage, and
use it to improve the convergence of the programs from Task 5 and from Task 4 of Exercise 8.
Solutions to Exercise 9
Solution to Task 1
As a preliminary, we calculate the partial derivatives of f(x, y) = (x² − 2xy + x)²,

\[
\nabla f(x, y) = \begin{pmatrix} 2\,(x^2 - 2xy + x)(2x - 2y + 1) \\ -4x\,(x^2 - 2xy + x) \end{pmatrix},
\]

and evaluate f and ∇f at the starting point:

\[
f(2, 2) = 4, \qquad \nabla f(2, 2) = \begin{pmatrix} -4 \\ 16 \end{pmatrix}.
\]
a) With the gradient method the next iterate is x1 = x0 − β ∇f(x0, y0) = (2 + 4β, 2 − 16β) for a step size β > 0, and successive halving yields β = 1/32 (see the table in b) below). The next iteration point therefore is x1 = (x1, y1) = (2.125, 1.5) (when only using successive halving).
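As a quick numerical illustration, here is a minimal Python sketch of this gradient step. The function names f and grad_f are my own, and the acceptance rule for the successive halving (start with β = 1 and halve until the objective value decreases) is an assumption about the exact variant used in the lecture, although it reproduces β = 1/32 and the point (2.125, 1.5).

```python
import numpy as np

def f(p):
    x, y = p
    return (x**2 - 2*x*y + x)**2

def grad_f(p):
    x, y = p
    u = x**2 - 2*x*y + x
    return np.array([2*u*(2*x - 2*y + 1), -4*x*u])

x0 = np.array([2.0, 2.0])
g = grad_f(x0)                      # (-4, 16)

# Successive halving: start with beta = 1 and halve until the
# function value improves (assumed acceptance rule).
beta = 1.0
while f(x0 - beta * g) >= f(x0):
    beta /= 2

x1 = x0 - beta * g
print(beta, x1)                     # expected: 0.03125 and [2.125 1.5]
```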
b) Using the results from a), we first fit a parabola P(t) = at² + bt + c through the following three sample points, where P(t) is required to agree with f((2, 2) − t · (−4, 16)). (Recall from a) that after the successive halving phase we have β = 1/32.)

    t = 0:     P(0)    = f(2, 2)       = 4
    t = 1/32:  P(1/32) = f(2.125, 1.5) = 0.07056
    t = 1/16:  P(1/16) = f(2.25, 1)    = 7.910156
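As a sketch of how the parabola fit can be completed, the following Python snippet interpolates the three sample points with np.polyfit and takes the minimizer of the parabola as the step size; the numbers it prints are my own computation and are not quoted from the solution above.

```python
import numpy as np

def f(p):
    x, y = p
    return (x**2 - 2*x*y + x)**2

x0 = np.array([2.0, 2.0])
g0 = np.array([-4.0, 16.0])                 # gradient of f at (2, 2)

# Sample points (t, f(x0 - t*g0)) from the successive-halving phase
ts = np.array([0.0, 1/32, 1/16])
vals = np.array([f(x0 - t * g0) for t in ts])

# Fit P(t) = a t^2 + b t + c through the three points
a, b, c = np.polyfit(ts, vals, 2)

# Minimizer of the parabola (a > 0 here) and the corresponding iterate
t_star = -b / (2 * a)
x1 = x0 - t_star * g0
print(t_star, x1)
```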
c) We calculate the gradient and the Hessian matrix at the iteration point (x0, y0) = (2, 2):

\[
\nabla f(x_0, y_0) = \begin{pmatrix} -4 \\ 16 \end{pmatrix},
\qquad
Hf(x_0, y_0) = \begin{pmatrix} 2(4-4+1)^2 + 4(4-8+2) & -48 - 16 + 64 \\ -48 - 16 + 64 & 32 \end{pmatrix}
= \begin{pmatrix} -6 & 0 \\ 0 & 32 \end{pmatrix}.
\]
We obtain

\[
\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}
= \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} - Hf(x_0, y_0)^{-1}\, \nabla f(x_0, y_0)
= \begin{pmatrix} 2 \\ 2 \end{pmatrix} - \begin{pmatrix} -6 & 0 \\ 0 & 32 \end{pmatrix}^{-1} \begin{pmatrix} -4 \\ 16 \end{pmatrix}
\]
\[
= \begin{pmatrix} 2 \\ 2 \end{pmatrix} - \begin{pmatrix} -\tfrac{1}{6} & 0 \\ 0 & \tfrac{1}{32} \end{pmatrix} \begin{pmatrix} -4 \\ 16 \end{pmatrix}
= \begin{pmatrix} 2 \\ 2 \end{pmatrix} - \begin{pmatrix} \tfrac{-4}{-6} \\[2pt] \tfrac{16}{32} \end{pmatrix}
= \begin{pmatrix} \tfrac{4}{3} \\[2pt] \tfrac{3}{2} \end{pmatrix}.
\]
Therefore the iteration point x1 in Newton’s method is given by x1 = (x1 , y1 ) = (1.3̄, 1.5).
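For readers who want to check the arithmetic, here is a minimal Python sketch of this single Newton step; the gradient and Hessian formulas are written out for this particular f, and the function names are my own.

```python
import numpy as np

def grad_f(p):
    x, y = p
    u = x**2 - 2*x*y + x
    return np.array([2*u*(2*x - 2*y + 1), -4*x*u])

def hess_f(p):
    x, y = p
    u = x**2 - 2*x*y + x
    return np.array([
        [2*(2*x - 2*y + 1)**2 + 4*u, -12*x**2 + 16*x*y - 8*x],
        [-12*x**2 + 16*x*y - 8*x,    8*x**2],
    ])

x0 = np.array([2.0, 2.0])
x1 = x0 - np.linalg.solve(hess_f(x0), grad_f(x0))
print(x1)    # expected: [1.333... 1.5]
```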
d) The first iteration step (and hence also the iteration point x1) in Broyden's method is the same as in Newton's method. We therefore use the results from c):

\[
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \end{pmatrix},
\qquad
\nabla f(x_0, y_0) = \begin{pmatrix} -4 \\ 16 \end{pmatrix},
\qquad
(A_0)^{-1} = Hf(x_0, y_0)^{-1} = \begin{pmatrix} -\tfrac{1}{6} & 0 \\ 0 & \tfrac{1}{32} \end{pmatrix},
\qquad
\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} 1.\overline{3} \\ 1.5 \end{pmatrix}.
\]
We now calculate the gradient and an approximation (A1)⁻¹ for the inverse of the Hessian matrix at the iteration point x1 = (x1, y1) = (1.3̄, 1.5):

\[
\nabla f(x_1, y_1) = \begin{pmatrix} -1.185185 \\ 4.740741 \end{pmatrix},
\qquad
d^1 = x_1 - x_0 = \begin{pmatrix} -\tfrac{2}{3} \\[2pt] -\tfrac{1}{2} \end{pmatrix},
\qquad
g^1 = \nabla f(x_1, y_1) - \nabla f(x_0, y_0) = \begin{pmatrix} 2.814815 \\ -11.25926 \end{pmatrix},
\]
\[
(A_1)^{-1} = (A_0)^{-1} - \frac{\bigl((A_0)^{-1} g^1 - d^1\bigr)\,(d^1)^{T} (A_0)^{-1}}{(d^1)^{T} (A_0)^{-1} g^1}
= \begin{pmatrix} -0.211579 & 0.006316 \\ -0.033684 & 0.035987 \end{pmatrix}.
\]
We obtain

\[
\begin{pmatrix} x_2 \\ y_2 \end{pmatrix}
= \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} - (A_1)^{-1}\, \nabla f(x_1, y_1)
= \begin{pmatrix} \tfrac{4}{3} \\[2pt] \tfrac{3}{2} \end{pmatrix}
  - \begin{pmatrix} -0.211579 & 0.006316 \\ -0.033684 & 0.035987 \end{pmatrix}
    \begin{pmatrix} -1.185185 \\ 4.740741 \end{pmatrix}
= \begin{pmatrix} 1.053 \\ 1.289 \end{pmatrix},
\]
i.e. the next iteration point in Broyden’s method is x2 = (x2 , y2 ) = (1.053, 1.289).
For comparison: the exact inverse Hessian at (x1, y1) = (1.3̄, 1.5) is

\[
\bigl(Hf(x_1, y_1)\bigr)^{-1} = \begin{pmatrix} -0.375 & 0 \\ 0 & 0.0703125 \end{pmatrix},
\]

so the approximation by (A1)⁻¹ is not very good here.
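The two Broyden steps above can be reproduced with the following minimal Python sketch, which uses the same rank-one inverse-update formula as in the calculation; it is a hypothetical reimplementation for checking the numbers, not the code from the homepage.

```python
import numpy as np

def grad_f(p):
    x, y = p
    u = x**2 - 2*x*y + x
    return np.array([2*u*(2*x - 2*y + 1), -4*x*u])

x0 = np.array([2.0, 2.0])
A0_inv = np.array([[-1/6, 0.0], [0.0, 1/32]])   # = Hf(x0, y0)^{-1}
x1 = x0 - A0_inv @ grad_f(x0)                   # first step = Newton step

d1 = x1 - x0
g1 = grad_f(x1) - grad_f(x0)

# Rank-one update of the inverse approximation (formula used above)
num = np.outer(A0_inv @ g1 - d1, d1 @ A0_inv)
A1_inv = A0_inv - num / (d1 @ A0_inv @ g1)

x2 = x1 - A1_inv @ grad_f(x1)
print(A1_inv)   # approx. [[-0.2116, 0.0063], [-0.0337, 0.0360]]
print(x2)       # approx. [1.053, 1.289]
```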
Solution to Task 2
Applying Aitken's Δ² acceleration to the first terms of the sequence gives
a2 = 2 − (2 − 10)²/(2 − 2·10 + 100) = 2 − 64/82 = 1.2195122 and
a3 = 1/2 − (1/2 − 2)²/(1/2 − 2·2 + 10) = 1/2 − 2.25/6.5 = 0.1538462.
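A small Python sketch of Aitken's Δ² transform that reproduces these two values; the indexing is chosen so that each accelerated term keeps the index of the newest original term used, as above.

```python
def aitken(seq):
    """Aitken's delta-squared acceleration of a sequence (list of floats)."""
    out = []
    for i in range(2, len(seq)):
        d1 = seq[i] - seq[i - 1]
        d2 = seq[i] - 2 * seq[i - 1] + seq[i - 2]
        out.append(seq[i] - d1 * d1 / d2)
    return out

print(aitken([100, 10, 2, 0.5]))   # [1.2195..., 0.1538...]
```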
Solution to Task 3
a)
\[
\nabla f(x, y) = \begin{pmatrix} 4x^3 \\ 4y^3 \end{pmatrix},
\qquad
Hf(x, y) = \begin{pmatrix} 12x^2 & 0 \\ 0 & 12y^2 \end{pmatrix},
\]
\[
\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}
= \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}
  - \begin{pmatrix} 12x_0^2 & 0 \\ 0 & 12y_0^2 \end{pmatrix}^{-1}
    \begin{pmatrix} 4x_0^3 \\ 4y_0^3 \end{pmatrix}
= \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} - \begin{pmatrix} x_0/3 \\ y_0/3 \end{pmatrix}
= \frac{2}{3} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}
\]
b)
\[
\begin{pmatrix} x_n \\ y_n \end{pmatrix}
= \left(\frac{2}{3}\right)^{\!n} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}
\]
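A minimal Python sketch of this Newton iteration for f(x, y) = x⁴ + y⁴, illustrating that every step simply multiplies the current point by 2/3 (the function names are my own):

```python
import numpy as np

def newton_step(p):
    x, y = p
    grad = np.array([4*x**3, 4*y**3])
    hess = np.array([[12*x**2, 0.0], [0.0, 12*y**2]])
    return p - np.linalg.solve(hess, grad)

p = np.array([1.0, 1.0])        # arbitrary start (x0, y0) != (0, 0)
for _ in range(5):
    p_next = newton_step(p)
    print(p_next, p_next / p)   # ratio stays at 2/3 in each component
    p = p_next
```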
c) The convergence speed is only linear, despite the fact that Newton's method "usually" converges quadratically. The general statements about convergence speed discussed in the lecture do not apply in this example, because at (0, 0) not only the gradient (= first derivative) but also the Hessian matrix (= second derivative) vanishes. Note that the same phenomenon occurs when Newton's method is used to find a zero of a (univariate) polynomial that is a multiple root, e.g. the zero x = 2 of f1(x) = (x − 2)² (which is also its unique minimum) or of f2(x) = (x − 2)⁵(x² + x + 1).
d) We only consider the first component xi; the second component yi behaves identically. With (x0, y0) = (1, 1) we have xi = (2/3)^i, hence

\[
\Delta x_i = \left(\tfrac{2}{3}\right)^{i} - \left(\tfrac{2}{3}\right)^{i-1}
           = -\tfrac{1}{2}\left(\tfrac{2}{3}\right)^{i}
\]

and therefore

\[
\Delta^2 x_i = -\tfrac{1}{2}\left(\tfrac{2}{3}\right)^{i}
              - \left(-\tfrac{1}{2}\left(\tfrac{2}{3}\right)^{i-1}\right)
            = \tfrac{1}{4}\left(\tfrac{2}{3}\right)^{i}
\]

for any i. As the sequence xi converges exactly "linearly" towards 0 (it is a geometric sequence), Aitken's convergence acceleration works perfectly here and correctly "predicts" 0, the limit of the sequence, already after the first step:

\[
x_i - \frac{(\Delta x_i)^2}{\Delta^2 x_i}
= \left(\tfrac{2}{3}\right)^{i}
  - \frac{\tfrac{1}{4}\left(\tfrac{2}{3}\right)^{2i}}{\tfrac{1}{4}\left(\tfrac{2}{3}\right)^{i}}
= 0.
\]
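A short Python check, using exact rational arithmetic, that Aitken's transform maps every term of the geometric sequence xi = (2/3)^i to the limit 0:

```python
from fractions import Fraction

x = [Fraction(2, 3)**i for i in range(5)]      # Newton iterates for x0 = 1

for i in range(2, len(x)):
    d1 = x[i] - x[i - 1]
    d2 = x[i] - 2 * x[i - 1] + x[i - 2]
    print(x[i] - d1**2 / d2)                   # exactly 0 every time
```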
Solution to Task 4
According to Task 9.3 we have (x1, y1) = (2/3, 2/3), and the Hessian matrix at this point is

\[
Hf\!\left(\tfrac{2}{3}, \tfrac{2}{3}\right)
= \begin{pmatrix} 12 \cdot (\tfrac{2}{3})^2 & 0 \\ 0 & 12 \cdot (\tfrac{2}{3})^2 \end{pmatrix}
= \begin{pmatrix} \tfrac{16}{3} & 0 \\ 0 & \tfrac{16}{3} \end{pmatrix}.
\]

The matrix (A1)⁻¹ we need to compute should therefore be understood as an approximation for

\[
\begin{pmatrix} \tfrac{3}{16} & 0 \\ 0 & \tfrac{3}{16} \end{pmatrix}.
\]
Further we have

\[
(A_0)^{-1} = \bigl(Hf(x_0, y_0)\bigr)^{-1}
= \begin{pmatrix} \tfrac{1}{12} & 0 \\ 0 & \tfrac{1}{12} \end{pmatrix},
\qquad
d^1 = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} - \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}
= \begin{pmatrix} 2/3 \\ 2/3 \end{pmatrix} - \begin{pmatrix} 1 \\ 1 \end{pmatrix}
= \begin{pmatrix} -1/3 \\ -1/3 \end{pmatrix},
\]

and with

\[
\nabla f(x_1, y_1) = \nabla f\!\left(\tfrac{2}{3}, \tfrac{2}{3}\right)
= \begin{pmatrix} 4 \cdot (\tfrac{2}{3})^3 \\ 4 \cdot (\tfrac{2}{3})^3 \end{pmatrix}
= \begin{pmatrix} \tfrac{32}{27} \\[2pt] \tfrac{32}{27} \end{pmatrix}
\]

we get

\[
g^1 = \nabla f(x_1, y_1) - \nabla f(x_0, y_0)
= \begin{pmatrix} \tfrac{32}{27} \\[2pt] \tfrac{32}{27} \end{pmatrix} - \begin{pmatrix} 4 \\ 4 \end{pmatrix}
= \begin{pmatrix} -\tfrac{76}{27} \\[2pt] -\tfrac{76}{27} \end{pmatrix}.
\]
Our approximation for the inverted Hessian matrix therefore is