Unconstrained Minimization in R: Newton Methods
Unconstrained Minimization in R: Newton Methods
Unconstrained Minimization in Rn :
Newton Methods
1. Motivation
Again the problem here is
min f (x) (1)
x∈Rn
The basic idea is to take a quadratic approximation around the current point xi and minimize
this approximation to get xi+1 . So we have:
1
minn f (xi ) + ∇f (xi )(x − xi ) + (x − xi )∇2 f (xi )(x − xi ).
x∈R 2
By setting the gradient of this approximation to zero, we get:
2. Local Convergence of Newton’s Method and its Superlinear and Quadratic Rates
of Convergence
and
kxk+1 − x̄k
lim =0 (Superlinear quotient rate)
k→∞ kxk − x̄k
(b) Moreover, if
k∇g(x) − ∇g(x̄)k <
= αkx − x̄k
for all x in some neighborhood N (x̄) of x̄, that is g ∈ LCα1 (N (x̄)), then there is a positive
constant c s.t.:
2
kxk+1 − x̄k <
= ckxk − x̄k (Quadratic quotient rate)
1
Proof.
(a) We first show that ∇f (x) is nonsingular for all x in a neighborhood of x̄. Set β =
k∇g(x̄)−1 k and let 0 < ε < (2β)−1 . By the continuity
of ∇g at x̄ there is a δ > 0
s.t. k∇g(x) − ∇g(x̄)k = ε whenever x ∈ S: = {x kx − x̄k <
<
= δ}. Hence by the Banach
By B.6.2:
∇h(x̄) = I − ∇g(x̄)−1 ∇g(x̄) = 0.
Hence by the Ostrowski Point of Attraction Theorem B.6.1, x̄ is a point of attraction.
Moreover since h is differentiable at x̄, that is:
kxk+1 − x̄k
lim = 0.
k→∞ kxk − x̄k
and
provided that for a given x0 , k0 , is chosen so that xk lies in the neighborhood of all
k>= k0 , and by part (a) such a k0 is guaranteed to exist.
2
3. Local convergence of Newton’s method, global convergence of damped Newton’s
method and their quadratic rate of convergence
1
(iii) k∇2 f (x)−1 k <
= , ∀x ∈ B(x̄) for some m > 0 and B 2m (x̄) ⊂ B(x̄)
m R
2m
(iv) kx0 − x̄k <
R
Then
R
(a): (i), (ii) and (iii) =⇒ kxi+1 − x̄k <
= kxi − x̄k2
2m
(b): (i), (ii), (iii) and (iv)=⇒ ∀i : xi ∈ B 2m (x̄) & xi → x̄
R
Note: B 2m (x̄) ⊂ B(x̄) can be induced by taking R larger, but then x0 must be closer to x̄.
R
Proof.
(a)
xi+1 − x̄ = xi − x̄ − ∇2 fi−1 ∇fi − ∇f (x̄) By (i)
h it=1
= xi − x̄ − ∇2 fi−1 ∇f x̄ + t(xi − x̄)
t=0
Z 1
= xi − x̄ − ∇2 fi−1 ∇2 f x̄ + t(xi − x̄) − ∇2 fi + ∇2 fi (xi − x̄)dt
0
(By continuity of ∇2 f from (ii))
Hence:
Z 1
kxi+1 − x̄k <
= k∇2 fi−1 k k∇2 f x̄ + t(xi − x̄) − ∇2 fi k · kxi − x̄kdt
0
(By Lemma B.2.2)
1 1
Z
<
= R(1 − t)kxi − x̄kdtkxi − x̄k (By (ii), (iii))
m 0
R
= kxi − x̄k2
2m
3
(b) By induction:
! !2i
R < R
kxi − x̄k = kx0 − x̄k i = 1, 2, . . .
2m 2m
2 i
(Induction proof: ai <
= a0
2 i
= a0 . Then:
The inequality holds for i = 1. Suppose it holds for i, that is ai <
i
2 i+1
2< 2
ai+1 <
= (ai ) = ao = a2i ,
where the first inequality follows from part (a). Thus the induction is extended to i + 1.)
Hence !2i
2m R 2m
kxi − x̄k <
= kx0 − x̄k < ,
R 2m R
and lim kxi − x̄k = 0.
i→∞
for some forcing function σ2 . (Hence all stepsizes of SCAT1 can be used. 1 )
Assumption
(v) f ∈ C 2 and ∃M >
= m > 0 s.t. ∀x, y ∈ RN :
Then
4
Proof.
1
(a) −∇fi pi = ∇fi ∇2 fi−1 ∇fi = k∇fi k2 =: σ(k∇fi k)
M 2
= f (xi ) the bounded set S = {xf (x) = f (x0 )} ⊂{x kx−x0 k =
Because f (xi+1 ) < < < k∇f(x0 )k}
m
contains {xi }. But:
1 1
kpi k2 = ∇fi ∇2 fi−2 ∇fi <
= 2
k∇fi k2 <
= max ||∇f (xi )k2 .
M m2 x∈S
So {xi , pi } is bounded and hence must have an accumulation point. But for each such
accumulation point (x̄, p̄), x̄ is the unique solution of minn f (x). So {xi } → x̄.
x∈R
(b) We need the following lemma:
3 Lemma Let f : Rn → R, let M >
= m>0 s.t. ∀x, y ∈ Rn
Then * +
M 1
2
f (y) <
= f (x), ∇f (x̄ = 0 =⇒ ky − x̄k <
= kx − x̄k
m
Proof of Lemma
1
f (y) = f (x̄) + ∇f (x̄)(y − x̄) + (y − x̄)∇2 f (s)(y − x̄)
2
1
f (x) = f (x̄) + ∇f (x̄)(x − x̄) + (x − x̄)∇2 f (t)(x − x̄),
2