5. N-D OPTIMIZATION: BASIC QUESTIONS
While some stopping criterion is not met by $X_k$        ← Q1. Stopping conditions?
    find $X_{k+1}$ : $f(X_{k+1}) < f(X_k)$               ← Q2. How to find $X_{k+1}$?
Endwhile
Q3. Does the algorithm converge over iterations?
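A minimal Python sketch of this loop, assuming the caller supplies the objective f, a rule propose_next for producing the next iterate, and a stopping test stop (all three names are illustrative):

def minimize(f, x0, propose_next, stop, max_iter=1000):
    # Generic descent framework from the slide: improve X_k until a
    # stopping criterion is met (Q1), one descent step at a time (Q2).
    x = x0
    for _ in range(max_iter):        # iteration cap, relevant to Q3
        if stop(f, x):               # Q1: stopping criterion met by X_k?
            return x
        x = propose_next(f, x)       # Q2: X_{k+1} with f(X_{k+1}) < f(X_k)
    return x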
FRAMEWORK FOR UNCONSTRAINED N-D OPTIMIZATION
At a candidate point $X$:
∙ Second-order Necessary Condition (SONC): the Hessian matrix $H$ is positive semi-definite (p.s.d.)
∙ Second-order Sufficiency Condition (SOSC): the Hessian matrix $H$ is positive definite (p.d.)
Checking these Hessian conditions is computationally expensive, so in practice we utilize the FONC ($\nabla f = 0$) together with a relative-decrease stopping criterion:
∙ $\dfrac{f(x_k) - f(x_{k+1})}{f(x_k)} \le \epsilon$
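A small sketch of this practical stopping test; the tolerance eps is illustrative, and the abs() guards the division (the slide's formula implicitly assumes $f(x_k) > 0$):

def relative_decrease_stop(f_k, f_k1, eps=1e-8):
    # Stop when the relative decrease (f(x_k) - f(x_{k+1})) / f(x_k)
    # falls below the tolerance eps.
    return (f_k - f_k1) / abs(f_k) <= eps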
FRAMEWORK FOR UNCONSTRAINED N-D OPTIMIZATION
$X_{k+1} = X_k + \alpha_k d_k$

Perspective-II: the set of all descent directions at $X$ is the set of directions that make an obtuse angle with $\nabla f(X)$:
$S_d(X) = \{ d_k : \nabla f(X)^T d_k < 0 \}$

This follows from the first-order Taylor expansion:
$f(X_k + \alpha_k d_k) = f(X_k) + \nabla f(X_k)^T \alpha_k d_k \implies \nabla f(X_k)^T d_k = \lim_{\alpha \to 0^+} \dfrac{f(X_k + \alpha d_k) - f(X_k)}{\alpha}$

All optimization methods, despite using different choices of $d_k$, honor these definitions of a descent direction.
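A short sketch of the Perspective-II test; the quadratic objective $f(x) = \|x\|^2$ used in the example is an illustrative choice:

import numpy as np

def is_descent_direction(grad_f, x, d):
    # d is a descent direction at x iff grad_f(x)^T d < 0.
    return grad_f(x) @ d < 0

grad_f = lambda x: 2 * x                 # gradient of f(x) = ||x||^2
x = np.array([1.0, 1.0])
print(is_descent_direction(grad_f, x, -grad_f(x)))  # True: obtuse angle with grad f
print(is_descent_direction(grad_f, x, grad_f(x)))   # False: uphill direction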
FRAMEWORK FOR UNCONSTRAINED N-D OPTIMIZATION
Choosing $\alpha_k$ in $X_{k+1} = X_k + \alpha_k d_k$ (assuming $d_k$ is fixed through the choice of a method): $\alpha_k$ via an Exact Method or an Inexact Method.

Exact Method: focus on the fact that, for a given $X_k$ and $d_k$, $f(X_k + \alpha_k d_k)$ is a function of the single variable $\alpha_k$, so apply the FONC for a single variable, $f'(\alpha_k) = 0$:

$f'(\alpha_k) = \dfrac{df}{d\alpha_k} = \left(\dfrac{df}{dX}\right)^T \cdot \dfrac{dX}{d\alpha_k} = \nabla f(X_k + \alpha_k d_k)^T d_k = 0 \implies \nabla f_{k+1}^T d_k = 0$

i.e., $\alpha_k$ takes you to a point at which the gradient is perpendicular to the current direction.
[Figure: along $d_k$ from $X_k$, the exact step $\alpha_k$ lands at $X_{k+1}$ where $\nabla f(X_{k+1})$ is perpendicular to $d_k$.]

At $\alpha_k = 0$ the slope is $f'(0) \equiv \nabla f_k^T d_k < 0$ (since $d_k$ is a descent direction) ⟹ this motivates Wolfe's condition.
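For a quadratic objective the exact-method condition $\nabla f_{k+1}^T d_k = 0$ can be solved in closed form; a sketch, where the quadratic $f(X) = \tfrac{1}{2} X^T A X - b^T X$ and the particular A, b, x values are illustrative assumptions:

import numpy as np

def exact_step(A, b, x, d):
    # Solve f'(alpha) = grad f(x + alpha d)^T d = 0 in closed form for
    # f(X) = 0.5 X^T A X - b^T X, whose gradient is A X - b.
    g = A @ x - b
    return -(g @ d) / (d @ A @ d)

A = np.array([[2.0, 0.0], [0.0, 10.0]])
b = np.zeros(2)
x = np.array([1.0, 1.0])
d = -(A @ x - b)                       # steepest-descent direction
x_next = x + exact_step(A, b, x, d) * d
print((A @ x_next - b) @ d)            # ~0: new gradient is perpendicular to d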



FRAMEWORK FOR UNCONSTRAINED N-D OPTIMIZATION
Inexact Method for $\alpha_k$. The function $\phi(\alpha_k) = [\nabla f_k^T d_k]\,\alpha_k + f(X_k)$ is a straight line in $\alpha_k$ (compare $y = mx + c$: slope $m = \nabla f_k^T d_k$, intercept $c = f(X_k)$).

[Figure: $f(\alpha_k)$ against the relaxed line; $\alpha_k^A$ marks the range of step lengths acceptable under Armijo.]

Armijo's condition: choose $\alpha_k$ such that $f(\alpha_k) \le \phi_A(\alpha_k)$, where $\phi_A$ relaxes the line's slope by a factor $A$:
$f(X_k + \alpha_k d_k) \le A[\nabla f_k^T d_k]\,\alpha_k + f(X_k); \quad A \in (0,1)$

Backtracking (see the sketch after this list):
∙ Start with $\alpha_k = 1$
∙ If Armijo's condition is violated, reset $\alpha_k^{new} = \alpha_k/\beta$, where $\beta$ may be set at 2

Caveat: even a very small step length may fulfil Armijo's condition; hence, a significant step length may not prevail.
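A sketch of the backtracking procedure under Armijo's condition; the defaults A=1e-4 and beta=2 follow the slide's suggestion, and the test objective is illustrative:

import numpy as np

def backtracking(f, grad_f, x, d, A=1e-4, beta=2.0, max_shrinks=50):
    # Accept the first alpha satisfying Armijo's condition:
    # f(x + alpha d) <= f(x) + A * (grad_f(x)^T d) * alpha.
    alpha = 1.0                        # start with alpha_k = 1, as on the slide
    slope = grad_f(x) @ d              # f'(0) < 0 for a descent direction
    for _ in range(max_shrinks):
        if f(x + alpha * d) <= f(x) + A * slope * alpha:
            break
        alpha /= beta                  # alpha_new = alpha / beta
    return alpha

f = lambda x: x @ x
grad_f = lambda x: 2 * x
x = np.array([2.0, -1.0])
print(backtracking(f, grad_f, x, -grad_f(x)))   # -> 0.5 for this example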

FRAMEWORK FOR UNCONSTRAINED N-D OPTIMIZATION
Wolfe's condition: choose $\alpha_k$ such that the new slope at $\alpha_k$ is larger than the original slope at $\alpha_k = 0$:
$f'(\alpha_k) \ge W f'(0); \quad W \in (0,1)$

[Figure: $f(\alpha_k)$ with the line $\phi(\alpha_k) = [\nabla f_k^T d_k]\,\alpha_k + f(X_k)$; $\alpha_k^A$, $\alpha_k^W$, and $\alpha_k^{AW}$ mark the step lengths acceptable under Armijo, under Wolfe, and under both.]

If every step length satisfies both conditions, then the optimization algorithm either terminates in a finite number of iterations or $\lim_{k\to\infty} \|\nabla f_k\| = 0$. The above claim does not depend on $X_0$; hence, convergence is independent of the initial point.
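A sketch that checks both conditions together; A=1e-4 and W=0.9 are conventional illustrative values with $0 < A < W < 1$, and the test objective is again an assumption:

import numpy as np

def satisfies_armijo_wolfe(f, grad_f, x, d, alpha, A=1e-4, W=0.9):
    # Sufficient decrease (Armijo) plus curvature (Wolfe): the slope at
    # alpha must have grown to at least W times the (negative) slope at 0.
    slope0 = grad_f(x) @ d
    armijo = f(x + alpha * d) <= f(x) + A * slope0 * alpha
    wolfe = grad_f(x + alpha * d) @ d >= W * slope0
    return armijo and wolfe

f = lambda x: x @ x
grad_f = lambda x: 2 * x
x = np.array([2.0, -1.0])
print(satisfies_armijo_wolfe(f, grad_f, x, -grad_f(x), alpha=0.5))  # True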
FRAMEWORK FOR UNCONSTRAINED N-D OPTIMIZATION
Speed of Convergence (order and rate of convergence)
Consider that an optimization algorithm generates a sequence of points $X_k$.

The sequence $X_k$ converges to $X^*$ with order $p$ if: $\lim_{k\to\infty} \dfrac{|X_{k+1} - X^*|}{|X_k - X^*|^p} = r$ (a finite number).

[Figure: iterates $X_k$, $X_{k+1}$ approaching $X^*$, with errors $\delta_k = X_k - X^*$ and $\delta_{k+1} = X_{k+1} - X^*$.]

Equivalently, writing $\delta_k = X_k - X^*$: $\lim_{k\to\infty} |\delta_{k+1}| = r\,|\delta_k|^p$ ($r$ a finite number); asymptotically $|\delta_{k+1}| = r\,|\delta_k|^p$.
$p$ : order of convergence
$r$ : rate of convergence
∙ $\delta_k \approx 0$ for very large $k$, and the aim is to achieve $\delta_{k+1} = 0$
∙ The aim is better fulfilled with a larger $p$ (which makes $|\delta_k|^p$ even smaller, since $|\delta_k| \approx 0$) and a smaller $r$
An algorithm with a higher order and a lower rate is said to converge faster (a faster speed of convergence).
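Given the asymptotic relation $|\delta_{k+1}| = r\,|\delta_k|^p$, taking logs of successive error ratios recovers $p$ and then $r$; a sketch on a synthetic error sequence (the quadratic-decay sequence is an illustrative assumption, the kind a quadratically convergent method produces):

import numpy as np

# Illustrative error sequence with delta_{k+1} = 0.5 * delta_k^2.
d = [0.5]
for _ in range(5):
    d.append(0.5 * d[-1] ** 2)
d = np.array(d)

# From |delta_{k+1}| = r |delta_k|^p, logs of successive ratios give p, then r:
p = np.log(d[2:] / d[1:-1]) / np.log(d[1:-1] / d[:-2])
r = d[2:] / d[1:-1] ** p
print(p)   # -> 2.0 (order of convergence)
print(r)   # -> 0.5 (rate of convergence)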