
ECE 5314: Power System Operation & Control

Lecture 12: Gradient and Newton Methods

Vassilis Kekatos

R3 S. Boyd and L. Vandenberghe, Convex Optimization, Chapter 9.

R2 A. Gomez-Exposito, A. J. Conejo, C. Canizares, Electric Energy Systems: Analysis and Operation, Appendix B.

R1 A. J. Wood, B. F. Wollenberg, and G. B. Sheble, Power Generation, Operation, and Control, Wiley, 2014, Chapter 13.

Unconstrained minimization

Assume f is convex, twice continuously differentiable, and that p∗ is finite

p∗ := min_x f(x)

unconstrained minimization methods

• produce a sequence of points x^t with f(x^t) → p∗

• interpreted as iterative methods for solving the optimality condition

∇f(x∗) = 0

• if ∇²f(x) ⪰ mI with m > 0 (strong convexity), then

0 ≤ f(x) − p∗ ≤ ‖∇f(x)‖₂² / (2m)

useful as a stopping criterion (assuming m is known)
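
As a quick numerical illustration of this bound (a minimal sketch; the quadratic below and its strong-convexity constant are made up, not from the lecture):

    import numpy as np

    # f(x) = x^T A x has Hessian 2A, so m = 2*lambda_min(A); p* = 0 is attained at x = 0
    A = np.diag([1.0, 5.0])                       # made-up positive definite A
    m = 2 * np.linalg.eigvalsh(A).min()
    x = np.array([3.0, -2.0])                     # an arbitrary query point
    gap = x @ A @ x - 0.0                         # f(x) - p*
    grad = 2 * A @ x
    print(gap <= (grad @ grad) / (2 * m))         # True: ||grad f(x)||_2^2 / (2m) upper-bounds the gap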

Examples

Example 1: unconstrained QP (P = P^⊤ ≻ 0); a numerical check follows Example 3:

min_x x^⊤ P x + 2 q^⊤ x + r

Example 2: analytic center of linear inequalities

min_x − ∑_{i=1}^{m} log(b_i − a_i^⊤ x)

Example 3: interior-point methods tackle constrained problems by solving a sequence of unconstrained minimization problems
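
For Example 1, the minimizer is available in closed form: setting the gradient 2Px + 2q to zero gives x∗ = −P^{−1}q. A minimal numerical check (the data below are made up):

    import numpy as np

    P = np.array([[2.0, 0.5],
                  [0.5, 1.0]])                    # P = P^T, positive definite (made-up data)
    q = np.array([1.0, -1.0])
    x_star = np.linalg.solve(P, -q)               # x* = -P^{-1} q minimizes x^T P x + 2 q^T x + r
    print(2 * P @ x_star + 2 * q)                 # gradient at x*: numerically zero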

Descent method

1. Compute search direction ∆x^t

2. Choose step size µ_t > 0

3. Update x^{t+1} = x^t + µ_t ∆x^t

4. Iterate (t → t + 1) until stopping criterion is satisfied

Definition: An iterative method is a descent method if f(x^{t+1}) < f(x^t) ∀t

Recall that for convex f, we have f(x^{t+1}) ≥ f(x^t) + (∇f(x^t))^⊤ (x^{t+1} − x^t). Then:

f(x^{t+1}) < f(x^t) ⇒ the descent direction must satisfy (∇f(x^t))^⊤ ∆x^t < 0

Step size µ_t > 0: constant, exact line search, or backtracking line search

exact line search: µ_t := arg min_{µ>0} f(x^t + µ ∆x^t)
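
A minimal sketch of the generic descent loop above, with an exact line search over a bounded interval (the interval bound and the use of scipy's bounded scalar minimizer are assumptions, not part of the lecture):

    import numpy as np
    from scipy.optimize import minimize_scalar

    def descent(f, grad, direction, x0, tol=1e-8, max_iter=500):
        # direction(x) must return a descent direction: grad(x)^T direction(x) < 0
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            g = grad(x)
            if g @ g <= tol:                      # stopping criterion on the gradient norm
                break
            d = direction(x)
            # exact line search: mu_t = argmin_{mu > 0} f(x + mu d), restricted to (0, 10]
            mu = minimize_scalar(lambda s: f(x + s * d), bounds=(0.0, 10.0), method='bounded').x
            x = x + mu * d
        return x

    # gradient descent (next slide) is the special case direction(x) = -grad(x)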

Gradient descent

1. Compute search direction ∆x^t = −∇f(x^t) (special case of a descent method)

2. Choose a step size µ_t > 0

3. Update x^{t+1} = x^t + µ_t ∆x^t

4. Iterate until stopping criterion is satisfied

• converges with exact or backtracking line search, or with an upper-bounded (sufficiently small) constant µ

• convergence rate results: c ∈ (0, 1) depends on m, x^0, and the line search

linear for strongly convex f: f(x^t) − p∗ ≤ c^t (f(x^0) − p∗) (illustrated in the sketch below)


sublinear for general convex f: f(x^t) − p∗ ≤ (L/t) (f(x^0) − p∗)

• very simple but typically slow
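
A minimal sketch illustrating the linear rate on a strongly convex quadratic with a constant step size (the quadratic and the step-size rule µ < 1/λ_max(A) are assumptions, chosen to be consistent with the bounded-step remark above):

    import numpy as np

    A = np.diag([1.0, 10.0])                      # f(x) = x^T A x, Hessian 2A (made-up data)
    f = lambda x: x @ A @ x
    mu = 1.0 / (2.0 * np.linalg.eigvalsh(A).max())     # constant step, safely below 1/lambda_max(A)
    x = np.array([10.0, 1.0])
    gaps = []
    for t in range(50):
        gaps.append(f(x))                         # f(x^t) - p*, since p* = 0 here
        x = x - mu * (2 * A @ x)                  # x^{t+1} = x^t - mu * grad f(x^t)
    print(gaps[10] / gaps[9], gaps[40] / gaps[39])     # roughly constant ratio c: linear convergence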

Example

min_x x_1² + M x_2²

where M > 0

• exact line search

• initialize at x0 = (M, 1)

Figure: [Tom Luo’s slides]

• iterates take the form (verified numerically in the sketch below)

x^t = ( M ((M−1)/(M+1))^t , (−(M−1)/(M+1))^t )

• fast convergence when M is close to 1; one step if M = 1!

• slow, zig-zagging convergence if M ≫ 1 or M ≪ 1
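
A minimal numerical check of the closed-form iterates (M = 10 is an arbitrary choice; the exact line-search step below uses the standard formula for a quadratic):

    import numpy as np

    M = 10.0
    A = np.diag([1.0, M])                         # f(x) = x_1^2 + M x_2^2 = x^T A x
    x = np.array([M, 1.0])                        # x^0 = (M, 1)
    r = (M - 1) / (M + 1)
    for t in range(1, 6):
        g = 2 * A @ x                             # gradient
        mu = (g @ g) / (2 * g @ A @ g)            # exact line-search step for this quadratic
        x = x - mu * g
        print(np.allclose(x, [M * r**t, (-r)**t]))     # matches the closed-form iterate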

Example 2

For m = 100 and n = 50, use the gradient method (exact line search) on

min_x c^⊤ x − ∑_{i=1}^{m} log(a_i^⊤ x − b_i)

Figure: Function value convergence for gradient method [Z.-Q. Luo’s slides]
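
A sketch of the function/gradient oracle this example requires (the data A ∈ R^{m×n}, b, c are assumed given; they are not specified on the slide):

    import numpy as np

    def f_and_grad(x, A, b, c):
        # objective c^T x - sum_i log(a_i^T x - b_i); the domain requires Ax > b
        s = A @ x - b                             # slacks a_i^T x - b_i
        val = c @ x - np.sum(np.log(s))
        grad = c - A.T @ (1.0 / s)                # gradient: c - sum_i a_i / (a_i^T x - b_i)
        return val, grad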

Steepest descent direction

The term ∇f(x)^⊤ z gives the approximate decrease in f for small z:

f(x + z) ≈ f(x) + ∇f(x)^⊤ z

Find the direction of steepest descent (SD):

z_sd = arg min_{‖z‖≤1} ∇f(x)^⊤ z

Euclidean norm ‖z‖₂: z_sd = −∇f(x)/‖∇f(x)‖₂ (gradient descent)


Quadratic norm ‖z‖_P := (z^⊤ P z)^{1/2} for some P ≻ 0:

z_sd = −(∇f(x)^⊤ P^{−1} ∇f(x))^{−1/2} P^{−1} ∇f(x)

Equivalent to SD with the Euclidean norm on the transformed variables y = P^{1/2} x
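
A minimal sketch of the quadratic-norm steepest-descent direction above (function and variable names are illustrative, not from the lecture):

    import numpy as np

    def sd_direction(grad, P):
        # normalized steepest-descent direction for the norm ||z||_P = (z^T P z)^(1/2)
        Pinv_g = np.linalg.solve(P, grad)                 # P^{-1} grad, without forming the inverse
        return -Pinv_g / np.sqrt(grad @ Pinv_g)           # scaled so that ||z_sd||_P = 1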

Geometric interpretation

move as far as possible in direction −∇f (x), while staying inside the unit ball

Figure: Boyd’s slides

Choosing the norm

Figure: choice of P strongly affects speed of convergence [Boyd’s slides]

• steepest descent with backtracking line search for two quadratic norms

• ellipses show {x : ‖x − x^t‖_P = 1}

Pure Newton step and interpretations

Newton update: x⁺ = x + v

Newton step: v = −∇²f(x)^{−1} ∇f(x)

• minimizes the second-order expansion of f at x:

f(x) + ∇f(x)^⊤ (x⁺ − x) + ½ (x⁺ − x)^⊤ ∇²f(x) (x⁺ − x)

• solves the linearized optimality condition:

∇f(x) + ∇²f(x)(x⁺ − x) = 0

Figure: [Boyd’s slides]
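
In code, the pure Newton step is just a linear solve (a sketch; solving with the Hessian rather than inverting it is a standard implementation choice, not something stated on the slide):

    import numpy as np

    def newton_step(grad, hess):
        # v solves  hess @ v = -grad, i.e. v = -hess^{-1} grad without forming the inverse
        return np.linalg.solve(hess, -grad)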

Global behavior of Newton iterations

Example: f(x) = log(e^x + e^{−x}), starting at x^0 = −1.1

Figure: pure Newton iterations may diverge! [Z.Q. Luo’s slides]
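
A minimal reproduction of this divergence (f'(x) = tanh(x) and f''(x) = 1/cosh(x)² follow from the given f):

    import numpy as np

    x = -1.1                                      # starting point from the slide
    for t in range(5):
        x = x - np.tanh(x) * np.cosh(x) ** 2      # pure Newton: x+ = x - f'(x)/f''(x)
        print(t + 1, x)                           # |x| grows: the pure Newton iterates diverge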

Newton method

Also called the damped or guarded Newton method (a code sketch follows the bullets below)

1. Compute Newton direction ∆x^t = −(∇²f(x^t))^{−1} ∇f(x^t)


2. Choose step size µ_t

3. Update x^{t+1} = x^t + µ_t ∆x^t

4. Iterate until stopping criterion is satisfied

• global convergence with backtracking or exact line search

• quadratic local convergence

• affine invariance: the Newton iterates for min_x f(x) and min_z f(Tz) with invertible T are equivalent, with x^t = T z^t
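
A minimal sketch of the damped Newton method above with a backtracking line search (the Armijo parameters alpha, beta and the gradient-norm stopping rule are assumptions, not values from the lecture):

    import numpy as np

    def damped_newton(f, grad, hess, x0, tol=1e-10, alpha=0.25, beta=0.5, max_iter=100):
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            g = grad(x)
            if g @ g <= tol:                      # stop when the gradient is small
                break
            dx = np.linalg.solve(hess(x), -g)     # Newton direction
            mu = 1.0                              # backtracking (Armijo) line search
            while f(x + mu * dx) > f(x) + alpha * mu * (g @ dx):
                mu *= beta
            x = x + mu * dx
        return x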

Convergence results

assumptions: mI ⪯ ∇²f(x) ⪯ MI and the Lipschitz condition

‖∇²f(x) − ∇²f(y)‖ ≤ L ‖x − y‖

1. damped Newton phase (‖∇f(x)‖₂ ≥ η₁): f(x⁺) ≤ f(x) − η₂, hence

#iterations ≤ (f(x^0) − f∗)/η₂

2. quadratically convergent phase (‖∇f(x)‖₂ < η₁):

#iterations ≤ log₂ log₂(η₃/ε)

total #iterations to reach accuracy f(x^t) − f∗ ≤ ε is bounded by:

(f(x^0) − f∗)/η₂ + log₂ log₂(η₃/ε)

η₁, η₂, η₃ depend on m, M, L (a dependence that can be avoided for self-concordant functions)

Example
f(x) = − ∑_{n=1}^{10,000} log(1 − x_n²) − ∑_{i=1}^{100,000} log(b_i − a_i^⊤ x)

Figure: Two-phase convergence of Newton method [Boyd’s slides]

• x ∈ R^{10,000} with sparse a_i's

Minimization with linear equality constraints

Linearly-constrained optimization problem:

min_x { f(x) : Ax = b }

Approach 1: solve the reduced (eliminated) problem

min_z f(Fz + x_0)

where Ax_0 = b and range(F) = null(A)

Approach 2: find the feasible update that minimizes the second-order approximation

∆x := arg min_v  f(x) + ∇f(x)^⊤ v + ½ v^⊤ ∇²f(x) v

        s.to  A(x + v) = b

[Q: How can this be solved?]
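
One standard answer, not spelled out on this slide: the subproblem in Approach 2 is an equality-constrained QP, so its optimality (KKT) conditions form a linear system in the step v and a multiplier w, which can be solved directly. A sketch, assuming x is already feasible (Ax = b):

    import numpy as np

    def eq_newton_step(grad, hess, A):
        # solve  [hess  A^T] [v]   [-grad]
        #        [A      0 ] [w] = [  0  ]    (x feasible, so A v = 0 keeps A(x+v) = b)
        n, p = hess.shape[0], A.shape[0]
        KKT = np.block([[hess, A.T], [A, np.zeros((p, p))]])
        rhs = np.concatenate([-grad, np.zeros(p)])
        sol = np.linalg.solve(KKT, rhs)
        return sol[:n]                            # the step v; sol[n:] holds the multiplier w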

