
L. Vandenberghe ECE236B (Winter 2022)

10. Unconstrained minimization

• terminology and assumptions


• gradient descent method
• steepest descent method
• Newton’s method
• self-concordant functions
• implementation

Unconstrained minimization 10.1
Unconstrained minimization

minimize 𝑓 (𝑥)

• 𝑓 convex, twice continuously differentiable (hence dom 𝑓 open)


• we assume optimal value 𝑝★ = inf_𝑥 𝑓 (𝑥) is attained (and finite)

Unconstrained minimization methods

• produce sequence of points 𝑥 (𝑘) ∈ dom 𝑓 , 𝑘 = 0, 1, . . . , with

𝑓 (𝑥 (𝑘) ) → 𝑝★

• can be interpreted as iterative methods for solving optimality condition

∇ 𝑓 (𝑥★) = 0

Unconstrained minimization 10.2


Initial point and sublevel set

algorithms in this chapter require a starting point 𝑥 (0) such that

• 𝑥 (0) ∈ dom 𝑓
• sublevel set 𝑆 = {𝑥 | 𝑓 (𝑥) ≤ 𝑓 (𝑥 (0) )} is closed

2nd condition is hard to verify, except when all sublevel sets are closed:

• equivalent to condition that epi 𝑓 is closed


• true if dom 𝑓 = R^𝑛
• true if 𝑓 (𝑥) → ∞ as 𝑥 → bd dom 𝑓

examples of differentiable functions with closed sublevel sets:

𝑓 (𝑥) = log( ∑_{𝑖=1}^{𝑚} exp(𝑎_𝑖^𝑇 𝑥 + 𝑏_𝑖 ) ),    𝑓 (𝑥) = − ∑_{𝑖=1}^{𝑚} log(𝑏_𝑖 − 𝑎_𝑖^𝑇 𝑥)

Unconstrained minimization 10.3


Strong convexity and implications

𝑓 is strongly convex on 𝑆 if there exists an 𝑚 > 0 such that

∇² 𝑓 (𝑥) ⪰ 𝑚𝐼 for all 𝑥 ∈ 𝑆

Implications

• for 𝑥, 𝑦 ∈ 𝑆 ,

𝑓 (𝑦) ≥ 𝑓 (𝑥) + ∇ 𝑓 (𝑥)^𝑇 (𝑦 − 𝑥) + (𝑚/2) ‖𝑦 − 𝑥‖₂²

• 𝑆 is bounded
• 𝑝★ > −∞ and for 𝑥 ∈ 𝑆 ,

𝑓 (𝑥) − 𝑝★ ≤ (1/(2𝑚)) ‖∇ 𝑓 (𝑥)‖₂²

useful as stopping criterion (if you know 𝑚 )

Unconstrained minimization 10.4


Descent methods

𝑥 (𝑘+1) = 𝑥 (𝑘) + 𝑡 (𝑘) Δ𝑥 (𝑘) with 𝑓 (𝑥 (𝑘+1) ) < 𝑓 (𝑥 (𝑘) )

• other notations:
𝑥⁺ = 𝑥 + 𝑡Δ𝑥, 𝑥 := 𝑥 + 𝑡Δ𝑥

• Δ𝑥 is the step, or search direction; 𝑡 is the step size, or step length


• for convex 𝑓 : if 𝑓 (𝑥⁺) < 𝑓 (𝑥) then Δ𝑥 must be a descent direction:

∇ 𝑓 (𝑥)^𝑇 Δ𝑥 < 0

General descent method


given: a starting point 𝑥 ∈ dom 𝑓
repeat
1. determine a descent direction Δ𝑥
2. line search: choose a step size 𝑡 > 0
3. update: 𝑥 := 𝑥 + 𝑡Δ𝑥
until stopping criterion is satisfied

Unconstrained minimization 10.5


Line search types

Exact line search: 𝑡 = argmin_{𝑡>0} 𝑓 (𝑥 + 𝑡Δ𝑥)

Backtracking line search (with parameters 𝛼 ∈ (0, 1/2) , 𝛽 ∈ (0, 1) )


• starting at 𝑡 = 1, repeat 𝑡 := 𝛽𝑡 until

𝑓 (𝑥 + 𝑡Δ𝑥) < 𝑓 (𝑥) + 𝛼𝑡 ∇ 𝑓 (𝑥)^𝑇 Δ𝑥

• graphical interpretation: backtrack until 𝑡 ≤ 𝑡0

[figure: 𝑓 (𝑥 + 𝑡Δ𝑥) as a function of 𝑡, with the lines 𝑓 (𝑥) + 𝑡 ∇ 𝑓 (𝑥)^𝑇 Δ𝑥 and 𝑓 (𝑥) + 𝛼𝑡 ∇ 𝑓 (𝑥)^𝑇 Δ𝑥; the accepted step sizes are 0 ≤ 𝑡 ≤ 𝑡₀]
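a minimal NumPy sketch of this rule (not from the original slides; f and grad_f are assumed callables, and the default 𝛼, 𝛽 are illustrative values within the allowed ranges):

```python
import numpy as np

def backtracking(f, grad_f, x, dx, alpha=0.25, beta=0.5):
    """Shrink t until f(x + t*dx) < f(x) + alpha*t*grad_f(x)^T dx.
    Points outside dom f should make f return +inf, so they are rejected."""
    t = 1.0
    fx, slope = f(x), grad_f(x) @ dx   # slope < 0 for a descent direction
    while not (f(x + t * dx) < fx + alpha * t * slope):
        t *= beta
    return t
```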
Unconstrained minimization 10.6
Gradient descent method

Gradient descent: general descent method with Δ𝑥 = −∇ 𝑓 (𝑥)


given: a starting point 𝑥 ∈ dom 𝑓
repeat
1. Δ𝑥 := −∇ 𝑓 (𝑥)
2. line search: choose step size 𝑡 via exact or backtracking line search
3. update: 𝑥 := 𝑥 + 𝑡Δ𝑥
until stopping criterion is satisfied

• stopping criterion usually of the form ‖∇ 𝑓 (𝑥)‖₂ ≤ 𝜖


• convergence result: for strongly convex 𝑓 ,

𝑓 (𝑥 (𝑘) ) − 𝑝★ ≤ 𝑐^𝑘 ( 𝑓 (𝑥 (0) ) − 𝑝★)

𝑐 ∈ (0, 1) depends on 𝑚 , 𝑥 (0) , line search type


• very simple, but often very slow
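a sketch of the full loop, reusing the backtracking helper from page 10.6 (the tolerance eps and the iteration cap are illustrative choices, not from the slides):

```python
import numpy as np

def gradient_descent(f, grad_f, x0, eps=1e-6, max_iter=1000):
    """General descent method with dx = -grad f(x) and backtracking."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) <= eps:   # stopping criterion ||grad f(x)||_2 <= eps
            break
        dx = -g                        # search direction: negative gradient
        t = backtracking(f, grad_f, x, dx)
        x = x + t * dx
    return x
```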

Unconstrained minimization 10.7


Quadratic problem in R²

𝑓 (𝑥) = (1/2)(𝑥₁² + 𝛾𝑥₂²)    (𝛾 > 0)

with exact line search, starting at 𝑥 (0) = (𝛾, 1) :

𝑥₁^(𝑘) = 𝛾 ( (𝛾 − 1)/(𝛾 + 1) )^𝑘 ,    𝑥₂^(𝑘) = ( −(𝛾 − 1)/(𝛾 + 1) )^𝑘

• very slow if 𝛾 ≫ 1 or 𝛾 ≪ 1
• example for 𝛾 = 10:

[figure: gradient descent iterates 𝑥 (0) , 𝑥 (1) , . . . zig-zagging across the contour lines of 𝑓 for 𝛾 = 10]
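the closed form can be checked numerically (a sketch, not from the slides; for this quadratic the exact step along 𝑑 = −∇ 𝑓 (𝑥) is 𝑡 = ‖∇ 𝑓 (𝑥)‖₂² / (𝑑^𝑇 ∇² 𝑓 (𝑥) 𝑑)):

```python
import numpy as np

gamma = 10.0
x = np.array([gamma, 1.0])                     # x(0) = (gamma, 1)
r = (gamma - 1) / (gamma + 1)
for k in range(1, 6):
    g = np.array([x[0], gamma * x[1]])         # grad f(x) = (x1, gamma*x2)
    t = (g @ g) / (g[0]**2 + gamma * g[1]**2)  # exact line search step
    x = x - t * g
    assert np.allclose(x, [gamma * r**k, (-r)**k])   # matches the closed form
```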
Unconstrained minimization 10.8
Nonquadratic example

𝑓 (𝑥₁, 𝑥₂) = 𝑒^{𝑥₁+3𝑥₂−0.1} + 𝑒^{𝑥₁−3𝑥₂−0.1} + 𝑒^{−𝑥₁−0.1}

[figure: iterates with backtracking line search (left) and with exact line search (right)]

Unconstrained minimization 10.9


Example in R¹⁰⁰

𝑓 (𝑥) = 𝑐^𝑇 𝑥 − ∑_{𝑖=1}^{500} log(𝑏_𝑖 − 𝑎_𝑖^𝑇 𝑥)

[figure: 𝑓 (𝑥 (𝑘) ) − 𝑝★ versus 𝑘 on a semilog scale, from 10⁴ down to 10⁻⁴, for exact and backtracking line searches]

‘linear’ convergence, i.e., a straight line on a semilog plot
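a sketch of this objective, usable with the gradient_descent sketch from page 10.7 (the problem data here are random placeholders, not the data used in the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 100, 500
A = rng.standard_normal((m, n))          # rows are a_i^T
c = rng.standard_normal(n)
b = rng.uniform(0.1, 1.0, m)             # x0 = 0 is strictly feasible

def f(x):
    s = b - A @ x
    return c @ x - np.log(s).sum() if np.all(s > 0) else np.inf

def grad_f(x):
    return c + A.T @ (1.0 / (b - A @ x))  # grad = c + sum_i a_i/(b_i - a_i^T x)

x = gradient_descent(f, grad_f, np.zeros(n))
```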

Unconstrained minimization 10.10


Steepest descent method

Normalized steepest descent direction (at 𝑥 , for norm ‖ · ‖ ):

Δ𝑥_nsd = argmin{∇ 𝑓 (𝑥)^𝑇 𝑣 | ‖𝑣‖ = 1}

interpretation: for small 𝑣,

𝑓 (𝑥 + 𝑣) ≈ 𝑓 (𝑥) + ∇ 𝑓 (𝑥)^𝑇 𝑣

direction Δ𝑥_nsd is the unit-norm step with the most negative directional derivative

(Unnormalized) steepest descent direction

Δ𝑥_sd = ‖∇ 𝑓 (𝑥)‖∗ Δ𝑥_nsd

satisfies ∇ 𝑓 (𝑥)^𝑇 Δ𝑥_sd = −‖∇ 𝑓 (𝑥)‖∗²

Steepest descent method

• general descent method with Δ𝑥 = Δ𝑥_sd
• convergence properties similar to gradient descent

Unconstrained minimization 10.11


Examples

• Euclidean norm: Δ𝑥_sd = −∇ 𝑓 (𝑥)

• quadratic norm ‖𝑥‖_𝑃 = (𝑥^𝑇 𝑃𝑥)^{1/2} (𝑃 ∈ S^𝑛_{++}):

Δ𝑥_sd = −𝑃^{−1} ∇ 𝑓 (𝑥)

• ℓ₁-norm: Δ𝑥_sd = −(𝜕 𝑓 (𝑥)/𝜕𝑥_𝑖 ) 𝑒_𝑖 , where |𝜕 𝑓 (𝑥)/𝜕𝑥_𝑖 | = ‖∇ 𝑓 (𝑥)‖∞

unit balls, steepest descent directions for a quadratic norm and the ℓ₁-norm:

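the two non-Euclidean cases above translate directly into code (a sketch; g stands for the gradient ∇ 𝑓 (𝑥) and P for a positive definite matrix):

```python
import numpy as np

def sd_quadratic_norm(g, P):
    """Unnormalized steepest descent step for || . ||_P: dx = -P^{-1} g."""
    return -np.linalg.solve(P, g)

def sd_l1_norm(g):
    """Unnormalized steepest descent step for the l1 norm: move along the
    coordinate with largest |df/dx_i| = ||g||_inf."""
    i = np.argmax(np.abs(g))
    dx = np.zeros_like(g)
    dx[i] = -g[i]
    return dx
```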

Unconstrained minimization 10.12


Choice of norm for steepest descent

[figure: steepest descent iterates 𝑥 (0) , 𝑥 (1) , 𝑥 (2) for two different quadratic norms]

• steepest descent with backtracking line search for two quadratic norms
• ellipses show {𝑥 | ‖𝑥 − 𝑥 (𝑘) ‖_𝑃 = 1}
• equivalent interpretation of steepest descent with quadratic norm ‖ · ‖_𝑃 :
  gradient descent after change of variables 𝑥̄ = 𝑃^{1/2} 𝑥

shows choice of 𝑃 has strong effect on speed of convergence

Unconstrained minimization 10.13


Newton step

Δ𝑥_nt = −∇² 𝑓 (𝑥)^{−1} ∇ 𝑓 (𝑥)

Interpretations

• 𝑥 + Δ𝑥_nt minimizes the second-order approximation

𝑓̂ (𝑥 + 𝑣) = 𝑓 (𝑥) + ∇ 𝑓 (𝑥)^𝑇 𝑣 + (1/2) 𝑣^𝑇 ∇² 𝑓 (𝑥) 𝑣

• 𝑥 + Δ𝑥_nt solves the linearized optimality condition

∇ 𝑓 (𝑥 + 𝑣) ≈ ∇ 𝑓̂ (𝑥 + 𝑣) = ∇ 𝑓 (𝑥) + ∇² 𝑓 (𝑥) 𝑣 = 0

[figure: left, 𝑓 and its quadratic model 𝑓̂ , tangent at (𝑥, 𝑓 (𝑥)) and minimized at (𝑥 + Δ𝑥_nt, 𝑓̂ (𝑥 + Δ𝑥_nt)); right, 𝑓 ′ and its linearization, which crosses zero at 𝑥 + Δ𝑥_nt]

Unconstrained minimization 10.14


• Δ𝑥_nt is the steepest descent direction at 𝑥 in the local Hessian norm

‖𝑢‖_{∇² 𝑓 (𝑥)} = (𝑢^𝑇 ∇² 𝑓 (𝑥) 𝑢)^{1/2}

[figure: contour lines of 𝑓 (dashed), the ellipse {𝑥 + 𝑣 | 𝑣^𝑇 ∇² 𝑓 (𝑥)𝑣 = 1}, the points 𝑥 + Δ𝑥_nsd and 𝑥 + Δ𝑥_nt, and an arrow showing −∇ 𝑓 (𝑥)]

Unconstrained minimization 10.15


Newton decrement

𝜆(𝑥) = (∇ 𝑓 (𝑥)^𝑇 ∇² 𝑓 (𝑥)^{−1} ∇ 𝑓 (𝑥))^{1/2}

a measure of the proximity of 𝑥 to 𝑥★

Properties

• gives an estimate of 𝑓 (𝑥) − 𝑝★, using the quadratic approximation 𝑓̂ :

𝑓 (𝑥) − inf_𝑦 𝑓̂ (𝑦) = (1/2) 𝜆(𝑥)²

• equal to the norm of the Newton step in the quadratic Hessian norm:

𝜆(𝑥) = (Δ𝑥_nt^𝑇 ∇² 𝑓 (𝑥) Δ𝑥_nt)^{1/2}

• directional derivative in the Newton direction: ∇ 𝑓 (𝑥)^𝑇 Δ𝑥_nt = −𝜆(𝑥)²

• affine invariant (unlike ‖∇ 𝑓 (𝑥)‖₂)

Unconstrained minimization 10.16


Newton’s method

given: a starting point 𝑥 ∈ dom 𝑓 , tolerance 𝜖 > 0


repeat
1. compute the Newton step and decrement:

   Δ𝑥_nt := −∇² 𝑓 (𝑥)^{−1} ∇ 𝑓 (𝑥) ;    𝜆² := ∇ 𝑓 (𝑥)^𝑇 ∇² 𝑓 (𝑥)^{−1} ∇ 𝑓 (𝑥)

2. stopping criterion: quit if 𝜆²/2 ≤ 𝜖
3. line search: choose step size 𝑡 by backtracking line search
4. update: 𝑥 := 𝑥 + 𝑡Δ𝑥_nt
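a sketch of the algorithm, reusing the backtracking helper from page 10.6 (hess_f is an assumed callable returning ∇² 𝑓 (𝑥); the dense solve is replaced by a Cholesky factorization on page 10.29):

```python
import numpy as np

def newton(f, grad_f, hess_f, x0, eps=1e-8, max_iter=50):
    """Damped Newton method: stop when lambda(x)^2 / 2 <= eps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        dx = -np.linalg.solve(hess_f(x), g)   # Newton step
        lam2 = -g @ dx                        # lambda^2 = g^T H^{-1} g
        if lam2 / 2 <= eps:
            break
        t = backtracking(f, grad_f, x, dx)
        x = x + t * dx
    return x
```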

Affine invariance

• Newton iterates for 𝑓̃ (𝑦) = 𝑓 (𝑇 𝑦) with starting point 𝑦 (0) = 𝑇^{−1} 𝑥 (0) are

𝑦 (𝑘) = 𝑇^{−1} 𝑥 (𝑘)

• independent of linear changes of coordinates

Unconstrained minimization 10.17


Classical convergence analysis

Assumptions

• 𝑓 strongly convex on 𝑆 with constant 𝑚

• ∇² 𝑓 is Lipschitz continuous on 𝑆 , with constant 𝐿 > 0:

‖∇² 𝑓 (𝑥) − ∇² 𝑓 (𝑦)‖₂ ≤ 𝐿 ‖𝑥 − 𝑦‖₂

( 𝐿 measures how well 𝑓 can be approximated by a quadratic function)

Outline: there exist constants 𝜂 ∈ (0, 𝑚²/𝐿) , 𝛾 > 0 such that

• if ‖∇ 𝑓 (𝑥 (𝑘) )‖₂ ≥ 𝜂, then 𝑓 (𝑥 (𝑘+1) ) − 𝑓 (𝑥 (𝑘) ) ≤ −𝛾

• if ‖∇ 𝑓 (𝑥 (𝑘) )‖₂ < 𝜂, then

(𝐿/(2𝑚²)) ‖∇ 𝑓 (𝑥 (𝑘+1) )‖₂ ≤ ( (𝐿/(2𝑚²)) ‖∇ 𝑓 (𝑥 (𝑘) )‖₂ )²

Unconstrained minimization 10.18


Classical convergence analysis

Damped Newton phase ( ‖∇ 𝑓 (𝑥)‖₂ ≥ 𝜂 )

• most iterations require backtracking steps


• function value decreases by at least 𝛾
• if 𝑝★ > −∞, this phase ends after at most ( 𝑓 (𝑥 (0) ) − 𝑝★)/𝛾 iterations

Quadratically convergent phase ( ‖∇ 𝑓 (𝑥)‖₂ < 𝜂 )

• all iterations use step size 𝑡 = 1

• ‖∇ 𝑓 (𝑥)‖₂ converges to zero quadratically: if ‖∇ 𝑓 (𝑥 (𝑘) )‖₂ < 𝜂, then

(𝐿/(2𝑚²)) ‖∇ 𝑓 (𝑥 (𝑙) )‖₂ ≤ ( (𝐿/(2𝑚²)) ‖∇ 𝑓 (𝑥 (𝑘) )‖₂ )^{2^{𝑙−𝑘}} ≤ (1/2)^{2^{𝑙−𝑘}} ,    𝑙 ≥ 𝑘

Unconstrained minimization 10.19


Classical convergence analysis

Conclusion: number of iterations until 𝑓 (𝑥) − 𝑝★ ≤ 𝜖 is bounded above by

( 𝑓 (𝑥 (0) ) − 𝑝★)/𝛾 + log₂ log₂ (𝜖₀/𝜖)

• 𝛾 , 𝜖₀ are constants that depend on 𝑚 , 𝐿 , 𝑥 (0)

• second term is small (of the order of 6) and almost constant for practical purposes

• in practice, constants 𝑚 , 𝐿 (hence 𝛾 , 𝜖₀) are usually unknown

• provides qualitative insight into convergence properties (i.e., explains the two algorithm phases)

Unconstrained minimization 10.20


Examples

Example in R² (page 10.9)

[figure: iterates 𝑥 (0) , 𝑥 (1) on the contour plot (left) and 𝑓 (𝑥 (𝑘) ) − 𝑝★ versus 𝑘 on a semilog scale, dropping below 10⁻¹⁵ at 𝑘 = 5 (right)]

• backtracking parameters 𝛼 = 0.1, 𝛽 = 0.7


• converges in only 5 steps
• quadratic local convergence

Unconstrained minimization 10.21


Examples

Example in R¹⁰⁰ (page 10.10)

[figure: left, 𝑓 (𝑥 (𝑘) ) − 𝑝★ versus 𝑘 for exact line search and backtracking; right, step size 𝑡 (𝑘) versus 𝑘 for both line searches]

• backtracking parameters 𝛼 = 0.01, 𝛽 = 0.5


• backtracking line search almost as fast as exact l.s. (and much simpler)
• clearly shows two phases in algorithm

Unconstrained minimization 10.22


Examples

Example in R¹⁰⁰⁰⁰ (with sparse 𝑎_𝑖 )

𝑓 (𝑥) = − ∑_{𝑖=1}^{10000} log(1 − 𝑥_𝑖²) − ∑_{𝑖=1}^{100000} log(𝑏_𝑖 − 𝑎_𝑖^𝑇 𝑥)

[figure: 𝑓 (𝑥 (𝑘) ) − 𝑝★ versus 𝑘 on a semilog scale, converging in about 20 iterations]

• backtracking parameters 𝛼 = 0.01, 𝛽 = 0.5


• performance similar to that for the small examples

Unconstrained minimization 10.23


Self-concordance

Shortcomings of classical convergence analysis

• depends on unknown constants (𝑚 , 𝐿 , . . . )


• bound is not affinely invariant, although Newton’s method is

Convergence analysis via self-concordance (Nesterov and Nemirovski)

• does not depend on any unknown constants


• gives affine-invariant bound
• applies to special class of convex functions (‘self-concordant’ functions)
• developed to analyze polynomial-time interior-point methods for convex
optimization

Unconstrained minimization 10.24


Self-concordant functions

Definition

• convex 𝑓 : R → R is self-concordant if

| 𝑓 ′′′ (𝑥)| ≤ 2 𝑓 ′′ (𝑥)^{3/2} for all 𝑥 ∈ dom 𝑓

• 𝑓 : R^𝑛 → R is self-concordant if 𝑔(𝑡) = 𝑓 (𝑥 + 𝑡𝑣) is s.c. for all 𝑥 ∈ dom 𝑓 and all 𝑣

Examples on R

• linear and quadratic functions


• negative logarithm 𝑓 (𝑥) = − log 𝑥
• negative entropy plus negative logarithm: 𝑓 (𝑥) = 𝑥 log 𝑥 − log 𝑥
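
as a quick check (a worked step, not on the original slide), the negative logarithm satisfies the definition with equality: for 𝑓 (𝑥) = − log 𝑥 on 𝑥 > 0,

𝑓 ′′ (𝑥) = 1/𝑥² ,    𝑓 ′′′ (𝑥) = −2/𝑥³ ,    so | 𝑓 ′′′ (𝑥)| = 2/𝑥³ = 2 ( 𝑓 ′′ (𝑥))^{3/2}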

Affine invariance: if 𝑓 : R → R is s.c., then 𝑓̃ (𝑦) = 𝑓 (𝑎𝑦 + 𝑏) is s.c.:

𝑓̃ ′′′ (𝑦) = 𝑎³ 𝑓 ′′′ (𝑎𝑦 + 𝑏),    𝑓̃ ′′ (𝑦) = 𝑎² 𝑓 ′′ (𝑎𝑦 + 𝑏)

Unconstrained minimization 10.25


Self-concordant calculus

Properties

• preserved under sums, and under positive scaling by a factor ≥ 1


• preserved under composition with affine function
• if 𝑔 is convex with dom 𝑔 = R++ and |𝑔′′′ (𝑥)| ≤ 3𝑔′′ (𝑥)/𝑥 then

𝑓 (𝑥) = log(−𝑔(𝑥)) − log 𝑥

is self-concordant

Examples: properties can be used to show that the following are s.c.
• 𝑓 (𝑥) = − ∑_{𝑖=1}^{𝑚} log(𝑏_𝑖 − 𝑎_𝑖^𝑇 𝑥) on {𝑥 | 𝑎_𝑖^𝑇 𝑥 < 𝑏_𝑖 , 𝑖 = 1, . . . , 𝑚}
• 𝑓 (𝑋) = − log det 𝑋 on S^𝑛_{++}
• 𝑓 (𝑥, 𝑦) = − log(𝑦² − 𝑥^𝑇 𝑥) on {(𝑥, 𝑦) | ‖𝑥‖₂ < 𝑦}

Unconstrained minimization 10.26


Convergence analysis for self-concordant functions

Summary: there exist constants 𝜂 ∈ (0, 1/4] , 𝛾 > 0 such that

• if 𝜆(𝑥 (𝑘) ) > 𝜂, then

𝑓 (𝑥 (𝑘+1) ) − 𝑓 (𝑥 (𝑘) ) ≤ −𝛾

• if 𝜆(𝑥 (𝑘) ) ≤ 𝜂, then

2𝜆(𝑥 (𝑘+1) ) ≤ ( 2𝜆(𝑥 (𝑘) ) )²

(𝜂 and 𝛾 only depend on backtracking parameters 𝛼, 𝛽)

Complexity bound: number of Newton iterations bounded by

( 𝑓 (𝑥 (0) ) − 𝑝★)/𝛾 + log₂ log₂ (1/𝜖)

for 𝛼 = 0.1, 𝛽 = 0.8, 𝜖 = 10⁻¹⁰, bound evaluates to 375 ( 𝑓 (𝑥 (0) ) − 𝑝★) + 6

Unconstrained minimization 10.27


Numerical example

150 randomly generated instances of

minimize 𝑓 (𝑥) = − ∑_{𝑖=1}^{𝑚} log(𝑏_𝑖 − 𝑎_𝑖^𝑇 𝑥)

[figure: number of iterations versus 𝑓 (𝑥 (0) ) − 𝑝★ , ranging from about 5 to 25; markers ◦: 𝑚 = 100, 𝑛 = 50; □: 𝑚 = 1000, 𝑛 = 500; ♦: 𝑚 = 1000, 𝑛 = 50]
• number of iterations much smaller than 375( 𝑓 (𝑥 (0) ) − 𝑝★) + 6
• bound of the form 𝑐( 𝑓 (𝑥 (0) ) − 𝑝★) + 6 with smaller 𝑐 (empirically) valid
Unconstrained minimization 10.28
Implementation

main effort in each iteration: evaluate derivatives and solve Newton system

𝐻Δ𝑥 = −𝑔

where 𝐻 = ∇² 𝑓 (𝑥) , 𝑔 = ∇ 𝑓 (𝑥)

Via Cholesky factorization

𝐻 = 𝐿𝐿^𝑇 ,    Δ𝑥_nt = −𝐿^{−𝑇} 𝐿^{−1} 𝑔,    𝜆(𝑥) = ‖𝐿^{−1} 𝑔‖₂

• cost (1/3)𝑛³ flops for unstructured system

• cost ≪ (1/3)𝑛³ if 𝐻 sparse, banded
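
a sketch of this computation (assumes SciPy's solve_triangular; for sparse 𝐻 one would use a sparse Cholesky factorization instead):

```python
import numpy as np
from scipy.linalg import solve_triangular

def newton_step_cholesky(H, g):
    """Return the Newton step and decrement from H = L L^T."""
    L = np.linalg.cholesky(H)                    # lower-triangular factor
    w = solve_triangular(L, g, lower=True)       # w = L^{-1} g
    dx = -solve_triangular(L.T, w, lower=False)  # dx = -L^{-T} L^{-1} g
    lam = np.linalg.norm(w)                      # lambda(x) = ||L^{-1} g||_2
    return dx, lam
```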

Unconstrained minimization 10.29


Example of dense Newton system with structure

𝑓 (𝑥) = ∑_{𝑖=1}^{𝑛} 𝜓_𝑖 (𝑥_𝑖 ) + 𝜓₀ (𝐴𝑥 + 𝑏),    𝐻 = 𝐷 + 𝐴^𝑇 𝐻₀ 𝐴

• assume 𝐴 ∈ R^{𝑝×𝑛} , dense, with 𝑝 ≪ 𝑛

• 𝐷 diagonal with diagonal elements 𝜓_𝑖′′ (𝑥_𝑖 ) ; 𝐻₀ = ∇² 𝜓₀ (𝐴𝑥 + 𝑏)

Method 1: form 𝐻 , solve via dense Cholesky factorization (cost (1/3)𝑛³)

Method 2 (page 9.15): factor 𝐻₀ = 𝐿₀ 𝐿₀^𝑇 ; write the Newton system as

𝐷Δ𝑥 + 𝐴^𝑇 𝐿₀ 𝑤 = −𝑔,    𝐿₀^𝑇 𝐴Δ𝑥 − 𝑤 = 0

eliminate Δ𝑥 from the first equation; compute 𝑤 and Δ𝑥 from

(𝐼 + 𝐿₀^𝑇 𝐴𝐷^{−1} 𝐴^𝑇 𝐿₀) 𝑤 = −𝐿₀^𝑇 𝐴𝐷^{−1} 𝑔,    𝐷Δ𝑥 = −𝑔 − 𝐴^𝑇 𝐿₀ 𝑤

cost: 2𝑝²𝑛 (dominated by computation of 𝐿₀^𝑇 𝐴𝐷^{−1} 𝐴^𝑇 𝐿₀)
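
a sketch of Method 2 (the diagonal of 𝐷 is stored as a vector d; the 𝑝 × 𝑝 system is small, so the dense solve there is cheap):

```python
import numpy as np

def newton_step_structured(d, A, H0, g):
    """Newton step for H = diag(d) + A^T H0 A, with A of size p-by-n, p << n."""
    L0 = np.linalg.cholesky(H0)                        # H0 = L0 L0^T
    B = A.T @ L0                                       # n-by-p matrix A^T L0
    S = np.eye(B.shape[1]) + B.T @ (B / d[:, None])    # I + L0^T A D^{-1} A^T L0
    w = np.linalg.solve(S, -B.T @ (g / d))             # small p-by-p solve
    dx = -(g + B @ w) / d                              # D dx = -g - A^T L0 w
    return dx
```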


Unconstrained minimization 10.30
