EO - Chapter 7 - Multiple Variable Methods - 2nd Part
Engineering Optimization
Unconstrained Problems:
Multiple-Variable Methods (2nd part)
● Unconstrained Optimization:
Methods for Multiple Variables
– 0th order
– Quasi-Newton Methods
● Conclusions:
● Conjugate directions:
𝒅𝑖ᵀ 𝐴 𝒅𝑗 = 0 for all 𝑖 ≠ 𝑗
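A quick numerical illustration of this definition: the sketch below constructs a direction 𝒅2 that is 𝐴-conjugate to a given 𝒅1 (the matrix 𝐴 and the starting directions are arbitrary illustrative choices, not from the slides):

```python
# Minimal check of A-conjugacy: d1 and d2 are conjugate with respect to a
# symmetric positive-definite A if d1^T A d2 = 0.
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 2.0]])      # symmetric positive definite (illustrative)

d1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
# Gram-Schmidt-like projection in the A inner product makes d2 conjugate to d1
d2 = e2 - (d1 @ A @ e2) / (d1 @ A @ d1) * d1

print(d1 @ A @ d2)              # ~0: d1 and d2 are A-conjugate
```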
CG practical
1. Start with an arbitrary initial point 𝑿1. Set the iteration number as 𝑖 = 1.
2. Set the first search direction 𝑺1 = −∇𝑓(𝑿1) = −∇𝑓1.
3. Find the point 𝑿2 = 𝑿1 + 𝜆1∗ 𝑺1, where 𝜆1∗ is the optimal step length in the direction 𝑺1. Set 𝑖 = 2 and go to the next step.
4. Find ∇𝑓𝑖 = ∇𝑓(𝑿𝑖), and set
𝑺𝑖 = −∇𝑓𝑖 + (‖∇𝑓𝑖‖² / ‖∇𝑓𝑖−1‖²) 𝑺𝑖−1
5. Compute the optimum step length 𝜆∗𝑖 in the direction 𝑺𝑖 , and find the
new point: 𝑿𝑖+1 = 𝑿𝑖 + 𝜆∗𝑖 𝑺𝑖
6. Test for the optimality of the point 𝑿𝑖+1. If 𝑿𝑖+1 is optimum, stop the process. Otherwise, set 𝑖 = 𝑖 + 1 and go to step 4.
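Collected into code, steps 1 to 6 above give the Fletcher-Reeves algorithm. A minimal sketch, assuming an exact line search is available; the helper names and the quadratic test problem (the same function used in the examples later in this chapter) are illustrative:

```python
import numpy as np

def conjugate_gradient(grad, x, line_search, eps=1e-6, max_iter=100):
    """Fletcher-Reeves CG, following steps 1-6 above."""
    g = grad(x)
    s = -g                                   # step 2: first direction is steepest descent
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:         # optimality test
            break
        lam = line_search(x, s)              # optimal step length along s
        x = x + lam * s
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)     # ||grad_i||^2 / ||grad_{i-1}||^2
        s = -g_new + beta * s                # step 4: new conjugate direction
        g = g_new
    return x

# Quadratic test problem: f = x1 - x2 + 2*x1^2 + 2*x1*x2 + x2^2, i.e. A x + b gradient
A = np.array([[4.0, 2.0], [2.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A @ x + b
# For quadratics the exact line search has a closed form: lam = -g^T s / (s^T A s)
exact_step = lambda x, s: -(grad(x) @ s) / (s @ A @ s)
print(conjugate_gradient(grad, np.zeros(2), exact_step))   # -> [-1.0, 1.5]
```

On an 𝑛-dimensional quadratic, CG with exact line searches terminates in at most 𝑛 steps; the two-variable example above converges in two iterations.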
Newton's method
● Single-variable methods:
– 0th order
– 1st order
– 2nd order
● Concept:
– Minimize a local (quadratic) approximation
– Repeat
● First-order optimality condition applied to the local approximation to find the minimum:
∇𝑓(𝒙) = 0 → ∇𝑓 + 𝐻 Δ𝒙 = 0 → Δ𝒙 = −𝐻⁻¹ ∇𝑓
Newton’s method (2)
● Step: 𝒔𝑘 = −𝐻⁻¹ ∇𝑓, with 𝐻 and ∇𝑓 evaluated at 𝒙𝑘
● Update: 𝒙𝑘+1 = 𝒙𝑘 + 𝒔𝑘
● Note: for robustness, a step length 𝜆𝑘 can be introduced along the Newton direction
● Update: 𝒙𝑘+1 = 𝒙𝑘 + 𝜆𝑘 𝒔𝑘
● Near the optimum, convergence is quadratic (order 2):
‖𝒙𝑘+1 − 𝒙∗‖ ≤ 𝑐 ‖𝒙𝑘 − 𝒙∗‖²
[Figure: error ‖𝒙𝑘 − 𝒙∗‖ versus iteration 1 to 4 on a logarithmic scale, illustrating quadratic convergence]
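Both update rules are easy to state in code. A minimal sketch, where the backtracking rule for choosing 𝜆𝑘 is an illustrative assumption (the slides only introduce the step length itself):

```python
import numpy as np

def newton_step(grad, hess, x):
    """Full Newton step s = -H^{-1} grad f, evaluated at x.
    Solves H s = -g instead of forming the inverse; assumes H is positive definite."""
    return np.linalg.solve(hess(x), -grad(x))

def damped_newton(f, grad, hess, x, eps=1e-8, max_iter=50):
    """Newton's method with a step length lam_k (simple backtracking)."""
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        s = newton_step(grad, hess, x)
        lam = 1.0
        while f(x + lam * s) > f(x) and lam > 1e-8:   # halve lam until f decreases
            lam *= 0.5
        x = x + lam * s                                # update: x_{k+1} = x_k + lam_k s_k
    return x
```

With 𝜆𝑘 = 1 accepted at every step, this reduces to the pure Newton update.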
For a quadratic function
𝑓(𝒙) = ½ 𝒙ᵀ 𝐴 𝒙 + 𝑩ᵀ 𝒙 + 𝐶
the minimum of 𝑓(𝒙) is given by
∇𝑓 = 𝐴𝒙 + 𝑩 = 0 → 𝒙∗ = −𝐴⁻¹𝑩
Since 𝐻 = 𝐴 for this 𝑓, the Newton update
𝒙𝑘+1 = 𝒙𝑘 − 𝐻⁻¹ ∇𝑓 = 𝒙𝑘 − 𝐴⁻¹(𝐴𝒙𝑘 + 𝑩)
where 𝒙𝑘 is the starting point of the 𝑘th iteration, gives the exact solution in a single step:
𝒙𝑘+1 = 𝒙∗ = −𝐴⁻¹𝑩
Example: minimize 𝑓(𝑥1, 𝑥2) = 𝑥1 − 𝑥2 + 2𝑥1² + 2𝑥1𝑥2 + 𝑥2² by Newton's method, starting from 𝑿1 = [0, 0]ᵀ.
As
∇𝑓1 = [𝜕𝑓/𝜕𝑥1, 𝜕𝑓/𝜕𝑥2]ᵀ|𝑿1 = [1 + 4𝑥1 + 2𝑥2, −1 + 2𝑥1 + 2𝑥2]ᵀ|𝑿1 = [1, −1]ᵀ
Thus
𝑿2 = 𝑿1 − 𝐻⁻¹ ∇𝑓1 = [0, 0]ᵀ − [[1/2, −1/2], [−1/2, 1]] [1, −1]ᵀ = [−1, 3/2]ᵀ
To see whether or not 𝑿2 is the optimum point, we evaluate
∇𝑓2 = [𝜕𝑓/𝜕𝑥1, 𝜕𝑓/𝜕𝑥2]ᵀ|𝑿2 = [1 + 4𝑥1 + 2𝑥2, −1 + 2𝑥1 + 2𝑥2]ᵀ|𝑿2 = [0, 0]ᵀ
Since ∇𝑓2 = 𝟎, 𝑿2 is the optimum, reached in a single Newton step.
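The arithmetic in this example can be checked numerically. A minimal sketch using the gradient and Hessian derived above:

```python
import numpy as np

# Gradient and Hessian of f(x1, x2) = x1 - x2 + 2*x1**2 + 2*x1*x2 + x2**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
H = np.array([[4.0, 2.0], [2.0, 2.0]])

X1 = np.zeros(2)
X2 = X1 - np.linalg.solve(H, grad(X1))   # single Newton step from (0, 0)
print(X2)        # [-1.   1.5]
print(grad(X2))  # [0. 0.]  -> X2 satisfies the optimality condition
```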
Marquardt's method
1. Start with an arbitrary initial point 𝑿1 and constants 𝛼1 (on the order of 10⁴), 𝑐1 (0 < 𝑐1 < 1), 𝑐2 (𝑐2 > 1), and 𝜀 (on the order of 10⁻²). Set the iteration number as 𝑖 = 1.
2. Compute the gradient of the function, ∇𝑓𝑖 = ∇𝑓(𝑿𝑖).
3. Test for optimality of the point 𝑿𝑖. If ‖∇𝑓𝑖‖ = ‖∇𝑓(𝑿𝑖)‖ ≤ 𝜀, 𝑿𝑖 is optimum, so stop the process. Otherwise, go to step 4.
4. Find the new vector 𝑿𝑖+1 as
𝑿𝑖+1 = 𝑿𝑖 + 𝑺𝑖 = 𝑿𝑖 − (𝐻𝑖 + 𝛼𝑖 𝐼)⁻¹ ∇𝑓𝑖
5. Compare the values of 𝑓𝑖+1 and 𝑓𝑖. If 𝑓𝑖+1 < 𝑓𝑖, go to step 6. If 𝑓𝑖+1 ≥ 𝑓𝑖, go to step 7.
6. Set 𝛼𝑖+1 = 𝑐1 𝛼𝑖, 𝑖 = 𝑖 + 1, and go to step 2.
7. Set 𝛼𝑖 = 𝑐2 𝛼𝑖 and go to step 4.
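Steps 1 to 7 translate into a short routine. A minimal sketch, assuming 𝑐1 = 0.25 and 𝑐2 = 2 (illustrative choices within the stated ranges; the test problem is the example function used below):

```python
import numpy as np

def marquardt(f, grad, hess, x, alpha=1e4, c1=0.25, c2=2.0, eps=1e-2, max_iter=200):
    """Marquardt's method: Newton-like step on H + alpha*I with adaptive alpha."""
    n = len(x)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:                                  # step 3
            break
        x_new = x - np.linalg.solve(hess(x) + alpha * np.eye(n), g)   # step 4
        if f(x_new) < f(x):                                           # step 5
            alpha *= c1          # step 6: shrink alpha, lean toward Newton
            x = x_new
        else:
            alpha *= c2          # step 7: grow alpha, lean toward steepest descent
    return x

f    = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
hess = lambda x: np.array([[4.0, 2.0], [2.0, 2.0]])
print(marquardt(f, grad, hess, np.zeros(2)))   # converges toward [-1, 1.5]
```

Large 𝛼 makes the step resemble a small steepest-descent step; as 𝛼 shrinks the step approaches the pure Newton step, which is exactly the behavior the worked iterations below show.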
Example: minimize 𝑓(𝑥1, 𝑥2) = 𝑥1 − 𝑥2 + 2𝑥1² + 2𝑥1𝑥2 + 𝑥2² by Marquardt's method, starting from 𝑿1 = [0, 0]ᵀ.
Solution:
Iteration 1 (𝒊 = 𝟏)
Here 𝑓1 = 𝑓(𝑿1) = 0.0 and
∇𝑓1 = [𝜕𝑓/𝜕𝑥1, 𝜕𝑓/𝜕𝑥2]ᵀ|(0,0) = [1 + 4𝑥1 + 2𝑥2, −1 + 2𝑥1 + 2𝑥2]ᵀ|(0,0) = [1, −1]ᵀ
Since
𝐻1 = [[𝜕²𝑓/𝜕𝑥1², 𝜕²𝑓/𝜕𝑥1𝜕𝑥2], [𝜕²𝑓/𝜕𝑥2𝜕𝑥1, 𝜕²𝑓/𝜕𝑥2²]]|(0,0) = [[4, 2], [2, 2]]
we obtain
𝑿2 = 𝑿1 + 𝑺1 = 𝑿1 − (𝐻1 + 𝛼1 𝐼)⁻¹ ∇𝑓1 = [0, 0]ᵀ − [[4 + 10⁴, 2], [2, 2 + 10⁴]]⁻¹ [1, −1]ᵀ = 10⁻⁴ [−0.9998, 1]ᵀ
As 𝑓2 = 𝑓(𝑿2) = −1.9997 × 10⁻⁴ < 𝑓1, we set 𝛼2 = 𝑐1 𝛼1 = 2500 (taking 𝑐1 = 0.25), 𝑖 = 2, and proceed to the next iteration.
Iteration 2 (𝒊 = 𝟐)
The gradient vector corresponding to 𝑿2 is given by ∇𝑓2 = [0.9998, −1]ᵀ. Since ‖∇𝑓2‖ = 1.4141 > 𝜀, we compute
𝑿3 = 𝑿2 − (𝐻2 + 𝛼2 𝐼)⁻¹ ∇𝑓2 = [−0.9998 × 10⁻⁴, 1 × 10⁻⁴]ᵀ − [[2504, 2], [2, 2502]]⁻¹ [0.9998, −1]ᵀ = [−4.9958 × 10⁻⁴, 5 × 10⁻⁴]ᵀ
Since 𝑓3 = 𝑓(𝑿3) = −0.9993 × 10⁻³ < 𝑓2, we set 𝛼3 = 𝑐1 𝛼2 = 625, 𝑖 = 3, and proceed to the next iteration. The iterative process is continued until the convergence criterion ‖∇𝑓𝑖‖ ≤ 𝜀 is satisfied.
● Conclusions:
– Newton's method converges fast near the optimum, but is not robust!
● Robustness improvements:
– Line search along the Newton direction (𝒙𝑘+1 = 𝒙𝑘 + 𝜆𝑘 𝒔𝑘)
– Marquardt modification (𝐻 + 𝛼𝐼)
– Quasi-Newton updates (DFP, BFGS), illustrated in the examples that follow
Example 4 (DFP)
Minimize 𝑓(𝑥1, 𝑥2) = 𝑥1 − 𝑥2 + 2𝑥1² + 2𝑥1𝑥2 + 𝑥2² by the DFP method, starting from 𝑿1 = [0, 0]ᵀ with 𝐵1 = 𝐼.
Solution:
Iteration 1 (𝒊 = 𝟏)
Here
∇𝑓1 = ∇𝑓(𝑿1) = [𝜕𝑓/𝜕𝑥1, 𝜕𝑓/𝜕𝑥2]ᵀ|(0,0) = [1 + 4𝑥1 + 2𝑥2, −1 + 2𝑥1 + 2𝑥2]ᵀ|(0,0) = [1, −1]ᵀ
and hence
𝑺1 = −𝐵1 ∇𝑓1 = −[[1, 0], [0, 1]] [1, −1]ᵀ = [−1, 1]ᵀ
To find the minimizing step length 𝜆1∗ along 𝑺1, we minimize
𝑓(𝑿1 + 𝜆1 𝑺1) = 𝑓([0, 0]ᵀ + 𝜆1 [−1, 1]ᵀ) = 𝑓(−𝜆1, 𝜆1) = 𝜆1² − 2𝜆1
with respect to 𝜆1. Since 𝑑𝑓/𝑑𝜆1 = 0 at 𝜆1∗ = 1, we obtain
𝑿2 = 𝑿1 + 𝜆1∗ 𝑺1 = [0, 0]ᵀ + 1 · [−1, 1]ᵀ = [−1, 1]ᵀ
Since ∇𝑓2 = ∇𝑓(𝑿2) = [−1, −1]ᵀ and ‖∇𝑓2‖ = 1.4142 > 𝜀, we proceed to update the matrix 𝐵𝑖 by computing
𝒈1 = ∇𝑓2 − ∇𝑓1 = [−1, −1]ᵀ − [1, −1]ᵀ = [−2, 0]ᵀ
𝑺1ᵀ 𝒈1 = [−1, 1] [−2, 0]ᵀ = 2;  𝑺1 𝑺1ᵀ = [[1, −1], [−1, 1]]
𝐵1 𝒈1 = [[1, 0], [0, 1]] [−2, 0]ᵀ = [−2, 0]ᵀ;  (𝐵1 𝒈1)ᵀ = [−2, 0]
𝒈1ᵀ 𝐵1 𝒈1 = [−2, 0] [[1, 0], [0, 1]] [−2, 0]ᵀ = 4
𝑀1 = 𝜆1∗ (𝑺1 𝑺1ᵀ) / (𝑺1ᵀ 𝒈1) = (1/2) [[1, −1], [−1, 1]] = [[1/2, −1/2], [−1/2, 1/2]]
𝑁1 = −(𝐵1 𝒈1)(𝐵1 𝒈1)ᵀ / (𝒈1ᵀ 𝐵1 𝒈1) = −(1/4) [[4, 0], [0, 0]] = [[−1, 0], [0, 0]]
𝐵2 = 𝐵1 + 𝑀1 + 𝑁1 = [[0.5, −0.5], [−0.5, 1.5]]
Iteration 2 (𝒊 = 𝟐)
The next search direction is determined as
𝑺2 = −𝐵2 ∇𝑓2 = −[[0.5, −0.5], [−0.5, 1.5]] [−1, −1]ᵀ = [0, 1]ᵀ
To find the minimizing step length 𝜆2∗ along 𝑺2, we minimize
𝑓(𝑿2 + 𝜆2 𝑺2) = 𝑓([−1, 1]ᵀ + 𝜆2 [0, 1]ᵀ) = 𝑓(−1, 1 + 𝜆2) = 𝜆2² − 𝜆2 − 1
with respect to 𝜆2. Since 𝑑𝑓/𝑑𝜆2 = 0 at 𝜆2∗ = 1/2, we obtain
𝑿3 = 𝑿2 + 𝜆2∗ 𝑺2 = [−1, 1]ᵀ + (1/2) [0, 1]ᵀ = [−1, 3/2]ᵀ
This point can be identified to be optimum since
∇𝑓3 = [0, 0]ᵀ and ‖∇𝑓3‖ = 0 < 𝜀
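The update 𝐵2 = 𝐵1 + 𝑀1 + 𝑁1 used above is one DFP update of the inverse-Hessian approximation. A minimal sketch that reproduces the computation (the function and variable names are mine):

```python
import numpy as np

def dfp_update(B, s, lam, g_diff):
    """One DFP update of the inverse-Hessian approximation B:
    M = lam * s s^T / (s^T g),  N = -(B g)(B g)^T / (g^T B g),
    where s is the search direction and g is the gradient difference."""
    Bg = B @ g_diff
    M = lam * np.outer(s, s) / (s @ g_diff)
    N = -np.outer(Bg, Bg) / (g_diff @ Bg)
    return B + M + N

# Reproduce the update from the example above
B1 = np.eye(2)
S1 = np.array([-1.0, 1.0])          # first search direction
g1 = np.array([-2.0, 0.0])          # grad f2 - grad f1
print(dfp_update(B1, S1, 1.0, g1))  # [[ 0.5 -0.5]
                                    #  [-0.5  1.5]]
```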
Example 5 (BFGS)
Minimize the same function, 𝑓(𝑥1, 𝑥2) = 𝑥1 − 𝑥2 + 2𝑥1² + 2𝑥1𝑥2 + 𝑥2², by the BFGS method, starting from 𝑿1 = [0, 0]ᵀ with 𝐵1 = 𝐼.
Solution:
Iteration 1 (𝒊 = 𝟏)
Here
∇𝑓1 = ∇𝑓(𝑿1) = [𝜕𝑓/𝜕𝑥1, 𝜕𝑓/𝜕𝑥2]ᵀ|(0,0) = [1 + 4𝑥1 + 2𝑥2, −1 + 2𝑥1 + 2𝑥2]ᵀ|(0,0) = [1, −1]ᵀ
and hence
𝑺1 = −𝐵1 ∇𝑓1 = −[[1, 0], [0, 1]] [1, −1]ᵀ = [−1, 1]ᵀ
To find the minimizing step length 𝜆1∗ along 𝑺1, we minimize
𝑓(𝑿1 + 𝜆1 𝑺1) = 𝑓([0, 0]ᵀ + 𝜆1 [−1, 1]ᵀ) = 𝑓(−𝜆1, 𝜆1) = 𝜆1² − 2𝜆1
with respect to 𝜆1. Since 𝑑𝑓/𝑑𝜆1 = 0 at 𝜆1∗ = 1, we obtain
𝑿2 = 𝑿1 + 𝜆1∗ 𝑺1 = [0, 0]ᵀ + 1 · [−1, 1]ᵀ = [−1, 1]ᵀ
Since ∇𝑓2 = ∇𝑓(𝑿2) = [−1, −1]ᵀ and ‖∇𝑓2‖ = 1.4142 > 𝜀, we proceed to update the matrix 𝐵𝑖 by computing
𝒈1 = ∇𝑓2 − ∇𝑓1 = [−1, −1]ᵀ − [1, −1]ᵀ = [−2, 0]ᵀ;  𝒅1 = 𝜆1∗ 𝑺1 = 1 · [−1, 1]ᵀ = [−1, 1]ᵀ
𝒅1 𝒅1ᵀ = [[1, −1], [−1, 1]];  𝒅1ᵀ 𝒈1 = [−1, 1] [−2, 0]ᵀ = 2
𝒅1 𝒈1ᵀ = [−1, 1]ᵀ [−2, 0] = [[2, 0], [−2, 0]];  𝒈1 𝒅1ᵀ = [−2, 0]ᵀ [−1, 1] = [[2, −2], [0, 0]]
𝒈1ᵀ 𝐵1 𝒈1 = [−2, 0] [[1, 0], [0, 1]] [−2, 0]ᵀ = 4
𝒅1 𝒈1ᵀ 𝐵1 = [[2, 0], [−2, 0]] [[1, 0], [0, 1]] = [[2, 0], [−2, 0]]
𝐵1 𝒈1 𝒅1ᵀ = [[1, 0], [0, 1]] [[2, −2], [0, 0]] = [[2, −2], [0, 0]]
Update the inverse-Hessian approximation as
𝐵2 = 𝐵1 + (1 + 𝒈1ᵀ𝐵1𝒈1 / 𝒅1ᵀ𝒈1)(𝒅1𝒅1ᵀ / 𝒅1ᵀ𝒈1) − (𝒅1𝒈1ᵀ𝐵1) / (𝒅1ᵀ𝒈1) − (𝐵1𝒈1𝒅1ᵀ) / (𝒅1ᵀ𝒈1)
= [[1, 0], [0, 1]] + (1 + 4/2)(1/2) [[1, −1], [−1, 1]] − (1/2) [[2, 0], [−2, 0]] − (1/2) [[2, −2], [0, 0]]
= [[1/2, −1/2], [−1/2, 5/2]]
Iteration 2 (𝒊 = 𝟐)
The next search direction is determined as
𝑺2 = −𝐵2 ∇𝑓2 = −[[1/2, −1/2], [−1/2, 5/2]] [−1, −1]ᵀ = [0, 2]ᵀ
To find the minimizing step length 𝜆2∗ along 𝑺2, we minimize
𝑓(𝑿2 + 𝜆2 𝑺2) = 𝑓([−1, 1]ᵀ + 𝜆2 [0, 2]ᵀ) = 𝑓(−1, 1 + 2𝜆2) = 4𝜆2² − 2𝜆2 − 1
with respect to 𝜆2. Since 𝑑𝑓/𝑑𝜆2 = 0 at 𝜆2∗ = 1/4, we obtain
𝑿3 = 𝑿2 + 𝜆2∗ 𝑺2 = [−1, 1]ᵀ + (1/4) [0, 2]ᵀ = [−1, 3/2]ᵀ
This point can be identified to be optimum since
∇𝑓3 = [0, 0]ᵀ and ‖∇𝑓3‖ = 0 < 𝜀
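The BFGS inverse-Hessian update used in this example packages into a small function in the same way. A minimal sketch that reproduces 𝐵2 from the first iteration above (names are mine):

```python
import numpy as np

def bfgs_update(B, d, g):
    """One BFGS update of the inverse-Hessian approximation B,
    as used in the example above:
    B_new = B + (1 + g^T B g / d^T g) * d d^T / (d^T g)
              - (d g^T B + B g d^T) / (d^T g)
    with d = lam* S (the step taken) and g = grad f_new - grad f_old."""
    dg = d @ g
    Bg = B @ g                       # B is symmetric, so g^T B = (B g)^T
    term1 = (1.0 + (g @ Bg) / dg) * np.outer(d, d) / dg
    term2 = (np.outer(d, Bg) + np.outer(Bg, d)) / dg
    return B + term1 - term2

# Reproduce the update from the example above
B1 = np.eye(2)
d1 = np.array([-1.0, 1.0])          # d1 = lam1* S1
g1 = np.array([-2.0, 0.0])          # grad f2 - grad f1
print(bfgs_update(B1, d1, g1))      # [[ 0.5 -0.5]
                                    #  [-0.5  2.5]]
```

Note that DFP and BFGS produce different 𝐵2 matrices here ([[0.5, −0.5], [−0.5, 1.5]] versus [[0.5, −0.5], [−0.5, 2.5]]), yet both reach the same optimum [−1, 3/2]ᵀ in two iterations on this quadratic.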
● BFGS: 53
● DFP: 260
● Newton: 18
● Marquardt: 29
– Not robust
● 0th order methods are robust and simple, but inefficient for 𝑛 > 10