Optimization Lesson 3 - Numerical Solutions of Unconstrained Multi-Variable Optimization
(ME3202)
$$x_j^{(i)} = \begin{cases} x_j^{(0)} + p & \text{for } i = j \\ x_j^{(0)} + q & \text{for } i \neq j \end{cases}$$
where $p = \dfrac{a}{n\sqrt{2}}\left(\sqrt{n+1} + n - 1\right)$
and $q = \dfrac{a}{n\sqrt{2}}\left(\sqrt{n+1} - 1\right)$
Step 2: Perform exploratory search (call subroutine) with starting point $x_j^c = x_j^{(i)}$
Step 3: Check: if $\Delta_j < \varepsilon$, then set current solution $x_j^* = X_j$ and Terminate;
Else, set $\Delta_j = \Delta_j/\alpha$ and go to Step 2
Hooke-Jeeves technique – Overall Algorithm (Contd.)
Step 4: Set $i = i + 1$
Perform pattern move: $x_j^{(i+1)} = x_j^{(i)} + \left(x_j^{(i)} - x_j^{(i-1)}\right)$
Set $x_j^{(k)} = x_j^c$
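As a rough illustration of the exploratory-move / pattern-move loop described in the steps above, here is a minimal Python sketch; the `exploratory_move` helper, the variable names, and the example call are assumptions for illustration, not the exact subroutine referred to in the slides.

```python
import numpy as np

def exploratory_move(f, base, delta):
    """Hypothetical exploratory-search helper: try +/- delta[j] along each
    coordinate and keep any change that reduces f."""
    x = np.asarray(base, dtype=float).copy()
    fx = f(x)
    for j in range(len(x)):
        for step in (delta[j], -delta[j]):
            trial = x.copy()
            trial[j] += step
            if f(trial) < fx:
                x, fx = trial, f(trial)
                break
    return x

def hooke_jeeves(f, x0, delta, alpha=2.0, eps=1e-6, max_iter=500):
    """Sketch of the overall loop: exploratory move, termination check on the
    increments, then the pattern move x^(i+1) = x^(i) + (x^(i) - x^(i-1))."""
    delta = np.asarray(delta, dtype=float)
    x_prev = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x = exploratory_move(f, x_prev, delta)            # Step 2
        if f(x) >= f(x_prev):                             # exploratory move failed
            if np.linalg.norm(delta) < eps:               # Step 3: terminate
                return x_prev
            delta = delta / alpha                         # reduce the increments
            continue
        x_pattern = x + (x - x_prev)                      # Step 4: pattern move
        x_candidate = exploratory_move(f, x_pattern, delta)
        x_prev = x_candidate if f(x_candidate) < f(x) else x
    return x_prev

# Example: the function used later in this lesson, started from (0, 0)
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
print(hooke_jeeves(f, x0=[0.0, 0.0], delta=[0.5, 0.5]))   # approaches (-1.0, 1.5)
```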
Descent Direction
• Let the objective function $f(x_j)$ be a function of $N$ variables, $x_j = (x_1, x_2, \ldots, x_N)$
• Consider the problem of minimizing $f(x_j)$
• Let the next possible solution, estimated at the $i^{th}$ iteration, be given by: $x_j^{(i+1)} = x_j^{(i)} + \alpha^{(i)} s_j^{(i)}$
• where $x_j^{(i)}$ is the previously guessed solution that did not converge, $\alpha^{(i)}$ is the distance between $x_j^{(i+1)}$ and $x_j^{(i)}$ along the search direction (otherwise known as the search step), and $s_j^{(i)}$ is the preferred search direction.
• Let us consider the search step length $\alpha^{(i)}$ to be a very small positive quantity (for better precision in the search space).
Descent Direction …contd.
• If $s_j^{(i)}$ is to be a descent direction, then
$$f\!\left(x_j^{(i+1)}\right) < f\!\left(x_j^{(i)}\right)$$
or, $f\!\left(x_j^{(i)} + \alpha^{(i)} s_j^{(i)}\right) < f\!\left(x_j^{(i)}\right)$
• By Taylor series expansion, neglecting the higher-order terms (as $\alpha^{(i)}$ is very small),
$$f\!\left(x_j^{(i)}\right) + \alpha^{(i)}\, \nabla f\!\left(x_j^{(i)}\right) \cdot s_j^{(i)} < f\!\left(x_j^{(i)}\right)$$
or, $\nabla f\!\left(x_j^{(i)}\right) \cdot s_j^{(i)} < 0$
• This is the condition for a descent direction: the dot product of the gradient vector and the descent direction must be negative, since $\alpha^{(i)}$ is positive.
• The magnitude of $\nabla f\!\left(x_j^{(i)}\right) \cdot s_j^{(i)}$ indicates how steep the descent is; a quick numerical check of this condition is sketched below.
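To make the condition $\nabla f \cdot s < 0$ concrete, the short check below evaluates the dot product for two candidate directions; the quadratic function used here is an assumed example (the same one used later in this lesson), not part of the derivation above.

```python
import numpy as np

# Assumed example function and its gradient
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])

x = np.array([0.0, 0.0])
g = grad(x)                      # gradient at the current point: [ 1, -1]

s_down = -g                      # candidate direction: negative gradient
s_up = g                         # candidate direction: gradient itself

print(np.dot(g, s_down))         # -2.0 < 0  -> descent direction
print(np.dot(g, s_up))           #  2.0 > 0  -> not a descent direction
```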
Cauchy’s Steepest Descent technique
• Cauchy's method is more efficient when the current search location is far from the actual solution. As we get closer to the optimum point, the rate of convergence decreases: the gradient gradually decreases in magnitude, so each newly estimated point lies closer and closer to the previous one.
• At each iteration a new possible solution is calculated as: $x_j^{(i+1)} = x_j^{(i)} + \alpha^{(i)} s_j^{(i)}$
• Since the gradient vector represents the direction of steepest ascent, the negative of the gradient vector denotes the direction of steepest descent; so, if $s_j^{(i)} = -\nabla f\!\left(x_j^{(i)}\right)$, the quantity $\nabla f\!\left(x_j^{(i)}\right) \cdot s_j^{(i)}$ is maximally negative, and the search direction $s_j^{(i)}$ is the steepest descent direction.
• The step size $\alpha^{(i)}$ is determined such that $f\!\left(x_j^{(i+1)}\right) = f\!\left(x_j^{(i)} - \alpha^{(i)} \nabla f\!\left(x_j^{(i)}\right)\right)$ is minimum, using an appropriate unidirectional search method (a sketch of such a search follows below).
• This minimum point becomes the current point, and the search is continued from this point.
• Terminate when the magnitude of the gradient vector becomes very small.
• Improvement of the objective function value at every iteration is guaranteed due to the descent property.
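As an illustration of the unidirectional search for the step size, the sketch below minimizes $\phi(\alpha) = f\!\left(x_j^{(i)} - \alpha\, \nabla f\!\left(x_j^{(i)}\right)\right)$ with scipy's bounded scalar minimizer; the bracketing interval and the use of scipy are assumptions, not prescribed by the slides.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Assumed example function (the one used in the worked example below)
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])

x = np.array([0.0, 0.0])
g = grad(x)

# One-dimensional function of the step length along the steepest descent direction
phi = lambda alpha: f(x - alpha * g)

res = minimize_scalar(phi, bounds=(0.0, 5.0), method="bounded")
alpha_opt = res.x                # ~1.0 for this function at (0, 0)
x_new = x - alpha_opt * g        # next point, approximately (-1, 1)
print(alpha_opt, x_new)
```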
Cauchy’s Steepest Descent – Algorithm
Step 1: Set iteration counter 𝑖 = 0
Set maximum number of iterations allowed 𝐾;
Set termination tolerances 𝜀1 and 𝜀2
Step 2: Compute $\nabla f\!\left(x_j^{(i)}\right)$
Step 3: If $\left\|\nabla f\!\left(x_j^{(i)}\right)\right\| \leq \varepsilon_2$, or if $i \geq K$, Terminate
Step 4: Perform a unidirectional search to find $\alpha^{(i)}$ such that $f\!\left(x_j^{(i+1)}\right) = f\!\left(x_j^{(i)} - \alpha^{(i)} \nabla f\!\left(x_j^{(i)}\right)\right)$ is minimum, and set $x_j^{(i+1)} = x_j^{(i)} - \alpha^{(i)} \nabla f\!\left(x_j^{(i)}\right)$
Step 5: If $\dfrac{\left\| x_j^{(i+1)} - x_j^{(i)} \right\|}{\left\| x_j^{(i)} \right\|} < \varepsilon_1$, then Terminate; Else, set $i = i + 1$ and go to Step 2
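Putting the steps together, here is a minimal sketch of the whole Cauchy loop; the bounded interval for the line search and the small constant guarding the division in Step 5 are implementation assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, eps1=1e-6, eps2=1e-6, K=100):
    """Sketch of Cauchy's steepest descent with a scalar search for alpha."""
    x = np.asarray(x0, dtype=float)
    for _ in range(K):
        g = grad(x)                                           # Step 2
        if np.linalg.norm(g) <= eps2:                         # Step 3
            break
        phi = lambda a: f(x - a * g)                          # Step 4: 1-D search
        alpha = minimize_scalar(phi, bounds=(0.0, 10.0), method="bounded").x
        x_new = x - alpha * g
        # Step 5: relative change in x (small constant avoids division by zero)
        if np.linalg.norm(x_new - x) / (np.linalg.norm(x) + 1e-12) < eps1:
            x = x_new
            break
        x = x_new
    return x

# The example solved next in this lesson; iterates approach (-1, 1.5)
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
print(steepest_descent(f, grad, [0.0, 0.0]))
```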
Cauchy’s Steepest Descent – Example
Minimize $f\!\left(x_{j=2}\right) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$, starting from the point $x_{j=2}^{(0)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.
• Given:
1) The function to be minimized: $f\!\left(x_{j=2}\right) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$
2) And the starting point $x_{j=2}^{(0)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, i.e. $\left(x_1, x_2\right) = (0, 0)$
3) The value of the function at the starting point $(0,0)$:
$$f\!\left(x_{j=2}^{(0)}\right)\Big|_{(0,0)} = f\!\begin{pmatrix} 0 \\ 0 \end{pmatrix} = f(0,0) = 0$$
So, $f\!\left(x_{j=2}^{(0)}\right)\Big|_{(0,0)} = 0$, which is not the minimal point, as can be observed by plotting the function or by calculating the gradient vector (which will be zero at the minimum).
Cauchy’s Steepest Descent – Example
• Solution:
1) Calculate the gradient vector of the function:
$$\nabla f\!\left(x_{j=2}\right) = \begin{pmatrix} \dfrac{\partial f}{\partial x_1} \\[8pt] \dfrac{\partial f}{\partial x_2} \end{pmatrix} = \begin{pmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{pmatrix}$$
2) So, the gradient direction in the starting iteration (superscripted $(0)$), at the starting point $(0,0)$, is: $\nabla f\!\left(x_{j=2}\right)\Big|_{x_{j=2}^{(0)}} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$
3) Therefore, the steepest descent direction in the starting iteration, at the starting point $(0,0)$, is: $s_{j=2}^{(0)}\Big|_{x_{j=2}^{(0)}} = -\nabla f\!\left(x_{j=2}\right)\Big|_{x_{j=2}^{(0)}} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$
Cauchy’s Steepest Descent – Example
• Solution …contd.:
4) Now let us find $x_{j=2}^{(1)}$ by the unidirectional search method: $x_{j=2}^{(1)} = x_{j=2}^{(0)} + \alpha^{(0)} s_{j=2}^{(0)}\Big|_{x_{j=2}^{(0)}}$
5) But $\alpha^{(0)}$ is not given. So we search for an optimal step length $\alpha^{(0)}$ for which the function evaluated at $x_{j=2}^{(1)}$, i.e. $f\!\left(x_{j=2}^{(1)}\right)$ or $f\!\left(x_{j=2}^{(0)} + \alpha^{(0)} s_{j=2}^{(0)}\Big|_{x_{j=2}^{(0)}}\right)$, is minimum.
6) So, $f\!\left(x_{j=2}^{(0)} + \alpha^{(0)} s_{j=2}^{(0)}\Big|_{x_{j=2}^{(0)}}\right) = f\!\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix} + \alpha^{(0)} \begin{pmatrix} -1 \\ 1 \end{pmatrix}\right) = f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right)$, which is
$$f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right) = \left(\alpha^{(0)}\right)^2 - 2\alpha^{(0)}$$
and has to be minimum for the optimum value of $\alpha^{(0)}$ (a symbolic check is sketched below).
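Item 6) can be verified symbolically before differentiating by hand in the next step; the sympy-based check below is just an illustrative aid, not part of the original solution.

```python
import sympy as sp

alpha = sp.symbols('alpha', real=True)      # stands for alpha^(0)
f = lambda x1, x2: x1 - x2 + 2*x1**2 + 2*x1*x2 + x2**2

phi = sp.expand(f(-alpha, alpha))           # -> alpha**2 - 2*alpha
alpha_opt = sp.solve(sp.diff(phi, alpha), alpha)
print(phi, alpha_opt)                       # alpha**2 - 2*alpha, [1]
```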
Cauchy’s Steepest Descent – Example
• Solution …contd.:
7) To find the minimum of $f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right)$, let us take the first derivative of $f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right)$ w.r.t. $\alpha^{(0)}$ and equate it to zero, i.e.
$$\frac{d\, f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right)}{d\alpha^{(0)}} = \frac{d\left(\left(\alpha^{(0)}\right)^2 - 2\alpha^{(0)}\right)}{d\alpha^{(0)}} = 2\alpha^{(0)} - 2 = 0$$
8) Which gives $\alpha^{(0)} = 1$. So, now we have
$$x_{j=2}^{(1)} = x_{j=2}^{(0)} + \alpha^{(0)} s_{j=2}^{(0)}\Big|_{x_{j=2}^{(0)}} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} + 1 \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \text{ or } (-1, 1)$$
Cauchy’s Steepest Descent – Example
• Solution …contd.:
9) We can check for optimality at this point $(-1, 1)$. Let us evaluate the gradient:
$$\nabla f\!\left(x_{j=2}^{(1)}\right)\Big|_{x_{j=2}^{(1)}} = \begin{pmatrix} -1 \\ -1 \end{pmatrix} \neq \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
10) So, $x_{j=2}^{(1)}$ is not the optimum; we continue to the next iteration (the numerical check below reproduces this first iteration).
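The first iteration worked out above can be reproduced numerically; the short numpy check below confirms the gradient at $(0,0)$, the new point $(-1,1)$, and the non-zero gradient there (the step length is hard-coded to the value found in item 8).

```python
import numpy as np

f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])

x0 = np.array([0.0, 0.0])
g0 = grad(x0)                    # [ 1, -1]
s0 = -g0                         # [-1,  1], steepest descent direction

alpha0 = 1.0                     # optimal step length from item 8)
x1 = x0 + alpha0 * s0            # [-1, 1]

print(g0, x1, grad(x1))          # gradient at x1 is [-1, -1] != 0, so iterate again
```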
Newton’s method
• The negative gradient direction may not always point towards the global minimum for an arbitrary multivariable nonlinear function (it does so only in the case of circular contours).
• Newton's method, on the other hand, uses information about the second derivative, or the Hessian, of the objective function (hence it is called a second-order method), which takes the curvature of the objective function surface into account and results in identifying a better search direction.
• The search direction is given by:
$$s_j^{(i)} = -\left[\nabla^2 f\!\left(x_j^{(i)}\right)\right]^{-1} \nabla f\!\left(x_j^{(i)}\right)$$
• Due to the improved and updated search direction, faster convergence is also achieved.
• If the matrix $\left[\nabla^2 f\!\left(x_j^{(i)}\right)\right]^{-1}$ is positive semi-definite, the direction $s_j^{(i)}$ is a descent direction. This is very well followed when the search starts very close to the optimal point. At the minimal point, the matrix becomes positive definite (a computational sketch of this search direction follows below).
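Numerically, the Newton direction is usually obtained by solving the linear system $\nabla^2 f \, s = -\nabla f$ rather than forming the inverse explicitly; the minimal sketch below does this for the example function of this lesson (reusing that function here is an assumption for illustration).

```python
import numpy as np

# Gradient and (constant) Hessian of the example function of this lesson
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
hess = lambda x: np.array([[4.0, 2.0],
                           [2.0, 2.0]])

x = np.array([0.0, 0.0])
# Newton search direction: s = -[Hessian]^(-1) * gradient, via a linear solve
s = np.linalg.solve(hess(x), -grad(x))
print(s)    # [-1.   1.5]; for this quadratic, x + s is already the minimum
```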
Newton’s method – Algorithm
Step 1: Set iteration counter 𝑖 = 0
Set maximum number of iterations allowed 𝐾; Set termination tolerances 𝜀1 and 𝜀2
Step 2: Compute $\nabla f\!\left(x_j^{(i)}\right)$
Step 3: If $\left\|\nabla f\!\left(x_j^{(i)}\right)\right\| \leq \varepsilon_2$, or if $i \geq K$, Terminate
Step 4: Compute $\nabla^2 f\!\left(x_j^{(i)}\right)$ and its inverse, i.e. $\left[\nabla^2 f\!\left(x_j^{(i)}\right)\right]^{-1}$
Newton’s method – Algorithm
Step 5: Compute $\alpha^{(i)}$, such that $f\!\left(x_j^{(i+1)}\right) = f\!\left(x_j^{(i)} - \alpha^{(i)} \left[\nabla^2 f\!\left(x_j^{(i)}\right)\right]^{-1} \nabla f\!\left(x_j^{(i)}\right)\right)$ is minimum, using a unidirectional search.
Step 6: If $\dfrac{\left\| x_j^{(i+1)} - x_j^{(i)} \right\|}{\left\| x_j^{(i)} \right\|} < \varepsilon_1$, then Terminate; Else, set $i = i + 1$ and go to Step 2
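Finally, a compact sketch of the whole Newton loop under the same assumptions as before (a bounded scalar search for $\alpha^{(i)}$ and a small constant guarding the division in Step 6); the example call uses the quadratic function from the steepest-descent example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def newton_method(f, grad, hess, x0, eps1=1e-8, eps2=1e-8, K=50):
    """Sketch of Newton's method with a unidirectional search on alpha."""
    x = np.asarray(x0, dtype=float)
    for _ in range(K):
        g = grad(x)                                          # Step 2
        if np.linalg.norm(g) <= eps2:                        # Step 3
            break
        s = np.linalg.solve(hess(x), -g)                     # Step 4: Newton direction
        phi = lambda a: f(x + a * s)                         # Step 5: 1-D search
        alpha = minimize_scalar(phi, bounds=(0.0, 2.0), method="bounded").x
        x_new = x + alpha * s
        # Step 6: relative change in x (small constant avoids division by zero)
        if np.linalg.norm(x_new - x) / (np.linalg.norm(x) + 1e-12) < eps1:
            x = x_new
            break
        x = x_new
    return x

# For this quadratic a single Newton step (alpha ~ 1) essentially reaches (-1, 1.5)
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
hess = lambda x: np.array([[4.0, 2.0], [2.0, 2.0]])
print(newton_method(f, grad, hess, [0.0, 0.0]))
```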