
DESIGN OF FRICTIONAL MACHINE ELEMENTS

(ME3202)

Module 7: Optimization in Design


Lesson 3: Numerical Solution Techniques for the
Unconstrained Multi-Variable Optimization Problems

B. Tech., Mech. Engg., 6th Sem, 2022-23


Unconstrained Multi-Variable Optimization
• The objective function is a function of multiple design variables.
• The algorithms can be broadly classified into two categories — direct search methods and
gradient-based methods.
• Direct search methods may or may not be directional.
• For any directional search method, direct or indirect, a search direction and a search step
(resolution) are required; together they form the ‘search increment’.
1. Preferred search direction:
The function value, at any point in $N$-dimensional space, increases at the fastest
rate along the gradient direction. Hence, the gradient direction is the direction of
‘steepest ascent’ and may be preferred for a maximization problem.
The opposite, or negative gradient, direction is the ‘steepest descent’ direction
and may be preferred for a minimization problem.
The gradient-based direction is a local property.
Unconstrained Multi-Variable Optimization
2. Search step:
The search step is a scalar multiplier that represents the resolution, or precision, of
the search increment.
It can, generally, be taken as a small positive quantity.
Direct Search Methods
• Many real-world optimization problems are complex and not continuous; for these, function
evaluation is computationally costly and gradient information is difficult to obtain. For such
problems, direct search techniques may be found useful.
• In these methods, function values are directly evaluated at different points in the search
space.
• Direct search methods use algorithms which
• either 1) completely eliminate the concept of search direction and manipulate a set of
points to create a better set of points, like the evolutionary optimization technique,
• or 2) use complex search directions to effectively decouple the nonlinearity of the
function, like unidirectional search and Hooke-Jeeve’s pattern search technique.
Evolutionary Search technique (not directional)
• The algorithm requires $(2^N + 1)$ points, of which $2^N$ are corner points of an $N$-dimensional
hypercube centred on the other point.
• All $(2^N + 1)$ function values are compared and the best point is identified.
• In the next iteration, another hypercube is formed around this best point. If, at any
iteration, an improved point is not found, the size of the hypercube is reduced.
• This process continues until the hypercube becomes very small.
Evolutionary Search – Algorithm
Step 1: Assume an initial point $x_j^{(0)}$.
Assume size reduction parameters $\Delta_j$ for all design variables, $j = 1, 2, \ldots, N$.
Assume a termination parameter $\varepsilon$.
Set $x_j = x_j^{(0)}$.
Step 2: Check: if $\Delta_j < \varepsilon$, then Terminate;
Else, create $2^N$ points by adding and subtracting $\Delta_j/2$ from each variable at the point $x_j$.
Step 3: Compute function values at all $(2^N + 1)$ points.
Find the point having the minimum function value.
Designate the minimum point as $\bar{x}_j$.
Step 4: Check: if $\bar{x}_j = x_j^{(0)}$, then reduce the size parameters, $\Delta_j = \Delta_j/2$, and go to Step 2;
Else, set $x_j^{(0)} = \bar{x}_j$ and go to Step 2.
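A minimal Python sketch of the evolutionary search algorithm above is given below; the function name `evolutionary_search`, its parameters, and the quadratic test function are illustrative assumptions, not part of the lesson.

```python
# Sketch of the evolutionary (hypercube) search: compare the centre point with
# the 2^N corners of a hypercube, move to the best, shrink when stuck.
import itertools
import numpy as np

def evolutionary_search(f, x0, delta, eps=1e-6):
    x_best = np.asarray(x0, dtype=float)
    delta = np.asarray(delta, dtype=float)
    f_best = f(x_best)
    while np.all(delta >= eps):
        # Step 2: build the 2^N corner points of a hypercube centred on x_best
        corners = [x_best + np.array(signs) * delta / 2.0
                   for signs in itertools.product((-1.0, 1.0), repeat=len(x_best))]
        # Step 3: evaluate all (2^N + 1) points and pick the minimum
        candidates = corners + [x_best]
        values = [f(p) for p in candidates]
        k = int(np.argmin(values))
        if np.allclose(candidates[k], x_best):
            delta = delta / 2.0                # Step 4: no improvement -> shrink
        else:
            x_best, f_best = candidates[k], values[k]
    return x_best, f_best

# Example: minimize a simple convex quadratic starting from (2, 2)
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2
print(evolutionary_search(f, x0=[2.0, 2.0], delta=[1.0, 1.0]))
```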
Unidirectional search
• Let the $N$-dimensional points $x_j$ represent the design variables to be optimized, which
extremize the objective function $f(x_j)$.
• A unidirectional search is a one-dimensional search performed from a point $x_j^{(i)}$ along a
specified (descent) direction $s_j^{(i)}$ available at the $i$-th iteration, comparing function
values only along that line in successive iterations.
• The next possible solution, or point $x_j^{(i+1)}$ in the $s_j^{(i)}$ direction, can be expressed as
$x_j^{(i+1)}(\alpha) = x_j^{(i)} + \alpha\, s_j^{(i)}$ in terms of a single scalar variable $\alpha$. Here $\alpha$ is the distance between
$x_j^{(i+1)}$ and $x_j^{(i)}$ in the chosen direction, so it sets the precision or resolution of the search.
• Now the objective function $f\!\left(x_j^{(i+1)}\right)$ will be evaluated as $f(\alpha)$, since it has become a
single-variable function of the scalar variable $\alpha$ only.
Unidirectional search …contd.
• In order to find the minimum point on the specified line, we can now use any suitable
single-variable search method described earlier.
• Once the optimal value $\alpha^{*}$ is found, the corresponding point $x_j^{(i+1)}(\alpha^{*})$ (the design
variables) can also be found using the above equation.
• A few successive unidirectional searches can be used to approach the optimum.
• Unidirectional search is used in many direction-based search schemes, categorized under
direct or indirect (gradient-based) methods.
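The sketch below illustrates a unidirectional search implemented with a golden-section method (one of the single-variable techniques referred to above); the bracket $[0, \alpha_{max}]$ and the test function are assumptions made for illustration.

```python
# Minimize f(x + alpha*s) over alpha: the multivariable function is reduced to
# a single-variable function of alpha along a fixed direction s.
import numpy as np

def unidirectional_search(f, x, s, alpha_max=5.0, tol=1e-6):
    """Return alpha* minimizing f(x + alpha*s) on [0, alpha_max]."""
    phi = lambda a: f(np.asarray(x) + a * np.asarray(s))
    gr = (np.sqrt(5.0) - 1.0) / 2.0          # golden-section ratio
    lo, hi = 0.0, alpha_max
    while hi - lo > tol:
        c, d = hi - gr * (hi - lo), lo + gr * (hi - lo)
        if phi(c) < phi(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)

# Example: search from (0, 0) along the direction (-1, 1)
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
alpha_star = unidirectional_search(f, x=[0.0, 0.0], s=[-1.0, 1.0])
print(alpha_star)   # close to 1, as in the Cauchy example later in the lesson
```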
Simplex Search Technique
• Also known as the ‘Regular Sequential Search’.
• In $N$ dimensions, a simplex is a polyhedron composed of $N + 1$ equidistant vertices; for example,
an equilateral triangle is a simplex in 2D (a 2-simplex), a tetrahedron is a simplex in 3D (a
3-simplex), etc.
• A new simplex can be formed by projecting any chosen vertex a suitable distance (so that the
vertices remain equidistant from each other) through the centroid of the remaining vertices of the
old simplex. The newly projected point (called the reflection) replaces the old vertex, and a new
simplex is formed.
• This can be done in any direction on any face (the geometric entity used to form the simplex,
i.e. points or vertices, lines, planes) of the old simplex.
• The points chosen for the initial simplex should not form a zero-volume $N$-dimensional
simplex. Thus, in a function with two variables, the chosen three points of the simplex
should not lie along a line. Similarly, in a function with three variables, the four points of the
initial simplex should not lie on a plane.
Simplex Search Technique …contd.
• This technique is used for searching new possible solutions.
• In the first iteration, for an $N$-variable objective function, a regular $N$-simplex is
constructed.
• Generation of a regular $N$-simplex from given data:
 Generally, the simplex search starts by constructing an $N$-dimensional regular (all
sides equal) simplex.
 For an $N$-variable objective function, initial values of $N$ variables or fewer
can be provided in the problem definition.
 The other points of the $N$-simplex are then chosen such that they constitute an
$N$-simplex (i.e., the points must be equidistant from all given points and from each other),
 and together they must not constitute zero area (no collinear points) or zero
volume (no coplanar points), respectively, for a 2-simplex or higher.
Simplex Search Technique …contd.
 To generate a regular $N$-simplex with sides of length $a$, for the first time, if one initial
point $x_j^{(0)}$ is given, then the other points are obtained by:
$$x_j^{(i)} = \begin{cases} x_j^{(0)} + p & \text{for } j = i \\ x_j^{(0)} + q & \text{for } j \neq i \end{cases}$$
where
$$p = \frac{a}{N\sqrt{2}}\left(\sqrt{N+1} + N - 1\right), \qquad q = \frac{a}{N\sqrt{2}}\left(\sqrt{N+1} - 1\right)$$
 Here, $i$ represents the $1^{st}, 2^{nd}, \ldots, N^{th}$ point and $j$ represents the coordinates of a
point.
 From the next iteration onwards, the simplex is automatically updated and
reconstituted.
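The following sketch applies the $p$ and $q$ increments above to generate a regular simplex in Python; the starting point and side length are illustrative choices.

```python
# Build the (N+1) vertices of a regular simplex with side a from one vertex x0.
import numpy as np

def regular_simplex(x0, a=1.0):
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    p = a / (n * np.sqrt(2.0)) * (np.sqrt(n + 1.0) + n - 1.0)
    q = a / (n * np.sqrt(2.0)) * (np.sqrt(n + 1.0) - 1.0)
    vertices = [x0]
    for i in range(n):                 # the i-th new vertex
        v = x0 + q                     # add q to every coordinate ...
        v[i] = x0[i] + p               # ... except coordinate i, which gets p
        vertices.append(v)
    return np.array(vertices)

verts = regular_simplex([0.0, 0.0], a=1.0)
print(verts)
# all pairwise distances should equal the side length a = 1
print([np.linalg.norm(verts[i] - verts[j])
       for i in range(3) for j in range(i + 1, 3)])
```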
Simplex Search Technique …contd.
• Now that the $N$-simplex is formed and all $N + 1$ vertices are defined, the objective function
needs to be evaluated at every vertex of the simplex.
• Among these, the worst point (having the highest function value, for a minimization
problem) is identified and replaced by its reflection, following the properties of the simplex.
 First, the points are sorted in order of deteriorating
function value (ascending order, for a minimization problem), like:
$$f\!\left(x_j^{(1)}\right) < f\!\left(x_j^{(2)}\right) < \cdots < f\!\left(x_j^{(N+1)}\right)$$
Consider $f\!\left(x_j^{(N+1)}\right)$ to be the worst value and, accordingly, $x_j^{(N+1)}$
the worst point.
 Let, at point 3 in the picture, the function value $f\!\left(x_j^{(3)}\right)$ be the worst (highest). So,
point 3 has to be replaced.
Simplex Search Technique …contd.
 The direction of search is oriented away from this worst point, through the centroid of
the remaining simplex (say, the centroid of points 1 and 2).
 So now, leaving out the worst point, the centroid $\bar{x}_j$ is calculated for the remaining $N$
vertex points, i.e. $x_j^{(1)}, x_j^{(2)}, \ldots, x_j^{(N)}$, as: $\bar{x}_j = \frac{1}{N}\sum_{i=1}^{N} x_j^{(i)}$
 A new point is selected in the reflected direction (say, point 4), preserving the
simplex’s geometrical requirements, i.e. shape (not always; there are modifications to the
simplex method) and non-zero area (non-collinear) for a 2-simplex or non-zero volume
(non-coplanar) for higher-order simplices.
 The new point location is given by: $x_j^{r} = x_j^{(N+1)} + r\left(\bar{x}_j - x_j^{(N+1)}\right)$
Note: In the above equation, $r = 0$ gives the original point (point 3), $r = 1$ gives the
centroid $\bar{x}_j$ itself, and $r = 2$ retains the regularity of the simplex, as the reflection
becomes symmetric.
 Thus, putting $r = 2$, the new point (point 4) is obtained as: $x_j^{new} = 2\bar{x}_j - x_j^{worst}$
Simplex Search Technique …contd.
• In this way a new simplex is formed (constituting points 1, 2, and 4), and in order to estimate
the function values at its vertices, only one new evaluation is required this time, at the new
point (point 4, because the function has already been evaluated at points 1 and 2).
• If the evaluated function value is not improved, keep the old simplex and replace the
second-worst point instead, and continue with the process.
• Again, in the next iteration, by comparing the function values at the vertices, a new search
direction is determined.
• The method proceeds this way, iterating and rejecting one vertex at a time, until the simplex
encloses the optimum and/or is very close to the optimum point. We may not get any
closer to the optimum if we continue further, and the simplex will then repeat itself in all future
iterations.
• In this case, we need to change the size of the simplex to enforce convergence. We may
consider the midpoints of the sides of the last simplex as vertices of a new simplex.
• The search is terminated when the simplex becomes small enough or the function values at
all the vertices are very close to each other.
• The main disadvantage of this technique is the large amount of repetition of the process.
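A compact sketch of the reflection-based simplex loop described above is shown below. For simplicity, when a reflection does not improve the worst value it shrinks the whole simplex towards the best vertex (a simplification of the second-worst/midpoint rules mentioned above), so it should be read as an illustration rather than the exact procedure of the slides.

```python
# Reflect the worst vertex through the centroid of the others (r = 2); shrink
# the simplex when reflection does not help; stop when the vertex values agree.
import numpy as np

def simplex_search(f, vertices, max_iter=200, tol=1e-8):
    verts = np.array(vertices, dtype=float)
    for _ in range(max_iter):
        fvals = np.array([f(v) for v in verts])
        order = np.argsort(fvals)            # best ... worst
        verts, fvals = verts[order], fvals[order]
        if np.std(fvals) < tol:              # function values nearly equal
            break
        worst = verts[-1]
        centroid = verts[:-1].mean(axis=0)   # centroid of the remaining vertices
        reflected = 2.0 * centroid - worst   # x_new = 2*x_bar - x_worst
        if f(reflected) < fvals[-1]:
            verts[-1] = reflected
        else:
            # no improvement: shrink the simplex towards the best vertex
            verts = verts[0] + 0.5 * (verts - verts[0])
    return verts[0], f(verts[0])

# Example with the quadratic used later in the lesson
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
start = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(simplex_search(f, start))   # should approach the minimum near (-1, 1.5)
```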
Univariate Search technique
• Univariate search refers to minimizing a multivariable objective function by considering
‘one variable at a time’, while keeping the other variables constant.
• Any single-variable optimization technique can be implemented.
• The objective function can be minimized by searching for the optimum of each design variable
in both its positive and negative directions, if necessary.
• The univariate search may not always converge but, when it does, it is an effective (quick to
converge) technique.
• This method is most effective for separable quadratic objective functions of the type
$f(x_j) = \sum_j c_j x_j^2$, because the search directions are aligned with the variable axes.
• But this method may not be as effective for functions whose contours are oriented along
directions other than the variable axes.
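The sketch below illustrates the one-variable-at-a-time idea: each variable is minimized in turn by a golden-section search while the others are held fixed. The bracket width, sweep count, and test function are illustrative assumptions.

```python
# One-variable-at-a-time minimization with repeated sweeps over the variables.
import numpy as np

def univariate_search(f, x0, bracket=2.0, sweeps=20, tol=1e-6):
    x = np.asarray(x0, dtype=float)
    for _ in range(sweeps):
        x_old = x.copy()
        for j in range(len(x)):                # one variable at a time
            phi = lambda t: f(np.concatenate([x[:j], [t], x[j+1:]]))
            # golden-section search for the j-th variable on [x_j - b, x_j + b]
            lo, hi = x[j] - bracket, x[j] + bracket
            gr = (np.sqrt(5.0) - 1.0) / 2.0
            while hi - lo > tol:
                c, d = hi - gr * (hi - lo), lo + gr * (hi - lo)
                if phi(c) < phi(d):
                    hi = d
                else:
                    lo = c
            x[j] = 0.5 * (lo + hi)
        if np.linalg.norm(x - x_old) < tol:    # variables no longer change
            break
    return x, f(x)

# Separable quadratic: the univariate search finds the minimum in one sweep
f = lambda x: 3*x[0]**2 + 5*x[1]**2
print(univariate_search(f, [1.5, -1.0]))
```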
Pattern Search technique
• Instead of always keeping the search direction parallel to the variable axes, it can be
changed in a favourable manner, i.e. towards the optimum point. These favourable
directions are called ‘pattern directions’.
• The process of going from one point (solution) to another is called a ‘move’.
• For minimization of a function, a move is a
‘success’ if the function value decreases after the
move; otherwise, it is a ‘failure’. (Vice versa for a
maximization problem.)
• Pattern direction:
For univariate search: 1 − 2 − 3 − 4 − 5 ⋯
For pattern search: 1 − 3 − 5 ⋯ (always towards the optimum)
Pattern Search technique
• For an $N$-variable function, we need $N$ linearly independent search directions to ‘move’.
• There are many possible combinations of search directions, of which some will
converge faster than others.
• In general, a pattern search method takes $N$ univariate steps and then searches for the
minimum along the pattern direction $S_j = x_j - x_{j-N}$, where $x_j$ is the point obtained at
the end of the $N$ univariate steps and $x_{j-N}$ is the starting point before taking the $N$
univariate steps.
Hooke-Jeeve’s Pattern Search technique
• Each sequence consists of two kinds of moves – 1) an exploratory move, and 2) a pattern move.
• The exploratory move is performed systematically in the vicinity of the current point,
using a univariate technique, to explore the local behaviour of the objective function. The
goal is to obtain the best point around the current point. If the function value is improved
(minimized or maximized, depending upon the objective) after the exploratory move, it
is called a ‘success’.
• If the exploratory move is successful, the pattern move is then performed using the two best
points (the current one, found by exploration, and the one found in the previous iteration) to
take advantage of the pattern direction: $x_j^{(i+1)} = x_j^{(i)} + \left(x_j^{(i)} - x_j^{(i-1)}\right)$. Again, if the function
value improves, the pattern move is accepted (a success).
• If the pattern move does not improve the solution, the exploratory move is performed
with a reduced step size.
• The search is terminated when the step size becomes sufficiently small (as specified).
Hooke-Jeeve’s technique – Overall Algorithm
Step 1: Set $i = 0$.
Input the number of variables $N$; assume a starting solution $x_j^{(i)}$.
Assume a search step reduction factor $\alpha > 1$ and a termination tolerance $\varepsilon$.
Assume fixed perturbation sizes $\Delta_j$ for each variable.
Step 2: Perform an exploratory search (call the subroutine) with starting point $x_j^{c} = x_j^{(i)}$.
Check: if the exploratory search is a ‘success’, set $x_j^{(i+1)} = X_j$ and go to Step 4;
Else, continue to Step 3.
Step 3: Check: if $\Delta_j < \varepsilon$, then set the current solution $x_j^{*} = X_j$ and Terminate;
Else, set $\Delta_j = \Delta_j/\alpha$ and go to Step 2.
Hooke-Jeeve’s technique – Overall Algorithm (Contd.)
Step 4: Set $i = i + 1$.
Perform the pattern move: $x_j^{(i+1)} = x_j^{(i)} + \left(x_j^{(i)} - x_j^{(i-1)}\right)$
Step 5: Perform an exploratory search (call the subroutine) with $x_j^{c} = x_j^{(i+1)}$.
Check: if the exploratory search is a ‘success’, set $x_j^{(i+1)} = X_j$.
Step 6: Check: if $f\!\left(x_j^{(i+1)}\right) < f\!\left(x_j^{(i)}\right)$, then go to Step 4;
Else, go to Step 3.
Algorithm for Exploratory Search Subroutine
Step 1: Set $k = 1$.
Set $x_j^{(k)} = x_j^{c}$.
Step 2: Calculate $f = f\!\left(x_j^{(k)}\right)$, $f^{+} = f\!\left(x_j^{(k)} + \Delta_j\right)$, and $f^{-} = f\!\left(x_j^{(k)} - \Delta_j\right)$
(the perturbation being applied to one variable at a time).
Find $f_{min} = \min\left(f, f^{+}, f^{-}\right)$.
Set $X_j$ equal to the $x_j^{(k)}$ corresponding to $f_{min}$.
Step 3: Check: if $k \neq N$, then set $k = k + 1$ and go to Step 2;
Else, $X_j$ is the result; continue to Step 4.
Step 4: Check: if $X_j \neq x_j^{c}$, the exploratory move is a success; Else, it is a failure.
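Putting the two subroutines together, the following sketch is one possible Python rendering of the Hooke-Jeeve's method above; the parameter values (initial Δ, α, ε) and the test function are assumed for illustration.

```python
# Hooke-Jeeves: alternate exploratory moves (+/- delta per variable) with
# pattern moves built from the two most recent base points.
import numpy as np

def exploratory_move(f, xc, delta):
    """Perturb each variable by +/- delta_j in turn, keeping the best value."""
    x = np.asarray(xc, dtype=float).copy()
    for j in range(len(x)):
        trial = [x.copy(), x.copy(), x.copy()]
        trial[1][j] += delta[j]
        trial[2][j] -= delta[j]
        x = min(trial, key=f)
    success = not np.allclose(x, xc)
    return x, success

def hooke_jeeves(f, x0, delta=0.5, alpha=2.0, eps=1e-6, max_iter=1000):
    x_prev = np.asarray(x0, dtype=float)
    delta = np.full(len(x_prev), float(delta))
    x, success = exploratory_move(f, x_prev, delta)
    for _ in range(max_iter):
        if not success:
            if np.all(delta < eps):            # Step 3: terminate
                return x, f(x)
            delta /= alpha                     # reduce the perturbation size
            x, success = exploratory_move(f, x, delta)
            continue
        # Steps 4-5: pattern move from the two best points, then explore again
        xp = x + (x - x_prev)
        x_new, _ = exploratory_move(f, xp, delta)
        if f(x_new) < f(x):                    # Step 6: pattern move accepted
            x_prev, x = x, x_new
            success = True
        else:
            x_prev, success = x, False
    return x, f(x)

# Example with the quadratic used in the Cauchy example below
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
print(hooke_jeeves(f, [0.0, 0.0]))             # should approach (-1, 1.5)
```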
Indirect or Gradient Based Methods
• The general structure of all the gradient-based methods is more or less the same. These methods
differ in the determination of the search direction and the step length.

Descent Direction
• Let the objective function $f(x_j)$ be a function of $N$ variables, given by $x_j = (x_1, x_2, \ldots, x_N)$.
• Consider a problem of ‘minimization of $f(x_j)$’.
• Let the next possible solution be given by $x_j^{(i+1)} = x_j^{(i)} + \alpha^{(i)} s_j^{(i)}$, estimated at the $i$-th
iteration,
• where $x_j^{(i)}$ is the previously guessed solution that did not converge, $\alpha^{(i)}$ is the distance between
$x_j^{(i+1)}$ and $x_j^{(i)}$ in the search direction, otherwise known as the search step, and $s_j^{(i)}$ is the
preferred search direction.
• Let us consider the search step length $\alpha^{(i)}$ to be a very small positive quantity (for better
precision in the search space).
Descent Direction …contd.
• If $s_j^{(i)}$ has to be a descent direction, then
$$f\!\left(x_j^{(i+1)}\right) < f\!\left(x_j^{(i)}\right)$$
or, $f\!\left(x_j^{(i)} + \alpha^{(i)} s_j^{(i)}\right) < f\!\left(x_j^{(i)}\right)$
• By Taylor series expansion and neglecting higher-order terms (as $\alpha^{(i)}$ is small),
$$f\!\left(x_j^{(i)}\right) + \alpha^{(i)}\, \nabla f\!\left(x_j^{(i)}\right) \cdot s_j^{(i)} < f\!\left(x_j^{(i)}\right)$$
or, $\nabla f\!\left(x_j^{(i)}\right) \cdot s_j^{(i)} < 0$
• This is the condition for a descent direction: the dot product of the gradient vector and
the descent direction is negative, since $\alpha^{(i)}$ is positive.
• The magnitude of $\nabla f\!\left(x_j^{(i)}\right) \cdot s_j^{(i)}$ indicates how steep the descent is.
Cauchy’s Steepest Descent technique
• Cauchy’s method is more efficient when the search location is far from the actual
solution. But, as we get closer to the optimum point, the rate of convergence decreases, as
the gradient value decreases gradually and the estimated new point is located closer and
closer to the previous point.
• At each iteration, a new possible solution is calculated as: $x_j^{(i+1)} = x_j^{(i)} + \alpha^{(i)} s_j^{(i)}$
• Since the gradient vector represents the direction of steepest ascent, the negative of the
gradient vector denotes the direction of steepest descent; so, if $s_j^{(i)} = -\nabla f\!\left(x_j^{(i)}\right)$, the
quantity $\nabla f\!\left(x_j^{(i)}\right) \cdot s_j^{(i)}$ is maximally negative, and the search direction $s_j^{(i)}$ is the
steepest descent direction.
• The step size $\alpha^{(i)}$ is determined, using an appropriate unidirectional search method, such that
$f\!\left(x_j^{(i+1)}\right) = f\!\left(x_j^{(i)} - \alpha^{(i)} \nabla f\!\left(x_j^{(i)}\right)\right)$ is minimum.
• This minimum point becomes the current point and the search is continued from it.
• Terminate when the magnitude of the gradient vector becomes very small.
• An improvement in the objective function value at every iteration is guaranteed by the
descent property.
Cauchy’s Steepest Descent – Algorithm
Step 1: Set the iteration counter $i = 0$.
Set the maximum number of iterations allowed, $K$.
Set termination tolerances $\varepsilon_1$ and $\varepsilon_2$.
Assume a reasonable starting value $x_j^{(i)}$.
Step 2: Compute $\nabla f\!\left(x_j^{(i)}\right)$.
Step 3: Check for convergence: if $\left\|\nabla f\!\left(x_j^{(i)}\right)\right\| < \varepsilon_1$, then Terminate;
Else if $i = K$, then Terminate; Else, continue to Step 4.
Cauchy’s Steepest Descent – Algorithm
Step 4: Compute $\alpha^{(i)}$, such that $f\!\left(x_j^{(i+1)}\right) = f\!\left(x_j^{(i)} - \alpha^{(i)} \nabla f\!\left(x_j^{(i)}\right)\right)$ is minimum, using a
unidirectional search.
Check: if $\left|\nabla f\!\left(x_j^{(i+1)}\right) \cdot \nabla f\!\left(x_j^{(i)}\right)\right| < \varepsilon_2$, then Terminate.
Step 5: If $\dfrac{\left\|x_j^{(i+1)} - x_j^{(i)}\right\|}{\left\|x_j^{(i)}\right\|} < \varepsilon_1$, then Terminate; Else, set $i = i + 1$ and go to Step 2.
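A possible Python rendering of the above algorithm is sketched below, with a golden-section unidirectional search used for the step length; the gradient is supplied analytically, and the tolerances and the α bracket are illustrative assumptions.

```python
# Cauchy's steepest descent: move along -grad f, choosing the step length by
# a unidirectional (golden-section) search at every iteration.
import numpy as np

def golden_section(phi, lo=0.0, hi=5.0, tol=1e-8):
    gr = (np.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c, d = hi - gr * (hi - lo), lo + gr * (hi - lo)
        if phi(c) < phi(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)

def steepest_descent(f, grad, x0, eps1=1e-6, max_iter=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < eps1:          # Step 3: gradient nearly zero
            break
        s = -g                                # steepest-descent direction
        alpha = golden_section(lambda a: f(x + a * s))   # Step 4
        x_new = x + alpha * s
        if np.linalg.norm(x_new - x) < eps1 * max(np.linalg.norm(x), 1.0):
            x = x_new                         # Step 5: negligible progress
            break
        x = x_new
    return x, f(x)

# The lesson's example: f = x1 - x2 + 2*x1^2 + 2*x1*x2 + x2^2 from (0, 0)
f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
print(steepest_descent(f, grad, [0.0, 0.0]))   # converges towards (-1, 1.5)
```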
Cauchy’s Steepest Descent – Example
Minimize $f\!\left(x_{j=2}\right) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$, starting from the point $x_{j=2}^{(0)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.
• Given:
1) Clearly, $f\!\left(x_{j=2}\right) = f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$
2) And, the starting point is $x_{j=2}^{(0)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, or $(x_1, x_2) = (0, 0)$.
3) The value of the function at the starting point $(0, 0)$ is
$$f\!\left(x_{j=2}^{(0)}\right) = f(0, 0) = 0$$
So $f\!\left(x_{j=2}^{(0)}\right) = 0$, which is not the minimal point, as can be observed by
plotting the curve or by calculating the gradient vector (which would be zero at the minimum).
Cauchy’s Steepest Descent – Example
• Solution:
1) Calculate the gradient vector of the function:
$$\nabla f\!\left(x_{j=2}\right) = \begin{pmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{pmatrix} = \begin{pmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{pmatrix}$$
2) So, the gradient direction in the starting iteration (superscripted $(0)$), at the starting
point $(0, 0)$, is:
$$\nabla f\!\left(x_{j=2}^{(0)}\right) = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
3) Therefore, the steepest descent direction in the starting iteration, at the starting
point $(0, 0)$, is:
$$s_{j=2}^{(0)} = -\nabla f\!\left(x_{j=2}^{(0)}\right) = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$$
Cauchy’s Steepest Descent – Example
• Solution …contd.:
4) Now let us find $x_{j=2}^{(1)}$ by the unidirectional search method: $x_{j=2}^{(1)} = x_{j=2}^{(0)} + \alpha^{(0)} s_{j=2}^{(0)}$
5) But $\alpha^{(0)}$ is not given. So we search for an optimal step length $\alpha^{(0)}$ for which the
function evaluated at $x_{j=2}^{(1)}$, i.e. $f\!\left(x_{j=2}^{(1)}\right) = f\!\left(x_{j=2}^{(0)} + \alpha^{(0)} s_{j=2}^{(0)}\right)$, is minimum.
6) So, $f\!\left(x_{j=2}^{(0)} + \alpha^{(0)} s_{j=2}^{(0)}\right) = f\!\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix} + \alpha^{(0)}\begin{pmatrix} -1 \\ 1 \end{pmatrix}\right) = f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right)$, which is
$$f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right) = \left(\alpha^{(0)}\right)^2 - 2\alpha^{(0)}$$
and this has to be minimum for the optimum value of $\alpha^{(0)}$.
Cauchy’s Steepest Descent – Example
• Solution …contd.:
7) To find the minimum of $f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right)$, let us take its first derivative
with respect to $\alpha^{(0)}$ and equate it to zero, i.e.
$$\frac{d f\!\left(-\alpha^{(0)}, \alpha^{(0)}\right)}{d\alpha^{(0)}} = \frac{d\!\left(\left(\alpha^{(0)}\right)^2 - 2\alpha^{(0)}\right)}{d\alpha^{(0)}} = 2\alpha^{(0)} - 2 = 0$$
8) This gives $\alpha^{(0)} = 1$. So, now we have
$$x_{j=2}^{(1)} = x_{j=2}^{(0)} + \alpha^{(0)} s_{j=2}^{(0)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} + 1 \cdot \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \text{ or } (-1, 1)$$
Cauchy’s Steepest Descent – Example
• Solution …contd.:
9) We can check for optimality at this point $(-1, 1)$. Let us evaluate the gradient:
$$\nabla f\!\left(x_{j=2}^{(1)}\right) = \begin{pmatrix} -1 \\ -1 \end{pmatrix} \neq \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
10) So, $x_{j=2}^{(1)}$ is not the optimum, and we continue to the next iteration.
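The first iteration worked out above can be checked numerically with a few lines of code (a verification sketch, not part of the original example):

```python
# Numeric check of the first steepest-descent iteration: alpha^(0) = 1 gives
# the point (-1, 1), where the gradient is (-1, -1), i.e. not yet zero.
import numpy as np

f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])

x0 = np.array([0.0, 0.0])
s0 = -grad(x0)                       # steepest-descent direction: (-1, 1)
alpha = 1.0                          # from d/dalpha (alpha^2 - 2*alpha) = 0
x1 = x0 + alpha * s0
print(s0, x1, grad(x1))              # [-1. 1.] [-1. 1.] [-1. -1.] -> not yet optimal
```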
Newton’s method
• The negative gradient direction may not always point towards the minimum for an
arbitrary multivariable nonlinear function (it does so only in the case of circular contours).
• Newton’s method, on the other hand, uses information about the second
derivative, or the Hessian, of the objective function (hence it is called a second-order method),
which takes the curvature of the objective function surface into account and results in
identifying a better search direction.
• The search direction is given by:
$$s_j^{(i)} = -\left[\nabla^2 f\!\left(x_j^{(i)}\right)\right]^{-1} \nabla f\!\left(x_j^{(i)}\right)$$
• Due to the improved and updated search direction, faster convergence is also achieved.
• If the matrix $\left[\nabla^2 f\!\left(x_j^{(i)}\right)\right]^{-1}$ is positive definite, the direction $s_j^{(i)}$ is a descent direction.
This condition is very well satisfied when the search starts very close to the optimal point: at the
minimal point, the matrix becomes positive definite.
Newton’s method – Algorithm
Step 1: Set the iteration counter $i = 0$.
Set the maximum number of iterations allowed, $K$; set termination tolerances $\varepsilon_1$ and $\varepsilon_2$.
Assume a reasonable starting value $x_j^{(i)}$.
Step 2: Compute $\nabla f\!\left(x_j^{(i)}\right)$.
Step 3: Check for convergence: if $\left\|\nabla f\!\left(x_j^{(i)}\right)\right\| < \varepsilon_1$, then Terminate;
Else if $i = K$, then Terminate; Else, continue to Step 4.
Step 4: Compute $\nabla^2 f\!\left(x_j^{(i)}\right)$ and its inverse, i.e. $\left[\nabla^2 f\!\left(x_j^{(i)}\right)\right]^{-1}$.
Newton’s method – Algorithm
Step 5: Compute $\alpha^{(i)}$, such that $f\!\left(x_j^{(i+1)}\right) = f\!\left(x_j^{(i)} - \alpha^{(i)} \left[\nabla^2 f\!\left(x_j^{(i)}\right)\right]^{-1} \nabla f\!\left(x_j^{(i)}\right)\right)$ is
minimum, using a unidirectional search.
Check: if $\left|\nabla f\!\left(x_j^{(i+1)}\right) \cdot \nabla f\!\left(x_j^{(i)}\right)\right| < \varepsilon_2$, then Terminate.
Step 6: If $\dfrac{\left\|x_j^{(i+1)} - x_j^{(i)}\right\|}{\left\|x_j^{(i)}\right\|} < \varepsilon_1$, then Terminate; Else, set $i = i + 1$ and go to Step 2.
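A minimal sketch of Newton's method with the same kind of unidirectional search is given below; for the quadratic example above, the Newton direction reaches the minimum $(-1, 1.5)$ in a single step. The Hessian is supplied analytically here, and the tolerances and α bracket are illustrative assumptions.

```python
# Newton's method: search direction s = -[Hessian]^(-1) * grad f, with the
# step length chosen by a golden-section unidirectional search.
import numpy as np

def golden_section(phi, lo=0.0, hi=5.0, tol=1e-8):
    gr = (np.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c, d = hi - gr * (hi - lo), lo + gr * (hi - lo)
        if phi(c) < phi(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)

def newton_method(f, grad, hess, x0, eps1=1e-8, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < eps1:           # gradient nearly zero: stop
            break
        s = -np.linalg.solve(hess(x), g)       # s = -[H]^(-1) grad f
        alpha = golden_section(lambda a: f(x + a * s))
        x = x + alpha * s
    return x, f(x)

f = lambda x: x[0] - x[1] + 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2
grad = lambda x: np.array([1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]])
hess = lambda x: np.array([[4.0, 2.0], [2.0, 2.0]])
print(newton_method(f, grad, hess, [0.0, 0.0]))   # (-1, 1.5), f = -1.25
```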