NLP Notes
Topics covered:
Nonlinear programming:
1) One-dimensional minimization methods
2) Unconstrained optimization:
   i) Steepest descent method
   ii) Conjugate gradient method
The following function, defined in the range a ≤ x ≤ b, is not a unimodal function, as it has
more than one minimum in the given range (moreover, it also has some maxima in the given range).
[Figure: sketches of f(x) versus x over a ≤ x ≤ b, comparing a unimodal function with such a non-unimodal one.]
Q. 01: Find the minimum of f = x (x – 1.5) starting from x = 0. Take step size s = 0.1
Soln.: Given x1 = 0 and s = 0.1
f(x1) = 0, x2 = 0 + 0.1 = 0.1, f(x2) = – 0.14
As f(x2) < f(x1), the search direction is correct. The iterations performed are shown in the
table below –
Iteration count xi f(xi) Is f(xi) < f(xi-1)
1 0 0 ---
2 0.10 – 0.14 Yes
3 0.20 – 0.26 Yes
4 0.30 – 0.36 Yes
Q. 02: Find the minimum of f = x (x – 1.2) starting from x = 1.6. Take step size s = 0.1
Soln.: Given x1 = 1.6 and s = 0.1
f(x1) = 0.64, x2 = 1.6 + 0.1 = 1.7, f(x2) = 0.85
As f(x2) > f(x1), the search direction is wrong; hence the search direction is reversed, i.e.
s = –s = –0.1. The iterations performed are shown in the table below –
Iteration count xi f(xi) Is f(xi) < f(xi-1)
1 1.6 0.64 ---
2 1.5 0.45 Yes
3 1.4 0.28 Yes
4 1.3 0.13 Yes
5 1.2 0 Yes
6 1.1 – 0.11 Yes
7 1.0 – 0.20 Yes
8 0.9 – 0.27 Yes
9 0.8 – 0.32 Yes
10 0.7 – 0.35 Yes
11 0.6 – 0.36 Yes
12 0.5 – 0.35 No
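The last row shows f increasing again (f(0.5) = –0.35 > f(0.6) = –0.36), so the search stops and the minimum is bracketed near x = 0.6. A minimal Python sketch of this fixed-step search follows; the helper name fixed_step_search and the simple stopping rule are my own choices, not part of the notes:

```python
def fixed_step_search(f, x0, s, max_iter=100):
    """Search with a fixed step size s, reversing direction if the first
    step increases f, and stopping when f stops decreasing."""
    x, fx = x0, f(x0)
    if f(x + s) > fx:            # probe one step; reverse direction if f increases
        s = -s
    for _ in range(max_iter):
        x_new = x + s
        f_new = f(x_new)
        if f_new >= fx:          # no further improvement: stop
            break
        x, fx = x_new, f_new
    return x, fx

# Q. 01: f = x(x - 1.5) from x = 0    (true minimum at x = 0.75)
print(fixed_step_search(lambda x: x * (x - 1.5), 0.0, 0.1))
# Q. 02: f = x(x - 1.2) from x = 1.6  (reverses direction; true minimum at x = 0.6)
print(fixed_step_search(lambda x: x * (x - 1.2), 1.6, 0.1))
```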
Fibonacci Method:
f(x1) = 0.8334 and f(x2) = 1.8058. As f(x1) < f(x2), hence x2 is discarded
f(x1) = 0.8334 and f(x3) = 1.3503. As f(x1) < f(x3), hence x3 is discarded
The new range available is [0.4616 to 1.2308]
Experiment No. 4: In the range [0.4616 to 1.2308], one point (x1) is at a distance of 0.3076 (=
0.7692 – 0.4616) from the left side. Hence place a new point (x4) at the same distance (0.3076) from
the right side, i.e. place x4 at 1.2308 – 0.3076 (at 0.9232).
f(x1) = 0.8334 and f(x4) =0.8809. As f(x1) < f(x4), hence x4 is discarded
The new range available is [0.4616 to 0.9232]
Experiment No. 5: In the range [0.4616 to 0.9232], one point (x1) is at a distance of 0.3076 from left
side. Hence place a new point (x5) at the same distance (0.3076) from right side, i.e. place x5 at
0.9232 – 0.3076 (at 0.6156).
f(x1) = 0.8334 and f(x5) > f(x1); hence x5 is discarded, and the new range available is [0.6156 to 0.9232].
Experiment No. 6: In the range [0.6156 to 0.9232], the point x1 (at 0.7692) is at a distance of 0.1536 from the left
side. Hence place the last point (x6) at the same distance from the right side, i.e. at 0.9232 – 0.1536 (at 0.7696).
f(x1) = 0.8334 and f(x6) = 0.8332. As f(x6) < f(x1), hence x1 is discarded
The new range available is [0.7692 to 0.9232]
As it was asked to perform six experiments, we can conclude that the minimum lies in the
range [0.7692 to 0.9232].
The final range (after performing six experiments) L6 = [0.9232 – 0.7692] = 0.154
After performing 6 experiments (iterations), we get a reduction ratio of L6 / L0 = [0.154 / 2] = 0.077,
which is the same as 1 / Fn = [1 / 13] = 0.0769
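The factor 1/Fn is the standard interval-reduction property of the Fibonacci method. With Lj denoting the interval of uncertainty after j experiments, the relation used implicitly above can be written (in LaTeX form) as:

```latex
L_j = \frac{F_{n-j+1}}{F_n}\, L_0
\qquad\Rightarrow\qquad
\frac{L_n}{L_0} = \frac{F_1}{F_n} = \frac{1}{F_n}
     = \frac{1}{13} \approx 0.0769 \quad (n = 6).
```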
Que. 02: Minimize f = x^3 – 4x in the range [1, 4] using the Fibonacci method. Perform six iterations.
Solution: Given function f(x) = x^3 – 4x
The range in which minimum lies; a = 1, b = 4
The number of experiments (iterations) to be performed; n = 6
Fn = F6 = 13
F0 = 1, F1 = 1, F2 = 2, F3 = 3, F4 = 5, F5 = 8, F6 = 13 ……
Initial interval L0 = b – a = 4 – 1 = 3
L2* = (Fn-2 / Fn) x L0 = (5/13) x 3 = 1.1538
Experiment No. 2: Distance of two experiments (points) from two ends = 1.1538
x1 = 1 + 1.1538 = 2.1538, and x2 = 4 – 1.1538 = 2.8462
f(x1) = 1.3759 and f(x2) = 11.6718. As f(x1) < f(x2), hence x2 is discarded
The new range available is [1 to 2.8462]
Experiment No. 3: In the range [1 to 2.8462], one point (x1) is at a distance of 1.1538 from left side.
Hence place a new point (x3) at the same distance (1.1538) from right side, i.e. place x3 at 2.8462 –
1.1538 (at 1.6924).
Proceeding in the same way for the remaining experiments, the final range (after performing six experiments) is L6 = [1.231 – 1] = 0.231
After performing 6 experiments (iterations), we get a reduction ratio of L6 / L0 = [0.231 / 3] = 0.077,
which is the same as 1 / Fn = [1 / 13] = 0.0769
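A Python sketch of the Fibonacci search used in these examples (the function name fibonacci_search and the small eps used to separate the final pair of points are my own; the notes describe only the hand calculation). For f(x) = x^3 – 4x on [1, 4] with n = 6 it returns an interval of length 3/13 ≈ 0.231, matching the result above:

```python
def fibonacci_search(f, a, b, n, eps=1e-3):
    """Fibonacci search: reduces [a, b] to about (b - a)/F_n using n evaluations of f.
    eps offsets the last point, which otherwise coincides with its partner."""
    F = [1, 1]
    for _ in range(n - 1):
        F.append(F[-1] + F[-2])                 # F[0..n] = 1, 1, 2, 3, 5, 8, 13, ...
    x1 = a + F[n - 2] / F[n] * (b - a)          # first two interior points
    x2 = a + F[n - 1] / F[n] * (b - a)
    f1, f2 = f(x1), f(x2)
    for k in range(1, n - 1):
        if f1 > f2:                             # minimum cannot lie in [a, x1]
            a, x1, f1 = x1, x2, f2
            x2 = a + F[n - k - 1] / F[n - k] * (b - a)
            if abs(x2 - x1) < eps:              # last experiment: break the tie
                x2 = x1 + eps
            f2 = f(x2)
        else:                                   # minimum cannot lie in [x2, b]
            b, x2, f2 = x2, x1, f1
            x1 = a + F[n - k - 2] / F[n - k] * (b - a)
            if abs(x2 - x1) < eps:
                x1 = x2 - eps
            f1 = f(x1)
    # one final discard based on the last comparison
    return (x1, b) if f1 > f2 else (a, x2)

# Que. 02: f = x^3 - 4x on [1, 4] with n = 6 -> interval of length about 3/13 = 0.231
lo, hi = fibonacci_search(lambda x: x**3 - 4*x, 1.0, 4.0, 6)
print(lo, hi, hi - lo)
```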
The final range (after performing four experiments) L4 = [0.944 – 0.472] = 0.472
After performing 4 experiments (iterations), we get a reduction ratio of L4 / L0 = [0.472 / 2] = 0.236.
Unconstrained Optimization –
The gradient of a function of n variables is an n-component vector, and it has a very important
property: if we move along the gradient direction from any point in n-dimensional space, the function
value increases at the fastest rate. Hence the gradient direction is called the "direction of steepest
ascent." Unfortunately, the direction of steepest ascent is a local property, not a global one.
Since the gradient vector represents the direction of steepest ascent, the negative of the gradient
vector denotes the direction of steepest descent.
Any method that makes use of the gradient vector can be expected to reach the minimum faster
than one that does not. All descent methods make use of the gradient vector, either directly or
indirectly, in finding the search direction.
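A quick numerical illustration of the steepest-ascent property (the test function f = x1^2 + 2x2^2 and the point (1, 1) are my own choices, not from the notes): the directional derivative is largest along the normalized gradient and most negative along its negative.

```python
import numpy as np

def f(x):                         # arbitrary smooth test function (assumption)
    return x[0]**2 + 2 * x[1]**2

def directional_derivative(f, x, d, h=1e-6):
    d = d / np.linalg.norm(d)     # unit direction
    return (f(x + h * d) - f(x - h * d)) / (2 * h)

x = np.array([1.0, 1.0])
grad = np.array([2 * x[0], 4 * x[1]])     # analytic gradient [2*x1, 4*x2]

print("along +grad:", directional_derivative(f, x, grad))            # ~ ||grad|| = 4.47
print("along +x1  :", directional_derivative(f, x, np.array([1.0, 0.0])))
print("along -grad:", directional_derivative(f, x, -grad))           # steepest descent, ~ -4.47
```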
Steepest Descent Method (Cauchy’s):
The use of the negative of the gradient vector as a direction for minimization was first made
by Cauchy in 1847. In this method, we start from an initial trial point X1 and iteratively move along
the steepest descent directions until the optimum point is found.
Algorithm (for function minimization)
01. Start with initial solution i.e. initial design vector (X1). If it is not given, then it is taken as zero
vector.
02. Set iteration count i = 1.
03. Find the gradient of the function f at X = Xi. It is written as ∇fi and is defined as the vector of
partial derivatives of the function w.r.t. the design variables.
04. Find the search direction Si = –∇fi.
05. Find Xi + λi Si in terms of λi.
06. Write f(Xi + λi Si).
07. Find the optimum value of λi, i.e. λi*. For this take ∂f(Xi + λi Si) / ∂λi = 0.
08. Find the new point Xi+1 = Xi + λi* Si.
09. Check Xi+1 for optimality (e.g. whether Xi+1 – Xi or ∇fi+1 is close to zero). If it is not optimum,
set i = i + 1 and go to step 03.
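A minimal Python sketch of steps 01–09 for a quadratic objective f(X) = 0.5 XᵀAX + bᵀX, for which step 07 has the closed-form solution λi* = ∇fiᵀ∇fi / (∇fiᵀ A ∇fi). The function name and the stopping tolerance are my own assumptions; the worked examples below can be reproduced with it:

```python
import numpy as np

def steepest_descent(A, b, x0, n_iter=2, tol=1e-8):
    """Steepest descent (Cauchy) for f(X) = 0.5*X'AX + b'X with exact line search.
    Returns the iterates X1, X2, ... produced by the algorithm above."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    x = np.asarray(x0, float)
    history = [x.copy()]
    for _ in range(n_iter):
        g = A @ x + b                    # step 03: gradient at the current point
        if np.linalg.norm(g) < tol:      # step 09: optimality check
            break
        s = -g                           # step 04: steepest descent direction
        lam = (g @ g) / (g @ A @ g)      # step 07: exact step length for a quadratic
        x = x + lam * s                  # step 08: new point
        history.append(x.copy())
    return history
```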
Example: Minimize f(x1, x2) = x1 – x2 + 2x1^2 + 2x1x2 + x2^2 starting from X1 = [0, 0]^T. Perform two iterations.
Iteration I:
Gradient, ∇f1 = [1, –1]^T
Search direction S1 = –∇f1 = [–1, 1]^T
X1 + λ1 S1 = [–λ1, λ1]^T, and f(X1 + λ1 S1) = λ1^2 – 2λ1, which gives λ1* = 1
New point X2 = X1 + λ1* S1 = [–1, 1]^T
Check for optimality: X2 – X1 = [–1, 1]^T. The difference is not small, so proceed for iteration II.
Iteration II:
Gradient, ∇f2 = [–1, –1]^T
Search direction S2 = –∇f2 = [1, 1]^T
X2 + λ2 S2 = [–1 + λ2, 1 + λ2]^T
f(X2 + λ2 S2) = (–1 + λ2) – (1 + λ2) + 2(–1 + λ2)^2 + 2(–1 + λ2)(1 + λ2) + (1 + λ2)^2
             = 5λ2^2 – 2λ2 – 1
∂f(X2 + λ2 S2) / ∂λ2 = 0  ⇒  10λ2 – 2 = 0  ⇒  λ2* = 0.2
New point X3 = X2 + λ2* S2 = [–0.8, 1.2]^T
As it is asked to perform two iterations only, X* ≈ [–0.8, 1.2]^T and fmin ≈ –1.2
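This example can be checked with the sketch given after the algorithm (assuming that hypothetical steepest_descent helper): the objective corresponds to A = [[4, 2], [2, 2]] and b = [1, –1], and two iterations reproduce X2 = [–1, 1]^T and X3 = [–0.8, 1.2]^T.

```python
A = [[4.0, 2.0], [2.0, 2.0]]     # 0.5*X'AX = 2*x1^2 + 2*x1*x2 + x2^2
b = [1.0, -1.0]                  # b'X    = x1 - x2
xs = steepest_descent(A, b, x0=[0.0, 0.0], n_iter=2)
print(xs[1], xs[2])              # ~ [-1.  1.] and [-0.8  1.2]
```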
1.2
f
x 12 x 6 x 2 1
Its gradient is f 1 1
f 4 x 2 6 x1 2
x 2
1
Gradient, f 1
2
1
Search direction S1 f 1
2
X i i S i 1
21
f ( X i i S i ) 612 812 1212 1 41
212 51
f ( X i i S i )
0 41 5 0 1* 1.25
i
1.25
New point X 2 X 1 1* S1 X 2
2.50
1.25
Check for optimality X 2 X 1
2.50
Difference is not small, so proceed for iteration II.
Iteration II:
Gradient, ∇f2 = [–1, 0.5]^T
Search direction S2 = –∇f2 = [1, –0.5]^T
X2 + λ2 S2 = [1.25 + λ2, 2.5 – 0.5λ2]^T
f(X2 + λ2 S2) = 9.5λ2^2 – 1.25λ2 – 3.125
∂f(X2 + λ2 S2) / ∂λ2 = 0  ⇒  19λ2 – 1.25 = 0  ⇒  λ2* = 0.0658
New point X3 = X2 + λ2* S2 = [1.3158, 2.4671]^T
As it is asked to perform two iterations only, X* ≈ [1.3158, 2.4671]^T and fmin ≈ –3.166
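Likewise for this example (again assuming the steepest_descent sketch from above): here A = [[12, –6], [–6, 4]] and b = [–1, –2], and two iterations give X2 = [1.25, 2.5]^T and X3 ≈ [1.3158, 2.4671]^T with f ≈ –3.166.

```python
A = [[12.0, -6.0], [-6.0, 4.0]]  # 0.5*X'AX = 6*x1^2 - 6*x1*x2 + 2*x2^2
b = [-1.0, -2.0]                 # b'X    = -x1 - 2*x2
xs = steepest_descent(A, b, x0=[0.0, 0.0], n_iter=2)
print(xs[1], xs[2])              # ~ [1.25  2.5] and [1.3158  2.4671]
```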
Q. 03: The profit per acre of a farm is given by
f = 20x1 + 26x2 + 4x1x2 – 4x1^2 – 3x2^2
where x1 and x2 are the labor cost and the fertilizer cost respectively. Find the values of x1 and x2 that
maximize the profit. Use the Steepest Descent method and perform two iterations.
Soln.:
Its gradient is ∇f = [∂f/∂x1, ∂f/∂x2]^T = [20 + 4x2 – 8x1, 26 + 4x1 – 6x2]^T
Starting point X1 = [0, 0]^T (Assumed)
Iteration I:
Gradient, ∇f1 = [20, 26]^T
Since the profit is to be maximized, the search direction is along the gradient: S1 = +∇f1 = [20, 26]^T
X1 + λ1 S1 = [20λ1, 26λ1]^T
f(X1 + λ1 S1) = –1548λ1^2 + 1076λ1
∂f(X1 + λ1 S1) / ∂λ1 = 0  ⇒  –3096λ1 + 1076 = 0  ⇒  λ1* = 1076 / 3096 ≈ 0.35
New point X2 = X1 + λ1* S1 = [7, 9.1]^T
Check for optimality: X2 – X1 = [7, 9.1]^T. The difference is not small, so proceed for iteration II.
Iteration II:
Gradient, ∇f2 = [0.4, –0.6]^T
Search direction S2 = +∇f2 = [0.4, –0.6]^T
X2 + λ2 S2 = [7 + 0.4λ2, 9.1 – 0.6λ2]^T
f(X2 + λ2 S2) = –2.68λ2^2 + 0.52λ2 + 186.97
∂f(X2 + λ2 S2) / ∂λ2 = 0  ⇒  –5.36λ2 + 0.52 = 0  ⇒  λ2* ≈ 0.097
New point X3 = X2 + λ2* S2 ≈ [7.04, 9.04]^T
As it is asked to perform two iterations only, X* ≈ [7.04, 9.04]^T and fmax ≈ 187.0
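The line search of Iteration II can be verified symbolically; the following sketch uses sympy (not part of the notes; the symbol name lam is mine) to expand f along X2 + λ2 S2 and solve for λ2*:

```python
import sympy as sp

lam = sp.symbols('lam')
x1, x2 = 7 + 0.4 * lam, 9.1 - 0.6 * lam          # X2 + lam*S2 from Iteration II
f = 20*x1 + 26*x2 + 4*x1*x2 - 4*x1**2 - 3*x2**2  # profit function of Q. 03

f = sp.expand(f)
print(f)                                # ~ -2.68*lam**2 + 0.52*lam + 186.97
lam_star = sp.solve(sp.diff(f, lam), lam)[0]
print(lam_star)                         # about 0.097
```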
Conjugate Gradient Method (Fletcher–Reeves):
Algorithm (for function minimization)
01. Start with the initial solution, i.e. the initial design vector (X1). If it is not given, then it is taken as the zero
vector.
02. Set iteration count i = 1.
03. Find the gradient of the function f at X = X1. It is written as ∇f1.
04. Find the first search direction, S1 = –∇f1.
05. Find Xi + λi Si in terms of λi.
06. Write f(Xi + λi Si).
07. Find the optimum value of λi, i.e. λi*. For this take ∂f(Xi + λi Si) / ∂λi = 0.
08. Find the new point Xi+1 = Xi + λi* Si and check it for optimality.
09. If it is not optimum, find the next search direction Si+1 = –∇fi+1 + (||∇fi+1||^2 / ||∇fi||^2) Si,
set i = i + 1 and go to step 05.
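A minimal Python sketch of these steps for a quadratic f(X) = 0.5 XᵀAX + bᵀX, with the exact step length from step 07 and the Fletcher–Reeves update of step 09 (the function name and demo values are my own assumptions). On the example that follows it reproduces X* = [–1, 1.5]^T in two iterations:

```python
import numpy as np

def conjugate_gradient(A, b, x0, n_iter=2, tol=1e-8):
    """Fletcher-Reeves conjugate gradient for f(X) = 0.5*X'AX + b'X."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    x = np.asarray(x0, float)
    g = A @ x + b                        # gradient at X1
    s = -g                               # step 04: first search direction
    for _ in range(n_iter):
        if np.linalg.norm(g) < tol:
            break
        lam = -(s @ g) / (s @ A @ s)     # step 07: exact step length along s
        x = x + lam * s                  # step 08: new point
        g_new = A @ x + b
        beta = (g_new @ g_new) / (g @ g) # step 09: Fletcher-Reeves factor
        s = -g_new + beta * s            # next (conjugate) search direction
        g = g_new
    return x

# Example: f = x1 - x2 + 2*x1^2 + 2*x1*x2 + x2^2  ->  A = [[4, 2], [2, 2]], b = [1, -1]
x_star = conjugate_gradient([[4, 2], [2, 2]], [1, -1], [0, 0])
print(x_star)                            # ~ [-1.   1.5]
```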
Example: Minimize f(x1, x2) = x1 – x2 + 2x1^2 + 2x1x2 + x2^2 by the conjugate gradient method. Perform two iterations.
Starting point X1 = [0, 0]^T (Given)
Iteration I:
Gradient, ∇f1 = [1, –1]^T
Search direction S1 = –∇f1 = [–1, 1]^T
X1 + λ1 S1 = [–λ1, λ1]^T
f(X1 + λ1 S1) = –λ1 – λ1 + 2λ1^2 – 2λ1^2 + λ1^2 = λ1^2 – 2λ1
∂f(X1 + λ1 S1) / ∂λ1 = 0  ⇒  2λ1 – 2 = 0  ⇒  λ1* = 1
New point X2 = X1 + λ1* S1 = [–1, 1]^T
Check for optimality: X2 – X1 = [–1, 1]^T. The difference is not small, so proceed for iteration II.
Iteration II:
Gradient, ∇f2 = [–1, –1]^T
Search direction S2 = –∇f2 + (||∇f2||^2 / ||∇f1||^2) S1 = [1, 1]^T + (2 / 2) [–1, 1]^T = [0, 2]^T
X2 + λ2 S2 = [–1, 1 + 2λ2]^T
f(X2 + λ2 S2) = –1 – (1 + 2λ2) + 2(–1)^2 + 2(–1)(1 + 2λ2) + (1 + 2λ2)^2 = 4λ2^2 – 2λ2 – 1
∂f(X2 + λ2 S2) / ∂λ2 = 0  ⇒  8λ2 – 2 = 0  ⇒  λ2* = 0.25
New point X3 = X2 + λ2* S2 = [–1, 1.5]^T
As it is asked to perform two iterations only, X* = [–1, 1.5]^T and fmin = –1.25
Another example (conjugate gradient method), starting point X1 = [0, 0]^T (Given):
Iteration I:
Gradient, ∇f1 = [–1, –2]^T
Search direction S1 = –∇f1 = [1, 2]^T
X1 + λ1 S1 = [λ1, 2λ1]^T
f(X1 + λ1 S1) = 6λ1^2 + 8λ1^2 – 12λ1^2 – λ1 – 4λ1 = 2λ1^2 – 5λ1
∂f(X1 + λ1 S1) / ∂λ1 = 0  ⇒  4λ1 – 5 = 0  ⇒  λ1* = 1.25
New point X2 = X1 + λ1* S1 = [1.25, 2.50]^T
Check for optimality: X2 – X1 = [1.25, 2.50]^T. The difference is not small, so proceed for iteration II.
∂f(X2 + λ2 S2) / ∂λ2 = 0  ⇒  197.75λ2 – 11.75 = 0  ⇒  λ2* ≈ 0.06
New point X3 = X2 + λ2* S2 = [0.935, 1.72]^T
As it is asked to perform two iterations only, X* ≈ [0.935, 1.72]^T and fmin = 5.888
Q. 03 solved by the conjugate gradient method:
For f = 20x1 + 26x2 + 4x1x2 – 4x1^2 – 3x2^2, its gradient is ∇f = [20 + 4x2 – 8x1, 26 + 4x1 – 6x2]^T;
Starting point X1 = [0, 0]^T (Assumed)
Iteration I:
Gradient, ∇f1 = [20, 26]^T
Since the profit is to be maximized, the search direction is S1 = +∇f1 = [20, 26]^T
X1 + λ1 S1 = [20λ1, 26λ1]^T
f(X1 + λ1 S1) = –1548λ1^2 + 1076λ1
∂f(X1 + λ1 S1) / ∂λ1 = 0  ⇒  –3096λ1 + 1076 = 0  ⇒  λ1* ≈ 0.35
New point X2 = X1 + λ1* S1 = [7, 9.1]^T
Check for optimality: X2 – X1 = [7, 9.1]^T. The difference is not small, so proceed for iteration II.
Iteration II:
Gradient, ∇f2 = [0.4, –0.6]^T
Search direction S2 = +∇f2 + (||∇f2||^2 / ||∇f1||^2) S1 = [0.4, –0.6]^T + (0.52 / 1076) [20, 26]^T = [0.41, –0.59]^T
X2 + λ2 S2 = [7 + 0.41λ2, 9.1 – 0.59λ2]^T