
Optimization Methods

in Engineering Design
Day-5
Course Materials
• Arora, Introduction to Optimum Design, 3e, Elsevier
(https://www.researchgate.net/publication/273120102_Introduction_to_Optimum_design)
• Parkinson, Optimization Methods for Engineering Design, Brigham Young University
(http://apmonitor.com/me575/index.php/Main/BookChapters)
• Iqbal, Fundamental Engineering Optimization Methods, BookBoon
(https://bookboon.com/en/fundamental-engineering-optimization-methods-ebook)
Numerical Optimization
• Consider an unconstrained NLP problem: min𝒙 𝑓(𝒙)
• Use an iterative method to solve the problem: 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝑘 𝒅𝑘 ,
where 𝒅𝑘 is a search direction and 𝛼𝑘 is the step size, such that the
function value decreases at each step, i.e., 𝑓 𝒙𝑘+1 < 𝑓 𝒙𝑘
• We expect lim𝑘→∞ 𝒙𝑘 = 𝒙∗
• The general iterative method is a two-step process:
– Finding a suitable search direction 𝒅𝑘 along which the function
value locally decreases and any constraints are obeyed.
– Performing line search along 𝒅𝑘 to find 𝒙𝑘+1 such that 𝑓 𝒙𝑘+1
attains its minimum value.
The Iterative Method
• Iterative algorithm:
1. Initialize: choose 𝒙0
2. Check termination: 𝛻𝑓 𝒙𝑘 ≅ 0
3. Find a suitable search direction 𝒅𝑘 ,
that obeys the descent condition:
𝛻𝑓(𝒙𝑘)𝑇𝒅𝑘 < 0
4. Search along 𝒅𝑘 to find where
𝑓 𝒙𝑘+1 attains minimum value
(line search problem)
5. Return to step 2
The Line Search Problem
• Assuming a suitable search direction 𝒅𝑘 has been determined, we
seek to determine a step length 𝛼𝑘 , that minimizes 𝑓 𝒙𝑘+1 .
• Assuming 𝒙𝑘 and 𝒅𝑘 are known, the projected function value along
𝒅𝑘 is expressed as:
𝑓(𝒙𝑘+1) = 𝑓(𝒙𝑘 + 𝛼𝒅𝑘) = 𝑓(𝛼)
• The line search problem, to choose 𝛼 to minimize 𝑓(𝒙𝑘+1) along 𝒅𝑘, is defined as:
min𝛼 𝑓(𝛼) = 𝑓(𝒙𝑘 + 𝛼𝒅𝑘)
• Assuming that a solution exists, it is found by setting 𝑓′ 𝛼 = 0.
Example: Quadratic Function
• Consider minimizing a quadratic function:
𝑓(𝒙) = ½𝒙𝑇𝑨𝒙 − 𝒃𝑇𝒙, 𝛻𝑓 = 𝑨𝒙 − 𝒃
• Given a descent direction 𝒅, the line search problem is defined as:
min𝛼 𝑓(𝛼) = ½(𝒙𝑘 + 𝛼𝒅)𝑇𝑨(𝒙𝑘 + 𝛼𝒅) − 𝒃𝑇(𝒙𝑘 + 𝛼𝒅)
• A solution is found by setting 𝑓 ′ 𝛼 = 0, where
𝑓′(𝛼) = 𝒅𝑇𝑨(𝒙𝑘 + 𝛼𝒅) − 𝒅𝑇𝒃 = 0
𝛼 = −𝒅𝑇(𝑨𝒙𝑘 − 𝒃) / 𝒅𝑇𝑨𝒅 = −𝛻𝑓(𝒙𝑘)𝑇𝒅 / 𝒅𝑇𝑨𝒅
• Finally, 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝒅.
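The closed-form step size above is easy to sanity-check numerically; a minimal Python sketch (the 2×2 matrix 𝑨 and vector 𝒃 below are illustrative choices, not from the slides):

```python
# Exact line search for f(x) = 1/2 x'Ax - b'x along a direction d:
# alpha = -grad(x)'d / (d'Ad), with grad(x) = Ax - b.

def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def exact_step(A, b, x, d):
    g = [gi - bi for gi, bi in zip(matvec(A, x), b)]   # gradient A x - b
    return -dot(g, d) / dot(d, matvec(A, d))

A = [[2.0, 0.0], [0.0, 4.0]]   # illustrative SPD matrix
b = [1.0, 1.0]
x = [0.0, 0.0]
g = [gi - bi for gi, bi in zip(matvec(A, x), b)]
d = [-gi for gi in g]           # steepest-descent direction
alpha = exact_step(A, b, x, d)  # = 1/3 for this data
x1 = [xi + alpha * di for xi, di in zip(x, d)]
```

At the returned step, 𝑓′(𝛼) = 𝒅𝑇(𝑨𝒙𝑘+1 − 𝒃) = 0, which is the defining condition of the exact line search.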
Computer Methods for Line Search Problem
• Interval reduction methods
– Golden search
– Fibonacci search
• Approximate search methods
– Armijo's rule
– Quadratic curve fitting
Interval Reduction Methods
• The interval reduction methods find the minimum of a unimodal
function in two steps:
– Bracketing the minimum to an interval
– Reducing the interval to desired accuracy
• The bracketing step aims to find a three-point pattern, such that for
𝑥1 , 𝑥2 , 𝑥3 , 𝑓 𝑥1 ≥ 𝑓 𝑥2 < 𝑓 𝑥3 .
Fibonacci’s Method
• The Fibonacci’s method uses Fibonacci numbers to achieve
maximum interval reduction in a given number of steps.
• The Fibonacci number sequence is generated as:
𝐹0 = 𝐹1 = 1, 𝐹𝑖 = 𝐹𝑖−1 + 𝐹𝑖−2 , 𝑖 ≥ 2.
• The properties of Fibonacci numbers include:
– They achieve the golden ratio 𝜏 = lim𝑛→∞ 𝐹𝑛−1/𝐹𝑛 = (√5 − 1)/2 ≅ 0.618034
– The number of interval reductions 𝑛 required to achieve a desired
accuracy 𝜀 (where 1/𝐹𝑛 < 𝜀) is specified in advance.
– For given 𝐼1 and 𝑛: 𝐼2 = (𝐹𝑛−1/𝐹𝑛)𝐼1, 𝐼3 = 𝐼1 − 𝐼2, 𝐼4 = 𝐼2 − 𝐼3, etc.
The Golden Section Method
• The golden section method uses the golden ratio: 𝜏 = 0.618034.
• The golden section algorithm is given as:
1. Initialize: specify 𝑥1, 𝑥4 (𝐼1 = 𝑥4 − 𝑥1), 𝜀, 𝑛: 𝜏ⁿ < 𝜀/𝐼1
2. Compute 𝑥2 = 𝜏𝑥1 + (1 − 𝜏)𝑥4; evaluate 𝑓2
3. For 𝑖 = 1, …, 𝑛 − 1: compute 𝑥3 = (1 − 𝜏)𝑥1 + 𝜏𝑥4 and evaluate 𝑓3; if 𝑓2 < 𝑓3, set 𝑥4 ← 𝑥1, 𝑥1 ← 𝑥3; else set 𝑥1 ← 𝑥2, 𝑥2 ← 𝑥3, 𝑓2 ← 𝑓3
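The interval-reduction idea can be sketched in Python; this is a standard golden-section variant (it shrinks [a, b] directly rather than using the slide's endpoint-swapping bookkeeping), tried on the function 𝑒^(−𝛼) + 𝛼² used as an example later in these slides:

```python
import math

def golden_section(f, a, b, n=30):
    """Golden-section search: each iteration shrinks [a, b] by tau."""
    tau = (math.sqrt(5) - 1) / 2          # 0.618034
    x2 = b - tau * (b - a)                # interior points
    x3 = a + tau * (b - a)
    f2, f3 = f(x2), f(x3)
    for _ in range(n):
        if f2 < f3:                        # minimum lies in [a, x3]
            b, x3, f3 = x3, x2, f2
            x2 = b - tau * (b - a)
            f2 = f(x2)
        else:                              # minimum lies in [x2, b]
            a, x2, f2 = x2, x3, f3
            x3 = a + tau * (b - a)
            f3 = f(x3)
    return (a + b) / 2

f = lambda a: math.exp(-a) + a * a        # example used later in the slides
alpha_min = golden_section(f, 0.0, 1.0)   # ≈ 0.3517
```

After 𝑛 iterations the bracket width is 𝜏ⁿ(𝑏 − 𝑎), matching the termination rule 𝜏ⁿ < 𝜀/𝐼1 in step 1 above.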
Approximate Search Methods
• Consider the line search problem: min𝛼 𝑓(𝛼) = 𝑓(𝒙𝑘 + 𝛼𝒅𝑘)
• Sufficient Descent Condition. The sufficient descent condition guards against 𝒅𝑘 becoming nearly orthogonal to 𝛻𝑓(𝒙𝑘). The condition is stated as:
𝛻𝑓(𝒙𝑘)𝑇𝒅𝑘 < −𝑐‖𝛻𝑓(𝒙𝑘)‖², 𝑐 > 0
• Sufficient Decrease Condition. The sufficient decrease condition ensures
a nontrivial reduction in the function value. The condition is stated as:
𝑓(𝒙𝑘 + 𝛼𝒅𝑘) − 𝑓(𝒙𝑘) ≤ 𝜇𝛼𝛻𝑓(𝒙𝑘)𝑇𝒅𝑘, 0 < 𝜇 < 1
• Curvature Condition. The curvature condition guards against 𝛼 becoming
too small. The condition is stated as:
𝛻𝑓(𝒙𝑘 + 𝛼𝒅𝑘)𝑇𝒅𝑘 ≥ 𝜂𝛻𝑓(𝒙𝑘)𝑇𝒅𝑘, 0 < 𝜇 < 𝜂 < 1
Approximate Line Search
• Strong Wolfe Conditions. The strong Wolfe conditions commonly used in line search algorithms include:
1. The sufficient decrease condition (Armijo's rule):
𝑓(𝛼) ≤ 𝑓(0) + 𝜇𝛼𝑓′(0), 0 < 𝜇 < 1
2. The strong curvature condition:
|𝑓′(𝛼)| ≤ 𝜂|𝑓′(0)|, 0 < 𝜇 ≤ 𝜂 < 1
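Armijo's rule lends itself to a simple backtracking sketch; the helper below halves a trial step until the sufficient decrease condition holds, using the example 𝑓(𝛼) = 𝑒^(−𝛼) + 𝛼² that appears later in these slides (for which 𝑓(0) = 1, 𝑓′(0) = −1):

```python
import math

def backtrack(f, f0, df0, mu=0.2, alpha=1.0):
    """Halve alpha until Armijo's rule f(a) <= f(0) + mu*a*f'(0) holds."""
    while f(alpha) > f0 + mu * alpha * df0:
        alpha *= 0.5
    return alpha

f = lambda a: math.exp(-a) + a * a   # f(0) = 1, f'(0) = -1
alpha = backtrack(f, 1.0, -1.0)      # accepts alpha = 0.5
```

For this function, 𝛼 = 1 is rejected (𝑓(1) ≈ 1.368 > 0.8) and 𝛼 = 0.5 is accepted (𝑓(0.5) ≈ 0.857 ≤ 0.9).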
Approximate Line Search
• The approximate line search includes two steps:
– Bracketing the minimum
– Estimating the minimum
• Bracketing the Minimum. In the bracketing step we seek an interval [𝛼𝑙, 𝛼𝑢] such that 𝑓′(𝛼𝑙) < 0 and 𝑓′(𝛼𝑢) > 0.
– Since 𝑓′(0) < 0 for any descent direction, 𝛼𝑙 = 0 serves as a lower bound on 𝛼. To find an upper bound, gradually increase 𝛼, e.g., 𝛼 = 1, 2, …
– Assume that for some 𝛼𝑖 > 0, we get 𝑓′(𝛼𝑖) < 0 and 𝑓′(𝛼𝑖+1) > 0; then, 𝛼𝑖+1 serves as an upper bound.
Approximate Line Search
• Estimating the Minimum. Once the minimum has been bracketed
to a small interval, a quadratic or cubic polynomial approximation is
used to find the minimizer.
• If the polynomial minimizer 𝛼 satisfies the strong Wolfe conditions for the desired 𝜇 and 𝜂 values (say 𝜇 = 0.2, 𝜂 = 0.5), it is taken as the function minimizer.
• Otherwise, 𝛼 replaces 𝛼𝑙 or 𝛼𝑢, and the polynomial approximation step is repeated.
Quadratic Curve Fitting
• Assuming that the interval 𝛼𝑙 , 𝛼𝑢 contains the minimum of a
unimodal function, 𝑓 𝛼 , its quadratic approximation, given as:
𝑞 𝛼 = 𝑎0 + 𝑎1 𝛼 + 𝑎2 𝛼 2 , is obtained using three points
𝛼𝑙 , 𝛼𝑚 , 𝛼𝑢 , where the mid-point may be used for 𝛼𝑚
The quadratic coefficients {𝑎0, 𝑎1, 𝑎2} are solved as:
𝑎2 = [ (𝑓(𝛼𝑢) − 𝑓(𝛼𝑙))/(𝛼𝑢 − 𝛼𝑙) − (𝑓(𝛼𝑚) − 𝑓(𝛼𝑙))/(𝛼𝑚 − 𝛼𝑙) ] / (𝛼𝑢 − 𝛼𝑚)
𝑎1 = (𝑓(𝛼𝑚) − 𝑓(𝛼𝑙))/(𝛼𝑚 − 𝛼𝑙) − 𝑎2(𝛼𝑙 + 𝛼𝑚)
𝑎0 = 𝑓(𝛼𝑙) − 𝑎1𝛼𝑙 − 𝑎2𝛼𝑙²
Then, the minimum is given as: 𝛼𝑚𝑖𝑛 = −𝑎1/2𝑎2
Example: Approximate Search
• Let 𝑓 𝛼 = 𝑒 −𝛼 + 𝛼 2 , 𝑓 ′ 𝛼 = 2𝛼 − 𝑒 −𝛼 , 𝑓 0 = 1, 𝑓 ′ 0 = −1.
Let 𝜇 = 0.2, and try 𝛼 = 0.1, 0.2, …, to bracket the minimum.
• From the sufficient decrease condition, the minimum is bracketed
in the interval: [0, 0.5]
• Using quadratic approximation, the minimum is found as: 𝛼∗ = 0.3531
The exact solution is given as: 𝛼𝑚𝑖𝑛 = 0.3517
• The Matlab commands are:
Define the function:
f=@(x) x.*x+exp(-x);
mu=0.2; al=0:.1:1;
Example: Approximate Search
• Bracketing the minimum:
f1=feval(f,al)
1.0000 0.9148 0.8587 0.8308 0.8303 0.8565 0.9088 0.9866
1.0893 1.2166 1.3679
>> f2=f(0)-mu*al
1.0000 0.9800 0.9600 0.9400 0.9200 0.9000 0.8800 0.8600
0.8400 0.8200 0.8000
>> idx=find(f1<=f2)
• Quadratic approximation to find the minimum:
al=0; am=0.25; au=0.5;
a2 = ((f(au)-f(al))/(au-al)-(f(am)-f(al))/(am-al))/(au-am);
a1 = (f(am)-f(al))/(am-al)-a2*(al+am);
xmin = -a1/a2/2 % 0.3531
Computer Methods for Finding the Search Direction
• Gradient based methods
– Steepest descent method
– Conjugate gradient method
– Quasi Newton methods
• Hessian based methods
– Newton’s method
– Trust region methods
Steepest Descent Method
• The steepest descent method determines the search direction as:
𝒅𝑘 = −𝛻𝑓(𝒙𝑘 ),
• The update rule is given as: 𝒙𝑘+1 = 𝒙𝑘 − 𝛼𝑘 ∙ 𝛻𝑓(𝒙𝑘 )
where 𝛼𝑘 is determined by minimizing 𝑓(𝒙𝑘+1 ) along 𝒅𝑘
• Example: quadratic function
𝑓(𝒙) = ½𝒙𝑇𝑨𝒙 − 𝒃𝑇𝒙, 𝛻𝑓 = 𝑨𝒙 − 𝒃
Then, 𝒙𝑘+1 = 𝒙𝑘 − 𝛼 ∙ 𝛻𝑓(𝒙𝑘); 𝛼 = 𝛻𝑓(𝒙𝑘)𝑇𝛻𝑓(𝒙𝑘) / 𝛻𝑓(𝒙𝑘)𝑇𝑨𝛻𝑓(𝒙𝑘)
Define 𝒓𝑘 = 𝒃 − 𝑨𝒙𝑘; then, 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝑘𝒓𝑘; 𝛼𝑘 = 𝒓𝑘𝑇𝒓𝑘 / 𝒓𝑘𝑇𝑨𝒓𝑘
Steepest Descent Algorithm
• Initialize: choose 𝒙0
• For 𝑘 = 0,1,2, …
– Compute 𝛻𝑓(𝒙𝑘 )
– Check convergence: if 𝛻𝑓(𝒙𝑘 ) < 𝜖, stop.
– Set 𝒅𝑘 = −𝛻𝑓(𝒙𝑘 )
– Line search problem: find min𝛼≥0 𝑓(𝒙𝑘 + 𝛼𝒅𝑘)
– Set 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝒅𝑘 .
Example: Steepest Descent
• Consider min𝒙 𝑓(𝒙) = 0.1𝑥1² + 𝑥2²,
𝛻𝑓(𝒙) = [0.2𝑥1, 2𝑥2]𝑇, 𝛻²𝑓(𝒙) = diag(0.2, 2); let 𝒙0 = [5, 1]𝑇, then 𝑓(𝒙0) = 3.5,
𝒅0 = −𝛻𝑓(𝒙0) = [−1, −2]𝑇, 𝛼 = 0.61
𝒙1 = [4.39, −0.22]𝑇, 𝑓(𝒙1) = 1.98
Continuing..
Example: Steepest Descent
• MATLAB code:
H=[.2 0;0 2];
f=@(x) x'*H*x/2; df=@(x) H*x; ddf=H;
x=[5;1];
xall=x';
for i=1:10
d=-df(x);
a=d'*d/(d'*H*d);
x=x+a*d;
xall=[xall;x'];
end
plot(xall(:,1),xall(:,2)), grid
axis([-1 5 -1 5]), axis equal
Steepest Descent Method
• The steepest descent method becomes slow close to the optimum
• The method progresses in a zigzag fashion, since
𝑑/𝑑𝛼 𝑓(𝒙𝑘 + 𝛼𝒅𝑘) = 𝛻𝑓(𝒙𝑘+1)𝑇𝒅𝑘 = −𝛻𝑓(𝒙𝑘+1)𝑇𝛻𝑓(𝒙𝑘) = 0
• The method has linear convergence with rate constant
𝐶 = (𝑓(𝒙𝑘+1) − 𝑓(𝒙∗)) / (𝑓(𝒙𝑘) − 𝑓(𝒙∗)) ≤ [ (𝑐𝑜𝑛𝑑(𝑨) − 1) / (𝑐𝑜𝑛𝑑(𝑨) + 1) ]²
Preconditioning
• Preconditioning (scaling) can be used to reduce the condition
number of the Hessian matrix and hence aid convergence
• Consider 𝑓 𝒙 = 0.1𝑥12 + 𝑥22 = 𝒙𝑇 𝑨𝒙, where 𝑨 = 𝑑𝑖𝑎𝑔(0.1, 1)
• Define a linear transformation: 𝒙 = 𝑷𝒚, where 𝑷 = 𝑑𝑖𝑎𝑔(√10, 1);
then, 𝑓 𝒙 = 𝒚𝑇 𝑷𝑇 𝑨𝑷𝒚 = 𝒚𝑇 𝒚
• Since 𝑐𝑜𝑛𝑑 𝑰 = 1, the steepest descent method in the case of a
quadratic function converges in a single iteration
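The effect of the scaling can be checked numerically; a minimal Python sketch of exact-step steepest descent for diagonal quadratics 𝑓 = Σ 𝑎𝑖𝑥𝑖² (plain lists; one iteration on the scaled problem lands on the minimum, while the unscaled problem is still far off):

```python
import math

def steepest_descent_diag(a, x, iters):
    """Exact-step steepest descent for f(x) = sum(a_i * x_i^2), A diagonal."""
    for _ in range(iters):
        g = [2 * ai * xi for ai, xi in zip(a, x)]          # gradient
        gg = sum(gi * gi for gi in g)
        if gg == 0:
            break
        gAg = sum(ai * gi * gi for ai, gi in zip(a, g))    # g'Ag
        alpha = gg / (2 * gAg)                             # exact step size
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x

x0 = [5.0, 1.0]
y0 = [x0[0] / math.sqrt(10), x0[1]]                 # transformed start, x = P y
x_slow = steepest_descent_diag([0.1, 1.0], x0, 1)   # original: still far off
y_fast = steepest_descent_diag([1.0, 1.0], y0, 1)   # scaled: exact minimum
```

For 𝑓 = 𝒚𝑇𝒚 the exact step is 𝛼 = ½, so 𝒚1 = 𝒚0 − ½ ∙ 2𝒚0 = 𝟎 in a single iteration, as claimed above.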
Conjugate Gradient Method
• For any square matrix 𝑨, the set of 𝑨-conjugate vectors is defined
by: 𝒅𝑖𝑇𝑨𝒅𝑗 = 0, 𝑖 ≠ 𝑗
• Let 𝒈𝑘 = 𝛻 𝑓 𝒙𝑘 denote the gradient; then, starting from
𝒅0 = −𝒈0 , a set of 𝑨-conjugate directions is generated as:
𝒅0 = −𝒈0; 𝒅𝑘+1 = −𝒈𝑘+1 + 𝛽𝑘𝒅𝑘, 𝑘 ≥ 0,
where 𝛽𝑘 = 𝒈𝑘+1𝑇𝑨𝒅𝑘 / 𝒅𝑘𝑇𝑨𝒅𝑘
There are multiple ways to generate conjugate directions
• Using {𝒅0, 𝒅1, …, 𝒅𝑛−1} as search directions, a quadratic function is minimized in 𝑛 steps.
Conjugate Directions Method
• The parameter 𝛽𝑘 can be computed in different ways:
– By substituting 𝑨𝒅𝑘 = (1/𝛼𝑘)(𝒈𝑘+1 − 𝒈𝑘), we obtain:
𝛽𝑘 = 𝒈𝑘+1𝑇(𝒈𝑘+1 − 𝒈𝑘) / 𝒅𝑘𝑇(𝒈𝑘+1 − 𝒈𝑘) (the Hestenes–Stiefel formula)
– In the case of exact line search, 𝒈𝑘+1𝑇𝒅𝑘 = 0; then
𝛽𝑘 = 𝒈𝑘+1𝑇(𝒈𝑘+1 − 𝒈𝑘) / 𝒈𝑘𝑇𝒈𝑘 (the Polak–Ribière formula)
– Also, for exact line search, 𝒈𝑘+1𝑇𝒈𝑘 = 𝛽𝑘−1(𝒈𝑘 + 𝛼𝑘𝑨𝒅𝑘)𝑇𝒅𝑘−1 = 0,
resulting in 𝛽𝑘 = 𝒈𝑘+1𝑇𝒈𝑘+1 / 𝒈𝑘𝑇𝒈𝑘 (the Fletcher–Reeves formula)
Other versions of 𝛽𝑘 have also been proposed.
Example: Conjugate Gradient Method
• Consider min𝒙 𝑓(𝒙) = 0.1𝑥1² + 𝑥2²,
𝛻𝑓(𝒙) = [0.2𝑥1, 2𝑥2]𝑇, 𝛻²𝑓(𝒙) = diag(0.2, 2); let 𝒙0 = [5, 1]𝑇, then 𝑓(𝒙0) = 3.5,
𝒅0 = −𝛻𝑓(𝒙0) = [−1, −2]𝑇, 𝛼 = 0.61
𝒙1 = [4.39, −0.22]𝑇, 𝑓(𝒙1) = 1.98
𝛽0 = 0.19
𝒅1 = [−0.535, 0.027]𝑇, 𝛼 = 8.2
𝒙2 = [0, 0]𝑇
Example: Conjugate Gradient Method
• MATLAB code
H=[.2 0;0 2];
f=@(x) x'*H*x/2; df=@(x) H*x; ddf=H;
x=[5;1]; n=2;
xall=zeros(n+1,n); xall(1,:)=x';
d=-df(x); a=d'*d/(d'*H*d);
x=x+a*d; xall(2,:)=x';
for i=1:size(x,1)-1
b=df(x)'*H*d/(d'*H*d);
d=-df(x)+b*d;
r=-df(x);
a=r'*r/(d'*H*d);
x=x+a*d;
xall(i+2,:)=x';
end
plot(xall(:,1),xall(:,2)), grid
axis([-1 5 -1 5]), axis equal
Conjugate Gradient Algorithm
• Conjugate-Gradient Algorithm (Griva, Nash & Sofer, p454):
• Initialize: Choose 𝒙0 = 𝟎, 𝒓0 = 𝒃, 𝒅(−1) = 0, 𝛽0 = 0.
• For 𝑖 = 0,1, …
– Check convergence: if ‖𝒓𝑖‖ < 𝜖, stop.
– If 𝑖 > 0, set 𝛽𝑖 = 𝒓𝑖𝑇𝒓𝑖 / 𝒓𝑖−1𝑇𝒓𝑖−1
– Set 𝒅𝑖 = 𝒓𝑖 + 𝛽𝑖𝒅𝑖−1; 𝛼𝑖 = 𝒓𝑖𝑇𝒓𝑖 / 𝒅𝑖𝑇𝑨𝒅𝑖; 𝒙𝑖+1 = 𝒙𝑖 + 𝛼𝑖𝒅𝑖;
𝒓𝑖+1 = 𝒓𝑖 − 𝛼𝑖𝑨𝒅𝑖.
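The algorithm above transcribes almost line for line; a Python sketch for a small SPD system (the 2×2 data is an illustrative choice, not from the slides):

```python
# Linear conjugate-gradient iteration: x0 = 0, r0 = b, d(-1) = 0, beta0 = 0.

def matvec(A, x):
    return [sum(Aij * xj for Aij, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(A, b, tol=1e-10):
    n = len(b)
    x = [0.0] * n              # x0 = 0, so r0 = b
    r = list(b)
    d = [0.0] * n
    rr_old = 0.0
    for _ in range(n + 1):
        rr = dot(r, r)
        if rr < tol * tol:     # convergence check on ||r||
            break
        beta = 0.0 if rr_old == 0.0 else rr / rr_old
        d = [ri + beta * di for ri, di in zip(r, d)]
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * Adi for ri, Adi in zip(r, Ad)]
        rr_old = rr
    return x

A = [[4.0, 1.0], [1.0, 3.0]]   # illustrative SPD matrix
b = [1.0, 2.0]
x = conjugate_gradient(A, b)    # converges in n = 2 steps
```

For this 2×2 system the exact solution is 𝒙 = [1/11, 7/11], reached in 𝑛 = 2 iterations as the theory predicts.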
Conjugate Gradient Method
• Assume that an update that includes steps 𝛼𝑖 along 𝑛 conjugate vectors 𝒅𝑖 is assembled as: 𝒚 = Σ𝑖=1..𝑛 𝛼𝑖𝒅𝑖.
• Then, for a quadratic function, the minimization problem is decomposed into a set of one-dimensional problems, i.e.,
min𝒚 𝑓(𝒚) ≡ Σ𝑖=1..𝑛 min𝛼𝑖 [ ½𝛼𝑖²𝒅𝑖𝑇𝑨𝒅𝑖 − 𝛼𝑖𝒃𝑇𝒅𝑖 ]
• By setting the derivative with respect to 𝛼𝑖 equal to zero, i.e., 𝛼𝑖𝒅𝑖𝑇𝑨𝒅𝑖 − 𝒃𝑇𝒅𝑖 = 0, we obtain: 𝛼𝑖 = 𝒃𝑇𝒅𝑖 / 𝒅𝑖𝑇𝑨𝒅𝑖.
• This shows that the CG algorithm iteratively determines the
conjugate directions 𝒅𝑖 and their coefficients 𝛼𝑖 .
CG Rate of Convergence
• Conjugate gradient methods achieve superlinear convergence:
– In the case of quadratic functions, the minimum is reached exactly
in 𝑛 iterations.
– For general nonlinear functions, convergence in 2𝑛 iterations is to
be expected.
• Nonlinear CG methods typically have the lowest per iteration
computational costs of all gradient methods.
Newton’s Method
• Consider minimizing the second order approximation of 𝑓(𝒙):
minΔ𝒙 𝑓(𝒙𝑘 + Δ𝒙) = 𝑓(𝒙𝑘) + 𝛻𝑓(𝒙𝑘)𝑇Δ𝒙 + ½Δ𝒙𝑇𝑯𝑘Δ𝒙
• Apply FONC: 𝑯𝑘𝒅 + 𝒈𝑘 = 𝟎, where 𝒈𝑘 = 𝛻𝑓(𝒙𝑘)
Then, assuming that 𝑯𝑘 = 𝛻²𝑓(𝒙𝑘) stays positive definite, the Newton's update rule is derived as: 𝒙𝑘+1 = 𝒙𝑘 − 𝑯𝑘⁻¹𝒈𝑘
• Note:
– The convergence of the Newton’s method is dependent on 𝑯𝑘
staying positive definite.
– A step size may be included in the Newton's method, i.e.,
𝒙𝑘+1 = 𝒙𝑘 − 𝛼𝑘𝑯𝑘⁻¹𝒈𝑘
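In one dimension the update reduces to 𝑥 ← 𝑥 − 𝑓′(𝑥)/𝑓′′(𝑥); a minimal sketch applied to the earlier line-search example 𝑓(𝛼) = 𝑒^(−𝛼) + 𝛼², whose 𝑓′′ = 2 + 𝑒^(−𝛼) > 0 everywhere, so the positive-definiteness condition holds:

```python
import math

def newton_1d(df, ddf, x0, tol=1e-10, iters=50):
    """Newton's iteration x <- x - f'(x)/f''(x) for a 1-D function."""
    x = x0
    for _ in range(iters):
        step = df(x) / ddf(x)
        x -= step
        if abs(step) < tol:
            break
    return x

df = lambda a: 2 * a - math.exp(-a)      # f'(a)
ddf = lambda a: 2 + math.exp(-a)         # f''(a) > 0 for all a
a_min = newton_1d(df, ddf, 1.0)          # ≈ 0.3517, matching the earlier slide
```

The quadratic local convergence is visible here: a handful of iterations from 𝑥0 = 1 already drive |𝑓′| to machine precision.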
Marquardt Modification to Newton’s Method
• To ensure the positive definite condition on 𝑯𝑘 , Marquardt
proposed the following modification to Newton’s method:
𝑯𝑘 + 𝜆𝑰 𝒅 = −𝒈𝑘
where 𝜆 is selected to ensure that the Hessian is positive definite.
• Since 𝑯𝑘 + 𝜆𝑰 is also symmetric, the resulting system of linear
equations can be solved for 𝒅 as:
𝑳𝑫𝑳𝑇 𝒅 = −𝛻𝑓 𝒙𝑘
Newton’s Algorithm
Newton’s Method (Griva, Nash, & Sofer, p. 373):
1. Initialize: Choose 𝒙0 , specify 𝜖
2. For 𝑘 = 0,1, …
3. Check convergence: If 𝛻𝑓 𝒙𝑘 < 𝜖, stop
4. Factorize modified Hessian as 𝛻 2 𝑓 𝒙𝑘 + 𝑬 = 𝑳𝑫𝑳𝑇 and solve
𝑳𝑫𝑳𝑇 𝒅 = −𝛻𝑓 𝒙𝑘 for 𝒅
5. Perform line search to determine 𝛼𝑘 and update the solution
estimate as 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝑘 𝒅𝑘
Rate of Convergence
• Newton’s method achieves quadratic rate of convergence in the
close neighborhood of the optimal point, and superlinear
convergence otherwise.
• The main drawback of the Newton’s method is its computational
cost: the Hessian matrix needs to be computed at every step, and a
linear system of equations needs to be solved to obtain the update.
• Due to the high computational and storage costs, classic Newton’s
method is rarely used in practice.
Quasi Newton’s Methods
• The quasi-Newton methods derive from a generalization of the secant method, which approximates the second derivative as:
𝑓′′(𝑥𝑘) ≅ (𝑓′(𝑥𝑘) − 𝑓′(𝑥𝑘−1)) / (𝑥𝑘 − 𝑥𝑘−1)
• In the multi-dimensional case, the secant condition is generalized
as: 𝑯𝑘 𝒙𝑘 − 𝒙𝑘−1 = 𝛻𝑓 𝒙𝑘 − 𝛻𝑓 𝒙𝑘−1
• Define the inverse 𝑭𝑘 = 𝑯𝑘⁻¹; then
𝒙𝑘 − 𝒙𝑘−1 = 𝑭𝑘 𝛻𝑓 𝒙𝑘 − 𝛻𝑓 𝒙𝑘−1
• The quasi-Newton methods iteratively update 𝑯𝑘 or 𝑭𝑘 as:
– Direct update: 𝑯𝑘+1 = 𝑯𝑘 + ∆𝑯𝑘, 𝑯0 = 𝑰
– Inverse update: 𝑭𝑘+1 = 𝑭𝑘 + ∆𝑭𝑘, 𝑭 = 𝑯⁻¹, 𝑭0 = 𝑰
Quasi-Newton Methods
• Quasi-Newton update:
Let 𝒔𝑘 = 𝒙𝑘+1 − 𝒙𝑘, 𝒚𝑘 = 𝛻𝑓(𝒙𝑘+1) − 𝛻𝑓(𝒙𝑘); then,
– The DFP (Davidon–Fletcher–Powell) formula for inverse Hessian update is given as:
𝑭𝑘+1 = 𝑭𝑘 − (𝑭𝑘𝒚𝑘)(𝑭𝑘𝒚𝑘)𝑇 / (𝒚𝑘𝑇𝑭𝑘𝒚𝑘) + 𝒔𝑘𝒔𝑘𝑇 / (𝒚𝑘𝑇𝒔𝑘)
– The BFGS (Broyden–Fletcher–Goldfarb–Shanno) formula for direct Hessian update is given as:
𝑯𝑘+1 = 𝑯𝑘 − (𝑯𝑘𝒔𝑘)(𝑯𝑘𝒔𝑘)𝑇 / (𝒔𝑘𝑇𝑯𝑘𝒔𝑘) + 𝒚𝑘𝒚𝑘𝑇 / (𝒚𝑘𝑇𝒔𝑘)
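A defining property of the BFGS update is that 𝑯𝑘+1 satisfies the secant condition 𝑯𝑘+1𝒔𝑘 = 𝒚𝑘; the sketch below checks this for one update, with the (𝒔, 𝒚) pair borrowed from the worked quasi-Newton example that follows:

```python
# One BFGS direct-Hessian update (2x2, plain lists), verifying H1 s = y.

def matvec(H, v):
    return [sum(Hij * vj for Hij, vj in zip(row, v)) for row in H]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def bfgs_update(H, s, y):
    """H1 = H - (Hs)(Hs)'/(s'Hs) + yy'/(y's)."""
    Hs = matvec(H, s)
    sHs = dot(s, Hs)
    ys = dot(y, s)                        # must be > 0 (curvature condition)
    n = len(s)
    return [[H[i][j] - Hs[i] * Hs[j] / sHs + y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]

H0 = [[1.0, 0.0], [0.0, 1.0]]             # H0 = I
s = [-0.9375, -0.3125]                    # step x1 - x0 (example below)
y = [-3.4375, 0.3125]                     # gradient change (example below)
H1 = bfgs_update(H0, s, y)
```

Algebraically, 𝑯1𝒔 = 𝑯0𝒔 − 𝑯0𝒔 + 𝒚 = 𝒚, and the same cancellation holds at every iteration, which is what makes the update a valid secant approximation.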
Quasi-Newton Algorithm
The Quasi-Newton Algorithm (Griva, Nash & Sofer, p.415):
• Initialize: Choose 𝒙0 , 𝑯0 (e.g., 𝑯0 = 𝑰), specify 𝜀
• For 𝑘 = 0,1, …
– Check convergence: If 𝛻𝑓 𝒙𝑘 < 𝜀, stop
– Solve 𝑯𝑘 𝒅 = −𝛻𝑓 𝒙𝑘 for 𝒅𝑘 (alternatively, 𝒅 = −𝑭𝑘 𝛻𝑓 𝒙𝑘 )
– Solve min𝛼 𝑓(𝒙𝑘 + 𝛼𝒅𝑘) for 𝛼𝑘, and update the current estimate:
𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝑘𝒅𝑘
– Compute 𝒔𝑘 , 𝒚𝑘 , and update 𝑯𝑘 (or 𝑭𝑘 as applicable)
Example: Quasi-Newton Method
• Consider the problem: min𝑥1,𝑥2 𝑓(𝑥1, 𝑥2) = 2𝑥1² − 𝑥1𝑥2 + 𝑥2², where
𝑯 = [4 −1; −1 2], 𝛻𝑓 = 𝑯𝒙. Let 𝒙0 = [1, 1]𝑇, 𝑓0 = 2, 𝑯0 = 𝑰, 𝑭0 = 𝑰;
Choose 𝒅0 = −𝛻𝑓(𝒙0) = [−3, −1]𝑇;
then 𝑓(𝛼) = 2(1 − 3𝛼)² + (1 − 𝛼)² − (1 − 3𝛼)(1 − 𝛼).
Using 𝑓′(𝛼) = 0 → 𝛼 = 5/16 → 𝒙1 = [0.0625, 0.6875]𝑇, 𝑓1 = 0.4375;
then 𝒚1 = [−3.44, 0.313]𝑇, 𝑭1 = [1.193 0.065; 0.065 1.022], 𝑯1 = [0.381 −0.206; −0.206 0.9313],
and using either update formula 𝒅1 = [0.4375, −1.313]𝑇; for the next step,
𝑓(𝛼) = 2.68𝛼² − 1.91𝛼 + 0.4375 → 𝛼 = 0.3572, 𝒙2 = [0.2188, 0.2188]𝑇.
Example: Quasi-Newton Method
• For a quadratic function, convergence is achieved in two iterations.
Trust-Region Methods
• The trust-region methods locally employ a quadratic approximation
𝑞𝑘 𝒙𝑘 to the nonlinear objective function.
• The approximation is valid in the neighborhood of 𝒙𝑘 defined by
Ω𝑘 = 𝒙: 𝚪(𝒙 − 𝒙𝑘 ) ≤ ∆𝑘 , where 𝚪 is a scaling parameter.
• The method aims to find a 𝒙𝑘+1 ∈ Ω𝑘 , that satisfies the sufficient
decrease condition in 𝑓(𝒙).
• The quality of the quadratic approximation is estimated by the reliability index: 𝛾𝑘 = (𝑓(𝒙𝑘) − 𝑓(𝒙𝑘+1)) / (𝑞𝑘(𝒙𝑘) − 𝑞𝑘(𝒙𝑘+1)). If this ratio is close to unity, the trust region may be expanded in the next iteration.
Trust-Region Methods
• At each iteration 𝑘, trust-region algorithm solves a constrained
optimization sub-problem involving quadratic approximation:
min𝒅 𝑞𝑘(𝒅) = 𝑓(𝒙𝑘) + 𝛻𝑓(𝒙𝑘)𝑇𝒅 + ½𝒅𝑇𝛻²𝑓(𝒙𝑘)𝒅
Subject to: ‖𝒅‖ ≤ ∆𝑘
Lagrangian function: ℒ(𝒅, 𝜆) = 𝑞𝑘(𝒅) + 𝜆(‖𝒅‖ − ∆𝑘)
FONC: (𝛻²𝑓(𝒙𝑘) + 𝜆𝑰)𝒅𝑘 = −𝛻𝑓(𝒙𝑘), 𝜆(‖𝒅‖ − ∆𝑘) = 0
• The resulting search direction 𝒅𝑘 is given as: 𝒅𝑘 = 𝒅𝑘 (𝜆).
– For large ∆𝑘 and a positive-definite 𝛻 2 𝑓 𝒙𝑘 , the Lagrange
multiplier 𝜆 → 0, and 𝒅𝑘 (𝜆) reduces to the Newton’s direction.
– For ∆𝑘 → 0, 𝜆 → ∞, and 𝒅𝑘 (𝜆) aligns with the steepest-descent
direction.
Trust-Region Algorithm
• Trust-Region Algorithm (Griva, Nash & Sofer, p.392):
• Initialize: choose 𝒙0, ∆0; specify 𝜀, 0 < 𝜇 < 𝜂 < 1 (e.g., 𝜇 = ¼; 𝜂 = ¾)
• For 𝑘 = 0,1, …
– Check convergence: If ‖𝛻𝑓(𝒙𝑘)‖ < 𝜀, stop
– Solve the subproblem: min𝒅 𝑞𝑘(𝒅) subject to ‖𝒅‖ ≤ ∆𝑘
– Compute 𝛾𝑘,
• if 𝛾𝑘 < 𝜇, set 𝒙𝑘+1 = 𝒙𝑘, ∆𝑘+1 = ½∆𝑘
• else if 𝛾𝑘 < 𝜂, set 𝒙𝑘+1 = 𝒙𝑘 + 𝒅𝑘, ∆𝑘+1 = ∆𝑘
• else set 𝒙𝑘+1 = 𝒙𝑘 + 𝒅𝑘, ∆𝑘+1 = 2∆𝑘
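The 𝛾-based accept/shrink/expand logic can be isolated as a small helper; a sketch with the thresholds 𝜇 = ¼, 𝜂 = ¾ used in the algorithm above (the subproblem solve itself is not shown):

```python
# Trust-region bookkeeping: gamma is the reliability index, mu/eta thresholds.

def trust_region_update(x, d, gamma, delta, mu=0.25, eta=0.75):
    """Return (next point, next radius) per the accept/expand rules."""
    if gamma < mu:                        # poor model: reject step, shrink
        return x, 0.5 * delta
    x_new = [xi + di for xi, di in zip(x, d)]
    if gamma < eta:                       # acceptable: accept, keep radius
        return x_new, delta
    return x_new, 2.0 * delta             # very good model: accept, expand

x_good, delta_good = trust_region_update([1.0, 1.0], [0.5, 0.0], 0.9, 1.0)
x_bad, delta_bad = trust_region_update([1.0, 1.0], [0.5, 0.0], 0.1, 1.0)
```

A rejected step keeps the current iterate and halves ∆; a step with 𝛾 near unity doubles ∆ for the next iteration.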
Computer Methods for Constrained Problems
• Penalty and Barrier methods
• Augmented Lagrangian method (AL)
• Sequential linear programming (SLP)
• Sequential quadratic programming (SQP)
Penalty and Barrier Methods
• Consider the general optimization problem: min𝒙 𝑓(𝒙)
Subject to: ℎ𝑖(𝒙) = 0, 𝑖 = 1, …, 𝑝;
𝑔𝑗(𝒙) ≤ 0, 𝑗 = 1, …, 𝑚;
𝑥𝑖𝐿 ≤ 𝑥𝑖 ≤ 𝑥𝑖𝑈, 𝑖 = 1, …, 𝑛.
• Define a composite function to be used for constraint compliance:
Φ 𝒙, 𝑟 = 𝑓 𝒙 + 𝑃 𝑔 𝒙 , ℎ 𝒙 , 𝒓
where 𝑃 defines a loss function, and 𝒓 is a vector of weights (penalty
parameters)
Penalty and Barrier Methods
• Penalty Function Method. A penalty function method employs a
quadratic loss function and iterates through the infeasible region
𝑃(𝑔(𝒙), ℎ(𝒙), 𝒓) = 𝑟[ Σ𝑖 (𝑔𝑖⁺(𝒙))² + Σ𝑖 (ℎ𝑖(𝒙))² ],
𝑔𝑖⁺(𝒙) = max(0, 𝑔𝑖(𝒙)), 𝑟 > 0
• Barrier Function Method. A barrier method employs a log barrier function and iterates through the feasible region
𝑃(𝑔(𝒙), ℎ(𝒙), 𝒓) = −(1/𝑟) Σ𝑖 log(−𝑔𝑖(𝒙))
• For both penalty and barrier methods, as 𝑟 → ∞, 𝒙(𝑟) → 𝒙∗
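A minimal sketch of the penalty iteration on a toy problem of my own (not from the slides): minimize 𝑓(𝑥) = 𝑥² subject to ℎ(𝑥) = 𝑥 − 1 = 0. The penalized function is Φ(𝑥, 𝑟) = 𝑥² + 𝑟(𝑥 − 1)², whose unconstrained minimizers 𝑥(𝑟) = 𝑟/(1 + 𝑟) approach 𝒙∗ = 1 from the infeasible side as 𝑟 grows:

```python
# Quadratic-penalty method on: min x^2  s.t.  x - 1 = 0.

def minimize_1d(phi, lo, hi, iters=100):
    """Ternary search for the minimum of a unimodal 1-D function."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if phi(m1) < phi(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

xs = []
for r in (1.0, 10.0, 100.0, 1000.0):
    phi = lambda x, r=r: x * x + r * (x - 1.0) ** 2   # penalized objective
    xs.append(minimize_1d(phi, 0.0, 2.0))
# xs increases monotonically toward the constrained minimizer x* = 1
```

This also illustrates the practical drawback: the exact solution is only reached in the limit 𝑟 → ∞, where Φ becomes ill-conditioned.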
The Augmented Lagrangian Method
• Consider an equality-constrained problem: min 𝑓 𝒙
𝒙
Subject to: ℎ𝑖 𝒙 = 0, 𝑖 = 1, … , 𝑙
• Define the augmented Lagrangian (AL) as:
𝒫(𝒙, 𝒗, 𝑟) = 𝑓(𝒙) + Σ𝑗 [ 𝑣𝑗ℎ𝑗(𝒙) + ½𝑟ℎ𝑗²(𝒙) ]
where the additional term defines an exterior penalty function with
𝑟 as the penalty parameter.
• For inequality constrained problems, the AL may be defined as:
𝒫(𝒙, 𝒖, 𝑟) = 𝑓(𝒙) + Σ𝑖 { 𝑢𝑖𝑔𝑖(𝒙) + ½𝑟𝑔𝑖²(𝒙), if 𝑔𝑖 + 𝑢𝑖/𝑟 ≥ 0; −𝑢𝑖²/2𝑟, if 𝑔𝑖 + 𝑢𝑖/𝑟 < 0 }
where a large 𝑟 makes the Hessian of AL positive definite at 𝒙.
The Augmented Lagrangian Method
• The dual function for the AL is defined as:
𝜓(𝒗) = min𝒙 𝒫(𝒙, 𝒗, 𝑟) = min𝒙 [ 𝑓(𝒙) + Σ𝑗 ( 𝑣𝑗ℎ𝑗(𝒙) + ½𝑟ℎ𝑗²(𝒙) ) ]
• The resulting dual optimization problem is: max𝒗 𝜓(𝒗)
• The dual problem may be solved via Newton's method as:
𝒗𝑘+1 = 𝒗𝑘 − [𝑑²𝜓/𝑑𝑣𝑖𝑑𝑣𝑗]⁻¹𝒉
where 𝑑²𝜓/𝑑𝑣𝑖𝑑𝑣𝑗 = −𝛻ℎ𝑖𝑇(𝛻²𝒫)⁻¹𝛻ℎ𝑗
• For large 𝒓, the Newton's update may be approximated as:
𝑣𝑗𝑘+1 = 𝑣𝑗𝑘 + 𝑟𝑗ℎ𝑗, 𝑗 = 1, …, 𝑙
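The approximate multiplier update 𝑣 ← 𝑣 + 𝑟ℎ(𝒙) can be demonstrated on a toy problem of my own (not from the slides): minimize 𝑥² subject to 𝑥 − 1 = 0, for which 𝒫(𝑥, 𝑣, 𝑟) = 𝑥² + 𝑣(𝑥 − 1) + ½𝑟(𝑥 − 1)² has the closed-form inner minimizer 𝑥 = (𝑟 − 𝑣)/(2 + 𝑟) (set 𝑑𝒫/𝑑𝑥 = 2𝑥 + 𝑣 + 𝑟(𝑥 − 1) = 0):

```python
# Augmented-Lagrangian iteration on: min x^2  s.t.  h(x) = x - 1 = 0.

r = 10.0        # fixed penalty parameter
v = 0.0         # multiplier estimate
for _ in range(20):
    x = (r - v) / (2.0 + r)    # exact inner minimization of P(x, v, r)
    h = x - 1.0                # constraint violation
    v = v + r * h              # multiplier update v <- v + r*h
# converges to x* = 1, v* = -2 (where grad f + v*grad h = 2x + v = 0)
```

Unlike the plain penalty method, the multiplier update reaches the exact constrained solution with a finite 𝑟: here the multiplier error contracts by a factor 2/(2 + 𝑟) per iteration.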
Example: Augmented Lagrangian
• Maximize the volume of a cylindrical tank subject to surface area
constraint:
max𝑑,𝑙 𝑓(𝑑, 𝑙) = 𝜋𝑑²𝑙/4, subject to ℎ: 𝜋𝑑²/4 + 𝜋𝑑𝑙 − 𝐴0 = 0
• We can normalize the problem as:
min𝑑,𝑙 𝑓(𝑑, 𝑙) = −𝑑²𝑙, subject to ℎ: 𝑑² + 4𝑑𝑙 − 1 = 0
• The solution to the primal problem is obtained as:
Lagrangian function: ℒ(𝑑, 𝑙, 𝜆) = −𝑑²𝑙 + 𝜆(𝑑² + 4𝑑𝑙 − 1)
FONC: 𝜆(𝑑 + 2𝑙) − 𝑑𝑙 = 0, 4𝜆𝑑 − 𝑑² = 0, 𝑑² + 4𝑑𝑙 − 1 = 0
Optimal solution: 𝑑∗ = 2𝑙∗ = 4𝜆∗ = 1/√3.
Example: Augmented Lagrangian
• Alternatively, define the Augmented Lagrangian function as:
𝒫(𝑑, 𝑙, 𝜆, 𝑟) = −𝑑²𝑙 + 𝜆(𝑑² + 4𝑑𝑙 − 1) + ½𝑟(𝑑² + 4𝑑𝑙 − 1)²
• Define the dual function: 𝜓(𝜆) = min𝑑,𝑙 𝒫(𝑑, 𝑙, 𝜆, 𝑟)
• Define the dual optimization problem: max𝜆 𝜓(𝜆)
• Solution to the dual problem: 𝜆∗ = 𝜆𝑚𝑎𝑥 = 0.144
• Solution to the design variables: 𝑑∗ = 2𝑙∗ = 0.577
Sequential Linear Programming
• Consider the general optimization problem: min𝒙 𝑓(𝒙)
Subject to: ℎ𝑖(𝒙) = 0, 𝑖 = 1, …, 𝑝;
𝑔𝑗(𝒙) ≤ 0, 𝑗 = 1, …, 𝑚;
𝑥𝑖𝐿 ≤ 𝑥𝑖 ≤ 𝑥𝑖𝑈, 𝑖 = 1, …, 𝑛.
• Let 𝒙𝑘 denote the current estimate of the design variables, and let
𝒅 denote the change in variables; define the first order expansion
of the objective and constraint functions in the neighborhood of 𝒙𝑘
𝑓(𝒙𝑘 + 𝒅) = 𝑓(𝒙𝑘) + 𝛻𝑓(𝒙𝑘)𝑇𝒅
𝑔𝑖(𝒙𝑘 + 𝒅) = 𝑔𝑖(𝒙𝑘) + 𝛻𝑔𝑖(𝒙𝑘)𝑇𝒅, 𝑖 = 1, …, 𝑚
ℎ𝑗(𝒙𝑘 + 𝒅) = ℎ𝑗(𝒙𝑘) + 𝛻ℎ𝑗(𝒙𝑘)𝑇𝒅, 𝑗 = 1, …, 𝑙
Sequential Linear Programming
• Let 𝑓𝑘 = 𝑓(𝒙𝑘), 𝑔𝑖𝑘 = 𝑔𝑖(𝒙𝑘), ℎ𝑗𝑘 = ℎ𝑗(𝒙𝑘); 𝑏𝑖 = −𝑔𝑖𝑘, 𝑒𝑗 = −ℎ𝑗𝑘,
𝒄 = 𝛻𝑓(𝒙𝑘), 𝒂𝑖 = 𝛻𝑔𝑖(𝒙𝑘), 𝒏𝑗 = 𝛻ℎ𝑗(𝒙𝑘),
𝑨 = [𝒂1, 𝒂2, …, 𝒂𝑚], 𝑵 = [𝒏1, 𝒏2, …, 𝒏𝑙].
• Using first order expansion, define an LP subprogram for the
current iteration of the NLP problem:
min𝒅 𝑓 = 𝒄𝑇𝒅
Subject to: 𝑨𝑇 𝒅 ≤ 𝒃,
𝑵𝑇 𝒅 = 𝒆
where 𝑓 represents first-order change in the cost function, and the
columns of 𝑨 and 𝑵 matrices represent, respectively, the gradients
of inequality and equality constraints.
• The resulting LP problem can be solved via the Simplex method.
Sequential Linear Programming
• We may note that:
– Since both positive and negative changes to design variables 𝒙𝑘 are
allowed, the variables 𝑑𝑖 are unrestricted in sign
– The SLP method requires additional constraints of the form:
− ∆𝑘𝑖𝑙 ≤ 𝑑𝑖𝑘 ≤ ∆𝑘𝑖𝑢 (termed move limits) to bind the LP solution.
These limits represent maximum allowable change in 𝑑𝑖 in the
current iteration and are selected as percentage of current value.
– Move limits serve dual purpose of binding the solution and
obviating the need for line search.
– Overly restrictive move limits tend to make the SLP problem
infeasible.
SLP Example
• Consider the convex NLP problem:
min 𝑓(𝑥1 , 𝑥2 ) = 𝑥12 − 𝑥1 𝑥2 + 𝑥22
𝑥1 ,𝑥2
Subject to: 1 − 𝑥12 − 𝑥22 ≤ 0; −𝑥1 ≤ 0, −𝑥2 ≤ 0
The problem has a single minimum at: 𝒙∗ = [1/√2, 1/√2]𝑇
• The objective and constraint gradients are:
𝛻𝑓 𝑇 = 2𝑥1 − 𝑥2 , 2𝑥2 − 𝑥1 ,
𝛻𝑔1𝑇 = −2𝑥1 , −2𝑥2 , 𝛻𝑔2𝑇 = −1,0 , 𝛻𝑔3𝑇 = [0, −1].
• Let 𝒙0 = (1, 1), then 𝑓0 = 1, 𝒄𝑇 = [1, 1], 𝑏1 = 𝑏2 = 𝑏3 = 1;
𝒂1𝑇 = [−2, −2], 𝒂2𝑇 = [−1, 0], 𝒂3𝑇 = [0, −1]
SLP Example
• Define the LP subproblem at the current step as:
min𝑑1,𝑑2 𝑓 = 𝑑1 + 𝑑2
Subject to: [−2 −2; −1 0; 0 −1][𝑑1; 𝑑2] ≤ [1; 1; 1]
• In the absence of move limits, the LP problem is unbounded; using 50% move limits, the SLP update is given as: 𝒅∗ = [−½, −½]𝑇,
𝒙1 = [½, ½]𝑇, with resulting constraint violation: 𝑔𝑖 = {½, 0, 0};
smaller move limits may be used to reduce the constraint violation.
Sequential Linear Programming
SLP Algorithm (Arora, p. 508):
• Initialize: choose 𝒙0 , 𝜀1 > 0, 𝜀2 > 0.
• For 𝑘 = 0,1,2, …
– Choose move limits ∆𝑘𝑖𝑙 , ∆𝑘𝑖𝑢 as some fraction of current design 𝒙𝑘
– Compute 𝑓 𝑘 , 𝒄, 𝑔𝑖𝑘 , ℎ𝑗𝑘 , 𝑏𝑖 , 𝑒𝑗
– Formulate and solve the LP subproblem for 𝒅𝑘
– If 𝑔𝑖 ≤ 𝜀1, 𝑖 = 1, …, 𝑚; |ℎ𝑗| ≤ 𝜀1, 𝑗 = 1, …, 𝑝; and ‖𝒅𝑘‖ ≤ 𝜀2, stop
– Substitute 𝒙𝑘+1 ← 𝒙𝑘 + 𝛼𝒅𝑘 , 𝑘 ← 𝑘 + 1.
Sequential Quadratic Programming
• Sequential quadratic programming (SQP) uses a quadratic
approximation to the objective function at every step of iteration.
• The SQP problem is defined as:
min𝒅 𝑓 = 𝒄𝑇𝒅 + ½𝒅𝑇𝒅
Subject to: 𝑨𝑇𝒅 ≤ 𝒃, 𝑵𝑇𝒅 = 𝒆
• SQP does not require move limits, alleviating the shortcomings of
the SLP method.
• The SQP problem is convex; hence, it has a single global minimum.
• SQP can be solved via Simplex based linear complementarity problem
(LCP) framework.
Sequential Quadratic Programming
• The Lagrangian function for the SQP problem is defined as:
ℒ(𝒅, 𝒖, 𝒗) = 𝒄𝑇𝒅 + ½𝒅𝑇𝒅 + 𝒖𝑇(𝑨𝑇𝒅 − 𝒃 + 𝒔) + 𝒗𝑇(𝑵𝑇𝒅 − 𝒆)
• Then the KKT conditions are:
Optimality: 𝛁ℒ = 𝒄 + 𝒅 + 𝑨𝒖 + 𝑵𝒗 = 𝟎,
Feasibility: 𝑨𝑇 𝒅 + 𝒔 = 𝒃, 𝑵𝑇 𝒅 = 𝒆 ,
Complementarity: 𝒖𝑇 𝒔 = 𝟎,
Non-negativity: 𝒖 ≥ 𝟎, 𝒔 ≥ 𝟎
Sequential Quadratic Programming
• Since 𝒗 is unrestricted in sign, let 𝒗 = 𝒚 − 𝒛, 𝒚 ≥ 𝟎, 𝒛 ≥ 𝟎, and
the KKT conditions are compactly written as:
[𝑰 𝑨 𝟎 𝑵 −𝑵; 𝑨𝑇 𝟎 𝑰 𝟎 𝟎; 𝑵𝑇 𝟎 𝟎 𝟎 𝟎][𝒅; 𝒖; 𝒔; 𝒚; 𝒛] = [−𝒄; 𝒃; 𝒆],
or 𝑷𝑿 = 𝑸
• The complementary slackness conditions, 𝒖𝑇 𝒔 = 𝟎, translate as:
𝑿𝑖 𝑿𝑖+𝑚 = 0, 𝑖 = 𝑛 + 1, ⋯ , 𝑛 + 𝑚.
• The resulting problem can be solved via Simplex method using LCP
framework.
Descent Function Approach
• In SQP methods, the line search step is based on minimization of a
descent function that penalizes constraint violations, i.e.,
Φ 𝒙 = 𝑓 𝒙 + 𝑅𝑉 𝒙
where 𝑓 𝒙 is the cost function, 𝑉 𝒙 represents current
maximum constraint violation, and 𝑅 > 0 is a penalty parameter.
• The descent function value at the current iteration is computed as:
Φ𝑘 = 𝑓𝑘 + 𝑅𝑉𝑘,
𝑅 = max{𝑅𝑘, 𝑟𝑘}, where 𝑟𝑘 = Σ𝑖=1..𝑚 𝑢𝑖𝑘 + Σ𝑗=1..𝑝 |𝑣𝑗𝑘|
𝑉𝑘 = max{0; 𝑔𝑖, 𝑖 = 1, …, 𝑚; |ℎ𝑗|, 𝑗 = 1, …, 𝑝}
• The line search subproblem is defined as:
min𝛼 Φ(𝛼) = Φ(𝒙𝑘 + 𝛼𝒅𝑘)
SQP Algorithm
SQP Algorithm (Arora, p. 526):
• Initialize: choose 𝒙0 , 𝑅0 = 1, 𝜀1 > 0, 𝜀2 > 0.
• For 𝑘 = 0,1,2, …
– Compute 𝑓 𝑘 , 𝑔𝑖𝑘 , ℎ𝑗𝑘 , 𝒄, 𝑏𝑖 , 𝑒𝑗 ; compute 𝑉𝑘 .
– Formulate and solve the QP subproblem to obtain 𝒅𝑘 and the
Lagrange multipliers 𝒖𝑘 and 𝒗𝑘 .
– If 𝑉𝑘 ≤ 𝜀1 and 𝒅𝑘 ≤ 𝜀2 , stop.
– Compute 𝑅; formulate and solve line search subproblem for 𝛼
– Set 𝒙𝑘+1 ← 𝒙𝑘 + 𝛼𝒅𝑘 , 𝑅𝑘+1 ← 𝑅, 𝑘 ← 𝑘 + 1
• The above algorithm is convergent, i.e., Φ 𝒙𝑘 ≤ Φ 𝒙0 ; 𝒙𝑘
converges to the KKT point 𝒙∗
SQP with Approximate Line Search
• The SQP algorithm can use with approximate line search as follows:
Let 𝑡𝑗 , 𝑗 = 0,1, … denote a trial step size,
𝒙𝑘+1,𝑗 denote the trial design point,
𝑓 𝑘+1,𝑗 = 𝑓( 𝒙𝑘+1,𝑗 ) denote the function value at the trial solution, and
Φ𝑘+1,𝑗 = 𝑓 𝑘+1,𝑗 + 𝑅𝑉𝑘+1,𝑗 is the penalty function at the trial solution.
• The trial solution is required to satisfy the descent condition:
Φ𝑘+1,𝑗 + 𝑡𝑗𝛾‖𝒅𝑘‖² ≤ Φ𝑘, 0 < 𝛾 < 1
where a common choice is: 𝛾 = ½, 𝜇 = ½, 𝑡𝑗 = 𝜇ʲ, 𝑗 = 0, 1, 2, …
• The above descent condition ensures a sufficient decrease in the descent function, and hence in the constraint violation, at each step of the method.
SQP Example
• Consider the NLP problem: min𝑥1,𝑥2 𝑓(𝑥1, 𝑥2) = 𝑥1² − 𝑥1𝑥2 + 𝑥2²
subject to 𝑔1: 1 − 𝑥1² − 𝑥2² ≤ 0, 𝑔2: −𝑥1 ≤ 0, 𝑔3: −𝑥2 ≤ 0
Then 𝛻𝑓𝑇 = [2𝑥1 − 𝑥2, 2𝑥2 − 𝑥1], 𝛻𝑔1𝑇 = [−2𝑥1, −2𝑥2], 𝛻𝑔2𝑇 = [−1, 0], 𝛻𝑔3𝑇 = [0, −1]. Let 𝒙0 = (1, 1); then, 𝑓0 = 1, 𝒄 = [1, 1]𝑇, 𝑔1(1,1) = 𝑔2(1,1) = 𝑔3(1,1) = −1.
• Since all constraints are initially inactive, 𝑉0 = 0, and 𝒅 = −𝒄 = [−1, −1]𝑇; the line search problem is: min𝛼 Φ(𝛼) = (1 − 𝛼)²;
• By setting Φ′(𝛼) = 0, we get the analytical solution: 𝛼 = 1; thus 𝒙1 = (0, 0), which results in a large constraint violation
SQP Example
• Alternatively, we may use approximate line search as follows:
– Let 𝑅0 = 10, 𝛾 = 𝜇 = ½; let 𝑡0 = 1, then 𝒙1,0 = (0, 0), 𝑓1,0 = 0, 𝑉1,0 = 1, Φ1,0 = 10; ‖𝒅0‖² = 2, and the descent condition Φ1,0 + ½‖𝒅0‖² ≤ Φ0 = 1 is not met at the trial point.
– Next, for 𝑡1 = ½, we get: 𝒙1,1 = (½, ½), 𝑓1,1 = ¼, 𝑉1,1 = ½, Φ1,1 = 5¼, and the descent condition fails again;
– Next, for 𝑡2 = ¼, we get: 𝒙1,2 = (¾, ¾), 𝑉1,2 = 0, 𝑓1,2 = Φ1,2 = 9/16, and the descent condition checks as: Φ1,2 + ⅛‖𝒅0‖² ≤ Φ0 = 1.
– Therefore, we set 𝛼 = 𝑡2 = ¼, 𝒙1 = (¾, ¾), with no constraint violation.
The Active Set Strategy
• To reduce the computational cost of solving the QP subproblem, we
may only include the active constraints in the problem.
• For 𝒙𝑘 ∈ Ω, the set of potentially active constraints is defined as:
ℐ𝑘 = 𝑖: 𝑔𝑖𝑘 > −𝜀; 𝑖 = 1, … , 𝑚 ⋃ 𝑗: 𝑗 = 1, … , 𝑝 for some 𝜀.
• For 𝒙𝑘 ∉ Ω, let 𝑉𝑘 = max {0; 𝑔𝑖𝑘 , 𝑖 = 1, . . . , 𝑚; ℎ𝑗𝑘 , 𝑗 = 1, … , 𝑝};
then, the active constraint set is defined as:
ℐ𝑘 = 𝑖: 𝑔𝑖𝑘 > 𝑉𝑘 − 𝜀; 𝑖 = 1, … , 𝑚 ⋃ 𝑗: ℎ𝑗𝑘 > 𝑉𝑘 − 𝜀; 𝑗 = 1, … , 𝑝
• The gradients of inactive constraints, i.e., those not in ℐ𝑘 , do not
need to be computed
SQP via Newton’s Method
• Consider the following equality constrained problem:
min 𝑓(𝒙), subject to ℎ𝑖 𝒙 = 0, 𝑖 = 1, … , 𝑙
𝒙
• The Lagrangian function is given as: ℒ 𝒙, 𝒗 = 𝑓 𝒙 + 𝒗𝑇 𝒉(𝒙)
• The KKT conditions are: 𝛻ℒ 𝒙, 𝒗 = 𝛻𝑓 𝒙 + 𝑵𝒗 = 𝟎, 𝒉 𝒙 = 𝟎
where 𝑵 = 𝛁𝒉(𝒙) is a Jacobian matrix whose 𝑖th column is 𝛻ℎ𝑖 𝒙
• Using first order Taylor series expansion (with shorthand notation):
𝛻ℒ 𝑘+1 = 𝛻ℒ 𝑘 + 𝛻 2 ℒ 𝑘 Δ𝒙 + 𝑁Δ𝒗
𝒉𝑘+1 = 𝒉𝑘 + 𝑵𝑇 Δ𝒙
• By expanding Δ𝒗 = 𝒗𝑘+1 − 𝒗𝑘, 𝛻ℒ𝑘 = 𝛻𝑓𝑘 + 𝑵𝒗𝑘, and assuming 𝒗𝑘 ≅ 𝒗𝑘+1, we obtain:
[𝛻²ℒ𝑘 𝑵; 𝑵𝑇 𝟎][Δ𝒙; 𝒗𝑘+1] = −[𝛻𝑓𝑘; 𝒉𝑘]
which is similar to the N-R update, but uses the Hessian of the Lagrangian
SQP via Newton’s Method
• Alternately, we consider minimizing the quadratic approximation:
minΔ𝒙 ½Δ𝒙𝑇𝛻²ℒΔ𝒙 + 𝛻𝑓𝑇Δ𝒙
Subject to: ℎ𝑖(𝒙) + 𝒏𝑖𝑇Δ𝒙 = 0, 𝑖 = 1, …, 𝑙
• The KKT conditions are: 𝛻𝑓 + 𝛻 2 ℒΔ𝒙 + 𝑵𝒗 = 𝟎, 𝒉 + 𝑵Δ𝒙 = 𝟎
• Thus the QP subproblem can be solved via Newton’s method!
[𝛻²ℒ𝑘 𝑵; 𝑵𝑇 𝟎][Δ𝒙𝑘; 𝒗𝑘+1] = −[𝛻𝑓𝑘; 𝒉𝑘]
• The Hessian of the Lagrangian can be updated via BFGS method as:
𝑯𝑘+1 = 𝑯𝑘 + 𝑫𝑘 − 𝑬𝑘
where 𝑫𝑘 = 𝒚𝑘𝒚𝑘𝑇 / (𝒚𝑘𝑇Δ𝒙𝑘), 𝑬𝑘 = 𝒄𝑘𝒄𝑘𝑇 / (𝒄𝑘𝑇Δ𝒙𝑘), 𝒄𝑘 = 𝑯𝑘Δ𝒙𝑘, 𝒚𝑘 = 𝛻ℒ𝑘+1 − 𝛻ℒ𝑘
Example: SQP with Hessian Update
• Consider the NLP problem: min𝑥1,𝑥2 𝑓(𝑥1, 𝑥2) = 𝑥1² − 𝑥1𝑥2 + 𝑥2²
subject to 𝑔1: 1 − 𝑥1² − 𝑥2² ≤ 0, 𝑔2: −𝑥1 ≤ 0, 𝑔3: −𝑥2 ≤ 0
Let 𝒙0 = (1, 1); then, 𝑓0 = 1, 𝒄 = [1, 1]𝑇, 𝑔1(1,1) = 𝑔2(1,1) = 𝑔3(1,1) = −1; 𝛻𝑔1𝑇 = [−2, −2], 𝛻𝑔2𝑇 = [−1, 0], 𝛻𝑔3𝑇 = [0, −1].
• Using approximate line search, 𝛼 = ¼, 𝒙1 = (¾, ¾).
• For the Hessian update, we have:
𝑓1 = 0.5625, 𝑔1 = −0.125, 𝑔2 = 𝑔3 = −0.75; 𝒄1 = [0.75, 0.75];
𝛻𝑔1𝑇 = [−3/2, −3/2], 𝛻𝑔2𝑇 = [−1, 0], 𝛻𝑔3𝑇 = [0, −1]; Δ𝒙0 = [−0.25, −0.25];
then, 𝑫0 = 𝑬0 = ½[1 1; 1 1], so 𝑯1 = 𝑯0
SQP with Hessian Update
• For the next step, the QP problem is defined as:
min𝑑1,𝑑2 𝑓 = ¾(𝑑1 + 𝑑2) + ½(𝑑1² + 𝑑2²)
Subject to: −(3/2)(𝑑1 + 𝑑2) ≤ 1/8, −𝑑1 ≤ ¾, −𝑑2 ≤ ¾
• The application of KKT conditions results in a linear system of equations, which is solved to obtain:
𝒙𝑇 = (𝑑1, 𝑑2, 𝑢1, 𝑢2, 𝑢3, 𝑠1, 𝑠2, 𝑠3) = (0.188, 0.188, 0, 0, 0, 0.125, 0.75, 0.75)
Modified SQP Algorithm
Modified SQP Algorithm (Arora, p. 558):
• Initialize: choose 𝒙0 , 𝑅0 = 1, 𝑯0 = 𝐼; 𝜀1 , 𝜀2 > 0.
• For 𝑘 = 0,1,2, …
– Compute 𝑓 𝑘 , 𝑔𝑖𝑘 , ℎ𝑗𝑘 , 𝒄, 𝑏𝑖 , 𝑒𝑗 , and 𝑉𝑘 . If 𝑘 > 0, compute 𝑯𝑘
– Formulate and solve the modified QP subproblem for search
direction 𝒅𝑘 and the Lagrange multipliers 𝒖𝑘 and 𝒗𝑘 .
– If 𝑉𝑘 ≤ 𝜀1 and 𝒅𝑘 ≤ 𝜀2, stop.
– Compute 𝑅; formulate and solve line search subproblem for 𝛼
– Set 𝒙𝑘+1 ← 𝒙𝑘 + 𝛼𝒅𝑘 , 𝑅𝑘+1 ← 𝑅, 𝑘 ← 𝑘 + 1.
SQP Algorithm
%SQP subproblem via Hessian update
% input: xk (current design); Lk (Hessian of Lagrangian
estimate)
%initialize
n=size(xk,1);
if ~exist('Lk','var'), Lk=diag(xk+(~xk)); end
tol=1e-7;
%function and constraint values
fk=f(xk);
dfk=df(xk);
gk=g(xk);
dgk=dg(xk);
%N-R update
A=[Lk dgk; dgk' 0*dgk'*dgk];
b=[-dfk;-gk];
dx=A\b;
dxk=dx(1:n);
lam=dx(n+1:end);
SQP Algorithm
%inactive constraints
idx1=find(lam<0);
if ~isempty(idx1)
  [dxk,lam]=inactive(lam,A,b,n);
end
%check termination
if norm(dxk)<tol, return, end
%adjust increment for constraint compliance
P=@(xk) f(xk)+lam'*abs(g(xk));
while P(xk+dxk)>P(xk)
  dxk=dxk/2;
  if norm(dxk)<tol, break, end
end
%Hessian update
dL=@(x) df(x)+dg(x)*lam;
Lk=update(Lk, xk, dxk, dL);
xk=xk+dxk;
disp([xk' f(xk) P(xk)])
SQP Algorithm
%function definitions
function [dxk,lam]=inactive(lam,A,b,n)
idx1=find(lam<0);
lam(idx1)=0;
idx2=find(lam);
v=[1:n,n+idx2];
A=A(v,v); b=b(v);
dx=A\b;
dxk=dx(1:n);
lam(idx2)=dx(n+1:end);
end

function Lk=update(Lk, xk, dxk, dL)
%BFGS update of the Lagrangian Hessian estimate
ga=dL(xk+dxk)-dL(xk);  %gradient difference
Hx=Lk*dxk;
Dk=ga*ga'/(ga'*dxk);
Ek=Hx*Hx'/(Hx'*dxk);
Lk=Lk+Dk-Ek;
end
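The MATLAB routine above can be mirrored in Python/NumPy. The sketch below applies the same Newton-Raphson KKT step, l1-penalty step halving, and BFGS Hessian update to a hypothetical equality-constrained test problem (not from the slides); the inactive-constraint bookkeeping is omitted for brevity:

```python
import numpy as np

# Hypothetical test problem: min x1^2 + x2^2  s.t.  x1 + x2 - 2 = 0,
# whose optimum is at (1, 1).
f  = lambda x: x[0]**2 + x[1]**2
df = lambda x: np.array([2*x[0], 2*x[1]])
g  = lambda x: np.array([x[0] + x[1] - 2.0])
dg = lambda x: np.array([[1.0], [1.0]])      # columns are constraint gradients

x, Lk, tol = np.array([3.0, -1.0]), np.eye(2), 1e-7
for _ in range(50):
    n, m = x.size, g(x).size
    # Newton-Raphson step: solve the KKT system for (dx, lam)
    A = np.block([[Lk, dg(x)], [dg(x).T, np.zeros((m, m))]])
    rhs = np.concatenate([-df(x), -g(x)])
    sol = np.linalg.solve(A, rhs)
    dx, lam = sol[:n], sol[n:]
    if np.linalg.norm(dx) < tol:
        break
    # halve the step until the l1 penalty function stops increasing
    P = lambda xx: f(xx) + lam @ np.abs(g(xx))
    while P(x + dx) > P(x) and np.linalg.norm(dx) >= tol:
        dx = dx / 2
    # BFGS update of the Lagrangian Hessian estimate
    dL = lambda xx: df(xx) + dg(xx) @ lam
    ga, Hx = dL(x + dx) - dL(x), Lk @ dx
    if abs(ga @ dx) > 1e-12 and abs(Hx @ dx) > 1e-12:
        Lk += np.outer(ga, ga)/(ga @ dx) - np.outer(Hx, Hx)/(Hx @ dx)
    x = x + dx
print(x)   # converges to [1. 1.]
```

On this convex problem the iteration terminates in a few steps; the guards before the Hessian update skip it when the curvature denominators are numerically zero.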
Generalized Reduced Gradient
• The GRG method finds the search direction by projecting the
objective function gradient onto the constraint hyperplane.
• The GRG search direction is tangent to the constraint hyperplane, so that
iterative steps tend to conform to the constraints.
• The constraints are effectively used to implicitly eliminate variables
and reduce problem dimensions.
Implicit Elimination
• Consider an equality constrained problem in two variables:
Objective: min 𝑓 𝒙 , 𝒙𝑇 = 𝑥1 , 𝑥2
Subject to: 𝑔 𝒙 = 0
• The variations in the objective and constraint functions are:
df = ∇fᵀdx = (∂f/∂x1)dx1 + (∂f/∂x2)dx2
dg = ∇gᵀdx = (∂g/∂x1)dx1 + (∂g/∂x2)dx2 = 0
• Solve for dx2 = −[(∂g/∂x1)/(∂g/∂x2)]dx1 and substitute in the objective function:
df = [∂f/∂x1 − (∂f/∂x2)(∂g/∂x1)/(∂g/∂x2)]dx1
• Then the reduced gradient of f along x1 is given as:
∇f_R = ∂f/∂x1 − (∂f/∂x2)(∂g/∂x1)/(∂g/∂x2)
Implicit Elimination
• Consider a problem in 𝑛 variable with 𝑚 equality constraints:
Objective: min 𝑓 𝒙 , 𝒙𝑇 = 𝑥1 , 𝑥2 , … , 𝑥𝑛
Subject to: 𝑔𝑖 𝒙 = 0, 𝑖 = 1, … , 𝑚
• We define 𝑚 basic variables in terms of 𝑛 − 𝑚 nonbasic variables;
let 𝒙𝑇 = 𝒚𝑇 , 𝒛𝑇 , where 𝒚 are basic and 𝒛 are nonbasic.
• The gradient vector is partitioned as: 𝛻𝑓 𝑇 = 𝛻𝑓 𝒚 𝑇 , 𝛻𝑓 𝒛 𝑇 .
• The variations in the objective and constraint functions are:
df = ∇f(y)ᵀdy + ∇f(z)ᵀdz
dg = (∂ψ/∂y)dy + (∂ψ/∂z)dz = 0
where the matrices of partial derivatives are defined as:
[∂ψ/∂y]_ij = ∂g_i/∂y_j, [∂ψ/∂z]_ij = ∂g_i/∂z_j
Generalized Reduced Gradient
• Since ∂ψ/∂y is a square m × m matrix, we may solve for dy as:
dy = −(∂ψ/∂y)⁻¹(∂ψ/∂z)dz, and substitute in df to obtain:
df = [∇f(z)ᵀ − ∇f(y)ᵀ(∂ψ/∂y)⁻¹(∂ψ/∂z)]dz
• Then the reduced gradient ∇f_R is defined as:
∇f_Rᵀ = ∇f(z)ᵀ − ∇f(y)ᵀ(∂ψ/∂y)⁻¹(∂ψ/∂z)
• Next, we choose the negative of ∇f_R as the search direction and
perform a line search to determine the step size; then Δz = −α∇f_R,
Δy = −(∂ψ/∂y)⁻¹(∂ψ/∂z)Δz
GRG Algorithm
• Initialize: choose x^0; evaluate the objective function and constraints;
convert binding inequality constraints to equality constraints.
• Partition the variables into m basic and n − m nonbasic ones, e.g.,
choose the first m values, or the m largest values, as basic variables.
• Compute ∇f_R along the nonbasic variables. If ∇f_R = 0, exit.
• Set Δz = −∇f_R/‖∇f_R‖, Δy = −(∂ψ/∂y)⁻¹(∂ψ/∂z)Δz.
• Do a line search along Δx to obtain α.
• Check feasibility at x^k + αΔx. If necessary, use Newton-Raphson
iterations to adjust Δy as: Δy^{k+1} = Δy^k − (∂ψ/∂y)⁻¹g^k
• Update: x^{k+1} = x^k + αΔx
Generalized Reduced Gradient
• Consider an equality constrained problem
Objective: min 𝑓 𝒙 = 3𝑥1 + 2𝑥2 + 2𝑥12 − 𝑥1 𝑥2 + 1.5𝑥22
Subject to: 𝑔 𝒙 = 𝑥12 − 𝑥2 − 1 = 0
• Let x^0 = (−1, 0); then f^0 = −1, ∇f^0 = (−1, 3), g^0 = 0, ∇g^0 = (−2, −1).
• Let y = x2 on the first iteration; then ∇f_Rᵀ = −1 − 3(−1)⁻¹(−2) = −7.
• Let Δz = 1; then Δy = −(−1)⁻¹(−2)(1) = −2. By doing a line search along
Δx = (0.333, −0.667), we obtain x^1 = (−0.650, −0.577), f^1 = −2.13.
• The optimum is reached in three iterations: x* = (−0.634, −0.598),
f(x*) = −2.137.
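The reported optimum can be cross-checked by implicit elimination: substituting x2 = x1² − 1 into f leaves a single-variable function φ(x1), minimized below by a simple golden-section search (a sketch for verification; not the GRG iteration itself):

```python
# Substitute the constraint x2 = x1^2 - 1 into f and minimize phi(x1)
f = lambda x1, x2: 3*x1 + 2*x2 + 2*x1**2 - x1*x2 + 1.5*x2**2
phi = lambda t: f(t, t**2 - 1)

a, b = -2.0, 0.0                      # bracket containing the minimizer
r = (5**0.5 - 1) / 2                  # golden-section ratio
c, d = b - r*(b - a), a + r*(b - a)
for _ in range(100):
    if phi(c) < phi(d):               # minimum lies in [a, d]
        b, d = d, c
        c = b - r*(b - a)
    else:                             # minimum lies in [c, b]
        a, c = c, d
        d = a + r*(b - a)
x1 = (a + b) / 2
x2 = x1**2 - 1
print(x1, x2, phi(x1))                # ≈ -0.634, -0.598, -2.137
```

Since φ is convex on this bracket, the golden-section search converges to the same optimum the slides report for GRG.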
Generalized Reduced Gradient
• Consider an inequality-constrained problem:
Objective: min f(x) = x1² + x2
Subject to: g1(x) = x1² + x2² − 9 ≤ 0, g2(x) = x1 + x2 − 1 ≤ 0
• Add slack variables to the inequality constraints:
g1(x) = x1² + x2² − 9 + s1 = 0, g2(x) = x1 + x2 − 1 + s2 = 0
Then ∇f(x) = (2x1, 1); ∇g1(x) = (2x1, 2x2); ∇g2(x) = (1, 1)
• Let x^0 = (2.56, −1.56); then f^0 = 4.99, ∇f^0 = (5.12, 1), g^0 = (−0.013, 0)
• Since g2 is binding, add s2 to the variables: ∇f^0 = (5.12, 1, 0)ᵀ, ∇g2^0 = (1, 1, 1)ᵀ
Generalized Reduced Gradient
• Let y = x1, z = (x2, s2); then ∇f(y) = 5.12, ∇f(z) = (1, 0)ᵀ, ∇g2(y) = 1,
∇g2(z) = (1, 1)ᵀ; therefore ∇f_R(z) = (1, 0)ᵀ − 5.12(1)⁻¹(1, 1)ᵀ = (−4.12, −5.12)ᵀ
• Let Δz = −∇f_R(z), Δy = −[1 1]Δz = −9.24; then Δx = (−9.24, 4.12)ᵀ and
s^0 = Δx/‖Δx‖. Suppose we limit the maximum step size to α ≤ 0.5;
then x^1 = x^0 + 0.5s^0 = (2.103, −1.356) with f(x^1) = f^1 = 3.068. There are
no constraint violations, hence the first iteration is complete.
• After seven iterations: x^7 = (0.003, −3.0) with f^7 = −3.0
• The optimum is at: x* = (0.0, −3.0) with f* = −3.0
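As a sanity check (a standalone sketch, not from the slides), the feasible region can be gridded by brute force; with a 0.01 step (an arbitrary resolution) the smallest objective value lands at the reported optimum near (0, −3):

```python
import numpy as np

# Grid the box [-3, 3]^2, mask out infeasible points, and take the minimum
X1, X2 = np.meshgrid(np.linspace(-3, 3, 601), np.linspace(-3, 3, 601))
feasible = (X1**2 + X2**2 <= 9) & (X1 + X2 <= 1)
F = np.where(feasible, X1**2 + X2, np.inf)
k = np.unravel_index(np.argmin(F), F.shape)
print(X1[k], X2[k], F[k])   # ≈ 0.0, -3.0, -3.0
```

Since f = x1² + x2 ≥ x2 ≥ −3 on this box, the bound is attained only where x1 = 0 and x2 = −3, confirming the optimum is unique.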
GRG for LP Problems
• Consider an LP problem: min 𝑓(𝒙) = 𝒄𝑇 𝒙
Subject to: 𝑨𝒙 = 𝒃, 𝒙 ≥ 𝟎
• Let 𝒙 be partitioned into 𝑚 basic variables and 𝑛 − 𝑚 nonbasic
variables: 𝒙𝑇 = [𝒚𝑇 , 𝒛𝑇 ].
• The objective function is partitioned as: 𝑓 𝒙 = 𝒄𝑇𝑦 𝒚 + 𝒄𝑇𝑧 𝒛
• The constraints are partitioned as: 𝑩𝒚 + 𝑵𝒛 = 𝒃, 𝒚 ≥ 𝟎, 𝒛 ≥ 𝟎.
Then 𝒚 = 𝑩−1 𝒃 − 𝑩−1 𝑵𝒛
• The objective function in terms of the independent variables is:
f(z) = c_yᵀB⁻¹b + (c_zᵀ − c_yᵀB⁻¹N)z
• The reduced costs for nonbasic variables are given as:
𝒓𝑇𝑐 = 𝒄𝑇𝑧 − 𝒄𝑇𝑦 𝑩−1 𝑵, or 𝒓𝑇𝑐 = 𝒄𝑇𝑧 − 𝝀𝑇 𝑵
GRG for LP Problems
• Using tableau notation, the reduced costs are computed as:
[B N b; c_yᵀ c_zᵀ 0] → [I B⁻¹N B⁻¹b; 0 r_cᵀ −c_yᵀB⁻¹b]
• The objective function variation is given as:
𝑑𝑓 = 𝛻𝑓𝒚𝑇 𝑑𝒚 + 𝛻𝑓𝒛𝑇 𝑑𝒛
• The reduced gradient along the constraint surface is given as:
𝛻𝑓𝑅𝑇 = 𝛻𝒛 𝑓 𝑇 − 𝛻𝒚 𝑓 𝑇 𝑩−1 𝑵 = 𝒓𝑇𝑐
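The reduced-cost computation can be illustrated on a small assumed LP (not from the slides), using the formula r_cᵀ = c_zᵀ − λᵀN with λᵀ = c_yᵀB⁻¹:

```python
import numpy as np

# Assumed LP in standard form: min c^T x, Ax = b, x >= 0, where
#   x1 + x2 + s1 = 4,  x1 + s2 = 2,  variables ordered (x1, x2, s1, s2)
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 1.0]])
c = np.array([-1.0, -2.0, 0.0, 0.0])
b = np.array([4.0, 2.0])

basic, nonbasic = [0, 2], [1, 3]    # y = (x1, s1), z = (x2, s2)
B, N = A[:, basic], A[:, nonbasic]
cy, cz = c[basic], c[nonbasic]

lam = np.linalg.solve(B.T, cy)      # simplex multipliers: B^T lam = c_y
rc = cz - N.T @ lam                 # reduced costs: r_c = c_z - lam^T N
y = np.linalg.solve(B, b)           # basic solution: y = B^{-1} b >= 0
print(rc, y)                        # rc = [-2, 1]: x2 has a negative
                                    # reduced cost and should enter the basis
```

The negative component of r_c identifies the nonbasic variable along which the reduced gradient points downhill, which is exactly the Δz rule in the GRG algorithm for LP on the next slide.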
GRG Algorithm for LP Problems
1. Choose the largest 𝑚 components of 𝒙 as basic variables
2. Compute the reduced gradient 𝛻𝑓𝑅𝑇 = 𝒓𝑇𝑐
3. Let Δz_i = −r_i if r_i ≤ 0, and Δz_i = −x_i r_i if r_i > 0
4. If Δ𝒛 = 0, stop; otherwise set Δ𝒚 = 𝑩−1 𝑵Δ𝒛
5. Compute the step size: let α1 = max{α : y + αΔy ≥ 0, z + αΔz ≥ 0},
α2 = arg min_α f(x + αΔx); set α = min{α1, α2}
6. Update: 𝒙𝑘+1 = 𝒙𝑘 + 𝛼Δ𝒙
7. If 𝛼2 ≥ 𝛼1 , update 𝑩, 𝑵 (use pivoting)
8. Return to 1