Optimum Engineering Design
Day 5
Course Materials
• Arora, Introduction to Optimum Design, 3e, Elsevier
(https://www.researchgate.net/publication/273120102_Introduction_to_Optimum_design)
• Parkinson, Optimization Methods for Engineering Design, Brigham Young University
(http://apmonitor.com/me575/index.php/Main/BookChapters)
• Iqbal, Fundamental Engineering Optimization Methods, BookBoon
(https://bookboon.com/en/fundamental-engineering-optimization-methods-ebook)
Numerical Optimization
• Consider an unconstrained NLP problem: min_𝒙 𝑓(𝒙)
• Use an iterative method to solve the problem: 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝑘 𝒅𝑘,
where 𝒅𝑘 is a search direction and 𝛼𝑘 is the step size, such that the
function value decreases at each step, i.e., 𝑓(𝒙𝑘+1) < 𝑓(𝒙𝑘)
• We expect lim_{𝑘→∞} 𝒙𝑘 = 𝒙∗
• The general iterative method is a two-step process:
– Finding a suitable search direction 𝒅𝑘 along which the function
value locally decreases and any constraints are obeyed.
– Performing line search along 𝒅𝑘 to find 𝒙𝑘+1 such that 𝑓 𝒙𝑘+1
attains its minimum value.
The Iterative Method
• Iterative algorithm:
1. Initialize: choose 𝒙0
2. Check termination: 𝛻𝑓(𝒙𝑘) ≅ 𝟎
3. Find a suitable search direction 𝒅𝑘 that obeys the descent condition:
𝛻𝑓(𝒙𝑘)𝑇 𝒅𝑘 < 0
4. Search along 𝒅𝑘 to find where 𝑓(𝒙𝑘+1) attains its minimum value
(line search problem)
5. Return to step 2
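• A minimal MATLAB sketch of this loop (illustrative only; the sample objective, gradient, and backtracking rule below are assumptions, not part of the original algorithm statement):
% Illustrative descent iteration with backtracking line search
f = @(x) x(1)^2 - x(1)*x(2) + x(2)^2;      % assumed sample objective
gradf = @(x) [2*x(1)-x(2); 2*x(2)-x(1)];   % its gradient
xk = [1; 1]; tol = 1e-6;                   % initial design and tolerance
for k = 1:100
    g = gradf(xk);
    if norm(g) < tol, break, end           % termination check
    dk = -g;                               % descent direction: gradf'*dk < 0
    alpha = 1;                             % backtracking line search
    while f(xk + alpha*dk) >= f(xk) && alpha > 1e-12
        alpha = alpha/2;
    end
    xk = xk + alpha*dk;                    % design update
end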
The Line Search Problem
• Assuming a suitable search direction 𝒅𝑘 has been determined, we
seek to determine a step length 𝛼𝑘 that minimizes 𝑓(𝒙𝑘+1).
• Assuming 𝒙𝑘 and 𝒅𝑘 are known, the projected function value along
𝒅𝑘 is expressed as:
𝑓(𝒙𝑘 + 𝛼𝒅𝑘) = 𝑓(𝛼)
• The line search problem to choose 𝛼 to minimize 𝑓(𝒙𝑘+1) along 𝒅𝑘
is defined as:
min_𝛼 𝑓(𝛼) = 𝑓(𝒙𝑘 + 𝛼𝒅𝑘)
• Assuming that a solution exists, it is found by setting 𝑓′(𝛼) = 0.
Example: Quadratic Function
• Consider minimizing a quadratic function:
𝑓(𝒙) = ½ 𝒙𝑇𝑨𝒙 − 𝒃𝑇𝒙, 𝛻𝑓 = 𝑨𝒙 − 𝒃
• Given a descent direction 𝒅, the line search problem is defined as:
min_𝛼 𝑓(𝛼) = ½ (𝒙𝑘 + 𝛼𝒅)𝑇𝑨(𝒙𝑘 + 𝛼𝒅) − 𝒃𝑇(𝒙𝑘 + 𝛼𝒅)
• A solution is found by setting 𝑓′(𝛼) = 0, where
𝑓′(𝛼) = 𝒅𝑇𝑨(𝒙𝑘 + 𝛼𝒅) − 𝒅𝑇𝒃 = 0
𝛼 = − 𝒅𝑇(𝑨𝒙𝑘 − 𝒃) / 𝒅𝑇𝑨𝒅 = − 𝛻𝑓(𝒙𝑘)𝑇𝒅 / 𝒅𝑇𝑨𝒅
• Finally, 𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝒅.
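• A short MATLAB check of the analytical step size, with 𝑨, 𝒃, and the current point assumed for illustration:
% Exact line search step for f(x) = 0.5*x'*A*x - b'*x (illustrative data)
A = [2 -1; -1 2]; b = [1; 1];            % assumed quadratic data
xk = [0; 0];                             % current iterate
d = b - A*xk;                            % a descent direction (here -grad f)
alpha = -((A*xk - b)'*d)/(d'*A*d);       % alpha = -grad f(xk)'*d / (d'*A*d)
x_next = xk + alpha*d;                   % updated point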
Computer Methods for Line Search Problem
• Interval reduction methods
– Golden search
– Fibonacci search
• Approximate search methods
– Armijo’s rule
– Quadratic curve fitting
Interval Reduction Methods
• The interval reduction methods find the minimum of a unimodal
function in two steps:
– Bracketing the minimum to an interval
– Reducing the interval to desired accuracy
• The bracketing step aims to find a three-point pattern, such that for
𝑥1, 𝑥2, 𝑥3: 𝑓(𝑥1) ≥ 𝑓(𝑥2) < 𝑓(𝑥3).
Fibonacci’s Method
• Fibonacci’s method uses Fibonacci numbers to achieve maximum
interval reduction in a given number of steps.
• The Fibonacci number sequence is generated as:
𝐹0 = 𝐹1 = 1, 𝐹𝑖 = 𝐹𝑖−1 + 𝐹𝑖−2 , 𝑖 ≥ 2.
• The properties of Fibonacci numbers include:
– They achieve the golden ratio 𝜏 = lim_{𝑛→∞} 𝐹𝑛−1/𝐹𝑛 = (√5 − 1)/2 ≅ 0.618034
– The number of interval reductions 𝑛 required to achieve a desired
accuracy 𝜀 (where 1/𝐹𝑛 < 𝜀) is specified in advance.
– For given 𝐼1 and 𝑛: 𝐼2 = (𝐹𝑛−1/𝐹𝑛)𝐼1, 𝐼3 = 𝐼1 − 𝐼2, 𝐼4 = 𝐼2 − 𝐼3, etc.
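• A small MATLAB sketch that selects the number of Fibonacci steps 𝑛 for a desired accuracy and computes the first interval sizes (values assumed for illustration):
% Fibonacci interval reduction: choose n such that 1/F(n) < eps_tol
I1 = 1; eps_tol = 0.01;            % assumed initial interval and accuracy
F = [1 1];                         % F0 = F1 = 1
while 1/F(end) >= eps_tol
    F(end+1) = F(end) + F(end-1);  % F(i) = F(i-1) + F(i-2)
end
n = numel(F) - 1;                  % number of interval reductions
I2 = F(end-1)/F(end)*I1;           % I2 = (F(n-1)/F(n))*I1
I3 = I1 - I2; I4 = I2 - I3;        % subsequent interval sizes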
The Golden Section Method
• The golden section method uses the golden ratio: 𝜏 = 0.618034.
• The golden section algorithm is given as:
1. Initialize: specify 𝑥1, 𝑥4 (𝐼1 = 𝑥4 − 𝑥1), 𝜀, and 𝑛 such that 𝜏𝑛 < 𝜀/𝐼1
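• A hedged MATLAB sketch of a standard golden-section search over [𝑥1, 𝑥4] (the sample function, bracket, and tolerance are assumed):
% Golden-section search for a unimodal f on [x1, x4] (illustrative)
f = @(x) (x - 2).^2;               % assumed sample function
tau = 0.618034;                    % golden ratio
x1 = 0; x4 = 5; tol = 1e-4;        % assumed bracket and accuracy
x2 = x4 - tau*(x4 - x1);           % interior points
x3 = x1 + tau*(x4 - x1);
while (x4 - x1) > tol
    if f(x2) < f(x3)               % minimum lies in [x1, x3]
        x4 = x3; x3 = x2;
        x2 = x4 - tau*(x4 - x1);
    else                           % minimum lies in [x2, x4]
        x1 = x2; x2 = x3;
        x3 = x1 + tau*(x4 - x1);
    end
end
xmin = (x1 + x4)/2;                % estimated minimizer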
Conjugate Gradient Method
• Assume that an update that includes steps 𝛼𝑖 along 𝑛 conjugate
vectors 𝒅𝑖 is assembled as: 𝒚 = Σ_{𝑖=1}^{𝑛} 𝛼𝑖𝒅𝑖.
• Then, for a quadratic function, the minimization problem is
decomposed into a set of one-dimensional problems, i.e.,
min_𝒚 𝑓(𝒚) ≡ Σ_{𝑖=1}^{𝑛} min_{𝛼𝑖} [ ½ 𝛼𝑖² 𝒅𝑖𝑇𝑨𝒅𝑖 − 𝛼𝑖 𝒃𝑇𝒅𝑖 ]
• At each iteration, the direction, step size, solution, and residual are updated as:
– Set 𝒅𝑖 = 𝒓𝑖 + 𝛽𝑖𝒅𝑖−1; 𝛼𝑖 = 𝒓𝑖𝑇𝒓𝑖 / 𝒅𝑖𝑇𝑨𝒅𝑖; 𝒙𝑖+1 = 𝒙𝑖 + 𝛼𝑖𝒅𝑖;
𝒓𝑖+1 = 𝒓𝑖 − 𝛼𝑖𝑨𝒅𝑖.
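• A minimal MATLAB sketch of the conjugate gradient iteration for a quadratic function; the 𝛽 update below uses the Fletcher-Reeves formula, which is not stated on the slide, and the data are assumed:
% Conjugate gradient sketch for f(x) = 0.5*x'*A*x - b'*x (illustrative)
A = [4 1; 1 3]; b = [1; 2];        % assumed sample data
x = [0; 0];
r = b - A*x;                       % residual r = -grad f(x)
d = r;
for i = 1:numel(b)
    alpha = (r'*r)/(d'*A*d);       % step size along d
    x = x + alpha*d;               % design update
    r_new = r - alpha*(A*d);       % residual update
    beta = (r_new'*r_new)/(r'*r);  % Fletcher-Reeves beta
    d = r_new + beta*d;            % next conjugate direction
    r = r_new;
end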
Penalty and Barrier Methods
• Consider the general constrained NLP problem:
min_𝒙 𝑓(𝒙)
Subject to: ℎ𝑖(𝒙) = 0, 𝑖 = 1, …, 𝑝; 𝑔𝑗(𝒙) ≤ 0, 𝑗 = 1, …, 𝑚; 𝑥𝑖𝐿 ≤ 𝑥𝑖 ≤ 𝑥𝑖𝑈, 𝑖 = 1, …, 𝑛.
• Define a composite function to be used for constraint compliance:
Φ(𝒙, 𝒓) = 𝑓(𝒙) + 𝑃(𝑔(𝒙), ℎ(𝒙), 𝒓)
where 𝑃 defines a loss function, and 𝒓 is a vector of weights (penalty
parameters)
• Penalty Function Method. A penalty function method employs a
quadratic loss function and iterates through the infeasible region:
𝑃(𝑔(𝒙), ℎ(𝒙), 𝒓) = 𝑟 ( Σ𝑖 [𝑔𝑖⁺(𝒙)]² + Σ𝑖 [ℎ𝑖(𝒙)]² ), where 𝑔𝑖⁺(𝒙) = max(0, 𝑔𝑖(𝒙))
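• A minimal MATLAB sketch of the quadratic penalty and the composite function Φ, with the constraint handles and test point assumed for illustration:
% Quadratic penalty: P = r*( sum(max(0,g).^2) + sum(h.^2) ) (illustrative)
f = @(x) x(1)^2 - x(1)*x(2) + x(2)^2;          % assumed cost function
g = @(x) [1 - x(1)^2 - x(2)^2; -x(1); -x(2)];  % assumed inequality constraints
h = @(x) [];                                   % no equality constraints here
P = @(x,r) r*( sum(max(0, g(x)).^2) + sum(h(x).^2) );
Phi = @(x,r) f(x) + P(x,r);                    % composite function
Phi([0.5; 0.5], 10)                            % penalized value at an infeasible point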
Sequential Linear Programming
• Consider the general constrained NLP problem:
min_𝒙 𝑓(𝒙)
Subject to: ℎ𝑖(𝒙) = 0, 𝑖 = 1, …, 𝑝; 𝑔𝑗(𝒙) ≤ 0, 𝑗 = 1, …, 𝑚; 𝑥𝑖𝐿 ≤ 𝑥𝑖 ≤ 𝑥𝑖𝑈, 𝑖 = 1, …, 𝑛.
• Let 𝒙𝑘 denote the current estimate of the design variables, and let
𝒅 denote the change in variables; define the first order expansion
of the objective and constraint functions in the neighborhood of 𝒙𝑘
𝑓(𝒙𝑘 + 𝒅) = 𝑓(𝒙𝑘) + 𝛻𝑓(𝒙𝑘)𝑇𝒅
𝑔𝑖(𝒙𝑘 + 𝒅) = 𝑔𝑖(𝒙𝑘) + 𝛻𝑔𝑖(𝒙𝑘)𝑇𝒅, 𝑖 = 1, …, 𝑚
ℎ𝑗(𝒙𝑘 + 𝒅) = ℎ𝑗(𝒙𝑘) + 𝛻ℎ𝑗(𝒙𝑘)𝑇𝒅, 𝑗 = 1, …, 𝑙
Sequential Linear Programming
• Let 𝑓 𝑘 = 𝑓 𝒙𝑘 , 𝑔𝑖𝑘 = 𝑔𝑖 𝒙𝑘 , ℎ𝑗𝑘 = ℎ𝑗 𝒙𝑘 ; 𝑏𝑖 = −𝑔𝑖𝑘 , 𝑒𝑗 = −ℎ𝑗𝑘 ,
𝒄 = 𝛻𝑓 𝒙𝑘 , 𝒂𝑖 = 𝛻𝑔𝑖 𝒙𝑘 , 𝒏𝑗 = 𝛻ℎ𝑗 𝒙𝑘 ,
𝑨 = 𝒂1 , 𝒂2 , … , 𝒂𝑚 , 𝑵 = 𝒏1 , 𝒏2 , … , 𝒏𝑙 .
• Using first order expansion, define an LP subprogram for the
current iteration of the NLP problem:
min_𝒅 𝑓 = 𝒄𝑇𝒅
Subject to: 𝑨𝑇𝒅 ≤ 𝒃,
𝑵𝑇𝒅 = 𝒆
where 𝑓 represents first-order change in the cost function, and the
columns of 𝑨 and 𝑵 matrices represent, respectively, the gradients
of inequality and equality constraints.
• The resulting LP problem can be solved via the Simplex method.
Sequential Linear Programming
• We may note that:
– Since both positive and negative changes to design variables 𝒙𝑘 are
allowed, the variables 𝑑𝑖 are unrestricted in sign
– The SLP method requires additional constraints of the form
−∆𝑖𝑙𝑘 ≤ 𝑑𝑖𝑘 ≤ ∆𝑖𝑢𝑘 (termed move limits) to bound the LP solution.
These limits represent the maximum allowable change in 𝑑𝑖 in the
current iteration and are selected as a percentage of the current value.
– Move limits serve the dual purpose of bounding the solution and
obviating the need for line search (see the sketch below).
– Overly restrictive move limits tend to make the SLP problem
infeasible.
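• A hedged MATLAB sketch of one SLP step using linprog (Optimization Toolbox) with ±50% move limits; the numerical data are assumed for illustration:
% One SLP step: linearized LP subproblem with move limits (illustrative)
c  = [1; 1];                         % grad f at xk
Ag = [-2 -2; -1 0; 0 -1];            % rows: grad g_i' at xk
bg = [1; 1; 1];                      % b_i = -g_i(xk)
Delta = 0.5;                         % move limits
lb = -Delta*ones(2,1); ub = Delta*ones(2,1);
d = linprog(c, Ag, bg, [], [], lb, ub);  % LP solution for direction d
xk1 = [1; 1] + d;                    % SLP update x_{k+1} = x_k + d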
SLP Example
• Consider the convex NLP problem:
min_{𝑥1,𝑥2} 𝑓(𝑥1, 𝑥2) = 𝑥1² − 𝑥1𝑥2 + 𝑥2²
Subject to: 1 − 𝑥1² − 𝑥2² ≤ 0; −𝑥1 ≤ 0, −𝑥2 ≤ 0
The problem has a single minimum at: 𝒙∗ = (1/√2, 1/√2)
• The objective and constraint gradients are:
𝛻𝑓𝑇 = [2𝑥1 − 𝑥2, 2𝑥2 − 𝑥1],
𝛻𝑔1𝑇 = [−2𝑥1, −2𝑥2], 𝛻𝑔2𝑇 = [−1, 0], 𝛻𝑔3𝑇 = [0, −1].
• Let 𝒙0 = (1, 1); then 𝑓0 = 1, 𝒄𝑇 = [1, 1], 𝑏1 = 𝑏2 = 𝑏3 = 1;
𝒂1𝑇 = [−2, −2], 𝒂2𝑇 = [−1, 0], 𝒂3𝑇 = [0, −1]
SLP Example
• Define the LP subproblem at the current step as:
min_{𝑑1,𝑑2} 𝑓 = 𝑑1 + 𝑑2
Subject to: [−2 −2; −1 0; 0 −1][𝑑1; 𝑑2] ≤ [1; 1; 1]
• In the absence of move limits, the LP problem is unbounded; using
50% move limits, the SLP update is given as: 𝒅∗ = [−1/2, −1/2]𝑇,
𝒙1 = [1/2, 1/2]𝑇, with resulting constraint violation: 𝑔𝑖 = {1/2, 0, 0};
smaller move limits may be used to reduce the constraint violation.
Sequential Linear Programming
SLP Algorithm (Arora, p. 508):
• Initialize: choose 𝒙0 , 𝜀1 > 0, 𝜀2 > 0.
• For 𝑘 = 0,1,2, …
– Choose move limits ∆𝑘𝑖𝑙 , ∆𝑘𝑖𝑢 as some fraction of current design 𝒙𝑘
– Compute 𝑓 𝑘 , 𝒄, 𝑔𝑖𝑘 , ℎ𝑗𝑘 , 𝑏𝑖 , 𝑒𝑗
– Formulate and solve the LP subproblem for 𝒅𝑘
– If 𝑔𝑖 ≤ 𝜀1, 𝑖 = 1, …, 𝑚; |ℎ𝑗| ≤ 𝜀1, 𝑗 = 1, …, 𝑝; and ‖𝒅𝑘‖ ≤ 𝜀2, stop
– Substitute 𝒙𝑘+1 ← 𝒙𝑘 + 𝛼𝒅𝑘 , 𝑘 ← 𝑘 + 1.
Sequential Quadratic Programming
• Sequential quadratic programming (SQP) uses a quadratic
approximation to the objective function at every step of iteration.
• The SQP problem is defined as:
min_𝒅 𝑓 = 𝒄𝑇𝒅 + ½ 𝒅𝑇𝒅
Subject to: 𝑨𝑇𝒅 ≤ 𝒃, 𝑵𝑇𝒅 = 𝒆
• SQP does not require move limits, alleviating the shortcomings of
the SLP method.
• The SQP problem is convex; hence, it has a single global minimum.
• The SQP subproblem can be solved via a Simplex-based linear
complementarity problem (LCP) framework.
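• A hedged MATLAB sketch of the QP direction-finding subproblem using quadprog (Optimization Toolbox); the constraint data are assumed, and no equality constraints are included in this sketch:
% QP subproblem: min c'*d + 0.5*d'*d  s.t. A'*d <= b (illustrative)
c = [1; 1];                          % grad f at xk
A = [-2 -1 0; -2 0 -1];              % columns are inequality constraint gradients
b = [1; 1; 1];
H = eye(2);                          % quadratic term 0.5*d'*H*d with H = I
[d, ~, ~, ~, mult] = quadprog(H, c, A', b, [], []);
u = mult.ineqlin;                    % Lagrange multipliers of A'*d <= b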
Sequential Quadratic Programming
• The Lagrangian function for the SQP problem is defined as:
ℒ(𝒅, 𝒖, 𝒗) = 𝒄𝑇𝒅 + ½ 𝒅𝑇𝒅 + 𝒖𝑇(𝑨𝑇𝒅 − 𝒃 + 𝒔) + 𝒗𝑇(𝑵𝑇𝒅 − 𝒆)
• Then the KKT conditions are:
Optimality: 𝛁ℒ = 𝒄 + 𝒅 + 𝑨𝒖 + 𝑵𝒗 = 𝟎,
Feasibility: 𝑨𝑇 𝒅 + 𝒔 = 𝒃, 𝑵𝑇 𝒅 = 𝒆 ,
Complementarity: 𝒖𝑇 𝒔 = 𝟎,
Non-negativity: 𝒖 ≥ 𝟎, 𝒔 ≥ 𝟎
Sequential Quadratic Programming
• Since 𝒗 is unrestricted in sign, let 𝒗 = 𝒚 − 𝒛, 𝒚 ≥ 𝟎, 𝒛 ≥ 𝟎, and
the KKT conditions are compactly written as:
[𝑰 𝑨 𝟎 𝑵 −𝑵; 𝑨𝑇 𝟎 𝑰 𝟎 𝟎; 𝑵𝑇 𝟎 𝟎 𝟎 𝟎] [𝒅; 𝒖; 𝒔; 𝒚; 𝒛] = [−𝒄; 𝒃; 𝒆],
or 𝑷𝑿 = 𝑸
• The complementary slackness conditions, 𝒖𝑇 𝒔 = 𝟎, translate as:
𝑿𝑖 𝑿𝑖+𝑚 = 0, 𝑖 = 𝑛 + 1, ⋯ , 𝑛 + 𝑚.
• The resulting problem can be solved via Simplex method using LCP
framework.
Descent Function Approach
• In SQP methods, the line search step is based on minimization of a
descent function that penalizes constraint violations, i.e.,
Φ 𝒙 = 𝑓 𝒙 + 𝑅𝑉 𝒙
where 𝑓 𝒙 is the cost function, 𝑉 𝒙 represents current
maximum constraint violation, and 𝑅 > 0 is a penalty parameter.
• The descent function value at the current iteration is computed as:
Φ𝑘 = 𝑓𝑘 + 𝑅𝑉𝑘,
where 𝑅 = max{𝑅𝑘, 𝑟𝑘}, 𝑟𝑘 = Σ_{𝑖=1}^{𝑚} 𝑢𝑖𝑘 + Σ_{𝑗=1}^{𝑝} |𝑣𝑗𝑘|,
𝑉𝑘 = max{0; 𝑔𝑖, 𝑖 = 1, …, 𝑚; |ℎ𝑗|, 𝑗 = 1, …, 𝑝}
• The line search subproblem is defined as:
min_𝛼 Φ(𝛼) = Φ(𝒙𝑘 + 𝛼𝒅𝑘)
SQP Algorithm
SQP Algorithm (Arora, p. 526):
• Initialize: choose 𝒙0 , 𝑅0 = 1, 𝜀1 > 0, 𝜀2 > 0.
• For 𝑘 = 0,1,2, …
– Compute 𝑓 𝑘 , 𝑔𝑖𝑘 , ℎ𝑗𝑘 , 𝒄, 𝑏𝑖 , 𝑒𝑗 ; compute 𝑉𝑘 .
– Formulate and solve the QP subproblem to obtain 𝒅𝑘 and the
Lagrange multipliers 𝒖𝑘 and 𝒗𝑘 .
– If 𝑉𝑘 ≤ 𝜀1 and ‖𝒅𝑘‖ ≤ 𝜀2, stop.
– Compute 𝑅; formulate and solve line search subproblem for 𝛼
– Set 𝒙𝑘+1 ← 𝒙𝑘 + 𝛼𝒅𝑘 , 𝑅𝑘+1 ← 𝑅, 𝑘 ← 𝑘 + 1
• The above algorithm is convergent, i.e., Φ 𝒙𝑘 ≤ Φ 𝒙0 ; 𝒙𝑘
converges to the KKT point 𝒙∗
SQP with Approximate Line Search
• The SQP algorithm can be used with approximate line search as follows:
Let 𝑡𝑗, 𝑗 = 0, 1, … denote a trial step size,
𝒙𝑘+1,𝑗 denote the trial design point,
𝑓𝑘+1,𝑗 = 𝑓(𝒙𝑘+1,𝑗) denote the function value at the trial solution, and
Φ𝑘+1,𝑗 = 𝑓𝑘+1,𝑗 + 𝑅𝑉𝑘+1,𝑗 the penalty (descent) function at the trial solution.
• The trial solution is required to satisfy the descent condition:
Φ𝑘+1,𝑗 + 𝑡𝑗𝛾‖𝒅𝑘‖² ≤ Φ𝑘,𝑗, 0 < 𝛾 < 1
where a common choice is: 𝛾 = ½, 𝜇 = ½, 𝑡𝑗 = 𝜇ʲ, 𝑗 = 0, 1, 2, ….
• The above descent condition ensures a sufficient decrease in the
descent function at each step of the method.
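• A minimal MATLAB sketch of this backtracking scheme on the descent function, assuming handles f and V for the cost and the maximum constraint violation (the data match the example that follows):
% Approximate line search on Phi = f + R*V (illustrative)
f = @(x) x(1)^2 - x(1)*x(2) + x(2)^2;
V = @(x) max([0; 1 - x(1)^2 - x(2)^2; -x(1); -x(2)]);  % max constraint violation
R = 10; gamma = 0.5; mu = 0.5;
xk = [1; 1]; dk = [-1; -1];          % current point and QP direction
Phik = f(xk) + R*V(xk);
t = 1;
while f(xk + t*dk) + R*V(xk + t*dk) + t*gamma*(dk'*dk) > Phik && t > 1e-6
    t = mu*t;                        % reduce trial step: t_j = mu^j
end
alpha = t; xk1 = xk + alpha*dk;      % accepted step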
SQP Example
• Consider the NLP problem: min_{𝑥1,𝑥2} 𝑓(𝑥1, 𝑥2) = 𝑥1² − 𝑥1𝑥2 + 𝑥2²
subject to 𝑔1: 1 − 𝑥1² − 𝑥2² ≤ 0, 𝑔2: −𝑥1 ≤ 0, 𝑔3: −𝑥2 ≤ 0
Then 𝛻𝑓𝑇 = [2𝑥1 − 𝑥2, 2𝑥2 − 𝑥1], 𝛻𝑔1𝑇 = [−2𝑥1, −2𝑥2], 𝛻𝑔2𝑇 = [−1, 0],
𝛻𝑔3𝑇 = [0, −1]. Let 𝒙0 = (1, 1); then 𝑓0 = 1, 𝒄 = [1, 1]𝑇,
𝑔1(1,1) = 𝑔2(1,1) = 𝑔3(1,1) = −1.
• Since all constraints are initially inactive, 𝑉0 = 0, and 𝒅 = −𝒄 =
[−1, −1]𝑇; the line search problem is: min_𝛼 Φ(𝛼) = (1 − 𝛼)²
• By setting Φ′(𝛼) = 0, we get the analytical solution: 𝛼 = 1; thus
𝒙1 = (0, 0), which results in a large constraint violation
SQP Example
• Alternatively, we may use approximate line search as follows:
– Let 𝑅0 = 10, 𝛾 = 𝜇 = 1/2; let 𝑡0 = 1, then 𝒙1,0 = (0, 0), 𝑓1,0 = 0,
𝑉1,0 = 1, Φ1,0 = 10; ‖𝒅0‖² = 2, and the descent condition
Φ1,0 + ½‖𝒅0‖² ≤ Φ0 = 1 is not met at the trial point.
– Next, for 𝑡1 = 1/2, we get: 𝒙1,1 = (1/2, 1/2), 𝑓1,1 = 1/4, 𝑉1,1 = 1/2,
Φ1,1 = 5.25, and the descent condition fails again;
– Next, for 𝑡2 = 1/4, we get: 𝒙1,2 = (3/4, 3/4), 𝑉1,2 = 0, 𝑓1,2 = Φ1,2 = 9/16,
and the descent condition checks as: Φ1,2 + (1/8)‖𝒅0‖² ≤ Φ0 = 1.
– Therefore, we set 𝛼 = 𝑡2 = 1/4, 𝒙1 = 𝒙1,2 = (3/4, 3/4) with no
constraint violation.
The Active Set Strategy
• To reduce the computational cost of solving the QP subproblem, we
may only include the active constraints in the problem.
• For 𝒙𝑘 ∈ Ω, the set of potentially active constraints is defined as:
ℐ𝑘 = {𝑖: 𝑔𝑖𝑘 > −𝜀; 𝑖 = 1, …, 𝑚} ∪ {𝑗: 𝑗 = 1, …, 𝑝} for some 𝜀 > 0.
• For 𝒙𝑘 ∉ Ω, let 𝑉𝑘 = max{0; 𝑔𝑖𝑘, 𝑖 = 1, …, 𝑚; |ℎ𝑗𝑘|, 𝑗 = 1, …, 𝑝};
then, the active constraint set is defined as:
ℐ𝑘 = {𝑖: 𝑔𝑖𝑘 > 𝑉𝑘 − 𝜀; 𝑖 = 1, …, 𝑚} ∪ {𝑗: |ℎ𝑗𝑘| > 𝑉𝑘 − 𝜀; 𝑗 = 1, …, 𝑝}
• The gradients of inactive constraints, i.e., those not in ℐ𝑘 , do not
need to be computed
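• A short MATLAB sketch of forming the potentially active set at a feasible point (the constraint values and tolerance are assumed):
% Potentially active inequality constraints at a feasible point xk
gk = [-0.125; -0.75; -0.75];         % assumed g_i(xk) values
eps_a = 0.2;                         % activity tolerance
Ik = find(gk > -eps_a);              % indices of potentially active constraints
% only gradients of constraints in Ik need to be computed for the QP subproblem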
SQP via Newton’s Method
• Consider the following equality constrained problem:
min 𝑓(𝒙), subject to ℎ𝑖 𝒙 = 0, 𝑖 = 1, … , 𝑙
𝒙
• The Lagrangian function is given as: ℒ 𝒙, 𝒗 = 𝑓 𝒙 + 𝒗𝑇 𝒉(𝒙)
• The KKT conditions are: 𝛻ℒ 𝒙, 𝒗 = 𝛻𝑓 𝒙 + 𝑵𝒗 = 𝟎, 𝒉 𝒙 = 𝟎
where 𝑵 = 𝛁𝒉(𝒙) is a Jacobian matrix whose 𝑖th column is 𝛻ℎ𝑖 𝒙
• Using first order Taylor series expansion (with shorthand notation):
𝛻ℒ𝑘+1 = 𝛻ℒ𝑘 + 𝛻²ℒ𝑘Δ𝒙 + 𝑵Δ𝒗
𝒉𝑘+1 = 𝒉𝑘 + 𝑵𝑇Δ𝒙
• By expanding Δ𝒗 = 𝒗𝑘+1 − 𝒗𝑘, 𝛻ℒ𝑘 = 𝛻𝑓𝑘 + 𝑵𝒗𝑘, and assuming
𝒗𝑘 ≅ 𝒗𝑘+1, we obtain:
[𝛻²ℒ𝑘 𝑵; 𝑵𝑇 𝟎] [Δ𝒙; 𝒗𝑘+1] = −[𝛻𝑓𝑘; 𝒉𝑘]
which is similar to N-R update, but uses Hessian of the Lagrangian
SQP via Newton’s Method
• Alternately, we consider minimizing the quadratic approximation:
min_{Δ𝒙} ½ Δ𝒙𝑇𝛻²ℒΔ𝒙 + 𝛻𝑓𝑇Δ𝒙
Subject to: ℎ𝑖(𝒙) + 𝒏𝑖𝑇Δ𝒙 = 0, 𝑖 = 1, …, 𝑙
• The KKT conditions are: 𝛻𝑓 + 𝛻 2 ℒΔ𝒙 + 𝑵𝒗 = 𝟎, 𝒉 + 𝑵Δ𝒙 = 𝟎
• Thus the QP subproblem can be solved via Newton’s method!
[𝛻²ℒ𝑘 𝑵; 𝑵𝑇 𝟎] [Δ𝒙𝑘; 𝒗𝑘+1] = −[𝛻𝑓𝑘; 𝒉𝑘]
• The Hessian of the Lagrangian can be updated via BFGS method as:
𝑯𝑘+1 = 𝑯𝑘 + 𝑫𝑘 − 𝑬𝑘
where 𝑫𝑘 = 𝒚𝑘𝒚𝑘𝑇 / 𝒚𝑘𝑇Δ𝒙𝑘, 𝑬𝑘 = 𝒄𝑘𝒄𝑘𝑇 / 𝒄𝑘𝑇Δ𝒙𝑘, 𝒄𝑘 = 𝑯𝑘Δ𝒙𝑘, 𝒚𝑘 = 𝛻ℒ𝑘+1 − 𝛻ℒ𝑘
Example: SQP with Hessian Update
• Consider the NLP problem: min 𝑓(𝑥1 , 𝑥2 ) = 𝑥12 − 𝑥1 𝑥2 + 𝑥22
𝑥1 ,𝑥2
subject to 𝑔1 : 1 − 𝑥12 − 𝑥22 ≤ 0, 𝑔2 : −𝑥1 ≤ 0, 𝑔3 : −𝑥2 ≤ 0
Let 𝒙0 = (1, 1); then, 𝑓0 = 1, 𝒄 = [1, 1]𝑇, 𝑔1(1,1) = 𝑔2(1,1) =
𝑔3(1,1) = −1; 𝛻𝑔1𝑇 = [−2, −2], 𝛻𝑔2𝑇 = [−1, 0], 𝛻𝑔3𝑇 = [0, −1].
• Using approximate line search, 𝛼 = 1/4, 𝒙1 = (3/4, 3/4).
• For the Hessian update, we have:
𝑓1 = 0.5625, 𝑔1 = −0.125, 𝑔2 = 𝑔3 = −0.75; 𝒄1 = [0.75, 0.75];
𝛻𝑔1𝑇 = [−3/2, −3/2], 𝛻𝑔2𝑇 = [−1, 0], 𝛻𝑔3𝑇 = [0, −1]; Δ𝒙0 = [−0.25, −0.25];
then, 𝑫0 = (1/8)[1 1; 1 1], 𝑬0 = (1/8)[1 1; 1 1], 𝑯1 = 𝑯0
SQP with Hessian Update
• For the next step, the QP problem is defined as:
min_{𝑑1,𝑑2} 𝑓 = (3/4)(𝑑1 + 𝑑2) + (1/2)(𝑑1² + 𝑑2²)
Subject to: −(3/2)(𝑑1 + 𝑑2) ≤ 0, −𝑑1 ≤ 0, −𝑑2 ≤ 0
• The application of KKT conditions results in a linear system of
equations, which is solved to obtain:
𝒙𝑇 = [𝑑1, 𝑑2, 𝑢1, 𝑢2, 𝑢3, 𝑠1, 𝑠2, 𝑠3] = [0.188, 0.188, 0, 0, 0, 0.125, 0.75, 0.75]
Modified SQP Algorithm
Modified SQP Algorithm (Arora, p. 558):
• Initialize: choose 𝒙0 , 𝑅0 = 1, 𝑯0 = 𝐼; 𝜀1 , 𝜀2 > 0.
• For 𝑘 = 0,1,2, …
– Compute 𝑓 𝑘 , 𝑔𝑖𝑘 , ℎ𝑗𝑘 , 𝒄, 𝑏𝑖 , 𝑒𝑗 , and 𝑉𝑘 . If 𝑘 > 0, compute 𝑯𝑘
– Formulate and solve the modified QP subproblem for search
direction 𝒅𝑘 and the Lagrange multipliers 𝒖𝑘 and 𝒗𝑘 .
– If 𝑉𝑘 ≤ 𝜀1 and ‖𝒅𝑘‖ ≤ 𝜀2, stop.
– Compute 𝑅; formulate and solve line search subproblem for 𝛼
– Set 𝒙𝑘+1 ← 𝒙𝑘 + 𝛼𝒅𝑘 , 𝑅𝑘+1 ← 𝑅, 𝑘 ← 𝑘 + 1.
SQP Algorithm
%SQP subproblem via Hessian update
% input: xk (current design); Lk (Hessian of Lagrangian estimate)
%initialize
n=size(xk,1);
if ~exist('Lk','var'), Lk=diag(xk+(~xk)); end
tol=1e-7;
%function and constraint values
fk=f(xk);
dfk=df(xk);
gk=g(xk);
dgk=dg(xk);
%N-R update
A=[Lk dgk; dgk' 0*dgk'*dgk];
b=[-dfk;-gk];
dx=A\b;
dxk=dx(1:n);
lam=dx(n+1:end);
SQP Algorithm
%inactive constraints
idx1=find(lam<0);
if ~isempty(idx1)
[dxk,lam]=inactive(lam,A,b,n);
end
%check termination
if abs(dxk)<tol, return, end
%adjust increment for constraint compliance
P=@(xk) f(xk)+lam'*abs(g(xk));
while P(xk+dxk)>P(xk),
dxk=dxk/2;
if abs(dxk)<tol, break, end
end
%Hessian update
dL=@(x) df(x)+dg(x)*lam;
Lk=update(Lk, xk, dxk, dL);
xk=xk+dxk;
disp([xk' f(xk) P(xk)])
SQP Algorithm
%function definitions
function [dxk,lam]=inactive(lam,A,b,n)
idx1=find(lam<0);
lam(idx1)=0;
idx2=find(lam);
v=[1:n,n+idx2];
A=A(v,v); b=b(v);
dx=A\b;
dxk=dx(1:n);
lam(idx2)=dx(n+1:end);
end
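The update function called above is not listed on these slides; a minimal BFGS-style sketch consistent with the Hessian update formula given earlier (the curvature safeguard is an added assumption) is:
function Lk=update(Lk, xk, dxk, dL)
%BFGS update of the Lagrangian Hessian: Lk+1 = Lk + D - E
y=dL(xk+dxk)-dL(xk);     %y = grad L(k+1) - grad L(k)
c=Lk*dxk;                %c = Lk*dx
if y'*dxk>1e-10          %skip update if curvature is not positive (safeguard)
Lk=Lk+(y*y')/(y'*dxk)-(c*c')/(c'*dxk);
end
end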