
Optimization Methods

in Engineering Design
Day-2b
Course Objectives
• Learn basic optimization methods and how they are applied in
engineering design
• Use MATLAB to solve optimum engineering design problems
– Linear programming problems
– Nonlinear programming problems
– Mixed integer programming problems
Course Prerequisites
• Participants are assumed to have some working knowledge of:
– Linear algebra
– Multivariable calculus
– Scientific reasoning
– Basic programming
– MATLAB
Course Materials
• Arora, Introduction to Optimum Design, 3e, Elsevier
(https://www.researchgate.net/publication/273120102_Introduction_to_Optimum_design)
• Parkinson, Optimization Methods for Engineering Design, Brigham Young University
(http://apmonitor.com/me575/index.php/Main/BookChapters)
• Iqbal, Fundamental Engineering Optimization Methods, BookBoon
(https://bookboon.com/en/fundamental-engineering-optimization-methods-ebook)
Mathematical Preliminaries
Set Definitions
• Closed Set. A set 𝑆 is closed if it contains its limit points, i.e., for any sequence of points 𝑥𝑘 ∈ 𝑆 with lim𝑘→∞ 𝑥𝑘 = 𝑥, we have 𝑥 ∈ 𝑆. For example, the set 𝑆 = {𝑥: ‖𝑥‖ ≤ 𝑐} is closed.
• Bounded Set. A set 𝑆 is bounded if for every 𝑥 ∈ 𝑆, ‖𝑥‖ < 𝑐, where ‖∙‖ represents a vector norm and 𝑐 is a finite number.
• Compact Set. A set 𝑆 is compact if it is both closed and bounded.
• Open Set. A set 𝑆 is open if every 𝑥 ∈ 𝑆 is an interior point of 𝑆. For example, the set 𝑆 = {𝑥: ‖𝑥‖ < 𝑐} is open.
Set Definitions
• Hyperplane. The set 𝑆 = {𝒙: 𝒂𝑇𝒙 = 𝑏}, where 𝒂 is a constant vector and 𝑏 a constant scalar, defines a hyperplane. A line is a hyperplane in two dimensions. Note that the vector 𝒂 is normal to the hyperplane.
• Halfspace. The set 𝑆 = {𝒙: 𝒂𝑇𝒙 ≤ 𝑏}, where 𝒂 and 𝑏 are constants, defines a halfspace. Note that the vector 𝒂 is normal to the hyperplane that bounds the halfspace.
• Convex Set. A set 𝑆 is convex if for every pair 𝒙, 𝒚 ∈ 𝑆, their convex combination 𝛼𝒙 + (1 − 𝛼)𝒚 ∈ 𝑆 for 0 ≤ 𝛼 ≤ 1. A line segment, a hyperplane, a halfspace, and the sets of real numbers (ℝ, ℝ𝑛) are convex.
• Extreme Point. A point 𝒙 ∈ 𝑆 is an extreme point (or vertex) of a convex set 𝑆 if it cannot be expressed as 𝒙 = 𝛼𝒚 + (1 − 𝛼)𝒛 with 𝒚, 𝒛 ∈ 𝑆, 𝒚, 𝒛 ≠ 𝒙, and 0 < 𝛼 < 1.
• Interior Point. A point 𝒙 ∈ 𝑆 is interior to the set 𝑆 if {𝒚: ‖𝒚 − 𝒙‖ < 𝜖} ⊂ 𝑆 for some 𝜖 > 0.
Function Definitions
• Continuous Function. A function 𝑓(𝒙) is continuous at a point 𝒙0 if lim𝒙→𝒙0 𝑓(𝒙) = 𝑓(𝒙0).
• Affine Function. The function 𝑓(𝒙) = 𝒂𝑇𝒙 + 𝑏 is affine.
• Quadratic Function. A quadratic function is of the form: 𝑓(𝒙) = ½ 𝒙𝑇𝑸𝒙 + 𝒂𝑇𝒙 + 𝑏, where 𝑸 is symmetric.
• Convex Functions. A function 𝑓(𝒙) defined on a convex set 𝑆 is convex if and only if for every pair 𝒙, 𝒚 ∈ 𝑆,
𝑓(𝛼𝒙 + (1 − 𝛼)𝒚) ≤ 𝛼𝑓(𝒙) + (1 − 𝛼)𝑓(𝒚), 𝛼 ∈ [0,1]
– Affine functions defined over convex sets are convex.
– Quadratic functions defined over convex sets are convex if and only if 𝑸 ≥ 0 (positive semidefinite), i.e., all eigenvalues of 𝑸 are nonnegative; they are strictly convex if 𝑸 > 0, i.e., all eigenvalues are positive (see the numerical check below).
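As a quick numerical check (a minimal MATLAB sketch; the particular 𝑸, 𝒂, 𝑏 below are arbitrary choices, not from the slides), the convexity inequality can be verified by sampling random point pairs:

    % Verify f(a*x + (1-a)*y) <= a*f(x) + (1-a)*f(y) for a quadratic with Q > 0
    Q = [2 1; 1 3];               % symmetric, eigenvalues > 0 (positive definite)
    a = [1; -1]; b = 0.5;         % arbitrary linear and constant terms
    f = @(x) 0.5*x'*Q*x + a'*x + b;
    for k = 1:1000
        x = randn(2,1); y = randn(2,1); t = rand;
        assert(f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y) + 1e-12)
    end
    disp('Convexity inequality held for all sampled pairs.')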
The Gradient Vector
• The Gradient Vector. Let 𝑓(𝒙) = 𝑓(𝑥1, 𝑥2, …, 𝑥𝑛) be a real-valued function of 𝑛 variables; the gradient of 𝑓 is the vector defined by:
𝛻𝑓(𝒙) = [𝜕𝑓/𝜕𝑥1, 𝜕𝑓/𝜕𝑥2, …, 𝜕𝑓/𝜕𝑥𝑛]𝑇
The gradient of 𝑓(𝒙) at a point 𝒙0 is given as: 𝛻𝑓(𝒙0) = 𝛻𝑓(𝒙)|𝒙=𝒙0.
• Directional Derivative. The directional derivative of 𝑓(𝒙) along any direction 𝒅 is defined as: 𝑓′𝒅(𝒙) = 𝛻𝑓(𝒙)𝑇𝒅. By definition, the directional derivative at 𝒙0 is maximum along 𝛻𝑓(𝒙0) (see the sketch below).
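As an illustration (a sketch; the function is the one used in the Hessian example that follows), the gradient can be approximated by central differences and used to evaluate a directional derivative:

    % Central-difference gradient of f(x,y) = 3*x^2*y at (1,2), plus a directional derivative
    f  = @(v) 3*v(1)^2*v(2);
    x0 = [1; 2]; h = 1e-6;
    g  = zeros(2,1);
    for i = 1:2
        e = zeros(2,1); e(i) = h;
        g(i) = (f(x0+e) - f(x0-e)) / (2*h);   % approximates [6*x*y; 3*x^2] = [12; 3]
    end
    d  = [1; 0];                              % unit direction along the x-axis
    fd = g'*d                                 % directional derivative, about 12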
The Hessian Matrix
• The Hessian of 𝑓(𝒙) is the 𝑛 × 𝑛 matrix 𝛻²𝑓(𝒙) of second partial derivatives, where [𝛻²𝑓(𝒙)]𝑖𝑗 = 𝜕²𝑓/𝜕𝑥𝑖𝜕𝑥𝑗. Note that the Hessian is symmetric, since 𝜕²𝑓/𝜕𝑥𝑖𝜕𝑥𝑗 = 𝜕²𝑓/𝜕𝑥𝑗𝜕𝑥𝑖.
• Example: let 𝑓(𝒙) = ½ 𝒙𝑇𝑸𝒙 + 𝒂𝑇𝒙 + 𝑏, where 𝑸 is symmetric; then 𝛻𝑓(𝒙) = 𝑸𝒙 + 𝒂 and 𝛻²𝑓(𝒙) = 𝑸.
• Example: let 𝑓(𝑥, 𝑦) = 3𝑥²𝑦. Then 𝛻𝑓(𝑥, 𝑦) = [6𝑥𝑦, 3𝑥²]𝑇 and 𝛻²𝑓(𝑥, 𝑦) = [6𝑦 6𝑥; 6𝑥 0].
Let (𝑥0, 𝑦0) = (1, 2); then 𝛻𝑓(𝑥0, 𝑦0) = [12, 3]𝑇 and 𝛻²𝑓(𝑥0, 𝑦0) = [12 6; 6 0] (verified symbolically below).
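If the Symbolic Math Toolbox is available, this worked example can be checked directly (a minimal sketch using the toolbox's symbolic gradient and hessian functions):

    syms x y
    f = 3*x^2*y;
    g = gradient(f, [x, y]);          % [6*x*y; 3*x^2]
    H = hessian(f, [x, y]);           % [6*y, 6*x; 6*x, 0]
    subs(g, [x, y], [1, 2])           % [12; 3]
    subs(H, [x, y], [1, 2])           % [12, 6; 6, 0]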
The Taylor Series
• The Taylor series expansion of 𝑓(𝑥) around 𝑥0 is given as:
𝑓(𝑥0 + Δ𝑥) = 𝑓(𝑥0) + 𝑓′(𝑥0)Δ𝑥 + (1/2!)𝑓″(𝑥0)Δ𝑥² + ⋯
• The 𝑛th-order Taylor series approximation of 𝑓(𝑥) is given as:
𝑓(𝑥0 + Δ𝑥) ≅ 𝑓(𝑥0) + 𝑓′(𝑥0)Δ𝑥 + (1/2!)𝑓″(𝑥0)Δ𝑥² + ⋯ + (1/𝑛!)𝑓⁽ⁿ⁾(𝑥0)Δ𝑥ⁿ
First order: 𝑓(𝑥0 + Δ𝑥) ≅ 𝑓(𝑥0) + 𝑓′(𝑥0)Δ𝑥
Second order: 𝑓(𝑥0 + Δ𝑥) ≅ 𝑓(𝑥0) + 𝑓′(𝑥0)Δ𝑥 + (1/2!)𝑓″(𝑥0)Δ𝑥²
• The local behavior of a function is approximated as:
𝑓(𝑥) − 𝑓(𝑥0) ≅ 𝑓′(𝑥0)(𝑥 − 𝑥0)
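A short numerical illustration (an added sketch; 𝑓(𝑥) = eˣ at 𝑥0 = 0 is an arbitrary test case) compares the first- and second-order approximations:

    % First- and second-order Taylor approximations of f(x) = exp(x) at x0 = 0
    f  = @(x) exp(x);                % all derivatives equal exp(x), so f'(0) = f''(0) = 1
    dx = 0.1;
    t1 = 1 + dx;                     % first order:  1.1000
    t2 = 1 + dx + 0.5*dx^2;          % second order: 1.1050
    err = [f(dx) - t1, f(dx) - t2]   % about 5.2e-3 and 1.7e-4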
Taylor Series
• The Taylor series expansion in the case of a multi-variable function
is given as (where 𝜹𝒙 = 𝒙 − 𝒙0 ):
𝑓(𝒙0 + 𝜹𝒙) = 𝑓(𝒙0) + 𝛻𝑓(𝒙0)𝑇𝜹𝒙 + (1/2!)𝜹𝒙𝑇𝛻²𝑓(𝒙0)𝜹𝒙 + ⋯
where 𝛻𝑓 𝒙0 and 𝛻 2 𝑓 𝒙0 are, respectively, the gradient and
Hessian of 𝑓 computed at 𝒙0 .
• A first-order change in 𝑓(𝒙) at 𝒙0 along a direction 𝒅 is given by its
directional derivative:
𝛿𝑓 = 𝑓(𝒙) − 𝑓(𝒙0) = 𝛻𝑓(𝒙0)𝑇𝒅
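Reusing the earlier example 𝑓(𝑥, 𝑦) = 3𝑥²𝑦 at 𝒙0 = (1, 2), a small sketch of the second-order model (the step 𝜹𝒙 below is an arbitrary choice):

    % Second-order Taylor model of f(x,y) = 3*x^2*y at x0 = (1,2)
    f  = @(v) 3*v(1)^2*v(2);
    x0 = [1; 2];
    g  = [12; 3];                          % gradient at x0 (from the Hessian slide)
    H  = [12 6; 6 0];                      % Hessian at x0
    dx = [0.1; -0.05];
    f_model = f(x0) + g'*dx + 0.5*dx'*H*dx % 7.0800
    f_true  = f(x0 + dx)                   % 7.0785, close for small dx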
Quadratic Function Forms
• The quadratic form (a scalar function of 𝒙) is defined as:
𝑓(𝒙) = 𝒙𝑇𝑸𝒙 = Σ𝑖,𝑗 𝑄𝑖𝑗 𝑥𝑖𝑥𝑗 (sum over 𝑖, 𝑗 = 1, …, 𝑛)
Note that replacing 𝑸 by ½(𝑸 + 𝑸𝑇) does not change 𝑓(𝒙). Hence, in a quadratic form 𝑸 can always be assumed to be symmetric.
• The quadratic form is classified as:
– Positive definite if 𝒙𝑇𝑸𝒙 > 0 for all 𝒙 ≠ 𝟎, or 𝜆(𝑸) > 0
– Positive semidefinite if 𝒙𝑇𝑸𝒙 ≥ 0 for all 𝒙, or 𝜆(𝑸) ≥ 0
– Negative definite if 𝒙𝑇𝑸𝒙 < 0 for all 𝒙 ≠ 𝟎, or 𝜆(𝑸) < 0
– Negative semidefinite if 𝒙𝑇𝑸𝒙 ≤ 0 for all 𝒙, or 𝜆(𝑸) ≤ 0
– Indefinite otherwise
Here 𝜆(𝑸) denotes the eigenvalues of 𝑸 (see the eigenvalue check below).
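In MATLAB the classification reduces to an eigenvalue check (a minimal sketch; the example matrix is arbitrary):

    % Classify a quadratic form by the eigenvalues of its symmetric matrix Q
    Q = [2 -1; -1 2];                  % replace with any symmetric matrix
    lam = eig(Q);
    if all(lam > 0),       disp('positive definite')
    elseif all(lam >= 0),  disp('positive semidefinite')
    elseif all(lam < 0),   disp('negative definite')
    elseif all(lam <= 0),  disp('negative semidefinite')
    else,                  disp('indefinite')
    end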
Matrix Norms
• Norms provide a measure for the size of a vector or matrix, similar
to the notion of absolute value in the case of real numbers.
• Vector p-norms are defined by: ‖𝒙‖𝑝 = (|𝑥1|^𝑝 + ⋯ + |𝑥𝑛|^𝑝)^(1/𝑝), 𝑝 ≥ 1.
– 1-norm: ‖𝒙‖1 = |𝑥1| + ⋯ + |𝑥𝑛|
– Euclidean norm: ‖𝒙‖2 = (𝑥1² + ⋯ + 𝑥𝑛²)^(1/2)
– ∞-norm: ‖𝒙‖∞ = max𝑖 |𝑥𝑖|
• Induced matrix norms are defined by: ‖𝑨‖ = max{‖𝑨𝒙‖ : ‖𝒙‖ = 1}, which implies ‖𝑨𝒙‖ ≤ ‖𝑨‖‖𝒙‖.
– ‖𝑨‖1 = max𝑗 Σ𝑖 |𝐴𝑖𝑗| (the largest absolute column sum of 𝑨)
– ‖𝑨‖2 = (𝜆𝑚𝑎𝑥(𝑨𝑇𝑨))^(1/2), where 𝜆𝑚𝑎𝑥 is the largest eigenvalue of 𝑨𝑇𝑨
– ‖𝑨‖∞ = max𝑖 Σ𝑗 |𝐴𝑖𝑗| (the largest absolute row sum of 𝑨; see the sketch below)
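These are exactly what MATLAB's built-in norm computes; a quick sketch with arbitrary data:

    % Vector and induced matrix norms via MATLAB's norm
    x = [3; -4];
    [norm(x,1), norm(x,2), norm(x,Inf)]   % 7, 5, 4
    A = [1 2; -3 4];
    [norm(A,1), norm(A,2), norm(A,Inf)]   % 6 (column sum), sqrt(lambda_max(A'*A)), 7 (row sum)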
Properties of Convex Functions
• If 𝑓 ∈ 𝐶¹ (i.e., 𝑓 is differentiable), then 𝑓 is convex over a convex set 𝑆 if and only if for all 𝒙, 𝒚 ∈ 𝑆, 𝑓(𝒚) ≥ 𝑓(𝒙) + 𝛻𝑓(𝒙)𝑇(𝒚 − 𝒙). Graphically, this means that the function lies on or above the tangent line (hyperplane) passing through 𝒙.
• If 𝑓 ∈ 𝐶² (i.e., 𝑓 is twice differentiable), then 𝑓 is convex over a convex set 𝑆 if and only if 𝑓″(𝑥) ≥ 0 for all 𝑥 ∈ 𝑆. In the case of multivariable functions, 𝑓 is convex over a convex set 𝑆 if and only if its Hessian matrix is positive semidefinite everywhere in 𝑆, i.e., for all 𝒙 ∈ 𝑆 and for all 𝒅, 𝒅𝑇𝛻²𝑓(𝒙)𝒅 ≥ 0.
• If the Hessian is positive definite for all 𝒙 ∈ 𝑆, i.e., if 𝒅𝑇𝛻²𝑓(𝒙)𝒅 > 0 for all 𝒅 ≠ 𝟎, then the function is strictly convex.
• If 𝒙∗ is a local minimum of a convex function 𝑓 defined over a convex set 𝑆, then it is also a global minimum (the first-order condition is spot-checked below).
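The first-order condition can be spot-checked numerically (an added sketch reusing the convex quadratic from earlier; its gradient 𝑸𝒙 + 𝒂 comes from the Hessian slide):

    % Spot-check f(y) >= f(x) + grad_f(x)'*(y - x) for a convex quadratic
    Q = [2 1; 1 3]; a = [1; -1];
    f  = @(x) 0.5*x'*Q*x + a'*x;
    gf = @(x) Q*x + a;                     % gradient of the quadratic
    x = randn(2,1); y = randn(2,1);
    gap = f(y) - (f(x) + gf(x)'*(y - x))   % nonnegative whenever f is convex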
Solving Linear Systems of Equations
• A system of 𝑚 linear equations in 𝑛 unknowns is described as: 𝑨𝒙 = 𝒃, where 𝑨 is assumed to have full rank, 𝑟 = min(𝑚, 𝑛).
• A solution to the system exists only if rank(𝑨) = rank([𝑨 𝒃]), i.e., only if 𝒃 lies in the column space of 𝑨. The solution is unique if 𝑟 = 𝑛.
– For 𝑚 = 𝑛, the solution is obtained as: 𝒙 = 𝑨⁻¹𝒃
– The general solution for 𝑚 < 𝑛 is obtained by reducing the system to canonical form: 𝑰𝒙(𝑚) + 𝑸𝒙(𝑛−𝑚) = 𝒃′, where 𝒙(𝑚) are the 𝑚 dependent variables and 𝒙(𝑛−𝑚) are the independent variables.
– The general solution is given as: 𝒙(𝑚) = 𝒃′ − 𝑸𝒙(𝑛−𝑚).
– A basic solution is obtained by setting 𝒙(𝑛−𝑚) = 𝟎, giving 𝒙(𝑚) = 𝒃′.
– For 𝑚 > 𝑛, the system is generally inconsistent, but can be solved in the least-squares sense (see the sketch below).
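All three cases can be handled with MATLAB's backslash operator (a minimal sketch with made-up data):

    % Square, underdetermined, and overdetermined systems in MATLAB
    A1 = [2 1; 1 3];      b1 = [3; 5];      x1 = A1\b1   % unique solution (m = n)
    A2 = [1 0 1; 0 1 2];  b2 = [2; 3];      x2 = A2\b2   % a basic solution (m < n)
    A3 = [1 1; 1 2; 1 3]; b3 = [1; 2; 2.5]; x3 = A3\b3   % least-squares solution (m > n)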
Example: General and Basic Solution
• Consider the LP problem
max𝒙 𝑧 = 2𝑥1 + 3𝑥2
Subject to: 𝑥1 ≤ 3, 𝑥2 ≤ 5, 2𝑥1 + 𝑥2 ≤ 7; 𝑥1, 𝑥2 ≥ 0
• Add slack variables to turn inequality constraints into equality:
𝑥1 + 𝑠1 = 3, 𝑥2 + 𝑠2 = 5, 2𝑥1 + 𝑥2 + 𝑠3 = 7
• Using 𝑥1, 𝑥2 as independent variables, the system is written as:
[1 0 0; 0 1 0; 0 0 1][𝑠1; 𝑠2; 𝑠3] + [1 0; 0 1; 2 1][𝑥1; 𝑥2] = [3; 5; 7]
• Choosing 𝑥1 = 𝑥2 = 0, we obtain a basic solution: [𝑠1; 𝑠2; 𝑠3] = [3; 5; 7]
• A different choice of independent variables will result in a different basic solution (the full LP is solved below).
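This LP can be solved directly with linprog from the Optimization Toolbox (a sketch; linprog minimizes, so the maximization objective is negated):

    % Solve: max 2*x1 + 3*x2  s.t.  x1 <= 3, x2 <= 5, 2*x1 + x2 <= 7, x >= 0
    f  = [-2; -3];                       % negate the objective for maximization
    A  = [1 0; 0 1; 2 1];  b = [3; 5; 7];
    lb = [0; 0];
    [x, fval] = linprog(f, A, b, [], [], lb);
    % expected: x = [1; 5] with maximum z = -fval = 17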
Linear Systems of Equations
• The general solution to 𝑨𝒙 = 𝒃 may be written as: 𝒙 = 𝑨†𝒃, where 𝑨† is the pseudo-inverse of 𝑨, defined as:
– 𝑨† = 𝑨⁻¹ (𝑚 = 𝑛)
– 𝑨† = 𝑨𝑇(𝑨𝑨𝑇)⁻¹ (𝑚 < 𝑛; gives the minimum-norm solution)
– 𝑨† = (𝑨𝑇𝑨)⁻¹𝑨𝑇 (𝑚 > 𝑛; gives the least-squares solution; see the sketch below)
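MATLAB's pinv computes 𝑨† (via the SVD, which also covers rank-deficient cases); a quick sketch reusing the systems above:

    % Pseudo-inverse solutions for the under- and overdetermined cases
    A2 = [1 0 1; 0 1 2];  b2 = [2; 3];
    x_min = pinv(A2)*b2;     % minimum-norm solution, equals A2'*((A2*A2')\b2)
    A3 = [1 1; 1 2; 1 3]; b3 = [1; 2; 2.5];
    x_ls  = pinv(A3)*b3;     % least-squares solution, equals (A3'*A3)\(A3'*b3)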
Linear Diophantine System of Equations
• A Linear Diophantine system of equations (LDSE) is represented as:
𝑨𝒙 = 𝒃, 𝒙 ∈ ℤ𝑛 .
• A square matrix 𝑨 ∈ ℤ𝑛×𝑛 is unimodular if det(𝑨) = ±1.
– If 𝑨 ∈ ℤ𝑛×𝑛 is unimodular, then 𝑨⁻¹ ∈ ℤ𝑛×𝑛 is also unimodular.
– Assume that 𝑨 is unimodular and 𝒃 is an integer vector; then every solution of {𝒙|𝑨𝒙 = 𝒃} is integral.
• A non-square matrix 𝑨 ∈ ℤ𝑚×𝑛 is totally unimodular if every square submatrix 𝑪 of 𝑨 has det(𝑪) ∈ {0, ±1}.
– Assume that 𝑨 is totally unimodular and 𝒃 is an integer vector; then every basic solution of {𝒙|𝑨𝒙 = 𝒃} is integral.
– Note that a basic solution has at most 𝑚 nonzero elements (see the illustration below).
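A small numerical illustration of the unimodular case (the matrix below is an arbitrary example with det(𝑨) = 1):

    % Unimodular A: an integer right-hand side yields an integer solution
    A = [2 1; 1 1];          % det(A) = 1, so A is unimodular
    b = [7; 4];
    x = A\b                  % x = [3; 1], integral as expected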
Example: Integer BFS
• Consider the LP problem with integral coefficients
max𝒙 𝑧 = 2𝑥1 + 3𝑥2
Subject to: 𝑥1 ≤ 3, 𝑥2 ≤ 5, 𝑥1 + 𝑥2 ≤ 7, 𝒙 ∈ ℤ², 𝒙 ≥ 𝟎
• Add slack variables and write the constraints in matrix form as:
𝑨 = [1 0 1 0 0; 0 1 0 1 0; 1 1 0 0 1], 𝒃 = [3; 5; 7]
where the columns of 𝑨 represent variables and the rows represent the constraints. Note that 𝑨 is totally unimodular and 𝒃 ∈ ℤ³.
• Then, using the simplex method, the optimal integral solution is obtained as: 𝒙𝑇 = (2, 5, 1, 0, 0), with 𝑧∗ = 19 (reproduced below).
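With the Optimization Toolbox, the same result can be reproduced via intlinprog (a sketch; again the objective is negated for maximization):

    % Solve: max 2*x1 + 3*x2  s.t.  x1 <= 3, x2 <= 5, x1 + x2 <= 7, x in Z^2, x >= 0
    f      = [-2; -3];
    intcon = [1 2];                        % both variables restricted to integers
    A      = [1 0; 0 1; 1 1];  b = [3; 5; 7];
    lb     = [0; 0];
    x = intlinprog(f, intcon, A, b, [], [], lb);
    % expected: x = [2; 5] with z* = 2*2 + 3*5 = 19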
Condition Numbers and Convergence Rates
• The condition number of a matrix 𝑨 is defined as: cond(𝑨) = ‖𝑨‖ ∙ ‖𝑨⁻¹‖. Note that,
– cond(𝑨) ≥ 1
– cond(𝑰) = 1
– If 𝑨 is symmetric with real eigenvalues, then cond(𝑨) = |𝜆𝑚𝑎𝑥(𝑨)| / |𝜆𝑚𝑖𝑛(𝑨)|.
• The condition number of the Hessian matrix affects the convergence rate of an optimization algorithm (see the sketch below).
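MATLAB's cond computes this directly; the ill-conditioned Hilbert matrix is a standard illustration:

    % Condition numbers: well-conditioned vs. ill-conditioned matrices
    cond(eye(3))              % 1
    H = hilb(6);              % 6x6 Hilbert matrix, symmetric positive definite
    cond(H)                   % about 1.5e7; gradient methods converge slowly on such Hessians
    max(eig(H))/min(eig(H))   % matches cond(H) for symmetric positive definite H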
Convergence Rates of Numerical Algorithms
• Assume that a sequence of points {𝑥𝑘} converges to a solution 𝑥∗, and let 𝑒𝑘 = |𝑥𝑘 − 𝑥∗|. Then, the sequence {𝑥𝑘} converges to 𝑥∗ with rate 𝑟 and rate constant 𝐶 if 𝑒𝑘+1 = 𝐶 𝑒𝑘^𝑟.
Note that convergence is faster if 𝑟 is large and 𝐶 is small.
• Linear Convergence. For 𝑟 = 1, 𝑒𝑘+1 = 𝐶𝑒𝑘, i.e., convergence is linear, with 𝐶 ≈ (𝑓(𝑥𝑘+1) − 𝑓(𝑥∗)) / (𝑓(𝑥𝑘) − 𝑓(𝑥∗)).
• Quadratic Convergence. For 𝑟 = 2, 𝑒𝑘+1 = 𝐶𝑒𝑘². If, additionally, 𝐶 = 1, then the number of correct digits doubles at every iteration.
• Superlinear Convergence. For 1 < 𝑟 < 2, the convergence is superlinear. Numerical algorithms that only use gradient information can achieve superlinear convergence (an empirical rate estimate follows below).
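The rate can be estimated empirically from an error sequence (a hedged sketch; the errors below are fabricated to follow 𝐶 = 1, 𝑟 = 2):

    % Estimate the convergence rate r from successive errors: r ~ log(e_{k+1})/log(e_k)
    e = [1e-1, 1e-2, 1e-4, 1e-8];              % example errors with C = 1, r = 2
    r_est = log(e(2:end)) ./ log(e(1:end-1))   % each entry is 2, i.e., quadratic convergence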
Newton’s Method
• Newton’s method iteratively solves the equation: 𝑓(𝑥) = 0.
– Starting from an initial guess 𝑥0, it generates a sequence of iterates {𝑥𝑘} that converge to a point 𝑥∗ with 𝑓(𝑥∗) = 0.
– Newton’s method for a single-variable function is given as:
𝑥𝑘+1 = 𝑥𝑘 − 𝑓(𝑥𝑘)/𝑓′(𝑥𝑘)
• For a system of equations 𝒇(𝒙) = 𝟎, let 𝐽(𝒙) = [𝛻𝑓1(𝒙), 𝛻𝑓2(𝒙), …, 𝛻𝑓𝑛(𝒙)]𝑇; then, Newton’s update is given as: 𝒙𝑘+1 = 𝒙𝑘 − 𝐽(𝒙𝑘)⁻¹𝒇(𝒙𝑘)
• Newton’s method achieves quadratic convergence with rate constant 𝐶 = ½ |𝑓″(𝑥∗)/𝑓′(𝑥∗)| (see the implementation sketch below).
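A minimal single-variable implementation (illustrative; the test equation 𝑥² − 2 = 0, with root √2, is an arbitrary choice):

    % Newton's method for f(x) = x^2 - 2, with f'(x) = 2*x
    f  = @(x) x^2 - 2;
    df = @(x) 2*x;
    x  = 1;                      % initial guess
    for k = 1:6
        x = x - f(x)/df(x);      % Newton update
    end
    err = abs(x - sqrt(2))       % error shrinks quadratically (correct digits double)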
Conjugate Gradient Method
• The conjugate-gradient (CG) method was designed to iteratively solve the linear system of equations 𝑨𝒙 = 𝒃, where 𝑨 is assumed symmetric positive definite.
• The method initializes with 𝒙0 = 𝟎 and, in exact arithmetic, obtains the solution 𝒙𝑛 in at most 𝑛 iterations.
• The method generates a set of vectors 𝒗1, 𝒗2, …, 𝒗𝑛 that are conjugate with respect to the matrix 𝑨, i.e., 𝒗𝑖𝑇𝑨𝒗𝑗 = 0 for 𝑖 ≠ 𝑗.
– Let 𝒗−1 = 𝟎 and 𝛽0 = 0, and define the residual 𝒓𝑖 = 𝒃 − 𝑨𝒙𝑖 (so 𝒗0 = 𝒓0). Then, a set of conjugate vectors with respect to 𝑨 is iteratively generated as:
𝒗𝑖 = 𝒓𝑖 + 𝛽𝑖𝒗𝑖−1, 𝛽𝑖 = −(𝒗𝑖−1𝑇𝑨𝒓𝑖)/(𝒗𝑖−1𝑇𝑨𝒗𝑖−1)
where 𝛽𝑖 is chosen so that 𝒗𝑖𝑇𝑨𝒗𝑖−1 = 0 (see the sketch below).
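A compact MATLAB sketch of this recursion (illustrative data; in practice the built-in pcg performs preconditioned CG on the same problem):

    % Conjugate-gradient iterations for A*x = b with symmetric positive definite A
    A = [4 1; 1 3]; b = [1; 2];
    x = [0; 0]; r = b - A*x; v = r;            % v0 = r0
    for i = 1:2                                % n = 2 steps for a 2x2 system
        alpha = (r'*v)/(v'*A*v);               % exact line search along v
        x = x + alpha*v;
        r = b - A*x;                           % updated residual
        beta = -(v'*A*r)/(v'*A*v);             % enforces conjugacy: v_new'*A*v = 0
        v = r + beta*v;
    end
    % x now matches A\b, about [0.0909; 0.6364]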
