Lecture Notes - Optimization and Control PDF
What characterizes an optimum?
What are the conditions for a point to be a minimum, and how can we check them?
Least squares example:
An ice cream seller wants to find how many ice creams to sell.
x_i = temperature on day i, y_i = ice creams sold on day i.
Model: y = a x + b. Find a, b that minimize the distance between the data points and the line:
min_{a,b} sum_i (y_i - a x_i - b)^2
General formulation:
min_x f(x)
Constraints can be added:
c_i(x) = 0, i in E   (equality constraints)
c_i(x) >= 0, i in I  (inequality constraints)
A point satisfying all constraints is called feasible.
General optimization problem:
min_{x in R^n} f(x)
s.t. c_i(x) = 0,  i in E
     c_i(x) >= 0, i in I
Special cases:
Linear programming (LP):
min_x c^T x   s.t. A_eq x = b_eq,  A_in x >= b_in
Quadratic programming (QP):
min_x 1/2 x^T G x + c^T x   s.t. A_eq x = b_eq,  A_in x >= b_in
(the objective can be a quadratic expression)
LP example:
x1 = tonnes of A, x2 = tonnes of B
Profit: 7000 x1 + 6000 x2
Fertilizer: 60 x1 + 80 x2 <= 2000
Optimization problem:
min_x -(7000 x1 + 6000 x2)   s.t. Ax <= b, x >= 0
(a linear objective function)
Convexity
Convex set: S is convex if for all x, y in S and theta in [0,1], theta x + (1-theta) y is in S.
Convex function: f: S -> R is convex if S is convex and
f(theta x + (1-theta) y) <= theta f(x) + (1-theta) f(y)   for all x, y in S, theta in [0,1].
If the objective and the feasible set are convex, every local minimum is a global minimum, which makes the problem much easier to solve.
Integer problems are never convex.
f is concave if -f is convex.
Definitions
Feasible set: Omega = {x | c_i(x) = 0, i in E; c_i(x) >= 0, i in I}
Global solution: x* in Omega with f(x*) <= f(x) for all x in Omega
Local solution: x* in Omega with f(x*) <= f(x) for all feasible x in a neighborhood of x*
A local minimum over the feasible set need not be the global minimum.
For convex problems, any local solution is also a global solution.
Proof by contradiction:
Assume x* is a local but not a global solution, so there is a feasible x^ with f(x^) < f(x*).
Define z = lambda x^ + (1-lambda) x*, lambda in (0,1]. By convexity of the feasible set, z is feasible, and by convexity of f:
f(z) <= lambda f(x^) + (1-lambda) f(x*) < f(x*)
(add and subtract lambda f(x*) to see the strict inequality).
Letting lambda -> 0, z lies arbitrarily close to x* with a lower objective value, so x* is not even a local solution. Contradiction.
First-order necessary conditions for constrained problems, the KKT conditions:
grad_x L(x*, lambda*) = 0
c_i(x*) = 0,  i in E
c_i(x*) >= 0, i in I
lambda_i* >= 0, i in I
lambda_i* c_i(x*) = 0,  i in E u I
Useful facts:
d is a descent direction at x if d^T grad f(x) < 0.
d is a feasible direction for an equality constraint c_i(x) = 0 if d^T grad c_i(x) = 0.
d is a feasible direction for an active inequality constraint c_i(x) >= 0 (with c_i(x) = 0) if d^T grad c_i(x) >= 0.
Equality constraint case: if grad f(x) != lambda grad c(x) for every lambda, there clearly exists a direction d that is both feasible (d^T grad c(x) = 0) and a descent direction (d^T grad f(x) < 0), so x cannot be a solution. At a solution, grad f(x*) = lambda* grad c(x*).
Active inequality constraint (c(x) = 0): a feasible descent direction clearly exists unless grad f(x*) = lambda* grad c(x*) with lambda* >= 0.
Active set: A(x) = E u {i in I : c_i(x) = 0}.
Second-order conditions: for critical directions d (feasible directions along which the first-order information is inconclusive), d^T grad^2_xx L(x*, lambda*) d >= 0 is necessary; d^T grad^2_xx L(x*, lambda*) d > 0 for all critical d != 0 is sufficient for a strict local solution.
Lecture 3: Optimality conditions for constrained optimization
Gradient and Hessian of the Lagrangian:
grad_x L(x, lambda) = grad f(x) - sum_i lambda_i grad c_i(x)
The Hessian grad^2_xx L(x, lambda) is symmetric.
Example: ball on a string, with constraints.
Minimize the objective (the position of the ball) subject to an equality constraint (the string) and inequality constraints (regions the ball cannot enter).
Lagrangian: L(x, lambda) = f(x) - sum_i lambda_i c_i(x).
KKT conditions:
grad_x L(x, lambda) = 0 (stationarity)
c_i(x) = 0 for equalities, c_i(x) >= 0 for inequalities (primal feasibility)
lambda_i >= 0 for inequalities (dual feasibility)
lambda_i c_i(x) = 0 (complementarity)
Solve by case analysis on the active set, e.g. Case 1: assume lambda_2 = 0, solve the remaining conditions, and check feasibility and dual feasibility; if a condition fails, move to the next case, checking similarly.
A point fulfilling the KKT conditions is only a candidate: the conditions are necessary, not sufficient in general.
Solvability, sensitivity and Lagrange multipliers (N&W):
Given a solution x*, lambda* and an active inequality constraint c_i(x) >= 0, perturb the constraint to c_i(x) >= eps. To first order the optimal value changes as
f(x*(eps)) ~= f(x*) - eps lambda_i*,
so lambda_i* = -df/d eps: the Lagrange multiplier measures how sensitive the optimal value is to perturbing constraint i. For inactive constraints lambda_j* = 0 and the perturbation has no effect.
If LICQ doesn't hold, we might not get unique values of the Lagrange multipliers.
(Critical cone / critical case: refer to the video lecture.)
17th Jan 19
Lecture 4: Linear programming
Brief recap: Linear algebra, Real analysis
Linear programming, formulation, standard form
KKT conditions for linear programming
Dual problem, weak&strong duality
Reference: N&W Ch.13.1 (also: 12.9)
Linear programming
Recap: linear algebra (norms, condition number). A well-conditioned problem can be solved reliably; otherwise use scaling.
Types of constrained optimization problems
Linear programming
Convex problem
Quadratic programming
Convex problem if
Nonlinear programming
In general non-convex!
Norms norm(X,2) returns the 2-norm of X.
norm(X) is the same as norm(X,2).
norm(X,1) returns the 1-norm of X.
norm(X,Inf) returns the infinity norm of X.
Vector norms: measure the size of a vector.
norm(X,'fro') returns the Frobenius norm of X.
Matrix norms (same definition as vector norm):
Infinity-norm: largest row sum. 2-norm: square root of the largest eigenvalue of A^T A (largest singular value). 1-norm: largest column sum.
Think of the matrix as an operator mapping from its domain to its range: the component of x in the row space maps into the range, while the component of x in the nullspace maps to zero.
Domain -> Range
Condition number and eigenvalues: the condition number is larger when the eigenvalues (singular values) are more spread; less spread eigenvalues give a small condition number.
Well-conditioned: A small perturbation gives small changes.
For ill-conditioned cases we use scaling.
Condition number:
>> help cond
cond   Condition number with respect to inversion.
   cond(X) returns the 2-norm condition number (the ratio of the largest singular value of X to the smallest). Large condition numbers indicate a nearly singular matrix.
   cond(X,P) returns the condition number of X in P-norm: NORM(X,P) * NORM(INV(X),P), where P = 1, 2, inf, or 'fro'.
A small condition number (say, 1-100) implies the matrix is well-conditioned; a large condition number (say, >10 000) implies the matrix is ill-conditioned. The condition numbers (2-norm) of the above matrices are 6.9 and 400 000, respectively.
Matrix factorizations (for solving Ax = b):
1. LU decomposition (Gaussian elimination), for a general matrix A
2. Cholesky decomposition, for symmetric positive definite A
3. QR decomposition
4. Eigenvalue (spectral) decomposition
Recap real analysis:
Directional derivative
Lipschitz continuity
Matrix factorizations
Solve linear equation system:
In practice, never use the inverse. It is inefficient and unstable.
Instead, use matrix factorizations:
General matrix A: Use LU-decomposition (Gaussian elimination)
Due to triangular structure of L and U, we easily solve the two linear systems by substitution
Symmetric pd matrix A: Use Cholesky decomposition
Generally, algorithms use permutations:
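A minimal MATLAB sketch of solving Ax = b with the two factorizations above; the matrix and right-hand side are illustrative, not from the lecture:

% Solve A*x = b without forming inv(A).
A = [4 1 2; 1 3 0; 2 0 5];      % example matrix (symmetric positive definite)
b = [1; 2; 3];

% General matrix: permuted LU factorization, P*A = L*U
[L, U, P] = lu(A);
y = L \ (P*b);                  % forward substitution (L lower triangular)
x_lu = U \ y;                   % back substitution (U upper triangular)

% Symmetric pd matrix: Cholesky, A = R'*R
R = chol(A);
x_chol = R \ (R' \ b);          % two triangular solves

% Both agree with MATLAB's backslash
x_ref = A \ b;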
Other important factorizations (singular value decomposition):
A = QR: Finds orthogonal basis for nullspace of A
Eigenvalue (spectral) decomposition, singular value decomposition (SVD)
The SVD is an extension of the eigenvalue decomposition to non-square matrices: for A in R^{m x n},
A = U S V^T, with U and V orthogonal matrices and S diagonal (the singular values).
Gradient and Hessian: For a function f: R^n -> R, the gradient and Hessian are the vector of first partial derivatives grad f(x) and the matrix of second partial derivatives grad^2 f(x).
Directional derivative: The directional derivative of f in the direction p is D(f(x); p) = lim_{t->0} (f(x + t p) - f(x))/t. Also valid when f is not continuously differentiable. When f is continuously differentiable, D(f(x); p) = grad f(x)^T p.
Lipschitz continuity: A function g is Lipschitz continuous in a neighborhood N if there is L > 0 such that ||g(x) - g(y)|| <= L ||x - y|| for all x, y in N.
Linear programming
Adding a constant to the objective doesn't make any difference, as it only changes the value of the objective at the optimum. Both constraints and objective function are linear:
min c^T x   s.t. a_i^T x - b_i = 0, i in E;  a_i^T x - b_i >= 0, i in I
Standard form LP (primal):
min_x c^T x   s.t. Ax = b, x >= 0
Transformation to standard form:
Trick 1, split variables: x = x+ - x-, with x+ >= 0 and x- >= 0.
Trick 2, slack variables: Ax <= b becomes Ax + z = b with z >= 0.
E.g. min c^T x s.t. Ax <= b becomes
min [c; -c; 0]^T [x+; x-; z]   s.t. [A  -A  I] [x+; x-; z] = b,  x+, x-, z >= 0
LP example: Farming
A farmer wants to grow apples (A)
and bananas (B)
He has a field of size 100 000 m2
Growing 1 tonne of A requires an
area of 4 000 m2, growing 1 tonne
A requires 60 kg fertilizer per tonne grown, B requires 80 kg
fertilizer per tonne grown
The profit for A is 7000 per tonne (including fertilizer cost), the
profit for B is 6000 per tonne (including fertilizer cost)
The farmer can legally use up to 2000 kg of fertilizer
He wants to maximize his profits
4000 x1 + 3000 x2 + z1 = 100 000
60 x1 + 80 x2 + z2 = 2 000
x1 >= 0, x2 >= 0, z1 >= 0, z2 >= 0
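A sketch of the farming LP in MATLAB using linprog; the 3 000 m2 area requirement per tonne of B is an assumption read from the handwritten constraint above, and linprog minimizes, so the profit is negated:

% Farming LP solved with linprog (Optimization Toolbox).
% maximize 7000*x1 + 6000*x2  <=>  minimize -(7000*x1 + 6000*x2)
f = [-7000; -6000];
A = [4000 3000;     % area:       4000*x1 + 3000*x2 <= 100000
       60   80];    % fertilizer:   60*x1 +   80*x2 <= 2000
b = [100000; 2000];
lb = [0; 0];        % x >= 0
[x, fval] = linprog(f, A, b, [], [], lb, []);
profit = -fval;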
Special case: we assume the rank of the matrix A is m <= n (full row rank), i.e. the constraints are independent; if not, we may end up with an infeasible or unbounded problem and no optimal solution. m is the maximum number of independent equality constraints we can have with n variables.
Lagrangian:
(stationarity)
(primal feasibility)
(dual feasibility)
(complementarity condition/
complementary slackness)
Either or
Linear programming, standard form and KKT
Lagrangian: L(x, lambda, s) = c^T x - lambda^T (Ax - b) - s^T x
lambda: Lagrange multipliers of the equalities (can be positive or negative)
s: Lagrange multipliers of the inequalities x >= 0 (have to be >= 0)
KKT-conditions (necessary and sufficient for LP): grad_x L(x, lambda, s) = 0, plus feasibility and complementarity (see below).
KKT for LP:
L(x, lambda, s) = c^T x - lambda^T (Ax - b) - s^T x
grad_x L = c - A^T lambda - s = 0
Ax = b
x >= 0
s >= 0
s_i x_i = 0, i = 1, ..., n
The KKT conditions for LPs are necessary and sufficient.
Weak duality: if x is feasible and (lambda, s) satisfies A^T lambda + s = c with s >= 0, then
c^T x = (A^T lambda + s)^T x = lambda^T A x + s^T x >= lambda^T b = b^T lambda
(using Ax = b, x >= 0, s >= 0). So every feasible primal objective value is an upper bound on b^T lambda.
Dual problem:
max_{lambda in R^m} b^T lambda   s.t. A^T lambda <= c
(defined on R^m; add slacks s >= 0 to write A^T lambda + s = c).
Rewriting for KKT, the Lagrangian of the dual is
Lbar(lambda, x) = -b^T lambda - x^T (c - A^T lambda),
a modified Lagrangian where optimization variable and Lagrange multiplier are interchanged compared with the primal.
KKT for the dual:
grad_lambda Lbar = -b + Ax = 0
c - A^T lambda >= 0
x >= 0
x_i (c - A^T lambda)_i = 0, i = 1, ..., n
i.e. identical KKT conditions to the primal.
Weak duality: for any primal-feasible x and dual-feasible lambda, c^T x >= b^T lambda; the difference c^T x - b^T lambda is the duality gap.
Strong duality (Theorem 13.1):
i) If the primal (or dual) has a finite solution, so does the other, and c^T x* = b^T lambda*.
ii) If the primal (or dual) is unbounded, the other is infeasible.
Proof of ii) by contradiction: suppose the primal is unbounded, i.e. there exist feasible x with c^T x -> -inf. If the dual had a feasible lambda with A^T lambda <= c, then
b^T lambda = (Ax)^T lambda = x^T A^T lambda <= x^T c -> -inf,
a contradiction, since b^T lambda is a fixed number.
(Sensitivity: the dual variables are the sensitivities of the optimal value with respect to b.)
The feasible set is a polytope (a convex polyhedron).
The objective function contours are planar, since the objective is a linear function.
Three possible cases for solutions:
No solutions: Feasible set is empty, or problem is unbounded
One solution: A vertex
Infinite number of solutions: An edge (or face) of the feasible set
Lecture 5: Solving LPs the simplex method
Brief recap previous lecture
The geometry of the feasible set
Basic feasible points, The fundamental theorem of
linear programming
Some implementation issues
Reference: N&W Ch.13.2-13.3, also 13.4-13.5
Linear programming, standard form and KKT: recap
LP:
LP, standard form:
Lagrangian:
KKT-conditions (LPs: necessary and sufficient for optimality):
Duality
Primal problem Dual problem
Identical KKT conditions!
Equal optimal value:
Weak duality:
Duality gap:
Strong duality (Thm 13.1):
i) If primal or dual has finite solution, both are equal
ii) If primal or dual is unbounded, the other is infeasible
Basic optimal point (BOP)
Basic feasible point (BFP)
(if they exist)
In general, the BFP has
at most m non-zero
components
Inspired by drawings by Miguel A. Carreira-Perpinan
22 Jan 19 Lecture 5
easibleset
LPiGeometryoff
fullhour rank
BEP
n ix
n is
if
feasible
mean
a
B En contains m indices
at a
AEI TETI
Rankofit m if Ben ni O
B Ai Egging is non
singular
column iof An
Mxn
basis matrix
n
Ahasfull rank
as
for
rank B am
If
n 5 me 5
Ef
i 3
It
A n b
converges to a solution if
For standard
ii If one
Iii P as feasible and bounded there is a solution
If
Theoremisis
All vertices
of the feasible polytope
meRn Ane b n Zo
I
simple BOP
Degeneracy if
with me o
for EC Cng
LP KKT conditions (necessary&sufficient)
Simplex method iterates BFPs until one that fulfills KKT is found.
Each step is a move from a vertex to a neighboring vertex (one
change in the basis), that decreases the objective
Csetofenduc
B Basis matrix
N E in 1B
NB Basicfeasiblepoint
B ni Tiegs nm no nm O from
icy
definitionof Bep
points liken
feasibleand nonfeasible
CB Gee
Procedure 13.1
KKT 3 1 20 my O
ears o
20 sofromKent2 Bnpgb incaseof
n.fiEoi a.z Eb
nrs
risende
enEoiKkT
ns
l StATd
CSBtBTd
Cg c
stimeconsuming
she_Gu out
A LBTTCB KkT4
If n
If not
g one inden t Sq co
Choose g s
increase
ng along An b until another
and Btg CB
the size
always matow
Boyz b triangle
B LU NB b
Unpeg
Ly b
µ t
easier to solve
and bysubstitution
section
Readbgooke simplex
on
Check KKT-conditions for BFP
Given BFP , and corresponding basis . Define
Partition , and :
KKT conditions
KKT-2: (since x is BFP)
KKT-3: (since x is BFP)
KKT-5: if we choose
KKT-1:
KKT-4: Is ?
If , then the BFP fulfills KKT and is a solution
If not, change basis, and try again
E.g. pick smallest element of sN (index q), increase xq along Ax=b until xp becomes zero.
Move q from to , and p from to . This guarantees decrease of objective, and no
cycling (if non-degenerate).
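A minimal sketch of this KKT check for one BFP; the data and the index sets Bset/Nset are illustrative, not a prescribed implementation:

% Checking KKT for a BFP (Procedure 13.1), and picking an entering index.
A = [1 1 1 0; 2 0.5 0 1];  b = [5; 8];  c = [-4; -1; 0; 0];   % illustrative LP
Bset = [3 4];  Nset = [1 2];                                   % slack basis

B  = A(:, Bset);  N = A(:, Nset);
xB = B \ b;                     % KKT-3: basic variables (must be >= 0)
lambda = B' \ c(Bset);          % from B'*lambda = c_B  (choosing s_B = 0)
sN = c(Nset) - N'*lambda;       % pricing of the nonbasic variables (KKT-4)

if all(sN >= 0)
    disp('BFP fulfils KKT: optimal')
else
    [~, qi] = min(sN);          % Dantzig's rule: most negative element
    q = Nset(qi);
    d = B \ A(:, q);            % how x_B changes when x_q is increased
    fprintf('entering index q = %d\n', q)
end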
Example (simplex by hand):
min_{x in R^2} -4 x1 - x2   s.t. x1 + x2 <= 5, a second linear inequality with right-hand side 8, x1 >= 0, x2 >= 0
Add slacks x3, x4: x1 + x2 + x3 = 5, (second constraint) + x4 = 8, x >= 0.
Start from the slack basis B(x) = {3, 4}: B = I, x_B = B^{-1} b = (5, 8), x_N = 0 (the origin), c_B = (0, 0).
lambda = B^{-T} c_B = 0 and s_N = c_N - N^T lambda = c_N, which has negative components, so KKT-4 fails and x is not optimal.
Pick q with the most negative s_q, let x_q enter the basis, increase x_q along Ax = b until a basic variable hits zero, exchange the two indices, and repeat until s_N >= 0, i.e. until the BFP is optimal.
Simplex in 3D
wikipedia.org
Two linear systems must be solved in each iteration:
(to find the direction to check when increasing x_q)
We also had lambda = B^{-T} c_B. Since lambda is not needed in the iterations, we don't
need to solve this (apart from in the final iteration)
This is the major work per iteration of simplex, efficiency is important!
B is a general, non-singular matrix
LU factorization is the appropriate method to use (same for both systems)
Don't use matrix inversion!
In each step of Simplex method, one column of B is replaced:
Can update ("maintain") the LU factorization of B in a smart and efficient fashion
No need to do a new LU factorization in each step, save time!
Other practical implementation issues (Ch. 13.5)
Selection of entering index q
Dantzig's rule: Select the index of the most negative element in sN
Other rules have proved to be more efficient in practice
Handling of degenerate bases/degenerate steps (when a positive xq is
not possible)
If no degeneracy, each step leads to decrease in objective and convergence in finite number of
iterations is guaranteed (Theorem 13.4)
Degenerate steps lead to no decrease in objective. Not necessarily a problem, but can lead to cycling
(we end up in the same basis as before)
Practical algorithms uses perturbation strategies to avoid this
Starting the simplex method
We assumed an initial BFP available but finding this is as difficult as solving the LP
Normally, simplex algorithms have two phases:
Phase I: Find BFP
Phase II: Solve LP
Phase I: Design other LP with trivial initial BFP, and whose solution is BFP for original problem
Presolving (Ch. 13.7)
Reducing the size of the problem before solving, by various tricks to eliminate variables and
constraints. Size reduction can be huge. Can also detect infeasibility.
Phase I (finding an initial BFP):
min_{x,z} e^T z,   e = (1, 1, ..., 1)^T,  z in R^m, x in R^n
s.t. Ax + E z = b, x >= 0, z >= 0,
where E = diag(E_jj), E_jj = +1 if b_j >= 0 and -1 if b_j < 0.
x = 0, z_j = |b_j|, j = 1, ..., m is a basic feasible point for this problem (the z-columns form the basis).
The minimum e^T z = 0 happens when z = 0; then Ax = b with x >= 0, i.e. x is feasible for the original problem, corresponding to a basic feasible point. This optimal point for the Phase-I problem can now be used to start the simplex method on the original problem.
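A sketch of how the Phase-I problem can be set up and solved with linprog; the data matrix is illustrative:

% Phase-I LP: find an initial BFP for  A*x = b, x >= 0  (sketch).
A = [1 1 1 0; 2 0.5 0 1];  b = [5; 8];   % illustrative standard-form data
[m, n] = size(A);

E   = diag(sign(b) + (b == 0));          % E_jj = +1 if b_j >= 0, else -1
f   = [zeros(n,1); ones(m,1)];           % minimize e'*z
Aeq = [A, E];                            % A*x + E*z = b
lb  = zeros(n + m, 1);                   % x >= 0, z >= 0
% Known BFP for this Phase-I problem: x = 0, z = |b| (the z-columns form the basis).

w    = linprog(f, [], [], Aeq, b, lb, []);
xBFP = w(1:n);                           % if e'*z = 0 at the optimum, xBFP is
                                         % feasible for the original problem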
Complexity:
Worst case: All vertices must be visited (exponential complexity in n)
Compare interior point method: Guaranteed polynomial complexity, but in
practice hard to beat simplex on many problems
Active set methods (such as simplex method):
Makes small changes to the set in each iteration (a single index in simplex)
Next week: Active set method for QP
24th Jan 19
Lecture 6: Quadratic programming
Quadratic programming; convex and non-convex QPs
Equality constrained QPs
Reference: N&W Ch.15.3-15.5, 16.1-2,4-5
Types of constrained optimization problems
Linear programming
Convex problem
Quadratic programming
Convex problem if
Nonlinear programming
In general non-convex!
The Simplex algorithm
The feasible set of LPs are (convex) polytopes
LP solution is a vertex (corner) of the feasible set
In each iteration, we solve a linear system to find which
component in the basis (set of non-active constraints) we should change
Almost guaranteed convergence (if LP not unbounded or infeasible)
Complexity:
Typically, at most 2m to 3m iterations
Worst case: All vertices must be visited (exponential complexity in n)
Active set methods (such as simplex method):
Maintains explicitly an estimate of the set of inequality constraints that are
active at the solution (the set for the simplex method)
Makes small changes to the set in each iteration (a single index in simplex)
Today, and next lecture: Active set method for QP
Quadratic programming:
min_x 1/2 x^T G x + c^T x   (quadratic objective function)
s.t. c_i(x) = a_i^T x - b_i = 0,  i in E   (linear constraints)
     c_i(x) = a_i^T x - b_i >= 0, i in I
Convex QP if G >= 0 (positive semidefinite); if G = 0 we get an LP.
linear toquadraticform
Three (main) reasons:
It is the simplest nonlinear programming problem (so special
that it is given a separate name; quadratic programming)
"Easy": efficient algorithms exist, especially for convex QPs
The QP is the basic building block of SQP (sequential quadratic
programming), a common method for solving general nonlinear
programs
Topic in end of course (N&W Ch. 18)
QPs are very much used in control, especially as solvers in what is called model predictive control (MPC)
Convex QP
Feasible set is (convex) polytope
Objective is quadratic function, which can be non-convex
(concave or indefinite), convex or strictly convex
QP example: Farming
example with changing prices
A farmer wants to grow apples (A)
and bananas (B)
He has a field of size 100 000 m2
Growing 1 tonne of A requires an
area of 4 000 m2, growing 1 tonne
A requires 60 kg fertilizer per tonne grown, B requires 80 kg
fertilizer per tonne grown
The profit for A is (7000 - 200 x1) per tonne (including fertilizer
cost), the profit for B is (6000 - 140 x2) per tonne (including
fertilizer cost)
The farmer can legally use up to 2000 kg of fertilizer
He wants to maximize his profits
Farming objective function (to be minimized):
-(7000 - 200 x1) x1 - (6000 - 140 x2) x2 = 200 x1^2 + 140 x2^2 - 7000 x1 - 6000 x2
The solution of a QP can be inside the feasible region.
Equality-Constrained QP
• Problem:
1 "
minJ $ 5$ + 7 " $
-∈ℝ 2
8. :. : *$ = +
• Matrix form:
5 −*" $ ∗ = −7
* 0 '∗ +
12
EQP:
min_{x in R^n} 1/2 x^T G x + c^T x   s.t. Ax = b
KKT:
grad_x L(x, lambda) = Gx + c - A^T lambda = 0
Ax = b
We always assume G to be symmetric.
In matrix form (the KKT system):
[G  -A^T; A  0] [x; lambda] = [-c; b]
Lagrangian:
(stationarity)
(primal feasibility)
(dual feasibility)
(complementarity condition/
complementary slackness)
Either or
(strict complimentarity: Only one of them is zero)
Example 16.2
Note symmetry of G.
Always possible!
(We can always make G symmetric.)
>> G = [6 2 1; 2 5 2; 1 2 4]; c = [-8; -3; -3]; A = [1 0 1;0 1 1]; b = [3;0];
>> K = [G, -A'; A, zeros(2,2)];
>> K\[-c;b] % X = A\B is the solution to the equation A*X = B
ans = usesLUdecomposition in this case
2.0000
-1.0000
1.0000
3.0000
-2.0000
min_x 1/2 x^T G x + c^T x
s.t. Ax = b
[G  -A^T; A  0] [x*; lambda*] = [-c; b]
Lemma 16.3 allows us to go from a current guess x to the solution: write x* = x + p. Then
[G  A^T; A  0] [-p; lambda*] = [c + Gx; Ax - b]
Solution methods for the KKT system:
• Schur-complement method: requires G nonsingular
• Null-Space Method
Example 16.2
Note symmetry of G. Always possible!
Solution: x* = (2.0000, -1.0000, 1.0000), lambda* = (3.0000, -2.0000)
Nullspace basis for A from a QR decomposition, [Q, R, P] = qr(A'):
Q = [-0.7071  0.4082 -0.5774;  0  -0.8165 -0.5774;  -0.7071 -0.4082  0.5774]
R = [-1.4142 -0.7071; 0 -1.2247; 0 0]
P = [1 0; 0 1]
Lemma 16.3: if A has full row rank and the reduced Hessian Z^T G Z > 0 (Z a basis for the nullspace of A, so AZ = 0), then the KKT matrix is nonsingular.
Proof sketch: suppose [G  A^T; A  0][w; u] = 0. Then Aw = 0, so w = Zv for some v; Gw + A^T u = 0 gives Z^T G Z v = 0 (since Z^T A^T = 0), so v = 0 and w = 0; then A^T u = 0, and full row rank of A gives u = 0.
Full space:
Use LU
Or better: Since KKT-matrix is symmetric, use LDL-method
Reduced space, efficient if n-m ≪ n:
Solve two much smaller systems using LU and Cholesky
Main complexity is calculating basis for nullspace. Usual method is using QR.
Alternative to direct methods: Iterative methods (16.3)
For very large systems, can be parallelized
• Trying to solve:
min_p 1/2 (x + p)^T G (x + p) + c^T (x + p)   s.t. A(x + p) = b
i.e. [G  A^T; A  0] [-p; lambda*] = [c + Gx; Ax - b]
Null-space method: write p = Y p_Y + Z p_Z, where the columns of Z span the nullspace of A (AZ = 0) and [Y Z] is nonsingular.
From A(x + p) = b:  A Y p_Y = b - Ax, solved for p_Y.
Substitute into Gp + A^T lambda = -(c + Gx) and multiply by Z^T (which removes the A^T lambda term):
Z^T G Z p_Z = -Z^T G Y p_Y - Z^T (c + Gx),
solved for p_Z (Cholesky, since Z^T G Z > 0). Finally lambda from (AY)^T lambda = Y^T (c + Gx + Gp).
Z (and Y) usually come from a QR decomposition of A^T. Then x* = x + p.
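A sketch of the null-space method applied to the data of Example 16.2; the variable names Y, Z, pY, pZ follow the derivation above:

% Null-space method for the EQP  min 0.5*x'*G*x + c'*x  s.t. A*x = b  (sketch).
G = [6 2 1; 2 5 2; 1 2 4];  c = [-8; -3; -3];   % data from Example 16.2
A = [1 0 1; 0 1 1];         b = [3; 0];
x = zeros(3,1);                                 % current guess (need not be feasible)

[Q, ~] = qr(A');            % orthogonal basis from QR of A'
m = size(A, 1);
Y = Q(:, 1:m);              % range-space basis
Z = Q(:, m+1:end);          % nullspace basis, A*Z = 0

g  = c + G*x;               % gradient of the quadratic at x
pY = (A*Y) \ (b - A*x);               % range-space step: A*Y*pY = b - A*x
pZ = (Z'*G*Z) \ (-Z'*(G*Y*pY + g));   % reduced (Cholesky-friendly) system
p  = Y*pY + Z*pZ;
lambda = (A*Y)' \ (Y'*(g + G*p));     % multipliers
xstar  = x + p;             % reproduces x* = (2, -1, 1) from Example 16.2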
Concluding remarks
29 Jan 19
Lecture 7: Quadratic programming
Recap last time EQPs
Active set method for solving QPs
Example 16.4
Quadratic programming
(solving quadratic programs, QPs)
I
wesemidefiniteforMPC
Feasible set convex (as for LPs)
Important special case:
Equality-constrained QP (EQP)
Basic assumption:
A full row rank
KKT-conditions (KKT system, KKT matrix):
Solvable when (columns of Z basis for nullspace of A):
tree
ifobjfund
How to solve KKT system (KKT matrix indefinite, but symmetric):
Full-space: Symmetric indefinite (LDL) factorization:
Reduced space: Use Ax=b to eliminate m variables. Requires computation
of Z, basis for nullspace of A, which can be costly. Reduced space method
can be faster than full-space if n-m ≪ n.
Lagrangian:
(stationarity)
(primal feasibility)
(dual feasibility)
(complementarity condition/
complementary slackness)
Either or
(strict complimentarity: Only one of them is zero)
General QP:
min_{x in R^n} 1/2 x^T G x + c^T x
s.t. a_i^T x - b_i = 0,  i in E
     a_i^T x - b_i >= 0, i in I
KKT:
Gx* + c - sum_{i in E u I} lambda_i* a_i = 0
a_i^T x* - b_i = 0,  i in E
a_i^T x* - b_i >= 0, i in I
lambda_i* >= 0, i in I
lambda_i* (a_i^T x* - b_i) = 0, i in E u I
Active set: A(x*) = E u {i in I : a_i^T x* = b_i}.
Reformulation of KKT via the active set (for i not in A(x*), lambda_i* = 0 by complementarity, since a_i^T x* > b_i for inactive constraints):
Gx* + c - sum_{i in A(x*)} lambda_i* a_i = 0
a_i^T x* - b_i = 0,  i in A(x*)
a_i^T x* - b_i > 0,  i in I \ A(x*)
lambda_i* >= 0,  i in I n A(x*)
If G >= 0 the conditions are also sufficient: for any feasible x, expand
q(x) = q(x*) + (Gx* + c)^T (x - x*) + 1/2 (x - x*)^T G (x - x*)
     = q(x*) + sum_{i in A(x*)} lambda_i* a_i^T (x - x*) + 1/2 (x - x*)^T G (x - x*) >= q(x*),
since a_i^T (x - x*) = 0 for equalities, a_i^T (x - x*) >= 0 and lambda_i* >= 0 for active inequalities, and G >= 0. So x* is a global solution.
If Z^T G Z > 0 (Z a basis for the nullspace of the active constraints at the optimum), the solution is unique. Nonconvex/degenerate cases must be safeguarded.
Nonconvex QP
Degeneracy
General QP problem
Lagrangian
KKT conditions
General:
Defined via active set:
Min G n f CT
Ent n
At ain bi i C A Cna
One step of the active-set algorithm for QP:
Given a current feasible estimate x_k and a current working set W_k (estimate of A(x*)):
Consider the EQP
min_p 1/2 p^T G p + (G x_k + c)^T p   s.t. a_i^T p = 0, i in W_k,
with solution p_k.
If p_k = 0: compute the multipliers lambda_i, i in W_k, from sum_i lambda_i a_i = G x_k + c.
  If lambda_i >= 0 for all i in W_k n I: all KKT conditions are fulfilled, and x_k is the solution.
  Otherwise: pick a negative lambda_i, remove this index from W_k, and start over.
If p_k != 0: check whether x_k + p_k is feasible with respect to all constraints.
  If yes: set x_{k+1} = x_k + p_k and start over.
  If not: compute the step length
  alpha_k = min over blocking constraints i not in W_k with a_i^T p_k < 0 of (b_i - a_i^T x_k)/(a_i^T p_k),
  set x_{k+1} = x_k + alpha_k p_k, add a blocking constraint to W_k, and start over.
Example (16.4):
min_{x in R^2} (x1 - 1)^2 + (x2 - 2.5)^2
s.t. x1 - 2 x2 + 2 >= 0
     -x1 - 2 x2 + 6 >= 0
     -x1 + 2 x2 + 2 >= 0
     x1 >= 0
     x2 >= 0
Start at the feasible point x0 = (2, 0) with working set W0 = {3, 5}.
The EQP gives p0 = 0, and the multipliers of the working-set constraints are negative, so KKT does not hold: remove the index with the most negative multiplier and start over.
The iterations continue (dropping and adding constraints, with some steps cut short by blocking constraints) until p_k = 0 with all working-set multipliers nonnegative, which happens at x* = (1.4, 1.7).
Example 16.4
Finding an initial feasible point: same way as for LP (Phase I).
Alternative method: "Big M":
Relax all constraints; penalize constraint violations in the objective with a large weight M.
Lecture 8: Open loop dynamic optimization
Static vs dynamic optimization (and quasi-dynamic )
Batch approach vs recursive approach for solving dynamic
optimization problems
Video lecture 12 onwards
F
Reference: B&H Ch. 3,4
IM
When using optimization for solving practical problems (that is, we
The model of the process is time independent, resulting in static
optimization
Common in finance, economic optimization,
Recall farming example
dynamic optimization
The typical case in control
The process is a mechanical system (boat, drone, ), chemical process
(e.g. chemical reactor),
Oil production
(example of quasi-dynamic
I
optimization, ex. 2 in B&H)
www.corec.no
Oil production (quasi-dynamic optimization):
Maximize total oil production sum_i q_oil,i by choosing the gas-lift rate q_gl,i of each well i,
s.t. capacity constraints (e.g. sum_i q_gl,i <= q_gl,max and limits on gas and water handling), q_gl,i >= 0.
All wells enter the objective and constraints in a similar way; the well parameters change slowly, so the optimization is repeated as the parameter values are updated.
Possible objectives in dynamic optimization
Penalize deviations from a constant reference/setpoint
Very often used in optimization for control.
Economic objectives. Optimize economic profit: maximize
production (e.g. oil), and/or minimize costs (e.g energy or raw
material)
Limit tear and wear of equipment (e.g. valves)
Reach a specific endpoint, possibly avoiding obstacles
Reach a specific endpoint as fast as possible
Dynamic models (discrete time), possibly nonlinear:
x_{t+1} = g(x_t, u_t),   x = state, u = input, t = time index
Define perturbation variables Delta x_t = x_t - x_t^*, Delta u_t = u_t - u_t^* around a nominal trajectory (often a steady state), and linearize:
Delta x_{t+1} ~= A_t Delta x_t + B_t Delta u_t,   A_t = dg/dx, B_t = dg/du at the nominal point.
Linearization is helpful: with the true nonlinear model we get a nonlinear (and generally nonconvex) optimization problem.
Open-loop dynamic optimization problem:
min_z f(z) = sum_{t=0}^{N-1} f_t(x_{t+1}, u_t)
s.t. x_{t+1} = g(x_t, u_t)  (or, linearized, x_{t+1} = A_t x_t + B_t u_t),  t = 0, ..., N-1
     x_0 given
The objective function depends on future states and current inputs.
z = (u_0, x_1, u_1, x_2, ..., u_{N-1}, x_N) are the decision variables, N is the prediction horizon, and f_t is the stage cost.
With x_t in R^{n_x} and u_t in R^{n_u} the number of decision variables is N(n_x + n_u): a long prediction horizon and/or many states quickly gives a large (but structured) problem.
Standard stage cost:
f_t(x_{t+1}, u_t) = 1/2 x_{t+1}^T Q_{t+1} x_{t+1} + d_{x,t+1}^T x_{t+1} + 1/2 u_t^T R_t u_t + d_{u,t}^T u_t
with Q_{t+1} >= 0 (positive semidefinite) and R_t > 0 (positive definite).
Typical constraints: x_low <= x_t <= x_high, u_low <= u_t <= u_high, and possibly bounds on Delta u_t = u_t - u_{t-1}.
Batch approach: eliminate the states using the model:
x_1 = A x_0 + B u_0
x_2 = A x_1 + B u_1 = A^2 x_0 + A B u_0 + B u_1
...
Stacked: X = S_x x_0 + S_u U, with X = (x_1, ..., x_N), U = (u_0, ..., u_{N-1}), and S_x, S_u built from powers of A and B.
Substituting into the objective gives a problem in U only:
min_U 1/2 (S_x x_0 + S_u U)^T Qbar (S_x x_0 + S_u U) + 1/2 U^T Rbar U
with block-diagonal weights Qbar, Rbar. With bounds on U (and X) this is a convex QP in U.
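A sketch of the batch approach in MATLAB; the model, weights and the names Sx, Su, Qbar, Rbar are illustrative assumptions:

% Batch approach: eliminate states, X = Sx*x0 + Su*U  (sketch).
A = [1.0 0.1; 0 0.9];  B = [0; 0.1];     % illustrative model x_{t+1} = A*x_t + B*u_t
Q = eye(2);  R = 1;  N = 20;             % weights and horizon
[nx, nu] = size(B);
x0 = [1; 0];

Sx = zeros(N*nx, nx);  Su = zeros(N*nx, N*nu);
for t = 1:N
    Sx((t-1)*nx+1:t*nx, :) = A^t;
    for j = 1:t
        Su((t-1)*nx+1:t*nx, (j-1)*nu+1:j*nu) = A^(t-j)*B;
    end
end
Qbar = kron(eye(N), Q);  Rbar = kron(eye(N), R);

% Unconstrained solution (gradient = 0): U* = -(Su'*Qbar*Su + Rbar)^{-1} Su'*Qbar*Sx*x0
H = Su'*Qbar*Su + Rbar;
f = Su'*Qbar*Sx*x0;
U = -H \ f;
% With bounds on U (and X), pass H and f to quadprog instead.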
Ademoye et al., Path planning via CPLEX optimization, 40th SE Symp. on System Theory, 2008.
Swing-up
Obstacle avoidance
Developed in MSc-thesis by
P. Giselsson, LTH, Sweden
(Figure: obstacle-avoidance trajectory in the x1-x2 plane.)
Two reasons:
vs
Constraints:
Straightforward to add constraints to batch approaches (both becomes
convex QPs)
Much more difficult to add constraints to the recursive approach
How to add feedback (and thereby robustness) to batch
approaches?
Model predictive control!
From ETH
The cost function penalizes future performance; Q, R (and S) are positive semidefinite; the decision variables are the future inputs (and states).
Full-space MPC formulation:
min_z 1/2 z^T G z,   z = (x_1, u_0, x_2, u_1, ...),  G block diagonal with Q and R blocks
s.t. A_eq z = b_eq   (the model equations x_{t+1} = A x_t + B u_t; x_0 enters b_eq)
     A_in z >= b_in  (bounds on states and inputs)
Linear quadratic control:
Dynamic optimization without constraints
The full-space formulation (keeping both states and inputs as variables) is closely related to multiple shooting.
For the unconstrained (batch) formulation the solution is found by setting the gradient equal to zero:
S_u^T Qbar (S_x x_0 + S_u U) + Rbar U = 0
=> U* = -(S_u^T Qbar S_u + Rbar)^{-1} S_u^T Qbar S_x x_0,
i.e. a linear feedback from the initial state x_0.
Soft constraints: replace x_low <= x_t <= x_high by x_low - eps <= x_t <= x_high + eps with a slack variable eps >= 0, and add a penalty term with a large weight on eps to the objective.
If the weight is large enough this is an exact penalty: whenever the original problem (without slack) has a feasible solution, the optimal slack is eps = 0.
With soft constraints the MPC optimization problem is always feasible. But is MPC always stable?
Example (prediction horizon N = 2): open-loop unstable system x_{t+1} = 1.2 x_t + u_t.
min sum_{t=0}^{1} 1/2 (q x_{t+1}^2 + r u_t^2)
s.t. x_1 = 1.2 x_0 + u_0
     x_2 = 1.2 x_1 + u_1 = 1.2 (1.2 x_0 + u_0) + u_1
Substitute the states into the objective and set the gradient with respect to (u_0, u_1) to zero; the result is a linear feedback u_0 = -k x_0.
Depending on q and r, the closed loop x_{t+1} = (1.2 - k) x_t is not necessarily stable: MPC is only stable for suitable weights and horizon. Tuning q and r gives the desired response only nominally.
Nominal MPC stability requires (A, B) stabilizable, (A, D) detectable where Q = D^T D, and R > 0.
MPC stability, infinite-horizon approach:
min sum_{t=0}^{inf} 1/2 (x_{t+1}^T Q x_{t+1} + u_t^T R u_t)
How to achieve
nominal stability?
Choose prediction horizon equal to infinity (N = ∞)
Usually not possible
For given N, design Q and R such that MPC is stable (cf. example)
Difficult, and not always possible!
Change the optimization problem such that
The new problem gives a finite upper bound of infinite horizon problem cost
The constraints are guaranteed to hold after the prediction horizon
Terminal cost
Terminal constraint
Fairly straightforward to do in theory, but can be clumsy in practice
Typically, in practice: Choose N "large"
Stability guaranteed for N large enough, but difficult/conservative to compute this limit
Shorter N often OK
So what is "large enough" in practice? Rule of thumb: longer than the dominating dynamics
Squeeze and shift
How MPC (or better control in general) improves profitability
Active constraint
Time
Squeeze
(e.g. MPC)
Shift
Open-loop vs closed-loop
Penalize Delta u so that it is as small as possible; otherwise the controller may become too aggressive and unstable.
MPC (receding horizon) algorithm:
for each time step t:
   measure (or estimate) the current state x_t
   solve the open-loop optimization problem over the horizon (with constraints u_low <= u <= u_high, etc.)
   apply only the first control move u_t = u_0* from the solution above
end
The resulting control law u_t = kappa(x_t) is a piecewise affine function of the state (for linear-quadratic MPC).
Output feedback:
If not all states are measured, we make an observer (state estimator).
The real system gives measurements y_t = C x_t, possibly corrupted by noise.
Linear state estimator: xhat_{t+1} = A xhat_t + B u_t + K_F (y_t - C xhat_t), with estimator gain K_F; the MPC then uses xhat_t in place of x_t.
Requires (A, C) observable (or at least detectable).
Reference tracking:
Controlled variables y_t = H x_t (in contrast to regular control theory, where we control the state, we might control some other variable).
Control objective: y_t -> y_ref. Assume y_ref is constant (or piecewise constant).
At steady state: x_s = A x_s + B u_s and y_s = H x_s = y_ref, i.e.
[I - A   -B;  H   0] [x_s; u_s] = [0; y_ref]
(assuming the matrix is invertible).
Example with 2 inputs and 1 controlled variable: y_s = [3.33  8.33] u_s. Observe that several u_s give the same y_s, and input constraints limit the achievable y_s.
g Cruise control
7µg
t
Y Eosin im
from caratfront
is maintained
We are reformulating the optimization problem to include target calculation:
First compute the targets (x_s, u_s):
min_{x_s, u_s} a quadratic cost in (x_s, u_s)
s.t. x_s = A x_s + B u_s
     H x_s - y_ref = 0 (or penalized)
     x_low <= x_s <= x_high,  u_low <= u_s <= u_high
Then solve the MPC problem in deviation variables, replacing the objective by
min sum_t 1/2 (x_{t+1} - x_s)^T Q (x_{t+1} - x_s) + 1/2 (u_t - u_s)^T R (u_t - u_s)
s.t. x_{t+1} = A x_t + B u_t,  x_low <= x_t <= x_high,  u_low <= u_t <= u_high.
This is done so that we have better control authority and can achieve the optimal value of u_s.
From description:
MPC with nonlinear model and a linear (input) disturbance model with one disturbance
state: xt = f(xt,ut)+Bd dt . All states are measured (yt = xt).
A linear observer is designed as a steady-state Kalman filter for the linearized augmented
model at the final equilibrium.
The forward-looking nature of the MPC controller allows to react to disturbances by
considering obstacles in the environment and drastic replanning when necessary.
From "Offset-free MPC explained: novelties, subtleties, and applications" - G. Pannocchia,
M. Gabiccini, A. Artoni, NMPC 2015 - Seville, Spain September 17 - 20, 2015.
Offset-free MPC (integral action):
An unmodelled disturbance gives an offset in the controlled variables (a biased estimate); we want an unbiased estimate and offset-free control.
Idea: augment the MPC model with a disturbance model:
x_{t+1} = A x_t + B u_t + B_d d_t
d_{t+1} = d_t
y_t = C x_t + C_d d_t
Offset-free control:
1. Use a state estimator to estimate x_t and d_t; this requires the augmented system to be detectable, and requires dim(d_t) >= dim(y_t).
2. The target calculation must be modified to take the disturbance estimate into account.
Industrial practice: B_d = 0, C_d = I (a pure output disturbance model).
Lecture 11: Linear Quadratic (LQ) Control
Finite-horizon LQ problem:
min sum_{t=0}^{N-1} 1/2 (x_{t+1}^T Q_{t+1} x_{t+1} + u_t^T R_t u_t),   Q >= 0, R > 0
s.t. x_{t+1} = A_t x_t + B_t u_t,  x_0 given
Lagrangian: L = sum_t 1/2 (x_{t+1}^T Q_{t+1} x_{t+1} + u_t^T R_t u_t) - sum_t lambda_{t+1}^T (x_{t+1} - A_t x_t - B_t u_t)
Stationarity:
dL/du_t = R_t u_t + B_t^T lambda_{t+1} = 0,   t = 0, ..., N-1
dL/dx_t = Q_t x_t - lambda_t + A_t^T lambda_{t+1} = 0,   t = 1, ..., N-1 (not including the last)
dL/dx_N = Q_N x_N - lambda_N = 0
Guess lambda_t = P_t x_t with P_t >= 0 (i.e. assume the multiplier is a linear function of the state). Combining with x_{t+1} = A_t x_t + B_t u_t and eliminating gives
u_t = -K_t x_t,   where
K_t = R_t^{-1} B_t^T P_{t+1} (I + B_t R_t^{-1} B_t^T P_{t+1})^{-1} A_t
P_t = Q_t + A_t^T P_{t+1} (I + B_t R_t^{-1} B_t^T P_{t+1})^{-1} A_t,   P_N = Q_N
(the Riccati recursion, iterated backwards from t = N).
4
Second-order conditions
Critical directions:
The critical directions are the allowed directions where it is not clear
from KKT-conditions whether the objective will decrease or increase
That is, since the solution of the Riccati equation implies the
KKT conditions are fulfilled, Thm 16.4 means that Riccati
equation gives the global solution
Side-remark: It is, in fact, the unique global solution. If G is positive definite (implied by
Q positive definite), this follows from the proof of Thm 16.4. If Q positive semidefinite,
further arguments are necessary (for instance using Thm 12.6 as in the note).
Scalar example (a = 1.2, b = 1):
P_N = q
P_t = q + 1.44 P_{t+1} / (1 + P_{t+1}/r)
K_t = 1.2 P_{t+1} / (r + P_{t+1})
Note that the gain matrix Kt is independent of the states. It can therefore
be computed in advance (knowing At, Bt, Qt, Rt)
Note that the boundar condition is given at the end of the hori on, and
the Pt -matrices must be found iterating backwards in time
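A sketch of the backward Riccati recursion using the formulas above; the scalar system data is illustrative:

% Backward Riccati recursion for the finite-horizon LQ problem (sketch).
% Uses the form from the note: P_t = Q + A'*P_{t+1}*(I + B*inv(R)*B'*P_{t+1})^{-1}*A.
A = 1.2;  B = 1;  Q = 1;  R = 1;  N = 10;   % scalar example with a = 1.2
nx = size(A, 1);

P = cell(N+1, 1);  K = cell(N, 1);
P{N+1} = Q;                               % boundary condition P_N = Q_N
for t = N:-1:1
    M    = (eye(nx) + B*(R\B')*P{t+1}) \ A;   % common factor
    K{t} = R \ (B'*P{t+1}*M);                 % K_t = R^{-1} B' P_{t+1} (I + B R^{-1} B' P_{t+1})^{-1} A
    P{t} = Q + A'*P{t+1}*M;                   % Riccati recursion, iterated backwards
end

% Simulate the closed loop u_t = -K_t x_t
x = 1;
for t = 1:N
    u = -K{t}*x;
    x = A*x + B*u;
end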
Example
LQ solution
(Figure: LQ solution for the scalar example: P_t, x_t, K_t and u_t over t = 0, ..., 11 for r = 0.1, r = 1 and r = 20.)
MPC vs LQ
The difference between finite horizon LQ and MPC open loop problem is
constraints
No constraints: the MPC control law when there are no (active) constraints equals the LQ state feedback.
MPC solution / infinite-horizon LQ:
min sum_{t=0}^{inf} 1/2 (x_{t+1}^T Q x_{t+1} + u_t^T R u_t)   s.t. x_{t+1} = A x_t + B u_t
The Riccati recursion converges to a steady-state P, and
u_t = -K x_t,   K = R^{-1} B^T P (I + B R^{-1} B^T P)^{-1} A,
P = Q + A^T P (I + B R^{-1} B^T P)^{-1} A   (the discrete-time algebraic Riccati equation).
Increasing LQ horizon
(Figure: P_t for horizons N = 20, 40, 80, 160, 320, 640; P_t converges to the steady-state value as the horizon increases.)
Controllability vs stabilizability
Observability vs detectability
Stabilizable: All unstable modes are controllable
(that is: all uncontrollable modes are stable)
(Example: for a diagonal A with one stable and one unstable mode, the system is stabilizable when the input affects the unstable mode, controllable when it affects all modes, and not stabilizable when the unstable mode is uncontrollable. Similarly, with Q = D^T D, detectability of (A, D) requires the unstable modes to be visible through D.)
Riccati equations
Discrete-time Riccati equation in the note (and lecture)
The trick used to get the different formulas is the Matrix Inversion
Lemma (a very useful lemma in control theory, optimization, ...)
CARE: continuous-time algebraic Riccati equation
Discrete-time Algebraic Riccati equation (DARE) in the note (and lecture)
Other form (e.g. Matlab)
Note that the gain matrix Kt and the Riccati equation is independent of the
states. It can therefore be computed in advance (knowing At, Bt, Qt, Rt).
Note that the boundary condition is given at the end of the horizon, and
the Pt -matrices must be found iterating backwards in time.
Example
LQ solution
(Figure: LQ solution for the scalar example: P_t, x_t, K_t and u_t over t = 0, ..., 11 for r = 0.1, r = 1 and r = 20.)
Increasing LQ horizon
(Figure: P_t for horizons N = 20, 40, 80, 160, 320, 640; P_t converges to the steady-state value as the horizon increases.)
Being a state feedback solution, it implies some robustness (more on this later)
Controllability vs stabilizability
Observability vs detectability
Stabilizable: All unstable modes are controllable
(that is: all uncontrollable modes are stable)
Riccati equations
Discrete-time Riccati equation in the note (and lecture)
The trick used to get the different formulas is the Matrix Inversion
Lemma (a very useful lemma in control theory).
LQR vs MPC
LQR is MPC without constraints; its solution is a linear state feedback
MPC solves an optimization problem (QP) online
LQR vs MPC, II
Region (polytopic set) where LQR solution is optimal
(where we can assume problem unconstrained)
LQ regulator (LQR)
LQ and robustness
SISO LQ regulators have 60 degrees phase margin and 6dB
gain margin
Can be extended to MIMO systems
LQ
Lead a f e ea ch i b c i he 80 (a d a e ),
not topic of this course
Mathematical formulation
a
iEmgi 1 5155am
Model Predictive Control (MPC)
use stepresponsemodels
Follow trajectories,
not statefeedback
keep process within constraints
© Skogestad
NMPC Discrete-time Open Loop
Optimal Control Problem
Solve
with
Active constraint
Time
Squeeze
(e.g. MPC)
Shift
(Figure: quality histograms illustrating "squeeze and shift": reducing the spread (sigma 2 < sigma 1) allows moving the operating point Q1 -> Q2 closer to the specification.)
© Richalet
Economic optimization and MPC
(current research)
Squeeze and shift: Indirect
economic optimization by (N)MPC
Short-term production
optimization: Maximize (e.g.)
oil production, given a long-
term recovery strategy
Adjust chokes of producing
wells to obtain largest
possible oil production
Simulation study
Production optimization
We want to maximize oil production
Assumption: Bottleneck is gas processing capacity
Implication: Upper limit on inlet separator pressure is active
constraint
Due to different GOR (gas to oil ratio) in wells, it can be many combinations of
choke openings that meet active constraint
C e i a i :U ea i g d ce a a ac i e
constraint
N nos na ofcontrolinputs
to
https://www.continentalmitsubishi.com/blog/mitsubishi-adaptive-cruise-control-explained/
ACC Modeling
Host car Lead car
Desired distance, hr
Relative distance, h
• Limit speed
MPC formulation
• Assume d = 0
• Weights:
• Constraints:
Limit on undershoot: use a slack variable to relax the state constraint, but put a heavy penalty on the slack so that the system bounces back as quickly as possible.
Why we need target calculation: a nonzero steady-state input u_s is needed to counteract the resistive forces; without target calculation (not implemented yet here), penalizing u_t towards zero leaves an offset.
What is a solution?
Global minimizer: f(x*) <= f(x) for all x.
Local minimizer: f(x*) <= f(x) for all x in a neighborhood N of x*.
Taylor expansions
From Calculus?
In this course:
Taylor's theorem (for x and a step p):
First order: If f is continuously differentiable,
f(x + p) = f(x) + grad f(x + t p)^T p   for some t in (0, 1)
(and f(x + p) ~= f(x) + grad f(x)^T p as an approximation).
Second order: If f is twice continuously differentiable,
f(x + p) = f(x) + grad f(x)^T p + 1/2 p^T grad^2 f(x + t p) p   for some t in (0, 1).
Theorem (first-order necessary condition): if x* is a local solution and f is continuously differentiable, then grad f(x*) = 0.
Proof by contradiction:
Assume x* is a local solution and grad f(x*) != 0.
Take the steepest-descent direction p = -grad f(x*); then p^T grad f(x*) = -||grad f(x*)||^2 < 0.
Since grad f is continuous, there exists T > 0 such that p^T grad f(x* + t p) < 0 for all t in [0, T].
From Taylor, for any tbar in (0, T]:
f(x* + tbar p) = f(x*) + tbar p^T grad f(x* + t p) < f(x*)   for some t in (0, tbar),
so x* is not a local minimum. Contradiction.
Theorem (second-order sufficient conditions): if f is twice continuously differentiable, grad f(x*) = 0 and grad^2 f(x*) > 0, then x* is a strict local solution.
Proof:
Since grad^2 f is continuous, there exists r > 0 such that grad^2 f(x) > 0 for all x with ||x - x*|| < r (an open ball around x*).
From Taylor, for any p with 0 < ||p|| < r:
f(x* + p) = f(x*) + p^T grad f(x*) + 1/2 p^T grad^2 f(x* + t p) p = f(x*) + 1/2 p^T grad^2 f(x* + t p) p > f(x*).
If we only look for points that satisfy the sufficient conditions, we might miss some minima (e.g. a local minimum where the Hessian is only positive semidefinite).
If f is convex, any local minimum is also the global minimum.
General line search algorithm for solving min_x f(x):
1. Initial guess x_0, k = 0
2. While termination criterion is not fulfilled:
   2a. Find a descent direction p_k (from gradient, and possibly curvature, information)
   2b. Walk along p_k to x_{k+1} = x_k + alpha_k p_k
   2c. k = k + 1
3. x* ~= x_k
Termination criteria: ||x_{k+1} - x_k|| <= eps, or ||grad f(x_k)|| <= eps, or |f(x_{k+1}) - f(x_k)| <= eps.
We cannot check |f(x_k) - f(x*)| <= eps, since we would need to know x*, which we don't.
In practice, terminate when the first criterion holds (or check all).
Descent direction: p_k is a descent direction at x_k if p_k^T grad f(x_k) < 0.
Main descent directions:
1. Steepest descent: p_k = -grad f(x_k). Used in most machine learning algorithms.
2. Newton: p_k = -grad^2 f(x_k)^{-1} grad f(x_k). More computationally expensive; requires grad^2 f(x_k) > 0 for p_k to be a descent direction.
3. Quasi-Newton: p_k = -B_k^{-1} grad f(x_k), where B_k ~= grad^2 f(x_k) is an approximation of the Hessian built from gradients only.
How far should we walk along p_k?
Ideally alpha_k = argmin_alpha f(x_k + alpha p_k); then set x_{k+1} = x_k + alpha_k p_k.
Trust region (the alternative to line search): build the model
m_k(p) = f(x_k) + grad f(x_k)^T p + 1/2 p^T B_k p
and solve min_{p in R^n} m_k(p) s.t. ||p|| <= Delta, i.e. p lies within the trust region.
Minimizing the quadratic approximation gives the Newton step.
On a badly scaled problem, steepest descent zigzags towards the minimizer, while Newton steps (which use curvature information) are better aligned; on a well-scaled problem the two behave similarly from the first iteration.
Lecture 14: Globalization strategies
Two basic globalization strategies: line search (Ch. 3) and trust-
region (Ch. 4, not syllabus)
Note: globalization does not imply that we search for global optimum, but we make
the algorithm work far from a (local or global) optimum!
Step-length, Wolfe conditions
Step-length computation
Hessian modifications
A comparison of steepest
descent (green) and
Newton's method (red) for
minimizing a function
(with small step sizes).
Newton's method uses
curvature information to take
a more direct route.
(wikipedia.org)
p_k = -B_k^{-1} grad f_k:  B_k = I gives steepest descent, B_k = grad^2 f_k gives Newton.
How to choose alpha_k?
Exact line search, alpha_k = argmin_alpha f(x_k + alpha p_k), is usually too expensive.
1. Sufficient decrease condition, with typical value c_1 = 10^{-4}.
2. Curvature condition, with 0 < c_1 < c_2 < 1.
These are the Wolfe conditions:
1. f(x_k + alpha_k p_k) <= f(x_k) + c_1 alpha_k grad f_k^T p_k
2. grad f(x_k + alpha_k p_k)^T p_k >= c_2 grad f_k^T p_k
with 0 < c_1 < c_2 < 1.
Sufficient decrease
Curvature condition
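A minimal backtracking line-search sketch that enforces the sufficient-decrease condition (the first Wolfe condition only); the objective function is an illustrative Rosenbrock-like example:

% Backtracking line search enforcing sufficient decrease (Armijo) (sketch).
f     = @(x) (x(1)-1)^2 + 10*(x(2)-x(1)^2)^2;          % illustrative objective
gradf = @(x) [2*(x(1)-1) - 40*x(1)*(x(2)-x(1)^2);
              20*(x(2)-x(1)^2)];

x  = [-1; 1];
g  = gradf(x);
p  = -g;                      % steepest-descent direction
c1 = 1e-4;  rho = 0.5;  alpha = 1;

while f(x + alpha*p) > f(x) + c1*alpha*(g'*p)
    alpha = rho*alpha;        % shrink step until sufficient decrease holds
end
xnew = x + alpha*p;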
Step-length computation: interpolate f(x_k + alpha p_k) by a quadratic (or cubic) polynomial q(alpha) through known values (e.g. f(x_k), grad f_k^T p_k and a trial point) and minimize it; see the book (Ch. 3).
Hessian modification: there is possibly no solution to the Newton system when grad^2 f_k is not positive definite. There are several ways of constructing E_k (Ch. 3.4) such that B_k = grad^2 f_k + E_k > 0, e.g.
E_k = 0 if grad^2 f_k > 0, and E_k = tau I with tau > -lambda_min(grad^2 f_k) otherwise.
Steepest descent:
Linear convergence
Newton:
Quadratic convergence
Quasi-Newton:
Superlinear convergence
Quasi-Newton
Q-N efficiently produce good search directions
Steepest descent: Many iterations, but each iteration cheap (need only gradient)
Newton: Few iterations, but each iteration expensive (need also Hessian)
Quasi-Newton: Achieve few iterations by approximating the Hessian using only the
gradient
Secant condition
BFGS (and DFP) Hessian approximation update formulas
A comparison of steepest
descent (green) and
Newton's method (red) for
minimizing a function
(with small step sizes).
Newton's method uses
curvature information to take
a more direct route.
(wikipedia.org)
Line search
Hessian modification
For p_k to be a descent direction, we need B_k > 0.
Newton gives very fast convergence, but it is expensive to calculate grad^2 f(x_k).
Quasi-Newton: use the model
m_k(p) = f_k + grad f_k^T p + 1/2 p^T B_k p
We have to choose B_k such that:
1. B_k > 0, so that we can produce descent directions
2. B_k ~= grad^2 f_k, for fast convergence
3. B_k depends only on gradients grad f_k (no second derivatives)
Requiring the model at x_{k+1} to match the gradients at both x_k and x_{k+1} gives
B_{k+1} (x_{k+1} - x_k) = grad f_{k+1} - grad f_k.
Define s_k = x_{k+1} - x_k and y_k = grad f_{k+1} - grad f_k.
Secant equation: B_{k+1} s_k = y_k.
Requires s_k^T y_k > 0, which holds automatically if
1. f is a convex function, or
2. alpha_k fulfils the Wolfe conditions.
BFGS update of B_k:
B_{k+1} = B_k - (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k y_k^T)/(y_k^T s_k)
Alternatively, update the inverse H_k = B_k^{-1} directly: choose H_{k+1} = argmin ||H - H_k|| s.t. H = H^T, H y_k = s_k, which gives
H_{k+1} = (I - rho_k s_k y_k^T) H_k (I - rho_k y_k s_k^T) + rho_k s_k s_k^T,   rho_k = 1/(y_k^T s_k)
BFGS is considered the most effective quasi-Newton formula.
Positive definiteness is preserved: x^T H_{k+1} x >= 0, with equality only when x = 0 (given s_k^T y_k > 0).
We also have to choose H_0; a common choice is H_0 = gamma I.
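A sketch of one BFGS update of the inverse Hessian approximation H; the vectors s and y are illustrative:

% One BFGS update of the inverse Hessian approximation H (sketch).
% s = x_{k+1} - x_k,  y = grad f_{k+1} - grad f_k,  requires s'*y > 0.
n = 2;
H = eye(n);                          % H_0 = gamma*I with gamma = 1
s = [0.5; -0.2];                     % illustrative step and gradient change
y = [0.4;  0.1];

rho = 1/(y'*s);
I   = eye(n);
H = (I - rho*(s*y'))*H*(I - rho*(y*s')) + rho*(s*s');   % BFGS formula for H_{k+1}

p = -H*[1; 2];                       % quasi-Newton direction p_k = -H_k*grad f_k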
BFGS
BFGS method
wikipedia.org
http://www.cs.cmu.edu/afs/cs/academic/class/15780-s16/www/slides/optimization.pdf
http://www.cs.cmu.edu/afs/cs/academic/class/15780-s16/www/slides/optimization.pdf
Quasi-Newton
in machine learning
Proceedings of the 28 th International Conference on Machine Learning, Bellevue, WA, USA, 2011
BFGS
wikipedia.org
Finite differences (from Taylor, with Lipschitz-continuous second derivatives, constant L):
Forward (one-sided) differences:
df/dx_i ~= (f(x + eps e_i) - f(x)) / eps,   error O(eps) (bounded by L eps / 2)
Needs n + 1 function evaluations for the full gradient.
Central (two-sided) differences:
df/dx_i ~= (f(x + eps e_i) - f(x - eps e_i)) / (2 eps),   error O(eps^2)
Needs 2n function evaluations.
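A sketch of forward and central differences for the gradient; the test function and step size are illustrative:

% Forward- and central-difference approximations of the gradient (sketch).
f = @(x) (x(1)-1)^2 + 10*(x(2)-x(1)^2)^2;    % illustrative function
x = [0.5; 0.5];
n = numel(x);
eps_fd = 1e-6;

g_fwd = zeros(n,1);  g_ctr = zeros(n,1);
fx = f(x);                                   % n+1 evaluations in total for forward differences
for i = 1:n
    e = zeros(n,1);  e(i) = 1;
    g_fwd(i) = (f(x + eps_fd*e) - fx) / eps_fd;                   % error O(eps)
    g_ctr(i) = (f(x + eps_fd*e) - f(x - eps_fd*e)) / (2*eps_fd);  % error O(eps^2), 2n evaluations
end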
Ta he e
Two modes
Forward
Reverse (adjoint)
Reverse mode
First, calculate by traversing graph forward
Then, calculate derivatives by traversing graph backward
Given a function
Costs of calculating derivatives with AD:
Forward mode (one vector):
Forward mode (entire Jacobian):
Reverse mode (one vector):
Reverse mode (entire Jacobian):
Implementing AD
Prototype procedure:
1. Decompose original code into intrinsic functions (e.g. x1*x2, sin(x),
ln(x), etc.)
2. Differentiate intrinsic functions (symbolically, or via a table)
(sin(x)' = cos(x), etc.)
3. Put everything together according to the chain rule
(either forward or reverse)
Example (C/C++)
function.c
#include <math.h>
double f(double x1, double x2, double x3) {
    double x4 = x1*x2;
    double x5 = cos(x1);
    double x6 = x3*x5;
    double x7 = x4 + x6 + x3;
    return x7;
}
function.c (forward-mode differentiated)
#include <math.h>
double *f_and_df(double x1, double dx1,
                 double x2, double dx2,
                 double x3, double dx3) {
    static double df[2];
    double x4 = x1*x2;
    double dx4 = dx1*x2 + x1*dx2;
    double x5 = cos(x1);
    double dx5 = -sin(x1)*dx1;
    double x6 = x3*x5;
    double dx6 = dx3*x5 + x3*dx5;
    double x7 = x4 + x6 + x3;
    double dx7 = dx4 + dx6 + dx3;
    df[0] = x7;
    df[1] = dx7;
    return df;
}
f_res = f(x);
Software etc.
General information
http://www.autodiff.org/
http://en.wikipedia.org/wiki/Automatic_differentiation
Book:
A. Griewank, A. Walther, "Evaluating Derivatives: Principles and Techniques of
Algorithmic Differentiation", 2nd edition. SIAM, 2008.
Calculate gradient of
at
AD forward mode
AD forward mode
AD forward mode
AD forward mode
AD forward mode
Forward mode
Both and are calculated by forward traversing the graph
Reverse mode
First, calculate by traversing graph forward
Then, calculate derivatives by traversing graph backward
AD reverse mode
AD reverse mode
AD reverse mode
AD reverse mode
AD reverse mode
AD reverse mode
mode node Of e I Cabuays
Right dni
rosenbrock.m
import casadi.*
Derivative-free optimization
If you with reasonable effort can implement derivatives
(gradients, possibly Hessian), use them!
They are almost always more efficient than not using them!
However, sometimes, obtaining derivatives is prohibitive
Examples:
Your objective function (and possibly constraints) are calculated using a (large)
black-box simulator that is expensive to evaluate
Your objective function (and possibly constraints) contain code that is not (easily)
differentiable (numerical noise makes the use of finite differences difficult)
Often models from computational fluid dynamics (CFD)
(but recently, some CFD software exports some sort of derivative information)
This is an area for derivative-free optimization methods
Nelder-Mead method:
Given an initial simplex S = {x_0, x_1, ..., x_n} in R^n such that V = [x_1 - x_0, ..., x_n - x_0] is a nonsingular matrix (for n = 2 this means no 3 points are on the same line). Sort so that f(x_0) <= f(x_1) <= ... <= f(x_n).
Define the centroid of the n best points: xbar = (1/n) sum_{i=0}^{n-1} x_i.
One iteration of Nelder-Mead (replacing the worst point x_n):
Compute the reflection point x_r = xbar + (xbar - x_n) and f(x_r).
If f(x_0) <= f(x_r) < f(x_{n-1}): replace x_n with x_r, sort, and GO TO the next iteration.
Else if f(x_r) < f(x_0): try the expansion point x_e = xbar + 2(xbar - x_n); replace x_n with the better of x_r and x_e, and GO TO the next iteration.
Else (f(x_r) >= f(x_{n-1})): contraction.
  If f(x_r) < f(x_n): outside contraction point x_c = xbar + 1/2 (xbar - x_n),
  else: inside contraction point x_c = xbar - 1/2 (xbar - x_n).
  If the contraction point improves on the point it replaces, accept it; otherwise shrink the whole simplex towards x_0: x_i <- x_0 + 1/2 (x_i - x_0).
The possible operations are: Reflection, Expansion, Contraction (outside/inside), Shrinkage.
The worst function value decreases each iteration (possibly except for shrinkage).
Termination: |f(x_0) - f(x_n)| <= eps.
Not a globally optimal algorithm: the result depends on where we start.
Suited for problems with a limited number of decision variables and complex/expensive function evaluations.
Examples:
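In MATLAB, fminsearch implements a variant of the Nelder-Mead method; a minimal usage sketch with an illustrative objective:

% Nelder-Mead via fminsearch (derivative-free; only function evaluations are used).
f  = @(x) (x(1)-1)^2 + 10*(x(2)-x(1)^2)^2;   % illustrative objective
x0 = [-1; 1];                                 % starting point (result depends on it)
opts = optimset('TolFun', 1e-8, 'TolX', 1e-8, 'Display', 'iter');
[xmin, fmin] = fminsearch(f, x0, opts);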
wikipedia.org
Least squares: fitting a model to the data.
If outliers are present, it is not a good idea to use the square of the residual as the error measure.
Generalizations:
(Statistical) Machine Learning: Regression, or parametric learning
Control theory: System identification (fitting dynamic models to data)
Gradient of objective: grad f(x) = J(x)^T r(x), where J(x) is the Jacobian of the residual vector r(x).
Gauss-Newton method
For these problems, a good approximation of the Hessian is grad^2 f(x) ~= J(x)^T J(x).
Define residuals r_i(x) and the least-squares optimization problem
min_x f(x) = 1/2 sum_i r_i(x)^2 = 1/2 ||r(x)||^2
Related problem: find x such that r(x) = 0.
r(x) = 0 can have no solution, a single (unique) solution, or many solutions.
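A Gauss-Newton sketch for an illustrative curve-fitting problem; the model y = a*exp(b*t) and the synthetic data are assumptions for the example:

% Gauss-Newton for min 0.5*||r(x)||^2 (sketch).
% Fit y = a*exp(b*t) to data: residuals r_i(x) = a*exp(b*t_i) - y_i, x = [a; b].
t = (0:0.5:3)';
y = 2*exp(0.8*t) + 0.05*randn(size(t));          % synthetic data
r = @(x) x(1)*exp(x(2)*t) - y;
J = @(x) [exp(x(2)*t), x(1)*t.*exp(x(2)*t)];     % Jacobian of r

x = [1; 1];                                       % initial guess
for k = 1:10
    Jk = J(x);  rk = r(x);
    p = -(Jk'*Jk) \ (Jk'*rk);   % Gauss-Newton step: (J'J) p = -J'r  (grad f = J'r, Hessian ~ J'J)
    x = x + p;                  % a line search on 0.5*||r||^2 would safeguard this step
end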
7
Flowsheet analysis in
chemical engineering
(steady state simulators)
Model of r(x_k + p):
r(x_k + p) ~= M_k(p) = r(x_k) + J(x_k) p
Setting M_k(p) = 0 gives p_k = -J(x_k)^{-1} r(x_k) and x_{k+1} = x_k + p_k.
Newton's algorithm for nonlinear equations:
choose x_0, k = 0
while ||r(x_k)|| > eps
    solve J(x_k) p_k = -r(x_k)
    x_{k+1} = x_k + p_k
    k = k + 1
end
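A sketch of this algorithm in MATLAB for an illustrative two-equation system:

% Newton's method for r(x) = 0 (sketch; the example system is illustrative).
r = @(x) [x(1)^2 + x(2)^2 - 1;       % unit circle
          x(2) - x(1)^2];            % parabola
J = @(x) [ 2*x(1), 2*x(2);
          -2*x(1), 1];               % Jacobian of r
x   = [1; 1];
tol = 1e-10;
for k = 1:50
    if norm(r(x)) <= tol, break; end
    p = -J(x) \ r(x);                % solve J(x_k) p_k = -r(x_k)
    x = x + p;                       % x_{k+1} = x_k + p_k
end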
(This is also called Newton's method, from its use on nonlinear equations.)
To solve min_x f(x), search for x such that grad f(x) = 0:
the Jacobian of grad f(x) is grad^2 f(x), so
p_k = -grad^2 f(x_k)^{-1} grad f(x_k)   is the Newton direction.
The model M_k approximates r well in a region around x_k: r(x) ~= M_k(x) there.
Potential problems with Newton's method:
1. Computing J(x_k) may be expensive, and/or J(x_k) may become non-invertible.
   Solution: do as in quasi-Newton: approximate J(x_k) using values of r(x), and modify to avoid singularity.
2. No objective function for the line search.
   Solution: define a merit function.
Newton's method example: a small system of two nonlinear equations in two unknowns, solved by iterating the Newton step from a starting point.
Local SQP for equality-constrained NLP:
min_x f(x)   s.t. c(x) = 0
Lagrangian: L(x, lambda) = f(x) - lambda^T c(x)
A(x) = [grad c_1(x), ..., grad c_m(x)]^T is the Jacobian of the constraints.
KKT as a nonlinear equation system:
F(x, lambda) = [grad f(x) - A(x)^T lambda; c(x)] = 0
Apply Newton's method to F(x, lambda) = 0:
[grad^2_xx L(x_k, lambda_k)  -A(x_k)^T;  A(x_k)  0] [p_x; p_lambda] = -[grad f(x_k) - A(x_k)^T lambda_k; c(x_k)]
x_{k+1} = x_k + p_x, lambda_{k+1} = lambda_k + p_lambda; repeat until converged.
Assumptions under which the KKT system has a solution (the KKT matrix is nonsingular):
a) A(x) is a full-rank matrix (LICQ)
b) d^T grad^2_xx L d > 0 for all d != 0 such that A(x) d = 0 (positive definite on the nullspace of A(x))
Thus Newton is an excellent algorithm for solving the KKT system when starting close to the solution.
Alternative derivation of the KKT system:
At (x_k, lambda_k), approximate the NLP by the QP
min_p f(x_k) + grad f(x_k)^T p + 1/2 p^T grad^2_xx L(x_k, lambda_k) p
s.t. c(x_k) + A(x_k) p = 0
KKT conditions of this QP (with Lagrange multiplier lambda_{k+1}):
grad f(x_k) + grad^2_xx L(x_k, lambda_k) p = A(x_k)^T lambda_{k+1}
c(x_k) + A(x_k) p = 0
Subtracting A(x_k)^T lambda_k from both sides of the first line gives exactly the Newton KKT system above.
Two interpretations of the algorithm (useful for convergence analysis):
1. Newton's method for solving the KKT system.
2. Sequential solution of QP approximations of the NLP.
inequalities
Cn At 914 0 i EE
Minne f
City O E C I
Given no Xo
Aff
while not convoyed
Salue Cfp
min fcnwt7fcnftptkptthnttnk.tk
s t City aCiEngTp O EEE
KE k 4 Cn t TheCnTp 70 i E I
µ my
q
me
Reason
fGua t T
f WT p tIgpt172mLthis Xk p t th Cena t Akap
f Gud t XT e Cnk t 17
fcnkltptXIACnkptbzptnxnnslnk.tk
quadraticapproximation oflagrangian
1
• Quadratic programming
Convex problem if
Feasible set polyhedron
• Nonlinear programming
In general non-convex!
• Example:
The Lagrangian
For constrained functions, introduce modification of objective
function (the Lagrangian):
(stationarity)
(primal feasibility)
(dual feasibility)
(complementarity condition/
complementary slackness)
Either or
Equality-constrained QP (EQP)
Basic assumption:
A full row rank
EQP:
Globalization of SQP-algorithms
(making the algorithm work far away from the optimum)
Computation/approximation of the Hessian
Linesearch
Other issues
Infeasible linearized constraints
The Maratos effect
Newton's method for solving nonlinear equations (Ch. 11)
Solve equation system
Assume Jacobian exists and is continuous
Taylor:
Equality-constrained NLP
Lagrangian:
KKT-system:
Trick: Use Newton's method for nonlinear equations on the KKT-system:
where
KKT:
Thm 18.1: Alg. 18.1 identifies (eventually) the optimal active set of constraints (under
assumptions). After that, it behaves like Newton's method for equality-constrained problems.
Quadratic approximation: min_p f(x_k) + grad f(x_k)^T p + 1/2 p^T grad^2_xx L(x_k, lambda_k) p.
The Hessian of the Lagrangian may not be positive definite on the nullspace of the constraints; options: modify the Hessian to ensure descent directions, or approximate the Hessian using BFGS.
Merit function: phi_1(x; mu) = f(x) + mu ||c(x)||_1 (for inequalities, use the violation max(0, -c_i(x))), with mu a penalty parameter. phi_1 is not differentiable everywhere.
ME is not differentiable
Defis A function 0Cn µ is exact
0 the local
if for any µ I let solution
of theMLP
is a minimizer of 0
gradually
Assume I 0 then 0 Cn µ flattell cCulll
We use merit funetoon as an alternate tee objectivefun
when we use line search be 02 algorithm allows iterates to
violate the constraints
we need directional derivative 70 Cai Miftp.k
But is not defined everywhere
Instead use D l 0 Cnn Mk Aa
Itf
iIT41
termination criteria
Possible option: Approximate a reduced Hessian (Hessian on the
nullspace of constraints) instead. This reduced Hessian is much more
likely to be positive definite (recall sufficient conditions).
Thm 18.2:
That is: pk is a descent direction for merit function if Hessian of Lagrangian is positive
definite and µ is large enough
Backtracking line search
Maratos effect
Maratos effect: A merit function may reject good steps!
Ex. 15.4:
Remedy:
Use a merit function that does not suffer from the Maratos effect
Use a "watchdog" strategy (temporarily accept an increase in the merit function)
Use second-order corrections (when the Maratos effect occurs)
NLP software
SNOPT
SQP method for large-scale linear and nonlinear problems; especially recommended if some of the
constraints are highly nonlinear, or constraints respectively their gradients are costly to evaluate
and second derivative information is unavailable or hard to obtain; assumes that the number of
free variables is modest.
Licence: Commercial Stanford
IPOPT
interior point method for large-scale NLP
License: Open source (but good linear solvers might be commercial)
WORHP
interior ft
SQP solver for very large problems, IP at QP level, exact or approximate second derivatives,
various linear algebra options, varius interfaces
Licence: Commercial, but free for academia
KNITRO
trust region interior point method, efficient for NLPs of all sizes, various interfaces
License: Commercial
(and several others, including fmincon in Matlab Optimization Toolbox)
rosenbrock.m
import casadi.*
Course in a nutshell:
Unconstrained optimization
Steepest descent, Newton, Quasi-Newton
Globalization (line-search and Hessian modification), derivatives
Constrained optimization
Optimality conditions, KKT
Linear programming: SIMPLEX
Quadratic programming: Active set method
Nonlinear programming: SQP
Control and optimization
LQ control
MPC
Today: Nonlinear MPC, e ac ica /i d ia i e MPC
wikipedia.org
Quadratic programming
Convex problem if
Feasible set polyhedron
Nonlinear programming
In general non-convex!
(stationarity)
(primal feasibility)
(dual feasibility)
(complementarity condition/
complementary slackness)
Starting point for all algorithms for constrained optimization in this course!
LP:
Lagrangian:
Partition , and :
KKT conditions
KKT-2: (since x is BFP)
KKT-3: (since x is BFP)
KKT-5: if we choose
KKT-1:
KKT-4: Is ?
General QP problem
Lagrangian
KKT conditions
General: Defined via active set:
Example 16.4
Equality-constrained NLP
Lagrangian:
KKT-system:
Trick: Use Newton's method for nonlinear equations on the KKT-system:
where
KKT:
Thm 18.1: Alg. 18.1 identifies (eventually) the optimal active set of constraints (under
assumptions). After that, it behaves like Newton's method for equality-constrained problems.
NLP software
SNOPT
SQP method for large-scale linear and nonlinear problems; especially recommended if some of the
constraints are highly nonlinear, or constraints respectively their gradients are costly to evaluate
and second derivative information is unavailable or hard to obtain; assumes that the number of
free variables is modest.
Licence: Commercial
IPOPT
interior point method for large-scale NLP
License: Open source (but good linear solvers might be commercial)
WORHP
SQP solver for very large problems, IP at QP level, exact or approximate second derivatives,
various linear algebra options, varius interfaces
Licence: Commercial, but free for academia
KNITRO
trust region interior point method, efficient for NLPs of all sizes, various interfaces
License: Commercial
(and several others, including fmincon in Matlab Optimization Toolbox)
rosenbrock.m
import casadi.*
desired
outputs inputs outputs
measurements
Time
Squeeze
(e.g. MPC)
Shift
D. K. Kufoalor, G. Frison, L. Imsland, T. A. Johansen, J. B. Jørgensen, Block Factorization of Step Response Model Predictive Control Problems, J. Process Control, Vol. 53, May, pp. 1 14, 2017;
D. K. M. Kufoalor, S. Richter, L. Imsland, T. A. Johansen, Enabling Full Structure Exploitation of Practical MPC Formulations for Speeding up First-Order Methods, 56th IEEE Conference on Decision and Control, 2017
D. K. M. Kufoalor, T. A. Johansen, L. S. Imsland, Efficient Implementation of Step Response Models for Embedded Model Predictive Control, Computers & Chemical Engineering, Volume 90, July, Pages 121 135, 2016
D. K. M. Kufoalor, V. Aaker, L. S. Imsland, T. A. Johansen, G. O. Eikrem, Automatically Generated Embedded Model Predictive Control: Moving an Industrial PC-based MPC to an Embedded Platform,
J. Optimal Control - Applications and Methods, vol. 36, pp. 705 727, 2015
NMPC open-loop problem:
min sum_t s_t(x_{t+1}, u_t)
s.t. x_{t+1} = g(x_t, u_t),  t = 0, ..., N-1
     y_t = h(x_t, u_t)
     y_low <= y_t <= y_high
     u_low <= u_t <= u_high
Stage cost s_t:
Regulation: s(x_{t+1}, u_t) = 1/2 x_{t+1}^T Q x_{t+1} + 1/2 u_t^T R u_t (or with deviations from x_ref, u_ref)
Tracking: s(x_t, u_t) = 1/2 (h(x_t, u_t) - y_ref)^T Q (h(x_t, u_t) - y_ref) + 1/2 (u_t - u_ref)^T R (u_t - u_ref)
calculate objective and state constraints (and gradients)
Results in a small, dense optimization problem with only the inputs as variables
Standard SQP methods are suitable
Simultaneous methods
Have both inputs and states as optimization variables, include model as
equality constraints
Results in huge optimization problem, but equality constraints are very
structured (sparse)
Must use solvers that exploit structure (e.g. IPOPT)
Inbetween: Multiple shooting
Divide horizon into sub-horizons, use single-shooting on each sub-horizon
and add equality constraints gluing the sub-horizons together
Results in a block-structured optimization problem
Ideally use solvers that exploit this structure (but not many exists)
What is best? It depends.
(Figure: map of MPC applications in Statoil, 92 in total, including Åsgard (#4), Norne (#7), Heidrun, Snøhvit, Gullfaks/Tordis (#2), Kollsnes (#5), Kårstø (#25), Kalundborg (#21) and Mongstad (#28). Source: Stig Strand, Statoil.)