Chapter 2: Quadratic Programming
Overview
Quadratic programming (QP) problems are characterized by objective functions that are quadratic
in the design variables, and linear constraints. In this sense, QPs are a generalization of LPs and
a special case of the general nonlinear programming problem. QPs are ubiquitous in engineering
problems, including civil & environmental engineering systems. A classic example is least squares
optimization, often performed during regression analysis. Like LPs, QPs can be solved graphically.
However, the nature of solutions is quite different. Namely, interior optima are possible. This
leads us towards conditions for optimality, which are an extension of basic optimization principles
learned in first-year calculus courses. Finally, QPs provide a building-block to approximately solve
nonlinear programs. That is, a nonlinear program can be solved by appropriately constructing a
sequence of approximate QPs.
By the end of this chapter, students will be able to identify and formulate QPs. They will also
be able to assess the nature of a QP's solution, i.e. a unique local maximum/minimum, infinitely
many solutions, or no solution. Finally, they will have one tool to approximately solve a more
general nonlinear programming problem.
Chapter Organization
• (Section 1) Quadratic Programs
• (Section 2) Least Squares
• (Section 3) Graphical QP
• (Section 4) Optimality Conditions
• (Section 5) Sequential Quadratic Programming
• (Section 6) Notes
1 Quadratic Programs
A quadratic program (QP) is the problem of optimizing a quadratic objective function subject to
linear constraints. Mathematically,
Minimize: (1/2) x^T Q x + R^T x + S   (1)
subject to: A x ≤ b   (2)
Aeq x = beq   (3)
where x ∈ R^n is the vector of design variables. The remaining matrices/vectors have dimensions
Q ∈ R^{n×n}, R ∈ R^n, S ∈ R, A ∈ R^{m×n}, b ∈ R^m, Aeq ∈ R^{l×n}, beq ∈ R^l, where n is the number
of design variables, m is the number of inequality constraints, and l is the number of equality
constraints.
Remark 1.1 (The “S” Term). Note that the S term in (1) can be dropped without loss of generality,
since it has no impact on the optimal solution x∗ . That is, suppose x∗ is the optimal solution to
(1)-(3). Then it is also the optimal solution to
Minimize: (1/2) x^T Q x + R^T x   (4)
As a consequence, we often disregard the S term. In fact, some computational solvers do not
even consider the S term as an input.
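For concreteness, the standard form (1)-(3) maps directly onto the Matlab command quadprog, which takes the quadratic term, linear term, and constraint matrices as separate inputs and has no argument for S. A minimal sketch, using hypothetical problem data:

% minimal sketch: solve (1)-(3) with quadprog (hypothetical data, for illustration only)
Q   = [2 0; 0 2];          % quadratic cost term
R   = [-2; -5];            % linear cost term
A   = [1 1];   b = 3;      % inequality constraints A*x <= b
Aeq = [];      beq = [];   % no equality constraints in this sketch
x_star = quadprog(Q, R, A, b, Aeq, beq);   % the S term is simply omitted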
A related class of problems also allows quadratic constraint functions, for example

subject to: (1/2) x^T U x + V^T x + W ≤ 0   (8)
Aeq x = beq   (9)
When U = 0, this problem degenerates into a standard QP. We will not discuss this class of
problems further, except to note that it exists and that dedicated solvers are available for it.
2 Least Squares
To this point, QPs are simply an abstract generalization of LPs. In this section, we seek to provide
some practical motivation for QPs. Consider an overdetermined linear system of equations. That
is, consider a “skinny” matrix A (more rows than columns), and the following equation to solve:

y = Ax   (10)

In general, no x satisfies (10) exactly. Instead, define the residual

r = Ax − y   (11)

and consider a solution x∗ that minimizes the norm of the residual, ‖r‖. This solution is called
the “least squares” solution. It is often used in regression analysis, among many other applications,
as demonstrated below.
Example 2.1 (Linear Regression). Suppose you have collected measured data pairs (xi , yi ), for
i = 1, · · · , N where N > 6, as shown in Fig. 1. You seek to fit a fifth-order polynomial to this data,
i.e.
y = c0 + c1 x + c2 x^2 + c3 x^3 + c4 x^4 + c5 x^5   (12)
Figure 1: You seek to fit a fifth-order polynomial to the measured data above.
The goal is to determine parameters cj, j = 0, · · · , 5 that “best” fit the data in some sense. To
this end, you may compute the residual r for each data pair:

ri = c0 + c1 xi + c2 xi^2 + c3 xi^3 + c4 xi^4 + c5 xi^5 − yi,   i = 1, · · · , N   (13)

Stacking these residuals gives r = Ac − y, where row i of A is [1, xi, xi^2, xi^3, xi^4, xi^5],
c = [c0, c1, · · · , c5]^T, and y = [y1, · · · , yN]^T.   (14)

Now we compute an optimal fit for c in the following sense. We seek the value of c which minimizes
the squared residual

min_c (1/2) ‖r‖^2 = (1/2) r^T r = (1/2)(Ac − y)^T (Ac − y) = (1/2) c^T A^T A c − y^T A c + (1/2) y^T y.   (15)
Note that (15) is quadratic in variable c. In this case the problem is also unconstrained. As a result,
we can set the gradient with respect to c to zero and directly solve for the minimizer.
∂/∂c [(1/2) ‖r‖^2] = A^T A c − A^T y = 0,
A^T A c = A^T y,
c = (A^T A)^{−1} A^T y.   (16)
This provides a direct formula for fitting the polynomial coefficients cj , j = 0, · · · , 5 using the
measured data.
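A minimal Matlab sketch of this fit, assuming the measurements are stored in N-by-1 column vectors xdata and ydata (hypothetical variable names):

% build the N-by-6 matrix of monomials and solve the normal equations (16)
N = length(xdata);
A = [ones(N,1), xdata, xdata.^2, xdata.^3, xdata.^4, xdata.^5];
c = (A'*A) \ (A'*ydata);   % c = [c0; c1; ...; c5], from A'*A*c = A'*y
% equivalently, c = A \ ydata solves the same least-squares problem
% using a more numerically robust factorization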
Exercise 1. Consider fitting the coefficients c1 , c2 , c3 of the following sum of radial basis functions
to data pairs (xi , yi ), i = 1, · · · , N .
y = c1 e^{−(x−0.25)^2} + c2 e^{−(x−0.5)^2} + c3 e^{−(x−0.75)^2}   (17)
Exercise 2. Repeat the same exercise for the following Fourier Series:
3 Graphical QP
For problems of one, two, or three dimensions, it is possible to solve QPs graphically. Consider
the following QP example:
s. to 2x1 + 4x2 ≤ 28
5x1 + 5x2 ≤ 50
x1 ≤ 8
x2 ≤ 6
x1 ≥ 0
x2 ≥ 0
The feasible set and corresponding iso-contours are illustrated in the left-hand side of Fig. 2. In
this case, the solution is an interior optimum. That is, no constraints are active at the minimum.
In contrast, consider the objective function J = (x1 − 6)^2 + (x2 − 6)^2 shown on the right-hand side
of Fig. 2. In this case, the minimum occurs at the boundary and is unique.
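As a numerical cross-check of the right-hand case, the objective J = (x1 − 6)^2 + (x2 − 6)^2 can be rewritten in standard form with Q = 2I and R = [−12; −12] (the constant 72 is dropped) and passed to quadprog with the constraints above. A minimal sketch, assuming the Optimization Toolbox is available:

% boundary-optimum case of the graphical QP, solved numerically
Q  = 2*eye(2);             % from J = 0.5*x'*(2*I)*x + [-12 -12]*x + 72
R  = [-12; -12];
A  = [2 4; 5 5; 1 0; 0 1]; % 2x1+4x2<=28, 5x1+5x2<=50, x1<=8, x2<=6
b  = [28; 50; 8; 6];
lb = [0; 0];               % x1 >= 0, x2 >= 0
x_star = quadprog(Q, R, A, b, [], [], lb);  % lands on the boundary 2x1+4x2=28, near (5.2, 4.4)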
Figure 2: An interior optimum [LEFT] and boundary optimum [RIGHT] for a QP solved graphically.
Exercise 3. Solve the following QP graphically:

s. to 5x1 + 3x2 ≤ 15
x1 ≥ 3
x2 ≤ 0

Exercise 4. Solve the following QP graphically:

min J = x2^2
s. to x1 + x2 ≤ 10
x1 ≥ 0
x1 ≤ 5
4 Optimality Conditions
Consider the unconstrained QP

min f(x) = (1/2) x^T Q x + R^T x   (19)
In calculus, you learned that a necessary condition for minimizers is that the function’s slope is
zero at the optimum. We extend this notion to multivariable functions. That is, if x∗ is an optimum,
then the gradient is zero at the optimum. Mathematically,
d/dx f(x∗) = Q x∗ + R = 0   ⇒   Q x∗ = −R   (20)
We call this condition the first order necessary condition (FONC) for optimality. This condition is
necessary for an optimum, but not sufficient for completely characterizing a minimizer or maxi-
mizer. As a result, we call a solution to the FONC a stationary point, x† . In calculus, the term
“extremum” is often used.
Recall from calculus (at UC Berkeley, the related course is Math 1A) that the second derivative can
reveal if a stationary point is a minimizer, maximizer, or neither. For single-variable functions f(x),
the stationary point x† has nature characterized by the sign of the second derivative: x† is a local
minimizer if f''(x†) > 0, a local maximizer if f''(x†) < 0, and possibly an inflection point if f''(x†) = 0.
We now extend this notion to multivariable optimization problems. Consider the second derivative
of a multivariable function, which is called the Hessian. Mathematically,
d²f/dx² (x†) = Q   (22)
and Q is the Hessian. The nature of the stationary point is given by the positive definiteness of
matrix Q, as shown in Table 1. Namely, if Q is positive definite then x† is a local minimizer. If Q
is negative definite then x† is a local maximizer. It is also possible to have infinite solutions, which
can be characterized by the Hessian Q. If Q is positive (negative) semi-definite, then x† is a valley
(ridge). If Q is indefinite, then x† is a saddle point. We call this the second order sufficient condition
(SOSC). Visualizations of each type of stationary point are provided in Fig. 3 and 4.
A quick review of positive definite matrices is provided in Section 4.1.
(Technically, an inflection point is a point x† where the curve f(x) changes from concave to convex,
or vice versa. It is possible for f''(x†) = 0 while the concavity does not change; an example is
f(x) = x^4 at x† = 0. In this case x† is not an inflection point, but an undulation point.)
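The FONC and SOSC together suggest a simple recipe: solve Qx = −R for the stationary point, then inspect the eigenvalues of Q. A minimal Matlab sketch, where Q is the local-minimum Hessian from Fig. 3 and R is a hypothetical linear term chosen only for illustration:

% classify the stationary point of f(x) = 0.5*x'*Q*x + R'*x
Q = [3 2; 2 3];            % Hessian (the local-minimum case of Fig. 3)
R = [-1; 1];               % hypothetical linear term
x_stat = Q \ (-R);         % FONC: Q*x = -R (Q is nonsingular here)
lambda = eig(Q);           % SOSC: inspect the signs of the eigenvalues
if all(lambda > 0)
    disp('unique local minimizer')
elseif all(lambda < 0)
    disp('unique local maximizer')
elseif all(lambda >= 0)
    disp('valley: infinitely many minimizers')
elseif all(lambda <= 0)
    disp('ridge: infinitely many maximizers')
else
    disp('saddle point')
end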
Figure 3: Visualizations of stationary points of different nature. In each case, the objective functions take
the form f (x1 , x2 ) = xT Qx and have stationary points at the origin (x†1 , x†2 ) = (0, 0), denoted by the red star.
The Hessians for each case are: Local Minimum Q = [3 2; 2 3], Local Maximum Q = [−2 1; 1 −2].
Figure 4: Visualizations of stationary points of different nature. In each case, the objective functions take
the form f (x1 , x2 ) = xT Qx and have stationary points at the origin (x†1 , x†2 ) = (0, 0), denoted by the red
star. The Hessians for each case are: Valley Q = [8 −4; −4 2], Ridge Q = [−5 0; 0 0], Saddle Point
Q = [2 −4; −4 1.5].
4.1 Positive Definite Matrices
In linear algebra, matrix positive definiteness is a generalization of positivity for scalar variables.
Definition 4.1 (Positive Definite Matrix). Consider a symmetric matrix Q ∈ R^{n×n}. All of the following
conditions are equivalent:
• Q is positive definite
• x^T Q x > 0, ∀ x ≠ 0
• all eigenvalues of Q are positive
• −Q is negative definite

Similarly, all of the following conditions are equivalent:
• Q is positive semi-definite
• x^T Q x ≥ 0, ∀ x ≠ 0
• all eigenvalues of Q are non-negative, and at least one eigenvalue is zero
• −Q is negative semi-definite
In practice, the simplest way to check positive-definiteness is to examine the signs of the
eigenvalues. In Matlab, one can compute the eigenvalues using the eig command.
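For instance, applying eig to the Hessians listed in the captions of Fig. 3 and Fig. 4 reproduces the classifications above:

eig([3 2; 2 3])      % eigenvalues 1 and 5 (both positive)   -> positive definite
eig([8 -4; -4 2])    % eigenvalues 0 and 10                  -> positive semi-definite
eig([2 -4; -4 1.5])  % one negative, one positive eigenvalue -> indefinite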
Exercise 5. Determine if the following matrices are positive definite, negative definite, positive
semi-definite, negative semi-definite, or indefinite.
(a) M = [1 0; 0 1]
(b) M = [9 −3; −3 1]
(c) M = [2 −1 0; −1 2 −1; 0 −1 2]
(d) M = [1 2; 2 1]
(e) M = [−8 4; 4 −2]
(f) M = [4 −2 0; −2 4 1; 0 1 4]
Example 4.1. Consider the following unconstrained QP

f(x1, x2) = (3 − x1)^2 + (4 − x2)^2
          = [x1 x2] [1 0; 0 1] [x1; x2] + [−6 −8] [x1; x2] + 25   (23)
Check the FONC. That is, find values of x = [x1 , x2 ]T where the gradient of f (x1 , x2 ) equals zero.
∂f/∂x = [−6 + 2x1; −8 + 2x2] = [0; 0]   (24)
has the solution (x†1 , x†2 ) = (3, 4). Next, check the SOSC. That is, check the positive definiteness
of the Hessian.
∂²f/∂x² = [2 0; 0 2] → positive definite   (25)
since the eigenvalues are 2,2. Consequently, the point (x∗1 , x∗2 ) = (3, 4) is a unique minimizer.
Example 4.2. Consider the following unconstrained QP

f(x1, x2) = 4x1^2 − 4x1x2 + x2^2 − 4x1 + 2x2   (26)

Check the FONC. That is, find values of x = [x1, x2]^T where the gradient of f(x1, x2) equals zero.
∂f/∂x = [−4 + 8x1 − 4x2; 2 − 4x1 + 2x2] = [0; 0]   (27)
has an infinity of solutions (x†1 , x†2 ) on the line 2x1 − x2 = 1. Next, check the SOSC. That is, check
the positive definiteness of the Hessian.
∂²f/∂x² = [8 −4; −4 2] → positive semidefinite   (28)
since the eigenvalues are 0,10. Consequently, there is an infinite set of minima (valley) on the line
2x∗1 − x∗2 = 1.
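A quick numerical check of this example is sketched below: the Hessian is singular, so the FONC Qx = −R has infinitely many solutions, and Matlab's pinv picks out one particular point on the valley line.

% Example 4.2 revisited numerically (a sketch)
Q = [8 -4; -4 2];   R = [-4; 2];
eig(Q)                    % eigenvalues 0 and 10 -> positive semi-definite
x_part = pinv(Q)*(-R)     % returns [0.4; -0.2], which satisfies 2*x1 - x2 = 1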
Exercise 6. Examine the FONC and SOSC for the following unconstrained QP problems. What is
the stationary point x† ? What is its nature, i.e. unique minimizer, unique maximizer, valley, ridge,
or no solution?
(c) min_{x1,x2,x3} f(x1, x2, x3) = x1^2 + 2x2^2 + 3x3^2 + 3x1x2 + 4x1x3 − 3x2x3
(d) min_{x1,x2,x3} f(x1, x2, x3) = 2x1^2 + x1x2 + x2^2 + x2x3 + x3^2 − 6x1 − 7x2 − 8x3 + 19
(e) min_{x1,x2,x3} f(x1, x2, x3) = x1^2 + 4x2^2 + 4x3^2 + 4x1x2 + 4x1x3 + 16x2x3
5 Sequential Quadratic Programming
In our discussion of QPs so far, we have defined QPs, motivated their use with least squares in
regression analysis, examined graphical solutions, and discussed optimality conditions for uncon-
strained QPs. To this point, however, we are still not equipped to solve general nonlinear programs
(NLPs). In this section, we provide a direct method for handling NLPs with constraints, called the
Sequential Quadratic Programming (SQP) method. The idea is simple. We solve a single NLP as
a sequence of QP subproblems. In particular, at each iteration we approximate the objective func-
tion and constraints by a QP. Then, within each iteration, we solve the corresponding QP and use
the solution as the next iterate. This process continues until an appropriate stopping criterion is
satisfied.
SQP is very widely used in engineering problems and often the first “go-to” method for NLPs.
For many practical energy system problems, it produces fast convergence thanks to its strong
theoretical basis. This method is commonly used under the hood of the Matlab function fmincon.
Consider the general NLP

min f(x)   (29)
s. to g(x) ≤ 0,   (30)
h(x) = 0,   (31)
and the k th iterate xk for the decision variable. We utilize the Taylor series expansion. At each
iteration of SQP, we consider the 2nd-order Taylor series expansion of the objective function (29),
f(x) ≈ f(xk) + (∂f/∂x (xk))^T (x − xk) + (1/2)(x − xk)^T (∂²f/∂x² (xk)) (x − xk),   (32)

and the first-order Taylor series expansions of the constraint functions (30) and (31),

g(x) ≈ g(xk) + (∂g/∂x (xk))^T (x − xk) ≤ 0,   (33)
h(x) ≈ h(xk) + (∂h/∂x (xk))^T (x − xk) = 0.   (34)
Substituting x̃ = x − xk, this yields the QP subproblem

min (1/2) x̃^T Q x̃ + R^T x̃,   (35)
s. to Ax̃ ≤ b (36)
Aeq x̃ = beq (37)
where
Q = d²f/dx² (xk),   R = df/dx (xk),   (38)
A = (dg/dx (xk))^T,   b = −g(xk),   (39)
Aeq = (dh/dx (xk))^T,   beq = −h(xk).   (40)
Suppose (35)-(37) yields the optimal solution x̃∗ . Then let xk+1 = xk + x̃∗ , and repeat.
Remark 5.1. Note that the iterates in SQP are not guaranteed to be feasible for the original NLP
problem. That is, it is possible to obtain a solution to the QP subproblem which satisfies the
approximate QP’s constraints, but not the original NLP constraints.
Example 5.1. Consider the NLP

min e^{−x1} + (x2 − 2)^2   (41)
s. to x1 x2 − 1 ≤ 0   (42)

with the initial guess [x1,0, x2,0]^T = [1, 1]^T. By hand, formulate the Q, R, A, b matrices for the first
three iterates. Use the Matlab command quadprog to solve each subproblem. What is the solution
after three iterations?
We have f(x) = e^{−x1} + (x2 − 2)^2 and g(x) = x1 x2 − 1. The iso-contours for the objective function
and constraint are provided in Fig. 5. From visual inspection, it is clear the optimal solution is near
the constraint boundary g(x) = 0. At each iteration, the QP subproblem takes the form

min (1/2) x̃^T Q x̃ + R^T x̃   (43)
s. to A x̃ ≤ b   (44)
Now consider the initial guess [x1,0 , x2,0 ]T = [1, 1]T . Note that this guess is feasible. We obtain the
following matrices for the first QP subproblem
Q = [e^{−1} 0; 0 2],   R = [−e^{−1}; −2],   A = [1 1],   b = 0
Solving this QP subproblem results in x̃∗ = [−0.6893, 0.6893]. Then the next iterate is given by
[x1,1 , x2,1 ] = [x1,0 , x2,0 ] + x̃∗ = [0.3107, 1.6893]. Repeating the formulation and solution of the
QP subproblem at iteration 1 produces [x1,2 , x2,2 ] = [0.5443, 1.9483]. Note that this solution is
infeasible. Continued iterations will produce solutions that converge toward the true solution.
Iteration   [x1, x2]            f(x)      g(x)
0           [1, 1]              1.3679    0
1           [0.3107, 1.6893]    0.8295   -0.4751
2           [0.5443, 1.9483]    0.5829    0.0605
3           [0.5220, 1.9171]    0.6002    0.0001
Figure 5 & Table 2: [LEFT] Iso-contours of objective function and constraint for Example 5.1. [RIGHT]
Numerical results for first three iterations of SQP. Note that some iterates are infeasible.
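The iterates in Table 2 can be reproduced with a short Matlab loop that rebuilds and solves the QP subproblem at each step. A minimal sketch for this example, with the gradients and Hessian coded by hand:

% SQP iterations for Example 5.1: min exp(-x1) + (x2-2)^2  s.to  x1*x2 - 1 <= 0
f     = @(x) exp(-x(1)) + (x(2)-2)^2;
g     = @(x) x(1)*x(2) - 1;
gradf = @(x) [-exp(-x(1)); 2*(x(2)-2)];
hessf = @(x) [exp(-x(1)) 0; 0 2];
gradg = @(x) [x(2); x(1)];
x = [1; 1];                              % initial guess
for k = 1:3
    Q = hessf(x);      R = gradf(x);     % from (38)
    A = gradg(x)';     b = -g(x);        % from (39)
    xtilde = quadprog(Q, R, A, b);       % QP subproblem (35)-(36)
    x = x + xtilde;                      % next iterate
end
x   % approximately [0.5220; 1.9171] after three iterations, matching Table 2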
SQP provides an algorithmic way to solve NLPs in civil & environmental engineering systems.
However, it still relies on approximations - namely truncated Taylor series expansions - to solve the
optimization problem via a sequence of QP subproblems. In Chapter 4, we will discuss a direct
method for solving NLPs, without approximation.
Exercise 7. Consider the following NLP:

min_{x1,x2} x1 x2^{−2} + x2 x1^{−1}   (47)
s. to 1 − x1 x2 = 0,   (48)
1 − x1 − x2 ≤ 0,   (49)
x1, x2 ≥ 0   (50)
with the initial guess [x1,0 , x2,0 ]T = [2, 0.5]T . By hand, formulate the Q,R,A,b matrices for the first
three iterates. Use Matlab command quadprog to solve each subproblem. What is the solution
after three iterations?
with the initial guess [x1,0 , x2,0 ]T = [1, 2]T . By hand, formulate the Q,R,A,b matrices for the first
three iterates. Use Matlab command quadprog to solve each subproblem. What is the solution
after three iterations?
6 Notes
Many solvers have been developed specifically for quadratic programs. In Matlab, the quadprog
command uses algorithms tailored towards the specific structure of QPs.
More information on the theory behind QPs can be found in Section 4.4 of [2], including ap-
plications such as minimum variance problems and the famous Markowitz portfolio optimization
problem in economics. Readers interested in further details and examples/exercises for optimality
conditions should consult Section 4.3 of [1]. Section 7.7 of [1] also provides an excellent exposition
of Sequential Quadratic Programming.
Applications of QPs in systems engineering problems include electric vehicles [3], hydro power
networks [4], ground-water planning and management [5], transportation engineering [6], financial
engineering [7], and structural reliability [8].
References
[1] P. Y. Papalambros and D. J. Wilde, Principles of Optimal Design: Modeling and Computation. Cam-
bridge University Press, 2000.
[2] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2009.
[3] S. Bashash and H. Fathy, “Optimizing demand response of plug-in hybrid electric vehicles using
quadratic programming,” in American Control Conference (ACC), 2013, pp. 716–721.
[4] S. A. A. Moosavian, A. Ghaffari, and A. Salimi, “Sequential quadratic programming and analytic
hierarchy process for nonlinear multiobjective optimization of a hydropower network,” Optimal
Control Applications and Methods, vol. 31, no. 4, pp. 351–364, 2010. [Online]. Available:
http://dx.doi.org/10.1002/oca.909
[5] W. Yeh, “Systems analysis in groundwater planning and management,” Journal of Water Resources
Planning and Management, vol. 118, no. 3, pp. 224–237, 1992.
[6] M. E. O'Kelly, “A quadratic integer program for the location of interacting hub facilities,” European Journal
of Operational Research, vol. 32, no. 3, pp. 393–404, 1987.
[7] R. O. Michaud, “The Markowitz optimization enigma: Is 'optimized' optimal?” Financial Analysts Journal,
vol. 45, no. 1, pp. 31–42, 1989.
[8] P.-L. Liu and A. D. Kiureghian, “Optimization algorithms for structural reliability,” Structural Safety, vol. 9,
no. 3, pp. 161–177, 1991.