
Quadratic Functions, Optimization, and Quadratic Forms

Robert M. Freund
February, 2004

© 2004 Massachusetts Institute of Technology

1 Quadratic Optimization

A quadratic optimization problem is an optimization problem of the form:

$$\text{(QP)}: \quad \text{minimize} \;\; f(x) := \tfrac{1}{2}x^T Q x + c^T x \quad \text{s.t.} \;\; x \in \mathbb{R}^n.$$

Problems of the form QP are natural models that arise in a variety of settings. For example, consider the problem of approximately solving an over-determined linear system $Ax = b$, where $A$ has more rows than columns. We might want to solve:

$$\text{(P}_1): \quad \text{minimize} \;\; \|Ax - b\| \quad \text{s.t.} \;\; x \in \mathbb{R}^n.$$

Now notice that $\|Ax - b\|^2 = x^T A^T A x - 2b^T A x + b^T b$, and so this problem is equivalent to:

$$\text{(P}_1): \quad \text{minimize} \;\; x^T A^T A x - 2b^T A x + b^T b \quad \text{s.t.} \;\; x \in \mathbb{R}^n,$$

which is in the format of QP.
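As a quick numerical illustration (a minimal sketch, not part of the original notes; the matrix sizes and data are arbitrary), the code below forms the QP data $Q = 2A^T A$, $c = -2A^T b$ for a small over-determined system and checks that minimizing the quadratic reproduces the ordinary least-squares solution:

```python
import numpy as np

# An over-determined system: 6 equations, 2 unknowns (arbitrary example data).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))
b = rng.standard_normal(6)

# ||Ax - b||^2 = x^T (A^T A) x - 2 b^T A x + b^T b, so in QP form
# f(x) = (1/2) x^T Q x + c^T x with Q = 2 A^T A and c = -2 A^T b
# (any positive rescaling of Q and c gives the same minimizer).
Q = 2 * A.T @ A
c = -2 * A.T @ b

# The minimizer satisfies Qx + c = 0 (see Theorem 3 below).
x_qp = np.linalg.solve(Q, -c)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]  # reference least-squares solution

print(np.allclose(x_qp, x_ls))  # True
```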


A symmetric matrix is a square matrix Q nn with the property that
Qij = Qji for all i, j = 1, . . . , n .

We can alternatively dene a matrix Q to be symmetric if


QT = Q .

We denote the identity matrix (i.e., a matrix with all 1's on the diagonal and 0's everywhere else) by $I$, that is,

$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix},$$

and note that $I$ is a symmetric matrix.


The gradient vector of a smooth function $f(x): \mathbb{R}^n \to \mathbb{R}$ is the vector of first partial derivatives of $f(x)$:

$$\nabla f(x) := \begin{pmatrix} \frac{\partial f(x)}{\partial x_1} \\ \vdots \\ \frac{\partial f(x)}{\partial x_n} \end{pmatrix}.$$

The Hessian matrix of a smooth function $f(x): \mathbb{R}^n \to \mathbb{R}$ is the matrix of second partial derivatives. Suppose that $f(x): \mathbb{R}^n \to \mathbb{R}$ is twice differentiable, and let

$$[H(x)]_{ij} := \frac{\partial^2 f(x)}{\partial x_i \partial x_j}.$$

Then the matrix $H(x)$ is a symmetric matrix, reflecting the fact that

$$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{\partial^2 f(x)}{\partial x_j \partial x_i}.$$

A very general optimization problem is:

$$\text{(GP)}: \quad \text{minimize} \;\; f(x) \quad \text{s.t.} \;\; x \in \mathbb{R}^n,$$

where $f(x): \mathbb{R}^n \to \mathbb{R}$ is a function. We often design algorithms for GP by building a local quadratic model of $f(\cdot)$ at a given point $x = \bar{x}$. We form the gradient $\nabla f(\bar{x})$ (the vector of partial derivatives) and the Hessian $H(\bar{x})$ (the matrix of second partial derivatives), and approximate GP by the following problem, which uses the Taylor expansion of $f(x)$ at $x = \bar{x}$ up to the quadratic term:

$$\text{(P}_2): \quad \text{minimize} \;\; \tilde{f}(x) := f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x}) + \tfrac{1}{2}(x - \bar{x})^T H(\bar{x})(x - \bar{x}) \quad \text{s.t.} \;\; x \in \mathbb{R}^n.$$

This problem is also in the format of QP.
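To make the local model concrete, here is a small sketch (not from the original notes) that builds $\tilde{f}$ for an assumed test function using finite-difference approximations of the gradient and Hessian; the test function and the point $\bar{x}$ are arbitrary choices for illustration:

```python
import numpy as np

def f(x):
    # Arbitrary smooth test function for the illustration.
    return np.exp(x[0]) + x[0] * x[1] ** 2

def grad_fd(f, x, h=1e-5):
    # Central-difference approximation of the gradient.
    n = len(x)
    g = np.zeros(n)
    for i in range(n):
        e = np.zeros(n); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hess_fd(f, x, h=1e-4):
    # Finite-difference approximation of the Hessian.
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        H[:, i] = (grad_fd(f, x + e) - grad_fd(f, x - e)) / (2 * h)
    return (H + H.T) / 2  # symmetrize away rounding asymmetry

xbar = np.array([0.5, -1.0])
g, H = grad_fd(f, xbar), hess_fd(f, xbar)

def f_tilde(x):
    # The quadratic model of f at xbar, i.e., the objective of (P2).
    d = x - xbar
    return f(xbar) + g @ d + 0.5 * d @ H @ d

x = xbar + np.array([0.01, -0.02])
print(f(x), f_tilde(x))  # values are close for x near xbar
```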


Notice in the general model QP that we can always presume that $Q$ is a symmetric matrix, because

$$x^T Q x = \tfrac{1}{2} x^T (Q + Q^T) x,$$

and so we could replace $Q$ by the symmetric matrix $\tilde{Q} := \tfrac{1}{2}(Q + Q^T)$.

Now suppose that

$$f(x) := \tfrac{1}{2} x^T Q x + c^T x,$$

where $Q$ is symmetric. Then it is easy to see that

$$\nabla f(x) = Qx + c \quad \text{and} \quad H(x) = Q.$$
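The following quick check (an illustrative sketch, not from the original notes, with arbitrary example data) verifies numerically that symmetrizing a non-symmetric $Q$ leaves the quadratic form unchanged and that $\nabla f(x) = \tilde{Q}x + c$:

```python
import numpy as np

rng = np.random.default_rng(1)
Q = rng.standard_normal((3, 3))   # deliberately non-symmetric
c = rng.standard_normal(3)
x = rng.standard_normal(3)

Q_sym = (Q + Q.T) / 2
print(np.isclose(x @ Q @ x, x @ Q_sym @ x))  # True: same quadratic form

f = lambda x: 0.5 * x @ Q_sym @ x + c @ x
h = 1e-6
grad_fd = np.array([(f(x + h * np.eye(3)[i]) - f(x - h * np.eye(3)[i])) / (2 * h)
                    for i in range(3)])
print(np.allclose(grad_fd, Q_sym @ x + c, atol=1e-5))  # True: grad f = Qx + c
```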

Before we try to solve QP, we first review some very basic properties of symmetric matrices.

2 Convexity, Definiteness of a Symmetric Matrix, and Optimality Conditions
A function $f(x): \mathbb{R}^n \to \mathbb{R}$ is a convex function if

$$f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda) f(y) \quad \text{for all } x, y \in \mathbb{R}^n, \text{ for all } \lambda \in [0,1].$$

A function $f(x)$ as above is called a strictly convex function if the inequality above is strict for all $x \ne y$ and $\lambda \in (0,1)$.

A function $f(x): \mathbb{R}^n \to \mathbb{R}$ is a concave function if

$$f(\lambda x + (1-\lambda)y) \ge \lambda f(x) + (1-\lambda) f(y) \quad \text{for all } x, y \in \mathbb{R}^n, \text{ for all } \lambda \in [0,1].$$

A function $f(x)$ as above is called a strictly concave function if the inequality above is strict for all $x \ne y$ and $\lambda \in (0,1)$.

Here are some more definitions:

$Q$ is symmetric and positive semidefinite (abbreviated SPSD and denoted by $Q \succeq 0$) if

$$x^T Q x \ge 0 \quad \text{for all } x \in \mathbb{R}^n.$$

$Q$ is symmetric and positive definite (abbreviated SPD and denoted by $Q \succ 0$) if

$$x^T Q x > 0 \quad \text{for all } x \in \mathbb{R}^n, \; x \ne 0.$$
Theorem 1 The function $f(x) := \tfrac{1}{2} x^T Q x + c^T x$ is a convex function if and only if $Q$ is SPSD.

Proof: First, suppose that $Q$ is not SPSD. Then there exists $r$ such that $r^T Q r < 0$. Let $x = \theta r$. Then $f(x) = f(\theta r) = \tfrac{1}{2}\theta^2 r^T Q r + \theta c^T r$ is strictly concave on the subset $\{x \mid x = \theta r\}$, since $r^T Q r < 0$. Thus $f(\cdot)$ is not a convex function.
Next, suppose that $Q$ is SPSD. For all $\lambda \in [0,1]$, and for all $x, y$:

$$\begin{aligned}
f(\lambda x + (1-\lambda)y) &= f(y + \lambda(x-y)) \\
&= \tfrac{1}{2}(y + \lambda(x-y))^T Q (y + \lambda(x-y)) + c^T (y + \lambda(x-y)) \\
&= \tfrac{1}{2} y^T Q y + \lambda (x-y)^T Q y + \tfrac{1}{2}\lambda^2 (x-y)^T Q (x-y) + \lambda c^T x + (1-\lambda) c^T y \\
&\le \tfrac{1}{2} y^T Q y + \lambda (x-y)^T Q y + \tfrac{1}{2}\lambda (x-y)^T Q (x-y) + \lambda c^T x + (1-\lambda) c^T y \\
&= \lambda \tfrac{1}{2} x^T Q x + (1-\lambda) \tfrac{1}{2} y^T Q y + \lambda c^T x + (1-\lambda) c^T y \\
&= \lambda f(x) + (1-\lambda) f(y),
\end{aligned}$$

thus showing that $f(x)$ is a convex function. (The inequality uses $\lambda^2 \le \lambda$ for $\lambda \in [0,1]$ together with $(x-y)^T Q (x-y) \ge 0$.)
Corollary 2 $f(x)$ is strictly convex if and only if $Q \succ 0$. $f(x)$ is concave if and only if $Q \preceq 0$. $f(x)$ is strictly concave if and only if $Q \prec 0$. $f(x)$ is neither convex nor concave if and only if $Q$ is indefinite.
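Since definiteness of a symmetric matrix is equivalent to sign conditions on its eigenvalues (Proposition 7 and Exercise 14 below), one can classify the curvature of $f$ numerically. A small sketch, with arbitrary example matrices:

```python
import numpy as np

def classify_quadratic(Q, tol=1e-10):
    # Classify f(x) = (1/2) x^T Q x + c^T x via the eigenvalues of the
    # symmetric part of Q (Corollary 2).
    lam = np.linalg.eigvalsh((Q + Q.T) / 2)
    if np.all(lam > tol):   return "strictly convex"
    if np.all(lam >= -tol): return "convex"
    if np.all(lam < -tol):  return "strictly concave"
    if np.all(lam <= tol):  return "concave"
    return "neither convex nor concave (Q indefinite)"

print(classify_quadratic(np.array([[2.0, 0.0], [0.0, 1.0]])))   # strictly convex
print(classify_quadratic(np.array([[1.0, 0.0], [0.0, -1.0]])))  # indefinite
```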

Theorem 3 Suppose that $Q$ is SPSD. The function $f(x) := \tfrac{1}{2} x^T Q x + c^T x$ attains its minimum at $x^*$ if and only if $x^*$ solves the equation system:

$$\nabla f(x^*) = Q x^* + c = 0.$$

Proof: Suppose that $x^*$ satisfies $Q x^* + c = 0$. Then for any $x$, we have:

$$\begin{aligned}
f(x) &= f(x^* + (x - x^*)) \\
&= \tfrac{1}{2}(x^* + (x - x^*))^T Q (x^* + (x - x^*)) + c^T (x^* + (x - x^*)) \\
&= \tfrac{1}{2}(x^*)^T Q x^* + (x - x^*)^T Q x^* + \tfrac{1}{2}(x - x^*)^T Q (x - x^*) + c^T x^* + c^T (x - x^*) \\
&= \tfrac{1}{2}(x^*)^T Q x^* + (x - x^*)^T (Q x^* + c) + \tfrac{1}{2}(x - x^*)^T Q (x - x^*) + c^T x^* \\
&= \tfrac{1}{2}(x^*)^T Q x^* + c^T x^* + \tfrac{1}{2}(x - x^*)^T Q (x - x^*) \\
&= f(x^*) + \tfrac{1}{2}(x - x^*)^T Q (x - x^*) \\
&\ge f(x^*),
\end{aligned}$$

thus showing that $x^*$ is a minimizer of $f(x)$.

Next, suppose that $x^*$ is a minimizer of $f(x)$, but that $d := Q x^* + c \ne 0$. Then:

$$\begin{aligned}
f(x^* + \theta d) &= \tfrac{1}{2}(x^* + \theta d)^T Q (x^* + \theta d) + c^T (x^* + \theta d) \\
&= \tfrac{1}{2}(x^*)^T Q x^* + \theta d^T Q x^* + \tfrac{1}{2}\theta^2 d^T Q d + c^T x^* + \theta c^T d \\
&= f(x^*) + \theta d^T (Q x^* + c) + \tfrac{1}{2}\theta^2 d^T Q d \\
&= f(x^*) + \theta d^T d + \tfrac{1}{2}\theta^2 d^T Q d.
\end{aligned}$$

But notice that for $\theta < 0$ and sufficiently small in magnitude, the last expression will be strictly less than $f(x^*)$, and so $f(x^* + \theta d) < f(x^*)$. This contradicts the supposition that $x^*$ is a minimizer of $f(x)$, and so it must be true that $d = Q x^* + c = 0$.
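Theorem 3 turns unconstrained quadratic minimization into solving a linear system. A minimal numerical sketch (example data chosen arbitrarily):

```python
import numpy as np

# An SPD Q (so the minimizer is unique) and an arbitrary c.
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
c = np.array([1.0, -2.0])

f = lambda x: 0.5 * x @ Q @ x + c @ x

# Theorem 3: the minimizer solves Qx + c = 0.
x_star = np.linalg.solve(Q, -c)

# Sanity check: f at nearby points is no smaller than f(x_star).
rng = np.random.default_rng(2)
perturbed = [f(x_star + 0.1 * rng.standard_normal(2)) for _ in range(100)]
print(np.linalg.norm(Q @ x_star + c))  # ~0: gradient vanishes
print(min(perturbed) >= f(x_star))     # True
```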
Here are some examples of convex quadratic forms:

$f(x) = x^T x$

$f(x) = (x - a)^T (x - a)$

$f(x) = (x - a)^T D (x - a)$, where $D = \mathrm{diag}(d_1, \dots, d_n)$ is a diagonal matrix with $d_j > 0$, $j = 1, \dots, n$

$f(x) = (x - a)^T M^T D M (x - a)$, where $M$ is a non-singular matrix and $D$ is as above.

3 Characteristics of Symmetric Matrices

A matrix $M$ is an orthonormal matrix if $M^T = M^{-1}$. Note that if $M$ is orthonormal and $y = Mx$, then

$$\|y\|^2 = y^T y = x^T M^T M x = x^T M^{-1} M x = x^T x = \|x\|^2,$$

and so $\|y\| = \|x\|$.
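A quick numerical check of this norm-preservation property (an illustrative sketch; the orthonormal matrix here is generated via a QR factorization of random data):

```python
import numpy as np

rng = np.random.default_rng(3)
M, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # M has orthonormal columns

x = rng.standard_normal(4)
y = M @ x
print(np.allclose(M.T @ M, np.eye(4)))                   # M^T = M^{-1}
print(np.isclose(np.linalg.norm(y), np.linalg.norm(x)))  # ||Mx|| = ||x||
```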
A number $\lambda$ is an eigenvalue of $M$ if there exists a vector $\bar{x} \ne 0$ such that $M\bar{x} = \lambda\bar{x}$. The vector $\bar{x}$ is called an eigenvector of $M$ (and is called an eigenvector corresponding to $\lambda$). Note that $\lambda$ is an eigenvalue of $M$ if and only if $(M - \lambda I)\bar{x} = 0$ for some $\bar{x} \ne 0$ or, equivalently, if and only if $\det(M - \lambda I) = 0$.

Let $g(\lambda) = \det(M - \lambda I)$. Then $g(\lambda)$ is a polynomial of degree $n$, and so will have $n$ roots that solve the equation

$$g(\lambda) = \det(M - \lambda I) = 0,$$

counting multiplicities. These roots are the eigenvalues of $M$.
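The roots-of-the-characteristic-polynomial view can be checked directly in code (a sketch for illustration; `np.poly` applied to a square matrix returns the coefficients of its characteristic polynomial):

```python
import numpy as np

M = np.array([[2.0, 1.0], [1.0, 3.0]])

coeffs = np.poly(M)  # coefficients of g(lambda) = det(M - lambda I), up to sign
roots = np.sort(np.roots(coeffs))
eigs = np.sort(np.linalg.eigvals(M).real)

print(roots, eigs)               # the polynomial roots match the eigenvalues
print(np.allclose(roots, eigs))  # True
```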
Proposition 4 If Q is a real symmetric matrix, all of its eigenvalues are
real numbers.

Proof: If $s = a + bi$ is a complex number, let $\bar{s} = a - bi$. Then $\overline{st} = \bar{s}\,\bar{t}$, $s$ is real if and only if $s = \bar{s}$, and $s\bar{s} = a^2 + b^2$. If $\lambda$ is an eigenvalue of $Q$, with $Qx = \lambda x$ for some $x \ne 0$, we have the following chain of equations:

$$\bar{x}^T Q x = \bar{x}^T (\lambda x) = \lambda \bar{x}^T x,$$

as well as the following chain of equations:

$$\bar{x}^T Q x = \bar{x}^T Q^T x = (Q\bar{x})^T x = \left(\overline{Qx}\right)^T x = \left(\overline{\lambda x}\right)^T x = \bar{\lambda}\, \bar{x}^T x.$$

Thus $\lambda \bar{x}^T x = \bar{\lambda}\, \bar{x}^T x$, and since $x \ne 0$ implies $\bar{x}^T x \ne 0$, $\lambda = \bar{\lambda}$, and so $\lambda$ is real.
Proposition 5 If $Q$ is a real symmetric matrix, its eigenvectors corresponding to different eigenvalues are orthogonal.

Proof: Suppose

$$Q x_1 = \lambda_1 x_1 \quad \text{and} \quad Q x_2 = \lambda_2 x_2, \quad \lambda_1 \ne \lambda_2.$$

Then

$$\lambda_1 x_1^T x_2 = (\lambda_1 x_1)^T x_2 = (Q x_1)^T x_2 = x_1^T Q x_2 = x_1^T (\lambda_2 x_2) = \lambda_2 x_1^T x_2.$$

Since $\lambda_1 \ne \lambda_2$, the above equality implies that $x_1^T x_2 = 0$.
Proposition 6 If $Q$ is a symmetric matrix, then $Q$ has $n$ (distinct) eigenvectors that form an orthonormal basis for $\mathbb{R}^n$.
Proof: If all of the eigenvalues of $Q$ are distinct, then we are done, as the previous proposition provides the proof. If not, we construct eigenvectors iteratively, as follows. Let $u_1$ be a normalized (i.e., re-scaled so that its norm is 1) eigenvector of $Q$ with corresponding eigenvalue $\lambda_1$. Suppose we have $k$ mutually orthogonal normalized eigenvectors $u_1, \dots, u_k$, with corresponding eigenvalues $\lambda_1, \dots, \lambda_k$. We will now show how to construct a new eigenvector $u_{k+1}$ with eigenvalue $\lambda_{k+1}$, such that $u_{k+1}$ is orthogonal to each of the vectors $u_1, \dots, u_k$.

Let $U = [u_1, \dots, u_k] \in \mathbb{R}^{n \times k}$. Then $QU = [\lambda_1 u_1, \dots, \lambda_k u_k]$. Let $V = [v_{k+1}, \dots, v_n] \in \mathbb{R}^{n \times (n-k)}$ be a matrix composed of any $n - k$ mutually orthogonal vectors such that the $n$ vectors $u_1, \dots, u_k, v_{k+1}, \dots, v_n$ constitute an orthonormal basis for $\mathbb{R}^n$. Then note that

$$U^T V = 0$$

and

$$V^T Q U = V^T [\lambda_1 u_1, \dots, \lambda_k u_k] = 0.$$

Let $w$ be an eigenvector of $V^T Q V \in \mathbb{R}^{(n-k) \times (n-k)}$ for some eigenvalue $\lambda$, so that $V^T Q V w = \lambda w$, and let $u_{k+1} = V w$ (assume $w$ is normalized so that $u_{k+1}$ has norm 1). We now claim the following two statements are true:

(a) $U^T u_{k+1} = 0$, so that $u_{k+1}$ is orthogonal to all of the columns of $U$, and

(b) $u_{k+1}$ is an eigenvector of $Q$, and $\lambda$ is the corresponding eigenvalue of $Q$.

Note that if (a) and (b) are true, we can keep adding orthogonal vectors until $k = n$, completing the proof of the proposition.

To prove (a), simply note that $U^T u_{k+1} = U^T V w = 0 w = 0$. To prove (b), let $d = Q u_{k+1} - \lambda u_{k+1}$. We need to show that $d = 0$. Note that $d = Q V w - \lambda V w$, and so $V^T d = V^T Q V w - \lambda V^T V w = V^T Q V w - \lambda w = 0$. Therefore, $d = U r$ for some $r \in \mathbb{R}^k$, and so

$$r = U^T U r = U^T d = U^T Q V w - \lambda U^T V w = 0 - 0 = 0.$$

Therefore, $d = 0$, which completes the proof.
Proposition 7 If $Q$ is SPSD, the eigenvalues of $Q$ are nonnegative.

Proof: If $\lambda$ is an eigenvalue of $Q$, then $Qx = \lambda x$ for some $x \ne 0$. Then $0 \le x^T Q x = x^T (\lambda x) = \lambda x^T x$, whereby $\lambda \ge 0$.
Proposition 8 If $Q$ is symmetric, then $Q = R D R^T$, where $R$ is an orthonormal matrix, the columns of $R$ are an orthonormal basis of eigenvectors of $Q$, and $D$ is a diagonal matrix of the corresponding eigenvalues of $Q$.

Proof: Let $R = [u_1, \dots, u_n]$, where $u_1, \dots, u_n$ are the $n$ orthonormal eigenvectors of $Q$, and let $D = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, where $\lambda_1, \dots, \lambda_n$ are the corresponding eigenvalues. Then

$$(R^T R)_{ij} = u_i^T u_j = \begin{cases} 0 & \text{if } i \ne j, \\ 1 & \text{if } i = j, \end{cases}$$

so $R^T R = I$, i.e., $R^T = R^{-1}$.

Note that $R^T u_i = e_i$, $i = 1, \dots, n$ (here, $e_i$ is the $i$th unit vector). Therefore,

$$R^T Q R = R^T Q [u_1, \dots, u_n] = R^T [\lambda_1 u_1, \dots, \lambda_n u_n] = [\lambda_1 e_1, \dots, \lambda_n e_n] = D.$$

Thus $Q = (R^T)^{-1} D R^{-1} = R D R^T$.
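In code, Proposition 8 is exactly what a symmetric eigensolver returns. A short sketch verifying the decomposition, here on a matrix with a repeated eigenvalue (where Proposition 5 alone would not suffice):

```python
import numpy as np

# Symmetric matrix with eigenvalues 2, 2, 4 (the eigenvalue 2 is repeated).
Q = np.array([[3.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [1.0, 0.0, 3.0]])

lam, R = np.linalg.eigh(Q)  # eigenvalues and orthonormal eigenvectors
D = np.diag(lam)

print(np.allclose(R.T @ R, np.eye(3)))  # R is orthonormal
print(np.allclose(R @ D @ R.T, Q))      # Q = R D R^T
```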


Proposition 9 If $Q$ is SPSD, then $Q = M^T M$ for some matrix $M$.

Proof: $Q = R D R^T = R D^{\frac{1}{2}} D^{\frac{1}{2}} R^T = M^T M$, where $M = D^{\frac{1}{2}} R^T$ (here $D^{\frac{1}{2}}$ exists because the eigenvalues of $Q$ are nonnegative by Proposition 7).
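A short numeric sketch of this factorization (illustrative only, with an arbitrary example matrix): build $M = D^{\frac{1}{2}} R^T$ from the eigendecomposition and confirm $M^T M = Q$.

```python
import numpy as np

Q = np.array([[2.0, 1.0], [1.0, 2.0]])  # SPSD (in fact SPD) example
lam, R = np.linalg.eigh(Q)
M = np.diag(np.sqrt(lam)) @ R.T         # M = D^{1/2} R^T

print(np.allclose(M.T @ M, Q))          # True: Q = M^T M
```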

Proposition 10 If $Q$ is SPSD, then $x^T Q x = 0$ implies $Qx = 0$.

Proof: Writing $Q = M^T M$ as in Proposition 9,

$$0 = x^T Q x = x^T M^T M x = (Mx)^T (Mx) = \|Mx\|^2 \;\Rightarrow\; Mx = 0 \;\Rightarrow\; Qx = M^T M x = 0.$$
Proposition 11 Suppose $Q$ is symmetric. Then $Q \succeq 0$ and nonsingular if and only if $Q \succ 0$.

Proof: ($\Rightarrow$) Suppose $x \ne 0$. Then $x^T Q x \ge 0$. If $x^T Q x = 0$, then $Qx = 0$ by Proposition 10, which is a contradiction since $Q$ is nonsingular. Thus $x^T Q x > 0$, and so $Q$ is positive definite.

($\Leftarrow$) Clearly, if $Q \succ 0$, then $Q \succeq 0$. If $Q$ were singular, then $Qx = 0$ would have a solution $x \ne 0$, whereby $x^T Q x = 0$ for this $x \ne 0$, and so $Q$ would not be positive definite, which is a contradiction.

4 Additional Properties of SPD Matrices

Proposition 12 If $Q \succ 0$ (respectively, $Q \succeq 0$), then any principal submatrix of $Q$ is positive definite (respectively, positive semidefinite).

Proof: Follows directly by restricting $x^T Q x$ to vectors $x$ that are zero outside the index set of the submatrix.
Proposition 13 Suppose $Q$ is symmetric. If $Q \succ 0$ and

$$M = \begin{pmatrix} Q & c \\ c^T & b \end{pmatrix},$$

then $M \succ 0$ if and only if $b > c^T Q^{-1} c$.

Proof: Suppose $b \le c^T Q^{-1} c$. Let $x = (-c^T Q^{-1}, 1)^T$. Then

$$x^T M x = c^T Q^{-1} c - 2 c^T Q^{-1} c + b \le 0.$$

Thus $M$ is not positive definite.

Conversely, suppose $b > c^T Q^{-1} c$. Let $x = (y, z)$. Then $x^T M x = y^T Q y + 2 z c^T y + b z^2$. If $x \ne 0$ and $z = 0$, then $x^T M x = y^T Q y > 0$, since $Q \succ 0$. If $z \ne 0$, we can assume without loss of generality that $z = 1$, and so $x^T M x = y^T Q y + 2 c^T y + b$. The value of $y$ that minimizes this form is $y = -Q^{-1} c$, and at this point, $y^T Q y + 2 c^T y + b = -c^T Q^{-1} c + b > 0$, and so $M$ is positive definite.
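A numerical check of Proposition 13 (an illustrative sketch with arbitrary data): build the bordered matrix for values of $b$ on both sides of the threshold $c^T Q^{-1} c$ and test definiteness via eigenvalues.

```python
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])  # Q is SPD
c = np.array([1.0, -1.0])
threshold = c @ np.linalg.solve(Q, c)   # c^T Q^{-1} c

def is_pd(M):
    return np.all(np.linalg.eigvalsh(M) > 0)

for b in (threshold - 0.1, threshold + 0.1):
    M = np.block([[Q, c[:, None]], [c[None, :], np.array([[b]])]])
    print(b > threshold, is_pd(M))  # the two flags always agree
```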

The $k$th leading principal minor of a matrix $M$ is the determinant of the submatrix of $M$ corresponding to the first $k$ indices of columns and rows.

Proposition 14 Suppose $Q$ is a symmetric matrix. Then $Q$ is positive definite if and only if all leading principal minors of $Q$ are positive.
Proof: If $Q \succ 0$, then any leading principal submatrix of $Q$ is a matrix $M$, where

$$Q = \begin{pmatrix} M & N \\ N^T & P \end{pmatrix},$$

and $M$ must be SPD by Proposition 12. Therefore $M = R D R^T = R D R^{-1}$ (where $R$ is orthonormal and $D$ is diagonal with positive diagonal entries), and $\det(M) = \det(D) > 0$.

Conversely, suppose all leading principal minors are positive. If $n = 1$, then $Q \succ 0$. If $n > 1$, by induction, suppose that the statement is true for $k = n - 1$. Then for $k = n$,

$$Q = \begin{pmatrix} M & c \\ c^T & b \end{pmatrix},$$

where $M \in \mathbb{R}^{(n-1) \times (n-1)}$ and $M$ has all of its leading principal minors positive, so $M \succ 0$ by the induction hypothesis. Therefore, $M = T^T T$ for some nonsingular $T$ (for example, $T = D^{\frac{1}{2}} R^T$ from Proposition 8, which is nonsingular since the eigenvalues of $M$ are positive). Thus

$$Q = \begin{pmatrix} T^T T & c \\ c^T & b \end{pmatrix}.$$

Let

$$F = \begin{pmatrix} (T^T)^{-1} & 0 \\ -c^T (T^T T)^{-1} & 1 \end{pmatrix}.$$

Then

$$F Q F^T = \begin{pmatrix} (T^T)^{-1} & 0 \\ -c^T (T^T T)^{-1} & 1 \end{pmatrix} \begin{pmatrix} T^T T & c \\ c^T & b \end{pmatrix} \begin{pmatrix} T^{-1} & -(T^T T)^{-1} c \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} I & 0 \\ 0 & b - c^T (T^T T)^{-1} c \end{pmatrix}.$$

Then $\det Q \cdot \det(F)^2 = \det(F Q F^T) = b - c^T (T^T T)^{-1} c$, and since $\det Q > 0$ this implies $b - c^T (T^T T)^{-1} c > 0$, and so $Q \succ 0$ from Proposition 13.
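Proposition 14 gives a finite determinant test for positive definiteness. A small sketch comparing the leading-principal-minor test with an eigenvalue test on example matrices:

```python
import numpy as np

def pd_by_minors(Q):
    # Proposition 14: symmetric Q is PD iff all leading principal minors are > 0.
    n = Q.shape[0]
    return all(np.linalg.det(Q[:k, :k]) > 0 for k in range(1, n + 1))

def pd_by_eigs(Q):
    return np.all(np.linalg.eigvalsh(Q) > 0)

A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])        # classic PD tridiagonal matrix
B = np.array([[1.0, 2.0], [2.0, 1.0]])  # indefinite: eigenvalues 3 and -1

for Q in (A, B):
    print(pd_by_minors(Q), pd_by_eigs(Q))  # the two tests agree
```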

5 Quadratic Forms Exercises


1. Suppose that $M \succ 0$. Show that $M^{-1}$ exists and that $M^{-1} \succ 0$.

2. Suppose that $M \succeq 0$. Show that there exists a matrix $N$ satisfying $N \succeq 0$ and $N^2 := NN = M$. Such a matrix $N$ is called a square root of $M$ and is written as $M^{\frac{1}{2}}$.
3. Let $\|v\|$ denote the usual Euclidean norm of a vector, namely $\|v\| := \sqrt{v^T v}$. The operator norm of a matrix $M$ is defined as follows:

$$\|M\| := \max_x \{\|Mx\| \mid \|x\| = 1\}.$$

Prove the following two propositions:

Proposition 1: If $M$ is $n \times n$ and symmetric, then

$$\|M\| = \max\{|\lambda| \mid \lambda \text{ is an eigenvalue of } M\}.$$

Proposition 2: If $M$ is $m \times n$ with $m < n$ and $M$ has rank $m$, then

$$\|M\| = \sqrt{\lambda_{\max}(M M^T)},$$

where $\lambda_{\max}(A)$ denotes the largest eigenvalue of a matrix $A$.
4. Let $\|v\|$ denote the usual Euclidean norm of a vector, namely $\|v\| := \sqrt{v^T v}$. The operator norm of a matrix $M$ is defined as follows:

$$\|M\| := \max_x \{\|Mx\| \mid \|x\| = 1\}.$$

Prove the following proposition:

Proposition: Suppose that $M$ is a symmetric matrix. Then the following are equivalent:

(a) $h > 0$ satisfies $\|M^{-1}\| \le \frac{1}{h}$;

(b) $h > 0$ satisfies $\|Mv\| \ge h \|v\|$ for any vector $v$;

(c) $h > 0$ satisfies $|\lambda_i(M)| \ge h$ for every eigenvalue $\lambda_i(M)$ of $M$, $i = 1, \dots, n$.

5. Let $Q \succeq 0$ and let $S := \{x \mid x^T Q x \le 1\}$. Prove that $S$ is a closed convex set.

6. Let $Q \succeq 0$ and let $S := \{x \mid x^T Q x \le 1\}$. Let $\lambda_i$ be a nonzero eigenvalue of $Q$ and let $u_i$ be a corresponding eigenvector normalized so that $\|u_i\|^2 = 1$. Let $a_i := \frac{u_i}{\sqrt{\lambda_i}}$. Prove that $a_i \in S$ and $-a_i \in S$.
7. Let $Q \succ 0$ and consider the problem:

$$\text{(P)}: \quad z^* = \max_x \; c^T x \quad \text{s.t.} \quad x^T Q x \le 1.$$

Prove that the unique optimal solution of (P) is:

$$x^* = \frac{Q^{-1} c}{\sqrt{c^T Q^{-1} c}},$$

with optimal objective function value

$$z^* = \sqrt{c^T Q^{-1} c}.$$
8. Let $Q \succ 0$ and consider the problem:

$$\text{(P)}: \quad z^* = \max_x \; c^T x \quad \text{s.t.} \quad x^T Q x \le 1.$$

For what values of $c$ will it be true that the optimal solution of (P) is equal to $c$? (Hint: think eigenvectors.)
9. Let $Q \succeq 0$ and let $S := \{x \mid x^T Q x \le 1\}$. Let the eigendecomposition of $Q$ be $Q = R D R^T$, where $R$ is orthonormal and $D$ is diagonal with diagonal entries $\lambda_1, \dots, \lambda_n$. Prove that $x \in S$ if and only if $x = Rv$ for some vector $v$ satisfying

$$\sum_{j=1}^n \lambda_j v_j^2 \le 1.$$
10. Prove the following:

Diagonal Dominance Theorem: Suppose that $M$ is symmetric and that for each $i = 1, \dots, n$, we have:

$$M_{ii} \ge \sum_{j \ne i} |M_{ij}|.$$

Then $M$ is positive semidefinite. Furthermore, if the inequalities above are all strict, then $M$ is positive definite.
11. A function $f(\cdot): \mathbb{R}^n \to \mathbb{R}$ is a norm if:

(i) $f(x) \ge 0$ for any $x$, and $f(x) = 0$ if and only if $x = 0$;

(ii) $f(\alpha x) = |\alpha| f(x)$ for any $x$ and any $\alpha \in \mathbb{R}$; and

(iii) $f(x + y) \le f(x) + f(y)$.

Define $f_Q(x) = \sqrt{x^T Q x}$. Prove that $f_Q(x)$ is a norm if and only if $Q$ is positive definite.
12. If $Q$ is positive semidefinite, under what conditions (on $Q$ and $c$) will $f(x) = \tfrac{1}{2} x^T Q x + c^T x$ attain its minimum over all $x \in \mathbb{R}^n$? Under what conditions will it be unbounded over all $x \in \mathbb{R}^n$?
13. Consider the problem of minimizing $f(x) = \tfrac{1}{2} x^T Q x + c^T x$ subject to $Ax = b$. When will this program have an optimal solution? When not?

14. Prove that if $Q$ is symmetric and all its eigenvalues are nonnegative, then $Q$ is positive semidefinite.


15. Let $Q = \begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix}$. Note that $\lambda_1 = 1$ and $\lambda_2 = 2$ are the eigenvalues of $Q$, but that $x^T Q x < 0$ for $x = (2, -3)^T$. Why does this not contradict the result of the previous exercise?

16. A quadratic form of the type $g(y) = \sum_{j=1}^p y_j^2 + \sum_{j=p+1}^n d_j y_j + d_{n+1}$ is a separable hybrid of a quadratic and linear form, as $g(y)$ is quadratic in the first $p$ components of $y$ and linear (and separable) in the remaining $n - p$ components. Show that if $f(x) = \tfrac{1}{2} x^T Q x + c^T x$ where $Q$ is positive semidefinite, then there is an invertible linear transformation $y = T(x) = Fx + g$ such that $f(x) = g(y)$ and $g(y)$ is a separable hybrid, i.e., there is an index $p$, a nonsingular matrix $F$, a vector $g$, and constants $d_{p+1}, \dots, d_{n+1}$ such that

$$g(y) = \sum_{j=1}^p (Fx + g)_j^2 + \sum_{j=p+1}^n d_j (Fx + g)_j + d_{n+1} = f(x).$$

17. An $n \times n$ matrix $P$ is called a projection matrix if $P^T = P$ and $PP = P$. Prove that if $P$ is a projection matrix, then:

a. $I - P$ is a projection matrix.

b. $P$ is positive semidefinite.

c. $\|Px\| \le \|x\|$ for any $x$, where $\|\cdot\|$ is the Euclidean norm.
18. Let us denote the largest eigenvalue of a symmetric matrix $M$ by $\lambda_{\max}(M)$. Consider the program

$$\text{(Q)}: \quad z^* = \max_x \; x^T M x \quad \text{s.t.} \quad \|x\| = 1,$$

where $M$ is a symmetric matrix. Prove that $z^* = \lambda_{\max}(M)$.
19. Let us denote the smallest eigenvalue of a symmetric matrix $M$ by $\lambda_{\min}(M)$. Consider the program

$$\text{(P)}: \quad z^* = \min_x \; x^T M x \quad \text{s.t.} \quad \|x\| = 1,$$

where $M$ is a symmetric matrix. Prove that $z^* = \lambda_{\min}(M)$.




20. Consider the matrix

$$M = \begin{pmatrix} A & B \\ B^T & C \end{pmatrix},$$

where $A$ and $C$ are symmetric matrices and $A$ is nonsingular. Prove that $M$ is positive semidefinite if and only if $C - B^T A^{-1} B$ is positive semidefinite.
