Notes On Matrices
University of Essex
Dr Gordon Kemp
Session 2016/2017
Autumn Term
Contents: Matrix Functions; Matrix Calculus; Multivariate Optimization
These notes are intended as a supplement to the lecture notes, not as a substitute for the lecture notes or textbooks. The notation used in these notes may differ from that used in the lecture notes and/or textbooks.
(e) Matrix Subtraction: Let A = [a_ij] and B = [b_ij] both be (n × m) matrices; then (A − B) is the (n × m) matrix whose (i, j)-th element is a_ij − b_ij. Hence (−B) + B = 0, so (−B) = (−1)B.
(f) Multiplication of a Vector by a Matrix: Let A = [a_ij] be an (n × m) matrix and x be an (m × 1) vector; then Ax is the (n × 1) vector whose i-th element is Σ_{j=1}^m a_ij x_j.
(g) Matrix Multiplication: Let A = [a_ij] and B = [b_ij] be (n × m) and (m × p) matrices respectively; then AB is the (n × p) matrix whose (i, k)-th element is Σ_{j=1}^m a_ij b_jk. Note:
(i) BA will only exist if n = p.
(ii) BA need not be equal to AB even if n = m = p. Thus the order of multiplication is very important.
(iii) A(B + C) = AB + AC, while (A + B)C = AC + BC.
(iv) ABC = (AB)C = A(BC).
(h) Let A = [a_ij] be an (n × m) matrix; then the transpose of A, denoted A′, is the (m × n) matrix whose (j, i)-th element is given by a_ij. Observe that (A′)′ = A.
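As a quick numerical illustration of (f)-(h), here is a minimal numpy sketch; the particular matrices are arbitrary examples, not taken from the notes:

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[0., 1.], [1., 0.]])
x = np.array([1., -1.])

print(A @ x)                  # Ax: matrix-vector product, an (n x 1) vector
print(A @ B)                  # AB
print(B @ A)                  # BA: different from AB in general
print(np.allclose(A.T.T, A))  # (A')' = A  ->  True
```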
Matrix Functions
(a) Let A = [a_ij] be an (n × n) matrix, i.e. a square matrix; then the trace of A, denoted tr(A), is a scalar equal to Σ_{i=1}^n a_ii, i.e., the sum of the elements on the main diagonal of A. Observe that:
(i) tr(A′) = tr(A).
(ii) Let C = [c_ij] and D = [d_ij] be (n × p) and (p × n) matrices respectively; then:

    tr(CD) = Σ_{i=1}^n Σ_{j=1}^p c_ij d_ji = Σ_{j=1}^p Σ_{i=1}^n d_ji c_ij = tr(DC).
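A one-line numerical check of property (ii), using arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.standard_normal((3, 5))   # (n x p)
D = rng.standard_normal((5, 3))   # (p x n)

# tr(CD) = tr(DC) even though CD is (3 x 3) and DC is (5 x 5)
print(np.trace(C @ D), np.trace(D @ C))
```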
(c) The (n × 1) vectors x_1, . . . , x_q are linearly independent if and only if Σ_{i=1}^q λ_i x_i = 0 implies that λ_1 = . . . = λ_q = 0.
(d) The rank of an (n × m) matrix A, denoted rank(A), is equal to q if the largest collection of linearly independent columns of A has q members. Note that this is also equal to the number of vectors in the largest collection of linearly independent rows of A. Note:
(i) rank(A′) = rank(A);
(ii) if A is a non-singular (n × n) matrix then rank(A) = n; and
(iii) if A is a singular (n × n) matrix then rank(A) < n.
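A small numerical illustration of (d), using a matrix with a repeated column (an arbitrary example):

```python
import numpy as np

# Third column equals the first, so only two columns are independent.
A = np.array([[1., 0., 1.],
              [0., 1., 0.],
              [2., 3., 2.]])

print(np.linalg.matrix_rank(A))    # 2
print(np.linalg.matrix_rank(A.T))  # 2: rank(A') = rank(A)
```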
(e) Let A = [a_ij] be a non-singular (n × n) matrix; then the inverse of A, denoted A^{-1}, exists, is unique, and satisfies AA^{-1} = A^{-1}A = I_n. Observe that:
(i) |A^{-1}| = 1/|A|, provided A is non-singular.
(ii) (AB)^{-1} = B^{-1}A^{-1}.
(iii) (A′)^{-1} = (A^{-1})′.
(iv) (A^{-1})^{-1} = A.
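A quick numpy check of properties (i)-(iv), again on arbitrary random matrices (which are non-singular with probability one):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

Ainv = np.linalg.inv(A)
print(np.isclose(np.linalg.det(Ainv), 1 / np.linalg.det(A)))       # (i)
print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ Ainv))  # (ii)
print(np.allclose(np.linalg.inv(A.T), Ainv.T))                     # (iii)
print(np.allclose(np.linalg.inv(Ainv), A))                         # (iv)
```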
(d) If X is an (n × K) matrix then (I_n + XX′)^{-1} = I_n − X(I_K + X′X)^{-1}X′.
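A numerical check of this inverse identity; the left-hand side above is a completion of a truncated line, assuming the standard Woodbury-type expansion that the right-hand side matches:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 5, 2
X = rng.standard_normal((n, K))

lhs = np.linalg.inv(np.eye(n) + X @ X.T)
rhs = np.eye(n) - X @ np.linalg.inv(np.eye(K) + X.T @ X) @ X.T
print(np.allclose(lhs, rhs))  # True
```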
(e) The determinant of a square matrix is equal to the product of its eigenvalues, provided that
it has a full complement of eigenvalues (some of which may be repeated).
(f) If A is an idempotent matrix (see Item 2 above) then all of the eigenvalues of A are equal to either 0 or 1. It follows that the trace of an idempotent matrix is equal to its rank and that if an idempotent matrix is non-singular then it is an identity matrix.
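An illustration of (f) using a projection matrix, which is idempotent; the particular X below is an arbitrary example:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((5, 2))

# Projection onto the column space of X: P = X (X'X)^{-1} X', with PP = P.
P = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(P @ P, P))                              # idempotent
print(np.round(np.linalg.eigvalsh(P), 6))                 # eigenvalues are 0 or 1
print(np.isclose(np.trace(P), np.linalg.matrix_rank(P)))  # tr(P) = rank(P) = 2
```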
Matrix Calculus
(a) Let f(X) be a scalar function of the (n × m) matrix X = [x_ij]; then ∂f(X)/∂X is an (n × m) matrix whose (i, j)-th element is ∂f(X)/∂x_ij.
(b) Product Rules
(i) Let X and Y be respectively (n × m) and (m × p) matrix functions of the (q × 1) vector θ; then:

    ∂(XY)/∂θ_i = (∂X/∂θ_i)Y + X(∂Y/∂θ_i).

(ii) Let α and X be respectively a scalar function and an (n × m) matrix function of the (q × 1) vector θ; then:

    ∂(αX)/∂θ_i = (∂α/∂θ_i)X + α(∂X/∂θ_i).
(c) Chain Rule
Let f(·) be a scalar function of an (n × m) matrix argument and G(·) be an (n × m) matrix function of a (p × 1) vector argument θ; then:

    ∂{f(G(θ))}/∂θ_i = tr{ [∂f(G)/∂G |_{G=G(θ)}]′ (∂G(θ)/∂θ_i) }.
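A finite-difference sanity check of the chain rule, using f(X) = tr(X′X), for which ∂f/∂X = 2X, and an arbitrary smooth G(θ); all the specific functions below are illustrative choices, not from the notes:

```python
import numpy as np

def f(X):
    return np.trace(X.T @ X)   # f(X) = tr(X'X), so df/dX = 2X

def G(theta):
    # An arbitrary smooth (2 x 2) matrix function of a (2 x 1) theta.
    return np.array([[np.sin(theta[0]), theta[0] * theta[1]],
                     [np.exp(theta[1]), theta[0] ** 2]])

theta = np.array([0.3, -0.7])

# dG/dtheta_1 worked out by hand for this particular G.
dG1 = np.array([[np.cos(theta[0]), theta[1]],
                [0.0,              2 * theta[0]]])

analytic = np.trace((2 * G(theta)).T @ dG1)  # tr([df/dG]' dG/dtheta_1)

h = 1e-6
e1 = np.array([1.0, 0.0])
numeric = (f(G(theta + h * e1)) - f(G(theta - h * e1))) / (2 * h)
print(analytic, numeric)   # agree to roughly 1e-9
```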
(d) Let a be an (n × 1) vector of constants and x be an (n × 1) vector; then ∂(a′x)/∂x = a.
(e) Let A be an (n × m) matrix of constants and x be an (m × 1) vector; then ∂(Ax)/∂x′ = A.
(f) Let A be an (n × n) matrix of constants and x be an (n × 1) vector; then ∂(x′Ax)/∂x = (A + A′)x.
(g) Let A be an (n × m) matrix of constants and X be an (m × n) matrix; then ∂tr(AX)/∂X = A′.
(h) Let X be an (n × n) matrix; then ∂|X|/∂X = |X|(X′)^{-1} and ∂ ln|X|/∂X = (X′)^{-1}.
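A finite-difference check of (f) and (h), on arbitrary random inputs:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # safely non-singular
h = 1e-6

# (f): d(x'Ax)/dx = (A + A')x, checked element by element.
grad = np.array([((x + h*e) @ A @ (x + h*e) - (x - h*e) @ A @ (x - h*e)) / (2*h)
                 for e in np.eye(n)])
print(np.allclose(grad, (A + A.T) @ x))

# (h): d ln|X| / dX = (X')^{-1}, checked via the (0, 0) element.
def logdet(M):
    return np.linalg.slogdet(M)[1]

E00 = np.zeros((n, n)); E00[0, 0] = 1.0
num = (logdet(X + h*E00) - logdet(X - h*E00)) / (2*h)
print(np.isclose(num, np.linalg.inv(X.T)[0, 0]))
```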
(i) Let X be a non-singular (n × n) matrix whose elements are functions of the (q × 1) vector θ; then ∂X^{-1}/∂θ_i = −X^{-1}(∂X/∂θ_i)X^{-1} for i = 1, 2, . . . , q.
(j) Let A be an (n × n) matrix of constants and X be an (n × n) matrix whose elements are functions of the (q × 1) vector θ; then ∂tr(AX^{-1})/∂θ_i = −tr(AX^{-1}(∂X/∂θ_i)X^{-1}) for i = 1, 2, . . . , q.
(k) Let A be an (n × n) matrix of constants and X be a non-singular (n × n) matrix; then ∂tr(AX^{-1})/∂X = −(X^{-1}AX^{-1})′.
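A quick numerical check of (k), and implicitly of (i), differentiating with respect to a single element of X:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
A = rng.standard_normal((n, n))
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))
h = 1e-6

# Perturb element (0, 1) of X and compare with -(X^{-1} A X^{-1})'.
E = np.zeros((n, n)); E[0, 1] = 1.0
num = (np.trace(A @ np.linalg.inv(X + h*E)) -
       np.trace(A @ np.linalg.inv(X - h*E))) / (2*h)
Xinv = np.linalg.inv(X)
print(np.isclose(num, -(Xinv @ A @ Xinv).T[0, 1]))
```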
(l) Let A and B be (n × n) and (m × m) matrices of constants respectively and X be an (m × n) matrix; then ∂tr(XAX′B)/∂X = BXA + B′XA′.
(a) Let A = [a_ij] be an (n × m) matrix and B be a (p × q) matrix; then the Kronecker product A ⊗ B is the (np × mq) block matrix whose (i, j)-th block is a_ij B:

    A ⊗ B = [ a_11 B  . . .  a_1m B
                ..      ..     ..
              a_n1 B  . . .  a_nm B ].
(b) (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD), provided that the matrices involved are conformable.
(c) tr(A ⊗ B) = tr(A) tr(B).
(d) Let A be an (n × n) matrix and B be a (p × p) matrix; then |A ⊗ B| = |A|^p |B|^n.
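These Kronecker-product properties are easy to confirm with numpy.kron; the matrices below are random examples with n = 2 and p = 3:

```python
import numpy as np

rng = np.random.default_rng(6)
A, C = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
B, D = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

print(np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D)))  # (b)
print(np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B)))     # (c)
print(np.isclose(np.linalg.det(np.kron(A, B)),
                 np.linalg.det(A)**3 * np.linalg.det(B)**2))              # (d)
```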
(e) Let A = [a_ij] be an (n × m) matrix; then vec(A) is the (nm × 1) vector formed by stacking the columns of A on top of one another:

    vec(A) = (a_11, . . . , a_n1, a_12, . . . , a_n2, . . . , a_1m, . . . , a_nm)′.
(f) vec(ABC) = (C′ ⊗ A) vec(B).
(g) tr(A′B) = vec(A)′ vec(B).
(h) If λ_i is an eigenvalue of A and μ_j is an eigenvalue of B then λ_i μ_j is an eigenvalue of (A ⊗ B).
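A check of (f) and (g); note that vec stacks columns, which in numpy (row-major by default) is ravel(order="F"):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 2))

vec = lambda M: M.ravel(order="F")  # stack the columns of M

print(np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B)))  # (f)
D = rng.standard_normal((2, 3))
print(np.isclose(np.trace(A.T @ D), vec(A) @ vec(D)))         # (g)
```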
Multivariate Optimization
(a) Unconstrained Optimization
(i) First-Order Conditions (FOC):
A necessary condition for x* to be a local minimum of the function f(·) with respect to the (p × 1) vector x is that x* is a turning point, i.e. that:

    F(x*) ≡ [∂f(x)/∂x]_{x=x*} = 0.
(ii) Second-Order Conditions (SOC):
A set of sufficient conditions for x* to be a local minimum of the function f(·) with respect to x are that x* is a turning point (the FOC are satisfied) and that the Hessian:

    H(x*) ≡ [∂²f(x)/∂x∂x′]_{x=x*}

is positive definite. If the FOC are satisfied at x* but H(x*) is negative definite then f(·) has a strict local maximum at x*, while if the FOC are satisfied at x* but H(x*) is indefinite and non-singular then f(·) has a saddlepoint at x*.
(iii) Convexity
If the objective function f(·) is globally convex in x then any solution to the FOC is a global minimum of f(·), as the sketch below illustrates for a quadratic.
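As a concrete instance of (i)-(iii), take the strictly convex quadratic f(x) = ½x′Ax − b′x with A symmetric positive definite; the FOC give Ax* − b = 0, so x* = A^{-1}b, and the Hessian is A. The particular A and b below are arbitrary:

```python
import numpy as np

A = np.array([[3., 1.], [1., 2.]])   # symmetric positive definite Hessian
b = np.array([1., -1.])

# FOC: grad f(x) = A x - b = 0  =>  x* = A^{-1} b
x_star = np.linalg.solve(A, b)
print(x_star)

# SOC: the Hessian A is positive definite (all eigenvalues > 0)
print(np.linalg.eigvalsh(A) > 0)
```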
(b) Equality Constrained Optimization
Suppose that we wish to find a local minimum of the function f(·) with respect to the (p × 1) vector x subject to the (q × 1) vector of constraints g(x) = 0. Write down the Lagrangian function:

    L(x, λ) = f(x) − λ′g(x),

where λ is a (q × 1) vector.
(i) FOC
Suppose that x* is a turning point of the function f(·) with respect to x subject to the constraints that g(x) = 0. Also suppose that G(x*) = [∂g(x)/∂x′; x = x*] has rank equal to q. Then there exist unique Lagrange multipliers λ* such that:

    [∂L(x, λ)/∂x]_{x=x*, λ=λ*} = F(x*) − G(x*)′λ* = 0,

    [∂L(x, λ)/∂λ]_{x=x*, λ=λ*} = −g(x*) = 0.
(ii) SOC
Suppose that x* satisfies the Lagrange Condition and the condition that [∂g(x)/∂x′; x = x*] has rank equal to q. Then a sufficient set of conditions for x* to be a local minimum of the function f(·) with respect to x subject to the constraints that g(x) = 0 is that the matrix of second derivatives of L(x, λ) with respect to x, evaluated at (x = x*, λ = λ*), is positive definite with respect to all vectors c ≠ 0 such that G(x*)c = 0. Note that this is equivalent to the bordered-Hessian condition.
(iii) Convexity
If the objective function f(·) is globally convex in x and the constraint functions g(·) are linear then any x* which satisfies the Lagrange Condition is a global minimum.
(iv) Lagrange Multipliers
Suppose the g(x) function in the constraints can be written as g(x) = g_0(x) − c. Then:

    λ* = ∂V(c)/∂c,

where:

    V(c) = min_x f(x)  s.t.  g_0(x) = c.
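A minimal numerical illustration of (b), minimizing the convex quadratic f(x) = ½x′x subject to the single linear constraint a′x = c. Here the Lagrange conditions form a linear (KKT) system, and V(c) = ½c²/(a′a), so ∂V/∂c = c/(a′a) should equal λ*; all the specific numbers are illustrative:

```python
import numpy as np

a = np.array([1., 2.])   # constraint vector: a'x = c
c = 3.0

# KKT system for min (1/2) x'x  s.t.  a'x = c, with L = f - lam * (a'x - c):
#   x - lam * a = 0,   a'x = c
K = np.block([[np.eye(2), -a[:, None]],
              [a[None, :], np.zeros((1, 1))]])
sol = np.linalg.solve(K, np.array([0., 0., c]))
x_star, lam_star = sol[:2], sol[2]

# Envelope property (iv): lam* = dV/dc = c / (a'a) for this problem.
print(x_star, lam_star, c / (a @ a))   # lam* matches c/(a'a) = 0.6
```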