Notes On Matrices


Department of Economics

University of Essex
Dr Gordon Kemp

Session 2016/2017
Autumn Term

EC966: Estimation and Inference in Econometrics


Notes on Vectors and Matrices [1]
Contents

1 Basic Matrix Operations
2 Special Types of Matrices
3 Matrix Functions
4 Eigenvalues and Eigenvectors
5 Matrix Calculus
6 Kronecker Products and the Vec Operator
7 Multivariate Optimization

These notes are intended as a supplement to the lecture notes, not as a substitute for lecture
notes or textbooks. The notation used in these notes may differ from that used in the lecture notes
and/or textbooks.

1 Basic Matrix Operations


(a) A (real) $(n \times m)$ matrix is a two-dimensional array of real numbers with $n$ rows and $m$
columns. $A = [a_{ij}]$ denotes a matrix whose $(i,j)$-th element is $a_{ij}$. In all of what follows we
will only consider real matrices, i.e. matrices whose elements are real numbers.
(b) A column vector is a matrix with only one column while a row vector is a matrix with only
one row. Note that a matrix with only one row and one column, i.e. a matrix with only one
element, is a scalar.
 
 
(c) Matrix Addition: Let both $A = [a_{ij}]$ and $B = [b_{ij}]$ be $(n \times m)$ matrices; then $(A + B)$ is the
$(n \times m)$ matrix whose $(i,j)$-th element is $a_{ij} + b_{ij}$. Observe that if $A$, $B$ and $C$ are $(n \times m)$
matrices then $A + B + C = A + (B + C) = (A + B) + C$ and also that $(A + B) = (B + A)$.
(d) Multiplication of a Matrix by a Scalar: Let $A = [a_{ij}]$ be an $(n \times m)$ matrix and $\alpha$ be a
scalar (i.e., a real number); then $\alpha A$ is the $(n \times m)$ matrix whose $(i,j)$-th element is $\alpha a_{ij}$.
[1] © Gordon C.R. Kemp, 2000-2016.

 
 
(e) Matrix Subtraction: Let both $A = [a_{ij}]$ and $B = [b_{ij}]$ be $(n \times m)$ matrices; then $(A - B)$
is the $(n \times m)$ matrix whose $(i,j)$-th element is $a_{ij} - b_{ij}$. Hence $(-B) + B = 0$ so $(-B) = (-1)B$.
(f) Multiplication of a Vector by a Matrix: Let $A = [a_{ij}]$ be an $(n \times m)$ matrix and $x$ be an
$(m \times 1)$ vector; then $Ax$ is the $(n \times 1)$ vector whose $i$-th element is $\sum_{j=1}^{m} a_{ij} x_j$.
(g) Matrix Multiplication: Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be $(n \times m)$ and $(m \times p)$ matrices
respectively; then $AB$ is the $(n \times p)$ matrix whose $(i,k)$-th element is $\sum_{j=1}^{m} a_{ij} b_{jk}$. Note:
(i) BA will only exist if n = p.
(ii) BA need not be equal to AB even if n = m = p. Thus the order of multiplication is very
important.
(iii) A (B + C) = AB + AC while (A + B) C = AC + BC.
(iv) ABC = (AB) C = A (BC) .
 
(h) Let $A = [a_{ij}]$ be an $(n \times m)$ matrix; then the transpose of $A$, denoted $A'$, is the $(m \times n)$
matrix whose $(j,i)$-th element is given by $a_{ij}$. Observe that $(A')' = A$.
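
The element-wise definitions in this section can be checked numerically. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper names `matmul` and `transpose` and the example matrices are illustrative choices, not part of the original notes.

```python
def matmul(A, B):
    """Return AB for A (n x m) and B (m x p), using (AB)_ik = sum_j a_ij * b_jk."""
    n, m, p = len(A), len(B), len(B[0])
    if any(len(row) != m for row in A):
        raise ValueError("A and B are not conformable")
    return [[sum(A[i][j] * B[j][k] for j in range(m)) for k in range(p)]
            for i in range(n)]


def transpose(A):
    """Return A', whose (j, i)-th element is a_ij."""
    return [list(col) for col in zip(*A)]


# Two (2 x 2) matrices, so both AB and BA exist.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)                            # [[2, 1], [4, 3]]
print(BA)                            # [[3, 4], [1, 2]]
print(AB == BA)                      # False: the order of multiplication matters
print(transpose(transpose(A)) == A)  # True: (A')' = A
```

Here post-multiplying by B swaps the columns of A while pre-multiplying swaps its rows, which is why the two products differ.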

2 Special Types of Matrices


(a) A square matrix is a matrix which has the same number of rows as columns.
 
(b) A diagonal matrix is an $(n \times n)$ matrix $A = [a_{ij}]$ such that $a_{ij} = 0$ for all $i \neq j$.
(c) An identity matrix is an $(n \times n)$ matrix $A = [a_{ij}]$ such that $a_{ii} = 1$ for all $i = 1, \ldots, n$ while
$a_{ij} = 0$ for all $i \neq j$. Usually an $(n \times n)$ identity matrix is denoted $I_n$.
(d) A singular matrix is an $(n \times n)$ matrix whose determinant (see Section 3 below) is equal to
0. A non-singular matrix is an $(n \times n)$ matrix whose determinant is not equal to 0.
(e) A symmetric matrix is an $(n \times n)$ matrix $A$ which is equal to its own transpose, i.e., $A' = A$.
(f) An idempotent matrix is an $(n \times n)$ matrix $A$ which is equal to its own matrix square, i.e.,
$A^2 \equiv AA = A$.
(g) The $(n \times n)$ matrix $A$ is positive definite if for any $(n \times 1)$ vector $x \neq 0$ then $x'Ax > 0$. Note
that $x \neq 0$ means that at least one element of $x$ is not equal to 0.
(i) The $(n \times n)$ matrix $A$ is negative definite if for any $(n \times 1)$ vector $x \neq 0$ then $x'Ax < 0$.
Hence if $A$ is positive definite then $(-A)$ is negative definite.
(ii) An $(n \times n)$ matrix $A$ is non-negative definite (or positive semi-definite) if for any
$(n \times 1)$ vector $x$ then $x'Ax \geq 0$.
(iii) An $(n \times n)$ matrix $A$ is non-positive definite (or negative semi-definite) if for any
$(n \times 1)$ vector $x$, $x'Ax \leq 0$.
(iv) An $(n \times n)$ matrix $A$ is indefinite if it is neither non-negative definite nor non-positive
definite.
(h) An $(n \times n)$ matrix $A$ is orthonormal if $A' = A^{-1}$. This is equivalent to $A'A = I_n$.
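
As an illustration of the definiteness definitions in item (g), the following worked example uses a particular $(2 \times 2)$ symmetric matrix. The matrix is chosen here purely for illustration and does not appear in the original notes.

```latex
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\quad\Longrightarrow\quad
x'Ax = 2x_1^2 + 2x_1 x_2 + 2x_2^2 = x_1^2 + x_2^2 + (x_1 + x_2)^2 .
```

The right-hand side is a sum of squares that vanishes only when $x_1 = x_2 = 0$, so $x'Ax > 0$ for every $x \neq 0$ and this $A$ is positive definite; by the same calculation $(-A)$ is negative definite.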

3 Matrix Functions
 
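(a) Let $A = [a_{ij}]$ be an $(n \times n)$ matrix, i.e. a square matrix; then the trace of $A$, denoted $\mathrm{tr}(A)$,
is a scalar equal to $\sum_{i=1}^{n} a_{ii}$, i.e., the sum of the elements on the main diagonal of $A$. Observe
that:
(i) $\mathrm{tr}(A') = \mathrm{tr}(A)$.
(ii) Let $C = [c_{ij}]$ and $D = [d_{ij}]$ be $(n \times p)$ and $(p \times n)$ matrices respectively; then:
$$\mathrm{tr}(CD) = \sum_{i=1}^{n} \sum_{j=1}^{p} c_{ij} d_{ji} = \sum_{j=1}^{p} \sum_{i=1}^{n} d_{ji} c_{ij} = \mathrm{tr}(DC).$$
(iii) For any scalar $\alpha$, $\mathrm{tr}(\alpha A) = \alpha\, \mathrm{tr}(A)$.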


 
(b) Let $A = [a_{ij}]$ be an $(n \times n)$ matrix. The determinant of $A$, denoted $|A|$ or $\det(A)$, is a scalar
defined as follows. If $n = 1$ then $|A|$ is equal to $a_{11}$. Otherwise, if $n > 1$ then $|A|$ is defined
recursively as follows. Let $B_{ij}$ denote the $((n-1) \times (n-1))$ matrix obtained after removing
the $i$-th row and $j$-th column of $A$; then $|A| = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} |B_{ij}|$ for each $i = 1, 2, \ldots, n$.
For example, if $A$ is $(2 \times 2)$ then $|A| = a_{11} a_{22} - a_{12} a_{21}$. Observe that:
(i) $|A'| = |A|$.
(ii) If both $A$ and $B$ are $(n \times n)$ matrices then $|AB|$ is the product of $|A|$ and $|B|$.
(iii) If $A$ is an $(n \times n)$ identity matrix then $|A| = 1$.
(iv) For any scalar $\alpha$, if $A$ is an $(n \times n)$ matrix then $|\alpha A| = \alpha^n |A|$.
(v) If $|A| = 0$ then there exists $x \neq 0$ such that $Ax = 0$.
(c) The $(n \times 1)$ vectors $x_1, \ldots, x_q$ are linearly independent if and only if $\sum_{i=1}^{q} \alpha_i x_i = 0$ implies
that $\alpha_1 = \ldots = \alpha_q = 0$.
(d) The rank of an $(n \times m)$ matrix $A$, denoted $\mathrm{rank}(A)$, is equal to $q$ if the largest collection of
linearly independent columns of $A$ has $q$ members. Note that this is also equal to the number
of vectors in the largest collection of linearly independent rows of $A$. Note:
(i) $\mathrm{rank}(A') = \mathrm{rank}(A)$;
(ii) if $A$ is a non-singular $(n \times n)$ matrix then $\mathrm{rank}(A) = n$; and
(iii) if $A$ is a singular $(n \times n)$ matrix then $\mathrm{rank}(A) < n$.
 
(e) Let $A = [a_{ij}]$ be a non-singular $(n \times n)$ matrix; then the inverse of $A$, denoted $A^{-1}$, exists
and is unique and satisfies $AA^{-1} = A^{-1}A = I_n$. Observe that:
(i) $|A^{-1}| = 1/|A|$, provided $A$ is non-singular.
(ii) $(AB)^{-1} = B^{-1}A^{-1}$.
(iii) $(A')^{-1} = (A^{-1})'$.
(iv) $(A^{-1})^{-1} = A$.
(v) For any non-zero scalar $\alpha$, i.e., $\alpha \neq 0$, $(\alpha A)^{-1} = \alpha^{-1} A^{-1}$.
(vi) If $A$ is a symmetric positive definite matrix then so too is $A^{-1}$.
(vii) If $A$ and $B$ are conformable symmetric positive definite matrices such that $(A - B)$ is
non-negative definite then $B^{-1} - A^{-1}$ is also non-negative definite.
(viii) Let $A$ and $B$ be non-singular matrices of the same size such that $(A + B)$ and $(A^{-1} + B^{-1})$
are also non-singular; then:
i. $(A + B)^{-1} = A^{-1} (A^{-1} + B^{-1})^{-1} B^{-1}$.
ii. $A^{-1} - (A + B)^{-1} = A^{-1} (A^{-1} + B^{-1})^{-1} A^{-1}$.
(ix) Let $X$ be an $(n \times K)$ matrix with rank $K$; then $(I_n + XX')^{-1} = I_n - X (I_K + X'X)^{-1} X'$.
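
The recursive definition of the determinant in item (b) translates almost directly into code. The sketch below is illustrative only, assuming matrices are lists of rows with integer entries; the helper names `minor`, `det` and `trace` and the example matrix are hypothetical and not taken from the notes. Cofactor expansion is very slow for large $n$ and is shown only because it mirrors the definition.

```python
def minor(A, i, j):
    """Return B_ij: the matrix A with row i and column j removed (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A, i=0):
    """Determinant by cofactor expansion along row i (0-based).

    With 0-based indices the sign (-1)**(i + j) has the same parity as the
    1-based sign (-1)^{i+j} used in the text.
    """
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for j in range(n))


def trace(A):
    """Sum of the elements on the main diagonal of a square matrix."""
    return sum(A[k][k] for k in range(len(A)))


A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]

# The expansion gives the same value whichever row is used.
print([det(A, i) for i in range(3)])   # [18, 18, 18]

# |A'| = |A| and |alpha A| = alpha^n |A| with alpha = 2, n = 3.
A_t = [list(col) for col in zip(*A)]
A_scaled = [[2 * a for a in row] for row in A]
print(det(A_t) == det(A), det(A_scaled) == 2 ** 3 * det(A))
print(trace(A))                        # 9
```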

4 Eigenvalues and Eigenvectors


(a) The $(n \times 1)$ vector $x \neq 0$ is an eigenvector (also called a characteristic vector) of the $(n \times n)$
matrix $A$ with associated eigenvalue $\lambda$ (also called a characteristic value) if $Ax = \lambda x$. This
implies that $|A - \lambda I_n| = 0$. It follows that if $A$ is non-singular, i.e., $|A| \neq 0$, then no
eigenvalue of $A$ is equal to 0. Note:
(i) If $x$ is an eigenvector of the matrix $A$ with associated eigenvalue $\lambda$ then for any real
$c \neq 0$, $cx$ is also an eigenvector of the matrix $A$ with associated eigenvalue $\lambda$.
(ii) Different eigenvectors may sometimes have the same associated eigenvalue, in which
case any non-zero linear combination of them is also an eigenvector with that same associated
eigenvalue.
(iii) An $(n \times n)$ matrix has at most $n$ eigenvalues.
(b) If $x$ is an eigenvector of the matrix $A$ with associated eigenvalue $\lambda$ and $A$ is non-singular
then $x$ is also an eigenvector of the matrix $A^{-1}$ but with associated eigenvalue $1/\lambda$.
(c) In general the eigenvalues and eigenvectors of a matrix need not be real-valued, i.e., they
may be complex.
(i) However, if the $(n \times n)$ matrix $A$ is symmetric all its eigenvectors and associated
eigenvalues are real-valued. In particular, it has $n$ linearly independent real eigenvectors
defined up to arbitrary scalar multiples with associated real eigenvalues (note that some
of the eigenvalues may be repeated and hence there may be fewer than $n$ distinct eigenvalues).
(ii) Furthermore, if $A$ is a symmetric matrix then there exists an orthonormal matrix $X$ (see
Section 2 above) and a diagonal matrix $\Lambda$ such that $AX = X\Lambda$ and hence $A = X \Lambda X^{-1} =
X \Lambda X'$. Repeated application of this implies that $A^r = X \Lambda^r X'$ for any positive integer $r$.
Note that the diagonal elements of $\Lambda$ are the eigenvalues of $A$ and the columns of $X$ are
corresponding linearly independent eigenvectors of $A$.
(d) The trace of a square matrix is equal to the sum of its eigenvalues, provided that it has a full
complement of eigenvalues (some of which may be repeated).

(e) The determinant of a square matrix is equal to the product of its eigenvalues, provided that
it has a full complement of eigenvalues (some of which may be repeated).
(f) If $A$ is an idempotent matrix (see Section 2 above) then all of the eigenvalues of $A$ are equal to
either 0 or 1. It follows that the trace of an idempotent matrix is equal to its rank and that
if an idempotent matrix is non-singular then it is an identity matrix.
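
The results for symmetric matrices can be illustrated with a small worked example. The $(2 \times 2)$ matrix below is chosen here for illustration and is not from the original notes.

```latex
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
|A - \lambda I_2| = (2 - \lambda)^2 - 1 = (\lambda - 1)(\lambda - 3) = 0
\;\Longrightarrow\; \lambda_1 = 3,\ \lambda_2 = 1,

X = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
\Lambda = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
X'X = I_2, \qquad X \Lambda X' = A,

\mathrm{tr}(A) = 4 = \lambda_1 + \lambda_2, \qquad |A| = 3 = \lambda_1 \lambda_2 .
```

The columns of $X$ are the normalised eigenvectors $(1, 1)'/\sqrt{2}$ and $(1, -1)'/\sqrt{2}$. Since $A$ is non-singular, $A^{-1}$ has the same eigenvectors with eigenvalues $1/3$ and $1$.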

5 Matrix Calculus
 
(a) Let $f(X)$ be a scalar function of the $(n \times m)$ matrix $X = [x_{ij}]$; then $\partial f(X) / \partial X$ is an $(n \times m)$
matrix whose $(i,j)$-th element is $\partial f(X) / \partial x_{ij}$.
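(b) Product Rules
(i) Let $X$ and $Y$ be respectively $(n \times m)$ and $(m \times p)$ matrix functions of the $(q \times 1)$ vector
$\theta$; then:
$$\frac{\partial (XY)}{\partial \theta_i} = \left( \frac{\partial X}{\partial \theta_i} \right) Y + X \left( \frac{\partial Y}{\partial \theta_i} \right).$$
(ii) Let $\alpha$ and $X$ be respectively a scalar function and an $(n \times m)$ matrix function of the
$(q \times 1)$ vector $\theta$; then:
$$\frac{\partial (\alpha X)}{\partial \theta_i} = \left( \frac{\partial \alpha}{\partial \theta_i} \right) X + \alpha \left( \frac{\partial X}{\partial \theta_i} \right).$$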
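(c) Chain Rule
Let $f(\cdot)$ be a scalar function of an $(n \times m)$ matrix argument and $G(\cdot)$ be an $(n \times m)$ matrix
function of a $(p \times 1)$ vector argument $\theta$; then:
$$\frac{\partial}{\partial \theta_i} \left\{ f(G(\theta)) \right\} = \mathrm{tr} \left( \left[ \left. \frac{\partial f(G)}{\partial G} \right|_{G = G(\theta)} \right]' \frac{\partial G(\theta)}{\partial \theta_i} \right).$$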
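(d) Let $a$ be an $(n \times 1)$ vector of constants and $x$ be an $(n \times 1)$ vector; then $\partial (a'x) / \partial x = a$.
(e) Let $A$ be an $(n \times m)$ matrix of constants and $x$ be an $(m \times 1)$ vector; then $\partial (Ax) / \partial x' = A$.
(f) Let $A$ be an $(n \times n)$ matrix of constants and $x$ be an $(n \times 1)$ vector; then $\partial (x'Ax) / \partial x =
(A + A') x$.
(g) Let $A$ be an $(n \times m)$ matrix of constants and $X$ be an $(m \times n)$ matrix; then $\partial\, \mathrm{tr}(AX) / \partial X = A'$.
(h) Let $X$ be a non-singular $(n \times n)$ matrix; then $\partial |X| / \partial X = |X| (X')^{-1}$ and, when $|X| > 0$,
$\partial \ln |X| / \partial X = (X')^{-1}$.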
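(i) Let $X$ be a non-singular $(n \times n)$ matrix whose elements are functions of the $(q \times 1)$ vector $\theta$;
then $\partial X^{-1} / \partial \theta_i = -X^{-1} [\partial X / \partial \theta_i] X^{-1}$ for $i = 1, 2, \ldots, q$.
(j) Let $A$ be an $(n \times n)$ matrix of constants and $X$ be a non-singular $(n \times n)$ matrix whose elements are
functions of the $(q \times 1)$ vector $\theta$; then $\partial\, \mathrm{tr}(AX^{-1}) / \partial \theta_i = -\mathrm{tr}\left( AX^{-1} (\partial X / \partial \theta_i) X^{-1} \right)$ for
$i = 1, 2, \ldots, q$.
(k) Let $A$ be an $(n \times n)$ matrix of constants and $X$ be a non-singular $(n \times n)$ matrix; then
$\partial\, \mathrm{tr}(AX^{-1}) / \partial X = -(X^{-1} A X^{-1})'$.
(l) Let $A$ and $B$ be $(n \times n)$ and $(m \times m)$ matrices of constants respectively and $X$ be an $(m \times n)$
matrix; then $\partial\, \mathrm{tr}(XAX'B) / \partial X = BXA + B'XA'$.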
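
The derivative of a quadratic form in item (f) follows from writing $x'Ax$ out element by element. The short derivation below is a sketch added for illustration; the index $k$ is introduced here and is not part of the original notes.

```latex
x'Ax = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j
\quad\Longrightarrow\quad
\frac{\partial (x'Ax)}{\partial x_k}
  = \sum_{j=1}^{n} a_{kj} x_j + \sum_{i=1}^{n} a_{ik} x_i
  = (Ax)_k + (A'x)_k , \qquad k = 1, \ldots, n .
```

Stacking these $n$ partial derivatives into a column vector gives $(A + A')x$. When $A$ is symmetric this reduces to $2Ax$.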

6 Kronecker Products and the Vec Operator


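(a) Let $A$ be an $(n \times m)$ matrix and $B$ be a $(p \times q)$ matrix; then the Kronecker product $(A \otimes B)$
is the $(np \times mq)$ matrix defined by:
$$(A \otimes B) = \begin{pmatrix} a_{11} B & \cdots & a_{1m} B \\ \vdots & \ddots & \vdots \\ a_{n1} B & \cdots & a_{nm} B \end{pmatrix}.$$
(b) $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$, provided that the matrices involved are conformable.
(c) $\mathrm{tr}(A \otimes B) = \mathrm{tr}(A)\, \mathrm{tr}(B)$.
(d) Let $A$ be an $(n \times n)$ matrix and $B$ be a $(p \times p)$ matrix; then $|A \otimes B| = |A|^p |B|^n$.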
 
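(e) Let $A = [a_{ij}]$ be an $(n \times m)$ matrix; then $\mathrm{vec}(A)$ is an $(nm \times 1)$ vector defined by:
$$\mathrm{vec}(A) = \begin{pmatrix} a_{11} \\ \vdots \\ a_{n1} \\ a_{12} \\ \vdots \\ a_{n2} \\ \vdots \\ a_{nm} \end{pmatrix},$$
i.e., the columns of $A$ stacked one beneath another.
(f) $\mathrm{vec}(ABC) = (C' \otimes A)\, \mathrm{vec}(B)$.
(g) $\mathrm{tr}(A'B) = \mathrm{vec}(A)' \mathrm{vec}(B)$.
(h) If $\lambda_i$ is an eigenvalue of $A$ and $\mu_j$ is an eigenvalue of $B$ then $\lambda_i \mu_j$ is an eigenvalue of $(A \otimes B)$.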
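
The identities in items (c), (f) and (g) can be checked numerically on small matrices. The following is a minimal sketch in plain Python, assuming matrices are lists of rows; the helper names `kron`, `vec`, `matmul`, `transpose` and `trace`, and the example matrices, are illustrative assumptions rather than anything defined in the notes.

```python
def matmul(A, B):
    """Matrix product of A (n x m) and B (m x p), both stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def trace(A):
    return sum(A[k][k] for k in range(len(A)))


def kron(A, B):
    """Kronecker product: block (i, j) of the result is a_ij * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def vec(A):
    """Stack the columns of A one beneath another, returned as a flat list."""
    return [x for col in zip(*A) for x in col]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]

# (f) vec(ABC) = (C' kron A) vec(B)
lhs = vec(matmul(matmul(A, B), C))
K = kron(transpose(C), A)
rhs = [row[0] for row in matmul(K, [[v] for v in vec(B)])]
print(lhs == rhs)

# (c) tr(A kron B) = tr(A) tr(B)
print(trace(kron(A, B)) == trace(A) * trace(B))

# (g) tr(A'B) = vec(A)' vec(B)
print(trace(matmul(transpose(A), B)) == sum(a * b for a, b in zip(vec(A), vec(B))))
```

Integer entries are used so that every comparison is exact; with floating-point entries a tolerance would be needed.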

7 Multivariate Optimization

Suppose that $f(\cdot)$ is a scalar function of a $(p \times 1)$ vector argument $x$ and that $g(\cdot)$ is a $(q \times 1)$ vector
function of $x$ where $q \leq p$. In addition suppose that $f(\cdot)$ is twice continuously differentiable with
respect to $x$ and $g(\cdot)$ is continuously differentiable with respect to $x$.
(a) Unconstrained Optimization

(i) First-Order Conditions (FOC):
A set of necessary conditions for $x^*$ to be a turning point (local minimum, local maximum
or saddlepoint) of the function $f(\cdot)$ with respect to $x$ are that:
$$F(x^*) \equiv \left. \frac{\partial f(x)}{\partial x} \right|_{x = x^*} = 0.$$
(ii) Second-Order Conditions (SOC):
A set of sufficient conditions for $x^*$ to be a local minimum of the function $f(\cdot)$ with
respect to $x$ are that $x^*$ is a turning point (the FOC are satisfied) and that:
$$H(x^*) \equiv \left. \frac{\partial^2 f(x)}{\partial x\, \partial x'} \right|_{x = x^*}$$
is positive definite. If the FOC are satisfied at $x^*$ but $H(x^*)$ is negative definite then
$f(\cdot)$ has a strict local maximum at $x^*$, while if the FOC are satisfied at $x^*$ but $H(x^*)$ is
indefinite and non-singular then $f(\cdot)$ has a saddlepoint at $x^*$.
(iii) Convexity
If the objective function $f(\cdot)$ is globally convex in $x$ then any solution to the FOC is a
global minimum of $f(\cdot)$.
(b) Equality Constrained Optimization
Suppose that we wish to find a local minimum of the function $f(\cdot)$ with respect to the $(p \times 1)$
vector $x$ subject to the $(q \times 1)$ vector of constraints $g(x) = 0$. Write down the Lagrangian
function:
$$L(x, \lambda) = f(x) - \lambda' g(x),$$
where $\lambda$ is a $(q \times 1)$ vector.
(i) FOC
Suppose that $x^*$ is a turning point of the function $f(\cdot)$ with respect to $x$ subject to the
constraints that $g(x) = 0$. Also suppose that $G(x^*) = [\partial g(x) / \partial x' ;\ x = x^*]$ has rank
equal to $q$. Then there exist unique Lagrange multipliers $\lambda^*$ such that:
$$\left[ \frac{\partial L(x, \lambda)}{\partial x} \right]_{x = x^*,\ \lambda = \lambda^*} = F(x^*) - G(x^*)' \lambda^* = 0,$$
$$\left[ \frac{\partial L(x, \lambda)}{\partial \lambda} \right]_{x = x^*,\ \lambda = \lambda^*} = -g(x^*) = 0.$$
(ii) SOC
Suppose that $x^*$ satisfies the Lagrange Condition and the condition that $[\partial g(x) / \partial x' ;\ x = x^*]$
has rank equal to $q$. Then a sufficient set of conditions for $x^*$ to be a local minimum of
the function $f(\cdot)$ with respect to $x$ subject to the constraints that $g(x) = 0$ are that the
matrix of second derivatives of $L(x, \lambda)$ with respect to $x$ evaluated at $(x = x^*, \lambda = \lambda^*)$
is positive definite with respect to all $(p \times 1)$ vectors $c \neq 0$ such that $G(x^*) c = 0$. Note that
this is equivalent to the bordered-Hessian condition.

(iii) Convexity
If the objective function $f(\cdot)$ is globally convex in $x$ and the constraint functions $g(\cdot)$
are linear then any $x^*$ which satisfies the Lagrange Condition is a global minimum.
(iv) Lagrange Multipliers
Suppose the $g(x)$ function in the constraints can be written as $g(x) = g_0(x) - c$. Then:
$$\lambda^* = \frac{\partial V(c)}{\partial c},$$
where:
$$V(c) = \min_{x} f(x) \quad \text{s.t.} \quad g_0(x) = c.$$
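
The equality-constrained conditions can be illustrated with a simple worked example. The objective and constraint below are chosen here for illustration (with $p = 2$ and $q = 1$) and are not taken from the original notes.

```latex
\min_{x} \; f(x) = x_1^2 + x_2^2
\quad \text{s.t.} \quad g(x) = x_1 + x_2 - c = 0 ,
\qquad
L(x, \lambda) = x_1^2 + x_2^2 - \lambda (x_1 + x_2 - c) ,

\frac{\partial L}{\partial x_1} = 2 x_1 - \lambda = 0 , \qquad
\frac{\partial L}{\partial x_2} = 2 x_2 - \lambda = 0 , \qquad
x_1 + x_2 = c
\;\Longrightarrow\;
x_1^* = x_2^* = \frac{c}{2} , \quad \lambda^* = c ,

V(c) = f(x^*) = \frac{c^2}{2} , \qquad
\frac{\partial V(c)}{\partial c} = c = \lambda^* .
```

Here $G(x^*) = (1, 1)$ has rank 1 and the matrix of second derivatives of $L$ with respect to $x$ is $2 I_2$, which is positive definite, so the SOC hold. Since $f$ is convex and the constraint is linear, $x^*$ is also a global minimum.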
