
Chapter 1

Matrix Algebra

1.1 Vector
1. A vector is an ordered sequence of elements arranged in a row or column. Unless otherwise
   noted, a vector will always be assumed to be a column vector. For example, a below is a
   3-element column vector and x is an n-element column vector.

   a = \begin{bmatrix} 5 \\ 1 \\ 3 \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}

   Column vectors can be transformed into row vectors by the operation of transposition.
   We denote the transposition operation by a prime. Thus, the corresponding row vectors are

   a′ = [5 \ 1 \ 3], \qquad x′ = [x_1 \ x_2 \ \ldots \ x_n]

2. The inner product of two vectors is defined as

   a′b = [a_1 \ a_2 \ \ldots \ a_n] \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{i=1}^{n} a_i b_i = b′a

   That is, corresponding elements are multiplied together and summed to give the product,
   which is a scalar.
3. A special case of the inner product:

   Sum of squares: a′a = \sum_{i=1}^{n} a_i^2

   Example: Let e = [e_1 \ e_2 \ \ldots \ e_n]′ be an (n × 1) vector of OLS residuals; then we
   can obtain

   Residual sum of squares = e′e = \sum_{i=1}^{n} e_i^2

4. Definition: Let i be a vector that contains a column of ones:

   i_{n×1} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}

5. Suppose i and x are two (n × 1) column vectors; then

   \sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n = [1 \ 1 \ \ldots \ 1] \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = i′x

6. Sample mean: \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} i′x

7. A column vector with every element equal to the sample mean:

   \begin{bmatrix} \bar{x} \\ \bar{x} \\ \vdots \\ \bar{x} \end{bmatrix}_{n×1} = i\bar{x} = i \left( \frac{1}{n} i′x \right) = \frac{1}{n} ii′x

   ⇒ \frac{1}{n} ii′ is an n × n matrix with every element equal to 1/n.
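These vector identities are easy to check numerically. Below is a minimal NumPy sketch (the data values are arbitrary and chosen only for illustration) computing the inner product i′x, the sample mean (1/n)i′x, and the mean vector (1/n)ii′x.

```python
import numpy as np

x = np.array([4.0, 7.0, 1.0, 8.0])        # arbitrary example data
i = np.ones_like(x)                        # the vector of ones, i
n = x.size

total = i @ x                              # i'x = sum of the elements of x
mean = total / n                           # (1/n) i'x = sample mean
mean_vec = (np.outer(i, i) / n) @ x        # (1/n) ii'x = vector of sample means

print(total, x.sum())                      # both give the sum of x
print(mean, x.mean())                      # both give the sample mean
print(mean_vec)                            # every element equals the sample mean
```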

1.2 Matrices
1. Definition:
   A matrix is a rectangular array of elements. The order of a matrix is given by the
   number of rows and the number of columns. The first number is the number of rows, and
   the second number is the number of columns. A matrix A of order (or dimension) m × n
   can be expressed as

   A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}

   Example:

   B = \begin{bmatrix} 2 & 1 & 7 & 4 \\ 1 & 2 & -2 & 1 \\ 4 & 1 & 2 & -3 \end{bmatrix}

   The matrix B is of order 3 × 4.

2. Definition: Let A be an m × n matrix. A is said to be a square matrix if m = n.

3. Definition: lower-triangular matrix

   A matrix in which all elements above the diagonal are 0, e.g.

   A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 3 & 0 \\ 4 & 5 & 6 \end{bmatrix}

4. Definition: upper-triangular matrix

   A matrix in which all elements below the diagonal are 0, e.g.

   A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{bmatrix}

5. A diagonal matrix is a square matrix with nonzero elements on the principal diagonal and
   zeros elsewhere:

   A = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}

   We often denote a diagonal matrix A as

   diag(a_{11}, a_{22}, \ldots, a_{nn}),

   where a_{ii} is the ith element on the principal diagonal.

6. The identity matrix of order n × n is defined as

   I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}

7. Suppose A is a matrix with order m × n, it follows that

Im A = AIn = A

That is, pre- or post-multiplying by I does not change the matrix.

8. The identity matrix can be entered or suppressed at will in matrix multiplication. For
example,
y − Py = Iy − Py = (I − P)y = My, where M = I − P

9. The transpose of a matrix A = [aij ] denoted as A′ , is obtained by creating the matrix
whose kth row is the kth column of the original matrix, i.e. A′ = [aji ].
    Example:

    A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \qquad A′ = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}

10. A square matrix A is a symmetric matrix if and only if A = A′ .


    Example 1:

    A = \begin{bmatrix} 1 & -2 & 3 \\ -2 & 1 & 4 \\ 3 & 4 & 2 \end{bmatrix} = A′
Example 2:
If X is any n × K matrix, then X′ X is a symmetric matrix.
11. If k is a scalar then (kA)′ = A′ k ′ = A′ k = kA′ .
12. (A′ )′ = A
13. The transpose of a sum is the sum of the transposes.
(A + B)′ = A′ + B′

14. The transpose of an identity matrix I is the identity matrix itself, i.e. I′ = I.
15. Definition: A matrix with all elements zero is said to be a null matrix and denoted as On .
16. Two matrices A and B are said to be equal if A − B = O.
17. Definition: Matrix multiplication
    If the matrix A is m × n and B is n × k, then AB is defined.

    A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} a_1′ \\ a_2′ \\ \vdots \\ a_m′ \end{bmatrix}

    B = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1k} \\ b_{21} & b_{22} & \cdots & b_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nk} \end{bmatrix} = [b_1 \ b_2 \ \cdots \ b_k]

    Then C is an m × k matrix defined as

    C = AB = \begin{bmatrix} a_1′ \\ a_2′ \\ \vdots \\ a_m′ \end{bmatrix} [b_1 \ b_2 \ \cdots \ b_k] = \begin{bmatrix} a_1′b_1 & a_1′b_2 & \cdots & a_1′b_k \\ a_2′b_1 & a_2′b_2 & \cdots & a_2′b_k \\ \vdots & \vdots & \ddots & \vdots \\ a_m′b_1 & a_m′b_2 & \cdots & a_m′b_k \end{bmatrix}

    Example 1:

    A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 1 & 3 \\ 0 & 1 & 2 \end{bmatrix}

    ⇒ AB = \begin{bmatrix} 1(2)+3(0) & 1(1)+3(1) & 1(3)+3(2) \\ 2(2)+4(0) & 2(1)+4(1) & 2(3)+4(2) \end{bmatrix} = \begin{bmatrix} 2 & 4 & 9 \\ 4 & 6 & 14 \end{bmatrix}

    Example 2:
    Let X be an n × K matrix defined as

    X_{(n×K)} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1K} \\ x_{21} & x_{22} & \cdots & x_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nK} \end{bmatrix} = \begin{bmatrix} x_1′ \\ x_2′ \\ \vdots \\ x_n′ \end{bmatrix}

    where

    x_i = \begin{bmatrix} x_{i1} \\ x_{i2} \\ \vdots \\ x_{iK} \end{bmatrix}, \quad i = 1, 2, \ldots, n

    It turns out that

    X′X_{(K×K)} = [x_1 \ x_2 \ \ldots \ x_n] \begin{bmatrix} x_1′ \\ x_2′ \\ \vdots \\ x_n′ \end{bmatrix} = \sum_{i=1}^{n} x_i x_i′

18. Generally, AB ≠ BA even if both products exist, e.g.

    A = \begin{bmatrix} 4 & 7 \\ 3 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 5 \\ 6 & 8 \end{bmatrix}

    ⇒ AB = \begin{bmatrix} 46 & 76 \\ 15 & 31 \end{bmatrix} ≠ BA = \begin{bmatrix} 19 & 17 \\ 48 & 58 \end{bmatrix}

19. The transpose of the product of two matrices is given by

(AB)′ = B′ A′ , where A is m × n and B is n × k.

That is, the transpose of a product is the product of the transposes in reverse order.
    Example:

    A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 0 \\ 1 & 1 \end{bmatrix}

    ⇒ AB = \begin{bmatrix} 4 & 2 \\ 10 & 4 \end{bmatrix} ⇒ (AB)′ = \begin{bmatrix} 4 & 10 \\ 2 & 4 \end{bmatrix}

    B′A′ = \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 4 & 10 \\ 2 & 4 \end{bmatrix}

20. A matrix A is said to be an idempotent matrix if

    A = AA.

    That is, multiplying A by itself, however many times, simply reproduces the original matrix.
    Example:

    A = \begin{bmatrix} 4 & -2 \\ 6 & -3 \end{bmatrix}

    A^2 = \begin{bmatrix} 4 & -2 \\ 6 & -3 \end{bmatrix} \begin{bmatrix} 4 & -2 \\ 6 & -3 \end{bmatrix} = \begin{bmatrix} 4 & -2 \\ 6 & -3 \end{bmatrix}

21. Example 1:

    (1) Projection matrix: P = X(X′X)^{-1}X′
    (2) Residual maker: M = I − X(X′X)^{-1}X′

22. Example 2: Let A = I − \frac{1}{n} ii′ be an idempotent matrix; then

    Ax = x − \frac{1}{n}(ii′)x = \begin{bmatrix} x_1 − \bar{x} \\ x_2 − \bar{x} \\ \vdots \\ x_n − \bar{x} \end{bmatrix}

The matrix A transforms raw data into deviation form.
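A short numerical sketch of the two idempotent matrices above (a minimal NumPy illustration; the data vector and the regressor matrix X are arbitrary placeholders): it builds the centering matrix A = I − (1/n)ii′ and the projection matrix P = X(X′X)⁻¹X′, and checks idempotency and the deviation-form result.

```python
import numpy as np

n = 5
x = np.array([3.0, 1.0, 4.0, 1.0, 5.0])          # arbitrary example data
i = np.ones(n)

A = np.eye(n) - np.outer(i, i) / n               # centering matrix I - (1/n)ii'
print(np.allclose(A @ A, A))                     # idempotent: AA = A
print(np.allclose(A @ x, x - x.mean()))          # Ax puts x in deviation form

rng = np.random.default_rng(0)
X = rng.normal(size=(n, 2))                      # arbitrary n x K regressor matrix
P = X @ np.linalg.inv(X.T @ X) @ X.T             # projection matrix
M = np.eye(n) - P                                # residual maker
print(np.allclose(P @ P, P), np.allclose(M @ M, M))   # both idempotent
print(np.allclose(M @ X, np.zeros_like(X)))      # M annihilates the columns of X
```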

1.3 Trace of a Matrix


1. If the square matrix A is of order n × n, the trace of A is defined as the sum of the
   elements on the principal diagonal; i.e.

   tr(A) = \sum_{i=1}^{n} a_{ii}

2. Basic properties of the trace:

(1) tr(cA) = c · tr(A), where c is a constant.


(2) tr(A′ ) = tr(A)

(3) tr(A + B) = tr(A) + tr(B)
(4) tr(In ) = n
(5) tr(AB) = tr(BA)
(6) tr(ABC) = tr(CAB) = tr(BCA)
   (7) a′a = tr(a′a) = tr(aa′), where a is an n × 1 column vector.
   (8) tr(A′A) = tr(AA′) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2
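The trace identities, especially tr(AB) = tr(BA) and the cyclic property, can be spot-checked numerically. A minimal sketch with randomly generated matrices (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 3))
C = rng.normal(size=(3, 3))

print(np.isclose(np.trace(A @ B), np.trace(B @ A)))          # tr(AB) = tr(BA)
print(np.isclose(np.trace(C @ A @ B), np.trace(A @ B @ C)))  # cyclic: tr(CAB) = tr(ABC)
print(np.isclose(np.trace(A.T @ A), (A ** 2).sum()))         # tr(A'A) = sum of squared elements
```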

1.4 Determinant of a Square Matrix


1. Definition: For a 2 × 2 matrix

   A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}

   its determinant is defined as

   det(A) = |A| = a_{11}(−1)^{1+1}|a_{22}| + a_{12}(−1)^{1+2}|a_{21}| = a_{11}a_{22} − a_{21}a_{12}
2. For a 3 × 3 matrix

   A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}

   its determinant is

   |A| = a_{11}(−1)^{1+1} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
       + a_{12}(−1)^{1+2} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
       + a_{13}(−1)^{1+3} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
3. Suppose A is an n × n matrix; then

   |A| = \sum_{j=1}^{n} a_{ij} C_{ij}, \quad for any row i = 1, 2, \ldots, n
       = \sum_{i=1}^{n} a_{ij} C_{ij}, \quad for any column j = 1, 2, \ldots, n

   where C_{ij} is called the cofactor of the element a_{ij} and is given by C_{ij} = (−1)^{i+j}|A_{ij}|.
   A_{ij} is the submatrix obtained from A by deleting row i and column j.
4. If A is an n × n matrix and c is a nonzero constant, then |cA| = c^n |A|, e.g.

   A = \begin{bmatrix} 1 & 2 \\ 3 & 10 \end{bmatrix} \quad and \quad B = 3A = \begin{bmatrix} 3 & 6 \\ 9 & 30 \end{bmatrix}

   ⇒ |A| = 4, \quad |B| = 36 = 3^2 · 4
5. If any row (or column) of a matrix is a multiple of any other row (or column), then its
determinant is 0.
6. |A′| = |A|.
7. If A and B are square matrices of the same order, then |AB| = |A| · |B|.
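These determinant rules are straightforward to verify numerically; a minimal NumPy sketch with arbitrary matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))
c = 3.0

print(np.isclose(np.linalg.det(c * A), c ** 3 * np.linalg.det(A)))   # |cA| = c^n |A|
print(np.isclose(np.linalg.det(A.T), np.linalg.det(A)))              # |A'| = |A|
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))               # |AB| = |A||B|

A[1] = 2 * A[0]                                                      # make row 2 a multiple of row 1
print(np.isclose(np.linalg.det(A), 0.0))                             # determinant becomes 0
```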

1.5 Inverse Matrix
1. An n × n matrix A has an inverse, denoted A−1 , provided that AA−1 = A−1 A = In .

2. A matrix that has an inverse is said to be a nonsingular matrix. Otherwise, it is said to
   be singular.

3. The inverse of an identity matrix is the identity matrix itself, i.e. I−1 = I.

4. The inverse of the inverse is the original matrix itself, i.e. (A−1 )−1 = A.

5. The inverse of the transpose is the transpose of the inverse, i.e. (A′ )−1 = (A−1 )′ .

6. If A and B are nonsingular, then (AB)−1 = B−1 A−1 .

7. Let A be an order-n square matrix; then the inverse matrix of A is

   A^{-1} = \frac{1}{|A|} \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix}′

8. Example:

   A = \begin{bmatrix} 1 & 3 & 4 \\ 1 & 2 & 1 \\ 2 & 4 & 5 \end{bmatrix}

   C_{11} = 6, C_{12} = −3, C_{13} = 0, C_{21} = 1, C_{22} = −3, C_{23} = 2, C_{31} = −5, C_{32} = 3, C_{33} = −1

   |A| = 1(6) + 3(−3) + 4(0) = −3

   A^{-1} = \frac{1}{-3} \begin{bmatrix} 6 & -3 & 0 \\ 1 & -3 & 2 \\ -5 & 3 & -1 \end{bmatrix}′
          = \frac{1}{-3} \begin{bmatrix} 6 & 1 & -5 \\ -3 & -3 & 3 \\ 0 & 2 & -1 \end{bmatrix}
          = \begin{bmatrix} -2 & -\frac{1}{3} & \frac{5}{3} \\ 1 & 1 & -1 \\ 0 & -\frac{2}{3} & \frac{1}{3} \end{bmatrix}

   Verify:

   AA^{-1} = \begin{bmatrix} 1 & 3 & 4 \\ 1 & 2 & 1 \\ 2 & 4 & 5 \end{bmatrix}
             \begin{bmatrix} -2 & -\frac{1}{3} & \frac{5}{3} \\ 1 & 1 & -1 \\ 0 & -\frac{2}{3} & \frac{1}{3} \end{bmatrix}
           = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I_3
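A short numerical check of the cofactor formula against a library inverse (a minimal sketch; numpy.linalg.inv is used only for comparison, not as part of the cofactor construction):

```python
import numpy as np

A = np.array([[1.0, 3.0, 4.0],
              [1.0, 2.0, 1.0],
              [2.0, 4.0, 5.0]])

# Build the cofactor matrix C with C_ij = (-1)^(i+j) |A_ij|
n = A.shape[0]
C = np.zeros_like(A)
for i in range(n):
    for j in range(n):
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
        C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)

A_inv = C.T / np.linalg.det(A)                 # A^{-1} = (1/|A|) C'
print(np.allclose(A_inv, np.linalg.inv(A)))    # matches the library inverse
print(np.allclose(A @ A_inv, np.eye(n)))       # AA^{-1} = I
```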

1.6 Rank of a Matrix


1. Definition: Linear Combinations
A linear combination of a set of n vectors a1 , a2 , . . ., an is denoted as

c1 a1 + c2 a2 + . . . + cn an

for the constants c1 , c2 , . . . , cn .

2. Example: Suppose a_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, a_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, a_3 = \begin{bmatrix} 3 \\ 4 \end{bmatrix}, and c_1 = −2, c_2 = 1, c_3 = 2; then
   the linear combination is

   c_1 a_1 + c_2 a_2 + c_3 a_3 = −2\begin{bmatrix} 1 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 2 \\ 1 \end{bmatrix} + 2\begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 6 \\ 5 \end{bmatrix}

3. Definition: Linearly Independent

   Denote the n columns of the matrix A as a_1, a_2, \ldots, a_n. The set of these vectors is
   linearly independent if and only if the only scalars c_1, c_2, \ldots, c_n satisfying

   c_1 a_1 + c_2 a_2 + \cdots + c_n a_n = 0

   are c_1 = c_2 = \cdots = c_n = 0. Otherwise they are linearly dependent.
4. Example:
   Suppose a_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, a_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}; then the only solution is c_1 = c_2 = 0 such that

   c_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.

   This implies that a_1 and a_2 are linearly independent. (|A| ≠ 0)
5. Example:
   Suppose a_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, a_2 = \begin{bmatrix} 2 \\ 4 \end{bmatrix}; then there exist nonzero solutions c_1 = −2, c_2 = 1 such that

   c_1 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2 \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.

   Therefore, we say that a_1 and a_2 are linearly dependent. (|A| = 0)
6. Let A be an m × n matrix. The rank of A, denoted by rank(A), is the maximum number
   of linearly independent rows or columns of A. Equivalently, rank(A) = r if and only if
   r is the order of the largest square submatrix of A whose determinant is nonzero.
7. If the maximum number of linearly independent columns (or rows) is equal to the number
of columns, we say that the matrix has a full column rank.
8. If the maximum number of linearly independent rows (or columns) is equal to the number
of rows, we say that the matrix has a full row rank.
9. If a square matrix A has full column (and row) rank and |A| ≠ 0, then A is said to be
   nonsingular.
10. Example: Suppose

    A = \begin{bmatrix} 3 & 2 & 7 \\ 0 & 1 & -3 \\ 3 & 4 & 1 \end{bmatrix} ⇒ |A| = 0

    In this example, there exist nonzero solutions c_1 = −\frac{13}{3}, c_2 = 3, and c_3 = 1 such that

    −\frac{13}{3}\begin{bmatrix} 3 \\ 0 \\ 3 \end{bmatrix} + 3\begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix} + \begin{bmatrix} 7 \\ -3 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.

    Therefore, we say that a_1, a_2 and a_3 are linearly dependent.

    Let A_{11} denote the submatrix of A obtained by deleting the first row and the first column of A. Then

    A_{11} = \begin{bmatrix} 1 & -3 \\ 4 & 1 \end{bmatrix} ⇒ |A_{11}| = 13 ≠ 0

    Hence rank(A) = 2.

11. rank(In )=n.

12. rank(cA)=rank(A), where c is a constant that is not 0.

13. rank(A′ )=rank(A).

14. If A is an (m × n) matrix, then rank(A) ≤ min{m, n}.

15. If A is an (n × n) matrix, then rank(A) = n if and only if A is nonsingular.

16. Let X be an (n × K) matrix, then rank(X)=rank(X′ X).

17. Let X be an (n×K) matrix and A be an (n×n) nonsingular matrix; then rank(AX) = rank(X).
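A minimal NumPy sketch checking several of the rank facts above on the 3 × 3 example (matrix_rank uses a singular-value tolerance rather than determinants, but gives the same answer here):

```python
import numpy as np

A = np.array([[3.0, 2.0, 7.0],
              [0.0, 1.0, -3.0],
              [3.0, 4.0, 1.0]])

print(np.isclose(np.linalg.det(A), 0.0))          # singular, so rank < 3
print(np.linalg.matrix_rank(A))                   # 2
print(np.linalg.matrix_rank(A.T))                 # rank(A') = rank(A)
print(np.linalg.matrix_rank(5 * A))               # rank(cA) = rank(A) for c != 0
print(np.linalg.matrix_rank(A.T @ A))             # rank(A'A) = rank(A)
```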

1.7 Partitioned Matrices


1. Let

   A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 7 & 5 \\ 8 & 2 & 4 \\ 2 & 1 & 3 \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}

   with A_{11} the upper-left 2 × 2 block, A_{12} the upper-right 2 × 1 block, and so on;
   then we say that A is a partitioned matrix.

2. If

   A = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}

   then the inverse of A is

   A^{-1} = \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & A_{22}^{-1} \end{bmatrix}

   provided that A_{11}^{-1} and A_{22}^{-1} exist.

3. If

   A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}

   where A_{11} and A_{22} are square nonsingular matrices, then

   A^{-1} = \begin{bmatrix} B_{11} & −B_{11} A_{12} A_{22}^{-1} \\ −A_{22}^{-1} A_{21} B_{11} & A_{22}^{-1} + A_{22}^{-1} A_{21} B_{11} A_{12} A_{22}^{-1} \end{bmatrix}

   where B_{11} = (A_{11} − A_{12} A_{22}^{-1} A_{21})^{-1}. Or, alternatively,

   A^{-1} = \begin{bmatrix} A_{11}^{-1} + A_{11}^{-1} A_{12} B_{22} A_{21} A_{11}^{-1} & −A_{11}^{-1} A_{12} B_{22} \\ −B_{22} A_{21} A_{11}^{-1} & B_{22} \end{bmatrix}

   where B_{22} = (A_{22} − A_{21} A_{11}^{-1} A_{12})^{-1}.
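The partitioned-inverse formula can be verified numerically; a minimal sketch with an arbitrary 4 × 4 matrix split into 2 × 2 blocks:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4)) + 4 * np.eye(4)     # arbitrary matrix, kept well-conditioned
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]

A22_inv = np.linalg.inv(A22)
B11 = np.linalg.inv(A11 - A12 @ A22_inv @ A21)  # B11 = (A11 - A12 A22^{-1} A21)^{-1}

top = np.hstack([B11, -B11 @ A12 @ A22_inv])
bottom = np.hstack([-A22_inv @ A21 @ B11,
                    A22_inv + A22_inv @ A21 @ B11 @ A12 @ A22_inv])
A_inv_blocks = np.vstack([top, bottom])

print(np.allclose(A_inv_blocks, np.linalg.inv(A)))   # matches the direct inverse
```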

1.8 Quadratic Forms and Definite Matrices


1. Definition: A quadratic form in x is a function of the form

   q = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j a_{ij}.

   Let A be an (n × n) symmetric matrix; then q = x′Ax.

   Example:

   q = x′Ax = [x_1 \ x_2] \begin{bmatrix} 1 & 2 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1^2 + 4x_1x_2 + 6x_2^2

2. Definition: A quadratic form x′ Ax or its matrix A is said to be

(1) positive definite (p.d.) if q > 0 for all nonzero x.


(2) negative definite (n.d.) if q < 0 for all nonzero x.
(3) positive semidefinite (p.s.d.) if q ≥ 0 for all nonzero x.
(4) negative semidefinite (n.s.d.) if q ≤ 0 for all nonzero x.

3. If A is an (n × n) symmetric matrix and rank(A)=n, then the following are all equivalent:

(1) x′ Ax > 0 (p.d.) for all nonzero x.


   (2) The determinants of the n leading principal minors are all strictly positive, i.e.

       |a_{11}| > 0, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} > 0, \quad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} > 0, \ldots, |A| > 0

4. Example:
   Show that A = \begin{bmatrix} 1 & 2 \\ 2 & 6 \end{bmatrix} is p.d.

   (1) Method 1:

       q = x′Ax = [x_1 \ x_2] \begin{bmatrix} 1 & 2 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
         = x_1^2 + 4x_1x_2 + 6x_2^2
         = (x_1 + 2x_2)^2 + 2x_2^2

       q > 0 for all x ≠ 0, since q is a sum of squares.

   (2) Method 2: A is p.d. since |a_{11}| = 1 > 0 and |A| = 2 > 0.
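A minimal NumPy sketch of both methods for the same matrix (for a symmetric matrix, checking that all eigenvalues are positive is an equivalent third test):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 6.0]])

# Method 2: all leading principal minors positive
minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
print(minors, all(m > 0 for m in minors))

# Equivalent check for a symmetric matrix: all eigenvalues positive
print(np.all(np.linalg.eigvalsh(A) > 0))

# Method 1: q = x'Ax > 0 for a few arbitrary nonzero x
rng = np.random.default_rng(4)
for _ in range(3):
    x = rng.normal(size=2)
    print(x @ A @ x > 0)
```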

5. Properties of positive definite and positive semidefinite matrices

   (1) A positive definite matrix has diagonal elements that are strictly positive, while a
       p.s.d. matrix has nonnegative diagonal elements.
(2) If A is p.d., then A−1 exists and is p.d.
(3) If X is n × K, then X′ X is p.s.d.

       Proof:
       Let c be a K × 1 nonzero vector. Then

       q = c′X′Xc = y′y = \sum y_i^2 ≥ 0, \quad where y_{n×1} = Xc

   (4) If X is n × K and rank(X) = K, then X′X is p.d. and nonsingular.

6. Consider two matrices A and B with the same dimension, then

(1) A > B if A − B is p.d.


(2) A ≥ B if A − B is p.s.d.
(3) A < B if A − B is n.d.
(4) A ≤ B if A − B is n.s.d.

7. Example:
   Show that the OLS estimator b = (X′X)^{-1}X′y has minimum variance in the class of linear
   unbiased estimators.
   Proof:
   Let b_0 = Cy be another linear unbiased estimator of β, where C is a K × n matrix such
   that CX = I.

   Let D = C − (X′X)^{-1}X′

   ⇒ DX = CX − I_K = O_{K×K}

   ∴ b_0 = Cy = C(Xβ + ε) = β + Cε
   ⇒ b_0 − β = Cε

   Var(b_0|X) = E[(b_0 − β)(b_0 − β)′|X]
              = E(Cεε′C′|X)
              = C E(εε′|X) C′
              = σ²CC′, since E(εε′|X) = σ²I
              = σ²[(D + (X′X)^{-1}X′)(D + (X′X)^{-1}X′)′]
              = σ²(X′X)^{-1} + σ²DD′, since DX = O
              = Var(b|X) + σ²DD′

   Since DD′ is p.s.d., it follows that Var(b_0|X) ≥ Var(b|X).

   This implies that the least squares estimator b is the best linear unbiased estimator (BLUE)
   of β.

   Note:
   Show that DD′ is p.s.d.
   Proof:
   The quadratic form in DD′ is

   q = z′DD′z = h′h = \sum h_i^2 ≥ 0, \quad where h = D′z

   Therefore, DD′ is p.s.d.
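The comparison above can be illustrated numerically: build an arbitrary X, take C = (X′X)⁻¹X′ + D with DX = O, and check that the implied variance difference equals σ²DD′ and is p.s.d. A minimal sketch (the construction D = GM, with M the residual maker, is one convenient way to guarantee DX = O):

```python
import numpy as np

rng = np.random.default_rng(5)
n, K = 20, 3
sigma2 = 2.0
X = rng.normal(size=(n, K))

XtX_inv = np.linalg.inv(X.T @ X)
M = np.eye(n) - X @ XtX_inv @ X.T            # residual maker, MX = 0
D = rng.normal(size=(K, n)) @ M              # any D of this form satisfies DX = O
C = XtX_inv @ X.T + D                        # alternative linear unbiased estimator: b0 = Cy

print(np.allclose(C @ X, np.eye(K)))         # CX = I, so b0 is unbiased
var_b = sigma2 * XtX_inv                     # Var(b|X)
var_b0 = sigma2 * C @ C.T                    # Var(b0|X) = sigma^2 CC'
diff = var_b0 - var_b                        # should equal sigma^2 DD', hence p.s.d.
print(np.allclose(diff, sigma2 * D @ D.T))
print(np.all(np.linalg.eigvalsh(diff) >= -1e-10))
```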

1.9 Matrix Differentiation


1. Suppose a function y = f(x_1, x_2, \ldots, x_n) = f(x) is a scalar-valued function of a vector x,
   where x′ = [x_1 \ x_2 \ \ldots \ x_n]. The gradient of y is denoted as

   \frac{∂f(x)}{∂x} = \begin{bmatrix} ∂y/∂x_1 \\ ∂y/∂x_2 \\ \vdots \\ ∂y/∂x_n \end{bmatrix}

   The Hessian matrix (i.e. second derivatives matrix) of y is defined as

   H = \frac{∂^2 y}{∂x∂x′} = \frac{∂(∂y/∂x)}{∂x′} = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{bmatrix}

   where f_{ij} = ∂^2 y/∂x_i∂x_j.

2. If a′ = [a_1 \ a_2 \ \ldots \ a_n] and x′ = [x_1 \ x_2 \ \ldots \ x_n], then

   y = a′x = a_1x_1 + a_2x_2 + \ldots + a_nx_n

   \frac{∂y}{∂x} = \frac{∂(a′x)}{∂x} = \frac{∂(x′a)}{∂x} = \begin{bmatrix} ∂y/∂x_1 \\ ∂y/∂x_2 \\ \vdots \\ ∂y/∂x_n \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = a

   Example:
   Residual sum of squares:

   e′e = y′y − 2b′X′y + b′X′Xb

   \frac{∂(−2b′X′y)}{∂b} = −2X′y
3. Theorem:
   If A is a symmetric matrix, then

   \frac{∂x′Ax}{∂x} = 2Ax.

   Example 1: A = \begin{bmatrix} 1 & 3 \\ 3 & 4 \end{bmatrix}. Then

   x′Ax = [x_1 \ x_2] \begin{bmatrix} 1 & 3 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1^2 + 4x_2^2 + 6x_1x_2

   \frac{∂x′Ax}{∂x} = \begin{bmatrix} 2x_1 + 6x_2 \\ 6x_1 + 8x_2 \end{bmatrix} = 2\begin{bmatrix} 1 & 3 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 2Ax

   Example 2:

   \frac{∂(b′X′Xb)}{∂b} = 2X′Xb, \quad since X′X is symmetric
∂b
4. If A is not symmetric, then

   \frac{∂x′Ax}{∂x} = (A + A′)x.

   e.g. A = \begin{bmatrix} 1 & 3 \\ 0 & 4 \end{bmatrix}. Then

   y = x′Ax = [x_1 \ x_2] \begin{bmatrix} 1 & 3 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1^2 + 4x_2^2 + 3x_1x_2

   \frac{∂x′Ax}{∂x} = \begin{bmatrix} ∂y/∂x_1 \\ ∂y/∂x_2 \end{bmatrix}
                    = \begin{bmatrix} 2x_1 + 3x_2 \\ 3x_1 + 8x_2 \end{bmatrix}
                    = \begin{bmatrix} 2 & 3 \\ 3 & 8 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
                    = \left( \begin{bmatrix} 1 & 3 \\ 0 & 4 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 3 & 4 \end{bmatrix} \right) \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
                    = (A + A′)x

5. If y_{(m×1)} = A_{(m×n)} x_{(n×1)}, then

   \frac{∂y}{∂x}_{(n×m)} = A′.

   e.g.

   y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 5 & 3 & 2 \\ 2 & 1 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 5x_1 + 3x_2 + 2x_3 \\ 2x_1 + x_2 + 3x_3 \end{bmatrix} = Ax

   \frac{∂y}{∂x} = \frac{∂(Ax)}{∂x} = \begin{bmatrix} ∂y_1/∂x_1 & ∂y_2/∂x_1 \\ ∂y_1/∂x_2 & ∂y_2/∂x_2 \\ ∂y_1/∂x_3 & ∂y_2/∂x_3 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 3 & 1 \\ 2 & 3 \end{bmatrix} = A′
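These derivative rules can be checked against finite differences; a minimal NumPy sketch (the step size h and the test matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(3, 3))          # not necessarily symmetric
a = rng.normal(size=3)
x = rng.normal(size=3)
h = 1e-6

def num_grad(f, x):
    # central finite-difference gradient of a scalar function f at x
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

print(np.allclose(num_grad(lambda v: a @ v, x), a, atol=1e-4))                   # d(a'x)/dx = a
print(np.allclose(num_grad(lambda v: v @ A @ v, x), (A + A.T) @ x, atol=1e-4))   # d(x'Ax)/dx = (A+A')x
```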

1.10 Eigenvalues and Eigenvectors


1. Suppose that we want to find the solutions of

   Ac = λc

   where A is a known (k × k) square matrix, c is an unknown (k × 1) nonzero vector, and λ
   is an unknown scalar. Thus,

   Ac = λIc
   Ac − λIc = 0
   (A − λI)c = 0

   If the inverse of (A − λI) exists, then

   (A − λI)^{-1}(A − λI)c = 0

   This implies that

   c = 0

   This solution contradicts the condition that c ≠ 0. It follows that the matrix
   (A − λI) must be singular, i.e. (A − λI)^{-1} does not exist. This implies that

   |A − λI| = 0

2. Suppose A is a (k × k) matrix of known numbers such that

   A_{(k×k)} c_{(k×1)} = λ c_{(k×1)}

   or

   (A − λI)c = 0

   where λ is an unknown scalar and c is an unknown k × 1 vector. Then,

   |A − λI| = 0

   The above polynomial equation in λ of degree k is known as the characteristic equation
   of A. The λ's are called characteristic roots (or eigenvalues) of the matrix A. Each
   λ_i can be substituted into (A − λI)c = 0 and the corresponding k × 1 vector c_i obtained,
   where c_i is a nonzero vector. The c vectors are known as the characteristic vectors (or
   eigenvectors) of A.

3. Ac_i = λ_i c_i, i = 1, 2, \ldots, k. Stacking all k solutions produces the matrix equation

   A_{k×k} [c_1 \ c_2 \ \ldots \ c_k]_{k×k} = [λ_1c_1 \ λ_2c_2 \ \ldots \ λ_kc_k]_{k×k}   (1.1)
                                            = [c_1 \ c_2 \ \ldots \ c_k] \begin{bmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_k \end{bmatrix}   (1.2)

   It can be written as

   A_{k×k} C_{k×k} = C_{k×k} Λ_{k×k}

   where Λ is the diagonal matrix of eigenvalues.

   Assume C is nonsingular; then we obtain the diagonalization of A

   Λ = C^{-1}AC

   where Λ is a k × k diagonal matrix with the eigenvalues λ_i in the diagonal positions.

4. Example: Suppose

   A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}

   A − λI = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} − λ\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3−λ & 1 \\ 1 & 3−λ \end{bmatrix}

   The characteristic equation is given by

   |A − λI| = 0

   ⇒ \begin{vmatrix} 3−λ & 1 \\ 1 & 3−λ \end{vmatrix} = 0

   ⇒ λ^2 − 6λ + 8 = (λ − 4)(λ − 2) = 0

   The eigenvalues of A are λ_1 = 4 and λ_2 = 2.

   • Find eigenvectors:
     (1) λ_1 = 4:

         Ac_1 = 4c_1

         i.e. \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} c_{11} \\ c_{21} \end{bmatrix} = 4\begin{bmatrix} c_{11} \\ c_{21} \end{bmatrix}

         ⇒ \begin{bmatrix} 3c_{11} + c_{21} \\ c_{11} + 3c_{21} \end{bmatrix} = \begin{bmatrix} 4c_{11} \\ 4c_{21} \end{bmatrix}

         ⇒ \begin{bmatrix} −c_{11} + c_{21} \\ c_{11} − c_{21} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

         ⇒ c_{11} = c_{21}

         Let c_{11} = 1; then c_{21} = 1.

         Therefore, the eigenvector for λ_1 = 4 is c_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}

     (2) λ_2 = 2:

         Ac_2 = 2c_2

         i.e. \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} c_{12} \\ c_{22} \end{bmatrix} = 2\begin{bmatrix} c_{12} \\ c_{22} \end{bmatrix}

         ⇒ \begin{bmatrix} 3c_{12} + c_{22} \\ c_{12} + 3c_{22} \end{bmatrix} = \begin{bmatrix} 2c_{12} \\ 2c_{22} \end{bmatrix}

         ⇒ \begin{bmatrix} c_{12} + c_{22} \\ c_{12} + c_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

         ⇒ c_{12} = −c_{22}

         Let c_{12} = 1; then c_{22} = −1.

         Therefore, the eigenvector for λ_2 = 2 is c_2 = \begin{bmatrix} 1 \\ −1 \end{bmatrix}

   • The equation system is homogeneous, so it will yield an infinite number of vectors
     corresponding to each root λ_i.
   • Check Λ = C^{-1}AC:

     C = [c_1 \ c_2] = \begin{bmatrix} 1 & 1 \\ 1 & −1 \end{bmatrix} ⇒ C^{-1} = \frac{−1}{2}\begin{bmatrix} −1 & −1 \\ −1 & 1 \end{bmatrix}

     ⇒ Λ = C^{-1}AC = \frac{−1}{2}\begin{bmatrix} −1 & −1 \\ −1 & 1 \end{bmatrix}\begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & −1 \end{bmatrix} = \begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}
5. The eigenvalues of a symmetric matrix are all real.

6. If all k eigenvalues are distinct, C will have k linearly independent columns, so that

   Λ = C^{-1}AC

7. If A is symmetric, the eigenvectors are linearly independent and pairwise orthogonal in
   that c_i′c_j = 0 for λ_i ≠ λ_j.

8. Normalization of eigenvectors: c′i ci = 1, i = 1, . . . , k

9. Let Q denote the matrix whose columns are normalized orthogonal eigenvectors. Then

Q′ Q = I

The matrix Q is called an orthogonal matrix since its inverse is its transpose, i.e.

Q′ Q = QQ′ = I

   ⇒ Q^{-1} = Q′, since Q^{-1}Q = I

   Q′Q = \begin{bmatrix} c_1′ \\ c_2′ \\ \vdots \\ c_k′ \end{bmatrix} [c_1 \ c_2 \ \ldots \ c_k] = \begin{bmatrix} c_1′c_1 & c_1′c_2 & \cdots & c_1′c_k \\ c_2′c_1 & c_2′c_2 & \cdots & c_2′c_k \\ \vdots & \vdots & \ddots & \vdots \\ c_k′c_1 & c_k′c_2 & \cdots & c_k′c_k \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I

10. Let A be a (k × k) symmetric matrix. Then there exists a (k × k) orthogonal matrix
    Q such that Q′AQ = Λ is diagonal.

    Q′AQ = Λ ⇔ A = QΛQ′

11. Example: (continued)

    • λ_1 = 4:

      \begin{bmatrix} 3−4 & 1 \\ 1 & 3−4 \end{bmatrix} \begin{bmatrix} c_{11} \\ c_{21} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

      c_{11} = c_{21}

      c_{11}^2 + c_{21}^2 = 1 (normalization)

      ⇒ c_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \ or \ \begin{bmatrix} −\frac{1}{\sqrt{2}} \\ −\frac{1}{\sqrt{2}} \end{bmatrix}

    • λ_2 = 2:

      \begin{bmatrix} 3−2 & 1 \\ 1 & 3−2 \end{bmatrix} \begin{bmatrix} c_{12} \\ c_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

      c_{12} = −c_{22}

      c_{12}^2 + c_{22}^2 = 1 (normalization)

      ⇒ c_2 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ −\frac{1}{\sqrt{2}} \end{bmatrix} \ or \ \begin{bmatrix} −\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}

    • Check Q′Q = QQ′ = I:

      Q′Q = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & −\frac{1}{\sqrt{2}} \end{bmatrix}′ \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & −\frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

    • Check Q′AQ = Λ:

      Q′AQ = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & −\frac{1}{\sqrt{2}} \end{bmatrix}′ \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & −\frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix} = Λ

12. The determinant of a symmetric matrix is the product of its eigenvalues.
    Proof:

    |Λ| = |Q′AQ|
        = |Q′| · |A| · |Q|, since |AB| = |A| · |B|
        = |Q′| · |Q| · |A|
        = |Q′Q| · |A|, since |A| · |B| = |AB|
        = |I| · |A|
        = |A|

    and |Λ| = λ_1 λ_2 \cdots λ_k, since Λ is diagonal.

13. The sum of all the eigenvalues is equal to the trace of A.


Proof:

tr(Λ) = tr(C−1 AC)


= tr(ACC−1 ), since tr(AB) = tr(BA)
= tr(A)

14. The rank of A is equal to the number of nonzero eigenvalues.
Proof:
Let k1 be the number of nonzero eigenvalues of matrix A.
rank(A) = rank(QΛQ′ )
= rank(ΛQ′ ), since rank(BX) = rank(X) if B is nonsingular
= rank(QΛ), since rank(X) = rank(X′ )
= rank(Λ), since rank(BX) = rank(X) if B is nonsingular
= k1

15. The rank of an idempotent matrix is equal to its trace.


16. Suppose A is symmetric then A is positive definite if and only if all eigenvalues of A are
positive.
17. If A is symmetric and positive definite, then there exists a nonsingular matrix P = QΛ^{1/2}
    such that A = PP′, where Q is an orthogonal matrix of eigenvectors and

    Λ^{1/2} = \begin{bmatrix} \sqrt{λ_1} & 0 & \cdots & 0 \\ 0 & \sqrt{λ_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{λ_k} \end{bmatrix}.

There is also a matrix T = Λ1/2 Q′ such that A = T′ T and TA−1 T′ = I.


Proof:
A = QΛQ′ = QΛ1/2 Λ1/2 Q′ = (QΛ1/2 )(QΛ1/2 )′ = PP′
A = QΛQ′ = QΛ1/2 Λ1/2 Q′ = (Λ1/2 Q′ )′ (Λ1/2 Q′ ) = T′ T

18. Example: (continued)

    (1) Check PP′ = A:

        P = QΛ^{1/2} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & −\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & \sqrt{2} \end{bmatrix} = \begin{bmatrix} \sqrt{2} & 1 \\ \sqrt{2} & −1 \end{bmatrix}

        PP′ = \begin{bmatrix} \sqrt{2} & 1 \\ \sqrt{2} & −1 \end{bmatrix} \begin{bmatrix} \sqrt{2} & 1 \\ \sqrt{2} & −1 \end{bmatrix}′ = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} = A

    (2) Check T′T = A:

        T = Λ^{1/2}Q′ = \begin{bmatrix} 2 & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & −\frac{1}{\sqrt{2}} \end{bmatrix}′ = \begin{bmatrix} \sqrt{2} & \sqrt{2} \\ 1 & −1 \end{bmatrix}

        T′T = \begin{bmatrix} \sqrt{2} & \sqrt{2} \\ 1 & −1 \end{bmatrix}′ \begin{bmatrix} \sqrt{2} & \sqrt{2} \\ 1 & −1 \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} = A

    (3) Check TA^{-1}T′ = I:

        A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} ⇒ A^{-1} = \frac{1}{8}\begin{bmatrix} 3 & −1 \\ −1 & 3 \end{bmatrix}

        TA^{-1}T′ = \frac{1}{8}\begin{bmatrix} \sqrt{2} & \sqrt{2} \\ 1 & −1 \end{bmatrix} \begin{bmatrix} 3 & −1 \\ −1 & 3 \end{bmatrix} \begin{bmatrix} \sqrt{2} & \sqrt{2} \\ 1 & −1 \end{bmatrix}′
                  = \frac{1}{8}\begin{bmatrix} 2\sqrt{2} & 2\sqrt{2} \\ 4 & −4 \end{bmatrix} \begin{bmatrix} \sqrt{2} & 1 \\ \sqrt{2} & −1 \end{bmatrix}
                  = \frac{1}{8}\begin{bmatrix} 8 & 0 \\ 0 & 8 \end{bmatrix}
                  = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2
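A minimal NumPy sketch reproducing these checks for the same matrix (eigh returns orthonormal eigenvectors for a symmetric matrix, possibly in a different order or with different signs than the hand calculation):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

lam, Q = np.linalg.eigh(A)                     # eigenvalues and orthonormal eigenvectors
print(lam)                                     # [2. 4.]
print(np.allclose(Q.T @ Q, np.eye(2)))         # Q'Q = I
print(np.allclose(Q @ np.diag(lam) @ Q.T, A))  # A = Q Lambda Q'

P = Q @ np.diag(np.sqrt(lam))                  # P = Q Lambda^{1/2}
T = np.diag(np.sqrt(lam)) @ Q.T                # T = Lambda^{1/2} Q'
print(np.allclose(P @ P.T, A))                 # PP' = A
print(np.allclose(T.T @ T, A))                 # T'T = A
print(np.allclose(T @ np.linalg.inv(A) @ T.T, np.eye(2)))   # T A^{-1} T' = I

print(np.isclose(np.prod(lam), np.linalg.det(A)))    # product of eigenvalues = |A|
print(np.isclose(lam.sum(), np.trace(A)))            # sum of eigenvalues = tr(A)
```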

Chapter 2

Multivariate Distributions

2.1 Multivariate Densities


1. Let x denote a vector of random variables X_1, X_2, \ldots, X_k,

   x_{k×1} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_k \end{bmatrix}.

   Then the expected values of x can be expressed as a vector:

   μ = E(x) = \begin{bmatrix} E(X_1) \\ E(X_2) \\ \vdots \\ E(X_k) \end{bmatrix} = \begin{bmatrix} μ_1 \\ μ_2 \\ \vdots \\ μ_k \end{bmatrix}

   The variance-covariance (or covariance) matrix is denoted by

   Var(x) = E[(x − μ)(x − μ)′]

          = E\left[ \begin{bmatrix} X_1 − μ_1 \\ X_2 − μ_2 \\ \vdots \\ X_k − μ_k \end{bmatrix} [(X_1 − μ_1) \ (X_2 − μ_2) \ \cdots \ (X_k − μ_k)] \right]

          = \begin{bmatrix} E(X_1 − μ_1)^2 & E[(X_1 − μ_1)(X_2 − μ_2)] & \cdots & E[(X_1 − μ_1)(X_k − μ_k)] \\ E[(X_2 − μ_2)(X_1 − μ_1)] & E(X_2 − μ_2)^2 & \cdots & E[(X_2 − μ_2)(X_k − μ_k)] \\ \vdots & \vdots & \ddots & \vdots \\ E[(X_k − μ_k)(X_1 − μ_1)] & E[(X_k − μ_k)(X_2 − μ_2)] & \cdots & E(X_k − μ_k)^2 \end{bmatrix}

          = \begin{bmatrix} σ_{11} & σ_{12} & \cdots & σ_{1k} \\ σ_{21} & σ_{22} & \cdots & σ_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ σ_{k1} & σ_{k2} & \cdots & σ_{kk} \end{bmatrix} = Σ
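A minimal NumPy sketch of these population objects estimated from simulated data (the true μ and Σ below are arbitrary choices made for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
mu = np.array([1.0, -2.0, 0.5])                       # arbitrary true mean vector
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])                   # arbitrary p.d. covariance matrix

X = rng.multivariate_normal(mu, Sigma, size=100_000)  # each row is a draw of x'

x_bar = X.mean(axis=0)                                # estimate of E(x)
S = np.cov(X, rowvar=False)                           # estimate of E[(x - mu)(x - mu)']
print(np.round(x_bar, 2))                             # close to mu
print(np.round(S, 2))                                 # close to Sigma
```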

2.2 Multivariate Normal Distribution


1. Let x be a k × 1 normal random vector; its probability density is given by

   f(x) = (2π)^{−k/2} |Σ|^{−1/2} \exp\left( −\frac{1}{2} (x − μ)′ Σ^{−1} (x − μ) \right)

   where μ = [μ_1 \ μ_2 \ \cdots \ μ_k]′ and Σ is the positive definite matrix

   Σ = \begin{bmatrix} σ_{11} & σ_{12} & \cdots & σ_{1k} \\ σ_{21} & σ_{22} & \cdots & σ_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ σ_{k1} & σ_{k2} & \cdots & σ_{kk} \end{bmatrix}

   An abbreviated notation is

   x ∼ N(μ, Σ)

2. If x_{k×1} ∼ N(μ, Σ_{k×k}), then

   (1) x − μ ∼ N(0, Σ).

   (2) Let c be a k × 1 vector of constants; then

       c′x ∼ N(c′μ, c′Σc),

       since E(c′x) = c′E(x) = c′μ and

       Var(c′x) = E[(c′x − E(c′x))(c′x − E(c′x))′]
                = E[(c′x − c′μ)(c′x − c′μ)′]
                = c′E[(x − μ)(x − μ)′]c
                = c′Σc

3. Suppose that xk×1 ∼ N(0, Ik ), then

x′ x ∼ χ2 (k)

4. Suppose that x ∼ N(0, σ^2 I); then

   \frac{1}{σ^2} x′x = x′(σ^2 I)^{−1}x ∼ χ^2(k)

5. Suppose that x ∼ N(0, Σ), where Σ is a positive definite matrix. Then,

x′ Σ−1 x ∼ χ2 (k)

6. If x ∼ N(0, σ^2 I) and A is a (k × k) symmetric and idempotent matrix with rank r (r ≤ k),
   then

   \frac{1}{σ^2} x′Ax ∼ χ^2(r)

   Proof:
   Let Q denote the orthogonal matrix of eigenvectors of A; then

   Q′AQ = Λ = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}

   Define y = Q′x, so that x = Qy. Then E(y) = 0 and

   Var(y) = E(yy′) = E(Q′xx′Q) = Q′(σ^2 I)Q = σ^2 I
   ⇒ y ∼ N(0, σ^2 I)

   ⇒ x′Ax = y′Q′AQy = y′Λy = y_1^2 + y_2^2 + \ldots + y_r^2

   Since y_i/σ ∼ N(0, 1), it follows that

   \frac{x′Ax}{σ^2} ∼ χ^2(r)
7. Suppose x ∼ N(0, σ^2 I) and there are two quadratic forms x′Ax and x′Bx, where A and B
   are symmetric and idempotent matrices. Then x′Ax and x′Bx are statistically independent
   if and only if

   AB = O.

8. Assume x ∼ N(0, σ^2 I). Let L be an (m × n) matrix and A a symmetric matrix of order n.
   Then the linear form Lx is independent of the quadratic form x′Ax if and only if LA = O.
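A small simulation sketch of result 6 (the centering matrix from Section 1.2 is symmetric and idempotent with rank n − 1, so x′Ax/σ² should behave like a χ²(n − 1) variable; the sample sizes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
n, sigma2, reps = 6, 4.0, 200_000
i = np.ones(n)
A = np.eye(n) - np.outer(i, i) / n                      # symmetric, idempotent, rank n-1
r = np.linalg.matrix_rank(A)

X = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))   # each row: x ~ N(0, sigma^2 I)
q = np.einsum('ij,jk,ik->i', X, A, X) / sigma2          # x'Ax / sigma^2 for each draw

print(r)                        # 5
print(q.mean(), q.var())        # approx. r and 2r, the chi-square(r) mean and variance
```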
