Linear Algebra Primer: Daniel S. Stutts, Ph.D.
1 Introduction
This primer was written to provide a brief overview of the main concepts and methods in elementary linear algebra. It was not intended to take the place of any of the many elementary linear algebra texts on the market. It contains relatively few examples and no exercises. The interested reader will find more in-depth coverage of these topics in introductory textbooks. Much of the material, including the order in which it is presented, comes from Howard Anton's "Elementary Linear Algebra," 2nd Ed., John Wiley, 1977. Another excellent basic text is "Matrices and Linear Transformations" by Charles G. Cullen. A more advanced text is "Linear Algebra and Its Applications" by Gilbert Strang.
The author hopes that this primer will answer some of your questions as they arise, and provide some motivation (prime the pump, so to speak) for you to explore the subject in more depth. At the very least, you now have a list (albeit a short one) of references from which to obtain more in-depth explanation.
It should be noted that the examples given here have been motivated by the solution of consistent systems of equations which have an equal number of unknowns and equations. Therefore, only the analysis of square (n by n) matrices has been presented. Furthermore, only the properties of real matrices (those with real elements) have been included.
Subscripts are used to denote elements of matrices or vectors. Superscripts (when not referring to exponentiation) are used to identify eigenvectors and their respective components. Thus, the $ij^{th}$ component of the sum of two matrices, A and B, may be written

$$[A + B]_{ij} = a_{ij} + b_{ij} \quad (1)$$
Example 1
$$C = A_{(2\times 2)} + B_{(2\times 2)} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix}$$
2 Linear Systems of Equations
The following system of n equations in n unknowns may be written in matrix form:

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} \quad (2)$$
or
$$Ax = b \quad (3)$$
where A is an n by n matrix and x and b are n by 1 matrices or vectors.
While the solution of systems of linear equations provides one significant motivation to study
matrices and their properties, there are numerous other applications for matrices. All applications
of matrices require a reasonable degree of understanding of matrix and vector properties.
Several elementary properties of matrix addition and multiplication are enumerated below:

1. A + B = B + A
2. A + (B + C) = (A + B) + C
3. A(B + C) = AB + AC
4. Identity Matrix: AI = IA = A, where

$$I = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$$
5. Zero Matrix: 0A = A0 = 0
$$0 = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}$$
6. A + 0 = A
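These identities are easy to spot-check numerically. The following sketch (a Python/numpy illustration added here for convenience; it is not part of the original development) verifies them for random 3 by 3 matrices:

```python
import numpy as np

# Spot-check the matrix-algebra identities above on random 3x3 matrices.
rng = np.random.default_rng(0)
A, B, C = (rng.random((3, 3)) for _ in range(3))
I = np.eye(3)         # identity matrix
Z = np.zeros((3, 3))  # zero matrix

assert np.allclose(A + B, B + A)                        # 1. A + B = B + A
assert np.allclose(A + (B + C), (A + B) + C)            # 2. associativity
assert np.allclose(A @ (B + C), A @ B + A @ C)          # 3. distributivity
assert np.allclose(A @ I, A) and np.allclose(I @ A, A)  # 4. AI = IA = A
assert np.allclose(Z @ A, Z) and np.allclose(A @ Z, Z)  # 5. 0A = A0 = 0
assert np.allclose(A + Z, A)                            # 6. A + 0 = A
```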
Example 2
$$AB = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}$$
Note that in general, matrices are not commutative over multiplication:

$$AB \neq BA \quad (6)$$

This fact leads to the definition of pre-multiplication and post-multiplication. Other terminology in common use for the direction of multiplication is left multiplication and right multiplication. In Equation (6), on the r.h.s., A is being pre-multiplied by B, and on the l.h.s., A is being post-multiplied by B.
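A quick numerical illustration of this fact (again a numpy sketch added for this primer) shows the two products differing:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(A @ B)  # A post-multiplied by B: [[2 1] [4 3]]
print(B @ A)  # A pre-multiplied by B:  [[3 4] [1 2]]
```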
3. Multiplication by a vector:
$$Ax = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 \\ a_{21}x_1 + a_{22}x_2 \end{bmatrix}$$
Pre-multiplication of a matrix by a vector requires taking the transpose of the vector first in
order to comply with the rules of matrix multiplication.
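In numpy terms (a small sketch, not from the original text), post- and pre-multiplication of a 2 by 2 matrix by a column vector look like this:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
x = np.array([[5.0], [6.0]])  # a 2x1 column vector

print(A @ x)    # post-multiplication Ax: a 2x1 column vector
print(x.T @ A)  # pre-multiplication: transpose x first, giving a 1x2 row vector
```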
The vector cross product may be written symbolically as a determinant:

$$A \times B = \det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}$$
This method of computing the determinant is called cofactor expansion. A cofactor is the signed minor of a given element in a matrix. A minor, $M_{ij}$, is the determinant of the submatrix which remains after the $i^{th}$ row and the $j^{th}$ column of the matrix are deleted. In this case, we have
$$M_{13} = a_1 b_2 - a_2 b_1 \quad (10)$$
The cofactors are given by

$$c_{ij} = (-1)^{i+j} M_{ij}$$

so that $c_{11} = M_{11}$, $c_{12} = -M_{12}$, $c_{13} = M_{13}$, etc.
In the above example,
$$\det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix} = c_{11}\,\mathbf{i} + c_{12}\,\mathbf{j} + c_{13}\,\mathbf{k} \quad (14)$$
Example 3 Let $A = \begin{bmatrix} 2 & 1 \\ 3 & -2 \end{bmatrix}$; then $\det A = (2)(-2) - (1)(3) = -7$.
Example 4 Let

$$A = \begin{bmatrix} 2 & 1 & 0 \\ 3 & -2 & 1 \\ 1 & -1 & 2 \end{bmatrix}$$

then, expanding along the first row,

$$\det A = 2\begin{vmatrix} -2 & 1 \\ -1 & 2 \end{vmatrix} - 1\begin{vmatrix} 3 & 1 \\ 1 & 2 \end{vmatrix} + 0 = 2(-3) - (5) = -11$$
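Cofactor expansion translates directly into a short recursive routine. The sketch below (written for this primer in Python/numpy; the function name is ours) expands along the first row and reproduces the result of Example 4:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return float(A[0, 0])
    total = 0.0
    for j in range(n):
        # Minor M_1j: delete the first row and the j-th column.
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        # The cofactor c_1j is the signed minor: (-1)**(1+j) with 1-based
        # indexing, which is (-1)**j here with 0-based j.
        total += (-1) ** j * float(A[0, j]) * det_cofactor(minor)
    return total

A = np.array([[2.0, 1.0, 0.0], [3.0, -2.0, 1.0], [1.0, -1.0, 2.0]])
print(det_cofactor(A))  # -11.0, as in Example 4
```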
Expansion by cofactors is mostly useful for small matrices (less than 4 × 4). For larger matrices, the number of operations becomes prohibitively large: cofactor expansion of an n × n determinant requires on the order of n! multiplications, so, for example, a 25 × 25 determinant would require roughly 25! ≈ 1.6 × 10^25 of them. This trend suggests that even the largest and fastest computers would choke on such a computation.
For large matrices, the determinant is best computed using row reduction. Row reduction consists of using elementary row and column operations to reduce a matrix to a simpler form, usually upper or lower triangular. This is accomplished by multiplying one row by a constant and adding it to another row to produce a zero at the desired position.
Example 5 Let

$$A = \begin{bmatrix} 2 & 1 & 0 \\ 3 & -2 & 1 \\ 1 & -1 & 2 \end{bmatrix}$$

Reduce A to upper triangular form, i.e., all zeros under the main diagonal. Subtracting 3/2 times the first row from the second, 1/2 times the first row from the third, and then 3/7 times the (new) second row from the third yields

$$\begin{bmatrix} 2 & 1 & 0 \\ 0 & -\frac{7}{2} & 1 \\ 0 & 0 & \frac{11}{7} \end{bmatrix}$$

The determinant is now easily computed by multiplying the elements of the main diagonal:

$$\det A = 2\left(-\frac{7}{2}\right)\left(\frac{11}{7}\right) = -11$$
This type of row reduction is called Gaussian elimination and is much more efficient than the cofactor expansion technique for large matrices.
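The same idea is easy to code. The following sketch (our Python/numpy implementation, with partial pivoting added for numerical safety; row swaps flip the sign of the determinant) reduces A to upper triangular form and multiplies the diagonal:

```python
import numpy as np

def det_by_elimination(A):
    """Determinant via Gaussian elimination to upper triangular form."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    sign = 1.0
    for k in range(n - 1):
        # Partial pivoting: move the largest entry in column k to the pivot row.
        p = k + int(np.argmax(np.abs(U[k:, k])))
        if U[p, k] == 0.0:
            return 0.0  # an all-zero pivot column means det(A) = 0
        if p != k:
            U[[k, p]] = U[[p, k]]  # a row swap changes the sign of det
            sign = -sign
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
    return sign * float(np.prod(np.diag(U)))

A = np.array([[2.0, 1.0, 0.0], [3.0, -2.0, 1.0], [1.0, -1.0, 2.0]])
print(det_by_elimination(A))  # -11.0, as in Example 5
```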
3. det(AB) = det(A) det(B), where A and B are square matrices of the same size.

4. If A is invertible, then det A ≠ 0 and det(A⁻¹) = 1/det(A).
Proof 1 If A is invertible, then $AA^{-1} = I \Rightarrow \det\left(AA^{-1}\right) = (\det A)\left(\det A^{-1}\right) = \det I = 1$. Thus, $\det A^{-1} = \frac{1}{\det A} \Rightarrow \det A \neq 0$.
An important implication of this result is the following:

5. The homogeneous system Ax = 0 possesses a nontrivial solution x if and only if det A = 0.

This result is used very often in applied mathematics, physics, and engineering.
6. If A is invertible, then

$$A^{-1} = \frac{1}{\det(A)}\,\mathrm{adj}(A)$$

where adj(A) is the adjoint of A.
Definition 2 The adjoint of a matrix A is defined as the transpose of the cofactor matrix of
A.
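For the 2 by 2 case, this formula reduces to the familiar result below (a worked line added here for illustration):

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \;\Rightarrow\; \mathrm{adj}(A) = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \;\Rightarrow\; A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$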
Another way to calculate the inverse of a matrix is by Gaussian elimination. This method is easier to apply to larger matrices.
Since $A^{-1}A = I$, we start with the matrix A, which we want to invert, on the left and the identity matrix on the right. We then perform elementary row operations (Gaussian elimination) on the matrix while simultaneously performing the same operations on I. This can be accomplished by adjoining the two matrices to form a matrix of the form $[A \mid I]$.
Example 6
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix} \quad (15)$$
Adjoining A with I yields
$$\left[\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 2 & 5 & 3 & 0 & 1 & 0 \\ 1 & 0 & 8 & 0 & 0 & 1 \end{array}\right] \quad (16)$$
Adding -2 times the first row to the second row and -1 times the first row to the third yields
$$\left[\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & -2 & 5 & -1 & 0 & 1 \end{array}\right] \quad (17)$$
Adding 2 times the second row to the third and then multiplying the third row by −1 yields

$$\left[\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right]$$
Adding 3 times the third row to the second and −3 times the third row to the first yields

$$\left[\begin{array}{ccc|ccc} 1 & 2 & 0 & -14 & 6 & 3 \\ 0 & 1 & 0 & 13 & -5 & -3 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right]$$

Finally, adding −2 times the second row to the first yields

$$\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -40 & 16 & 9 \\ 0 & 1 & 0 & 13 & -5 & -3 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right]$$

Thus,

$$A^{-1} = \begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix}$$
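The procedure of Example 6 mechanizes directly. The sketch below (our Python/numpy version of Gauss-Jordan inversion, with partial pivoting added; the function name is ours) reduces [A | I] until the left block becomes I:

```python
import numpy as np

def invert_gauss_jordan(A):
    """Invert A by row-reducing the adjoined matrix [A | I] to [I | A^-1]."""
    n = A.shape[0]
    M = np.hstack([np.array(A, dtype=float), np.eye(n)])  # adjoin I
    for k in range(n):
        p = k + int(np.argmax(np.abs(M[k:, k])))  # partial pivoting
        if M[p, k] == 0.0:
            raise ValueError("matrix is singular")
        M[[k, p]] = M[[p, k]]
        M[k] /= M[k, k]                 # scale the pivot row so the pivot is 1
        for i in range(n):
            if i != k:
                M[i] -= M[i, k] * M[k]  # zero the rest of column k
    return M[:, n:]

A = np.array([[1.0, 2.0, 3.0], [2.0, 5.0, 3.0], [1.0, 0.0, 8.0]])
print(invert_gauss_jordan(A))  # [[-40 16 9] [13 -5 -3] [5 -2 -1]], as above
```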
Example 7 Solve the following system of equations using Cramer's rule:

$$\begin{bmatrix} 2 & 4 & -2 \\ 0 & 2 & 3 \\ 1 & 0 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 18 \\ -2 \\ -7 \end{bmatrix}$$
Next, we define the three matrices formed by replacing in turn each of the columns of A with b:
$$A_1 = \begin{bmatrix} 18 & 4 & -2 \\ -2 & 2 & 3 \\ -7 & 0 & 5 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 2 & 18 & -2 \\ 0 & -2 & 3 \\ 1 & -7 & 5 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 2 & 4 & 18 \\ 0 & 2 & -2 \\ 1 & 0 & -7 \end{bmatrix}$$
Next, we compute the individual determinants:

$$\det(A) = 36, \quad \det(A_1) = 108, \quad \det(A_2) = 72, \quad \det(A_3) = -72$$

Thus,

$$x_1 = \frac{\det(A_1)}{\det(A)} = \frac{108}{36} = 3, \quad x_2 = \frac{\det(A_2)}{\det(A)} = \frac{72}{36} = 2, \quad x_3 = \frac{\det(A_3)}{\det(A)} = \frac{-72}{36} = -2$$
Example 8
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \quad (21)$$

$$\det(A) = a_{11}a_{22} - a_{12}a_{21}, \quad \det(A_1) = b_1 a_{22} - a_{12} b_2, \quad \det(A_2) = a_{11} b_2 - b_1 a_{21} \quad (22)$$

Thus,

$$x_1 = \frac{\det(A_1)}{\det(A)} = \frac{a_{22}b_1 - a_{12}b_2}{a_{11}a_{22} - a_{12}a_{21}}, \quad x_2 = \frac{\det(A_2)}{\det(A)} = \frac{a_{11}b_2 - b_1 a_{21}}{a_{11}a_{22} - a_{12}a_{21}} \quad (23)$$
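Cramer's rule is equally short in code. This sketch (our Python/numpy helper, added for illustration) replaces each column of A with b in turn and reproduces the solution of Example 7:

```python
import numpy as np

def solve_cramer(A, b):
    """Solve Ax = b by Cramer's rule: x_i = det(A_i) / det(A)."""
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("det(A) = 0, so Cramer's rule does not apply")
    x = np.empty(A.shape[0])
    for i in range(A.shape[0]):
        Ai = np.array(A, dtype=float)
        Ai[:, i] = b  # replace the i-th column of A with b
        x[i] = np.linalg.det(Ai) / d
    return x

A = np.array([[2.0, 4.0, -2.0], [0.0, 2.0, 3.0], [1.0, 0.0, 5.0]])
b = np.array([18.0, -2.0, -7.0])
print(solve_cramer(A, b))  # [ 3.  2. -2.], as in Example 7
```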
The eigenvalue problem consists of finding the scalars λ and nonzero vectors x which satisfy

$$Ax = \lambda x \quad (24)$$

This may be rewritten as

$$(A - \lambda I)x = 0 \quad (25)$$
Hence, from property (5) above, there exists a nontrivial x if and only if

$$\det(A - \lambda I) = 0 \quad (26)$$
Evaluation of the above results in a polynomial in λ. This is the so-called characteristic polynomial, and its roots λi are the characteristic values or eigenvalues. Evaluation of (26) yields

$$\lambda^n + c_1 \lambda^{n-1} + c_2 \lambda^{n-2} + \cdots + c_n = 0 \quad (27)$$

Furthermore, the solution of (25), $x^i$, corresponding to the $i^{th}$ eigenvalue, is the $i^{th}$ eigenvector of the matrix A. It can be shown that the matrix A itself satisfies the characteristic polynomial:

$$A^n + c_1 A^{n-1} + c_2 A^{n-2} + \cdots + c_n I = 0$$

This result is known as the Cayley-Hamilton theorem. It may be shown that the matrix A is also annihilated by a minimum polynomial of degree less than or equal to that of the characteristic polynomial.
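As a quick numerical check (a numpy sketch added here), one can verify the Cayley-Hamilton theorem for the 2 by 2 matrix used in the next example, whose characteristic polynomial is λ² − 2λ − 3:

```python
import numpy as np

# A has characteristic polynomial lambda^2 - 2*lambda - 3, so by the
# Cayley-Hamilton theorem A^2 - 2A - 3I must be the zero matrix.
A = np.array([[4.0, -5.0], [1.0, -2.0]])
I = np.eye(2)

print(A @ A - 2.0 * A - 3.0 * I)  # [[0. 0.] [0. 0.]]
```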
Example 9 Let

$$A = \begin{bmatrix} 4 & -5 \\ 1 & -2 \end{bmatrix}$$

Then

$$\det(A - \lambda I) = \begin{vmatrix} 4-\lambda & -5 \\ 1 & -2-\lambda \end{vmatrix} = (4 - \lambda)(-2 - \lambda) + 5 = 0$$

or

$$\lambda^2 - 2\lambda - 3 = (\lambda - 3)(\lambda + 1) = 0$$
Thus, λ1 = −1 and λ2 = 3. The eigenvector of A corresponding to λ1 = −1 may be found as follows:

$$(A - \lambda_1 I)\,x^1 = \begin{bmatrix} 4-(-1) & -5 \\ 1 & -2-(-1) \end{bmatrix} \begin{bmatrix} x_1^1 \\ x_2^1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
or
$$\begin{bmatrix} 5 & -5 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1^1 \\ x_2^1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
which has an obvious solution of
$$x^1 = c_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
Similarly, substituting λ2 = 3 yields
$$\begin{bmatrix} 1 & -5 \\ 1 & -5 \end{bmatrix} \begin{bmatrix} x_1^2 \\ x_2^2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
Thus,
$$x^2 = c_2 \begin{bmatrix} 5 \\ 1 \end{bmatrix}$$
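These hand computations can be checked with numpy's eigensolver (a sketch added for this primer; note that numpy normalizes eigenvectors to unit length, so they agree with x¹ and x² only up to the arbitrary constants c₁ and c₂):

```python
import numpy as np

A = np.array([[4.0, -5.0], [1.0, -2.0]])
lam, V = np.linalg.eig(A)  # columns of V are unit-length eigenvectors

print(lam)  # eigenvalues 3. and -1. (in some order)
for k in range(2):
    # Each column satisfies A v = lambda v; the directions are those of
    # [5, 1] and [1, 1] found above.
    assert np.allclose(A @ V[:, k], lam[k] * V[:, k])
```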
The calculation of eigenvalues and eigenvectors has many applications. One important application is the similarity transformation. Under certain conditions, a general system of equations may be transformed into a diagonal system. The most important case is that of symmetric matrices, which will be discussed later. In other words, the system Ax = b may be transformed into an equivalent system Dy = c, where D is a diagonal matrix, making the solution of Dy = c especially easy. Two square matrices A and B are said to be similar if there exists an invertible matrix P such that A = PBP⁻¹ (equivalently, B = P⁻¹AP); similar matrices share the same determinant, trace, and eigenvalues. It turns out that an n × n matrix A is diagonalizable if and only if A has n linearly independent eigenvectors.
Example 10 Since the previous example had two independent eigenvectors (i.e., $x^1 \neq K x^2$ for any scalar K),

$$A = \begin{bmatrix} 4 & -5 \\ 1 & -2 \end{bmatrix} \quad (28)$$

should be diagonalizable. A matrix P composed of $x^1$ and $x^2$ as its two columns will diagonalize A. We show that this is so by trying it!
$$P^{-1}AP = \begin{bmatrix} -\frac{1}{4} & \frac{5}{4} \\ \frac{1}{4} & -\frac{1}{4} \end{bmatrix} \begin{bmatrix} 4 & -5 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 1 & 1 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} -1 & 5 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} -1 & 15 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 3 \end{bmatrix} = D$$
We see that D is composed of λ1 and λ2 on its main diagonal and zeros elsewhere. Hence, the matrix A was indeed similar to a diagonal matrix.
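The same check in numpy (another added sketch) takes three lines:

```python
import numpy as np

A = np.array([[4.0, -5.0], [1.0, -2.0]])
P = np.array([[1.0, 5.0], [1.0, 1.0]])  # columns are the eigenvectors x^1, x^2

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 12))  # diag(-1, 3): the eigenvalues on the main diagonal
```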
3.7 Special Properties of Symmetric Matrices
Symmetric matrices have several special properties. The principal ones for an n by n symmetric
matrix are enumerated below:
1. Symmetric real matrices (those with real elements) have n real eigenvalues.

2. The eigenvectors corresponding to distinct eigenvalues of a symmetric matrix are orthogonal, as the following proof shows.
Proof 3 If $A = A^T$, and

$$Ax^1 = \lambda_1 x^1 \quad (29)$$

and

$$Ax^2 = \lambda_2 x^2 \quad (30)$$

then premultiplying (29) by $(x^2)^T$ and (30) by $(x^1)^T$ yields

$$(x^2)^T A x^1 = \lambda_1 (x^2)^T x^1 \quad (31)$$

and

$$(x^1)^T A x^2 = \lambda_2 (x^1)^T x^2 \quad (32)$$

Since $(x^2)^T x^1 = (x^1)^T x^2$ and, by the symmetry of A,

$$(x^2)^T A x^1 = (x^1)^T A x^2 \quad (33)$$

subtraction of (31) from (32) yields

$$(\lambda_2 - \lambda_1)\,(x^1)^T x^2 = 0 \quad (34)$$

Hence, if $\lambda_1 \neq \lambda_2$, the corresponding eigenvectors are orthogonal.
One of the most important consequences of the above for symmetric matrices is that all symmetric
matrices are similar to a diagonal matrix. This fact has powerful consequences in the solution of
systems of linear ordinary differential equations with constant coefficients which result from the
application of Newton’s 2nd law, or Hamilton’s principle. Essentially, such systems, which usually
result from symmetric operators, may be uncoupled by a similarity transformation, and hence, each
ordinary differential equation solved individually. Exceptions to this rule include systems modeled
with general viscous damping, and those with gyroscopic inertial terms.
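For symmetric matrices, numpy provides a dedicated solver, eigh, which exploits symmetry and returns real eigenvalues with an orthonormal set of eigenvectors. A small sketch (ours, with an arbitrarily chosen symmetric matrix) illustrates the similarity to a diagonal matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])  # a real symmetric matrix
lam, Q = np.linalg.eigh(A)              # symmetric eigensolver

print(lam)                                     # [1. 3.]: all eigenvalues real
assert np.allclose(Q.T @ Q, np.eye(2))         # eigenvectors are orthonormal
assert np.allclose(Q.T @ A @ Q, np.diag(lam))  # Q^T A Q = D (since Q^-1 = Q^T)
```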
Example 11 (symmetric)

$$A = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{bmatrix}$$

Characteristic polynomial: $(\lambda - 4)^2(\lambda - 5) = 0$, so $\lambda_1 = 5$ and $\lambda_2 = \lambda_3 = 4$.

$$\lambda_1 = 5: \quad x^1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

$$\lambda_2 = \lambda_3 = 4: \quad x^2 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \quad x^3 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \quad (35)$$

Even though λ = 4 is repeated, the symmetric matrix A possesses three linearly independent (indeed, mutually orthogonal) eigenvectors.
Example 12 (non-symmetric)

$$A = \begin{bmatrix} 5 & 0 & 0 \\ 4 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$

Characteristic polynomial: $(\lambda - 5)^2(\lambda - 3) = 0$, so $\lambda_1 = 3$ and $\lambda_2 = \lambda_3 = 5$.

$$\lambda_1 = 3: \quad x^1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

$$\lambda_2 = \lambda_3 = 5: \quad x^2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$

Here the repeated eigenvalue λ = 5 yields only one independent eigenvector, so this non-symmetric matrix has only two linearly independent eigenvectors in all and is therefore not diagonalizable.
4 Bibliography
1. Anton, Howard, "Elementary Linear Algebra," 2nd Ed., John Wiley & Sons, 1977.

2. Cullen, Charles G., "Matrices and Linear Transformations," 2nd Ed., Addison-Wesley, 1972; reprinted by Dover, 1990.

3. Strang, Gilbert, "Linear Algebra and Its Applications," 3rd Ed., International Thomson Publishing, 1988.