Lecture 19
The content of our ODE course requires some basic knowledge from the linear algebra course, and
this lecture serves as a quick review of the pertinent material. I also introduce the notation that I use
throughout the rest of the lectures.
The meaning of this notation is absolutely the same. However, this should not be confused with
\[
\begin{vmatrix} -1 & 0 \\ 0 & 5 \end{vmatrix},
\]
which denotes the determinant of a matrix. I often use the shortcut notation A = [aij]m×n to specify
matrix A. Sometimes it is convenient to write the dimensions of the matrix as indices: Am×n denotes
a matrix with m rows and n columns.
The matrix transpose exchanges rows and columns of a matrix and is usually denoted as A⊤ . For-
mally, the transpose of a matrix Am×n is the matrix Bn×m = A⊤ such that bij = aji, i = 1, . . . , n, j =
1, . . . , m. Here is a simple example:
\[
\begin{bmatrix} -1 & 1 & 3 \\ 2 & 0 & -2 \end{bmatrix}^{\top}
= \begin{bmatrix} -1 & 2 \\ 1 & 0 \\ 3 & -2 \end{bmatrix}.
\]
The matrix with only one row is called the row-vector:
a = (a1, . . . , an).
The matrix with only one column is called the column-vector :
\[
b = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}.
\]
Quite obviously, a transposed row-vector becomes a column-vector and vice versa. To denote
vectors I use small bold letters; in class I usually write ⃗a, ⃗b. Frequently I do not specify whether
a vector is a row or a column vector; in such cases it is meant to be a column-vector. For vectors
I use only one index. Depending on the nature of the constants in a vector (they can be either real
or complex), I use the notation a ∈ Rn, b ∈ Cm, meaning that vector a has n real elements
and vector b has m complex elements.
A matrix A is called square if it has the same number of rows and columns: A = An×n = An .
The elements a11 , a22 , . . . , ann of a square matrix A are said to be on the main diagonal.
A square matrix A is called upper triangular if all its elements below the main diagonal are zero
and lower triangular if all its elements above the main diagonal are zero. Both upper and lower
triangular matrices are often called triangular.
A square matrix A is called diagonal if all its elements outside of the main diagonal are zero.
A diagonal matrix is called an identity matrix if all its diagonal elements are ones. The identity
matrix is usually denoted as I or I n to specify the dimension of the matrix:
\[
I_n = \begin{bmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
0 & 0 & \cdots & 0 & 1
\end{bmatrix}.
\]
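To make the notation concrete, here is a minimal sketch in Python using the numpy library (my choice of tool, not part of the lecture), showing a matrix, its transpose, and the identity matrix:

```python
import numpy as np

# the 2x3 matrix from the transpose example above
A = np.array([[-1, 1, 3],
              [ 2, 0, -2]])

print(A.shape)    # (2, 3): m = 2 rows, n = 3 columns
print(A.T)        # the transpose A^T, a 3x2 matrix
print(np.eye(3))  # the 3x3 identity matrix I_3
```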
Matrices can be added, multiplied by constants, and multiplied by other matrices. Here are the rules:
Multiplication by a constant. Assume that we have a matrix Am×n. To multiply this matrix
by a constant α means to multiply every element of A by α; formally,
αA = [αaij]m×n.
Addition of matrices. To add two matrices we require that they have the same
dimensions; then the addition goes elementwise:
A + B = [aij + bij]m×n.
The rules of multiplication by constants and addition allow us to make sense of, e.g., the following
expression:
\[
5\begin{bmatrix} 1 & -1 \\ 2 & 0 \\ 2 & 1 \end{bmatrix}
- \begin{bmatrix} 2 & 0 \\ -1 & 1 \\ 2 & 3 \end{bmatrix}
= 5\begin{bmatrix} 1 & -1 \\ 2 & 0 \\ 2 & 1 \end{bmatrix}
+ (-1)\begin{bmatrix} 2 & 0 \\ -1 & 1 \\ 2 & 3 \end{bmatrix}
= \begin{bmatrix} 5 & -5 \\ 10 & 0 \\ 10 & 5 \end{bmatrix}
+ \begin{bmatrix} -2 & 0 \\ 1 & -1 \\ -2 & -3 \end{bmatrix}
= \begin{bmatrix} 3 & -5 \\ 11 & -1 \\ 8 & 2 \end{bmatrix}.
\]
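The same computation can be checked in numpy (same assumed library as above):

```python
import numpy as np

A = np.array([[1, -1],
              [2,  0],
              [2,  1]])
B = np.array([[ 2, 0],
              [-1, 1],
              [ 2, 3]])

# 5A - B, computed elementwise exactly as in the text
print(5 * A - B)
# [[ 3 -5]
#  [11 -1]
#  [ 8  2]]
```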
Matrix multiplication. Two matrices Am×k and B k×n can be multiplied if and only if the number
of columns of the first matrix is equal to the number of rows of the second matrix. The result
will be matrix C m×n :
C m×n = Am×k · B k×n .
The elements of the product can be found as
\[
c_{ij} = \sum_{l=1}^{k} a_{il} b_{lj}, \quad i = 1, \ldots, m, \ j = 1, \ldots, n.
\]
The actual strategy to multiply matrices can be most easily learned by an example:
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}_{3\times 2}
\begin{bmatrix} -1 & -3 \\ -2 & -4 \end{bmatrix}_{2\times 2}
=
\begin{bmatrix}
1\cdot(-1) + 2\cdot(-2) & 1\cdot(-3) + 2\cdot(-4) \\
3\cdot(-1) + 4\cdot(-2) & 3\cdot(-3) + 4\cdot(-4) \\
5\cdot(-1) + 6\cdot(-2) & 5\cdot(-3) + 6\cdot(-4)
\end{bmatrix}_{3\times 2}.
\]
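Here is a sketch verifying this product in numpy, both with the built-in product and directly from the formula for cij:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])        # 3x2
B = np.array([[-1, -3],
              [-2, -4]])      # 2x2

# built-in matrix product (the @ operator) ...
print(A @ B)

# ... and the same product computed directly from c_ij = sum_l a_il * b_lj
C = np.zeros((3, 2))
for i in range(3):
    for j in range(2):
        C[i, j] = sum(A[i, l] * B[l, j] for l in range(2))
print(C)
```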
From the definition of matrix multiplication it is clear that even if we can multiply matrices AB,
it does not mean that BA makes sense. However, if A, B are square matrices of the same order
(the same dimension), then both AB and BA are defined and are n × n matrices.
Simple examples (provide one!) show that starting with dimension 2, in general,
AB ≠ BA,
i.e., matrix multiplication is not commutative. Note also that for any square matrix A of order n
one has AI n = I n A = A, which explains the name “identity matrix”: it plays the same role as
the number one plays in the multiplication of numbers.
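Since the text asks you to provide an example, here is one possible pair of 2 × 2 matrices (my own choice), checked in numpy:

```python
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

print(A @ B)  # [[1 0], [0 0]]
print(B @ A)  # [[0 0], [0 1]]  -- so AB != BA
```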
Putting together matrix addition and matrix multiplication, we can see now that indeed a system
of linear first order ODE can be written in the concise matrix form
ẏ = Ay + f (t),
for the matrix A, vector-function f : R → Rn, and the unknown vector-function y(t). Especially
handy here is the strange formula for matrix multiplication defined above, but of course the actual
reason for this (not very natural and intuitive) definition is quite different from simply having a nice
shorthand notation for systems of equations (including algebraic systems). You should definitely consider
taking an intermediate level Linear Algebra course to find out why matrix multiplication is defined
in this way.
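As a small illustration of how the matrix form ẏ = Ay + f(t) is used in practice, here is a sketch that integrates such a system numerically with scipy; the particular A, f, and initial condition below are made up for illustration and are not from the lecture:

```python
import numpy as np
from scipy.integrate import solve_ivp

# a made-up 2x2 example of y' = A y + f(t)
A = np.array([[ 0.0, 1.0],
              [-1.0, 0.0]])

def f(t):
    return np.array([0.0, np.sin(t)])

def rhs(t, y):
    return A @ y + f(t)

sol = solve_ivp(rhs, (0.0, 10.0), y0=[1.0, 0.0], t_eval=np.linspace(0, 10, 101))
print(sol.y[:, -1])  # the state y(10)
```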
19.2 Determinants
Consider a system of two linear algebraic equations with two unknowns in the form
a11 x1 + a12 x2 = b1 ,
a21 x1 + a22 x2 = b2 ,
or, in the matrix form
Ax = b.
Multiply the first equation by a22 and the second by a12, and subtract the second from the first. You’ll
find that
(a11 a22 − a12 a21)x1 = b1 a22 − b2 a12,
or, assuming that a11 a22 − a12 a21 ≠ 0,
\[
x_1 = \frac{b_1 a_{22} - b_2 a_{12}}{a_{11} a_{22} - a_{12} a_{21}}.
\]
Similarly, you can find for x2 :
\[
x_2 = \frac{b_2 a_{11} - b_1 a_{21}}{a_{11} a_{22} - a_{12} a_{21}}.
\]
These formulas provide a unique solution to our system provided, again, that a11 a22 − a12 a21 ≠ 0. Hence
there is a quantity, defined in terms of the elements of matrix A, that indicates when our system
has a unique solution. This quantity is called the determinant of matrix A and is denoted det A or
|A|. Hence we have
\[
\det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.
\]
To define the determinant for a general square matrix A of the n-th order, I choose the recursive
definition, which is convenient for determinant evaluations. In the usual linear algebra courses this
definition follows from other, more conceptual definitions of determinants. First I will need the notion
of a minor. The minor Mij of the element aij in the matrix A is the determinant of the matrix, which
is obtained from A by deleting the i-th row and the j-th column (hence the dimension reduces by
one). The cofactor Cij of the element aij is defined as Cij = (−1)i+j Mij .
Definition 1. The determinant of a square matrix A is the number that can be calculated by either of
the following formulas:
\[
\det A = \sum_{j=1}^{n} a_{ij} C_{ij},
\]
for any 1 ≤ i ≤ n, or
\[
\det A = \sum_{i=1}^{n} a_{ij} C_{ij},
\]
for any 1 ≤ j ≤ n.
Note that there are in total 2n formulas in the above definition, and it is implicitly assumed that
all of them yield the same answer (this is proved in a linear algebra course). The first formula
expands the determinant along row i, and the second formula expands the determinant along column
j. Since each determinant is defined in terms of cofactors, i.e., determinants of matrices of
size (n − 1) × (n − 1), the definition is recursive. To finalize it we need, e.g., the determinant of a
2 × 2 matrix (the formula is above), or we can define the determinant of a 1 × 1 matrix to be equal to the
only element of this matrix: det[a11] = a11.
Example 2. Using the definition, we can find the determinant of a 3 × 3 matrix by expanding
along, e.g., the first row:
\[
\det\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}.
\]
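Definition 1 translates almost verbatim into a (very inefficient) recursive program. Here is a sketch in Python, with numpy used only to build the minors and to cross-check the answer; the test matrix is the one from Example 5 below, whose determinant equals 4:

```python
import numpy as np

def det(A):
    """Determinant by cofactor expansion along the first row (Definition 1)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # minor of the (0, j) entry: delete row 0 and column j
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det(minor)
    return total

A = np.array([[1.0,  0.0, 2.0],
              [1.0, -1.0, 1.0],
              [2.0,  1.0, 1.0]])
print(det(A), np.linalg.det(A))  # both should give 4
```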
The determinant is a function defined on the space of all square matrices, and it possesses a number
of useful properties, which I list without proof:
The determinant of a triangular matrix can be found as the product of the elements on the main
diagonal:
det A = a11 . . . ann ,
if A is triangular. (Q: Can you prove this fact using the definition above?) As a simple corollary,
we have
det I n = 1.
If two rows (or two columns) are switched, then the determinant changes its sign.
If a matrix has two identical rows or columns then its determinant is zero.
det A = det A⊤ .
det(αA) = α^n det A.
Adding a row multiplied by a constant to another row of a matrix does not change the determi-
nant.
det AB = det A det B.
While the definition of the determinant is fine for evaluating determinants of matrices of
reasonable size, in numerical computations one usually uses the properties listed above: the
matrix is first put into triangular form with transformations that do not change the determinant, and then
the determinant of the resulting triangular matrix is calculated.
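Here is a sketch of that strategy in Python/numpy: reduce the matrix to upper triangular form using row operations (adding a multiple of one row to another does not change the determinant; a row swap flips its sign) and then multiply the diagonal entries:

```python
import numpy as np

def det_by_elimination(A):
    U = A.astype(float).copy()
    n = U.shape[0]
    sign = 1.0
    for k in range(n):
        # bring a (largest) nonzero pivot into place; each row swap flips the sign
        p = k + np.argmax(np.abs(U[k:, k]))
        if U[p, k] == 0:
            return 0.0                       # no pivot available: determinant is 0
        if p != k:
            U[[k, p]] = U[[p, k]]
            sign = -sign
        # eliminate entries below the pivot; this does not change the determinant
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
    return sign * np.prod(np.diag(U))

A = np.array([[1.0,  0.0, 2.0],
              [1.0, -1.0, 1.0],
              [2.0,  1.0, 1.0]])
print(det_by_elimination(A))  # 4.0
```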
The determinant gives a simple criterion for when a system of linear algebraic equations has a unique
solution. Namely, the system
Ax = b
of n equations with n unknowns has a unique solution if and only if
det A ≠ 0.
19.3 Inverse matrix and solving Ax = b
Definition 3. A matrix B of dimensions n × n is called the inverse of matrix A of dimensions n × n,
and is denoted A−1, if
AB = BA = I n,
that is, AA−1 = A−1A = I n.
The inverse, when it exists, is unique. To actually calculate the inverse matrix, I present an explicit
formula. It should be noted, however, that this formula requires a significant amount of calculations
and should not be used for matrices of order 5 and above.
\[
A^{-1} = \frac{1}{\det A}\, C^{\top},
\]
where
C = [Cij ]n×n ,
is the matrix composed of the cofactors of A. This formula actually shows, among other things, that
matrix A is invertible (i.e., has an inverse) if and only if det A ≠ 0. Such matrices are called
non-singular.
Example 4. Find the inverse of the matrix
\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.
\]
We have
det A = ad − bc,
and
\[
C = \begin{bmatrix} d & -c \\ -b & a \end{bmatrix}.
\]
Therefore,
\[
A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.
\]
Check that AA−1 = A−1A = I 2.
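A quick numerical check of the 2 × 2 formula for one particular choice of a, b, c, d (my own numbers, numpy assumed):

```python
import numpy as np

a, b, c, d = 1.0, 2.0, 3.0, 4.0
A = np.array([[a, b],
              [c, d]])

# the explicit 2x2 inverse from the formula above
A_inv = np.array([[ d, -b],
                  [-c,  a]]) / (a * d - b * c)

print(np.allclose(A @ A_inv, np.eye(2)))     # True
print(np.allclose(A_inv, np.linalg.inv(A)))  # True
```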
Using the notion of the inverse matrix we can “solve” the system Ax = b in one line:
x = A−1b.
Actually, if you consider in detail what is written in A−1b, you will recover the familiar Cramer’s formulas
for the solution of a linear system:
\[
x_i = \frac{\det \Delta_i}{\det A}, \quad i = 1, \ldots, n,
\]
where ∆i is the matrix which is obtained from A by replacing the i-th column with column-vector b.
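A direct, if inefficient, implementation of Cramer’s formulas (a sketch, numpy assumed; the test system is my own):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b via Cramer's formulas: x_i = det(Delta_i) / det(A)."""
    n = A.shape[0]
    dA = np.linalg.det(A)
    x = np.empty(n)
    for i in range(n):
        Delta_i = A.copy()
        Delta_i[:, i] = b          # replace the i-th column of A by b
        x[i] = np.linalg.det(Delta_i) / dA
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(cramer(A, b))                # [0.8 1.4]
print(np.linalg.solve(A, b))       # the same
```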
Using the properties of the determinant and the formula for the inverse matrix, we have
\[
\det A = \frac{1}{\det A^{-1}}.
\]
Example 5. Solve the system of three linear equations with three unknowns by the inverse matrix method:
x1 + 2x3 = 5,
x1 − x2 + x3 = 5,
2x1 + x2 + x3 = 2.
In matrix form we have
\[
Ax = b, \quad
A = \begin{bmatrix} 1 & 0 & 2 \\ 1 & -1 & 1 \\ 2 & 1 & 1 \end{bmatrix}, \quad
b = \begin{bmatrix} 5 \\ 5 \\ 2 \end{bmatrix}.
\]
First we find det A:
\[
\det A = 1 \cdot \begin{vmatrix} -1 & 1 \\ 1 & 1 \end{vmatrix}
+ 2 \cdot \begin{vmatrix} 1 & -1 \\ 2 & 1 \end{vmatrix}
= -2 + 6 = 4.
\]
The cofactor matrix is
\[
C = \begin{bmatrix} -2 & 1 & 3 \\ 2 & -3 & -1 \\ 2 & 1 & -1 \end{bmatrix}.
\]
Therefore, the inverse matrix is
\[
A^{-1} = \frac{1}{\det A}\, C^{\top}
= \frac{1}{4}\begin{bmatrix} -2 & 2 & 2 \\ 1 & -3 & 1 \\ 3 & -1 & -1 \end{bmatrix},
\]
and the solution is
\[
x = A^{-1}b
= \frac{1}{4}\begin{bmatrix} -2 & 2 & 2 \\ 1 & -3 & 1 \\ 3 & -1 & -1 \end{bmatrix}
\begin{bmatrix} 5 \\ 5 \\ 2 \end{bmatrix}
= \begin{bmatrix} 1 \\ -2 \\ 2 \end{bmatrix}.
\]
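The same answer can be checked numerically (numpy assumed); in practice one calls a linear solver rather than forming the inverse explicitly:

```python
import numpy as np

A = np.array([[1.0,  0.0, 2.0],
              [1.0, -1.0, 1.0],
              [2.0,  1.0, 1.0]])
b = np.array([5.0, 5.0, 2.0])

print(np.linalg.inv(A) @ b)   # [ 1. -2.  2.]  -- x = A^{-1} b
print(np.linalg.solve(A, b))  # the same, without forming A^{-1}
```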
19.4 Vector spaces
Consider the following examples of sets whose elements can be added and multiplied by constants.
The set of column-vectors in R3: such vectors can be added and multiplied by real constants, and the
result is again a column-vector in R3.
The set of column-vectors in Cn. This is not really different from the previous example, however
now we can multiply by complex constants.
The set of real matrices of dimension m × n, let us denote this set as Mm×n . We know that we
can add matrices of the same dimension and multiply them by a real constant. The result will
be in Mm×n .
Consider the fourth order linear homogeneous ODE with constant coefficients:
y (4) + a3 y ′′′ + a2 y ′′ + a1 y ′ + a0 y = 0.
Recall from the previous part of the course that if I denote the set of all the solutions to this
equation as S, then if y1 , y2 ∈ S (i.e., y1 and y2 are solutions) then their linear combination
αy1 + βy2 is also a solution for any constants α, β ∈ R.
Consider the homogeneous system of linear algebraic equations
Ax = 0,
where A = [aij]m×n. This system always has at least one solution (the trivial one). Denote the
set of all solutions as X . Recall from your linear algebra course (or prove it yourself) that if
x1, x2 ∈ X then αx1 + βx2 ∈ X for any α, β ∈ R.
Let Pn denote the set of all polynomials of degree at most n with real coefficients. If you
take two polynomials P1, P2 ∈ Pn, then their linear combination αP1 + βP2 ∈ Pn.
What do the sets R3, Cn, Mm×n, S, X , Pn in the above examples have in common? For all of them it
is true that any linear combination of elements of the set still belongs to the same set. The
mathematical abstraction that deals with such structures is called a vector (or linear ) space. Hence
R3, Cn, Mm×n, S, X , Pn are examples of vector spaces. Here is the formal definition.
Definition 6. A vector space V over the real numbers R (or over the complex numbers C) is a
nonempty set with an operation of addition, such that for any two elements v 1, v 2 ∈ V we have v 1 + v 2 ∈
V, and an operation of multiplication by real (complex) scalars, such that αv ∈ V for any α ∈ R (α ∈ C) and any v ∈ V.
For these two operations the following axioms hold:
v1 + v2 = v2 + v1,
(v 1 + v 2 ) + v 3 = v 1 + (v 2 + v 3 ),
there exists 0 ∈ V, 0 + v = v,
there exists − v, v + (−v) = 0,
α(βv) = (αβ)v,
1 · v = v,
α(v 1 + v 2 ) = αv 1 + αv 2 ,
(α + β)v = αv + βv.
Here α, β ∈ R (or in C), and v, v 1 , v 2 , v 3 ∈ V. The elements of a vector space are called vectors.
It is a good exercise to convince yourself that for all the examples above, the sets R3 , Cn , Mm×n , S,
X , Pn are vector spaces, for which all the listed axioms hold. Hence in each case the elements of these
sets are vectors. This may be a confusing point, since we are used to the idea that vectors are
arrays of numbers, as in the first two examples. However, a matrix, a solution to a linear differential
equation, and a polynomial are also vectors. There is a good reason to call arrays of numbers
vectors, but for the details you need to consult your linear algebra course.
Why is this abstract definition of a vector space useful? Because in many cases we can express any
element of the vector space using its basis. Here are some more definitions.
Consider an ordered set of vectors {v 1 , . . . , v k }. This set is said to be linearly independent if any
linear combination of these vectors
α1 v 1 + · · · + αk v k
is equal to zero only if α1 = . . . = αk = 0. This ordered set of vectors is said to be linearly dependent
if there exist constants α1 , . . . , αk not equal to zero simultaneously such that
α1 v 1 + · · · + αk v k = 0.
With a slight abuse of language I will refer to the vectors belonging to a linearly independent (or
dependent) set simply as linearly independent (or dependent) vectors.
The notion of linear independence means that none of the vectors in the set can be expressed
as a linear combination of the rest of them; if the vectors are linearly dependent, then there is a linear
combination expressing at least one of the vectors through the rest.
The span of the set of vectors {v 1 , . . . , v k } is by definition the set of all possible linear combinations of these
vectors. A vector space V is called finite dimensional if there exists a finite set of vectors that spans V;
otherwise V is called infinite dimensional. A basis of a vector space V is a set of vectors {v 1 , . . . , v n }
that 1) is linearly independent and 2) spans V. The dimension of a finite-dimensional vector space
V is the number of vectors in a basis. It is a basic fact from linear algebra that any basis of a finite
dimensional vector space has the same number of elements, and hence the definition of the dimension
makes perfect sense.
Example 7. Vector space R3 is 3-dimensional. To show this fact we need to present a basis. The
standard basis of R3 is
\[
e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad
e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.
\]
Why is this set of vectors a basis? Because, first, any vector is in the span of these vectors, i.e., any vector
can be presented as a linear combination:
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1 e_1 + x_2 e_2 + x_3 e_3.
\]
It is said that x1, x2, x3 are the coordinates of x in the basis (e1 , e2 , e3 ). Second, we need to show
that these three vectors are linearly independent. Assume the opposite: let α1, α2, α3 be constants, not
all equal to zero, such that
α1 e1 + α2 e2 + α3 e3 = 0.
The last expression can be rewritten in the matrix form
\[
I\alpha = 0, \quad \alpha = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix}.
\]
The vectors e1 , e2 , e3 are the columns of the matrix I. This is a system of 3 linear algebraic homo-
geneous equations with 3 unknowns. Since det I = 1, it has a unique solution, which is the zero vector.
Therefore α = 0, a contradiction.
Actually, it is possible to generalize this example in the following way: in Rn n vectors v 1 , . . . , v n
form a basis if and only if the matrix V , where the i-th column is given by v i , has nonzero determinant:
det V ≠ 0.
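Here is a small sketch of this criterion in numpy; the candidate vectors are my own choice:

```python
import numpy as np

# three candidate basis vectors of R^3 (chosen for illustration)
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = np.array([1.0, 1.0, 0.0])

V = np.column_stack([v1, v2, v3])
print(np.linalg.det(V))   # -2, nonzero, so {v1, v2, v3} is a basis of R^3

# a linearly dependent set fails the test: here the third column is v1 + v2
W = np.column_stack([v1, v2, v1 + v2])
print(np.linalg.det(W))   # 0 (up to roundoff): not a basis
```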
Example 8. The vector space of solutions to the equation
y ′′ + y = 0
is two dimensional. To show this fact we need to present a basis. I claim that the basis can be taken
as {cos t, sin t}. The general solution to this equation has the form
y(t) = C1 cos t + C2 sin t,
therefore any solution can be represented as a linear combination of cos and sin. Moreover, these two
functions are linearly independent on any interval I. To prove this we can use the Wronskian, or we
can do it directly, by considering the linear combination
α cos t + β sin t = 0 for all t.
Plugging in t = 0 gives α = 0, and plugging in t = π/2 gives β = 0; hence cos t and sin t are linearly
independent. For instance, the solution y(t) = cos t can be written as y = 1 · cos t + 0 · sin t, and
it is said that this solution has coordinates (1, 0) in the basis {cos t, sin t}.
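As a check of the Wronskian argument mentioned above, here is a sketch using Python's sympy library (an assumption on my part; the computation is just W = y1 y2′ − y2 y1′):

```python
import sympy as sp

t = sp.symbols('t')
y1, y2 = sp.cos(t), sp.sin(t)

# Wronskian W(t) = y1*y2' - y2*y1'
W = sp.simplify(y1 * sp.diff(y2, t) - y2 * sp.diff(y1, t))
print(W)  # 1, which is never zero, so cos t and sin t are linearly independent
```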