Lectures On Linear Algebra
A field F is a set with two operations, called addition and multiplication, which are denoted
by + and · (often omitted), respectively, and which satisfy the following properties:
3. There exists a unique identity element for each operation, denoted by 0 and 1, i.e.,
0 + a = a + 0 = a and 1a = a1 = a for all a ∈ F
The axiom system above is not the ‘most economical’ one. Check that it implies that 0a = a0 = 0
for every a ∈ F.
Examples of fields.
Fp – the finite field of p elements, p prime. The field Fp is often denoted by Z/pZ or Zp , and
is thought of as the set of p elements {0, 1, . . . , p − 1} where addition and multiplication are done
modulo p.
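For readers who want to experiment, here is a minimal sketch of arithmetic in Fp in Python (p = 5 and the helper names add, mul, inv are our own, not from the text); it checks numerically that every nonzero element has a multiplicative inverse.

```python
# Arithmetic in F_p is done modulo p (here p = 5; any prime works).
p = 5

def add(a, b):
    return (a + b) % p

def mul(a, b):
    return (a * b) % p

def inv(a):
    # multiplicative inverse of a nonzero element, found by brute force
    return next(b for b in range(1, p) if mul(a, b) == 1)

assert all(add(a, (p - a) % p) == 0 for a in range(p))    # additive inverses exist
assert all(mul(a, inv(a)) == 1 for a in range(1, p))      # multiplicative inverses exist
print([inv(a) for a in range(1, p)])                      # [1, 3, 2, 4]
```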
Suppose V is a set whose elements are called vectors and denoted by v̄. Suppose there exists
a binary operation on V , called addition and denoted by +, which is commutative and associative,
has an identity element, denoted by 0̄ and called the zero vector, and is such that every vector v̄
has an additive inverse, which we denote by −v̄.
Suppose F is a field and there exists a function µ : F × V → V given by (k, v̄) ↦ kv̄,
which satisfies the following axioms:
1. 1v̄ = v̄ for all v̄ ∈ V
An ordered triple ((V, +), F, µ) is called a vector space. In a simpler manner, we just say that
V is a vector space over the field F. The function µ is mentioned very rarely.
The addition on Fn and the multiplication by scalars are defined componentwise: for every
v̄ = (v1 , . . . , vn ), ū = (u1 , . . . , un ) ∈ V , and every k ∈ F,
v̄ + ū = (v1 + u1 , . . . , vn + un ) and kv̄ = (kv1 , . . . , kvn ).
Then V = Fn is a vector space over F, which can be (and has to be) checked easily. For F = R,
we obtain the well familiar space Rn .
• Let V = C(0, 1) be the set of all continuous real valued functions on (0, 1): f : (0, 1) → R.
The addition on C(0, 1) and the scalar multiplication are defined pointwise: for any f, g ∈
C(0, 1), and every k ∈ R, (f + g)(x) = f (x) + g(x) and (kf )(x) = kf (x) for all x ∈ (0, 1).
Again, C(0, 1) is a vector space over R, which has to be checked. Here the fact that + is a
binary operation on C(0, 1), or more precisely, that it is ‘closed’, which means f + g ∈ C(0, 1),
is not a trivial matter. The same holds for kf , though it is a little easier. Similarly we can consider
C(R) or C^1 (0, 1) – the vector space over R of all real-valued differentiable functions from (0, 1)
to R with continuous first derivative.
• V = F[x] – set of all polynomials of x with coefficients from a field F. V is a vector space
over F with respect to the usual addition of polynomials and the multiplication of polynomials
by numbers (elements of F).
• V is the set of all functions y : R → R which are solutions of the differential equation y′′ − 5y′ + 6y = 0.
• V is the set of all sequences of real numbers (xn ), n ≥ 0, defined by the recurrences:
x0 = a, x1 = b, a, b ∈ R, and for all n ≥ 0, xn+2 = 2xn+1 − 3xn .
• V = Mm×n (F) – the set of all m × n matrices with entries from F with respect to usual
addition of matrices and the multiplication of matrices by scalars.
• Let A ∈ Mm×n , i.e., A is an m × n matrix over R. Then the set V of all solutions of the
homogeneous system of linear equations Ax̄ = 0̄, i.e., the set of all vectors x̄ ∈ Rn such that
Ax̄ = 0̄, is a vector space over R with respect to usual addition of vectors and the scalar
multiplication in Rn . Note that in this example elements of Rn are thought of as the column
vectors ( n × 1 matrices).
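As an informal sanity check (not part of the text), one can verify numerically that the solution set of Ax̄ = 0̄ is closed under addition and scalar multiplication. A small sketch with NumPy; the matrix A and the particular solutions are our own choices.

```python
import numpy as np

# A concrete 2x4 real matrix; the solutions of A x = 0 form a subspace of R^4.
A = np.array([[1., 2., 0., -1.],
              [0., 1., 1.,  3.]])

x1 = np.array([2., -1., 1., 0.])   # a solution found by hand
x2 = np.array([7., -3., 0., 1.])   # another solution found by hand
assert np.allclose(A @ x1, 0) and np.allclose(A @ x2, 0)

# Closure under the vector space operations of R^4:
k = 3.5
assert np.allclose(A @ (x1 + x2), 0)
assert np.allclose(A @ (k * x1), 0)
```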
Proof. (i) Indeed, if 0̄′ is another identity element, then 0̄ + 0̄′ = 0̄ as 0̄′ is an identity,
and 0̄ + 0̄′ = 0̄′ as 0̄ is an identity. So 0̄ = 0̄′ . This justifies the notation 0̄.
(ii) Indeed, let b̄, b̄′ be inverses of ā. Then consider the element (b̄ + ā) + b̄′ . Since b̄ + ā = 0̄,
then (b̄ + ā) + b̄′ = 0̄ + b̄′ = b̄′ . Similarly, consider b̄ + (ā + b̄′ ). Since ā + b̄′ = 0̄, then
b̄ + (ā + b̄′ ) = b̄ + 0̄ = b̄. Due to the associativity, (b̄ + ā) + b̄′ = b̄ + (ā + b̄′ ). So b̄′ = b̄. This
justifies the notation −ā.
(iii)
(iv)
(v)
(vi)
If we do not say otherwise , V will denote a vector space over an arbitrary field F.
An expression of the form k1 v1 + . . . + km vm , where k1 , . . . , km ∈ F and v1 , . . . , vm ∈ V , is called a linear combination of the vectors v1 , . . . , vm .
For a subset A of V the set of all (finite) linear combinations of vectors from A is called the
span of A and is denoted by Span(A) or ⟨A⟩. Clearly, A ⊆ Span(A).
Sometimes it is convenient to use Σ_{a∈A} k_a a as a notation for a general linear combination
of finitely many elements of A. In this notation we always assume that only finitely many
coefficients k_a are nonzero.
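Over a finite field the span of a finite set can be listed by brute force, which is a concrete way to see what Span(A) is. A sketch in Python (the field F3, the vectors, and the function name span are our own choices for illustration).

```python
from itertools import product

p = 3  # working over F_3

def span(vectors, p):
    """All linear combinations of the given vectors with coefficients in F_p."""
    n = len(vectors[0])
    result = set()
    for coeffs in product(range(p), repeat=len(vectors)):
        v = tuple(sum(k * vec[i] for k, vec in zip(coeffs, vectors)) % p
                  for i in range(n))
        result.add(v)
    return result

print(len(span([(1, 2), (0, 1)], p)))   # 9 : these two vectors span all of F_3^2
print(len(span([(1, 2)], p)))           # 3 : the span of one nonzero vector over F_3
```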
QUESTION: Given a subset A in a vector space V , what is the smallest subspace of V with
respect to inclusion which is a superset of A?
It turns out that it is Span(A)! It is proved in the following theorem among other properties
of subspaces and spans.
where the intersection is taken for all subspaces U of V for which W is a subset.
It is clear that for every A ⊂ V , A ⊆ Span(A). The second statement means that taking a
span of a subset of V more than once does not produce a greater subspace.
Problems.
When you do these problems, please do not use any notions or facts of linear algebra which we
have not discussed in this course. You must prove all your answers.
2. Prove or disprove:
4. Let v ∈ F3^4 and suppose v is not the zero vector. How many vectors does Span({v}) have?
5. Let P3 be the set of all polynomials of x with real coefficients of degree at most 3. Show
that P3 is a vector space over R. Show that the set {1, x, x2 , x3 } spans P3 , and the set
{1, x − 1, (x − 1)2 , (x − 1)3 } spans P3 .
7. Show that if U and W are subspaces of a vector space V , then U ∪ W need not be a
subspace of V . However, U ∪ W is a subspace of V if and only if U ⊆ W or W ⊆ U .
8. Let V be the set of all sequences of real numbers (xn ), n ≥ 0, defined by the recurrences:
x0 = a, x1 = b, and for all n ≥ 0, xn+2 = −2xn+1 + 3xn .
Every such sequence is completely defined by a choice of a and b. Show that V is a vector
space over R and that it can be spanned by a set of two vectors.
U + W = {u + w : u ∈ U and w ∈ W }.
11. Consider R∞ – the vector space of all infinite sequences of real numbers, with addition
of vectors and multiplication of vectors by scalars defined similarly to the ones in Rn .
Consider the subset l^2 (R) of all those sequences (xn ) such that Σ_{i=1}^{∞} x_i^2 converges. Does
l^2 (R) span R∞ ?
Usually, and in this course, a set can have only distinct elements. Otherwise we would refer to
it as a multiset.
A set X is called finite if there exists an integer n ≥ 0 and a bijection from X to {1, . . . , n}. A
set X is called infinite if there exists a bijection from a subset Y of X to N = {1, 2, . . .}.
A set of vectors is called linearly independent, if every finite subset of it is linearly independent.
Otherwise a set is called linearly dependent.
Examples.
• A vector v ∈ V forms a linearly independent set {v} if and only if v ≠ 0. The set {0} is
linearly dependent.
• A set {(2, 3), (−1, 4), (5, −9)} ⊂ R2 is linearly dependent: 1(2, 3) + (−3)(−1, 4) = (5, −9).
• The set {1, x, e^x } ⊂ C(R) is linearly independent. Indeed, suppose a · 1 + b · x + c · e^x = 0
for all x ∈ R. Substituting x = 0, 1, −1, we obtain
a + c = 0, a + b + ce = 0, a − b + ce^{−1} = 0.
Hence a(1 − e) + b = 0 and a(1 − e^{−1} ) − b = 0. Adding these equalities we get (2 − e − e^{−1} )a =
0. Since e = 2.7182818284590 . . ., 2 − e − e^{−1} ≠ 0. Hence a = 0. Substituting back, we
get b = 0 and c = 0. Hence {1, x, e^x } is linearly independent.
• Let i ∈ N and ei = (x1 , . . . , xn , . . .) be the vector (i.e., the infinite sequence) from the vector
space R∞ such that xi = 1 and xj = 0 for all j ≠ i. The infinite set of all vectors ei is
linearly independent.
3. A set of vectors is linearly dependent if and only if there exists a vector in the set which
is a linear combination of other vectors.
Or, equivalently,
A set of vectors is linearly dependent if and only if, there exists a vector in the set which
is a linear combination of some other vectors from the set.
(This explains why the same definition of linear independence is not made for multiset).
Remark: when we denote a linear combination of vectors from A by Σ_{a∈A} k_a a, we assume
that only finitely many coefficients k_a are nonzero.
5. Let A be a linearly independent subset of V which does not span V . Let b ∈ V \ Span(A).
Then A ∪ {b} is a linearly independent subset of V .
The cardinality of a basis is called the dimension of V . If V has a basis of n vectors, for some
n ≥ 1, we say that V is finite-dimensional, or n-dimensional, or has dimension n, or write
dim V = n.
We assume that the trivial vector space {0} has dimension zero, though it has no basis.
Examples
• Let n ≥ 1, 1 ≤ i ≤ n. Let ei denote the vector from Fn having the i-th component 1 and
all other components 0. Then {e1 , . . . , en } is a basis of Fn , called the standard basis of
Fn .
• Let V = Q[√2] := {a + b√2 : a, b ∈ Q}. Then V is a vector space over Q of dimension 2,
and {1, √2} is a basis.
• Let P = F[x] be the vector space of all polynomials of x over F. Then P is infinite
dimensional. Indeed, if it is not, then it has a finite basis B. Each element of the basis
is a polynomial of some (finite) degree. Let m be the greatest of the degrees of the
polynomials from the basis. Then Span(B) contains only polynomials of degree at most
m and hence Span(B) is a proper subset of P . E.g., x^{m+1} ∉ Span(B). The obtained
contradiction proves that P is infinite-dimensional over F.
We wish to remind ourselves that the field F which we decided not to mention every time is
there. The same set of objects can be a vector space over different fields, and the notion of
dimension depends on the field. For example, C is a vector space over R of dimension two: {1, i}
is a basis. The same C is a vector space over Q of infinite dimension. All these statements have
to be, and will be, proved.
Proof. By Theorem 3, we can assume that A contains no zero vector. We proceed by induction
on m. Let m = 1. Then A = {v}, where v 6= 0. Hence A is a basis.
Suppose the statement is proven for all sets A with |A| = k, 1 ≤ k < m. Now let Span(A) = V ≠ {0} and
|A| = m. If A is linearly independent, then A is a basis, and the proof is finished. Therefore we
assume that A is linearly dependent. Then some v ∈ A is a linear combination of other vectors
from A (Theorem 3). Let A′ = A \ {v}. Then Span(A′ ) = Span(A) = V , and |A′ | = m − 1. By
the induction hypothesis, A′ has a subset which is a basis of V . This subset is also a subset of A,
and the proof is finished.
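The proof above is essentially an algorithm: repeatedly discard a vector that is a linear combination of the remaining ones. Here is a small sketch of the equivalent greedy version for subsets of R^n, using NumPy ranks; the vectors and the function name are ours.

```python
import numpy as np

def extract_basis(vectors):
    """Keep a vector only if it enlarges the span; the kept vectors form a basis of Span(vectors)."""
    basis = []
    for v in vectors:
        candidate = basis + [v]
        if np.linalg.matrix_rank(np.array(candidate)) == len(candidate):
            basis.append(v)
    return basis

A = [[1, 0, 1], [2, 0, 2], [0, 1, 1], [1, 1, 2]]   # spans a 2-dimensional subspace of R^3
print(extract_basis(A))                            # [[1, 0, 1], [0, 1, 1]]
```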
Corollary 6 In a finite-dimensional space every basis has the same number of vectors.
Corollary 7 If a vector space contains an infinite linearly independent subset, then it is infinite-
dimensional.
In mathematics we often look for sets which satisfy a certain property and which are minimal
(maximal) in the sense that no proper subset (superset) of them satisfies this property. E.g., a
minimal set of axioms for a theory, a maximal independent set in a graph, a maximal matching
in a graph, a minimal generating set for a group, etc. At the same time we often want to find
sets which satisfy the property and have the largest (smallest) possible number of elements; such sets are called maximum (minimum).
Example. Consider all families of non-empty subsets of the set {1, 2, 3, 4, 5} such that the in-
tersection of every two subsets of the family is empty. Examples of such families are A =
{{1, 2, 3, 4}, {5}}, or B = {{1, 5}, {2, 3}, {4}}. |A| = 2 and |B| = 3. Both A and B are maximal,
but neither of them is maximum, since the family of singletons C = {{1}, {2}, {3}, {4}, {5}} also
possesses the property and has 5 members. So a maximum family must contain at least five
members.
Proof. Complete! .
When you do these problems, please do not use any notions or facts of linear algebra which we
have not discussed in this course. You must prove all your answers.
1. Let V = F be a field considered as a vector space over itself. Find dim V and describe all
bases of V .
2. Prove or disprove:
3. How many bases does Fp^2 have? You can first try to answer this question for p = 2, 3, 5.
4. Let P3 be the set of all polynomials of x with real coefficients of degree at most three.
Show that P3 is a vector space over R. Show that the set {1, x − 1, (x − 1)2 , (x − 1)3 } is
a basis of P3 .
(Remember, that here P3 is a space of polynomials (as formal sum of monomials), not
polynomial functions.)
Let 1, x, x2 , x3 be functions (polynomial functions) from the vector space C(R). Prove
that they are linearly independent.
V = {(x1 , . . . , xn ) : a1 x1 + . . . + an xn = 0} ⊆ Fn .
Find a basis of V consisting of two geometric progressions, i.e., of sequences of the form
(cr^n ), n ≥ 0. Write the vector (i.e., the sequence) of this space corresponding to a = b = 1
as a linear combination of the vectors from this basis.
10. Let u, v, w be three distinct vectors in Fp^n . How many vectors can Span({u, v, w}) have?
11. Consider V = R∞ – the vector space of all infinite sequences of real numbers, with addition
of vectors and multiplication of vectors by scalars defined similarly as in Rn . Prove that
V is infinite-dimensional.
12. A complex (in particular, real) number α is called transcendental if α is not a root
of a polynomial equation with integer coefficients. For example, the famous numbers
π and e are transcendental, though the proofs are hard. Explain that the existence of
transcendental numbers implies that R is infinite-dimensional as a vector space over Q.
13. Let V = R be the vector space over Q. Prove that the set {1, 2^{1/3} , 2^{2/3} } is linearly
independent.
14. (Optional) Let V = R be the vector space over Q. Prove that the infinite set
(i) {1, 2^{1/2} , 2^{1/2^2} , . . . , 2^{1/2^n} , . . .} is linearly independent;
(ii) {√p : p ∈ Z, p ≥ 2, and p is prime} is linearly independent.
15. (Optional) Prove that the functions ex , e2x form a basis in the vector space of all solutions
of the differential equation y ′′ − 3y ′ + 2y = 0.
We would like to add a few more facts about subspaces and their dimensions. The following
notations are useful: by W ≤ V we will denote the fact that W is a subspace of V , and we
write W < V if W is a proper subspace of V (i.e., W ≠ V ).
Proof. If W < V , then a basis of W , which has n vectors in it, is not a basis of V . By the
previous corollary, it can be extended to a basis of V which will contain n + 1 elements. This
is impossible, since every basis of V contains n vectors.
We claim that
B = {v1 , . . . , vr , u1 , . . . , up−r , w1 , . . . , wq−r }
is a basis of U + W . It is clear that B spans U + W . To verify that B is linearly independent, suppose
a1 v1 + . . . + ar vr + b1 u1 + . . . + bp−r up−r + c1 w1 + . . . + cq−r wq−r = 0.            (1)
Then c1 w1 + . . . + cq−r wq−r = −(a1 v1 + . . . + ar vr + b1 u1 + . . . + bp−r up−r ) lies in U , and it obviously
lies in W , hence it lies in U ∩ W = ⟨v1 , . . . , vr ⟩. Therefore c1 w1 + . . . + cq−r wq−r = d1 v1 + . . . + dr vr
for some di ∈ F, which is a linear relation on vectors of the basis {v1 , . . . , vr , w1 , . . . , wq−r } of W .
Since {v1 , . . . , vr , w1 , . . . , wq−r } is linearly independent (as a basis of W ), all the coefficients
are zeros. In particular all ci are zeros. A similar argument gives that all bi equal zero. Then (1)
gives
a1 v1 + . . . + ar vr = 0,
and as {v1 , . . . , vr } is linearly independent (as a basis of U ∩ W ), all ai are zeros. Hence B is
linearly independent and
dim(U + W ) = |B| = r + (p − r) + (q − r) = p + q − r.
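A quick numerical illustration of dim(U + W) = dim U + dim W − dim(U ∩ W), with subspaces of R^3 chosen by us so that the intersection is known by construction.

```python
import numpy as np

e1, e2, e3 = np.eye(3)
U = np.array([e1, e2])    # U = <e1, e2>
W = np.array([e2, e3])    # W = <e2, e3>, so U ∩ W = <e2> has dimension 1

dim_U = np.linalg.matrix_rank(U)                      # 2
dim_W = np.linalg.matrix_rank(W)                      # 2
dim_sum = np.linalg.matrix_rank(np.vstack([U, W]))    # dim(U + W) = 3
assert dim_sum == dim_U + dim_W - 1                   # 3 = 2 + 2 - 1
```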
Let V1 and V2 be vector spaces over the same field F. Consider the set
V1 × V2 = {(v1 , v2 ) : v1 ∈ V1 , v2 ∈ V2 }.
If we define
(v1 , v2 ) + (v1′ , v2′ ) = (v1 + v1′ , v2 + v2′ ) and k(v1 , v2 ) = (kv1 , kv2 ) for k ∈ F,
then it is easy to check that V1 × V2 is a vector space over F. It is called the direct product
or the external direct product of V1 and V2 , and will be denoted by V1 × V2 . The direct
product of more than two vector spaces over F is defined similarly.
A reader may get a feeling that the notions of the direct sum and direct product are very similar,
and there is no distinction between the resulting vector spaces. Note also that, in contrast to U
and W being subspaces of U ⊕ W , neither V1 nor V2 is a subspace of V1 × V2 . All this may be a
little bothersome. The notion of isomorphism, which we are going to define very soon, will help
to discuss these issues in a precise way.
Problems.
When you do these problems, please do not use any notions or facts of linear algebra which we
have not discussed in this course. You must prove all your answers.
1. It is clear how to generalize the definition of the direct product of two vector spaces to
any finite (or infinite) collection of vector spaces over the same field F.
What about the direct sum of subspaces? Try n ≥ 2 subspaces. Can you state and prove
a statement similar to Proposition 12? If you do not see how to do it for any n ≥ 2
subspaces, maybe do it first for three subspaces. Then state the result for n subspaces.
You do not have to prove it.
Can you extend the definition of the direct sum for infinitely many subspaces? If you can,
please state it.
3. Prove that C(R) = E ⊕ O, where E is the subspace of all even functions and O is the
subspace of all odd functions.
(A function f : R → R is called even (resp. odd) if for all x ∈ R, f (−x) = f (x) (resp.
f (−x) = −f (x) ).)
6. How many k-dimensional subspaces, k ≥ 1 and fixed, does C(R) (over R) have?
7. Prove that functions 1, ex , e2x , . . . , enx , n ≥ 1 and fixed, are linearly independent as vectors
in C ∞ (R).
8. Give an example of three functions f1 , f2 , f3 ∈ C ∞ (R), and three distinct real numbers
a, b, c, such that f1 , f2 , f3 are linearly independent, but the vectors (f1 (a), f2 (a), f3 (a)),
(f1 (b), f2 (b), f3 (b)), (f1 (c), f2 (c), f3 (c)) are linearly dependent as vectors in R3 .
9. Let U = ⟨sin x, sin 2x, sin 3x⟩ and W = ⟨cos x, cos 2x, cos 3x⟩ be two subspaces in C ∞ (R).
Find dim(U ∩ W ).
(The symbol ⟨v1 , . . . , vn ⟩ is just another common notation for Span({v1 , . . . , vn }).)
10. Prove that if dim V = n and U ≤ V , then there exists W ≤ V such that V = U ⊕ W .
Does such W have to be unique for a given U and V ?
11. Prove that Rn is the direct sum of two subspaces defined as:
U = {(x1 , . . . , xn ) : x1 + . . . + xn = 0} and
W = {(x1 , . . . , xn ) : x1 = x2 = · · · = xn }.
13. (Optional) Let V = R be the vector space over Q. Prove that the vectors π and cos^{−1} (1/3)
are linearly independent.
Strange objects (vector spaces) have to be studied by strange methods. This is how we arrive
at linear mappings. Since linear mappings are functions, we review related terminology and
facts. We do it here in a very brief way.
Given sets A and B, a function f from A to B is a subset of A × B such that for every a ∈ A there
exists a unique b ∈ B such that (a, b) ∈ f . The fact that f is a function from A to B is represented
by writing f : A → B. The fact that (a, b) ∈ f is represented by writing f : a ↦ b, or
f a = b, or in the usual way: f (a) = b. We also say that f maps a to b, or that b is the image
of a under f .
Let im f := {b ∈ B : b = f (a) for some a ∈ A}. The set im f is called the image of f , or the
range of f .
We say that f is bijective if it is one-to-one and onto, or, equivalently, if f is both injective
and surjective.
It is easy to show that h is a function from A to C, and h(a) = g(f (a)). It is called the
composition of functions f and g, and denoted by g ◦ f . It is easy to check that h is
It may happen that for f : A → B, the set {(b, a) : f (a) = b} ⊂ B × A is a function from B to
A. This function is denoted by f −1 and called the inverse (function) of f . It is clear that this
happens, i.e., f −1 exists, if and only if f is a bijection.
Then f is called a linear map (or mapping) from V to W . If V = W , a linear map from V to
V is called a linear operator on V , or a linear transformation of V .
Here are some examples of linear maps. Verification that the maps are linear is left to the
reader.
• Let V = C^2 (a, b), W = C^1 (a, b), and let d : V → W be defined via f ↦ f ′ – the derivative
of f .
An easy way to construct a linear map is the following. Let {v1 , . . . , vn } be a basis of V . Choose
arbitrary n vectors {w1 , . . . , wn } in W and define f : V → W via
k1 v1 + . . . + kn vn ↦ k1 w1 + . . . + kn wn for all ki ∈ F.
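A small sketch of this construction for V = R^3 and W = R^2 with NumPy: the coordinates k1, . . . , kn of x in the basis {v1, . . . , vn} are found by solving a linear system, and f(x) is then the corresponding combination of the wi. The bases and images below are our own choices.

```python
import numpy as np

V_basis  = np.array([[1., 1., 0.], [0., 1., 1.], [1., 0., 1.]]).T  # columns v1, v2, v3 (a basis of R^3)
W_images = np.array([[1., 0.], [0., 1.], [1., 1.]]).T              # columns w1, w2, w3 in R^2

def f(x):
    k = np.linalg.solve(V_basis, x)   # coordinates of x in the basis {v1, v2, v3}
    return W_images @ k               # f(x) = k1 w1 + k2 w2 + k3 w3

print(f(np.array([2., 2., 0.])))      # x = 2 v1, so f(x) = 2 w1 = [2. 0.]
```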
For a linear map f : V → W let ker f := {v ∈ V : f (v) = 0}. The set ker f is called the
kernel of f .
Theorem 13 Let V and W be vector spaces over F, and let f : V → W be a linear map. Then
5. im f ≤ W
Proof.
1. Since f is injective, ker f = ⟨0⟩. Hence dim V = dim ker f + dim im f = dim im f ≤
dim W .
2. Since f is surjective, im f = W . Hence dim V = dim ker f + dim im f = dim ker f + dim W ≥ dim W .
4. Let n be the common dimension of V and W , and let {v1 , . . . , vn } and {w1 , . . . , wn } be
bases of V and W , respectively. Define f : V → W by k1 v1 + . . . + kn vn ↦ k1 w1 + . . . + kn wn .
Such an f is linear, injective, and surjective, hence an isomorphism.
Problems.
2. Can you describe ker f and im f for all linear maps f from the examples of this lecture?
3. Let V and W be vector spaces over F, and let f : V → W be a function satisfying one
of the two conditions required for f being linear. For each of the two conditions, find an
example of f which satisfies this condition, but not the other one.
4. Find an isomorphism between Fn+1 and Pn (F) – the vector space of all polynomials over
F of degree at most n.
5. Find an isomorphism between Fmn and Mm×n (F) – the vector space of all m × n matrices
over F.
6. Let V be a finite dimensional space over F. Decide whether the following statements are
true or false? Explain.
(i) If V = V1 ⊕ V2 , then V ≃ V1 × V2 .
(ii) If V ≃ V1 × V2 , where V1 , V2 are subspaces of V , then V = V1 ⊕ V2 .
(iii) Let f : V → V be a linear operator on V . Then V ≃ ker f × im f .
(iv) Let f : V → V be a linear operator on V . Then V = ker f ⊕ im f .
8. (Optional) Suppose S is a set of 2n + 1 irrational real numbers. Prove that S has a subset
T of n + 1 elements such that no nonempty subset of T has a rational sum.
– Yogi Berra
Lecture 7.
What do we do next? There are many natural questions we can ask about vector spaces at
this point. For example, how do they relate to geometry or other parts of mathematics? Or to
physics? Did they help Google or the national security?
I hope we will touch all these relations, but now we will talk about matrices, objects which are
inseparable from vector spaces. We assume that the reader is familiar with basic definitions and
facts about matrices. Below we wish to discuss some natural questions which lead to matrices.
1. Trying to solve systems of linear equations is one of them. There matrices and vectors (the
latter also can be viewed as n × 1 or 1 × n matrices) appear as very convenient notations. As all
of you know, the problem of solving a system of m linear equations each with n variables can
be restated as finding a column vector x ∈ Fn such that Ax = b, where A = (aij ) is the matrix
of the coefficients of the system, and b ∈ Fm is the column vector representing the right hand
sides of the equations. Here the definition for the multiplication of A by x is chosen in such a
way that Ax = b is just a short way of rewriting the system. It seems that nothing is gained
by this rewriting, but not quite. Somehow this way of writing reminds us about the simplest
linear equation ax = b, where a, b ∈ F are given numbers and x
is the unknown number. We know that we can always solve it if a ≠ 0, and the unique solution
is x = a^{−1} b. The logic of arriving at this solution is as follows:
We also know that for a = 0, we either have no solutions (if b ≠ 0), or every element of F is a
solution (if b = 0). Analogy in appearance suggests analogy in approach, and we may try
to invent something like 0, or 1, or A^{−1} for matrices. We may ask the question whether the
product of matrices is associative, etc.
Trying to push the analogy, we may say that, in F, it does not matter whether we are solving
ax = b or xa = b or xa − b = 0. Trying to see whether it is true for matrices, we immediately
realize that xA (what is it ???) has little to do with already introduced Ax, and that the
obvious candidate for zero-matrix, does not allow to claim that if A is not ‘the’ zero matrix,
then Ax = b is always solvable. Thinking about all this, one may come to the usual non-
commutative (but associative) ring of square matrices Mn×n (F). Analyzing further, one realizes
2. Another way to arrive to matrices, especially to the notion of matrix multiplication, can be
through doing changes of variables. The method was used long before vectors or matrices were
born. Mostly in number theory or in geometry, when the coordinates were used. If x = 3a − 2b
and y = a + 5b, and a = e − f − 5g, and b = 2e + f + 7g, then x = 3(e − f − 5g) − 2(2e + f + 7g) =
−e − 5f − 29g, and y = (e − f − 5g) + 5(2e + f + 7g) = 11e + 4f + 30g. The expression for x
and y in terms of e, f , and g, can be obtained via the following computation with matrices:

[ x ]   [ 3 −2 ] [ a ]   [ 3 −2 ] [ 1 −1 −5 ] [ e ]
[ y ] = [ 1  5 ] [ b ] = [ 1  5 ] [ 2  1  7 ] [ f ]            (2)
                                              [ g ]

( [ 3 −2 ] [ 1 −1 −5 ] ) [ e ]   [ −1 −5 −29 ] [ e ]
( [ 1  5 ] [ 2  1  7 ] ) [ f ] = [ 11  4  30 ] [ f ]           (3)
                         [ g ]                 [ g ]
Tracing how the coefficients are transformed, leads to the rule for matrix multiplication. This
rule will become more visible if we use letters instead of numbers for the coefficients in our
transformations.
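The substitution above is easy to check numerically; a one-line verification with NumPy, using the matrices from (2) and (3).

```python
import numpy as np

M1 = np.array([[3, -2], [1, 5]])           # expresses x, y through a, b
M2 = np.array([[1, -1, -5], [2, 1, 7]])    # expresses a, b through e, f, g

print(M1 @ M2)   # [[-1  -5 -29] [11   4  30]] : expresses x, y through e, f, g
```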
It is clear that the correspondence f ↦ Mf defined by (4) between the set of all linear maps
from U to V and the set of all m × n matrices over F is a bijection, and depends heavily on the
choice of the two bases.
Then the composition g ◦ f of linear maps f and g, which is a linear map itself, is given by a
m × q matrix Mg◦f = (cst ), where (g ◦ f )(us ) = cs1 w1 + . . . + csq wq , s ∈ {1, . . . , m}. Let us
express the coefficients cst in terms of aij and bkl .
(g ◦ f )(us ) = g(f (us )) = g( Σ_j asj vj ) = Σ_j asj g(vj ) = Σ_j asj ( Σ_t bjt wt ) = Σ_j Σ_t asj bjt wt = Σ_t ( Σ_j asj bjt ) wt .
Therefore cst = Σ_j asj bjt , and we obtain
Mg◦f = Mf Mg .
Hence
[(g ◦ f )(u1 ), . . . , (g ◦ f )(um )]^T = Mg◦f [w1 , . . . , wq ]^T = (Mf Mg ) [w1 , . . . , wq ]^T .            (6)
Though equality (6) makes the relation between the matrix of a composition of linear maps
and the product of the corresponding matrices very clear, one should not read more from it
than what it displays. Contrary to the associativity of the product of matrices in (2) and
(3), the right hand side of (6) is not the usual product of three matrices: trying to check the
associativity, we have difficulties with the meaning of the ‘products’ involved. If the column
vector of vectors wi is denoted by w⃗ , what is the meaning of Mg w⃗ ? Or of Mf w⃗ , if we hope that
multiplying by Mf first may help?
Problems.
3. Check that the only matrix X ∈ Mm×m (F) with the property that XA = A for all
A ∈ Mm×n (F) is the identity matrix Im =diag(1, 1, . . . , 1). Similarly, the only matrix
Y ∈ Mn×n (F) with the property that AY = A for all A ∈ Mm×n (F) is the identity matrix
In =diag(1, 1, . . . , 1).
4. Check that if AB and BA are defined, then both AB and BA are square matrices. Show
that the matrix multiplication of square matrices is not, in general, commutative.
5. Check that the two products of three matrices, (AB)C and A(BC), always exist
simultaneously and are equal. Hence matrix multiplication is associative. Do this
exercise in two different ways. First, by using the formal definition of matrix multiplication
and manipulating sums. Then, by using the correspondence between matrix multiplication
and composition of linear maps. Prove the fact that composition of three functions is
associative.
6. Check that for any three matrices A, B, C over the same field F, A(B + C) = AB + AC,
and (A + B)C = AC + BC, provided that all operations are defined. These are the
distributive laws.
8. Matrices from Mn×n (F) are also referred to as square matrices of order n. Let A be
a square matrix of order n. A matrix B is called the inverse of A if AB = BA = In .
The inverse matrix, if it exists, is denoted by A^{−1} , and in this case A is called nonsingular;
otherwise A is called singular.
(ii) Give an example of a 2 × 2 matrix A such that A is not the zero matrix and A^{−1} does
not exist.
9. Show that there exist matrices A, B, C ∈ M2×2 (F) such that AB = AC = 0, A ≠ 0, B ≠ C.
10. Let A = (aij ) ∈ Mn×n (F). Define tr A := Σ_{i=1}^{n} aii . The field element tr A is called the
trace of A.
Prove that f : Mn×n (F) → F defined via A ↦ tr A is a linear map, and that tr AB =
tr BA for every two matrices A, B ∈ Mn×n (F).
12. Find the set of all matrices C ∈ Mn×n (F) such that CA = AC for all A ∈ Mn×n (F).
Mf −1 Mf = Mf Mf −1 = In .
With α and β as above, let g : V → V be a linear map. Then we can represent g in two different
ways:
[g(u1 ), . . . , g(un )]^T = Mg,α [u1 , . . . , un ]^T and [g(v1 ), . . . , g(vn )]^T = Mg,β [v1 , . . . , vn ]^T .            (8)
As β is a basis, we have
[u1 , . . . , un ]^T = C [v1 , . . . , vn ]^T ,            (9)
for some matrix C. Since C = Mid with respect to the bases α and β, and id is an isomorphism
of V to V , then C is an invertible matrix, as we proved at the beginning of this lecture. Next
we notice that (9) implies
[g(u1 ), . . . , g(un )]^T = C [g(v1 ), . . . , g(vn )]^T .            (10)
Equalities (8), (9), (10), and the associativity of matrix multiplication imply that
[g(u1 ), . . . , g(un )]^T = (Mg,α C) [v1 , . . . , vn ]^T = (C Mg,β ) [v1 , . . . , vn ]^T .            (11)
As vectors {v1 , . . . , vn } are linearly independent, we obtain Mg,α C = C Mg,β , or, since C is
invertible,
Mg,β = C^{−1} Mg,α C .            (12)
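Formula (12) can be tested numerically. In the sketch below (our own data, standard R^2), the bases α and β are stored as rows, an operator g is given by its standard matrix G acting on columns, and the matrices Mg,α, Mg,β are computed directly from the convention of (8); the assertion checks (12).

```python
import numpy as np

Ua = np.array([[1., 1.], [0., 1.]])    # rows u1, u2  (basis alpha of R^2)
Vb = np.array([[2., 1.], [1., 1.]])    # rows v1, v2  (basis beta of R^2)
G  = np.array([[1., 2.], [3., 4.]])    # g in standard coordinates: g(x) = G x

C = Ua @ np.linalg.inv(Vb)             # u_i = sum_j C_ij v_j, as in (9)

# Convention of (8): g(u_s) = sum_t (Mg)_{st} u_t, i.e. (rows g(u_i)) = Mg * (rows u_i)
Mg_a = Ua @ G.T @ np.linalg.inv(Ua)
Mg_b = Vb @ G.T @ np.linalg.inv(Vb)

assert np.allclose(Mg_b, np.linalg.inv(C) @ Mg_a @ C)   # formula (12)
```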
We conclude this lecture with another important result. Though it is also related to a change
of basis of V , it is about a different issue.
Theorem 15 Let V be a vector space over F, let α = {u1 , . . . , un } and β = {v1 , . . . , vn } be two
bases of V , and let A = (aij ) be an n × n matrix over F such that uj = Σ_i aij vi . Then
[x]β = A [x]α .
Proof. We have
x = Σ_j xj uj = Σ_j xj ( Σ_i aij vi ) = Σ_i ( Σ_j aij xj ) vi .
Now we observe that the (i, 1)-th, or just the i-th, entry of the column vector A [x]α is precisely
Σ_j aij xj .
Remarks
• This theorem becomes very useful when we want to convert coordinates of many vectors
from one fixed basis to another fixed basis.
• Note that according to our definition of the coordinate vector, [uj ]β is the j-th column of
matrix A.
• If α is a standard basis of Fn , and x ∈ Fn , then the i-th components of x and [x]α are
equal.
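A tiny numeric check of Theorem 15 in the simplest setting: β is the standard basis of R^2, the columns of A are the vectors uj written in β-coordinates (as in the remark above), and x is expressed both ways. The data are ours.

```python
import numpy as np

u1, u2 = np.array([1., 1.]), np.array([1., -1.])   # basis alpha, written in the standard basis beta
A = np.column_stack([u1, u2])                      # j-th column of A is [u_j]_beta

x_alpha = np.array([2., 3.])                       # coordinates of x = 2 u1 + 3 u2 in alpha
x_beta  = 2 * u1 + 3 * u2                          # the same vector in beta (standard) coordinates
assert np.allclose(x_beta, A @ x_alpha)            # [x]_beta = A [x]_alpha
```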
For any m × n matrix A = (aij ), let A^T denote the transpose of A, i.e., the n × m matrix
(a′kl ), where a′kl = alk for all k ∈ {1, . . . , n} and l ∈ {1, . . . , m}. If in the statement of Theorem
15 we wrote ui = Σ_j aij vj , then the result would change to
[x]β = A^T [x]α .
One may ask why we do not just use row notation for vectors from Fn . Many texts do it.
In this case, to multiply a vector and a matrix, one would write xA. The approach has both
advantages and disadvantages. One disadvantage is having x on the left of A when we want to
consider a function defined by x ↦ xA.
Problems.
I recommend that you check your computations with any CAP (Computer Algebra Package), e.g., Maple,
Mathematica, etc.
1. Let f : R3 → R2 be a linear map defined as f ([x, y, z]) = [2x − y, x + y − 2z], and let α and β be
bases of R3 and R2 , respectively. Find matrix Mf of f if
2. Let α = {v1 = [1, 1, 0], v2 = [0, 1, 1], v3 = [1, 0, 1]} ⊂ R3 , and let f be a linear operator on R3
such that [f (v1 )]α = [−1, −3, −3], [f (v2 )]α = [3, 5, 3], [f (v3 )]α = [−1, −1, 1]. Find Mf,α . Then
find Mf,β for β = {[1, 1, 1], [1, 0, 0], [1, 0, −3]}.
3. Let V = P3 be the vector space of all polynomials with real coefficients of degree at most 3. View
P3 as a subspace of C ∞ (R). Let d : V → V be defined as f ↦ f ′ = df /dx. Find Md,α , Md,β for the bases
α = {1, x, x^2 , x^3 } and β = {1, x − 2, (x − 2)^2 /2!, (x − 2)^3 /3!}. In which basis is the matrix simpler?
Then compute Md^i ,α , Md^i ,β , i = 2, 3, 4, where d^i := d ◦ d^{i−1} (the composition of d with itself i
times).
4. Identify the Euclidean plane E2 with R2 by introducing a Cartesian coordinate system in E2 , and
matching a point with coordinates (x1 , x2 ) with vector (x1 , x2 ) (same as [x1 , x2 ]). We can depict
the vector as a directed segment from the origin (0, 0) to point (x1 , x2 ).
Write the matrix Mf for the following linear operators on R2 with respect to the standard basis.
(a) f = sl – the symmetry with respect to a line l of E2 , where l passes through the origin. Do
it for l being x-axis; y-axis; line l : y = mx.
7. Let A and B be square matrices of order n. Prove that if AB = In , then BA = In . This means
that the equality AB = In alone implies B = A−1 , i.e., the second condition in the definition of a
nonsingular matrix can be dropped.
8. (Optional) Let f be a rotation of R3 by angle π/2 (or θ) with the axis ⟨(1, 2, 3)⟩ (or ⟨(a, b, c)⟩).
Find Mf with respect to the standard basis.
Now we wish to introduce the notion of the determinant of a square matrix A. Roughly speaking,
the determinant of A is an element of the field constructed by using all n^2 entries of the matrix
in a very special way.
It is denoted by det A. Though det A does not capture all the properties of A, it does capture
a few very important ones. Determinants have always played a major role in linear algebra.
They appeared when people tried to solve systems of linear equations. Then they found use
in analysis, differential equations, geometry. The notion of the volume of a parallelepiped in
Rn , defined by n linearly independent vectors, is introduced as the absolute value of the
determinant of the matrix which has these vectors as rows.
There are several conventional expositions of determinants, each having its own merits. The
one we chose employs the notion of an exterior algebra. So far we have discussed two main
algebraic structures, namely fields and vector spaces. We briefly mentioned non-commutative
rings, which are like fields, but the multiplication is not commutative and the existence of some
multiplicative inverses is not required. The main example of non-commutative ring in this
course is Mn×n (F). Now we introduce the definition of an algebra. In this context ‘algebra’ is
a specific technical term, not the whole field of mathematics known as algebra.
Then V with such a multiplication is called an algebra over F. If the multiplication is com-
mutative or associative, we obtain a commutative or an associative algebra, respectively.
If there exists a vector e ∈ V such that ev = ve = v for all v ∈ V , then e is called the identity
element of V , and the algebra is called an algebra with the identity. It is clear that if the
identity exists, it is unique. If for every non-zero v ∈ V there exists v^{−1} – the inverse of v
with respect to the multiplication on V , then an algebra with the identity is called a division
algebra.
If u = t1 v1 + . . . + tn vn and v = s1 v1 + . . . + sn vn , then
uv = ( Σ_{k=1}^{n} tk vk )( Σ_{k=1}^{n} sk vk ) = Σ_{k=1}^{n} ( Σ_{1≤i,j≤n} ti sj a^k_{ij} ) vk ,
where the elements a^k_{ij} ∈ F are defined by vi vj = Σ_{k=1}^{n} a^k_{ij} vk .
• The vector space F[x] of all polynomials over F is an infinite-dimensional algebra over F
with respect to the usual product of polynomials. It is commutative, associative, with the
identity, but not a division algebra.
• A similar example is the algebra F[x1 , . . . , xn ] of all polynomials over F with n variables.
• There exists a 4-dimensional associative division algebra over R, called the algebra of
quaternions, or the Quaternion algebra (or the algebra of Hamilton Quaternions). It
has a basis {1, i, j, k}, with multiplication of the basis elements defined as
i^2 = j^2 = k^2 = −1, ij = −ji = k, jk = −kj = i, ki = −ik = j,
and continued to the whole algebra by the distributive laws. As we see, this algebra is
not commutative. 1 (the field element viewed as a vector) is the identity element.
There exists one more division algebra over reals. It is the Graves-Cayley algebra of
octonions, which is 8-dimensional, non-commutative and non-associative.
• Mn×n (F) - the n2 -dimensional associative algebra of all n × n matrices over F. It has the
identity, but is not commutative, and is not a division algebra.
Now we define the exterior algebra which will be used to develop the theory of the determinants
of square matrices.
Consider V = Fn , and let {e1 , . . . , en } be the standard basis of V . We define the following
vector spaces over F.
Λ = Λ0 × Λ1 × . . . × Λn
Similarly to complex numbers or polynomials, we may, and we will, think about elements of Λ
as formal sums of elements from Λi , something like
2 + (2√3/15) e2 + e3 − e1 ∧ e2 + e1 ∧ e3 − 2 e1 ∧ e2 ∧ e3 .
We will not write any term with coefficient 0, since it is equal to the zero vector.
Now we introduce the multiplication on Λ, which will distribute over vector addition; thus it
will suffice to define the multiplication on basis elements. Our algebra is also going to be associative.
(i) We postulate that the basis element 1 for Λ0 (= F) is to be the identity element for
multiplication.
(ii) Next we define the product of two basis elements ei and ej from Λ1 to be an element of
Λ2 :
ei ej = ei ∧ ej = −ej ∧ ei for all i, j, even for i = j.
This implies that ei ei = −ei ei , hence 2ei ∧ ei = 0. When we consider exterior algebras, we
will consider only those fields where 2 ≠ 0; e.g., F2 is prohibited. This allows us to conclude
that 2ei ∧ ei = 0 is equivalent to ei ∧ ei = 0.
(iii) The product of basis elements ei1 ∧ ei2 ∧ . . . ∧ eik ∈ Λk and ej1 ∧ ej2 ∧ . . . ∧ ejm ∈ Λm is
defined as follows:
1. Part (iii) of the definition can also be rephrased as follows. The products of two vectors from
the basis of Λ is defined as
(ei1 ∧ ei2 ∧ . . . ∧ eik ) (ej1 ∧ ej2 ∧ . . . ∧ ejm ) = ei1 ∧ ei2 ∧ . . . ∧ eik ∧ ej1 ∧ ej2 ∧ . . . ∧ ejm ,
with the assumption that the latter can be simplified, by using associativity: either to 0, if
some index i_s is equal to some index j_t or if k + m > n, or, otherwise, to a basis element of Λk+m or its
opposite.
2. Why should we use a special notation for the product? Why can we not just write u ∧ w
instead of uw, with the same wedge as in the other symbols? Such notation certainly agrees with the
multiplication of the basis vectors of Λ. That is what we will do, and we will refer to the
multiplication in Λ as the wedge product.
3. It is not obvious that the rule for determining ε in part (iii) agrees with the associativity
of the product of the basis elements, but it is possible to show that it does.
4. Since the zero vector satisfies 0V = 0F v for every v ∈ V , the rule (ku)v = u(kv) = k(uv) for all u, v ∈ V and all
k ∈ F implies that if one of the factors in a wedge product is the zero vector, then the whole
product is 0.
Examples.
• e2 ∧(e1 ∧e2 ) = (e2 ∧e1 )∧e2 = (−e1 ∧e2 )∧e2 = −(e1 ∧e2 )∧e2 = −e1 ∧(e2 ∧e2 ) = −e1 ∧0 = 0.
Proof. Let x = Σ_{i=1}^{n} xi ei and y = Σ_{i=1}^{n} yi ei . Linear dependence of x and y implies that one
of them is a scalar multiple of the other. Suppose x = ky for some k ∈ F. Then (x1 , . . . , xn ) =
(ky1 , . . . , kyn ) and
x ∧ y = ( Σ_{i=1}^{n} xi ei ) ∧ ( Σ_{i=1}^{n} yi ei ) = Σ_{1≤i<j≤n} (xi yj − xj yi )(ei ∧ ej ) = Σ_{1≤i<j≤n} (kyi yj − kyj yi )(ei ∧ ej ) = Σ_{1≤i<j≤n} 0 · (ei ∧ ej ) = 0.
The second statement follows from the fact that x and x are linearly dependent.
Proof. Linear dependence of the vi implies that one of them is a linear combination of the others.
Renumber them so that v1 = a2 v2 + . . . + am vm . Then
v1 ∧ v2 ∧ . . . ∧ vm = (a2 v2 + . . . + am vm ) ∧ v2 ∧ . . . ∧ vm =
a2 v2 ∧ v2 ∧ . . . ∧ vm + . . . + am vm ∧ v2 ∧ . . . ∧ vm = 0,
since when we distribute the product every term will contain some vi twice among its wedge
factors, and, by Proposition 16, each such term is zero.
Let {e1 , . . . , en } be the standard basis of Fn , and let 1 ≤ i1 < i2 < . . . < ip ≤ n. For
A ∈ Mp×p (F), we define the “product of A and eik with respect to (ei1 , ei2 , . . . , eip )” as follows:
A eik := a1k ei1 + a2k ei2 + . . . + apk eip = Σ_{t=1}^{p} atk eit .
Next we define the “product of A and ei1 ∧ . . . ∧ eip with respect to (ei1 , ei2 , . . . , eip )” as
A (ei1 ∧ . . . ∧ eip ) := A ei1 ∧ . . . ∧ A eip = ⋀_{k=1}^{p} ( Σ_{t=1}^{p} atk eit ).
If p = n, then there exists only one increasing sequence of length n in {1, 2, . . . , n}. As
A ∈ Mn×n (F), the coordinate vector of A ei in the basis (e1 , . . . , en ) is the i-th column
of A, which is [a1i , . . . , ani ]^T . In this case A ei can be thought of as the genuine product of two
matrices: A and the column (n × 1 matrix) ei . Again, A e1 ∧ . . . ∧ A en lies in the 1-dimensional
space Λn , hence it is a scalar multiple of the basis vector e1 ∧ . . . ∧ en . Hence
A e1 ∧ . . . ∧ A en = (det A) e1 ∧ . . . ∧ en .            (13)
It is easy to check that if n = 1 and A = (a), then det A = a. If n = 2 and
A = [ a11 a12
      a21 a22 ] ,
then A e1 ∧ A e2 = (a11 e1 + a21 e2 ) ∧ (a12 e1 + a22 e2 ) = (a11 a22 − a12 a21 ) e1 ∧ e2 , hence
det A = a11 a22 − a12 a21 .
A (v1 ∧ . . . ∧ vp ) := A v1 ∧ . . . ∧ A vp .
Proposition 18 1. det In = 1.
5. For each A ∈ Mn×n (F), the inverse matrix A^{−1} exists if and only if det A ≠ 0.
One of them is often called the Laplace expansion. It allows one to compute the determinant of a
matrix by computing (many!) determinants of smaller matrices.
Let A = (aij ) be an n × n square matrix over F, and let Aij denote the (n − 1) × (n − 1) square
matrix obtained from A by deleting the i-th row and the j-th column of A.
Proposition 19
Proof. As you remember, ⋀_{1≤i≤n} A ei = (det A) ⋀_{1≤i≤n} ei .
We begin with the first statement. First we establish the result for j = 1, i.e.,
det A = Σ_{1≤t≤n} (−1)^{t+1} at1 det At1 .
We have:
⋀_{1≤i≤n} A ei = ( Σ_{1≤t≤n} at1 et ) ∧ ⋀_{2≤i≤n} A ei = Σ_{1≤t≤n} (at1 et ) ∧ ⋀_{2≤i≤n} ( Σ_{1≤k≤n} aki ek )
= Σ_{1≤t≤n} (at1 et ) ∧ ⋀_{2≤i≤n} ( Σ_{1≤k≤n, k≠t} aki ek ) = Σ_{1≤t≤n} at1 et ∧ ( det At1 ⋀_{1≤i≤n, i≠t} ei )
= Σ_{1≤t≤n} at1 det At1 ( et ∧ ⋀_{1≤i≤n, i≠t} ei ) = Σ_{1≤t≤n} at1 det At1 (−1)^{t−1} ⋀_{1≤i≤n} ei
= ( Σ_{1≤t≤n} (−1)^{t−1} at1 det At1 ) ⋀_{1≤i≤n} ei .
In order to get a similar result for the expansion with respect to the j-th column, we prove the
following lemma. It states that if two columns of a square matrix are interchanged, then the
determinant of the matrix changes its sign.
Proof. Let 1 ≤ i1 < . . . < ip ≤ n, and let {e1 , . . . , en } be the standard basis of Fn . Then
⋀_{1≤k≤p} A eik = (det A) ⋀_{1≤k≤p} eik .
What happens when we interchange two adjacent columns of A, say the j-th and the (j + 1)-th?
As
A eij ∧ A eij+1 = ( Σ_{1≤k≤p} akj eik ) ∧ ( Σ_{1≤t≤p} at,j+1 eit ) = Σ_{1≤k≤p} Σ_{1≤t≤p} (akj eik ) ∧ (at,j+1 eit ) =
Σ_{1≤k≤p} Σ_{1≤t≤p} akj at,j+1 (eik ∧ eit ) = Σ_{1≤k≤p} Σ_{1≤t≤p} at,j+1 akj (−eit ∧ eik ) = −(A eij+1 ∧ A eij ),
the interchange of two adjacent columns of A leads to a change of sign of the determinant. If
we wish to interchange the 1-st and the j-th columns of A, we can use (j − 1) adjacent column
interchanges to place the j-th column first, and then j −2 adjacent column interchanges to place
the 1-th column of A to be the j-th column of A′ . Since the total number of interchanges of
adjacent columns is an odd integer 2j − 3, the sign of the determinant will change odd number
of times. This proves the lemma.
Now we are ready to prove the formula for the expansion of det A with respect to the j-th column
for arbitrary j. Consider the matrix A′′ obtained from A by subsequent interchanges of the
j-th column with the first j − 1 columns. In other words, the first column of A′′ is the j-th
column of A, the k-th column of A′′ is the (k − 1)-th column of A for 1 < k ≤ j, and it is the
k-th column of A for j < k ≤ n. As it takes (j − 1) interchanges, det A = (−1)^{j−1} det A′′
by Lemma 20. At the same time, det Aij = det A′′i1 for all i. Expanding det A′′ with respect to
the 1-st column we obtain:
det A = (−1)^{j−1} det A′′ = (−1)^{j−1} Σ_{1≤t≤n} (−1)^{t+1} atj det A′′t1 = Σ_{1≤t≤n} (−1)^{t+j} atj det Atj ,
as claimed.
We just wish to mention that for large n, and general A, the computation of det A using the
expansion by permutations takes long, and, hence, is not practical.
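For completeness, here is a direct (and, as just noted for large n, inefficient) sketch of the expansion with respect to the first column, written in plain Python; the function name det is ours.

```python
def det(A):
    """Determinant by Laplace expansion along the first column (exponential time)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for t in range(n):
        minor = [row[1:] for k, row in enumerate(A) if k != t]   # delete row t and column 1
        total += (-1) ** t * A[t][0] * det(minor)
    return total

print(det([[3, -2], [1, 5]]))                     # 17
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))    # -3
```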
In solutions of the following problems you can use the techniques based on exterior algebra, as well as
all other properties of determinants that we proved.
3. A square matrix D = (dij ) of order n is called diagonal if dij = 0 for all i ≠ j. Prove that
det D = d11 d22 · · · dnn .
4. A square matrix A = (aij ) of order n is called upper triangular (lower triangular) if aij = 0 for
all i > j (i < j). Prove that for both upper and lower triangular matrix A, det A = a11 a22 · · · ann .
7. Let A be a square matrix of order n, A1 (A2 ) be a matrix obtained from A by interchanging two
rows (columns) of A, and A3 (A4 ) be a matrix obtained from A by replacing the i-th row (column)
of A by the sum of this row (column) with the j-th row (column) multiplied by a scalar.
Prove that
det A = − det A1 = − det A2 = det A3 = det A4 .
The property det A = det A3 = det A4 is very useful for computing determinants: if aij ≠ 0, then
applying the transformation several times one can make all entries of the j-th column or the i-th
row of A, except aij , equal to zero.
8. (Vandermonde’s determinant) Let {x1 , . . . , xn } be n distinct elements from a field F. Prove that
      [ 1          1          . . .   1         ]
      [ x1         x2         . . .   xn        ]
det   [ x1^2       x2^2       . . .   xn^2      ]   =   Π_{1≤i<j≤n} (xj − xi )
      [ . . .      . . .      . . .   . . .     ]
      [ x1^{n−1}   x2^{n−1}   . . .   xn^{n−1}  ]
(all diagonal entries of the matrix are a, and all other entries are b).
We are going to present a kind of an explicit formula for the inverse of a nonsingular square
matrix. As before, we denote by Aij the matrix obtained from a square matrix A by deleting
the i-th row and the j-th column. Let
bij = (−1)^{i+j} det Aij
be the (ij)-cofactor of A, and let adj A = (bij )^T . The matrix adj A is called the classical
adjoint of A. By GL(n, F) we will denote the set of all nonsingular n × n matrices. Since
it forms a group under multiplication (i.e., the multiplication is an associative operation on
GL(n, F), there exists an identity element, and every element has an inverse with respect to
multiplication), GL(n, F) is called the general linear group.
Theorem 21 Let A be a square matrix of order n over F. Then A(adj A) = (adj A)A =
(det A)In .
Proof. Our proof is based on two facts: (i) the Laplace expansion formula, and (ii) that a
matrix with two equal columns has zero determinant.
Let B = (bij ). Then adj A = B^T . Let C = (cij ) = (adj A)A. Then, for all j, we have
cjj = [b1j , b2j , . . . , bnj ] [a1j , a2j , . . . , anj ]^T = Σ_{k=1}^{n} bkj akj = det A,
since the last sum is exactly the expansion of det A with respect to the j-th column.
For i ≠ j, we get
cij = [b1i , b2i , . . . , bni ] [a1j , a2j , . . . , anj ]^T = Σ_{k=1}^{n} bki akj .
Consider a matrix A′ = (a′ij ) with two equal columns: A′ is obtained from A by replacing the
i-th column of A by the j-th column of A. Note that det A′ki = det Aki for all k, and that
det A′ = 0 since columns of A′ are linearly dependent. Expanding det A′ with respect to the
i-th column we obtain:
0 = det A′ = Σ_{1≤k≤n} a′ki (−1)^{k+i} det A′ki = Σ_{1≤k≤n} akj (−1)^{k+i} det Aki = Σ_{1≤k≤n} akj bki .
Therefore cij = 0 for i ≠ j. This proves that (adj A)A = (det A)In .
The proof of the following corollary is obvious. It gives an explicit formula for the inverse of a
nonsingular square matrix.
We just wish to mention that for large n, and general A, the computation of A−1 using adj A
takes very long, and, hence, is not practical.
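Still, for small matrices the formula is easy to apply directly. A sketch with NumPy (the matrix and the helper name adjugate are ours) that checks Theorem 21 and the resulting formula for the inverse.

```python
import numpy as np

def adjugate(A):
    """Classical adjoint: adj A = (b_ij)^T with b_ij = (-1)^(i+j) det A_ij."""
    n = A.shape[0]
    B = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            B[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return B.T

A = np.array([[2., 1., 0.], [0., 1., 1.], [1., 0., 3.]])
d = np.linalg.det(A)                                    # here det A = 7
assert np.allclose(A @ adjugate(A), d * np.eye(3))      # Theorem 21
assert np.allclose(np.linalg.inv(A), adjugate(A) / d)   # A^{-1} = (1/det A) adj A
```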
Theorem 21 and Corollary 22 allow us to prove Cramer’s rule for solutions of a system of n
linear equations with n unknowns:
a11 x1 + . . . + a1n xn = b1
a21 x1 + . . . + a2n xn = b2
............
an1 x1 + . . . + ann xn = bn .
Let bij be the (ij)-cofactor of A = (aij ). Fix some j, 1 ≤ j ≤ n. Multiplying both sides of the
i-th equation by bij , and then adding all the results, we obtain:
Σ_{1≤i≤n} Σ_{1≤k≤n} aik bij xk = Σ_{1≤k≤n} Σ_{1≤i≤n} aik bij xk = Σ_{1≤i≤n} bi bij .
By Theorem 21, the inner sum Σ_{1≤i≤n} aik bij is equal to 0 if k ≠ j, and is det A for k = j.
Hence we have
(det A) xj = Σ_{1≤i≤n} bi bij .
Let Aj be the matrix obtained from A by replacing its j-th column by the column [b1 , . . . , bn ]^T .
Then Σ_{1≤i≤n} bi bij = det Aj , and we have
(det A) xj = det Aj
for all j. This implies that if the system has a solution x, and det A ≠ 0, then
x = [det A1 / det A, . . . , det An / det A]^T .
It also shows that if det A = 0, but det Aj ≠ 0 for at least one value of j, then the system has
no solutions.
For det A ≠ 0, we can check that xi = det Ai / det A, i = 1, . . . , n, are indeed solutions of the system
by substituting them into an arbitrary equation of the system and simplifying.
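A small numeric check of Cramer's rule for a 2 × 2 system of our own choosing.

```python
import numpy as np

A = np.array([[2., 1.], [1., 3.]])
b = np.array([3., 5.])

detA = np.linalg.det(A)                 # 5.0, so the solution is unique
x = []
for j in range(2):
    Aj = A.copy()
    Aj[:, j] = b                        # replace the j-th column of A by b
    x.append(np.linalg.det(Aj) / detA)

print(x)                                # [0.8, 1.4]
assert np.allclose(A @ np.array(x), b)
```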
If b = 0, i.e., the system is homogeneous, then all det Aj = 0. Therefore, if det A ≠ 0, the
system will have only the trivial solution x = 0. If b = 0 and the system has a non-trivial
solution, then det A = 0. Hence we proved the following theorem.
If b = 0, i.e., the system is homogeneous, then, if det A ≠ 0, the system has only the trivial
solution x = 0. Equivalently, if b = 0 and the system has a non-trivial solution, then det A = 0.
As applications of the Cramer’s rule and Vandermonde’s determinant, we can prove the following
fundamental facts about polynomials.
Proof. If A is the matrix of the coefficients of the corresponding system, then by Cramer’s rule
ai = det Ai / det A = 0/ det A = 0, i = 0, . . . n, as every matrix Ai contains a column of all
zeros.
The corollary above can be restated this way: no polynomial of degree n over a field can have
more than n distinct roots. It turns out that if roots are not distinct, but are counted with
their multiplicities, then it is still true that no polynomial of degree n over a field can have
more than n roots. But a proof in this case must be different.
The following corollary just restates the uniqueness part of Theorem 24. It generalizes the
facts that there exists a unique line passing through two points, and a unique parabola passing
through any three points, etc..
Corollary 26 Let f, g ∈ F[x] be two polynomials of degree at most n. If f (xi ) = g(xi ) for
n + 1 distinct elements xi of F, then f = g.
where the factor x − xi is missing in the numerator and the factor xi − xi = 0 is missing in the
denominator.
Then it is obvious that fi (xi ) = 1 for all i, and fi (xj ) = 0 for all i ≠ j. This implies that the
polynomial
L(x) = y1 f1 (x) + . . . + yn+1 fn+1 (x) =
Σ_{i=1}^{n+1} yi [ (x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn+1 ) ] / [ (xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn+1 ) ]            (17)
has the property that its degree is at most n and L(xi ) = yi for all i. We wish to note that
the polynomial L is exactly the same polynomial as the polynomial f obtained in the proof
of Theorem 24. Though it is not clear from the ways these polynomials are defined, it is the
case, as we showed in the proof of the theorem, and then again in Corollary 26, that such a
polynomial is unique. The form (17) is just a representation of the polynomial in the basis fi
of the vector space of all polynomials over F of degree at most n. This form is often referred
to as the Lagrange Interpolation Formula. For another view of the Lagrange interpolation
formula via Linear Algebra, the one which uses the notion of the dual basis, see the text by
Hoffman and Kunze.
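A direct transcription of (17) into Python (the sample points are our own); each term is yi multiplied by the product of (x − xj)/(xi − xj) over j ≠ i.

```python
def lagrange(points):
    """Return the interpolating polynomial L as a Python function, built as in (17)."""
    def L(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return L

pts = [(1, 1), (2, 4), (3, 9)]          # three points on the graph of x^2
L = lagrange(pts)
print([L(x) for x, _ in pts])           # [1.0, 4.0, 9.0]
print(L(4))                             # 16.0 : the interpolant here is x^2
```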
We conclude with the following amazing fact. The proof is immediate and is left to the reader.
2. Find a polynomial over the reals of degree at most 3 whose graph passes through the points (0, 1), (1, 2), (2, 4),
and (3, 2).
We remind the class of the notions of the three basic elementary row operations, the row-echelon
form of a matrix, and the reduced row-echelon form.
Suggested reading for the review: the text by Dummit and Foote, pages 424 – 431 as the
collection of all main facts, and favorite undergraduate texts.
We assume that we know that any matrix can be transformed into a row-echelon form by using
the elementary row operations.
The row space (column space) of a matrix A ∈ Mm×n (F) is the span of all its row vectors
(column vectors). The dimensions of these spaces are called, respectively, the row-rank and
the column-rank of the matrix. Is there any relation between these spaces? The row space is
a subspace of Fn , and the column space is a subspace of Fm , so, in general, they are distinct
spaces. These two spaces can be distinct even for m = n, as the following example shows:
A = [ 1 1
      0 0 ]
The row space of A is ⟨(1, 1)⟩, and its column space is ⟨(1, 0)⟩. The spaces are distinct, though
both are 1-dimensional. Experimenting with several more examples, like
[ 1  0  0 ]      [ 1  1  2  7 ]      [ 1  1  2 ]
[ 0  3  0 ] ,    [ 0  1  3  3 ] ,    [ 2  1  0 ]
[ 0  0 −1 ]      [ 0  0  1 −1 ]      [ 3  2  2 ]
                                     [ 6  4  4 ]
                                     [ 1  0  1 ] ,
we notice that the dimensions of the row space and the column space of each of these matrices are
equal. It turns out that this is always true: the row-rank of any matrix is equal to its column-
rank. This will be proved in Theorem 30 below. The common value of the row-rank of A and
its column-rank, is called the rank of A, and it is denoted by rank A.
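A quick sanity check of the equality of the two ranks on one of the matrices above (NumPy computes the rank from singular values, so this is only an illustration, not a proof).

```python
import numpy as np

A = np.array([[1, 1, 2],
              [2, 1, 0],
              [3, 2, 2],
              [6, 4, 4],
              [1, 0, 1]])

print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T))   # 3 3
```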
Lemma 28 Both row-rank and column-rank are preserved by (or invariant under) the elemen-
tary row operations and by the elementary column operations.
Proof. Let A = (apq ) be an m × n matrix. The fact that an elementary row operation does not
change the row space of A, and so its row-rank, is easy to demonstrate, and we leave it to the
reader. We will prove a less obvious fact, namely that an elementary row operation preserves
the column-rank.
Consider the operation that replaces the i-th row of A by its sum with c times the j-th row (the
other types of elementary row operations are handled similarly), and let A′ = (a′pq ) be the resulting
matrix. Let C1 , C2 , . . . , Cn and C1′ , C2′ , . . . , Cn′ denote the columns of A and of A′ , respectively.
Suppose Σ_{k=1}^{n} λk Ck = 0. This is equivalent to Σ_{k=1}^{n} λk atk = 0 for all t ∈ [m]. Then
Σ_{k=1}^{n} λk a′ik = Σ_{k=1}^{n} λk (aik + cajk ) = Σ_{k=1}^{n} λk aik + c Σ_{k=1}^{n} λk ajk = 0 + c · 0 = 0.
As all rows of A and A′ , except maybe the i-th, are equal, we get that Σ_{k=1}^{n} λk Ck = 0 implies
Σ_{k=1}^{n} λk Ck′ = 0.
Suppose Σ_{k=1}^{n} λk Ck′ = 0. This is equivalent to Σ_{k=1}^{n} λk a′tk = 0 for all t ∈ [m]. For t = j, we
get Σ_{k=1}^{n} λk a′jk = Σ_{k=1}^{n} λk ajk = 0. For t = i, we get
Σ_{k=1}^{n} λk a′ik = 0 ⇔ Σ_{k=1}^{n} λk (aik + cajk ) = 0 ⇔ Σ_{k=1}^{n} λk aik + c Σ_{k=1}^{n} λk ajk = 0.
As Σ_{k=1}^{n} λk ajk = 0, we obtain Σ_{k=1}^{n} λk aik = 0. This implies that Σ_{k=1}^{n} λk Ck = 0. Hence
Σ_{k=1}^{n} λk Ck = 0 if and only if Σ_{k=1}^{n} λk Ck′ = 0.
Think about it. Does not this immediately imply what we need, i.e., that the column-ranks of
A and A′ are equal? Of course!
Lemma 29 Let {v1 , . . . , vn } and {u1 , . . . , un } be two sets of n vectors such that, for all λ1 , . . . , λn ∈ F,
λ1 v1 + . . . + λn vn = 0 if and only if λ1 u1 + . . . + λn un = 0.
Then dim ⟨v1 , . . . , vn ⟩ = dim ⟨u1 , . . . , un ⟩.
Proof. Let k = dim ⟨v1 , . . . , vn ⟩. Then there exist vi1 , . . . , vik which form a basis of ⟨v1 , . . . , vn ⟩.
The condition of the lemma implies that {ui1 , . . . , uik } is a basis of ⟨u1 , . . . , un ⟩. Indeed, the
linear independence of ui1 , . . . , uik is clear. If vj = Σ_{t=1}^{k} βt vit for some βt , then (−1)vj +
Σ_{t=1}^{k} βt vit = 0. Hence, (−1)uj + Σ_{t=1}^{k} βt uit = 0, which is equivalent to uj = Σ_{t=1}^{k} βt uit . This
proves that {ui1 , . . . , uik } spans ⟨u1 , . . . , un ⟩, and so it is a basis of ⟨u1 , . . . , un ⟩.
As the row-rank and the column-rank of a matrix in its row-echelon form are equal, we get the
following
Theorem 30 Row-rank and column rank of an arbitrary matrix in Mm×n (F) are equal.
Let A ∈ Mm×n (F). Deleting some (or possibly none) rows or columns of A, we can obtain a
submatrix of A.
1. rank A = rank AT
5. Let B ∈ Mn×p (F). Then rank AB ≤ min{rank A, rank B}. If m > n, then AB is singular.
Proof. Here are hints for proofs. The reader should supply all missing details.
2. Straightforward.
3. One may try two different approaches. For the first, use the row-echelon form of the matrix.
For the second approach, consider the map φ : Rn → Rm given by x ↦ Ax. Then im φ is the
column space of A, and ker φ is the solution space of the system Ax = 0.
4. Follows easily from the earlier result dim(U + W ) = dim U + dim W − dim U ∩ W , where U
and W are subspaces of a finite-dimensional space V .
5. The inequality follows from the observation that a column space of AB is a subspace of the
column space of A; and the row space of AB is a subspace of the row space of B. The second
statement follows from it.
In this lecture we begin to study how the notions of the Euclidean geometry in dimensions 2
and 3 are generalized to higher dimensions and arbitrary vector spaces. Our exposition is close
to the one in [2].
Let V be a vector space over a field F. A bilinear form over V is a map f : V × V → F such
that f is linear in each variable:
f (αu + βv, w) = αf (u, w) + βf (v, w)
and
f (w, αu + βv) = αf (w, u) + βf (w, v)
for all u, v, w ∈ V and all α, β ∈ F.
EXAMPLES.
• V = Fn , f (x, y) = x1 y1 + . . . + xn yn .
The relation between bilinear forms on a finite-dimensional vector space and matrices is de-
scribed in the theorem below. We remind the reader that a matrix B is called symmetric if
B = BT .
Proof. If {e1 , . . . , en } is the standard basis of Fn , then B = (bij ), where bij = f (ei , ej ). Then,
using bilinearity of f ,
f (x, y) = f ( Σ_{1≤i≤n} xi ei , Σ_{1≤j≤n} yj ej ) = Σ_{1≤i,j≤n} xi yj f (ei , ej ) = x^T B y.
This argument shows that B depends on the choice of the basis. If α = {v1 , . . . , vn } is another
basis, let A = (aij ), where aij = f (vi , vj ). Then f (x, y) = [x]α^T A [y]α .
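A tiny numeric check (our own symmetric B) that f(x, y) = x^T B y is linear in the first argument and symmetric when B = B^T.

```python
import numpy as np

B = np.array([[2., 1.], [1., 3.]])                  # a symmetric matrix
f = lambda x, y: x @ B @ y

x, y, z = np.random.rand(2), np.random.rand(2), np.random.rand(2)
a, b = 2.0, -3.0
assert np.isclose(f(a * x + b * z, y), a * f(x, y) + b * f(z, y))   # linearity in the first slot
assert np.isclose(f(x, y), f(y, x))                                 # symmetry, since B = B^T
```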
If f is a symmetric bilinear form over F, we call the pair (V, f ) an inner product space, with
f the inner product.
For the rest of this lecture, f will denote an arbitrary inner product over V = Fn , and B will
denote the associated symmetric matrix.
The vectors u and v are called orthogonal (or perpendicular) if their inner product is zero.
This often is denoted by u ⊥ v. For a subset S ⊆ V , we define the perpendicular space of S
as
S ⊥ = {v ∈ V : f (u, v) = 0 for all u ∈ S}.
1. {0}⊥ = V .
2. S ⊥ T ⇔ S ⊆ T ⊥ ⇔ T ⊆ S ⊥ .
4. If S ⊆ T ⊆ V then T ⊥ ≤ S ⊥ ≤ V .
5. S ⊆ S ⊥⊥ .
For a subspace U ≤ V , define the radical of U as
rad U = U ∩ U ⊥ .
The subspace U is called singular if rad U ≠ ⟨0⟩, and nonsingular otherwise. We call the
inner product singular or nonsingular according to whether or not the whole space V itself
is singular. In other words, (V, f ) is nonsingular if the only vector of V orthogonal to the whole
of V is the zero vector. The terminology suggests that the notion of singularity in inner product spaces
is related to singularity of matrices. Indeed, this is the case.
Theorem 34 Let (V, f ) be an inner product space, dim V = n, B be a matrix associated with
f , and U ≤ V be an arbitrary subspace of V . Then the following holds.
Proof. Let {u1 , . . . , uk } be a basis of U . Then x ∈ U ⊥ if and only if x ⊥ ui for all i, or,
equivalently, x is a solution of the homogeneous system of k linear equations
ui^T B x = 0,   i = 1, . . . , k.
As the rank of this system is at most k, its solution space U ⊥ has dimension at least
n − k (Theorem 31). This proves (a).
Suppose B is nonsingular. Then all vectors uTi B are linearly independent. Indeed, let
∑_{1≤i≤k} λi uTi B = 0.
If B is singular, then rank B < n, and the system By = 0 has a nontrivial solution. Call it v,
v ≠ 0. So Bv = 0. Then for every w ∈ V , wT B v = wT (B v) = wT 0 = 0. Hence v ≠ 0 and v
is orthogonal to the whole space V . Then rad V ≠ ⟨0⟩, and the inner product space is singular. This
proves the second implication in part (c).
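Here is a tiny numerical illustration of this part of the argument; the singular symmetric matrix B below is an arbitrary example, not taken from the text.

    import numpy as np

    # A singular symmetric B yields a nonzero vector orthogonal to the whole space.
    B = np.array([[1., 2.],
                  [2., 4.]])                 # symmetric, det B = 0

    v = np.array([2., -1.])                  # a nontrivial solution of B v = 0
    print(B @ v)                             # [0, 0]

    # v is orthogonal to every w:  f(w, v) = w^T B v = w^T 0 = 0
    for w in [np.array([1., 0.]), np.array([3., 5.]), np.array([-2., 7.])]:
        print(w @ B @ v)                     # 0.0 each time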
1. Let (V, f ) be an inner product space, and let u, w be two isotropic vectors in V . Prove that if
u ⊥ w, then Span({u, w}) is a totally isotropic subspace.
2. For each of the following inner product spaces determine whether it is singular, and if it is find
a nonzero vector orthogonal to the whole space. Find isotropic vectors or show that they do not
exist. Find a maximal totally isotropic subspace (if isotropic vectors exist).
(a) V = R2 , f (x, y) = x1 y1 .
(b) V = R2 , f (x, y) = x1 y2 + x2 y1 .
(c) V = R2 , f (x, y) = x1 y1 − x2 y2 .
(d) V = C[a, b], f (u, v) = ∫_a^b u(t)v(t) dt.
(e) V = C2 with the standard inner product.
(f) V = R4 , f (x, y) = xT By, where
B =
[ 4  2  1  0 ]
[ 2  1  0  0 ]
[ 1  0 −1  1 ]
[ 0  0  1 −1 ].
(g) V = F_p^2 with the standard inner product, for p = 5 and for p = 7.
(h) V = F_2^5 with the standard inner product.
(i) V = R4 with the inner product f (x, y) = x1 y1 + x2 y2 + x3 y3 − x4 y4 . This is the famous
Minkowski space used in the theory of special relativity. Here (x1 , x2 , x3 ) correspond to the
space coordinates of an event in R3 , and x4 to its time coordinate.
3. Let U be a subspace in an inner product space over F, where F is any field we have been using in
this course except F2 . Prove that if every vector of U is isotropic, then U is totally isotropic.
Show that the statement is not necessarily correct for F = F2 .
4. Let V be an anisotropic inner product space. Let {v1 , . . . , vn } be a set of nonzero vectors in V
such that vi ⊥ vj for all i ≠ j. Prove that {v1 , . . . , vn } are linearly independent.
5. Prove that the set of vectors {cos x, cos 2x, . . . , cos nx, sin x, sin 2x, . . . , sin nx} is linearly indepen-
dent in V = C[0, 2π].
(Hint: Consider the inner product f (u, v) = ∫_a^b u(t)v(t) dt, and use the previous exercise.)
7. Let f be the standard inner product on F3 . Let α = {[2, 1, −1], [1, 0, 1], [3, 1, 1]} be another basis
of F3 . Find a matrix A such that f (x, y) = [x]Tα A [y]α .
8. (Optional) Prove that F_p^3 with the standard inner product is isotropic for every prime p.
The goal of this lecture is to discuss the Gram-Schmidt orthogonalization process. Besides
being a very useful tool, it will lead us to some striking conclusions.
Let (V, f ) be an inner product space. A set S of vectors of (V, f ) is called orthogonal if every
two distinct vectors of S are orthogonal.
Proposition 36 Let (V, f ) be an n-dimensional inner product space, and let S be an orthogonal
set of vectors, such that no vector of S is isotropic. Then S is a linearly independent set, and
|S| ≤ n.
So λi f (vi , vi ) = 0 for all i. Since f (vi , vi ) ≠ 0 (S has no isotropic vectors), λi = 0 for all i.
Hence S is linearly independent. Since dim V = n, every n + 1 vectors of V are linearly
dependent. So |S| ≤ n.
1. Prove that the set S = {cos x, . . . , cos nx, sin x, . . . , sin nx} of functions in V = C[0, 2π] is
linearly independent (over R).
Solution. Indeed, let f (u, v) := ∫_0^{2π} u(t)v(t) dt, where u, v ∈ V . Then f is an inner product on V .
Consider the inner product space (V, f ). As
f (cos kt, cos kt) = ∫_0^{2π} cos² kt dt = ∫_0^{2π} sin² kt dt = f (sin kt, sin kt) > 0,
2. People in a city of 100 residents like to form clubs. The only restrictions on these clubs are
the following:
Solution. Let {p1 , . . . , p100 } be the set of all people in the city, and let C1 , . . . , Cm denote a set
of clubs. For each club Ci consider a vector vi ∈ F_2^100 , such that the k-th component of vi is 1 if
pk ∈ Ci , and is 0 otherwise. Consider the standard inner product f on F_2^100 . As each |Ci | is an
odd integer, each vi = (vi1 , . . . , vi100 ) contains an odd number of components equal to 1. Hence
f (vi , vi ) = ∑_{k=1}^{100} vik vik = 1 + . . . + 1 (|Ci | addends) = 1.
As each |Ci ∩ Cj | is an even integer, the vectors vi and vj share 1’s in an even number of the same
components. Hence
f (vi , vj ) = ∑_{k=1}^{100} vik vjk = 1 + . . . + 1 (|Ci ∩ Cj | addends) = 0.
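The same bookkeeping can be done by machine. The following Python sketch uses a toy city of 6 residents and 3 clubs (invented for illustration, with odd sizes and even pairwise intersections); it checks the two displayed identities and verifies that the club vectors are linearly independent over F2.

    import numpy as np

    # Club membership vectors over F_2: odd sizes, even pairwise intersections.
    clubs = [{0, 1, 2},          # |C1| = 3 (odd)
             {1, 2, 3, 4, 5},    # |C2| = 5 (odd), |C1 ∩ C2| = 2 (even)
             {0, 1, 3}]          # |C3| = 3 (odd), intersections with C1, C2 have size 2

    n = 6
    V = np.array([[1 if k in C else 0 for k in range(n)] for C in clubs], dtype=int)

    print((V @ V.T) % 2)         # identity matrix: f(v_i, v_i) = 1, f(v_i, v_j) = 0 in F_2

    def rank_mod2(M):
        """Gaussian elimination over F_2; returns the rank."""
        M = M.copy() % 2
        r = 0
        for c in range(M.shape[1]):
            pivot = next((i for i in range(r, M.shape[0]) if M[i, c]), None)
            if pivot is None:
                continue
            M[[r, pivot]] = M[[pivot, r]]          # move the pivot row up
            for i in range(M.shape[0]):
                if i != r and M[i, c]:
                    M[i] ^= M[r]                   # eliminate the column entry (XOR = addition mod 2)
            r += 1
        return r

    print(rank_mod2(V))          # 3: the club vectors are linearly independent over F_2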
OK, let’s get serious. Everyone likes orthogonal bases. The good news is that often we can build
one. A popular technology is called the Gram-Schmidt Orthogonalization process. And
it is free! And Jørgen P. Gram (1850-1916) and Erhard Schmidt (1876-1959) are two different
people. And the method seems to have been known to Laplace (1749-1827), and was used by Cauchy in
1846...
Proof. Let {v1 , . . . , vn } be a basis of V . Set v1′ = v1 , and try to find v2′ ∈ Span({v1 , v2 }) such
that v1′ ⊥ v2′ and Span({v1′ , v2′ }) = Span({v1 , v2 }).
In order to do this, search for v2′ in the form v2′ = v2 + xv1′ , where the scalar x is unknown.
Then
v1′ ⊥ v2′ ⇔ f (v1′ , v2′ ) = 0 ⇔ f (v1′ , v2 + xv1′ ) = 0 ⇔ f (v1′ , v2 ) + xf (v1′ , v1′ ) = 0 ⇔ x = − f (v1′ , v2 ) / f (v1′ , v1′ ).
Thus we take
v2′ = v2 − ( f (v1′ , v2 ) / f (v1′ , v1′ ) ) v1′ .
As v1′ ⊥ v2′ , as none of them is the zero vector (why?), and as (V, f ) is anisotropic, the vectors
v1′ , v2′ are linearly independent by Proposition 36. As they are in Span({v1 , v2 }), we get
Span({v1′ , v2′ }) = Span({v1 , v2 }).
If n > 2, we search for v3′ in the form v3′ = v3 + xv1′ + yv2′ . We wish to have v1′ ⊥ v3′ and v2′ ⊥ v3′ .
Taking the inner product of v1′ with v3′ , and of v2′ with v3′ , we obtain
x = − f (v1′ , v3 ) / f (v1′ , v1′ )   and   y = − f (v2′ , v3 ) / f (v2′ , v2′ ).
Hence
v3′ = v3 − ( f (v1′ , v3 ) / f (v1′ , v1′ ) ) v1′ − ( f (v2′ , v3 ) / f (v2′ , v2′ ) ) v2′ .
Clearly v1′ , v2′ , v3′ are pairwise orthogonal, and each vector is nonzero. As (V, f ) is anisotropic,
vectors v1′ , v2′ , v3′ are linearly independent by Proposition 36. As Span({v1′ , v2′ , v3′ }) ≤ Span({v1 , v2 , v3 }),
we obtain Span({v1′ , v2′ , v3′ }) = Span({v1 , v2 , v3 }). Continue by induction, if needed.
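In the standard Euclidean space the recursion above is easy to carry out by machine. Here is a minimal Python sketch of the same computation, assuming the standard inner product f (x, y) = x · y on Rn; the basis used is an arbitrary example.

    import numpy as np

    def gram_schmidt(vectors):
        """Return an orthogonal list v'_1, ..., v'_n with the same span as the input basis."""
        ortho = []
        for v in vectors:
            v_prime = v.astype(float)
            for u in ortho:
                v_prime = v_prime - (u @ v) / (u @ u) * u    # subtract f(u, v)/f(u, u) * u
            ortho.append(v_prime)
        return ortho

    basis = [np.array([1., 1., 0.]), np.array([1., 0., 1.]), np.array([0., 1., 1.])]
    ortho = gram_schmidt(basis)
    for u in ortho:
        print(u)
    # pairwise inner products of distinct vectors are all 0:
    print([round(ortho[i] @ ortho[j], 10) for i in range(3) for j in range(3) if i < j])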
If f (x, x) = a2 for some scalar a ≠ 0, then for x′ = (1/a) x we have
f (x′ , x′ ) = f ( (1/a) x, (1/a) x ) = (1/a)(1/a) f (x, x) = (1/a2 ) a2 = 1.
For the field of real numbers R, every nonnegative number is a square. Therefore if f has the
property that f (x, x) > 0 for all x ≠ 0 (f is called positive definite in this case), then each
vector has ‘length’ √f (x, x), which is usually called the norm of x in (V, f ), and which is
denoted by ‖x‖. If x ≠ 0, then (1/‖x‖) x is a unit vector. We say that a set of vectors in an inner
product space is orthonormal if it is orthogonal and all vectors in the set are unit vectors.
Corollary 38 Let (V, f ) be an n-dimensional inner product space over R, where f is positive
definite. Then V has an orthonormal basis.
We are ready to pass to the ‘striking conclusions’ promised at the beginning of the lecture.
1. Find an orthogonal basis in each of the following spaces (V, f ). Make the basis you found orthonormal, if
possible.
Fn = {1, cos x, cos 2x, . . . , cos nx, sin x, sin 2x, . . . , sin nx}.
Elements of V are called Fourier polynomials of order at most n. Consider the inner product
f (u, v) = ∫_0^{2π} u(t)v(t) dt on V . Check that Fn is an orthogonal basis of V . Turn it into an orthonormal
basis of V . Let
f = a0 /2 + ∑_{k=1}^{n} (ak cos kx + bk sin kx).
Express the coefficients ai and bi (Fourier coefficients of f ) as inner products of f and vectors
from the orthonormal basis.
4. People in a city of 100 residents like to form clubs. The only restrictions on these clubs are the
following:
Prove that the greatest number of clubs that they can form is 2^50 . (Note the surprising difference
with the example discussed in the lecture.)
5. People in a city of 100 residents like to form clubs. The only restrictions on these clubs are the
following:
Hint: Let {p1 , . . . , p100 } be the set of all people in the city, and let C1 , . . . , Cm denote a set of
clubs. For each club Ci consider a vector vi ∈ R100 , such that the k-th component of vi is 1 if
pk ∈ Ci , and is 0 otherwise. Consider the m × 100 matrix A, whose rows are the vectors vi . Prove
that AAT is nonsingular.
(ii) (Optional) What is the greatest number of clubs that they can form?
6. (Optional) Solve the previous problem where the seemingly strong condition (i) is replaced by the
trivial condition that all clubs are distinct (as subsets of people).
Though there are many advantages in considering general inner product spaces, if one wants
to generalize the usual Euclidean geometry of 2- and 3-dimensional spaces to higher dimensions,
one considers an inner product space (V, f ) over the real numbers with f being a positive definite
form. We will call these spaces Euclidean spaces. If V = Rn and f is the standard inner
product, we call (V, f ) the standard n-dimensional Euclidean space. There are at least two
ways of proceeding with such generalizations. One way is to study arbitrary Euclidean spaces
axiomatically, i.e., based on the definition of a symmetric positive definite bilinear form. Another
way is to explain that all n-dimensional Euclidean spaces are in a certain sense the same, and
to work with just one of them, your favorite. We will discuss both these approaches.
Theorem 39 Let (V, f ) be a Euclidean space, and ‖x‖ = √f (x, x) be the norm.
The equality is attained if and only if all vectors are colinear and of the ‘same direction’,
i.e., there exists i such that for all j, vj = kj vi and kj ≥ 0.
Since ‖y‖² > 0, g(t) is a quadratic function on R which takes only nonnegative values. Hence
its discriminant D is nonpositive, i.e., f (x, y)² ≤ ‖x‖² ‖y‖². Moreover,
D = 0 ⇔ ∃t0 g(t0 ) = 0 ⇔ x = t0 y.
3. For k = 2, we have
Taking square roots, we obtain the result. The equality happens if and only if f (v1 , v2 ) =
‖v1 ‖ ‖v2 ‖. Hence the vectors are colinear (again by Cauchy-Schwarz). If v1 = tv2 , we have
f (v1 , v2 ) = f (tv2 , v2 ) = tf (v2 , v2 ) = t ‖v2 ‖². At the same time, ‖v1 ‖ ‖v2 ‖ = ‖tv2 ‖ ‖v2 ‖ = |t| ‖v2 ‖².
Hence t ≥ 0. If k > 2, one proceeds by a straightforward induction.
For the standard n-dimensional Euclidean space, the inequalities can be rewritten as
∑_{1≤i≤n} ai bi ≤ ( ∑_{1≤i≤n} ai² )^{1/2} ( ∑_{1≤i≤n} bi² )^{1/2} ,
and for f (u, v) = ∫_a^b u(t)v(t) dt on C([a, b]),
∫_a^b u(t)v(t) dt ≤ ( ∫_a^b u(t)² dt )^{1/2} · ( ∫_a^b v(t)² dt )^{1/2} .
The distance between two vectors x and y in a Euclidean space (V, f ) is defined as kx − yk.
This definition is, of course, motivated by the distance between two points. Note that the
notion of a point of a Euclidean space has not been defined. Intuitively, we think about the vectors
as directed segments which have initial points at the origin, and about points as the endpoints
of these vectors, i.e., like in dimensions 2 and 3.
Now we turn to the second approach. We call two inner product spaces (V, f ) and (V ′ , f ′ )
isometric (or isomorphic), if there exists an isomorphism φ : V → V ′ , x 7→ x′ , such that
f (x, y) = f ′ (x′ , y ′ ) for every x, y ∈ V . Such an inner-product-preserving isomorphism is called an
isometry. When V = V ′ and f = f ′ , the isometries are often called orthogonal maps. It is
clear that an isometry also preserves norms associated with f and f ′ . The following theorem
may, at first, look very surprising.
Theorem 40 Every two n-dimensional Euclidean spaces are isometric. In particular, all such
spaces are isometric to the standard Euclidean space.
and φ is an isometry.
Hence, there exists essentially only one n-dimensional Euclidean geometry (up to isometries).
The theorem implies that every ‘geometric’ assertion (i.e., an assertion stated in terms of addi-
tion, inner product and multiplication of vectors by scalars) pertaining to vectors in the standard
n-dimensional Euclidean space, is also true in all other n-dimensional Euclidean spaces. For
n = 2, 3, it allows us to claim certain geometric facts without proof, as long as we know that
they are correct in the usual Euclidean geometry. In particular, the inequality ‖v1 + v2 ‖ ≤ ‖v1 ‖ + ‖v2 ‖
holds because the triangle inequality holds in usual plane geometry. No proof is necessary!
By now the reader understands that we use the word “geometry” quite freely in our discus-
sions. Here we wish to add several other informal comments.
What we mean by saying “geometry” depends on the context. We certainly used the term
much earlier, when we discussed just vector spaces. There we also had a way to identify
different spaces by using isomorphisms (just linear bijections). Those mappings preserved the
only essential features of vector spaces: addition and multiplication by scalars. This gave us
geometry of vector spaces, or just linear geometry.
Now, in addition to linearity, we wanted to preserve more, namely the inner product of vectors.
This is what the isometries do. This leads us to the geometries of inner product spaces.
So a geometry is defined by a set of objects and a set of transformations which preserve certain
relations between the objects, or some functions defined on the sets of objects. We will return
to this discussion later in the course, where new examples of geometries will be considered.
Sometimes, instead of a bilinear form f in the description of (V, f ) which intuitively represents
somehow both the lengths and angles, we can begin with a quadratic form Q, which correspond
to the notion of length only.
(ii) the function g(x, y) := (1/2)[Q(x + y) − Q(x) − Q(y)] is a symmetric bilinear function on V
(i.e., g is an inner product on V ).
Notice that
g(x, x) = (1/2)[Q(2x) − 2Q(x)] = (1/2)[4Q(x) − 2Q(x)] = Q(x),
so g(x, x) = Q(x). Hence Q can be “recovered” from g.
On the other hand, beginning with any symmetric bilinear form g on V , the function H(x) :=
g(x, x) is a quadratic form on V with g being its polar bilinear form. Indeed,
We have seen that, as soon as a basis α in V is chosen, every bilinear function on V can be
represented as xT By = ∑_{1≤i,j≤n} bij xi yj , where [x]α = [x1 , . . . , xn ] and [y]α = [y1 , . . . , yn ].
1. Prove that a2 + b2 + c2 ≥ ab + bc + ca for all real a, b, c, and that the equality is attained if and
only if they are equal.
2. Let x = (x1 , . . . , x5 ) ∈ R5 with standard norm ‖x‖ = 1. Let σ be an arbitrary permutation
(bijection) on {1, 2, 3, 4, 5}, and xσ = (xσ(1) , . . . , xσ(5) ). What is the greatest value of x1 xσ(1) +
. . . + x5 xσ(5) , and for which x is it attained?
3. Prove that if a + b + c = 1, then a2 + b2 + c2 ≥ 1/3, and that the equality is attained if and only
if a = b = c.
4. Consider a plane α : 2x − 3y − z = 5 in the usual 3-space (point space). Find the point in α which
is the closest to the origin.
6. Prove that in the usual 2- or 3-dimensional Euclidean space, the following geometric fact holds: in
any parallelogram, the sum of the squares of the diagonals is equal to the sum of the squares of all sides.
Does it remind you of something from the lectures?
7. Prove that in every tetrahedron ABCD, if two pairs of opposite (skew) sides are perpendicular,
then so is the third pair. Prove also that in such a tetrahedron, the sums of the squares of the lengths of
every pair of opposite sides are equal.
8. The goal of this exercise is to show that an isometry can be characterized in a somewhat more
economical way, namely as a map on V which just fixes the zero vector and preserves all distances between
vectors.
Let (V, f ) be a Euclidean n-dimensional space and φ : V → V be such that
Show that either one of the conditions (i) or (ii) alone does not imply that the map is an isometry.
• To establish some ties between usual Euclidean geometry and Linear algebra.
• To demonstrate that using even the simplest facts from linear algebra enables us to answer
some nontrivial questions from 2- and 3-dimensional Euclidean geometry.
When we think about vector spaces geometrically, we draw diagrams imagining Euclidean 1-,2-,
and 3-dimensional point spaces. Vectors are depicted as directed segments. Often we “tie” all
vectors to the origin, and say that their “endpoints” correspond to the points in the space. This
creates an inconvenience when we want to draw them somewhere else in the space, which is
desirable for many applications of vectors, in particular, in geometry and physics. Then we
agree that two different directed segments define the same vector if they have equal lengths and
are “directed the same”. We explain how to add and subtract vectors geometrically. This type of
discussion is usually not precise, but, nevertheless, we got used to passing from points to vectors
and back. Coordinates allow us to discuss all this with greater rigor, but the relations between the
definitions and the geometric rules for operations on vectors still have to be justified. It is not
hard to make the relation between point spaces and vector spaces precise, but we will not do
it here. See, e.g., [15], [8], or [12] for rigorous expositions. Usually a point space built over a
vector space V is referred to as an affine space associated with V .
Instead of dealing with affine Euclidean spaces, we will translate the usual notions of these spaces
into the language of vector spaces, and try to discuss them by means of linear algebra.
For the rest of this lecture we will deal with the standard inner product space (Rn , f ) only.
{ta : 0 ≤ t ≤ 1}.
Let a, b be two vectors. We define an affine segment spanned by a and b as the following
set of vectors:
The first representation exhibits a vector c = (1 − t)a + tb whose endpoint C lies on (point)
segment AB and divides its length in proportion AC/CB = t/(1 − t) = t1 /t2 , t, t1 > 0. For
t = t1 = 0 we get b and for t = t2 = 1 we get a. The second way of writing is more symmetric.
We say that an affine segment (line) spanned by a and b is parallel to the affine segment (line)
spanned by c and d if vectors b − a and d − c are colinear.
Let b, c be two non-colinear vectors. We define the triangle spanned by b and c as the
following set of vectors:
{tb + sc : t, s ≥ 0, s + t ≤ 1}.
If the definition of the segment made sense to us, the definition of a triangle should too. If
0 < k < 1, then the set of vectors {tb + sc : t, s ≥ 0, s + t = k} is equal to the set of vectors
{t(kb) + s(kc) : t, s ≥ 0, s + t = 1}, which is the segment with the endpoints kb and kc. Hence
the triangle is the union of all such segments.
{t1 a + t2 b + t3 c : ti ≥ 0, t1 + t2 + t3 = 1}.
We leave the verification that these sets are equal to the reader. The affine triangle spanned by
vectors a, b, c is a translation by the vector a of the triangle spanned by b − a and c − a.
{t1 b + t2 c + t3 d : 0 ≤ ti , t1 + t2 + t3 ≤ 1}.
An affine tetrahedron spanned by a, b, c, d, where vectors b−a, c−a, d−a are non-coplanar,
is the set of vectors defined as follows:
{a + t2 (b − a) + t3 (c − a) + t4 (d − a) : 0 ≤ ti , t2 + t3 + t4 ≤ 1}, or
{t1 a + t2 b + t3 c + t4 d : 0 ≤ ti , t1 + t2 + t3 + t4 = 1}.
{t1 b + t2 c + t3 d : 0 ≤ ti ≤ 1},
{a + t1 (b − a) + t2 (c − a) + t3 (d − a) : 0 ≤ ti ≤ 1},
3. The ratio of lengths of parallel segments. In particular, it preserves the ratio of lengths
of segments on the same line. In particular, the midpoint of a segment is mapped to the
midpoint of its image.
4. The ratio of areas or volumes of figures, where those are defined and nonzero.
Proof. 1. Let us demonstrate the property for affine triangles only. Others can be done
similarly. Let T = {t1 a + t2 b + t3 c : 0 ≤ ti , t1 + t2 + t3 = 1} be an affine triangle spanned
by a, b, c. Then b − a, c − a are non-colinear vectors. Then φ(T ) = {φ(t1 a + t2 b + t3 c) : 0 ≤
ti , t1 + t2 + t3 = 1} = {t1 φ(a) + t2 φ(b) + t3 φ(c) : 0 ≤ ti , t1 + t2 + t3 = 1}. As φ is non-singular,
vectors φ(b − a) = φ(b) − φ(a) and φ(c − a) = φ(c) − φ(a) are non-colinear, and, hence, φ(T ) is
the affine triangle spanned by φ(a), φ(b) and φ(c). ✷
3. All these statements follow immediately from the relation φ(x+t(y−x)) = φ(x)+t(φ(y−x)). ✷
4. Let us first restrict ourselves to areas only, and recall the notion of an area in R2 .
Consider a grid of congruent unit squares in R2 . As we have shown in parts 1,2,3, parallel lines
are mapped to parallel lines, and midpoints of segments to midpoints of their images. Therefore
the image of this grid will be a grid of congruent parallelograms.
Let F1 and F2 be two figures in R2 for which areas exist. If the grid of squares is sufficiently
fine, then the ratio of the number of squares in the interior of F1 to the number of squares in
the interior of F2 can be made as close to the ratio of their areas as we wish. Actually it will
be equal to the ratio of areas in the limit, as the length of the side of a square in the square
grid decreases to zero.
Consider now φ(F1 ) and φ(F2 ). The ratio of the numbers of the parallelograms in the interiors
of φ(F1 ) and φ(F2 ) will be exactly the same as the ratio of the numbers of grid squares in the
interiors of F1 and F2 . When the side of a square in the square grid decreases, so does the size
of the parallelogram in the corresponding parallelogram grid. Passing to the limit will give
area (F1 ) area (φ(F1 ))
= .
area (F2 ) area (φ(F2 ))
It is clear that a similar argument can be applied to the ratio of volumes in R3 .
Remark. The statement of part 4 can be restated as follows: there exists a positive constant
c = c(φ), such that area (φ(F )) = c area (F ) for all figures F in E 2 which have areas. The
argument used in the proof of part 4 applies to all figures with positive areas (volumes), and to
all dimensions. It turns out that the coefficient c is just | det Mφ |, where Mφ is the matrix
representing φ in the standard basis. Let us show it for dimensions 2 and 3.
Recall the following fundamental facts from analytical geometry: the area of a parallelogram
spanned by vectors a, b, whose coordinates in the standard basis of R2 are [a] = [a1 , a2 ] and
[b] = [b1 , b2 ], respectively, is the absolute value of the determinant of the matrix
[ a1  a2 ]
[ b1  b2 ],
and the volume of a parallelepiped spanned by vectors a, b, c, whose coordinates in the standard
basis of R3 are [a] = [a1 , a2 , a3 ], [b] = [b1 , b2 , b3 ] and [c] = [c1 , c2 , c3 ], respectively, is the absolute
value of the determinant of the 3 × 3 matrix with rows [a], [b], [c].
Let n = 2, let {e1 , e2 } be the standard basis of R2 , and let [φ(e1 ), φ(e2 )]T = A[e1 , e2 ]T . Hence
| det A| is the area of the parallelogram spanned by φ(e1 ) and φ(e2 ). Since the determinant
of the identity matrix is 1, the area of the unit square (the one spanned by e1 and e2 ) is 1.
So | det A| is the ratio of the area of the parallelogram to the area of the square. Let’s see
that the area of any parallelogram will change by the same factor. Let x and y be two non-
colinear vectors in R2 . Vectors x, y span a parallelogram with the area | det B|, where B is the
matrix having [x] and [y] as its rows. Then vectors φ(x) and φ(y) span a parallelogram with
area | det(BA)|, since the rows of BA are the coordinate vectors of [φ(x)] and [φ(y)]. Since
| det(BA)| = | det(B)| | det A|, we obtain that the area of every parallelogram changes by the
same factor | det A|. A similar argument works also for n = 3.
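A quick numerical check of this scaling, for the map x 7→ Ax with an arbitrary 2 × 2 matrix A and an arbitrary pair of vectors (the numbers below are illustrative, not from the text):

    import numpy as np

    # A linear map scales all parallelogram areas by the same factor |det A|.
    A = np.array([[2., 1.],
                  [1., 3.]])

    x = np.array([1., 4.])
    y = np.array([-2., 5.])

    area_before = abs(np.linalg.det(np.array([x, y])))         # parallelogram spanned by x, y
    area_after = abs(np.linalg.det(np.array([A @ x, A @ y])))  # parallelogram spanned by the images

    print(area_after / area_before)    # 5.0
    print(abs(np.linalg.det(A)))       # 5.0, the common scaling factor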
Proposition 42 Every two segments, two triangles, two parallelograms, two tetrahedra, two
parallelepipeds in V can be mapped to each other by a linear map.
Proof. The statement follows from the definitions of the figures in the statement, and the fact
that every bijection between two sets of linearly independent vectors can be extended to a
linear map of Rn .
Let us present three examples of how Theorem 41 and Proposition 42 can be used to solve
problems in Euclidean geometry.
Example 43 Is there a non-regular pentagon with the property that each of its diagonals is parallel
to one of its sides?
Solution. Yes. It is easy to show that a regular pentagon has this property (do it!). Consider
any linear operator of R2 which maps a regular pentagon to a non-regular one. There are many
such operators: e.g., just pick three consecutive vertices of the regular pentagon and map
the corresponding triangle to an equilateral one. Then the image of the whole pentagon is
not regular, since one of its angles has measure 60◦ . Since parallel segments are mapped to
parallel segments, the image will satisfy the required property. ✷
Example 44 Let A1 , B1 , C1 be points on the sides BC, CA, AB of a △ABC, respectively, such
that
BA1 /A1 C = CB1 /B1 A = AC1 /C1 B = 1/2.
Proof. Consider a linear map which maps △ABC to an equilateral △A′ B ′ C ′ . The points A′1 , B1′ , C1′
will divide the sides of △A′ B ′ C ′ in the same ratio, and therefore A′ C1′ = B ′ A′1 = C ′ B1′ = 1
(we can just choose the scale this way). See Figure ***. Therefore it is sufficient to solve the
problem in this case, since the ratio of areas does not change.
This can be done in many ways. Here is one of them. Rotating △A′ B ′ C ′ counterclockwise by
120◦ around its center, we obtain that A′1 7→ B1′ 7→ C1′ 7→ A′1 , where 7→ means ‘is mapped to’.
This implies that
A′ A′1 7→ B ′ B1′ 7→ C ′ C1′ 7→ A′ A′1 ,
and therefore A′2 7→ B2′ 7→ C2′ 7→ A′2 . It implies that △A′2 B2′ C2′ is equilateral. Using the Cosine
theorem for △A′ C1′ C ′ , we get
C ′ C1′ = √(1² + 3² − 2 · 1 · 3 · cos(π/3)) = √7.
Now, △A′ B2′ C1′ ∼ △A′ B ′ A′1 , since they have two pairs of congruent angles. Therefore B2′ C1′ =
1/√7 and A′ B2′ = C ′ A′2 = 3/√7. Therefore A′2 B2′ = √7 − 1/√7 − 3/√7 = 3/√7. This implies
that A′2 B2′ /A′ B ′ = 1/√7, and therefore
Example 45 Let ABCD be a tetrahedron, and let E and F be the midpoints of segments AB
and CD, respectively. Let α be any plane passing through E and F . Prove that α divides
ABCD into two polyhedra of equal volumes.
Proof. Using a linear operator, map ABCD to a regular tetrahedron A′ B ′ C ′ D′ . Then E and
F are mapped to the midpoints E ′ and F ′ of segments A′ B ′ and C ′ D′ , and the plane α to a plane
α′ passing through E ′ and F ′ .
Now note that the line E ′ F ′ is perpendicular to the sides A′ B ′ and C ′ D′ and lies in α′ . Therefore
a rotation around the line E ′ F ′ by 180◦ maps each of the two polyhedra into which α′ divides
A′ B ′ C ′ D′ to another one. Hence they are congruent, and their volumes are equal. But the
ratio of volumes is preserved by any non-singular linear operator.
Similarly, let a1 , . . . , am be a set of linearly independent vectors in Rn . The set of vectors
{t1 a1 + . . . + tm am : 0 ≤ ti ≤ 1},
It takes some work to verify that this definition of the volume of a parallelepiped can be extended
to definitions of volumes of other figures, and that the obtained function satisfies all the axioms
imposed on volumes as objects of Measure theory and Euclidean geometry. What is obvious, at
least, is that the number is positive, and it becomes zero when the n-parallelepiped degenerates
into one of a smaller dimension, i.e., when the defining vectors are linearly dependent. It also
satisfies our expectation that if two sets of n linearly independent vectors {ai } and {bi } span the
same parallelepiped, then the volumes defined by them should be the same. (Check!)
Before closing this section we wish to introduce another important matrix whose determinant
can also be used to compute volumes.
Let (V, f ) be any n-dimensional Euclidean space, and let a1 , . . . , am be a sequence of vectors
in V . Consider a square matrix G of order m defined as: G = G(a1 , . . . , am ) = (gij ), where
gij = f (ai , aj ). Then G is called the Gram matrix of the sequence of vectors a1 , . . . , am ,
and det G is called the Gram determinant of this sequence of vectors.
3. (Hadamard’s inequality.) Let {a′i } be the orthogonal basis constructed from {ai } by Gram-
Schmidt procedure. Then
4. Let m = n, and let α be an orthonormal basis of (V, f ). If A is the matrix whose i-th row
is [ai ]α for all i, then G = AAT . Consequently,
Proof.
2. Obvious.
3. Follows from Gram-Schmidt procedure and the properties of determinants. Handout was
given in class. The inequality ka′i k2 ≤ kai k2 follows from the Pythagorean Theorem.
4. Straightforward.
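For readers who like to experiment, here is a small numpy illustration of the definition and of the properties above; the three vectors are an arbitrary example in the standard Euclidean space R3.

    import numpy as np

    # The Gram matrix of a sequence of vectors in the standard Euclidean space.
    a = [np.array([1., 2., 0.]),
         np.array([0., 1., 1.]),
         np.array([1., 0., 3.])]      # an arbitrary example

    A = np.array(a)                   # rows are the coordinate vectors [a_i]
    G = A @ A.T                       # g_ij = f(a_i, a_j)

    print(np.allclose(G, G.T))                               # Gram matrices are symmetric
    print(np.linalg.det(G), np.linalg.det(A) ** 2)           # det G = (det A)^2 when m = n
    print(np.linalg.det(G) <= np.prod([v @ v for v in a]))   # Hadamard: det G <= prod ||a_i||^2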
1. Prove that the area of a parallelogram and the volume of a parallelepiped in 2- and 3-dimensional
Euclidean space, respectively, are equal to the absolute values of the determinants of the
corresponding matrices.
You are allowed to use the following facts from elementary geometry: the area of a parallelogram is equal
to the product of the length of a side and the length of the corresponding altitude, and the
volume of a parallelepiped is the product of the area of its base and the corresponding altitude.
2. Given a tetrahedron AA1 A2 A3 . Let A′i be a point on the side AAi such that AA′i /AAi = λi ,
i = 1, 2, 3. Prove that vol (AA′1 A′2 A′3 ) = λ1 λ2 λ3 vol (AA1 A2 A3 ).
3. A plane passes through the midpoints of two skew sides of a tetrahedron, and intersects two other
sides at points M and N . Prove that the points M and N divide the sides (they belong to) in the
same ratio.
{ x = (x1 , x2 ) : x1²/a² + x2²/b² ≤ 1 }.
Prove that every ellipsoid can be viewed as an image of a circular disc with respect to a linear
map. Conclude that the area of the ellipsoid with semi-axes of lengths a and b is πab. State and
prove the generalization of this result to R3 .
5. Given a tetrahedron AA1 A2 A3 . Let A′i be a point on the side AAi such that AA′i /AAi = λi ,
i = 1, 2, 3. Prove that vol (AA′1 A′2 A′3 ) = λ1 λ2 λ3 vol (AA1 A2 A3 ).
6. Let a1 , . . . , am be vectors in the standard inner product space (Rn , f ) such that the distance
between every two of them is equal to 1. Prove that m ≤ n + 1. Construct an example of n + 1
vectors with this property.
Hint: Let ui = ai − a1 for all i. Show that G(u2 , . . . , um ) is nonsingular.
In this lecture we discussed the notion of the distance from a vector (point) in a Euclidean
space (V, f ) to its n-dimensional subspace. We proved the following theorem.
Theorem 47 Let (V, f ) be a Euclidean space, and W be its n-dimensional subspace. Then for
every v ∈ V , there exists a unique vector w0 in W such that
(i) v − w0 ⊥ W , and
(ii) kv − w0 k = min{kv − wk : w ∈ W }.
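The notes prove this abstractly; in the standard Euclidean space Rn one concrete way to compute w0 = proj W v is via the normal equations. Here is a small Python sketch (the subspace and the vector v are arbitrary examples, and we assume the spanning vectors are linearly independent):

    import numpy as np

    def project(v, spanning_vectors):
        """Orthogonal projection of v onto W = span(spanning_vectors) in standard R^n."""
        M = np.array(spanning_vectors, dtype=float).T     # columns span W
        coeffs = np.linalg.solve(M.T @ M, M.T @ v)        # normal equations (M^T M) c = M^T v
        return M @ coeffs

    v = np.array([1., 2., 0.])
    W = [np.array([1., 0., 1.]), np.array([0., 1., 1.])]  # an arbitrary 2-dimensional subspace

    w0 = project(v, W)
    print(w0)
    print(np.round([(v - w0) @ w for w in W], 10))   # v - w0 is orthogonal to W
    print(np.linalg.norm(v - w0))                    # dist(v, W)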
1. Let (V, f ) be a Euclidean space, W be its subspace, and v ∈ V a vector which is not orthogonal to W . Prove
that the angle between v and proj W v is the smallest among all angles between v and w, w ∈ W .
(The measure of the angle between two nonzero vectors v and w in V is, by definition, cos−1 ( f (v, w) / (‖v‖ ‖w‖) ).)
Try to get a good feeling for this result in the standard Euclidean 2- and 3-dimensional spaces,
with W being of dimension 1 and 2, respectively.
2. In R4 , let W = ⟨(1, 1, 0, 0), (1, 1, 1, 2)⟩. Find w ∈ W such that ‖w − (1, 0, 2, 2)‖ is as small as
possible.
3. Find proj W v, and dist (v, W ) for (V, f ), W , and v defined below.
W = {x = (x1 , x2 , x3 , x4 ) : x1 + x2 + x3 − 2x4 = 0 }.
W = {x = (x1 , x2 , x3 , x4 ) : x1 + x2 = x3 − x4 , and x1 = x2 + x3 = 0 }.
(iii) V = C[−1, 1], f (u, v) = ∫_{−1}^{1} u(t)v(t) dt, v = cos x and W = ⟨1, x, x2 , x3 ⟩.
Let V be a vector space over a field F, and let L(V ) denote the set of all linear operators on V .
For φ ∈ L(V ), a subset S ⊆ V such that φ(S) ⊆ S is called stable or invariant with respect
to φ, or φ-stable, or φ-invariant. Clearly {0} and V are φ-stable for every φ. Restricting φ to
its invariant subspace W , we obtain a linear operator φ|W on W . The study of the latter may
be easier than that of φ on the whole of V , mainly because this subspace is of smaller dimension than
V . In particular, if V is a direct sum of its subspaces W1 , W2 , . . ., then φ is completely defined
by all φ|Wi . This simple idea suggests the following approach for studying linear operators.
Given φ ∈ L(V ),
This is exactly what we will try to do during the next several lectures.
As the action of φ on ⟨0⟩ is trivial, the first interesting case is when a φ-invariant subspace is
1-dimensional.
Let φ(v) = λv for some nonzero vector v ∈ V and a scalar λ ∈ F. Then v is called an
eigenvector of φ, and λ is called an eigenvalue of φ. We also say that λ and v correspond
to each other. Every eigenvector v of φ spans a 1-dimensional φ-stable subspace ⟨v⟩.
φ(x) = λ1 x1 v1 + · · · + λn xn vn .
Therefore finding as many as possible linearly independent eigenvectors of φ is useful. How can
one find them? In order to answer this question, we consider matrices which represent φ in
different bases.
Let dim V = n, φ ∈ L(V ), and α = {v1 , . . . , vn } be a basis of V . Let Mφ,α , be the matrix of φ
corresponding to α. We remind ourselves that Mφ,α = (aij ), where the entries aij are defined
by the equalities φ(vi ) = ai1 v1 + · · · + ain vn , i = 1, . . . , n. To simplify the presentation, we
denote MTφ,α by A. Then it is easy to check that
[φ(x)]α = A [x]α
As φ(v) = λv if and only if A [v]α = λ[v]α , we wish also to define the notions of eigenvector and
eigenvalues for matrices.
Let V = Fn , A be a square matrix of order n, and A v = λv for some nonzero vector v and
a scalar λ. Then v is called an eigenvector of A, and λ is called an eigenvalue of A. The
equality A v = λv is equivalent to (λ I − A) v = 0, where I = In is the identity matrix of order
n. The system of homogeneous linear equations (λ I − A) x = 0 has a nontrivial solution if
and only if its matrix of coefficients λ I − A is singular, or if and only if det(λ I − A) = 0.
Therefore, each scalar λ which satisfies the equation det(λ I − A) = 0, is an eigenvalue of A,
and the corresponding eigenvector can always be found by solving the system of equations
(λ I − A) x = 0. Therefore, for a short while, we will discuss how one finds the eigenvalues of A.
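In small examples this recipe is easy to carry out numerically. Here is a sketch in numpy; the matrix and its hand-computed characteristic polynomial x² − 4x + 3 are our own illustrative example.

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 2.]])

    # Roots of det(x I - A) = x^2 - 4x + 3 are the eigenvalues:
    print(np.roots([1., -4., 3.]))        # [3., 1.]

    # numpy computes the same data directly:
    eigenvalues, eigenvectors = np.linalg.eig(A)
    print(eigenvalues)                     # 3 and 1
    for lam, v in zip(eigenvalues, eigenvectors.T):   # columns of `eigenvectors` are eigenvectors
        print(np.allclose(A @ v, lam * v))            # A v = lambda v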
Until this moment, we considered determinants of matrices over fields only. A very similar
theory of determinants exists for matrices over commutative rings. The only ring we will be
concerned with is the ring of polynomials F[x]. If we analyze the statements about determinants
which do not refer to the notion of linear independence of vectors, those will hold for matrices
over F[x], and the proofs can be carried verbatim. The expression x I − A can be considered as
a matrix over F[x], and its determinant as a polynomial of x over F. We call det(x I − A) the
characteristic polynomial of A, and denote it by cA (x). We proved the following important
fact.
This theorem implies that in order to find the eigenvalues of a linear operator φ one can choose a
basis α, consider the matrix A = MTφ,α which represents φ in this basis, and find its eigenvalues.
A choice of another basis β leads to another matrix B = MTφ,β . How do the eigenvalues of B
compare to the ones of A? It turns out that they are exactly the same! Let us call the
multiset of all eigenvalues of a matrix A the spectrum of A, and denote it by spec A.
Corollary 49 Let A and B be square matrices which represent φ ∈ L(V ) in bases α and β,
respectively. Then cA (x) = cB (x). Hence spec A = spec B.
Proof. Let α = {v1 , . . . , vn }, β = {u1 , . . . , un }, and C be the matrix whose i-th column is [ui ]α .
Then C is nonsingular, and B = C −1 AC. This implies
cB (x) = det(x I − B) = det(C −1 (x I − A)C) = det(C −1 ) det(x I − A) det(C) = det(x I − A) = cA (x).
The result of this corollary allows us to define the characteristic polynomial cφ (x) of φ ∈
L(V ) as cA (x), where A is the matrix representing φ in some basis.
We have understood that when V has a basis of n eigenvectors, the matrix of φ in this basis is
diagonal. The following theorem provides a simple sufficient condition for existence of linearly
independent eigenvectors. The condition is not necessary, as simple examples show (like φ = id).
As we know, not every polynomial in F[x] has roots in F. And if it does, not all of the roots
must be distinct. What can be said about φ if cφ (x) has this property? We will be discussing
this question in the next lecture.
We wish to finish this section with an example illustrating how useful it can be to diagonalize
a matrix. We will find an explicit formula for the Fibonacci sequence: F0 = F1 = 1, and
Fn = Fn−1 + Fn−2 for n ≥ 2.
The example was presented in class, and was based on the observation that for i ≥ 2,
[ Fi   ]   [ 1  1 ] [ Fi−1 ]
[ Fi−1 ] = [ 1  0 ] [ Fi−2 ] .
Diagonalizing the matrix A = [ 1 1 ; 1 0 ] above leads to an explicit formula for Fn . Some details were left as
exercises.
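Here is how the computation might look in numpy (a sketch, not the in-class derivation; it uses F0 = F1 = 1 as above):

    import numpy as np

    # Diagonalize A = [[1, 1], [1, 0]] to get a closed formula for F_n.
    # With F_0 = F_1 = 1, we have [F_n, F_{n-1}]^T = A^(n-1) [F_1, F_0]^T for n >= 1.
    A = np.array([[1., 1.],
                  [1., 0.]])

    eigenvalues, C = np.linalg.eig(A)        # columns of C are eigenvectors; A = C D C^(-1)
    # The eigenvalues are (1 ± sqrt 5)/2, the "golden ratio" numbers.

    def fib(n):
        # A^(n-1) = C D^(n-1) C^(-1), computed via the diagonalization
        An_minus_1 = C @ np.diag(eigenvalues ** (n - 1)) @ np.linalg.inv(C)
        return (An_minus_1 @ np.array([1., 1.]))[0]   # first component of A^(n-1) [F_1, F_0]^T

    print([round(fib(n)) for n in range(1, 11)])      # 1, 2, 3, 5, 8, 13, 21, 34, 55, 89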
1. (i) Let λ be an eigenvalue of φ ∈ L(V ), and let Vλ = {v ∈ V : φ(v) = λv}. Prove that Vλ is a
subspace of V . Vλ is called the eigenspace of φ corresponding to λ. A similar notion can be
defined for a square matrix of order n.
(ii) Let Vλi , i = 1, . . . , k, be eigenspaces of φ ∈ L(V ) corresponding to pairwise distinct eigenvalues
λ1 , . . . , λk . Let αi be a basis of Vλi . Then the union of all αi is a linearly independent set of vectors
of V .
2. For a matrix A below, find its characteristic polynomial cA (x); spec A; for each λ ∈ spec A, find
a maximum set of linearly independent eigenvectors of A corresponding to λ, i.e. a basis for the
eigenspace of A corresponding to λ. If A is diagonalizable, find C such that C −1 AC is a diagonal
matrix.
Try to do it first without using computer. Then use computer if you have difficulties, and in order
to check your results.
(i) A =
[ 4  1 −1 ]
[ 2  5 −2 ]
[ 1  1  2 ]

(ii) A =
[ 3 −1  1 ]
[ 7 −5  1 ]
[ 6 −6  2 ]

(iii) A =
[ 3  1  0  0 ]
[ 0  3  1  0 ]
[ 0  0  3  1 ]
[ 0  0  0  3 ]
3. Let A = Jn (λ), which is a square matrix of order n, having all diagonal elements equal to λ, all
entries in the (i, i + 1) positions equal to 1 (i = 1, . . . , n − 1), and zero entries everywhere else.
Matrices of the form Jn (λ) are called Jordan matrices or Jordan blocks. Find spec Jn (λ) and
a maximum set of linearly independent corresponding eigenvectors of A.
(This will generalize your computation for part (iii) of Problem 2 of this set.)
6. Given a polynomial f (x) = xn + an−1 xn−1 + . . . + a1 x + a0 , find a square matrix A such that
cA (x) = f (x). Such a matrix is called the companion matrix for f .
(Hint: try a matrix like the one in Problem 5 of this set.)
8. Let A ∈ M3×3 (R) such that all entries of A are positive. Prove that A has an eigenvector having
all its components positive.
9. (i) Is there a matrix A ∈ M6×6 (R) with negative determinant and having no real eigenvalues?
(ii) Is there a matrix A ∈ M6×6 (R) with no real eigenvector?
(iii) Is there a matrix A ∈ M7×7 (R) with no real eigenvector?
10. Let λ be an eigenvalue of φ and v be a corresponding eigenvector. Let p(x) ∈ F[x]. Then p(λ) is
an eigenvalue of p(φ) and v is a corresponding eigenvector, i.e., p(φ) v = p(λ) v.
12. Let φ ∈ L(V ). Is it possible for φ not to have any nontrivial invariant subspaces, but for φ2 to
have one?
13. Prove that if linear operators φ, ψ ∈ L(V ) commute, i.e., φψ = ψφ, then every eigenspace of φ is
an invariant subspace of ψ (and vice versa).
In this lecture we continue our discussion of the question of how to find invariant subspaces for
a linear operator φ on V , dim V = n. Instead of operator φ we will deal with a matrix A which
represents it in some basis. We assume that A acts as an operator on Fn , by mapping x to Ax.
A choice of another basis leads to a transformation of A to a similar matrix C −1 AC. Finding
a basis where φ is presented in a simple way is equivalent to transforming A to a simple form
by means of the similarity transformation. This is usually done by further investigations of the
connections between matrices and polynomials.
For any square matrix A of order n, we can consider the set of all matrices of the form p(A) =
ad Ad + ad−1 Ad−1 + . . . + a1 A + a0 In , where all ai ∈ F. We add and multiply such matrices
similarly to the polynomials in F[x]. We can also think of p(A) as obtained from p(x) = ad xd +
ad−1 xd−1 + . . . + a1 x + a0 by substituting A for x. In order to do it, we just have to interpret
a0 as a0 In = a0 A0 . If the reader is familiar with the notion of a ring (or algebra) homomorphism,
we can just say that p(A) is the image of p(x) under the homomorphism F[x] → Mn×n (F),
where p(x) 7→ p(A). If p(A) = 0, we say that the polynomial p is an annihilating polynomial
of A.
Given A, is there always an annihilating polynomial of A different from zero polynomial? The
answer is Yes, and the proof is surprisingly easy.
The algebra Mn×n (F) is an n2 -dimensional space over F. Therefore the matrices Ad , Ad−1 , . . . , A, I
form a linearly dependent set in Mn×n (F) if d ≥ n2 , as we get at least n2 + 1 matrices in the
set. If λd Ad + λd−1 Ad−1 + . . . + λ1 A + λ0 In = 0 with not all λi equal to zero, then
p(x) = λd xd + λd−1 xd−1 + . . . + λ1 x + λ0 is a nonzero annihilating polynomial of A. Hence we proved the following fact.
Proposition 51 For every matrix A ∈ Mn×n (F), there exists an annihilating polynomial of A
of degree at most n2 .
p(x) = q(x)mA (x) + r(x), where 0 ≤ deg r(x) < deg mA (x).
In order to move our investigations further, we have to recall several basic notions and facts
about polynomials. We remind the readers that for every two polynomials a = a(x) and b = b(x)
in F[x], not both zero polynomials, there exists a unique monic polynomial d = d(x) such that
d is a common divisor of a and b (i.e., d divides both a and b), and d is divisible by every
other common divisor of a and b. It is called the greatest common divisor of a and b, and
it is denoted by gcd(a, b). The gcd(a, b) can be found by the Euclidean algorithm applied to
a and b, and it leads to the following fundamental fact: if d(x) = gcd(a(x), b(x)), there exist
polynomials u(x), v(x) such that d(x) = u(x)a(x) + v(x)b(x).
If gcd(a, b) = 1, a and b are called relatively prime. In this case, the above equality becomes
1 = u(x)a(x) + v(x)b(x).
The following main theorem allows us to reduce the question of finding invariant subspaces of A
to the one of factoring polynomials.
where p1 (x) and p2 (x) are relatively prime. Then V = Fn can be represented as the direct sum
V = V1 ⊕ V2 ,
so p1 (x) and p2 (x) are annihilating polynomials of A|V2 and A|V1 , respectively.
Proof. As p1 and p2 are relatively prime, there exist q1 , q2 ∈ F[x] such that
q1 (x)p1 (x) + q2 (x)p2 (x) = 1,
and hence
q1 (A)p1 (A) + q2 (A)p2 (A) = I.
For every v ∈ V ,
v = I v = q1 (A)p1 (A) v + q2 (A)p2 (A) v = v1 + v2 ,
where vi = qi (A)(pi (A) v). Since pi (A) v ∈ Vi , and Vi is A-stable, Vi is qi (A)-stable. Hence
vi ∈ Vi . This proves that V = V1 + V2 .
For every v ∈ V1 ∩ V2 ,
v = q1 (A)p1 (A) v + q2 (A)p2 (A) v = q1 (A) 0 + q2 (A) 0 = 0.
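To get a concrete feel for the construction, here is a tiny numerical instance; the matrix A and the choice p1 = x − 1, p2 = x − 2, q1 = 1, q2 = −1 are our own illustrative example, not from the text.

    import numpy as np

    # (x - 1)(x - 2) annihilates A, and 1*(x - 1) + (-1)*(x - 2) = 1.
    A = np.array([[1., 1.],
                  [0., 2.]])
    I = np.eye(2)

    E1 = 1.0 * (A - 1 * I)       # q1(A) p1(A)
    E2 = -1.0 * (A - 2 * I)      # q2(A) p2(A)

    print(np.allclose(E1 + E2, I))        # every v splits as v = E1 v + E2 v
    print(np.allclose(E1 @ E2, 0))        # the two pieces lie in complementary subspaces
    print(np.allclose(A @ E1, E1 @ A))    # A commutes with E1, so im E1 (and im E2) is A-stable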
Theorem 53 explains how one can split V in the direct sum of A-stable subspaces. The strategy
is simple:
(2) Represent it as a product of pairwise relatively prime factors: p(x) = p1 (x) · . . . pk (x), such
that each factor pi (x) cannot be further split in this way.
(3) Consider A|im pi (A) , i = 1, . . . , k, and try to find a basis in im pi (A), where the operator
defined by A|im pi (A) can be easily described. The latter is equivalent to finding a matrix similar
to A|im pi (A) and of a simple form.
It turns out that there exists a much better way, found by two of the creators of matrix
theory in the 19th century: an annihilating polynomial of degree n always exists,
and we actually already know what it is!
Theorem 54 ( Hamilton-Cayley Theorem). Let A ∈ Mn×n (F), and let cA (x) = det(x I − A)
be the characteristic polynomial of A. Then cA (A) = 0, i.e., every matrix is annihilated by its
characteristic polynomial.
Striving for more, namely for the minimal polynomial mA (x) of A, which can have a much smaller
degree than n, we can look for it among the factors of cA (x), as it must divide cA (x) due to
Proposition 52.
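For those who want to see the theorem “in action” numerically, here is a small numpy check; the 3 × 3 matrix is an arbitrary example.

    import numpy as np

    # Numerical check of the Hamilton-Cayley theorem.
    A = np.array([[1., 2., 0.],
                  [3., -1., 4.],
                  [0., 1., 2.]])

    coeffs = np.poly(A)            # coefficients of c_A(x) = det(xI - A), leading coefficient first
    n = A.shape[0]

    # Evaluate c_A(A) = A^3 + c_2 A^2 + c_1 A + c_0 I:
    cA_of_A = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
    print(np.round(cA_of_A, 10))   # the zero matrix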
Regarding (2). We see that the success in accomplishing part 1 depends heavily on the properties
of polynomials over F and on the particular matrix A. What do we know about the factoring of
polynomials?
There are several fundamental theorems in this regard. A polynomial f ∈ F[x] is called irre-
ducible in F[x] if deg f ≥ 1 and f is not a product of two polynomials of smaller degrees from
F[x].
where k ≥ 2, fi (x) are distinct irreducible monic polynomials, and ei are positive integers. This
representation is unique up to the order of factors.
where λi are all distinct roots of f in R, and qj (x) are irreducible monic quadratic polynomials
over R. Such representation is unique up to order of factors.
Regarding (3). This part is also far from easy. As two success stories, we present the Jordan
canonical form and the Rational Canonical Form. The Jordan form can be used in all those cases
when we have (or there exists) a factorization of an annihilating polynomial into a product
of linear factors. In particular, it exists for matrices over C. The Rational Canonical Form
can be used whenever we have a factorization of an annihilating polynomial into a product of
powers of distinct irreducible factors. Both forms cover the diagonal case, if such is possible.
For particular classes of matrices, more can be said. Those include symmetric real matrices, her-
mitian matrices, and the orthogonal real matrices (the ones which correspond to the isometries
of inner product spaces). The list can be continued.
2. What is wrong with the obvious “proof” of the Hamilton-Cayley Theorem: cA (A) = det(A I −A) =
det(0) = 0.
3. Assuming Hamilton-Cayley Theorem, prove that cA (x) divides (mA (x))t for some positive integer
t.
In this lecture we describe a special basis for an operator ψ ∈ L(V ) having (x − λ)a as its
annihilating polynomial. As (ψ − λ id)a = 0, setting φ = ψ − λ id, we obtain an even simpler
equation φa = 0. An operator φ with the property φa = 0 for some a ∈ N, is called nilpotent,
and the smallest positive b ∈ N such that φb = 0 is called the order of nilpotency of φ.
The most prominent nilpotent operator is, undoubtedly, the differential operator D on a vector
space of polynomials over R (or C) of degree at most m − 1, which maps every polynomial to
its derivative. Consider the following basis in this space: {vi = (1/i!) xi : i = 0, . . . , m − 1}. It is
clear that
vm−1 7−→ vm−2 7−→ · · · 7−→ v1 7−→ v0 7−→ 0,
where each arrow denotes an application of D.
The m × m matrix of D in this basis, ordered (vm−1 , . . . , v0 ), has a very simple form, having
1 in positions (1, 2), (2, 3), . . . , (m − 1, m), and 0 everywhere else. It is denoted by Jm (0). The
matrix Jm (λ) := λ I + Jm (0), which is obtained from Jm (0) by putting a scalar λ on the main
diagonal, is called a Jordan matrix, or a Jordan block. For m = 4,
J4 (0) =
[ 0  1  0  0 ]
[ 0  0  1  0 ]
[ 0  0  0  1 ]
[ 0  0  0  0 ]
and J4 (λ) =
[ λ  1  0  0 ]
[ 0  λ  1  0 ]
[ 0  0  λ  1 ]
[ 0  0  0  λ ]
Observe that xm and (x−λ)m are the annihilating polynomials of Jm (0) and Jm (λ), respectively.
Moreover, they are the minimal polynomials.
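A quick numpy check of this claim for m = 4 (the value λ = 5 is an arbitrary choice):

    import numpy as np

    # (J_4(lambda) - lambda I)^k is nonzero for k < 4 and zero for k = 4,
    # so (x - lambda)^4 is the minimal polynomial of J_4(lambda).
    lam = 5.0
    J = lam * np.eye(4) + np.diag(np.ones(3), k=1)    # J_4(lambda): lambda on the diagonal, 1's above it

    N = J - lam * np.eye(4)                           # this is J_4(0), a nilpotent matrix
    for k in range(1, 5):
        print(k, np.count_nonzero(np.linalg.matrix_power(N, k)))   # 3, 2, 1, 0 nonzero entries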
Though we arrived at Jordan matrices via the example of a particular nilpotent operator D, it
turns out that similar bases exist for other nilpotent operators. This explains the importance
of Jordan matrices in linear algebra. Before we prove the existence of such a basis, we would
like to mention another attractive computational feature of Jordan matrices, wildly admired in
some 19-th and 20-th century pre-computer societies.
Proof. Was discussed in class. See Gel’fand’s book [6], p. 136 – 137, for details.
Mφ,α = diag [Ja1 (0), . . . , Ja1 (0), Ja2 (0), . . . , Ja2 (0), . . . . . . , Jat (0), . . . , Jat (0)],
Proof. Let φ0 = id, and Wi := ker φa−i for i = 0, . . . , a. First we show that
For 1 ≤ i ≤ a − 1, 0 = φa−i (x) = φa−i−1 (φ(x)), hence, φ(x) ∈ ker φa−(i+1) = Wi+1 . So
φ(Wi+1 ) ≥ Wi . Together with (20), it gives Wi ≥ φ(Wi ), which proves that each Wi is φ-
invariant. For i = a the statement is obvious. Suppose that for some i, 0 ≤ i ≤ a − 1,
Wi = Wi+1 . Then Wi ≠ ⟨0⟩, as the order of nilpotency of φ is a. Hence im φa−i = im φa−i−1 ,
since the former is a subspace of the latter and they have equal positive dimensions. This
implies ⟨0⟩ = im φa = im φa−1 = . . . = im φa−i = im φa−i−1 ≠ ⟨0⟩, a contradiction. This proves
that all inclusions in (19) are strict.
Proof. We have
a1 v1 + . . . + ap vp ∈ Wi ⇒ a1 = . . . = ap = 0,
Having Lemma 60, the construction of the desired basis is as follows. Let di = dim Wi . First
choose
α1 = {e1 , . . . , es1 },
a W2 -basis of W1 of d2 − d1 elements. Continue until you get a basis αa of Wa−1 (over Wa = ⟨0⟩)
of da−1 − da = da−1 − 0 = da−1 elements. Let us list all these relative bases in the following
table.
α1 : e1 ... es1
α2 : φ(e1 ) ... φ(es1 ), es1 +1 ... es2
........................................................................
αa : φa−1 (e1 ) . . . φa−1 (es1 ), φa−2 (es1 +1 ) . . . φa−2 (es2 ), esa−1 +1 . . . esa .
Now we collect vectors in this table which stand in the same column. Let
where i = 1, . . . , sa , and
with b1 ≥ b2 ≥ . . . ≥ bsa .
Lemma 61
To finish our proof of the theorem, we just observe that if φi = φ|hβi i , then Mφi ,βi = Jbi (0).
Proof. Since the Jordan form is upper triangular, the statement follows.
Proof. As we mentioned before, no ‘easy’ proof of this theorem exists. Instead of presenting a
proof, we describe four different ideas on which a proof can be based, and refer the reader to
the literature.
http://www.blue-arena.com/mewt/entry.php?id=147 , or
http://www.cs.ut.ee/∼toomas l/linalg/lin1/node19.html
2. For a proof based on Theorem 62, see [13], or [1], p. 173. Of course, in this case Theorem
62 should be proved independently from Theorem 59. The latter can be done by induction and
can be found in [1] p. 84, or [8] p. 64-65.
3. For a proof based on the density of matrices with distinct eigenvalues in the space of all matri-
ces, see http://planetmath.org/encyclopedia/ProofOfCayleyHamiltonTheorem.html The
density is understood relative to the Zariski topology. An advantage of this proof is that it works
for many fields different from C.
4. For a proof based on the isomorphism of rings Mn (F)[x] and Mn (F[x]), see [4] p. 94-95.
Proofs based on this idea can be found in many other books, but many of them do not state
the isomorphism clearly, and develop some weaker results instead.
The following theorem provides more details on the block structure of the Jordan form of an
operator. We remind the reader that the algebraic multiplicity of an eigenvalue λ of φ is
the multiplicity of λ as a root of the characteristic polynomial cφ (x). A geometric multiplicity
of an eigenvalue λ of φ is the dimension of its eigenspace Vλ .
where each
Bi = diag [Jmi,1 (λi ), Jmi,2 (λi ), . . . , Jmi,li (λi )]
Moreover,
(iii) mi,1 = mi , i = 1, . . . , k.
since the matrix xI − Mφ,β is upper triangular and its determinant is equal to the product of
its diagonal entries. Then bi = ei from the uniqueness of a representation of a polynomial as a
product of irreducible factors.
(ii) Each diagonal block of Bi has exactly one eigenvector corresponding to the eigenvalue λi .
Therefore Bi has dim ker (φ − λi id) linearly independent eigenvectors, which is, by definition,
li .
(iv) Each Bi has li Jordan blocks, and the number of 1’s above the diagonal in each block is
one less than the block’s size.
(v) Most proofs are by induction, and are reduced to the uniqueness of the Jordan form of a
nilpotent operator. The idea is to use the fact that the cardinalities of the bases αj (see the table
1. Find all possible Jordan forms for a real matrix A whose characteristic and minimal polynomials
are as follows.
2. Given cA (x) and mA (x) for A ∈ M3 (C), show that these completely determine the Jordan form
of A.
(i) A =
[  2  1 ]
[ −1  4 ]

(ii) A =
[ 1  1 −1 ]
[ 2  2  1 ]
[ 2 −1  4 ]

(iii) A =
[ 2  0  0  0  0 ]
[ 5  2  0  0  0 ]
[ 0  0  8  0  0 ]
[ 0  0  3  1  0 ]
[ 0  0  0  5 −2 ]

(iv) A =
[ 1 −1  0 ]
[ 2  1  3 ]
[ 1  2  0 ]

(v) A =
[ 2  0  0  0 ]
[ 1  2  0  0 ]
[ 0  1  0  0 ]
[ 0  0  1  0 ]

(vi) A =
[ 1  2  3  4 ]
[ 0  1  2  3 ]
[ 0  0  1  2 ]
[ 0  0  0  1 ]
5. Let A ∈ Mn (F) be upper-triangular. Then the multiset of its diagonal entries is precisely spec A.
(This problem has appeared earlier, but we suggest that readers think about it again.)
7. Show that a square matrix A is diagonalizable if and only if the minimal polynomial mA (x) is a
product of distinct linear factors (i.e., mA (x) = (x − λ1 ) · . . . · (x − λk ) where all λ1 , . . . , λk are
distinct).
10. Let A ∈ Mn (C) such that tr A = 0. Prove that there exist B, C ∈ Mn (C) such that A = BC −CB.
[1] S. Axler. Linear Algebra Done Right, 2nd edition, Springer-Verlag, 1997.
I disagree that this text does Linear Algebra “right”, and I disagree with many method-
ological decisions of the author. But some pages and exercises are good.
An unfinished manuscript. Excellent for the title, but also much beyond it. Some chapters
are masterpieces.
A very good undergraduate text. If you find some sections of these notes too fast/hard, try
to find the corresponding material in this book. Sometimes you will not succeed.
[4] M.L. Curtis. Abstract Linear Algebra, Springer-Verlag New York Inc., 1990.
Nice, rather algebraic. A friendly introduction to exterior algebras, though some details are
missing (like in these notes).
[5] D.S. Dummit and R.M. Foote. Abstract Algebra, 3rd edition, John Wiley & Sons, Inc.,
2004.
A quite complete text in Abstract Algebra. Good for references and examples.
A classic. The terminology and notation are sometimes out of fashion. No matter how
many times I read this thin book, I often find something new in it. Great price.
[7] K. Hoffman, R. Kunze. Linear Algebra, 2nd edition, Prentice Hall, 1971.
Quite complete and thorough. Good as a reference, but it is not easy to use.
[8] A.I. Kostrikin, Yu. I. Manin. Linear Algebra and Geometry (Algebra, Logic and Applica-
tions), Gordon and Breach Science Publishers, 1989.
Outstanding and demanding. Details may be read elsewhere. Makes connections with many
advanced mathematical topics. A stimulating exposition of linear algebra related to Quan-
tum Mechanics.
[11] P. D. Lax. Linear Algebra, 3rd edition, John Wiley & Sons, Inc., 1997.
Stimulating, some great examples, rather unusual (for linear algebra texts) content.
[12] B.A. Rosenfeld. Multidimensional Spaces, Nauka, Moscow, 1966. (In Russian).
A very clearly written monograph, which discusses the use of linear algebra in high-
dimensional geometries.
A classic. The terminology and notation are sometimes out of fashion. Complete. The
best treatment of orthogonalization, the Gram matrix/determinant, and volumes. Great price.
A great book in general. Has definition of Euclidean point spaces based on linear algebra.