
Linear Algebra


Analysis Series

M. Winklmeier

Chigüiro Collection
Work in progress. Use at your own risk.

Contents

1 Introduction 5
1.1 Examples of systems of linear equations; coefficient matrices . . . . . . . . . . . . . . 6
1.2 Linear 2 × 2 systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2 R2 and R3 25
2.1 Vectors in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 Inner product in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3 Orthogonal Projections in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4 Vectors in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.5 Vectors in R3 and the cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.6 Lines and planes in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.7 Intersections of lines and planes in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
2.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

3 Linear Systems and Matrices 77


3.1 Linear systems and Gauß and Gauß-Jordan elimination . . . . . . . . . . . . . . . . 77
3.2 Homogeneous linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.3 Matrices and linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.4 Matrices as functions from Rn to Rm ; composition of matrices . . . . . . . . . . . . 96

3.5 Inverses of matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105


3.6 Matrices and linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
3.7 The transpose of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
3.8 Elementary matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
3.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
3.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

4 Determinants 137
4.1 Determinant of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
4.2 Properties of the determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
4.3 Geometric interpretation of the determinant . . . . . . . . . . . . . . . . . . . . . . . 152
4.4 Inverse of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160


4.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

5 Vector spaces 165


5.1 Definitions and basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.2 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
5.3 Linear combinations and linear independence . . . . . . . . . . . . . . . . . . . . . . 181
5.4 Basis and dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
5.5 Intersections and sums of vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 206
5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
5.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215

6 Linear transformations and change of bases 223


6.1 Linear maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
6.2 Matrices as linear maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
6.3 Change of bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
6.4 Linear maps and their matrix representations . . . . . . . . . . . . . . . . . . . . . . 255
6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268

6.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

7 Orthonormal bases and orthogonal projections in Rn 275


7.1 Orthonormal systems and orthogonal bases . . . . . . . . . . . . . . . . . . . . . . . 275
7.2 Orthogonal matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
7.3 Orthogonal complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
7.4 Orthogonal projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
7.5 The Gram-Schmidt process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
7.6 Application: Least squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
7.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
7.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309

8 Symmetric matrices and diagonalisation 313


8.1 Complex vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
8.2 Similar matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
8.3 Eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
8.4 Properties of the eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . . . 336

8.5 Symmetric and Hermitian matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343


8.6 Application: Conic Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
8.6.1 Solutions of ax2 + bxy + cy 2 = d as conic sections . . . . . . . . . . . . . . . . 359
8.6.2 Solutions of ax2 + bxy + cy 2 + rx + sy = d . . . . . . . . . . . . . . . . . . . 361
8.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
8.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367

A Complex Numbers 369

B Solutions 375

Index 389


Chapter 1

Introduction

This chapter serves as an introduction to the main themes of linear algebra, namely the problem of
solving systems of linear equations for several unknowns. We are not only interested in an efficient

way to find their solutions, but we also wish to understand how the solutions could possibly look
and what we can say about their structure. For the latter, it will be crucial to find a geometric
interpretation of systems of linear equations. In this chapter we will use the “solve and insert”-
strategy for solving linear systems. A systematic and efficient formalism will be given in Chapter 3.
Everything we discuss in this chapter will appear again later on, so you may read it quickly or even
skip (parts of) it.
A linear system is a set of equations for a number of unknowns which have to be satisfied simul-
taneously and where the unknowns appear only linearly. If the number of equations is m and the
number of unknowns is n, then we call it an m × n linear system. Typically the unknowns are
called x, y, z or x1 , x2 . . . , xn . The following is an example of a linear system of 3 equations for 5
unknowns:

x1 + x2 + x3 + x4 + x5 = 3, 2x1 + 3x2 − 5x3 + x4 = 1, 3x1 − 8x5 = 0.

An example of a non-linear system is

x1 x2 + x3 + x4 + x5 = 3, 2x1 + 3x2 − 5x3 + x4 = 1, 3x1 − 8x5 = 0


because in the first equation we have a product of two of the unknowns. Also expressions like x², ∛x, xyz, x/y or sin x make a system non-linear.
Now let us briefly discuss the simplest non-trivial case: A system consisting of one linear equation
for one unknown x. Its most general form is

ax = b (1.1)

where a and b are given constants and we want to find all x ∈ R which satisfy (1.1). Clearly, the
solution to this problem depends on the coefficients a and b. We have to distinguish several cases.
Case 1. a ≠ 0. In this case, there is only one solution, namely x = b/a.
Case 2. a = 0, b ≠ 0. In this case, there is no solution because whatever value we choose for x,
the left hand side ax will always be zero and therefore cannot be equal to b.


Case 3. a = 0, b = 0. In this case, there are infinitely many solutions. In fact, every x ∈ R solves
the equation.
So we see that already in this simple case we have three very different types of solution of the
system (1.1): no solution, exactly one solution or infinitely many solutions.
Now let us look at a system of one linear equation for two unknowns x, y. Its most general form is

ax + by = c. (1.1’)

Here, a, b, c are given constants and we want to find all pairs x, y so that the equation is satisfied.
For example, if a = b = 0 and c ≠ 0, then the system has no solution, whereas if for example a ≠ 0,
then there are infinitely many solutions because no matter how we choose y, we can always satisfy
the system by taking x = (c − by)/a.

Question 1.1
Is it possible that the system has exactly one solution?
(Come back to this question again after you have studied Chapter 3.)

The general form of a system of two linear equations for one unknown is

a1 x = b1 ,    a2 x = b2

and that of a system of two linear equations for two unknowns is

a11 x + a12 y = c1 ,    a21 x + a22 y = c2

where a1 , a2 , b1 , b2 , respectively a11 , a12 , a21 , a22 , c1 , c2 , are constants and x, respectively x, y, are the
unknowns.
Question 1.2
Can you find examples for the coefficients such that the systems have

(i) no solution,

(ii) exactly one solution,

(iii) exactly two solutions,

(iv) infinitely many solutions?

Can you maybe even give a general rule for when which behaviour occurs?
(Come back to this question again after you have studied Chapter 3.)

Before we discuss general linear systems, we will discuss in this introductory chapter the special
case of a system of two linear equations with two unknowns. Although this is a very special type
of system, it exhibits many properties of general linear systems and they appear very often in
problems.

1.1 Examples of systems of linear equations; coefficient matrices
Let us start with a few examples of systems of linear equations.


Example 1.1. Assume that a car dealership sells motorcycles and cars. Altogether they have 25
vehicles in their shop with a total of 80 wheels. How many motorcycles and cars are in the shop?

Solution. First, we give names to the quantities we want to calculate. So let M = number of
motorcycles, C = number of cars in the dealership. If we write the information given in the exercise
in formulas, we obtain

1 M + C = 25, (total number of vehicles)


2 2M + 4C = 80, (total number of wheels)

since we assume that every motorcycle has 2 wheels and every car has 4 wheels. Equation 1 tells
us that M = 25 − C. If we insert this into equation 2 , we find

80 = 2(25 − C) + 4C = 50 − 2C + 4C = 50 + 2C =⇒ 2C = 30 =⇒ C = 15.

This implies that M = 25 − C = 25 − 15 = 10. Note that in our calculations and arguments, all

the implication arrows go “from left to right”, so what we can conclude at this instance is that the
system has only one possible candidate for a solution and this candidate is M = 10, C = 15. We
have not (yet) shown that it really is a solution. However, inserting these numbers in the original
equation we see easily that our candidate is indeed a solution.
So the answer is: There are 10 motorcycles and 15 cars (and there is no other possibility). 
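If you want to double-check such a computation numerically, a short Python sketch can solve the 2 × 2 system directly (this is an added illustration, not part of the original notes; it assumes NumPy is available):

    import numpy as np

    # M + C = 25 (vehicles), 2M + 4C = 80 (wheels)
    A = np.array([[1.0, 1.0],
                  [2.0, 4.0]])
    b = np.array([25.0, 80.0])
    M, C = np.linalg.solve(A, b)   # the system has a unique solution since det(A) = 2
    print(M, C)                    # -> 10.0 15.0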

Let us put one more equation into the system.


Example 1.2. Assume that a car dealership sells motorcycles and cars. Altogether they have 25
vehicles in their shop with a total of 80 wheels. Moreover, the shop arranges them in 7 distinct areas
of the shop so that in each area there are either 3 cars or 5 motorcycles. How many motorcycles
and cars are in the shop?

Solution. Again, let M = number of motorcycles, C = number of cars. The information of the
exercise leads to the following system of equations:

1 M+ C = 25, (total number of vehicles)


2 2M + 4C = 80, (total number of wheels)
3 M/5 + C/3 = 7. (total number of areas)

As in the previous exercise, we obtain from 1 and 2 that M = 10, C = 15. Clearly, this also
satisfies equation 3 . So again the answer is: There are 10 motorcycles and 15 cars (and there is
no other possibility). 

Example 1.3. Assume that a car dealership sells motorcycles and cars. Altogether they have 25
vehicles in their shop with a total of 80 wheels. Moreover, the shop arranges them in 5 distinct areas
of the shop so that in each area there are either 3 cars or 5 motorcycles. How many motorcycles
and cars are in the shop?


Solution. Again, let M = number of motorcycles, C = number of cars. The information of the
exercise gives the following equations:

1 M+ C = 25, (total number of vehicles)


2 2M + 4C = 80, (total number of wheels)
3 M/5 + C/3 = 5. (total number of areas)

As in the previous exercise, we obtain that M = 10, C = 15 using only equations 1 and 2 .
However, this does not satisfy equation 3 ; so there is no way to choose M and C such that all
three equations are satisfied simultaneously. Therefore, a shop as in this example does not exist. 

Example 1.4. Assume that a zoo has birds and cats. The total count of legs of the animals is 60.
Feeding a bird takes 5 minutes, feeding a cat takes 10 minutes. The total time to feed the animals
is 150 minutes. How many birds and cats are in the zoo?

Solution. Let B = number of birds, C = number of cats in the zoo. The information of the

exercise gives the following equations:

1 2B + 4C = 60, (total number of legs)


2 5B + 10C = 150, (total time for feeding)

The first equation gives B = 30 − 2C. Inserting this into the second equation, gives

150 = 5(30 − 2C) + 10C = 150 − 10C + 10C = 150


which is always true, independently of the choice of B and C. Indeed, for instance B = 10, C = 10
or B = 14, C = 8, or B = 0, C = 15 are solutions. We conclude that the information given in the
exercise is not sufficient to calculate the number of animals in the zoo. 

Remark. The reason for this is that both equations 1 and 2 are basically the same equation.
If we divide the first one by 2 and the second one by 5, then we end up in both cases with the
equation B + 2C = 30, so both equations contain exactly the same information.

Algebraically, the linear system has infinitely many solutions. But our variables represent animals
and they only come in nonnegative integer quantities, so we have the 16 different solutions B = 30 − 2C
where C ∈ {0, 1, . . . , 15}.
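As a small added illustration (not part of the original notes), the finitely many meaningful solutions can simply be enumerated in Python:

    # Nonnegative integer solutions of B + 2C = 30, i.e. B = 30 - 2C with C = 0, 1, ..., 15
    solutions = [(30 - 2 * C, C) for C in range(16)]
    print(len(solutions))   # -> 16
    print(solutions[:3])    # -> [(30, 0), (28, 1), (26, 2)]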

We give a few more examples.

Example 1.5. Find a polynomial P of degree at most 3 with

P(0) = 1,  P(1) = 7,  P′(0) = 3,  P′(2) = 23.        (1.2)

Solution. A polynomial of degree at most 3 is known if we know its 4 coefficients. In this exercise,
the unknowns are the coefficients of the polynomial P. If we write P(x) = αx³ + βx² + γx + δ,
then we have to find α, β, γ, δ such that (1.2) is satisfied. Note that P′(x) = 3αx² + 2βx + γ. Hence


(1.2) is equivalent to the following system of equations:


 
P(0) = 1,     ⟺    1  δ = 1,
P(1) = 7,           2  α + β + γ + δ = 7,
P′(0) = 3,          3  γ = 3,
P′(2) = 23.         4  12α + 4β + γ = 23.
Clearly, δ = 1 and γ = 3. If we insert this in the remaining equations, we obtain a system of two
equations for the two unknowns α, β:
2’ α + β = 3,
4’ 12α + 4β = 20.
From 2’ we obtain β = 3 − α. If we insert this into 4’ , we get that 20 = 12α + 4(3 − α) = 8α + 12,
that is, α = (20 − 12)/8 = 1. So the only possible solution is
α = 1, β = 2, γ = 3, δ = 1.

It is easy to verify that the polynomial P(x) = x³ + 2x² + 3x + 1 has all the desired properties. 
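The computation can also be checked with a short Python sketch (an added illustration assuming NumPy is available; the rows of the matrix are the four conditions in the order listed above):

    import numpy as np

    A = np.array([[ 0, 0, 0, 1],    # P(0)  = delta
                  [ 1, 1, 1, 1],    # P(1)  = alpha + beta + gamma + delta
                  [ 0, 0, 1, 0],    # P'(0) = gamma
                  [12, 4, 1, 0]],   # P'(2) = 12 alpha + 4 beta + gamma
                 dtype=float)
    b = np.array([1, 7, 3, 23], dtype=float)
    print(np.linalg.solve(A, b))    # -> [1. 2. 3. 1.], i.e. P(x) = x^3 + 2x^2 + 3x + 1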

Example 1.6. A pole is 5 metres long and shall be coated with varnish. There are two types of
varnish available: The blue one adds 3 g per 50 cm to the pole, the red one adds 6 g per meter to
the pole. Is it possible to coat the pole in a combination of the varnishes so that the total weight
added is
(a) 35 g? (b) 30 g?
Solution. (a) We denote by b the length of the pole which will be covered in blue and r the length
of the pole which will be covered in red. Then we obtain the system of equations
1 b+ r = 5 (total length)
2 6b + 6r = 35 (total weight)
The first equation gives r = 5 − b. Inserting into the second equation yields 35 = 6b + 6(5 − b) = 30
which is a contradiction. This shows that there is no solution.
(b) As in (a), we obtain the system of equations

1 b+ r = 5 (total length)
2 6b + 6r = 30 (total weight)
Again, the first equation gives r = 5−b. Inserting into the second equation yields 30 = 6b+6(5−b) =
30 which is always true, independently of how we choose b and r as long as 1 is satisfied. This
means that in order to solve the system of equations, it is sufficient to solve only the first equation
since then the second one is automatically satisfied. So we have infinitely many solutions. Any pair
b, r such that b + r = 5 gives a solution. So for any b that we choose, we only have to set r = 5 − b
and we have a solution of the problem. Of course, we could also fix r and then choose b = 5 − r to
obtain a solution.
For example, we could choose b = 1, then r = 4, or b = 0.00001, then r = 4.99999, or r = −2 then
b = 7. Clearly, the last example does not make sense for the problem at hand, but it still does
satisfy our system of equations. 


Example 1.7. When octane reacts with oxygen, the result is carbon dioxide and water. Find the
equation for this reaction.

Solution. The chemical formulas for the substances are C8 H18 , O2 , CO2 and H2 O. Hence the
reaction equation is
a C8 H18 + b O2 −→ c CO2 + d H2 O
with unknown integers a, b, c, d. Clearly the solution will not be unique since if we have one set
of numbers a, b, c, d which works and we multiply all of them by the same number, then we obtain
another solution. Let us write down the system of equations. To this end we note that the number
of atoms of each element has to be equal on both sides of the equation. We obtain:
1 8a = c (carbon)
2 18a = 2d (hydrogen)
3 2b = 2c + d (oxygen)
or, if we put all the variables on the left hand side,

1  8a − c = 0,
2  18a − 2d = 0,
3  2b − 2c − d = 0.
Let us express all the unknowns in terms of a: 1 and 2 show that c = 8a and d = 9a. Inserting
this in 3 we obtain 0 = 2b − 2 · 8a − 9a = 2b − 25a, hence b = (25/2)a. If we want all coefficients to
be integers, we can choose a = 2, b = 25, c = 16, d = 18 and the reaction equation becomes
2 C8 H18 + 25 O2 −→ 16 CO2 + 18 H2 O . 
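A tiny Python check (an added illustration, not part of the original notes) confirms that these coefficients balance every element:

    # a C8H18 + b O2 -> c CO2 + d H2O with a = 2, b = 25, c = 16, d = 18
    a, b, c, d = 2, 25, 16, 18
    assert 8 * a == c            # carbon
    assert 18 * a == 2 * d       # hydrogen
    assert 2 * b == 2 * c + d    # oxygen
    print("balanced")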
All the examples we discussed in this section are so-called systems of linear equations. Let us give
a precise definition of what we mean by this.

Definition 1.8 (Linear system). An m×n system of linear equations (or simply a linear system)
is a system of m linear equations for n unknowns of the form
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
      ⋮                           ⋮                 (1.3)
am1 x1 + am2 x2 + · · · + amn xn = bm
The unknowns are x1 , . . . , xn while the numbers aij and bi (i = 1, . . . , m, j = 1, . . . , n) are given.
The numbers aij are called the coefficients of the linear system and the numbers b1 , . . . , bm are
called the right side of the linear system.
A solution of the system (1.3) is a tuple (x1 , . . . , xn ) such that all m equations of (1.3) are satisfied
simultaneously. The system (1.3) is called consistent if it has at least one solution. It is called
inconsistent if it has no solution.
In the special case when all bi are equal to 0, the system is called a homogeneous system; otherwise
it is called inhomogeneous.


Definition 1.9 (Coefficient matrix). The coefficient matrix A of the system is the collection of
all coefficients aij in an array as follows:
 
A = ( a11  a12  . . .  a1n
      a21  a22  . . .  a2n
       ⋮                 ⋮
      am1  am2  . . .  amn ).        (1.4)

The numbers aij are called the entries or components of the matrix A.
The augmented coefficient matrix of the system is the collection of all coefficients aij and the
right hand side; it is denoted by

(A|b) = ( a11  a12  . . .  a1n  b1
          a21  a22  . . .  a2n  b2
           ⋮                 ⋮   ⋮
          am1  am2  . . .  amn  bm ).        (1.5)

The coefficient matrix is nothing else than the collection of the coefficients aij ordered in some sort
of table or rectangle such that the place of the coefficient aij is in the ith row and the jth column.
The augmented coefficient matrix contains additionally the constants from the right hand side.

Important observation. There is a one-to-one correspondence between linear systems and aug-
mented coefficient matrices: Given a linear system, it is easy to write down its augmented coefficient
matrix and vice versa.
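On a computer this correspondence is equally direct. The following short Python sketch (an added illustration using NumPy, not part of the original notes) builds A and (A|b) for Example 1.1 by attaching the right hand side to the coefficient matrix:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [2.0, 4.0]])    # coefficient matrix of Example 1.1
    b = np.array([[25.0],
                  [80.0]])        # right hand side as a column
    Ab = np.hstack([A, b])        # the augmented coefficient matrix (A|b)
    print(Ab)
    # [[ 1.  1. 25.]
    #  [ 2.  4. 80.]]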
Let us write down the coefficient matrices of our examples.
Example 1.1: This is a 2 × 2 system with coefficients a11 = 1, a12 = 1, a21 = 2, a22 = 4 and
right hand side b1 = 25, b2 = 80. The system has a unique solution. The coefficient matrix and the
augmented coefficient matrix are
   
A = ( 1  1
      2  4 ),
(A|b) = ( 1  1  25
          2  4  80 ).
Example 1.2: This is a 3 × 2 system with coefficients a11 = 1, a12 = 1, a21 = 2, a22 = 4, a31 = 1/5,
a32 = 1/3, and right hand side b1 = 25, b2 = 80, b3 = 7. The system has a unique solution. The
coefficient matrix and the augmented coefficient matrix are
A = ( 1    1
      2    4
      1/5  1/3 ),
(A|b) = ( 1    1    25
          2    4    80
          1/5  1/3  7 ).
Example 1.3: This is a 3 × 2 system with coefficients a11 = 1, a12 = 1, a21 = 2, a22 = 4, a31 = 1/5,
a32 = 1/3, and right hand side b1 = 25, b2 = 80, b3 = 5. The system has no solution. The
coefficient matrix is the same as in Example 1.2, the augmented coefficient matrix is
(A|b) = ( 1    1    25
          2    4    80
          1/5  1/3  5 ).


Example 1.5: This is a 4 × 4 system with coefficients a11 = 0, a12 = 0, a13 = 0, a14 = 1, a21 = 1,
a22 = 1, a23 = 1, a24 = 1, a31 = 0, a32 = 0, a33 = 1, a34 = 0, a41 = 12, a42 = 4, a43 = 1, a44 = 0,
and right hand side b1 = 1, b2 = 7, b3 = 3, b4 = 23. The system has a unique solution. The
coefficient matrix and the augmented coefficient matrix are
A = ( 0   0  0  1
      1   1  1  1
      0   0  1  0
      12  4  1  0 ),
(A|b) = ( 0   0  0  1  1
          1   1  1  1  7
          0   0  1  0  3
          12  4  1  0  23 ).

Example 1.7: This is a 3 × 4 homogeneous system with coefficients a11 = 8, a12 = 0, a13 = −1,
a14 = 0, a21 = 18, a22 = 0, a23 = 0, a24 = −2, a31 = 0, a32 = 2, a33 = −2, a34 = −1, and right
hand side b1 = b2 = b3 = 0. The system has infinitely many solutions. The coefficient matrix
and the augmented coefficient matrix are
A = ( 8   0  −1   0
      18  0   0  −2
      0   2  −2  −1 ),
(A|b) = ( 8   0  −1   0  0
          18  0   0  −2  0
          0   2  −2  −1  0 ).

We saw that Examples 1.1, 1.2 and 1.5 have unique solutions. In Examples 1.4, 1.6 (b) and 1.7
the solution is not unique; they even have infinitely many solutions! Examples 1.3 and 1.6 (a) do
not admit solutions. So given an m × n system of linear equations, two important questions arise
naturally:
• Existence: Does the system have a solution?
• Uniqueness: If the system has a solution, is it unique?

More generally, we would like to be able to say something about the structure of solutions of linear
systems. For example, is it possible that there is only one solution? That there are exactly two
solutions? That there are infinitely many solutions? That there is no solution? Can we give criteria for
existence and/or uniqueness of solutions?


Can we give criteria for existence of infinitely many solutions? Is there an efficient way to calculate
all the solutions of a given linear system?
(Spoiler alert: A system of linear equations has either no or exactly one or infinitely many solutions.
It is not possible that it has, e.g., exactly 7 solutions. This will be discussed in detail in Chapter 3.)
Before answering these questions for general m × n systems in Chapter 3, we will have a closer look
at the special case of 2 × 2 systems in the next section.

You should now have understood


• what a linear system is,


• what a coefficient matrix and an augmented coefficient matrix are,


• their relation with linear systems,
• that a linear system can have different types of solutions,
• etc.

You should now be able to


• pass easily from a linear m × n system to its (augmented) coefficient matrix and back,
• solve linear systems by the “solve and substitute”-method,
• etc.

Exercises.
For the following systems of linear equations, find at least one solution (if there is one). Which
of them have a unique solution?

(a) 4x − 6y = 7,  6x − 9y = 12
(b) x + 2y − 3z = −4,  2x + y − 3z = 4
(c) 3x − 5y = 0,  15x − 9y = 0
(d) 2x + 4y + 6z = 18,  4x + 5y + 6z = 24
(e) 4x + 14y = 23,  6x + 21y = 30
(f) 6x + 8y = 12,  15x + 20y = 30

1.2 Linear 2 × 2 systems
Let us come back to the equations from Example 1.1. For convenience, we now write x instead of M
and y instead of C. Recall that the system of equations that we are interested in solving is

1  x + y = 25,
2  2x + 4y = 80.        (1.6)

We want to give a geometric meaning to this system of equations. To this end we think of pairs
x, y as points (x, y) in the plane. Let us forget about equation 2 for a moment and concentrate
only on 1 . Clearly, it has infinitely many solutions. If we choose an arbitrary x, we can always
find y such that 1 is satisfied (just take y = 25 − x). Similarly, if we choose any y, then we only have
to take x = 25 − y and we obtain a solution of 1 .
Where in the xy-plane lie all solutions of 1 ? Clearly, 1 is equivalent to y = 25 − x which we easily
identify as the equation of the line L1 in the xy-plane which passes through (0, 25) and has slope
−1. In summary, a pair (x, y) is a solution of 1 if and only if it lies on the line L1 , see Figure 1.1.
If we apply the same reasoning to 2 , we find that a pair (x, y) satisfies 2 if and only if (x, y) lies
on the line L2 in the xy-plane given by y = (80 − 2x)/4 (this is the line in the xy-plane passing
through (0, 20) with slope −1/2).
Now it is clear that a pair (x, y) satisfies both 1 and 2 if and only if it lies on both lines L1 and
L2 . So finding the solution of our system (1.6) is the same as finding the intersection of the two
lines L1 and L2 . From elementary geometry we know that there are exactly three possibilities for
their intersection:


Figure 1.1: Graphs of the lines L1 , L2 which represent the equations from the system (1.6) (see also
Example 1.1). Their intersection represents the unique solution of the system.

(i) L1 and L2 are not parallel. Then they intersect in exactly one point.
(ii) L1 and L2 are parallel and not equal. Then they do not intersect.
(iii) L1 and L2 are parallel and equal. Then L1 = L2 and they intersect in infinitely many points
(they intersect in every point of L1 = L2 ).

In our example we know that the slope of L1 is −1 and that the slope of L2 is −1/2, so they are not
parallel and therefore intersect in exactly one point. Consequently, the system (1.6) has exactly
one solution.

If we look again at Example 1.6, we see that in Case (a) we have to determine the intersection of
the lines
L1 : y = 5 − x,    L2 : y = 35/6 − x.
Both lines have slope −1 so they are parallel. Since the constant terms in both lines are not equal,
they intersect nowhere, showing that the system of equations has no solution, see Figure 1.2.

In Case (b), the two lines that we have to intersect are

G1 : y = 5 − x, G2 : y = 5 − x.

We see that G1 = G2 , so every point on G1 (or G2 ) is a solution of the system and therefore we have
infinitely many solutions, see Figure 1.2.

Important observation. Whether a linear 2 × 2 system has a unique solution or not has nothing to
do with the right hand side of the system because this only depends on whether the two lines are
parallel or not, and this in turn depends only on the coefficients on the left hand side.

Now let us consider the general case.



Figure 1.2: Example 1.6. Graphs of L1 , L2 .

One linear equation with two unknowns
The general form of one linear equation with two unknowns is

αx + βy = γ. (1.7)

For the set of solutions, there are three possibilities:


(i) The set of solutions forms a line. This happens if at least one of the coefficients α or β is
different from 0. If β ≠ 0, then the set of all solutions is equal to the line L : y = −(α/β)x + γ/β,
which is a line with slope −α/β. If β = 0 and α ≠ 0, then the set of solutions of (1.7) is a line parallel
to the y-axis passing through (γ/α, 0).
(ii) The set of solutions is all of the plane. This happens if α = β = γ = 0. In this case, clearly
every pair (x, y) is a solution of (1.7).
(iii) There is no solution. This happens if α = β = 0 and γ ≠ 0. In this case, no pair (x, y) is a
solution of (1.7) since the left hand side is always 0.

In the first two cases, (1.7) has infinitely many solutions, in the last case it has no solution.

Two linear equations with two unknowns


The general form of a system of two linear equations with two unknowns is

1 Ax + By = U
(1.8)
2 Cx + Dy = V.

We are using the letters A, B, C, D instead of a11 , a12 , a21 , a22 in order to make the calculations
more readable. If we interpret the system of equations as the intersection of two geometrical objects,
in our case lines, we already know that there are the following possible types of solutions:
(i) A point if 1 and 2 describe two non-parallel lines.


(ii) A line if 1 and 2 describe the same line; or if one of the equations is a plane and the other
one is a line.
(iii) A plane if both equations describe a plane.
(iv) The empty set if the two equations describe parallel but different lines; or if one of the
equations has no solution.
In case (i), the system has exactly one solution, in cases (ii) and (iii) the system has infinitely many
solutions and in case (iv) the system has no solution.
In summary, we have the following very important observation.

Remark 1.10. The system (1.8) has either exactly one solution or infinitely many solutions or
no solution.
It is not possible to have for instance exactly 7 solutions.
Question 1.3
What is the geometric interpretation of

(i) a system of 3 linear equations for 2 unknowns?
(ii) a system of 2 linear equations for 3 unknowns?

What can be said about the structure of its solutions?

Algebraic proof of Remark 1.10. Now we want to prove the Remark 1.10 algebraically and we want
to find a criterion on A, B, C, D which allows us to decide easily how many solutions there are. Let
us look at the different cases.
Case 1. B ≠ 0. In this case we can solve 1 for y and obtain y = (1/B)(U − Ax). Inserting this into
2 we find Cx + (D/B)(U − Ax) = V. If we put all terms with x on one side and all other terms on the other side,
we obtain
2’ (AD − BC)x = DU − BV.
(i) If AD − BC ≠ 0 then there is at most one solution, namely x = (DU − BV)/(AD − BC) and consequently
y = (1/B)(U − Ax) = (AV − CU)/(AD − BC). Inserting these expressions for x and y in our system of equations,

we see that they indeed solve the system (1.8), so that we have exactly one solution.

(ii) If AD − BC = 0 then equation 2’ reduces to 0 = DU − BV . This equation has either no


solution (if DU − BV ≠ 0) or it is true for every possible choice of x and y (if DU − BV = 0).
Since 1 has infinitely many solutions, it follows that the system (1.8) has either no solution
or infinitely many solutions.
Case 2. D ≠ 0. This case is analogous to Case 1. In this case we can solve 2 for y and obtain
y = (1/D)(V − Cx). Hence 1 becomes Ax + (B/D)(V − Cx) = U. If we put all terms with x on one side
and all other terms on the other side, we obtain
1’ (AD − BC)x = DU − BV
We have the same subcases as before:


(i) If AD − BC ≠ 0 then there is exactly one solution, namely x = (DU − BV)/(AD − BC) and consequently
y = (1/D)(V − Cx) = (AV − CU)/(AD − BC).

(ii) If AD − BC = 0 then equation 1’ reduces to 0 = DU − BV . This equation has either no


solution (if DU − BV ≠ 0) or holds for every x and y (if DU − BV = 0). Since 2 has
infinitely many solutions, it follows that the system (1.8) has either no solution or infinitely
many solutions.

Case 3. B = 0 and D = 0. Observe that in this case AD − BC = 0 . In this case the system (1.8)
reduces to
Ax = U, Cx = V. (1.9)
We see that the system no longer depends on y. So, if the system (1.9) has at least one solution,
then we automatically have infinitely many solutions since we may choose y freely. If the system
(1.9) has no solution, then the original system (1.8) cannot have a solution either.
Note that there are no other possible cases for the coefficients.

In summary, we proved the following theorem.

Theorem 1.11. Let us consider the linear system

1 Ax + By = U
(1.10)
2 Cx + Dy = V.

(i) The system (1.10) has exactly one solution if and only if AD − BC ≠ 0. In this case, the
solution is

x = (DU − BV)/(AD − BC),    y = (AV − CU)/(AD − BC).        (1.11)

(ii) The system (1.10) has no solution or infinitely many solutions if and only if AD − BC = 0 .

Definition 1.12. The number d = AD − BC is called the determinant of the system (1.10).

In Chapter 4.1 we will generalise this concept to n × n systems for n ≥ 3.
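Theorem 1.11 translates directly into a small solver. The sketch below is an added Python illustration (the function name solve_2x2 is ours, not from the notes) of the determinant criterion and formula (1.11):

    def solve_2x2(A, B, C, D, U, V):
        # Solve Ax + By = U, Cx + Dy = V as in Theorem 1.11.
        d = A * D - B * C                                    # the determinant
        if d != 0:
            return (D * U - B * V) / d, (A * V - C * U) / d  # formula (1.11)
        return None  # no solution or infinitely many, depending on U and V

    print(solve_2x2(1, 1, 2, 4, 25, 80))   # -> (10.0, 15.0), cf. Example 1.1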



Remark 1.13. Let us see how the determinant connects to our geometric interpretation of the
system of equations. Assume that B ≠ 0 and D ≠ 0. Then we can solve 1 and 2 for y to obtain
equations for a pair of lines

L1 : y = −(A/B)x + (1/B)U,    L2 : y = −(C/D)x + (1/D)V.

The two lines intersect in exactly one point if and only if they have different slopes, i.e., if −A/B ≠ −C/D.
After multiplication by −BD we see that this is the same as AD ≠ BC, or in other words,
AD − BC ≠ 0.
On the other hand, the lines are parallel (hence they are either equal or they have no intersection)
if −A/B = −C/D. This is the case if and only if AD = BC, or in other words, if AD − BC = 0.



Figure 1.3: Example 1.14(a). Graphs of L1 , L2 and their intersection (5, 3).

Question 1.4

Consider the cases when B = 0 or D = 0 and make the connection between Theorem 1.11 and
the geometric interpretation of the system of equations.

Let us consider some more examples.


Examples 1.14. (a) 1 x + 2y = 11
2 3x + 4y = 27.
Clearly, the determinant is d = 4 − 6 = −2 ≠ 0. So the system has exactly one solution.
We can check this easily: The first equation gives x = 11 − 2y. Inserting this into the second
equations leads to
3(11 − 2y) + 4y = 27 =⇒ −2y = −6 =⇒ y=3 =⇒ x = 11 − 2 · 3 = 5.

So the solution is x = 5, y = 3. (If we did not have Theorem 1.11, we would have to check
that this is not only a candidate for a solution, but indeed is one.)
Check that the formula (1.11) is satisfied.

(b) 1 x + 2y = 1
2 2x + 4y = 5.
Here, the determinant is d = 4 − 4 = 0, so we expect either no solution or infinitely many
solutions. The first equation gives x = 1 − 2y. Inserting this into the second equation gives
2(1 − 2y) + 4y = 5. We see that the terms with y cancel and we obtain 2 = 5 which is a
contradiction. Therefore, the system of equations has no solution.


Figure 1.4: Picture on the left: The lines L1 , L2 from Example 1.14(b) are parallel and do not
intersect. Therefore the linear system has no solution.
Picture on the right: The lines L1 , L2 from Example 1.14(c) are equal. Therefore the linear system
has infinitely many solutions.

(c) 1 x + 2y = 1
2 3x + 6y = 3.
The determinant is d = 6 − 6 = 0, so again we expect either no solution or infinitely many
solutions. The first equation gives x = 1 − 2y. Inserting this into the second equation gives
3(1 − 2y) + 6y = 3. We see that the terms with y cancel and we obtain 3 = 3 which is true.
Therefore, the system of equations has infinitely many solutions given by x = 1 − 2y.

Remark. This was somewhat clear since we can obtain the second equation from the first one
by multiplying both sides by 3 which shows that both equations carry the same information
and we lose nothing if we simply forget about one of them.

Exercise 1.15. Find all k ∈ R such that the system



1 kx + (15/2 − k)y = 1
2 4x + 2ky = 3

has exactly one solution.

Solution. We only need to calculate the determinant and find all k such that it is different from
zero. So let us start by calculating

d = k · 2k − (15/2 − k) · 4 = 2k² + 4k − 30 = 2(k² + 2k − 15) = 2((k + 1)² − 16).
Hence there are exactly two values for k where d = 0, namely k = −1 ± 4, that is k1 = 3, k2 = −5.
For all other k, we have that d ≠ 0.
So the answer is: The system has exactly one solution if and only if k ∈ R \ {−5, 3}. 
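A quick numerical spot check (an added illustration, not part of the original notes) of the determinant as a function of k:

    # d(k) = 2k^2 + 4k - 30 vanishes exactly at k = -5 and k = 3
    d = lambda k: k * (2 * k) - (15 / 2 - k) * 4
    for k in (-5, 0, 3, 7):
        print(k, d(k))   # -> -5 0.0, 0 -30.0, 3 0.0, 7 96.0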


Remark 1.16. (a) Note that the answer does not depend on the right hand side of the system
of the equation. Only the coefficients on the left hand side determine if there is exactly one
solution or not.
(b) If we wanted to, we could also calculate the solution x, y in the case k ∈ R \ {−5, 3}. We
could do it by hand or use (1.11). Either way, we find
x = (1/d)[2k − 3(15/2 − k)] = (5k − 45/2)/(2k² + 4k − 30),    y = (1/d)[3k − 4] = (3k − 4)/(2k² + 4k − 30).
Note that the denominators are equal to d and they are equal to 0 exactly for the “forbidden”
values of k = −5 or k = 3.
(c) What happens if k = −5 or k = 3? In both cases, d = 0, so we will either have no solution or
infinitely many solutions.
If k = −5, then the system becomes −5x + (25/2)y = 1, 4x − 10y = 3.
Multiplying the first equation by −4/5 and not changing the second equation, we obtain

4x − 10y = −4/5,    4x − 10y = 3

which clearly cannot be satisfied simultaneously.
If k = 3, then the system becomes 3x + (9/2)y = 1, 4x + 6y = 3.
Multiplying the first equation by 4/3 and not changing the second equation, we obtain

4x + 6y = 4/3,    4x + 6y = 3

which clearly cannot be satisfied simultaneously.
In conclusion, if k = −5 or k = 3, then the linear system has no solution.

You should have understood


• the geometric interpretation of a linear m × 2 system and how it helps to understand the qualitative structure of solutions,


• how the determinant helps to decide whether a linear 2 × 2 system has a unique solution or
not,
• that whether a 2 × 2 system has a unique solution depends only on the coefficients; it does not
depend on the right side of the equation (the actual values of the solutions of course do
depend on the right side of the equation),
• etc.
You should now be able to
• pass easily from a linear m × 2 system to its geometric interpretation and back,
• calculate the determinant of a linear 2 × 2 system,


• determine if a linear 2 × 2 system has a unique, no or infinitely many solutions and calculate
them,
• give criteria for existence/uniqueness of solutions,
• etc.

Exercises.
1. Using the determinant criterion, decide which of the following systems have a unique solution and find it.
If the determinant is zero, specify whether the system has infinitely many solutions or no solution.

(a) 6x + y = 3;  −4x − y = 8
(b) 5x + 2y = 7;  2x + 5y = 4
(c) 4x − 6y = 0;  2x − 3y = 0
(d) 2x − 8y = 6;  −3x + 12y = 4
(e) 2x − 8y = 6;  −3x + 12y = −9
(f) 2y = 4;  5x − 3y = 1

2. For which values of k do the following lines intersect in exactly one point?

y = ((k + 3)/2) x + π − ∛2,    y = ((2k − 5)/3) x + √3.

3. For which values of k do the following lines intersect in exactly one point?

(k + 2)x − 3y = 2π,    5kx + (k − 1)y = 3 − e².
1.3 Summary
A linear system is a system of equations
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
      ⋮                           ⋮
am1 x1 + am2 x2 + · · · + amn xn = bm

where x1 , . . . , xn are the unknowns and the numbers aij and bi (i = 1, . . . , m, j = 1, . . . , n) are
given. The numbers aij are called the coefficients of the linear system and the numbers b1 , . . . , bm
are called the right side of the linear system.
In the special case when all bi are equal to 0, the system is called homogeneous; otherwise it is
called inhomogeneous.
The coefficient matrix A and the augmented coefficient matrix (A|b) of the system are

A = ( a11  a12  . . .  a1n
      a21  a22  . . .  a2n
       ⋮                 ⋮
      am1  am2  . . .  amn ),

(A|b) = ( a11  a12  . . .  a1n  b1
          a21  a22  . . .  a2n  b2
           ⋮                 ⋮   ⋮
          am1  am2  . . .  amn  bm ).


The general form of a linear 2 × 2 system is


a11 x1 + a12 x2 = b1
(1.12)
a21 x1 + a22 x2 = b2
and its determinant is
d = a11 a22 − a21 a12 .
The determinant tells us if the system (1.12) has a unique solution:
• If d ≠ 0, then (1.12) has a unique solution.
• If d = 0, then (1.12) has either no or infinitely many solutions (it depends on b1 and b2 which
case prevails).
Observe that d does not depend on the right hand side of the linear system.

1.4 Exercises

1. Find the area of the triangle that lies in the first quadrant and is bounded by the lines
y = 2x − 4, y = −4x + 20.
2. Suppose that the points (1, 5), (−1, 3) and (0, 1) lie on the parabola y = ax² + bx + c. With
this information, determine the values of a, b, c.
3. Describe all parabolas that pass through the points (1, 1) and (−1, 4).
4. Find all values of t, k ∈ R such that the following system is consistent.
2x + 8y = 4,
5x + 4ky = 20,
tx + 2y = 1.
5. Of a three-digit number we know that its three digits add up to 11, and the sum of the first and
the third digit is 5. Find all numbers with this property.
6. The owner of a shop sells dog food at 40$ and cat food at 20$. Going over the week's accounts,
he observes that he received 640$ for animal food and that 22 customers came in that week to
buy animal food. Assuming that every customer has exactly one pet, how many customers were
dog owners and how many were cat owners?
7. The sum of the tens digit and the units digit of a two-digit number is 12, and if 18 is subtracted
from the number, the digits are reversed. Find the number.
8. We know that the distance between Bogotá and Puerto Concordia is approximately 375 km
and the distance between Villavicencio and Puerto Concordia is approximately 261 km. A
driver A leaves Bogotá towards Villavicencio at a constant speed of 57 km/h at 4:00 am and, one
hour later, a driver B leaves Puerto Concordia towards Bogotá at a constant speed of 49 km/h.
At what time does driver A arrive in Villavicencio? Do drivers A and B meet on the road? At
what time do they meet? Repeat the questions assuming that driver A travels at a speed of
19 km/h and driver B travels at a speed of 70 km/h.


Chapter 2

R2 and R3

In this chapter we will introduce the vector spaces R2 , R3 and Rn . We will define algebraic
operations in them and interpret them geometrically. Then we will add some additional structure

to these spaces, namely an inner product. This allows us to assign a norm (length) to a vector and
talk about the angle between two vectors; in particular, it gives us the concept of orthogonality. In
Section 2.3 we will define orthogonal projections in R2 and we will give a formula for the orthogonal
projection of a vector onto another. This formula is easily generalised to projections onto a vector
in Rn with n ≥ 3. Section 2.5 is dedicated to the special and very important case R3 since it is the
space that physicists use in classical mechanics to describe our world. In the last two sections we
study lines and planes in Rn and in R3 . We will see how we can describe them in formulas and we
will learn how to calculate their intersections. This naturally leads to the question on how to solve
linear systems efficiently which will be addressed in the next chapter.

2.1 Vectors in R2
Recall that the xy-plane is the set of all pairs (x, y) with x, y ∈ R. We will denote it by R2 .
Maybe you already encountered vectors in a physics lecture. For instance velocities and forces are
described by vectors. The velocity of a particle says how fast it is and in which direction the particle
moves. Usually, the velocity is represented by an arrow which points in the direction in which the

particle moves and whose length is proportional to the magnitude of the velocity.
Similarly, a force has strength and a direction so it is represented by an arrow which points in the
direction in which it acts and with length proportional to its strength.
Observe that it is not important where in the space R2 or R3 we put the arrow. As long as it points
in the same direction and has the same length, it is considered the same vector. We call two arrows
equivalent if they have the same direction and the same length. A vector is the set of all arrows
which are equivalent to a given arrow. Each specific arrow in this set is called a representation of
the vector. A special representation is the arrow that starts in the origin (0, 0). Vectors are usually
denoted by a small letter with an arrow on top, for example ~v .


Given two points P, Q in the xy-plane, we write \overrightarrow{PQ} for the vector which is represented by the
arrow that starts in P and ends in Q. For example, let P(2, 1) and Q(4, 4) be points in the xy-plane.
Then the arrow from P to Q is \overrightarrow{PQ} = (2, 3)^t.

We can identify a point P(p1, p2) in the xy-plane with the vector starting in the point (0, 0) and
ending in P. We denote this vector by \overrightarrow{OP} or (p1, p2)^t in order to save space (the superscript t
stands for “transposed”). p1 is called its x-coordinate or x-component and p2 is called its
y-coordinate or y-component.

Figure 2.1: The vector \overrightarrow{PQ} and several of its representations. The green arrow is the representation
whose initial point is the origin.

On the other hand, every vector (a, b)^t describes a unique point in the xy-plane, namely the tip of
the arrow which represents the given vector and starts in the origin. Clearly its coordinates are (a, b).
Therefore we can identify the set of all vectors in R2 with R2 itself.

Observe that the slope of the arrow ~v = (a, b)^t is b/a if a ≠ 0. If a = 0, then the vector is parallel to
the y-axis.

For example, the vector ~v = (2, 5)^t can be represented as an arrow whose initial point is in the origin
and its tip is at the point (2, 5). If we put its initial point anywhere else, then we find the tip by
moving 2 units to the right (parallel to the x-axis) and 5 units up (parallel to the y-axis).

A very special vector is the zero vector (0, 0)^t. It is usually denoted by ~0.

We call numbers in R scalars in order to distinguish them from vectors.


Algebra with vectors


If we think of a force and we double its strength then the corresponding vector should be twice as
long. If we multiply the force by 5, then the length of the corresponding vector should be 5 times as
long, that is, if for instance a force ~F = (3, 4)^t is given, then 5~F should be (5 · 3, 5 · 4)^t = (15, 20)^t.
In general, if a vector ~v = (a, b)^t and a scalar c are given, then c~v = (ca, cb)^t. Note that the
resulting vector is always parallel to the original one. If c > 0, then the resulting vector points in the
same direction as the original one, if c < 0, then it points in the opposite direction, see Figure 2.2.

Figure 2.2: Multiplication of a vector by a scalar.

Given two points P(p1, p2), Q(q1, q2) in the xy-plane, convince yourself that \overrightarrow{PQ} = −\overrightarrow{QP}.

How should we sum two vectors? Again, let us think of forces. Assume we have two forces F~1
and F~2 both acting on the same particle. Then we get the resulting force if we draw the arrow
representing F~1 and attach to its end point the initial point of the arrow representing F~2 . The total
force is then represented by the arrow starting in the initial point of F~1 and ending in the tip of F~2 .

Convince yourself that we obtain the same result if we start with F~2 and put the initial point of
F~1 at the tip of F~2 .
We could also think of the sum of velocities. For example, if a train moves with velocity ~vt and a
passenger on the train is moving with relative velocity ~vp, then her total velocity with respect to
the ground is the vector sum of the two velocities.

Now assume that ~v = (a, b)^t and ~w = (p, q)^t. Algebraically, we obtain the components of their sum
by summing the components: ~v + ~w = (a + p, b + q)^t, see Figure 2.3.
When you sum vectors, you should always think of triangles (or polygons if you sum more than two
vectors).

Figure 2.3: Sum of two vectors.

Given two points P(p1, p2), Q(q1, q2) in the xy-plane, convince yourself that \overrightarrow{OP} + \overrightarrow{PQ} = \overrightarrow{OQ} and
consequently \overrightarrow{PQ} = \overrightarrow{OQ} − \overrightarrow{OP}.
How could you write \overrightarrow{QP} in terms of \overrightarrow{OP} and \overrightarrow{OQ}? What is its relation with \overrightarrow{PQ}?

Our discussion of how the product of a vector and a scalar and how the sum of two vectors should
be, leads us to the following formal definition.


   
Definition 2.1. Let ~v = (a, b)^t, ~w = (p, q)^t ∈ R2 and c ∈ R. Then:

Vector sum:              ~v + ~w = (a, b)^t + (p, q)^t = (a + p, b + q)^t,
Product with a scalar:   c~v = c (a, b)^t = (ca, cb)^t.

It is easy to see that the vector sum satisfies what one expects from a sum: (~u + ~v) + ~w = ~u + (~v + ~w)
(associativity) and ~v + ~w = ~w + ~v (commutativity). Moreover, we have the distributivity laws
(a + b)~v = a~v + b~v and a(~v + ~w) = a~v + a~w. Let us verify for example associativity. To this end,
let ~u = (u1, u2)^t, ~v = (v1, v2)^t, ~w = (w1, w2)^t. Then

(~u + ~v) + ~w = ((u1 + v1) + w1, (u2 + v2) + w2)^t = (u1 + (v1 + w1), u2 + (v2 + w2))^t = ~u + (~v + ~w).
           
u1 + (v1 + w1 ) u1 (v1 + w1 ) u1 v1 w1
= = + = + +
u2 + (v2 + w2 ) u2 (v2 + w2 ) u2 v2 w2
= ~u + (~v + w).
~

In the same fashion, verify commutativity and distributivity of the vector sum.

Figure 2.4: The picture illustrates the commutativity of the vector sum.

Figure 2.5: The picture illustrates associativity of the vector sum.

Draw pictures that illustrate the distributivity laws.

We can take these properties and define an abstract vector space. We shall call a set of things, called
vectors, with a “well-behaved” sum of its elements and a “well-behaved” product of its elements
with scalars a vector space. The precise definition is the following.


Vector Space Axioms. Let V be a set together with two operations

vector sum + : V × V → V, (v, w) ↦ v + w,

product of a scalar and a vector · : R × V → V, (λ, v) ↦ λ · v.

Note that we will usually write λv instead of λ · v. Then V is called an R-vector space and its
elements are called vectors if the following holds:

(a) Associativity: (u + v) + w = u + (v + w) for every u, v, w ∈ V .

(b) Commutativity: v + w = w + v for every v, w ∈ V .

(c) Identity element of addition: There exists an element O ∈ V , called the additive identity
such that for every v ∈ V , we have O + v = v + O = v.

(d) Inverse element: For all v ∈ V , we have an inverse element v 0 such that v + v 0 = O.

(e) Identity element of multiplication by scalar: For every v ∈ V , we have that 1v = v.

(f) Compatibility: For every v ∈ V and λ, µ ∈ R, we have that (λµ)v = λ(µv).

(g) Distributivity laws: For all v, w ∈ V and λ, µ ∈ R, we have

(λ + µ)v = λv + µv and λ(v + w) = λv + λw.

These axioms are fundamental for linear algebra and we will come back to them in Chapter 5.1.
Check that R2 is a vector space, that its additive identity is O = ~0 and that for every vector
~v ∈ R2 , its additive inverse is −~v .

It is important to note that there are vector spaces that do not look like R2 and that we cannot
always write vectors as columns. For instance, the set of all polynomials form a vector space (the
sum and scalar multiple of polynomials is again polynomial, the sum is additive and commutative;
the additive identity is the zero polynomial and for every polynomial p, its additive inverse is the

polynomial −p; we can multiply polynomials with scalars and obtain another polynomial, etc.). The
vectors in this case are polynomials and it does not make sense to speak about its “components” or
“coordinates”. (We will however learn how to represent certain subspaces of the space of polynomials
as subspaces of some Rn in Chapter 6.3.)

After this brief excursion about abstract vector spaces, let us return to R2 . We know that it can
be identified with the xy-plane. This means that R2 has more structure than only being a vector
space. For example, we can measure angles and lengths. Observe that these concepts do not appear
in the definition of a vector space. They are something in addition to the vector space properties.
Let us now look at some more geometric properties of vectors in R2. Clearly a vector is known if
we know its length and its angle with the x-axis. From the Pythagoras theorem it is clear that the
length of a vector ~v = (a, b)^t is √(a² + b²).


Figure 2.6: Angle of a vector with the x-axis.

Figure 2.7: The angle of $\vec v$ and $-\vec v$ with the x-axis. Clearly, $\varphi' = \varphi + \pi$.

Definition 2.2 (Norm of a vector in R2). The length of $\vec v = \begin{pmatrix}a\\b\end{pmatrix} \in \mathbb{R}^2$ is denoted by $\|\vec v\|$. It is given by
$$\|\vec v\| = \sqrt{a^2 + b^2}.$$
Other names for the length of $\vec v$ are magnitude of $\vec v$ or norm of $\vec v$.

As already mentioned earlier, the slope of the vector $\vec v$ is $\frac{b}{a}$ if $a \ne 0$. If $\varphi$ is the angle of the vector $\vec v$ with the x-axis then $\tan\varphi = \frac{b}{a}$ if $a \ne 0$. If $a = 0$, then $\varphi = -\frac{\pi}{2}$ or $\varphi = \frac{\pi}{2}$. Recall that the range of arctan is $(-\pi/2, \pi/2)$, so we cannot simply take arctan of the fraction $\frac{b}{a}$ in order to obtain $\varphi$. Observe that $\arctan\frac{b}{a} = \arctan\frac{-b}{-a}$, but the vectors $\begin{pmatrix}a\\b\end{pmatrix}$ and $\begin{pmatrix}-a\\-b\end{pmatrix} = -\begin{pmatrix}a\\b\end{pmatrix}$ point in opposite directions, so they do not have the same angle with the x-axis. In fact, their angles differ by $\pi$, see Figure 2.7. From elementary geometry, we find
$$\tan\varphi = \frac{b}{a} \ \text{ if } a \ne 0 \qquad\text{and}\qquad \varphi = \begin{cases} \arctan\frac{b}{a} & \text{if } a > 0,\\[2pt] \pi + \arctan\frac{b}{a} & \text{if } a < 0,\\[2pt] \pi/2 & \text{if } a = 0,\ b > 0,\\[2pt] -\pi/2 & \text{if } a = 0,\ b < 0. \end{cases}$$
Note that this formula gives angles with values in $[-\pi/2, 3\pi/2)$.
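For readers who like to check such case distinctions numerically, here is a small Python sketch (not part of the original notes). The helper name angle_with_x_axis is ours; the library function math.atan2 performs a closely related case analysis, but with values in (−π, π], so for a < 0, b < 0 the two results differ by 2π.

```python
import math

def angle_with_x_axis(a: float, b: float) -> float:
    """Angle of the non-zero vector (a, b) with the positive x-axis,
    following the piecewise formula above (values in [-pi/2, 3*pi/2))."""
    if a > 0:
        return math.atan(b / a)
    if a < 0:
        return math.pi + math.atan(b / a)
    return math.pi / 2 if b > 0 else -math.pi / 2

# Compare with math.atan2 (values in (-pi, pi]); they agree up to 2*pi.
for a, b in [(1.0, 1.0), (-1.0, 1.0), (0.0, -2.0), (3.0, -4.0), (-1.0, -1.0)]:
    print(a, b, round(angle_with_x_axis(a, b), 4), round(math.atan2(b, a), 4))
```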


Remark 2.3. In order to obtain angles with values in $(-\pi, \pi]$, we can use the formula
$$\varphi = \begin{cases} \arccos\frac{a}{\sqrt{a^2+b^2}} & \text{if } b > 0,\\[2pt] -\arccos\frac{a}{\sqrt{a^2+b^2}} & \text{if } b < 0,\\[2pt] 0 & \text{if } a > 0,\ b = 0,\\[2pt] \pi & \text{if } a < 0,\ b = 0. \end{cases}$$

Proposition 2.4 (Properties of the norm). Let $\lambda \in \mathbb{R}$ and $\vec v, \vec w \in \mathbb{R}^2$. Then the following is true:

(i) $\|\vec v\| = 0$ if and only if $\vec v = \vec 0$.

(ii) $\|\lambda\vec v\| = |\lambda|\,\|\vec v\|$,

(iii) $\|\vec v + \vec w\| \le \|\vec v\| + \|\vec w\|$ (triangle inequality).
   
Proof. Let $\vec v = \begin{pmatrix}a\\b\end{pmatrix}$, $\vec w = \begin{pmatrix}c\\d\end{pmatrix} \in \mathbb{R}^2$ and $\lambda \in \mathbb{R}$.

(i) Since $\|\vec v\| = \sqrt{a^2+b^2}$ it follows that $\|\vec v\| = 0$ if and only if $a = 0$ and $b = 0$. This is the case if and only if $\vec v = \vec 0$.

(ii) $\|\lambda\vec v\| = \left\|\lambda\begin{pmatrix}a\\b\end{pmatrix}\right\| = \left\|\begin{pmatrix}\lambda a\\\lambda b\end{pmatrix}\right\| = \sqrt{(\lambda a)^2 + (\lambda b)^2} = \sqrt{\lambda^2(a^2+b^2)} = |\lambda|\sqrt{a^2+b^2} = |\lambda|\,\|\vec v\|$.

(iii) We postpone the proof of the triangle inequality to Corollary 2.20 when we will have the cosine theorem at our disposal.

Geometrically, the triangle inequality says that in the plane the shortest way to get from one point to another is a straight line. Figure 2.8 shows that it is shorter to go directly from the origin of the blue vector to its tip than taking a detour along $\vec v$ and $\vec w$. In other words, $\|\vec v + \vec w\| \le \|\vec v\| + \|\vec w\|$.

Figure 2.8: Triangle inequality.

Definition 2.5. A vector ~v ∈ R2 is called a unit vector if k~v k = 1.

Note that every vector ~v 6= ~0 defines a unit vector pointing in the same direction as itself by k~v k−1~v .

Remark 2.6. (i) The tip of every unit vector lies on the unit circle, and, conversely, every vector
whose initial point is the origin and whose tip lies on the unit circle is a unit vector.
 
(ii) Every unit vector is of the form $\begin{pmatrix}\cos\varphi\\\sin\varphi\end{pmatrix}$ where $\varphi$ is its angle with the positive x-axis.


Figure 2.9: Unit vectors.

Finally, we define two very special unit vectors:

$$\vec e_1 = \begin{pmatrix}1\\0\end{pmatrix}, \qquad \vec e_2 = \begin{pmatrix}0\\1\end{pmatrix}.$$
Clearly, $\vec e_1$ is parallel to the x-axis, $\vec e_2$ is parallel to the y-axis and $\|\vec e_1\| = \|\vec e_2\| = 1$.

Remark 2.7. Every vector $\vec v = \begin{pmatrix}a\\b\end{pmatrix}$ can be written as
$$\vec v = \begin{pmatrix}a\\b\end{pmatrix} = \begin{pmatrix}a\\0\end{pmatrix} + \begin{pmatrix}0\\b\end{pmatrix} = a\vec e_1 + b\vec e_2.$$

Remark 2.8. Another notation for $\vec e_1$ and $\vec e_2$ is $\hat\imath$ and $\hat\jmath$.

You should have understood


• the concept of an abstract vector space and vectors,
• the vector space R2 and how to calculate with vectors in R2 ,
 
• the difference between a point P(a, b) in R2 and a vector $\vec v = \begin{pmatrix}a\\b\end{pmatrix}$ in R2,
• geometric concepts (angles, length of a vector),
• etc.
You should now be able to
• perform algebraic operations in the vector space R2 and visualise them in the plane,
• calculate lengths and angles,
• calculate unit vectors, scale vectors,
• perform simple abstract proofs (e.g., prove that R2 is a vector space).
• etc.


Exercises.

1. Let P(2, 3), Q(−1, 4) be points in R2 and let $\vec v = \begin{pmatrix}3\\-2\end{pmatrix}$ be a vector in R2.
(a) Compute $\overrightarrow{PQ}$.
(b) Compute $\|\overrightarrow{PQ}\|$.
(c) Compute $\overrightarrow{PQ} + \vec v$.
(d) Find the angle that $\vec v$ makes with the x-axis.
(e) Find the angle that $\overrightarrow{PQ}$ makes with the x-axis.

2. (a) Use vector addition to decide whether the points (1, 1), (4, 2), (2, 4) and (−1, 3) form a parallelogram.
(b) Repeat the previous exercise with the points (1, −3), (2, 0), (3, −2) and (0, 4).
(c) Repeat the previous exercise with the points (1, 1), (2, 3), (3, 2) and (4, 4).

2.2 Inner product in R2


In this section we will explore further geometric properties of R2 and we will introduce the so-called
inner product. Many of these properties carry over almost literally to R3 and more generally, to
Rn . Let us start with a definition.
RA
   
Definition 2.9 (Inner product). Let $\vec v = \begin{pmatrix}v_1\\v_2\end{pmatrix}$, $\vec w = \begin{pmatrix}w_1\\w_2\end{pmatrix}$ be vectors in R2. The inner product of $\vec v$ and $\vec w$ is
$$\langle\vec v, \vec w\rangle := v_1w_1 + v_2w_2.$$
The inner product is also called scalar product or dot product and it can also be denoted by $\vec v \cdot \vec w$.

We usually prefer the notation h~v , wi


~ since this notation is used frequently in physics and extends
naturally to abstract vector spaces with an inner product. Moreover, the notation with the dot
D

seems to suggest that the dot product behaves like a usual product, whereas in reality it does not,
see Remark 2.12.

Before we give properties of the inner product and explore what it is good for, we first calculate a
few examples to familiarise ourselves with it.

Examples 2.10.

(i) $\left\langle\begin{pmatrix}2\\3\end{pmatrix}, \begin{pmatrix}-1\\5\end{pmatrix}\right\rangle = 2\cdot(-1) + 3\cdot 5 = -2 + 15 = 13.$

(ii) $\left\langle\begin{pmatrix}2\\3\end{pmatrix}, \begin{pmatrix}2\\3\end{pmatrix}\right\rangle = 2^2 + 3^2 = 4 + 9 = 13.$ Observe that this is equal to $\left\|\begin{pmatrix}2\\3\end{pmatrix}\right\|^2$.

(iii) $\left\langle\begin{pmatrix}2\\3\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}\right\rangle = 2, \qquad \left\langle\begin{pmatrix}2\\3\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}\right\rangle = 3.$

(iv) $\left\langle\begin{pmatrix}2\\3\end{pmatrix}, \begin{pmatrix}-3\\2\end{pmatrix}\right\rangle = 0.$
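These computations are easy to reproduce in NumPy; the following snippet is our addition (not part of the original notes) and uses numpy.dot, which computes exactly the sum $v_1w_1 + v_2w_2$ from Definition 2.9.

```python
import numpy as np

v = np.array([2.0, 3.0])

print(np.dot(v, np.array([-1.0, 5.0])))   # 13.0, Example 2.10 (i)
print(np.dot(v, v))                       # 13.0 = ||v||^2, Example 2.10 (ii)
print(np.dot(v, np.array([1.0, 0.0])),    # 2.0 and 3.0, Example 2.10 (iii)
      np.dot(v, np.array([0.0, 1.0])))
print(np.dot(v, np.array([-3.0, 2.0])))   # 0.0, Example 2.10 (iv)
```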

Proposition 2.11 (Properties of the inner product). Let $\vec u, \vec v, \vec w \in \mathbb{R}^2$ and $\lambda \in \mathbb{R}$. Then the following holds.

(i) $\langle\vec v, \vec v\rangle = \|\vec v\|^2$. In dot notation: $\vec v \cdot \vec v = \|\vec v\|^2$.

(ii) $\langle\vec u, \vec v\rangle = \langle\vec v, \vec u\rangle$. In dot notation: $\vec u \cdot \vec v = \vec v \cdot \vec u$.

(iii) $\langle\vec u, \vec v + \vec w\rangle = \langle\vec u, \vec v\rangle + \langle\vec u, \vec w\rangle$. In dot notation: $\vec u \cdot (\vec v + \vec w) = \vec u \cdot \vec v + \vec u \cdot \vec w$.

(iv) $\langle\lambda\vec u, \vec v\rangle = \lambda\langle\vec u, \vec v\rangle$. In dot notation: $(\lambda\vec u)\cdot\vec v = \lambda(\vec u\cdot\vec v)$.

Proof. Let $\vec u = \begin{pmatrix}u_1\\u_2\end{pmatrix}$, $\vec v = \begin{pmatrix}v_1\\v_2\end{pmatrix}$ and $\vec w = \begin{pmatrix}w_1\\w_2\end{pmatrix}$.

(i) $\langle\vec v, \vec v\rangle = v_1^2 + v_2^2 = \|\vec v\|^2$.

(ii) $\langle\vec u, \vec v\rangle = u_1v_1 + u_2v_2 = v_1u_1 + v_2u_2 = \langle\vec v, \vec u\rangle$.

(iii) $\langle\vec u, \vec v + \vec w\rangle = \left\langle\begin{pmatrix}u_1\\u_2\end{pmatrix}, \begin{pmatrix}v_1+w_1\\v_2+w_2\end{pmatrix}\right\rangle = u_1(v_1+w_1) + u_2(v_2+w_2) = u_1v_1 + u_2v_2 + u_1w_1 + u_2w_2 = \langle\vec u, \vec v\rangle + \langle\vec u, \vec w\rangle$.

(iv) $\langle\lambda\vec u, \vec v\rangle = \left\langle\begin{pmatrix}\lambda u_1\\\lambda u_2\end{pmatrix}, \begin{pmatrix}v_1\\v_2\end{pmatrix}\right\rangle = \lambda u_1v_1 + \lambda u_2v_2 = \lambda(u_1v_1 + u_2v_2) = \lambda\langle\vec u, \vec v\rangle$.

Remark 2.12. Observe that the proposition shows that the inner product is commutative and
distributive, so it has some properties of the “usual product” that we are used to from the product
in R or C, but there are some properties that show that the inner product is not a product.

(a) The inner product takes two vectors and gives back a number, so it gives back an object that
is not of the same type as the two things we put in.
(b) In Example 2.10(iv) we saw that it may happen that ~v 6= ~0 and w
~ 6= ~0 but still h~v , wi
~ =0
which is impossible for a “decent” product.
(c) Given a vector ~v 6= 0 and a number c ∈ R, there are many solutions of the equation h~v , ~xi = c
for the vector ~x, in stark contrast to the usual product in R or C. Look for instance at
Example 2.10(i) and (ii). Therefore it makes no sense to write something like ~v −1 .
(d) There is no such thing as a neutral element for the inner product.


Now let us see why the inner product is useful. In fact, it is related to the angle between two vectors
and it will help us to define orthogonal projections of one vector onto another. Let us start with a
definition.

~ be vectors in R2 . The angle between ~v and w


Definition 2.13. Let ~v , w ~ is the smallest nonnegative
angle between them, see Figure 2.10. It is denoted by ^(~v , w).
~

Figure 2.10: Angle between two vectors.

The following properties of the angle are easy to see.

Proposition 2.14. (i) $\angle(\vec v, \vec w) \in [0, \pi]$ and $\angle(\vec v, \vec w) = \angle(\vec w, \vec v)$.
(ii) If $\lambda > 0$, then $\angle(\lambda\vec v, \vec w) = \angle(\vec v, \vec w)$.
(iii) If $\lambda < 0$, then $\angle(\lambda\vec v, \vec w) = \pi - \angle(\vec v, \vec w)$.
Figure 2.11: Angle between the vector $\vec w$ and the vectors $\vec v$ and $-\vec v$. Here $\varphi = \angle(\vec w, \vec v)$ and $\psi = \angle(\vec w, -\vec v) = \pi - \angle(\vec w, \vec v) = \pi - \varphi$.
Definition 2.15. (a) Two non-zero vectors $\vec v$ and $\vec w$ are called parallel if $\angle(\vec v, \vec w) = 0$ or $\pi$. In this case we use the notation $\vec v \parallel \vec w$.

(b) Two non-zero vectors $\vec v$ and $\vec w$ are called orthogonal (or perpendicular) if $\angle(\vec v, \vec w) = \pi/2$. In this case we use the notation $\vec v \perp \vec w$.

(c) The vector $\vec 0$ is parallel and perpendicular to every vector.

The following properties should be intuitively clear from geometry. A formal proof of (ii) and (iii)
can be given easily after Corollary 2.20. The proof of (i) will be given after Remark 2.24.

Proposition 2.16. Let $\vec v, \vec w$ be vectors in R2. Then:


(i) If $\vec v \parallel \vec w$ and $\vec w \ne \vec 0$, then there exists $\lambda \in \mathbb{R}$ such that $\vec v = \lambda\vec w$.

(ii) If $\vec v \parallel \vec w$ and $\lambda, \mu \in \mathbb{R}$, then also $\lambda\vec v \parallel \mu\vec w$.

(iii) If $\vec v \perp \vec w$ and $\lambda, \mu \in \mathbb{R}$, then also $\lambda\vec v \perp \mu\vec w$.

Remark 2.17. (i) Observe that (i) is wrong if we do not assume that $\vec w \ne \vec 0$ because if $\vec w = \vec 0$, then it is parallel to every vector $\vec v$ in R2, but there is no $\lambda \in \mathbb{R}$ such that $\lambda\vec w$ could ever become different from $\vec 0$.
(ii) Observe that the reverse direction in (ii) and (iii) is true only if $\lambda \ne 0$ and $\mu \ne 0$.

Without proof, we state the following theorem which should be known.


Theorem 2.18 (Cosine Theorem). Let a, b, c be the sides of a triangle and let $\varphi$ be the angle between the sides a and b. Then
$$c^2 = a^2 + b^2 - 2ab\cos\varphi. \tag{2.1}$$

Theorem 2.19. Let $\vec v, \vec w \in \mathbb{R}^2$ and let $\varphi = \angle(\vec v, \vec w)$. Then
$$\langle\vec v, \vec w\rangle = \|\vec v\|\,\|\vec w\|\cos\varphi.$$

Proof. The vectors $\vec v$ and $\vec w$ define a triangle in R2, see Figure 2.12. Now we apply the cosine theorem with $a = \|\vec v\|$, $b = \|\vec w\|$, $c = \|\vec v - \vec w\|$. We obtain
$$\|\vec v - \vec w\|^2 = \|\vec v\|^2 + \|\vec w\|^2 - 2\|\vec v\|\,\|\vec w\|\cos\varphi. \tag{2.2}$$

Figure 2.12: Triangle given by $\vec v$ and $\vec w$.

On the other hand,
$$\|\vec v - \vec w\|^2 = \langle\vec v - \vec w, \vec v - \vec w\rangle = \langle\vec v, \vec v\rangle - \langle\vec v, \vec w\rangle - \langle\vec w, \vec v\rangle + \langle\vec w, \vec w\rangle = \langle\vec v, \vec v\rangle - 2\langle\vec v, \vec w\rangle + \langle\vec w, \vec w\rangle = \|\vec v\|^2 - 2\langle\vec v, \vec w\rangle + \|\vec w\|^2. \tag{2.3}$$
Comparison of (2.2) and (2.3) yields
$$\|\vec v\|^2 + \|\vec w\|^2 - 2\|\vec v\|\,\|\vec w\|\cos\varphi = \|\vec v\|^2 - 2\langle\vec v, \vec w\rangle + \|\vec w\|^2,$$
which gives the claimed formula.
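As a quick numerical illustration of Theorem 2.19 (this snippet is ours, not part of the notes): solving $\cos\varphi = \langle\vec v, \vec w\rangle/(\|\vec v\|\,\|\vec w\|)$ for $\varphi$ gives the angle between two non-zero vectors. The clipping guards against rounding errors pushing the quotient slightly outside [−1, 1].

```python
import numpy as np

def angle(v: np.ndarray, w: np.ndarray) -> float:
    """Angle between two non-zero vectors, in radians (Theorem 2.19)."""
    c = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

print(angle(np.array([1.0, 0.0]), np.array([1.0, 1.0])))    # pi/4
print(angle(np.array([2.0, 3.0]), np.array([-3.0, 2.0])))   # pi/2 (orthogonal)
print(angle(np.array([1.0, 2.0]), np.array([-2.0, -4.0])))  # pi (antiparallel)
```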


A very important consequence of this theorem is that we can now determine if two vectors are
parallel or perpendicular to each other by simply calculating their inner product as can be seen
from the following corollary.

Corollary 2.20. Let $\vec v, \vec w \in \mathbb{R}^2$ and $\varphi = \angle(\vec v, \vec w)$. Then:


(i) $\vec v \parallel \vec w \iff \|\vec v\|\,\|\vec w\| = |\langle\vec v, \vec w\rangle|$.

(ii) $\vec v \perp \vec w \iff \langle\vec v, \vec w\rangle = 0$.

(iii) Cauchy-Schwarz inequality: $|\langle\vec v, \vec w\rangle| \le \|\vec v\|\,\|\vec w\|$.

(iv) Triangle inequality:
$$\|\vec v + \vec w\| \le \|\vec v\| + \|\vec w\|. \tag{2.4}$$

Proof. The claims are clear if one of the vectors is equal to ~0 since the zero vector is parallel and
orthogonal to every vector in R2 . So let us assume now that ~v 6= ~0 and w~ 6= ~0.

(i) From Theorem 2.19 we have that |h~v , wi| ~ = k~v k kwk
~ if and only if | cos ϕ| = 1. This is the
case if and only if ϕ = 0 or π, that is, if and only if ~v and w
~ are parallel.

(ii) From Theorem 2.19 we have that |h~v , wi| ~ = 0 if and only if cos ϕ = 0. This is the case if and
only if ϕ = π/2, that is, if and only if ~v and w
~ are perpendicular.

(iii) By Theorem 2.19 we have that $|\langle\vec v, \vec w\rangle| = \|\vec v\|\,\|\vec w\|\,|\cos\varphi| \le \|\vec v\|\,\|\vec w\|$ since $0 \le |\cos\varphi| \le 1$ for $\varphi \in [0, \pi]$.

(iv) Consider the triangle whose sides are $\vec v$, $\vec w$ and $\vec v + \vec w$ and let $\varphi$ be the angle opposite to the side $\vec v + \vec w$ (hence $\varphi = \pi - \angle(\vec v, \vec w)$). The cosine theorem gives
$$\|\vec v + \vec w\|^2 = \|\vec v\|^2 + \|\vec w\|^2 - 2\|\vec v\|\,\|\vec w\|\cos\varphi \le \|\vec v\|^2 + \|\vec w\|^2 + 2\|\vec v\|\,\|\vec w\| = (\|\vec v\| + \|\vec w\|)^2.$$
Taking the square root on both sides gives us the desired inequality.

Question 2.1
When does equality hold in the triangle inequality (2.4)? Draw a picture and prove your claim
using the calculations in the proof of (iv).

Exercise. Prove (ii) and (iii) of Proposition 2.16 using Corollary 2.20.

Exercise. (i) Prove Corollary 2.20 (iii) without the cosine theorem.
Hint. Start with the inequality $0 \le \big\|\,\|\vec w\|\vec v - \|\vec v\|\vec w\,\big\|^2$ and expand the right-hand side similarly as in the proof of Proposition 8.6. You will find that $0 \le 2\|\vec w\|^2\|\vec v\|^2 - 2\|\vec v\|\,\|\vec w\|\langle\vec v, \vec w\rangle$.
(ii) Prove Corollary 2.20 (iv) without the cosine theorem.
Hint. Cf. the proof of the triangle inequality in Cn (Proposition 8.6).

We give a proof of (iii) and (iv) in Proposition 8.6 without the use of the cosine theorem which
works also in the complex case.


Example 2.21. Theorem 2.19 allows us to calculate the angles of a given vector with the coordinate axes easily (see Figure 2.13):
$$\cos\varphi_x = \frac{\langle\vec v, \vec e_1\rangle}{\|\vec v\|\,\|\vec e_1\|}, \qquad \cos\varphi_y = \frac{\langle\vec v, \vec e_2\rangle}{\|\vec v\|\,\|\vec e_2\|}.$$
If we now use that $\|\vec e_1\| = \|\vec e_2\| = 1$ and that $\langle\vec v, \vec e_1\rangle = v_1$ and $\langle\vec v, \vec e_2\rangle = v_2$, then we can simplify the expressions to
$$\cos\varphi_x = \frac{v_1}{\|\vec v\|}, \qquad \cos\varphi_y = \frac{v_2}{\|\vec v\|}.$$

Figure 2.13: Angle of $\vec v$ with the axes.

You should have understood

• the concepts of being parallel and of being perpendicular,


• the relation of the inner product with the length of a vector and the angle between two
vectors,
• that the inner product is commutative and distributive, but that it is not a product,
• etc.
You should now be able to
• calculate the inner product of two vectors,
• use the inner product to calculate angles between vectors

• use the inner product to determine if two vectors are parallel, perpendicular or neither,
• etc.

Exercises.

1. Let $\vec v = \begin{pmatrix}2\\5\end{pmatrix} \in \mathbb{R}^2$.

(a) Find all unit vectors whose direction is opposite to that of $\vec v$.
(b) Find all vectors of length 3 that have the same direction as $\vec v$.
(c) Find all vectors that have the same direction as $\vec v$ and twice the length of $\vec v$.
(d) Find all vectors with norm 2 that are orthogonal to $\vec v$.

2. For the following vectors $\vec u$ and $\vec v$, decide whether they are orthogonal, parallel or neither. Calculate the cosine of the angle between them. If they are parallel, find real numbers $\lambda$ and $\mu$ such that $\vec v = \lambda\vec u$ and $\vec u = \mu\vec v$.

(a) $\vec v = (1, 4)^t$, $\vec u = (5, -2)^t$,  (b) $\vec v = (2, 4)^t$, $\vec u = (1, 2)^t$,
(c) $\vec v = (3, 4)^t$, $\vec u = (-8, 6)^t$,  (d) $\vec v = (-6, 4)^t$, $\vec u = (3, -2)^t$.

3. (a) For the following pairs $\vec v$ and $\vec w$, find all $\alpha \in \mathbb{R}$ such that $\vec v$ and $\vec w$ are parallel:

(i) $\vec v = (1, 4)^t$, $\vec w = (\alpha, -2)^t$,  (ii) $\vec v = (2, \alpha)^t$, $\vec w = (1+\alpha, 2)^t$,
(iii) $\vec v = (\alpha, 5)^t$, $\vec w = (1+\alpha, 2)^t$,  (iv) $\vec v = (2, \alpha)^t$, $\vec w = (1+\alpha, 2\alpha)^t$.

(b) For the following pairs $\vec v$ and $\vec w$, find all $\alpha \in \mathbb{R}$ such that $\vec v$ and $\vec w$ are perpendicular:

(i) $\vec v = (1, 4)^t$, $\vec w = (\alpha, -2)^t$,  (ii) $\vec v = (2, \alpha)^t$, $\vec w = (\alpha, 2)^t$,  (iii) $\vec v = (\alpha, 5)^t$, $\vec w = (1+\alpha, 2)^t$.

4. Let $\vec a = \begin{pmatrix}2\\\alpha\end{pmatrix}$ and $\vec b = \begin{pmatrix}1\\-1\end{pmatrix}$.

(a) Find all $\alpha \in \mathbb{R}$ such that:
(i) $\vec a \parallel \vec b$;
(ii) $\vec a \perp \vec b$;
(iii) the angle between $\vec a$ and $\vec b$ is $\frac{\pi}{6}$;
(iv) the angle between $\vec a$ and $\vec b$ is $\frac{\pi}{4}$;
(v) the angle between $\vec a$ and $\vec b$ is $\frac{5\pi}{6}$.

(b) What does the angle between $\vec a$ and $\vec b$ tend to as $\alpha \to \infty$ or $\alpha \to -\infty$?

Make a sketch of each case.

2.3 Orthogonal Projections in R2


~ be vectors in R2 and w
Let ~v and w ~ 6= ~0. Geometrically, we have an intuition of what the orthogonal
projection of ~v onto w~ should be and that we should be able to construct it as described in the
following procedure: We move ~v such that its initial point coincides with that of w. ~ Then we
extend w ~ to a line and construct a line that passes through the tip of ~v and is perpendicular to w.
~


The vector from the initial point to the intersection of the two lines should then be the orthogonal projection of $\vec v$ onto $\vec w$, see Figure 2.14.

Figure 2.14: Some examples for the orthogonal projection of $\vec v$ onto $\vec w$ in R2.

This procedure decomposes the vector ~v in a part parallel to w ~ and a part perpendicular to w
~ so
that their sum gives us back ~v . The parallel part is the orthogonal projection of ~v onto w.
~
In the following theorem we give the precise meaning of the orthogonal projection, we show that
a decomposition as described above always exists and we even derive a formula for orthogonal
projection. A more general version of this theorem is Theorem 7.30.
Theorem 2.22 (Orthogonal projection). Let $\vec v$ and $\vec w$ be vectors in R2 and $\vec w \ne \vec 0$. Then there exist uniquely determined vectors $\vec v_\parallel$ and $\vec v_\perp$ (see Figure 2.15) such that
$$\vec v_\parallel \parallel \vec w, \qquad \vec v_\perp \perp \vec w \qquad\text{and}\qquad \vec v = \vec v_\parallel + \vec v_\perp. \tag{2.5}$$
The vector $\vec v_\parallel$ is called the orthogonal projection of $\vec v$ onto $\vec w$ and it is given by
$$\vec v_\parallel = \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\,\vec w. \tag{2.6}$$
Figure 2.15: Examples of decompositions of $\vec v$ into $\vec v = \vec v_\parallel + \vec v_\perp$ with $\vec v_\parallel \parallel \vec w$ and $\vec v_\perp \perp \vec w$. Note that by definition $\vec v_\parallel = \operatorname{proj}_{\vec w}\vec v$.


Proof. Assume we have vectors $\vec v_\parallel$ and $\vec v_\perp$ satisfying (2.5). Since $\vec v_\parallel$ and $\vec w$ are parallel by definition and since $\vec w \ne \vec 0$, there exists $\lambda \in \mathbb{R}$ such that $\vec v_\parallel = \lambda\vec w$, so in order to find $\vec v_\parallel$ it is sufficient to determine $\lambda$. For this, we notice that $\vec v = \lambda\vec w + \vec v_\perp$ by (2.5). Taking the inner product on both sides with $\vec w$ leads to
$$\langle\vec v, \vec w\rangle = \langle\lambda\vec w + \vec v_\perp, \vec w\rangle = \langle\lambda\vec w, \vec w\rangle + \underbrace{\langle\vec v_\perp, \vec w\rangle}_{=\,0\ \text{since}\ \vec v_\perp \perp \vec w} = \lambda\langle\vec w, \vec w\rangle = \lambda\|\vec w\|^2 \qquad\Longrightarrow\qquad \lambda = \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}.$$
So if a sum representation of $\vec v$ as in (2.5) exists, then the only possibility is
$$\vec v_\parallel = \lambda\vec w = \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w \qquad\text{and}\qquad \vec v_\perp = \vec v - \vec v_\parallel = \vec v - \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w.$$
This already proves uniqueness of the vectors $\vec v_\parallel$ and $\vec v_\perp$. It remains to show that they indeed have the desired properties. Clearly, by construction $\vec v_\parallel$ is parallel to $\vec w$ and $\vec v = \vec v_\parallel + \vec v_\perp$ since we defined $\vec v_\perp = \vec v - \vec v_\parallel$. It remains to verify that $\vec v_\perp$ is orthogonal to $\vec w$. This follows from
$$\langle\vec v_\perp, \vec w\rangle = \left\langle\vec v - \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w, \vec w\right\rangle = \langle\vec v, \vec w\rangle - \left\langle\frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w, \vec w\right\rangle = \langle\vec v, \vec w\rangle - \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\langle\vec w, \vec w\rangle = 0,$$
where in the last step we used that $\langle\vec w, \vec w\rangle = \|\vec w\|^2$.

Notation 2.23. Instead of $\vec v_\parallel$ we often write $\operatorname{proj}_{\vec w}\vec v$, in particular when we want to emphasise onto which vector we are projecting.

Remark 2.24. (i) $\operatorname{proj}_{\vec w}\vec v$ depends only on the direction of $\vec w$. It does not depend on its length.
(ii) For every $c \in \mathbb{R}$, we have that $\operatorname{proj}_{\vec w}(c\vec v) = c\operatorname{proj}_{\vec w}\vec v$.
(iii) As special cases of the above, we find $\operatorname{proj}_{\vec w}(-\vec v) = -\operatorname{proj}_{\vec w}\vec v$ and $\operatorname{proj}_{-\vec w}\vec v = \operatorname{proj}_{\vec w}\vec v$.
(iv) $\vec v \parallel \vec w \implies \operatorname{proj}_{\vec w}\vec v = \vec v$.
(v) $\vec v \perp \vec w \implies \operatorname{proj}_{\vec w}\vec v = \vec 0$.
(vi) $\operatorname{proj}_{\vec w}\vec v$ is the unique vector in R2 such that
$$(\vec v - \operatorname{proj}_{\vec w}\vec v) \perp \vec w \qquad\text{and}\qquad \operatorname{proj}_{\vec w}\vec v \parallel \vec w.$$

Proof. (i): By our geometric intuition, this should be clear. Let us give a formal proof. Suppose we want to project $\vec v$ onto $c\vec w$ for some $c \in \mathbb{R}\setminus\{0\}$. Then
$$\operatorname{proj}_{c\vec w}\vec v = \frac{\langle\vec v, c\vec w\rangle}{\|c\vec w\|^2}(c\vec w) = \frac{c\langle\vec v, \vec w\rangle}{c^2\|\vec w\|^2}(c\vec w) = \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w = \operatorname{proj}_{\vec w}\vec v.$$
Convince yourself graphically that it does not matter if we project $\vec v$ onto $\vec w$ or onto $5\vec w$ or onto $-\frac{7}{5}\vec w$; only the direction of $\vec w$ matters, not its length.


(ii): Again, by geometric considerations, this should be clear. The corresponding calculation is
$$\operatorname{proj}_{\vec w}(c\vec v) = \frac{\langle c\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w = \frac{c\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w = c\operatorname{proj}_{\vec w}\vec v.$$
(iii) follows directly from (i) and (ii).

(iv), (v) and (vi) follow from the uniqueness of the decomposition of the vector $\vec v$ as a sum of a vector parallel and a vector perpendicular to $\vec w$.

Now the proof of Proposition 2.16 (i) follows easily.

Proof of Proposition 2.16 (i). We have to show that if $\vec v \parallel \vec w$ and if $\vec w \ne \vec 0$, then there exists $\lambda \in \mathbb{R}$ such that $\vec v = \lambda\vec w$. From Remark 2.24 (iv) it follows that $\vec v = \operatorname{proj}_{\vec w}\vec v = \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w$, hence the claim follows if we choose $\lambda = \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}$.

We end this section with some examples.

Example 2.25. Let $\vec u = 2\vec e_1 + 3\vec e_2$, $\vec v = 4\vec e_1 - \vec e_2$.

(i) $\operatorname{proj}_{\vec e_1}\vec u = \frac{\langle\vec u, \vec e_1\rangle}{\|\vec e_1\|^2}\vec e_1 = \frac{2}{1^2}\vec e_1 = 2\vec e_1$.

(ii) $\operatorname{proj}_{\vec e_2}\vec u = \frac{\langle\vec u, \vec e_2\rangle}{\|\vec e_2\|^2}\vec e_2 = \frac{3}{1^2}\vec e_2 = 3\vec e_2$.

(iii) Similarly, we can calculate $\operatorname{proj}_{\vec e_1}\vec v = 4\vec e_1$, $\operatorname{proj}_{\vec e_2}\vec v = -\vec e_2$.

(iv) $\operatorname{proj}_{\vec u}\vec v = \frac{\langle\vec v, \vec u\rangle}{\|\vec u\|^2}\vec u = \frac{8-3}{2^2+3^2}\vec u = \frac{5}{13}\vec u = \frac{5}{13}\begin{pmatrix}2\\3\end{pmatrix}$.

(v) $\operatorname{proj}_{\vec v}\vec u = \frac{\langle\vec u, \vec v\rangle}{\|\vec v\|^2}\vec v = \frac{8-3}{4^2+(-1)^2}\vec v = \frac{5}{17}\vec v = \frac{5}{17}\begin{pmatrix}4\\-1\end{pmatrix}$.
a
Example 2.26 (Angle with coordinate axes). Let ~v = ∈ R2 \ {~0}. Then cos ^(~v ,~e1 ) =
b
a b
vk ,
k~ cos ^(~v ,~e2 ) = vk ,
k~ hence
     
a cos ^(~v ,~e1 ) cos ϕx
~v = = k~v k = k~v k
b cos ^(~v ,~e2 ) cos ϕy

and

projection of ~v onto the x-axis = proj~e1 ~v = k~v k cos ^(~v ,~e1 )~e1 = k~v k cos ϕx ~e1 ,
projection of ~v onto the y-axis = proj~e2 ~v = k~v k cos ^(~v ,~e2 )~e2 = k~v k cos ϕy ~e2 .
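The projections in Examples 2.25 and 2.26 can be checked with a few lines of NumPy. This is a sketch of formula (2.6) added by us (the function name proj is our own choice, not notation from the notes).

```python
import numpy as np

def proj(v: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Orthogonal projection of v onto w != 0, following formula (2.6)."""
    return (np.dot(v, w) / np.dot(w, w)) * w

u = np.array([2.0, 3.0])    # u = 2 e1 + 3 e2
v = np.array([4.0, -1.0])   # v = 4 e1 -   e2

print(proj(v, u))                  # 5/13 * (2, 3), Example 2.25 (iv)
print(proj(u, v))                  # 5/17 * (4, -1), Example 2.25 (v)
print(np.dot(v - proj(v, u), u))   # ~0: the remainder is perpendicular to u
```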


Question 2.2

Let $\vec w$ be a vector in $\mathbb{R}^2\setminus\{\vec 0\}$.
(i) Can you describe geometrically all the vectors $\vec v$ whose projection onto $\vec w$ is equal to $\vec 0$?
(ii) Can you describe geometrically all the vectors $\vec v$ whose projection onto $\vec w$ has length 2?
(iii) Can you describe geometrically all the vectors $\vec v$ whose projection onto $\vec w$ has length $3\|\vec w\|$?

You should have understood


• the concept of orthogonal projections in R2 ,
• why the orthogonal projection of $\vec v$ onto $\vec w$ does not depend on the length of $\vec w$,
• etc.

You should now be able to
• calculate the projection of a given vector onto another vector,
• calculate vectors with a given projection onto another vector,
• etc.

Exercises.

1. Let $\vec a = \begin{pmatrix}1\\3\end{pmatrix}$ and $\vec b = \begin{pmatrix}5\\2\end{pmatrix}$.

(a) Compute $\operatorname{proj}_{\vec b}\vec a$ and $\operatorname{proj}_{\vec a}\vec b$.
(b) Find all vectors $\vec v \in \mathbb{R}^2$ such that $\|\operatorname{proj}_{\vec a}\vec v\| = 0$. Describe this set geometrically.
(c) Find all vectors $\vec v \in \mathbb{R}^2$ such that $\|\operatorname{proj}_{\vec a}\vec v\| = 2$. Describe this set geometrically.
(d) Does there exist a vector $\vec x$ such that $\operatorname{proj}_{\vec a}\vec x \parallel \vec b$?
Does there exist a vector $\vec x$ such that $\operatorname{proj}_{\vec x}\vec a \parallel \vec b$?
Does there exist a vector $\vec x$ such that $\operatorname{proj}_{\vec x}\vec a = \vec b$?

2.4 Vectors in Rn
In this section we extend our calculations from R2 to Rn . If n = 3, then we obtain R3 which
usually serves as model for our everyday physical world and which you probably already are familiar
with from physics lectures. We will discuss R3 and some of its peculiarities in more detail in the
Section 2.5.
First, let us define Rn .


Definition 2.27. For $n \in \mathbb{N}$ we define the set
$$\mathbb{R}^n = \left\{ \begin{pmatrix}x_1\\\vdots\\x_n\end{pmatrix} : x_1, \dots, x_n \in \mathbb{R} \right\}.$$

Again we can think of vectors as arrows. As in R2, we can identify every point in Rn with the arrow that starts in the origin of the coordinate system and ends in the given point. The set of all arrows with the same length and the same direction is called a vector in Rn. So every point $P(p_1, \dots, p_n)$ defines a vector $\vec v = \begin{pmatrix}p_1\\\vdots\\p_n\end{pmatrix}$ and vice versa. As before, we sometimes denote vectors as $(p_1, \dots, p_n)^t$ in order to save (vertical) space. The superscript t stands for "transposed".

Rn becomes a vector space with the operations
$$\mathbb{R}^n\times\mathbb{R}^n \to \mathbb{R}^n,\quad \vec v + \vec w = \begin{pmatrix}v_1\\\vdots\\v_n\end{pmatrix} + \begin{pmatrix}w_1\\\vdots\\w_n\end{pmatrix} = \begin{pmatrix}v_1+w_1\\\vdots\\v_n+w_n\end{pmatrix}, \qquad \mathbb{R}\times\mathbb{R}^n \to \mathbb{R}^n,\quad c\vec v = \begin{pmatrix}cv_1\\\vdots\\cv_n\end{pmatrix}. \tag{2.7}$$

Exercise. Show that Rn is a vector space. That is, you have to show that the vector space
axioms on page 29 hold.

As in R2 , we can define the norm of a vector, the angle between two vectors and an inner product.
Note that the definition of the angle between two vectors is not different from the one in R2 since
when we are given two vectors, they always lie in a common plane which we can imagine as some
sort of rotated R2 . Let us give now the formal definitions.
   
Definition 2.28 (Inner product; norm of a vector). For vectors $\vec v = \begin{pmatrix}v_1\\\vdots\\v_n\end{pmatrix}$ and $\vec w = \begin{pmatrix}w_1\\\vdots\\w_n\end{pmatrix}$ the inner product (or scalar product or dot product) is defined as
$$\langle\vec v, \vec w\rangle = \left\langle\begin{pmatrix}v_1\\\vdots\\v_n\end{pmatrix}, \begin{pmatrix}w_1\\\vdots\\w_n\end{pmatrix}\right\rangle = v_1w_1 + \dots + v_nw_n.$$
The length of $\vec v = \begin{pmatrix}v_1\\\vdots\\v_n\end{pmatrix} \in \mathbb{R}^n$ is denoted by $\|\vec v\|$ and it is given by
$$\|\vec v\| = \sqrt{v_1^2 + \dots + v_n^2}.$$
Other names for the length of $\vec v$ are magnitude of $\vec v$ or norm of $\vec v$.


As in R2, we have the following properties:

(i) Symmetry of the inner product: For all vectors $\vec v, \vec w \in \mathbb{R}^n$, we have that $\langle\vec v, \vec w\rangle = \langle\vec w, \vec v\rangle$.

(ii) Bilinearity of the inner product: For all vectors $\vec u, \vec v, \vec w \in \mathbb{R}^n$ and all $c \in \mathbb{R}$, we have that $\langle\vec u, \vec v + c\vec w\rangle = \langle\vec u, \vec v\rangle + c\langle\vec u, \vec w\rangle$.

(iii) Relation of the inner product with the angle between vectors: Let $\vec v, \vec w \in \mathbb{R}^n$ and let $\varphi = \angle(\vec v, \vec w)$. Then
$$\langle\vec v, \vec w\rangle = \|\vec v\|\,\|\vec w\|\cos\varphi.$$
In particular, we have (cf. Proposition 2.16):

(a) $\vec v \parallel \vec w \iff \angle(\vec v, \vec w) \in \{0, \pi\} \iff |\langle\vec v, \vec w\rangle| = \|\vec v\|\,\|\vec w\|$,
(b) $\vec v \perp \vec w \iff \angle(\vec v, \vec w) = \pi/2 \iff \langle\vec v, \vec w\rangle = 0$.

Remark 2.29. In abstract inner product spaces, the inner product is actually used to define orthogonality.

(iv) Relation of the inner product with the norm: For all vectors $\vec v \in \mathbb{R}^n$, we have $\|\vec v\|^2 = \langle\vec v, \vec v\rangle$.

(v) Properties of the norm: For all vectors $\vec v, \vec w \in \mathbb{R}^n$ and scalars $c \in \mathbb{R}$, we have that $\|c\vec v\| = |c|\,\|\vec v\|$ and $\|\vec v + \vec w\| \le \|\vec v\| + \|\vec w\|$.

(vi) Orthogonal projections of one vector onto another: For all vectors $\vec v, \vec w \in \mathbb{R}^n$ with $\vec w \ne \vec 0$ the orthogonal projection of $\vec v$ onto $\vec w$ is
$$\operatorname{proj}_{\vec w}\vec v = \frac{\langle\vec v, \vec w\rangle}{\|\vec w\|^2}\vec w. \tag{2.8}$$

As in R2, we have n "special vectors" which are parallel to the coordinate axes and have norm 1:
$$\vec e_1 := \begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix},\quad \vec e_2 := \begin{pmatrix}0\\1\\\vdots\\0\end{pmatrix},\quad \dots,\quad \vec e_n := \begin{pmatrix}0\\\vdots\\0\\1\end{pmatrix}.$$
In the special case n = 3, the vectors $\vec e_1$, $\vec e_2$ and $\vec e_3$ are sometimes denoted by $\hat\imath$, $\hat\jmath$, $\hat k$.
For a given vector $\vec v \ne \vec 0$, we can now easily determine its projections onto the n coordinate axes and its angles with the coordinate axes. By (2.8), the projection onto the $x_j$-axis is
$$\operatorname{proj}_{\vec e_j}\vec v = v_j\vec e_j.$$
Let $\varphi_j$ be the angle between $\vec v$ and the $x_j$-axis. Then
$$\varphi_j = \angle(\vec v, \vec e_j) \qquad\Longrightarrow\qquad \cos\varphi_j = \frac{\langle\vec v, \vec e_j\rangle}{\|\vec v\|\,\|\vec e_j\|} = \frac{v_j}{\|\vec v\|}.$$


 
It follows that $\vec v = \|\vec v\|\begin{pmatrix}\cos\varphi_1\\\vdots\\\cos\varphi_n\end{pmatrix}$. Sometimes the notation
$$\hat v := \frac{\vec v}{\|\vec v\|} = \begin{pmatrix}\cos\varphi_1\\\vdots\\\cos\varphi_n\end{pmatrix}$$
is used for the unit vector pointing in the same direction as $\vec v$. Clearly $\|\hat v\| = 1$ because $\|\hat v\| = \|\,\|\vec v\|^{-1}\vec v\,\| = \|\vec v\|^{-1}\|\vec v\| = 1$. Therefore $\hat v$ is indeed a unit vector pointing in the same direction as the original vector $\vec v$.

You should have understood


• the vector space Rn and vectors in Rn ,
• geometric concepts (angles, length of a vector) in Rn ,

• that R2 from chapter 2.1 is a special case of Rn from this section,
• etc.
You should now be able to
• perform algebraic operations in the vector space Rn and, in the case n = 3, visualise them in space,
• calculate lengths and angles,
• calculate unit vectors, scale vectors,
• perform simple abstract proofs (e.g., prove that Rn is a vector space).
• etc.

Exercises.

1. Let $\vec a = \begin{pmatrix}2\\1\\0\\3\end{pmatrix}$ and $\vec b = \begin{pmatrix}0\\4\\5\\1\end{pmatrix}$. Calculate:

(a) $4\vec a + 3\vec b$.  (b) $\|3\vec a - 2\vec b\|$.  (c) $\langle\vec a - \vec b + 3\vec e_1, \vec b - 5\vec e_4 + \vec e_3\rangle$.  (d) $\operatorname{proj}_{\vec b}\vec a$.

2.5 Vectors in R3 and the cross product


The space R3 is very important since it is used in mechanics to model the space we live in. On R3 we can define an additional operation with vectors, the so-called cross product. Another name for it is vector product. It takes two vectors and gives back another vector. It does have several properties which make it look like a product; however, we will see that it is not a product. Here is its definition.


   
Definition 2.30 (Cross product). Let $\vec v = \begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}$, $\vec w = \begin{pmatrix}w_1\\w_2\\w_3\end{pmatrix} \in \mathbb{R}^3$. Their cross product (or vector product or wedge product) is
$$\vec v \times \vec w = \begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix} \times \begin{pmatrix}w_1\\w_2\\w_3\end{pmatrix} := \begin{pmatrix}v_2w_3 - v_3w_2\\v_3w_1 - v_1w_3\\v_1w_2 - v_2w_1\end{pmatrix}.$$
Another notation for the cross product is $\vec v \wedge \vec w$.

A way to remember this formula is as follows. Write the first and the second component of the
vectors underneath them, so that formally you get a column of 5 components. Then make crosses
as in the sketch below, starting with the cross consisting of a line from v2 to w3 and then from w2
to v3 . Each line represents a product of the corresponding components; if the line goes from top
left to bottom right then it is counted positive, if it goes from top right to bottom left then it is
counted negative.

[Sketch: the vectors $\vec v$ and $\vec w$ written as columns with their first two components $v_1, v_2$ and $w_1, w_2$ repeated underneath; the diagonal lines of the crosses give the products $v_2w_3 - v_3w_2$, $v_3w_1 - v_1w_3$ and $v_1w_2 - v_2w_1$.]
The cross product is defined only in R3 !

Before we collect some easy properties of the cross product, let us calculate a few examples.
   
Examples 2.31. Let $\vec u = \begin{pmatrix}1\\2\\3\end{pmatrix}$, $\vec v = \begin{pmatrix}5\\6\\7\end{pmatrix}$.

• $\vec u \times \vec v = \begin{pmatrix}1\\2\\3\end{pmatrix} \times \begin{pmatrix}5\\6\\7\end{pmatrix} = \begin{pmatrix}2\cdot 7 - 3\cdot 6\\3\cdot 5 - 1\cdot 7\\1\cdot 6 - 2\cdot 5\end{pmatrix} = \begin{pmatrix}14-18\\15-7\\6-10\end{pmatrix} = \begin{pmatrix}-4\\8\\-4\end{pmatrix}$,

• $\vec v \times \vec u = \begin{pmatrix}5\\6\\7\end{pmatrix} \times \begin{pmatrix}1\\2\\3\end{pmatrix} = \begin{pmatrix}6\cdot 3 - 7\cdot 2\\7\cdot 1 - 5\cdot 3\\5\cdot 2 - 6\cdot 1\end{pmatrix} = \begin{pmatrix}18-14\\7-15\\10-6\end{pmatrix} = \begin{pmatrix}4\\-8\\4\end{pmatrix}$,

• $\vec v \times \vec e_1 = \begin{pmatrix}5\\6\\7\end{pmatrix} \times \begin{pmatrix}1\\0\\0\end{pmatrix} = \begin{pmatrix}6\cdot 0 - 7\cdot 0\\7\cdot 1 - 5\cdot 0\\5\cdot 0 - 6\cdot 1\end{pmatrix} = \begin{pmatrix}0\\7\\-6\end{pmatrix}$,

• $\vec v \times \vec v = \begin{pmatrix}5\\6\\7\end{pmatrix} \times \begin{pmatrix}5\\6\\7\end{pmatrix} = \begin{pmatrix}6\cdot 7 - 7\cdot 6\\7\cdot 5 - 5\cdot 7\\5\cdot 6 - 6\cdot 5\end{pmatrix} = \begin{pmatrix}0\\0\\0\end{pmatrix}$.
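NumPy's np.cross implements exactly the componentwise formula from Definition 2.30, so the computations in Examples 2.31 can be verified directly; this snippet is our addition, not part of the notes.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([5.0, 6.0, 7.0])
e1 = np.array([1.0, 0.0, 0.0])

print(np.cross(u, v))     # [-4.  8. -4.]
print(np.cross(v, u))     # [ 4. -8.  4.]  = -(u x v)
print(np.cross(v, e1))    # [ 0.  7. -6.]
print(np.cross(v, v))     # [ 0.  0.  0.]
# The cross product is orthogonal to both factors:
print(np.dot(u, np.cross(u, v)), np.dot(v, np.cross(u, v)))  # 0.0 0.0
```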


Proposition 2.32 (Properties of the cross product). Let $\vec u, \vec v, \vec w \in \mathbb{R}^3$ and let $c \in \mathbb{R}$. Then:

(i) $\vec u \times \vec 0 = \vec 0 \times \vec u = \vec 0$.

(ii) $\vec u \times \vec v = -\vec v \times \vec u$.

(iii) $\vec u \times (\vec v + \vec w) = (\vec u \times \vec v) + (\vec u \times \vec w)$.

(iv) $(c\vec u) \times \vec v = c(\vec u \times \vec v)$.

(v) $\vec u \parallel \vec v \iff \vec u \times \vec v = \vec 0$. In particular, $\vec v \times \vec v = \vec 0$.

(vi) $\langle\vec u, \vec v \times \vec w\rangle = \langle\vec u \times \vec v, \vec w\rangle$.

(vii) $\langle\vec u, \vec u \times \vec v\rangle = 0$ and $\langle\vec v, \vec u \times \vec v\rangle = 0$, in particular
$$\vec v \perp \vec v \times \vec u, \qquad \vec u \perp \vec v \times \vec u.$$
In other words: the vector $\vec v \times \vec w$ is orthogonal to both $\vec v$ and $\vec w$.
~

Proof. The proofs of the formulas (i) – (iv) are easy calculations (you should do them!).

(v) The implication "⟹" is easy to check. The other direction follows easily from Theorem 2.34 below. Or it can be seen as follows: Let us assume that $\vec v \times \vec w = \vec 0$. If $\vec v = \vec 0$, then clearly $\vec v \parallel \vec w$. If $\vec v \ne \vec 0$, then we write $\vec w = \vec a + \vec b$ where $\vec a = \operatorname{proj}_{\vec v}\vec w$ and $\vec b \perp \vec v$. We need to show that $\vec b = \vec 0$. Using that $\vec v \times \vec a = \vec 0$ (because they are parallel), we obtain that
$$\vec 0 = \vec v \times \vec w = \vec v \times (\vec a + \vec b) = (\vec v \times \vec a) + (\vec v \times \vec b) = \vec v \times \vec b = \begin{pmatrix}v_2b_3 - v_3b_2\\v_3b_1 - v_1b_3\\v_1b_2 - v_2b_1\end{pmatrix}.$$
In addition we know that $\langle\vec v, \vec b\rangle = 0$. So in total we have four linear equations for the three components $b_1, b_2, b_3$ of $\vec b$:
$$v_2b_3 - v_3b_2 = 0,\quad v_3b_1 - v_1b_3 = 0,\quad v_1b_2 - v_2b_1 = 0,\quad v_1b_1 + v_2b_2 + v_3b_3 = 0.$$
Since $\vec v \ne \vec 0$, at least one of its components is different from 0. Let us assume that $v_1 \ne 0$. Then the second and third equations give $b_3 = \frac{v_3}{v_1}b_1$ and $b_2 = \frac{v_2}{v_1}b_1$. If we substitute this in the last equation, we find that
$$0 = v_1b_1 + \frac{v_2^2}{v_1}b_1 + \frac{v_3^2}{v_1}b_1 = \frac{b_1}{v_1}(v_1^2 + v_2^2 + v_3^2) = \frac{\|\vec v\|^2}{v_1}b_1.$$
Since $\vec v \ne \vec 0$, it follows that $b_1 = 0$, but then also $b_2 = \frac{v_2}{v_1}b_1 = 0$ and $b_3 = \frac{v_3}{v_1}b_1 = 0$. In summary, $\vec b = \vec 0$ as we wanted to show.


(vi) The proof is a long but straightforward calculation:
$$\begin{aligned}
\langle\vec u, \vec v \times \vec w\rangle &= \left\langle\begin{pmatrix}u_1\\u_2\\u_3\end{pmatrix}, \begin{pmatrix}v_2w_3 - v_3w_2\\v_3w_1 - v_1w_3\\v_1w_2 - v_2w_1\end{pmatrix}\right\rangle
= u_1(v_2w_3 - v_3w_2) + u_2(v_3w_1 - v_1w_3) + u_3(v_1w_2 - v_2w_1)\\
&= u_1v_2w_3 - u_1v_3w_2 + u_2v_3w_1 - u_2v_1w_3 + u_3v_1w_2 - u_3v_2w_1\\
&= (u_2v_3 - u_3v_2)w_1 + (u_3v_1 - u_1v_3)w_2 + (u_1v_2 - u_2v_1)w_3\\
&= \langle\vec u \times \vec v, \vec w\rangle.
\end{aligned}$$

(vii) It follows from (vi) and (v) that
$$\langle\vec u, \vec u \times \vec v\rangle = \langle\vec u \times \vec u, \vec v\rangle = \langle\vec 0, \vec v\rangle = 0 \qquad\text{and}\qquad \langle\vec v, \vec u \times \vec v\rangle = -\langle\vec v, \vec v \times \vec u\rangle = -\langle\vec v \times \vec v, \vec u\rangle = 0.$$

Note that the cross product is distributive but it is neither commutative nor associative.

Exercise. Prove the formulas in (i) – (iv) and the implication "⟹" in (v).

Remark. A geometric interpretation of the number $\langle\vec u, \vec v \times \vec w\rangle$ from (vi) will be given in Proposition 2.36.

Remark 2.33. The property (vii) explains why the cross product makes sense only in R3 . Given
two non-parallel vectors ~v and w,
~ their cross product is a vector which is orthogonal to both of
them and whose length is k~v k kwk
~ sin ϕ (see Theorem 2.34; ϕ = ^(~v , w))
~ and this should define the
result uniquely up to a factor ±1. This factor has to do with the relative orientation of ~v and w
~ to
each other. However, if n 6= 3, then one of the following holds:
• If we were in R2 , the problem is that “we do not have enough space” because then the only
vector orthogonal to ~v and w ~ at the same time would be the zero vector ~0 and it would not
make too much sense to define a product where the result is always ~0.
• If we were in some Rn with n ≥ 4, the problem is that “we have too many choices”. We will
see later in Chapter 7.3 that the orthogonal complement of the plane generated by ~v and w ~

has dimension n − 2 and every vector in the orthogonal complement is orthogonal to both
~v and w.~ For example, if we take ~v = (1, 0, 0, 0)t and w~ = (0, 1, 0, 0)t , then every vector of
t
the form ~a = (0, 0, x, y) is perpendicular to both ~v and w
~ and it easy to find infinitely many
vectors of this form which in addition have norm k~v k kwk~ sin ϕ = 1 (~a = (0, 0, sin ϑ, ± cos ϑ)t
for arbitrary ϑ ∈ R works).

Recall that for the inner product we proved the formula h~v , wi
~ = k~v k kwk
~ cos ϕ where ϕ is the angle
between the two vectors, see Theorem 2.19. In the next theorem we will prove a similar relation
for the cross product.

Theorem 2.34. Let $\vec v, \vec w$ be vectors in R3 and let $\varphi$ be the angle between them. Then
$$\|\vec v \times \vec w\| = \|\vec v\|\,\|\vec w\|\sin\varphi.$$


Proof. A long, but straightforward calculation shows that $\|\vec v \times \vec w\|^2 = \|\vec v\|^2\|\vec w\|^2 - \langle\vec v, \vec w\rangle^2$. Now it follows from Theorem 2.19 that
$$\|\vec v \times \vec w\|^2 = \|\vec v\|^2\|\vec w\|^2 - \langle\vec v, \vec w\rangle^2 = \|\vec v\|^2\|\vec w\|^2 - \|\vec v\|^2\|\vec w\|^2(\cos\varphi)^2 = \|\vec v\|^2\|\vec w\|^2\bigl(1 - (\cos\varphi)^2\bigr) = \|\vec v\|^2\|\vec w\|^2(\sin\varphi)^2.$$
If we take the square root of both sides, we arrive at the claimed formula. (We do not need to worry about taking the absolute value of $\sin\varphi$ because $\varphi \in [0, \pi]$, hence $\sin\varphi \ge 0$.)

Exercise. Show that $\|\vec v \times \vec w\|^2 = \|\vec v\|^2\|\vec w\|^2 - \langle\vec v, \vec w\rangle^2$.

Application: Area of a parallelogram and volume of a parallelepiped


Area of a parallelogram

Let $\vec v$ and $\vec w$ be two vectors in R3. Then they define a parallelogram (if the vectors are parallel or one of them is equal to $\vec 0$, it is a degenerate parallelogram).

Figure 2.16: Parallelogram spanned by $\vec v$ and $\vec w$.

Proposition 2.35 (Area of a parallelogram). The area of the parallelogram spanned by the vectors $\vec v$ and $\vec w$ is
$$A = \|\vec v \times \vec w\|. \tag{2.9}$$
Proof. The area of a parallelogram is the product of the length of its base with the height. We can take $\vec w$ as base. Let $\varphi$ be the angle between $\vec w$ and $\vec v$. Then we obtain that $h = \|\vec v\|\sin\varphi$ and therefore, with the help of Theorem 2.34,
$$A = \|\vec w\|h = \|\vec w\|\,\|\vec v\|\sin\varphi = \|\vec v \times \vec w\|.$$
Note that in the case when $\vec v$ and $\vec w$ are parallel, this gives the right answer A = 0.

Volume of a parallelepiped

Any three vectors in R3 define a parallelepiped.


Figure 2.17: Parallelepiped spanned by $\vec u$, $\vec v$, $\vec w$.

Proposition 2.36 (Volume of a parallelepiped). The volume of the parallelepiped spanned by the vectors $\vec u$, $\vec v$ and $\vec w$ is
$$V = |\langle\vec u, \vec v \times \vec w\rangle|. \tag{2.10}$$

Proof. The volume of a parallelepiped is the product of the area of its base with the height. Let us take the parallelogram spanned by $\vec v$, $\vec w$ as base. If $\vec v$ and $\vec w$ are parallel or one of them is equal to $\vec 0$, then (2.10) is true because V = 0 and $\vec v \times \vec w = \vec 0$ in this case.
Now let us assume that they are not parallel. By Proposition 2.35 we already know that its base has area $A = \|\vec v \times \vec w\|$. The height is the length of the orthogonal projection of $\vec u$ onto the normal vector of the plane spanned by $\vec v$ and $\vec w$. We already know that $\vec v \times \vec w$ is such a normal vector. Hence we obtain that
$$h = \|\operatorname{proj}_{\vec v\times\vec w}\vec u\| = \left\|\frac{\langle\vec u, \vec v \times \vec w\rangle}{\|\vec v \times \vec w\|^2}\,\vec v \times \vec w\right\| = \frac{|\langle\vec u, \vec v \times \vec w\rangle|}{\|\vec v \times \vec w\|^2}\,\|\vec v \times \vec w\| = \frac{|\langle\vec u, \vec v \times \vec w\rangle|}{\|\vec v \times \vec w\|}.$$
Therefore, the volume of the parallelepiped is
$$V = Ah = \|\vec v \times \vec w\|\,\frac{|\langle\vec u, \vec v \times \vec w\rangle|}{\|\vec v \times \vec w\|} = |\langle\vec u, \vec v \times \vec w\rangle|.$$


Note that in the case when two of the vectors $\vec u$, $\vec v$ and $\vec w$ are parallel, the formula gives the right answer that the volume of the parallelepiped is 0.

Corollary 2.37. Let $\vec u, \vec v, \vec w \in \mathbb{R}^3$. Then
$$|\langle\vec u, \vec v \times \vec w\rangle| = |\langle\vec v, \vec w \times \vec u\rangle| = |\langle\vec w, \vec u \times \vec v\rangle|.$$

Proof. The formula holds because each of the expressions describes the volume of the parallelepiped spanned by the three given vectors, since we can take any of the faces of the parallelepiped as its base.
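Formulas (2.9) and (2.10) translate directly into code. The following NumPy sketch is our addition (function names are ours) and simply evaluates the two formulas.

```python
import numpy as np

def parallelogram_area(v: np.ndarray, w: np.ndarray) -> float:
    """Area A = ||v x w||, formula (2.9)."""
    return float(np.linalg.norm(np.cross(v, w)))

def parallelepiped_volume(u: np.ndarray, v: np.ndarray, w: np.ndarray) -> float:
    """Volume V = |<u, v x w>|, formula (2.10)."""
    return float(abs(np.dot(u, np.cross(v, w))))

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 2.0, 0.0])
w = np.array([0.0, 0.0, 3.0])

print(parallelogram_area(v, w))        # 6.0
print(parallelepiped_volume(u, v, w))  # 6.0 (a 1 x 2 x 3 box)
```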

You should have understood


• the geometric interpretations of the cross product,
• why it exists only in R3
• etc.

You should now be able to


• calculate the cross product,
• use it to say something about the angle between two vectors in R3 ,
• use it to calculate the area of a parallelogram and the volume of a parallelepiped,
• etc.

Exercises.

1. (a) Calculate the area of the parallelogram whose adjacent vertices are A(1, 2, 3), B(2, 3, 4), C(−1, 2, −5) and find the fourth vertex.
(b) Calculate the area of the triangle with vertices A(1, 2, 3), B(2, 3, 4), C(−1, 2, −5).
(c) Find a point P such that the area of the triangle with vertices B, C, P equals 13. How many such points P are there? Describe them geometrically.

2. Calculate the volume of the parallelepiped determined by the vectors
$\vec u = \begin{pmatrix}5\\2\\1\end{pmatrix}$, $\vec v = \begin{pmatrix}-1\\4\\3\end{pmatrix}$, $\vec w = \begin{pmatrix}1\\-2\\7\end{pmatrix}$.

3. Use the cross product to find the sine of the angle formed by the vectors $\begin{pmatrix}2\\1\\-1\end{pmatrix}$ and $\begin{pmatrix}-3\\-2\\4\end{pmatrix}$.


   
4. Find all vectors $\vec a \in \mathbb{R}^3$ such that $\vec a \perp \begin{pmatrix}1\\-1\\2\end{pmatrix}$ and $\vec a \perp \begin{pmatrix}2\\0\\-3\end{pmatrix}$. How many of them have norm 1? Which such $\vec a$ satisfy $\vec a \times \begin{pmatrix}1\\-1\\2\end{pmatrix} = \begin{pmatrix}8\\-2\\-5\end{pmatrix}$?

2.6 Lines and planes in R3


In this section we discuss lines and planes and how to describe them in formulas. In the next
section, we will calculate, e.g., intersections between them. We work mostly in R3 and only give
some hints on how the concepts discussed here generalise to Rn with n 6= 3. The special case n = 2
should be clear.
The formal definition of lines and planes will be given in Definition 5.57 because this requires the
concept of linear independence. (For the curious: a line is an (affine) one-dimensional subspace of
a vector space; a plane is an (affine) two-dimensional subspace of a vector space; a hyperplane is an

(affine) (n − 1)-dimensional subspace of an n-dimensional vector space). In this section we appeal
to our knowledge and intuition from elementary geometry.

Lines
Intuitively, it is clear what a line in R3 should be. In order to describe a line in R3 completely, it
is not necessary to know all its points. It is sufficient to know either

(a) two different points P, Q on the line


or

(b) one point P on the line and the direction of the line.

Figure 2.18: Line L given by: two points P, Q on L; or by a point P on L and the direction of L.

Clearly, both descriptions are equivalent because: If we have two different points P, Q on the line L, then its direction is given by the vector $\overrightarrow{PQ}$. If on the other hand we are given a point P on L and a vector $\vec v$ which is parallel to L, then we easily get another point Q on L by $\overrightarrow{OQ} = \overrightarrow{OP} + \vec v$.

Now we want to give formulas for the line.


Vector equation of a line

Given two points $P(p_1, p_2, p_3)$ and $Q(q_1, q_2, q_3)$ with $P \ne Q$, there is exactly one line L which passes through both points. In formulas, this line is described as
$$L = \left\{ \overrightarrow{OP} + t\overrightarrow{PQ} : t \in \mathbb{R} \right\} = \left\{ \begin{pmatrix}p_1 + (q_1-p_1)t\\p_2 + (q_2-p_2)t\\p_3 + (q_3-p_3)t\end{pmatrix} : t \in \mathbb{R} \right\}. \tag{2.11}$$
If we are given a point $P(p_1, p_2, p_3)$ on L and a vector $\vec v = \begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix} \ne \vec 0$ parallel to L, then
$$L = \left\{ \overrightarrow{OP} + t\vec v : t \in \mathbb{R} \right\} = \left\{ \begin{pmatrix}p_1 + v_1t\\p_2 + v_2t\\p_3 + v_3t\end{pmatrix} : t \in \mathbb{R} \right\}. \tag{2.12}$$
The formulas are easy to understand. They say: In order to trace the line, we first move to an arbitrary point on the line (this is the term $\overrightarrow{OP}$) and then we move an amount t along the line. With this procedure we can reach every point on the line, and on the other hand, if we do this, then we are guaranteed to end up on the line.
The formulas (2.11) and (2.12) are called vector equation for the line L. Note that they are the same if we set $v_1 = q_1 - p_1$, $v_2 = q_2 - p_2$, $v_3 = q_3 - p_3$. We will mostly use the notation with the v's since it is shorter. The vector $\vec v$ is called directional vector of the line L.
Question 2.3
Is it true that L passes through the origin if and only if $\overrightarrow{OP} = \vec 0$?

Remark 2.38. It is important to observe that a given line has many different parametrisations.

• The vector equation that we write down depends on the points we choose on L. Clearly, we
have infinitely many possibilities to do so.

• Any given line L has many directional vectors. Indeed, if ~v is a directional vector for L, then
c~v is so too for every c ∈ R \ {0}. However, all possible directional vectors are parallel.

Exercise. Check that the following formulas all describe the same line:

(i) $L_1 = \left\{ \begin{pmatrix}1\\2\\3\end{pmatrix} + t\begin{pmatrix}6\\5\\4\end{pmatrix} : t \in \mathbb{R} \right\}$,  (ii) $L_2 = \left\{ \begin{pmatrix}1\\2\\3\end{pmatrix} + t\begin{pmatrix}12\\10\\8\end{pmatrix} : t \in \mathbb{R} \right\}$,  (iii) $L_3 = \left\{ \begin{pmatrix}13\\12\\11\end{pmatrix} + t\begin{pmatrix}6\\5\\4\end{pmatrix} : t \in \mathbb{R} \right\}$.

Question 2.4
• How can you see easily if two given lines are parallel or perpendicular to each other?
• How would you define the angle between two lines? Do they have to intersect so that an
angle between them can be defined?

Parametric equation of a line

From the formula (2.12) it is clear that a point (x, y, z) belongs to L if and only if there exists t ∈ R
such that

x = p1 + tv1 , y = p2 + tv2 , z = p3 + tv3 . (2.13)

If we had started with (2.11), then we would have obtained

x = p1 + t(q1 − p1 ), y = p2 + t(q2 − p2 ), z = p3 + t(q3 − p3 ). (2.14)

The system of equations (2.13) or (2.14) are called the parametric equations of L. Here, t is the
parameter.
Symmetric equation of a line

Observe that for (x, y, z) ∈ L, the three equations in (2.13) must hold for the same t. If we assume that $v_1, v_2, v_3 \ne 0$, then we can solve for t and we obtain that
$$\frac{x - p_1}{v_1} = \frac{y - p_2}{v_2} = \frac{z - p_3}{v_3}. \tag{2.15}$$
If we use (2.14) then we obtain
$$\frac{x - p_1}{q_1 - p_1} = \frac{y - p_2}{q_2 - p_2} = \frac{z - p_3}{q_3 - p_3}. \tag{2.16}$$
The system of equations (2.15) or (2.16) is called the symmetric equation of L.

If, for instance, $v_1 = 0$ and $v_2, v_3 \ne 0$, then the line is parallel to the yz-plane and its symmetric equation is
$$x = p_1, \qquad \frac{y - p_2}{v_2} = \frac{z - p_3}{v_3}.$$
If $v_1 = v_2 = 0$ and $v_3 \ne 0$, then the line is parallel to the z-axis and its symmetric equation is
$$x = p_1, \qquad y = p_2, \qquad z \in \mathbb{R}.$$
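To make the relation between the parametric form (2.13) and the symmetric form (2.15) concrete, here is a small sketch (ours, not from the notes): it generates points on a line from the parametric form and checks that the three quotients of the symmetric form coincide.

```python
import numpy as np

p = np.array([1.0, 2.0, 3.0])   # a point on the line
v = np.array([6.0, 5.0, 4.0])   # a directional vector (all components non-zero)

def point_on_line(t: float) -> np.ndarray:
    """Parametric form (2.13): (x, y, z) = p + t v."""
    return p + t * v

for t in (-1.0, 0.0, 2.5):
    x, y, z = point_on_line(t)
    # Symmetric form (2.15): all three quotients coincide (and equal t).
    ratios = [(x - p[0]) / v[0], (y - p[1]) / v[1], (z - p[2]) / v[2]]
    print((x, y, z), ratios)
```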


Representations of lines in Rn.
In Rn, the vector form of a line is
$$L = \left\{ \overrightarrow{OP} + t\vec v : t \in \mathbb{R} \right\}$$
for fixed P ∈ L and a directional vector $\vec v$. Its parametric form is
$$x_1 = p_1 + tv_1,\quad x_2 = p_2 + tv_2,\quad \dots,\quad x_n = p_n + tv_n, \qquad t \in \mathbb{R},$$
and, assuming that all $v_j$ are different from 0, its symmetric form is
$$\frac{x_1 - p_1}{v_1} = \frac{x_2 - p_2}{v_2} = \dots = \frac{x_n - p_n}{v_n}.$$

Question 2.5. Normal form of a line.
In R2, there is also the normal form of a line:
$$L : ax + by = d \tag{2.17}$$
where a, b and d are fixed numbers. This means that L consists of all the points (x, y) whose coordinates satisfy the equation ax + by = d.
(i) Given a line in the form (2.17), find a vector representation.
(ii) Given a line in vector representation, find a normal form (that is, write it as (2.17)).
(iii) What is the geometric interpretation of a, b? (Hint: Draw the line L and the vector $\begin{pmatrix}a\\b\end{pmatrix}$.)
(iv) Can this normal form be extended/generalised to lines in R3? If it is possible, how can it be done? If it is not possible, explain why not.

Planes

In order to know a plane E in R3 completely, it is sufficient to know

(a) three points P, Q, R on the plane that do not lie on a common line,
or
(b) one point P on the plane and two non-parallel vectors $\vec v$, $\vec w$ which are both parallel to the plane,
or
(c) one point P on the plane and a vector $\vec n$ which is perpendicular to the plane.
First, let us see how we can pass from one description to another. Clearly, the descriptions (a) and
(b) are equivalent because given three points P, Q, R on E which do not lie on a line, we can form
# – # –
the vectors P Q and P R. These vectors are then parallel to the plane E but are not parallel to each


Figure 2.19: Plane E given by: (a) three points P, Q, R on E, (b) a point P on E and two vectors $\vec v$, $\vec w$ parallel to E, (c) a point P on E and a vector $\vec n$ perpendicular to E.

Figure 2.20: Plane E given with three points P, Q, R on E, two vectors $\overrightarrow{PQ}$, $\overrightarrow{PR}$ parallel to E, and a vector $\vec n$ perpendicular to E. Note that $\vec n \parallel \overrightarrow{PQ} \times \overrightarrow{PR}$.

other. (Of course, we also could have taken $\overrightarrow{QR}$ and $\overrightarrow{QP}$ or $\overrightarrow{RP}$ and $\overrightarrow{RQ}$.) If, on the other hand, we have one point P on E and two vectors $\vec v$ and $\vec w$ parallel to E with $\vec v \nparallel \vec w$, then we can easily get two other points on E, for instance by $\overrightarrow{OQ} = \overrightarrow{OP} + \vec v$ and $\overrightarrow{OR} = \overrightarrow{OP} + \vec w$. Then the three points P, Q, R lie on E and do not lie on a line.

Vector equation of a plane

In formulas, we can now describe our plane E as
$$E = \left\{ (x, y, z) : \begin{pmatrix}x\\y\\z\end{pmatrix} = \overrightarrow{OP} + s\vec v + t\vec w \ \text{ for some } s, t \in \mathbb{R} \right\}.$$
As in the case of the vector equation of a line, it is easy to understand the formula. We first move to an arbitrary point on the plane (this is the term $\overrightarrow{OP}$) and then we move parallel to the plane as
to an arbitrary point on the line (this is the term OP ) and then we move parallel to the plane as


we please (this is the term $s\vec v + t\vec w$). With this procedure we can reach every point on the plane, and on the other hand, if we do this, then we are guaranteed to end up on the plane.

Question 2.6
Is it true that E passes through the origin if and only if $\overrightarrow{OP} = \vec 0$?

Normal form of a plane


Now we want to use the normal vector of the plane to describe it. Assume that we are given a
point P on E and a vector ~n perpendicular to the plane. This means that every vector which is
parallel to the plane E must be perpendicular to ~n. If we take an arbitrary point Q(x, y, z) ∈ R3 ,
then Q ∈ E if and only if $\overrightarrow{PQ}$ is parallel to E, that means that $\overrightarrow{PQ}$ is orthogonal to $\vec n$. Recall that two vectors are perpendicular if and only if their inner product is 0, so Q ∈ E if and only if
$$0 = \langle\vec n, \overrightarrow{PQ}\rangle = \left\langle\begin{pmatrix}n_1\\n_2\\n_3\end{pmatrix}, \begin{pmatrix}x - p_1\\y - p_2\\z - p_3\end{pmatrix}\right\rangle = n_1(x - p_1) + n_2(y - p_2) + n_3(z - p_3) = n_1x + n_2y + n_3z - (n_1p_1 + n_2p_2 + n_3p_3).$$
If we set $d = n_1p_1 + n_2p_2 + n_3p_3$, then it follows that a point Q(x, y, z) belongs to E if and only if its coordinates satisfy
$$n_1x + n_2y + n_3z = d. \tag{2.18}$$
Equation (2.18) is called the normal form for the plane E and ~n is called a normal vector of E.
Notation 2.39. In order to define E, we write E : n1 x + n2 y + n3 z = d. As a set, we denote E as
E = {(x, y, z) : n1 x + n2 y + n3 z = d}.

Exercise. Show that E passes through the origin if and only if d = 0.

Remark 2.40. As before, note that the normal equation for a plane is not unique. For instance,
x + 2y + 3z = 5 and 2x + 4y + 6z = 10
describe the same plane. The reason is that "the" normal vector of a plane is not unique. If $\vec n$ is a normal vector of the plane E, then every $c\vec n$ with $c \in \mathbb{R}\setminus\{0\}$ is also a normal vector to the plane.

Definition 2.41. The angle between two planes is the angle between their normal vectors.

Note that this definition is consistent with the fact that two planes are parallel if and only if their
normal vectors are parallel.

Remark 2.42. • Assume a plane is given as in (b) (that is, we know a point P on E and two
vectors ~v and w ~ parallel to E but with ~v 6k w).
~ In order to find a description as in (c) (that is
one point on E and a normal vector), we only have to find a vector ~n that is perpendicular to
both ~v and w. ~ Proposition 2.32(vii) tells us how to do this: we only need to calculate ~v × w.
~
Another way to find an appropriate ~n is to find a solution of the linear 2 × 3 system given by
{h~v , ~ni = 0, hw
~ , ~ni = 0}.


• Assume a plane is given as in (c) (that is, we know a point P on E and a normal vector). In
order to find vectors ~v and w
~ as in (b), we can proceed in many ways:

– Find two solutions of h~x , ~ni = 0 which are not parallel.


# – # – # –
– Find two points Q, R on the plane such that P Q 6k P R. Then we can take ~v = P Q and
# –
w
~ = P R.
– Find one solution ~v 6= ~0 of h~n , ~v i = ~0 which is usually easy to guess and then calculate
~ = ~v × ~n. The vector w
w ~ is perpendicular to ~n and therefore it is parallel to the plane.
It is also perpendicular to ~v and therefore it is not parallel to ~v . In total, this vector w
~
does what we need.

Representations of planes in Rn.
In Rn, the vector form of a plane is
$$E = \left\{ \overrightarrow{OP} + t\vec v + s\vec w : s, t \in \mathbb{R} \right\}$$
for fixed P ∈ E and two vectors $\vec v$, $\vec w$ parallel to the plane but not parallel to each other.
Note that there is no normal form of a plane in Rn for n ≥ 4. The reason is that for n ≥ 4, there is more than one normal direction to a given plane, so a normal form of a plane E must consist of more than one equation (more precisely, it must consist of n − 2 equations of the form $n_1x_1 + \dots + n_nx_n = d$).
You should have understood
• the concept of lines and planes in R3 ,
• how they can be described in formulas,
• etc.
You should now be able to
• pass easily between the different descriptions of lines and planes,

• etc.

Exercises.

1. Show that the following equations describe the same line:
$$\left\{\begin{pmatrix}1\\2\\3\end{pmatrix} + t\begin{pmatrix}4\\5\\6\end{pmatrix} : t \in \mathbb{R}\right\},\quad \left\{\begin{pmatrix}1\\2\\3\end{pmatrix} + t\begin{pmatrix}8\\10\\12\end{pmatrix} : t \in \mathbb{R}\right\},\quad \left\{\begin{pmatrix}1\\2\\3\end{pmatrix} + t\begin{pmatrix}-4\\-5\\-6\end{pmatrix} : t \in \mathbb{R}\right\},$$
$$\left\{\begin{pmatrix}5\\7\\9\end{pmatrix} + t\begin{pmatrix}4\\5\\6\end{pmatrix} : t \in \mathbb{R}\right\},\quad \frac{x-1}{4} = \frac{y-2}{5} = \frac{z-3}{6},\quad \frac{x+3}{4} = \frac{y+3}{5} = \frac{z+3}{6}.$$
 


2. Given lines L1 and L2 and the point P,

(i) determine whether L1 and L2 are parallel,
(ii) determine whether L1 and L2 have a point of intersection,
(iii) determine whether P belongs to L1 and/or to L2,
(iv) find a line parallel to L2 that passes through P.

(a) $L_1 : \vec r(t) = \begin{pmatrix}3\\4\\5\end{pmatrix} + t\begin{pmatrix}1\\-1\\3\end{pmatrix}$,  $L_2 : \frac{x-3}{2} = \frac{y-2}{3} = \frac{z-1}{4}$,  P(5, 2, 11).

(b) $L_1 : \vec r(t) = \begin{pmatrix}2\\1\\-7\end{pmatrix} + t\begin{pmatrix}1\\2\\3\end{pmatrix}$,  $L_2 : x = t + 1,\ y = 3t - 4,\ z = -t + 2$,  P(5, 7, 2).

3. In R3 consider the plane E given by E : 3x − 2y + 4z = 16.

(a) Find at least three points that belong to E.
(b) Find a point in E and two vectors $\vec v$ and $\vec w$ in E that are not parallel to each other.
(c) Find a point in E and a vector $\vec n$ that is orthogonal to E.
(d) Find a point in E and two vectors $\vec a$ and $\vec b$ in E with $\vec a \perp \vec b$.

4. For the points P(1, 1, 1), Q(1, 0, −1) and the following planes E,

(i) find the equation of the plane,
(ii) determine whether P belongs to the plane,
(iii) find a line that is orthogonal to E and contains the point Q.

(a) E is the plane that contains the point A(1, 0, 1) and is parallel to the vectors
$\vec v = \begin{pmatrix}1\\1\\0\end{pmatrix}$ and $\vec w = \begin{pmatrix}3\\2\\1\end{pmatrix}$.

(b) E is the plane that contains the points A(1, 0, 1), B(2, 3, 4), C(3, 2, 4).

(c) E is the plane that contains the point A(1, 0, 1) and is orthogonal to the vector $\vec n = \begin{pmatrix}3\\2\\1\end{pmatrix}$.

2.7 Intersections of lines and planes in R3


Intersection of lines
Given two lines G and L in R3 , there are three possibilities:

(a) The lines intersect in exactly one point. In this case, they cannot be parallel.


(b) The lines intersect in infinitely many points. In this case, the lines have to be equal. In particular they have to be parallel.

(c) The lines do not intersect. Note that in contrast to the case in R2, the lines do not have to be parallel for this to happen. For example, the line L : x = y = 1 is a line parallel to the z-axis passing through (1, 1, 0), and G : x = z = 0 is a line parallel to the y-axis passing through (0, 0, 0). The lines do not intersect and they are not parallel.

Example 2.43. We consider four lines $L_j = \{\vec p_j + t\vec v_j : t \in \mathbb{R}\}$ with

(i) $\vec v_1 = \begin{pmatrix}1\\2\\3\end{pmatrix}$, $\vec p_1 = \begin{pmatrix}0\\0\\1\end{pmatrix}$,  (ii) $\vec v_2 = \begin{pmatrix}2\\4\\6\end{pmatrix}$, $\vec p_2 = \begin{pmatrix}2\\4\\7\end{pmatrix}$,

(iii) $\vec v_3 = \begin{pmatrix}1\\1\\2\end{pmatrix}$, $\vec p_3 = \begin{pmatrix}-1\\0\\0\end{pmatrix}$,  (iv) $\vec v_4 = \begin{pmatrix}1\\1\\2\end{pmatrix}$, $\vec p_4 = \begin{pmatrix}3\\0\\5\end{pmatrix}$.

We will calculate their mutual intersections.

L1 ∩ L2 = L1

Proof. A point Q(x, y, z) belongs to L1 ∩ L2 if and only if it belongs both to L1 and L2. This means that there must exist an s ∈ R such that $\overrightarrow{OQ} = \vec p_1 + s\vec v_1$ and there must exist a t ∈ R such that $\overrightarrow{OQ} = \vec p_2 + t\vec v_2$. Note that s and t are different parameters. So we are looking for s and t such that
$$\vec p_1 + s\vec v_1 = \vec p_2 + t\vec v_2, \qquad\text{that is}\qquad \begin{pmatrix}0\\0\\1\end{pmatrix} + s\begin{pmatrix}1\\2\\3\end{pmatrix} = \begin{pmatrix}2\\4\\7\end{pmatrix} + t\begin{pmatrix}2\\4\\6\end{pmatrix}. \tag{2.19}$$
Once we have solved (2.19) for s and t, we insert them into the equations for L1 and L2 respectively, in order to obtain Q. Note that (2.19) in reality is a system of three equations: one equation for each component of the vector equation. Writing it out and solving each equation for s, we obtain
$$\begin{aligned} 0 + s &= 2 + 2t\\ 0 + 2s &= 4 + 4t\\ 1 + 3s &= 7 + 6t \end{aligned} \qquad\Longleftrightarrow\qquad \begin{aligned} s &= 2 + 2t\\ s &= 2 + 2t\\ s &= 2 + 2t \end{aligned}$$
This means that there are infinitely many solutions of (2.19). Given any point R on L1, there is a corresponding s ∈ R such that $\overrightarrow{OR} = \vec p_1 + s\vec v_1$. Now if we choose $t = (s - 2)/2$, then $\overrightarrow{OR} = \vec p_2 + t\vec v_2$ holds, hence R ∈ L2 too. If on the other hand we have a point R′ ∈ L2, then there is a corresponding t ∈ R such that $\overrightarrow{OR'} = \vec p_2 + t\vec v_2$. Now if we choose $s = 2 + 2t$, then $\overrightarrow{OR'} = \vec p_1 + s\vec v_1$ holds, hence R′ ∈ L1 too. In summary, we showed that L1 = L2.

Remark 2.44. We could also have seen that the directional vectors of L1 and L2 are parallel. In
fact, ~v2 = 2~v1 . It then suffices to show that L1 and L2 have at least one point in common in order
to conclude that the lines are equal.


L1 ∩ L3 = {(1, 2, 4)}

Proof. As before, we need to find s, t ∈ R such that

~p1 + s~v1 = ~p3 + t~v3,   that is   (0, 0, 1)^t + s(1, 2, 3)^t = (−1, 0, 0)^t + t(1, 1, 2)^t.        (2.20)

If we write this as a system of equations, we get

(1)  0 + s = −1 + t            (1)  s − t = −1
(2)  0 + 2s = 0 + t      ⟺     (2)  2s − t = 0
(3)  1 + 3s = 0 + 2t           (3)  3s − 2t = −1

From (1) it follows that s = t − 1. Inserting in (2) gives 0 = 2(t − 1) − t = t − 2, hence t = 2. From (1) we then obtain that s = 2 − 1 = 1. Observe that so far we used only equations (1) and (2). In order to see if we really found a solution, we must check if it is consistent with (3). Inserting our candidates for s and t, we find that 3 · 1 − 2 · 2 = −1 which is consistent with (3).
So L1 and L3 intersect in exactly one point. In order to find it, we put s = 1 in the equation for L1:

OQ→ = ~p1 + 1 · ~v1 = (0, 0, 1)^t + (1, 2, 3)^t = (1, 2, 4)^t,

hence the intersection point is Q(1, 2, 4).
In order to check if this result is correct, we can put t = 2 in the equation for L3. The result must be the same. The corresponding calculation is:

OQ→ = ~p3 + 2 · ~v3 = (−1, 0, 0)^t + (2, 2, 4)^t = (1, 2, 4)^t,

which confirms that the intersection point is Q(1, 2, 4).
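The same computation can be checked numerically. The following small Python sketch (not part of the original notes; it assumes that numpy is available) solves the 3 × 2 system behind (2.20) for s and t with a least-squares routine and verifies that the two parametrisations really meet in the point Q(1, 2, 4).

import numpy as np

p1, v1 = np.array([0, 0, 1]), np.array([1, 2, 3])
p3, v3 = np.array([-1, 0, 0]), np.array([1, 1, 2])

M = np.column_stack([v1, -v3])            # unknowns are s and t
rhs = p3 - p1
(s, t), residual, rank, _ = np.linalg.lstsq(M, rhs, rcond=None)

print(s, t)                                        # approximately 1.0 and 2.0
print(np.allclose(p1 + s * v1, p3 + t * v3))       # True: the lines really intersect
print(p1 + s * v1)                                 # [1. 2. 4.]  ->  Q(1, 2, 4)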



L1 ∩ L4 = ∅

Proof. As before, we need to find s, t ∈ R such that

~p1 + s~v1 = ~p4 + t~v4,   that is   (0, 0, 1)^t + s(1, 2, 3)^t = (3, 0, 5)^t + t(1, 1, 2)^t.        (2.21)

If we write this as a system of equations, we get

(1)  s = 3 + t               (1)  s − t = 3
(2)  2s = t           ⟺      (2)  2s − t = 0
(3)  1 + 3s = 5 + 2t         (3)  3s − 2t = 4


From (1) it follows that s = t + 3. Inserting in (2) gives 0 = 2(t + 3) − t = t + 6, hence t = −6. From (1) we then obtain that s = −6 + 3 = −3. Observe that so far we used only equations (1) and (2). In order to see if we really found a solution, we must check if it is consistent with (3). Inserting our candidates for s and t, we find that 3 · (−3) − 2 · (−6) = 3 which is inconsistent with (3).
Therefore we conclude that there is no pair of real numbers s, t which satisfies all three equations (1)–(3) simultaneously, so the two lines do not intersect.

Exercise. Show that L3 ∩ L4 = ∅.

Intersection of planes
Given two planes E1 and E2 in R3 , there are two possibilities:

(a) The planes intersect. In this case, they necessarily intersect in infinitely many points. Their
intersection is either a line (if E1 and E2 are not parallel) or a plane (if E1 = E2 ).

(b) The planes do not intersect. In this case, the planes must be parallel and not equal.

Example 2.45. We consider the following four planes:

E1 : x + y + 2z = 3,   E2 : 2x + 2y + 4z = −4,   E3 : 2x + 2y + 4z = 6,   E4 : x + y − 2z = 5.

We will calculate their mutual intersections.

E1 ∩ E2 = ∅
Proof. The set of all points Q(x, y, z) which belong both to E1 and E2 is the set of all x, y, z which simultaneously satisfy

(1)  x + y + 2z = 3,
(2)  2x + 2y + 4z = −4.

Now clearly, if x, y, z satisfies (1), then it cannot satisfy (2) (the right hand side would have to be 6). We can see this more formally if we solve (1), e.g., for x and then insert into (2). We obtain from (1): x = 3 − y − 2z. Inserting into (2) leads to

−4 = 2(3 − y − 2z) + 2y + 4z = 6,

which is absurd.
This result was to be expected since the normal vectors of the planes are ~n1 = (1, 1, 2)^t and ~n2 = (2, 2, 4)^t respectively. Since they are parallel, the planes are parallel and therefore they either are equal or they have empty intersection. Now we see that for instance (3, 0, 0) ∈ E1 but (3, 0, 0) ∉ E2, so the planes cannot be equal. Therefore they have empty intersection.

E1 ∩ E3 = E1


Proof. The set of all points Q(x, y, z) which belong both to E1 and E3 is the set of all x, y, z which simultaneously satisfy

(1)  x + y + 2z = 3,
(2)  2x + 2y + 4z = 6.

Clearly, both equations are equivalent: if x, y, z satisfies (1), then it also satisfies (2) and vice versa. Therefore, E1 = E3.

E1 ∩ E4 = { (4, 0, −1/2)^t + t(−1, 1, 0)^t : t ∈ R }

Proof. First, we notice that the normal vectors ~n1 = (1, 1, 2)^t and ~n4 = (1, 1, −2)^t are not parallel, so we expect that the solution is a line in R3.
The set of all points Q(x, y, z) which belong both to E1 and E4 is the set of all x, y, z which simultaneously satisfy

(1)  x + y + 2z = 3,
(2)  x + y − 2z = 5.

Equation (1) shows that x = 3 − y − 2z. Inserting into (2) leads to 5 = 3 − y − 2z + y − 2z = 3 − 4z, hence z = −1/2. Putting this into (1), we find that x + y = 3 − 2z = 4. So in summary, the intersection consists of all points (x, y, z) which satisfy

z = −1/2,   x = 4 − y   with y ∈ R arbitrary,

in other words,

(x, y, z)^t = (4 − y, y, −1/2)^t = (4, 0, −1/2)^t + y(−1, 1, 0)^t   with y ∈ R arbitrary.
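For plane–plane intersections the computation can also be organised around the normal vectors: the direction of the intersection line is ~n1 × ~n4, and a particular point can be obtained by fixing one coordinate. The following Python sketch (an illustration assuming numpy; not part of the original notes) reproduces the result above.

import numpy as np

n1, d1 = np.array([1.0, 1.0, 2.0]), 3.0      # E1: x + y + 2z = 3
n4, d4 = np.array([1.0, 1.0, -2.0]), 5.0     # E4: x + y - 2z = 5

direction = np.cross(n1, n4)                 # (-4, 4, 0), i.e. parallel to (-1, 1, 0)

# particular point with y = 0:  x + 2z = 3  and  x - 2z = 5
A = np.array([[n1[0], n1[2]],
              [n4[0], n4[2]]])
x, z = np.linalg.solve(A, np.array([d1, d4]))
point = np.array([x, 0.0, z])

print(direction)    # [-4.  4.  0.]
print(point)        # [ 4.   0.  -0.5]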

Intersection of a line with a plane


Finally we want to calculate the intersection of a plane E with a line L. There are three possibilities:

(a) The plane and the line intersect in exactly one point. This happens if and only if L is not parallel to E, which is the case if and only if the directional vector of L is not perpendicular to the normal vector of E.

(b) The plane and the line do not intersect. In this case, E and L must be parallel, that is, the directional vector of L must be perpendicular to the normal vector of E.


Figure 2.21: The left figure shows E1 ∩ E2 = ∅, the right figure shows E1 ∩ E4 which is a line.

(c) The plane and the line intersect in infinitely many points. In this case, L lies in E, that is, E
and L must be parallel and they must share at least one point.
As an example we calculate E1 ∩ L2 . Since L2 is clearly not parallel to E1 , we expect that their
intersection consists of exactly one point.

E1 ∩ L2 = {(1/9, 2/9, 4/3)}

Proof. The set of all points Q(x, y, z) which belong both to E1 and L2 is the set of all x, y, z which simultaneously satisfy

x + y + 2z = 3   and   x = 2 + 2t, y = 4 + 4t, z = 7 + 6t   for some t ∈ R.

Substituting the parametric expressions from L2 into the equation of the plane E1, we obtain the following equation for t:

3 = (2 + 2t) + (4 + 4t) + 2(7 + 6t) = 20 + 18t   =⇒   t = −17/18.

Substituting this t into the equation for L2 gives the point of intersection Q(1/9, 2/9, 4/3).

In order to check our result, we insert the coordinates in the equation for E1 and obtain x+y +2z =
1/9 + 2/9 + 2 · 4/3 = 1/3 + 8/3 = 3 which shows that Q ∈ E1 .
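Since the whole computation consists of substituting the parametric equations of the line into the equation of the plane, it is easy to automate. The following Python sketch (an illustration assuming numpy; not part of the original notes) recomputes t and the intersection point.

import numpy as np

n, d = np.array([1.0, 1.0, 2.0]), 3.0                            # plane E1
p2, v2 = np.array([2.0, 4.0, 7.0]), np.array([2.0, 4.0, 6.0])    # line L2

# <n, p2 + t*v2> = d  has a unique solution because <n, v2> != 0 (L2 is not parallel to E1)
t = (d - n @ p2) / (n @ v2)
print(t)             # -17/18 = -0.944...
print(p2 + t * v2)   # [0.111... 0.222... 1.333...] = (1/9, 2/9, 4/3)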
Let us calculate two more examples.

Example 2.46. We consider the plane E and the lines L, G given by

E : x + 2y − z = 2,   L : x + 1 = y = (z − 2)/3   and   G : (x − 3)/2 = y − 2 = (z − 5)/4.

Note that the normal vector of E and the directional vectors of L and G are

~n = (1, 2, −1)^t,   ~vL = (1, 1, 3)^t   and   ~vG = (2, 1, 4)^t.


Observe that ~n ⊥ ~vL and ~n ⊥ ~vG. Therefore L ∥ E and G ∥ E (but L ∦ G) and we expect for each of the lines that it either does not intersect E or that it lies in E.

E ∩ L = ∅.  The parametric equations of L are

x = −1 + t,   y = t,   z = 2 + 3t.

If we replace this in the formula for E, we obtain

2 ?= (−1 + t) + 2t − (2 + 3t) = −3

which is absurd. Therefore E and L do not have any point in common. In other words, they do not intersect.

E ∩ G = G.  The parametric equations of G are

x = 3 + 2t,   y = 2 + t,   z = 5 + 4t.

Replacing this in the formula for E we obtain

2 ?= (3 + 2t) + 2(2 + t) − (5 + 4t) = 2

which is true for any t ∈ R. Therefore every point of G belongs also to E. In other words, G ⊆ E, or E ∩ G = G.

Remark. Recall that in the example above ~n ⊥ ~vL and ~n ⊥ ~vG. This means that L is parallel to E, therefore it must either lie completely in E or not intersect E at all. The same is true for G. So if we take an arbitrary point on L and this point belongs also to E, then the intersection E ∩ L is not empty, and then we must have E ∩ L = L. If that point does not belong to E, then E and L cannot intersect (because otherwise L ⊆ E and the point would be in E too). For instance, it is easy to see that the point P(−1, 0, 2) belongs to L (just take t = 0 in its parametric equation). Let us put the coordinates of P in the formula for E:

−1 + 2 · 0 − 2 = −3 ≠ 2.

Therefore P ∉ E and consequently E ∩ L = ∅.


On the other hand, if we choose Q(3, 2, 5), then we easily see that Q ∈ G. Plugging its coordinates

in the formula for E we find that


3 + 2 · 2 − 5 = 2,
therefore Q ∈ E and consequently G ⊆ E.

Intersection of several lines and planes


If we wanted to intersect, for instance, 5 planes in R3, then we would have to solve a system of 5 equations for 3 unknowns. Or if we wanted to find a common point of 7 lines in R3 given in parametric form, then we would have to solve a system of 18 equations for the 7 parameters. If we solve them as we did here, the process can become quite messy. So the next chapter is devoted to finding a systematic and efficient way to solve a system of m linear equations for n unknowns.


You should have understood


• what intersections of lines and planes can be geometrically and how they depend on their relative orientation,
• the interpretation of a linear system with three unknowns as the intersection of planes in
R3 ,
• etc.
You should now be able to

• calculate the intersection of lines and planes,


• etc.

Exercises.
1. Consider the plane E : 2x − y + 3z = 9 and the line L : x = 3t + 1, y = −2t + 3, z = 4t.

(a) Find E ∩ L.

(b) Find a line G which intersects neither the plane E nor the line L. Prove your claim. How many lines with this property are there?

2. Let L be the line which passes through the point (1, 1, 1) and is parallel to the vector (2, −1, 4)^t. Show that L does not intersect the plane E : x − 2y − z = 1.

3. Given the plane E : 3x − 4y + 2z = 5, find

(a) a point P ∈ E whose x-coordinate is 2;
(b) a point Q ∉ E whose x-coordinate is 2;
(c) a line L parallel to E which passes through the point (3, 1, 5);
(d) a line L perpendicular to E which passes through the point (3, 1, 5).

4. Given the line L : x = 3t − 2, y = 2t + 5, z = t + 3, find, or explain why it does not exist,

(a) a point P ∈ L whose x-coordinate is 2;
(b) a point Q ∉ L whose x-coordinate is 2;
(c) a plane E parallel to L which passes through the point (3, 1, 5);
(d) a plane E perpendicular to L which passes through the point (3, 1, 5);
(e) a plane E which contains the line L and passes through the point (2, −5, 1).

5. Of two planes E and F in R3 it is known that they are not parallel and that the points A(1, 2, 3) and B(4, 0, −3) belong to E ∩ F. Can you conclude what E ∩ F is? Give two examples of planes E and F with this property.

6. A hiker starts at time t = 0 at the point (1, 2) with velocity ~vc = (3, 1)^t. There is a cyclist at the point (12, −3) and a refreshment stand at the point (16, 7). The hiker and the cyclist both move along straight lines with constant velocity.
(a) Show that the hiker passes the refreshment stand. At which time t does he pass this point?
(b) In which direction must the cyclist ride in order to also pass the refreshment stand?
(c) Suppose the cyclist starts at the same time as the hiker. How must the cyclist choose his velocity if he wants to meet the hiker at the refreshment stand? How must he choose it if he wants to pass the stand before the hiker does?
(d) Now suppose the cyclist moves with velocity ~w = (3, 7.5)^t. Show that the cyclist passes the refreshment stand. At what time must he start in order to meet the hiker there?

2.8 Summary
The vector space Rn is given by

Rn = { (x1, . . . , xn)^t : x1, . . . , xn ∈ R }.

For points P(p1, . . . , pn), Q(q1, . . . , qn), the vector whose initial point is P and whose final point is Q is

PQ→ = (q1 − p1, . . . , qn − pn)^t   and   OQ→ = (q1, . . . , qn)^t   where O denotes the origin.

On Rn, the sum and the product with scalars are defined by

~v + ~w = (v1 + w1, . . . , vn + wn)^t,        c~v = (cv1, . . . , cvn)^t.

The norm of a vector is

||~v|| = √(v1^2 + · · · + vn^2).

If ~v = PQ→, then ||~v|| = ||PQ→|| = distance between P and Q.
For vectors ~v and ~w ∈ Rn their inner product is a real number defined by

⟨~v, ~w⟩ = v1w1 + · · · + vnwn.


Important formulas involving the inner product

• ⟨~v, ~w⟩ = ⟨~w, ~v⟩,   ⟨~v, c~w⟩ = c⟨~v, ~w⟩,   ⟨~v, ~w + ~u⟩ = ⟨~v, ~w⟩ + ⟨~v, ~u⟩,
• ⟨~v, ~w⟩ = ||~v|| ||~w|| cos ϕ,
• ||~v + ~w|| ≤ ||~v|| + ||~w||   (triangle inequality),
• ~v ⊥ ~w ⟺ ⟨~v, ~w⟩ = 0,
• ⟨~v, ~v⟩ = ||~v||^2.

The cross product is defined only in R3. It is a vector defined by

~v × ~w = (v1, v2, v3)^t × (w1, w2, w3)^t = (v2w3 − v3w2, v3w1 − v1w3, v1w2 − v2w1)^t.

Important formulas involving the cross product

• ~u × ~v = −~v × ~u,   ~u × (~v + ~w) = (~u × ~v) + (~u × ~w),   (c~u) × ~v = c(~u × ~v),
• ⟨~u, ~v × ~w⟩ = ⟨~u × ~v, ~w⟩,
• ~u ∥ ~v ⟺ ~u × ~v = ~0,
• ⟨~u, ~u × ~v⟩ = 0 and ⟨~v, ~u × ~v⟩ = 0; in particular ~v ⊥ ~v × ~u and ~u ⊥ ~v × ~u,
• ||~v × ~w|| = ||~v|| ||~w|| sin ϕ.

Applications

• Area of the parallelogram spanned by ~v, ~w ∈ R3:  A = ||~v × ~w||.
• Volume of the parallelepiped spanned by ~u, ~v, ~w ∈ R3:  V = |⟨~u, ~v × ~w⟩|.

Representations of lines

• Vector equation  L = { OP→ + t~v : t ∈ R }.
  P is a point on the line, ~v is called a directional vector of L.
• Parametric equation  x1 = p1 + tv1, . . . , xn = pn + tvn,  t ∈ R.
  Then P(p1, . . . , pn) is a point on L and ~v = (v1, . . . , vn)^t is a directional vector of L.
• Symmetric equation  (x1 − p1)/v1 = (x2 − p2)/v2 = · · · = (xn − pn)/vn.
  Then P(p1, . . . , pn) is a point on L and ~v = (v1, . . . , vn)^t is a directional vector of L.
  If one or several of the vj are equal to 0, then the formula above has to be modified.


Representations of planes

• Vector equation  E = { OP→ + t~v + s~w : s, t ∈ R }.
  P is a point on the plane, ~v and ~w are vectors parallel to E with ~v ∦ ~w.
• Normal form (only in R3!)  E : ax + by + cz = d.
  The vector ~n = (a, b, c)^t formed by the coefficients on the left hand side is perpendicular to E. Moreover, E passes through the origin if and only if d = 0.

The parametrisations are not unique! (One and the same line (or plane) has many different parametrisations.)

• The angle between two lines is the angle between their directional vectors.
• Two lines are parallel if and only if their directional vectors are parallel.
  Two lines are perpendicular if and only if their directional vectors are perpendicular.
• The angle between two planes is the angle between their normal vectors.
• Two planes are parallel if and only if their normal vectors are parallel.
  Two planes are perpendicular if and only if their normal vectors are perpendicular.
• A line is parallel to a plane if and only if its directional vector is perpendicular to the normal vector of the plane.
  A line is perpendicular to a plane if and only if its directional vector is parallel to the normal vector of the plane.

2.9 Exercises

1. Let ~a = (2, −3)^t and ~b = (−1, 4)^t. Find vectors ~u, ~w which satisfy all of the following conditions:
(a) ~a = ~u + ~w.
(b) ~u ∥ ~b.
(c) ~w ⊥ ~b.

2. Let ~a = (1, −3)^t. Find ~b ∈ R2 such that ~a ⊥ ~b and ||~a|| = ||~b||. Repeat the exercise for ~a = (2, 3)^t, ~a = (1, 4)^t and ~a = (x, y)^t.

3. Let ~a = (1, −2, 3)^t and ~b = (1, 2, −4)^t. Find scalars x, y such that x~a + y~b ⊥ ~b and x~a + y~b ≠ ~0.

4. Let ~a, ~b, ~c ∈ Rn. Decide whether the following statements are true or false. If a statement is true, prove it; if it is false, give a counterexample.
(a) If ⟨~a, ~b⟩ = ⟨~a, ~c⟩ and ~a ≠ ~0, then ~b = ~c.
(b) If there exists a vector ~b with ⟨~a, ~b⟩ = 0, then ~a = 0.
(c) If ⟨~a, ~b⟩ = 0 for every vector ~b, then ~a = 0.
(d) Let n = 3. If ⟨~a, ~b⟩ = 0 and ~a × ~b = 0, then ~a = 0 or ~b = 0. (Find a geometric interpretation before attempting this item.)

5. Let ~a = (1, −1, 2)^t and ~b = (0, 3, −1)^t. Find at least 3 distinct vectors ~c such that the volume of the parallelepiped generated by ~a, ~b, ~c is 1. What is the geometric locus described by all vectors ~c with this property?

6. Show that in R3 there is no unit vector whose direction angles are π/6, π/10, π/3.

7. A comet shoots off from the point P(1, 0, 3) and moves with velocity (3, 5, −2)^t; at the same time an asteroid shoots off from the point R(3, 4, −1) and moves with velocity vector (2, 3, 0)^t. Assuming that both objects move along straight lines, at constant velocity, and that no other celestial body perturbs their trajectories, answer the following questions:
(a) i. What are the vector equations that describe the trajectories of the asteroid and of the comet?
ii. Do the two objects collide? At what time do they do so?
iii. If the comet leaves a trail of ice along its trajectory and the asteroid leaves a trail of dust along its path, do the ice of the comet and the dust of the asteroid mix at some point in space?


(b) Repeat the preceding questions assuming that the asteroid starts from R(10, 5, 2) with velocity (3, 15, −7)^t.


8. Consider an inclined wall which is given by the equation E : 2x − 3y + 0.5z = 4.

(a) Show that the point Q(4, 2, 4) lies on the wall.

(b) At the point P(2, 0, 1) there is a laser. In which direction must it point in order to mark the point Q on the wall?

(c) Which point on the wall does the laser of the previous item mark if it points in the direction (3, 2/3, 1)^t?

9. A company produces weighted sacks. There are three materials with which the sacks can be filled: material A has a density of 1 kg/l, material B has a density of 2 kg/l and material C has a density of 3 kg/l. Each sack must weigh 20 kg.

(a) Interpret the information given in the exercise as a plane in R3 whose axes represent the amount of each material.
(b) If a sack already contains 5 l of material A and 3 l of material B, how many litres of material C have to be added in order to complete the sack?
(c) How can the sacks be put together if in addition the volume of each sack is required to be 13 l? How many possibilities are there? Interpret your calculations as the intersection of two planes.
(d) A sack already contains 2 l of material A, 5 l of material B and 1 l of material C. The company wants to complete it with a mixture of the three materials consisting of 20% material A, 50% material B and 30% material C. How many litres of this mixture have to be added so that the sack weighs 20 kg? Interpret your calculations as the intersection of a plane with a line.

10. In R3 let E be a plane and L a line.

(a) Show that E and L intersect in exactly one point if the directional vector of L and the normal vector of E are not perpendicular.
(b) Show that E and L do not intersect if the directional vector of L is perpendicular to the normal vector of E and E contains no point of L.


(c) Show that L is contained in E if the directional vector of L is perpendicular to the normal vector of E and E contains at least one point of L.

11. Find the vector equation of the lines which satisfy:
(a) Contains (7, 1, 3) and (−1, −2, −3).
(b) Contains (−2, 3, −2) and is parallel to 4~e2.
(c) Contains (−2, 3, 7) and is perpendicular to 2~e1.
(d) Contains (4, 1, −4) and is parallel to the line given by (x − 2)/3 = (y + 1)/6 = (z − 5)/2.

12. (a) In R2 consider the line L : ax + by = c and a point P(x1, y1) not on L. Show that the distance d from P to L is given by the formula

d = |ax1 + by1 − c| / √(a^2 + b^2).

(Hint: Recall the geometric meaning of the vector (a, b)^t for the line L. Use projections.)
(b) In R3 consider the plane E : ax + by + cz = d and a point P(x1, y1, z1) not in E. Show that the distance d from P to E is given by the formula

d = |ax1 + by1 + cz1 − d| / √(a^2 + b^2 + c^2).

(Hint: It is the same reasoning as in the previous item.)

13. In R3 consider a line given in vector form as L = OP→ + t~v.
(a) Find the distance from the line L to the origin. (Hint: Find t ∈ R with OP→ + t~v ⊥ ~v.)
(b) Use the previous item to find the distance between the line L and the origin where L is

(4, 3, 7)^t + t(1, 2, 0)^t,   t ∈ R.

14. Let E be the plane 3x + y + z = 1 and P(−6, 4, 4). This exercise aims at finding the point in E which is closest to P.
(a) Verify that P ∉ E.
(b) Find the equation of the line L which is parallel to the normal vector of E and passes through P.
(c) Find the point of intersection of E and L; call it Q.
(d) Verify that the distance between the point obtained and P is the same as the distance from P to the plane E (Exercise 12, part (b)).
(e) Justify why Q is the point in E which is closest to P.

15. Let ~a, ~b ∈ R2 with ~a ≠ ~0.


(a) Show that ||proj~a ~b|| ≤ ||~b||.

(b) What do ~a and ~b have to satisfy so that ||proj~a ~b|| = ||~b||?
(c) Is it true that ||proj~b ~a|| ≤ ||~b||?

16. Let L = (1, 2, 3)^t + t(1, 1, 1)^t.

(a) Show that (2, 3, 5) does not belong to L.

(b) Find a plane which contains L and passes through (2, 3, 5). How many planes satisfying this condition exist?

17. Let
L1 : (x + 3)/2 = y − 4 = (z + 2)/7
and let L2 be given by its parametric equations:

x = −3 + s,   y = 2 − 4s,   z = 1 + 6s.

(a) Do L1 and L2 intersect?

(b) Find a plane F which is perpendicular to L1 and passes through (3, 2, 1).
(c) Find the normal equation of a plane E which is parallel to both L1 and L2 and passes through the origin. How many planes with this property are there?
(d) Is there a plane which contains both lines L1 and L2? If the answer is negative, which conditions do two lines have to satisfy so that there exists a plane containing both of them?

18. In R3 consider the plane E given by E : 3x − 2y + 4z = 16.

(a) Show that the vectors ~a = (2, 1, −1)^t, ~b = (2, 5, 1)^t and ~v = (2, 3, 0)^t are parallel to the plane E.
(b) Find numbers λ, µ ∈ R such that λ~a + µ~b = ~v.
(c) Show that the vector ~c = (1, 1, 1)^t is not parallel to the plane E and find vectors ~c∥ and ~c⊥ such that ~c∥ is parallel to E, ~c⊥ is orthogonal to E and ~c = ~c∥ + ~c⊥.

19. Let E be a plane in R3 and let ~a, ~b be vectors parallel to E. Show that for all λ, µ ∈ R the vector λ~a + µ~b is parallel to the plane.

20. Let V be a vector space. Prove the following:


(a) The neutral element is unique.

(b) 0v = O for every v ∈ V.
(c) λO = O for every λ ∈ R.
(d) For every v ∈ V, its inverse ṽ is unique.
(e) For every v ∈ V, its inverse ṽ satisfies ṽ = (−1)v.

21. For each of the following sets decide whether it is a vector space with its usual sum and product.

(a) V = { (a, a)^t : a ∈ R },
(b) V = { (a, a^2)^t : a ∈ R },
(c) V is the set of all continuous functions R → R.
(d) V is the set of all continuous functions f : R → R with f(4) = 0.
(e) V is the set of all continuous functions f : R → R with f(4) = 1.

Chapter 3

Linear Systems and Matrices

We will rewrite linear systems as matrix equations in order to solve them systematically and efficiently. We will interpret matrices as linear maps from Rn to Rm, which then allows us to define algebraic operations with matrices; specifically we will define the sum and the composition (= multiplication) of matrices, which then leads naturally to the concept of the inverse of a matrix. We can interpret a matrix as a system which takes some input (the variables x1, . . . , xn) and gives us back as output b1, . . . , bm via A~x = ~b. Sometimes we are given the input and we want to find the bj; and sometimes we are given the output b1, . . . , bm and we want to find the input x1, . . . , xn which produces the desired output. The latter question is usually the harder one. We will see that a unique input for any given output exists if and only if the matrix is invertible. We can refine the concept of invertibility of a matrix: we say that A has a left inverse if for any ~b the equation A~x = ~b has at most one solution, and we say that it has a right inverse if A~x = ~b has at least one solution for any ~b.
We will discuss in detail the Gauß and Gauß-Jordan elimination which help us to find solutions of a given linear system and the inverse of a matrix if it exists. In Section 3.7 we define the transposition of matrices and we have a first look at symmetric matrices. They will become important in Chapter 8. We will also see the interplay of transposing a matrix and the inner product. In the last section of this chapter we define the so-called elementary matrices which can be seen as the building blocks of invertible matrices. We will use them in Chapter 4 to prove important properties of the determinant.

3.1 Linear systems and Gauß and Gauß-Jordan elimination


We start with a linear system as in Definition 1.8:

a11 x1 + a12 x2 + · · · + a1n xn = b1


a21 x1 + a22 x2 + · · · + a2n xn = b2
.. .. .. (3.1)
. . .
am1 x1 + am2 x2 + · · · + amn xn = bm .


Recall that the system is called consistent if it has at least one solution; otherwise it is called
inconsistent. According to (1.4) and (1.5) its associated coefficient matrix and augmented coefficient
matrices are

A = ( a11  a12  . . .  a1n )
    ( a21  a22  . . .  a2n )                                      (3.2)
    (  .                .  )
    ( am1  am2  . . .  amn )

and

(A|b) = ( a11  a12  . . .  a1n | b1 )
        ( a21  a22  . . .  a2n | b2 )                             (3.3)
        (  .                .  |  . )
        ( am1  am2  . . .  amn | bm ).

Definition 3.1. The set of all matrices with m rows and n columns is denoted by M (m × n). If we

want to emphasise that the matrix has only real entries, then we write M (m × n, R) or MR (m × n).
Another frequently used notation is Mm×n . A matrix A is called a square matrix if its number of
rows is equal to its number of columns.

In order to solve (3.1), we could use the first equation, solve for x1 and insert this in all the other
equations. This gives us a new system with m − 1 equations for n − 1 unknowns. Then we solve
the next equation for x2 , insert it in the other equations, and we continue like this until we have
only one equation left. This of course will fail if for example a11 = 0 because in this case we cannot
solve the first equation for x1 . We could save our algorithm by saying: we solve the first equation
for the first unknown whose coefficient is different from 0 (or we could take an equation where the
coefficient of x1 is different from 0 and declare this one to be our first equation. After all, we can
order the equations as we please). Even with this modification, the process of solving and replacing
is error prone.
Another idea is to manipulate the equations. The question is: Which changes to the equations
are allowed without changing the information contained in the system? We don’t want to destroy
information (thus potentially allowing for more solutions) nor introduce more information (thus
potentially eliminating solutions). Or, in more mathematical terms, what changes to the given
system of equations result in an equivalent system? Here we call two systems equivalent if they
have the same set of solutions.
We can check if the new system is equivalent to the original one, if there is a way to restore the
original one.
For example, if we exchange the first and the second row, then nothing really happened and we end
up with an equivalent system. We can come back to the original equation by simply exchanging
again the first and the second row.
If we multiply both sides of the first equation by some factor, let's say by 2, then again nothing changes. Assume for example that the first equation is x + 3y = 7. If we multiply both sides by 2, we obtain 2x + 6y = 14. Clearly, if a pair (x, y) satisfies the first equation, then it satisfies also the second one and vice versa. Given the new equation 2x + 6y = 14, we can easily restore the old one by simply dividing both sides by 2.


If we take an equation and multiply both of its sides by 0, then we destroy information because we
end up with 0 = 0 and there is no way to get back the information that was stored in the original
equation. So this is not an allowed operation.
Show that squaring both sides of an equation in general does not give an equivalent equation. Are there cases when it does?
Squaring an equation or taking the logarithm on both sides or other such things usually are not
interesting to us because the resulting equation will no longer be a linear equation.
Let us denote the jth row of our linear system (3.1) by Rj. The following table contains the so-called elementary row operations. They are the "allowed" operations because they do not alter the information contained in a given linear system since they are reversible.
The first column describes the operation in words, the second introduces its shorthand notation and the last column gives the inverse operation which allows us to get back to the original system.

Elementary operation                                          Notation           Inverse operation
1  Swap rows j and k.                                         Rj ↔ Rk            Rj ↔ Rk
2  Multiply row j by some λ ≠ 0.                              Rj → λRj           Rj → (1/λ)Rj
3  Replace row k by the sum of row k and λ times row j        Rk → Rk + λRj      Rk → Rk − λRj
   and leave row j unchanged (j ≠ k).
Exercise. Show that the operation in the third column reverses the operation from the second
column.

Exercise. Show that instead of the operation 3 it suffices to take 3' : Rk → Rk + Rj because 3 can be written as a composition of operations of the form 2 and 3'. Show how this can be done.

Exercise. Show that in reality 1 is not necessary since it can be achieved by a composition of
operations of the form 2 and 3 (or 2 and 3’ ). Show how this can be done.

Let us see in an example how this works.

Example 3.2.

x1 + x2 − x3 = 1,   2x1 + 3x2 + x3 = 3,   4x2 + x3 = 7

R2 → R2 − 2R1:     x1 + x2 − x3 = 1,   x2 + 3x3 = 1,   4x2 + x3 = 7
R3 → R3 − 4R2:     x1 + x2 − x3 = 1,   x2 + 3x3 = 1,   −11x3 = 3
R3 → −(1/11)R3:    x1 + x2 − x3 = 1,   x2 + 3x3 = 1,   x3 = −3/11.


Here we can stop because it is already quite easy to read off the solution. Proceeding from the
bottom to the top, we obtain

x3 = −3/11, x2 = 1 − 3x3 = 20/11, x1 = 1 + x3 − x2 = −12/11.

Note that we could continue our row manipulations to clean up the system even more:

· · · →            x1 + x2 − x3 = 1,   x2 + 3x3 = 1,   x3 = −3/11
R2 → R2 − 3R3:     x1 + x2 − x3 = 1,   x2 = 20/11,     x3 = −3/11
R1 → R1 + R3:      x1 + x2 = 8/11,     x2 = 20/11,     x3 = −3/11
R1 → R1 − R2:      x1 = −12/11,        x2 = 20/11,     x3 = −3/11

Our strategy was to apply manipulations that successively eliminate the unknowns in the lower
equations and we aimed to get to a form of the system of equations where the last one contains the
least number of unknowns possible.

Convince yourself that the first step of our reduction process is equivalent to solving the first equation for x1 and inserting it into the other equations in order to eliminate it there. The next step in the reduction is equivalent to solving the new second equation for x2 and inserting it into the third equation.

It is important to note that there are infinitely many different routes leading to the final result,
but usually some are quicker than others.

Let us analyse what we did. We looked at the coefficients of the system and we applied transformations such that they become 0 because this results in removing the corresponding unknowns from the equations. So in the example above we could just as well delete all the xj, keep only the augmented coefficient matrix and perform the row operations on the matrix. Of course, we have to remember that the numbers in the first column are the coefficients of x1, those in the second column are the coefficients of x2, etc. Then our calculations are translated into the following (we write the rows of a matrix separated by semicolons):

[ 1 1 −1 | 1 ;  2 3 1 | 3 ;  0 4 1 | 7 ]
  → (R2 → R2 − 2R1) →   [ 1 1 −1 | 1 ;  0 1 3 | 1 ;  0 4 1 | 7 ]
  → (R3 → R3 − 4R2) →   [ 1 1 −1 | 1 ;  0 1 3 | 1 ;  0 0 −11 | 3 ]
  → (R3 → −(1/11)R3) →  [ 1 1 −1 | 1 ;  0 1 3 | 1 ;  0 0 1 | −3/11 ].


If we translate this back into a linear system, we get

x1 + x2 − x3 = 1
x2 + 3x3 = 1
x3 = −3/11

which can be easily solved from the bottom up.


We did exactly the same calculations as we did with the system of equations but it looks much
tidier in matrix notation since we do not have to write down the unknowns all the time.
If we want to solve a linear system we write it as an augmented matrix and then we perform row
operations until we reach a “nice” form where we can read off the solutions if there are any.
But what is a “nice” form? Remember that if a coefficient is 0, then the corresponding unknown
does not show up in the equation.

• All rows with only zeros should be at the bottom.


• In the first non-zero equation from the bottom, we want as few unknowns as possible and we want them to be the last unknowns. So as last row we want one that has only zeros in it, or one that starts with zeros until finally we get a non-zero number, say in column k. This non-zero number can always be made equal to 1 by dividing the row by it. Now we know how the unknowns xk, . . . , xn are related. Note that all the other unknowns x1, . . . , xk−1 have disappeared from the equation since their coefficients are 0.
If k = n, as in our example above, then there is only one solution for xn.
• The second non-zero row from the bottom should also start with zeros until we get to a column, say column l, with a non-zero entry which we can always make equal to 1. This column should be to the left of column k (that is, we want l < k), because now we can use what we know from the last row about the unknowns xk, . . . , xn to say something about the unknowns xl, . . . , xk−1.
• We continue like this until all rows are as we want them.

Note that the form of such a “nice” matrix looks a bit like it had a triangle consisting of only zeros
in its lower left part. There may be zeros in the upper right part. If a matrix has the form we just
described, we say it is in row echelon form. Let us give a precise definition.

Definition 3.3 (Row echelon form). We say that a matrix A ∈ M (m × n) is in row echelon
form if:

• All its zero rows are the last rows.


• The first non-zero entry in a row is 1. It is called the pivot of the row.
• The pivot of any row is strictly to the right of that of the row above.

Definition 3.4 (Reduced row echelon form). We say that a matrix A ∈ M (m×n) is in reduced
row echelon form if:

• A is in row echelon form.


• All the entries in A which are above a pivot are equal to 0.


Let us quickly see some examples.

Examples 3.5.
(a) The following matrices are in reduced row echelon form. The pivots are the leading 1's.

[ 1 6 0 1 ; 0 0 1 1 ],   [ 1 0 0 0 ; 0 0 1 1 ],   [ 1 6 0 0 ; 0 0 1 0 ; 0 0 0 1 ; 0 0 0 0 ],
[ 1 1 0 0 ; 0 0 1 0 ; 0 0 0 1 ; 0 0 0 0 ],   [ 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 ].

(b) The following matrices are in row echelon form but not in reduced row echelon form. The pivots are the leading 1's.

[ 1 6 3 1 ; 0 0 1 1 ],   [ 1 0 2 0 ; 0 0 1 1 ],   [ 1 6 1 0 ; 0 0 1 1 ; 0 0 0 1 ; 0 0 0 0 ],
[ 1 6 3 1 ; 0 0 1 4 ; 0 0 0 1 ; 0 0 0 0 ],   [ 1 0 5 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 ].

(c) The following matrices are not in row echelon form:

[ 0 0 0 1 ; 0 0 1 0 ; 0 1 0 0 ; 1 0 0 0 ],   [ 1 6 0 0 ; 2 0 1 0 ; 0 0 1 1 ; 3 0 0 1 ],
[ 1 6 0 1 ; 0 0 0 0 ; 0 6 0 0 ],   [ 0 3 1 1 ; 1 0 0 0 ].

Exercise. • Say why the matrices in (b) are not in reduced row echelon form and use ele-
mentary row operations to transform them into a matrix in reduced row echelon form.
• Say why the matrices in (c) are not in row echelon form and use elementary row operations
to transform them into a matrix in row echelon form. Transform them further to obtain a
matrix in reduced row echelon form.

Question 3.1
If we interchange two rows in a matrix this corresponds to writing down the given equations in a
different order. What is the effect on a linear system if we interchange two columns?

Remember: if we translate a linear system to an augmented coefficient matrix (A|b), perform the
row operations to arrive at a (reduced) row echelon form (A0 |b0 ), and translate back to a linear
system, then this new system contains exactly the same information as the original one but it is
“tidied up” and it is easy to determine its solution.
The natural question now is: Can we always transform a matrix into one in (reduced) row echelon
form? The answer is that this is always possible and we can even give an algorithm for it.


Gaußian elimination. Let A ∈ M(m × n) and assume that A is not the zero matrix. Gaußian elimination is an algorithm that transforms A into a row echelon form. The steps are as follows:

• Find the first column which does not consist entirely of zeros. Interchange rows appropriately such that the entry in that column in the first row is different from zero.
• Multiply the first row by an appropriate number so that its first non-zero entry is 1.
• Use the first row to eliminate all coefficients below its pivot.
• Now our matrix looks like

( 0 · · · 0  1  ∗ · · · ∗ )
( 0 · · · 0  0            )
( .          .     A′     )
( 0 · · · 0  0            )

where the ∗ are arbitrary numbers and A′ is a matrix with fewer columns than A and m − 1 rows. Now repeat the process for A′. Note that in doing so the first columns do not change since we are only manipulating zeros.

Gauß-Jordan elimination. Let A ∈ M (m × n). The Gauß-Jordan elimination is an algorithm


that transforms A into a reduced row echelon form. The steps are as follows:

• Use the Gauß elimination to obtain a row echelon form of A.


• Use the pivots to eliminate the non-zero coefficients which lie above a pivot.
Of course, if we do a reduction by hand, then we do not have to follow the steps of the algorithm
strictly if it makes calculations easier. However, these algorithms always work and therefore can be
programmed so that a computer can perform them.
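As an illustration (not part of the original notes), the following Python sketch implements the steps described above for a small matrix stored as a list of rows; applied to the augmented matrix of Example 3.2 it returns the reduced row echelon form found there.

def gauss_jordan(A, eps=1e-12):
    """Return the reduced row echelon form of A (a list of rows of numbers)."""
    A = [row[:] for row in A]               # work on a copy
    m, n = len(A), len(A[0])
    pivot_row = 0
    for col in range(n):
        # find a row at or below pivot_row with a non-zero entry in this column
        r = next((i for i in range(pivot_row, m) if abs(A[i][col]) > eps), None)
        if r is None:
            continue                         # no pivot in this column
        A[pivot_row], A[r] = A[r], A[pivot_row]          # swap rows
        p = A[pivot_row][col]
        A[pivot_row] = [x / p for x in A[pivot_row]]     # make the pivot equal to 1
        for i in range(m):                   # eliminate the column everywhere else
            if i != pivot_row and abs(A[i][col]) > eps:
                factor = A[i][col]
                A[i] = [x - factor * y for x, y in zip(A[i], A[pivot_row])]
        pivot_row += 1
        if pivot_row == m:
            break
    return A

# Example 3.2 as an augmented matrix:
print(gauss_jordan([[1, 1, -1, 1], [2, 3, 1, 3], [0, 4, 1, 7]]))
# [[1, 0, 0, -12/11], [0, 1, 0, 20/11], [0, 0, 1, -3/11]], up to rounding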

Definition 3.6. Two m × n matrices A and B are called row equivalent if there are elementary
row operations that transform A into B. (Clearly then B can be transformed by row operations
into A.)
Remark. Let A be an m × n matrix.


• A can be transformed into infinitely many different row echelon forms.
• There is only one reduced row echelon form that A can be transformed into.

Prove the assertions above.

Before we give examples, we note that from the row echelon form we can immediately tell how
many solutions the corresponding linear system has.

Theorem 3.7. Let (A|b) be the augmented coefficient matrix of a linear m × n system and let (A′|b′) be a row reduced form.


(1) If there is a row of the form (0 · · · 0 | β) with β ≠ 0, then the system has no solution.
(2) If there is no row of the form (0 · · · 0 | β) with β ≠ 0, then one of the following holds:
(2.1) If there is a pivot in every column of A′, then the system has exactly one solution.
(2.2) If there is a column in A′ without a pivot, then the system has infinitely many solutions.

Proof. (1) If (A′|b′) has a row of the form (0 · · · 0 | β) with β ≠ 0, then the corresponding equation is 0x1 + · · · + 0xn = β which clearly has no solution.
(2) Now assume that (A′|b′) has no row of the form (0 · · · 0 | β) with β ≠ 0. In case (2.1), the transformed matrix is then of the form

( 1  a′12  a′13  · · ·  a′1n      | b′1     )
( 0   1    a′23  · · ·  a′2n      | b′2     )
(            . . .                |  .      )
( 0   · · ·  0    1   a′(n−1)n    | b′(n−1) )
( 0   · · ·  0    0      1        | b′n     )                  (3.4)
( 0   · · ·  0    0      0        | 0       )
(            . . .                |  .      )

Note that the last zero rows appear only if n < m. This system clearly has the unique solution

xn = b′n,   xn−1 = b′(n−1) − a′(n−1)n xn,   . . . ,   x1 = b′1 − a′1n xn − · · · − a′12 x2.

In case (2.2), the transformed matrix is then of the form

( 0 · · · 0  1  ∗  · · ·              ∗ | b′1 )
( 0 · · · 0  0  · · ·  0  1  ∗ · · ·  ∗ | b′2 )
(                 . . .                 |  .  )
( 0 · · ·                 0  1  ∗ · · · ∗ | b′k )              (3.5)
( 0 · · ·                             0 | 0   )
(                 . . .                 |  .  )

where the stars stand for numbers. (If we continue the reduction until we get to the reduced
row echelon form, then the numbers over the 1’s must be zeros.) Note that we can choose the
unknowns which correspond to the columns without a pivot arbitrarily. The unknowns which
correspond to the columns with pivots can then always be chosen in a unique way such that
the system is satisfied.
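In practice one rarely carries out the elimination by hand. The following sketch (assuming sympy; not part of the original notes) computes a reduced row echelon form together with the pivot columns, which is exactly the information used in Theorem 3.7; the matrix below is the augmented matrix of the inconsistent system of Example 3.13 further below.

import sympy as sp

M = sp.Matrix([[2, 1, -1, 7],
               [3, 2, -2, 7],
               [-1, 3, -3, 2]])
R, pivots = M.rref()
print(R)        # its last row is (0, 0, 0, 1): a row (0 ... 0 | beta) with beta != 0
print(pivots)   # (0, 1, 3) -- the last pivot sits in the right-hand-side column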


Definition 3.8. The variables which correspond to columns without pivots are called free variables.

We will come back to this theorem later on page 112 (the theorem is stated again in the coloured
box).
From the above theorem we get as an immediate consequence the following.

Theorem 3.9. A linear system has either no, exactly one or infinitely many solutions.

Now let us see some examples.

Example 3.10 (Example with a unique solution (no free variables)). We consider the
linear system
2x1 + 3x2 + x3 = 12,
−x1 + 2x2 + 3x3 = 15, (3.6)
3x1 − x3 = 1.
Solution. We form the augmented matrix and perform row reduction.

[ 2 3 1 | 12 ;  −1 2 3 | 15 ;  3 0 −1 | 1 ]
  → (R1 → R1 + 2R2) →           [ 0 7 7 | 42 ;  −1 2 3 | 15 ;  3 0 −1 | 1 ]
  → (R3 → R3 + 3R2) →           [ 0 7 7 | 42 ;  −1 2 3 | 15 ;  0 6 8 | 46 ]
  → (R1 ↔ R2) →                 [ −1 2 3 | 15 ;  0 7 7 | 42 ;  0 6 8 | 46 ]
  → (R1 → −R1, R2 → (1/7)R2) →  [ 1 −2 −3 | −15 ;  0 1 1 | 6 ;  0 6 8 | 46 ]
  → (R3 → R3 − 6R2) →           [ 1 −2 −3 | −15 ;  0 1 1 | 6 ;  0 0 2 | 10 ]
  → (R3 → (1/2)R3) →            [ 1 −2 −3 | −15 ;  0 1 1 | 6 ;  0 0 1 | 5 ].
This shows that the system (3.6) is equivalent to the system
x1 − 2x2 − 3x3 = −15,
x2 + x3 = 6, (3.7)
x3 = 5

whose solution is easy to write down:


x3 = 5, x2 = 6 − x3 = 1, x1 = −15 + 2x2 + 3x3 = 2. 

Remark. If we continue the reduction process until we reach the reduced row echelon form, then we obtain

· · · →  [ 1 −2 −3 | −15 ;  0 1 1 | 6 ;  0 0 1 | 5 ]
  → (R2 → R2 − R3) →   [ 1 −2 −3 | −15 ;  0 1 0 | 1 ;  0 0 1 | 5 ]
  → (R1 → R1 + 3R3) →  [ 1 −2 0 | 0 ;  0 1 0 | 1 ;  0 0 1 | 5 ]
  → (R1 → R1 + 2R2) →  [ 1 0 0 | 2 ;  0 1 0 | 1 ;  0 0 1 | 5 ].


Therefore the system (3.6) is equivalent to the system


x1 = 2,
x2 = 1,
x3 = 5.
whose solution can be read off immediately to be

x3 = 5, x2 = 1, x1 = 2.
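Since the system has exactly one solution, any numerical solver must reproduce it. The following sketch (assuming numpy; not part of the original notes) confirms the result.

import numpy as np

A = np.array([[ 2.0, 3.0,  1.0],
              [-1.0, 2.0,  3.0],
              [ 3.0, 0.0, -1.0]])
b = np.array([12.0, 15.0, 1.0])

x = np.linalg.solve(A, b)    # works because A is square and invertible
print(x)                     # [2. 1. 5.]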

Example 3.11 (Example with two free variables). We consider the linear system

3x1 − 2x2 + 3x3 + 3x4 = 3,
2x1 + 6x2 + 2x3 − 9x4 = 2,                                        (3.8)
x1 + 2x2 + x3 − 3x4 = 1.
Solution. We form the augmented matrix and perform row reduction.

[ 3 −2 3 3 | 3 ;  2 6 2 −9 | 2 ;  1 2 1 −3 | 1 ]
  → (R2 → R2 − 2R3) →  [ 3 −2 3 3 | 3 ;  0 2 0 −3 | 0 ;  1 2 1 −3 | 1 ]
  → (R1 → R1 − 3R3) →  [ 0 −8 0 12 | 0 ;  0 2 0 −3 | 0 ;  1 2 1 −3 | 1 ]
  → (R1 ↔ R3) →        [ 1 2 1 −3 | 1 ;  0 2 0 −3 | 0 ;  0 −8 0 12 | 0 ]
  → (R3 → R3 + 4R2) →  [ 1 2 1 −3 | 1 ;  0 2 0 −3 | 0 ;  0 0 0 0 | 0 ]
  → (R1 → R1 − R2) →   [ 1 0 1 0 | 1 ;  0 2 0 −3 | 0 ;  0 0 0 0 | 0 ].

The 3rd and the 4th columns do not have pivots and we see that the system (3.8) is equivalent to the system

x1 + x3 = 1,
2x2 − 3x4 = 0.

Clearly we can choose x3 and x4 (the unknowns corresponding to the columns without a pivot) arbitrarily. We will always be able to adjust x1 and x2 such that the system is satisfied. In order to make it clear that x3 and x4 are our free variables, we sometimes call them x3 = t and x4 = s. Then every solution of the system (3.8) is of the form

x1 = 1 − t,   x2 = (3/2)s,   x3 = t,   x4 = s,   for arbitrary s, t ∈ R.

In vector form we can write the solution as follows. A tuple (x1, x2, x3, x4) is a solution of (3.8) if and only if the corresponding vector is of the form

(x1, x2, x3, x4)^t = (1 − t, (3/2)s, t, s)^t = (1, 0, 0, 0)^t + t(−1, 0, 1, 0)^t + s(0, 3/2, 0, 1)^t   for some s, t ∈ R.


Geometrically, the set of all solutions is an affine plane in R4 .
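Computer algebra systems carry out the same elimination symbolically and keep the free variables. The following sketch (assuming sympy; not part of the original notes) returns the solution of (3.8) with x3 and x4 remaining as free parameters.

import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
eqs = [sp.Eq(3*x1 - 2*x2 + 3*x3 + 3*x4, 3),
       sp.Eq(2*x1 + 6*x2 + 2*x3 - 9*x4, 2),
       sp.Eq(x1 + 2*x2 + x3 - 3*x4, 1)]

print(sp.linsolve(eqs, x1, x2, x3, x4))
# prints something like {(1 - x3, 3*x4/2, x3, x4)}: x3 and x4 are the free variables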

Remark 3.12. In vector notation the solutions of an inhomogeneous system of linear equations are always of the form

~x = ~z0 + t1~y1 + · · · + tk~yk.

This will become important in Theorem 3.22.

In the example above, k = 2 and

~z0 = (1, 0, 0, 0)^t,   ~y1 = (−1, 0, 1, 0)^t,   ~y2 = (0, 3/2, 0, 1)^t.

Example 3.13 (Example with no solution). We consider the linear system


2x1 + x2 − x3 = 7,

3x1 + 2x2 − 2x3 = 7, (3.9)
−x1 + 3x2 − 3x3 = 2.
Solution. We form the augmented matrix and perform row reduction.

[ 2 1 −1 | 7 ;  3 2 −2 | 7 ;  −1 3 −3 | 2 ]
  → (R1 → R1 + 2R3) →    [ 0 7 −7 | 11 ;  3 2 −2 | 7 ;  −1 3 −3 | 2 ]
  → (R2 → R2 + 3R3) →    [ 0 7 −7 | 11 ;  0 11 −11 | 13 ;  −1 3 −3 | 2 ]
  → (R1 ↔ R3) →          [ −1 3 −3 | 2 ;  0 11 −11 | 13 ;  0 7 −7 | 11 ]
  → (R3 → 11R3 − 7R2) →  [ −1 3 −3 | 2 ;  0 11 −11 | 13 ;  0 0 0 | 30 ].

The last line tells us immediately that the system (3.9) has no solution because there is no choice
of x1 , x2 , x3 such that 0x1 + 0x2 + 0x3 = 30. 

You should now have understood


• what it means that two linear systems are equivalent,
• which row operations transform a given system into an equivalent one and why this is so,
• when a matrix is in row echelon and a reduced row echelon form,
• why the linear system associated to a matrix in (reduced) echelon form is easy to solve,
• what the Gauß- and Gauß-Jordan elimination do and why they always work,
• that the Gauß- and Gauß-Jordan elimination is nothing very magical; essentially it is the
same as solving for variables and replacing in the remaining equations. It only does so in a
systematic way;


• why a given matrix can be transformed into many different row echelon forms, but into only one reduced row echelon form,
• why a linear system always has either no, exactly one or infinitely many solutions,
• etc.
You should now be able to
• identify if a matrix is in row echelon or a reduced row echelon form,
• use the Gauß- or Gauß-Jordan elimination to solve linear systems,
• say if a system has no, exactly one or infinitely many solutions if you know its echelon form,
• etc.

Exercises.
1. Using elementary row operations, bring the following augmented matrices to their reduced row echelon form and find all solutions of the system of equations associated with each matrix:

(a) [ 1 2 3 | −2 ;  2 −1 −1 | 1 ]
(b) [ 2 4 6 | −1 ;  4 5 6 | 2 ;  2 7 12 | 1 ]
(c) [ 6 2 | 4 ;  1 −2 | −4 ;  1 1 | 2 ]
(d) [ 5 1 −5 | 1 ;  4 2 2 | −4 ;  1 0 3 | 2 ;  −3 1 4 | −1 ]
(e) [ 1 1 5 −1 | 2 ;  0 1 2 1 | 0 ;  2 −1 4 1 | 13 ]
(f) [ 0 2 3 | 1 ;  2 −6 7 | 3 ;  1 −2 5 | 2 ]
(g) [ 0 0 1 | −3 ;  0 2 1 | 1 ;  3 2 1 | 7 ]
(h) [ 1 2 3 4 | 1 ;  1 2 3 0 | −2 ;  1 2 0 0 | −1 ;  1 0 0 0 | 2 ]
(i) [ 1 0 0 0 | 1 ;  2 1 0 0 | −3 ;  3 2 1 0 | 2 ;  4 3 2 1 | 5 ]

2. Find conditions on a, b, c such that the following system has a solution:

x + 3y − 2z = a + 1
3x − y + z = b + 6
5x − 5y + 4z = c + 11

3. Find a polynomial of degree at most 2 which passes through the points (−1, −6), (1, 0), (2, 0). How many such polynomials are there?

4. (a) Is there a polynomial of degree 1 which passes through the three points of Exercise 3? How many such polynomials are there?
(b) Is there a polynomial of degree 3 which passes through the three points of Exercise 3? How many such polynomials are there? Give at least two polynomials of degree 3.


5. A traveller from Bogotá makes a short tour through Colombia and, going over his accounts, he notices: on hostels he spent $30,000 per day in Medellín, $20,000 per day in Villavicencio and $20,000 per day in Yopal; on food he spent $20,000 per day in Medellín, $30,000 per day in Villavicencio and $20,000 per day in Yopal; and finally on transportation he spent on average $10,000 per day in each city. Knowing that during the whole tour he spent $340,000 on hostels, $320,000 on food and $140,000 on transportation, compute the number of days he spent in each city.

6. A breeding farm in the llanos keeps capybaras (chigüiros), rabbits and rice rats. Each capybara consumes on average one kilo of fruit, one kilo of herbs and two kilos of rice per week; each rabbit consumes on average three kilos of fruit, four kilos of herbs and five kilos of rice per week; and each rice rat consumes on average two kilos of fruit, one kilo of herbs and five kilos of rice per week. Every week 25,000 kilos of fruit, 20,000 kilos of herbs and 55,000 kilos of rice are provided. If the three rodent species eat all the food, how many rodents of each kind can live on the farm?

3.2 Homogeneous linear systems

In this short section we deal with the special case of homogeneous linear systems. Recall that a linear system (3.1) is called homogeneous if b1 = · · · = bm = 0. Such a system always has at least one solution, the so-called trivial solution x1 = · · · = xn = 0. This is also clear from Theorem 3.7 since no matter what row operations we perform, the right hand side will always remain equal to 0. Note that if we perform Gauß or Gauß-Jordan elimination, there is no need to write down the right hand side since it always will be 0.
If we adapt Theorem 3.7 to the special case of a homogeneous system, we obtain the following.

Theorem 3.14. Let A be the coefficient matrix of a homogeneous linear m × n system and let A′ be a row reduced form.

(i) If there is a pivot in every column, then the system has exactly one solution, namely the trivial solution.

(ii) If there is a column without a pivot, then the system has infinitely many solutions.

Corollary 3.15. A homogeneous linear system has either exactly one or infinitely many solutions.

Let us see an example.

Example 3.16 (Example of a homogeneous system with infinitely many solutions). We


consider the linear system

x1 + 2x2 − x3 = 0,
2x1 + 3x2 − 2x3 = 0, (3.10)
3x1 − x2 − 3x3 = 0.


Solution. We perform row reduction on the associated coefficient matrix.

[ 1 2 −1 ;  2 3 −2 ;  3 −1 −3 ]
  → (R2 → R2 − 2R1) →  [ 1 2 −1 ;  0 −1 0 ;  3 −1 −3 ]
  → (R3 → R3 − 3R1) →  [ 1 2 −1 ;  0 −1 0 ;  0 −7 0 ]
  → (use R2 to clear the 2nd column) →  [ 1 0 −1 ;  0 −1 0 ;  0 0 0 ]
  → (R2 → −R2) →  [ 1 0 −1 ;  0 1 0 ;  0 0 0 ].

We see that the third variable is free, so we set x3 = t. The solution is

x1 = t, x2 = 0, x3 = t for t ∈ R.

or in vector form

(x1, x2, x3)^t = t(1, 0, 1)^t   for t ∈ R.
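The solution set of a homogeneous system is the null space of its coefficient matrix, so a computer algebra system can produce it directly. The following sketch (assuming sympy; not part of the original notes) recovers the solution of (3.10).

import sympy as sp

A = sp.Matrix([[1, 2, -1],
               [2, 3, -2],
               [3, -1, -3]])
print(A.nullspace())   # [Matrix([[1], [0], [1]])] -- every solution is t*(1, 0, 1)^t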

Example 3.17 (Example of a homogeneous system with exactly one solution). We con-
sider the linear system

x1 + 2x2 = 0,
2x1 + 3x2 = 0, (3.11)
3x1 + 5x2 = 0.
Solution. We perform row reduction on the associated coefficient matrix.

[ 1 2 ;  2 3 ;  3 5 ]
  → (use R1 to clear the 1st column) →  [ 1 2 ;  0 −1 ;  0 −1 ]
  → (use R2 to clear the 2nd column) →  [ 1 0 ;  0 −1 ;  0 0 ]
  → (R2 → −R2) →  [ 1 0 ;  0 1 ;  0 0 ].

So the only possible solution is x1 = 0 and x2 = 0.



In the next section we will see the connection between the set of solutions of a linear system and
the corresponding homogeneous linear system.

You should now have understood


• why a homogeneous linear system always has either one or infinitely many solutions,
• etc.
You should now be able to

• use the Gauß- or Gauß-Jordan elimination to solve homogeneous linear systems,


• etc.


Exercises.
1. Find all solutions of the homogeneous systems associated with the matrices of Exercise 1 of Section 3.1.
2. Determine the set of solutions of the following homogeneous systems:

(a) x + 2y − 3z = 0
    2x + 4y − 6z = 0
    −3x − 6y + 9z = 0

(b) x + y + z + w = 0
    −x + 5z − 3w = 0
    2x + 3y + 8z = 0
    x + 2y + 7z − w = 0

(c) 2x − 8y = 0
    −x + 4y = 0
    3x − 12y = 0

3. Find all r ∈ R such that the following system has a unique solution:

(2 + r)x − 2y = 0
2x + (1 − r)y = 0

3.3 Matrices and linear systems
So far we were given a linear system with a specific right hand side and we asked ourselves which
xj do we have to feed into the system in order to obtain the given right hand side. Problems of
this type are called inverse problems since we are given an output (the right hand side of the system; the "state" that we want to achieve) and we have to find a suitable input in order to obtain the
desired output.
Now we change our perspective a bit and we ask ourselves: If we put certain x1 , . . . , xn into the
system, what do we get as a result on the right hand side? To investigate this question, it is very
useful to write the system (3.1) in a short form. First note that we can view it as an equality of
the two vectors with m components each:

a11 x1 + a12 x2 + · · · + a1n xn


   
b1
a x + a x + · · · + a x   
 21 1 22 2 2n n  b2 
= . . (3.12)
 .. ..

  .. 
 
 . .
am1 x1 + am2 x2 + · · · + amn xn bm
D

Let A be the coefficient matrix and ~x the vector whose components are x1 , . . . , xn . Then we write
the left hand side of (3.12) as

          ( x1 )       ( a11 x1 + a12 x2 + · · · + a1n xn )
A~x = A   (  · )  :=   ( a21 x1 + a22 x2 + · · · + a2n xn )       (3.13)
          ( xn )       (                ·                 )
                       ( am1 x1 + am2 x2 + · · · + amn xn )

With this notation, the linear system (3.1) can be written very compactly as

A~x = ~b

with ~b = (b1, . . . , bm)^t.

A way to remember the formula for the multiplication of a matrix and a vector is that we "multiply each row of the matrix by the column vector", so we calculate "row by column". For example, the jth component of A~x is "(jth row of A) by (column ~x)":

        ( a11 a12 · · · a1n )  ( x1 )     ( a11 x1 + a12 x2 + · · · + a1n xn )
        (  ·             ·  )  ( x2 )     (                ·                 )
A~x =   ( aj1 aj2 · · · ajn )  (  · )  =  ( aj1 x1 + aj2 x2 + · · · + ajn xn )        (3.14)
        (  ·             ·  )  ( xn )     (                ·                 )
        ( am1 am2 · · · amn )             ( am1 x1 + am2 x2 + · · · + amn xn )

Definition 3.18. The formula in (3.13) is called the multiplication of a matrix and a vector.

An m × n matrix A takes a vector with n components and gives us back a vector with m compo-
nents.
Observe that something like ~xA does not make sense!
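The following small sketch (assuming numpy; not part of the original notes) shows the same "row by column" rule in code: a 2 × 3 matrix applied to a vector with 3 components returns a vector with 2 components, and applying it to a standard basis vector picks out the corresponding column (compare Remark 3.20 below).

import numpy as np

A = np.array([[1,  2, 0],
              [3, -1, 4]])      # a 2x3 matrix ...
x = np.array([1, 1, 2])         # ... takes a vector with 3 components

print(A @ x)                    # [ 3 10] -- a vector with 2 components
print(A @ np.array([0, 1, 0]))  # [ 2 -1] -- the 2nd column of A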

Remark 3.19. Formula (3.13) can be interpreted as follows. If A is an m × n matrix and ~x is a vector in Rn, then A~x is the vector in Rm which is the sum of the columns of A weighted with the coefficients given by ~x, since

A~x = ( a11 x1 + · · · + a1n xn , . . . , am1 x1 + · · · + amn xn )^t
    = x1 (a11, a21, . . . , am1)^t + · · · + xn (a1n, a2n, . . . , amn)^t.            (3.15)

Remark 3.20. Recall that ~ej is the vector which has a 1 as its jth component and zeros everywhere else. Formula (3.13) shows that for every j = 1, . . . , n

A~ej = (a1j, . . . , amj)^t = jth column of A.                    (3.16)

Let us prove some easy properties.

Proposition 3.21. Let A be an m × n matrix, ~x, ~y ∈ Rn and c ∈ R. Then

(i) A(c~x) = cA~x,


(ii) A(~x + ~y ) = A~x + A~y ,
(iii) A~0 = ~0.

Proof. The proofs are not difficult. They follow by using the definitions and carrying out some
straightforward calculations as follows.
(i) $A(c\vec{x}) = A\begin{pmatrix} cx_1 \\ \vdots \\ cx_n \end{pmatrix} = \begin{pmatrix} a_{11}cx_1 + \cdots + a_{1n}cx_n \\ a_{21}cx_1 + \cdots + a_{2n}cx_n \\ \vdots \\ a_{m1}cx_1 + \cdots + a_{mn}cx_n \end{pmatrix} = c\begin{pmatrix} a_{11}x_1 + \cdots + a_{1n}x_n \\ a_{21}x_1 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n \end{pmatrix} = cA\vec{x}.$

(ii) $A(\vec{x} + \vec{y}) = A\begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix} = \begin{pmatrix} a_{11}(x_1+y_1) + \cdots + a_{1n}(x_n+y_n) \\ a_{21}(x_1+y_1) + \cdots + a_{2n}(x_n+y_n) \\ \vdots \\ a_{m1}(x_1+y_1) + \cdots + a_{mn}(x_n+y_n) \end{pmatrix}$

$\qquad = \begin{pmatrix} a_{11}x_1 + \cdots + a_{1n}x_n \\ a_{21}x_1 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n \end{pmatrix} + \begin{pmatrix} a_{11}y_1 + \cdots + a_{1n}y_n \\ a_{21}y_1 + \cdots + a_{2n}y_n \\ \vdots \\ a_{m1}y_1 + \cdots + a_{mn}y_n \end{pmatrix} = A\vec{x} + A\vec{y}.$
(iii) To show that A~0 = ~0, we could simply do the calculation (which is very easy!) or we can use
(i):
A~0 = A(0~0) = 0A~0 = ~0.

Note that in (iii) the ~0 on the left hand side is the zero vector in Rn whereas the ~0 on the right
hand side is the zero vector in Rm .
Proposition 3.21 gives an important insight into the structure of solutions of linear systems. See
also Remark 3.12.

Theorem 3.22. (i) Let ~x and ~y be solutions of the linear system (3.1). Then ~x − ~y is a solution
of the associated homogeneous linear system.
(ii) Let ~x be a solution of the linear system (3.1), let ~z be a solution of the associated homogeneous
linear system and let λ ∈ R. Then ~x + λ~z is solution of the system (3.1).

Proof. Assume that ~x and ~y are solutions of (3.1), that is

A~x = ~b and A~y = ~b.

By Proposition 3.21 (i) and (ii) we have

A(~x − ~y ) = A~x + A(−~y ) = A~x − A~y = ~b − ~b = ~0

which shows that ~x − ~y solves the homogeneous equation A~v = ~0. Hence (i) is proved.
In order to show (ii), we proceed similarly. If ~x solves the inhomogeneous system (3.1) and ~z solves
the associated homogeneous system, then
A~x = ~b and A~z = ~0.
Now (ii) follows from

A(~x + λ~z) = A~x + Aλ~z = A~x + λA~z = ~b + λ~0 = ~b.

Corollary 3.23. Let ~x be an arbitrary solution of the inhomogeneous system (3.1). Then the set
of all solutions of (3.1) is
{~x + ~z : ~z is solution of the associated homogeneous system}.

This means that in order to find all solutions of an inhomogeneous system it suffices to find one
particular solution and all solutions of the corresponding homogeneous system.

We will show later that the set of all solutions of a homogeneous system is a vector space. When you
study the set of all solutions of linear differential equations, you will encounter the same structure.

Example 3.24. Let us consider the system


x1 + 2x2 − x3 = 3,
2x1 + 3x2 − 2x3 = 3, (3.10’)
3x1 − x2 − 3x3 = −12.
Solution. We form the augmented matrix and perform row reduction.
     
$$\left(\begin{array}{ccc|c} 1 & 2 & -1 & 3 \\ 2 & 3 & -2 & 3 \\ 3 & -1 & -3 & -12 \end{array}\right) \xrightarrow[\;R_3 \to R_3 - 3R_1\;]{R_2 \to R_2 - 2R_1} \left(\begin{array}{ccc|c} 1 & 2 & -1 & 3 \\ 0 & -1 & 0 & -3 \\ 0 & -7 & 0 & -21 \end{array}\right) \xrightarrow{\text{use } R_2 \text{ to clear the 2nd column}} \left(\begin{array}{ccc|c} 1 & 0 & -1 & -3 \\ 0 & -1 & 0 & -3 \\ 0 & 0 & 0 & 0 \end{array}\right) \xrightarrow{R_2 \to -R_2} \left(\begin{array}{ccc|c} 1 & 0 & -1 & -3 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right).$$

It follows that $x_2 = 3$ and $x_1 = -3 + x_3$. If we take $x_3$ as parameter, the general solution of the system in vector form is

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -3 \\ 3 \\ 0 \end{pmatrix} + t\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \qquad\text{for } t \in \mathbb{R}.$$

Note that the left hand side of the system (3.10') is the same as that of the homogeneous system (3.10) in Example 3.16 which has the general solution

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = t\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \qquad\text{for } t \in \mathbb{R}.$$


This shows that indeed we obtain all solutions of the inhomogeneous equation as the sum of the particular solution (−3, 3, 0)^t and all solutions of the corresponding homogeneous system. 
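As a quick numerical confirmation of this structure (same matrix and right hand side as in Example 3.24; NumPy is used only for the arithmetic), every vector of the form "particular solution plus a multiple of the homogeneous solution" indeed solves the system:

    import numpy as np

    A = np.array([[1.0, 2.0, -1.0],
                  [2.0, 3.0, -2.0],
                  [3.0, -1.0, -3.0]])
    b = np.array([3.0, 3.0, -12.0])

    x_part = np.array([-3.0, 3.0, 0.0])   # particular solution of A x = b
    z_hom  = np.array([1.0, 0.0, 1.0])    # solution of A z = 0

    for t in (-2.0, 0.0, 1.5):
        print(np.allclose(A @ (x_part + t * z_hom), b))   # True for every t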

You should now have understood


• that an m × n matrix can be viewed as a function that takes vectors in Rn and returns a
vector in Rm ,
• the structure of the set of all solutions of a given linear system,
• etc.
You should now be able to

• calculate expressions like A~x,


• relate the solutions of an inhomogeneous system with those of the corresponding homoge-
neous one,
• etc.

Exercises.

1. Let $A = \begin{pmatrix} 1 & 2 & -1 & 3 \\ 1 & 3 & 0 & 2 \\ -1 & 2 & 1 & 3 \end{pmatrix}$. For each of the following vectors, check whether it is a solution of the homogeneous system $A\vec{x} = \vec{0}$:

(a) $\begin{pmatrix} 5 \\ -3 \\ 5 \\ 2 \end{pmatrix}$,  (b) $\begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}$,  (c) $\begin{pmatrix} 10 \\ -6 \\ 10 \\ 20 \end{pmatrix}$,  (d) $\begin{pmatrix} 3 \\ 0 \\ 0 \\ -1 \end{pmatrix}$,  (e) $\begin{pmatrix} 2 \\ -1 \\ 0 \\ 1 \end{pmatrix}$.

Afterwards, find all solutions of the homogeneous system. Is there a solution of the above system such that one of its components is zero but which is not the trivial solution?

2. In each item, write down the corresponding system of linear equations and find all its solutions:

(a) $\begin{pmatrix} 1 & 1 & 3 & 2 \\ 2 & -1 & 0 & 4 \\ 0 & 3 & 6 & 0 \end{pmatrix}\vec{x} = \begin{pmatrix} 7 \\ 8 \\ 8 \end{pmatrix}$  (b) $\begin{pmatrix} 2 & -1 & 1 \\ 1 & -2 & {} \\ 1 & -1 & 1 \\ 1 & 5 & 7 \\ 1 & -7 & -5 \end{pmatrix}\vec{x} = \begin{pmatrix} 3 \\ -2 \\ 7 \\ 13 \\ 12 \end{pmatrix}$

(c) $\begin{pmatrix} 2 & 1 & 1 \\ 1 & 1 & -2 \\ 1 & 1 & 1 \end{pmatrix}\vec{x} = \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}$  (d) $\begin{pmatrix} 3 & 1 & 0 & 2 \\ 2 & 1 & -1 & 0 \\ 1 & 1 & 1 & 1 \end{pmatrix}\vec{x} = \begin{pmatrix} 6 \\ 3 \\ 6 \end{pmatrix}$


3. Write the following system of equations in the form $A\vec{x} = \vec{b}$:

    x + 3y + z = 3
    2x + 7y + 2z = 5
    2x + 6y + (a² − 2)z = a + 4.

Then determine all a ∈ R such that the system

(a) has a unique solution,

(b) has infinitely many solutions, or

(c) has no solution.

3.4 Matrices as functions from Rn to Rm; composition of matrices

In the previous section we saw that a matrix A ∈ M (m × n) takes a vector ~x ∈ Rn and returns
a vector A~x in Rm . This allows us to view A as a function from Rn to Rm , and therefore we can
define the sum and composition of two matrices. Before we do this, let us see a few examples of
such matrices. As examples we work with 2 × 2 matrices because their action on R2 can be sketched
in the plane.
 
Example 3.25. Let us consider $A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. This defines a function $T_A$ from $\mathbb{R}^2$ to $\mathbb{R}^2$ by

$$T_A : \mathbb{R}^2 \to \mathbb{R}^2, \qquad T_A\vec{x} = A\vec{x}.$$

Remark. We write TA to denote the function induced by A, but sometimes we will write simply
A : R2 → R2 when it is clear that we consider the matrix A as a function.

We calculate easily

           
$$T_A\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad T_A\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix}, \qquad\text{in general}\quad T_A\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ -y \end{pmatrix}.$$

So we see that TA represents the reflection of a vector ~x about the x-axis.


Figure 3.1: Reflection on the x-axis.
 
Example 3.26. Let us consider $B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$. This defines a function $T_B$ from $\mathbb{R}^2$ to $\mathbb{R}^2$ by

$$T_B : \mathbb{R}^2 \to \mathbb{R}^2, \qquad T_B\vec{x} = B\vec{x}.$$

We calculate easily

$$T_B\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad T_B\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad\text{in general}\quad T_B\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ y \end{pmatrix}.$$
So we see that TB represents the projection of a vector ~x onto the y-axis.
Figure 3.2: Orthogonal projection onto the y-axis.


 
Example 3.27. Let us consider $C = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. This defines a function $T_C$ from $\mathbb{R}^2$ to $\mathbb{R}^2$ by

$$T_C : \mathbb{R}^2 \to \mathbb{R}^2, \qquad T_C\vec{x} = C\vec{x}.$$

We calculate easily

$$T_C\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad T_C\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \qquad\text{in general}\quad T_C\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -y \\ x \end{pmatrix}.$$

So we see that TC represents the rotation of a vector ~x by 90° counterclockwise.


Figure 3.3: Rotation by π/2 counterclockwise.
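The three examples can also be reproduced numerically; the following short Python/NumPy sketch (sample vector chosen arbitrarily) applies the reflection, the projection and the rotation from Examples 3.25–3.27.

    import numpy as np

    A = np.array([[1, 0], [0, -1]])   # reflection about the x-axis
    B = np.array([[0, 0], [0, 1]])    # projection onto the y-axis
    C = np.array([[0, -1], [1, 0]])   # rotation by 90 degrees counterclockwise

    v = np.array([2, 1])
    print(A @ v)   # [ 2 -1]
    print(B @ v)   # [0 1]
    print(C @ v)   # [-1  2]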
Just as with other functions, we can sum them or compose them. Remember from your calculus
classes, that functions are summed “pointwise”. That means, if we have two functions f, g : R → R,
then the sum f + g is a new function which is defined by

f + g : R → R, (f + g)(x) = f (x) + g(x). (3.17)

The multiplication of a function f with a number c gives the new function cf defined by

cf : R → R, (cf )(x) = c(f (x)). (3.18)

The composition of functions is defined as

f ◦ g : R → R, (f ◦ g)(x) = f (g(x)). (3.19)

Matrix sum

Let us see how this looks like in the case of matrices. Let A and B be matrices. First note that
they both must depart from the same space Rn because we want to apply them to the same ~x, that
is, both A~x and B~x must be defined. Therefore A and B must have the same number of columns.
They also must have the same number of rows because we want to be able to sum A~x and B~x. So


let $A, B \in M(m\times n)$ and let $\vec{x} \in \mathbb{R}^n$. Then, by definition of the sum of two functions, we have

$$(A+B)\vec{x} := A\vec{x} + B\vec{x} = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{pmatrix} + \begin{pmatrix} b_{11}x_1 + b_{12}x_2 + \cdots + b_{1n}x_n \\ \vdots \\ b_{m1}x_1 + b_{m2}x_2 + \cdots + b_{mn}x_n \end{pmatrix}$$

$$= \begin{pmatrix} (a_{11}+b_{11})x_1 + (a_{12}+b_{12})x_2 + \cdots + (a_{1n}+b_{1n})x_n \\ \vdots \\ (a_{m1}+b_{m1})x_1 + (a_{m2}+b_{m2})x_2 + \cdots + (a_{mn}+b_{mn})x_n \end{pmatrix} = \begin{pmatrix} a_{11}+b_{11} & \cdots & a_{1n}+b_{1n} \\ \vdots & & \vdots \\ a_{m1}+b_{m1} & \cdots & a_{mn}+b_{mn} \end{pmatrix}\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
We see that A + B is again a matrix of the same size and that the components of this new matrix
are just the sum of the corresponding components of the matrices A and B.

Multiplication of a matrix by a scalar


D

Now let c be a number and let A ∈ M (m × n). Then we have


     
$$(cA)\vec{x} = c(A\vec{x}) = c\begin{pmatrix} a_{11}x_1 + \cdots + a_{1n}x_n \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n \end{pmatrix} = \begin{pmatrix} ca_{11}x_1 + \cdots + ca_{1n}x_n \\ \vdots \\ ca_{m1}x_1 + \cdots + ca_{mn}x_n \end{pmatrix} = \begin{pmatrix} ca_{11} & \cdots & ca_{1n} \\ \vdots & & \vdots \\ ca_{m1} & \cdots & ca_{mn} \end{pmatrix}\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
We see that cA is again a matrix and that the components of this new matrix are just the product
of the corresponding components of the matrix A with c.
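Numerically these rules are exactly what NumPy's elementwise + and * do for arrays; a tiny check with arbitrary matrices (illustrative values only):

    import numpy as np

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[0, 1], [1, 0]])
    x = np.array([1, 2])

    print(np.array_equal((A + B) @ x, A @ x + B @ x))   # True: (A+B)x = Ax + Bx
    print(np.array_equal((3 * A) @ x, 3 * (A @ x)))     # True: (cA)x = c(Ax)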


Proposition 3.28. Let A, B, C ∈ M (m × n) let O be the matrix whose entries are all 0 and let
λ, µ ∈ R. Moreover, let Ã be the matrix whose entries are the negatives of the entries of A. Then the
following is true.

(i) Associativity of the matrix sum: (A + B) + C = A + (B + C).

(ii) Commutativity of the matrix sum: A + B = B + A.

(iii) Additive identity: A + O = A.

(iv) Additive inverse: A + Ã = O.

(v) 1A = A.

(vi) (λ + µ)A = λA + µA and λ(A + B) = λA + λB.

(vii) (λµ)A = λ(µA).

Proof. The claims of the proposition can be proved by straightforward calculations.

Prove Proposition 3.28.


From the proposition we obtain immediately the following theorem.

Theorem 3.29. M (m × n) is a vector space.

Composition of two matrices



Now let us calculate the composition of two matrices. This is also called the product of the matrices.
Assume we have A ∈ M (m × n) and we want to calculate AB for some matrix B. Note that A
describes a function from Rn → Rm . In order for AB to make sense, we need that B goes from
some Rk to Rn , that means that B ∈ M (n × k). The resulting function AB will then be a map
from Rk to Rm .

B A
Rk Rn Rm

AB

So let $B \in M(n\times k)$. Then, by the definition of the composition of two functions, we have for every $\vec{x} \in \mathbb{R}^k$

$$(AB)\vec{x} = A(B\vec{x}) = A\begin{pmatrix} b_{11}x_1 + b_{12}x_2 + \cdots + b_{1k}x_k \\ b_{21}x_1 + b_{22}x_2 + \cdots + b_{2k}x_k \\ \vdots \\ b_{n1}x_1 + b_{n2}x_2 + \cdots + b_{nk}x_k \end{pmatrix}$$

$$= \begin{pmatrix} a_{11}[b_{11}x_1 + \cdots + b_{1k}x_k] + a_{12}[b_{21}x_1 + \cdots + b_{2k}x_k] + \cdots + a_{1n}[b_{n1}x_1 + \cdots + b_{nk}x_k] \\ a_{21}[b_{11}x_1 + \cdots + b_{1k}x_k] + a_{22}[b_{21}x_1 + \cdots + b_{2k}x_k] + \cdots + a_{2n}[b_{n1}x_1 + \cdots + b_{nk}x_k] \\ \vdots \\ a_{m1}[b_{11}x_1 + \cdots + b_{1k}x_k] + a_{m2}[b_{21}x_1 + \cdots + b_{2k}x_k] + \cdots + a_{mn}[b_{n1}x_1 + \cdots + b_{nk}x_k] \end{pmatrix}$$

$$= \begin{pmatrix} [a_{11}b_{11} + a_{12}b_{21} + \cdots + a_{1n}b_{n1}]x_1 + \cdots + [a_{11}b_{1k} + a_{12}b_{2k} + \cdots + a_{1n}b_{nk}]x_k \\ [a_{21}b_{11} + a_{22}b_{21} + \cdots + a_{2n}b_{n1}]x_1 + \cdots + [a_{21}b_{1k} + a_{22}b_{2k} + \cdots + a_{2n}b_{nk}]x_k \\ \vdots \\ [a_{m1}b_{11} + a_{m2}b_{21} + \cdots + a_{mn}b_{n1}]x_1 + \cdots + [a_{m1}b_{1k} + a_{m2}b_{2k} + \cdots + a_{mn}b_{nk}]x_k \end{pmatrix}$$

$$= \begin{pmatrix} a_{11}b_{11} + \cdots + a_{1n}b_{n1} & \cdots & a_{11}b_{1k} + \cdots + a_{1n}b_{nk} \\ a_{21}b_{11} + \cdots + a_{2n}b_{n1} & \cdots & a_{21}b_{1k} + \cdots + a_{2n}b_{nk} \\ \vdots & & \vdots \\ a_{m1}b_{11} + \cdots + a_{mn}b_{n1} & \cdots & a_{m1}b_{1k} + \cdots + a_{mn}b_{nk} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{pmatrix}.$$

We see that AB is a matrix of size m × k as was to be expected since the composition goes from $\mathbb{R}^k$ to $\mathbb{R}^m$. The component jℓ of the new matrix (the entry in row j and column ℓ) is

$$c_{j\ell} = \sum_{r=1}^{n} a_{jr}b_{r\ell}.$$

So in order to calculate this entry we need from A only its jth row and from B we only need its
`th column and we multiply them component by component. You can memorise this again as “row
by column”, more precisely:

cj` = component in row j and column ` of AB = (row j of A) × (column ` of B) (3.20)

as in the case of multiplication of a vector by a matrix. Actually, a vector in Rn can be seen as an


n × 1 matrix (a matrix with n rows and one column), hence (3.13) can be viewed as a special case

of (3.20).
    
$$AB = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ \vdots & & & \vdots \\ a_{j1} & a_{j2} & \dots & a_{jn} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}\begin{pmatrix} b_{11} & \dots & b_{1\ell} & \dots & b_{1k} \\ b_{21} & \dots & b_{2\ell} & \dots & b_{2k} \\ \vdots & & \vdots & & \vdots \\ b_{n1} & \dots & b_{n\ell} & \dots & b_{nk} \end{pmatrix} = \begin{pmatrix} c_{11} & \dots & c_{1\ell} & \dots & c_{1k} \\ \vdots & & \vdots & & \vdots \\ c_{j1} & \dots & c_{j\ell} & \dots & c_{jk} \\ \vdots & & \vdots & & \vdots \\ c_{m1} & \dots & c_{m\ell} & \dots & c_{mk} \end{pmatrix}$$

with $c_{j\ell} = a_{j1}b_{1\ell} + a_{j2}b_{2\ell} + \cdots + a_{jn}b_{n\ell}$.


 
Example 3.30. Let $A = \begin{pmatrix} 1 & 2 & 3 \\ 8 & 6 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 7 & 1 & 2 & 3 \\ 2 & 0 & 1 & 4 \\ 2 & 6 & -3 & 0 \end{pmatrix}$. Then

$$AB = \begin{pmatrix} 1 & 2 & 3 \\ 8 & 6 & 4 \end{pmatrix}\begin{pmatrix} 7 & 1 & 2 & 3 \\ 2 & 0 & 1 & 4 \\ 2 & 6 & -3 & 0 \end{pmatrix} = \begin{pmatrix} 1\cdot 7 + 2\cdot 2 + 3\cdot 2 & 1\cdot 1 + 2\cdot 0 + 3\cdot 6 & 1\cdot 2 + 2\cdot 1 + 3\cdot(-3) & 1\cdot 3 + 2\cdot 4 + 3\cdot 0 \\ 8\cdot 7 + 6\cdot 2 + 4\cdot 2 & 8\cdot 1 + 6\cdot 0 + 4\cdot 6 & 8\cdot 2 + 6\cdot 1 + 4\cdot(-3) & 8\cdot 3 + 6\cdot 4 + 4\cdot 0 \end{pmatrix} = \begin{pmatrix} 17 & 19 & -5 & 11 \\ 76 & 32 & 10 & 48 \end{pmatrix}.$$
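The product in Example 3.30 can be checked in one line with NumPy (same matrices as above):

    import numpy as np

    A = np.array([[1, 2, 3],
                  [8, 6, 4]])
    B = np.array([[7, 1, 2, 3],
                  [2, 0, 1, 4],
                  [2, 6, -3, 0]])
    print(A @ B)
    # [[17 19 -5 11]
    #  [76 32 10 48]]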
Let us see some properties of the algebraic operations for matrices that we just introduced.

Proposition 3.31. Let A ∈ M (m × n), B, C ∈ M (k × m), S, T ∈ M (n × k) and R ∈ M (k × `).


Then the following is true.

(i) Associativity of the matrix product: A(SR) = (AS)R.



(ii) Distributivity: A(S + T ) = AS + AT and (B + C)A = BA + CA.

Proof. The claims of the proposition can be proved by straightforward calculations.

Prove Proposition 3.31.

Very important remark.


The matrix multiplication is not commutative, that is, in general

AB 6= BA.

That matrix multiplication is not commutative is to be expected since it is the composition of two
functions (think of functions that you know from your calculus classes. For example, it does make

a difference if you first square a variable and then take the arctan or if you first calculate its arctan
and then square the result).
Let us see an example. Let B be the matrix from Example 3.26 and C be the matrix from
Example 3.27. Recall that B represents the orthogonal projection onto the y-axis and that C
represents counterclockwise rotation by 90◦ . If we take ~e1 (the unit vector in x-direction), and we
first rotate and then project, we get the vector ~e2 . If however we project first and rotate then, we
get ~0. That means, BC~e1 6= CB~e1 , therefore BC 6= CB. Let us calculate the products:
    
$$BC = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \qquad\text{first rotation, then projection,}$$
$$CB = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix} \qquad\text{first projection, then rotation.}$$

Let A be the matrix from Example 3.25, B be the matrix from Example 3.26 and C the matrix
from Example 3.27. Verify that AB 6= BA and AC 6= CA and understand this result geometrically
by following for example where the unit vectors get mapped to.

Note also that usually, when AB is defined, the expression BA is not defined because in general
the number of columns of B will be different from the number of rows of A.

We finish this section with the definition of the so-called identity matrix.
Definition 3.32. Let n ∈ N. Then the n × n identity matrix is the matrix which has 1s on its
diagonal and has zero everywhere else:
   
$$\mathrm{id}_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{pmatrix}. \tag{3.21}$$

As notation for the identity matrix, the following symbols are used in the literature: En , idn , Idn ,
In , 1n , 1n . The subscript n can be omitted if the size of the matrix is clear.

Remark 3.33. It can be easily verified that

A idn = A, idn B = B, idn ~x = ~x

for every A ∈ M (m × n), for every B ∈ M (n × k) and for every ~x ∈ Rn .


You should now have understood


• what the sum and the composition of two matrices is and where the formulas come from,
• why the composition of matrices is not commutative,
• that M (m × n) is a vector space,
• etc.

You should now be able to


• calculate the sum and product (composition) of two matrices,
• etc.

Exercises.

1. For $A = \begin{pmatrix} 1 & 3 \\ 2 & 5 \\ -1 & 2 \end{pmatrix}$, $B = \begin{pmatrix} -2 & 0 \\ 1 & 4 \\ -7 & 5 \end{pmatrix}$ and $C = \begin{pmatrix} -1 & 1 \\ 4 & 6 \\ -7 & 3 \end{pmatrix}$, compute:

(a) 2A,
(b) 3C − 2A,
(c) A + B + C,
(d) 2A − 3B + 5C,
(e) a matrix D such that A + 2B − 3C + D is the zero matrix.
RA
2. Carry out the following computations (before performing the indicated multiplication, state what the size of the resulting matrix will be):
    
2 −3 5 1 4 6 1 
(a) 1 0 6 −2 3 5 (d)  3 0 −2 5
2 3 1 1 0 4 −1
 
 1 −1
  
1 4 −2 0 1 
(b) 3 2 1 −2  4 3
3 0 4 2 3 (e)


−6 4 0 3  0 5
  2 0
3 −6  
 2 4 1 6  
(c) 1 4 0 2   1
 7 1 4
0 (f)  0 4
2 −3 5
−2 3 −2 3

3. Verify the associative law of matrix multiplication for the matrices
$A = \begin{pmatrix} 3 & -1 & 4 \\ 1 & 0 & -1 \end{pmatrix}$, $B = \begin{pmatrix} 1 & -1 & 2 \\ 2 & 0 & -1 \\ -3 & -2 & 0 \end{pmatrix}$ and $C = \begin{pmatrix} 1 & 6 \\ -1 & 4 \\ -2 & 3 \end{pmatrix}$.
 
4. Find $A \in M(2\times 2)$ such that $A\begin{pmatrix} 2 & -5 \\ -1 & 3 \end{pmatrix} = \mathrm{id}_2$.


   
5. Let $A = \begin{pmatrix} 3 & -1 \\ 4 & 2 \end{pmatrix}$ and $C = \begin{pmatrix} -5 & 2 & 6 & 10 \\ 1 & 0 & -3 & 2 \end{pmatrix}$. Find at least one matrix B such that AB = C. How many solutions of the matrix equation AX = C are there?

6. In R², find the 2 × 2 matrix that rotates the plane by a given angle ϑ.

Hint. Make the ansatz $C_\vartheta = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and compute $C_\vartheta\vec{e}_1$ and $C_\vartheta\vec{e}_2$ to find a, b, c, d.

7. If $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, find conditions on a, b, c, d such that AB = BA.

8. Let A, B ∈ M(n × n):

(a) Does the identity A² − B² = (A − B)(A + B) hold? If your answer is negative, which conditions on A, B can you give so that the identity holds?

(b) Same as in the previous item for the identity (A + B)² = A² + 2AB + B².

3.5 Inverses of matrices


We will give two motivations why we are interested in inverses of matrices before we give the formal
definition.

Inverse of a matrix as a function


The inverse of a given matrix is a matrix that “undoes” what the original matrix did. We will
review the matrices from the Examples 3.25, 3.26 and 3.27.

• Assume we are given the matrix A from Example 3.25 which represents reflection on the
x-axis and we want to find a matrix that restores a vector after we applied A to it. Clearly,
we have to reflect again on the x-axis: reflecting an arbitrary vector ~x twice on the x-axis
leaves the vector where it was. Let us check:
    
$$AA = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \mathrm{id}_2.$$

That means that for every ~x ∈ R2 , we have that A2 ~x = ~x, hence A is its own inverse.


• Assume we are given the matrix C from Example 3.27 which represents counterclockwise
rotation by 90◦ and we want to find a matrix that restores a vector after we applied C to it.
Clearly, we have to rotate clockwise by 90◦ . Let us assume that there exists a matrix which
represents this rotation and let us call it C−90◦ . By Remark 3.19 it is enough to know how it
acts on ~e1 and ~e2 in order to write it down. Clearly C−90◦~e1 = −~e2 and C−90◦~e2 = ~e1 , hence
C−90◦ = (−~e2 |~e1 ).
Let us check:
$$C_{-90^\circ}C = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \mathrm{id}_2
\qquad\text{and}\qquad
CC_{-90^\circ} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \mathrm{id}_2$$
which was to be expected because rotating first 90◦ clockwise and then 90◦ counterclockwise,
leaves any vector where it is.

• Assume we are given the matrix B from Example 3.26 which represents projection onto the y-axis. In this case, we cannot restore a vector $\vec{x}$ after we projected it onto the y-axis. For example, if we know that $B\vec{x} = \begin{pmatrix} 0 \\ 2 \end{pmatrix}$, then $\vec{x}$ could have been $\begin{pmatrix} 0 \\ 2 \end{pmatrix}$ or $\begin{pmatrix} 7 \\ 2 \end{pmatrix}$ or any other vector in $\mathbb{R}^2$ whose second component is equal to 2. This shows that B does not have an inverse.

Inverse of a matrix for solving a linear system


Let us consider the following situation. A grocery sells two different packages of fruits. Type A
contains 1 peach and 3 mangos and type B contains 2 peaches and 1 mango. We can ask two
different type of questions:

(i) Given a certain number of packages of type A and of type B, how many peaches and how
many mangos do we get?

(ii) How many packages of each type do we need in order to obtain a given number of peaches
and mangos?

The first question is quite easy to answer. Let us write down the information that we are given. If

a = number of packages of type A, p = number of peaches


b = number of packages of type B, m = number of mangos.

then

p = 1a + 2b
(3.22)
m = 3a + 1b.
Using vectors and matrices, we can rewrite this as
    
$$\begin{pmatrix} p \\ m \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix}.$$


 
Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix}$. Then the above becomes simply

$$\begin{pmatrix} p \\ m \end{pmatrix} = A\begin{pmatrix} a \\ b \end{pmatrix}. \tag{3.23}$$

If we know a and b (that is, we know how many packages of each type we bought), then we can
find the values of p and m by simply evaluating A( ab ) which is relatively easy.

Example 3.34. Assume that we buy 1 package of type A and 3 packages of type B, then we calculate

$$\begin{pmatrix} p \\ m \end{pmatrix} = A\begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 7 \\ 6 \end{pmatrix},$$

which shows that we get 7 peaches and 6 mangos.

If on the other hand, we know p and m and we are asked to find a and b such that (3.22) holds, we have to solve a linear system which is much more cumbersome. Of course, we can solve (3.23) using the Gauß or Gauß-Jordan elimination process, but if we were asked to do this for several pairs p and m, then it would become long quickly. However, if we had a matrix A′ such that A′A = id₂, then this task would be quite easy since in this case we could manipulate (3.23) as follows:

$$\begin{pmatrix} p \\ m \end{pmatrix} = A\begin{pmatrix} a \\ b \end{pmatrix} \implies A'\begin{pmatrix} p \\ m \end{pmatrix} = A'A\begin{pmatrix} a \\ b \end{pmatrix} = \mathrm{id}_2\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}.$$

If in addition we knew that AA′ = id₂, then we have that

$$\begin{pmatrix} p \\ m \end{pmatrix} = A\begin{pmatrix} a \\ b \end{pmatrix} \iff A'\begin{pmatrix} p \\ m \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}. \tag{3.24}$$

The task to find a and b again reduces to performing a matrix multiplication. The matrix A′, if it exists, is called the inverse of A and we will dedicate the rest of this section to giving criteria for its existence, investigating its properties and giving a recipe for finding it.

Exercise. Check that $A' = \frac{1}{5}\begin{pmatrix} -1 & 2 \\ 3 & -1 \end{pmatrix}$ satisfies $A'A = \mathrm{id}_2$.

Example 3.35. Assume that we want to buy 5 peaches and 5 mangos. Then we calculate

$$\begin{pmatrix} a \\ b \end{pmatrix} = A'\begin{pmatrix} 5 \\ 5 \end{pmatrix} = \frac{1}{5}\begin{pmatrix} -1 & 2 \\ 3 & -1 \end{pmatrix}\begin{pmatrix} 5 \\ 5 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix},$$

which shows that we have to buy 1 package of type A and 2 packages of type B.
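A short numerical check of the grocery example (values as above; np.linalg.inv is used only for comparison):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 1.0]])
    A_prime = np.array([[-1.0, 2.0],
                        [3.0, -1.0]]) / 5.0

    print(np.allclose(A_prime @ A, np.eye(2)))      # True: A'A = id_2
    print(np.allclose(A_prime, np.linalg.inv(A)))   # True
    print(A_prime @ np.array([5.0, 5.0]))           # [1. 2.]: one package A, two packages B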

Now let us give the precise definition of the inverse of a matrix.

Definition 3.36. A matrix A ∈ M(n × n) is called invertible if there exists a matrix A′ ∈ M(n × n) such that

$$AA' = \mathrm{id}_n \qquad\text{and}\qquad A'A = \mathrm{id}_n.$$

In this case A′ is called the inverse of A and it is denoted by A⁻¹. If A is not invertible then it is called non-invertible or singular.

The reason why in the definition we only admit square matrices (matrices with the same number
of rows and columns) is explained in the following remark.

Remark 3.37. (i) Let A ∈ M (m×n) and assume that there is a matrix B such that BA = idn .
This means that if for some ~b ∈ Rm the equation A~x = ~b has a solution, then it is unique
because
A~x = ~b =⇒ BA~x = B~b =⇒ ~x = B~b.
From the above it is clear that A ∈ M (m × n) can have an inverse only if for every ~b ∈ Rm
the equation A~x = ~b has at most one solution. We know that if A has more columns than
rows, then the number of columns will be larger than the number of pivots. Therefore,
A~x = ~b has either no or infinitely many solutions (see Theorem 3.7). Hence a matrix A with
more columns than rows cannot have an inverse.

(ii) Again, let A ∈ M (m × n) and assume that there is a matrix B such that AB = idm . This
means that for every ~b ∈ Rm the equation A~x = ~b is solved by ~x = B~b because

idm ~b = ~b =⇒ AB~b = ~b =⇒ A(B~b) = ~b.

From the above it is clear that A ∈ M (m × n) can have an inverse only if for every ~b ∈ Rm
the equation A~x = ~b has at least one solution. Assume that A has more rows than columns.
If we apply Gaussian elimination to the augmented matrix (A|~b) then the last row of the row-echelon form has to be (0 · · · 0|βₘ). If we choose ~b such that after the reduction βₘ ≠ 0,
row-echelon form has to be (0 · · · 0|βm ). If we chose ~b such that after the reduction βm 6= 0,
then A~x = ~b does not have a solution. Such a ~b is easy to find: We only need to take ~em
(the mth unit vector) and do the steps from the Gauß elimination backwards. If we take
this vector as right hand side of our system, then the last row after the reduction will be
(0 . . . 0|1). Therefore, a matrix A with more rows than columns cannot have an inverse
because there will always be some ~b such that the equation A~x = ~b has no solution.
In conclusion we showed that we must have m = n if A is to have an inverse matrix.

If A ∈ M (m × n) with n 6= m, then it does not make sense to speak of an inverse of A as explained


above. However, we can define the left inverse and the right inverse.

Definition 3.38. Let A ∈ M (m × n).

(i) A matrix C is called a left inverse of A if CA = idn .


(ii) A matrix D is called a right inverse of A if AD = idm .

Note that C and D must be n × m matrices. The following examples show that the left- and right
inverses do not need to exist, and if they do, they are not unique.
 
0 0
Examples 3.39. (i) A = has neither left- nor right inverse.
0 0


 
(ii) $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$ has no left inverse and has right inverse $D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}$. In fact, for every $x, y \in \mathbb{R}$ the matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ x & y \end{pmatrix}$ is a right inverse of A.

(iii) $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}$ has no right inverse and has left inverse $C = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$. In fact, for every $x, y \in \mathbb{R}$ the matrix $\begin{pmatrix} 1 & 0 & x \\ 0 & 1 & y \end{pmatrix}$ is a left inverse of A.

Remark 3.40. We will show in Theorem 3.45 that a matrix A ∈ M (n × n) is invertible if and only
if it has a left- and a right inverse.

Examples 3.41. • From the examples at the beginning of this section we have:

$$A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \implies A^{-1} = A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad C = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \implies C^{-1} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$
$$B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \implies B \text{ is not invertible.}$$

• Let $A = \begin{pmatrix} 4 & 0 & 0 & 0 \\ 0 & 5 & 0 & 0 \\ 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}$. Then we can easily guess that $A^{-1} = \begin{pmatrix} 1/4 & 0 & 0 & 0 \\ 0 & 1/5 & 0 & 0 \\ 0 & 0 & -1/3 & 0 \\ 0 & 0 & 0 & 1/2 \end{pmatrix}$ is an inverse of A. It is easy to check that the product of these matrices gives id₄.

• Let A ∈ M(n × n) and assume that the kth row of A consists of only zeros. Then A is not invertible because for any matrix B ∈ M(n × n), the kth row of the product matrix AB will be zero, no matter how we choose B. So there is no matrix B such that AB = idₙ.

• Let A ∈ M(n × n) and assume that the kth column of A consists of only zeros. Then A is not invertible because for any matrix B ∈ M(n × n), the kth column of the product matrix BA will be zero, no matter how we choose B. So there is no matrix B such that BA = idₙ.

Now let us prove some theorems about inverse matrices. Recall that A ∈ M (n × n) is invertible if
and only if there exists a matrix A0 ∈ M (n × n) such that AA0 = A0 A = idn .
First we will show that the inverse matrix, if it exists, is unique.

Theorem 3.42. Let A, B ∈ M (n × n).

(i) If A is invertible, then its inverse is unique.


(ii) If A is invertible, then its inverse A−1 is invertible and its inverse is A.
(iii) If A and B are invertible, then their product AB is invertible and (AB)−1 = B −1 A−1 .

Proof. (i) Assume that A is invertible and that A′ and A″ are inverses of A. Note that this means that

$$AA' = A'A = \mathrm{id}_n \qquad\text{and}\qquad AA'' = A''A = \mathrm{id}_n. \tag{3.25}$$

We have to show that A′ = A″. This follows from (3.25) and from the associativity of the matrix multiplication because

$$A' = A'\,\mathrm{id}_n = A'(AA'') = (A'A)A'' = \mathrm{id}_n\,A'' = A''.$$

(ii) Assume that A is invertible and let A−1 be its inverse. In order to show that A−1 is invertible,
we need a matrix C such that CA−1 = A−1 C = idn . This matrix C is then the inverse of
A−1 . Clearly, C = A does the trick. Therefore A−1 is invertible and (A−1 )−1 = A.
(iii) Assume that A and B are invertible. In order to show that AB is invertible and (AB)−1 =

B −1 A−1 , we only need to verify that B −1 A−1 (AB) = (AB)B −1 A−1 = idn . We see that this
is true using the associativity of the matrix product:

B −1 A−1 (AB) = B −1 (A−1 A)B = B −1 idn B = B −1 B = idn ,


(AB)B⁻¹A⁻¹ = A(BB⁻¹)A⁻¹ = A idₙ A⁻¹ = AA⁻¹ = idₙ.

Note that in the proof we guessed the formula for (AB)−1 and then we verified that it indeed is the
inverse of AB. We can also calculate it as follows. Assume that C is a left inverse of AB. Then
RA
C(AB) = idn ⇐⇒ (CA)B = idn ⇐⇒ CA = idn B −1 = B −1 ⇐⇒ C = B −1 A−1

If D is a right inverse of AB then

(AB)D = idn ⇐⇒ A(BD) = idn ⇐⇒ BD = A−1 idn = A−1 ⇐⇒ D = B −1 A−1 .

Since C = D, this is the inverse of AB.

Remark 3.43. In general, the sum of invertible matrices is not invertible. For example, both idn

and − idn are invertible, but their sum is the zero matrix which is not invertible.

Theorem 3.44 in the next section will show us how to find the inverse of an invertible matrix; see in
particular the section on page 113.

You should now have understood


• what invertibility of a matrix means and why it does not make sense to speak of the invert-
ibility of a matrix which is not a square matrix,
• that invertibility of an n × n matrix is equivalent to the fact that for every ~b ∈ Rⁿ
the associated linear system A~x = ~b has exactly one solution,
• etc.


You should now be able to


• guess the inverse of simple invertible matrices, for example of matrices which have a clear
geometric interpretation, or of diagonal matrices,
• verify if two given matrices are inverse to each other,
• give examples of invertible and of non-invertible matrices,
• etc.

Exercises.

1. Determine which of the following matrices are inverses of each other.

$$A = \begin{pmatrix} 3 & 1 \\ 5 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 1/3 & 1 \\ 1/5 & 1/2 \end{pmatrix}, \quad C = \begin{pmatrix} 2 & -1 \\ -5 & 3 \end{pmatrix}, \quad D = \begin{pmatrix} -3 & -1 \\ -5 & -2 \end{pmatrix}, \quad E = \begin{pmatrix} -2 & 1 \\ 5 & -3 \end{pmatrix}.$$

2. Show that $\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ -2 & -4 & 1 \end{pmatrix}$ is its own inverse.

3. Verify that $\begin{pmatrix} 4 & -3 \\ 0 & 0 \\ -1 & 1 \end{pmatrix}$ is a right inverse of the matrix $A = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 4 \end{pmatrix}$ and use it to find a particular solution of the system $A\vec{x} = \vec{b}$.

4. Let $A = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$. Is A invertible? Hint. Suppose that AB = id₂ for some matrix B ∈ M(2 × 2). What does the first column look like if one evaluates the product AB?

3.6 Matrices and linear systems

Let us recall from Theorem 3.7:


For A ∈ M (m × n) and ~b ∈ Rm consider the equation

A~x = ~b. (3.26)

Then the following is true:


(1) Equation (3.26) has no ⇐⇒ The reduced row echelon form of the augmented
solution. system (A|~b) has a row of the form (0 · · · 0|β) with
some β 6= 0.
(2) Equation (3.26) has at least ⇐⇒ The reduced row echelon form of the augmented
one solution. system (A|~b) has no row of the form (0 · · · 0|β)
with some β 6= 0.
In case (2), we have the following two sub-cases:
(2.1) Equation (3.26) has exactly one solution. ⇐⇒ #pivots = #columns.
(2.2) Equation (3.26) has infinitely many solutions. ⇐⇒ #pivots < #columns.

Observe that the case (1), no solution, cannot occur for homogeneous systems.
The next theorem connects the above to invertibility of the matrix representing the system.

Theorem 3.44. Let A ∈ M (n × n). Then the following is equivalent:


(i) A is invertible.
(ii) For every ~b ∈ Rn , the equation A~x = ~b has exactly one solution.
(iii) The equation A~x = ~0 has exactly one solution.
(iv) Every row-reduced echelon form of A has n pivots.
(v) A is row-equivalent to idn .

We will complete this theorem with one more item in Chapter 4 (Theorem 4.11).

Proof. (ii) ⇒ (iii) follows if we choose ~b = ~0.


(iii) ⇒ (iv) If A~x = ~0 has only one solution, then, by the case (2.1) above (or by Theorem 3.7(2.1)),

the number of pivots is equal to n (the number of columns of A) in every row-reduced echelon form
of A.
(iv) ⇒ (v) is clear.
(v) ⇒ (ii) follows from case (2.1) above (or by Theorem 3.7(2.1)) because no row-reduced form of
A can have a row consisting of only zeros.
So far we have shown that (ii) - (v) are equivalent. Now we have to connect them to (i).
(i) ⇒ (ii) Assume that A is invertible and let ~b ∈ Rn . Then A~x = ~b ⇐⇒ ~x = A−1~b which shows
existence and uniqueness of the solution.
(ii) ⇒ (i) Assume that (ii) holds. We will construct A−1 as follows (this also tells us how we can
calculate A−1 if it exists). Recall that we need a matrix C such that AC = idn . This C will
then be our candidate for A−1 (we still would have to check that CA = idn ). Let us denote the
columns of C by ~cj for j = 1, . . . , n, so that C = (~c1 | · · · |~cn ). Recall that the kth column of AC is


A(kth column of C) and that the columns of idn are exactly the unit vectors ~ek (the vector with a
1 as kth component and zeros everywhere else). Then AC = idn can be written as

(A~c1 | · · · |A~cn ) = (~e1 | · · · |~en ).

By (ii) we know that equations of the form A~x = ~ej have a unique solution. So we only need
to set ~cj = unique solution of the equation A~x = ~ej . With this choice we then have indeed that
AC = idn .
It remains to show that CA = idn . To this end, note that

A = idn A =⇒ A = ACA =⇒ A − ACA = O =⇒ A(idn −CA) = O.

This means that A(idn −CA)~x = ~0 for every ~x ∈ Rn . Since by (ii) the equation A~y = ~0 has the
unique solution ~y = ~0, it follows that (idn −CA)~x = ~0 for every x ∈ Rn . But this means that
~x = CA~x for every ~x, hence CA must be equal to idn .

FT
Theorem 3.45. Let A ∈ M (n × n).

(i) If A has a left inverse C (that is, if CA = idn ), then A is invertible and A−1 = C.

(ii) If A has a right inverse D (that is, if AD = idn ), then A is invertible and A−1 = D.

Proof. (i) By Theorem 3.44 it suffices to show that A~x = ~0 has the unique solution ~0. So
assume that ~x ∈ Rn satisfies A~x = ~0. Then ~x = idn ~x = (CA)~x = C(A~x) = C~0 = ~0. This
shows that A is invertible. Moreover, C = C(idn ) = C(AA−1 ) = (CA)A−1 = idn A−1 = A−1 ,
RA
hence C = A−1 .

(ii) By (i) applied to D, it follows that D has an inverse and that D−1 = A, so by Theo-
rem 3.42 (ii), A is invertible and A−1 = (D−1 )−1 = D.

Calculation of the inverse of a given square matrix

Let A be a square matrix. The proof of Theorem 3.44 tells us how to find its inverse if it exists.

We only need to solve A~x = ~ek for k = 1, . . . , n. This might be cumbersome and long, but we
already know that if these equations have solutions, then we can find them with the Gauß-Jordan
elimination. We only need to form the augmented matrix (A|~ek ), apply row operations until we get
to (idn |~ck ). Then ~ck is the solution of A~x = ~ek and we obtain the matrix A−1 as the matrix whose
columns are the vectors ~c1 , . . . , ~cn . If it is not possible to reduce A to the identity matrix, then it
is not invertible.
Note that the steps that we have to perform to reduce A to the identity matrix depend only on
the coefficients in A and not on the right hand side. So we can calculate the n vectors ~c1 , . . . ~cn
with only one (big) Gauß-Jordan elimination if we augment our given matrix A by the n vectors
~e1 , . . . ,~en . But the matrix (~e1 | · · · |~en ) is nothing else than the identity matrix idn . So if we take
(A| idn ) and apply the Gauß-Jordan elimination and if we can reduce A to the identity matrix,
then the columns on the right are the columns of the inverse matrix A−1 . If we cannot get to the
identity matrix, then A is not invertible.
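This recipe translates directly into a small algorithm. The following Python/NumPy sketch is only an illustration of the idea (in practice one would call numpy.linalg.inv): it row-reduces the augmented matrix (A | idₙ) and returns the right half, or None if A cannot be reduced to the identity matrix.

    import numpy as np

    def inverse_by_gauss_jordan(A, tol=1e-12):
        """Row-reduce (A | id_n); return the inverse of A, or None if A is not invertible."""
        A = np.array(A, dtype=float)
        n = A.shape[0]
        M = np.hstack([A, np.eye(n)])                # augmented matrix (A | id_n)
        for col in range(n):
            piv = col + np.argmax(np.abs(M[col:, col]))   # choose a pivot row
            if abs(M[piv, col]) < tol:
                return None                          # no pivot in this column
            M[[col, piv]] = M[[piv, col]]            # swap rows
            M[col] /= M[col, col]                    # make the pivot equal to 1
            for r in range(n):                       # clear the rest of the column
                if r != col:
                    M[r] -= M[r, col] * M[col]
        return M[:, n:]

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    print(inverse_by_gauss_jordan(A))                # [[-2.   1. ] [ 1.5 -0.5]]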


 
Examples 3.46. (i) Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. Let us show that A is invertible by reducing the augmented matrix (A | id₂):

$$(A\,|\,\mathrm{id}_2) = \left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 3 & 4 & 0 & 1 \end{array}\right) \xrightarrow{R_2 - 3R_1 \to R_2} \left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & -2 & -3 & 1 \end{array}\right) \xrightarrow{R_1 + R_2 \to R_1} \left(\begin{array}{cc|cc} 1 & 0 & -2 & 1 \\ 0 & -2 & -3 & 1 \end{array}\right) \xrightarrow{-\frac12 R_2 \to R_2} \left(\begin{array}{cc|cc} 1 & 0 & -2 & 1 \\ 0 & 1 & 3/2 & -1/2 \end{array}\right).$$

Hence A is invertible and $A^{-1} = \begin{pmatrix} -2 & 1 \\ 3/2 & -1/2 \end{pmatrix}$.

We can check our result by calculating

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} -2 & 1 \\ 3/2 & -1/2 \end{pmatrix} = \begin{pmatrix} -2+3 & 1-1 \\ -6+6 & 3-2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} -2 & 1 \\ 3/2 & -1/2 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} -2+3 & -4+4 \\ 3/2-3/2 & 3-2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

(ii) Let $A = \begin{pmatrix} 1 & 2 \\ -2 & -4 \end{pmatrix}$. Let us show that A is not invertible by reducing the augmented matrix (A | id₂):

$$(A\,|\,\mathrm{id}_2) = \left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ -2 & -4 & 0 & 1 \end{array}\right) \xrightarrow{R_2 + 2R_1 \to R_2} \left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 0 & 2 & 1 \end{array}\right).$$

Since there is a zero row in the left matrix, we conclude that A is not invertible.

(iii) Let $A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{pmatrix}$. Let us show that A is invertible by reducing the augmented matrix (A | id₃):

$$(A\,|\,\mathrm{id}_3) = \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & 3 & 0 & 1 & 0 \\ 5 & 5 & 1 & 0 & 0 & 1 \end{array}\right) \xrightarrow{R_3 - 5R_1 \to R_3} \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & 3 & 0 & 1 & 0 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right) \xrightarrow{4R_2 + 3R_3 \to R_2} \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 8 & 0 & -15 & 4 & 3 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right)$$
$$\xrightarrow{4R_1 + R_3 \to R_1} \left(\begin{array}{ccc|ccc} 4 & 4 & 0 & -1 & 0 & 1 \\ 0 & 8 & 0 & -15 & 4 & 3 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right) \xrightarrow{2R_1 - R_2 \to R_1} \left(\begin{array}{ccc|ccc} 8 & 0 & 0 & 13 & -4 & -1 \\ 0 & 8 & 0 & -15 & 4 & 3 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right)$$
$$\xrightarrow[\;R_3 \to -\frac14 R_3\;]{R_1 \to \frac18 R_1,\; R_2 \to \frac18 R_2} \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 13/8 & -1/2 & -1/8 \\ 0 & 1 & 0 & -15/8 & 1/2 & 3/8 \\ 0 & 0 & 1 & 5/4 & 0 & -1/4 \end{array}\right).$$

Hence A is invertible and $A^{-1} = \begin{pmatrix} 13/8 & -1/2 & -1/8 \\ -15/8 & 1/2 & 3/8 \\ 5/4 & 0 & -1/4 \end{pmatrix} = \frac18\begin{pmatrix} 13 & -4 & -1 \\ -15 & 4 & 3 \\ 10 & 0 & -2 \end{pmatrix}$.

We can check our result by calculating

$$\begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{pmatrix}\begin{pmatrix} 13/8 & -1/2 & -1/8 \\ -15/8 & 1/2 & 3/8 \\ 5/4 & 0 & -1/4 \end{pmatrix} = \cdots = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 13/8 & -1/2 & -1/8 \\ -15/8 & 1/2 & 3/8 \\ 5/4 & 0 & -1/4 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{pmatrix} = \cdots = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
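A quick numerical confirmation of part (iii) (same matrix as above):

    import numpy as np

    A = np.array([[1.0, 1.0, 1.0],
                  [0.0, 2.0, 3.0],
                  [5.0, 5.0, 1.0]])
    A_inv = np.array([[13, -4, -1],
                      [-15, 4, 3],
                      [10, 0, -2]]) / 8.0

    print(np.allclose(A @ A_inv, np.eye(3)))   # True
    print(np.allclose(A_inv @ A, np.eye(3)))   # True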

Special case: Inverse of a 2 × 2 matrix

Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. We already know that A is invertible if and only if its associated homogeneous linear system has exactly one solution. By Theorem 1.11 this is the case if and only if det A ≠ 0. Recall that det A = ad − bc. So let us assume here that det A ≠ 0.

Case 1. a ≠ 0.

$$(A\,|\,\mathrm{id}_2) = \left(\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right) \xrightarrow{aR_2 - cR_1 \to R_2} \left(\begin{array}{cc|cc} a & b & 1 & 0 \\ 0 & ad-bc & -c & a \end{array}\right) \xrightarrow{R_1 - \frac{b}{ad-bc}R_2 \to R_1} \left(\begin{array}{cc|cc} a & 0 & 1 + \frac{bc}{ad-bc} & -\frac{ab}{ad-bc} \\ 0 & ad-bc & -c & a \end{array}\right)$$
$$= \left(\begin{array}{cc|cc} a & 0 & \frac{ad}{ad-bc} & -\frac{ab}{ad-bc} \\ 0 & ad-bc & -c & a \end{array}\right) \xrightarrow[\;R_2 \to \frac{1}{ad-bc}R_2\;]{R_1 \to \frac1a R_1} \left(\begin{array}{cc|cc} 1 & 0 & \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\ 0 & 1 & -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{array}\right).$$

It follows that
$$A^{-1} = \frac{1}{ad-bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \tag{3.27}$$

Case 2. a = 0. Since 0 ≠ det A = ad − bc = −bc in this case, it follows that c ≠ 0 and calculations as above again lead to formula (3.27).
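Formula (3.27) is easy to turn into a two-line function; as a sanity check, the following sketch reproduces the inverse found in Example 3.46 (i).

    import numpy as np

    def inv2x2(a, b, c, d):
        det = a * d - b * c
        if det == 0:
            raise ValueError("the matrix is not invertible")
        return np.array([[d, -b], [-c, a]]) / det

    print(inv2x2(1, 2, 3, 4))
    # [[-2.   1. ]
    #  [ 1.5 -0.5]]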

You should now have understood

• the relation between the invertibility of a square matrix A and the existence and uniqueness
of solution of A~x = ~b,
• that inverting a matrix is the same as solving a linear system,
• etc.
You should now be able to


• calculate the inverse of a square matrix if it exists,


• use the inverse of a square matrix if it exists to solve the associated linear system,
• etc.

Exercises.

1. Determine the inverse (if possible) of the following matrices:

(a) $\begin{pmatrix} 1 & 3 \\ -2 & 6 \end{pmatrix}$  (b) $\begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{pmatrix}$  (c) $\begin{pmatrix} -1 & 6 \\ 2 & -12 \end{pmatrix}$

(d) $\begin{pmatrix} 3 & 2 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & -1 \end{pmatrix}$  (e) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -3 \\ 4 & 3 & 1 \end{pmatrix}$  (f) $\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & -1 & 2 \\ 1 & -1 & 2 & 1 \\ 1 & 3 & 3 & 2 \end{pmatrix}$

(g) $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 0 & 3 & 3 \end{pmatrix}$  (h) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & 3 \end{pmatrix}$

2. Which of the following systems of linear equations has a nontrivial solution?

i) 2x + y − z = 0          ii) x − y − z = 0
   x − 2y − 3z = 0             2x + y + 2z = 0
   −3x − y − z = 0             −2x + 5y + 6z = 0

3. Determine all values of a such that

$$A = \begin{pmatrix} 1 & 4 & a^2 \\ 1 & 0 & 0 \\ 1 & 2 & 2 \end{pmatrix}$$

is invertible. In that case find A⁻¹.

4. Compute the inverse of the rotation matrix $C_\vartheta = \begin{pmatrix} \cos\vartheta & -\sin\vartheta \\ \sin\vartheta & \cos\vartheta \end{pmatrix}$. How is $C_\vartheta^{-1}$ interpreted geometrically? (See Section 3.4, Exercise 6.)

5. Compute the inverse of
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\vartheta & -\sin\vartheta \\ 0 & \sin\vartheta & \cos\vartheta \end{pmatrix}.$$

6. Let A ∈ M(n × n) be non-invertible. Show that there exists B ∈ M(n × n), B ≠ O, such that AB = O. (Hint: consider the homogeneous system A~x = ~0.)

7. Let B ∈ M(6 × 5) and C ∈ M(5 × 6). Show that BC cannot be an invertible matrix. (Hint: consider the homogeneous system C~x = ~0.)


3.7 The transpose of a matrix


 
Definition 3.47. Let $A = (a_{ij})_{\substack{i=1,\dots,m\\ j=1,\dots,n}} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \in M(m\times n)$. Then its transpose $A^t$ is the n × m matrix whose columns are the rows of A and whose rows are the columns of A, that is,

$$A^t = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix} \in M(n\times m).$$

If we denote $A^t = (\tilde{a}_{ij})_{\substack{i=1,\dots,n\\ j=1,\dots,m}}$, then $\tilde{a}_{ij} = a_{ji}$ for i = 1, . . . , n and j = 1, . . . , m.

Examples 3.48. The transposes of

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 7 & 7 \\ 3 & 2 & 4 \end{pmatrix}$$

are

$$A^t = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}, \qquad B^t = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}, \qquad C^t = \begin{pmatrix} 1 & 4 & 7 & 3 \\ 2 & 5 & 7 & 2 \\ 3 & 6 & 7 & 4 \end{pmatrix}.$$

Proposition 3.49. Let A, B ∈ M (m × n). Then (At )t = A and (A + B)t = At + B t .

Proof. Clear.

Theorem 3.50. Let A ∈ M (m × n) and B ∈ M (n × k). Then (AB)t = B t At .

Proof. Note that both (AB)ᵗ and BᵗAᵗ are k × m matrices. In order to show that they are equal, we only need to show that they are equal in every entry. Let i ∈ {1, . . . , k} and j ∈ {1, . . . , m}.
Then

component ij of (AB)t = component ji of AB


= [row j of A] × [column i of B]
= [column j of At ] × [row i of B t ]
= [row i of B t ] × [column j of At ]
= component ij of B t At .
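Theorem 3.50 can also be illustrated numerically with randomly chosen matrices (a one-line check in Python/NumPy):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-5, 5, size=(2, 3))
    B = rng.integers(-5, 5, size=(3, 4))

    print(np.array_equal((A @ B).T, B.T @ A.T))   # True: (AB)^t = B^t A^t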


Theorem 3.51. Let A ∈ M (n × n). Then A is invertible if and only if At is invertible. In this
case, (At )−1 = (A−1 )t .

Proof. Assume that A is invertible. Then AA−1 = id. Taking the transpose on both sides, we find

id = idt = (AA−1 )t = (A−1 )t At .

This shows that At is invertible and its inverse is (A−1 )t , see Theorem 3.45. Now assume that
At is invertible. From what we just showed, it follows that then also its transpose (At )t = A is
invertible.

Next we show an important relation between transposition of a matrix and the inner product on
Rn .

Theorem 3.52. Let A ∈ M (m × n).

(i) hA~x , ~y i = h~x , At ~y i for all ~x ∈ Rn and all ~y ∈ Rm .

(ii) If hA~x , ~y i = h~x , B~y i for all ~x ∈ Rn and all ~y ∈ Rm , then B = At .

Proof. Let $A = (a_{ij})_{\substack{i=1,\dots,m\\ j=1,\dots,n}}$ and $B = (b_{ij})_{\substack{i=1,\dots,n\\ j=1,\dots,m}}$.

(i) Observe that the kth component of $A\vec{x}$ is $(A\vec{x})_k = \sum_{j=1}^{n} a_{kj}x_j$ and that the ℓth component of $A^t\vec{y}$ is $(A^t\vec{y})_\ell = \sum_{j=1}^{m} a_{j\ell}y_j$. Then

$$\langle A\vec{x}\,,\,\vec{y}\rangle = \sum_{k=1}^{m} (A\vec{x})_k y_k = \sum_{k=1}^{m}\sum_{j=1}^{n} a_{kj}x_j y_k = \sum_{j=1}^{n} x_j \sum_{k=1}^{m} a_{kj}y_k = \sum_{j=1}^{n} x_j (A^t\vec{y})_j = \langle \vec{x}\,,\,A^t\vec{y}\rangle.$$
k=1 k=1 j=1 j=1 k=1 j=1

(ii) We have to show: For all i = 1, . . . , m and j = 1, . . . , n we have that aij = bji . Take
~x = ~ej ∈ Rn and ~y = ~ei ∈ Rm . If we take the inner product of A~ej with ~ei , then we obtain
the ith component of A~ej . Recall that A~ej is the jth column of A, hence

hA~ej ,~ei i = aij .



Similarly if we take the inner product of B~ei with ~ej , then we obtain the jth component of
B~ei . Since B~ei is the ith column of B it follows that

h~ej , B~ei i = bji .

By assumption hA~ej ,~ei i = h~ej , B~ei i, hence it follows that aij = bji , hence B = At .

Definition 3.53. Let A = (aij )ni,j=1 ∈ M (n × n) be a square matrix.

(i) A is called upper triangular if aij = 0 if i > j.


(ii) A is called lower triangular if aij = 0 if i < j.
(iii) A is called diagonal if aij = 0 if i 6= j. Diagonal matrices are sometimes denoted by
diag(c1 , . . . , cn ) where the c1 , . . . , cn are the numbers on the diagonal of the matrix.


That means that for an upper triangular matrix all entries below the diagonal are zero, for a lower
triangular matrix all entries above the diagonal are zero and for a diagonal matrix, all entries except
the ones on the diagonal must be zero. These matrices look as follows:
     
$$\begin{pmatrix} a_{11} & & \ast \\ & \ddots & \\ 0 & & a_{nn} \end{pmatrix}, \qquad \begin{pmatrix} a_{11} & & 0 \\ & \ddots & \\ \ast & & a_{nn} \end{pmatrix}, \qquad \begin{pmatrix} a_{11} & & 0 \\ & \ddots & \\ 0 & & a_{nn} \end{pmatrix}$$

upper triangular matrix, lower triangular matrix, diagonal matrix diag(a₁₁, . . . , aₙₙ).

Remark 3.54. A matrix is both upper and lower triangular if and only if it is diagonal.

Examples 3.55.

FT
   
  0 2 4 2 0 0 0 0    
1 2 4 0 2 0 0 0 0 0
0 5 2 0 3 0 0
A = 0 2 5 , B = 
0
, C =  , D = 0 3 0 , E = 0 0 0 .
0 0 8 3 4 0 0
0 0 3 0 0 8 0 0 0
0 0 0 0 5 0 0 1

The matrices A, B, D, E are upper triangular, C, D, E are lower triangular, D, E are diagonal.
RA
Definition 3.56. (i) A matrix A ∈ M (n × n) is called symmetric if At = A. The set of all
symmetric n × n matrices is denoted by Msym (n × n).

(ii) A matrix A ∈ M (n × n) is called antisymmetric if At = −A. The set of all antisymmetric


n × n matrices is denoted by Masym (n × n).

Examples 3.57.
       
$$A = \begin{pmatrix} 1 & 7 & 4 \\ 7 & 2 & 5 \\ 4 & 5 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} 3 & 0 & 4 \\ 0 & 4 & 0 \\ 4 & 0 & 1 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & 2 & -5 \\ -2 & 0 & -3 \\ 5 & 3 & 0 \end{pmatrix}, \quad D = \begin{pmatrix} 0 & 0 & 8 \\ 0 & 3 & 0 \\ 2 & 0 & 0 \end{pmatrix}.$$

The matrices A and B are symmetric, C is antisymmetric and D is neither.

Clearly, every diagonal matrix is symmetric.

Exercise 3.58. • Let A ∈ M (n × n). Show that A + At is symmetric and that A − At is


antisymmetric.
• Show that every matrix A ∈ M (n × n) can be written as the sum of symmetric and an
antisymmetric matrix.


Question 3.2
How many possibilities are there to express a given matrix A ∈ M (n × n) as sum of a symmetric
and an antisymmetric matrix?

Exercise 3.59. Show that the diagonal entries of an antisymmetric matrix are 0.

You should now have understood


• why (AB)t = B t At ,
• what the transpose of a matrix has to do with the inner product,
• etc.

You should now be able to

FT
• calculate the transpose of a given matrix,
• check if a matrix is symmetric, antisymmetric or none,
• etc.

Exercises.

1. (a) Find the transposes of the following matrices:
RA
 
  0
  1 3  
−1 4 2 −1 1 5 −1
(a) , (b) −1 2 , (c) ,  4 .
(d)  
10 8 0 0 4 13
4 5
3

(b) For each of the matrices of the previous item, verify the identity ⟨A~x , ~y⟩ = ⟨~x , Aᵗ~y⟩.
 
2. Find α, β such that $\begin{pmatrix} 1 & \beta+5 & \alpha+2\beta-2 \\ \alpha+2\beta & -1 & 2 \\ 2\alpha+\beta & 2 & 4 \end{pmatrix}$ is a symmetric matrix.
3. Which of the following matrices are antisymmetric?

(a) $\begin{pmatrix} 2 & -7 \\ 7 & 0 \end{pmatrix}$,  (b) $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$,  (c) $\begin{pmatrix} 3 & -3 & -3 \\ 3 & 3 & -3 \\ 3 & 3 & 3 \end{pmatrix}$,  (d) $\begin{pmatrix} 0 & 1 & -1 \\ -1 & 0 & 2 \\ 1 & -2 & 0 \end{pmatrix}$.

4. If A, B ∈ M_sym(n × n), show that (AB)ᵗ = BA. Can one conclude that AB is symmetric?

5. Let ~v1, ~v2, ~v3 ∈ R³ be such that ⟨~vi , ~vj⟩ = 0 if i ≠ j and ⟨~vi , ~vj⟩ = 1 if i = j. Let A = [~v1 ~v2 ~v3] be the matrix whose columns are the given vectors. Show that AAᵗ = id₃ (see Section 3.6, Exercise 5).


6. Show that the sum of two symmetric (antisymmetric) n × n matrices is again a symmetric (antisymmetric) matrix.

7. (a) Show that an upper (lower) triangular matrix is invertible if and only if all entries on its main diagonal are different from 0.

(b) Let D ∈ M(n × n) be a diagonal matrix. Determine when D is invertible and find its inverse.

3.8 Elementary matrices


In this section we study three special types of matrices. They are called elementary matrices. Let
us define them.

Definition 3.60. For n ∈ N we define the following matrices in M (n × n):

(i) $S_j(c) = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & c & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}$ for j = 1, . . . , n and c ≠ 0. The entry c sits in row j and column j; all entries outside the diagonal are 0.

(ii) $Q_{jk}(c) = \begin{pmatrix} 1 & & & & \\ & \ddots & & c & \\ & & \ddots & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}$ for j, k = 1, . . . , n with j ≠ k and c ∈ R. The number c is in row j and column k. All entries apart from c and the diagonal are 0.

(iii) $P_{jk} = \begin{pmatrix} 1 & & & & \\ & 0 & & 1 & \\ & & \ddots & & \\ & 1 & & 0 & \\ & & & & 1 \end{pmatrix}$ for j, k = 1, . . . , n; the off-diagonal 1s are in positions (j, k) and (k, j) and the 0s are in positions (j, j) and (k, k). This matrix is obtained from the identity matrix by swapping rows j and k (or, equivalently, by swapping columns j and k).

Examples 3.61. Let us see some examples for n = 2.

$$S_1(5) = \begin{pmatrix} 5 & 0 \\ 0 & 1 \end{pmatrix}, \qquad Q_{21}(3) = \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix}, \qquad P_{12} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Some examples for n = 3:

$$S_3(-2) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}, \quad Q_{23}(4) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{pmatrix}, \quad P_{31} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad P_{21} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Let us see how these matrices act on other n × n matrices. Let A = (aij )ni,j=1 ∈ M (n × n). We
want to calculate EA where E is an elementary matrix.
    
$$\bullet\quad S_j(c)A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & & & \vdots \\ ca_{j1} & ca_{j2} & \cdots & ca_{jn} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}, \qquad \bullet\quad Q_{jk}(c)A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{j1}+ca_{k1} & \cdots & a_{jn}+ca_{kn} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix},$$

$$\bullet\quad P_{jk}A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \\ \vdots & & \vdots \\ a_{j1} & \cdots & a_{jn} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} \qquad\text{(the jth and kth rows of A are interchanged).}$$
In summary, we see that



Proposition 3.62. • Sj (c) multiplies the jth row of A by c.

• Qjk (c) sums c times the kth row to the jth row of A.
• Pjk swaps the kth and the jth row of A.

These are exactly the row operations from the Gauß or Gauß-Jordan elimination! So we see that
every row operation can be achieved by multiplying from the left by an appropriate elementary
matrix.
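As a small illustration (matrix A chosen arbitrarily), the following Python/NumPy sketch builds the three types of elementary matrices and shows that multiplying from the left performs the corresponding row operation:

    import numpy as np

    def S(n, j, c):        # multiply row j by c (rows counted from 1)
        E = np.eye(n); E[j-1, j-1] = c; return E

    def Q(n, j, k, c):     # add c times row k to row j
        E = np.eye(n); E[j-1, k-1] = c; return E

    def P(n, j, k):        # swap rows j and k
        E = np.eye(n); E[[j-1, k-1]] = E[[k-1, j-1]]; return E

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])

    print(S(3, 2, 10) @ A)     # row 2 multiplied by 10
    print(Q(3, 3, 1, -7) @ A)  # -7 times row 1 added to row 3
    print(P(3, 1, 3) @ A)      # rows 1 and 3 swapped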

Remark 3.63. The form of the elementary matrices is quite easy to remember if you recall that
E idn = E for every matrix E, in particular for an elementary matrix. So, if you want to remember


e.g. how the 5 × 5 matrix looks which sums 3 times the 2nd row to the 4th, just remember that this matrix is

$$E = E\,\mathrm{id}_5 = (\text{take } \mathrm{id}_5 \text{ and sum 3 times its 2nd row to its 4th row}) = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 3 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$

which is $Q_{42}(3)$.

Question 3.3
How do elementary matrices act on other matrices if we multiply them from the right?
Hint. There are two ways to find the answer. One is to carry out the matrix multiplication as we
did on page 122. Or you could use that AE = [(AE)t ]t = [E t At ]t . If E is an elementary matrix,

then so is Eᵗ, see Proposition 3.65. Since you know how EᵗAᵗ looks, you can then deduce how its transpose looks.

Since the action of an elementary matrix can be “undone” (since the corresponding row operation
can be undone), we expect them to be invertible. The next proposition shows that they indeed are
and that their inverse is again an elementary matrix of the same type.

Proposition 3.64. Every elementary n × n matrix is invertible. More precisely, for j, k = 1, . . . , n


with j 6= k the following holds:

(i) (Sj (c))−1 = Sj (c−1 ) for c 6= 0.

(ii) (Qjk (c))−1 = Qjk (−c).

(iii) (Pjk )−1 = Pjk .

Proof. Straightforward calculations.



Show that Proposition 3.64 is true. Convince yourself that it is true using their interpretation as
row operations.

Proposition 3.65. The transpose of an elementary n × n matrix is again an elementary matrix.


More precisely, for j, k = 1, . . . , n with j 6= k the following holds:

(i) (Sj (c))t = Sj (c) for c 6= 0.

(ii) (Qjk (c))t = Qkj (c).

(iii) (Pjk )t = Pjk .

Proof. Straightforward calculations.


Exercise 3.66. Show that Qjk (c) = Sk (c−1 )Qjk (1)Sk (c) for c 6= 0. Interpret the formula in
terms of row operations.

Exercise. Show that Pjk can be written as product of matrices of the form Qjk (c) and Sj (c).

Let us come back to the relation of elementary matrices and the Gauß-Jordan elimination process.

Proposition 3.67. Let A ∈ M(n × n) and let A′ be a row echelon form of A. Then there exist elementary matrices $E_1, \dots, E_k$ such that

$$A = E_1 E_2 \cdots E_k A'.$$

Proof. We know that we can arrive at A′ by applying suitable row operations to A. By Proposition 3.62 they correspond to multiplication of A from the left by suitable elementary matrices $F_k, F_{k-1}, \dots, F_2, F_1$, that is

$$A' = F_k F_{k-1} \cdots F_2 F_1 A.$$

We know that all the $F_j$ are invertible, hence their product is invertible and we obtain

$$A = [F_k F_{k-1} \cdots F_2 F_1]^{-1} A' = F_1^{-1} F_2^{-1} \cdots F_{k-1}^{-1} F_k^{-1} A'.$$

We know that the inverse of every elementary matrix Fj is again an elementary matrix, so if we set
Ej = Fj−1 for j = 1, . . . , k, the proposition is proved.

Corollary 3.68. Let A ∈ M (n × n). Then there exist elementary matrices E1 , . . . , Ek and an
upper triangular matrix U such that
RA
A = E1 E2 · · · Ek U.

Proof. This follows immediately from Proposition 3.67 if we recall that every row reduced echelon
form of A is an upper triangular matrix.

The next theorem shows that every invertible matrix is “composed” of elementary matrices.

Theorem 3.69. Let A ∈ M (n × n). Then A is invertible if and only if it can be written as product

of elementary matrices.

Proof. Assume that A is invertible. Then the reduced row echelon form of A is idn . Therefore,
by Proposition 3.67, there exist elementary matrices E1 , . . . , Ek such that A = E1 · · · Ek idn =
E1 · · · Ek .
If, on the other hand, we know that A is the product of elementary matrices, say, A = F1 · · · F` , then
clearly A is invertible since each elementary matrix Fj is invertible and the product of invertible
matrices is invertible.

We finish this section with an exercise where we write an invertible 2 × 2 matrix as product of
elementary matrices. Notice that there are infinitely many ways to write it as product of elementary
matrices just as there are infinitely many ways of performing row reduction to get to the identity
matrix.


 
Example 3.70. Write the matrix A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} as a product of elementary matrices.

Solution. We use the idea of the proof of Theorem 3.44: we apply the Gauß-Jordan elimination
process and write the corresponding row transformations as elementary matrices.

\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\xrightarrow[Q_{21}(-3)]{R_2 \to R_2 - 3R_1}
\underbrace{\begin{pmatrix} 1 & 2 \\ 0 & -2 \end{pmatrix}}_{= Q_{21}(-3)A}
\xrightarrow[Q_{12}(1)]{R_1 \to R_1 + R_2}
\underbrace{\begin{pmatrix} 1 & 0 \\ 0 & -2 \end{pmatrix}}_{= Q_{12}(1)Q_{21}(-3)A}
\xrightarrow[S_2(-\frac{1}{2})]{R_2 \to -\frac{1}{2}R_2}
\underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{= S_2(-\frac{1}{2})Q_{12}(1)Q_{21}(-3)A}

So we obtain that
id_2 = S_2(-\tfrac{1}{2})\, Q_{12}(1)\, Q_{21}(-3)\, A.   (3.28)

Since the elementary matrices are invertible, we can solve for A and obtain

A = [S_2(-\tfrac{1}{2}) Q_{12}(1) Q_{21}(-3)]^{-1} id_2 = [S_2(-\tfrac{1}{2}) Q_{12}(1) Q_{21}(-3)]^{-1}
  = [Q_{21}(-3)]^{-1} [Q_{12}(1)]^{-1} [S_2(-\tfrac{1}{2})]^{-1}
  = Q_{21}(3)\, Q_{12}(-1)\, S_2(-2).

Note that from (3.28) we get the factorisation of A^{-1} for free. Clearly, we must have

A^{-1} = S_2(-\tfrac{1}{2})\, Q_{12}(1)\, Q_{21}(-3).   (3.29)


If we wanted to, we could now use (3.29) to calculate A^{-1}. It is by no means a surprise that we
obtain the factorisation of A^{-1} first, because the Gauß-Jordan elimination leads to the inverse
of A. So A^{-1} is the composition of the matrices which lead from A to the identity matrix. (To
get from the identity matrix back to A, we need to reverse these steps.)
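We can verify both factorisations numerically. This is a sketch assuming NumPy; the elementary matrices are written out by hand for the 2 × 2 case.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Elementary matrices appearing in Example 3.70
Q21_3    = np.array([[1.0, 0.0], [ 3.0, 1.0]])   # Q_21(3)
Q12_m1   = np.array([[1.0, -1.0], [0.0, 1.0]])   # Q_12(-1)
S2_m2    = np.array([[1.0, 0.0], [0.0, -2.0]])   # S_2(-2)

print(np.allclose(Q21_3 @ Q12_m1 @ S2_m2, A))    # A = Q_21(3) Q_12(-1) S_2(-2)

Q21_m3   = np.array([[1.0, 0.0], [-3.0, 1.0]])   # Q_21(-3)
Q12_1    = np.array([[1.0, 1.0], [0.0, 1.0]])    # Q_12(1)
S2_mhalf = np.array([[1.0, 0.0], [0.0, -0.5]])   # S_2(-1/2)

print(np.allclose(S2_mhalf @ Q12_1 @ Q21_m3, np.linalg.inv(A)))  # factorisation (3.29) of A^{-1}
```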

You should now have understood

• the relation of the elementary matrices with the Gauß-Jordan process,
• why a matrix is invertible if and only if it is the product of elementary matrices,
• etc.

You should now be able to

• express an invertible matrix as a product of elementary matrices,
• etc.

Exercises.
1. Determine which of the following are elementary matrices:


 
  1 0 0  1 
1 0 −1 1 0 −3 0
(a) (b) (c) 1
2 1 0 3
2 0 1
   
  1 0 0 0 1 0 0 0
0 1 0 0 1 1 0 1
0 1 0 0
(d) 0 1 (e) 
0 0 1 0
 (f) 
0

0 1 0
1 0 0
0 0 0 1 0 0 1 1

2. Show that each of the following matrices is invertible and factor it as a product of elementary matrices:
 
  1 1 0 0  
  2 0 4 3 1 1
2 1 3 2 4 0
(a) , (b) 0 1 1 , (c) 
0 0 −2
, (d) 0 2 4 .
3 2 1
3 −1 1 0 0 −1

0 0 0 1

3. Write each matrix as a product of elementary matrices and an upper triangular matrix:
       
(a) \begin{pmatrix} 2 & -2 \\ -2 & 6 \end{pmatrix},  (b) \begin{pmatrix} 2 & 1 & 3 \\ 0 & -3 & 1 \\ 1 & 0 & 2 \end{pmatrix},  (c) \begin{pmatrix} 1 & 0 & 0 \\ 5 & 0 & 0 \\ -2 & 1 & 3 \end{pmatrix},  (d) \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}.
RA
4. In the following problems, find an elementary matrix E such that EA = B:

   
(a) A = \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}, B = \begin{pmatrix} 3 & 6 \\ -1 & 3 \end{pmatrix}   (b) A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}, B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 1 & -2 \end{pmatrix}

(c) A = \begin{pmatrix} 1 & 4 \\ 5 & 6 \\ 7 & 8 \end{pmatrix}, B = \begin{pmatrix} 7 & 8 \\ 5 & 6 \\ 1 & 4 \end{pmatrix}   (d) A = \begin{pmatrix} -3 & 0 & 5 & -5 \\ 2 & -1 & 0 & 4 \\ 5 & 1 & -3 & 2 \end{pmatrix}, B = \begin{pmatrix} 1 & -2 & 5 & 3 \\ 2 & -1 & 0 & 4 \\ 5 & 1 & -3 & 2 \end{pmatrix}

(e) A = \begin{pmatrix} 1 & -3 \\ -1 & 1 \end{pmatrix}, B = \begin{pmatrix} -2 & 0 \\ -1 & 1 \end{pmatrix}   (f) A = \begin{pmatrix} 5 & 1 & 2 \\ -1 & 3 & 4 \\ 1 & -2 & 0 \end{pmatrix}, B = \begin{pmatrix} 0 & 11 & 2 \\ -1 & 3 & 4 \\ 1 & -2 & 0 \end{pmatrix}

5. (a) Let A ∈ M(3 × 3) be an upper (lower) triangular matrix whose diagonal entries are all
different from 0. Show that A can be factored as a product of at most six elementary matrices.
(b) Let A, B ∈ M(3 × 3) be upper (lower) triangular matrices. Show that AB is an upper (lower)
triangular matrix.
(c) Let A ∈ M(3 × 3) be an upper (lower) triangular matrix whose diagonal entries are all
different from 0. Show that A^{-1} is upper (lower) triangular.


3.9 Summary
Elementary row operations (= operations which lead to an equivalent system) for
solving a linear system.

Elementary operation | Notation | Inverse operation
1. Swap rows j and k. | R_j ↔ R_k | R_j ↔ R_k
2. Multiply row j by some λ ∈ R \ {0}. | R_j → λR_j | R_j → (1/λ)R_j
3. Replace row k by the sum of row k and λ times row j and keep row j unchanged (j ≠ k). | R_k → R_k + λR_j | R_k → R_k − λR_j

On the solutions of a linear system.


• A linear system has either no, exactly one or infinitely many solutions.
• If the system is homogeneous, then it has either exactly one or infinitely many solutions. It
always has at least one solution, namely the trivial one.
• The set of all solutions of a homogeneous linear system is a vector space.
• The set of all solutions of an inhomogeneous linear system is an affine vector space.

For A ∈ M(m × n) and ~b ∈ R^m consider the equation A~x = ~b. Then the following is true:

(1) No solution ⇐⇒ The reduced row echelon form of the augmented system (A|~b) has a row of the form (0 · · · 0 | β) with some β ≠ 0.
(2) At least one solution ⇐⇒ The reduced row echelon form of the augmented system (A|~b) has no row of the form (0 · · · 0 | β) with some β ≠ 0.

In case (2), we have the following two sub-cases:
(2.1) Exactly one solution ⇐⇒ # pivots = # columns.
(2.2) Infinitely many solutions ⇐⇒ # pivots < # columns.

Algebra with matrices and vectors


A matrix A ∈ M (m × n) can be viewed as a function A : Rn → Rm .

Definition.
A\vec x = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{pmatrix},


   
A + B = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} + \begin{pmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{mn} \end{pmatrix} = \begin{pmatrix} a_{11}+b_{11} & \cdots & a_{1n}+b_{1n} \\ \vdots & & \vdots \\ a_{m1}+b_{m1} & \cdots & a_{mn}+b_{mn} \end{pmatrix},

AB = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} b_{11} & \cdots & b_{1k} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nk} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + \cdots + a_{1n}b_{n1} & \cdots & a_{11}b_{1k} + \cdots + a_{1n}b_{nk} \\ \vdots & & \vdots \\ a_{m1}b_{11} + \cdots + a_{mn}b_{n1} & \cdots & a_{m1}b_{1k} + \cdots + a_{mn}b_{nk} \end{pmatrix} = (c_{j\ell})_{j\ell}

with
c_{j\ell} = \sum_{h=1}^{n} a_{jh} b_{h\ell}.

• Sum of matrices: componentwise,

• Product of matrices with vector or matrix with matrix: “multiply row by column”.

Properties. Let A, A_1, A_2, A_3 ∈ M(m × n), B ∈ M(n × k), C ∈ M(k × r) be matrices, ~x, ~y ∈ R^n,
~z ∈ R^k and c ∈ K.

• A1 + A2 = A2 + A1 ,

• (A1 + A2 ) + A3 = A1 + (A2 + A3 ),

• (AB)C = A(BC),

• in general, AB 6= BA,

• A(~x + c~y ) = A~x + cA~y ,

• (A1 + cA2 )~x = A1 ~x + cA2 ~x,

• (AB)~z = A(B~z),


Transposition of matrices
Let A = (a_{ij})_{i=1,\dots,m;\, j=1,\dots,n} ∈ M(m × n). Then its transpose is the matrix A^t = (\tilde a_{ij})_{i=1,\dots,n;\, j=1,\dots,m} ∈ M(n × m) with \tilde a_{ij} = a_{ji}.
For A, B ∈ M (m × n) and C ∈ M (n × k) we have
• (At )t = A,
• (A + B)t = At + B t ,
• (AC)t = C t At ,
• hA~x , ~y i = h~x , At ~y i for all ~x ∈ Rn and ~y ∈ Rm .
A matrix A is called symmetric if At = A and antisymmetric if At = −A. Note that only square
matrices can be symmetric.
A matrix A = (aij )i,j=1,...,n ∈ M (n × n) is called
• upper triangular if aij = 0 whenever i > j,
• lower triangular if aij = 0 whenever i < j,

• diagonal if a_{ij} = 0 whenever i ≠ j.
Clearly, a matrix is diagonal if and only if it is upper and lower triangular. The transpose of an
upper triangular matrix is lower triangular and vice versa. Every diagonal matrix is symmetric.

Invertibility of matrices
A matrix A ∈ M(n × n) is called invertible if there exists a matrix B ∈ M(n × n) such that
AB = BA = id_n. In this case B is called the inverse of A and it is denoted by A^{-1}. If A is not
invertible, then it is called singular.
• The inverse of an invertible matrix A is unique.
• If A is invertible, then so is A−1 and (A−1 )−1 = A.
• If A is invertible, then so is At and (At )−1 = (A−1 )t .
• If A and B are invertible, then so is AB and (AB)−1 = B −1 A−1 .

Theorem. Let A ∈ M(n × n). Then the following are equivalent:

(i) A is invertible.
(ii) For every ~b ∈ R^n, the equation A~x = ~b has exactly one solution.
(iii) The equation A~x = ~0 has exactly one solution.
(iv) Every row-reduced echelon form of A has n pivots.
(v) A is row-equivalent to id_n.

Calculation of A−1 using Gauß-Jordan elimination


Let A ∈ M(n × n). Form the augmented matrix (A | id_n) and use the Gauß-Jordan elimination to
reduce A to its reduced row echelon form A0: (A | id_n) → · · · → (A0 | B). If A0 = id_n, then A is
invertible and A^{-1} = B. If A0 ≠ id_n, then A is not invertible.
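A minimal sketch of this procedure, assuming NumPy; the function name inverse_gauss_jordan is ours, and the pivoting strategy is only what is needed to make the sketch run.

```python
import numpy as np

def inverse_gauss_jordan(A, tol=1e-12):
    """Row-reduce (A | id_n) to (id_n | A^{-1}); raise an error if A is singular."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])            # augmented matrix (A | id_n)
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))  # choose a pivot row
        if abs(M[p, j]) < tol:
            raise ValueError("matrix is singular")
        M[[j, p]] = M[[p, j]]                # row swap (P_jp)
        M[j] /= M[j, j]                      # scale pivot row (S_j(1/pivot))
        for i in range(n):
            if i != j:
                M[i] -= M[i, j] * M[j]       # eliminate the other entries in column j
    return M[:, n:]

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(inverse_gauss_jordan(A))               # [[-2.   1. ] [ 1.5 -0.5]]
```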


Inverse of a 2 × 2 matrix
Let A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. Then det A = ad − bc. If det A = 0, then A is not invertible. If det A ≠ 0, then
A is invertible and A^{-1} = \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.
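A quick numerical check of this formula (a sketch assuming NumPy; the concrete entries are arbitrary):

```python
import numpy as np

a, b, c, d = 1.0, 2.0, 3.0, 4.0
A = np.array([[a, b], [c, d]])
det = a * d - b * c
A_inv = (1 / det) * np.array([[d, -b], [-c, a]])
print(np.allclose(A @ A_inv, np.eye(2)))   # True whenever det != 0
```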

Elementary matrices
We have the following three types of elementary matrices:

• S_j(c) = (s_{ik})_{i,k=1,\dots,n} for c ≠ 0, where s_{ik} = 0 if i ≠ k, s_{kk} = 1 for k ≠ j and s_{jj} = c,
• Q_{jk}(c) = (q_{i\ell})_{i,\ell=1,\dots,n} for j ≠ k, where q_{jk} = c, q_{\ell\ell} = 1 for all \ell = 1, . . . , n and all other
coefficients equal to zero,
• P_{jk} = (p_{i\ell})_{i,\ell=1,\dots,n} for j ≠ k, where p_{\ell\ell} = 1 for all \ell ∈ {1, . . . , n} \ {j, k}, p_{jk} = p_{kj} = 1 and
all other coefficients equal to zero.


Schematically: S_j(c) is the identity matrix with the diagonal entry in row j replaced by c; Q_{jk}(c) is the identity matrix with one additional entry c in row j, column k; P_{jk} is the identity matrix with rows j and k interchanged (so it has the entries 1 at positions (j, k) and (k, j) and 0 at positions (j, j) and (k, k)).

Relation Elementary matrix – Elementary row operation

Elementary matrix | Elementary operation | Notation
P_{jk} | Swap row j with row k | R_j ↔ R_k
S_j(c), c ≠ 0 | Multiply row j by c | R_j → cR_j
Q_{jk}(c), j ≠ k | Add c times row k to row j | R_j → R_j + cR_k

3.10 Exercises
1. Go back to Chapter 1 and do the exercises again using the knowledge acquired in this
chapter.

2. Find the partial fraction decomposition of \frac{2x^2 - 4x + 14}{x(x-2)^2}.


3. Find a 2 × 3 linear system whose solution set is
\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + t \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix}, \quad t ∈ R.
Do 3 × 3 and 4 × 3 systems with the same solutions exist? Give examples or explain why they
do not exist.
Does a 4 × 3 system with the same solutions exist? Give an example or explain why it does not exist.

4. Find a 4 × 4 linear system whose solution set is
\begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} + s \begin{pmatrix} 4 \\ 5 \\ 6 \\ 7 \end{pmatrix} + t \begin{pmatrix} 7 \\ 3 \\ 2 \\ 1 \end{pmatrix}, \quad s, t ∈ R.

5. Consider the linear system
x_1 + 2x_2 + 3x_3 = b_1
3x_1 − x_2 + 2x_3 = b_2
4x_1 + x_2 + x_3 = b_3.

Find all possible b_1, b_2, b_3, or explain why there are none, such that the system has
(a) exactly one solution,
(b) no solution,
(c) infinitely many solutions.

6. Compute all possible (matrix)(vector) products:


 
  1 0  
1 0 3 6 1 3 6  
4 8 −1 2 7


A = 4 8 1 0 , B= , C = 4 1 0 , D= ,
1 4 3 −2 2
1 4 4 3 1 4 3
5 −4
 
  1  
1  4 4  
    1
0 2    3 −3
~r = 
3 ,
 ~v = , w
~ = 3 ,
 ~x = 
 5 ,
 ~y = , ~z = −2 .
3  5 5
π
6 −1
−1

   
7. Let A = \begin{pmatrix} 2 & 6 & -1 \\ 1 & -2 & 2 \\ 1 & 2 & -2 \end{pmatrix} and ~b = \begin{pmatrix} 17 \\ 6 \\ 4 \end{pmatrix}. Find all vectors ~x ∈ R^3 such that A~x = ~b.


 
8. Let M = \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix}.

(a) Show that there is no ~y ≠ 0 such that M~y ⊥ ~y.
(b) Find all vectors ~x ≠ 0 such that M~x ∥ ~x. For each such ~x, find λ ∈ R such that M~x = λ~x.

9. Compute all possible (matrix)(matrix) products:

A = \begin{pmatrix} 1 & 0 & 3 & 6 \\ 4 & 8 & 1 & 0 \\ 1 & 4 & 4 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 \\ 4 & 8 \\ 1 & 4 \\ 5 & -4 \end{pmatrix}, \quad C = \begin{pmatrix} 1 & 3 & 6 \\ 4 & 1 & 0 \\ 1 & 4 & 3 \end{pmatrix},

D = \begin{pmatrix} -1 & 2 & 7 \\ 3 & -2 & 2 \end{pmatrix}, \quad E = \begin{pmatrix} 1 & 0 \\ 3 & 6 \end{pmatrix}.

10. Determine whether the following matrices are invertible. If they are, find their inverse:

A = \begin{pmatrix} 1 & -2 \\ 2 & 7 \end{pmatrix}, \quad B = \begin{pmatrix} -14 & 21 \\ 12 & -18 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 3 & 6 \\ 4 & 1 & 0 \\ 1 & 4 & 3 \end{pmatrix}, \quad E = \begin{pmatrix} 2 & 1 & 6 \\ 1 & 4 & 5 \\ 3 & 5 & 11 \end{pmatrix}.

11. Determine whether the following matrices are invertible. If they are, find their inverse:

A = \begin{pmatrix} 1 & 0 \\ 3 & 6 \end{pmatrix}, \quad B = \begin{pmatrix} 5 & 2 \\ 8 & 6 \end{pmatrix}, \quad C = \begin{pmatrix} 4 & 10 \\ 6 & 15 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 3 & 6 \\ 4 & 1 & 0 \\ 1 & 4 & 3 \end{pmatrix}.

12. A shop sells two types of candy boxes:

Type A contains 1 chocolate and 3 mints, Type B contains 2 chocolates and 1 mint.

(a) Give an equation of the form A~x = ~b that describes the above. Explain what the vectors ~x and ~b mean.
(b) Using the result of (a), calculate how many chocolates and how many mints are contained in:
(i) 1 box of type A and 3 of type B, (iii) 2 boxes of type A and 6 of type B,
(ii) 4 boxes of type A and 2 of type B, (iv) 3 boxes of type A and 5 of type B.
(c) Determine whether it is possible to obtain
(i) 5 chocolates and 15 mints, (iii) 21 chocolates and 23 mints,
(ii) 2 chocolates and 11 mints, (iv) 14 chocolates and 19 mints,
by buying candy boxes in the shop. If it is possible, say how many boxes of each type are needed.


 
13. Let A_k = \begin{pmatrix} 1 & 3 \\ 2 & k \end{pmatrix} and consider the equation
A_k \vec x = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.   (∗)

(a) Find all k ∈ R such that (∗) has exactly one solution ~x.
(b) Find all k ∈ R such that (∗) has infinitely many solutions ~x.
(c) Find all k ∈ R such that (∗) has no solution ~x.
(d) Do the same for A_k \vec x = \begin{pmatrix} 2 \\ 3 \end{pmatrix} instead of (∗).
(e) Do the same for A_k \vec x = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} instead of (∗), where \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} is an arbitrary vector different from \begin{pmatrix} 0 \\ 0 \end{pmatrix}.


7 4
 1 4 −4
14. Write the invertible matrices of Exercises 10 and 11 as products of elementary matrices.

15. For the following matrices, find elementary matrices E_1, . . . , E_n such that E_1 · E_2 · · · E_n · A is upper triangular:
  
1 2 3

A= , B = 2 1 0  , C = 1 2 0 .
3 5
3 5 3 2 4 3

16. Let A ∈ M(m × n) and let ~x, ~y ∈ R^n, λ ∈ R. Show that A(~x + λ~y) = A~x + λA~y.

17. Show that the space M(m × n) is a vector space with the usual sum of matrices and multiplication by λ ∈ R.
18. Let A ∈ M(n × n).

(a) Show that ⟨A~x, ~y⟩ = ⟨~x, A^t ~y⟩ for all ~x, ~y ∈ R^n.
(b) Show that ⟨AA^t ~x, ~x⟩ ≥ 0 for all ~x ∈ R^n.

19. Let A = (a_{ij})_{i=1,\dots,m;\, j=1,\dots,n} ∈ M(m × n) and let ~e_k be the k-th unit vector in R^n (that is, the
vector in R^n whose k-th entry is 1 and whose other entries are zero). Compute A~e_k for every
k = 1, . . . , n and describe in words how the result is related to the matrix A.

20. (a) Let A ∈ M(m × n) and suppose that A~x = ~0 for all ~x ∈ R^n. Show that A = 0 (the matrix all of whose entries are 0).


(b) Let ~x ∈ R^n and suppose that A~x = ~0 for all A ∈ M(n × n). Show that ~x = ~0.
(c) Find a matrix A ∈ M(2 × 2) and a vector ~v ∈ R^2, both different from zero, such that A~v = ~0.
(d) Find matrices A, B ∈ M(2 × 2) such that AB = 0 and BA ≠ 0.

   
21. Let ~v = \begin{pmatrix} 4 \\ 5 \end{pmatrix} and ~w = \begin{pmatrix} -1 \\ 3 \end{pmatrix}.

(a) Find a matrix A ∈ M(2 × 2) that maps the vector ~e_1 to ~v and the vector ~e_2 to ~w.
(b) Find a matrix B ∈ M(2 × 2) that maps the vector ~v to ~e_1 and the vector ~w to ~e_2.

22. Let A ∈ M(m, n), B, C ∈ M(n, k), D ∈ M(k, l).

(a) Show that A(B + C) = AB + AC.
(b) Show that A(BD) = (AB)D.

23. Let R, S ∈ M(n, n) be invertible matrices. Show that
RS = SR ⟺ R^{-1} S^{-1} = S^{-1} R^{-1}.

24. Let A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, B = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix} and C = \begin{pmatrix} 9 & 6 \\ -7 & 11 \end{pmatrix}. Find X ∈ M(2 × 2) satisfying the equation
AX + 3X − B = C.

25. True or false? Prove your answers.

(a) If A is an invertible symmetric matrix, then A^{-1} is symmetric.
(b) If A, B are symmetric matrices, then AB is symmetric.
(c) If AB is a symmetric matrix, then A, B are symmetric matrices.
(d) If A, B are symmetric matrices, then A + B is symmetric.
(e) If A + B is a symmetric matrix, then A, B are symmetric matrices.
(f) If A is a symmetric matrix, then A^t is symmetric.
(g) AA^t = A^t A.

26. Let A ∈ M(m × n). Show that AA^t and A^t A are symmetric matrices.

27. Let A ∈ M(n × n). Show that A + A^t is symmetric and that A − A^t is antisymmetric.

28. Let A ∈ M(n × n) be such that A^2 = O:


(a) Muestre que idn −A es invertible y encuentre su inversa. (Hint: Considere idn −A2 .
¿Por qué en este caso es correcto factorizar por medio de diferencia de cuadrados? Ver
Sección 3.4, Ejercicio 8.).
(b) Si λ ∈ R, λ 6= 0, muestre que λ idn −A es invertible.
(c) ¿Cómo se pueden generalizar los incisos anteriores si en lugar de suponer que A2 = O
suponemos que Am = O para algún m ∈ N?
29. Compute (S_j(c))^t, (Q_{ij}(c))^t, (P_{ij})^t.

30. (a) Let P_{12} = ( 0 1 ; 1 0 ) ∈ M(2 × 2). Show that P_{12} can be expressed as a product of
elementary matrices of the form Q_{ij}(c) and S_k(c).
(b) Prove the general case: Let P_{ij} ∈ M(n × n). Show that P_{ij} can be expressed as a
product of elementary matrices of the form Q_{kl}(c) and S_m(c).
Remark: This exercise shows that there are really only two types of elementary matrices,
since the third type (the permutations) can be reduced to a suitable product of matrices
of type Q_{ij}(c) and S_j(c).

Chapter 4

Determinants

In this chapter we will define the determinant of matrices in M(n × n) for arbitrary n and we will
recognise the determinant for n = 2 defined in Section 1.2 as a special case of our new definition.
We will discuss the main properties of the determinant and we will show that a matrix is invertible
if and only if its determinant is different from 0. We will also give a geometric interpretation of
the determinant and get a glimpse of its importance in geometry and the theory of integration.
Finally we will use the determinant to calculate the inverse of an invertible matrix and we will
prove Cramer's rule.

4.1 Determinant of a matrix


Recall that in Section 1.2 on page 17 we defined the determinant of a 2 × 2 matrix by
\det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21}.
Moreover, we know that a 2 × 2 matrix A is invertible if and only if its determinant is different
from 0 because both statements are equivalent to the associated homogeneous system having only
the trivial solution.
In this section we will define the determinant for arbitrary n × n matrices and we will see that
again the determinant tells us if a matrix is invertible or not. We will give several formulas for the
determinant. As definition, we use the Leibniz formula because it is non-recursive. First we need
to know what a permutation is.

Definition 4.1. A permutation of a set M is a bijection M → M. The set of all permutations of
the set M = {1, . . . , n} is denoted by S_n. We denote an element σ ∈ S_n by
\begin{pmatrix} 1 & 2 & \cdots & n-1 & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n-1) & \sigma(n) \end{pmatrix}.
The sign (or parity) of a permutation σ ∈ S_n is
\operatorname{sign}(\sigma) = (-1)^{\#\text{inversions of }\sigma}
where an inversion of σ is a pair i < j with σ(i) > σ(j).


Note that S_n consists of n! permutations.

Examples 4.2. (i) S_2 consists of two permutations:

σ = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}, \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, with sign(σ) = 1, −1 respectively.

(ii) S_3 consists of six permutations:

σ = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix},
with sign(σ) = 1, 1, 1, −1, −1, −1 respectively.

For instance the second permutation has two inversions (1 < 3 but σ(1) > σ(3) and 2 < 3
but σ(2) > σ(3)), the third permutation has two inversions (1 < 2 but σ(1) > σ(2), 1 < 3 but
σ(1) > σ(3)), etc.

Definition 4.3. Let A = (a_{ij})_{i,j=1,\dots,n} ∈ M(n × n). Then its determinant is defined by
\det A = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}.   (4.1)
The formula in equation (4.1) is called the Leibniz formula.

Remark. Another notation for the determinant is |A|.

Remark 4.4. Note that according to the formula

(a) the determinant is a sum of n! terms,

(b) each term is a product of n components of A,

(c) in each product, there is exactly one factor from each row and from each column and all such
products appear in the formula.

So clearly, the Leibniz formula is a computational nightmare . . .
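For very small n one can nevertheless evaluate (4.1) directly. The following is a minimal sketch assuming NumPy and the standard library's itertools; the function names sign and det_leibniz are ours.

```python
import numpy as np
from itertools import permutations

def sign(sigma):
    """(-1) raised to the number of inversions of sigma."""
    inv = sum(1 for i in range(len(sigma)) for j in range(i + 1, len(sigma))
              if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def det_leibniz(A):
    """Leibniz formula (4.1); a sum over all n! permutations, so only for tiny n."""
    n = len(A)
    return sum(sign(sigma) * np.prod([A[i, sigma[i]] for i in range(n)])
               for sigma in permutations(range(n)))

A = np.array([[3.0, 2.0, 1.0], [5.0, 6.0, 4.0], [8.0, 0.0, 7.0]])
print(det_leibniz(A), np.linalg.det(A))   # both give 72
```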

Equal rights for rows and columns!

Show that
\det A = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{\sigma(1)1} a_{\sigma(2)2} \cdots a_{\sigma(n)n}.   (4.2)
This means: instead of putting the permutation in the column index, we can just as well put
it in the row index.


Let us check that this new definition coincides with our old definition for the case n = 2:
\det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \sum_{\sigma \in S_2} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} = a_{11} a_{22} - a_{12} a_{21},

which is the same as our old definition.


Now let us see what the formula gives us for the case n = 3. Using our table with the permutations
in S_3, we find

\det A = \det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \sum_{\sigma \in S_3} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} a_{3\sigma(3)}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} - a_{13}a_{22}a_{31}.   (4.3)

Now let us group the terms with coefficients from the first row of A:
\det A = a_{11}\bigl(a_{22}a_{33} - a_{23}a_{32}\bigr) - a_{12}\bigl(a_{21}a_{33} - a_{23}a_{31}\bigr) + a_{13}\bigl(a_{21}a_{32} - a_{22}a_{31}\bigr).   (4.4)

We see that the terms in brackets are again determinants:
• a11 is multiplied by the determinant of the 2 × 2 matrix obtained from A by deleting row 1
and column 1.
• a12 is multiplied by the determinant of the 2 × 2 matrix obtained from A by deleting row 1
and column 2.
• a13 is multiplied by the determinant of the 2 × 2 matrix obtained from A by deleting row 1
and column 3.
If we had grouped the terms by coefficients from the second row, we would have obtained something
similar: each term a2j would be multiplied by the determinant of the 2 × 2 matrix obtained from
A by deleting row 2 and column j.
Of course we could also group the terms by coefficients all from the first column. Then the formula
would become a sum of terms where the aj1 are multiplied by the determinants of the matrices
obtained from A by deleting row j and column 1.
This motivates the definition of the so-called minors of a matrix.
Definition 4.5. Let A = (a_{ij})_{i,j=1,\dots,n} ∈ M(n × n) and let M_{ij} be the (n − 1) × (n − 1) matrix
which is obtained from A by deleting row i and column j. Then the numbers det M_{ij} are called
minors of A and C_{ij} := (−1)^{i+j} det(M_{ij}) are called cofactors of A.

Remark. Some books use a slightly different notation. They call the (n − 1) × (n − 1) matrices
Mij which is obtained from A by deleting row i and column j of A the minors of A (or the
minor matrices of A). However, it seems that the majority of textbooks uses the convention from
Definition 4.5.

With these definitions we can write (4.3) as

\det A = \sum_{j=1}^{3} (-1)^{1+j} a_{1j} \det M_{1j} = \sum_{j=1}^{3} a_{1j} C_{1j}.


This formula is called the expansion of the determinant of A along the first row. We also saw that
we can expand along the second or the third row, or along columns, so

\det A = \sum_{j=1}^{3} (-1)^{k+j} a_{kj} \det M_{kj} = \sum_{j=1}^{3} a_{kj} C_{kj} \quad \text{for } k = 1, 2, 3,
\det A = \sum_{i=1}^{3} (-1)^{i+k} a_{ik} \det M_{ik} = \sum_{i=1}^{3} a_{ik} C_{ik} \quad \text{for } k = 1, 2, 3.

The first formula is called expansion along the kth row, and the second formula is called expansion
along the kth column. With a little more effort we can show that an analogous formula is true for
arbitrary n.

Theorem 4.6. Let A = (a_{ij})_{i,j=1,\dots,n} ∈ M(n × n) and let M_{ij} denote its minors. Then

\det A = \sum_{j=1}^{n} (-1)^{k+j} a_{kj} \det M_{kj} = \sum_{j=1}^{n} a_{kj} C_{kj} \quad \text{for } k = 1, 2, \dots, n,   (4.5)
\det A = \sum_{i=1}^{n} (-1)^{i+k} a_{ik} \det M_{ik} = \sum_{i=1}^{n} a_{ik} C_{ik} \quad \text{for } k = 1, 2, \dots, n.   (4.6)

The formulas (4.5) and (4.6) are called Laplace expansion of the determinant. More precisely, (4.5)
is called expansion along the kth row, (4.6) is called expansion along the kth column.
RA
Proof. Let us first prove (4.5) for the case when k = 1. Let A ∈ M (n × n). Note that by definition
det A is the sum of products of the form sign(σ)a1σ(1) . . . an,σ(n) , see Remark 4.4. So the terms
are exactly all possible products of entries of the matrix A with exactly one term of each row and
exactly one term of each column, multiplied by +1 or −1.
The same is true for the formula (4.5): a11 is multiplied by det A11 , but the latter consists of
products with exactly one factor in each row and each column of A11 , that is, exactly one factor
from row 2 to n and column 2 to n of A; a12 is multiplied by det A12 , but the latter consists of
D

products with exactly one factor in each row and each column of A12 , that is, exactly one factor
from row 2 to n and column 1, 3, to n of A; etc.
So all products that appear in the Leibniz formula (4.1) appear also in the Laplace formula (4.5)
and vice versa. So it only remains to show that they appear with the same factor 1 or −1 in both
formulas.
Let σ ∈ S_n and set \tilde\sigma : {2, 3, . . . , n} → {1, 2, . . . , n} \ {σ(1)}, \tilde\sigma(j) = σ(j). Then

#(inversions of σ) = #(pairs (i, j) such that i < j and σ(i) > σ(j))
  = #(pairs (i, j) such that 2 ≤ i < j and σ(i) > σ(j)) + #(pairs (1, j) such that 1 < j and σ(1) > σ(j))
  = #(inversions of \tilde\sigma) + #(pairs (1, j) such that 1 < j and σ(1) > σ(j))
  = #(inversions of \tilde\sigma) + σ(1) − 1,


hence \operatorname{sign}(\sigma) = (-1)^{\#(\text{inversions of }\tilde\sigma) + \sigma(1) - 1} = \operatorname{sign}(\tilde\sigma)(-1)^{\sigma(1)-1} = \operatorname{sign}(\tilde\sigma)(-1)^{\sigma(1)+1} and therefore
\operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} = (-1)^{\sigma(1)+1} a_{1\sigma(1)} \bigl[\operatorname{sign}(\tilde\sigma)\, a_{2\sigma(2)} \cdots a_{n\sigma(n)}\bigr].
The term in brackets is one of the terms that appear in the determinant of A1σ(1) , hence the
product on the right hand side appears in the formula (4.5) (when j = σ(1)) and it is the only term
that contains the factors a11 , a12 , . . . , a1n . Consequently each term in the Leibniz formula appears
exactly once in the Laplace formula with exactly the same sign and there are no other terms in the
Laplace formula. Hence both formulas are equal.
The reasoning for k > 1 is the same. We only need to take \tilde\sigma as the restriction of σ to {1, 2, . . . , n} \ {k} and note that \operatorname{sign}(\sigma) = \operatorname{sign}(\tilde\sigma)(-1)^{\sigma(k)+k}. This is true because

#(inversions of σ) = #(pairs (i, j) such that i < j and σ(i) > σ(j))
  = #(pairs (i, j) such that i, j ≠ k, i < j and σ(i) > σ(j))
    + #(pairs (i, k) such that i < k and σ(i) > σ(k))
    + #(pairs (k, j) such that k < j and σ(k) > σ(j))
  = #(inversions of \tilde\sigma) + σ(k) − k + 2\ell

for some integer \ell ≥ 0.

Therefore we obtain

\operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} = (-1)^{\sigma(k)+k} a_{k\sigma(k)} \bigl[\operatorname{sign}(\tilde\sigma)\, a_{1\sigma(1)} \cdots a_{k-1,\sigma(k-1)}\, a_{k+1,\sigma(k+1)} \cdots a_{n,\sigma(n)}\bigr].

The term in brackets appears in the determinant of the submatrix A_{k\sigma(k)}, hence it appears in the
sum of the expansion along the kth row (in the term with j = σ(k)).
In order to prove (4.6), we can use the same arguments as above applied to the “column version”
(4.2) of the Leibniz formula.

Note that for calculating for instance the determinant of a 5×5 matrix, we have to calculate five 4×4
determinants for each of which we have to calculate four (3×3) determinants, etc. Computationally,
it is as long as the Leibniz formula, but at least we do not have to find all permutations in Sn first.
D

Later, we will see how to calculate the determinant using Gaussian elimination. This is computa-
tionally much more efficient, see Remark 4.12.
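The Laplace expansion translates directly into a recursive procedure. The following is a small sketch assuming NumPy; the function name det_laplace is ours and the code always expands along the first row.

```python
import numpy as np

def det_laplace(A):
    """Cofactor expansion (4.5) along the first row, applied recursively."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        M_1j = np.delete(np.delete(A, 0, axis=0), j, axis=1)   # minor matrix M_{1j}
        total += (-1) ** j * A[0, j] * det_laplace(M_1j)       # (-1)^(1+j) with 0-based j
    return total

A = np.array([[1, 2, 3, 4], [0, 6, 0, 1], [2, 0, 7, 0], [0, 3, 0, 1]])
print(det_laplace(A))   # 3, as in Example 4.8 below
```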

Example 4.7. We use expansion along the second column to calculate

\det \begin{pmatrix} 3 & 2 & 1 \\ 5 & 6 & 4 \\ 8 & 0 & 7 \end{pmatrix}
= -2 \det \begin{pmatrix} 5 & 4 \\ 8 & 7 \end{pmatrix} + 6 \det \begin{pmatrix} 3 & 1 \\ 8 & 7 \end{pmatrix} - 0 \det \begin{pmatrix} 3 & 1 \\ 5 & 4 \end{pmatrix}
= -2\,[5 \cdot 7 - 4 \cdot 8] + 6\,[3 \cdot 7 - 1 \cdot 8] = -2\,[35 - 32] + 6\,[21 - 8] = -6 + 78 = 72.


We obtain the same result if we expand the determinant along e.g. the first row:

\det \begin{pmatrix} 3 & 2 & 1 \\ 5 & 6 & 4 \\ 8 & 0 & 7 \end{pmatrix}
= 3 \det \begin{pmatrix} 6 & 4 \\ 0 & 7 \end{pmatrix} - 2 \det \begin{pmatrix} 5 & 4 \\ 8 & 7 \end{pmatrix} + 1 \det \begin{pmatrix} 5 & 6 \\ 8 & 0 \end{pmatrix}
= 3\,[6 \cdot 7 - 4 \cdot 0] - 2\,[5 \cdot 7 - 4 \cdot 8] + [5 \cdot 0 - 6 \cdot 8] = 3 \cdot 42 - 2\,[35 - 32] - 48 = 126 - 6 - 48 = 72.

Example 4.8. We give an example of the calculation of the determinant of a 4 × 4 matrix. First we expand along the first row.

\det \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 6 & 0 & 1 \\ 2 & 0 & 7 & 0 \\ 0 & 3 & 0 & 1 \end{pmatrix}
= 1 \det \begin{pmatrix} 6 & 0 & 1 \\ 0 & 7 & 0 \\ 3 & 0 & 1 \end{pmatrix} - 2 \det \begin{pmatrix} 0 & 0 & 1 \\ 2 & 7 & 0 \\ 0 & 0 & 1 \end{pmatrix} + 3 \det \begin{pmatrix} 0 & 6 & 1 \\ 2 & 0 & 0 \\ 0 & 3 & 1 \end{pmatrix} - 4 \det \begin{pmatrix} 0 & 6 & 0 \\ 2 & 0 & 7 \\ 0 & 3 & 0 \end{pmatrix}
= 7 \det \begin{pmatrix} 6 & 1 \\ 3 & 1 \end{pmatrix} - 2 \left[ 7 \det \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} \right] + 3 \left[ -2 \det \begin{pmatrix} 6 & 1 \\ 3 & 1 \end{pmatrix} \right] - 4 \left[ -6 \det \begin{pmatrix} 2 & 7 \\ 0 & 0 \end{pmatrix} \right]
= 7[6 - 3] - 14[0 - 0] - 6[6 - 3] + 24[0 - 0] = 21 - 18 = 3.

Now we calculate the determinant of the same matrix but expand along the second row, which contains
more zeros. The advantage is that there are only two 3 × 3 minors whose determinants we really have to
compute.

\det \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 6 & 0 & 1 \\ 2 & 0 & 7 & 0 \\ 0 & 3 & 0 & 1 \end{pmatrix}
= -0 \det \begin{pmatrix} 2 & 3 & 4 \\ 0 & 7 & 0 \\ 3 & 0 & 1 \end{pmatrix} + 6 \det \begin{pmatrix} 1 & 3 & 4 \\ 2 & 7 & 0 \\ 0 & 0 & 1 \end{pmatrix} - 0 \det \begin{pmatrix} 1 & 2 & 4 \\ 2 & 0 & 0 \\ 0 & 3 & 1 \end{pmatrix} + \det \begin{pmatrix} 1 & 2 & 3 \\ 2 & 0 & 7 \\ 0 & 3 & 0 \end{pmatrix}
= 6 \left[ -3 \det \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} + 7 \det \begin{pmatrix} 1 & 4 \\ 0 & 1 \end{pmatrix} \right] + \left[ \det \begin{pmatrix} 0 & 7 \\ 3 & 0 \end{pmatrix} - 2 \det \begin{pmatrix} 2 & 3 \\ 3 & 0 \end{pmatrix} \right]
= 6[-6 + 7] + [-21 + 18] = 6 - 3 = 3.

Rule of Sarrus
We finish this section with the so-called rule of Sarrus. From (4.3) we know that
\det A = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - \bigl(a_{13}a_{22}a_{31} + a_{11}a_{23}a_{32} + a_{12}a_{21}a_{33}\bigr)
which can be memorised as follows: Write down the matrix A and append its first and second
column to it. Then we sum the products of the three terms lying on diagonals from the top left to
the bottom right and subtract the products of the terms lying on diagonals from the top right to
the bottom left as in the following picture:


a11 a12 a13 a11 a12

a21 a22 a23 a21 a22

a31 a32 a33 a31 a32

 
\det A = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - \bigl(a_{13}a_{22}a_{31} + a_{11}a_{23}a_{32} + a_{12}a_{21}a_{33}\bigr).

The rule of Sarrus works only for 3 × 3 matrices!!!

Convince yourself that one could also append the first and the second row below the matrix and
make crosses.

Example 4.9 (Rule of Sarrus).

\det \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 0 & 8 & 7 \end{pmatrix} = 1 \cdot 5 \cdot 7 + 2 \cdot 6 \cdot 0 + 3 \cdot 4 \cdot 8 - \bigl(3 \cdot 5 \cdot 0 + 6 \cdot 8 \cdot 1 + 7 \cdot 2 \cdot 4\bigr)
= 35 + 96 - [48 + 56] = 131 - 104 = 27.
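As a quick numerical check of the rule of Sarrus, here is a small NumPy sketch using the matrix of Example 4.9; the index arithmetic simply spells out the six diagonal products.

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [0.0, 8.0, 7.0]])
sarrus = (a[0, 0]*a[1, 1]*a[2, 2] + a[0, 1]*a[1, 2]*a[2, 0] + a[0, 2]*a[1, 0]*a[2, 1]
          - a[0, 2]*a[1, 1]*a[2, 0] - a[0, 0]*a[1, 2]*a[2, 1] - a[0, 1]*a[1, 0]*a[2, 2])
print(sarrus, np.linalg.det(a))   # both give 27
```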

You should now have understood

• what a permutation is,


• how to derive the Laplace expansion formula from the Leibniz formula,
• etc.
You should now be able to

• calculate the determinant of an n × n matrix,


• etc.


Exercises.
1. Compute the determinants of the following matrices:

(a) \begin{pmatrix} -2 & 3 & 1 \\ 0 & 2 & 1 \\ 4 & 6 & 5 \end{pmatrix}  (b) \begin{pmatrix} 0 & 1 & 4 \\ -2 & 0 & -6 \\ 2 & 1 & 0 \end{pmatrix}  (c) \begin{pmatrix} 6 & 3 & 5 \\ 3 & -1 & 4 \\ -2 & 1 & -6 \end{pmatrix}

(d) \begin{pmatrix} -1 & 1 & 0 \\ 2 & 1 & 4 \\ 1 & 5 & 6 \end{pmatrix}  (e) \begin{pmatrix} 3 & 5 & -1 \\ 0 & 1 & -2 \\ 0 & 0 & 2 \end{pmatrix}  (f) \begin{pmatrix} -3 & 0 & 0 & 0 \\ -4 & 7 & 0 & 0 \\ -5 & 10 & -1 & 0 \\ 2 & 3 & -11 & 6 \end{pmatrix}

(g) \begin{pmatrix} 2 & 3 & -1 & \sqrt{20} & \pi \\ 0 & 1 & 2 & -11 & 10 \\ 0 & 0 & 4 & -1 & 5 \\ 0 & 0 & 0 & -2 & 50 \\ 0 & 0 & 0 & 0 & 6 \end{pmatrix}  (h) \begin{pmatrix} -2 & 0 & 0 & 7 \\ 1 & 2 & -1 & 4 \\ 3 & 0 & -1 & 5 \\ 4 & 2 & 3 & 0 \end{pmatrix}

2. Let A = \begin{pmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{pmatrix}. Show that det A = (b − a)(c − a)(c − b).

3. Let A = \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix}. Show that det A = 0.

4.2 Properties of the determinant


In this section we will show properties of the determinant and we will prove that a matrix is
invertible if and only if its determinant is different from 0.

(D1) The determinant is linear in its rows.

This means the following. Let ~r_1, . . . , ~r_n be the row vectors of the matrix A and assume that
~r_j = ~s_j + γ~t_j. Then

\det A = \det \begin{pmatrix} \vec r_1 \\ \vdots \\ \vec r_j \\ \vdots \\ \vec r_n \end{pmatrix}
= \det \begin{pmatrix} \vec r_1 \\ \vdots \\ \vec s_j + \gamma \vec t_j \\ \vdots \\ \vec r_n \end{pmatrix}
= \det \begin{pmatrix} \vec r_1 \\ \vdots \\ \vec s_j \\ \vdots \\ \vec r_n \end{pmatrix}
+ \gamma \det \begin{pmatrix} \vec r_1 \\ \vdots \\ \vec t_j \\ \vdots \\ \vec r_n \end{pmatrix}.

This is proved easily by expanding the determinant along the jth row, or it can be seen from the
Leibniz formula as well.


(D1’) The determinant is linear in its columns.


This means the following. Let ~c1 , . . . , ~cn be the column vectors of the matrix A and assume that
~cj = ~sj + γ~tj . Then
det A = det(~c1 | · · · |~cj | · · · |~cn ) = det(~c1 | · · · |~sj + γtj | · · · |~cn )
= det(~c1 | · · · |~sj | · · · |~cn ) + γ det(~c1 | · · · |tj | · · · |~cn ).
This is proved easily by expanding the determinant along the jth column, or it can be seen from
the Leibniz formula as well.

(D2) The determinant is alternating in its rows.

If two rows in a matrix are swapped, then the determinant changes its sign. This means: Let
~r_1, . . . , ~r_n be the row vectors of the matrix A and i ≠ j ∈ {1, . . . , n}. Then

\det A = \det \begin{pmatrix} \vdots \\ \vec r_j \\ \vdots \\ \vec r_i \\ \vdots \end{pmatrix} = - \det \begin{pmatrix} \vdots \\ \vec r_i \\ \vdots \\ \vec r_j \\ \vdots \end{pmatrix}.
This is easy to see when the two rows that shall be interchanged are adjacent. For example, assume
that j = i + 1. Let A be the original matrix and let B be the matrix with rows i and i + 1 swapped.
We expand the determinant of A along the ith row and and the determinant of B along the (i+1)th
A B
RA
row. Note that in both cases the minors are equal, that is, Mik = M(i+1)k (we use superscripts A
and B to distinguish between the minors of A and of B). So we find
n
X n
X n
X
det B = (−1)(i+1)+k M(i+1)k
B
= (−1)(−1)i+k Mik
A
=− (−1)i+k Mik
A
= − det A.
k=1 k=1 k=1

This can be seen also via the Leibniz formula. Now let us see what happens if i and j are not adjacent
rows. Without restriction we may assume that i < j. Then we first swap the jth row (j − i) times
with the row above until it is in the ith row. The original ith row is now in row (i + 1). Now we
swap it down with its neighbouring rows until it becomes row j. To do this we need j − (i + 1)
swaps. So in total we swapped neighbouring rows [j − i] + [j − (i + 1)] = 2j − 2i − 1 times, so the
determinant of the new matrix is

\underbrace{(-1) \cdot (-1) \cdots (-1)}_{2j-2i-1 \text{ times (one factor for each swap)}} \cdot \det A = (-1)^{2j-2i-1} \det A = -\det A.

(D2’) The determinant is alternating in its columns.


If two columns in a matrix are swapped, then the determinant changes its sign. This means: Let
~c1 , . . . , ~cn be the column vectors of the matrix A and i 6= j ∈ {1, . . . , n}. Then
det A = det(· · · |~ci | · · · |~cj | · · · ) = − det(· · · |~cj | · · · |~ci | · · · ).
This follows in the same way as the alternating property for rows.


(D3) det id_n = 1.

Expansion along the first row shows

\det \operatorname{id}_n = 1 \cdot \det \operatorname{id}_{n-1} = 1^2 \det \operatorname{id}_{n-2} = \cdots = 1^n = 1.

Remark 4.10. It can be shown: Every function f : M(n × n) → R which satisfies (D1), (D2) and
(D3) (or (D1'), (D2') and (D3)) must be det.

Now let us see some more properties of the determinant.

(D4) det A = det At .


This follows easily from the Leibniz formula or from the Laplace expansion (if you expand A along
the first row and At along the first column, you obtain exactly the same terms). This also shows
that (D1’) follows from (D1) and that (D2’) follows from (D2) and vice versa.


..
.

FT
(D5) If one row of A is multiple of another row, or if a column is a multiple of another
column, then det A = 0. In particular, if A has two equal rows or two equal columns
then det A = 0.
Let ~r1 , . . . , ~rn denote the rows of the matrix A and assume that ~rk = c~rj . Then

..
.
 
..
.
 
..
.

RA
       

 ~rk 


 c~rj 
 (D2)

 ~rj
 (D1)

 ~rj 

.. .. .. ..
det A = det   = det   = − det   = −c det 
       
 .   .  .  . 

 ~rj   ~rj   c~rj   ~rj 
       
.. .. .. ..
. . . .
.. ..
   
 .   . 

 c~rj  
 ~rk 
 
(D1)
= − det 
 ..  = − det  ..  = − det A.
D

 . 
 . 
 

 ~rj  
 ~rj 
 
.. ..
. .

This shows det A = − det A, and therefore det A = 0. If A has a column which is a multiple of
another, then its transpose has a row which is multiple of another row and with the help of (D4) it
follows that det A = det At = 0.

(D6) The determinant of an upper or lower triangular matrix is the product of its
diagonal entries.
Let A be an upper triangular matrix and let us expand its determinant in the first column. Then
only the first term in the Laplace expansion is different from 0 because all coefficients in the first


column are equal to 0 except possibly the one in the first row. We repeat this and obtain
     
c1 c2 c3

det A = det 
 c2 ∗  
 = c1 det 
  c3 ∗  
 = c1 c2 det 
  c4 ∗ 

0 0 0
     
     
cn cn cn
 
c 0
= · · · = c1 c2 · · · cn−2 det n−1 = c1 c2 · · · cn−1 cn .
0 cn

The claim for lower triangular matrices follows from (D4) and what we just showed because the
transpose of an upper triangular matrix is lower triangular and the diagonal entries are the same.
Or we could repeat the above proof but this time we would expand always in the first row (or last
column).

FT
Next we calculate the determinant of elementary matrices.

(D7) The determinant of elementary matrices.


(i) det Sj (c) = c,
(ii) det Qij (c) = 1,
(iii) det Pij = −1.
RA
The affirmation about Sj (c) and Qij (c) follow from (D6) since they are triangular matrices. The
claim for Pij follows from (D2) and (D3) because swapping row i and row j in Pij gives us the
identity matrix, so det Pij = − det id = −1.

Now we calculate the determinant of a product of an elementary matrix with another matrix.

(D8) Let E be an elementary matrix and let A ∈ M (n × n). Then det(EA) = det E det A.
D

Let E be an elementary matrix and let us denote the rows of A by ~r1 , . . . , ~rn . We have to distinguish
between the three different types of elementary matrices.

Case 1. E = Sj (c). We know from (D6) that det E = det Sj (c) = c. Using Proposition 3.62 and
(D1) we find that

.. ..

  
  .   . 
 c~rj  = c det  ~rj
det(EA) = det Sj (c)A = det     = c det A = det Sj (c) det A.

.. ..
. .

Case 2. E = Qij (c). We know from (D6) that det E = det Qij (c) = 1. Using Proposition 3.62 and


(D1) and (D5) we find that


.. .. .. ..
       
 .   .   .   . 
~ri + c~rj   ~ri   ~rj   ~ri 
       

det(EA) = det Qij (c)A = det  .. = det
 .. .. ..
 + c det   = det 
      
 . 

 .
   .   . 

 ~rj   ~rj   ~rj   ~rj 
       
.. .. .. ..
. . . .
= det A = det Qij (c) det A.

Case 3. E = Pij . We know from (D6) that det E = det Pjk = −1. Using Proposition 3.62 and
(D2) we find that
.. .. ..
      
  .   .   . 
  ~rj   ~rk   ~rj 

FT
      
  ..  .. ..
det(EA) = det Pjk A = det Pjk  .  = det   = − det 
    
    .  . 

  ~rk   ~rj   ~rk 
      
.. .. ..
. . .
= − det A = det Pjk det A.

If we repeat (D8), then we obtain

det(E1 · · · Ek A) = det(E1 ) · · · det(Ek ) det(A)


RA
for elementary matrices E1 , . . . , Ek .

(D9) Let A ∈ M (n × n). Then A is invertible if and only if det A 6= 0.


Let A0 be the reduced row echelon form of A. By Proposition 3.67 there exist elementary matrices
E1 , . . . , Ek such that A = E1 · · · Ek A0 , hence

det A = det(E1 · · · Ek ) = det(E1 ) · · · det(Ek ) det A0 . (4.7)


D

Recall that the determinant of an elementary matrix is different from zero, so (4.7) shows that
det A = 0 if and only if det A0 = 0.
If A is invertible, then A0 = id hence det A0 = 1 6= 0 and therefore also det A 6= 0. If A is not
invertible, then the last row of A0 must be zero, hence det A0 = 0 and therefore also det A = 0.
Next we show that the determinant is multiplicative.

(D10) Let A, B ∈ M (n × n). Then det(AB) = det A det B.


As before, let A0 be the reduced row echelon form of A. By Proposition 3.67 there exist elementary
matrices E1 , . . . , Ek such that A = E1 · · · Ek A0 . It follows from (D9) that

det(AB) = det(E1 · · · Ek A0 B) = det(E1 ) · · · det(Ek ) det(A0 B). (4.8)


If A is invertible, then A0 = id and (4.7) shows that

det(AB) = det(E1 ) · · · det(Ek ) det(B) = det(E1 · · · Ek ) det(B) = det(A) det(B).

If on the other hand A is not invertible, then det A = 0. Moreover, the last row of A0 is zero,
so also the last row of A0 B is zero, hence A0 B is not invertible and therefore det A0 B = 0. So
we have det(AB) = 0 by (4.7), and also det(A) det(B) = 0 det(B) = 0, so also in this case
det(AB) = det A det B.

(D11) Let A ∈ M (n × n) be an invertible matrix. Then det(A−1 ) = (det A)−1 .

If A invertible then det A 6= 0 and it follows from (D10) that

1 = det(idn ) = det(AA−1 ) = det(A) det(A−1 ).

Solving for det(A−1 ) gives the desired formula.

Exercise. Let A ∈ M(n × n). Give two proofs of det(cA) = c^n det A using either one of the following:

(i) Apply (D1) or (D1') n times.

(ii) Use that cA = diag(c, c, . . . , c)A and apply (D10) and (D6).

The determinant is not additive!


RA
Recall that det(AB) = det A det B. But in general

det(A + B) 6= det A + det B.


   
1 0 0 0
For example, if A = and B = , then det A + det B = 0 + 0 = 0, but det(A + B) =
0 0 0 1
det id2 = 1.

The following theorem is Theorem 3.44 together with (D9).


D

Theorem 4.11. Let A ∈ M (n × n). Then the following is equivalent:

(i) A is invertible.

(ii) For every ~b ∈ Rn , the equation A~x = ~b has exactly one solution.

(iii) The equation A~x = ~0 has exactly one solution.

(iv) Every row-reduced echelon form of A has n pivots.

(v) A is row-equivalent to idn .

(vi) det A 6= 0.


On the computational complexity of the determinant.


Remark 4.12. The above properties provide an efficient way to calculate the determinant of an
n × n matrix. Note that both the Leibniz formula and the Laplace expansion require O(n!) steps
(O(n!) stands for “order of n!”. You can think of it as “roughly n!” or “up to a constant multiple
roughly equal to n!”. Something like O(2n!) is still the same as O(n!)). However, reducing a
matrix with the Gauß-Jordan elimination requires only O(n3 ) steps until we reach a row echelon
form. Since this is always an upper triangular matrix, its determinant can be calculated easily.
If n is big, then n3 is big, too, but n! is a lot bigger, so the Gauß-Jordan elimination is computa-
tionally much more efficient than the Leibniz formula or the Laplace expansion.

Let us illustrate this with an example.


       
1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
1 3 4 6 1 0 1 1 2 2 0 1 1 2 3 0 0 1 2
det 
1 7
 = det   = 5 det   = 5 det  
8 9 0 5 5 5 0 1 1 1 0 0 1 1
1 5 3 4 0 3 0 0 0 3 0 0 0 3 0 0

FT
   
1 2 3 4 1 2 3 4
4 0 0 0 1 5 0 3 0 0 7
= 5  = −5 det  = −15.
0 0 1 1 0 0 1 1
0 3 0 0 0 0 0 1

1 We subtract the first row from all the other rows. The determinant does not change.
2 We factor 5 in the third row.
We subtract 1/3 of the last row from rows 2 and 3. The determinant does not change.
RA
3
4 We subtract row 3 from row 2. The determinant does not change.
5 We swap rows 2 and 4. This gives a factor −1.
6 Easy calculation.
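The same strategy is easy to automate. The following sketch (assuming NumPy; the function name det_gauss is ours) reduces the matrix to upper triangular form, keeps track of row swaps, and multiplies the diagonal entries; this is the O(n^3) approach described in Remark 4.12.

```python
import numpy as np

def det_gauss(A, tol=1e-12):
    """O(n^3) determinant via Gaussian elimination with partial pivoting."""
    A = np.asarray(A, dtype=float).copy()
    n = A.shape[0]
    det = 1.0
    for j in range(n):
        p = j + np.argmax(np.abs(A[j:, j]))
        if abs(A[p, j]) < tol:
            return 0.0                       # no pivot in this column: det = 0
        if p != j:
            A[[j, p]] = A[[p, j]]            # a row swap contributes a factor -1
            det = -det
        for i in range(j + 1, n):
            A[i] -= (A[i, j] / A[j, j]) * A[j]   # adding multiples of rows keeps det unchanged
        det *= A[j, j]                       # (D6): product of the diagonal entries
    return det

A = np.array([[1, 2, 3, 4], [1, 3, 4, 6], [1, 7, 8, 9], [1, 5, 3, 4]])
print(det_gauss(A))   # -15.0, as computed above
```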

You should now have understood


• the different properties of the determinant,
D

• why a matrix is invertible if and only if its determinant is different from 0,


• why the Gauß-Jordan elimination is computationally more efficient than the Laplace expan-
sion formula,
• etc.
You should now be able to
• compute determinants using their properties,
• compute abstract determinants,
• use the factorisation of a matrix to compute its determinant,
• etc.


Exercises.

1. Suppose we know that \det \begin{pmatrix} a & -2b & c \\ 1 & 3 & -1 \\ 0 & 5 & 2 \end{pmatrix} = -1. Compute

\det \begin{pmatrix} -3a & 6b & -3c \\ -1 & -3 & 1 \\ 0 & \frac{5}{2} & 1 \end{pmatrix}, \quad \det \begin{pmatrix} a & -2b & c \\ 1+2a & 3-4b & -1+2c \\ -a & 5+2b & 2-c \end{pmatrix}, \quad \det \begin{pmatrix} 1 & 3 & -1 \\ 2 & 11 & 0 \\ a+1 & -2b+3 & c-1 \end{pmatrix}.

2. Suppose that \det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = 6. Compute:

\det \begin{pmatrix} a_{31} & a_{32} & a_{33} \\ 3a_{11}-5a_{31} & 3a_{12}-5a_{32} & 3a_{13}-5a_{33} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}.

3. For which values of a does the matrix
\begin{pmatrix} 2-a & -2 & 0 \\ 0 & 1 & 1+a \\ a & 2 & 2a \end{pmatrix}
have no inverse?

4. Let A = \begin{pmatrix} 1 & 4 \\ 0 & 4 \end{pmatrix}. Find all λ ∈ R such that λ id_2 − A is not invertible.

5. True or false? Let A, B ∈ M(n × n).
(a) If n is odd, then det(−A) = −det(A).
(b) If n is odd and A is antisymmetric, then A is not invertible.
(c) If det A = 0 then A = O.
(d) If det A = 0 then one row or one column of A consists only of zeros.
(e) If det A = 0 then at least one entry of A must be 0.
(f) If P is invertible and A = P B P^{-1}, then det A = det B.

6. (a) A matrix A ∈ M(n × n) is called orthogonal if its inverse is its transpose. Let A ∈ M(3 × 3) be orthogonal. What are the possibilities for det A?
(b) A matrix A ∈ M(n × n) is called idempotent if A^2 = A. Let A ∈ M(3 × 3) be idempotent. What are the possibilities for det A?

7. Let A, B ∈ M(3 × 3) be such that AB = O and suppose that det A ≠ 0. Show that B = O.

8. Let A, B, C ∈ M(3 × 3) be such that AB = AC and suppose that det A ≠ 0. Show that B = C. Is this still true if det A = 0?

9. Let A ∈ M(3 × 3) be such that the sum of its row (column) vectors is ~0. Show that A is not invertible.


4.3 Geometric interpretation of the determinant


In this short section we show a geometric interpretation of the determinant. This is of course only
a small part of the true importance of the determinant. You will hear more about this in a course
on vector calculus when you discuss the transformation formula (the substitution rule for higher
dimensional integrals), or in a course on Measure Theory or Differential Geometry. Here we content
ourselves with two basic facts.

Area in R2
   
Let ~a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} and ~b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} be vectors in R^2 and let us consider the matrix A = (~a | ~b), the matrix
whose columns are the given vectors. Then
A~e_1 = ~a, \qquad A~e_2 = ~b.
That means that A transforms the unit square spanned by the unit vectors ~e_1 and ~e_2 into the
parallelogram spanned by the vectors ~a and ~b. Let area(~a, ~b) be the area of the parallelogram
spanned by ~a and ~b. We can view ~a and ~b as vectors in R^3 simply by adding a third component.
Then formula (2.9) shows that the area of the parallelogram spanned by ~a and ~b is equal to
\left\| \begin{pmatrix} a_1 \\ a_2 \\ 0 \end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \\ 0 \end{pmatrix} \right\| = \left\| \begin{pmatrix} 0 \\ 0 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} \right\| = |a_1 b_2 - a_2 b_1| = |\det A|,
hence we obtain the formula
\operatorname{area}(\vec a, \vec b) = |\det A|.   (4.9)
RA
So while A tells us how the shape of the unit square changes, | det A| tells us how its area changes,
see Figure 4.1.

A
y y

 
b1
~e2 A~e2 = b2
D

A~e1 = ( aa12 )
x x
~e1

Figure 4.1: The figure shows how the area of the unit square transforms under the linear transforma-
tion A. The area of the square on left hand side is 1, the area of the parallelogram on the right hand
side is | det A|.

You should also notice the following: The area of the image of the unit square under A is zero
if and only if the two image vectors ~a and ~b are parallel. This is in accordance to the fact that
det A = 0 if and only if the two lines described by the associated linear equations are parallel (or if
one equation describes the whole plane).

Chapter 4. Determinants 153

Volumes in R3
     
Let ~a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, ~b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} and ~c = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} be vectors in R^3 and let us consider the matrix
A = (~a | ~b | ~c) whose columns are the given vectors. Then
A~e_1 = ~a, \qquad A~e_2 = ~b, \qquad A~e_3 = ~c.
That means that A transforms the unit cube spanned by the unit vectors ~e_1, ~e_2 and ~e_3 into the
parallelepiped spanned by the vectors ~a, ~b and ~c. Let vol(~a, ~b, ~c) be the volume of the parallelepiped
spanned by the vectors ~a, ~b and ~c. According to formula (2.10), vol(~a, ~b, ~c) = |⟨~a, ~b × ~c⟩|. We
calculate
|\langle \vec a, \vec b \times \vec c \rangle| = \left| \left\langle \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \times \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} \right\rangle \right| = \left| \left\langle \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \begin{pmatrix} b_2 c_3 - b_3 c_2 \\ b_3 c_1 - b_1 c_3 \\ b_1 c_2 - b_2 c_1 \end{pmatrix} \right\rangle \right|
= |a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1)| = |\det A|,
hence
\operatorname{vol}(\vec a, \vec b, \vec c) = |\det A|   (4.10)
since we recognise the second to last line as the expansion of det A along the first column. So while
A tells us how the shape of the unit cube changes, | det A| tells us how its volume changes.
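A small numerical illustration of (4.9) and (4.10), assuming NumPy; the vectors below are arbitrary examples, not taken from the text.

```python
import numpy as np

# Area of the parallelogram spanned by a and b in R^2
a, b = np.array([2.0, 1.0]), np.array([1.0, 3.0])
A = np.column_stack([a, b])
print(abs(np.linalg.det(A)))                                   # area = 5

# Volume of the parallelepiped spanned by u, v, w in R^3
u, v, w = np.array([1.0, 0, 0]), np.array([1.0, 2, 0]), np.array([0.0, 1, 3])
B = np.column_stack([u, v, w])
print(abs(np.dot(u, np.cross(v, w))), abs(np.linalg.det(B)))   # both give the volume 6
```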

z A
RA
z

A~e3
~e3

~e2 y y

A~e2
D

~e1 A~e
1

x
x

Figure 4.2: The figure shows how the volume of the unit cube transforms under the linear transfor-
mation A. The volume of the cube on left hand side is 1, the volume of the parallelepiped on the right
hand side is | det A|.

You should also notice the following: The volume of the image of the unit cube under A is zero if
and only if the three image vectors lie in the same plane. We will see later that this implies that
the range of A is not all of R3 , hence A cannot be invertible. For details, see Section 6.2.


What we saw for n = 2 and n = 3 can be generalised to Rn with n ≥ 4: A matrix A ∈ M (n × n)


transforms the unit cube in Rn spanned by the unit vectors ~e1 , . . . , ~en into a parallelepiped in Rn
and | det A| tells us how its volume changes.
Exercise. Give two proofs of the following statements: One using the formula (4.9) and linearity
of the determinant in its columns; and another proof using geometry.
y
(i) Show that the area of the blue parallelogram is w
~
twice the area of the green parallelogram.
2~v

−~v x

−w
~

FT
(ii) Show that the area of the blue parallelogram is
six times the area of the green parallelogram.

−w
~
y

~v
3w
~

2~v
x
RA
y

w
~
(iii) Show that the area of the blue and the red par-
allelogram is equal to the area of the green par-
~z
allelogram. ~v
D

x
−~z
−~z

You should now have understood


• the geometric interpretation of the determinant in R2 and R3 ,
• the close relation between the determinant and the cross product in R3 and that this is the
reason why the cross product appears in the formulas for the area of a parallelogram and

Chapter 4. Determinants 155

the volume of a parallelepiped,


• etc.
You should now be able to
• calculate the area of a parallelogram and the volume of a parallelepiped using determinants,
• etc.

Exercises.

1. Compute the volume of the parallelepiped generated by the vectors \begin{pmatrix} 1 \\ 3 \\ -5 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 4 \\ 4 \\ 4 \end{pmatrix}.

2. Let A = \begin{pmatrix} 1 & 0 & 2 \\ 0 & -1 & -2 \\ 2 & -2 & 0 \end{pmatrix}. Compute the volume of the parallelepiped generated by A~e_1, A~e_2 and
A~e_3. How do you interpret the result geometrically?

3. Let P = (1, 1), Q = (2, 2), R = (0, 3) and W = (−1, 2). Show that the points P, Q, R, W form a
parallelogram and compute its area.

4. Let P = (x_1, y_1), Q = (x_2, y_2) and R = (x_3, y_3) be points in R^2. Show that the area of the
triangle △PQR is given by the formula
\frac{1}{2} \det \begin{pmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{pmatrix}.
When will this determinant be equal to zero?
¿Cuándo este determinante será igual a cero?

4.4 Inverse of a matrix


In this section we prove a method to calculate the inverse of an invertible square matrix using
determinants. Although the formula might look nice, computationally it is not efficient. Here it
D

goes.
Let A = (aij )i,j=1,...,n ∈ M (n × n) and let Mij be its minors, see Definition 4.5. We already know
from (4.5) that for every fixed k ∈ {1, . . . , n}
n
X
det A = (−1)k+j akj det Mkj . (4.11)
j=1

Now we want to see that happens if the k in akj and in Mkj are different.

Proposition 4.13. Let A = (a_{ij})_{i,j=1,\dots,n} ∈ M(n × n) and let k, \ell ∈ {1, . . . , n} with k ≠ \ell. Then
\sum_{j=1}^{n} (-1)^{\ell+j} a_{kj} \det M_{\ell j} = 0.   (4.12)


Proof. We build the new matrix B from A by replacing its `th row by the kth row. Then B has
two equal rows (row ` and row k), hence det B = 0. Note that the matrices A and B are equal
B A
everywhere except possibly in the `th row, so their minors along the row ` are equal: M`j = M`j
(we put superscripts A, B in order to distinguish the minors of A and of B). If we expand det B
along the `th row then we find
n
X n
X
0 = det B = (−1)`+j b`j det M`j
B
= (−1)`+j akj det M`j
A
.
j=1 j=1

Using the cofactors Cij of A (see Definition 4.5), formulas (4.11) and (4.12) can be written as
n n
(
X
`+j A
X det A if k = `,
(−1) akj det M`j = akj C`j := (4.13)
j=1 j=1
0 if k 6= `.

Definition 4.14. For A ∈ M(n × n) we define its adjugate matrix adj A as the transpose of its
cofactor matrix:

\operatorname{adj} A := \begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{pmatrix}^{t} = \begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix}.

Theorem 4.15. Let A ∈ M(n × n) be an invertible matrix. Then
A^{-1} = \frac{1}{\det A} \operatorname{adj} A.   (4.14)

Proof. Let us calculate A adj A. By definition of adj A the coefficient c_{k\ell} in the matrix product
A adj A is exactly c_{k\ell} = \sum_{j=1}^{n} (-1)^{\ell+j} a_{kj} \det M_{\ell j}, so by (4.13) it follows that

A \operatorname{adj} A = \begin{pmatrix} \det A & 0 & \cdots & 0 \\ 0 & \det A & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \det A \end{pmatrix} = (\det A)\, \operatorname{id}_n.

Rearranging, we obtain that A^{-1} = \frac{1}{\det A} \operatorname{adj} A.
1
det A adj A.

Remark 4.16. Note that the proof of Theorem 4.15 shows that A adj A = det A idn is true for
every A ∈ M (n × n), even if it is not invertible (in this case, both sides of the formula are equal to
the zero matrix).

Formula (4.14) might look quite nice and innocent, however bear in mind that in order to calculate
A−1 with it you have to calculate one n ×n determinant and n2 determinants of the (n −1)×(n −1)
minors of A. This is a lot more than the O(n3 ) steps needed in the Gauß-Jordan elimination.
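For completeness, here is a small NumPy sketch of (4.14). The helper adjugate is ours, it uses np.linalg.det for the minors, and, as just noted, this route is far less efficient than Gauß-Jordan elimination.

```python
import numpy as np

def adjugate(A):
    """Transpose of the cofactor matrix (Definition 4.14)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            M_ij = np.delete(np.delete(A, i, axis=0), j, axis=1)   # minor matrix M_ij
            C[i, j] = (-1) ** (i + j) * np.linalg.det(M_ij)        # cofactor C_ij
    return C.T

A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [0.0, 8.0, 7.0]])
print(adjugate(A) / np.linalg.det(A))   # equals A^{-1}, cf. Example 4.18
print(np.linalg.inv(A))
```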
Finally, we prove Cramer’s rule for finding the solution of a linear system if the corresponding
matrix is invertible.


Theorem 4.17. Let A ∈ M(n × n) be an invertible matrix and let ~b ∈ R^n. Then the unique
solution ~x of A~x = ~b is given by

\vec x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \frac{1}{\det A} \begin{pmatrix} \det A^{\vec b}_1 \\ \det A^{\vec b}_2 \\ \vdots \\ \det A^{\vec b}_n \end{pmatrix}   (4.15)

where A^{\vec b}_j is the matrix obtained from the matrix A if we replace its jth column by the vector ~b.

Proof. As usual we write C_{ij} for the cofactors of A and M_{ij} for its minors. Since A is invertible,
we know that ~x = A^{-1}~b = \frac{1}{\det A} \operatorname{adj} A\, \vec b. Therefore we find for j = 1, . . . , n that
x_j = \frac{1}{\det A} \sum_{k=1}^{n} C_{kj} b_k = \frac{1}{\det A} \sum_{k=1}^{n} (-1)^{k+j} b_k \det M_{kj} = \frac{1}{\det A} \det A^{\vec b}_j.
The last equality is true because the second to last sum is the expansion of the determinant of A^{\vec b}_j
along the jth column.
along the kth column.
Note that, even if (4.15) might look quite nice, it involves the computation of n + 1 determinants
of n × n matrices, so it involves O((n + 1)!) steps.
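A short NumPy sketch of Cramer's rule as stated in Theorem 4.17; the function name cramer is ours, and np.linalg.det is used for the n + 1 determinants.

```python
import numpy as np

def cramer(A, b):
    """Solve A x = b via Cramer's rule (only valid for invertible A)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("matrix is not invertible")
    x = np.empty(len(b))
    for j in range(len(b)):
        Abj = A.copy()
        Abj[:, j] = b                      # replace column j by b
        x[j] = np.linalg.det(Abj) / d
    return x

A = np.array([[1.0, 1, 1], [0, 4, -1], [3, -1, 2]])
b = np.array([8.0, -2, 0])
print(cramer(A, b))   # [-7.75  2.75 13.  ], i.e. -31/4, 11/4, 13, cf. Example 4.19
```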

 
Example 4.18. Let us calculate the inverse of the matrix A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 0 & 8 & 7 \end{pmatrix} from Example 4.9. We
already know that det A = 27. Its cofactors are

C_{11} = \det \begin{pmatrix} 5 & 6 \\ 8 & 7 \end{pmatrix} = -13, \quad C_{12} = -\det \begin{pmatrix} 4 & 6 \\ 0 & 7 \end{pmatrix} = -28, \quad C_{13} = \det \begin{pmatrix} 4 & 5 \\ 0 & 8 \end{pmatrix} = 32,
C_{21} = -\det \begin{pmatrix} 2 & 3 \\ 8 & 7 \end{pmatrix} = 10, \quad C_{22} = \det \begin{pmatrix} 1 & 3 \\ 0 & 7 \end{pmatrix} = 7, \quad C_{23} = -\det \begin{pmatrix} 1 & 2 \\ 0 & 8 \end{pmatrix} = -8,
C_{31} = \det \begin{pmatrix} 2 & 3 \\ 5 & 6 \end{pmatrix} = -3, \quad C_{32} = -\det \begin{pmatrix} 1 & 3 \\ 4 & 6 \end{pmatrix} = 6, \quad C_{33} = \det \begin{pmatrix} 1 & 2 \\ 4 & 5 \end{pmatrix} = -3.

Therefore

A^{-1} = \frac{1}{\det A} \operatorname{adj} A = \frac{1}{\det A} \begin{pmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{pmatrix} = \frac{1}{27} \begin{pmatrix} -13 & 10 & -3 \\ -28 & 7 & 6 \\ 32 & -8 & -3 \end{pmatrix}.

Example 4.19. Let us use Cramer’s rule to solve the following linear system:
x+y+ z = 8
4y − z = −2
3x − y + 2z = 0.


We write the previous system in the form A~x = ~b:

\begin{pmatrix} 1 & 1 & 1 \\ 0 & 4 & -1 \\ 3 & -1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 8 \\ -2 \\ 0 \end{pmatrix}.

So, by Cramer's rule:

x = \frac{\det \begin{pmatrix} 8 & 1 & 1 \\ -2 & 4 & -1 \\ 0 & -1 & 2 \end{pmatrix}}{\det \begin{pmatrix} 1 & 1 & 1 \\ 0 & 4 & -1 \\ 3 & -1 & 2 \end{pmatrix}} = \frac{62}{-8} = -\frac{31}{4}, \qquad
y = \frac{\det \begin{pmatrix} 1 & 8 & 1 \\ 0 & -2 & -1 \\ 3 & 0 & 2 \end{pmatrix}}{-8} = \frac{-22}{-8} = \frac{11}{4}, \qquad
z = \frac{\det \begin{pmatrix} 1 & 1 & 8 \\ 0 & 4 & -2 \\ 3 & -1 & 0 \end{pmatrix}}{-8} = \frac{-104}{-8} = 13.
You should now have understood
• what the adjugate matrix is and why it can be used to calculate the inverse of a matrix,
• etc.
You should now be able to
• calculate A−1 using adj A.
• solve systems of linear equations using Cramer’s rule.

• etc.

Exercises.
1. Using Cramer's rule, solve the following systems of linear equations:

(a) 2x + 3y = −1        (b) 3x − 2y − 2z = −2     (c) 2x + 3y − z = −5     (d) x − w = 7
    −7x + 4y = 5            x + y + 2z = 4            2x + 4y + 6z = 2          2y + z = 2
                            2x + y + z = 6            x + 2z = 0                4x − y = −3
                                                                                3z − 5w = 2


2. Compute the inverse of the following matrices using the cofactor method:
$$\text{(a) } \begin{pmatrix} 3 & 5 \\ -1 & 2 \end{pmatrix} \qquad \text{(b) } \begin{pmatrix} 4 & 2 & 2 \\ 0 & 1 & 2 \\ 1 & 0 & 3 \end{pmatrix} \qquad \text{(c) } \begin{pmatrix} 1 & -1 & 2 \\ 3 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \qquad \text{(d) } \begin{pmatrix} 3 & -12 & -2 & -6 \\ 1 & -3 & 0 & -2 \\ -1 & 6 & 1 & 3 \\ -2 & 10 & 2 & 5 \end{pmatrix}.$$

3. Let
$$A = \begin{pmatrix} 1 & -2 & 3 & 1 \\ -1 & 5 & 0 & 2 \\ 0 & 3 & 1 & -2 \\ -2 & 1 & 2 & 4 \end{pmatrix}.$$
Compute the entry $a_{32}$ of $A^{-1}$.

4. The goal of the following exercise is to prove the law of cosines using linear algebra.

(a) Consider the following triangle whose sides have lengths a, b, c:

[Figure: a triangle with sides a, b, c and angles α, β, γ; the side c is split into the segments b cos α and a cos β by the foot of the altitude.]

Using elementary trigonometry, deduce the following relations:
b cos α + a cos β = c
c cos α + a cos γ = b
c cos β + b cos γ = a

(b) Consider the above relations as a system of equations with cos α, cos β and cos γ as the unknowns. Compute the determinant of the system.

(c) Use Cramer's rule to solve for cos(γ) and conclude the law of cosines:
$$c^2 = a^2 + b^2 - 2ab\cos(\gamma).$$

5. Let A ∈ M(n × n).

(a) Show that if A is not invertible then A adj A = O.

(b) Show that det(adj A) = (det A)ⁿ⁻¹.

6. Let A ∈ M(n × n) be symmetric and invertible. Show that adj A is symmetric. Does the statement remain true if we do not assume that A is invertible?


4.5 Summary
The determinant is a function from the square matrices to the real numbers. Later we will also consider matrices with complex entries. In this case, the determinant is a function from the square matrices to the complex numbers. Let $A = (a_{ij})_{i,j=1}^{n} \in M(n \times n)$.

Formulas for the determinant.


$$\det A = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} \qquad \text{(Leibniz formula)}$$
$$\det A = \sum_{j=1}^{n} (-1)^{k+j} a_{kj} \det M_{kj} = \sum_{j=1}^{n} a_{kj} C_{kj} \qquad \text{(Laplace expansion along the kth row)}$$
$$\det A = \sum_{i=1}^{n} (-1)^{i+k} a_{ik} \det M_{ik} = \sum_{i=1}^{n} a_{ik} C_{ik} \qquad \text{(Laplace expansion along the kth column)}$$
with the following notation

• Sn is the set of all permutations of {1, . . . , n},

• Mij are the minors of A ((n − 1) × (n − 1) matrices obtained from A by deleting row i and
column j),

• Cij = (−1)i+j det Mij are the cofactors of A.


Inverse of a matrix using the adjugate matrix
If A ∈ M(n × n) is invertible then
$$A^{-1} = \frac{1}{\det A}\operatorname{adj} A = \frac{1}{\det A}\begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix}.$$

Geometric interpretation
The determinant of a matrix A gives the oriented volume of the image of the unit cube under A.

• in R2 : area of parallelogram spanned by ~a and ~b = | det A|,


• in R3 : volume of parallelepiped spanned by ~a, ~b and ~c = | det A|.

Properties of the determinant.


• The determinant is linear in its rows and columns.
• The determinant is alternating in its rows and columns.
• det idn = 1.


• det A = det At .
• If one row of A is a multiple of another row, or if a column is a multiple of another column, then det A = 0. In particular, if A has two equal rows or two equal columns then det A = 0.
• The determinant of an upper or lower triangular matrix is the product of its diagonal entries.
• The determinants of the elementary matrices are

det Sj (c) = c, det Qij (c) = 1, det Pij = −1.

• Let A ∈ M (n × n). Then A is invertible if and only if det A 6= 0.

• Let A, B ∈ M (n × n). Then det(AB) = det A det B.

• If A ∈ M (n × n) is invertible, then det(A−1 ) = (det A)−1 .

Note however that in general det(A + B) ≠ det A + det B.
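These rules are easy to test numerically. A quick sketch (not part of the notes; it assumes NumPy and uses random integer matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, (3, 3)).astype(float)
B = rng.integers(-3, 4, (3, 3)).astype(float)

# multiplicativity holds ...
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))   # True
# ... but the determinant is not additive in general
print(np.isclose(np.linalg.det(A + B), np.linalg.det(A) + np.linalg.det(B)))   # typically False
```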

Theorem. Let A ∈ M(n × n). Then the following are equivalent:
(i) det A ≠ 0.
(ii) A is invertible.
(iii) For every $\vec b \in \mathbb{R}^n$, the equation $A\vec x = \vec b$ has exactly one solution.
(iv) The equation $A\vec x = \vec 0$ has exactly one solution.
(v) Every row-reduced echelon form of A has n pivots.
(vi) A is row-equivalent to $\operatorname{id}_n$.

4.6 Exercises
1. Compute the determinant of the following matrices. Determine whether the matrices are invertible. If they are, find their inverse matrix and the determinant of the inverse.
$$A = \begin{pmatrix} 1 & -2 \\ 2 & 7 \end{pmatrix}, \quad B = \begin{pmatrix} -14 & 21 \\ 12 & -18 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 3 & 6 \\ 4 & 1 & 0 \\ 1 & 4 & 3 \end{pmatrix}, \quad E = \begin{pmatrix} 1 & 4 & 6 \\ 2 & 1 & 5 \\ 3 & 5 & 11 \end{pmatrix}.$$

2. Let
$$A = \begin{pmatrix} 3 & 5t^3 & 1 \\ 0 & 2+t & 0 \\ -t^4 & 10-t & t^2 \end{pmatrix}.$$
Determine all t ∈ R such that the system $A\vec x = \begin{pmatrix} \pi \\ \sqrt{2} \\ \tfrac{3}{2} \end{pmatrix}$ has exactly one solution.


 
3. Let
$$A = \begin{pmatrix} 1+x & y & z & w & r \\ x & 1+y & z & w & r \\ x & y & 1+z & w & r \\ x & y & z & 1+w & r \\ x & y & z & w & 1+r \end{pmatrix};$$
show that det A = 1 + x + y + z + w + r.
(Hint: it is easier if you use property D1 from Section 4.2.)

4. Compute the determinant of the following matrices. Determine whether the matrices are invertible. If they are, find their inverse matrix and the determinant of the inverse.
$$A = \begin{pmatrix} \pi & 3 \\ 5 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} -1 & 2 & 3 \\ 1 & 3 & 1 \\ 4 & 3 & 2 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 2 & 3 & 0 \\ 0 & 1 & 2 & 2 \\ 1 & 4 & 0 & 3 \\ 1 & 1 & 5 & 4 \end{pmatrix}.$$

5. Find at least four 3 × 3 matrices whose determinant is 18.

6. Use the factorizations found in Exercises 14. and 14. in Chapter 3 to compute their determinants.

7. Write the matrix $A = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 6 \\ -2 & -2 & -6 \end{pmatrix}$ as a product of elementary matrices and compute the determinant of A using the elementary matrices found.


8. Determine all x ∈ R such that the following matrices are invertible:
$$A = \begin{pmatrix} x & 2 \\ 1 & x-3 \end{pmatrix}, \qquad B = \begin{pmatrix} x & x & 3 \\ 1 & 2 & 6 \\ -2 & 2 & -6 \end{pmatrix}, \qquad C = \begin{pmatrix} 11-x & 5 & -50 \\ 3 & -x & -15 \\ 2 & 1 & -x-9 \end{pmatrix}.$$

9. Suppose that a function y satisfies $y^{[n]} = b_{n-1} y^{[n-1]} + \cdots + b_1 y' + b_0 y$ where $b_0, \ldots, b_{n-1}$ are constant coefficients and $y^{[j]}$ denotes the jth derivative of y.

Verify that $Y' = AY$ where
$$A = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & \ddots & & \vdots \\ \vdots & & & \ddots & 1 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ b_0 & b_1 & b_2 & \cdots & \cdots & b_{n-1} \end{pmatrix}, \qquad Y = \begin{pmatrix} y \\ y' \\ y'' \\ \vdots \\ y^{[n-1]} \end{pmatrix}$$
and compute the determinant of A.


10. Without using expansion formulas for determinants, find for each of the given matrices parameters x, y such that the determinant of the matrix equals zero, and explain why the parameters you found work.
$$N_1 = \begin{pmatrix} x & 2 & 6 \\ 2 & 5 & 1 \\ 3 & 4 & y \end{pmatrix}, \qquad N_2 = \begin{pmatrix} 1 & x & y & 2 \\ x & 0 & 1 & y \\ x & 5 & 3 & y \\ 4 & x & y & 8 \end{pmatrix}.$$

11. (a) Compute det Bₙ where Bₙ is the matrix in M(n × n) whose diagonal entries are 0 and all other entries are 1, that is:
$$B_1 = 0, \quad B_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad B_3 = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \quad B_4 = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}, \quad B_5 = \begin{pmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 & 0 \end{pmatrix}, \text{ etc.}$$
How does the answer change if instead of 0 there is x on the diagonal?

(b) Compute det Bₙ where Bₙ is the matrix in M(n × n) whose diagonal entries are 0 and all other entries satisfy $b_{ij} = (-1)^{i+j}$, that is:
$$B_1 = 0, \quad B_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad B_3 = \begin{pmatrix} 0 & 1 & -1 \\ 1 & 0 & 1 \\ -1 & 1 & 0 \end{pmatrix}, \quad B_4 = \begin{pmatrix} 0 & 1 & -1 & 1 \\ 1 & 0 & 1 & -1 \\ -1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 0 \end{pmatrix}, \quad B_5 = \begin{pmatrix} 0 & 1 & -1 & 1 & -1 \\ 1 & 0 & 1 & -1 & 1 \\ -1 & 1 & 0 & 1 & -1 \\ 1 & -1 & 1 & 0 & 1 \\ -1 & 1 & -1 & 1 & 0 \end{pmatrix}, \text{ etc.}$$
How does the answer change if instead of 0 there is x on the diagonal? Compare with Exercise 10.
Chapter 5

Vector spaces

In the following, K always denotes a field. In this chapter, you may always think of K = R,

though almost everything is true also for other fields, like C, Q or Fp where p is a prime number.
Later, in Chapter 8 it will be more useful to work with K = C.

In this chapter we will work with abstract vector spaces. We will first discuss their basic proper-
ties. Then, in Section 5.2 we will define subspaces. These are subsets of vector spaces which are
themselves vector spaces. In Section 5.4 we will introduce bases and the dimension of a vector
space. These concepts are fundamental in linear algebra since they allow us to classify all finite
dimensional vector spaces. In a certain sense, all n-dimensional vector spaces over the same field
K are equal. In Chapter 6 we will study linear maps between vector spaces.

5.1 Definitions and basic properties


First we recall the definition of an abstract vector space from Chapter 2 (p. 29).

Definition 5.1. Let V be a set together with two operations


vector sum: $+ : V \times V \to V$, $(v, w) \mapsto v + w$,

product of a scalar and a vector: $\cdot : K \times V \to V$, $(\lambda, v) \mapsto \lambda \cdot v$.

Note that we will usually write λv instead of λ · v. Then V (or more precisely, (V, +, ·)) is called a
vector space over K if for all u, v, w ∈ V and all λ, µ ∈ K the following holds:

(a) Associativity: (u + v) + w = u + (v + w) for every u, v, w ∈ V .


(b) Commutativity: v + w = w + v for every v, w ∈ V .
(c) Identity element of addition: There exists an element O ∈ V , called the additive identity
such that O + v = v + O = v for every v ∈ V .
(d) Inverse element: For every v ∈ V , there exists an inverse element v 0 such that v + v 0 = O.


(e) Identity element of multiplication by scalar: For every v ∈ V , we have that 1v = v.


(f) Compatibility: For every v ∈ V and λ, µ ∈ K, we have that (λµ)v = λ(µv).
(g) Distributivity laws: For all v, w ∈ V and λ, µ ∈ K, we have

(λ + µ)v = λv + µv and λ(v + w) = λv + λw.

We already know that Rn is a vector space over R.

Remark 5.2. (i) Note that we use the notation ~v with an arrow only for the special case of
vectors in Rn or Cn . Vectors in abstract vector spaces are usually denoted without an arrow.
(ii) If K = R, then V is called a real vector space. If K = C, then V is called a complex vector
space.

Before we give examples of vector spaces, we first show some basic properties of vector spaces.

Properties 5.3. (i) The identity element is unique. (Note that in the vector space axioms we
only asked for existence of an additive identity element; we did not ask for uniqueness. So one
could think that there may be several elements which satisfy (c) in Definition 5.1. However,
this is not possible as the following proof shows.)

Proof. Assume there are two neutral elements O and O′. Then we know that for every v and w in V the following is true:
v = v + O,    w = w + O′.
Now let us take v = O′ and w = O. Then, using commutativity, we obtain
O′ = O′ + O = O + O′ = O.

(ii) Let x, y, z ∈ V . If x + y = x + z, then y = z.

Proof. Let x′ be an additive inverse of x (that means that x′ + x = O; such an element must exist since V is a vector space). The claim follows from
y = O + y = (x′ + x) + y = x′ + (x + y) = x′ + (x + z) = (x′ + x) + z = O + z = z.

(iii) For every v ∈ V , its inverse element is unique. (Note that in the vector space axioms we
only asked for existence of an additive inverse for every element x ∈ V ; we did not ask
for uniqueness. So one could think that there may be several elements which satisfy (d) in
Definition 5.1. However, this is not possible as the following proof shows.)

Proof. Let v ∈ V and assume that there are elements v′, v″ in V such that
v + v′ = O,    v + v″ = O.
Now it follows from (ii) that v′ = v″ (take x = v, y = v′ and z = v″ in (ii)).


(iv) For every λ ∈ K we have λO = O.

Proof. Observe that λO = λO + O and that λO = λ(O + O) = λO + λO, hence

λO + O = λO + λO.

Now it follows from (ii) that O = λO (take x = λO, y = O and z = λO in (ii)).

(v) For every v ∈ V we have that 0v = O.

Proof. The proof is similar to the one above. Observe that 0v = 0v + O and 0v = (0 + 0)v = 0v + 0v, hence
0v + O = 0v + 0v.
Now it follows from (ii) that O = 0v (take x = 0v, y = O and z = 0v in (ii)).

(vi) If λv = O, then either λ = 0 or v = O.

Proof. If λ = 0, then there is nothing to prove. Now assume that λ ≠ 0. Then v = O because
$$v = \frac{1}{\lambda}(\lambda v) = \frac{1}{\lambda}\, O = O.$$

(vii) For every v ∈ V , its inverse is (−1)v.
Proof. Let v ∈ V . Observe that by (v), we have that 0v = O. Therefore
O = 0v = (1 + (−1))v = v + (−1)v.
Hence (−1)v is an additive inverse of v. By (iii), the inverse of v is unique, therefore (−1)v is the inverse of v.

Remark 5.4. From now on, we write −v for the additive inverse of a vector. This notation is

justified by Property 5.3 (vii).

Examples 5.5. We give some important examples of vector spaces.

• R is a real vector space. More generally, Rn is a real vector space. The proof is the same
as for R2 in Chapter 2. Associativity and commutativity are clear. The identity element is
the vector whose entries are all equal to zero: ~0 = (0, . . . , 0)t . The inverse for a given vector
~x = (x1 , . . . , xn )t is (−x1 , . . . , −xn )t . The distributivity laws are clear, as is the fact that
1~x = ~x for every ~x ∈ Rn .

• C is a complex vector space. More generally, Cn is a complex space. The proof is the same
as for Rn .

• C can also be viewed as a real vector space.


• R is not a complex vector space with the usual definition of the algebraic operations. If it
was, then the vectors would be real numbers and the scalars would be complex numbers. But
then if we take 1 ∈ R and i ∈ C, then the product i1 must be a vector, that is, a real number,
which is not the case.
• R can be seen as a Q-vector space.
• For every n, m ∈ N, the space M (m × n) of all m × n matrices with real coefficients is a real
vector space.

Proof. Note that in this case the vectors are matrices. Associativity and commutativity are
easy to check. The identity element is the matrix whose entries are all equal to zero. Given
a matrix $A = (a_{ij})_{i=1,\ldots,m;\, j=1,\ldots,n}$, its (additive) inverse is the matrix $-A = (-a_{ij})_{i=1,\ldots,m;\, j=1,\ldots,n}$. The
distributivity laws are clear, as is the fact that 1A = A for every A ∈ M (m × n).

• For every n, m ∈ N, the space M (m × n, C) of all m × n matrices with complex coefficients,


is a complex vector space.

Proof. As in the case of real matrices.

• Let C(R) be the set of all continuous functions from R to R. We define the sum of two
functions f and g in the usual way as the new function
f + g : R → R, (f + g)(x) = f (x) + g(x).
The product of a function f with a real number λ gives the new function λf defined by
λf : R → R, (λf )(x) = λf (x).
Then C(R) is a vector space with these new operations.

Proof. It is clear that these operations satisfy associativity, commutativity and distributivity
and that 1f = f for every function f ∈ C(R). The additive identity is the zero function
(the function which is constant to zero). For a given function f , its (additive) inverse is the
function −f .

• Let P (R) be the set of all polynomials. With the usual sum and products with scalars, they
form a vector space.

Prove that C is a vector space over R and that R is a vector space over Q.

Observe that the sets M (m × n) and C(R) admit more operations, for example we can multiply
functions, or we can multiply matrices or we can calculate det A for a square matrix. However, all
these operations have nothing to do with the question whether they are vector spaces or not. It
is important to note that for a vector space we only need the sum of two vectors and the product
of a scalar with a vector, and that these operations satisfy the axioms from Definition 5.1.

We give more examples.


Examples 5.6. • Consider R² but we change the usual sum to the new sum ⊕ defined by
$$\begin{pmatrix} x \\ y \end{pmatrix} \oplus \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} x+a \\ 0 \end{pmatrix}.$$
With this new sum, R² is not a vector space. The reason is that there is no additive identity. To see this, assume that we had an additive identity, say $\binom{\alpha}{\beta}$. Then we must have
$$\begin{pmatrix} x \\ y \end{pmatrix} \oplus \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} \quad \text{for all } \begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2.$$
However, for example,
$$\begin{pmatrix} 0 \\ 1 \end{pmatrix} \oplus \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \alpha \\ 0 \end{pmatrix} \neq \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$
so no such additive identity can exist.

• Consider R² but we change the usual sum to the new sum ⊕ defined by
$$\begin{pmatrix} x \\ y \end{pmatrix} \oplus \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} x+b \\ y+a \end{pmatrix}.$$
With this new sum, R² is not a vector space. One of the reasons is that the sum is not commutative. For example
$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} \oplus \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1+1 \\ 0+0 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix}, \quad \text{but} \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix} \oplus \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0+0 \\ 1+1 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \end{pmatrix}.$$
Show that there is no additive identity O which satisfies ~x ⊕ O = ~x for all ~x ∈ R2 .
• Let V = R⁺ = (0, ∞). We make V a real vector space with the following operations: let x, y ∈ V and λ ∈ R. We define
$$x \oplus y = xy \qquad \text{and} \qquad \lambda \odot x = x^{\lambda}.$$
Then (V, ⊕, ⊙) is a real vector space.



Proof. Let u, v, w ∈ V and let λ ∈ R. Then:

(a) Associativity: (u ⊕ v) ⊕ w = (uv) ⊕ w = (uv)w = u(vw) = u(v ⊕ w) = u ⊕ (v ⊕ w).


(b) Commutativity: v ⊕ w = vw = wv = w ⊕ v.
(c) The additive identity of ⊕ is 1 because for every x ∈ V we have that 1 ⊕ x = 1x = x.
(d) Inverse element: For every x ∈ V , its inverse element is x⁻¹ because x ⊕ x⁻¹ = xx⁻¹ = 1, which is the identity element. (Note that this is in accordance with Properties 5.3 (vii) since (−1) ⊙ x = x⁻¹.)
(e) Identity element of multiplication by scalar: For every x ∈ V , we have that 1 ⊙ x = x¹ = x.


(f) Compatibility: For every x ∈ V and λ, µ ∈ R, we have that
$$(\lambda\mu) \odot x = x^{\lambda\mu} = (x^{\mu})^{\lambda} = \lambda \odot (x^{\mu}) = \lambda \odot (\mu \odot x).$$
(g) Distributivity laws: For all x, y ∈ V and λ, µ ∈ R, we have
$$(\lambda + \mu) \odot x = x^{\lambda+\mu} = x^{\lambda} x^{\mu} = (\lambda \odot x)(\mu \odot x) = (\lambda \odot x) \oplus (\mu \odot x)$$
and
$$\lambda \odot (x \oplus y) = (x \oplus y)^{\lambda} = (xy)^{\lambda} = x^{\lambda} y^{\lambda} = x^{\lambda} \oplus y^{\lambda} = (\lambda \odot x) \oplus (\lambda \odot y).$$

• The example above can be generalised: let f : R → (a, b) be a bijective function. Then the interval (a, b) becomes a real vector space if we define the sum of two vectors x, y ∈ (a, b) by
$$x \oplus y = f\bigl(f^{-1}(x) + f^{-1}(y)\bigr)$$
and the product of a scalar λ ∈ R and a vector x ∈ (a, b) by
$$\lambda \odot x = f\bigl(\lambda f^{-1}(x)\bigr).$$
Note that in the example above we had (a, b) = (0, ∞) and f = exp (that is, f(x) = eˣ).
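One can also test these axioms numerically. The following is only a sketch (not part of the notes) that checks a few of them for the example V = (0, ∞) with x ⊕ y = xy and λ ⊙ x = xᵏ; the function names `oplus` and `odot` are just illustrative choices.

```python
import math

def oplus(x, y):        # vector sum on V = (0, infinity)
    return x * y

def odot(lam, x):       # product of a scalar and a "vector"
    return x ** lam

x, lam, mu = 2.5, 3.0, -1.5
# distributivity: (lam + mu) (.) x == (lam (.) x) (+) (mu (.) x)
print(math.isclose(odot(lam + mu, x), oplus(odot(lam, x), odot(mu, x))))   # True
# additive identity is 1, and the inverse of x is x**(-1)
print(oplus(1.0, x) == x, math.isclose(oplus(x, x ** -1), 1.0))            # True True
```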

You should have understood


• the concept of an abstract vector space,
• that the spaces Rn are examples of vector spaces, but there are many more,
• that "vectors" cannot always be written as columns (think of the vector space of all
polynomials, etc.)
• etc.

You should now be able to



• give examples of vector spaces different from Rn or Cn ,


• check if a given set with a given addition and multiplication with scalars is a vector space,
• recite the vector space axioms when woken in the middle of the night,
• etc.

Exercises.
In the following exercises, decide whether the given set is a vector space over R. If it is not, determine which vector space properties fail.
1. {(x, y) ∈ R² : y ≤ 0} with the usual sum and scalar product.
2. {(x, 2x, 3x) : x ∈ R} with the usual sum and scalar product.


3. {(x, y, z) ∈ R³ : xy = 0} with the usual sum and scalar product.

4. {(x, y) ∈ R² : x ∈ R, y ∈ Q} with the usual sum and scalar product.

5. The set of points of R³ that lie on the line x = t + 1, y = 2t, z = 0, with the usual sum and scalar product of R³.

6. $\left\{ \begin{pmatrix} 0 & -a \\ 0 & b \end{pmatrix} : a, b \in \mathbb{R} \right\}$ with the usual sum and scalar multiplication of matrices.

7. The set of polynomials of degree ≤ 3 with zero constant term, with the usual sum and scalar multiplication of polynomials.

8. The set of polynomials of degree ≤ 2 whose coefficient of X is non-negative.

9. The set of functions that are differentiable on all of R, with the sum and scalar multiplication of C(R).

10. Rⁿ with the following operations:
    sum: $\vec a \oplus \vec b := \vec a - \vec b$,   scalar product: $\lambda \odot \vec a := \lambda\vec a$.

11. R² with the following operations:
    sum: (a, b) ⊕ (c, d) := (a + c, −b − d),   scalar product: λ ⊙ (a, b) := (λa, λb).

12. R² with the following operations:
    sum: (a, b) ⊕ (c, d) := (a + c + 1, b + d),   scalar product: λ ⊙ (a, b) := (a, λb).

13. Let V = {a} (note that V has only one element); on V define the following operations:
    sum: a ⊕ a := a,   scalar product: λ ⊙ a := a.

5.2 Subspaces
In this section, we work mostly with real vector spaces for the sake of definiteness. However, all
the statements are also true for complex vector spaces. We only have to replace R by C and the
word real by complex everywhere.

In this section we will investigate when a subset of a given vector space is itself a vector space.

Definition 5.7. Let V be a vector space and let W ⊆ V be a subset of V . Then W is called a
subspace of V if W itself is a vector space with the sum and product with scalars inherited from V .
A subspace W is called a proper subspace if W 6= {O} and W 6= V .


First we observe the following basic facts.

Remark 5.8. Let V be a vector space.

• V always contains the following subspaces: {O} and V itself. However, they are not proper
subspaces.
• If V is a vector space, W is a subspace of V and U is a subspace of W , then U is a subspace
of V .
Prove these statements.

Remark 5.9. Let W be a subspace of a vector space V . Let O be the neutral element in V . Then
O ∈ W and it is the neutral element of W .

Proof. Since W is a vector space, it must have a neutral element $O_W$. A priori, it is not clear that $O_W = O$. However, since $O_W \in W \subset V$, we know that $0\,O_W = O$. On the other hand, since W is a vector space, it is closed under the product with scalars, so $O = 0\,O_W \in W$. Clearly, O is a neutral element in W . So it follows that $O = O_W$ by uniqueness of the neutral element of W , see
Properties 5.3(i).

Now assume that we are given a vector space V and in it a subset W ⊆ V and we would like to
check if W is a vector space. In principle we would have to check all seven vector space axioms
from Definition 5.1. However, if W is a subset of V , then we get some of the vector space axioms
for free. More precisely, the axioms (a), (b), (e), (f) and (g) hold automatically. For example, to
prove (b), we take two elements w1 , w2 ∈ W . They belong also to V since W ⊆ V , and therefore
RA
they commute: w1 + w2 = w2 + w1 .
We can even show the following proposition:

Proposition 5.10. Let V be a real vector space and W ⊆ V a subset. Then W is a subspace of V
if and only if the following three properties hold:

(i) W 6= ∅, that is, W is not empty.


(ii) W is closed under sums, that is, if we take w1 and w2 in W , then their sum w1 + w2 belongs
to W .

(iii) W is closed under product with scalars, that is, if we take w ∈ W and λ ∈ R, then λw ∈ W .

Note that (ii) and (iii) can be summarised in the following:

(iv) W is closed under sums and product with scalars, that is, if we take w1 , w2 ∈ W and λ ∈ R,
then λw1 + w2 ∈ W .

Proof of 5.10. Assume that W is a subspace, then clearly (ii) and (iii) hold. (i) holds because every
vector space must contain at least the additive identity O.
Now suppose that W is a subset of V such that the properties (i), (ii) and (iii) are satisfied. In
order to show that W is a subspace of V , we need to verify the vector space axioms (a)–(g) from
Definition 5.1. By assumptions (ii) and (iii), the sum and product with scalars are well defined in
W . Moreover, we already convinced ourselves that (a), (b), (e), (f) and (g) hold. Now, for the


existence of an additive identity, we take an arbitrary w ∈ W (such a w exists because W is not


empty by assumption (i)). Hence O = 0w ∈ W where O is the additive identity in V . This is then
also the additive identity in W . Finally, given w ∈ W ⊆ V , we know from Properties 5.3 (vii) that
its additive inverse is (−1)w, which, by our assumption (iii), belongs to W . So we have verified
that W satisfies all vector space axioms, so it is a vector space.
The proposition is also true if V is a complex vector space. We only have to replace R everywhere
by C.

In order to verify that a given W ⊆ V is a subspace, one only has to verify (i), (ii) and (iii)
from the preceding proposition. In order to verify that W is not empty, one typically checks if it
contains O.

Let us see examples of subspaces.

Examples 5.11. Let V be a vector space. We assume that V is a real vector space, but everything

works also for a complex vector space (we only have to replace R everywhere by C.)
(i) {0} is a subspace of V . It is called the trivial subspace of V .
(ii) V itself is a subspace of V .
(iii) Fix v ∈ V . Then the set W := {λv : λ ∈ R} is a subspace of V .
(iv) More generally, if we fix v1 , . . . , vk ∈ V , then the set W := {α1 v1 + · · · + αk vk : α1 , . . . , αk ∈ R}
is a subspace of V . This set is called the linear span of v1 , . . . , vk . It will be shown in
Theorem 5.24 that it is indeed a vector space.
(v) P (R), the set of all real polynomials, is a subspace of C(R), the set of all continuous functions
on R.
(vi) For every n, the polynomials of degree at most n, Pn (R), is a subspace of P (R), and also of
C(R).
(vii) If W is a subspace of V , then V \ W is not a subspace. This can be easily seen if we recall that
W must contain O. But then V \ W cannot contain O, hence it cannot be a vector space.

Some more examples:

Examples 5.12. (i) The set of all solutions of a homogeneous system of linear equations is a
vector space.
(ii) The set of all solutions of a homogeneous linear differential equation is a vector space.
Proof. We prove only (i) since the proof of (ii) is similar. Assume that the system of equations is
given in matrix form A~x = ~0 for some matrix A ∈ M (m × n). We denote by U the set of all solutions,
that is, U := {~x ∈ Rn : A~x = ~0}. Clearly, ~0 ∈ U . Now let ~y , ~z ∈ U and λ ∈ R. We have to show
that then also ~y + λ~z ∈ U . This is true because

A(~y + λ~z) = A~y + λA~z = ~0 + λ~0 = ~0.


Therefore, the vector ~y + λ~z solves the homogeneous equation, so it belongs to U as we wanted to
show.
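This closure property is easy to see numerically as well. A small sketch (not from the notes, assuming NumPy), using a rank-one matrix whose null space is easy to write down explicitly:

```python
import numpy as np

A = np.array([[1.0, 2, 3],
              [2.0, 4, 6]])          # the second row is a multiple of the first
y = np.array([-2.0, 1, 0])           # A @ y = 0
z = np.array([-3.0, 0, 1])           # A @ z = 0
lam = 4.2

# any linear combination of solutions of A x = 0 is again a solution
print(np.allclose(A @ (y + lam * z), 0))   # True
```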

Note however that the set of all solutions of an inhomogeneous equation is not a vector space. Let us consider a system of linear equations given in matrix form by
$$A\vec x = \vec b$$
where A ∈ M(m × n) and $\vec b \in \mathbb{R}^m$ with $\vec b \neq \vec 0$. We denote its set of solutions by
$$U := \{\vec x \in \mathbb{R}^n : A\vec x = \vec b\}.$$
Clearly, $\vec 0 \notin U$ because $A\vec 0 = \vec 0 \neq \vec b$. This is already enough to conclude that U is not a vector space. But we can also see that U is not closed under sums and multiplication by scalars. To this end, let $\vec y, \vec z \in U$ and λ ∈ R \ {0}. Then
$$A(\vec y + \lambda\vec z) = A\vec y + \lambda A\vec z = \vec b + \lambda\vec b = (1 + \lambda)\vec b \neq \vec b.$$
However, U is "almost" a vector space. Recall that if we write the solutions of $A\vec x = \vec b$ in vector form, they are always of the form
$$\vec x = \vec z_0 + t_1 \vec y_1 + \cdots + t_k \vec y_k$$
where k is the number of free variables and $\vec y_1, \ldots, \vec y_k$ are solutions of the homogeneous system $A\vec x = \vec 0$. See Remark 3.12. Therefore we can write
$$U = \{\vec x \in \mathbb{R}^n : A\vec x = \vec b\} = \{\vec z_0 + t_1 \vec y_1 + \cdots + t_k \vec y_k : t_1, \ldots, t_k \in \mathbb{R}\} = \vec z_0 + \{t_1 \vec y_1 + \cdots + t_k \vec y_k : t_1, \ldots, t_k \in \mathbb{R}\} = \vec z_0 + \{\vec x \in \mathbb{R}^n : A\vec x = \vec 0\}.$$
This shows that U is equal to the vector space $W = \{\vec x \in \mathbb{R}^n : A\vec x = \vec 0\}$ but shifted by the vector $\vec z_0$.
We will call such sets affine subspaces. They are very important in many applications. The formal
definition is as follows.

Definition 5.13. Let V be a vector space and W ⊆ V a subset. Then W is called an affine subspace if there exists a v₀ ∈ V such that the set
$$v_0 + W := \{v_0 + w : w \in W\}$$
is a subspace of V .

Let us see examples of affine subspaces.

Examples 5.14. Let V be a vector space. We assume that V is a real vector space, but everything
works also for a complex vector space (we only have to replace R everywhere by C.)
(i) Every subspace of V is also an affine subspace with z0 = O.
(ii) If we fix z0 and v1 , . . . , vk ∈ V , then the set W := {z0 + α1 v1 + · · · + αk vk : α1 , . . . , αk ∈ R} = z0 + {α1 v1 + · · · + αk vk : α1 , . . . , αk ∈ R} is an affine subspace of V . In general it will not be a
subspace.


Exercise. Show that W := {z0 + α1 v1 + · · · + αk vk : α1 , . . . , αk ∈ R} is an affine subspace of


V . Show that it is a subspace if and only if z0 ∈ span{v1 , . . . , vk }.

(iii) If W is a subspace of V , then V \ W is not an affine subspace.

Some more examples:

Examples 5.15. • The set of all solutions of an inhomogeneous system of linear equations is
an affine vector space if it is not empty.
• The set of all solutions of an inhomogeneous linear differential equation is an affine vector
space if it is not empty.

Now we give several examples and non-examples of subspaces of R2 and R3 . You should try to
generalize them to examples in R4 , R5 , etc. and also try to come up with your own examples.

Examples 5.16 (Examples and non-examples of subspaces of R2 ).
  
• $W = \left\{ \begin{pmatrix} \lambda \\ 0 \end{pmatrix} : \lambda \in \mathbb{R} \right\}$ is a subspace of R². This is actually a subspace of the form (iii) from Example 5.11 with $\vec v = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Note that geometrically W is a line (it is the x-axis).
 
• For fixed v₁, v₂ ∈ R let $\vec v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$ and let W = {λ$\vec v$ : λ ∈ R}. Then W is a subspace of R². Geometrically, W is the trivial subspace {$\vec 0$} if $\vec v = \vec 0$. Otherwise it is the line in R² passing through the origin which is parallel to the vector $\vec v$.

Figure 5.1: The subspace W generated by the vector ~v .

   
• For fixed a₁, a₂, v₁, v₂ ∈ R let $\vec a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$ and $\vec v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$. Let us assume that $\vec v \neq \vec 0$ and set W = {$\vec a$ + λ$\vec v$ : λ ∈ R}. Then W is an affine subspace. Geometrically, W represents a line in R² parallel to $\vec v$ which passes through the point (a₁, a₂). Note that W is a subspace if and only if $\vec a$ and $\vec v$ are parallel.


Figure 5.2: Sketches of W = {$\vec a$ + λ$\vec v$ : λ ∈ R}. In the figure on the left hand side, $\vec a \nparallel \vec v$, so W is an affine subspace of R² but not a subspace. In the figure on the right hand side, $\vec a \parallel \vec v$ and therefore W is a subspace of R².

• U = {~x ∈ R² : ‖~x‖ ≥ 3} is not a subspace of R² since it does not contain ~0.

• V = {~x ∈ R² : ‖~x‖ ≤ 2} is not a subspace of R². For example, take $\vec z = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Then $\vec z \in V$, however $3\vec z \notin V$.

• $W = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} : x \geq 0 \right\}$. Then W is not a vector space. For example, $\vec z = \begin{pmatrix} 2 \\ 0 \end{pmatrix} \in W$, but $(-1)\vec z = \begin{pmatrix} -2 \\ 0 \end{pmatrix} \notin W$.
Note that geometrically W is a right half plane in R2 .


Figure 5.3: The sets V and W in the figures are not subspaces of R2 .

Examples 5.17 (Examples and non-examples of subspaces of R3 ).


   
• For fixed x₀, y₀, z₀ ∈ R let $W = \left\{ \lambda \begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix} : \lambda \in \mathbb{R} \right\}$. Then W is a subspace of R³. Geometrically, W is a line in R³ passing through the origin which is parallel to the vector $\begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix}$.


  
• For fixed a, b, c ∈ R the set $W = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} : ax + by + cz = 0 \right\}$ is a subspace of R³.

Proof. We use Proposition 5.10 to verify that W is a subspace of R³. Clearly, $\vec 0 \in W$ since 0a + 0b + 0c = 0. Now let $\vec w_1 = \begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix}$ and $\vec w_2 = \begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix}$ in W and let λ ∈ R. Then $\vec w_1 + \vec w_2 \in W$ because
$$a(x_1 + x_2) + b(y_1 + y_2) + c(z_1 + z_2) = (ax_1 + by_1 + cz_1) + (ax_2 + by_2 + cz_2) = 0 + 0 = 0.$$
Also $\lambda\vec w_1 \in W$ because
$$a(\lambda x_1) + b(\lambda y_1) + c(\lambda z_1) = \lambda(ax_1 + by_1 + cz_1) = \lambda 0 = 0.$$
Hence W is closed under sums and products with scalars, so it is a subspace of R³.

Remark. Note that W is the set of all solutions of a homogeneous linear system of equations
(one equation with three unknowns). Therefore W is a vector space by Theorem 3.22 where
it is shown that the sum and the product with a scalar of two solutions of a homogeneous
linear system is again a solution.

Remark. If a = b = c = 0, then W = R³. If at least one of the numbers a, b, c ∈ R is different from zero, then W is a plane in R³ which passes through the origin and has normal vector $\vec n = \begin{pmatrix} a \\ b \\ c \end{pmatrix}$.

• For fixed a, b, c, d ∈ R with d ≠ 0 and at least one of the numbers a, b, c different from 0, the set $W = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} : ax + by + cz = d \right\}$ is not a subspace of R³, see Figure 5.4, but it is an affine subspace.
D

   
x1 x2
~ 1 =  y1  and w
Proof. Let us see that W is not a vector space. Let w ~ 2 =  y2  in W . Then
z1 z2
w
~1 + w~2 ∈
/ W because

a(x1 + x2 ) + b(y1 + y2 ) + c(z1 + z2 ) = (ax1 + by1 + cz1 ) + (ax2 + by2 + cz2 ) = d + d = 2d 6= d.

~ ∈ W and λ ∈ R \ {1}, then λw


(Alternatively, we could have shown that if w ~ ∈
/ W ; or we
could have shown that ~0 ∈
/ W .)
We know that W is a plane in R3 which has normal vector ~n = (a, b, c)t but does not pass
through the origin. This shows that W is an affine vector space because it can be written as
W = ~v0 + W0 where W0 is the plane parallel to W which passes through the origin and ~v0 is


Figure 5.4: The green plane passes through the origin and is a subspace of R3 . The red plane does
not pass through the origin and therefore it is an affine subspace of R3 .

an arbitrary vector from the origin to a point on the plane W . (Note that W0 is the plane
described by ax + by + cz = 0.)
Note that we already showed in Corollary 3.23 that W is an affine vector space.
Remark. If a = b = c = 0, then W = ∅.

• W = {~x ∈ R³ : ‖~x‖ ≥ 5} is not a subspace of R³ since it does not contain ~0.


 
• W = {~x ∈ R³ : ‖~x‖ ≤ 9} is not a subspace of R³. For example, take $\vec z = \begin{pmatrix} 5 \\ 0 \\ 0 \end{pmatrix}$. Then $\vec z \in W$, however, for example, $7\vec z \notin W$ (or: $\vec z + \vec z \notin W$).

    
• $W = \left\{ \begin{pmatrix} x \\ x^2 \\ x^3 \end{pmatrix} : x \in \mathbb{R} \right\}$. Then W is not a vector space. For example, $\vec a = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \in W$, but $2\vec a = \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix} \notin W$.

Examples 5.18 (Examples and non-examples of subspaces of M (m × n)). The following sets are examples of subspaces of M (m × n):

• The set of all matrices with a11 = 0.


• The set of all matrices with a11 = 5a12 .


• The set of all matrices such that its first row is equal to its last row.

If m = n, then also the following sets are subspaces of M (n × n):

• The set of all symmetric matrices.


• The set of all antisymmetric matrices.
• The set of all diagonal matrices.
• The set of all upper triangular matrices.
• The set of all lower triangular matrices.

The following sets are not subspaces of M (n × n):


• The set of all invertible matrices.
• The set of all non-invertible matrices.
• The set of all matrices with determinant equal to 1.

Examples 5.19 (Examples and non-examples of subspaces of the set of all functions from R to R). Let V be the set of all functions from R to R. Then V clearly is a real vector space.
The following sets are examples for subspaces of V :

• The set of all continuous functions.


• The set of all differentiable functions.
• The set of all bounded functions.
• The set of all polynomials.
• The set of all polynomials with degree ≤ 5.
• The set of all functions f with f (7) = 0.
• The set of all even functions.
• The set of all odd functions.

The following sets are not subspaces of V :



• The set of all polynomials with degree 3.


• The set of all polynomials with degree ≥ 3.
• The set of all functions f with f (7) = 13.
• The set of all functions f with f (7) ≥ 0.

Prove the claims above.

Definition 5.20. For n ∈ N0 let Pn be the set of all polynomials of degree less than or equal to n.

Remark 5.21. Pn is a vector space.


Proof. Clearly, the zero function belongs to Pn (it is a polynomial of degree 0). For polynomials
p, q ∈ Pn and numbers λ ∈ R we clearly have that p + q and λp are again polynomials of degree
at most n, so they belong to Pn . By Proposition 5.10, Pn is a subspace of the space of all real
functions, hence it is a vector space.

You should have understood


• the concept of a subspace of a given vector space,
• why we only have to check if a given subset of a vector space is non-empty, closed under
sum and closed under multiplication with scalars if we want to see if it is a subspace,
• etc.
You should now be able to
• give examples and non-examples of subspaces of vector spaces,

• check if a given subset of a vector space is a subspace,
• etc.

Exercises.
In Exercises 1 to 13, decide whether the subset W is a subspace of the vector space V .

1. Let A ∈ M(m × n), V = Rⁿ and W = {~x ∈ Rⁿ : A~x = ~0}.

2. Let ~x ∈ Rⁿ, V = M(n × n) and W = {A ∈ V : A~x = ~0}.

3. Let A ∈ M(n × n), V = M(n × n) and W = {B ∈ V : AB = BA}.

4. Let ~w ∈ Rⁿ be a non-zero vector, V = Rⁿ and W = {~x ∈ V : ~x ⊥ ~w}.

5. Let A ∈ M(n × n), V = M(n × n) and W = {B ∈ V : AB = O}.

6. V = R³ and $W = \left\{ \begin{pmatrix} z + 2y \\ y + 3z \end{pmatrix} : z, y \in \mathbb{R} \right\}$.

7. V = M(n × n) and W the set of all matrices with determinant zero.

8. V = M(2 × 2) and $W = \left\{ \begin{pmatrix} a & a+1 \\ 0 & 0 \end{pmatrix} : a, b \in \mathbb{R} \right\}$.

9. V = R³ and W the xy-plane.

10. V = Pₙ and W = {p ∈ Pₙ : p″(0) = 0}.

11. V = C(R) and W = {f ∈ V : f(x²) = 2f(x)}.

12. V = C(R) and W = {f ∈ V : f(x²) = (f(x))²}.

13. V = C(R) and $W = \{f \in V : \int_0^1 f = 0\}$.


14. V = C¹(R) (the functions differentiable on all of R) and W = {f ∈ V : f′(0) = f′(1)}.

15. Let V = Rⁿ and W = {~x ∈ V : ⟨~x, ~e₁⟩ + ⟨~x, ~e₂⟩ + · · · + ⟨~x, ~eₙ⟩ = 0}. Show that W is a subspace of V .

16. Let V = R³:

(a) Let E be a plane through the origin and W = {~x ∈ V : ~x ⊥ E}. Show that W is a subspace of V . Can you say what W is?

(b) Let ~w ∈ R³ and W = {~x ∈ V : proj_{~w} ~x = ~0}. Show that W is a subspace of V . Can you say what W is?

(c) Let ~w ∈ R³ and W = {~x ∈ V : ~x × ~w = ~0}. Show that W is a subspace of V . Can you say what W is?

17. In V = M(2 × 2), let W₁ = {A ∈ V : a₂₂ = 0} and $W_2 = \left\{ \begin{pmatrix} -b & a \\ a & b \end{pmatrix} : a, b \in \mathbb{R} \right\}$.

(a) Show that W₁ and W₂ are subspaces of V .

(b) Determine W₁ ∩ W₂ and show that it is also a subspace of V .

5.3 Linear combinations and linear independence


In this section, we work mostly with real vector spaces for the sake of definiteness. However, all
the statements are also true for complex vector spaces. We only have to replace R by C and the
word real by complex everywhere.
We start with a definition.

Definition 5.22. Let V be a real vector space and let v1 , . . . , vk ∈ V and α1 , . . . , αk ∈ R. Then every vector of the form
$$v = \alpha_1 v_1 + \cdots + \alpha_k v_k \tag{5.1}$$
is called a linear combination of the vectors v1 , . . . , vk ∈ V .

       
Examples 5.23. • Let V = R³ and let $\vec v_1 = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$, $\vec v_2 = \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix}$, $\vec a = \begin{pmatrix} 9 \\ 12 \\ 15 \end{pmatrix}$, $\vec b = \begin{pmatrix} 3 \\ 3 \\ 3 \end{pmatrix}$.
Then $\vec a$ and $\vec b$ are linear combinations of $\vec v_1$ and $\vec v_2$ because $\vec a = \vec v_1 + 2\vec v_2$ and $\vec b = -\vec v_1 + \vec v_2$.

• Let V = M(2 × 2) and let $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $R = \begin{pmatrix} 5 & 7 \\ -7 & 5 \end{pmatrix}$, $S = \begin{pmatrix} 1 & 2 \\ -2 & 3 \end{pmatrix}$.
Then R is a linear combination of A and B because R = 5A + 7B. S is not a linear combination of A and B because clearly every linear combination of A and B is of the form
$$\alpha A + \beta B = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix},$$
so it can never be equal to S since S has two different numbers on its diagonal.
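Both claims are easy to verify by direct computation; a tiny numerical sketch (not part of the notes, assuming NumPy):

```python
import numpy as np

v1, v2 = np.array([1, 2, 3]), np.array([4, 5, 6])
a, b = np.array([9, 12, 15]), np.array([3, 3, 3])
print(np.array_equal(a, v1 + 2 * v2), np.array_equal(b, -v1 + v2))   # True True

A, B = np.eye(2), np.array([[0.0, 1], [-1, 0]])
R = np.array([[5.0, 7], [-7, 5]])
print(np.array_equal(R, 5 * A + 7 * B))                              # True: R = 5A + 7B
```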


Definition and Theorem 5.24. Let V be a real vector space and let v1 , . . . , vk ∈ V . Then the
set of all their possible linear combinations is denoted by
span{v1 , . . . , vk } := {α1 v1 + · · · + αk vk : α1 , . . . , αk ∈ R}.
It is a subspace of V and it is called the linear span of the vectors v1 , . . . , vk . The vectors v1 , . . . , vk
are called generators of the subspace span{v1 , . . . , vk }.

Remark. By definition, the vector space generated by the empty set is the vector space which
consists only of the zero vector, that is, span{} := {O}.

Remark. Other names for “linear span” that are commonly used, are subspace generated by
the v1 , . . . , vk or subspace spanned by the v1 , . . . , vk . Instead of span{v1 , . . . , vk } the notation
gen{v1 , . . . , vk } is used frequently. All these names and notations mean exactly the same thing.

Proof of Theorem 5.24. We have to show that W := span{v1 , . . . , vk } is a subspace of V . To this


end we use Proposition 5.10 again. Clearly, W is not empty since at least O ∈ W (we only need to choose all the αⱼ = 0). Now let u, w ∈ W and λ ∈ R. We have to show that λu + w ∈ W . Since u, w ∈ W , there are real numbers α1 , . . . , αk and β1 , . . . , βk such that u = α1 v1 + · · · + αk vk and w = β1 v1 + · · · + βk vk . Then
$$\lambda u + w = \lambda(\alpha_1 v_1 + \cdots + \alpha_k v_k) + \beta_1 v_1 + \cdots + \beta_k v_k = (\lambda\alpha_1 + \beta_1)v_1 + \cdots + (\lambda\alpha_k + \beta_k)v_k,$$
which belongs to W since it is a linear combination of the vectors v1 , . . . , vk .
RA
Remark. The generators of a given subspace are not unique.
  
  
For example, let $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, $C = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$. Then
$$\operatorname{span}\{A, B\} = \{\alpha A + \beta B : \alpha, \beta \in \mathbb{R}\} = \left\{ \begin{pmatrix} \alpha & -\beta \\ \beta & \alpha \end{pmatrix} : \alpha, \beta \in \mathbb{R} \right\},$$
$$\operatorname{span}\{A, B, C\} = \{\alpha A + \beta B + \gamma C : \alpha, \beta, \gamma \in \mathbb{R}\} = \left\{ \begin{pmatrix} \alpha + \gamma & -(\beta + \gamma) \\ \beta + \gamma & \alpha + \gamma \end{pmatrix} : \alpha, \beta, \gamma \in \mathbb{R} \right\},$$
$$\operatorname{span}\{A, C\} = \{\alpha A + \gamma C : \alpha, \gamma \in \mathbb{R}\} = \left\{ \begin{pmatrix} \alpha + \gamma & -\gamma \\ \gamma & \alpha + \gamma \end{pmatrix} : \alpha, \gamma \in \mathbb{R} \right\}.$$
We see that span{A, B} = span{A, B, C} = span{A, C} (in all cases it consists of exactly those
matrices whose diagonal entries are equal and the off-diagonal entries differ by a minus sign). So
we see that neither the generators nor their number is unique.

Remark. If a vector is a linear combination of other vectors, then the coefficients in the linear
combination are not necessarily unique.

For example, if A, B, C are the matrices above, then A + B + C = 2A + 2B = 2C or A + 2B + 3C =


4A + 5B = B + 4C, etc.


Remark 5.25. Let V be a vector space and let v1 , . . . , vn and w1 , . . . , wm be vectors in V . Then
the following are equivalent:
(i) span{v1 , . . . , vn } = span{w1 , . . . , wm }.
(ii) vj ∈ span{w1 , . . . , wm } for every j = 1, . . . , n and wk ∈ span{v1 , . . . , vn } for every k =
1, . . . , m.

Proof. (i) =⇒ (ii) is clear.


(ii) =⇒ (i): Note that vj ∈ span{w1 , . . . , wm } for every j = 1, . . . , n implies that every vj can be
written as a linear combination of the w1 , . . . , wm . Then also every linear combination of v1 , . . . , vn
is a linear combination of w1 , . . . , wm . This implies that span{v1 , . . . , vn } ⊆ span{w1 , . . . , wm }. The
converse inclusion span{w1 , . . . , wm } ⊆ span{v1 , . . . , vn } can be shown analogously. Both inclusions
together show that we must have equality.

Examples 5.26. (i) Pn = span{1, X, X 2 , . . . , X n−1 , X n } since every vector in Pn is a polyno-


mial of the form p = αn X n + αn−1 X n−1 + · · · + α1 X + α0 , so it is a linear combination of

the polynomials X n , X n−1 , . . . , X, 1.

Exercise. Show that {1, 1 + X, X + X 2 , . . . , X n−1 + X n } is also a set of generators of


Pn .
 
(ii) The set of all antisymmetric 2 × 2 matrices is generated by $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.

(iii) Let V = R³ and let $\vec v, \vec w \in \mathbb{R}^3 \setminus \{\vec 0\}$.
• span{$\vec v$} is a line which passes through the origin and is parallel to $\vec v$.
• If $\vec v \nparallel \vec w$, then span{$\vec v, \vec w$} is a plane which passes through the origin and is parallel to $\vec v$ and $\vec w$. If $\vec v \parallel \vec w$, then it is a line which passes through the origin and is parallel to $\vec v$.

Example 5.27. Let p1 = X 2 − X + 1, p2 = X 2 − 2X + 5 ∈ P2 , and let U = span{p1 , p2 }. Check


if q = 2X 2 − X − 2 and r = X 2 + X − 3 belong to U .
Solution. • Let us check if q ∈ U . To this end we have to check if we can find α, β such that
q = αp1 + βp2 . Inserting the expressions for p1 , p2 , q we obtain

2X 2 − X − 2 = α(X 2 − X + 1) + β(X 2 − 2X + 5) = X 2 (α + β) + X(−α − 2β) + α + 5β.

Comparing coefficients of the different powers of X, we obtain the system of equations


α+ β=2
−α − 2β = −1
α + 5β = −2.
We use the Gauß-Jordan process to solve the system:
     
$$A = \begin{pmatrix} 1 & 1 & 2 \\ -1 & -2 & -1 \\ 1 & 5 & -2 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 1 & 2 \\ 0 & -1 & 1 \\ 0 & 4 & -4 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}$$


It follows that α = 3 and β = −1 is a solution, and therefore q = 3p1 − p2 which shows that
q ∈ U.
• Let us check if r ∈ U . To this end we have to check if we can find α, β such that r = αp1 +βp2 .
Inserting the expressions for p1 , p2 , q we obtain

X 2 + X − 3 = α(X 2 − X + 1) + β(X 2 − 2X + 5) = X 2 (α + β) + X(−α − 2β) + α + 5β.

Comparing coefficients of the different powers of X, we obtain the system of equations

α+ β=1
−α − 2β = 1
α + 5β = −3.

We use the Gauß-Jordan process to solve the system:


     
$$A = \begin{pmatrix} 1 & 1 & 1 \\ -1 & -2 & 1 \\ 1 & 5 & -3 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 1 & 1 \\ 0 & -1 & 2 \\ 0 & 4 & -4 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & 4 \end{pmatrix}.$$

We see that the system is inconsistent. Therefore r is not a linear combination of p1 and p2 , hence r ∉ U. □
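Both membership tests in Example 5.27 amount to checking whether a linear system is consistent. A sketch with NumPy (not part of the notes): the matrix M below has the coefficient vectors of p1 and p2 as columns, and a polynomial lies in U exactly when its coefficient vector can be reproduced from a least-squares solution with zero residual. The helper name `in_span` is an illustrative choice.

```python
import numpy as np

# columns: coefficients (of X^2, X, 1) of p1 = X^2 - X + 1 and p2 = X^2 - 2X + 5
M = np.array([[ 1.0,  1.0],
              [-1.0, -2.0],
              [ 1.0,  5.0]])

def in_span(coeffs):
    """Check whether the polynomial with these coefficients lies in span{p1, p2}."""
    sol, *_ = np.linalg.lstsq(M, coeffs, rcond=None)
    return np.allclose(M @ sol, coeffs), sol

print(in_span(np.array([2.0, -1.0, -2.0])))   # (True, [3., -1.])  ->  q = 3 p1 - p2
print(in_span(np.array([1.0,  1.0, -3.0])))   # (False, ...)       ->  r is not in U
```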

Definition 5.28. A vector space V is called finitely generated if it has a finite set of generators.

Examples 5.29. The following vector spaces are finitely generated.


RA
• The trivial vector space {O} is finitely generated.
• Rn because clearly Rn = gen{~e1 , . . . , ~en } where ~ej is the jth unit vector.
• M (m × n) because it is generated by the set of all possible matrices which are 0 everywhere
except a 1 in exactly one entry.
• Pn is finitely generated as was shown in Example 5.26.
• Let P be the vector space of all real polynomials. Then P is not finitely generated.

Proof. Assume that P is finitely generated and let q1 , . . . , qk be a system of generators of P .


Note that the qj are polynomials. We will denote their degrees by mj = deg qj and we set
M = max{m1 , . . . , mk }. Then any linear combination of them will be a polynomial of degree at most M, no matter how we choose the coefficients. However, there are elements in P which have higher degree, for example X^{M+1}. Therefore q1 , . . . , qk cannot generate all of P .

Another proof using the concept of dimension will be given in Example 5.56 (f).

Later, in Lemma 5.53, we will see that every subspace of a finitely generated vector space is again
finitely generated.

Now we ask ourselves what is the least number of vectors we need in order to generate Rn . We
know that for example Rn = span{~e1 , . . . ,~en }. So in this case we have n vectors that generate

Last Change: Sun Apr 14 01:37:23 PM -05 2024


Linear Algebra, M. Winklmeier
Chapter 5. Vector spaces 185

Rn . Could it be that fewer vectors are sufficient? Clearly, if we take away one of the ~ej , then the
remaining system no longer generates Rn since “one coordinate is missing”. However, could we
maybe find other vectors so that n − 1 or less vectors are enough to generate all of Rn ? The next
proposition says that this is not possible.

Proposition 5.30. Let ~v1 , . . . , ~vk be vectors in Rn . If span{~v1 , . . . , ~vk } = Rn , then k ≥ n.

Proof. Let A = (~v1 | . . . |~vk ) be the matrix whose columns are the given vectors. We know that
there exists an invertible matrix E such that A0 = EA is in reduced echelon form (the matrix E
is the product of elementary matrices which correspond to the steps in the Gauß-Jordan process
to arrive at the reduced echelon form). Now, if k < n, then we know that A0 must have at least
one row which consists of zeros only. If we can find a vector w ~ such that it is transformed to ~en
under the Gauß-Jordan process, then we would have that A~x = w ~ is inconsistent, which means
that w~ ∈
/ span{~v1 , . . . , ~vk }. How do we find such a vector w?
~ Well, we only have to start with ~en
and “do the Gauß-Jordan process backwards”. In other words, we can take w ~ = E −1~en . Now if we
apply the Gauß-Jordan process to the augmented matrix (A|w), ~ we arrive at (EA|E w) ~ = (A0 |~en )

FT
which we already know is inconsistent.
Therefore, k < n is not possible and therefore we must have that k ≥ n.

Note that the proof above is basically the same as the one in Remark 3.37. Observe that the system
of vectors ~v1 , . . . , ~vk ∈ Rn is a set of generators for Rn if and only if the equation A~y = ~b has a
solution for every ~b ∈ Rn (as above, A is the matrix whose columns are the vectors ~v1 , . . . , ~vk ).
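In practice one checks whether given vectors generate Rⁿ by row reducing the matrix $(\vec v_1|\cdots|\vec v_k)$ and counting the pivots; the vectors span Rⁿ exactly when there are n pivots. A short sketch (not from the notes) using NumPy's `matrix_rank`, which counts exactly those pivots; the helper name `spans_Rn` is an illustrative choice.

```python
import numpy as np

def spans_Rn(vectors, n):
    """True iff the given vectors generate R^n (the column matrix has n pivots)."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == n

e1, e2, e3 = np.eye(3)
print(spans_Rn([e1, e2], 3))                                  # False: two vectors never span R^3
print(spans_Rn([e1, e2, e3, np.array([1.0, 2, 3])], 3))       # True
```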

Now we will answer the question when the coefficients of a linear combination are unique. The
following remark shows us that we have to answer this question only for the zero vector.
RA
Remark 5.31. Let V be a vector space, let v1 , . . . , vk ∈ V and let w ∈ span{v1 , . . . , vk }. Then
there are unique β1 , . . . , βk ∈ R such that

β 1 v1 + · · · + β k vk = w (5.2)

if and only if there are unique α1 , . . . , αk ∈ R such that

α1 v1 + · · · + αk vk = O. (5.3)

Proof. First note that (5.3) always has at least one solution, namely α1 = · · · = αk = 0. This
solution is called the trivial solution.
Let us assume that (5.2) has two different solutions, so that there are γ1 , . . . , γk ∈ R such that for
at least one j = 1, . . . , k we have that βj 6= γj and

γ1 v1 + · · · + γk vk = w. (5.2’)

Subtracting (5.2) and (5.2’) gives

(β1 − γ1 )v1 + · · · + (βk − γk )vk = w − w = O

where at least one coefficient is different from zero. Therefore also (5.3) has more than one solution.


On the other hand, let us assume that (5.3) has a non-trivial solution, that is, at least one of the
αj in (5.3) is different from zero. But then, if we sum (5.2) and (5.3), we obtain another solution
for (5.2) because
(α1 + β1 )v1 + · · · + (αk + βk )vk = O + w = w.
The proof shows that there are as many solutions of (5.2) as there are of (5.3).
It should also be noted that if (5.3) has one non-trivial solution, then it has automatically infinitely
many solutions, because if α1 , . . . , αk is a solution, then also cα1 , . . . , cαk is a solution for arbitrary
c ∈ R since
cα1 v1 + · · · + cαk vk = c(α1 v1 + · · · + αk vk ) = c O = O.

In fact, the discussion above should remind you of the relation between solutions of an inhomo-
geneous system and the solutions of its associated homogeneous system in Theorem 3.22. Note
that just as in the case of linear systems, (5.2) could have no solution. This happens if and only
if w ∉ span{v1 , . . . , vk }.
If V = Rn then the remark above is exactly Theorem 3.22.

So we see that only one of the following two cases can occur: (5.4) has exactly one solution (namely
the trivial one) or it has infinitely many solutions. Note that this is analogous to the situation of
the solutions of homogeneous linear systems: They have either only the trivial solution or they have
infinitely many solutions. The following definition distinguishes between the two cases.

Definition 5.32. Let V be a vector space. The vectors v1 , . . . , vk in V are called linearly independent if
$$\alpha_1 v_1 + \cdots + \alpha_k v_k = O \tag{5.4}$$
has only the trivial solution. They are called linearly dependent if (5.4) has more than one solution.

Remark 5.33. The empty set is linearly independent since O cannot be written as a nontrivial
linear combination of vectors from the empty set.

Before we continue with the theory, we give a few examples.


   
Examples. (i) The vectors $\vec v_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ and $\vec v_2 = \begin{pmatrix} -4 \\ -8 \end{pmatrix} \in \mathbb{R}^2$ are linearly dependent because $4\vec v_1 + \vec v_2 = \vec 0$.
   
(ii) The vectors $\vec v_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ and $\vec v_2 = \begin{pmatrix} 3 \\ 0 \end{pmatrix} \in \mathbb{R}^2$ are linearly independent.

Proof. Consider the equation $\alpha\vec v_1 + \beta\vec v_2 = \vec 0$. This equation is equivalent to the following system of linear equations for α and β:
α + 3β = 0
2α + 0β = 0.
We can use the Gauß-Jordan process to obtain all solutions. However, in this case we easily see that α = 0 (from the second line) and then that β = −⅓α = 0. Note that we could


 
also have calculated $\det\begin{pmatrix} 1 & 3 \\ 2 & 0 \end{pmatrix} = -6 \neq 0$ to conclude that the homogeneous system above has only the trivial solution. Observe that the columns of the matrix are exactly the given vectors.
   
(iii) The vectors $\vec v_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ and $\vec v_2 = \begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix} \in \mathbb{R}^3$ are linearly independent.

Proof. Consider the equation αv~1 + β v~2 = ~0. This equation is equivalent to the following
system of linear equations for α and β:

α + 2β = 0
α + 3β = 0
α + 4β = 0.

FT
If we subtract the first from the second equation, we obtain β = 0 and then α = −2β = 0. So
again, this system has only the trivial solution and therefore the vectors v~1 and v~2 are linearly
independent.
       
(iv) Let $\vec v_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $\vec v_2 = \begin{pmatrix} -1 \\ 2 \\ 3 \end{pmatrix}$, $\vec v_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$ and $\vec v_4 = \begin{pmatrix} 0 \\ 6 \\ 8 \end{pmatrix} \in \mathbb{R}^3$. Then
RA
(a) The system {~v1 , ~v2 , ~v3 } is linearly independent.
(b) The system {~v1 , ~v2 , ~v4 } is linearly dependent.

Proof. (a) Consider the equation αv~1 + β v~2 + γ v~3 = ~0. This equation is equivalent to the
following system of linear equations for α, β and γ:

α − 1β + 0γ = 0
α + 2β + 0γ = 0
α + 3β + 1γ = 0.

We use the Gauß-Jordan process to solve the system. Note that the columns of the
matrix associated to the above system are exactly the given vectors ~v1 , ~v2 , ~v3 .
       
$$A = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 2 & 0 \\ 1 & 3 & 1 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -1 & 0 \\ 0 & 3 & 0 \\ 0 & 4 & 1 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 4 & 1 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Therefore the unique solution is α = β = γ = 0 and consequently the vectors ~v1 , ~v2 , ~v3
are linearly independent.
Observe that we also could have calculated det A = 3 6= 0 to conclude that the homoge-
neous system has only the trivial solution.


(b) Consider the equation αv~1 + β v~2 + δ v~4 = ~0. This equation is equivalent to the following
system of linear equations for α, β and δ:
α − 1β + 0δ = 0
α + 2β + 6δ = 0
α + 3β + 8δ = 0.
We use the Gauß-Jordan process to solve the system. Note that the columns of the
matrix associated to the above system are exactly the given vectors.
       
$$A = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 2 & 6 \\ 1 & 3 & 8 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -1 & 0 \\ 0 & 3 & 6 \\ 0 & 4 & 8 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 2 \\ 0 & 1 & 2 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix}.$$
So there are infinitely many solutions. If we take δ = t, then α = β = −2t. Consequently the vectors $\vec v_1, \vec v_2, \vec v_4$ are linearly dependent, because, for example, $-2\vec v_1 - 2\vec v_2 + \vec v_4 = \vec 0$ (taking t = 1).
Observe that we also could have calculated det A = 0 to conclude that the system has infinitely many solutions.
Observe that we also could have calculated det A = 0 to conclude that the system has
infinite solutions.
   
(v) The matrices $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ are linearly independent in M (2 × 2).
RA
     
1 1 1 0 0 1
(vi) The matrices A = , B= and C = are linearly dependent in M (2×2)
0 1 0 1 0 0
because A − B − C = 0.

After these examples we will proceed with some facts on linear independence. We start with the
special case when we have only two vectors.

Proposition 5.34. Let v1 , v2 be vectors in a vector space V . Then v1 , v2 are linearly dependent if
and only if one vector is a multiple of the other.

Proof. Assume that v1 , v2 are linearly dependent. Then there exist α1 , α2 ∈ R such that α1 v1 +
α2 v2 = 0 and at least one of the α1 and α2 is different from zero, say α1 6= 0. Then we have
$$v_1 + \frac{\alpha_2}{\alpha_1} v_2 = 0, \quad \text{hence} \quad v_1 = -\frac{\alpha_2}{\alpha_1} v_2.$$

Now assume on the other hand that, e.g., v1 is a multiple of v2 , that is v1 = λv2 for some λ ∈ R.
Then v1 − λv2 = 0 which is a nontrivial solution of α1 v1 + α2 v2 = 0 because we can take α1 = 1 6= 0
and α2 = −λ (note that λ may be zero).

The proposition above cannot be extended to the case of three or more vectors. For instance, the vectors $\vec a = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $\vec b = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $\vec c = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ are linearly dependent because $\vec a + \vec b - \vec c = \vec 0$, but none of them is a multiple of any of the other two vectors.


Proposition 5.35. Let V be a vector space.

(i) Every system of vectors which contains O is linearly dependent.


(ii) Let v1 , . . . , vk ∈ V and assume that there are α1 , . . . , αk ∈ R such that α1 v1 + · · · + αk vk = O.
If α` 6= 0, then v` is a linear combination of the other vj .
(iii) If the vectors v1 , . . . , vk ∈ V are linearly dependent, then for every w ∈ V , the vectors
v1 , . . . , vk , w are linearly dependent.
(iv) If v1 , . . . , vk are vectors in V and w is a linear combination of them, then v1 , . . . , vk , w are
linearly dependent.
(v) If the vectors v1 , . . . , vk ∈ V are linearly independent, then every subset of them is linearly
independent.

Proof. (i) Let v1 , . . . , vk ∈ V . Clearly 1·O + 0v1 + · · · + 0vk = O is a non-trivial linear combination
which gives O. Therefore the system {O, v1 , . . . , vk } is linearly dependent.

(ii) If α` ≠ 0, then we can solve for v` : v` = −(α1 /α` )v1 − · · · − (α`−1 /α` )v`−1 − (α`+1 /α` )v`+1 − · · · − (αk /α` )vk .

(iii) If the vectors v1 , . . . , vk ∈ V are linearly dependent, then there exist α1 , . . . , αk ∈ R such
that at least one of them is different from zero and α1 v1 + · · · + αk vk = O. But then also
α1 v1 + · · · + αk vk + 0w = O which shows that the system {v1 , . . . , vk , w} is linearly dependent.

(iv) Assume that w is a linear combination of v1 , . . . , vk . Then there exist α1 , . . . , αk ∈ R such


that w = α1 v1 + · · · + αk vk . Therefore we obtain w − α1 v1 − · · · − αk vk = O which is a
non-trivial linear combination since the coefficient of w is 1.

(v) Suppose that a subsystem of v1 , . . . , vk ∈ V is linearly dependent. Then, by (iii), every
system which contains it must be linearly dependent too. In particular, the original
system of vectors must be linearly dependent which contradicts our assumption. Note that
also the empty set is linearly independent by Remark 5.33.

Now we specialise to the case when V = Rn . Let us take vectors ~v1 , . . . , ~vk ∈ Rn and let us write
(~v1 | · · · |~vk ) for the n × k matrix whose columns are the vectors ~v1 , . . . , ~vk .

Lemma 5.36. With the above notation, the following statements are equivalent:

(i) ~v1 , . . . , ~vk are linearly dependent.

(ii) There exist α1 , . . . , αk not all equal to zero, such that α1~v1 + · · · + αk~vk = 0.
   
(iii) There exists a vector (α1 , . . . , αk )t ≠ ~0 such that (~v1 | · · · |~vk )(α1 , . . . , αk )t = ~0.

(iv) The homogeneous system corresponding to the matrix (~v1 | · · · |~vk ) has at least one non-trivial
(and therefore infinitely many) solutions.


Proof. (i) =⇒ (ii) is simply the definition of linear dependence. (ii) =⇒ (iii) is only rewriting
the vector equation in matrix form. (iv) only says in words what the equation in (iii) means. And
finally (iv) =⇒ (i) holds because every non-trivial solution of the homogeneous system associated
to (~v1 | · · · |~vk ) gives a non-trivial solution of α1~v1 + · · · + αk~vk = ~0.

Since we know that a homogeneous linear system with more unknowns than equations has infinitely
many solutions, we immediately obtain the following corollary.

Corollary 5.37. Let ~v1 , . . . , ~vk ∈ Rn .

(i) If k > n, then the vectors ~v1 , . . . , ~vk are linearly dependent.
(ii) If the vectors ~v1 , . . . , ~vk are linearly independent, then k ≤ n.

Observe that (ii) does not say that if k ≤ n, then the vectors ~v1 , . . . , ~vk are linearly independent.
It only says that they have a chance to be linearly independent, whereas a system with more than
n vectors is always linearly dependent.
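For instance, the vectors (1, 2, 3)t and (2, 4, 6)t in R3 satisfy k = 2 ≤ 3 = n, yet they are linearly dependent since the second is twice the first.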

Now we specialise further to the case when k = n.

Theorem 5.38. Let ~v1 , . . . , ~vn be vectors in Rn . Then the following are equivalent:

(i) ~v1 , . . . , ~vn are linearly independent.


   
(ii) The only solution of (~v1 | · · · |~vn )(α1 , . . . , αn )t = ~0 is the zero vector (α1 , . . . , αn )t = ~0.
(iii) The matrix (~v1 | · · · |~vn ) is invertible.
(iv) det(~v1 | · · · |~vn ) ≠ 0.

Proof. The equivalence of (i) and (ii) follows from Lemma 5.36. The equivalence of (ii), (iii) and
(iv) follows from Theorem 4.11.

Formulate an analogous theorem for linearly dependent vectors.

Now we can state when a system of n vectors in Rn generates Rn .

Theorem 5.39. Let ~v1 , . . . , ~vn be vectors in Rn and let A = (~v1 | · · · |~vn ) be the matrix whose
columns are the given vectors ~v1 , · · · , ~vn . Then the following are equivalent:

(i) ~v1 , . . . , ~vn are linearly independent.

(ii) Rn = span{~v1 , . . . , ~vn }.

(iii) det A ≠ 0.

Proof. (i) ⇐⇒ (iii) is shown in Theorem 5.38.


(ii) ⇐⇒ (iii): The vectors ~v1 , . . . , ~vn generate Rn if and only if for every w~ ∈ Rn there exist numbers
β1 , . . . , βn such that β1~v1 + · · · + βn~vn = w~. In matrix form this means that A(β1 , . . . , βn )t = w~. By
Theorem 3.44 we know that this has a solution for every vector w~ if and only if A is invertible
(because if we apply Gauß-Jordan to A, we must get to the identity matrix).

The proof of the preceding theorem basically goes like this: We consider the equation Aβ~ = w~.
When are the vectors ~v1 , . . . , ~vn linearly independent? – They are linearly independent if and only
if for w~ = ~0 the system has only the trivial solution. This happens if and only if the reduced echelon
form of A is the identity matrix. And this happens if and only if det A ≠ 0.
When do the vectors ~v1 , . . . , ~vn generate Rn ? – They do if and only if for every given vector w~ ∈ Rn
the system has at least one solution. This happens if and only if the reduced echelon form of A is
the identity matrix. And this happens if and only if det A ≠ 0.
Since a square matrix A is invertible if and only if its transpose At is invertible, Theorem 5.39 leads
immediately to the following corollary.

Corollary 5.40. For a matrix A ∈ M (n × n) the following are equivalent:
(i) A is invertible.
(ii) The columns of A are linearly independent.
(iii) The rows of A are linearly independent.
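For instance, A = (1 2; 2 4) has det A = 0; accordingly its columns (1, 2)t and (2, 4)t are linearly dependent (the second is twice the first), its rows are linearly dependent as well, and A is not invertible. In contrast, for A = (1 1; 0 1) we have det A = 1 ≠ 0, so A is invertible and its columns are linearly independent and span R2 .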

We end this section with more examples.


Examples. • Recall that Pn is the vector space of all polynomials of degree ≤ n.
In P3 , we consider the vectors p1 = X 3 − 1, p2 = X 2 − 1, p3 = X − 1. These vectors are
linearly independent.

Proof. Let α1 , α2 , α3 ∈ R such that α1 p1 + α2 p2 + α3 p3 = 0. This means that


0 = α1 (X 3 − 1) + α2 (X 2 − 1) + α3 (X − 1)
= α1 X 3 + α2 X 2 + α3 X − (α1 + α2 + α3 ).

Comparing coefficients, it follows that α1 = 0, α2 = 0, α3 = 0 and α1 + α2 + α3 = 0 which


shows that p1 , p2 and p3 are linearly independent.

If in addition we take p4 = X 3 − X 2 , then the system p1 , p2 , p3 and p4 is linearly dependent.

Proof. As before, let α1 , α2 , α3 , α4 ∈ R such that α1 p1 + α2 p2 + α3 p3 + α4 p4 = 0. This means


that
0 = α1 (X 3 − 1) + α2 (X 2 − 1) + α3 (X − 1) + α4 (X 3 − X 2 )
= (α1 + α4 )X 3 + (α2 − α4 )X 2 + α3 X − (α1 + α2 + α3 ).
Comparing coefficients, this is equivalent to α1 +α4 = 0, α2 −α4 = 0, α3 = 0 and α1 +α2 +α3 =
0. This system of equations has infinitely many solutions. They are given by α2 = α4 = −α1 ∈
R, α3 = 0 (verify this!). Therefore p1 , p2 , p3 and p4 are linearly dependent.


Exercise. Show that p1 , p2 , p3 and p5 are linearly independent if p5 = X 3 + X 2 .

• In P2 , we consider the vectors p1 = X 2 + 2X − 1, p2 = 5X + 2, p3 = 2X 2 − 11X − 8. These


vectors are linearly dependent.

Proof. Let α1 , α2 , α3 ∈ R such that α1 p1 + α2 p2 + α3 p3 = 0. This means that

0 = α1 (X 2 + 2X − 1) + α2 (5X + 2) + α3 (2X 2 − 11X − 8)
= (α1 + 2α3 )X 2 + (2α1 + 5α2 − 11α3 )X + (−α1 + 2α2 − 8α3 ).

Comparing coefficients, it follows that α1 +2α3 = 0, 2α1 +5α2 −11α3 = 0, −α1 +2α2 −8α3 = 0.
We write this in matrix form and apply Gauß-Jordan:
       
(1 0 2; 2 5 −11; −1 2 −8) −→ (1 0 2; 0 5 −15; 0 2 −6) −→ (1 0 2; 0 1 −3; 0 1 −3) −→ (1 0 2; 0 1 −3; 0 0 0).

This shows that the system has non-trivial solutions (find them!) and therefore p1 , p2 and p3
are linearly dependent.
     
• In V = M (2 × 2) consider A = (1 2; 2 1), B = (1 0; 0 1), C = (0 5; 5 0). Then A, B, C are
linearly dependent because A − B − (2/5)C = 0.
     
• In V = M (2 × 3) consider A = (1 2 3; 4 5 6), B = (2 2 2; 1 1 1), C = (1 2 2; 2 1 1). Then A, B, C
are linearly independent.

Exercise. Prove this!

• Find a set of generators for the vector space


  
V = {(x, y, z)t ∈ R3 : x + 2y = 0}.
 

Solution. Clearly, V is a subspace of R3 (it is a plane). Let ~x = (x, y, z)t ∈ V . By definition
of V , we have that x + 2y = 0. Solving this equation for x, we obtain x = −2y. So
~x = (x, y, z)t = (−2y, y, z)t = y (−2, 1, 0)t + z (0, 0, 1)t .
   
 −2 0 
Therefore  1 , 0 is a set of generators for V . 
0 1
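As a quick check, both generators indeed lie in V : for (−2, 1, 0)t we have x + 2y = −2 + 2 = 0, and for (0, 0, 1)t we have x + 2y = 0 + 0 = 0.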
 


You should have understood


• what a linear combination is,
• the concept of linear independence,
• the concept of linear span and that it consists either of only the zero vector or of infinitely
many vectors,
• geometrically the concept of linear independence in R2 and R3 ,
• that the coefficients in a linear combination are not necessarily unique,
• what the number of solutions of A~x = ~0 says about the linear independence of the columns
of A seen as vectors in Rn ,
• what the existence (or non-existence) of solutions of A~x = ~b for all ~b ∈ Rm says about the
span of the columns of A seen as vectors in Rm ,
• why a matrix A ∈ M (n × n) is invertible if and only if its columns are linearly independent,
• etc.

You should now be able to

• verify if a given vector is a linear combination of a given set of vectors,
• verify if a given vector lies in the linear span of a given set of vectors,
• verify if a given set of vectors is a generator of a given vector space,
• find a set of generators for a given vector space,
• verify if a given set of vectors is linearly independent,
• etc.

Exercises.
In Exercises 1 to 10, find a generating set for the given space. Before doing so, make sure that the given sets really are vector spaces.
1. {~x ∈ R3 : x + y + z = 0}.
2. {~x ∈ R3 : x + y + z = 0, x − y − z = 0}.
3. Antisymmetric matrices of size 3 × 3.
4. Polynomials of degree ≤ 3 such that p′′(0) = 0.
5. Polynomials of degree ≤ 3 such that ∫01 x p(x) dx = 0.
6. {p ∈ P3 : p(0) = p(1), p′(0) = p′(1)}.
7. The line in R2 given by 2x + y = 0.
8. The plane 3x + 2y − z = 0 in R3 .
9. The line in R3 given by x/2 = y/3 = 3z.


10. {p ∈ P3 : p(x) = ax3 + cx2 + a − 2c}.
11. Show that (1, 5, 3, 4)t ∈ span{(1, 1, 1, 0)t , (0, 1, 1, 1)t , (1, 1, 0, 0)t }.
12. Show that the set {x3 + 1, x3 − 5, x2 , x2 − 1} generates all polynomials of degree ≤ 3 whose first derivative evaluated at zero equals zero.
13. Show that (2, 1, 3)t does not belong to the span of (1, 1, 1)t , (0, 1, 1)t .
14. In the following exercises, say whether the given vectors are linearly independent or dependent.
(a) In R3 : (2, −1, 4)t , (−4, 2, −8)t .
(b) In R3 : (0, 1, 1)t , (1, 0, 1)t , (1, 1, 0)t .
(c) In M (2 × 2): (2 −1; 4 0), (0 −3; 1 5), (4 1; 7 −5).
(d) In P3 : x3 + 1, x3 − 5, x2 and x2 − 1.
(e) In P2 : 1 − x, 1 + x, x2 .
(f) In P3 : 2x, x3 − 3, 1 + x − 4x3 , x3 + 18x − 9.
(g) In C(R): cos 2t, sin2 t, cos2 t.
15. For which values of c are the vectors (1 − c, −c)t and (c, 1 + c)t linearly independent?
16. Determine whether the set {(1, 2, 1)t , (0, 1, 1)t , (2, 3, 1)t } is linearly independent and describe its span.
17. For which values of α are the vectors (1, 2, 3)t , (2, −1, 4)t , (3, α, 4)t linearly independent?
18. Determine conditions on a, b, c such that the vectors (1, a, a2 )t , (1, b, b2 )t , (1, c, c2 )t are linearly independent (see Section 4.1, Exercise 2).
19. For which values of α are the vectors (2, −5, 3)t , (−4, 10, −6)t , (1, α, 0)t linearly dependent?
20. True or false:


(a) Five vectors in R4 can be linearly independent.
(b) Two vectors in R3 can generate all of R3 .
(c) A vector space can have infinitely many generating sets.
(d) Let V be a vector space and W = span{v1 , . . . , vn }. If w ∈ W , then W = span{v1 , . . . , vn , w}.
(e) In Rn , if ~x1 , . . . , ~xk are linearly independent and A is an invertible matrix, then the vectors
A~x1 , . . . , A~xk are linearly independent.

5.4 Basis and dimension


In this section, we work mostly with real vector spaces for the sake of definiteness. However, all
the statements are also true for complex vector spaces. We only have to replace R by C and the
word real by complex everywhere.

Definition 5.41. Let V be a vector space. A basis of V is a set of vectors {v1 , . . . , vn } in V which

is linearly independent and generates V .

The following remark shows that a basis is a minimal system of generators of V and at the same
time a maximal system of linear independent vectors.

Remark. Let {v1 , . . . , vn } be a basis of V .


(i) Let w ∈ V . Then {v1 , . . . , vn , w} is not a basis of V because this system of vectors is no
longer linearly independent by Proposition 5.35 (iv).
(ii) If we take away one of the vectors from {v1 , . . . , vn }, then it is no longer a basis of V be-
cause the new system of vectors no longer generates V . For example, if we take away v1 ,
then v1 ∉ span{v2 , . . . , vn } (otherwise v1 , . . . , vn would be linearly dependent), and therefore
span{v2 , . . . , vn } ≠ V .

Remark 5.42. By definition, the empty set is a basis of the trivial vector space {O}.

Remark 5.43. Every basis of Rn has exactly n elements. To see this note that by Corollary 5.37,

a basis can have at most n elements because otherwise it cannot be linearly independent. On the
other hand, if it had less than n elements, then, by Remark 5.30, it cannot generate Rn .
     
 1 0 0 
Examples 5.44. • A basis of R3 is, for example, 0 , 1 , 0 . The vectors of this
0 0 1
 
basis are the standard unit vectors. The basis is called the standard basis (or canonical basis)
of R3 .
Other examples of bases of R3 are
{(1, 0, 0)t , (1, 1, 0)t , (1, 1, 1)t } and {(1, 2, 3)t , (4, 5, 6)t , (0, 2, 1)t }.


Exercise. Verify that the systems above are bases of R3 .

The following systems are not bases of R3 :
{(1, 2, 3)t , (4, 5, 9)t , (3, 2, 5)t }, {(1, 2, 3)t , (4, 5, 6)t }, {(1, 0, 0)t , (1, 1, 0)t , (1, 1, 1)t , (0, 0, 1)t }.
     

Exercise. Verify that the systems above are not bases of R3 .

• The standard basis in Rn (or canonical basis in Rn ) is {~e1 , . . . ,~en }. Recall that the ~ej are the
standard unit vectors whose jth entry is 1 and all other entries are 0.

Exercise. Verify that they form a basis of Rn .

• The standard basis in Pn (or canonical basis in Pn ) is {1, X, X 2 , . . . , X n }.

Exercise. Verify that they form a basis of Pn .

• Let p1 = X, p2 = 2X 2 + 5X − 1, p3 = 3X 2 + X + 2. Then the system {p1 , p2 , p3 } is a basis
of P2 .

Proof. We have to show that the system is linearly independent and that it generates the
space P2 . Let q = aX 2 + bX + c ∈ P2 . We want to see if there are α1 , α2 , α3 ∈ R such that
q = α1 p1 + α2 p2 + α3 p3 . If we write this equation out, we find
aX 2 + bX + c = α1 X + α2 (2X 2 + 5X − 1) + α3 (3X 2 + X + 2)
= (2α2 + 3α3 )X 2 + (α1 + 5α2 + α3 )X − α2 + 2α3 .

Comparing coefficients, we obtain the following system of linear equations for the αj :
     
2α2 + 3α3 = a, α1 + 5α2 + α3 = b, −α2 + 2α3 = c;
in matrix form: (0 2 3; 1 5 1; 0 −1 2)(α1 , α2 , α3 )t = (a, b, c)t .

D

Now we apply Gauß-Jordan to the augmented matrix:


     
(0 2 3 | a; 1 5 1 | b; 0 −1 2 | c) −→ (1 5 1 | b; 0 −1 2 | c; 0 2 3 | a) −→ (1 0 11 | b + 5c; 0 1 −2 | −c; 0 0 7 | a + 2c).

So we see that there is exactly one solution for any given q. The existence of such a solution
shows that {p1 , p2 , p3 } generates P2 . We also see that for any given q ∈ P2 there is exactly one
way to write it as a linear combination of p1 , p2 , p3 . If we take the special case q = 0, this
shows that the system is linearly independent. In summary, {p1 , p2 , p3 } is a basis of P2 .

• Let p1 = X + 1, p2 = X 2 + X, p3 = X 3 + X 2 , p4 = X 3 + X 2 + X + 1. Then the system


{p1 , p2 , p3 , p4 } is not a basis of P3 (they are linearly dependent since p4 = p1 + p3 ).


Exercise. Show this!


• In the spaces M (m × n), the set of all matrices Aij forms a basis, where Aij is the matrix with
aij = 1 and all other entries equal to 0. For example, in M (2 × 3) we have the following basis:
(1 0 0; 0 0 0), (0 1 0; 0 0 0), (0 0 1; 0 0 0), (0 0 0; 1 0 0), (0 0 0; 0 1 0), (0 0 0; 0 0 1).
       
• Let A = (1 0; 0 0), B = (1 0; 1 0), C = (1 0; 1 1), D = (1 1; 1 1). Then {A, B, C, D} is a basis
of M (2 × 2).
 
Proof. Let M = (a b; c d) be an arbitrary 2 × 2 matrix. Consider the equation M = α1 A +
α2 B + α3 C + α4 D. This leads to
(a b; c d) = α1 (1 0; 0 0) + α2 (1 0; 1 0) + α3 (1 0; 1 1) + α4 (1 1; 1 1)
= (α1 + α2 + α3 + α4 , α4 ; α2 + α3 + α4 , α3 + α4 ).

So we obtain the following set of equations for the αj :
α1 + α2 + α3 + α4 = a, α4 = b, α2 + α3 + α4 = c, α3 + α4 = d;
in matrix form: (1 1 1 1; 0 0 0 1; 0 1 1 1; 0 0 1 1)(α1 , α2 , α3 , α4 )t = (a, b, c, d)t .

Now we apply Gauß-Jordan to the augmented matrix:
(1 1 1 1 | a; 0 0 0 1 | b; 0 1 1 1 | c; 0 0 1 1 | d) −→ (1 1 1 1 | a; 0 1 1 1 | c; 0 0 1 1 | d; 0 0 0 1 | b)
−→ (1 1 1 0 | a − b; 0 1 1 0 | c − b; 0 0 1 0 | d − b; 0 0 0 1 | b) −→ (1 1 0 0 | a − d; 0 1 0 0 | c − d; 0 0 1 0 | d − b; 0 0 0 1 | b)
−→ (1 0 0 0 | a − c; 0 1 0 0 | c − d; 0 0 1 0 | d − b; 0 0 0 1 | b).

We see that there is exactly one solution for any given M ∈ M (2 × 2). Existence of the
solution shows that the matrices A, B, C, D generate M (2 × 2) and uniqueness shows that
they are linearly independent if we choose M = 0.

The next theorem is very important. It says that if V has a basis which consists of n vectors, then
every basis consists of exactly n vectors.

Theorem 5.45. Let V be a vector space and let {v1 , . . . , vn } and {w1 , . . . , wm } be bases of V . Then
n = m.


Proof. Suppose that m > n. We will show that then the vectors w1 , . . . , wm cannot be linearly
independent, hence they cannot be a basis of V . Since the vectors v1 , . . . , vn are a basis of V , every
wj can be written as a linear combination of them. Hence there exist numbers aij such that

w1 = a11 v1 + a12 v2 + · · · + a1n vn


w2 = a21 v1 + a22 v2 + · · · + a2n vn
.. .. (5.5)
. .
wm = am1 v1 + am2 v2 + · · · + amn vn .

Now we consider the equation


c1 w1 + · · · + cm wm = O. (5.6)
If the w1 , . . . , wm were linearly independent, then it should follow that all cj are 0. We insert (5.5)
into (5.6) and obtain

O = c1 (a11 v1 + a12 v2 + · · · + a1n vn ) + c2 (a21 v1 + a22 v2 + · · · + a2n vn )

+ · · · + cm (am1 v1 + am2 v2 + · · · + amn vn )
= (c1 a11 + c2 a21 + · · · + cm am1 )v1 + · · · + (c1 a1n + c2 a2n + · · · + cm amn )vn .

Since the vectors v1 , . . . , vn are linearly independent, the expressions in the parentheses must be
equal to zero. So we find

c1 a11 + c2 a21 + · · · + cm am1 = 0


c1 a12 + c2 a22 + · · · + cm am2 = 0
.. .. (5.7)
. .
c1 a1n + c2 a2n + · · · + cm amn = 0.

This is a homogeneous system of n equations for the m unknowns c1 , . . . , cm . Since n < m it


must have infinitely many solutions. So the system {w1 , . . . , wm } is not linearly independent and
therefore it cannot be a basis of V . Therefore m > n cannot be true and it follows that n ≥ m.
If we assume that n > m, then the same argument as above, with the roles of the vj and the wj
exchanged, leads to a contradiction and it follows n ≤ m.

In summary we showed that both n ≥ m and n ≤ m must be true. Therefore m = n.

Definition 5.46. • Let V be a finitely generated vector space. Then it has a basis by The-
orem 5.47 below and by Theorem 5.45 the number n of vectors needed for a basis does not
depend on the particular chosen basis. This number is called the dimension of V . It is denoted
by dim V .

• If a vector space V is not finitely generated, then we set dim V = ∞.

• The empty set is a basis of the trivial vector space {O}, hence dim{O} = 0.

Next we show that every finitely generated vector space has a basis and therefore a well-defined
dimension.


Theorem 5.47. Let V be a vector space and assume that there are vectors w1 , . . . , wm ∈ V such
that V = span{w1 , . . . , wm }. Then the set {w1 , . . . , wm } contains a basis of V . In particular, V
has a finite basis and dim V ≤ m.

Proof. Without restriction we may assume that all vectors wj are different from O. We start with
the first vector. If V = span{w1 }, then {w1 } is a basis of V and dim V = 1. Otherwise we set
V1 := span{w1 } and we note that V1 6= V . Now we check if w2 ∈ span{w1 }. If it is, we throw it out
because in this case span{w1 } = span{w1 , w2 } so we do not need w2 to generate V . Next we check
if w3 ∈ span{w1 }. If it is, we throw it out, etc. We proceed like this until we find a vector wi2 in
our list which does not belong to span{w1 }. Such an i2 must exist because otherwise we already
had that V1 = V . Then we set V2 := span{w1 , wi2 }. If V2 = V , then we are done. Otherwise, we
proceed as before: We check if wi2 +1 ∈ V2 . If this is the case, then we can throw it out because
span{w1 , wi2 } = span{w1 , wi2 , wi2 +1 }. Then we check wi2 +2 , etc., until we find a wi3 such that
wi3 ∉ span{w1 , wi2 } and we set V3 := span{w1 , wi2 , wi3 }. If V3 = V , then we are done. If not, then
we repeat the process. Note that after at most m repetitions, this comes to an end. This shows
that we can extract from the system of generators a basis {w1 , wi2 , . . . , wik } of V .

The following theorem complements the preceding one.

Theorem 5.48. Let V be a finitely generated vector space. Then any system w1 , . . . , wm ∈ V of
linearly independent vectors can be completed to a basis {w1 , . . . , wm , vm+1 , . . . , vn } of V .

Proof. Note that dim V < ∞ by Theorem 5.47 and set n = dim V . It follows that n ≥ m because
we have m linearly independent vectors in V . If m = n, then w1 , . . . , wm is already a basis of V
and we are done.
If m < n, then span{w1 , . . . , wm } ≠ V and we choose an arbitrary vector vm+1 ∉ span{w1 , . . . , wm }
and we define Vm+1 := span{w1 , . . . , wm , vm+1 }. Then dim Vm+1 = m + 1. If m + 1 = n,
then necessarily Vm+1 = V and we are done. If m + 1 < n, then we choose an arbitrary vector
vm+2 ∈ V \ Vm+1 and we let Vm+2 := span{w1 , . . . , wm , vm+1 , vm+2 }. If m + 2 = n, then necessarily
Vm+2 = V and we are done. If not, we repeat the step before. Note that after n − m steps we have
found a basis {w1 , . . . , wm , vm+1 , . . . , vn } of V .

In summary, the two preceding theorems say the following:



• If the set of vectors v1 , . . . , vm generates the vector space V , then it is always possible to extract
a subset which is a basis of V (we need to eliminate m − n vectors, where n = dim V ).

• If we have a set of linearly independent vectors v1 , . . . , vm in a finitely generated vector space


V , then it is possible to find vectors vm+1 , . . . , vn such that {v1 , . . . , vn } is a basis of V (we
need to add dim V − m vectors).
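For instance, in R2 the vectors (1, 0)t , (0, 1)t , (1, 1)t generate R2 , and removing the last one leaves the basis {(1, 0)t , (0, 1)t }. Conversely, the single linearly independent vector (1, 1)t can be completed to the basis {(1, 1)t , (1, 0)t } of R2 (the matrix with these columns has determinant −1 ≠ 0).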

Corollary 5.49. Let V be a vector space.

• If the vectors v1 , . . . , vk ∈ V are linearly independent, then k ≤ dim V .


• If the vectors v1 , . . . , vm ∈ V generate V , then m ≥ dim V .


  
Example 5.50. • Let A = (1 0; 0 0), B = (1 1; 1 1) ∈ M (2 × 2) and suppose that we want to
complete them to a basis of M (2 × 2) (it is clear that A and B are linearly independent,
so this makes sense). Since dim(M (2 × 2)) = 4, we know that we need 2 more matrices.
We take any matrix C ∉ span{A, B}, for example C = (0 1; 0 0). Finally we need a matrix
D ∉ span{A, B, C}. We can take for example D = (0 0; 1 0). Then A, B, C, D is a basis of
M (2 × 2).

Check that D ∉ span{A, B, C}.

Find other matrices C ′ and D ′ such that {A, B, C ′ , D ′ } is a basis of M (2 × 2).


           
• Given the vectors ~v1 = (1, 0, 1)t , ~v2 = (4, 0, 4)t , ~v3 = (1, 2, 3)t , ~v4 = (0, 2, 2)t , ~v5 = (0, 0, 2)t ,
~v6 = (2, 1, 5)t , we want to find a subset of them which forms a basis of R3 .
Note that a priori it is not clear that this is possible because we do not know without further
calculations that the given vectors really generate R3 . If they do not, then of course it is
impossible to extract a basis from them.
Let us start. First observe that we need 3 vectors for a basis since dim R3 = 3. So we start
with the first non-zero vector which is ~v1 . We see that ~v2 = 4~v1 , so we discard it. We keep
~v3 since ~v3 ∉ span{~v1 }. Next, ~v4 = ~v3 − ~v1 , so ~v4 ∈ span{~v1 , ~v3 } and we discard it. A little
calculation shows that ~v5 ∉ span{~v1 , ~v3 }. Hence {~v1 , ~v3 , ~v5 } is a basis of R3 .
calculation shows that ~v5 ∈/ span{~v1 , ~v3 }. Hence {~v1 , ~v3 , ~v5 } is a basis of R3 .

Remark 5.51. We will present a more systematic way to solve exercises of this type in
Theorem 6.34 and Remark 6.35.

Theorem 5.52. Let V be a vector space with basis {v1 , . . . , vn }. Then every w ∈ V can be written
in a unique way as a linear combination of the vectors v1 , . . . , vn .

Proof. We have to show existence and uniqueness of numbers c1 , . . . , cn so that w = c1 v1 +· · ·+cn vn .


Existence is clear since the set {v1 , . . . , vn } is a set of generators of V (it is even a basis!).
Uniqueness can be shown as follows. Assume that there are numbers c1 , . . . , cn and d1 , . . . , dn such
that w = c1 v1 + · · · + cn vn and w = d1 v1 + · · · + dn vn . Then it follows that

O = w − w = c1 v1 + · · · + cn vn − (d1 v1 + · · · + dn vn ) = (c1 − d1 )v1 + · · · + (cn − dn )vn .


Then all the coefficients c1 − d1 , . . . , cn − dn have to be zero because the vectors v1 , . . . , vn are
linearly independent. Hence it follows that c1 = d1 , . . . , cn = dn , which shows uniqueness. Note
that the theorem is also true if V = {O} because by definition the empty sum is equal to zero.
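For instance, with respect to the basis {(1, 1)t , (1, −1)t } of R2 , the vector (5, 7)t has the unique representation (5, 7)t = 6 (1, 1)t − 1 (1, −1)t , as one checks by solving c1 + c2 = 5, c1 − c2 = 7.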

If we have a vector space V and a subspace W ⊂ V , then we can ask ourselves what the relation
between their dimensions is because W itself is a vector space.


Lemma 5.53. Let V be a finitely generated vector space and let W be a subspace. Then W is
finitely generated and dim W ≤ dim V .

Proof. Let V be a finitely generated vector space with dim V = n < ∞. Let W be a subspace of
V and assume that W is not finitely generated. Then we can construct an arbitrarily large system
of linearly independent vectors in W as follows. Clearly, W cannot be the trivial space, so we can
choose w1 ∈ W \ {O} and we set W1 = span{w1 }. Then W1 is a finitely generated subspace of
W , therefore W1 ⊊ W and we can choose w2 ∈ W \ W1 . Clearly, the set {w1 , w2 } is linearly
independent. Let us set W2 = span{w1 , w2 }. Since W2 is a finitely generated subspace of W , it
follows that W2 ⊊ W and we can choose w3 ∈ W \ W2 . Then the vectors w1 , w2 , w3 are linearly
independent and we set W3 = span{w1 , w2 , w3 }. Continuing with this procedure we can construct
subspaces W1 ⊊ W2 ⊊ · · · ⊆ W with dim Wk = k for every k. In particular, we can find a system
of n + 1 linearly independent vectors in W ⊆ V which contradicts the fact that any system of more
than n = dim V vectors in V must be linearly dependent, see Corollary 5.49. This also shows that
any system of more than n vectors in W must be linearly dependent. Since a basis of W consists of
linearly independent vectors, it follows that dim W ≤ n = dim V .

Theorem 5.54. Let V be a finitely generated vector space and let W ⊆ V be a subspace. Then the
following is true:

(i) dim W ≤ dim V .


(ii) dim W = dim V if and only if W = V .

Proof. (i) follows immediately from Lemma 5.53.


(ii) If V = W , then clearly dim V = dim W . To show the converse, we assume that dim V =
dim W and we have to show that V = W . As before let {w1 , . . . , wk } be a basis of W . Then
these vectors are linearly independent in W , and therefore also in V . Since dim W = dim V ,
we know that these vectors form a basis of V . Therefore V = span{w1 , . . . , wk } = W .

Remark 5.55. Note that (i) is true even when V is not finitely generated because dim W ≤ ∞ =
dim V whatever dim W may be. However (ii) is not true in general for infinite dimensional vector
spaces. In Example 5.56 (f) and (g) we will show that dim P = dim C(R) in spite of P 6= C(R).
(Recall that P is the set of all polynomials and that C(R) is the set of all continuous functions. So

we have P ⊊ C(R).)

Now we give a few examples of dimensions of spaces.

Examples 5.56. (a) dim Rn = n, dim Cn = n.

(b) dim M (m × n) = mn. This follows because the set of all m × n matrices Aij which have a 1 in
the ith row and jth column and all other entries are equal to zero form a basis of M (m × n)
and there are exactly mn such matrices.

(c) Let Msym (n × n) be the set of all symmetric n × n matrices. Then dim Msym (n × n) = n(n + 1)/2.
To see this, let Aij be the n × n matrix with aij = aji = 1 and all other entries equal to 0.
Observe that Aij = Aji . It is not hard to see that the set of all Aij with i ≤ j form a basis of


Msym (n × n). The dimension of Msym (n × n) is the number of different matrices of this type.
How many of them are there? If we fix j = 1, then only i = 1 is possible. If we fix j = 2,
then i = 1, 2 is possible, etc. until for j = n the allowed values for i are 1, 2, . . . , n. In total
we have 1 + 2 + · · · + n = n(n + 1)/2 possibilities. For example, in the case n = 2, the matrices are
A11 = (1 0; 0 0), A12 = (0 1; 1 0), A22 = (0 0; 0 1).
In the case n = 3, the matrices are
A11 = (1 0 0; 0 0 0; 0 0 0), A12 = (0 1 0; 1 0 0; 0 0 0), A13 = (0 0 1; 0 0 0; 1 0 0),
A22 = (0 0 0; 0 1 0; 0 0 0), A23 = (0 0 0; 0 0 1; 0 1 0), A33 = (0 0 0; 0 0 0; 0 0 1).

Convince yourself that the Aij form a basis of Msym (n × n).

(d) Let Masym (n × n) be the set of all antisymmetric n × n matrices. Then dim Masym (n × n) =
n(n − 1)/2. To see this, for i ≠ j let Aij be the n × n matrix with aij = −aji = 1 and all other
entries equal to 0. It is not hard to see that the set of all Aij with i < j forms a basis of
Masym (n × n). How many of these matrices are there? If we fix j = 2, then only i = 1 is possible.
If we fix j = 3, then i = 1, 2 is possible, etc. until for j = n the allowed values for i are
1, 2, . . . , n − 1. In total we have 1 + 2 + · · · + (n − 1) = n(n − 1)/2 possibilities. For example,
in the case n = 2, the only matrix is
A12 = (0 1; −1 0).
In the case n = 3, the matrices are
A12 = (0 1 0; −1 0 0; 0 0 0), A13 = (0 0 1; 0 0 0; −1 0 0), A23 = (0 0 0; 0 0 1; 0 −1 0).

Convince yourself that the Aij form a basis of Masym (n × n).

Remark. Observe that dim Msym (n × n) + dim Masym (n × n) = n2 = dim M (n × n). This
is no coincidence. Note that every n × n matrix M can be written as

M = (1/2)(M + M t ) + (1/2)(M − M t )

and that (1/2)(M + M t ) ∈ Msym (n × n) and (1/2)(M − M t ) ∈ Masym (n × n). Moreover it is easy


to check that Msym (n × n) ∩ Masym (n × n) = {0}. Therefore M (n × n) is the direct sum
of Msym (n × n) and Masym (n × n). (For the definition of the direct sum of subspaces, see
Definition 5.59).


(e) dim Pn = n + 1 since {1, X, . . . , X n } is a basis of Pn and consists of n + 1 vectors.

(f) dim P = ∞. Recall that P is the space of all polynomials.

Proof. We know that for every n ∈ N, the space Pn is a subspace of P . Therefore for every
n ∈ N, we must have that n + 1 = dim Pn ≤ dim P . This is possible only if dim P = ∞.

(g) dim C(R) = ∞. Recall that C(R) is the space of all continuous functions.

Proof. Since P is a subspace of C(R), it follows that dim P ≤ dim(C(R)), hence dim(C(R)) =
∞.

Now we use the concept of dimension to classify all subspaces of R2 and R3 . We already know that
for example, lines and planes which pass through the origin are subspaces of R3 . Now we can show
that there are no other proper subspaces.

Subspaces of R2 . Let U be a subspace of R2 . Then U must have a dimension. So we have the
following cases:

• dim U = 0. In this case U = {~0} is the trivial subspace.

• dim U = 1. Then U is of the form U = span{~v1 } with some vector ~v1 ∈ R2 \ {~0}. Therefore
U is a line parallel to ~v1 passing through the origin.

• dim U = 2. In this case dim U = dim R2 . Hence it follows that U = R2 by Theorem 5.54 (ii).
• dim U ≥ 3 is not possible because 0 ≤ dim U ≤ dim R2 = 2.

In conclusion, the only subspaces of R2 are {~0}, lines passing through the origin and R2 itself.
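For instance, the line U = {(x, y)t ∈ R2 : y = 3x} = span{(1, 3)t } is a one-dimensional subspace of R2 , whereas the parabola {(x, x2 )t : x ∈ R} is not a subspace at all: (1, 1)t and (2, 4)t lie on it, but their sum (3, 5)t does not.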

Subspaces of R3 . Let U be a subspace of R3 . Then U must have a dimension. So we have the


following cases:

• dim U = 0. In this case U = {~0} is the trivial subspace.

• dim U = 1. Then U is of the form U = span{~v1 } with some vector ~v1 ∈ R3 \ {~0}. Therefore

U is a line parallel to ~v1 passing through the origin.

• dim U = 2. Then U is of the form U = span{~v1 , ~v2 } with linearly independent vectors
~v1 , ~v2 ∈ R3 . Hence U is a plane parallel to the vectors ~v1 and ~v2 which passes through the
origin.

• dim U = 3. In this case dim U = dim R3 . Hence it follows that U = R3 by Theorem 5.54 (ii).

• dim U ≥ 4 is not possible because 0 ≤ dim U ≤ dim R3 = 3.

In conclusion, the only subspaces of R3 are {~0}, lines passing through the origin, planes passing
through the origin and R3 itself.

We conclude this section with the formal definition of lines and planes.


Definition 5.57. Let V be a vector space with dim V = n and let W ⊆ V be a subspace. Then
W is called a

• line if dim W = 1,
• plane if dim W = 2,
• hyperplane if dim W = n − 1.

Note that in R3 the hyperplanes are exactly the planes.

You should have understood

• the concept of a basis of a finite dimensional vector space,


• that a given vector space has infinitely many bases, but the number of vectors in any basis
of the space is the same,

• why and how the concept of dimension helps to classify all subspaces of a given vector space,
• why a matrix A ∈ M (n × n) is invertible if and only if its columns are a basis of Rn ,
• etc.
You should now be able to
• check if a system of vectors is a basis for a given vector space,
• find a basis for a given vector space,
• extend a system of linearly independent vectors to a basis,
• find the dimension of a given vector space,
• etc.

Exercises.
1. Find bases for the spaces given in Exercises 1 to 10 of Section 5.3.
2. Determine the dimension of the following spaces:
(a) In R4 , the vectors (x, y, z, w)t such that w = x + y.
(b) All vectors of the form (a + c, a − b, b + c, −a + b)t .
(c) {A ∈ M (2 × 2) : A (1 −1; 2 −2) = (0 0; 0 0)}.


(d) The solutions of the homogeneous system
x − 5y = 0
2x − 3y = 0
(e) The solutions of the homogeneous system
x − 3y − z = 0
−2x + 2y − 3z = 0
4x − 8y + 5z = 0.
(f) The solutions of the homogeneous system
−x + 3y − 2z = 0
2x − 6y + 4z = 0
−3x + 9y − 6z = 0.
(g) V = span{cos 2t, sin2 t, cos2 t}.
3. In the following exercises, determine whether the given set of vectors is a basis for the indicated vector space.
(a) In M (2 × 2): (3 1; 0 0), (3 2; 0 0), (−5 1; 0 6), (0 1; 0 −7).
(b) In W = {(x, y) ∈ R2 : 3x − y = 0}: (1, 3)t .
(c) In R4 : (0, 0, 1, 1)t , (−1, 1, 1, 2)t , (1, 1, 0, 0)t , (2, 1, 2, 1)t .
(d) In P2 : 5 − x2 , 3x.
(e) In P3 : x3 − x, x3 + x2 , x2 + 1, x − 1.
4. In R4 , find a basis for the subspace U = {~x ∈ R4 : h~x , (2, −3, 0, 4)t i = 0}. (Note that U is a hyperplane in R4 .)
5. In R3 , consider the line L : x/5 = y/3 = 2z. Find a basis for L and complete it to a basis of R3 .
6. Find a basis for the plane E : 2x + y − 5z = 0 and complete it to a basis of R3 . Represent it graphically (there is a natural way to do so; which one?).
7. Find a basis of R4 which contains the vectors (1, 0, 1, 0)t and (−1, 1, 0, 1)t .


   
8. Show that the vectors (a, b)t , (−b, a)t form a basis of R2 if ab ≠ 0. Show also that (a, b)t ⊥ (−b, a)t .
9. Show that {1 − x2 , 1 + x2 } is a basis of the subset of P2 consisting of the polynomials whose first derivative evaluated at zero equals zero. Complete this basis to a basis of all of P2 .
10. For which values of α do the vectors (α5 , 1 + α, 0)t , (α, 0, 2)t , (3 − α, 0, 1)t generate all of R3 ?
11. (a) In M (n × n), show that n2 matrices which all have entry ann equal to zero cannot be linearly independent.
(b) In Pn , show that n + 1 polynomials whose first derivative evaluated at zero vanishes cannot be linearly independent.
(c) In Pn , do there exist n + 1 linearly independent polynomials such that the coefficient of x0 is 1?

5.5 Intersections and sums of vector spaces


In this section we will construct new subspaces from given ones. We will see that the intersection
of two vector spaces is again a vector space, whereas the union in general is not.
Proposition 5.58. Let U, W be subspaces of a vector space V . Then their intersection U ∩ W is
a subspace of V .

Proof. Clearly, U ∩ W ≠ ∅ because O ∈ U and O ∈ W , hence O ∈ U ∩ W . Now let z1 , z2 ∈ U ∩ W


and c ∈ K. Then z1 , z2 ∈ U and therefore z1 + cz2 ∈ U because U is a vector space. Analogously
it follows that z1 + cz2 ∈ W , hence z1 + cz2 ∈ U ∩ W .

Observe that U ∩ W is the largest subspace which is contained both in U and in W .



For example, the intersection of two planes in R3 which pass through the origin is either that same
plane (if the two original planes are the same plane), or it is a line passing through the origin. In
either case, it is a subspace of R3 .
Observe however that the union of two vector spaces in general is not a vector space. For
instance, in R2 the lines L : y = 0 (this is the x-axis) and G : x = 0 (this is the y-axis) are subspaces
and their union L ∪ G consists exactly of the two axes. This is clearly not a vector space because it
is not closed under sums. For example, ~e1 ∈ L ⊆ L ∪ G and ~e2 ∈ G ⊆ L ∪ G, but ~e1 + ~e2 ∉ L ∪ G. In
order to make it a vector space, we need to include all the missing linear combinations. The space
that we obtain in this way is called the sum of L and G; since L ∩ G = {~0}, it is even their direct sum, see Definition 5.59.

Exercise. • Give more examples of two subspaces whose union is not a vector space.
• Give an example of two subspaces whose union is a vector space.


Question 5.1. Union of subspaces.


Can you find a criterion that subspaces must satisfy such that their union is a subspace?

Let us define the sum and the direct sum of vector spaces.

Definition 5.59. Let U, W be subspaces of a vector space V . Then the sum of the vector spaces
U and W is defined as
U + W = {u + w : u ∈ U, w ∈ W }. (5.8)
If in addition U ∩ W = {O}, then the sum is called the direct sum of U and W and one writes
U ⊕ W instead of U + W .

Remark. Let U, W be subspaces of a vector space V . Then U + W is again a subspace of V .

Proof. Clearly, U + W ≠ ∅ because O ∈ U and O ∈ W , hence O + O = O ∈ U + W . Now let
z1 , z2 ∈ U + W and c ∈ K. Then there exist u1 , u2 ∈ U and w1 , w2 ∈ W with z1 = u1 + w1 and
z2 = u2 + w2 . Therefore

z1 + cz2 = u1 + w1 + c(u2 + w2 ) = (u1 + cu2 ) + (w1 + cw2 ) ∈ U + W

and U + W is a subspace by Proposition 5.10.


Note that U + W consists of all possible linear combinations of vectors from U and from W . We
obtain immediately the following observations.

Remark 5.60. (i) Assume that U = span{u1 , . . . , uk } and that W = span{w1 , . . . , wj }, then
U + W = span{u1 , . . . , uk , w1 , . . . , wj }.

(ii) The space U + W is the smallest vector space which contains both U and W .
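For instance, if U = span{~e1 } and W = span{~e2 } in R3 , then U + W = span{~e1 ,~e2 } is the xy-plane; it is the smallest subspace containing both coordinate axes, and since U ∩ W = {~0} this sum is even direct.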

Examples 5.61. (i) Let V be a vector space and let U ⊆ V be a subspace. Then we always

have:

(a) U + {O} = U ⊕ {O} = U ,


(b) U + U = U ,
(c) U + V = V .

If U and W are subspaces of V , then

(a) U ⊆ U + W and W ⊆ U + W .
(b) U + W = U if and only if W ⊆ U .

(ii) Let U and W be lines in R2 passing through the origin. Then they are subspaces of R2 and
we have that U + W = U if the lines are parallel and U + W = R2 if they are not parallel.


(iii) Let U and W be lines in R3 passing through the origin. Then they are subspaces of R3 and
we have that U + W = U if the lines are parallel; otherwise U + W is the plane containing
both lines.
(iv) Let U be a line and W be a plane in R3 , both passing through the origin. Then they are
subspaces of R3 and we have that U + W = W if the line U is contained in W . If not, then
U + W = R3 .

Prove the statements in the examples above.

Recall that the intersection of two subspaces is again a subspace, see Proposition 5.58. The formula
for the dimension of the sum of two vector spaces in the next proposition can be understood as
follows: If we sum the dimension of the two vector spaces, then we count the part which is common to
both spaces twice; therefore we have to subtract its dimension in order to get the correct dimension
of the sum of the vector spaces.

Proposition 5.62. Let U, W be subspaces of a vector space V . Then

dim(U + W ) = dim U + dim W − dim(U ∩ W ).

In particular, dim(U + W ) = dim U + dim W if U ∩ W = {O}.

Proof. Let dim U = k and dim W = m. Recall that U ∩ W is a subspace of V and that U ∩ W ⊆ U
and U ∩W ⊆ W . Let v1 , . . . , v` be a basis of U ∩W . By Theorem 5.48 we can complete it to a basis
v1 , . . . , v` , u`+1 , . . . , uk of U . Similarly, we can complete it to a basis v1 , . . . , v` , w`+1 , . . . , wm of
W . Now we claim that v1 , . . . , v` , u`+1 , . . . , uk , w`+1 , . . . , wm is a basis of U + W .
• First we show that the vectors v1 , . . . , v` , u`+1 , . . . , uk , w`+1 , . . . , wm generate U + W . This
follows from Remark 5.60 and

U + W = span{v1 , . . . , v` , u`+1 , . . . , uk } + span{v1 , . . . , v` , w`+1 , . . . , wm }


= span{v1 , . . . , v` , u`+1 , . . . , uk , v1 , . . . , v` , w`+1 , . . . , wm }
= span{v1 , . . . , v` , u`+1 , . . . , uk , w`+1 , . . . , wm }.

• Now we show that the vectors v1 , . . . , v` , u`+1 , . . . , uk , w`+1 , . . . , wm are linearly indepen-

dent. Let α1 , . . . , αk , β`+1 , . . . , βm ∈ R such that

α1 v1 + · · · + α` v` + α`+1 u`+1 + · · · + αk uk + β`+1 w`+1 + · · · + βm wm = O.

It follows that

α1 v1 + · · · + α` v` + α`+1 u`+1 + · · · + αk uk = −(β`+1 w`+1 + · · · + βm wm ) (5.9)


Note that the left hand side belongs to U and the right hand side belongs to W ,

and therefore −(β`+1 w`+1 + · · · + βm wm ) ∈ U ∩ W hence it must be a linear combination of


the vectors v1 , . . . , v` because they are a basis of U ∩ W . So we can find γ1 , . . . , γ` ∈ R such
that γ1 v1 + · · · + γ` v` = −(β`+1 w`+1 + · · · + βm wm ). This implies that

γ1 v1 + · · · + γ` v` + β`+1 w`+1 + · · · + βm wm = O.


Since the vectors v1 , . . . , v` , w`+1 , . . . , wm form a basis of W , they are linearly independent,
and we conclude that γ1 = · · · = γ` = β`+1 = · · · = βm = 0. Inserting in (5.9), we obtain

α1 v1 + · · · + α` v` + α`+1 u`+1 + · · · + αk uk = O,

hence α1 = · · · = αk = 0.

It follows that

dim(U + W ) = #{v1 , . . . , v` , u`+1 , . . . , uk , w`+1 , . . . , wm }


= ` + (k − `) + (m − `)
=k+m−`
= dim U + dim W − dim(U ∩ W ).
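As a quick illustration, if U and W are two distinct planes through the origin in R3 , then U ∩ W is a line, and indeed dim(U + W ) = 2 + 2 − 1 = 3, i.e. U + W = R3 .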

Examples 5.63. In R3 consider the subspaces E, F, G given by E : 2x − y + 3z = 0, F = span{~v , ~w}
and G = span{~a} where
~v = (0, 1, 2)t , ~w = (1, 0, 1)t , ~a = (−4, 3, 2)t .
2 1 2
Find E ∩ F , E + F , E ∩ G, E + G and F ∩ G, F + G and their dimensions.

Solution. Clearly, E and F are planes in R3 and G is a line.


E ∩ F Note that the normal vectors ~nE of E and ~nF of F are
  

~nE = (2, −1, 3)t , ~nF = ~v × ~w = (1, 2, −1)t .

Solution 1. The normal form of F is F : x + 2y − z = 0. A point (x, y, z) belongs to E ∩ F if and


only if its coordinates satisfy the equation for E and F simultaneously. Therefore we obtain the
following system of equations:
2x − y + 3z = 0
D

x + 2y − z = 0
A short calculation (Gauß-Jordan) shows that the set of solution is the line H : x = −y = −z, or
in vector form  
−1
H = gen{~b} where ~b =  1 . (∗)
1
Solution 2. We can also use the vector forms of E and F . In order to write E in vector form, we
only need to choose two vectors ~r, ~s which are parallel to E, for instance
   
1 1
E = span{~r, ~s} where ~r = 2 , ~s = 5 .
0 1


A vector ~x = (x, y, z)t belongs to E ∩ F if and only if the vector ~x is a linear combination of
~v , ~w and a linear combination of ~r, ~s, that is, if and only if there exist α, β, γ, δ ∈ R such that
α~v + β ~w = γ~r + δ~s, or
α~v + β ~w − γ~r − δ~s = ~0.
Writing this as a system for the unknowns α, β, γ, δ ∈ R, we obtain

β− γ− δ=0
α − 2γ − 5δ = 0
2α + β − δ=0

A straightforward calculation shows that the general solution is α = t, β = −t, γ = −2t, δ = t.


Therefore

E ∩ F = {t~v − t~w : t ∈ R} = {t(~v − ~w) : t ∈ R} = span{~v − ~w} = span{~b} = H

or equivalently

E ∩ F = {−2t~r + t~s : t ∈ R} = {t(−2~r + ~s) : t ∈ R} = span{−2~r + ~s} = span{~b} = H

with ~b and H as in (∗).

E + F Solution 1. We know that E + F = span{~v , ~w, ~r, ~s}. Now, similarly as in Example 5.50 we
see that the vectors ~v , ~w, ~r are linearly independent, therefore the dimension of E + F is at least 3.
Since E + F is a subspace of R3 , it must be equal to R3 . (We could also use Theorem 6.34
and Remark 6.35 to find a system of generators for E + F .)
Solution 2. We know that

dim(E + F ) = dim E + dim F − dim(E ∩ F ) = 2 + 2 − 1 = 3.

Since E + F ⊆ R3 , it follows that E + F = R3 .


E ∩ G We insert the parametric form of G in the normal form of E and obtain as condition for
intersection that
0 = 2(−4t) − 3t + 3 · (2t) = −5t.
The unique solution is t = 0 which implies that E ∩ G = {O}.

E + G It is easy to see that the three vectors ~r, ~s, ~a are linearly independent, therefore they
generate R3 and hence E + G = span{~r, ~s, ~a} = R3 .
Alternatively we could use that dim(E + G) = dim E + dim G − dim(E ∩ G) = 2 + 1 − 0 = 3 to
conclude that E + G = R3 .
F ∩ G It is easy to see that the three vectors ~v , ~w, ~a are linearly dependent. In fact, ~a = 3~v − 4~w.
Therefore G ⊆ F and consequently F ∩ G = G.
F + G From the above it follows that F + G = span{~v , ~w, ~a} = span{~v , ~w} = F .
From the above, it is clear that dim(E ∩ F ) = 1, dim(E + F ) = 3, dim(E ∩ G) = 0, dim(E + G) = 3,
dim(F ∩ G) = 1, dim(F + G) = 2.



Example 5.64. In R5 consider the subspaces U = {(x, y, z, r, s)t : 2x + y + 5z + 4s = 0} and
W = {(x, y, z, r, s)t : x + y + 4z + 3s = 0, 3x + y + 6z + 4r + 6s = 0}. Find U ∩ W , U + W and their
dimensions.

Solution. U ∩ W It is easy to see that dim U = 4 and dim W = 3.


Solution 1. A point (x, y, z, r, s) belongs to U ∩ W if and only if it satisfies the following system of
linear equations
2x + y + 5z + 4s =0
x + y + 4z + 3s =0
3x + y + 6z + 4r + 6s = 0
This can be solved with the Gauß-Jordan elimination
       
(2 1 5 0 4; 1 1 4 0 3; 3 1 6 4 6) −→ (0 −1 −3 0 −2; 1 1 4 0 3; 0 −2 −6 4 −3) −→ (0 −1 −3 0 −2; 1 0 1 0 1; 0 0 0 4 1) −→ (1 0 1 0 1; 0 1 3 0 2; 0 0 0 4 1)

Therefore, dim(U ∩ W ) = 2 and

U ∩ W = {(x, y, z, r, s)t : x + z + s = 0, y + 3z + 2s = 0, 4r + s = 0} = span{(−4, −8, 0, −1, 4)t , (−1, −3, 1, 0, 0)t }.   (5.10)
 

Solution 2. We can use vector forms of U and W . We choose any set of linearly indepen-
dent vectors ~u1 , ~u2 , ~u3 , ~u4 in U and ~w1 , ~w2 , ~w3 in W . Then U = span{~u1 , ~u2 , ~u3 , ~u4 } and
W = span{~w1 , ~w2 , ~w3 }. For instance, we may take
             
~u1 = (1, −2, 0, 0, 0)t , ~u2 = (0, −5, 1, 0, 0)t , ~u3 = (0, −4, 0, 0, 1)t , ~u4 = (0, 0, 0, 1, 0)t ,
~w1 = (−3, −3, 0, 0, 2)t , ~w2 = (−2, 2, 0, 1, 0)t , ~w3 = (−1, −3, 1, 0, 0)t .

Then ~x ∈ U ∩ W if it is a linear combination both of the ~uj and of the ~wj , that is

~x = α1 ~u1 + α2 ~u2 + α3 ~u3 + α4 ~u4 = β1 ~w1 + β2 ~w2 + β3 ~w3

for some αj , βj ∈ R. If we take the difference of the two right hand sides, we obtain
 
~0 = α1 ~u1 + α2 ~u2 + α3 ~u3 + α4 ~u4 − β1 ~w1 − β2 ~w2 − β3 ~w3 = (~u1 | ~u2 | ~u3 | ~u4 | −~w1 | −~w2 | −~w3 ) (α1 , α2 , α3 , α4 , β1 , β2 , β3 )t .


We solve this system using Gauß-Jordan elimination:
(1 0 0 0 3 2 1; −2 −5 −4 0 3 −2 3; 0 1 0 0 0 0 −1; 0 0 0 1 0 −1 0; 0 0 1 0 −2 0 0) −→ · · · −→ (1 0 0 0 3 2 1; 0 1 0 0 0 0 −1; 0 0 1 0 −2 0 0; 0 0 0 1 0 −1 0; 0 0 0 0 1 2 0).

The general solution is


     
(α1 , α2 , α3 , α4 , β1 , β2 , β3 )t = t (−1, 1, 0, 0, 0, 0, 1)t + s (4, 0, −4, 1, −2, 1, 0)t , t, s ∈ R,

and therefore

U ∩ W = {(−t + 4s)~u1 + t~u2 − 4s~u3 + s~u4 : t, s ∈ R} = {t(−~u1 + ~u2 ) + s(4~u1 − 4~u3 + ~u4 ) : t, s ∈ R}
= span{−~u1 + ~u2 , 4~u1 − 4~u3 + ~u4 }

or equivalently
U ∩ W = {−2s~w1 + s~w2 + t~w3 : t, s ∈ R} = {s(−2~w1 + ~w2 ) + t~w3 : t, s ∈ R} = span{−2~w1 + ~w2 , ~w3 }.

This is of course the same result as in (5.10).

U + W We know that dim(U + W ) = dim U + dim W − dim(U ∩ W ) = 4 + 3 − 2 = 5 = dim R5 ,


therefore U + W = R5 . 

You should now have understood


• the concept of sum and direct sum of two subspaces,
• why the formula dim(U + W ) = dim U + dim W − dim(U ∩ W ) makes sense,
• etc.
You should now be able to
• find the intersection of two vector spaces and its dimension,
• find the sum of two vector spaces and its dimension,
• decide if the sum of two vector spaces is a direct sum,
• etc.


Exercises.
1. In R4 , let U = span{(1, 5, 2, −5)t , (−3, 4, 1, −3)t } and
V = {~x ∈ R4 : h~x , (1, 2, −2, −1)t i = 0, h~x , (2, 5, −5, 1)t i = 0}. Determine U ∩ V , U + V and
dim(U + V ).
2. In R3 show that if E, F are non-parallel planes passing through the origin, then
E + F = R3 .
3. Let V be the subspace of upper triangular matrices and W the subspace of lower triangular
matrices. Show that M (n × n) = V + W . Is this sum direct?
4. Show that M (3 × 3) = Msym (3 × 3) ⊕ Masym (3 × 3). (Hint: It suffices to show that
Msym (3 × 3) ∩ Masym (3 × 3) = {O}. Why is it sufficient to show this?)
5. Let U, V be subspaces of Rn . Answer the following questions.
(a) If dim U + dim V = n, does it follow that U + V = Rn ?
(b) If Rn = U + V and n = dim U + dim V , does it follow that U ∩ V = {~0}?
6. Let V ⊆ Rn be a subspace of dimension k. Show that there exists a subspace V ′ of Rn such
that Rn = V ⊕ V ′ . (Hint: Choose a basis of V and complete it to a basis of Rn ; how should
V ′ be chosen?)
7. Let U, V ⊆ Rn be subspaces. Show that ~0 can be written in a unique way as the sum of an
element of U and an element of V if and only if U ∩ V = {~0}. (In that case U + V = U ⊕ V .)
8. Suppose that U, V, W ⊆ R5 with dim U = 2, dim V = 3 and dim W = 4.
(a) What are the possibilities for dim(U ∩ V ) and dim(U + V )? Give examples for each case.
(b) What are the possibilities for dim(U ∩ W ) and dim(U + W )? Give examples for each case.
(c) What are the possibilities for dim(V ∩ W ) and dim(V + W )? Give examples for each case.

5.6 Summary
Let V be a vector space over K and let v1 , . . . , vk ∈ V .

Linear combinations and linear independence


• A vector w is called a linear combination of the vectors v1 , . . . , vk if there exist scalars
α1 , . . . , αk ∈ K such that
w = α1 v1 + · · · + αk vk .


• The set of all linear combinations of the vectors v1 , . . . , vk is a subspace of V , called the space
generated by the vectors v1 , . . . , vk or the linear span of the vectors v1 , . . . , vk . Notation:

gen{v1 , . . . , vk } := span{v1 , . . . , vk } := {w ∈ V : w is linear combination of v1 , . . . , vk }


= {α1 v1 + · · · + αk vk : α1 , . . . , αk ∈ K}.

• The vectors v1 , . . . , vk are called linearly independent if the equation

α1 v1 + · · · + αk vk = O

has only the trivial solution α1 = · · · = αk = 0.

Basis and dimension


• A system v1 , . . . , vm of vectors in V is called a basis of V if it is linearly independent and
span{v1 , . . . , vm } = V .
• A vector space V is called finitely generated if it has a finite basis. In this case, every basis

of V has the same number of vectors. The number of vectors needed for a basis of a vector
space V is called the dimension of V .
• If V is not finitely generated, we set dim V = ∞.
• For v1 , . . . , vk ∈ V , it follows that dim(span{v1 , . . . , vk }) ≤ k with equality if and only if the
vectors v1 , . . . , vk are linearly independent.
• If V is finitely generated then every linearly independent system of vectors v1 , . . . , vk ∈ V
can be extended to a basis of V .
• If V = span{v1 , . . . , vk }, then V has a basis consisting of a subsystem of the given vectors
v1 , . . . , vk .
• If U is a subspace of V , then dim U ≤ dim V .
• If V is finitely generated and U is a subspace of V , then dim U = dim V if and only if U = V .
This claim is false if dim V = ∞.
• dim{O} = 0 and {O} has the unique basis ∅.

Examples of the dimensions of some vector spaces:


• dim{O} = 0,
• dim Rn = n, dim Cn = n,
• dim M (m × n) = mn,
• dim Msym (n × n) = n(n + 1)/2,
• dim Masym (n × n) = n(n − 1)/2,
• dim Pn = n + 1,
• dim P = ∞,
• dim C(R) = ∞.


Linear independence, generator property and bases in Rn and Cn


Let ~v1 , . . . , ~vk ∈ Rn or Cn and let A = (~v1 | . . . |~vk ) ∈ M (n × k) be the matrix whose columns consist
of the given vectors.
• gen{~v1 , . . . , ~vk } = Rn if and only if the system A~x = ~b has at least one solution for every
~b ∈ Rn .

• The vectors ~v1 , . . . , ~vk are linearly independent if and only if the system A~x = ~0 has only the
trivial solution ~x = ~0.
• The vectors ~v1 , . . . , ~vk are a basis of Rn if and only if k = n and A is invertible.
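For instance, ~v1 = (1, 0)t and ~v2 = (1, 1)t form a basis of R2 : here k = n = 2 and the matrix A = (1 1; 0 1) is invertible since det A = 1 ≠ 0.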

Intersections and sums of subspaces


Let V be a vector space and U, W subspaces of V . Then

U ∩ W := {v ∈ V : v ∈ U and v ∈ W },
U ∪ W := {v ∈ V : v ∈ U or v ∈ W },
U + W := {u + w : u ∈ U, w ∈ W }.

Note that U ∩ W ⊆ U ⊆ U ∪ W ⊆ U + W and W ∩ U ⊆ W ⊆ U ∪ W ⊆ U + W .

• U ∩ W and U + W are subspaces of V


• U ∪ W in general is not a subspace.
• The sum of U and W is called a direct sum and it is denoted by U ⊕ W if U ∩ W = {O}.
• dim(U + W ) = dim U + dim W − dim(U ∩ W ).

5.7 Exercises
1. Let X be the set of all functions from R to R. Show that X with the usual sum and product
with real numbers is a vector space.
For the following subsets of X, decide whether they are subspaces of X.
(a) All bounded functions from R to R.
(b) All constant functions.
(c) All continuous functions.
(d) All continuous functions with f (3) = 0.
(e) All continuous functions with f (3) = 4.
(f) All functions with f (3) > 0.
(g) All even functions.
(h) All odd functions.
(i) All polynomials.


(j) Todas las funciones no negativas.


(k) Todos los polinomios de grado ≥ 4.

2. Sean A ∈ M (m × n) y sea ~a ∈ Rk .

(a) Demuestre que U = {A~x : ~x ∈ Rn } es un subespacio de Rm .


(b) ¿Los conjuntos R = {~x ∈ Rn : A~x = (1, 1, . . . , 1)t } y S = {~x ∈ Rn : A~x 6= 0} son
subespacios de Rn ?

3. Sean A ∈ M (m × n) y sea ~a ∈ Rk .

(a) ¿El conjunto T = {~x ∈ Rk : h~x, ~ai = 0} es un subespacio de Rk ?


(b) ¿Los conjuntos

S1 = {~x ∈ Rk : k~xk = 1}, B1 = {~x ∈ Rk : k~xk ≤ 1}, F = {~x ∈ Rk : k~xk ≥ 1}

son subespacios de Rk ?

4. Considere el conjunto R2 con las siguientes operaciones:

⊕ : R 2 × R2 → R2 ,
    
x1
x2
y
⊕ 1 =
y2
x1 + y2
x2 + y1

,
   
x1 λx1
: R × R2 → R2 , λ = .
x2 λx2

¿Es R2 con esta suma y producto con escalares un espacio vectorial?

5. Considere el conjunto R2 con las siguientes operaciones:


     
2 2 2 x1 y1 x1 + y1
:R ×R →R ,  = ,

x2 y2 0
   
2 2 x1 λx1
:R×R →R , λ = .
x2 λx2

¿Es R2 con esta suma y producto con escalares un espacio vectorial?

6. (a) Sea V = (− π2 , π
2) y defina suma ⊕ : V × V → V y producto con escalar : R×V → V
por

x ⊕ y = arctan(tan(x) + tan(y)), λ x = arctan(λ tan(x))

para todo x, y ∈ V, λ ∈ R. Demuestre que (V, ⊕, ) es un espacio vectorial sobre R.


(b) Una generalización de la construcción en (a) es lo siguiente:


Sea V un conjunto y f : Rn → V una función biyectiva. Entonces V es un espacio
vectorial con suma y producto con escalar definido ası́:
x ⊕ y = f (f −1 (x) + f −1 (y)), λ x = f (λf −1 (x))
para todo x, y ∈ V, λ ∈ R.
7. Sea U un subespacio de Rn . Demuestre que Rn \ U no es un subespacio de Rn .

8. Sean m, n ∈ N. Demuestre que M (m × n, R) con la suma y producto con números en R es un


espacio vectorial.
De los siguientes subconjuntos de M (n × n), diga si son subespacios.
(a) Todas matrices con a11 = 0.
(b) Todas matrices con a11 = 3.
(c) Todas matrices con a12 = µa11 para un µ ∈ R fijo.

(d) Todas matrices cuya primera columna coincide con la última columna.
Para los siguientes numerales supongamos que n = m.
(e) Todas las matrices simétricas (es decir, todas las matrices A con At = A).
(f) Todas las matrices que no son simétricas.
(g) Todas las matrices antisimétricas (es decir, todas las matrices A con At = −A).
(h) Todas las matrices diagonales.
(i) Todas las matrices triangular superior.
(j) Todas las matrices triangular inferior.
(k) Todas las matrices invertibles.
(l) Todas las matrices no invertibles.
(m) Todas las matrices con det A = 1.

9. Demuestre que   
 x1

 
x1 + x2 − 2x3 − x4 = 0 

 x2 
V =   :
 x3 x1 − x2 + x3 + 7x4 = 0 
  

x4
 

es un subespacio de R4 .

10. Demuestre que   


 x1 
3x1 − x2 − 2x3 − x4 = 3 
 
x2 
 
W =   :


 x3 4x1 + x2 + x3 + 7x4 = 5 

x4
 

es un subespacio afı́n de R4 .


11. Considere los sistemas de ecuaciones lineales


   
 x + 2y + 3z = 0
   x + 2y + 3z = 3 
 
(1) 4x + 5y + 6z = 0 , (2) 4x + 5y + 6z = 9 .
   
7x + 8y + 9z = 0 7x + 8y + 9z = 15
   

Sea U el conjunto de todas las soluciones de (1) y W el conjunto de todas las soluciones de
(2). Note que se pueden ver como subconjuntos de R3 .

(a) Demuestre que U es un subespacio de R3 y descrı́balo geométricamente.


(b) Demuestre que W no es un subespacio de R3 .
(c) Demuestre que W es un subespacio afı́n de R3 y descrı́balo geométricamente.

     
1 −2 2 3
12. (a) Sean v1 = , v2 = ∈ R . Escriba v = como combinación lineal de v1 y
2 5 0

v2 .
     
1 1 1
(b) ¿Es v = 2 combinación lineal de v1 = 7 , v2 = 5?
5 2 2
 
13 −5
(c) ¿Es A = combinación lineal de
50 8
       
−1
1 0 0 1 2 1 1
A1 = , A2 = , A3 = , A4 = ?
2 2 −2 2 5 0 5 2

     
1 2 3
13. (a) ¿Los vectores v1 = 2 , v2 = 2 , v3 = 0 son linealmente independientes en R3 ?
3 5 1
     
1 1 1
(b) ¿Los vectores v1 = −2 , v2 = 7 , v3 = 5 son linealmente independientes en

2 2 2
R3 ?
(c) ¿Los vectores p1 = X 2 − X + 2, p2 = X + 3, p3 = X 2 − 1 son linealmente independientes
en P2 ? Son linealmente independientes en Pn para n ≥ 3?
     
1 3 1 1 7 3 1 −1 0
(d) ¿Los vectores A1 = , A2 = , A3 = son lineal-
−2 2 3 2 −1 2 5 2 8
mente independientes en M (2 × 3)?

~ ∈ Rn . Suponga que w
14. Sean ~v1 , . . . , ~vk , w ~ 6= ~0 y que w
~ ∈ Rn es ortogonal a todos los vectores
~vj . Demuestre que w ~∈/ gen{~v1 , . . . , ~vm }. ¿Se sigue que el sistema w,
~ ~v1 , . . . , ~vm es linealmente
independiente?


15. Determine
  si gen{a
1 } =
, a2 , a3 , a4 , v3 }para
gen{v1 , v2      
0 1 1 2 5 1 1
a1 = 1 , a2 = 0 , a3 =  2  , a4 =  1  , v1 = −3 , v2 = 1 , v3 = −1.
5 3 13 11 0 8 −2

16. (a) ¿Las siguientes matrices generan el espacio de todas las matrices simétricas 2 × 2?
     
2 0 13 0 0 3
A1 = , A2 = , A3 = ,
0 7 0 5 3 0

Si no lo hacen, encuentre un M ∈ Msym (2 × 2) \ span{A1 , A2 , A3 }.


(b) ¿Las siguientes matrices generan el espacio de todas las matrices simétricas 2 × 2?
     
2 0 13 0 0 3
B1 = , B2 = , B3 = ,
0 7 0 5 −3 0

(c) ¿Las siguientes matrices generan el espacio de las matrices triangulares superiores 2 × 2?
     
6 0 0 3 10 −7
C1 = , C2 = , C3 = .
0 7 0 5 0 0

Si no, encuentre una matriz M triangular superior que no pertence a span{C1 , C2 , C3 }.

17. Sea n ∈ N y sea V el conjunto de las matrices simétricas n × n con la suma y producto con
λ ∈ R usual.

(a) Demuestre que V es un espacio vectorial sobre R.


(b) Encuentre matrices que generan V . ¿Cuál es el número mı́nimo de matrices que se
necesitan para generar V ?

18. Determine si los siguientes conjuntos de vectores son bases del espacio vectorial indicado.
   
1 −2

(a) v1 = , v2 = ; R2 .
2 5
       
1 3 5 3 0 1 2 1
(b) A = , B= , C= , D= ; M (2 × 2).
2 1 1 2 −2 2 5 0
(c) p1 = 1 + x, p2 = x + x2 , p3 = x2 + x3 , p4 = 1 + x + x2 + x3 ; P3 .

19. (a) Es F el plano dado por F : 2x − 5y + 3z = 0. Demuestre que F es subespacio de R3 y


~ ∈ R3 tal que F = gen{~u, w}.
encuentre vectores ~u y w ~
   
1 −5
(b) Sean v1 = 7 , v2 =  1  ∈ R3 . Sea E el plano E = gen{v1 , v2 }. Escriba E en la
3 2
forma E : ax + by + cz = d.


(c) Encuentre un vector w ∈ R3 , distinto de v1 y v2 , tal que gen{v1 , v2 , w} = E.


(d) Encuentre un vector v3 ∈ R3 tal que gen{v1 , v2 , v3 } = R3 .

20. (a) Encuentre una base para el plano E : x − 2y + 3z = 0 in R3 .


(b) Complete la base encontrada en (i) a una base de R3 .

21. Sea F := {(x1 , x2 , x3 , x4 )t : 2x1 − x2 + 4x3 + x4 = 0}.

(a) Demuestre que F es un subespacio de R4


(b) Encuentre una base para F y calcule dim F .
(c) Complete la base encontrada en (ii) a una base de R4 .

22. Sea G := {(x1 , x2 , x3 , x4 )t : 2x1 − x2 + 4x3 + x4 = 0, x1 − x2 + x3 + 2x4 = 0}.

(a) Demuestre que G es un subespacio de R4
(b) Encuentre una base para G y calcule dim G.
(c) Complete la base encontrada en (ii) a una base de R4 .

         
1 0 4 2 1
23. Sean v1 = 2 , v2 = 4 , v3 = 2 , v4 = 8 , v5 = 0.
3 1 5 3 1
Determine si estos vectoren generan el espacio R3 . Si lo hacen, escoja una base de R3 de los
vectores dados.
       
6 0 6 3 6 −3 12 −9
24. Sean C1 = , C2 = , C3 = , C4 = .
0 7 0 12 0 2 0 −1
Determine si estas matrices generan el espacio de las matrices triangulares superiores 2 × 2.
Si lo hacen, escoja una base de las matrices dadas.

25. Sean p1 = x2 + 7, p2 = x + 1, p3 = 3x3 + 7x. Determine si los polinomios p1 , p2 , p3 son


linealmente independientes. Si lo son, complételos a una base en P3 .

26. Para los siguientes conjuntos, determine si son espacios vectoriales. Si lo son, calcule su
dimensión.

(a) M1 = {A ∈ M (n × n) : A es triangular superior}.


(b) M2 = {A ∈ M (n × n) : A tiene ceros en la diagonal}.
(c) M3 = {A ∈ M (n × n) : At = −A}.
(d) M4 = {p ∈ P5 : p(0) = 0}.


27. Para los siguientes sistemas de vectores en el espacio vectorial V , determine la dimensión del
espacio vectorial generado por ellos y escoja un subsistema de ellos que es base del espacio
vectorial generado por los vectores dados. Complete este subsistema a una base de V .
     
1 3 3
(a) V = R3 , ~v1 = 2 , ~v2 = 2 , ~v3 = 2.
3 7 1
(b) V = P4 , p1 = x3 + x, p2 = x3 − x2 + 3x, p3 = x2 + 2x − 5, p4 = x3 + 3x + 2.
       
1 4 3 0 0 12 9 −12
(c) V = M (2 × 2), A = , B= , C= , D= .
−2 5 1 4 −7 11 10 1

28. Sea V un espacio vectorial. Falso o verdadero?


(a) Suponga v1 , . . . , vk , u, z ∈ V tal que z es combinación lineal de los v1 , . . . , vk . Entonces
que z es combinación lineal de v1 , . . . , vk , u.
(b) Si u es combinación lineal de v1 , . . . , vk ∈ V , entonces v1 , . . . , vk , u es un sistema de

vectores linealmente dependientes.
(c) Si v1 , . . . , vk ∈ V es un sistema de vectores linealmente dependientes, entonces v1 es
combinación lineal de los v2 , . . . , vk .

29. (a) ¿Es Cn un espacio vectorial sobre R?


(b) ¿Es Cn un espacio vectorial sobre Q?
(c) ¿Es Rn un espacio vectorial sobre C?
(d) ¿Es Rn un espacio vectorial sobre Q?
(e) ¿Es Qn un espacio vectorial sobre R?
(f) ¿Es Qn un espacio vectorial sobre C?

30. Sea V un espacio vectorial y sean U, W ⊆ V subespacios.

(a) Demuestre que U ∩ W es un subespacio.



(b) Demuestre que dim U + W = dim U + dim V − dim(U ∩ W ).


(c) Suponga que U ∩ W = {0}. Demuestre que dim U ⊕ W = dim U + dim V .

Chapter 6

Linear transformations and change of bases

In the first section of this chapter we will define linear maps between vector spaces and discuss their
properties. These are functions which “behave well” with respect to the vector space structure. For
example, m × n matrices can be viewed as linear maps from Rn to Rm . We will prove the so-called
dimension formula for linear maps. In Section 6.2 we will study the special case of matrices. One of
the main results will be the dimension formula (6.4). In Section 6.4 we will see that, after choice of
a basis, every linear map between finite dimensional vector spaces can be represented as a matrix.
This will allow us to carry over results on matrices to the case of linear transformations.
As in previous chapters, we work with vector spaces over R or C. Recall that K always stands for
either R or C.

6.1 Linear maps


Definition 6.1. Let U, V be vector spaces over the same field K. A function T : U → V is called
a linear map if for all x, y ∈ U and λ ∈ K the following is true:
T (x + y) = T x + T y, T (λx) = λT x. (6.1)

Other words for linear map are linear function, linear transformation or linear operator.

Remark. Note that very often one writes T x instead of T (x) when T is a linear function.

Remark 6.2. (i) Clearly, (6.1) is equivalent to


T (x + λy) = T x + λT y for all x, y ∈ U and λ ∈ K. (6.1′)

(ii) It follows immediately from the definition that


T (λ1 v1 + · · · + λk vk ) = λ1 T v1 + · · · + λk T vk
for all v1 , . . . , vk ∈ U and λ1 , . . . , λk ∈ K.

223
224 6.1. Linear maps

(iii) The condition (6.1) says that a linear map respects the vector space structures of its
domain and its target space.
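For concrete maps the condition (6.1′) can be checked numerically. The following is a minimal sketch (assuming Python with NumPy, which is not part of these notes); it tests T (x + λy) = T x + λT y for the linear map induced by a matrix on a few random vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])      # induces a map T : R^3 -> R^2, T x = A x

def T(x):
    return A @ x

for _ in range(5):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    lam = rng.standard_normal()
    # condition (6.1'): T(x + lam*y) = T(x) + lam*T(y)
    assert np.allclose(T(x + lam * y), T(x) + lam * T(y))
print("condition (6.1') holds on all sampled vectors")
```

Of course such a test does not prove linearity; it only fails to falsify it on the sampled vectors.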

Exercise 6.3. Let U, V be vector spaces over K (with K = R or K = C). Let us denote the set
of all linear maps from U to V by L(U, V ). Show that L(U, V ) is a vector space over K. That
means you have to show that the sum of two linear maps is a linear map, that a scalar multiple
of a linear map is a linear map and that the vector space axioms hold.

Examples 6.4 (Linear maps). (a) Every matrix A ∈ M (m × n) can be identified with a linear
map Rn → Rm .

(b) Differentiation is a linear map, for example:

(i) Let C(R) be the space of all continuous functions and C 1 (R) the space of all continuously
differentiable functions. Then

T : C 1 (R) → C(R), Tf = f0

is a linear map.

Proof. First of all note that f 0 ∈ C(R) if f ∈ C 1 (R), so the map T is well-defined. Now
we want to see that it is linear. So we take f, g ∈ C 1 (R) and λ ∈ R. We find

T (λf + g) = (λf + g)0 = (λf )0 + g 0 = λf 0 + g 0 = λT f + T g.

(ii) The following maps are linear, too. Note that their action is the same as the one of T
above, but we changed the vector spaces where it acts on.

R : Pn → Pn−1 , Rf = f 0 , S : P n → Pn , Sf = f 0 .

(c) Integration is a linear map. For example:


I : C([0, 1]) → C([0, 1]),   f ↦ I f   where   (I f )(x) = ∫_0^x f (t) dt.

Proof. Clearly I is well-defined since the integral of a continuous function is again continuous.
In order to show that I is linear, we fix f, g ∈ C([0, 1]) and λ ∈ R. We find for every x ∈ [0, 1]:

(I(λf + g))(x) = ∫_0^x (λf + g)(t) dt = ∫_0^x λf (t) + g(t) dt = λ ∫_0^x f (t) dt + ∫_0^x g(t) dt
               = λ(I f )(x) + (I g)(x).

Since this is true for every x, it follows that I(λf + g) = λ(If ) + (Ig).

(d) As an example for a linear map from M (n × n) to itself, we consider

T : M (n × n) → M (n × n), T (A) = A + At .


Proof that T is a linear map. Let A, B ∈ M (n × n) and let c ∈ R. Then

T (A + cB) = (A + cB) + (A + cB)t = A + cB + At + (cB)t = A + cB + At + cB t


= A + At + c(B + B t ) = T (A) + cT (B).

(e) A first non-example of a linear transformation. Let

T : R2 → R,   T (x, y)t = xy.

Then T (1, 0)t = T (0, 1)t = 0, but T ((1, 0)t + (0, 1)t ) = T (1, 1)t = 1 ≠ T (1, 0)t + T (0, 1)t = 0.
(f) A second non-example of a linear transformation. Let T : R3 → R2 be given by

T (x, y, z)t = (x + 3z, 2|y|)t .

Then T (0, 1, 0)t = (0, 2)t , but T (−3 · (0, 1, 0)t ) = T (0, −3, 0)t = (0, 6)t ≠ (0, −6)t = −3 T (0, 1, 0)t .
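The failure of linearity in non-example (e) can also be seen by evaluating the map at a few points. A short sketch (assuming NumPy; the function name T_e is ours):

```python
import numpy as np

def T_e(v):                     # the map from (e): T(x, y)^t = x*y
    x, y = v
    return x * y

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(T_e(e1) + T_e(e2))        # 0.0
print(T_e(e1 + e2))             # 1.0  -> additivity fails, so T is not linear
```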

The next lemma shows that a linear map always maps the zero vector to the zero vector.

Lemma 6.5. If T is a linear map, then T O = O.


Proof. T O = T (O − O) = T O − T O = O.

Definition 6.6. Let T : U → V be a linear map.


(i) T is called injective (or one-to-one) if

x, y ∈ U, x ≠ y =⇒ T x ≠ T y.

(ii) T is called surjective if for all v ∈ V there exists at least one x ∈ U such that T x = v.
(iii) T is called bijective if it is injective and surjective.
(iv) The kernel of T (or null space of T ) is

ker(T ) := {x ∈ U : T x = O}.

Sometimes the notations N (T ) or NT are used for ker(T ).


(v) The image of T (or range of T ) is

Im(T ) := {y ∈ V : y = T x for some x ∈ U }.

Sometimes the notations Rg(T ) or R(T ) or T (U ) are used for Im(T ).


Remark 6.7. (i) Observe that ker(T ) is a subset of U , Im(T ) is a subset of V . In Proposi-
tion 6.11 we will show that they are even subspaces.

(ii) Clearly, T is injective if and only if for all x, y ∈ U the following is true:

Tx = Ty =⇒ x = y.

(iii) If T is a linear injective map, then its inverse T −1 : Im(T ) → U exists and is linear too.

Exercise 6.8. Let U, V, W be vector spaces over K (with K = R or K = C).

• Suppose that T : U → V and S : V → W are linear functions. Show that their composition
ST : U → W is a linear function too.
When you compare Im(ST ) and Im(S), what can you conclude?
When you compare ker(ST ) and ker(S), what can you conclude?
• Suppose that T : U → V is an invertible linear function so that we can define its inverse
function T −1 : Im(T ) → U . Show that it is a linear function too.

The following lemma is very useful.

Lemma 6.9. Let T : U → V be a linear map.

(i) T is injective if and only if ker(T ) = {O}.


(ii) T is surjective if and only if Im(T ) = V .
Proof. (i) From Lemma 6.5, we know that O ∈ ker(T ). Assume that T is injective. Then ker(T )
cannot contain any other element, hence ker(T ) = {O}.
Now assume that ker(T ) = {O} and let x, y ∈ U with T x = T y. By Remark 6.7 it is sufficient to
show that x = y. By assumption, O = T x − T y = T (x − y), hence x − y ∈ ker(T ) = {O}. Therefore
x − y = O, which means that x = y.
(ii) follows directly from the definitions of surjectivity and the image of a linear map.

Examples 6.10 (Kernels and ranges of the linear maps from Examples 6.4).

(a) We will discuss the case of matrices at the beginning of Section 6.2.

(b) If T : C¹(R) → C(R), T f = f ′ , then it is easy to see that the kernel of T consists exactly
of the constant functions. Moreover, T is surjective because every continuous function is the
derivative of some continuously differentiable function: for every f ∈ C(R) we can set
g(x) = ∫_0^x f (t) dt. Then g ∈ C¹(R) and T g = g ′ = f , which shows that Im(T ) = C(R).

(c) For the integration operator in Example 6.4(c) we have that ker(I) = {0} and
Im(I) = {g ∈ C¹([0, 1]) : g(0) = 0}. In particular, I is injective but not surjective.

Proof. First we prove the claim about the range of I. Suppose that g ∈ Im(I). Then g is
of the form g(x) = ∫_0^x f (t) dt for some f ∈ C([0, 1]). By the fundamental theorem of calculus,
it follows that g ∈ C¹([0, 1]) and g(0) = 0. To show the other inclusion, let g ∈ C¹([0, 1])
with g(0) = 0. Then g ′ ∈ C([0, 1]) and, again by the fundamental theorem of calculus, we have
g(x) = g(0) + ∫_0^x g ′ (t) dt = ∫_0^x g ′ (t) dt, so g = I(g ′ ) ∈ Im(I). Since Im(I) is a proper
subset of C([0, 1]) (for example, the constant function 1 does not belong to it), I is not surjective.

Now assume that Ig = 0. If we differentiate, we find that 0 = (Ig)′ (x) = (d/dx) ∫_0^x g(t) dt = g(x)
for all x ∈ [0, 1], therefore g ≡ 0, hence ker(I) = {0} and I is injective.

(d) Let T : M (n × n) → M (n × n), T (A) = A + At . Then ker T = Masym (n × n) (= the space


of all antisymmetric n × n matrices) and Im T = Msym (n × n) (= the space of all symmetric
n × n matrices).

Proof. First we prove the claim about the range of T . Clearly, Im(T ) ⊆ Msym (n × n) because
for every A ∈ M (n × n) we have that T (A) is symmetric because (T (A))t = (A + At )t =
At + (At )t = At + A = T (A). To prove Msym (n × n) ⊆ Im(T ) we take some B ∈ Msym (n × n).
Then T ((1/2)B) = (1/2)B + ((1/2)B)t = (1/2)B + (1/2)B t = (1/2)B + (1/2)B = B where we used that B is symmetric. In summary
we showed that Im(T ) = Msym (n × n).
The claim on the kernel of T follows from

A ∈ ker T ⇐⇒ T (A) = 0 ⇐⇒ A+At = 0 ⇐⇒ A = −At ⇐⇒ A ∈ Masym (n×n).

Proposition 6.11. Let T : U → V be a linear map. Then


(i) ker(T ) is a subspace of U .
(ii) Im(T ) is a subspace of V .

Proof. (i) By Lemma 6.5, O ∈ ker(T ). Let x, y ∈ ker(T ) and λ ∈ K. Then x + λy ∈ ker(T ) because

T (x + λy) = T x + λT y = O + λO = O.

Hence ker(T ) is a subspace of U by Proposition 5.10.
(ii) Clearly, O ∈ Im(T ). Let v, w ∈ Im(T ) and λ ∈ K. Then there exist x, y ∈ U such that
T x = v and T y = w. Then v + λw = T x + λT y = T (x + λy) ∈ Im(T ).
Therefore Im(T ) is a subspace of V by Proposition 5.10.
Since we now know that ker(T ) and Im(T ) are subspaces, the following definition makes sense.

Definition 6.12. Let T : U → V be a linear map. We define

dim(ker(T )) = nullity of T, dim(Im(T )) = rank of T.

Sometimes the notations ν(T ) = dim(ker(T )) and ρ(T ) = dim(Im(T )) are used.

Example. Let T : P3 → P3 be defined by T p = p′ . Then Im(T ) = {q ∈ P3 : deg q ≤ 2} = P2 and
ker(T ) = {constant polynomials} = P0 . In particular dim(Im(T )) = 3 and dim(ker(T )) = 1.
Proof. • First we show the claim about the image of T . We know that differentiation lowers
the degree of a polynomial by 1. Hence Im(T ) ⊆ {q ∈ P3 : deg q ≤ 2}. On the other hand,
we know that every polynomial of degree ≤ 2 is the derivative of a polynomial of degree ≤ 3.
So the claim follows.


• Next we show the claim about the kernel of T . Recall that ker(T ) = {p ∈ P3 : T p = 0}. So
the kernel of T consists exactly of those polynomials whose first derivative is 0. These are exactly
the constant polynomials.
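Identifying P3 with R4 via the coefficients with respect to the basis 1, x, x², x³, the map T p = p′ becomes a 4 × 4 matrix, and its rank and nullity can be computed numerically. A sketch (assuming NumPy; the identification is our choice and not part of the example above):

```python
import numpy as np

# column j is the coefficient vector of T applied to the j-th basis polynomial:
# T(1) = 0, T(x) = 1, T(x^2) = 2x, T(x^3) = 3x^2
D = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])

rank = np.linalg.matrix_rank(D)        # dim(Im(T))
nullity = D.shape[1] - rank            # dim(ker(T))
print(rank, nullity)                   # 3 1
```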

Lemma 6.13. Let T : U → V be a linear map between two vector spaces U, V and let {u1 , . . . , uk }
be a basis of U . Then Im T = span{T u1 , . . . , T uk }.

Proof. Clearly, T u1 , . . . , T uk ∈ Im T . Since the image of T is a vector space, all linear combinations
of these vectors must belong to Im T too which shows span{T u1 , . . . , T uk } ⊆ Im T . To show the
other inclusion, let y ∈ Im T . Then there is an x ∈ U such that y = T x. Let us express x as linear
combination of the vectors of the basis: x = α1 u1 + · · · + αk uk . Then we obtain

y = T x = T (α1 u1 + · · · + αk uk ) = α1 T u1 + · · · + αk T uk ∈ span{T u1 , . . . , T uk }.

Since y was arbitrary in Im T , we conclude that Im T ⊆ span{T u1 , . . . , T uk }. So in summary we


proved the claim.

Proposition 6.14. Let U, V be K-vector spaces, T : U → V a linear map. Let x1 , . . . , xk ∈ U and
set y1 := T x1 , . . . , yk := T xk . Then the following is true.
(i) If the x1 , . . . , xk are linearly dependent, then y1 , . . . , yk are linearly dependent too.
(ii) If the y1 , . . . , yk are linearly independent, then x1 , . . . , xk are linearly independent too.
(iii) Suppose additionally that T is invertible. Then x1 , . . . , xk are linearly independent if and only
if y1 , . . . , yk are linearly independent.
In general the implication “If x1 , . . . , xk are linearly independent, then y1 , . . . , yk are linearly
independent.” is false. Can you give an example?
Proof of Proposition 6.14. (i) Assume that the vectors x1 , . . . , xk are linearly dependent. Then
there exist λ1 , . . . , λk ∈ K such that λ1 x1 + · · · + λk xk = O and at least one λj 6= 0. But then

O = T O = T (λ1 x1 + · · · + λk xk ) = λ1 T x1 + · · · + λk T xk
= λ1 y1 + · · · + λk yk ,

hence the vectors y1 , . . . , yk are linearly dependent.


(ii) follows directly from (i).
(iii) Suppose that the vectors y1 , . . . , yk are linearly independent. Then so are the x1 , . . . , xk by
(ii). Now suppose that x1 , . . . , xk are linearly independent. Note that T is invertible, so T −1
exists. Therefore we can apply (ii) to T −1 in order to conclude that the system y1 , . . . , yk is linearly
independent. (Note that xj = T −1 yj .)

Exercise 6.15. Assume that T : U → V is an injective linear map and suppose that {u1 , . . . , u` }
is a set of linearly independent vectors in U . Show that {T u1 , . . . , T u` } is a set of linearly
independent vectors in V .

The following proposition is very useful and it is used in the proof of the dimension formula (Theorem 6.20).


Proposition 6.16. Let U, V be K-vector spaces with dim U = k < ∞.

(i) If T : U → V is a linear transformation, then dim Im(T ) ≤ dim U .


(ii) If T : U → V is an injective linear transformation, then dim Im(T ) = dim U .
(iii) If T : U → V is a bijective linear transformation, then dim U = dim V .

Proof. Let u1 , . . . , uk be a basis of U .

(i) From Lemma 6.13 we know that Im T = span{T u1 , . . . , T uk }. Therefore dim Im T ≤ k = dim U
by Theorem 5.47.

(ii) Assume that T is injective. We will show that T u1 , . . . , T uk are linearly independent. Let
α1 , . . . , αk ∈ K such that α1 T u1 + · · · + αk T uk = O. Then

O = α1 T u1 + · · · + αk T uk = T (α1 u1 + · · · + αk uk ).

Since T is injective, it follows that α1 u1 + · · · + αk uk = O, hence α1 = · · · = αk = 0 which
shows that the vectors T u1 , . . . , T uk are indeed linearly independent. Therefore they are a basis
of span{T u1 , . . . , T uk } = Im T and we conclude that dim Im T = k = dim U .

(iii) Since T is bijective, it is surjective and injective. Surjectivity means that Im T = V and
injectivity of T implies that dim Im T = dim U by (ii). In conclusion,

dim U = dim Im T = dim V.


The previous proposition tells us for example that there is no injective linear map from R5 to R3 ; or
that there is no surjective linear map from R3 to M (2 × 2).

Remark 6.17. Proposition 6.16 is true also for dim U = ∞. In this case, (i) clearly holds whatever
dim Im(T ) may be. To prove (ii) we need to show that dim Im(T ) = ∞ if T is injective. Note that
for every n ∈ N we can find a subspace Un of U with dim Un = n and we define Tn to be the
restriction of T to Un , that is, Tn : Un → V . Since the restriction of an injective map is injective,
it follows from (ii) that dim Im(Tn ) = n. On the other hand, Im(Tn ) is a subspace of V , therefore

dim V ≥ dim Im(Tn ) = n by Theorem 5.54 and Remark 5.55. Since this is true for any n ∈ N, it
follows that dim V = ∞. The proof of (iii) is the same as in the finite dimensional case.

Theorem 6.18. Let U, V be K-vector spaces and T : U → V a linear map. Moreover, let E : U →
U , F : V → V be linear bijective maps. Then the following is true:

(i) Im(T ) = Im(T E), in particular dim(Im(T )) = dim(Im(T E)).

(ii) ker(T E) = E −1 (ker(T )) and dim(ker(T )) = dim(ker(T E)).

(iii) ker(T ) = ker(F T ), in particular dim(ker(T )) = dim(ker(F T )).

(iv) Im(F T ) = F (Im(T )) and dim(Im(T )) = dim(Im(F T )).


In summary we have

ker(F T ) = ker(T ),    ker(T E) = E −1 (ker(T )),
Im(F T ) = F (Im(T )),    Im(T E) = Im(T ).        (6.2)

and

dim ker(T ) = dim ker(F T ) = dim ker(T E) = dim ker(F T E),
dim Im(T ) = dim Im(F T ) = dim Im(T E) = dim Im(F T E).        (6.3)

Proof. (i) Let v ∈ V . If v ∈ Im(T ), then there exists x ∈ U such that T x = v. Set y = E −1 x.
Then v = T x = T EE −1 x = T Ey ∈ Im(T E). On the other hand, if v ∈ Im(T E), then there exists
y ∈ U such that T Ey = v. Set x = Ey. Then v = T Ey = T x ∈ Im(T ).

(ii) To show ker(T E) = E −1 ker(T ) observe that

ker(T E) = {x ∈ U : Ex ∈ ker(T )} = {E −1 u : u ∈ ker(T )} = E −1 (ker(T )).

It follows that
E −1 : ker T → ker(T E)
is a linear bijection and therefore dim ker(T ) = dim ker(T E) by Proposition 6.16(iii) (or Remark 6.17 in
the infinite dimensional case) with E −1 as T , ker(T ) as U and ker(T E) as V .
(iii) Let x ∈ U . Then x ∈ ker(F T ) if and only if F T x = O. Since F is injective, we know that
ker(F ) = {O}, hence it follows that T x = O. But this is equivalent to x ∈ ker(T ).

(iv) To show Im(F T ) = F Im(T ) observe that

Im(F T ) = {y ∈ V : y = F T x for some x ∈ U } = {F v : v ∈ Im(T )} = F (Im(T )).

It follows that
F : Im T → Im(F T )

is a linear bijection and therefore dim Im(T ) = dim Im(F T ) by Proposition 6.16(iii) (or Remark 6.17 in
the infinite dimensional case) with F as T , Im(T ) as U and Im(F T ) as V .
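For matrices, the dimension statements (6.3) are easy to test numerically: composing with invertible matrices changes neither the rank nor the nullity. A minimal sketch (assuming NumPy; the random matrices playing the roles of E and F are invertible with probability 1, which we do not check here):

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.integers(-3, 4, size=(4, 5)).astype(float)   # a linear map R^5 -> R^4
E = rng.standard_normal((5, 5))                      # plays the role of E (invertible)
F = rng.standard_normal((4, 4))                      # plays the role of F (invertible)

rank = np.linalg.matrix_rank
assert rank(F @ T) == rank(T @ E) == rank(F @ T @ E) == rank(T)
# the nullity is (number of columns) - rank, so the nullities coincide as well
print(rank(T), T.shape[1] - rank(T))
```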

Remark 6.19. In general, ker(T ) = ker(T E) and Im(T ) = Im(F T ) are false. Take for example
U = V = R2 ,

T = ( 1 0 )        E = F = ( 0 1 )
    ( 0 0 ) ,              ( 1 0 ) .

Then clearly the hypotheses of the theorem are satisfied and

ker(T ) = span{(0, 1)t },    Im(T ) = span{(1, 0)t },

but

ker(T E) = span{(1, 0)t },    Im(F T ) = span{(0, 1)t }.


Draw a picture to visualise the example above, taking into account that T represents the projection
onto the x-axis and E and F represent the reflection across the line y = x.

We end this section with one of the main theorems of linear algebra. In the next section we will
re-prove it for the special case when T is given by a matrix in Theorem 6.33. The theorem below
can be considered a coordinate free version of Theorem 6.33.

Theorem 6.20. Let U, V be vector spaces with dim U = n < ∞ and let T : U → V be a linear
map. Then
dim(ker(T )) + dim(Im(T )) = n. (6.4)

Proof. Let k = dim(ker(T )) and let {u1 , . . . , uk } be a basis of ker(T ). We complete it to a basis
{u1 , . . . , uk , wk+1 , . . . , wn } of U and we set W := span{wk+1 , . . . , wn }. Note that by construction
ker(T ) ∩ W = {O}. (Prove this!) Let us consider T̃ = T |W , the restriction of T to W .
It follows that T̃ is injective because if T̃ x = O for some x ∈ W then also T x = T̃ x = O, hence
x ∈ ker(T ) ∩ W = {O}. It follows from Proposition 6.16(ii) that

dim Im T̃ = dim W = n − k.

To complete the proof, it suffices to show that Im T̃ = Im T . Recall that by Lemma 6.13, the range
of a linear map is generated by the images of a basis of the initial vector space. Therefore we find that

Im T = span{T u1 , . . . , T uk , T wk+1 , . . . , T wn } = span{T wk+1 , . . . , T wn }
     = span{T̃ wk+1 , . . . , T̃ wn }                                        (6.5)
     = Im T̃

where in the second step we used that T u1 = · · · = T uk = O and therefore they do not contribute
to the linear span, and in the third step we used that T wj = T̃ wj for j = k + 1, . . . , n. So we
showed that Im T̃ = Im T ; in particular their dimensions are equal and the claim follows because,
recalling that k = dim ker(T ),

n = dim Im T̃ + k = dim Im T + dim ker T.

Note that an alternative way to prove the theorem above is to first prove Theorem 6.33 for matrices
and then use the results on representations of linear maps in Section 6.4 to conclude formula (6.4).
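As an illustration of (6.4), one can compute the rank and nullity of the map T (A) = A + At from Examples 6.4(d) and 6.10(d) by writing down its matrix with respect to the standard basis of M (n × n). The sketch below (assuming NumPy) identifies M (n × n) with R^{n²} by flattening matrices row by row; this identification is our choice and not part of the notes.

```python
import numpy as np

n = 4
dim = n * n
L = np.zeros((dim, dim))
for j in range(dim):
    E = np.zeros((n, n))
    E.flat[j] = 1.0                 # j-th standard basis matrix of M(n x n)
    L[:, j] = (E + E.T).ravel()     # column j of L is vec(T(E_j))

rank = np.linalg.matrix_rank(L)     # dim Im(T) = n(n+1)/2 (symmetric matrices)
nullity = dim - rank                # dim ker(T) = n(n-1)/2 (antisymmetric matrices)
print(rank, nullity, rank + nullity == dim)   # 10 6 True
```

This matches the dimensions of Msym (n × n) and Masym (n × n) listed in the summary of Chapter 5.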

You should now have understood

• what a linear map is and why they are the natural maps to consider on vector spaces,
• what injectivity, surjectivity and bijectivity means,
• what the kernel and image of a linear map is,
• why the dimension formula (6.4) is true,
• etc.


You should now be able to


• give examples of linear maps,
• check if a given function is a linear map,
• find bases and the dimension of kernels and ranges of a given linear map,
• etc.

Ejercicios.
De los ejercicios 1 al 14 determinar si la función dada es una transformación lineal. Si lo es
demuéstrelo, en caso contrario dé un ejemplo donde no se cumpla la linealidad.
 
x  
x + 2y
 y 

1. T : R4 → R3 , T  z  = 3y − 5z .

w
w

 
x  
3 2 x
2. T : R → R , T y =  .
1
z
   
x y
3. T : R2 → R2 , T = .
y x2
   
x y
4. T : R2 → R2 , T = .
y 0
 
x
 y
5. T : R4 → R, T   z  = |w|.

w
 
x  
x − y + 2z + 3w
 y 
6. T : R4 → R3 , T    y + 4z + 3w .
z  =

x + 6z + 6w
w
Rx
7. T : P3 → P4 , T (p) = 0 p(t)dt.
 
a
8. T : R3 → P3 , T  b  = (a − 3b)x3 + (b + 2c).
c

9. T : M (3 × 3) → M (3 × 3), T (A) = At − A.
 
2 4
10. T : M (2 × 2) → M (2 × 2), T (A) = A .
0 1

11. Sea g ∈ C 1 (R) y T : C 1 (R) → C(R) dada por: T (f ) = (gf )0 .


 
a11 a12 a13
12. T : M (3 × 3) → R, T a21 a22 a23  = a11 + a22 + a33 .
a31 a32 a33
13. Sea ~x0 ∈ Rn y T : M (m × n) → Rm dada por T (A) = A~x0 .
14. T : M (n × n) → R, T (A) = det A.
15. De los ejercicios anteriores, (salvo el ejercicio 11.) determine Im T , ker T y sus dimensiones.
~ ∈ Rn un vector no nulo y T : Rn → R dada por T (~x) = h~x , wi.
16. Sea w ~ Demuestre que T es
una transformación lineal y determine las dimensiones de ker T e Im T .
17. Sea w~ ∈ Rn un vector no nulo y T : Rn → Rn dada por T (~x) = projw~ ~x. Demuestre que T es
lineal, encuentre Im T , ker T y sus dimensiones.
 
1
18. Sea T : R3 → R3 dada por T (~x) = ~x × 2, ¿T es una transformación lineal? En caso
3

afirmativo encuentre ker T e Im T y sus dimensiones. (Hint: Sale muy sencillo si lo piensa
geométricamente).
19. Sea T : R2 → R3 una transformación lineal tal que:
   
  1   −3
1 3
T = 0 , T =  0 ,
−1 2
3 −9
   
7 x
encontrar T y aún más general, determinar T . ¿T es inyectiva?. ¿Cómo cambia la
11 y
 
  0
3
respuesta si T = 1?
2
0

20. ¿Existe una transformación lineal T : R3 → R4 tal que


     
  1   2   3
1 2 1

 0  4  4
T −1 =   1 , T −3 = −1 ,
     T −14 = 
 0?

2 1 8
−1 −3 −5

6.2 Matrices as linear maps


In this section, we work mostly with real vector spaces for definiteness sake. However, all the
statements are also true for complex vector spaces. We only have to replace everywhere R by C
and the word real by complex.

Let A ∈ M (m × n). We already know that we can view A as a linear map from Rn to Rm . Hence
ker(A) and Im(A) and the terms injectivity and surjectivity are defined.


Strictly speaking, we should distinguish between a matrix and the linear map induced by it. So
we should write TA : Rn → Rm for the map x 7→ Ax. The reason is that if we view A directly
as a linear map then this implies that we tacitly have already chosen a basis in Rn and Rm , see
Section 6.4 for more on that. However, we will usually abuse notation and write A instead of TA .
If we view a matrix A as a linear map and at the same time as a linear system of equations, then
we obtain the following.

Remark 6.21. Let A ∈ M (m × n) and denote the columns of A by ~a1 , . . . , ~an ∈ Rm . Then the
following is true.

(i) ker(A) = all solutions ~x of the homogeneous system A~x = ~0.


(ii) Im(A) = all vectors ~b such that the system A~x = ~b has a solution
= span{~a1 , . . . , ~an }.

Consequently,

(iii) A is injective ⇐⇒ ker(A) = {~0}
⇐⇒ the homogenous system A~x = ~0 has only the trivial solution ~x = ~0.
(iv) A is surjective ⇐⇒ Im(A) = Rm
⇐⇒ for every ~b ∈ Rm , the system A~x = ~b has at least one solution.

Proof. All claims should be clear except maybe the second equality in (ii). This follows from

Im A = {A~x : ~x ∈ Rn } = {(~a1 | . . . |~an )(x1 , . . . , xn )t : (x1 , . . . , xn )t ∈ Rn }
     = {x1~a1 + · · · + xn~an : x1 , . . . , xn ∈ R}
     = span{~a1 , . . . , ~an },

see also Remark 3.19.
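Remark 6.21(ii) and (iv) can be tested numerically: a vector ~b lies in Im(A) exactly when appending ~b as an extra column does not increase the rank. A small sketch (assuming NumPy; the matrix and the vectors are arbitrary examples):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0]])          # Im(A) = span of the two columns, a plane in R^3

def in_image(A, b):
    return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

print(in_image(A, A @ np.array([3.0, -1.0])))   # True:  b is a combination of the columns
print(in_image(A, np.array([1.0, 0.0, 0.0])))   # False: A x = b has no solution
```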

To practice a bit, we prove the following two remarks in two ways.



Remark 6.22. Let A ∈ M (m × n). If m > n, then A cannot be surjective.

Proof with Gauß-Jordan. Let A0 be the row reduced echelon form of A. Then there must be an
invertible matrix E such that A = EA0 , and the last row of A0 must be zero because A0 can have
at most n pivots. But then (A0 |~em ) is inconsistent, which means that (A|E −1~em ) is inconsistent.
Hence E −1~em ∉ Im A, so A cannot be surjective. (Basically we say that clearly A0 is not surjective
because we can easily find a right-hand side ~b0 such that A0 ~x = ~b0 is inconsistent: just pick any
vector ~b0 whose last coordinate is different from 0. The easiest such vector is ~em . Now do the
Gauß-Jordan process backwards on this vector in order to obtain a right-hand side ~b such that
A~x = ~b is inconsistent.)

Proof using the concept of dimension. We already saw that Im A is the linear span of its columns.
Therefore dim Im A ≤ #columns of A = n < m = dim Rm , therefore Im A ⊊ Rm .


Remark 6.23. Let A ∈ M (m × n). If m < n, then A cannot be injective.

Proof with Gauß-Jordan. Let A0 be the row reduced echelon form of A. Then A0 can have at
most m pivots. Since A0 has more columns than pivots, the homogeneous system A~x = ~0 has
infinitely many solutions, but then also ker A contains infinitely many vectors, in particular A cannot be
injective.

Proof using the concept of dimension. We already saw that Im A is the linear span of its n columns
in Rm . Since n > m it follows that the column vectors are linearly dependent in Rm , hence A~x = ~0
has a non-trivial solution. Therefore ker A is not trivial and it follows that A is not injective.

Note that the remarks do not imply that A is surjective if m ≤ n or that A is injective if n ≤ m.
Find examples!

From Theorem 3.44 we obtain the following very important theorem for the special case m = n.

Theorem 6.24. Let A ∈ M (n × n) be a square matrix. Then the following is equivalent.


(i) A is invertible.
(ii) A is injective, that is, ker A = {~0}.
(iii) A is surjective, that is, Im A = Rn .
In particular, A is injective if and only if A is surjective if and only if A is bijective.
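For a square matrix the three properties of Theorem 6.24 can be observed together in a quick numerical experiment (assuming NumPy; the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
n = A.shape[0]
rank = np.linalg.matrix_rank(A)

print(abs(np.linalg.det(A)) > 1e-12)   # True: A is invertible
print(rank == n)                       # True: Im(A) = R^n, i.e. A is surjective
print(n - rank == 0)                   # True: ker(A) = {0}, i.e. A is injective
```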
Definition 6.25. Let A ∈ M (m × n) and let ~c1 , . . . , ~cn be the columns of A and ~r1 , . . . , ~rm be the
rows of A. We define

(i) CA := span{~c1 , . . . , ~cn } =: column space of A ⊆ Rm ,

(ii) RA := span{~r1 , . . . , ~rm } =: row space of A ⊆ Rn .

The next proposition follows immediately from the definition above and from Remark 6.21(ii).

Proposition 6.26. For A ∈ M (m × n) it follows that



(i) RA = CAt and CA = RAt ,


(ii) CA = Im(A) and RA = Im(At ).

The next proposition follows directly from the general theory in Section 6.1. We will give another
proof at the end of this section.

Proposition 6.27. Let A ∈ M (m × n), E ∈ M (n × n), F ∈ M (m × m) and assume that E and


F are invertible. Then

(i) CA = CAE .

(ii) RA = RF A .


Proof. (i) Note that CA = Im(A) = Im(AE) = CAE , where in the first and third equality we
used Proposition 6.26, and in the second equality we used Theorem 6.18.
(ii) Recall that, if F is invertible, then F t is invertible too. With Proposition 6.26(i) and what
we already proved in (i), we obtain RF A = C(F A)t = CAt F t = CAt = RA .
We immediately obtain the following proposition.

Proposition 6.28. Let A, B ∈ M (m × n).


(i) If A and B are row equivalent, then
dim(ker(A)) = dim(ker(B)), dim(Im(A)) = dim(Im(B)), Im(At ) = Im(B t ), RA = RB .

(ii) If A and B are column equivalent, then


dim(ker(A)) = dim(ker(B)), dim(Im(A)) = dim(Im(B)), Im(A) = Im(B), CA = CB .

Proof. We will only prove (i). The claim (ii) can be proved similarly (or can be deduced easily
from (i) by applying (i) to the transposed matrices). That A and B are row equivalent means that

we can transform B into A by row transformations. Since row transformations can be represented
by multiplication by elementary matrices from the left, there are elementary matrices F1 , . . . , Fk ∈
M (m × m) such that A = F1 . . . Fk B. Note that all Fj are invertible, hence F := F1 . . . Fk is
invertible and A = F B. Therefore all the claims in (i) follow from Theorem 6.18 and Proposition 6.27.

The proposition above is very useful to calculate the kernel of a matrix A: Let A0 be the reduced
row-echelon form of A. Then the proposition can be applied to A and A0 , and we find that
ker(A) = ker(A0 ).
In fact, we know this since the first chapter of this course, but back then we did not have fancy
words like “kernel” at our disposal. It says nothing else than: the solutions of a homogenous
system do not change if we apply row transformations, which is exactly why the Gauß-Jordan
elimination works.
In Examples 6.36 and 6.37 we will calculate the kernel and range of a matrix. Now we will prove
two technical lemmas.

Lemma 6.29. Let A ∈ M (m × n). Then there exist elementary matrices E1 , . . . , Ek ∈ M (n × n)


and F1 , . . . , F` ∈ M (m × m) such that
F1 · · · F` AE1 · · · Ek = A00
where A00 is of the form

          r    n − r
A00 = (  Ir     0   )  r
      (  0      0   )  m − r                                        (6.6)

with Ir the r × r identity matrix.

Proof. Let A0 be the reduced row-echelon form of A. Then there exist F1 , . . . , F` ∈ M (m × m) such
that F1 · · · F` A = A0 and A0 is of the form

A0 = ( 1 ∗ ∗ 0 ∗ ∗ 0 ∗ )
     (       1 ∗ ∗ 0 ∗ )
     (             1 ∗ )                                            (6.7)
     (                 )

Now clearly we can find “allowed” column transformations such that A0 is transformed into the
form A00 . If we observe that applying column transformations is equivalent to multiplying A0 from
the right by elementary matrices, then we can find elementary matrices E1 , . . . , Ek such that
A0 E1 . . . Ek is of the form (6.6).

Lemma 6.30. Let A00 be as in (6.6). Then

(i) dim(ker(A00 )) = n − r = number of non-pivot columns of A00 ,
(ii) dim(Im(A00 )) = r = number of pivots of A00 ,
(iii) dim(CA00 ) = dim(RA00 ) = r.

Proof. All assertions are clear if we note that

ker(A00 ) = span{~er+1 , . . . ,~en } and Im(A00 ) = span{~e1 , . . . ,~er }


where the ~ej are the standard unit vectors (that is, their jth component is 1 and all other components
are 0).

Proposition 6.31. Let A ∈ M (m × n) and let A0 be its reduced row-echelon form. Then

dim(Im(A)) = number of pivots of A0 .

Proof. Let F1 , . . . , F` , E1 , . . . , Ek and A00 be as in Lemma 6.29 and set F := F1 · · · F` and E :=



E1 · · · Ek . It follows that A0 = F A and A00 = F AE. Clearly, the number of pivots of A0 and A00
coincide. Therefore, with the help of Theorem 6.18 we obtain

dim(Im(A)) = dim(Im(F AE))


= number of pivots of A00
= number of pivots of A0 .

Proposition 6.32. Let A ∈ M (m × n). Then

dim(Im(A)) = dim CA = dim RA .

That means:
(dimension of the range of A) = (dimension of row space) = (dimension of column space).


Proof. Since CA = Im(A) by Proposition 6.26, the first equality is clear.


Now let F1 , . . . , F` , E1 , . . . , Ek and A0 , A00 be as in Lemma 6.29 and set F := F1 · · · F` and E :=
E1 · · · Ek . Then

dim(RA ) = dim(RF AE ) = dim(RA00 ) = r = dim(CA00 ) = dim(CF AE )


= dim(CA ).

As an immediate consequence we obtain the following theorem which is a special case of Theo-
rem 6.20, see also Theorem 6.47.

Theorem 6.33. Let A ∈ M (m × n). Then

dim(ker(A)) + dim(Im(A)) = n. (6.8)

Proof. With the notation as above, we obtain

dim(ker(A)) = dim(ker(A00 )) = n − r,
dim(Im(A)) = dim(Im(A00 )) = r

and the desired formula follows.

For the calculation of a basis of Im(A), the following theorem is useful.

Theorem 6.34. Let A ∈ M (m × n) and let A0 be its reduced row-echelon form with columns
~c1 , . . . , ~cn and ~c1 0 , . . . , ~cn 0 respectively. Assume that the pivot columns of A0 are the columns j1 <
· · · < jk . Then dim(Im(A)) = k and a basis of Im(A) is given by the columns ~cj1 , . . . , ~cjk of A.

Proof. Let E be an invertible matrix such that A = EA0 . By assumption on the pivot columns of
A0 , we know that dim(Im(A0 )) = k and that a basis of Im(A0 ) is given by the columns ~cj1 0 , . . . , ~cjk 0 .
By Theorem 6.18 it follows that dim(Im(A)) = dim(Im(A0 )) = k. Now observe that by definition of
E we have that E~c` 0 = ~c` for every ` = 1, . . . , n; in particular this is true for the pivot columns of
A0 . Moreover, since E is invertible and the vectors ~cj1 0 , . . . , ~cjk 0 are linearly independent, it follows
from Proposition 6.14 that the vectors ~cj1 , . . . , ~cjk are linearly independent. Clearly they belong to
Im(A), so we have span{~cj1 , . . . , ~cjk } ⊆ Im(A). Since both spaces have the same dimension, they
must be equal.

Remark 6.35. The theorem above can be used to determine a basis of a subspace given in the
form U = span{~v1 , . . . , ~vk } ⊆ Rm as follows: Define the matrix A = (~v1 | . . . |~vk ). Then clearly
U = Im A and we can apply Theorem 6.34 to find a basis of U .

Example 6.36. Find ker(A), Im(A), dim(ker(A)), dim(Im(A)) and RA for

A = ( 1 1  5  1 )
    ( 3 2 13  1 )
    ( 0 2  4 −1 )
    ( 4 5 22  1 ) .


Solution. First, let us row-reduce the matrix A:

A = ( 1 1  5  1 )    ( 1  1  5  1 )    ( 1  1  5  1 )              ( 1 0 3 0 )
    ( 3 2 13  1 ) →  ( 0 −1 −2 −2 ) →  ( 0 −1 −2 −2 ) →  · · ·  →  ( 0 1 2 0 ) =: A0 .
    ( 0 2  4 −1 )    ( 0  2  4 −1 )    ( 0  0  0 −5 )              ( 0 0 0 1 )
    ( 4 5 22  1 )    ( 0  1  2 −3 )    ( 0  0  0 −5 )              ( 0 0 0 0 )

Now it follows immediately that dim RA = dim CA = 3 and

dim(Im(A)) = # pivot columns of A0 = 3,


dim(ker(A)) = 4 − dim(Im(A)) = 1

(or: dim(Im(A)) = #non-zero rows of A0 = 3, or: dim(Im(A)) = dim(RA ) = 3 or: dim(ker(A)) =

#non-pivot columns of A0 = 1).
Kernel of A: We know that ker(A) = ker(A0 ) by Theorem 6.18 or Proposition 6.28. From the
explicit form of A0 it is clear that A~x = ~0 if and only if x4 = 0, x3 arbitrary, x2 = −2x3 and
x1 = −3x3 . Therefore

ker(A) = ker(A0 ) = {(−3x3 , −2x3 , x3 , 0)t : x3 ∈ R} = span{(−3, −2, 1, 0)t }.

Image of A: The pivot columns of A0 are the columns 1, 2 and 4. Therefore, by Theorem 6.34, a
basis of Im(A) is given by the columns 1, 2 and 4 of A:

Im(A) = span{(1, 3, 0, 4)t , (1, 2, 2, 5)t , (1, 1, −1, 1)t }.        (6.9)
 

Alternative method for calculating the image of A: We can use column manipulations of A
to obtain Im A. (If you feel more comfortable with row operations, you could apply row operations
to At and then transpose the resulting matrix again.) Performing column operations we find

A = ( 1 1  5  1 )    ( 1  0  0  0 )    ( 1  0  0  0 )              ( 1 0 0 0 )
    ( 3 2 13  1 ) →  ( 3 −1 −2 −2 ) →  ( 3 −1  0  0 ) →  · · ·  →  ( 0 1 0 0 ) =: Ã .
    ( 0 2  4 −1 )    ( 0  2  4 −1 )    ( 0  2  0 −5 )              ( 0 0 1 0 )
    ( 4 5 22  1 )    ( 4  1  2 −3 )    ( 4  1  0 −5 )              ( 1 1 1 0 )


It follows that

Im(A) = Im(Ã) = span{(1, 0, 0, 1)t , (0, 1, 0, 1)t , (0, 0, 1, 1)t }.        (6.9′)
4 1 1
 

• Explain why the method with the column operations works.

• Show by an explicit calculation that the spaces in (6.9) and (6.9′) are equal.
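The results of Example 6.36 can be double-checked with a computer algebra system. The following sketch assumes SymPy (which is not used elsewhere in these notes); nullspace() returns a basis of ker(A) and columnspace() returns the pivot columns of A, exactly as in Theorem 6.34.

```python
from sympy import Matrix

A = Matrix([[1, 1, 5, 1],
            [3, 2, 13, 1],
            [0, 2, 4, -1],
            [4, 5, 22, 1]])

print(A.rref()[0])      # the reduced row-echelon form A' computed above
print(A.nullspace())    # one vector, (-3, -2, 1, 0)^t  -> basis of ker(A)
print(A.columnspace())  # the columns 1, 2 and 4 of A   -> basis of Im(A), cf. (6.9)
```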

Example 6.37. Find a basis of span{p1 , p2 , p3 , p4 } ⊆ P3 and its dimension for

p1 = x3 − x2 + 2x + 2, p2 = x3 + 2x2 + 8x + 13,
p3 = 3x3 − 6x2 − 5, p4 = 5x3 + 4x2 + 26x − 9.

Solution. First we identify P3 with R4 via ax3 + bx2 + cx + d ↔ (a, b, c, d)t . The polynomials
p1 , p2 , p3 , p4 correspond to the vectors

~v1 = (1, −1, 2, 2)t , ~v2 = (1, 2, 8, 13)t , ~v3 = (3, −6, 0, −5)t , ~v4 = (5, 4, 26, −9)t .

Now we use Remark 6.35 to find a basis of span{~v1 , ~v2 , ~v3 , ~v4 }. To this end we consider the matrix A whose
columns are the vectors ~v1 , . . . , ~v4 :

A = (  1  1  3  5 )
    ( −1  2 −6  4 )
    (  2  8  0 26 )
    (  2 13 −5 −9 ) .

Clearly, span{~v1 , ~v2 , ~v3 , ~v4 } = Im(A), so it suffices to find a basis of Im(A). Applying row transfor-
mation to A, we obtain
   
1 1 3 5 1 0 4 5

−1 2 −6 4 0 1 2 3 = A0 .

A=  −→ · · · −→ 
 2 8 0 26 0 0 0 0
2 13 −5 −9 0 0 0 0

The pivot columns of A0 are the first and the second column, hence by Theorem 6.34, a basis of
Im(A) are its first and second columns, i.e. the vectors ~v1 and ~v2 .
It follows that {p1 , p2 } is a basis of span{p1 , p2 , p3 , p4 } ⊆ P3 , hence dim(span{p1 , p2 , p3 , p4 }) = 2.

Remark 6.38. Let us use the abbreviation π = span{p1 , p2 , p3 , p4 }. The calculation above actually
shows that any two vectors of p1 , p2 , p3 , p4 form a basis of π. To see this, observe that clearly any
two of them are linearly independent, hence the dimension of their generated space is 2. On the
other hand, this generated space is a subspace of π which has the same dimension 2. Therefore
they must be equal.


Remark 6.39. If we wanted to complete p1 , p2 to a basis of P3 , we have (at least) the two following
options:

(i) In order to find q3 , q4 ∈ P3 such that p1 , p2 , q3 , q4 forms a basis of P3 we can use the reduction
process that was employed to find A0 . Assume that E is an invertible matrix such that
A = EA0 . Such an E can be found by keeping track of the row operations that transform
A into A0 . Let ~ej be the standard unit vectors of R4 . Then we already know that ~v1 = E~e1
and ~v2 = E~e2 . If we set w~3 = E~e3 and w~4 = E~e4 , then ~v1 , ~v2 , w~3 , w~4 form a basis of R4 .
This is because ~e1 , . . . ,~e4 are linearly independent and E is injective. Hence E~e1 , . . . , E~e4 are
linearly independent too (by Proposition 6.14).

(ii) If we already have some knowledge of orthogonal complements as discussed in Chapter 7,


then we know that any basis of the orthogonal complement of span{~v1 , ~v2 } completes them
to a basis of R4 which we then only have to translate back to vectors in P3 . In order to find
two linearly independent vectors which are orthogonal to ~v1 and ~v2 we have to find linearly
independent solutions of the homogenous system of two equations for four unknowns

x1 − x2 +2x3 +2x4 = 0,
x1 +2x2 −6x3 +4x4 = 0

or, in matrix notation, P ~x = 0 where P is the 2 × 4 matrix whose rows are ~v1 and ~v2 . Since
clearly Im(P ) ⊆ R2 , it follows that dim(Im(P )) ≤ 2 and therefore dim(ker(P )) ≥ 4 − 2 = 2.

Remark 6.40. In Section 7 we will define the orthogonal complement of a subspace U ⊆ Rn


(Definition 7.18). It consists of all vectors w~ which are orthogonal to every ~u ∈ U . We will show in
Theorem ?? that for every matrix A ∈ M (m × n)

ker(A) = (RA )⊥ .

Since R(A) = Im At , this shows the important relation

ker(A) = (Im At )⊥ .

Example 6.41. Let T : R5 → R4 be given by


T (x, y, z, w, r)t = (3x + 2y − 5r, 2y + z − w, 5x + 2y − w, w + 3r)t .

We want to write T in the form A~x. Note that T can be expressed in the form

        ( 3 2 0 0 −5 )
T ~x =  ( 0 2 1 0 −1 ) ~x        for ~x = (x, y, z, w, r)t .
        ( 5 2 0 0 −1 )
        ( 0 0 0 1  3 )

This way of expressing T is not arbitrary: if (y1 , y2 , y3 , y4 )t ∈ Im T , then

3x + 2y − 5r = y1
2y + z − w = y2
5x + 2y − w = y3
w + 3r = y4

and we know from Section 3.3 that this system can be written in the form

( 3 2 0 0 −5 )          ( y1 )
( 0 2 1 0 −1 ) ~x   =   ( y2 )
( 5 2 0 0 −1 )          ( y3 )
( 0 0 0 1  3 )          ( y4 ) .
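The matrix of Example 6.41 can also be obtained mechanically: its columns are the images of the standard basis vectors of R5. A sketch (assuming NumPy; the function T below simply re-implements the formula of the example):

```python
import numpy as np

def T(v):
    x, y, z, w, r = v
    return np.array([3*x + 2*y - 5*r, 2*y + z - w, 5*x + 2*y - w, w + 3*r])

A = np.column_stack([T(e) for e in np.eye(5)])   # j-th column is T(e_j)
print(A)

v = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(np.allclose(A @ v, T(v)))                  # True: A x = T(x) for every x
```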

You should now have understood

• what the relation between the solutions of a homogeneous system and the kernel of the
associated coefficient matrix is,
• what the relation between the admissible right hand sides of a system of linear equations
and the range of the associated coefficient matrix is,
• why the dimension formula (6.8) holds and why it is only a special case of (6.4),
• why the Gauß-Jordan process works,
• etc.
You should now be able to
• calculate a basis of the kernel of a matrix and its dimension,
• calculate a basis of the range of a matrix and its dimension,
• etc.

Ejercicios.
1. Encuentre una base para el espacio generado de los siguientes conjuntos:
     
1 1 1
(a) , , .
3 −1 0
     
 2 −4 10 
(b) −1 , −2 , −5 .
1 2 5
 
       
 1 2 1 3 
(c) 0 , −2 , 2 , 3 .
1 0 3 5
 

Last Change: Fri May 10 12:11:39 PM -05 2024


Linear Algebra, M. Winklmeier
Chapter 6. Linear transformations and change of bases 243

       
1 0 1 −1 4 0 3 1
(d) , , , .
1 0 0 0 2 0 0 1

(e) X 3 − X 2 + X + 2, 2X 3 + 4X + 2, 2X 3 + X 2 + 5X + 1, X 3 + 2X 2 + 4X − 1 .

2. En los siguientes ejercicios, exprese T del Example 6.41 de la forma A~x. Determine Im T, ker T
y sus dimensiones.
 
1
(a) T : R3 → R3 , T (~x) = ~x ×  0.
−2
 
x  
x+y
 y 
(b) T : R4 → R3 , T 
z  = x − z .
  
2y + w
w
 
x−y

 
x 5x + 4y + 9z 
 
(c) T : R3 → R5 , T  y  =   2x − 3y − z .

z  x+z 
2y + 2z
 
1
(d) Sea w =  3 y T : R3 → R3 dada por T (~x) = projw~ ~x.
−1
3. Encuentre una matriz A(3 × 3) tal que su kernel es el plano E : x + 2y − z = 0.

4. Encuentre una matriz A(3 × 3) tal que su imagen es el plano E : 2x − y + 3z = 0.

5. Sea A ∈ M (m × n) tal que para todo ~b ∈ Rm , el sistema A~x = ~b tiene solución. ¿Cuánto vale
dim(Im A)?, ¿que se puede decir de ker A?

6. Sean A ∈ M (m × n) y B ∈ M (n × k).

(a) Muestre que dim(Im AB) ≤ dim Im A.


(b) Muestre que dim(Im AB) ≤ dim Im B.
(c) Muestre que dim(ker AB) ≥ dim ker B.
(d) Muestre que dim(ker AB) ≥ dim ker A.

¿Cuándo se tiene igualdad en los ejercicios anteriores? Encuentre ejemplos para igualdad y
ejemplos donde hay desigualdad estricta.

7. Sea A ∈ M (n × n) tal que A2 = A, muestre que ker A ⊕ Im A = Rn . (Hint: Basta mostrar


que ker A ∩ Im A = {~0} ¿Por qué?)


6.3 Change of bases


In this section, we work mostly with real vector spaces for definiteness sake. However, all the
statements are also true for complex vector spaces. We only have to replace everywhere R by C
and the word real by complex.

Usually we represent vectors in Rn as columns of numbers, for example

~v = (3, 2, −1)t ,   or more generally,   w~ = (x, y, z)t ,        (6.10)

Such columns of numbers are usually interpreted as the Cartesian coordinates of the tip of the
vector if its initial point is in the origin. So for example, we can visualise ~v as the vector which
we obtain when we move 3 units along the x-axis, 2 units along the y-axis and −1 unit along the
z-axis.

If we set ~e1 , ~e2 , ~e3 the unit vectors which are parallel to the x-, y- and z-axis, respectively, then we
can write ~v as a weighted sum of them:

~v = (3, 2, −1)t = 3~e1 + 2~e2 − ~e3 .        (6.11)

So the column of numbers which we use to describe ~v in (6.10) can be seen as a convenient way to
abbreviate the sum in (6.11).
Sometimes however, it may make more sense to describe a certain vector not by its Cartesian
coordinates. For instance, think of an infinitely large chess field (this is R2 ). Then the rook is
moving along the Cartesian axes while the bishop moves along the diagonals, that is, along
~b1 = (1, 1)t , ~b2 = (−1, 1)t , and the knight moves in directions parallel to ~k1 = (2, 1)t , ~k2 = (1, 2)t . We
suppose that in our imaginary chess game the rook, the bishop and the knight may move in arbitrary
multiples of their directions. Suppose all three of them are situated in the origin of the field and we
want to move them to the field (3, 5). For the rook, this is very easy. It only has to move 3 steps
to the right and then 5 steps up. He would denote his movement as ~vR = (3, 5)tR where we put the
index R to indicate that the numbers in this column vector correspond to the natural coordinate
system of the rook. The bishop cannot do this. He can move only along the diagonals. So what does
he have to do? He has to move 4 steps in the direction of ~b1 and 1 step in the direction of ~b2 . So
he would denote his movement with respect to his bishop coordinate system as ~vB = (4, 1)tB . Finally
the knight has to move 1/3 steps in the direction of ~k1 and 7/3 steps in the direction of ~k2 to reach
the point (3, 5). So he would denote his movement with respect to his knight coordinate system as
~vK = (1/3, 7/3)tK . See Figure 6.1.

Exercise. Check that ~vB = (4, 1)tB = 4~b1 + 1~b2 = (3, 5)t and that ~vK = (1/3, 7/3)tK = (1/3)~k1 + (7/3)~k2 = (3, 5)t .

Although the three vectors ~vR , ~vB and ~vK look very different, they describe the same vector – only
from three different perspectives (the rook, the bishop and the knight perspective). We have to


[Figure 6.1: The pictures show the point (3, 5) in “bishop” and “knight” coordinates. The vectors for the
bishop are ~b1 = (1, 1)t , ~b2 = (−1, 1)t and ~xB = (4, 1)tB . The vectors for the knight are ~k1 = (2, 1)t ,
~k2 = (1, 2)t and ~xK = (1/3, 7/3)tK .]
3 K

remember that they have to be interpreted as linear combinations of the vectors that describe their
movements.
What we just did was to perform a change of bases in R2 : Instead of describing a point in the plane
in Cartesian coordinates, we used “bishop”- and “knight”-coordinates.
We can also go in the other direction and transform from “bishop”- or “knight”-coordinates to
Cartesian coordinates. Assume that we know that the bishop moves 3 steps in his direction ~b1 and
−2 steps in his direction ~b2 , where does he end up? In his coordinate system, he is displaced by
the vector ~u = (3, −2)tB . In Cartesian coordinates this vector is

~u = (3, −2)tB = 3~b1 − 2~b2 = (3, 3)t + (2, −2)t = (5, 1)t .

If we move the knight 3 steps in his direction ~k1 and −2 steps in his direction ~k2 , that is, we move
him along w~ = (3, −2)tK according to his coordinate system, then in Cartesian coordinates this vector is

w~ = (3, −2)tK = 3~k1 − 2~k2 = (6, 3)t + (−2, −4)t = (4, −1)t .

Can the bishop and the knight reach every point in the plane? If so, in how many ways? The
answer is yes, and they can do so in exactly one way. The reason is that for the bishop and for the
knight, their set of direction vectors each form a basis of R2 (verify this!).

Let us make precise the concept of change of basis. Assume we are given an ordered basis B =


[Figure 6.2: The picture shows the vectors $\binom{3}{-2}_B$ and $\binom{3}{-2}_K$.]

{~b1 , . . . , ~bn } of Rn . If we write

$$\vec x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}_B \qquad (6.12)$$
then we interpret it as a vector which is expressed with respect to the basis B and
 
$$\vec x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}_B := x_1 \vec b_1 + \cdots + x_n \vec b_n. \qquad (6.13)$$

If there is no index attached to the column vector, then we interpret it as a vector with respect to the
canonical basis ~e1 , . . . ,~en of Rn . Now we want to find a way to calculate the Cartesian coordinates
(that is, those with respect to the canonical basis) if we are given a vector in B-coordinates and
vice versa.
It will turn out that the following matrix will be very useful:

$$A_{B\to can} = (\vec b_1 | \ldots | \vec b_n) = \text{the matrix whose columns are the vectors of the basis } B.$$

We will explain the index “B → can” in a moment.

Transition from representation with respect to a given basis to Cartesian coordinates.


Suppose we are given a vector as in (6.13). How do we obtain its Cartesian coordinates?
This is quite straightforward. We only need to remember what the notation (·)B means. We will
denote by ~xB the representation of the vector with respect to the basis B and by ~x its representation
with respect to the standard basis of Rn .
     
$$\vec x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}_B = x_1 \vec b_1 + x_2 \vec b_2 + \cdots + x_n \vec b_n = (\vec b_1 | \vec b_2 | \cdots | \vec b_n) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = A_{B\to can} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = A_{B\to can}\, \vec x_B,$$


that is
$$\vec x = A_{B\to can}\, \vec x_B = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}_{can}. \qquad (6.14)$$
The last vector (the one with the y1 , . . . , yn in it) describes the same vector as ~xB , but it does so
with respect to the standard basis of Rn . The matrix AB→can is called the transition matrix from
the basis B to the canonical basis (which explains the subscript “B → can”). The matrix is also
called the change-of-coordinates matrix

Transition from Cartesian coordinates to representation with respect to a given basis.


Suppose we are given a vector ~x in Cartesian coordinates. How do we calculate its coordinates ~xB
with respect to the basis B?
We only need to remember that the relation between ~x and ~xB according to (6.14) is

~x = AB→can ~xB .

In this case, we know the entries of the vector $\vec x$. So we only need to invert the matrix $A_{B\to can}$ in order to obtain the entries of $\vec x_B$:
$$\vec x_B = A_{B\to can}^{-1}\, \vec x.$$
This requires of course that AB→can is invertible. But this is guaranteed by Theorem 5.39 since we
know that its columns are linearly independent. So it follows that the transition matrix from the
canonical basis to the basis B is given by

$$A_{can\to B} = A_{B\to can}^{-1}.$$
 
Note that we could do this also “by hand”: We are given $\vec x = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}_{can}$ and we want to find the entries $x_1, \ldots, x_n$ of the vector $\vec x_B$ which describes the same vector. That is, we need numbers $x_1, \ldots, x_n$ such that
$$\vec x = x_1 \vec b_1 + \cdots + x_n \vec b_n.$$
If we know the vectors $\vec b_1, \ldots, \vec b_n$, then we can write this as an $n \times n$ system of linear equations and then solve it for $x_1, \ldots, x_n$, which of course in reality is the same as applying the inverse of the matrix $A_{B\to can}$ to the vector $\vec x = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}_{can}$.

Now assume that we have two ordered bases B = {~b1 , . . . , ~bn } and C = {~c1 , . . . , ~cn } of Rn and we
are given a vector ~xB with respect to the basis B. How can we calculate its representation ~xC with
respect to the basis C? The easiest way is to use the canonical basis of Rn as an auxiliary basis.
So we first calculate the given vector ~xB with respect to the canonical basis, we call this vector ~x.
Then we go from ~x to ~xC . According to the formulas above, this is
$$\vec x_C = A_{can\to C}\, \vec x = A_{can\to C} A_{B\to can}\, \vec x_B.$$


Hence the transition matrix from the basis B to the basis C is

AB→C = Acan→C AB→can .

Example 6.42. Let us go back to the example of our imaginary chess board. We have the “bishop basis” $B = \{\vec b_1, \vec b_2\}$ where $\vec b_1 = \binom{1}{1}$, $\vec b_2 = \binom{-1}{1}$, and the “knight basis” $K = \{\vec k_1, \vec k_2\}$ where $\vec k_1 = \binom{2}{1}$, $\vec k_2 = \binom{1}{2}$. Then the transition matrices to the canonical basis are
$$A_{B\to can} = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \qquad A_{K\to can} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},$$

their inverses are
$$A_{can\to B} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, \qquad A_{can\to K} = \frac{1}{3}\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix},$$
and the transition matrices from B to K and from K to B are
$$A_{B\to K} = A_{can\to K} A_{B\to can} = \frac{1}{3}\begin{pmatrix} 1 & -3 \\ 1 & 3 \end{pmatrix}, \qquad A_{K\to B} = A_{can\to B} A_{K\to can} = \frac{1}{2}\begin{pmatrix} 3 & 3 \\ -1 & 1 \end{pmatrix}.$$

• Given a vector $\vec x = \binom{2}{7}_B$ in bishop coordinates, what are its knight coordinates?

Solution. $(\vec x)_K = A_{B\to K}\, \vec x_B = \frac{1}{3}\begin{pmatrix} 1 & -3 \\ 1 & 3 \end{pmatrix}\binom{2}{7} = \binom{-19/3}{23/3}_K$. □

• Given a vector $\vec y = \binom{5}{1}_K$ in knight coordinates, what are its bishop coordinates?

Solution. $(\vec y)_B = A_{K\to B}\, \vec y_K = \frac{1}{2}\begin{pmatrix} 3 & 3 \\ -1 & 1 \end{pmatrix}\binom{5}{1} = \binom{9}{-2}_B$. □

• Given a vector $\vec z = \binom{1}{3}$ in standard coordinates, what are its bishop coordinates?

Solution. $(\vec z)_B = A_{can\to B}\, \vec z = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\binom{1}{3} = \binom{2}{1}_B$. □
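The computations in Example 6.42 are easy to check numerically. The following sketch is not part of the original notes; it is a minimal illustration assuming NumPy is available. It builds the transition matrices for the bishop and knight bases and converts the vector $\binom{2}{7}_B$.

    # Hypothetical illustration (assumes NumPy): change of bases from Example 6.42.
    import numpy as np

    A_B_to_can = np.array([[1.0, -1.0],
                           [1.0,  1.0]])   # columns are b1, b2
    A_K_to_can = np.array([[2.0, 1.0],
                           [1.0, 2.0]])    # columns are k1, k2

    A_can_to_K = np.linalg.inv(A_K_to_can)
    A_B_to_K = A_can_to_K @ A_B_to_can     # A_{B->K} = A_{can->K} A_{B->can}

    x_B = np.array([2.0, 7.0])             # coordinates with respect to B
    x_can = A_B_to_can @ x_B               # Cartesian coordinates: (-5, 9)
    x_K = A_B_to_K @ x_B                   # knight coordinates: (-19/3, 23/3)

    # sanity check: rebuilding the Cartesian vector from the knight coordinates
    assert np.allclose(A_K_to_can @ x_K, x_can)
    print(x_can, x_K)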

Example 6.43. Recall the example on page 106 where we had a shop that sold different types of
packages of food. Package type A contains 1 peach and 3 mangos and package type B contains 2
peaches and 1 mango. We asked two types of questions:
Question 1. If we buy a packages of type A and b packages of type B, how many peaches and
mangos will we get? We could rephrase this question so that it becomes more similar to Question
2: How many peaches and mangos do we need in order to fill a packages of type A and b packages
of type B?
Question 2. How many packages of type A and of type B do we have to buy in order to get p
peaches and m mangos?
Recall that we had the relation
           
$$M\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} p \\ m \end{pmatrix}, \qquad M^{-1}\begin{pmatrix} p \\ m \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix} \qquad\text{where}\qquad M = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix} \ \text{and}\ M^{-1} = \frac{1}{5}\begin{pmatrix} -1 & 2 \\ 3 & -1 \end{pmatrix}. \qquad (6.15)$$
(Here $p$ counts the peaches and $m$ the mangos.)


[Figure 6.3: How many peaches and mangos do we need to obtain 1 package of type A and 3 packages of type B? Answer: 7 peaches and 6 mangos. Figure (a) describes the situation in the “fruit plane” (axes: peaches, mangos) while Figure (b) describes the same situation in the “packages plane” (axes: Type A, Type B). In both figures we see that $\vec A + 3\vec B = 7\vec p + 6\vec m$.]
We can view these problems in two different coordinate systems. We have the “fruit basis” $F = \{\vec p, \vec m\}$ and the “package basis” $P = \{\vec A, \vec B\}$ where
$$\vec p = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \vec m = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \vec A = \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \qquad \vec B = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.$$

Note that $\vec A = \vec p + 3\vec m$, $\vec B = 2\vec p + \vec m$, and that $\vec p = \frac{1}{5}(-\vec A + 3\vec B)$ and $\vec m = \frac{1}{5}(2\vec A - \vec B)$ (that means, for example, that one peach is three fifths of a package B minus one fifth of a package A).

An example for the first question is: How many peaches and mangos do we need to obtain 1 package
of type A and 3 packages of type B? Clearly,
 we
 need 7 peaches and 6 mangos.   So the point that we
1 7
want to reach is in “package coordinates” and in “fruit coordinates” . This is sketched
3 P 6 F
in Figure 6.3.
An example for the second question is: How many packages of type A and of type B do we have
to buy in order to obtain 5 peaches and 5 mangos? Using (6.15) we find that we need 1 package of
type A and 2 packages of type B. So the point that we want to reach is $\binom{1}{2}_P$ in “package coordinates” and $\binom{5}{5}_F$ in “fruit coordinates”. This is sketched in Figure 6.4.
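The second question is simply the linear system (6.15) solved for the package numbers. The following sketch is not part of the original notes; it is a minimal check assuming NumPy is available and using the convention from above that the first coordinate counts peaches.

    # Hypothetical sketch (assumes NumPy): Question 2 of Example 6.43.
    import numpy as np

    M = np.array([[1.0, 2.0],    # peaches: 1 per package A, 2 per package B
                  [3.0, 1.0]])   # mangos:  3 per package A, 1 per package B

    fruit = np.array([5.0, 5.0])          # we want 5 peaches and 5 mangos
    packages = np.linalg.solve(M, fruit)  # -> [1., 2.]: 1 package A, 2 packages B
    print(packages)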

In the rest of this section we will apply these ideas to introduce coordinates in abstract (finitely
generated) vector spaces V with respect to a given basis. This allows us to identify in a certain


[Figure 6.4: How many packages of type A and of type B do we need to get 5 peaches and 5 mangos? Answer: 1 package of type A and 2 packages of type B. Figure (a) describes the situation in the “fruit plane” while Figure (b) describes the same situation in the “packages plane”. In both figures we see that $\vec A + 2\vec B = 5\vec p + 5\vec m$.]

sense V with Rn or Cn for an appropriate n.


Assume we are given a real vector space V with an ordered basis B = {v1 , . . . , vn }. Given a vector
w ∈ V , we know that there are uniquely determined real numbers α1 , . . . , αn such that

w = α1 v1 + · · · + αn vn .

So, if we are given w, we can find the numbers α1 , . . . , αn . On the other hand, if we are given the
numbers α1 , . . . , αn , we can easily reconstruct the vector w (just replace in the right hand side of
the above equation). Therefore it makes sense to write
 
$$w = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}_B$$

where again the index B reminds us that the column of numbers has to be understood as the
coefficients with respect to the basis B. In this way, we identify V with Rn since every column
vector gives a vector w in V and every vector w gives one column vector in Rn . Note that if we
start with some w in V , calculate its coordinates with respect to a given basis and then go back to
V , we get back our original vector w.

Example 6.44. In P2 , consider the bases B = {p1 , p2 , p3 }, C = {q1 , q2 , q3 }, D = {r1 , r2 , r3 }


where

p1 = 1, p2 = X, p3 = X 2 , q1 = X 2 , q2 = X, q3 = 1, r1 = X 2 + 2X, r2 = 5X + 2, r3 = 1.


We want to write the polynomial π(X) = aX 2 + bX + c with respect to the given basis.
 
• Basis B: Clearly, $\pi = c p_1 + b p_2 + a p_3$, therefore $\pi = \begin{pmatrix} c \\ b \\ a \end{pmatrix}_B$.
 
• Basis C: Clearly, $\pi = a q_1 + b q_2 + c q_3$, therefore $\pi = \begin{pmatrix} a \\ b \\ c \end{pmatrix}_C$.
• Basis D: This requires some calculations. Recall that we need numbers α, β, γ ∈ R such that
 
$$\pi = \begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix}_D = \alpha r_1 + \beta r_2 + \gamma r_3.$$
This leads to the following equation
aX 2 + bX + c = α(X 2 + 2X) + β(5X + 2) + γ = αX 2 + (2α + 5β)X + 2β + γ.

Comparing coefficients we obtain
     
$$\begin{aligned} \alpha &= a \\ 2\alpha + 5\beta &= b \\ 2\beta + \gamma &= c \end{aligned} \qquad\text{in matrix form:}\qquad \begin{pmatrix} 1 & 0 & 0 \\ 2 & 5 & 0 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix} = \begin{pmatrix} a \\ b \\ c \end{pmatrix}. \qquad (6.16)$$

Note that the columns of the matrix appearing on the right hand side are exactly  a the vector
representations of r1 , r2 , r3 with respect to the basis C and the column vector b is exactly
c
the vector representation of π with respect to the basis C! The solution of the system is
$$\alpha = a, \qquad \beta = -\tfrac{2}{5}a + \tfrac{1}{5}b, \qquad \gamma = \tfrac{4}{5}a - \tfrac{2}{5}b + c,$$
therefore
$$\pi = \begin{pmatrix} a \\ -\frac{2}{5}a + \frac{1}{5}b \\ \frac{4}{5}a - \frac{2}{5}b + c \end{pmatrix}_D.$$

We could have found the solution also by doing a detour through R3 as follows: We identify the
vectors q1 , q2 , q3 with the canonical basis vectors ~e1 , ~e2 ,~e3 of R3 . Then the vectors r1 , r2 , r3
and π correspond to
       
$$\vec r_1' = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}, \qquad \vec r_2' = \begin{pmatrix} 0 \\ 5 \\ 2 \end{pmatrix}, \qquad \vec r_3' = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \qquad \vec\pi' = \begin{pmatrix} a \\ b \\ c \end{pmatrix}.$$
Let $R = \{\vec r_1', \vec r_2', \vec r_3'\}$. In order to find the coordinates of $\vec\pi'$ with respect to the basis $\vec r_1', \vec r_2', \vec r_3'$, we note that
$$\vec\pi' = A_{R\to can}\, \vec\pi_R'$$

where $A_{R\to can}$ is the transition matrix from the basis R to the canonical basis of $\mathbb{R}^3$, whose columns consist of the vectors $\vec r_1', \vec r_2', \vec r_3'$. So we see that this is exactly the same equation as
the one in (6.16).
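Numerically, finding the D-coordinates of a concrete polynomial is just solving the system (6.16). The following sketch is not part of the original notes; it is a minimal illustration assuming NumPy is available, for the (arbitrarily chosen) polynomial $\pi = 2X^2 + 3X + 5$.

    # Hypothetical sketch (assumes NumPy): coordinates of pi = a X^2 + b X + c
    # with respect to D = {X^2 + 2X, 5X + 2, 1}, i.e. solving system (6.16).
    import numpy as np

    A_R_to_can = np.array([[1.0, 0.0, 0.0],   # columns: r1, r2, r3 written in
                           [2.0, 5.0, 0.0],   # the basis C = {X^2, X, 1}
                           [0.0, 2.0, 1.0]])

    a, b, c = 2.0, 3.0, 5.0                   # a concrete polynomial
    coeffs = np.linalg.solve(A_R_to_can, np.array([a, b, c]))
    # coeffs = (alpha, beta, gamma) = (a, -2a/5 + b/5, 4a/5 - 2b/5 + c)
    print(coeffs)                             # for (2, 3, 5): [ 2. , -0.2,  5.4]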


We give an example in a space of matrices.

Example 6.45. Consider the matrices


       
$$R = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad S = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}, \qquad T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad Z = \begin{pmatrix} 2 & 3 \\ 3 & 0 \end{pmatrix}.$$
(i) Show that B = {R, S, T } is a basis of Msym (2 × 2) (the space of all symmetric 2 × 2 matrices).
(ii) Write Z in terms of the basis B.
Solution. (i) Clearly, R, S, T ∈ Msym (2 × 2). Since we already know that dim Msym (2 × 2) = 3,
it suffices to show that R, S, T are linearly independent. So let us consider the equation
 
$$0 = \alpha R + \beta S + \gamma T = \begin{pmatrix} \alpha+\beta & \alpha+\gamma \\ \alpha+\gamma & \alpha+3\beta \end{pmatrix}.$$
We obtain the system of equations
     
$$\begin{aligned} \alpha + \beta &= 0 \\ \alpha + \gamma &= 0 \\ \alpha + 3\beta &= 0 \end{aligned} \qquad\text{in matrix form:}\qquad \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 3 & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \qquad (6.17)$$

Doing some calculations, it follows that α = β = γ = 0. Hence we showed that R, S, T are


linearly independent and therefore they are a basis of Msym (2 × 2).
(ii) In order to write Z in terms of the basis B, we need to find α, β, γ ∈ R such that
 
$$Z = \alpha R + \beta S + \gamma T = \begin{pmatrix} \alpha+\beta & \alpha+\gamma \\ \alpha+\gamma & \alpha+3\beta \end{pmatrix}.$$
RA
We obtain the system of equations
     
α+ β = 2 1 1 0 α 2
α +γ=3 in matrix form: 1 0 1 β  = 3 . (6.18)
α + 3β =0 1 3 0 γ 0

| {z }
=A

Therefore
        
α 2 3 0 −1 2 3
D

−1 1
β  = A 3 = −1 0 1 3 = −1 ,
2
γ 0 −3 2 1 0 0
 3
hence Z = 3R − S = −1 . 
0 B
Now we give an alternative solution (which is essentially
  the same as  the above)doing a detour
1 0 0 0 0 1
through R3 . Let C = {A1 , A2 , A3 } where A1 = , A2 = , A3 = . This is
0 0 0 1 1 0
clearly a basis of Msym (2 × 2). We identify it with the standard basis ~e1 ,~e2 ,~e3 of R3 . Then the
vectors R, S, T in this basis look like
       
1 1 0 2
R0 = 1 , S 0 = 0 , T 0 = 1 and Z 0 = 3 .
1 3 0 0

Last Change: Fri May 10 12:11:39 PM -05 2024


Linear Algebra, M. Winklmeier
Chapter 6. Linear transformations and change of bases 253

(i) In order to show that R, S, T are linearly independent, we only have to show that the vectors
R0 , S 0 and T 0 are linearly independent in R3 . To this end, we consider the matrix A whose
columns are these vectors. Note that this is the same matrix that appeared in (6.18). It is
easy to show that this matrix is invertible (we already calculated its inverse!). Therefore the
vectors R0 , S 0 , T 0 are linearly independent in R3 , hence R, S, T are linearly independent in
Msym (2 × 2).

(ii) Now in order to find the representation of Z in terms of the basis B, we only need to find the
representation of Z 0 in terms of the basis B 0 = {R0 , S 0 , T 0 }. This is done as follows:
 
2
ZB0 0 = Acan→B0 Z 0 = A−1 Z 0 = 3 .
0

FT
You should now have understood
• the geometric meaning of a change of bases in Rn ,
• how an abstract finite dimensional vector space can be represented as Rn or Cn and that
the representation depends on the chosen basis of V ,
• how the vector representation changes if the chosen basis is reordered,
• etc.
RA
You should now be able to

• perform a change of basis in Rn and Cn given a basis,


• represent vectors in a finite dimensional vector space V as column vectors after the choice
of a basis,
• etc.
D

Exercises.
       
1. Let $B = \left\{ \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix} \right\}$. Show that B is a basis of $M(2\times 2)$ and find $[A]_B$ for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.

2. Let $B = \{X^2 - 1,\; X^2 + X + 1,\; X^2\}$. Show that B is a basis of $P_2$. Find $[p(X)]_B$ for $p(X) = a + bX + cX^2$.

3. Let $B = \{1, e^x, e^{-x}\}$ and $V = \operatorname{span}\{1, e^x, e^{-x}\}$.

(a) Show that $\sinh x, \cosh x \in V$.


(b) Find $A \in M(3\times 3)$ such that
$$\begin{aligned} 1 &= a_{11} + a_{12} e^x + a_{13} e^{-x} \\ \sinh x &= a_{21} + a_{22} e^x + a_{23} e^{-x} \\ \cosh x &= a_{31} + a_{32} e^x + a_{33} e^{-x} \end{aligned}$$

(c) Show that $B' = \{1, \sinh x, \cosh x\}$ is a basis of V.

(d) Find $A_{B\to B'}$ and $A_{B'\to B}$.

4. Show that $B = \{1, X - 1, (X - 1)^2\}$ is a basis of $P_2$ and write $a + bX + cX^2$ in terms of B. More generally, show that $B = \{1, X - 1, (X - 1)^2, \ldots, (X - 1)^n\}$ is a basis of $P_n$ and find $A_{B\to can}$, $A_{can\to B}$ where $can = \{1, X, X^2, \ldots, X^n\}$.
            
 −1 1 0   2 −1 3 

FT
5. Sean B1 =  1 ,  0 , 1 y B2 = 1 ,  4 , −2 .
0 −1 1 3 5 4
   

(a) Muestre que B1 y B2 son bases de R3 .


 
x
(b) Sea ~v = y . Encuentre [~v ]B1 y [~v ]B2 .
z
RA
(c) Obtenga AB1 →B2 y AB2 →B1 .
  
6. Let $\vartheta \in (-\pi, \pi]$. Show that $B_\vartheta = \left\{ \begin{pmatrix} \cos\vartheta \\ \sin\vartheta \end{pmatrix}, \begin{pmatrix} -\sin\vartheta \\ \cos\vartheta \end{pmatrix} \right\}$ is a basis of $\mathbb{R}^2$. Find $A_{B_\vartheta\to can}$ and $A_{can\to B_\vartheta}$. How is $B_\vartheta$ interpreted geometrically?

7. Let $a, b$ be such that $ab \neq 0$.

(a) Show that $B = \left\{ \frac{1}{\sqrt{a^2+b^2}}\begin{pmatrix} a \\ b \end{pmatrix}, \frac{1}{\sqrt{a^2+b^2}}\begin{pmatrix} -b \\ a \end{pmatrix} \right\}$ is a basis of $\mathbb{R}^2$.

(b) Show that there exists $\vartheta \in (-\pi, \pi]$ such that $B = B_\vartheta$. (Hint: geometric interpretation of B.)

8. Let $B_\vartheta$ be as in Exercise 6.

(a) If $\vartheta = \frac{\pi}{6}$, write $\begin{pmatrix} -3\sqrt{3} \\ -3 \end{pmatrix}$ in terms of $B_\vartheta$.

(b) If $\vartheta = \frac{\pi}{4}$, write $\begin{pmatrix} 1 \\ -1 \end{pmatrix}_{B_\vartheta}$ in terms of the canonical basis.

9. Let $\vartheta_1, \vartheta_2 \in (-\pi, \pi]$. How is $A_{B_{\vartheta_1}\to B_{\vartheta_2}}$ interpreted geometrically?


6.4 Linear maps and their matrix representations


Let U, V be K-vector spaces and let T : U → V be a linear map. Recall that T satisfies
T (λ1 x1 + · · · + λk xk ) = λ1 T (x1 ) + · · · + λk T (xk )
for all x1 , . . . , xk ∈ U and λ1 , . . . , λk ∈ K. This shows that in order to know T , it is in reality
enough to know how T acts on a basis of U . Suppose that we are given a basis B = {u1 , . . . , un } ∈ U
and take an arbitrary vector w ∈ U . Then there exist uniquely determined λ1 , . . . , λk ∈ K such
that w = λ1 u1 + · · · + λn un . Hence
T w = T (λ1 u1 + · · · + λn un ) = λ1 T u1 + · · · + λn T un . (6.19)
So T w is a linear combination of the vectors T u1 , . . . , T un ∈ V and the coefficients are exactly the
λ1 , . . . , λn .
Suppose we are given a basis C = {v1 , . . . , vm } of V . Then we know that for every j = 1, . . . , n, the
vector T uj is a linear combination of the basis vectors v1 , . . . , vm of V . Therefore there exist uniquely
determined numbers $a_{ij} \in K$ ($i = 1, \ldots, m$, $j = 1, \ldots, n$) such that $T u_j = a_{1j} v_1 + \cdots + a_{mj} v_m$, that
is

$$\begin{aligned} T u_1 &= a_{11} v_1 + a_{21} v_2 + \cdots + a_{m1} v_m, \\ T u_2 &= a_{12} v_1 + a_{22} v_2 + \cdots + a_{m2} v_m, \\ &\;\;\vdots \\ T u_n &= a_{1n} v_1 + a_{2n} v_2 + \cdots + a_{mn} v_m. \end{aligned} \qquad (6.20)$$
Let us define the matrix AT and the vector ~λ by
   
$$A_T = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \in M(m \times n), \qquad \vec\lambda = \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{pmatrix} \in \mathbb{R}^n.$$
am1 am2 · · · amn λn
Note that the first column of AT is the vector representation of T u1 with respect to the basis
v1 , . . . , vm , the second column is the vector representation of T u2 , and so on.
Now let us come back to the calculation of T w and its connection with the matrix AT . From (6.19)
and (6.20) we obtain
D

T w = λ1 T u1 + λ2 T u2 + · · · + λn T un

= λ1 (a11 v1 + a21 v2 + · · · + am1 vm )


+ λ2 (a12 v1 + a22 v2 + · · · + am2 vm )
+ ···
+ λn (a1n v1 + a2n v2 + · · · + amn vm )

= (a11 λ1 + a12 λ2 + · · · + a1n λn )v1


+ (a21 λ1 + a22 λ2 + · · · + a2n λn )v2
+ ···
+ (am1 λ1 + am2 λ2 + · · · + amn λn )vm .


The calculation shows that for every k the coefficient of vk is the kth component of the vector AT ~λ!
Now we can go one step further. Recall that the choice of the basis B of U and the basis C of V
allows us to write w and T w as a column vectors:
   
$$w = \vec w_B = \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{pmatrix}_B, \qquad Tw = \begin{pmatrix} a_{11}\lambda_1 + a_{12}\lambda_2 + \cdots + a_{1n}\lambda_n \\ a_{21}\lambda_1 + a_{22}\lambda_2 + \cdots + a_{2n}\lambda_n \\ \vdots \\ a_{m1}\lambda_1 + a_{m2}\lambda_2 + \cdots + a_{mn}\lambda_n \end{pmatrix}_C.$$

This shows that


$$(T w)_C = A_T\, \vec w_B.$$
For now hopefully obvious reasons, the matrix AT is called the matrix representation of T with
respect to the bases B and C.
So every linear transformation T : U → V can be represented as a matrix AT ∈ M (m × n). On the
other hand, every matrix $A \in M(m \times n)$ induces a linear transformation $T_A : U \to V$.
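In concrete computations, the columns of $A_T$ are simply the C-coordinates of the images of the basis vectors of B. The following sketch is not part of the original notes; it is a minimal illustration assuming NumPy is available, where the helper function and the toy map are hypothetical choices for the example.

    # Hypothetical sketch (assumes NumPy): building A_T column by column,
    # so that (Tw)_C = A_T w_B.
    import numpy as np

    def matrix_representation(T, basis_B, basis_C):
        """basis_B: vectors of R^n, basis_C: vectors of R^m, T: function R^n -> R^m.
        Column j of the result holds the C-coordinates of T(u_j)."""
        C = np.column_stack(basis_C)                  # C-coordinates solve C x = y
        cols = [np.linalg.solve(C, T(u)) for u in basis_B]
        return np.column_stack(cols)

    # toy example: T swaps the two coordinates; we express it in the bishop
    # basis of Example 6.42 (used for both domain and codomain)
    M = np.array([[0.0, 1.0], [1.0, 0.0]])
    b1, b2 = np.array([1.0, 1.0]), np.array([-1.0, 1.0])
    A_T = matrix_representation(lambda x: M @ x, [b1, b2], [b1, b2])
    print(A_T)    # [[1, 0], [0, -1]]: the swap fixes b1 and flips b2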

FT
Very important remark. This identification of m×n-matrices with linear maps U → V depends
on the choice of the basis! See Example 6.48.

Let us summarise what we have found so far.

Theorem 6.46. Let U, V be finite dimensional vector spaces and let B = {u1 , . . . , un } be an ordered
basis of U and let C = {v1 , . . . , vm } be an ordered basis of V . Then the following is true:
RA
(i) Every linear map T : U → V can be represented as a matrix AT ∈ M (m × n) such that

(T w)C = AT w
~B

where (T w)C is the representation of T w ∈ V with respect to the basis C and w ~ B is the
representation of w ∈ U with respect to the basis B. The entries aij of AT can be calculated
as in (6.20).
(ii) Every matrix A = (aij )i=1,...,m ∈ M (m × n) induces a linear transformation T : U → V
j=1,...,n
defined by
D

T (uj ) = a1j v1 + . . . amj vm , j = 1, . . . , n.

(iii) $T = T_{A_T}$ and $A = A_{T_A}$. That means: If we start with a linear map T : U → V , calculate its
matrix representation AT and then the linear map TAT : U → V induced by AT , then we get
back our original map T . If on the other hand we start with a matrix A ∈ M (m×n), calculate
the linear map TA : U → V induced by A and then calculate its matrix representation ATA ,
then we get back our original matrix A.

Proof. We already showed (i) and (ii) in the text before the theorem. To see (iii), let us start with a
linear transformation T : U → V and let AT = (aij ) be the matrix representation of T with respect
to the bases B and C. For TAT , the linear map induced by AT , it follows that

T AT uj = a1j v1 + . . . amj vm = T uj , j = 1, . . . , n


Since this is true for all basis vectors and both T and TAT are linear, they must be equal.
If on the other hand we are given a matrix A = (aij )i=1,...,m ∈ M (m × n) then we have that the
j=1,...,n
linear transformation TA induced by A acts on the basis vectors u1 , . . . , un as follows:

TA uj = TAT uj = a1j v1 + . . . amj vm .

But then, by definition of the matrix representation ATA of TA , it follows that ATA = A.
Let us see this “identifications” of matrices with linear transformations a bit more formally. By
choosing a basis B = {u1 , . . . , un } in U and thereby identifying U with Rn , we are in reality defining
a linear bijection  
λ1
n  .. 
Ψ:U →R , Ψ(λu1 + · · · + λn un ) =  .  .
λn
Recall that we denoted the vector on the right hand side by ~uB .
The same happens if we choose a basis C = {v1 , . . . , vm } of V . We obtain a linear bijection

FT
 
µ1
m  .. 
Φ:V →R , Φ(µv1 + · · · + µm vm ) =  .  .
µm
With these linear maps, we find that

AT = Φ ◦ T ◦ Ψ−1 and TA = Φ−1 ◦ A ◦ Ψ.


RA
The maps Ψ and Φ “translate” the spaces U and V to Rn and Rm where the chosen bases serve
as “dictionary”. Thereby they “translate” linear maps T : U → V to matrices A ∈ M(m × n) and vice versa. In a diagram this looks like this:
T
U V
Ψ Φ
n AT m
R R
So in order to go from U to V , we can take the detour through Rn and Rm . The diagram above is
D

called commutative diagram. That means that it does not matter which path we take to go from
one corner of the diagram to another one as long as we move in the directions of the arrows. Note
that in this case we are even allowed to go in the opposite directions of the arrows representing Ψ
and Φ because they are bijections.
What is the use of a matrix representation of a linear map? Sometimes calculations are easier in
the world of matrices. For example, we know how to calculate the range and the kernel of a matrix.
Therefore, using Theorem :
• If we want to calculate Im T , we only need to calculate Im AT and then use Φ to “translate
back” to the range of T . In formula: Im T = Im(Φ−1 AT Ψ) = Im(Φ−1 AT ) = Φ−1 (Im AT ).
• If we want to calculate ker T , we only need to calculate ker AT and then use Ψ to “translate
back” to the kernel of T . In formula: ker T = ker(Φ−1 AT Ψ) = ker(AT Ψ) = Ψ−1 (ker AT ).


• If dim U = dim V , i.e., if n = m, then T is invertible if and only if AT is invertible. This is


the case if and only if det AT 6= 0.

Let us summarise. From Theorem 6.24 we obtain again the following very important theorem, see
Theorem 6.20 and Proposition 6.16.

Theorem 6.47. Let U, V be vector spaces and let T : U → V be a linear transformation. Then

dim U = dim(ker T ) + dim(Im T ). (6.21)

If dim U = dim V , then the following is equivalent:


(i) T is invertible.
(ii) T is injective, that is, ker T = {O}.
(iii) T is surjective, that is, Im T = V .

Note that if T is bijective, then we must have that dim U = dim V .
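For a matrix, the dimension formula (6.21) is just rank plus nullity. The following sketch is not part of the original notes; it is a minimal check assuming NumPy is available, with a hypothetical matrix chosen so that its rank is easy to see.

    # Hypothetical sketch (assumes NumPy): dim U = dim(ker T) + dim(Im T)
    # for the matrix of a linear map T: R^4 -> R^3.
    import numpy as np

    A = np.array([[1.0, 2.0, 3.0, 4.0],
                  [0.0, 1.0, 1.0, 1.0],
                  [1.0, 3.0, 4.0, 5.0]])   # third row = first row + second row

    rank = np.linalg.matrix_rank(A)        # dim(Im T) = 2
    nullity = A.shape[1] - rank            # dim(ker T) = 4 - 2 = 2
    print(rank, nullity)                   # 2 2, and 2 + 2 = dim R^4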

Let us see some examples.

FT
Example 6.48. We consider the operator of differentiation

T : P 3 → P3 , T p = p0 .
RA
Note that in this case the vector spaces U and V are both equal to P3 .

(i) Represent T with respect to the basis B = {p1 , p2 , p3 , p4 } and find its kernel where p1 =
1, p2 = X, p3 = X 2 , p4 = X 3 .

Solution. We only need to evaluate T in the elements of the basis and then write the re-
sult again as linear combination of the basis. Since in this case, the bases are “easy”, the
calculations are fairly simple:
D

T p1 = 0, T p2 = 1 = p1 , T p3 = 2X = 2p2 , T p4 = 3X 2 = 3p3 .

Therefore the matrix representation of T is


 
$$A_T^B = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

The kernel of AT is clearly span{~e1 }, hence ker T = span{p1 } = span{1}. 

(ii) Represent T with respect to the basis C = {q1 , q2 , q3 , q4 } and find its kernel where q1 =
X 3 , q2 = X 2 , q3 = X, q4 = 1.


Solution. Again we only need to evaluate T in the elements of the basis and then write the
result as linear combination of the basis.

$T q_1 = 3X^2 = 3q_2$, $\quad T q_2 = 2X = 2q_3$, $\quad T q_3 = 1 = q_4$, $\quad T q_4 = 0$.

Therefore the matrix representation of T is


 
0 0 0 0
C
3 0 0 0
AT =  0
.
2 0 0
0 0 1 0

The kernel of AT is clearly span{~e4 }, hence ker T = span{q4 } = span{1}. 

(iii) Represent T with respect to the basis B in the domain of T (in the “left” P3 ) and the basis
C in the target space (in the “right” P3 ).

FT
Solution. We calculate

T p1 = 0, T p2 = 1 = q4 , T p3 = 2X = 2q3 , T p4 = 3X 2 = 3q2 .

Therefore the matrix representation of T is


 
0 0 0 0
0 0 0 3
ATB,C =
0
.
0 2 0
RA
0 1 0 0

The kernel of AT is clearly span{~e1 }, hence ker T = span{p1 } = span{1}. 

(iv) Represent T with respect to the basis D = {r1 , r2 , r3 , r4 } and find its kernel where

r1 = X 3 + X, r2 = 2X 3 + X 2 + 2X, r3 = 3X 3 + X 2 + 4X + 1, r4 = 4X 3 + X 2 + 4X + 1.

Solution 1. Again we only need to evaluate T in the elements of the basis and then write the
D

result as linear combination of the basis. This time the calculations are a bit more tedious.

T r1 = 3X 2 + 1 = − 8r1 + 2r2 + r4 ,
2
T r2 = 6X + 2X + 2 = − 14r1 + 4r2 + r3 ,
T r3 = 9X 2 + 2X + 4 = − 24r1 + 5r2 + 2r3 + 2r4 ,
T r4 = 12X 2 + 2X + 4 = 30r1 + 8r2 + 2r3 + 2r4 .

Therefore the matrix representation of T is


 
−8 −14 −24 −30
 2 4 5 8
AD
T = 0
 .
2 2 2
1 0 2 2


In order to calculate the kernel of AT , we apply the Gauß-Jordan process and obtain
   
−8 −14 −24 −30 1 0 0 2
2 4 5 8  −→ · · · −→ 0 1 0 1
AD
  
T = 0
 .
2 2 2 0 0 1 0
1 0 2 2 0 0 0 0

The kernel of AT is clearly span{−2~e1 − ~e2 + ~e4 }, hence ker T = span{−2r1 − r2 + r4 } =


span{1}. 

Solution 2. We already have the matrix representation ACT and we can use it to calculate
AD
T . To this end define the vectors
       
1 2 3 4
0 1 1 1
ρ 1 , ρ
~1 =    ~4 =   .
2 ~3 = 4 , ρ
 ~2 =   , ρ
4
0 0 1 1

SD→C = 

1 2 3 4
0 1 1 1
1 2 4 4 ,
0 0 1 1
FT
Note that these vectors are the representations of our basis vectors r1 , . . . , r4 in the basis C.
The change-of-bases matrix from C to D and its inverse are, in coordinates,

 SC→D = SD→C−1
=

 0
−1
0 −2

1
1
0
0 −1
1 −2
0 −1
1

.
0
1
RA
It follows that

AD C
T = SC→D AT SD→C
     
0 −2 1 −2 0 0 0 0 1 2 3 4 −8 −14 −24 −30
 0 1 0 −1 3 0 0 0 0 1 1 1  2 4 5 8
=−1
  = .
0 1 0 0 2 0 0 1 2 4 4  0 2 2 2
1 0 −1 1 0 0 1 0 0 0 1 1 1 0 2 2
D

Let us see how this looks in diagrams. We define the two bijections of P3 with R4 which are
given by choosing the bases C and D by ΨC and ΨD :

Ψ C : P3 → R4 , ΨC (q1 ) = ~e1 , ΨC (q2 ) = ~e2 , ΨC (q3 ) = ~e3 , ΨC (q4 ) = ~e4 ,


4
ΨD : P 3 → R , ΨD (r1 ) = ~e1 , ΨD (r2 ) = ~e2 , ΨD (r3 ) = ~e3 , ΨD (r4 ) = ~e4 .

Then we have the following diagrams:

T T
P3 P3 P3 P3
ΨC ΨC ΨD ΨD
AC AD
R4 T
R4 R4 T
R4


We already know everything in the diagram on the left and we want to calculate AD
T in the
diagram on the right. We can put the diagrams together as follows:
T
P3 P3

ΨC ΨC
ΨD ΨD

SD→C AC SC→D
R4 R4 T
R4 R4
AD
T

We can also see that the change-of-basis maps SD→C and SC→D are

SD→C = ΨC ◦ Ψ−1
D , SC→D = ΨD ◦ Ψ−1
C .

For AD
T we obtain

FT
−1
AD C
T = ΨD ◦ T ◦ ΨD = SD→C ◦ AT ◦ SC→D .

Another way to draw the diagram above is


T
P3 P3

ΨC ΨC
RA
AC
ΨD R4 T
R4 S
ΨD
C C
→ →
D
SD

AD
T
R4 R4


B,C
Note that the matrices AB C D
D

T , AT , AT and AT all look different but they describe the same linear
transformation. The reason why they look different is that in each case we used different bases to
describe them.

Example 6.49. The next example is not very applied but it serves to practice a bit more. We
consider the operator given

T : M (2 × 2) → P2 , T ( ac db ) = (a + c)X 2 + (a − b)X + a − b + d.


Show that T is a linear transformation and represent T with respect to the bases B = {B1 , B2 , B3 , B4 }
of M (2 × 2) and C = {p1 , p2 , p3 } of P2 where
       
1 0 0 1 0 0 0 0
B1 = , B2 = , B3 = , B4 = ,
0 0 0 0 1 0 0 1


and
p1 = 1, p2 = X, p3 = X 2 .
Find bases for ker T and Im T and their dimensions.
a1 b1

Solution. First we verify that T is indeed a linear map. To this end, we take matrices A1 = c1 d1
and A2 = ac22 db22 and λ ∈ R. Then


       
a b1 a2 b2 λa1 + a2 λb1 + b2
T (λA1 + A2 ) = T λ 1 + =T λ
c1 d1 c2 d2 λc1 + c2 λd1 + d2
= (λa1 + a2 + λc1 + c2 )X 2 + (λa1 + a2 − λb1 − b2 )X + λa1 + a2 − (λb1 + b2 ) + λd1 + d2
= λ[(a1 + c1 )X 2 + (a1 − b1 )X + a1 − b1 + d1 )] + (a2 + c2 )X 2 + (a2 − b2 )X + a2 − b2 + d2 )
 

= λT (A1 ) + T (A2 ).

This shows that T is a linear transformation.


Now we calculate its matrix representation with respect to the given bases.

FT
T B1 = X 2 + X + 1 = p1 + p2 + p3 ,
T B2 = −X = −p2 ,
T B3 = X 2 = p3 ,
T B4 = 1 = p1 .

Therefore the matrix representation of T is


RA
 
1 0 0 1
AT = 1 −1 0 0
1 0 1 0

In order to determine the kernel and range of AT , we apply the Gauß-Jordan process:
     
1 0 0 1 1 0 0 1 1 0 0 1
AT = 1 −1 0 0 −→ 0 −1 0 −1 −→ 0 1 0 1 .
1 0 1 0 0 0 1 −1 0 0 1 −1
D

So the range of AT is R3 and its kernel is ker e1 +~e2 −~e3 −~e3 }. Therefore Im T = P2 and
 AT = span{~
ker T = span{B1 + B2 − B3 − B4 } = span −11 −11 . For their dimensions we find dim(Im T ) = 3


and dim(ker T ) = 1. 

Example 6.50 (Reflection in R2 ). In R2 , consider the line L : 3x − 2y = 0. Let R : R2 → R2


which takes a vector in R2 and reflects it on the line L, see Figure 6.5. Find the matrix representation
of R with respect to the standard basis of R2 .
Observation. Note that L is the line which passes through the origin and is parallel to the vector
~v = ( 23 ).

Solution 1 (use coordinates adapted to the problem). Clearly, there are two directions which
are special in this problem: the direction parallel and the direction orthogonal to the line. So a


[Figure 6.5: The picture shows the reflection R on the line L. The vector $\vec v$ is parallel to L, hence $R\vec v = \vec v$. The vector $\vec w$ is perpendicular to L, hence $R\vec w = -\vec w$.]

~ = −32 . Clearly, R~v = ~v



~ where ~v = ( 23 ) and w
basis which is adapted to the exercise, is B = {~v , w}
and Rw ~ = −w.~ Therefore the matrix representation of R with respect to the basis B is
 
B 1 0
AR = .
0 −1
RA
In order to obtain the representation AR with respect to the standard basis, we only need to perform
a change of basis. Recall that change-of-bases matrices are given by
   
2 −3 −1 1 2 3
SB→can = (~v |w)
~ = , Scan→B = SB→can = .
3 2 13 −3 2
Therefore
     
1 2 −3 1 0 2 3 1 −5 12
AR = SB→can AB
R Scan→B = = . 
D

13 3 2 0 −1 −3 2 13 12 5

Solution 2 (reduce the problem to a known reflection). The problem would be easy if we
were asked to calculate
 the matrix representation of the reflection on the x-axis. This would simply
1 0
be A0 = . Now we can proceed as follows: First we rotate R2 about the origin such that
0 −1
the line L is parallel to the x-axis, then we reflect on the x-axis and then we rotate back. The result
is the same as reflecting on L. Assume that Rot is the rotation matrix. Then

AT = Rot−1 ◦ A0 ◦ Rot. (6.22)

How can we calculate Rot? We know that Rot~v = ~e1 and that Rotw ~ = ~e2 . It follows that
Rot−1 = (~v |w)
~ = −32 32 . Note that up to a numerical factor, this is SB→can . We can calculate


easily that Rot = (Rot−1 )−1 = 13


1 2 −3 −5 12
 
3 2 . If we insert this in (6.22), we find again AR = 12 5 . 


[Figure 6.6: The figure shows the plane E : x − 2y + 3z = 0 and, for the vector $\vec x$, its orthogonal projection $P\vec x$ onto E and its reflection $R\vec x$ about E, see Example 6.51.]

Solution 3 (straight forward calculation).

 

 
2
3
−3
=w


~ = −AT w
FT
= ~v = AT ~v =


~ =−
We can form a system of linear equations in order
to find AT . We write AR = ac db with unknown numbers a, b, c, d. Again, we use that we know
~ = −w.
that AT ~v = ~v and AT w ~ This gives the following equations:

a b
c d

  

a b
2
3
=

−3
2a + 3b
2c + 3d
  
=

,

3a − 2b

RA
2 c d 2 3c − 2d

which gives the system

2a + 3b = 2, 2c + 3d = 3, 3a − 2b = −3,
3c − 2d = 2,
5 −5 12
, b = c = 12 5

Its unique solution is a = − 13 13 , d = 13 , hence AR = 12 5 . 
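All three solutions can be checked numerically at once. The following sketch is not part of the original notes; it is a minimal illustration assuming NumPy is available, using the change of basis of Solution 1.

    # Hypothetical sketch (assumes NumPy): the reflection matrix of Example 6.50.
    import numpy as np

    v = np.array([2.0, 3.0])     # direction of the line L: 3x - 2y = 0
    w = np.array([-3.0, 2.0])    # a vector perpendicular to L

    S = np.column_stack([v, w])              # S_{B->can}
    A_R_B = np.diag([1.0, -1.0])             # reflection in the adapted basis
    A_R = S @ A_R_B @ np.linalg.inv(S)       # = (1/13) [[-5, 12], [12, 5]]
    print(np.round(13 * A_R))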

Example 6.51 (Reflection and orthogonal projection in R3 ). In R3 , consider the plane


E : x − 2y + 3z = 0. Let R : R3 → R3 which takes a vector in R3 and reflects it on the plane E and
D

let P : R3 → R3 be the orthogonal projection onto E. Find the matrix representation of R with
respect to the standard basis of R3 .
Observation. 1 Note
 that E is the plane which
 2  passes through
 0  the origin and is orthogonal to the
vector ~n = −2 . Moreover, if we set ~v = 1 and w ~ = 3 , then it is easy to see that {~v , w}
~ is
3 0 2
a basis of E.
Solution 1 (use coordinates adapted to the problem). Clearly, a basis which is adapted to
the exercise is B = {~n, ~v , w}
~ because for these vectors we have R~v = ~v , Rw ~ R~n = −~n, and
~ = w,
P ~v = ~v , P w ~ P ~n = ~0. Therefore the matrix representation of R with respect to the basis B is
~ = w,
 
1 0 0
AB
R =
0 1 0
0 0 −1


and the one of P is
$$A_P^B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
In order to obtain the representations AR and AP with respect to the standard basis, we only need
to perform a change of basis. Recall that change-of-bases matrices are given by
   
2 0 1 13 2 −3
−1 1
~ n) = 1 3 −2 ,
SB→can = (~v |w|~ Scan→B = SB→can = −3 6 5 .
28
0 2 3 2 −4 6
Therefore
   
2 0 1 1 0 0 13 2 −3
1 
AR = SB→can AB
R Scan→B = 1 3 −2 0 1 0 −3 6 5
28
0 2 3 0 0 −1 2 −4 6
 
6 2 −3
1

FT
=  2 3 6
7
−3 6 −2
and
   
2 0 2 1 0 0 13 2 −3
1 
AP = SB→can AB
P Scan→B = 1 3 −1 0 1 0 −3 6 5
28
0 2 3 0 0 0 2 −4 6
 
13 2 −3
1 
RA
= 2 10 6 
14
−3 6 5
Solution 2 (reduce the problem to a known reflection). The problem would be easy if we
were asked to calculate
 thematrix representation of the reflection on the xy-plane. This would
1 0 0
simply be A0 = 0 1 0. Now we can proceed as follows: First we rotate R3 about the origin
0 0 −1
such that the plane E is parallel to the xy-axis, then we reflect on the xy-plane and then we rotate
back. The result is the same as reflecting on the plane E. We leave the details to the reader. An
D

analogous procedure works for the orthogonal projection. 


Solution 3 (straight forward calculation).
 a11 a12 a13  Lastly, we can form a system of linear equations in
order to find AR . We write AR = aa21 aa22 aa23 with unknowns aij . Again, we use that we know
31 32 33
that AR~v = ~v , AR w ~ and AR~n = −~n. This gives a system of 9 linear equations for the nine
~ =w
unknowns aij which can be solved. 

Remark 6.52. Yet another solution is the following. Let Q be the orthogonal projection onto ~n.
We already know how to calculate its representing matrix:
  
1 −2 3 x
h~x , ~ni x − 2y + 3z 1 
Q~x = ~
n = ~
n = −2 4 −6  y  .
k~nk2 14 14
3 −6 9 z


1 −2 3
 
1
Hence AQ = 14
−2 4 −6 . Geometrically, it is clear that P = id −Q and R = id −2Q. Hence it
3 −6 9
follows that
     
1 0 0 1 −2 3 13 2 −3
1 −2 1
AP = id −AQ = 0 1 0 − 4 −6 =  2 10 6
14 14
0 0 1 3 −6 9 −3 6 5

and      
1 0 0 1 −2 3 6 2 −3
1 1
AR = id −2AQ = 0 1 0 − −2 4 −6 =  2 3 6 .
7 7
0 0 1 3 −6 9 −3 6 −2
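The formulas of this remark are convenient to evaluate numerically. The following sketch is not part of the original notes; it is a minimal illustration assuming NumPy is available, for the plane E : x − 2y + 3z = 0 with normal vector n.

    # Hypothetical sketch (assumes NumPy): P = id - Q and R = id - 2Q from Remark 6.52.
    import numpy as np

    n = np.array([1.0, -2.0, 3.0])
    Q = np.outer(n, n) / (n @ n)   # orthogonal projection onto n
    P = np.eye(3) - Q              # projection onto E  = (1/14)[[13,2,-3],[2,10,6],[-3,6,5]]
    R = np.eye(3) - 2 * Q          # reflection about E = (1/7) [[6,2,-3],[2,3,6],[-3,6,-2]]
    print(np.round(14 * P), np.round(7 * R), sep="\n")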

Change of bases as matrix representation of the identity


Finally let us observe that a change-of-bases matrix is nothing else than the identity matrix written

FT
with respect to different bases. To see this let $B = \{\vec v_1, \ldots, \vec v_n\}$ and $C = \{\vec w_1, \ldots, \vec w_n\}$ be bases of $\mathbb{R}^n$. We define the linear bijections $\Psi_B$ and $\Psi_C$ as follows:
Rn . We define the the linear bijections ΨB and ΨC as follows:

ΨB : Rn → Rn , ΨB (~e1 ) = ~v1 , . . . , ΨB (~en ) = ~vn ,


n n
ΨC : R → R , ΨC (~e1 ) = w
~ 1 , . . . , ΨC (~en ) = w
~ n,

Moreover we define the change-of-bases matrices


RA
SB→can = (~v1 | · · · |~vn ), ~ 1 | · · · |w
SC→can = (w ~ n ).

Note that these matrices are exactly the matrix representations of ΨB and ΨC . Now let us consider
the diagram
id
Rn Rn
Ψ−1
B Ψ−1
C
Aid
Rn Rn
D

Therefore

$$A_{id} = \Psi_C^{-1} \circ \mathrm{id} \circ \Psi_B = \Psi_C^{-1} \circ \Psi_B = S_{C\to can}^{-1} \circ S_{B\to can} = S_{can\to C} \circ S_{B\to can} = S_{B\to C}.$$

You should now have understood

• why every linear map between finite dimensional vector spaces can be written as a matrix
and why the matrix depends on the chosen bases,
• how the matrix representation changes if the chosen bases changes,
• in particular, how the matrix representation changes if the chosen bases are reordered,


• etc.

You should now be able to


• represent a linear map between finite dimensional vector spaces as a matrix,
• use the matrix representation of a linear map to calculate its kernel and range,
• interpret a matrix as a linear map between finite dimensional vector spaces,
• etc.

Ejercicios.
1. De los ejercicios 1 al 14 (exceptuando el 11.) de la sección 6.1 obtenga la representación
matricial de T en las respectivas bases canónicas.
2. Encuentre la representación matricial en la respectiva base canónica de las siguientes trans-
formaciones. En Pn tome la base {1, X, X 2 , . . . , X n+1 }.

FT
(a) En P4 , D(p) = Xp0 − p.
(b) En P4 , D(p) = p00 .
(c) En Pn , D(p) = p(m) , la m−ésima derivada de p.
(d) En M (3 × 3), T (A) = A − At .
R1
(e) En Pn , J : Pn → R dada por J(p) = 0
p(t)dt.
3. Para cada transformación lineal dada, encuentre su representación matricial en las bases
RA
indicadas:
   
2 2 1 −1
(a) T : R → R , T (x, y) = (y, 0) de base canónica a B2 = , .
1 1
        
x   1 1 1 
x+y+z

(b) T : R3 → R2 , T y  = de B1 =  0 , 1 , 0 a B2 =
x−y
z −1 1 0
 
   
0 1
, .
1 0
D

        
1  2 2 0 
(c) T : R3 → R3 , T (~x) = ~x × 2 en la base B = 4 , −1 , −3 .
3 6 0 2
 
 
2a + b + c
(d) T : P2 → R2 , T (aX 2 + bX + c) = de B1 = {X 2 + 1, X 2 + X, X 2 + X + 1}
    b − 3c
2 1
a B2 = , .
3 −1

4. Sea V = span{cos x, sin x, x cos x, x sin x} y B = {cos x, sin x, x cos x, x sin x}.
(a) Demuestre que B es base de V .
(b) Para D : V → V dada por D(f ) = f 0 , obtenga [D]B . ¿D es invertible?


5. Sea V = span{1, ex , e−x }, B1 = {1, ex , e−x } y B2 = {1, cosh x, sinh x}. Considere D : V → V
dada por D(f ) = f 0 , obtenga [D]B
B2 . ¿D es invertible?
1

6. Sea w
~ un vector nonulo y T: R3 → R3 dada por T (~x) = projw~ ~x. Encuentre una base B de
1 0 0
R3 tal que [T ]B = 0 0 0.
0 0 0

7. En R3 , sean E : x+y +z = 0 y T : R3 → R3 dada por T (~x) = reflexión de ~x con respecto a E.


 
1 0 0
(a) Encuentre una base B tal que [T ]B = 0 −1 0
0 0 1
(b) Obtenga [T ]can .
(c) Describa T en las coordenadas usuales.

8. (a) Sea S : R4 → R una transformación lineal tal que S e~1 = 4, S e~2 = −3, S e~3 = 0 y S e~4 = π.

FT
~ ∈ R4 tal que S~x = h~x , wi
Muestre que existe w ~ para todo ~x ∈ R4 .
     
1 0 0
4
1 0 0
(b) Sea S : R → R una transformación lineal tal que S 0 = 1, S 1 = −2, S 1 = 3
    

0 1 0
 
1
0
y S0 = −1. Encuentre w
 ~ ∈ R4 tal que S~x = h~x , wi~ para todo ~x ∈ R4 .
RA
4

9. Sea T : V → W una transformación lineal y suponga que B = {v1 , . . . vn } es una base de V .


Si para cada i ∈ {1, 2, ...n} se tiene que T (vi ) = ~0, muestre que T = O (la transformación que
a todo elemento de V lo envı́a al vector cero de W ).

10. Sea T : Rn → Rn transformación lineal tal que T (~ei ) = ei+1 si 1 ≤ i < n y T (~en ) = 0. Muestre
que T n = O ¿Cómo es la representación matricial de T, T 2 , . . . , T n−1 en la base canónica?
D

6.5 Summary
Linear maps
A function T : U → V between two K-vector spaces U and V is called linear map (or linear function
or linear transformation) if it satisfies

T (u1 + λu2 ) = T (u1 ) + λT (u2 ) for all u1 , u2 ∈ U and λ ∈ K.

The set of all linear maps from U to V is denoted by L(U, V ).

• The composition of linear maps is a linear map.


• If a linear map is invertible, then its inverse is a linear map.


• If U, V are K-vector spaces then L(U, V ) is a K-vector space. This means: If S, T ∈ L(U, V )
and λ ∈ K, then S + λT ∈ L(U, V ).

For a linear map T : U → V we define the following sets

ker T = {u ∈ U : T u = O} ⊆ U,
Im T = {T u : u ∈ U } ⊆ V.

ker T is called kernel of T or null space of T . It is a subspace of U . Im T is called image of T or


range of T . It is a subspace of V .
The linear map T is called injective if T u1 = T u2 implies u1 = u2 for all u1 , u2 ∈ U . The linear
map T is called surjective if for every v ∈ V exist some u ∈ U such that T u = v. The linear map
T is called bijective if it is injective and surjective.
Let T : U → V be a linear map.

• The following are equivalent:

(i) T is injective.
(ii) T u = O implies that u = O.
(iii) ker T = {O}.

• The following are equivalent:

(i) T is surjective.
FT
RA
(ii) Im T = V .

• If T is bijective, then necessarily dim U = dim V . In other words: if dim U 6= dim V , then
there exists no bijection between them.

Let U, V be K-vector spaces and T : U → V a linear map. Moreover, let E : U → U , F : V → V


be linear bijective maps. Then
D

ker(F T ) = ker(T ), ker(T E) = E −1 (ker(T )),


Im(F T ) = F (Im(T )), Im(T E) = Im(T ).

and

dim ker(T ) = dim ker(F T ) = dim ker(T E) = dim ker(F T E),


dim Im(T ) = dim Im(F T ) = dim Im(T E) = dim Im(F T E).

If dim U = n < ∞ then

dim(ker(T )) + dim(Im(T )) = n.


Linear maps and matrices


Every matrix A ∈ MK (m × n) represents a linear map from Kn to Km by

TA : Kn → Km , ~x 7→ A~x.

Very often we write A instead of TA .


On the other hand, every linear map T : U → V between finite dimensional vector spaces U and V
has a matrix representation. Let B = {u1 , . . . , un } be a basis of U and C = {v1 , . . . , vm } be a basis
of V . Assume that T uj = a1j v1 + · · · + amj vm . Then the matrix representation of T with respect
to the basis B and C is AT = (aij )i=1,...,m ∈ M (m × n). Note that the matrix representation of T
j=1,...,n
depends on the chosen bases in U and V .
If we define the functions Ψ and Φ as
   
α1 β1
 .. 
n
Ψ : U → K , Ψ(α1 u1 + . . . αn un ) =  .  , Φ : V → Km , Φ(β1 v1 + . . . βm vm ) =  ...  ,
 

αn βm

FT
then these functions are linear and $\Phi^{-1} \circ A_T \circ \Psi = T$ and $\Phi \circ T \circ \Psi^{-1} = A_T$. In a diagram this is

Ψ
U

Rn
T

AT
V

Rm
Φ
RA
Matrices
Let A ∈ M (m × n).
• The column space CA of A is the linear span of its column vectors. It is equal to Im A.
• The row space RA of A is the linear span of its row vectors. It is equal to the orthogonal
complement of ker A.
• dim RA = dim CA = dim(Im A) = number of columns with pivots in any echelon form of A.
D

Kernel and image of A:


• dim(ker A) = number of free variables = number of columns without pivots in any row echelon
form of A.
ker A is equal to the solution set of A~x = ~0 which can be determined for instance with the
Gauß or Gauß-Jordan elimination.
• dim(Im A) = dim CA = number of columns with pivots in any row echelon form of A.
Im(A) can be found by either of the following two methods:
(i) row reduction of A. The columns of the original matrix A which correspond to the
columns of the row reduced echelon form of A are a basis of Im A.
(ii) column reduction of A. The remaining columns are a basis of Im A.


6.6 Exercises
1. Determine si las siguientes funciones son lineales. Si lo son, calcule el kernel y la dimensión
del kernel.

 
x  
3 2x + y x−z
(a) A : R → M (2 × 2), A y =
  ,
x + y − 3z z
z
 
x  
3 2xy x−z
(b) B : R → M (2 × 2), A y =
  ,
x + y − 3z z
z
(c) D : P3 → P4 , Dp = p0 + xp,
 
a+b b+c c+d
(d) T : P3 → M (2 × 3), T (ax3 + bx2 + cx + d) = ,
0 a+d 0
 
a+b b+c c+d
T (ax3 + bx2 + cx + d) =

FT
(e) T : P3 → M (2 × 3), .
0 a+d 3

2. Sean U, V espacios vectoriales sobre K (con K = R o K = C) y sea T : U → V una función lin-


eal invertible. Entonces podemos considerar su función inversa T −1 : Im(T ) → U . Demuestre
que es una función lineal.

3. Sean U, V, W espacios vectoriales sobre K (con K = R o K = C) y sean T : U → V , S : V → W


funciones lineales. Demuestre que la composición ST : U → W también es una función lineal.
RA
4. Sean U, V espacios vectoriales sobre K (con K = R o K = C). Con L(U, V ) denotamos el
conjunto de todas las transformaciones lineales de U a V . Demuestre que L(U, V ) es un
espacio vectorial sobre K. ¿Qué se puede decir sobre dim L(U, V )?

5. Sean U, V espacios vectoriales sobre K (con K = R o K = C). Sabemos de Ejercicio 4. que


L(U, V ) es un espacio vectorial. Fije un vector v0 ∈ V . Demuestre que la siguiente función es
una función lineal:
D

Φv0 : L(U, V ) → U, Φv0 (T ) := T (v0 ).

6. Sean      
1 3 1 1 1 0
A= , E= , F = .
2 6 −1 1 0 −1

(a) Demuestre que E y F son invertibles. Describa como actuan geométricamente en R2 .


(b) Calcule Im(A), ker(A) y sus dimensiones. Dibuja Im(A) y ker(A), diga qué objetos
geométricas son.
(c) Calcule Im(A), Im(F A), Im(AE) y sus dimensiones. Dibújalos y diga cual es la relación
entre ellos.


(d) Calcule ker(A), ker(F A), ker(AE) y sus dimensiones. Dibújalos y diga cual es la relación
entre ellos.

7. De los siguientes matrices, calcule kernel, imagen y las dimensiones correspondientes.


 
  1 1 5 1  
1 4 7 2 3 2 13 1  1 2 3
A = 2 5 8 4 , B= 0 2 7 −1 ,
 C = 1 2 3 .
3 6 9 6 1 2 9
4 5 25 1

8. Sea A ∈ M (m × n). Demuestre:


(a) A inyectiva =⇒ m ≥ n.
(b) A sobreyectiva =⇒ n ≥ m.
Demuestre que la implicación “⇐=” en (i) and (ii) en general es falsa.

11. Sean m, n ∈ N y A ∈ M (m × n).


FT
9. Sea A ∈ M (m × n) y suponga que A es invertible. Demuestre que m = n.

10. Sea A ∈ M (n × 1) y B ∈ M (1 × n) ambas no nulas. Describa Im(AB).

(a) ¿Cuáles son las dimensiones posibles de ker A y Im A?


(b) Para cada j = 0, 1, 2, 3 encuentre una matriz Aj ∈ M (2 × 3) con dim(ker Aj ) = j, es
RA
decir: encuentre matrices A0 , A1 , A2 , A3 con dim(ker A0 ) = 0, dim(ker A1 ) = 1, . . . . Si
tal matriz no existe, explique por qué no existe.

12. (a) Encuentre una transformación lineal de M (5×5) a M (3×3) diferente de la transfomación
nula.
(b) Encuentre por lo menos dos diferentes funciones lineales biyectivas de M (2 × 2) a P3 .
(c) Existe una función lineal biyectiva S : M (2 × 2) → Pk para k ∈ N, k 6= 3?
D

13. Sean V y W espacios vectoriales.

(a) Sea U ⊂ V un subspacio y sean u1 , . . . , uk ∈ U . Demuestre que gen{u1 , . . . uk } ⊂ U .


(b) Sean u1 , . . . , uk , w1 , . . . , wm ∈ V . Demuestre que lo siguiente es equivalente:
i. gen{u1 , . . . , uk } = gen{w1 , . . . , wm }.
ii. Para todo j = 1, . . . , k tenemos uj ∈ gen{w1 , . . . , wm } y para todo ` = 1, . . . , m
tenemos w` ∈ gen{u1 , . . . , uk }.
iii. Sean v1 , v2 , v3 , . . . , vm ∈ V y sea c ∈ R. Demuestre que
gen{v1 , v2 , v3 , . . . , vm } = gen{v1 + cv2 , v2 , v3 , . . . , vm }.
(c) Sean v1 , . . . , vk ∈ V y sea A : V → W una función lineal invertible. Demuestre que
dim gen{v1 , . . . , vk } = dim gen{Av1 , . . . , Avk }. ¿Es verdad si A no es invertible?


     
1 0 1
14. (a) Sean ~v1 = 4 , ~v2 = 1 , ~v3 = 0 y sea B = {~v1 , ~v2 , ~v3 }. Demuestre que B es
7 2 2
   
1 0
una base de R3 y escriba los vectores ~x = 2 , ~y = 1 en términos de la base B.
3 1
     
1 2 3 2 3 2
15. Sean R = , S= , T = . Demuestre que B = {R, S, T } es una base
0 3 0 7 0 1
del espacio de las matrices triangulares superiores y exprese las matrices
     
1 1 0 0 1 0
K= , L= , M=
0 1 0 1 0 1

en términos de la base B.
       
1 3 ~ −1 ~ 3
16. Sean ~a1 = , ~a2 = , b1 = , b2 = ∈ R2 y sean A = {~a1 , ~a2 }, B = {~b1 , ~b2 }.

FT
2 1 1 2

(a) Demuestre qu A y B son bases de R2 .


 
7
(b) Sea (~x)A = . Encuentre (~x)B y ~x (en la representación estandar).
8
 
3
(c) Sea (~y )B = . Encuentre (~y )A y ~y (en la representación estandar).
5
RA
     
2 −1 4
17. Sea B = {~b1 , ~b2 } una base de R2 y sean ~x1 = , ~x2 = , ~x3 = (dados en
3 1 6
coordenadas cartesianas).
   
3 3
(a) Si se sabe que ~x1 = , ~x2 = , es posible calcular ~b1 y ~b2 ? Si sı́, calcúlelos. Si
1 B 2 B
no, explique por qué no es posible.
   
3 6
es posible calcular ~b1 y ~b2 ? Si sı́, calcúlelos. Si
D

(b) Si se sabe que ~x1 = , ~x3 = ,


1 B 2 B
no, explique por qué no es posible.
   
3 6
(c) ¿Existen ~b1 y ~b2 tal que ~x1 = , ~x2 = ? Si sı́, calcúlelos. Si no, explique por
1 B 2 B
qué no es posible.
   
3 2
(d) ¿Existen ~b1 y ~b2 tal que ~x1 = , ~x3 = ? Si sı́, calcúlelos. Si no, explique por
1 B 5 B
qué no es posible.

18. (a) Demuestre que la siguente función es lineal:

Φ : M (2 × 2) → M (2 × 2), Φ(A) = At


(b) Sea B = {E1 , E2 , E3 , E4 } la base estandar1 de M (2 × 2) . Encuentre la matriz que


representa a Φ con respecto a esta base.
       
1 2 1 0 0 1 1 0
(c) Sean R = , S= , T = , U= y sea C = {R, S, T, U }.
3 4 0 1 −1 0 1 0
Demuestre que C es una base de M (2 × 2) y escriba Φ como matriz con respecto a esta
base.

19. (a) Demuestre que T : P3 → P3 , T p = p0 es una función lineal.


(b) Determine ker(T ), Im(T ), dim(ker(T )), dim(Im(T )).
(c) Sea B = {1, X, X 2 , X 3 } la base estandar de P3 . Encuentre la matriz que representa a T
con respecto a esta base.
(d) Sean q1 = X + 1, q2 = X − 1, q3 = X 2 + X, q4 = X 3 + 1. Demuestre que C =
{q1 , q2 , q3 , q4 } es una base de P3 . .
(e) Encuentre la matriz con respecto a la base C que representa a T .

FT
Rx
20. Sean T : P3 → P4 dada por T (p) = 0 p(t)dt y D : P4 → P3 dada por D(p) = p0
(a) Muestre que T, D son transformaciones lineales y para cada una encuentre su kernel, su
imagen y las dimensiones del kernel y la imagen.
(b) ¿Se cumple que T (D(p)) = p para todo p ∈ P4 ? En caso de que la respuesta sea negativa,
¿en cuáles casos se cumple?
(c) Repetir lo del inciso anterior para D(T (p)) donde p ∈ P3 .
~ ∈ Rn un vector no nulo. Muestre que existe T : Rn → R tal que T w
RA
21. Sea w ~ 6= 0. Calcule
dim(ker T ) y dim(Im T ).
22. En Rn , sea ϕ : Rn → R una transformación lineal diferente de la trivial.

(a) Muestre que existe w ~ ∈ Rn tal que ϕ(~x) = h~x , wi


~ para todo ~x ∈ Rn . ¿Cual es la
dimensión de ker ϕ? ¿Si n = 2 ó n = 3 como luce ker ϕ?
     
1 −2 2
(b) Sean v1 = , v2 = y v3 = . Encuentre algún ϕ : R2 → R tal que
1 0 −1
ϕ(~v1 ), ϕ(~v2 ) y ϕ(~v3 ) son todos diferentes de 0.
D

(c) Sean ~v1 , ~v2 , . . . , ~vn vectores de R2 todos distintos de ~0. Muestre que existe ϕ : R2 → R
tal que ϕ no se anula en ninguno de ellos.

       
1E 1 0 0 1 0 0 0 0
1 = , E2 = , E3 = , E4 = .
0 0 0 0 1 0 0 1

Chapter 7

Orthonormal bases and orthogonal


projections in Rn

In this chapter we will work in Rn and not in arbitrary vector spaces since we want to explore in
more detail its geometric properties. In particular we will discuss orthogonality. Note that in an
arbitrary vector space, we do not have the concept of angles or orthogonality. Everything that we
will discuss here can be extended to inner product spaces where the inner product is used to define
angles. Recall that we showed in Theorem 2.19 that for non-zero vectors ~x, ~y ∈ Rn the angle ϕ
between them satisfies the equation
h~x , ~y i
RA
cos ϕ = .
k~xk k~y k

In a general inner product space (V, h· , ·i) this equation is used to define the angle between two
vectors. In particular, two vectors are said to be orthogonal if their inner product is 0. Inner
product spaces are useful for instance in physics, and maybe in some not so distant future there
will be a chapter in these lecture notes about them.
First we will define what the orthogonal complement of a subspace of Rn is and we will see that
the direct sum of a subspace and its orthogonal complement gives us all of Rn .
D

We already know what the orthogonal projection of a vector ~x onto another vector ~y 6= ~0 is (see
Section 2.3). Since it is independent of the norm of ~y , we can just as well consider it the orthogonal
projection of ~x onto the line generated by ~y . In this chapter we will generalise the concept of an
orthogonal projection onto a line to the orthogonal projection onto an arbitrary subspace.
As an application, we will discuss the least squares method for the approximation of data.

7.1 Orthonormal systems and orthogonal bases


Recall that two vectors ~x and ~y are orthogonal (or perpendicular ) to each other if and only if
h~x , ~y i = 0. In this case we write ~x ⊥ ~y .


Definition 7.1. (i) A set of vectors ~x1 , . . . , ~xk ∈ Rn is called an orthogonal set if they are
pairwise orthogonal; in formulas we can write this as

h~xj , ~x` i = 0 for j 6= `.

(ii) A set of vectors ~x1 , . . . , ~xk ∈ Rn is called an orthonormal set if they are pairwise orthonormal;
in formulas we can write this as
(
1 for j = `,
h~xj , ~x` i =
0 for j 6= `.

The difference between an orthogonal and an orthonormal set is that in the latter we additionally
require that each vector of the set satisfies h~xj , ~xj i = 1, that is, that k~xj k = 1. Therefore an
orthogonal set may contain vectors of arbitrary lengths, including the vector ~0, whereas in an
orthonormal all vectors set must have length 1. Note that every orthonormal system is also an
orthogonal system. On the other hand, every orthogonal system which does not contain ~0 can be
converted to an orthonormal one by normalising each vector (that is, by dividing each vector by its

FT
norm).

Examples 7.2. (i) The following systems are orthogonal systems but not orthonormal systems since the norm of at least one of their vectors is different from 1:
\[ \left\{ \begin{pmatrix}1\\-1\end{pmatrix}, \begin{pmatrix}3\\3\end{pmatrix} \right\}, \quad \left\{ \begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}1\\-1\end{pmatrix}, \begin{pmatrix}3\\3\end{pmatrix} \right\}, \quad \left\{ \begin{pmatrix}1\\0\\0\end{pmatrix}, \begin{pmatrix}0\\1\\2\end{pmatrix}, \begin{pmatrix}0\\-2\\1\end{pmatrix} \right\}. \]

(ii) The following systems are orthonormal systems:
\[ \left\{ \tfrac{1}{\sqrt 2}\begin{pmatrix}1\\-1\end{pmatrix}, \tfrac{1}{\sqrt 2}\begin{pmatrix}1\\1\end{pmatrix} \right\}, \quad \left\{ \begin{pmatrix}1\\0\\0\end{pmatrix}, \tfrac{1}{\sqrt 5}\begin{pmatrix}0\\1\\2\end{pmatrix}, \tfrac{1}{\sqrt 5}\begin{pmatrix}0\\-2\\1\end{pmatrix} \right\}. \]

Lemma 7.3. Every orthonormal system is linearly independent.

Proof. Let ~x1 , . . . , ~xk be an orthonormal system and consider
\[ \vec 0 = \alpha_1 \vec x_1 + \alpha_2 \vec x_2 + \dots + \alpha_{k-1}\vec x_{k-1} + \alpha_k \vec x_k. \]
We have to show that all αj must be zero. To do this, we take the inner product on both sides with the vectors ~xj . Let us start with ~x1 . We find
\[ \langle \vec 0, \vec x_1\rangle = \langle \alpha_1 \vec x_1 + \alpha_2 \vec x_2 + \dots + \alpha_k \vec x_k,\ \vec x_1\rangle = \alpha_1\langle \vec x_1, \vec x_1\rangle + \alpha_2\langle \vec x_2, \vec x_1\rangle + \dots + \alpha_k\langle \vec x_k, \vec x_1\rangle. \]
Since h~0 , ~x1 i = 0, h~x1 , ~x1 i = k~x1 k2 = 1 and h~x2 , ~x1 i = · · · = h~xk , ~x1 i = 0, it follows that
\[ 0 = \alpha_1 + 0 + \dots + 0 = \alpha_1. \]
Now we can repeat this process with ~x2 , ~x3 , . . . , ~xk to show that α2 = · · · = αk = 0.


Remark. The lemma shows that every orthonormal system of n vectors in Rn is a basis of Rn .

Definition 7.4. An orthonormal basis of Rn is a basis whose vectors form an orthonormal set.
Occasionally we will write ONB for “orthonormal basis”.

Examples 7.5 (Orthonormal bases of Rn ).

(i) The canonical basis ~e1 , . . . , ~en is an orthonormal basis of Rn .

(ii) The following systems are examples of orthonormal bases of R2 :
\[ \left\{ \tfrac{1}{\sqrt 2}\begin{pmatrix}1\\-1\end{pmatrix}, \tfrac{1}{\sqrt 2}\begin{pmatrix}1\\1\end{pmatrix} \right\}, \quad \left\{ \tfrac{1}{\sqrt{13}}\begin{pmatrix}2\\3\end{pmatrix}, \tfrac{1}{\sqrt{13}}\begin{pmatrix}-3\\2\end{pmatrix} \right\}, \quad \left\{ \tfrac{1}{5}\begin{pmatrix}3\\4\end{pmatrix}, \tfrac{1}{5}\begin{pmatrix}-4\\3\end{pmatrix} \right\}. \]

(iii) The following systems are examples of orthonormal bases of R3 :
\[ \left\{ \tfrac{1}{\sqrt 3}\begin{pmatrix}1\\-1\\1\end{pmatrix}, \tfrac{1}{\sqrt 2}\begin{pmatrix}1\\1\\0\end{pmatrix}, \tfrac{1}{\sqrt 6}\begin{pmatrix}-1\\1\\2\end{pmatrix} \right\}, \quad \left\{ \tfrac{1}{\sqrt{14}}\begin{pmatrix}1\\2\\3\end{pmatrix}, \tfrac{1}{\sqrt{10}}\begin{pmatrix}-3\\0\\1\end{pmatrix}, \tfrac{1}{\sqrt{35}}\begin{pmatrix}1\\-5\\3\end{pmatrix} \right\}. \]

Exercise 7.6. Show that every orthonormal basis of R2 is of the form \(\left\{ \begin{pmatrix}\cos\varphi\\ \sin\varphi\end{pmatrix}, \begin{pmatrix}-\sin\varphi\\ \cos\varphi\end{pmatrix} \right\}\) or \(\left\{ \begin{pmatrix}\cos\varphi\\ \sin\varphi\end{pmatrix}, \begin{pmatrix}\sin\varphi\\ -\cos\varphi\end{pmatrix} \right\}\) for some ϕ ∈ R. See also Exercise 7.13.
We will see in Corollary 7.27 that every orthonormal system in Rn can be completed to an or-
thonormal basis. In Section 7.5 we will show how to construct an orthonormal basis of a subspace
of Rn from a given basis. In particular it follows that every subspace of Rn has an orthonormal
basis.
Orthonormal bases are very useful. Among other things it is very easy to write a given vector ~w ∈ Rn as a linear combination of such a basis. Recall that if we are given an arbitrary basis ~z1 , . . . , ~zn of Rn and we want to write a vector ~x as linear combination of this basis, then we have to find coefficients α1 , . . . , αn such that ~x = α1 ~z1 + · · · + αn ~zn , which means we have to solve an n × n system in order to determine the coefficients. If however the given basis is an orthonormal basis, then calculating the coefficients reduces to evaluating n inner products as the following theorem shows.

Theorem 7.7 (Representation of a vector with respect to an ONB). Let ~x1 , . . . , ~xn be an orthonormal basis of Rn and let ~w ∈ Rn . Then
\[ \vec w = \langle \vec w, \vec x_1\rangle \vec x_1 + \langle \vec w, \vec x_2\rangle \vec x_2 + \dots + \langle \vec w, \vec x_n\rangle \vec x_n. \]

Proof. Since ~x1 , . . . , ~xn is a basis of Rn , there are α1 , . . . , αn ∈ R such that
\[ \vec w = \alpha_1 \vec x_1 + \alpha_2 \vec x_2 + \dots + \alpha_n \vec x_n. \]


Now let us take the inner product on both sides with ~xj for j = 1, . . . , n. Note that h~xk , ~xj i = 0 if k ≠ j and that h~xj , ~xj i = k~xj k2 = 1. We find
\[ \langle \vec w, \vec x_j\rangle = \langle \alpha_1\vec x_1 + \alpha_2\vec x_2 + \dots + \alpha_n\vec x_n,\ \vec x_j\rangle = \alpha_1\langle \vec x_1, \vec x_j\rangle + \alpha_2\langle \vec x_2, \vec x_j\rangle + \dots + \alpha_n\langle \vec x_n, \vec x_j\rangle = \alpha_j\langle \vec x_j, \vec x_j\rangle = \alpha_j. \]

Note that the proof of this theorem is essentially the same as that of Lemma 7.3. In fact, Lemma 7.3
follows from the theorem above if we choose w ~ = ~0.
Exercise 7.8. If ~x1 , . . . , ~xn is an orthogonal, but not necessarily orthonormal, basis of Rn , then we have for every ~w ∈ Rn that
\[ \vec w = \frac{\langle \vec w, \vec x_1\rangle}{\|\vec x_1\|^2}\, \vec x_1 + \frac{\langle \vec w, \vec x_2\rangle}{\|\vec x_2\|^2}\, \vec x_2 + \dots + \frac{\langle \vec w, \vec x_n\rangle}{\|\vec x_n\|^2}\, \vec x_n. \]
(You can either use a modified version of the proof of Theorem 7.7 or you define ~yj = k~xj k−1 ~xj , show that ~y1 , . . . , ~yn is an orthonormal basis and apply the formula from Theorem 7.7.)
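The following small numerical sketch illustrates Theorem 7.7 and the formula from Exercise 7.8: the coefficients of ~w with respect to an orthogonal basis are obtained from inner products alone, without solving a linear system. It assumes Python with NumPy; the helper name is ours, not part of the notes.

    import numpy as np

    def coefficients(basis, w):
        # basis: pairwise orthogonal, non-zero vectors x_1, ..., x_n
        # returns alpha_j = <w, x_j> / ||x_j||^2  (= <w, x_j> if the basis is orthonormal)
        return [np.dot(w, x) / np.dot(x, x) for x in basis]

    # orthonormal basis of R^2 from Examples 7.5 (ii)
    x1 = np.array([1.0, -1.0]) / np.sqrt(2)
    x2 = np.array([1.0, 1.0]) / np.sqrt(2)
    w = np.array([3.0, 5.0])
    a = coefficients([x1, x2], w)
    print(np.allclose(a[0] * x1 + a[1] * x2, w))   # True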

You should now have understood


• what an orthogonal system is,
• what an orthonormal system is,
• what an orthonormal basis is,
• why orthogonal bases are useful,
• etc.
You should now be able to

• check if a given set of vectors is an orthogonal/orthonormal system,


• check if a given set of vectors is an orthogonal/orthonormal basis of the given space,
• check if a given basis is an orthogonal or orthonormal basis,
D

• give examples of orthonormal bases,


• find the coefficients of a given vector with respect to a given orthonormal or orthogonal
basis.
• etc.

Exercises.
1. In each of the following cases, verify whether the given set is an orthonormal basis of the vector space V to which it refers.
   (a) V = R2 , \(\left\{ \tfrac{1}{\sqrt 5}\begin{pmatrix}2\\1\end{pmatrix}, \tfrac{1}{\sqrt 5}\begin{pmatrix}2\\-1\end{pmatrix} \right\}\).
   (b) V = R2 , \(\left\{ \tfrac{1}{\sqrt 2}\begin{pmatrix}1\\-1\end{pmatrix}, \tfrac{1}{\sqrt 2}\begin{pmatrix}1\\1\end{pmatrix} \right\}\).
   (c) In R2 consider V = the line 3x − 2y = 0, and the set \(\left\{ \tfrac{1}{\sqrt{13}}\begin{pmatrix}2\\3\end{pmatrix} \right\}\).
   (d) In R3 consider V = span\(\left\{ \begin{pmatrix}1\\2\\-1\end{pmatrix}, \begin{pmatrix}-1\\2\\3\end{pmatrix} \right\}\) and the set \(\left\{ \tfrac{1}{\sqrt{17}}\begin{pmatrix}1\\4\\0\end{pmatrix}, \tfrac{1}{\sqrt{357}}\begin{pmatrix}8\\-2\\-17\end{pmatrix} \right\}\).
   (e) In R3 consider V = span\(\left\{ \begin{pmatrix}1\\2\\3\end{pmatrix}, \begin{pmatrix}-1\\0\\4\end{pmatrix}, \begin{pmatrix}1\\6\\17\end{pmatrix} \right\}\) and the set \(\left\{ \tfrac{1}{\sqrt 2}\begin{pmatrix}1\\1\\0\end{pmatrix}, \tfrac{1}{\sqrt 6}\begin{pmatrix}1\\-1\\2\end{pmatrix}, \tfrac{1}{\sqrt 3}\begin{pmatrix}-1\\1\\-1\end{pmatrix} \right\}\).

2. For which values of a, b is the set
\[ \left\{ \begin{pmatrix}2\\0\\-1\end{pmatrix}, \begin{pmatrix}1\\a\\2\end{pmatrix}, \begin{pmatrix}b-5a\\4\\1\end{pmatrix} \right\} \]
an orthogonal basis of R3 ?
 

3. In R2 , let ~v1 = \(\begin{pmatrix}a\\b\end{pmatrix}\) be a non-zero vector. How many orthogonal bases of R2 containing ~v1 exist? How many orthogonal bases {~v1 , ~v2 } of R2 exist such that k~v1 k = k~v2 k ?

4. The following exercise aims to obtain an orthonormal basis of the plane E : ax + by + cz = 0 with the tools seen so far.
   (a) Consider a vector ~v1 parallel to E with ~v1 ≠ ~0. Let ~n be some normal vector of E and take ~v2 = ~v1 × ~n. Show that \(\left\{ \tfrac{\vec v_1}{\|\vec v_1\|}, \tfrac{\vec v_2}{\|\vec v_2\|} \right\}\) is an orthonormal basis of E (observe that \(\left\{ \tfrac{\vec v_1}{\|\vec v_1\|}, \tfrac{\vec v_2}{\|\vec v_2\|}, \tfrac{\vec n}{\|\vec n\|} \right\}\) is an orthonormal basis of R3 ).

[Figure: the plane E with the vectors ~v1 and ~v2 lying in E and the normal vector ~n.]
   (b) For the plane E : x + 2y + 3z = 0, obtain an orthonormal basis and complete it to an orthonormal basis of R3 .
   (c) Write \(\begin{pmatrix}1\\2\\3\end{pmatrix}\) in terms of the basis obtained in the previous part.


   (d) Let L : x/3 = y/2 = z/5. Can you obtain an orthonormal basis of R3 that contains some direction vector of L?

5. Let B be any basis of Rn and let ~v1 , ~v2 ∈ Rn . If h[~v1 ]B , [~v2 ]B i = 0, does it follow that ~v1 ⊥ ~v2 ?

7.2 Orthogonal matrices


We already saw that it is very easy to express a given vector as linear combination of the members
of an orthonormal basis. In this section we want to explore the properties of the transition matrices
between two orthonormal bases of Rn .
Let B = {~u1 , . . . , ~un } and C = {~w1 , . . . , ~wn } be orthonormal bases of Rn . Let Q = AB→C be the
transition matrix from the basis B to the basis C. We know that its entries qij are the uniquely
determined numbers such that
   
\[ \vec u_1 = \begin{pmatrix} q_{11}\\ \vdots\\ q_{n1}\end{pmatrix}_C = q_{11}\vec w_1 + \dots + q_{n1}\vec w_n, \qquad \dots, \qquad \vec u_n = \begin{pmatrix} q_{1n}\\ \vdots\\ q_{nn}\end{pmatrix}_C = q_{1n}\vec w_1 + \dots + q_{nn}\vec w_n. \]

Since C is an orthonormal basis, it follows that qij = h~uj , ~wi i, see Theorem 7.7. Therefore
\[ A_{B\to C} = \begin{pmatrix} \langle \vec u_1, \vec w_1\rangle & \langle \vec u_2, \vec w_1\rangle & \cdots & \langle \vec u_n, \vec w_1\rangle \\ \langle \vec u_1, \vec w_2\rangle & \langle \vec u_2, \vec w_2\rangle & \cdots & \langle \vec u_n, \vec w_2\rangle \\ \vdots & \vdots & & \vdots \\ \langle \vec u_1, \vec w_n\rangle & \langle \vec u_2, \vec w_n\rangle & \cdots & \langle \vec u_n, \vec w_n\rangle \end{pmatrix}. \]

If we exchange the roles of B and C and use that h~wi , ~uj i = h~uj , ~wi i, then we obtain
\[ A_{C\to B} = \begin{pmatrix} \langle \vec w_1, \vec u_1\rangle & \langle \vec w_2, \vec u_1\rangle & \cdots & \langle \vec w_n, \vec u_1\rangle \\ \langle \vec w_1, \vec u_2\rangle & \langle \vec w_2, \vec u_2\rangle & \cdots & \langle \vec w_n, \vec u_2\rangle \\ \vdots & \vdots & & \vdots \\ \langle \vec w_1, \vec u_n\rangle & \langle \vec w_2, \vec u_n\rangle & \cdots & \langle \vec w_n, \vec u_n\rangle \end{pmatrix} = \begin{pmatrix} \langle \vec u_1, \vec w_1\rangle & \langle \vec u_1, \vec w_2\rangle & \cdots & \langle \vec u_1, \vec w_n\rangle \\ \langle \vec u_2, \vec w_1\rangle & \langle \vec u_2, \vec w_2\rangle & \cdots & \langle \vec u_2, \vec w_n\rangle \\ \vdots & \vdots & & \vdots \\ \langle \vec u_n, \vec w_1\rangle & \langle \vec u_n, \vec w_2\rangle & \cdots & \langle \vec u_n, \vec w_n\rangle \end{pmatrix}. \]

This shows that AC→B = (AB→C )t . If we use that AC→B = (AB→C )−1 , then we find that

(AB→C )−1 = (AB→C )t .

From these calculations, we obtain the following lemma.

Lemma 7.9. Let B = {~u1 , . . . , ~un } and C = {~w1 , . . . , ~wn } be orthonormal bases of Rn and let
Q = AB→C be the transition matrix from the basis B to the basis C. Then

Qt = Q−1 .


Definition 7.10. A matrix A ∈ M (n × n) is called an orthogonal matrix if it is invertible and


At = A−1 .

Proposition 7.11. Let Q ∈ M (n × n). Then the following are equivalent:

(i) Q is an orthogonal matrix.


(ii) Qt is an orthogonal matrix.
(iii) Q−1 exists and is an orthogonal matrix.

Proof. (i) =⇒ (ii): Assume that Q is orthogonal. Then it is invertible, hence also Qt is invertible
by Theorem 3.51 and (Qt )−1 = (Q−1 )t = (Qt )t = Q holds. Hence Qt is an orthogonal matrix.
(ii) =⇒ (i): Assume that Qt is an orthogonal matrix. Then (Qt )t = Q must be an orthogonal
matrix too by what we just proved.
(i) =⇒ (iii): Assume that Q is orthogonal. Then it is invertible and (Q−1 )−1 = (Qt )−1 = (Q−1 )t
where in the second step we used Theorem 3.51. Hence Q−1 is an orthogonal matrix.
(iii) =⇒ (i): Assume that Q−1 is an orthogonal matrix. Then its inverse (Q−1 )−1 = Q must be an orthogonal matrix too by what we just proved.

By Lemma 7.9, every transition matrix from one ONB to another ONB is an orthogonal matrix.
The reverse is also true as the following theorem shows.

Theorem 7.12. Let Q ∈ M (n × n). Then:

(i) Q is an orthogonal matrix if and only if its columns are an orthonormal basis of Rn .
RA
(ii) Q is an orthogonal matrix if and only if its rows are an orthonormal basis of Rn .
(iii) If Q is an orthogonal matrix, then | det Q| = 1.

Proof. (i): Assume that Q is an orthogonal matrix and let ~cj be its columns. We already know
that they are a basis of Rn since Q is invertible. In order to show that they are also an orthonormal
system, we calculate
 
h~c , ~
c i h~c , ~
c i h~c , ~
c i
   1 1 1 2 1 n
D

~c1

h~c , ~
c i h~c , ~
c i h~c , ~
c i
 
.  2 1 2 2 2 n 
t
id = Q Q =  ..  (~c1 | · · · | ~cn ) =  . (7.1)
  

~cn
 
 
 
h~cn , ~c1 i h~cn , ~c2 i h~cn , ~cn i

Since the product is equal to the identity matrix, it follows that all the elements on the diagonal
must be equal to 1 and all the other elements must be equal to 0. This means that h~cj , ~cj i = 1 for
j = 1, . . . , n and h~cj , ~ck i = 0 for j 6= k, hence the columns of Q are an orthonormal basis of Rn .
Now assume that the columns ~c1 , . . . , ~cn of Q are an orthonormal basis of Rn . Then clearly (7.1)
holds which shows that Q is an orthogonal matrix.
(ii): The rows of Q are the columns of Qt hence they are an orthonormal basis of Rn by (i) and
Proposition 7.11 (ii).


(iii): Recall that det Qt = det Q. Therefore we obtain

1 = det id = det(QQt ) = (det Q)(det Qt ) = (det Q)2 ,

which proves the claim.

 
Clearly, not every matrix R with | det R| = 1 is an orthogonal matrix. For instance, if R = \(\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\), then det R = 1, but R−1 = \(\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}\) is different from Rt = \(\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\).
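As a quick numerical illustration (a sketch in Python with NumPy, not part of the notes; the helper name is ours), one can check the characterisation Qt = Q−1 for a rotation matrix and see that the counterexample R above fails it:

    import numpy as np

    def is_orthogonal(Q, tol=1e-12):
        # Q is orthogonal iff Q^t Q = id, i.e. iff its columns are an orthonormal basis
        return np.allclose(Q.T @ Q, np.eye(Q.shape[0]), atol=tol)

    phi = 0.7
    Q = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])   # rotation by phi
    R = np.array([[1.0, 1.0],
                  [0.0, 1.0]])                    # det R = 1, but R is not orthogonal

    print(is_orthogonal(Q), is_orthogonal(R))     # True False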

Question 7.1
Assume that ~a1 , . . . , ~an ∈ Rn are pairwise orthogonal and let R ∈ M (n × n) be the matrix whose
columns are the given vectors. Can you calculate Rt R and RRt ? What are the conditions on the
vectors such that R is invertible? If it is invertible, what is its inverse? (You should be able to

answer the above questions more or less easily if k~aj k = 1 for all j = 1, . . . , n because in this case
R is an orthogonal matrix.)

 
Exercise 7.13. Show that every orthogonal 2 × 2 matrix is of the form Q = \(\begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}\) or Q = \(\begin{pmatrix} \cos\varphi & \sin\varphi \\ \sin\varphi & -\cos\varphi \end{pmatrix}\). Compare this with Exercise 7.6.
Exercise 7.14. Use the results from Section 4.3 to prove that | det Q| = 1 if Q is an orthogonal
2 × 2 or 3 × 3 matrix.

It can be shown that every orthogonal matrix represents either a rotation (if its determinant is 1)
or the composition of a rotation and a reflection (if its determinant is −1).

Orthogonal matrices in R2 . Let Q ∈ M (2 × 2) be an orthogonal matrix with columns ~c1 and ~c2 . Recall that Q~e1 = ~c1 and Q~e2 = ~c2 . Since ~c1 is a unit vector, it is of the form ~c1 = \(\begin{pmatrix}\cos\varphi\\ \sin\varphi\end{pmatrix}\) for some ϕ ∈ R. Since ~c2 is also a unit vector and in addition must be orthogonal to ~c1 , there are only the two possible choices ~c2+ = \(\begin{pmatrix}-\sin\varphi\\ \cos\varphi\end{pmatrix}\) or ~c2− = \(\begin{pmatrix}\sin\varphi\\ -\cos\varphi\end{pmatrix}\), see Figure 7.1.

• In the first case, det Q = det(~c1 | ~c2+ ) = det \(\begin{pmatrix}\cos\varphi & -\sin\varphi\\ \sin\varphi & \cos\varphi\end{pmatrix}\) = cos²ϕ + sin²ϕ = 1 and Q represents the rotation by ϕ counterclockwise.

• In the second case, det Q = det(~c1 | ~c2− ) = det \(\begin{pmatrix}\cos\varphi & \sin\varphi\\ \sin\varphi & -\cos\varphi\end{pmatrix}\) = −cos²ϕ − sin²ϕ = −1, and Q represents the rotation by ϕ counterclockwise followed by a reflection on the direction given by ~c1 (or: reflection on the x-axis followed by the rotation by ϕ counterclockwise).


Figure 7.1: In case (a), Q represents a rotation and det Q = 1. In case (b) it represents a rotation followed by a reflection and det Q = −1.

Exercise 7.15. Let Q be an orthogonal n × n matrix. Show the following.

(i) Q preserves inner products, that is h~x , ~y i = hQ~x , Q~y i for all ~x, ~y ∈ Rn .
RA
(ii) Q preserves lengths, that is k~xk = kQ~xk for all ~x ∈ Rn .
(iii) Q preserves angles, that is ^(~x, ~y ) = ^(Q~x, Q~y ) for all ~x, ~y ∈ Rn \ {~0}.

Exercise 7.16. Let Q ∈ M (n × n)

(i) Assume that Q preserves inner products, that is h~x , ~y i = hQ~x , Q~y i for all ~x, ~y ∈ Rn . Show
that Q is an orthogonal matrix.
(ii) Assume that Q preserves lengths, that is k~xk = kQ~xk for all ~x ∈ Rn . Show that Q is an orthogonal
D

matrix.
Exercise 7.15 together with Exercise 7.16 show the following.

A matrix Q is an orthogonal matrix if and only if it preserves inner products if and only if it preserves lengths. That is

Q is orthogonal ⇐⇒ Qt = Q−1
⇐⇒ hQ~x , Q~y i = h~x , ~y i for all ~x, ~y ∈ Rn
⇐⇒ kQ~xk = k~xk for all ~x ∈ Rn .

Definition 7.17. A linear transformation T : Rn → Rm is called an isometry if kT ~xk = k~xk for


all ~x ∈ Rn .


Note that every isometry is injective since T ~x = ~0 if and only if ~x = ~0, therefore necessarily n ≤ m.

You should now have understood


• that a matrix is orthogonal if and only if it represents change of bases between two orthonor-
mal bases,
• that an orthogonal matrix represents either a rotation or a rotation composed with a reflec-
tion,
• etc.
You should now be able to

• check if a given matrix is an orthogonal matrix,


• construct orthogonal matrices,
• etc.

Exercises.
1. Verify that the following matrices are orthogonal:
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\vartheta & -\sin\vartheta \\ 0 & \sin\vartheta & \cos\vartheta \end{pmatrix}, \quad \begin{pmatrix} \cos\vartheta & 0 & -\sin\vartheta \\ 0 & 1 & 0 \\ \sin\vartheta & 0 & \cos\vartheta \end{pmatrix}, \quad \begin{pmatrix} \cos\vartheta & -\sin\vartheta & 0 \\ \sin\vartheta & \cos\vartheta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
What is the geometric interpretation of each one? See Section 3.4, Exercise 6.

2. For the plane E : 2x + y − z = 0, obtain an orthonormal basis B of R3 such that its first two vectors are a basis of E. Compute AB→can and Acan→B .

3. Find at least six distinct isometries from R2 to R3 .

4. Let A, B ∈ M (n × n):
   (a) If AB is orthogonal, can one conclude that A and B must be orthogonal matrices?
   (b) If A, B are orthogonal matrices, can one conclude that AB is an orthogonal matrix?

5. Let T : Rn → Rm and Q ∈ M (n × n).
   (a) Show that T is an isometry if and only if hT ~x , T ~y i = h~x , ~y i for all ~x, ~y ∈ Rn (hence T preserves angles). (Hint: it suffices to do the same as in Exercise 7.16 part (ii).)
   (b) Show that Q is an orthogonal matrix if and only if Q is an isometry.
   (c) Let B1 = {~e1 , . . . , ~en } be the canonical basis of Rn and suppose that T is an isometry. Show that {T ~e1 , . . . , T ~en } is an orthonormal system of vectors.

6. Let ~x ∈ Rn . Show that T (~x) = h~x , ~e1 iT ~e1 + · · · + h~x , ~en iT ~en .


7.3 Orthogonal complements


In this section we will learn how to find all the vectors that are orthogonal to a given subspace U
of Rn . This set is called the orthogonal complement of U . We start with its formal definition.

Definition 7.18. Let U be a subspace of Rn .

(i) We say that a vector ~x ∈ Rn is perpendicular to U if it is perpendicular to every vector in U . In this case we write ~x ⊥ U .

(ii) The orthogonal complement of U is denoted by U ⊥ and it is the set of all vectors which are
perpendicular to every vector in U , that is

U ⊥ = {~x ∈ Rn : ~x ⊥ U } = {~x ∈ Rn : ~x ⊥ ~u for every ~u ∈ U }.

We start with some easy observations.

Remark 7.19. Let U be a subspace of Rn .

(i) U ⊥ is a subspace of Rn .

(ii) U ∩ U ⊥ = {~0}.

(iii) (Rn )⊥ = {~0}, {~0}⊥ = Rn .

Proof. (i) Clearly, ~0 ∈ U ⊥ . Let ~x, ~y ∈ U ⊥ and let c ∈ R. Then for every ~u ∈ U we have that
RA
h~x + c~y , ~ui = h~x , ~ui + ch~y , ~ui = 0, hence ~x + c~y ∈ U ⊥ and U ⊥ is a subspace by Theorem 5.10.

(ii) Let ~x ∈ U ∩ U ⊥ . Then it follows that ~x ⊥ ~x, hence k~xk2 = h~x , ~xi = 0 which shows that ~x = ~0
and therefore U ∩ U ⊥ consists only of the vector ~0.

(iii) Assume that ~x ∈ (Rn )⊥ . Then ~x ⊥ ~y for every ~y ∈ Rn , in particular also ~x ⊥ ~x. Therefore
k~xk2 = h~x , ~xi = 0 which shows that ~x = ~0. It follows that (Rn )⊥ = {~0}.
It is clear that h~x , ~0i = 0, hence Rn ⊆ {~0}⊥ ⊆ Rn which proves that {~0}⊥ = Rn .

Examples 7.20. (i) The orthogonal complement of a line in R2 is again a line, see Figure 7.2.

(ii) The orthogonal complement of a line in R3 is the plane perpendicular to the given line. The
orthogonal complement to a plane in R3 is the line perpendicular to the given plane, see
Figure 7.2.

The next goal is to show that dim U + dim U ⊥ = n and to establish a method for calculating
U ⊥ . To this end, the following lemma is useful. It tells us that in order to verify that some ~x is
perpendicular to U we do not have to check that ~x ⊥ ~u for every ~u ∈ U , but that it is enough to
check it for a set of vectors ~u which generate U .

Lemma 7.21. Let U = span{~u1 , . . . , ~uk } ⊆ Rn . Then ~x ∈ U ⊥ if and only if ~x ⊥ ~uj for every
j = 1, . . . , k.


Figure 7.2: The figure on the left shows the orthogonal complement of the line L in R2 , which is the line G. The figure on the right shows the orthogonal complement of the plane U in R3 , which is the line H. Note that the orthogonal complement of H is U .

Proof. Suppose that ~x ⊥ U , then ~x ⊥ ~u for every ~u ∈ U , in particular for the generating vectors
~u1 , . . . , ~uk . Now suppose that ~x ⊥ ~uj for all j = 1, . . . , k. Let ~u ∈ U be an arbitrary vector in U .
Then there exist α1 , . . . , αk ∈ R such that ~u = α1 ~u1 + · · · + ~uk αk . So we obtain
RA
h~x , ~ui = h~x , α1 ~u1 + · · · + αk ~uk i = α1 h~x , ~u1 i + · · · + αk h~x , ~uk i = 0.

Since ~u can be chosen arbitrary in U , it follows that ~x ⊥ U .

Theorem 7.22. Let A ∈ M (m × n). Then

ker(A) = (RA )⊥ = (Im At )⊥ .



Proof. Let ~r1 , . . . , ~rm be the rows of A. Since RA = span{~r1 , . . . , ~rm }, it suffices to show that ~x ∈ ker(A) if and only if ~x ⊥ ~rj for all j = 1, . . . , m.
By definition ~x ∈ ker(A) if and only if
\[ \vec 0 = A\vec x = \begin{pmatrix} \vec r_1 \\ \vdots \\ \vec r_m \end{pmatrix}\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \langle \vec r_1, \vec x\rangle \\ \vdots \\ \langle \vec r_m, \vec x\rangle \end{pmatrix}. \]

This is the case if and only if h~rj , ~xi = 0 for all j = 1, . . . , m, that is, if and only if ~x ⊥ ~rj for all
j = 1, . . . , m.

Alternative proof of Theorem 7.22. Observe that RA = CAt = Im(At ). So we have to show that


ker(A) = (Im(At ))⊥ . Recall that hAx , yi = hx , At yi. Therefore

x ∈ ker(A) ⇐⇒ Ax = 0 ⇐⇒ Ax ⊥ Rm
⇐⇒ hAx , yi = 0 for all y ∈ Rm
⇐⇒ hx , At yi = 0 for all y ∈ Rm ⇐⇒ x ∈ (Im(At ))⊥ .

The theorem above leads to a method for calculating the orthogonal complement of a given subspace
U of Rn as follows.

Lemma 7.23. Let U = span{~u1 , . . . , ~uk } ⊆ Rn and let A be the matrix whose rows consist of the
vectors ~u1 , . . . , ~uk . Then
U ⊥ = ker A. (7.2)

Proof. Let ~x ∈ Rn . By Lemma 7.21 we know that ~x ∈ U ⊥ if and only if ~x ⊥ ~uj for every
j = 1, . . . , k. This is the case if and only if

\[ \langle \vec u_1, \vec x\rangle = 0, \quad \langle \vec u_2, \vec x\rangle = 0, \quad \dots, \quad \langle \vec u_k, \vec x\rangle = 0, \]
which can be written in matrix form as
\[ \begin{pmatrix} \vec u_1 \\ \vec u_2 \\ \vdots \\ \vec u_k \end{pmatrix} \vec x = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \]
which is the same as A~x = ~0 by definition of A. In conclusion, ~x ⊥ U if and only if A~x = ~0, that is,
if and only if ~x ∈ ker A.
RA
In Example 7.28 we will calculate the orthogonal complement of a subspace of R4 .
The next two theorems are the main results of this section.

Theorem 7.24. For every subspace U ⊆ Rn we have that

dim U + dim U ⊥ = n. (7.3)

Proof. Let ~u1 , . . . , ~uk be a basis of U . Note that k = dim U . Then we have in particular U = span{~u1 , . . . , ~uk }. As in Lemma 7.23 we consider the matrix A ∈ M (k × n) whose rows are the
vectors ~u1 , . . . , ~uk . Then U ⊥ = ker A, so

dim U ⊥ = dim(ker A) = n − dim(Im A).

Note that dim(Im A) is the dimension of the column space of A which is equal to the dimension of
the row space of A by Proposition 6.32. Since the vectors ~u1 , . . . , ~uk are linearly independent, this dimension is equal to k. Therefore dim U ⊥ = n − k = n − dim U . Rearranging, we obtain the
desired formula dim U ⊥ + dim U = n.
(We could also have said that the reduced form of A cannot have any zero row because its rows
are linearly independent. Therefore the reduced form must have k pivots and we obtain dim U ⊥ =
dim(ker A) = n − #(pivots of the reduced form of A) = n − k = n − dim U . We basically re-proved
Proposition 6.32.)


Theorem 7.25. Let U ⊆ Rn be a subspace of Rn . Then the following holds.

(i) U ⊕ U ⊥ = Rn .

(ii) (U ⊥ )⊥ = U .

Proof. (i) Recall that U ∩ U ⊥ = {~0} by Remark 7.19, therefore the sum is a direct sum. Now let
us show that U + U ⊥ = Rn . Since U + U ⊥ ⊆ Rn , we only have to show that dim(U + U ⊥ ) =
n because the only n-dimensional subspace of Rn is Rn itself, see Theorem 5.54. From
Proposition 5.62 and Theorem 7.24 we obtain

dim(U + U ⊥ ) = dim(U ) + dim(U ⊥ ) − dim(U ∩ U ⊥ ) = dim(U ) + dim(U ⊥ ) = n

where we used that dim(U ∩ U ⊥ ) = dim{~0} = 0.

(ii) First let us show that U ⊆ (U ⊥ )⊥ . To this end, fix ~u ∈ U . Then, for every ~y ∈ U ⊥ , we have that h~u , ~y i = 0, hence ~u ⊥ U ⊥ , that is, ~u ∈ (U ⊥ )⊥ . Note that dim(U ⊥ )⊥ = n − dim U ⊥ =
n − (n − dim U ) = dim U . Since we already know that U ⊆ (U ⊥ )⊥ , it follows that they must

FT
be equal by Theorem 5.54.

The next proposition shows that every subspace of Rn has an orthonormal basis. Another proof of
this fact will be given later when we introduce the Gram-Schmidt process in Section 7.5.

Proposition 7.26. Every subspace U ⊆ Rn with dim U > 0 has an orthonormal basis.

Proof. Let U be a subspace of Rn with dim U = k > 0. Then dim U ⊥ = n − k and we can choose
RA
a basis w ~ k+1 , . . . , wn of U ⊥ . Let A0 ∈ M ((n − k) × n) be the matrix whose rows are the vectors
~ k+1 , . . . , wn . Since U = (U ⊥ )⊥ , we know that U = ker A0 . Pick any ~u1 ∈ ker A0 with ~u1 6= ~0.
w
Then ~u1 ∈ U . Now we form the new matrix A1 ∈ M ((n−k+1)×n) by adding ~u1 as a new row to the
matrix A0 . Note that the rows of A1 are linearly independent, so dim ker(A1 ) = n−(n−k+1) = k−1.
If k−1 > 0, then we pick any vector ~u2 ∈ ker A1 with ~u2 6= ~0. This vector is orthogonal to all the rows
of A1 , in particular it belongs to U (since it is orthogonal to ~wk+1 , . . . , ~wn ) and it is perpendicular
to ~u1 ∈ U . Now we form the matrix A2 ∈ M ((n−k +2)×n) by adding the vector ~u2 as a row to A1 .
Again, the rows of A2 are linearly independent and therefore dim(ker A2 ) = n − (n − k + 2) = k − 2.
If k − 2 > 0, then we pick any vector ~u3 ∈ ker A2 with ~u3 ≠ ~0. This vector is orthogonal to all the rows of A2 , in particular it belongs to U (since it is orthogonal to ~wk+1 , . . . , ~wn ) and it is
perpendicular to ~u1 , ~u2 ∈ U . We continue this process until we have vectors ~u1 , . . . , ~uk ∈ U which
are pairwise orthogonal and the matrix Ak ∈ M (n × n) consists of linearly independent rows, so its
kernel is trivial. By construction, ~u1 , . . . , ~uk is an orthogonal system of k vectors in U with none of
them being equal to ~0. Hence they are linearly independent and therefore they are an orthogonal
basis of U since dim U = k. In order to obtain an orthonormal basis we only have to normalise
each of the vectors.

Corollary 7.27. Every orthonormal system in Rn can be completed to an orthonormal basis.

Proof. Let ~w1 , . . . , ~wk be an orthonormal system in Rn and let W = span{~w1 , . . . , ~wk }. By Proposition 7.26 we can find an orthonormal basis ~u1 , . . . , ~un−k of W ⊥ (take U = W ⊥ in the proposition). Then ~w1 , . . . , ~wk , ~u1 , . . . , ~un−k is an orthonormal basis of W ⊕ W ⊥ = Rn .


We conclude this section with a few examples.

Example 7.28. Find a basis for the orthogonal complement of
\[ U = \operatorname{span}\left\{ \begin{pmatrix}1\\2\\3\\4\end{pmatrix}, \begin{pmatrix}1\\0\\1\\0\end{pmatrix} \right\}. \]
 

Solution. Recall that ~x ∈ U ⊥ if and only if it is perpendicular to the vectors which generate U .
Therefore ~x ∈ U ⊥ if and only if it belongs to the kernel of the matrix whose rows are the generators
of U . So we calculate
     
\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 1 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & -2 & -2 & -4 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 2 \end{pmatrix}. \]

Hence a basis of U ⊥ is given by
\[ \vec w_1 = \begin{pmatrix}0\\-2\\0\\1\end{pmatrix}, \qquad \vec w_2 = \begin{pmatrix}-1\\-1\\1\\0\end{pmatrix}. \]
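This calculation is easy to reproduce on a computer. The sketch below assumes Python with NumPy and SciPy (scipy.linalg.null_space returns an orthonormal basis of the kernel); it is only an illustration, not part of the notes.

    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[1.0, 2.0, 3.0, 4.0],
                  [1.0, 0.0, 1.0, 0.0]])   # rows generate U
    N = null_space(A)                       # columns form a basis of U^perp = ker A
    print(N.shape)                          # (4, 2): dim U^perp = 4 - dim U
    print(np.allclose(A @ N, 0))            # True: every column is orthogonal to the generators of U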

Example 7.29. Find an orthonormal basis for the orthogonal complement of
\[ U = \operatorname{span}\left\{ \begin{pmatrix}1\\2\\3\\4\end{pmatrix}, \begin{pmatrix}1\\0\\1\\0\end{pmatrix} \right\}. \]

Solution. We will use the method from Proposition 7.26. Another solution of this exercise will be given in Example 7.44. From the solution of Example 7.28 we can take the first basis vector ~w1 . We append it to the matrix from the solution of Example 7.28 and reduce the new matrix (note that the first few steps are identical to the reduction of the original matrix). We obtain
\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 1 & 0 \\ 0 & -2 & 0 & 1 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 0 & -2 & 0 & 1 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 2 & 5 \end{pmatrix} \]
whose kernel is generated by
\[ \begin{pmatrix} 5 \\ 1 \\ -5 \\ 2 \end{pmatrix}. \]
Hence an orthonormal basis of U ⊥ is given by
\[ \vec y_1 = \frac{1}{\sqrt 5}\begin{pmatrix}0\\-2\\0\\1\end{pmatrix}, \qquad \vec y_2 = \frac{1}{\sqrt{55}}\begin{pmatrix}5\\1\\-5\\2\end{pmatrix}. \]

You should now have understood


• the concept of the orthogonal complement,
• in particular the geometric interpretation of the orthogonal complement of a subspace (at
least in R2 and R3 ),
• etc.
You should now be able to
• find the orthogonal complement of a given subspace of Rn ,
• find an orthogonal basis of a given subspace of Rn ,
• etc.

Exercises.
1. Find the orthogonal complement of the following sets:
   (a) span{ (−1, 5)^T }
   (b) The intersection of the planes x + 2y + 5z = 0, 2x − 3y − 4z = 0.
   (c) span{ (1, −2, 3)^T , (−1, 1, 2)^T }
   (d) The image of the linear transformation T : R3 → R4 given by
\[ T\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix} 2x + y - 3z \\ 3x + 2y - 5z \\ x - y \\ -x - 3y + 4z \end{pmatrix}. \]

2. In R4 , find an orthonormal basis of the hyperplane x − y + w = 0.

3. A vector ~v and a subspace W are given. In each case write ~v as the sum of an element of W and an element of W ⊥ . In addition, find an orthonormal basis for W ⊥ .
   (a) ~v = (3, 5)^T and W = span{ (−1, 1)^T }.
   (b) ~v = (−1, 1, 3)^T and W is the plane 2x + y + 2z = 0.
   (c) ~v = (10, −1, 6)^T and W is the line x = 0, y/2 = 3z.


    
   (d) ~v = (1, −1, 3, 2)^T and W = { (x, y, z, w)^T ∈ R4 : y = 2x + w, z = x − 2w }.

4. Let U be a subspace of Rn . Show that (U ⊥ )⊥ = U .

5. Let A ∈ M (n × m) and let W be the space generated by the columns of A. Show that every ~v ∈ Rn can be written as the sum of two elements ~a, ~b such that ~a ∈ ker At and ~b ∈ W .

7.4 Orthogonal projections


Recall that in Section 2.3 we discussed the orthogonal projection of one vector onto another in R2 . This can clearly be extended to higher dimensions. Let ~v , ~w ∈ Rn with ~w ≠ ~0. Then
\[ \operatorname{proj}_{\vec w} \vec v := \frac{\langle \vec v, \vec w\rangle}{\|\vec w\|^2}\, \vec w \tag{7.4} \]
is the unique vector in Rn which is parallel to ~w and satisfies that ~v − proj~w ~v is orthogonal to ~w. We already know that the projection is independent of the length of ~w. So proj~w ~v should be regarded as the projection of ~v onto the one-dimensional subspace generated by ~w.
In this section we want to generalise this to orthogonal projections on higher dimensional subspaces,
for instance you could think of the projection in R3 onto a given plane. Then, given a subspace U
of Rn , we want to define the orthogonal projection as the function from Rn to Rn which assigns to
each vector ~v its orthogonal projection onto U . We start with the analogue of Theorem 2.22.
RA
Theorem 7.30 (Orthogonal projection). Let U ⊆ Rn be a subspace and let ~v ∈ Rn . Then there
exist uniquely determined vectors ~vk and ~v⊥ such that
~vk ∈ U, ~v⊥ ⊥ U and ~v = ~vk + ~v⊥ . (7.5)
The vector ~vk is called the orthogonal projection of ~v onto U ; it is denoted by projU ~v .

Proof. First we show the existence of the vectors ~vk and ~v⊥ . If U = Rn , we take ~vk = ~v and
~v⊥ = ~0. If U = {~0}, we take ~vk = ~0 and ~v⊥ = ~v . Otherwise, let 0 < dim U = k < n. Choose
orthonormal bases ~u1 , . . . , ~uk of U and ~wk+1 , . . . , ~wn of U ⊥ . This is possible by Theorem 7.24 and Proposition 7.26. Then ~u1 , . . . , ~uk , ~wk+1 , . . . , ~wn is an orthonormal basis of Rn and for every ~v ∈ Rn we find with the help of Theorem 7.7 that
\[ \vec v = \underbrace{\langle \vec u_1, \vec v\rangle \vec u_1 + \dots + \langle \vec u_k, \vec v\rangle \vec u_k}_{\in U} + \underbrace{\langle \vec w_{k+1}, \vec v\rangle \vec w_{k+1} + \dots + \langle \vec w_n, \vec v\rangle \vec w_n}_{\in U^\perp}. \]
If we set ~vk = h~u1 , ~v i~u1 + · · · + h~uk , ~v i~uk and ~v⊥ = h~wk+1 , ~v i~wk+1 + · · · + h~wn , ~v i~wn , then they have the desired properties.
Next we show uniqueness of the decomposition of ~v . Assume that there are vectors ~vk and ~zk ∈ U and ~v⊥ and ~z⊥ ∈ U ⊥ such that ~v = ~vk + ~v⊥ and ~v = ~zk + ~z⊥ . Then ~vk + ~v⊥ = ~zk + ~z⊥ and, rearranging, we find that
\[ \underbrace{\vec v_\parallel - \vec z_\parallel}_{\in U} = \underbrace{\vec z_\perp - \vec v_\perp}_{\in U^\perp}. \]


Since U ∩ U ⊥ = {~0}, it follows that ~vk − ~zk = ~0 and ~z⊥ − ~v⊥ = ~0, and therefore ~zk = ~vk and
~z⊥ = ~v⊥ .

Definition 7.31. Let U be a subspace of Rn . Then we define the orthogonal projection onto U as
the map which sends ~v ∈ Rn to its orthogonal projection onto U . It is usually denoted by PU , so

PU : Rn → Rn , PU ~v = projU ~v .

Remark 7.32 (Formula for the orthogonal projection). The proof of Theorem 7.30 indicates
how we can calculate the orthogonal projection onto a given subspace U ⊆ Rn . If ~u1 , . . . , ~uk is an
orthonormal basis of U , then

PU ~v = h~u1 , ~v i~u1 + · · · + h~uk , ~v i~uk . (7.6)

This shows that PU is a linear transformation since PU (~x + c~y ) = PU ~x + cPU ~y follows easily from
(7.6).

FT
Exercise. If ~u1 , . . . , ~uk is an orthogonal basis of U (but not necessarily orthonormal), show that
\[ P_U \vec v = \frac{\langle \vec u_1, \vec v\rangle}{\|\vec u_1\|^2}\,\vec u_1 + \dots + \frac{\langle \vec u_k, \vec v\rangle}{\|\vec u_k\|^2}\,\vec u_k. \tag{7.7} \]
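Formulas (7.6) and (7.7) translate directly into a few lines of code. The sketch below assumes Python with NumPy and uses a helper name of our own choosing; it projects a vector onto the span of a given orthogonal basis.

    import numpy as np

    def project(basis, v):
        # basis: pairwise orthogonal, non-zero vectors u_1, ..., u_k spanning U
        # returns P_U v = sum_j <u_j, v> / ||u_j||^2 * u_j   (formula (7.7))
        return sum(np.dot(u, v) / np.dot(u, u) * u for u in basis)

    u1 = np.array([1.0, 1.0, 0.0])
    u2 = np.array([1.0, -1.0, 2.0])        # <u1, u2> = 0
    v = np.array([3.0, 1.0, 5.0])
    p = project([u1, u2], v)
    # v - P_U v is orthogonal to U:
    print(np.allclose([np.dot(v - p, u1), np.dot(v - p, u2)], 0))   # True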

Remark 7.33 (Formula for the orthogonal projection for dim U = 1). If dim U = 1, we obtain again the formula (7.4) which we already know from Section 2.3. To see this, choose ~w ∈ U with ~w ≠ ~0. Then ~w0 = k~wk−1 ~w is an orthonormal basis of U and according to (7.6) we have that
\[ \operatorname{proj}_{\vec w}\vec v = \operatorname{proj}_U \vec v = \langle \vec w_0, \vec v\rangle \vec w_0 = \|\vec w\|^{-1}\langle \vec w, \vec v\rangle\, \|\vec w\|^{-1}\vec w = \|\vec w\|^{-2}\langle \vec w, \vec v\rangle\, \vec w = \frac{\langle \vec w, \vec v\rangle}{\|\vec w\|^2}\,\vec w. \]

Remark 7.34 (Pythagoras’s Theorem). Let U be a subspace of Rn , ~v ∈ Rn and let ~vk and ~v⊥
be as in Theorem 7.30. Then
k~v k2 = k~vk k2 + k~v⊥ k2 .

Proof. Using that ~vk ⊥ ~v⊥ , we find


D

k~v k2 = h~v , ~v i = h~vk + ~v⊥ , ~vk + ~v⊥ i = h~vk , ~vk i + h~vk , ~v⊥ i + h~v⊥ , ~vk i + h~v⊥ , ~v⊥ i
= h~vk , ~vk i + h~v⊥ , ~v⊥ i = k~vk k2 + k~v⊥ k2 .

Exercise 7.35. Let U be a subspace of Rn with basis ~u1 , . . . , ~uk and let w
~ k+1 , . . . , w~ n be a basis
of U ⊥ . Find the matrix representation of PU with respect to the basis ~u1 , . . . , ~uk , w~ k+1 , . . . , w
~ n.

Exercise 7.36. Let U be a subspace of Rn . Show that PU ⊥ = id −PU . (You can show this either
directly or using the matrix representation of PU from Exercise 7.35.)

Exercise 7.37. Let U be a subspace of Rn . Show that (PU )2 = PU . (You can show this either
directly or using the matrix representation of PU from Exercise 7.35.)


Figure 7.3: The figure shows the orthogonal projection of the vector ~v onto the subspace U (which is a vector) and the distance of ~v to U (which is a number: the length of the vector ~v − projU ~v ).

Exercise 7.38. Let U be a subspace of Rn .


(i) Find ker PU and Im PU .
(ii) Find PU ⊥ PU and PU PU ⊥ .

In Theorem 7.30 we used the concept of orthogonality to define the orthogonal projection of ~v onto
a given subspace. We obtained a decomposition of ~v into a part parallel to the given subspace and
a part orthogonal to it. The next theorem shows that the orthogonal projection of ~v onto U gives
us the point in U which is closest to ~v .

Theorem 7.39. Let U be a subspace of Rn and let ~v ∈ Rn . Then PU ~v is the point in U which is

closest to ~v , that is,


k~v − PU ~v k ≤ k~v − ~uk for every ~u ∈ U.

Proof. Let ~v ∈ Rn and ~u ∈ U ⊆ Rn . Note that ~v − PU ~v ∈ U ⊥ and that PU ~v − ~u ∈ U since both


vectors belong to U . Therefore, the Pythagoras theorem shows that

k~v − ~uk2 = k~v − PU ~v + PU ~v − ~uk2 = k~v − PU ~v k2 + kPU ~v − ~uk2 ≥ k~v − PU ~v k2 .

Taking the square root on both sides shows the desired inequality.

Definition 7.40. Let U be a subspace of Rn and let ~v ∈ Rn . Then we define the distance of ~v to U as
\[ \operatorname{dist}(\vec v, U) := \|\vec v - P_U \vec v\|. \]


This is the shortest distance of ~v to any point in U .

In Remark 7.32 we already found a formula for the orthogonal projection PU of a vector ~v to a
given subspace U . This formula however requires to have an orthonormal basis of U . We want to
give another formula for PU which does not require the knowledge of an orthonormal basis.

Theorem 7.41. Let U be a subspace of Rn with basis ~u1 , . . . , ~uk and let B ∈ M (n × k) be the
matrix whose columns are these basis vectors. Then the following holds.
(i) B is injective.
(ii) B t B : Rk → Rk is a bijection.
(iii) The orthogonal projection onto U is given by the formula

PU = B(B t B)−1 B t .

Proof. (i) By construction, the columns of B are linearly independent. Therefore the unique
solution of B~x = ~0 is ~x = ~0 which shows that B is injective.

FT
(ii) Observe that B t B ∈ M (k × k) and assume that B t B~y = ~0 for some ~y ∈ Rk . Then it follows that B~y = ~0 because
\[ 0 = \langle \vec y, B^t B \vec y\rangle = \langle (B^t)^t \vec y, B\vec y\rangle = \langle B\vec y, B\vec y\rangle = \|B\vec y\|^2. \]
Since B is injective, this implies ~y = ~0, so B t B is injective. Since it is a square matrix, it follows that it is even bijective.
RA
(iii) Observe that by construction Im B = U . Now let ~x ∈ Rn . Note that PU ~x ∈ Im B. Hence
there exists exactly one ~z ∈ Rk such that PU ~x = B~z. Moreover, ~x − PU ~x ⊥ U = Im B, hence
for every ~y ∈ Rk we have that

0 = h~x − PU ~x , B~y i = h~x − B~z , B~y i = hB t ~x − B t B~z , ~y i.

Since this is true for every ~y ∈ Rk , it follows that B t ~x − B t B~z = ~0. Now we recall that B t B
is invertible, so we can solve for ~z and obtain ~z = (B t B)−1 B t ~x. This finally gives

PU ~x = B~z = B(B t B)−1 B t ~x.



Since this holds for every ~x ∈ Rn , formula (iii) is proved.
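As a small numerical sketch (Python with NumPy; the helper name is ours, not part of the notes), the formula PU = B(B t B)−1 B t can be applied directly to a non-orthonormal basis, and together with Definition 7.40 it also gives distances:

    import numpy as np

    def projection_matrix(B):
        # columns of B are a basis of U; returns P_U = B (B^t B)^{-1} B^t
        return B @ np.linalg.inv(B.T @ B) @ B.T

    # U from Example 7.28, basis vectors written as columns
    B = np.array([[1.0, 1.0],
                  [2.0, 0.0],
                  [3.0, 1.0],
                  [4.0, 0.0]])
    P = projection_matrix(B)
    v = np.array([1.0, 1.0, 1.0, 1.0])
    print(np.allclose(P @ P, P))          # True: P is a projection (P^2 = P)
    print(np.linalg.norm(v - P @ v))      # dist(v, U)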

You should now have understood


• the concept of an orthogonal projection onto a subspace of Rn ,
• the geometric interpretation of orthogonal projections and how it is related to the distance
of point to a subspace,
• etc.
You should now be able to
• calculate the orthogonal projection of a point to a subspace,


• calculate the distance of a point to a subspace,


• etc.

Exercises.
1. Let E : x + 2y − z = 1. Find the distance from the point P (6, 1, 2) to the plane E.
2. Let L : (x − 1)/2 = y + 2 = (z + 1)/3. Find the distance from the point P (6, −1, 0) to the line L.
3. Let L : x/2 = y/3 = z.
   (a) Find L⊥ .
   (b) Let T (~x) = the half rotation (rotation by π) of ~x about the line L. Find an explicit formula for T . (Hint: What are the projections of T ~x onto L and onto L⊥ ? See the figure.)
   [Figure: the vectors ~x, T ~x and ProjL⊥ ~x relative to the line L⊥ .]
4. Let W be a subspace of Rn and PW the orthogonal projection onto W .
   (a) Let {~w1 , ~w2 , . . . , ~wk } be an orthonormal basis of W . Recall that this basis can be completed to an orthonormal basis B of Rn . What does the matrix representation of PW with respect to the basis B look like?
   (b) Prove that [PW ]can is a symmetric matrix.
   (c) Show that hPW ~x , ~y i = h~x , PW ~y i for all ~x, ~y ∈ Rn .
   (d) Show that PW PW ~x = PW ~x for all ~x ∈ Rn .
5. Let V, W be subspaces of Rn such that W ⊆ V . Show that V ⊥ ⊆ W ⊥ .
6. Let V, W be subspaces of Rn such that W ⊆ V . Show that PV PW = PW PV = PW . (Hint: PV PW = PW is direct. To prove that PW PV ~x = PW ~x for all ~x ∈ Rn , write ~x = ~v + ~v ⊥ where ~v ∈ V and ~v ⊥ ∈ V ⊥ .)
7. Let W ⊆ Rm be a subspace of dimension n. Show that there exists an isometry T : Rn → Rm such that Im T = W . (Hint: Start by choosing an orthonormal basis for W ; see Exercise 5 of Section 7.2.)


7.5 The Gram-Schmidt process


In this section we will describe the so-called Gram-Schmidt orthonormalisation process. Roughly
speaking, it converts a given basis of a subspace of Rn into an orthonormal basis, thus providing
another proof that every subspace of Rn has an orthonormal basis (Corollary 7.27).

Theorem 7.42. Let U be a subspace of Rn with basis ~u1 , . . . , ~uk . Then there exists an orthonormal
basis ~x1 , . . . , ~xk of U such that
span{~u1 , . . . , ~uj } = span{~x1 , . . . , ~xj } for every j = 1, . . . , k.

Proof. The proof is constructive, that is, we do not only prove the existence of such basis, but it
tells us how to calculate it. The idea is to construct the new basis ~x1 , . . . , ~xk step by step. In order
to simplify notation a bit, we set Uj = span{~u1 , . . . , ~uj } for j = 1, . . . , k. Note that dim Uj = j
and that Uk = U .
• Set ~x1 = k~u1 k−1 ~u1 . Then clearly k~x1 k = 1 and span{~u1 } = span{~x1 } = U1 .

• The vector ~x2 must be a normalised vector in U2 which is orthogonal to ~x1 , that is, it must be orthogonal to U1 . So we simply take ~u2 and subtract its projection onto U1 :
\[ \vec w_2 = \vec u_2 - \operatorname{proj}_{U_1}\vec u_2 = \vec u_2 - \operatorname{proj}_{\vec x_1}\vec u_2 = \vec u_2 - \langle \vec x_1, \vec u_2\rangle \vec x_1. \]
Clearly ~w2 ∈ U2 because it is a linear combination of vectors in U2 . Moreover, ~w2 ⊥ U1 because
\[ \langle \vec w_2, \vec x_1\rangle = \big\langle \vec u_2 - \langle \vec x_1, \vec u_2\rangle \vec x_1,\ \vec x_1\big\rangle = \langle \vec u_2, \vec x_1\rangle - \langle \vec x_1, \vec u_2\rangle\langle \vec x_1, \vec x_1\rangle = \langle \vec u_2, \vec x_1\rangle - \langle \vec x_1, \vec u_2\rangle = 0. \]
Hence the vector ~x2 that we are looking for is
\[ \vec x_2 = \|\vec w_2\|^{-1}\vec w_2. \]
Since ~x2 ∈ U2 it follows that span{~x1 , ~x2 } ⊆ U2 . Both spaces have dimension 2, so they must be equal.

• The vector ~x3 must be a normalised vector in U3 which is orthogonal to U2 = span{~x1 , ~x2 }. So we simply take ~u3 and subtract its projection onto U2 :
\[ \vec w_3 = \vec u_3 - \operatorname{proj}_{U_2}\vec u_3 = \vec u_3 - (\operatorname{proj}_{\vec x_1}\vec u_3 + \operatorname{proj}_{\vec x_2}\vec u_3) = \vec u_3 - \big(\langle \vec x_1, \vec u_3\rangle \vec x_1 + \langle \vec x_2, \vec u_3\rangle \vec x_2\big). \]
Clearly ~w3 ∈ U3 because it is a linear combination of vectors in U3 . Moreover, ~w3 ⊥ U2 because for j = 1, 2 we obtain
\[ \langle \vec w_3, \vec x_j\rangle = \big\langle \vec u_3 - (\langle \vec x_1, \vec u_3\rangle \vec x_1 + \langle \vec x_2, \vec u_3\rangle \vec x_2),\ \vec x_j\big\rangle = \langle \vec u_3, \vec x_j\rangle - \langle \vec x_1, \vec u_3\rangle\langle \vec x_1, \vec x_j\rangle - \langle \vec x_2, \vec u_3\rangle\langle \vec x_2, \vec x_j\rangle = \langle \vec u_3, \vec x_j\rangle - \langle \vec x_j, \vec u_3\rangle\langle \vec x_j, \vec x_j\rangle = \langle \vec u_3, \vec x_j\rangle - \langle \vec x_j, \vec u_3\rangle = 0. \]
Hence the vector ~x3 that we are looking for is
\[ \vec x_3 = \|\vec w_3\|^{-1}\vec w_3. \]
Since ~x3 ∈ U3 it follows that span{~x1 , ~x2 , ~x3 } ⊆ U3 . Since both spaces have dimension 3, they must be equal.


We repeat this k times until we have constructed the basis ~x1 , . . . , ~xk .
Note that the general procedure is as follows:

• Suppose that we already have constructed ~x1 , . . . , ~xℓ . Then we first construct
\[ \vec w_{\ell+1} = \vec u_{\ell+1} - P_{U_\ell}\, \vec u_{\ell+1}. \]
This vector satisfies ~wℓ+1 ∈ Uℓ+1 and ~wℓ+1 ⊥ Uℓ . Note that ~wℓ+1 ≠ ~0 because otherwise we would have that ~uℓ+1 = PUℓ ~uℓ+1 ∈ Uℓ which is impossible because ~u1 , . . . , ~uℓ , ~uℓ+1 are linearly independent. Then ~xℓ+1 = k~wℓ+1 k−1 ~wℓ+1 has all the desired properties.
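The procedure just described fits in a few lines of code. The following is a minimal sketch (Python with NumPy, not part of the notes; the function name is ours) of the Gram-Schmidt process for a list of linearly independent vectors:

    import numpy as np

    def gram_schmidt(vectors):
        # vectors: linearly independent u_1, ..., u_k; returns an orthonormal basis x_1, ..., x_k
        # with span{u_1, ..., u_j} = span{x_1, ..., x_j} for every j
        basis = []
        for u in vectors:
            w = u - sum(np.dot(x, u) * x for x in basis)   # subtract the projection onto U_l
            basis.append(w / np.linalg.norm(w))            # normalise
        return basis

    u1 = np.array([1.0, 1.0, 0.0, 1.0, 1.0])
    u2 = np.array([-1.0, 4.0, np.sqrt(2), 3.0, 2.0])
    u3 = np.array([-2.0, 5.0, 0.0, 0.0, 1.0])
    for x in gram_schmidt([u1, u2, u3]):
        print(np.round(x, 4))    # reproduces the basis computed in Example 7.43 below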

Example 7.43. Let U = span{~u1 , ~u2 , ~u3 } where
\[ \vec u_1 = \begin{pmatrix}1\\1\\0\\1\\1\end{pmatrix}, \qquad \vec u_2 = \begin{pmatrix}-1\\4\\\sqrt 2\\3\\2\end{pmatrix}, \qquad \vec u_3 = \begin{pmatrix}-2\\5\\0\\0\\1\end{pmatrix}. \]
We want to find an orthonormal basis ~x1 , ~x2 , ~x3 of U using the Gram-Schmidt process.

Solution. (i) ~x1 = k~u1 k−1 ~u1 = (1/2) ~u1 .

(ii) \[ \vec w_2 = \vec u_2 - \operatorname{proj}_{\vec x_1}\vec u_2 = \vec u_2 - \langle \vec x_1, \vec u_2\rangle\vec x_1 = \vec u_2 - 4\vec x_1 = \vec u_2 - 2\vec u_1 = \begin{pmatrix}-3\\2\\\sqrt 2\\1\\0\end{pmatrix} \quad\Longrightarrow\quad \vec x_2 = \|\vec w_2\|^{-1}\vec w_2 = \frac14\begin{pmatrix}-3\\2\\\sqrt 2\\1\\0\end{pmatrix}. \]

(iii) \[ \vec w_3 = \vec u_3 - \operatorname{proj}_{\operatorname{span}\{\vec x_1,\vec x_2\}}\vec u_3 = \vec u_3 - \big(\langle \vec x_1, \vec u_3\rangle\vec x_1 + \langle \vec x_2, \vec u_3\rangle\vec x_2\big) = \vec u_3 - 2\vec x_1 - 4\vec x_2 = \begin{pmatrix}-2\\5\\0\\0\\1\end{pmatrix} - \begin{pmatrix}1\\1\\0\\1\\1\end{pmatrix} - \begin{pmatrix}-3\\2\\\sqrt 2\\1\\0\end{pmatrix} = \begin{pmatrix}0\\2\\-\sqrt 2\\-2\\0\end{pmatrix} \]
\[ \Longrightarrow\quad \vec x_3 = \|\vec w_3\|^{-1}\vec w_3 = \frac{1}{\sqrt{10}}\begin{pmatrix}0\\2\\-\sqrt 2\\-2\\0\end{pmatrix}. \]

Therefore the desired orthonormal basis of U is
\[ \vec x_1 = \frac12\begin{pmatrix}1\\1\\0\\1\\1\end{pmatrix}, \qquad \vec x_2 = \frac14\begin{pmatrix}-3\\2\\\sqrt 2\\1\\0\end{pmatrix}, \qquad \vec x_3 = \frac{1}{\sqrt{10}}\begin{pmatrix}0\\2\\-\sqrt 2\\-2\\0\end{pmatrix}. \]


Note that we will obtain a different basis if we change the order of the given basis ~u1 , ~u2 , ~u3 .

Example 7.44. We will give another solution of Example 7.29. We were asked to find an orthonormal basis of the orthogonal complement of
\[ U = \operatorname{span}\left\{ \begin{pmatrix}1\\2\\3\\4\end{pmatrix}, \begin{pmatrix}1\\0\\1\\0\end{pmatrix} \right\}. \]
From Example 7.28 we already know that
\[ U^\perp = \operatorname{span}\{\vec w_1, \vec w_2\} \qquad\text{where}\qquad \vec w_1 = \begin{pmatrix}0\\-2\\0\\1\end{pmatrix}, \quad \vec w_2 = \begin{pmatrix}-1\\-1\\1\\0\end{pmatrix}. \]
We use the Gram-Schmidt process to obtain an orthonormal basis ~x1 , ~x2 of U ⊥ .

(i) ~x1 = k~w1 k−1 ~w1 = (1/√5) ~w1 .

(ii) \[ \vec y_2 = \vec w_2 - \operatorname{proj}_{\vec x_1}\vec w_2 = \vec w_2 - \langle \vec x_1, \vec w_2\rangle\vec x_1 = \vec w_2 - \frac{2}{\sqrt 5}\,\vec x_1 = \begin{pmatrix}-1\\-1\\1\\0\end{pmatrix} - \frac25\begin{pmatrix}0\\-2\\0\\1\end{pmatrix} = \frac15\begin{pmatrix}-5\\-1\\5\\-2\end{pmatrix} \quad\Longrightarrow\quad \vec x_2 = \|\vec y_2\|^{-1}\vec y_2 = \frac{1}{\sqrt{55}}\begin{pmatrix}-5\\-1\\5\\-2\end{pmatrix}. \]

Therefore
\[ \vec x_1 = \frac{1}{\sqrt 5}\begin{pmatrix}0\\-2\\0\\1\end{pmatrix}, \qquad \vec x_2 = \frac{1}{\sqrt{55}}\begin{pmatrix}-5\\-1\\5\\-2\end{pmatrix}. \]

You should now have understood


• why the Gram-Schmidt process works,
• etc.
You should now be able to

• apply the Gram-Schmidt process in order to generate an orthonormal basis of a given sub-
space,
• etc.


Exercises.

1. Using the Gram-Schmidt process, obtain an orthonormal basis of the plane x − y + z = 0.

2. Find an orthonormal basis of R4 that contains a basis of the subspace generated by the vectors (1, 0, 1, 0)^T and (2, −1, 0, 1)^T .

3. Let W = span{ (1, 1, 0, 1)^T , (0, 1, 1, 1)^T , (0, 0, 1, 0)^T } and ~v = (0, 0, 1, 1)^T . Find the element of W closest to ~v and determine the distance from ~v to W .

4. Let
\[ A = \begin{pmatrix} 1 & 2 & 2 & -5 \\ 3 & 2 & 1 & -2 \\ 2 & 0 & -1 & 3 \\ 7 & -2 & 1 & 4 \end{pmatrix}. \]
Determine an orthonormal basis for the column space of A.
5. Let E : 2x + 3y + z = 0 and let PE be the orthogonal projection onto E. Find bases B1 and B2 such that
\[ [P_E]^{B_2}_{B_1} = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}. \]

7.6 Application: Least squares


D

In this section we want to present the least squares method to fit a linear function to certain
measurements. Let us see an example.

Example 7.45. Assume that we want to measure the Hooke constant k of a spring. By Hooke's law we know that

y = y0 + km (7.8)

where y0 is the elongation of the spring without any mass attached and y is the elongation of the
spring when we attach the mass m to it.


[Figure: the spring without a mass (elongation y0 ) and with a mass m attached (elongation y).]

Assume that we measure the elongation for different masses. If Hooke's law is valid and if our measurements were perfect, then our measured points should lie on a line with slope k. However, measurements are never perfect and the points will rather be scattered around a line. Assume that we measured the following.

m 2 3 4 5
y 4.5 5.1 6.1 7.9

Figure 7.4 contains a plot of these measurements in the m-y-plane.

Figure 7.4: The left plot shows the measured data. In the plot on the right we added the two functions g1 (x) = x + 2.5, g2 (x) = 1.1x + 2, which seem to be reasonable candidates for linear approximations to the measured data.

The plot gives us some confidence that Hooke's law holds since the points seem to lie more or less
on a line. How do we best fit a line through the points? The slope seems to be around 1. We could
make the following guesses:

g1 (x) = x + 2.5 or g2 (x) = 1.1x + 2


Which of the two functions is the better approximation? Are there other approximations that are
even better?
The answer to these questions depends very much on how we measure how “good” an approximation is. One very common way is the following: For each measured point, we take the difference ∆j := yj − g(mj ) between the measured value and the value of our test function. Then we square all these differences, sum them, and take the square root, \(\big[\sum_{j=1}^n (y_j - g(m_j))^2\big]^{1/2}\), see also Figure 7.5. The resulting number will be our measure for how good our guess is.

Figure 7.5: The graph on the left shows points for which we want to find an approximating linear function. The graph on the right shows such a linear function and how to measure the error or discrepancy between the measured points and the proposed line. A measure for the error is \(\big[\sum_{j=1}^n \Delta_j^2\big]^{1/2}\).

Before we do this for our data, we make some simple observations.

(i) If all the measured points lie on a line and we take this line as our candidate, then this method
gives the total error 0 as it should be.

(ii) We take the squares of the errors in each measured point so that the error is always counted positive. Otherwise it could happen that the errors cancel each other. If we would simply sum the errors, then the total error could be 0 while the approximating line is quite far from all the measured points.

(iii) There are other ways to measure the error, for example one could use \(\sum_{j=1}^n |y_j - g(m_j)|\), but it turns out the method with the squares has many advantages. (See some course on optimisation for further details.)

Now let us calculate the errors for our measured points and our two proposed functions.

m            2     3     4     5
y (measured) 4.5   5.1   6.1   7.9
g1 (m)       4.5   5.5   6.5   7.5
y − g1       0     −0.4  −0.4  0.4

m            2     3     4     5
y (measured) 4.5   5.1   6.1   7.9
g2 (m)       4.2   5.3   6.4   7.5
y − g2       0.3   −0.2  −0.3  0.4


Therefore we find for the errors
\[ \text{Error for function } g_1: \quad \Delta^{(1)} = \big[0^2 + (-0.4)^2 + (-0.4)^2 + 0.4^2\big]^{\frac12} = [0.48]^{\frac12} \approx 0.693, \]
\[ \text{Error for function } g_2: \quad \Delta^{(2)} = \big[0.3^2 + (-0.2)^2 + (-0.3)^2 + 0.4^2\big]^{\frac12} = [0.38]^{\frac12} \approx 0.616, \]


so our second guess seems to be closer to the best linear approximation to our measured points
than the first guess. This exercise will be continued below.

Now the question arises how we can find the optimal linear approximation.

Best linear approximation. Assume we are given measured data (x1 , y1 ), . . . , (xn , yn ) and we
want to find a linear function g(x) = ax + b such that the total error
\[ \Delta := \Big[\sum_{j=1}^n (y_j - g(x_j))^2\Big]^{\frac12} \tag{7.9} \]
is minimal. In other words, we have to find the parameters a and b such that ∆ becomes as small
as possible. The key here is to recognise the right hand side on (7.9) as the norm of a vector (here
the particular form of how we chose to measure the error is crucial). Let us rewrite (7.9) as follows:
 
\[ \Delta = \Big[\sum_{j=1}^n (y_j - g(x_j))^2\Big]^{\frac12} = \Big[\sum_{j=1}^n (y_j - (ax_j + b))^2\Big]^{\frac12} = \left\| \begin{pmatrix} y_1 - (ax_1 + b) \\ y_2 - (ax_2 + b) \\ \vdots \\ y_n - (ax_n + b) \end{pmatrix} \right\| = \left\| \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} - \left[ a\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} + b\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} \right] \right\|. \]
Let us set
\[ \vec y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad \vec x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \qquad\text{and}\qquad \vec u = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}. \tag{7.10} \]
Note that these are vectors in Rn . Then

∆ = k~y − [a~x + b~u]k

and the question is how we have to choose a and b such that this becomes as small as possible. In
other words, we are looking for the point in the vector space spanned by ~x and ~u which is closest
to ~y . By Theorem 7.39 this point is given by the orthogonal projection of ~y onto that plane.
To calculate this projection, set U = span{~x, ~u} and let P be the orthogonal projection onto U .
Then by our reasoning
P ~y = a~x + b~u. (7.11)


Now let us see how we can calculate a and b easily from (7.11). In the following we will assume that ~x and ~u are linearly independent so that U is a plane. This assumption seems to be reasonable because linear dependence would mean that x1 = · · · = xn (in our example with the spring this would mean that we always used the same mass in the experiment). Observe that if ~x, ~u were linearly dependent, then the matrix A below would have only one column; everything else works just the same.
Recall that by Theorem 7.41 the orthogonal projection onto U is given by
\[ P = A(A^t A)^{-1} A^t \]
where A is the n × 2 matrix whose columns consist of the vectors ~x and ~u. Therefore (7.11) becomes
\[ A(A^t A)^{-1} A^t \vec y = a\vec x + b\vec u = A\begin{pmatrix} a \\ b \end{pmatrix}. \tag{7.12} \]
Since by our assumption the columns of A are linearly independent, A is injective. Therefore we can conclude from (7.12) that
\[ (A^t A)^{-1} A^t \vec y = \begin{pmatrix} a \\ b \end{pmatrix}, \]
which is the formula for the numbers a and b that we were looking for. (Of course, you could also simply calculate P ~y and then solve the resulting linear n × 2 system to find the coefficients a and b.)

Let us summarise our reasoning above in a theorem.

Theorem 7.46. Let (x1 , y1 ), . . . , (xn , yn ) be given. The linear function g(x) = ax + b which minimises the total error
\[ \Delta := \Big[\sum_{j=1}^n (y_j - g(x_j))^2\Big]^{\frac12} \tag{7.13} \]
is given by
\[ \begin{pmatrix} a \\ b \end{pmatrix} = (A^t A)^{-1} A^t \vec y \tag{7.14} \]
where ~y , ~x and ~u are as in (7.10) and A is the n × 2 matrix whose columns consist of the vectors ~x and ~u.

In Remark 7.47 we will show how this formula can be derived with methods from calculus.

Exercise 7.45 continued. Let us use Theorem 7.46 to calculate the best linear approximation to the data from Exercise 7.45. Note that in this case the mj correspond to the xj from the theorem and we will write ~m instead of ~x. In this case, we have
\[ \vec m = \begin{pmatrix}2\\3\\4\\5\end{pmatrix}, \qquad \vec u = \begin{pmatrix}1\\1\\1\\1\end{pmatrix}, \qquad A = (\vec m \mid \vec u) = \begin{pmatrix}2&1\\3&1\\4&1\\5&1\end{pmatrix}, \qquad \vec y = \begin{pmatrix}4.5\\5.1\\6.1\\7.9\end{pmatrix}, \]


hence
\[ A^t A = \begin{pmatrix} 2&3&4&5 \\ 1&1&1&1 \end{pmatrix}\begin{pmatrix} 2&1\\3&1\\4&1\\5&1 \end{pmatrix} = \begin{pmatrix} 54 & 14 \\ 14 & 4 \end{pmatrix}, \qquad (A^t A)^{-1} = \frac{1}{10}\begin{pmatrix} 2 & -7 \\ -7 & 27 \end{pmatrix} \]
and therefore
\[ \begin{pmatrix} a\\b \end{pmatrix} = (A^t A)^{-1}A^t \vec y = \frac{1}{10}\begin{pmatrix} 2 & -7 \\ -7 & 27 \end{pmatrix}\begin{pmatrix} 2&3&4&5 \\ 1&1&1&1 \end{pmatrix}\begin{pmatrix} 4.5\\5.1\\6.1\\7.9 \end{pmatrix} = \frac{1}{10}\begin{pmatrix} -3&-1&1&3 \\ 13&6&-1&-8 \end{pmatrix}\begin{pmatrix} 4.5\\5.1\\6.1\\7.9 \end{pmatrix} = \begin{pmatrix} 1.12 \\ 1.98 \end{pmatrix}. \]
We conclude that the best linear approximation is
\[ g(m) = 1.12\,m + 1.98. \]

Figure 7.6: The plot shows the measured data and the linear approximation g(m) = 1.12m + 1.98 calculated with Theorem 7.46.
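The numbers above are easy to check on a computer. The sketch below assumes Python with NumPy (np.linalg.lstsq is NumPy's built-in least squares solver) and evaluates formula (7.14) for the spring data:

    import numpy as np

    m = np.array([2.0, 3.0, 4.0, 5.0])
    y = np.array([4.5, 5.1, 6.1, 7.9])
    A = np.column_stack([m, np.ones_like(m)])      # columns are the vectors m and u

    ab = np.linalg.inv(A.T @ A) @ A.T @ y          # formula (7.14)
    print(ab)                                      # [1.12 1.98]

    # the same coefficients via NumPy's built-in least squares solver
    print(np.linalg.lstsq(A, y, rcond=None)[0])    # [1.12 1.98]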

The method above can be generalised to other types of functions. We will show how it can be
adapted to the case of polynomial and of exponential functions.

Polynomial functions. Assume we are given measured data (x1 , y1 ), . . . , (xn , yn ) and we want to find a polynomial of degree k which best fits the data points. Let p(x) = ak xk + ak−1 xk−1 + · · · + a1 x + a0 be the desired polynomial. We define the vectors
\[ \vec y = \begin{pmatrix} y_1\\ y_2\\ \vdots\\ y_n \end{pmatrix}, \quad \vec\xi_k = \begin{pmatrix} x_1^k\\ x_2^k\\ \vdots\\ x_n^k \end{pmatrix}, \quad \vec\xi_{k-1} = \begin{pmatrix} x_1^{k-1}\\ x_2^{k-1}\\ \vdots\\ x_n^{k-1} \end{pmatrix}, \quad \dots, \quad \vec\xi_1 = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix}, \quad \vec\xi_0 = \begin{pmatrix} 1\\ 1\\ \vdots\\ 1 \end{pmatrix}. \]


If the vectors ~ξk , . . . , ~ξ0 are linearly independent, then
\[ \begin{pmatrix} a_k \\ \vdots \\ a_1 \\ a_0 \end{pmatrix} = (A^t A)^{-1} A^t \vec y \]

where A = (ξ~k | . . . | ξ~0 ) is the n × (k + 1) matrix whose columns are the vectors ξ~k , . . . , ξ~0 . Note
that by our assumption k < n (otherwise the vectors ξ~k , . . . , ξ~0 cannot be linearly independent).
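The same recipe in code: the sketch below (Python with NumPy, not part of the notes; np.vander builds the matrix whose columns are the powers of the data points) fits a line and a parabola to the three data points used in the remark that follows.

    import numpy as np

    def poly_fit(x, y, k):
        # columns of A are xi_k, ..., xi_1, xi_0 (powers of the data points x)
        A = np.vander(x, k + 1)                    # Vandermonde matrix, highest power first
        return np.linalg.inv(A.T @ A) @ A.T @ y    # coefficients a_k, ..., a_1, a_0

    x = np.array([0.0, 1.0, 3.0])
    y = np.array([0.25, 0.0, 1.0])
    print(poly_fit(x, y, 1))   # [0.2857... 0.0357...]  i.e. g(x) = (2/7) x + 1/28
    print(poly_fit(x, y, 2))   # [0.25 -0.5 0.25]       i.e. p(x) = x^2/4 - x/2 + 1/4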

Remark. Generally one should have many more data points than the degree of the polynomial one wants to fit; otherwise the problem of overfitting might occur. For example, assume that the curve we are looking for is f (x) = 0.1 + 0.2x and we are given only three measurements: (0, 0.25), (1, 0), (3, 1). Then a linear fit would give us g(x) = (2/7) x + 1/28 ≈ 0.29x + 0.036. The fit with a quadratic function gives p(x) = (1/4) x² − (1/2) x + 1/4 which matches the data points perfectly but is far away from the curve we are looking for. The reason is that we have too many free parameters in the polynomial, so it fits the data too well. (Note that for any given n + 1 points (x1 , y1 ), . . . , (xn+1 , yn+1 ) with pairwise distinct x1 , . . . , xn+1 , there exists exactly one polynomial p of degree ≤ n such that p(xj ) = yj for every j = 1, . . . , n + 1.) If we had a lot more data points and we tried to fit a polynomial to a linear function, then the leading coefficient would become very small, but this effect does not appear if we have very few data points.

Figure 7.7: Example of overfitting when we have too many free variables for a given set of data points. The dots mark the measured points which are supposed to approximate the red curve f . Fitting a polynomial p of degree 2 leads to the green curve. The blue curve g is the result of a linear fit.

Exponential functions. Assume we are given measured data (x1 , y1 ), . . . , (xn , yn ) and we want to find a function of the form g(x) = c e^{kx} to fit our data points. Without restriction we may assume that c > 0 (otherwise we fit −g). Then we only need to define h(x) = ln(g(x)) = ln c + kx so that we can use the method to fit a linear function to the data points (x1 , ln(y1 )), . . . , (xn , ln(yn )) in order to obtain ln c and k, and hence c and k.
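A sketch of this log-transform trick (Python with NumPy, not part of the notes; the data are taken from Exercise 2 at the end of this section, and the printed values are only approximate):

    import numpy as np

    t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    P = np.array([7.4, 6.5, 5.7, 5.2, 4.9])        # data of Exercise 2 below

    # fit ln(P) = ln(c) + k t with the linear least squares formula (7.14)
    A = np.column_stack([t, np.ones_like(t)])
    k, ln_c = np.linalg.inv(A.T @ A) @ A.T @ np.log(P)
    c = np.exp(ln_c)
    print(c, k)    # approximately c = P0 = 8.04 and k = -0.105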

Remark 7.47. Let us show how the formula in Theorem 7.46 can be derived with analytic methods.
Recall that the problem is the following: Let (x1 , y1 ), . . . , (xn , yn ) be given. Find a linear function


g(x) = ax + b which minimises the total error
\[ \Delta := \Big[\sum_{j=1}^n (y_j - g(x_j))^2\Big]^{\frac12} = \Big[\sum_{j=1}^n (y_j - [ax_j + b])^2\Big]^{\frac12}. \]

Let us consider ∆ as function of a and b. Then we have to find the minimum of


n
hX i 12
∆(a, b) = (yj − [axj + b])2
j=1

as a function of the two variables a, b. In order to simplify the calculations a bit, we observe that
it is enough to minimise the square of ∆ since ∆(a, b) ≥ 0 for all a, b, and therefore it is minimal if
and only if its square is minimal. So we want to find a, b which minimise
n
X
F (a, b) := (∆(a, b))2 = (yj − axj − b)2 . (7.15)
j=1

To this end, we have to differentiate F. Since F : R² → R, the derivative will be a vector valued function. We find
\[
DF(a, b) = \Bigl(\frac{\partial F}{\partial a}(a, b), \frac{\partial F}{\partial b}(a, b)\Bigr)
= \Bigl(\sum_{j=1}^n -2x_j(y_j - ax_j - b),\; \sum_{j=1}^n -2(y_j - ax_j - b)\Bigr)
= 2\Bigl(a\sum_{j=1}^n x_j^2 + b\sum_{j=1}^n x_j - \sum_{j=1}^n x_jy_j,\;\; a\sum_{j=1}^n x_j + nb - \sum_{j=1}^n y_j\Bigr).
\]

Now we need to find the critical points, that is, a, b such that DF(a, b) = 0. This is the case for
\[
\begin{cases}
a\sum_{j=1}^n x_j^2 + b\sum_{j=1}^n x_j = \sum_{j=1}^n x_jy_j,\\[1ex]
a\sum_{j=1}^n x_j + bn = \sum_{j=1}^n y_j,
\end{cases}
\qquad\text{that is}\qquad
\begin{pmatrix} \sum_{j=1}^n x_j^2 & \sum_{j=1}^n x_j \\ \sum_{j=1}^n x_j & n \end{pmatrix}
\begin{pmatrix} a \\ b \end{pmatrix}
=
\begin{pmatrix} \sum_{j=1}^n x_jy_j \\ \sum_{j=1}^n y_j \end{pmatrix}.
\tag{7.16}
\]

Now we can multiply both sides from the left by the inverse of the matrix and obtain the solution for a, b. This shows that F has only one critical point. Since F tends to infinity for ‖(a, b)‖ → ∞, the function F must indeed have a minimum at this critical point. For details, see a course on vector calculus or optimisation.
We observe the following: If, as before, we set
\[
\vec x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix},\quad
\vec u = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix},\quad
A = (\vec x\,|\,\vec u) = \begin{pmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix},\quad
\vec y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},
\]

then


\[
\sum_{j=1}^n x_j^2 = \langle\vec x, \vec x\rangle,\quad
\sum_{j=1}^n x_j = \langle\vec x, \vec u\rangle,\quad
n = \langle\vec u, \vec u\rangle,\quad
\sum_{j=1}^n x_jy_j = \langle\vec x, \vec y\rangle,\quad
\sum_{j=1}^n y_j = \langle\vec u, \vec y\rangle.
\]
Therefore the expressions in equation (7.16) can be rewritten as
\[
\begin{pmatrix} \sum_{j=1}^n x_j^2 & \sum_{j=1}^n x_j \\ \sum_{j=1}^n x_j & n \end{pmatrix}
= \begin{pmatrix} \langle\vec x, \vec x\rangle & \langle\vec x, \vec u\rangle \\ \langle\vec u, \vec x\rangle & \langle\vec u, \vec u\rangle \end{pmatrix}
= \begin{pmatrix} \vec x^{\,t} \\ \vec u^{\,t} \end{pmatrix}(\vec x\,|\,\vec u) = A^t A,
\qquad
\begin{pmatrix} \sum_{j=1}^n x_jy_j \\ \sum_{j=1}^n y_j \end{pmatrix}
= \begin{pmatrix} \langle\vec x, \vec y\rangle \\ \langle\vec u, \vec y\rangle \end{pmatrix}
= \begin{pmatrix} \vec x^{\,t} \\ \vec u^{\,t} \end{pmatrix}\vec y = A^t\vec y
\]
and we recognise that equation (7.16) is the same as
\[
A^t A \begin{pmatrix} a \\ b \end{pmatrix} = A^t\vec y
\]
which becomes our equation (7.14) if we multiply both sides of the equation from the left by (A^t A)^{-1}.

You should now have understood

• what the least squares method is,
• how it is related to orthogonal projections,
• what overfitting is,
• etc.

You should now be able to

• fit a linear function to given data points,
• fit a polynomial to given data points,
• fit an exponential function to given data points,
• etc.

Exercises.
1. A ball rolls along the x-axis with constant velocity. Along the trajectory of the ball the x-coordinates of the ball are measured at certain times t. The measurements are (t in seconds, x in metres):

x 1.5 2.0 3.0 4.0 4.5 6
t 1.4 2.3 4.7 6.6 7.4 10.8

(a) Sketch the points in the tx-plane.


(b) Use the method of least squares to find the initial position x0 and the velocity v of the ball.
(c) Draw the obtained line in the sketch from (a). Where/how can x0 and v be seen?

Hint. Recall that x(t) = x0 + vt for a motion with constant velocity.

2. An unstable chemical substance is assumed to decay according to the law P (t) = P0 e^{kt} . Suppose that the following measurements were made:

t 1 2 3 4 5
P 7.4 6.5 5.7 5.2 4.9

Using the method of least squares applied to ln(P (t)), find the P0 and k which best fit the measurements. Give an estimate for P (8).

3. Using the method of least squares, find the polynomial y = p(x) of degree 2 which best approximates the following data:

x -2 -1 0 1 2 3 4
y 15 8 2.8 -1.2 -4.9 -7.9 -8.7

7.7 Summary
Let U be a subspace of Rn . Then its orthogonal complement is defined by

U ⊥ = {~x ∈ Rn : ~x ⊥ ~u for all ~u ∈ U }.

For any subspace U ⊆ Rn the following is true:

• U ⊥ is a vector space.
• U ⊥ = ker A where A is any matrix whose rows are formed by a basis of U .
• (U ⊥ )⊥ = U .

• dim U + dim U ⊥ = n.
• U ⊕ U ⊥ = Rn .
• U has an orthonormal basis. One way to construct such a basis is to first construct an
arbitrary basis of U and then apply the Gram-Schmidt orthogonalisation process to obtain
an orthonormal basis.
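The Gram-Schmidt process mentioned in the last item can also be carried out numerically. The following is a minimal Python/NumPy sketch (the input vectors are hypothetical and assumed linearly independent):

import numpy as np

def gram_schmidt(vectors):
    # returns an orthonormal basis of span(vectors)
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        for u in basis:                    # subtract projections onto previous vectors
            w -= np.dot(w, u) * u
        basis.append(w / np.linalg.norm(w))
    return basis

U = [np.array([1.0, 1.0, 1.0, 1.0]), np.array([2.0, 1.0, 1.0, 0.0])]
print(gram_schmidt(U))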

Orthogonal projection onto a subspace U ⊆ Rn


Let PU : Rn → Rn be the orthogonal projection onto U . Then

• PU is a linear transformation.
• PU ~x k U for every ~x ∈ Rn .

Last Change: Wed May 8 12:13:52 PM -05 2024


Linear Algebra, M. Winklmeier
Chapter 7. Orthonormal bases and orthogonal projections in Rn 309

• ~x − PU ~x ⊥ U for every ~x ∈ Rn .
• For every ~x ∈ Rn the point in U nearest to ~x is PU ~x and dist(~x, U ) = k~x − PU ~xk.
• Formulas for PU :

– If ~u1 , . . . , ~uk is an orthonormal basis of U , then

PU = h~u1 , ·i ~u1 + · · · + h~uk , ·i ~uk ,

that is PU ~x = h~u1 , ~xi ~u1 + · · · + h~uk , ~xi ~uk for every ~x ∈ Rn .

– If B is any matrix whose columns form a basis of U , then PU = B(B t B)−1 B t .
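A quick numerical illustration of the last formula (Python/NumPy, with a hypothetical basis of U):

import numpy as np

# the columns of B form a (not necessarily orthonormal) basis of U
B = np.array([[1.0, 2.0],
              [1.0, 1.0],
              [1.0, 1.0],
              [1.0, 0.0]])
P = B @ np.linalg.inv(B.T @ B) @ B.T      # matrix of the orthogonal projection onto U

x = np.array([1.0, 2.0, 0.0, 1.0])
print(P @ x)                              # nearest point to x in U
print(np.linalg.norm(x - P @ x))          # dist(x, U)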

Orthogonal matrices
A matrix Q ∈ M (n × n) is called an orthogonal matrix if it is invertible and if Q−1 = Qt . Note
that the following assertions for a matrix Q ∈ M (n × n) are equivalent:

(i) Q is an orthogonal matrix.


(ii) Qt is an orthogonal matrix.
(iii) Q−1 is an orthogonal matrix.
(iv) The columns of Q are an orthonormal basis of Rn .
(v) The rows of Q are an orthonormal basis of Rn .
(vi) Q preserves inner products, that is h~x , ~y i = hQ~x , Q~y i for all ~x, ~y ∈ Rn .
(vii) Q preserves lengths, that is k~xk = kQ~xk for all ~x ∈ Rn .

Every orthogonal matrix represents either a rotation (in this case its determinant is 1) or a com-
position of a rotation with a reflection (in this case its determinant is −1).
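The equivalences above are easy to check numerically; here is a small sketch (a rotation matrix is used as an example):

import numpy as np

phi = 0.7
Q = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])

print(np.allclose(Q.T @ Q, np.eye(2)))          # Q^t = Q^{-1}
x = np.array([3.0, -1.0]); y = np.array([0.5, 2.0])
print(np.isclose(x @ y, (Q @ x) @ (Q @ y)))     # inner products are preserved
print(np.isclose(np.linalg.det(Q), 1.0))        # a rotation has determinant 1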

7.8 Exercises

 
1. (a) Complete (1/4, √(15/16))^t to an orthonormal basis of R2 . How many possibilities are there to do so?
(b) Complete (1/√2, −1/√2, 0)^t , (1/√3, 1/√3, 1/√3)^t to an orthonormal basis of R3 . How many possibilities are there to do so?
(c) Complete (1/√2, 1/√2, 0)^t to an orthonormal basis of R3 . How many possibilities are there to do so?


2. Find a basis for the orthogonal complement of the following vector spaces. Determine the dimension of the space and the dimension of its orthogonal complement.

(a) U = span{(1, 2, 3, 4)t , (2, 3, 4, 5)t } ⊆ R4 ,   (b) U = span{(1, 2, 3, 4)t , (2, 3, 4, 5)t , (3, 4, 5, 6)t } ⊆ R4 .
   

3. (a) Let U = {(x, y, z)t ∈ R3 : x + 2y + 3z = 0} ⊆ R3 .
i. Let ~v = (0, 2, 5)t . Find the point ~x ∈ U which is closest to ~v and compute the distance between ~v and ~x.
ii. Is there a point ~y ∈ U which is at maximal distance from ~v ?
iii. Find the matrix which represents the orthogonal projection onto U (in the standard basis).
(b) Let W = span{(1, 1, 1, 1)t , (2, 1, 1, 0)t } ⊆ R4 .
i. Find an orthogonal basis of W .
ii. Let ~a1 = (1, 2, 0, 1)t , ~a2 = (11, 4, 4, −3)t , ~a3 = (0, −1, −1, 0)t . For each j = 1, 2, 3 find the point w~ j ∈ W which is closest to ~aj and compute the distance between ~aj and w~ j .
iii. Find the matrix which represents the orthogonal projection onto W (in the standard basis).
       
    0 −1 1 0
0 1 3 0 0 0
2
 ~ = 2, ~a = 4, ~b =  0 , ~c = 1, d~ = 1.
RA
         
2, w
4. Sean ~v =  3        
0 0 0 0
1 5
0 3 1 1
(a) Demuestre que ~v y w~ son linealmente independientes y encuentre una base ortonormal
~ ⊆ R4 .
de U = span{~v , w}
(b) Demuestre que ~a, ~b, ~c y d~ son linealmente independientes. Use el proceso de Gram-
Schmidt para encontrar una base ortonormal de U = span{~a, ~b, ~c, d}
~ ⊆ R5 . Encuentre

una base de U .

5. Find an orthonormal basis of U ⊥ where U = span{(1, 0, 2, 4)t } ⊆ R4 .

6. (a) Let ϕ ∈ R and let ~v1 = (cos ϕ, − sin ϕ)t , ~v2 = (sin ϕ, cos ϕ)t . Show that ~v1 , ~v2 is an orthonormal basis of R2 .
(b) Let α ∈ R. Find the matrix Q(α) ∈ M (2 × 2) which describes the counterclockwise rotation by α.
(c) Let α, β ∈ R. Explain why it is clear that Q(α)Q(β) = Q(α + β). Use this relation to deduce the trigonometric identities
cos(α + β) = cos α cos β − sin α sin β,  sin(α + β) = sin α cos β + cos α sin β.


7. Let O(n) = {Q ∈ M (n × n) : Q is an orthogonal matrix} and SO(n) = {Q ∈ O(n) : det Q = 1}.

(a) Show that O(n) with composition is a group. That is, one has to prove:
i. For all Q, R ∈ O(n), the composition QR is an element of O(n).
ii. There exists an E ∈ O(n) such that QE = Q and EQ = Q for every Q ∈ O(n).
iii. For every Q ∈ O(n) there exists an inverse element Q̃ such that QQ̃ = Q̃Q = E.
(b) Is O(n) commutative (that is, does QR = RQ hold for all Q, R ∈ O(n))?
(c) Show that SO(n) with composition is a group.

8. Let T : Rn → Rm be an isometry. Show that T is injective and that m ≥ n.

Chapter 8

Symmetric matrices and diagonalisation

In this chapter we work mostly in Rn and in Cn . We write MR (n × n) or MC (n × n) only if it is important whether the matrix under consideration is a real or a complex matrix.
The first section is dedicated to Cn . We already know that it is a vector space. But now we introduce an inner product on it. Moreover we define hermitian and unitary matrices on Cn which are analogous to symmetric and orthogonal matrices in Rn . We define eigenvalues and eigenvectors in Section 8.3. It turns out that it is more convenient to work over C because the eigenvalues are zeros of the so-called characteristic polynomial and in C every polynomial has a zero. The main theorem is Theorem 8.48 which says that an n × n matrix is diagonalisable if it has enough eigenvectors to generate Cn (or Rn ). It turns out that every symmetric and every hermitian matrix is diagonalisable.
We end the chapter with an application of orthogonal diagonalisation to the solution of quadratic equations in two variables.

8.1 Complex vector spaces


In this section we introduce Cn as an inner product space because some calculations about eigen-
values later in this chapter are more natural in Cn than in Rn . Most of this section may be skipped.
The important part is the definition of the inner product on Cn , the notion of orthogonality derived
from it, and the concept of hermitian and unitary matrices.
Similarly to Rn , we define the vector space Cn as the set
\[
\mathbb{C}^n = \Bigl\{ \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix} : z_1, \dots, z_n \in \mathbb{C} \Bigr\}
\]

together with the sum and multiplication by a scalar c ∈ C:


         
w1 z1 w1 + z1 z1 cz1
 ..   ..  ..  ..   .. 
+ := , c :=  . .
 
 .  .  .  .
wn zn wn + zn zn czn
It is not hard to check that Cn together with these operations satisfies the vector space axioms
from Definition 5.1 with K = C, hence it is a complex vector space. In particular, we have concepts
like linear independence of vectors, basis and dimension of Cn , etc.
Next we introduce an inner product on Cn . As in the case of real vectors, we would like to interpret h~z , ~zi as the square of the norm of ~z; in particular it should be a nonnegative real number. For C1 = C, the vectors are just complex numbers ~z = z1 and we would like to have h~z , ~zi = |z1 |² = z_1\overline{z_1} where \overline{z} denotes the complex conjugate of the complex number z. This motivates us to define the inner product in Cn as follows.
 
Definition 8.1 (Inner product and norm of a vector in Cn ). For vectors ~z = (z_1 , \dots , z_n)^t and ~w = (w_1 , \dots , w_n)^t ∈ Cn the inner product (or scalar product or dot product) is defined as
\[
\langle \vec z , \vec w \rangle = \sum_{j=1}^{n} z_j \overline{w_j} = z_1\overline{w_1} + \cdots + z_n\overline{w_n}.
\]
The length of ~z = (z_1 , \dots , z_n)^t ∈ Cn is denoted by k~zk and it is given by
\[
\|\vec z\| = \sqrt{|z_1|^2 + \cdots + |z_n|^2}.
\]
Other names for the length of ~z are magnitude of ~z or norm of ~z.
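In NumPy this inner product can be computed directly; a small illustrative sketch (note that np.vdot conjugates its first argument, so with the convention above, which conjugates the second argument, one has ⟨z, w⟩ = np.vdot(w, z)):

import numpy as np

z = np.array([1 + 2j, 3j])
w = np.array([2 - 1j, 1 + 1j])

inner = np.sum(z * np.conj(w))      # <z, w> with conjugation on the second argument
print(inner, np.vdot(w, z))         # the two expressions agree
print(np.linalg.norm(z), np.sqrt(np.sum(np.abs(z)**2)))   # two ways to get ||z||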

Exercise 8.2. Show that the scalar product from Definition 8.1 can be viewed as an extension of
the scalar product in Rn in the following sense: If the components of ~z and ~v happen to be real,
then they can also be seen as vectors in Rn . The claim is that their scalar product as vectors in
Rn is equal to their scalar product in Cn . The same is true for their norms.

Properties 8.3. (i) Norm of a vector: For all vectors ~z ∈ Cn , we have that h~z , ~zi = k~zk² .

(ii) Symmetry of the inner product: For all vectors ~v , ~w ∈ Cn , we have (note the complex conjugation on the right hand side!)
\[
\langle \vec v , \vec w \rangle = \overline{\langle \vec w , \vec v \rangle}.
\]


(iii) Sesquilinearity of the inner product: For all vectors ~v , ~w , ~z ∈ Cn and all c ∈ C, we have that
\[
\langle \vec v + c\vec w , \vec z \rangle = \langle \vec v , \vec z \rangle + c\langle \vec w , \vec z \rangle
\qquad\text{and}\qquad
\langle \vec v , \vec w + c\vec z \rangle = \langle \vec v , \vec w \rangle + \overline{c}\,\langle \vec v , \vec z \rangle.
\]

(iv) For all vectors ~v ∈ Cn and c ∈ C, we have that kc~v k = |c| k~v k.

Proof. Let ~v = (v_1 , \dots , v_n)^t , ~w = (w_1 , \dots , w_n)^t , ~z = (z_1 , \dots , z_n)^t ∈ Cn and let c ∈ C.

(i) h~z , ~zi = z_1\overline{z_1} + · · · + z_n\overline{z_n} = |z_1 |² + · · · + |z_n |² = k~zk² .

(ii) \langle\vec v , \vec w\rangle = v_1\overline{w_1} + \cdots + v_n\overline{w_n} = \overline{\overline{v_1}w_1 + \cdots + \overline{v_n}w_n} = \overline{w_1\overline{v_1} + \cdots + w_n\overline{v_n}} = \overline{\langle\vec w , \vec v\rangle}.

(iii) A straightforward calculation shows
\[
\langle\vec v + c\vec w , \vec z\rangle = (v_1 + cw_1)\overline{z_1} + \cdots + (v_n + cw_n)\overline{z_n}
= v_1\overline{z_1} + \cdots + v_n\overline{z_n} + c\bigl(w_1\overline{z_1} + \cdots + w_n\overline{z_n}\bigr)
= \langle\vec v , \vec z\rangle + c\langle\vec w , \vec z\rangle.
\]
The second equation can be shown by an analogous calculation. Instead of repeating it, we can also use the symmetry property of the inner product:
\[
\langle\vec v , \vec w + c\vec z\rangle = \overline{\langle\vec w + c\vec z , \vec v\rangle} = \overline{\langle\vec w , \vec v\rangle + c\langle\vec z , \vec v\rangle} = \overline{\langle\vec w , \vec v\rangle} + \overline{c}\,\overline{\langle\vec z , \vec v\rangle} = \langle\vec v , \vec w\rangle + \overline{c}\,\langle\vec v , \vec z\rangle.
\]

(iv) kc~zk² = hc~z , c~zi = c\overline{c}\,h~z , ~zi = |c|² k~zk² . Taking the square root on both sides, we obtain the desired equality kc~zk = |c| k~zk.

For Cn there is no cosine theorem and in general it does not make too much sense to speak about
the angle between two complex vectors (orthogonality still makes sense!).

Definition 8.4. Let ~z, ~v ∈ Cn .


(i) The vectors ~z, ~v are called orthogonal or perpendicular if h~z , ~v i = 0. In this case we write
~z ⊥ ~v .

(ii) If ~v ≠ ~0, then the orthogonal projection of ~z onto ~v is proj~v ~z = \frac{\langle\vec z , \vec v\rangle}{\|\vec v\|^2}\, \vec v .

The next proposition shows that orthogonality works in Cn as expected.

Proposition 8.5. Let ~z, ~v ∈ Cn .


(i) Pythagoras theorem: If ~z ⊥ ~v , then k~z + ~v k2 = k~zk2 + k~v k2 .
(ii) If ~v 6= ~0, then ~z = ~zk + ~z⊥ with ~zk := proj~v ~z and ~z⊥ := ~z − proj~v ~z and

proj~v ~z k ~v , and ~z − proj~v ~z ⊥ ~v .

Moreover, k proj~v ~zk ≤ k~zk.


(iii) If ~v 6= ~0, then Cn → Cn , ~z 7→ proj~v ~z is a linear map.

Proof. (i) If ~z ⊥ ~v , then k~z + ~v k² = h~z , ~zi + h~z , ~v i + h~v , ~zi + h~v , ~v i = h~z , ~zi + h~v , ~v i = k~zk² + k~v k² .
(ii) It is clear that ~z = ~zk + ~z⊥ and that ~zk k ~v by definition of ~zk and ~z⊥ . That ~z⊥ ⊥ ~v follows from
\[
\langle\vec z_\perp , \vec v\rangle = \langle\vec z - \operatorname{proj}_{\vec v}\vec z , \vec v\rangle = \langle\vec z , \vec v\rangle - \langle\operatorname{proj}_{\vec v}\vec z , \vec v\rangle = \langle\vec z , \vec v\rangle - \frac{\langle\vec z , \vec v\rangle}{\|\vec v\|^2}\langle\vec v , \vec v\rangle = \langle\vec z , \vec v\rangle - \langle\vec z , \vec v\rangle = 0.
\]
Finally, by the Pythagoras theorem,
\[
\|\vec z\|^2 = \|(\vec z - \operatorname{proj}_{\vec v}\vec z) + \operatorname{proj}_{\vec v}\vec z\|^2 = \|\vec z - \operatorname{proj}_{\vec v}\vec z\|^2 + \|\operatorname{proj}_{\vec v}\vec z\|^2 \ge \|\operatorname{proj}_{\vec v}\vec z\|^2 .
\]
(iii) Assume that ~v ≠ ~0 and let ~z1 , ~z2 ∈ Cn and c ∈ C. Then
\[
\operatorname{proj}_{\vec v}(\vec z_1 + c\vec z_2) = \frac{\langle\vec z_1 + c\vec z_2 , \vec v\rangle}{\|\vec v\|^2}\,\vec v = \frac{\langle\vec z_1 , \vec v\rangle + c\langle\vec z_2 , \vec v\rangle}{\|\vec v\|^2}\,\vec v = \frac{\langle\vec z_1 , \vec v\rangle}{\|\vec v\|^2}\,\vec v + c\,\frac{\langle\vec z_2 , \vec v\rangle}{\|\vec v\|^2}\,\vec v = \operatorname{proj}_{\vec v}\vec z_1 + c\operatorname{proj}_{\vec v}\vec z_2 .
\]

Question 8.1
What changes if in the definition of the orthogonal projection we put h~v , ~zi instead of h~z , ~v i?

Now let us show the triangle inequality. Note that the following inequalities (8.1) and (8.2) were proved for real vector spaces in Corollary 2.20 using the cosine theorem.
Proposition 8.6. For all vectors ~v , w ~ ∈ Cn and c ∈ C, we have the Cauchy-Schwarz inequality
(which is a special case of the so-called Hölder inequality)

|h~v , wi|
~ ≤ k~v k kwk
~ (8.1)

and the triangle inequality


k~v + wk
~ ≤ k~v k + kwk.
~ (8.2)

Proof. We will first show (8.1). It is obviously true if ~w = ~0 because in this case both sides of the inequality are equal to 0. So let us assume now that ~w ≠ ~0. Note that for any λ ∈ C we have that
\[
0 \le \|\vec v - \lambda\vec w\|^2 = \langle\vec v - \lambda\vec w , \vec v - \lambda\vec w\rangle = \|\vec v\|^2 - \lambda\langle\vec w , \vec v\rangle - \overline{\lambda}\langle\vec v , \vec w\rangle + |\lambda|^2\|\vec w\|^2 .
\]
If we choose λ = \frac{\langle\vec v , \vec w\rangle}{\|\vec w\|^2}, we obtain
\[
0 \le \|\vec v\|^2 - \frac{\langle\vec v , \vec w\rangle}{\|\vec w\|^2}\langle\vec w , \vec v\rangle - \overline{\Bigl(\frac{\langle\vec v , \vec w\rangle}{\|\vec w\|^2}\Bigr)}\langle\vec v , \vec w\rangle + \frac{|\langle\vec v , \vec w\rangle|^2}{\|\vec w\|^4}\|\vec w\|^2
= \|\vec v\|^2 - 2\,\frac{|\langle\vec v , \vec w\rangle|^2}{\|\vec w\|^2} + \frac{|\langle\vec v , \vec w\rangle|^2}{\|\vec w\|^2}
= \|\vec v\|^2 - \frac{|\langle\vec v , \vec w\rangle|^2}{\|\vec w\|^2}
= \frac{1}{\|\vec w\|^2}\Bigl[\|\vec v\|^2\|\vec w\|^2 - |\langle\vec v , \vec w\rangle|^2\Bigr].
\]


It follows that k~v k² k~wk² − |h~v , ~wi|² ≥ 0, hence k~v k² k~wk² ≥ |h~v , ~wi|² . We obtain the desired inequality by taking the square root.
Now let us show the triangle inequality. It is essentially the same as for vectors in Rn , cf. Corollary 2.20.
\[
\|\vec v + \vec w\|^2 = \langle\vec v + \vec w , \vec v + \vec w\rangle = \langle\vec v , \vec v\rangle + \langle\vec v , \vec w\rangle + \langle\vec w , \vec v\rangle + \langle\vec w , \vec w\rangle
= \langle\vec v , \vec v\rangle + \langle\vec v , \vec w\rangle + \overline{\langle\vec v , \vec w\rangle} + \langle\vec w , \vec w\rangle
= \|\vec v\|^2 + 2\operatorname{Re}\langle\vec v , \vec w\rangle + \|\vec w\|^2
\le \|\vec v\|^2 + 2|\langle\vec v , \vec w\rangle| + \|\vec w\|^2 \le \|\vec v\|^2 + 2\|\vec v\|\,\|\vec w\| + \|\vec w\|^2 = \bigl(\|\vec v\| + \|\vec w\|\bigr)^2 .
\]

In the first inequality we used that Re a ≤ |a| for any complex number a and in the second inequality
we used (8.1). If we take the square root on both sides we get the triangle inequality.

Remark 8.7. Observe that the choice of λ in the proof of (8.1) is not as arbitrary as it may seem. Note that for this particular λ
\[
\vec v - \lambda\vec w = \vec v - \frac{\langle\vec v , \vec w\rangle}{\|\vec w\|^2}\,\vec w = \vec v - \operatorname{proj}_{\vec w}\vec v .
\]
Hence this choice of λ minimises the norm of ~v − λ~w and ~v − proj~w ~v ⊥ ~w. Therefore, by Pythagoras,
\[
\|\vec v\|^2 = \|(\vec v - \operatorname{proj}_{\vec w}\vec v) + \operatorname{proj}_{\vec w}\vec v\|^2 = \|\vec v - \operatorname{proj}_{\vec w}\vec v\|^2 + \|\operatorname{proj}_{\vec w}\vec v\|^2
\ge \|\operatorname{proj}_{\vec w}\vec v\|^2 = \Bigl\|\frac{\langle\vec v , \vec w\rangle}{\|\vec w\|^2}\,\vec w\Bigr\|^2 = \frac{|\langle\vec v , \vec w\rangle|^2}{\|\vec w\|^2}
\]
which shows that k~v k² k~wk² ≥ |h~v , ~wi|² .
Another way to see this inequality is
\[
0 \le \|\vec v - \operatorname{proj}_{\vec w}\vec v\|^2 = \langle\vec v - \operatorname{proj}_{\vec w}\vec v , \vec v - \operatorname{proj}_{\vec w}\vec v\rangle = \langle\vec v - \operatorname{proj}_{\vec w}\vec v , \vec v\rangle = \|\vec v\|^2 - \langle\operatorname{proj}_{\vec w}\vec v , \vec v\rangle
= \|\vec v\|^2 - \frac{\langle\vec v , \vec w\rangle}{\|\vec w\|^2}\langle\vec w , \vec v\rangle = \|\vec v\|^2 - \frac{|\langle\vec v , \vec w\rangle|^2}{\|\vec w\|^2}
\]
which again gives k~v k² k~wk² ≥ |h~v , ~wi|² .

Important classes of matrices


Recall that for a matrix A ∈ MR (m × n) we defined its transpose At . The important property of
At is that it is the unique matrix such that

hA~x , ~y i = h~x , At ~y i for all ~x ∈ Rn , ~y ∈ Rm .

In the complex case, we want for a given matrix A ∈ MC (m × n) a matrix A∗ such that

hA~x , ~y i = h~x , A∗ ~y i for all ~x ∈ Cn , ~y ∈ Cm .


It is easy to check that we have to take A∗ = \overline{A}^{\,t} , where \overline{A} is the matrix we obtain from A by taking the complex conjugate of every entry. Clearly, if all entries in A are real numbers, then At = A∗ .


Definition 8.8. The matrix A∗ is called the adjoint matrix of A.

Lemma 8.9. Let A ∈ M (n × n). Then det(A∗ ) = \overline{\det A} (the complex conjugate of det A).

Proof. det A∗ = det(\overline{A}^{\,t}) = det \overline{A} = \overline{\det A}. The last equality follows directly from the definition of the determinant.

A matrix with real entries is symmetric if and only if A = At . The analogue for complex matrices
are hermitian matrices.

Definition 8.10. A matrix A ∈ M (n × n) is called hermitian if A = A∗ .

Examples 8.11. • A = \begin{pmatrix} 1 & 2+3i \\ 5 & 1-7i \end{pmatrix} =⇒ A∗ = \begin{pmatrix} 1 & 5 \\ 2-3i & 1+7i \end{pmatrix}. The matrix A is not hermitian.

• A = \begin{pmatrix} 1 & 2+3i \\ 2-3i & 5 \end{pmatrix} =⇒ A∗ = \begin{pmatrix} 1 & 2+3i \\ 2-3i & 5 \end{pmatrix}. The matrix A is hermitian.

Exercise 8.12. • Show that the entries on the diagonal of a hermitian matrix must be real.

• Show that the determinant of a hermitian matrix is a real number.

Another important class of real matrices are the orthogonal matrices. Recall that a matrix Q ∈
MR (n × n) is an orthogonal matrix if and only if Qt = Q−1 . We saw that if Q is orthogonal, then
its columns (or rows) form an orthonormal basis for Rn and that | det Q| = 1, hence det Q = ±1.
The analogue in complex vector spaces are so-called unitary matrices.

Definition 8.13. A matrix Q ∈ M (n × n) is called unitary if Q∗ = Q−1 .

It is clear from the definition that a matrix is unitary if and only if its columns (or rows) form an
orthonormal basis for Cn , cf. Theorem 7.12.

Proposition 8.14. Let Q ∈ M (n × n).

(i) The following is equivalent:

(a) Q is unitary.
(b) hQ~x , Q~y i = h~x , ~y i for all ~x, ~y ∈ Cn .
(c) kQ~xk = k~xk for all ~x ∈ Cn .

(ii) If Q is unitary, then | det Q| = 1.

Proof. (i) (a) =⇒ (b): Assume that Q is a unitary matrix and let ~x, ~y ∈ Cn . Then

hQ~x , Q~y i = hQ∗ Q~x , ~y i = h~x , ~y i.


(b) =⇒ (a): Fix ~x ∈ Cn . Then we have hQ~x , Q~y i = h~x , ~y i for all ~y ∈ Cn , hence

0 = hQ~x , Q~y i − h~x , ~y i = hQ∗ Q~x , ~y i − h~x , ~y i = hQ∗ Q~x − ~x , ~y i. = h(Q∗ Q − id)~x , ~y i.

Since this is true for any ~y ∈ Cn , it follows that (Q∗ Q − id)~x = 0. Since ~x ∈ Cn was arbitrary,
we conclude that Q∗ Q − id = 0, in other words, that Q∗ Q = id.
(b) =⇒ (c): It follows from (b) that kQ~xk2 = hQ~x , Q~xi = h~x , ~xi = k~xk2 , hence kQ~xk = k~xk.
(c) =⇒ (b): Observe that the inner product of two vectors in Cn can be expressed completely
in terms of norms as follows
\[
\langle\vec a , \vec b\rangle = \frac14\Bigl[\|\vec a + \vec b\|^2 - \|\vec a - \vec b\|^2 + i\|\vec a + i\vec b\|^2 - i\|\vec a - i\vec b\|^2\Bigr]
\]
as can be easily verified. Hence we find
\[
\langle Q\vec x , Q\vec y\rangle = \frac14\Bigl[\|Q\vec x + Q\vec y\|^2 - \|Q\vec x - Q\vec y\|^2 + i\|Q\vec x + iQ\vec y\|^2 - i\|Q\vec x - iQ\vec y\|^2\Bigr]
= \frac14\Bigl[\|Q(\vec x + \vec y)\|^2 - \|Q(\vec x - \vec y)\|^2 + i\|Q(\vec x + i\vec y)\|^2 - i\|Q(\vec x - i\vec y)\|^2\Bigr]
= \frac14\Bigl[\|\vec x + \vec y\|^2 - \|\vec x - \vec y\|^2 + i\|\vec x + i\vec y\|^2 - i\|\vec x - i\vec y\|^2\Bigr]
= \langle\vec x , \vec y\rangle.
\]

(ii) Assume that Q is unitary. Then
\[
1 = \det\mathrm{id} = \det(QQ^*) = (\det Q)(\det Q^*) = (\det Q)\overline{(\det Q)} = |\det Q|^2 .
\]

Examples 8.15. • The matrix Q = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} is unitary because QQ∗ = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, hence Q∗ = Q−1 . Note that det Q = −i² = 1.

• The matrix Q = \begin{pmatrix} e^{iα} & 0 \\ 0 & e^{iβ} \end{pmatrix} is unitary because QQ∗ = \begin{pmatrix} e^{iα} & 0 \\ 0 & e^{iβ} \end{pmatrix}\begin{pmatrix} e^{-iα} & 0 \\ 0 & e^{-iβ} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, hence Q∗ = Q−1 . Note that det Q = e^{i(α+β)} , hence |det Q| = 1.
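Both defining properties are easy to verify numerically; a short sketch using the first matrix from Examples 8.15 and the hermitian matrix from Examples 8.11:

import numpy as np

Q = np.array([[0, 1j],
              [1j, 0]])
print(np.allclose(Q.conj().T @ Q, np.eye(2)))   # Q* Q = id, so Q is unitary
print(np.isclose(abs(np.linalg.det(Q)), 1.0))   # |det Q| = 1

A = np.array([[1, 2 + 3j],
              [2 - 3j, 5]])
print(np.allclose(A, A.conj().T))               # A = A*, so A is hermitian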

You should now have understood


• the vector space structure of Cn ,
• the inner product on Cn ,
• that the concept of orthogonality makes sense in Cn and works as in Rn ,
• why hermitian matrices in Cn play the role of symmetric matrices in Rn ,
• why unitary matrices in Cn play the role of orthogonal matrices in Rn ,
• etc.
You should now be able to


• calculate with vectors in Cn ,


• check if vectors in Cn are orthogonal,
• calculate the orthogonal projection of one vector onto another,
• check if a given matrix is hermitian,
• check if a given matrix is unitary,
• etc.

Exercises.
1. Let
A = \begin{pmatrix} i & -3+i & 2i \\ 2 & 4i & 2+2i \\ 4+i & 2-i & 6-i \end{pmatrix}.
Find bases for the column space (image) and the kernel of A.

2. Let
A = \begin{pmatrix} 1-i & 1+i & 8 \\ 3i & -3 & -12+12i \\ 4+i & -1+4i & 12+20i \end{pmatrix}.
Find orthonormal bases for the column space (image) and the kernel of A.

3. Let v1 = (1, 1−i, −3, i)^t , v2 = (i, 1+i, 1, 3+3i)^t and V = span{v1 , v2 }. Verify that v1 ⊥ v2 and find an orthonormal basis of V ⊥ .

4. Find a, b, c, d, e, f ∈ R such that the matrix
A = \begin{pmatrix} 1 & 3a+ib & 1+3i \\ 7a-5ib-4 & 3+ic & 5e+3if+2i \\ 1-3i & 4e-6if+2-8i & 4-ic \end{pmatrix}
is hermitian.

5. Verify that the matrix
\frac12 \begin{pmatrix} 1 & -i & -1+i \\ i & 1 & 1+i \\ 1-i & 1+i & 0 \end{pmatrix}
is unitary.

6. Consider V = C2 and T : V → V given by T \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} -\overline{z_2} \\ \overline{z_1} \end{pmatrix}.

(a) Is T a linear transformation if V is considered over K = R?
(b) Is T a linear transformation if V is considered over K = C?


7. Let ~x, ~y ∈ V where V = Rn or Cn .

(a) If V = Cn , show that k~x + ~y k² = k~xk² + 2 Reh~x , ~y i + k~y k² .
(b) If V = Rn , show that k~x + ~y k² = k~xk² + k~y k² if and only if ~x ⊥ ~y .
(c) If V = Cn , does the statement of the previous item still hold?

8. Let A, B ∈ M (n × n) be hermitian. Show that AB is hermitian if and only if AB = BA.

9. Show that dim Cn = 2n if Cn is considered as a vector space over R.

10. Let A ∈ M (n × n) be hermitian. Show that hA~x , ~xi ∈ R for every ~x ∈ Cn .

8.2 Similar matrices

Definition 8.16. Let A, B ∈ M (n × n) be (real or complex) matrices. They are called similar if
there exists an invertible matrix C such that

A = C −1 BC. (8.3)

In this case, we write A ∼ B.

Exercise 8.17. Show that A ∼ B if and only if there exists an invertible matrix C̃ such that

A = C̃ B C̃ −1 . (8.4)

Question 8.2
Assume that A and B are similar. Is the matrix C in (8.3) unique or is it possible that there are
different invertible matrices C1 6= C2 such that A = C1−1 BC1 = C2−1 BC2 ?

Remark 8.18. Similarity is an equivalence relation on the set of all square matrices. This means

that it satisfies the following three properties. Let A1 , A2 , A3 ∈ M (n × n). Then:

(i) Reflexivity: A ∼ A for every A ∈ M (n × n).


(ii) Symmetry: If A1 ∼ A2 , then also A2 ∼ A1 .
(iii) Transitivity: If A1 ∼ A2 and A2 ∼ A3 , then also A1 ∼ A3 .

Proof. (i) Reflexivity is clear. We only need to choose C = id.

(ii) Assume that A1 ∼ A2 . Then there exists an invertible matrix C such that A1 = C −1 A2 C. Multiplication from the left by C and from the right by C −1 gives CA1 C −1 = A2 . Let C̃ = C −1 . Then C̃ is invertible and C̃ −1 = C. Hence we obtain C̃ −1 A1 C̃ = A2 which shows that A2 ∼ A1 .


(iii) Transitivity: If A1 ∼ A2 and A2 ∼ A3 , then there exist invertible matrices C1 and C2 such
that A1 = C1−1 A2 C1 and A2 = C2−1 A3 C2 . It follows that

A1 = C1−1 A2 C1 = C1−1 C2−1 A3 C2 C1 = (C1 C2 )−1 A3 C1 C2 .

Setting C = C1 C2 shows that A1 = C −1 A3 C, hence A1 ∼ A3 .

We can interpret A ∼ B as follows: Let C be an invertible matrix with A = C −1 BC. Since


C is an invertible matrix, its columns ~c1 , . . . , ~cn form a basis of Rn (or Cn ) and we can view C
as the transition matrix from the canonical basis to the basis ~c1 , . . . , ~cn . Since B is the matrix
representation of the map ~x 7→ B~x with respect to the canonical basis of Rn , the equation A =
C −1 BC says that A represents the same linear map but with respect to the basis ~c1 , . . . , ~cn .
On the other hand, if A and B are matrix representations of the same linear transformation but
with respect to possibly different bases, then A = C −1 BC where C is the transition matrix between
the two bases. Hence A and B are similar.
So we showed:

Two matrices A and B ∈ M (n × n) are similar if and only if they represent the same linear
transformation. The matrix C in A = C −1 BC is the transition matrix between the two bases
used in the representations A and B.

Hence the following fact is not very surprising.

Proposition 8.19. If A, B ∈ M (n × n) are similar, then det A = det B.


Proof. Let C ∈ M (n × n) invertible such that A = C −1 BC. Then

det A = det C −1 BC = det(C −1 ) det B det C = (det C)−1 det B det C = det B.

Exercise 8.20. Show that det A = det B does not imply that A and B are similar.

Exercise 8.21. Assume that A and B are similar. Show that dim(ker A) = dim(ker B) and that
dim(Im A) = dim(Im B). Why is this no surprise?

Question 8.3
Assume that A and B are similar. What is the relation between ker A and ker B? What is the
relation between Im A and Im B?
Hint. Theorem 6.4.

A very nice class of matrices are the diagonal matrices because it is rather easy to calculate with
them. Closely related are the so-called diagonalisable matrices.

Definition 8.22. A matrix A ∈ M (n × n) is called diagonalisable if it is similar to a diagonal


matrix.


In other words, A is diagonalisable if there exists a diagonal matrix D and an invertible matrix C
with
C −1 AC = D. (8.5)

How can we decide if a matrix A is diagonalisable? We know that it is diagonalisable if and only if
it is similar to a diagonal matrix, that is, if and only if there exists a basis ~c1 , . . . , ~cn such that the
representation of A with respect to these vectors is a diagonal matrix. In this case, (8.5) is satisfied
if the columns of C are the basis vectors ~c1 , . . . , ~cn .
Denote the diagonal entries of D by d1 , . . . , dn . Then it is easy to see that D~ej = dj ~ej . This means that if we apply D to some ~ej , then the image D~ej is parallel to ~ej . Since D is nothing else than the representation of A with respect to the basis ~c1 , . . . , ~cn , we have A~cj = dj ~cj .
We can make this more formal: Take equation (8.5) and multiply both sides from the left by C so that we obtain AC = CD. Recall that for any matrix B, we have that B~ej = jth column of B. Hence we obtain
\[
AC\vec e_j = A\vec c_j , \qquad CD\vec e_j = C(d_j\vec e_j) = d_j C\vec e_j = d_j\vec c_j ,
\qquad\overset{AC = CD}{\Longrightarrow}\qquad A\vec c_j = d_j\vec c_j .
\]
In summary, we found:

A matrix A ∈ M (n × n) is diagonalisable if and only if we can find a basis ~c1 , . . . , ~cn of Rn (or Cn ) and numbers d1 , . . . , dn such that

A~cj = dj ~cj , j = 1, . . . , n.

In this case C −1 AC = D (or equivalently A = CDC −1 ) where D = diag(d1 , . . . , dn ) and C = (~c1 | · · · |~cn ).

The vectors ~cj are called eigenvectors of A and the numbers dj are called eigenvalues of A. They
will be discussed in greater detail in the next section where we will also see how we can calculate
them.
Diagonalisation of a matrix is very useful when we want to calculate powers of the matrix.

Proposition 8.23. Let A ∈ M (n × n) be a diagonalisable matrix and let C be an invertible matrix and D = diag(d1 , . . . , dn ) such that A = CDC −1 . Then Ak = C diag(d1^k , . . . , dn^k )C −1 for all k ∈ N0 . If A is invertible, then all dj are different from 0 and the formula is true for all k ∈ Z.

Proof. Let k ∈ N0 . Then
\[
A^k = \bigl(C\operatorname{diag}(d_1, \dots, d_n)C^{-1}\bigr)^k
= C\operatorname{diag}(d_1, \dots, d_n)C^{-1}\, C\operatorname{diag}(d_1, \dots, d_n)C^{-1}\cdots C\operatorname{diag}(d_1, \dots, d_n)C^{-1}
= C\operatorname{diag}(d_1, \dots, d_n)\operatorname{diag}(d_1, \dots, d_n)\cdots\operatorname{diag}(d_1, \dots, d_n)C^{-1}
= C\bigl(\operatorname{diag}(d_1, \dots, d_n)\bigr)^k C^{-1}
= C\operatorname{diag}(d_1^k, \dots, d_n^k)C^{-1}.
\]
If all dj ≠ 0, then D is invertible with inverse D−1 = (diag(d1 , . . . , dn ))−1 = diag(d1−1 , . . . , dn−1 ). Hence A is invertible with A−1 = (CDC −1 )−1 = CD−1 C −1 and we obtain for k ∈ Z with k < 0
\[
A^k = A^{-|k|} = (A^{-1})^{|k|} = \bigl(CD^{-1}C^{-1}\bigr)^{|k|} = C(D^{-1})^{|k|}C^{-1} = CD^{-|k|}C^{-1} = CD^{k}C^{-1} = C\operatorname{diag}(d_1^k, \dots, d_n^k)C^{-1}.
\]

Proposition 8.23 is useful for example when we describe dynamical systems by matrices or when
we solve linear differential equations with constant coefficients in higher dimensions.
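A small numerical sketch of Proposition 8.23 (Python/NumPy; the 2 × 2 matrix below is only an example and happens to be diagonalisable):

import numpy as np

A = np.array([[2.0, 1.0],
              [3.0, 4.0]])
d, C = np.linalg.eig(A)        # eigenvalues d and eigenvectors as columns of C
k = 5
Ak = C @ np.diag(d**k) @ np.linalg.inv(C)       # A^k = C diag(d_1^k, ..., d_n^k) C^{-1}
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))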

You should now have understood


• that similar matrices represent the same linear transformation,
• why similar matrices have the same determinant,
• why a matrix is diagonalisable if and only if Rn (or Cn ) admits a basis consisting of eigen-
vectors of A,

• etc.
You should now be able to

• etc.

Exercises.
1. Let A = \begin{pmatrix} 1&0&-1 \\ 0&1&3 \\ 2&-1&0 \end{pmatrix}, B = \begin{pmatrix} 12&110&7 \\ 5&16&-6 \\ 6&57&4 \end{pmatrix} and C = \begin{pmatrix} 1&0&-2 \\ 0&3&1 \\ 1&4&-1 \end{pmatrix}. Verify that AC = CB and conclude that A, B are similar matrices.

2. Find three matrices which are similar to the matrix A from Exercise 1. For each of them, find the determinant and the trace.

3. For each of the following statements decide whether it is true or false. If it is true, justify why; if it is false, give a counterexample.

(a) Let A, B ∈ M (n × n) be such that det A = det B; then A, B are similar matrices.
(b) Let D1 , D2 ∈ M (n × n) be diagonal matrices such that D1 ≠ D2 ; then D1 , D2 are not similar matrices.
(c) If A, B ∈ M (n × n) are row equivalent matrices, then A, B are similar matrices.

4. (a) Let λ1 , λ2 ∈ C and consider
D1 = \begin{pmatrix} λ1 & 0 \\ 0 & λ2 \end{pmatrix}, \qquad D2 = \begin{pmatrix} λ2 & 0 \\ 0 & λ1 \end{pmatrix}.
Show that D1 , D2 are similar. How can this result be generalised to diagonal matrices D1 , D2 of arbitrary size?


(b) Show that two diagonal matrices D1 , D2 ∈ M (n × n) are similar if, up to the order, they have exactly the same values on the diagonal.

5. Define the following function

tr : M (n × n) → R, tr A := a11 + a22 + · · · + ann . (∗)

Note that for a matrix A ∈ M (n × n) the number tr A is the sum of the diagonal elements of A; this number is called the trace of A, see Definition 8.34.

(a) Show that tr defined in (∗) is a linear transformation.
(b) Let A, B ∈ M (n × n). Show that tr AB = tr BA.
(c) Let A, B ∈ M (n × n) be similar matrices. Show that tr A = tr B.
(d) Let A ∈ M (n × n) be such that tr(At A) = 0. Show that A = O.

6. Find matrices A and B with tr A = tr B which are not similar.

7. Find matrices A and B with det A = det B which are not similar.

8.3 Eigenvalues and eigenvectors


Definition 8.24. Let V be a vector space and let T : V → V be a linear transformation. A number λ is called an eigenvalue of T if there exists a vector v ≠ O such that

T v = λv. (8.6)

The vector v is then called an eigenvector.

The reason why we exclude v = O in the definition above is that for every λ it is true that T O = O = λO, so (8.6) would be satisfied for any λ if we were allowed to choose v = O, in which case the definition would not make much sense.

Exercise 8.25. Show that 0 is an eigenvalue of T if and only if dim(ker T ) ≥ 1, that is, if and only
if T is not invertible. Show that v is an eigenvector with eigenvalue 0 if and only if v ∈ ker T \{O}.

Exercise 8.26. Show that all eigenvalues of a unitary matrix have norm 1.

Question 8.4
Let V, W be vector spaces and let T : V → W be a linear transformation. Why does it not make sense to speak of eigenvalues of T if V ≠ W ?

Let us list some properties of eigenvectors that are easy to see.

(i) A vector v is an eigenvector of T if and only if T v k v.

(ii) If v is an eigenvector of T with eigenvalue λ ≠ 0, then v ∈ Im T because v = \frac{1}{\lambda} T v.


(iii) If v is an eigenvector of T with eigenvalue λ, then every non-zero multiple of v is an eigenvector


with the same eigenvalue because

T (cv) = cT v = cλv = λ(cv).

(iv) We can generalise (iii) as follows: If v1 , . . . , vk are eigenvectors of T with the same eigenvalue
λ, then every non-zero linear combination is an eigenvector with the same eigenvalue because

T (α1 v1 + · · · + αk vk ) = α1 T v1 + · · · + αk T vk = α1 λv1 + · · · + αk λvk = λ(α1 v1 + · · · + αk vk ).

(iv) says that the set of all eigenvectors with the same eigenvalue is almost a subspace. The only
thing missing is the zero vector O. This motivates the following definition.

Definition 8.27. Let V be a vector space and let T : V → V be a linear map with eigenvalue λ.
Then the eigenspace of T corresponding to λ is

Eigλ (T ) := Eig(T, λ) := {v ∈ V : v is eigenvector of T with eigenvalue λ} ∪ {O}

= {v ∈ V : T v = λv}.

The dimension of Eigλ (T ) is called the geometric multiplicity of λ.

Proposition 8.28. Let T : V → V be a linear map and let λ be an eigenvalue of T . Then

Eigλ (T ) = ker(T − λ id).


Proof. Let v ∈ V . Then

v ∈ Eigλ (T ) ⇐⇒ T v = λv ⇐⇒ T v − λv = O ⇐⇒ T v − λ id v = O
⇐⇒ (T − λ id)v = O ⇐⇒ v ∈ ker(T − λ id).

Note that Proposition 8.28 shows again that Eigλ (T ) is a subspace of V . Moreover it shows that λ is an eigenvalue of T if and only if T − λ id is not invertible. For the special case λ = 0 we have that Eig0 (T ) = ker T .

Examples 8.29. (a) Let V be a vector space and let T = id. Then for every v ∈ V we have that
T v = v = 1v. Hence T has only one eigenvalue, namely λ = 1 and Eig1 (T ) = ker(T − id) =
ker 0 = V . Its geometric multiplicity is dim(Eig1 (T )) = dim V .
(b) Let V = R2 and let R be reflection on the x-axis. If ~v is an eigenvector of R, then R~v
must be parallel to ~v . This happens if and only if ~v is parallel to the x-axis in which case
R~v = ~v , or if ~v is perpendicular to the x-axis in which case R~v = −~v . All other vectors
change directions under a reflection. Hence we have the eigenvalues λ1 = 1 and λ2 = −1 and
Eig1 (R) = span{~e1 }, Eig−1 (R) = span{~e2 }. Each eigenvalue has geometric multiplicity 1.
Note that the matrix representation of R with respect to the canonical basis of R2 is AR = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} and AR ~x = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ -x_2 \end{pmatrix}. Hence AR ~x is parallel to ~x if and only if x1 = 0 (in which case ~x ∈ span{~e2 }) or x2 = 0 (in which case ~x ∈ span{~e1 }).


(c) Let V = R2 and let R be the rotation about 90◦ . Then clearly R~v ∦ ~v for any ~v ∈ R2 \ {~0}. Hence R has no eigenvalues.
Note that the matrix representation of R with respect to the canonical basis of R2 is AR = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. If we consider AR as a real matrix, then it has no eigenvalues. However, if we consider AR as a complex matrix, then it has the eigenvalues ±i as we shall see later.
1 0 0 0 0 0
050000
(d) Let A =  00 00 50 05 00 00 . As always, we identify A with the linear map R6 → R6 , ~x 7→ A~x. It
000080
000000
is not hard to see that the eigenvalues and eigenspaces of A are

λ1 = 1, Eig1 (A) = span{~e1 }, geom. multiplicity: 1,


λ2 = 5, Eig5 (A) = span{~e2 , ~e3 , ~e4 }, geom. multiplicity: 3,
λ3 = 8, Eig8 (A) = span{~e6 , ~e7 }. geom. multiplicity: 2.

FT
Show the claims above.
(e) Let V = C ∞ (R) be the space of all infinitely many times differentiable functions from R
to R and let T : V → V, T f = f 0 . Analogously to Example 6.4 we can show that T is a
linear transformation. The eigenvalues of T are those λ ∈ R such that there exists a function
f ∈ C ∞ (R) with f 0 = λf . We know that for every λ ∈ R this differential equation has a
solution and that every solution is of the form fλ (x) = c eλx for some real number c. Therefore
every λ ∈ R is an eigenvalue of T with eigenspace Eigλ (T ) = span{gλ } where gλ is the function
given by gλ (x) = eλx . In particular, the geometric multiplicity of any λ ∈ R is 1.

(f) Let V = C∞ (R) be the space of all infinitely many times differentiable functions from R to R and let T : V → V , T f = f ′′ . It is easy to see that T is a linear transformation. The eigenvalues of T are those λ ∈ R such that there exists a function f ∈ C∞ (R) with f ′′ = λf . If λ > 0, then the general solution of this differential equation is fλ (x) = a e^{\sqrt{λ}\,x} + b e^{-\sqrt{λ}\,x} . If λ < 0, the general solution is fλ (x) = a cos(\sqrt{-λ}\,x) + b sin(\sqrt{-λ}\,x). If λ = 0, the general solution is f0 (x) = ax + b. Hence every λ ∈ R is an eigenvalue of T with geometric multiplicity 2.

Write down the eigenspaces for a given λ.

Find the eigenvalues and eigenspaces if we consider the vector space of infinitely differentiable
functions from R to C.

In the examples above it was relatively easy to guess the eigenvalues. But how do we calculate the eigenvalues of, e.g., A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} or of the linear transformation T : M (n × n) → M (n × n), T (A) = A + At ?
Since any linear transformation on a finite dimensional vector space V can be “translated” to a
matrix by choosing a basis on V , it is sufficient to find eigenvalues of matrices as the next theorem
shows.


Theorem 8.30. Let V be a finite dimensional vector space with basis B = {v1 , . . . , vn } and let T : V → V be a linear transformation. If AT is the matrix representation of T with respect to the basis B, then the eigenvalues of T and AT coincide, and a vector v = c1 v1 + · · · + cn vn is an eigenvector of T with eigenvalue λ if and only if ~x = (c1 , . . . , cn )t is an eigenvector of AT with the same eigenvalue λ. In particular, the dimensions of the eigenspaces of T and of AT coincide.

Proof. Let K = R if V is a real vector space and K = C if V is a complex vector space, and let Φ : V → Kn be the linear map defined by Φ(vj ) = ~ej (j = 1, . . . , n). That means that Φ “translates” a vector v = c1 v1 + · · · + cn vn into the column vector ~x = (c1 , . . . , cn )t , cf. Section 6.4.

[Commutative diagram: T : V → V on top, Φ on both vertical sides, AT : Kn → Kn below, i.e. AT Φ = ΦT .]

Recall that T = Φ−1 AT Φ. Let λ be an eigenvalue of T with eigenvector v, that is, T v = λv. We
express v as linear combination of the basis vectors from B as v = c1 v1 + · · · + cn vn . Hence

T v = λv ⇐⇒ Φ−1 AT Φv = λv ⇐⇒ AT Φv = Φλv ⇐⇒ AT (Φv) = λ(Φv)


which is the case if and only if λ is an eigenvalue of AT and Φv ∈ Eigλ (AT ).

The proof shows that Eigλ (AT ) = Φ(Eigλ (T )) as was to be expected.

Corollary 8.31. Assume that A and B are similar matrices and let C be an invertible matrix with

A = C −1 BC. Then A and B have the same eigenvalues and for every eigenvalue λ we have that
Eigλ (B) = C Eigλ (A).

Now back to the question about how to calculate the eigenvalues and eigenvectors of a given matrix
A. Recall that λ is an eigenvalue of A if and only if ker(A − λ id) 6= {~0}, see Proposition 8.28. Since
A − λ id is a square matrix, this is the case if and only if det(A − λ id) = 0.

Definition 8.32. The function λ 7→ det(A − λ id) is called the characteristic polynomial of A. It
is usually denoted by pA .

Before we discuss the characteristic polynomial and show that it is indeed a polynomial, we will
describe how to find the eigenvalues and eigenvectors of a given square matrix A.


Procedure to find the eigenvalues and eigenvectors of a given square matrix A.


• Calculate the characteristic polynomial pA (λ) := det(A − λ id).

• Find the zeros λ1 , . . . , λk of the characteristic polynomial. They are the eigenvalues of A.
• For each eigenvalue λj calculate ker(A − λj id), for instance using Gauß-Jordan elimination. This gives the eigenspaces.

 
Example 8.33. Find the eigenvalues and eigenspaces of A = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}.

Solution. • The characteristic polynomial of A is
\[
p_A(\lambda) = \det(A - \lambda\,\mathrm{id}) = \det\begin{pmatrix} 2-\lambda & 1 \\ 3 & 4-\lambda \end{pmatrix} = (2-\lambda)(4-\lambda) - 3 = \lambda^2 - 6\lambda + 5.
\]

• Now we can either complete the square or use the solution formula for quadratic equations to

find the zeros of pA . Here we choose to complete the square.
pA (λ) = λ2 − 6λ + 5 = (λ − 3)2 − 4 = (λ − 5)(λ − 1).
Hence the eigenvalues of A are λ1 = 5 and λ2 = 1.
• Now we calculate the eigenspaces using Gauß elimination.
\[
* \quad A - 5\,\mathrm{id} = \begin{pmatrix} 2-5 & 1 \\ 3 & 4-5 \end{pmatrix} = \begin{pmatrix} -3 & 1 \\ 3 & -1 \end{pmatrix}
\xrightarrow{R_2\to R_2+R_1} \begin{pmatrix} -3 & 1 \\ 0 & 0 \end{pmatrix}
\xrightarrow{R_1\to -R_1} \begin{pmatrix} 3 & -1 \\ 0 & 0 \end{pmatrix}.
\]
Therefore, ker(A − 5 id) = span{(1, 3)t }.
\[
* \quad A - 1\,\mathrm{id} = \begin{pmatrix} 2-1 & 1 \\ 3 & 4-1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 3 & 3 \end{pmatrix}
\xrightarrow{R_2\to R_2-3R_1} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}.
\]
Therefore, ker(A − 1 id) = span{(1, −1)t }.
In summary, we have two eigenvalues,

λ1 = 5, Eig5 (A) = span{(1, 3)t }, geom. multiplicity: 1,
λ2 = 1, Eig1 (A) = span{(1, −1)t }, geom. multiplicity: 1.

If we set ~v1 = (1, 3)t and ~v2 = (1, −1)t we can check our result by calculating
\[
A\vec v_1 = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 5 \\ 15 \end{pmatrix} = 5\begin{pmatrix} 1 \\ 3 \end{pmatrix} = 5\vec v_1 ,
\qquad
A\vec v_2 = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \vec v_2 .
\]
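The result can also be cross-checked numerically; a small sketch (note that numpy.linalg.eig returns normalised eigenvectors, so they may differ from ~v1 , ~v2 by a scalar factor and may appear in a different order):

import numpy as np

A = np.array([[2.0, 1.0],
              [3.0, 4.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)      # approximately [5., 1.]
print(eigenvectors)     # columns are eigenvectors, multiples of (1,3) and (1,-1)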


Before we give more examples, we show that the characteristic polynomial is indeed a polynomial.
First we need a definition.

Definition 8.34. Let A = (aij )ni,j=1 ∈ M (n × n). The trace of A is the sum of its entries on the diagonal:

tr A := a11 + a22 + · · · + ann .

Remark. Note that Exercise 5. of Section 8.2 shows that the traces of similar matrices coincide, so if V is a finite-dimensional space and T is a linear transformation, it makes sense to define tr T as the trace of the matrix representation of T in any basis of V .

Theorem 8.35. Let A = (aij )ni,j=1 ∈ M (n × n) and let pA (λ) = det(A − λ id) be the characteristic
polynomial of A. Then the following is true.
(i) pA is a polynomial of degree n.
(ii) Let pA (λ) = cn λn + cn−1 λn−1 + · · · + c1 λ + c0 . Then we have formulas for the coefficients cn , cn−1 and c0 :

cn = (−1)n , cn−1 = (−1)n−1 tr A, c0 = det A.

Proof. By definition,
\[
p_A(\lambda) = \det(A - \lambda\,\mathrm{id}) = \det\begin{pmatrix}
a_{11}-\lambda & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22}-\lambda & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}-\lambda
\end{pmatrix}.
\]

According to Remark 4.4, the determinant is the sum of products where each product consists of a
sign and n factors chosen such that it contains one entry from each row and from each column of
A − λ id. Therefore it is clear that pA is a polynomial in λ. The term with the most λ in it is the
one of the form

(a11 − λ)(a22 − λ) · · · (ann − λ). (8.7)


All the other terms contain at most n − 2 factors with λ. To see this, assume for example that in
one of the terms the factor from the first row is not (a11 − λ) but some a1j . Then there cannot be
another factor from the jth column, in particular the factor (ajj − λ) cannot appear. So this term
has already two factors without λ, hence the degree of the term as polynomial in λ can be at most
n − 2. This shows that

pA (λ) = (a11 − λ)(a22 − λ) · · · (ann − λ) + terms of order at most n − 2. (8.8)

If we expand the first term and sort by powers of λ, we obtain

(a11 − λ)(a22 − λ) · · · (ann − λ) = (−1)n λn + (−1)n−1 λn−1 (a11 + · · · + ann )


+ terms of order at most n − 2.


Inserting this in (8.8), we find that

pA (λ) = (−1)n λn + (−1)n−1 λn−1 (a11 + · · · + ann ) + terms of order at most n − 2, (8.9)

hence deg(pA ) = n.
Formula (8.9) also shows the claim about cn and cn−1 . The formula for c0 follows from

c0 = pA (0) = det(A − 0 id) = det A.

We immediately obtain the following very important corollary.

Corollary 8.36. An n × n matrix can have at most n different eigenvalues.

Proof. Let A ∈ M (n × n). Then the eigenvalues of A are exactly the zeros of its characteristic
polynomial. Since it has degree n, it can have at most n zeros.

Now we understand why working with complex vector spaces is more suitable when we are interested
in eigenvalues. They are precisely the zeros of the characteristic polynomial. While a polynomial
may not have real zeros, it always has zeros when we allow them to be complex numbers. Indeed,
any polynomial can always be factorised over C.
Let A ∈ M (n × n) and let pA be its characteristic polynomial. Then there exist complex numbers
λ1 , . . . , λk and integers m1 , . . . , mk ≥ 1 such that

pA (λ) = (λ1 − λ)m1 · (λ2 − λ)m2 · · · (λk − λ)mk .


The numbers λ1 , . . . , λk are precisely the complex eigenvalues of A and m1 +· · ·+mk = deg pA = n.

Definition 8.37. The integer mj is called the algebraic multiplicity of the eigenvalue λj .

The following theorem is very important but we omit its proof.

Theorem 8.38. Let A ∈ M (n × n) and let λ be an eigenvalue of A. Then



geometric multiplicity of λ ≤ algebraic multiplicity of λ.


 
Example 8.39. Let
\[
A = \begin{pmatrix}
1&0&0&0&0&0 \\ 0&5&1&0&0&0 \\ 0&0&5&1&0&0 \\ 0&0&0&5&0&0 \\ 0&0&0&0&8&0 \\ 0&0&0&0&0&8
\end{pmatrix}.
\]
Since A − λ id is an upper triangular matrix, its determinant is the product of the entries on the diagonal. We obtain
\[
p_A(\lambda) = \det(A - \lambda\,\mathrm{id}) = (1-\lambda)(5-\lambda)^3(8-\lambda)^2 .
\]

Therefore the eigenvalues of A are λ1 = 1, λ2 = 5, λ3 = 8. Let us calculate the eigenspaces.


0 0 0 0 0 0 0 4 1 0 0 0
0 4 1 0 0 0 permute rows 004100
• A − 1 id =  00 0
0
4
0
1
4
0
0
0 −
0 −−−−−−−→  00 00 00 40 07 00 . This matrix is in row echelon form and
0 0 0 0 7 0 000007
0 0 0 0 0 7 000000
we can see easily that Eig1 (A) = ker(A − 1 id) = span{~e1 } which has dimension 1.
 −4 0 0 0 0 0   −4 0 0 0 0 0 
0 0 1 0 0 0 permute rows 0 0 1 0 0 0
• A − 5 id =  0
0
0
0
0
0
1
0
0
0
0 −
0 −−−−−−−→  0
0
0
0
0
0
1
0
0
3
0 .
0 This matrix is in row echelon form
0 0 0 0 3 0 0 0 0 0 0 3
0 0 0 0 0 3 0 0 0 0 0 0
and we can see easily that Eig5 (A) = ker(A − 5 id) = span{~e2 } which has dimension 1.
 −7 0 0 0 0 0 
0 −3 1 0 0 0
• A − 8 id =  0 0 −3 1 0 0 .
0 0 0 −3 0 0
This matrix is in row echelon form and we can see easily that
0 0 0 000
0 0 0 000
Eig8 (A) = ker(A − 8 id) = span{~e5 , ~e6 } which has dimension 2.
In summary, we have
λ1 = 1, Eig1 (A) = span{~e1 }, geom. multiplicity: 1, alg. multiplicity: 1,

FT
λ2 = 5, Eig5 (A) = span{~e2 }, geom. multiplicity: 1, alg. multiplicity: 3,
λ3 = 8, Eig8 (A) = span{~e6 , ~e7 }, geom. multiplicity: 2, alg. multiplicity: 2.
 
Example 8.40. Find the complex eigenvalues and eigenspaces of R = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.

Solution. From Example 8.29 we already know that R has no real eigenvalues. The characteristic polynomial of R is
\[
p_R(\lambda) = \det(R - \lambda\,\mathrm{id}) = \det\begin{pmatrix} -\lambda & -1 \\ 1 & -\lambda \end{pmatrix} = \lambda^2 + 1 = (\lambda - i)(\lambda + i).
\]
Hence the eigenvalues are λ1 = −i and λ2 = i. Let us calculate the eigenspaces.

• R − (−i) id = \begin{pmatrix} i & -1 \\ 1 & i \end{pmatrix} \xrightarrow{R_2\to R_2+iR_1} \begin{pmatrix} i & -1 \\ 0 & 0 \end{pmatrix}. Hence Eig−i (R) = ker(R + i id) = span{(1, i)t }.

• R − i id = \begin{pmatrix} -i & -1 \\ 1 & -i \end{pmatrix} \xrightarrow{R_2\to R_2-iR_1} \begin{pmatrix} -i & -1 \\ 0 & 0 \end{pmatrix}. Hence Eigi (R) = ker(R − i id) = span{(1, −i)t }.


 
Example 8.41. Find the diagonalisation of A = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}.

Solution. We need to find an invertible matrix C and a diagonal matrix D such that D = C −1 AC. By Example 8.33, A has the eigenvalues λ1 = 5 and λ2 = 1, hence A is indeed diagonalisable. We know that the diagonal entries of D are the eigenvalues of A, hence D = diag(5, 1), and the columns of C are the corresponding eigenvectors ~v1 = (1, 3)t and ~v2 = (1, −1)t , hence
\[
D = \begin{pmatrix} 5 & 0 \\ 0 & 1 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 1 \\ 3 & -1 \end{pmatrix} \qquad\text{and}\qquad D = C^{-1}AC.
\]
Alternatively, we could have chosen D̃ = diag(1, 5). Then the corresponding C̃ is C̃ = (~v2 |~v1 ) because the jth column of the invertible matrix must be an eigenvector corresponding to the jth entry of the diagonal matrix, hence
\[
\widetilde D = \begin{pmatrix} 1 & 0 \\ 0 & 5 \end{pmatrix}, \qquad \widetilde C = \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix} \qquad\text{and}\qquad \widetilde D = \widetilde C^{-1}A\widetilde C.
\]
Observe that up to ordering the diagonal elements, the matrix D is uniquely determined by A. For the matrix C however we have more choices. For instance, if we multiply each column of C by an arbitrary constant different from 0, it still works.
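A quick numerical check of this diagonalisation (sketch in Python/NumPy):

import numpy as np

A = np.array([[2.0, 1.0],
              [3.0, 4.0]])
C = np.array([[1.0, 1.0],
              [3.0, -1.0]])
D = np.linalg.inv(C) @ A @ C
print(np.round(D, 10))        # diag(5, 1)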

Example 8.42. Let V = M (2 × 2) and let T : V → V, T (M ) = M + M t . Find the eigenvalues


and eigenspaces of T .

Solution. Let M1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, M2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, M3 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, M4 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. Then B = {M1 , M2 , M3 , M4 } is a basis of M (2 × 2). The matrix representation of T with respect to it is
\[
A_T = \begin{pmatrix} 2&0&0&0 \\ 0&1&1&0 \\ 0&1&1&0 \\ 0&0&0&2 \end{pmatrix}.
\]

The characteristic polynomial is
\[
\det(A_T - \lambda\,\mathrm{id}) = \det\begin{pmatrix} 2-\lambda & 0 & 0 & 0 \\ 0 & 1-\lambda & 1 & 0 \\ 0 & 1 & 1-\lambda & 0 \\ 0 & 0 & 0 & 2-\lambda \end{pmatrix}
= (2-\lambda)\det\begin{pmatrix} 1-\lambda & 1 & 0 \\ 1 & 1-\lambda & 0 \\ 0 & 0 & 2-\lambda \end{pmatrix}
= (2-\lambda)^2\bigl((1-\lambda)^2 - 1\bigr) = \lambda(\lambda - 2)^3 .
\]
Hence there are two eigenvalues: λ1 = 0 and λ2 = 2. Let us find the eigenspaces.
     
• AT − 0 id = AT = \begin{pmatrix} 2&0&0&0 \\ 0&1&1&0 \\ 0&1&1&0 \\ 0&0&0&2 \end{pmatrix}
\xrightarrow{R_3\to R_3-R_2,\; R_1\to\frac12 R_1,\; R_4\to\frac12 R_4} \begin{pmatrix} 1&0&0&0 \\ 0&1&1&0 \\ 0&0&0&0 \\ 0&0&0&1 \end{pmatrix}
\xrightarrow{R_3\leftrightarrow R_4} \begin{pmatrix} 1&0&0&0 \\ 0&1&1&0 \\ 0&0&0&1 \\ 0&0&0&0 \end{pmatrix},

• AT − 2 id = \begin{pmatrix} 0&0&0&0 \\ 0&-1&1&0 \\ 0&1&-1&0 \\ 0&0&0&0 \end{pmatrix}
\xrightarrow{R_2\to R_2+R_3} \begin{pmatrix} 0&0&0&0 \\ 0&0&0&0 \\ 0&1&-1&0 \\ 0&0&0&0 \end{pmatrix}
\xrightarrow{R_1\leftrightarrow R_3} \begin{pmatrix} 0&1&-1&0 \\ 0&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{pmatrix}.

Hence Eig0 (AT ) = span{(0, 1, −1, 0)t } and Eig2 (AT ) = span{(1, 0, 0, 0)t , (0, 1, 1, 0)t , (0, 0, 0, 1)t }.


This means that the eigenvalues of T are 0 and 2 and that the eigenspaces are
\[
\operatorname{Eig}_0(T) = \operatorname{span}\{M_2 - M_3\} = \operatorname{span}\Bigl\{\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\Bigr\} = M_{\mathrm{asym}}(2\times 2),
\qquad
\operatorname{Eig}_2(T) = \operatorname{span}\{M_1 , M_2 + M_3 , M_4\} = \operatorname{span}\Bigl\{\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\Bigr\} = M_{\mathrm{sym}}(2\times 2).
\]

Remark. We could have calculated the eigenspaces of T directly, without calculating those of AT first, as follows.
• A matrix M belongs to Eig0 (T ) if and only if T (M ) = 0. This is the case if and only if M + M t = 0 which means that M = −M t . So Eig0 (T ) is the space of all antisymmetric 2 × 2 matrices.
• A matrix M belongs to Eig2 (T ) if and only if T (M ) = 2M . This means that M + M t = 2M . This is the case if and only if M = M t . So Eig2 (T ) is the space of all symmetric 2 × 2 matrices.

You should now have understood


• the concept of eigenvalues and eigenvectors,
• why an n × n matrix can have at most n eigenvalues,
• why the restriction of A to any of its eigenspaces acts as a multiple of the identity,
• what the characteristic polynomial of a matrix says about its eigenvalues,
• why an n × n matrix is diagonalisable if and only if Kn has a basis consisting of eigenvectors of A,
• etc.

You should now be able to


• calculate the characteristic polynomial of a square matrix A,

• calculate the eigenvalues and eigenvectors of a square matrix A,


• diagonalise a diagonalisable matrix,
• etc.

Exercises.
1. For the following matrices, find the eigenvalues and eigenvectors, the eigenspaces, an invertible matrix C and a diagonal matrix D such that C −1 AC = D.
\[
A_1 = \begin{pmatrix} -3&5&-20 \\ 2&0&8 \\ 2&1&7 \end{pmatrix}, \quad
A_2 = \begin{pmatrix} -2&0&1 \\ 0&2&0 \\ 9&0&6 \end{pmatrix}, \quad
A_3 = \begin{pmatrix} -2&0&-1 \\ 0&2&0 \\ 9&0&6 \end{pmatrix}, \quad
A_4 = \begin{pmatrix} 1&0&0 \\ 3&2&0 \\ 1&3&2 \end{pmatrix}.
\]


2. Let D : P3 → P3 be given by Dp = p′ . Find the characteristic polynomial and the eigenvalues of D.

3. Let D : P3 → P3 be given by Dp = p + xp′ + p′′ . Find the characteristic polynomial and the eigenvalues of D.

4. Let
A = \begin{pmatrix} 3 & -1 \\ -2 & 4 \end{pmatrix}.

(a) Compute (A−1 )n for any n ∈ N.
(b) What can be said about (A−1 )n as n → ∞?

5. Let T : R2 → R2 be given by:

T (~x) = reflection of ~x with respect to the line y = x.

Compute the eigenvalues of T and show that T is diagonalisable. (Hint: it suffices to choose a suitable basis of R2 .)

6. Let V = C[0, 1] and T : V → V be given by
\[
(T f )(x) = \int_0^x f (t)\, dt.
\]
Show that T has no eigenvalues.

7. Let A ∈ M (n × n). Show that:

(a) A is diagonalisable if and only if A−1 is diagonalisable.
(b) A is diagonalisable if and only if At is diagonalisable.

8. Let A ∈ M (n × n) be invertible and let λ be an eigenvalue of A. Note that λ ≠ 0. Show that 1/λ is an eigenvalue of A−1 . (This says that if A has eigenvalues µ1 , . . . , µk then the eigenvalues of A−1 are 1/µ1 , . . . , 1/µk .)

9. Let A ∈ M (n × n). Show that A and At have the same eigenvalues. (Hint: analyse the characteristic polynomial.)

10. Let ~v ∈ R3 be a nonzero vector and let T : R3 → R3 be given by T (~x) = ~v × ~x. Show that 0 is the only real eigenvalue.

11. Let W be a subspace of Rn with dim W = m and let PW be the orthogonal projection of Rn onto W . What are the eigenvalues of PW ? What is the characteristic polynomial of PW ? Is PW diagonalisable? (Hint: Start by choosing an orthonormal basis of W .)

12. Let A ∈ M (n × n) be such that Am = idn for some m ∈ N.

(a) Show that if λ ∈ C is an eigenvalue of A then λm = 1.
(b) Find four different matrices A such that A3 = id3 .

13. Let A ∈ M (n × n) be such that Am = O for some m ∈ N.

(a) Show that λ idn − A is invertible for every λ ∈ C \ {0}. (Hint: The proof is the same as in Exercise 28. of Section 3.5.)
(b) Find the characteristic polynomial of A. Observe that in Exercise 2., D4 = O.

14. Let A ∈ M (n × n) be a matrix such that A2 = A; show that A is diagonalisable. (Hint: Use Exercise 7. of Section 6.2; choose a basis of ker A and complete it to a basis of Rn by means of a basis of Im A.)

15. Let A ∈ M (n × n) be different from the zero matrix and let T : M (n × n) → M (n × n) be given by T (X) = XA. Show that A and T have the same eigenvalues.

16. Let A ∈ M (n × n) be such that all its eigenvalues are 0. Can one conclude that A = O? How does the answer change if we assume that A is diagonalisable?

FT
8.4 Properties of the eigenvalues and eigenvectors
In this section we collect important properties of eigenvectors.

Proposition 8.43. Let A ∈ M (n × n) and let λ1 , λ2 , . . . , λk be pairwise different eigenvalues of


A with eigenvectors ~v1 , ~v2 , . . . , ~vk . Then the vectors ~v1 , ~v2 , . . . , ~vk are linearly independent.

Proof. We proof the claim by induction.


RA
Basis of the induction: k = 2. Assume that λ1 6= λ2 are eigenvalues of A with eigenvectors ~v1 and
~v2 . Hence A~v1 = λ1~v1 and A~v2 = λ2~2 and ~v1 6= ~0 6= ~v2 . Let α1 , α2 numbers such that

α1~v1 + α2~v2 = ~0. (8.10)


α2
Assume that α1 6= 0. Then ~v1 = α1 ~
v2 and
 
α2 α2 α2 α2 ~0 = (λ1 − λ2 )~v1 .
λ1~v1 = A~v1 = A ~v2 = A~v2 = λ2~v2 = λ2 ~v2 = λ2~v1 =⇒
D

α1 α1 α1 α1

Since λ1 6= λ2 and ~v1 6= ~0, the last equality is false and therefore we must have α1 = 0. Then,
by (8.10), ~0 = α1~v1 + α2~v2 = α2~v2 , hence also α2 = 0 which proves that ~v1 and ~v2 are linearly
independent.
Induction step: Assume that we already know for some j < k that the vectors ~v1 , . . . , ~vj are linearly
independent. We have to show that then also the vectors ~v1 , . . . , ~vj+1 are linearly independent. To
this end, let α1 , α2 , . . . , αj+1 such that

~0 = α1~v1 + α2~v2 + · · · + αj ~vj + αj+1~vj+1 . (8.11)

On the one hand we apply A on both sides of the equation and use the fact that vectors are
eigenvectors. On the other hand we multiply both sides by λj+1 and then we compare the two

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 337

results.

apply A : ~0 = A(α1~v1 + α2~v2 + · · · + αj ~vj + αj+1~vj+1 )


= α1 A~v1 + α2 A~v2 + · · · + αj A~vj + αj+1 A~vj+1
= α1 λ1~v1 + α2 λ2~v2 + · · · + αj λj ~vj + αj+1 λj+1~vj+1 1

multiply by λj+1 : ~0 = α1 λj+1~v1 + α2 λj+1~v2 + · · · + αj λj+1~vj + αj+1 λj+1~vj+1 2

The difference 1 - 2 gives

~0 = α1 (λ1 − λj+1 )~v1 + α2 (λ1 − λj+1 )~v2 + · · · + αj (λ1 − λj+1 )~vj .

Note that the term with ~vj+1 cancelled. By the induction hypothesis, the vectors ~v1 , . . . , ~vj are
linearly independent, hence

α1 (λ1 − λj+1 ) = 0, α2 (λ1 − λj+1 ) = 0, ..., αj (λ1 − λj+1 ) = 0.

α1 = 0,

FT
We also know that λj+1 is not equal to any of the other λ` , hence it follows that

α2 = 0, ..., αj = 0.

Inserting this in (8.11) gives that also αj+1 = 0 and the proof is complete.

Note that the proposition shows again that an n×n matrix can have at most n different eigenvalues.
RA
Corollary 8.44. Let A ∈ M (n × n) and let µ1 . . . , µk be the different eigenvalues of A. If in each
Eigµj (A) we choose linearly independent vectors ~v1j , . . . , ~v`j1 , then the system of all those vectors
is linearly independent. In particular, if we choose bases in Eigµj (A), we see that the sum of
eigenspaces is a direct sum
Eigµ1 (A) ⊕ · · · ⊕ Eigµk (A)

and dim(Eigµ1 (A) ⊕ · · · ⊕ Eigµk (A)) = dim(Eigµ1 (A) + · · · + dim Eigµk (A)).
D

(m)
Proof. Let αj be numbers such that

~0 = α1(1)~v11 + · · · + α(1)~v`1 + α1(2)~v12 + · · · + α(2)~v`2 + . . . α1(k)~v1k + · · · + α(k)~v`k


`1 1 `2 2 `k k

=w
~1 + w
~2 + . . . w
~k

(j) (j)
~ j = α1 ~v1j + · · · + α`1 ~v`j1 ∈ Eigµj . Proposition 8.43 implies that w
with w ~ k = ~0.
~1 = · · · = w
(m) (m) (m)
But then also all coefficients αj = 0 because for fixed m, the vectors ~v1 , . . . , ~v`m are linearly
independent. Now all the assertions are clear.

A very special class of matrices are the diagonal matrices.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
338 8.4. Properties of the eigenvalues and eigenvectors

 
d1
Theorem 8.45. (i) Let D = diag(d1 , . . . , dn ) = 
 0  be a diagonal matrix. Then


0 dn

the eigenvalues of D are precisely the numbers d1 , . . . , dn and the geometric multiplicity of
each eigenvalue is equal to its algebraic multiplicity.
   
d1 ∗  d
(ii) Let B = 

 and C = 
 1 0   be upper and lower triangular matrices
 ∗

0 dn

dn

respectively. Then the eigenvalues of D are precisely the numbers d1 , . . . , dn and the algebraic
multiplicity of an eigenvalue is equal to the number of times it appears on the diagonal. In
general, nothing can be said about the geometric multiplicities.

Proof. (i) Since the determinant of a diagonal matrix is the product of its diagonal elements, we
obtain for the characteristic polynomial of D

FT
 
d 1 − λ
 0 
pD (λ) = det(D − λ) = det   = (d1 − λ) · · · · · (dn − λ).
 

 0 

dn − λ

Since the zeros of the characteristic polynomial are the eigenvalues of D, we showed that the
RA
numbers on the diagonal of D are precisely its eigenvalues. The algebraic multiplicity of an
eigenvalue µ is equal to the number of times it is repeated on the diagonal of D. The algebraic
multiplicity of µ is equal to dim(ker(D − µ id). Note that D − µ id is a diagonal matrix and
the jth entry on its diagonal is 0 if and only if µ = dj . it is not hard to see that the dimension
of the kernel of a diagonal matrix is equal to the number of zeros on its diagonal. So, in
summary we have for an eigenvalue µ of A:

algebraic multiplicity of µ = number of times µ appears in the diagonal of D


= geometric multiplicity of µ.
D

(ii) Since the determinant of a triangular matrix is the product of its diagonal elements, we obtain
for the characteristic polynomial of B
 
d − λ
 1
∗ 
pB (λ) = det(B − λ) = det   = (d1 − λ) · · · · · (dn − λ).
 

 0 

dn − λ

and analogously for C. The reasoning for the algebraic multiplicities of the eigenvalues is as
in the case of a diagonal matrix. However, in general the algebraic and geometric multiplicity
of an eigenvalue of a triangular matrix may be different as Example 8.39 shows.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 339

 
5 0 0 0 0 0
0 1 0 0 0 0
 
0 0 5 0 0 0
Example 8.46. Let D =   . Then pD (λ) = (1 − λ)(5 − λ)3 (5 − λ)2 .
0 0 0 8 0 0

0 0 0 0 8 0
0 0 0 0 0 5
The eigenvalues are 1 (with geom. mult = alg. mult = 1), 5 (with geom. mult = alg. mult = 3)
and 8 (with geom. mult = alg. mult = 2),

Theorem 8.47. If A and B are similar matrices, then they have the same characteristic polyno-
mial. In particular, they have the same eigenvalues with the same algebraic multiplicities. Moreover,
also the geometric multiplicities are equal.

Proof. Let C be an invertible matrix such that A = C −1 BC. Hence

A − λ id = C −1 BC − λ id = C −1 BC − λC −1 C = C −1 (B − λ id)C

FT
and we obtain for the characteristic polynomial of A

pA (λ) = det(A − λ id) = det(C −1 (B − λ id)C) = det(C −1 ) det(B − λ id) det C = det(B − λ id)
= pB (λ).

This shows that A and B have the same eigenvalues and that their algebraic multiplicities coincide.

Now let µ be an eigenvalue. Then


RA
Eigµ (A) = ker(A − µ id) = ker(C −1 (B − µ id)C) = ker((B − µ id)C) = C −1 ker(B − µ id)
= C −1 Eigµ (B)

where in the second to last step we used that C −1 is invertible. The invertibility of C −1 also shows
that dim(C −1 Eigµ (B)) = dim(Eigµ (B), hence dim Eigµ (A) = dim(Eigµ (B), which proves that the
geometric multiplicity of µ as eigenvalue of A is equal to that of B.
D

Next we prove a very important theorem about the diagonalisation of matrices.

Theorem 8.48. Let A ∈ MK (n × n) with K = R or K = C. Then the following is equivalent.

(i) A is diagonalisable, that means that there exists a diagonal matrix D and an invertible matrix
C such that C −1 AC = D.
(ii) For every eigenvalue of A, its geometric and algebraic multiplicities are equal.
(iii) A has a set of n linearly independent eigenvectors.
(iv) Kn has a basis consisting of eigenvectors of A.

Proof. Let µ1 , . . . , µk be the different eigenvalues of A and let us denote the algebraic multiplicities
of µj by mj (A) and mj (D) and the geometric multiplicities by nj (A) and nj (D).

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
340 8.4. Properties of the eigenvalues and eigenvectors

(i) =⇒ (ii): By assumption A and D are similar so they have the same eigenvalues by Theorem 8.47
and
mj (A) = mj (D) and nj (A) = nj (D) for all j = 1, . . . , k,
and Theorem 8.45 shows that

mj (D) = nj (D) for all j = 1, . . . , k,

because D is a diagonal matrix. Hence we conclude that also

mj (A) = nj (A) for all j = 1, . . . , k.

(ii) =⇒ (iii): Recall that the geometric multiplicities nj (A) are the dimensions of the kernel of
A − µj id. So in each ker(A − µj ) we may choose a basis consisting of nj (A) vectors. In total we
have n1 (A)+· · ·+nk (A) = m1 (A)+· · ·+mk (A) = n such vectors and they are linearly independent
by Corollary 8.44.
(iii) =⇒ (iv): This is clear because dim Kn = n.

FT
(iv) =⇒ (i): Let B = {~c1 , . . . , ~cn } be a basis of Kn consisting of eigenvectors of A and let d1 , . . . , dn
be the corresponding eigenvalues, that is, A~cj = dj ~cj . Note that the dj are not necessarily pairwise
different. Then the matrix C = (~c1 | · · · |~cn ) is invertible and C −1 AC is the representation of A in
the basis B, hence C −1 AC = diag(d1 , . . . , dn ). In more detail, using that ~cj = C~ej and C −1~cj = ~ej ,

jth column of C −1 AC = C −1 AC~ej = C −1 A~cj = C −1 (dj ~cj ) = dj C −1~cj = dj~ej ,

hence D = (d1~e1 | · · · |dn~en ) = diag(d1 , . . . , dn ).


RA
An immediate consequence of Theorem 8.48 is the following.

Corollary 8.49. If a matrix A ∈ M (n × n) has n different eigenvalues, then it is diagonalisable.

Proof. If A has n different eigenvalues λ1 , . . . , λn , then for each of them the algebraic multiplicity
is equal to 1. Moreover,
D

1 ≤ geometric multiplicity ≤ algebraic multiplicity = 1

for each eigenvalue. Hence the algebraic and the geometric multiplicity for each eigenvalue are equal
(both are equal to 1) and the claim follows from Theorem 8.48.

Corollary 8.50. If the matrix A ∈ M (n × n) is diagonalisable, then its determinant is equal to the
product of its eigenvalues.

Proof. Let λ1 , . . . , λn be the (not necessarily different) eigenvalues of A and let C be an invertible
matrix such that C −1 AC = D := diag(λ1 , . . . , λn ). Then
n
Y
det A = det(CDC −1 ) = (det C)(det D)(det C −1 ) = det D = λj .
j=1

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 341

Theorem 8.51. Let A ∈ M (n × n) and let µ1 , . . . , µk be its different eigenvalues. Then A is


diagonalisable if and only if
Kn = Eigµ1 (A) ⊕ · · · ⊕ Eigµk (A). (8.12)
where K = R or K = C depending on whether A is acting on R or on C.

Proof. Let us denote the algebraic multiplicity of each µj by mj (A) and its geometric multiplicity
by nj (A).
If A is diagonalisable, then the geometric and algebraic multiplicities are equal for each eigenvalue.
Hence
dim(Eigµ1 (A) ⊕ · · · ⊕ Eigµk (A)) = dim(Eigµ1 (A)) + · · · + dim(Eigµk (A))
= n1 (A) + · · · + nk (A) = m1 (A) + · · · + mk (A) = n.
Since every n-dimensional subspace of Kn is equal to Kn , (8.12) is proved.
Now assume that (8.12) is true. We have to show that A is diagonalisable. In each Eigµj we choose
a basis Bj . By (8.12) the collection of all those basis vectors form a basis of Kn . Therefore we found

FT
a basis of Kn consisting of eigenvectors of A. Hence A is diagonalisable by Theorem 8.48.
The above theorem says that A is diagonalisable if and only if there are enough eigenvectors of
A to span Kn . This is the case if and only if Kn splits in the direct sum of subspaces on each of
which A acts simply by multiplying each vector with the number (namely with the corresponding
eigenvalue).
To practice a bit the notions of algebraic and geometric multiplicities, finish this section with an
alternative proof of Theorem 8.48.
RA
Alternative proof of Theorem 8.48. Let us prove (i) =⇒ (iv) =⇒ (iii) =⇒ (ii) =⇒ (i).
(i) =⇒ (iv): This was already discussed after Definition 8.22. Let D = diag(d1 , . . . , dn ) and let ~c1 , . . . , ~cn
be the columns of C. Clearly they form a basis of Kn because C in invertible. By assumption we know
that AC = CD. Hence we have that
A~cj = jth column of AC = jth column of CD = dj · (jth column of C) = dj ~cj .
Therefore the vectors ~c1 , . . . , ~cn are linearly independent and are all eigenvalues of A and hence they are
even a basis of Kn .
D

(iv) =⇒ (iii): Clear.


(iii) =⇒ (ii): Suppose that ~v1 . . . , ~vn is a basis of K n consisting of eigenvectors of A. Clearly, each of them
must belong to some eigenspace of A. Let `j be the number of those vectors which belong to Eigµj (A).
Hence it follows that `j ≤ nj (A) because the vectors are linearly independent and nj (A) = dim Eigµj (A).
So by Theorem 8.38 we have `j ≤ nj (A) ≤ mj (A) where mj (A) is the algebraic multiplicity of µj . Summing
over all eigenvectors, we obtain
n = `1 + · · · + `k ≤ n1 (A) + · · · + nk (A) ≤ m1 (A) + · · · + mk (A) = n
The first equality holds because the vectors are a basis of Kn and the last equality holds by definition
of the algebraic multiplicity. Hence all the ≤ signs are in reality equalities and n1 (A) + · · · + nk (A) =
m1 (A) + · · · + mk (A). Therefore
 
0 = n1 (A) + · · · + nk (A) − m1 (A) + · · · + mk (A)
   
= n1 (A) − m1 (A) + · · · + nk (A) − mk (A) .

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
342 8.4. Properties of the eigenvalues and eigenvectors

Since nj (A)−mj (A) ≤ 0 for all j = 1, . . . , k, each of the terms must be zero which shows that nj (A)−mj (A)
as desired.
(ii) =⇒ (i): For each j = 1, . . . , k let us choose a basis Bj of Eigµj (A). Observe that each basis has
nj (A) vectors. By Corollary 8.44, the system consisting of all these basis vectors is linearly independent.
Moreover, the total number of these vectors is n1 (A) + · · · + nk (A) = m1 (A) + · · · + mk (A) = n where we
used the assumption that the algebraic and geometric multiplicities are equal for each eigenvalue. Hence
the collection of all those vectors form a basis of Kn . That A is diagonalisable follows now as in the proof
of (iv) =⇒ (i):

You should now have understood


• why the eigenvectors of different eigenvalues of a matrix A are linearly independent,
• more generally, why the sum of the eigenspaces is even a direct sum,
• why a matrix is diagonalisable if and only if the vector space has a basis consisting of
eigenvectors of A,
• algebraic and geometric multiplicities,

FT
• etc.
You should now be able to

• verify if a given matrix is diagonalisable,


• if it is diagonalisable, find its diagonalisation,
• etc.
RA
Ejercicios.
1. Para cada una de las siguientes matrices, determine si son diagonalizables. Si lo es, encuentre
una D que es semejante. D = CAC −1 .
   
    −1 4 2 −7 3 2 5 1
3 1 −1 3 1 0  0 5 −3 6  2 0 2 6
A1 =  1 3 −1 , A2 = 0 3 1 , A3 =   0 0 −5 1  , A4 = 5 2 7
  .
−1
−1 −1 5 0 0 3
0 0 0 11 1 6 −1 3
D

2. Sea T : M (2 × 2) → M (2 × 2) dada por

1
T (A) = (A − At ).
2
Muestre que T es diagonalizable

3. (a) Sea D : Pn → Pn dada por Dp = p0 . ¿Es D diagonalizable?


(b) Sea D : Pn → Pn dada por Dp = p + xp0 + p00 . ¿Es D diagonalizable?

4. Sean A, B ∈ M (n × n).

(a) Si A, B son diagonalizables, ¿se sigue que A + B es diagonalizable?

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 343

(b) Si AB es diagonalizable, ¿se sigue que A o B son diagonalizables?


(c) Si A, B son diagonalizables, ¿se sigue que AB es diagonalizable?
5. Calcule A50 para  
−29 20 −4
A= 0 1 0 .
210 −140 29

6. ¿Para cuáles valores de k, t ∈ R, la matriz


 
i −1 0
0 3k + 2t k − 4t − 5
0 0 5k − 8t

es diagonalizable?
7. Sea A ∈ M (n × n) diagonalizable y sean d1 , d2 , . . . , dk todos sus valores propios distintos.
Muestre que (A−d1 idn )(A−d2 idn ) . . . (A−dk idn ) = On×n . ¿Sigue siendo cierta la afirmación

FT
si no suponemos que A es diagonalizable?
8. Sea A ∈ M (n × n) triangular superior ó inferior. ¿Cuál es el polinomio caracterı́stico de A?
¿Puede dar condiciones de cuando A es diagonalizable?
9. Sean A, B, C ∈ M (2 × 2) y sea  
A C
V =
O2×2 B
RA
(a) Muestre que el polinomio caracterı́stico de V es la multiplicación de los polinomios
caracterı́sticos de A y B.
(b) Si C = O2×2 , muestre que V es diagonalizable si y solo si A, B son diagonalizables.
(c) ¿Es cierta la conclusión del inciso anterior si no suponemos que C = O2×2 ?

8.5 Symmetric and Hermitian matrices


In this section we will deal with symmetric and hermitian matrices. The main results are that all
D

eigenvalues of a hermitian matrix are real, that eigenvectors corresponding to different eigenvalues
are orthogonal and that every hermitian matrix is diagonalisable. Note that symmetric matrices
are a special case of hermitian ones, so whenever we show something about hermitian matrices, the
same is true for symmetric matrices.

Theorem 8.52. Let A be a hermitian matrix. Then every eigenvalue λ of A is real.

Proof. Let A be hermitian, that is, A∗ = A and let λ be an eigenvalue of A with eigenvector ~v .
Then ~v 6= ~0 and A~v = λ~v . We have to show that λ = λ. Therefore

λk~v k2 = λh~v , ~v i = hλ~v , ~v i = hA~v , ~v i = h~v , A∗~v i = h~v , A~v i = h~v , λ~v i = λh~v , ~v i = λk~v k2 .

Since ~v 6= ~0, it follows that λ = λ which means that the imaginary part of λ is 0, hence λ ∈ R.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
344 8.5. Symmetric and Hermitian matrices

Theorem 8.53. Let A be a hermitian matrix and let λ1 , λ2 be two different eigenvalues of A with
eigenvectors ~v1 and ~v2 , that is A~v1 = λ1~v1 and A~v2 = λ2~v2 . Then ~v1 ⊥ ~v2 .

Proof. The prove is similar to the proof of Theorem 8.52. We have to show that h~v1 , ~v2 i = 0. Note
that by Theorem 8.52, the eigenvalues λ1 , λ2 are real.
λ1 h~v1 , ~v2 i = hλ1~v1 , ~v2 i = hA~v1 , ~v2 i = h~v1 , A∗~v2 i = h~v1 , A~v2 i = h~v1 , λ2~v2 i = λ2 h~v1 , ~v2 i = λ2 h~v1 , ~v2 i.
Since λ1 6= λ2 by assumption it follows that h~v1 , ~v2 i = 0.

Corollary 8.54. Let A be a hermitian matrix and let λ1 , λ2 be two different eigenvalues of A.
Then Eigλ1 (A) ⊥ Eigλ2 (A).

The next theorem is one of the most important theorems in Linear Algebra.

Theorem 8.55. Every hermitian matrix is diagonalisable.

FT
Theorem 8.55*. Every symmetric matrix is diagonalisable.

We postpone the proof of these theorems to end of this section.


As a corollary we obtain the following very important theorem.

Theorem 8.57. A matrix is hermitian if and only if it is unitarily diagonalisable, that is, there
exists a unitary matrix Q and a diagonal matrix D such that D = Q−1 AQ = Q∗ AQ.

The formulation of the above theorem for real matrices is:


RA
Theorem 8.57*. A matrix is symmetric if and only if it is orthogonally diagonalisable, that is,
there exists an orthogonal matrix Q and a diagonal matrix D such that D = Q−1 AQ = Qt AQ.

In both cases, D = diag(λ1 , . . . , λn ) where the λ1 , . . . , λn are the eigenvalues of A and the columns
of Q are the corresponding eigenvectors.
Proof. Let A be a hermitian matrix. From Theorem 8.55 we know that A is diagonalisable. Hence
D

Cn = Eigµ1 (A) ⊕ . . . Eigµk (A)


where µ1 , . . . , µk are the different eigenvalues of A. In each eigenspace Eigµj (A) we can choose an
orthonormal basis Bj consisting of nj vectors ~v1j , . . . , ~vnj j where nj is the geometric multiplicity of
µj . We know that the eigenspaces are pairwise orthogonal by Corollary 8.54. Hence the system of
all these vectors form an orthonormal basis B of Cn . Therefore the matrix Q whose columns are
the vectors of this basis is a unitary matrix and Q−1 AQ = D.
Now assume that A is unitarily diagonalisable. We have to show that A is hermitian. Let Q be a
unitary matrix and let D be a diagonal matrix such that D = Q∗ AQ. Then A = QDQ∗ and
A∗ = (QDQ∗ )∗ = (Q∗ )∗ D∗ Q∗ = QDQ∗ = A
where we used that D∗ = D because D is a diagonal matrix whose entries on the diagonal are real
numbers because they are the eigenvalues of A.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 345

The proof of Theorem 8.57* is the same.

Corollary 8.59. If a matrix A is hermitian (or symmetric), then its determinant is the product
of its eigenvalues.

Proof. This follows from Theorem 8.55 (or Theorem 8.55*) and Corollary 8.50.
Proof of Theorem 8.55. Let A ∈ MC (n × n) be a hermitian matrix and let µ1 , . . . , µk be the
different eigenvalues of A with geometric multiplicities n1 , . . . , nk . By Theorem 8.51 it suffices to
show that
Cn = Eigµ1 (A) ⊕ · · · ⊕ Eigµk (A).
Let us denote the right hand side by U , that is, U := Eigµ1 (A) ⊕ · · · ⊕ Eigµk (A). Then we have
to show that U ⊥ = {~0}. For the sake of a contradiction, assume that this is not true and let
(j) (j)
` = dim(U ⊥ ). In each Eigµj (A) we choose an orthogonal basis ~v1 , . . . , ~vnj and we choose and
orthogonal basis w ~ ` in U ⊥ . The set of all these vectors is an orthonormal basis B of
~ 1, . . . , w
C because all the eigenspaces are orthogonal to each other and to U ⊥ . Let Q be the matrix
n

FT
(1) (k)
whose columns are these vectors: Q = (~v1 | · · · |~vnk |w ~ 1 | · · · |w
~ ` ). Then Q is a unitary matrix
because its columns are an orthogonal basis of C . Next let us define B = Q−1 AQ. Then B is
n

symmetric because B ∗ = (Q−1 AQ)∗ = Q∗ A∗ (Q−1 )∗ = Q−1 AQ = B where we used that A = A∗


by assumption and that Q−1 = Q∗ because it is a unitary matrix. On the other hand, B being the
matrix representation of A with respect to the basis B, is of the form
 
µ1
 
RA
µ1
 
 
µ2
 
 
B= .
 
 
 

 µk 

 
 
C

All the empty spaces are 0 and C is an ` × ` matrix (it is the matrix representation of the restriction
D

of A to U ⊥ with respect to the basis w


~ 1, . . . , w
~ ` ). The characteristic polynomial of C has at least
one zero, hence C has at least one eigenvalue λ. Clearly, λ is then also an eigenvalue of B and if
~y ∈ C` is an eigenvector of C, we obtain an eigenvector of B with the same eigenvalue by putting
0s as its first n − ` components and ~y as its last ` components. Since A and B have the same
eigenvalues, λ must be equal to one of the eigenvectors µ1 , . . . , µk , say λ = µj0 . But then the
dimension of the eigenspace Eigµj0 (B) is strictly larger than the dimension of Eigµj0 (A) which
contradicts Theorem 8.47. Therefore U ⊥ = {~0} and the theorem is proved.
Proof of Theorem 8.55*. The proof is essentially the same as that for Theorem 8.55. We only
have to note that, using the notation of the proof above, the matrix C is symmetric (because B
is symmetric). If we view C as a complex matrix, it has at least one eigenvalue λ because in C
its characteristic polynomial has at least one complex zero. However, since C is hermitian, all its
eigenvalues are real, hence λ is real, so it is an eigenvalue of C if we view it as a real matrix.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
346 8.5. Symmetric and Hermitian matrices

You should now have understood


• why hermitian and symmetric matrices have real eigenvalues,
• why eigenvectors for different eigenvalues of a hermitian matrix are perpendicular to each
other,
• why a hermitian/symmetric matrix is orthogonally diagonalisable,
• that up to a rotation and maybe reflection, the eigenspaces of a hermitian matrix are gen-
erated by the coordinate axes,
• etc.
You should now be able to
• find eigenvalues and eigenvectors of hermitian/symmetric matrices,
• diagonalise symmetric matrices,
• write Cn (or Rn ) as direct sum of the eigenspaces of a given hermitian (or symmetric)

FT
matrix,
• etc.

Ejercicios.
1. Diagonalice ortogonalmente las siguientes matrices:
 
    1 −3 0 0
RA
  6 2 4 2 −1 0
1 −2 −1
−3 1 0 0
(a) , (b) 2 3 2 , (c) 3 −1 , (d)  .
−2 4  0 0 3 0
4 2 9 0 −1 2
0 0 0 3

2. De una matriz simétrica A ∈ M (3 × 3) se sabe que el polinomio caracterı́stico es p(λ) =


λ3 − 5λ2 + 8λ − 4.

(a) Determine los valores propios de A y las multiplicidades geométricas y algebraicas.


D

(b) Se sabe que ker(A − id) = gen{3~e2 − 4~e3 }. Encuentre los espacios propios de A.
(c) Encuentre una matriz A que cumple lo arriba.

3. Diagonalice  
5 3(1 + i)
A= .
3(1 − i) 2

4. (a) Dé una matriz simétrica tal que su kernel es el plano x − 3y + 2z = 0. ¿Cuál debe ser la
imagen de la matriz que escogió?
(b) Dé una matriz simétrica que tenga por imagen el plano 5x − y + z = 0. ¿Cuál es su
kernel?
(c) Caracterice todas las matrices M (3 × 3) que tienen un único valor propio.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 347

5. Obtenga una base ortogonal de Rn de vectores propios de T donde T es la transformación


lineal dada en el Ejercicio 3. en Sección 7.4.

6. Sean A, B ∈ Msym (2 × 2) y C ∈ M (2 × 2) todas matrices con entradas reales. Considere

A Ct
 
V = .
C B

Muestre que V ∈ Msym (2 × 2) y que además V es diagonalizable.

7. Sea A, B ∈ M (n × n) con entradas complejas. Muestre que:

(a) AA∗ y A∗ A son diagonalizables.


(b) Si A, B son hermitianas y AB = BA entonces AB es diagonalizable.

A O
 
8. Sean A, B ∈ Msym (2 × 2) y V =
O B donde O es la matriz cero. Muestre que V ∈
Msym (4 × 4) y que además, los valores propios de V son los valores propios de A junto con

FT
los valores propios de B.

8.6 Application: Conic Sections


In this section we will study quadratic equations in x and y. Recall that we know how to deal with
linear equations in two variables. The most general form is
RA
ax + by = d (8.13)

with constants a, b, d. A solution is a tuple (x, y) which satisfies (8.13). We can view the set of all
solutions as a subset in the plane R2 . Since (8.13) is a linear equation (a 1 × 2 system of linear
equations), we know that we have the following possibilities for the solution set:

(a) a line if a 6= 0 or b 6= 0,
(b) the plane R2 if a = 0, b = 0 and d = 0,
(c) the empty set (no solution) if a = 0, b = 0 and d 6= 0,
D

Now we will consider the quadratic equation

ax2 + bxy + cy 2 = d (8.14)

with constants a, b, c, d.

In the following we will always assume that d ≥ 0. This is no loss of generality because if d < 0,
we can multiply both sides of (8.14) by −1 and replace a, b, c by −a, −b, −c. The set of solutions
does not change.

Again, we want to identify the solutions with subsets in R2 and we want to find out what type of
figures they are. The equation (8.14) is not linear, so we have to see what relation (8.14) has with

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
348 8.6. Application: Conic Sections

what we studied so far. It turns out that the left hand side of (8.14) can be written as an inner
product       
x x a b/2
G , with G = . (8.15)
y y b/2 c

Question 8.5
The matrix G from (8.15) is not the only possible choice. Find all possible matrices G such that
hG( xy ) , ( xy )i = ax2 + bxy + cy 2 .

The matrix G is very convenient because it is symmetric. This means that up to an orthogonal
transformation, it is a diagonal matrix. So once we know how to solve the problem when G is
diagonal, then we know it for the general case since the solutions differ only by a rotation and
maybe a reflection. This motivates us to first study the case when G is diagonal, that is, when
b = 0.

FT
Quadratic equation without mixed term (b = 0).

If b = 0, then (8.14) becomes


ax2 + cy 2 = d (8.16)
with constants d ≥ 0 and a, c ∈ R.

Remark 8.60. The solution set is symmetric with respect to the x-axis and the y-axis because if
some (x, y) is a solution of (8.16), then so are (−x, y) and (x, −y).
RA
Let us define
( (
p p 2 a if a ≥ 0, 2 c if c ≥ 0,
α := |a|, γ := |c|, hence α = and γ =
−a if a < 0 −c if c < 0.

We have to distinguish several cases according to whether the coefficients a, c are positive, negative
or 0.
D

Case 1.1: a > 0 and c > 0. In this case, the equation (8.16) becomes

α2 x2 + γ 2 y 2 = d. (8.16.1.1)

(i) If d > 0, then (8.16.1.1) is the equation of an ellipse whose axes are parallel to the x and
√ p
the y-axis. The intersection with the x-axis is at ± αd = ± d/a and the intersection with
√ p
the y-axis is at ± γd = ± d/c.

(ii) If d = 0, then the only solution of (8.16.1.1) is the point (0, 0) .



Remark 8.61. Note that the length of the semiaxes of the ellipse is proportional to d. Hence
as d decreases, the ellipse from (i) becomes smaller and for d = 0 it degenerates to the point (0, 0)
from (ii).

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 349

y y
p
d/c
p
d/a
x x

Figure 8.1: Solution of (8.16) for det G > 0 . If a > 0, b > 0, then the solution is an ellipse (if d > 0)
or the point (0, 0) (if d = 0). The right picture shows ellipses with a and c fixed but decreasing d (from
red to blue). If a < 0, b < 0, d > 0, then there is no solution.

Case 1.2: a < 0 and c < 0. In this case, the equation (8.16) becomes

− α2 x2 − γ 2 y 2 = d. (8.16.1.2)

FT
(i) If d > 0, then (8.16.1.2) has no solution because the left hand side is always less or equal to
0 while the right hand side is strictly positive.
(ii) If d = 0, then the only solution of (8.16.1.2) is the point (0, 0) .

Case 2.1: a > 0 and c < 0. In this case, the equation (8.16) becomes

α2 x2 − γ 2 y 2 = d. (8.16.2.1)
RA
(i) If d > 0, then (8.16.2.1) is the equation of a hyperbola . If x = 0, the equation has no

solution. Indeed, we need |x| ≥ αr such that the equation has a solution. Therefore the
hyperpola

does

not intersect the y-axes (in fact, the hyperbola cannot pass through the strip
d d
− α < y < α ).
• Intersection with the√coordinate
p axes: No intersection with the y-axis. Intersection with
the x-axis at x = ± αd = ± d/a.
• Asymptotics: For |x| → ∞ and |y| → ∞, the hyperbola has the asymptotes
D

α
y = ± x.
γ
Note that the asymptote does not depend on d.
Proof. It follows from (8.16.2.1) that |x| → ∞ if and only if |y| → ∞ because otherwise
the difference α2 x2 − γ 2 y 2 cannot be constant. Dividing (8.16.2.1) by x2 and by γ 2 and
rearranging leads to
y2 α2 d x large α2 α
2
= 2 − 2 2 ≈ , hence y≈± x.
x γ γ x γ2 γ

(ii) If d = 0, then (8.16.2.1), becomes α2 x2 +γ 2 y 2 = 0, and its solution is the pair of lines y = ± αγ x .

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
350 8.6. Application: Conic Sections

y y

p
d/a
x
x

FT
Figure 8.2: Solution of (8.16) for det G < 0 . The solutions are hyperbola (if d > 0) or a set of two
intersecting lines. The left picture shows a solution for a > 0, c < 0 and d > 0. The right picture
shows hyperbolas for fixed a and c but decreasing d. The blue pair of lines passing through the origin
correspond to the case d = 0.
RA
Remark√8.62. Note that the intersection point of the hyperbola with the x-axis is propor-
tional to d. Hence as d decreases, the intersection points moves closer to the 0 and the turn
becomes sharper. If d = 0, the intersection point reaches 0 and the hyperbola become two
angles which look like two crossing lines.

Case 2.2: a < 0 and c > 0. In this case, the equation (8.16) becomes

− α2 x2 + γ 2 y 2 = d. (8.16.2.2)
D

This case is the same as Case 2.1, only with the roles of x and y interchanged. So we find:

(i) If d > 0, then (8.16.2.1) is the equation of a hyperbola .

• Intersection with the√ coordinate


p axes: No intersection with the x-axis. Intersection with
the y-axis at y = ± γd = ± d/c.

• Asymptotics: For |x| → ∞ and |y| → ∞, the hyperbola has the asymptotes y = ± αγ x.

(ii) If d = 0, then (8.16.2.1), becomes α2 x2 +γ 2 y 2 = 0, and its solution is the pair of lines y = ± αγ x .

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 351

Case 3.1: a > 0 and c = 0. Then (8.16) Case 3.2: a = 0 and c > 0. Then (8.16)
becomes α2 x2 = d. becomes γ 2 y 2 = d.
• If d > 0, the solutions are the • If d > 0, the solutions are the
√ √
two parallel lines x = ± αd . two parallel lines y = ± γ
d
.
• If d = 0, the solution is the line x = 0 . • If d = 0, the solution is the line y = 0 .
Case 3.3: a < 0 and c = 0. Then (8.16) Case 3.4: a = 0 and c < 0. Then (8.16)
becomes −α2 x2 = d. becomes −γ 2 x2 = d.

• If d > 0, there is no solution . • If d > 0, there is no solution .


• If d = 0, the solution is the line x = 0 . • If d = 0, the solution is the line y = 0 .

Case 3.5: a = 0 and c = 0. Then (8.16) becomes 0 = d.

• If d > 0, there is no solution .


• If d = 0, the solution is R2 .

in all remaining cases det G = 0.

Quadratic equation with mixed term.


FT
Note that in the Cases 1.1 and 1.2, det G = ac > 0, in the Cases 2.1 and 2.2, det G = ac < 0 and

Now we want to solve (8.14) without the assumption that b = 0. Let G =



a b/2
b/2 c

and
RA
 
x
~x = . Then (8.14) is equivalent to
y

hG~x , ~xi = d. (8.17)

If G was diagonal, then we immediately could give the solution. We know that G is symmetric,
hence we know that G can be orthogonally diagonalized. In other words, there exists an orthogonal
basis of R2 with respect to which G has a representation as a diagonal matrix. We can even choose
this basis such that they are a rotation of the canonical basis ~e1 and ~e2 (without an additional
D

reflection).
Let λ1 , λ2 be eigenvalues of G and let D = diag(λ1 , λ2 ). We choose an orthogonal matrix Q such
that
D = Q−1 GQ. (8.18)
Denote the columns of Q by ~v1 and ~v2 . They are normalised eigenvectors of G with eigenvalues λ1
and λ2 respectively. Recall that for an orthogonal matrix Q we always have that det Q = ±1. We
may assume that det Q = 1, because if not we can simply multiply one of its columns by −1. This
column then is still a normalised eigenvector of G with the same eigenvalue, hence (8.18) is still
valid. With this choice we guarantee that Q is a rotation.
From (8.18) it follows that G = QDQ−1 = QDQ∗ . So we obtain from (8.17) that

d = hG~x , ~xi = hQDQ∗ ~x , ~xi = hDQ∗ ~x , Q∗ ~xi = hD~x 0 , ~x 0 i = hD~x 0 , ~x 0 i = λ1 x02 + λ2 y 02

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
352 8.6. Application: Conic Sections

 0
x
where ~x 0 = = Q∗ ~x = Q−1 ~x.
y0
 0
x
Observe that the column vector is the representation of ~x with respect to the basis ~v1 , ~v2
y0
(recall that they are eigenvectors of G). Therefore the solution of (8.14) is one of the solutions
we found for the case b = 0 only now the symmetry axes of the figures are no longer the x- and
y-axis, but they are the directions of the eigenvectors of G. In other words: Since Q is a rotation,
we obtain the solutions of ax2 + bxy + cy 2 = d by rotating the solutions of ax2 + cy 2 = d with the
matrix Q.

Procedure to find the solutions of ax2 + bxy + cy 2 = d.


 
a b/2
• Write down the symmetric matrix G = .
b/2 c
• Find the eigenvalues λ1 and λ2 and eigenvectors of G and define the diagonal matrix D =
diag(λ1 , λ2 ). and the orthogonal matrix Q such that det Q = 1 and D = Q−1 GQ.

FT
• Quadratic form without mixed terms: d = λ1 x02 + λ2 y 02 where x0 , y 0 are the components of
~x 0 = Q−1 ~x.
• Graphic of the solution: In the xy-coordinate system, indicate the x0 -axis (parallel to ~v1 )
and the y 0 -axis (parallel to ~v2 ). Note that these axes are a rotation of the x- and the y-axis.
The solutions are then, depending on the eigenvalues, an ellipse, hyperbola, etc. whose
symmetry axes are the x0 - and y 0 -axis.

If we want to know only the shape of the solution, it is enough to calculate the eigenvalues λ1 , λ2
RA
of G, or even only det G. Recall that we always assume d ≥ 0.

• If det G > 0, then we obtain an ellipse (which may be degenerate).


p
If λ1 > 0 and λ2 > 0, then the solution is an ellipse with length of its axes d/λ1 and
– p
d/λ2 . If d = 0 the ellipse is only the point (0, 0).
– If λ1 < 0 and λ2 < 0, then there is either no solution (if d > 0) or the solution is only
the point (0, 0) (if d = 0).
D

• If det G < 0, then we obtain a hyperbola (which may be degenerate).


0
– If λp
1 > 0 and λ2 < 0, then the solution is a hyperbola which intersects with the x -axis
at d/λ1 and has no intersection with the y 0 -axis.
0
1 < 0 and λ2 > 0, then the solution is a hyperbola which intersects with the x -axis
– If λp
0
at d/λ2 and has no intersection with the x -axis.
p
In both cases, the asymptotes of the hyperbola
p have slope ± λ1 /λ2 . If d = 0, the hyperbola
degenerate to the pair of lines y = ± λ1 /λ2 x.
• If det G = 0, then we obtain either the empty set, one of the axes, two lines parallel to one
of the axes, or R2 .

Definition 8.63. The axis of symmetry are called the principal axes.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 353

Example 8.64. Consider the equation

10x2 + 6xy + 2y 2 = 4. (8.19)

(i) Write the equation in matrix form.


(ii) Make a change of coordinates so that the quadratic equation (8.19) has no mixed term.
(iii) Describe the solution of (8.19) in geometrical terms and sketch it. Indicate the principal axes
and important intersections.

 we write (8.19) in the form hG~x, ~xi with a symmetric matrix G. Let us define
Solution.  (i) First
10 3
G= . Then (8.19) is equivalent to
3 2
    
x x
G , = 4. (8.20)
y y

FT
(ii) Now we calculate the eigenvalues of G. They are the roots of the characteristic polynomial
det(G − λ).

0 = det(G − λ) = (10 − λ)(2 − λ) − 9 = λ2 − 12λ + 11 = (λ − 6)2 − 25 = (λ − 1)(λ − 11).

Hence the eigenvalues of G are


λ1 = 1, λ2 = 11.
RA
Next we need the normalised eigenvectors. To this end, we calculate ker(G−λj ) using Gauß
elimination:
     
9 3 3 1 1 1
• G − λ1 = −→ =⇒ ~v1 = √ ,
3 1 0 0 10 −3
     
−1 3 −1 3 1 3
• G − λ2 = −→ =⇒ ~v2 = √ .
3 −9 0 0 10 1
D

(Recall that for symmetric matrices the eigenvectors for different eigenvalues are orthogonal.
If you solve such an exercise it might be a good idea to check if the vectors are indeed
orthogonal to each other.)
Observation. With the information obtained so far, we already can sketch the solution.

• The solution is an ellipse because both eigenvalues are positive.


• The principal axes
p (symmetry axes) are parallel to the vectors ~v1 u p
~v2 . The ellipsepinter-
sects them in ± 4/1 = ±2 along the axis parallel to ~v1 and in ± 4/11 = ±2/ 1/11
along the axis parallel to ~v2 .

Set      
1 1 3 λ1 0 1 0
Q = (~v1 |~v2 ) = √ , D= = ,
10 −3 1 0 λ2 0 11

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
354 8.6. Application: Conic Sections

then
Q−1 = Qt y D = Q−1 GQ = Qt GQ.
Observe that det Q = 1, so it is a rotation en R2 . It is a rotation by the angle arctan(−3).

If we define  0    
x −1 x 1 x − 3y
= Q = √ ,
y0 y 10 3x + y
then (8.20) gives
            0   0 
x x t x t x x x
4= G , = DQ , Q = D 0 , ,
y y y y y y0

and therefore
1 11
4 = x02 + 11y 02 = (x − 3y)2 + (3x + y)2 .
10 10

FT
(iii) The solution of (8.19) is an ellipse whose
principal axes are parallel to the vectors ~v1 y ~v2 .
x0 is the coordinate along the axis parallel to ~v1 ,
y 0 is the coordinate along the axis parallel to ~v2 .
y

√ √
2 10/ 11
y0
RA
~v2
x

~v1


2 10
D

x0


Example 8.65. Consider the equation

47 2 32 13
− x − xy + y 2 = 2. (8.21)
17 17 17
(i) Write the equation in matrix form.
(ii) Make a change of coordinates so that the quadratic equation (8.21) has no mixed term.
(iii) Describe the solution of (8.21) in geometrical terms and sketch it. Indicate the principal axes
and important intersections.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 355

Solution. (i) First we write (8.21) in the form hG~x, ~xi with symmetric matrix G. Let us define

1 −47 −16
G = 17 . Then (8.21) is equivalent to
−16 13
    
x x
G , = 2. (8.22)
y y

(ii) Now we calculate the eigenvalues of G. They are the roots of the characteristic polynomial
47 13
0 = det(G−λ) = (− 17 −λ)( 17 −λ)− 128 2 34 611 256 2
172 = λ + 17 λ− 172 − 172 = λ +2λ−3 = (λ−1)(λ+3).

Hence the eigenvalues of G are


λ1 = −3, λ2 = 1.
Next we need the normalised eigenvectors. To this end, we calculate ker(G−λj ) using Gauß
elimination:
     
1 4 −16 1 1 −4 1 4
~v1 = √

FT
• G − λ1 = −→ =⇒ ,
17 −16 64 17 0 0 17 1
     
1 −64 −16 1 4 1 1 −1
• G − λ2 = −→ =⇒ ~v2 = √ .
17 −16 −4 17 0 0 17 4

Observation. With the information obtained so far, we already can sketch the solution.
• The solution are hyperbola because the eigenvalues have opposite signs.
• The principal axes (symmetry axes) are parallel to the vectors ~v1 and ~v2 . The intersec-
RA

tions of the hyperbola with the axis parallel to ~v2 are ± 2.

Set      
1 4 −1 λ1 0 −3 0
Q = (~v1 |~v2 ) = √ , D= = ,
17 1 4 0 λ2 0 1
then
Q−1 = Qt y D = Q−1 GQ = Qt GQ.
Observe that det Q = 1, hence Q is a rotation of R2 . It is a rotation by the angle arctan(1/4).
D

If we define  0    
x −1 x 1 4x + y
= Q = √ ,
y0 y 17 −x + 4y
then (8.22) gives
            0   0 
x x x x x x
2= G , = DQt , Qt = D 0 , ,
y y y y y y0
hence
3 1
2 = −3x02 + y 02 = − (4x + y)2 + (−x + 4y)2 .
17 17

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
356 8.6. Application: Conic Sections

y
y0

(iii) The solution of equation (8.19) are hyperbola


whose principal axes are parallel to the vectors
~v1 y ~v2 .

x0 is the coordinate along the axis parallel to ~v1 , 2
y 0 is the coordinate along the axis parallel to ~v2 .
The angle between the x- and the x0 -axis is ~v2
arctan(1/4). x0
~v1
x

FT
Asymptotes of the hyperbola. In order to calculate the slopes of the asymptotes of the
hyperbola, we first calculate in the x0 -y 0 -coordinate system. Our starting point is the equation
2 = −3x02 + y 02 .
RA
r
02 02 y 02 1 y0 1
2 = −3x + y ⇐⇒ = 3 + 02 ⇐⇒ = ± 3 + 02 .
x02 2x x0 2x
0 √
We see that |y 0 | → ∞ if and only if |x0 | → ∞ and that xy 0 ≈ ± 3. So the slopes of the

asymptotes in x0 -y 0 -coordinates are ± 3.
How do we find the slope in x − y-coordinates?

• Method 1: Use Q. We know that if we rotate our hyperbola by the linear transforma-
D

tion Q− 1 (i.e. if we rotate by arctan(1/4)), then we obtain hyperbola whose symmetry


axes are the x- and y-axes and whose asymptotes have slopes ±3. Hence, in order to
obtain the asymptotes of our parabola, we only need to apply Q to the vectors w ~1 y w~2
which are parallel to the new asymptotes.
  The resulting
 vectors
 are then parallel to our
√1 1

original hyperbola. In our case w~1 = , w
~2 = . Hence
3 − 3
    √ 
1 4 1 √1 = √1 4+ √ 3
~ 10
w ~1 = √
= Qw ,
17 −1 4 3 17 −1 + 4 3

    √ 
1 4 1 1
√ 1 4− √ 3
~ 20
w ~2 = √
= Qw =√ .
17 −1 4 − 3 17 −1 − 4 3

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 357

Therefore the slopes of the asymptotes of our hyperbola are


√ √
−1 + 4 3 −1 − 4 3
√ and √ .
4+ 3 4− 3
y0

• Method 2: Insert in the formulas. The asymptotes are lines which satisfy x0 = ± 3.
Using x0 = √117 (4x − y) y y 0 = √117 (x + 4y), we obtain

√ √1 (x + 4y)
y0 17 x + 4y
± 3= = =
x0 √1 (4x − y) 4x − y
17

⇐⇒ ± 3(4x − y) = x + 4y
√ √
⇐⇒ (±4 3 − 1)x = (4 ± 3)y

y −1 ± 4 3
⇐⇒ = √ .
x 4± 3
• Method 3: Adding 0
√ angles. We know that the0 angle between the x -axis and an

FT
asymptote is arctan 3 and the angle between the x -axis and the x-axis
√ is arctan(1/4).
Therefore the angel between the asymptote and the x-axis is arctan 3 + arctan(1/4)
(see Figure 8.3.)

−3x2 + y 2 = 2 − 47 2
17 x −
32
17 16xy + 13 2
17 y =2

y y
y0 y0
RA

2 α=ϕ+ϑ

x0 ~v2 ϑ x0
ϑ
ϕ ~v1 ϕ
D

x x

ϕ = arctan(1/4)

ϑ = arctan( 3)

Figure 8.3: The figure on the right (our hyperbola) is obtained from the figure on the left by applying
the transformation Q to it (that is, by rotating it by arctan(1/4)).


Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
358 8.6. Application: Conic Sections

Example 8.66. Consider the equation

9x2 − 6xy + y 2 = 25. (8.23)

(i) Write the equation in matrix form.


(ii) Make a change of coordinates so that the quadratic equation (8.23) has no mixed term.
(iii) Describe the solution of (8.23) in geometrical terms and sketch it. Indicate the principal axes
and important intersections.

Solution 1.
 • First
 we write (8.21) in the form hG~x, ~xi with symmetric matrix G. Let us define
9 −3
G= . Then (8.23) is equivalent to
−3 1
    
x x
G , = 25. (8.24)
y y

Hence the eigenvalues of G are FT


• Now we calculate the eigenvalue’s of G. They are the roots of the characteristic polynomial

0 = det(G − λ) = (9 − λ)(1 − λ) − 9 = λ2 − 10λ = λ(λ − 10).

λ1 = 0, λ2 = 10.

Next we need the normalised eigenvectors. To this end, we calculate ker(G−λj ) using Gauß
RA
elimination:
     
9 −3 3 −1 1 1
• G − λ1 = −→ =⇒ ~v1 = √ ,
−3 1 0 0 10 3
     
−1 −3 1 3 1 −3
• G − λ2 = −→ =⇒ ~v2 = √ .
−3 −9 0 0 10 1

Observation. With the information obtained so far, we already can sketch the solution.
D

– The solution are two parallel lines because one of the eigenvalues is zero and the other
is positive.
– The
p lines are p
parallel to ~v1 and their intersections with the axis parallel to ~v1 are
± 25/10 = ± 5/2.

Set      
1 1 −3 λ1 0 0 0
Q = (~v1 |~v2 ) = √ , D= = ,
10 3 1 0 λ2 0 10
then
Q−1 = Qt y D = Q−1 GQ = Qt GQ.
Observe that det Q = 1, hence Q is a rotation in R2 . It is a rotation by the angle arctan(3).

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 359

If we define
 0    
x −1 x 1 x + 3y
=Q =√ ,
y0 y 10 −3x + y
then (8.24) gives
            0   0 
x x x x x x
25 = G , = DQt , Qt = D 0 , ,
y y y y y y0

therefore
25 = 10y 02 = (−3x + y)2 .

the
pvector ~v1 which
0
p intersect the y -axis at
± 25/10 = ± 5/2.
FT
• The solution of (8.19) are two lines parallel to

x0 is the coordinate along the axis parallel to ~v1 ,


y 0 is the coordinate along the axis parallel to ~v2 .
The angle between the x- and the x0 -axis is
p
5/2
~v2
~v1

x
RA
arctan(3).

Solution 2. Note that


D

25 = 9x2 − 6xy + y 2 = (3x − y)2 ⇐⇒ 5 = |3x − y|.

Therefore the solution are two parallel lines given by

y = 3x ± 5

which coincides with the result above. 

8.6.1 Solutions of ax2 + bxy + cy 2 = d as conic sections


The reason why the title of this section is “conic section” is because most of the solution sets of the
quadratic equations can be obtained as the intersection of a double cone with a planes.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
360 8.6. Application: Conic Sections

FT
Figure 8.4: Ellipses. The plane in the picture on the left is parallel to the xy-plane. Therefore
the intersection with the cone is a circle. If the plane starts to incline, the intersection becomes an
ellipse. The more inclined the plane is, the more prolonged is the ellipse. As long as the plane is not
yet parallel to the surface of the cone, the intersects only either the upper or the lower part of the
cone and the intersection is an ellipse.
RA
D

Figure 8.5: Parabola. If the plane is parallel to the surface of the cone and does not pass through
the origin, then the intersection with the cone is a parabola (this is not a possible solution of (8.14)).
If the plane is parallel to the surface of the cone and passes through the origin, then the plane is
tangential to the cone and the intersection is one line.

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 361

Figure 8.6: Hyperbola. If the plane is steeper than the cone, then it intersects both the upper and
the lower part of the cone. The intersection are hyperbola. If the plane passes through the origin, then

FT
the hyperbola degenerate to two intersecting lines. The plane in the picture in the middle is parallel
to the yz-plane. Therefore the intersection with the cone is a circle.

8.6.2 Solutions of ax2 + bxy + cy 2 + rx + sy = d


Let us briefly discuss the case then the quadratic equation (8.14) contains linear terms:

ax2 + bxy + cy 2 + rx + sy = d (8.25)


RA
We want  to find a transformation
 so that (8.25) can be written without the linear terms rx and sy.
a b/2
Let G = and let λ1 , λ2 be its eigenvalues. Moreover, let D = diag(λ1 , λ2 ) and Q an
b/2 c
orthogonal matrix with det Q = 1 and D = Q−1 GQ.
In the following we assume that G is invertible.
Method 1. First eliminate the mixed term bxy.
If we set ~x 0 = Q−1 ~x, then ax2 + bxy + cy 2 = λ1 x02 + λ2 y 02 . Since x0 and y 0 are linear in x and y,
equation (8.25) becomes
D

λ1 x02 + λ2 y 02 + r0 x0 + s0 y 0 = d0 .
Now we only need to complete the squares on the left hand sides to obtain

λ1 (x0 + r0 /2)2 + λ2 (y 0 + s0 /2)2 − (r0 /2)2 − (s0 /2)2 = d0 .

Note that this can always be done if λ1 and λ2 are not 0 (here we use that G is invertible).
If we set d00 = d0 + (r0 /2)2 + (s0 /2)2 , x00 = x0 + r0 /2 y 00 = y 0 + s0 /2, then

λ1 x002 + λ2 y 002 = d00 . (8.26)


 0   0 
r /2 r /2
Since ~x 00 = 0 +~x 0 = 0 +Q−1 ~x we see that the solution is the solution of λ1 x2 +λ2 y 2 = d00
s /2 s /2  0 
r /2
but rotated by Q and shifted by the vector .
s0 /2

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
362 8.6. Application: Conic Sections

Method 2. First eliminate the linear term rx and sy.


Let us make the ansatz x = x0 + x
e and y = y0 + ye. Inserting in (8.25) gives

e)2 + b(x0 + x
d = a(x0 + x e)(y0 + ye) + c(y0 + ye)2 + r(x0 + x
e) + s(y0 + ye)2
x2 + be y 2 + 2ax0 + by0 + r x e + 2cy0 + bx0 + s ye + ax20 + bx0 y0 + cy02 .
   
= ae xye + ce (8.27)

We want the linear terms in x


e and ye to disappear, so we need 2ax0 +by0 +r = 0 and 2cy0 +bx0 +s = 0.
In matrix form this is
      
r 2a b x0 x
− = = 2G 0 .
s b 2c y0 y0
   
x0 r
Assume that G is invertible. Then we can solve for x0 and y0 and obtain = − 21 G−1 .
y0 s
Now if we set de = d − ax20 − bx0 y0 − cy02 , then (8.27) becomes

FT
x2 + be
de = ae y2
xye + ce (8.28)

which is now in the form of (8.14) (if de is negative, then we must multiply both sides of (8.28) by
−1. In this case, the eigenvalues of G change their sign, hence D also changes sign, but Q does
not). Hence if we set ~x 0 = Q−1~x
e, then

de = λ1 x02 + λ2 y 02
RA
 
0 −1~ −1 −1 −1 −1 r
1 −1 −1
and ~x = Q x e = Q (~x − ~x0 ) = Q ~x − Q ~x0 = Q ~x + 2Q G . So again we see that
s
2 2
the solution
 of (8.25) is the solution of λ1 x + λ2 y = d but rotated by Q and shifted by the vector
e
1 −1 −1 r
2Q G
s
.

Example 8.67. Find the solutions of


D

10x2 + 6xy + 2y 2 + 8x − 2y = 4. (8.19’)

Solution. We know from Example 8.64 that


     
10 3 1 1 3 1 0
G= , Q= √ , D=
3 2 10 −3 1 0 11

and that
 0  0  0
x + 3y 0
      
x −1 x 1 x − 3y x x 1
=Q =√ and =Q 0 = √ 0 .
y0 y 10 3x + y y y 0
10 −3x + y

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 363

Method 1. With the notation above, we know from Example 8.64 that(8.19’) is

4 = 10x2 + 6xy + 2y 2 + 8x − 2y = x02 + 11y 02 + 8x − 2y


8 2
= x02 + 11y 02 + √ (x0 + 3y 0 ) − √ (−3x0 + y 0 )
10 10
14 22
= x02 + √ x0 + 11y 02 + √ y 0
10 10
 2  2
7 1
= x0 + √ + 11 y 0 + √ − 6,
10 10
hence
 2  2
0 7 0 1
x +√ + 11 y + √ = 4 + 6 = 10.
10 10

This is an ellipse oriented as the one from Example 8.64 but shifted by 7/ 10 in x0 -direction and
√ √ q

FT
−1/ 10 in y 0 -direction. The length of the semiaxes are 10 and 10 11 .

Method 2. Note that


          
x0 1 r 1 1 2 −3 8 1 22 −1
= − G−1 =− · =− = .
y0 2 s 2 11 −3 10 −2 22 −44 2

e = x − x0 = x + 1 and ye = y − y0 = y − 2. Then
Set x
RA
4 = 10x2 + 6xy + 2y 2 + 8x − 2y = 10(e
x − 1)2 + 6(e
x − 1)(e y + 2)2 + 8(e
y + 2) + 2(e x − 1) − 2(e
y + 2)
x2 − 20e
= 10e x + 1 + 6e x − 6e
xye + 12e y 2 + 8e
y − 12 + 2e x − 8 − 2e
y + 8 + 8e y−4
x2 + 6e
= 10e y 2 − 15
xye + 2e

hence

x2 + 6e
19 = 10e e02 + 11e
y2 = x
xye + 2e y 02

with
D

 0        
x −1 x 1 xe + 3e
y 1 (x + 1) + 3(y − 2) 1 x + 3y − 5
= Q = √ = √ = √ .
e e
ye0 ye 10 3ex − ye 10 3(x + 1) − (y − 2) 10 3x − y + 5


You should now have understood


• that a symmetric 2×2 matrix which is not a multiple of the identity marks two distinguished
directions in R2 , namely the ones parallel to its eigenvectors,
• why a change of variables is helpful to find solutions of a quadratic equation in two variables,
• etc.
You should now be able to

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
364 8.7. Summary

• find the solutions of quadratic equations in two variables,


• make a change of coordinates such that the transformed equation has no mixed term,
• sketch the solution in the xy-plane,
• etc.

Ejercicios.
1. Encuentre una substitución ortogonal que diagonalice las formas cuadráticas dadas y en-
cuentre la forma diagonal. Haga un bosquejo de las soluciones. Si es un elipse, calcule las
longitudes de los ejes principales y el ángulo que tienen con el eje x. Si es una hipérbola,
calcule en ángulo que tiene las ası́ntotas con el eje x.

(a) 10x2 − 6xy + 2y 2 = 4,


(b) x2 − 9y 2 = 2,
(c) x2 − 9y 2 = 20 (compare la solución con la del literal anterior!)

FT
(d) 11x2 − 16xy − y 2 = 30.
(e) x2 + 4xy + 4y 2 = 4.
(f) xy = 1.
(g) 5x2 − 2xy + 5y 2 = −4.
(h) x2 − 2xy + 4y 2 = 0.
1 1
2. Encuentre la fórmula de una elipse cuyos semiejes tienen magnitudes y y cuyos ejes
RA
 2 3
principales son paralelos a 1 2 y −2 1 .

3. Encuentre la fórmula de una elipse cuyos semiejes tienen magnitudes 3 y 1 y cuyo primer eje
principal tiene un ángulo de 30◦ con el eje x.

8.7 Summary
Cn as an inner product space
D

Cn is an inner product space if we set


n
X
h~z , wi
~ = zj w j .
j=1

~ ~z ∈ Cn and c ∈ C:
We have for all ~v w,

• h~v , ~zi = h~z , wi,


~
• h~v + cw
~ , ~zi = h~v , ~zi + chw
~ , ~zi, h~z , ~v + cwi
~ = h~z , ~v i + ch~z , wi,
~
• h~z , ~zi = k~zk ,
2

• h~v , ~zi ≤ k~v k k~zk,

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
Chapter 8. Symmetric matrices and diagonalisation 365

• k~v + ~zk2 ≤ k~v k2 + k~zk2 .


The adjoint of a matrix A ∈ MC (n×n) is A∗ = (At ) = (A)t (= transposed and complex conjugated).
The matrix A is called hermitian if A∗ = A. The matrix Q is called unitary if it is invertible and
Q∗ = Q−1 .
Note that det A∗ = det A.

Eigenvalues and eigenvectors


Definition. Let A ∈ M (n × n). Then λ is called an eigenvalue of A with eigenvector ~v if ~v 6= ~0
and A~v = λ~v . The set of all solutions of A~v = λ~v for an eigenvalue λ is called the eigenspace of A
for λ. It is denoted by Eigλ (A).

The eigenvalues of A are exactly the zeros of the characteristic polynomial

pA (λ) = det(A − λ).

It is a polynomial of degree n. Since every polynomial of degree ≥ 1 has at least one complex root,

FT
every complex matrix has at least one eigenvalue (but there are real matrices without eigenvalues.)
Moreover, an n × n-matrix has at most n eigenvalues. If we factorise pA , we obtain

pA (λ) = (λ − µ1 )m1 · · · (λ − µk )mk

where µ1 , . . . , µ)k are the different eigenvalues of A. The exponent mj is called algebraic multi-
plicity of µj . The geometric multiplicity of µj is dim(Eigµj (A). Note that
• geometric multiplicity ≤ algebraic multiplicity,
RA
• the sum of all algebraic multiplicities is m1 + · · · + mk = n.
Similar matrices.
• Two matrices A, B ∈ M (n × n) are called similar if there exists an invertible matrix C such
that A = C −1 BC.
• A matrix A is called diagonalisable if it is similar to a diagonal matrix.

Characterisation of diagonalisability. Let A ∈ MC (n × n) and let µ1 , . . . , µk be the different


D

eigenvalues of A. We set nj = dim(Eigµj (A) = geometric multiplicity of µj and mj = algebraic


multiplicity of µj . Then the following is equivalent:
(i) A is diagonalisable.
(ii) Cn has a basis consisting of eigenvectors of A.
(iii) Cn = Eigµ1 (A) ⊕ · · · ⊕ Eigµk (A).
(iv) nj = mj for every j = 1, . . . , k.
(v) n1 + · · · + nk = n.
The same is true for symmetric matrices with Cn replaced by Rn .

Properties of unitary matrices. Let Q be a unitary n × n matrix. Then:

Last Change: Thu Aug 24 05:15:09 PM -05 2023


Linear Algebra, M. Winklmeier
366 8.7. Summary

• | det Q| = 1,
• If λ ∈ C is an eigenvalue of Q, then |λ| = 1.
• Q is unitarily diagonalisable (we did not prove this fact), hence Cn has a basis consisting of
eigenvectors of Q. They can be chosen to be mutually orthogonal.

Moreover, Q ∈ M (n × n) is unitary if and only if kQ~zk = k~zk for all ~z ∈ Cn .

Properties of hermitian matrices. Let A ∈ MC (n × n) be a hermitian n × n matrix. Then:

• det A ∈ R,
• If λ is an eigenvalue of Q, then λ ∈ R.
• A is unitarily diagonalisable hence Cn has a basis consisting of eigenvectors of A. They can
be chosen to be mutually orthogonal.

FT
Moreover, A ∈ M (n × n) is hermitian if and only if hA~v , ~zi = h~v , A~zi for all ~v , ~z ∈ Cn .

Properties of symmetric matrices. Let A ∈ MR (n × n) be a symmetric n × n matrix. Then:

• A is orthogonally diagonalisable, hence Rn has a basis consisting of eigenvectors of A. They
can be chosen to be mutually orthogonal.
Moreover, A is symmetric if and only if hA~v , ~zi = h~v , A~zi for all ~v , ~z ∈ Rn .

Solution of ax2 + bxy + cy 2 = d. The equation can be rewritten as hG~x , ~xi = d with the symmetric
matrix

G = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} .

Let λ1 , λ2 be the eigenvalues of G and let us assume that d ≥ 0. Then the solutions are:

• an ellipse if det G > 0, more precisely,

– an ellipse with semi-axes of length \sqrt{d/λ1} and \sqrt{d/λ2} if λ1 , λ2 > 0 and d > 0,
– the point (0, 0) if d = 0,
– the empty set if λ1 , λ2 < 0 and d > 0,

• a hyperbola if det G < 0, more precisely,

– a hyperbola if d > 0,
– two lines crossing at the origin if d = 0,

• two parallel lines, one line, the empty set or R2 if det G = 0.
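A short worked example (added for illustration): for x2 + xy + y 2 = 3 we have a = c = 1, b = 1, hence G = \begin{pmatrix} 1 & 1/2 \\ 1/2 & 1 \end{pmatrix} with eigenvalues λ1 = 3/2, λ2 = 1/2 and det G = 3/4 > 0. The solution set is therefore an ellipse with semi-axes \sqrt{3/(3/2)} = \sqrt{2} and \sqrt{3/(1/2)} = \sqrt{6}, whose principal axes point along the eigenvectors (1, 1)t and (1, −1)t of G.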


8.8 Exercises
1. Let Q be a unitary matrix. Show that all of its eigenvalues have norm 1.

2. Show that A and B are similar if and only if At and B t are similar.


3. Find all matrices that are similar to the identity.

4. Are the matrices A = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} and B = \begin{pmatrix} 5 & 1 \\ 0 & 1 \end{pmatrix} similar? Hint. Exercise 5.(c).
5. Let A be a matrix with eigenvalues µ1 , . . . , µk and let c be a constant.
(a) What can be said about the eigenvalues of cA? What can be said about the eigenvalues
of A + c id?

6. Given the matrix A and the vectors u and w:

A = \begin{pmatrix} 25 & 15 & −18 \\ −30 & −20 & 36 \\ −6 & −6 & 16 \end{pmatrix} ,   u = \begin{pmatrix} 0 \\ 1 \\ −1 \end{pmatrix} ,   w = \begin{pmatrix} −1 \\ 1 \\ 0 \end{pmatrix} .

(a) Decide whether the vectors u and w are eigenvectors of A. If they are, what are the
corresponding eigenvalues?
(b) You may use that det(A − λ) = −λ3 + 21λ2 − 138λ + 280. Compute all eigenvalues of
A.
 
7. Let A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}. Compute eA := \sum_{n=0}^{\infty} \frac{1}{n!} A^n .
Hint. Find an invertible matrix C and a diagonal matrix D such that A = C −1 DC and use
this to compute An .

8. (a) Let Φ : M (2 × 2, R) → M (2 × 2, R), Φ(A) = At . Find the eigenvalues and the
eigenspaces of Φ.
(b) Let P2 be the vector space of polynomials of degree at most 2 with real coefficients.
Find the eigenvalues and the eigenspaces of T : P2 → P2 , T p = p′ + 3p.
(c) Let R be the reflection in the plane P : x + 2y + 3z = 0 in R3 . Compute the eigenvalues
and the eigenspaces of R.

9. We consider a string of length L which is fixed at both end points. If it is excited, then its vertical
elongation satisfies the partial differential equation \frac{\partial^2}{\partial t^2} u(t, x) = \frac{\partial^2}{\partial x^2} u(t, x). If we make the
ansatz u(t, x) = e^{iωt} v(x) for some number ω and a function v which depends only on x, we
obtain −ω 2 v = v ′′ . If we set λ = −ω 2 , we see that we have to solve the following eigenvalue
problem:
T : V → V, T v = v ′′
with
V = {f : [0, L] → R : f is twice differentiable and f (0) = f (L) = 0}.


(a) Show that V is a vector space.


(b) Show that T is a well-defined linear operator.
(c) Find the eigenvalues and eigenspaces of T .

10. Find the eigenvalues and the eigenspaces of the following n × n matrices:

A = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix} ,   B = \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ 1 & 1 & \cdots & 1 & 2 \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & 1 & \cdots & 1 & n \end{pmatrix} .

Compare with Exercise 11.

11. Let A ∈ M (n × n, C) be a hermitian matrix all of whose eigenvalues are strictly greater
than 0. Let h· , ·i be the standard inner product on Cn . Show that A induces an inner
product on Cn via

Cn × Cn → C,   (x, y) := hAx , yi.



Appendix A

Complex Numbers

A complex number is an expression of the form

z = a + ib

where a, b ∈ R and i is called the imaginary unit. The number a is called the real part of z, denoted
by Re(z), and b is called the imaginary part of z, denoted by Im(z).
The set of all complex numbers is sometimes called the complex plane and it is denoted by C:
C = {a + ib : a, b ∈ R}.
A complex number can be visualised as a point in the plane R2 where a is the coordinate on the
real axis and b is the coordinate on the imaginary axis.
Let a, b, x, y ∈ R. We define the algebraic operations sum and product for complex numbers
z = a + ib, w = x + iy:
z + w = (a + ib) + (x + iy) := a + x + i(b + y),
zw = (a + ib)(x + iy) := ax − by + i(ay + bx).
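For instance (a small added example): for z = 2 + 3i and w = 1 − i we obtain z + w = 3 + 2i and, using the product formula above with a = 2, b = 3, x = 1, y = −1, zw = (2 · 1 − 3 · (−1)) + i(2 · (−1) + 3 · 1) = 5 + i.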
 
Exercise A.1. Show that if we identify the complex number z = a + ib with the vector (a, b)t ∈ R2 ,
then the addition of complex numbers is the same as the addition of vectors in R2 .
We will give a geometric interpretation of the multiplication of complex numbers later after formula
(A.5).
It follows from the definition above that i2 = −1. Moreover, we can view the real numbers R as a
subset of C if we identify a real number x with the complex number x + 0i.
Let a, b ∈ R and z = a + ib. Then the complex conjugate of z is

\overline{z} = a − ib

and its modulus or norm is

|z| = \sqrt{a^2 + b^2} .

Geometrically, the complex conjugate is obtained from z by a reflection across the real axis, and its
norm is the distance of the point represented by z from the origin of the complex plane.

Figure A.1: The complex plane, showing the points 3 + 2i, −1 + i and −(3/2)i, and a point z = a + ib together with its complex conjugate \overline{z} = a − ib.

Properties A.2. Let a, b, x, y ∈ R and let z = a + ib, w = x + iy. Then:

(i) z = Re z + i Im z.

(ii) Re(z + w) = Re(z) + Re(w), Im(z + w) = Im(z) + Im(w).

(iii) \overline{(\overline{z})} = z, \overline{z + w} = \overline{z} + \overline{w}, \overline{zw} = \overline{z}\,\overline{w}.

(iv) z\overline{z} = |z|^2 .

(v) Re z = \frac{1}{2}(z + \overline{z}), Im z = \frac{1}{2i}(z − \overline{z}).
Proof. (i) and (ii) should be clear. For (iii) note that \overline{(\overline{z})} = \overline{a − ib} = a + ib,

\overline{z + w} = \overline{a + x + i(b + y)} = a + x − i(b + y) = (a − ib) + (x − iy) = \overline{z} + \overline{w},
\overline{zw} = \overline{ax − by + i(ay + bx)} = ax − by − i(ay + bx) = (a − ib)(x − iy) = \overline{z}\,\overline{w}.

(iv) follows from

z\overline{z} = (a + ib)\overline{(a + ib)} = (a + ib)(a − ib) = a^2 + b^2 + i(ab − ba) = a^2 + b^2 = |z|^2

and (v) follows from

z + \overline{z} = a + ib + \overline{(a + ib)} = a + ib + a − ib = 2a = 2 Re(z),
z − \overline{z} = a + ib − \overline{(a + ib)} = a + ib − (a − ib) = 2ib = 2i Im(z).

We call a complex number real if it is of the form z = a + i0 for some a ∈ R and we call it purely
imaginary if it is of the form z = 0 + ib for some b ∈ R. Hence

z is real ⇐⇒ \overline{z} = z ⇐⇒ z = Re(z),
z is purely imaginary ⇐⇒ \overline{z} = −z ⇐⇒ z = i Im(z).

It turns out that C is a field, that is, it satisfies:


(a) Associativity of addition: (u + v) + w = u + (v + w) for every u, v, w ∈ C.

(b) Commutativity of addition: v + w = w + v for every v, w ∈ C.

(c) Identity element of addition: There exists an element 0, called the additive identity such
that for every v ∈ C, we have 0 + v = v + 0 = v.

(d) Additive inverse: For all z ∈ C, we have an inverse element −z such that z + (−z) = 0.

(e) Associativity of multiplication: (uv)w = u(vw) for every u, v, w ∈ C.

(f) Commutativity of multiplication: vw = wv for every v, w ∈ C.

(g) Identity element of multiplication: There exists an element 1, called the multiplicative identity,
such that for every v ∈ C, we have 1 · v = v · 1 = v.

(h) Multiplicative inverse: For all z ∈ C \ {0}, we have an inverse element z −1 such that
z · z −1 = 1.

(i) Distributivity laws: For all u, v, w ∈ C we have

u(w + v) = uw + uv.

It is easy to check that commutativity, associativity and distributivity hold. Clearly, the additive
identity is 0 + i0 and the multiplicative identity is 1 + 0i. If z = a + ib, then its additive inverse is
−a − ib. If z ∈ C \ {0}, then z −1 = \frac{\overline{z}}{|z|^2} = \frac{a − ib}{a^2 + b^2} . This can be seen easily if we recall that |z|^2 = z\overline{z}.
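As a quick added check of this formula: for z = 1 + 2i we get z −1 = \frac{1 − 2i}{5}, and indeed z · z −1 = \frac{(1 + 2i)(1 − 2i)}{5} = \frac{1 + 4}{5} = 1.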

The proof of the next theorem is beyond the scope of these lecture notes.
RA
Theorem A.3 (Fundamental theorem of algebra). Every non-constant complex polynomial
has at least one complex root.

We obtain immediately the following corollary.

Corollary A.4. Every complex polynomial p can be written in the form

p(z) = c(z − λ1 )^{n_1} (z − λ2 )^{n_2} · · · (z − λk )^{n_k}     (A.1)

where λ1 , . . . , λk are the different roots of p. Note that n1 + · · · + nk = deg(p).

The integers n1 , . . . , nk are called the multiplicity of the corresponding root.

Proof. Let n = deg(p). If n = 0, then p is constant and it is clearly of the form (A.1). If n > 0, then,
by Theorem A.3 there exists µ1 ∈ C such that p(µ1 ) = 0. Hence there exists some polynomial q1
such that p(z) = (z − µ1 )q1 (z). Clearly, deg(q1 ) = n − 1. If q1 is constant, we are done. If q1 is not
constant, then it must have a zero µ2 . Hence q1 (z) = (z − µ2 )q2 (z) with some polynomial q2 with
deg(q2 ) = n − 2. If we repeat this process n times, we finally obtain that

p(z) = c(z − µ1 )(z − µ2 ) · · · (z − µn ).

Now we only have to group all terms with the same µj and we obtain the form (A.1).


Functions of complex numbers


It is more or less obvious how to form a complex polynomial. We can also extend functions which
admit a power series representation to the complex numbers. To this end, we recall (from some
calculus course) that a power series is an expression of the form

\sum_{n=0}^{\infty} c_n (z − a)^n     (A.2)

where the cn are the coefficients and a is where the power series is centred. In our case, they are
complex numbers and z is a complex number. Recall that a series \sum_{n=0}^{\infty} a_n is called absolutely
convergent if and only if \sum_{n=0}^{\infty} |a_n | is convergent. It can be shown that every absolutely convergent
series of complex numbers is convergent. Moreover, for every power series of the form (A.2) there
exists a number R > 0 or R = ∞, called the radius of convergence such that the series converges
absolutely for every z ∈ C with |z − a| < R and it diverges for z with |z − a| > R. That means that
the series converges absolutely for all z in the open disc with radius R centred in a, and it diverges
outside the closed disc with radius R centred in a. For z on the boundary the series may converge

or diverge. Note that R = 0 and R = ∞ are allowed. If R = 0, then the series converges only for
z = a and if R = ∞, then the series converges for all z ∈ C.
Important functions that we know from the real numbers and have a power series are sine, cosine
and the exponential function. We can use their power series representation to define them also for
complex numbers.

Definition A.5. Let z ∈ C. Then we define

\sin z = \sum_{n=0}^{\infty} \frac{(−1)^n}{(2n+1)!} z^{2n+1} ,   \cos z = \sum_{n=0}^{\infty} \frac{(−1)^n}{(2n)!} z^{2n} ,   e^z = \sum_{n=0}^{\infty} \frac{1}{n!} z^n .     (A.3)

Note that for every z the series in (A.3) are absolutely convergent because, for instance, for the series
for the sine function, we have \sum_{n=0}^{\infty} \left| \frac{(−1)^n}{(2n+1)!} z^{2n+1} \right| = \sum_{n=0}^{\infty} \frac{1}{(2n+1)!} |z|^{2n+1} , which is convergent because |z|
is a real number and we know that the sine series is absolutely convergent for every real argument.
Hence the sine series is absolutely convergent for any z ∈ C, hence converges. The same argument
shows that the series for the cosine and for the exponential are convergent for every z ∈ C.

Remark A.6. Since the series for the sine function contains only odd powers of z, it is an odd
function and cosine is an even function because it contains only even powers of z. In formulas:
sin(−z) = − sin z, cos(−z) = cos z.

Next we show the relation between the trigonometric and the exponential function.

Theorem A.7 (Euler formulas). For every z ∈ C we have that


e^{iz} = \cos z + i \sin z,
\cos(z) = \frac{1}{2} (e^{iz} + e^{−iz} ),
\sin(z) = \frac{1}{2i} (e^{iz} − e^{−iz} ).


Proof. Let us show the formula for eiz . In the calculation we will use that i2n = (i2 )n = (−1)n and
i2n+1 = (i2 )n i = (−1)n i and

e^{iz} = \sum_{n=0}^{\infty} \frac{1}{n!} (iz)^n = \sum_{n=0}^{\infty} \frac{1}{n!} i^n z^n = \sum_{n=0}^{\infty} \frac{1}{(2n)!} i^{2n} z^{2n} + \sum_{n=0}^{\infty} \frac{1}{(2n+1)!} i^{2n+1} z^{2n+1}
= \sum_{n=0}^{\infty} \frac{1}{(2n)!} (−1)^n z^{2n} + i \sum_{n=0}^{\infty} \frac{1}{(2n+1)!} (−1)^n z^{2n+1} = \sum_{n=0}^{\infty} \frac{(−1)^n}{(2n)!} z^{2n} + i \sum_{n=0}^{\infty} \frac{(−1)^n}{(2n+1)!} z^{2n+1}
= \cos z + i \sin z.

Note that the third step needs some proper justification (see some course on integral calculus).
For the proof of the formula for cos z we note that from what we just proved, it follows that

\frac{1}{2} (e^{iz} + e^{−iz} ) = \frac{1}{2} (\cos(z) + i \sin(z) + \cos(−z) + i \sin(−z)) = \frac{1}{2} (\cos(z) + i \sin(z) + \cos(z) − i \sin(z))
= \cos(z).

The formula for the sine function follows analogously.
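As a quick added illustration of the first Euler formula: e^{iπ} = \cos π + i \sin π = −1 and e^{iπ/2} = \cos(π/2) + i \sin(π/2) = i.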


Exercise. Let z, w ∈ C. Show the following.

(i) ez ew = ez+w . Hint. Use Cauchy product.


(ii) Use the Euler formulas to prove \cos α \cos β = \frac{1}{2} (\cos(α − β) + \cos(α + β)), \sin α \sin β =
\frac{1}{2} (\cos(α − β) − \cos(α + β)), \sin α \cos β = \frac{1}{2} (\sin(α + β) + \sin(α − β)).

(iii) (cos z)2 + (sin z)2 = 1.


(iv) cosh(z) = cos(iz), sinh(z) = −i sin(iz). In particular, sin and cos are not bounded functions
in C.
(v) Show that the exponential function is 2πi-periodic.

Polar representation of complex numbers

Let z ∈ C with |z| = 1 and let ϕ be the angle between the positive real axis and the line connecting
the origin and z. It is called the argument of z and it is denoted by arg(z). Observe that the
argument is only determined modulo 2π. That means, if we add or subtract any integer multiple
of 2π to the argument, we obtain another valid argument.


Figure A.2: Left picture: If |z| = 1, then z = cos ϕ + i sin ϕ = e^{iϕ} . Right picture: If z ≠ 0, then z = |z| cos ϕ + i|z| sin ϕ = |z| e^{iϕ} .

Then the real and imaginary parts of z are Re(z) = cos ϕ and Im(z) = sin ϕ, and therefore
z = cos ϕ + i sin ϕ = e^{iϕ} . We saw in Remark 2.3 how we can calculate the argument of a complex
number.
Now let z ∈ C \ {0} and again let ϕ be the angle between the positive real axis and the line
connecting the origin with z. Let \tilde{z} = \frac{z}{|z|} . Then |\tilde{z}| = 1 and therefore \tilde{z} = e^{iϕ} . It follows that

z = |z| e^{iϕ} .     (A.4)


(A.4) is called the polar representation of z.
Now we can give a geometric interpretation of the product of two complex numbers. Let z, w ∈
C \ {0} and let α = arg z and β = arg w. Then

zw = |z| e^{iα} |w| e^{iβ} = |z| |w| e^{i(α+β)} .     (A.5)
This shows that the product zw is the complex number whose norm is the product of the norms of
z and w and whose argument is the sum of the arguments of z and w.

Figure A.3: Geometric interpretation of the multiplication of two complex numbers: zw has argument α + β and norm |z| |w|.
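A short added example of (A.5): for z = 1 + i = \sqrt{2}\, e^{iπ/4} and w = i = e^{iπ/2} we get zw = \sqrt{2}\, e^{i3π/4} = −1 + i, which agrees with the direct computation (1 + i)i = i + i^2 = −1 + i.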


Appendix B

Solutions

Solutions of selected exercises from Chapter 1


Solutions of Section 1.1
(a) No solution.
(b) Infinitely many solutions.
(c) Unique solution x = 0, y = 0.
(d) Infinitely many solutions.
(e) No solution.
(f) Infinitely many solutions.

Solutions of Section 1.2
1.(a) Unique solution: x = 1/2, y = −30.
1.(b) Unique solution: x = 9/7, y = 2/7.
1.(c) Infinitely many solutions.
1.(d) No solution.
1.(e) Infinitely many solutions.
1.(f) Unique solution: x = 5/7, y = 2.
2. k ∈ R \ {19}.
3. k ∈ R \ {8 − \sqrt{66}, 8 + \sqrt{66}}.

Solutions of Section 1.4
1. 6.
2. a = 3, b = 1, c = 1.
3. If y = ax^2 + bx + c is such a parabola, then a + c = 5/2 and b = −3/2.
4. The possible options are:
• t ≠ 1/2, k = 10.
• t = 1/2, k ≠ 5.
• t = 1/4, k = 10.
5. 065, 164, 263, 362, 461, 560.
6. 10 customers owned dogs and 12 owned cats.
7. 75.
8. If driver A's speed is 57 km/h and driver B's speed is 49 km/h:
• Driver A arrives in Villavicencio at 6 am.
• The two drivers do not meet on the road.
If driver A's speed is 19 km/h and driver B's speed is 70 km/h:
• Driver A arrives in Villavicencio at 10 am.
• The two drivers meet on the road at 9 am.


Solutions of selected exercises from Chapter 2


Soluciones de Sección 2.1 Soluciones de Sección 2.4
−−→  8

1.(a) P Q = −31 .

1.(a) 16 .
√ 15
1.(b) 10. √15
1.(c) −1 0
. 1.(b) 210.
1.(d) − arctan 23 . 1.(c) −50.
1.(e) π − arctan 13 . 1.(d) 16~b.
2.(a) Sı́ forman un paralelogramo.
2.(b) No forman un paralelogramo. Soluciones de Sección 2.5
2.(c) Sı́ forman un paralelogramo. √
1.(a) 2√ 26.
1.(b) 26.
Soluciones de Sección 2.2 1.(c) Todos los P (x, y, z) tales que
# – # –
3.(a) Note que hay tres maneras cómo  × BP k 
kBC = 13, es decir,
acercarse a este problema: dos vectores ~a, ~b −7+9y−5z

6−9x+3z = 13. Por ejemplo,
son paralelos −1+5x−3y

1) si y solo si existen escalares µ, λ tal que P ((41 ± 7117)/106, 0, 0) sirve.
µ~a = λ~b; 2. 188.
2) si y solo si h~a , ~bi = k~ak k~bk;
q
5
3. 29 .
3) si y solo si las rectas que generan son 3
paralelas, es decir, no se intersecan en 4. Todos los ~a que son paralelos a 7 .
 3 2
exactamente un punto, lo que se puede 1
Los que tienen norma 1 son 62 7 y

verficar con el determinante. 2
3
4.(a)(i) α = −2. − √162 7 .
2
4.(a)(ii) α = 2. √
4.(a)(iii) α = −2(2 ± 3). Soluciones de Sección 2.6
4.(a)(iv) α = 0.
4.(a)(v) No existe tal α. 2.(a)(i) No son paralelas.
4.(b) Cuando α → ∞ el ángulo entre ~a y ~b 2.(a)(ii) No tienen ningún punto en común.
tiende a 3π 2.(a)(iii) P pertenece a L1 , P no pertenece a
4 . Cuando α → −∞ el ángulo entre
~ L2 .
~a y b tiende a π4 . y−2
2.(a)(iv) x−5
2 = 3 =
z−11
4 .
D

2.(b)(i) No son paralelas.


Soluciones de Sección 2.3 2.(b)(ii) Se cruzan en exactamente un punto,
1.(a) proj~a ~b = 29
11~
b, proj~b ~a = 11 a saber en (4, 5, −1).
10 ~
a.
1.(b) Todos los ~v que son perpendiculares a ~a, 2.(b)(iii) P pertenece a L1 , P no pertenece a
es decir todo L2 .
 los vectores de la forma 4.4.a(i) x − y − z = 0.
~v = t −31 .
1.(c) Todos √ los ~v = ( xy ) tales que 4.4.a(ii) Pnno pertenece
 a E.
1 1
 o
|x + 3y| = 2 10. Note que son todos los 4.4.a(iii) 0 +t −1 :t∈R .
−1 −1
vectores de la forma ~v = k~a2k ~a + t −3 1 . 4.4.b(i) 3x + 3y − 4z = −1.
Observe la relación con los vectores de (ii). 4.4.b(ii) P no pertenece a E.
y
¿Por qué es ası́? 4.4.b(iii) x−1 z+1
3 = 3 =− 4 .
1.(d) No. Sı́. No. 4.4.c(i) 3x + 2y + z = 4.


4.4.c(ii) P no pertenece a E.
4.4.c(iii) x = 3t + 1, y = 2t, z = t − 1, t ∈ R.

Soluciones de Sección 2.7


1.(a) Se intersecan en 52 , 2, 2 .


1.(b) Existen infinitas rectas con dicha


propiedad.
5. E ∩ F es un plano ó es una recta que pasa
por A y B.

Soluciones de Sección 2.9


1. Hint. Es rápido si usa proyecciones.
2. ~b = ( 31 ) y ~b = −3
−1 .
4.(a) Falso.
4.(b) Falso.

FT
4.(c) Verdadero.
4.(d) Verdadero.
7.(a)ii. Colisionan en el punto (7, 10, −1) y en
tiempo t = 2.
7.(b)ii. No colisionan.
7.(b)iii. Las dos estelas se mezclan en el
√ 20, −5).
punto (13,
13.(a) 3 6.
RA
14.(c) Q(−3, 5, 5). √
14.(d) La distancia obtenida es 11.
16.(b) Solo existe un único plano que cumple
tal condición.
17.(a) No.
17.(c) 34x − 5y − 9z = 0 es el único plano con
las propiedades deseadas.
18.(b) λ = µ = 12 .
D


Solutions of selected exercises from Chapter 3


Soluciones de Sección 3.1 3.(b) a = 2.

1 0 1 0
 3.(c) a = −2.
1.(a) .
0 1 1 −1
  Soluciones de 3.4
1 0 0  
1.(c) 0 1 2. 0 0
0 0 0 1.(e) D =  8 5.
  −6 −3
1 0 3 0 5 2.(b) No es posible efectuar la multiplicación.
1.(e) 0 1 2 0 −3/2.
0 0 0 1 3/2 
0 −2 5


1 0 0 2
 2.(d) 0 −6 15 .
1.(g) 0 1 0 2. 0 2 −5
0 0 1 −3 
19 −17 34

FT

1 0 0 0 1
 2.(f)  8 −12 20.
0 1 0 0 −3 −8
 −11 7
1.(g)  . 
0 0 1 0 2 3 5
4. A = .
0 0 0 1 5  1 2 
cos ϑ − sin ϑ
2. Los a, b, c ∈ R tales que a − 2b + c = 0. 6. .
sin ϑ cos ϑ
3. −x2 + 3x − 2.
5. El rolo pasó 6 dı́as en Medellı́n, 4 dı́as en 7. a = d, c = 0.
RA
Villavicencio y 4 dı́as en Yopal.
Soluciones de Sección 3.5
Soluciones de Sección 3.2 3. Una solución particular al sistema A~x = ~b
usando
 la inversa
 a derecha es
x  −2  3
2.(a) y = s 1 + t 0 . 4 −3
z 0 1
x 5  3 ~x =  0 0  ~b.
2.(b) yz = s 61 + t −20 . −1 1
w 0 1

2.(c) ( xy )
= t( 41 ). Soluciones de Sección 3.6
D

3. r ∈ R − {−3, 2}  
1 6 −3
1.a) 12 .
2 1
Soluciones de Sección 3.3 1.e) No tiene inversa.
1.(a) Sı́.
 
21 −3 −3 −6
1.(b) No.  4 −1 −4 1
1.(c) No. 1.f) 91 
 −1 −2
.
1 2
Las soluciones
 del sistema homogéneo son −15 6 6 3
x 5
 
y −3
=t
 
z 5 1 0 0
w 2
2. (a) No tiene solución. 1.h) 0 −1/5 0.
(b) No tiene solución. 0 0 3
3.(a) a ∈ R \ {2, −2}. 2.


(i) Solo tiene solución trivial. 10. La matriz B no es invertible.


 La matrizD
3 15 −6
(ii) Tiene solución no trivial. es invertible y D−1 = 57
1 
−12 −3 24.
15 −1 −11
 inversa si a 6= 2, −2 y ental caso
3. A tiene  
1 50 29
0 1 0 24. X = 19 .
−29 36
 1 2−a2 a2
A−1 =  4−a 2(a −4) .

2 2
2(a −4) 2
1 1 −2
a2 − 4 a2 −4 a2 −4

5.  
1 0 0
0 cos ϑ sin ϑ  .
0 − sin ϑ cos ϑ

Soluciones de Sección 3.7


2. α = 32 , β = 72 .

FT
3.

(a) No.

(b) Sı́.

(c) No.

(d) Sı́.
RA
4. No se puede concluir que AB es simétrica.

Soluciones de Sección 3.8


1.a) Sı́. 1.c) No. 1.f) No.
 
2 −3
3.a) Q21 (−2) .
0 0
D

 
1 0 0
3.(c) Q21 (5)P23 Q21 (2) 0 1 3.
0 0 0
4.b) E = Q31 (−4).
4.c) E = P13 .
4.e) E = Q12 (3).

Soluciones de Sección 3.10


7 11 7
2. − 2x + 2(x−2) − (x−2)2 .
 
5
7. 3/2.
3


Solutions of selected exercises from Chapter 4


Soluciones de Sección 4.1 4.c)Es invertible y su inversa
 es
1 −69 10 27
1. (a) -4, −12
1  8 18 4

1. (c) 47, .
38  7 11 −6 −1 
1. (e) 6, −11 −1 4 7
1. (g) 96. 9. (−1)n+1 b1 .

Soluciones de Sección 4.2


1. −3/5, −1, −1.
2. 18.
3. Para todo a ∈ R la matriz tiene inversa.
6.(a) 1 y −1.
6.(b) 0 y 1.

FT
7. Hint. Use que A es invertible.

Soluciones de Sección 4.3


2. 0.
3. 3.
4. Elija por ejemplo P como punto inicial y
−−→ −→
desarrolle det[P Q, P R]. Compare con el
RA
determinante de la fórmula dada.

Soluciones de Sección 4.4


10 26
1.b) x = 7 , y= 7 , z = − 47 .
1.d) x = 21 171 284 182
29 , y = 29 , z = − 29 , w = − 29 .
 
3 −6 2
1 
2.b) 14 2 10 −8.
D

−1 2 4
 
1 0 2 0
−1 1 2 −2
2.d)  .
 1 0 −3 3
2 −2 −2 3
4
3. − 37 .
5. Use la relación entre A y adj A.

Soluciones de Sección 4.6


2. t ∈ R \ {0, − 13 , −2}.


Solutions of selected exercises from Chapter 5


Soluciones de Sección 5.1 20.(d) Verdadero.
2. Sı́ es espacio vectorial. 20.(e) Verdadero.
4. No es espacio vectorial.
6. Sı́ es espacio vectorial.
7. Sı́ es espacio vectorial. Soluciones de Sección 5.4
9. Sı́ es espacio vectorial. 2.(b) Tiene dimensión 3.
10. No es espacio vectorial.
2.(e) Su dimensión es 0.
13. Sı́ es espacio vectorial.
2.(f) Tiene dimensión 2.
2.(g) Tiene dimensión 2.
Soluciones de Sección 5.2
3.(a) El conjunto dado es linealmente
1. Sı́ es subespacio de V . dependiente.
3. Sı́ es subespacio de V . 3.(b) El conjunto dado sı́ es base de W .
5. Sı́ es subespacio de V . 3.(c) El conjunto dado es base de R4 .

FT
7. No es subespacio de V .
3.(d) El conjunto dado solo tiene dos vectores.
8. No es subespacio de V .
3.(e) El conjunto dado es linealmente
11. Sı́ es subespacio de V .
dependiente.
14. Sı́ es subespacio de V .
   4. Una basedel  subespacio
  dado es
0 a

17.(b) W1 ∩ W2 = : a, b ∈ R .  2 0 0 
a 0       

  ,   , 0 .
0 4



 0   0   1 


−1 3 0
 
RA
Soluciones de Sección 5.3
  6. Hint. Si tiene una base de E, completela
0 a una base de R3 con el vector normal del
2.  1. plano.
−1 10. α ∈ R − {−1, 2}.
4. {1, x, x3 }.
3 2 11.(c) Sı́ existen. Hint. Modifique un poco la
6. {2x
 −  3x +x, 1}. 
 base canónica de Pn .
 1 0 1 
8. 0 , 1 , 1
3  2  5
 
D

Soluciones de Sección 5.5


 2 6   
9.  3  , 9 . 
 7 
 1  
1 −3
 
3 1. U ∩ V = span   0

14.(a) Son linealmente dependientes.

 
1
 
14.(c) Son linealmente dependientes.
U + V = {~x ∈ R4 : h~x , (−2, −1, 1, 1)i = 0} y
14.(e) Son linealmente independientes.
dim(U + V ) = 3.
14.(g) Son linealmente dependientes.
16. El conjunto dado es linealmente 3. La suma de V y W no es directa.
dependiente y su generado es el plano 4. Hint. Recuerde cuánto valen
x−y+z =0 dim Msym (3 × 3) y dim Masym (3 × 3).
20.(a) Falso. 5. (a) No.
20.(c) Verdadero. 5. (b) Sı́.


Soluciones de Sección 5.7


1.(a) Sı́ es subespacio.
1.(d) Sı́ es subespacio.
1.(f) No es subespacio.
1.(h) Sı́ es subespacio.
1.(j) No es subespacio.
4. No.
5. No.
26.(b) n2 − n.
26.(d) 5.
29.(a) Sı́.
29.(c) No.
29.(a) Sı́.

FT
RA
D


Solutions of selected exercises from Chapter 6


 
Soluciones de Sección 6.1  1 
ker T = span  1 .
2. No es transformación lineal. 
−1

4. Sı́ es transformación lineal. 2.(d) Im T = span{w}
~ y ker T es el plano
6. Sı́ es transformación lineal. x + 3y − z = 0.
8. Sı́ es transformación lineal. 5. Observe que Im A = Rm .
10. Sı́ es transformación lineal. 7. Hint. Recuerde que
12. Sı́ es transformación lineal. dim Im A + dim ker A = n.
14. No es transformación lineal.
15. 4. Im T, ker T son ambos el eje x.
15. 6. Im T esel plano
 x + y−z =0 y Soluciones de Sección 6.3
6 −6  
b−a

 
   
4 −3

ker T = span  0  0.
, 2. [p(X)]B =  b .
c − 2b + a

 
1 1
 

FT
x −x
15. 8. Im T =span{x 3
, 1} y 3.(a) Recuerde que sinh x = e −e 2 ,
ex +e−x
 
6 cosh x = .
2
 
ker T = span  2 .

1 0 0
−1
 
3.(b) A = 0 1/2 −1/2
15.)10. Im T = M (2 × 2) y ker T = {O2×2 }. 0 1/2 1/2
15.)12. Im T = R y ker T son las matrices
 de  
a11 a12 a13 c+b+a
la formaa21 a22 a23 4. [a + bX + cX 2 ]B =  2c + b .
RA

a31 a32 −(a11 + a22 ) c
 
cos ϑ sin ϑ
17. Im T = span{w}~ y 6. ABϑ →can = y
− sin ϑ cos ϑ
ker T = {~x ∈ Rn : ~x ⊥ w}.
~  
cos ϑ − sin ϑ
20. No existe. Acan→Bϑ = .
sin ϑ cos ϑ
 √   
−3 3 −3
Soluciones de Sección 6.2 8.(a) = .
−3 Bϑ
0
  √ 
2.(a) Im T es 
elplano x − 2z = 0 y 1 2
D

8.(b) = .
−1 B

 1  ϑ
0
ker T = span  0 .

−2
 9. Hint. Use la relación
  
1  ABϑ1 →Bϑ2 = ABϑ1 →can Acan→Bϑ2 .


  
−1

3
2.(b) Im T = R y ker T = span   .


 1  Soluciones de Sección 6.4
2
 
 
0 1
   
1 −1  1. 4.

  0 0
5  4

   
 
 
2
2.(c) Im T = span   ,   y
   −3 1 −1 2 3
1  0 1. 6. 0 1 4 3.

 
 

0 2 1 0 6 6
 


 
0 1 2
0 0 0
1. 8. 
 .
0 0 0
1 −3 0
 
2 0 0 0
4 1 0 0
1. 10. 
0
.
0 2 0
0 0 4 1

1. 12. 1 0 0 0 1 0 0 0 1 .
 
−1 0 0 0 0
 0 0 0 0 0
 
 0 0 1 0 0.
2.(a)  
 0 0 0 2 0
0 0 0 0 3

FT
2.(e) (1, 12 , 31 , . . . , n+1
1
).
 
1 0 1
3.(b) .
0 3 1
 
0 4/5 2/5
3.(d) .
3 7/5 16/5
4.(a) Hint. Suponga una combinación lineal
de vectores de B igualada a 0, evalúe en
x = 0, x = π.
RA
 
0 1 1 0
−1 0 0 1
4.(b) [D]B =   0 0 0 1.

0 0 −1 0
~ = (4, −3, 0, π)t .
8.(a) w
~ = (19, −18, 3, −5)t .
8.(b) w

Soluciones de Sección 6.6


D

20.(b) Se cumple para los polinomios con


coeficiente independiente cero.
20.(c) Se cumple para los polinomios en P3 .
21. Hay varias opciones, la más natural es
considerar T ~x = h~x , wi.
~
22.(a) Observe que si n = 2 entonces
dim ker ϕ = 1 y si n = 3 entonces
dim ker ϕ = 2.
22.(b) Elija ϕ tal que su kernel (que es una
recta) no pase por ninguno de los vectores
dados.


Solutions of selected exercises from Chapter 7


Soluciones de Sección 7.1 Soluciones de Sección 7.4
1.(a) No es base ortonormal de V 1. √5 .
√6
1.(c) Sı́ es base ortonormal de V . 2. 59.
1.(d) Sı́ es base ortonormal de V . 3.(b) T (~x) = ~x − 2 projL⊥ ~x.
2. a = − 85 , b = − 21
8 .
Ok×(n−k)
 
idk
4.(d) Hint. Escoja una base ortonormal del 4.(a) [PW ]B =
plano 3x + 2y + 5z = 0.
O(n−k)×k O(n−k)×(n−k)
5. Falso. 4.(c) Recuerde como se expresa [PW ]can en
términos de [PW ]B .
4.(c) No hace falta hacer cuentas, aplique el
Soluciones de Sección 7.2 inciso anterior.
 
cos ϑ − sin ϑ 0
1. La matriz  sin ϑ cos ϑ 0 rota en un Soluciones de Sección 7.5
0 0 1

FT
   
ángulo ϑ el plano xy.  1 1 
2. Hint. Recuerde que AB→can es ortogonal. 1. Partiendo de la base  0 , 1 se
−1 0
 
4.(a) Falso.   
4.(a) Verdadero.  1 1 
obtiene: √12  0 , √16 2 .
−1 1
 
Soluciones de Sección 7.3
   
1 2
 
5
RA
1.(a) span . 0 −1
1 2. Sea W el generado de 
1,  0.
    
7
1.(c) La recta t 5. 0 1
1 Encuentre una base para W ⊥ y aplique
1.(d) Gram-Schmidt en la base dada de W y en la
   complemento ortogonal es
Una base del
base obtenida de W ⊥ .
 −5
 −7    
  3  5 0
 ,   .
 1  0 1 1
 
3. projW ~v = 2   y la distancia de ~v a W

 
0 1
D

2
 
      1
3 1 1 es √12 .
3.(a) =− +4 . Una base
5 −1 1
  
1 4. Observe
que dim
 ImA =
3 yuna base de
ortornormal de W ⊥ es √12 . 1 1 2
1 
 

 3  1   1 
     
10 10 0 Im A es  ,  ,   .
   0  −1
 2

3.(c) −1 = −1 + 0. Una base

7 −1 1
 
6 6  0 Aplicando
    Gram-Schmidt
  se obtiene
 
 1 0   1 11/3 2 
ortonormal de W ⊥ es 0 , √137  1  .
 
3 4 −1
      
1 1 1
√   , √35 
  , √15   .
 
0 −6 3 7 2 1/3  3 
 

 
7 7/3 1
 


Soluciones de Sección 7.8


⊥ ⊥
2. dim
U =  dim
 U =  2. Una base para U

 1 2 
   
−2 −3

es   ,   .


 1 0  
0 1
 
 
13 −2 −3
1 
3.((a))iii 14 −2 10 −6.
−3 −6 5
 
3 1 1 −1
1 1 1 1
3.((b))iii 14 
 1 1 1 1 .

−1 1 1 3

5. Una base
    ortonormal para
U  es
4 0 2

FT

 

0 1  0
   
√ 1  ,  , √  1  .
17  0 0 357 −17
 
−1 0 8
 
RA
D


Solutions of selected exercises from Chapter 8


Soluciones de Sección 8.1 13.(b) El polinomio caracterı́stico de A es λn .
14. Los valores propios de A son 0 y 1.
1. Al reducir la matriz se obtiene la
identidad. 16. Si A no es diagonalizable la afirmación es
2. falsa.
 Unabase ortonormal
 para Im A es
 1−i 
√1  3i  y una base ortonormal para
 28 Soluciones de Sección 8.4
4+ i

   
 −i −2 − 2i  3.(a) No.
ker A es √12  1  , √117 −2 + 2i .
3.(a) Sı́.
0 1
 
4. a = 1, b = 0, c = 0, e = 2, f = −2. 4. Las tres afirmaciones son falsas.
6.(a) Sı́. 5. A20 = id3x3
6.(b) No. 6. k 6= 5t ó t = 5, k = 25.
7.(c) La afirmación del inciso (b) en Cn no es 7. Exprese A = QDQ−1 donde D es la matriz

FT
válida. de valores propios de A, por ende
(A − d1 idn )(A − d2 idn ) . . . (A − dk idn ) =
Q(D − d1 idn )(D − d2 idn ) . . . (D − dk idn )Q−1 .
Soluciones de Sección 8.2
9.(c) Si no suponemos que C 6= O2×2 la
3.(a) Falsa. conclusión de 9.(b) es falsa.
3.(b) Falsa.
3.(c) Falsa.
5.(c) Use que B = CAC −1 y la propiedad del Soluciones de Sección 8.5
RA
inciso (b).
5.(d) Observe que las entradas de la diagonal 2.(a) Los valorespropios 
de A son 1 y 2.
 
de At A son las normas al cuadrado de los  0 
vectores columna de A. 2.(b) E1 = span  3  y E5 es el plano
−4
 
3y − 4z = 0.
Soluciones de Sección 8.3
2. Su polinomio caracterı́stico es λ4 . 3.A =   
1 1+i 1 8 0 1−i 1
3. El polinomio caracterı́stico de D es .
D

3 1 −1 + i 0 −1 1 −(1 + i)
(λ − 1)(λ − 2)(λ − 3)(λ − 4).
4. Si empieza diagonalizando
  A se  obtiene
5. Escoja una base ortonormal de L⊥ y
 
1 −1 2 0 2 1
A = 31 .
1 2 0 5 −1 1 complétela a una base de R3 con un vector
5. Los valores propios de T son 1 y −1. unitario de L. La representación
 matricial
 de
6. Resuelva la ecuación diferencial y = λy 0 −1 0 0
sujeta a la condición inicial y(0) = 0. T en esta base será  0 −1 0
8. Si λ es valor propio de A, existe ~x 6= 0 tal 0 0 1
que A~x = λ~x. Multiplique por A−1 .
10. Observe que ~x 6= 0 es un vector propio de 6. Use el teorema 8.55*
T si ~v × ~x = λ~x. ¿Para cuáles λ la igualdad 7.(a) Use el teorema 8.55
anterior es cierta? 7.(b) Use el teorema 8.55 y el ejercicio 8. de
11. Vea el ejercicio 4.(a) de la sección 7.4 la sección 8.1.


0 2
Soluciones de Sección 8.6 1.(e)
 5(y) = 4con
 ejes de rotacion
√1
−2 1
1.(a)(x0 )2 + 11(y 0 2
 ) = 4 con ejes de rotación , √15 .
5 1 2
√1
1 −3
10 3
, √110 .
1 y0
y

y x0
2

2 2
1

y0
1
√2 x
11 −2 −1 1 2
−1
−2 −1 1 2
x x0
−2

FT
−1

1.(g) 4(x0 )√
2
+ 6(y 0 )2 = −4.
−2
3. 3x − 4 3xy + 7y 2 = 9
2

Soluciones de Sección 8.8


0 2 0 2
 −(x
1.(d)  ) +3(y) = 6 con ejes de rotación 6.(b) Los autovalores de A son 4, 7, 10.
1 −2
RA
√1 , √15 y ası́ntotas    
5 2 1 0 0 A −1 1 0
√ √ 7. D = ye =C C.
y= −(1+2 3)
√ x, y= 2 3−1
√ x. 0 5 0 e5
2− 3 2+ 3
8.(a) Los valores propios de Φ son 1 y −1 y
y los espacios propios son Msym (n × n),
x0 Masym (n × n).
8.(b) El único valor propio de T es 3 con
espacio propio asociado span{1}.
8.(c) Los valores propios son 1, −1 y los
D

y0 espacios propios
  son el plano x + 2y + 3z = 0
1
y la recta t 2.
3
x 9.c Los autovalores de T son los valores de la
2 2 2 2
forma − kLπ2 donde k ∈ N y para cada − kLπ2 ,
2 2
su espacio propio es span{sin kLπ2 t}.

Index

|A| (determinant of A), 136 additive inverse, 163


k · k, 32, 46, 308 adjoint matrix, 312
⊕, 202 adjugate matrix, 153
h· , ·i, 35, 46, 308 affine subspace, 169
^(~v , w),
~ 37 algebraic multiplicity, 325
k, 37 angle between ~v and w,~ 37
⊥, 37, 279, 309 angle between two planes, 59
∼, 315 antisymmetric matrix, 118

FT
×, 48 approximation by least squares, 293
∧, 48 argument of a complex number, 367
C, 363 augmented coefficient matrix, 13, 78
Cn , 307
M (m × n), 78 bases
R2 , 27, 30 change of, 237
R3 , 48 basis, 190
Rn , 45 orthogonal, 271
RA
Eigλ (T ), 320 bijective, 219
Im, 363
L(U, V ), 218, 265 canonical basis in Rn , 191
Masym (n × n), 118 Cauchy-Schwarz inequality, 38, 310
Msym (n × n), 118 change of bases, 237
Pn , 174 change-of-coordinates matrix, 240
Re, 363 characteristic polynomial, 322
Sn , 135 coefficient matrix, 13, 78
U ⊥ , 279 augmented, 13, 78
D

dist(~v , U ), 287 cofactor, 137


adj A, 153 column space, 229
arg, 367 commutative diagram, 251
gen, 177 complement
pA , 322 orthogonal, 279
projU ~v , 285 complex conjugate, 363
projw~ ~v , 43, 47, 309 complex number, 363
span, 177 complex plane, 363
v̂, 47 component of a matrix, 13
~vk , 42 composition of functions, 98
~v⊥ , 42 cross product, 48

additive identity, 31, 161, 365 determinant, 19, 136


expansion along the kth row/column, 138 inverse matrix, 112


Laplace expansion, 138 2×, 113
Leibniz formula, 136 invertible, 106
rule of Sarrus, 139 isometry, 277
diagonal, 117
diagonalisable, 316 kernel, 219
diagram, 251, 322
commutative, 251 Laplace expansion, 138
dimension, 193 least squares approximation, 293
direct sum, 197, 202 left inverse, 107
directional vector, 55 Leibniz formula, 136
distance of ~v to a subspace, 287 length of a vector, see norm of a vector
dot product, 35, 46, 308 line, 54, 199
directional vector, 55
eigenspace, 320 normal form, 57
eigenvalue, 317, 319 parametric equations, 56
eigenvector, 317, 319 symmetric equation, 56

FT
elementary matrix, 120 vector equation, 55
elementary row operations, 79 linear combination, 176
empty set, 177, 179, 181, 190, 193 linear map, 217
entry, 13 linear maps
equivalence relation, 315 matrix representation, 248
Euler formulas, 366 linear operator, see linear map
expansion along the kth row/column, 138 linear span, 177
linear system, 12, 77
RA
field, 364 consistent, 12
finitely generated, 179 homogeneous, 12
free variables, 85 inhomogeneous, 12
solution, 12
Gauß-Jordan elimination, 83 linear transformation, see linear map
Gaußian elimination, 83 matrix representation, 250
generator, 177 linearly dependent, 181
geometric multiplicity, 320 linearly independent, 181
Gram-Schmidt process, 290 lower triangular, 117
D

Hölder inequality, 310 magnitude of a vector, see norm of a vector


hermitian matrix, 312, 338 matrix, 78
homogeneous linear system, 12 adjoint, 312
hyperplane, 54, 199 adjugate, 153
antisymmetric, 118
idempotent matrix, 148 change-of-coordinates, 240
image of a linear map, 220 coefficient, 78
imaginary part of z, 363 cofactor, 137
imaginary unit, 363 column/row space, 229
inhomogeneous linear system, 12 diagonal, 117
injective, 219 diagonalisable, 316
inner product, 35, 46, 308 elementary, 120


hermitian, 312, 338 orthogonal system, 270


idempotent, 148 orthogonal vectors, 37, 309
inverse, 112 orthogonalisation, 290
invertible, 106 orthonormal system, 270
left inverse, 107 overfitting, 299
lower triangular, 117
minor, 137 parallel vectors, 37
orthogonal, 148, 275 parallelepiped, 52
product, 99 parallelogram, 51
reduced row echelon form, 81 parametric equations, 56
right inverse, 107 permutation, 135
row echelon form, 81 perpendicular vectors, 37, 309
row equivalent, 83 pivot, 81
singular, 106 plane, 54, 199
snymmetrix, 338 angle between two planes, 59
square, 78 normal form, 59
symmetric, 118 polar represenation of a complex number, 368

FT
transition, 240 principal axes, 346
unitary, 312 product
upper triangular, 117 inner, 35, 46, 308
matrix representation of a linear product of vector in R2 with scalar, 30
transformation, 250 projection
minor, 137 orthogonal, 285
modulus, 363 proper subspace, 167
Multiplicative identity, 365 Pythagoras Theorem, 286, 309
RA
multiplicity
algebraic, 325 radius of convergence, 366
geometric, 320 range, 220
real part of z, 363
norm, 363 reduced row echelon form, 81
norm of a vector, 32, 46, 308 reflection in R2 , 256
normal form reflection in R3 , 258
line, 57 right hand side, 12, 77
plane, 59 right inverse, 107
D

normal vector of a plane, 59 row echelon form, 81


null space, 219 row equivalent, 83
row operations, 79
ONB, 271 row space, 229
one-to-one, 219
orthogonal basis, 271 Sarrus
orthogonal complement, 279, 279 rule of, 139
orthogonal diagonalisation, 338 scalar, 28
orthogonal matrix, 148, 275 scalar product, 35, 46, 308
orthogonal projection, 285, 286 sesquilinear, 309
orthogonal projection in R2 , 42 sign of a permutation, 135
orthogonal projection in Rn , 47, 285, 309 similar matrices, 315
orthogonal projection to a plane in R3 , 258 snymmetrix matrix, 338


solution
vector form, 86
span, 177
square matrix, 78
standard basis in Rn , 191
standard basis in Pn , 191
subspace, 167
affine, 169
sum of functions, 98
surjective, 219
symmetric equation, 56
symmetric matrix, 118
system
orthogonal, 270
orthonormal, 270

trace, 319, 324

FT
transition matrix, 240
triangle inequality, 33, 39, 310
trivial solution, 89, 180

unit vector, 33
unitary matrix, 312
upper triangular, 117
RA
vector, 31
in R2 , 27
norm, 32, 46, 308
unit, 33
vector equation, 55
vector form of solutions, 86
vector product, 48
vector space, 31, 161
direct sum, 202
D

generated, 177
intersection, 201
polynomials, 174
spanned, 177
subspace, 167
sum, 202
vector sum in R2 , 30
vectors
orthogonal, 37, 309
parallel, 37
perpendicular, 37, 309
