
Notes on Linear Algebra I

Matrices and determinants


2023

José J. Ramón Marı́, PhD


Chapter 1

Matrices and linear maps

1.1 Matrices
The term was coined by James Joseph Sylvester in the 19th century.

1.1.1 Definitions and basic properties


A matrix with coefficients in a field K is a rectangular array of m rows and n columns,
i.e. A = (a_{ij}), where the a_{ij} are scalars. The usual notation is M_{m×n}(K), where K is the field
of scalars chosen.
We denote the rows of a matrix A as follows: A^i denotes the i-th row. Likewise, the
i-th column is denoted by A_i. We usually denote the coefficient in the i-th row and j-th
column by a_{ij} (only exceptionally by a^i_j).
There are two operations on matrices of a fixed size: addition, (A + B)_{ij} = A_{ij} + B_{ij}, and multi-
plication by scalars, (λA)_{ij} = λ A_{ij}. These make M_{m×n}(K) into a vector space over K, of
dimension mn. There is an obvious canonical basis e_{ij} for M_{m×n}(K). A square matrix
of order n is one such that m = n.
There is a product between matrices of compatible sizes: if A is an m × n matrix and B
is an n × p matrix, the product AB shall be an m × p matrix, and it is defined as follows.
\[
(AB)_{ij} = A^i B_j = (a_{i1} \;\; \dots \;\; a_{in}) \begin{pmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{pmatrix} = \sum_{k=1}^{n} a_{ik} b_{kj}.
\]
Thus, the matrix product AB satisfies the obvious identities:
\[
AB = A\,(B_1 \;\; B_2 \;\; \dots \;\; B_p) = \begin{pmatrix} A^1 \\ A^2 \\ \vdots \\ A^m \end{pmatrix} B.
\]
Thus, the i-th row of AB is A^i B, and the i-th column of AB is A B_i.
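As a quick sanity check of the entry formula above, here is a minimal Python sketch (the matrices A and B below are arbitrary examples, not taken from the notes) comparing the entrywise sum with numpy's built-in product.

```python
import numpy as np

def matmul_entrywise(A, B):
    """Compute AB via (AB)_{ij} = sum_k a_{ik} b_{kj}."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must agree"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return C

A = np.array([[1., 2., 0.], [3., -1., 4.]])      # 2 x 3
B = np.array([[2., 1.], [0., 5.], [-1., 3.]])    # 3 x 2
print(np.allclose(matmul_entrywise(A, B), A @ B))  # True
```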

The set of matrices of size m × n, i.e. m rows and n columns, over R or C, is a vector space,
which we denote by M_{m×n}(R) (resp. C). Actually, it is like R^{mn}, if we choose an ordering
of the coefficients. The standard basis of M_{m×n}(R) consists of the matrices e_{ji}, where
(e_{ji})_{kℓ} = 1 if j = k and i = ℓ, and zero otherwise. We chose this ordering of the subindices
due to the fact that e_{ji} e_i = e_j. Note that e_{kj} e_{ji} = e_{ki}, and e_{kℓ} e_{ji} = 0 if ℓ ≠ j. Here, the
matrices on the right are in M_{m×n}(R), and those on the left are in M_{p×m}(R).
The identity matrix I = I_n ∈ M_n(K) is defined by I_{ij} = 1 if i = j, and 0 if i ≠ j. For every
m × n matrix A one has A I_n = I_m A = A.

Proposition 1.1.1 (Associativity of matrix product) If A is m × n, B is n × p, and
C is p × q, one has the following identity:

(AB)C = A(BC).

Proof: It suffices to show that the (i, j) coefficient of both sides is \sum_{k_1} \sum_{k_2} a_{i k_1} b_{k_1 k_2} c_{k_2 j}.

The transpose of an m × n matrix A = (a_{ij}), denoted by A^T, is the n × m matrix defined
by (A^T)_{ij} = a_{ji}. One easily checks that (AB)^T = B^T A^T, etc.

Connection with the inner product in R^n: If u, v ∈ R^n are two (column) vectors, then

u^T v = u • v = v^T u.

1.1.2 Block multiplication


A matrix may be defined by blocks. Namely, if A, B, C, D are matrices, where A is r × r,
B is r × s, C is s × r and D is s × s, we may define the (r + s) × (r + s) matrix M by
juxtaposition of these blocks:
\[
M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}.
\]
If we have a matrix A of size m × n and another matrix B of size m × p, we may grab
two matrices E, F of respective sizes n × q and p × q and define the block matrix product
\[
N = (A \;\; B) \begin{pmatrix} E \\ F \end{pmatrix}.
\]
Indeed, the explicit formula is given by fixing the i-th row of (A B) and the j-th column
of the stacked matrix formed by E and F. The result is
\[
N_{ij} = (A^i \;\; B^i) \begin{pmatrix} E_j \\ F_j \end{pmatrix} = A^i E_j + B^i F_j,
\]
so we have proven the following result.

Proposition 1.1.2 The block matrix product above yields
\[
(A \;\; B) \begin{pmatrix} E \\ F \end{pmatrix} = AE + BF.
\]
This is the case for all block matrix products. For instance, assuming that all matrix
products involved are possible (which is the case if all blocks are square matrices of the
same order n), one has
\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} E & F \\ G & H \end{pmatrix} = \begin{pmatrix} AE + BG & AF + BH \\ CE + DG & CF + DH \end{pmatrix}.
\]
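The 2 × 2 block identity above is easy to verify numerically. A minimal sketch, with randomly chosen square blocks of the same order n (an assumption made only so that all the products exist):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A, B, C, D, E, F, G, H = (rng.standard_normal((n, n)) for _ in range(8))

M = np.block([[A, B], [C, D]])
N = np.block([[E, F], [G, H]])
expected = np.block([[A @ E + B @ G, A @ F + B @ H],
                     [C @ E + D @ G, C @ F + D @ H]])
print(np.allclose(M @ N, expected))  # True
```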


1.2 Linear maps and matrices


Definition A linear map between two vector spaces E, F over a field K is a map
T : E → F that satisfies the following:

T (λu + µv) = λT (u) + µT (v),

for scalars λ, µ ∈ K and vectors u, v ∈ E.

Example 1.2.1 Let C^1(R) be the vector space of functions of class C^1 on R (differentiable
at each point x ∈ R, and such that f′ is continuous). The function T(f) = f′(0) is
a linear map T : C^1(R) → R.

Example 1.2.2 Let A be an m × n matrix. A defines automatically a function Rn → Rm


(the order is reversed!), that is linear: T (x) = TA (x) = Ax. Indeed, T (λx + µy) =
A(λx + µy) = λAx + µAy = λT (x) + µT (y).

Proposition 1.2.3 Given a linear map T : E → F , there are two vector subspaces
associated with T : the kernel of T , ker T ⊂ E, defined as {x ∈ E|T (x) = 0} and the
image of T , Im T = T (E) ⊂ F are vector subspaces of E, F respectively.

Proof: Indeed, if λ, µ ∈ K and u, v ∈ ker T , then T (λu + µv) = λT (u) + µT (v) = 0, so


λu + µv ∈ ker T . We leave the case of Im T to the reader. 

Kernel and image of a matrix: From Example 1.2.2 we may define kernel and image
of a matrix A. The kernel, ker A, is the subspace given by ker A = {x ∈ Rn : Ax = 0}.
This automatically yields the kernel as the solution set of a system of homogeneous linear
equations.
The image, Im A, is the subspace {Ax : x ∈ Rn } ⊂ Rm . Since Ax is the linear combination

Ax = x1 A1 + . . . + xn An ,

we see that the image Im A = ⟨A_1, A_2, . . . , A_n⟩, the span of the columns of A.

Definition The rank of a linear map (resp. a matrix) is the dimension of its image,
rk T = dim Im T (resp. rk A = dim Im A). In other words, the rank of a matrix is
dim⟨A_1, . . . , A_n⟩. The nullity of a linear map (resp. matrix) is the dimension of its
kernel, dim ker T (resp. dim ker A).

The following result elucidates the relationship between linear maps and matrices (this
will suffice for this semester).
Theorem 1.2.4 Let T : Rn → Rm be linear over R. Then T is of the form given in
Example 1.2.2.
Proof: First of all, note that for T_A(x) = Ax one has T_A(e_i) = A_i. This shows us
precisely what we need to prove: it is the matrix A the columns of which are the vectors
T(e_i) ∈ R^m. Now, let T be an arbitrary linear map from R^n to R^m. For a vector x ∈ R^n,
x = \sum_i x_i e_i, one has:
\[
T(x) = T\Big(\sum_i x_i e_i\Big) = \sum_i x_i T(e_i) = \big(T(e_1) \;\cdots\; T(e_n)\big) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = Ax,
\]
as desired. □
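Theorem 1.2.4 says that the matrix of a linear map T : R^n → R^m has T(e_i) as its i-th column. A small Python sketch, using an arbitrary example map (not one taken from the notes):

```python
import numpy as np

def matrix_of(T, n):
    """Matrix of a linear map T : R^n -> R^m; its columns are the images of the basis vectors."""
    return np.column_stack([T(e) for e in np.eye(n)])

# Example: T(x, y, z) = (x + 2y, 3z) is linear from R^3 to R^2.
T = lambda v: np.array([v[0] + 2 * v[1], 3 * v[2]])
A = matrix_of(T, 3)
x = np.array([1.0, -2.0, 0.5])
print(A)                          # [[1. 2. 0.]
                                  #  [0. 0. 3.]]
print(np.allclose(A @ x, T(x)))   # True
```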

Proposition 1.2.5 Let A be an m × n matrix, where n > m. Then ker A ≠ {0}.

Proof: The matrix has n columns A_i ∈ R^m, which cannot be linearly independent, since n > m; a nontrivial linear relation among the columns is precisely a nonzero element of ker A. □

1.3 Inverses
A square matrix A of order n has an inverse B, also n × n, if AB = BA = I. (The
results in this section may be treated with determinants and adjugates, but we assume
no previous knowledge of that subject here.)

Proposition 1.3.1 If a square matrix A satisfies AB = I, then also BA = I. Thus, in


dealing with square matrices, a left inverse is also a right inverse, and vice versa.

Proof: Suppose that AB = I. It follows that the columns A_i of A span R^n (indeed, any
y ∈ R^n may be written y = A(By), a linear combination of the columns of A). This also
means that the columns A_i are a basis of R^n, by invariance of dimension (if they were not
LI, cancelling out the unnecessary ones would eventually produce a basis of R^n with fewer than
n elements!). Thus, ker A = {0}.
We shall show that BA − I = 0. Clearly, A(BA − I) = 0, but this means that every
column of BA − I is in ker A = {0}, so indeed BA = I, as desired.
Suitable use of the transpose settles the remaining claim; however, if one should wish to
write out all details, suppose now that BA = I. Again, the columns of A form a basis,

for they are clearly linearly independent (Ax = 0 ⇒ x = BAx = 0) and invariance of
dimension applies. To prove that AB = I, again consider (AB−I)A = 0. This means that
Im A ⊂ ker(AB − I), but since Im A = Rn is the span of the columns of A, AB − I = 0.


Exercise 1.3.2 Let A be a square matrix of order n. Show that, if A is invertible, then so is A^T,
and (A^T)^{-1} = (A^{-1})^T.

1.4 Matrices and systems of linear equations


One may turn a system of m linear equations in n unknowns x_i into an equation of
the form Ax = b, where A is m × n, x is the (column) vector of unknowns x_i and b is a
column vector of size m (the vector of constant terms).

Proposition 1.4.1 The following statements hold for a system of linear equations Ax = b
as above.

1. ker A is the solution set to the associated homogeneous system Ax = 0.

2. A solution exists to Ax = b if and only if b ∈ Im A.

3. Assume that a solution exists, say, x0 ∈ Rn : Ax0 = b. Then: any solution x to


Ax = b is of the form x = x0 + h, where h ∈ ker A. In other words, general solution
= particular solution + homogeneous solution.

Proof:

1. Obvious.
2. A solution is a set of coefficients x_i which expresses b = \sum_i x_i A_i, i.e. as an element
of Im A. Therefore, the system is solvable (consistent) if and only if b ∈ Im A.

3. If such a solution x0 exists, a vector x is a solution if and only if Ax = Ax0 = b,


so x is a solution if and only if 0 = Ax − Ax0 = A(x − x0 ). This is equivalent to
saying, x − x0 = h ∈ ker A. Thus, any solution x is written as x0 + h, where h is a
solution of the associated homogeneous system. 
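A minimal numerical illustration of item 3 (the matrix, right-hand side, and kernel vector below are ad hoc examples): any particular solution plus a kernel element is again a solution.

```python
import numpy as np

A = np.array([[1., 2., 1.],
              [0., 1., 1.]])               # 2 x 3, so ker A is nontrivial
b = np.array([3., 2.])

x0 = np.linalg.lstsq(A, b, rcond=None)[0]  # a particular solution (the system is consistent)
h = np.array([1., -1., 1.])                # an element of ker A: A h = 0
print(np.allclose(A @ x0, b))              # True
print(np.allclose(A @ (x0 + 5 * h), b))    # True: x0 + 5h solves the same system
```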

Corollary 1.4.2 If A is a square matrix and AN = I, then N = A−1 . Likewise, if


M A = I, then M = A−1 . In particular, the inverse is unique.

Proof: Assume that M A = I. Multiplying by A^{-1} (obtained, for instance, through the
adjugate of A) on the left yields M = A^{-1}. The case AN = I is likewise settled (right
multiplication). □

1.5 Problems
1.5.1 Let B = (bij ) be an n × n matrix, with zero diagonal, i.e., bii = 0 for all i. Prove
that there are n × n matrices X, Y such that XY − Y X = B. (Hint: Assume that X is
a suitable diagonal matrix.)

1.5.2 Let A be an n × n square matrix. Show that if the matrix G defined by G_{ij} =
A_i • A_j is invertible, then det A ≠ 0.

1.5.3 Let A, B be square matrices of order n, such that I − AB is invertible. Show that
I − BA is invertible, and that

(I − BA)−1 = I + B(I − AB)−1 A.

Chapter 2

Determinants

The ground field K is taken to be K = R or C, as usual. Let us go back for a moment


to the picture we showed in Chapter (vectors?). If u = (a, b), v = (c, d) ∈ R2 then the
oriented area of the parallelogram formed by u, v is given by the formula:
\[
\begin{vmatrix} a & c \\ b & d \end{vmatrix} = ad − bc = |u|\cdot|v| \sin(β − α),
\]
where α, β are the respective arguments of u, v.
If we swap u, v, the result changes sign. Also, given the sign of the sine function, (figure
missing on positive and negative orientations) ...

2.1 Learning lessons from R3


Given vectors u, v, w ∈ R^3, the oriented volume of the parallelepiped formed by u, v, w in
this order satisfies a series of properties, and each one is eloquently explained by a figure
(all figures missing here!).
det(u, v, w) = (u × v) • w views u, v as the (oriented) base and w as the “height provider”.
If w is parallel to the base, i.e. w ∈ ⟨u, v⟩, then det(u, v, w) = 0.
Let n = (u × v)/‖u × v‖. If w = a·n + αu + βv, then

(u × v) • w = a ‖u × v‖,

and the figures provide a visual explanation for the fact that (u × v) • [w + λu + µv] =
(u × v) • w. Note that Cavalieri’s principle provides a visual proof for additivity!
The behaviour of this oriented volume with respect to permutation of the arguments is
well-known: its sign changes for transpositions, i.e. whenever we swap two vectors, e.g.
swapping u, w we get (u × v) • w = −(w × v) • u, etc.. In fact, should we wish to consider
u, w to form the base of the parallelopiped, we write
(u × v) • w = (w × u) • v = −(u × w) • v,
and proceed accordingly.
Inspired by this example, one may list a set of axioms that a function called ‘signed
volume’ or ‘oriented volume’ should satisfy, and then come up with the determinant.

2.2 Alternate n-linear forms on K n
We have working knowledge of 2 × 2 and 3 × 3 determinants. We shall guide ourselves by
this geometrical knowledge to construct an n-dimensional generalisation, which shall be
the determinant of a square matrix of order n. First we shall list the essential properties
of the determinant.
Let V : R^n × · · · × R^n → R (n factors) be an ‘oriented volume function’. From our 2- and
3-dimensional experience, we distilled the following rules.
DET0) (normalization) V (e1 , e2 , . . . , en ) = 1;
DET1) (multilinearity) the function V is n-linear, i.e. linear on each variable if we fix
the arguments in the others. To wit,

V (λv + µw, u2 , . . . , un ) = λV (v, u2 , . . . , un ) + µV (w, u2 , . . . , un ),

and in general

V (u1 , . . . , ui−1 , λv + µw, ui+1 , . . . , un ) =


λV (u1 , . . . , ui−1 , v, ui+1 , . . . , un ) + µV (u1 , . . . , ui−1 , w, ui+1 , . . . , un );

DET2) (alternate) the function V is alternate, i.e. if we swap two arguments, the value
of V changes precisely in a sign:

V (u1 , . . . uj (place i), . . . , ui (place j), . . . , un ) = −V (u1 , . . . , ui , . . . , uj , . . . , un ).

An obvious consequence (and reformulation!) is that, if we enter the same vector v


in two different arguments, then

V (u1 , . . . , v, . . . , v, . . . , un ) = 0.

Explanation:

• DET1) has the following explanation behind it. If we consider the
i-th component to be the height carrier, and the other n−1 vectors to form the base
of the n-dimensional parallelepiped, then linearity is what we expected based on
our 2- and 3-dimensional experience.

• DET2) contains the following. If the height-carrying vector is parallel to the base
(i.e. the span of the remaining n−1 vectors), then the determinant must be zero! (For
further details, see Theorem 2.3.2 below.)

• DET2) contains some information regarding orientation. If we permute the vectors
u_1, · · · , u_n, then the value changes by at most a sign.

• DET0) carries the normalization value, i.e. it fixes as 1 the volume of the canonical
parallelepiped, with the natural order of the vectors (since it is a signed volume,
the vectors e.g. e_2, e_1, e_3, · · · , e_n will yield the value −1 after normalising the value
of V(e_1, · · · , e_n) = 1).

Let us retrieve the determinant in the 2×2 and 3×3 cases. Note that manipulations effected
with the canonical basis work for any basis (though this comment shall be made clear in
the 2nd Semester of the course).

Example 2.2.1 If n = 2, then V (ae1 +be2 , ce1 +de2 ) = aV (e1 , ce1 +de2 )+bV (e2 , ce1 +de2 ).
After developing the expression, it equals

acV (e1 , e1 ) + adV (e1 , e2 ) + bcV (e2 , e1 ) + bdV (e2 , e2 );

in turn, V(e_i, e_i) = 0 by DET2, and V(e_2, e_1) = −V(e_1, e_2), hence

V (ae1 + be2 , ce1 + de2 ) = (ad − bc)V (e1 , e2 ).

Example 2.2.2 For n = 3, let us compute

V (A, B, C) = V (a1 e1 + a2 e2 + a3 e3 , b1 e1 + b2 e2 + b3 e3 , c1 e1 + c2 e2 + c3 e3 ).

By multilinearity, V (A, B, C) = a1 b2 c3 V (e1 , e2 , e3 ) + a1 b2 c2 V (e1 , e2 , e2 ) + · · · Observe,


however, that each time ei appears twice in the expansion, we obtain 0 (alternate), so
only the terms where all the ei appear explicitly may survive:

V (A, B, C) = a1 b2 c3 V (e1 , e2 , e3 ) + a1 b3 c2 V (e1 , e3 , e2 ) + a2 b1 c3 V (e2 , e1 , e3 ) +


+a2 b3 c1 V (e2 , e3 , e1 ) + a3 b1 c2 V (e3 , e1 , e2 ) + a3 b2 c1 V (e3 , e2 , e1 ).

To complete the process, observe the following recipe.

RECIPE: When we have a permutation of n elements in the arguments of V , as is


the case, the advice is to swap elements in pairs, leaving en in the rightmost place,
then en−1 in its rightful place, and so on, all along applying alternateness. For instance,
V (e2 , e3 , e1 ) = −V (e2 , e1 , e3 ), and in turn −V (e2 , e1 , e3 ) = V (e1 , e2 , e3 ).

Thus we see that

V (A, B, C) = (a1 b2 c3 − a1 b3 c2 − a2 b1 c3 + a2 b3 c1 + a3 b1 c2 − a3 b2 c1 ) V (e1 , e2 , e3 ).

In other words, V (A, B, C) = det(A, B, C)V (e1 , e2 , e3 ).


Properties DET1, DET2 determine V up to a constant, and condition V (e1 , . . . , en ) = 1
finally fixes the determinant function.

Definition The determinant of n vectors in K^n is given by properties DET1, DET2, DET0.


Let A be an n × n square matrix. We define the determinant of the matrix A as
det A = det(A1 , . . . , An ).

2.3 Binet’s formula
Theorem 2.3.1 (Binet-Cauchy) det AB = det A det B.

Proof: Consider T(u_1, · · · , u_n) = det(Au_1, · · · , Au_n). Since T is n-linear and alternate,
we have T(u_1, · · · , u_n) = T(e_1, · · · , e_n) · det(u_1, · · · , u_n). In our case, det AB =
det(AB_1, · · · , AB_n), where B_i is the i-th column of B. Note that T(e_1, · · · , e_n) = det(Ae_1, · · · , Ae_n) = det A,
so indeed det AB = T(B_1, · · · , B_n) = det A · det(B_1, · · · , B_n) = det A det B. □
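A quick numerical check of the Binet-Cauchy formula on random matrices (a sanity check of the statement, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))  # True
```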

IMPORTANT: What Theorem 2.3.1 proves is that the determinant of a square ma-
trix A, det A, is the scaling factor between the volume of an n-parallelepiped generated by
the vectors u_1, . . . , u_n and that of its image under A, i.e. the one determined by Au_1, . . . , Au_n.
The sign of det A shows whether the orientation of the Au_i is the same as, or opposite to,
that of the u_i.

Theorem 2.3.2 Let A be an n × n matrix. Then: det A = 0 if and only if its columns
A1 , · · · , An are linearly dependent.

Proof: Suppose that the columns are LD. For simplicity (or after applying DET2),
suppose that A_1 = \sum_{i=2}^{n} λ_i A_i. One has:
\[
\det(A_1, . . . , A_n) = \det\Big(\sum_{i \ge 2} λ_i A_i, A_2, . . . , A_n\Big) = \sum_{i=2}^{n} λ_i \det(A_i, A_2, · · · , A_i, · · · , A_n) = 0,
\]
by linearity and alternateness. Now, assume that the columns of A are linearly independent:
since the A_i form a basis of K^n, each canonical basis vector may be written e_i = \sum_k b_{ki} A_k. This
means that I = AB, where B = (b_{ki}), and by Binet's Formula det A det B = det I = 1, hence det A ≠ 0. □
In order to have an explicit formula for the determinant, we use matrix notation: A_j =
\sum_i a_{ij} e_i. By DET1, we have:
\[
V(A_1, . . . , A_n) = V\Big(\sum_{k_1=1}^{n} a_{k_1 1} e_{k_1}, \; · · · , \; \sum_{k_n=1}^{n} a_{k_n n} e_{k_n}\Big) = \sum_{k_1, \dots, k_n} a_{k_1 1} \cdots a_{k_n n} \, V(e_{k_1}, . . . , e_{k_n}).
\]

Again, by DET2, only the terms where k_1, . . . , k_n are distinct shall survive, i.e. {k_1, . . . , k_n} =
{1, . . . , n}. Here we use the notation for permutations: a permutation of n elements is
a bijection σ : {1, . . . , n} → {1, . . . , n}. The set (group) of permutations of {1, · · · , n} is
denoted by Sn , and is called the symmetric group of n elements.
Back to the expression, we have
\[
V(A_1, . . . , A_n) = \sum_{σ ∈ S_n} a_{σ(1)1} \cdots a_{σ(n)n} \, V(e_{σ(1)}, . . . , e_{σ(n)}) = (\star);
\]
in turn, V(e_{σ(1)}, . . . , e_{σ(n)}) = sgn(σ) V(e_1, . . . , e_n), where sgn(σ) = ±1 (the sign of σ) is
related to the sign changes via DET2.

Definition A permutation τ ∈ S_n is called a transposition (between i and j, with i ≠ j) if
τ(i) = j, τ(j) = i and τ(k) = k for every k ≠ i, j. In other words, a transposition is a
swap between two elements i, j. One usually writes τ = (i j). (Note that τ^{-1} = τ for
every transposition τ.)

Theorem 2.3.3 Every permutation σ ∈ Sn factors as a product of transpositions. If


σ = τ1 · · · τk , then σ −1 = τk · · · τ1 .

The following illustrates a general principle that provides a general proof.

Example 2.3.4 Let V : R^5 × · · · × R^5 → R (5 factors) satisfy DET1), DET2). Consider the
canonical basis e_i. We wish to find out what V(e_2, e_3, e_4, e_5, e_1) is. Remember to put the
last elements last, and so on until all are back in position.
First, we swap e5 and e1 , so that e5 is back at its 5th place. By DET2), one has

V (e2 , e3 , e4 , e5 , e1 ) = −V (e2 , e3 , e4 , e1 , e5 ).
Now e4 goes to its rightful place, and for that we swap e1 , e4 :
V (e2 , e3 , e1 , e4 , e5 ) = −V (e2 , e3 , e4 , e1 , e5 ).
We do the same with e3 and e2 , which entails two more sign changes, and thus we get
V (e2 , e3 , e4 , e5 , e1 ) = V (e1 , e2 , e3 , e4 , e5 ).
Note that, if σ = τ_1 · · · τ_r, where the τ_i are transpositions (so τ_i = τ_i^{-1}), then using (ab)^{-1} = b^{-1} a^{-1}
yields
σ^{-1} = (τ_1 · · · τ_r)^{-1} = τ_r^{-1} · · · τ_1^{-1} = τ_r · · · τ_1.
(The process we showed in Example 2.3.4 in fact provides a decomposition of the inverse
σ −1 , but we shall not dwell on this, and instead refer to the first Chapter on groups.)

Example 2.3.5 (Sign of a cycle) A cycle of order r is a permutation of this kind:
there are pairwise distinct elements a_1, · · · , a_r (within {1, · · · , n}) such that σ(a_i) = a_{i+1} for
i < r, σ(a_r) = a_1, and all other elements are fixed, i.e. σ(j) = j for j ≠ a_i. We write
such a σ as σ = (a_1 a_2 . . . a_r).
One has (a_1 a_2 . . . a_r) = (a_1 a_2)(a_2 a_3) · · · (a_{r−1} a_r), so its sign is (−1)^{r−1}.

Corollary 2.3.6 sgn(σ) = (−1)k , where σ = τ1 · · · τk .

CLAIM: (proven in the Appendix at the end of this Chapter) The function sgn is well
defined, and multiplicative, i.e.: sgn(ση) = sgn(σ)sgn(η) for σ, η ∈ Sn .

That is how we obtain the formula for the n × n determinant:
\[
(2.1) \qquad \det A = \sum_{σ ∈ S_n} \mathrm{sgn}(σ) \, a_{σ(1)1} \cdots a_{σ(n)n},
\]
where sgn is the sign made explicit in Corollary 2.3.6.
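Formula (2.1) can be implemented directly, summing over all n! permutations; a minimal Python sketch (feasible only for small n, with S_n enumerated via itertools), checked against numpy:

```python
import numpy as np
from itertools import permutations

def sign(p):
    """Sign of a permutation given as a tuple, via counting inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det_leibniz(A):
    """det A = sum over sigma of sgn(sigma) * a_{sigma(1)1} ... a_{sigma(n)n}."""
    n = A.shape[0]
    return sum(sign(p) * np.prod([A[p[j], j] for j in range(n)])
               for p in permutations(range(n)))

A = np.array([[2., 1., 0.], [1., 3., 4.], [0., 5., 6.]])
print(det_leibniz(A), np.linalg.det(A))   # both approximately -10.0
```

Another important result follows: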

Proposition 2.3.7 det A = det AT .

Proof: Clearly, a_{σ(k)k} = (A^T)_{k,σ(k)}. Also, when we have the graph of a bijection, the graph
of its inverse is obtained by transposing the horizontal and vertical axes (i.e. reflection
through the diagonal) in the case of real functions of a real variable. This also holds for
permutations of {1, 2, · · · , n} (A PICTURE IS LACKING!).
Therefore, for every σ ∈ S_n, it follows from the former paragraph that
\[
\prod_{k=1}^{n} a_{σ(k)k} = \prod_{ℓ=1}^{n} a_{ℓ\,σ^{-1}(ℓ)},
\]
just swapping k for ℓ = σ(k). On the other hand, sgn(σ) = sgn(σ^{-1}), so therefore det A =
det A^T. □

2.4 Determinants of special shapes


We need no more background than the above to work out the determinants of matrices
of certain important shapes.

Example 2.4.1 Let T be an upper triangular matrix, i.e. t_{ij} = 0 for i > j (lower triangular
matrices are treated analogously). The determinant of T, det T, is the product of the terms of the main diagonal of T:

det T = t_{11} · · · t_{nn}.

Indeed, note that det T = \sum_{σ} sgn(σ) \prod_{i=1}^{n} t_{σ(i)i}. Since t_{ij} = 0 for i > j, those permutations
σ of 1, 2, . . . , n that have a term with σ(i) > i will produce a zero product, so the ones
we need to pick are those σ such that σ(i) ≤ i for every 1 ≤ i ≤ n. Since σ(1) ≤ 1, we get
σ(1) = 1. Likewise, σ(2) ≤ 2, and the permutation σ must have σ(2) = 2. Likewise,
if σ(1) = 1, . . . , σ(i − 1) = i − 1, then σ(i) ≤ i, and since σ(i) cannot be 1, 2, . . . , i − 1
(for these values are taken), we are only left with σ(i) = i. Thus, the only permutation
which produces a product that has a priori no obvious zero terms is precisely the identity,
σ = Id. This means that only one product survives, i.e. det T = t_{11} · · · t_{nn}.

Example 2.4.2 Consider now a shape such as the following:
\[
M = \begin{pmatrix}
a_{11} & a_{12} & c_{11} & c_{12} & c_{13} \\
a_{21} & a_{22} & c_{21} & c_{22} & c_{23} \\
0 & 0 & b_{11} & b_{12} & b_{13} \\
0 & 0 & b_{21} & b_{22} & b_{23} \\
0 & 0 & b_{31} & b_{32} & b_{33}
\end{pmatrix}.
\]

This shape does not have so many vanishing coefficients as a triangular matrix. We still
have an interesting phenomenon, though. We claim that the determinant is det M =
det(aij ) det(bij ).

First solution, perhaps unpleasant: We shall deal with this example first through
the prism of permutations. Note that m_{i1} = m_{i2} = 0 for i = 3, 4, 5. In this case, the only
significant terms in the sum over all permutations are those involving only entries within the nonzero
frame; namely, if σ is a permutation of 1, 2, 3, 4, 5,

σ : {1, 2, 3, 4, 5} → {1, 2, 3, 4, 5},

the term corresponding to σ is sgn(σ) \prod_{i=1}^{5} m_{σ(i)i}, and it survives only if every factor m_{σ(i)i} lies outside the zero zone.
This means that, if i = 1, 2, then σ(i) must be 1 or 2. In other words, {σ(1), σ(2)} = {1, 2}.
Since σ is a bijection, automatically {σ(3), σ(4), σ(5)} = {3, 4, 5}.
Thus, every significant permutation σ may be factored as a product σ = α ◦ β of a
permutation α of 1, 2 (that leaves 3, 4, 5 fixed) and a permutation β of 3, 4, 5 (that leaves
1, 2 fixed). Precisely speaking (using sgn(αβ) = sgn(α) sgn(β)),
\[
\det M = \sum_{σ} \mathrm{sgn}(σ) \prod_{i=1}^{5} m_{σ(i)i} = \sum_{α,β} \mathrm{sgn}(αβ)\, m_{α(1)1} m_{α(2)2} m_{β(3)3} m_{β(4)4} m_{β(5)5} =
\Big(\sum_{α} \mathrm{sgn}(α)\, m_{α(1)1} m_{α(2)2}\Big) \Big(\sum_{β} \mathrm{sgn}(β)\, m_{β(3)3} m_{β(4)4} m_{β(5)5}\Big) = \det(A) \det(B),
\]
where A = (a_{ij}) and B = (b_{ij}).

Second solution (Binet-Cauchy): Note the following product:
\[
M = \begin{pmatrix}
a_{11} & a_{12} & c_{11} & c_{12} & c_{13} \\
a_{21} & a_{22} & c_{21} & c_{22} & c_{23} \\
0 & 0 & b_{11} & b_{12} & b_{13} \\
0 & 0 & b_{21} & b_{22} & b_{23} \\
0 & 0 & b_{31} & b_{32} & b_{33}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & b_{11} & b_{12} & b_{13} \\
0 & 0 & b_{21} & b_{22} & b_{23} \\
0 & 0 & b_{31} & b_{32} & b_{33}
\end{pmatrix}
\begin{pmatrix}
a_{11} & a_{12} & c_{11} & c_{12} & c_{13} \\
a_{21} & a_{22} & c_{21} & c_{22} & c_{23} \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\]
Applying the Binet-Cauchy formula, and working out the factors through Laplace expansion, yields det M = det A · det B.

Generalisation via block matrices: Consider a square r × r matrix A, a square s × s
matrix B, and an r × s matrix C. Let M be the square matrix of order r + s defined by
the blocks
\[
M = \begin{pmatrix} A & C \\ 0 & B \end{pmatrix}.
\]
By the rules of block matrix multiplication, one has
\[
M = \begin{pmatrix} A & C \\ 0 & B \end{pmatrix} = \begin{pmatrix} I & 0 \\ 0 & B \end{pmatrix} \cdot \begin{pmatrix} A & C \\ 0 & I \end{pmatrix},
\]
hence det M = det A · det B.
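A quick numerical check of det M = det A · det B for a block upper-triangular matrix (the random blocks below are an illustration only):

```python
import numpy as np

rng = np.random.default_rng(2)
r, s = 2, 3
A = rng.standard_normal((r, r))
B = rng.standard_normal((s, s))
C = rng.standard_normal((r, s))
M = np.block([[A, C], [np.zeros((s, r)), B]])
print(np.isclose(np.linalg.det(M),
                 np.linalg.det(A) * np.linalg.det(B)))  # True
```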

2.5 Vandermonde determinant
We shall apply what we learnt so far to this classical example (and will use the Laplace
expansion, too). Given x1 , . . . , xn numbers, one has the matrix M = (mij ), where mij =
xi−1
j , i.e.
 
1 1 1 1 ... 1
 x1 x2 x3 x4 . . . x n 
M = . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

xn−1
1 x2n−1 xn−1 3 xn−1
4 . . . xn−1 n
Let V (x1 , . . . , xn ) = det M . We shall find an expression for V (x1 , . . . , xn ).
Q
CLAIM: V (x1 , . . . , xn ) = 1≤i<j≤n (xj − xi ).
To start with, consider the elementary property that
\[
\det\Big(u_1 + \sum_{i=2}^{n} α_i u_i, \; u_2, . . . , u_n\Big) = \det(u_1, u_2, . . . , u_n)
\]

(which follows from linearity and alternateness!). This we may do on each column: adding
a linear combination of the other columns to a given column leaves the determinant
unchanged. By the same token, this is also the case for rows (remember that det A =
det AT !).
Thus, we replace the columns u_2, . . . , u_n by u_2 − u_1, . . . , u_n − u_1, that is,
\[
V = \begin{vmatrix}
1 & 0 & 0 & \dots & 0 \\
x_1 & x_2 - x_1 & x_3 - x_1 & \dots & x_n - x_1 \\
\vdots & \vdots & \vdots & & \vdots \\
x_1^{i} & x_2^{i} - x_1^{i} & x_3^{i} - x_1^{i} & \dots & x_n^{i} - x_1^{i} \\
\vdots & \vdots & \vdots & & \vdots \\
x_1^{n-1} & x_2^{n-1} - x_1^{n-1} & x_3^{n-1} - x_1^{n-1} & \dots & x_n^{n-1} - x_1^{n-1}
\end{vmatrix}.
\]
Clearly, we may extract a factor x_j − x_1 out of the j-th column, for 2 ≤ j ≤ n, and in so doing
\[
V = \prod_{j=2}^{n} (x_j − x_1)
\begin{vmatrix}
1 & 0 & 0 & \dots & 0 \\
x_1 & 1 & 1 & \dots & 1 \\
\vdots & \vdots & \vdots & & \vdots \\
x_1^{i} & x_2^{i-1} + \dots + x_1^{i-1} & x_3^{i-1} + \dots + x_1^{i-1} & \dots & x_n^{i-1} + \dots + x_1^{i-1} \\
\vdots & \vdots & \vdots & & \vdots \\
x_1^{n-1} & x_2^{n-2} + \dots + x_1^{n-2} & x_3^{n-2} + \dots + x_1^{n-2} & \dots & x_n^{n-2} + \dots + x_1^{n-2}
\end{vmatrix}
= (\star).
\]
Clearly, Laplace expansion along the first row reduces to the expression
\[
V = (\star) = \prod_{j=2}^{n} (x_j − x_1)
\begin{vmatrix}
1 & 1 & \dots & 1 \\
x_2 + x_1 & x_3 + x_1 & \dots & x_n + x_1 \\
\vdots & \vdots & & \vdots \\
x_2^{i-1} + \dots + x_1^{i-1} & x_3^{i-1} + \dots + x_1^{i-1} & \dots & x_n^{i-1} + \dots + x_1^{i-1} \\
\vdots & \vdots & & \vdots \\
x_2^{n-2} + \dots + x_1^{n-2} & x_3^{n-2} + \dots + x_1^{n-2} & \dots & x_n^{n-2} + \dots + x_1^{n-2}
\end{vmatrix}.
\]
Note now that, if we isolate the above determinant on the right, leaving the accompanying
factor on the left for now, the (n − 1)-th row minus x_1 times the (n − 2)-th row yields the row
\[
(x_2^{n-2} \;\; x_3^{n-2} \;\; \dots \;\; x_n^{n-2}).
\]
Apply this process to the (n − 2)-th row now (i.e. subtracting x_1 times the (n − 3)-th
row) and proceed backwards accordingly. One gets
\[
V = (\star) = \prod_{j=2}^{n} (x_j − x_1)
\begin{vmatrix}
1 & 1 & \dots & 1 \\
x_2 & x_3 & \dots & x_n \\
\vdots & \vdots & & \vdots \\
x_2^{n-2} & x_3^{n-2} & \dots & x_n^{n-2}
\end{vmatrix},
\]
which in turn equals \prod_{j=2}^{n} (x_j − x_1) V(x_2, . . . , x_n). The case n = 2 is clear, and the claim
follows by induction. □
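The product formula is easy to test numerically; a minimal Python sketch with arbitrary sample points:

```python
import numpy as np
from itertools import combinations

def vandermonde_det(xs):
    """det of the matrix M with m_{ij} = x_j^(i-1), computed via numpy."""
    n = len(xs)
    M = np.array([[x ** i for x in xs] for i in range(n)])
    return np.linalg.det(M)

xs = [1.0, 2.0, 4.0, 7.0]
product = np.prod([xs[j] - xs[i] for i, j in combinations(range(len(xs)), 2)])
print(np.isclose(vandermonde_det(xs), product))   # True
```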

2.6 Minors. Laplace expansion


Clearly, once we have a matrix A with the first column A_1 = e_1 the determinant reduces
to one of order n − 1. Indeed,
\[
\det(e_1, A_2, · · · , A_n) = \begin{vmatrix} a_{22} & \cdots & a_{2n} \\ \vdots & & \vdots \\ a_{n2} & \cdots & a_{nn} \end{vmatrix}.
\]
For instance, one may argue that σ(1) = 1 (hence a_{σ(1)1} = 1) for every permutation with
a nonzero product a_{σ(1)1} · · · a_{σ(n)n}.
n
X
det A = det(A1 , A2 , · · · , An ) = ai1 det(ei , A2 , · · · , An ),
i=1

and so it suffices to process each det(ei , A2 , · · · , An ). In order to permute the rows


1, 2, · · · , i, · · · , n into i, 2, 3, · · · , i − 1, 1, i + 1, · · · , n so that at each moment the order
of all rows except that of the first and i-th row is preserved, one need perform i − 1 row
swaps, i.e. transpositions (see Example 2.3.5), and then we reduce to the case i = 1,
det(ei , A2 , · · · , An ) = (−1)i−1 det(e1 , A02 , · · · , A0n ), where the matrix A0 is obtained from
A by taking the first row A01 = Ai and the 2nd to n-th rows of A0 to be the rows 1st to
n-th of A after cancelling out the i-th row. If n = 4 and i = 3, then
 3
A
A1 
A0 =  0 0 0
 3−1
A2  and det(e3 , A2 , A3 , A4 ) = (−1) det(e1 , A2 , A3 , A4 ).

A4

Notation: Denote by A^{i^c}_{j^c} the determinant of order n − 1 resulting from deleting the i-th
row and the j-th column from det A. Here i^c refers to the complement of i in the set
{1, 2, . . . , n} as an ordered set.

In all, the above argument and its row analogue yield the following result.

Theorem 2.6.1 (Laplace expansion, poor man’s version) Let A be a square ma-
trix of order n.

1. (Developing by a column) Fix j (j-th column of A). Then:
\[
\det A = \sum_{k=1}^{n} (−1)^{k+j} a_{kj} A^{k^c}_{j^c};
\]

2. (Developing by a row) Fix i (i-th row of A). Then: det A = \sum_{k=1}^{n} (−1)^{i+k} a_{ik} A^{i^c}_{k^c},

where A^{a^c}_{b^c} is the determinant of the submatrix resulting from erasing the a-th row and the
b-th column.

Example 2.6.2 If n = 3 and we expand along the 2nd row,
\[
\det A = a_{21} (−1)^{2+1} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{22} (−1)^{2+2} \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} + a_{23} (−1)^{2+3} \begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix}.
\]

Proof of Theorem 2.6.1: The argument is essentially given before the statement.
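A direct (and very inefficient, O(n!)) recursive Python implementation of the expansion along the first row, as a sketch:

```python
import numpy as np

def det_laplace(A):
    """Determinant via Laplace expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for k in range(n):
        # delete row 0 and column k; the sign is (-1)^((k+1)+1) = (-1)^k
        minor = np.delete(np.delete(A, 0, axis=0), k, axis=1)
        total += (-1) ** k * A[0, k] * det_laplace(minor)
    return total

A = np.array([[1., 2., 3.], [0., 4., 5.], [1., 0., 6.]])
print(det_laplace(A), np.linalg.det(A))   # both approximately 22.0
```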

Definition Let B ∈ Mm×n (K) be a matrix. A minor, or minor determinant of B, of


order r is the determinant of a square submatrix of B, which is obtained by choosing r
rows i1 < · · · < ir and r columns j1 < · · · < jr of B. The minor with rows I = {i1 , · · · , ir }
and columns J = {j_1, · · · , j_r} is the determinant
\[
B^I_J = \begin{vmatrix} b_{i_1 j_1} & \cdots & b_{i_1 j_r} \\ \vdots & & \vdots \\ b_{i_r j_1} & \cdots & b_{i_r j_r} \end{vmatrix}.
\]
An example of this is A^{i^c}_{j^c} above, where A is an n × n matrix: here i^c is the ordered set
1 < 2 < · · · < i − 1 < i + 1 < · · · < n.

Laplace expansion has a full-fledged version, which we shall not state or prove here (ask
away if ye are curious).

2.6.1 Cofactors. The adjugate matrix

Definition Let A ∈ M_n(K). The cofactor matrix of A is defined by Cof(A)_{ij} = (−1)^{i+j} A^{i^c}_{j^c},
with A^{i^c}_{j^c} as defined above; Cof(A)_{ij} is the cofactor associated with deleting the i-th row and j-th column.

Definition The adjugate matrix of A is the transpose of the cofactor matrix, adj(A) =
Cof(A)^T = Cof(A^T).

Given A ∈ M_n(K) and b ∈ K^n, Laplace expansion on the first column of the determinant
det(b, A_2, . . . , A_n) yields the following.
\[
(2.2) \qquad \begin{vmatrix} b_1 & a_{12} & \dots & a_{1n} \\ b_2 & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ b_n & a_{n2} & \dots & a_{nn} \end{vmatrix} = b_1 (−1)^{1+1} A^{1^c}_{1^c} + \dots + b_n (−1)^{n+1} A^{n^c}_{1^c} = \sum_{k=1}^{n} b_k \, \mathrm{Cof}(A)_{k1}.
\]
Note that Cof(A)_{kj} = adj(A)_{jk}, so actually det(b, A_2, . . . , A_n) = \sum_{k=1}^{n} adj(A)_{1k} b_k =
adj(A)^1 b. More generally, Laplace expansion along the i-th column below yields
\[
(2.3) \qquad \det(A_1, . . . , A_{i−1}, b, A_{i+1}, . . . , A_n) = adj(A)^i b.
\]

Theorem 2.6.3 The following identities hold for A ∈ M_n(K):
\[
A \cdot \mathrm{adj}\,A = \mathrm{adj}\,A \cdot A = (\det A)\, I.
\]
Thus, if det A ≠ 0 then A is invertible and A^{-1} = \frac{1}{\det A}\, \mathrm{adj}\,A.

Proof: Laplace expansion yields the following lemma.

Lemma 2.6.4 Let A be a square matrix of order n.

1. Fix j (j-th column of A). Then:
\[
\det A = \sum_{k=1}^{n} (−1)^{k+j} a_{kj} A^{k^c}_{j^c};
\]

2. Fix j (j-th column of A), and let ℓ ≠ j. Then:
\[
\sum_{k=1}^{n} (−1)^{k+j} a_{kℓ} A^{k^c}_{j^c} = 0;
\]

3. Fix i (i-th row of A). Then: det A = \sum_{k=1}^{n} (−1)^{i+k} a_{ik} A^{i^c}_{k^c};

4. Fix i (i-th row of A), and let j ≠ i. Then: \sum_{k=1}^{n} (−1)^{i+k} a_{jk} A^{i^c}_{k^c} = 0.

Proof: Parts 1 and 3 are in Theorem 2.6.1. Part 4 is Part 2 applied to A^T. Part 2
follows from considering the determinant of A with the j-th column deleted and the
ℓ-th column written in its place. Thus, clearly
\[
\begin{vmatrix} a_{11} & \cdots & a_{1ℓ} & \cdots & a_{1ℓ} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2ℓ} & \cdots & a_{2ℓ} & \cdots & a_{2n} \\ \vdots & & \vdots & & \vdots & & \vdots \\ a_{n1} & \cdots & a_{nℓ} & \cdots & a_{nℓ} & \cdots & a_{nn} \end{vmatrix} = 0,
\]
since two of its columns are equal.
Alternatively, 2 follows from (2.3), for adj(A)^i A_j = det(A_1, . . . , A_{i−1}, A_j, A_{i+1}, . . . , A_n). □
The products that appear in the Lemma say precisely that A · adj A = adj A · A = (det A) · I.
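A minimal Python sketch of the cofactor/adjugate construction and the identity A · adj A = (det A) I (the matrix below is an arbitrary example):

```python
import numpy as np

def adjugate(A):
    """Transpose of the cofactor matrix, Cof(A)_{ij} = (-1)^(i+j) * minor(i, j)."""
    n = A.shape[0]
    cof = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T

A = np.array([[2., 0., 1.], [1., 3., -1.], [0., 4., 5.]])
print(np.allclose(A @ adjugate(A), np.linalg.det(A) * np.eye(3)))  # True
```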

2.6.2 Characterizing the rank via minors


Theorem 2.6.5 Let A_1, · · · , A_r be r column vectors in K^n, r ≤ n. If they are linearly
independent, then there are r rows 1 ≤ i_1 < · · · < i_r ≤ n (write I = {i_1, · · · , i_r}) such
that the minor
\[
A^I_{[1,r]} = \begin{vmatrix} a^{i_1}_1 & \cdots & a^{i_1}_r \\ \vdots & & \vdots \\ a^{i_r}_1 & \cdots & a^{i_r}_r \end{vmatrix}
\]
is nonzero.

Proof: Let us prove an equivalent statement. Given a set A_1, · · · , A_r of linearly independent
vectors in K^n, perform the mixing as in Steinitz's Theorem so as to obtain a new
basis A_1, · · · , A_r, e_{k_1}, · · · , e_{k_{n−r}}. Finally, expanding along the last n − r columns,
\[
0 ≠ \det(A_1, · · · , A_r, e_{k_1}, · · · , e_{k_{n−r}}) = ± \begin{vmatrix} a^{i_1}_1 & \cdots & a^{i_1}_r \\ \vdots & & \vdots \\ a^{i_r}_1 & \cdots & a^{i_r}_r \end{vmatrix} = ± A^I_{[1,r]},
\]
where I = {i_1 < · · · < i_r} is the complement of {k_1, · · · , k_{n−r}} in {1, · · · , n}. □

Theorem 2.6.6 Let M be an m×n matrix with coefficients in K. The rank of M , rk M ,


is the largest order r of a nonzero minor of M .

Proof: Let ρ = rk M . If r > ρ, then r columns of M are surely linearly dependent, so


any r × r minor of M is necessarily zero. If r ≤ ρ, by Theorem 2.6.5 there is some nonzero
minor of order r, so ρ is the biggest among such numbers. 

The following result is essential in the study of systems of linear equations.

Corollary 2.6.7 The rank of A equals the rank of its transpose AT . In other words,
row rank and column rank are equal (row rank would be the maximum number of linearly
independent rows in A).

Proof: The fact that det M T = det M and the determinantal criterion show that the
rank is the same for A and for AT . The rows of A are the columns of AT . 

Corollary 2.6.8 Let A_1, · · · , A_r ∈ K^n be linearly independent vectors. If r < n, then
their span F = ⟨A_1, · · · , A_r⟩ is given by the following cartesian equations. Fix I = {i_1 <
· · · < i_r} so that A^I_{[1,r]} ≠ 0. Then F is defined by the following n − r equations:
\[
(\star) \qquad x ∈ F \iff \begin{vmatrix} a^{i_1}_1 & \cdots & a^{i_1}_r & x_{i_1} \\ \vdots & & \vdots & \vdots \\ a^{i_r}_1 & \cdots & a^{i_r}_r & x_{i_r} \\ a^{j}_1 & \cdots & a^{j}_r & x_j \end{vmatrix} = 0, \quad ∀ j ∈ I^c.
\]

Proof: Firstly, the condition that x ∈ F is equivalent to rk(A_1 · · · A_r x) < r + 1, which
is to say that every minor of order r + 1 is zero.
Let us show that it suffices to test the minors with rows I ∪ {j}, j ∈ I^c, i.e. those appearing in (\star). If (\star) holds, then, developing each determinant along its rightmost column, A^I_{[1,r]} x_j, and hence
x_j, is a linear combination of x_{i_1}, · · · , x_{i_r}. This determines all the variables x_j, j ∈ I^c. □
Example 2.6.9 Consider the vector subspace F = ⟨(1, 2, 1, 4)^T, (2, 1, 1, −1)^T⟩ ⊂ R^4. Let us write a complete set of cartesian equations for F.
Consider a generic vector of unknowns x, y, z, t. First of all, we spot in the matrix
\[
A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \\ 1 & 1 \\ 4 & −1 \end{pmatrix}
\]
the 2 × 2 minor of the first two rows and columns, which is nonzero. The augmented
matrix is now
\[
A' = \begin{pmatrix} 1 & 2 & x \\ 2 & 1 & y \\ 1 & 1 & z \\ 4 & −1 & t \end{pmatrix},
\]
and fixing the first two rows yields two equations:
\[
\begin{vmatrix} 1 & 2 & x \\ 2 & 1 & y \\ 1 & 1 & z \end{vmatrix} = 0
\quad\text{and}\quad
\begin{vmatrix} 1 & 2 & x \\ 2 & 1 & y \\ 4 & −1 & t \end{vmatrix} = 0.
\]
If the reader should choose another two rows, the results would be the same – we mean, up
to linear combinations of both equations, of course.
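The two bordered determinants above may be expanded by hand or, as a small sketch, symbolically (sympy is used here purely as a convenience):

```python
from sympy import Matrix, symbols

x, y, z, t = symbols('x y z t')
eq1 = Matrix([[1, 2, x], [2, 1, y], [1, 1, z]]).det()
eq2 = Matrix([[1, 2, x], [2, 1, y], [4, -1, t]]).det()
print(eq1)   # x + y - 3*z
print(eq2)   # -6*x + 9*y - 3*t (possibly with the terms in another order)
```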

Example 2.6.10 (An oldie follows from Theorem 2.6.6) Recall the case of u, v ∈
Rn , of coordinates ui , vj respectively. The rank of the matrix (u v) is then the highest
order of a nonzero minor: if one of the vectors is nonzero, then it is at least 1 (a nonzero
coordinate is a nonzero 1 × 1 minor!). The rank is precisely 2 if and only if there is a
nonzero minor ui vj − uj vi . We knew this, but now Theorem 2.6.6 vastly generalises this
old result.

2.6.3 Cramer’s rule. Inverses via minors


Theorem 2.6.11 (Cramer’s rule) Let A be an invertible n × n matrix. Let b ∈ K^n.
There is a unique solution to the linear system Ax = b, of unknowns x = (x_1 x_2 · · · x_n)^T.
The value of each x_i is
\[
x_i = \frac{\det(A_1, · · · , A_{i−1}, b, A_{i+1}, · · · , A_n)}{\det A}.
\]
Proof: Clearly, Ax = b ⇔ x = A^{-1} b, so x = \frac{1}{\det A} (\mathrm{adj}\,A)\, b, and by (2.3) one has
\[
x_i = \frac{1}{\det A}\, (\mathrm{adj}\,A)^i\, b = \frac{\det(A_1, · · · , A_{i−1}, b, A_{i+1}, · · · , A_n)}{\det A}. \qquad □
\]
Remark An alternative proof of 2.6.11 goes as follows. The solution exists, for the
columns A_1, · · · , A_n of A form a basis of K^n and b = \sum_k x_k A_k has a unique set of coordinates. Now, write the determinant
\[
\det(A_1, · · · , A_{i−1}, b, A_{i+1}, · · · , A_n) = \det\Big(A_1, · · · , A_{i−1}, \sum_{k=1}^{n} x_k A_k, A_{i+1}, · · · , A_n\Big) = (\star).
\]
Expanding on the i-th argument by linearity yields (\star) = \det(A_1, · · · , A_{i−1}, x_i A_i, A_{i+1}, · · · , A_n) + 0, since
the terms with A_k in the i-th position for k ≠ i yield zero by DET2. It follows that
\[
\det(A_1, · · · , A_{i−1}, b, A_{i+1}, · · · , A_n) = \det(A_1, · · · , A_{i−1}, A_i, A_{i+1}, · · · , A_n)\, x_i = x_i \det A,
\]
which yields Cramer’s rule.
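A minimal Python sketch of Cramer's rule on a small invertible system (the matrix and right-hand side below are arbitrary examples):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b for invertible A via x_i = det(A with column i replaced by b) / det A."""
    n = A.shape[0]
    dA = np.linalg.det(A)
    x = np.empty(n)
    for i in range(n):
        Ai = A.copy()
        Ai[:, i] = b
        x[i] = np.linalg.det(Ai) / dA
    return x

A = np.array([[2., 1., -1.], [1., 3., 2.], [1., 0., 1.]])
b = np.array([1., 5., 2.])
print(np.allclose(cramer_solve(A, b), np.linalg.solve(A, b)))  # True
```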

2.7 The Wronskian


Let f_1, · · · , f_n ∈ C^{n−1}(I), where I ⊂ R is an interval. Assume that the f_i are linearly
dependent. This means that one has real constants α_1, · · · , α_n, not all zero, such that
\[
\sum_{i=1}^{n} α_i f_i(x) = 0, \quad ∀ x ∈ I.
\]
Much more information is contained here than a mere equation. In fact, if we differentiate
the above identity up to n − 1 times, we get a homogeneous system of n equations and n
unknowns with a nontrivial solution for every x ∈ I:
\[
\begin{pmatrix}
f_1(x) & f_2(x) & \cdots & f_n(x) \\
f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\
\vdots & \vdots & & \vdots \\
f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x)
\end{pmatrix}
\begin{pmatrix} α_1 \\ α_2 \\ \vdots \\ α_n \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
\]

Definition The Wronskian, or Wronski’s determinant, of n functions f_i ∈ C^{n−1}(I) is
the determinant function
\[
W(f_1, · · · , f_n)(x) = \begin{vmatrix}
f_1(x) & f_2(x) & \cdots & f_n(x) \\
f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\
\vdots & \vdots & & \vdots \\
f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x)
\end{vmatrix}.
\]
The following was proven in the above discussion.
Theorem 2.7.1 Let f1 , · · · , fn ∈ C n−1 (I). If the fi are linearly dependent, then the
Wronskian W (f1 , · · · , fn ) is identically zero on I. 
Example 2.7.2 Let αk = ak + ibk ∈ C be pairwise distinct complex numbers, for 1 ≤ k ≤
n, where ak , bk are the real and imaginary parts of αk . The functions eαk x , k = 1, . . . , n
are linearly independent. (Big value for your money/effort!)
Example 2.7.3 The Theorem allows us to prove that the functions 1, sin x, cos 2x, sin 2x
are linearly independent. We leave the reader to check that their Wronskian is not iden-
tically zero (not sure how long the calculations are).
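For Example 2.7.3, the Wronskian may be computed symbolically; a small sketch (sympy used only as a convenience) evaluates it at x = 0 and finds a nonzero value, which by Theorem 2.7.1 shows that the four functions are linearly independent.

```python
from sympy import Matrix, symbols, sin, cos, diff, S

x = symbols('x')
funcs = [S.One, sin(x), cos(2*x), sin(2*x)]
n = len(funcs)
W = Matrix([[diff(f, x, k) for f in funcs] for k in range(n)]).det()
print(W.subs(x, 0))   # 24: nonzero, so 1, sin x, cos 2x, sin 2x are linearly independent
```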
The thing with Example 2.7.3 is, one may use Example 2.7.2 to solve it (the calculations
are so pleasant in Example 2.7.2! I do not know at present if that is the case for Example
2.7.3. Instead, using the fact that certain functions are a basis allows one to use the
coordinates in such basis, taking a suitable finite-dimensional subspace and using Example
2.7.2. But this might be someone I know being quite lazy!)
Infinitely differentiable examples may be built, but we shall content ourselves with this
little gem.
Example 2.7.4 (Careful with quick récipés!) Let u1 (x) = x3 , u2 (x) = x2 |x|. The
Wronskian W (u1 , u2 ) is identically zero, although these functions are linearly independent.
Thus, the converse holds under only certain, albeit quite handy, hypotheses. (When in two
occasions people insisted that identically zero wronskians imply linear dependence, I sadly
did not place a bet, but with this you can. Hush!)

2.8 Problems
2.8.1 Consider, for 1 ≤ i, j ≤ 3 the variables xi , yj . Consider the matrix A = (aij )
defined by aij = sin(xi + yj ). Compute the determinant of A.
If n > 3, consider the n×n matrix A defined by aij = sin(xi +yj ). What is the determinant
of A?

2.8.2 [1, Ch. VI, Problem 2] Prove that (x − 1)^3 divides the polynomial
\[
\begin{vmatrix} 1 & x & x^2 & x^3 \\ 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 4 & 9 & 16 \end{vmatrix}.
\]

The following was part of a test in 2018.

2.8.3 Let a_1, · · · , a_n ∈ R. Factor the polynomial P(x) defined as follows.
\[
P(x) = \begin{vmatrix}
a_1^2 + x & a_1 a_2 & a_1 a_3 & \dots & a_1 a_n \\
a_2 a_1 & a_2^2 + x & a_2 a_3 & \dots & a_2 a_n \\
\vdots & \vdots & \vdots & & \vdots \\
a_n a_1 & a_n a_2 & a_n a_3 & \dots & a_n^2 + x
\end{vmatrix}.
\]

The result below is the very foundation of L’Hôpital’s rule, whose proof stems from
this little gem. A determinantal guise is most recommended, both for the proof and for
memory purposes.

2.8.4 (Cauchy’s Mean Value Theorem) Let f, g : [a, b] → R be continuous functions,
differentiable on (a, b), such that g(b) ≠ g(a). Prove that there exists ξ ∈ (a, b) such
that
\[
(f(b) − f(a))\, g′(ξ) = (g(b) − g(a))\, f′(ξ).
\]
Hint: Let f, g, h : [a, b] → R be continuous on [a, b] and differentiable on (a, b), and
consider the function
\[
H(x) = \begin{vmatrix} f(a) & g(a) & h(a) \\ f(b) & g(b) & h(b) \\ f(x) & g(x) & h(x) \end{vmatrix}.
\]

You might be surprised, but the following is very doable and you already made its ac-
quaintance!

2.8.5 (Chiò’s rule) Let K = R or C, let A ∈ M_n(K) and let u, v ∈ K^n. Prove that
\[
\det \begin{pmatrix} 1 & v^T \\ u & A \end{pmatrix} = \det(A − u v^T).
\]

2.8.6 Given A ∈ Mn (Z) a square matrix with integer coefficients, give necessary and
sufficient conditions for A to have an inverse with integer coefficients.

2.8.7 Let t be an indeterminate. Consider the monomials t^{r_1}, · · · , t^{r_n}, where the r_i ∈ N are
pairwise distinct. Let M be the matrix whose first row is t^{r_1}, · · · , t^{r_n}, with the (i+1)-th
row being the derivative of the i-th row. Prove that det M = C t^N, where C is a real
constant and N is a natural number, and find C, N explicitly.

This was an exam question back in 2019 (VE2, methinks).
2.8.8 Let x_0, · · · , x_n ∈ R be pairwise distinct numbers, and let y_0, · · · , y_n ∈ R. Prove
that there is precisely one polynomial p(x) ∈ R[x] of degree deg p ≤ n such that p(x_i) = y_i,
and that it is characterised by the condition
\[
\begin{vmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^n & y_0 \\
1 & x_1 & x_1^2 & \cdots & x_1^n & y_1 \\
\vdots & & & & & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^n & y_n \\
1 & x & x^2 & \cdots & x^n & p(x)
\end{vmatrix} = 0.
\]

2.8.9 Let A be a real skew-symmetric matrix of order n, i.e. A ∈ M_n(R) such that
A^T = −A (also called antisymmetric). If n is odd, show that det A = 0.

The following may be used as a stepping stone in proving Sylvester’s criterion for a
quadratic form to be positive definite.
2.8.10 (VF 2017) Let a_1, . . . , a_n, b ∈ R be such that b − \sum_{k=1}^{n} a_k^2 > 0. Show that the
following determinant is nonzero:
\[
\begin{vmatrix}
1 & 0 & 0 & \dots & 0 & a_1 \\
0 & 1 & 0 & \dots & 0 & a_2 \\
0 & 0 & 1 & \dots & 0 & a_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \dots & 1 & a_n \\
a_1 & a_2 & a_3 & \dots & a_n & b
\end{vmatrix}.
\]

2.8.11 Let A be a square matrix of order n. Show that adj A ≠ 0 if and only if rk A ≥
n − 1.

Appendix Det-A: The sign of a permutation


This optional section is independent from the rest of the chapter, and should be read for
further study only.
We shall prove here that the sign function, sgn : Sn → {±1} is well-defined, and shall
state and prove its relation to the number of inversions of a permutation.
Given a permutation σ, it is not just a bijection of {1, · · · , n} to itself, but it also induces
a bijection of P_2(n) to itself. Here P_2(n) is the set of unordered pairs {i, j}, and the
induced map is P_2(σ), defined by {i, j} ↦ {σ(i), σ(j)}. P_2(σ^{-1}) is the inverse of P_2(σ).
It may occur that i < j but σ(i) > σ(j). The number of inversions of σ is the number
of pairs where the order is reversed by σ: N(σ) = #{(i, j) : i < j, σ(i) > σ(j)}.

Proposition 2.8.12 Let τ = (a b), with a < b, be a transposition in S_n. Then N(τ) = 2(b − a) − 1
is odd.

Proof:

• Suppose that the indices i, j, a, b are distinct (i < j). Then τ(i) = i < τ(j) = j, and there is no inversion.

• If i < a < b (the cases j = a or j = b are analogous), then the pair i, a passes to i, b and there is no
inversion.

• If a < b < j (and i = a or i = b), then the pair a, j becomes b, j and the pair b, j goes to a, j; in neither case is there an inversion.

• If a < i < b (and j = b, say), the pair a, i becomes b, i, and such pairs account for b − a − 1 inversions. On the
other hand, the pairs i, b become i, a, which counts for another b − a − 1 inversions.

• If i = a < b = j, the pair i, j becomes j, i: one more inversion.

The total number of inversions is therefore N((a b)) = 2(b − a) − 2 + 1 = 2(b − a) − 1
(odd). □

Theorem 2.8.13 Let σ_1, σ_2 be permutations. Then: (−1)^{N(σ_1 σ_2)} = (−1)^{N(σ_1)} (−1)^{N(σ_2)}.
Therefore, sgn(σ) = (−1)^{N(σ)} is well defined, i.e. independent of the factorization of
σ into transpositions.

Proof: For a permutation σ, consider the products P = \prod_{1 ≤ k < h ≤ n} (h − k) and σP = \prod_{1 ≤ k < h ≤ n} (σ(h) − σ(k)). Note that σP/P = ±1, for
the terms of P are indexed by the unordered pairs of distinct elements (P_2(n)) and in
calculating σP/P only the sign may vary between numerator and denominator. The sign of
(σ(j) − σ(i))/(j − i) is negative if and only if (i, j) presents an inversion. This means
that σP/P = (−1)^{N(σ)}, which depends only on σ.
On the other hand,
\[
\frac{σ_1 σ_2 P}{P} = \frac{σ_1 σ_2 P}{σ_2 P} \cdot \frac{σ_2 P}{P},
\]
and reindexing (take i' to be the smaller of the pair σ_2(i), σ_2(j) and j' to be the bigger) one
sees that
\[
\frac{σ_1 σ_2 P}{σ_2 P} = \frac{σ_1 P}{P}.
\]
The sign of a transposition τ satisfies (−1)^{N(τ)} = −1, by Proposition 2.8.12.
But then, the sign defined as (−1)^k if σ = τ_1 · · · τ_k coincides with our version via the
number of inversions. □
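A small Python sketch checking the multiplicativity of (−1)^N by brute force over S_4:

```python
from itertools import permutations

def num_inversions(p):
    """N(sigma): the number of pairs i < j with sigma(i) > sigma(j)."""
    return sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])

def compose(p, q):
    """(p o q)(i) = p(q(i)) for permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

perms = list(permutations(range(4)))
ok = all((-1) ** num_inversions(compose(p, q))
         == (-1) ** num_inversions(p) * (-1) ** num_inversions(q)
         for p in perms for q in perms)
print(ok)   # True: (-1)^N is multiplicative on S_4
```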


Bibliography

[1] M. Castellet, I. Llerena, Álgebra Lineal y Geometrı́a, Ed. Reverte, 1996.

[2] K. Hoffman, R. Kunze, Linear Algebra, 2nd Ed., Prentice Hall, 1971.

[3] A. I. Kostrikin, Yu. I. Manin, Linear Algebra and Geometry.

[4] P. Lax, Linear Algebra, 1st Ed, Wiley, 1997.

[5] G. Pólya, Mathematical Discovery, Wiley, 1981.

[6] J. J. Ramón-Marı́, Systems of linear equations, from a 1st version typed by the
remarkable Filipe Abelha.

