Supplement: The Inductive Discovery of Determinants


Determinants originally appeared as secret recipes for solving systems of linear equations, and how
their formulas were obtained was quite mysterious (in the mid 19th century). The formula for an
n × n determinant is an extremely complicated algebraic expression in n² variables consisting of n! terms,
each of which is a product of n variables. It is highly non-trivial to manipulate the formula directly. In
fact, the usefulness of determinants lies in the properties they have, not in their explicit formulas.
Discovering the Determinants:
Recall that there are three possibilities for the solutions of a system of linear equations Ax = b:
(a) it is inconsistent;
(b) it is consistent and has a unique solution x0 ; and

(c) it is consistent and has many solutions.


The case of a unique solution is of special interest. Since x0 is uniquely determined by the system, one might
expect a functional relation between x0 and (A, b). Can we obtain a formula for this functional
relation? If the entries of A are all explicitly given, one can perform row operations to obtain the solution
directly. But what if some, or all, of the entries of A involve unknown quantities? Or what if you had to
determine which coefficients allow a unique solution? In this situation a “theory” is required: we need to
study the condition (if any) under which a general system has a unique solution.
Let us first try the 2 × 2 case.

(1) ax + by = e
(2) cx + dy = f

We need to avoid using division because some of the coefficients might be zero. So, we eliminate y by
d × (1) − b × (2):
$$(ad - bc)x = ed - bf,$$
and therefore x will have the unique solution $x = \dfrac{ed - bf}{ad - bc}$ if $ad - bc \neq 0$. Similarly, we can eliminate x by
a × (2) − c × (1):
$$(ad - bc)y = -(ec - af),$$
and y will have a unique solution if the same quantity, ad − bc, is non-zero.
The above quantity, ad − bc, determines when the system has a unique solution. So we call it the
determinant of the system.
 
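Definition: Let Ax = b be a system with a 2 × 2 coefficient matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The determinant of A,
denoted by det A or |A|, is defined to be the quantity ad − bc, i.e.
$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$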

Remark: If ad − bc = 0, then a : b = c : d, so the left-hand sides of the two equations are proportional. One can
see that if such a system is consistent, there will be many solutions.
Conclusion: A 2 × 2 system Ax = b will have a unique solution if and only if det A ≠ 0.
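
The 2 × 2 discussion can be tried out numerically. The following is a minimal Python sketch, not part of the original notes; the function names `det2` and `solve2` are hypothetical, and exact fractions are used so that the only division is the final one by ad − bc.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant ad - bc of the coefficient matrix [[a, b], [c, d]]."""
    return a * d - b * c

def solve2(a, b, c, d, e, f):
    """Solve ax + by = e, cx + dy = f.

    Returns (x, y) when ad - bc != 0, and None otherwise
    (the system is then inconsistent or has many solutions).
    """
    delta = det2(a, b, c, d)
    if delta == 0:
        return None
    x = Fraction(e * d - b * f, delta)  # from d*(1) - b*(2)
    y = Fraction(a * f - e * c, delta)  # from a*(2) - c*(1)
    return x, y

print(solve2(2, 1, 1, 3, 5, 10))  # (Fraction(1, 1), Fraction(3, 1))
print(solve2(1, 2, 2, 4, 1, 2))   # None, since ad - bc = 0
```

The second call uses two proportional equations, which is the situation described in the Remark.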

Experimenting with the 3 × 3 Case:


We now adopt a similar method to discover the expression for the determinant of a 3 × 3 matrix. Again
we consider a system with a 3 × 3 coefficient matrix $A = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}$:

(3) a1 x + b1 y + c1 z = d1
(4) a2 x + b2 y + c2 z = d2
(5) a3 x + b3 y + c3 z = d3

By successive elimination of variables, we will obtain an equation involving x only, so the coefficient of x
should be the 3 × 3 determinant we want. The elimination process is indicated as follows:
First we assume $c_2 \neq 0$. We can eliminate z by $c_2 \times (3) - c_1 \times (4)$:

(6) (a1 c2 − a2 c1 )x + (b1 c2 − b2 c1 )y = d1 c2 − d2 c1 ,

or by: c3 × (4) − c2 × (5):

(7) (a2 c3 − a3 c2 )x + (b2 c3 − b3 c2 )y = d2 c3 − d3 c2 .

Now we have two equations (6) and (7) in two variables x and y. As in the 2 × 2 case, we can eliminate y
by: (b2 c3 − b3 c2 ) × (6) − (b1 c2 − b2 c1 ) × (7):

$$\begin{aligned}
& c_2\,(a_1b_2c_3 - a_1b_3c_2 + a_2b_3c_1 - a_2b_1c_3 + a_3b_1c_2 - a_3b_2c_1)\,x \\
&\quad = c_2\,(d_1b_2c_3 - d_1b_3c_2 + d_2b_3c_1 - d_2b_1c_3 + d_3b_1c_2 - d_3b_2c_1).
\end{aligned} \tag{8}$$

Since we have assumed that $c_2 \neq 0$, we can cancel $c_2$ from both sides and obtain:

$$\begin{aligned}
& (a_1b_2c_3 - a_1b_3c_2 + a_2b_3c_1 - a_2b_1c_3 + a_3b_1c_2 - a_3b_2c_1)\,x \\
&\quad = d_1b_2c_3 - d_1b_3c_2 + d_2b_3c_1 - d_2b_1c_3 + d_3b_1c_2 - d_3b_2c_1.
\end{aligned} \tag{8'}$$

Hence x will have a unique solution if:

$$a_1b_2c_3 - a_1b_3c_2 + a_2b_3c_1 - a_2b_1c_3 + a_3b_1c_2 - a_3b_2c_1 \neq 0, \tag{9}$$

and we should define the above quantity as the determinant of the 3 × 3 coefficient matrix.
Remark: You will obtain the same expression (differing at most by a sign) if you choose to eliminate
x, y or x, z instead. The computations are exactly the same and hence are left as exercises.
Before we formally define the quantity in (9) to be the determinant of A, we observe that the expression
can be rewritten in the following form:

(10) a1 (b2 c3 − b3 c2 ) − a2 (b1 c3 − b3 c1 ) + a3 (b1 c2 − b2 c1 ),

which suggests the following much better way of elimination:

(11) (b2 c3 − b3 c2 ) × (3) − (b1 c3 − b3 c1 ) × (4) + (b1 c2 − b2 c1 ) × (5),

after which both variables y and z are eliminated in one stroke (and we no longer need to assume
$c_2 \neq 0$). Moreover, we note that the quantities
$$b_2c_3 - b_3c_2 = \det\begin{pmatrix} b_2 & c_2 \\ b_3 & c_3 \end{pmatrix}, \quad
b_1c_3 - b_3c_1 = \det\begin{pmatrix} b_1 & c_1 \\ b_3 & c_3 \end{pmatrix}, \quad
b_1c_2 - b_2c_1 = \det\begin{pmatrix} b_1 & c_1 \\ b_2 & c_2 \end{pmatrix}$$
are just special kinds of 2 × 2 determinants: each of them is the determinant of the sub-matrix obtained by
deleting the row and column in which $a_i$ sits. This “effective elimination” strongly suggests using
(10) instead of (9) as the definition of the determinant of A. But whether it is really the appropriate one
depends on how well it extends to determinants of larger size. Therefore, we will continue our experiments
with 4 × 4 matrices later.
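
The effect of the combination (11) can be checked on a concrete system. The sketch below is only an illustration under assumptions: the helper names `det2` and `effective_elimination` and the sample matrix are not from the notes. It forms the three multipliers from (11) and returns the coefficients of x, y, z and the right-hand side of the combined equation.

```python
def det2(p, q, r, s):
    """2 x 2 determinant of [[p, q], [r, s]]."""
    return p * s - q * r

def effective_elimination(A, d):
    """Combine equations (3), (4), (5) with the multipliers of (11).

    A is a list of three rows [a_i, b_i, c_i]; d is the right-hand side.
    Returns ([coeff of x, coeff of y, coeff of z], combined right-hand side).
    """
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = A
    m = [det2(b2, c2, b3, c3),    # multiplier of equation (3)
         -det2(b1, c1, b3, c3),   # multiplier of equation (4)
         det2(b1, c1, b2, c2)]    # multiplier of equation (5)
    coeffs = [sum(m[i] * A[i][j] for i in range(3)) for j in range(3)]
    rhs = sum(m[i] * d[i] for i in range(3))
    return coeffs, rhs

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(effective_elimination(A, [1, 2, 3]))  # ([-3, 0, 0], 1)
```

For this sample the coefficients of y and z vanish and the coefficient of x is −3, which agrees with evaluating (10) on the same matrix.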
 
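Definition: Let Ax = b be a system with a 3 × 3 coefficient matrix $A = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}$. The determinant
of A is defined to be the quantity:
$$\begin{aligned}
\det \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}
= \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
&= a_1b_2c_3 - a_1b_3c_2 + a_2b_3c_1 - a_2b_1c_3 + a_3b_1c_2 - a_3b_2c_1 \\
&= a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix}
 - a_2 \begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix}
 + a_3 \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}.
\end{aligned} \tag{12}$$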

Remark: If det A = 0, the elimination (11) will “wipe” out the LHS of the resulting equation. Thus there
exists a free variable in the system. Again if such a system is consistent, there will be many solutions.
Conclusion: A 3 × 3 system Ax = b will have a unique solution if and only if det A ≠ 0.

A simple rule for computing 3 × 3 determinants:
A closer look at the expansion formula (12) suggests considering the following extended
matrix, obtained by appending the first two columns of A on the right:
$$\left(\begin{array}{ccc|cc}
a_1 & b_1 & c_1 & a_1 & b_1 \\
a_2 & b_2 & c_2 & a_2 & b_2 \\
a_3 & b_3 & c_3 & a_3 & b_3
\end{array}\right)$$
(In the original diagram, the three complete downward diagonals are marked + and the three complete
upward diagonals are marked −.)

The respective products of the three complete downward diagonals:

a1 → b2 → c3 , b1 → c2 → a3 , c1 → a2 → b3 ,

correspond exactly to the positive terms in (12). Similarly, the respective products of the three complete
upward diagonals:
a3 → b2 → c1 , b3 → c2 → a1 , c3 → a2 → b1 ,
correspond exactly to the negative terms in (12). Hence, by adding the downward diagonal products and
subtracting the upward diagonal products, we obtain the value of the 3 × 3 determinant directly.

Warning: This trick only works for 3 × 3 determinants!
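
The diagonal rule is easy to mechanize. Here is a small Python sketch (the names `sarrus` and `det3_expansion` are illustrative choices, not from the notes) that builds the extended matrix, sums the downward and upward diagonal products, and compares the result with the expansion (10).

```python
def sarrus(A):
    """3 x 3 determinant via the diagonal rule on the extended matrix."""
    ext = [row + row[:2] for row in A]  # append the first two columns
    down = sum(ext[0][k] * ext[1][k + 1] * ext[2][k + 2] for k in range(3))
    up = sum(ext[2][k] * ext[1][k + 1] * ext[0][k + 2] for k in range(3))
    return down - up

def det3_expansion(A):
    """3 x 3 determinant via the expansion (10) along the first column."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = A
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(sarrus(A), det3_expansion(A))  # -3 -3
```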

Discovering the 4 × 4 Determinant:

(13) a1 x + b1 y + c1 z + d1 w = e1
(14) a2 x + b2 y + c2 z + d2 w = e2
(15) a3 x + b3 y + c3 z + d3 w = e3
(16) a4 x + b4 y + c4 z + d4 w = e4

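The success of the “effective elimination” (11) in the 3 × 3 case suggests using the following as the “effective
elimination” for the 4 × 4 case:

$$\begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix} \times (13)
- \begin{vmatrix} b_1 & c_1 & d_1 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix} \times (14)
+ \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_4 & c_4 & d_4 \end{vmatrix} \times (15)
- \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \end{vmatrix} \times (16). \tag{17}$$

If (to be verified later) all the other variables y, z, w are eliminated by (17), the coefficient of x that remains
will be the 4 × 4 determinant we want. So, let us first define the 4 × 4 determinant as:

$$\begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{vmatrix}
= a_1 \cdot \begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
- a_2 \cdot \begin{vmatrix} b_1 & c_1 & d_1 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
+ a_3 \cdot \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_4 & c_4 & d_4 \end{vmatrix}
- a_4 \cdot \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \end{vmatrix} \tag{18}$$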

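The coefficients of y, z, w after the elimination process (17) are as follows:

$$\begin{aligned}
b_1 \begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
- b_2 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
+ b_3 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_4 & c_4 & d_4 \end{vmatrix}
- b_4 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \end{vmatrix}
&\overset{?}{=} 0 \\
c_1 \begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
- c_2 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
+ c_3 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_4 & c_4 & d_4 \end{vmatrix}
- c_4 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \end{vmatrix}
&\overset{?}{=} 0 \\
d_1 \begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
- d_2 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
+ d_3 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_4 & c_4 & d_4 \end{vmatrix}
- d_4 \begin{vmatrix} b_1 & c_1 & d_1 \\ b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \end{vmatrix}
&\overset{?}{=} 0
\end{aligned} \tag{19}$$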

Thus, whether the elimination method (17) works depends on the validity of the three identities in (19). This
is what we are going to prove systematically.
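
Before proving (19), one can at least test it on numbers. The following Python sketch is an illustration only: `det3`, `eliminate4` and the sample matrix are assumptions, not part of the notes. It applies the multipliers of (17) to the rows of a 4 × 4 matrix and returns the resulting coefficients of x, y, z, w.

```python
def det3(M):
    """3 x 3 determinant by expansion along the first column."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = M
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

def eliminate4(A):
    """Coefficients of x, y, z, w after the combination (17).

    The multiplier of equation i (0-based) is (-1)^i times the 3 x 3
    determinant of the b, c, d columns of the other three rows.
    """
    mult = []
    for i in range(4):
        minor = [row[1:] for k, row in enumerate(A) if k != i]
        mult.append((-1) ** i * det3(minor))
    return [sum(mult[i] * A[i][j] for i in range(4)) for j in range(4)]

A = [[2, 1, 0, 3], [1, 4, 2, 0], [0, 1, 5, 1], [3, 0, 1, 2]]
coeffs = eliminate4(A)
print(coeffs[1:])  # [0, 0, 0]: y, z, w are eliminated for this sample
```

The first entry of `coeffs` is the value that (18) assigns to the 4 × 4 determinant of the sample matrix. A numerical check like this is of course no substitute for the proof that follows.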

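By definition (18), the quantities in (19) can be regarded as the following special kind of 4 × 4 determinants:

$$\begin{vmatrix} b_1 & b_1 & c_1 & d_1 \\ b_2 & b_2 & c_2 & d_2 \\ b_3 & b_3 & c_3 & d_3 \\ b_4 & b_4 & c_4 & d_4 \end{vmatrix}, \quad
\begin{vmatrix} c_1 & b_1 & c_1 & d_1 \\ c_2 & b_2 & c_2 & d_2 \\ c_3 & b_3 & c_3 & d_3 \\ c_4 & b_4 & c_4 & d_4 \end{vmatrix}, \quad
\begin{vmatrix} d_1 & b_1 & c_1 & d_1 \\ d_2 & b_2 & c_2 & d_2 \\ d_3 & b_3 & c_3 & d_3 \\ d_4 & b_4 & c_4 & d_4 \end{vmatrix} \tag{20}$$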

We see that they are all determinants with two equal columns. If we can prove that the value of any
determinant with two equal columns is necessarily zero (the “alternating property”), we are home. Technically,
it is easier to check the following property instead:
Skew-symmetric property: If we interchange two columns of a determinant, its value changes sign.
If the skew-symmetric property is verified, then interchanging the two identical columns of each
determinant in (20) changes it to its negative on the one hand; on the other hand, switching two
identical columns has no effect on the determinant. So the only possibility is that they are all zero!
Note: The above argument does not work in a field satisfying 1 = −1, for example the finite field {0, 1}.
To ease the discussion, let us introduce a notation. Let A be an n × n matrix. We denote by A(i|j) the
(n − 1) × (n − 1) matrix obtained by deleting the i-th row and the j-th column of A. For example, if
$$A = \begin{pmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{pmatrix},
\quad \text{then} \quad
A(2|3) = \begin{pmatrix} a_1 & b_1 & d_1 \\ a_3 & b_3 & d_3 \\ a_4 & b_4 & d_4 \end{pmatrix},$$
where the second row and the third column of A have been struck out.
Similarly, we denote by $A(i, j|k, \ell)$ the (n − 2) × (n − 2) matrix obtained by deleting the i-th, j-th rows and
the k-th, $\ell$-th columns of A. Note that $A(i, j|k, \ell) = A(j, i|k, \ell)$, etc.
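
The deletion notation translates directly into code. A minimal sketch, assuming matrices are stored as lists of rows and keeping the 1-based indices of the text (the function names `minor` and `minor2` are hypothetical):

```python
def minor(A, i, j):
    """A(i|j): delete the i-th row and the j-th column (1-based)."""
    return [row[:j - 1] + row[j:]
            for r, row in enumerate(A, start=1) if r != i]

def minor2(A, i, j, k, l):
    """A(i,j|k,l): delete rows i, j and columns k, l (1-based)."""
    return [[x for c, x in enumerate(row, start=1) if c not in (k, l)]
            for r, row in enumerate(A, start=1) if r not in (i, j)]

# Symbolic entries make the deletion visible.
A = [[f"{name}{r}" for name in "abcd"] for r in range(1, 5)]
print(minor(A, 2, 3))
# [['a1', 'b1', 'd1'], ['a3', 'b3', 'd3'], ['a4', 'b4', 'd4']]
print(minor2(A, 1, 2, 1, 2) == minor2(A, 2, 1, 1, 2))  # True
```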
The skew-symmetric property can be verified directly for 2 × 2 and 3 × 3 determinants. For a 4 × 4
determinant, if the two columns being switched do not involve the first column, then by an inductive
argument one can apply the defining formula (18): every sub-determinant on the RHS changes sign, so the
original determinant changes sign after the switch.
The first column of A needs separate treatment because the entries ak occupy different positions from bk ,
ck and dk in the defining formula (18). When the switch involves the first column, the formula for the
new determinant is completely different from the old one. However, we need only
handle the case of switching the first and the second columns, since, for example, the switching of the first and
third columns can be achieved by:
0. original: A–B–C–D
1. switching 2nd and 3rd: A–C–B–D
2. switching 1st and 2nd: C–A–B–D
3. switching 2nd and 3rd: C–B–A–D
There are sign changes in Step 1 and Step 3, and they cancel each other out. So, if Step 2 also introduces a
sign change, we will have proved the skew-symmetric property for 4 × 4 determinants.
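
The three-step reduction can be replayed on column labels. This is only a small illustrative sketch (the helper `swap` is a hypothetical name, with 0-based positions):

```python
def swap(cols, i, j):
    """Return a copy of cols with positions i and j (0-based) interchanged."""
    cols = list(cols)
    cols[i], cols[j] = cols[j], cols[i]
    return cols

original = ["A", "B", "C", "D"]
step1 = swap(original, 1, 2)  # switch 2nd and 3rd: A C B D
step2 = swap(step1, 0, 1)     # switch 1st and 2nd: C A B D
step3 = swap(step2, 1, 2)     # switch 2nd and 3rd: C B A D

# The net effect equals a single switch of the 1st and 3rd columns.
print(step3 == swap(original, 0, 2))  # True
```

Three switches are used, an odd number, so if each one changes the sign of the determinant then so does the combined switch.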
The trick is to further expand the sub-determinants det A(i|1) using the defining formula (12), so that
the entries bk are also taken out, sitting in the same positions as the ak :
$$\begin{aligned}
\det A ={}& a_1 \det A(1|1) - a_2 \det A(2|1) + a_3 \det A(3|1) - a_4 \det A(4|1) \\
={}& a_1 \big[\, b_2 \det A(1,2|1,2) - b_3 \det A(1,3|1,2) + b_4 \det A(1,4|1,2) \,\big] \\
& - a_2 \big[\, b_1 \det A(2,1|1,2) - b_3 \det A(2,3|1,2) + b_4 \det A(2,4|1,2) \,\big] \\
& + a_3 \big[\, b_1 \det A(3,1|1,2) - b_2 \det A(3,2|1,2) + b_4 \det A(3,4|1,2) \,\big] \\
& - a_4 \big[\, b_1 \det A(4,1|1,2) - b_2 \det A(4,2|1,2) + b_3 \det A(4,3|1,2) \,\big] \\
={}& (a_1b_2 - a_2b_1) \det A(1,2|1,2) + (a_3b_1 - a_1b_3) \det A(1,3|1,2) + (a_1b_4 - a_4b_1) \det A(1,4|1,2) \\
& + (a_2b_3 - a_3b_2) \det A(2,3|1,2) + (a_4b_2 - a_2b_4) \det A(2,4|1,2) + (a_3b_4 - a_4b_3) \det A(3,4|1,2).
\end{aligned}$$
Hence, when we interchange the first two columns of A (i.e. ak ↔ bk ), all det A(i, j|1, 2) remain unchanged,
but each factor (ai bj − aj bi ) becomes (aj bi − ai bj ) and so changes sign. Overall, det A is changed to
− det A. This verifies the skew-symmetric property of 4 × 4 determinants and hence the validity of (19).
As in the 3 × 3 case, we have:
Conclusion: A 4 × 4 system Ax = b will have a unique solution if and only if det A ≠ 0.
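
The six-term regrouping used in the 4 × 4 argument can be compared numerically with the defining expansion. The sketch below makes some assumptions for illustration: rows are 0-based, so the sign of the pair (i, j) is written as (−1)^(i+j+1), which matches the signs in the displayed formula; the 1 × 1 base case det [a] = a is added for convenience; and the names `det` and `six_term` are hypothetical.

```python
from itertools import combinations

def det(A):
    """Expansion along the first column, as in (12) and (18)."""
    if len(A) == 1:
        return A[0][0]  # assumed base case: det [a] = a
    return sum((-1) ** i * A[i][0]
               * det([row[1:] for k, row in enumerate(A) if k != i])
               for i in range(len(A)))

def six_term(A):
    """4 x 4 determinant regrouped by the 2 x 2 blocks of columns 3 and 4."""
    total = 0
    for i, j in combinations(range(4), 2):  # 0-based rows, i before j
        sub = [A[r][2:] for r in range(4) if r not in (i, j)]
        pair = A[i][0] * A[j][1] - A[j][0] * A[i][1]
        total += (-1) ** (i + j + 1) * pair * det(sub)
    return total

A = [[2, 1, 0, 3], [1, 4, 2, 0], [0, 1, 5, 1], [3, 0, 1, 2]]
B = [[row[1], row[0]] + row[2:] for row in A]  # first two columns switched
print(six_term(A) == det(A), six_term(B) == -six_term(A))  # True True
```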

The n × n Determinants:
Let A be the n × n coefficient matrix (n ≥ 3) of the following system:

$$\left\{\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \ldots + a_{nn}x_n &= b_n
\end{aligned}\right. \tag{21}$$

Definition: The determinant of A in (21), denoted by det A, is defined inductively to be:

$$\begin{aligned}
\det A = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}
&= a_{11} \det A(1|1) - a_{21} \det A(2|1) + \ldots + (-1)^{n+1} a_{n1} \det A(n|1) \\
&= \sum_{i=1}^{n} a_{i1} \cdot (-1)^{i+1} \det A(i|1),
\end{aligned} \tag{22}$$

where A(i|1) is the (n − 1) × (n − 1) submatrix of A obtained by deleting the i-th row and the 1-st column
of A.
Note: The above is an “inductive definition”. That is, in order to define 5 × 5 determinants, you need to
define 4 × 4 determinants first, and hence require the definitions of 3 × 3 and then 2 × 2 determinants.
Remark: The quantity $(-1)^{i+1} \det A(i|1)$ is sometimes called the (i, 1)-cofactor of A. The (i, j)-cofactor of
A is similarly defined.
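
The inductive definition (22) is naturally a recursive function. The sketch below is one possible rendering, with two assumptions that are not in the notes: the recursion is stopped at a 1 × 1 base case det [a] = a (which reproduces ad − bc for 2 × 2 matrices), and the (i, j)-cofactor is taken to be $(-1)^{i+j} \det A(i|j)$, the usual convention. Indices are 1-based as in the text.

```python
def minor(A, i, j):
    """A(i|j): delete the i-th row and the j-th column (1-based)."""
    return [row[:j - 1] + row[j:]
            for r, row in enumerate(A, start=1) if r != i]

def det(A):
    """Determinant by the inductive formula (22)."""
    n = len(A)
    if n == 1:
        return A[0][0]  # assumed base case
    return sum(A[i - 1][0] * (-1) ** (i + 1) * det(minor(A, i, 1))
               for i in range(1, n + 1))

def cofactor(A, i, j):
    """(i, j)-cofactor, assuming the usual sign (-1)^(i+j)."""
    return (-1) ** (i + j) * det(minor(A, i, j))

print(det([[1, 2], [3, 4]]))                     # -2
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))   # -3
```

The recursion visits on the order of n! products, in line with the remark at the start of the notes that the explicit formula has n! terms.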
The inductive steps used in the 4 × 4 case generalize directly to the n × n case. We verify the skew-symmetric
property of det A as follows:
• Case 1: The two columns being switched do not involve the first column.
Then, by the defining formula (22), a corresponding switch of columns occurs in each
det A(i|1), so by induction they all change sign. Overall, det A is changed to − det A.
• Case 2: The first and the second columns are switched.
We further expand each det A(i|1) into (n − 2) × (n − 2) sub-determinants using the defining formula
(22). The resulting expression is a huge sum of multiples of det A(i, j|1, 2). Instead of writing down the huge
sum directly, we analyze the coefficient of each det A(i, j|1, 2).
There are two ways to obtain A(i, j|1, 2) from A:
(a) A → A(i|1) → A(i, j|1, 2);
(b) A → A(j|1) → A(j, i|1, 2).
Without loss of generality, let us assume i < j.
(a) The j-th row of A is the (j − 1)-th row of A(i|1). So the coefficient of det A(i, j|1, 2) is:

$$(-1)^{i+1} a_{i1} \cdot (-1)^{(j-1)+1} a_{j2};$$

(b) The i-th row of A is again the i-th row of A(j|1). So the coefficient of det A(j, i|1, 2) is:

$$(-1)^{j+1} a_{j1} \cdot (-1)^{i+1} a_{i2}.$$

Thus, in the huge sum of the expansion of det A, det A(i, j|1, 2) appears as:

$$\det A = \ldots + (-1)^{i+j+1} \big( a_{i1}a_{j2} - a_{j1}a_{i2} \big) \det A(i, j|1, 2) + \ldots$$

Hence, when we switch the first and the second columns of A (i.e. $a_{i1} \leftrightarrow a_{i2}$ and $a_{j1} \leftrightarrow a_{j2}$), det A(i, j|1, 2)
does not change, and its coefficient changes as:

$$a_{i1}a_{j2} - a_{j1}a_{i2} \;\to\; a_{i2}a_{j1} - a_{j2}a_{i1} = -\big( a_{i1}a_{j2} - a_{j1}a_{i2} \big).$$

Therefore, overall, det A changes sign. Any other switch involving the first column reduces to these two
cases by the three-step argument used in the 4 × 4 case. ∎
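
As a sanity check on the skew-symmetric property (not a proof), one can switch every pair of columns of some random integer matrices and compare determinants. In this sketch the names `det`, `swap_columns` and `sign_flips_everywhere`, the 1 × 1 base case, and the choice of seed and entry range are all illustrative assumptions.

```python
import random

def det(A):
    """Determinant by expansion along the first column, as in (22)."""
    if len(A) == 1:
        return A[0][0]  # assumed base case: det [a] = a
    return sum((-1) ** i * A[i][0]
               * det([row[1:] for k, row in enumerate(A) if k != i])
               for i in range(len(A)))

def swap_columns(A, p, q):
    """Return a copy of A with columns p and q (0-based) interchanged."""
    B = [row[:] for row in A]
    for row in B:
        row[p], row[q] = row[q], row[p]
    return B

def sign_flips_everywhere(n, rng):
    """Check det(B) == -det(A) for every column switch of one random matrix."""
    A = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
    d = det(A)
    return all(det(swap_columns(A, p, q)) == -d
               for p in range(n) for q in range(p + 1, n))

rng = random.Random(0)
print(all(sign_flips_everywhere(n, rng) for n in range(2, 6)))  # True
```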

Cramer’s Rule:
Suppose that we apply the “effective elimination” to (21), i.e. we consider:

$$\det A(1|1) \times (21.1) - \det A(2|1) \times (21.2) + \ldots + (-1)^{n+1} \det A(n|1) \times (21.n),$$

where (21.i) denotes the i-th equation of (21). Writing $a_j$ for the j-th column of A, the LHS of the
combined equation is:

$$\begin{aligned}
& \big( a_{11} \det A(1|1) - a_{21} \det A(2|1) + \ldots + (-1)^{n+1} a_{n1} \det A(n|1) \big)\, x_1 \\
& \quad + \big( a_{12} \det A(1|1) - a_{22} \det A(2|1) + \ldots + (-1)^{n+1} a_{n2} \det A(n|1) \big)\, x_2 + \ldots \\
& \quad + \big( a_{1n} \det A(1|1) - a_{2n} \det A(2|1) + \ldots + (-1)^{n+1} a_{nn} \det A(n|1) \big)\, x_n \\
&= \det[a_1, a_2, \ldots, a_n]\, x_1 + \det[a_2, a_2, \ldots, a_n]\, x_2 + \ldots + \det[a_n, a_2, \ldots, a_n]\, x_n \\
&= \det[a_1, a_2, \ldots, a_n]\, x_1 + 0\,x_2 + \cdots + 0\,x_n = (\det A)\, x_1,
\end{aligned}$$

since every determinant with two equal columns vanishes. The RHS of the combined equation is:

$$b_1 \det A(1|1) - b_2 \det A(2|1) + \ldots + (-1)^{n+1} b_n \det A(n|1) = \det[b, a_2, \ldots, a_n].$$

Therefore, x1 will have a unique solution if det A ≠ 0, and this unique solution is:

$$x_1 = \frac{\det[b, a_2, \ldots, a_n]}{\det A}.$$

This is the well-known Cramer’s rule for x1 . Cramer’s rule for the other xi can be obtained by the trick of
switching columns.
Conclusion: An n × n system Ax = b will have a unique solution if and only if det A ≠ 0.
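
Cramer’s rule can be sketched in a few lines of Python. The version below uses the standard statement for all unknowns, namely that x_j is the determinant of A with its j-th column replaced by b, divided by det A; the notes derive only the x1 case explicitly. The names `det` and `cramer` and the 1 × 1 base case are assumptions made for the illustration, and exact fractions avoid rounding.

```python
from fractions import Fraction

def det(A):
    """Determinant by expansion along the first column, as in (22)."""
    if len(A) == 1:
        return A[0][0]  # assumed base case: det [a] = a
    return sum((-1) ** i * A[i][0]
               * det([row[1:] for k, row in enumerate(A) if k != i])
               for i in range(len(A)))

def cramer(A, b):
    """Solve Ax = b by Cramer's rule; return None when det A = 0."""
    d = det(A)
    if d == 0:
        return None
    solution = []
    for j in range(len(A)):
        # Replace the j-th column of A by the right-hand side b.
        Aj = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(A)]
        solution.append(Fraction(det(Aj), d))
    return solution

x = cramer([[1, 2, 3], [4, 5, 6], [7, 8, 10]], [1, 2, 3])
print([str(v) for v in x])  # ['-1/3', '2/3', '0']
```

Since each determinant is computed by the recursive expansion, this is useful for small systems and for checking the theory rather than as a practical solver.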

The Characterization of the Determinant:


The determinant satisfies the so-called “linear property” in the first column, provided that the other columns
are held fixed:

$$\det[\alpha u + \beta v, a_2, \ldots, a_n] = \alpha \cdot \det[u, a_2, \ldots, a_n] + \beta \cdot \det[v, a_2, \ldots, a_n].$$

Using the skew-symmetric property, one can easily “transfer” the linear property to the other columns.
Definition: A function D of n vector variables is called multi-linear if it is linear in each of its arguments,
provided that the other arguments are held fixed, i.e.

$$D(\ldots, \alpha u + \beta v, \ldots) = \alpha D(\ldots, u, \ldots) + \beta D(\ldots, v, \ldots).$$
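
The linear property can be checked on a numerical example. In this sketch the vectors, the scalars and the helper names `det` and `from_columns` are arbitrary illustrative choices; the determinant is computed by expansion along the first column with an assumed 1 × 1 base case.

```python
def det(A):
    """Determinant by expansion along the first column, as in (22)."""
    if len(A) == 1:
        return A[0][0]  # assumed base case: det [a] = a
    return sum((-1) ** i * A[i][0]
               * det([row[1:] for k, row in enumerate(A) if k != i])
               for i in range(len(A)))

def from_columns(*cols):
    """Build a matrix (list of rows) from its column vectors."""
    return [list(row) for row in zip(*cols)]

u, v = [1, 2, 3], [0, -1, 4]
a2, a3 = [2, 0, 1], [5, 7, -2]
alpha, beta = 3, -2
w = [alpha * x + beta * y for x, y in zip(u, v)]

# Linearity in the first column, other columns held fixed.
lhs = det(from_columns(w, a2, a3))
rhs = alpha * det(from_columns(u, a2, a3)) + beta * det(from_columns(v, a2, a3))
print(lhs == rhs)  # True

# The same check with the combination placed in the second column.
lhs2 = det(from_columns(a2, w, a3))
rhs2 = alpha * det(from_columns(a2, u, a3)) + beta * det(from_columns(a2, v, a3))
print(lhs2 == rhs2)  # True
```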

Theorem: (Characterization Theorem of Determinant) If D is a multi-linear, alternating function of n
variables of n-vectors, then there is a constant k such that:

$$D(a_1, \ldots, a_n) = k \cdot \det[a_1, \ldots, a_n].$$

The above k is given by $D(e_1, \ldots, e_n)$, where $e_i$ is the i-th column of the identity matrix $I_n$.
Sketch of Proof: Consider $F(a_1, \ldots, a_n) = D(a_1, \ldots, a_n) - k \det[a_1, \ldots, a_n]$. Then check that F is also
multi-linear and alternating, satisfying $F(e_1, \ldots, e_n) = 0$. Apply the multi-linear property of F step-by-step
to obtain the following huge expansion formula:

$$F(a_1, a_2, \ldots, a_n) = \sum_{i_n=1}^{n} \cdots \sum_{i_2=1}^{n} \sum_{i_1=1}^{n} a_{i_1 1}\, a_{i_2 2} \cdots a_{i_n n}\, F(e_{i_1}, e_{i_2}, \ldots, e_{i_n}).$$

If the subscripts $(i_1, i_2, \ldots, i_n)$ have some repetitions, then by the alternating property of F, we must have
$F(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = 0$. If all the subscripts $(i_1, i_2, \ldots, i_n)$ are distinct, then by a suitable sequence of
switchings, we can re-arrange them back to $(1, 2, \ldots, n)$, so $F(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \pm F(e_1, e_2, \ldots, e_n) = 0$.
Thus, F is identically zero and hence the theorem is proved. ∎
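
Applying the same expansion to det itself (which is multi-linear and alternating, with value 1 on the identity matrix) leaves only the terms whose subscripts form a permutation, each carrying the sign of that permutation. This is the sum of n! signed products mentioned at the beginning. A minimal Python sketch, in which the names `sign`, `permutation_sum` and `det` are illustrative and the sign is computed by counting inversions:

```python
from itertools import permutations

def sign(p):
    """+1 or -1 according to the parity of the number of inversions of p."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def permutation_sum(A):
    """Sum over all permutations (i_1, ..., i_n) of sign * a_{i_1 1} ... a_{i_n n}."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for col in range(n):
            term *= A[p[col]][col]  # entry in row i_col, column col
        total += term
    return total

def det(A):
    """Determinant by expansion along the first column, as in (22)."""
    if len(A) == 1:
        return A[0][0]  # assumed base case: det [a] = a
    return sum((-1) ** i * A[i][0]
               * det([row[1:] for k, row in enumerate(A) if k != i])
               for i in range(len(A)))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(permutation_sum(A), det(A))  # -3 -3
```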
