Determinants, Areas and Volumes
Theodore Voronov
December 4, 2005
Contents

1 Determinants
1.1 Matrices and vectors
1.2 Determinant as a multilinear alternating function
1.3 Properties of determinants
1.4 Calculations
1.5 Problems
§1 Determinants
You might have met determinants of small order (two by two and three by
three) in high school algebra and you will definitely study determinants in
detail in the linear algebra course. Here we shall recall the definition and basic facts about determinants in the form most convenient for seeing their geometric meaning.
Clearly, we can add vectors only of the same size, i.e., for a fixed n. Later
on you will study the general notion of a ‘vector space’ of dimension n and
will learn that Rn is the first example of such a space.
Vectors in Rn can be written as rows: x = (x1 , . . . , xn ), or as columns:

x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.
Such arrays are called matrices, or n × m matrices when it is necessary to indicate the dimensions. The indices i and j in the notation aij serve as labels specifying the element in the ith row and jth column. They are pronounced, for example, "two, three" for a23, not "twenty-three". The elements of a matrix are also called matrix entries. Row-vectors and column-vectors are particular cases of matrices (with only one row or only one column, respectively). An important role is played by matrices with n = m, called square matrices.
Example 1.2. A = \begin{pmatrix} 2 & 0 \\ -1 & 8 \end{pmatrix} is a 2 × 2 matrix (a square matrix). Here a11 = 2, a12 = 0, a21 = −1, a22 = 8.
Example 1.3. There are two special matrices. The zero matrix

0 = \begin{pmatrix} 0 & \dots & 0 \\ \dots & \dots & \dots \\ 0 & \dots & 0 \end{pmatrix}

(all entries are zeros) is defined for any n and m. In particular, there is the zero vector in Rn, all coordinates of which are zero. When n = m, i.e., among square matrices, there is another important matrix called the identity matrix:

E = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \dots & \dots & \dots & \dots \\ 0 & 0 & \dots & 1 \end{pmatrix}.

Its entries on the diagonal are 1 and all others are 0. An alternative notation is I. If we need to stress the dimension, then we can write En or In.
In the same way as for vectors, matrices of a fixed size can be multiplied by numbers and added or subtracted. This is done entry by entry:

\begin{pmatrix} 1 & 3 & 0 \\ -2 & 0 & 5 \end{pmatrix} + \begin{pmatrix} 0 & 2 & 1 \\ 1 & 4 & -2 \end{pmatrix} = \begin{pmatrix} 1 & 5 & 1 \\ -1 & 4 & 3 \end{pmatrix},
\qquad
7\begin{pmatrix} 2 & 2 \\ -3 & 0 \end{pmatrix} = \begin{pmatrix} 14 & 14 \\ -21 & 0 \end{pmatrix}.
A non-trivial operation in the matrix algebra is the matrix multiplication. It is introduced as follows. First we define the product of a row-vector a = (a1 , . . . , an ) with a column-vector b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} as the number ab = a1 b1 + · · · + an bn. This extends to more general matrices. If we are given a matrix A with n rows and p columns, and a matrix B with p rows and m columns, then each row of A can be multiplied with each column of B according to the above rule, giving, by definition, a matrix element of the matrix product of A and B:

A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1p} \\ \dots & \dots & \dots & \dots \\ a_{n1} & a_{n2} & \dots & a_{np} \end{pmatrix}, \qquad
B = \begin{pmatrix} b_{11} & \dots & b_{1m} \\ b_{21} & \dots & b_{2m} \\ \dots & \dots & \dots \\ b_{p1} & \dots & b_{pm} \end{pmatrix},

AB = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} + \dots + a_{1p}b_{p1} & \dots & a_{11}b_{1m} + a_{12}b_{2m} + \dots + a_{1p}b_{pm} \\ \dots & \dots & \dots \\ a_{n1}b_{11} + a_{n2}b_{21} + \dots + a_{np}b_{p1} & \dots & a_{n1}b_{1m} + a_{n2}b_{2m} + \dots + a_{np}b_{pm} \end{pmatrix}.
In other words, we take the products of the first row of A with all the columns
of B and thus obtain the first row of AB; then take the products of the second
row of A with the columns of B to obtain the second row of AB, and so on.
The product of an n × p matrix with a p × m matrix will be an n × m matrix.
This contains the product of a row-vector and a column-vector, giving a
number (a ‘1 × 1 matrix’), as a particular case.
The matrix multiplication is associative, i.e., satisfies A(BC) = (AB)C, and distributive w.r.t. the addition of matrices. However, in general it is not true that AB equals BA, even if the dimensions match. One can check that A0 = 0A = 0 and AE = EA = A for all matrices A.
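As an illustration of the row-by-column rule (not part of the original notes; the function names are just for this sketch), here is a minimal Python version of matrix multiplication:

    # Matrices are represented as lists of rows.
    def mat_mul(A, B):
        """Multiply an n x p matrix A by a p x m matrix B row by column."""
        n, p, m = len(A), len(B), len(B[0])
        assert all(len(row) == p for row in A), "the sizes must match"
        return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(m)]
                for i in range(n)]

    def identity(n):
        return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

    A = [[2, 0], [-1, 8]]
    B = [[3, 1], [0, 1]]
    E = identity(2)
    print(mat_mul(A, E) == A, mat_mul(E, A) == A)   # True True: AE = EA = A
    print(mat_mul(A, B), mat_mul(B, A))             # [[6, 2], [-3, 7]] vs [[5, 8], [-1, 8]]

The last line also shows on a concrete pair of matrices that AB and BA need not coincide.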
Example 1.6. For a 3 × 3 matrix

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}

we have

det A = a_{11}a_{22}a_{33} − a_{11}a_{23}a_{32} − a_{12}a_{21}a_{33} − a_{13}a_{22}a_{31} + a_{13}a_{21}a_{32} + a_{12}a_{23}a_{31}.
D(a1 , a2 , a3 , . . . , an ) = −D(a2 , a1 , a3 , . . . , an )
(and the same for any pair of columns ai and aj , where i, j = 1, . . . , n);
(4) for the identity matrix, D(E) = 1.
Proof. We shall give a proof for the case n = 2, and it will be clear how the
same logic works for an arbitrary n. Assume that the properties (1) to (4)
hold for a function D(A). As we shall see, this will allow us to obtain a unique formula for D(A). (This establishes the uniqueness part of the theorem.) Taking this formula as the definition of D(A), we shall then check that the properties are indeed satisfied. (This establishes the existence part.) So consider a 2 × 2
matrix A. Assuming the existence of D(A) with the above properties we
obtain:
D(A) = D(a_1, a_2) = D\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = D(a_{11}e_1 + a_{21}e_2,\ a_{12}e_1 + a_{22}e_2) =
a_{11}a_{12}\,D(e_1, e_1) + a_{11}a_{22}\,D(e_1, e_2) + a_{21}a_{12}\,D(e_2, e_1) + a_{21}a_{22}\,D(e_2, e_2).

Here e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. We used the properties (1) and (2) to "open the brackets".
brackets”. Now we shall use the property (3). (Note that it implies vanishing
of D(A) if two columns of the matrix coincide.) We have, continuing the
calculation,
D(A) = a_{11}a_{22}\,D(e_1, e_2) + a_{21}a_{12}\,D(e_2, e_1) =
a_{11}a_{22}\,D(e_1, e_2) − a_{21}a_{12}\,D(e_1, e_2) = (a_{11}a_{22} − a_{21}a_{12})\,D(e_1, e_2),
(we use round brackets for a matrix and vertical straight lines, for its deter-
minant).
Remark 1.1. From the proof of Theorem 1.1 it follows that an arbitrary function D(A) possessing the properties (1)–(3), but not necessarily (4), is unique up to a factor:

D(A) = det A · D(E).
Remark 1.4. The determinant of the product is the product of determinants.
You should be warned that it is not true that the determinant of the sum is the
sum of determinants! In fact, there is a formula for det(A + B), but it is quite
complicated.
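A quick numerical illustration of this remark (my own addition, using NumPy rather than anything from the notes):

    import numpy as np

    A = np.array([[2.0, 1.0], [0.0, 3.0]])
    B = np.array([[1.0, 4.0], [2.0, -1.0]])

    # det(AB) = det(A) det(B) ...
    print(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))   # both -54
    # ... but det(A + B) is not det(A) + det(B)
    print(np.linalg.det(A + B), np.linalg.det(A) + np.linalg.det(B))   # -4 vs -3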
We have considered determinant as a function of the matrix columns.
What about the properties of det A as a function of rows?
Theorem 1.3. The determinant det A of a square matrix A is a multilinear
alternating function of the rows of A. (That is, det A possesses the same
properties (1)–(3) w.r.t. the rows as it has w.r.t. the columns. It is uniquely
defined by these conditions and the condition det E = 1.)
For an n × m matrix A with the entries aij, the transpose of A, notation: A^T, is defined as the m × n matrix of the form

A^T = \begin{pmatrix} a_{11} & a_{21} & \dots & a_{n1} \\ a_{12} & a_{22} & \dots & a_{n2} \\ \dots & \dots & \dots & \dots \\ a_{1m} & a_{2m} & \dots & a_{nm} \end{pmatrix}.

In other words, the columns of A^T are the rows of A written as column-vectors, and vice versa. The (ij)-th matrix element of A^T is aji, the (ji)-th element of A.
Example 1.7.

\begin{pmatrix} 2 & 3 \\ -1 & 0 \end{pmatrix}^T = \begin{pmatrix} 2 & -1 \\ 3 & 0 \end{pmatrix}.
Theorem 1.4. For any square matrix A,
det AT = det A .
Determinants have several other important properties, which we shall not
discuss here. For example, it is possible to give a closed “general formula” for
the determinant of an n×n matrix, generalizing the formulas for determinants
of orders 2 and 3 given above in Examples 1.5 and 1.6. However, it is more
important to develop practical methods of calculating determinants, which
is done in the next subsection.
Proofs of Theorems 1.2, 1.3 and 1.4 follow. Notice that they are all based on the axioms (multilinearity and skew-symmetry with respect to columns) by which we defined the determinant.
Proof of Theorem 1.2. Consider det(AB) as a function of the columns of B. De-
note them b1 , . . . , bn . Notice that the columns of the matrix AB are the column-
vectors Ab1 , . . . , Abn (check the definition of the matrix product). It follows that
if a column of B is multiplied by a number c, the corresponding column of AB
will be multiplied by c. Similarly, if two columns of B are interchanged, then the
corresponding columns of AB will be interchanged. Therefore, by the properties of the determinant applied to det(AB), it follows that det(AB) as a function of b1 , . . . , bn possesses the properties (1)–(3) of Theorem 1.1. Hence, by Remark 1.1, det(AB) = det B · C, where the constant C is the value at B = E, i.e., C = det(AE) = det A. Thus det(AB) = det A · det B.
Proof of Theorem 1.3. Notice that for a matrix A, the multiplication from the left
by a matrix B acts as a transformation of rows of A. In particular, for
B = \begin{pmatrix} c & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \dots & \dots & \dots & \dots \\ 0 & 0 & \dots & 1 \end{pmatrix}
the map A ↦ BA is the multiplication of the first row of A by the number c. Similarly, by putting c in the kth place on the diagonal (and keeping the other diagonal elements equal to 1 and all off-diagonal ones 0) we obtain the matrix B such that A ↦ BA is the multiplication of the kth row by c. Notice that the determinant of such a matrix B equals c, for any k = 1, . . . , n (as it is obtained from E by multiplying the kth column by c). Hence, by Theorem 1.2, if a row of A is multiplied by c, the determinant of the resulting matrix will be det(BA) = det B det A = c det A. This implies linearity (see Remark 1.3). In the same way we can perform the interchange of two rows of A by multiplying A from the left by a certain matrix B. For example, A ↦ BA with
B = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ 1 & 0 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots \\ 0 & 0 & 0 & \dots & 1 \end{pmatrix}
acts as the interchange of the first and the second rows of A. Here B is obtained from the identity matrix E by swapping the first and the second columns. Hence det B = −1. For given i and j, one can similarly construct B such that A ↦ BA is the interchange of the ith row and the jth row. By Theorem 1.2 we have det(BA) = det B det A = − det A for the result of the interchange. That means that det A is also an alternating function of rows, and the theorem is proved.
Proof of Theorem 1.4. Consider f (A) = det AT as a function of the columns of
A. Since the columns of A are exactly the rows of AT , and by Theorem 1.3,
det AT is a multilinear alternating function of the rows of AT , it follows that
f (A) = det AT is a multilinear alternating function of the columns of A. Notice
also that f (E) = det E T = 1, since E T = E. Hence f (A) satisfies all the properties
(1)–(4) of Theorem 1.1 and must coincide with det A.
1.4 Calculations
Example 1.8. If a matrix A has a zero row or a zero column, then det A = 0.
Indeed, it follows from linearity. A linear function is zero for a zero argument:
if f (c a) = c f (a), then f (0) = f (c 0) = c f (0) for any c, so f (0) = 0.
Example 1.9. For a diagonal 2 × 2 matrix we have

\begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} = ad − 0 = ad.

In general, for a diagonal n × n matrix, its determinant is the product of the diagonal entries:

\begin{vmatrix} \lambda_1 & \dots & 0 \\ \dots & \ddots & \dots \\ 0 & \dots & \lambda_n \end{vmatrix} = \lambda_1 \lambda_2 \cdots \lambda_n.
Indeed, λi can be taken out successively from each row (using linearity) until
we obtain the identity matrix.
Example 1.10. If a matrix has all zeros below the diagonal, its determinant
is again the product of the diagonal entries.
Consider, for example, the case n = 3. Suppose

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix}.

We have

\det A = a_{33}\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & 1 \end{vmatrix}
= a_{33}\begin{vmatrix} a_{11} & a_{12} & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & 1 \end{vmatrix}
= a_{33}a_{22}\begin{vmatrix} a_{11} & a_{12} & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix}
= a_{33}a_{22}\begin{vmatrix} a_{11} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix}
= a_{33}a_{22}a_{11}\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = a_{33}a_{22}a_{11},
and the claim holds. Here we repeatedly used Remark 1.2 — more precisely, its
analog for rows: adding to any row a multiple of another row will not change the
determinant. In the same way we can treat the case of general n.
The same method of ‘row operations’ (i.e., simplifying the matrix by
multiplying/dividing a row by a number and adding a multiple of one row to
another) can be effectively applied for calculating an arbitrary determinant.
Example 1.11. Calculate \begin{vmatrix} 1 & -2 & 0 \\ 2 & 0 & 4 \\ 3 & -5 & 5 \end{vmatrix}. We have

\begin{vmatrix} 1 & -2 & 0 \\ 2 & 0 & 4 \\ 3 & -5 & 5 \end{vmatrix}
= \begin{vmatrix} 1 & -2 & 0 \\ 0 & 4 & 4 \\ 0 & 1 & 5 \end{vmatrix}
= 4\begin{vmatrix} 1 & -2 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 5 \end{vmatrix}
= 4\begin{vmatrix} 1 & -2 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 4 \end{vmatrix} = 4 · 4 = 16.
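The same row-operation strategy is easy to automate. The following Python sketch (an illustration of mine, using exact fractions to avoid rounding) reduces the matrix to triangular form, keeping track of the factors taken out and of the sign changes caused by row swaps:

    from fractions import Fraction

    def det_by_row_ops(rows):
        """Determinant via row operations (Gaussian elimination)."""
        A = [[Fraction(x) for x in row] for row in rows]
        n, det = len(A), Fraction(1)
        for j in range(n):
            # find a row with a nonzero entry in column j, swapping it up if needed
            pivot = next((i for i in range(j, n) if A[i][j] != 0), None)
            if pivot is None:
                return Fraction(0)        # a whole column of zeros: det A = 0
            if pivot != j:
                A[j], A[pivot] = A[pivot], A[j]
                det = -det                # interchanging two rows changes the sign
            det *= A[j][j]                # factor taken out of the pivot row
            A[j] = [x / A[j][j] for x in A[j]]
            for i in range(j + 1, n):     # clear the entries below the pivot
                c = A[i][j]
                A[i] = [a - c * b for a, b in zip(A[i], A[j])]
        return det

    print(det_by_row_ops([[1, -2, 0], [2, 0, 4], [3, -5, 5]]))   # 16, as in Example 1.11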
Example 1.12. Suppose that a matrix A has, in a certain row, all entries equal to zero except for one equal to 1. (A variant: the same for a column.) What can be said about its determinant? Consider a particular case of this situation. Let the first row of A have 1 as the first element and 0 at all other places:
A = \begin{pmatrix} 1 & 0 & \dots & 0 \\ a_{21} & a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix}.
Then we can repeatedly apply row operations, subtracting the first row multiplied by a21, a31, . . . , an1, respectively, from the second, the third, . . . , and the last row. We arrive at

\det A = \begin{vmatrix} 1 & 0 & \dots & 0 \\ a_{21} & a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{vmatrix}
= \begin{vmatrix} 1 & 0 & \dots & 0 \\ 0 & a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots \\ 0 & a_{n2} & \dots & a_{nn} \end{vmatrix}.
What is the value of the resulting determinant? Notice that it is a function f(r1 , . . . , rn−1) of the n − 1 row-vectors r1 = (a22 , . . . , a2n ), . . . , rn−1 = (an2 , . . . , ann ) belonging to Rn−1. Clearly, from the properties of det A it follows that this function is linear in each row-vector and alternating. Hence it is the (n − 1) × (n − 1) determinant

\begin{vmatrix} a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots \\ a_{n2} & \dots & a_{nn} \end{vmatrix}

up to a factor equal to the value of f(r1 , . . . , rn−1) at the identity matrix. We have
f(E) = \begin{vmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \dots & \dots & \dots & \dots \\ 0 & 0 & \dots & 1 \end{vmatrix} = \det E = 1

(here at the LHS the identity matrix E is (n − 1) × (n − 1), while at the RHS the identity matrix is n × n). Hence, finally,
\det A = \begin{vmatrix} 1 & 0 & \dots & 0 \\ a_{21} & a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{vmatrix}
= \begin{vmatrix} a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots \\ a_{n2} & \dots & a_{nn} \end{vmatrix}.
What will change if 1 appears as the second element in the first row rather than
the first? Similarly to the above we will have
\begin{vmatrix} 0 & 1 & 0 & \dots & 0 \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \dots & a_{3n} \\ \dots & \dots & \dots & \dots & \dots \\ a_{n1} & a_{n2} & a_{n3} & \dots & a_{nn} \end{vmatrix}
= \begin{vmatrix} 0 & 1 & 0 & \dots & 0 \\ a_{21} & 0 & a_{23} & \dots & a_{2n} \\ a_{31} & 0 & a_{33} & \dots & a_{3n} \\ \dots & \dots & \dots & \dots & \dots \\ a_{n1} & 0 & a_{n3} & \dots & a_{nn} \end{vmatrix}
= C \cdot \begin{vmatrix} a_{21} & a_{23} & \dots & a_{2n} \\ a_{31} & a_{33} & \dots & a_{3n} \\ \dots & \dots & \dots & \dots \\ a_{n1} & a_{n3} & \dots & a_{nn} \end{vmatrix},

where, as before, C is the value of the corresponding multilinear alternating function at the identity matrix; here C = −1, because this is just the determinant of the n × n identity matrix with the first and second rows swapped (equivalently, the first and second columns swapped). Hence
\begin{vmatrix} 0 & 1 & 0 & \dots & 0 \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \dots & a_{3n} \\ \dots & \dots & \dots & \dots & \dots \\ a_{n1} & a_{n2} & a_{n3} & \dots & a_{nn} \end{vmatrix}
= -\begin{vmatrix} a_{21} & a_{23} & \dots & a_{2n} \\ a_{31} & a_{33} & \dots & a_{3n} \\ \dots & \dots & \dots & \dots \\ a_{n1} & a_{n3} & \dots & a_{nn} \end{vmatrix}.
One can in the same way see that for 1 at the kth position in the first row, the
answer will include the factor (−1)k−1 :
\begin{vmatrix} 0 & \dots & 0 & 1 & 0 & \dots & 0 \\ a_{21} & \dots & a_{2,k-1} & a_{2k} & a_{2,k+1} & \dots & a_{2n} \\ a_{31} & \dots & a_{3,k-1} & a_{3k} & a_{3,k+1} & \dots & a_{3n} \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ a_{n1} & \dots & a_{n,k-1} & a_{nk} & a_{n,k+1} & \dots & a_{nn} \end{vmatrix}
= (-1)^{k-1}\begin{vmatrix} a_{21} & \dots & a_{2,k-1} & a_{2,k+1} & \dots & a_{2n} \\ a_{31} & \dots & a_{3,k-1} & a_{3,k+1} & \dots & a_{3n} \\ \dots & \dots & \dots & \dots & \dots & \dots \\ a_{n1} & \dots & a_{n,k-1} & a_{n,k+1} & \dots & a_{nn} \end{vmatrix}.
The main idea in this example is that, for matrices of a special appearance,
the determinant of order n reduces to a determinant of order n − 1. This can be
used for a general practical algorithm of calculating determinants.
Theorem (expansion in the first row). For any n × n matrix A,

\det A = a_{11}M_{11} - a_{12}M_{12} + a_{13}M_{13} - \dots + (-1)^{n-1}a_{1n}M_{1n},

where a11 , . . . , a1n are the entries of the first row and M11 , . . . , M1n are the determinants of the (n−1)×(n−1) matrices obtained from A by crossing out the first row and the first, the second, . . . , and the last columns, respectively.
Proof. Consider the first row of A. It is the sum a11 e1 + · · · + a1n en where ek ,
k = 1, . . . , n, is the row-vector with the single nonzero entry equal to 1 at the kth
place. By the linearity of det A, we have det A = a11 det A1 + · · · + a1n det An where
Ak is the n × n matrix obtained from A by replacing its first row by ek . These are
exactly the determinants calculated in Example 1.12, so det Ak = (−1)k−1 M1k , for
all k = 1, . . . , n, which proves the theorem.
The determinants M11 , . . . , M1n (and similar ones) are called the minors
of the matrix A. In general, a minor of A is the determinant of the square
matrix obtained from A by crossing out some rows and columns (the same
number of rows and columns). There are generalizations of the above ex-
pansion including other minors. In particular, there is an expansion in the
second row (instead of the first row), and in any given row, and in the first
column, as well as in any given column. The formulas are similar to the
above, but the signs will depend both on a row and a column.
Example 1.13. Calculate the determinant of the 3 × 3 matrix \begin{pmatrix} 2 & 1 & 1 \\ 0 & -3 & 5 \\ 1 & 0 & 3 \end{pmatrix} using the expansion in the first row. Solution: we have

\begin{vmatrix} 2 & 1 & 1 \\ 0 & -3 & 5 \\ 1 & 0 & 3 \end{vmatrix}
= 2\begin{vmatrix} -3 & 5 \\ 0 & 3 \end{vmatrix} - \begin{vmatrix} 0 & 5 \\ 1 & 3 \end{vmatrix} + \begin{vmatrix} 0 & -3 \\ 1 & 0 \end{vmatrix}
= 2(-3 \cdot 3) - (-5) + 3 = -18 + 5 + 3 = -10.
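For comparison, here is a short recursive Python version of the expansion in the first row (again just an illustrative sketch, not something from the notes); it reproduces the answer of Example 1.13:

    def det_first_row(A):
        """Determinant by recursive expansion in the first row."""
        n = len(A)
        if n == 1:
            return A[0][0]
        total = 0
        for k in range(n):
            # the minor M_{1,k+1}: cross out the first row and the (k+1)-th column
            minor = [row[:k] + row[k + 1:] for row in A[1:]]
            total += (-1) ** k * A[0][k] * det_first_row(minor)
        return total

    print(det_first_row([[2, 1, 1], [0, -3, 5], [1, 0, 3]]))   # -10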
Example 1.14. Calculate the determinant of the 4 × 4 matrix \begin{pmatrix} 1 & 1 & 0 & 4 \\ -2 & 0 & 3 & 6 \\ 0 & 1 & 5 & -1 \\ 3 & -3 & 6 & 1 \end{pmatrix} using the expansion in the first row. Solution: calculate first the minors M11 , . . . , M14 . We have
M_{11} = \begin{vmatrix} 0 & 3 & 6 \\ 1 & 5 & -1 \\ -3 & 6 & 1 \end{vmatrix} = -3\begin{vmatrix} 1 & -1 \\ -3 & 1 \end{vmatrix} + 6\begin{vmatrix} 1 & 5 \\ -3 & 6 \end{vmatrix} = -3 \cdot (-2) + 6 \cdot 21 = 132
M_{12} = \begin{vmatrix} -2 & 3 & 6 \\ 0 & 5 & -1 \\ 3 & 6 & 1 \end{vmatrix} = -2\begin{vmatrix} 5 & -1 \\ 6 & 1 \end{vmatrix} - 3\begin{vmatrix} 0 & -1 \\ 3 & 1 \end{vmatrix} + 6\begin{vmatrix} 0 & 5 \\ 3 & 6 \end{vmatrix} = -2 \cdot 11 - 3 \cdot 3 + 6 \cdot (-15) = -121
M_{13} = \begin{vmatrix} -2 & 0 & 6 \\ 0 & 1 & -1 \\ 3 & -3 & 1 \end{vmatrix} = -2\begin{vmatrix} 1 & -1 \\ -3 & 1 \end{vmatrix} + 6\begin{vmatrix} 0 & 1 \\ 3 & -3 \end{vmatrix} = -2 \cdot (-2) + 6 \cdot (-3) = -14
M_{14} = \begin{vmatrix} -2 & 0 & 3 \\ 0 & 1 & 5 \\ 3 & -3 & 6 \end{vmatrix} = -2\begin{vmatrix} 1 & 5 \\ -3 & 6 \end{vmatrix} + 3\begin{vmatrix} 0 & 1 \\ 3 & -3 \end{vmatrix} = -2 \cdot 21 + 3 \cdot (-3) = -51
Now we have

\det A = 1 \cdot M_{11} - 1 \cdot M_{12} + 0 \cdot M_{13} - 4 \cdot M_{14} = 132 + 121 + 0 + 204 = 457.

(We might have noticed earlier that it was not necessary to calculate M13, since it enters with the coefficient a13 = 0!)
1.5 Problems
Problem 1.1. Carry out the following matrix operations:
(a) AB and BA if A = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix} and B = \begin{pmatrix} 3 & 1 \\ 0 & 1 \end{pmatrix};
(b) AB − BA where A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 1 & 2 \\ -1 & 2 & 1 \end{pmatrix} and B = \begin{pmatrix} 3 & 1 & -2 \\ 3 & -2 & 4 \\ -3 & 5 & -1 \end{pmatrix}.
(Ans.: (a) \begin{pmatrix} 6 & 2 \\ 3 & 2 \end{pmatrix} and \begin{pmatrix} 7 & 1 \\ 1 & 1 \end{pmatrix}; (b) 0.)
Problem 1.2. Evaluate the determinants:

(a) \begin{vmatrix} 5 & 2 \\ 7 & 3 \end{vmatrix}, (b) \begin{vmatrix} 3 & 2 \\ 8 & 5 \end{vmatrix}, (c) \begin{vmatrix} 2 & 1 & 3 \\ 5 & 3 & 2 \\ 1 & 4 & 3 \end{vmatrix}, (d) \begin{vmatrix} 4 & -3 & 5 \\ 3 & -2 & 8 \\ 1 & -7 & -5 \end{vmatrix}, (e) \begin{vmatrix} 5 & 1 & 2 & 7 \\ 3 & 0 & 0 & 2 \\ 1 & 3 & 4 & 5 \\ 2 & 0 & 0 & 3 \end{vmatrix}.

For the third and fourth order determinants you should use row operations or the expansion in the first row.
(Ans.: (a) 1; (b) −1; (c) 40; (d) 100; (e) 10.)
Problem 1.3. Suppose A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} and B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}. Verify directly that det AB = det A · det B.
Problem 1.4. Consider a system of linear equations:

a_{11}x_1 + a_{12}x_2 = b_1,
a_{21}x_1 + a_{22}x_2 = b_2.

Solve it and show that the solution is given by the formulae

x_1 = \frac{1}{\Delta}\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix}, \qquad x_2 = \frac{1}{\Delta}\begin{vmatrix} a_{11} & b_1 \\ a_{21} & b_2 \end{vmatrix},

where \Delta = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}. (For example, you can express x1 from the first equation, substitute it into the second equation, solve the resulting equation for x2 and substitute the answer into the expression for x1; you may assume that you can divide by any expression whenever you need it.)
Problem 1.5. For vectors in Rn there is the notion of a 'scalar product'. For row-vectors a = (a1 , . . . , an ) and b = (b1 , . . . , bn ) their scalar product (a, b) is the number ab^T = a1 b1 + · · · + an bn. (For column-vectors the formula will be (a, b) = a^T b.) Vectors are said to be orthogonal or perpendicular if their scalar product vanishes. Now, in R3 there is another notion of a 'vector product': for a = (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ) their vector product a × b is a vector defined as the symbolic determinant

a \times b = \begin{vmatrix} e_1 & e_2 & e_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}

where the first row consists of the vectors e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) and the determinant (whose value is a vector in R3) is understood via its expansion in the first row.
(a) Evaluate e1 × e2 , e2 × e3 , e3 × e1 .
(b) Show that for any vectors a and b, the vector product a × b is perpendicular to each of a and b. (Hint: check that for an arbitrary vector c ∈ R3,

(c, a \times b) = (b, c \times a) = (a, b \times c) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}

and use the properties of determinants.)
(c) Calculate n = a × b for a = (0, 1, −2) and b = (3, −4, 7) and directly verify that (n, a) = (n, b) = 0.
(d) Think how the notion of the vector product can be extended to Rn for arbitrary n. (Hint: it cannot be a product of two vectors except for n = 3.)
Problem 1.6. Consider an arbitrary 2 × 2 matrix A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.
(a) Check that f(λ) = det(A − λE), where λ is a parameter, equals λ² − (a + d)λ + ad − bc.
(b) Show that, for an arbitrary matrix A, it satisfies the matrix identity

A^2 - (a + d)A + (ad - bc)E = 0.
§2 Areas and Volumes

The area of a two-dimensional object such as a region of the plane and the volume of a three-dimensional object such as a solid body in space, as well as the length of an interval of the real line, are all particular cases of a very general notion of measure. General measure theory is a part of analysis. Here we shall focus on the geometrical side of the idea of measure and its relation with the algebraic notion of determinant.
certain infinite unions). We want to add to them some properties peculiar to the plane R2.
A translation of the space Rn is the map Ta : Rn → Rn that takes every
point x ∈ Rn to x + a, where a ∈ Rn is a fixed vector. For a subset S ⊂ Rn ,
a translation Ta “shifts” all points of S along a, i.e., each point x ∈ S is
mapped to x + a, and S moves “rigidly” to its new location in Rn .
Example 2.1. A disk of radius R and center O = (0, 0) in R2 under the
translation Ta where a = (a1 , a2 ) is mapped to the disk of the same radius
R with center (a1 , a2 ). The area of a disk, clearly, should not depend on the position of the center.
The following natural properties hold for areas in R2 :
(3) Area is invariant under translations: area Ta (S) = area S, for all
vectors a ∈ R2 .
(4) For a one-dimensional object, such as a segment, the area should
vanish.
We do not define precisely what a 'one-dimensional object' is. However, examples such as segments and their unions will be sufficient for our purposes.
The plan now is as follows. Using conditions (1)–(4) we shall establish
a deep link between the notion of area and the theory of determinants. To
this end, we consider the area of a simple polygon, a parallelogram. Later
our considerations will be generalized to Rn .
Let a = (a1 , a2 ), b = (b1 , b2 ) be vectors in R2 . The parallelogram on a, b
with basepoint O ∈ R2 is the set of points of the form
x = O + ta + sb where 0 ≤ t, s ≤ 1 .
One can easily see that it is the plane region bounded by the two pairs of
parallel straight line segments: OA, BC and OB, AC where C = O + a + b.
[Figure: the parallelogram Π(a, b) with vertices O, A = O + a, B = O + b, C = O + a + b.]
What is the area of it? From property (3) it follows that the area does
not depend on the location of our parallelogram in R2 : by a translation the
basepoint O can be made an arbitrary point of the plane without changing the
area. Let us assume that O = 0 is the point (0, 0). Denote the parallelogram
by Π(a, b). Then area Π(a, b) is a function of vectors a, b.
Proposition 2.1. The function area Π(a, b) has the following properties:
(1) area Π(na, b) = area Π(a, nb) = |n| · area Π(a, b) for any n ∈ Z;
(2) area Π(a, b + ka) = area Π(a + kb, b) = area Π(a, b) for any k ∈ R.
Proof. Suppose we replace a by na for a positive integer n. Then Π(na, b)
is the union of n copies of the parallelogram Π(a, b):
[Figure: Π(na, b) decomposed into n copies of Π(a, b), translated one after another along the vector a.]
Theorem 2.1. Suppose a = (a1 , a2 ), b = (b1 , b2 ). Then

area Π(a, b) = C · |∆| ,   (1)

where ∆ = det(a, b) = a1 b2 − a2 b1 and C = area Π(e1 , e2 ) is the area of the unit square.
Example 2.3. Find the area of the triangle ABC if A = (3, 2), B = (4, 2), C = (1, 0). Solution: it is half of the area of the parallelogram built on the vectors CA = A − C = (2, 2) and CB = B − C = (3, 2). Hence

area(ABC) = \frac{1}{2}\begin{vmatrix} 3 & 2 \\ 2 & 2 \end{vmatrix} = 1.
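A short Python check of this computation (my own illustration):

    def det2(u, v):
        """Determinant of the 2 x 2 matrix with rows u and v."""
        return u[0] * v[1] - u[1] * v[0]

    A, B, C = (3, 2), (4, 2), (1, 0)
    CA = (A[0] - C[0], A[1] - C[1])       # (2, 2)
    CB = (B[0] - C[0], B[1] - C[1])       # (3, 2)
    print(abs(det2(CB, CA)) / 2)          # 1.0, the area of the triangle ABC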
How can we make sense of the determinant det(a, b) as such, not its absolute value? It corresponds to the notion of signed, or oriented, area. Denote it Area Π(a, b), with a capital "A". By definition, signed area satisfies
Example 2.4. Find the oriented volume of the parallelepiped built on a1 = (2, 1, 0), a2 = (0, 3, 11) and a3 = (1, 2, 7) in R3. Solution:

Vol Π(a_1, a_2, a_3) = \begin{vmatrix} 2 & 1 & 0 \\ 0 & 3 & 11 \\ 1 & 2 & 7 \end{vmatrix} = 2(3 \cdot 7 - 11 \cdot 2) - 1(0 \cdot 7 - 11 \cdot 1) = -2 + 11 = 9.
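Again, a short Python check (illustrative only), using the 3 × 3 determinant formula of Example 1.6:

    def det3(r1, r2, r3):
        """3 x 3 determinant with rows r1, r2, r3 (expansion in the first row)."""
        a, b, c = r1
        d, e, f = r2
        g, h, i = r3
        return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

    print(det3((2, 1, 0), (0, 3, 11), (1, 2, 7)))   # 9, the oriented volume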
Remark 2.1. The space Rn together with the scalar product is called the n-dimensional Euclidean space. The adjective 'Euclidean' points to the fact that the scalar product allows us to define lengths and angles, i.e., the main notions of classical Euclidean geometry.

Remark 2.2. The scalar product is alternatively denoted a · b and hence is often referred to as the 'dot product'.
It seems that the unit cube Π(e1 , . . . , en ) plays a distinguished role. Later
we shall show that any unit cube in Rn has unit volume. Consider an example
(for n = 2 we continue to use Area instead of Vol).
Example 2.5. Let g1 = (cos α, sin α), g2 = (− sin α, cos α) in R2 . We can
immediately see that |g1 | = |g2 | = 1 and g1 · g2 = 0, so Π(g1 , g2 ) is a unit
square. We have
Area Π(g_1, g_2) = \begin{vmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{vmatrix} = \cos^2\alpha + \sin^2\alpha = 1.
There is a way of expressing the volume of a parallelepiped entirely in terms of 'intrinsic' geometric information: lengths of vectors and angles between them, rather than their coordinates as in the previous formulas. Consider the matrix

G(a_1, \dots, a_k) = \begin{pmatrix} (a_1, a_1) & \dots & (a_1, a_k) \\ \dots & \dots & \dots \\ (a_k, a_1) & \dots & (a_k, a_k) \end{pmatrix}.   (6)

Here k ≤ n may be less than n.
Definition 2.1. The matrix G(a1 , . . . , an ) is called the Gram matrix of the
system of vectors a1 , . . . , an and its determinant, the Gram determinant.
Theorem 2.2. The Gram determinant of a1 , . . . , an is the square of the volume of the parallelepiped Π(a1 , . . . , an ).
Proof. Indeed, consider the n × n matrix A with rows ai. Consider

A A^T = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \dots & \dots & \dots \\ a_{n1} & \dots & a_{nn} \end{pmatrix}\begin{pmatrix} a_{11} & \dots & a_{n1} \\ \dots & \dots & \dots \\ a_{1n} & \dots & a_{nn} \end{pmatrix}
= \begin{pmatrix} a_1 \\ \dots \\ a_n \end{pmatrix}\begin{pmatrix} a_1^T & \dots & a_n^T \end{pmatrix}
= \begin{pmatrix} a_1 a_1^T & \dots & a_1 a_n^T \\ \dots & \dots & \dots \\ a_n a_1^T & \dots & a_n a_n^T \end{pmatrix} = G(a_1, \dots, a_n).

Hence det G(a_1, \dots, a_n) = \det(AA^T) = \det A \cdot \det A^T = (\det A)^2, which is the square of the (oriented) volume of Π(a_1, \dots, a_n).
Example 2.6. Find the area of the parallelogram built on a = (1, −1, 2) and b = (2, 0, 3). Solution: the Gram determinant is

\begin{vmatrix} 6 & 8 \\ 8 & 13 \end{vmatrix} = 14.

Hence the area is \sqrt{14}.
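The same computation in a few lines of Python (an illustration; the dot products are written out explicitly):

    import math

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def gram_det_2(a, b):
        """Gram determinant of two vectors in R^n."""
        return dot(a, a) * dot(b, b) - dot(a, b) ** 2

    a, b = (1, -1, 2), (2, 0, 3)
    print(gram_det_2(a, b))               # 14
    print(math.sqrt(gram_det_2(a, b)))    # 3.7416... = sqrt(14), the area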
by the condition (a, h) = 0). It is a very basic fact following from the
same sort of ideas that led us to discovering the relation between areas and determinants. For n = 3, the similar fact about the volume of a 3-dimensional parallelepiped is also very familiar. This relation between the n-dimensional
volume in the Euclidean space Rn and the (n − 1)-dimensional volume in the
Euclidean space Rn−1 holds for any n. The easiest way to prove it is by using
the Gram determinants.
Temporarily introduce the notation voln and voln−1 for distinguishing between volumes in Rn and Rn−1. Let us consider the (n − 1)-dimensional parallelepiped Π(a1 , . . . , an−1 ) as a base of the parallelepiped Π(a1 , . . . , an ). Then
the height of Π(a1 , . . . , an ) is the length of a (unique) vector h defined by
the condition h = an + c where c is a linear combination of a1 , . . . , an−1 and
h is perpendicular to the plane of a1 , . . . , an−1 (i.e., to each of the vectors
a1 , . . . , an−1 ).
Theorem 2.3. The following formula holds:
voln Π(a1 , . . . , an ) = voln−1 Π(a1 , . . . , an−1 ) · h ,
where h = |h| is the height of Π(a1 , . . . , an ).
Proof. Using the properties of volumes we immediately conclude that
voln Π(a1 , . . . , an ) = voln Π(a1 , . . . , an−1 , h).
Now we can apply the Gram determinant:

\det G(a_1, \dots, a_{n-1}, h) = \begin{vmatrix} (a_1, a_1) & \dots & (a_1, a_{n-1}) & (a_1, h) \\ \dots & \dots & \dots & \dots \\ (a_{n-1}, a_1) & \dots & (a_{n-1}, a_{n-1}) & (a_{n-1}, h) \\ (h, a_1) & \dots & (h, a_{n-1}) & (h, h) \end{vmatrix} =
\begin{vmatrix} (a_1, a_1) & \dots & (a_1, a_{n-1}) & 0 \\ \dots & \dots & \dots & \dots \\ (a_{n-1}, a_1) & \dots & (a_{n-1}, a_{n-1}) & 0 \\ 0 & \dots & 0 & (h, h) \end{vmatrix} = \det G(a_1, \dots, a_{n-1}) \cdot |h|^2,

and it remains to extract the square root.
2.4.1 The distance between a point and a plane
Consider a plane L in R3 and a point x not belonging to L. What is the
distance between x and L? It is natural to define it as the minimum of the
distances between x and points of the plane L. A practical calculation of
it can be nicely related with areas and volumes. Indeed, let y ∈ L be an arbitrary point of the plane; the distance between x and y is |x − y|. We
can write x − y = ak + a⊥ where the vector ak is parallel to L and a⊥ is
perpendicular to it. Hence |x − y|2 = |ak |2 + |a⊥ |2 ≥ |a⊥ |2 . The part ak can
vary by adding vectors parallel to L, while a⊥ is unique as long as x is given.
It is clear now that the shortest length so obtained is for x − y = a⊥ , i.e.,
when y is the end of the perpendicular dropped on L from the point x. (Draw
a picture.) Let O be some fixed point of the plane L and let vectors a1 , . . . , ak
‘span’ L so that an arbitrary point of L has the appearance O + c1 a1 + . . . +
ck ak . Then we can consider the k-dimensional parallelepiped Π(a1 , . . . , ak ) and the (k + 1)-dimensional parallelepiped Π(a1 , . . . , ak , x − O), both with basepoint O. Clearly, the desired distance is the height of Π(a1 , . . . , ak , x − O).
Corollary 2.2. The distance between a point x and a plane L through a point O in the direction of vectors a1 , . . . , ak is given by the formula

\frac{\operatorname{vol} \Pi(a_1, \dots, a_k, x - O)}{\operatorname{vol} \Pi(a_1, \dots, a_k)} = \frac{\sqrt{\det G(a_1, \dots, a_k, x - O)}}{\sqrt{\det G(a_1, \dots, a_k)}}.
Example 2.7. Given a plane through the points A = (1, 0, 0), B = (0, 1, 0), C = (0, 0, 1). Find the distance between it and the point D = (3, 2, −1). Solution: denote the desired distance by h. Consider the vectors a = CA = (1, 0, −1), b = CB = (0, 1, −1) and d = CD = (3, 2, −2). We have

h = \frac{\operatorname{vol} \Pi(a, b, d)}{\operatorname{area} \Pi(a, b)}.
The numerator is easier to find by using formula (4). We have

\begin{vmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 3 & 2 & -2 \end{vmatrix} = \begin{vmatrix} 1 & -1 \\ 2 & -2 \end{vmatrix} - \begin{vmatrix} 0 & 1 \\ 3 & 2 \end{vmatrix} = 3,
hence vol Π(a, b, d) = 3. For the denominator we calculate the Gram determinant:

\det G(a, b) = \begin{vmatrix} (a, a) & (a, b) \\ (b, a) & (b, b) \end{vmatrix} = \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} = 3;

hence area Π(a, b) = \sqrt{\det G(a, b)} = \sqrt{3}. Finally, h = 3/\sqrt{3} = \sqrt{3}.
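Example 2.7 can be checked in Python as well; this sketch (with hypothetical helper names of my own) implements the formula of Corollary 2.2 via Gram determinants:

    import math

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def gram_det(vectors):
        """Determinant of the Gram matrix of 2 or 3 given vectors."""
        G = [[dot(u, v) for v in vectors] for u in vectors]
        if len(G) == 2:
            return G[0][0] * G[1][1] - G[0][1] * G[1][0]
        a, b, c = G
        return (a[0] * (b[1] * c[2] - b[2] * c[1])
                - a[1] * (b[0] * c[2] - b[2] * c[0])
                + a[2] * (b[0] * c[1] - b[1] * c[0]))

    a, b, d = (1, 0, -1), (0, 1, -1), (3, 2, -2)
    h = math.sqrt(gram_det([a, b, d]) / gram_det([a, b]))
    print(h)   # 1.7320... = sqrt(3)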
How can we use this? Suppose we want to calculate the area of a certain
domain D ⊂ R2 . Choose some system of curvilinear coordinates in which D
can be conveniently described. Consider a partition of D by the coordinate
lines v = vk , u = ul , k, l = 1, . . . , N , where ul+1 − ul = ∆u, vk+1 − vk =
∆v. Then the area of D is approximated by the sum of the areas of the
curvilinear quadrangles ∆S as above, where u varies between ul and ul+1 ,
and v, between vk and vk+1 . We can also approximate each of ∆S by the
area of the parallelogram Π(eu ∆u, ev ∆v). The basepoints are on the grid,
and the vectors eu , ev are calculated at the corresponding points of the grid.
Hence

\operatorname{area} D = \lim \sum_{k,l} \Delta S = \lim \sum_{k,l} \operatorname{area} \Pi(e_u \Delta u, e_v \Delta v) = \lim \sum_{k,l} \sqrt{g}\, \Delta u\, \Delta v,

where g = det G(eu , ev ). It is just the integral sum for a double integral, and we arrive at the following statement.
Proposition 2.3. For the area of a domain D ⊂ R2 we have

\operatorname{area} D = \iint_D dS, \quad \text{where } dS = \sqrt{g}\, du\, dv.

The expression dS under the integral sign is called the element of area.
Example 2.10. The element of area in the standard coordinates x, y and in polar coordinates r, θ will be

dS = dx\, dy = r\, dr\, d\theta,

using the result of Example 2.9.
Example 2.11. Find the area of a disk DR of radius R using the above formulas. Solution: let the center of the disk be at the origin; in polar coordinates we have 0 ≤ r ≤ R, 0 ≤ θ ≤ 2π, and

\operatorname{area} D_R = \iint_{D_R} dS = \iint_{D_R} r\, dr\, d\theta = \int_0^{2\pi} d\theta \int_0^R r\, dr = 2\pi \cdot \frac{1}{2} R^2 = \pi R^2.
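Such formulas are easy to test numerically. A rough Riemann-sum sketch in Python (my own illustration, not from the notes) for the disk of Example 2.11:

    import math

    def disk_area(R, N=500):
        """Riemann sum for  iint r dr dtheta  over 0 <= r <= R, 0 <= theta <= 2*pi."""
        dr, dth = R / N, 2 * math.pi / N
        total = 0.0
        for i in range(N):
            r = (i + 0.5) * dr            # midpoint of the i-th r-interval
            for j in range(N):
                total += r * dr * dth     # the element of area r dr dtheta
        return total

    R = 2.0
    print(disk_area(R), math.pi * R**2)   # both close to 12.566...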
Example 2.12. Similarly, for a sector of the same disk, of angle ∆θ, we have (denoting the sector by S)

\operatorname{area} S = \iint_S dS = \iint_S r\, dr\, d\theta = \int_0^{\Delta\theta} d\theta \int_0^R r\, dr = \frac{1}{2} R^2 \Delta\theta.
Exactly in the same way this works for pieces of surfaces in R3. If a surface M is parametrized by parameters u, v, then we again have vectors

e_u = \frac{\partial x}{\partial u}, \qquad e_v = \frac{\partial x}{\partial v},

which are now in the tangent planes at points of M (so they vary from point to point). We have parallelograms Π(eu ∆u, ev ∆v) in the tangent planes, approximating infinitesimal pieces of the surface M. Hence for the element of area we have the same formula

dS = \sqrt{g}\, du\, dv, \quad \text{where } g = \det G(e_u, e_v),

and the area of (a piece of) M is given by a double integral:

\operatorname{area} M = \iint_M dS = \iint_M \sqrt{g}\, du\, dv.
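As an illustration of the last formula (again my own sketch, not part of the notes), take the sphere of radius R parametrized by the polar angle u and the azimuthal angle v, x(u, v) = R(sin u cos v, sin u sin v, cos u). Then (eu, eu) = R², (ev, ev) = R² sin²u and (eu, ev) = 0, so √g = R² sin u, and a numerical integration recovers the familiar area 4πR²:

    import math

    def sphere_area(R, N=400):
        """Integrate sqrt(g) du dv with sqrt(g) = R^2 sin(u), 0 <= u <= pi, 0 <= v <= 2*pi."""
        du, dv = math.pi / N, 2 * math.pi / N
        total = 0.0
        for i in range(N):
            u = (i + 0.5) * du
            sqrt_g = R**2 * math.sin(u)
            total += sqrt_g * du * (dv * N)   # the integrand does not depend on v
        return total

    R = 1.0
    print(sphere_area(R), 4 * math.pi * R**2)   # both close to 12.566...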
2.5 Problems
Problem 2.1. Find the areas and volumes (signed where indicated):
(a) Area Π(a, b) if a = (− cos α, − sin α), b = (− sin α, cos α) in R2 ; make a
sketch;
(b) Vol Π(a, b, c) if a = (3, 2, −1), b = (2, 2, 5), c = (0, 0, 1) in R3 ;
(c) area(a, b) if a = (1, −1, 2, 3), b = (0, 3, 1, 2) in R4 .
(Ans.: (a) −1; (b) 2; (c) 15.)
Problem 2.2. Verify by a direct calculation that the Gram determinant
det G(a, b, c) vanishes if one of the vectors a, b, c is a linear combination
of the others. (Geometrically that means that they are in the same plane.)
What can be said about the volume of the parallelepiped Π(a, b, c)?
Problem 2.3. Show that the oriented area of a triangle ABC in R2 is given by the formula

\operatorname{Area}(ABC) = \frac{1}{2}\begin{vmatrix} A_1 & A_2 & 1 \\ B_1 & B_2 & 1 \\ C_1 & C_2 & 1 \end{vmatrix}

if A = (A1 , A2 ), B = (B1 , B2 ), C = (C1 , C2 ).
Problem 2.5. Show that |a×b| = |a| |b|·sin α where α is the angle between
a and b.
Hint: use the result of the previous problem, part (b), and express the area
by the Gram determinant.
Problem 2.6. Show that the distance of a point C = (C1 , C2 ) from the straight line passing through points A = (A1 , A2 ) and B = (B1 , B2 ) in the plane is given by the absolute value of the expression

\frac{1}{|B - A|}\begin{vmatrix} A_1 & A_2 & 1 \\ B_1 & B_2 & 1 \\ C_1 & C_2 & 1 \end{vmatrix}.
Problem 2.7. Find the area of the sphere S_R: x^2 + y^2 + z^2 = R^2 as the integral \iint_{S_R} dS, using the result of Example 2.13. (Answer: 4\pi R^2.)