Basics of Affine Geometry
L’algèbre n’est qu’une géométrie écrite; la géométrie n’est qu’une algèbre figurée.
—Sophie Germain
90 CHAPTER 4. BASICS OF AFFINE GEOMETRY
b = a + ab,
addition being understood as addition in R3 . Then, in the standard frame, given a point
x = (x1 , x2 , x3 ), the position of x is the vector Ox = (x1 , x2 , x3 ), which coincides with the
point itself. In the standard frame, points and vectors are identified. Points and free vectors
are illustrated in Figure 4.1.
4.1. AFFINE SPACES 91
[Figure 4.1: Points and free vectors.]
What if we pick a frame with a different origin, say Ω = (ω1 , ω2 , ω3 ), but the same basis
vectors (e1 , e2 , e3 )? This time, the point x = (x1 , x2 , x3 ) is defined by two position vectors:
Ox = (x1 , x2 , x3 )
Ωx = (x1 − ω1 , x2 − ω2 , x3 − ω3 )
If we choose a different basis $(e'_1, e'_2, e'_3)$ and if the matrix $P$ expressing the vectors $(e'_1, e'_2, e'_3)$ over the basis $(e_1, e_2, e_3)$ is
\[
P = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix},
\]
which means that the columns of $P$ are the coordinates of the $e'_j$ over the basis $(e_1, e_2, e_3)$, since
\[
u_1 e_1 + u_2 e_2 + u_3 e_3 = u'_1 e'_1 + u'_2 e'_2 + u'_3 e'_3
\]
and
\[
v_1 e_1 + v_2 e_2 + v_3 e_3 = v'_1 e'_1 + v'_2 e'_2 + v'_3 e'_3,
\]
it is easy to see that the coordinates $(u_1, u_2, u_3)$ and $(v_1, v_2, v_3)$ of $u$ and $v$ with respect to the basis $(e_1, e_2, e_3)$ are given in terms of the coordinates $(u'_1, u'_2, u'_3)$ and $(v'_1, v'_2, v'_3)$ of $u$ and $v$ with respect to the basis $(e'_1, e'_2, e'_3)$ by the matrix equations
\[
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = P \begin{pmatrix} u'_1 \\ u'_2 \\ u'_3 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = P \begin{pmatrix} v'_1 \\ v'_2 \\ v'_3 \end{pmatrix}.
\]
Everything worked out because the change of basis does not involve a change of origin. On the
other hand, if we consider the change of frame from the frame (O, (e1 , e2 , e3 )) to the frame
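$(\Omega, (e_1, e_2, e_3))$, where $\mathbf{O\Omega} = (\omega_1, \omega_2, \omega_3)$, given two points $a$, $b$ of coordinates $(a_1, a_2, a_3)$ and $(b_1, b_2, b_3)$ with respect to the frame $(O, (e_1, e_2, e_3))$ and of coordinates $(a'_1, a'_2, a'_3)$ and $(b'_1, b'_2, b'_3)$ with respect to the frame $(\Omega, (e_1, e_2, e_3))$, since
\[
(a'_1, a'_2, a'_3) = (a_1 - \omega_1, a_2 - \omega_2, a_3 - \omega_3)
\]
and
\[
(b'_1, b'_2, b'_3) = (b_1 - \omega_1, b_2 - \omega_2, b_3 - \omega_3),
\]
the coordinates of $\lambda a + \mu b$ with respect to the frame $(O, (e_1, e_2, e_3))$ are
\[
(\lambda a_1 + \mu b_1, \lambda a_2 + \mu b_2, \lambda a_3 + \mu b_3),
\]
whereas the coordinates of $\lambda a + \mu b$ computed in the frame $(\Omega, (e_1, e_2, e_3))$ are
\[
(\lambda a'_1 + \mu b'_1, \lambda a'_2 + \mu b'_2, \lambda a'_3 + \mu b'_3)
= (\lambda a_1 + \mu b_1 - (\lambda + \mu)\omega_1, \lambda a_2 + \mu b_2 - (\lambda + \mu)\omega_2, \lambda a_3 + \mu b_3 - (\lambda + \mu)\omega_3),
\]
which differ from the expected coordinates
\[
(\lambda a_1 + \mu b_1 - \omega_1, \lambda a_2 + \mu b_2 - \omega_2, \lambda a_3 + \mu b_3 - \omega_3)
\]
unless $\lambda + \mu = 1$.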
Thus, we have discovered a major difference between vectors and points: The notion of
linear combination of vectors is basis independent, but the notion of linear combination of
points is frame dependent. In order to salvage the notion of linear combination of points,
some restriction is needed: The scalar coefficients must add up to 1.
A clean way to handle the problem of frame invariance and to deal with points in a more
intrinsic manner is to make a clearer distinction between points and vectors. We duplicate
R3 into two copies, the first copy corresponding to points, where we forget the vector space
structure, and the second copy corresponding to free vectors, where the vector space structure
is important. Furthermore, we make explicit the important fact that the vector space R3
acts on the set of points R3 : Given any point a = (a1 , a2 , a3 ) and any vector v = (v1 , v2 , v3 ),
we obtain the point
a + v = (a1 + v1 , a2 + v2 , a3 + v3 ),
which can be thought of as the result of translating a to b using the vector v. We can imagine
that v is placed such that its origin coincides with a and that its tip coincides with b. This
action +: R3 × R3 → R3 satisfies some crucial properties. For example,
a + 0 = a,
(a + u) + v = a + (u + v),
and for any two points a, b, there is a unique free vector ab such that
b = a + ab.
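The three properties just listed can be checked on concrete data. The following is a minimal sketch in Python, assuming that points and free vectors of R3 are both stored as plain tuples; the helper names `translate`, `vec`, and `add` are illustrative choices and not notation from the text.

```python
def translate(a, v):
    """Return the point a + v: the action of the free vector v on the point a."""
    return tuple(ai + vi for ai, vi in zip(a, v))


def vec(a, b):
    """Return the unique free vector ab such that b = a + ab."""
    return tuple(bi - ai for ai, bi in zip(a, b))


def add(u, v):
    """Ordinary vector addition in R^3."""
    return tuple(ui + vi for ui, vi in zip(u, v))


# Sample data (arbitrary values chosen for illustration).
a, b = (1, 2, 3), (4, 0, -1)      # two points
u, v = (1, 1, 0), (0, -2, 5)      # two free vectors
zero = (0, 0, 0)

checks = {
    "a + 0 = a": translate(a, zero) == a,
    "(a + u) + v = a + (u + v)":
        translate(translate(a, u), v) == translate(a, add(u, v)),
    "b = a + ab": translate(a, vec(a, b)) == b,
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Keeping `translate` (point plus vector) and `add` (vector plus vector) as separate functions mirrors the distinction between the two copies of R3, even though both are computed componentwise.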
It turns out that the above properties, although trivial in the case of $\mathbb{R}^3$, are all that is needed to define the abstract notion of affine space (or affine structure). The basic idea is to consider two (distinct) sets $E$ and $\overrightarrow{E}$, where $E$ is a set of points (with no structure) and $\overrightarrow{E}$ is a vector space (of free vectors) acting on the set $E$.
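Definition 4.1 An affine space is either the degenerate space reduced to the empty set, or a triple $\langle E, \overrightarrow{E}, + \rangle$ consisting of a nonempty set $E$ (of points), a vector space $\overrightarrow{E}$ (of translations, or free vectors), and an action $+\colon E \times \overrightarrow{E} \to E$, satisfying the following conditions.
(A1) $a + 0 = a$, for every $a \in E$.
(A2) $(a + u) + v = a + (u + v)$, for every $a \in E$ and every $u, v \in \overrightarrow{E}$.
(A3) For any two points $a, b \in E$, there is a unique $u \in \overrightarrow{E}$ such that $a + u = b$. This unique vector is denoted by $\mathbf{ab}$.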
Conditions (A1) and (A2) say that the (abelian) group $\overrightarrow{E}$ acts on $E$, and condition (A3) says that $\overrightarrow{E}$ acts transitively and faithfully on $E$. Note that
\[
\mathbf{a(a + v)} = v
\]
for all $a \in E$ and all $v \in \overrightarrow{E}$, since $\mathbf{a(a + v)}$ is the unique vector such that $a + v = a + \mathbf{a(a + v)}$. Thus, $b = a + v$ is equivalent to $\mathbf{ab} = v$. Figure 4.2 gives an intuitive picture of an affine space. It is natural to think of all vectors as having the same origin, the null vector.

[Figure 4.2: Intuitive picture of an affine space: the points $a$, $b = a + u$, and $c = a + w$ in $E$, and the vectors $u$ and $w$ in $\overrightarrow{E}$.]
The axioms defining an affine space $\langle E, \overrightarrow{E}, + \rangle$ can be interpreted intuitively as saying that $E$ and $\overrightarrow{E}$ are two different ways of looking at the same object, but wearing different sets of glasses, the second set of glasses depending on the choice of an “origin” in $E$. Indeed, we can choose to look at the points in $E$, forgetting that every pair $(a, b)$ of points defines a unique vector $\mathbf{ab}$ in $\overrightarrow{E}$, or we can choose to look at the vectors $u$ in $\overrightarrow{E}$, forgetting the points in $E$. Furthermore, if we also pick any point $a$ in $E$, a point that can be viewed as an origin in $E$, then we can recover all the points in $E$ as the translated points $a + u$ for all $u \in \overrightarrow{E}$. This can be formalized by defining two maps between $E$ and $\overrightarrow{E}$.
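For every $a \in E$, consider the mapping from $\overrightarrow{E}$ to $E$ given by
\[
u \mapsto a + u,
\]
where $u \in \overrightarrow{E}$, and consider the mapping from $E$ to $\overrightarrow{E}$ given by
\[
b \mapsto \mathbf{ab},
\]
where $b \in E$. The composition of the first mapping with the second is
\[
u \mapsto a + u \mapsto \mathbf{a(a + u)},
\]
which, in view of (A3), yields $u$. The composition of the second with the first mapping is
\[
b \mapsto \mathbf{ab} \mapsto a + \mathbf{ab},
\]
which, in view of (A3), yields $b$. Thus, the two mappings are mutually inverse bijections.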
When we identify $E$ with $\overrightarrow{E}$ via the mapping $b \mapsto \mathbf{ab}$, we say that we consider $E$ as the vector space obtained by taking $a$ as the origin in $E$, and we denote it by $E_a$. Thus, an affine space $\langle E, \overrightarrow{E}, + \rangle$ is a way of defining a vector space structure on a set of points $E$, without making a commitment to a fixed origin in $E$. Nevertheless, as soon as we commit to an origin $a$ in $E$, we can view $E$ as the vector space $E_a$. However, we urge the reader to think of $E$ as a physical set of points and of $\overrightarrow{E}$ as a set of forces acting on $E$, rather than reducing $E$ to some isomorphic copy of $\mathbb{R}^n$. After all, points are points, and not vectors! For notational simplicity, we will often denote an affine space $\langle E, \overrightarrow{E}, + \rangle$ by $(E, \overrightarrow{E})$, or even by $E$. The vector space $\overrightarrow{E}$ is called the vector space associated with $E$.
One should be careful about the overloading of the addition symbol +. Addition is well-defined on vectors, as in $u + v$; the translate $a + u$ of a point $a \in E$ by a vector $u \in \overrightarrow{E}$ is also well-defined, but addition of points $a + b$ does not make sense. In this respect, the notation $b - a$ for the unique vector $u$ such that $b = a + u$ is somewhat confusing, since it suggests that points can be subtracted (but not added!). It is possible to make sense of linear combinations of points, and even mixed linear combinations of points and vectors (see Gallier [22], Chapter 4).
Any vector space $\overrightarrow{E}$ has an affine space structure specified by choosing $E = \overrightarrow{E}$, and letting + be addition in the vector space $\overrightarrow{E}$. We will refer to the affine structure $\langle \overrightarrow{E}, \overrightarrow{E}, + \rangle$ on a vector space $\overrightarrow{E}$ as the canonical (or natural) affine structure on $\overrightarrow{E}$. In particular, the vector space $\mathbb{R}^n$ can be viewed as the affine space $\langle \mathbb{R}^n, \mathbb{R}^n, + \rangle$, denoted by $\mathbb{A}^n$. In general, if $K$ is any field, the affine space $\langle K^n, K^n, + \rangle$ is denoted by $\mathbb{A}^n_K$. In order to distinguish between the double role played by members of $\mathbb{R}^n$, points and vectors, we will denote points by row vectors, and vectors by column vectors. Thus, the action of the vector space $\mathbb{R}^n$ over the set $\mathbb{R}^n$ simply viewed as a set of points is given by
\[
(a_1, \ldots, a_n) + \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} = (a_1 + u_1, \ldots, a_n + u_n).
\]
We will also use the convention that if $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, then the column vector associated with $x$ is denoted by $\mathbf{x}$ (in boldface notation). Abusing the notation slightly, if $a \in \mathbb{R}^n$ is a point, we also write $a \in \mathbb{A}^n$. The affine space $\mathbb{A}^n$ is called the real affine space of dimension $n$. In most cases, we will consider $n = 1, 2, 3$.
Consider the subset $L$ of $\mathbb{A}^2$ consisting of all points $(x, y)$ satisfying the equation
x + y − 1 = 0.
The set L is the line of slope −1 passing through the points (1, 0) and (0, 1) shown in Figure
4.3.
The line L can be made into an official affine space by defining the action +: L × R → L
of R on L defined such that for every point (x, 1 − x) on L and any u ∈ R,
(x, 1 − x) + u = (x + u, 1 − x − u).
It is immediately verified that this action makes L into an affine space. For example, for any
two points a = (a1 , 1 − a1 ) and b = (b1 , 1 − b1 ) on L, the unique (vector) u ∈ R such that
b = a + u is u = b1 − a1 . Note that the vector space R is isomorphic to the line of equation
x + y = 0 passing through the origin.
Similarly, consider the subset H of A3 consisting of all points (x, y, z) satisfying the
equation
x + y + z − 1 = 0.
The set $H$ is the plane passing through the points $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$. The plane $H$ can be made into an official affine space by defining the action $+\colon H \times \mathbb{R}^2 \to H$ of $\mathbb{R}^2$ on $H$ defined such that for every point $(x, y, 1 - x - y)$ on $H$ and any $\begin{pmatrix} u \\ v \end{pmatrix} \in \mathbb{R}^2$,
\[
(x, y, 1 - x - y) + \begin{pmatrix} u \\ v \end{pmatrix} = (x + u, y + v, 1 - x - u - y - v).
\]
For a slightly wilder example, consider the subset P of A3 consisting of all points (x, y, z)
satisfying the equation
\[
x^2 + y^2 - z = 0.
\]
The set $P$ is a paraboloid of revolution, with axis $Oz$. The surface $P$ can be made into an official affine space by defining the action $+\colon P \times \mathbb{R}^2 \to P$ of $\mathbb{R}^2$ on $P$ defined such that for every point $(x, y, x^2 + y^2)$ on $P$ and any $\begin{pmatrix} u \\ v \end{pmatrix} \in \mathbb{R}^2$,
\[
(x, y, x^2 + y^2) + \begin{pmatrix} u \\ v \end{pmatrix} = (x + u, y + v, (x + u)^2 + (y + v)^2).
\]
This should dispel any idea that affine spaces are dull. Affine spaces not already equipped with an obvious vector space structure arise in projective geometry. For more on this topic, see Gallier [22].
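To see that the paraboloid really does satisfy the affine space properties, one can test them on sample points. The sketch below is only an illustration: the function names `on_paraboloid`, `act`, and `vec` and the sample values are assumptions made for this example.

```python
def on_paraboloid(p):
    """Check that p = (x, y, z) satisfies x^2 + y^2 - z = 0."""
    x, y, z = p
    return x * x + y * y - z == 0


def act(p, w):
    """Translate the point p = (x, y, x^2 + y^2) of P by the vector w = (u, v)."""
    x, y, _ = p
    u, v = w
    return (x + u, y + v, (x + u) ** 2 + (y + v) ** 2)


def vec(p, q):
    """Return the unique w in R^2 such that q = p + w, for p and q on P."""
    return (q[0] - p[0], q[1] - p[1])


# Sample points on P and sample vectors of R^2 (arbitrary values).
p, q = (1, 2, 5), (-3, 0, 9)
w1, w2 = (2, -1), (-4, 3)
w_sum = (w1[0] + w2[0], w1[1] + w2[1])

stays_on_P = on_paraboloid(act(p, w1))
zero_is_neutral = act(p, (0, 0)) == p
acts_compatibly = act(act(p, w1), w2) == act(p, w_sum)
unique_vector = act(p, vec(p, q)) == q
print(stays_on_P, zero_is_neutral, acts_compatibly, unique_vector)
```

The third coordinate is recomputed from the first two at every step, which is why the curved surface behaves like the flat plane of its $(x, y)$ coordinates.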
Thus, when a = (−1, −1) and b = (2, 2), the point a + b is the point c = (1, 1).
Let us now consider the new coordinate system with respect to the origin c = (1, 1) (and
the same basis vectors). This time, the coordinates of a are (−2, −2), the coordinates of b
are (1, 1), and the point a + b is the point d of coordinates (−1, −1). However, it is clear
that the point d is identical to the origin O = (0, 0) of the first coordinate system.
Thus, a + b corresponds to two different points depending on which coordinate system is
used for its computation!
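The computation above can be replayed mechanically. In this sketch (the function name `combine` is an illustrative assumption), a combination of points is computed from coordinates taken relative to a chosen origin and then expressed back in the standard frame; the weights 1/4 and 3/4 are sample values that add up to 1.

```python
from fractions import Fraction


def combine(points, weights, origin):
    """Compute sum_i weights[i] * points[i] using coordinates taken relative
    to `origin`, then express the resulting point in the standard frame."""
    dim = len(origin)
    local = [tuple(p[k] - origin[k] for k in range(dim)) for p in points]
    total = tuple(sum(w * c[k] for w, c in zip(weights, local)) for k in range(dim))
    return tuple(total[k] + origin[k] for k in range(dim))


a, b = (-1, -1), (2, 2)
O, c = (0, 0), (1, 1)

# Plain sum a + b (weights 1 and 1, adding up to 2): frame dependent.
sum_from_O = combine([a, b], [1, 1], O)
sum_from_c = combine([a, b], [1, 1], c)

# Weights adding up to 1: the same point is obtained in both frames.
w = [Fraction(1, 4), Fraction(3, 4)]
bary_from_O = combine([a, b], w, O)
bary_from_c = combine([a, b], w, c)

print(sum_from_O, sum_from_c)
print(bary_from_O == bary_from_c)
```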
This shows that some extra condition is needed in order for affine combinations to make
sense. It turns out that if the scalars sum up to 1, the definition is intrinsic, as the following
proposition shows.
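Proposition 4.1 Given an affine space $E$, let $(a_i)_{i \in I}$ be a family of points in $E$, and let $(\lambda_i)_{i \in I}$ be a family of scalars. For any two points $a, b \in E$, the following properties hold:
(1) If $\sum_{i \in I} \lambda_i = 1$, then
\[
a + \sum_{i \in I} \lambda_i \mathbf{aa_i} = b + \sum_{i \in I} \lambda_i \mathbf{ba_i}.
\]
(2) If $\sum_{i \in I} \lambda_i = 0$, then
\[
\sum_{i \in I} \lambda_i \mathbf{aa_i} = \sum_{i \in I} \lambda_i \mathbf{ba_i}.
\]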
Thus, by Proposition 4.1, for any family of points $(a_i)_{i \in I}$ in $E$, for any family $(\lambda_i)_{i \in I}$ of scalars such that $\sum_{i \in I} \lambda_i = 1$, the point
\[
x = a + \sum_{i \in I} \lambda_i \mathbf{aa_i}
\]
is independent of the choice of the origin $a \in E$. This property motivates the following definition.
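Definition 4.2 For any family of points $(a_i)_{i \in I}$ in $E$, for any family $(\lambda_i)_{i \in I}$ of scalars such that $\sum_{i \in I} \lambda_i = 1$, and for any $a \in E$, the point
\[
a + \sum_{i \in I} \lambda_i \mathbf{aa_i}
\]
(which is independent of $a \in E$, by Proposition 4.1) is called the barycenter (or affine combination) of the points $a_i$ assigned the weights $\lambda_i$, and it is denoted by
\[
\sum_{i \in I} \lambda_i a_i.
\]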
In physical terms, the barycenter is the center of mass of the family of weighted points $((a_i, \lambda_i))_{i \in I}$ (where the masses have been normalized, so that $\sum_{i \in I} \lambda_i = 1$, and negative masses are allowed).
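Both parts of Proposition 4.1 can be observed numerically. The following sketch uses exact rational arithmetic; the function names and the sample points, weights, and origins are assumptions chosen for illustration.

```python
from fractions import Fraction as F


def weighted_sum_of_vectors(points, weights, origin):
    """Return the vector sum_i weights[i] * (vector from origin to points[i])."""
    dim = len(origin)
    return tuple(
        sum(w * (p[k] - origin[k]) for p, w in zip(points, weights))
        for k in range(dim)
    )


def barycenter(points, weights, origin):
    """Return origin + sum_i weights[i] * (vector from origin to points[i])."""
    if sum(weights) != 1:
        raise ValueError("the weights of a barycenter must add up to 1")
    s = weighted_sum_of_vectors(points, weights, origin)
    return tuple(o + si for o, si in zip(origin, s))


pts = [(0, 0), (4, 0), (0, 4)]

# Part (1): weights adding up to 1 give the same point from any origin.
w1 = [F(1, 4), F(1, 4), F(1, 2)]
g_from_a = barycenter(pts, w1, (0, 0))
g_from_b = barycenter(pts, w1, (7, -3))

# Part (2): weights adding up to 0 give the same vector from any origin.
w0 = [1, -2, 1]
v_from_a = weighted_sum_of_vectors(pts, w0, (0, 0))
v_from_b = weighted_sum_of_vectors(pts, w0, (7, -3))

print(g_from_a == g_from_b, v_from_a == v_from_b)
```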
Remarks:
(1) Since the barycenter of a family $((a_i, \lambda_i))_{i \in I}$ of weighted points is defined for families $(\lambda_i)_{i \in I}$ of scalars with finite support (and such that $\sum_{i \in I} \lambda_i = 1$), we might as well assume that $I$ is finite. Then, for all $m \geq 2$, it is easy to prove that the barycenter of $m$ weighted points can be obtained by repeated computations of barycenters of two weighted points.
(2) This result still holds, provided that the field $K$ has at least three distinct elements, but the proof is trickier!
(3) When $\sum_{i \in I} \lambda_i = 0$, the vector $\sum_{i \in I} \lambda_i \mathbf{aa_i}$ does not depend on the point $a$, and we may denote it by $\sum_{i \in I} \lambda_i a_i$.
Figure 4.5 illustrates the geometric construction of the barycenters $g_1$ and $g_2$ of the weighted points $\bigl(a, \tfrac{1}{4}\bigr)$, $\bigl(b, \tfrac{1}{4}\bigr)$, and $\bigl(c, \tfrac{1}{2}\bigr)$, and $(a, -1)$, $(b, 1)$, and $(c, 1)$.
The point $g_1$ can be constructed geometrically as the middle of the segment joining $c$ to the middle $\tfrac{1}{2} a + \tfrac{1}{2} b$ of the segment $(a, b)$, since
\[
g_1 = \frac{1}{2}\Bigl(\frac{1}{2} a + \frac{1}{2} b\Bigr) + \frac{1}{2} c.
\]
The point $g_2$ can be constructed geometrically as the point such that the middle $\tfrac{1}{2} b + \tfrac{1}{2} c$ of the segment $(b, c)$ is the middle of the segment $(a, g_2)$, since
\[
g_2 = -a + 2\Bigl(\frac{1}{2} b + \frac{1}{2} c\Bigr).
\]
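The two geometric constructions can be checked on a sample triangle. In this sketch the coordinates of a, b, c are arbitrary assumptions, and `affine_comb` and `midpoint` are illustrative helper names; computing with raw coordinates is legitimate here because the weights add up to 1.

```python
from fractions import Fraction as F


def affine_comb(points, weights):
    """Return the barycenter sum_i weights[i] * points[i] (weights add up to 1)."""
    if sum(weights) != 1:
        raise ValueError("the weights must add up to 1")
    dim = len(points[0])
    return tuple(sum(w * p[k] for p, w in zip(points, weights)) for k in range(dim))


def midpoint(p, q):
    """Return the middle of the segment (p, q)."""
    return affine_comb([p, q], [F(1, 2), F(1, 2)])


a, b, c = (0, 0), (4, 0), (0, 4)   # sample triangle

g1 = affine_comb([a, b, c], [F(1, 4), F(1, 4), F(1, 2)])
g2 = affine_comb([a, b, c], [-1, 1, 1])

# g1 is the middle of the segment joining c to the middle of (a, b).
g1_ok = g1 == midpoint(midpoint(a, b), c)
# The middle of (b, c) is also the middle of (a, g2).
g2_ok = midpoint(a, g2) == midpoint(b, c)
print(g1_ok, g2_ok)
```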
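[Figure 4.5: The barycenters $g_1 = \tfrac{1}{4} a + \tfrac{1}{4} b + \tfrac{1}{2} c$ and $g_2 = -a + b + c$.]

A polynomial curve can be defined as a set of barycenters of a fixed number of points. For example, let $(a, b, c, d)$ be a sequence of points in $\mathbb{A}^2$. Observe that
\[
(1 - t)^3 + 3t(1 - t)^2 + 3t^2(1 - t) + t^3 = 1,
\]
since the sum on the left-hand side is obtained by expanding $(t + (1 - t))^3 = 1$ using the binomial formula. Thus,
\[
(1 - t)^3 a + 3t(1 - t)^2 b + 3t^2(1 - t) c + t^3 d
\]
is a well-defined affine combination. Then, we can define the curve $F\colon \mathbb{A} \to \mathbb{A}^2$ such that
\[
F(t) = (1 - t)^3 a + 3t(1 - t)^2 b + 3t^2(1 - t) c + t^3 d.
\]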
Such a curve is called a Bézier curve, and (a, b, c, d) are called its control points. Note that
the curve passes through a and d, but generally not through b and c. Any point F (t) on the
curve can be constructed using an algorithm performing affine interpolation steps (the de
Casteljau algorithm). For more on this topic, see Gallier [21].
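As an illustration of the last remark, here is a minimal sketch of the de Casteljau algorithm next to the direct formula for F(t). The control points are sample values and the function names are assumptions made for this example; each step of the algorithm is an affine interpolation with weights (1 - t) and t, which add up to 1.

```python
from fractions import Fraction as F


def lerp(p, q, t):
    """Affine interpolation (1 - t) p + t q between two points."""
    return tuple((1 - t) * pi + t * qi for pi, qi in zip(p, q))


def de_casteljau(control, t):
    """Evaluate the Bezier curve with the given control points at parameter t
    by repeated affine interpolation of consecutive points."""
    pts = list(control)
    while len(pts) > 1:
        pts = [lerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
    return pts[0]


def cubic_formula(a, b, c, d, t):
    """Evaluate F(t) = (1-t)^3 a + 3t(1-t)^2 b + 3t^2(1-t) c + t^3 d directly."""
    w = [(1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t ** 2 * (1 - t), t ** 3]
    return tuple(sum(wi * p[k] for wi, p in zip(w, (a, b, c, d))) for k in range(len(a)))


a, b, c, d = (0, 0), (1, 2), (3, 2), (4, 0)   # sample control points
t = F(1, 2)
by_algorithm = de_casteljau([a, b, c, d], t)
by_formula = cubic_formula(a, b, c, d, t)
print(by_algorithm == by_formula)
```

Evaluating at t = 0 and t = 1 returns a and d respectively, in agreement with the observation that the curve passes through its first and last control points.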
ax + by = c,
where it is assumed that $a \neq 0$ or $b \neq 0$. Given any $m$ points $(x_i, y_i) \in U$ and any $m$ scalars $\lambda_i$ such that $\lambda_1 + \cdots + \lambda_m = 1$, we claim that
\[
\sum_{i=1}^{m} \lambda_i (x_i, y_i) \in U.
\]
ax + by = 0
obtained by setting the right-hand side of $ax + by = c$ to zero. Indeed, for any $m$ scalars $\lambda_i$, the same calculation as above yields that
\[
\sum_{i=1}^{m} \lambda_i (x_i, y_i) \in \overrightarrow{U},
\]
this time without any restriction on the $\lambda_i$, since the right-hand side of the equation is null. Thus, $\overrightarrow{U}$ is a subspace of $\mathbb{R}^2$. In fact, $\overrightarrow{U}$ is one-dimensional, and it is just a usual line in $\mathbb{R}^2$. This line can be identified with a line passing through the origin of $\mathbb{A}^2$, a line that is parallel to the line $U$ of equation $ax + by = c$, as illustrated in Figure 4.6.

[Figure 4.6: The affine line $U$ and the parallel line $\overrightarrow{U}$ through the origin.]
Now, if $(x_0, y_0)$ is any point in $U$, we claim that
\[
U = (x_0, y_0) + \overrightarrow{U},
\]
where
\[
(x_0, y_0) + \overrightarrow{U} = \bigl\{ (x_0 + u_1, y_0 + u_2) \mid (u_1, u_2) \in \overrightarrow{U} \bigr\}.
\]
First, $(x_0, y_0) + \overrightarrow{U} \subseteq U$, since $ax_0 + by_0 = c$ and $au_1 + bu_2 = 0$ for all $(u_1, u_2) \in \overrightarrow{U}$. Second, if $(x, y) \in U$, then $ax + by = c$, and since we also have $ax_0 + by_0 = c$, by subtraction, we get
\[
a(x - x_0) + b(y - y_0) = 0,
\]
which shows that $(x - x_0, y - y_0) \in \overrightarrow{U}$, and thus $(x, y) \in (x_0, y_0) + \overrightarrow{U}$. Hence, we also have $U \subseteq (x_0, y_0) + \overrightarrow{U}$, and $U = (x_0, y_0) + \overrightarrow{U}$.
The above example shows that the affine line $U$ defined by the equation
\[
ax + by = c
\]
is obtained by “translating” the parallel line $\overrightarrow{U}$ of equation
\[
ax + by = 0
\]
passing through the origin. In fact, given any point $(x_0, y_0) \in U$,
\[
U = (x_0, y_0) + \overrightarrow{U}.
\]
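The relationship between the line and its direction can be tried out on a specific equation. The sketch below assumes the sample line 2x + 3y = 6; the names `in_U`, `in_direction`, and `comb` are illustrative, and `comb` deliberately does not check that the weights add up to 1 so that the failing case can be shown.

```python
from fractions import Fraction as F

A, B, C = 2, 3, 6   # sample line U: 2x + 3y = 6


def in_U(p):
    """Membership in the affine line U of equation Ax + By = C."""
    return A * p[0] + B * p[1] == C


def in_direction(u):
    """Membership in the direction of U, the line of equation Ax + By = 0."""
    return A * u[0] + B * u[1] == 0


def comb(points, weights):
    """Coordinatewise combination sum_i weights[i] * points[i] (no check on weights)."""
    return tuple(sum(w * p[k] for p, w in zip(points, weights)) for k in range(2))


p0 = (3, 0)      # a point of U
d = (-B, A)      # spans the direction, since A*(-B) + B*A = 0

# Every translate p0 + t*d lies on U.
translates = [(p0[0] + t * d[0], p0[1] + t * d[1]) for t in (F(-2), F(1, 3), F(5))]

# Conversely, the vector from p0 to another point q of U lies in the direction.
q = (0, 2)
diff = (q[0] - p0[0], q[1] - p0[1])

pts = [(3, 0), (0, 2), (6, -2)]                # three points of U
inside = comb(pts, [F(1, 2), 1, F(-1, 2)])     # weights add up to 1
outside = comb(pts, [1, 1, 1])                 # weights add up to 3
print(in_U(inside), in_U(outside))
```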
4.5. AFFINE SUBSPACES 105
More generally, it is easy to prove the following fact. Given any m × n matrix A and any
vector b ∈ Rm , the subset U of Rn defined by
U = {x ∈ Rn | Ax = b}
is an affine subspace of An .
Actually, observe that $Ax = b$ should really be written as $Ax^{\top} = b$, to be consistent with our convention that points are represented by row vectors. We can also use the boldface notation for column vectors, in which case the equation is written as $A\mathbf{x} = b$. For the sake of minimizing the amount of notation, we stick to the simpler (yet incorrect) notation $Ax = b$.
If we consider the corresponding homogeneous equation $Ax = 0$, the set
\[
\overrightarrow{U} = \{ x \in \mathbb{R}^n \mid Ax = 0 \}
\]
since
\[
\sum_{i=1}^{n} \lambda_i + \Bigl(1 - \sum_{i=1}^{n} \lambda_i\Bigr) = 1.
\]
Given any point $a \in E$ and any subset $\overrightarrow{V}$ of $\overrightarrow{E}$, let $a + \overrightarrow{V}$ denote the following subset of $E$:
\[
a + \overrightarrow{V} = \bigl\{ a + v \mid v \in \overrightarrow{V} \bigr\}.
\]
Proposition 4.2 Let $\langle E, \overrightarrow{E}, + \rangle$ be an affine space.
(1) A nonempty subset $V$ of $E$ is an affine subspace iff for every point $a \in V$, the set
\[
\overrightarrow{V_a} = \{ \mathbf{ax} \mid x \in V \}
\]
is a subspace of $\overrightarrow{E}$. Consequently, $V = a + \overrightarrow{V_a}$. Furthermore,
\[
\overrightarrow{V} = \{ \mathbf{xy} \mid x, y \in V \}
\]
is a subspace of $\overrightarrow{E}$ and $\overrightarrow{V_a} = \overrightarrow{V}$ for all $a \in V$. Thus, $V = a + \overrightarrow{V}$.
(2) For any subspace $\overrightarrow{V}$ of $\overrightarrow{E}$ and for any $a \in E$, the set $V = a + \overrightarrow{V}$ is an affine subspace.
Proof. The proof is straightforward, and is omitted. It is also given in Gallier [21], Chapter 2.

[Figure 4.7: An affine subspace $V$ and its direction $\overrightarrow{V}$.]
In particular, when $E$ is the natural affine space associated with a vector space $E$, Proposition 4.2 shows that every affine subspace of $E$ is of the form $u + U$, for a subspace $U$ of $E$. The subspaces of $E$ are the affine subspaces that contain 0.
The subspace $\overrightarrow{V}$ associated with an affine subspace $V$ is called the direction of $V$. It is also clear that the map $+\colon V \times \overrightarrow{V} \to V$ induced by $+\colon E \times \overrightarrow{E} \to E$ confers to $\langle V, \overrightarrow{V}, + \rangle$ an affine structure. Figure 4.7 illustrates the notion of affine subspace.
By the dimension of the subspace $V$, we mean the dimension of $\overrightarrow{V}$.
An affine subspace of dimension 1 is called a line, and an affine subspace of dimension 2 is called a plane.
An affine subspace of codimension 1 is called a hyperplane (recall that a subspace $F$ of a vector space $E$ has codimension 1 iff there is some subspace $G$ of dimension 1 such that $E = F \oplus G$, the direct sum of $F$ and $G$).
We say that two affine subspaces $U$ and $V$ are parallel if their directions are identical. Equivalently, since $\overrightarrow{U} = \overrightarrow{V}$, we have $U = a + \overrightarrow{U}$ and $V = b + \overrightarrow{U}$ for any $a \in U$ and any $b \in V$, and thus $V$ is obtained from $U$ by the translation $\mathbf{ab}$.
In general, when we talk about n points a1 , . . . , an , we mean the sequence (a1 , . . . , an ),
and not the set {a1 , . . . , an } (the ai ’s need not be distinct).
By Proposition 4.2, a line is specified by a point $a \in E$ and a nonzero vector $u \in \overrightarrow{E}$, i.e., a line is the set of all points of the form $a + \lambda u$, for $\lambda \in \mathbb{R}$.
We say that three points $a, b, c$ are collinear if the vectors $\mathbf{ab}$ and $\mathbf{ac}$ are linearly dependent. If two of the points $a, b, c$ are distinct, say $a \neq b$, then there is a unique $\lambda \in \mathbb{R}$ such that $\mathbf{ac} = \lambda \mathbf{ab}$, and we define the ratio $\frac{\mathbf{ac}}{\mathbf{ab}} = \lambda$.
A plane is specified by a point $a \in E$ and two linearly independent vectors $u, v \in \overrightarrow{E}$, i.e., a plane is the set of all points of the form $a + \lambda u + \mu v$, for $\lambda, \mu \in \mathbb{R}$.
We say that four points a, b, c, d are coplanar if the vectors ab, ac, and ad are linearly
dependent. Hyperplanes will be characterized a little later.
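The collinearity test and the ratio of three collinear points translate directly into a short computation. This is a sketch under the assumption that coordinates are integers or rationals; the function names `vec` and `ratio` are illustrative.

```python
from fractions import Fraction as F


def vec(a, b):
    """Return the vector ab, computed componentwise."""
    return tuple(bi - ai for ai, bi in zip(a, b))


def ratio(a, b, c):
    """Return the scalar lam with ac = lam * ab when a, b, c are collinear
    and a != b; return None when the three points are not collinear."""
    ab, ac = vec(a, b), vec(a, c)
    if all(x == 0 for x in ab):
        raise ValueError("a and b must be distinct")
    # Read the candidate scalar off a nonzero component of ab ...
    k = next(i for i, x in enumerate(ab) if x != 0)
    lam = F(ac[k], ab[k])
    # ... and accept it only if it works for every component.
    if all(ac[i] == lam * ab[i] for i in range(len(ab))):
        return lam
    return None


a, b = (0, 0, 0), (2, 4, 6)
on_line = ratio(a, b, (3, 6, 9))
off_line = ratio(a, b, (3, 6, 10))
print(on_line, off_line)
```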
Proposition 4.3 Given an affine space $\langle E, \overrightarrow{E}, + \rangle$, for any family $(a_i)_{i \in I}$ of points in $E$, the set $V$ of barycenters $\sum_{i \in I} \lambda_i a_i$ (where $\sum_{i \in I} \lambda_i = 1$) is the smallest affine subspace containing $(a_i)_{i \in I}$.
Proof. If $(a_i)_{i \in I}$ is empty, then $V = \emptyset$, because of the condition $\sum_{i \in I} \lambda_i = 1$. If $(a_i)_{i \in I}$ is nonempty, then the smallest affine subspace containing $(a_i)_{i \in I}$ must contain the set $V$ of barycenters $\sum_{i \in I} \lambda_i a_i$, and thus, it is enough to show that $V$ is closed under affine combinations, which is immediately verified.
Remarks:
(1) Since it can be shown that the barycenter of n weighted points can be obtained by
repeated computations of barycenters of two weighted points, a nonempty subset V
of E is an affine subspace iff for every two points a, b ∈ V , the set V contains all
barycentric combinations of a and b. If V contains at least two points, then V is an
affine subspace iff for any two distinct points a, b ∈ V , the set V contains the line
determined by a and b, that is, the set of all points (1 − λ)a + λb, λ ∈ R.
(2) This result still holds if the field K has at least three distinct elements, but the proof
is trickier!
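Proposition 4.4 Given an affine space $\langle E, \overrightarrow{E}, + \rangle$, let $(a_i)_{i \in I}$ be a family of points in $E$. If the family $(\mathbf{a_i a_j})_{j \in (I - \{i\})}$ is linearly independent for some $i \in I$, then $(\mathbf{a_i a_j})_{j \in (I - \{i\})}$ is linearly independent for every $i \in I$.
Proof. Assume that the family $(\mathbf{a_i a_j})_{j \in (I - \{i\})}$ is linearly independent for some specific $i \in I$. Let $k \in I$ with $k \neq i$, and assume that there are some scalars $(\lambda_j)_{j \in (I - \{k\})}$ such that
\[
\sum_{j \in (I - \{k\})} \lambda_j \mathbf{a_k a_j} = 0.
\]
Since
\[
\mathbf{a_k a_j} = \mathbf{a_k a_i} + \mathbf{a_i a_j},
\]
we have
\begin{align*}
\sum_{j \in (I - \{k\})} \lambda_j \mathbf{a_k a_j}
&= \sum_{j \in (I - \{k\})} \lambda_j \mathbf{a_k a_i} + \sum_{j \in (I - \{k\})} \lambda_j \mathbf{a_i a_j} \\
&= \sum_{j \in (I - \{k\})} \lambda_j \mathbf{a_k a_i} + \sum_{j \in (I - \{i, k\})} \lambda_j \mathbf{a_i a_j} \\
&= \sum_{j \in (I - \{i, k\})} \lambda_j \mathbf{a_i a_j} - \Bigl( \sum_{j \in (I - \{k\})} \lambda_j \Bigr) \mathbf{a_i a_k},
\end{align*}
and thus
\[
\sum_{j \in (I - \{i, k\})} \lambda_j \mathbf{a_i a_j} - \Bigl( \sum_{j \in (I - \{k\})} \lambda_j \Bigr) \mathbf{a_i a_k} = 0.
\]
Since the family $(\mathbf{a_i a_j})_{j \in (I - \{i\})}$ is linearly independent, we must have $\lambda_j = 0$ for all $j \in (I - \{i, k\})$ and $\sum_{j \in (I - \{k\})} \lambda_j = 0$, which implies that $\lambda_j = 0$ for all $j \in (I - \{k\})$.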
Definition 4.4 Given an affine space $\langle E, \overrightarrow{E}, + \rangle$, a family $(a_i)_{i \in I}$ of points in $E$ is affinely independent if the family $(\mathbf{a_i a_j})_{j \in (I - \{i\})}$ is linearly independent for some $i \in I$.
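Definition 4.4 suggests a direct computational test: pick a base point, form the vectors to the other points, and check that their rank equals their number. The sketch below uses Gaussian elimination over the rationals; the names `rank` and `affinely_independent` and the sample points are assumptions made for illustration.

```python
from fractions import Fraction as F


def rank(rows):
    """Rank of a matrix given as a list of rows (Gaussian elimination over Q)."""
    m = [[F(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk


def affinely_independent(points, base=0):
    """Test linear independence of the vectors from points[base] to the others."""
    ai = points[base]
    vectors = [
        tuple(pj - pi for pi, pj in zip(ai, p))
        for j, p in enumerate(points) if j != base
    ]
    if not vectors:          # a single point: the empty family is independent
        return True
    return rank(vectors) == len(vectors)


tetra = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
flat = [(0, 0), (1, 1), (2, 2)]

# In line with Proposition 4.4, the answer does not depend on the base point.
tetra_answers = [affinely_independent(tetra, i) for i in range(len(tetra))]
flat_answers = [affinely_independent(flat, i) for i in range(len(flat))]
print(tetra_answers, flat_answers)
```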
4.6. AFFINE INDEPENDENCE AND AFFINE FRAMES 109
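[Figure 4.8: Affine independence: the points $a_0, a_1, a_2$ in $E$ and the vectors $\mathbf{a_0 a_1}$, $\mathbf{a_0 a_2}$ in $\overrightarrow{E}$.]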
Definition 4.4 is reasonable, since by Proposition 4.4, the independence of the family $(\mathbf{a_i a_j})_{j \in (I - \{i\})}$ does not depend on the choice of $a_i$. A crucial property of linearly independent vectors $(u_1, \ldots, u_m)$ is that if a vector $v$ is a linear combination
\[
v = \sum_{i=1}^{m} \lambda_i u_i
\]
of the $u_i$, then the $\lambda_i$ are unique. A similar result holds for affinely independent points.
Proposition 4.5 Given an affine space $\langle E, \overrightarrow{E}, + \rangle$, let $(a_0, \ldots, a_m)$ be a family of $m + 1$ points in $E$. Let $x \in E$, and assume that $x = \sum_{i=0}^{m} \lambda_i a_i$, where $\sum_{i=0}^{m} \lambda_i = 1$. Then, the family $(\lambda_0, \ldots, \lambda_m)$ such that $x = \sum_{i=0}^{m} \lambda_i a_i$ is unique iff the family $(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_m})$ is linearly independent.
Proposition 4.5 suggests the notion of affine frame. Affine frames are the affine analogues of bases in vector spaces. Let $\langle E, \overrightarrow{E}, + \rangle$ be a nonempty affine space, and $(a_0, \ldots, a_m)$ a family of $m + 1$ points in $E$. The family $(a_0, \ldots, a_m)$ determines the family of $m$ vectors $(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_m})$ in $\overrightarrow{E}$. Conversely, given a point $a_0$ in $E$ and a family of $m$ vectors $(u_1, \ldots, u_m)$ in $\overrightarrow{E}$, we obtain the family of $m + 1$ points $(a_0, \ldots, a_m)$ in $E$, where $a_i = a_0 + u_i$, $1 \leq i \leq m$.
Thus, for any $m \geq 1$, it is equivalent to consider a family of $m + 1$ points $(a_0, \ldots, a_m)$ in $E$, and a pair $(a_0, (u_1, \ldots, u_m))$, where the $u_i$ are vectors in $\overrightarrow{E}$. Figure 4.8 illustrates the notion of affine independence.
Remark: The above observation also applies to infinite families $(a_i)_{i \in I}$ of points in $E$ and families $(u_i)_{i \in I - \{0\}}$ of vectors in $\overrightarrow{E}$, provided that the index set $I$ contains 0.
When $(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_m})$ is a basis of $\overrightarrow{E}$ then, for every $x \in E$, since $x = a_0 + \mathbf{a_0 x}$, there is a unique family $(x_1, \ldots, x_m)$ of scalars such that
\[
x = a_0 + x_1 \mathbf{a_0 a_1} + \cdots + x_m \mathbf{a_0 a_m}.
\]
The scalars $(x_1, \ldots, x_m)$ may be considered as coordinates with respect to the affine frame $(a_0, (\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_m}))$. Since
\[
x = a_0 + \sum_{i=1}^{m} x_i \mathbf{a_0 a_i}
\quad\text{iff}\quad
x = \Bigl(1 - \sum_{i=1}^{m} x_i\Bigr) a_0 + \sum_{i=1}^{m} x_i a_i,
\]
x = a0 + x1 a0 a1 + · · · + xm a0 am
for a unique family (x1 , . . . , xm ) of scalars, called the coordinates of x w.r.t. the affine frame
(a0 , (a0 a1 , . . ., a0 am )). Furthermore, every x ∈ E can be written as
x = λ0 a0 + · · · + λm am
for some unique family (λ0 , . . . , λm ) of scalars such that λ0 +· · ·+λm = 1 called the barycentric
coordinates of x with respect to the affine frame (a0 , . . . , am ).
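The passage between the two kinds of coordinates is a one-line conversion in each direction. The following sketch assumes a sample affine frame of the plane; all function names are illustrative.

```python
from fractions import Fraction as F


def to_barycentric(xs):
    """(x_1, ..., x_m) -> (lam_0, ..., lam_m), with lam_0 = 1 - (x_1 + ... + x_m)."""
    return (1 - sum(xs),) + tuple(xs)


def to_frame_coords(lams):
    """(lam_0, ..., lam_m) -> (x_1, ..., x_m); the lam_i must add up to 1."""
    if sum(lams) != 1:
        raise ValueError("barycentric coordinates must add up to 1")
    return tuple(lams[1:])


def point_from_frame_coords(frame, xs):
    """Return a_0 + sum_i x_i * (vector a_0 a_i)."""
    a0 = frame[0]
    return tuple(
        a0[k] + sum(x * (ai[k] - a0[k]) for x, ai in zip(xs, frame[1:]))
        for k in range(len(a0))
    )


def point_from_barycentric(frame, lams):
    """Return the barycenter sum_i lam_i * a_i."""
    return tuple(
        sum(l * ai[k] for l, ai in zip(lams, frame))
        for k in range(len(frame[0]))
    )


frame = ((1, 1), (3, 1), (1, 4))          # sample affine frame (a0, a1, a2)
xs = (F(1, 2), F(1, 3))                   # coordinates w.r.t. (a0, (a0a1, a0a2))
lams = to_barycentric(xs)

same_point = point_from_frame_coords(frame, xs) == point_from_barycentric(frame, lams)
print(lams, same_point)
```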
Proof. By Proposition 4.4, the family $(a_i)_{i \in I}$ is affinely dependent iff the family of vectors $(\mathbf{a_i a_j})_{j \in (I - \{i\})}$ is linearly dependent for some $i \in I$. For any $i \in I$, the family $(\mathbf{a_i a_j})_{j \in (I - \{i\})}$ is linearly dependent iff there is a family $(\lambda_j)_{j \in (I - \{i\})}$ such that $\lambda_j \neq 0$ for some $j$, and such that
\[
\sum_{j \in (I - \{i\})} \lambda_j \mathbf{a_i a_j} = 0.
\]
Even though Proposition 4.6 is rather dull, it is one of the key ingredients in the proof
of beautiful and deep theorems about convex sets, such as Carathéodory’s theorem, Radon’s
theorem, and Helly’s theorem (see Gallier [22]).
A family of two points (a, b) in E is affinely independent iff ab ≠ 0, iff a ≠ b. If a ≠ b,
the affine subspace generated by a and b is the set of all points (1 − λ)a + λb, which is
the unique line passing through a and b. A family of three points (a, b, c) in E is affinely
independent iff ab and ac are linearly independent, which means that a, b, and c are not on
the same line (they are not collinear). In this case, the affine subspace generated by (a, b, c)
is the set of all points (1 − λ − µ)a + λb + µc, which is the unique plane containing a, b,
and c. A family of four points (a, b, c, d) in E is affinely independent iff ab, ac, and ad are
linearly independent, which means that a, b, c, and d are not in the same plane (they are
not coplanar). In this case, a, b, c, and d are the vertices of a tetrahedron. Figure 4.9 shows
affine frames for |I| = 0, 1, 2, 3.
Given n+1 affinely independent points (a0 , . . . , an ) in E, we can consider the set of points
λ0 a0 + · · · + λn an , where λ0 + · · · + λn = 1 and λi ≥ 0 (λi ∈ R). Such affine combinations are
called convex combinations. This set is called the convex hull of (a0 , . . . , an ) (or n-simplex
spanned by (a0 , . . . , an )). When n = 1, we get the segment between a0 and a1 , including
a0 and a1 . When n = 2, we get the interior of the triangle whose vertices are a0 , a1 , a2 ,
including boundary points (the edges). When n = 3, we get the interior of the tetrahedron
whose vertices are a0 , a1 , a2 , a3 , including boundary points (faces and edges). The set
{a0 + λ1 a0 a1 + · · · + λn a0 an | where 0 ≤ λi ≤ 1 (λi ∈ R)}
is called the parallelotope spanned by (a0 , . . . , an ). When E has dimension 2, a parallelotope
is also called a parallelogram, and when E has dimension 3, a parallelepiped .
More generally, we say that a subset V of E is convex if for any two points a, b ∈ V , we
have c ∈ V for every point c = (1 − λ)a + λb, with 0 ≤ λ ≤ 1 (λ ∈ R).
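For a triangle in the plane, membership in the convex hull can be decided by computing barycentric coordinates and checking that none is negative. The sketch below solves the 2 x 2 system by Cramer's rule, assuming integer coordinates; the function names and the sample triangle are assumptions made for illustration.

```python
from fractions import Fraction as F


def barycentric_2d(a, b, c, x):
    """Barycentric coordinates (l0, l1, l2) of x with respect to the affinely
    independent points (a, b, c): solve x - a = l1 (b - a) + l2 (c - a)."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    if det == 0:
        raise ValueError("a, b, c are not affinely independent")
    l1 = F((x[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (x[1] - a[1]), det)
    l2 = F((b[0] - a[0]) * (x[1] - a[1]) - (x[0] - a[0]) * (b[1] - a[1]), det)
    return (1 - l1 - l2, l1, l2)


def in_triangle(a, b, c, x):
    """x is a convex combination of a, b, c iff all three coordinates are >= 0."""
    return all(l >= 0 for l in barycentric_2d(a, b, c, x))


a, b, c = (0, 0), (4, 0), (0, 4)      # sample triangle
inner = barycentric_2d(a, b, c, (1, 1))
outer = barycentric_2d(a, b, c, (3, 3))
print(inner, in_triangle(a, b, c, (1, 1)))
print(outer, in_triangle(a, b, c, (3, 3)))
```

Points on an edge have one coordinate equal to 0 and are accepted, consistent with the convex hull including its boundary.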
[Figure 4.9: Examples of affine frames.]
Points are not vectors! The following example illustrates why treating points as
vectors may cause problems. Let a, b, c be three affinely independent points in A3 .
Any point x in the plane (a, b, c) can be expressed as
x = λ0 a + λ1 b + λ2 c,
However, there is a problem when the origin of the coordinate system belongs to the plane
(a, b, c), since in this case, the matrix is not invertible! What we should really be doing is to
solve the system
λ0 Oa + λ1 Ob + λ2 Oc = Ox,
where O is any point not in the plane (a, b, c). An alternative is to use certain well-chosen
cross products.
It can be shown that barycentric coordinates correspond to various ratios of areas and
volumes; see the problems.
4.7. AFFINE MAPS 113
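Affine maps can be obtained from linear maps as follows. For simplicity of notation, the same symbol + is used for both affine spaces (instead of using both $+$ and $+'$).
Given any point $a \in E$, any point $b \in E'$, and any linear map $h\colon \overrightarrow{E} \to \overrightarrow{E'}$, we claim that the map $f\colon E \to E'$ defined such that
\[
f(a + v) = b + h(v)
\]
is an affine map. Indeed, for any family $(\lambda_i)_{i \in I}$ of scalars with $\sum_{i \in I} \lambda_i = 1$ and any family $(v_i)_{i \in I}$, since
\[
\sum_{i \in I} \lambda_i (a + v_i) = a + \sum_{i \in I} \lambda_i \mathbf{a(a + v_i)} = a + \sum_{i \in I} \lambda_i v_i
\]
and
\[
\sum_{i \in I} \lambda_i (b + h(v_i)) = b + \sum_{i \in I} \lambda_i \mathbf{b(b + h(v_i))} = b + \sum_{i \in I} \lambda_i h(v_i),
\]
we have
\begin{align*}
f\Bigl( \sum_{i \in I} \lambda_i (a + v_i) \Bigr)
&= f\Bigl( a + \sum_{i \in I} \lambda_i v_i \Bigr) \\
&= b + h\Bigl( \sum_{i \in I} \lambda_i v_i \Bigr) \\
&= b + \sum_{i \in I} \lambda_i h(v_i) \\
&= \sum_{i \in I} \lambda_i (b + h(v_i)) \\
&= \sum_{i \in I} \lambda_i f(a + v_i).
\end{align*}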
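The chain of equalities above can be confirmed on numbers. In this sketch the linear map h is given by a sample 2 x 2 matrix, and the points a, b, the test points, and the weights are all arbitrary assumptions; `make_affine` and `affine_comb` are illustrative names.

```python
from fractions import Fraction as F


def mat_vec(M, v):
    """Matrix-vector product for M given as a tuple of rows."""
    return tuple(sum(mij * vj for mij, vj in zip(row, v)) for row in M)


def make_affine(a, b, H):
    """Return the map f defined by f(a + v) = b + h(v), where h has matrix H."""
    def f(x):
        v = tuple(xi - ai for xi, ai in zip(x, a))      # the vector ax
        return tuple(bi + hi for bi, hi in zip(b, mat_vec(H, v)))
    return f


def affine_comb(points, weights):
    """Barycenter sum_i weights[i] * points[i]; the weights must add up to 1."""
    if sum(weights) != 1:
        raise ValueError("the weights must add up to 1")
    dim = len(points[0])
    return tuple(sum(w * p[k] for p, w in zip(points, weights)) for k in range(dim))


H = ((1, 2), (0, 1))                   # sample linear map
f = make_affine((1, 1), (3, -2), H)    # sample points a and b

pts = [(0, 0), (2, 1), (-1, 4)]
w = [F(1, 2), F(1, 3), F(1, 6)]

image_of_barycenter = f(affine_comb(pts, w))
barycenter_of_images = affine_comb([f(p) for p in pts], w)
print(image_of_barycenter == barycenter_of_images)
```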
[Figure 4.10: The effect of a shear.]

Note that the condition $\sum_{i \in I} \lambda_i = 1$ was implicitly used (in a hidden call to Proposition 4.1) in deriving that
\[
\sum_{i \in I} \lambda_i (a + v_i) = a + \sum_{i \in I} \lambda_i v_i
\]
and
\[
\sum_{i \in I} \lambda_i (b + h(v_i)) = b + \sum_{i \in I} \lambda_i h(v_i).
\]
defines an affine map in $\mathbb{A}^2$. It is a “shear” followed by a translation. The effect of this shear on the square $(a, b, c, d)$ is shown in Figure 4.10. The image of the square $(a, b, c, d)$ is the parallelogram $(a', b', c', d')$.
Let us consider one more example. The map
\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto
\begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} +
\begin{pmatrix} 3 \\ 0 \end{pmatrix}
\]

[Figure 4.11: The effect of an affine map.]
Proposition 4.7 Given an affine map $f\colon E \to E'$, there is a unique linear map $\overrightarrow{f}\colon \overrightarrow{E} \to \overrightarrow{E'}$ such that
\[
f(a + v) = f(a) + \overrightarrow{f}(v),
\]
for every $a \in E$ and every $v \in \overrightarrow{E}$.
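Proof. Let $a \in E$ be any point in $E$. We claim that the map defined such that
\[
\overrightarrow{f}(v) = \mathbf{f(a)f(a + v)}
\]
for every $v \in \overrightarrow{E}$ is a linear map $\overrightarrow{f}\colon \overrightarrow{E} \to \overrightarrow{E'}$. Indeed, we can write
\[
a + \lambda v = \lambda (a + v) + (1 - \lambda) a,
\]
and also
\[
a + u + v = (a + u) + (a + v) - a.
\]
Since $f$ preserves barycenters, we get
\[
\mathbf{f(a)f(a + \lambda v)} = \lambda \mathbf{f(a)f(a + v)} + (1 - \lambda) \mathbf{f(a)f(a)} = \lambda \mathbf{f(a)f(a + v)},
\]
showing that $\overrightarrow{f}(\lambda v) = \lambda \overrightarrow{f}(v)$. We also have
\[
f(a + u + v) = f(a + u) + f(a + v) - f(a),
\]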
f (b + v) = f (a + v) − f (a) + f (b),
The unique linear map $\overrightarrow{f}\colon \overrightarrow{E} \to \overrightarrow{E'}$ given by Proposition 4.7 is called the linear map associated with the affine map $f$.
Note that the condition
\[
f(a + v) = f(a) + \overrightarrow{f}(v),
\]
for every $a \in E$ and every $v \in \overrightarrow{E}$, can be stated equivalently as
\[
f(x) = f(a) + \overrightarrow{f}(\mathbf{ax}), \quad\text{or}\quad \mathbf{f(a)f(x)} = \overrightarrow{f}(\mathbf{ax}),
\]
for all $a, x \in E$. Proposition 4.7 shows that for any affine map $f\colon E \to E'$, there are points $a \in E$, $b \in E'$, and a unique linear map $\overrightarrow{f}\colon \overrightarrow{E} \to \overrightarrow{E'}$, such that
\[
f(a + v) = b + \overrightarrow{f}(v),
\]
for all $v \in \overrightarrow{E}$ (just let $b = f(a)$, for any $a \in E$). Affine maps for which $\overrightarrow{f}$ is the identity map are called translations. Indeed, if $\overrightarrow{f} = \mathrm{id}$,
\begin{align*}
f(x) = f(a) + \overrightarrow{f}(\mathbf{ax}) = f(a) + \mathbf{ax}
&= x + \mathbf{xa} + \mathbf{af(a)} + \mathbf{ax} \\
&= x + \mathbf{xa} + \mathbf{af(a)} - \mathbf{xa} = x + \mathbf{af(a)},
\end{align*}
and so
\[
\mathbf{xf(x)} = \mathbf{af(a)},
\]
which shows that $f$ is the translation induced by the vector $\mathbf{af(a)}$ (which does not depend on $a$).
Since an affine map preserves barycenters, and since an affine subspace $V$ is closed under barycentric combinations, the image $f(V)$ of $V$ is an affine subspace in $E'$. So, for example, the image of a line is a point or a line, and the image of a plane is either a point, a line, or a plane.
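It is easily verified that the composition of two affine maps is an affine map. Also, given affine maps $f\colon E \to E'$ and $g\colon E' \to E''$, we have
\[
g(f(a + v)) = g\bigl( f(a) + \overrightarrow{f}(v) \bigr) = g(f(a)) + \overrightarrow{g}\bigl( \overrightarrow{f}(v) \bigr),
\]
which shows that $\overrightarrow{g \circ f} = \overrightarrow{g} \circ \overrightarrow{f}$. It is easy to show that an affine map $f\colon E \to E'$ is injective iff $\overrightarrow{f}\colon \overrightarrow{E} \to \overrightarrow{E'}$ is injective, and that $f\colon E \to E'$ is surjective iff $\overrightarrow{f}\colon \overrightarrow{E} \to \overrightarrow{E'}$ is surjective. An affine map $f\colon E \to E'$ is constant iff $\overrightarrow{f}\colon \overrightarrow{E} \to \overrightarrow{E'}$ is the null (constant) linear map equal to 0 for all $v \in \overrightarrow{E}$.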
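In coordinates, the rule for the linear part of a composition says that the matrices multiply. The sketch below represents an affine map of the plane as a pair (A, b) standing for x -> Ax + b; this representation, the helper names, and the sample shear f are assumptions made for illustration.

```python
def mat_vec(M, v):
    """Matrix-vector product for M given as a tuple of rows."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)


def mat_mul(M, N):
    """Matrix product M N, with both matrices given as tuples of rows."""
    cols = list(zip(*N))
    return tuple(tuple(sum(m * n for m, n in zip(row, col)) for col in cols) for row in M)


def apply(f, x):
    """Apply the affine map f = (A, b) to the point x."""
    A, b = f
    return tuple(y + bi for y, bi in zip(mat_vec(A, x), b))


def compose(g, f):
    """Return g o f as a pair: linear part Ag Af, translation part Ag bf + bg."""
    (Ag, bg), (Af, bf) = g, f
    return (mat_mul(Ag, Af), tuple(y + bi for y, bi in zip(mat_vec(Ag, bf), bg)))


f = (((1, 2), (0, 1)), (3, 1))    # a sample shear followed by a translation
g = (((1, 1), (1, 3)), (3, 0))
gf = compose(g, f)

x = (2, -1)
print(apply(gf, x) == apply(g, apply(f, x)))
```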
If E is an affine space of dimension m and (a0 , a1 , . . . , am ) is an affine frame for E, then
for any other affine space F and for any sequence (b0 , b1 , . . . , bm ) of m + 1 points in F , there
is a unique affine map f : E → F such that f (ai ) = bi , for 0 ≤ i ≤ m. Indeed, f must be
such that
f (λ0 a0 + · · · + λm am ) = λ0 b0 + · · · + λm bm ,
where λ0 +· · ·+λm = 1, and this defines a unique affine map on all of E, since (a0 , a1 , . . . , am )
is an affine frame for E.
Using affine frames, affine maps can be represented in terms of matrices. We explain how an affine map $f\colon E \to E$ is represented with respect to a frame $(a_0, \ldots, a_n)$ in $E$, the more general case where an affine map $f\colon E \to F$ is represented with respect to two affine frames $(a_0, \ldots, a_n)$ in $E$ and $(b_0, \ldots, b_m)$ in $F$ being analogous. Since
\[
f(a_0 + x) = f(a_0) + \overrightarrow{f}(x)
\]
for all $x \in \overrightarrow{E}$, we have
\[
\mathbf{a_0 f(a_0 + x)} = \mathbf{a_0 f(a_0)} + \overrightarrow{f}(x).
\]
Since $x$, $\mathbf{a_0 f(a_0)}$, and $\mathbf{a_0 f(a_0 + x)}$ can be expressed as
\begin{align*}
x &= x_1 \mathbf{a_0 a_1} + \cdots + x_n \mathbf{a_0 a_n}, \\
\mathbf{a_0 f(a_0)} &= b_1 \mathbf{a_0 a_1} + \cdots + b_n \mathbf{a_0 a_n}, \\
\mathbf{a_0 f(a_0 + x)} &= y_1 \mathbf{a_0 a_1} + \cdots + y_n \mathbf{a_0 a_n},
\end{align*}
if $A = (a_{ij})$ is the $n \times n$ matrix of the linear map $\overrightarrow{f}$ over the basis $(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_n})$, letting $x$, $y$, and $b$ denote the column vectors of components $(x_1, \ldots, x_n)$, $(y_1, \ldots, y_n)$, and $(b_1, \ldots, b_n)$,
\[
\mathbf{a_0 f(a_0 + x)} = \mathbf{a_0 f(a_0)} + \overrightarrow{f}(x)
\]
is equivalent to
\[
y = Ax + b.
\]
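Note that $b \neq 0$ unless $f(a_0) = a_0$. Thus, $f$ is generally not a linear transformation, unless it has a fixed point, i.e., there is a point $a_0$ such that $f(a_0) = a_0$. The vector $b$ is the “translation part” of the affine map. Affine maps do not always have a fixed point. Obviously, nonnull translations have no fixed point. A less trivial example is given by the affine map
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
This map is a reflection about the x-axis followed by a translation along the x-axis. The affine map
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} 1 & -\sqrt{3} \\ \sqrt{3}/4 & 1/4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
can also be written as
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix} \begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
which shows that it is the composition of a rotation of angle $\pi/3$, followed by a stretch (by a factor of $2$ along the x-axis, and by a factor of $\frac{1}{2}$ along the y-axis), followed by a translation. It is easy to show that this affine map has a unique fixed point. On the other hand, the affine map
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} 8/5 & -6/5 \\ 3/10 & 2/5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
has no fixed point, even though
\[ \begin{pmatrix} 8/5 & -6/5 \\ 3/10 & 2/5 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix} \begin{pmatrix} 4/5 & -3/5 \\ 3/5 & 4/5 \end{pmatrix}, \]
and the second matrix is a rotation of angle $\theta$ such that $\cos\theta = \frac{4}{5}$ and $\sin\theta = \frac{3}{5}$. For more on fixed points of affine maps, see the problems.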
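The three examples above can be checked by solving $(I - A)x = b$, which is the fixed-point equation $x = Ax + b$ rewritten. The sketch below is only an illustration for the planar case; the function name `fixed_points_2d` and its return labels are assumptions made here, and the test `det != 0` is exact only for integer or rational entries (for the example involving $\sqrt{3}$ it relies on floating-point arithmetic).

```python
from fractions import Fraction as F
import math

def fixed_points_2d(A, b):
    """Classify the fixed points of x -> A x + b in the plane by solving (I - A) x = b."""
    m00, m01 = 1 - A[0][0], -A[0][1]
    m10, m11 = -A[1][0], 1 - A[1][1]
    det = m00 * m11 - m01 * m10
    if det != 0:
        # I - A is invertible: Cramer's rule gives the unique fixed point.
        x1 = (b[0] * m11 - m01 * b[1]) / det
        x2 = (m00 * b[1] - m10 * b[0]) / det
        return "unique", (x1, x2)
    if m00 == m01 == m10 == m11 == 0:
        # A is the identity, so the map is a translation.
        return ("all", None) if b[0] == 0 and b[1] == 0 else ("none", None)
    # I - A has rank 1: solvable iff b lies in its column space.
    consistent = (m00 * b[1] - m10 * b[0] == 0) and (m01 * b[1] - m11 * b[0] == 0)
    return ("line", None) if consistent else ("none", None)

r3 = math.sqrt(3)
glide = fixed_points_2d([[1, 0], [0, -1]], [1, 0])
rot_stretch = fixed_points_2d([[1, -r3], [r3 / 4, 0.25]], [1, 1])
no_fixed = fixed_points_2d([[F(8, 5), F(-6, 5)], [F(3, 10), F(2, 5)]], [1, 1])
```

Here `glide` and `no_fixed` come out as `"none"`, while `rot_stretch` reports a unique fixed point, in agreement with the claims in the text.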
There is a useful trick to convert the equation y = Ax + b into what looks like a linear
equation. The trick is to consider an (n + 1) × (n + 1) matrix. We add 1 as the (n + 1)th
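component to the vectors $x$, $y$, and $b$, and form the $(n + 1) \times (n + 1)$ matrix
\[ \begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix} \]
so that $y = Ax + b$ is equivalent to
\[ \begin{pmatrix} y \\ 1 \end{pmatrix} = \begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ 1 \end{pmatrix}. \]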
This trick is very useful in kinematics and dynamics, where A is a rotation matrix. Such
affine maps are called rigid motions.
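As a small illustration of this trick, the sketch below builds the $(n + 1) \times (n + 1)$ matrix from $A$ and $b$ and applies it to the vector $x$ extended by a $1$. The helper names (`homogeneous`, `mat_vec`, `apply_affine`) are hypothetical and plain Python lists are assumed.

```python
def homogeneous(A, b):
    """Build the block matrix [[A, b], [0, 1]] from an n x n matrix A and a vector b."""
    n = len(A)
    return [list(A[i]) + [b[i]] for i in range(n)] + [[0] * n + [1]]

def mat_vec(M, v):
    """Plain matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def apply_affine(A, b, x):
    """Compute y = A x + b through the homogeneous matrix; also return the last component."""
    y1 = mat_vec(homogeneous(A, b), list(x) + [1])
    return y1[:-1], y1[-1]

# A rotation by a quarter turn followed by the translation (2, 3).
y, last = apply_affine([[0, -1], [1, 0]], [2, 3], [1, 1])
```

The last component stays equal to 1, which is what allows affine maps to be composed by ordinary matrix multiplication.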
If f : E → E′ is a bijective affine map, given any three collinear points a, b, c in E,
with a ≠ b, where, say, c = (1 − λ)a + λb, since f preserves barycenters, we have f (c) =
(1 − λ)f (a) + λf (b), which shows that f (a), f (b), f (c) are collinear in E′. There is a converse
to this property, which is simpler to state when the ground field is K = R. The converse
states that given any bijective function f : E → E′ between two real affine spaces of the same
dimension n ≥ 2, if f maps any three collinear points to collinear points, then f is affine.
The proof is rather long (see Berger [3] or Samuel [42]).
Given three collinear points a, b, c, where a ≠ c, we have b = (1 − β)a + βc for some
unique β, and we define the ratio of the sequence a, b, c, as
\[ \mathrm{ratio}(a, b, c) = \frac{\beta}{1 - \beta} = \frac{\mathbf{ab}}{\mathbf{bc}}, \]
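A minimal sketch of this definition, assuming points given by integer or rational coordinates: it recovers β from b = (1 − β)a + βc and returns β/(1 − β). The function names `ratio` and `affine` and the sample map are illustrative assumptions; the quotient is undefined when β = 1 (that is, b = c), and the sketch simply raises an error in that case.

```python
from fractions import Fraction as F

def ratio(a, b, c):
    """ratio(a, b, c) = beta / (1 - beta), where b = (1 - beta) a + beta c and a != c."""
    k = next(i for i in range(len(a)) if c[i] != a[i])  # a coordinate where a and c differ
    beta = F(b[k] - a[k]) / (c[k] - a[k])
    if any(b[i] - a[i] != beta * (c[i] - a[i]) for i in range(len(a))):
        raise ValueError("points are not collinear")
    if beta == 1:
        raise ValueError("ratio undefined when b = c")
    return beta / (1 - beta)

def affine(A, t, p):
    """Apply the planar affine map p -> A p + t."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + t[0],
            A[1][0] * p[0] + A[1][1] * p[1] + t[1])

a, b, c = (0, 0), (1, 2), (3, 6)
A, t = [[2, 1], [0, 3]], (1, -1)          # an arbitrary bijective affine map
r_before = ratio(a, b, c)
r_after = ratio(affine(A, t, a), affine(A, t, b), affine(A, t, c))
```

Both values equal 1/2 here, consistent with the fact that affine maps preserve barycenters and hence ratios.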
for every x ∈ E.
[Figure 4.12: the effect of a central dilatation of center d, with triangles (a, b, c) and (a′, b′, c′)]
Remark: The terminology does not seem to be universally agreed upon. The terms affine
dilatation and central dilatation are used by Pedoe [41]. Snapper and Troyer use the term
dilation for an affine dilatation and magnification for a central dilatation [47]. Samuel uses
homothety for a central dilatation, a direct translation of the French “homothétie” [42]. Since
dilation is shorter than dilatation and somewhat easier to pronounce, perhaps we should use
that!
Observe that $H_{a,\lambda}(a) = a$, and when $\lambda \neq 0$ and $x \neq a$, $H_{a,\lambda}(x)$ is on the line defined by
$a$ and $x$, and is obtained by “scaling” $\mathbf{ax}$ by $\lambda$.
Figure 4.12 shows the effect of a central dilatation of center $d$. The triangle $(a, b, c)$ is
magnified to the triangle $(a', b', c')$. Note how every line is mapped to a parallel line.
When $\lambda = 1$, $H_{a,1}$ is the identity. Note that $\overrightarrow{H_{a,\lambda}} = \lambda\, \mathrm{id}_{\overrightarrow{E}}$. When $\lambda \neq 0$, it is clear that
$H_{a,\lambda}$ is an affine bijection. It is immediately verified that
Another point worth mentioning is that affine bijections preserve the ratio of volumes of
parallelotopes. Indeed, given any basis $B = (u_1, \ldots, u_m)$ of the vector space $\overrightarrow{E}$ associated
with the affine space $E$, given any $m + 1$ affinely independent points $(a_0, \ldots, a_m)$, we can
compute the determinant $\det_B(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_m})$ w.r.t. the basis $B$. For any bijective affine
map $f\colon E \to E$, since
\[ \det\nolimits_B\bigl(\overrightarrow{f}(\mathbf{a_0 a_1}), \ldots, \overrightarrow{f}(\mathbf{a_0 a_m})\bigr) = \det(\overrightarrow{f})\, \det\nolimits_B(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_m}) \]
and the determinant of a linear map is intrinsic (i.e., depends only on $\overrightarrow{f}$, and not on the
particular basis $B$), we conclude that the ratio
\[ \frac{\det_B\bigl(\overrightarrow{f}(\mathbf{a_0 a_1}), \ldots, \overrightarrow{f}(\mathbf{a_0 a_m})\bigr)}{\det_B(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_m})} = \det(\overrightarrow{f}) \]
is independent of the basis $B$. Since $\det_B(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_m})$ is the volume of the parallelotope
spanned by $(a_0, \ldots, a_m)$, where the parallelotope spanned by any point $a$ and the vectors
$(u_1, \ldots, u_m)$ has unit volume (see Berger [3], Section 9.12), we see that affine bijections
preserve the ratio of volumes of parallelotopes. In fact, this ratio is independent of the
choice of the parallelotopes of unit volume. In particular, the affine bijections $f \in \mathbf{GA}(E)$
such that $\det(\overrightarrow{f}) = 1$ preserve volumes. These affine maps form a subgroup $\mathbf{SA}(E)$ of
GA(E) called the special affine group of E. We now take a glimpse at affine geometry.
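The determinant identity above can be checked on a small planar example. This is only a sketch with made-up data; `det2`, `vec`, `affine`, and `area_ratio` are hypothetical helper names, and integer coordinates are assumed so that the quotient is exact.

```python
from fractions import Fraction

def det2(u, v):
    """Determinant of two plane vectors w.r.t. the standard basis."""
    return u[0] * v[1] - u[1] * v[0]

def vec(p, q):
    """The vector pq = q - p."""
    return (q[0] - p[0], q[1] - p[1])

def affine(A, t, p):
    """Apply the planar affine map p -> A p + t."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + t[0],
            A[1][0] * p[0] + A[1][1] * p[1] + t[1])

def area_ratio(A, t, a0, a1, a2):
    """Ratio det(f(a0)f(a1), f(a0)f(a2)) / det(a0a1, a0a2) for affinely independent a0, a1, a2."""
    fa0, fa1, fa2 = (affine(A, t, p) for p in (a0, a1, a2))
    return Fraction(det2(vec(fa0, fa1), vec(fa0, fa2)), det2(vec(a0, a1), vec(a0, a2)))

A, t = [[2, 1], [0, 3]], (5, -7)      # det(A) = 6
r1 = area_ratio(A, t, (1, 1), (2, 1), (1, 3))
r2 = area_ratio(A, t, (0, 0), (1, 1), (-1, 2))
```

Both ratios equal det(A) = 6, whatever the choice of the three points and of the translation part.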
Proposition 4.9 Given any affine space E, if H1 , H2 , H3 are any three distinct parallel
hyperplanes, and A and B are any two lines not parallel to Hi , letting ai = Hi ∩ A and
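bi = Hi ∩ B, then the following ratios are equal:
\[ \frac{\mathbf{a_1 a_3}}{\mathbf{a_1 a_2}} = \frac{\mathbf{b_1 b_3}}{\mathbf{b_1 b_2}} = \rho. \]
Conversely, for any point $d$ on the line $A$, if $\frac{\mathbf{a_1 d}}{\mathbf{a_1 a_2}} = \rho$, then $d = a_3$.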
Proof. Figure 4.13 illustrates the theorem of Thales. We sketch a proof, leaving the details
as an exercise. Since $H_1, H_2, H_3$ are parallel, they have the same direction $\overrightarrow{H}$, a hyperplane
in $\overrightarrow{E}$. Let $u \in \overrightarrow{E} - \overrightarrow{H}$ be any nonnull vector such that $A = a_1 + \mathbb{R}u$. Since $A$ is not
parallel to $H$, we have $\overrightarrow{E} = \overrightarrow{H} \oplus \mathbb{R}u$, and thus we can define the linear map $p\colon \overrightarrow{E} \to \mathbb{R}u$,
the projection on $\mathbb{R}u$ parallel to $\overrightarrow{H}$. This linear map induces an affine map $f\colon E \to A$, by
defining $f$ such that
\[ f(b_1 + w) = a_1 + p(w), \]
for all $w \in \overrightarrow{E}$. Clearly, $f(b_1) = a_1$, and since $H_1, H_2, H_3$ all have direction $\overrightarrow{H}$, we also have
$f(b_2) = a_2$ and $f(b_3) = a_3$. Since $f$ is affine, it preserves ratios, and thus
\[ \frac{\mathbf{a_1 a_3}}{\mathbf{a_1 a_2}} = \frac{\mathbf{b_1 b_3}}{\mathbf{b_1 b_2}}. \]
The converse is immediate.
[Figure 4.13: the theorem of Thales, with parallel hyperplanes H1, H2, H3 cut by the lines A and B at a1, a2, a3 and b1, b2, b3]
We also have the following simple proposition, whose proof is left as an easy exercise.
Proposition 4.10 Given any affine space E, given any two distinct points a, b ∈ E, and for
any affine dilatation f different from the identity, if a′ = f (a), D = ⟨a, b⟩ is the line passing
through a and b, and D′ is the line parallel to D and passing through a′, the following are
equivalent:
(i) b′ = f (b);
(ii) If f is a translation, then b′ is the intersection of D′ with the line parallel to ⟨a, a′⟩
passing through b;
If f is a dilatation of center c, then b′ = D′ ∩ ⟨c, b⟩.
The first case is the parallelogram law, and the second case follows easily from Thales’
theorem.
We are now ready to prove two classical results of affine geometry, Pappus’s theorem and
Desargues’s theorem. Actually, these results are theorems of projective geometry, and we
are stating affine versions of these important results. There are stronger versions that are
best proved using projective geometry.
Theorem 4.11 Given any affine plane E, any two distinct lines D and D′, then for any
distinct points a, b, c on D and a′, b′, c′ on D′, if a, b, c, a′, b′, c′ are distinct from the intersection
of D and D′ (if D and D′ intersect) and if the lines ⟨a, b′⟩ and ⟨a′, b⟩ are parallel,
and the lines ⟨b, c′⟩ and ⟨b′, c⟩ are parallel, then the lines ⟨a, c′⟩ and ⟨a′, c⟩ are parallel.
Proof. Pappus’s theorem is illustrated in Figure 4.14. If D and D′ are not parallel, let d be
their intersection. Let f be the dilatation of center d such that f (a) = b, and let g be the
dilatation of center d such that g(b) = c. Since the lines ⟨a, b′⟩ and ⟨a′, b⟩ are parallel, and
the lines ⟨b, c′⟩ and ⟨b′, c⟩ are parallel, by Proposition 4.10 we have a′ = f (b′) and b′ = g(c′).
However, we observed that dilatations with the same center commute, and thus f ◦ g = g ◦ f ,
and thus, letting h = g ◦ f , we get c = h(a) and a′ = h(c′). Again, by Proposition 4.10, the
lines ⟨a, c′⟩ and ⟨a′, c⟩ are parallel. If D and D′ are parallel, we use translations instead of
dilatations.
[Figure 4.14: Pappus’s theorem (affine version)]
Theorem 4.12 Given any affine space E and given any two triangles (a, b, c) and (a′, b′, c′),
where a, b, c, a′, b′, c′ are all distinct, if ⟨a, b⟩ and ⟨a′, b′⟩ are parallel and ⟨b, c⟩ and ⟨b′, c′⟩ are
parallel, then ⟨a, c⟩ and ⟨a′, c′⟩ are parallel iff the lines ⟨a, a′⟩, ⟨b, b′⟩, and ⟨c, c′⟩ are either
parallel or concurrent (i.e., intersect in a common point).
Proof. We prove half of the proposition, the direction in which it is assumed that ⟨a, c⟩
and ⟨a′, c′⟩ are parallel, leaving the converse as an exercise. Since the lines ⟨a, b⟩ and ⟨a′, b′⟩
are parallel, the points a, b, a′, b′ are coplanar. Thus, either ⟨a, a′⟩ and ⟨b, b′⟩ are parallel, or
they have some intersection d. We consider the second case where they intersect, leaving
the other case as an easy exercise. Let f be the dilatation of center d such that f (a) = a′.
By Proposition 4.10, we get f (b) = b′. If f (c) = c″, again by Proposition 4.10 twice, the
lines ⟨b, c⟩ and ⟨b′, c″⟩ are parallel, and the lines ⟨a, c⟩ and ⟨a′, c″⟩ are parallel. From this it
follows that c″ = c′. Indeed, recall that ⟨b, c⟩ and ⟨b′, c′⟩ are parallel, and similarly ⟨a, c⟩ and
⟨a′, c′⟩ are parallel. Thus, the lines ⟨b′, c″⟩ and ⟨b′, c′⟩ are identical, and similarly the lines
⟨a′, c″⟩ and ⟨a′, c′⟩ are identical. Since the vectors a′c′ and b′c′ are linearly independent, these lines have
a unique intersection, which must be c″ = c′.
The direction where it is assumed that the lines ⟨a, a′⟩, ⟨b, b′⟩, and ⟨c, c′⟩ are either parallel
or concurrent is left as an exercise (in fact, the proof is quite similar).
[Figure: Desargues’s theorem, with triangles (a, b, c) and (a′, b′, c′) and the point d]
is an affine subspace of Am , and if λ1 , . . . , λm are not all null, it turns out that it is a subspace
of dimension m − 1 called a hyperplane.
We can interpret the equation
λ1 x1 + · · · + λm xm = µ
in terms of the map f : Rm → R defined such that
f (x1 , . . . , xm ) = λ1 x1 + · · · + λm xm − µ
for all (x1 , . . . , xm ) ∈ Rm . It is immediately verified that this map is affine, and the set H of
solutions of the equation
λ1 x1 + · · · + λm xm = µ
is the null set, or kernel, of the affine map f : Am → R, in the sense that
\[ H = f^{-1}(0) = \{x \in \mathbb{A}^m \mid f(x) = 0\}, \]
where x = (x1 , . . . , xm ).
Thus, it is interesting to consider affine forms, which are just affine maps f : E → R from
an affine space to R. Unlike linear forms f ∗ , for which Ker f ∗ is never empty (since it always
contains the vector 0), it is possible that f −1 (0) = ∅ for an affine form f . Given an affine
map f : E → R, we also denote f −1 (0) by Ker f , and we call it the kernel of f . Recall that an
(affine) hyperplane is an affine subspace of codimension 1. The relationship between affine
hyperplanes and affine forms is given by the following proposition.
(a) Given any nonconstant affine form f : E → R, its kernel H = Ker f is a hyperplane.
(b) For any hyperplane H in E, there is a nonconstant affine form f : E → R such that
H = Ker f . For any other affine form g: E → R such that H = Ker g, there is some
λ ∈ R such that g = λf (with λ ≠ 0).
(c) Given any hyperplane H in E and any (nonconstant) affine form f : E → R such that
H = Ker f , every hyperplane H′ parallel to H is defined by a nonconstant affine form
g such that g(a) = f (a) − λ, for all a ∈ E and some λ ∈ R.
Proof . The proof, using Proposition 2.29, is straightforward and is omitted. It is given in
Gallier [21].
Also recall that every linear form f ∗ is such that f ∗ (x) = λ1 x1 + · · · + λn xn , for every
x = x1 u1 + · · · + xn un and some λ1 , . . . , λn ∈ R. Since an affine form f : E → R satisfies the
property $f(a_0 + x) = f(a_0) + \overrightarrow{f}(x)$, denoting $f(a_0 + x)$ by $f(x_1, \ldots, x_n)$, we see that we have
f (x1 , . . . , xn ) = λ1 x1 + · · · + λn xn + µ,
λ1 x1 + · · · + λn xn + µ = 0.
4.11 Problems
Problem 4.1 Given a triangle (a, b, c), give a geometric construction of the barycenter of
the weighted points (a, 1/4), (b, 1/4), and (c, 1/2). Give a geometric construction of the barycenter
of the weighted points (a, 3/2), (b, 3/2), and (c, −2).
Problem 4.2 Given a tetrahedron (a, b, c, d) and any two distinct points x, y ∈ {a, b, c, d},
let $m_{x,y}$ be the middle of the edge (x, y). Prove that the barycenter g of the weighted points
(a, 1/4), (b, 1/4), (c, 1/4), and (d, 1/4) is the common intersection of the line segments $(m_{a,b}, m_{c,d})$,
$(m_{a,c}, m_{b,d})$, and $(m_{a,d}, m_{b,c})$. Show that if $g_d$ is the barycenter of the weighted points
(a, 1/3), (b, 1/3), (c, 1/3), then g is the barycenter of (d, 1/4) and ($g_d$, 3/4).
Problem 4.3 Let $E$ be a nonempty set, and $\overrightarrow{E}$ a vector space, and assume that there is a
function $\Phi\colon E \times E \to \overrightarrow{E}$, such that if we denote $\Phi(a, b)$ by $\mathbf{ab}$, the following properties hold:
(2) For every $a \in E$, the map $\Phi_a\colon E \to \overrightarrow{E}$ defined such that for every $b \in E$, $\Phi_a(b) = \mathbf{ab}$,
is a bijection.
Let $\Psi_a\colon \overrightarrow{E} \to E$ be the inverse of $\Phi_a\colon E \to \overrightarrow{E}$.
Prove that the function $+\colon E \times \overrightarrow{E} \to E$ defined such that
\[ a + u = \Psi_a(u) \]
for all $a \in E$ and all $u \in \overrightarrow{E}$ makes $(E, \overrightarrow{E}, +)$ into an affine space.
Note. We showed in the text that an affine space $(E, \overrightarrow{E}, +)$ satisfies the properties stated
above. Thus, we obtain an equivalent characterization of affine spaces.
Problem 4.4 Given any three points $a, b, c$ in the affine plane $\mathbb{A}^2$, letting $(a_1, a_2)$, $(b_1, b_2)$,
and $(c_1, c_2)$ be the coordinates of $a, b, c$, with respect to the standard affine frame for $\mathbb{A}^2$,
prove that $a, b, c$ are collinear iff
\[ \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ 1 & 1 & 1 \end{vmatrix} = 0, \]
i.e., the determinant is null.
Letting $(a_0, a_1, a_2)$, $(b_0, b_1, b_2)$, and $(c_0, c_1, c_2)$ be the barycentric coordinates of $a, b, c$ with
respect to the standard affine frame for $\mathbb{A}^2$, prove that $a, b, c$ are collinear iff
\[ \begin{vmatrix} a_0 & b_0 & c_0 \\ a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{vmatrix} = 0. \]
Given any four points $a, b, c, d$ in the affine space $\mathbb{A}^3$, letting $(a_1, a_2, a_3)$, $(b_1, b_2, b_3)$, $(c_1, c_2, c_3)$,
and $(d_1, d_2, d_3)$ be the coordinates of $a, b, c, d$, with respect to the standard affine frame for
$\mathbb{A}^3$, prove that $a, b, c, d$ are coplanar iff
\[ \begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ 1 & 1 & 1 & 1 \end{vmatrix} = 0, \]
or, equivalently,
where (x, y, z) are the barycentric coordinates of the generic point on the line ⟨a, b⟩.
Prove that the equation of a line in barycentric coordinates is of the form
ux + vy + wz = 0,
ux + vy + wz = 0 and u′x + v′y + w′z = 0
represent the same line in barycentric coordinates iff (u′, v′, w′) = λ(u, v, w) for some λ ∈ R
(with λ ≠ 0).
A triple (u, v, w) where u ≠ v or v ≠ w or u ≠ w is called a system of tangential
coordinates of the line defined by the equation
ux + vy + wz = 0.
Problem 4.7 Given two lines $D$ and $D'$ in $\mathbb{A}^2$ defined by tangential coordinates $(u, v, w)$
and $(u', v', w')$ (as defined in Problem 4.6), let
\[ d = \begin{vmatrix} u & v & w \\ u' & v' & w' \\ 1 & 1 & 1 \end{vmatrix} = vw' - wv' + wu' - uw' + uv' - vu'. \]
(a) Prove that $D$ and $D'$ have a unique intersection point iff $d \neq 0$, and that when it
exists, the barycentric coordinates of this intersection point are
\[ \frac{1}{d}\,(vw' - wv',\; wu' - uw',\; uv' - vu'). \]
(b) Letting (O, i, j) be any affine frame for A2 , recall that when x + y + z = 0, for any
point a, the vector
xaO + yai + zaj
is independent of a and equal to
The triple (x, y, z) such that x + y + z = 0 is called the barycentric coordinates of the vector
yOi + zOj w.r.t. the affine frame (O, i, j).
Given any affine frame (O, i, j), prove that for u ≠ v or v ≠ w or u ≠ w, the line of
equation
ux + vy + wz = 0
in barycentric coordinates (x, y, z) (where x + y + z = 1) has for direction the set of vectors
of barycentric coordinates (x, y, z) such that
ux + vy + wz = 0
(where x + y + z = 0).
Prove that D and D′ are parallel iff d = 0. In this case, if D ≠ D′, show that the common
direction of D and D′ is defined by the vector of barycentric coordinates
(c) Given three lines $D$, $D'$, and $D''$, at least two of which are distinct and defined by
tangential coordinates $(u, v, w)$, $(u', v', w')$, and $(u'', v'', w'')$, prove that $D$, $D'$, and $D''$ are
parallel or have a unique intersection point iff
\[ \begin{vmatrix} u & v & w \\ u' & v' & w' \\ u'' & v'' & w'' \end{vmatrix} = 0. \]
\[ \frac{MB}{MC}\,\frac{NC}{NA}\,\frac{PA}{PB} = -\frac{m''np'}{m'n''p}. \]
\[ \frac{MB}{MC}\,\frac{NC}{NA}\,\frac{PA}{PB} = 1. \]
(c) Prove Ceva’s theorem: The lines $AM$, $BN$, $CP$ have a unique intersection point or
are parallel iff
\[ m''np' - m'n''p = 0. \]
When $M \neq C$, $N \neq A$, and $P \neq B$, this is equivalent to
\[ \frac{MB}{MC}\,\frac{NC}{NA}\,\frac{PA}{PB} = -1. \]
Problem 4.9 This problem uses notions and results from Problems 4.6 and 4.7. In view of
(a) and (b) of Problem 4.7, it is natural to extend the notion of barycentric coordinates of a
point in $\mathbb{A}^2$ as follows. Given any affine frame $(a, b, c)$ in $\mathbb{A}^2$, we will say that the barycentric
coordinates $(x, y, z)$ of a point $M$, where $x + y + z = 1$, are the normalized barycentric
coordinates of $M$. Then, any triple $(x, y, z)$ such that $x + y + z \neq 0$ is also called a system
of barycentric coordinates for the point of normalized barycentric coordinates
\[ \frac{1}{x + y + z}\,(x, y, z). \]
With this convention, the intersection of the two lines $D$ and $D'$ is either a point or a vector,
in both cases of barycentric coordinates
When the above is a vector, we can think of it as a point at infinity (in the direction of the
line defined by that vector).
Let $(D_0, D_0')$, $(D_1, D_1')$, and $(D_2, D_2')$ be three pairs of six distinct lines, such that the
four lines belonging to any union of two of the above pairs are neither parallel nor concurrent
(have a common intersection point). If $D_0$ and $D_0'$ have a unique intersection point, let $M$ be
this point, and if $D_0$ and $D_0'$ are parallel, let $M$ denote a nonnull vector defining the common
direction of $D_0$ and $D_0'$. In either case, let $(m, m', m'')$ be the barycentric coordinates of $M$,
as explained at the beginning of the problem. We call $M$ the intersection of $D_0$ and $D_0'$.
Similarly, define $N = (n, n', n'')$ as the intersection of $D_1$ and $D_1'$, and $P = (p, p', p'')$ as the
intersection of $D_2$ and $D_2'$.
Prove that
\[ \begin{vmatrix} m & n & p \\ m' & n' & p' \\ m'' & n'' & p'' \end{vmatrix} = 0 \]
iff either
(i) $(D_0, D_0')$, $(D_1, D_1')$, and $(D_2, D_2')$ are pairs of parallel lines; or
(ii) the lines of some pair $(D_i, D_i')$ are parallel, each pair $(D_j, D_j')$ (with $j \neq i$) has a unique
intersection point, and these two intersection points are distinct and determine a line
parallel to the lines of the pair $(D_i, D_i')$; or
(iii) each pair $(D_i, D_i')$ ($i = 0, 1, 2$) has a unique intersection point, and these points $M, N, P$
are distinct and collinear.
Problem 4.11 Prove the following version of Pappus’s theorem. Let D and D′ be distinct
lines, and let A, B, C and A′, B′, C′ be distinct points respectively on D and D′. If these
points are all distinct from the intersection of D and D′ (if it exists), then the intersection
points (in the sense of Problem 4.7) of the pairs of lines (BC′, CB′), (CA′, AC′), and
(AB′, BA′) are collinear in the sense of Problem 4.9.
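Problem 4.12 The purpose of this problem is to prove Pascal’s theorem for the nondegenerate
conics. In the affine plane $\mathbb{A}^2$, a conic is the set of points of coordinates $(x, y)$ such
that
\[ \alpha x^2 + \beta y^2 + 2\gamma xy + 2\delta x + 2\lambda y + \mu = 0, \]
where $\alpha \neq 0$ or $\beta \neq 0$ or $\gamma \neq 0$. We can write the equation of the conic as
\[ (x, y, 1) \begin{pmatrix} \alpha & \gamma & \delta \\ \gamma & \beta & \lambda \\ \delta & \lambda & \mu \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = 0. \]
Let
\[ B = \begin{pmatrix} \alpha & \gamma & \delta \\ \gamma & \beta & \lambda \\ \delta & \lambda & \mu \end{pmatrix}, \quad C = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \]
(a) Letting $A = C^{\top} B C$, prove that the equation of the conic becomes
\[ X^{\top} A X = 0. \]
Prove that $A$ is symmetric, that $\det(A) = \det(B)$, and that $X^{\top} A X$ is homogeneous of degree
2. The equation $X^{\top} A X = 0$ is called the homogeneous equation of the conic.
We say that a conic of homogeneous equation $X^{\top} A X = 0$ is nondegenerate if $\det(A) \neq 0$,
and degenerate if $\det(A) = 0$. Show that this condition does not depend on the choice of the
affine frame.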
(b) Given an affine frame (A, B, C), prove that any conic passing through A, B, C has
an equation of the form
ayz + bxz + cxy = 0.
Prove that a conic containing more than one point is degenerate iff it contains three distinct
collinear points. In this case, the conic is the union of two lines.
(c) Prove Pascal’s theorem. Given any six distinct points A, B, C, A′, B′, C′, if no three of
the above points are collinear, then a nondegenerate conic passes through these six points iff
the intersection points M, N, P (in the sense of Problem 4.7) of the pairs of lines (BC′, CB′),
(CA′, AC′) and (AB′, BA′) are collinear in the sense of Problem 4.9.
Hint. Use the affine frame (A, B, C), and let (a, a′, a″), (b, b′, b″), and (c, c′, c″) be the
barycentric coordinates of A′, B′, C′ respectively, and show that M, N, P have barycentric
coordinates
(bc, cb′, c″b), (c′a, c′a′, c″a′), (ab″, a″b′, a″b″).
Problem 4.13 The centroid of a triangle (a, b, c) is the barycenter of (a, 1/3), (b, 1/3), (c, 1/3).
If an affine map takes the vertices of triangle ∆1 = {(0, 0), (6, 0), (0, 9)} to the vertices of
triangle ∆2 = {(1, 1), (5, 4), (3, 1)}, does it also take the centroid of ∆1 to the centroid of
∆2? Justify your answer.
Problem 4.14 Let E be an affine space over R, and let (a1 , . . . , an ) be any n ≥ 3 points in
E. Let (λ1 , . . . , λn ) be any n scalars in R, with λ1 + · · · + λn = 1. Show that there must be
some i, 1 ≤ i ≤ n, such that λi ≠ 1. To simplify the notation, assume that λ1 ≠ 1. Show
that the barycenter λ1 a1 + · · · + λn an can be obtained by first determining the barycenter b
of the n − 1 points a2 , . . . , an assigned some appropriate weights, and then the barycenter of
a1 and b assigned the weights λ1 and λ2 + · · · + λn . From this, show that the barycenter of
any n ≥ 3 points can be determined by repeated computations of barycenters of two points.
Deduce from the above that a nonempty subset V of E is an affine subspace iff whenever V
contains any two points x, y ∈ V , then V contains the entire line (1 − λ)x + λy, λ ∈ R.
Problem 4.15 Assume that K is a field such that 2 = 1 + 1 ≠ 0, and let E be an affine
space over K. In the case where λ1 + · · · + λn = 1 and λi = 1, for 1 ≤ i ≤ n and n ≥ 3,
show that the barycenter a1 + a2 + · · · + an can still be computed by repeated computations
of barycenters of two points.
Finally, assume that the field K contains at least three elements (thus, there is some
µ ∈ K such that µ ≠ 0 and µ ≠ 1, but 2 = 1 + 1 = 0 is possible). Prove that the barycenter
of any n ≥ 3 points can be determined by repeated computations of barycenters of two
points. Prove that a nonempty subset V of E is an affine subspace iff whenever V contains
any two points x, y ∈ V , then V contains the entire line (1 − λ)x + λy, λ ∈ K.
Hint. When 2 = 0, λ1 + · · · + λn = 1 and λi = 1, for 1 ≤ i ≤ n, show that n must be
odd, and that the problem reduces to computing the barycenter of three points in two steps
involving two barycenters. Since there is some µ ∈ K such that µ ≠ 0 and µ ≠ 1, note that
$\mu^{-1}$ and $(1 - \mu)^{-1}$ both exist, and use the fact that
\[ \frac{-\mu}{1 - \mu} + \frac{1}{1 - \mu} = 1. \]
Problem 4.16 (i) Let (a, b, c) be three points in A2 , and assume that (a, b, c) are not
collinear. For any point x ∈ A2 , if x = λ0 a + λ1 b + λ2 c, where (λ0 , λ1 , λ2 ) are the barycentric
coordinates of x with respect to (a, b, c), show that
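\[ \lambda_0 = \frac{\det(\mathbf{xb}, \mathbf{bc})}{\det(\mathbf{ab}, \mathbf{ac})}, \quad \lambda_1 = \frac{\det(\mathbf{ax}, \mathbf{ac})}{\det(\mathbf{ab}, \mathbf{ac})}, \quad \lambda_2 = \frac{\det(\mathbf{ab}, \mathbf{ax})}{\det(\mathbf{ab}, \mathbf{ac})}. \]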
Conclude that λ0 , λ1 , λ2 are certain signed ratios of the areas of the triangles (a, b, c), (x, a, b),
(x, a, c), and (x, b, c).
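The three formulas of part (i) can be sanity-checked numerically before attempting the proof. The sketch below is an illustration only: `barycentric_by_det` is a hypothetical name, integer or rational coordinates are assumed, and the sample triangle is arbitrary.

```python
from fractions import Fraction as F

def det2(u, v):
    """Determinant of two plane vectors."""
    return u[0] * v[1] - u[1] * v[0]

def vec(p, q):
    """The vector pq = q - p."""
    return (q[0] - p[0], q[1] - p[1])

def barycentric_by_det(x, a, b, c):
    """Barycentric coordinates of x w.r.t. (a, b, c) via the determinant quotients of part (i)."""
    d = det2(vec(a, b), vec(a, c))
    if d == 0:
        raise ValueError("a, b, c are collinear")
    l0 = F(det2(vec(x, b), vec(b, c))) / d
    l1 = F(det2(vec(a, x), vec(a, c))) / d
    l2 = F(det2(vec(a, b), vec(a, x))) / d
    return l0, l1, l2

a, b, c, x = (0, 0), (4, 0), (0, 4), (1, 2)
l0, l1, l2 = barycentric_by_det(x, a, b, c)
# Rebuild x as the barycenter l0*a + l1*b + l2*c.
rebuilt = tuple(l0 * a[k] + l1 * b[k] + l2 * c[k] for k in range(2))
```

For this data the coordinates are (1/4, 1/4, 1/2); they sum to 1 and the barycenter they define is again x.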
(ii) Let (a, b, c) be three points in A3 , and assume that (a, b, c) are not collinear. For any
point x in the plane determined by (a, b, c), if x = λ0 a + λ1 b + λ2 c, where (λ0 , λ1 , λ2 ) are the
barycentric coordinates of x with respect to (a, b, c), show that
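\[ \lambda_0 = \frac{\mathbf{xb} \times \mathbf{bc}}{\mathbf{ab} \times \mathbf{ac}}, \quad \lambda_1 = \frac{\mathbf{ax} \times \mathbf{ac}}{\mathbf{ab} \times \mathbf{ac}}, \quad \lambda_2 = \frac{\mathbf{ab} \times \mathbf{ax}}{\mathbf{ab} \times \mathbf{ac}}. \]
Given any point $O$ not in the plane of the triangle $(a, b, c)$, prove that
and
\[ \lambda_0 = \frac{\det(\mathbf{Ox}, \mathbf{Ob}, \mathbf{Oc})}{\det(\mathbf{Oa}, \mathbf{Ob}, \mathbf{Oc})}. \]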
(iii) Let (a, b, c, d) be four points in A3 , and assume that (a, b, c, d) are not coplanar. For
any point x ∈ A3 , if x = λ0 a + λ1 b + λ2 c + λ3 d, where (λ0 , λ1 , λ2 , λ3 ) are the barycentric
coordinates of x with respect to (a, b, c, d), show that
and
\[ \lambda_0 = \frac{\det(\mathbf{xb}, \mathbf{bc}, \mathbf{bd})}{\det(\mathbf{ab}, \mathbf{ac}, \mathbf{ad})}. \]
Conclude that λ0 , λ1 , λ2 , λ3 are certain signed ratios of the volumes of the five tetrahedra
(a, b, c, d), (x, a, b, c), (x, a, b, d), (x, a, c, d), and (x, b, c, d).
(iv) Let (a0 , . . . , am ) be m+1 points in Am , and assume that they are affinely independent.
For any point x ∈ Am , if x = λ0 a0 + · · · + λm am , where (λ0 , . . . , λm ) are the barycentric
coordinates of x with respect to (a0 , . . . , am ), show that
\[ \lambda_0 = \frac{\det(\mathbf{xa_1}, \mathbf{a_1 a_2}, \ldots, \mathbf{a_1 a_m})}{\det(\mathbf{a_0 a_1}, \ldots, \mathbf{a_0 a_i}, \ldots, \mathbf{a_0 a_m})}. \]
Conclude that λi is the signed ratio of the volumes of the simplexes (a0 , . . ., x, . . . am ) and
(a0 , . . . , ai , . . . am ), where 0 ≤ i ≤ m.
Problem 4.17 With respect to the standard affine frame for the plane A2 , consider the
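three geometric transformations $f_1, f_2, f_3$ defined by
\begin{align*}
f_1\colon\quad x' &= -\frac{1}{4}x - \frac{\sqrt{3}}{4}y + \frac{3}{4}, & y' &= \frac{\sqrt{3}}{4}x - \frac{1}{4}y + \frac{\sqrt{3}}{4}, \\
f_2\colon\quad x' &= -\frac{1}{4}x + \frac{\sqrt{3}}{4}y - \frac{3}{4}, & y' &= -\frac{\sqrt{3}}{4}x - \frac{1}{4}y + \frac{\sqrt{3}}{4}, \\
f_3\colon\quad x' &= \frac{1}{2}x, & y' &= \frac{1}{2}y + \frac{\sqrt{3}}{2}.
\end{align*}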
(a) Prove that these maps are affine. Can you describe geometrically what their action
is (rotation, translation, scaling)?
(b) Given any polygonal line L, define the following sequence of polygonal lines:
S0 = L,
Sn+1 = f1 (Sn ) ∪ f2 (Sn ) ∪ f3 (Sn ).
Construct S1 starting from the line segment L = ((−1, 0), (1, 0)).
Can you figure out what Sn looks like in general? (You may want to write a computer
program.) Do you think that Sn has a limit?
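Since the problem suggests writing a computer program, here is one possible starting point. It is a sketch only: the representation of a union of polygonal lines as a list of point lists, and the names `step` and `iterate`, are choices made for this illustration, and floating-point arithmetic is used because of the factor $\sqrt{3}$.

```python
import math

R3 = math.sqrt(3)

def f1(p):
    x, y = p
    return (-x / 4 - R3 * y / 4 + 3 / 4, R3 * x / 4 - y / 4 + R3 / 4)

def f2(p):
    x, y = p
    return (-x / 4 + R3 * y / 4 - 3 / 4, -R3 * x / 4 - y / 4 + R3 / 4)

def f3(p):
    x, y = p
    return (x / 2, y / 2 + R3 / 2)

def step(polylines):
    """One step S_n -> S_{n+1}: the union of the three images of every polygonal line."""
    return [[f(p) for p in line] for f in (f1, f2, f3) for line in polylines]

def iterate(L, n):
    """Return S_n as a list of polygonal lines, starting from S_0 = L."""
    S = [list(L)]
    for _ in range(n):
        S = step(S)
    return S

# S_1 for the segment L = ((-1, 0), (1, 0)): three segments, one per map.
S1 = iterate([(-1.0, 0.0), (1.0, 0.0)], 1)
```

Each polygonal line in the result can be passed to any plotting tool to look at $S_n$ for small $n$; the number of pieces grows as $3^n$.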
Problem 4.18 In the plane $\mathbb{A}^2$, with respect to the standard affine frame, a point of
coordinates $(x, y)$ can be represented as the complex number $z = x + iy$. Consider the set of
geometric transformations of the form
\[ z \mapsto az + b, \]
\[ z \mapsto az + b \quad\text{or}\quad z \mapsto a\bar{z} + b, \]
HK = {hk | h ∈ H, k ∈ K}.
the set of affine bijections leaving $a$ fixed. Prove that $\mathbf{GA}_a(E)$ is a subgroup of $\mathbf{GA}(E)$,
and that $\mathbf{GA}_a(E)$ is isomorphic to $\mathbf{GL}(\overrightarrow{E})$. Prove that $\mathbf{GA}(E)$ is isomorphic to the direct
product of $T(E)$ and $\mathbf{GA}_a(E)$.
Hint. Note that if $u = \mathbf{f(a)a}$ and $t_u$ is the translation associated with the vector $u$, then
$t_u \circ f \in \mathbf{GA}_a(E)$ (where the translation $t_u$ is defined such that $t_u(a) = a + u$ for every $a \in E$).
(v) Given a group $G$, let $\mathrm{Aut}(G)$ denote the set of bijective homomorphisms $f\colon G \to G$. Prove
that the set $\mathrm{Aut}(G)$ is a group under composition (called the group of automorphisms of $G$).
Given any two groups $H$ and $K$ and a homomorphism $\theta\colon K \to \mathrm{Aut}(H)$, we define $H \times_\theta K$
as the set $H \times K$ under the multiplication operation
\[ \gamma(k)(h) = khk^{-1} \]
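\[ \theta(f) = f, \]
where $f \in \mathbf{GL}(\overrightarrow{E})$ (note that $\theta$ can be viewed as an inclusion map). Prove that $\mathbf{GA}(E)$ is
isomorphic to the semidirect product $\overrightarrow{E} \times_\theta \mathbf{GL}(\overrightarrow{E})$.
(vii) Let $\mathbf{SL}(\overrightarrow{E})$ be the subgroup of $\mathbf{GL}(\overrightarrow{E})$ consisting of the linear maps such that
$\det(f) = 1$ (the special linear group of $\overrightarrow{E}$), and let $\mathbf{SA}(E)$ be the subgroup of $\mathbf{GA}(E)$ (the
special affine group of $E$) consisting of the affine maps $f$ such that $\overrightarrow{f} \in \mathbf{SL}(\overrightarrow{E})$. Prove that
$\mathbf{SA}(E)$ is isomorphic to the semidirect product $\overrightarrow{E} \times_\theta \mathbf{SL}(\overrightarrow{E})$, where $\theta\colon \mathbf{SL}(\overrightarrow{E}) \to \mathrm{Aut}(\overrightarrow{E})$
is defined as in (vi).
(viii) Assume that $(E, \overrightarrow{E})$ is a Euclidean affine space. Let $\mathbf{SO}(\overrightarrow{E})$ be the special orthogonal
group of $\overrightarrow{E}$ (the isometries with determinant $+1$), and let $\mathbf{SE}(E)$ be the subgroup
of $\mathbf{SA}(E)$ (the special Euclidean group of $E$) consisting of the affine isometries $f$ such that
$\overrightarrow{f} \in \mathbf{SO}(\overrightarrow{E})$. Prove that $\mathbf{SE}(E)$ is isomorphic to the semidirect product $\overrightarrow{E} \times_\theta \mathbf{SO}(\overrightarrow{E})$,
where $\theta\colon \mathbf{SO}(\overrightarrow{E}) \to \mathrm{Aut}(\overrightarrow{E})$ is defined as in (vi).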
Problem 4.20 The purpose of this problem is to study certain affine maps of A2 .
(1) Consider affine maps of the form
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}. \]
Prove that such maps have a unique fixed point $c$ if $\theta \neq 2k\pi$, for all integers $k$. Show that
these are rotations of center $c$, which means that with respect to a frame with origin $c$ (the
unique fixed point), these affine maps are represented by rotation matrices.
(2) Consider affine maps of the form
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} \lambda\cos\theta & -\lambda\sin\theta \\ \mu\sin\theta & \mu\cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}. \]
Prove that such maps have a unique fixed point iff $(\lambda + \mu)\cos\theta \neq 1 + \lambda\mu$. Prove that if
$\lambda\mu = 1$ and $\lambda > 0$, there is some angle $\theta$ for which either there is no fixed point, or there are
infinitely many fixed points.
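is invertible.
Problem 4.21 Let $(E, \overrightarrow{E})$ be any affine space of finite dimension. For every affine map
$f\colon E \to E$, let $\mathrm{Fix}(f) = \{a \in E \mid f(a) = a\}$ be the set of fixed points of $f$.
(i) Prove that if $\mathrm{Fix}(f) \neq \emptyset$, then $\mathrm{Fix}(f)$ is an affine subspace of $E$ such that for every
$b \in \mathrm{Fix}(f)$,
\[ \mathrm{Fix}(f) = b + \mathrm{Ker}\,(\overrightarrow{f} - \mathrm{id}). \]
(ii) Prove that $\mathrm{Fix}(f)$ contains a unique fixed point iff
\[ \mathrm{Ker}\,(\overrightarrow{f} - \mathrm{id}) = \{0\}, \]
i.e., $\overrightarrow{f}(u) = u$ iff $u = 0$.
Hint. Show that
\[ \mathbf{\Omega f(a)} - \mathbf{\Omega a} = \mathbf{\Omega f(\Omega)} + \overrightarrow{f}(\mathbf{\Omega a}) - \mathbf{\Omega a}, \]
for any two points $\Omega, a \in E$.
Problem 4.22 Given two affine spaces $(E, \overrightarrow{E})$ and $(F, \overrightarrow{F})$, let $\mathcal{A}(E, F)$ be the set of all
affine maps $f\colon E \to F$.
(i) Prove that the set $\mathcal{A}(E, \overrightarrow{F})$ (viewing $\overrightarrow{F}$ as an affine space) is a vector space under
the operations $f + g$ and $\lambda f$ defined such that
for all $a \in E$.
\[ \mathbf{fg}(a) = \mathbf{f(a)g(a)} \]
(for every $a \in E$) is affine, and thus $\mathbf{fg} \in \mathcal{A}(E, \overrightarrow{F})$. Furthermore, $\mathbf{fg}$ is the unique map in
$\mathcal{A}(E, \overrightarrow{F})$ such that
\[ f + \mathbf{fg} = g. \]
(iii) If $\overrightarrow{E}$ has dimension $m$ and $\overrightarrow{F}$ has dimension $n$, prove that $\mathcal{A}(E, \overrightarrow{F})$ has dimension
$n + mn = n(m + 1)$.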
Problem 4.24 Given an affine space E of dimension n and an affine frame (a0 , . . . , an ) for
E, let f : E → E and g: E → E be two affine maps represented by the two (n + 1) × (n + 1)
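matrices
\[ \begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} B & c \\ 0 & 1 \end{pmatrix} \]
w.r.t. the frame $(a_0, \ldots, a_n)$. We also say that $f$ and $g$ are represented by $(A, b)$ and $(B, c)$.
(1) Prove that the composition $f \circ g$ is represented by the matrix
\[ \begin{pmatrix} AB & Ac + b \\ 0 & 1 \end{pmatrix}. \]
We also say that $f^{-1}$ is represented by $(A, b)^{-1} = (A^{-1}, -A^{-1}b)$. Prove that if $A$ is an
orthogonal matrix, the matrix associated with $f^{-1}$ is
\[ \begin{pmatrix} A^{\top} & -A^{\top} b \\ 0 & 1 \end{pmatrix}. \]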
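The composition and inverse rules stated above can be tried out on pairs (A, b) directly. The following sketch is illustrative only: `compose`, `inverse_orthogonal`, and the helpers are hypothetical names, matrices are plain nested lists, and the inverse formula is used only under the stated assumption that A is orthogonal.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_vec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def compose(f, g):
    """(A, b)(B, c) = (AB, Ac + b), representing f o g."""
    (A, b), (B, c) = f, g
    Ac = mat_vec(A, c)
    return mat_mul(A, B), [Ac[i] + b[i] for i in range(len(b))]

def inverse_orthogonal(f):
    """(A, b)^{-1} = (A^T, -A^T b), valid when A is orthogonal."""
    A, b = f
    At = transpose(A)
    return At, [-y for y in mat_vec(At, b)]

f = ([[0, -1], [1, 0]], [2, 3])       # quarter turn followed by a translation
g = ([[1, 2], [0, 1]], [1, -1])
fg = compose(f, g)
f_inv = inverse_orthogonal(f)
identity = compose(f, f_inv)
```

Composing `f` with `f_inv` returns the identity matrix with zero translation part, as the formula predicts.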
Remark: Note that this is the opposite of what happens if f and g are both represented
by matrices w.r.t. the “fixed” frame (a0 , . . . , an ), where g ◦ f is represented by the matrix
(B, c)(A, b). The frame (a!0 , . . . , a!n ) can be viewed as a “moving” frame. The above has
applications in robotics, for example to rotation matrices expressed in terms of Euler angles,
or “roll, pitch, and yaw.”