Perspective Mappings
Contents
1 Introduction
1 Introduction
The first convex quadrilateral has vertices p00 , p10 , p11 and p01 , listed in counterclockwise order. The
second quadrilateral has vertices q00 , q10 , q11 and q01 , listed in counterclockwise order. The construction
of the perspective mapping uses 3 × 3 homogeneous matrices and 3 × 1 homogeneous coordinates. Figure 1
shows the two quadrilaterals.
Figure 1. Two convex quadrilaterals. The left quadrilateral is to be mapped perspectively to the right
quadrilateral. The transforms Ap and Aq are affine and F is a fractional linear transformation.
Affinely transform the source quadrilateral so that p00 is mapped to the origin (0, 0), p10 − p00 is mapped
to (1, 0) and p01 − p00 is mapped to (0, 1). This is equivalent to computing x = (x0 , x1 ) for which a point
p in the quadrilateral is represented by
\[ p = p_{00} + x_0 (p_{10} - p_{00}) + x_1 (p_{01} - p_{00}) \tag{1} \]
The representation of p11 leads to (x0 , x1 ) = (a, b), which is shown in the lower left quadrilateral of Figure
1. In homogeneous coordinates we have
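\[
\begin{bmatrix} p \\ 1 \end{bmatrix}
= \begin{bmatrix} p_{10} - p_{00} & p_{01} - p_{00} & p_{00} \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ 1 \end{bmatrix}
= \begin{bmatrix} M_p & p_{00} \\ \mathbf{0}^T & 1 \end{bmatrix}
\begin{bmatrix} x \\ 1 \end{bmatrix}
= A_p \begin{bmatrix} x \\ 1 \end{bmatrix}
\tag{2}
\]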
where Mp is a 2 × 2 matrix whose columns are p10 − p00 and p01 − p00 and where 0T is the 1 × 2 vector of
zeros. The equation defines the 3 × 3 matrix Ap . The inverse of the affine transformation maps p to x,
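\[
\begin{bmatrix} x \\ 1 \end{bmatrix}
= A_p^{-1} \begin{bmatrix} p \\ 1 \end{bmatrix}
= \begin{bmatrix} M_p^{-1} & -M_p^{-1} p_{00} \\ \mathbf{0}^T & 1 \end{bmatrix}
\begin{bmatrix} p \\ 1 \end{bmatrix}
= \begin{bmatrix} M_p^{-1} (p - p_{00}) \\ 1 \end{bmatrix}
\tag{3}
\]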
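The inverse affine map of equation (3) can be sketched in a few lines. This is a minimal illustration, assuming 2D points stored as tuples; the function name canonical_coords is hypothetical and not part of the original text.

```python
def canonical_coords(p, p00, p10, p01):
    """Return x = M_p^{-1} (p - p00) for 2D points given as (x, y) tuples."""
    # The columns of M_p are the edge vectors p10 - p00 and p01 - p00.
    m00, m10 = p10[0] - p00[0], p10[1] - p00[1]
    m01, m11 = p01[0] - p00[0], p01[1] - p00[1]
    det = m00 * m11 - m01 * m10
    if det == 0:
        raise ValueError("edges at p00 are parallel; M_p is not invertible")
    dx, dy = p[0] - p00[0], p[1] - p00[1]
    # Closed-form inverse of a 2 x 2 matrix applied to (dx, dy).
    x0 = (m11 * dx - m01 * dy) / det
    x1 = (-m10 * dx + m00 * dy) / det
    return (x0, x1)


# The canonical coordinates of p11 are the pair (a, b) used in the text.
a, b = canonical_coords((4, 5), (1, 1), (3, 1), (1, 4))
```

Applying the same function to p10 and p01 returns (1, 0) and (0, 1), which is the normalization the affine transform is designed to produce.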
Similarly, we can write points q in the target quadrilateral as
\[ q = q_{00} + y_0 (q_{10} - q_{00}) + y_1 (q_{01} - q_{00}) \tag{4} \]
Define y = (y0 , y1 ). The representation of q11 leads to (y0 , y1 ) = (c, d), which is shown in the lower right
quadrilateral of Figure 1. In homogeneous coordinates,
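\[
\begin{bmatrix} q \\ 1 \end{bmatrix}
= \begin{bmatrix} q_{10} - q_{00} & q_{01} - q_{00} & q_{00} \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} y \\ 1 \end{bmatrix}
= \begin{bmatrix} M_q & q_{00} \\ \mathbf{0}^T & 1 \end{bmatrix}
\begin{bmatrix} y \\ 1 \end{bmatrix}
= A_q \begin{bmatrix} y \\ 1 \end{bmatrix}
\tag{5}
\]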
where Mq is a 2 × 2 matrix whose columns are q10 − q00 and q01 − q00 . The equation defines the 3 × 3
matrix Aq .
In Figure 1, the two quadrilaterals at the bottom of the figure are called canonical quadrilaterals. The first
is mapped perspectively to the second using a fractional linear transformation,
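We can substitute these in equation (6), multiply numerator and denominator by ab(c + d − 1), and factor
(a − c + bc − ad) = c(a + b − 1) − a(c + d − 1) and (b − d − bc + ad) = d(a + b − 1) − b(c + d − 1) to obtain
\[
(y_0, y_1) = \frac{\left( bc(a+b-1)\,x_0,\; ad(a+b-1)\,x_1 \right)}
{b\left(c(a+b-1) - a(c+d-1)\right) x_0 + a\left(d(a+b-1) - b(c+d-1)\right) x_1 + ab(c+d-1)}
\tag{8}
\]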
The factoring identifies subexpressions that can be computed once in a computer program, namely, a, b, c,
d, a + b − 1 and c + d − 1.
The convexity of the p-quadrilateral is characterized by a + b − 1 > 0 and implies that both numerator
coefficients bc(a+b−1) and ad(a+b−1) are positive. The denominator is a function of the form k0 x0 +k1 x1 +k2
where (x0 , x1 ) is in the canonical quadrilateral shown in Figure 1. The containment constraints are x0 ≥ 0,
x1 ≥ 0, (1 − b)x0 + a(x1 − 1) ≤ 0 and b(x0 − 1) + (1 − a)x1 ≤ 0. Minimizing the denominator subject
to these linear inequality constraints is a linear programming problem. The
minimum must occur at a vertex of the domain. The denominator at (0, 0) is ab(c + d − 1), which is positive
because the convexity of the q-quadrilateral is characterized by c + d − 1 > 0. The denominator at (1, 0) is
bc(a + b − 1), which is positive. The denominator at (0, 1) is ad(a + b − 1), which is positive. Finally, the
denominator at (a, b) is ab(a + b − 1), which is positive. The minimum of the denominator is positive, which
means equation (8) has no singularities in its domain.
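The vertex argument above can be checked numerically. The sketch below assumes the denominator has the form b(cs − at)x0 + a(ds − bt)x1 + abt with s = a + b − 1 and t = c + d − 1, which is the third row of the matrix F in equation (10); the function names are hypothetical.

```python
def denominator(a, b, c, d, x0, x1):
    """Denominator of the canonical fractional linear transformation."""
    s = a + b - 1.0  # positive for a convex p-quadrilateral
    t = c + d - 1.0  # positive for a convex q-quadrilateral
    return b * (c * s - a * t) * x0 + a * (d * s - b * t) * x1 + a * b * t


def vertex_denominators(a, b, c, d):
    """Evaluate the denominator at the four canonical vertices."""
    vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (a, b)]
    return [denominator(a, b, c, d, x0, x1) for (x0, x1) in vertices]


# Sample values chosen for illustration only.
values = vertex_denominators(2.0, 1.5, 1.2, 1.8)
```

For these sample values the four results agree with abt, bcs, ads and abs respectively, and all are positive, so the minimum over the quadrilateral is positive.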
The fractional linear transformation from the q-quadrilateral to the p-quadrilateral is the inverse of the
function in equation (8),
\[
(x_0, x_1) = \frac{\left( da(c+d-1)\,y_0,\; cb(c+d-1)\,y_1 \right)}
{d\left(a(c+d-1) - c(a+b-1)\right) y_0 + c\left(b(c+d-1) - d(a+b-1)\right) y_1 + cd(a+b-1)}
\tag{9}
\]
This may be constructed by simply swapping (x0 , x1 ) and (y0 , y1 ) and by swapping (a, b) and (c, d) in
equation (8).
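The swapping rule suggests that a single routine can serve both directions. The following is a sketch under that assumption; canonical_flt is a hypothetical name, and the forward formula is read off from equation (9) with the roles of (a, b) and (c, d) exchanged.

```python
def canonical_flt(a, b, c, d, x0, x1):
    """Map (x0, x1) in the canonical p-quadrilateral, whose far vertex is
    (a, b), to the canonical q-quadrilateral, whose far vertex is (c, d)."""
    s = a + b - 1.0
    t = c + d - 1.0
    w = b * (c * s - a * t) * x0 + a * (d * s - b * t) * x1 + a * b * t
    return (b * c * s * x0 / w, a * d * s * x1 / w)


def canonical_flt_inverse(a, b, c, d, y0, y1):
    """Inverse map, obtained by swapping (a, b) with (c, d)."""
    return canonical_flt(c, d, a, b, y0, y1)


# Round trip for an arbitrary sample point; the values are illustrative.
y = canonical_flt(2.0, 1.5, 1.2, 1.8, 0.7, 0.4)
x = canonical_flt_inverse(2.0, 1.5, 1.2, 1.8, y[0], y[1])
```

The round trip should return the starting point up to floating-point rounding, and the far vertex (a, b) should map to (c, d).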
The perspective mapping from the p-quadrilateral to the q-quadrilateral is provided by a 3 × 3 homogeneous
matrix and a perspective divide. Define s = a + b − 1 and t = c + d − 1. The homogeneous matrix is
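\[
F = \begin{bmatrix} bcs & 0 & 0 \\ 0 & ads & 0 \\ b(cs - at) & a(ds - bt) & abt \end{bmatrix}
= \begin{bmatrix} D & \mathbf{0} \\ \ell^T & \lambda \end{bmatrix}
\tag{10}
\]
where D is a 2 × 2 diagonal matrix, 0 is the 2 × 1 vector of zeros, ℓ^T is a 1 × 2 vector and λ is a scalar. The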
perspective mapping is
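\[
\begin{bmatrix} q \\ 1 \end{bmatrix} \sim \begin{bmatrix} q' \\ w \end{bmatrix}
= A_q F A_p^{-1} \begin{bmatrix} p \\ 1 \end{bmatrix}
\tag{11}
\]
where the similarity symbol indicates that the left-hand side is obtained via the perspective divide: q = q′/w.
The 3 × 3 homography matrix H = A_q F A_p^{-1} is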
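\[
\begin{aligned}
H &= \begin{bmatrix} M_q & q_{00} \\ \mathbf{0}^T & 1 \end{bmatrix}
\begin{bmatrix} D & \mathbf{0} \\ \ell^T & \lambda \end{bmatrix}
\begin{bmatrix} M_p^{-1} & -M_p^{-1} p_{00} \\ \mathbf{0}^T & 1 \end{bmatrix} \\
&= \begin{bmatrix}
(M_q D + q_{00} \ell^T) M_p^{-1} & -(M_q D + q_{00} \ell^T) M_p^{-1} p_{00} + \lambda q_{00} \\
\ell^T M_p^{-1} & -\ell^T M_p^{-1} p_{00} + \lambda
\end{bmatrix}
\end{aligned}
\tag{12}
\]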
The perspective division leads to the correct mapping of vertices of the p-quadrilateral to the q-quadrilateral.
Generally, the mapping of p to q is
\[
q = \frac{(M_q D + q_{00} \ell^T) M_p^{-1} (p - p_{00}) + \lambda q_{00}}
{\ell^T M_p^{-1} (p - p_{00}) + \lambda}
\tag{14}
\]
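As an illustration of the full pipeline, the sketch below composes the three stages A_p^{-1}, F and A_q directly instead of forming the matrix H; the result should agree with equation (14). The function name quad_to_quad and the vertex ordering (v00, v10, v11, v01) are assumptions made for this example.

```python
def quad_to_quad(p_quad, q_quad, p):
    """Map point p from the convex quadrilateral p_quad to q_quad.
    Each quadrilateral is a tuple (v00, v10, v11, v01) of 2D points."""

    def canon(v00, v10, v01, pt):
        # Solve pt - v00 = x0 (v10 - v00) + x1 (v01 - v00) by Cramer's rule.
        e0 = (v10[0] - v00[0], v10[1] - v00[1])
        e1 = (v01[0] - v00[0], v01[1] - v00[1])
        det = e0[0] * e1[1] - e1[0] * e0[1]
        dx, dy = pt[0] - v00[0], pt[1] - v00[1]
        return ((e1[1] * dx - e1[0] * dy) / det,
                (-e0[1] * dx + e0[0] * dy) / det)

    p00, p10, p11, p01 = p_quad
    q00, q10, q11, q01 = q_quad
    a, b = canon(p00, p10, p01, p11)
    c, d = canon(q00, q10, q01, q11)
    s, t = a + b - 1.0, c + d - 1.0

    # Stage 1: affine map to the canonical p-quadrilateral.
    x0, x1 = canon(p00, p10, p01, p)
    # Stage 2: fractional linear transformation F with perspective divide.
    w = b * (c * s - a * t) * x0 + a * (d * s - b * t) * x1 + a * b * t
    y0, y1 = b * c * s * x0 / w, a * d * s * x1 / w
    # Stage 3: affine map from the canonical q-quadrilateral.
    return (q00[0] + y0 * (q10[0] - q00[0]) + y1 * (q01[0] - q00[0]),
            q00[1] + y0 * (q10[1] - q00[1]) + y1 * (q01[1] - q00[1]))


square = ((0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0))
target = ((0.0, 0.0), (2.0, 0.0), (3.0, 3.0), (0.0, 1.0))
center = quad_to_quad(square, target, (0.5, 0.5))
```

A perspective mapping preserves lines, so the center of the square (the intersection of its diagonals) should land on the intersection of the diagonals of the target quadrilateral, which for this sample data is (2/3, 2/3).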
Let the square vertices be (0, 0), (1, 0), (1, 1), and (0, 1). The corresponding quadrilateral vertices are q00 ,
q10 , q11 , and q01 . To map the quadrilateral to the square, consider the problem as a perspective projection
in three dimensions. The view plane is y2 = 0 and the eyepoint is E = (e0 , e1 , e2 ) where e2 ≠ 0 (the eyepoint
is not on the view plane). The perspective projection of a point r = (r0 , r1 , r2 ) to the view plane is the
point p = (x0 , x1 , 0), which lies on the ray originating at E and containing r. The intersection of the ray with
the plane is the projection point: p = E + t(r − E) for some t. Some algebra leads to t = e2 /(e2 − r2 ),
x0 = (e2 r0 − e0 r2 )/(e2 − r2 ), and x1 = (e2 r1 − e1 r2 )/(e2 − r2 ).
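The projection formulas can be sketched as follows. The function name project_to_view_plane is hypothetical, and points are assumed to be 3-tuples.

```python
def project_to_view_plane(E, r):
    """Project r onto the plane y2 = 0 along the ray from the eyepoint E."""
    e0, e1, e2 = E
    r0, r1, r2 = r
    if e2 == 0:
        raise ValueError("the eyepoint must not lie on the view plane")
    if e2 == r2:
        raise ValueError("the ray is parallel to the view plane")
    t = e2 / (e2 - r2)  # parameter where the ray meets y2 = 0
    return (e0 + t * (r0 - e0), e1 + t * (r1 - e1))


x0, x1 = project_to_view_plane((1.0, 2.0, 4.0), (3.0, 0.0, 2.0))
```

For this sample input the closed forms (e2 r0 − e0 r2)/(e2 − r2) and (e2 r1 − e1 r2)/(e2 − r2) evaluate to 5 and −2, matching the ray evaluation.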
Embed the square vertices in the view plane as (i0 , i1 , 0) for i0 and i1 in {0, 1}. Embed the quadrilateral
vertices as (qi0 i1 − q00 , 0), which translates one of the vertices to the origin. Rotate the embedded quadrilateral
vertices so that they lie in a plane containing the origin 0 and having normal vector N = (n0 , n1 , n2 ).
Let the rotated points be denoted ri0 i1 = R(qi0 i1 − q00 , 0), where R is a rotation matrix corresponding to
the normal; thus, N · ri0 i1 = 0. The problem is to construct E and N so that
\[ E + t_{i_0 i_1} (r_{i_0 i_1} - E) = (i_0, i_1, 0), \quad i_j \in \{0, 1\} \tag{15} \]
for some scalars ti0 i1 . Equation (15) says that the embedded quadrilateral vertices are projected perspectively
to the square vertices. Figure 2 shows the geometric configuration.
Figure 2. Projection of a quadrilateral onto a square. The rotated quadrilateral points ri0 i1 are projected
to the square points (i0 , i1 , 0). The figure shows a rotated quadrilateral interior point projected to a square
interior point.
The vector q11 − q00 can be represented as a linear combination of two quadrilateral edges, namely,
\[ q_{11} - q_{00} = a_0 (q_{10} - q_{00}) + a_1 (q_{01} - q_{00}) \tag{16} \]
The coefficients a0 and a1 are constructed as the solution of two linear equations in two unknowns. These
coefficients show up in the equations for the perspective projection. The convexity of the quadrilateral
guarantees that a0 ≥ 0, a1 ≥ 0, and a0 + a1 > 1. Rotation preserves the relative positions of the vertices, so
we know that
r11 = a0 r10 + a1 r01 (17)
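Equation (15) contains four relationships,
\[
\begin{aligned}
(1 - t_{00}) E + t_{00} r_{00} &= (0, 0, 0) \\
(1 - t_{10}) E + t_{10} r_{10} &= (1, 0, 0) \\
(1 - t_{01}) E + t_{01} r_{01} &= (0, 1, 0) \\
(1 - t_{11}) E + t_{11} r_{11} &= (1, 1, 0)
\end{aligned}
\tag{18}
\]
It must be that t00 = 1, because r00 = 0. Dotting the last three equations with the normal vector leads to
\[
(1 - t_{10}) N \cdot E = n_0, \quad (1 - t_{01}) N \cdot E = n_1, \quad (1 - t_{11}) N \cdot E = n_0 + n_1
\tag{19}
\]
Because N · E ≠ 0, we obtain
\[ t_{11} = t_{10} + t_{01} - 1 \tag{20} \]
Substituting equations (17) and (20) into the last line of equation (18) produces
\[ (2 - t_{10} - t_{01}) E + (t_{10} + t_{01} - 1)(a_0 r_{10} + a_1 r_{01}) = (1, 1, 0) \tag{21} \]
Adding the second and third lines of equation (18) produces
\[ (2 - t_{10} - t_{01}) E + t_{10} r_{10} + t_{01} r_{01} = (1, 1, 0) \tag{22} \]
Subtracting equation (22) from equation (21), we have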
[a0 (t10 + t01 − 1) − t10 ]r10 + [a1 (t10 + t01 − 1) − t01 ]r01 = 0 (23)
The linear independence of r10 and r01 guarantees that the coefficients in equation (23) must both be zero.
We now have two linear equations in the two unknowns t10 and t01 . These are easily solved, and together
with equation (20) produce
\[
t_{00} = 1, \quad t_{10} = \frac{a_0}{a_0 + a_1 - 1}, \quad t_{01} = \frac{a_1}{a_0 + a_1 - 1}, \quad t_{11} = \frac{1}{a_0 + a_1 - 1}
\tag{24}
\]
It turns out that computing t11 explicitly in terms of a0 and a1 is not necessary to construct the fractional
linear transformations.
At this time we may attempt to construct E and N, but this is not necessary to actually construct the
perspective projection. Instead, consider a rotated quadrilateral point r that is projected to a square point
(x0 , x1 , 0), as illustrated in Figure 2. We may write r as a linear combination of two rotated quadrilateral
edges,
r = y0 r10 + y1 r01 (25)
The ray equation for the pair of points is
\[ (x_0, x_1, 0) = E + t(r - E) = E + t(y_0 r_{10} + y_1 r_{01} - E) \tag{26} \]
for some scalar t. Substituting our now known values for ti0 i1 into the second and third lines of equation
(18), solving for r10 and r01 , replacing in equation (26), and grouping like terms produces
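\[
\left[ 1 - \frac{t y_0 (1 - t_{10})}{t_{10}} - \frac{t y_1 (1 - t_{01})}{t_{01}} - t \right] E
+ \left[ \frac{t y_0}{t_{10}} - x_0 \right] (1, 0, 0)
+ \left[ \frac{t y_1}{t_{01}} - x_1 \right] (0, 1, 0) = (0, 0, 0)
\tag{27}
\]
By linear independence of E, (1, 0, 0), and (0, 1, 0), the coefficients in equation (27) must all be zero. This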
leads to the fractional linear transformation that maps the quadrilateral to the square
\[
(x_0, x_1) = \frac{\left( a_1 (a_0 + a_1 - 1)\, y_0,\; a_0 (a_0 + a_1 - 1)\, y_1 \right)}
{a_0 a_1 + a_1 (a_1 - 1) y_0 + a_0 (a_0 - 1) y_1}
\tag{28}
\]
The fractional linear transformation that maps the square to the quadrilateral is
\[
(y_0, y_1) = \frac{(a_0 x_0,\; a_1 x_1)}{(a_0 + a_1 - 1) + (1 - a_1) x_0 + (1 - a_0) x_1}
\tag{29}
\]
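Equations (28) and (29) translate directly into code. The sketch below works in the edge-relative coordinates (y0, y1) of the quadrilateral and takes a0 and a1 as given; the function names are hypothetical.

```python
def square_to_quad_coords(a0, a1, x0, x1):
    """Equation (29): unit-square point to quadrilateral edge coordinates."""
    w = (a0 + a1 - 1.0) + (1.0 - a1) * x0 + (1.0 - a0) * x1
    return (a0 * x0 / w, a1 * x1 / w)


def quad_coords_to_square(a0, a1, y0, y1):
    """Equation (28): quadrilateral edge coordinates to unit-square point."""
    s = a0 + a1 - 1.0
    w = a0 * a1 + a1 * (a1 - 1.0) * y0 + a0 * (a0 - 1.0) * y1
    return (a1 * s * y0 / w, a0 * s * y1 / w)


# Sample coefficients for a convex quadrilateral (a0 + a1 > 1).
y = square_to_quad_coords(1.5, 3.0, 0.5, 0.5)
x = quad_coords_to_square(1.5, 3.0, y[0], y[1])
```

The square corner (1, 1) maps to (a0, a1), and the two functions are inverses of each other on the square. To obtain a point in the original plane, combine the output of square_to_quad_coords with the edges: q = q00 + y0 (q10 − q00) + y1 (q01 − q00).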
Example. Intuitively, perspective transformations map lines to lines. They also map conic sections to conic
sections. The proof is straightforward. Let the perspective transformation be y0 = (a0 x0 +a1 x1 +a2 )/(c0 x0 +
c1 x1 + c2 ) and y1 = (b0 x0 + b1 x1 + b2 )/(c0 x0 + c1 x1 + c2 ). The inverse transformation is of the same form, so
it suffices to show a conic section in (y0 , y1 ) coordinates satisfying A y0^2 + B y0 y1 + C y1^2 + D y0 + E y1 + F = 0
is mapped to a conic section in (x0 , x1 ) coordinates satisfying Ā x0^2 + B̄ x0 x1 + C̄ x1^2 + D̄ x0 + Ē x1 + F̄ = 0.
Substituting the formulas for y0 and y1 into the quadratic equation, multiplying by (c0 x0 + c1 x1 + c2 )^2 ,
expanding the products, and grouping the appropriate terms yields a quadratic in x0 and x1 . The left image
in Figure 3 shows a square containing several conic sections. The top curve is a parabola, the left and right
curves are hyperbolas, the center curve is a circle, and the bottom curve is an ellipse. The grid consists of
straight lines. The right image in Figure 3 shows a perspective mapping of these conic sections.
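The substitution described in the proof has a compact matrix form. Writing the conic as a symmetric 3 × 3 matrix Q and the perspective transformation as a 3 × 3 matrix T with rows (a0, a1, a2), (b0, b1, b2), (c0, c1, c2), the transformed conic is T^T Q T. The sketch below uses that standard identity; the function name pull_back_conic is hypothetical.

```python
def pull_back_conic(conic, T):
    """Given conic = (A, B, C, D, E, F) in y coordinates and the 3 x 3
    matrix T of the perspective transformation x -> y, return the six
    coefficients of the corresponding conic in x coordinates."""
    A, B, C, D, E, F = conic
    # Symmetric matrix Q with [y0 y1 1] Q [y0 y1 1]^T equal to the quadratic.
    Q = [[A, B / 2.0, D / 2.0],
         [B / 2.0, C, E / 2.0],
         [D / 2.0, E / 2.0, F]]
    # TQ = T^T Q, then Qx = (T^T Q) T.
    TQ = [[sum(T[k][i] * Q[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    Qx = [[sum(TQ[i][k] * T[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return (Qx[0][0], 2.0 * Qx[0][1], Qx[1][1],
            2.0 * Qx[0][2], 2.0 * Qx[1][2], Qx[2][2])


# Unit circle in y under y0 = x0 / (0.5 x0 + 1), y1 = x1 / (0.5 x0 + 1).
T = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
conic_x = pull_back_conic((1.0, 0.0, 1.0, 0.0, 0.0, -1.0), T)
```

The result is again a six-coefficient quadratic, which is the claim of the proof; here the unit circle becomes 0.75 x0^2 + x1^2 − x0 − 1 = 0.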
The ideas of the previous section extend to 3D. A cuboid is a convex polyhedron that has 6 planar faces. For
example, a typical view frustum is a cuboid. A cube is a special case of a cuboid in which all six faces are squares
and opposite faces are parallel. Let the vertices of the cuboid be denoted qi0 i1 i2 where ij ∈ {0, 1}. The
canonical cube vertices are (i0 , i1 , i2 ).
To project perspectively the cuboid to the cube, we can embed the problem in 4D. The view hyperplane is
y3 = 0 and the eyepoint is E = (e0 , e1 , e2 , e3 ) where e3 ≠ 0. The cuboid is translated to the origin and then
rotated into a hyperplane with normal N = (n0 , n1 , n2 , n3 ). The rotation matrix is R and the rotated and
translated vertices are ri0 i1 i2 = R(qi0 i1 i2 − q000 , 0). Our goal is that
\[ E + t_{i_0 i_1 i_2} (r_{i_0 i_1 i_2} - E) = (i_0, i_1, i_2, 0), \quad i_j \in \{0, 1\} \tag{30} \]
for some scalars ti0 i1 i2 .
The vector q111 − q000 can be represented as a linear combination of three cuboid edges, namely,
q111 − q000 = a0 (q100 − q000 ) + a1 (q010 − q000 ) + a2 (q001 − q000 ) (31)
The coefficients a0 , a1 , and a2 are constructed as the solution of three linear equations in three unknowns.
These coefficients show up in the equations for the perspective projection. The convexity of the cuboid
guarantees that a0 ≥ 0, a1 ≥ 0, a2 ≥ 0, and a0 + a1 + a2 > 1. Rotation preserves the relative positions of
the vertices, so we know that
r111 = a0 r100 + a1 r010 + a2 r001 (32)
Dotting the relationships in equation (30) with the normal vector produces
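\[
\begin{aligned}
(1 - t_{100}) N \cdot E &= n_0, & (1 - t_{110}) N \cdot E &= n_0 + n_1 \\
(1 - t_{010}) N \cdot E &= n_1, & (1 - t_{101}) N \cdot E &= n_0 + n_2 \\
(1 - t_{001}) N \cdot E &= n_2, & (1 - t_{011}) N \cdot E &= n_1 + n_2 \\
& & (1 - t_{111}) N \cdot E &= n_0 + n_1 + n_2
\end{aligned}
\tag{33}
\]
Because N · E ≠ 0, we obtain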
t110 = t100 + t010 − 1, t101 = t100 + t001 − 1, t011 = t010 + t001 − 1, t111 = t100 + t010 + t001 − 2 (34)
Substituting the relationship for t111 from equation (34) into the equation of (30) involving t111 leads to
\[
(3 - t_{100} - t_{010} - t_{001}) E + (t_{100} + t_{010} + t_{001} - 2)(a_0 r_{100} + a_1 r_{010} + a_2 r_{001}) = (1, 1, 1, 0)
\tag{35}
\]
Adding the relationships of equation (30) involving t100 , t010 , and t001 produces
\[
(3 - t_{100} - t_{010} - t_{001}) E + t_{100} r_{100} + t_{010} r_{010} + t_{001} r_{001} = (1, 1, 1, 0)
\tag{36}
\]
Subtracting equation (36) from (35), we have
\[
\begin{aligned}
&[a_0 (t_{100} + t_{010} + t_{001} - 2) - t_{100}]\, r_{100} + [a_1 (t_{100} + t_{010} + t_{001} - 2) - t_{010}]\, r_{010} \\
&\quad + [a_2 (t_{100} + t_{010} + t_{001} - 2) - t_{001}]\, r_{001} = 0
\end{aligned}
\tag{37}
\]
The linear independence of r100 , r010 , and r001 guarantees that the coefficients in equation (37) must all be
zero. We now have three linear equations in three unknowns t100 , t010 , and t001 . These are easily solved,
and together with previous equations produce
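\[
\begin{aligned}
t_{000} &= 1, & t_{110} &= \frac{a_0 + a_1 - a_2 + 1}{a_0 + a_1 + a_2 - 1} \\
t_{100} &= \frac{2 a_0}{a_0 + a_1 + a_2 - 1}, & t_{101} &= \frac{a_0 - a_1 + a_2 + 1}{a_0 + a_1 + a_2 - 1} \\
t_{010} &= \frac{2 a_1}{a_0 + a_1 + a_2 - 1}, & t_{011} &= \frac{-a_0 + a_1 + a_2 + 1}{a_0 + a_1 + a_2 - 1} \\
t_{001} &= \frac{2 a_2}{a_0 + a_1 + a_2 - 1}, & t_{111} &= \frac{2}{a_0 + a_1 + a_2 - 1}
\end{aligned}
\tag{38}
\]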
It turns out that computing t110 , t101 , t011 , and t111 explicitly in terms of a0 , a1 , and a2 is not necessary to
construct the fractional linear transformations.
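The ray parameters can be cross-checked numerically. The sketch below assumes the solution of the three linear equations from equation (37) is t100 = 2 a0 / s, t010 = 2 a1 / s, t001 = 2 a2 / s with s = a0 + a1 + a2 − 1, and derives the remaining parameters from the relations in equation (34); the function name is hypothetical.

```python
def ray_parameters_3d(a0, a1, a2):
    """Return the eight ray parameters keyed by the vertex index string."""
    s = a0 + a1 + a2 - 1.0
    t100, t010, t001 = 2.0 * a0 / s, 2.0 * a1 / s, 2.0 * a2 / s
    return {
        "000": 1.0,
        "100": t100,
        "010": t010,
        "001": t001,
        # Relations of equation (34).
        "110": t100 + t010 - 1.0,
        "101": t100 + t001 - 1.0,
        "011": t010 + t001 - 1.0,
        "111": t100 + t010 + t001 - 2.0,
    }


t = ray_parameters_3d(1.5, 2.0, 1.2)
```

With these sample coefficients, t["110"] equals (a0 + a1 − a2 + 1)/s and t["111"] equals 2/s, and substituting the values back makes each bracket of equation (37) vanish.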
As in the 2D problem, we can construct the fractional linear transformations for the projection without
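explicitly constructing E and N. We may write a rotated interior cuboid point r as
\[ r = y_0 r_{100} + y_1 r_{010} + y_2 r_{001} \tag{39} \]
The ray equation for the pair of points is
\[ (x_0, x_1, x_2, 0) = E + t(r - E) = E + t(y_0 r_{100} + y_1 r_{010} + y_2 r_{001} - E) \tag{40} \]
for some scalar t. Substituting the values for ti0 i1 i2 into equation (30), solving for r100 , r010 , and r001 ,
replacing in equation (40), and grouping like terms produces
\[
\begin{aligned}
&\left[ 1 - \frac{t y_0 (1 - t_{100})}{t_{100}} - \frac{t y_1 (1 - t_{010})}{t_{010}} - \frac{t y_2 (1 - t_{001})}{t_{001}} - t \right] E \\
&\quad + \left[ \frac{t y_0}{t_{100}} - x_0 \right] (1, 0, 0, 0)
+ \left[ \frac{t y_1}{t_{010}} - x_1 \right] (0, 1, 0, 0)
+ \left[ \frac{t y_2}{t_{001}} - x_2 \right] (0, 0, 1, 0) = (0, 0, 0, 0)
\end{aligned}
\tag{41}
\]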
By linear independence of E, (1, 0, 0, 0), (0, 1, 0, 0), and (0, 0, 1, 0), the coefficients in equation (41) must all
be zero. This leads to the fractional linear transformation that maps the cube to the cuboid,
\[
(y_0, y_1, y_2) = \frac{2 (a_0 x_0,\; a_1 x_1,\; a_2 x_2)}{d_0 x_0 + d_1 x_1 + d_2 x_2 + d_3}
\tag{42}
\]
where d0 = a0 − a1 − a2 + 1, d1 = −a0 + a1 − a2 + 1, d2 = −a0 − a1 + a2 + 1, and d3 = a0 + a1 + a2 − 1.
The fractional linear transformation that maps the cuboid to the cube is
\[
(x_0, x_1, x_2) = \frac{d_3 (a_1 a_2 y_0,\; a_0 a_2 y_1,\; a_0 a_1 y_2)}
{2 a_0 a_1 a_2 - a_1 a_2 d_0 y_0 - a_0 a_2 d_1 y_1 - a_0 a_1 d_2 y_2}
\tag{43}
\]
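Equations (42) and (43) can be sketched as a pair of functions. The code takes d3 = a0 + a1 + a2 − 1, the value for which the cube vertex (1, 0, 0) maps to the cuboid coordinate (1, 0, 0) and which matches c_{d+1} in the general formula (60); the function names are hypothetical.

```python
def _d_coefficients(a):
    a0, a1, a2 = a
    d0 = a0 - a1 - a2 + 1.0
    d1 = -a0 + a1 - a2 + 1.0
    d2 = -a0 - a1 + a2 + 1.0
    d3 = a0 + a1 + a2 - 1.0  # assumption: see the note above
    return d0, d1, d2, d3


def cube_to_cuboid_coords(a, x):
    """Equation (42): canonical cube point to cuboid edge coordinates."""
    d0, d1, d2, d3 = _d_coefficients(a)
    w = d0 * x[0] + d1 * x[1] + d2 * x[2] + d3
    return tuple(2.0 * a[i] * x[i] / w for i in range(3))


def cuboid_coords_to_cube(a, y):
    """Equation (43): cuboid edge coordinates to canonical cube point."""
    a0, a1, a2 = a
    d0, d1, d2, d3 = _d_coefficients(a)
    w = (2.0 * a0 * a1 * a2 - a1 * a2 * d0 * y[0]
         - a0 * a2 * d1 * y[1] - a0 * a1 * d2 * y[2])
    return (d3 * a1 * a2 * y[0] / w,
            d3 * a0 * a2 * y[1] / w,
            d3 * a0 * a1 * y[2] / w)


a = (1.5, 2.0, 1.2)
y = cube_to_cuboid_coords(a, (0.25, 0.5, 0.75))
x = cuboid_coords_to_cube(a, y)
```

The cube corner (1, 1, 1) maps to (a0, a1, a2), and the round trip returns the starting point up to rounding.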
Example. When drawing a view volume in 3D graphics, one typically creates a view frustum with near
plane, far plane, left plane, right plane, bottom plane, and top plane. The vertices of geometric primitives
are transformed from model space to world space, then from world space to view space (camera space). The
view-space vertices are finally transformed to clip space (homogeneous coordinates) by a 4 × 4 perspective
projection matrix. The rasterizer does the clipping in homogeneous coordinates and then performs the
perspective divide to obtain the normalized window coordinates (x, y) ∈ [−1, 1]^2 and normalized depth
z ∈ [0, 1] (for DirectX; OpenGL maps depth to [−1, 1]). The viewport is the full rectangular window, but it
may be chosen to be a subwindow specified as an axis-aligned rectangle.
It is possible to specify a viewport that is a convex quadrilateral (nonrectangular). Let this viewport have
vertices in camera coordinates, qi0 i1 0 for ij ∈ {0, 1}, and let them be counterclockwise oriented. This polygon
acts as the near face of a cuboidal view volume. The quadrilateral may be extruded toward the far plane
of the view frustum, thus constructing the far face of the cuboidal view volume. Let the camera forward
direction be D. Let n be the near-plane distance from the eyepoint and let f be the far-plane distance from
the eyepoint. The viewport vertex qi0 i1 0 is on the near plane, so D · qi0 i1 0 = n. The corresponding far-plane
vertex is qi0 i1 1 = (f /n)qi0 i1 0 so that D · qi0 i1 1 = f .
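The extrusion step amounts to a uniform scale about the eyepoint. The sketch below assumes the eyepoint is the origin of camera space and that vertices are 3-tuples; the function name is hypothetical.

```python
def extrude_to_far_plane(near_vertices, n, f):
    """Scale near-face vertices by f / n to obtain the far-face vertices."""
    if n <= 0 or f <= n:
        raise ValueError("expected 0 < n < f")
    scale = f / n
    return [tuple(scale * c for c in q) for q in near_vertices]


# Sample near face on the plane z = 1 with forward direction D = (0, 0, 1).
near = [(-1.0, -1.0, 1.0), (2.0, -1.0, 1.0), (1.0, 1.5, 1.0), (-1.0, 1.0, 1.0)]
far = extrude_to_far_plane(near, 1.0, 10.0)
```

Each far vertex lies on the same ray from the eyepoint as its near vertex, and its component along D equals f.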
We may construct a projection matrix that maps the cuboidal view volume to the cube [−1, 1]^2 × [0, 1].
Define U0 = q100 − q000 , U1 = q010 − q000 , and U2 = q001 − q000 . Define the matrix M = [U0 U1 U2 ]
whose columns are the specified vectors. Define a to be the column vector whose components are the ai in
the fractional linear transformations. Then
\[ a = M^{-1} (q_{111} - q_{000}) \tag{44} \]
Define H to be the homogeneous matrix whose upper-left block is the matrix M and whose translation
component is the vector q000 ,
\[ H = \begin{bmatrix} M & q_{000} \\ \mathbf{0}^T & 1 \end{bmatrix} \tag{45} \]
The projection of equation (43) maps the cuboidal volume to the canonical cube [0, 1]^3 . We need to perform
one additional scaling and translation to map (x0 , x1 ) from [0, 1]^2 to [−1, 1]^2 . With this additional
transformation, the cuboid-to-cube mapping may be written in homogeneous form as
\[
P = \begin{bmatrix}
\frac{a_2}{a_0} (2 d_3 + d_0) & \frac{a_2}{a_1} d_1 & d_2 & -2 a_2 \\
\frac{a_2}{a_0} d_0 & \frac{a_2}{a_1} (2 d_3 + d_1) & d_2 & -2 a_2 \\
0 & 0 & d_3 & 0 \\
-\frac{a_2}{a_0} d_0 & -\frac{a_2}{a_1} d_1 & -d_2 & 2 a_2
\end{bmatrix}
\tag{46}
\]
where the output (y0 , y1 , y2 , w) is processed by the rasterizer in the usual manner, performing the perspective
divide by w to obtain the cube points.
The projection matrix from camera space to clip space is therefore P H −1 . The matrix H −1 maps camera
coordinates to cuboid coordinates relative to three linearly independent edges of the cuboid. These points
are then perspectively projected by P into the cube.
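The matrix P can be exercised on a few cuboid coordinates to confirm the target ranges. This sketch assumes a0 and a1 are nonzero and takes d3 = a0 + a1 + a2 − 1; the function names are hypothetical, and the input is assumed to be already in cuboid coordinates (that is, after applying H^{-1}).

```python
def cuboid_projection_matrix(a):
    """Build the 4 x 4 matrix P of equation (46) from a = (a0, a1, a2)."""
    a0, a1, a2 = a
    d0 = a0 - a1 - a2 + 1.0
    d1 = -a0 + a1 - a2 + 1.0
    d2 = -a0 - a1 + a2 + 1.0
    d3 = a0 + a1 + a2 - 1.0
    r0, r1 = a2 / a0, a2 / a1
    return [
        [r0 * (2.0 * d3 + d0), r1 * d1, d2, -2.0 * a2],
        [r0 * d0, r1 * (2.0 * d3 + d1), d2, -2.0 * a2],
        [0.0, 0.0, d3, 0.0],
        [-r0 * d0, -r1 * d1, -d2, 2.0 * a2],
    ]


def project(P, y):
    """Apply P to the homogeneous point (y, 1) and do the perspective divide."""
    h = (y[0], y[1], y[2], 1.0)
    out = [sum(P[r][c] * h[c] for c in range(4)) for r in range(4)]
    return tuple(v / out[3] for v in out[:3])


a = (1.5, 2.0, 1.2)
P = cuboid_projection_matrix(a)
near_corner = project(P, (0.0, 0.0, 0.0))
far_corner = project(P, a)
```

The cuboid vertex with coordinates (0, 0, 0) should land on (−1, −1, 0) and the vertex with coordinates (a0, a1, a2) on (1, 1, 1), the opposite corners of [−1, 1]^2 × [0, 1].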
The ideas in the previous sections generalize to higher dimensions. Let the d-dimensional hypercuboid have
vertices qI , where I = (i1 , i2 , . . . , id ) is a multiindex with components ij ∈ {0, 1}. The canonical hypercube
vertices are I, with bold face used to distinguish between vectors and multiindices.
The problem is embedded in d + 1 dimensions. The view hyperplane is y_{d+1} = 0 and the eyepoint is
E = (e1 , e2 , . . . , e_{d+1} ) with e_{d+1} ≠ 0. The hypercuboid is
translated to the origin and then rotated into a hyperplane with normal N = (n1 , n2 , . . . , n_{d+1} ). Let R be the
corresponding rotation matrix and define the rotated hypercuboid vertices by rI = R(qI − qO , 0), where O is
the multiindex of all zeros. Our goal is that
\[ E + t_I (r_I - E) = (\mathbf{I}, 0) \tag{47} \]
for some scalars tI . Let U be the multiindex of all ones. The vector qU − qO may be written as
\[ q_U - q_O = \sum_{j=1}^{d} a_j (q_{B_j} - q_O) \tag{48} \]
where Bj is the multiindex with 0 in all components except for a 1 in component j. The coefficients are
nonnegative and \sum_{j=1}^{d} a_j > 1. By linearity of the rotation matrix,
\[ r_U = \sum_{j=1}^{d} a_j r_{B_j} \tag{49} \]
Let I = (i1 , i2 , . . . , id ) be a multiindex with at least one nonzero component. Suppose that I has m components
that are 1; it is the case that 1 ≤ m ≤ d. Let those components have indices j1 through jm . Dotting the ray
equation with the normal vector, we have
\[ (1 - t_I) N \cdot E = \sum_{k=1}^{m} n_{j_k} \tag{50} \]
Whenever m ≥ 2, the tI variable is related to tBj variables by summations. Specifically, I = \sum_{k=1}^{m} B_{j_k}
and
\[ (1 - t_I) N \cdot E = \sum_{k=1}^{m} n_{j_k} = \sum_{k=1}^{m} (1 - t_{B_{j_k}}) N \cdot E \tag{51} \]
Therefore, once we determine the relationships between tBi and aj , the tI with 2 or more 1-valued indices
are determined. It turns out that for the construction of the fractional linear transformations, we only need
to know tU explicitly in terms of the aj ,
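\[ t_U = \sum_{i=1}^{d} t_{B_i} - (d - 1) \tag{52} \]
Substituting equations (49) and (52) into equation (47) for I = U produces
\[
\left( d - \sum_{i=1}^{d} t_{B_i} \right) E + \left( \sum_{i=1}^{d} t_{B_i} - (d - 1) \right) \sum_{j=1}^{d} a_j r_{B_j} = (\mathbf{U}, 0)
\tag{53}
\]
Adding the relationships of equation (47) for B1 through Bd produces
\[
\left( d - \sum_{i=1}^{d} t_{B_i} \right) E + \sum_{i=1}^{d} t_{B_i} r_{B_i} = (\mathbf{U}, 0)
\tag{54}
\]
Subtracting equation (54) from equation (53), we have
\[
\sum_{j=1}^{d} \left[ a_j \left( \sum_{i=1}^{d} t_{B_i} - (d - 1) \right) - t_{B_j} \right] r_{B_j} = 0
\tag{55}
\]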
The linear independence of the rBi guarantees that all the coefficients in this equation are zero. This leads
to an invertible linear system of d equations in the d unknowns tB1 through tBd .
As in dimensions 2 and 3, we need not construct E and N explicitly. We may write a rotated interior
hypercuboid point r as
\[ r = \sum_{i=1}^{d} y_i r_{B_i} \tag{56} \]
The ray equation for the pair of points is
\[
(x_1, \ldots, x_d, 0) = E + t(r - E) = E + t \left( \sum_{i=1}^{d} y_i r_{B_i} - E \right)
\tag{57}
\]
for some scalar t. Substituting the values for tBi into equation (47), solving for rBi , replacing in equation
(57), and grouping like terms produces
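\[
\left[ 1 - \sum_{i=1}^{d} \frac{t y_i (1 - t_{B_i})}{t_{B_i}} - t \right] E
+ \sum_{i=1}^{d} \left[ \frac{t y_i}{t_{B_i}} - x_i \right] (\mathbf{B}_i, 0) = \mathbf{0}
\tag{58}
\]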
By linear independence of E and the (Bi , 0), the coefficients in equation (58) must all be zero. This leads to
the fractional linear transformation that maps the hypercube to the hypercuboid,
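\[
y_i = \frac{(d - 1) a_i x_i}{\sum_{k=1}^{d} c_k x_k + c_{d+1}}, \quad 1 \le i \le d
\tag{59}
\]
where
\[
c_k = \sum_{j=1}^{d} b_{kj} a_j + 1, \; 1 \le k \le d; \quad
c_{d+1} = \sum_{j=1}^{d} a_j - 1; \quad
b_{kj} = \begin{cases} d - 2, & k = j \\ -1, & k \ne j \end{cases}
\tag{60}
\]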
The fractional linear transformation that maps the hypercuboid to the hypercube is
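\[
x_i = \frac{\left[ \prod_{j=1}^{d} a_j \right] c_{d+1} (y_i / a_i)}
{\left[ \prod_{j=1}^{d} a_j \right] \left( (d - 1) - \sum_{j=1}^{d} c_j (y_j / a_j) \right)}, \quad 1 \le i \le d
\tag{61}
\]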
The products of the aj appear as if they cancel, but if any of the aj are zero, you need to formally multiply
through by the product and then evaluate the expression using the aj .
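The general formulas can be sketched for any d ≥ 2. The code below uses the identity c_k = (d − 1) a_k − (a_1 + ... + a_d) + 1, which follows from equation (60), and the simplified form of equation (61) in which the products have been cancelled; it therefore assumes every a_j is nonzero. The function names are hypothetical.

```python
def _c_coefficients(a):
    d = len(a)
    s = sum(a)
    c = [(d - 1) * a[k] - s + 1.0 for k in range(d)]  # c_1 .. c_d
    return c, s - 1.0  # second value is c_{d+1}


def hypercube_to_hypercuboid(a, x):
    """Equation (59): canonical hypercube point to edge coordinates."""
    d = len(a)
    c, c_last = _c_coefficients(a)
    w = sum(c[k] * x[k] for k in range(d)) + c_last
    return [(d - 1) * a[i] * x[i] / w for i in range(d)]


def hypercuboid_to_hypercube(a, y):
    """Equation (61) with the products cancelled (all a_j nonzero)."""
    d = len(a)
    c, c_last = _c_coefficients(a)
    w = (d - 1) - sum(c[j] * y[j] / a[j] for j in range(d))
    return [c_last * (y[i] / a[i]) / w for i in range(d)]


a4 = [1.5, 2.0, 1.2, 3.0]
x4 = [0.2, 0.4, 0.6, 0.8]
y4 = hypercube_to_hypercuboid(a4, x4)
x4_back = hypercuboid_to_hypercube(a4, y4)
```

For d = 2 the first function reduces to equation (29), and for d = 3 to equation (42). The all-ones vertex maps to (a_1, ..., a_d) in every dimension.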