
MODULE 3

GEOMETRIC OBJECTS AND TRANSFORMATIONS

We are now ready to concentrate on three-dimensional graphics. Much of this
chapter is concerned with such matters as how to represent basic geometric
types, how to convert between various representations, and what statements
we can make about geometric objects independent of a particular representation.
We begin with an examination of the mathematical underpinnings of computer
graphics. This approach should avoid much of the confusion that arises from a lack
of care in distinguishing among a geometric entity, its representation in a particular
reference system, and a mathematical abstraction of it.
We use the notions of affine and Euclidean vector spaces to create the necessary
mathematical foundation for later work. One of our goals is to establish a method
for dealing with geometric problems that is independent of coordinate systems. The
advantages of such an approach will be clear when we worry about how to represent
the geometric objects with which we would like to work. The coordinate-free
approach will prove to be far more robust than one based on representing the objects
in a particular coordinate system or frame. This coordinate-free approach also leads
to the use of homogeneous coordinates, a system that not only enables us to explain
this approach, but also leads to efficient implementation techniques.
We use the terminology of abstract data types to reinforce the distinction
between an object and its representation. Our development will show that the
mathematics arise naturally from our desire to manipulate a few basic geometric
objects. Much of what we present here is an application of vector spaces, geometry,
and linear algebra. Appendices B and C summarize the formalities of vector spaces
and matrix algebra, respectively.
In a vein similar to the approach we took in Chapters 2 and 3, we develop a
simple application program to illustrate the basic principles and to see how the
concepts are realized within an API. In this chapter, our example is focused on the
representation and transformations of a cube. We also consider how to specify
transformations interactively and apply them smoothly.

4.4 FRAMES IN OPENGL


As we have seen, OpenGL is based on a pipeline model, the first part of which is a
sequence of operations on vertices. We can characterize these geometric operations
by a sequence of transformations or, equivalently, as a sequence of changes of
frames for the objects defined by a user program. In a typical application, there are
six frames embedded in the pipeline, although normally the programmer will not see
more than a few of them directly. In each of these frames, a vertex has different
coordinates. The following is the order that the systems occur in the pipeline:

1. Object or model coordinates


2. World coordinates
3. Eye (or camera) coordinates
4. Clip coordinates
5. Normalized device coordinates
6. Window (or screen) coordinates

Let’s consider what happens when a user specifies a vertex in a program through
the function glVertex3f(x,y,z). This vertex may be specified directly in the
application program or indirectly through an instantiation of some basic object, as
we discussed in Chapter 3. In most applications, we tend to specify or use an object
with a convenient size, orientation, and location in its own frame called the model
or object frame. For example, a cube would typically have its faces aligned with
axes of the frame, its center at the origin, and have a side length of 1 or 2 units. The
coordinates in the corresponding function calls are in object or model coordinates.
Each object must be brought into an application that might contain hundreds or
thousands of individual objects. The application program generally applies a
sequence of transformations to each object to size, orient, and position it within a
frame that is appropriate for the particular application. For example, if we were using
an instance of a square for a window in an architectural application, we would scale
it to have the correct proportions and units, which would probably be in feet or
meters. The origin of application coordinates might be a location in the center of the
bottom floor of the building. This application frame is called the world frame and
the values are in world coordinates. Note that if we do not model with predefined
objects or apply any transformations before we execute the glVertex function, object
and world coordinates are the same.
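To make the distinction concrete, the object-to-world step for something like the square-window example can be sketched as a small function. The scale factor and translation below are hypothetical values chosen only for illustration; an application would pick its own.

```c
#include <assert.h>

/* Hypothetical model transform: scale a 2 x 2 object-frame square and
 * place its center at (4.0, 2.0, 0.0) in world coordinates. */
typedef struct { float x, y, z; } point3f;

point3f object_to_world(point3f p)
{
    const float s = 0.5f;                  /* uniform scale          */
    const point3f t = {4.0f, 2.0f, 0.0f};  /* world-frame placement  */
    point3f q = { s * p.x + t.x, s * p.y + t.y, s * p.z + t.z };
    return q;
}
```

With no modeling transformation at all, this step is the identity, which is the case noted above where object and world coordinates coincide.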
Object and world coordinates are the natural frames for the application program.
However, the image that is produced depends on what the camera or viewer sees.
Virtually all graphics systems use a frame whose origin is the center of the camera's
lens1 and whose axes are aligned with the sides of the camera. This frame is called
the camera frame or eye frame. Because there is an affine transformation that
corresponds to each change of frame, there are 4 × 4 matrices that represent the
transformation from model coordinates to world coordinates and from world
coordinates to eye coordinates. In OpenGL, these transformations are concatenated
together into the model-view transformation, which is specified by the model-view
matrix. Usually, the use of the model-view matrix instead of the individual matrices
does not pose any problems for the application programmer. In Chapter 9, where we
discuss programmable pipelines, we will see situations where we must separate the
two transformations.
OpenGL uses three other representations that we will need later, but, for
completeness, we introduce them here. Once objects are in eye coordinates, OpenGL
must check whether they lie within the view volume. If an object does not, it is
clipped from the scene prior to rasterization. OpenGL can carry out this process most
efficiently if it first carries out a projection transformation that brings all potentially
visible objects into a cube centered at the origin in clip coordinates. We will study
this transformation in Chapter 5. After this transformation, vertices are still
represented in homogeneous coordinates. The division by the w component, called
perspective division, yields three-dimensional representations in normalized
device coordinates. The final transformation takes a position in normalized device
coordinates and, taking into account the viewport, creates a three-dimensional
representation in window coordinates. Window coordinates are measured in units
of pixels on the display but retain depth information. If we remove the depth
coordinate, we are working with two-dimensional screen coordinates.
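The last two changes of frame—perspective division and the viewport transformation—can be sketched with plain arithmetic. OpenGL carries out these steps internally; the viewport origin and size below are hypothetical example values.

```c
#include <assert.h>

/* Clip coordinates -> NDC (divide by w) -> window coordinates
 * (viewport mapping). Depth is mapped to [0, 1] and retained. */
typedef struct { float x, y, z, w; } vec4f;
typedef struct { float x, y, z; } vec3f;

vec3f perspective_divide(vec4f clip)
{
    vec3f ndc = { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w };
    return ndc;
}

vec3f viewport_map(vec3f ndc, float ox, float oy, float w, float h)
{
    vec3f win = { ox + (ndc.x + 1.0f) * 0.5f * w,
                  oy + (ndc.y + 1.0f) * 0.5f * h,
                  (ndc.z + 1.0f) * 0.5f };  /* depth kept for later use */
    return win;
}
```

For example, the clip-coordinate vertex (2, 0, 2, 4) divides to NDC (0.5, 0, 0.5) and, in an 800 x 600 viewport at the origin, lands at window position (600, 300) with depth 0.75.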
From the application programmer’s perspective, OpenGL starts with two
frames: the eye frame and the object frame. The model-view matrix positions the
object frame relative to the eye frame. Thus, the model-view matrix converts the
homogeneous-coordinate representations of points and vectors to their
representations in the eye frame. Because the model-view matrix is part of the state
of the system, there is always a current camera frame and a current object frame.
OpenGL provides matrix stacks, so we can store model-view matrices or,
equivalently, frames.
Initially, the model-view matrix is an identity matrix, so the object frame and
eye frame are identical. Thus, if we do not change the model-view matrix, we are
working in eye coordinates. As we saw in Chapter 2, the camera is at the origin of
its frame, as shown in Figure 4.26(a). The three basis vectors in eye space correspond
to (1) the up direction of the camera, the y direction; (2) the direction the camera is
pointing, the negative z direction; and (3) a third orthogonal direction, x, placed so
that the x, y, z directions form a right-handed coordinate system. We obtain other
frames in which to place objects by performing homogeneous coordinate
transformations that define new frames relative to the camera frame. In Section 4.5,
we learn how to define these transformations; in Section 5.3, we use them to position
the camera relative to our objects. Because frame changes are represented by model-
view matrices that can be stored, we can save frames and move between frames by
changing the current model-view matrix.

1. For a perspective view, the center of the lens is the center of projection (COP), whereas for an orthogonal
view (the default), the direction of projection is aligned with the sides of the camera.

When first working with multiple frames, there can be some confusion about
which frames are fixed and which are varying. Because the model-view matrix
positions the camera relative to the objects, it is usually a matter of convenience as to
which frame we regard as fixed. Most of the time, we will regard the camera as fixed
and the other frames as moving relative to the camera, but you may prefer to adopt
a different view.

FIGURE 4.26 Camera and object frames. (a) In default positions. (b) After applying model-view matrix.
Before beginning a detailed discussion of transformations and how we use them
in OpenGL, we present two simple examples. In the default settings shown in Figure
4.26(a), the camera and object frames coincide with the camera pointing in the
negative z direction. In many applications, it is natural to specify objects near the
origin, such as a square centered at the origin or perhaps a group of objects whose
center of mass is at the origin. It is also natural to set up our viewing conditions so
that the camera sees only those objects that are in front of it. Consequently, to form
images that contain all these objects, we must either move the camera away from the
objects or move the objects away from the camera. Equivalently, we move the
camera frame relative to the object frame. If we regard the camera frame as fixed
and the model-view matrix as positioning the object frame relative to the camera
frame, then the model-view matrix

    A = | 1  0  0   0 |
        | 0  1  0   0 |
        | 0  0  1  -d |
        | 0  0  0   1 |

moves a point (x, y, z) in the object frame to the point (x, y, z − d) in the camera frame.
Thus, by making d a suitably large positive number, we “move” the objects in front
of the camera by moving the world frame relative to the camera frame, as shown in
Figure 4.26(b). Note that, as far as the user—who is working in world coordinates—
is concerned, she is positioning objects as before. The model-view matrix takes care
of the relative positioning of the object and eye frames. This strategy is almost
always better than attempting to alter the positions of the objects by changing their
vertices to place them in front of the camera.
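The matrix A above can be applied explicitly to verify the mapping; d = 2.0 is an arbitrary example distance.

```c
#include <assert.h>

/* Apply the translation model-view matrix A to a homogeneous point:
 * (x, y, z, 1) must map to (x, y, z - d, 1). */
typedef struct { float v[4]; } hpoint;

hpoint translate_z(hpoint p, float d)
{
    const float A[4][4] = {
        {1.0f, 0.0f, 0.0f, 0.0f},
        {0.0f, 1.0f, 0.0f, 0.0f},
        {0.0f, 0.0f, 1.0f,   -d},
        {0.0f, 0.0f, 0.0f, 1.0f},
    };
    hpoint r = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            r.v[i] += A[i][j] * p.v[j];
    return r;
}
```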
Let’s look at another example working directly with representations. When we
define our objects through vertices, we are working in the application frame (or
world frame). The vertices specified by glVertex3f(x,y,z) are the representation of a
point in that frame. Thus, we do not use the world frame directly but rather implicitly
by representing points (and vectors) in it. Consider the situation illustrated in Figure
4.27.
Here we see the camera as positioned in the object frame. Using homogeneous
coordinates, it is centered at a point p = (1, 0, 1, 1)T in world coordinates and points
at the origin in the world frame. Thus, the vector whose representation in the world
frame is n = (−1, 0, −1, 0)T is orthogonal to the back of the camera and points toward
the origin. The camera is oriented so that its up direction is the same as the up
direction in world coordinates and has the representation v = (0, 1, 0, 0)T. We can
form an orthogonal coordinate system for the camera by using the cross product to
determine a third orthogonal direction for the camera, which is u = (1, 0, −1, 0)T. We
can now proceed as we did in Section 4.3.6 and derive the matrix M that converts
the representation of points and vectors in the world frame to their representations in
the camera frame. The transpose of this matrix in homogeneous coordinates is obtained
by the inverse of a matrix containing the coordinates of the camera,

    |  1  0  -1  0 |
    |  0  1   0  0 |
    | -1  0  -1  0 |
    |  1  0   1  1 |

whose rows are the representations of u, v, n, and p in the world frame.

FIGURE 4.27 Camera at (1, 0, 1) pointing toward the origin.

Note that the origin in the original frame is now one unit in the n direction from the
origin in the camera frame or, equivalently, at the point whose representation is
(0, 0, 1, 1) in the camera frame.

In OpenGL, we can set a model-view matrix by sending an array of 16 elements
to glLoadMatrix. For situations like the preceding, where we have the representation
of one frame in terms of another through the specification of the basis vectors and the
origin, it is a direct exercise to find the required coefficients. However, such is not
usually the case. For most geometric problems, we usually go from one frame to
another by a sequence of geometric transformations such as rotations, translations,
and scales. We will follow this approach in subsequent sections. But first, we will
look at a few simple approaches to building geometric objects.

4.5 MODELING A COLORED CUBE

We now have most of the basic conceptual and practical knowledge we need to build
three-dimensional graphical applications. We will use them to produce a program
that draws a rotating cube. One frame of an animation might be as shown in Figure
4.28.

FIGURE 4.28 One frame of cube animation.

However, before we can rotate the cube, we will consider how we can model it
efficiently. Although three-dimensional objects can be represented, like
two-dimensional objects, through a set of vertices, we will see that data structures
will help us to incorporate the relationships among the vertices, edges, and faces of
geometric objects. Such data structures are supported in OpenGL through a facility
called vertex arrays, which we introduce at the end of this section.

After we have modeled the cube, we can animate it by using affine
transformations. We introduce these transformations in Section 4.6 and then use them
to alter OpenGL's model-view matrix. In Chapter 5, we use these transformations
again as part of the viewing process. Our pipeline model will serve us well. Vertices
will flow through a number of transformations in the pipeline, all of which will use
our homogeneous-coordinate representation. At the end of the pipeline awaits the
rasterizer. At this point, we can assume it will do its job automatically, provided we
perform the preliminary steps correctly.

4.5.1 Modeling the Faces

The cube is as simple a three-dimensional object as we might expect to model and
display. There are a number of ways, however, to model it. A CSG system would
regard it as a single primitive. At the other extreme, the hardware processes the cube
as an object defined by eight vertices. Our decision to use surface-based models
implies that we regard a cube either as the intersection of six planes or as the six
polygons, called facets, that define its faces. A carefully designed data structure
should support both the high-level application view of the cube and the low-level
view needed for the implementation.

We start by assuming that the vertices of the cube are available through an array
of vertices; for example, we could use the following:

GLfloat vertices[8][3] =

{{-1.0,-1.0,-1.0},{1.0,-1.0,-1.0},
{1.0,1.0,-1.0},{-1.0,1.0,-1.0},{-1.0,-1.0,1.0},
{1.0,-1.0,1.0},{1.0,1.0,1.0},{-1.0,1.0,1.0}};

We can adopt a more object-oriented form if we first define a three-dimensional point
type as follows:

typedef GLfloat point3[3];

The vertices of the cube can be specified as follows:

point3 vertices[8] = {{-1.0,-1.0,-1.0},{1.0,-1.0,-1.0},
{1.0,1.0,-1.0},{-1.0,1.0,-1.0},{-1.0,-1.0,1.0},
{1.0,-1.0,1.0},{1.0,1.0,1.0},{-1.0,1.0,1.0}};

OpenGL represents all vertices internally in four-dimensional homogeneous


coordinates. Function calls using a three-dimensional type, such as glVertex3fv, have
the values placed into a four-dimensional form within the graphics system.
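The promotion to homogeneous form can be sketched as follows; this mimics what the graphics system does internally rather than reproducing its actual code. Points receive w = 1 (a direction vector would receive w = 0).

```c
#include <assert.h>

/* A three-dimensional vertex promoted to four-dimensional
 * homogeneous form with w = 1. */
typedef float coord3[3];
typedef struct { float x, y, z, w; } vertex4f;

vertex4f to_homogeneous(const coord3 p)
{
    vertex4f v = { p[0], p[1], p[2], 1.0f };
    return v;
}
```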
We can then use the list of points to specify the faces of the cube. For example,
one face is

glBegin(GL_POLYGON);
glVertex3fv(vertices[0]); glVertex3fv(vertices[3]);
glVertex3fv(vertices[2]); glVertex3fv(vertices[1]);
glEnd();

and we can define the other five faces similarly. Note that we have defined
three-dimensional polygons with exactly the same mechanism that we used to define
two-dimensional polygons.

4.5.2 Inward- and Outward-Pointing Faces

We have to be careful about the order in which we specify our vertices when we are
defining a three-dimensional polygon. We used the order 0, 3, 2, 1 for the first face.
The order 1, 0, 3, 2 would be the same, because the final vertex in a polygon definition
is always linked back to the first. However, the order 0, 1, 2, 3 is different. Although
it describes the same boundary, the edges of the polygon are traversed in the reverse
order—0, 3, 2, 1—as shown in Figure 4.29. The order is important because each
polygon has two sides. Our graphics systems can display either or both of them. From
the camera's perspective, we need a consistent way to distinguish between the two
faces of a polygon. The order in which the vertices are specified provides this
information.

FIGURE 4.29 Traversal of the edges of a polygon.

We call a face outward facing if the vertices are traversed in a counterclockwise
order when the face is viewed from the outside. This method is also known as the
right-hand rule because if you orient the fingers of your right hand in the direction
the vertices are traversed, the thumb points outward.

In our example, the order 0, 3, 2, 1 specifies an outward face of the cube, whereas
the order 0, 1, 2, 3 specifies the back face of the same polygon. Note that each face
of an enclosed object, such as our cube, is an inside or outside face, regardless of from
where we view it, as long as we view the face from outside the object. By specifying
front and back carefully, we will be able to eliminate (or cull) faces that are not visible
or to use different attributes to display front and back faces. We will consider culling
further in Chapter 7.
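The right-hand rule can be checked numerically. For the face 0, 3, 2, 1 of the cube defined earlier (which lies in the plane z = −1), the cross product of two successive edges gives a normal in the −z direction, pointing out of the cube, while the reversed order 0, 1, 2, 3 gives the opposite normal. A minimal sketch:

```c
#include <assert.h>

/* Winding check for the cube defined earlier: the cross product of
 * two successive edges of face (a, b, c, ...) gives its normal. */
typedef struct { float x, y, z; } v3;

static const v3 cube[8] = {
    {-1.0f,-1.0f,-1.0f}, { 1.0f,-1.0f,-1.0f}, { 1.0f, 1.0f,-1.0f}, {-1.0f, 1.0f,-1.0f},
    {-1.0f,-1.0f, 1.0f}, { 1.0f,-1.0f, 1.0f}, { 1.0f, 1.0f, 1.0f}, {-1.0f, 1.0f, 1.0f},
};

v3 face_normal(int a, int b, int c)
{
    v3 e1 = { cube[b].x - cube[a].x, cube[b].y - cube[a].y, cube[b].z - cube[a].z };
    v3 e2 = { cube[c].x - cube[b].x, cube[c].y - cube[b].y, cube[c].z - cube[b].z };
    v3 n = { e1.y * e2.z - e1.z * e2.y,
             e1.z * e2.x - e1.x * e2.z,
             e1.x * e2.y - e1.y * e2.x };
    return n;
}
```

Here face_normal(0, 3, 2) comes out along −z (outward for the z = −1 face), and the reversed traversal face_normal(0, 1, 2) comes out along +z.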

4.5.3 Data Structures for Object Representation


We could now describe our cube through a set of vertex specifications. For example,
we could use

glBegin(GL_POLYGON)

six times, each time followed by four vertices (via glVertex) and a glEnd, or we could
use

glBegin(GL_QUADS)

followed by 24 vertices and a glEnd. Both of these methods work, but they both fail
to capture the essence of the cube’s topology, as opposed to the cube’s geometry. If
we think of the cube as a polyhedron, we have an object—the cube—that is composed
of six faces. The faces are each quadrilaterals that meet at vertices; each vertex is
shared by three faces. In addition, pairs of vertices define edges of the quadrilaterals;
each edge is shared by two faces. These statements describe the topology of a
six-sided polyhedron. All are true, regardless of the location of the vertices—that is,
regardless of the geometry of the object.2

Throughout the rest of this book, we will see that there are numerous advantages
to building for our objects data structures that separate the topology from the
geometry. In this example, we use a structure, the vertex list, that is both simple and
useful and can be expanded later.

The data specifying the location of the vertices contain the geometry and can be
stored as a simple list or array, such as in vertices[8]—the vertex list. The top-level
entity is a cube; we regard it as being composed of six faces. Each face consists of
four ordered vertices. Each vertex can be specified indirectly through its index. This
data structure is shown in Figure 4.30. One of the advantages of this structure is that
each geometric location appears only once, instead of being repeated each time it is
used for a facet. If, in an interactive application, the location of a vertex is changed,
the application needs to change that location only once, rather than searching for
multiple occurrences of the vertex.

FIGURE 4.30 Vertex-list representation of a cube (the cube points to its six faces;
each face points to an ordered list of vertex indices; each index references a location
in the vertex array).

4.5.4 The Color Cube

We can use the vertex list to define a color cube. We assign the colors of the vertices
of the color solid of Chapter 2 (black, white, red, green, blue, cyan, magenta, yellow)
to the vertices. We define a function quad to draw quadrilateral polygons specified
by pointers into the vertex list. We assign a color for the face using the index of the

2. We are ignoring special cases (singularities) that arise, for example, when three or more vertices
lie along the same line or when the vertices are moved so that we no longer have nonintersecting
faces.
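The separation of topology and geometry in Figure 4.30 can be sketched directly in code; the names below (faces, face_vertex_x, shared_update_demo) are hypothetical helpers for illustration, not OpenGL calls.

```c
#include <assert.h>

/* Geometry lives only in the vertex array; topology is the per-face
 * lists of vertex indices. */
typedef struct { float x, y, z; } vtx;

static vtx vertices[8] = {
    {-1.0f,-1.0f,-1.0f}, { 1.0f,-1.0f,-1.0f}, { 1.0f, 1.0f,-1.0f}, {-1.0f, 1.0f,-1.0f},
    {-1.0f,-1.0f, 1.0f}, { 1.0f,-1.0f, 1.0f}, { 1.0f, 1.0f, 1.0f}, {-1.0f, 1.0f, 1.0f},
};

static const int faces[6][4] = {
    {0,3,2,1}, {2,3,7,6}, {0,4,7,3}, {1,2,6,5}, {4,5,6,7}, {0,1,5,4},
};

/* x coordinate of the k-th vertex of face f, found indirectly. */
float face_vertex_x(int f, int k) { return vertices[faces[f][k]].x; }

/* Move vertex 3 once; every face referencing index 3 sees the change. */
float shared_update_demo(void)
{
    vertices[3].x = -2.0f;
    return face_vertex_x(0, 1) + face_vertex_x(2, 3); /* both index vertex 3 */
}
```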

transfer. If, as we will do in Chapter 6, we also specify a different normal vector at


each vertex, we make even more function calls to draw our cube. Hence, although
we have used a data structure that helps us to conceptualize the cube as a three-
dimensional geometric object, the code to draw it may not execute quickly.
Vertex arrays provide a method for encapsulating the information in our data
structure such that we can draw polyhedral objects with only a few function calls.
They allow us to define a data structure using vertices and pass this structure to the
implementation. When the objects defined by these arrays need to be drawn, we can
ask OpenGL to traverse the structure with just a few function calls.
There are three steps in using vertex arrays. First, we enable the functionality of
vertex arrays. Second, we tell OpenGL where and in what format the arrays are.
Third, we render the object. The first two steps can be part of the initialization; the
third is typically part of the display callback. We illustrate for the cube.
OpenGL allows many different types of arrays, including vertex, color, color
index, normal, and texture coordinate, corresponding to items that can be set between
a glBegin and a glEnd. Any given application usually requires only a subset of these
types. For our rotating cube, we need only colors and vertices. We enable the
corresponding arrays as follows:

glEnableClientState(GL_COLOR_ARRAY);
glEnableClientState(GL_VERTEX_ARRAY);

Note that, unlike most OpenGL state information, the information for vertex arrays
resides on the client side, not the server side—hence the function name
glEnableClientState. The arrays are the same as before and can be set up as globals:

GLfloat vertices[8][3] = {{-1.0,-1.0,-1.0},{1.0,-1.0,-1.0},
{1.0,1.0,-1.0},{-1.0,1.0,-1.0},{-1.0,-1.0,1.0},
{1.0,-1.0,1.0},{1.0,1.0,1.0},{-1.0,1.0,1.0}};

GLfloat colors[8][3] = {{0.0,0.0,0.0},{1.0,0.0,0.0},
{1.0,1.0,0.0},{0.0,1.0,0.0},{0.0,0.0,1.0},
{1.0,0.0,1.0},{1.0,1.0,1.0},{0.0,1.0,1.0}};

Next, we identify where the arrays are as follows:

glVertexPointer(3, GL_FLOAT, 0, vertices);
glColorPointer(3, GL_FLOAT, 0, colors);

The first three arguments state that the elements are three-dimensional colors and
vertices stored as floats and that the elements are contiguous in the arrays. The fourth
argument is a pointer to the array holding the data.

Next, we have to provide the information in our data structure about the
relationship between the vertices and the faces of the cube. We do so by specifying
an array that holds the 24 ordered vertex indices for the six faces:
GLubyte cubeIndices[24] = {0,3,2,1, /* Face 0 */
2,3,7,6, /* Face 1 */
0,4,7,3, /* Face 2 */
1,2,6,5, /* Face 3 */
4,5,6,7, /* Face 4 */
0,1,5,4}; /* Face 5 */
Thus, the first face is determined by the indices (0, 3, 2, 1), the second by (2, 3, 7, 6),
and so on. Note that we have put the indices in an order that preserves outward-facing
polygons. The index array can also be specified as a global.
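As a sanity check on the topology encoded in cubeIndices, each of the eight vertex indices should occur exactly three times among the 24 entries, since each cube vertex is shared by three faces. A small sketch (a plain unsigned char stands in for GLubyte so no OpenGL header is required):

```c
#include <assert.h>

/* The index array from the text, with unsigned char standing in
 * for GLubyte. */
static const unsigned char cubeIndices[24] = {0,3,2,1, 2,3,7,6, 0,4,7,3,
                                              1,2,6,5, 4,5,6,7, 0,1,5,4};

/* Number of faces that use vertex v. */
int vertex_use_count(int v)
{
    int count = 0;
    for (int i = 0; i < 24; i++)
        if (cubeIndices[i] == v)
            count++;
    return count;
}
```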
Now we can render the cube through use of the arrays. When we draw elements
of the arrays, all the enabled arrays (in this example, colors and vertices) are
rendered. We have a few options regarding how to draw the arrays. We can use the
following function:

glDrawElements(GLenum type, GLsizei n, GLenum format, void *pointer)

Here type is the type of element, such as a line or polygon, that the arrays define; n
is the number of elements that we wish to draw; format specifies the form of the data
in the index array; and pointer points to the first index to use.
For our cube, within the display callback we could make six calls to
glDrawElements, one for each face as follows:

for (i = 0; i < 6; i++)
  glDrawElements(GL_POLYGON, 4, GL_UNSIGNED_BYTE, &cubeIndices[4*i]);

Thus, once we have set up and initialized the arrays, each time that we draw the cube,
we have only six function calls. We can rotate the cube as before; glDrawElements
uses the present state when it draws the cube.
We can do even better though. Each face of the cube is a quadrilateral. Thus, if
we use GL_QUADS, rather than GL_POLYGON, we can draw the cube with the single
function call

glDrawElements(GL_QUADS, 24, GL_UNSIGNED_BYTE, cubeIndices);

because GL_QUADS starts a new quadrilateral after each four vertices.


There are two important extensions to vertex arrays that are supported by the
latest hardware and software. Although attributes such as color are part of the
OpenGL state, we usually specify them on a vertex-by-vertex basis, so logically they
appear to be vertex attributes. The vertex attributes supported by vertex arrays—
colors, texture coordinates, normals—are properties of the graphics system. On the
other hand, an application may have other properties that are specified on a
vertex-by-vertex basis. For example, in a fluid-flow application, we might have a fluid


velocity at each vertex. The programmable pipelines that we study in Chapter 9 give
the application programmer the ability to define her own vertex attributes that will
be processed in the same manner as OpenGL's standard attributes.

Vertex arrays are stored on the client, which minimizes the storage required on
the graphics server. Consequently, although they increase the efficiency by avoiding
many function calls, they still require data to be transferred between the client and
server each time that the data are rendered. Now that graphics cards have a significant
amount of memory, vertex data can be stored in vertex buffer objects for maximum
efficiency.

4.6 AFFINE TRANSFORMATIONS

A transformation is a function that takes a point (or vector) and maps that point (or
vector) into another point (or vector). We can picture such a function by looking at
Figure 4.33 or by writing down the functional form

Q = T(P)

for points, or

v = R(u)

for vectors.

FIGURE 4.33 Transformation.

If we use homogeneous coordinates, then we can represent both vectors and
points as four-dimensional column matrices, and we can define the transformation
with a single function,

q = f(p),
v = f(u),

that transforms the representations of both points and vectors in a given frame.
This formulation is too general to be useful, as it encompasses all single-valued
mappings of points and vectors. In practice, even if we were to have a convenient
description of the function f , we would have to carry out the transformation on every
point on a curve. For example, if we transform a line segment, a general
transformation might require us to carry out the transformation for every point
between the two endpoints.
Consider instead a restricted class of transformations. Let’s assume that we are
working in four-dimensional, homogeneous coordinates. In this space, both points
and vectors are represented as 4-tuples.3 We can obtain a useful class of
transformations if we place restrictions on f. The most important restriction is

3. We consider only those functions that map vertices to other vertices and that obey the rules for
manipulating points and vectors that we have developed in this chapter and in Appendix B.

linearity. A function f is a linear function if and only if, for any scalars α and β and
any two vertices (or vectors) p and q,

f (αp + βq) = αf (p) + βf (q).


The importance of such functions is that if we know the transformations of p and q,
we can obtain the transformations of linear combinations of p and q by taking linear
combinations of their transformations. Hence, we avoid having to calculate
transformations for every linear combination.
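Linearity can be verified numerically for a matrix transformation; the matrix, scalars, and operands below are arbitrary example values chosen to keep the arithmetic exact in floating point.

```c
#include <assert.h>

/* Numerical check of linearity, f(ap + bq) = a f(p) + b f(q), for a
 * matrix transformation in homogeneous coordinates. */
typedef struct { float v[4]; } vec4;

vec4 transform(const float C[4][4], vec4 u)
{
    vec4 r = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            r.v[i] += C[i][j] * u.v[j];
    return r;
}

float linearity_gap(void)
{
    const float C[4][4] = {{2.0f, 0.0f, 0.0f, 1.0f},
                           {0.0f, 3.0f, 0.0f, 0.0f},
                           {1.0f, 0.0f, 1.0f, 0.0f},
                           {0.0f, 0.0f, 0.0f, 1.0f}};
    const float a = 2.0f, b = -1.0f;
    vec4 p = {{1.0f, 2.0f, 3.0f, 1.0f}}, q = {{-1.0f, 0.5f, 0.0f, 1.0f}};

    /* left side: transform the combination */
    vec4 comb = {{a*p.v[0] + b*q.v[0], a*p.v[1] + b*q.v[1],
                  a*p.v[2] + b*q.v[2], a*p.v[3] + b*q.v[3]}};
    vec4 lhs = transform(C, comb);

    /* right side: combine the transforms */
    vec4 fp = transform(C, p), fq = transform(C, q);
    float gap = 0.0f;
    for (int i = 0; i < 4; i++) {
        float d = lhs.v[i] - (a*fp.v[i] + b*fq.v[i]);
        gap += (d < 0.0f) ? -d : d;
    }
    return gap;
}
```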
Using homogeneous coordinates, we work with the representations of points and
vectors. A linear transformation then transforms the representation of a given point
(or vector) into another representation of that point (or vector) and can always be
written in terms of the two representations, u and v, as a matrix multiplication:

v = Cu,

where C is a square matrix. Comparing this expression with the expression we


obtained in Section 4.3 for changes in frames, we can observe that as long as C is
nonsingular, each linear transformation corresponds to a change in frame. Hence, we
can view a linear transformation in two equivalent ways: (1) as a change in the
underlying representation, or frame, that yields a new representation of our vertices,
or (2) as a transformation of the vertices within the same frame.
When we work with homogeneous coordinates, C is a 4 × 4 matrix that leaves
unchanged the fourth (w) component of a representation. The matrix C is of the form

    C = | α11  α12  α13  α14 |
        | α21  α22  α23  α24 |
        | α31  α32  α33  α34 |
        |  0    0    0    1  |

and is the transpose of the matrix M that we derived in Section 4.3.4. The 12 values
can be set arbitrarily, and we say that this transformation has 12 degrees of freedom.
However, points and vectors have slightly different representations in our affine
space. Any vector is represented as

u = (α1, α2, α3, 0)T.

Any point can be written as

p = (β1, β2, β3, 1)T.

If we apply an arbitrary C to a vector,

v = Cu,

we see that only nine of the elements of C affect u, and thus there are only nine
degrees of freedom in the transformation of vectors. Affine transformations of points
have the full 12 degrees of freedom.
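The difference in degrees of freedom can be seen directly: the fourth column of C (the translation terms) multiplies the w component, which is 1 for a point and 0 for a vector. A sketch with an arbitrary example matrix:

```c
#include <assert.h>

/* Points (w = 1) are affected by the fourth column of C; vectors
 * (w = 0) are not. The matrix is an arbitrary affine example:
 * last row (0, 0, 0, 1). */
typedef struct { float v[4]; } rep4;

static const float C[4][4] = {{1.0f, 0.0f, 0.0f, 5.0f},
                              {0.0f, 1.0f, 0.0f, 6.0f},
                              {0.0f, 0.0f, 1.0f, 7.0f},
                              {0.0f, 0.0f, 0.0f, 1.0f}};

rep4 apply(const float m[4][4], rep4 u)
{
    rep4 r = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            r.v[i] += m[i][j] * u.v[j];
    return r;
}
```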
We can also show that affine transformations preserve lines. Suppose that we write a
line in the form

P(α) = P0 + αd,

where P0 is a point and d is a vector. In any frame, the line can be expressed as

p(α) = p0 + αd,

where p0 and d are the representations of P0 and d in that frame. For any affine
transformation matrix C,

Cp(α) = Cp0 + αCd.

Thus, we can construct the transformed line by first transforming p0 and d and using
whatever line-generation algorithm we choose when the line segment must be
displayed. If we use the two-point form of the line,

p(α) = αp0 + (1 − α)p1,

a similar result holds. We transform the representations of p0 and p1 and then


construct the transformed line. Because there are only 12 elements in C that we can
select arbitrarily, there are 12 degrees of freedom in the affine transformation of a
line or line segment.
We have expressed these results in terms of abstract mathematical spaces.
However, their importance in computer graphics is practical. We need only to
transform the homogeneous-coordinate representation of the endpoints of a line
segment to determine completely a transformed line. Thus, we can implement our
graphics systems as a pipeline that passes endpoints through affine-transformation
units, and generates the interior points at the rasterization stage.
Fortunately, most of the transformations that we need in computer graphics are
affine. These transformations include rotation, translation, and scaling. With slight
modifications, we can also use these results to describe the standard parallel and
perspective projections discussed in Chapter 5.
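Because affine transformations preserve the affine combinations that define a line, transforming the endpoints and then interpolating must agree with transforming each interior point directly. A minimal plain-Python sketch checks this; the sample matrix A and the `mat_vec` helper are illustrative choices, not from the text.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix by a homogeneous column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

# An arbitrary affine matrix: the last row is (0, 0, 0, 1).
A = [[1.0, 0.5, 0.0, 2.0],
     [0.0, 2.0, 0.0, -1.0],
     [0.0, 0.0, 1.0, 3.0],
     [0.0, 0.0, 0.0, 1.0]]

p0 = [1.0, 0.0, 0.0, 1.0]   # endpoint representations (w = 1)
p1 = [0.0, 1.0, 2.0, 1.0]

for alpha in (0.0, 0.25, 0.5, 1.0):
    # transform the interpolated point directly ...
    p = [alpha * a + (1 - alpha) * b for a, b in zip(p0, p1)]
    direct = mat_vec(A, p)
    # ... versus interpolating the transformed endpoints
    q0, q1 = mat_vec(A, p0), mat_vec(A, p1)
    interp = [alpha * a + (1 - alpha) * b for a, b in zip(q0, q1)]
    assert all(abs(x - y) < 1e-12 for x, y in zip(direct, interp))
```

This is exactly the property the pipeline exploits: only endpoints pass through the transformation unit.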

4.7 TRANSLATION, ROTATION, AND SCALING


We have been going back and forth between looking at geometric objects as abstract
entities and working with their representation in a given frame. When we work with
application programs, we have to work with representations. In this section, first we

show how we can describe the most important affine transformations independently
of any representation. Then, we find matrices that describe these transformations by acting on the representations of our points and vectors. In Section 4.8, we will see how these transformations are implemented in OpenGL.

FIGURE 4.34 Translation. (a) Object in original position. (b) Object translated.
We look at transformations as ways of moving the points that describe one or
more geometric objects to new locations. Although there are many transformations
that will move a particular point to a new location, there will almost always be only
a single way to transform a collection of points to new locations while preserving
the spatial relationships among them. Hence, although we can find many matrices
that will move one corner of our color cube from P0 to Q0, only one of them, when
applied to all the vertices of the cube, will result in a displaced cube of the same size
and orientation.

4.7.1 Translation
Translation is an operation that displaces points by a fixed distance in a given
direction, as shown in Figure 4.34. To specify a translation, we need only to specify
a displacement vector d, because the transformed points are given by

P′ = P + d

for all points P on the object. Note that this definition of translation makes no reference to a frame or representation. Translation has three degrees of freedom because we can specify the three components of the displacement vector arbitrarily.
4.7.2 Rotation

FIGURE 4.35 Two-dimensional rotation.

Rotation is more difficult to specify than translation because we must specify more parameters. We start with the simple example of rotating a point about the origin in a two-dimensional plane, as shown in Figure 4.35. Having specified a particular point—the origin—we are in a particular frame. A two-dimensional point at (x, y) in this frame is rotated about the origin by an angle θ to the position (x′, y′). We can obtain the standard equations describing this rotation by representing (x, y) and (x′, y′) in polar form:

x = ρ cos φ,
y = ρ sin φ,
x′ = ρ cos(θ + φ),
y′ = ρ sin(θ + φ).

Expanding these terms using the trigonometric identities for the sine and cosine of the sum of two angles, we find

x′ = ρ cos φ cos θ − ρ sin φ sin θ = x cos θ − y sin θ,
y′ = ρ cos φ sin θ + ρ sin φ cos θ = x sin θ + y cos θ.

These equations can be written in matrix form as

[ x′ ]   [ cos θ  −sin θ ] [ x ]
[ y′ ] = [ sin θ   cos θ ] [ y ].

We expand this form to three dimensions in Section 4.8.


Note three features of this transformation that extend to other rotations:
1. There is one point—the origin, in this case—that is unchanged by the
rotation. We call this point the fixed point of the transformation. Figure 4.36
shows a two-dimensional rotation about a fixed point in the center of the
object rather than about the origin of the frame.
2. Knowing that the two-dimensional plane is part of three-dimensional space,
we can reinterpret this rotation in three dimensions. In a right-handed system,
when we draw the x- and y-axes in the standard way, the positive z-axis comes out of the page. Our definition of a positive direction of rotation is counterclockwise when we look down the positive z-axis toward the origin. We use this definition to define positive rotations about other axes.

FIGURE 4.37 Three-dimensional rotation.
3. Rotation in the two-dimensional plane z = 0 is equivalent to a three-
dimensional rotation about the z-axis. Points in planes of constant z all rotate
in a similar manner, leaving their z values unchanged.
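The two-dimensional rotation equations derived above are easy to sanity-check numerically. This short sketch uses a `rotate2d` helper of our own, not from the text; rotating (1, 0) by 90 degrees should give (0, 1) under the counterclockwise convention.

```python
import math

def rotate2d(x, y, theta):
    """Rotate (x, y) about the origin by theta (radians), counterclockwise."""
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

# Counterclockwise when the positive z-axis points out of the page:
# (1, 0) rotated by 90 degrees lands on (0, 1).
x, y = rotate2d(1.0, 0.0, math.pi / 2)
assert abs(x) < 1e-12 and abs(y - 1.0) < 1e-12

# Rotating back by -90 degrees recovers the original point.
x2, y2 = rotate2d(x, y, -math.pi / 2)
assert abs(x2 - 1.0) < 1e-12 and abs(y2) < 1e-12
```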

We can use these observations to define a general three-dimensional rotation


that is independent of the frame. We must specify the three entities shown in Figure
4.37: a fixed point (Pf ), a rotation angle (θ), and a line or vector about which to
rotate. For a given fixed point, there are three degrees of freedom: the two angles
necessary to specify the orientation of the vector and the angle that specifies the
amount of rotation about the vector.
Rotation and translation are known as rigid-body transformations. No
combination of rotations and translations can alter the shape or volume of an object;
they can alter only the object’s location and orientation. Consequently, rotation and
translation alone cannot give us all possible affine transformations. The
transformations shown in Figure 4.38 are affine, but they are not rigid-body
transformations.

4.7.3 Scaling
Scaling is an affine non–rigid-body transformation by which we can make an object
bigger or smaller. Figure 4.39 illustrates both uniform scaling in all directions and
scaling in a single direction. We need nonuniform scaling to build up the full set of
affine transformations that we use in modeling and viewing by combining a properly
chosen sequence of scalings, translations, and rotations.

Scaling transformations have a fixed point, as we can see from Figure 4.40. Hence, to specify a scaling, we can specify the fixed point, a direction in which we wish to scale, and a scale factor (α). For α > 1, the object gets longer in the specified direction; for 0 ≤ α < 1, the object gets smaller in that direction. Negative values of α give us reflection (Figure 4.41) about the fixed point, in the scaling direction. Scaling has six degrees of freedom because we can specify an arbitrary fixed point and three independent scaling factors.

FIGURE 4.40 Effect of scale factor.

4.8 TRANSFORMATIONS IN HOMOGENEOUS COORDINATES


All graphics APIs force us to work within some reference system. Hence,
we cannot work with high-level expressions such as

Q = P + αv.

Instead, we work with representations in homogeneous coordinates and


with expressions such as

q = p + αv.

Within a frame, each affine transformation is represented by a 4 × 4


matrix of the form

A = [ α11  α12  α13  α14 ]
    [ α21  α22  α23  α24 ]
    [ α31  α32  α33  α34 ]
    [ 0    0    0    1   ].

4.8.1 Translation
Translation displaces points to new positions defined by a displacement vector. If we
move the point p to p′ by displacing it by d, then

p′ = p + d.

Looking at their homogeneous-coordinate forms

    [ x ]         [ x′ ]        [ αx ]
p = [ y ] ,  p′ = [ y′ ] ,  d = [ αy ] ,
    [ z ]         [ z′ ]        [ αz ]
    [ 1 ]         [ 1  ]        [ 0  ]

we see that these equations can be written component by component as

x′ = x + αx,
y′ = y + αy,
z′ = z + αz.

This method of representing translation using the addition of column matrices does
not combine well with our representations of other affine transformations. However,
we can also get this result using the matrix multiplication:

p′ = Tp,

where

T = [ 1  0  0  αx ]
    [ 0  1  0  αy ]
    [ 0  0  1  αz ]
    [ 0  0  0  1  ].

T is called the translation matrix. We sometimes write it as T(αx, αy, αz) to emphasize
the three independent parameters.
It might appear that using a fourth fixed element in the homogeneous representation
of a point is not necessary. However, if we use the three-dimensional forms

    [ x ]         [ x′ ]
q = [ y ] ,  q′ = [ y′ ] ,
    [ z ]         [ z′ ]

it is not possible to find a 3 × 3 matrix D such that q′ = Dq for the given displacement
vector d. For this reason, the use of homogeneous coordinates is often seen as a
clever trick that allows us to convert the addition of column matrices in three
dimensions to matrix–matrix multiplication in four dimensions.
We can obtain the inverse of a translation matrix either by applying an inversion
algorithm or by noting that if we displace a point by the vector d, we can return to
the original position by a displacement of −d. By either method, we find that

T⁻¹(αx, αy, αz) = T(−αx, −αy, −αz) = [ 1  0  0  −αx ]
                                     [ 0  1  0  −αy ]
                                     [ 0  0  1  −αz ]
                                     [ 0  0  0   1  ].
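The three properties just derived can be checked in a few lines of plain Python; the helper names `T`, `mat_vec`, and `mat_mul` are our own, chosen for this sketch. Points (w = 1) are displaced, vectors (w = 0) are left unchanged, and a translation followed by its inverse is the identity.

```python
def T(ax, ay, az):
    """Translation matrix in homogeneous coordinates."""
    return [[1.0, 0.0, 0.0, ax],
            [0.0, 1.0, 0.0, ay],
            [0.0, 0.0, 1.0, az],
            [0.0, 0.0, 0.0, 1.0]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

p = [1.0, 2.0, 3.0, 1.0]              # a point: fourth component 1
assert mat_vec(T(5, -1, 2), p) == [6.0, 1.0, 5.0, 1.0]

v = [1.0, 2.0, 3.0, 0.0]              # a vector: fourth component 0
assert mat_vec(T(5, -1, 2), v) == v   # translation leaves vectors unchanged

# T(d) followed by T(-d) is the identity matrix
I = mat_mul(T(5, -1, 2), T(-5, 1, -2))
assert I == [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
```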

4.8.2 Scaling
For both scaling and rotation, there is a fixed point that is unchanged by the
transformation. We let the fixed point be the origin, and we show how we can
concatenate transformations to obtain the transformation for an arbitrary fixed point.
A scaling matrix with a fixed point of the origin allows for independent scaling
along the coordinate axes. The three equations are

x′ = βx x,
y′ = βy y,
z′ = βz z.

These three equations can be combined in homogeneous form as

p′ = Sp,

where

S = S(βx, βy, βz) = [ βx  0   0   0 ]
                    [ 0   βy  0   0 ]
                    [ 0   0   βz  0 ]
                    [ 0   0   0   1 ].

As is true of the translation matrix and, indeed, of all homogeneous coordinate


transformations, the final row of the matrix does not depend on the particular
transformation, but rather forces the fourth component of the transformed point to
retain the value 1.
We obtain the inverse of a scaling matrix by applying the reciprocals of the scale
factors:
S⁻¹ = S(1/βx, 1/βy, 1/βz).
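A corresponding sketch for scaling, again with illustrative helper names of our own, shows a negative factor producing a reflection and the reciprocal factors undoing the scale.

```python
def S(bx, by, bz):
    """Scaling matrix with the fixed point at the origin."""
    return [[bx, 0.0, 0.0, 0.0],
            [0.0, by, 0.0, 0.0],
            [0.0, 0.0, bz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

p = [1.0, 2.0, 3.0, 1.0]
# bz = -1 reflects in the z direction
assert mat_vec(S(2.0, 0.5, -1.0), p) == [2.0, 1.0, -3.0, 1.0]

# The inverse scales by the reciprocals of the scale factors
q = mat_vec(S(1 / 2.0, 1 / 0.5, 1 / -1.0), mat_vec(S(2.0, 0.5, -1.0), p))
assert q == p
```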

4.8.3 Rotation
We first look at rotation with a fixed point at the origin. There are three degrees of
freedom corresponding to our ability to rotate independently about the three
coordinate axes. We have to be careful, however, because matrix multiplication is
not a commutative operation (Appendix C). Rotation about the x-axis by an angle θ
followed by rotation about the y-axis by an angle φ does not give us the same result
as the one that we obtain if we reverse the order of the rotations.
We can find the matrices for rotation about the individual axes directly from the
results of the two-dimensional rotation that we developed in Section 4.7.2. We saw
that the two-dimensional rotation was actually a rotation in three dimensions about
the z-axis and that the points remained in planes of constant z. Thus, in three
dimensions, the equations for rotation about the z-axis by an angle θ are

x′ = x cos θ − y sin θ,
y′ = x sin θ + y cos θ,
z′ = z;

or, in matrix form,

p′ = Rz p,

where

Rz = [ cos θ  −sin θ  0  0 ]
     [ sin θ   cos θ  0  0 ]
     [ 0       0      1  0 ]
     [ 0       0      0  1 ].

We can derive the matrices for rotation about the x- and y-axes through an identical
argument. If we rotate about the x-axis, then the x values are unchanged, and we
have a two-dimensional rotation in which points rotate in planes of constant x; for
rotation about the y-axis, the y values are unchanged. The matrices are

Rx = [ 1  0       0      0 ]
     [ 0  cos θ  −sin θ  0 ]
     [ 0  sin θ   cos θ  0 ]
     [ 0  0       0      1 ],

Ry = [ cos θ   0  sin θ  0 ]
     [ 0       1  0      0 ]
     [ −sin θ  0  cos θ  0 ]
     [ 0       0  0      1 ].

The signs of the sine terms are consistent with our definition of a positive rotation in
a right-handed system.
Suppose that we let R denote any of our three rotation matrices. A rotation by θ can always be undone by a subsequent rotation by −θ; hence,

R⁻¹(θ) = R(−θ).

In addition, noting that all the cosine terms are on the diagonal and the sine terms are off-diagonal, we can use the trigonometric identities

cos(−θ) = cos θ,
sin(−θ) = −sin θ

to find

R⁻¹(θ) = Rᵀ(θ).

In Section 4.9.2, we show how to construct any desired rotation matrix, with a fixed point at the origin, as a product of individual rotations about the three axes,

R = RxRyRz.

Using the fact that the transpose of a product is the product of the transposes in the
reverse order, we see that for any rotation matrix,

R⁻¹ = Rᵀ.

A matrix whose inverse is equal to its transpose is called an orthogonal matrix.


Normalized orthogonal matrices correspond to rotations about the origin.
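These identities are easy to verify numerically. A short sketch (the helper names are ours) checks that Rz(−θ) is both the inverse and the transpose of Rz(θ).

```python
import math

def Rz(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

theta = 0.7
# Rz(theta) followed by Rz(-theta) is the identity ...
prod = mat_mul(Rz(theta), Rz(-theta))
for i in range(4):
    for j in range(4):
        assert abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-12
        # ... and Rz(-theta) equals the transpose of Rz(theta)
        assert abs(Rz(-theta)[i][j] - transpose(Rz(theta))[i][j]) < 1e-12
```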

4.8.4 Shear
Although we can construct any affine transformation from a sequence of rotations,
translations, and scalings, there is one more affine transformation—the shear
transformation—that is of such importance that we regard it as a basic type, rather than deriving it from the others. Consider a cube centered at the origin, aligned with the axes, and viewed from the positive z-axis, as shown in Figure 4.42. If we pull the top to the right and the bottom to the left, we shear the object in the x direction. Note that neither the y nor the z values are changed by the shear, so we can call this operation x shear to distinguish it from shears of the cube in other possible directions. Using simple trigonometry on Figure 4.43, we see that each shear is characterized by a single angle θ; the equations for this shear are

x′ = x + y cot θ,
y′ = y,
z′ = z,

leading to the shearing matrix

Hx(θ) = [ 1  cot θ  0  0 ]
        [ 0  1      0  0 ]
        [ 0  0      1  0 ]
        [ 0  0      0  1 ].

FIGURE 4.42 Shear.

FIGURE 4.43 Computation of the shear matrix.

We can obtain the inverse by noting that we need to shear in only the opposite
direction; hence,

Hx⁻¹(θ) = Hx(−θ).
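A sketch of the x shear (with illustrative helper names of our own): for θ = 45 degrees, cot θ = 1, so a point at height y = 1 moves one unit in x, and shearing by −θ undoes it.

```python
import math

def Hx(theta):
    """x-shear matrix characterized by the angle theta."""
    return [[1.0, 1.0 / math.tan(theta), 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

theta = math.radians(45)              # cot 45 = 1
p = [0.0, 1.0, 0.0, 1.0]
q = mat_vec(Hx(theta), p)
assert abs(q[0] - 1.0) < 1e-12        # x' = x + y cot(theta)
assert q[1] == 1.0 and q[2] == 0.0    # y and z are unchanged

# Shearing in the opposite direction inverts the shear
r = mat_vec(Hx(-theta), q)
assert all(abs(a - b) < 1e-12 for a, b in zip(r, p))
```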

4.9 CONCATENATION OF TRANSFORMATIONS


In this section, we create examples of affine transformations by multiplying together,
or concatenating, sequences of the basic transformations that we just introduced.
Using this strategy is preferable to attempting to define an arbitrary transformation
directly. The approach fits well with our pipeline architectures for implementing
graphics systems.
Suppose that we carry out three successive transformations on a point p,
creating a new point q. Because the matrix product is associative, we can write the
sequence as

q = CBAp,

without parentheses. Note that here the matrices A, B, and C (and thus M) can be
arbitrary 4 × 4 matrices, although in practice they will most likely be affine. The
order in which we carry out the transformations affects the efficiency of the
calculation. In one view, shown in Figure 4.44, we can carry out A, followed by B,
followed by C—an order that corresponds to the grouping q = (C(B(Ap))).
FIGURE 4.44 Application of transformations one at a time.

If we are to transform a single point, this order is the most efficient because each matrix multiplication involves multiplying a column matrix by a square matrix. If we have many points to transform, then we can proceed in two steps. First, we calculate

M = CBA.

Then, we use this matrix on each point:

q = Mp.

This order corresponds to the pipeline shown in Figure 4.45, where we compute M first, then load it into a pipeline transformation unit. If we simply count operations, we see that although we do a little more work in computing M initially, because M may be applied to tens of thousands of points, this extra work is insignificant compared with the savings we obtain by using a single matrix multiplication for each point. We now derive examples of computing M.

FIGURE 4.45 Pipeline transformation.

4.9.1 Rotation About a Fixed Point
Our first example shows how we can alter the transformations that we defined with a fixed point at the origin (rotation, scaling, shear) to have an arbitrary fixed point. We demonstrate for rotation about the z-axis; the technique is the same for the other cases.
Consider a cube with its center at pf and its sides aligned with the axes. We want to rotate the cube about the z-axis, but this time about its center pf, which becomes the fixed point of the transformation, as shown in Figure 4.46. If pf were the origin, we would know how to solve the problem: We would simply use Rz(θ). This observation suggests the strategy of first moving the cube to the origin. We can then apply Rz(θ) and finally move the object back such that its center is again at pf. This sequence is shown in Figure 4.47. In terms of our basic affine transformations, the first is T(−pf), the second is Rz(θ), and the final is T(pf). Concatenating them together, we obtain the single matrix

M = T(pf)Rz(θ)T(−pf).

If we multiply out the matrices, we find that

M = [ cos θ  −sin θ  0  xf − xf cos θ + yf sin θ ]
    [ sin θ   cos θ  0  yf − xf sin θ − yf cos θ ]
    [ 0       0      1  0                        ]
    [ 0       0      0  1                        ].
FIGURE 4.46 Rotation of a cube about its center.

FIGURE 4.47 Sequence of transformations.
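The matrix M = T(pf)Rz(θ)T(−pf) can be checked in a few lines of plain Python (helper names are ours): the fixed point must map to itself, and the upper-right entry must match the expanded form above.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def T(dx, dy, dz):
    return [[1.0, 0.0, 0.0, dx], [0.0, 1.0, 0.0, dy],
            [0.0, 0.0, 1.0, dz], [0.0, 0.0, 0.0, 1.0]]

def Rz(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

pf = (2.0, 3.0, 0.0)                     # fixed point of the rotation
theta = math.radians(30)
M = mat_mul(T(*pf), mat_mul(Rz(theta), T(-pf[0], -pf[1], -pf[2])))

# The fixed point itself is unchanged by M
q = mat_vec(M, [pf[0], pf[1], pf[2], 1.0])
assert all(abs(a - b) < 1e-12 for a, b in zip(q, [2.0, 3.0, 0.0, 1.0]))

# The upper-right entry matches xf - xf cos(theta) + yf sin(theta)
expected = 2.0 - 2.0 * math.cos(theta) + 3.0 * math.sin(theta)
assert abs(M[0][3] - expected) < 1e-12
```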

4.9.2 General Rotation


We now show that an arbitrary rotation about the origin can be composed of three successive rotations about the three axes. The order is not unique (see Exercise 4.10), although the resulting rotation matrix is. We form the desired matrix by first doing a rotation about the z-axis, then doing a rotation about the y-axis, and concluding with a rotation about the x-axis.

Consider the cube, again centered at the origin with its sides aligned with the
axes, as shown in Figure 4.48(a). We can rotate it about the z-axis by an angle α to
orient it, as shown in Figure 4.48(b). We then rotate the cube by an angle β about
the y-axis, as shown in a top view in Figure 4.49. Finally, we rotate the cube by an
angle γ about the x-axis, as shown in a side view in Figure 4.50. Our final rotation
matrix is

R = RxRyRz.

FIGURE 4.48 Rotation of a cube about the z-axis. (a) Cube before rotation.
(b) Cube after rotation.

FIGURE 4.49 Rotation of a cube about the y-axis.


FIGURE 4.50 Rotation of a cube about the x-axis.
A little experimentation should convince you that we can achieve any desired
orientation by proper choice of α, β, and γ, although, as we will see in the example of
Section 4.9.4, finding these angles can be tricky.

4.9.3 The Instance Transformation


Our example of a cube that can be rotated to any desired orientation suggests a generalization appropriate for modeling. Consider a scene composed of many simple objects, such as that shown in Figure 4.51. One option is to define each of these objects, through its vertices, in the desired location with the desired orientation and size. An alternative is to define each of the object types once at a convenient size, in a convenient place, and with a convenient orientation. Each occurrence of an object in the scene is an instance of that object’s prototype, and we can obtain the desired size, orientation, and location by applying an affine transformation—the instance transformation—to the prototype. We can build a simple database to describe a scene from a list of object identifiers (such as 1 for a cube and 2 for a sphere) and of the instance transformation to be applied to each object.

FIGURE 4.51 Scene of simple objects.
The instance transformation is applied in the order shown in Figure 4.52.
Objects are usually defined in their own frames, with the origin at the center of mass
and the sides aligned with the model frame axes. First, we scale the object to the
desired size. Then we orient it with a rotation matrix. Finally, we translate it to the
desired location. Hence, the instance transformation is of the form

M = TRS.
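The order matters: M = TRS scales first, then rotates, then translates. A sketch with one made-up instance (the particular T, R, and S values are chosen only for illustration) traces a prototype vertex through the product.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

T = [[1.0, 0.0, 0.0, 5.0], [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]       # translate by (5, 0, 0)
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]       # rotate 90 deg about z
S = [[2.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 1.0]]       # uniform scale by 2

M = mat_mul(T, mat_mul(R, S))   # M = TRS: scale first, rotate, then translate

# A prototype vertex at (1, 0, 0) is scaled to (2, 0, 0),
# rotated to (0, 2, 0), then translated to (5, 2, 0).
q = mat_vec(M, [1.0, 0.0, 0.0, 1.0])
assert all(abs(a - b) < 1e-12 for a, b in zip(q, [5.0, 2.0, 0.0, 1.0]))
```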

Modeling with the instance transformation works well not only with our pipeline
architectures but also with the display lists that we introduced in Chapter 3. A
complex object that is used many times can be loaded into the server once as a
display list.
Displaying each instance of it requires only sending the appropriate instance transformation to the server before executing the display list.

4.9.4 Rotation About an Arbitrary Axis
Our final rotation example illustrates not only how we can achieve a rotation about an arbitrary point and line in space but also how we can use direction angles to specify orientations. Consider rotating a cube, as shown in Figure 4.53. We need three entities to specify this rotation. There is a fixed point p0 that we assume is the center of the cube, a vector about which we rotate, and an angle of rotation. Note that none of these entities relies on a frame and that we have just specified a rotation in a coordinate-free manner. Nonetheless, to find an affine matrix to represent this transformation, we have to assume that we are in some frame.

FIGURE 4.53 Rotation of a cube about an arbitrary axis.

The vector about which we wish to rotate the cube can be specified in various ways. One way is to use two points, p1 and p2, defining the vector

u = p2 − p1.

Note that the order of the points determines the positive direction of rotation for θ and that even though we draw u as passing through p0, only the orientation of u matters. Replacing u with a unit-length vector

            [ αx ]
v = u/|u| = [ αy ]
            [ αz ]

in the same direction simplifies the subsequent steps. We have already seen that moving the fixed point to the origin is a helpful technique. Thus, our first transformation is the translation T(−p0), and the final one is T(p0). After the initial translation, the required rotation problem is as shown in Figure 4.54.

FIGURE 4.54 Movement of the fixed point to the origin.

Our previous example (see Section 4.9.2) showed that we could get an arbitrary rotation from three rotations about the individual axes. This problem is more difficult because we do not know what angles to use for the individual rotations. Our strategy is to carry out two rotations to align the axis of rotation, v, with the z-axis. Then we can rotate by θ about the z-axis, after which we can undo the two rotations that did the aligning. Our final rotation matrix will be of the form

R = Rx(−θx)Ry(−θy)Rz(θ)Ry(θy)Rx(θx).

This sequence of rotations is shown in Figure 4.55. The difficult part of the process is determining θx and θy.

FIGURE 4.55 Sequence of rotations.

We proceed by looking at the components of v. Because v is a unit-length vector,

αx² + αy² + αz² = 1.

We draw a line segment from the origin to the point (αx, αy, αz). This line segment has unit length and the orientation of v. Next, we draw the perpendiculars from the point (αx, αy, αz) to the coordinate axes, as shown in Figure 4.56. The three direction angles—φx, φy, φz—are the angles between the line segment (or v) and the axes. The direction cosines are given by

cos φx = αx,
cos φy = αy,
cos φz = αz.

Only two of the direction angles are independent, because

cos²φx + cos²φy + cos²φz = 1.

FIGURE 4.56 Direction angles.

We can now compute θx and θy using these angles. Consider Figure 4.57. It shows that the effect of the desired rotation on the point (αx, αy, αz) is to rotate the line segment into the plane y = 0. If we look at the projection of the line segment (before the rotation) on the plane x = 0, we see a line segment of length d on this plane. Another way to envision this figure is to think of the plane x = 0 as a wall and consider a distant light source located far down the positive x-axis. The line that we see on the wall is the shadow of the line segment from the origin to (αx, αy, αz). Note that the length of the shadow is less than the length of the line segment. We can say the line segment has been foreshortened to d = √(αy² + αz²). The desired angle of rotation is determined by the angle that this shadow makes with the z-axis. However, the rotation matrix is determined by the sine and cosine of θx. Thus, we never need to compute θx; rather, we need to compute only

Rx(θx) = [ 1  0      0      0 ]
         [ 0  αz/d  −αy/d   0 ]
         [ 0  αy/d   αz/d   0 ]
         [ 0  0      0      1 ].

FIGURE 4.57 Computation of the x rotation.

We compute Ry in a similar manner. Figure 4.58 shows the rotation. This angle is clockwise about the y-axis; therefore, we have to be careful of the sign of the sine terms in the matrix, which is

Ry(θy) = [ d    0  −αx  0 ]
         [ 0    1  0    0 ]
         [ αx   0  d    0 ]
         [ 0    0  0    1 ].

FIGURE 4.58 Computation of the y rotation.

Finally, we concatenate all the matrices to find

M = T(p0)Rx(−θx)Ry(−θy)Rz(θ)Ry(θy)Rx(θx)T(−p0).


Let’s look at a specific example. Suppose that we wish to rotate an object by 45 degrees about the line passing through the origin and the point (1, 2, 3). We leave the fixed point at the origin. The first step is to find the point along the line that is a unit distance from the origin. We obtain it by normalizing (1, 2, 3) to (1/√14, 2/√14, 3/√14), or (1/√14, 2/√14, 3/√14, 1) in homogeneous coordinates. The first part of the rotation takes this point to (0, 0, 1, 1). We first rotate about the x-axis by the angle cos⁻¹(3/√13). This matrix carries (1/√14, 2/√14, 3/√14, 1) to (1/√14, 0, √(13/14), 1), which is in the plane y = 0. The y rotation must be by the angle −cos⁻¹(√(13/14)). This rotation aligns the object with the z-axis, and now we can rotate about the z-axis by the desired 45 degrees. Finally, we undo the first two rotations. If we concatenate these five transformations into a single rotation matrix R, we find that this matrix does not change any point on the line passing through the origin and the point (1, 2, 3). If we want a fixed point other than the origin, we form the matrix

M = T(pf)RT(−pf).

This example is not simple. It illustrates the powerful technique of applying many simple transformations to get a complex one. The problem of rotation about an arbitrary point or axis arises in many applications. The major variants lie in the manner in which the axis of rotation is specified. However, we can usually employ techniques similar to the ones that we have used here to determine direction angles or direction cosines.
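The whole construction can be checked numerically for the axis (1, 2, 3); the helper names below are ours. The aligning rotations take the unit axis point to (0, 0, 1, 1), and the composite R leaves every point on the axis unchanged.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

# Axis through the origin and (1, 2, 3), rotation angle 45 degrees
ax, ay, az = (x / math.sqrt(14.0) for x in (1.0, 2.0, 3.0))
d = math.sqrt(ay * ay + az * az)          # sqrt(13/14)
theta = math.radians(45)

Rx = [[1.0, 0.0, 0.0, 0.0], [0.0, az / d, -ay / d, 0.0],
      [0.0, ay / d, az / d, 0.0], [0.0, 0.0, 0.0, 1.0]]
Ry = [[d, 0.0, -ax, 0.0], [0.0, 1.0, 0.0, 0.0],
      [ax, 0.0, d, 0.0], [0.0, 0.0, 0.0, 1.0]]
c, s = math.cos(theta), math.sin(theta)
Rz = [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

# R = Rx(-tx) Ry(-ty) Rz(theta) Ry(ty) Rx(tx); rotation inverses are transposes
R = mat_mul(transpose(Rx), mat_mul(transpose(Ry),
        mat_mul(Rz, mat_mul(Ry, Rx))))

# The alignment maps the unit axis point to (0, 0, 1, 1) ...
p = mat_vec(Ry, mat_vec(Rx, [ax, ay, az, 1.0]))
assert all(abs(a - b) < 1e-12 for a, b in zip(p, [0.0, 0.0, 1.0, 1.0]))

# ... and R leaves every point on the axis unchanged
q = mat_vec(R, [1.0, 2.0, 3.0, 1.0])
assert all(abs(a - b) < 1e-12 for a, b in zip(q, [1.0, 2.0, 3.0, 1.0]))
```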
