
Chapter 10

Orthographic and Perspective Projection

[Figure: raycasting (object space renderer) vs. projection (screen space renderer)]

We have been, until now, creating images by raycasting. By shooting rays from
the eyepoint out into the scene, we determine what is visible at the screen
pixel that the ray passes through. This type of renderer is called an object (or
scene) space renderer, since all viewing calculations are done in the world space
of the scene. This approach is depicted above to the left. We are about to
begin looking at OpenGL, which uses a very different approach. OpenGL uses
a screen space rendering approach, as depicted above to the right, which works
by:

1. transforming all scene objects from world coordinates to camera coordinates by a viewing transform,

2. projecting all scene geometry into 2D screen space and then using this projection to produce a shaded image.
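As a rough sketch of these two steps (using numpy; the helper and matrix names here are illustrative assumptions, with the view transform and projection matrices developed in the sections that follow), the per-vertex work of a screen space renderer might look like:

```python
import numpy as np

def render_vertex(v_world, M_view, M_proj):
    """Sketch of the screen space pipeline for one homogeneous vertex."""
    v_cam = M_view @ v_world     # 1. world coordinates -> camera coordinates
    v_clip = M_proj @ v_cam      # 2. project toward the canonical view volume
    v_ndc = v_clip / v_clip[3]   #    divide by w (needed for perspective)
    return v_ndc[:2]             #    2D position used to shade the image
```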


For example, consider the tetrahedron in the picture to the right. If we were to project the tetrahedron onto the view plane, it might be that only two of the four triangular faces would be visible. We could color all of the pixels covered by triangle 1 the color of this face, and all those covered by triangle 2 the color of that face, thus making an image of the tetrahedron from this viewpoint.

[Figure: a tetrahedron with faces labeled 1, 2, and 3, projected onto the view plane]

10.1 View Transform

Let us begin by considering how to transform the scene into camera coordinates. We want to view the scene from the camera's point of view. As a convention, in camera space we will let the camera's viewpoint be the origin $\mathbf{x}_c = (0, 0, 0)$, the camera's view direction the negative z axis $\mathbf{u}_c = (0, 0, -1)$, and the view screen's up direction the y axis $(0, 1, 0)$. If the camera does not start out in this position and orientation in world space, we have to make a change of coordinates. This amounts to first translating the entire scene so that the camera is at the origin, then rotating the scene about the x and y axes so that the view direction vector $\mathbf{u}_c$ is along the $-z$ axis, and finally rotating the scene about the z axis so that the camera is in the desired orientation. This last operation can be facilitated by specifying a vector that we will call $\mathbf{v}_{up}$, and rotating the scene so that $\mathbf{v}_{up}$ lies in the upper half of the y-z plane. The figure below illustrates this transformation.

[Figure: the view transformation, taking the camera at $\mathbf{x}_c$ with view direction $\mathbf{u}_c$ and up vector $\mathbf{v}_{up}$ in world coordinates to the origin, with the view direction along $-z$ and $\mathbf{v}_{up}$ in the upper half of the y-z plane]

Summarizing, we do the following operations:

1. translate $\mathbf{x}_c$ to the origin so $\mathbf{x}'_c = \mathbf{0}$,

2. rotate about x and y so $\mathbf{u}'_c = -\mathbf{u}'_z$, and

3. rotate about z so that $\mathbf{v}_{up}$ lies in the upper half of the $u_y$-$u_z$ plane.

So, the view transformation would be the product of a translation $T(-x_c, -y_c, -z_c)$ taking $\mathbf{x}_c = (x_c, y_c, z_c)^T$ to the origin, and a rotation $R$ combining the three rotations above:
$$M_{view} = R\,T.$$

To understand how to do the rotations, remember first that we construct the camera coordinate frame as follows:
$$\mathbf{u}_x = (\mathbf{u}_c \times \mathbf{v}_{up}) / \|\mathbf{u}_c \times \mathbf{v}_{up}\|,$$
$$\mathbf{u}_y = \mathbf{u}_x \times \mathbf{u}_c,$$
$$\mathbf{u}_z = -\mathbf{u}_c.$$
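A minimal numpy sketch of this frame construction (the function name, and the assumption that $\mathbf{u}_c$ is a unit vector, are mine rather than the text's):

```python
import numpy as np

def camera_frame(u_c, v_up):
    """Build the camera frame (u_x, u_y, u_z) from the unit view direction u_c
    and the up hint v_up, following the cross products above."""
    u_c = np.asarray(u_c, dtype=float)
    u_x = np.cross(u_c, v_up)
    u_x = u_x / np.linalg.norm(u_x)   # u_x = (u_c x v_up) / ||u_c x v_up||
    u_y = np.cross(u_x, u_c)          # u_y = u_x x u_c
    u_z = -u_c                        # u_z = -u_c
    return u_x, u_y, u_z
```

For example, camera_frame(u_c=(0, 0, -1), v_up=(0, 1, 0)) returns the world axes (1, 0, 0), (0, 1, 0), and (0, 0, 1).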

After the rotation $\mathbf{x}' = R\mathbf{x}$, we want
$$\mathbf{u}'_x = \begin{pmatrix}1\\0\\0\end{pmatrix}, \quad \mathbf{u}'_y = \begin{pmatrix}0\\1\\0\end{pmatrix}, \quad \text{and} \quad \mathbf{u}'_z = \begin{pmatrix}0\\0\\1\end{pmatrix}.$$

The matrix to do this is easy to find, if we remember that $\mathbf{u}_x$, $\mathbf{u}_y$, $\mathbf{u}_z$ are mutually orthogonal unit vectors. Thus, $\mathbf{u}_x \cdot \mathbf{u}_y = \mathbf{u}_x \cdot \mathbf{u}_z = \mathbf{u}_y \cdot \mathbf{u}_z = 0$ and $\mathbf{u}_x \cdot \mathbf{u}_x = \mathbf{u}_y \cdot \mathbf{u}_y = \mathbf{u}_z \cdot \mathbf{u}_z = 1$. Thus, if we construct a matrix whose rows are formed from $\mathbf{u}_x^T$, $\mathbf{u}_y^T$, $\mathbf{u}_z^T$ we will get the result we want:
$$R = \begin{pmatrix} \mathbf{u}_x^T \\ \mathbf{u}_y^T \\ \mathbf{u}_z^T \end{pmatrix},$$
since
$$R\mathbf{u}_x = \begin{pmatrix}1\\0\\0\end{pmatrix}, \quad R\mathbf{u}_y = \begin{pmatrix}0\\1\\0\end{pmatrix}, \quad \text{and} \quad R\mathbf{u}_z = \begin{pmatrix}0\\0\\1\end{pmatrix}.$$
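Continuing the sketch, $R$ and the translation $T$ can be assembled into $M_{view} = RT$; the function below is a hypothetical helper, assuming the frame vectors are already unit length and mutually orthogonal:

```python
import numpy as np

def view_matrix(x_c, u_x, u_y, u_z):
    """M_view = R T: translate the eye point x_c to the origin, then rotate the
    (unit, mutually orthogonal) camera axes onto the world x, y, z axes."""
    T = np.eye(4)
    T[:3, 3] = -np.asarray(x_c, dtype=float)   # T(-x_c, -y_c, -z_c)
    R = np.eye(4)
    R[0, :3] = u_x                             # rows of R are u_x^T, u_y^T, u_z^T
    R[1, :3] = u_y
    R[2, :3] = u_z
    return R @ T

# Because the rows of R are orthonormal, R maps each camera axis back onto a
# world axis, e.g. R[:3, :3] @ u_x gives (1, 0, 0).
```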

10.2 Projection transform

Now that our model is conveniently in camera coordinates, it is possible to determine a set of screen coordinates. The distance from the viewpoint to the view screen is known as the focal length of the camera. In a real camera, the focal length is the distance from the lens at which distant objects come into focus on the camera backplane. In a pinhole camera, it is simply the distance to the view screen. If we define the camera's focal length to be $d_n$ (the distance to the near plane), then the screen center is $\mathbf{c}_s = (0, 0, -d_n)$ and the x and y screen axes are given by $\mathbf{u}'_x$ and $\mathbf{u}'_y$, assuming that the screen is arranged so its normal is $\mathbf{u}'_z$ (i.e. it is perpendicular to the view direction $\mathbf{u}'_c$). Applying our view transform $M_{view}$ to the entire scene, everything is in the same coordinate frame.

[Figure: the camera frame after the view transform, with the screen centered at $\mathbf{c}_s = (0, 0, -d_n)$ and spanned by $\mathbf{u}'_x$ and $\mathbf{u}'_y$]

To do an orthographic projection of the scene onto the camera plane is now straightforward – we just discard the z coordinate of each vertex, so a point $(x_1, y_1, z_1)$ projects to $(x_1, y_1)$, as shown to the right in a 2D plan view. To do a perspective projection, shown below to the right, we use the device of similar triangles:
$$\frac{x_1}{z_1} = \frac{x'_1}{d_n}, \qquad \frac{y_1}{z_1} = \frac{y'_1}{d_n}.$$
Thus the transform is
$$\mathbf{x}' = \frac{d_n}{z}\,\mathbf{x}.$$

[Figure: 2D plan views of the orthographic projection (drop z) and the perspective projection onto a screen at distance $d_n$ from the viewpoint]
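A small sketch of the two projections applied to a single camera-space point (numpy, hypothetical function names); the division by $-z$ anticipates the sign discussion in Section 10.3.2:

```python
import numpy as np

def project_orthographic(p_cam):
    """Orthographic projection: simply drop the z coordinate."""
    x, y, z = p_cam
    return np.array([x, y])

def project_perspective(p_cam, d_n):
    """Perspective projection by similar triangles: x' = d_n * x / (-z).
    The camera looks down -z, so z is negative in front of the camera;
    dividing by -z keeps the projected point on the correct side of the axis."""
    x, y, z = p_cam
    return np.array([d_n * x / -z, d_n * y / -z])
```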

10.3 Canonical view volumes

The view volume is the volume swept out by the screen through space in the projection system being used. For an orthographic projection, this is a rectangular solid, as shown in Figure 10.1. We use the distance $d_n$ to denote the distance to the front face, or near plane, of the volume, and $d_f$ to denote the distance to an arbitrarily chosen back face, or far plane, of the volume. Screen space cameras are generally set up to ignore (or clip away) any part of the scene not in the view volume. The width and height of the volume are determined by the camera screen width w and height h.

In a perspective camera, the view volume has a frustum shape, as shown in Figure 10.2.

The idea of a canonical view volume is to provide a common frame of reference for processing after the projection is performed, which decouples shading and display of an image from the projection system used. The typical canonical view volume is a 2x2x2 cube aligned with the coordinate axes in viewing coordinates, and centered at the origin. The idea is that no matter what the viewing conditions are, after a projection transform is performed, all of the scene is transformed into this space, and anything in that scene that is not in the canonical view volume after that transformation is clipped away.
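As a sketch, the clipping test against the canonical volume is then just a bounds check on the transformed coordinates (an illustrative helper, not from the text):

```python
import numpy as np

def in_canonical_view_volume(p):
    """True if a point lies inside the 2x2x2 canonical cube centered at the
    origin; anything outside would be clipped away before display."""
    p = np.asarray(p, dtype=float)
    return bool(np.all(np.abs(p[:3]) <= 1.0))
```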

[Figure 10.1: Parallel view volume. A box of width w and height h along the view direction $\mathbf{u}_c$, bounded by the near plane at distance $d_n$ and the far plane at distance $d_f$ from the camera at $\mathbf{x}_c$.]

[Figure 10.2: Perspective view volume. A frustum along the view direction $\mathbf{u}_c$, bounded by the near plane at distance $d_n$ and the far plane at distance $d_f$, with near-plane width w and height h.]

We construct the projection transform by considering how it will affect the 8 corners of the original view volume, under whichever projection system is being employed.

10.3.1 Constructing the canonical orthographic view volume

To construct the canonical view volume under orthographic projection, corners $(w/2, h/2)$, $(-w/2, h/2)$, $(w/2, -h/2)$, and $(-w/2, -h/2)$ should be sent to $(1, 1)$, $(-1, 1)$, $(1, -1)$, and $(-1, -1)$ respectively (here z does not matter, as we are simply scaling in the x and y directions). In the depth (z) direction we want $z_n$ to go to $-1$ and $z_f$ to go to $1$. We see that this is a scale
$$S = \begin{pmatrix} 2/w & 0 & 0 & 0 \\ 0 & 2/h & 0 & 0 \\ 0 & 0 & -2/(d_f - d_n) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

followed by a translation in the z direction to center the cube at the origin
$$T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -\frac{d_f + d_n}{d_f - d_n} \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
So the final orthographic projection matrix to transform the scene into the canonical view volume is
$$P_{ortho} = T\,S = \begin{pmatrix} 2/w & 0 & 0 & 0 \\ 0 & 2/h & 0 & 0 \\ 0 & 0 & \frac{-2}{d_f - d_n} & -\frac{d_f + d_n}{d_f - d_n} \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
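A numpy sketch of $P_{ortho}$, with a quick numeric check that a near-plane corner of the view volume lands on a corner of the canonical cube (the parameter values are arbitrary examples):

```python
import numpy as np

def ortho_projection(w, h, d_n, d_f):
    """P_ortho = T S for a camera-space box of width w, height h,
    near plane at z = -d_n and far plane at z = -d_f."""
    P = np.eye(4)
    P[0, 0] = 2.0 / w
    P[1, 1] = 2.0 / h
    P[2, 2] = -2.0 / (d_f - d_n)
    P[2, 3] = -(d_f + d_n) / (d_f - d_n)
    return P

# Quick check: the near-plane corner (w/2, h/2, -d_n) should map to (1, 1, -1).
P = ortho_projection(w=4.0, h=2.0, d_n=1.0, d_f=5.0)
corner = np.array([2.0, 1.0, -1.0, 1.0])
print(P @ corner)   # -> [ 1.  1. -1.  1.]
```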

10.3.2 Constructing the canonical perspective view volume

The construction of the canonical view volume under perspective projection follows a somewhat different derivation. First, we scale the perspective frustum so that its sides are $x = \pm z$, $y = \pm z$ and its front face is at $z = 1$:
$$S = \begin{pmatrix} 2/w & 0 & 0 & 0 \\ 0 & 2/h & 0 & 0 \\ 0 & 0 & -1/d_n & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

[Figure: 2D plan view of the scaled frustum, with sides y = z and y = -z and the screen edges at $\pm h/2$ mapped to $\pm 1$]
This is followed by a perspective transform. Remembering that by similar triangles, we showed that the perspective transform should scale a vertex $\mathbf{x}$ by $\frac{d_n}{|z|}$ (note that the absolute value signs are needed since, in camera coordinates, all z values in the view volume are negative). Thus, a perspective transform should produce the result $\mathbf{x}' = \frac{d_n}{|z|}\,\mathbf{x}$. The matrix
$$P = \begin{pmatrix} d_n & 0 & 0 & 0 \\ 0 & d_n & 0 & 0 \\ 0 & 0 & -d_n & 0 \\ 0 & 0 & -1 & 0 \end{pmatrix}$$
will transform $\mathbf{x}$ as follows:
$$P\mathbf{x} = P\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} d_n x \\ d_n y \\ -d_n z \\ -z \end{pmatrix}.$$

If thisresult is normalized into homogenous coordinates by scaling by −1/z, we


− dzn x
 − dn y 
have  z
 dn , which is the desired result. In general, a perspective transform

1
has non-zero terms in the fourth row of the matrix and will transform points in
the homogenous coordinates to points in 4-space that are not homogenous.

In order to place projected points back into homogeneous coordinates, the perspective matrix multiplication is followed by a normalization of each transformed point by dividing by its own w coordinate, to complete the perspective transform.
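As a sketch, this normalization step is just a division by the w component (an illustrative helper):

```python
import numpy as np

def perspective_divide(p):
    """Place a transformed point back into homogeneous coordinates by dividing
    by its own w component, so that w = 1 afterwards."""
    p = np.asarray(p, dtype=float)
    return p / p[3]
```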

Now, to complete the transformation to the canonical view volume, we do a perspective transform combined with a scale and translation in z that will place $z_n$ at $-1$ and $z_f$ at $1$:
$$P = \begin{pmatrix} d_n & 0 & 0 & 0 \\ 0 & d_n & 0 & 0 \\ 0 & 0 & s_z & t_z \\ 0 & 0 & -1 & 0 \end{pmatrix},$$
where the scale $s_z$ and the translation $t_z$ are yet to be determined.

Note that
$$P\mathbf{x} = P\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} d_n x \\ d_n y \\ s_z z + t_z \\ -z \end{pmatrix} = \mathbf{x}'.$$
After normalizing by the w coordinate to place the result back in homogeneous coordinates, we have
$$\mathbf{x}'' = -\frac{1}{z}\,\mathbf{x}' = \begin{pmatrix} -d_n x/z \\ -d_n y/z \\ -s_z - t_z/z \\ 1 \end{pmatrix}.$$

The requirements that $z = -d_n \Rightarrow z'' = -1$ and $z = -d_f \Rightarrow z'' = 1$ (remember that in camera coordinates the near and far planes lie at $z = -d_n$ and $z = -d_f$) give us two linear equations:
$$-s_z + \frac{t_z}{d_n} = -1,$$
$$-s_z + \frac{t_z}{d_f} = 1.$$
Solving these equations simultaneously yields
$$t_z = \frac{2 d_n d_f}{d_n - d_f}, \qquad s_z = \frac{d_n + d_f}{d_n - d_f}.$$

Thus, the final perspective matrix, transforming from camera space to the canonical view volume, is
$$P_{persp} = P\,S = \begin{pmatrix} 2 d_n / w & 0 & 0 & 0 \\ 0 & 2 d_n / h & 0 & 0 \\ 0 & 0 & \frac{d_n + d_f}{d_n - d_f} & \frac{2 d_n d_f}{d_n - d_f} \\ 0 & 0 & -1 & 0 \end{pmatrix}.$$
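A numpy sketch of this final matrix, with a quick check (using arbitrary example values $d_n = 2$, $d_f = 6$) that points on the near and far planes map to $z = -1$ and $z = +1$ after the perspective divide:

```python
import numpy as np

def persp_projection(w, h, d_n, d_f):
    """P_persp for a frustum with near-plane width w and height h,
    near plane at z = -d_n and far plane at z = -d_f."""
    P = np.zeros((4, 4))
    P[0, 0] = 2.0 * d_n / w
    P[1, 1] = 2.0 * d_n / h
    P[2, 2] = (d_n + d_f) / (d_n - d_f)
    P[2, 3] = 2.0 * d_n * d_f / (d_n - d_f)
    P[3, 2] = -1.0
    return P

# Quick check: after the divide by w, a point on the near plane should land at
# z = -1 and a point on the far plane at z = +1.
P = persp_projection(w=2.0, h=2.0, d_n=2.0, d_f=6.0)
for z in (-2.0, -6.0):
    p = P @ np.array([0.0, 0.0, z, 1.0])
    print((p / p[3])[2])    # -> -1.0 then 1.0
```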
