Interactive Computer Graphics Lesson 16
D. C. Anderson
August, 2001
Preface
The field of computer graphics has experienced a number of hardware and software revolutions since the early 1970s when these notes began. The purpose of these notes now, however, is very much the same as then: to provide students with a basic knowledge of, and practical experience with, the fundamental mathematics, algorithms, representations and techniques that underlie interactive computer graphics applications. The chapters are arranged in an order that progresses from devices to software, from two dimensions (2D) to three dimensions (3D), and from simple geometric forms to the more complex curves, surfaces and solids. At times, this causes some topics to be re-visited, such as clipping and transformations in 2D and 3D. This order was chosen to facilitate a more natural topical progression for learning. Students accumulate knowledge and experience with relatively simpler topics first, such as 2D programming techniques, and then are ready to combine these with the later topics that are more mathematically and computationally challenging.

Computer graphics and computer programming are inseparable. Typically in the course that uses these notes, four programming projects are assigned over sixteen weeks. Each project requires three to four weeks, coinciding as much as possible with the lecture topics. The goals of the projects are to give the student in-depth practice with the concepts while designing, coding and debugging a relevant and non-trivial application. Typically, the project topics are (1) basic 2D display with minor interaction, (2) fully interactive 2D drawing requiring dynamic data structures, (3) the wireframe 3D pipeline with projection, 3D clipping, and hierarchical transformations with little interaction, and (4) 3D hidden surface removal and shading, such as scan-line rendering or ray tracing. Partially complete source code for the first project usually is given to the students to demonstrate basic graphical programming techniques and styles. This has proven to be effective in speeding the learning of programming details as well as various operating system and graphics package minutiae. This also paves the way for later concentration on mathematical and algorithmic concepts.

Over the past 25 years, there have been four generations of graphics devices used for the projects, starting with the Imlac minicomputer vector display system, progressing through Megatek 3D vector systems, raster graphics workstations and now personal computers. Each
new display technology and new generation of system software spawned a new generation of graphics programming package. Such is the case for the GRAFIC package, which was developed during the first days of X Windows (version 10) to support the teaching of computer graphics. GRAFIC was designed to provide basic, multi-language (initially C, Pascal and Fortran; and now C++ as well), multi-platform access to window-based raster graphics. An important objective of GRAFIC was that its use should require little training in the package itself. Now, versions of GRAFIC exist for X Windows for Unix workstations, IBM-compatible personal computers with Microsoft Windows, and Apple Macintosh computers. GRAFIC is available as-is and free of charge from the internet site shown below.

There are many individuals who have contributed to these notes over the years. Michael Bailey taught the course Interactive Computer Graphics with the author for several years, established some of the early topics, and developed some of the early versions of the written notes. Warren Waggenspack contributed significantly to the presentation of curves and surfaces while at Purdue, and has been a valued collaborator who has encouraged continued work on the notes by the author. Philip Cunningham helped with the presentation of the reflectance and shading models. Joseph Cychosz contributed to the presentation on color and some of the ray casting algorithms. Gregory Allgood implemented the first versions of X Windows GRAFIC. To these individuals, the author extends sincere appreciation for their invaluable help.
Bibliography
The following is a partial list of texts covering various topics in computer graphics. They are categorized by their primary emphasis and alphabetically by first author.

General Computer Graphics Texts:

1. Berger, M., Computer Graphics with Pascal, The Benjamin/Cummings Publishing Company, Inc., 1986.
2. Dewey, B. R., Computer Graphics for Engineers, Harper & Row, 1988.
3. Foley, J. D. and Van Dam, A., Fundamentals of Interactive Computer Graphics, Addison-Wesley, 1982.
4. Giloi, W. K., Interactive Computer Graphics, Prentice-Hall, Inc., 1978.
5. Harrington, S., Computer Graphics - A Programming Approach, McGraw-Hill Book Company, 1987.
6. Hearn, D. and Baker, M. P., Computer Graphics, Prentice-Hall, Inc., 1986.
7. Hill, F. S. Jr., Computer Graphics, Macmillan Publishing Company, 1990.
8. Newman, W. M. and Sproull, R. F., Principles of Interactive Computer Graphics, Second Edition, McGraw-Hill, 1979.
9. Plastok, R. A. and Kalley, G., Theory and Problems of Computer Graphics, Schaum's Outline Series in Computers, McGraw-Hill Book Company, 1986.
10. Pokorny, C. K. and Curtis, F. G., Computer Graphics: The Principles Behind the Art and Science, Franklin, Beedle & Associates, 1989.
11. Rogers, D. F., Procedural Elements for Computer Graphics, McGraw-Hill Book Company, New York, 1985.
Geometric Modeling Emphasis:

12. Barnsley, M. F., Devaney, R. L., Mandelbrot, B. B., Peitgen, H.-O., Saupe, D., and Voss, R. F., The Science of Fractal Images, Springer-Verlag, 1988.
13. Ding, Q. and Davies, B. J., Surface Engineering Geometry for Computer-Aided Design and Manufacture, John Wiley & Sons, 1987.
14. Farin, G., Curves and Surfaces for Computer Aided Geometric Design - A Practical Guide, Academic Press, Inc., Harcourt Brace Jovanovich Publishers, 1988.
15. Hoffmann, C. M., Geometric and Solid Modeling - An Introduction, Morgan Kaufmann Publishers, Inc., San Mateo, California, 1989.
16. Lyche, T. and Schumaker, L. L., Mathematical Methods in Computer Aided Geometric Design, Academic Press, Inc., Harcourt Brace Jovanovich Publishers, 1989.
17. Mantyla, M., An Introduction to Solid Modeling, Computer Science Press, Inc., 1988.
18. Mortenson, M. E., Geometric Modeling, John Wiley & Sons, New York, 1980.
19. Mortenson, M. E., Computer Graphics - An Introduction to the Mathematics and Geometry, Industrial Press, Inc., 1989.
20. Rogers, D. F. and Adams, J. A., Mathematical Elements for Computer Graphics, McGraw-Hill, 1976.
21. Yamaguchi, F., Curves and Surfaces in Computer Aided Geometric Design, Springer-Verlag, 1988.
Realistic Image Generation Emphasis:

22. Deken, Joseph, Computer Images, Stewart, Tabori, & Chang, New York, 1983.
23. Glassner, A. S. (ed.), An Introduction to Ray Tracing, Academic Press, Inc., Harcourt Brace Jovanovich Publishers, 1990.
24. Greenberg, D., Marcus, A., Schmidt, A., and Gorter, V., The Computer Image: Applications of Computer Graphics, Addison-Wesley, 1982.
25. Hall, R., Illumination and Color in Computer Generated Imagery, Springer-Verlag, 1989.
26. Mandelbrot, Benoit, The Fractal Geometry of Nature, W. H. Freeman and Company, New York, 1983.
27. Overheim, R. D. and Wagner, D. L., Light and Color, John Wiley & Sons, 1982.
28. Schachter, B., Computer Image Generation, John Wiley & Sons, New York, 1983.
Contents
Chapter 1. Introduction
    1.1 Historical Perspective
    1.2 Notes Overview
    1.3 Notation

Chapter 2. Graphics Devices
    2.1 Static Graphics Devices
        2.1.1 Plotters
        2.1.2 Electrostatic Plotters
        2.1.3 Laser Printers
        2.1.4 Film Recorders
    2.2 Dynamic Graphics Devices
        2.2.1 The Cathode Ray Tube (CRT)
        2.2.2 The Point Plotting Display
    2.3 Raster Displays
        2.3.1 Video Technology
        2.3.2 The Bitmap Display
        2.3.3 Pixel Memory Logic
        2.3.4 Gray Scale Displays

Chapter 3. Color
    3.1 Light
    3.2 Color Perception
    3.3 Color CRT Operation
    3.4 Additive and Subtractive Color
    3.5 Color Representations
        3.5.1 RGB Color Space
        3.5.2 CMY and CMYK Color Spaces
        3.5.3 YIQ Color Space
        3.5.4 XYZ Color Space
        3.5.5 HSI Color Space
        3.5.6 HSV Color Space
        3.5.7 Luv Color Space
        3.5.8 Lab Color Space
    3.6 References
    3.7 Review Questions

Chapter 4. Basic Graphics Software
    4.1 Hardcopy Plotting
    4.2 Drawing Lines
        4.2.1 Bresenham Line Stepping Algorithm
        4.2.2 The Plot Routine
    4.3 Characters
        4.3.1 Character Definitions
        4.3.2 Font Data Structures
    4.4 Raster Software
        4.4.1 Rasterization
        4.4.2 Raster Display Drawing Operations
    4.5 References
    4.6 Review Questions

Chapter 5. Vector Display Systems
    5.1 Display Programming
        5.1.1 Driving A Vector Display
        5.1.2 Driving A Display with Memory
    5.2 Dynamic Display Manipulation
    5.3 Display Processors

Chapter 6. 2D Graphics Software
    6.1 Window to Viewport Mapping
        6.1.1 Mapping Equations
        6.1.2 Uniform Scaling
    6.2 Clipping
        6.2.1 Region Codes
        6.2.2 Clipping Against A Boundary
    6.3 2D Transformations
        6.3.1 Matrix Representation
        6.3.2 Basic Transformations
        6.3.3 Concatenation (Composition)
        6.3.4 Inverse Transformations
    6.4 Review Questions

Chapter 7. Graphics Packages
    7.1 Window Graphics Systems
    7.2 Windows
    7.3 Events
    7.4 Drawing Primitives
    7.5 References

Chapter 8. Interactive Techniques
    8.1 Pointing and Positioning
    8.2 Dragging
    8.3 Picking
        8.3.1 Picking Text
        8.3.2 Picking Lines
        8.3.3 Picking Circles
        8.3.4 Picking Composite Objects
        8.3.5 Interaction Paradigms
    8.4 Review Questions

Chapter 9. Data Structures for Interactive Graphics
    9.1 Basic Data Storage Methods
        9.1.1 Compacted Sequential Lists
        9.1.2 Linked Lists
        9.1.3 Linked Lists with C/C++ Pointers
        9.1.4 Using Compacted Sequential Lists and Linked Lists
    9.2 Storing Multiple Objects
    9.3 Instancing with Multiple Objects
    9.4 Hierarchical Modeling with Instances
    9.5 The Current Transformation Matrix
    9.6 Modeling Languages and Scene Graphs
    9.7 Review Questions

Chapter 10. 3D Rendering
    10.1 Depth Cueing
    10.2 3D Data Structures
    10.3 Planar Geometric Projections
    10.4 Perspective Projection
    10.5 3D Modeling Transformations
    10.6 Homogeneous Coordinates
    10.7 Viewing Transformations
    10.8 Three Dimensional Clipping
        10.8.1 Z-Clipping with 2D Window Clipping
        10.8.2 3D Frustum Clipping
        10.8.3 3D Clipping in Homogeneous Coordinates
    10.9 EYE-AIM Viewing Methods
        10.9.1 AIM-Based Viewing
        10.9.2 EYE-Based Viewing
    10.10 References
    10.11 Review Questions

Chapter 11. Computational Geometry
    11.1 Review of Vector Geometry Fundamentals
    11.2 Basic Vector Utilities
        11.2.1 Vector Magnitude
        11.2.2 Vector Addition and Subtraction
        11.2.3 Vector Multiplication and Division by a Scalar
        11.2.4 Unit Vector
        11.2.5 Dot Product
        11.2.6 Cross Product
        11.2.7 Vector Area
        11.2.8 Vector Volume
        11.2.9 Vector Equation of a Line
        11.2.10 Vector Equation of a Plane
    11.3 Geometric Operations on Points, Lines and Planes
        11.3.1 Distance from a Point to a Line
        11.3.2 Distance Between Two 3D Lines
        11.3.3 Intersection of Two Lines
        11.3.4 Intersection of a Line with a Plane
    11.4 Change of Basis in Vector-Matrix Form
    11.5 Instance Transformations
    11.6 Transforming Continuous Geometries
    11.7 Rotation About An Arbitrary Axis Through The Origin
    11.8 Review Questions

Chapter 12. Hidden Line and Surface Removal
    12.1 Back-Plane Rejection
    12.2 3D Screen Space
    12.3 Hidden Line Algorithms
    12.4 Raster Algorithms
        12.4.1 Painter's Algorithm (Newell-Newell-Sancha)
        12.4.2 Z-Buffer Algorithm
        12.4.3 Watkins Scan-Line Algorithm
    12.5 Polygon Clipping
    12.6 Scan Converting Polygons
        12.6.1 Vertex Intersections
        12.6.2 3D Scan Conversion
    12.7 Scan-line Z Buffer Algorithm
    12.8 References
    12.9 Review Questions

Chapter 13. Realistic Display Effects
    13.1 Illumination Models
        13.1.1 Diffuse Reflection
        13.1.2 Specular Reflection
        13.1.3 Ambient Light
        13.1.4 Ambient - Diffuse - Specular Model
        13.1.5 Light Sources
        13.1.6 Multiple Light Sources
    13.2 Polygon Shading Methods
        13.2.1 Flat Shading
        13.2.2 Smooth Shading
    13.3 Ray Tracing
        13.3.1 Shadows
        13.3.2 Reflection and Refraction
    13.4 Ray Geometry
    13.5 Ray-Polygon Intersections
        13.5.1 Point Containment for Convex Polygons
        13.5.2 Semi-Infinite Ray Point Containment Test
        13.5.3 Sum of Angles Point Containment Test
    13.6 Orienting Lighting Normals
    13.7 Intersecting a Ray with a Convex Polyhedron
    13.8 Intersecting a Ray with a Conic Solid
    13.9 Ray-Tracing CSG Solids
    13.10 Texture Mapping
        13.10.1 Texture (Pattern) Mapping
        13.10.2 Bump Mapping
    13.11 Review Questions

Chapter 14. Curves
    14.1 Parameterization
    14.2 Spline Curves
    14.3 Polynomial Forms
    14.4 Bézier Formulation
    14.5 B-Spline Formulation
    14.6 Parabolic Blending
    14.7 Rational Formulations
    14.8 Review Questions

Chapter 15. Surfaces
    15.1 Lofted Surfaces
    15.2 Linear Interior Blending
    15.3 Bézier Surfaces
    15.4 B-Spline Surfaces

Chapter 16. Geometric Modeling
    16.1 Historical View of Geometric Modeling
    16.2 CAD and Geometric Modeling
    16.3 Solid Geometric Modeling
    16.4 Solid Modeling Methods
        16.4.1 Boundary Representation
        16.4.2 Constructive Solid Geometry (CSG)
        16.4.3 Evaluating CSG Objects: A 2D Example

Index

APPENDIX. The GRAFIC Package
Chapter 1. Introduction
1.1 Historical Perspective
Computer graphics began in the late 1950s and was developed primarily in the large aircraft and automobile industries. In those days, a company or division had just one computer, and the graphics device demanded nearly all of its computational capacity. These were the only industries that could afford such expensive technology and the only ones whose massive operations could justify the equipment, development costs and effort. In the early 1960s, Dr. Ivan Sutherland introduced the age of interactive computer graphics for engineering applications with his Ph.D. dissertation, SKETCHPAD: A Man-Machine Graphical Communication System. His work demonstrated various ways that interactive graphics could revolutionize engineering and other application areas.

During the 1970s, advances in computer hardware caused the cost of computing to decline dramatically, making computer graphics applications more cost effective. In particular, computer-aided design (CAD) systems flourished with storage tube terminals and minicomputers, causing a dramatic increase in computer graphics users and applications.

The 1980s became the age of the workstation, which brought together computing, networking and interactive raster graphics technology. Computer graphics technology followed the way of computers, rapidly changing from add-on graphics devices to integral raster graphics workstations. Raster graphics systems and window-based systems became the standard. In the mid-1980s, personal computers took over the computing and graphics markets and quickly dominated the attention of nearly all graphics applications. The pervasiveness of PCs made software developers flock to them. By the end of the 1980s, personal computer software was arguably the center of attention of the software developer community. Workstations also thrived as higher-end computing systems with strengths in networking and high-end computation. PCs, however, vastly outnumbered any other form of computer and became the mainstream graphics delivery system.

Today, computer graphics equipment and software sales constitute a multi-billion-dollar-per-year marketplace that undergoes continual performance and capability improvements. Some of today's most popular applications are:
- Entertainment, such as animation, advertising, and communications
- Internet applications
- Mechanical design and drafting (CAD/CAM)
- Electronic design and circuit layout
- Natural resource exploration and production
- Simulation
- Chemical and molecular analysis
- Desktop publishing
1.3 Notation
Throughout the text, the following notation will be used.
scalar number: lower case plain text
position vector: upper case plain text, letters P or Q, possibly subscripted
direction vector: upper case plain text, letter V, possibly subscripted
matrix: upper case bold
tensor or matrix made from elements of B
angles: lower case Greek
row or column vector with explicit coordinates
computer language statements and references in the text to computer variables and symbols
2.1.1 Plotters
Pen plotters are electromechanical devices that draw on paper and other materials. Some past and present manufacturers include companies like Calcomp, Houston Instruments, Hewlett Packard and Tektronix. Prices range from $2,000 to over $50,000 according to speed, accuracy, and local intelligence. Local intelligence means the plotter has built-in capabilities such as arcs and circles, dashed lines, multiple pens, scaling, rotations, character generation, filling areas, etc.
Intelligence is typically accomplished with a microprocessor in the device. As with computers, there are typical trade-offs among speed, cost, intelligence and accuracy.

Physical Arrangements

The drum plotter was one of the first computer graphics hardcopy devices (Figure 2-1). There are three independent actions: (1) raising or lowering the pen by activating or deactivating the solenoid, and stepping the drum or carriage to produce relative motions between the paper and the pen along (2) the X and (3) the Y directions. Commands sent from a computer connected to the plotter raise or lower the pen, and then step the pen the required number of steps in X and Y to produce the desired effect.

[Figure 2-1. Elements of a Drum Plotter: the solenoid lifts the pen on command (the pen is usually held down with a spring); one stepping motor moves the pen carriage and another rotates the drum.]

A variation of the drum plotter that facilitates drawing on thicker materials and materials that cannot bend is the flatbed plotter (Figure 2-2). The basic operation of the pen carriage and two stepping motors is like the drum plotter.

[Figure 2-2. Elements of a Flatbed Plotter: a pen carriage with solenoid moves in X and Y above paper held by clamps or suction.]

Plotters are capable of drawing lines of different colors using multiple pen carriages (Figure 2-3). The pen carriage is rotated or translated to place the desired pen in the writing position. A more elaborate version of a multiple pen carriage changes pens by moving the pen carriage to the corner of the pen bed, depositing the current pen in a second pen carriage that holds extra pens, picking up a new pen from the extra carriage, and returning to the writing position to continue drawing.

[Figure 2-3. Multiple Pen Carriages.]

Basic Operation

The pen is held in a chamber and is pressed against the paper surface by a spring. The chamber is held inside a solenoid coil. When activated by the plotter electronics, the solenoid lifts the pen above the plotting surface. X and Y motions are controlled by stepping motors. The solenoid and motors are controlled by a computer. Only one step is taken at a time. The resolution of a plotter is a measure of its accuracy, or step-size. Typical resolutions are 0.002 and 0.005 inch per step, or 500 or 200 steps per inch, respectively.

There are several critical factors in the design of plotters:

speed - The speed of a plotter is typically given in steps per second. The time required to complete a given drawing is a function of the distance travelled by the pen, and the resolution and speed of the plotter. For example, consider drawing an 8 by 10 inch box around a page using a plotter with 500 steps per inch (0.002 inch per step) and a speed of 500 steps per second. The time to complete the drawing is computed as follows:

    time (sec) = distance (inch) / {resolution (inch/step) * speed (step/sec)}

or,

    time = (8+8+10+10) / {0.002 * 500} = 36 sec

(Note the plotter travels at 1 inch/second.)

accuracy - Generally, increasing accuracy requires slower speeds and more massive structures to maintain mechanical rigidity.

ink flow - High speed plotters must force ink from the pen because gravity alone is insufficient.

Configurations

There are three basic configurations for connecting a plotter to a computer: off-line, on-line, and spooled (buffered). Off-line configuration means the plotter is not physically connected to the computer, so data must be manually transferred. The plotter reads plotting information from
a tape that must be loaded and unloaded periodically. The plotting programs write this information on the tape and, at a designated time, the tape is removed and queued for the plotter.
[Figure 2-4. Off-line Plotter Configuration: the computer writes plot data to tape, and the tape is manually carried to the plotter.]

On-line configuration means there is a direct data communication interface between the computer and the plotter. As the plotting program executes in the computer, calls to plotting routines cause plot commands to be sent to the plotter. The plot commands are executed by the plotter hardware, i.e. the pen is moved. This is a typical configuration for a personal computer and a plotter.
[Figure 2-5. On-line Plotter Configuration: plot commands flow directly from the computer to the plotter.]

It usually does not make sense to configure a plotter on-line with a multi-user, time-shared computer system because several plotting programs can be executing at once. In this case, it is necessary to couple the two previous approaches, on-line and off-line, in a manner that allows several plotting programs to create plot data, yet send only one program's data at a time to the plotter. Spooling is a method for accomplishing this.

Spooling

Current time-shared systems perform a combination of off-line and on-line plotting, called spooling. This is similar to how line printers are operated. Storing and plotting can occur simultaneously (with proper provisions for simultaneous access to files). Plotting programs create intermediate plot files when they execute. When a program finishes, the plot file is queued for plotting, that is, transferred onto a disk area set aside for plot files waiting to be plotted. This intermediate disk storage is a buffer between the programs creating plot data and the slower plotter, which can only process one plot at a time. Another computer program, sometimes executing in another computer, accesses the queued plot files periodically and sends them to the plotter. The plotting programs are off-line with respect to the plotter, and the intermediate program is on-line.

[Figure 2-6. Spooled Plotter Configuration: the computer stores plot files in a disk buffer, from which they are sent to the plotter.]
[Figure 2-7. Elements of an Electrostatic Plotter: paper from a paper roll moves past the writing head to the toner applicator; a magnified top view of the writing head shows the individual epps.]
One advantage of electrostatic plotting is that the printing process is almost entirely electronic, so the problems of speed and accuracy that limit mechanical pen plotters are eliminated. The drawback is that the resulting image must be drawn as an array of dots. Each dot is called an electrostatic plotter point, or epp. The plotting process involves the computer sending data to the plotter for each line of epps. The data for a line of epps is a series of binary numbers, 1 or 0, representing dot (on) or no dot (off) for each writing head epp position. The computer must compute the binary values for all epps based on the image that is to be drawn. After a complete line of epps is received, the plotter controls the writing head to charge the paper at epps whose values are on. As the paper advances, new lines are charged and the charged lines reach the toner applicator where the ink is applied. The process continues down the entire page. The page cannot back up, so all plotting must be done in one pass.

The density of epps, or resolution, is critical to the quality of the image. Typical electrostatic plotters have resolutions of 100 to 250 epps per inch. One hardware modification to increase resolution is to arrange the writing head in two rows of epps, one for even epps and another for odd. A single line of epps is printed on the paper in two steps that are synchronized in time so that the even and odd epps align along a line.

[Figure 2-8. Staggered Epp Writing Head Operation.]

Some past and present electrostatic plotter manufacturers are Versatec, Houston Instruments, Varian, and Gould. Prices range from $5,000 to $50,000, widths vary from 8 inches to 6 feet, speeds vary between 1 to over 6 inch/sec., and resolutions vary from 80 to 250 epps per inch. Color printing can be done with multiple passes over the same paper, once for each color component. This will be described later in the chapter on color. Electrostatic plotters have been replaced by the more popular laser printers.
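To make the line-of-epps data format concrete, here is a minimal C sketch of packing one scan line into binary values, one bit per writing-head position. The names (EPPS_PER_LINE, line_bits, set_epp) and the line width are illustrative assumptions, not part of any actual plotter interface.

    /* Pack one line of epps as bits: 1 = dot (charge the paper), 0 = no dot. */
    #include <stdio.h>
    #include <string.h>

    #define EPPS_PER_LINE 1024                     /* assumed head width */

    static unsigned char line_bits[EPPS_PER_LINE / 8];

    void set_epp(int i)                            /* turn epp i on */
    {
        line_bits[i / 8] |= (unsigned char)(1 << (i % 8));
    }

    int main(void)
    {
        memset(line_bits, 0, sizeof line_bits);
        for (int i = 0; i < EPPS_PER_LINE; i += 2)
            set_epp(i);                            /* every other epp on: a dotted line */
        /* ...send line_bits to the plotter, then compute the next line... */
        printf("first byte = 0x%02X\n", (unsigned)line_bits[0]);
        return 0;
    }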
[Figure 2-9. Elements of Laser Printers.]

Some laser printer engine manufacturers are Canon, Ricoh, Kyocera, and Linotype. Prices range from $1,000 to over $50,000 and speeds vary from a few to 20 or more pages per minute.
[Figure 2-10. Elements of a Film Recorder: computer video output drives a cathode ray tube imaged through a color wheel with red, green and blue filters.]

similar to the film recorder above, however the image is not computed for the recorder, but is instead captured from a screen display.

[Figure: a color display's R, G, B signals switched to a black-and-white CRT imaged through a color wheel.]
[Figure 2-12. Elements of a CRT: filament, cathode, control grid, accelerating plates, focusing system and deflection system.]

and control grid. The flood of electrons is focused into a concentrated beam by the focusing system and accelerated into a narrow beam of high energy electrons by the accelerating plates. The beam is deflected by charges in two pairs of charged plates, one horizontal and one vertical, in the deflection system. The horizontal and vertical deflection voltages are controlled by signals from the computer.
through digital-to-analog (D/A) converters connected between the computer and the display. The computer sends digital values to the D/A converters that convert the values to voltages. The output voltages of the D/A converters are connected to the X and Y deflection plates. The computer then pulses the beam intensity to a pre-set maximum value. Where the beam strikes the screen, the phosphor glows, causing a visible dot. The beam must be turned off quickly to avoid burning the phosphor. This forms the basic point plotting display (Figure 2-13).

[Figure 2-13. Configuration of a Point Plotting Display: the computer drives X, Y and intensity D/A converters connected to the CRT.]

When the beam is turned off, the glow decays rapidly based on the persistence, or rate of light decay, of the screen phosphor. The glow decays very quickly, so the image must be refreshed (re-drawn) often, typically every 1/30 - 1/60 sec. (30 - 60 cycles/sec, or hertz), even if the picture does not change.

[Figure 2-14. Decay in Intensity of Phosphor Light: intensity falls from 100% toward a visibility threshold over time.]

If the glow fades too much before it is refreshed, the eye will detect this intensity variation as flicker.

[Figure 2-15. Intensity Variations Due to Refresh Cycles: intensity varies about an average between the start and end of each refresh cycle.]

Hardware can be added to the basic point plotting display to produce two different types of refresh display technology:

Vector (continuous random lines): the beam motion and intensity are controlled by a vector generator that draws each line as one continuous beam motion, analogous to pen plotting. Display commands from the computer direct the drawing of the picture.

Raster (ordered dots): the beam motion always follows the same path. The picture depends on the appearance of the dots, analogous to laser printing. The computer draws the picture by setting values in computer memory.

We will focus on raster technology first.
[Figure 2-16. Beam Motion for Interlaced Raster Display.]

must repeatedly process its data to the display screen to refresh the image. We call this hardware a raster display processor.
[Figure 2-19. Pixels Array Variables: the pixels[h,v] array, with column index h increasing to the right and scan-line index v increasing downward from (0,0) at the upper-left corner.]

a coordinate system on the pixel array by numbering each pixel by its position as the hth pixel in the vth scan-line, i.e. the coordinate (h,v), with the first pixel in the upper-left corner having coordinates (0,0). Historically, the video display order dictates this choice of coordinate system. It is now possible to alter the display by changing the values in the pixels array. For
example, setting a pixel is simply the statement: pixels[h,v] = BLACK. This pixel will be displayed black by the display processor during the next refresh cycle. The statement pixels[h,v] = WHITE clears the pixel, i.e. sets it to white, which is often the background color. One could think of this as erasing. A 512 by 512 black-and-white display requires 256K (K = 1024) bits of memory. Also, every pixel is displayed each refresh cycle, even if you have not drawn anything on the screen. That is, a pixel always has a value that is displayed.
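The pixels array idea can be sketched in C as below. The dimensions, color constants and function name are assumptions for illustration only; for simplicity the sketch stores one byte per pixel, whereas a true bitmap display stores one bit.

    /* A minimal model of pixel memory for a black-and-white raster display. */
    #include <stdio.h>

    #define WIDTH  512
    #define HEIGHT 512
    #define WHITE  0                  /* background color */
    #define BLACK  1

    static unsigned char pixels[HEIGHT][WIDTH];    /* pixels[v][h] */

    /* Change one pixel; the display processor shows the new value
       on the next refresh cycle. */
    void set_pixel(int h, int v, int value)
    {
        if (h >= 0 && h < WIDTH && v >= 0 && v < HEIGHT)
            pixels[v][h] = (unsigned char)value;
    }

    int main(void)
    {
        set_pixel(10, 20, BLACK);     /* draw a dot */
        set_pixel(10, 20, WHITE);     /* "erase" it by writing the background */
        printf("pixel (10,20) = %d\n", pixels[20][10]);
        return 0;
    }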
Chapter 3. Color
The perception of color is a complex problem that has been studied for hundreds of years. Creating a meaningful color display requires an understanding of the various physical and perceptual processes, including a fundamental understanding of the physical properties of light, human color perception, color CRT operation and computer models for color representation.
3.1 Light
Light is electromagnetic radiation, energy, that is emitted from light sources such as incandescent bulbs and the sun, reflected from the surfaces of objects, transmitted through objects, propagated through media such as air, and finally reaches our eyes. There are several models that have been proposed to describe the physical behavior of light. Generally, it is viewed as having both the properties of particles and waves. The model most appropriate for computer graphics describes light as an oscillating electromagnetic wave, as illustrated in Figure 3-1. As a wave, light
has the wave properties of frequency ν, wavelength λ, and amplitude.

[Figure 3-1. An Electromagnetic Wave: amplitude, wavelength, and propagation direction.]

It is known that frequency ν and wavelength λ are related by ν = c/λ, where c is the speed of light in the medium (for a vacuum, c = 2.998 x 10^8 meters/sec). Wavelength is usually given in units of micrometers, μm (10^-6 m), or nanometers, nm (10^-9 m). The electric field intensity of the wave varies sinusoidally in time and space as a function of amplitude and frequency.

Visible light, the light that the human eye can perceive and thus the light that we can see,
is only part of the entire electromagnetic spectrum, as shown in Figure 3-2 and Figure 3-3.

[Figure 3-2. The Electromagnetic Spectrum: gamma rays, X rays, ultraviolet, visible light, infrared, and microwave.]

[Figure 3-3. Colors in the Visible Light Spectrum: violet, blue, green, orange and red across roughly 350 to 750 nm.]

The term light for our purposes means electromagnetic radiation with wavelengths in the range 380 nm to 770 nm. Generally, light that we see is not just a single wavelength, but consists of a continuous, non-uniform distribution of single-wavelength (monochromatic) components. The graph of intensities versus wavelengths forms a spectral distribution, or spectral reflectance curve, as illustrated in Figure 3-4. The term spectral means variance with, or dependence upon, wavelength.

[Figure 3-4. A Spectral Distribution: energy intensity versus wavelength.]
The first four factors pertain to the physical properties of the objects and their surrounding environment, and can be precisely determined. The last three deal with the physiological characteristics of the human visual system and vary from person to person, even with normal visual systems. For example, individuals will discern different transition points at which an object appears either red or orange, even though the physical situation does not vary [Cychosz]. The human eye consists of a sensor array of photo-receptors known as the retina, and a lens system under muscle control that focuses the visual stimulation on the retina. The retina consists of two types of sensors: rods and cones. Rods are primarily used for night vision and respond achromatically to a given stimulus. Cones are used for color vision in normal lighting and respond primarily to wavelength. Color is interpreted by the brain based on the stimuli from three types of cones that are distributed over the retina. Each cone absorbs either reddish light, greenish light or bluish light. The distribution of cones is not uniform: sixty-four percent of the cones contain the red-absorbing pigment, thirty-two percent the green, and only two percent the blue. Figure 3-5 shows the relative sensitivities of the three cone types [Hunt 1987, page 13]. The stimulus information sent to the brain through the optic nerve by the receptors conveys a luminance (also called lightness, brightness and intensity) and two color ratios: a red-to-green ratio and a yellow-to-blue ratio. As a result of experimental data, it is believed that the eye responds primarily to intensities
of three color hues, red (around 650 nm), green (around 530 nm) and blue (around 460 nm).

[Figure 3-5. Spectral Response Curves for the Three Cone Types: relative sensitivity versus wavelength (nm).]

This suggests that colors that we perceive can be approximated by combining amounts of three primary colors: red, green and blue. This is sometimes referred to as the trichroma or tri-stimulus theory, and dates back to the 1800s. This theory is applied in the design of color display systems and color printing equipment.
components. Pixel memory can be configured in two ways, called direct color and mapped color. In Direct Color displays, the pixel value contains the color components (Figure 3-7).

[Figure 3-7. Direct Color Display: the pixel value at pixel (h,v) in pixel memory supplies the R, G, B components to the CRT guns.]

Pixel memory for direct color displays can be configured in different ways depending on the arrangement of the RGB components in computer memory. Some systems represent the RGB components for each pixel in one computer word. Others separate the RGB components into three separate arrays in memory, one each for red, green and blue. In the first method, the RGB components of a pixel are obtained by accessing the pixel value at the appropriate array location in pixel memory and then dissecting the pixel value with binary operations to obtain the color components. In the second method, the RGB components are individually accessed from the appropriate arrays. For example, consider the simplest direct color display with one bit for each color component. Each pixel would have 3 bits (i.e. 3 bits per pixel): one for red, one for green, and one for blue. The display would be capable of displaying the 8 colors shown below.

    R Component   G Component   B Component   Pixel Value   Color Name
    0             0             0             0             Black
    0             0             1             1             Blue
    0             1             0             2             Green
    0             1             1             3             Cyan
    1             0             0             4             Red
    1             0             1             5             Magenta
    1             1             0             6             Yellow
    1             1             1             7             White
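As an illustration of dissecting a pixel value with binary operations, the following C fragment extracts the R, G and B bits of the 3-bit pixel values tabulated above. The bit layout (R in bit 2, G in bit 1, B in bit 0) is implied by the pixel values in the table; real hardware layouts vary.

    #include <stdio.h>

    int main(void)
    {
        unsigned pixel_value = 5;                /* Magenta: R=1, G=0, B=1 */
        unsigned r = (pixel_value >> 2) & 1;     /* isolate the R bit */
        unsigned g = (pixel_value >> 1) & 1;     /* isolate the G bit */
        unsigned b =  pixel_value       & 1;     /* isolate the B bit */
        printf("pixel %u -> R=%u G=%u B=%u\n", pixel_value, r, g, b);
        return 0;
    }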
In Mapped Color displays, the pixel value is an index into a separate array of memory in the display called a Video Lookup Table (VLT), also called a Color Lookup Table (CLUT). Each entry of the VLT contains color components that are set by the computer. Each refresh cycle, the raster display processor accesses each pixel value from pixel memory and uses the RGB components
from the VLT location addressed by the pixel value for the pixel color.

[Figure 3-8. Mapped Color Display: the pixel value indexes a VLT entry whose RGB components drive the CRT guns.]

Mapped color systems provide a means for reducing the size of pixel memory (and therefore the cost of a system) when it is acceptable to choose a subset of colors from a large palette of possible colors. For example, most workstations today with color displays are mapped systems capable of displaying "256 colors chosen from a palette of 2^24 possible colors." (The number 2^24 is over 16.6 million.) To understand the physical configuration of such a display, break down the quoted phrase in the previous sentence. "256 colors" means 256 simultaneous colors on the screen at one time, so pixel memory must be able to represent 256 different values. Therefore, pixel memory must have log2(256), or 8, bits per pixel (i.e. 2^8 = 256). "Palette" is a term often used for the VLT, so "256 colors" also means that there must be 256 VLT entries that can be indexed by the pixel values. All that remains is the format of each VLT entry. The phrase "palette of 2^24 possible colors" tells us that each entry must be able to specify any one of 2^24 colors, i.e. each entry must contain 24 bits. The entry must be divided into three components for RGB, so we can safely assume that each VLT entry contains 24 divided by 3, or 8, bits per component. It would be very unusual not to use the same number of bits for each component.

[Figure 3-9. Configuration of a 256 Color Mapped Display System: pixel memory at 8 bits/pixel indexing a 256-entry VLT with 24 bits per entry.]

A direct color system capable of displaying 2^24 colors would require 24 bits per pixel. This is a significantly more costly system than a mapped display with a palette of 2^24 colors. The difference is sometimes subtle: the mapped system can display only 256 colors at once because each pixel can represent only 256 possible values. Each pixel value is mapped to a VLT entry, however, that can contain 2^24 possible values (colors). The 24-bit direct system, sometimes called a full color system, can display 2^24 colors at once, but has three times as much pixel memory (24 bits per pixel versus 8 bits per pixel). The additional memory required for the VLT is generally insignificant. In this example, 256 24-bit words is small compared to 512 by 512 or 1024 by 1024 pixels in pixel memory.
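The mapped-color lookup can be sketched in C as below: drawing stores an 8-bit index in pixel memory, and the displayed color comes from the VLT entry at that index. The array names and sizes are illustrative assumptions, not the registers of any real display.

    #include <stdio.h>

    static unsigned char vlt[256][3];        /* 256 entries, 24 bits each */
    static unsigned char pixels[512][512];   /* 8 bits per pixel          */

    int main(void)
    {
        /* The program loads a palette entry...                           */
        vlt[17][0] = 255; vlt[17][1] = 128; vlt[17][2] = 0;   /* orange   */

        /* ...and draws by storing the index, not the color itself.       */
        pixels[100][200] = 17;

        /* Each refresh cycle the display processor performs the lookup:  */
        unsigned char v = pixels[100][200];
        printf("R=%u G=%u B=%u\n", vlt[v][0], vlt[v][1], vlt[v][2]);
        return 0;
    }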
[Figure 3-10. Primary and Secondary CRT Colors: red, green and blue primaries; cyan, magenta and yellow secondaries; R+G+B = white.]

A painter mixes yellow and blue (really cyan) to make green. This does not agree with CRT color, where green is a primary. The difference is that mixing paint and CRT color production involve different physical processes. Consider this from the point of view of light that reaches the
eyes. For paint, we see reflected light resulting from incident light (typically white light, light containing all colors) reflecting from the surface of the paint. Physically, light represented by a certain spectral curve irradiates the surface from a certain direction and interacts with it in complex ways. Some of the light may be absorbed by the surface, some may be transmitted through the surface, and some reflects from the surface to our eyes. These interactions alter the spectral properties of the light, changing its spectral curve and therefore its perceived color. The reflected light has different spectral properties (color and intensity) than the incident light. Consequently, the color of paint is reflected light from which certain amounts of each RGB component from the incident light have been removed (or filtered), thereby changing the light that reaches our eyes. The so-called subtractive primaries, cyan, magenta, yellow (CMY), each actually filter one trichromatic component (RGB). For example, consider what color is produced when white light shines through a layer of cyan cellophane and a layer of magenta cellophane (Figure 3-11). The cyan cellophane filters or blocks the red component, and the magenta cellophane blocks the green. Therefore, only blue passes through both layers.

[Figure 3-11. Light Transmitted Through Cellophane: white light (R, G, B) passing through cyan and then magenta layers, leaving only blue.]
characteristics of color perception. Generally, these are mathematical and computational models, color models, that determine three color parameters for the display in terms of either a spectral curve for a color, or in terms of three color parameters in a different color representation. Some color models can be shown as a three-dimensional volume, the color space, in which colors are points defined by the three color parameters, the color coordinates. The color spaces can be classified in three ways: perceptually-based uniform spaces, perceptually-based non-uniform spaces, and device-based spaces. Perceptually-based color spaces provide easy manipulation of color from a qualitative point of view and are useful for developing user-interfaces to specify colors for graphical applications. Uniform color spaces vary color uniformly across the space so that the distance between two colors is proportional to the perceived color difference. This is important for accurately simulating physical situations, such as theatrical lighting, and for accurate color comparisons. Device-based color spaces are founded on particular stimulus characteristics of a given device or color presentation medium. These spaces are developed statistically using color matching schemes and, in reality, only approximate a uniform space. The following sections present a number of popular color spaces: device-based color spaces (RGB, CMY, CMYK, YIQ and XYZ), perceptually-based color spaces (HSI and HSV), and perceptually-based uniform color spaces (Luv and Lab).
the range [0,1]. The RGB space is the fundamental color representation for all colors that are to be displayed on a CRT. Eventually, colors in all other color spaces must be transformed into RGB coordinates for display.
The color Cyan in RGB coordinates is (0,1,1), 1.0 intensity Green and 1.0 intensity Blue. Now consider how to print Red. For this, we must compute the CMY coordinates from the RGB. Red is (1,0,0) in RGB space, so

   C = 1 - R = 1 - 1 = 0
   M = 1 - G = 1 - 0 = 1
   Y = 1 - B = 1 - 0 = 1

The color Red in CMY coordinates is (0,1,1), 1.0 intensity Magenta and 1.0 intensity Yellow. Black is produced by equal amounts of CMY, which means combining equal amounts of three paints or three inks. In practice, this results in an unpleasant dark-brown color as well as consuming ink (or paint) from all three colors. Some printing devices include a black ink cartridge and consider black as a separate color. This is the CMYK color space, where K stands for black. Unfortunately, it is a four-dimensional space that is not readily illustrated. The conversion formulae for CMY to CMYK are [Foley]:

   K = min( C, M, Y )
   C' = C - K
   M' = M - K
   Y' = Y - K

and for CMYK to CMY:

   C = C' + K
   M = M' + K
   Y = Y' + K

It is necessary to go through CMY to convert between CMYK and RGB.
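These conversions are simple enough to code directly. The following C sketch (the routine names are illustrative, not from any package described in these notes) applies the complement and black-extraction formulas above, with all components in [0,1]:

    /* RGB to CMY: each CMY component removes its RGB complement. */
    void RGBtoCMY( float r, float g, float b, float *c, float *m, float *y )
    {
        *c = 1.0f - r;
        *m = 1.0f - g;
        *y = 1.0f - b;
    }

    /* CMY to CMYK: pull the common (black) component out into K. */
    void CMYtoCMYK( float c, float m, float y,
                    float *cp, float *mp, float *yp, float *k )
    {
        *k = c;                      /* K = min(C,M,Y) */
        if( m < *k ) *k = m;
        if( y < *k ) *k = y;
        *cp = c - *k;
        *mp = m - *k;
        *yp = y - *k;
    }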
RGB space through a linear transformation that is compactly represented in the matrix equation:

   Y = 0.299 R + 0.587 G + 0.114 B
   I = 0.596 R - 0.275 G - 0.321 B
   Q = 0.212 R - 0.523 G + 0.311 B

The conversion from YIQ to RGB involves the inverse of the matrix:

   R = Y + 0.956 I + 0.620 Q
   G = Y - 0.272 I - 0.647 Q
   B = Y - 1.108 I + 1.705 Q

The Y coordinate is the luminance, the brightness perceived by the viewer, and is the only signal received by a black-and-white television. The I and Q components are two chrominance values that determine the color of a pixel. They are derived from the modulation technique used in NTSC broadcast signals.
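As a sketch (again with an illustrative routine name), the conversion is a direct transcription of the matrix:

    /* Convert RGB (each in [0,1]) to NTSC YIQ using the matrix above. */
    void RGBtoYIQ( float r, float g, float b, float *y, float *i, float *q )
    {
        *y = 0.299f * r + 0.587f * g + 0.114f * b;
        *i = 0.596f * r - 0.275f * g - 0.321f * b;
        *q = 0.212f * r - 0.523f * g + 0.311f * b;
    }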
[Figure 3-14. CIE Spectral Matching Curves: the X, Y and Z matching curves plotted against wavelength (nm) from 380 to 780.]
properties. The Y(λ) curve matches the luminance response curve of the human eye, i.e. the eye's sensitivity to spectral wavelengths. These color matching functions are tabulated at 1 nm
wavelength intervals [Wyszecki and Hall]. The CIE coordinates, (X,Y,Z), are computed from the spectral curve of a given color, I(λ), as integrals,

   X = k ∫ I(λ) X(λ) dλ
   Y = k ∫ I(λ) Y(λ) dλ
   Z = k ∫ I(λ) Z(λ) dλ

where k is a scaling constant appropriate for the lighting situation [Foley, p. 580]. In practice, I(λ) is tabulated at 1 nm intervals and the integration is done numerically by summation. The XYZ coordinates include luminance information in addition to chroma. To remove the luminance factor, XYZ coordinates can be scaled to normalized values, i.e. each coordinate is divided by the sum X+Y+Z, resulting in xyz coordinates (lower case), called chromaticities. Note that x+y+z = 1 by definition, so only the x and y coordinates are needed. The two-dimensional plot of these (x,y) values is called a chromaticity diagram and bounds the colors in the XYZ color space. The CIE xyz chromaticity diagram is shown in Figure 3-15.
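The normalization is one line of arithmetic per coordinate; a minimal C sketch (hypothetical routine name) is:

    /* Chromaticities: scale XYZ so that x + y + z = 1. */
    void XYZtoChromaticity( float X, float Y, float Z, float *x, float *y )
    {
        float sum = X + Y + Z;

        *x = X / sum;
        *y = Y / sum;    /* z = 1 - x - y need not be stored */
    }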
[Figure 3-15. CIE xyz Chromaticity Diagram: colors plotted by x chromaticity and y chromaticity.]
The chromaticities of a color can be measured with an incident light chromaticity meter. This can be done for a color monitor to calibrate its red, green, blue and white colors for precise color presentation. From the measured chromaticities it is possible to compute the transformation between RGB and XYZ. Mathematically, the chromaticities are the (x,y,z) values corresponding to the RGB primaries (red=(1,0,0), etc.). These reference colors are solved to form the transformation
of the form

   [R]                [X]
   [G]  =  XYZtoRGB   [Y]
   [B]                [Z]

where XYZtoRGB is a 3 by 3 transformation. White is used to scale the values. The resulting transformation relates the monitor's RGB values and the standard XYZ coordinates. The NTSC provides a standard set of (x,y,z) chromaticities: red = (0.67, 0.33, 0.0), green = (0.21, 0.71, 0.08), blue = (0.14, 0.08, 0.78) and white = (0.313, 0.329, 0.358). The resulting transformations from XYZ to RGB are,

   R =  1.967 X - 0.548 Y - 0.297 Z
   G = -0.955 X + 1.938 Y - 0.027 Z
   B =  0.064 X - 0.130 Y + 0.982 Z

and from RGB to XYZ:

   X = 0.589 R + 0.179 G + 0.183 B
   Y = 0.290 R + 0.605 G + 0.104 B
   Z = 0.000 R + 0.068 G + 1.020 B
Saturation and chroma describe the purity of the color, or its colorfulness. The greater the presence of achromatic light (light with a horizontal spectral curve), the less saturated a color becomes. As saturation decreases, the dominant hue becomes more difficult to discern. Chroma is the colorfulness compared to white or the level of illumination. Gray is unsaturated, whereas pure monochromatic red of low intensity has high saturation and low chroma. Saturation generally is independent of intensity, whereas chroma depends on intensity. Lightness and brightness are often used in describing color. Brightness varies from black to bright, representing the strength of the stimulation as if the stimulant is self-luminous. Lightness, on the other hand, varies from black to white, representing the reflected intensity of a diffuse surface that is externally illuminated. Thus, as the intensity of the illuminating source increases, the color of the surface becomes de-saturated as it approaches white in color. Here, we use intensity for lightness. The hue-saturation-intensity (HSI) color space, also called hue-saturation-lightness (HLS), is a popular perceptually-based color space that is a double cone with either flat sides (called a hexcone as shown in Figure 3-16) or as a rounded double inverted cone with curved sides.
[Figure 3-16. The HSI Double Hexcone: hue H around the cone (Green at 2/6 or 120 degrees, Cyan at 3/6 or 180 degrees, Blue at 4/6 or 240 degrees), saturation S radial, and intensity I from 0.0 (Black) through 0.5 to 1.0 (White).]
Colors
are points within the volume of the double cone. Some representations draw the space as a cylinder with black on the bottom surface and white on the top surface. Think of slicing the color cone at a given intensity, producing a two-dimensional color wheel of hues varying around and saturations varying radially. Hue (H) is the angular coordinate. It is represented by the angle around the equator of the cone, normalized to lie in the range [0,1] or [0,360]. Note that this does not follow physical wavelengths exactly. There is no reddish-blue in the electromagnetic spectrum. Saturation (S) is the radial coordinate. Full (1.0) saturation is what one might describe as a deep color. Colors with zero saturation, shades of gray, are located along the axis connecting the tips of the cone. Full saturation is on the outer surface of the cone. Intensity (I) is the vertical coordinate. Zero intensity of any hue and saturation is black. One-half intensity is along the equator. Full (1.0) intensity of any saturation and hue is white. The pure colors (fully saturated, without achromatic light) lie in the plane at half intensity. A color given in HSI coordinates is converted to RGB coordinates in a two step process. The first step is to compute the relative RGB values, (R',G',B'), using only the hue (H) coordinate. This is similar in nature to finding chromaticity ratios without regard to luminance. These values can be represented as periodic functions that cycle through the visible light spectrum as approximated by the RGB primaries. The linear versions of the functions, corresponding to the hexcone, are tabulated below and are graphed in Figure 3-17.

   H           R'       G'       B'
   0   - 1/6   1        6H       0
   1/6 - 2/6   2 - 6H   1        0
   2/6 - 3/6   0        1        6H - 2
   3/6 - 4/6   0        4 - 6H   1
   4/6 - 5/6   6H - 4   0        1
   5/6 - 1     1        0        6 - 6H
The relative values (R',G',B') are converted to RGB as follows:

   For 0 <= I <= 0.5:   R = 2 I ( 0.5 + S ( R' - 0.5 ) )
   For 0.5 < I <= 1:    R = 0.5 + S ( R' - 0.5 ) + ( 2 I - 1 )( 0.5 - S ( R' - 0.5 ) )

and similarly for G and B. As a check, with S = 1 and I = 0.5 these reduce to R = R', and with S = 0 they reduce to the gray value R = I.
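Putting the ratio functions and the intensity formulas together gives a complete HSI-to-RGB conversion. The sketch below is one way to code it (the routine names are mine, not from a particular package); it exploits the fact that the R' and B' curves are the G' curve shifted by plus and minus one third of a revolution:

    #include <math.h>

    /* Hexcone ratio function (the G' curve of the table above);
       R' and B' use the same curve shifted by +/- 1/3. */
    static float Ratio( float h )
    {
        h -= (float)floor( h );              /* wrap hue into [0,1) */
        if( h < 1.0f/6.0f ) return 6.0f * h;
        if( h < 3.0f/6.0f ) return 1.0f;
        if( h < 4.0f/6.0f ) return 4.0f - 6.0f * h;
        return 0.0f;
    }

    /* Apply the saturation and intensity formulas to one ratio. */
    static float Shade( float ratio, float s, float i )
    {
        float t = 0.5f + s * (ratio - 0.5f);

        if( i <= 0.5f )
            return 2.0f * i * t;
        return t + (2.0f * i - 1.0f) * (1.0f - t);
    }

    /* HSI to RGB; h, s, i and the results are all in [0,1]. */
    void HSItoRGB( float h, float s, float i, float *r, float *g, float *b )
    {
        *r = Shade( Ratio( h + 1.0f/3.0f ), s, i );
        *g = Shade( Ratio( h ),             s, i );
        *b = Shade( Ratio( h - 1.0f/3.0f ), s, i );
    }

Setting h = 0 and s = 1 and stepping i through 0.0, 0.25, 0.5, 0.75 and 1.0 reproduces the five shades of red tabulated in the example that follows.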
[Figure 3-17. R', G', B' Ratio Functions: each ratio plotted against H from 0 to 1.]
As an example, consider creating realistic images of objects by simulating the effects of light reflecting from surfaces. White light reflecting from an object of a certain color produces different shades of that color. The problem, then, is to produce a range of shades of a given color, i.e. different intensities of a basic hue and saturation. HSI is a convenient color space for these computations. Given the hue and saturation of the basic color, vary the intensity (I) between 0 and 1 and produce an RGB for each. For example, the following table shows five shades of the color red.

        I      R    G    B
   1    0.00   0.0  0.0  0.0
   2    0.25   0.5  0.0  0.0
   3    0.50   1.0  0.0  0.0
   4    0.75   1.0  0.5  0.5
   5    1.00   1.0  1.0  1.0
Examine the color paths through the five colors in the HSI and RGB color spaces (Figure
3-18).
[Figure 3-18. Color Paths of the Five Shades: the path from Black (1) through shades 2, 3 and 4 to White (5), shown in both the HSI double cone and the RGB cube.]
[Figure 3-19. HSV Color Space: hue H around the hexcone, saturation S radial, and value V vertical from 0.0 (Black).]
in a two step process. The first step is to compute the RGB ratios as described previously for the
HSI space. Given the RGB ratios, RGB coordinates are computed as follows:

   R = V ( 1 + S ( R' - 1 ) )
   G = V ( 1 + S ( G' - 1 ) )
   B = V ( 1 + S ( B' - 1 ) )
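A matching sketch for HSV, reusing the Ratio function from the HSI example above (again with hypothetical names):

    /* HSV to RGB: R = V(1 + S(R' - 1)), and likewise for G and B. */
    void HSVtoRGB( float h, float s, float v, float *r, float *g, float *b )
    {
        *r = v * (1.0f + s * (Ratio( h + 1.0f/3.0f ) - 1.0f));
        *g = v * (1.0f + s * (Ratio( h )             - 1.0f));
        *b = v * (1.0f + s * (Ratio( h - 1.0f/3.0f ) - 1.0f));
    }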
[X0,Y0,Z0] is the color of the white reference point. The equations assume reference white has been normalized such that Y0 = 1. The NTSC primaries can be used for standard monitors. Equal steps in L represent equal steps in the perceived lightness of related colors.
These equations can be solved for the conversion equations from CIE Luv to CIE XYZ:

   Y = Y0 ( 16 + L )^3 / 1560896

   X = 9 ( u + 13 L u'0 ) Y / [ 4 ( v + 13 L v'0 ) ]

   Z = Y ( 156 L - 3u - 39 L u'0 - 20v - 260 L v'0 ) / [ 4 ( v + 13 L v'0 ) ]

Note that the order of the equations is important (X and Z depend on the value of Y). It is necessary to convert between the CIE XYZ coordinates and RGB coordinates to display CIE Luv colors.
   b = 200 [ ( Y / Y0 )^(1/3) - ( Z / Z0 )^(1/3) ]

where [X0,Y0,Z0] are the XYZ coordinates of the white reference point.
Figure 3-20. Munsell Color System [Williams, p. 19].
The CIE Lab to CIE XYZ equations are:

   Y = Y0 ( 16 + L )^3 / 1560896

   X = X0 [ a^3 + 1500 a^2 ( Y / Y0 )^(1/3) + 750000 a ( Y / Y0 )^(2/3) + 500^3 ( Y / Y0 ) ] / 500^3

   Z = Z0 [ -b^3 + 600 b^2 ( Y / Y0 )^(1/3) - 120000 b ( Y / Y0 )^(2/3) + 200^3 ( Y / Y0 ) ] / 200^3
3.6 References
1. Cychosz, J.M., The Perception of Color and Representative Space, Purdue CADLAB Technical Report No. JC-05, December 12, 1992.
2. Hall, R., Illumination and Color in Computer Generated Imagery, Springer-Verlag, New York, New York, 1989.
3. Hunt, R. W. G., The Reproduction of Colour in Photography, Printing and Television, Fourth Edition, Fountain Press, Tolworth, England, 1987.
4. Incropera, F. P. and D. P. DeWitt, Fundamentals of Heat and Mass Transfer, Third Edition, John Wiley & Sons, New York, 1990.
5. Joblove, G. H. and D. Greenberg, "Color Spaces for Computer Graphics," Computer Graphics (SIGGRAPH 78 Proceedings), Vol. 12, No. 3, August 1978.
6. Munsell, A. H., A Color Notation, 10th ed., Baltimore, Md., 1946.
7. Murdoch, J. B., Illumination Engineering - From Edison's Lamp to the Laser, Macmillan Publishing Company, New York, 1985.
8. Murch, G. M., "Physiological Principles for the Effective Use of Color," IEEE Computer Graphics & Applications, November 1984, pp. 49-54.
9. Smith, A. R., "Color Gamut Transform Pairs," Computer Graphics (SIGGRAPH 78 Proceedings), Vol. 12, No. 3, August 1978.
10. Williams, R. G., Lighting for Color and Form, Pitman Publishing Corporation, New York, 1954.
11. Wyszecki, G. and W. Stiles, Color Science: Concepts and Methods, Quantitative Data and Formulae, Second Edition, Wiley, New York, 1982.
plot( x, y, pencode )
Plot moves the pen to the location (x,y) inches with respect to the origin, originally the lower left of the plotting page. If the value of pencode is DRAW (a numeric value equated to this symbolic constant), a line is drawn. If the value of pencode is MOVE, the pen is just moved, no line is drawn. Additionally, a negative pen code (-DRAW or -MOVE) moves the origin to the destination coordinates. Changing the plotting origin is convenient, for example, when making several side by side pages of graphs (in the x direction). After completing the first page, the statement call plot(11.0,0.0,-MOVE) positions the pen and the origin 11 inches to the right of the current origin, a convenient location for the next page. In general, changing the origin facilitates drawing objects whose coordinates are not conveniently expressed with respect to the lower left corner of the page.

symbol( x, y, height, string, angle, ns )
Symbol draws a string of characters, a series of characters along a line. The coordinates of the lower-left corner of the first character in string will be (x,y). The string's height in inches will be height. string is a character string, either a quoted list of characters, a character variable or a character array. The string will be plotted at angle degrees measured counterclockwise from the East direction. The argument ns specifies the number of characters plotted from string. In Figure 4-1, the program creates the plot shown at the right.

    plots();
    plot( 0.0, 0.0, MOVE );
    plot( 8.5, 0.0, DRAW );
    plot( 8.5, 11.0, DRAW );
    plot( 0.0, 11.0, DRAW );
    plot( 0.0, 0.0, DRAW );
    symbol( 1., 9., 1.5, "TEST", 0., 4 );
    plot( 0.0, 0.0, ENDPLOT );

[Figure 4-1. Example program and its plot: a page outline from (0,0) to (8.5,11) with the string TEST drawn at (1,9).]
[Figure 4-2. Possible Pen Steps: the step directions +x, -x, +y, -y and the diagonals such as (-x,-y) and (+x,-y), each of one step size.]
There are 8 possible steps corresponding to horizontal, vertical and diagonal pen motions. Support software in the plotting package, i.e. in plot, must create each line as a sequence of these steps. This is accomplished by a stepping algorithm, the most famous of which is the Bresenham algorithm [Bresenham]. Of particular note with this algorithm is that it requires only addition and subtraction of integers, so it can be implemented efficiently and inexpensively in software or hardware with simple processors. The first plotters accepted only simple binary commands that specified step directions and pen control, so line generation had to be done by the host computer software. Now, most plotters do this internally, so that only endpoint coordinates of the line need to be sent by the host computer.
[Figure 4-3. Stepping Algorithm Variables: the exact line from (0,0), the pen at position P in column n at row jn, the candidate points A (major step) and B (minor step) in column n+1, and the exact point E on the line.]
does not pass through the origin can be transformed into such a line using a simple change of variables. Assume that the pen has been moved (stepped) to the position labelled P. There are two possible next steps: a major diagonal step to point A and a minor horizontal step to point B. The term major indicates that the step involves both X and Y, whereas a minor step involves only one direction. The exact coordinates of the point on the line at x = n+1 is point E, whose y value is (n+1)(v/u). To decide which step, minor or major, would be closer to the exact line, first compute two positive vertical distances, AE and EB:

   AE = ( jn + 1 ) - ( n + 1 )(v/u)
   EB = ( n + 1 )(v/u) - jn

Let the quantity dn+1 be the scaled difference of these two distances (assuming u > 0):
   dn+1 = u { EB - AE }
        = u { [ ( n + 1 )(v/u) - jn ] - [ ( jn + 1 ) - ( n + 1 )(v/u) ] }
        = 2v( n + 1 ) - u( 2 jn + 1 )
For the previous step (substituting n-1 for n):

   dn = 2vn - u( 2 jn-1 + 1 )
Now express dn+1 in terms of dn to formulate the incremental stepping algorithm:

   dn+1 = dn + 2v - 2u ( jn - jn-1 )
Note that ( jn - jn-1 ) = 0 if dn < 0, which means take a minor step and dn+1 = dn + 2v, and ( jn - jn-1 ) = 1 if dn >= 0, which means take a major step and dn+1 = dn + 2v - 2u. For lines in the other quadrants (and when |dy| > |dx|), it is necessary to define the proper minor and major steps according to the particular plotter and its commands, and the proper u and v quantities using absolute values. For example, given a line with a negative x and positive y, i.e. with the endpoint in the second quadrant, the minor step would be -x and the major step would be -x+y. The quantities u and v must be computed as their first octant values. This algorithm is summarized in Figure 4-4.

    major = proper plotter command for diagonal step;
    minor = proper plotter command for minor step;
    u = max( abs(dx), abs(dy) );
    v = min( abs(dx), abs(dy) );
    d = 2v - u;
    for( counter = 0; counter < u; counter++ ) {
        if( d >= 0 ) {
            d = d + 2v - 2u;    /* or d += 2v - 2u; */
            PlotterCommand(major);
        } else {
            d = d + 2v;         /* or d += 2v; */
            PlotterCommand(minor);
        }
    }

Figure 4-4. Stepping Algorithm Logic.
Note that the quantities 2v - 2u and 2v are constants, and 2v = v+v. Figure 4-5 is an example of the execution of the algorithm. DX = 5, DY = 4, therefore: u = 5, v = 4, major = +x+y, minor = +x.

   n    dn    step     dn+1
   1     3    major     1
   2     1    major    -1
   3    -1    minor     7
   4     7    major     5
   5     5    major     3

[Figure 4-5. Example Steps: the exact line from (0,0) to (5,4) and the actual steps taken: major, major, minor, major, major.]
Figure 4-6 is an implementation of the stepping algorithm for a plotter whose step command is an integer in which each bit has the meaning shown in the figure. PlotterCommand is an operating system interface routine that sends the given integer, assumed to be a plotter command, to the plotter.
    /* plotter command bits:
          32      16      8     4     2     1
        PEN DN  PEN UP   +X    -X    +Y    -Y
       example: 001001 (binary) = 9 (decimal) = step +X -Y */

    u = dx;
    if( u < 0 ) {
        u = -u;                /* abs(dx) */
        major = 4;             /* -x */
    } else
        major = 8;             /* +x */
    v = dy;
    if( v < 0 ) {
        v = -v;                /* abs(dy) */
        major |= 1;            /* OR -y bit */
    } else
        major |= 2;            /* OR +y bit */
    if( u >= v ) {             /* major direction is x */
        minor = major & 12;    /* save only x bits */
    } else {                   /* else major direction is y */
        minor = major & 3;     /* save only y bits */
        d = u;                 /* and swap u & v */
        u = v;
        v = d;
    }
    count = u;
    v += v;                    /* 2v for minor d increment */
    d = v - u;                 /* initial d is 2v - u */
    u = d - u;                 /* 2v - 2u for major d increment */
    while( --count >= 0 ) {
        if( d >= 0 ) {
            d += u;
            PlotterCommand( major );
        } else {
            d += v;
            PlotterCommand( minor );
        }
    }
}

Figure 4-6. An Implementation of the Stepping Algorithm.
for use. The routines EndPlot, PenUp and PenDown are system interface routines that communicate with the plotter (on-line) or enter plotter control commands to a plot file (off-line).
/* global package data */
float xpen, ypen, xorg, yorg, factor, xres, yres;

void Plot( float x, float y, int pen )
{
    int dx, dy, penabs;
    float xold, yold;

    if( pen == ENDPLOT ) {
        Endplot();
        return;
    }
    xold = xpen;
    yold = ypen;
    xpen = x * factor + xorg;
    ypen = y * factor + yorg;
    penabs = pen;
    if( pen < 0 ) {
        penabs = -pen;
        xorg = xpen;
        yorg = ypen;
    }
    if( penabs == MOVE )
        Penup();
    if( penabs == DRAW )
        Pendown();
    dx = ceil( xres * (xpen - xold) );
    dy = ceil( yres * (ypen - yold) );
    Step( dx, dy );
}
4.3 Characters
To draw a character requires that its graphic appearance, or glyph, is pre-defined in some drawable form. It is necessary to create a general definition of each character that can be scaled and positioned as needed. We establish a definition grid, a layout convention for defining characters. A complete set of character definitions is called a font.
[Figure 4-8. Definition of Character A: the definition grid (columns 0 to 5, rows 0 to 7, with the baseline for the previous line above); characters start at (0,0) (lower-left).]

   Point   AX   AY   PEN
   1        0    0   MOVE
   2        2    6   DRAW
   3        4    0   DRAW
   4        1    3   MOVE
   5        3    3   DRAW

   Pen Motion Data for A

void A( float x, float y, float ht )
{
    static int ax[5] = { 0, 2, 4, 1, 3 };
    static int ay[5] = { 0, 6, 0, 3, 3 };
    static int pen[5] = { MOVE, DRAW, DRAW, MOVE, DRAW };
    float f = ht / 7;
    int i;

    for( i = 0; i < 5; ++i )
        Plot( x + f * ax[i], y + f * ay[i], pen[i] );
}

Figure 4-9. Routine to Draw the Character A.
This character is defined as strokes (lines). Other character definition formats exist, such as outline fonts, where each character is defined as a series of lines and curves that form a closed boundary. By convention, characters are represented in computers as an ordered set of integers that correspond to graphic symbols. One standard in use today is the American Standard Code for Information Interchange, ASCII. All characters are represented in 7 bits in an 8-bit byte with values from 0 to 127. Only 96 of these are graphic (letters, numbers, punctuation); the others are nongraphic, e.g. carriage return, beep, delete. For example,

   'A' = 101 octal = 65 decimal
   'a' = 141 octal = 97 decimal
The first approach to coding the full Symbol routine might be to create 96 individual routines, one per graphic character, and use a list of IF statements to decide which routine to call. This also illustrates the issue of character spacing (Figure 4-10).
void Symbol( float x, float y, float ht, char *string, int ns )
{
    float dx = 0.0;
    int i;

    for( i = 0; i < ns; ++i ) {
        if( string[i] == 'A' )
            A( x+dx, y, ht );
        else if( string[i] == 'B' )
            B( x+dx, y, ht );
        /* <etc.> */
        dx += ht * 6.0 / 7.0;
    }
}
Figure 4-10. IF Statement Version of Symbol.
The variable dx is the accumulated horizontal distance from the initial x location to the next character to be drawn. In this case, it is computed using a convention that the spacing between characters shall be 6/7 of their height (a convention in the Calcomp package). This is fixed width spacing. Another approach is variable width spacing, or proportional spacing, where the distance between characters depends on each individual character. In this example, this could be done by adding to each character definition a final point with coordinates (w,0) and pen code MOVE, where w is the horizontal distance, in definition grid units, to the start of the next character and would vary for each character. Thus, the letter i could move a smaller distance, i.e. be thinner, than the letter m.
[Figure 4-11. A Font Data Structure: the data arrays holding the data for A, the data for B, and so on, dimensioned by the maximum number of characters and the maximum number of points for ALL characters.]
1. ASCII order, fixed size. The character definitions in the data arrays must appear in increasing numerical order according to the ASCII codes. Also, the definitions must contain the same number of points, which we will assume is given in a predefined constant, CHARLEN. Therefore, the dimensions of xgrid, ygrid and pen are 128*CHARLEN (assuming 128 characters, with ASCII codes 0 to 127, will be stored). The data for a given character can be found using only the ASCII code. The index of the start of the definition for the character with ASCII code ascii is computed as CHARLEN * (ascii - 1), and the definition has CHARLEN points.
2. ASCII order, variable size. The definitions appear in increasing ASCII order, but the sizes of the definitions are allowed to vary. This is an improvement over the fixed size approach because now each character can be designed without concern for exceeding the size limit or wasting valuable data space. However, the starting index in the data arrays can no longer be computed from the ASCII code alone. Another data array must be created to store the starting indices in the data arrays: the start table. This table is indexed by the ASCII code. Because the definitions are in ASCII order, the end of one definition can be computed from the start of the definition for the character with the next ASCII code.
3. Unordered, variable size. Requiring ASCII order can be overly constraining. The data can be more easily created and edited, for example, if definitions are allowed to appear in any order. The definitions are no longer in ASCII order, so the end of a definition can no longer be computed as above. This makes it necessary to extend the table with another data array to store the end index of each definition, the end array. (Some may find it more convenient to store the size of each character.) The arrays start and end are indexed by the ASCII code. Figure 4-12 shows how routine Symbol could be coded using these data structures.
void Symbol( float x, float y, float ht, char *string, int ns )
{
    float f, dx;
    int ascii, i1, i2, i, j;
    static int xgrid[?]={...}, ygrid[?]={...}, pen[?]={...};
    static int start[?]={...}, end[?]={...};

    f = ht / 7.0;
    dx = 0.0;
    for( i = 0; i < ns; ++i ) {
        ascii = string[i];
        /* option 1: ASCII order, fixed size */
        i1 = CHARLEN*(ascii-1)+1;
        i2 = i1 + CHARLEN - 1;
        /* option 2: ASCII order, variable size */
        i1 = start[ascii];
        i2 = start[ascii+1]-1;
        /* option 3: unordered, variable size */
        i1 = start[ascii];
        i2 = end[ascii];
        for( j = i1; j <= i2; ++j )
            Plot( x+dx + f*xgrid[j], y + f*ygrid[j], pen[j] );
        dx = dx + ( ht * 6.0 / 7.0 );
    }
}

Figure 4-12. Symbol Using the Font Data Structures.
4.4.1 Rasterization
Consider the process of computing the raster units on a page that will be printed using an electrostatic plotter or laser printer. Imagine superimposing on the page a grid of all possible raster units at their respective raster locations. Place over the grid a 2D coordinate system where the centers of raster units are whole numbers. During window to viewport mapping, the endpoint coordinates of a line are carefully rounded to the nearest raster location. The grid can be represented as a two dimensional array of raster values that is indexed by the row number, i.e. number of units vertically from the bottom (or top) of the page, and the column number, i.e. the number of units horizontally from the left of the page. Thus, the rounded (x,y) coordinates can be used as array indices. This suggests that a computer implementation of rasterization involves a two dimensional array of integers representing all the values at raster locations. As shown in Figure 4-13, the problem of rasterizing a line given the starting and ending grid locations appears very similar to the plotter stepping problem.
[Figure 4-13. Raster Grid and a Line: a grid with x and y from 0 to 8, an exact line, and the exact coordinate (5.3, 8.4) rounded to raster location (5,8).]
In fact, the stepping algorithm
can be used almost directly, provided we give a different interpretation to a step. In the case of a plotter, a step was the motion of the stepping motors controlling the pen motion. A plotter step draws a small straight line that was horizontal, vertical, or diagonal. Rasterization for an electrostatic plotter creates some special problems in computer storage. Suppose we try a brute force method, creating an array to represent all raster locations on a printer page and then computing the entire raster array before printing it. This approach is called a Page System. A plot on an 11" high page, one inch wide, at 200 raster units/inch contains: 200 x 200 x 11 = 440,000 values. If we store all information in memory, we would need almost 500,000 words of memory. If we pack the data so that the on/off information for N raster units are stored in bits in a single word (16, 32 or more bits/word are typical), we still need around 7,333 to 13,750 words, just for that one inch. Using a page system approach is not practical in some cases when the computer system lacks sufficient memory. Another approach is to make sequential passes over the data, rasterizing only one small strip at a time, called Strip-Page Plotting. For each strip, which amounts to a rectangular region of the plotting page (like a viewport), convert all linework (and symbols) into raster data, send this strip of data to the plotter, move the strip, then repeat the process until we have passed over all the data (Figure 4-14).
[Figure 4-14. Strip-Page Plotting: a strip (viewport) passing over the page data.]
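To make the packing concrete, here is one conventional scheme (an illustrative sketch, not code from these notes) that stores one bit per raster unit, 32 to a word; with the page dimensions of the example above, the array comes out to the 13,750 words quoted:

    #define WIDTH  200      /* assumed: raster units per row (1 inch) */
    #define HEIGHT 2200     /* assumed: rows on an 11 inch page */

    static unsigned int raster[ (WIDTH * HEIGHT + 31) / 32 ];

    void SetRasterUnit( int x, int y )
    {
        int n = y * WIDTH + x;                 /* linear raster index */

        raster[ n >> 5 ] |= 1u << ( n & 31 );  /* set bit n mod 32 of word n/32 */
    }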
discussed previously, but now a step means changing running pixel coordinates and setting the pixel value. For example, a horizontal minor step means incrementing or decrementing the current h coordinate along the line, and setting the pixel value at this (h,v) to the desired pixel value. Thus, a line is drawn by operating on the computed set of pixels that best approximates the theoretical line between two pixels. Note that for a raster display, there is no concept of a line in pixel memory. The display data representing the image are only the pixel values in pixel memory that are being refreshed on the CRT screen. Drawing a line means rasterizing the concept of a line into the appropriate set of pixels in pixel memory. Only the software doing the rasterization understands that there is a line. In addition to lines, there are other drawing operations that can be performed on raster displays to take advantage of their dynamic nature. One common raster operation is filling an area, like a rectangle, with a given pixel value. The location and dimensions of a rectangle are specified in different ways, for example as the upper left (h,v) coordinates and the width and height (as in X windows), or alternatively the upper left and lower right coordinates (as in Macintosh Quickdraw). Given the rectangle data, we simply use nested loops to set every pixel within the rectangle boundaries, as sketched below. An interesting variation of the typical drawing operations is drawing pixel patterns or textures. Typically, a pattern is an ordered sequence of varying pixel values specified over an area. For example, 50% gray would be alternating black and white pixels across and down scan-lines. Patterns could be pictures or other sequences of rows and columns of pixel values that can be painted into an area. It is possible in most packages to specify a pattern for the pen that is applied to lines or filled areas. Now consider more general operations than just setting a pixel value to draw an object. In general, raster drawing involves computing the pixel locations for a given drawing operation. For these pixel locations, a given pixel value, called the source pixel, is copied into pixel memory at each computed pixel location, called the destination pixel. For example, to draw a line, destination pixel locations are computed using the Bresenham stepping algorithm, and the source pixel value, often called the current pen color by graphics packages, is copied into each of these locations. Filling is similar, but the destination pixels are more easily computed.
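For example, a rectangle fill in the X-windows style (upper left corner plus width and height) is just two nested loops; SetPixel here stands in for whatever pixel-memory write the package actually provides:

    void FillRect( int h, int v, int width, int height, int value )
    {
        int row, col;

        for( row = v; row < v + height; ++row )
            for( col = h; col < h + width; ++col )
                SetPixel( col, row, value );   /* write one destination pixel */
    }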
In principle, we could perform any arithmetic or logical operation on the source and/or destination pixel value, in addition to the one we have used to this point, copy. In fact, raster graphics packages typically provide a number of pixel transfer functions, as they are called. One that is particularly useful for interaction is the exclusive-or, or XOR. The XOR operation is a well known Boolean logic function whose binary logic table is shown in Figure 4-15. In short, XOR is binary addition without carry.

   A XOR B   A = 0   A = 1
   B = 0       0       1
   B = 1       1       0

Figure 4-15. XOR Truth Table.
An important property of the XOR operator is that XORing the same source value twice to a destination value restores the original destination value. That is, for two numbers A and B, A XOR (A XOR B) = B. A very useful application of XOR for dynamic graphics is to draw and erase an object by drawing it twice at the same location using the XOR transfer function. Doing this repeatedly while moving the object creates the effect of moving the object over other objects on the screen, called dragging, without the need to redraw the other objects. Notice that for the pixel values of black and white, this works neatly as illustrated in Figure 4-16. For white = 0 and black = 1:

   source     destination   source XOR destination
   black(1)   white(0)      black(1)
   black(1)   black(1)      white(0)
Figure 4-16. XOR Operations on a Black and White Display.
For color displays, where pixel values are essentially arbitrary integers, the XOR operation will produce arbitrary colors. Consider the case of a mapped display, where pixel values are color table indices. Suppose that the color white is the pixel value 5 and black is 6. Drawing a black line over a white area means the source pixel values are black, or 6, and the destination pixel values are white, or 5. Drawing the line in XOR mode means that the system XORs each source and destination pixel value for each pixel location along the line. In this case, 5 XOR 6 is 3 (5 is 101 binary,
6 is 110 binary, and 101 XOR 110 = 011 binary, or 3), so the resulting pixel values along the line are 3. But what color is pixel value 3? For a mapped display, it is the color in color table location 3, which may or may not be a color set by you! Drawing the same line in black a second time means that the pixel values will be 6 XOR 3, or 5, which restores the color white, as expected. XOR is useful only during temporary graphic operations where efficiency is essential, such as dragging. After XOR operations are finished, such as completing the dragging of an object, the entire image must be re-drawn with the copy transfer function to give the correct image.
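A dragging loop built on this property might look like the following sketch; SetTransferFunction, DrawObject, Dragging and GetMousePosition are all stand-ins for whatever routines a particular package actually provides:

    SetTransferFunction( XOR_MODE );
    DrawObject( x, y );                  /* object appears */
    while( Dragging() ) {
        DrawObject( x, y );              /* same pixels XORed again: erased */
        GetMousePosition( &x, &y );
        DrawObject( x, y );              /* appears at the new position */
    }
    DrawObject( x, y );                  /* erase the last XOR image */
    SetTransferFunction( COPY_MODE );
    RedrawImage();                       /* re-draw to restore correct colors */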
4.5 References
1. Bresenham, J. E., "Algorithm for Computer Control of a Digital Plotter," IBM Systems Journal, Vol. 4, No. 1, 1965.
DrawCharacter(ichar,h,v), which draws the character whose ASCII code is ichar at the location (h,v) on the screen using GRAFIC.
6. A low-cost laser printer must be sent an entire 8 by 10 inch page of pixels (in scan-line order). The page is printed after all pixels have been received. Each pixel is represented by a character: 'b' for black, 'w' for white. The printer has x and y resolutions of 300 pixels/inch. Describe the software needed to produce a printed page on this device, starting with an application program that calls the following routines:
   1. STARTPICTURE is called to initialize plotting.
   2. LINE( x1, y1, x2, y2 ) draws a straight line from (x1,y1) to (x2,y2), where the coordinates are given in (real) inches from the lower left of the page.
   3. ENDPICTURE is called to signify plotting is finished.
(A) Draw a block diagram showing the general organization of the software, including the stages of processing, types of data, and flow of information starting with the application program and ending with the printing process.
(B) Describe in detail the process of creating data for the printer after the application program has completed.
[Figure 5-1. Vector Display Configuration: the host computer driving a DPU that controls the CRT display.]
We study its programming manual and learn that the screen coordinates range from -2048 to +2047 in x and y, with [0,0] at the screen center. The instructions must be sent as binary data to the DPU, but for convenience to humans they have been given symbolic forms, or mnemonics.

   MOVEA x,y      move to the absolute 2D point (x,y)
   MOVER dx,dy    move by dx and dy
   DRAWA x,y      draw to the absolute point (x,y)
   DRAWR dx,dy    draw by dx and dy
Now we must create the display program that will draw the desired picture when sent to the DPU, in this case a square 100 units on a side and located at [100,100].

      Instruction        Comment
   1  MOVEA 100,100   ;  Move to (100,100)
   2  DRAWR 100,0     ;  Draw by (100,0)
   3  DRAWR 0,100     ;  Draw by (0,100)
   4  DRAWR -100,0    ;  Draw by (-100,0)
   5  DRAWR 0,-100    ;  Draw by (0,-100)

Figure 5-2. Display Instructions to Draw a 100 Unit Square.
[Figure 5-4. Beam Movement for First Display Program: the beam path of the square on the screen.]
What happens when the program ends? The display driver program sent the display instructions once, so the beam followed the path, causing the screen phosphor to glow along the path. However, as we have seen, the display fades quickly away in less than 1/30 second. It is necessary to refresh the display as long as the image should appear. The program must be edited to re-send the data periodically to correct this problem (Figure 5-5).

main()
{
    static int dlist[5] = { 0x8989, 0x9880, 0x9FEF, 0x9ABC, 0x99AE };

    /* enter an infinite loop to periodically re-send data */
    /* to the display */
    for( ;; ) {
        ToDisp( dlist, 5 );
        /* compute */
        WaitForNextRefreshCycle();
    }
}

Figure 5-5. Second Display Driving Program with Refresh Loop.
There are different refresh methods for displays. What we have just seen is on-line execution from the host. It is certainly the least expensive DPU we can buy, but the refresh burden is very demanding on our host computer. Consider also what happens if we add computations in the previous program at the comment. If these computations take too long, i.e. require more time than is left in the refresh cycle after the display instructions have been sent by ToDisp, then we again are faced with flicker. To relieve the host computer of its refresh burden, newer systems contain their own memory to store the display list. They do local refresh from this display memory, thus freeing the host computer to do other things (like compute).
the display:
[Figure 5-6. Vector Display with Memory: the host computer updates the DPU memory, and the DPU refreshes the CRT from that memory.]
The UPDATE arrow shows the interface carrying display instructions sent by the host computer to the display when the picture changes. The REFRESH arrow indicates the display instructions being executed from the DPU memory each refresh cycle to refresh the screen image, even if the picture has not changed. Thus, after sending the initial display program, the host does not have to re-send data if the picture remains the same. Display memory adds functionality to the DPU that is accessed using some new instructions:

   DHALT        stop the display and wait to restart
   DJSR addr    execute a display subroutine call
   DRET         return from a display subroutine call

As one example, the DPU refreshes the display from instructions stored in the display memory as follows:
1. The DPU starts at display memory location 1.
2. Instructions are read from sequential display memory locations and executed until a DHALT is executed.
3. After the DHALT, the DPU waits until the next refresh cycle time, then returns to step 1.
Now we write another display program, this time to draw two squares, starting at display memory location 1 (Figure 5-7).
   Instruction         Comment
   MOVEA 100,100   ;   Move to (100,100)
   DJSR BOX        ;   Draw the Box
   MOVEA 500,500   ;   Move to (500,500)
   DJSR BOX        ;   Draw another box
   DHALT           ;   Stop DPU
   DRAWR 100,0     ;   Draw by (100,0)
   DRAWR 0,100     ;   Draw by (0,100)
   DRAWR -100,0    ;   Draw by (-100,0)
   DRAWR 0,-100    ;   Draw by (0,-100)
   DRET            ;   Return

Figure 5-7. Two Squares Display Program.
data. We modify the support software in the host to perform this new data communication:

   ToDisp( int array[], int nwords, int addr );
      send nwords of display instructions to the display, loading them from array and storing them starting at display address addr.

The beam movement for this program is shown in Figure 5-9.

main()
{
    static int dlist[10] = {
        /* list of integers corresponding to the */
        /* display instructions in Figure 5-7 */
    };

    ToDisp( dlist, 10, 1 );
}

Figure 5-8. Two Squares Display Driver.
[Figure 5-9. Beam Movement for Two Squares Display Program: the two squares at (100,100) and (500,500).]
What happens when the program terminates? The boxes remain displayed because the display is performing the refresh of the display instructions from its own memory.
[Figure 5-10. Hardware for Dynamic Display Manipulation: potentiometers read through an analog-to-digital converter by the host, which updates the DPU memory refreshing the CRT.]
Our interface routines, developed as before, are:

   int ReadVolt( int ichan );
      returns a numerical value that is the current voltage read from input channel ichan.

   int MakeMove( int x, int y );
      returns a MOVEA x,y display instruction given an x and a y coordinate.

The new driving program is shown in Figure 5-11. What happens when the potentiometers are moved? The first box moves, because the program over-writes the MOVEA instruction at location 1 in display memory with a new instruction that was made by the MakeMove function from the x and y values read from the potentiometers.
main()
{
    static int dlist[10] = { /* data as before */ };
    int x, y;

    ToDisp( dlist, 10, 0 );
    for( ;; ) {
        x = ReadVolt( 1 );
        y = ReadVolt( 2 );
        /* Note: C arrays start at index 0 */
        dlist[0] = MakeMove( x, y );
        ToDisp( dlist, 1, 1 );
    }
}

Figure 5-11. Display Driver for Dynamic Display Manipulation.
Note that the display list is being changed as the DPU executes it. This is dynamic display manipulation. One can now think of many other desirable functions: changing the size of the box, adding new boxes, deleting boxes pointed to by a cursor that we move with the potentiometers, etc. This involves the development of a graphics package of routines which provide a variety of functions:
1. manage display memory,
2. provide drawing functions for creating entities like lines, characters, etc.,
3. provide logical organization of the display to permit adding, deleting, and other operations on groups of entities,
4. provide input functions for obtaining screen positions from devices.
   refresh   the data path where the DPU executes the display list,
   update    the data path where the host computer changes the display list.
Case 1: Remote Display Processor with Local Memory
This system was discussed earlier during the driving program examples.
[Figure 5-12. Remote DPU with Memory: the CPU connected over an interface to a DPU with its own memory, refreshing the CRT.]
Generally, this display processor can be a raster DPU or a random vector DPU. As a Random Vector DPU: DPU memory consists of structured display commands, a display list. As a Raster DPU: DPU memory is unstructured pixel memory. In some cases, in addition to controlling the beam to display pixels, the DPU can also act as a general purpose computer, accepting structured commands from the host, such as line and character. These commands are simply translated into operations on pixel memory. Their meaning is lost in DPU memory.

Case 2: The Raster Refresh Buffer
This is an interesting variation on a raster display that incorporates a computer into the display to simulate structured display operations. The DPU display list is maintained by DPU #1, an internal computer, which interprets structured instructions from the host, stores them in DPU memory #1 as a structured display list, and translates them into pixels in DPU memory #2, pixel memory. DPU #2 is a simple raster DPU that refreshes the screen from pixel memory. The translation process is termed repaint. The host computer can draw lines and then erase them. This causes DPU #1 to edit its display list, and then repaint pixel memory accordingly. The advantage of such a system is that a raster display now has structured commands, such
[Figure 5-13. Raster Refresh Buffer: the CPU updates DPU memory #1 (display list), which DPU #1 repaints into DPU memory #2 (pixel memory); DPU #2 (raster) refreshes the CRT from pixel memory.]
as lines and characters, and erasure. The display independently handles the repaint process that all raster drawing requires. The disadvantage is the cost and complexity of the additional hardware. Also, in the limit as the update rate approaches the refresh rate, such as in fast animation, one is limited by the speed of DPU #1.
[Figure 6-2. Window and Viewport Variables: the window in world coordinates (Xw,Yw) with center (wcx,wcy) and size wsx, wsy containing a point (xw,yw), and the viewport in device coordinates (Xv,Yv) with center (vcx,vcy) and size vsx, vsy containing the mapped point (xv,yv).]
Mapping world coordinates (xw,yw) to coordinates in viewport units, (xv,yv), using the window and viewport parameters involves three steps:
1. Translate to window coordinates relative to (wcx, wcy):
   x1 = xw - wcx
   y1 = yw - wcy
2. Scale from window to viewport coordinates relative to (vcx, vcy):
   x2 = x1 (vsx/wsx)
   y2 = y1 (vsy/wsy)
3. Translate to viewport coordinates relative to the viewport origin:
   xv = x2 + vcx
   yv = y2 + vcy
Combine these into single equations:
   xv = (xw - wcx)(vsx/wsx) + vcx
   yv = (yw - wcy)(vsy/wsy) + vcy
As a check, examine the units of the last equation for xv and yv:
   (device units) = (data units - data units) (device units / data units) + (device units)
As expected, we get (device units) = (device units).
As a further validation select a test point, the upper right corner of the window, and map it to viewport coordinates. The coordinates of the upper right corner are (wcx+wsx, wcy+wsy). Substituting these for xw and yw into the equations above,

   xv = (wcx+wsx - wcx)(vsx/wsx) + vcx = vsx + vcx
   yv = (wcy+wsy - wcy)(vsy/wsy) + vcy = vsy + vcy

These are the expected values.
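In code, the combined equations are a few lines. This sketch assumes the window and viewport parameters are package globals, in the style of the Plot routine earlier:

    float wcx, wcy, wsx, wsy;    /* window center and size */
    float vcx, vcy, vsx, vsy;    /* viewport center and size */

    /* Map a world point (xw, yw) to viewport coordinates. */
    void WindowToViewport( float xw, float yw, float *xv, float *yv )
    {
        *xv = (xw - wcx) * (vsx / wsx) + vcx;
        *yv = (yw - wcy) * (vsy / wsy) + vcy;
    }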
Figure 6-3. Distortion Due to Unequal Scaling in X and Y.
This can be avoided by altering the mapping process to perform uniform scaling. This involves using a single scale factor, factor = min( vsx/wsx, vsy/wsy ), in place of the two separate factors in the second step. In this case, the window boundary will map on or inside the viewport boundary. With uniform scaling, the window may map into a rectangle smaller than the viewport, depending upon the aspect ratios of the two. Using the center of the window and viewport as the reference point in the equations above causes the data to remain centered in the viewport. In many applications, the window is computed from the given data to be the smallest enclosing rectangle, often called the extent. The extent is computed by scanning the data coordinates to find the extrema for x and y, which are the coordinates of the diagonals of the
bounding rectangle: (xmin, ymin) and (xmax, ymax). From these coordinates, the appropriate window variables are easily found.
[Figure: the extent rectangle enclosing the data, with corners (xmin,ymin) and (xmax,ymax).]
6.2 Clipping
Window to viewport mapping transforms coordinates from one coordinate system to another. Coordinates outside the window map to coordinates outside the viewport. It is often desirable or necessary to remove or clip portions of the data that lie outside a given boundary. At first glance, one might be tempted to write a routine with a long and complicated sequence of if tests, each dealing with a special case where the line is outside the window. Fortunately, there is a more organized approach. An elegant algorithm, known as the Cohen-Sutherland clipping algorithm, was developed for line clipping. The algorithm is based on a special integer code, called a region code or endpoint code, that is computed for each endpoint of a line. This code compactly designates the boundary of the 2D region or zone in which the point lies, and has been designed for ease of use and efficiency in handling special cases.
is shown below:

   TOP     1000 binary  (8 decimal)
   BOTTOM  0100 binary  (4 decimal)
   RIGHT   0010 binary  (2 decimal)
   LEFT    0001 binary  (1 decimal)
   IN      0

The region codes are computed based on these conditions as shown in Figure 6-5.

   1001 | 1000 | 1010
   -----+------+-----  ymax
   0001 | 0000 | 0010
   -----+------+-----  ymin
   0101 | 0100 | 0110
      xmin    xmax

Figure 6-5. Clipping Region Codes.
Code to compute the clipping region code is illustrated in Figure 6-6.

    code = IN;
    if( x < xmin )
        code = LEFT;
    else if( x > xmax )
        code = RIGHT;
    if( y < ymin )
        code = code + BOTTOM;
    else if( y > ymax )
        code = code + TOP;

Figure 6-6. Clipping Region Computations.
Note that points on the boundary of the visible region, as well as those inside, have a region code of IN. Instead of attempting to detect and process the many possible cases of lines with endpoints in different regions, the algorithm uses the region codes to determine if the line is entirely visible or entirely invisible. If not, then one of the endpoints that is outside any boundary is re-computed to be the intersection of the line and the boundary, sometimes called pushing the endpoint to the boundary. This clipped line is then subjected to the same process. This test and clip iteration continues until the re-computed line is entirely visible or invisible. If the resulting line is visible, it
is drawn. If it is invisible, it is ignored. The region codes greatly simplify and speed the visibility tests on a line. A line is entirely visible when both endpoint codes are zero, a simple and fast computation. The value of the region bit-coding scheme is seen in the test for an entirely invisible line. Without the region codes, one could code this test as an unwieldy statement of the form: if both endpoints are above the top boundary of the visible region, or if both endpoints are below the bottom boundary of the visible region, or if both endpoints are left of the left boundary of the visible region, or if both endpoints are right of the right boundary of the visible region, then the line is entirely invisible. With the region codes, however, this test is simple. Notice that the condition of both endpoints violating a boundary, i.e. above the top, below the bottom, right of right, left of left, means that the region codes will have 1s in corresponding bit positions. This means that the integer AND of the two region codes will be non-zero. In C, the AND computation is the & operator:

   result = i1 & i2;

The invisibility test reduces to a simple computation and test against zero. In effect, the AND operation is four parallel if statements.
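In code, both whole-line tests come down to one bitwise operation each; in this sketch RegionCode stands for the computation of Figure 6-6:

    code1 = RegionCode( x1, y1 );
    code2 = RegionCode( x2, y2 );
    if( (code1 | code2) == 0 ) {
        /* both codes are IN: the line is entirely visible */
    } else if( (code1 & code2) != 0 ) {
        /* both endpoints violate the same boundary: entirely invisible */
    } else {
        /* no decision yet: push an outside endpoint to a boundary and repeat */
    }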
be re-computed (clipped) as follows,

   y1 = [ ( y2 - y1 ) / ( x2 - x1 ) ] ( xmax - x1 ) + y1
   x1 = xmax

Note that the order is critical. The y1 computation must use the original x1 coordinate. Although the algorithm appears at first to clip unnecessarily many times, for example, a line may be clipped several times before being found to be outside the visible region, in fact it is efficient and very compact. Figure 6-7 illustrates how a line is processed during the iterations of the algorithm.

   Pass   Endpoints   Action
   0      1₀ - 2₀     Original line
   1      1₁ - 2₁     1₀ was LEFT
   2      1₂ - 2₂     2₁ was RIGHT
   3      1₃ - 2₃     1₂ was BELOW

[Figure 6-7. Illustration of the Iterations of the Clipping Algorithm: the endpoint versions 1₀ through 1₃ and 2₀ through 2₃ as the line is clipped in three passes.]
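One way to assemble the whole test-and-clip iteration is sketched below (my arrangement, not code from the notes); RegionCode is the computation of Figure 6-6, TOP, BOTTOM, RIGHT and LEFT are the code bits, and xmin, xmax, ymin, ymax bound the visible region:

    /* Clip the line in place; returns 1 if any visible portion remains. */
    int ClipLine( float *x1, float *y1, float *x2, float *y2 )
    {
        int c1 = RegionCode( *x1, *y1 );
        int c2 = RegionCode( *x2, *y2 );
        int c;
        float x, y;

        for( ;; ) {
            if( (c1 | c2) == 0 ) return 1;    /* entirely visible */
            if( (c1 & c2) != 0 ) return 0;    /* entirely invisible */
            c = c1 ? c1 : c2;                 /* an endpoint that is outside */
            if( c & TOP ) {
                x = *x1 + (*x2 - *x1) * (ymax - *y1) / (*y2 - *y1);
                y = ymax;
            } else if( c & BOTTOM ) {
                x = *x1 + (*x2 - *x1) * (ymin - *y1) / (*y2 - *y1);
                y = ymin;
            } else if( c & RIGHT ) {
                y = *y1 + (*y2 - *y1) * (xmax - *x1) / (*x2 - *x1);
                x = xmax;
            } else {                          /* LEFT */
                y = *y1 + (*y2 - *y1) * (xmin - *x1) / (*x2 - *x1);
                x = xmin;
            }
            if( c == c1 ) {
                *x1 = x;  *y1 = y;  c1 = RegionCode( x, y );
            } else {
                *x2 = x;  *y2 = y;  c2 = RegionCode( x, y );
            }
        }
    }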
6.3 2D Transformations
Window to viewport mapping involved coordinate transformations. In general, a transformation is a mapping:
1. of one set of points into another in a fixed reference frame (Figure 6-9), i.e. object transformations,
2. from one coordinate system to another (Figure 6-10), termed change of basis.
For now, focus on the object transformations, the mapping of a set of points (data) into another.
when the coordinates x and y are represented as row vectors. When the coordinates are represented as column vectors, the matrix equation appears as

   [x']   [a b] [x]
   [y'] = [d e] [y]

Define symbols for the vectors and matrix:

   V' = [x' y']    V = [x y]    M = [a d]
                                    [b e]

Using row vectors, the symbolic equation is V' = V M, and using column vectors, it is V' = M^T V, where M^T is the matrix transpose of M. Given the symbolic equation, it is generally apparent whether row or column vectors are in use. However, given only a matrix, it is necessary to know if this matrix is to be applied to row vectors (vector before matrix) or column vectors (vector after matrix). These notes will use row vectors for coordinates. For 2D operations, we are tempted to make M 2 by 2, but this disallows constants because all M elements multiply x or y. Expand the representations of M and V:

   V = [x y 1]    M = [a d]
                      [b e]
                      [c f]

This will work, but looking ahead we see that having M 3 by 2 will cause problems because we cannot invert a 3 by 2 matrix and we cannot multiply a 3 by 2 by a 3 by 2. For these reasons, we add a third column to M:

   M = [a d 0]    and the identity matrix    I = [1 0 0]
       [b e 0]                                   [0 1 0]
       [c f 1]                                   [0 0 1]
1. The translation transformation in functional form and matrix form:

   T( Tx,Ty ) = [1  0  0]
                [0  1  0]
                [Tx Ty 1]

2. The scaling transformation in functional form and matrix form:

   S( Sx,Sy ) = [Sx 0  0]
                [0  Sy 0]
                [0  0  1]

Note that all coordinates are scaled about the origin. This means that what we perceive as an object's position will change by scaling. Mirror images can be done (for some geometric forms) using negative scale factors. Distortion, or scaling with different factors in x and y, can be avoided simply by using a single scale factor Sx = Sy = S.
3. Rotation:
[Figure 6-13. Rotation: the point (x,y) at angle φ and distance R from the origin rotates by θ to (x',y').]
Given (x,y) and θ, rotate by the angle θ:

   x' = R cos( φ + θ )
   y' = R sin( φ + θ )

Expanding,

   x' = R ( cos φ cos θ - sin φ sin θ )
   y' = R ( sin φ cos θ + cos φ sin θ )

and then recognizing that x = R cos φ and y = R sin φ,

   x' = x cos θ - y sin θ
   y' = x sin θ + y cos θ

The rotation transformation in functional form and matrix form:

   R( θ ) = [ cos θ   sin θ   0]
            [-sin θ   cos θ   0]
            [ 0       0       1]

Like scaling, the rotation transformation is about the origin (Figure 6-14) so that an object's position will change.
[Figure 6-14. Rotation About the Origin: the object at (x,y) swings to a new position.]
For simplicity in writing and reading matrix equations, we generally write matrix transformations using the functional forms that were shown above:

   T( Tx,Ty )   for translation,
   S( Sx,Sy )   for scaling,
   R( θ )       for rotation.
The identity transformations, or null transformations, are those that produce the identity matrix: T( 0,0 ), S( 1,1 ), R( 0 ).
[Figure 6-15. Scaling About a Point: an object scaled about the reference point (a,b).]
First carry out the derivation algebraically:
1. Translate so that position (a,b) becomes the temporary origin. (a,b) could be any reference point:
   x1 = x - a
   y1 = y - b
2. Scale about the temporary origin:
   x2 = x1 Sx
   y2 = y1 Sy
3. Translate back so that the temporary origin returns to (a,b):
   x3 = x2 + a
   y3 = y2 + b
Notice that x3 is a function of x2, and x2 is a function of x1. Substituting:
   x3 = ( x - a ) Sx + a
   y3 = ( y - b ) Sy + b
The order is very important.
Any number of basic transformations can be concatenated into one pair of equations, eliminating the intermediate results that would have required extra data storage and computation. Concatenation is easy in matrix form. For the scaling example:
1. translate: T(-a,-b)
2. scale: S(Sx,Sy)
3. translate: T(a,b)
Combining these into a matrix equation:

   v3 = v2 T(a,b) = { v1 S(Sx,Sy) } T(a,b) = { v T(-a,-b) } S(Sx,Sy) T(a,b)

This last equation is easily converted to matrix form by substituting the appropriate matrices. We can define the composite transformation matrix M as:

   M = T(-a,-b) S(Sx,Sy) T(a,b)

and therefore, v3 = v M. Note again that the order of the matrices is critical! For practice, find M using the symbolic matrices:
   M = [ 1  0  0] [Sx 0  0] [1 0 0]
       [ 0  1  0] [0  Sy 0] [0 1 0]
       [-a -b  1] [0  0  1] [a b 1]
Multiplying out the matrix equation symbolically yields the two expected algebraic equations:

   x' = x Sx + ( -a Sx + a ) = ( x - a ) Sx + a
   y' = y Sy + ( -b Sy + b ) = ( y - b ) Sy + b
Thus, a single 3 by 3 matrix can represent any combination of basic 2D transformations in a compact, codable and simple form. In the computer, these are all numbers. Functional form provides a convenient way of understanding and even solving transformation equations without doing matrix multiplication. For example, given the transformation equation in functional form, one can visualize the resulting picture by performing each basic transformation by hand on the objects. Try this with the previous scaling about a point example. This is easier (and much faster on a test) than performing the arithmetic.
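In code, concatenation is an ordinary 3 by 3 matrix product. The sketch below (illustrative names, not package code) builds the scale-about-a-point composite by multiplying the three basic matrices in order:

    /* result = a * b for 3x3 row-vector transformation matrices. */
    void MatMul3( float a[3][3], float b[3][3], float result[3][3] )
    {
        float t[3][3];
        int i, j, k;

        for( i = 0; i < 3; ++i )
            for( j = 0; j < 3; ++j ) {
                t[i][j] = 0.0f;
                for( k = 0; k < 3; ++k )
                    t[i][j] += a[i][k] * b[k][j];
            }
        for( i = 0; i < 3; ++i )           /* copy through a temporary so  */
            for( j = 0; j < 3; ++j )       /* result may alias a or b      */
                result[i][j] = t[i][j];
    }

    /* M = T(-a,-b) S(Sx,Sy) T(a,b), built as M = (T1 * S) * T2. */
    void ScaleAboutPoint( float a, float b, float Sx, float Sy, float M[3][3] )
    {
        float T1[3][3] = { {1,0,0},  {0,1,0},  {-a,-b,1} };
        float S [3][3] = { {Sx,0,0}, {0,Sy,0}, {0,0,1} };
        float T2[3][3] = { {1,0,0},  {0,1,0},  {a,b,1} };

        MatMul3( T1, S, M );
        MatMul3( M, T2, M );
    }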
3. Often graphics packages map world data into "clipping coordinates" prior to performing 2D clipping in order to simplify the computations. The clipping coordinates of visible points lie between 0.0 and 1.0. After clipping, the endpoints of visible portions of lines are mapped to device coordinates and drawn. The process is: (1) apply the transformation [T1] to map world coordinates into clipping coordinates, (2) 2D clip, and (3) apply the transformation [T2] to map the clipped coordinates into device coordinates. Given a square window in world coordinates with center (wcx, wcy) and half-size ws and a square viewport in device coordinates with center (vcx, vcy) and half-size vs:
(a) Show the basic transformations forming [T1] in symbolic functional form.
(b) Show [T1] as a single matrix.
(c) Find the expressions for the clipping coordinates of the world data point midway between the window center and right boundary.
(d) Show in symbolic functional form the basic transformations forming [T2] for drawing on GRAFIC windows.
4. Window to viewport mapping transforms world data into device data based on a window (wcx,wcy,wsx,wsy) and a viewport (vcx,vcy,vsx,vsy). Derive this transformation as a single matrix with symbolic equations as its elements. Assume the X and Y data and device axes are parallel.
5. Draw a diagram illustrating endpoint codes for the line clipping algorithm. Explain how and why they are used.
6. Show the list of basic transformations in functional form that will make object B from object A in the following figure.
[Figure for exercise 6: objects A and B in the XY plane, with the points (0,0), (1,1) and (a,b) marked.]
7. Given the rotated window and horizontal viewport, derive the window to viewport mapping equations, first in functional transformation form, then in matrix form.
Apple Macintosh Toolbox
Apple Computer, Inc. developed a comprehensive suite of software for the Macintosh computer that first shipped in 1984. The software consists of hundreds of functions that are placed
in ROM (read-only memory) contained in each Macintosh [1]. The set of functions is called the Toolbox. A Macintosh program makes calls to Toolbox routines that are translated into system calls that enter the ROM code during execution.
Microsoft Windows
Very similar to the Macintosh Toolbox, Microsoft Windows [2] was developed to support window graphics on IBM compatible personal computers. The first version of Windows appeared shortly after the Macintosh, and it has become the standard window environment for the IBM compatible world. Windows consists of a user environment that provides operating system functionality to the computer and contains graphics routines that are linked to programs executing Windows function calls. The routines support applications programming.
X Window System
The X Window System [3] was developed jointly by Massachusetts Institute of Technology's Project Athena, Digital Equipment Corporation and other companies starting in 1987. It was designed to support interactive raster graphics on networks of workstations. It has been adopted as a standard by nearly every workstation manufacturer. X Windows is network-based, meaning programs executing on one workstation calling X routines can display graphical data and interact with devices on another workstation or several other workstations. This makes it possible to run a program on a remote computer and interact with it on your local computer. This is termed a client-server model, where the client (application) program on a remote computer sends coded commands according to a protocol, or messages, over a network to a graphics server on the local computer. The commands are executed by the server to produce graphical output and other functions. The functionality of the X Window System is made available to programmers through Xlib, a library containing hundreds of C language routines for various functions. The Xlib routines are considered low-level functions, meaning they provide basic programming access to the protocol commands. Several software packages have been developed on top of Xlib routines to provide high-level functionality, for example, the OSF/Motif [4] user and programming environment. High-level refers to more sophisticated functionality intended for software developer support. There are some common elements in each of these window systems:
There are some common elements in each of these window systems:

- windows are the fundamental organizational structure for drawing on the screen and communicating with the user,
- events convey information about actions (user inputs, system happenings) to the program,
- drawing primitives are lines, characters, fill regions, color and pixel operations.

The following sections discuss these common elements.
7.2 Windows
Windows are in essence the raster counterpart to segments. They have a similar intent to segments: to allow a programmer to structure graphical display information in a manner appropriate for efficient and clear presentation and editing. Some of the first window systems were implemented as tile systems, in which the tiles, usually rectangular areas on the screen, cannot overlap. In other words, the screen is divided into a number of non-overlapping regions, or tiled windows. Later window systems relax this restriction and allow the regions to overlap (Figure 7-1).
Figure 7-1. Tiles and Windows.

We can see that tile systems are considerably easier to implement. Tiles are a one-to-one mapping to screen pixels. Windows are a more difficult problem:
1. Each pixel can belong to a number of windows (Figure 7-2).
2. Only complex regions of windows may be visible. For example, the line a-b drawn in window A must be clipped against window B (Figure 7-3).
Figure 7-3. Overlapping Windows and Clipping.

3. There must be a means for ordering the windows. The stacking order of the previous figures, from the top, is B, C, A.

Windows are not only an organizational convenience for implementation. They are useful for efficient graphical communication with a user:
1. A window is a metaphorical means for looking at the infinite world of data, as if through a window: for example, scrolling, panning and zooming through a large picture or long document by moving the data through the window.
2. Windows are a means for presenting many sets of data, or different views of one data set, to a user, like a stack of papers on a desk. The user can peruse the stack, move papers around as needed, pick one to edit, place them side-by-side for comparison, etc.
3. Dialog windows facilitate temporary communication with the user, such as warnings and data prompts, by re-using precious screen space.
7.3 Events
Graphics programs are notified of user actions, such as device inputs, and other noteworthy happenings through messages called events. Events control the execution of interactive programs and are central elements in all graphics systems. In general, all graphics programs are passive, waiting for events caused by actions to cause them to react. At the heart of each graphics system is the event processor, which accumulates information and actions from devices connected to the computer and other sources, and places the data associated with the event (such as the key typed on the keyboard) at the end of the event queue. The typical graphics program contains at least one event loop that continually asks the question "is there an event for me to process?" In general, there is a function that removes an event from the head of the event queue and returns the data to the caller. For example, an event loop for a GRAFIC program is shown in Figure 7-4.

for( ;; ) {    /* forever */
    eventId = GNextEvent( &wid, &a, &b, &c );
    if( eventId == GKEYPRESS ) {
        if( a == 'a' )
            ;    /* key 'a' has been pressed on keyboard */
    }
    if( eventId == GUPDATE ) {
        /* process window update event */
    }
}

Figure 7-4. A GRAFIC Event Loop.

The function GNextEvent returns four pieces of data: the window in which the event occurred (wid), and three data associated with the particular event (a,b,c). In the example, when a key is typed on the keyboard, an event of type GKEYPRESS is returned and the first argument (called a in the figure) will contain the ASCII code of the key. When running the X Windows version of GRAFIC, GNextEvent calls Xlib's XNextEvent; in Macintosh GRAFIC, the Toolbox function GetNextEvent is called; and in Microsoft Windows GRAFIC, PeekMessage is used in conjunction with a window callback procedure to accumulate events for GNextEvent. There are typically dozens of event types.
7.4 Drawing

When a drawing primitive is executed, its appearance is determined by the drawing attributes in the current drawing state. For example, in GRAFIC, each window has a separate drawing state that defines the color and drawing mode for the next line, character or fill region that is created. Functions are available for changing the drawing state, such as GPenColor to set the color, and GPenMode to set the mode. In the Macintosh Toolbox, the drawing state is called the GrafPort; in Microsoft Windows, it is the device context; and in X Windows, it is the graphics context.

GPenColor( GBLACK );
GMoveTo( 100, 100 );
GLineTo( 400, 100 );
GPenMode( GXOR );
GMoveTo( 200, 100 );
GLineTo( 300, 100 );

Figure 7-5. GRAFIC Calls Using XOR and Resulting Display (the XOR segment from 200 to 300 erases part of the line from 100 to 400).
7.5 References
1. Apple Computer, Inc., Inside Macintosh, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1985.
2. Microsoft, Inc., Microsoft Windows 95 Programmer's Reference, Microsoft Press, Redmond, Washington, 1992.
3. Nye, A., Xlib Programming Manual for Version 11, Volumes One and Two, O'Reilly & Associates, Inc., 1990.
4. Open Software Foundation, OSF/Motif User's Guide, Prentice Hall, Englewood Cliffs, New Jersey, 1990.
A good GUI should follow and exploit the principle of least astonishment: Make the expected happen; make what happens expected. Exploit consistency and user anticipation.
Besides serving as position indicators, cursors can take different forms that are appropriate for certain tasks, such as positioning, busy, and error-condition cursors. For example, cross-hair cursors are useful for aligning objects (Figure 8-2).

Figure 8-2. Cross-Hair Cursor. The horizontal line moves up and down with the y cursor coordinate; the vertical line moves left and right with the x cursor coordinate.

Positioning constraints provide more exact control of cursor motion according to user-specified options. Constraining the motion of objects connected to the cursor to either the horizontal or vertical component of the cursor position facilitates accurate positioning operations (Figure 8-3).

Figure 8-3. Constrained Cursor Motion (unconstrained, horizontal constraint, and vertical constraint relative to a last point).

Constraining the cursor location to fixed grid locations, sometimes called snap to grid, means the cursor is moved to the nearest imaginary (or visible) grid location, computed for each cursor movement by rounding off the exact position to the nearest grid location (Figure 8-4).
8.2 Dragging
Feedback during positioning allows the user to preview the appearance of objects during re-sizing and moving operations. One of the most useful techniques for providing this feedback is to move objects with the cursor, called dragging. Dragging means drawing and erasing an object as its position is changed with the cursor position. With segment-based vector display systems, implementing dragging is relatively simple. The object to be dragged is placed in its own segment, distinct from segments displaying other data that will not change during dragging. As the cursor moves, the segment of the dragged object is continually erased and re-created, thereby creating the visual effect to the user of an object moving across the screen. In actuality, the object is not moving; the effect is created by sequentially erasing the object drawn at the old position and drawing the object at a new position. Although the principle is the same, implementing dragging with raster graphics systems requires special operations because there is a problem with the erasing steps. At first thought, erasing can be implemented easily by drawing over the previously drawn object in the background color (typically white). Considering the object to be a straight line, the visual appearance of dragging is illustrated in Figure 8-5. This effect is called the rubber-band line. A first algorithm for the rubber-band line with a raster graphics package might have the following steps:
1. draw the line from the start point to the old endpoint,
2. get the new cursor location, called the new endpoint,
3. redraw the old line, from start to old endpoint, in white to erase it,
4. assign the new endpoint coordinates to the old endpoint,
5. loop back to step 1.
Figure 8-5. Dragging a Rubber-Band Line with the Cursor.

The flaw in this approach is step 3, erasing the old line by drawing it in the background color, white. This operation replaces all pixels along the line with white, even those that belong to other lines passing through these pixels. For example, in Figure 8-6 we see the effect of erasing one line that crosses others, the classic "pixel in common" problem. As the dragging continues, the screen can become entirely erased.

Figure 8-6. Illustration of the Raster Erasing Problem (before and after erasing line A-B).

One inefficient and time-consuming solution is to redraw the entire image each time the line is redrawn. This will produce an annoying flash that will irritate the user. A better approach utilizes the drawing function exclusive-or (XOR), or the pen mode GXOR as it is called in GRAFIC. Using GXOR, an algorithm for the rubber-band line has the following steps:
1. draw the line from the start point to the old endpoint in GXOR mode,
2. get the new cursor location, called the new endpoint,
3. redraw the old line, from start to old endpoint, in GXOR mode to erase it,
4. assign the new endpoint coordinates to the old endpoint,
5. loop back to step 1.
Figure 8-7 shows an implementation of rubber-band line dragging using GRAFIC routines. The routine Drag is called from the main event loop when dragging should begin.
void Drag( int h0, int v0 )
{
    WID wid;
    int event, c, moved, oldh, oldv, newh, newv;

    moved = 0;
    oldh = h0;
    oldv = v0;
    GPenMode( GXOR );
    do {
        event = GNextEvent( &wid, &newh, &newv, &c );
        if( event == GMOUSEMOTION ) {
            if( moved != 0 ) {
                /* redraw (erase) old XOR line */
                GMoveTo( h0, v0 );
                GLineTo( oldh, oldv );
            }
            if( newh != oldh || newv != oldv || moved != 0 ) {
                /* draw new XOR line */
                GMoveTo( h0, v0 );
                GLineTo( newh, newv );
                oldh = newh;
                oldv = newv;
                moved = 1;
            }
        }
    } while( event != GBUTTONUP );

    /* redraw the final line in GCOPY mode for correct picture */
    /* it is also necessary to redraw the entire picture */
    GPenMode( GCOPY );
    GMoveTo( h0, v0 );
    GLineTo( newh, newv );
}

Figure 8-7. Rubber-Band Line Dragging Using GRAFIC Routines.
8.3 Picking
Picking is an important graphical input process in which the user identifies an object drawn on the screen using an input action. Typically, this is accomplished with a positioning device, such as a mouse, with which the user causes a cursor on the screen to move near an object of interest and then indicates the action of picking with another input, such as depressing a mouse button or typing a key. The program should acknowledge a successful pick (and a failed pick) by some response. The form of response is application dependent. Examples of good responses are:
1. Highlighting and unhighlighting the selected object(s). Highlighting means to graphically distinguish selected objects from others.
2. No response, when the reason is clear. For example, clicking the mouse button when the cursor is near nothing does not require a message saying "you picked nothing".

Examples of bad responses are:
1. Annoying, persistent or unclear actions, such as beeps, bells, and blinking messages.
2. No response, when the reason is unclear. For example, not allowing an object to be picked may need some response.

Picking is often (nearly always) made difficult by problems of ambiguous picks, when more than one object may be identified with the picking action. There are many alternatives for resolving ambiguous picks, depending on the application context and the screen context. Application context means the understanding in the user's mind based on the knowledge required to get this far. Screen context means the visible graphical environment the user sees at the time. In general, you (the developer) must have a user model in mind when designing the user interface. We now examine some basic approaches to picking objects.
Figure 8-8. Characteristic Locations for Picking Text (lower-left corner, bounding rectangle, center).

Absolute value computations are computationally faster and create a rectangular picking region that is nearly indistinguishable to the user from the circular region:

    (Xc − Xi)² + (Yc − Yi)² < d²          (Euclidean test)
    |Xc − Xi| < d  AND  |Yc − Yi| < d     (absolute value test)

Figure 8-9. Picking Regions for Euclidean and Absolute Value Tests.

Anywhere Along a Line: The problem is picking a bounded line (segment) with endpoints. We must combine the mathematical procedure for computing the distance to the unbounded line with the algorithmic process of limiting the picking region to the bounded portion of the line. The distance from a point to an unbounded line is classically computed as

    d = |Ax + By + C| / √(A² + B²)

where the line equation is Ax + By + C = 0, and A, B and C are found from the line endpoints by solving the simultaneous equations:

    Ax1 + By1 + C = 0
    Ax2 + By2 + C = 0

Figure 8-10. Distance d from the Point (x,y) to the Line through (x1,y1) and (x2,y2).

Note that this involves dealing with a number of special conditions (A=0 or B=0, for example). Another method, based on a vector solution that will be discussed later, is:

    d = |Δx·ly − Δy·lx| / √(Δx² + Δy²)

where:

    Δx = x2 − x1        Δy = y2 − y1
    lx = x − x1         ly = y − y1

Caution: the denominator above can be 0 when the line is actually a point. This test alone, however, has a picking region that extends infinitely along the line:
Figure 8-11. Picking Region for an Unbounded Line.

To bound the picking region, perform tests according to the Boolean expression:

    picking region = inside the smallest rectangle surrounding the line and its endpoints
                     AND within distance d of the unbounded line

This is shown visually in Figure 8-12.
Figure 8-12. Bounding the Picking Region: the Rectangle Test AND the Distance Test.
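Combining the two tests, a sketch of a pick routine for a bounded line might look as follows (names and signature are illustrative, not from the notes; d is the picking tolerance):

#include <math.h>

/* Return 1 if cursor (x,y) picks the segment (x1,y1)-(x2,y2)
   within tolerance d, using the vector distance test bounded by
   the segment's enclosing rectangle (expanded by d). */
int PickLine( double x, double y,
              double x1, double y1, double x2, double y2, double d )
{
    double dx = x2 - x1, dy = y2 - y1;
    double lx = x  - x1, ly = y  - y1;
    double len = sqrt( dx*dx + dy*dy );

    if( len == 0.0 )                       /* degenerate line: a point */
        return fabs( lx ) < d && fabs( ly ) < d;

    /* bounding-rectangle test, expanded by d on all sides */
    if( x < fmin( x1, x2 ) - d || x > fmax( x1, x2 ) + d ||
        y < fmin( y1, y2 ) - d || y > fmax( y1, y2 ) + d )
        return 0;

    /* perpendicular distance to the unbounded line */
    return fabs( dx*ly - dy*lx ) / len < d;
}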
For picking a circle of radius R, the picking region is an annular band of width 2d around the circle; we classify the cursor point with respect to the picking region, which can be expressed logically as: inside the outer circle (radius R+d) AND outside the inner circle (radius R−d), or more conveniently, inside the outer circle (radius R+d) AND NOT inside the inner circle (radius R−d). The second form simplifies the logic to one routine, Boolean InsideCircle( x, y, radius ).
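A sketch of this two-call form for a circle centered at (cx,cy) with radius R (hypothetical names; d is again the picking tolerance, and R is assumed larger than d):

static int InsideCircle( double x, double y,
                         double cx, double cy, double radius )
{
    double dx = x - cx, dy = y - cy;
    return dx*dx + dy*dy < radius*radius;
}

/* Annulus test: inside the outer circle AND NOT inside the inner one. */
int PickCircle( double x, double y,
                double cx, double cy, double R, double d )
{
    return InsideCircle( x, y, cx, cy, R + d ) &&
           !InsideCircle( x, y, cx, cy, R - d );
}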
Figure 8-14. Vertex Picking Examples.

Picking an object that is composed of lines means picking any of its lines.
Figure 8-15. Picking an Object Composed of Lines.

Picking an object by picking near its centroid, i.e., the geometric center of an object, can sometimes be useful (but not often). A bounding rectangle is the most often used picking region. This is visually understandable, but it can also be a gross over-approximation to the inside of an object (Figure 8-17).

Figure 8-17. Bounding Rectangles of Objects (the picking region can be much larger than the object).

Picking by number or name should be used only when numbers or names are inherent to the application, i.e., objects have names known by the user anyway. Even in this case, graphical picking should be permitted too.
Figure 8-18. Showing Object Names.

Picking inside objects is a sophisticated approach to picking that is useful for simple objects, such as rectangles, but is very complicated for other objects, especially those with curves.
Figure 8-19. Interiors of Simple and Complex Objects.

Inside/outside tests, or point containment tests, will be discussed in a later chapter. Picking inside a selection region is useful for selecting all objects in an area, usually a rectangle, on the screen. This requires special tests for the intersection of the selection region with the object region, either the bounding box, individual lines, or vertices (Figure 8-20).

Figure 8-20. Picking Objects Inside a Selection Region (drag and release the button at the opposite corner).
Noun-Verb Paradigm

Recent systems, first popularized by the Apple Macintosh graphical user interface, are based on a noun-verb paradigm. At first glance, the noun-verb paradigm appears to be a simple reversal of picks and commands, but in practice it greatly simplifies user interfaces. The user selects (and un-selects) objects (the nouns), and then executes the desired command (the verb). Noun-verb systems require that the program and the user maintain a selection, meaning a set of selected objects. The selection is highlighted to visually distinguish selected objects from unselected objects. For example, selected text is drawn inverted (or outlined by a darkened box, etc.), or a rectangle has tabs called handles drawn at its corners. Knowing the selection before executing the command allows the program to:
1. enable and disable menu items and change their text to be appropriate for the selected objects and more informative to the user,
2. change the action performed to depend upon the selected objects.
For example, a delete command can be disabled (grayed and not allowed to be executed by the user) when no object is selected, different commands can be enabled when different numbers or types of objects are selected, and so on.
Figure 9-1. Problem and Graphical Data.

Examples:

    Graphical Data                                   Problem Data
    Geometry                                         Geometry
    Visual Properties (color)                        Physical Properties (density)
    Graphics Package Information (windows, etc.)

The following development of data structures combines methods for representation of objects with the needs for graphical interaction. We will use the application idea of sketching objects consisting of lines as a basis for discussion of both the data structures and interaction.
Figure 9-2. Compacted Sequential List Storage (cells 0 through MAX-1; entities first, then unused free cells).

The variables start and free are pointers that indicate locations in the data that hold the first used entity and first free cell, respectively. In a CSL, start always points to the first cell, index 0, and is actually unneeded. In a CSL, entities come first, followed by free cells, in two contiguous blocks of cells.
The entities can be stored in several array(s) indexed by the same variable. For example, if one were storing a line drawing as a sequence of endpoints marked by pen codes, there would be three arrays (see Figure 9-3).
Figure 9-3. Points and Pen Codes Data Structure (parallel arrays x[], y[] and pen[], with free marking the first unused cell).

The ith entity consists of x[i], y[i] and pen[i]. In structured languages like C, C++ and Java, entities are more conveniently represented by structures or classes. For the example above, we could define the C data type Entity as a structure as follows:

typedef struct {
    int x, y;
    int pen;    /* constant meaning MOVE or DRAW */
} Entity;

If the size of the array is known at compile time, the data storage can be allocated statically as a single array of structures, data:

    Entity data[MAX];

If the necessary size must be computed or otherwise found at run time, the data array can be allocated dynamically:

    In C:    Entity* data = calloc( MAX, sizeof(Entity) );
    In C++:  Entity* data = new Entity[MAX];
Consider the four basic operations on our data structure: initialization, appending an entity, deleting an entity, and inserting an entity.

1. Initialization: Set free to point to the first cell.

2. Appending an entity: The general process is:
   1. obtain a free cell,
   2. append the cell to the last entity of the object,
   3. fill the cell with the entity data.
For our points example, assume that point newx,newy,newpen is to be appended to the object. Figure 9-4 shows a code fragment for performing this operation.
/* step 1 (obtain free cell) */
new = free;
if( new >= MAX )
    Error( "OUT OF SPACE" );

/* step 2 (append cell to last entity, remove from free) */
free = free + 1;

/* step 3 (fill the cell with the entity data) */
data[new].x = newx;
data[new].y = newy;
data[new].pen = newpen;
Figure 9-4. Appending an Entity to a CSL.

Note that for a CSL the statement free = free + 1 appends the entity to the object and removes the cell from free storage in one step.

3. Deleting an entity: If the cell is the last cell in the object, the process is simply decrementing the free pointer, as illustrated in Figure 9-5. If, however, the cell is not the last in the object, then we must compact the list to maintain the proper structure of the CSL for subsequent operations. For example, to delete the ith point, we must copy the cells from i+1 to free−1 up (to lower indices), or "shuffle up". This is necessary to fill the gap and to ensure that all free cells are in a contiguous block at the end of the data array(s). In our example, deleting the second point would result in the structure in Figure 9-6. The loop to perform the compaction is illustrated in Figure 9-7.
Figure 9-5. Deleting the Last Cell in a CSL.

Figure 9-6. Deleting an Entity (Point).

for( j = i + 1; j <= free - 1; j++ ) {
    /* the next statement can assign a whole struct in C or C++, or */
    /* may need several assignments if separate arrays are used     */
    data[j-1] = data[j];
}
free = free - 1;
Figure 9-7. Compaction Loop.

4. Inserting an entity: Inserting an entity between existing entities is like the reverse of the delete operation. To create a free cell before a given entity, it is necessary to copy the following entities down (to higher indices), or "shuffle down". To insert a point before the ith point in our structure, we must move the entities from i through free−1 down (assuming there is a free cell), then fill the ith cell with the new entity data. The loop to perform the expansion is illustrated in Figure 9-8.
for( j = free - 1; j >= i; j = j - 1 )
    data[j+1] = data[j];
free = free + 1;

Figure 9-8. Expansion Loop.
There is an important point to make regarding data structures in interactive graphics applications. It is essential to remember that changing the data does not change the display. That is, the program must maintain the integrity of the screen representation of the internal data the user has created by redrawing when necessary. Often during the development of an interactive graphical program, the developer carefully and correctly codes the change in a data structure in response to a user command, but mistakenly forgets to redraw the display to show the change to the user. For example, the user issues a delete line command, but the line stays on the screen. The user then tries to pick the line, but the program fails to find it because it is no longer in the data structure. Further editing finally causes a screen redraw, and the user is surprised to see the deleted line gone, finally. Conversely, another common error is to change the display but fail to change the data. For example, the user issues a delete line command, and the line disappears from the display. After another command causes a re-draw of the screen, the pesky line magically reappears because the re-draw traverses the data structure, displaying each entity.
Figure 9-9. Linked List Storage in Arrays (each cell holds the entity data x,y,pen and a next pointer; the last entity's next pointer is NULL, for end of list).

Figure 9-10. Diagram Form of a Linked List.

The boxes are entities or cells, and the arrows are the links. The arrow with a line is a NULL pointer. Now consider the basic operations, using the same example as before.
Figure 9-11. Initialization of a Linked List Structure.

One possible initialization loop is illustrated in Figure 9-12.
start = NULL;
for( i = 0; i < MAX-1; i++ )
    data[i].next = i + 1;
data[MAX-1].next = NULL;
free = 0;
Figure 9-12. Linked List Initialization.

2. Appending an entity: The three steps are the same: (1) get a free cell, (2) append the cell to the object, (3) fill the cell with the entity data. We now have an end condition to consider: when start is NULL, there is no last entity to append to. Figure 9-13 illustrates one approach to appending to a linked list. It may be more convenient and efficient to maintain a variable last that always points to the end of the list.

3. Deleting an entity: Deletion shows the real advantage of linked lists for applications requiring dynamic data structures, i.e. data structures that are created, deleted, and edited interactively. Assume we have picked the entity to be deleted and recorded a pointer to it in the variable here. The pointer prev indicates the previous entity, i.e., the entity that points to here. Note that the first entity will not have a prev entity, so we handle this specially as another end condition. Given start, we can find prev as shown in Figure 9-15. Now we are set to unlink the entity and return it to the free list as shown in Figure 9-16. The free list is generally unordered, so returning the cell to the front is the easiest method.
/* step 1: get a free cell */
new = free;
if( new == NULL )
    Error( "OUT OF SPACE" );
free = data[free].next;

/* step 2: find the last entity in the entity list */
if( start == NULL ) {
    start = new;
    last = NULL;
}
else {
    last = start;
    while( data[last].next != NULL )
        last = data[last].next;
}
if( last != NULL )
    data[last].next = new;

/* step 3: fill the cell with the entity data */
data[new].x = newx;
data[new].y = newy;
data[new].pen = newpen;
data[new].next = NULL;    /* don't forget to initialize next! */

Figure 9-13. Appending an Entity to a Linked List.
Inserting an entity is similar to appending one, so much so in fact that the two can be combined without too much difficulty. Assume that we have already obtained a free cell and stored the data in the entity pointed to by new, as illustrated in Figure 9-17.

Figure 9-17. Inserting a New Entity.

We use prev == NULL to mean "at the front". The procedure can be coded as shown in Figure 9-18:
/* link the entity into the object after the entity prev */
if( prev == NULL ) {
    data[new].next = start;
    start = new;
}
else {
    data[new].next = data[prev].next;
    data[prev].next = new;
}

Figure 9-18. Inserting an Entity into a Linked List.
(Declaring the link as Entity* next inside the Entity structure will not compile as shown, because the type name Entity is not yet complete at that point, but it is written this way for easier reading.) In the next figures, the previous code fragments are re-written to use the next pointer. Figure 9-19 shows initialization using pointers, Figure 9-20 shows appending an entity, deleting an entity using pointers is shown in Figure 9-21, and inserting an entity is shown in Figure 9-22.

start = NULL;    /* NULL is usually 0 for C/C++ pointers */
for( i = 0; i < MAX-1; i++ )
    data[i].next = &data[i + 1];
data[MAX-1].next = NULL;
free = &data[0];

Figure 9-19. Linked List Initialization Using Pointers.

Figure 9-20. Appending an Entity to a Linked List Using Pointers.

if( here == start )
    prev = NULL;
else {
    prev = start;
    while( prev->next != here )
        prev = prev->next;
}

/* first unlink the entity: */
if( prev == NULL )
    start = here->next;
else
    prev->next = here->next;

/* return the freed cell to the front of the free list */
here->next = free;
free = here;

Figure 9-21. Deleting an Entity Using Pointers.
A second data structure, the object table, in this case a CSL, contains pointers to the start and end of each object in the Entity data:

    Object objects[MAXOBJ];

For an Entity data structure that is a linked list, the object table needs only the start pointer and can appear as shown in Figure 9-23. All that is needed is the pointer to the first entity, called the head, of a list. The object table is simply a CSL of list head pointers.

Figure 9-23. Object Table Using Linked Lists for Entity Data.
Figure 9-24. Instances of Basic Objects in a Scene.

In addition to the object table and entity data, another data structure is needed to represent the instances of objects in the scene: an instance table, which stores data for each instance. For this, we could define a structure and type called Instance:

typedef struct {
    Object* object;    /* pointer to basic object */
    float tx, ty;      /* the translation */
    float sx, sy;      /* the scale factors */
    TPixel color;      /* GRAFIC color of the instance */
    /* etc. */
} Instance;

Instance instances[MAXINSTANCE];

The instance table for the scene in Figure 9-24 is illustrated in Figure 9-25. The instance table contains a reference to the object, i.e., a pointer to the object table or a string that is the name of the object that can be matched with the name stored in the object table. An integer array index for object is shown above, but a C/C++ pointer could be used as well, as illustrated in previous sections. The instance table also contains attributes for each instance, which can be any geometric, visual or other properties that modify the appearance, location, scale, rotation, etc., of one instance of an object. Each instance is a transformed copy of a basic object. One instance of a basic object has the same basic definition as another, i.e., each instance of a box contains the same entities (lines). Each instance, however, has a different appearance on the screen due to its different location, size, color, etc. Often, the location, size and orientation of an instance are represented in a transformation matrix using the basic transformations of translation, scaling and rotation, respectively.

instance table:  Instance instances[MAXINSTANCE]
    0:  box,      tx1,ty1, sx1,sy1, color1
    1:  box,      tx2,ty2, sx2,sy2, color2
    2:  box,      tx3,ty3, sx3,sy3, color3
    3:  triangle, tx4,ty4, sx4,sy4, color4

object table:  Object objects[MAXOBJ]
    box,      start
    triangle, start

entity storage:  Entity data[MAX]: x, y, pen

Figure 9-25. Instance Data Structures.

The parameters in the instance table modify the appearance of the basic object when this instance is drawn. The object is transformed geometrically and visually as it is drawn. For this example, the instance drawing loop is outlined in Figure 9-26. The coordinates are transformed by the instance transformation representing the location, size and orientation, and other visual properties can sometimes be implemented through simple graphics package calls.
for( i = 0; i < #instances; i = i + 1 ) {
    GPenColor( instances[i].color );
    BuildTransform( [M], instances[i].tx, instances[i].ty,
                    instances[i].sx, instances[i].sy );
    for( each point j of object i ) {
        transform (data[j].x, data[j].y) by [M] into (x, y);
        if( data[j].pen == MOVE )
            GMoveTo( x, y );
        else
            GLineTo( x, y );
    }
}
Figure 9-26. An Example Loop to Draw Instances Using GRAFIC.

The instance table can be represented using a CSL or linked list structure, although the CSL is adequate for this example. Picking objects on the screen means picking an instance of an object, which in turn means picking a transformed copy of an object. Thus, the cursor coordinates must be compared to the transformed coordinates of each particular instance of an object. Another approach that is more computationally efficient is to transform the screen coordinates of the cursor into the basic definition coordinates of each instance and then to compare these coordinates to the basic object coordinates. This involves multiplying the cursor coordinates by the inverse of the instance transformation of the object in question.
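For the translate-and-scale instances above, the inverse is simple; a sketch (hypothetical helper, assuming the instance transform scales by (sx,sy) and then translates by (tx,ty), with nonzero scale factors):

/* Map cursor coordinates into an instance's basic definition frame
   by inverting its scale-then-translate transformation. */
void CursorToBasic( const Instance* inst, double xc, double yc,
                    double* xb, double* yb )
{
    *xb = ( xc - inst->tx ) / inst->sx;
    *yb = ( yc - inst->ty ) / inst->sy;
}

The transformed cursor point (xb,yb) can then be tested against the untransformed basic object with the picking tests described earlier.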
Figure 9-27. Graphical Depiction of Hierarchical Relationships between Entities (each data node has parent, child, prev and next pointers, plus a pointer to its data and a transformation matrix).

In this representation, special significance is given to the parent-child relations. They represent a hierarchical relationship between instances that relates the position, size and orientation of each child relative to its parent. The prev and next pointers form a doubly-linked list of the children of a parent and have no such geometric significance. Figure 9-28 is an example of how an airplane could be modeled with such a structure.

Figure 9-28. Hierarchical Model of an Airplane (Airplane, Fuselage, Left Wing, Right Wing).
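A node type along these lines might be declared as follows (a sketch; the field and type names are illustrative, not from the notes):

typedef double Matrix[4][4];     /* assumed 4x4 transformation type */
typedef struct Data Data;        /* the node's geometry, defined elsewhere */

typedef struct Node {
    struct Node* parent;         /* hierarchical (geometric) relation */
    struct Node* child;          /* first child */
    struct Node* prev, * next;   /* doubly-linked list of siblings */
    Data*        data;           /* pointer to data */
    Matrix       xform;          /* transformation matrix */
} Node;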
What is the transformation in the data node? There are two possibilities: absolute (global) or relative (local). An absolute transformation maps the data from local coordinates to world coordinates. The airplane flies in world coordinates. Using absolute transformations, when the transformation of any node is changed, all children must be changed appropriately. This is awkward and somewhat defeats the hierarchical nature of the representation because there is really no significance to the parent-child relation. In essence, with absolute transformations each node is independent of the others. A relative (local) transformation maps the child's coordinates to its parent's coordinates. That is, each child transformation is relative to its untransformed parent. This localizes the transforms and makes the representation significantly more flexible. Transforming the airplane now requires changing only one matrix (and re-drawing the airplane, of course). The computation of the absolute transformation for each node is now done as part of the drawing process. Using relative transformations, the location, size and orientation of a wing that is defined in its own convenient coordinate system is defined by the location, size and orientation of the fuselage to which it is attached (i.e., its parent). Therefore, when traversing this hierarchical data structure, for example for drawing, it must be understood that transforming a node (data and associated transformation) in the tree also transforms all of its children. In other words, move the plane, and everything connected to the plane moves too. The parent-child relation is now a geometric association.

This hierarchical data structure is greatly aided by transformation matrices. For example, a division of a company has many designers working long hours developing the data for a wing. The designers naturally adopt a convenient coordinate reference frame in which to express the wing data, which we can consider for now as a list of points and pen codes. Similarly, another division of the company has developed the fuselage data and another the engine data, likewise selecting convenient reference frames (different from the others). Now, it is time to develop a model of the airplane with two identical wings (we'll use mirror images for our 2D graphical example, but this would not be done with 3D data) connected to the fuselage, each wing with two identical engines. It would be unreasonable (and unnecessary) to ask one division to recompute their data in the reference frame of the other. Instead, we need to relate the reference frames and use instance transformations to model the airplane structure as a hierarchy.

The wing data defined in a wing reference frame (coordinate system), W_W, is first positioned relative to a fuselage defined in its reference frame, F_F, by a transformation M_F/W (read as "transformation from wing to fuselage"). Similarly, an engine defined in an engine coordinate frame, E_E, is positioned relative to the wing by M_W/E. The coordinates of the wing expressed in the fuselage coordinate system, W_F, are:

    W_F = W_W · M_F/W

The coordinates of the engine expressed in the wing coordinate system, E_W, are:

    E_W = E_E · M_W/E

And now the engine can be expressed in the fuselage system as E_F:

    E_F = E_W · M_F/W = ( E_E · M_W/E ) · M_F/W

Notice that

    E_F = E_E · ( M_W/E · M_F/W ) = E_E · M_F/E    where M_F/E = M_W/E · M_F/W

This shows that as we descend from parent to child in the hierarchy, we can concatenate matrices to accumulate the global transformation. This concatenation process continues as we further descend the structure. Consider, however, the transformations of the children of one parent, such as the fuselage. Transformations of one child do not affect transformations of other children with the same parent. Consider the example illustrated in Figure 9-29. To compute the transformation relating the outboard engine left to the airplane, M_OELA, we must pre-concatenate the transformations of each parent, from airplane to outboard engine left:

    M_OELA = M_OEL · M_LW · M_F · M_A
Figure 9-29. Numerical Airplane Hierarchy Example.

As the structure is traversed for drawing, each point P is transformed by the current transformation matrix, the CTM: P' = P · CTM. (NOTE: this is for row vectors. For column vectors, the transformations and the order must be transposed.) However, examine the CTM for the Left Wing versus the Right Wing of the Fuselage. After drawing one child of a parent, the starting CTM for the next child is the parent's CTM. This process repeats as the child becomes the parent, and so on. If we wish to maintain the prior parent's CTM as we display each parent and its children in a depth-first manner, we must store all the prior CTMs in a LIFO (last-in first-out) CTM stack. A stack is simply an array structure that has a top element index, or stack pointer. The stack entries, in this case, are transformation matrices, so the stack can be implemented using a three-dimensional array CTMSTACK[i][j][k]. Most systems maintain a stack of transformations and use the top transformation as the CTM. Prior to concatenating a child's transformation with the CTM, we push the parent's CTM onto the stack. This is done by incrementing the stack pointer and copying the CTM into the stack entry at this pointer. The CTM is now a copy of the previous CTM, which is the next transformation on the stack. Subsequent transformations will concatenate to the new CTM. After the child has been drawn with its CTM, we must restore the prior CTM before drawing the next child by popping the top transformation off the stack, restoring the previous CTM as the top of the stack. This is done simply by decrementing the stack pointer. Figure 9-30 illustrates CTM tracing code for this hierarchical data structure.
void DrawObject( ObjectPointer object, TransformMatrix ctm )
{
    TransformMatrix newctm;    /* new matrix allocated each call */
    ObjectPointer child;

    /* newctm = object->xform * ctm; the per-call copy of newctm
       plays the role of the explicit CTM stack */
    MatrixMultiply( newctm, object->xform, ctm );
    DrawData( object->data, newctm );
    for( child = object->child; child != NULL; child = child->next )
        DrawObject( child, newctm );
}
Figure 9-30. Hierarchical Drawing Routine.

Figure 9-31. Example Hierarchy (A has children B and D; B has child C).

The execution of the routine in Figure 9-30 using the data in Figure 9-31 is traced in Figure 9-32.
DrawObject( A, I )
    newctm1 = MA · I
    DrawData( A, newctm1 )
    for( child = B )
        DrawObject( B, newctm1 )
            newctm2 = MB · newctm1
            DrawData( B, newctm2 )
            for( child = C )
                DrawObject( C, newctm2 )
                    newctm3 = MC · newctm2
                    DrawData( C, newctm3 )
                    ...
    for( child = D )
        DrawObject( D, newctm1 )
            newctm4 = MD · newctm1
            DrawData( D, newctm4 )

Figure 9-32. Trace of the Hierarchical Drawing Routine.
roof := BEGIN_STRUCTURE
    TRANSLATE BY -.05,.25;
    SCALE BY 1.4,.2;
    INSTANCE OF box;
END_STRUCTURE;

door := BEGIN_STRUCTURE
    TRANSLATE BY .12,0;
    SCALE BY .2,.8;
    INSTANCE OF box;
END_STRUCTURE;

house := BEGIN_STRUCTURE
    INSTANCE OF box;
    INSTANCE OF roof;
    INSTANCE OF door;
END_STRUCTURE;

houses := BEGIN_STRUCTURE
    INSTANCE OF house;
    TRANSLATE BY .4,0;
    INSTANCE OF house;
END_STRUCTURE;
Figure 9-33. The Evans & Sutherland PS300 Hierarchical Modeling Language (the houses structure instances house twice; each house instances box, roof and door).

Such representations have been termed scene graphs. A recent graphics system based on scene graphs is Java3D by Sun Microsystems. Java3D is a set of Java classes that provides sophisticated and efficient 3D graphics support for constructing a scene graph dynamically. The scene graph is a tree of objects descending from the root node, called the virtual universe. The system does all processing, and drawing, from the scene graph data.
1. Wire Frame (WF). All the lines (curves) of the object are drawn. This is the simplest computationally, but is spatially ambiguous. Can you see two different views in Figure 10-1? Is line C-D closer or farther away than line A-B?

Figure 10-1. Ambiguity in Wire Frame Images.

2. Hidden Lines Removed (HLR). Only parts of lines or curves that are not covered by other faces (surfaces) are drawn. HLR involves more complicated and time-consuming computations, but produces an image with less ambiguity. Is point A closer or farther away than B?

Figure 10-2. Figure 10-1 with Hidden Lines Removed.

3. Hidden Surfaces Removed (HSR). HSR output requires raster displays. Those parts of faces not hidden by other faces are shaded (filled with pixels) with colors that indicate the color and intensity of light reflecting from the face. This involves the most complex and time-consuming computations, and produces the least ambiguous images. Special lighting effects include reflection, refraction and shadows to make the image more realistic.
Shading is based on the intensity of light reflecting from the faces and indicates their relative angle to the eye.
Figure 10-4. Perspective Image Showing Converging Lines.

Images that do not use perspective projection, that is, those in which all parallel lines in 3D remain parallel in 2D, are termed orthographic projections. The most common use of orthographic projection, the three-view drawing, in essence ignores one coordinate when drawing each 2D orthographic view (Figure 10-5).
Figure 10-5. Orthographic Images in Engineering Drawing Arrangement (front, side and top views).

2. Intensity Depth Cueing displays lines farther away from the eye at lower intensities. This can be done relatively easily in software for constant intensity lines on raster displays. Some high performance display systems offer this as an option.

3. Stereoscopic Viewing requires two projected views, one computed for the left eye and one for the right. Special viewing apparatus is often required. Color can also be used to cue the two images, such as the red and green lens stereo glasses worn in movie theaters.

4. Kinetic Depth revolves the object dynamically, causing lines farther away to move more than lines nearer to the eye. This requires dynamic 3D rotation, which is available on many display devices today (for additional cost, of course).
Geometrically, a cube has eight vertices (points); each of its twelve edges is a curve that is a straight line; and each of its six faces is a planar surface. As another example, a cylinder has no vertices, two circular edges, and three faces: two planar (bounded by the edges) and one cylindrical (bounded by both edges).

Figure 10-6. A Cube and its Eight Vertices (A through H).

Pictorial representations require different data storage methods. A box, for example, with vertices named A-H, as shown in Figure 10-6, can be stored several ways. The type of storage method can be dictated by the type of rendering(s) desired.
Wire Frame Representations

This is the lowest level of the data storage methods. Only curves (lines) are needed, so any of the storage methods described in the previous chapter can be used. For the cube in Figure 10-6, the data structure would store the 12 lines: AB, BC, CD, DA, EF, FG, GH, HE, BE, CH, FA, GD. One approach to storing lines would be to store the two endpoints of each line, i.e., (x1,y1,z1) and (x2,y2,z2). Notice that this stores each point three times (three lines meet at each vertex of a cube), which can cause redundant calculations when transforming the object. Another storage method, called points and lines, stores points and lines separately. A line consists of references to its two endpoints, analogous to saying the line starts at vertex i and ends at vertex j. The points and lines representation of the cube is shown in Figure 10-7. Notice that each vertex is stored once, affording efficiency in space and computation time, because transforming the object involves transforming each vertex only once. In general, edges can be other than straight lines. This would make it necessary to store the type of curve or its equation with each edge. The edge is still a path between vertices, but now the description of this path, the curve, would be explicitly given.
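A minimal C sketch of the points and lines storage for the cube (type and array names are illustrative):

typedef struct { double x, y, z; } Point3;
typedef struct { int v1, v2; } Edge;   /* indices into the vertex array */

Point3 vertices[8];   /* the cube's vertices A..H, each stored once */
Edge   lines[12];     /* each line references its two endpoints by index */

Transforming the object then means transforming the eight vertices once each; the lines need no processing of their own.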
Figure 10-7. Points and Lines Data Structure for the Cube in Figure 10-6 (parallel coordinate arrays x[], y[] and z[] indexed 1-8 for vertices A-H, plus a table of lines referencing the vertex indices).
Hidden Line Removal Representations Removing hidden lines (curves) involves removing lines and parts of lines that are obscured by faces of the object. The data structure must store more than just lines and points, it must know about faces. Therefore, we must add a face structure that represents a surface that is bounded by edges. In the cube of Figure 10-6, the faces are planes bounded by lines, called polygons. Polygons are the simplest form of a face and are very common among 3D data representations. A first storage method for polygons might be a list of points, ABCD, ADHE, EFGH, BCGF, ABFE, CDHG, where each adjacent pair of points in the list, including the last to the first, is a line. This suffers from the inefficiencies in space and redundant computations noted previously for lines. Alternatively, a polygon can be stored as a list of references to vertices, similar to the points and lines logic, called a points and polygons structure. The lines structure of Figure 10-7 is replaced by a polygons structure that has four indices per polygon (for this example) corresponding to the four vertices. Edges would be understood implicitly to be lines between adjacent vertices, requiring that the list of vertices be properly ordered.
Hidden Surface Removal Representations

The data required for hidden surface removal is an extension of that for hidden line removal. In addition to the geometry of each face of the object, the light reflectance properties of the face also must be stored. Also, the lighting conditions of the surrounding environment are required.
Figure 10-8. The Eye Coordinate System and Projected Lines.

The projected point (X1p,Y1p) is the projection of (X1,Y1,Z1), and similarly for (X2p,Y2p) and (X2,Y2,Z2). After perspective projection, lines remain lines and planes remain planes (this will be shown mathematically later). Therefore, the projection of a 3D line from point 1 to point 2 is the line connecting the projected endpoints. Angles, however, are not preserved after perspective projection. To derive the equations for perspective projection, first look down the Xe axis (Figure 10-9). The COP is a point on the Ze axis that is a distance d, called the perspective distance, from the projection plane.

Figure 10-9. Viewing a Projector Looking Down the Xe Axis.

Using similar triangles, relate Yp to d, Y and Z:

    Yp = Y · d / (d − z)
Now find Xp by looking down the Ye axis (Figure 10-10).

Figure 10-10. Viewing a Line Looking Down the Ye Axis.

    Xp = X · d / (d − z) = X · 1 / (1 − z/d)

The quantity d/(d − z), also written 1/(1 − z/d), is the perspective factor and is a function of the perspective distance d of the eye from the projection plane and the z coordinate of the given point. After perspective, when the data lies on the projection plane, we can use the standard two-dimensional techniques to draw it. Examine the perspective factor more closely.

1. When z → −∞, i.e. for data points far from the eye:

       lim (z → −∞)  d / (d − z) = 0

   Therefore (Xp,Yp) → (0,0). The points converge to the origin.

2. When d → ∞, i.e. when observing from infinitely far away:

       lim (d → ∞)  d / (d − z) = 1

   Therefore (Xp,Yp) → (X,Y). This is parallel (orthographic) projection. So, perspective projection from infinitely far away becomes parallel projection.
3. When z > d, i.e., the data point is behind the COP:

       d / (d − z) = a negative number

   Therefore (Xp,Yp) = (a negative number) · (X,Y). This causes inversion of the image, or false projection.

   Figure 10-11. Inversion of Data Behind the COP.

4. When z → d, i.e. a point approaches the z plane of the COP:

       lim (z → d)  d / (d − z) = ∞

   The perspective factor approaches infinity, causing a severe lens distortion effect as objects approach the COP plane. Computationally, this is a problem that must be handled in code, as described later.

5. When z = 0, i.e. data points on the projection plane:

       d / (d − z) = 1

   Points on the projection plane are not affected by perspective.

Figure 10-12 summarizes these different regions along the Z axis and their projection characteristics.
Figure 10-12. Effects of Data Z Values on Projection.

There are other definitions for the eye coordinate system. One of the early definitions places the COP at the origin and the projection plane at the location z = d in a left-handed coordinate system, as shown in Figure 10-13. This is popular because Ze increases away from the eye, which some consider better than a right-handed system where Ze decreases away from the eye.

Figure 10-13. A Left-Handed Eye Coordinate System.

To derive the perspective factor for this system, again look down the Xe axis (Figure 10-14). By similar triangles,

    Yp / d = Y / Z    so    Yp = Y · d / Z

Figure 10-14. Looking Down the Xe Axis in the Left-Handed System.

Similarly, we find Xp = X · d / Z. This appears simpler than the previous perspective equations, but it has limitations for some applications, such as hidden surface removal, as we'll see later.
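For the right-handed formulation used above (projection plane at z = 0, COP at (0,0,d)), the projection step might be coded as the following sketch (illustrative names; the guard handles the problem regions identified in cases 3 and 4):

/* Perspective-project an eye-coordinate point onto the plane z = 0.
   The COP is at (0,0,d). Returns 0 when d - z <= 0, i.e. the point is
   at or behind the COP plane and must be clipped before projection. */
int PerspectiveProject( double x, double y, double z, double d,
                        double* xp, double* yp )
{
    double denom = d - z;
    if( denom <= 0.0 )
        return 0;
    *xp = x * d / denom;
    *yp = y * d / denom;
    return 1;
}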
A simple way to verify each rotation matrix is to check the direction of positive rotation and the placement of the minus sign before one of the sin θ terms. To do this, rotate an obvious point, like (0,1,0), about the axis by 90° and confirm the result. For X, (0,1,0) rotated by 90° should give (0,0,1).

Figure 10-15. Test Rotation about the X Axis.

Rotation about Y, RY(θ):

    [x' y' z' 1] = [x y z 1] · |  cos θ   0   −sin θ   0 |
                               |    0     1      0     0 |
                               |  sin θ   0    cos θ   0 |
                               |    0     0      0     1 |

(Check: (0,0,1) rotated by 90° about Y goes to (1,0,0).)

Rotation about Z, RZ(θ):

    [x' y' z' 1] = [x y z 1] · |  cos θ   sin θ   0   0 |
                               | −sin θ   cos θ   0   0 |
                               |    0       0     1   0 |
                               |    0       0     0   1 |

(Check: (1,0,0) rotated by 90° about Z goes to (0,1,0).)

It is possible to rotate about an arbitrary axis, but this requires the concatenation of several basic rotation transformations.
For representing translation in matrix form and to make the transformation matrices square, we have set the fourth coordinate of a vector to 1, (x,y,z,1), and the last column of matrices to (0,0,0,1)ᵀ. Now we generalize this notation and give deeper geometric interpretation to it. A point in 3D space is a projection of a point in 4D space given by the coordinates (wx, wy, wz, w), the homogeneous coordinates of a point. Define normalized homogeneous coordinates, or ordinary coordinates, as coordinates having a unity value of w. The x, y and z components of homogeneous coordinates are the 3D coordinates only when w = 1 (or 0, as will be discussed later), i.e. only after the coordinates have been normalized. Thus, (2, 4, 6, 2) and (0.5, 1.0, 1.5, 0.5) are the same normalized coordinates, (1, 2, 3, 1), which is the 3D point (1,2,3). To find the normalized 3D coordinates, simply divide by the w coordinate:

    ( wx/w, wy/w, wz/w, 1 )

To see the benefits of this convention, examine a general 4 by 4 homogeneous transformation matrix (for row vectors):

    | a b c p |
    | d e f q |
    | g h i r |
    | l m n s |

1. (a-i): Scaling and rotation elements.
2. (l-n): Translation.
3. s: This is the homogeneous scaling element. Examine what happens when we transform a point by a matrix that is the identity except for the element s: the result is (x, y, z, s). After normalization, the coordinates are (x/s, y/s, z/s, 1). The net effect is to uniformly scale the 3D coordinates by s⁻¹.
4. p,q,r: These are projection elements. Begin by examining the effects of a transformation with variable r and s elements:

       [x y z 1] · | 1 0 0 0 |  =  [x  y  z  (rz + s)]
                   | 0 1 0 0 |
                   | 0 0 1 r |
                   | 0 0 0 s |

   Now normalize:

       ( x/(rz+s), y/(rz+s), z/(rz+s), 1 ),   only if rz + s ≠ 0

   Letting s = 1 and r = −1/d, where d is the perspective distance:

       ( x/(1 − z/d), y/(1 − z/d), z/(1 − z/d), 1 )

   This is perspective projection! Thus, due to the convention that normalization introduces division, perspective can be represented as a transformation in matrix form. The perspective matrix, P, is:

       P = | 1  0  0    0  |
           | 0  1  0    0  |
           | 0  0  1  −1/d |
           | 0  0  0    1  |

Where is the point (x,y,z,0)? Following the homogeneous convention, normalizing the point requires dividing by w = 0:

    ( x/0, y/0, z/0, 1 )

The interpretation of the homogeneous vector (x,y,z,0) is a point at infinity on a line from the origin through (x,y,z). In other words, the vector (x,y,z,0) is a direction vector. This is a consistent interpretation when one considers that subtraction of two position vectors with the same w coordinates will result in a direction vector with a 0 w coordinate. The ability to represent points at infinity is another benefit of homogeneous representation.
Project the point (0, 0, d, 1), the COP for perspective projection:

    [0 0 d 1] · | 1  0  0    0  |  =  [0  0  d  0]
                | 0  1  0    0  |
                | 0  0  1  −1/d |
                | 0  0  0    1  |

This is the point at infinity on a line from the origin through (0,0,d). Therefore, the COP projects to the point at infinity on the Ze axis.
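A small sketch of row-vector transformation and normalization (assuming 4x4 matrices stored as m[row][col]; names are illustrative):

typedef double Matrix4[4][4];

/* out = p * m, where p and out are homogeneous row vectors (x,y,z,w). */
void TransformPoint( const double p[4], const Matrix4 m, double out[4] )
{
    for( int c = 0; c < 4; c++ )
        out[c] = p[0]*m[0][c] + p[1]*m[1][c] + p[2]*m[2][c] + p[3]*m[3][c];
}

/* Normalize in place; returns 0 for a point at infinity (w == 0). */
int Normalize( double p[4] )
{
    if( p[3] == 0.0 )
        return 0;
    p[0] /= p[3];
    p[1] /= p[3];
    p[2] /= p[3];
    p[3] = 1.0;
    return 1;
}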
Figure 10-16. World Coordinate System and View Vector.

The data must be transformed from world coordinates into eye coordinates before projection can be performed (analogous to translating before scaling when scaling about a point). This transformation, called the viewing transformation, V, relates the eye coordinate system to the world coordinate system. Where is the eye coordinate system? One answer is to use the view vector to define the eye coordinate system. We will stipulate that AIM is the origin of XYZe and EYE is the COP, or center of projection (Figure 10-17).

Figure 10-17. World and Eye Coordinate Systems.

For now, also make the arbitrary specification that the Ye axis will be vertical, i.e. the plane Xe=0 will be perpendicular to the plane Yw=0. This means that the observer stands upright with respect to Yw when viewing the data in the world coordinate system. The plane Ze=0 is the projection plane and the Ze direction is the normal to the projection plane. The definition of a viewing transformation, V, is that it transforms a point in world coordinates (Pw) into the corresponding point in eye coordinates (Pe):

    Pe = Pw · V

To derive V given the view vector, visualize the process as transforming the view vector as a line in world coordinates. We want to align the Ze axis on the view vector with the Zw axis. This can be accomplished using three basic transformations:
1. Translate the vector so that AIM goes to the origin, i.e. AIMw → (0,0,0)e.
2. Rotate the result so that the translated EYE, EYE', swings into the Xw=0 plane.
3. Rotate this result so that EYE'' swings down onto the Zw axis.
The matrix equation becomes:

    Pe = Pw · T(?) · R?(?) · R?(?)

so that

    V = T(?) · R?(?) · R?(?)
Step 1. This is a translation:

    P' = Pw · T(−AIM)

After this translation, the situation is as shown in Figure 10-18.

Figure 10-18. View Vector After Translation.

Step 2. Rotate by angle θ about Yw to rotate EYE' = EYE − AIM into the Xw=0 plane. Using trigonometry:

    θ = tan⁻¹( EYE'x / EYE'z )

Check the sign by noting the rotation direction assuming positive values. Note that EYE'z can be 0, and that the ATAN function returns angles between −π/2 and π/2. Therefore we must code this using the ATAN2( EYE'x, EYE'z ) function. There is still a problem, however, in that EYE'x and EYE'z will both be zero when the view vector is parallel to the Yw axis. In this case, ATAN2 cannot be called. The solution is to set θ = 0.

Step 3. Rotate EYE'' down by angle φ onto the Zw axis, to (0,0,d), by rotating about Xw:

    φ = tan⁻¹( EYE'y / √(EYE'x² + EYE'z²) )

To check the transformations, try transforming the data points AIM and EYE. You know what the results should be: (0,0,0) and (0,0,d). For example, consider Figure 10-19. We first compute EYE' = EYE − AIM = (0,1,0). Then,
    V = T(0,0,0) · RY(0) · RX(π/2)

    V = ( I )( I ) · | 1      0          0      0 |   =   | 1   0   0   0 |
                     | 0   cos(π/2)   sin(π/2)  0 |       | 0   0   1   0 |
                     | 0  −sin(π/2)   cos(π/2)  0 |       | 0  −1   0   0 |
                     | 0      0          0      1 |       | 0   0   0   1 |

As a check, we transform AIM and EYE: AIM·V = (0,0,0) and EYE·V = (0,0,1). Also note that the XYZe axes, treated like data, will align with the XYZw axes when transformed by V. Often you can determine V by observation. The EYE-AIM view vector method just described can be far too restrictive for some applications. Imposing the condition that d is the distance between EYE and AIM can be awkward. It may be useful to consider this EYE-AIM view vector approach as an internal representation for the viewing information and give the user alternative methods for specifying the viewing position, direction and perspective distance. An example of one alternative is using polar (or spherical) coordinates: R, azimuth and
elevation, for a viewing sphere with a given center C. This is useful for "walking around" an object. We treat the data as though it were at the origin of a sphere of radius R. Consider viewing the earth as an example. Azimuth is the angle of rotation about the earth's axis, i.e. east-west rotation, which in XYZw notation means a rotation about our Y axis. Elevation (also called inclination) is the angle from the equator, i.e. north-south rotation, that is a rotation about our X axis.
Figure 10-20. Viewing Sphere and Polar Coordinates.

The radius of the viewing sphere, R, and its center, C, are usually computed to surround the 3D data. One can compute a bounding sphere for the data, or more simply, compute a parallelepiped with corners (xmin,ymin,zmin) and (xmax,ymax,zmax). The sphere center C is then the midpoint of the diagonal connecting the corners, and R is half the length of the diagonal. Given the viewing sphere, the user specifies viewing information as elevation up and azimuth to the right. The problem is to compute EYE and AIM from this information, and then compute V as before. One approach is to set AIM = C, i.e. always look toward the center of the sphere. After all, this is the center of the data. To compute EYE, think of transforming a direction vector of length d along the Z axis (parallel to Zw) through C by the elevation and azimuth rotations, then adding this vector to AIM:

    EYE = (0,0,d,0) · RX( −elevation ) · RY( azimuth ) + AIM
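A sketch of this EYE computation (hypothetical names; angles in radians, with the rotation conventions assumed to match the row-vector matrices above):

#include <math.h>

/* Compute EYE from the sphere center AIM, distance d, and
   azimuth/elevation angles, following
   EYE = (0,0,d,0) * RX(-elev) * RY(azim) + AIM. */
void ComputeEye( const double aim[3], double d,
                 double azim, double elev, double eye[3] )
{
    /* (0,0,d) rotated by -elev about X: (0, d*sin(elev), d*cos(elev)) */
    double y = d * sin( elev );
    double z = d * cos( elev );
    /* then rotated by azim about Y, and translated by AIM */
    eye[0] = aim[0] + z * sin( azim );
    eye[1] = aim[1] + y;
    eye[2] = aim[2] + z * cos( azim );
}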
Clipping on the near plane is required prior to projection to avoid infinite projection factors and inverted data. Clipping on the far plane is optional and is useful for eliminating data too far to see and to produce special effects such as cross-sectioning. Clipping on the near and far planes is termed z-clipping. One approach to 3D clipping, based on the idea of utilizing existing 2D software for clipping and window to viewport mapping, is to:
1. clip the 3D line on the near and far planes (z-clip),
2. project the line into 2D on the projection plane,
3. clip the projected line against the 2D window,
4. and then map it to the viewport.
Z-clipping can be implemented using logic based on the 2D Cohen-Sutherland clipping algorithm. The endpoint classification can be a simple Boolean value that signifies visible or invisible. This code is computed based on the z coordinate alone, as illustrated in Figure 10-21.
Figure 10-21. Z-Clipping Endpoint Regions.

To compute a clipped endpoint, use the symmetric equations for a 3D line:

    (x - x1)/(x2 - x1) = (y - y1)/(y2 - y1) = (z - z1)/(z2 - z1)

where (x,y,z) are the coordinates of any point on the line between the known points (x1,y1,z1) and (x2,y2,z2). For a line crossing one of the planes, we know that z', the clipped z coordinate, will be zn on the near plane and zf on the far plane. Therefore, the x' and y' of point 1' shown in Figure 10-21 can be computed. Lines can be rejected as wholly invisible during z-clipping and thus would not be further processed. The process of drawing a line resembles a pipeline of operations starting with the endpoints in 3D world coordinates and ending with a clipped, projected line in 2D viewport coordinates. This is often called the 3D pipeline. For the Z-Clipping approach to 3D clipping, the pipeline appears as shown in Figure 10-22. Across the top of the pipeline element boxes are parameters that control the execution of each element of the pipeline. Across the bottom are the dimensionality and reference frame for the coordinates entering and exiting each element. The diagram in Figure 10-22 can be used as a model for coding the pipeline. Each block can be coded as a routine that inputs the endpoints of the line in 2D or 3D. The first routine, e.g. Draw3DLine( x1, y1, z1, x2, y2, z2 ), would be called to draw a line given the 3D local
coordinates of the endpoints. The application can then use this routine without concern for all the downstream details of its operation, thus forming a 3D package of sorts.

Figure 10-22. The 3D Pipeline for Z-Clipping. [Stages: CTM (modeling transformations M, view vector) → Z-CLIP (zn & zf) → 2D CLIP → W-V MAP; coordinates progress 3D local → 3D world → 3D eye → 2D device.]
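As a concrete illustration of the Z-CLIP stage, the following sketch clips a segment, given in eye coordinates, against the near and far planes using the symmetric equations above; the routine name and the in-place interface are assumptions for illustration:

/* Clip the segment p1-p2 to zf <= z <= zn; returns 0 if wholly invisible. */
int ZClip( double p1[3], double p2[3], double zn, double zf )
{
    int i;
    if( (p1[2] > zn && p2[2] > zn) || (p1[2] < zf && p2[2] < zf) )
        return 0;                       /* both endpoints beyond one plane */
    for( i = 0; i < 2; i++ ) {          /* clip each endpoint in turn */
        double *p = (i == 0) ? p1 : p2;
        double *q = (i == 0) ? p2 : p1;
        double t;
        if( p[2] > zn )                 /* crosses the near plane */
            t = (zn - p[2]) / (q[2] - p[2]);
        else if( p[2] < zf )            /* crosses the far plane */
            t = (zf - p[2]) / (q[2] - p[2]);
        else
            continue;
        p[0] += t * (q[0] - p[0]);      /* symmetric equations: x' = x1 + t(x2 - x1) */
        p[1] += t * (q[1] - p[1]);
        p[2] += t * (q[2] - p[2]);
    }
    return 1;
}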
Figure 10-23. Viewing Frustum. [Labels: the XeYeZe axes, the COP, the near clipping plane (z = zn), and the right and bottom clipping planes.]

The horizontal Field-Of-View angle (FOV) is the angle between the left and right clipping planes (see Figure 10-24) and the vertical FOV is the angle between the top and bottom clipping planes. The FOV angle is often used as a viewing parameter.
Figure 10-24. Horizontal Field-of-View Angle. Examine the top and bottom clipping planes of the viewing frustum when viewed down the Xe axis as shown in Figure 10-25. The equations of the top and bottom planes are functions of z, the window size wsy, and the perspective distance d. These are infinite planes that divide 3D space in the Y direction into 3 regions: above the top clipping plane, visible, and below the bottom
clipping plane.

Figure 10-25. Side View (Looking Down Xe) of the Viewing Frustum. [The TOP clipping plane is y = wsy(1 - z/d) and the BOTTOM clipping plane is y = -wsy(1 - z/d); the regions above and below them are invisible. The COP lies at distance d on the Ze axis; zn and zf mark the near and far planes.]

Similarly, Figure 10-26 shows a top view of the viewing frustum and the equations for the left and right clipping planes. These are infinite planes that divide 3D space into three regions: left, visible, and right.

Figure 10-26. Top View (Looking Down Ye) of the Viewing Frustum. [The LEFT clipping plane is x = -wsx(1 - z/d) and the RIGHT clipping plane is x = wsx(1 - z/d); the regions beyond them are invisible.]

There is a common factor, (1 - z/d), that should look familiar. It is the inverse of the perspective factor. The six infinite planes form the boundaries of a visible volume that is described by the following equations. Letting wsxp = wsx(1 - z/d) and wsyp = wsy(1 - z/d):

    -wsxp <= x <= wsxp
    -wsyp <= y <= wsyp
       zf <= z <= zn
These equations can be coded into a 3D endpoint code using a 6-bit binary code, where each bit represents the Boolean term "the point lies on the invisible side of the (TOP, BOTTOM, LEFT, RIGHT, NEAR, FAR) clipping plane." A zero code means the point is inside the frustum.

    32   NEAR     z > zn
    16   FAR      z < zf
     8   TOP      y > wsyp
     4   BOTTOM   y < -wsyp
     2   LEFT     x < -wsxp
     1   RIGHT    x > wsxp
Figure 10-27. Sample 3D Endpoint Region Code.

The frustum clipping algorithm can be implemented as an extension of the Cohen-Sutherland 2D clipping algorithm. The exit conditions are still based on the AND of the endpoint codes. Clipping a line that crosses a boundary is a bit more complicated in 3D. Given two endpoints, P1 and P2, where P1 is invisible and P2 is visible, i.e. the line P1-P2 crosses a given clipping plane, one must compute a new P1 that lies on this plane. This was done in the 2D clipping algorithm using the symmetric equations of a line. Unfortunately, we now have a line in 3D space that intersects an oblique plane (not orthogonal to a principal axis). One technique for this computation is to use the plane equation of the clipping plane, Ax + By + Cz + D = 0. First note that this equation can be expressed as a homogeneous dot product:

    [ x  y  z  1 ] [ A  B  C  D ]^T = 0
or,
    P · B = 0
B is called the plane coefficient vector for a clipping plane. The equation of any point along the line P1-P2 can be expressed as a parametric vector function P(t), where the scalar parameter t varies between 0.0 at P1 and 1.0 at P2:
    P(t) = P1 + t (P2 - P1)

First form the homogeneous dot product of the P(t) equation with the plane coefficient vector of the violated clipping plane, B. Then solve for the value of the parameter for which the line intersects the plane by recognizing that P(t) · B = 0 at the intersection (i.e., when P(t) is on the plane).

    P(t) · B = 0
    P1 · B + t (P2 - P1) · B = 0
    t = P1 · B / (P1 · B - P2 · B)

The clipping plane coefficient vectors, B, are formed from the equations of the clipping planes and are shown in Figure 10-28.

    Plane     Coefficient Vector B       Algebraic Equation      Invisible
    LEFT      (  1,  0, -wsx/d, wsx )    x = -wsx(1 - z/d)       x < -wsxp
    RIGHT     ( -1,  0, -wsx/d, wsx )    x =  wsx(1 - z/d)       x >  wsxp
    TOP       (  0, -1, -wsy/d, wsy )    y =  wsy(1 - z/d)       y >  wsyp
    BOTTOM    (  0,  1, -wsy/d, wsy )    y = -wsy(1 - z/d)       y < -wsyp
    FAR       (  0,  0,  1, -zf )        z = zf                  z < zf
    NEAR      (  0,  0, -1,  zn )        z = zn                  z > zn
Figure 10-28. Frustum Clipping Plane Coefficient Vectors.

Using the fact that the plane equation, Ax + By + Cz + D = 0, is the same as its negation, -Ax - By - Cz - D = 0, the coefficient vectors can be formed so that the value of the dot product, P · B, can be used to determine on which side of the plane the point P lies. The plane coefficient vectors shown in Figure 10-28 have been formed such that (P · B) yields a positive number (or zero) if P is on the visible side of the plane and a negative value if the point is on the invisible side. As a result, the endpoint code test also can be based on the plane coefficient vector. This can further
simplify and modularize the frustum clipping code. A single general plane clipping routine can be passed the plane coefficient vectors of the six clipping planes, classify the endpoints, and clip to the boundaries using only the plane coefficient vector. The pipeline for 3D frustum clipping is shown in Figure 10-29.

Figure 10-29. The 3D Pipeline for Frustum Clipping. [Stages: CTM (modeling transformations, view vector) → Frustum Clip (zn, zf, d & window) → W-V Map (d, viewport); the coordinate frames noted below the stages are 3D local, 3D world, 3D eye, 3D eye, 2D window, and 2D device.]
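A sketch of such a general plane clipping step; the point4 type and the routine names are assumptions, and B is a plane coefficient vector as in Figure 10-28:

typedef struct { double x, y, z, w; } point4;

/* P · B, the homogeneous dot product against a plane coefficient vector */
double PlaneDot( point4 p, const double b[4] )
{
    return p.x * b[0] + p.y * b[1] + p.z * b[2] + p.w * b[3];
}

/* Replace endpoint p1 with the intersection of segment p1-p2 and plane B. */
point4 ClipToPlane( point4 p1, point4 p2, const double b[4] )
{
    double d1 = PlaneDot( p1, b );   /* negative means the invisible side */
    double d2 = PlaneDot( p2, b );
    double t  = d1 / (d1 - d2);      /* parameter where P(t) · B = 0 */
    p1.x += t * (p2.x - p1.x);
    p1.y += t * (p2.y - p1.y);
    p1.z += t * (p2.z - p1.z);
    p1.w += t * (p2.w - p1.w);
    return p1;
}

Because only the coefficient vector changes from plane to plane, the same routine also serves for the homogeneous clipping described later, where the coefficient vectors are simple constants.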
Clipping in homogeneous coordinates is not as complex as it first sounds, although it may appear to be more complex than the previous methods due to the nature of 4D coordinates. Observe what happens to the viewing frustum after perspective projection (Figure 10-30).

Figure 10-30. Viewing Frustum Before and After Perspective. [Before: the frustum in XeYeZe, with the eye at z = d and the near plane at z = zn. After: a parallelepiped in XpYpZp, with the near and far planes at znp = zn d/(d - zn) and zfp = zf d/(d - zf), and the eye at infinity.]

We know that the window is not affected by projection (because z = 0) and the COP goes to infinity in z. Therefore the frustum transforms into a right parallelepiped, with the side clipping planes extending through the window between the projected near and far planes. Further examining the process, the two-dimensional image of the data is actually an orthographic parallel projection (front view, x and y values only) of the projected 3D coordinates. Deeper insight into the perspective projection process is gained by examining three regions along the Z axis in the W-Z plane of 4D space before and after perspective, as shown in Figure 10-31. Before projection, the coordinates of point A are (-,-,d,1), where - means any value. The coordinates of B are (-,-,0,1). After projection, A is (-,-,d,0), a point at infinity. B is (-,-,0,1), unchanged as expected because it is on the projection plane.

    Region    Before Perspective    After Perspective
      1       d < z < ∞             -∞ < z < -d
      2       0 < z < d              0 < z < +∞
      3       -∞ < z < 0            -d < z < 0
Figure 10-31. W-Z Plane View of Perspective Projection.

Several of the z projections involve limits. In region 1, as z approaches d from above (z > d),

    lim (z→d, z>d)  z d / (d - z)  =  -∞

Similarly, as z approaches ∞ and -∞ in regions 1 and 3,

    lim (z→±∞)  z d / (d - z)  =  -d

Finally, as z approaches d from below in region 2 (z < d),

    lim (z→d, z<d)  z d / (d - z)  =  +∞
In region 1, points behind the COP prior to projection transform into points in front of the eye after projection. Points in region 2, between the COP and the projection plane, are projected into a greatly expanded semi-infinite region. Consider what happens to a point that is placed very near the COP: it will project to a point near the eye, i.e., near ∞. Likewise, points in the semi-infinite region 3 are compressed into a region between 0 and -d. This shows the highly non-linear nature of perspective projection. Homogeneous clipping can be simplified considerably by transforming the coordinates after applying perspective using a normalization transform, N, that transforms a point into clipping coordinates. The x, y, and z coordinates of a point transformed by P are mapped into a clipping coordinate system, XYZc, in which the visible coordinates, i.e. coordinates that were originally inside
the viewing frustum in XYZe, lie inside a cube centered at the origin (x, y, z ∈ [-1,1]). In effect, the view volume is transformed into a cube, as shown in Figure 10-32. After clipping lines (in 4D) against the clipping cube, they can be projected and mapped directly to the viewport.

Figure 10-32. Homogeneous Normalization After Perspective. [N maps the projected volume, bounded by the window and the far clipping plane, into a cube in XYZc with corner (1,1,1).]

The normalization transformation N is constructed in two steps: (1) translate the center of the projected view volume to the origin, and (2) scale the resulting volume into a cube with x, y and z in [-1,1]. Referring to Figure 10-32, the center of the projected frustum lies along the z axis half-way between the near and far clipping planes. The locations of the near and far clipping planes usually are given as distances from the COP rather than z coordinates in XYZe, which would be awkward for a user. The distance to the near plane is n and the distance to the far plane is f. Thus, the z coordinates of the near and far clipping planes are zn = d - n and zf = d - f. Let znp be the z coordinate of the near clipping plane after projection, and zfp be the z coordinate of the far clipping plane after projection. Using the standard perspective equations:

    znp = zn d / (d - zn)    and    zfp = zf d / (d - zf)

Using the distances n and f in place of the z coordinates,

    znp = d (d - n) / n    and    zfp = d (d - f) / f
The half-sizes of the projected volume are wsx, wsy, and (znp - zfp)/2. We can now build N:

    N = T( 0, 0, -(znp + zfp)/2 ) S( 1/wsx, 1/wsy, 2/(znp - zfp) )
Concatenate the P and N matrices and multiply by d to simplify the terms. Notice that multiplying the matrix by d results in multiplying the resulting coordinates by d. However, scaling a homogeneous coordinate does not affect the normalized value. The resulting matrix, called dPN, is shown below.

    dPN = [ d/wsx    0       0                               0 ]
          [ 0        d/wsy   0                               0 ]
          [ 0        0       (2d + znp + zfp)/(znp - zfp)   -1 ]
          [ 0        0       -d(znp + zfp)/(znp - zfp)       d ]

Consider the scaling elements d/wsx and d/wsy above. Referring to Figure 10-24, we see that

    d/wsx = cot( FOVx/2 )    and    d/wsy = cot( FOVy/2 )
The perspective and normalization transformations transform eye coordinates into clipping coordinates,

    Pc = Pe dPN

After transformation by dPN, the resulting homogeneous coordinates must be properly clipped against the clipping cube. Each endpoint is a homogeneous coordinate (wx, wy, wz, w) for which the visible 3D (projected) coordinates lie between -1.0 and 1.0. The ordinary 3D clipping volume is described by the equations:

    -1 <= wx/w, wy/w, wz/w <= 1
Recall, however, that clipping must occur prior to division. Therefore, clipping must be performed in the 4-dimensional homogeneous space. The above equations in 4D space are: -w <= wx, wy, wz <= w
This involves clipping on the six planes wx = -w, wx = +w, and so on for wy and wz. There is one special action necessary for lines with one endpoint that has a negative w value. Note that if both endpoints have negative w values, the line is wholly rejected. Recall that a negative w value results from perspective transformation of a point with a z coordinate behind the COP, i.e. w = 1 - z/d is negative when z > d. This line must be clipped against the w=0 plane, removing the portion of the line behind the COP. Clipping on the w=0 plane must be done first, before clipping on the other planes. This is a simple matter of properly ordering the plane tests in the clipping code. Homogeneous clipping involves seven hyperplanes in 4D space. Once again, the clipping algorithm can be an extension of the basic Cohen-Sutherland 2D clipping logic, now clipping on a volume bounded by seven planes. A seven-bit clipping code can be used, where the endpoint codes identify mutually exclusive regions bounded by the infinite planes called TOP, BOTTOM, LEFT, RIGHT, NEAR, FAR, and WZERO (for the w=0 hyperplane). After checking the Boolean AND (quick reject) and OR (accept) of the endpoint codes, the line must be clipped by recomputing one of its endpoints. Computing the intersection of the line with a clipping plane follows the same logic as described previously for frustum clipping. The plane equations, however, are much simpler and contain only constants (by design). Be sure to use homogeneous coordinates now:

    [ wx  wy  wz  w ] [ A  B  C  D ]^T = 0
or,
    P · Bi = 0
The clipping plane coefficient vectors for the homogeneous clipping coordinates, formed as before so that P · B < 0 for invisible points, are given in Figure 10-33. Note that they are simple constants. The pipeline for 3D homogeneous clipping consists of two matrix multiplications and the homogeneous clipping procedure. The homogeneous line clipping algorithm is outlined in Figure 10-35.
Figure 10-33. Homogeneous Clipping Plane Data.

    Plane     Coefficient Vector B
    WZERO     (  0,  0,  0,  1 )
    LEFT      (  1,  0,  0,  1 )
    RIGHT     ( -1,  0,  0,  1 )
    BOTTOM    (  0,  1,  0,  1 )
    TOP       (  0, -1,  0,  1 )
    FAR       (  0,  0,  1,  1 )
    NEAR      (  0,  0, -1,  1 )

[Pipeline figure: CTM (M1) → dPN (M2) → Homogeneous Clip & Divide → Viewport Map; coordinates progress (x,y,z,w)local → (x,y,z,w)clip → (x,y)device.]
ClipLine( Point4D p1, Point4D p2 )
    c2 = Zone( p2 )
    LOOP:
        c1 = Zone( p1 )
        IF c1 is IN and c2 is IN THEN
            ProjectAndDraw( p1, p2 )
            RETURN
        ENDIF
        IF (c1 AND c2) != 0 THEN RETURN      /* line is wholly invisible */
        /* special case: p2 outside WZERO, or p1 in: swap so WZERO is clipped first */
        IF (c2 AND WZERO) OR (c1 is IN) THEN SWAP c1 <-> c2 and p1 <-> p2
        IF c1 AND WZERO THEN B = wzeroplane
        ELSE IF c1 AND TOP THEN B = topplane
        etc... for the other planes, in any order
        t = (p1 · B) / (p1 · B - p2 · B)
        p1 = p1 + t (p2 - p1)
    ENDLOOP
Figure 10-35. Outline of Homogeneous Line Clipping. See Blinn [1] for another approach to homogeneous clipping.
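A sketch of the Zone classification used above, extending the bit values of Figure 10-27 with WZERO; it assumes the point4 type from the earlier frustum sketch and the convention in these notes that the near plane maps to z = +1 and the far plane to z = -1:

#define RIGHT   1
#define LEFT    2
#define BOTTOM  4
#define TOP     8
#define FAR    16
#define NEAR   32
#define WZERO  64

int Zone( point4 p )
{
    int code = 0;
    if( p.w < 0.0 )   code |= WZERO;   /* behind the COP: clip w = 0 first */
    if( p.x >  p.w )  code |= RIGHT;
    if( p.x < -p.w )  code |= LEFT;
    if( p.y >  p.w )  code |= TOP;
    if( p.y < -p.w )  code |= BOTTOM;
    if( p.z >  p.w )  code |= NEAR;
    if( p.z < -p.w )  code |= FAR;
    return code;                       /* 0 means the point is IN */
}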
location, orientation and shape of the viewing frustum in the world coordinate system. The viewing frustum data then is used to transform, clip and project 3D world data to produce the 2D image. In general, the viewing frustum can be specified in a number of different ways using a number of different parameters. Here we discuss two methods that we call AIM-based and EYE-based viewing.

Figure 10-36. Viewing Frustum in World Coordinates. [Labels: the XYZw and XYZe axes, COP (EYE), AIM, the window, and the near and far planes.]

First consider viewing parameters that move and rotate the frustum but do not change its shape. Changing AIM causes you to translate without rotating the frustum. The COP translates the same as AIM. Changing azimuth and elevation causes you to rotate the frustum while keeping AIM in the same location. Notice that your eye, the COP, rotates about AIM. Now consider changing the parameters that affect the frustum shape but do not affect AIM or the orientation. Changing FOV rotates the frustum sides about the COP. This does not affect AIM or COP but does enlarge or shrink the window rectangle on the projection plane, thus making the image shrink or enlarge in size. Changing d moves the COP nearer or farther from AIM, which shrinks or expands the window on the projection plane. All these parameter changes will cause the resulting image to change. The dPN transformation for AIM-based viewing was given earlier:

    dPNaim = [ d/wsx    0       0                               0 ]
             [ 0        d/wsy   0                               0 ]
             [ 0        0       (2d + znp + zfp)/(znp - zfp)   -1 ]
             [ 0        0       -d(znp + zfp)/(znp - zfp)       d ]
AIM-based viewing is useful to give the effect of walking around a point in space, such as around an object. This is appropriate for applications where the user wants to move around a
particular object, such as CAD systems or data visualization. AIM-based viewing is not useful for other applications where it would be better to simulate turning the head, such as a building walkthrough or an action game. EYE-based viewing is more appropriate for such applications.
Figure 10-37. Effect of Changing d in EYE-Based Viewing. [Top view of the viewing frustum showing the COP, the FOV, a projector through points P1 and P2, and two aim points AIM1 and AIM2 at distances d1 and d2.]

The EYE-based transformation concatenates a translation with the perspective and normalization transformations: dPNopengl = d T(0,0,d) P N, where P and N are the perspective projection and normalization transformations as before. The resulting matrix does not involve d, as shown below.

    dPNeye = [ cot(FOVx/2)   0             0             0 ]
             [ 0             cot(FOVy/2)   0             0 ]
             [ 0             0             (f+n)/(f-n)  -1 ]
             [ 0             0             2nf/(f-n)     0 ]
10.10 References
1. James F. Blinn, A Trip Down the Graphics Pipeline: Line Clipping, IEEE Computer Graphics & Applications, January 1991, pp. 98-105.
equations for this eye coordinate system. Show the P matrix.
2. Derive the perspective projection equations for x, y and z for a left-handed eye coordinate system with the center of projection at the origin and the projection plane at z = +d. Derive the corresponding P matrix.
3. The computation of the eye position given an AIM position using polar coordinates was given earlier as EYE = (0,0,d,0) R(-elevation, X) R(azimuth, Y) + AIM. Rewrite this as Pw = Pe M. Derive the viewing transform directly from the elevation and azimuth angles, without first computing EYE.
4. Why is it necessary to do near plane z clipping? Where in the 3D pipeline should it be done? Why is it desirable to do BACK plane z clipping? Where in the 3D pipeline should it be done?
5. The homogeneous clipping coordinates of the endpoint of a line after transformation by V, P and N are (1,3,-3,2). Identify and give the plane equation of each clipping plane that is violated by this point.
6. Develop the perspective P and normalization N matrices for transforming data from a right-handed eye coordinate system, with the projection plane at z=0, center of projection at (0,0,d), square window half-size of S, and near and back clipping plane locations ZF and ZB, into left-handed homogeneous clipping coordinates in which the visible x, y and z coordinates lie between 0.0 and 1.0 and z increases away from the eye. Draw and label an illustration of the 3D viewing frustum before and after the P and N transformations. Show the N transformation as a concatenation of basic transforms given in functional form with symbolic expressions as arguments.
In Figure 11-1, the vector V is a direction vector that represents direction and length. It is also called a relative vector. The coordinates of a direction vector are relative displacements. The vector P in Figure 11-1 is a position vector, a vector that begins at the origin O. A position vector represents a location in space with respect to some coordinate system. The origin itself is a position vector. The distinction between direction and position vectors is sometimes subtle and often overlooked. It will be a useful distinction later when computational algorithms are discussed. We will represent all vectors in homogeneous coordinates: V = [ wx, wy, wz, w ].
Figure 11-1. Position Vector P and Direction Vector V. [P runs from the origin O; V is drawn free of the origin; dashed lines show the components along the XYZ axes.]

Recall that the homogeneous vector [ x, y, z, 0 ] was reasoned to be a point at infinity on a line through the origin and the point. Such a vector also results from the vector subtraction of two position vectors with identical w coordinates:

    [ x2  y2  z2  w ] - [ x1  y1  z1  w ] = [ x2-x1  y2-y1  z2-z1  0 ]
Therefore, a direction vector has a zero w coordinate. As expected, position vectors have non-zero w coordinates. This will be particularly useful when we consider transformations, such as translation, applied to direction and position vectors. Geometrically, one can think of the vector coordinates as magnitudes to be applied to unit vectors along the principal axes in the vector equation (the dashed lines in Figure 11-1): V=xi+yj+zk where i, j, k are unit vectors, i.e. their length is one.
or a structure:

typedef struct {
    double x, y, z, w;    /* float or double */
} vector;

vector v;
v.x = 1.0;

The structure vector form allows assignment of vectors, pass-by-value function arguments and vector return values from functions. The following sections develop a basic set of vector utilities. The use of the vector structure is exploited in the example functions. The functions pass vectors by value, which encourages clear and correct coding, but is inefficient compared to passing vectors by address.
double VMag( vector v ) { return sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }
Figure 11-2. Vector Magnitude Function. Note that the magnitude function should be applied only to the direction vectors (w=0). The w coordinate is ignored. In our applications, the magnitude is meaningful only for ordinary 3D coordinates.
vector VAdd( vector v1, vector v2 )
{
    v1.x += v2.x;    /* remember that v1 is passed by value */
    v1.y += v2.y;
    v1.z += v2.z;
    v1.w += v2.w;
    return v1;
}

vector VSub( vector v1, vector v2 )
{
    v1.x -= v2.x;
    v1.y -= v2.y;
    v1.z -= v2.z;
    v1.w -= v2.w;
    return v1;
}
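The unit vector function of Figure 11-5 can be sketched as follows; this is a reconstruction consistent with the description that follows it (division by a zero magnitude is not guarded here):

vector VUnit( vector v )
{
    double mag = VMag( v );    /* length of the 3D part */
    v.x /= mag;
    v.y /= mag;
    v.z /= mag;
    v.w = 0.0;                 /* the result is a direction vector */
    return v;
}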
Figure 11-5. Unit Vector Function. Here one sees the advantages of vector-based programming: compactness and the code resembles the mathematical expression. The unit vector operation should be applied only to 3D vectors, and sets the result to a direction vector (w = 0). Its application is to create unit magnitude vectors of direction vectors.
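Likewise, a sketch of the dot product function of Figure 11-6, assumed here to operate on the 3D components only:

double VDot( vector v1, vector v2 )
{
    return v1.x * v2.x + v1.y * v2.y + v1.z * v2.z;
}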
Figure 11-6. Dot Product Function.

For direction vectors, the value of the dot product can be interpreted as (Figure 11-7):

    V1 · V2 = |V1| |V2| cos θ

and can be used to obtain the cosine of the angle between vectors:

    cos(θ) = (x1 x2 + y1 y2 + z1 z2) / ( |V1| |V2| )

The dot product is commutative and distributive:

    V1 · V2 = V2 · V1
    V1 · (V2 + V3) = V1 · V2 + V1 · V3
Figure 11-7. Geometric Interpretation of the Dot Product. [The projection of V2 onto V1 is |V2| cos θ, so V1 · V2 / |V1| = |V2| cos θ.]

The dot product also can be interpreted geometrically as the projection of one vector on another, as illustrated in Figure 11-7. This makes the dot product useful for some geometric reasoning computations. For example, if the dot product of two vectors is negative then cos(θ) is negative, because vector magnitudes are always positive, which means that θ must be between 90 and 270 degrees, so we can say that the vectors are facing in opposite directions. Similarly, if the dot product is zero, then cos(θ) is zero, so θ = 90 or 270 degrees, so the vectors are perpendicular. Note that V · V = |V| |V| cos(0) = |V|².
    V1 × V2 = ( y1 z2 - z1 y2,  z1 x2 - x1 z2,  x1 y2 - y1 x2,  0 )
The direction can be determined by observation using the right-hand rule, i.e. close your right hand in the direction of the first to second vector and your thumb points in the direction of the cross product. The magnitude of the vector product is interpreted as:

    | V1 × V2 | = | V1 | | V2 | sin θ

The cross product function should only be applied to direction vectors. The w coordinates are
ignored.

Figure 11-9. Cross Product Function.

The cross product is not commutative, V1 × V2 = -V2 × V1 (opposite directions), but is distributive, V1 × (V2 + V3) = V1 × V2 + V1 × V3. The cross product also is useful for geometric reasoning. For example, if the cross product of two vectors is a zero length vector, then (assuming the vectors' magnitudes are non-zero) sin(θ) must be zero, meaning θ = 0 or 180 degrees. We can reason that the vectors are parallel and facing the same or opposite direction.
Figure 11-10. Geometric Interpretation of the Area Function. [V1 is the base of a parallelogram; |V2| sin θ is its height.]

Consider the cross product magnitude, |V1 × V2| = |V1| |V2| sin(θ). But |V2| sin(θ) is the value of the height, and |V1| is the value of the base. Therefore, the magnitude of the cross product is the area of the parallelogram, and hence the reason for the name Area. What is the geometric significance when Area(V1, V2) = 0? By definition, the area of the enclosing parallelogram is zero, hence the vectors must be colinear (pointing in the same or opposite directions).
double VArea( vector v1, vector v2 ) { return VMag( VCross( v1, v2 ) ); }
Figure 11-14. Vector Line Notation.

Depending on the range of the parameter t used in the equation, we can classify P(t) as a:

    line       if t can have any value, -∞ <= t <= ∞,
    ray        if t is limited to t >= 0,
    segment    if t is limited to t0 <= t <= t1 (usually 0 <= t <= 1).
normal direction N by observing that any point P in the plane forms a line in the plane with P0 that is perpendicular to the normal. The point-normal vector equation of the plane is given as:

    (P - P0) · N = 0
Figure 11-15. Vector Plane Notation.

Letting the plane normal N = (A, B, C), the plane point P0 = (x0, y0, z0) and an arbitrary point on the plane P = (x, y, z), look closer at this equation. First expand the dot product,

    (P · N) - (P0 · N) = 0.

Now expand the vector equation into algebraic form,

    (A x + B y + C z) - (A x0 + B y0 + C z0) = 0.

Compare this equation to the standard algebraic plane equation A x + B y + C z + D = 0. One sees that:

    D = -(P0 · N) = -(A x0 + B y0 + C z0)
Figure 11-16. Distance d from a Point Q to a Line P-V.

The distance can be found by forming a parallelogram with P, Q and V, duplicating the line PQ and the vector V to form four sides. The distance d is its height. The area of this parallelogram is the base times the height, or |V| * d. Therefore,

    d = |V × (Q - P)| / |V| = VArea( V, VSub(Q, P) ) / VMag( V )

Let P* = P + t V, for some t. Then,

    t = (Q - P) · V / (V · V) = VDot( VSub(Q, P), V ) / VDot( V, V )

Examine the cases shown in Figure 11-17.
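These two results translate directly into code using the utilities above (the function names are illustrative):

/* perpendicular distance from Q to the line P + tV */
double PointLineDistance( vector q, vector p, vector v )
{
    return VArea( v, VSub( q, p ) ) / VMag( v );
}

/* parameter t of the foot of the perpendicular, P* = P + tV */
double PointLineParameter( vector q, vector p, vector v )
{
    return VDot( VSub( q, p ), v ) / VDot( v, v );
}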
Figure 11-17. [Cases where the foot of the perpendicular P* falls outside the segment P to P+V: t > 1 and t < 0.]
Figure 11-18. Distance Between Parallel Lines.

Using logic similar to that of finding the distance from a point to a line:

    d = |V1 × (P2 - P1)| / |V1| = VArea( V1, VSub(P2, P1) ) / VMag( V1 )

If V1 and V2 are not parallel, then they form a volume for which the distance d is the height:

    d = |(V1 × V2) · (P2 - P1)| / |V1 × V2| = Vol( V1, V2, VSub(P2, P1) ) / VArea( V1, V2 )

Note that the VArea result above will be zero if the lines are parallel. This can be used as a decision value in the algorithm to compute the distance between lines.
To compute the point of intersection, P*, using vectors, solve the vector equation expressing the statement "the point on line 1 is the same as the point on line 2": P1(t) = P2(s).
Figure 11-19. Intersection of Two Lines.

The problem is to find the parameter values for t and s that solve this equation. Substituting the vector expressions for the lines yields a vector equation in the variables s and t:

    P1 + t V1 = P2 + s V2

Begin solving this equation by removing the absolute positions P1 and P2:

    t V1 = (P2 - P1) + s V2

Now cross both sides with V2 to eliminate s:

    t (V1 × V2) = ((P2 - P1) × V2) + s (V2 × V2)

Of course, (V2 × V2) = 0 by definition, so t (V1 × V2) = ((P2 - P1) × V2). Dot this equation with (V1 × V2) to make a scalar equation in t,

    t (V1 × V2) · (V1 × V2) = ((P2 - P1) × V2) · (V1 × V2)

Finally,

    t = [ (P2 - P1) × V2 ] · (V1 × V2) / [ (V1 × V2) · (V1 × V2) ]

Similarly, solving for s:

    s = [ (P2 - P1) × V1 ] · (V1 × V2) / [ (V1 × V2) · (V1 × V2) ]

The denominator of these expressions is zero for (V1 × V2) = 0, so this must be a degenerate condition. Geometrically, the vector V1 × V2 is the common normal between the two vectors. If this is 0, then the vectors must be parallel or lie along the same line, and possibly also
be overlapping. These must be handled as special cases in code. How do we know if the intersection point is on the segment P1 (t)? This is the advantage of parametric notation. The parameter value is a measure of the position of a point along the line. If 0 <= t <= 1, the point is along the segment P1 to P1+V1. If t < 0, the point is before P1, and if t > 1, it is after P1+V1. Similar logic holds for s and the line between P2 and P2+V2. This is not as messy to code as it looks, see Figure 11-20.
Boolean LineLineIntersection( vector p1, vector v1, vector p2, vector v2,
                              double *s, double *t )
{
    double numerator, ndotn;
    vector n, p21;

    n = VCross( v1, v2 );
    ndotn = VDot( n, n );
    if( VMag( n ) < TOOSMALL )
        return FALSE;                          /* parallel or colinear lines */
    p21 = VSub( p2, p1 );
    numerator = VDot( VCross( p21, v2 ), n );
    *t = numerator / ndotn;
    numerator = VDot( VCross( p21, v1 ), n );
    *s = numerator / ndotn;
    return TRUE;
}
Figure 11-20. Line-line Intersection Function. With s and t known, P* is computed using P1(t) or P2(s). Figure 11-21 is an example of the use of the LineLineIntersection routine.
double s, t;
vector pstar;

if( LineLineIntersection( p1, v1, p2, v2, &s, &t ) == TRUE ) {
    pstar = VAdd( p1, VScale( v1, t ) );
    printf( "the lines intersect at (%f,%f,%f)\n", pstar.x, pstar.y, pstar.z );
} else {
    printf( "the lines do not intersect\n" );
}
between a line and a plane. As in the case of the line-line intersection, this problem can be expressed geometrically as: there is a point on the plane that is also on the line.

Figure 11-22. Line and Plane Intersection. [N is the plane normal, P0 is any point on the plane, P is any point on the line, V is the line direction, and P* is the intersection.]

In vector notation, this becomes

    (P* - P0) · N = 0
    P*(t) = P + t V

Substituting P*,

    [ (P + t V) - P0 ] · N = 0

Expanding:

    t (V · N) + (P - P0) · N = 0

Finally,

    t = (P0 - P) · N / (V · N)

The possibility of a zero in the denominator should always gain our attention. In this case, as in most vector-based computations, it is a warning of a special geometric situation. If V · N is (approximately) zero, the line is parallel to the plane, so there can be no intersection. Of course, the line could be in the plane. If the plane is given as a plane coefficient vector, B, the line-plane intersection is computed as:

    t = -(P · B) / (V · B)
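A sketch of the line-plane intersection in code, following the style of the line-line routine; TOOSMALL is an assumed tolerance constant, and Boolean, TRUE and FALSE are as used above:

#include <math.h>    /* for fabs */

#define TOOSMALL 1.0e-8    /* assumed tolerance */

Boolean LinePlaneIntersection( vector p, vector v, vector p0, vector n, double *t )
{
    double vdotn = VDot( v, n );
    if( fabs( vdotn ) < TOOSMALL )
        return FALSE;                          /* line parallel to the plane */
    *t = VDot( VSub( p0, p ), n ) / vdotn;     /* t = (P0 - P) · N / (V · N) */
    return TRUE;
}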
Figure 11-23. Two Coordinate Systems.

Given a vector in XYZ1, P1, we would like to find its coordinates with respect to XYZ2, P2, using a transformation M: P2 = P1 M. The problem is to find the transformation M, called the change of basis. It is reasonable to assume that we know the origin of the second coordinate system, O2, and the unit direction vectors of the X2, Y2 and Z2 axes, called I2, J2 and K2, all with respect to XYZ1. This information allows us to form four equations with known vectors for P1 and P2, as summarized in the following table:

    P1 (XYZ1 Coordinates)    P2 (XYZ2 Coordinates)
    I2                       [1,0,0,0]
    J2                       [0,1,0,0]
    K2                       [0,0,1,0]
    O2                       [0,0,0,1]
where M is the unknown 4 by 4 matrix. Solving for M,

        [ I2 ] -1
    M = [ J2 ]
        [ K2 ]
        [ O2 ]

Therefore, the transformation of a vector from one coordinate system to another can be found by finding the inverse of the matrix formed by putting the three unit direction vectors (in homogeneous form) in the first three rows, and the origin position vector in the final row, of the second system with respect to the first. The general matrix inverse is a computationally expensive operation, so it is desirable to further refine this process by factoring the matrix on the right hand side into a rotation matrix and a translation matrix,

    [ I2 ]   [ I2 ]   [ I  ]
    [ J2 ] = [ J2 ] . [ I  ]
    [ K2 ]   [ K2 ]   [ I  ]
    [ O2 ]   [ I  ]   [ O2 ]

where the Is represent the respective row of the identity matrix. This can be done because we know that the only transformations in M are rotations and translations. The inverse of a product of matrices is the reverse order product of the respective inverses,

        [ I2 ] -1   [ I  ] -1   [ I2 ] -1
    M = [ J2 ]    = [ I  ]    . [ J2 ]
        [ K2 ]      [ I  ]      [ K2 ]
        [ O2 ]      [ O2 ]      [ I  ]

and the inverse of a translation is simply the translation by the negative vector. The rotation matrix is an orthonormal matrix, meaning the rows (and columns) are orthogonal unit vectors. The inverse of an orthonormal matrix is simply its transpose, the matrix formed by changing the rows to columns:

        [ I   ]   [ I2x  J2x  K2x  0 ]
    M = [ I   ] . [ I2y  J2y  K2y  0 ]
        [ I   ]   [ I2z  J2z  K2z  0 ]
        [ -O2 ]   [ 0    0    0    1 ]
where -O2 means [ -x, -y, -z, 1 ]. This is a compact, simple and efficient computation for M. The translation matrix is constructed simply by assigning the negative of O2 to the last row of a matrix that has been initialized to identity, and the transposed rotation matrix is formed by assigning the coordinates of I2, J2 and K2 as the first three columns of a matrix that has been initialized to identity. The transformation M is the product of these matrices. As an example, this procedure can be used to find the viewing transformation, V. The first coordinate system is XYZw, the world coordinate system, and the second is XYZe, the eye coordinate system. The resulting transformation is V, the viewing transformation. World coordinates will be transformed into eye coordinates. Given the view vector endpoints, EYE and AIM, one can find O2, I2, J2 and K2 using vector operations (see Figure 11-24).
Figure 11-24. Change of Basis for the Viewing Transformation.

The process involves several steps:
1. K2 = || EYE - AIM ||, the unit direction vector of the Ze axis with respect to XYZw.
2. The next step is to find I2, the direction of the Xe axis. It is necessary to specify a direction, often called the UP direction, which, when crossed with K2, forms the direction of I2. To follow the viewing conventions established during the trigonometric version of the viewing transformation discussed earlier, UP = J ([0,1,0,0]), the unit vector along the Yw axis. One must be careful, however, when UP is parallel to K2, i.e. when UP × K2 = 0, for example, looking along Yw. In this case, one must specify another UP vector. To follow the conventions established in these notes, in this case specify I, [1,0,0,0], the unit Xw axis:

       IF ( J × K2 ) approx.= 0 THEN I2 = I ELSE I2 = || J × K2 ||

3. J2 = K2 × I2.
4. Create the transposed rotation matrix, R, from I2, J2, K2 as described earlier.
5. Create the translation matrix, T = T(-AIM), because O2 = AIM in this case.
6. Finally, compute V = T R, as collected in the sketch below.
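The six steps can be gathered into one routine. The sketch below assumes a simple 4x4 matrix type with row-vector helpers (matrix, MatIdentity, MatMul, ViewFromEyeAim are all illustrative names) plus the vector utilities and TOOSMALL tolerance from earlier in this chapter:

typedef struct { double m[4][4]; } matrix;

matrix MatIdentity( void )
{
    matrix a;
    int i, j;
    for( i = 0; i < 4; i++ )
        for( j = 0; j < 4; j++ )
            a.m[i][j] = (i == j) ? 1.0 : 0.0;
    return a;
}

matrix MatMul( matrix a, matrix b )    /* row-vector convention: P' = P a b */
{
    matrix c;
    int i, j, k;
    for( i = 0; i < 4; i++ )
        for( j = 0; j < 4; j++ ) {
            c.m[i][j] = 0.0;
            for( k = 0; k < 4; k++ )
                c.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    return c;
}

matrix ViewFromEyeAim( vector eye, vector aim )
{
    vector up, i2, j2, k2;
    matrix R, T;

    up.x = 0.0; up.y = 1.0; up.z = 0.0; up.w = 0.0;    /* J, the Yw axis */
    k2 = VUnit( VSub( eye, aim ) );                    /* step 1 */
    if( VMag( VCross( up, k2 ) ) < TOOSMALL ) {        /* looking along Yw */
        i2.x = 1.0; i2.y = 0.0; i2.z = 0.0; i2.w = 0.0;    /* use I instead */
    } else
        i2 = VUnit( VCross( up, k2 ) );                /* step 2 */
    j2 = VCross( k2, i2 );                             /* step 3 */

    R = MatIdentity();                 /* step 4: I2, J2, K2 as columns */
    R.m[0][0] = i2.x;  R.m[1][0] = i2.y;  R.m[2][0] = i2.z;
    R.m[0][1] = j2.x;  R.m[1][1] = j2.y;  R.m[2][1] = j2.z;
    R.m[0][2] = k2.x;  R.m[1][2] = k2.y;  R.m[2][2] = k2.z;

    T = MatIdentity();                 /* step 5: T(-AIM) */
    T.m[3][0] = -aim.x;  T.m[3][1] = -aim.y;  T.m[3][2] = -aim.z;

    return MatMul( T, R );             /* step 6: V = T R */
}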
As an example, consider the situation shown in Figure 11-25, with AIM = [0,0,0,1] and EYE = [1,0,0,1].
Figure 11-25. Example of Change of Basis Applied to the Viewing Transform. Following the steps, 1. K2 = || EYE - AIM || = || [1,0,0,1] - [0,0,0,1] || = [1,0,0,0].
2. J × K2 = [0,1,0,0] × [1,0,0,0] = [0,0,-1,0] (i.e., -K). This is not zero length, so I2 = || J × K2 || = [0,0,-1,0].
3. J2 = K2 × I2 = [1,0,0,0] × [0,0,-1,0] = [0,1,0,0].
4. The transposed rotation matrix, with I2, J2 and K2 as its columns, is:

        [  0  0  1  0 ]
    R = [  0  1  0  0 ]
        [ -1  0  0  0 ]
        [  0  0  0  1 ]
5. T = T(-AIM) = T(0,0,0) = I, because AIM is at the origin.
6. Finally, compute V = T R:

        [ 1  0  0  0 ]   [  0  0  1  0 ]   [  0  0  1  0 ]
    V = [ 0  1  0  0 ] . [  0  1  0  0 ] = [  0  1  0  0 ]
        [ 0  0  1  0 ]   [ -1  0  0  0 ]   [ -1  0  0  0 ]
        [ 0  0  0  1 ]   [  0  0  0  1 ]   [  0  0  0  1 ]
Look at the columns of the R and V matrices in the previous example. They are the unit direction vectors of XYZe with respect to XYZw. This means that given V, one can find the directions of the axes of XYZe.
For example, it is simple to compute the points on a circle with its center at the origin and with a unit radius. Refer to this as the canonical circle in its local coordinate system. Drawing an instance of a circle at a point on an arbitrary plane in 3D space is another problem, however. We would like to compute the local coordinates of the points on a circle and use an instance transformation to map them to points on the 3D circle in world coordinates. The problem is to find this transformation from the information describing the world position and orientation. The problem, as usual, is finding the transformation M that maps XYZL coordinates into XYZw coordinates. The derivation of this transformation is similar to the derivation of the change of basis transformation. First, compute the origin and direction vectors of the local coordinate system in world coordinates from the position and orientation information given for the object. Next, form M from these vectors like the change of basis derivation, using the equation Pw = PL M:

    [ XL ]   [ 1  0  0  0 ]
    [ YL ] = [ 0  1  0  0 ] M  =  M
    [ ZL ]   [ 0  0  1  0 ]
    [ O  ]   [ 0  0  0  1 ]

This M is made by simple row assignments from the four vectors: the three unit direction vectors XL, YL, and ZL, and the origin vector O, all expressed in world coordinates. For example, compute the instance transformation to map a unit cylinder to the world position and orientation as shown in Figure 11-26. In the figure, the XYZL axes have been drawn in world coordinates.

Figure 11-26. [The basic cylinder in local coordinates, and its instance in world coordinates with local origin at [0,2,2,1].]

By observation we find that O = [0,2,2,1], XL = [0,0,-1,0], YL = [0,1,0,0], and ZL = [1,0,0,0]. Next, these vectors are assembled into the instance transformation M:

        [ XL ]   [ 0  0 -1  0 ]
    M = [ YL ] = [ 0  1  0  0 ]
        [ ZL ]   [ 1  0  0  0 ]
        [ O  ]   [ 0  2  2  1 ]
Notice that the rotation sub-matrix is a simple RY(90). If the direction vectors could not have been found by observation, then the viewing transformation vector operations described in the previous section could have been used to compute XL, YL and ZL. Also, if the cylinder instance is to be scaled, such as changing the radius (or radii, e.g. making the cylinder elliptical) or length, then these transformations should be concatenated with the positioning and orienting transformation M to form the instance transformation. The scaling must be done in XYZL, before M.
P · B = 0. The transformed points are P' = P M, so, inverting to find P, P = P' M⁻¹. Substituting this equation for P into the original plane equation above,

    (P' M⁻¹) B = 0.

Re-grouping the matrix and B multiplication,

    P' (M⁻¹ B) = 0.

Or, B' = M⁻¹ B, and P' B' = 0 is the new plane equation. For example, consider the perspective projection of the plane y + z - 1 = 0, the plane along the projector through y = 1 (Figure 11-27). The plane equation is [x y z 1] [ 0 1 1 -1 ]^T = 0.

Figure 11-27. The Plane y + z - 1 = 0. [The plane passes through y = 1 on the projection plane; d = 1.]

The transformation M is the perspective projection matrix, P. For d = 1, B' = M⁻¹ B becomes:

    [ A' ]   [ 1  0  0  0 ] [  0 ]   [  0 ]
    [ B' ] = [ 0  1  0  0 ] [  1 ] = [  1 ]
    [ C' ]   [ 0  0  1  1 ] [  1 ]   [  0 ]
    [ D' ]   [ 0  0  0  1 ] [ -1 ]   [ -1 ]
So, the transformed plane equation is y - 1 = 0, or y = 1. This shows that the top clipping plane of the viewing volume transforms to a plane perpendicular to the y axis (Figure 11-28). Using similar logic, the coefficients of a transformed conic, i.e., circle, sphere, parabola,
ellipse, or hyperbola, can be derived.

Figure 11-28. Original and Transformed Planes. [The plane y + z - 1 = 0 before, and y - 1 = 0 after; d = 1.]

For simplicity, we will consider only two dimensions in the following. The general 2D conic equation is

    a20 x² + 2 a11 x y + a02 y² + 2 a10 x + 2 a01 y + a00 = 0.

This can be put into matrix form,

              [ a20  a11  a10 ] [ x ]               [ x ]
    [ x y 1 ] [ a11  a02  a01 ] [ y ]  =  [ x y 1 ] A [ y ]  =  0
              [ a10  a01  a00 ] [ 1 ]               [ 1 ]

or,

    P A P^T = 0
Transforming the curve is transforming all its points, so again apply the transformation M to [x, y, 1] to produce [x', y', 1]. Inverting the equation to solve for [x, y, 1] in terms of M⁻¹ and [x', y', 1], and substituting into the equation above, the transformed conic is found:

    A' = M⁻¹ A (M⁻¹)^T.

As an example, consider translating a unit circle at the origin by [1, 0, 0] (one unit along x). The A matrix equation becomes:

                [ 1  0  0 ] [ x ]
    [ x  y  1 ] [ 0  1  0 ] [ y ]  =  0
                [ 0  0 -1 ] [ 1 ]

so that

        [ 1  0  0 ]
    A = [ 0  1  0 ]
        [ 0  0 -1 ]

Then

         [  1  0 -1 ]
    A' = [  0  1  0 ]
         [ -1  0  0 ]

This is the equation (x-1)² + y² - 1 = 0, which, as expected, is the equation of a unit circle with center [1, 0]. In three dimensions, the conics are ellipsoids, spheres, paraboloids, etc. The matrix form is

    [ x y z w ] A [ x y z w ]^T = 0

where the matrix A for 3D conics is 4 by 4. Exactly as in the 2D case above, transforming the 3D conic by a transformation M creates a transformed conic represented by A' = M⁻¹ A (M⁻¹)^T.
The approach is to first perform a change of basis that places the object in a coordinate system in which u becomes the z axis. Then rotate the object by the angle θ about this z axis and transform the object back into the original coordinate system. The first transformation is precisely the viewing transform, hence derive it using the change of basis in vector-matrix logic. The direction K2 is u. Next compute I2, then J2 = K2 × I2. Writing s = sqrt(ux² + uz²):

    I2 = || J × K2 || = ( uz/s,  0,  -ux/s )
    J2 = K2 × I2     = ( -ux uy/s,  s,  -uy uz/s )

Now create R, the transposed rotation transformation, using these as the columns:

    R = [  uz/s   -ux uy/s   ux ]
        [  0       s         uy ]
        [ -ux/s   -uy uz/s   uz ]

Form the composite transformation as (recall R⁻¹ = R^T):

    M = R RZ(θ) R^T

After considerable algebra, one arrives at a convenient form (writing vers θ = 1 - cos θ, the versine):

    M = [ ux² vers θ + cos θ         ux uy vers θ + uz sin θ    ux uz vers θ - uy sin θ ]
        [ ux uy vers θ - uz sin θ    uy² vers θ + cos θ         uy uz vers θ + ux sin θ ]
        [ ux uz vers θ + uy sin θ    uy uz vers θ - ux sin θ    uz² vers θ + cos θ      ]
V=
2. How can one determine if two parallel vectors are in the same or opposite directions?
3. What would be the approximate values for s and t for these line-line intersection cases? [Two sketched configurations of lines P1(t) and P2(s) that do not cross between their defining points.]
4. How would the condition "line in plane" be computed?
5. Given a viewing transformation V, show how to compute the translation and rotation matrices for the change of basis formulation. Also, show how the AIM location can be found from V.
The fundamental problem in these processes is to determine the visibility of a geometric element, either a line or surface, with respect to all other objects in a scene. Determining the visible portion of an element is actually computing the portion that is not hidden by another element. Once the visibility computations are finished, the scene is drawn by displaying all visible portions of elements. Of course, visibility depends on the viewing information. The concepts underlying visibility are introduced by first discussing back plane rejection.
Figure 12-1. Back-Plane Rejection Notation.

A plane is back-facing if E · N < 0, and need not be considered in subsequent processing. Back-plane rejection can be applied only if the data is a closed solid, in which it is not possible to see the back side of a plane. If the object is convex, back-plane rejection solves the HLR problem. What does it mean for an object to be convex or concave (non-convex)? In general, a polygon is convex if the exterior angle between any two adjacent edges never is less than 180°. Another specification for convexity is that the plane normals computed as the cross products of all adjacent edges taken as direction vectors must face the same direction.

Figure 12-2. Concave and Convex Polygons. [Labels: adjacent edge vectors V1 and V2 and the exterior angle between them.]

A convex object can be visualized as an object which, if shrink-wrapped with plastic, would have no cavities between the wrapper and its surfaces. We will see that hidden surface processing is
One of the first such algorithms was Roberts' HLR algorithm. The general HLR idea is:

Step 1. Perform the viewing and projection transformations so that all lines are in the 3D screen space.

Step 2. Compare each line with each polygon in 2D, i.e. using only xv and yv. There are three possible cases:

2a. The line is outside the polygon. Action: keep the line.
Figure 12-4. Line Outside a Polygon.

Note that determining if a line is outside a polygon requires (1) determining if any line of the polygon intersects the given line, and (2) determining if the endpoints of the given line lie inside the polygon (known as the point containment test). Examine the lines and polygon in Figure 12-5. Line AB has endpoints outside the polygon, but intersects it. Line CD has endpoints inside the polygon and also intersects it. Line EF is a simple case: the location of the endpoints can be used to classify the line as crossing.
Figure 12-5. Lines and a Polygon.

2b. The line is inside the polygon in 2D. Determine if the line penetrates the plane of the polygon.
Figure 12-6. Two Views of a Line Inside a Polygon.

If it does not penetrate, determine if the line is completely in front of (e.g. line A), or completely behind (e.g. line C), the polygon. This involves 3D tests:
1. if it is in front, KEEP IT;
2. if behind, eliminate it.
2c. If the line does penetrate, keep only the portion in front of the polygon.
Figure 12-7. Line Intersecting a Polygon. Break the line 'a-b-c' into 'a-b' and 'b-c'. Process 'a-b' by case 2b above, and process 'b-c' by step 2a above.
sorted based on decreasing distance (depth) from the eye. Each polygon is filled; that is, the pixels within the projected boundaries of the polygon on the screen are written into the frame buffer, beginning with the polygon farthest from the observer and proceeding to the closest. Pixels within closer polygons are written after those in farther away polygons, and therefore overwrite them in pixel memory. In this case, the 3D screen space can be (h,v,zv), where h and v are the pixel coordinates computed using window to viewport mapping, and zv is the projected z coordinate.
1. Project polygons into 3D screen space,
2. Sort polygons by zv values,
3. Fill each in order, from farthest to nearest to the eye.
Figure 12-8. Overlapping Polygons in the Painter's Algorithm.

If two polygons overlap in 2D and intersect in zv, then depth sorting produces incorrect results. In this case, the two polygons must be split into non-intersecting polygons, which is a complex polygon clipping-like process. This greatly increases the complexity and diminishes the elegance of the algorithm. The speed of the Painter's algorithm depends upon the filling speed of the graphics system because all polygons, visible or hidden, are filled. Filling is itself a computationally intensive process, so this is not the most efficient algorithm. It is, however, the simplest to implement.
buffer, called a z-buffer, is consulted prior to storing a pixel in pixel memory. This also requires that the polygon filling algorithm compute a depth (z value) for each pixel.

[Diagram: pixel memory holds the pixel values; a parallel z-buffer holds the pixel depths, one depth for each (h, v).]
When a polygon is filled, for each pixel inside it a depth is computed and compared to the depth stored in the z-buffer for the corresponding pixel in pixel memory. If the polygon's pixel is nearer to the eye (greater-than for a right-handed system, less-than for a left-handed system), the new pixel depth is stored in the z-buffer and the pixel value is stored in pixel memory. If the new pixel is farther away, it is ignored. After all polygons have been processed, the pixels remaining are the nearest pixels, i.e. the visible pixels. Often, to avoid the excessive overhead of pixel-by-pixel graphics system calls, a separate array of pixel values is created for this algorithm. In this case, the picture is not drawn until all polygons have been processed. Then, all the pixel values are written at once to the display. The 3D screen space is the same as the previous, (h,v,zv).
1) Project polygons into 3D screen space.
2) Initialize pixel and depth buffers (two dimensional arrays).
3) FOR each polygon Pi DO
       FOR each pixel & depth inside Pi DO
           IF depth is closer than stored depth[i,j] THEN
               depth[i,j] = depth
               pixel[i,j] = pixel   (or write the pixel to the screen directly)
           ENDIF
       ENDDO
   ENDDO
4) Draw picture from pixel[][] data (if necessary)
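The inner depth test might look like this in C; HRES, VRES and the buffer layout are illustrative assumptions, and the comparison shown is for a right-handed screen space where larger z is nearer:

#define HRES 1024
#define VRES  768

double        depthbuf[VRES][HRES];    /* initialized to the farthest depth */
unsigned long pixelbuf[VRES][HRES];    /* initialized to the background */

void StorePixel( int h, int v, double depth, unsigned long value )
{
    if( depth > depthbuf[v][h] ) {     /* nearer than what is stored */
        depthbuf[v][h] = depth;
        pixelbuf[v][h] = value;
    }
}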
The z-buffer approach has been implemented in hardware and is used in many high performance graphics cards.
Figure 12-10. Scan-line and Scan-Plane. Viewed in the scan-plane, the segments may overlap. Depth comparisons are made in the overlapping area(s) of segments to find which parts of segments are in front of others, i.e. visible. Overlapping segments are sub-divided and comparisons continue until no overlaps exist.
The resulting non-overlapping segments are rasterized into pixels and finally output to the display device. Then, we proceed to the next scan-line. A segment is rasterized by computing the locations of the individual pixels along the horizontal scan-line at integral h positions and writing these into pixel memory. Due to the properties of lines and polygons in the 3D screen space, the Watkins algorithm actually is edge-driven. That is, most computations are performed in 2D with some additional bookkeeping for the depth values. The Watkins algorithm uses a divide and conquer (non-deterministic) approach for subdividing segments to remove overlaps. Simple segment conditions are processed, and other complex cases are sub-divided until they become simple. Scan-line coherence, the similarity between visible segment locations on one scan-line and the next, is used to eliminate unneeded subdivision. It is the scan-line coherence methodology and segment visibility resolving logic that distinguish the Watkins algorithm from others, and make it one of the fastest (and most complex) HSR algorithms. The Watkins segment rasterization phase can be changed to produce a raster image with only the edges, not the interiors of surfaces. This is done by outputting marker pixels at the ends of resolved segments and background pixels at the interior pixels. The Watkins algorithm can also output lines. Instead of rasterizing the resolved segments on a scan-line, the HLR version of Watkins remembers the segment boundaries, marked by the polygons that generate them, from one scan-line to the next. By maintaining a list of growing lines, adjacent scan-lines can be compared to detect the appearance, disappearance, and continuation of lines. Consider the following example:
    scan-line   Action
    2           New lines A and B.
    3           New line C, extend A & B.
    4           Extend A, B & C.
    5           Extend A, B & C.
    6           End A & C, extend B.
    7           End B.
When comparing the lines to the segments on the current scan-line, there are 3 possibilities: 1. A segment boundary has no counterpart growing line.
   Start a new growing line with start and end positions at the given point.
2. A segment endpoint on the completed scan-line matches a growing line.
   Extend the growing line by setting its endpoint to the given point.
3. A growing line has no counterpart on the completed scan-line.
   The line ends: output it to the display or file being created and remove it from the data structure.
Figure 12-11. Line Clipping Applied to a Polygon.

Clipping the edges of a polygon as independent lines, as in Figure 12-11, fails to maintain closed polygons for later computations such as filling and point containment tests. Polygon clipping appears to be considerably more difficult than line clipping (Figure 12-12).
Figure 12-12. Complex Polygon Clipping. [A concave polygon cut by several clipping boundaries.]

The Sutherland-Hodgman algorithm [Sutherland] is an elegant approach to polygon clipping. The algorithm sequentially processes each edge around the perimeter of a polygon, testing the start and end vertices of the edge against the clipping boundaries. Given the start point of an edge, P1, and the endpoint, P2, there are four cases to consider, as shown in Figure 12-13.

Figure 12-13. Four Clipping Cases for Polygon Clipping. [The invisible region is shaded. Case 1: P1 and P2 visible, output P2. Case 2: P1 visible, P2 invisible, output the intersection I. Case 3: both invisible, no output. Case 4: P1 invisible, P2 visible, output I and then P2.]

The algorithm processes each pair of edge endpoints, P1 and P2, clipping the edge against each boundary. Note that the process for each boundary is the same, i.e. as points are output from one
boundary, they can be processed by the next. Each boundary evaluation is identical to the others, so the algorithm can be programmed using re-entrant code, i.e., one routine that processes P1 and P2 points can call itself. A simpler approach is to clip the whole polygon against one boundary at a time, storing the output vertices in a second buffer (array of points). If any points are in the buffer after clipping on one boundary, then clip on the next boundary, this time reading points from the second buffer and outputting them to the first. Continue clipping on the remaining boundaries, reading from one buffer and storing into the other, toggling input and output each time. The buffers must be dimensioned at least twice the maximum number of vertices allowed per polygon. This can be understood intuitively by considering a polygon shaped like a saw blade. Figure 12-14 shows a pseudo-code illustration of the basic procedure for clipping a polygon against one boundary, specified by an argument of data type boundary that represents the plane coefficient vector [A,B,C,D].
ClipPolygon( boundary B ):
    p1 = last vertex in input buffer
    FOR( p2 = each vertex of input buffer (1st to last) ) DO
        IF State(p1, B) != State(p2, B) THEN
            I = intersection of line p1-p2 with the boundary B
            Append I to the output buffer
        ENDIF
        IF State(p2, B) = IN THEN
            Append p2 to the output buffer
        ENDIF
        p1 = p2
    ENDFOR
Figure 12-14. Pseudo-Code Polygon Clipping Routine. The routine State in Figure 12-14 returns the state of the given point with respect to the given plane, either IN or OUT. An example of State in pseudo-code is shown in Figure 12-15.
State( Vertex P, boundary B ):
    IF P · B < 0 THEN
        State = OUT
    ELSE
        State = IN
    ENDIF
END
For homogeneous polygon clipping, there would be seven planes, starting with WZERO. For example, consider Figure 12-16.

Figure 12-16. Example Polygon and Boundary. [A quadrilateral P0 P1 P2 P3 crossing boundary B; P3 lies on the invisible side, and I23 and I30 mark the intersections of edges P2-P3 and P3-P0 with B.]

The initial vertex buffer would contain the four ordered points { P0, P1, P2, P3 }. After processing by the code in Figure 12-14, the second vertex buffer would contain the five ordered points { I30, P0, P1, P2, I23 } that represent the clipped polygon. Other approaches to polygon clipping have been published [Liang][Vatti].
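A sketch of the buffer-toggling driver described above, in C; ClipOneBoundary is Figure 12-14 in code form, MAXVERTS is an assumed limit, and point4, PlaneDot and ClipToPlane are the helpers sketched in the clipping chapter:

#define MAXVERTS 32    /* assumed maximum vertices per polygon */

/* Figure 12-14 for one boundary b: returns the output vertex count. */
int ClipOneBoundary( point4 in[], int nin, const double b[4], point4 out[] )
{
    int i, nout = 0;
    point4 p1 = in[nin - 1];           /* last vertex in the input buffer */
    for( i = 0; i < nin; i++ ) {
        point4 p2 = in[i];
        if( (PlaneDot( p1, b ) < 0.0) != (PlaneDot( p2, b ) < 0.0) )
            out[nout++] = ClipToPlane( p1, p2, b );   /* boundary crossing */
        if( PlaneDot( p2, b ) >= 0.0 )                /* State(p2) = IN */
            out[nout++] = p2;
        p1 = p2;
    }
    return nout;
}

/* Clip against all nplanes boundaries, toggling between two buffers. */
int ClipPolygonAll( point4 poly[], int n, const double planes[][4],
                    int nplanes, point4 result[] )
{
    point4 buf[2][2 * MAXVERTS];       /* twice the maximum vertex count */
    int i, cur = 0;
    for( i = 0; i < n; i++ )
        buf[0][i] = poly[i];
    for( i = 0; i < nplanes && n > 0; i++ ) {
        n = ClipOneBoundary( buf[cur], n, planes[i], buf[1 - cur] );
        cur = 1 - cur;                 /* toggle input and output buffers */
    }
    for( i = 0; i < n; i++ )
        result[i] = buf[cur][i];
    return n;
}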
Figure 12-17. Pixel Grid and Polygon. [Grid locations are the coordinates of pixels. In some systems, this is the center of a pixel; in others, it may be a corner.]

2. Sort these intersections by increasing x. Due to the arbitrary ordering of vertices around a polygon, the intersections will not generally be ordered left to right (increasing value).
3. Group the ordered intersections into pairs to form scan-line segments that lie inside or on the polygon. This is commonly called the even-odd rule. For any closed object, as an infinite line crosses its boundaries, it toggles from outside to inside.
Figure 12-18. Segments Formed by Intersections of Scan-Lines and a Polygon. [A polygon with edges e1-e4 on the pixel grid; some scan-lines form one segment, others two.]

4. Rasterize each of the segments by marching horizontally along the segment from (Xstart, Yscan) to (Xend, Yscan), assigning pixel values at pixel locations, i.e. grid coordinates.
Exact segment x coordinates must be rounded to the nearest integer pixel coordinates. Care must be taken to round properly. In general, the start (left) segment x should be rounded to the next larger integer value, and the end (right) segment x should be rounded to the next smaller integer value, as shown in Figure 12-19. The library routine CEIL(X) provides the next larger integral value.

Figure 12-19. [Rounding segment ends to pixel center locations; for example, an Xend of 3.8 rounds down to pixel 3.]

Figure 12-20. Vertex Intersections Along a Scan-Line. [A vertex where the previous and next edges meet exactly on a scan-line produces two coincident intersections.]

Another approach is simply not to allow vertices to have y values on a scan-line by
adjusting any offending y values, or all y values, prior to scan-converting the polygon. This requires that coordinates be stored as real (or rational) values. If a y value lies exactly on a scan-line, add a small amount to move it between scan-lines. The amount should be large enough to avoid scan-line intersections but small enough to ensure that the rounded values fall on the same scan-line. Some approaches simply add 0.5 to all y coordinates. Because the scan-converting process rounds pixels to the nearest integral x and y coordinates anyway, image quality will not suffer (for our purposes, anyway).
Figure 12-21. Segments and the Scan-Plane.

Each intersection now yields a pair (Xi, Zi): Xi is the X coordinate as before, but Zi is the Z value at the intersection. This list is sorted by Xi as before, and paired into segments. The segments are now 3D lines in the plane Y = Yscan, from (Xstart, Yscan, Zstart) to (Xend, Yscan, Zend). As the line is rasterized into integral X pixel addresses, the corresponding z depth at the rounded pixel x values must be computed (Figure 12-22). This z depth can be used in depth comparisons for hidden surface removal.
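A sketch of the per-pixel depth computation, a linear interpolation in x (names are illustrative):

/* Linearly interpolate the z depth at pixel x along a 3D scan-line segment. */
double segment_depth(double xstart, double zstart,
                     double xend,   double zend, int x)
{
    double t = (x - xstart) / (xend - xstart);   /* 0 at start, 1 at end */
    return zstart + t * (zend - zstart);
}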
12.8 References
1. Sutherland, I. E. and Hodgman, G. W., "Reentrant Polygon Clipping," Communications of the ACM, Vol. 17, No. 1, 1974.

2. Liang, Y. and Barsky, B. A., "An Analysis and Algorithm for Polygon Clipping," Communications of the ACM, Vol. 26, No. 11, November 1983.

3. Vatti, B. R., "A Generic Solution to Polygon Clipping," Communications of the ACM, Vol. 35, No. 7, July 1992.
Figure 13-1. HSR Rendered Image.

So far we have ignored color variations due to lights, geometry and material. We have determined the visible pixels in the image, but have not assigned color values to the pixels that indicate 3D effects. Creating a realistic image of 3D objects requires more than hidden surface determination. It also requires simulating the interaction of light with the geometry and surface properties of the objects, which determines the light that reaches our eyes. Recalling the discussion of color, light is visible radiation that interacts with surfaces in complex ways. Incident light from many sources irradiates the surface of an object of a particular material. Light is absorbed by, reflected from, and possibly transmitted through, the surface. This behavior is determined by the geometry of the surface and its material properties, including its color. The terms illumination model, lighting model and reflectance model all describe models of the interaction of light and geometric objects. For our purposes, we will use the term illumination model for the mathematical model used to compute the color of a pixel corresponding to a visible point on the surface of an object. The shading model combines an illumination model
with methods for varying the color across the surface of objects in an efficient manner.
light can be approximated by the cosine curve shown in Figure 13-2, where L is the unit direction vector from the point in question toward the light, N is the unit surface normal vector, and θ is the angle between L and N. We know that cos θ = L · N if L and N are unit vectors. The intensity of reflected light for diffuse reflection is Id = IL Kd cos(θ), where IL is the incident light and Kd is a surface property representing how much incident light is reflected diffusely from a given surface. Usually, Kd is a value ranging from 0.0 to 1.0.
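A small C sketch of the diffuse term; the Vec3 type and dot product are illustrative helpers, reused by the later sketches as well:

typedef struct { double x, y, z; } Vec3;

static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* Diffuse intensity Id = IL * Kd * (L . N), clamped to zero when the
   surface faces away from the light (cos(theta) < 0). */
double diffuse(double IL, double Kd, Vec3 L, Vec3 N)
{
    double c = dot(L, N);          /* cos(theta); L and N unit vectors */
    return (c > 0.0) ? IL * Kd * c : 0.0;
}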
Figure 13-3. Specular Light Reflection.

If the surface is not a perfect reflector, as most are not, then the reflected light is scattered around RL and the intensity diminishes as the eye moves away from the direction of perfect reflection, RL. There are two common approaches to simulating specular reflection. The first, and least commonly used today, involves the angle α between the ideal reflection direction, RL, and the vector to the eye, E, as shown in Figure 13-4. The specular component of reflected light is Is = IL Ks cos^n(α), where Ks is a surface property representing the amount of incident light specularly reflected from
Figure 13-4. Specular Reflection Vectors.

a surface, n is the specular exponent, a surface property representing the distribution of specularly reflected light about the reflection vector RL (values usually are 2 to 100), and α is the angle between RL and E. In this case, it is computed using cos α = RL · E if RL and E are unit vectors. Given L and N, the vector RL is computed by sweeping L through N:

A = (N · L) N
B = A - L
RL = A + B = 2A - L = 2 (N · L) N - L
Figure 13-5. Reflection Vector RL Computations.

Note that it is possible to have RL · E < 0, which cannot be allowed: it means cos(α) < 0, so α > 90°, making reflection toward the eye impossible. In this case, the specular reflection should be set to 0. Instead of using the RL vector for computing the specular component, a more popular approach is to define another vector, H, the unit bisector of L and E, as shown in Figure 13-6. H is computed using vector operations: H = (E + L) / || E + L ||, assuming E and L are unit vectors.
This simplifies the computations considerably and eliminates the logical problem of RL · E < 0.
Figure 13-6. H Vector Computations.

If H and N are unit vectors, the cosine of the angle between them is H · N, and the specular component of reflected light is Is = IL Ks (H · N)^n. This is the most common method for specular reflection.
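A sketch of the specular term using H, reusing the Vec3 helpers from the diffuse sketch:

#include <math.h>

/* Specular term with the half-vector H = (E + L)/||E + L||.
   Is = IL * Ks * (H . N)^n, zero when the cosine is negative. */
double specular(double IL, double Ks, double n, Vec3 L, Vec3 E, Vec3 N)
{
    Vec3 H = { L.x + E.x, L.y + E.y, L.z + E.z };
    double len = sqrt(dot(H, H));
    H.x /= len;  H.y /= len;  H.z /= len;        /* normalize H */
    double c = dot(H, N);
    return (c > 0.0) ? IL * Ks * pow(c, n) : 0.0;
}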
The ambient component of reflected light is I = Ka Ia, where Ka is a surface property representing how much surrounding ambient light a particular surface reflects, and Ia is a global illumination property, representing the amount of ambient light in the lighting environment surrounding the objects. In this case, it is the intensity of ambient light in the environment.
Likewise, the surface model coefficients become coefficient vectors:

Ka = [ Ka,red  Ka,green  Ka,blue ]
Kd = [ Kd,red  Kd,green  Kd,blue ]
Ks = [ Ks,red  Ks,green  Ks,blue ]
The full color lighting model is now a vector equation:

I = Ka Ia + IL Kd (L · N) + IL Ks (H · N)^n

The products of trichromatic vectors, for example Ka Ia above, are computed by multiplying corresponding array elements to produce a vector as the result. This is sometimes called "array multiplication." Similarly, the product IL Kd (L · N) above is a vector multiplied by a scalar, producing a vector as a result.
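Combining the terms per color channel, a sketch of the full trichromatic model; Color is an illustrative RGB triple, and diffuse and specular are the earlier sketches:

typedef struct { double r, g, b; } Color;

/* Evaluate I = Ka*Ia + IL*Kd*(L.N) + IL*Ks*(H.N)^n per channel.
   The Ka*Ia product is element-wise ("array multiplication"). */
Color shade(Color Ka, Color Kd, Color Ks, double n,
            Color Ia, Color IL, Vec3 L, Vec3 E, Vec3 N)
{
    double d = diffuse(1.0, 1.0, L, N);         /* clamped (L . N)   */
    double s = specular(1.0, 1.0, n, L, E, N);  /* clamped (H . N)^n */
    Color I;
    I.r = Ka.r*Ia.r + IL.r*(Kd.r*d + Ks.r*s);
    I.g = Ka.g*Ia.g + IL.g*(Kd.g*d + Ks.g*s);
    I.b = Ka.b*Ia.b + IL.b*(Kd.b*d + Ks.b*s);
    return I;
}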
The direction of incident light is computed similarly to the point light source, using the spotlight position and the point in question. The intensity of the incident light, however, is varied according to the geometry of the spotlight direction, the spotlight cone angle and the incident light direction using the attenuation function. These computations are similar in nature to the computations for diffuse reflection, where the angle between the incident light direction and the spotlight direction determines the factor applied to the spotlight color to determine the incident light intensity. The light intensity is a maximum along the spotlight direction, falling off to a minimum value at the spotlight cone angle.
Figure 13-8. Examples of Flat and Smooth-Shaded Cylinders.

If exact surface normals can be computed from the surface equations, such as the equations of a cylinder as shown in Figure 13-8, then these normals are stored at the vertices and used for the lighting computations. When the surface equations are not
known, as is the case in systems that approximate all surfaces with planar polygons, then at the shared vertices where facets from the same surface meet, the geometric normals are averaged to form a lighting normal that will be used for intensity computations.

Figure 13-9. Normal Averaging for Smooth Shading.

The averaged normal at a shared vertex is the vector average of the plane normals of all polygons sharing the vertex, normalized to unit length. In Figure 13-9, for example,

N12 = ( N1 + N2 ) / || N1 + N2 ||   and   NA = ( Σ i=4..8 Ni ) / || Σ i=4..8 Ni ||
Scan-line computations must now linearly interpolate the color of pixels based on the intensity or trichromatic color components across a scan-line segment and along the edges from one scan-line to the next. For example, in Figure 13-10, the colors of pixels along the indicated scan-line are a linear interpolation of the colors IL and IR, which are themselves linear interpolations along the edges I1-I2 and I1-I3, respectively:

IL = I1 + t12 ( I2 - I1 )   and   IR = I1 + t13 ( I3 - I1 )

where t12 and t13 are parameters that vary from 0 to 1 along each edge. The intensity across segment IL-IR is interpolated with a parameter s that varies from pixel to pixel across the scan-line:

Is = IL + s ( IR - IL )

Unfortunately, Gouraud shading produces noticeable Mach bands, intensity patterns due to discontinuities in the intensity variations across polygon boundaries. B. T. Phong suggested interpolating the normal itself, instead of the intensity.
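The per-segment step is a simple linear blend; as a sketch, reusing the Color type:

/* Linearly interpolate a color across a scan-line segment:
   s is 0 at the left end (IL) and 1 at the right end (IR). */
Color lerp_color(Color IL, Color IR, double s)
{
    Color I;
    I.r = IL.r + s * (IR.r - IL.r);
    I.g = IL.g + s * (IR.g - IL.g);
    I.b = IL.b + s * (IR.b - IL.b);
    return I;
}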
Figure 13-10. Intensity Interpolation While Filling a Polygon.

This reduces the Mach bands produced by Gouraud shading. Notice the nonlinear relation between the intensity I and the normal N in the lighting model equation. Phong shading, or normal interpolation, produces smoother images, but requires more complicated per-pixel computations: the full lighting model must be computed for each pixel.
Procedure RayTrace:
    FOR each pixel on the screen in world coordinates DO
        Compute the vector ray equation through the pixel
        Set the "visible_hit" parameter to "none"
        FOR each object in the scene DO
            Compute the ray intersection(s) with the object
            IF an intersection is nearer than visible_hit THEN
                Set visible_hit to the new intersection parameters
            ENDIF
        ENDFOR
        IF there was a visible_hit THEN
            Set the pixel color using the illumination model
        ELSE
            Set the pixel color to the background color
        ENDIF
    ENDFOR

The ray tracing process simplifies and enhances the rendering process considerably:

1. There is no need to transform the geometry into screen coordinates because rays can be cast just as easily in world coordinates.

2. Projection is a by-product of the process because each ray is a projector. If the rays emanate from the eye point, then the resulting image will have perspective projection. If the rays are all parallel and perpendicular to the screen, the image will appear to be orthographically projected.

3. Clipping is not needed because rays are cast only through pixels on the screen. In addition, intersections with geometry at or behind the eye, i.e. with zero or negative ray parameters, are easily ignored as invisible.

4. The world coordinates of the intersection point and the surface normal at this point are available for illumination model computations and more sophisticated rendering effects.

5. More complex geometry than simple polygons can be used because the computations only require the intersection with a ray.

Before we abandon scan-line and z-buffer rendering, however, note a significant disadvantage of ray tracing: it is many times more computationally expensive. Ray tracing is
elegant, but very time-consuming, compared to scan-line methods. Indeed, much of the research in ray tracing is directed towards improving the computation times.
13.3.1 Shadows
One rendering effect that adds considerable realism to an image is shadows. Shadows occur when light from a light source is obscured from portions of the surfaces of objects by other objects. We have simply ignored this phenomenon so far. Shadows modify the behavior of the illumination model. Ray tracing facilitates shadow computations. After the visible hit point, called Q from now on, has been computed from the ray emanating from the eye, another ray is cast from Q toward each light source. If this shadow ray intersects any object before reaching the light (see Figure 13-12), then this light source cannot contribute any light reflection toward the eye for this pixel.
Figure 13-12. Shadow Ray Vectors.

In a ray tracing program, the shadow ray is "just another ray" to be intersected with the geometry of the scene. It is possible to have the same function process the initial ray and the shadow ray. This contributes to the elegance of the process. The shadow ray, however, must be processed differently than the initial ray from the eye. The ray parameter for the initial ray is only used for relative distance comparisons, so its magnitude is generally unimportant: it is a true ray with a semi-infinite parameter range (t > 0), and the magnitude of the V direction vector in the initial ray equation is not important. For the shadow
ray, however, it is necessary to know if any object intersects the ray between Q and the light. This means that the shadow ray is really a segment, with upper and lower parameter bounds. Typically the shadow ray equation is: P(t) = Q + t (PL - Q) for 0 < t <= 1. If the visible hit point is determined to be shadowed from a light source, then only ambient light reflects from the point. Therefore, the illumination model is truncated to the ambient portion for this light for this point: I = Ka Ia.
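A sketch of the shadow test under these conventions; intersect_scene() is a hypothetical routine returning the nearest positive hit parameter (or a negative value for no hit), and the small EPSILON avoids re-hitting the surface at Q:

#define EPSILON 1e-6

extern double intersect_scene(Vec3 origin, Vec3 dir);  /* hypothetical query */

/* Returns 1 if point Q is shadowed from the light at PL.  The shadow
   ray P(t) = Q + t (PL - Q) is a segment with 0 < t <= 1. */
int in_shadow(Vec3 Q, Vec3 PL)
{
    Vec3 V = { PL.x - Q.x, PL.y - Q.y, PL.z - Q.z };
    double t = intersect_scene(Q, V);
    return (t > EPSILON && t <= 1.0);   /* hit strictly between Q and light */
}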
Figure 13-13. Reflection Ray Vectors.

The direction R is found with vector computations similar to the specular reflection vector RL discussed in section 13.1.2: R = 2 (E · N) N - E. If the object is transparent, that is, light can be transmitted through it, a ray can be cast
through the surface to simulate the refracted light reaching the pixel. The direction of the transmitted ray, or refraction ray, is determined by Snell's law of refraction, which relates θt, the angle between the transmitted ray T and the normal N, and nt, the coefficient of refraction of the transmitted medium, to θi, the angle of the incident ray, and ni, the coefficient of refraction of the incident medium (Figure 13-14):

ni sin θi = nt sin θt
Figure 13-14. Vectors for Refracted Light.

Each reflected and transmitted ray is further processed as another incident ray, in turn spawning its own reflections and refractions. The result is a ray producing process that creates a binary tree of rays, all to produce the color for one pixel. The resulting data structure resembles a tree of intersection data nodes, each node holding the surface properties (Ka, Kd, Ks, n), the lighting model geometry (Q, N, E, L), and the shadow results per light (Figure 13-15).

Figure 13-15. Tree of Intersection Data Nodes.

Once the tree has been computed, the resulting intensity (color) must be computed for this pixel. This is done, typically using a recursive
procedure, for each data node (bottom up), using a formula of the form:

Inode = Ki Ii + Kr Ir + Kt It

where:
Ii is the intensity computed with the node's data for the incident ray and the light sources using an illumination model,
Ir is the intensity from the reflection ray branch of the tree,
It is the intensity from the transmission ray branch of the tree,
Ki, Kr, Kt are the surface properties that determine the contribution of each ray (and sum to less than or equal to 1.0).

In theory, parallel and perfectly reflecting objects can create an infinite number of rays. In practice, the surface properties are used during the ray casting process to attenuate (decrease) the "intensity" of each ray at each intersection. When the ray intensity of one of the rays (reflection or refraction) decreases below a certain limit, no new rays are cast. Alternatively, some ray tracing programs simply stop casting rays after a certain tree depth is reached.
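A sketch of the bottom-up evaluation as a recursive C function over a hypothetical node structure:

typedef struct RayNode {
    double Ki, Kr, Kt;            /* contribution of each ray              */
    double Ii;                    /* illumination-model intensity at hit   */
    struct RayNode *reflect;      /* reflection branch, or NULL            */
    struct RayNode *transmit;     /* transmission branch, or NULL          */
} RayNode;

/* Combine intensities bottom-up: Inode = Ki*Ii + Kr*Ir + Kt*It. */
double node_intensity(const RayNode *n)
{
    if (n == NULL)
        return 0.0;
    return n->Ki * n->Ii
         + n->Kr * node_intensity(n->reflect)
         + n->Kt * node_intensity(n->transmit);
}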
Figure 13-16. Vectors for Ray Geometry.

The viewport (screen) pixels must be mapped onto a grid in this window. S then becomes
the starting pixel location. (The other corners could do as well, of course.) Let the unit direction vectors of the eye coordinate system in XYZw be called xe, ye, ze. Given these, AIM, and wsx and wsy (the window half-sizes), S is found using vector operations in the image plane in XYZw space:

S = AIM - wsx xe - wsy ye

Let DX be the world vector between the centers of two adjacent pixels on the same scan-line. That is, DX is a horizontal step. Given wsx, wsy and the viewport resolutions (number of pixels across and down the screen), DX can be found as shown in Figure 13-17. With S, DX and
Figure 13-17. DX and DY Vector Computations.

DY, it is possible to march across the screen pixel by pixel, tracing rays at each location. The ray equation is P(t) = P + t V. For perspective projection, all rays start at the eye, so P = EYE. Each ray passes through a pixel, so these are the two points on the line: the direction vector is V = (Pi - EYE), where Pi is the given pixel location. For orthographic projection, P is formed with the same [x,y] coordinates as Pi and a z coordinate that is appropriate for clipping points behind the eye. In other words, select the z coordinate of P such that ray intersections with ray parameter t <= 0 are invisible. All rays are parallel and perpendicular to the screen, so V = -K = [0,0,-1,0], a constant. Advancing to the next pixel on a scan-line is a computation using the vector DX:

Pi+1 = Pi + DX   for i = 0..Resx-1
Advancing to the next scan-line can be done using one of two methods. Given that Pi is at the last pixel on a scan-line, advance to the next by returning horizontally to the first pixel location, then moving up to the next scan-line:

Pi+1 = ( Pi - 2 wsx xe ) + DY

Another approach is to save the location of the first pixel on the current scan-line, thereby avoiding the computations above. Call this Si. Advance to the next scan-line simply by computing:

Pi+1 = Si + DY
Si = Pi+1
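A sketch of the pixel-marching loop for perspective rays; resx, resy and the trace() routine are illustrative:

extern Color trace(Vec3 origin, Vec3 dir);   /* hypothetical ray evaluation */

/* March across the screen: S is the first pixel, DX and DY the pixel steps. */
void render(Vec3 EYE, Vec3 S, Vec3 DX, Vec3 DY, int resx, int resy)
{
    Vec3 Si = S;                             /* first pixel of current line */
    for (int y = 0; y < resy; y++) {
        Vec3 Pi = Si;
        for (int x = 0; x < resx; x++) {
            Vec3 V = { Pi.x - EYE.x, Pi.y - EYE.y, Pi.z - EYE.z };
            Color c = trace(EYE, V);         /* perspective: rays from eye  */
            /* ... store c at pixel (x, y) ... */
            (void)c;
            Pi.x += DX.x;  Pi.y += DX.y;  Pi.z += DX.z;
        }
        Si.x += DY.x;  Si.y += DY.y;  Si.z += DY.z;   /* next scan-line */
    }
}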
Figure 13-18. Concave and Convex Polygons and Vertices Forming Their Boundaries.

First consider point containment for convex polygons.
Figure 13-19. Convex Polygon as the Intersection of Half-Spaces Formed by Lines.

This is the basis for the quick-rejection algorithm shown in Figure 13-20.

FOR each edge Pi, Vi DO
    IF [ (Q - Pi) x Vi ] · N > 0 THEN
        RETURN OUTSIDE
    ENDIF
ENDFOR
RETURN INSIDE

Figure 13-20. Point Containment Algorithm for Convex Polygons.
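A C sketch of this test, assuming consistently ordered vertices P[] with edge vectors Vi = Pi+1 - Pi and a polygon normal N; the cross product helper is added here:

static Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 c = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
    return c;
}

/* Quick-rejection containment test for a convex polygon. */
int inside_convex(Vec3 Q, const Vec3 *P, int n, Vec3 N)
{
    for (int i = 0; i < n; i++) {
        Vec3 Vi = { P[(i+1)%n].x - P[i].x,
                    P[(i+1)%n].y - P[i].y,
                    P[(i+1)%n].z - P[i].z };
        Vec3 QP = { Q.x - P[i].x, Q.y - P[i].y, Q.z - P[i].z };
        if (dot(cross(QP, Vi), N) > 0.0)
            return 0;                       /* outside this half-space */
    }
    return 1;                               /* inside all half-spaces  */
}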
A complication for such tests is ray intersections with vertices. The algorithm in Figure 13-22 simply restarts if the ray intersects a vertex.
total = 0
FOR each vertex Pi of the polygon DO   /* including the edge Pn-1 to P0! */
    Vi   = Pi - Q
    Vi+1 = Pi+1 - Q
    sine   = N · ( Vi x Vi+1 )
    cosine = Vi · Vi+1
    IF |sine| ≈ 0 AND |cosine| ≈ 0 THEN
        RETURN INSIDE                  /* on vertex */
    ENDIF
    θi = ATAN2(sine, cosine)
    IF θi ≈ ±π THEN
        RETURN INSIDE                  /* on edge */
    ENDIF
    total = total + θi
ENDFOR
IF |total| ≈ 2π THEN    /* a large tolerance, +/- 0.5, can be used here */
    RETURN INSIDE
ELSE
    RETURN OUTSIDE
ENDIF
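A 2D C sketch of the angle-summation test; the tolerances are illustrative:

#include <math.h>

#define PI 3.14159265358979323846

typedef struct { double x, y; } Vec2;

/* Sum the signed angles subtended at Q by each edge; a total near
   2*pi means Q is inside, near 0 means outside. */
int inside_polygon(Vec2 Q, const Vec2 *P, int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        Vec2 v1 = { P[i].x - Q.x, P[i].y - Q.y };
        Vec2 v2 = { P[(i+1)%n].x - Q.x, P[(i+1)%n].y - Q.y };
        double sine   = v1.x*v2.y - v1.y*v2.x;   /* 2D cross product */
        double cosine = v1.x*v2.x + v1.y*v2.y;   /* dot product      */
        if (fabs(sine) < 1e-12 && fabs(cosine) < 1e-12)
            return 1;                            /* Q is on a vertex */
        double theta = atan2(sine, cosine);
        if (fabs(fabs(theta) - PI) < 1e-9)
            return 1;                            /* Q is on an edge  */
        total += theta;
    }
    return fabs(total) > PI;   /* generous threshold between 0 and 2*pi */
}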
Open data may be oriented or not, meaning that surface normals may or may not be used to determine the front or back of a surface. For example, for a single plane, as the eye position is rotated around the data, the plane normal will sometimes face the observer and other times face away from the observer, determined by the sign of E · N. For un-oriented data, the proper surface normal must be selected for the lighting normal when the viewing position changes. In effect, this makes each surface a thin solid. Consider an edge-on view of the visible surface at the point of ray intersection, Q, as shown in Figure 13-25.
Figure 13-25. Possible Normals for Different Eye Positions for a Surface.

Consider the different combinations of the two different eye positions (labelled "Eye a" and "Eye b"), two light positions ("Light a" and "Light b") and two different surface normals ("Na" and "Nb"). The objective is to select the normal, Na or Nb, that gives the proper lighting effect. There are four cases to consider depending on the eye and light locations. Select a lighting normal Nl from the surface normal, N, as follows:
Nl = current surface normal
IF E · Nl < 0 THEN
    "flip lighting normal": Nl = -Nl
ENDIF
IF L · Nl < 0 THEN
    "eye hidden from light": use ambient light only
ELSE
    "light is visible to eye": use the full lighting model
ENDIF
After classification, there will be either:

1. no points in { Qi } that classify as IN, in which case the ray misses the polyhedron, or

2. two points in { Qi } that classify as IN. In this case, let Q1 be the point with the smallest ti, the enter point, and Q2 be the other, the exit point. The normal for the enter point is the plane normal; the normal for the exit point is the negative of the plane normal.

A second approach is even simpler and faster. Observe that if the ray intersects the convex
polyhedron, the entering t value must be less than the exiting t value. Conversely, if the ray intersects the infinite planes made by the faces of the object outside of the face boundaries, the entering t value will not be less than the exiting value. The entering intersection will have V · N < 0, and the exiting V · N > 0 (see Figure 13-27). The algorithm shown in Figure 13-28 combines the two steps in the previous algorithm into one.
Figure 13-27. Entering (t1, Q1) and Exiting (t2, Q2) Intersections.
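A sketch of the combined computation, keeping the largest entering t and the smallest exiting t; it reuses the Plane type from the clipping sketch, and assumes outward face normals [A,B,C] with inside points giving a negative plane value:

/* Ray-convex polyhedron intersection over nf face planes f[].
   A hit exists when the entering t stays below the exiting t. */
int ray_convex(Vec3 P, Vec3 V, const Plane *f, int nf,
               double *t_in, double *t_out)
{
    *t_in  = 0.0;                     /* semi-infinite ray: t > 0 */
    *t_out = 1e30;
    for (int i = 0; i < nf; i++) {
        double vn = V.x*f[i].a + V.y*f[i].b + V.z*f[i].c;           /* V . N */
        double pn = P.x*f[i].a + P.y*f[i].b + P.z*f[i].c + f[i].d;
        if (vn == 0.0) {
            if (pn > 0.0) return 0;   /* parallel and outside this plane */
            continue;
        }
        double t = -pn / vn;
        if (vn < 0.0) { if (t > *t_in)  *t_in  = t; }   /* entering */
        else          { if (t < *t_out) *t_out = t; }   /* exiting  */
    }
    return *t_in < *t_out;
}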
t = ( -b ± sqrt( b² - 4ac ) ) / ( 2a )

The real roots of this equation, if any, are the parameter values of the intersection points of the ray and the conic. The discriminant inside the radical, b² - 4ac, determines the solution:

1. if it is negative, there are no real roots and hence no intersection between the ray and conic,

2. if it is 0, there are two identical real roots, so the intersection is a point of tangency between the ray and conic,

3. if it is positive, there are two real roots corresponding to the two intersection points between the ray and the conic.

Generally, two identical roots can be handled the same as two distinct roots. The ray-conic intersection routine simply checks the sign of the discriminant and returns no hits if it is negative, or 2 hits at t1 and t2 if it is positive. The values for t1 and t2 are the minimum and maximum of the two roots, respectively.
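A sketch of the root computation, assuming a > 0 so the roots come out ordered:

#include <math.h>

/* Solve a t^2 + b t + c = 0 for the ray-conic intersection parameters.
   Returns 0 (miss) or 2 (hits at *t1 <= *t2; tangency gives t1 == t2). */
int ray_conic_roots(double a, double b, double c, double *t1, double *t2)
{
    double disc = b*b - 4.0*a*c;
    if (disc < 0.0)
        return 0;                     /* no real roots: no intersection */
    double s = sqrt(disc);
    *t1 = (-b - s) / (2.0*a);
    *t2 = (-b + s) / (2.0*a);
    return 2;
}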
To find the normal vector to a conic at a given point, Q, one begins with the basic equation of the transformed conic. Letting P = [x,y,z,1] and A be the transformed coefficient matrix,

P A Pᵀ = f(x, y, z) = 0
The normal vector is found using the vector gradient operator, ∇, defined as:

N = ∇f(x, y, z) = (∂f/∂x) i + (∂f/∂y) j + (∂f/∂z) k, evaluated at P = Q

The only variables involved in the partial derivatives are x, y and z in P. For example, evaluating
∂P/∂x = [ 1 0 0 0 ] = i

∂/∂x ( P A Pᵀ ) = i A Pᵀ + P A iᵀ = 2 ( P A ) · i

Similarly,

∂/∂y ( P A Pᵀ ) = 2 ( P A ) · j
∂/∂z ( P A Pᵀ ) = 2 ( P A ) · k

Summing the gradient,

N = ∇( P A Pᵀ ) = ( 2 ( P A ) · i ) i + ( 2 ( P A ) · j ) j + ( 2 ( P A ) · k ) k

Dropping the factor of 2 (N is normalized anyway) and combining the vector components into a matrix product, the result is:

N = Q A

The w coordinate of N must be set to 0.
Figure 13-29. CSG Set Operations Illustrated with 2D Areas: Union (A ∪ B), Difference (A - B), and Intersection (A ∩ B).

The two-state truth tables for the Boolean operators are:

UNION (inside A OR inside B):
            B=T   B=F
    A=T      T     T
    A=F      T     F

DIFFERENCE (inside A AND NOT inside B):
            B=T   B=F
    A=T      F     T
    A=F      F     F

INTERSECTION (inside A AND inside B):
            B=T   B=F
    A=T      T     F
    A=F      F     F
Figure 13-31. Boolean Difference Applied to Volumes.

CSG models can be rendered with ray tracing to create realistic images of complex objects composed of Boolean combinations of simpler primitives. This involves:

1. computing the intersections of the ray with each of the primitives and storing them (parameter value, object identifier, etc.),

2. sorting the intersections by increasing ray parameter value,

3. evaluating the resulting intersection list using the CSG expression, i.e., the list of primitives and the Boolean operators applied to them:

primitive1 + primitive2 - primitive3 + ... primitiven

To evaluate the intersections using the CSG expression, consider the ray as the real line starting at the left at -∞ (far behind the eye), and proceeding to the right to +∞ (far in front of the eye), as illustrated in Figure 13-33.
Figure 13-33. Ray Intersections (t1 ... t6) Shown as Points on the Real Line.

For each primitive, its current ray state is maintained as a Boolean variable Si for i = 1..#objects. Initially, assume the ray begins outside all primitives, so Si = FALSE. If it is possible for the ray to start inside a primitive, then additional computations are needed to determine the initial Si for each primitive. Begin processing each intersection along the ray, nearest to farthest (left to right on the real line), evaluating the Boolean expression using the current ray states of each primitive, Si. As each intersection is reached, denoted by the parameter tj, toggle the ray state of the corresponding primitive, Sj. Toggling means to change TRUE (IN) to FALSE (OUT), and vice versa, corresponding to entering the primitive on the first hit and leaving it on the second. The process terminates when:

1. the expression evaluates to TRUE (IN); this is the visible intersection, or

2. the last intersection is passed without evaluating to IN; there is no intersection with the CSG object. (Consider the ray that intersects only a negative object.)

This is illustrated in Figure 13-34. Given the individual ray states of the primitives, evaluating the Boolean expression utilizes the truth tables. Consider the CSG expression as a list of primitives, each with a SENSE of "+" or "-". An algorithm for evaluating the expression is shown in Figure 13-35.
Assign initial S values to each primitive
FOR each intersection Ii in ascending ray parameter order DO
    Si = NOT Si
    IF EvaluateCSGObject() == TRUE THEN
        IF ray parameter value of Ii < 0.0 THEN
            RETURN "no hit: behind the eye"
        ELSE
            RETURN "visible hit is intersection Ii"
        ENDIF
    ENDIF
ENDFOR
RETURN "no hit"
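A C sketch of the toggling loop; the sorted intersection list and the expression evaluator eval_csg() (which would apply the truth tables to the state array S[]) are assumed to exist:

typedef struct { double t; int prim; } Hit;      /* sorted by t ascending */

extern int eval_csg(const int *S);               /* hypothetical evaluator */

/* Walk the sorted intersection list, toggling primitive states.
   Returns the index of the visible hit, or -1 for no hit. */
int first_visible_hit(const Hit *hits, int nhits, int *S)
{
    for (int i = 0; i < nhits; i++) {
        S[hits[i].prim] = !S[hits[i].prim];      /* enter/leave primitive */
        if (eval_csg(S)) {
            if (hits[i].t < 0.0)
                return -1;                       /* behind the eye */
            return i;                            /* visible hit    */
        }
    }
    return -1;                                   /* no hit */
}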
Figure 13-36. Texture Space.

Texture mapping involves relating texture coordinates to pixels during the rendering process. Given a pixel resulting from rendering an object's surface, the color of the pixel can be computed from the related texel value in several ways, for example:

1. The texel value becomes the pixel value. For example, the texture map is an array of pixels from a digitized picture and the picture is mapped onto the surface.
2. The texel value is a scale factor applied to the pixel value. For example, the texture map is an array of intensity factors that alter the color pattern of the surface to create the desired effect.

3. The texel value is an opacity (transparency) value or factor that determines the opacity of the pixels of the surface.
Usually, the texture coordinates are assigned in some way to a surface and are interpolated during rendering. For example, the texture coordinates of points on a sphere can be related to the two angles of spherical coordinates, the texture coordinates of a parametric surface can be the parameters of the surface, and so on. For polygons, texture coordinates could be assigned to each polygon vertex based on some desired effect. Then, the texture coordinates, (s,t), for vertices would be clipped with the edges of the polygon during polygon clipping, and interpolated during rasterization. For scan-line rendering algorithms, such as z-buffer hidden surface removal, the texture coordinates would be linearly interpolated along the edges of polygons and across segments during scan conversion. For ray-tracing, the texture coordinates of the visible ray intersection point, Q, would be interpolated from the surface or object vertices. When ray-tracing triangles, for example, barycentric interpolation is used to linearly interpolate the texture coordinates of any point Q inside a triangle from the texture coordinates at the 3 vertices. One problem with texture mapped images is aliasing, the appearance of jagged edges of objects. This occurs because the mapping of texels to pixels is highly non-linear, and one pixel may map to one or more texels as illustrated in Figure 13-37.
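For the triangle case, a sketch of barycentric interpolation of (s,t), assuming the barycentric weights of Q are already computed and sum to 1:

typedef struct { double s, t; } TexCoord;

/* Interpolate texture coordinates at an interior point from the three
   vertex coordinates, weighted by the barycentric weights of the point. */
TexCoord tex_at(TexCoord v0, TexCoord v1, TexCoord v2,
                double w0, double w1, double w2)
{
    TexCoord tc;
    tc.s = w0*v0.s + w1*v1.s + w2*v2.s;
    tc.t = w0*v0.t + w1*v1.t + w2*v2.t;
    return tc;
}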
During rendering, the texture coordinates of a certain pixel are computed with some interpolation method as discussed above. These coordinates are real numbers resulting from the pipeline transformations, clipping, and pixel-by-pixel computations. From these transformations, it is also possible to compute the area of the pixel in texture space. As shown in Figure 13-38, the relative area of the shaded pixel being rasterized and its center location have been mapped to texture space. The pixel center lies inside the given texel, but the pixel area maps to several texels.
Figure 13-38. Pixel to Texel Mapping.

The texel value for a given pixel, therefore, can be computed in several ways depending upon the desired effect and accuracy. For example, the simplest approach is to use the texel value at the pixel center's texel coordinates. Another approach is to use a simple or weighted average of the texels covered by the pixel's area. This is more time-consuming, of course.
To perturb the normal vector, N, for example, to produce bumps on a golf ball, the object parameters of the hit point Q are used to alter the normal direction. The object parameters could be the two spherical angles of the point Q. The normal would be recomputed using functions based on the spherical angles such that "dimples" appear on the surface of the object. This is also called procedural texture mapping.
5. A tetrahedron is a closed object bounded by four triangles intersecting at the four corner points. If the points are labelled A, B, C and D, then the triangles are ABC, BDC, ACD and ADB. Given the four corner points of a tetrahedron, A = (1,0,1), B = (2,0,3), C = (2,2,2), D = (3,0,1), the lighting model coefficients Ka = 0.2, Kd = 0.3, Ks = 0.4, and n = 2, the eye located at infinity on the +z axis, and the light located at infinity on the +x axis: compute the intensity of light from each face of the tetrahedron that reflects light to the eye using the ambient-diffuse-specular lighting model with the given parameters. Show equations in symbolic form prior to performing arithmetic.
6. For the ray tracing situation shown below, with the light at [-2,3], the surface normal N = [-1,1], the eye at [-4,1], and Q = (-2,1): (a) compute and draw L, the unit direction to the light source; (b) compute and draw E, the unit direction toward the eye; (c) compute and draw R, the unit direction of ideal reflection; (d) compute cos(θ) and label θ on the diagram; (e) compute cos(α) and label α on the diagram. First show symbolic equations, then compute numeric values.
14.1 Parameterization
Similarly, to draw an arc or circle, you only need the center C and radius R:
Figure 14-2. Circle.

x(t) = R cos(t), y(t) = R sin(t)   for 0 <= t <= 2π

To draw the circle, one would connect a series of points around the circle by straight lines. The number of lines could vary from circle to circle to provide the appropriate quality of display.
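A sketch of the drawing loop just described; draw_line() is a hypothetical graphics-package call:

#include <math.h>

#define PI 3.14159265358979323846

extern void draw_line(double x1, double y1, double x2, double y2);

/* Approximate a circle of radius R centered at (cx, cy) with nseg chords. */
void draw_circle(double cx, double cy, double R, int nseg)
{
    double px = cx + R, py = cy;               /* point at t = 0 */
    for (int i = 1; i <= nseg; i++) {
        double t = 2.0 * PI * i / nseg;
        double x = cx + R * cos(t), y = cy + R * sin(t);
        draw_line(px, py, x, y);
        px = x;  py = y;
    }
}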
A general 3D curve can be described parametrically, for example, a helix (Figure 14-3):

x(t) = R cos(t), y(t) = R sin(t), z(t) = C t

It is possible to derive other parametric forms for analytic curves, such as second degree equations, i.e. conics: circles, ellipses, parabolas, and hyperbolae. For example, a circle can be represented by the parametric equations in Figure 14-4. Unfortunately, the parameter range of t needed to cover the whole circle is -∞ < t < ∞, and the point distribution around the circle is highly non-linear. It is computationally impractical to use a parameter range including the value infinity. By introducing a one dimensional homogeneous representation for the parameter, a more computationally convenient range is obtained.
Figure 14-3. Helix.

Figure 14-4. Parametric Equations for a Circle:

x = 2tR / (1 + t²),   y = (1 - t²) R / (1 + t²),   -∞ < t < ∞

Replace t in the preceding equations with the ratio T/U. The resulting parameterization is more notationally complex, as it now involves two parameters, T and U, but the troublesome value t = ∞ can now be represented by the finite ratio 1/0. The complete circle is traced in two parts. The top half is completed by holding U = 1 with -1 <= T <= 1. The bottom half is then traced by holding T constant at 1 and letting U vary over the range -1 to 1, as shown in Figure 14-5.

Comparing the parametric representation for curves, C(t) = [ x(t) y(t) z(t) ], to the implicit representation, C(x,y,z) = 0, there are advantages and disadvantages to each. Parametric representation is best for drawing curves, for operations requiring an ordered point set, and for transformations of curves. Implicit representation is best for geometric intersection computations and for determining if a given point is on a curve. Ideally, one would like to have both the parametric and implicit representations. However, converting between the two is a very complex process at times and may not be possible at all for some curves. Using homogeneous coordinates:

                         | 0    R  1 |
[ wx wy w ] = [ 1 t t² ] | 2R   0  0 |
                         | 0   -R  1 |
Figure 14-5. Homogeneous Parameterization of a Circle.

The mathematical process of converting a parametric form into its corresponding implicit form has been termed implicitization. Finding the parameter value corresponding to a given (x,y,z) coordinate on a parametric curve is termed inversion. These processes are important computational geometry methods used in computer-aided design (CAD) systems.
Figure 14-6. Curve Drawing with a Spline and Ducks.

The draftsman bends the spline, anchoring it with as many ducks as are necessary to control the shape. Then a pencil is drawn along the spline to produce the desired curve. The stiffness of the spline makes the curve smooth. This process can be simulated
mathematically using equations to represent the curves. The most common form for these equations is the parametric polynomial.
As shown in the circle example, the parametric definition is valid over the range -∞ <= t <= ∞, but most often we are only interested in a segment of the curve. This finite segment is defined by imposing bounds on the acceptable parameter range, usually 0 <= t <= 1.
For cubic polynomials (n = 3), a convenient matrix notation is often used:

                           | a0 b0 c0 d0 |
[ x y z 1 ] = [ 1 t t² t³ ] | a1 b1 c1 d1 |
                           | a2 b2 c2 d2 |
                           | a3 b3 c3 d3 |

or P(t) = T A, where T = [ 1 t t² t³ ] is the power basis vector and A is the coefficient matrix.
A curve is defined by its coefficient matrix A. The goal is to define A in an intuitive and concise manner in order to produce a curve of the desired shape. One way to accomplish this is to force the curve to interpolate (pass through) four points spaced along the curve at pre-assigned parameter values. Given four known points, including their respective parameter values along the intended curve, we can derive the A matrix for the curve. Given P0 = (x0, y0, z0), ..., P3 = (x3, y3, z3) and (t0, t1, t2, t3), we have a system of 12 linear equations in 12 unknowns, and therefore we can solve for the coefficient matrix A. We know that P(ti) = Pi for i = 0..3, so we can form the matrix equation:

| x0 y0 z0 1 |   | 1 t0 t0² t0³ |
| x1 y1 z1 1 | = | 1 t1 t1² t1³ | A
| x2 y2 z2 1 |   | 1 t2 t2² t2³ |
| x3 y3 z3 1 |   | 1 t3 t3² t3³ |

This can be solved for A:

    | 1 t0 t0² t0³ |⁻¹ | x0 y0 z0 1 |
A = | 1 t1 t1² t1³ |   | x1 y1 z1 1 |
    | 1 t2 t2² t2³ |   | x2 y2 z2 1 |
    | 1 t3 t3² t3³ |   | x3 y3 z3 1 |
Given A, we can step along the cubic curve segment using sequentially increasing values of t (t0 <= t <= t3) to compute a set of points that can be connected by straight lines (Figure 14-7).
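A sketch of this stepping procedure, with A stored as a 4x4 array whose columns hold the [x y z 1] coefficients and a hypothetical draw_line3() routine:

extern void draw_line3(double x1, double y1, double z1,
                       double x2, double y2, double z2);

/* Evaluate P(t) = [1 t t^2 t^3] A at nseg+1 parameter values and
   connect successive points with straight lines. */
void draw_cubic(double A[4][4], double t0, double t3, int nseg)
{
    double px = 0, py = 0, pz = 0;
    for (int i = 0; i <= nseg; i++) {
        double t = t0 + (t3 - t0) * i / nseg;
        double T[4] = { 1.0, t, t*t, t*t*t };
        double x = 0, y = 0, z = 0;
        for (int r = 0; r < 4; r++) {
            x += T[r] * A[r][0];
            y += T[r] * A[r][1];
            z += T[r] * A[r][2];
        }
        if (i > 0)
            draw_line3(px, py, pz, x, y, z);
        px = x;  py = y;  pz = z;
    }
}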
Figure 14-7. Cubic Polynomial Interpolation through Four Points.

Note that moving any one of the four points or changing its associated parameter value will change the coefficients, thus altering the shape of the whole curve (Figure 14-8). We now begin to see the concept of curve sculpting. This will not always produce a nicely-behaved or smooth curve,
Figure 14-8. Effects of Moving a Point on a Curve.

because wiggles can result when a curve is forced to interpolate unevenly distributed points. One difficulty encountered with this formulation is that of matching slopes at the junction point (sometimes called a knot) between segments when creating a spline curve.
Figure 14-9. Two Curves Meeting at a Knot Point (curve 1: P1(s) = S A1, 0 <= s <= 1; curve 2: P2(t) = T A2, 0 <= t <= 1).

For many applications, a more appropriate and intuitive approach is achieved by specifying two endpoints and two parametric tangents. Taking the derivative of the curve with respect to the parameter:

P'(t) = [ x'(t) y'(t) z'(t) ] = [ 0 1 2t 3t² ] A

We now use the curve endpoints (P0 and P1) and the end parametric tangents (P'0 and P'1)
to derive the coefficient matrix. Assuming t varies in the range [0,1]:

| P(0)  |   | 1 0 0 0 |
| P(1)  | = | 1 1 1 1 | A
| P'(0) |   | 0 1 0 0 |
| P'(1) |   | 0 1 2 3 |

where P'(0) = dP/dt evaluated at t = 0, P'(1) = dP/dt evaluated at t = 1, and each point row is Pi = [ xi yi zi 1 ]. Solving for A:

    | 1 0 0 0 |⁻¹ | P0  |   |  1  0  0  0 | | P0  |
A = | 1 1 1 1 |   | P1  | = |  0  0  1  0 | | P1  |
    | 0 1 0 0 |   | P'0 |   | -3  3 -2 -1 | | P'0 |
    | 0 1 2 3 |   | P'1 |   |  2 -2  1  1 | | P'1 |
This is known as Hermitian interpolation. Slope continuity between adjacent curves can now be directly controlled using a common tangent vector at the end of one curve segment and at the start of the next, the two curves sharing the knot point P1 = P1(1) = P2(0).

Figure 14-10. Slope Continuity between Spline Curves.

We can now sculpt curves by moving one of the two endpoints or by changing one of the two tangent vectors. Note that a tangent vector has both magnitude and direction, and that changing either value alters the curve definition (see Figure 14-11). We can also apply a transformation matrix, M, to all points on a curve:

[ wx wy wz w ] = [ x y z 1 ] M = T A M = T A'
Figure 14-11. Effects of Changing Parametric Tangent Magnitude and Direction.

Now, instead of processing individual straight line segments through the 3D wireframe pipeline, we are able to transform an entire curve. The transformed curve is represented by the transformed coefficient matrix, A'. Using homogeneous coordinates, this can even include perspective.
A Bézier curve is defined by a set of (n+1) ordered control points. When connected by straight lines, the points form what is commonly called a control polygon. The control points act somewhat like the ducks of the old drafting spline; however, only the first and last control points lie on the curve. The remaining points simply act as weights, pulling the curve towards each point. The resulting curve is said to mimic the overall shape of the control polygon, which is one reason Bézier curves are popular for interactive curve design. The general nth degree Bézier curve is defined as follows.
P(t) = Σ i=0..n Bn,i(t) bi

Bn,i(t) = [ n! / ( i! (n-i)! ) ] t^i (1-t)^(n-i)

where n is the degree of the curve (largest exponent of t), and bi, i = 0..n, are the control points. The constant factor in the Bernstein basis function is the binomial coefficient n! / ( i! (n-i)! ). It is expanded for some values of n and i in Figure 14-13. Recall that 0! = 1. Notice that for i = 0 and for i = n, the binomial coefficient is 1.
           i:  0   1   2   3   4
    n = 1:     1   1
    n = 2:     1   2   1
    n = 3:     1   3   3   1
    n = 4:     1   4   6   4   1

Figure 14-13. Binomial Coefficients.

A cubic curve is formed when n = 3 (i.e., 4 control points).
P(t) = Σ i=0..3 B3,i(t) bi = b0 (1-t)³ + b1 ( 3t (1-t)² ) + b2 ( 3t² (1-t) ) + b3 t³

This can be represented in matrix form:

                     |  1  0  0  0 | | b0 |
P(t) = [ 1 t t² t³ ] | -3  3  0  0 | | b1 |  = T Mbe [b]
                     |  3 -6  3  0 | | b2 |
                     | -1  3 -3  1 | | b3 |

where:
T is the (power) polynomial basis vector,
Mbe is called the Bézier basis change matrix,
[b] is the matrix of Bézier control points.
If more control over the shape of a curve is desired, additional control points may be added (increasing the degree of the polynomial). Using more than one curve segment can also provide additional control, especially in cases where the curve must pass through several points. One can show that the tangents at the ends of a Bézier curve are parallel to the first and last legs of the control polygon, respectively. Thus, in general for slope continuity, or C1 continuity,
the next-to-last control point on curve #i, the common endpoint to curve #i and #i+1, and the second control point on curve #i+1 must be collinear (Figure 14-14).
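As a sketch, a cubic Bézier curve can be evaluated by repeated linear interpolation (de Casteljau's construction, a standard technique equivalent to the Bernstein form above):

static Vec3 lerp3(Vec3 a, Vec3 b, double t)
{
    Vec3 r = { a.x + t*(b.x - a.x), a.y + t*(b.y - a.y), a.z + t*(b.z - a.z) };
    return r;
}

/* de Casteljau evaluation of a cubic Bezier curve at parameter t. */
Vec3 bezier3(Vec3 b0, Vec3 b1, Vec3 b2, Vec3 b3, double t)
{
    Vec3 q0 = lerp3(b0, b1, t), q1 = lerp3(b1, b2, t), q2 = lerp3(b2, b3, t);
    Vec3 r0 = lerp3(q0, q1, t), r1 = lerp3(q1, q2, t);
    return lerp3(r0, r1, t);
}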
Figure 14-14. Slope Continuity via Collinear Control Points.

Second parametric derivative (curvature or C2) continuity between curve segments can also be achieved by proper placement of the third and third-to-last control points. Suppose we want a parametric cubic polynomial representation for a quarter circle in the first quadrant (Figure 14-15), interpolating:

P0 = [1, 0] @ t = 0
P1 = [cos 30°, sin 30°] = [.866, .5] @ t = 1/3
P2 = [cos 60°, sin 60°] = [.5, .866] @ t = 2/3
P3 = [0, 1] @ t = 1

Figure 14-15. Interpolation Points for a Quarter Circle.

For point interpolation, the equation [P] = T A becomes:

| 1     0     0 1 |   | 1 0      0      0      |
| 0.866 0.5   0 1 | = | 1 0.3333 0.1111 0.0370 | A
| 0.5   0.866 0 1 |   | 1 0.6667 0.4444 0.2963 |
| 0     1     0 1 |   | 1 1      1      1      |

Solving for A:

    |  1       0      0 1 |
A = |  0.04423 1.6029 0 0 |
    | -1.4856 -0.1615 0 0 |
    |  0.4413 -0.4413 0 0 |

Now find the Bézier control points for the same curve. To do this, first examine the matrix
equation for the two curve forms, equating the power basis polynomial form with the Bézier form:

P(t) = T A = T Mbe [b]

We see that A = Mbe [b], so [b] = [Mbe]⁻¹ A. For this example,

                | 1  0   0   0 | |  1       0      0 1 |   | 1     0     0 1 |
[b] = Mbe⁻¹ A = | 1 1/3  0   0 | |  0.04423 1.6029 0 0 | = | 1.015 0.535 0 1 |
                | 1 2/3 1/3  0 | | -1.4856 -0.1615 0 0 |   | 0.535 1.015 0 1 |
                | 1  1   1   1 | |  0.4413 -0.4413 0 0 |   | 0     1     0 1 |
Figure 14-16. Bézier Control Points for the Quarter Circle Example.
The B-spline curve is defined as:

P(t) = Σ i=0..n Nm,i(t) Pi
Nm,i(x) = [ (x - xi) / (xi+m-1 - xi) ] Nm-1,i(x) + [ (xi+m - x) / (xi+m - xi+1) ] Nm-1,i+1(x)

N1,i(x) = 1 if x ∈ [xi, xi+1], 0 otherwise
where xi are the parametric knots, m is the order of the curve (degree + 1), and t is the parameter. The array of knots, called the knot vector, is a non-decreasing sequence of parameter values that gives local control to the shape of the curve. This means that moving a control point only affects the curve near the point, not the whole curve (as does the power basis polynomial formulation). From the B-spline basis definition, one sees that for a given parameter value, only the adjacent m knots affect the shape. Unlike the Bézier formulation, it is possible to vary the order of the curve, m, without varying the number of control points. As the order decreases, the curve moves closer to the control points, ultimately becoming a piecewise linear function (straight lines between the control points) for order = 2. Control points can be repeated and so can knot values. Although the knot values can be any values, not just 0 to 1, usually the knot vector consists of integers starting with 0. If the knots are uniformly spaced, then it is termed a uniform B-spline; if not, it is a non-uniform B-spline. In general, B-spline curves do not pass through any of their control points. However, they can be formulated to pass through the first and last control points, similar to Bézier curves. This requires that the first and last knot values be repeated m times. Examine the B-spline basis to verify this. If k is the number of knots, v is the number of control points, and X is the maximum knot value, then we can compute:

X = (v - 1) - m + 2
k = 2m + X - 1
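A sketch of the recursive basis evaluation; zero-length knot spans are treated as zero terms, a common convention for repeated knots:

/* B-spline basis N_{m,i}(x) over knot vector xk[] (order m, degree m-1). */
double bspline_basis(int m, int i, double x, const double *xk)
{
    if (m == 1)
        return (x >= xk[i] && x < xk[i+1]) ? 1.0 : 0.0;
    double left = 0.0, right = 0.0;
    double d1 = xk[i+m-1] - xk[i];
    double d2 = xk[i+m]   - xk[i+1];
    if (d1 > 0.0)
        left  = (x - xk[i])  / d1 * bspline_basis(m-1, i,   x, xk);
    if (d2 > 0.0)
        right = (xk[i+m] - x) / d2 * bspline_basis(m-1, i+1, x, xk);
    return left + right;
}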
Figure 14-17. Parabolic Blending of C(u) from P(s) and Q(t).

Given three points defining the parabola P(s) = A0 + A1 s + A2 s², such that P0 = P(0), P1 = P(x), P2 = P(1), we can combine these conditions into one matrix equation for the coefficients:

| A0 |                |  x² - x   0    0  | | P0 |
| A1 | = 1 / (x² - x) |  1 - x²  -1    x² | | P1 |
| A2 |                |  x - 1    1   -x  | | P2 |
The first parabola, P(s), passes through the first three points, and the second parabola, Q(t), passes through the last three points. We now linearly blend these two parabolas (2nd order curves) using the parameter u to obtain a cubic curve between P1 and P2:

C(u) = (1 - u) P(s) + u Q(t)

Note that parameters s and t are linear functions of u: s = S + u (1 - S) and t = T u. Combining all this, the blend becomes a single cubic matrix equation, C(u) = [ u³ u² u 1 ] M [ P0 P1 P2 P3 ]ᵀ, where S is the value of s on P(s) at P1, and T is the value of t on Q(t) at P2, for u in [0,1]. Values for S and T are determined by considering slope continuity between adjacent blended segments, like C1(u) and C2(u) in the figure above. For S = 0.5 and T = 0.5:

                           | -1  3 -3  1 | | P0 |
C(u) = (1/2) [ u³ u² u 1 ] |  2 -5  4 -1 | | P1 |
                           | -1  0  1  0 | | P2 |
                           |  0  2  0  0 | | P3 |
This is a simple cubic interpolation scheme given points that must lie on the curve. Each set of four points corresponds to one parabolically blended curve, as illustrated in Figure 14-18.
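A sketch of evaluating the S = T = 0.5 blend (this particular matrix is the well-known Catmull-Rom form), reusing the Vec3 type:

/* Evaluate the parabolically blended cubic (S = T = 0.5) at u in [0,1]. */
Vec3 blend_eval(Vec3 P0, Vec3 P1, Vec3 P2, Vec3 P3, double u)
{
    double u2 = u*u, u3 = u2*u;
    double w0 = 0.5 * (-u3 + 2*u2 - u);     /* weights from matrix columns */
    double w1 = 0.5 * ( 3*u3 - 5*u2 + 2);
    double w2 = 0.5 * (-3*u3 + 4*u2 + u);
    double w3 = 0.5 * ( u3 - u2);
    Vec3 C = { w0*P0.x + w1*P1.x + w2*P2.x + w3*P3.x,
               w0*P0.y + w1*P1.y + w2*P2.y + w3*P3.y,
               w0*P0.z + w1*P1.z + w2*P2.z + w3*P3.z };
    return C;
}

At u = 0 the weights reduce to (0,1,0,0) and at u = 1 to (0,0,1,0), so the curve runs from P1 to P2 as required.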
Figure 14-18. Parabolic Blending of Spline Curves.

Polynomial curves cannot exactly represent some common analytic forms (circular arcs, etc.). For our quarter circle example, Figure 14-19 shows the error in the circle radius, d = 1 - sqrt(x² + y²), at the x and y coordinates generated by P(t) = T A. Note that the error is zero only at t = 0, 1/3, 2/3 and 1, the interpolated points.
Figure 14-19. Error Between Exact Circle Radius and Polynomial Interpolation.

To provide greater flexibility, the curve representations can be expanded using homogeneous coordinates:

wx(t) = a0 + a1 t + a2 t² + a3 t³ + ... + an tⁿ
wy(t) = b0 + b1 t + b2 t² + b3 t³ + ... + bn tⁿ
wz(t) = c0 + c1 t + c2 t² + c3 t³ + ... + cn tⁿ
w(t)  = d0 + d1 t + d2 t² + d3 t³ + ... + dn tⁿ

In vector form: P(t) = [ wx(t) wy(t) wz(t) w(t) ]. Notice that the 3D coordinates of a point now are the ratio of two polynomials and hence
As an example, recall the parameterization of a circle given earlier:

                         | 0    R  1 |
[ wx wy w ] = [ 1 t t² ] | 2R   0  0 |
                         | 0   -R  1 |

The quadratic (n=2) Bézier formulation yields:

                  |  1  0  0 | | b0 |
P(t) = [ 1 t t² ] | -2  2  0 | | b1 |
                  |  1 -2  1 | | b2 |

Equating the matrices as before and solving for the control points:

| b0 |              | 1  0   0 | | 0    R  1 |   | 0    R  1 |
| b1 | = Mbe⁻¹ A =  | 1 1/2  0 | | 2R   0  0 | = | R    R  1 |
| b2 |              | 1  1   1 | | 0   -R  1 |   | 2R   0  2 |
Compare these results with the previous cubic Bézier control points. The cubic non-rational Bézier only approximates the quarter circle, with control points b0 = (1, 0), b1 = (1.015, 0.535), b2 = (0.535, 1.015), b3 = (0, 1), while the rational quadratic form represents it exactly, with homogeneous control points b0 = (0, R, 1), b1 = (R, R, 1), b2 = (2R, 0, 2).
2. The vector form of the cubic Bézier curve (n=3) is:

B3(t) = b0 (1-t)³ + 3 b1 t (1-t)² + 3 b2 t² (1-t) + b3 t³

After expanding the coefficients, the matrix form of this cubic equation is:

                      | -1  3 -3  1 | | b0 |
B3(t) = [ t³ t² t 1 ] |  3 -6  3  0 | | b1 |  = [T][Mb][b]
                      | -3  3  0  0 | | b2 |
                      |  1  0  0  0 | | b3 |

For the quadratic (n=2) Bézier curve, find: (1) the vector form, and (2) the matrix form, written similar to the examples above. (Recall that 0! = 1.)
Figure 15-2. Slope Discontinuity at Interior Curves of Lofted Surfaces (S1 = blending of P1 and P2).
B0(x) = 1 - x
B1(x) = 1 - B0(x) = x

Figure 15-3. Linear Blending Functions.

The blending functions must sum to unity for the surface to contain its corner points. The surface is now a linear blend of the curves and the corner points:
P(s, t) = P1(t) B0(s) + P4(s) B0(t) + P3(t) B1(s) + P2(s) B1(t)
        - P00 B0(t) B0(s) - P01 B1(t) B0(s) - P11 B1(t) B1(s) - P10 B0(t) B1(s)

where P1, P2, P3, P4 are the bounding curves and P00, P01, P11, P10 are the corner vertices (see Figure 15-4).

Figure 15-4. Surface with Linear Interior Blending of Four Curves.

Note the boundary conditions for the boundary curves, i.e. when s and t are 0 and 1:

P1(0) = P4(0) = P00
P1(1) = P2(0) = P01
P2(1) = P3(1) = P11
P3(0) = P4(1) = P10
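A sketch of evaluating the linearly blended surface; the boundary curves are supplied as hypothetical function pointers, and the Vec3 type is reused:

typedef Vec3 (*CurveFn)(double);   /* boundary curve, parameter in [0,1] */

/* Linearly blended surface from four boundary curves and four corner
   points, following the P(s,t) equation above. */
Vec3 blended_surface(CurveFn P1, CurveFn P2, CurveFn P3, CurveFn P4,
                     Vec3 P00, Vec3 P01, Vec3 P11, Vec3 P10,
                     double s, double t)
{
    double B0s = 1.0 - s, B1s = s;
    double B0t = 1.0 - t, B1t = t;
    Vec3 a = P1(t), b = P3(t), c = P4(s), d = P2(s);
    Vec3 r;
    r.x = a.x*B0s + b.x*B1s + c.x*B0t + d.x*B1t
        - P00.x*B0t*B0s - P01.x*B1t*B0s - P11.x*B1t*B1s - P10.x*B0t*B1s;
    r.y = a.y*B0s + b.y*B1s + c.y*B0t + d.y*B1t
        - P00.y*B0t*B0s - P01.y*B1t*B0s - P11.y*B1t*B1s - P10.y*B0t*B1s;
    r.z = a.z*B0s + b.z*B1s + c.z*B0t + d.z*B1t
        - P00.z*B0t*B0s - P01.z*B1t*B0s - P11.z*B1t*B1s - P10.z*B0t*B1s;
    return r;
}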
Now check the s = t = 0 boundary condition for the surface. With B0(0) = 1 and B1(0) = 0:

P(0, 0) = P1(0)(1) + P4(0)(1) + P3(0)(0) + P2(0)(0)
        - P00 (1)(1) - P01 (0)(1) - P11 (0)(0) - P10 (1)(0)
        = P00 + P00 - P00 = P00

as expected. Note that each corner vertex belongs to two boundary curves and will therefore be added twice when s and t are 0 or 1. This is the reason for the last four terms involving the corner vertices in the P(s,t) equation. As a second check, evaluate P(s, t=0):

P(s, 0) = P1(0) B0(s) + P4(s)(1) + P3(0) B1(s) + P2(s)(0)
        - P00 (1) B0(s) - P01 (0) B0(s) - P11 (0) B1(s) - P10 (1) B1(s)
        = P00 B0(s) + P4(s) + P10 B1(s) - P00 B0(s) - P10 B1(s) = P4(s)

as expected. Any four space curves (parameterized from 0 to 1) may be used to construct a surface. If P2(s) and P4(s) are linear, the equations reduce to a ruled surface.
P(s, t) = Σ i=0..n Σ j=0..m Bn,i(s) Bm,j(t) bij

where Bn,i(x) is the Bernstein basis function defined in the previous chapter, n is the degree of the surface in the s direction, and m is the degree of the surface in the t direction. The bij are (n+1) × (m+1) control points that form a control net (Figure 15-5).
Figure 15-5. Cubic Bézier Surface Control Net (control points b00 through b33).

The bi-cubic surface is shown below in blending function format:

P(s, t) = [ (1-t)³  3t(1-t)²  3t²(1-t)  t³ ] [b] [ (1-s)³  3s(1-s)²  3s²(1-s)  s³ ]ᵀ

or, in matrix form, P(s, t) = T Mbe [b] Mbeᵀ Sᵀ,
where [b] is a tensor (matrix of vectors) that contains the points on the control net, and T and S are again polynomial basis vectors and Mbe is the Bézier basis change matrix as discussed in the previous chapter. Like the Bézier curve, the Bézier surface closely mimics its control net. The surface contains only the corner points, and the three control points nearest each vertex control the surface tangents. Parametric surface tangents are computed by partial derivatives in the s and t parametric directions:

∂P/∂s (s, t) = Ps(s, t)   and   ∂P/∂t (s, t) = Pt(s, t)

where

Ps(s, t) = [ xs(s, t)  ys(s, t)  zs(s, t) ] = [ ∂x/∂s  ∂y/∂s  ∂z/∂s ]

Given parametric tangents, we compute geometric tangents using ratios:

dy/dx (s, t) = ys(s, t) / xs(s, t)
dz/dx (s, t) = zs(s, t) / xs(s, t)
dz/dy (s, t) = zs(s, t) / ys(s, t)
Note that because they are ratios of the parametric tangents, the geometric tangents have an extra degree of freedom (a scale factor). Surface normals can also be computed for any parametric point on a surface. The normal is defined by the cross product of the parametric tangent vectors:

N(s, t) = Ps(s, t) × Pt(s, t)

Surface normals are important for a number of computational geometry applications, including machining of surfaces (locating the position of a cutting tool involves moving along the normal by the radius of the cutter), realistic rendering (lighting model calculations), and finding offset surfaces (the surface which is everywhere equidistant from another surface).
1970's: The appearance of low-cost minicomputers, coupled with new storage tube CRT terminals, spawns 2D turnkey CAD system development. Small companies can now develop software and deliver a system affordable to a larger segment of industries. The CAD industry is born.

~1973: Braid's Designing with Volumes thesis and his BUILD system demonstrate the potential for solid modeling in CAD. Also, the PADL research project at Rochester establishes the viability of solid modeling.

late 1970's: Turnkey CAD systems flourish, and add 3D capabilities. A technical debate begins between the drafting versus solid modeling paradigms for CAD. Solid modeling is computationally expensive, making it impractical for most minicomputer systems in place in industry.

early 1980's: New powerful 32-bit "midi"-computers enter the marketplace. Turnkey systems struggle to re-write their systems to exploit the power of these computers. The viability of solid modeling increases, encouraging further development by CAD vendors. Solid modeling also gains support as applications appear in computer-aided manufacturing (CAM), such as computer automated machining planning for numerical control (NC) machining. Research efforts at universities and industries in CAD grow considerably.

mid-1980's: Personal computers revolutionize the computer marketplace. PC CAD quickly takes over the marketplace, redefining the cost structures and options available to customers, and opening the CAD market to all industries, large and small. The established turnkey market stagnates. Solid modeling has become a part of every large CAD system.
1990s+: Parametric and constraint-based design systems gain attention. Integration of design and manufacturing becomes more important.
2D wireframe models: geometry consists of lines and curves in 2D: (x,y); objects are created using drafting methods; 3D is simulated using multiple views.

3D wireframe models: like 2D wireframe data, but 3D: (x,y,z); space curves now possible.

surface models: surfaces with boundary curves between points; skin of aircraft, auto; NC possible with manual guidance.

solid models: oriented surfaces + topology; robust shape description; complete spatial enumeration; automation (shape understanding) is aided greatly.

features models: an extension of solid models to include product data in a form that supports integration among design, analysis and manufacturing programs.
In general, solid modeling methods vary according to how:

1. the model is constructed: languages and/or graphical interaction, set operations, sweeps, Euler operations;

2. the model is stored (i.e. the internal data structure): CSG, BREP, or hybrid (a combination of CSG and BREP).

We will examine two of the fundamental approaches: constructive solid geometry (CSG) and boundary representation (BREP).
(Diagram: the BREP hierarchy, in which an Object is composed of Faces and each Face is bounded by Loops.)
Intuitively, one realizes that these operations produce valid solids (except for some surface-on-surface situations). Although different modelers use different sets of primitives, the ones
generally common to all are the Sphere, Cone, etc., as illustrated in the figures.
Figure 16-3. Cone Primitive and Functional Notation.

Swept Solids, also called 2½D solids:

Planar profile projected by extrusion: Slab.
Figure 16-4. Extruded or 2-1/2D Primitive.

Planar profile revolved about an axis: Surface of revolution.
Figure 16-5. Surface of Revolution Primitive.

Objects can be stored in CSG trees, also. A CSG tree is a binary tree data structure with Boolean operations at its branches and primitives at the leaves. Stored CSG models are in fact only a definition of how to make the model from primitives. This is called an unevaluated form, because each application that wants to use the
Figure 16-6. CSG Tree Representation (Right) of an Object (Left).

model must evaluate it for its own purposes. One of the methods used to evaluate a CSG model for realistic display is ray casting, as discussed earlier.
Figure 16-9. A 2D CSG Object Composed of BOX and CIRCLE Primitive Instances (BOX 0,0,0.5; BOX 0,0,1; CIRCLE 0,0,1).
typedef struct CSGTree {
    int operator;                    /* ADD, SUBTRACT, INTERSECT or PRIMITIVE */
    struct CSGTree *left, *right;    /* sub-trees for the binary operators    */
    struct Primitive *prim;          /* primitive data, used only for P       */
} CSGTree;

and Primitive could be any appropriate data representation for storing the primitive's data. Define the values for the operator field: ADD, SUBTRACT, INTERSECT and PRIMITIVE. The fields left and right point to sub-trees for the +, - and * binary operators. They are not used for the P operator. The field prim points to a structure of type Primitive only if the operator is P. The function to evaluate if a given point is inside or on a CSG object is coded (in C) using recursion as shown in Figure 16-10.
Boolean TestPoint( x, y, tree )
float x, y;
CSGTree *tree;
{
    switch( tree->operator ) {
    case ADD:
        return( TestPoint(x,y,tree->left) || TestPoint(x,y,tree->right) );
    case SUBTRACT:
        return( TestPoint(x,y,tree->left) && !TestPoint(x,y,tree->right) );
    case INTERSECT:
        return( TestPoint(x,y,tree->left) && TestPoint(x,y,tree->right) );
    case PRIMITIVE:
        if( tree->prim->type == CIRCLE )
            return( TestCircle(x,y, tree->prim) );
        if( tree->prim->type == BOX )
            return( TestBox(x,y, tree->prim) );
    }
    return( FALSE );    /* unreachable for well-formed trees */
}
Figure 16-10. CSG Evaluation Function TestPoint.

The primitive routines TestCircle and TestBox must return TRUE if the given point is inside the particular instance of the primitive, or FALSE otherwise. All calls to TestPoint end up in calls to primitive routines. Adding new primitives is simply a matter of coding their TestXXX routines.
For some applications, such as ray casting CSG trees, two-state logic is inadequate because it is necessary to know when a point is ON the object as well. Figure 16-11 gives the three-state logic tables for the Boolean operators.
Figure 16-11. Three-State Logic Tables for the Boolean Operators.

UNION:
            B=IN   B=ON   B=OUT
    A=IN     IN     IN     IN
    A=ON     IN     ON     ON
    A=OUT    IN     ON     OUT

DIFFERENCE (A - B):
            B=IN   B=ON   B=OUT
    A=IN     OUT    ON     IN
    A=ON     OUT    ON     ON
    A=OUT    OUT    OUT    OUT

INTERSECTION:
            B=IN   B=ON   B=OUT
    A=IN     IN     ON     OUT
    A=ON     ON     ON     OUT
    A=OUT    OUT    OUT    OUT
Index
Numerics
2D scaling 1-80 3D endpoint code 1-154 3D pipeline 1-150
A
achromatic 1-21 additive primaries 1-25 AIM-based viewing 1-164 algebraic curves 1-254 aliasing 1-247 ambient light 1-219 ambiguous picks 1-98 Apple Macintosh Toolbox 1-86 application context 1-98 ASCII 1-50 aspect ratio 1-72 azimuth 1-147
B
back-facing 1-198 basic definition 1-119 beam movement 1-65 Bernstein basis functions 1-259 Bezier curve definition 1-259 binomial coefficient 1-259 bitmap display 1-15 blending functions 1-270 boundary representation 1-280 bounding rectangle 1-102 Bresenham 1-44 bump mapping 1-248 byte 1-50
color coordinates 1-27 color lookup table (CLUT) 1-23 color model 1-27 color space 1-27 colorfulness 1-33 compacted sequential list 1-107 composite transformation matrix 1-83 computational geometry 1-170 concatenation 1-82 concave object 1-198 constructive solid geometry 1-280 control net 1-273 control points 1-259 control polygon 1-259 convex object 1-198 CPU 1-67 CPU memory 1-67 cross hairs 1-94 cross product 1-175 CSG 1-280 CSG tree 1-282 CSL 1-107 CTM 1-124 CTM stack 1-125 cubic polynomials 1-255 current transformation matrix 1-124 cursor 1-93
D
decay 1-13 depth comparisons 1-199 device coordinates 1-70 dialog windows 1-89 diffuse reflectance 1-216, 1-246, 1-247, 1-248 direct color 1-23 direction vector 1-170 display instruction 1-61 display memory 1-63 display programming 1-61 distortion 1-72 divide and conquer 1-205 dot product 1-174 dots per inch (dpi) 1-10 DPU 1-61 DPU memory 1-67 drawing attributes 1-91 drawing mode 1-91 drawing primitives 1-88 drawing state 1-91 drum plotter 1-4 ducks 1-253 DX vector 1-232 dynamic display manipulation 1-67
C
center of projection 1-135 character definition grid 1-49 character font 1-49 character spacing 1-51 chroma 1-33 chromatic 1-21 chromaticity 1-31 chromaticity diagram 1-31 clipping coordinate system 1-158 clipping coordinates 1-158 clipping plane coefficient vectors 1-155 coefficient matrix 1-255 color components 1-22
Index
Index
I-2
E
edge-driven 1-205 elevation 1-148 event 1-88, 1-90 event loop 1-90 extent 1-72 extrusion 1-282 eye coordinate system 1-135 eye coordinates 1-144 EYE-based viewing 1-166
inside/outside test 1-233 instance 1-119 instance table 1-119 instance transformations 1-123, 1-189 intensity depth cueing 1-132 intensity interpolation 1-223 interlace 1-14 interpolation 1-255 intersection of half-spaces 1-234 intersection tree 1-230 inversion 1-253
F
facets 1-223 far clipping plane 1-149 feedback 1-95 field-of-view 1-151, 1-163 filling 1-56, 1-209 film recorders 1-10 fixed width spacing 1-51 flat shading 1-223 flatbed plotter 1-5 flicker 1-14 font 1-51 frame buffer 1-15
K
kernel-based 1-86 kinetic depth 1-132 knot 1-256 knot vector 1-263
L
laser printers 1-10 lighting model 1-215 line of sight 1-197, 1-199 linear blend 1-269 linked lists 1-111 list compaction 1-109 local control 1-263 local intelligence 1-4 local refresh 1-63 lofting 1-269 luminance 1-21
G
geometric modeling 1-170 geometric tangents 1-274 glyph 1-49 Gouraud shading 1-223 graphical user interface 1-93 gray levels 1-17 gray scale display 1-17 growing lines 1-205 GUI 1-93
M
mach bands 1-224 mapped color 1-23 matrix equation 1-78 Microsoft Windows 1-87 modes 1-104
H
Hermitian interpolation 1-257 hierarchical scene 1-121 highlight 1-98, 1-105 HLR 1-130 homogeneous coordinates 1-142 homogeneous dot product 1-154 homogeneous transformation matrix 1-140 HSR 1-130
N
normalization transform 1-158 near clipping plane 1-149 network-based 1-86 non-uniform B-spline 1-263 normalized homogeneous coordinates 1-142 noun-verb paradigm 1-104 NTSC 1-29
I
identity transformations 1-82 illumination model 1-215 image plane 1-135 implicit representation 1-252 implicitization 1-253
O
object table 1-118 Off-line 1-6 on the fly execution 1-63 on-line 1-7
Index
Index
I-3
P
perspective matrix 1-143 page plotting system 1-55 palette 1-24 parallel light source 1-221 parametric domain 1-269 parametric knots 1-262 parametric polynomial 1-254 parametric representation 1-252 parametric tangents 1-256, 1-274 pen code 1-43 pen plotters 1-4 persistence 1-13 perspective depth cueing 1-131 perspective distance (d) 1-136 perspective factor 1-137 Phong shading 1-225 picking 1-97 pictorial renderings 1-130 picture plane 1-135 pixel 1-14 pixel coordinates 1-55 pixel memory 1-15 pixel patterns 1-56 pixel transfer functions 1-57 pixel value 1-15 plane coefficient vector 1-154 plane equation 1-154 plotting package 1-42 point containment test 1-233 point light source 1-221 point plotting display 1-13 points at infinity 1-143 Polygon clipping 1-206 polynomial basis 1-254 polynomial forms 1-254 poor man's hidden line removal 1-197 position vector 1-170 positioning constraints 1-94 primary colors 1-22 primitive instancing 1-281 principle of least astonishment 1-93 procedural texture mapping 1-249 projection axonometric 1-135 cabinet 1-135 cavalier 1-135 isometric 1-135 oblique parallel 1-135
orthographic parallel 1-135 parallel planar 1-135 perspective planar 1-135 projection plane 1-135 projector 1-135 proportional spacing 1-51
Q
quick-rejection 1-234
R
random vector 1-53 raster 1-53 raster display 1-14 raster display processor 1-15 raster location 1-54 raster unit 1-53, 1-54 rasterization 1-53 rational curves 1-267 reflectance model 1-215 refraction 1-230 refresh 1-13 refresh burden 1-63 region code 1-73 repaint 1-68 resolution 1-6 Robert's HLR algorithm 1-200 rubber-band line 1-95 ruled surface 1-269
S
S vector 1-232 scalar product 1-174 scalar triple product 1-177 scan-cycle 1-14 scan-line 1-14 scan-line coherence 1-205 scan-plane 1-204 screen context 1-98 screen integrity 1-111 segments 1-86 selection 1-105 selection region 1-103 shading model 1-215 shape design system 1-254 SKETCHPAD 1-1 slope continuity 1-257 spatial occupancy 1-279 spectral distribution 1-20 spectral reflectance curve 1-20 specular reflection 1-217 spline 1-253 spline curve 1-256
Index
Index
I-4
Spooling 1-7 spooling 1-7 spotlight source 1-221 stack pointer 1-125 step 1-4 stepping algorithm 1-44 stepping motor 1-6 step-size 1-6 stereoscopic views 1-132 strip-page plotting 1-55 strokes 1-50 subtractive primaries 1-26 surface normals 1-274 surface of revolution 1-282 surface patch 1-269 surface tangents 1-274 swept solids 1-282
magnitude 1-172 plane equation 1-179 unit vector 1-174 vector display processor 1-61 vector generator 1-14 vector product 1-175 vector volume 1-177 verb-noun paradigm 1-104 video 1-14 video lookup table (VLT) 1-23 view vector 1-144 viewing frustum 1-151, 1-164 viewing sphere 1-148 viewing volume 1-151 viewport 1-70
W
window 1-86 window coordinates 1-91 window systems 1-88 world coordinates 1-70
T
tesselation 1-223 texels 1-246 texture 1-246 texture coordinates 1-246 texture mapping 1-246 texture space 1-246 tile systems 1-88 topology 1-280 transformation 1-77 2D rotation 1-81 2D translation 1-80 3D rotation 1-140 3D scaling 1-140 3D translation 1-140 concatenation 1-82 trichroma 1-22 trichromatic vector 1-220 tri-stimulus theory 1-22 two dimensional clipping 1-73
X
X Window System 1-87
Z
z-clipping 1-149
U
unevaluated form 1-282 uniform B-spline 1-263 uniform scaling 1-72 up direction 1-188 update 1-64, 1-68 user model 1-98
V
viewing transformation 1-144 vanishing point 1-131 variable width spacing 1-51 vector cross product 1-175
Index