Computer Graphics and Fundamentals of Image Processing (21CS63)
MODULE-1
a. Graphs and Charts
An early application of computer graphics was the display of simple data graphs, usually plotted on a character printer. Data plotting is still one of the most common graphics applications.
Graphs & charts are commonly used to summarize functional, statistical, mathematical,
engineering and economic data for research reports, managerial summaries and other
types of publications.
Typical examples of data plots are line graphs, bar charts, pie charts, surface graphs, contour plots, and other displays showing relationships between multiple parameters in two dimensions, three dimensions, or higher-dimensional spaces.
b. Computer-Aided Design
Computer-aided design (CAD) and computer-aided drafting and design (CADD) methods are now routinely used in the design of automobiles, aircraft, spacecraft, computers, home appliances, and many other products.
Circuits and networks for communications, water supply, or other utilities are constructed with repeated placement of a few basic geometric shapes.
Animations are often used in CAD applications. Real-time computer animations using wire-frame shapes are useful for quickly testing the performance of a vehicle or system.
c. Virtual-Reality Environments
There are many different kinds of data sets and effective visualization schemes depend on
the characteristics of the data. A collection of data can contain scalar values, vectors or
higher-order tensors.
f. Computer Art
The picture is usually painted electronically on a graphics tablet using a stylus, which can
simulate different brush strokes, brush widths and colors.
Fine artists use a variety of other computer technologies to produce images. To create
pictures the artist uses a combination of 3D modeling packages, texture mapping,
drawing programs and CAD software etc.
Commercial art also uses these “painting” techniques for generating logos and other designs, page layouts combining text and graphics, TV advertising spots, and other applications.
A common graphics method employed in many television commercials is morphing,
where one object is transformed into another.
g. Entertainment
Television production, motion pictures, and music videos routinely use computer-graphics methods.
Sometimes graphics images are combined with live actors and scenes, and sometimes films are generated completely using computer rendering and animation techniques.
Some television programs also use animation techniques to combine computer generated
figures of people, animals, or cartoon characters with the actor in a scene or to transform
an actor’s face into another shape.
h. Image Processing
Each screen display area can contain a different process, showing graphical or non-graphical information, and various methods can be used to activate a display window.
Using an interactive pointing device, such as a mouse, we can activate a display window on some systems by positioning the screen cursor within the window display area and pressing the left mouse button.
A beam of electrons, emitted by an electron gun, passes through focusing and deflection
systems that direct the beam toward specified positions on the phosphor-coated screen.
The phosphor then emits a small spot of light at each position contacted by the electron
beam and the light emitted by the phosphor fades very rapidly.
One way to maintain the screen picture is to store the picture information as a charge
distribution within the CRT in order to keep the phosphors activated.
The most common method now employed for maintaining phosphor glow is to redraw
the picture repeatedly by quickly directing the electron beam back over the same screen
points. This type of display is called a refresh CRT.
The frequency at which a picture is redrawn on the screen is referred to as the refresh
rate.
The primary components of an electron gun in a CRT are the heated metal cathode and a
control grid.
The heat is supplied to the cathode by directing a current through a coil of wire, called the
filament, inside the cylindrical cathode structure.
This causes electrons to be “boiled off” the hot cathode surface.
Inside the CRT envelope, the free, negatively charged electrons are then accelerated
toward the phosphor coating by a high positive voltage.
Intensity of the electron beam is controlled by the voltage at the control grid.
Since the amount of light emitted by the phosphor coating depends on the number of
electrons striking the screen, the brightness of a display point is controlled by varying the
voltage on the control grid.
The focusing system in a CRT forces the electron beam to converge to a small cross
section as it strikes the phosphor and it is accomplished with either electric or magnetic
fields.
With electrostatic focusing, the electron beam is passed through a positively charged metal cylinder so that electrons along the center line of the cylinder are in an equilibrium position.
Deflection of the electron beam can be controlled with either electric or magnetic fields.
Cathode-ray tubes are commonly constructed with two pairs of magnetic-deflection coils
One pair is mounted on the top and bottom of the CRT neck, and the other pair is
mounted on opposite sides of the neck.
The magnetic field produced by each pair of coils results in a transverse deflection force that is perpendicular to both the direction of the magnetic field and the direction of travel of the electron beam.
Horizontal and vertical deflections are accomplished with these two pairs of coils.
After a short time, the “excited” phosphor electrons begin dropping back to their stable ground state, giving up their extra energy as small quanta of light energy called photons.
What we see on the screen is the combined effect of all the electron light emissions: a glowing spot that quickly fades after all the excited phosphor electrons have returned to their ground energy level.
The frequency of the light emitted by the phosphor is proportional to the energy
difference between the excited quantum state and the ground state.
Lower-persistence phosphors require higher refresh rates to maintain a picture on the screen without flicker.
The maximum number of points that can be displayed without overlap on a CRT is referred to as the resolution.
Resolution of a CRT is dependent on the type of phosphor, the intensity to be displayed,
and the focusing and deflection systems.
High-resolution systems are often referred to as high-definition systems.
2) Shadow-mask technique
It produces a much wider range of colors than the beam-penetration technique.
This technique is generally used in raster-scan displays, including color TV.
In this technique the CRT has three phosphor color dots at each pixel position: one dot for red, one for green, and one for blue light. This group is commonly known as a dot triangle.
The CRT has three electron guns, one for each color dot, and a shadow-mask grid just behind the phosphor-coated screen.
The shadow-mask grid consists of a series of holes aligned with the phosphor dot pattern.
Three electron beams are deflected and focused as a group onto the shadow mask and
when they pass through a hole they excite a dot triangle.
In dot triangle three phosphor dots are arranged so that each electron beam can activate
only its corresponding color dot when it passes through the shadow mask.
A dot triangle, when activated, appears as a small color spot on the screen whose color is the combination of the light emitted by the three dots in the triangle.
By changing the intensity of the three electron beams we can obtain different colors in
the shadow mask CRT.
Since we can even write on some flat-panel displays, they will soon be available as pocket notepads.
We can separate flat panel display in two categories:
1. Emissive displays: - the emissive display or emitters are devices that convert
electrical energy into light. For Ex. Plasma panel, thin film electroluminescent
displays and light emitting diodes.
2. Non emissive displays: - non emissive display or non emitters use optical
effects to convert sunlight or light from some other source into graphics patterns.
For Ex. LCD (Liquid Crystal Display).
A firing voltage applied to a pair of horizontal and vertical conductors causes the gas at the intersection of the two conductors to break down into a glowing plasma of electrons and ions.
Picture definition is stored in a refresh buffer and the firing voltages are applied to refresh
the pixel positions, 60 times per second.
Alternating current methods are used to provide faster application of firing voltages and
thus brighter displays.
Separation between pixels is provided by the electric field of the conductors.
One disadvantage of plasma panels was that they were strictly monochromatic devices, showing only one color other than black (like a black-and-white display).
These vibrations are synchronized with the display of an object on a CRT so that each
point on the object is reflected from the mirror into a spatial position corresponding to the
distance of that point from a specified viewing location.
This allows us to walk around an object or scene and view it from different sides.
Here, the frame buffer can be anywhere in the system memory, and the video controller
accesses the frame buffer to refresh the screen.
Working:
The figure shows a two-dimensional Cartesian reference frame with the origin at the lower-left screen corner.
The screen surface is then represented as the first quadrant of a two-dimensional system
with positive x and y values increasing from left to right and bottom of the screen to the
top respectively.
Pixel positions are then assigned integer x values that range from 0 to xmax across the
screen, left to right, and integer y values that vary from 0 to ymax, bottom to top.
Two registers are used to store the coordinate values for the screen pixels.
Initially, the x register is set to 0 and the y register is set to the value for the top scan line.
The contents of the frame buffer at this pixel position are then retrieved and used to set
the intensity of the CRT beam.
Then the x register is incremented by 1, and the process is repeated for the next pixel on
the top scan line.
This procedure continues for each pixel along the top scan line.
After the last pixel on the top scan line has been processed, the x register is reset to 0 and
the y register is set to the value for the next scan line down from the top of the screen.
The procedure is repeated for each successive scan line.
After cycling through all pixels along the bottom scan line, the video controller resets the registers to the first pixel position on the top scan line, and the refresh process starts over.
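The register-based refresh procedure described above can be summarized in a short C-style sketch (a simplified software model only; frameBuffer, setBeamIntensity, XMAX, and YMAX are illustrative names, not part of any real controller):
#define XMAX 639
#define YMAX 479
void setBeamIntensity (int x, int y, unsigned char value);   /* assumed: drives the CRT beam        */
extern unsigned char frameBuffer[YMAX + 1][XMAX + 1];         /* stored picture definition           */
void refreshCycle (void)
{
    int x, y;
    for (y = YMAX; y >= 0; y--) {       /* y register: from the top scan line down to 0      */
        for (x = 0; x <= XMAX; x++) {   /* x register: left to right along the scan line     */
            /* retrieve the frame-buffer value and use it to set the beam intensity */
            setBeamIntensity (x, y, frameBuffer[y][x]);
        }
    }
    /* both registers are then reset to the first pixel of the top scan line */
}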
a. Speed up pixel-position processing of the video controller:
Since the screen must be refreshed at a rate of at least 60 frames per second, the simple procedure illustrated in the figure above may not be accommodated by RAM chips if the cycle time is too slow.
To speed up pixel processing, video controllers can retrieve multiple pixel values from
the refresh buffer on each pass.
When group of pixels has been processed, the next block of pixel values is retrieved from
the frame buffer.
Advantages of video controller:
A video controller can be designed to perform a number of other operations.
For various applications, the video controller can retrieve pixel values from different
memory areas on different refresh cycles.
This provides a fast mechanism for generating real-time animations.
Another video-controller task is the transformation of blocks of pixels, so that screen
areas can be enlarged, reduced, or moved from one location to another during the refresh
cycles.
In addition, the video controller often contains a lookup table, so that pixel values in the
frame buffer are used to access the lookup table. This provides a fast method for
changing screen intensity values.
Finally, some systems are designed to allow the video controller to mix the frame-buffer image with an input image from a television camera or other input device.
The purpose of the display processor is to free the CPU from the graphics chores.
In addition to the system memory, a separate display-processor memory area can be
provided.
Scan conversion:
A major task of the display processor is digitizing a picture definition given in an
application program into a set of pixel values for storage in the frame buffer.
This digitization process is called scan conversion.
Example 1: displaying a line
Graphics commands specifying straight lines and other geometric objects are scan
converted into a set of discrete points, corresponding to screen pixel positions.
Scan converting a straight-line segment.
Using outline:
For characters that are defined as outlines, the shapes are scan-converted into the frame
buffer by locating the pixel positions closest to the outline.
Encoding methods can be useful in the digital storage and transmission of picture
information
i) Run-length encoding:
The first number in each pair can be a reference to a color value, and the second number
can specify the number of adjacent pixels on the scan line that are to be displayed in that
color.
This technique, called run-length encoding, can result in a considerable saving in storage
space if a picture is to be constructed mostly with long runs of a single color each.
A similar approach can be taken when pixel colors change linearly.
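A small C sketch of this idea for one scan line is given below (illustrative only; the pair layout—a color value followed by a run length—follows the description above, and the names are made up for the example):
/* Encode one scan line of color values as (color, run-length) pairs.        */
/* 'line' holds 'width' pixel color values; pairs are written into 'runs'.   */
/* The return value is the number of pairs produced.                         */
int encodeScanLine (const unsigned char *line, int width, unsigned char runs[][2])
{
    int i = 0, nRuns = 0;
    while (i < width) {
        unsigned char color = line[i];
        int count = 0;
        while (i < width && line[i] == color && count < 255) {   /* extend the current run */
            count++;
            i++;
        }
        runs[nRuns][0] = color;                   /* first number: the color value         */
        runs[nRuns][1] = (unsigned char) count;   /* second number: adjacent pixel count   */
        nRuns++;
    }
    return nRuns;
}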
ii) Cell encoding:
Another approach is to encode the raster as a set of rectangular areas (cell encoding).
Disadvantages of encoding:
The disadvantages of encoding runs are that color changes are difficult to record and
storage requirements increase as the lengths of the runs decrease.
In addition, it is difficult for the display controller to process the raster when many short
runs are involved.
Moreover, the size of the frame buffer is no longer a major concern, because of sharp
declines in memory costs
Multi-panel display screens are used in a variety of applications that require “wall-sized”
viewing areas. These systems are designed for presenting graphics displays at meetings,
conferences, conventions, trade shows, retail stores etc.
A multi-panel display can be used to show a large view of a single scene or several
individual images. Each panel in the system displays one section of the overall picture
A large, curved-screen system can be useful for viewing by a group of people studying a
particular graphics application.
A 360-degree paneled viewing system is used in the NASA control-tower simulator for training and for testing ways to solve air-traffic and runway problems at airports.
Mouse Devices:
A mouse is a hand-held device, usually moved around on a flat surface to position the screen cursor. Wheels or rollers on the bottom of the mouse are used to record the amount and direction of movement.
Some mice use optical sensors, which detect movement across horizontal and vertical grid lines.
Since a mouse can be picked up and put down, it is used for making relative changes in the position of the screen cursor.
Most general purpose graphics systems now include a mouse and a keyboard as the
primary input devices.
Joysticks:
A joystick is used as a positioning device; it uses a small vertical lever (stick) mounted on a base. It is used to steer the screen cursor around and to select screen positions with the stick movement.
A push or pull on the stick is measured with strain gauges and converted to movement of
the screen cursor in the direction of the applied pressure.
Data Gloves:
A data glove can be used to grasp a virtual object. The glove is constructed with a series of sensors that detect hand and finger motions.
Input from the glove is used to position or manipulate objects in a virtual scene.
Digitizers:
A digitizer is a common device for drawing, painting, or selecting positions.
A graphics tablet is one type of digitizer, which is used to input two-dimensional coordinates by activating a hand cursor or stylus at selected positions on a flat surface.
A hand cursor contains cross hairs for sighting positions, and a stylus is a pencil-shaped device that is pointed at positions on the tablet.
Image Scanners:
Drawings, graphs, photographs, or text can be stored for computer processing with an image scanner by passing an optical scanning mechanism over the information to be stored.
Once we have the representation of the picture, we can apply various image-processing methods to modify it, and various editing operations can be performed on the stored documents.
Touch Panels:
Touch panels allow displayed objects or screen positions to be selected with the touch of
a finger.
Touch panel is used for the selection of processing options that are represented as a menu
of graphical icons.
An optical touch panel uses LEDs along one vertical edge and one horizontal edge of the frame.
Acoustical touch panels generate high-frequency sound waves in horizontal and vertical directions across a glass plate.
Light Pens:
Light pens are pencil-shaped devices used to select positions by detecting the light
coming from points on the CRT screen.
To select positions in any screen area with a light pen, we must have some nonzero light intensity emitted from each pixel within that area.
Light pens sometimes give false readings due to background lighting in a room.
Voice Systems:
Speech recognizers are used with some graphics workstations as input devices for voice commands. The voice-system input can be used to initiate operations or to enter data.
A dictionary is set up by speaking the command words several times; the system then analyzes each word and matches spoken input against the stored patterns.
Software Standards
The primary goal of standardized graphics software is portability.
In 1984, Graphical Kernel System (GKS) was adopted as the first graphics software
standard by the International Standards Organization (ISO)
The second software standard to be developed and approved by the standards
organizations was Programmer’s Hierarchical Interactive Graphics System (PHIGS).
Extension of PHIGS, called PHIGS+, was developed to provide 3-D surface rendering
capabilities not available in PHIGS.
The graphics workstations from Silicon Graphics, Inc. (SGI), came with a set of routines
called GL (Graphics Library)
The OpenGL functions also expect specific data types. For example, an OpenGL function
parameter might expect a value that is specified as a 32-bit integer. But the size of an
integer specification can be different on different machines.
To indicate a specific data type, OpenGL uses special built-in data-type names, such as
GLbyte, GLshort, GLint, GLfloat, GLdouble, GLboolean
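As a small illustrative snippet (not part of the notes), values intended for OpenGL calls can be declared with these type names so that their sizes match what the library expects on every machine:
#include <GL/gl.h>
GLint     xPos    = 180;      /* guaranteed 32-bit integer coordinate */
GLfloat   red     = 1.0f;     /* single-precision color component     */
GLdouble  angle   = 45.0;     /* double-precision value               */
GLboolean visible = GL_TRUE;  /* OpenGL boolean flag                  */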
Related Libraries
In addition to OpenGL basic(core) library(prefixed with gl), there are a number of
associated libraries for handling special operations:-
1) OpenGL Utility(GLU):- Prefixed with “glu”. It provides routines for setting up
viewing and projection matrices, describing complex objects with line and polygon
approximations, displaying quadrics and B-splines using linear approximations,
processing the surface-rendering operations, and other complex tasks.
-Every OpenGL implementation includes the GLU library
2) Open Inventor:- provides routines and predefined object shapes for interactive three-
dimensional applications which are written in C++.
3) Window-system libraries:- To create graphics we need display window. We cannot
create the display window directly with the basic OpenGL functions since it contains
only device-independent graphics functions, and window-management operations are
device-dependent. However, there are several window-system libraries that support
OpenGL functions for a variety of machines.
Eg:- Apple GL(AGL), Windows-to-OpenGL(WGL), Presentation Manager to
OpenGL(PGL), GLX.
4) OpenGL Utility Toolkit(GLUT):- provides a library of functions which acts as
interface for interacting with any device specific screen-windowing system, thus making
our program device-independent. The GLUT library functions are prefixed with “glut”.
Header Files
In all graphics programs, we will need to include the header file for the OpenGL core
library.
In windows to include OpenGL core libraries and GLU we can use the following header
files:-
#include <windows.h> // precedes the other header files when including the Microsoft Windows version of the OpenGL libraries
#include<GL/gl.h>
#include <GL/glu.h>
The above lines can be replaced by using GLUT header file which ensures gl.h and glu.h
are included correctly,
#include <GL/glut.h> //GL in windows
In Apple OS X systems, the header file inclusion statement will be,
#include <GLUT/glut.h>
Example: suppose we have the OpenGL code for describing a line segment in a
procedure called lineSegment.
Then the following function call passes the line-segment description to the display
window:
glutDisplayFunc (lineSegment);
Step 4: one more GLUT function
But the display window is not yet on the screen.
We need one more GLUT function to complete the window-processing operations.
After execution of the following statement, all display windows that we have created,
including their graphic content, are now activated:
glutMainLoop ( );
This function must be the last one in our program. It displays the initial graphics and puts
the program into an infinite loop that checks for input from devices such as a mouse or
keyboard.
Step 5: setting the display-window location and size using additional GLUT functions
Although the display window that we created will be in some default location and size,
we can set these parameters using additional GLUT functions.
GLUT Function 1:
We use the glutInitWindowPosition function to give an initial location for the upper left
corner of the display window.
This position is specified in integer screen coordinates, whose origin is at the upper-left
corner of the screen.
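For example (standard GLUT calls; the particular numbers are only illustrative), the window can be placed 50 pixels in from the left edge and 100 pixels down from the top of the screen, with an initial size of 400 by 300 pixels:
glutInitWindowPosition (50, 100);  // upper-left corner of the display window
glutInitWindowSize (400, 300);     // initial display-window width and height in pixels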
GLUT Function 2:
After the display window is on the screen, we can reposition it with the glutPositionWindow function and resize it with the glutReshapeWindow function.
GLUT Function 3:
We can also set a number of other options for the display window, such as buffering and
a choice of color modes, with the glutInitDisplayMode function.
Arguments for this routine are assigned symbolic GLUT constants.
Example: the following command specifies that a single refresh buffer is to be used for
the display window and that we want to use the color mode which uses red, green, and
blue (RGB) components to select color values:
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
The values of the constants passed to this function are combined using a logical or
operation.
Actually, single buffering and RGB color mode are the default options.
But we will use the function now as a reminder that these are the options that are set for
our display.
Later, we discuss color modes in more detail, as well as other display options, such as double buffering for animation applications and selecting parameters for viewing three-dimensional scenes.
The fourth parameter in the glClearColor function is called the alpha value for the
specified color. One use for the alpha value is as a “blending” parameter
When we activate the OpenGL blending operations, alpha values can be used to
determine the resulting color for two overlapping objects.
An alpha value of 0.0 indicates a totally transparent object, and an alpha value of 1.0
indicates an opaque object.
For now, we will simply set alpha to 0.0.
Although the glClearColor command assigns a color to the display window, it does not
put the display window on the screen.
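To illustrate (standard OpenGL calls; the white background is just an example choice), the background color can be set and the color buffer actually cleared to that color with:
glClearColor (1.0, 1.0, 1.0, 0.0);  // display-window color: white, alpha = 0.0
glClear (GL_COLOR_BUFFER_BIT);      // fill the color buffer with the clear color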
Example program
For our first program, we simply display a two-dimensional line segment.
To do this, we need to tell OpenGL how we want to “project” our picture onto the display
window because generating a two-dimensional picture is treated by OpenGL as a special
case of three-dimensional viewing.
So, although we only want to produce a very simple two-dimensional line, OpenGL
processes our picture through the full three-dimensional viewing operations.
We can set the projection type (mode) and other viewing parameters that we need with
the following two functions:
glMatrixMode (GL_PROJECTION);
gluOrtho2D (0.0, 200.0, 0.0, 150.0);
This specifies that an orthogonal projection is to be used to map the contents of a two-dimensional rectangular area of world coordinates to the screen, and that the x-coordinate values within this rectangle range from 0.0 to 200.0, with y-coordinate values ranging from 0.0 to 150.0.
Whatever objects we define within this world-coordinate rectangle will be shown within
the display window.
Anything outside this coordinate range will not be displayed.
Therefore, the GLU function gluOrtho2D defines the coordinate reference frame within
the display window to be (0.0, 0.0) at the lower-left corner of the display window and
(200.0, 150.0) at the upper-right window corner.
For now, we will use a world-coordinate rectangle with the same aspect ratio as the
display window, so that there is no distortion of our picture.
Finally, we need to call the appropriate OpenGL routines to create our line segment.
The following code defines a two-dimensional, straight-line segment with integer,
Cartesian endpoint coordinates (180, 15) and (10, 145).
glBegin (GL_LINES);
glVertex2i (180, 15);
glVertex2i (10, 145);
glEnd ( );
Now we are ready to put all the pieces together:
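Assembled from the pieces above, a complete version of this first program might look as follows (a sketch in the usual GLUT style; the window title and the 50, 100 / 400, 300 window parameters are illustrative choices):
#include <GL/glut.h>  /* on Apple OS X systems use <GLUT/glut.h> instead */
void init (void)
{
    glClearColor (1.0, 1.0, 1.0, 0.0);    /* white display-window background */
    glMatrixMode (GL_PROJECTION);         /* set projection parameters       */
    gluOrtho2D (0.0, 200.0, 0.0, 150.0);  /* world-coordinate extents        */
}
void lineSegment (void)
{
    glClear (GL_COLOR_BUFFER_BIT);        /* clear the display window        */
    glColor3f (0.0, 0.4, 0.2);            /* set the line color              */
    glBegin (GL_LINES);
        glVertex2i (180, 15);             /* line-segment endpoints          */
        glVertex2i (10, 145);
    glEnd ( );
    glFlush ( );                          /* process all OpenGL routines now */
}
int main (int argc, char **argv)
{
    glutInit (&argc, argv);                         /* initialize GLUT                */
    glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);   /* single buffer, RGB color mode  */
    glutInitWindowPosition (50, 100);               /* top-left display-window corner */
    glutInitWindowSize (400, 300);                  /* display-window width, height   */
    glutCreateWindow ("An Example OpenGL Program"); /* create the display window      */
    init ( );                                       /* run initialization procedure   */
    glutDisplayFunc (lineSegment);                  /* send graphics to the window    */
    glutMainLoop ( );                               /* display everything and wait    */
    return 0;
}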
The scan-conversion algorithm stores info about the scene, such as color values, at the
appropriate locations in the frame buffer, and then the scene is displayed on the output
device.
Screen co-ordinates:
Locations on a video monitor are referenced in integer screen coordinates, which
correspond to the integer pixel positions in the frame buffer.
Scan-line algorithms for the graphics primitives use the coordinate descriptions to
determine the locations of pixels
Example: given the endpoint coordinates for a line segment, a display algorithm must
calculate the positions for those pixels that lie along the line path between the endpoints.
Since a pixel position occupies a finite area of the screen, the finite size of a pixel must
be taken into account by the implementation algorithms.
For the present, we assume that each integer screen position references the centre of a
pixel area.
Once pixel positions have been identified the color values must be stored in the frame
buffer
The display window will then be referenced by coordinates (xmin, ymin) at the lower-left
corner and by coordinates (xmax, ymax) at the upper-right corner, as shown in Figure
below
We can then designate one or more graphics primitives for display using the coordinate
reference specified in the gluOrtho2D statement.
If the coordinate extents of a primitive are within the coordinate range of the display
window, all of the primitive will be displayed.
Otherwise, only those parts of the primitive within the display-window coordinate limits
will be shown.
Also, when we set up the geometry describing a picture, all positions for the OpenGL
primitives must be given in absolute coordinates, with respect to the reference frame
defined in the gluOrtho2D function.
where:
glBegin indicates the beginning of the object that is to be displayed, and
glEnd indicates the end of the primitive.
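Case 1:
The coordinate values can be passed directly as explicit integer arguments; a sketch using the same three point positions that reappear in Case 2 below:
glBegin (GL_POINTS);
glVertex2i (50, 100);
glVertex2i (75, 150);
glVertex2i (100, 200);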
glEnd ( );
Case 2:
we could specify the coordinate values for the preceding points in arrays such as
int point1 [ ] = {50, 100};
int point2 [ ] = {75, 150};
int point3 [ ] = {100, 200};
and call the OpenGL functions for plotting the three points as
glBegin (GL_POINTS);
glVertex2iv (point1);
glVertex2iv (point2);
glVertex2iv (point3);
glEnd ( );
Case 3:
specifying two point positions in a three dimensional world reference frame. In this case,
we give the coordinates as explicit floating-point values:
glBegin (GL_POINTS);
glVertex3f (-78.05, 909.72, 14.60);
glVertex3f (261.91, -5200.67, 188.33);
glEnd ( );
Case 1: Lines
glBegin (GL_LINES);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
Case 2: GL_LINE_STRIP:
Successive vertices are connected using line segments. However, the final vertex is not
connected to the initial vertex.
glBegin (GL_LINE_STRIP);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
Case 3: GL_LINE_LOOP:
Successive vertices are connected using line segments to form a closed path or loop i.e., final
vertex is connected to the initial vertex.
glBegin (GL_LINE_LOOP);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
Example program:
Attribute functions may be listed inside or outside of a glBegin/glEnd pair.
Example: the following code segment plots three points in varying colors and sizes.
The first is a standard-size red point, the second is a double-size green point, and the third
is a triple-size blue point:
Ex:
glColor3f (1.0, 0.0, 0.0);
glBegin (GL_POINTS);
glVertex2i (50, 100);
glPointSize (2.0);
glColor3f (0.0, 1.0, 0.0);
glVertex2i (75, 150);
glPointSize (3.0);
glColor3f (0.0, 0.0, 1.0);
glVertex2i (100, 200);
glEnd ( );
That is, the magnitude of the horizontal and vertical separations of the line endpoints,
deltax and deltay, are compared to determine whether to generate a thick line using
vertical pixel spans or horizontal pixel spans.
Pattern:
Parameter pattern is used to reference a 16-bit integer that describes how the line should
be displayed.
1 bit in the pattern denotes an “on” pixel position, and a 0 bit indicates an “off” pixel
position.
The pattern is applied to the pixels along the line path starting with the low-order bits in
the pattern.
The default pattern is 0xFFFF (each bit position has a value of 1),which produces a solid
line.
repeatFactor
Integer parameter repeatFactor specifies how many times each bit in the pattern is to be
repeated before the next bit in the pattern is applied.
The default repeat value is 1.
Polyline:
With a polyline, a specified line-style pattern is not restarted at the beginning of each
segment.
It is applied continuously across all the segments, starting at the first endpoint of the
polyline and ending at the final endpoint for the last segment in the series.
Example:
For line style, suppose parameter pattern is assigned the hexadecimal representation
0x00FF and the repeat factor is 1.
This would display a dashed line with eight pixels in each dash and eight pixel positions
that are “off” (an eight-pixel space) between two dashes.
Also, since low order bits are applied first, a line begins with an eight-pixel dash starting
at the first endpoint.
This dash is followed by an eight-pixel space, then another eight-pixel dash, and so forth,
until the second endpoint position is reached.
Example Code:
typedef struct { float x, y; } wcPt2D;
wcPt2D dataPts [5];
void linePlot (wcPt2D dataPts [5])
{
int k;
glBegin (GL_LINE_STRIP);
for (k = 0; k < 5; k++)
glVertex2f (dataPts [k].x, dataPts [k].y);
glEnd ( );
glFlush ( );
}
/* Invoke a procedure here to draw coordinate axes. */
glEnable (GL_LINE_STIPPLE); /* Input first set of (x, y) data values. */
glLineStipple (1, 0x1C47); // Plot a dash-dot, standard-width polyline.
linePlot (dataPts);
/* Input second set of (x, y) data values. */
glLineStipple (1, 0x00FF); // Plot a dashed, double-width polyline.
glLineWidth (2.0);
linePlot (dataPts);
/* Input third set of (x, y) data values. */
glLineStipple (1, 0x0101); // Plot a dotted, triple-width polyline.
glLineWidth (3.0);
linePlot (dataPts);
glDisable (GL_LINE_STIPPLE);
Method 2: Another method for displaying thick curves is to fill in the area between two parallel
curve paths, whose separation distance is equal to the desired width. We could do this using the
specified curve path as one boundary and setting up the second boundary either inside or outside
the original curve path. This approach, however, shifts the original curve path either inward or
outward, depending on which direction we choose for the second boundary.
Method 3:The pixel masks discussed for implementing line-style options could also be used in
raster curve algorithms to generate dashed or dotted patterns
Method 4: Pen (or brush) displays of curves are generated using the same techniques discussed
for straight-line segments.
A straight-line segment is defined by the slope-intercept equation
y = m.x + b ----------------->(1)
We determine values for the slope m and y intercept b from the endpoint coordinates (x0, y0) and (xend, yend) with the following equations:
m=(yend - y0)/(xend - x0) ---------------- >(2)
b=y0 - m.x0 ------------- >(3)
Algorithms for displaying straight line are based on the line equation (1) and calculations
given in eq(2) and (3).
For given x interval δx along a line, we can compute the corresponding y interval δy from
eq.(2) as
δy=m. δx ---------------- >(4)
Similarly, we can obtain the x interval δx corresponding to a specified δy as
δx=δy/m ----------------- >(5)
These equations form the basis for determining deflection voltages in analog displays,
such as vector-scan system, where arbitrarily small changes in deflection voltage are
possible.
For lines with slope magnitudes
|m|<1, δx can be set proportional to a small horizontal deflection voltage with the
corresponding vertical deflection voltage set proportional to δy from eq.(4)
|m|>1, δy can be set proportional to a small vertical deflection voltage with the
corresponding horizontal deflection voltage set proportional to δx from eq.(5)
|m|=1, δx=δy and the horizontal and vertical deflections voltages are equal
A line is sampled at unit intervals in one coordinate and the corresponding integer values
nearest the line path are determined for the other coordinate
DDA Algorithm has three cases so from equation i.e.., m=(yk+1 - yk)/(xk+1 - xk)
Case1:
if m<1,x increment in unit intervals
i.e..,xk+1=xk+1
then, m=(yk+1 - yk)/( xk+1 - xk)
m= yk+1 - yk
yk+1 = yk + m ----------- >(1)
where k takes integer values starting from 0, for the first point, and increases by 1 until the final endpoint is reached. Since m can be any real number between 0.0 and 1.0, the calculated y values must be rounded to the nearest integer.
Case2:
if m>1, y increment in unit intervals
i.e.., yk+1 = yk + 1
then, m= (yk + 1- yk)/( xk+1 - xk)
m(xk+1 - xk)=1
xk+1 =(1/m)+ xk------------------------- (2)
Case3:
if m=1,both x and y increment in unit intervals
i.e..,xk+1=xk + 1 and yk+1 = yk + 1
Equations (1) and (2) are based on the assumption that lines are to be processed from the left
endpoint to the right endpoint. If this processing is reversed, so that the starting endpoint is at the
right, then either we have δx=-1 and
yk+1 = yk - m (3)
or(when the slope is greater than 1)we have δy=-1 with
xk+1 = xk - (1/m) --------------- (4)
Similar calculations are carried out using Equations (1) through (4) to determine the pixel positions along a line with negative slope. Thus, if the absolute value of the slope is less than 1 and the starting endpoint is at the left, we set δx=1 and calculate y values with Eq. (1).
When the starting endpoint is at the right (for the same slope), we set δx=-1 and obtain y positions using Eq. (3).
This algorithm is summarized in the following procedure, which accepts as input two
integer screen positions for the endpoints of a line segment.
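A sketch of such a DDA procedure is given below (in the spirit of the walkthrough that follows; setPixel is assumed to be available, as in the Bresenham listing later in these notes, and roundToInt is a small helper added here):
#include <stdlib.h>
#include <math.h>
void setPixel (int x, int y);   /* assumed: plots one pixel at (x, y) */
static int roundToInt (float a) { return (int) floor (a + 0.5); }
/* DDA line-drawing procedure: endpoints are integer screen positions. */
void lineDDA (int x0, int y0, int xEnd, int yEnd)
{
    int dx = xEnd - x0, dy = yEnd - y0, steps, k;
    float xIncrement, yIncrement, x = x0, y = y0;
    /* sample along the coordinate with the greater change */
    if (abs (dx) > abs (dy))
        steps = abs (dx);
    else
        steps = abs (dy);
    if (steps == 0) {            /* degenerate case: both endpoints coincide */
        setPixel (x0, y0);
        return;
    }
    xIncrement = (float) dx / (float) steps;
    yIncrement = (float) dy / (float) steps;
    setPixel (roundToInt (x), roundToInt (y));
    for (k = 0; k < steps; k++) {
        x += xIncrement;
        y += yIncrement;
        setPixel (roundToInt (x), roundToInt (y));
    }
}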
if m<1, where x is incremented in unit steps:
yk+1 = yk + m
Initially, taking (x0, y0) as the starting point, we assign x = x0, y = y0.
o Illuminate pixel (x, round(y))
o x1 = x + 1 , y1 = y + m
o Illuminate pixel (x1, round(y1))
o x2 = x1 + 1 , y2 = y1 + m
o Illuminate pixel (x2, round(y2))
o Continue until the final endpoint is reached.
if m>1, where y is incremented in unit steps:
xk+1 = xk + (1/m)
Initially, taking (x0, y0) as the starting point, we assign x = x0, y = y0.
o Illuminate pixel (round(x), y)
o x1 = x + (1/m) , y1 = y + 1
o Illuminate pixel (round(x1), y1)
o x2 = x1 + (1/m) , y2 = y1 + 1
o Illuminate pixel (round(x2), y2)
o Continue until the final endpoint is reached.
The DDA algorithm is a faster method for calculating pixel positions than one that directly implements Eq. (1), because it eliminates the multiplication by using incremental calculations along the line path.
Bresenham’s Algorithm:
It is an efficient raster scan generating algorithm that uses incremental integral
calculations
To illustrate Bresenham’s approach, we first consider the scan-conversion process for
lines with positive slope less than 1.0.
Pixel positions along a line path are then determined by sampling at unit x intervals.
Starting from the left endpoint (x0, y0) of a given line, we step to each successive column
(x position) and plot the pixel whose scan-line y value is closest to the line path.
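For slope |m| < 1, the standard decision-parameter relations used by this approach (stated here without the derivation; they correspond to the variables p, twoDy, and twoDyMinusDx in the code below) are:
p0 = 2.Δy - Δx
pk+1 = pk + 2.Δy, if pk < 0 (the next pixel keeps the same y value)
pk+1 = pk + 2.Δy - 2.Δx, otherwise (y is incremented by 1)
where Δx = |xEnd - x0| and Δy = |yEnd - y0|.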
Code:
#include <stdlib.h>
#include <math.h>
/* Bresenham line-drawing procedure for |m| < 1.0. */
void lineBres (int x0, int y0, int xEnd, int yEnd)
{
int dx = fabs (xEnd - x0), dy = fabs(yEnd - y0);
int p = 2 * dy - dx;
int twoDy = 2 * dy, twoDyMinusDx = 2 * (dy - dx);
int x, y;
/* Determine which endpoint to use as start position. */
if (x0 > xEnd) {
x = xEnd;
y = yEnd;
xEnd = x0;
}
else {
x = x0;
y = y0;
}
setPixel (x, y);
while (x < xEnd) {
x++;
if (p < 0)
p += twoDy;
else {
y++;
p += twoDyMinusDx;
}
setPixel (x, y);
}
}
Properties of Circles
A circle is defined as the set of points that are all at a given distance r from a center
position (xc , yc ).
For any circle point (x, y), this distance relationship is expressed by the Pythagorean theorem in Cartesian coordinates as
(x - xc)^2 + (y - yc)^2 = r^2
We could use this equation to calculate the position of points on a circle circumference by stepping along the x axis in unit steps from xc - r to xc + r and calculating the corresponding y values at each position as
y = yc ± sqrt( r^2 - (x - xc)^2 )
One problem with this approach is that it involves considerable computation at each step.
Moreover, the spacing between plotted pixel positions is not uniform.
We could adjust the spacing by interchanging x and y (stepping through y values and
calculating x values) whenever the absolute value of the slope of the circle is greater than
1; but this simply increases the computation and processing required by the algorithm.
Another way to eliminate the unequal spacing is to calculate points along the circular boundary using polar coordinates r and θ.
Expressing the circle equation in parametric polar form yields the pair of equations
x = xc + r.cosθ
y = yc + r.sinθ
To summarize, the relative position of any point (x, y) can be determined by checking the sign of the circle function fcirc(x, y) = x^2 + y^2 - r^2 as follows:
fcirc(x, y) < 0, if (x, y) is inside the circle boundary
fcirc(x, y) = 0, if (x, y) is on the circle boundary
fcirc(x, y) > 0, if (x, y) is outside the circle boundary
Consider a circle centered at the origin: if the point (x, y) is on the circle, then we can compute 7 other points on the circle, as shown in the above figure.
Our decision parameter is the circle function evaluated at the midpoint between these two pixels:
pk = fcirc(xk + 1, yk - 1/2) = (xk + 1)^2 + (yk - 1/2)^2 - r^2
The initial decision parameter is obtained by evaluating the circle function at the start position (x0, y0) = (0, r):
p0 = fcirc(1, r - 1/2) = 1 + (r - 1/2)^2 - r^2 = 5/4 - r
If the radius r is specified as an integer, we can simply round this to p0 = 1 - r.
Code:
void draw_pixel(GLint cx, GLint cy)
{
glColor3f(0.5,0.5,0.0);
glBegin(GL_POINTS);
glVertex2i(cx, cy);
glEnd();
}
d+=2*(x-y)+5;
--y;
}
++x;
}
plotpixels(xc, yc, x, y);
}
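The lines above are the tail of a midpoint-circle routine; for reference, a complete version consistent with that fragment might look like this (a sketch only: circle_draw and plotpixels are assumed names, with plotpixels plotting the eight symmetric points through the draw_pixel function defined earlier):
/* Plot the eight symmetric points of (x, y) about the circle center (xc, yc). */
void plotpixels (GLint xc, GLint yc, GLint x, GLint y)
{
    draw_pixel (xc + x, yc + y);  draw_pixel (xc - x, yc + y);
    draw_pixel (xc + x, yc - y);  draw_pixel (xc - x, yc - y);
    draw_pixel (xc + y, yc + x);  draw_pixel (xc - y, yc + x);
    draw_pixel (xc + y, yc - x);  draw_pixel (xc - y, yc - x);
}
/* Midpoint circle algorithm: center (xc, yc), radius r. */
void circle_draw (GLint xc, GLint yc, GLint r)
{
    GLint d = 1 - r;          /* initial decision parameter (5/4 - r, rounded) */
    GLint x = 0, y = r;
    while (y > x) {
        plotpixels (xc, yc, x, y);
        if (d < 0)
            d += 2 * x + 3;           /* midpoint inside the circle: keep y    */
        else {
            d += 2 * (x - y) + 5;     /* midpoint outside: step y down by one  */
            --y;
        }
        ++x;
    }
    plotpixels (xc, yc, x, y);
}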
MODULE-2
2D Geometric Transformations
Operations that are applied to the geometric description of an object to change its position,
orientation, or size are called geometric transformations. Sometimes geometric transformation
operations are also referred to as modeling transformations.
Two-Dimensional Translation
Translation of a single coordinate point is performed by adding offsets to its coordinates to generate a new coordinate position. The original point position is moved along a straight-line path to its new location. For the two-dimensional case,
x' = x + tx ,  y' = y + ty (1)
The translation distance pair (tx, ty) is called a translation vector or shift vector.
Figure 2.1: Translating a point from position P to position P' using a translation vector T.
We can express Equations 1 as a single matrix equation by using the following column vectors to represent coordinate positions and the translation vector:
P = (x, y)^T ,  P' = (x', y')^T ,  T = (tx, ty)^T (2)
so that
P' = P + T (3)
Translation is a rigid-body transformation that moves objects without deformation. That is, every
point on the object is translated by the same amount. Figure 2.2 illustrates the application of a
specified translation vector to move an object from one position to another.
Figure 2.2: Moving a polygon from position (a) to position (b) with the
translation vector (-5.50, 3.75)
Two-Dimensional Rotation
Rotation transformation of an object is generated by specifying a rotation axis and a rotation
angle. All points of the object are then transformed to new positions by rotating the points
through the specified angle about the rotation axis.
A positive value for the angle θ defines a counterclockwise rotation about the pivot point. A
negative value for the angle θ rotates objects in the clockwise direction.
Figure 2.3: Rotation of an object through angle θ about the pivot point (xr ,yr )
The angular and coordinate relationships of the original and transformed point positions are
shown in Figure 2.4.
Figure 2.4: Rotation of a point from position (x, y) to position (x', y') through an angle θ relative to the coordinate origin. The original angular displacement of the point from the x axis is φ.
r is the constant distance of the point from the origin, angle φ is the original angular position of the point from the horizontal, and θ is the rotation angle.
x' = r.cos(φ + θ) = r.cosφ.cosθ - r.sinφ.sinθ (4)
y' = r.sin(φ + θ) = r.cosφ.sinθ + r.sinφ.cosθ
Substituting x = r.cosφ and y = r.sinφ (5), we obtain the transformation equations for rotating a point (x, y) through an angle θ about the origin:
x' = x.cosθ - y.sinθ (6)
y' = x.sinθ + y.cosθ
In matrix form,
P' = R.P (7)
where the rotation matrix is
R = ( cosθ   -sinθ
      sinθ    cosθ )     (8)
Figure 2.5: Rotating a point from position (x, y) to position (x', y') through an angle θ about rotation point (xr, yr)
Using the trigonometric relationships indicated by the two right triangles in this figure, we can generalize Equations 6 to obtain the transformation equations for rotation of a point about any specified rotation position (xr, yr):
x' = xr + (x - xr).cosθ - (y - yr).sinθ (9)
y' = yr + (x - xr).sinθ + (y - yr).cosθ
Two-Dimensional Scaling
To alter the size of an object, a scaling transformation is used. A two-dimensional scaling operation is performed by multiplying object positions (x, y) by scaling factors sx and sy to produce the transformed coordinates (x', y'):
x' = x.sx ,  y' = y.sy (10)
Scaling factor sx scales an object in the x direction, while sy scales in the y direction. The basic two-dimensional scaling equations 10 can also be written in the following matrix form:
(x')   ( sx   0 ) (x)
(y') = ( 0   sy ) (y)     (11)
or
P' = S.P (12)
Any positive values can be assigned to the scaling factors sx and sy. Values less than 1 reduce the
size of objects. Values greater than 1 produce enlargements. Specifying a value of 1 for both sx
and sy leaves the size of objects unchanged. When sx and sy are assigned the same value, a
uniform scaling is produced, which maintains relative object proportions. Unequal values for sx
and sy result in a differential scaling.
Figure 2.6: Turning a square (a) into a rectangle (b) with scaling factors sx = 2 and
sy = 1.
Figure 2.7 illustrates scaling of a line by assigning the value 0.5 to both sx and sy.
Both the line length and the distance from the origin are reduced by a factor of ½.
Figure 2.7: A line scaled with Equation 12 using sx = sy = 0.5 is reduced in size and moved
closer to the coordinate origin
The location of a scaled object is controlled by choosing a position, called the fixed point, that is
to remain unchanged after the scaling transformation. Coordinates for the fixed point, (xf,yf), are
often chosen at some object position, such as its centroid but any other spatial position can be
selected.
Objects are now resized by scaling the distances between object points and the fixed point (Figure 2.8). For a coordinate position (x, y), the scaled coordinates (x', y') are then calculated from the following relationships:
x' - xf = (x - xf).sx (13)
y' - yf = (y - yf).sy
The above equations can be rewritten to separate the multiplicative and additive terms:
x' = x.sx + xf(1 - sx) (14)
y' = y.sy + yf(1 - sy)
The additive terms xf(1 - sx) and yf(1 - sy) are constants for all points in the object.
Figure 2.8: Scaling relative to a chosen fixed point(xf , yf ). The distance from each polygon
vertex to the fixed point is scaled by Equations 13
Each of the three basic two-dimensional transformations (translation, rotation, and scaling) can
be expressed in the general matrix form.
P’=M1.P+M2 (15)
For translation, M1 is the identity matrix. For rotation or scaling, M2 contains the translational
terms associated with the pivot point or scaling fixed point.
Homogeneous Coordinates
To express all two-dimensional transformations as matrix multiplications, we expand each coordinate position (x, y) to the homogeneous-coordinate representation (xh, yh, h), where
x=xh/h (16)
y=yh/h
A convenient choice is simply h = 1, so that each position is represented as (x, y, 1). Translation can then be written as the single matrix multiplication
(x')   ( 1  0  tx ) (x)
(y') = ( 0  1  ty ) (y)     (17)
(1 )   ( 0  0  1  ) (1)
or
P' = T(tx, ty).P (18)
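A short C sketch of what this matrix form computes is shown below (illustrative only; the function and variable names are assumptions made for the example):
/* Apply a 3x3 homogeneous transformation matrix m to the point (x, y, 1). */
void transformPoint (const float m[3][3], float x, float y, float *xNew, float *yNew)
{
    *xNew = m[0][0] * x + m[0][1] * y + m[0][2];   /* row 0 . (x, y, 1) */
    *yNew = m[1][0] * x + m[1][1] * y + m[1][2];   /* row 1 . (x, y, 1) */
    /* the third row is (0, 0, 1), so the homogeneous parameter h stays 1 */
}
/* Example: translate the point (10, 20) by the vector (tx, ty) = (5, -3). */
void translateExample (void)
{
    float T[3][3] = { { 1.0f, 0.0f,  5.0f },
                      { 0.0f, 1.0f, -3.0f },
                      { 0.0f, 0.0f,  1.0f } };
    float xNew, yNew;
    transformPoint (T, 10.0f, 20.0f, &xNew, &yNew);   /* result: (15, 17) */
}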
Two-dimensional rotation transformation equations about the coordinate origin can be expressed in the matrix form as
(x')   ( cosθ  -sinθ  0 ) (x)
(y') = ( sinθ   cosθ  0 ) (y)     (19)
(1 )   ( 0      0     1 ) (1)
or as
P' = R(θ).P (20)
The rotation transformation operator R(θ) is the 3 × 3 matrix with rotation parameter θ.
A scaling transformation relative to the coordinate origin can be expressed as the matrix multiplication
(x')   ( sx  0   0 ) (x)
(y') = ( 0   sy  0 ) (y)     (21)
(1 )   ( 0   0   1 ) (1)
or
P' = S(sx, sy).P (22)
The scaling operator S(sx, sy) is the 3 × 3 matrix with parameters sx and sy.
Inverse Transformations
For translation, inverse matrix is obtained by negating the translation distances. If the two
dimensional translation distances are tx and ty , then inverse translation matrix is
T^-1(tx, ty) = ( 1  0  -tx
                 0  1  -ty
                 0  0   1  )     (23)
An inverse rotation is accomplished by replacing the rotation angle by its negative. A two-
dimensional rotation through an angle θ about the coordinate origin has the inverse
transformation matrix
R^-1(θ) = (  cosθ   sinθ   0
            -sinθ   cosθ   0
             0      0      1 )     (24)
The inverse matrix for any scaling transformation is obtained by replacing the scaling parameters
with their reciprocals. For two-dimensional scaling with parameters sx and sy applied relative to
the coordinate origin, the inverse transformation matrix is
S^-1(sx, sy) = ( 1/sx   0      0
                 0      1/sy   0
                 0      0      1 )     (25)
Composite Transformations
Two successive transformations M1 and M2 can be combined (concatenated) into a single composite matrix M by forming the matrix product. If M1 is applied first, followed by M2, the transformed position is
P' = M2.M1.P
P' = M.P (26)
The coordinate position is transformed using the composite matrix M = M2.M1, rather than applying the individual transformations M1 and then M2.
If two successive translation vectors (t1x, t1y) and (t2x, t2y) are applied to a two-dimensional coordinate position P, the final transformed location P' is calculated as
P' = T(t2x, t2y).{T(t1x, t1y).P}
P' = {T(t2x, t2y).T(t1x, t1y)}.P (27)
( 1  0  t2x ) ( 1  0  t1x )   ( 1  0  t1x + t2x )
( 0  1  t2y ) ( 0  1  t1y ) = ( 0  1  t1y + t2y )     (28)
( 0  0  1   ) ( 0  0  1   )   ( 0  0  1         )
or
T(t2x,t2y).T(t1x,t1y)=T(t1x+t2x,t1y+t2y) (29)
which demonstrates that two successive translations are additive.
Two successive rotations applied to a point P produce the transformed position
P' = R(θ2).{R(θ1).P}
P' = {R(θ2).R(θ1)}.P (30)
By multiplying the two rotation matrices, it can be verified that two successive rotations are additive.
R(θ2).R(θ1) = R(θ1 + θ2) (31)
The final rotated coordinates of a point can be calculated with the composite rotation matrix as
P' = R(θ1 + θ2).P (32)
Concatenating transformation matrices for two successive scaling operations in two dimensions
produces the following composite scaling matrix:
( s2x  0    0 ) ( s1x  0    0 )   ( s1x.s2x   0         0 )
( 0    s2y  0 ) ( 0    s1y  0 ) = ( 0         s1y.s2y   0 )     (33)
( 0    0    1 ) ( 0    0    1 )   ( 0         0         1 )
or
S(s2x, s2y).S(s1x, s1y) = S(s1x.s2x, s1y.s2y)
The resulting matrix indicates that successive scaling operations are multiplicative.
Two-dimensional rotation about any other pivot point (xr, yr) can be generated by performing the following sequence of translate-rotate-translate operations:
1. Translate the object so that the pivot-point position is moved to the coordinate origin.
2. Rotate the object about the coordinate origin.
3. Translate the object so that the pivot point is returned to its original position.
The composite transformation matrix for this sequence is obtained with the concatenation
T(xr, yr).R(θ).T(-xr, -yr) =
( cosθ   -sinθ   xr(1 - cosθ) + yr.sinθ )
( sinθ    cosθ   yr(1 - cosθ) - xr.sinθ )     (35)
( 0       0      1                      )
or
T(xr,yr).R(θ).T(-xr,-yr) = R(xr,yr,θ) (36)
Figure 2.9: A transformation sequence for rotating an object about a specified pivot point
using the rotation matrix R(θ)
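Using OpenGL's matrix routines (introduced at the end of this module), the same translate-rotate-translate sequence can be sketched as follows; because OpenGL post-multiplies the current matrix, the calls are issued in the reverse of the order in which they act on the object, and xr, yr, and theta are placeholder values here:
glMatrixMode (GL_MODELVIEW);
glLoadIdentity ( );
glTranslatef (xr, yr, 0.0);        /* step 3: move the pivot point back to (xr, yr) */
glRotatef (theta, 0.0, 0.0, 1.0);  /* step 2: rotate about the z axis (the origin)  */
glTranslatef (-xr, -yr, 0.0);      /* step 1: move the pivot point to the origin    */
/* objects specified after these calls are rotated by theta about (xr, yr) */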
To produce a two-dimensional scaling with respect to a selected fixed position (xf, yf), the following sequence is used:
1. Translate the object so that the fixed point coincides with the coordinate origin.
2. Scale the object with respect to the coordinate origin.
3. Use the inverse of the translation in step (1) to return the object to its original position.
Concatenating the matrices for these three operations produces the required scaling matrix:
T(xf, yf).S(sx, sy).T(-xf, -yf) =
( sx   0    xf(1 - sx) )
( 0    sy   yf(1 - sy) )     (37)
( 0    0    1          )
or
T(xf,yf).S(sx,sy).T(-xf,-yf) = S(xf,yf,sx,sy) (38)
Figure 2.10: A transformation sequence for scaling an object with respect to a specified
fixed position using the scaling matrix S(sx, sy )
Basic transformations such as translation, rotation, and scaling are standard components of
graphics libraries. Some packages provide a few additional transformations that are useful in
certain applications.
1. Reflection and
2. Shear.
Reflection
Reflection about the line y = 0 (the x axis) is accomplished with the transformation matrix
( 1   0   0 )
( 0  -1   0 )     (39)
( 0   0   1 )
This transformation retains x values, but “flips” the y values of coordinate positions.
The resulting orientation of an object after it has been reflected about the x axis is shown
in Figure 2.11
A reflection about the line x = 0 (the y axis) flips x coordinates while keeping y coordinates the
same. The matrix for this transformation is
( -1   0   0 )
(  0   1   0 )     (40)
(  0   0   1 )
Figure 2.12 illustrates the change in position of an object that has been reflected about
the line x = 0.
We flip both the x and y coordinates of a point by reflecting relative to an axis that is
perpendicular to the xy plane and that passes through the coordinate origin. The matrix
representation for this reflection is
( -1   0   0 )
(  0  -1   0 )     (41)
(  0   0   1 )
Figure 2.13: Reflection of an object relative to the coordinate origin. This transformation can be accomplished with a rotation of 180° in the xy plane about the coordinate origin.
If we choose the reflection axis as the diagonal line y = x , the reflection matrix is
( 0   1   0 )
( 1   0   0 )     (42)
( 0   0   1 )
To obtain a transformation matrix for reflection about the diagonal y = x, we could concatenate matrices for the transformation sequence:
1. Clockwise rotation by 45◦,
2. Reflection about the x axis, and
3. Counterclockwise rotation by 45◦.
To obtain a transformation matrix for reflection about the diagonal y = -x, we could concatenate
matrices for the transformation sequence:
1. Clockwise rotation by 45◦,
2. Reflection about the y axis
3. Counterclockwise rotation by 45◦.
The resulting transformation matrix is
(  0  -1   0 )
( -1   0   0 )     (43)
(  0   0   1 )
Shear
A transformation that distorts the shape of an object such that the transformed shape appears as if the object were composed of internal layers that had been caused to slide over each other is called a shear.
Two common shearing transformations are those that shift coordinate x values and those that
shift y values.
An x-direction shear relative to the x axis is produced with the transformation matrix
( 1   shx   0 )
( 0   1     0 )     (44)
( 0   0     1 )
which transforms coordinate positions as
x' = x + shx.y (45)
y' = y
Any real number can be assigned to the shear parameter shx. A coordinate position (x, y) is then
shifted horizontally by an amount proportional to its perpendicular distance (y value) from the x
axis. Setting parameter shx to the value 2, for example, changes the square in Figure 2.17 into a
parallelogram. Negative values for shx shift coordinate positions to the left.
Figure 2.17: A unit square (a) is converted to a parallelogram (b) using the x -direction
shear matrix with shx= 2.
We can generate x-direction shears relative to other reference lines (y = yref) with the matrix
( 1   shx   -shx.yref )
( 0   1      0        )     (46)
( 0   0      1        )
Now, coordinate positions are transformed as
x' = x + shx(y - yref) (47)
y' = y
A y-direction shear relative to the line x = xref is generated with the transformation matrix
( 1     0    0         )
( shy   1   -shy.xref  )     (48)
( 0     0    1         )
which generates the transformed coordinate positions
x' = x (49)
y' = y + shy(x - xref)
Figure 2.18: A unit square (a) is transformed to a shifted parallelogram (b) with shx = 0.5
and yref = -1 in the shear matrix 46.
Figure 2.19: A unit square (a) is turned into a shifted parallelogram (b) with parameter
values shy = 0.5 and xref = -1 in the y -direction shearing transformation 48.
Figure 2.20: Translating an object from screen position (a) to the destination position
shown in (b) by moving a rectangular block of pixel values. Coordinate positions Pmin and
Pmax specify the limits of the rectangular block to be moved, and P0 is the destination
reference position
Figure 2.21: Rotating an array of pixel values. The original array is shown in (a), the
positions of the array elements after a 90◦ counterclockwise rotation are shown in (b), and
the positions of the array elements after a 180◦ rotation are shown in (c).
For array rotations that are not multiples of 90◦, we need to do some extra processing.
The general procedure is illustrated in Figure 2.22.
Figure 2.22: A raster rotation for a rectangular block of pixels can be accomplished by
mapping the destination pixel areas onto the rotated block.
Each destination pixel area is mapped onto the rotated array and the amount of overlap
with the rotated pixel areas is calculated. A color for a destination pixel can then be computed by
averaging the colors of the overlapped source pixels, weighted by their percentage of area
overlap.
Similar methods can be used to scale a block of pixels. Pixel areas in the original block are
scaled, using specified values for sx and sy, and then mapped onto a set of destination pixels. The
color of each destination pixel is then assigned according to its area of overlap with the scaled
pixel areas. (Figure 2.23)
Figure 2.23: Mapping destination pixel areas onto a scaled array of pixel values. Scaling
factors sx = sy = 0.5 are applied relative to fixed point (xf ,yf ).
A translation of a rectangular array of pixel-color values from one buffer area to another
can be accomplished in OpenGL as the following copy operation:
glCopyPixels (xmin, ymin, width, height, GL_COLOR);
The first four parameters in this function give the location and dimensions of the pixel
block, and the OpenGL symbolic constant GL_COLOR specifies that color values are to be copied.
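As a brief usage sketch (assuming a two-dimensional world-coordinate system that matches screen coordinates), the following statements copy a 128 × 96 block whose lower-left corner is at (20, 40) so that its lower-left corner lands at the current raster position:
glRasterPos2i (200, 150);                    // destination reference position P0
glCopyPixels (20, 40, 128, 96, GL_COLOR);    // copy the color block to P0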
A block of RGB color values in a buffer can be saved in an array with the function
glReadPixels (xmin, ymin, width, height, GL_RGB, GL_UNSIGNED_BYTE, colorArray);
OpenGL Geometric-Transformation Functions
In the core library of OpenGL, a separate function is available for each of the basic geometric
transformations. Because OpenGL is designed as a three-dimensional graphics application
programming interface (API), all transformations are specified in three dimensions. Internally,
all coordinate positions are represented as four-element column vectors, and all transformations
are represented using 4 × 4 matrices.
To perform a translation, we invoke the translation routine and set the components for the
three-dimensional translation vector.
In the rotation function, we specify the angle and the orientation for a rotation axis that
intersects the coordinate origin. A scaling function is used to set the three coordinate scaling
factors relative to the coordinate origin. In each case, the transformation routine sets up a 4 × 4
matrix that is applied to the coordinates of objects that are referenced after the transformation
call
To set up a translation matrix, we invoke
glTranslate* (tx, ty, tz);
Translation parameters tx, ty, and tz can be assigned any real-number values, and the single
suffix code to be affixed to this function is either f (float) or d (double).
A 4 × 4 rotation matrix is generated with
glRotate* (theta, vx, vy, vz);
where the vector v = (vx, vy, vz) defines the orientation for a rotation axis that passes through
the coordinate origin. If v is not specified as a unit vector, then it is normalized automatically
before the elements of the rotation matrix are computed. The suffix code can be either f or d,
and parameter theta is to be assigned a rotation angle in degrees.
For example, the statement: glRotatef (90.0, 0.0, 0.0, 1.0);
sets up the matrix for a 90◦ rotation about the z axis.
We obtain a 4 × 4 scaling matrix with respect to the coordinate origin with the following
routine:
glScale* (sx, sy, sz);
The suffix code is again either f or d, and the scaling parameters can be assigned
any real-number values.
Scaling in a two-dimensional system involves changes in the x and y dimensions,
so a typical two-dimensional scaling operation has a z scaling factor of 1.0
For example, the statement
glScalef (2.0, -3.0, 1.0);
produces a matrix that scales by a factor of 2 in the x direction, scales by a factor of 3 in the
y direction, and reflects with respect to the x axis.
OpenGL Matrix Operations
The glMatrixMode routine is used to set the current matrix mode. Projection mode, which
designates the matrix that is to be used for the projection transformation, is selected with
glMatrixMode (GL_PROJECTION);
modelview mode is specified with the following statement
glMatrixMode (GL_MODELVIEW);
which designates the 4×4 modelview matrix as the current matrix
Two other modes that can be set with the glMatrixMode function are the texture
mode and the color mode.
The texture matrix is used for mapping texture patterns to surfaces, and the color
matrix is used to convert from one color model to another. The default argument for the
glMatrixMode function is GL_MODELVIEW.
Identity matrix is assigned to the current matrix using following function:
glLoadIdentity( );
Other values can be assigned to the elements of the current matrix using
glLoadMatrix* (elements16);
The current matrix can also be postmultiplied by a specified matrix using
glMultMatrix* (otherElements16);
where M represents the matrix whose elements are specified by parameter otherElements16 in the
glMultMatrix statement; the current matrix is replaced by the product (current matrix) · M.
The glMultMatrix function can also be used to set up any transformation sequence with
individually defined matrices.
For example,
glMatrixMode (GL_MODELVIEW);
glLoadIdentity ( ); // Set current matrix to the identity.
glMultMatrixf (elemsM2); // Postmultiply identity with matrix M2.
glMultMatrixf (elemsM1); // Postmultiply M2 with matrix M1.
produces the following current modelview matrix:
M = M2 · M1
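As an illustrative sketch (the names xPivot, yPivot, theta, and drawObject are assumed here, not taken from the notes), the same matrix-operation idea can be used to build a two-dimensional pivot-point rotation; because OpenGL postmultiplies the current matrix, the transformations are specified in the reverse of their application order:
glMatrixMode (GL_MODELVIEW);
glLoadIdentity ( );                      // start from the identity matrix
glTranslatef (xPivot, yPivot, 0.0);      // 3. move the pivot point back to its position
glRotatef (theta, 0.0, 0.0, 1.0);        // 2. rotate about the z axis
glTranslatef (-xPivot, -yPivot, 0.0);    // 1. move the pivot point to the origin
drawObject ( );                          // hypothetical display routine for the object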
Methods for geometric transformations in three dimensions are extended from two
dimensional methods by including considerations for the z coordinate.
A three-dimensional position, expressed in homogeneous coordinates, is represented as a
four-element column vector.
Three-Dimensional Translation
A position P = (x, y, z) is translated to a new location P' = (x', y', z') with the translation equations
x' = x + tx
y' = y + ty
z' = z + tz
or, in matrix form,
P' = T · P
Figure 2.24: Moving a coordinate position with translation vector T = (tx, ty, tz )
Figure 2.25: Shifting the position of a three-dimensional object using translation vector T
Three-Dimensional Rotation
By convention, positive rotation angles produce counterclockwise rotations about a coordinate
axis when we are looking along the positive half of the axis toward the coordinate origin.
Figure 2.26:Positive rotations about a coordinate axis are counterclockwise, when looking
along the positive half of the axis toward the origin.
Three-Dimensional Coordinate-Axis Rotations
The three-dimensional z-axis rotation equations are
x' = x cos θ - y sin θ
y' = x sin θ + y cos θ        (50)
z' = z
Parameter θ specifies the rotation angle about the z axis, and z-coordinate values are unchanged
by this transformation. In homogeneous-coordinate form, the three-dimensional z-axis rotation is
| cos θ   -sin θ   0   0 |
| sin θ    cos θ   0   0 |
| 0        0       1   0 |
| 0        0       0   1 |
or
P' = Rz(θ) · P
Transformation equations for rotations about the other two coordinate axes can be obtained with
a cyclic permutation of the coordinate parameters x, y, and z in equation 50.
x→y→z→x (51)
To obtain the x-axis and y-axis rotation transformations, cyclically replace x with y, y with z,
and z with x.
x-axis rotation:
y' = y cos θ - z sin θ
z' = y sin θ + z cos θ        (52)
x' = x
y-axis rotation:
z' = z cos θ - x sin θ
x' = z sin θ + x cos θ        (53)
y' = y
Figure 2.28: Cyclic permutation of the Cartesian-coordinate axes to produce the three sets
of coordinate-axis rotation equations
Negative values for rotation angles generate rotations in a clockwise direction, and the identity
matrix is produced when we multiply any rotation matrix by its inverse.
i.e., R · R^-1 = I
A rotation matrix for any axis that does not coincide with a coordinate axis can be set up as a
composite transformation involving combinations of translations and the coordinate axis
rotations.
1. Translate the object so that the rotation axis coincides with the parallel coordinate axis.
2. Perform the specified rotation about that axis.
3. Translate the object so that the rotation axis is moved back to its original position.
Figure 2.31: Sequence of transformations for rotating an object about an axis that is
parallel to the x axis.
When an object is to be rotated about an axis that is not parallel to one of the coordinate axes,
some additional transformations have to be performed:
1. Translate the object so that the rotation axis passes through the coordinate origin.
2. Rotate the object so that the axis of rotation coincides with one of the coordinate axes.
3. Perform the specified rotation about the selected coordinate axis.
4. Apply inverse rotations to bring the rotation axis back to its original orientation.
5. Apply the inverse translation to bring the rotation axis back to its original spatial position.
Figure 2.32 : Five transformation steps for obtaining a composite matrix for rotation about
an arbitrary axis, with the rotation axis projected onto the z axis.
Three-Dimensional Scaling
The matrix expression for the three-dimensional scaling transformation of a position P = (x, y, z)
is given by
| sx   0    0    0 |
| 0    sy   0    0 |        (54)
| 0    0    sz   0 |
| 0    0    0    1 |
or P' = S · P, where scaling parameters sx, sy, and sz are assigned any positive values.
Explicit expressions for the scaling transformation relative to the origin are
x' = sx · x
y' = sy · y
z' = sz · z
Scaling an object with transformation 54 changes the position of the object relative to the
coordinate origin: a parameter value greater than 1 moves a point farther from the origin, and a
parameter value less than 1 moves a point closer to the origin.
Uniform scaling is performed when sx=sy=sz. If the scaling parameters are not all equal, relative
dimensions of a transformed object are changed.
Figure 2.33: Doubling the size of an object with transformation 54 also moves the object
farther from the origin. Scaling parameter is set to 2.
A scaling transformation with respect to any selected fixed position (xf, yf, zf) can be constructed
using the following transformation sequence:
1. Translate the fixed point to the origin.
2. Apply the scaling transformation relative to the coordinate origin.
3. Translate the fixed point back to its original position.
Figure 2.34: A sequence of transformations for scaling an object relative to a selected fixed
point
The matrix representation for an arbitrary fixed-point scaling can be expressed as the
concatenation of these translate-scale-translate transformations:
T(xf, yf, zf) · S(sx, sy, sz) · T(-xf, -yf, -zf) =
| sx   0    0    (1 - sx) xf |
| 0    sy   0    (1 - sy) yf |
| 0    0    sz   (1 - sz) zf |
| 0    0    0    1           |
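A minimal OpenGL sketch of this translate-scale-translate sequence (the parameter names and the drawObject routine are assumed here for illustration):
glMatrixMode (GL_MODELVIEW);
glTranslatef (xf, yf, zf);        // 3. translate the fixed point back
glScalef (sx, sy, sz);            // 2. scale relative to the coordinate origin
glTranslatef (-xf, -yf, -zf);     // 1. translate the fixed point to the origin
drawObject ( );                   // hypothetical display routine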
Three-Dimensional Reflections
When the reflection plane is a coordinate plane (xy, xz, or yz), the transformation can be thought
of as a 180◦ rotation in four-dimensional space with a conversion between a left-handed frame and
a right-handed frame.
An example of a reflection that converts coordinate specifications from a right handed system to
a left-handed system is shown in Figure 2.35.
| 1   0    0   0 |
| 0   1    0   0 |        (55)
| 0   0   -1   0 |
| 0   0    0   1 |
In this transformation, the sign of the z coordinate changes, but the x and y coordinate values
remain unchanged.
Three-Dimensional Shears
These transformations can be used to modify object shapes. In three dimensions, shears can also be
generated relative to the z axis. A general z-axis shearing transformation relative to a selected
reference position zref is produced with the following matrix:
| 1   0   shzx   -shzx · zref |
| 0   1   shzy   -shzy · zref |        (56)
| 0   0   1       0           |
| 0   0   0       1           |
Shearing parameters shzx and shzy can be assigned any real values. This transformation matrix
alters the x- and y-coordinate values by an amount that is proportional to the distance from zref,
while the z-coordinate value remains unchanged. Plane areas that are perpendicular to the z axis
are shifted by an amount proportional to z - zref.
Figure 2.36: A unit cube (a) is sheared relative to the origin (b) by Matrix 56, with shzx =
shzy = 1 . Reference position zref =0
OpenGL Matrix Stacks
OpenGL maintains four modes of matrices:
1. Modelview
2. Projection
3. Texture
4. Color
For each mode, OpenGL maintains a matrix stack. Initially, each stack contains only the identity
matrix. At any time during the processing of a scene, the top matrix on each stack is called the
“current matrix” for that mode. After we specify the viewing and geometric transformations, the
top of the modelview matrix stack is the 4 × 4 composite matrix that combines the viewing
transformations and the various geometric transformations that we want to apply to a scene.
glGetIntegerv(GL_MAX_MODELVIEW_STACK_DEPTH, stackSize);
The above function determines the number of positions available in the modelview stack for a
particular implementation of OpenGL. It returns a single integer value to array stackSize.
We can also find out how many matrices are currently in the modelview stack with
glGetIntegerv (GL_MODELVIEW_STACK_DEPTH, numMats);
The maximum available stack depth for each of the other matrix modes is obtained with the
corresponding symbolic constants:
1. GL_MAX_PROJECTION_STACK_DEPTH
2. GL_MAX_TEXTURE_STACK_DEPTH
3. GL_MAX_COLOR_STACK_DEPTH
There are two functions available in OpenGL for processing the matrices in a stack:
glPushMatrix( );
Copies the current matrix at the top of the active stack and stores that copy in the second stack
position.
glPopMatrix( );
Destroys the matrix at the top of the stack, and the second matrix in the stack becomes the
current matrix.
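A short sketch of how these stack operations are typically used (drawCarBody and drawWheel are hypothetical display routines): the saved matrix lets a local transformation apply to one part of a scene without affecting what is drawn afterwards.
glMatrixMode (GL_MODELVIEW);
drawCarBody ( );                  // drawn with the current modelview matrix
glPushMatrix ( );                 // save the current matrix
glTranslatef (2.0, -1.0, 0.0);    // local transformation for the wheel only
drawWheel ( );
glPopMatrix ( );                  // restore the saved matrix for subsequent objects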
1. Explain general two dimensional pivot point rotation and derive the composite matrix.
2. Explain translation, rotation, scaling in 2D homogeneous coordinate system with matrix
representations.
3. What are the entities required to perform a rotation? Show that two successive rotations
are additive.
4. Explain with illustrations the basic 2-dimension geometric transformations used in
computer graphics.
5. What is the need of homogeneous coordinates ? Give 2-dimensional homogeneous
coordinate matrix for translation, rotation and scaling.
6. Obtain a matrix representation for rotation of an object about a specified pivot point in
2-dimension.
7. Obtain the matrix representation for rotation of an object about an arbitrary axis.
8. Prove that two successive 2D rotations are additive.
9. Prove that successive scalings are multiplicative.
10. Develop composite homogeneous transformation matrix to rotate an object with respect
to pivot point. For the triangle A(3,2) B(6,2) C(6,6) rotate it in anticlockwise direction by
90 degree keeping A(3,2) fixed. Draw the new polygon.
11. With the help of the diagram explain shearing and reflection transformation technique.
12. Give the reason to convert transformation matrix to homogeneous co-ordinate
representation and show the process of conversion. Shear the polygon A(1,1), B(3,1),
C(3,3), D(2,4), E(1,3) along x-axis with a shearing factor of 0.2.
13. With the help of suitable diagram explain basic 3D Geometric transformation techniques
and give the transformation matrix.
14. Design a transformation matrix to rotate a 3D object about an axis that is parallel to one of
the co-ordinate axes.
15. Describe 3D translation and scaling.
16. Describe any two two-dimensional composite transformations.
17. Explain translation, rotation and scaling of 2D transformation with suitable diagrams,
code and matrix.
18. Explain OpenGL raster transformations and OpenGL geometric transformation functions.
19. Explain any two of the 3D geometric transformations.
20. Scale the given triangle A(3,2), B(6,2), C(6,6) using the scaling factors sx=1/3 and sy=1/2
about the point A(3,2). Draw the original and scaled object.
21. Explain shear and reflection transformation technique.
22. What is concatenation of transformation? Explain the following consider 2D
i. Rotation about a fixed point
ii. Scaling about a fixed point
23. Define the following two dimensional transformations. Translation, rotation, scaling,
reflection and shearing. Give example for each.
MODULE-3
Graphical Input Data
Graphics programs use several kinds of input data, such as coordinate positions, attribute values,
character-string specifications, geometric-transformation values, viewing conditions, and
illumination parameters. Many graphics packages, including the International Standards
Organization (ISO) and American National Standards Institute (ANSI) standards, provide an
extensive set of input functions for processing such data. But input procedures require interaction
with display-window managers and specific hardware devices. Therefore, some graphics
systems, particularly those that provide mainly device-independent functions, often include
relatively few interactive procedures for dealing with input data.
A standard organization for input procedures in a graphics package is to classify the functions
according to the type of data that is to be processed by each function. This scheme allows any
physical device, such as a keyboard or a mouse, to input any data class.
1. Locator Devices
A locator device is used to specify a single coordinate position, such as an (x, y) screen location.
Keyboards are used for locator input in several ways. A general-purpose keyboard usually has
four cursor-control keys that move the screen cursor up, down, left, and right. With an additional
four keys, the cursor can be moved diagonally. Rapid cursor movement is accomplished by holding
down the selected cursor key. Sometimes a keyboard includes a touchpad, joystick, trackball, or
other device for positioning the screen cursor. For some applications, it may also be convenient
to use a keyboard to type in numerical values or other codes to indicate coordinate positions.
Other devices, such as a light pen, have also been used for interactive input of coordinate
positions. But light pens record screen positions by detecting light from the screen phosphors,
and this requires special implementation procedures.
2. Stroke Devices
This class of logical devices is used to input a sequence of coordinate positions, and the physical
devices used for generating locator input are also used as stroke devices. Continuous movement
of a mouse, trackball, joystick, or hand cursor is translated into a series of input coordinate
values. The graphics tablet is one of the more common stroke devices. Button activation can be
used to place the tablet into “continuous” mode. As the cursor is moved across the tablet surface,
a stream of coordinate values is generated. This procedure is used in paintbrush systems to
generate drawings using various brush strokes.
3. String Devices
The primary physical device used for string input is the keyboard. Character strings in computer-
graphics applications are typically used for picture or graph labeling. Other physical devices can
be used for generating character patterns for special applications. Individual characters can be
sketched on the screen using a stroke or locator-type device. A pattern recognition program then
interprets the characters using a stored dictionary of predefined patterns.
4. Valuator Devices
Valuator input can be employed in a graphics program to set scalar values for geometric
transformations, viewing parameters, and illumination parameters. In some applications, scalar
input is also used for setting physical parameters such as temperature, voltage, or stress-strain
factors. A typical physical device used to provide valuator input is a panel of control dials. Dial
settings are calibrated to produce numerical values within some predefined range. Rotary
potentiometers convert dial rotation into a corresponding voltage, which is then translated into a
number within a defined scalar range, such as -10.5 to 25.5. Instead of dials, slide potentiometers
are sometimes used to convert linear movements into scalar values.
Any keyboard with a set of numeric keys can be used as a valuator device. Joysticks, trackballs,
tablets, and other interactive devices can be adapted for valuator input by interpreting pressure or
movement of the device relative to a scalar range. For one direction of movement, say left to
right, increasing scalar values can be input. Movement in the opposite direction decreases the
scalar input value. Selected values are usually echoed on the screen for verification. Another
technique for providing valuator input is to display graphical representations of sliders, buttons,
rotating scales, and menus on the video monitor. Cursor positioning, using a mouse, joystick,
spaceball, or other device, can be used to select a value on one of these valuators.
5. Choice Devices
Menus are typically used in graphics programs to select processing options, parameter values,
and object shapes that are to be used in constructing a picture. Commonly used choice devices
for selecting a menu option are cursor-positioning devices such as a mouse, trackball,
keyboard, touch panel, or button box.
Keyboard function keys or separate button boxes are often used to enter menu selections. Each
button or function key is programmed to select a particular operation or value, although preset
buttons or keys are sometimes included on an input device.
For screen selection of listed menu options, cursor-positioning device is used. When a screen-
cursor position (x, y) is selected, it is compared to the coordinate extents of each listed menu
item. A menu item with vertical and horizontal boundaries at the coordinate values xmin, xmax,
ymin, and ymax is selected if the input coordinates satisfy the inequalities
xmin ≤ x ≤ xmax and ymin ≤ y ≤ ymax.
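A minimal sketch of this extent test in C (the variable names are assumed):
if (x >= xMin && x <= xMax && y >= yMin && y <= yMax)
   processMenuItem (itemNumber);   /* hypothetical handler for the selected option */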
For larger menus with relatively few options displayed, a touch panel is commonly used. A
selected screen position is compared to the coordinate extents of the individual menu options to
determine what process is to be performed.
Alternate methods for choice input include keyboard and voice entry. A standard keyboard can
be used to type in commands or menu options. For this method of choice input, some abbreviated
format is useful. Menu listings can be numbered or given short identifying names. A similar
encoding scheme can be used with voice input systems. Voice input is particularly useful when
the number of options is small (20 or fewer).
6. Pick Devices
Pick device is used to select a part of a scene that is to be transformed or edited in some way.
Several different methods can be used to select a component of a displayed scene, and any input
mechanism used for this purpose is classified as a pick device.
Most often, pick operations are performed by positioning the screen cursor. Using a mouse,
joystick, or keyboard, for example, we can perform picking by positioning the screen cursor and
pressing a button or key to record the pixel coordinates. This screen position can then be used to
select an entire object, a facet of a tessellated surface, a polygon edge, or a vertex. Other pick
methods include highlighting schemes, selecting objects by name, or a combination of
methods.
Using the cursor-positioning approach, a pick procedure could map a selected screen position to
a world-coordinate location using the inverse viewing and geometric transformations that were
specified for the scene. Then, the world coordinate position can be compared to the coordinate
extents of objects. If the pick position is within the coordinate extents of a single object, the pick
object has been identified. The object name, coordinates, or other information about the object
can then be used to apply the desired transformation or editing operations. But if the pick
position is within the coordinate extents of two or more objects, further testing is necessary.
Depending on the type of object to be selected and the complexity of a scene, several levels of
search may be required to identify the pick object.
When coordinate-extent tests do not uniquely identify a pick object, the distances from the pick
position to individual line segments could be computed. Figure 3.1 illustrates a pick position that
is within the coordinate extents of two line segments. For a two-dimensional line segment with
pixel endpoint coordinates (x1, y1) and (x2, y2), the perpendicular distance squared from a pick
position (x, y) to the line is calculated as
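A small C sketch of this computation (variable names assumed; the endpoints must not coincide, so dx and dy are not both zero):
GLfloat dx = x2 - x1, dy = y2 - y1;
GLfloat distSq = (dx * (y - y1) - dy * (x - x1)) * (dx * (y - y1) - dy * (x - x1))
                 / (dx * dx + dy * dy);   /* squared perpendicular distance to the line */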
Another picking technique is to associate a pick window with a selected cursor position. The
pick window is centered on the cursor position, as shown in Figure 3.2, and clipping procedures
are used to determine which objects intersect the pick window. For line picking, we can set the
pick-window dimensions w and h to very small values, so that only one line segment intersects
the pick window.
Figure 3.2: A pick window with center coordinates ( xp, yp), width w, and height h.
Highlighting can also be used to facilitate picking. Successively highlight those objects whose
coordinate extents overlap a pick position (or pick window). As each object is highlighted, a user
could issue a “reject” or “accept” action using keyboard keys. The sequence stops when the user
accepts a highlighted object as the pick object. Picking could also be accomplished simply by
successively highlighting all objects in the scene without selecting a cursor position.
If picture components can be selected by name, keyboard input can be used to pick an object.
This is a straightforward, but less interactive, pick-selection method. Some graphics packages
allow picture components to be named at various levels down to the individual primitives.
Descriptive names can be used to help a user in the pick process, but this approach has
drawbacks. It is generally slower than interactive picking on the screen, and a user will probably
need prompts to remember the various structure names.
Input Functions for Graphical Data
Input functions in a graphics package can be structured according to several considerations, such as:
1. The input interaction mode for the graphics program and the input devices. Either the
program or the devices can initiate data entry, or both can operate simultaneously.
2. Selection of a physical device that is to provide input within a particular logical
classification (for example, a tablet used as a stroke device).
3. Selection of the input time and device for a particular set of data values.
Input Modes
Some input functions in an interactive graphics system are used to specify how the program and
input devices should interact. A program could request input at a particular time in the
processing (request mode), or an input device could independently provide updated input
(sample mode), or the device could independently store all collected data (event mode).
1) Request Mode
In request mode, the application program initiates data entry. When input values are requested,
processing is suspended until the required values are received. This input mode corresponds to
the typical input operation in a general programming language. The program and the input
devices operate alternately. Devices are put into a wait state until an input request is made; then
the program waits until the data are delivered.
2) Sample Mode
In sample mode, the application program and input devices operate independently. Input devices
may be operating at the same time that the program is processing other data. New values
obtained from the input devices replace previously input data values. When the program requires
new data, it samples the current values that have been stored from the device input.
3) Event Mode
In event mode, the input devices initiate data input to the application program. The program and
the input devices again operate concurrently, but now the input devices deliver data to an input
queue, also called an event queue. All input data is saved. When the program requires new data,
it goes to the data queue.
Typically, any number of devices can be operating at the same time in sample and event modes.
Some can be operating in sample mode, while others are operating in event mode. But only one
device at a time can deliver input in request mode.
Echo Feedback
Requests can usually be made in an interactive input program for an echo of input data and
associated parameters. When an echo of the input data is requested, it is displayed within a
specified screen area.
Callback Functions
For device-independent graphics packages, a limited set of input functions can be provided in an
auxiliary library. Input procedures can then be handled as callback functions that interact with
the system software. These functions specify what actions are to be taken by a program when an
input event occurs. Typical input events are moving a mouse, pressing a mouse button, or
pressing a key on the keyboard.
Interactive Picture-Construction Techniques
A variety of interactive methods are often incorporated into a graphics package as aids in the
construction of pictures. Routines can be provided for positioning objects, applying constraints,
adjusting the sizes of objects, and designing shapes and patterns.
1) Basic Positioning Methods
We can interactively choose a coordinate position with a pointing device that records a screen
location. How the position is used depends on the selected processing option. The coordinate
location could be an endpoint position for a new line segment, or it could be used to position
some object—for instance, the selected screen location could reference a new position for the
center of a sphere; or the location could be used to specify the position for a text string, which
could begin at that location or it could be centered on that location. As an additional positioning
aid, numeric values for selected positions can be echoed on the screen. With the echoed
coordinate values as a guide, a user could make small interactive adjustments in the coordinate
values using dials, arrow keys, or other devices.
2) Dragging
Another interactive positioning technique is to select an object and drag it to a new location.
Using a mouse, for instance, we position the cursor at the object position, press a mouse button,
move the cursor to a new position, and release the button. The object is then displayed at the new
cursor location. Usually, the object is displayed at intermediate positions as the screen cursor
moves.
3) Constraints
Any procedure for altering input coordinate values to obtain a particular orientation or alignment
of an object is called a constraint. For example, an input line segment can be constrained to be
horizontal or vertical, as illustrated in Figures 3.3 and 3.4. To implement this type of constraint,
we compare the input coordinate values at the two endpoints. If the difference in the y values of
the two endpoints is smaller than the difference in the x values, a horizontal line is displayed.
Otherwise, a vertical line is drawn.
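A minimal sketch of this test (the endpoint names and the drawLine routine are assumed, and abs requires <stdlib.h>):
if (abs (y2 - y1) < abs (x2 - x1))
   drawLine (x1, y1, x2, y1);   /* nearly horizontal input: force a horizontal line */
else
   drawLine (x1, y1, x1, y2);   /* otherwise force a vertical line */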
Other kinds of constraints can be applied to input coordinates to produce a variety of alignments.
Lines could be constrained to have a particular slant, such as 45◦, and input coordinates could be
constrained to lie along predefined paths, such as circular arcs.
4) Grids
Another kind of constraint is a rectangular grid displayed in some part of the screen area. With
an activated grid constraint, input coordinates are rounded to the nearest grid intersection. Figure
3.5 illustrates line drawing using a grid. Each of the cursor positions in this example is shifted to
the nearest grid intersection point, and a line is drawn between these two grid positions. Grids
facilitate object constructions, because a new line can be joined easily to a previously drawn line
by selecting any position near the endpoint grid intersection of one end of the displayed line.
Spacing between grid lines is often an option, and partial grids or grids with different spacing
could be used in different screen areas.
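A minimal sketch of grid rounding (gridSpacing and the input coordinates x, y are assumed names; floorf requires <math.h>):
GLfloat xGrid = gridSpacing * floorf (x / gridSpacing + 0.5f);   /* nearest vertical grid line */
GLfloat yGrid = gridSpacing * floorf (y / gridSpacing + 0.5f);   /* nearest horizontal grid line */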
Figure 3.5: Construction of a line segment with endpoints constrained to grid intersection
positions.
5) Rubber-Band Methods
Line segments and other basic shapes can be constructed and positioned using rubber-band
methods that allow the sizes of objects to be interactively stretched or contracted. Figure 3.6
demonstrates a rubber-band method for interactively specifying a line segment. First, a fixed
screen position is selected for one endpoint of the line. Then, as the cursor moves around, the
line is displayed from the start position to the current position of the cursor. The second endpoint
of the line is input when a button or key is pressed. Using a mouse, a rubber-band
line is constructed while pressing a mouse key. When the mouse key is released, the line display
is completed.
Similar rubber-band methods can be used to construct rectangles, circles, and other objects.
Figure 3.7 demonstrates rubber-band construction of a rectangle, and Figure 3.8 shows a rubber-
band circle construction.
6) Gravity Field
Gravity fields around the line endpoints are enlarged to make it easier for a designer to connect
lines at their endpoints. Selected positions in one of the circular areas of the gravity field are
attracted to the endpoint in that area. The size of gravity fields is chosen large enough to aid
positioning, but small enough to reduce chances of overlap with other lines. If many lines are
displayed, gravity areas can overlap, and it may be difficult to specify points correctly. Normally,
the boundary for the gravity field is not displayed.
Figure 3.9:A gravity field around a line. Any selected point in the shaded area is shifted to
a position on the line.
7) Painting and Drawing Methods
Options for sketching, drawing, and painting come in a variety of forms. Curve-drawing options
can be provided using standard curve shapes, such as circular arcs and splines, or with freehand
sketching procedures. Splines are interactively constructed by specifying a set of control points
or a freehand sketch that gives the general shape of the curve. Then the system fits the set of
points with a polynomial curve. In freehand drawing, curves are generated by following the path
of a stylus on a graphics tablet or the path of the screen cursor on a video monitor. Once a curve
is displayed, the designer can alter the curve shape by adjusting the positions of selected points
along the curve path.
Line widths, line styles, and other attribute options are also commonly found in painting and
drawing packages. Various brush styles, brush patterns, color combinations, object shapes, and
surface texture patterns are also available on systems, particularly those designed as artists’
workstations. Some paint systems vary the line width and brush strokes according to the pressure
of the artist’s hand on the stylus.
Virtual-Reality Environments
the object positions in the scene. With this system, a user can move through the scene and
rearrange object positions with the data glove
Another method for generating virtual scenes is to display stereographic projections on a raster
monitor, with the two stereographic views displayed on alternate refresh cycles. The scene is
then viewed through stereographic glasses. Interactive object manipulations can again be
accomplished with a data glove and a tracking device to monitor the glove position and
orientation relative to the position of objects in the scene.
Figure 3.10: Using a head-tracking stereo display, called the BOOM and a Dataglove a
researcher interactively manipulates exploratory probes in the unsteady flow around a
Harrier jet airplane.
Interactive device input in an OpenGL program is handled with routines in the OpenGL Utility
Toolkit (GLUT), because these routines need to interface with a window system. In GLUT, there
are functions to accept input from standard devices, such as a mouse or a keyboard, as well as
from tablets, space balls, button boxes, and dials. For each device, a procedure (the call back
function) is specified and it is invoked when an input event from that device occurs. These
GLUT commands are placed in the main procedure along with the other GLUT statements.
Following function is used to specify (“register”) a procedure that is to be called when the mouse
pointer is in a display window and a mouse button is pressed or released:
glutMouseFunc (mouseFcn);
Parameter button is assigned a GLUT symbolic constant that denotes one of the three mouse
buttons; allowable values are GLUT_LEFT_BUTTON, GLUT_MIDDLE_BUTTON, and GLUT_RIGHT_BUTTON.
Parameter action is assigned a symbolic constant, either GLUT_DOWN or GLUT_UP, that specifies
which button action we want to use to trigger the mouse activation event.
By activating a mouse button while the screen cursor is within the display window, we can select
a position for displaying a primitive such as a single point, a line segment, or a fill area.
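A minimal callback sketch (winHeight is an assumed display-window height, and a 2D world-coordinate system matching screen pixels is assumed), plotting a point at each left-button press:
GLint winHeight = 400;   // assumed height of the display window
void mouseFcn (GLint button, GLint action, GLint xMouse, GLint yMouse)
{
   if (button == GLUT_LEFT_BUTTON && action == GLUT_DOWN) {
      glBegin (GL_POINTS);
         glVertex2i (xMouse, winHeight - yMouse);   // flip y: GLUT measures from the top-left corner
      glEnd ( );
      glFlush ( );
   }
}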
A callback for processing mouse motion while one or more buttons are pressed is registered with
glutMotionFunc (fcnDoSomething);
This routine invokes fcnDoSomething when the mouse is moved within the display window with one
or more buttons activated. The invoked function receives two arguments, xMouse and yMouse, giving
the mouse location in the display window relative to the top-left corner when the mouse is moved
with a button pressed.
Some action can be performed when we move the mouse within the display window without
pressing a button:
glutPassiveMotionFunc(fcnDoSomethingElse);
With keyboard input, the following function is used to specify a procedure that
is to be invoked when a key is pressed:
glutKeyboardFunc (keyFcn);
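A minimal keyboard-callback sketch (the choice of the 'q' key is illustrative; exit requires <stdlib.h>):
void keyFcn (GLubyte key, GLint xMouse, GLint yMouse)
{
   if (key == 'q' || key == 'Q')
      exit (0);                       // quit the program when 'q' is pressed
}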
For function keys, arrow keys, and other special-purpose keys, following command can be used:
glutSpecialFunc (specialKeyFcn);
Usually, tablet activation occurs only when the mouse cursor is in the display
window. A button event for tablet input is recorded with
glutTabletButtonFunc (tabletFcn);
and the arguments for the invoked function are similar to those for a mouse:
We designate a tablet button with an integer identifier such as 1, 2, 3, and so on, and the button
action is specified with either GLUT_UP or GLUT_DOWN. The returned values xTablet and
yTablet are the tablet coordinates. The number of available tablet buttons can be determined
with the following command
glutDeviceGet (GLUT_NUM_TABLET_BUTTONS);
Motion of a stylus or hand cursor on the tablet surface is processed with
glutTabletMotionFunc (tabletMotionFcn);
The returned values xTablet and yTablet give the coordinates on the tablet surface.
Spaceball button input is specified with
glutSpaceballButtonFunc (spaceballFcn);
Spaceball buttons are identified with the same integer values as a tablet, and parameter action
is assigned either the value GLUT_UP or the value GLUT_DOWN. The number of available spaceball
buttons can be determined with a call to glutDeviceGet using the argument
GLUT_NUM_SPACEBALL_BUTTONS.
Translational motion of a spaceball is processed with
glutSpaceballMotionFunc (spaceballTranslFcn);
The three-dimensional translation distances are passed to the invoked function as integer values.
Spaceball rotation is registered with
glutSpaceballRotateFunc (spaceballRotFcn);
The three-dimensional rotation angles are then available to the callback function as integer values.
Input from a button box is specified with
glutButtonBoxFunc (buttonBoxFcn);
The buttons are identified with integer values, and the button action is specified as GLUT_UP or
GLUT_DOWN.
Dial input is specified with
glutDialsFunc (dialsFcn);
The callback function is used to identify the dial and obtain the angular amount of rotation.
Dials are designated with integer values, and the dial rotation is returned as an integer degree
value.
o Assign identifiers to objects and reprocess the scene using the revised
view volume. (Pick information is then stored in the pick buffer.)
o Restore the original viewing and geometric-transformation matrix.
o Determine the number of objects that have been picked, and return to
the normal rendering mode.
o Process the pick information.
Each record of pick information in the pick buffer contains the following items:
1. The stack position of the object, which is the number of identifiers in the name stack, up
to and including the position of the picked object.
2. The minimum depth of the picked object.
3. The maximum depth of the picked object.
4. The list of the identifiers in the name stack from the first (bottom) identifier to the
identifier for the picked object.
The integer depth values stored in the pick buffer are the original depth values, in the range
from 0 to 1.0, multiplied by 2^32 - 1.
The routine call glRenderMode (GL_SELECT) switches to selection mode. A scene is then processed
through the viewing pipeline but not stored in the frame buffer. A record of information for each
object that would have been displayed in the normal rendering mode is placed in the pick buffer.
In addition, this
command returns the number of picked objects, which is equal to the number of information
records in the pick buffer. To return to the normal rendering mode (the default), glRenderMode
routine is invoked using the argument GL_RENDER. A third option is the argument GL_
FEEDBACK, which stores object coordinates and other information in a feedback buffer
without displaying the objects.
Following statement is used to activate the integer-ID name stack for the picking operations:
glInitNames ( );
The ID stack is initially empty, and this stack can be used only in selection mode.
To place an unsigned integer value on the stack, following function can be invoked:
glPushName (ID);
This places the value for parameter ID on the top of the stack and pushes the
previous top name down to the next position in the stack.
The top of the stack can be replaced using
glLoadName (ID);
To eliminate the top of the ID stack, following command is used:
glPopName ( );
A pick window within a selected viewport is defined using the following
GLU function:
gluPickMatrix (xPick, yPick, widthPick, heightPick, vpArray);
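A condensed sketch of the selection-mode sequence described above (the pick coordinates xPick and yPick, the window limits passed to gluOrtho2D, and the drawObject routines are assumed here for illustration):
GLuint pickBuffer[32];
GLint numPicked, vpArray[4];

glGetIntegerv (GL_VIEWPORT, vpArray);                 // current viewport limits
glSelectBuffer (32, pickBuffer);                      // designate the pick buffer
glRenderMode (GL_SELECT);                             // switch to selection mode
glInitNames ( );                                      // activate the (empty) name stack
glPushName (0);                                       // placeholder identifier

glMatrixMode (GL_PROJECTION);
glPushMatrix ( );                                     // save the current projection matrix
glLoadIdentity ( );
gluPickMatrix (xPick, yPick, 5.0, 5.0, vpArray);      // 5 x 5 pick window about the cursor
gluOrtho2D (xwMin, xwMax, ywMin, ywMax);              // reapply the original 2D projection

glLoadName (1);                                       // identifier for the first object
drawObject1 ( );
glLoadName (2);                                       // identifier for the second object
drawObject2 ( );

glMatrixMode (GL_PROJECTION);
glPopMatrix ( );                                      // restore the original projection matrix
numPicked = glRenderMode (GL_RENDER);                 // back to normal rendering; returns hit count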
A GLUT pop-up menu is created, and its callback registered, with
glutCreateMenu (menuFcn);
where menuFcn is the procedure that is to be invoked when a menu option is selected. An entry is
added to the menu with
glutAddMenuEntry ("First Menu Item", 1);
where the first argument is the displayed label and the second is an integer identifier that is
passed to menuFcn when the option is selected. Additional entries are added with further calls:
...
glutAddSubMenu ("Submenu Option", submenuID);
The glutAddSubMenu function can also be used to add a submenu to the current menu.
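A brief sketch of a complete menu setup (the option handling in menuFcn is illustrative):
void menuFcn (GLint option)
{
   switch (option) {
      case 1:  /* process "First Menu Item" */   break;
      case 2:  /* process "Second Menu Item" */  break;
   }
   glutPostRedisplay ( );             // redraw the scene after the selection
}
...
glutCreateMenu (menuFcn);             // create the menu and register its callback
glutAddMenuEntry ("First Menu Item", 1);
glutAddMenuEntry ("Second Menu Item", 2);
glutAttachMenu (GLUT_RIGHT_BUTTON);   // pop up the menu with the right mouse button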
Modifying GLUT Menus
To change the mouse button that is used to select a menu option, first
cancel the current button attachment and then attach the new button. A button
attachment is cancelled for the current menu with
glutDetachMenu (mouseButton);
where parameter mouseButton is assigned the GLUT constant identifying the
button (left, middle, or right) that was previously attached to the menu.
After detaching the menu from the button, glutAttachMenu is used to attach
it to a different button.
Options within an existing menu can also be changed.
For example, an option in the current menu can be deleted with the function
glutRemoveMenuItem (itemNumber);
where parameter itemNumber is assigned the integer value of the menu option
that is to be deleted.
Designing a Graphical User Interface
A common feature of modern applications software is a graphical user interface (GUI) composed
of display windows, icons, menus, and other features to aid a user in applying the software to a
particular problem. Specialized interactive dialogues are designed so that programming options
are selected using familiar terms within a particular field, such as architectural and engineering
design, drafting, business graphics, geology, economics, chemistry, or physics. Other
considerations for a user interface (whether graphical or not) are the accommodation of various
skill levels, consistency, error handling, and feedback.
simultaneous combinations of keyboard keys, because experienced users will remember these
shortcuts for commonly used actions.
Help facilities can be designed on several levels so that beginners can carry on a detailed
dialogue, while more experienced users can reduce or eliminate prompts and messages. Help
facilities can also include one or more tutorial applications, which provide users with an
introduction to the capabilities and use of the system.
Consistency
An important design consideration in an interface is consistency. An icon shape should always
have a single meaning, rather than serving to represent different actions or objects depending on
the context.
Examples of consistency:
Always placing menus in the same relative positions so that a user does not have to hunt
for a particular option.
Always using the same combination of keyboard keys for an action.
Always using the same color encoding so that a color does not have different meanings in
different situations.
Minimizing Memorization
Operations in an interface should also be structured so that they are easy to understand and to
remember. Obscure, complicated, inconsistent, and abbreviated command formats lead to
confusion and reduction in the effective application of the software. One key or button used for
all delete operations, for example, is easier to remember than a number of different keys for
different kinds of delete procedures.
Icons and window systems can also be organized to minimize memorization. Different kinds of
information can be separated into different windows so that a user can identify and select items
easily. Icons should be designed as easily recognizable shapes that are related to application
objects and actions. To select a particular action, a user should be able to select an icon that
resembles that action.
Backup and Error Handling
A backup, or undo, mechanism allows a user to cancel a sequence of operations so that mistakes
can be corrected.
the system to some specified action. For those actions that cannot be reversed, such as closing an
application without saving changes, the system asks for a verification of the requested operation.
Good diagnostics and error messages help a user to determine the cause of an error. Interfaces
can attempt to minimize errors by anticipating certain actions that could lead to an error; and
users can be warned if they are requesting ambiguous or incorrect actions, such as attempting to
apply a procedure to multiple application objects.
Feedback
Responding to user actions is another important feature of an interface, particularly for an
inexperienced user. As each action is entered, some response should be given. Otherwise, a user
might begin to wonder what the system is doing and whether the input should be reentered.
Feedback can be given in many forms, such as highlighting an object, displaying an icon or
message, and displaying a selected menu option in a different color. When the processing of a
requested action is lengthy, the display of a flashing message, clock, hourglass, or other progress
indicator is important. It may also be possible for the system to display partial results as they are
completed, so that the final display is built up a piece at a time.
Standard symbol designs are used for typical kinds of feedback. A cross, a frowning face, or a
thumbs-down symbol is often used to indicate an error, and some kind of time symbol or a
blinking “at work” sign is used to indicate that an action is being processed.
This type of feedback can be very effective with a more experienced user, but the beginner may
need more detailed feedback that not only clearly indicates what the system is doing but also
what the user should input next.
Clarity is another important feature of feedback. A response should be easily understood, but not
so overpowering that the user’s concentration is interrupted. With function keys, feedback can be
given as an audible click or by lighting up the key that has been pressed. Audio feedback has the
advantage that it does not use up screen space, and it does not divert the user’s attention from the
work area. A fixed message area can be used so that a user always know where to look for
messages, but it may be advantageous in some cases to place feedback messages in the work area
near the cursor.
Echo feedback is often useful, particularly for keyboard input, so that errors can be detected
quickly. Selection of coordinate points can be echoed with a cursor or other symbol that appears
at the selected position
Computer Animation
Computer animation generally refers to any time sequence of visual changes in a picture.
A typical animation sequence is designed with the following steps:
1. Storyboard layout
2. Object definitions
3. Key-frame specifications
4. Generation of in-between frames
1. Storyboard Layout
The storyboard is an outline of the action. It defines the motion sequence as a set of basic events
that are to take place. Depending on the type of animation to be produced, the storyboard could
consist of a set of rough sketches, along with a brief description of the movements, or it could
just be a list of the basic ideas for the action. Originally, the set of motion sketches was attached
to a large board that was used to present an overall view of the animation project. Hence, the
name “storyboard.”
2. Object Definitions
An object definition is given for each participant in the action. Objects can be defined in terms
of basic shapes, such as polygons or spline surfaces. In addition, a description is often given of
the movements that are to be performed by each character or object in the story.
3. Key-Frame Specifications
A key frame is a detailed drawing of the scene at a certain time in the animation sequence.
Within each key frame, each object (or character) is positioned according to the time for that
frame. Some key frames are chosen at extreme positions in the action; others are spaced so that
the time interval between key frames is not too great. More key frames are specified for intricate
motions than for simple, slowly varying motions. Development of the key frames is generally the
responsibility of the senior animators, and often a separate animator is assigned to each character
in the animation.
4. Generation of In-Between Frames
In-betweens are the intermediate frames between the key frames. The total number of frames,
and hence the total number of in-betweens, needed for an animation is determined by the display
media that is to be used. Film requires 24 frames per second, and graphics terminals are
refreshed at the rate of 60 or more frames per second. Typically, time intervals for the motion are
set up so that there are from three to five in-betweens for each pair of key frames. Depending on
the speed specified for the motion, some key frames could be duplicated.
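As a rough illustration of these numbers (not from the notes): a one-minute film at 24 frames per second requires 24 × 60 = 1440 frames, and if each pair of key frames is separated by four in-betweens (five frames per key-frame interval), about 1440 / 5 = 288 key frames must be drawn.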
There are several other tasks that may be required, depending on the application. These
additional tasks include motion verification, editing, and the production and synchronization
of a soundtrack. Many of the functions needed to produce general animations are now
computer-generated.
Figure 3.11: One frame from the award-winning computer-animated short film Luxo Jr.
The film was designed using a key-frame animation system and cartoon animation
techniques to provide lifelike actions of the lamps. Final images were rendered with
multiple light sources and procedural texturing techniques.
Figure 3.12: One frame from the short film Tin Toy, the first computer-animated film to
win an Oscar. Designed using a key-frame animation system, the film also required
extensive facial-expression modeling. Final images were rendered using procedural
shading, self-shadowing techniques, motion blur, and texture mapping.
One of the most important techniques for simulating acceleration effects, particularly for
nonrigid objects, is squash and stretch.
Figure 3.13 shows how squash and stretch technique is used to emphasize the acceleration and
deceleration of a bouncing ball. As the ball accelerates, it begins to stretch. When the ball hits the
floor and stops, it is first compressed (squashed) and then stretched again as it accelerates and
bounces upwards.
Figure 3.13 : A bouncing-ball illustration of the “squash and stretch” technique for
emphasizing object acceleration.
Another technique used by film animators is timing. Timing refers to the spacing between
motion frames. A slower moving object is represented with more closely spaced frames, and a
faster moving object is displayed with fewer frames over the path of the motion. This effect is
illustrated in Figure 3.14, where the position changes between frames increase as a bouncing ball
moves faster.
Figure 3.14: The position changes between motion frames for a bouncing ball increase as
the speed of the ball increases.
Object movements can also be emphasized by creating preliminary actions that indicate an
anticipation of a coming motion. For example, a cartoon character might lean forward and rotate
its body before starting to run; or a character might perform a “windup” before throwing a ball.
Follow-through actions can be used to emphasize a previous motion. After throwing a ball, a
character can continue the arm swing back to its body; or a hat can fly off a character that is
stopped abruptly. An action also can be emphasized with staging. Staging refers to any method
for focusing on an important part of a scene, such as a character hiding something.
General Computer-Animation Functions
Some animation packages, such as Wavefront for example, provide special functions for both
the overall animation design and the processing of individual objects. Others are special-
purpose packages for particular features of an animation, such as a system for generating in-
between frames or a system for figure animation.
A set of routines is often provided in a general animation package for storing and managing the
object database. Object shapes and associated parameters are stored and updated in the database.
Other object functions include those for generating the object motion and those for rendering the
object surfaces. Movements can be generated according to specified constraints using two
dimensional or three-dimensional transformations. Standard functions can then be applied to
identify visible surfaces and apply the rendering algorithms.
Another typical function set simulates camera movements. Standard camera motions are
zooming, panning, and tilting. Finally, given the specification for the key frames, the in-
betweens can be generated automatically.
Computer-Animation Languages
Routines can be developed to design and control animation sequences within a general-purpose
programming language, such as C, C++, Lisp, or Fortran. But several specialized animation
languages have been developed.
Typical animation functions provided by such a language include:
A graphics editor
A key-frame generator
An in-between generator
Standard graphics routines
The graphics editor allows an animator to design and modify object shapes, using spline
surfaces, constructive solid geometry methods, or other representation schemes.
Another standard function is scene description, which includes the positioning of objects and
light sources, defining the photometric parameters (light-source intensities and surface
illumination properties), and setting the camera parameters (position, orientation, and lens
characteristics).
Another standard function is action specification. Action specification involves the layout of
motion paths for the objects and camera. Usual graphics routines are needed for viewing and
perspective transformations, geometric transformations to generate object movements as a
function of accelerations or kinematic path specifications, visible-surface identification, and the
surface-rendering operations.
Key-Frame Systems
Key-frame systems were originally designed as a separate set of animation routines for
generating the in-betweens from the user-specified key frames. Now, these routines are often a
component in a more general animation package. In the simplest case, each object in a scene is
defined as a set of rigid bodies connected at the joints and with a limited number of degrees of
freedom.
Example:
The single-armed robot in Figure 3.15 has 6 degrees of freedom, which are referred to as arm
sweep, shoulder swivel, elbow extension, pitch, yaw, and roll. The number of degrees of
freedom for this robot arm can be extended to 9 by allowing three-dimensional translations for
the base (Figure 3.16). If base rotations are allowed, the robot arm can have a total of 12 degrees
of freedom. The human body, in comparison, has more than 200 degrees of freedom.
Figure 3.16: Translational and rotational degrees of freedom for the base of the robot arm
Parameterized systems allow object motion characteristics to be specified as part of the object
definitions. The adjustable parameters control such object characteristics as degrees of freedom,
motion limitations, and allowable shape changes.
Scripting systems allow object specifications and animation sequences to be defined with a
user-input script. From the script, a library of various objects and motions can be constructed.
Character Animation
Animation of simple objects is relatively straightforward. It becomes much more difficult to
create realistic animation of more complex figures such as humans or animals. Consider the
animation of walking or running human (or humanoid) characters. Based upon observations in
their own lives of walking or running people, viewers will expect to see animated characters
move in particular ways. If an animated character’s movement doesn’t match this expectation,
the believability of the character may suffer. Thus, much of the work involved in character
animation is focused on creating believable movements.
The connecting points, or hinges, for an articulated figure are placed at the shoulders, hips,
knees, and other skeletal joints, which travel along specified motion paths as the body moves.
For example, when a motion is specified for an object, the shoulder automatically moves in a
certain way and, as the shoulder moves, the arms move. Different types of movement, such as
walking, running, or jumping, are defined and associated with particular motions for the joints
and connecting links.
Figure 3.17: A simple articulated figure with nine joints and twelve connecting links, not
counting the oval head
A series of walking leg motions, for instance, might be defined as in Figure 3.18. The hip joint is
translated forward along a horizontal line, while the connecting links perform a series of
movements about the hip, knee, and ankle joints. Starting with a straight leg [Figure 3.18(a)], the
first motion is a knee bend as the hip moves forward [Figure 3.18(b)]. Then the leg swings
forward, returns to the vertical position, and swings back, as shown in Figures 3.18(c), (d), and
(e). The final motions are a wide swing back and a return to the straight vertical position, as in
Figures 3.18(f) and (g). This motion cycle is repeated for the duration of the animation as the
figure moves over a specified distance or time interval.
Figure 3.18: Possible motions for a set of connected links representing a walking leg.
As a figure moves, other movements are incorporated into the various joints. A sinusoidal
motion, often with varying amplitude, can be applied to the hips so that they move about on the
torso. Similarly, a rolling or rocking motion can be imparted to the shoulders, and the head can
bob up and down.
Motion Capture
The classic motion capture technique involves placing a set of markers at strategic positions on
the actor’s body, such as the arms, legs, hands, feet, and joints. It is possible to place the markers
directly on the actor, but more commonly they are affixed to a special skintight body suit worn
by the actor. The actor is then filmed performing the scene. Image processing techniques are
then used to identify the positions of the markers in each frame of the film, and their positions
are translated to coordinates. These coordinates are used to determine the positioning of the body
of the animated character. The movement of each marker from frame to frame in the film is
tracked and used to control the corresponding movement of the animated character.
To accurately determine the positions of the markers, the scene must be filmed by multiple
cameras placed at fixed positions. The digitized marker data from each recording can then be
used to triangulate the position of each marker in three dimensions. Typical motion capture
systems will use up to two dozen cameras.
Optical motion capture systems rely on the reflection of light from a marker into the camera.
These can be relatively simple passive systems using photoreflective markers that reflect
illumination from special lights placed near the cameras, or more advanced active systems in
which the markers are powered and emit light.
Non-optical systems rely on the direct transmission of position information from the markers to a
recording device. Some non-optical systems use inertial sensors that provide gyroscope-based
position and orientation information.
Some motion capture systems record more than just the gross movements of the parts of the
actor’s body. It is possible to record even the actor’s facial movements. Often called
performance capture systems, these typically use a camera trained on the actor’s face and small
light-emitting diode (LED) lights that illuminate the face. Small photoreflective markers attached
to the face reflect the light from the LEDs and allow the camera to capture the small movements
of the muscles of the face, which can then be used to create realistic facial animation
on a computer-generated character.
Periodic Motions
When animation is constructed with repeated motion patterns, such as a rotating object, the
motion should be sampled frequently enough to represent the movements correctly. The motion
must be synchronized with the frame-generation rate so that enough frames are displayed per
cycle to show the true motion. Otherwise, the animation may be displayed incorrectly, as in the
rotating-wheel example of Figures 3.19 and 3.20.
The wheel turns at 18 revolutions per second while the animation is filmed at 24 frames per
second, so the wheel completes 3/4 of a turn between successive frames. Only one frame is
generated per 3/4 revolution, and the marked spoke thus appears to be rotating slowly in the
opposite (counterclockwise) direction.
The motion of a complex object can be much slower than we want it to be if it takes too long to
construct each frame of the animation.
Figure 3.19: Five positions for a red spoke during one cycle of a wheel motion that is
turning at the rate of 18 revolutions per second.
Figure 3.20: The first five film frames of the rotating wheel in Figure 3.19, produced at the
rate of 24 frames per second
OpenGL Animation Procedures
Double-buffering operations, if available, are activated using the following GLUT command:
glutInitDisplayMode (GLUT_DOUBLE);
This provides two buffers, called the front buffer and the back buffer, that we can use
alternately to refresh the screen display. While one buffer is acting as the refresh buffer for the
current display window, the next frame of an animation can be constructed in the other buffer.
We specify when the roles of the two buffers are to be interchanged using
glutSwapBuffers ( );
To determine whether double-buffering operations are available on a system, we can issue the
following query:
glGetBooleanv (GL_DOUBLEBUFFER, status);
A value of GL_TRUE is returned in the array parameter status if both front and back buffers are
available on the system; otherwise, the returned value is GL_FALSE.
A continuous animation can also be produced using the GLUT idle-function callback:
glutIdleFunc (animationFcn);
where parameter animationFcn can be assigned the name of a procedure that is to perform the
operations for incrementing the animation parameters. This procedure is continuously executed
whenever there are no display-window events that must be processed. To disable the
glutIdleFunc, we set its argument to the value NULL or the value 0.
Question Bank
1. Explain in detail about logical classification of input devices.
2. Explain request mode, sample mode and event mode.
3. Explain in detail about interactive picture construction techniques.
4. Write a note on virtual reality environment.
5. Explain different OpenGL interactive Input-Device functions.
6. Explain OpenGL menu functions in detail.
7. Explain about designing a graphical user interface.
8. Write a note on OpenGL Animation Procedures.
9. Explain character animation in detail.
10. Write a note on computer animation languages.
11. Explain briefly about general computer animation functions.
12. Explain in detail about traditional animation techniques.
13. Explain in detail about different stages involved in design of animation sequences.
14. Write a note on periodic motion.
Computer Graphics and Fundamentals of Image Processing(21CS63)
MODULE-4
Overview of Image Processing
Computers are faster and more accurate than human beings in processing numerical data.
However, human beings score over computers in recognition capability. The human brain is so
sophisticated that we recognize objects in a few seconds without much difficulty. Human beings
use all the five sensory organs to gather knowledge about the outside world. Among these
perceptions, visual information plays a major role in understanding the surroundings. Other kinds
of sensory information are obtained from hearing, taste, smell and touch.
With the advent of cheaper digital cameras and computer systems, we are witnessing a powerful
digital revolution, where images are being increasingly used to communicate effectively.
Images are encountered everywhere in our daily lives. We see many visual information sources
such as paintings and photographs in magazines, journals, image galleries, digital libraries,
newspapers, advertisement boards, television, and the Internet. Many of us take digital snaps of
important events in our lives and preserve them as digital albums. Then from the digital album,
we print digital pictures or mail them to our friends to share our feelings of happiness and
sorrow. Images are not used merely for entertainment purposes. Doctors use medical images to
diagnose problems for providing treatment. With modern technology, it is possible to image
virtually all anatomical structures, which is of immense help to doctors in providing better
treatment. Forensic imaging applications process fingerprints, faces and irises to identify
criminals. Industrial applications use imaging technology to count and analyse industrial
components. Remote sensing applications use images sent by satellites to locate the minerals
present in the earth.
Images are imitations of real-world objects. An image is a two-dimensional (2D) signal f(x,y), where
the values of the function f(x,y) represent the amplitude or intensity of the image. For processing
using digital computers, this image has to be converted into a discrete form using the process of
sampling and quantization, known collectively as digitization. In image processing, the term
‘image’ is used to denote the image data that is sampled, quantized and readily available in a
form suitable for further processing by digital computers. Image processing is an area that deals
with manipulation of visual information.
Objects are perceived by the eye because of light. The sun, lamps, and clouds are all examples of
radiation or light sources. The object is the target for which the image needs to be created. The
object can be people, industrial components, or the anatomical structure of a patient. The objects
can be two-dimensional, three-dimensional or multidimensional mathematical functions
involving many variables. For example, a printed document is a 2D object. Most real-world
objects are 3D.
Reflective mode imaging represents the simplest form of imaging and uses a sensor to acquire
the digital image. All video cameras, digital cameras, and scanners use some types of sensors for
capturing the image. Image sensors are important components of imaging systems. They convert
light energy to electric signals.
In emissive type imaging, images are acquired from self-luminous objects without the help of a
radiation source; the radiation emitted by the object is directly captured by the sensor to form an
image. Thermal imaging is an example of emissive type imaging. In thermal imaging, a
specialized thermal camera is used in low-light situations to produce images of objects based on
their temperature. Other examples of emissive type imaging are magnetic resonance imaging
(MRI) and positron emission tomography (PET).
Transmissive Imaging
In Transmissive imaging, the radiation source illuminates the object. The absorption of radiation
by the objects depends upon the nature of the material. Some of the radiation passes through the
objects. The attenuated radiation is sensed into an image. This is called transmissive imaging.
Examples of this kind of imaging are X-ray imaging, microscopic imaging, and ultrasound
imaging.
The first major challenge in image processing is to acquire the image for further processing.
Figure 4.1 shows three types of processing – optical, analog and digital image processing.
Optical image processing is the study of the radiation source, the object, and other optical
processes involved. It refers to the processing of images using lenses and coherent light beams
instead of computers. Human beings can see only the optical image. An optical image is the 2D
projection of a 3D scene. This is a continuous distribution of light in a 2D surface and contains
information about the object that is in focus. This is the kind of information that needs to be
captured for the target image. Optical image processing is an area that deals with the object,
optics, and how processes are applied to an image that is available in the form of reflected or
transmitted light. The optical image is said to be available in optical form till it is converted into
analog form.
An analog or continuous image is a continuous function f(x,y) where x and y are two spatial
coordinates. Analog signals are characterized by continuous signals varying with time. They are
often referred to as pictures. The processes that are applied to the analog signal are called analog
processes. Analog image processing is an area that deals with the processing of analog electrical
signals using analog circuits. The imaging systems that use film for recording images are also
known as analog imaging systems.
The analog signal is often sampled, quantized and converted into digital form using a digitizer.
Digitization refers to the process of sampling and quantization. Sampling is the process of
converting a continuous-valued image f(x,y) into a discrete image, as computers cannot handle
continuous data. So the main aim is to create a discretized version of the continuous data.
Sampling is a reversible process, as it is possible to get the original image back. Quantization is
the process of converting the sampled analog value of the function f(x,y) into a discrete-valued
integer. Digital image processing is an area that uses digital circuits, systems and software
algorithms to carry out the image processing operations. The image processing operations may
include quality enhancement of an image, counting of objects, and image analysis.
Digital image processing has become very popular now as digital images have many advantages
over analog images. Some of the advantages are as follows:
1. It is easy to post-process the image. Small corrections can be made in the captured image
using software.
2. It is easy to store the image in the digital memory.
3. It is possible to transmit the image over networks. So sharing an image is quite easy.
4. A digital image does not require any chemical process. So it is very environment friendly,
as harmful film chemicals are not required or used.
The disadvantages of digital images are relatively few. They include the initial cost, problems
associated with sensors such as high power consumption and potential equipment failure, and
security issues associated with the storage and transmission of digital images.
The final form of an image is the display image. The human eye can recognize only the optical
form. So the digital image needs to be converted to optical form through the digital to analog
conversion process.
Image processing is an exciting interdisciplinary field that borrows ideas freely from many
fields. Figure 4.2 illustrates the relationships between image processing and other related fields.
Computer graphics and image processing are very closely related areas. Image processing deals
with raster data or bitmaps, whereas computer graphics primarily deals with vector data. Raster
data or bitmaps are stored in a 2D matrix form and often used to depict real images. Vector
images are composed of vectors, which represent the mathematical relationships between the
objects. Vectors are lines or primitive curves that are used to describe an image. Vector graphics
are often used to represent abstract, basic line drawings.
The algorithms in computer graphics often take numerical data as input and produce an image as
output. However, in image processing, the input is often an image. The goal of image processing
is to enhance the quality of the image to assist in interpreting it. Hence, the result of image
processing is often an image or the description of an image. Thus, image processing is a logical
extension of computer graphics and serves as a complementary field.
Human beings interact with the environment by means of various signals. In digital signal
processing, one often deals with the processing of a one-dimensional signal. In the domain of
image processing, one deals with visual information that is often in two or more dimensions.
Therefore, image processing is a logical extension of signal processing.
The main goal of machine vision is to interpret the image and to extract its physical, geometric,
or topological properties. Thus, the output of image processing operations can be subjected to
more techniques, to produce additional information for interpretation. Artificial vision is a vast
field, with two main subfields –machine vision and computer vision. The domain of machine
vision includes many aspects such as lighting and camera, as part of the implementation of
industrial projects, since most of the applications associated with machine vision are automated
visual inspection systems. The applications involving machine vision aim to inspect a large
number of products and achieve improved quality controls. Computer vision tries to mimic the
human visual system and is often associated with scene understanding. Most image processing
algorithms produce results that can serve as the first input for machine vision algorithms.
Image processing is about still images. Analog video cameras can be used to capture still images.
A video can be considered as a collection of images indexed by time. Most image processing
algorithms work with video readily. Thus, video processing is an extension of image processing.
Images are strongly related to multimedia, as the field of multimedia broadly includes the study
of audio, video, images, graphics and animation.
Optical image processing deals with lenses, light, lighting conditions, and associated optical
circuits. The study of lenses and lighting conditions has an important role in study of image
processing.
Image analysis is an area that concerns the extraction and analysis of object information from the
image. Imaging applications involve both simple statistics such as counting and mensuration and
complex statistics such as advanced statistical inference. So statistics plays an important role in
imaging applications. Image understanding is an area that applies statistical inferencing to extract
more information from the image.
(a) (b)
Figure 4.3: Digital Image Representation (a) Small binary digital image (b) Equivalent
image contents in matrix form
Figure 4.3(a) shows a displayed image. The source of the image is a matrix as shown in Fig.
4.3(b). The image has five rows and five columns. In general, the image can be written as a
mathematical function f(x,y) as follows:
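The matrix that the function notation refers to is not reproduced in these notes; the standard layout, reconstructed from the surrounding description, is:

f(x,y) = [ f(0,0)       f(0,1)       ...   f(0,Y-1)
           f(1,0)       f(1,1)       ...   f(1,Y-1)
           ...          ...                ...
           f(X-1,0)     f(X-1,1)     ...   f(X-1,Y-1) ]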
In general, the image f(x,y) is divided into X rows and Y columns. Thus, the coordinate ranges
are {x=0,1,……X-1} and {y=0,1,2,…….Y-1}. At the intersection of rows and columns, pixels
are present. Pixels are the building blocks of digital images. Pixels combine together to give a
digital image. Pixel represents discrete data. A pixel can be considered as a single sensor,
photosite(physical element of the sensor array of a digital camera), element of a matrix, or
display element on a monitor.
The value of the function f(x,y) at every point indexed by row and a column is called grey value
or intensity of the image. The value of the pixel is the intensity value of the image at that point.
The intensity value is the sampled, quantized value of the light that is captured by the sensor at
that point. It is a number and has no units.
The number of rows in a digital image is called vertical resolution. The number of columns is
called horizontal resolution. The number of rows and columns describes the dimensions of the
image. The image size is often expressed in terms of the rectangular pixel dimensions of the
array. Images can be of various sizes. Some examples of image size are 256 X 256, 512 X 512.
For a digital camera, the image size is defined as the number of pixels (specified in megapixels)
Spatial resolution of the image is very crucial as the digital image must show the object and its
separation from the other spatial objects that are present in the image clearly and precisely.
A useful way to define resolution is the smallest number of discernible line pairs per unit
distance; a resolution can then be quantified as, for example, 200 line pairs per mm.
Spatial resolution depends on two parameters – the number of pixels of the image and the
number of bits necessary for adequate intensity resolution, referred to as the bit depth. The
numbers of pixels determine the quality of the digital image. The total number of pixels that are
present in the digital image is the number of rows multiplied by the number of columns.
The choice of bit depth is very crucial and often depends on the precision of the measurement
system. To represent the pixel intensity value, certain bits are required. For example, in binary
images, the possible pixel values are 0 or 1. To represent two values, one bit is sufficient. The
number of bits necessary to encode the pixel value is called bit depth. The number of
representable intensity levels is a power of two and can be written as 2^m, where m is the bit
depth. In monochrome grey scale images (e.g., medical images such as X-rays and ultrasound
images), the pixel values can be between 0 and 255. Hence, eight bits are used to represent the
grey shades between 0 and 255 (as 2^8 = 256). So the bit depth of grey scale
images is 8. In colour images, the pixel value is characterized by both colour value and intensity
value. So colour resolution refers to the number of bits used to represent the colour of the pixel.
The set of all colours that can be represented by the bit depth is called gamut or palette.
Spatial resolution depends on the number of pixels present in the image and the bit depth.
Keeping the number of pixels constant but reducing the quantization levels (bit depth) leads to a
phenomenon called false contouring. The decrease in the number of pixels while retaining the
quantization levels leads to a phenomenon called checkerboard effect (or pixelization error).
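As an illustrative sketch (not part of the original notes), both effects can be reproduced with NumPy by reducing the number of quantization levels or the number of pixels; the array img is an assumed stand-in for an 8-bit grey scale image.

import numpy as np

def reduce_bit_depth(img, m):
    # Keep only 2**m grey levels: too few levels produces false contouring.
    step = 256 // (2 ** m)
    return (img // step) * step

def downsample(img, factor):
    # Keep every 'factor'-th pixel: too few pixels produces the checkerboard effect.
    return img[::factor, ::factor]

img = np.random.randint(0, 256, (256, 256), dtype=np.uint8)   # stand-in grey scale image
coarse_levels = reduce_bit_depth(img, 3)    # only 8 grey levels
coarse_pixels = downsample(img, 8)          # 32 x 32 pixels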
A 3D image is a function f(x,y,z) where x,y, and z are spatial coordinates. In 3D images, the
term ‘voxel’ is used for pixel. Voxel is an abbreviation of ‘volume element’.
Types of Images
Images can be classified based on many criteria.
Based on Nature
Images can be broadly classified as natural and synthetic images. Natural images are images
of the natural objects obtained using devices such as cameras or scanners. Synthetic images are
images that are generated using computer programs.
Based on Attributes
Based on attributes, images can be classified as raster images and vector graphics. Vector
graphics use basic geometric attributes such as lines and circles, to describe an image. Hence the
notion of resolution is practically not present in graphics. Raster images are pixel-based. The
quality of the raster images is dependent on the number of pixels. So operations such as
enlarging or blowing-up of a raster image often result in quality reduction.
Based on Colour
Based on colour, images can be classified as grey scale, binary, true colour and pseudocolour
images.
Grayscale and binary images are called monochrome images as there is no colour component in
these images. True colour(or full colour) images represent the full range of available colours. So
the images are almost similar to the actual object and hence called true colour images. In
addition, true colour images do not use any lookup table but store the pixel information with full
precision. Pseudocolour images are false colour images where the colour is added artificially
based on the interpretation of the data.
Grey scale images are different from binary images as they have many shades of grey between
black and white. These images are also called monochromatic as there is no colour component in
the image, like in binary images. Grey scale is the term that refers to the range of shades between
white and black or vice versa.
Eight bits (2^8 = 256) are enough to represent grey scale as the human visual system can
distinguish only 32 different grey levels. The additional bits are necessary to cover noise
margins. Most medical images such as X-rays, CT images, MRIs and ultrasound images are grey
scale images. These images may use more than eight bits. For example, CT images may require a
range of 10-12 bits to accurately represent the image contrast.
(a) (b)
Figure 4.5: Monochrome images (a) Grey scale image (b) Binary image
In binary images, the pixels assume a value of 0 or 1. So one bit is sufficient to represent the
pixel value. Binary images are also called bi-level images. In image processing, binary images
are encountered in many ways.
The binary image is created from a grey scale image using a threshold process. The pixel value
is compared with the threshold value. If the pixel value of the grey scale image is greater than the
threshold value, the pixel value in the binary image is considered as 1.Otherwise, the pixel value
is 0. The binary image created by applying the threshold process to the grey scale image in
Fig. 4.5(a) is displayed in Fig. 4.5(b). It can be observed that most of the details are eliminated.
However, binary images are often used in representing basic shapes and line drawings. They are
also used as masks. In addition, image processing operations produce binary images at
intermediate stages.
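A minimal sketch of the threshold process just described, assuming a NumPy grey scale array and an illustrative threshold of 128 (neither is prescribed by the notes):

import numpy as np

def to_binary(grey, threshold=128):
    # Pixels above the threshold become 1; all others become 0.
    return (grey > threshold).astype(np.uint8)

grey = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # stand-in grey scale image
binary = to_binary(grey)                                     # bi-level (0/1) image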
In true colour images, the pixel has a colour that is obtained by mixing the primary colours
red,green and blue. Each colour component is represented like a grey scale image using eight
bits. Mostly, true colour images use 24 bits to represent all the colours. Hence true colour images
can be considered as three-band images. The number of possible colours is 256^3 (i.e.,
256 x 256 x 256 = 16,777,216 colours).
Figure 4.6(a) shows a colour image and its three primary colour components. Figure 4.6(b)
illustrates the general storage structure of the colour image. A display controller then uses a
digital-to-analog converter(DAC) to convert the colour value to the pixel intensity of the
monitor.
Figure 4.6: True colour images (a) Original image and its colour components
(b)
(c)
Figure 4.6: (b) Storage structure of colour images (c) Storage structure of an indexed
image
A special category of colour images is the indexed image. In most images, the full range of
colours is not used. So it is better to reduce the number of bits by maintaining a colour map,
gamut, or palette with the image. Figure 4.6(c) illustrates the storage structure of an indexed
image. The pixel value can be considered as a pointer to the index, which contains the address of
the colour map. The colour map has RGB components. Using this indexed approach, the number
of bits required to represent the colours can be drastically reduced. The display controller uses a
DAC to convert the RGB value to the pixel intensity of the monitor.
Like true colour images, pseudocolour images are also widely used in image processing. True
colour images are called three-band images. However, in remote sensing applications multi-band
images or multi-spectral images are generally used. These images, which are captured by
satellites, contain many bands. A typical remote sensing image may have 3 to 11 bands in an
image. This information is beyond the human perceptual range. Hence it is mostly not visible to
the human observer. So colour is artificially added to these bands, so as to distinguish the bands
and to increase operational convenience. These are called artificial colour or pseudocolour
images. Pseudocolour images are popular in the medical domain also. For example, the Doppler
colour image is a pseudocolour image.
Based on Dimensions
Images can be classified based on dimension also. Normally, digital images are 2D rectangular
array of pixels. If another dimension, of depth or any other characteristics, is considered, it may
be necessary to use a higher-order stack of images. A good example of a 3D image is a volume
image, where pixels are called voxels. By ‘3D image’, it is meant that the dimension of the target
in the imaging system is 3D. The target of the imaging system may be a scene or an object. In
medical imaging, some of the frequently encountered images are CT images, MRIs and
microscopy images. Range images, which are often used in remote sensing applications, are also
3D images.
Images may be classified based on their data type. A binary image is a 1-bit image as one bit is
sufficient to represent black and white pixels. Grey scale images are stored as one-byte(8-bit) or
two-byte (16-bit) images. With one byte, it is possible to represent 2^8 = 256 shades (0-255), and
with 16 bits, it is possible to represent 2^16 = 65,536 shades. Colour images often use 24
or 32 bits to represent the colour and intensity value.
Sometimes, image processing operations produce images with negative numbers, decimal
fractions, and complex numbers. For example, Fourier transforms produce images involving
complex numbers. To handle, negative numbers, signed and unsigned integer types are used. In
these data types, the first bit is used to encode whether the number is positive or negative.
Floating-point representation involves storing the data in scientific notation. For example, 1230
can be represented as 0.123 x 10^4, where 0.123 is called the significand and the power (4) is
called the exponent. There are many floating-point conventions.
The quality of such data representation is characterized by parameters such as data accuracy
and precision. Data accuracy is the property of how well the pixel values of an image are able
to represent the physical properties of the object that is being imaged. Data accuracy is an
important parameter, as the failure to capture the actual physical properties of the image leads to
the loss of vital information that can affect the quality of the application. While accuracy refers
to the correctness of a measurement, precision refers to the repeatability of the measurement.
Repeated measurements of the physical properties of the object should give the same result.
Most software use the data type ‘double’ to maintain precision as well as accuracy.
Images can be classified based on the domains and applications where such images are
encountered.
Range Images
Range images are often encountered in computer vision. In range images, the pixel values denote
the distance between the object and the camera. These images are also referred to as depth
images. This is in contrast to all other images whose pixel values denote intensity and hence are
often known as intensity images.
Multispectral Images
Multispectral images are encountered mostly in remote sensing applications. These images are
taken at different bands of visible or infrared regions of the electromagnetic wave. Multispectral
images may have many bands that may include infrared and ultraviolet regions of the
electromagnetic spectrum.
For example, an analog image of size 3x3 is represented in the first quadrant of the Cartesian
coordinate system as shown in Fig 4.7.
Figure 4.7: Analog image f(x,y) in the first quadrant of Cartesian coordinate system
Figure 4.7 illustrates an image f(x,y) of dimension 3x3, where f(0,0) is the bottom left corner.
Since it starts from the coordinate position (0,0), it ends with f(2,2); in general, x = 0, 1, ..., M-1
and y = 0, 1, ..., N-1, where M and N define the dimensions of the image (here M = N = 3).
In digital image processing, the discrete form of the image is often used. Discrete images are
usually represented in the fourth quadrant of the Cartesian coordinate system. A discrete image
f(x,y) of dimension 3x3 is shown in Fig. 4.8(a)
Many programming environments, including MATLAB, start with an index of (1,1). The
equivalent representation of the given matrix is shown in Fig 4.8(b)
Figure 4.8: Discrete image (a) Image in the fourth quadrant of Cartesian coordinate system
(b) Image coordinates as handled by software environments such as MATLAB
The coordinates used for discrete image is, by default, the fourth quadrant of the Cartesian
system.
Image Topology
Image topology is a branch of image processing that deals with the fundamental properties of the
image such as image neighbourhood, paths among pixels, boundary, and connected components.
It characterizes the image with topological properties such as neighbourhood, adjacency and
connectivity. Neighbourhood is fundamental to understanding image topology. Neighbours of a
given reference pixel are those pixels with which the given reference pixel shares its edges and
corners.
In N4(p), the reference pixel p(x,y) at the coordinate position (x,y) has two horizontal and two
vertical pixels as neighbours. This is shown graphically in Fig. 4.9.
[ 0      X      0
  X   p(x,y)    X
  0      X      0 ]
The reference pixel p(x,y) may also have four diagonal neighbours. They are (x-1,y-1), (x+1,y+1), (x-1,y+1) and (x+1,y-1). The
diagonal pixels for the reference pixel p(x,y) are shown graphically in Fig 4.10.
[ X      0      X
  0   p(x,y)    0
  X      0      X ]
The diagonal neighbours of pixel p(x,y) are represented as ND (p). The 4-neighbourhood and ND
are collectively called the 8-neighbourhood. This refers to all the neighbours and pixels that
share a common corner with the reference pixel p(x,y). These pixels are called indirect
neighbours. This is represented as N8(p) and is shown graphically in Fig 4.11.
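The neighbourhood definitions can be written out directly. The following sketch (illustrative only, ignoring image borders) returns the coordinate lists N4(p), ND(p) and N8(p) for a pixel p at (x, y):

def n4(x, y):
    # Horizontal and vertical (direct) neighbours.
    return [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]

def nd(x, y):
    # Diagonal neighbours.
    return [(x - 1, y - 1), (x + 1, y + 1), (x - 1, y + 1), (x + 1, y - 1)]

def n8(x, y):
    # 8-neighbourhood: the union of N4(p) and ND(p).
    return n4(x, y) + nd(x, y)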
Connectivity
The relationship between two or more pixels is defined by pixel connectivity. Connectivity
information is used to establish the boundaries of objects. The pixels p and q are said to be
connected if certain conditions on pixel brightness specified by the set V and spatial adjacency
are satisfied. For a binary image, this set V will be {0,1} and for grey scale images, V might be
any range of grey levels.
4-Connectivity: The pixels p and q are said to be in 4-connectivity when both have the same
values as specified by the set V and q is in the set N4(p). On a 4-connected path from p to q,
every pixel is 4-connected to the next pixel.
8-Connectivity: It is assumed that the pixels p and q share a common grey scale value. The
pixels p and q are said to be in 8-connectivity if q is in the set N8(p).
Mixed Connectivity: Mixed connectivity is also known as m-connectivity. Two pixels p and q
are said to be in m-connectivity when either of the following holds:
1. q is in N4(p), or
2. q is in ND(p) and the intersection of N4(p) and N4(q) is empty (it contains no pixels with values from V).
Relations
A binary relation between two pixels a and b, denoted as aRb, specifies a pair of elements of an
image.
For example, consider the image pattern given in Fig 4.14. The set is given as A ={x1,x2,x3}. The
set based on the 4-connectivity relation is given as A={x1,x2}. It can be observed that x3 is
ignored as it is not connected to any other element of the image by 4-connectivity.
Reflexive: For any element a in the set A, if the relation aRa holds, this is known as a reflexive
relation.
Symmetric: If aRb implies that bRa also exists, this is known as a symmetric relation.
Transitive : If the relation aRb and bRc exist, it implies that the relationship aRc also exists.
This is called the transitivity property.
If all these three properties hold, the relationship is called an equivalence relation.
Distance Measures
The distance between the pixels p and q in an image can be given by distance measures such as
Euclidean distance, D4 distance and D8 distance. Consider three pixels p, q, and z. If the
coordinates of the pixels are P(x,y),Q(s,t) and Z(u,w) as shown in Fig.4.15, the distances
between the pixels can be calculated.
The distance function D can be called a metric if the following properties are satisfied:
1. D(p,q) >= 0, and D(p,q) = 0 only if p = q
2. D(p,q) = D(q,p)
3. D(p,z) <= D(p,q) + D(q,z)
The Euclidean distance between the pixels p and q, with coordinates (x,y) and (s,t) respectively,
can be defined as
De(p,q) = sqrt( (x-s)^2 + (y-t)^2 )
The advantage of the Euclidean distance is its simplicity. However, since its calculation involves
a square root operation, it is computationally costly. The D4 (city-block) distance and the D8
(chessboard) distance are given by
D4(p,q) = |x-s| + |y-t|
D8(p,q) = max(|x-s|, |y-t|)
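A small sketch of the three distance measures for pixels p = (x, y) and q = (s, t), assuming plain coordinate tuples as input (illustrative only):

import math

def euclidean(p, q):
    (x, y), (s, t) = p, q
    return math.sqrt((x - s) ** 2 + (y - t) ** 2)

def d4(p, q):
    # City-block distance.
    (x, y), (s, t) = p, q
    return abs(x - s) + abs(y - t)

def d8(p, q):
    # Chessboard distance.
    (x, y), (s, t) = p, q
    return max(abs(x - s), abs(y - t))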
1. The set of pixels that has connectivity in a binary image is said to be characterized by the
connected set.
2. A digital path or curve from pixel p to another pixel q is a sequence of points p0, p1, ..., pn
with coordinates (x0,y0), (x1,y1), ..., (xn,yn), where p = (x0,y0) and q = (xn,yn). The number of
pixels in the path is called its length. If (x0,y0) = (xn,yn), then the path is called a closed path.
3. R is called a region if it is a connected component.
4. If a path between any two pixels p and q lies within the connected set S, it is called a
connected component of S. If the set has only one connected component, then the set S is
called a connected set. A connected set is called a region.
5. Two regions R1 and R2 are called adjacent if the union of these sets also forms a connected
component; otherwise, the regions are called disjoint. In Fig 4.16, two regions R1 and R2 are
shown. These regions are adjacent because the underlined pixels '1' have 8-connectivity.
6. The border of a region is called its contour or boundary. A boundary is the set of pixels of a
region that have one or more neighbours outside the region. Typically, in a binary image, there is
a foreground object and a background; the border of the foreground object has at least one
neighbour in the background. If the border pixels are taken from within the region itself, the
border is called the inner boundary, and it need not be closed.
7. Edges are present whenever there is an abrupt intensity change among pixels. Edges are
similar to boundaries, but may or may not be connected. If edges are disjoint, they have
to be linked together by edge-linking algorithms. However, boundaries are global and
have a closed path. Figure 4.17 illustrates two regions and an edge. It can be observed
that edges provide an outline of the object. The pixels that are covered by the edges lead
to regions.
Based on the pixels used to compute each output value, image operations can be classified as:
1. Point Operations
2. Local Operations
3. Global Operations
Point operations are those whose output value at a specific coordinate is dependent only on the
input value. A local operation is one whose output value at a specific coordinate is dependent on
the input values in the neighbourhood of that pixel. Global operations are those whose output
value at a specific coordinate is dependent on all the values in the input image.
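As a rough NumPy sketch (the array img is an assumed 8-bit grey scale image; none of these particular operations are prescribed by the notes), the three classes can be contrasted as follows:

import numpy as np

img = np.random.randint(0, 256, (128, 128), dtype=np.uint8)

# Point operation: output at (x, y) depends only on the input value at (x, y).
negative = 255 - img

# Local operation: output at (x, y) depends on a neighbourhood, here a 3x3 mean.
padded = np.pad(img.astype(np.float64), 1, mode='edge')
local_mean = sum(padded[dy:dy + 128, dx:dx + 128]
                 for dy in range(3) for dx in range(3)) / 9.0

# Global operation: output at every pixel depends on all input values,
# for example subtracting the global mean intensity.
global_result = img.astype(np.float64) - img.mean()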
Based on linearity, image operations can also be classified as:
1. Linear Operations
2. Non-linear Operations
An operator H is called a linear operator if it obeys the following rules of additivity and
homogeneity:
1. Property of additivity:
H(a1 f1(x,y) + a2 f2(x,y)) = a1 H(f1(x,y)) + a2 H(f2(x,y)) = a1 g1(x,y) + a2 g2(x,y)
2. Property of homogeneity:
H(k f1(x,y)) = k H(f1(x,y)) = k g1(x,y)
Image operations are array operations: they are carried out on a pixel-by-pixel basis. Array
operations are different from matrix operations. For example, consider two images
F1 = [ A  B        F2 = [ E  F
       C  D ]             G  H ]
The multiplication of F1 and F2 is element-wise, as follows:
F1 x F2 = [ AE  BF
            CG  DH ]
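The difference is easy to see in NumPy (an illustrative sketch with numeric stand-ins for the symbols above):

import numpy as np

F1 = np.array([[1, 2],
               [3, 4]])
F2 = np.array([[5, 6],
               [7, 8]])

elementwise = F1 * F2    # array (pixel-by-pixel) product: [[5, 12], [21, 32]]
matrix_prod = F1 @ F2    # linear-algebra matrix product:  [[19, 22], [43, 50]]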
Arithmetic Operations
Arithmetic operations include image addition, subtraction, multiplication, division and blending.
Image Addition
g(x,y)=f1(x,y)+f2(x,y)
The pixels of the input images f1(x,y) and f2(x,y) are added to obtain the resultant image g(x,y).
Figure 4.18 shows the effect of adding a noise pattern to an image. However, during the image
addition process, care should be taken to ensure that the sum does not cross the allowed range.
For example, in a grey scale image, the allowed range is 0-255, using eight bits. If the sum is
above the allowed range, the pixel value is set to the maximum allowed value. Similarly, it is
possible to add a constant value to a single image, as follows:
g(x,y)=f1(x,y)+k
If the value of k is larger than 0, the overall brightness is increased. Figure 4.18(d) illustrates
that the addition of the constant 50 increases the brightness of the image.
The brightness of an image is the average pixel intensity of the image. If a positive or negative
constant is added to all the pixels of an image, the average pixel intensity increases or decreases
respectively. Practical applications of image addition include superimposing one image on
another (such as the noise pattern in Fig. 4.18) and increasing the overall brightness by adding a
constant.
Figure 4.18: Results of the image addition operations (a) Image1 (b) Image 2 (c) Addition of
images 1 and 2 (d) Addition of image 1 and constant 50
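A minimal sketch of saturating image addition with NumPy (illustrative only; f1 and f2 are assumed to be equally sized 8-bit grey scale arrays):

import numpy as np

def add_images(f1, f2):
    # Add in a wider type, then clip to the allowed 0-255 range.
    s = f1.astype(np.int16) + f2.astype(np.int16)
    return np.clip(s, 0, 255).astype(np.uint8)

def add_constant(f, k):
    s = f.astype(np.int16) + k
    return np.clip(s, 0, 255).astype(np.uint8)

f1 = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
f2 = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
g = add_images(f1, f2)
brighter = add_constant(f1, 50)    # corresponds to the constant used in Fig. 4.18(d)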
Image Subtraction
The subtraction of two images can be done as follows. Consider
g(x,y)=f1(x,y)-f2(x,y)
where f1(x,y) and f2(x,y) are two input images and g(x,y) is the output image. To avoid negative
values, it is desirable to find the modulus of the difference as
g(x,y)=| f1(x,y)-f2(x,y)|
Similarly, a constant k can be subtracted from an image as g(x,y) = |f1(x,y) - k|. The decrease in
the average intensity reduces the brightness of the image. Some of the practical applications of
image subtraction are as follows:
1. Background elimination
2. Brightness reduction
3. Change detection
If there is no difference between the frames, the subtraction process yields zero, and if there is
any difference, it indicates the change. Figure 4.19 (a) -4.19(d) show the difference between the
images. In addition, it illustrates that the subtraction of a constant results in a decrease of the
brightness.
Figure 4.19: Results of the image subtraction operation (a) Image 1 (b) Image 2 (c)
Subtraction of images 1 and 2 (d) Subtraction of constant 50 from image 1
Image Multiplication
Consider
g(x,y)=f1(x,y) x f2(x,y)
f1(x,y) and f2(x,y) are two input images and g(x,y) is the output image. If the multiplied value
crosses the maximum value of the data type of the images, the value of the pixel is reset to the
maximum allowed value. Similarly, scaling by a constant can be performed as
g(x,y)=f(x,y)x k
where k is a constant.
If k is greater than 1, the overall contrast increases. If k is less than 1, the contrast decreases. The
brightness and contrast can be manipulated together as
g(x,y)=af(x,y)+k
Parameters a and k are used to manipulate the brightness and contrast of the input image. g(x,y)
is the output image. Some of the practical applications of image multiplication are as follows:
1. It increases contrast. If a fraction less than 1 is multiplied with the image, it results in
decrease of contrast. Figure 4.20 shows that by multiplying a factor of 1.25 with the
original image, the contrast of the image increases.
2. It is useful for designing filter masks.
3. It is useful for creating a mask to highlight the area of interest.
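The combined brightness/contrast manipulation g(x,y) = a·f(x,y) + k can be sketched in the same way (illustrative values only, such as the factor 1.25 used in Fig. 4.20):

import numpy as np

def scale_and_offset(f, a, k):
    # a > 1 raises contrast, a < 1 lowers it; k shifts the overall brightness.
    g = a * f.astype(np.float64) + k
    return np.clip(g, 0, 255).astype(np.uint8)

f = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
higher_contrast = scale_and_offset(f, 1.25, 0)
lower_contrast = scale_and_offset(f, 0.8, 0)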
Figure 4.20: Result of multiplication operation (image x 1.25) resulting in good contrast
Image Division
Division can be performed as
g(x,y) = f1(x,y)/f2(x,y)
where f1(x,y) and f2(x,y) are two input images and g(x,y) is the output image.
The division process may result in floating-point numbers. Hence, the float data type should be
used in programming. Improper data type specification of the image may result in loss of
information. Division by a constant can also be performed as g(x,y) = f(x,y)/k, where k is a
constant. Some of the practical applications of image division are as follows:
1. Change detection
2. Separation of luminance and reflectance components
3. Contrast reduction
Figure 4.21 (a) shows such an effect when the original image is divided by 1.25.
Figure 4.21 (b)-4.21(e) show the multiplication and division operations used to create a mask. It
can be observed that image 2 is used as a mask. The multiplication of image 1 with image 2
results in highlighting certain portions of image 1 while suppressing the other portions. It can be
observed that division yields back the original image.
Figure 4.21: Image division operation (a) Result of the image operation (image/1.25) (b)
Image 1 (c) Image 2 used as a mask (d) Image 3=image 1 x image 2 (e) Image 4=image
3/image 1
Image addition is also used to reduce noise by averaging. A noisy image can be modelled as
g(x,y) = f(x,y) + η(x,y)
where f(x,y) is the noise-free input image, η(x,y) is the additive noise, and g(x,y) is the observed
noisy image. Several instances of noisy images can be averaged as
ḡ(x,y) = (1/M) Σ gi(x,y),   i = 1, ..., M
where M is the number of noisy images. As M increases, the averaging process reduces the
intensity of the noise until it becomes low enough to be negligible. As M becomes large, the
expectation of the averaged image, E{ḡ(x,y)}, approaches the noise-free image f(x,y).
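A rough sketch of noise reduction by averaging M noisy observations of the same scene (illustrative only; Gaussian noise and M = 16 are assumed purely for the demonstration):

import numpy as np

def average_images(noisy_images):
    # Mean of M noisy frames; the noise variance falls roughly as 1/M.
    stack = np.stack([im.astype(np.float64) for im in noisy_images])
    return np.clip(stack.mean(axis=0), 0, 255).astype(np.uint8)

clean = np.full((64, 64), 120, dtype=np.uint8)                  # stand-in scene
noisy = [np.clip(clean + np.random.normal(0, 20, clean.shape), 0, 255)
         for _ in range(16)]                                    # M = 16 noisy frames
averaged = average_images(noisy)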
Logical Operations
Bitwise operations can be applied to image pixels. The resultant pixel is defined by the rules of
the particular operation. Some of the logical operations that are widely used in image processing
are as follows:
1. AND/NAND
2. OR/NOR
3. EXOR/EXNOR
1. AND/NAND
The truth table of the AND and NAND operators is given in Table 4.2
A B C(AND) C(NAND)
0 0 0 1
0 1 0 1
1 0 0 1
1 1 1 0
The operators AND and NAND take two images as input and produce one output image. The
output image pixels are the result of the logical AND/NAND of the corresponding input pixels.
A typical practical application of the AND operator is masking, where an image is ANDed with
a binary mask to retain only a region of interest. Figures 4.22(a)-4.22(d) show the effect of the
AND and OR logical operators: the AND operator shows the overlapping (common) region of
the two input images, and the OR operator shows the union of the two input images.
Figure 4.22: Results of the AND and OR logical operators (a) Image 1 (b) Image 2 (c)
Result of image 1 OR image 2 (d) Result of image 1 AND image 2
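A small sketch of bitwise masking with NumPy (illustrative only; a binary mask with values 0 and 255 is assumed so that AND keeps or clears whole 8-bit pixels):

import numpy as np

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
mask = np.zeros_like(img)
mask[16:48, 16:48] = 255                 # region of interest

kept = np.bitwise_and(img, mask)         # AND: only the masked region survives
combined = np.bitwise_or(img, mask)      # OR: union, masked region forced to white
changed = np.bitwise_xor(img, img)       # XOR of identical inputs is all zeros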
2. OR/NOR
The truth table of the OR and NOR operators is given in Table 4.3
A B C(OR) C(NOR)
0 0 0 1
0 1 1 0
1 0 1 0
1 1 1 0
3. XOR/XNOR
The truth table of the XOR and XNOR operators is given in Table 4.4.
A B C(XOR) C(XNOR)
0 0 0 1
0 1 1 0
1 0 1 0
1 1 0 1
The practical applications of the XOR and XNOR operations are as follows:
1. Change detection
2. Use as a subcomponent of a complex imaging operation. XOR for identical inputs is zero.
Hence it can be observed that the common region of image 1 and image 2 in Figures
4.22(a) and 4.22 (b), respectively, is zero and hence dark. This is illustrated in Fig. 4.23.
4. Invert/Logical NOT
The truth table of the NOT operator is given in Table 4.5.
A C(NOT)
0 1
1 0
Some of the practical applications of the NOT operator are as follows:
1. Obtaining the negative of an image. Figure 4.24 shows the negative of the original image
shown in Fig. 4.22(a)
2. Making features clear to the observer
3. Morphological processing
(a) (b)
Figure 4.24: Effect of the NOT operator (a) Original Image (b) NOT of original Image
Comparison operators can also be applied to images on a pixel-by-pixel basis. Typical
comparison operators are:
=   Equal to
>   Greater than
>=  Greater than or equal to
<   Less than
The resultant image pixel represents the truth or falsehood of the comparison. Similarly, shifting
operations are also very useful. Shifting an image pixel by I bits to the right results in division
by 2^I; shifting it by I bits to the left results in multiplication by 2^I.
Shifting operators are helpful in dividing and multiplying an image by a power of two. In
addition, this operation is computationally less expensive.
Geometrical Operations
Translation
Translation is the movement of an image to a new position. Let us assume that the point at the
coordinate position X = (x,y) of the matrix F is moved to the new position X′ whose coordinate
position is (x′,y′). Mathematically, this can be stated as a translation of the point X to the new
position X′. The translation is represented as
x′ = x + Δx
y′ = y + Δy
In vector notation, this is represented as F′ = F + T, where Δx and Δy are translations parallel to
the x and y axes, and F and F′ are the original and the translated images respectively. However,
other transformations such as scaling and rotation are multiplicative in nature. The
transformation process for rotation is given as F′ = RF, where R is the transform matrix for
performing rotation, and the transformation process for scaling is given as F′ = SF. Here, S is the
scaling transformation matrix.
1. In homogeneous coordinates, at least one point should be non-zero. Thus (0,0,0) does not
exist in the homogeneous coordinate system.
2. If one point is a scalar multiple of the other, they represent the same point. Thus, the points
(1,3,5) and (3,9,15) are the same, as the second point is 3 x (1,3,5).
3. The point (x,y,w) in the homogeneous coordinate system corresponds to the point
(x/w,y/w) in 2D space.
In the homogeneous coordinate system, the translation of the point (x,y) of image F to the new
point (x′,y′) of the image F′ is described as
x′ = x + Δx
y′ = y + Δy
               [ 1  0  Δx ]
[x′,y′,1]T  =  [ 0  1  Δy ] [x,y,1]T
               [ 0  0  1  ]
Sometimes, the image may not be present at the origin. In that case, a suitable negative
translation value can be used to bring the image to align with the origin.
Scaling
Depending on the requirement, the object can be scaled. Scaling means enlarging and shrinking.
The scaling of the point (x,y) of the image F to the new point (x′,y′) of the image F′ is described
as
x′ = x · Sx
y′ = y · Sy
             [ Sx  0  ]
[x′,y′]T  =  [ 0   Sy ] [x,y]T
Sx and Sy are called scaling factors along the x and y axes respectively. If the scaling factors are
greater than 1, the object appears larger; if the scaling factors are fractions (less than 1), the
object shrinks. Similarly, if Sx and Sy are equal, the scaling is uniform; this is known as isotropic
scaling. Otherwise, it is called differential scaling. In the homogeneous coordinate system, it is
represented as
[x′,y′,1]T = S [x,y,1]T, where
       [ Sx  0   0 ]
S   =  [ 0   Sy  0 ]
       [ 0   0   1 ]
is called the scaling matrix.
Reflection
Reflection produces a mirror image of an object. The reflection about the x-axis maps (x,y) to
(x,-y) and is given by
                  [ 1   0 ]
F′ = [x,-y]T  =   [ 0  -1 ] [x,y]T
Similarly, the reflection about the y-axis maps (x,y) to (-x,y) and is given by
                  [ -1  0 ]
F′ = [-x,y]T  =   [ 0   1 ] [x,y]T
The reflection about the line y = x maps (x,y) to (y,x) and is given as
                 [ 0  1 ]
F′ = [y,x]T  =   [ 1  0 ] [x,y]T
The reflection about the line y = -x maps (x,y) to (-y,-x) and is given as
                   [  0  -1 ]
F′ = [-y,-x]T  =   [ -1   0 ] [x,y]T
The reflection operation is illustrated in Fig 3.22(a) and 3.22(b). In the homogeneous coordinate
system, the matrices for reflection can be given as
            [ -1  0  0 ]
Ry-axis  =  [  0  1  0 ]
            [  0  0  1 ]

            [ 1   0  0 ]
Rx-axis  =  [ 0  -1  0 ]
            [ 0   0  1 ]

            [ -1   0  0 ]
Rorigin  =  [  0  -1  0 ]
            [  0   0  1 ]

The reflections about the lines y = x and y = -x are given as

           [ 0  1  0 ]
Ry=x   =   [ 1  0  0 ]
           [ 0  0  1 ]

            [  0  -1  0 ]
Ry=-x   =   [ -1   0  0 ]
            [  0   0  1 ]
Shearing
Shearing is a transformation that produces a distortion of shape. This can be applied either in the
x-direction or the y-direction. In this transformation, parallel layers of the object are slid with
respect to each other.
Shearing can be done using the following calculation and can be represented in the matrix form
as
x’=x+ay
y’=y
          [ 1  a  0 ]
Xshear =  [ 0  1  0 ]     (where a = shx)
          [ 0  0  1 ]
Yshear can be given as
x’=x
y’=y+bx
          [ 1  0  0 ]
Yshear =  [ b  1  0 ]     (where b = shy)
          [ 0  0  1 ]
where shx and shy are shear factors in the x and y directions, respectively.
Rotation
An image can be rotated through various angles such as 90°, 180° or 270°. In matrix form,
rotation by an angle θ is given as
              [ cos θ  -sin θ ]
[x′,y′]T  =   [ sin θ   cos θ ] [x,y]T
This can be represented as F′ = RF. The parameter θ is the angle of rotation with respect to the
x-axis. The value of θ can be positive or negative. A positive angle represents counterclockwise
rotation and a negative angle represents clockwise rotation. In the homogeneous coordinate
system, rotation can be expressed as
              [ cos θ  -sin θ  0 ]
[x′,y′,1]T =  [ sin θ   cos θ  0 ] [x,y,1]T
              [ 0       0      1 ]
If θ is substituted with -θ, this matrix rotates the image in the clockwise direction.
Affine Transform
The transformation that maps the pixel at the coordinates (x,y) to a new coordinate position is
given as a pair of transformation equations. In this transform, straight lines are preserved and
parallel lines remain parallel. It is described mathematically as
x′ = Tx(x,y)
y′ = Ty(x,y)
Tx and Ty are expressed as polynomials. The linear case gives an affine transform:
x′ = a0x + a1y + a2
y′ = b0x + b1y + b2
  [ x′ ]     [ a0  a1  a2 ] [ x ]
  [ y′ ]  =  [ b0  b1  b2 ] [ y ]
  [ 1  ]     [ 0   0   1  ] [ 1 ]
The affine transform is a compact way of representing all transformations. The given equation
represents all transformations.
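A compact sketch of the homogeneous-coordinate matrices discussed above (illustrative NumPy code; the angle is in radians and the sample point (10, 5) is arbitrary):

import numpy as np

def translation(dx, dy):
    return np.array([[1, 0, dx],
                     [0, 1, dy],
                     [0, 0, 1]], dtype=np.float64)

def scaling(sx, sy):
    return np.array([[sx, 0, 0],
                     [0, sy, 0],
                     [0, 0, 1]], dtype=np.float64)

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0],
                     [s, c, 0],
                     [0, 0, 1]], dtype=np.float64)

# Composite affine transform: rotate by 45 degrees, then translate by (3, 2).
T = translation(3, 2) @ rotation(np.pi / 4)
p = np.array([10, 5, 1])    # the point (10, 5) in homogeneous form
p_new = T @ p               # transformed homogeneous coordinates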
Inverse Transformation
The purpose of inverse transformation is to restore the transformed object to its original form and
position. The inverse or backward transformation matrices are given as follows:
                                      [ 1  0  -Δx ]
Inverse transform for translation  =  [ 0  1  -Δy ]
                                      [ 0  0   1  ]

                                      [ 1/Sx   0    0 ]
Inverse transform for scaling      =  [  0   1/Sy   0 ]
                                      [  0     0    1 ]

The inverse transform for rotation can be obtained by changing the sign of the rotation angle; for
example, the following matrix performs the inverse rotation:
[  cos θ   sin θ  0 ]
[ -sin θ   cos θ  0 ]
[   0       0     1 ]
3D Transforms
Some medical images such as computerized tomography(CT) and magnetic resonance imaging(MRI)
images are three-dimensional images. To apply translation, rotation, and scaling on 3D images, 3D
transformations are required. 3D transformations are logical extensions of 2D transformations. These
are summarized and described as follows:
Translation = [1 0 0 tx; 0 1 0 ty; 0 0 1 tz; 0 0 0 1]
Transformations can be applied in two ways: forward mapping and backward mapping. Affine transforms often produce positions in the resultant image that cannot be fitted exactly, because some of the transformed coordinates are non-integers and often go beyond the acceptable range. This results in gaps (or holes) and issues related to the number of pixels and their range, so interpolation techniques are required to solve these issues.
Forward Mapping
Forward mapping is the process of applying transformations iteratively to every pixel in the image,
yielding a new coordinate position and copying the values of the pixel to a new position.
Backward Mapping
Backward mapping is the process of checking the pixels of the output image to determine the position of
the pixels in the input image. This is used to guarantee that all the pixels of the input image are
processed.
During both forward and backward mapping, it may happen that the transformed pixels cannot be fitted onto the integer pixel grid. For example, consider the rotation of the point (10, 5) by 45°. This yields
x' = x·cosθ - y·sinθ
   = 10·cos(45°) - 5·sin(45°)
   = 10(0.707) - 5(0.707)
   = 3.535
y' = x·sinθ + y·cosθ
   = 10·sin(45°) + 5·cos(45°)
   = 10(0.707) + 5(0.707)
   = 10.605
Since these new coordinate positions are not integers, the result cannot be placed directly on the pixel grid. Thus, the process may leave a gap at the new coordinate position, which creates poor-quality output.
Therefore, whenever a geometric transformation is performed, a resampling process should be carried
out so that the desirable quality is achieved in the resultant image. The resampling process creates new
pixels so that the quality of the output is maintained. In addition, the rounding off of the new coordinate
position (3.535,10.605) should be carried out as (4,11). This process of fitting the output to the new
coordinates is called interpolation.
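The numbers above can be reproduced with a short sketch; nothing is assumed here beyond the point (10, 5) and the 45° angle given in the text, and the rounding to (4, 11) is exactly the fitting step just described:

```python
import numpy as np

x, y = 10, 5
theta = np.radians(45)

# Forward mapping: rotate the point, which yields non-integer coordinates
x_new = x * np.cos(theta) - y * np.sin(theta)   # approx 3.536
y_new = x * np.sin(theta) + y * np.cos(theta)   # approx 10.607
print(x_new, y_new)

# Fit the result onto the integer pixel grid by rounding
print(round(x_new), round(y_new))               # 4 11
```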
Interpolation is the method of calculating expected values of a function at new positions from the known pixel values. Some of the popular interpolation techniques are:
Bilinear technique
Bicubic technique
The most elementary form of interpolation is nearest-neighbour interpolation, or zero-order interpolation. This technique assigns to every pixel in the new image matrix the value of its closest pixel in the original image, that is, the brightness of each output pixel is taken from its closest neighbour. Sometimes, this may result in pixel blocking
and can degrade the resulting image, which may appear spatially disordered. These distortions are
called aliasing.
A more accurate interpolation scheme is bilinear interpolation. This is called first-order interpolation.
Four neighbours of the transformed original pixels that surround the new pixel are obtained and are
used to calculate the new pixel value. Linear interpolation is used in both the directions. Weights are
assigned based on the proximity. Then the process takes the weighted average of the brightness of the
four pixels that surround the pixels of interest.
g(x,y)=(1-a)(1-b)f(x’,y’)+(1-a)bf(x’,y’+1)+a(1-b)f(x’+1,y’)+abf(x’+1,y’+1)
Here g(x,y) is the output image and f(x,y) is the image that undergoes the interpolation operation. If the
desired pixel is very close to one of the four nearest neighbor pixels, its weight will be much higher. This
technique leads to blurring of the edges. However, it reduces aliasing artefacts.
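A minimal sketch of the bilinear formula above, where a and b are taken to be the fractional parts of the desired (non-integer) position; the 2×2 image f is an assumed example:

```python
import numpy as np

def bilinear(f, x, y):
    """Bilinear interpolation of image f at the non-integer position (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))   # nearest lower-left neighbour (x', y')
    a, b = x - x0, y - y0                         # fractional offsets
    return ((1 - a) * (1 - b) * f[x0,     y0]
            + (1 - a) * b     * f[x0,     y0 + 1]
            + a * (1 - b)     * f[x0 + 1, y0]
            + a * b           * f[x0 + 1, y0 + 1])

f = np.array([[10, 20],
              [30, 40]], dtype=float)
print(bilinear(f, 0.5, 0.5))   # 25.0 -> weighted average of the four neighbours
```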
Higher-order interpolation schemes take more pixels into account. Cubic (bicubic) interpolation uses a neighbourhood of 16 pixels and fits cubic polynomials to these 16 pixels of the transformed original matrix to compute the new image pixel. This technique is very effective and produces images that are very close to the original. In extreme cases, more than 64 neighbouring pixels can be used. However, as the number of pixels increases, the computational complexity also increases.
Set Operations
An image can be visualized as a set. For example, a binary image (Fig 3.25) can be visualized as the set A = {(0,0), (0,2), (2,2)}, where the listed coordinates are those of the pixels whose value is 1. Set operators can then be applied to this set to obtain results that are useful for image analysis.
The complement of set A is the set of pixels that do not belong to A:
Ac = {c | c ∉ A}
The reflection of set A is given as
Â = {c | c = -a, a ∈ A}
The union of two sets is given as
A ∪ B = {c | (c ∈ A) ∨ (c ∈ B)}
that is, the pixel c belongs to A, to B, or to both. The intersection of two sets is given as
A ∩ B = {c | (c ∈ A) ∧ (c ∈ B)}
where the pixel c belongs to both A and B. The set difference is given as
A - B = {c | (c ∈ A) ∧ (c ∉ B)}
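Treating the foreground pixel coordinates as Python sets gives a direct sketch of these operations; the sets A and B below are small assumed examples:

```python
A = {(0, 0), (0, 2), (2, 2)}          # foreground pixels of one binary image
B = {(0, 2), (1, 1), (2, 2)}          # foreground pixels of another

print(A | B)    # union: pixels in A, in B, or in both
print(A & B)    # intersection: pixels in both A and B -> {(0, 2), (2, 2)}
print(A - B)    # difference: pixels in A but not in B -> {(0, 0)}

# Complement: pixels of the image grid that are not in A (here an assumed 3 x 3 grid)
grid = {(i, j) for i in range(3) for j in range(3)}
print(grid - A)
```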
Morphology is a collection of operations based on set theory, to accomplish various tasks such as
extracting boundaries, filling small holes present in the image, and removing noise present in the image.
Mathematical morphology is a very powerful tool for analyzing the shapes of the objects that are present
in the images. The theory of mathematical morphology is based on set theory. One can visualize a binary
object as a set. Set theory can then be applied to the sample set. Morphological operators often take a
binary image and a mask known as structuring element as input. The set operators such as intersection,
union, inclusion and complement can then be applied to images. Dilation is one of the two basic
operators. It can be applied to binary as well as grey scale images. The basic effect of this operator on a
binary image is that it gradually increases the boundaries of the region, while the small holes that are
present in the images become smaller.
Let us assume that A and B are sets of pixel coordinates. The dilation of A by B is denoted as
A ⊕ B = {(x, y) + (u, v) : (x, y) ∈ A, (u, v) ∈ B}
where (x, y) ranges over set A and (u, v) over set B. The coordinates are added pairwise and the union of the results forms the dilated set. These kinds of operations are based on Minkowski algebra.
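A short sketch of dilation as a Minkowski sum of coordinate sets; the object A and structuring element B below are assumed examples:

```python
def dilate(A, B):
    """Dilation of pixel-coordinate set A by structuring element B (Minkowski sum)."""
    return {(x + u, y + v) for (x, y) in A for (u, v) in B}

A = {(1, 1), (1, 2), (2, 1)}          # foreground pixels of a small binary object
B = {(0, 0), (0, 1), (1, 0)}          # a small structuring element

print(sorted(dilate(A, B)))           # the object grows at its boundary
```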
Statistical Operations
Statistics plays an important role in image processing. An image can be assumed to be a set of discrete points. Statistical operations can be applied to the image to obtain desired results such as manipulation of brightness and contrast. Some of the very useful statistical operations include the mean, median, mode, and mid-range. The measures of data dispersion include quartiles, the inter-quartile range, and variance.
Mean
Mean is the average of all the values in the sample (population) and is denoted as μ. The overall brightness of a grey-scale image is measured using the mean, which is calculated by summing all the pixel values of the image and dividing by the number of pixels:
μ = (1/n) Σ_{i=0}^{n-1} I_i
Sometimes the data is associated with a weight. This is called weighted mean. The problem of mean is
its extreme sensitivity to noise. Even small changes in the input affect the mean drastically.
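A quick sketch of measuring overall brightness as the mean of all pixel values, using a small assumed grey-scale image:

```python
import numpy as np

img = np.array([[10, 20, 30],
                [40, 50, 60],
                [70, 80, 90]], dtype=float)   # assumed 3 x 3 grey-scale image

mean = img.sum() / img.size                   # mu = (1/n) * sum of all pixel values
print(mean)                                   # 50.0, the same as img.mean()
```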
Median
Median is the value where the given Xi is divided into two equal halves, with half of the values being
lower than the median and the other half higher. The procedure for obtaining the median is to sort the
values of the given Xi in ascending order. If the given sequence has an odd number of values, the middle
value is the median. Otherwise, the median is the arithmetic mean of the two middle values.
Mode
Mode is the value that occurs most frequently in the dataset. The procedure for finding the mode is to
calculate the frequencies for all of the values in the data. The mode is the value(or values) with the
highest frequency. Normally, based on the mode, a dataset is classified as unimodal, bimodal, or trimodal. Any dataset that has two modes is called bimodal.
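A small sketch of the median and mode procedures described above, applied to an assumed list of pixel values:

```python
from statistics import median, mode

values = [12, 7, 7, 9, 15, 7, 9]     # assumed sample of pixel values

print(median(values))                # sort and take the middle value -> 9
print(mode(values))                  # most frequently occurring value -> 7
```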
Percentile
A percentile indicates the value below which a given percentage of the data falls. For example, the median is the 50th percentile and can be denoted as Q0.50. The 25th percentile is called the first quartile and the 75th percentile is called the third quartile. Another measure useful for describing dispersion is the inter-quartile range (IQR), defined as Q0.75 - Q0.25. The semi-interquartile range is 0.5 × IQR.
Mean - Mode = 3 × (Mean - Median)
This empirical relation holds for moderately skewed unimodal frequency curves. The mid-range is also used to assess the central tendency of the dataset. In a normal
distribution, the mean, median, and mode are the same. In symmetrical distributions, it is possible for
the mean and median to be the same even though there may be several modes. By contrast, in
asymmetrical distributions, the mean and median are not the same. These distributions are said to be
skewed data where more than half the cases are either above or below the mean.
The most commonly used measures of dispersion are variance and standard deviation. The mean does
not convey much more than a middle point. For example, the following datasets {10,20,30} and
{10,50,0}, both have a mean of 20. The difference between these two sets is the spread of data.
Standard deviation is the average distance from the mean of the dataset to each point. The formula for standard deviation is
σ = sqrt( (1/N) Σ_{i=1}^{N} (x_i - μ)² )
Sometimes, we divide the value by N-1 instead of N. The reason is that in a larger, real-world scenario,
division by N-1 gives an answer that is closer to the actual value. In image processing, it is a measure of
how much a pixel varies from the mean value of the image. The mean value and the standard deviation
characterize the perceived brightness and contrast of the image. Variance is another measure of the
spread of the data. It is the square of standard deviation. While standard deviation is a more common
measure, variance also indicates the spread of the data effectively.
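A brief sketch contrasting the spread of the two example datasets with the same mean; the optional division by N-1 mentioned above corresponds to ddof=1 in NumPy:

```python
import numpy as np

a = np.array([10, 20, 30], dtype=float)
b = np.array([10, 50, 0], dtype=float)

print(a.mean(), b.mean())            # both means are 20.0
print(a.std(), b.std())              # population std (divide by N): about 8.16 vs 21.60
print(a.var(), b.var())              # variance is the square of the standard deviation
print(a.std(ddof=1))                 # sample std (divide by N-1): 10.0
```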
Entropy
Entropy is a measure of the amount of disorder (randomness) present in the image. It can be calculated by assuming that the pixels are totally uncorrelated. An organized system has low entropy and a complex system has very high entropy. Entropy also indicates the average global information content; its unit is bits per pixel. It can be computed using the formula
H = -Σ_i p_i log2(p_i)
where p_i is the prior probability of occurrence of the message (grey level). Let us consider a binary image, where each pixel assumes only one of two possible states, 0 or 1, and the occurrence of each state is equally likely. Hence, the probability of each is 1/2. Therefore, the entropy is H = -[1/2·log2(1/2) + 1/2·log2(1/2)] = 1 bit.
Therefore, 1 bit is sufficient to store the intensity of a pixel, and binary images are less complex.
Thus, entropy indicates the richness of the image. This can be seen visually using a surface plot where
pixel values are plotted as a function of pixel position.
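A minimal sketch computing entropy from the grey-level histogram; the 2×2 binary image below is an assumed example that reproduces the 1 bit-per-pixel result worked out above:

```python
import numpy as np

img = np.array([[0, 1], [1, 0]])                 # binary image, 0 and 1 equally likely

values, counts = np.unique(img, return_counts=True)
p = counts / counts.sum()                        # probability of each grey level
H = -np.sum(p * np.log2(p))                      # entropy in bits per pixel
print(H)                                         # 1.0
```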
Convolution and Correlation
The imaging system can be modeled as a 2D linear system. Let f(x,y) and g(x,y) represent the input and output images, respectively. Then they can be related as g(x,y) = t * f(x,y), where t is the mask (kernel). Convolution is a group
process, that is, unlike point operations, group processes operate on a group of input pixels to yield the
result. Spatial convolution is a method of taking a group of pixels in the input image and computing the
resultant output image. This is also known as a finite impulse response(FIR) filter. Spatial convolution
moves across pixel by pixel and produces the output image. Each pixel of the resultant image is
dependent on a group of pixels(called kernel).
g(x) = t * f(x) = Σ_{i=-n}^{n} t(i) f(x - i)
The convolution window is a sliding window that is centred on each pixel of the image to generate the resultant image. Each resultant pixel is calculated by multiplying the weights of the convolution mask by the corresponding pixel values and summing these products. The sliding window is moved pixel by pixel across the image in both directions. Therefore, convolution is called a 'shift-add-multiply' operation.
To carry out the process of convolution, the template or mask is first rotated by 180°. Then the convolution process is carried out. Consider the convolution of two sequences: F, whose dimension is 1 x 5, and a kernel or template T, whose dimension is 1 x 3.
Let F = {0, 0, 2, 0, 0} and the kernel be {7, 5, 1}. The template has to be rotated by 180°; the rotated version of the original mask [7 5 1] is the convolution template of dimension 1 x 3 with values {1, 5, 7}.
To carry out the convolution, zero padding is performed first. Zero padding is the process of appending zeros at the borders of the sequence, as shown in Table 3.7.
Convolution then proceeds by shifting the mask, multiplying its coefficients by the corresponding image values, and adding the products to give the value at the centre position.
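The worked example above can be sketched directly: the mask [7 5 1] is rotated to [1 5 7], the sequence is zero padded, and the shift-multiply-add is carried out; np.convolve flips the kernel internally and gives the same result:

```python
import numpy as np

F = np.array([0, 0, 2, 0, 0])           # input sequence from the example
t = np.array([7, 5, 1])                 # kernel / template

# Manual convolution: rotate the mask by 180 degrees, zero pad, then shift-multiply-add
rotated = t[::-1]                       # [1, 5, 7]
padded = np.pad(F, 1)                   # [0, 0, 0, 2, 0, 0, 0]
manual = np.array([np.sum(rotated * padded[i:i + 3]) for i in range(len(F))])
print(manual)                           # [ 0 14 10  2  0]

print(np.convolve(F, t, mode='same'))   # same result: [ 0 14 10  2  0]
```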
Correlation is similar to the convolution operation and is very useful in recognizing basic shapes in the image. Correlation reduces to convolution if the kernel is symmetric. The difference between the correlation and convolution processes is that in correlation the mask or template is applied directly, without the prior 180° rotation that is performed in the convolution process.
The correlation of these sequences is carried out to observe the difference between these processes.
The correlation process also involves the zero padding process.
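A brief sketch contrasting the two operations on the same sequences; correlation applies the mask without the 180° rotation, so for this particular example its output is the reverse of the convolution result (the two would coincide for a symmetric kernel):

```python
import numpy as np

F = np.array([0, 0, 2, 0, 0])
t = np.array([7, 5, 1])

print(np.convolve(F, t, mode='same'))    # [ 0 14 10  2  0]  (mask rotated first)
print(np.correlate(F, t, mode='same'))   # [ 0  2 10 14  0]  (mask applied directly)
```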