Computer Graphics and Fundamentals of Image Processing (21CS63)

Module 1 2021 scheme Computer Graphics and OpenGL

MODULE-1

1. Overview: Computer Graphics and OpenGL


1.1 Basics of computer graphics
1.2 Applications of Computer Graphics,
1.3 Video Display Devices
1.3.1 Random Scan and Raster Scan displays,
1.3.2 Color CRT monitors,
1.3.3 Flat panel displays.
1.4 Raster-scan systems:
1.4.1 Video controller,
1.4.2 Raster scan Display processor,
1.4.3 Graphics workstations and viewing systems,
1.5 Input devices,
1.6 Graphics networks, Graphics on the internet,
1.7 Graphics software.
OpenGL:
1.8 Introduction to OpenGL ,
1.9 Coordinate reference frames,
1.10 Specifying two-dimensional world coordinate reference frames in OpenGL,
1.11 OpenGL point functions,
1.12 OpenGL line functions, point attributes,
1.13 Line attributes,
1.14 Curve attributes,
1.15 OpenGL point attribute functions,
1.16 OpenGL line attribute functions,
1.17 Line drawing algorithms (DDA, Bresenham’s),
1.18 Circle generation algorithms (Bresenham’s).

1.1 Basics of Computer Graphics


Computer graphics is the art of drawing pictures, lines, charts, etc. using computers with the help
of programming. A computer graphics image is made up of a number of pixels. A pixel is the smallest
addressable graphical unit represented on the computer screen.


1.2 Applications of Computer Graphics


a. Graphs and Charts

 An early application of computer graphics was the display of simple data graphs, usually
plotted on a character printer. Data plotting is still one of the most common graphics
applications.
 Graphs & charts are commonly used to summarize functional, statistical, mathematical,
engineering, and economic data for research reports, managerial summaries, and other
types of publications.
 Typical examples of data plots are line graphs, bar charts, pie charts, surface graphs,
contour plots, and other displays showing relationships between multiple parameters in
two dimensions, three dimensions, or higher-dimensional spaces.

b. Computer-Aided Design

 A major use of computer graphics is in design processes, particularly for engineering and
architectural systems.


 CAD (computer-aided design) or CADD (computer-aided drafting and design) methods are
now routinely used in the design of automobiles, aircraft, spacecraft, computers, and home appliances.
 Circuits and networks for communications, water supply, or other utilities are constructed
with repeated placement of a few graphical shapes.
 Animations are often used in CAD applications. Real-time, computer animations using
wire-frame shapes are useful for quickly testing the performance of a vehicle or system.

c. Virtual-Reality Environments

 Animations in virtual-reality environments are often used to train heavy-equipment
operators or to analyze the effectiveness of various cabin configurations and control
placements.
 With virtual-reality systems, designers and others can move about and interact with
objects in various ways. Architectural designs can be examined by taking a simulated
“walk” through the rooms or around the outsides of buildings to better appreciate the
overall effect of a particular design.
 With a special glove, we can even “grasp” objects in a scene and turn them over or move
them from one place to another.
d. Data Visualizations
 Producing graphical representations for scientific, engineering and medical data sets and
processes is another fairly new application of computer graphics, which is generally
referred to as scientific visualization. And the term business visualization is used in
connection with data sets related to commerce, industry and other nonscientific areas.


 There are many different kinds of data sets and effective visualization schemes depend on
the characteristics of the data. A collection of data can contain scalar values, vectors or
higher-order tensors.

e. Education and Training

 Computer-generated models of physical, financial, political, social, economic, and other
systems are often used as educational aids.
 Models of physical processes, physiological functions, and equipment, such as the color-coded
diagram shown in the figure, can help trainees to understand the operation of a system.
 For some training applications, special hardware systems are designed. Examples of such
specialized systems are the simulators used for practice sessions by aircraft pilots and
air-traffic-control personnel.
 Some simulators have no video screens; for example, a flight simulator with only a control panel
for instrument flying.


f. Computer Art

 The picture is usually painted electronically on a graphics tablet using a stylus, which can
simulate different brush strokes, brush widths and colors.
 Fine artists use a variety of other computer technologies to produce images. To create
pictures the artist uses a combination of 3D modeling packages, texture mapping,
drawing programs and CAD software etc.
 Commercial art also uses these “painting” techniques for generating logos & other
designs, page layouts combining text & graphics, TV advertising spots & other
applications.
 A common graphics method employed in many television commercials is morphing,
where one object is transformed into another.

g. Entertainment

 Television production, motion pictures, and music videos routinely use computer graphics
methods.
 Sometimes graphics images are combined with live actors and scenes, and sometimes the
films are completely generated using computer rendering and animation techniques.


 Some television programs also use animation techniques to combine computer generated
figures of people, animals, or cartoon characters with the actor in a scene or to transform
an actor’s face into another shape.

h. Image Processing

 The modification or interpretation of existing pictures, such as photographs and TV scans,
is called image processing.
 Although methods used in computer graphics and image processing overlap, the two areas are
concerned with fundamentally different operations.
 Image processing methods are used to improve picture quality, analyze images, or
recognize visual patterns for robotics applications.
 Image processing methods are often used in computer graphics, and computer graphics
methods are frequently applied in image processing.
 Medical applications also make extensive use of image-processing techniques for picture
enhancements in tomography and in simulations of surgical operations.
 It is also used in computed X-ray tomography (CT), positron emission
tomography (PET), and computed axial tomography (CAT).

i. Graphical User Interfaces


 It is common now for applications software to provide a graphical user interface (GUI).
 A major component of graphical interface is a window manager that allows a user to
display multiple, rectangular screen areas called display windows.


 Each screen display area can contain a different process, showing graphical or non-
graphical information, and various methods can be used to activate a display window.
 Using an interactive pointing device, such as a mouse, we can activate a display window on
some systems by positioning the screen cursor within the window display area and
pressing the left mouse button.

1.3 Video Display Devices


 The primary output device in a graphics system is a video monitor.
 Historically, the operation of most video monitors was based on the standard cathode-ray
tube (CRT) design, but several other technologies exist.
 In recent years, flat-panel displays have become significantly more popular due to their
reduced power consumption and thinner designs.

Refresh Cathode-Ray Tubes


 A beam of electrons, emitted by an electron gun, passes through focusing and deflection
systems that direct the beam toward specified positions on the phosphor-coated screen.
 The phosphor then emits a small spot of light at each position contacted by the electron
beam and the light emitted by the phosphor fades very rapidly.
 One way to maintain the screen picture is to store the picture information as a charge
distribution within the CRT in order to keep the phosphors activated.
 The most common method now employed for maintaining phosphor glow is to redraw
the picture repeatedly by quickly directing the electron beam back over the same screen
points. This type of display is called a refresh CRT.
 The frequency at which a picture is redrawn on the screen is referred to as the refresh
rate.

Operation of an electron gun with an accelerating anode

 The primary components of an electron gun in a CRT are the heated metal cathode and a
control grid.
 The heat is supplied to the cathode by directing a current through a coil of wire, called the
filament, inside the cylindrical cathode structure.
 This causes electrons to be “boiled off” the hot cathode surface.
 Inside the CRT envelope, the free, negatively charged electrons are then accelerated
toward the phosphor coating by a high positive voltage.


 Intensity of the electron beam is controlled by the voltage at the control grid.
 Since the amount of light emitted by the phosphor coating depends on the number of
electrons striking the screen, the brightness of a display point is controlled by varying the
voltage on the control grid.
 The focusing system in a CRT forces the electron beam to converge to a small cross
section as it strikes the phosphor and it is accomplished with either electric or magnetic
fields.
 With electrostatic focusing, the electron beam is passed through a positively charged
metal cylinder so that electrons along the center line of the cylinder are in equilibrium
position.
 Deflection of the electron beam can be controlled with either electric or magnetic fields.
 Cathode-ray tubes are commonly constructed with two pairs of magnetic-deflection coils
 One pair is mounted on the top and bottom of the CRT neck, and the other pair is
mounted on opposite sides of the neck.
 The magnetic field produced by each pair of coils results in a transverse deflection force
that is perpendicular to both the direction of the magnetic field and the direction of travel
of the electron beam.
 Horizontal and vertical deflections are accomplished with these pairs of coils.

Electrostatic deflection of the electron beam in a CRT


 When electrostatic deflection is used, two pairs of parallel plates are mounted inside the
CRT envelope where, one pair of plates is mounted horizontally to control vertical
deflection, and the other pair is mounted vertically to control horizontal deflection.
 Spots of light are produced on the screen by the transfer of the CRT beam energy to the
phosphor.
 When the electrons in the beam collide with the phosphor coating, they are stopped and
their kinetic energy is absorbed by the phosphor.
 Part of the beam energy is converted by friction into heat energy, and the
remainder causes electrons in the phosphor atoms to move up to higher quantum-energy
levels.


 After a short time, the “excited” phosphor electrons begin dropping back to their stable
ground state, giving up their extra energy as small quanta of light energy called
photons.

 What we see on the screen is the combined effect of all the electron light emissions: a
glowing spot that quickly fades after all the excited phosphor electrons have returned to
their ground energy level.
 The frequency of the light emitted by the phosphor is proportional to the energy
difference between the excited quantum state and the ground state.
 Lower-persistence phosphors require higher refresh rates to maintain a picture on the
screen without flicker.
 The maximum number of points that can be displayed without overlap on a CRT is
referred to as the resolution.
 Resolution of a CRT is dependent on the type of phosphor, the intensity to be displayed,
and the focusing and deflection systems.
 High-resolution systems are often referred to as high-definition systems.


1.3.1 Raster-Scan Displays and Random Scan Displays


i) Raster-Scan Displays
 The electron beam is swept across the screen one row at a time from top to bottom.
 As it moves across each row, the beam intensity is turned on and off to create a pattern of
illuminated spots.
 This scanning process is called refreshing. Each complete scanning of a screen is
normally called a frame.
 The refreshing rate, called the frame rate, is normally 60 to 80 frames per second, or
described as 60 Hz to 80 Hz.
 Picture definition is stored in a memory area called the frame buffer.
 This frame buffer stores the intensity values for all the screen points. Each screen point is
called a pixel (picture element).
 A property of raster-scan systems is the aspect ratio, defined as the number of pixel columns
divided by the number of scan lines that can be displayed by the system (for example, 640
columns and 480 scan lines give an aspect ratio of 640/480 = 4/3).

Case 1: In case of black and white systems


 On black-and-white systems, the frame buffer storing the values of the pixels is called a
bitmap.
 Each entry in the bitmap is a single bit that determines whether the intensity of the
corresponding pixel is on (1) or off (0).


Case 2: In case of color systems


 On color systems, the frame buffer storing the values of the pixels is called a pixmap
(though nowadays many graphics libraries call it a bitmap too).
 Each entry in the pixmap occupies a number of bits to represent the color of the pixel. For
a true-color display, the number of bits for each entry is 24 (8 bits per red/green/blue
channel; each channel has 2^8 = 256 levels of intensity, i.e., 256 voltage settings for each
of the red/green/blue electron guns).
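
As a quick worked example of the storage these schemes imply (the 1024 x 768 resolution below is an assumed, illustrative figure, not one stated in these notes):

#include <stdio.h>

/* Hedged sketch: frame-buffer storage for an assumed 1024 x 768 screen. */
int main (void)
{
    long pixels     = 1024L * 768L;    /* total screen pixels                  */
    long bitmapBits = pixels * 1;      /* bitmap: 1 bit per pixel (on/off)     */
    long pixmapBits = pixels * 24;     /* true-color pixmap: 24 bits per pixel */

    printf ("Bitmap : %ld bytes\n", bitmapBits / 8);   /* 98,304 bytes (96 KB)      */
    printf ("Pixmap : %ld bytes\n", pixmapBits / 8);   /* 2,359,296 bytes (2.25 MB) */
    return 0;
}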

ii) Random-Scan Displays


 When operated as a random-scan display unit, a CRT has the electron beam directed only
to those parts of the screen where a picture is to be displayed.
 Pictures are generated as line drawings, with the electron beam tracing out the component
lines one after the other.
 For this reason, random-scan monitors are also referred to as vector displays (or
strokewriting displays or calligraphic displays).
 The component lines of a picture can be drawn and refreshed by a random-scan system in
any specified order

 A pen plotter operates in a similar way and is an example of a random-scan, hard-copy
device.


 Refresh rate on a random-scan system depends on the number of lines to be displayed on
that system.
 Picture definition is now stored as a set of line-drawing commands in an area of memory
referred to as the display list, refresh display file, vector file, or display program
 To display a specified picture, the system cycles through the set of commands in the
display file, drawing each component line in turn.
 After all line-drawing commands have been processed, the system cycles back to the first
line command in the list.
 Random-scan displays are designed to draw all the component lines of a picture 30 to 60
times each second, with up to 100,000 “short” lines in the display list.
 When a small set of lines is to be displayed, each refresh cycle is delayed to avoid very
high refresh rates, which could burn out the phosphor.

Difference between Raster scan system and Random scan system


Electron Beam
 Raster scan system: The electron beam is swept across the screen, one row at a time, from top to bottom.
 Random scan system: The electron beam is directed only to the parts of the screen where a picture is to be drawn.

Resolution
 Raster scan system: Its resolution is poor because the raster system produces zigzag lines that are plotted as discrete point sets.
 Random scan system: Its resolution is good because this system produces smooth line drawings, since the CRT beam directly follows the line path.

Picture Definition
 Raster scan system: Picture definition is stored as a set of intensity values for all screen points, called pixels, in a refresh buffer area.
 Random scan system: Picture definition is stored as a set of line-drawing instructions in a display file.

Realistic Display
 Raster scan system: The capability of this system to store intensity values for pixels makes it well suited for the realistic display of scenes containing shadow and color patterns.
 Random scan system: These systems are designed for line drawing and can’t display realistic shaded scenes.

Draw an Image
 Raster scan system: Screen points/pixels are used to draw an image.
 Random scan system: Mathematical functions are used to draw an image.

1.3.2 Color CRT Monitors


 A CRT monitor displays color pictures by using a combination of phosphors that emit
different-colored light.
 It produces range of colors by combining the light emitted by different phosphors.
 There are two basic techniques for color display:
1. Beam-penetration technique
2. Shadow-mask technique
1) Beam-penetration technique:
 This technique is used with random scan monitors.
 In this technique the inside of the CRT is coated with two phosphor layers, usually red and green.
 The outer layer is red phosphor and the inner layer is green phosphor.
 The color depends on how far the electron beam penetrates into the phosphor layers.
 A beam of fast electrons penetrates further and excites the inner green layer, while slow electrons
excite only the outer red layer.
 At intermediate beam speeds we can produce combinations of red and green light, which give
two additional colors, orange and yellow.
 The beam-acceleration voltage controls the speed of the electrons and hence the color of the
pixel.
Merits and demerits:
 It is a low-cost technique for producing color on random-scan monitors.
 However, it can display only four colors.
 The quality of the picture is not good compared to other techniques.

2) Shadow-mask technique
 It produces a wide range of colors as compared to the beam-penetration technique.
 This technique is generally used in raster-scan displays, including color TV sets.


 In this technique CRT has three phosphor color dots at each pixel position.
 One dot for red, one for green and one for blue light. This is commonly known as Dot
triangle.
 The CRT has three electron guns, one for each color dot, and a shadow-mask grid just behind
the phosphor-coated screen.
 The shadow-mask grid consists of a series of holes aligned with the phosphor-dot pattern.
 The three electron beams are deflected and focused as a group onto the shadow mask, and
when they pass through a hole they excite a dot triangle.
 In dot triangle three phosphor dots are arranged so that each electron beam can activate
only its corresponding color dot when it passes through the shadow mask.
 A dot triangle when activated appears as a small dot on the screen which has color of
combination of three small dots in the dot triangle.
 By changing the intensity of the three electron beams we can obtain different colors in
the shadow mask CRT.

1.3.3 Flat Panel Displays


 The term flat-panel display refers to a class of video devices that have reduced volume,
weight, and power requirements compared to a CRT.
 Because flat-panel displays are thinner than CRTs, we can hang them on walls or wear them on
our wrists.


 Since we can even write on some flat-panel displays, they will soon be available as pocket
notepads.
 We can separate flat panel display in two categories:
1. Emissive displays: - the emissive display or emitters are devices that convert
electrical energy into light. For Ex. Plasma panel, thin film electroluminescent
displays and light emitting diodes.
2. Non emissive displays: - non emissive display or non emitters use optical
effects to convert sunlight or light from some other source into graphics patterns.
For Ex. LCD (Liquid Crystal Display).

a) Plasma Panel Displays


 These are also called gas-discharge displays.
 A plasma panel is constructed by filling the region between two glass plates with a mixture of gases
that usually includes neon.
 A series of vertical conducting ribbons is placed on one glass panel, and a set of
horizontal ribbons is built into the other glass panel.

 Firing voltages applied to a pair of horizontal and vertical conductors cause the gas at
the intersection of the two conductors to break down into a glowing plasma of electrons
and ions.
 Picture definition is stored in a refresh buffer and the firing voltages are applied to refresh
the pixel positions, 60 times per second.


 Alternating current methods are used to provide faster application of firing voltages and
thus brighter displays.
 Separation between pixels is provided by the electric field of the conductors.
 One disadvantage of early plasma panels was that they were strictly monochromatic devices; that is,
they could show only one color against black (black-and-white displays).

b) Thin-Film Electroluminescent Displays


 It is similar to a plasma-panel display, but the region between the glass plates is filled with
a phosphor, such as zinc sulfide doped with manganese, instead of a gas.
 When a sufficient voltage is applied, the phosphor becomes a conductor in the area of
intersection of the two electrodes.
 Electrical energy is then absorbed by the manganese atoms, which then release the energy
as a spot of light, similar to the glowing plasma effect in a plasma panel.
 It requires more power than a plasma panel.
 Good color and gray scale are difficult to achieve with this technology.

c) Light-Emitting Diode (LED) Displays


 In this display, a matrix of multicolor light-emitting diodes is arranged to form the pixel
positions in the display, and the picture definition is stored in a refresh buffer.
 Similar to the scan-line refreshing of a CRT, information is read from the refresh buffer and
converted to voltage levels that are applied to the diodes to produce the light pattern on
the display.


d) Liquid Crystal Display (LCD)


 This non-emissive device produces a picture by passing polarized light from the surroundings,
or from an internal light source, through a liquid-crystal material that can be aligned to
either block or transmit the light.
 The term liquid crystal refers to the fact that these compounds have a crystalline arrangement of
molecules, yet they flow like a liquid.
 It consists of two glass plates, each with a light polarizer at right angles to the other, which
sandwich the liquid-crystal material between them.
 Rows of horizontal transparent conductors are built into one glass plate, and columns of
vertical conductors are put into the other plate.
 The intersection of two conductors defines a pixel position.
 In the ON state, polarized light passing through the material is twisted so that it will pass
through the opposite polarizer.
 In the OFF state, it is reflected back toward the source.

Three- Dimensional Viewing Devices


 Graphics monitors for the display of three-dimensional scenes have been devised using a
technique that reflects a CRT image from a vibrating, flexible mirror. As the varifocal
mirror vibrates, it changes focal length.


 These vibrations are synchronized with the display of an object on a CRT so that each
point on the object is reflected from the mirror into a spatial position corresponding to the
distance of that point from a specified viewing location.
 This allows us to walk around an object or scene and view it from different sides.

1.4 Raster-Scan Systems


 Interactive raster-graphics systems typically employ several processing units.
 In addition to the central processing unit (CPU), a special-purpose processor, called the
video controller or display controller, is used to control the operation of the display
device.
 Organization of a simple raster system is shown in below Figure.

 Here, the frame buffer can be anywhere in the system memory, and the video controller
accesses the frame buffer to refresh the screen.


 In addition to the video controller, raster systems employ other processors as
coprocessors and accelerators to implement various graphics operations.

1.4.1 Video controller:


 The figure below shows a commonly used organization for raster systems.
 A fixed area of the system memory is reserved for the frame buffer, and the video
controller is given direct access to the frame-buffer memory.
 Frame-buffer locations and the corresponding screen positions are referenced in
Cartesian coordinates.

Cartesian reference frame:


 Frame-buffer locations and the corresponding screen positions are referenced in
Cartesian coordinates.
 In an application (user) program, we use the commands within a graphics software
package to set coordinate positions for displayed objects relative to the origin of the
Cartesian reference frame.
 The coordinate origin is referenced at the lower-left corner of a screen display area by the
software commands, although we can typically set the origin at any convenient location
for a particular application.


Working:
 Figure shows a two-dimensional Cartesian reference frame with the origin at the
lower-left screen corner.

 The screen surface is then represented as the first quadrant of a two-dimensional system
with positive x and y values increasing from left to right and bottom of the screen to the
top respectively.
 Pixel positions are then assigned integer x values that range from 0 to xmax across the
screen, left to right, and integer y values that vary from 0 to ymax, bottom to top.

Basic Video Controller Refresh Operations


 The basic refresh operations of the video controller are diagrammed

 Two registers are used to store the coordinate values for the screen pixels.


 Initially, the x register is set to 0 and the y register is set to the value for the top scan line.
 The contents of the frame buffer at this pixel position are then retrieved and used to set
the intensity of the CRT beam.
 Then the x register is incremented by 1, and the process is repeated for the next pixel on
the top scan line.
 This procedure continues for each pixel along the top scan line.
 After the last pixel on the top scan line has been processed, the x register is reset to 0 and
the y register is set to the value for the next scan line down from the top of the screen.
 The procedure is repeated for each successive scan line.
 After cycling through all pixels along the bottom scan line, the video controller resets the
registers to the first pixel position on the top scan line and the refresh process starts over.
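
The register-driven refresh cycle described above can be sketched in C roughly as follows; frameBuffer, setBeamIntensity, and the screen dimensions are assumed names used only for illustration, not part of any actual controller interface.

void setBeamIntensity (int x, int y, int value);    /* assumed hardware hook */

/* Sketch of one refresh pass: the y register starts at the top scan line and
   the x register sweeps each line from left to right. */
void refreshFrame (int xmax, int ymax, const int *frameBuffer)
{
    int x, y;
    for (y = ymax; y >= 0; y--) {                        /* top scan line downward */
        for (x = 0; x <= xmax; x++) {                    /* left to right          */
            int value = frameBuffer[y * (xmax + 1) + x];
            setBeamIntensity (x, y, value);              /* drive the beam         */
        }
    }
}   /* after the bottom scan line, the controller starts over at the top */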
a) Speeding up pixel processing in the video controller:
 Since the screen must be refreshed at a rate of at least 60 frames per second, the simple
procedure illustrated above may not be accommodated by RAM chips if the
cycle time is too slow.
 To speed up pixel processing, video controllers can retrieve multiple pixel values from
the refresh buffer on each pass.
 When a group of pixels has been processed, the next block of pixel values is retrieved from
the frame buffer.
Advantages of video controller:
 A video controller can be designed to perform a number of other operations.
 For various applications, the video controller can retrieve pixel values from different
memory areas on different refresh cycles.
 This provides a fast mechanism for generating real-time animations.
 Another video-controller task is the transformation of blocks of pixels, so that screen
areas can be enlarged, reduced, or moved from one location to another during the refresh
cycles.
 In addition, the video controller often contains a lookup table, so that pixel values in the
frame buffer are used to access the lookup table. This provides a fast method for
changing screen intensity values.


 Finally, some systems are designed to allow the video controller to mix the frame-buffer
image with an input image from a television camera or other input device.
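
The lookup-table idea mentioned above can be sketched as follows; the table size, screen dimensions, and variable names are assumptions made purely for illustration.

/* Assumed 8-bit frame-buffer entries indexing a 256-entry color lookup table. */
#define TABLE_SIZE 256

unsigned int  colorTable[TABLE_SIZE];   /* each entry holds a packed RGB value */
unsigned char frameBuffer[480][640];    /* assumed 640 x 480 screen of indices */

unsigned int pixelColor (int x, int y)
{
    return colorTable[frameBuffer[y][x]];   /* index -> displayed color value  */
}

/* Changing a single table entry re-colors every pixel that uses that index,
   which is why the lookup table gives a fast way to change screen intensities. */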

1.4.2 Raster-Scan Display Processor


 Figure shows one way to organize the components of a raster system that contains a
separate display processor, sometimes referred to as a graphics controller or a display
coprocessor.

 The purpose of the display processor is to free the CPU from the graphics chores.
 In addition to the system memory, a separate display-processor memory area can be
provided.
Scan conversion:
 A major task of the display processor is digitizing a picture definition given in an
application program into a set of pixel values for storage in the frame buffer.
 This digitization process is called scan conversion.
Example 1: displaying a line
 Graphics commands specifying straight lines and other geometric objects are scan
converted into a set of discrete points, corresponding to screen pixel positions.
 Scan converting a straight-line segment.


Example 2: displaying a character


 Characters can be defined with rectangular pixel grids
 The array size for character grids can vary from about 5 by 7 to 9 by 12 or more for
higher-quality displays.
 A character grid is displayed by superimposing the rectangular grid pattern into the frame
buffer at a specified coordinate position.
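
A minimal sketch of this idea, with an assumed 5-by-7 bit pattern for the letter “T” and an assumed frame-buffer array; the names and sizes are illustrative only.

/* Illustrative 5 x 7 grid for the letter 'T', one byte per row (low 5 bits used). */
unsigned char glyphT[7] = {
    0x1F,   /* 1 1 1 1 1  -- top bar of the T */
    0x04,   /* 0 0 1 0 0  -- vertical stem    */
    0x04, 0x04, 0x04, 0x04, 0x04
};

/* Superimpose the grid into an assumed frame buffer at position (cx, cy). */
void drawGlyph (unsigned char frameBuffer[480][640], int cx, int cy)
{
    int row, col;
    for (row = 0; row < 7; row++)
        for (col = 0; col < 5; col++)
            if (glyphT[row] & (1 << (4 - col)))       /* test bit for this column */
                frameBuffer[cy + row][cx + col] = 1;  /* set the pixel "on"       */
}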

Using outline:
 For characters that are defined as outlines, the shapes are scan-converted into the frame
buffer by locating the pixel positions closest to the outline.

Additional operations of Display processors:


 Display processors are also designed to perform a number of additional operations.
 These functions include generating various line styles (dashed, dotted, or solid),
displaying color areas, and applying transformations to the objects in a scene.
 Display processors are typically designed to interface with interactive input devices, such
as a mouse.

Methods to reduce memory requirements in display processor:


 In an effort to reduce memory requirements in raster systems, methods have been devised
for organizing the frame buffer as a linked list and encoding the color information.
 One organization scheme is to store each scan line as a set of number pairs.


 Encoding methods can be useful in the digital storage and transmission of picture
information
i) Run-length encoding:
 The first number in each pair can be a reference to a color value, and the second number
can specify the number of adjacent pixels on the scan line that are to be displayed in that
color.
 This technique, called run-length encoding, can result in a considerable saving in storage
space if a picture is to be constructed mostly with long runs of a single color each.
 A similar approach can be taken when pixel colors change linearly.
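A minimal sketch of run-length encoding one scan line into (color, run-length) pairs, as described above; the pixel representation and the type names are assumptions made for illustration.

/* Encode a scan line as (color, run-length) pairs. */
typedef struct { int color; int length; } Run;

int encodeScanLine (const int *pixels, int width, Run *runs)
{
    int nRuns = 0, i = 0;
    while (i < width) {
        int color = pixels[i], length = 1;
        while (i + length < width && pixels[i + length] == color)
            length++;                        /* extend the run of identical pixels */
        runs[nRuns].color  = color;
        runs[nRuns].length = length;
        nRuns++;
        i += length;
    }
    return nRuns;   /* e.g. the line 3 3 3 7 7 encodes as (3,3) (7,2) */
}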
ii) Cell encoding:
 Another approach is to encode the raster as a set of rectangular areas (cell encoding).

Disadvantages of encoding:
 The disadvantages of encoding runs are that color changes are difficult to record and
storage requirements increase as the lengths of the runs decrease.
 In addition, it is difficult for the display controller to process the raster when many short
runs are involved.
 Moreover, the size of the frame buffer is no longer a major concern, because of sharp
declines in memory costs

1.4.3 Graphics workstations and viewing systems


 Most graphics monitors today operate as raster-scan displays, and both CRT and flat
panel systems are in common use.
 Graphics workstations range from small general-purpose computer systems to multimonitor
facilities, often with ultra-large viewing screens.
 High-definition graphics systems, with resolutions up to 2560 by 2048, are commonly
used in medical imaging, air-traffic control, simulation, and CAD.
 Many high-end graphics workstations also include large viewing screens, often with
specialized features.


 Multi-panel display screens are used in a variety of applications that require “wall-sized”
viewing areas. These systems are designed for presenting graphics displays at meetings,
conferences, conventions, trade shows, retail stores etc.
 A multi-panel display can be used to show a large view of a single scene or several
individual images. Each panel in the system displays one section of the overall picture
 A large, curved-screen system can be useful for viewing by a group of people studying a
particular graphics application.
 A 360-degree paneled viewing system is used in the NASA control-tower simulator for
training and for testing ways to solve air-traffic and runway problems at airports.

1.5 Input Devices


 Graphics workstations make use of various devices for data input. Most systems have
keyboards and mice, while some other systems have trackballs, spaceballs, joysticks, button
boxes, touch panels, image scanners, and voice systems.
Keyboard:
 The keyboard on a graphics system is used for entering text strings, issuing certain commands,
and selecting menu options.
 Keyboards can also be provided with features for entry of screen coordinates, menu
selections, or graphics functions.
 A general-purpose keyboard uses function keys and cursor-control keys.
 Function keys allow the user to select frequently accessed operations with a single
keystroke. Cursor-control keys are used for selecting a displayed object or a location by
positioning the screen cursor.

Button Boxes and Dials:


 Buttons are often used to input predefined functions. Dials are common devices for
entering scalar values.
 Numerical values within some defined range are selected for input with dial rotations.


Mouse Devices:
 A mouse is a hand-held device, usually moved around on a flat surface to position the
screen cursor. Wheels or rollers on the bottom of the mouse are used to record the amount
and direction of movement.
 Some mice use optical sensors, which detect movement across horizontal
and vertical grid lines.
 Since a mouse can be picked up and put down, it is used for making relative changes in
the position of the screen cursor.
 Most general-purpose graphics systems now include a mouse and a keyboard as the
primary input devices.

Trackballs and Spaceballs:


 A trackball is a ball device that can be rotated with the fingers or palm of the hand to
produce screen cursor movement.
 Laptop keyboards are equipped with a trackball to eliminate the extra space required by a
mouse.
 A spaceball is an extension of the two-dimensional trackball concept.
 Spaceballs are used for three-dimensional positioning and selection operations in virtual-reality
systems, modeling, animation, CAD, and other applications.

Joysticks:
 A joystick is used as a positioning device; it uses a small vertical lever (stick) mounted
on a base. It is used to steer the screen cursor around and select screen positions with the
stick movement.
 A push or pull on the stick is measured with strain gauges and converted to movement of
the screen cursor in the direction of the applied pressure.

Data Gloves:
 A data glove can be used to grasp a virtual object. The glove is constructed with a series of
sensors that detect hand and finger motions.
 Input from the glove is used to position or manipulate objects in a virtual scene.


Digitizers:
 A digitizer is a common device for drawing, painting, or selecting positions.
 A graphics tablet is one type of digitizer, which is used to input two-dimensional coordinates
by activating a hand cursor or stylus at selected positions on a flat surface.
 A hand cursor contains cross hairs for sighting positions, and a stylus is a pencil-shaped
device that is pointed at positions on the tablet.

Image Scanners:
 Drawings, graphs, photographs, or text can be stored for computer processing with an
image scanner by passing an optical scanning mechanism over the information to be
stored.
 Once we have the representation of the picture, we can apply various image-processing
methods to modify the representation, and various editing operations can be performed
on the stored documents.

Touch Panels:
 Touch panels allow displayed objects or screen positions to be selected with the touch of
a finger.
 A touch panel is used for the selection of processing options that are represented as a menu
of graphical icons.
 An optical touch panel uses LEDs along one vertical edge and one horizontal edge of the frame.
 Acoustical touch panels generate high-frequency sound waves in horizontal and vertical
directions across a glass plate.

Light Pens:
 Light pens are pencil-shaped devices used to select positions by detecting the light
coming from points on the CRT screen.
 To select positions in any screen area with a light pen, we must have some nonzero light
intensity emitted from each pixel within that area.
 Light pens sometimes give false readings due to background lighting in a room.


Voice Systems:
 Speech recognizers are used with some graphics workstations as input devices for voice
commands. The voice-system input can be used to initiate operations or to enter data.
 A dictionary is set up by speaking the command words several times; the system then
analyzes each word and matches the spoken input against the stored voice-command patterns.

1.6 Graphics Networks


 So far, we have mainly considered graphics applications on an isolated system with a
single user.
 Multiuser environments & computer networks are now common elements in many
graphics applications.
 Various resources, such as processors, printers, plotters and data files can be distributed
on a network & shared by multiple users.
 A graphics monitor on a network is generally referred to as a graphics server.
 The computer on a network that is executing a graphics application is called the client.
 A workstation that includes processors, as well as a monitor and input devices can
function as both a server and a client.

1.7 Graphics on Internet


 A great deal of graphics development is now done on the Internet.
 Computers on the Internet communicate using TCP/IP.
 Resources such as graphics files are identified by URL (Uniform resource locator).
 The World Wide Web provides a hypertext system that allows users to locate and view
documents, audio, and graphics.
 A URL is sometimes also called a universal resource locator.
 The URL contains two parts: the Protocol, used for transferring the document, and the Server,
which contains the document.
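 For example, in the illustrative (assumed) URL http://www.example.com/index.html, "http" is the
protocol and "www.example.com" is the server holding the document.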

1.8 Graphics Software


 There are two broad classifications for computer-graphics software


1. Special-purpose packages: Special-purpose packages are designed for nonprogrammers.
Examples: packages for generating pictures, graphs, or charts, painting programs, and CAD
systems for some application area, which can be used without worrying about the
underlying graphics procedures.
2. General programming packages: A general programming package provides a library of
graphics functions that can be used in a programming language such as C, C++, Java,
or FORTRAN.
Examples: GL (Graphics Library), OpenGL, VRML (Virtual-Reality Modeling
Language), Java 2D, and Java 3D.

NOTE: A set of graphics functions is often called a computer-graphics application


programming interface (CG API)

1.10 Coordinate Representations


 To generate a picture using a programming package, we first need to give the geometric
descriptions of the objects that are to be displayed; these descriptions are given in terms of coordinates.
 If coordinate values for a picture are given in some other reference frame (spherical,
hyperbolic, etc.), they must be converted to Cartesian coordinates.
 Several different Cartesian reference frames are used in the process of constructing and
displaying
 First we define the shapes of individual objects, such as trees or furniture. These
reference frames are called modeling coordinates or local coordinates.
 Then we place the objects into appropriate locations within a scene reference frame
called world coordinates.
 After all parts of a scene have been specified, the scene is processed through various output-device
reference frames for display. This process is called the viewing pipeline.
 The scene is then stored in normalized coordinates, which range from −1 to 1 or from 0
to 1. Normalized coordinates are also referred to as normalized device coordinates.
 The coordinate systems for display devices are generally called device coordinates, or
screen coordinates.
NOTE: Geometric descriptions in modeling coordinates and world coordinates can be given in
floating-point or integer values.


 Example: Figure briefly illustrates the sequence of coordinate transformations from
modeling coordinates to device coordinates for a display

1.11 Graphics Functions


 It provides users with a variety of functions for creating and manipulating pictures
 The basic building blocks for pictures are referred to as graphics output primitives
 Attributes are properties of the output primitives
 We can change the size, position, or orientation of an object using geometric
transformations
 Modeling transformations, which are used to construct a scene.
 Viewing transformations are used to select a view of the scene, the type of projection to
be used and the location where the view is to be displayed.
 Input functions are used to control and process the data flow from interactive
devices (mouse, tablet, and joystick).
 A graphics package also contains a number of housekeeping tasks. We can lump the functions
for carrying out these tasks under the heading control operations.

Software Standards
 The primary goal of standardized graphics software is portability.


 In 1984, Graphical Kernel System (GKS) was adopted as the first graphics software
standard by the International Standards Organization (ISO)
 The second software standard to be developed and approved by the standards
organizations was Programmer’s Hierarchical Interactive Graphics System (PHIGS).
 Extension of PHIGS, called PHIGS+, was developed to provide 3-D surface rendering
capabilities not available in PHIGS.
 The graphics workstations from Silicon Graphics, Inc. (SGI), came with a set of routines
called GL (Graphics Library)

Other Graphics Packages


 Many other computer-graphics programming libraries have been developed for
1. General graphics routines
2. Specific applications (animation, virtual reality, etc.)
Examples: Open Inventor, Virtual-Reality Modeling Language (VRML).
We can create 2-D scenes within Java applets (Java 2D, Java 3D).

1.12 Introduction To OpenGL


 OpenGL basic (core) library: A basic library of functions is provided in OpenGL for
specifying graphics primitives, attributes, geometric transformations, viewing
transformations, and many other operations.

Basic OpenGL Syntax


 Function names in the OpenGL basic library (also called the OpenGL core library) are
prefixed with gl, and the first letter of each component word is capitalized.
 For example: glBegin, glClear, glCopyPixels, glPolygonMode.
 Symbolic constants that are used with certain functions as parameters are all in capital
letters, preceded by “GL”, and the component words are separated by underscores.
 For example: GL_2D, GL_RGB, GL_CCW, GL_POLYGON,
GL_AMBIENT_AND_DIFFUSE.


 The OpenGL functions also expect specific data types. For example, an OpenGL function
parameter might expect a value that is specified as a 32-bit integer. But the size of an
integer specification can be different on different machines.
 To indicate a specific data type, OpenGL uses special built-in data-type names, such as
GLbyte, GLshort, GLint, GLfloat, GLdouble, GLboolean.
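 For example, declarations using these built-in type names look like ordinary C declarations (assuming the OpenGL header is included as shown later; the variable names and values below are purely illustrative):

GLint     nVertices = 3;        // 32-bit integer, the same size on every machine
GLfloat   xCoord    = 75.0f;    // single-precision floating point
GLdouble  scale     = 1.5;      // double-precision floating point
GLboolean visible   = GL_TRUE;  // boolean flag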

Related Libraries
 In addition to OpenGL basic(core) library(prefixed with gl), there are a number of
associated libraries for handling special operations:-
1) OpenGL Utility(GLU):- Prefixed with “glu”. It provides routines for setting up
viewing and projection matrices, describing complex objects with line and polygon
approximations, displaying quadrics and B-splines using linear approximations,
processing the surface-rendering operations, and other complex tasks.
-Every OpenGL implementation includes the GLU library
2) Open Inventor:- provides routines and predefined object shapes for interactive three-
dimensional applications which are written in C++.
3) Window-system libraries: To create graphics we need a display window. We cannot
create the display window directly with the basic OpenGL functions, since OpenGL contains
only device-independent graphics functions, and window-management operations are
device-dependent. However, there are several window-system libraries that support
OpenGL functions for a variety of machines.
Examples: Apple GL (AGL), Windows-to-OpenGL (WGL), Presentation Manager to
OpenGL (PGL), GLX.
4) OpenGL Utility Toolkit(GLUT):- provides a library of functions which acts as
interface for interacting with any device specific screen-windowing system, thus making
our program device-independent. The GLUT library functions are prefixed with “glut”.

Header Files
 In all graphics programs, we will need to include the header file for the OpenGL core
library.


 On Windows, to include the OpenGL core library and GLU we can use the following header
files:
#include <windows.h> // precedes the other header files; includes the Microsoft Windows
version of the OpenGL libraries
#include<GL/gl.h>
#include <GL/glu.h>
 The above lines can be replaced by using GLUT header file which ensures gl.h and glu.h
are included correctly,
 #include <GL/glut.h> //GL in windows
 In Apple OS X systems, the header file inclusion statement will be,
 #include <GLUT/glut.h>

Display-Window Management Using GLUT


 We can consider a simplified example that uses a minimal number of operations for displaying a
picture.
Step 1: initialization of GLUT
 Since we are using the OpenGL Utility Toolkit, our first step is to initialize GLUT.
 This initialization function could also process any command line arguments, but we will
not need to use these parameters for our first example programs.
 We perform the GLUT initialization with the statement
glutInit (&argc, argv);
Step 2: title
 We can state that a display window is to be created on the screen with a given caption for
the title bar. This is accomplished with the function
glutCreateWindow ("An Example OpenGL Program");
 where the single argument for this function can be any character string that we want to
use for the display-window title.
Step 3: Specification of the display window
 Then we need to specify what the display window is to contain.
 For this, we create a picture using OpenGL functions and pass the picture definition to
the GLUT routine glutDisplayFunc, which assigns our picture to the display window.


 Example: suppose we have the OpenGL code for describing a line segment in a
procedure called lineSegment.
 Then the following function call passes the line-segment description to the display
window:
glutDisplayFunc (lineSegment);
Step 4: one more GLUT function
 But the display window is not yet on the screen.
 We need one more GLUT function to complete the window-processing operations.
 After execution of the following statement, all display windows that we have created,
including their graphic content, are now activated:
glutMainLoop ( );
 This function must be the last one in our program. It displays the initial graphics and puts
the program into an infinite loop that checks for input from devices such as a mouse or
keyboard.
Step 5: these parameters using additional GLUT functions
 Although the display window that we created will be in some default location and size,
we can set these parameters using additional GLUT functions.
GLUT Function 1:
 We use the glutInitWindowPosition function to give an initial location for the upper left
corner of the display window.
 This position is specified in integer screen coordinates, whose origin is at the upper-left
corner of the screen.


GLUT Function 2:
After the display window is on the screen, we can reposition and resize it.
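GLUT provides functions for this; for instance (the argument values below are only illustrative):
glutPositionWindow (100, 50);   // move the current window's top-left corner to (100, 50)
glutReshapeWindow (600, 400);   // resize the current window to 600 x 400 pixels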
GLUT Function 3:
 We can also set a number of other options for the display window, such as buffering and
a choice of color modes, with the glutInitDisplayMode function.
 Arguments for this routine are assigned symbolic GLUT constants.
 Example: the following command specifies that a single refresh buffer is to be used for
the display window and that we want to use the color mode which uses red, green, and
blue (RGB) components to select color values:
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
 The values of the constants passed to this function are combined using a logical or
operation.
 Actually, single buffering and RGB color mode are the default options.
 But we will use the function now as a reminder that these are the options that are set for
our display.
 Later, we discuss color modes in more detail, as well as other display options, such as
double buffering for animation applications and selecting parameters for viewing
three-dimensional scenes.

A Complete OpenGL Program


 There are still a few more tasks to perform before we have all the parts that we need for a
complete program.
Step 1: to set background color
 For the display window, we can choose a background color.
 Using RGB color values, we set the background color for the display window to be
white, with the OpenGL function:
glClearColor (1.0, 1.0, 1.0, 0.0);
 The first three arguments in this function set the red, green, and blue component colors to
the value 1.0, giving us a white background color for the display window.
 If, instead of 1.0, we set each of the component colors to 0.0, we would get a black
background.


 The fourth parameter in the glClearColor function is called the alpha value for the
specified color. One use for the alpha value is as a “blending” parameter
 When we activate the OpenGL blending operations, alpha values can be used to
determine the resulting color for two overlapping objects.
 An alpha value of 0.0 indicates a totally transparent object, and an alpha value of 1.0
indicates an opaque object.
 For now, we will simply set alpha to 0.0.
 Although the glClearColor command assigns a color to the display window, it does not
put the display window on the screen.

Step 2: to set window color


 To get the assigned window color displayed, we need to invoke the following OpenGL
function:
glClear (GL_COLOR_BUFFER_BIT);
 The argument GL_COLOR_BUFFER_BIT is an OpenGL symbolic constant specifying
that it is the bit values in the color buffer (refresh buffer) that are to be set to the values
indicated in the glClearColor function. (OpenGL has several different kinds of buffers
that can be manipulated.)

Step 3: to set color to object


 In addition to setting the background color for the display window, we can choose a
variety of color schemes for the objects we want to display in a scene.
 For our initial programming example, we will simply set the object color to be a dark
green
glColor3f (0.0, 0.4, 0.2);
 The suffix 3f on the glColor function indicates that we are specifying the three RGB
color components using floating-point (f) values.
 This function requires that the values be in the range from 0.0 to 1.0, and we have set red
= 0.0, green = 0.4, and blue = 0.2.


Example program
 For our first program, we simply display a two-dimensional line segment.
 To do this, we need to tell OpenGL how we want to “project” our picture onto the display
window because generating a two-dimensional picture is treated by OpenGL as a special
case of three-dimensional viewing.
 So, although we only want to produce a very simple two-dimensional line, OpenGL
processes our picture through the full three-dimensional viewing operations.
 We can set the projection type (mode) and other viewing parameters that we need with
the following two functions:
glMatrixMode (GL_PROJECTION);
gluOrtho2D (0.0, 200.0, 0.0, 150.0);
 This specifies that an orthogonal projection is to be used to map the contents of a
two-dimensional rectangular area of world coordinates to the screen, and that the x-
coordinate values within this rectangle range from 0.0 to 200.0 with y-coordinate values
ranging from 0.0 to 150.0.
 Whatever objects we define within this world-coordinate rectangle will be shown within
the display window.
 Anything outside this coordinate range will not be displayed.
 Therefore, the GLU function gluOrtho2D defines the coordinate reference frame within
the display window to be (0.0, 0.0) at the lower-left corner of the display window and
(200.0, 150.0) at the upper-right window corner.
 For now, we will use a world-coordinate rectangle with the same aspect ratio as the
display window (here 200/150 = 400/300 = 4/3), so that there is no distortion of our picture.
 Finally, we need to call the appropriate OpenGL routines to create our line segment.
 The following code defines a two-dimensional, straight-line segment with integer
Cartesian endpoint coordinates (180, 15) and (10, 145).
glBegin (GL_LINES);
glVertex2i (180, 15);
glVertex2i (10, 145);
glEnd ( );
 Now we are ready to put all the pieces together:


The following OpenGL program is organized into three functions.


 init: We place all initializations and related one-time parameter settings in function init.
 lineSegment: Our geometric description of the “picture” that we want to display is in
function lineSegment, which is the function that will be referenced by the GLUT function
glutDisplayFunc.
 main: The main function contains the GLUT functions for setting up the display
window and getting our line segment onto the screen.
 glFlush: This is simply a routine to force execution of our OpenGL functions, which are
stored by computer systems in buffers in different locations,depending on how OpenGL
is implemented.
 The procedure lineSegment that we set up to describe our picture is referred to as a
display callback function.
 And this procedure is described as being “registered” by glutDisplayFunc as the routine
to invoke whenever the display window might need to be redisplayed.
Example: if the display window is moved.
Following program to display window and line segment generated by this program:
#include <GL/glut.h> // (or others, depending on the system in use)
void init (void)
{
glClearColor (1.0, 1.0, 1.0, 0.0); // Set display-window color to white.
glMatrixMode (GL_PROJECTION); // Set projection parameters.
gluOrtho2D (0.0, 200.0, 0.0, 150.0);
}
void lineSegment (void)
{
glClear (GL_COLOR_BUFFER_BIT); // Clear display window.
glColor3f (0.0, 0.4, 0.2); // Set line segment color to dark green.
glBegin (GL_LINES);
glVertex2i (180, 15); // Specify line-segment geometry.
glVertex2i (10, 145);
glEnd ( );


glFlush ( ); // Process all OpenGL routines as quickly as possible.


}
int main (int argc, char** argv)
{
glutInit (&argc, argv); // Initialize GLUT.
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); // Set display mode.
glutInitWindowPosition (50, 100); // Set top-left display-window position.
glutInitWindowSize (400, 300); // Set display-window width and height.
glutCreateWindow ("An Example OpenGL Program"); // Create display window.
init ( ); // Execute initialization procedure.
glutDisplayFunc (lineSegment); // Send graphics to display window.
glutMainLoop ( ); // Display everything and wait.
}

1.13 Coordinate Reference Frames


To describe a picture, we first decide upon
 A convenient Cartesian coordinate system, called the world-coordinate reference frame,
which could be either 2D or 3D.
 We then describe the objects in our picture by giving their geometric specifications in
terms of positions in world coordinates.
 Example: We define a straight-line segment with two endpoint positions, and a polygon
is specified with a set of positions for its vertices.
 These coordinate positions are stored in the scene description along with other info about
the objects, such as their color and their coordinate extents
 Co-ordinate extents :Co-ordinate extents are the minimum and maximum x, y, and z
values for each object.
 A set of coordinate extents is also described as a bounding box for an object.
 Ex:For a 2D figure, the coordinate extents are sometimes called its bounding rectangle.
 Objects are then displayed by passing the scene description to the viewing routines which
identify visible surfaces and map the objects to the frame buffer positions and then on the
video monitor.

 The scan-conversion algorithm stores info about the scene, such as color values, at the
appropriate locations in the frame buffer, and then the scene is displayed on the output
device.

Screen co-ordinates:
 Locations on a video monitor are referenced in integer screen coordinates, which
correspond to the integer pixel positions in the frame buffer.
 Scan-line algorithms for the graphics primitives use the coordinate descriptions to
determine the locations of pixels
 Example: given the endpoint coordinates for a line segment, a display algorithm must
calculate the positions for those pixels that lie along the line path between the endpoints.
 Since a pixel position occupies a finite area of the screen, the finite size of a pixel must
be taken into account by the implementation algorithms.
 For the present, we assume that each integer screen position references the centre of a
pixel area.
 Once pixel positions have been identified the color values must be stored in the frame
buffer

Assume we have available a low-level procedure of the form


i) setPixel (x, y);
 stores the current color setting into the frame buffer at integer position(x, y), relative to
the position of the screen-coordinate origin
ii) getPixel (x, y, color);
 Retrieves the current frame-buffer setting for a pixel location;
 Parameter color receives an integer value corresponding to the combined RGB bit codes
stored for the specified pixel at position (x,y).
 Additional screen-coordinate information is needed for 3D scenes.
 For a two-dimensional scene, all depth values are 0.
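As an illustration, a minimal version of such a setPixel routine could be built on OpenGL point plotting (setPixel is the assumed low-level procedure of these notes, not a standard OpenGL call; the sketch assumes the current color has already been set):
void setPixel (GLint x, GLint y)
{
glBegin (GL_POINTS); // plot a single point at the given screen position
glVertex2i (x, y);
glEnd ( );
}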

Absolute and Relative Coordinate Specifications


Absolute coordinate:
 So far, the coordinate references that we have discussed are stated as absolute coordinate
values.
 This means that the values specified are the actual positions within the coordinate system
in use.
Relative coordinates:
 However, some graphics packages also allow positions to be specified using relative
coordinates.
 This method is useful for various graphics applications, such as producing drawings with
pen plotters, artist’s drawing and painting systems, and graphics packages for publishing
and printing applications.
 Taking this approach, we can specify a coordinate position as an offset from the last
position that was referenced (called the current position).
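For example, if the current position is (3, 4), then a relative coordinate specification of (2, -1) places the next point at the absolute position (5, 3), and that point then becomes the new current position.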

Specifying a Two-Dimensional World-Coordinate Reference Frame in OpenGL


 The gluOrtho2D command is a function we can use to set up any 2D Cartesian reference
frames.
 The arguments for this function are the four values defining the x and y coordinate limits
for the picture we want to display.
 Since the gluOrtho2D function specifies an orthogonal projection, we need also to be sure
that the coordinate values are placed in the OpenGL projection matrix.
 In addition, we could assign the identity matrix as the projection matrix before defining
the world-coordinate range.
 This would ensure that the coordinate values were not accumulated with any values we
may have previously set for the projection matrix.
 Thus, for our initial two-dimensional examples, we can define the coordinate frame for
the screen display window with the following statements
glMatrixMode (GL_PROJECTION);
glLoadIdentity ( );
gluOrtho2D (xmin, xmax, ymin, ymax);

 The display window will then be referenced by coordinates (xmin, ymin) at the lower-left
corner and by coordinates (xmax, ymax) at the upper-right corner, as shown in Figure
below

 We can then designate one or more graphics primitives for display using the coordinate
reference specified in the gluOrtho2D statement.
 If the coordinate extents of a primitive are within the coordinate range of the display
window, all of the primitive will be displayed.
 Otherwise, only those parts of the primitive within the display-window coordinate limits
will be shown.
 Also, when we set up the geometry describing a picture, all positions for the OpenGL
primitives must be given in absolute coordinates, with respect to the reference frame
defined in the gluOrtho2D function.

1.14 OpenGL Functions


Geometric Primitives:
 It includes points, line segments, polygon etc.
 These primitives pass through geometric pipeline which decides whether the primitive is
visible or not and also how the primitive should be visible on the screen etc.
 The geometric transformations such rotation, scaling etc can be applied on the primitives
which are displayed on the screen.The programmer can create geometric primitives as
shown below:
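For example, a single line segment between two vertex positions can be specified as follows, where the symbolic constant passed to glBegin selects which primitive the vertex list describes:
glBegin (GL_LINES); // primitive type
glVertex2i (50, 50); // first endpoint
glVertex2i (150, 100); // second endpoint
glEnd ( ); // end of the vertex list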

where:
glBegin indicates the beginning of the object that has to be displayed
glEnd indicates the end of primitive

1.15 OpenGL Point Functions


 The type within glBegin() specifies the type of the object and its value can be as follows:
GL_POINTS
 Each vertex is displayed as a point.
 The size of the point would be of at least one pixel.
 Then this coordinate position, along with other geometric descriptions we may have in
our scene, is passed to the viewing routines.
 Unless we specify other attribute values, OpenGL primitives are displayed with a default
size and color.
 The default color for primitives is white, and the default point size is equal to the size of a
single screen pixel
Syntax:
Case 1:
glBegin (GL_POINTS);
glVertex2i (50, 100);
glVertex2i (75, 150);
glVertex2i (100, 200);

glEnd ( );
Case 2:
 we could specify the coordinate values for the preceding points in arrays such as
int point1 [ ] = {50, 100};
int point2 [ ] = {75, 150};
int point3 [ ] = {100, 200};
and call the OpenGL functions for plotting the three points as
glBegin (GL_POINTS);
glVertex2iv (point1);
glVertex2iv (point2);
glVertex2iv (point3);
glEnd ( );
Case 3:
 specifying two point positions in a three dimensional world reference frame. In this case,
we give the coordinates as explicit floating-point values:
glBegin (GL_POINTS);
glVertex3f (-78.05, 909.72, 14.60);
glVertex3f (261.91, -5200.67, 188.33);
glEnd ( );

1.16 OpenGL LINE FUNCTIONS


 Primitive type is GL_LINES
 Successive pairs of vertices are considered as endpoints and they are connected to form
an individual line segments.
 Note that successive segments usually are disconnected because the vertices are
processed on a pair-wise basis.
 we obtain one line segment between the first and second coordinate positions and another
line segment between the third and fourth positions.
 If the number of specified endpoints is odd, the last coordinate position is ignored.

Case 1: Lines
glBegin (GL_LINES);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );

Case 2: GL_LINE_STRIP:
Successive vertices are connected using line segments. However, the final vertex is not
connected to the initial vertex.
glBegin (GL_LINE_STRIP);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );

Case 3: GL_LINE_LOOP:
Successive vertices are connected using line segments to form a closed path or loop i.e., final
vertex is connected to the initial vertex.
glBegin (GL_LINE_LOOP);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
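For instance, with the five vertex positions p1 through p5 above, GL_LINES produces only the two disconnected segments p1-p2 and p3-p4 (p5 is ignored because it has no partner), GL_LINE_STRIP produces the four connected segments p1-p2-p3-p4-p5, and GL_LINE_LOOP produces those four segments plus a closing segment from p5 back to p1.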

1.16 Point Attributes


 Basically, we can set two attributes for points: color and size.
 In a state system: The displayed color and size of a point is determined by the current
values stored in the attribute list.
 Color components are set with RGB values or an index into a color table.
 For a raster system: Point size is an integer multiple of the pixel size, so that a large point
is displayed as a square block of pixels

Opengl Point-Attribute Functions


Color:
 The displayed color of a designated point position is controlled by the current color
values in the state list.
 Also, a color is specified with either the glColor function or the glIndex function.
Size:
 We set the size for an OpenGL point with
glPointSize (size);
and the point is then displayed as a square block of pixels.
 Parameter size is assigned a positive floating-point value, which is rounded to an integer
(unless the point is to be antialiased).
 The number of horizontal and vertical pixels in the display of the point is determined by
parameter size.
 Thus, a point size of 1.0 displays a single pixel, and a point size of 2.0 displays a 2×2
pixel array.
 If we activate the antialiasing features of OpenGL, the size of a displayed block of pixels
will be modified to smooth the edges.
 The default value for point size is 1.0.

Example program:
 Attribute functions may be listed inside or outside of a glBegin/glEnd pair.
 Example: the following code segment plots three points in varying colors and sizes.

 The first is a standard-size red point, the second is a double-size green point, and the third
is a triple-size blue point:

Ex:
glColor3f (1.0, 0.0, 0.0);
glBegin (GL_POINTS);
glVertex2i (50, 100);
glPointSize (2.0);
glColor3f (0.0, 1.0, 0.0);
glVertex2i (75, 150);
glPointSize (3.0);
glColor3f (0.0, 0.0, 1.0);
glVertex2i (100, 200);
glEnd ( );

1.17 Line-Attribute Functions OpenGL


 In OpenGL straight-line segment with three attribute settings: line color, line-width, and
line style.
 OpenGL provides a function for setting the width of a line and another function for
specifying a line style, such as a dashed or dotted line.

OpenGL Line-Width Function


 Line width is set in OpenGL with the function
Syntax: glLineWidth (width);
 We assign a floating-point value to parameter width, and this value is rounded to the
nearest nonnegative integer.
 If the input value rounds to 0.0, the line is displayed with a standard width of 1.0, which
is the default width.
 Some implementations of the line-width function might support only a limited number of
widths, and some might not support widths other than 1.0.

 To generate lines thicker than one pixel, the magnitudes of the horizontal and vertical separations of the line endpoints, Δx and Δy, are compared to determine whether to generate the thick line using vertical pixel spans or horizontal pixel spans.
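For example, the following fragment requests a line four pixels wide and then restores the default width (subject to the implementation limits noted above):
glLineWidth (4.0); // request a 4-pixel-wide line
glBegin (GL_LINES);
glVertex2i (20, 20);
glVertex2i (140, 80);
glEnd ( );
glLineWidth (1.0); // restore the default line width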

OpenGL Line-Style Function


 By default, a straight-line segment is displayed as a solid line.
 But we can also display dashed lines, dotted lines, or a line with a combination of dashes
and dots.
 We can vary the length of the dashes and the spacing between dashes or dots.
 We set a current display style for lines with the OpenGL function:
Syntax: glLineStipple (repeatFactor, pattern);

Pattern:
 Parameter pattern is used to reference a 16-bit integer that describes how the line should
be displayed.
 1 bit in the pattern denotes an “on” pixel position, and a 0 bit indicates an “off” pixel
position.
 The pattern is applied to the pixels along the line path starting with the low-order bits in
the pattern.
 The default pattern is 0xFFFF (each bit position has a value of 1),which produces a solid
line.

repeatFactor
 Integer parameter repeatFactor specifies how many times each bit in the pattern is to be
repeated before the next bit in the pattern is applied.
 The default repeat value is 1.

Polyline:
 With a polyline, a specified line-style pattern is not restarted at the beginning of each
segment.

 It is applied continuously across all the segments, starting at the first endpoint of the
polyline and ending at the final endpoint for the last segment in the series.
Example:
 For line style, suppose parameter pattern is assigned the hexadecimal representation
0x00FF and the repeat factor is 1.
 This would display a dashed line with eight pixels in each dash and eight pixel positions
that are “off” (an eight-pixel space) between two dashes.
 Also, since low order bits are applied first, a line begins with an eight-pixel dash starting
at the first endpoint.
 This dash is followed by an eight-pixel space, then another eight-pixel dash, and so forth,
until the second endpoint position is reached.

Activating line style:


 Before a line can be displayed in the current line-style pattern, we must activate the line-
style feature of OpenGL.
glEnable (GL_LINE_STIPPLE);
 If we forget to include this enable function, solid lines are displayed; that is, the default
pattern 0xFFFF is used to display line segments.
 At any time, we can turn off the line-pattern feature with
glDisable (GL_LINE_STIPPLE);
 This replaces the current line-style pattern with the default pattern (solid lines).

Example Code:
typedef struct { float x, y; } wcPt2D;
wcPt2D dataPts [5];
void linePlot (wcPt2D dataPts [5])
{
int k;
glBegin (GL_LINE_STRIP);
for (k = 0; k < 5; k++)
glVertex2f (dataPts [k].x, dataPts [k].y);

glEnd ( );
glFlush ( ); // Process all OpenGL routines as quickly as possible.
}
/* Invoke a procedure here to draw coordinate axes. */
glEnable (GL_LINE_STIPPLE); /* Input first set of (x, y) data values. */
glLineStipple (1, 0x1C47); // Plot a dash-dot, standard-width polyline.
linePlot (dataPts);
/* Input second set of (x, y) data values. */
glLineStipple (1, 0x00FF); / / Plot a dashed, double-width polyline.
glLineWidth (2.0);
linePlot (dataPts);
/* Input third set of (x, y) data values. */
glLineStipple (1, 0x0101); // Plot a dotted, triple-width polyline.
glLineWidth (3.0);
linePlot (dataPts);
glDisable (GL_LINE_STIPPLE);

1.18 Curve Attributes


 Parameters for curve attributes are the same as those for straight-line segments.
 We can display curves with varying colors, widths, dot-dash patterns, and available pen
or brush options.
 Methods for adapting curve-drawing algorithms to accommodate attribute selections are
similar to those for line drawing.
 Raster curves of various widths can be displayed using the method of horizontal or
vertical pixel spans.
Case 1: Where the magnitude of the curve slope |m| <= 1.0, we plot vertical spans;
Case 2: when the slope magnitude |m| > 1.0, we plot horizontal spans.

Different methods to draw a curve:


Method 1: Using circle symmetry property, we generate the circle path with vertical spans in the
octant from x = 0 to x = y, and then reflect pixel positions about the line y = x to y=0

Method 2: Another method for displaying thick curves is to fill in the area between two Parallel
curve paths, whose separation distance is equal to the desired width. We could do this using the
specified curve path as one boundary and setting up the second boundary either inside or outside
the original curve path. This approach, however, shifts the original curve path either inward or
outward, depending on which direction we choose for the second boundary.

Method 3:The pixel masks discussed for implementing line-style options could also be used in
raster curve algorithms to generate dashed or dotted patterns

Method 4: Pen (or brush) displays of curves are generated using the same techniques discussed
for straight-line segments.

Method 5: Painting and drawing programs allow pictures to be constructed interactively by


using a pointing device, such as a stylus and a graphics tablet, to sketch various curve shapes.

1.19 Line Drawing Algorithm


 A straight-line segment in a scene is defined by coordinate positions for the endpoints of
the segment.
 To display the line on a raster monitor, the graphics system must first project the
endpoints to integer screen coordinates and determine the nearest pixel positions along
the line path between the two endpoints then the line color is loaded into the frame buffer
at the corresponding pixel coordinates
 The Cartesian slope-intercept equation for a straight line is
y=m * x +b ------------>(1)
with m as the slope of the line and b as the y intercept.
 Given that the two endpoints of a line segment are specified at positions (x0,y0) and
(xend, yend) ,as shown in fig.

 We determine values for the slope m and y intercept b with the following equations:
m=(yend - y0)/(xend - x0) ---------------- >(2)
b=y0 - m.x0 ------------- >(3)
 Algorithms for displaying straight line are based on the line equation (1) and calculations
given in eq(2) and (3).
 For given x interval δx along a line, we can compute the corresponding y interval δy from
eq.(2) as
δy=m. δx ---------------- >(4)
 Similarly, we can obtain the x interval δx corresponding to a specified δy as
δx=δy/m ----------------- >(5)
 These equations form the basis for determining deflection voltages in analog displays,
such as vector-scan system, where arbitrarily small changes in deflection voltage are
possible.
 For lines with slope magnitudes
 |m|<1, δx can be set proportional to a small horizontal deflection voltage with the
corresponding vertical deflection voltage set proportional to δy from eq.(4)
 |m|>1, δy can be set proportional to a small vertical deflection voltage with the
corresponding horizontal deflection voltage set proportional to δx from eq.(5)
 |m|=1, δx=δy and the horizontal and vertical deflections voltages are equal

DDA Algorithm (DIGITAL DIFFERENTIAL ANALYZER)


 The DDA is a scan-conversion line algorithm based on calculating either δy or δx.

 A line is sampled at unit intervals in one coordinate and the corresponding integer values
nearest the line path are determined for the other coordinate
 DDA Algorithm has three cases so from equation i.e.., m=(yk+1 - yk)/(xk+1 - xk)

Case1:
if m<1,x increment in unit intervals
i.e..,xk+1=xk+1
then, m=(yk+1 - yk)/( xk+1 - xk)
m= yk+1 - yk
yk+1 = yk + m ----------- >(1)
 where k takes integer values starting from 0,for the first point and increases by 1 until
final endpoint is reached. Since m can be any real number between 0.0 and 1.0, the calculated y values must be rounded to the nearest integer.

Case2:
if m>1, y increment in unit intervals
i.e.., yk+1 = yk + 1
then, m= (yk + 1- yk)/( xk+1 - xk)
m(xk+1 - xk)=1
xk+1 =(1/m)+ xk------------------------- (2)

Case3:
if m=1,both x and y increment in unit intervals
i.e..,xk+1=xk + 1 and yk+1 = yk + 1

Equations (1) and (2) are based on the assumption that lines are to be processed from the left
endpoint to the right endpoint. If this processing is reversed, so that the starting endpoint is at the
right, then either we have δx=-1 and
yk+1 = yk - m (3)
or(when the slope is greater than 1)we have δy=-1 with
xk+1 = xk - (1/m) --------------- (4)

 Similar calculations are carried out using equations (1) through (4) to determine the pixel
positions along a line with negative slope. thus, if the absolute value of the slope is less
than 1 and the starting endpoint is at the left, we set δx = 1 and calculate y values with eq. (1).
 when starting endpoint is at the right(for the same slope),we set δx=-1 and obtain y
positions using eq(3).
 This algorithm is summarized in the following procedure, which accepts as input two
integer screen positions for the endpoints of a line segment.
 if m<1,where x is incrementing by 1
yk+1 = yk + m
 So initially x=0,Assuming (x0,y0)as initial point assigning x= x0,y=y0 which is the
starting point .
o Illuminate pixel(x, round(y))
o x1 = x + 1 , y1 = y + m
o Illuminate pixel(x1,round(y1))
o x2 = x1 + 1 , y2 = y1 + m
o Illuminate pixel(x2,round(y2))
o Till it reaches final point.
 if m>1,where y is incrementing by 1
xk+1 =(1/m)+ xk
 So initially y=0,Assuming (x0,y0)as initial point assigning x= x0,y=y0 which is the
starting point .
o Illuminate pixel(round(x),y)
o x1 = x + (1/m) , y1 = y + 1
o Illuminate pixel(round(x1),y1)
o x2 = x1 + (1/m) , y2 = y1 + 1
o Illuminate pixel(round(x2),y2)
o Till it reaches final point.

 The DDA algorithm is faster method for calculating pixel position than one that directly
implements .

 It eliminates the multiplication by making use of raster characteristics, so that appropriate


increments are applied in the x or y directions to step from one pixel position to another
along the line path.
 The accumulation of round off error in successive additions of the floating point
increment, however can cause the calculated pixel positions to drift away from the true
line path for long line segments. Furthermore ,the rounding operations and floating point
arithmetic in this procedure are still time consuming.
 we improve the performance of DDA algorithm by separating the increments m and 1/m
into integer and fractional parts so that all calculations are reduced to integer operations.
#include <stdlib.h>
#include <math.h>
inline int round (const float a)
{
return int (a + 0.5);
}
void lineDDA (int x0, int y0, int xEnd, int yEnd)
{
int dx = xEnd - x0, dy = yEnd - y0, steps, k;
float xIncrement, yIncrement, x = x0, y = y0;
if (fabs (dx) > fabs (dy))
steps = fabs (dx);
else
steps = fabs (dy);
xIncrement = float (dx) / float (steps);
yIncrement = float (dy) / float (steps);
setPixel (round (x), round (y));
for (k = 0; k < steps; k++) {
x += xIncrement;
y += yIncrement;
setPixel (round (x), round (y));
}
}
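As a quick check of the procedure, consider the endpoints (0, 0) and (5, 3): here dx = 5 and dy = 3, so steps = 5, xIncrement = 1.0, and yIncrement = 0.6. The successive (x, y) values are (1, 0.6), (2, 1.2), (3, 1.8), (4, 2.4), (5, 3.0), so the pixels set after rounding are (0, 0), (1, 1), (2, 1), (3, 2), (4, 2), (5, 3).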

Bresenham’s Algorithm:
 It is an efficient raster scan generating algorithm that uses incremental integral
calculations
 To illustrate Bresenham’s approach, we first consider the scan-conversion process for
lines with positive slope less than 1.0.
 Pixel positions along a line path are then determined by sampling at unit x intervals.
Starting from the left endpoint (x0, y0) of a given line, we step to each successive column
(x position) and plot the pixel whose scan-line y value is closest to the line path.

 Consider the equation of a straight line y=mx+c where m=dy/dx

Bresenham’s Line-Drawing Algorithm for |m| < 1.0


1. Input the two line endpoints and store the left endpoint in (x0, y0).
2. Set the color for frame-buffer position (x0, y0); i.e., plot the first point.
3. Calculate the constants ∆x, ∆y, 2∆y, and 2∆y − 2∆x, and obtain the starting value for
the decision parameter as
p0 = 2∆y −∆x
4. At each xk along the line, starting at k = 0, perform the following test:
If pk < 0, the next point to plot is (xk + 1, yk ) and
pk+1 = pk + 2∆y
Otherwise, the next point to plot is (xk + 1, yk + 1) and
pk+1 = pk + 2∆y − 2∆x
5. Repeat step 4 ∆x − 1 more times.
Note:
If |m|>1.0
Then
p0 = 2∆x −∆y
and

If pk < 0, the next point to plot is (xk , yk +1) and


pk+1 = pk + 2∆x
Otherwise, the next point to plot is (xk + 1, yk + 1) and
pk+1 = pk + 2∆x − 2∆y
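Worked example: for the line with endpoints (20, 10) and (30, 18), ∆x = 10, ∆y = 8, p0 = 2∆y − ∆x = 6, 2∆y = 16, and 2∆y − 2∆x = −4. The successive decision parameters are 6, 2, −2, 14, 10, 6, 2, −2, 14, 10, and the pixels plotted after the first point (20, 10) are (21, 11), (22, 12), (23, 12), (24, 13), (25, 14), (26, 15), (27, 16), (28, 16), (29, 17), (30, 18).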

Code:
#include <stdlib.h>
#include <math.h>
/* Bresenham line-drawing procedure for |m| < 1.0. */
void lineBres (int x0, int y0, int xEnd, int yEnd)
{
int dx = fabs (xEnd - x0), dy = fabs(yEnd - y0);
int p = 2 * dy - dx;
int twoDy = 2 * dy, twoDyMinusDx = 2 * (dy - dx);
int x, y;
/* Determine which endpoint to use as start position. */
if (x0 > xEnd) {
x = xEnd;
y = yEnd;
xEnd = x0;
}
else {
x = x0;
y = y0;
}
setPixel (x, y);
while (x < xEnd) {
x++;
if (p < 0)
p += twoDy;

else {
y++;
p += twoDyMinusDx;
}
setPixel (x, y);
}
}

Properties of Circles
 A circle is defined as the set of points that are all at a given distance r from a center
position (xc , yc ).
 For any circle point (x, y), this distance relationship is expressed by the Pythagorean
theorem in Cartesian coordinates as

(x − xc)² + (y − yc)² = r²
 We could use this equation to calculate the position of points on a circle circumference
by stepping along the x axis in unit steps from xc −r to xc +r and calculating the
corresponding y values at each position as

y = yc ± √(r² − (x − xc)²)
 One problem with this approach is that it involves considerable computation at each step.
Moreover, the spacing between plotted pixel positions is not uniform.
 We could adjust the spacing by interchanging x and y (stepping through y values and
calculating x values) whenever the absolute value of the slope of the circle is greater than
1; but this simply increases the computation and processing required by the algorithm.
 Another way to eliminate the unequal spacing is to calculate points along the circular
boundary using polar coordinates r and θ
 Expressing the circle equation in parametric polar form yields the pair of equations

x = xc + r cos θ
y = yc + r sin θ
Midpoint Circle Algorithm


 Midpoint circle algorithm generates all points on a circle centered at the origin by
incrementing all the way around circle.
 The strategy is to select which of 2 pixels is closer to the circle by evaluating a function
at the midpoint between the 2 pixels
 To apply the midpoint method, we define a circle function as

fcirc(x, y) = x² + y² − r²
 To summarize, the relative position of any point (x, y) can be determined by checking the
sign of the circle function as follows:

fcirc(x, y) < 0, if (x, y) is inside the circle boundary
fcirc(x, y) = 0, if (x, y) is on the circle boundary
fcirc(x, y) > 0, if (x, y) is outside the circle boundary
Eight way symmetry


 The shape of the circle is similar in each quadrant.
 Therefore ,if we determine the curve positions in the first quadrant ,we can generate the
circle positions in the second quadrant of xy plane.
 The circle sections in the third and fourth quadrant can be obtained from sections in the
first and second quadrant by considering the symmetry along X axis

 Consider a circle centered at the origin: if the point (x, y) is on the circle, then we can compute 7 other points on the circle, as shown in the above figure.
 Our decision parameter is the circle function evaluated at the midpoint between these
two pixels:

pk = fcirc(xk + 1, yk − 1/2) = (xk + 1)² + (yk − 1/2)² − r²
 Successive decision parameters are obtained using incremental calculations.


 We obtain a recursive expression for the next decision parameter by evaluating the circle
function at sampling position xk+1 + 1 = xk + 2:

pk+1 = fcirc(xk+1 + 1, yk+1 − 1/2) = pk + 2(xk + 1) + (yk+1² − yk²) − (yk+1 − yk) + 1

where yk+1 is either yk or yk − 1, depending on the sign of pk.
 The initial decision parameter is obtained by evaluating the circle function at the start
position (x0, y0) = (0, r):

p0 = fcirc(1, r − 1/2) = 1 + (r − 1/2)² − r² = 5/4 − r
 If the radius r is specified as an integer, we can simply round p0 to


p0 = 1 − r (for r an integer)
because all increments are integers.

Midpoint Circle Algorithm


1. Input radius r and circle center (xc , yc ), then set the coordinates for the first point on the
circumference of a circle centered on the origin as
(x0, y0) = (0, r )
2. Calculate the initial value of the decision parameter as
p0 = 1-r
3. At each xk position, starting at k = 0, perform the following test:
If pk <0, the next point along the circle centered on (0, 0) is (xk+1, yk ) and
pk+1 = pk + 2xk+1 + 1
Otherwise, the next point along the circle is (xk + 1, yk − 1) and

pk+1 = pk + 2xk+1 + 1 – 2yk+1


where 2xk+1 = 2xk + 2 and 2yk+1= 2yk − 2.

4. Determine symmetry points in the other seven octants.


5. Move each calculated pixel position (x, y) onto the circular path centered at (xc , yc ) and plot
the coordinate values as follows:
x = x + xc , y = y + yc
6. Repeat steps 3 through 5 until x ≥ y.
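Worked example: for a circle of radius r = 10 centered on the origin, p0 = 1 − r = −9. The successive decision parameters are −9, −6, −1, 6, −3, 8, 5, giving the first-octant pixels (0, 10), (1, 10), (2, 10), (3, 10), (4, 9), (5, 9), (6, 8), (7, 7). The loop then stops because x ≥ y, and the remaining points follow from the eight-way symmetry.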

Code:
void draw_pixel(GLint cx, GLint cy)
{
glColor3f(0.5,0.5,0.0);
glBegin(GL_POINTS);
glVertex2i(cx, cy);
glEnd();
}

void plotpixels(GLint h, GLint k, GLint x, GLint y)


{
draw_pixel(x+h, y+k);
draw_pixel(-x+h, y+k);
draw_pixel(x+h, -y+k);
draw_pixel(-x+h, -y+k);
draw_pixel(y+h, x+k);
draw_pixel(-y+h, x+k);
draw_pixel(y+h, -x+k);
draw_pixel(-y+h, -x+k);
}

void circle_draw(GLint xc, GLint yc, GLint r)


{
GLint d=1-r, x=0,y=r;
while(y>x)
{
plotpixels(xc, yc, x, y);
if(d<0) d+=2*x+3;
else
{

d+=2*(x-y)+5;
--y;
}
++x;
}
plotpixels(xc, yc, x, y);
}


MODULE-2
2D Geometric Transformations

Operations that are applied to the geometric description of an object to change its position,
orientation, or size are called geometric transformations. Sometimes geometric transformation
operations are also referred to as modeling transformations.

Basic Two-Dimensional Geometric Transformations


The geometric-transformation functions that are available in all graphics packages are those for
translation, rotation, and scaling. Other useful transformation routines that are sometimes
included in a package are reflection and shearing operations.

Two-Dimensional Translation
Translation on single coordinate point is performed by adding offsets to its coordinates to
generate a new coordinate position. The original point position is moved along a straight line
path to its new location.

To translate a two-dimensional position, we add translation distances tx and ty to the original coordinates (x, y) to obtain the new coordinate position (x′, y′), as shown in Figure 2.1:

x′ = x + tx ,  y′ = y + ty   (1)

The translation distance pair (tx, ty) is called a translation vector or shift vector.
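A minimal sketch of this operation in C, using an illustrative point structure of the kind used for the polyline example in Module 1, is:

typedef struct { float x, y; } wcPt2D;

/* Translate a point by the distances tx and ty. */
void translatePoint (wcPt2D *p, float tx, float ty)
{
p->x += tx; /* x' = x + tx */
p->y += ty; /* y' = y + ty */
}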


Figure 2.1: Translating a point from position P to position P′ using a translation vector T.

We can express Equations 1 as a single matrix equation by using the following column vectors to represent coordinate positions and the translation vector:

P = [x  y]ᵀ ,  P′ = [x′  y′]ᵀ ,  T = [tx  ty]ᵀ   (2)

The two-dimensional translation equations can then be written in the matrix form:

P′ = P + T   (3)
Translation is a rigid-body transformation that moves objects without deformation. That is, every
point on the object is translated by the same amount. Figure 2.2 illustrates the application of a
specified translation vector to move an object from one position to another.


Figure 2.2: Moving a polygon from position (a) to position (b) with the
translation vector (-5.50, 3.75)

Two-Dimensional Rotation
Rotation transformation of an object is generated by specifying a rotation axis and a rotation
angle. All points of the object are then transformed to new positions by rotating the points
through the specified angle about the rotation axis.

A two-dimensional rotation of an object is obtained by repositioning the object along a circular


path in the xy plane. Parameters for the two-dimensional rotation are the rotation angle θ and a
position(xr,yr), called the rotation point (or pivot point), about which the object is to be rotated.


A positive value for the angle θ defines a counterclockwise rotation about the pivot point. A
negative value for the angle θ rotates objects in the clockwise direction.

Figure 2.3: Rotation of an object through angle θ about the pivot point (xr ,yr )

1) Determination of the transformation equations for rotation of a point position P


when the pivot point is at the coordinate origin.

The angular and coordinate relationships of the original and transformed point positions are
shown in Figure 2.4.

Figure 2.4: Rotation of a point from position (x, y) to position (x′, y′) through an angle θ relative to the coordinate origin. The original angular displacement of the point from the x axis is φ.


r is the constant distance of the point from the origin, angle φ is the original angular position of
the point from the horizontal, and θ is the rotation angle.

Using standard trigonometric identities, transformed coordinates can be expressed in terms of


angles of θ and φ:

x′ = r cos(φ + θ) = r cos φ cos θ − r sin φ sin θ   (4)
y′ = r sin(φ + θ) = r cos φ sin θ + r sin φ cos θ

The original coordinates of the point in polar coordinates are

x = r cos φ ,  y = r sin φ   (5)

Substituting expressions 5 into 4, we obtain the transformation equations for rotating a point at position (x, y) through an angle θ about the origin:

x′ = x cos θ − y sin θ   (6)
y′ = x sin θ + y cos θ

The rotation equation in matrix form is written as

P′ = R · P   (7)

where the rotation matrix is

R = | cos θ   −sin θ |
    | sin θ    cos θ |   (8)

2) Determination of the transformation equations for rotation of a point position P


when the pivot point is at (xr,yr).

Rotation of a point about an arbitrary pivot position is illustrated in Figure 2.5


Figure 2.5: Rotating a point from position (x, y) to position (x′, y′) through an angle θ about rotation point (xr, yr)

Using the trigonometric relationships indicated by the two right triangles in this figure, we can
generalize Equations 6 to obtain the transformation equations for rotation of a point about any
specified rotation position (xr, yr):

x=xr+(x-xr)cos - (y-yr)sin (9)

y=yr+(x-xr) sin + (y-yr)cos

Two-Dimensional Scaling

To alter the size of an object, a scaling transformation is used. A two-dimensional scaling operation is performed by multiplying object positions (x, y) by scaling factors sx and sy to produce the transformed coordinates (x′, y′):

x′ = x · sx ,  y′ = y · sy   (10)

Scaling factor sx scales an object in the x direction, while sy scales in the y direction. The basic two-dimensional scaling equations 10 can also be written in the following matrix form:

| x′ |   | sx   0  |   | x |
| y′ | = | 0    sy | · | y |   (11)


or

P′ = S · P   (12)

where S is the 2 × 2 scaling matrix in Equation 11

Any positive values can be assigned to the scaling factors sx and sy. Values less than 1 reduce the
size of objects. Values greater than 1 produce enlargements. Specifying a value of 1 for both sx
and sy leaves the size of objects unchanged. When sx and sy are assigned the same value, a
uniform scaling is produced, which maintains relative object proportions. Unequal values for sx
and sy result in a differential scaling.

Figure 2.6: Turning a square (a) into a rectangle (b) with scaling factors sx = 2 and
sy = 1.

Figure 2.7 illustrates scaling of a line by assigning the value 0.5 to both sx and sy.
Both the line length and the distance from the origin are reduced by a factor of ½.


Figure 2.7: A line scaled with Equation 12 using sx = sy = 0.5 is reduced in size and moved
closer to the coordinate origin

The location of a scaled object is controlled by choosing a position, called the fixed point, that is
to remain unchanged after the scaling transformation. Coordinates for the fixed point, (xf,yf), are
often chosen at some object position, such as its centroid but any other spatial position can be
selected.

Objects are now resized by scaling the distances between object points and the fixed point (Figure 2.8). For a coordinate position (x, y), the scaled coordinates (x′, y′) are then calculated from the following relationships:

x′ − xf = (x − xf) sx   (13)
y′ − yf = (y − yf) sy

The above equation can be rewritten to separate the multiplicative and additive terms as

x=x.sx+xf (1-sx) (14)

y=y.sy+yf(1-sy)

The additive terms xf(1-sx) and yf(1-sy) are constants for all points in the object.
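A sketch of fixed-point scaling based on Equations 14 (the function name is illustrative):

/* Scale the point (x, y) by (sx, sy) relative to the fixed point (xf, yf). */
void scalePoint (float *x, float *y, float xf, float yf, float sx, float sy)
{
*x = (*x) * sx + xf * (1 - sx); /* x' = x*sx + xf(1 - sx) */
*y = (*y) * sy + yf * (1 - sy); /* y' = y*sy + yf(1 - sy) */
}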


Figure 2.8: Scaling relative to a chosen fixed point(xf , yf ). The distance from each polygon
vertex to the fixed point is scaled by Equations 13

Matrix Representations and Homogeneous Coordinates

Each of the three basic two-dimensional transformations (translation, rotation, and scaling) can
be expressed in the general matrix form.

P’=M1.P+M2 (15)

With coordinate positions P and P’ represented as column vectors. Matrix M1 is a 2 × 2 array


containing multiplicative factors, and M2 is a two-element column matrix containing
translational terms.

For translation, M1 is the identity matrix. For rotation or scaling, M2 contains the translational
terms associated with the pivot point or scaling fixed point.

Homogeneous Coordinates

Multiplicative and translational terms for a two-dimensional geometric transformation can be


combined into a single matrix, if representations are expanded to 3 × 3 matrices. The third
column of a transformation matrix can be used for the translation terms, and all transformation
equations can be expressed as matrix multiplications.


The matrix representation for a two-dimensional coordinate position is expanded to a three-


element column matrix. Each two dimensional coordinate-position representation(x,y) is
expanded to a three-element representation (xh,yh,h) called homogeneous coordinates.

The homogeneous parameter h is a nonzero value such that

x=xh/h (16)

y=yh/h

A general two-dimensional homogeneous coordinate representation could also be written as


(h· x, h· y, h).
A convenient choice is simply to set h = 1. Each two-dimensional position is then represented
with homogeneous coordinates (x,y,1). The term homogeneous coordinates is used in
mathematics to refer to the effect of this representation on Cartesian equations.

Expressing positions in homogeneous coordinates allows us to represent all


geometric transformation equations as matrix multiplications, which is the standard method used
in graphics systems.
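For example, applying a 3 × 3 homogeneous-coordinate matrix to a two-dimensional position (with h = 1) is a small matrix-vector product; a sketch, assuming the matrix is stored in row-major order:

/* Apply the 3 x 3 homogeneous matrix m to the point (x, y, 1). */
void transformPoint (const float m[3][3], float *x, float *y)
{
float xNew = m[0][0] * (*x) + m[0][1] * (*y) + m[0][2];
float yNew = m[1][0] * (*x) + m[1][1] * (*y) + m[1][2];
*x = xNew;
*y = yNew;
}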

Two-Dimensional Translation Matrix

The homogeneous-coordinate expression for translation is given by

| x′ |   | 1   0   tx |   | x |
| y′ | = | 0   1   ty | · | y |   (17)
| 1  |   | 0   0   1  |   | 1 |

This translation operation can be written in the abbreviated form

P=T(tx,ty).P (18)

with T(tx, ty) as the 3 × 3 translation matrix in Equation 17


Two-Dimensional Rotation Matrix

Two-dimensional rotation transformation equations about the coordinate origin can be expressed
in the matrix form as

| x′ |   | cos θ   −sin θ   0 |   | x |
| y′ | = | sin θ    cos θ   0 | · | y |   (19)
| 1  |   |   0        0     1 |   | 1 |

or as

P’=R().P (20)

The rotation transformation operator R(θ) is the 3 × 3 matrix with rotation parameter θ.

Two-Dimensional Scaling Matrix

A scaling transformation relative to the coordinate origin can be expressed as the matrix
multiplication.

| x′ |   | sx   0    0 |   | x |
| y′ | = | 0    sy   0 | · | y |   (21)
| 1  |   | 0    0    1 |   | 1 |

P=S(sx,sy).P (22)

The scaling operator S(sx,sy) is the 3x3 matrix with parameters sx and sy.

Inverse Transformations

For translation, inverse matrix is obtained by negating the translation distances. If the two
dimensional translation distances are tx and ty , then inverse translation matrix is


T⁻¹(tx, ty) = | 1   0   −tx |
              | 0   1   −ty |   (23)
              | 0   0    1  |

An inverse rotation is accomplished by replacing the rotation angle by its negative. A two-
dimensional rotation through an angle θ about the coordinate origin has the inverse
transformation matrix

R⁻¹(θ) = | cos θ    sin θ   0 |
         | −sin θ   cos θ   0 |   (24)
         |   0        0     1 |

Negative values for rotation angles generate rotations in a clockwise direction.

The inverse matrix for any scaling transformation is obtained by replacing the scaling parameters
with their reciprocals. For two-dimensional scaling with parameters sx and sy applied relative to
the coordinate origin, the inverse transformation matrix is

S⁻¹(sx, sy) = | 1/sx    0     0 |
              |  0     1/sy   0 |   (25)
              |  0      0     1 |

The inverse matrix generates an opposite scaling transformation.

Two-Dimensional Composite Transformations

Forming products of transformation matrices is referred to as a concatenation, or composition,


of matrices. If two transformations are applied to point position P, the transformed location
would be calculated as:


P=M2.M1.P

P’=M.P (26)

The coordinate position is transformed using the composite matrix M, rather than applying the
individual transformations M1 and then M2.

Composite Two-Dimensional Translations

If two successive translation vectors (t1x, t1y) and (t2x, t2y) are applied to a two dimensional
coordinate position P. The final transformed location P is calculated as

P=T(t2x,t2y).{T(t1x,t1y).P}

P={T(t2x,t2y).T(t1x,t1y)}.P (27)

where P and P are represented as three-element, homogeneous-coordinate column vectors.

The composite transformation matrix for this sequence of translations is

(28)

or

T(t2x,t2y).T(t1x,t1y)=T(t1x+t2x,t1y+t2y) (29)

Composite Two-Dimensional Rotations

Two successive rotations applied to a point P produce the transformed position

P=R(2).{R(1).P}

= {R(2).{R(1).P} (30)

By multiplying the two rotation matrices, it can be verified that two successive rotations are
additive.


R(2).R(1)=R(1+2) (31)

The final rotated coordinates of a point can be calculated with the composite rotation matrix as

P=R(1+2).P (32)

Composite Two-Dimensional Scalings

Concatenating transformation matrices for two successive scaling operations in two dimensions
produces the following composite scaling matrix:

S(s2x, s2y) · S(s1x, s1y) = | s1x · s2x       0        0 |
                            |     0       s1y · s2y   0 |   (33)
                            |     0           0       1 |

or

S(s2x, s2y) · S(s1x, s1y) = S(s1x · s2x, s1y · s2y)   (34)

General Two-Dimensional Pivot-Point Rotation

Two-dimensional rotation about any other pivot point (xr,yr) can be generated by performing the
following sequence of translate-rotate-translate operations:

1. Translate the object so that the pivot-point position is moved to the coordinate origin.

2. Rotate the object about the coordinate origin.

3. Translate the object so that the pivot point is returned to its original position.

The composite transformation matrix for this sequence is obtained with the concatenation


T(xr, yr) · R(θ) · T(−xr, −yr) =
| cos θ   −sin θ   xr (1 − cos θ) + yr sin θ |
| sin θ    cos θ   yr (1 − cos θ) − xr sin θ |   (35)
|   0        0                 1             |

which can be expressed in the form

T(xr, yr) · R(θ) · T(−xr, −yr) = R(xr, yr, θ)   (36)

where T(−xr, −yr) = T⁻¹(xr, yr)

Figure 2.9: A transformation sequence for rotating an object about a specified pivot point
using the rotation matrix R(θ)

General Two-Dimensional Fixed-Point Scaling

To produce a two-dimensional scaling with respect to a selected fixed position (xf,yf), following
sequence is followed.


1. Translate the object so that the fixed point coincides with the coordinate origin.

2. Scale the object with respect to the coordinate origin.

3. Use the inverse of the translation in step (1) to return the object to its original position.

Concatenating the matrices for these three operations produces the required scaling matrix:

T(xf, yf) · S(sx, sy) · T(−xf, −yf) =
| sx   0    xf (1 − sx) |
| 0    sy   yf (1 − sy) |   (37)
| 0    0         1      |

or

T(xf, yf) · S(sx, sy) · T(−xf, −yf) = S(xf, yf, sx, sy)   (38)

Figure 2.10: A transformation sequence for scaling an object with respect to a specified
fixed position using the scaling matrix S(sx, sy )

Other Two-Dimensional Transformations

Basic transformations such as translation, rotation, and scaling are standard components of
graphics libraries. Some packages provide a few additional transformations that are useful in
certain applications.


Two such transformations are reflection and shear.

1. Reflection and
2. Shear.

Reflection

A transformation that produces a mirror image of an object is called a reflection.

For a two-dimensional reflection, this image is generated relative to an axis of reflection


by rotating the object 180◦ about the reflection axis.

Reflection about the line y = 0 (the x axis) is accomplished with the transformation matrix

| 1    0   0 |
| 0   −1   0 |   (39)
| 0    0   1 |

This transformation retains x values, but “flips” the y values of coordinate positions.

The resulting orientation of an object after it has been reflected about the x axis is shown
in Figure 2.11

Figure 2.11: Reflection of an object about the x axis


A reflection about the line x = 0 (the y axis) flips x coordinates while keeping y coordinates the
same. The matrix for this transformation is

| −1   0   0 |
|  0   1   0 |   (40)
|  0   0   1 |

Figure 2.12 illustrates the change in position of an object that has been reflected about
the line x = 0.

Figure 2.12: Reflection of an object about the y axis.

We flip both the x and y coordinates of a point by reflecting relative to an axis that is
perpendicular to the xy plane and that passes through the coordinate origin. The matrix
representation for this reflection is

| −1    0   0 |
|  0   −1   0 |   (41)
|  0    0   1 |


An example of reflection about the origin is shown in Figure 2.13.

Figure 2.13: Reflection of an object relative to the coordinate origin. This transformation
can be accomplished with a rotation in the x y plane about the coordinate origin.

If we choose the reflection axis as the diagonal line y = x , the reflection matrix is

| 0   1   0 |
| 1   0   0 |   (42)
| 0   0   1 |


Figure 2.14: Reflection of an object with respect to the line y = x .

To obtain a transformation matrix for reflection about the diagonal y = x, concatenate matrices
for the transformation sequence:

1. Clockwise rotation by 45◦,

2. Reflection about the x axis

3. Counterclockwise rotation by 45◦


Figure 2.15: Sequence of transformations to produce a reflection about the line y = x : A


clockwise rotation of 45◦ (a), a reflection about the x axis (b), and a counterclockwise
rotation by 45◦ (c)

To obtain a transformation matrix for reflection about the diagonal y = -x, we could concatenate
matrices for the transformation sequence:
1. Clockwise rotation by 45◦,
2. Reflection about the y axis
3. Counterclockwise rotation by 45◦.
The resulting transformation matrix is

|  0   −1   0 |
| −1    0   0 |   (43)
|  0    0   1 |


Figure 2.16: Reflection with respect to the line y = -x .


Shear

A transformation that distorts the shape of an object such that the transformed shape appears as if
the object were composed of internal layers that had been caused to slide over each other is
called a shear.

Two common shearing transformations are those that shift coordinate x values and those that
shift y values.

An x-direction shear relative to the x axis is produced with the transformation Matrix

| 1   shx   0 |
| 0    1    0 |   (44)
| 0    0    1 |

which transforms coordinate positions as

x=x+shx.y (45)

y=y


Any real number can be assigned to the shear parameter shx. A coordinate position (x, y) is then
shifted horizontally by an amount proportional to its perpendicular distance (y value) from the x
axis. Setting parameter shx to the value 2, for example, changes the square in Figure 2.17 into a
parallelogram. Negative values for shx shift coordinate positions to the left.
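For example, with shx = 2 the corner (1, 1) of the unit square moves to (3, 1), whereas points on the x axis (where y = 0) are not moved at all.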

Figure 2.17: A unit square (a) is converted to a parallelogram (b) using the x -direction
shear matrix with shx= 2.

We can generate x-direction shears relative to other reference lines with

| 1   shx   −shx · yref |
| 0    1         0      |   (46)
| 0    0         1      |

Coordinate positions are transformed as

x=x+shx(y-yref) (47)

y=y

A y-direction shear relative to the line x = xref is generated with the transformation matrix

|  1    0        0      |
| shy   1   −shy · xref |   (48)
|  0    0        1      |


which generates transformed coordinates as

x=x (49)

y=y+shy(x-xref)

Figure 2.18: A unit square (a) is transformed to a shifted parallelogram (b) with shx = 0.5
and yref = -1 in the shear matrix 46.

Figure 2.19: A unit square (a) is turned into a shifted parallelogram (b) with parameter
values shy = 0.5 and xref = -1 in the y -direction shearing transformation 48.


Raster Methods for Geometric Transformations


Raster systems store picture information as color patterns in the frame buffer. Therefore, some
simple object transformations can be carried out rapidly by manipulating an array of pixel
values. Few arithmetic operations are needed, so the pixel transformations are particularly
efficient.
Functions that manipulate rectangular pixel arrays are called raster operations and
moving a block of pixel values from one position to another is termed a block transfer, a
bitblt, or a pixblt.
Figure 2.20 illustrates a two-dimensional translation implemented as a block transfer of
a refresh-buffer area

Figure 2.20: Translating an object from screen position (a) to the destination position
shown in (b) by moving a rectangular block of pixel values. Coordinate positions Pmin and
Pmax specify the limits of the rectangular block to be moved, and P0 is the destination
reference position


Rotations in 90-degree increments are accomplished easily by rearranging the elements


of a pixel array. We can rotate a two-dimensional object or pattern 90◦ counterclockwise by
reversing the pixel values in each row of the array, then interchanging rows and columns
A 180◦ rotation is obtained by reversing the order of the elements in each row of the
array, then reversing the order of the rows.
Figure 2.21 demonstrates the array manipulations that can be used to rotate a pixel
block by 90◦ and by 180◦.

Figure 2.21: Rotating an array of pixel values. The original array is shown in (a), the
positions of the array elements after a 90◦ counterclockwise rotation are shown in (b), and
the positions of the array elements after a 180◦ rotation are shown in (c).
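A minimal sketch of the 90° counterclockwise case for an n × n block of color values, stored row by row in a one-dimensional array (illustrative code, not an OpenGL routine), is:

/* Rotate an n x n pixel block 90 degrees counterclockwise: element (r, c) moves to (n-1-c, r). */
void rotateBlock90CCW (int n, const unsigned int *src, unsigned int *dest)
{
int r, c;
for (r = 0; r < n; r++)
for (c = 0; c < n; c++)
dest[(n - 1 - c) * n + r] = src[r * n + c];
}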

For array rotations that are not multiples of 90◦, we need to do some extra processing.
The general procedure is illustrated in Figure 2.22.

Figure 2.22: A raster rotation for a rectangular block of pixels can be accomplished by
mapping the destination pixel areas onto the rotated block.


Each destination pixel area is mapped onto the rotated array and the amount of overlap
with the rotated pixel areas is calculated. A color for a destination pixel can then be computed by
averaging the colors of the overlapped source pixels, weighted by their percentage of area
overlap.
Similar methods can be used to scale a block of pixels. Pixel areas in the original block are
scaled, using specified values for sx and sy, and then mapped onto a set of destination pixels. The
color of each destination pixel is then assigned according to its area of overlap with the scaled
pixel areas. (Figure 2.23)

Figure 2.23: Mapping destination pixel areas onto a scaled array of pixel values. Scaling
factors sx = sy = 0.5 are applied relative to fixed point (xf ,yf ).

OpenGL Raster Transformations

A translation of a rectangular array of pixel-color values from one buffer area to another
can be accomplished in OpenGL as the following copy operation:
glCopyPixels (xmin, ymin, width, height, GL_COLOR);
The first four parameters in this function give the location and dimensions of the pixel
block, and the OpenGL symbolic constant GL_COLOR specifies that it is color values that are to be copied.


A block of RGB color values in a buffer can be saved in an array with the function

glReadPixels (xmin, ymin, width, height, GL_RGB, GL_UNSIGNED_BYTE, colorArray);


If color-table indices are stored at the pixel positions, we replace the constant GL RGB
with GL_COLOR_INDEX.
To rotate the color values, we rearrange the rows and columns of the color array, as
described in the previous section. Then we put the rotated array back in the buffer with
glDrawPixels (width, height, GL_RGB, GL_UNSIGNED_BYTE, colorArray);

A two-dimensional scaling transformation can be performed as a raster operation in


OpenGL by specifying scaling factors and then invoking either glCopyPixels or
glDrawPixels.
For the raster operations, we set the scaling factors with
glPixelZoom (sx, sy);
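For example, the following fragment (with illustrative coordinate values) copies a 64 × 64 block from the lower-left corner of the buffer and displays it at half size at a chosen raster position:
glRasterPos2i (200, 50); // destination reference position
glPixelZoom (0.5, 0.5); // scaling factors sx = sy = 0.5
glCopyPixels (0, 0, 64, 64, GL_COLOR); // copy and scale the block
glPixelZoom (1.0, 1.0); // restore the default zoom factors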
We can also combine raster transformations with logical operations to produce various
effects with the exclusive or operator

OpenGL Functions for Two-Dimensional Geometric Transformations

In the core library of OpenGL, a separate function is available for each of the basic geometric
transformations. OpenGL is designed as a three-dimensional graphics application programming
interface (API), all transformations are specified in three dimensions. Internally, all coordinate
positions are represented as four-element column vectors, and all transformations are represented
using 4 × 4 matrices.

To perform a translation, we invoke the translation routine and set the components for the
three-dimensional translation vector.

In the rotation function, we specify the angle and the orientation for a rotation axis that
intersects the coordinate origin. A scaling function is used to set the three coordinate scaling
factors relative to the coordinate origin. In each case, the transformation routine sets up a 4 × 4
matrix that is applied to the coordinates of objects that are referenced after the transformation
call


Basic OpenGL Geometric Transformations


A 4× 4 translation matrix is constructed with the following routine:
glTranslate* (tx, ty, tz);

Translation parameters tx, ty, and tz can be assigned any real-number values, and the single
suffix code to be affixed to this function is either f (float) or d (double).

For two-dimensional applications, we set tz = 0.0; and a two-dimensional position is represented


as a four-element column matrix with the z component equal to 0.0.
Example: glTranslatef (25.0, -10.0, 0.0);
Using above statement defined coordinate positions is translated 25 units in the x direction and
-10 units in the y direction.

A 4 × 4 rotation matrix is generated with

glRotate* (theta, vx, vy, vz);


where the vector v = (vx, vy, vz) can have any floating-point values for its components.
This vector defines the orientation for a rotation axis that passes through the coordinate origin.

If v is not specified as a unit vector, then it is normalized automatically before the elements of
the rotation matrix are computed.
The suffix code can be either f or d, and parameter theta is to be assigned a rotation angle in
degree.
For example, the statement: glRotatef (90.0, 0.0, 0.0, 1.0);
sets up the matrix for a 90◦ rotation about the z axis.
We obtain a 4 × 4 scaling matrix with respect to the coordinate origin with the following
routine:
glScale* (sx, sy, sz);

The suffix code is again either f or d, and the scaling parameters can be assigned
any real-number values.
Scaling in a two-dimensional system involves changes in the x and y dimensions,
so a typical two-dimensional scaling operation has a z scaling factor of 1.0


Example: glScalef (2.0, -3.0, 1.0);

The above statement produces a matrix that scales by a factor of 2 in the x direction, scales by a
factor of 3 in the y direction, and reflects with respect to the x axis:
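As an example of combining these routines, a two-dimensional rotation of 45° about a pivot point (xr, yr) can be built from the translate-rotate-translate sequence of Equation 36 (the coordinate values here are illustrative):
GLfloat xr = 100.0, yr = 50.0; // pivot point
glMatrixMode (GL_MODELVIEW);
glLoadIdentity ( );
glTranslatef (xr, yr, 0.0); // step 3: move the pivot back to its position
glRotatef (45.0, 0.0, 0.0, 1.0); // step 2: rotate about the z axis
glTranslatef (-xr, -yr, 0.0); // step 1: move the pivot point to the origin
/* Objects specified after these calls are rotated 45 degrees about (xr, yr). */
Because OpenGL concatenates each new matrix onto the right of the current matrix, the calls are issued in the reverse of the order in which they are applied to coordinate positions.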
OpenGL Matrix Operations
The glMatrixMode routine is used to set the projection mode which designates the matrix
that is to be used for the projection transformation.
modelview mode is specified with the following statement
glMatrixMode (GL_MODELVIEW);
which designates the 4×4 modelview matrix as the current matrix
Two other modes that can be set with the glMatrixMode function are the texture
mode and the color mode.
The texture matrix is used for mapping texture patterns to surfaces, and the color
matrix is used to convert from one color model to another. The default argument for the
glMatrixMode function is GL_MODELVIEW.
The identity matrix is assigned to the current matrix using the following function:
glLoadIdentity( );
Other values can be assigned to the elements of the current matrix using
glLoadMatrix* (elements16);

A single-subscripted, 16-element array of floating-point values is specified with parameter
elements16, and a suffix code of either f or d is used to designate the data type. The elements in
this array must be specified in column-major order.
To illustrate this ordering, we initialize the modelview matrix with the following code:
glMatrixMode (GL_MODELVIEW);
GLfloat elems [16];
GLint k;
for (k = 0; k < 16; k++)
elems [k] = float (k);
glLoadMatrixf (elems);

which produces the matrix

    | 0.0  4.0   8.0  12.0 |
    | 1.0  5.0   9.0  13.0 |
    | 2.0  6.0  10.0  14.0 |
    | 3.0  7.0  11.0  15.0 |


A specified matrix can be concatenated with the current matrix as follows:


glMultMatrix* (otherElements16);
The suffix code is either f or d, and parameter otherElements16 is a 16-element,
single-subscripted array that lists the elements of some other matrix in column-major
order.
Assuming that the current matrix is the modelview matrix, which we designate as M, the updated
modelview matrix is computed as
M = M · M'
where M' represents the matrix whose elements are specified by parameter otherElements16 in the
preceding glMultMatrix statement.

The glMultMatrix function can also be used to set up any transformation sequence with
individually defined matrices.
For example,
glMatrixMode (GL_MODELVIEW);
glLoadIdentity ( ); // Set current matrix to the identity.
glMultMatrixf (elemsM2); // Postmultiply identity with matrix M2.
glMultMatrixf (elemsM1); // Postmultiply M2 with matrix M1.
produces the following current modelview matrix:
M = M2 · M1


Three-Dimensional Geometric Transformations

Methods for geometric transformations in three dimensions are extended from two
dimensional methods by including considerations for the z coordinate.
A three-dimensional position, expressed in homogeneous coordinates, is represented as a
four-element column vector.

Three-Dimensional Translation

A position P = (x, y, z) in three-dimensional space is translated to a location P' = (x', y', z') by
adding translation distances tx, ty, and tz to the Cartesian coordinates of P:
x' = x + tx
y' = y + ty
z' = z + tz
Three-dimensional translation operations can be represented in matrix form. The coordinate
positions, P and P', are represented in homogeneous coordinates with four-element column matrices,
and the translation operator T is a 4 × 4 matrix:

    T = | 1  0  0  tx |
        | 0  1  0  ty |
        | 0  0  1  tz |
        | 0  0  0  1  |

so that
P' = T · P


Figure 2.24: Moving a coordinate position with translation vector T = (tx, ty, tz )

An inverse of a three-dimensional translation matrix is obtained by negating the translation
distances tx, ty, and tz.
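The following minimal sketch shows how such a translation matrix can be built directly in the
column-major order expected by glLoadMatrixf or glMultMatrixf; the helper name setTranslation3D
is a hypothetical illustration (not part of the OpenGL API), and negating tx, ty, and tz produces
the inverse matrix just described. The usual OpenGL headers are assumed.

/* Fill a 16-element, column-major array with the 4 x 4 translation matrix T(tx, ty, tz). */
void setTranslation3D (GLfloat m[16], GLfloat tx, GLfloat ty, GLfloat tz)
{
   GLint k;
   for (k = 0; k < 16; k++)
      m[k] = (k % 5 == 0) ? 1.0 : 0.0;    // Start from the identity matrix.
   m[12] = tx;  m[13] = ty;  m[14] = tz;  // Fourth column holds the translation distances.
}

/* Usage: concatenate T with the current modelview matrix. */
GLfloat transMat[16];
setTranslation3D (transMat, 15.0, 20.0, 5.0);
glMatrixMode (GL_MODELVIEW);
glMultMatrixf (transMat);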

Figure 2.25: Shifting the position of a three-dimensional object using translation vector T


Three-Dimensional Rotation
By convention, positive rotation angles produce counterclockwise rotations about a coordinate
axis when we are looking along the positive half of the axis toward the coordinate origin.

Figure 2.26:Positive rotations about a coordinate axis are counterclockwise, when looking
along the positive half of the axis toward the origin.
Three-Dimensional Coordinate-Axis Rotations
The three-dimensional z-axis rotation equations are as follows:
x' = x cos θ - y sin θ
y' = x sin θ + y cos θ          (50)
z' = z


Parameter θ specifies the rotation angle about the z axis, and z-coordinate values are unchanged
by this transformation. In homogeneous-coordinate form, the three-dimensional z-axis rotation
equations are:

P=Rz().P

Figure 2.27: Rotation of an object about the z axis

Figure 2.27 illustrates rotation of an object about z-axis.

Transformation equations for rotations about the other two coordinate axes can be obtained with
a cyclic permutation of the coordinate parameters x, y, and z in equation 50.

x→y→z→x (51)

To obtain the x-axis and y-axis rotation transformations, cyclically replace x with y, y with z,
and z with x.

By substituting permutations 51 in equation 50, the equations for an x-axis rotation are obtained.


The equations for an x-axis rotation are:
y' = y cos θ - z sin θ
z' = y sin θ + z cos θ          (52)
x' = x

A cyclic permutation of coordinates in Equations 52 gives us the transformation equations for a
y-axis rotation:
z' = z cos θ - x sin θ
x' = z sin θ + x cos θ          (53)
y' = y

Figure 2.28: Cyclic permutation of the Cartesian-coordinate axes to produce the three sets
of coordinate-axis rotation equations


Figure 2.29: Rotation of an object about the x axis

Figure 2.30: Rotation of an object about the y axis

Negative values for rotation angles generate rotations in a clockwise direction, and the identity
matrix is produced when we multiply any rotation matrix by its inverse, that is,
R · R^-1 = I
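A minimal sketch of how the z-axis rotation matrix Rz(θ) of equation 50 can be assembled in the
column-major order used by glMultMatrixf is given below; setRotationZ is a hypothetical helper
name, and passing a negated angle yields the inverse (clockwise) rotation.

#include <math.h>

/* Build Rz(theta), with theta given in degrees, in column-major order. */
void setRotationZ (GLfloat m[16], GLfloat thetaDegrees)
{
   GLfloat rad = thetaDegrees * 3.14159265f / 180.0f;
   GLfloat c = cosf (rad), s = sinf (rad);
   GLint k;
   for (k = 0; k < 16; k++)
      m[k] = (k % 5 == 0) ? 1.0f : 0.0f;   // Identity to start.
   m[0] = c;   m[1] = s;                   // First column:  ( cos θ, sin θ, 0, 0)
   m[4] = -s;  m[5] = c;                   // Second column: (-sin θ, cos θ, 0, 0)
}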


General Three-Dimensional Rotations

A rotation matrix for any axis that does not coincide with a coordinate axis can be set up as a
composite transformation involving combinations of translations and the coordinate axis
rotations.

The following transformation sequence is used:

1. Translate the object so that the rotation axis coincides with the parallel coordinate axis

2. Perform the specified rotation about that axis.

3. Translate the object so that the rotation axis is moved back to its original position.

Figure 2.31: Sequence of transformations for rotating an object about an axis that is
parallel to the x axis.


A coordinate position P is transformed with the sequence
P' = T^-1 · Rx(θ) · T · P
where the composite rotation matrix for the transformation is
R(θ) = T^-1 · Rx(θ) · T

When an object is to be rotated about an axis that is not parallel to one of the coordinate axes,
some additional transformations have to be performed.

The required rotation can be accomplished in five steps (an OpenGL sketch of this composite
sequence is given after the list below):

1. Translate the object so that the rotation axis passes through the coordinate origin.

2. Rotate the object so that the axis of rotation coincides with one of the coordinate axes.

3. Perform the specified rotation about the selected coordinate axis.

4. Apply inverse rotations to bring the rotation axis back to its original orientation.

5. Apply the inverse translation to bring the rotation axis back to its original spatial position.
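In OpenGL, glRotatef already performs steps 2 through 4 for any rotation axis that passes through
the coordinate origin, so the five-step sequence reduces to a translate-rotate-translate
composition. A minimal sketch, assuming the rotation axis passes through the point (x1, y1, z1)
with direction (ax, ay, az) and that theta is given in degrees (the variable names are
illustrative only):

glMatrixMode (GL_MODELVIEW);
glTranslatef (x1, y1, z1);        // Step 5: move the axis back (leftmost factor, applied last).
glRotatef (theta, ax, ay, az);    // Steps 2-4: rotate about the axis through the origin.
glTranslatef (-x1, -y1, -z1);     // Step 1: translate the axis to pass through the origin.

Because the rightmost factor in the matrix product is applied to objects first, the translation to
the origin appears last in the code.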


Figure 2.32 : Five transformation steps for obtaining a composite matrix for rotation about
an arbitrary axis, with the rotation axis projected onto the z axis.

Three-Dimensional Scaling

The matrix expression for the three-dimensional scaling transformation of a position P = (x, y, z)
relative to the coordinate origin is given by

    S = | sx  0   0   0 |
        | 0   sy  0   0 |
        | 0   0   sz  0 |
        | 0   0   0   1 |          (54)


The three-dimensional scaling transformation for a point position can be represented as
P' = S · P
where scaling parameters sx, sy, and sz are assigned any positive values.
Explicit expressions for the scaling transformation relative to the origin are
x' = x · sx,    y' = y · sy,    z' = z · sz

Scaling an object with transformation 54 changes the position of the object relative to the
coordinate origin: a parameter value greater than 1 moves a point farther from the origin, and a
parameter value less than 1 moves a point closer to the origin.

Uniform scaling is performed when sx=sy=sz. If the scaling parameters are not all equal, relative
dimensions of a transformed object are changed.

Figure 2.33: Doubling the size of an object with transformation 54 also moves the object
farther from the origin. Scaling parameter is set to 2.

A scaling transformation with respect to any selected fixed position (xf, yf, zf) can be
constructed using the following transformation sequence (an OpenGL sketch of this sequence follows
the steps below):

1. Translate the fixed point to the origin


2. Apply the scaling transformation relative to the coordinate origin

3. Translate the fixed point back to its original position.
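A minimal OpenGL sketch of this fixed-point scaling sequence is shown below; because the matrices
are concatenated by postmultiplication, the translation of the fixed point to the origin appears
last in the code so that it is applied to objects first.

glMatrixMode (GL_MODELVIEW);
glTranslatef (xf, yf, zf);        // Step 3: translate the fixed point back to its position.
glScalef (sx, sy, sz);            // Step 2: scale relative to the coordinate origin.
glTranslatef (-xf, -yf, -zf);     // Step 1: translate the fixed point to the origin.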

Figure 2.34: A sequence of transformations for scaling an object relative to a selected fixed
point

The matrix representation for an arbitrary fixed-point scaling can be expressed as the
concatenation of translate-scale-translate transformations:

    T(xf, yf, zf) · S(sx, sy, sz) · T(-xf, -yf, -zf) = | sx  0   0   (1 - sx) xf |
                                                       | 0   sy  0   (1 - sy) yf |
                                                       | 0   0   sz  (1 - sz) zf |
                                                       | 0   0   0        1      |


Composite Three-Dimensional Transformations

Composite three dimensional transformation can be formed by multiplying the matrix


representations for the individual operations in the transformation sequence. Transformation
sequence can be implemented by concatenating the individual matrices from right to left or from
left to right, depending on the order in which the matrix representations are specified. Rightmost
term in a matrix product is always the first transformation to be applied to an object. Leftmost
term is always the last transformation. Coordinate positions are represented as four-element
column vectors, which are premultiplied by the composite 4 × 4 transformation matrix.

Other Three-Dimensional Transformations

Three-Dimensional Reflections

A reflection in a three-dimensional space can be performed relative to a selected reflection axis
or with respect to a reflection plane. In general, three-dimensional reflection matrices are set up
similarly to those for two dimensions. Reflection relative to a given axis is equivalent to a 180°
rotation about that axis.

When the reflection plane is a coordinate plane (xy, xz, or yz), the transformation can be thought
of as a 180° rotation in four-dimensional space with a conversion between a left-handed frame and a
right-handed frame.

An example of a reflection that converts coordinate specifications from a right handed system to
a left-handed system is shown in Figure 2.35.


Figure 2.35: Conversion of coordinate specifications between a right-handed and a left-


handed system can be carried out with the reflection transformation 55

The matrix representation for this reflection relative to the xy plane is

    | 1  0   0  0 |
    | 0  1   0  0 |
    | 0  0  -1  0 |
    | 0  0   0  1 |          (55)

In this transformation, the sign of the z coordinate is changed, but the values of the x and y
coordinates remain unchanged.
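In OpenGL, this particular reflection can be obtained simply by using a negative scaling factor,
since a scaling matrix with sz = -1 is identical to matrix 55. A minimal sketch:

glMatrixMode (GL_MODELVIEW);
glScalef (1.0, 1.0, -1.0);        // Reflect subsequently defined objects about the xy plane.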

Three-Dimensional Shears

These transformations can be used to modify object shapes. In three dimensions, shears can be
generated relative to the z axis. A general z-axis shearing transformation relative to a selected
reference position zref is produced with the following matrix:

    | 1  0  shzx  -shzx · zref |
    | 0  1  shzy  -shzy · zref |
    | 0  0   1          0      |
    | 0  0   0          1      |          (56)


Shearing parameters shzx and shzy can be assigned any real values. This transformation matrix
alters the values of the x and y coordinates by an amount that is proportional to the distance from
zref, while the z-coordinate value remains unchanged. Plane areas that are perpendicular to the z
axis are shifted by an amount proportional to z - zref.

Figure 2.36: A unit cube (a) is sheared relative to the origin (b) by Matrix 56, with shzx =
shzy = 1 . Reference position zref =0

OpenGL Geometric-Transformation Functions


OpenGL Matrix Stacks

The glMatrixMode routine specifies which matrix is the current matrix.

There are four modes:

1. Modelview

2. Projection

3. Texture

4. Color

For each mode, OpenGL maintains a matrix stack. Initially, each stack contains only the identity
matrix. At any time during the processing of a scene, the top matrix on each stack is called the
“current matrix” for that mode. After we specify the viewing and geometric transformations, the


top of the modelview matrix stack is the 4 × 4 composite matrix that combines the viewing
transformations and the various geometric transformations that we want to apply to a scene.

OpenGL supports a modelview stack depth of at least 32.

glGetIntegerv(GL_MAX_MODELVIEW_STACK_DEPTH, stackSize);

The above function determines the number of positions available in the modelview stack for a
particular implementation of OpenGL. It returns a single integer value to the array stackSize.

We can also find out how many matrices are currently in the stack with

glGetIntegerv (GL_MODELVIEW_STACK_DEPTH, numMats);

Other OpenGL symbolic constants are

1. GL_MAX_PROJECTION_STACK_DEPTH

2. GL_MAX_TEXTURE_STACK_DEPTH

3. GL_MAX_COLOR_STACK_DEPTH

There are two functions available in OpenGL for processing the matrices in a stack

glPushMatrix( );

Copies the current matrix at the top of the active stack and stores that copy in the second stack
position.

glPopMatrix( );

Destroys the matrix at the top of the stack, and the second matrix in the stack becomes the
current matrix.
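A minimal sketch of how these two stack functions are typically used is given below: each object
receives its own transformation without disturbing the matrix in effect for the other objects. The
routines drawHouse and drawCar are hypothetical display procedures, not OpenGL functions.

glMatrixMode (GL_MODELVIEW);

glPushMatrix ( );                    // Save the current modelview matrix.
glTranslatef (10.0, 0.0, 0.0);
drawHouse ( );
glPopMatrix ( );                     // Restore the saved matrix.

glPushMatrix ( );
glRotatef (30.0, 0.0, 0.0, 1.0);
drawCar ( );
glPopMatrix ( );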


VTU Previous Year Questions

1. Explain general two dimensional pivot point rotation and derive the composite matrix.
2. Explain translation, rotation, scaling in 2D homogeneous coordinate system with matrix
representations.
3. What are the entities required to perform a rotation? Show that two successive rotations
are additive.
4. Explain with illustrations the basic 2-dimension geometric transformations used in
computer graphics.
5. What is the need of homogeneous coordinates ? Give 2-dimensional homogeneous
coordinate matrix for translation, rotation and scaling.
6. Obtain a matrix representation for rotation of an object about a specified pivot point in two
dimensions.
7. Obtain the matrix representation for rotation of an object about an arbitrary axis.
8. Prove that 2 successive 2D rotation are additive.
9. Prove that successive scaling are multiplicative.
10. Develop composite homogeneous transformation matrix to rotate an object with respect
to pivot point. For the triangle A(3,2) B(6,2) C(6,6) rotate it in anticlockwise direction by
90 degree keeping A(3,2) fixed. Draw the new polygon.
11. With the help of the diagram explain shearing and reflection transformation technique.
12. Give the reason to convert transformation matrix to homogeneous co-ordinate
representation and show the process of conversion. Shear the polygon A(1,1), B(3,1),
C(3,3), D(2,4), E(1,3) along x-axis with a shearing factor of 0.2.
13. With the help of suitable diagram explain basic 3D Geometric transformation techniques
and give the transformation matrix.
14. Design transformation matrix to rotate an 3D object about an axis that is parallel to one of
the co-ordinate axes ?
15. Describe 3D translation and scaling.
16. Describe any two of dimensional composite transformation

i. 2D translation ii) 2D fixed point scaling


17. Explain translation, rotation and scaling of 2D transformation with suitable diagrams,
code and matrix.
18. Explain OpenGL raster transformations and OpenGL geometric transformation functions.
19. Explain any two of the 3D geometrical transformation.
20. Scale the given triangle A(3,2), B(6,2), C(6,6) using the scaling factors sx=1/3 and sy=1/2
about the point A(3,2). Draw the original and scaled object.
21. Explain shear and reflection transformation technique.
22. What is concatenation of transformation? Explain the following consider 2D
i. Rotation about a fixed point
ii. Scaling about a fixed point
23. Define the following two dimensional transformations. Translation, rotation, scaling,
reflection and shearing. Give example for each.


MODULE-3
Graphical Input Data
Graphics programs use several kinds of input data, such as coordinate positions, attribute values,
character-string specifications, geometric-transformation values, viewing conditions, and
illumination parameters. Many graphics packages, including the International Standards
Organization (ISO) and American National Standards Institute (ANSI) standards, provide an
extensive set of input functions for processing such data. But input procedures require interaction
with display-window managers and specific hardware devices. Therefore, some graphics
systems, particularly those that provide mainly device-independent functions, often include
relatively few interactive procedures for dealing with input data.

A standard organization for input procedures in a graphics package is to classify the functions
according to the type of data that is to be processed by each function. This scheme allows any
physical device, such as a keyboard or a mouse, to input any data class.

Logical Classification of Input Devices


When input functions are classified according to data type, any device that is used to provide the
specified data is referred to as a logical input device for that data type. The standard logical
input-data classifications are

1. LOCATOR: A device for specifying one coordinate position.


2. STROKE: A device for specifying a set of coordinate positions.
3. STRING: A device for specifying text input.
4. VALUATOR: A device for specifying a scalar value.
5. CHOICE: A device for selecting a menu option.
6. PICK: A device for selecting a component of a picture.

1. Locator Devices

Interactive selection of a coordinate point is usually accomplished by positioning the screen


cursor at some location in a displayed scene. Mouse, touchpad, joystick, trackball, spaceball,
thumbwheel, dial, hand cursor, or digitizer stylus can be used for screen-cursor positioning.


Keyboards are used for locator input in several ways. A general-purpose keyboard usually has
four cursor-control keys that move the screen cursor up, down, left, and right. With an additional
four keys, the cursor can be moved diagonally. Rapid cursor movement is accomplished by holding
down the selected cursor key. Sometimes a keyboard includes a touchpad, joystick, trackball, or
other device for positioning the screen cursor. For some applications, it may also be convenient
to use a keyboard to type in numerical values or other codes to indicate coordinate positions.
Other devices, such as a light pen, have also been used for interactive input of coordinate
positions. But light pens record screen positions by detecting light from the screen phosphors,
and this requires special implementation procedures.

2. Stroke Devices

This class of logical devices is used to input a sequence of coordinate positions, and the physical
devices used for generating locator input are also used as stroke devices. Continuous movement
of a mouse, trackball, joystick, or hand cursor is translated into a series of input coordinate
values. The graphics tablet is one of the more common stroke devices. Button activation can be
used to place the tablet into “continuous” mode. As the cursor is moved across the tablet surface,
a stream of coordinate values is generated. This procedure is used in paintbrush systems to
generate drawings using various brush strokes.

3. String Devices

The primary physical device used for string input is the keyboard. Character strings in computer-
graphics applications are typically used for picture or graph labeling. Other physical devices can
be used for generating character patterns for special applications. Individual characters can be
sketched on the screen using a stroke or locator-type device. A pattern recognition program then
interprets the characters using a stored dictionary of predefined patterns.

4. Valuator Devices

Valuator input can be employed in a graphics program to set scalar values for geometric
transformations, viewing parameters, and illumination parameters. In some applications, scalar
input is also used for setting physical parameters such as temperature, voltage, or stress-strain
factors. A typical physical device used to provide valuator input is a panel of control dials. Dial
settings are calibrated to produce numerical values within some predefined range. Rotary


potentiometers convert dial rotation into a corresponding voltage, which is then translated into a
number within a defined scalar range, such as -10.5 to 25.5. Instead of dials, slide potentiometers
are sometimes used to convert linear movements into scalar values.

Any keyboard with a set of numeric keys can be used as a valuator device. Joysticks, trackballs,
tablets, and other interactive devices can be adapted for valuator input by interpreting pressure or
movement of the device relative to a scalar range. For one direction of movement, say left to
right, increasing scalar values can be input. Movement in the opposite direction decreases the
scalar input value. Selected values are usually echoed on the screen for verification. Another
technique for providing valuator input is to display graphical representations of sliders, buttons,
rotating scales, and menus on the video monitor. Cursor positioning, using a mouse, joystick,
spaceball, or other device, can be used to select a value on one of these valuators.

5. Choice Devices

Menus are typically used in graphics programs to select processing options, parameter values,
and object shapes that are to be used in constructing a picture. Commonly used choice devices
for selecting a menu option are cursor-positioning devices such as a mouse, trackball,
keyboard, touch panel, or button box.

Keyboard function keys or separate button boxes are often used to enter menu selections. Each
button or function key is programmed to select a particular operation or value, although preset
buttons or keys are sometimes included on an input device.

For screen selection of listed menu options, a cursor-positioning device is used. When a screen-
cursor position (x, y) is selected, it is compared to the coordinate extents of each listed menu
item. A menu item with vertical and horizontal boundaries at the coordinate values xmin, xmax,
ymin, and ymax is selected if the input coordinates satisfy the inequalities

xmin ≤ x ≤ xmax, ymin ≤ y ≤ ymax (1)
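A direct translation of inequality 1 into code might look like the following sketch, where
menuItemHit is a hypothetical helper that returns 1 when the cursor position lies within a menu
item's coordinate extents:

/* Returns 1 if the selected screen position (x, y) lies inside the menu-item extents. */
int menuItemHit (int x, int y, int xmin, int xmax, int ymin, int ymax)
{
   return (xmin <= x && x <= xmax && ymin <= y && y <= ymax);
}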

For larger menus with relatively few options displayed, a touch panel is commonly used. A
selected screen position is compared to the coordinate extents of the individual menu options to
determine what process is to be performed.


Alternate methods for choice input include keyboard and voice entry. A standard keyboard can
be used to type in commands or menu options. For this method of choice input, some abbreviated
format is useful. Menu listings can be numbered or given short identifying names. A similar
encoding scheme can be used with voice input systems. Voice input is particularly useful when
the number of options is small (20 or fewer).

6. Pick Devices

Pick device is used to select a part of a scene that is to be transformed or edited in some way.
Several different methods can be used to select a component of a displayed scene, and any input
mechanism used for this purpose is classified as a pick device.

Most often, pick operations are performed by positioning the screen cursor. Using a mouse,
joystick, or keyboard, for example, we can perform picking by positioning the screen cursor and
pressing a button or key to record the pixel coordinates. This screen position can then be used to
select an entire object, a facet of a tessellated surface, a polygon edge, or a vertex. Other pick
methods include highlighting schemes, selecting objects by name, or a combination of
methods.

Using the cursor-positioning approach, a pick procedure could map a selected screen position to
a world-coordinate location using the inverse viewing and geometric transformations that were
specified for the scene. Then, the world coordinate position can be compared to the coordinate
extents of objects. If the pick position is within the coordinate extents of a single object, the pick
object has been identified. The object name, coordinates, or other information about the object
can then be used to apply the desired transformation or editing operations. But if the pick
position is within the coordinate extents of two or more objects, further testing is necessary.
Depending on the type of object to be selected and the complexity of a scene, several levels of
search may be required to identify the pick object.

When coordinate-extent tests do not uniquely identify a pick object, the distances from the pick
position to individual line segments could be computed. Figure 3.1 illustrates a pick position that
is within the coordinate extents of two line segments. For a two-dimensional line segment with
pixel endpoint coordinates (x1, y1) and (x2, y2), the perpendicular distance squared from a pick
position (x, y) to the line is calculated as

d^2 = [Δx (y - y1) - Δy (x - x1)]^2 / (Δx^2 + Δy^2)

where Δx = x2 - x1 and Δy = y2 - y1.

Figure 3.1: Distances to line segments from a pick position.
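A minimal sketch of this distance computation is given below; pickDistSq is a hypothetical helper,
and the two endpoints are assumed to be distinct so that the denominator is nonzero.

/* Perpendicular distance squared from pick position (x, y) to the line
   through (x1, y1) and (x2, y2), as in the formula above.               */
float pickDistSq (float x, float y, float x1, float y1, float x2, float y2)
{
   float dx = x2 - x1, dy = y2 - y1;
   float num = dx * (y - y1) - dy * (x - x1);
   return (num * num) / (dx * dx + dy * dy);
}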

Another picking technique is to associate a pick window with a selected cursor position. The
pick window is centered on the cursor position, as shown in Figure 3.2, and clipping procedures
are used to determine which objects intersect the pick window. For line picking, we can set the
pick-window dimensions w and h to very small values, so that only one line segment intersects
the pick window.

Figure 3.2: A pick window with center coordinates ( xp, yp), width w, and height h.


Highlighting can also be used to facilitate picking. Successively highlight those objects whose
coordinate extents overlap a pick position (or pick window). As each object is highlighted, a user
could issue a “reject” or “accept” action using keyboard keys. The sequence stops when the user
accepts a highlighted object as the pick object. Picking could also be accomplished simply by
successively highlighting all objects in the scene without selecting a cursor position.

If picture components can be selected by name, keyboard input can be used to pick an object.
This is a straightforward, but less interactive, pick-selection method. Some graphics packages
allow picture components to be named at various levels down to the individual primitives.
Descriptive names can be used to help a user in the pick process, but this approach has
drawbacks. It is generally slower than interactive picking on the screen, and a user will probably
need prompts to remember the various structure names.

Input Functions for Graphical Data


Graphics packages that use the logical classification for input devices provide several functions
for selecting devices and data classes. These functions allow a user to specify the following
options:

1. The input interaction mode for the graphics program and the input devices. Either the
program or the devices can initiate data entry, or both can operate simultaneously.
2. Selection of a physical device that is to provide input within a particular logical
classification (for example, a tablet used as a stroke device).
3. Selection of the input time and device for a particular set of data values.

Input Modes

Some input functions in an interactive graphics system are used to specify how the program and
input devices should interact. A program could request input at a particular time in the
processing (request mode), or an input device could independently provide updated input
(sample mode), or the device could independently store all collected data (event mode).


1) Request Mode

In request mode, the application program initiates data entry. When input values are requested,
processing is suspended until the required values are received. This input mode corresponds to
the typical input operation in a general programming language. The program and the input
devices operate alternately. Devices are put into a wait state until an input request is made; then
the program waits until the data are delivered.

2) Sample Mode

In sample mode, the application program and input devices operate independently. Input devices
may be operating at the same time that the program is processing other data. New values
obtained from the input devices replace previously input data values. When the program requires
new data, it samples the current values that have been stored from the device input.

3) Event Mode

In event mode, the input devices initiate data input to the application program. The program and
the input devices again operate concurrently, but now the input devices deliver data to an input
queue, also called an event queue. All input data is saved. When the program requires new data,
it goes to the data queue.

Typically, any number of devices can be operating at the same time in sample and event modes.
Some can be operating in sample mode, while others are operating in event mode. But only one
device at a time can deliver input in request mode.
Echo Feedback

Requests can usually be made in an interactive input program for an echo of input data and
associated parameters. When an echo of the input data is requested, it is displayed within a
specified screen area.

Echo feedback can include:

 The size of the pick window


 The minimum pick distance


 The type and size of a cursor


 The type of highlighting to be employed during pick operations
 The range (minimum and maximum) for valuator input
 The resolution (scale) for valuator input.

Callback Functions

For device-independent graphics packages, a limited set of input functions can be provided in an
auxiliary library. Input procedures can then be handled as callback functions that interact with
the system software. These functions specify what actions are to be taken by a program when an
input event occurs. Typical input events are moving a mouse, pressing a mouse button, or
pressing a key on the keyboard.

Interactive Picture-Construction Techniques
A variety of interactive methods are often incorporated into a graphics package as aids in the
construction of pictures. Routines can be provided for positioning objects, applying constraints,
adjusting the sizes of objects, and designing shapes and patterns.

1) Basic Positioning Methods

We can interactively choose a coordinate position with a pointing device that records a screen
location. How the position is used depends on the selected processing option. The coordinate
location could be an endpoint position for a new line segment, or it could be used to position
some object—for instance, the selected screen location could reference a new position for the
center of a sphere; or the location could be used to specify the position for a text string, which
could begin at that location or it could be centered on that location. As an additional positioning
aid, numeric values for selected positions can be echoed on the screen. With the echoed
coordinate values as a guide, a user could make small interactive adjustments in the coordinate
values using dials, arrow keys, or other devices.

2) Dragging
Another interactive positioning technique is to select an object and drag it to a new location.
Using a mouse, for instance, we position the cursor at the object position, press a mouse button,


move the cursor to a new position, and release the button. The object is then displayed at the new
cursor location. Usually, the object is displayed at intermediate positions as the screen cursor
moves.

3) Constraints

Any procedure for altering input coordinate values to obtain a particular orientation or alignment
of an object is called a constraint. For example, an input line segment can be constrained to be
horizontal or vertical, as illustrated in Figures 3.3 and 3.4. To implement this type of constraint,
we compare the input coordinate values at the two endpoints. If the difference in the y values of
the two endpoints is smaller than the difference in the x values, a horizontal line is displayed.
Otherwise, a vertical line is drawn.

Figure 3.3: Horizontal line constraint.


Figure 3.4: Vertical line constraint.

Other kinds of constraints can be applied to input coordinates to produce a variety of alignments.
Lines could be constrained to have a particular slant, such as 45◦, and input coordinates could be
constrained to lie along predefined paths, such as circular arcs.

4) Grids

Another kind of constraint is a rectangular grid displayed in some part of the screen area. With
an activated grid constraint, input coordinates are rounded to the nearest grid intersection. Figure
3.5 illustrates line drawing using a grid. Each of the cursor positions in this example is shifted to
the nearest grid intersection point, and a line is drawn between these two grid positions. Grids
facilitate object constructions, because a new line can be joined easily to a previously drawn line
by selecting any position near the endpoint grid intersection of one end of the displayed line.
Spacing between grid lines is often an option, and partial grids or grids with different spacing
could be used in different screen areas.


Figure 3.5: Construction of a line segment with endpoints constrained to grid intersection
positions.

5) Rubber-Band Methods

Line segments and other basic shapes can be constructed and positioned using rubber-band
methods that allow the sizes of objects to be interactively stretched or contracted. Figure 3.6
demonstrates a rubber-band method for interactively specifying a line segment. First, a fixed
screen position is selected for one endpoint of the line. Then, as the cursor moves around, the
line is displayed from the start position to the current position of the cursor. The second endpoint
of the line is input when a button or key is pressed. Using a mouse, a rubber-band
line is constructed while pressing a mouse key. When the mouse key is released, the line display
is completed.


Figure 3.6: A rubber-band method for constructing and positioning a straight-line


segment.

Similar rubber-band methods can be used to construct rectangles, circles, and other objects.
Figure 3.7 demonstrates rubber-band construction of a rectangle, and Figure 3.8 shows a rubber-
band circle construction.

Figure 3.7: A rubber-band method for constructing a rectangle.


Figure 3.8: Constructing a circle using a rubber-band method.

6) Gravity Field

In the construction of figures, we sometimes need to connect lines at positions between


endpoints that are not at grid intersections. Because exact positioning of the screen cursor at the
connecting point can be difficult, a graphics package can include a procedure that converts any
input position near a line segment into a position on the line using a gravity field area around the
line. Any selected position within the gravity field of a line is moved (“gravitated”) to the nearest
position on the line. A gravity field area around a line is illustrated with the shaded region shown
in Figure 3.9.

Gravity fields around the line endpoints are enlarged to make it easier for a designer to connect
lines at their endpoints. Selected positions in one of the circular areas of the gravity field are
attracted to the endpoint in that area. The size of gravity fields is chosen large enough to aid
positioning, but small enough to reduce chances of overlap with other lines. If many lines are
displayed, gravity areas can overlap, and it may be difficult to specify points correctly. Normally,
the boundary for the gravity field is not displayed.


Figure 3.9:A gravity field around a line. Any selected point in the shaded area is shifted to
a position on the line.

7) Interactive Painting and Drawing Methods

Options for sketching, drawing, and painting come in a variety of forms. Curve-drawing options
can be provided using standard curve shapes, such as circular arcs and splines, or with freehand
sketching procedures. Splines are interactively constructed by specifying a set of control points
or a freehand sketch that gives the general shape of the curve. Then the system fits the set of
points with a polynomial curve. In freehand drawing, curves are generated by following the path
of a stylus on a graphics tablet or the path of the screen cursor on a video monitor. Once a curve
is displayed, the designer can alter the curve shape by adjusting the positions of selected points
along the curve path.

Line widths, line styles, and other attribute options are also commonly found in painting and
drawing packages. Various brush styles, brush patterns, color combinations, object shapes, and
surface texture patterns are also available on systems, particularly those designed as artists’
workstations. Some paint systems vary the line width and brush strokes according to the pressure
of the artist’s hand on the stylus.

Virtual-Reality Environments

Interactive input is accomplished in a virtual-reality environment with a data glove, which is
capable of grasping and moving objects displayed in a virtual scene. The computer-generated
scene is displayed through a head-mounted viewing system as a stereographic projection.
Tracking devices compute the position and orientation of the headset and data glove relative to


the object positions in the scene. With this system, a user can move through the scene and
rearrange object positions with the data glove

Another method for generating virtual scenes is to display stereographic projections on a raster
monitor, with the two stereographic views displayed on alternate refresh cycles. The scene is
then viewed through stereographic glasses. Interactive object manipulations can again be
accomplished with a data glove and a tracking device to monitor the glove position and
orientation relative to the position of objects in the scene.

Figure 3.10: Using a head-tracking stereo display, called the BOOM and a Dataglove a
researcher interactively manipulates exploratory probes in the unsteady flow around a
Harrier jet airplane.


OpenGL Interactive Input-Device Functions

Interactive device input in an OpenGL program is handled with routines in the OpenGL Utility
Toolkit (GLUT), because these routines need to interface with a window system. In GLUT, there
are functions to accept input from standard devices, such as a mouse or a keyboard, as well as
from tablets, spaceballs, button boxes, and dials. For each device, a procedure (the callback
function) is specified, and it is invoked when an input event from that device occurs. These
GLUT commands are placed in the main procedure along with the other GLUT statements.

GLUT Mouse Functions

Following function is used to specify (“register”) a procedure that is to be called when the mouse
pointer is in a display window and a mouse button is pressed or released:

glutMouseFunc (mouseFcn);

This mouse callback procedure, named mouseFcn, has four arguments.

void mouseFcn(GLint button, GLint action, GLint xMouse, GLint yMouse)

Parameter button is assigned a GLUT symbolic constant that denotes one of the three mouse
buttons.

Parameter action is assigned a symbolic constant that specifies which button action we want to
use to trigger the mouse activation event. Allowable values for button are GLUT_
LEFT_BUTTON, GLUT_MIDDLE_BUTTON, and GLUT_RIGHT_BUTTON.

Parameter action can be assigned either GLUT_DOWN or GLUT_UP, depending on whether


we want to initiate an action when we press a mouse button or when we release it. When
procedure mouseFcn is invoked, the display-window location of the mouse cursor is returned as
the coordinate position (xMouse, yMouse). This location is relative to the top-left corner of the
display window, so that xMouse is the pixel distance from the left edge of the display window
and yMouse is the pixel distance down from the top of the display window.

By activating a mouse button while the screen cursor is within the display window, we can select
a position for displaying a primitive such as a single point, a line segment, or a fill area.
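For example, a mouse callback that plots a point at each left-button press might be sketched as
follows. This is only an outline under stated assumptions: winHeight (the display-window height,
used to invert yMouse) is a hypothetical global, and the world-coordinate window is assumed to
match the display window in pixels so that glVertex2i can use the mouse coordinates directly.

void mouseFcn (GLint button, GLint action, GLint xMouse, GLint yMouse)
{
   if (button == GLUT_LEFT_BUTTON && action == GLUT_DOWN) {
      glBegin (GL_POINTS);
         glVertex2i (xMouse, winHeight - yMouse);   // Invert yMouse for OpenGL coordinates.
      glEnd ( );
      glFlush ( );
   }
}

/* Registered in main with: glutMouseFunc (mouseFcn); */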


Another GLUT mouse routine that we can use is


glutMotionFunc (fcnDoSomething);

This routine invokes fcnDoSomething when the mouse is moved within the display window
with one or more buttons activated. The function that is invoked has two arguments:

void fcnDoSomething (GLint xMouse, GLint yMouse)

where (xMouse, yMouse) is the mouse location in the display window relative to the top-left
corner, when the mouse is moved with a button pressed.

Some action can be performed when we move the mouse within the display window without
pressing a button:

glutPassiveMotionFunc(fcnDoSomethingElse);

The mouse location is returned to fcnDoSomethingElse as coordinate position (xMouse,


yMouse), relative to the top-left corner of the display window.

GLUT Keyboard Functions

With keyboard input, the following function is used to specify a procedure that
is to be invoked when a key is pressed:

glutKeyboardFunc (keyFcn);

The specified procedure has three arguments:

void keyFcn (GLubyte key, GLint xMouse, GLint yMouse)

Parameter key is assigned a character value or the corresponding ASCII code.


The display-window mouse location is returned as position (xMouse,yMouse)
relative to the top-left corner of the display window. When a designated key is
pressed, mouse location can be used to initiate some action, independently of
whether any mouse buttons are pressed.

For function keys, arrow keys, and other special-purpose keys, following command can be used:


glutSpecialFunc (specialKeyFcn);

The specified procedure has three arguments:

void specialKeyFcn (GLint specialKey, GLint xMouse,GLint yMouse)

Parameter specialKey is assigned an integer-valued GLUT symbolic constant. To select a


function key, one of the constants GLUT_KEY_F1 through GLUT_KEY_F12 is used. For the
arrow keys, constants such as GLUT_KEY_UP and GLUT_KEY_RIGHT is used. Other keys
can be designated using GLUT_KEY_PAGE_DOWN, GLUT_KEY_HOME.
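A minimal sketch of the two keyboard callbacks is given below; the particular responses chosen
(exiting on the q key and printing a message for the up-arrow key) are illustrative only.

#include <stdio.h>
#include <stdlib.h>

void keyFcn (GLubyte key, GLint xMouse, GLint yMouse)
{
   if (key == 'q')                        // Quit the program when 'q' is pressed.
      exit (0);
}

void specialKeyFcn (GLint specialKey, GLint xMouse, GLint yMouse)
{
   if (specialKey == GLUT_KEY_UP)
      printf ("Up-arrow pressed at (%d, %d)\n", xMouse, yMouse);
}

/* Registered in main with:
      glutKeyboardFunc (keyFcn);
      glutSpecialFunc (specialKeyFcn);    */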

GLUT Tablet Functions

Usually, tablet activation occurs only when the mouse cursor is in the display
window. A button event for tablet input is recorded with

glutTabletButtonFunc (tabletFcn);

and the arguments for the invoked function are similar to those for a mouse:

void tabletFcn (GLint tabletButton, GLint action,GLint xTablet, GLint yTablet)

We designate a tablet button with an integer identifier such as 1, 2, 3, and so on, and the button
action is specified with either GLUT_UP or GLUT_DOWN. The returned values xTablet and
yTablet are the tablet coordinates. The number of available tablet buttons can be determined
with the following command

glutDeviceGet (GLUT_NUM_TABLET_BUTTONS);

Motion of a tablet stylus or cursor is processed with the following function:

glutTabletMotionFunc (tabletMotionFcn);

where the invoked function has the form

void tabletMotionFcn (GLint xTablet, GLint yTablet)

The returned values xTablet and yTablet give the coordinates on the tablet
surface.


GLUT Spaceball Functions

Following function is used to specify an operation when a spaceball button is


activated for a selected display window:

glutSpaceballButtonFunc (spaceballFcn);

The callback function has two parameters:

void spaceballFcn (GLint spaceballButton, GLint action)

Spaceball buttons are identified with the same integer values as a tablet, and
parameter action is assigned either the value GLUT_UP or the value
GLUT_DOWN. The number of available spaceball buttons can be determined with a
call to glutDeviceGet using the argument GLUT_NUM_SPACEBALL_BUTTONS

Translational motion of a spaceball, when the mouse is in the display window,


is recorded with the function call

glutSpaceballMotionFunc (spaceballTranlFcn);

The three-dimensional translation distances are passed to the invoked function as, for example:

void spaceballTranslFcn (GLint tx, GLint ty, GLint tz)

A spaceball rotation is recorded with

glutSpaceballRotateFunc (spaceballRotFcn);

The three-dimensional rotation angles are then available to the callback function, as follows:

void spaceballRotFcn (GLint thetaX, GLint thetaY, GLint thetaZ)

GLUT Button-Box Function

Input from a button box is obtained with the following statement:

glutButtonBoxFunc (buttonBoxFcn);

Button activation is then passed to the invoked function:


void buttonBoxFcn (GLint button, GLint action);

The buttons are identified with integer values, and the button action is specified as GLUT_UP or
GLUT_DOWN.

GLUT Dials Function

A dial rotation can be recorded with the following routine:

glutDialsFunc(dialsFcn);

Following callback function is used to identify the dial and obtain the angular amount of
rotation:

void dialsFcn (GLint dial, GLint degreeValue);

Dials are designated with integer values, and the dial rotation is returned as an integer degree
value.

OpenGL Picking Operations

In an OpenGL program, objects can be selected interactively by pointing to screen positions.
However, the picking operations in OpenGL are not straightforward. Basically, picking is
performed using a designated pick window to form a revised view volume. Integer identifiers are
assigned to objects in a scene, and the identifiers for those objects that intersect the revised
view volume are stored in a pick-buffer array. Thus, to use the OpenGL pick features, the following
procedure has to be incorporated into a program:

 Create and display a scene.


 Pick a screen position and, within the mouse callback function, do the following:
o Set up a pick buffer.
o Activate the picking operations (selection mode).
o Initialize an ID name stack for object identifiers.
o Save the current viewing and geometric-transformation matrix.
o Specify a pick window for the mouse input.


o Assign identifiers to objects and reprocess the scene using the revised
view volume. (Pick information is then stored in the pick buffer.)
o Restore the original viewing and geometric-transformation matrix.
o Determine the number of objects that have been picked, and return to
the normal rendering mode.
o Process the pick information.

A pick-buffer array is set up with the following command

glSelectBuffer (pickBuffSize, pickBuffer);

Parameter pickBuffer designates an integer array with pickBuffSize elements. The


glSelectBuffer function must be invoked before the OpenGL picking operations (selection
mode) are activated. An integer information record is stored in pick-buffer array for each object
that is selected with a single pick input. Several records of information can be stored in the pick
buffer, depending on the size and location of the pick window. Each record in the pick buffer
contains the following information:

1. The stack position of the object, which is the number of identifiers in the name stack, up
to and including the position of the picked object.
2. The minimum depth of the picked object.
3. The maximum depth of the picked object.
4. The list of the identifiers in the name stack from the first (bottom) identifier to the
identifier for the picked object.

The integer depth values stored in the pick buffer are the original values in the
range from 0 to 1.0, multiplied by 2^32 - 1.

The OpenGL picking operations are activated with


glRenderMode (GL_SELECT);

The above routine call switches to selection mode. A scene is processed through the viewing
pipeline but not stored in the frame buffer. A record of information for each object that would
have been displayed in the normal rendering mode is placed in the pick buffer. In addition, this


command returns the number of picked objects, which is equal to the number of information
records in the pick buffer. To return to the normal rendering mode (the default), glRenderMode
routine is invoked using the argument GL_RENDER. A third option is the argument GL_
FEEDBACK, which stores object coordinates and other information in a feedback buffer
without displaying the objects.
Following statement is used to activate the integer-ID name stack for the picking operations:
glInitNames ( );
The ID stack is initially empty, and this stack can be used only in selection mode.
To place an unsigned integer value on the stack, following function can be invoked:
glPushName (ID);
This places the value for parameter ID on the top of the stack and pushes the
previous top name down to the next position in the stack.
The top of the stack can be replaced using
glLoadName (ID);
To eliminate the top of the ID stack, following command is used:
glPopName ( );
A pick window within a selected viewport is defined using the following
GLU function:
gluPickMatrix (xPick, yPick, widthPick, heightPick, vpArray);

Parameters xPick and yPick give the double-precision, screen-coordinate


location for the center of the pick window relative to the lower-left corner of
the viewport. When these coordinates are given with mouse input, the mouse
coordinates are relative to the upper-left corner, and thus the
input yMouse value has to be inverted. The double-precision values for the width and height of
the pick window are specified with parameters widthPick and heightPick.
Parameter vpArray designates an integer array containing the coordinate position and size
parameters for the current viewport.
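Putting these pieces together, a pick-handling mouse callback might be organized as in the
following outline. It is only a sketch under stated assumptions: displayObjects and processPicks
are hypothetical routines (the first loads an identifier with glLoadName before drawing each
object, the second examines the pick-buffer records), and the gluOrtho2D call stands in for
whatever projection the program actually uses.

#define PICK_BUFF_SIZE 64

void pickFcn (GLint button, GLint action, GLint xMouse, GLint yMouse)
{
   GLuint pickBuffer [PICK_BUFF_SIZE];
   GLint nPicks, vpArray [4];

   if (button != GLUT_LEFT_BUTTON || action != GLUT_DOWN)
      return;

   glSelectBuffer (PICK_BUFF_SIZE, pickBuffer);   // Set up the pick buffer.
   glRenderMode (GL_SELECT);                      // Activate the picking (selection) operations.
   glInitNames ( );                               // Initialize the ID name stack.
   glPushName (0);                                // Put an initial name on the stack.

   glMatrixMode (GL_PROJECTION);
   glPushMatrix ( );                              // Save the viewing matrix.
   glLoadIdentity ( );
   glGetIntegerv (GL_VIEWPORT, vpArray);
   /* 5 x 5 pick window centered on the cursor; yMouse is inverted. */
   gluPickMatrix ((GLdouble) xMouse, (GLdouble) (vpArray [3] - yMouse), 5.0, 5.0, vpArray);
   gluOrtho2D (0.0, 200.0, 0.0, 150.0);           // Reapply the program's projection here.

   displayObjects ( );                            // Reprocess the scene with object identifiers.

   glMatrixMode (GL_PROJECTION);
   glPopMatrix ( );                               // Restore the viewing matrix.
   nPicks = glRenderMode (GL_RENDER);             // Number of picked objects; back to rendering.

   processPicks (nPicks, pickBuffer);             // Process the pick-buffer records.
   glutPostRedisplay ( );
}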


OpenGL Menu Functions


GLUT contains various functions for adding simple pop-up menus to programs. With these
functions, we can set up and access a variety of menus and associated submenus. The GLUT
menu commands are placed in procedure main along with the other GLUT functions.
Creating a GLUT Menu
A pop-up menu is created with the statement
glutCreateMenu (menuFcn);
where parameter menuFcn is the name of a procedure that is to be invoked when
a menu entry is selected. This procedure has one argument, which is the integer
value corresponding to the position of a selected option.
void menuFcn (GLint menuItemNumber)
The integer value passed to parameter menuItemNumber is then used by
menuFcn to perform an operation. When a menu is created, it is associated with
the current display window.
To specify the options that are to be listed in the menu, a series of statements that list the name
and position for each option is used. These statements have the general form
glutAddMenuEntry (charString, menuItemNumber);

Parameter charString specifies text that is to be displayed in the menu.


The parameter menuItemNumber gives the location for that entry in the menu.
For example, the following statements create a menu with two options:
glutCreateMenu (menuFcn);
glutAddMenuEntry ("First Menu Item", 1);
glutAddMenuEntry ("Second Menu Item", 2);
Next, we must specify a mouse button that is to be used to select a menu option.
This is accomplished with
glutAttachMenu (button);
where parameter button is assigned one of the three GLUT symbolic constants referencing the
left, middle, or right mouse button.
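A complete minimal example combining these calls is sketched below; the action performed by
menuFcn (setting a hypothetical global fillColor and requesting a redisplay) is illustrative only.

GLint fillColor = 0;                    // Hypothetical state changed by the menu.

void menuFcn (GLint menuItemNumber)
{
   if (menuItemNumber == 1)
      fillColor = 1;                    // "First Menu Item" selected.
   else
      fillColor = 2;                    // "Second Menu Item" selected.
   glutPostRedisplay ( );               // Redraw the scene with the new setting.
}

/* In main, after the display window has been created: */
glutCreateMenu (menuFcn);
glutAddMenuEntry ("First Menu Item", 1);
glutAddMenuEntry ("Second Menu Item", 2);
glutAttachMenu (GLUT_RIGHT_BUTTON);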


Creating and Managing Multiple GLUT Menus


When a menu is created, it is associated with the current display window. We can create multiple
menus for a single display window, and we can create different menus for different windows. As
each menu is created, it is assigned an integer identifier, starting with the value 1 for the first
menu created. The integer identifier for a menu is returned by the glutCreateMenu routine, and
we can record this value with a statement such as
menuID = glutCreateMenu (menuFcn);
A newly created menu becomes the current menu for the current display window. To activate a
menu for the current display window, following statement is used.
glutSetMenu (menuID);
This menu then becomes the current menu, which will pop up in the display window when the
mouse button that has been attached to that menu is pressed.
To eliminate a menu following command is used:
glutDestroyMenu (menuID);
If the designated menu is the current menu for a display window, then that window has no menu
assigned as the current menu, even though other menus may exist.
The following function is used to obtain the identifier for the current menu in the current display
window:
currentMenuID = glutGetMenu( );
A value of 0 is returned if no menus exist for this display window or if the previous current menu
was eliminated with the glutDestroyMenu function.
Creating GLUT Submenus
A submenu can be associated with a menu by first creating the submenu using glutCreateMenu,
along with a list of suboptions, and then listing the submenu as an additional option in the main
menu. Submenu can be added to the option list in a main menu (or other submenu) using a
sequence of statements such as
submenuID = glutCreateMenu (submenuFcn);
glutAddMenuEntry ("First Submenu Item", 1);
...


glutCreateMenu (menuFcn);
glutAddMenuEntry ("First Menu Item", 1);
...
...
...
glutAddSubMenu ("Submenu Option", submenuID);
The glutAddSubMenu function can also be used to add the submenu to the current menu.
Modifying GLUT Menus
To change the mouse button that is used to select a menu option, first
cancel the current button attachment and then attach the new button. A button
attachment is cancelled for the current menu with
glutDetachMenu (mouseButton);
where parameter mouseButton is assigned the GLUT constant identifying the
button (left, middle, or right) that was previously attached to the menu.
After detaching the menu from the button, glutAttachMenu is used to attach
it to a different button.
Options within an existing menu can also be changed.
For example, an option in the current menu can be deleted with the function
glutRemoveMenuItem (itemNumber);
where parameter itemNumber is assigned the integer value of the menu option
that is to be deleted.
Designing a Graphical User Interface
A common feature of modern applications software is a graphical user interface (GUI) composed
of display windows, icons, menus, and other features to aid a user in applying the software to a
particular problem. Specialized interactive dialogues are designed so that programming options
are selected using familiar terms within a particular field, such as architectural and engineering
design, drafting, business graphics, geology, economics, chemistry, or physics. Other
considerations for a user interface (whether graphical or not) are the accommodation of various
skill levels, consistency, error handling, and feedback.


The User Dialogue


For any application, the user’s model serves as the basis for the design of the dialogue by
describing what the system is designed to accomplish and what operations are available. It states
the type of objects that can be displayed and how the objects can be manipulated. For example, if
the system is to be used as a tool for architectural design, the model describes how the package
can be used to construct and display views of buildings by positioning walls, doors, windows,
and other building components. A circuit-design program provides electrical or logic symbols
and the positioning operations for adding or deleting elements within a layout. All information in
the user dialogue is presented in the language of the application.
Windows and Icons
Typical GUIs provide visual representations both for the objects that are to be manipulated in an
application and for the actions to be performed on the application objects. In addition to the
standard display-window operations, such as opening, closing, positioning, and resizing, other
operations are needed for working with the sliders, buttons, icons, and menus. Some systems are
capable of supporting multiple window managers so that different window styles can be
accommodated, each with its own window manager, which could be structured for a particular
application. Icons representing objects such as walls, doors, windows, and circuit elements are often
referred to as application icons. The icons representing actions, such as rotate, magnify, scale,
clip, or paste, are called control icons, or command icons.
Accommodating Multiple Skill Levels
Usually, interactive GUIs provide several methods for selecting actions. For example, an option
could be specified by pointing to an icon, accessing a pulldown or pop-up menu, or by typing a
keyboard command. This allows a package to accommodate users that have different skill levels.
A less experienced user may find an interface with a large, comprehensive set of operations to be
difficult to use, so a smaller interface with fewer but more easily understood operations and
detailed prompting may be preferable. A simplified set of menus and options is easy to learn and
remember, and the user can concentrate on the application instead of on the details of the
interface. Simple point-and-click operations are often easiest for an inexperienced user of an
applications package.
Experienced users typically want speed. This means fewer prompts and more input from the
keyboard or with multiple mouse-button clicks. Actions are selected with function keys or with


simultaneous combinations of keyboard keys, because experienced users will remember these
shortcuts for commonly used actions.
Help facilities can be designed on several levels so that beginners can carry on a detailed
dialogue, while more experienced users can reduce or eliminate prompts and messages. Help
facilities can also include one or more tutorial applications, which provide users with an
introduction to the capabilities and use of the system.
Consistency
An important design consideration in an interface is consistency. An icon shape should always
have a single meaning, rather than serving to represent different actions or objects depending on
the context.
Examples of consistency:
 Always placing menus in the same relative positions so that a user does not have to hunt
for a particular option.
 Always using the same combination of keyboard keys for an action.
 Always using the same color encoding so that a color does not have different meanings in
different situations.
Minimizing Memorization
Operations in an interface should also be structured so that they are easy to understand and to
remember. Obscure, complicated, inconsistent, and abbreviated command formats lead to
confusion and reduction in the effective application of the software. One key or button used for
all delete operations, for example, is easier to remember than a number of different keys for
different kinds of delete procedures.

Icons and window systems can also be organized to minimize memorization. Different kinds of
information can be separated into different windows so that a user can identify and select items
easily. Icons should be designed as easily recognizable shapes that are related to application
objects and actions. To select a particular action, a user should be able to select an icon that
resembles that action.
Backup and Error Handling

A mechanism for undoing a sequence of operations is another common feature of an interface,


which allows a user to explore the capabilities of a system, knowing that the effects of a mistake


can be corrected. Typically, systems can undo several operations, thus allowing a user to reset
the system to some specified action. For those actions that cannot be reversed, such as closing an
application without saving changes, the system asks for a verification of the requested operation.

Good diagnostics and error messages help a user to determine the cause of an error. Interfaces
can attempt to minimize errors by anticipating certain actions that could lead to an error; and
users can be warned if they are requesting ambiguous or incorrect actions, such as attempting to
apply a procedure to multiple application objects.
Feedback
Responding to user actions is another important feature of an interface, particularly for an
inexperienced user. As each action is entered, some response should be given. Otherwise, a user
might begin to wonder what the system is doing and whether the input should be reentered.

Feedback can be given in many forms, such as highlighting an object, displaying an icon or
message, and displaying a selected menu option in a different color. When the processing of a
requested action is lengthy, the display of a flashing message, clock, hourglass, or other progress
indicator is important. It may also be possible for the system to display partial results as they are
completed, so that the final display is built up a piece at a time.

Standard symbol designs are used for typical kinds of feedback. A cross, a frowning face, or a
thumbs-down symbol is often used to indicate an error, and some kind of time symbol or a
blinking “at work” sign is used to indicate that an action is being processed.
This type of feedback can be very effective with a more experienced user, but the beginner may
need more detailed feedback that not only clearly indicates what the system is doing but also
what the user should input next.

Clarity is another important feature of feedback. A response should be easily understood, but not
so overpowering that the user’s concentration is interrupted. With function keys, feedback can be
given as an audible click or by lighting up the key that has been pressed. Audio feedback has the
advantage that it does not use up screen space, and it does not divert the user’s attention from the
work area. A fixed message area can be used so that a user always know where to look for


messages, but it may be advantageous in some cases to place feedback messages in the work area
near the cursor.

Echo feedback is often useful, particularly for keyboard input, so that errors can be detected
quickly. Selection of coordinate points can be echoed with a cursor or other symbol that appears
at the selected position.

Design of Animation Sequences


Note:

Computer animation generally refers to any time sequence of visual changes in a picture.

Constructing an animation sequence can be a complicated task, particularly when it involves a


story line and multiple objects, each of which can move in a different way. A basic approach is
to design animation sequences using the following development stages:

1. Storyboard layout
2. Object definitions
3. Key-frame specifications
4. Generation of in-between frames

1. Storyboard Layout

The storyboard is an outline of the action. It defines the motion sequence as a set of basic events
that are to take place. Depending on the type of animation to be produced, the storyboard could
consist of a set of rough sketches, along with a brief description of the movements, or it could
just be a list of the basic ideas for the action. Originally, the set of motion sketches was attached
to a large board that was used to present an overall view of the animation project. Hence, the
name “storyboard.”

2. Object Definitions

An object definition is given for each participant in the action. Objects can be defined in terms
of basic shapes, such as polygons or spline surfaces. In addition, a description is often given of
the movements that are to be performed by each character or object in the story.


3. Key-Frame Specifications

A key frame is a detailed drawing of the scene at a certain time in the animation sequence.
Within each key frame, each object (or character) is positioned according to the time for that
frame. Some key frames are chosen at extreme positions in the action; others are spaced so that
the time interval between key frames is not too great. More key frames are specified for intricate
motions than for simple, slowly varying motions. Development of the key frames is generally the
responsibility of the senior animators, and often a separate animator is assigned to each character
in the animation.

4. Generation of in-between frames

In-betweens are the intermediate frames between the key frames. The total number of frames,
and hence the total number of in-betweens, needed for an animation is determined by the display
media that is to be used. Film requires 24 frames per second, and graphics terminals are
refreshed at the rate of 60 or more frames per second. Typically, time intervals for the motion are
set up so that there are from three to five in-betweens for each pair of key frames. Depending on
the speed specified for the motion, some key frames could be duplicated.
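
A minimal sketch of how in-betweens can be generated, assuming simple linear interpolation of a single object position between two key frames (the structure and function names, and the number of in-betweens, are illustrative only):

typedef struct { float x, y; } Position;

/* Compute nIn in-between positions between the key-frame positions key1 and
   key2 by linear interpolation, storing the results in the array inBetween.  */
void generateInBetweens (Position key1, Position key2, int nIn, Position inBetween[])
{
   int k;
   for (k = 1; k <= nIn; k++) {
      float t = (float) k / (float) (nIn + 1);   /* fraction of the way from key1 to key2 */
      inBetween[k-1].x = key1.x + t * (key2.x - key1.x);
      inBetween[k-1].y = key1.y + t * (key2.y - key1.y);
   }
}

In practice, the same interpolation idea is applied to every animation parameter (orientation, colour, and so on), and smoother easing curves are often used instead of a straight linear blend.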

There are several other tasks that may be required, depending on the application. These
additional tasks include motion verification, editing, and the production and synchronization
of a soundtrack. Many of the functions needed to produce general animations are now
computer-generated.


Figure 3.11: One frame from the award-winning computer-animated short film Luxo Jr.
The film was designed using a key-frame animation system and cartoon animation
techniques to provide lifelike actions of the lamps. Final images were rendered with
multiple light sources and procedural texturing techniques.

Figure 3.12: One frame from the short film Tin Toy, the first computer-animated film to
win an Oscar. Designed using a key-frame animation system, the film also required
extensive facial-expression modeling. Final images were rendered using procedural
shading, self-shadowing techniques, motion blur, and texture mapping.


Traditional Animation Techniques


Film animators use a variety of methods for depicting and emphasizing motion sequences. These
include object deformations, spacing between animation frames, motion anticipation and follow-
through, and action focusing.

One of the most important techniques for simulating acceleration effects, particularly for
nonrigid objects, is squash and stretch.

Figure 3.13 shows how squash and stretch technique is used to emphasize the acceleration and
deceleration of a bouncing ball. As the ball accelerates, it begins to stretch. When the ball hits the
floor and stops, it is first compressed (squashed) and then stretched again as it accelerates and
bounces upwards.

Figure 3.13 : A bouncing-ball illustration of the “squash and stretch” technique for
emphasizing object acceleration.

Another technique used by film animators is timing. Timing refers to the spacing between
motion frames. A slower moving object is represented with more closely spaced frames, and a
faster moving object is displayed with fewer frames over the path of the motion. This effect is
illustrated in Figure 3.14, where the position changes between frames increase as a bouncing ball
moves faster.


Figure 3.14: The position changes between motion frames for a bouncing ball increase as
the speed of the ball increases.

Object movements can also be emphasized by creating preliminary actions that indicate an
anticipation of a coming motion. For example, a cartoon character might lean forward and rotate
its body before starting to run; or a character might perform a “windup” before throwing a ball.
Follow-through actions can be used to emphasize a previous motion. After throwing a ball, a
character can continue the arm swing back to its body; or a hat can fly off a character that is
stopped abruptly. An action also can be emphasized with staging. Staging refers to any method
for focusing on an important part of a scene, such as a character hiding something.

General Computer-Animation Functions


Many software packages have been developed either for general animation design or for
performing specialized animation tasks.

Typical animation functions include

 Managing object motions


 Generating views of objects
 Producing camera motions
 The generation of in-between frames


Some animation packages, such as Wavefront, provide special functions for both
the overall animation design and the processing of individual objects. Others are special-
purpose packages for particular features of an animation, such as a system for generating in-
between frames or a system for figure animation.

A set of routines is often provided in a general animation package for storing and managing the
object database. Object shapes and associated parameters are stored and updated in the database.
Other object functions include those for generating the object motion and those for rendering the
object surfaces. Movements can be generated according to specified constraints using two
dimensional or three-dimensional transformations. Standard functions can then be applied to
identify visible surfaces and apply the rendering algorithms.

Another typical function set simulates camera movements. Standard camera motions are
zooming, panning, and tilting. Finally, given the specification for the key frames, the in-
betweens can be generated automatically.

Computer-Animation Languages

Routines can be developed to design and control animation sequences within a general-purpose
programming language, such as C, C++, Lisp, or Fortran. But several specialized animation
languages have been developed.

Computer animation languages typically include :

 A graphics editor
 A key-frame generator
 An in-between generator
 Standard graphics routines

The graphics editor allows an animator to design and modify object shapes, using spline
surfaces, constructive solid geometry methods, or other representation schemes.

An important task in an animation specification is scene description. Scene description includes


the positioning of objects and light sources, defining the photometric parameters (light-source


intensities and surface illumination properties), and setting the camera parameters (position,
orientation, and lens characteristics).

Another standard function is action specification. Action specification involves the layout of
motion paths for the objects and camera. Usual graphics routines are needed for viewing and
perspective transformations, geometric transformations to generate object movements as a
function of accelerations or kinematic path specifications, visible-surface identification, and the
surface-rendering operations.

Key-frame systems were originally designed as a separate set of animation routines for
generating the in-betweens from the user-specified key frames. Now, these routines are often a
component in a more general animation package. In the simplest case, each object in a scene is
defined as a set of rigid bodies connected at the joints and with a limited number of degrees of
freedom.

Example:

The single-armed robot in Figure 3.15 has 6 degrees of freedom, which are referred to as arm
sweep, shoulder swivel, elbow extension, pitch, yaw, and roll. The number of degrees of
freedom for this robot arm can be extended to 9 by allowing three-dimensional translations for
the base (Figure 3.16). If base rotations are allowed, the robot arm can have a total of 12 degrees
of freedom. The human body, in comparison, has more than 200 degrees of freedom.

Figure 3.15: Degrees of freedom for a stationary, single-armed robot


Figure 3.16: Translational and rotational degrees of freedom for the base of the robot arm

Parameterized systems allow object motion characteristics to be specified as part of the object
definitions. The adjustable parameters control such object characteristics as degrees of freedom,
motion limitations, and allowable shape changes.

Scripting systems allow object specifications and animation sequences to be defined with a
user-input script. From the script, a library of various objects and motions can be constructed.

Character Animation
Animation of simple objects is relatively straightforward. It becomes much more difficult to
create realistic animation of more complex figures such as humans or animals. Consider the
animation of walking or running human (or humanoid) characters. Based upon observations in
their own lives of walking or running people, viewers will expect to see animated characters
move in particular ways. If an animated character’s movement doesn’t match this expectation,
the believability of the character may suffer. Thus, much of the work involved in character
animation is focused on creating believable movements.

Articulated Figure Animation


A basic technique for animating people, animals, insects, and other critters is to model them as
articulated figures. Articulated figures are hierarchical structures composed of a set of rigid
links that are connected at rotary joints (Figure 3.17). Animate objects are modeled as moving
stick figures, or simplified skeletons, that can later be wrapped with surfaces representing skin,
hair, fur, feathers, clothes, or other outer coverings.


The connecting points, or hinges, for an articulated figure are placed at the shoulders, hips,
knees, and other skeletal joints, which travel along specified motion paths as the body moves.
For example, when a motion is specified for an object, the shoulder automatically moves in a
certain way and, as the shoulder moves, the arms move. Different types of movement, such as
walking, running, or jumping, are defined and associated with particular motions for the joints
and connecting links.

Figure 3.17: A simple articulated figure with nine joints and twelve connecting links, not
counting the oval head

A series of walking leg motions, for instance, might be defined as in Figure 3.18. The hip joint is
translated forward along a horizontal line, while the connecting links perform a series of
movements about the hip, knee, and ankle joints. Starting with a straight leg [Figure 3.18(a)], the
first motion is a knee bend as the hip moves forward [Figure 3.18(b)]. Then the leg swings
forward, returns to the vertical position, and swings back, as shown in Figures 3.18(c), (d), and
(e). The final motions are a wide swing back and a return to the straight vertical position, as in
Figures 3.18(f) and (g). This motion cycle is repeated for the duration of the animation as the
figure moves over a specified distance or time interval.


Figure 3.18: Possible motions for a set of connected links representing a walking leg.

As a figure moves, other movements are incorporated into the various joints. A sinusoidal
motion, often with varying amplitude, can be applied to the hips so that they move about on the
torso. Similarly, a rolling or rocking motion can be imparted to the shoulders, and the head can
bob up and down.
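
As a minimal sketch of how the joint hierarchy determines the link positions (two-dimensional forward kinematics for a hip-knee-ankle chain; the link lengths and the angle conventions are assumptions for illustration):

#include <math.h>

/* Given the hip position, the thigh and shin lengths, and the joint angles in
   radians (measured from the downward vertical, with the knee angle relative
   to the thigh), compute the knee and ankle positions for one animation frame. */
void legPositions (float hipX, float hipY, float thighLen, float shinLen,
                   float hipAngle, float kneeAngle,
                   float *kneeX, float *kneeY, float *ankleX, float *ankleY)
{
   *kneeX = hipX + thighLen * sinf (hipAngle);
   *kneeY = hipY - thighLen * cosf (hipAngle);

   *ankleX = *kneeX + shinLen * sinf (hipAngle + kneeAngle);   /* child link inherits */
   *ankleY = *kneeY - shinLen * cosf (hipAngle + kneeAngle);   /* the parent rotation */
}

Varying hipAngle and kneeAngle over time according to a motion cycle such as the one in Figure 3.18 then moves the whole chain.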

Motion Capture

An alternative to determining the motion of a character computationally is to digitally record the


movement of a live actor and to base the movement of an animated character on that
information. This technique, known as motion capture or mo-cap, can be used when the
movement of the character is predetermined (as in a scripted scene). The animated character will
perform the same series of movements as the live actor.

The classic motion capture technique involves placing a set of markers at strategic positions on
the actor’s body, such as the arms, legs, hands, feet, and joints. It is possible to place the markers
directly on the actor, but more commonly they are affixed to a special skintight body suit worn
by the actor. The actor is then filmed performing the scene. Image processing techniques are
then used to identify the positions of the markers in each frame of the film, and their positions
are translated to coordinates. These coordinates are used to determine the positioning of the body
of the animated character. The movement of each marker from frame to frame in the film is
tracked and used to control the corresponding movement of the animated character.


To accurately determine the positions of the markers, the scene must be filmed by multiple
cameras placed at fixed positions. The digitized marker data from each recording can then be
used to triangulate the position of each marker in three dimensions. Typical motion capture
systems will use up to two dozen cameras.

Optical motion capture systems rely on the reflection of light from a marker into the camera.
These can be relatively simple passive systems using photoreflective markers that reflect
illumination from special lights placed near the cameras, or more advanced active systems in
which the markers are powered and emit light.

Non-optical systems rely on the direct transmission of position information from the markers to a
recording device. Some non-optical systems use inertial sensors that provide gyroscope-based
position and orientation information.

Some motion capture systems record more than just the gross movements of the parts of the
actor’s body. It is possible to record even the actor’s facial movements. Often called
performance capture systems, these typically use a camera trained on the actor’s face and small
light-emitting diode (LED) lights that illuminate the face. Small photoreflective markers attached
to the face reflect the light from the LEDs and allow the camera to capture the small movements
of the muscles of the face, which can then be used to create realistic facial animation
on a computer-generated character.

Periodic Motions
When animation is constructed with repeated motion patterns, such as a rotating object, the
motion should be sampled frequently enough to represent the movements correctly. The motion
must be synchronized with the frame-generation rate so that enough frames are displayed per
cycle to show the true motion. Otherwise, the animation may be displayed incorrectly.

A typical example of an undersampled periodic-motion display is the wagon wheel in a Western


movie that appears to be turning in the wrong direction. Figure 3.19 illustrates one complete
cycle in the rotation of a wagon wheel with one red spoke that makes 18 clockwise revolutions
per second. If this motion is recorded on film at the standard motion-picture projection rate of 24
frames per second, then the first five frames depicting this motion would be as shown in Figure


3.20. Because the wheel completes 3/4 of a turn every 1/24 of a second, only one animation
frame is generated per cycle, and the wheel thus appears to be rotating in the opposite
(counterclockwise) direction.

In a computer-generated animation, the sampling rate in a periodic motion can be controlled by


adjusting the motion parameters. The angular increment for the motion of a rotating object can be set
so that multiple frames are generated in each revolution. Thus, a 3° increment for a rotation angle
produces 120 motion steps during one revolution, and a 4° increment generates 90 steps. For
faster motion, larger rotational steps could be used.

The motion of a complex object can be much slower than we want it to be if it takes too long to
construct each frame of the animation.

Another factor that we need to consider in the display of a repeated motion


is the effect of round-off in the calculations for the motion parameters. We can
reset parameter values periodically to prevent the accumulated error from producing erratic
motions. For a continuous rotation, we could reset parameter values once every cycle (360°).
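
A minimal sketch of these two ideas, choosing an angular increment that gives enough frames per revolution and resetting the parameter once every cycle (the increment value is illustrative):

float theta = 0.0f;      /* current rotation angle, in degrees                   */
float dTheta = 3.0f;     /* 3-degree increment: 120 motion steps per revolution  */

void incrementRotation (void)
{
   theta += dTheta;
   if (theta >= 360.0f)
      theta -= 360.0f;   /* reset once per cycle to limit accumulated round-off  */
}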

Figure 3.19: Five positions for a red spoke during one cycle of a wheel motion that is
turning at the rate of 18 revolutions per second.


Figure 3.20: The first five film frames of the rotating wheel in Figure 19 produced at the
rate of 24 frames per second

OpenGL Animation Procedures


Raster operations and color-index assignment functions are available in the core library, and
routines for changing color-table values are provided in GLUT. Other raster-animation
operations are available only as GLUT routines because they depend on the window system in
use. In addition, computer-animation features such as double buffering may not be included in
some hardware systems.

Double-buffering operations, if available, are activated using the following GLUT command:

glutInitDisplayMode (GLUT_DOUBLE);

This provides two buffers, called the front buffer and the back buffer, that we can use
alternately to refresh the screen display. While one buffer is acting as the refresh buffer for the
current display window, the next frame of an animation can be constructed in the other buffer.
We specify when the roles of the two buffers are to be interchanged using

glutSwapBuffers ( );

To determine whether double-buffer operations are available on a system, we can issue the
following query:

glGetBooleanv (GL_DOUBLEBUFFER, status);

A value of GL_TRUE is returned to array parameter status if both front and back
buffers are available on a system. Otherwise, the returned value is GL_FALSE.


For a continuous animation, we can also use

glutIdleFunc (animationFcn);

where parameter animationFcn can be assigned the name of a procedure that is to perform the
operations for incrementing the animation parameters. This procedure is continuously executed
whenever there are no display-window events that must be processed. To disable the
glutIdleFunc, we set its argument to the value NULL or the value 0.
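
Putting these routines together, a typical double-buffered animation loop might look like the following sketch (the drawing code and the rotation parameter are assumptions for illustration, not a prescribed program):

GLfloat rotAngle = 0.0;

void displayFcn (void)
{
   glClear (GL_COLOR_BUFFER_BIT);
   glPushMatrix ( );
   glRotatef (rotAngle, 0.0, 0.0, 1.0);   /* rotate the scene about the z axis  */
   glRectf (-20.0, -5.0, 20.0, 5.0);      /* some object to be animated         */
   glPopMatrix ( );
   glutSwapBuffers ( );                   /* interchange front and back buffers */
}

void animationFcn (void)                  /* idle callback                      */
{
   rotAngle += 2.0;
   if (rotAngle >= 360.0)
      rotAngle -= 360.0;
   glutPostRedisplay ( );                 /* request a redraw of the new frame  */
}

/* In main: glutInitDisplayMode (GLUT_DOUBLE);
            glutDisplayFunc (displayFcn);
            glutIdleFunc (animationFcn);     */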

Question Bank
1. Explain in detail about logical classification of input devices.
2. Explain request mode, sample mode and event mode.
3. Explain in detail about interactive picture construction techniques.
4. Write a note on virtual reality environment.
5. Explain different OpenGL interactive Input-Device functions.
6. Explain OpenGL menu functions in detail.
7. Explain about designing a graphical user interface.
8. Write a note on OpenGL Animation Procedures.
9. Explain character animation in detail.
10. Write a note on computer animation languages.
11. Explain briefly about general computer animation functions.
12. Explain in detail about traditional animation techniques.
13. Explain in detail about different stages involved in design of animation sequences.
14. Write a note on periodic motion.


MODULE-4
Overview of Image Processing

Computers are faster and more accurate than human beings in processing numerical data.
However, human beings score over computers in recognition capability. The human brain is so
sophisticated that we recognize objects in a few seconds without much difficulty. Human beings
use all the five sensory organs to gather knowledge about the outside world. Among these
perceptions, visual information plays a major role in understanding the surroundings. Other kinds
of sensory information are obtained from hearing, taste, smell and touch.

With the advent of cheaper digital cameras and computer systems, we are witnessing a powerful
digital revolution, where images are being increasingly used to communicate effectively.

Images are encountered everywhere in our daily lives. We see many visual information sources
such as paintings and photographs in magazines, journals, image galleries, digital libraries,
newspapers, advertisement boards, television, and the Internet. Many of us take digital snaps of
important events in our lives and preserve them as digital albums. Then from the digital album,
we print digital pictures or mail them to our friends to share our feelings of happiness and
sorrow. Images are not used merely for entertainment purposes. Doctors use medical images to
diagnose problems for providing treatment. With modern technology, it is possible to image
virtually all anatomical structures, which is of immense help to doctors in providing better
treatment. Forensic imaging applications process fingerprints, faces, and irises to identify
criminals. Industrial applications use imaging technology to count and analyse industrial
components. Remote sensing applications use images sent by satellites to locate the minerals
present in the earth.

Images are imitations of real-world objects. Image is a two-dimensional (2D) signal f(x,y), where
the values of the function f(x,y) represent the amplitude or intensity of the image. For processing
using digital computers, this image has to be converted into a discrete form using the process of
sampling and quantization, known collectively as digitization. In image processing, the term
‘image’ is used to denote the image data that is sampled, quantized and readily available in a
form suitable for further processing by digital computers. Image processing is an area that deals
with manipulation of visual information.


The major objectives of image processing are to

 Improve the quality of pictorial information for better human interpretation.


 Facilitate the automatic machine interpretation of images.

Nature of Image Processing

There are three scenarios or ways of acquiring an image

1. Reflective mode Imaging


2. Emissive Type Imaging
3. Transmissive Imaging

The radiation source shown in Figure 4.1 is the light source.

Figure 4.1: Image processing environment

Objects are perceived by the eye because of light. The sun, lamps, and clouds are all examples of
radiation or light sources. The object is the target for which the image needs to be created. The
object can be people, industrial components, or the anatomical structure of a patient. The objects
can be two-dimensional, three-dimensional or multidimensional mathematical functions
involving many variables. For example, a printed document is a 2D object. Most real-world
objects are 3D.


Reflective Mode Imaging

Reflective mode imaging represents the simplest form of imaging and uses a sensor to acquire
the digital image. All video cameras, digital cameras, and scanners use some types of sensors for
capturing the image. Image sensors are important components of imaging systems. They convert
light energy to electric signals.

Emissive Type Imaging

In emissive type imaging, images are acquired from self-luminous objects without the help of a
radiation source. In emissive type imaging, the objects are self-luminous. The radiation emitted
by the object is directly captured by the sensor to form an image. Thermal imaging is an example
of emissive type imaging. In thermal imaging, a specialized thermal camera is used in low light
situations to produce images of objects based on temperature. Other examples of emissive type
imaging are magnetic resonance imaging (MRI) and positron emission tomography (PET).

Transmissive Imaging

In Transmissive imaging, the radiation source illuminates the object. The absorption of radiation
by the objects depends upon the nature of the material. Some of the radiation passes through the
objects. The attenuated radiation is sensed into an image. This is called transmissive imaging.
Examples of this kind of imaging are X-ray imaging, microscopic imaging, and ultrasound
imaging.

The first major challenge in image processing is to acquire the image for further processing.
Figure 4.1 shows three types of processing – optical, analog and digital image processing.

Optical Image Processing

Optical image processing is the study of the radiation source, the object, and other optical
processes involved. It refers to the processing of images using lenses and coherent light beams
instead of computers. Human beings can see only the optical image. An optical image is the 2D
projection of a 3D scene. This is a continuous distribution of light in a 2D surface and contains
information about the object that is in focus. This is the kind of information that needs to be
captured for the target image. Optical image processing is an area that deals with the object,


optics, and how processes are applied to an image that is available in the form of reflected or
transmitted light. The optical image is said to be available in optical form till it is converted into
analog form.

Analog Image Processing

An analog or continuous image is a continuous function f(x,y) where x and y are two spatial
coordinates. Analog signals are characterized by continuous signals varying with time. They are
often referred to as pictures. The processes that are applied to the analog signal are called analog
processes. Analog image processing is an area that deals with the processing of analog electrical
signals using analog circuits. The imaging systems that use film for recording images are also
known as analog imaging systems.

Digital Image Processing

The analog signal is often sampled, quantized, and converted into digital form using a digitizer.
Digitization refers to the process of sampling and quantization. Sampling is the process of
converting a continuous-valued image f(x,y) into a discrete image, as computers cannot handle
continuous data. So the main aim is to create a discretized version of the continuous data.
Sampling is a reversible process, as it is possible to get the original image back. Quantization is
the process of converting the sampled analog value of the function f(x,y) into a discrete-valued
integer. Digital image processing is an area that uses digital circuits, systems and software
algorithms to carry out the image processing operations. The image processing operations may
include quality enhancement of an image, counting of objects, and image analysis.
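
As a minimal sketch of the quantization step (the value range and the bit depth are assumptions for illustration), a sampled value can be mapped to one of 2^m discrete levels as follows:

/* Quantize a sampled value in the range [0.0, 1.0] to an m-bit integer level. */
int quantize (double sample, int m)
{
   int levels = 1 << m;                            /* 2^m quantization levels    */
   int q = (int) (sample * (levels - 1) + 0.5);    /* round to the nearest level */
   if (q < 0)          q = 0;                      /* clamp to the valid range   */
   if (q > levels - 1) q = levels - 1;
   return q;
}

For example, quantize (0.5, 8) returns level 128 out of the 256 levels available with a bit depth of 8.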

Digital image processing has become very popular now as digital images have many advantages
over analog images. Some of the advantages are as follows:

1. It is easy to post-process the image. Small corrections can be made in the captured image
using software.
2. It is easy to store the image in the digital memory.
3. It is possible to transmit the image over networks. So sharing an image is quite easy.
4. A digital image does not require any chemical process. So it is very environment friendly,
as harmful film chemicals are not required or used.


5. It is easy to operate a digital camera.

The disadvantages of digital images are very few. Some of the disadvantages are the initial cost,
problems associated with sensors such as high power consumption and potential equipment
failure, and other security issues associated with the storage and transmission of digital images.

The final form of an image is the display image. The human eye can recognize only the optical
form. So the digital image needs to be converted to optical form through the digital to analog
conversion process.

Image Processing and Related Fields

Image processing is an exciting interdisciplinary field that borrows ideas freely from many
fields. Figure 4.2 illustrates the relationships between image processing and other related fields.

Figure 4.2: Image Processing and other closely related fields

1) Image Processing and Computer Graphics

Computer graphics and image processing are very closely related areas. Image processing deals
with raster data or bitmaps, whereas computer graphics primarily deals with vector data. Raster
data or bitmaps are stored in a 2D matrix form and often used to depict real images. Vector


images are composed of vectors, which represent the mathematical relationships between the
objects. Vectors are lines or primitive curves that are used to describe an image. Vector graphics
are often used to represent abstract, basic line drawings.

The algorithms in computer graphics often take numerical data as input and produce an image as
output. However, in image processing, the input is often an image. The goal of image processing
is to enhance the quality of the image to assist in interpreting it. Hence, the result of image
processing is often an image or the description of an image. Thus, image processing is a logical
extension of computer graphics and serves as a complementary field.

2) Image Processing and Signal Processing

Human beings interact with the environment by means of various signals. In digital signal
processing, one often deals with the processing of a one-dimensional signal. In the domain of
image processing, one deals with visual information that is often in two or more dimensions.
Therefore, image processing is a logical extension of signal processing.

3) Image Processing and Machine Vision

The main goal of machine vision is to interpret the image and to extract its physical, geometric,
or topological properties. Thus, the output of image processing operations can be subjected to
more techniques, to produce additional information for interpretation. Artificial vision is a vast
field, with two main subfields –machine vision and computer vision. The domain of machine
vision includes many aspects such as lighting and camera, as part of the implementation of
industrial projects, since most of the applications associated with machine vision are automated
visual inspection systems. The applications involving machine vision aim to inspect a large
number of products and achieve improved quality controls. Computer vision tries to mimic the
human visual system and is often associated with scene understanding. Most image processing
algorithms produce results that can serve as the first input for machine vision algorithms.

4) Image Processing and Video processing

Image processing is about still images. Analog video cameras can be used to capture still images.
A video can be considered as a collection of images indexed by time. Most image processing
algorithms work with video readily. Thus, video processing is an extension of image processing.


Images are strongly related to multimedia, as the field of multimedia broadly includes the study
of audio, video, images, graphics and animation.

5) Image Processing and Optics

Optical image processing deals with lenses, light, lighting conditions, and associated optical
circuits. The study of lenses and lighting conditions has an important role in study of image
processing.

6) Image Processing and Statistics

Image analysis is an area that concerns the extraction and analysis of object information from the
image. Imaging applications involve both simple statistics such as counting and mensuration and
complex statistics such as advanced statistical inference. So statistics plays an important role in
imaging applications. Image understanding is an area that applies statistical inferencing to extract
more information from the image.

Digital Image Representation


An image can be defined as a 2D signal that varies over the spatial coordinates x and y, and can
be written mathematically as f(x,y). Medical images such as magnetic resonance images and
computerized tomography(CT) images are 3D images that can be represented as f(x,y,z), where
x,y, and z are spatial coordinates. A simple digital image and its matrix equivalent are shown in
Figs 4.3(a) and 4.3(b).

(a) (b)

Figure 4.3: Digital Image Representation (a) Small binary digital image (b) Equivalent
image contents in matrix form

Figure 4.3(a) shows a displayed image. The source of the image is a matrix as shown in Fig.
4.3(b). The image has five rows and five columns. In general, the image can be written as a
mathematical function f(x,y) as follows:

             [ f(0,0)      f(0,1)     ...   f(0,Y-1)   ]
   f(x,y) =  [ f(1,0)      f(1,1)     ...   f(1,Y-1)   ]
             [   ...         ...      ...     ...      ]
             [ f(X-1,0)    f(X-1,1)   ...   f(X-1,Y-1) ]

In general, the image f(x,y) is divided into X rows and Y columns. Thus, the coordinate ranges
are {x = 0, 1, ..., X-1} and {y = 0, 1, 2, ..., Y-1}. At the intersection of rows and columns, pixels
are present. Pixels are the building blocks of digital images. Pixels combine together to give a
digital image. Pixel represents discrete data. A pixel can be considered as a single sensor,
photosite(physical element of the sensor array of a digital camera), element of a matrix, or
display element on a monitor.

The value of the function f(x,y) at every point indexed by row and a column is called grey value
or intensity of the image. The value of the pixel is the intensity value of the image at that point.
The intensity value is the sampled, quantized value of the light that is captured by the sensor at
that point. It is a number and has no units.

The number of rows in a digital image is called vertical resolution. The number of columns is
called horizontal resolution. The number of rows and columns describes the dimensions of the
image. The image size is often expressed in terms of the rectangular pixel dimensions of the
array. Images can be of various sizes. Some examples of image size are 256 X 256, 512 X 512.
For a digital camera, the image size is defined as the number of pixels (specified in megapixels)

Resolution is an important characteristic of an imaging system. It is the ability of the imaging


system to produce the smallest discernable details, that is the smallest sized object clearly and
differentiate it from the neighbouring small objects that are present in the image. Image
resolution depends on two factors- optical resolution of the lens and spatial resolution.

Spatial resolution of the image is very crucial as the digital image must show the object and its
separation from the other spatial objects that are present in the image clearly and precisely.


A useful way to define resolution is the smallest number of discernible line pairs per unit distance.
The resolution can then be quantified as, for example, 200 line pairs per mm.

Spatial resolution depends on two parameters – the number of pixels of the image and the
number of bits necessary for adequate intensity resolution, referred to as the bit depth. The
numbers of pixels determine the quality of the digital image. The total number of pixels that are
present in the digital image is the number of rows multiplied by the number of columns.

The choice of bit depth is very crucial and often depends on the precision of the measurement
system. To represent the pixel intensity value, certain bits are required. For example, in binary
images, the possible pixel values are 0 or 1. To represent two values, one bit is sufficient. The
number of bits necessary to encode the pixel value is called bit depth. Bit depth is a power of
two. It can be written as 2^m. In monochrome grey scale images (e.g., medical images such as
X-rays and ultrasound images), the pixel values can be between 0 and 255. Hence, eight bits are
used to represent the grey shades between 0 and 255 (as 2^8 = 256). So the bit depth of grey scale
images is 8. In colour images, the pixel value is characterized by both colour value and intensity
value. So colour resolution refers to the number of bits used to represent the colour of the pixel.
The set of all colours that can be represented by the bit depth is called gamut or palette.

So, the total number of bits necessary to represent the image is

Number of rows x Number of columns x Bit depth
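
For example, a 512 x 512 grey scale image with a bit depth of 8 requires 512 x 512 x 8 = 2,097,152 bits, that is, 262,144 bytes (256 KB) of storage.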

Spatial resolution depends on the number of pixels present in the image and the bit depth.
Keeping the number of pixels constant but reducing the quantization levels (bit depth) leads to a
phenomenon called false contouring. The decrease in the number of pixels while retaining the
quantization levels leads to a phenomenon called checkerboard effect (or pixelization error).

A 3D image is a function f(x,y,z) where x,y, and z are spatial coordinates. In 3D images, the
term ‘voxel’ is used for pixel. Voxel is an abbreviation of ‘volume element’.

Types of Images
Images can be classified based on many criteria.


Figure 4.4: Classification of Images

Based on Nature

Images can be broadly classified as natural and synthetic images. Natural images are images
of the natural objects obtained using devices such as cameras or scanners. Synthetic images are
images that are generated using computer programs.

Based on Attributes

Based on attributes, images can be classified as raster images and vector graphics. Vector
graphics use basic geometric attributes such as lines and circles, to describe an image. Hence the
notion of resolution is practically not present in graphics. Raster images are pixel-based. The
quality of the raster images is dependent on the number of pixels. So operations such as
enlarging or blowing-up of a raster image often result in quality reduction.

Based on Colour

Based on colour, images can be classified as grey scale, binary, true colour and pseudocolour
images.


Grayscale and binary images are called monochrome images as there is no colour component in
these images. True colour(or full colour) images represent the full range of available colours. So
the images are almost similar to the actual object and hence called true colour images. In
addition, true colour images do not use any lookup table but store the pixel information with full
precision. Pseudocolour images are false colour images where the colour is added artificially
based on the interpretation of the data.

i) Grey scale Images

Grey scale images are different from binary images as they have many shades of grey between
black and white. These images are also called monochromatic as there is no colour component in
the image, like in binary images. Grey scale is the term that refers to the range of shades between
white and black or vice versa.

Eight bits (2^8 = 256) are enough to represent grey scale as the human visual system can
distinguish only 32 different grey levels. The additional bits are necessary to cover noise
margins. Most medical images such as X-rays, CT images, MRIs and ultrasound images are grey
scale images. These images may use more than eight bits. For example, CT images may require a
range of 10-12 bits to accurately represent the image contrast.

(a) (b)

Figure 4.5: Monochrome images (a) Grey scale image (b) Binary image

ii) Binary Images


In binary images, the pixels assume a value of 0 or 1. So one bit is sufficient to represent the
pixel value. Binary images are also called bi-level images. In image processing, binary images
are encountered in many ways.

The binary image is created from a grey scale image using a threshold process. The pixel value
is compared with the threshold value. If the pixel value of the grey scale image is greater than the
threshold value, the pixel value in the binary image is considered as 1. Otherwise, the pixel value
is 0. The binary image created by applying the threshold process on the grey scale image in
Fig. 4.5(a) is displayed in Fig. 4.5(b). It can be observed that most of the details are eliminated.
However, binary images are often used in representing basic shapes and line drawings. They are
also used as masks. In addition, image processing operations produce binary images at
intermediate stages.

iii) True Colour Images

In true colour images, the pixel has a colour that is obtained by mixing the primary colours
red, green, and blue. Each colour component is represented like a grey scale image using eight
bits. Mostly, true colour images use 24 bits to represent all the colours. Hence true colour images
can be considered as three-band images. The number of colours that is possible is 256^3 (i.e.,
256 x 256 x 256 = 1,67,77,216 colours)

Figure 4.6(a) shows a colour image and its three primary colour components. Figure 4.6(b)
illustrates the general storage structure of the colour image. A display controller then uses a
digital-to-analog converter(DAC) to convert the colour value to the pixel intensity of the
monitor.

Original Image Red Component


Green Component Blue Component


(a)

Figure 4.6: True colour images (a) Original image and its colour components

(b)

(c)

Figure 4.6: (b) Storage structure of colour images (c) Storage structure of an indexed
image


A special category of colour images is the indexed image. In most images, the full range of
colours is not used. So it is better to reduce the number of bits by maintaining a colour map,
gamut, or palette with the image. Figure 4.6(c) illustrates the storage structure of an indexed
image. The pixel value can be considered as a pointer to the index, which contains the address of
the colour map. The colour map has RGB components. Using this indexed approach, the number
of bits required to represent the colours can be drastically reduced. The display controller uses a
DAC to convert the RGB value to the pixel intensity of the monitor.
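
A minimal sketch of the indexed-image lookup (the array names and sizes are assumptions for illustration): each pixel stores only a small index, and the colour map supplies the full RGB value at display time.

typedef struct { unsigned char r, g, b; } RGB;

RGB colourMap[256];                    /* palette: at most 256 distinct colours */
unsigned char indexImage[256][256];    /* indexed image: one byte per pixel     */

/* Return the displayed colour of pixel (row, col) by looking up its index
   in the colour map; the DAC then converts this RGB value for the monitor. */
RGB pixelColour (int row, int col)
{
   return colourMap[ indexImage[row][col] ];
}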

iv) Pseudocolour Images

Like true colour images, pseudocolour images are also widely used in image processing. True
colour images are called three-band images. However, in remote sensing applications multi-band
images or multi-spectral images are generally used. These images, which are captured by
satellites, contain many bands. A typical remote sensing image may have 3-11 bands in an
image. This information is beyond the human perceptual range. Hence it is mostly not visible to
the human observer. So colour is artificially added to these bands, so as to distinguish the bands
and to increase operational convenience. These are called artificial colour or pseudocolour
images. Pseudocolour images are popular in the medical domain also. For example, the Doppler
colour image is a pseudocolour image.

Based on Dimensions

Images can be classified based on dimension also. Normally, digital images are 2D rectangular
array of pixels. If another dimension, of depth or any other characteristics, is considered, it may
be necessary to use a higher-order stack of images. A good example of a 3D image is a volume
image, where pixels are called voxels. By ‘3D image’, it is meant that the dimension of the target
in the imaging system is 3D. The target of the imaging system may be a scene or an object. In
medical imaging, some of the frequently encountered images are CT images, MRIs and
microscopy images. Range images, which are often used in remote sensing applications, are also
3D images.

Based on Data Types

Images may be classified based on their data type. A binary image is a 1-bit image as one bit is
sufficient to represent black and white pixels. Grey scale images are stored as one-byte(8-bit) or


two-byte (16-bit) images. With one byte, it is possible to represent 2^8, that is, 256 shades (0-255),
and with 16 bits, it is possible to represent 2^16, that is, 65,536 shades. Colour images often use 24
or 32 bits to represent the colour and intensity value.

Sometimes, image processing operations produce images with negative numbers, decimal
fractions, and complex numbers. For example, Fourier transforms produce images involving
complex numbers. To handle, negative numbers, signed and unsigned integer types are used. In
these data types, the first bit is used to encode whether the number is positive or negative.
Floating-point involves storing the data in scientific notation. For example, 1230 can be
represented as 0.123 x 10^4, where 0.123 is called the significand and the power is called the
exponent. There are many floating-point conventions.

The quality of such data representation is characterized by parameters such as data accuracy
and precision. Data accuracy is the property of how well the pixel values of an image are able
to represent the physical properties of the object that is being imaged. Data accuracy is an
important parameter, as the failure to capture the actual physical properties of the image leads to
the loss of vital information that can affect the quality of the application. While accuracy refers
to the correctness of a measurement, precision refers to the repeatability of the measurement.
Repeated measurements of the physical properties of the object should give the same result.
Most software use the data type ‘double’ to maintain precision as well as accuracy.

Domain Specific Images

Images can be classified based on the domains and applications where such images are
encountered.

Range Images

Range images are often encountered in computer vision. In range images, the pixel values denote
the distance between the object and the camera. These images are also referred to as depth
images. This is in contrast to all other images whose pixel values denote intensity and hence are
often known as intensity images.

Multispectral Images


Multispectral images are encountered mostly in remote sensing applications. These images are
taken at different bands of visible or infrared regions of the electromagnetic wave. Multispectral
images may have many bands that may include infrared and ultraviolet regions of the
electromagnetic spectrum.

Basic Relationship and Distance Matrices


Images can be easily represented as a two-dimensional array of matrix. Pixels can be visualized
logically and physically. Logical pixels specify the points of a continuous 2D function. These are
logical in the sense that they specify a location but occupy no physical area. Normally, this is
represented in the Cartesian first coordinate system. Physical pixels occupy a small amount of
space when displayed on the output device. Digitized images indicating physical pixels are
represented in the Cartesian fourth coordinate system.

For example, an analog image of size 3x3 is represented in the first quadrant of the Cartesian
coordinate system as shown in Fig 4.7.

Figure 4.7: Analog image f(x,y) in the first quadrant of Cartesian coordinate system

Figure 4.7 illustrates an image f(x,y) of dimension 3x3, where f(0,0) is the bottom left corner.
Since it starts from the coordinate position (0,0), it ends with f(2,2), that is, x = 0, 1, 2, ..., M-1 and
y = 0, 1, 2, ..., N-1. M and N define the dimensions of the image.

In digital image processing, the discrete form of the image is often used. Discrete images are
usually represented in the fourth quadrant of the Cartesian coordinate system. A discrete image
f(x,y) of dimension 3x3 is shown in Fig. 4.8(a)


Many programming environments, including MATLAB, start with an index of (1,1). The
equivalent representation of the given matrix is shown in Fig 4.8(b)

Figure 4.8: Discrete image (a) Image in the fourth quadrant of Cartesian coordinate system
(b) Image coordinates as handled by software environments such as MATLAB

The coordinates used for discrete image is, by default, the fourth quadrant of the Cartesian
system.

Image Topology
Image topology is a branch of image processing that deals with the fundamental properties of the
image such as image neighbourhood, paths among pixels, boundary, and connected components.
It characterizes the image with topological properties such as neighbourhood, adjacency and
connectivity. Neighbourhood is fundamental to understanding image topology. Neighbours of a
given reference pixel are those pixels with which the given reference pixel shares its edges and
corners.

In N4(p), the reference pixel p(x,y) at the coordinate position (x,y) has two horizontal and two
vertical pixels as neighbours. This is shown graphically in Fig. 4.9.

[ 0      X      0 ]
[ X    p(x,y)   X ]
[ 0      X      0 ]

Figure 4.9: 4-Neighbourhood N4(p)

The set of pixels {(x+1,y), (x−1,y), (x,y+1), (x,y−1)}, called the 4-neighbours of p, is denoted as
N4(p). Thus, the 4-neighbourhood includes the four direct neighbours of the pixel p(x,y). The pixel
may also have four diagonal neighbours: (x−1,y−1), (x+1,y+1), (x−1,y+1) and (x+1,y−1). The
diagonal pixels for the reference pixel p(x,y) are shown graphically in Fig 4.10.

[ X      0      X ]
[ 0    p(x,y)   0 ]
[ X      0      X ]

Figure 4.10: Diagonal elements ND(p)

The diagonal neighbours of pixel p(x,y) are represented as ND(p). The 4-neighbourhood and ND(p)
are collectively called the 8-neighbourhood. This refers to all the neighbours that share a common
edge or corner with the reference pixel p(x,y); the pixels that share only a corner are called
indirect neighbours. The 8-neighbourhood is represented as N8(p) and is shown graphically in Fig 4.11.

The set of pixels N8(p) = ND(p) ∪ N4(p)


[ X      X      X ]
[ X    p(x,y)   X ]
[ X      X      X ]

Figure 4.11: 8-Neighbourhood N8(p)
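
The neighbourhood definitions translate directly into code. The following is a minimal Python/NumPy sketch (the function names n4, nd and n8 are illustrative, not from any standard library); neighbours falling outside the image can be discarded by passing the image shape.

import numpy as np

def n4(x, y):
    # Four direct (edge-sharing) neighbours of p(x, y)
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    # Four diagonal (corner-sharing) neighbours of p(x, y)
    return [(x - 1, y - 1), (x + 1, y + 1), (x - 1, y + 1), (x + 1, y - 1)]

def n8(x, y, shape=None):
    # N8(p) = N4(p) union ND(p); optionally clip to the image boundary
    nbrs = n4(x, y) + nd(x, y)
    if shape is not None:
        rows, cols = shape
        nbrs = [(r, c) for r, c in nbrs if 0 <= r < rows and 0 <= c < cols]
    return nbrs

img = np.zeros((3, 3), dtype=np.uint8)
print(n8(0, 0, img.shape))   # a corner pixel has only three valid neighbours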

Connectivity
The relationship between two or more pixels is defined by pixel connectivity. Connectivity
information is used to establish the boundaries of objects. The pixels p and q are said to be
connected if certain conditions on pixel brightness specified by the set V and spatial adjacency
are satisfied. For a binary image, this set V will be {0,1} and for grey scale images, V might be
any range of grey levels.

4-Connectivity: The pixels p and q are said to be in 4-connectivity when both have values from
the set V and q is in the set N4(p). This implies a path from p to q on which every pixel is
4-connected to the next pixel.

8-Connectivity : It is assumed that the pixels p and q share a common grey scale value. The
pixels p and q are said to be in 8-connectivity if q is in the set N8(p)

Mixed Connectivity: Mixed connectivity is also known as m-connectivity. Two pixels p and q
are said to be in m-connectivity when


1. q is in N4(p), or
2. q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.

For example, Fig 4.12 shows 8-connectivity when V={0,1}

Figure 4.12 : 8-connectivity represented as lines

8- Connectivity is shown as lines. Here, a multiple path or loop is present. In m-connectivity,


there are no such multiple paths. The m-connectivity for the image in Fig 4.12 is as shown in Fig
4.13.

Figure 4.13 : m-Connectivity


It can be observed that the multiple paths have been removed.

Relations

A binary relation between two pixels a and b, denoted as aRb, specifies a pair of elements of an
image.

For example, consider the image pattern given in Fig 4.14. The set is given as A = {x1, x2, x3}. The
subset based on the 4-connectivity relation is {x1, x2}. It can be observed that x3 is
left out as it is not connected to any other element of the image by 4-connectivity.

Figure 4.14 : Image pattern

The following are the properties of the binary relations:

Reflexive: For any element a in the set A, if the relation aRa holds, this is known as a reflexive
relation.

Symmetric: If aRb implies that bRa also exists, this is known as a symmetric relation.

Transitive : If the relation aRb and bRc exist, it implies that the relationship aRc also exists.
This is called the transitivity property.

If all these three properties hold, the relationship is called an equivalence relation.

Distance Measures
The distance between the pixels p and q in an image can be given by distance measures such as
Euclidian distance, D4 distance and D8 distance. Consider three pixels p,q, and z. If the


coordinates of the pixels are P(x,y),Q(s,t) and Z(u,w) as shown in Fig.4.15, the distances
between the pixels can be calculated.

Figure 4.15 : Sample image

The distance function can be called metric if the following properties are satisfied:

1. D(p,q) is well-defined and finite for all p and q.


2. D(p,q) ≥ 0, and D(p,q) = 0 if and only if p = q
3. The distance D(p,q) = D(q,p)
4. D(p,q) + D(q,z) ≥ D(p,z). This is called the property of triangular inequality.

The Euclidean distance between the pixels p and q, with coordinates (x,y) and (s,t) respectively,
can be defined as

DE(p,q) = √( (x − s)² + (y − t)² )

The advantage of the Euclidean distance is its simplicity. However, since its calculation involves
a square root operation, it is computationally costly.

The D4 distance or city block distance can be simply calculated as

D4(p,q)=|x-s|+ |y-t|

The D8 distance or chessboard distance can be calculated as

D8(p,q)=max(|x-s|,|y-t|)
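
As a quick check, the three distance measures can be computed with the following small Python sketch (the coordinates are arbitrary example values):

import math

def d_euclidean(p, q):
    # DE(p,q) = sqrt((x-s)^2 + (y-t)^2)
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def d4(p, q):
    # City-block distance: |x-s| + |y-t|
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    # Chessboard distance: max(|x-s|, |y-t|)
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(d_euclidean(p, q), d4(p, q), d8(p, q))   # 5.0, 7, 4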


Important Image Characteristics


Some important characteristics of images are as follows:

1. The set of pixels that has connectivity in a binary image is said to be characterized by the
connected set.
2. A digital path or curve from pixel p to another pixel q is a set of points p1,p2,….,pn. If the
coordinates of those points are (x0,y0) ,(x1,y1),…….(xn,yn), then p=(x0,y0) and q=(xn,yn).
The number of pixels is called the length. If x0=xn and y0=yn , then the path is called a
closed path.
3. R is called a region if it is a connected component.
4. If a path between any two pixels p and q lies within the connected set S, it is called a
connected component of S. If the set has only one connected component, then the set S is
called a connected set. A connected set is called a region.
5. Two regions R1 and R2 are called adjacent if the union of these sets also forms a
connected component; if the regions are not adjacent, they are called disjoint. In Fig 4.16,
two regions R1 and R2 are shown. These regions are adjacent because the underlined pixels '1'
have 8-connectivity.
6. The border of the image is called contour or boundary. A boundary is a set of pixels
covering a region that has one or more neighbours outside the region. Typically, in a
binary image, there is a foreground object and a background object. The border of the
foreground object may have at least one neighbor in the background. If the border pixels
are within the region itself, it is called inner boundary. This need not be closed.
7. Edges are present whenever there is an abrupt intensity change among pixels. Edges are
similar to boundaries, but may or may not be connected. If edges are disjoint, they have
to be linked together by edge linking algorithms. However boundaries are global and
have a closed path. Figure 4.17 illustrates two regions and an edge. It can be observed
that edges provide an outline of the object. The pixels that are covered by the edges lead
to regions.


Figure 4.16 : Neighbouring regions

Figure 4.17 : Edge and regions


Classification of Image Processing Operations


There are various ways to classify image operations. The reason for categorizing the operations
is to gain an insight into the nature of the operations, the expected results, and the kind of
computation burden that is associated with them.

One way of categorizing the operations based on neighbourhood is as follows:

1. Point Operations
2. Local Operations
3. Global Operations

Point operations are those whose output value at a specific coordinate is dependent only on the
input value. A local operation is one whose output value at a specific coordinate is dependent on
the input values in the neighbourhood of that pixel. Global operations are those whose output
value at a specific coordinate is dependent on all the values in the input image.

Another way of categorizing operations is as follows:

1. Linear Operations
2. Non-linear Operations

An operator is called a linear operator if it obeys the following rules of additivity and
homogeneity.

1. Property of additivity

H(a1f1(x,y) + a2f2(x,y)) = H(a1f1(x,y)) + H(a2f2(x,y))
                         = a1H(f1(x,y)) + a2H(f2(x,y))
                         = a1g1(x,y) + a2g2(x,y)

2. Property of homogeneity

H(kf1(x,y))=kH(f1(x,y))=kg1(x,y)

A non-linear operator does not follow these rules.


Image operations are array operations. These operations are done on a pixel-by-pixel basis.
Array operations are different from matrix operations. For example, consider two images

F1 = [ A  B ]
     [ C  D ]

F2 = [ E  F ]
     [ G  H ]

The multiplication of F1 and F2 is element-wise, as follows:

F1 × F2 = [ AE  BF ]
          [ CG  DH ]

By default, image operations are array operations only.
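
The difference between array (element-wise) and matrix operations can be seen in a short NumPy sketch; the 2×2 values are arbitrary placeholders.

import numpy as np

F1 = np.array([[1, 2],
               [3, 4]])
F2 = np.array([[5, 6],
               [7, 8]])

print(F1 * F2)      # element-wise (array) product: [[ 5 12] [21 32]]
print(F1 @ F2)      # true matrix product, shown only for comparison: [[19 22] [43 50]]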

Arithmetic Operations
Arithmetic operations include image addition, subtraction, multiplication, division and blending.

Image Addition

Two images can be added in a direct manner, as given by

g(x,y)=f1(x,y)+f2(x,y)

The pixels of the input images f1(x,y) and f2(x,y) are added to obtain the resultant image g(x,y).

Figure 4.18 shows the effect of adding a noise pattern to an image. However, during the image
addition process, care should be taken to ensure that the sum does not cross the allowed range.
For example, in a grey scale image, the allowed range is 0-255, using eight bits. If the sum is
above the allowed range, the pixel value is set to the maximum allowed value. Similarly, it is
possible to add a constant value to a single image, as follows:

g(x,y)=f1(x,y)+k

If the value of k is larger than 0, the overall brightness is increased. Figure 4.18(d) illustrates
that the addition of the constant 50 increases the brightness of the image.


The brightness of an image is its average pixel intensity. If a positive or negative
constant is added to all the pixels of an image, the average pixel intensity increases
or decreases respectively. The practical applications of image addition are as follows:

1. To create double exposure. Double exposure is the technique of superimposing an image


on another image to produce the resultant. This gives a scenario equivalent to exposing a
film to two pictures.
2. To increase the brightness of an image

Figure 4.18: Results of the image addition operations (a) Image1 (b) Image 2 (c) Addition of
images 1 and 2 (d) Addition of image 1 and constant 50
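
A minimal NumPy sketch of saturating image addition is shown below; the pixel values are placeholders chosen only to show the clipping behaviour (OpenCV's cv2.add performs the same saturating addition if that library is available).

import numpy as np

def add_images(f1, f2):
    # Add two uint8 images, clipping the sum to the allowed range 0-255
    s = f1.astype(np.int16) + f2.astype(np.int16)
    return np.clip(s, 0, 255).astype(np.uint8)

def add_constant(f, k):
    # Add a constant k to every pixel (k > 0 increases overall brightness)
    s = f.astype(np.int16) + k
    return np.clip(s, 0, 255).astype(np.uint8)

f1 = np.array([[100, 250], [10, 200]], dtype=np.uint8)
f2 = np.array([[100, 100], [10, 100]], dtype=np.uint8)
print(add_images(f1, f2))      # 250 + 100 saturates to 255
print(add_constant(f1, 50))    # overall brightness increased by 50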

Image Subtraction
The subtraction of two images can be done as follows. Consider

g(x,y)=f1(x,y)-f2(x,y)

where f1(x,y) and f2(x,y) are two input images and g(x,y) is the output image. To avoid negative
values, it is desirable to find the modulus of the difference as


g(x,y)=| f1(x,y)-f2(x,y)|

It is also possible to subtract a constant value k from the image

i.e g(x,y)= | f1(x,y)-k|, as k is constant. The decrease in the average intensity reduces the
brightness of the image. Some of the practical applications of image subtraction are as follows:

1. Background elimination
2. Brightness reduction
3. Change detection

If there is no difference between the frames, the subtraction process yields zero, and if there is
any difference, it indicates the change. Figure 4.19 (a) -4.19(d) show the difference between the
images. In addition, it illustrates that the subtraction of a constant results in a decrease of the
brightness.

Figure 4.19: Results of the image subtraction operation (a) Image 1 (b) Image 2 (c)
Subtraction of images 1 and 2 (d) Subtraction of constant 50 from image 1
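
A small NumPy sketch of the absolute-difference form used for change detection follows; the two "frames" are illustrative values only.

import numpy as np

def abs_difference(f1, f2):
    # g(x,y) = |f1(x,y) - f2(x,y)|; the modulus avoids negative pixel values
    d = f1.astype(np.int16) - f2.astype(np.int16)
    return np.abs(d).astype(np.uint8)

frame1 = np.array([[50, 50], [200, 10]], dtype=np.uint8)
frame2 = np.array([[50, 80], [100, 10]], dtype=np.uint8)
print(abs_difference(frame1, frame2))   # non-zero entries mark the changed pixels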


Image Multiplication

Image multiplication can be done in the following manner:

Consider

g(x,y)=f1(x,y) x f2(x,y)

f1(x,y) and f2(x,y) are two input images and g(x,y) is the output image. If the multiplied value
crosses the maximum value of the data type of the images, the value of the pixel is reset to the
maximum allowed value. Similarly, scaling by a constant can be performed as

g(x,y)=f(x,y)x k

where k is a constant.

If k is greater than 1, the overall contrast increases. If k is less than 1, the contrast decreases. The
brightness and contrast can be manipulated together as

g(x,y)=af(x,y)+k

Parameters a and k are used to manipulate the brightness and contrast of the input image. g(x,y)
is the output image. Some of the practical applications of image multiplication as follows:

1. It increases contrast. If a fraction less than 1 is multiplied with the image, it results in
decrease of contrast. Figure 4.20 shows that by multiplying a factor of 1.25 with the
original image, the contrast of the image increases.
2. It is useful for designing filter masks.
3. It is useful for creating a mask to highlight the area of interest.
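
The combined manipulation g(x,y) = a·f(x,y) + k can be sketched as follows (a minimal NumPy sketch; the factor 1.25 and the offset 50 mirror the values used in the figures above):

import numpy as np

def scale_and_offset(f, a, k):
    # g(x,y) = a*f(x,y) + k, clipped back to the 0-255 range of a grey scale image
    g = a * f.astype(np.float32) + k
    return np.clip(g, 0, 255).astype(np.uint8)

f = np.array([[40, 120], [180, 230]], dtype=np.uint8)
print(scale_and_offset(f, 1.25, 0))   # contrast increased by a factor of 1.25
print(scale_and_offset(f, 1.0, 50))   # brightness increased by 50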


Figure 4.20: Result of multiplication operation (image x 1.25) resulting in good contrast

Image Division
Division can be performed as

g(x,y) = f1(x,y)/f2(x,y)

where f1(x,y) and f2(x,y) are two input images and g(x,y) is the output image.

The division process may result in floating-point numbers. Hence, the float data type should be
used in programming. Improper data type specification of the image may result in loss of
information. Division using a constant can also be performed as

g(x,y) = f(x,y)/k where k is a constant

Some of the practical applications of image division are as follows:

1. Change detection
2. Separation of luminance and reflectance components
3. Contrast reduction

Figure 4.21 (a) shows such an effect when the original image is divided by 1.25.


Figure 4.21 (b)-4.21(e) show the multiplication and division operations used to create a mask. It
can be observed that image 2 is used as a mask. The multiplication of image 1 with image 2
results in highlighting certain portions of image 1 while suppressing the other portions. It can be
observed that division yields back the original image.

Figure 4.21: Image division operation (a) Result of the image operation (image/1.25) (b)
Image 1 (c) Image 2 used as a mask (d) Image 3=image 1 x image 2 (e) Image 4=image
3/image 1

Applications of Arithmetic Operations


Arithmetic operations can be combined and put to effective use. For example, the image
averaging process can be used to remove noise. Noise is a random fluctuation of pixel values,
which affects the quality of the image. A noisy image can be considered as an image plus noise:

g(x,y) = f(x,y) + η(x,y)

where f(x,y) is the input image, η(x,y) is the noise, and g(x,y) is the observed noisy image.
Several instances of noisy images can be averaged as

ḡ(x,y) = (1/M) Σ (i = 1 to M) gᵢ(x,y)

where M is the number of noisy images. As M increases, the averaging process reduces the
intensity of the noise until it becomes negligible. As M becomes large, the expectation
E{ḡ(x,y)} approaches the noise-free image f(x,y).
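
A minimal sketch of noise reduction by averaging M noisy observations of the same scene; the image content and the Gaussian noise are synthetic and purely illustrative.

import numpy as np

rng = np.random.default_rng(0)
f = np.full((64, 64), 128.0)                 # ideal noise-free image
M = 50                                       # number of noisy observations

noisy = [f + rng.normal(0, 20, f.shape) for _ in range(M)]
g_avg = np.mean(noisy, axis=0)               # (1/M) * sum of g_i(x, y)

print(np.std(noisy[0] - f))   # noise level of a single frame (about 20)
print(np.std(g_avg - f))      # reduced by roughly sqrt(M), i.e. about 2.8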

Logical Operations
Bitwise operations can be applied to image pixels. The resultant pixel is defined by the rules of
the particular operation. Some of the logical operations that are widely used in image processing
are as follows:

1. AND/NAND

2. OR/NOR

3. EXOR/EXNOR

4. Invert/ Logical NOT

1. AND/NAND

The truth table of the AND and NAND operators is given in Table 4.2

Table 4.2: Truth table of the AND and NAND operators

A B C(AND) C(NAND)
0 0 0 1
0 1 0 1
1 0 0 1
1 1 1 0


The operators AND and NAND take two images as input and produce one output image. The
output image pixels are the output of the logical AND/NAND of the individual pixels. Some of
the practical applications of the AND and NAND operators are as follows:

1. Computation of the intersection of images


2. Design of filter masks
3. Slicing of grey scale images; for example, a pixel of the grey scale image may have the value
1100 0000. The first (most significant) bits of all the pixels of an image constitute one slice,
or bit plane. To extract this first slice, a mask of value 1000 0000 can be designed; the AND
operation of each image pixel with the mask extracts the first bit, and hence the first slice,
of the image (a sketch follows this list).
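
A small NumPy sketch of extracting the most significant bit plane with an AND mask; the pixel values are placeholders.

import numpy as np

f = np.array([[0b11000000, 0b01111111],
              [0b10000001, 0b00000000]], dtype=np.uint8)

mask = np.uint8(0b10000000)               # selects the most significant bit of each pixel
msb_plane = np.bitwise_and(f, mask)

print(msb_plane)                           # 128 where the first bit is set, 0 elsewhere
print((msb_plane > 0).astype(np.uint8))    # the slice viewed as a binary image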

Figures 4.22(a)-4.22(d) show the effect of the AND and OR logical operators. The AND operator
retains the overlapping region of the two input images, while the OR operator shows the union
of the two input images.

Figure 4.22: Results of the AND and OR logical operators (a) Image 1 (b) Image 2 (c)
Result of image 1 OR image 2 (d) Result of image 1 and AND image 2

2. OR/NOR

The truth table of the OR and NOR operators is given in Table 4.3

Table 4.3: Truth table of the OR and NOR operators

A B C(OR) C(NOR)
0 0 0 1
0 1 1 0
1 0 1 0
1 1 1 0


The practical applications of the OR and NOR operators are as follows:

1. OR is used as the union operator of two images.


2. OR can be used as a merging operator

3. XOR/XNOR

The truth table of the XOR and XNOR operators is given in Table 4.4.

A B C(XOR) C(XNOR)
0 0 0 1
0 1 1 0
1 0 1 0
1 1 0 1

The practical applications of the XOR and XNOR operations are as follows:

1. Change detection
2. Use as a subcomponent of a complex imaging operation. XOR for identical inputs is zero.
Hence it can be observed that the common region of image 1 and image 2 in Figures
4.22(a) and 4.22 (b), respectively, is zero and hence dark. This is illustrated in Fig. 4.23.

Figure 4.23: Result of the XOR operation


4. Invert/Logical NOT
The truth table of the NOT operator is given in Table 4.5.

Table 4.5: Truth table of the NOT operator

A C(NOT)
0 1
1 0

For grey scale values, the inversion operation is described as

g(x,y)= 255- f(x,y)

The practical applications of the inversion operator are as follows:

1. Obtaining the negative of an image. Figure 4.24 shows the negative of the original image
shown in Fig. 4.22(a)
2. Making features clear to the observer
3. Morphological processing

(a) (b)

Figure 4.24: Effect of the NOT operator (a) Original Image (b) NOT of original Image

Similarly, two images can be compared using operators such as

• =   Equal to
• >   Greater than
• >=  Greater than or equal to
• <   Less than
• <=  Less than or equal to
• ≠   Not equal to

The resultant image pixel represents the truth or falsehood of the comparison. Similarly, shifting
operations are also very useful. Shifting the image pixel by I bits to the right results in division
by 2^I, and shifting it by I bits to the left results in multiplication by 2^I.

Shifting operators are helpful in dividing and multiplying an image by a power of two. In
addition, this operation is computationally less expensive.
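
A short sketch of the shift operators on pixel values (NumPy assumed; the values are arbitrary):

import numpy as np

f = np.array([[200, 64], [17, 3]], dtype=np.uint8)

half = f >> 1                        # right shift by 1 bit: integer division by 2
double = f.astype(np.uint16) << 1    # left shift by 1 bit: multiplication by 2

print(half)      # [[100  32] [  8   1]]
print(double)    # [[400 128] [ 34   6]]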

Geometrical Operations
Translation

Translation is the movement of an image to a new position. Let us assume that the point at the
coordinate position X = (x,y) of the matrix F is moved to a new position X′ whose coordinate
position is (x′,y′). Mathematically, this can be stated as a translation of the point X to the new
position X′. The translation is represented as

x′ = x + Δx

y′ = y + Δy

The translation of the floor image by (25,25) is shown in Figure 4.25.

Figure 4.25 : Result of translation by 50 units


In vector notation, this is represented as F′ = F + T, where Δx and Δy are translations parallel to
the x and y axes. F and F′ are the original and the translated images respectively. However, other
transformations such as scaling and rotation are multiplicative in nature. The transformation
process for rotation is given as F′ = RF, where R is the transform matrix for performing rotation,
and the transformation process for scaling is given as F′ = SF. Here, S is the scaling
transformation matrix.

To create uniformity and consistency, it is necessary to use a homogeneous coordinate system


where all transformations are treated as multiplications. A point (x,y) in 2D space is expressed as
(wx, wy, w) for w ≠ 0. The properties of homogeneous coordinates are as follows:

1. In homogeneous coordinates, at least one coordinate should be non-zero. Thus (0,0,0) does not
exist in the homogeneous coordinate system.
2. If one point is a multiple of another, they represent the same point. Thus, the points (1,3,5)
and (3,9,15) are the same, as the second point is 3 × (1,3,5).
3. The point (x,y,w) in the homogeneous coordinate system corresponds to the point
(x/w, y/w) in 2D space.

In the homogeneous coordinate system, the translation of a point (x,y) of image F to the
new point (x′,y′) of the image F′ is described as

x′ = x + Δx

y′ = y + Δy

In matrix form, this can be stated as

[x′]   [1  0  Δx] [x]
[y′] = [0  1  Δy] [y]
[1 ]   [0  0   1] [1]

Sometimes, the image may not be present at the origin. In that case, a suitable negative
translation value can be used to bring the image to align with the origin.
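
A minimal sketch of translating a point with a homogeneous 3×3 matrix (NumPy assumed; the translation amounts are arbitrary):

import numpy as np

def translation_matrix(dx, dy):
    # Homogeneous 3x3 translation matrix
    return np.array([[1, 0, dx],
                     [0, 1, dy],
                     [0, 0, 1]], dtype=float)

p = np.array([10, 5, 1])       # the point (10, 5) in homogeneous form
T = translation_matrix(25, 25)
print(T @ p)                   # [35. 30. 1.]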


Scaling
Depending on the requirement, the object can be scaled. Scaling means enlarging and shrinking.
The scaling of the point (x,y) of the image F to the new point (x′,y′) of the image F′ is
described as

x′ = x × Sx

y′ = y × Sy

[x′]   [Sx   0] [x]
[y′] = [ 0  Sy] [y]

Sx and Sy are called scaling factors along the x and y axes respectively. If a scaling factor is
greater than 1, the object appears larger; if the scaling factors are fractions, the object shrinks.
Similarly, if Sx and Sy are equal, the scaling is uniform. This is known as isotropic scaling;
otherwise, it is called differential scaling. In the homogeneous coordinate system, it is represented as

[x′]   [Sx   0  0] [x]
[y′] = [ 0  Sy  0] [y]
[1 ]   [ 0   0  1] [1]

The matrix

S = [Sx   0  0]
    [ 0  Sy  0]
    [ 0   0  1]

is called the scaling matrix.

Mirror or Reflection Operation


This function creates the reflection of the object in a plane mirror. The function returns an image
in which the pixels are reversed. This operation is useful in creating an image in the desired order
and for making comparisons. The reflected object is of the same size as the original object, but
the object is in the opposite quadrant. Reflection can also be described as rotation by 180°. The
reflection about the x-axis is given by

F′ = [x, −y]ᵀ = [1   0] [x, y]ᵀ
                [0  −1]

Similarly, the reflection about the y-axis is given by

F′ = [−x, y]ᵀ = [−1  0] [x, y]ᵀ
                [ 0  1]


Similarly, the reflection about the line y = x is given as

F′ = [y, x]ᵀ = [0  1] [x, y]ᵀ
               [1  0]

The reflection about the line y = −x is given as

F′ = [−y, −x]ᵀ = [ 0  −1] [x, y]ᵀ
                 [−1   0]
−1 0
The reflection operation is illustrated in Fig 3.22(a) and 3.22(b). In the homogeneous coordinate
system, the matrices for reflection can be given as

Ry-axis = [−1  0  0]
          [ 0  1  0]
          [ 0  0  1]

Rx-axis = [ 1   0  0]
          [ 0  −1  0]
          [ 0   0  1]

Rorigin = [−1   0  0]
          [ 0  −1  0]
          [ 0   0  1]

The reflection about a line can be given as

Ry=x  = [0  1  0]
        [1  0  0]
        [0  0  1]

Ry=−x = [ 0  −1  0]
        [−1   0  0]
        [ 0   0  1]

Shearing
Shearing is a transformation that produces a distortion of shape. This can be applied either in the
x-direction or the y-direction. In this transformation, the parallel and opposite layers of the object
are simply slid with respect to each other.

Shearing can be done using the following calculation and can be represented in the matrix form
as

x′ = x + ay

y′ = y

         [1  a  0]
Xshear = [0  1  0]   (where a = shx)
         [0  0  1]

Yshear can be given as

x′ = x

y′ = y + bx

         [1  0  0]
Yshear = [b  1  0]   (where b = shy)
         [0  0  1]

where shx and shy are the shear factors in the x and y directions, respectively.

Rotation
An image can be rotated by various angles such as 90°, 180° or 270°. In matrix form, rotation is
given as

[x′]   [cos θ  −sin θ] [x]
[y′] = [sin θ   cos θ] [y]

This can be represented as F′ = RF. The parameter θ is the angle of rotation with respect to the
x-axis. The value of θ can be positive or negative. A positive angle represents counter-clockwise
rotation and a negative angle represents clockwise rotation. In the homogeneous coordinate system,
rotation can be expressed as

[x′]   [cos θ  −sin θ  0] [x]
[y′] = [sin θ   cos θ  0] [y]
[1 ]   [  0       0    1] [1]

If θ is substituted with −θ, this matrix rotates the image in the clockwise direction.

Affine Transform
The transformation that maps the pixel at the coordinates (x,y) to a new coordinate position is
given as a pair of transformation equations. In this transform, straight lines are preserved and
parallel lines remain parallel. It is described mathematically as

x′ = Tx(x,y)

y′ = Ty(x,y)

Tx and Ty are expressed as polynomials. The linear equations give an affine transform:

x′ = a0x + a1y + a2

y′ = b0x + b1y + b2

This is expressed in matrix form as

[x′]   [a0  a1  a2] [x]
[y′] = [b0  b1  b2] [y]
[1 ]   [ 0   0   1] [1]

The affine transform is a compact way of representing all these transformations. The given matrix
equation represents all of them:

• Translation is the situation where a0 = 1, a1 = 0, a2 = Δx, b0 = 0, b1 = 1 and b2 = Δy.
• Scaling is the situation where a0 = Sx, b1 = Sy and a1 = 0, a2 = 0, b0 = 0 and b2 = 0.
• Rotation is the situation where a0 = cos θ, a1 = −sin θ, b0 = sin θ, b1 = cos θ, a2 = 0 and b2 = 0.
• Shearing is the situation where a0 = 1, a1 = shx, b0 = shy, b1 = 1, a2 = 0 and b2 = 0
  (with shy = 0 for a pure horizontal shear).
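
Since translation, scaling, rotation and shear are all special cases of the 3×3 affine matrix, they can be composed by matrix multiplication. The following NumPy sketch uses arbitrary illustrative values for the angle and factors.

import numpy as np

def affine(a0, a1, a2, b0, b1, b2):
    # Build the 3x3 affine matrix from its six coefficients
    return np.array([[a0, a1, a2],
                     [b0, b1, b2],
                     [0,  0,  1]], dtype=float)

theta = np.radians(45)
R = affine(np.cos(theta), -np.sin(theta), 0, np.sin(theta), np.cos(theta), 0)  # rotation
S = affine(2, 0, 0, 0, 2, 0)                                                   # scaling
T = affine(1, 0, 25, 0, 1, 25)                                                 # translation

M = T @ R @ S                  # scale first, then rotate, then translate
p = np.array([10, 5, 1])       # a point in homogeneous coordinates
print(M @ p)                   # the transformed homogeneous point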

Inverse Transformation
The purpose of inverse transformation is to restore the transformed object to its original form and
position. The inverse or backward transformation matrices are given as follows:

Inverse transform for translation:

[1  0  −Δx]
[0  1  −Δy]
[0  0    1]

Inverse transform for scaling:

[1/Sx    0   0]
[  0   1/Sy  0]
[  0     0   1]

The inverse transform for rotation can be obtained by changing the sign of the rotation angle. For
example, the following matrix performs the inverse rotation:

[ cos θ   sin θ  0]
[−sin θ   cos θ  0]
[   0       0    1]

3D Transforms

Some medical images such as computerized tomography(CT) and magnetic resonance imaging(MRI)
images are three-dimensional images. To apply translation, rotation, and scaling on 3D images, 3D
transformations are required. 3D transformations are logical extensions of 2D transformations. These
are summarized and described as follows:

Translation = [1  0  0  Δx]
              [0  1  0  Δy]
              [0  0  1  Δz]
              [0  0  0   1]

Image Interpolation Techniques

The mapping between input and output pixels can be carried out in two ways: forward mapping and
backward mapping. Affine transforms often produce output coordinates that cannot be fitted exactly to
the pixel grid, as some of the computed values are non-integers and may go beyond the acceptable range.
This results in gaps (or holes) and issues related to the number of pixels and the range, so
interpolation techniques are required to solve these issues.

Forward Mapping

Forward mapping is the process of applying transformations iteratively to every pixel in the image,
yielding a new coordinate position and copying the values of the pixel to a new position.

Backward Mapping

Backward mapping is the process of visiting each pixel of the output image and determining the
corresponding position in the input image. This is used to guarantee that every pixel of the output
image is assigned a value, so no holes are left.

During the process of both forward and backward mapping, it may happen that pixels cannot be fitted
into the new coordinates. For example, consider the rotation of the point (10,5) by 45°. This
yields

x′ = x cos θ − y sin θ
   = 10 cos(45°) − 5 sin(45°)
   = 10(0.707) − 5(0.707)
   = 3.535

y′ = x sin θ + y cos θ
   = 10 sin(45°) + 5 cos(45°)
   = 10(0.707) + 5(0.707)
   = 10.605

Since these new coordinate positions are not integers, the rotation process cannot be carried out. Thus,
the process may leave a gap in the new coordinate position, which creates poor quality output.
Therefore, whenever a geometric transformation is performed, a resampling process should be carried
out so that the desirable quality is achieved in the resultant image. The resampling process creates new
pixels so that the quality of the output is maintained. In addition, the rounding off of the new coordinate
position (3.535,10.605) should be carried out as (4,11). This process of fitting the output to the new
coordinates is called interpolation.

Interpolation is the method of calculating the expected values for a function with known pixels. Some of
the popular interpolation techniques are:

Nearest neighbor technique

Bilinear technique

Bicubic technique

The most elementary form of interpolation is nearest neighbor interpolation or zero-order interpolation.
This technique determines the closest pixel and assigns it to every pixel in the new image matrix, that is,
the brightness of the pixels is equal to the closest neighbor. Sometimes, this may result in pixel blocking
and can degrade the resulting image, which may appear spatially disordered. These distortions are
called aliasing.

A more accurate interpolation scheme is bilinear interpolation. This is called first-order interpolation.
Four neighbours of the transformed original pixels that surround the new pixel are obtained and are
used to calculate the new pixel value. Linear interpolation is used in both the directions. Weights are
assigned based on the proximity. Then the process takes the weighted average of the brightness of the
four pixels that surround the pixels of interest.

g(x,y)=(1-a)(1-b)f(x’,y’)+(1-a)bf(x’,y’+1)+a(1-b)f(x’+1,y’)+abf(x’+1,y’+1)

Here g(x,y) is the output image, f is the image that undergoes the interpolation operation, x′ and y′
are the integer parts of the transformed coordinates, and a and b are the fractional parts. If the
desired pixel is very close to one of the four nearest-neighbour pixels, the weight of that neighbour
will be much higher. This technique leads to blurring of the edges; however, it reduces aliasing artefacts.
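
A minimal sketch of bilinear interpolation at a non-integer location, following the weighted-average formula above (NumPy assumed; x0 and y0 are the integer parts and a, b the fractional parts):

import numpy as np

def bilinear(f, x, y):
    # f is a 2D array indexed as f[x, y]; (x, y) may be non-integer
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    a, b = x - x0, y - y0
    return ((1 - a) * (1 - b) * f[x0, y0] +
            (1 - a) * b       * f[x0, y0 + 1] +
            a       * (1 - b) * f[x0 + 1, y0] +
            a       * b       * f[x0 + 1, y0 + 1])

f = np.array([[10, 20],
              [30, 40]], dtype=float)
print(bilinear(f, 0.5, 0.5))   # 25.0, the weighted average of the four neighbours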

High-order interpolation schemes takes more pixels into account. Second-order interpolation is known
as cubic interpolation. It uses a neighbourhood of 16 pixels. Then it fits two polynomials to the 16 pixels
of the transformed original matrix and the new image pixel. This technique is very effective and
produces images that are very close to the original. In extreme cases, more than 64 neighbouring pixels
can be used. However, as the number of pixel increases, the computational complexity also increases.


Set Operations

An image can be visualized as a set. For example, the binary image in Fig. 3.25 can be visualized
as a set A = {(0,0),(0,2),(2,2)}, where the listed coordinates are the pixels whose value is 1. Set
operators can then be applied to the set to get the resultant, which is useful for image analysis.

The complement of set A is defined as the set of pixels that do not belong to A:

Aᶜ = {c | c ∉ A}

The reflection of the set is defined as

Â = {c | c = −a, a ∈ A}

The union of two sets A and B can be represented as

A ∪ B = {c | (c ∈ A) ∨ (c ∈ B)}

where the pixel c belongs to A, to B, or to both.

The intersection of two sets is given as A ∩ B = {c | (c ∈ A) ∧ (c ∈ B)}, where the pixel c belongs
to both A and B.

The difference can be expressed as

A − B = {c | (c ∈ A) ∧ (c ∉ B)}

which is equivalent to A ∩ Bᶜ.

Morphology is a collection of operations based on set theory, to accomplish various tasks such as
extracting boundaries, filling small holes present in the image, and removing noise present in the image.

Mathematical morphology is a very powerful tool for analyzing the shapes of the objects that are present
in the images. The theory of mathematical morphology is based on set theory. One can visualize a binary
object as a set. Set theory can then be applied to the sample set. Morphological operators often take a
binary image and a mask known as structuring element as input. The set operators such as intersection,
union, inclusion and complement can then be applied to images. Dilation is one of the two basic
operators. It can be applied to binary as well as grey scale images. The basic effect of this operator on a
binary image is that it gradually increases the boundaries of the region, while the small holes that are
present in the images become smaller.

Let us assume that A and B are sets of pixel coordinates. The dilation of A by B is denoted as

A ⊕ B = {(x,y) + (u,v) : (x,y) ∈ A, (u,v) ∈ B}

Where x and y corresponds to the set A, and u and v corresponds to the set B. The coordinates are
added and the union is carried out to create the resultant set. These kinds of operations are based on
Minkowski algebra.


Statistical Operations

Statistics plays an important role in image processing. An image can be assumed to be a set of discrete
points. Statistical operations can be applied to the image to get desired results such as manipulation
of brightness and contrast. Some of the very useful statistical operations include mean, median, mode
and mid-range. These measures are useful in image processing. The measures of data dispersion include
quartiles, the inter-quartile range and variance.

Some of the frequently used statistical measures are the following

Mean

Mean is the average of all the values in the sample (population) and is denoted as μ.

The overall brightness of the grey scale image is measured using the mean. This is calculated by
summing all the pixel values of an image and dividing by the number of pixels in the image:

μ = (1/n) Σ (i = 0 to n−1) Ii

Sometimes the data is associated with a weight. This is called weighted mean. The problem of mean is
its extreme sensitivity to noise. Even small changes in the input affect the mean drastically.

Median

Median is the value where the given Xi is divided into two equal halves, with half of the values being
lower than the median and the other half higher. The procedure for obtaining the median is to sort the
values of the given Xi in ascending order. If the given sequence has an odd number of values, the middle
value is the median. Otherwise, the median is the arithmetic mean of the two middle values.

Mode

Mode is the value that occurs most frequently in the dataset. The procedure for finding the mode is to
calculate the frequencies for all of the values in the data. The mode is the value (or values) with the
highest frequency. Normally, based on the mode, the dataset is classified as unimodal, bimodal, or
trimodal. Any dataset that has two modes is called bimodal.

Percentile

The pth percentile is the value below which p per cent of the data fall. For example, the median is
the 50th percentile and can be denoted as Q0.50. The 25th percentile is called the first quartile and
the 75th percentile is called the third quartile. Another measure that is useful for measuring
dispersion is the inter-quartile range (IQR), defined as Q0.75 − Q0.25. The semi-interquartile range
is 0.5 × IQR.

For moderately skewed unimodal curves, the empirical relation is

Mean − Mode = 3 × (Mean − Median)


The interpretation of the formula is that, for a moderately skewed unimodal frequency curve, the mode
can be estimated from the mean and the median. The mid-range is also used to assess the central
tendency of the dataset. In a normal
distribution, the mean, median, and mode are the same. In symmetrical distributions, it is possible for
the mean and median to be the same even though there may be several modes. By contrast, in
asymmetrical distributions, the mean and median are not the same. These distributions are said to be
skewed data where more than half the cases are either above or below the mean.

Standard Deviation and Variance

The most commonly used measures of dispersion are variance and standard deviation. The mean does
not convey much more than a middle point. For example, the datasets {10,20,30} and {10,50,0} both
have a mean of 20; the difference between these two sets is the spread of the data. Standard
deviation is the average distance from the mean of the dataset to each point. The formula for
standard deviation is

σ = √( (1/N) Σ (i = 1 to N) (xi − μ)² )

Sometimes, we divide the value by N-1 instead of N. The reason is that in a larger, real-world scenario,
division by N-1 gives an answer that is closer to the actual value. In image processing, it is a measure of
how much a pixel varies from the mean value of the image. The mean value and the standard deviation
characterize the perceived brightness and contrast of the image. Variance is another measure of the
spread of the data. It is the square of standard deviation. While standard deviation is a more common
measure, variance also indicates the spread of the data effectively.
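
The common statistical measures can be computed directly with NumPy; the pixel values below are arbitrary.

import numpy as np

img = np.array([[10, 20, 30],
                [10, 50, 0],
                [20, 20, 40]], dtype=float)

print(np.mean(img))               # average brightness
print(np.median(img))             # 50th percentile
print(np.percentile(img, 25))     # first quartile
print(np.percentile(img, 75))     # third quartile
print(np.std(img), np.var(img))   # spread of pixel values (NumPy divides by N by default)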

Entropy

This is the measure of the amount of orderliness that is present in the image. The entropy can be
calculated by assuming that the pixels are totally uncorrelated. An organized system has low entropy
and a complex system has a very high entropy. Entropy also indicates the average global information
content. Its unit is bits per pixel. It can be computed using the formula

Entropy H = −Σ (i = 1 to n) pi log2(pi)

where pi is the prior probability of the occurrence of the i-th value. Let us consider a binary image,
where a pixel assumes only two possible states, 0 or 1, and the occurrence of each state is equally
likely. Hence, each probability is 1/2. Therefore, the entropy H = −[½ log2(½) + ½ log2(½)] = 1 bit.

Therefore, 1 bit is sufficient to store the intensity of a pixel, and binary images are less complex.

Thus, entropy indicates the richness of the image. This can be seen visually using a surface plot where
pixel values are plotted as a function of pixel position.
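
A short sketch of computing the entropy of a grey scale image from its normalized histogram (NumPy assumed; the helper function entropy is illustrative):

import numpy as np

def entropy(img, levels=256):
    # H = -sum(p_i * log2(p_i)) over grey levels with non-zero probability
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

binary = np.array([[0, 1], [1, 0]], dtype=np.uint8)
print(entropy(binary, levels=2))   # 1.0 bit per pixel, as in the example above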

Convolution and Correlation Operations

The imaging system can be modelled as a 2D linear system. Let f(x,y) and g(x,y) represent the input and
output images, respectively. Then they can be written as g(x,y) = t * f(x,y), where t is the system
operator (impulse response). Convolution is a group


process, that is, unlike point operations, group processes operate on a group of input pixels to yield the
result. Spatial convolution is a method of taking a group of pixels in the input image and computing the
resultant output image. This is also known as a finite impulse response(FIR) filter. Spatial convolution
moves across pixel by pixel and produces the output image. Each pixel of the resultant image is
dependent on a group of pixels(called kernel).

The one-dimensional convolution formula is as follows:

g(x) = t * f(x)
     = Σ (i = −n to n) t(i) f(x − i)

The convolution window is a sliding window that centres on each pixel of the image to generate the
resultant image. The resultant pixel is thus calculated by multiplying the weight of the convolution mask
by pixel values and summing these values. Thus, the sliding window is moved for every corresponding
pixel in the image in both direction. Therefore, convolution is called ‘shift-add-multiply’ operation.

To carry out the process of convolution, the template or mask is first rotated by 180°. Then the
convolution process is carried out. Consider the convolution of two sequences: F, whose
dimension is 1 × 5, and a kernel or template T, whose dimension is 1 × 3.

Let F = {0, 0, 2, 0, 0} and the kernel be {7, 5, 1}. The kernel has to be rotated by 180°. The rotated
version of the original mask [7 5 1] is a convolution template of dimension 1 × 3 with values {1, 5, 7}.

To carry out the convolution process, zero padding should first be performed. Zero padding is the
process of padding the borders of the sequence with additional zeros and is done as shown in Table 3.7.

Convolution is the process of shifting and adding the sum of the product of mask coefficients and the
image to give the centre value.

Correlation is similar to the convolution operation and is very useful in recognizing basic shapes in
the image. Correlation reduces to convolution if the kernel is symmetric. The difference between the
correlation and convolution processes is that in correlation the mask or template is applied directly,
without the prior rotation that is performed in the convolution process.

The correlation of these sequences is carried out to observe the difference between these processes.
The correlation process also involves the zero padding process.
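
The convolution and correlation of the sequences above can be checked with NumPy: np.convolve flips the kernel internally, while np.correlate applies it without rotation. A minimal sketch:

import numpy as np

F = np.array([0, 0, 2, 0, 0])
T = np.array([7, 5, 1])

conv = np.convolve(F, T, mode='same')     # kernel is rotated by 180 degrees internally
corr = np.correlate(F, T, mode='same')    # kernel applied directly, no rotation

print(conv)   # [ 0 14 10  2  0]
print(corr)   # [ 0  2 10 14  0]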
