Computer Graphics Handout-Ch-2
2.1. Introduction
Graphics hardware can be divided into three major categories of devices: (1) input devices, with which the user interacts to generate the instructions or data needed to create graphics; (2) display systems, on which the graphics are rendered; and (3) hardcopy devices, or printers, through which tangible graphics output is produced.
Based on the type of logical interaction, input devices can be broadly classified as: (1) locator devices, such as the graphics tablet, touch panel, mouse, trackball, joystick and keyboard, which indicate a position (e.g., a point coordinate) or orientation; (2) pick devices, such as the light pen, joystick and mouse, which select a graphical object; (3) valuator devices, such as the joystick or trackball, which input scalar values such as rotation angles and scale factors; (4) keyboard or text-input devices; and (5) choice devices, such as keyboard function keys, mouse, touch panel and voice systems, which select menu options. This unit deals exclusively with the various input devices and their different functional capabilities.
There are two broad categories of hardcopy devices: one is the printer and the other is the plotter. Though plotters have limited and specialized uses, the printer is a common yet important accessory of any computer system, especially a graphics system. Most computer graphics creations find their ultimate use in printed or plotted form, for design documentation, exhibition or publication in books and other print media. So it is the quality of the printed or plotted output that makes computer graphics applications appealing to the businesses they cater to. In keeping with this importance, there has been close competition amongst manufacturers in developing newer and cheaper models of hardcopy devices, ranging from low-cost dot matrix printers and popular desk jets to heavy-duty laser jets and sophisticated pen plotters. This unit describes various hardcopy technologies and functional aspects of a variety of printers and plotters.
The display medium for computer graphic-generated pictures has become widely diversified.
Typical examples are CRT-based display, Liquid Crystal, LED and Plasma-based display and
stereoscopic display. CRT display is by far the most common display technology and most of the
fundamental display concepts are embodied in CRT technology. This unit focuses on CRT-based
display technologies explaining the related concepts followed by illustrations of structural and
functional components and working principles of each. The unit also briefly explains a few other common display technologies.
2.2. Input Devices
Various devices are available for data input ranging from general purpose computer systems with
graphic capabilities to sophisticated workstations designed for graphics applications. Among these
devices are graphics tablets, light pens, joysticks, touch panels, data gloves, image scanners, trackballs, digitizers, voice systems and, of course, the common alphanumeric keyboard and mouse.
Given below are the basic functional characteristics and applications of these devices.
2.2.1. Keyboard
With a keyboard, a person can type a document, use keystroke shortcuts, access menus, play games
and perform a variety of other tasks. Though keyboards can have different keys depending on the
manufacturer, the operating system they are designed for, and whether they are attached to a desktop computer or are part of a laptop, most keyboards have between 80 and 110 keys, including:
Control keys (Ctrl, Alt, Del, Pg Up, Pg Dn, Home, End, Esc, Fn, arrow keys, etc.)
Function keys allow users to enter frequently-used operations with a single keystroke and Control
keys allow cursor and screen control. Displayed objects and menus can be selected using the
Control keys.
A keyboard is a lot like a miniature computer. It has its own processor, circuitry (key matrix) and
a ROM storing the character map. It uses a variety of switch technologies.
Though the basic working technology is the same, there are design variations that make keyboards easier and safer to use, more versatile and more elegant. Some of the non-traditional keyboards are the Das
keyboard, Virtual Laser keyboard, True-touch Roll-up keyboard, Ion Illuminated keyboard, and
Wireless keyboard.
2.2.2. Mouse
A mouse is a hand-held pointing device, designed to sit under one hand of the user and to detect
movement relative to its two-dimensional supporting surface. It has become an inseparable part of
a computer system just like the keyboard. A cursor in the shape of an arrow or cross-hair is always associated with a mouse. We reach out for the mouse whenever we want to move the cursor or
activate something or drag and drop or resize some object on display. Drawing or designing figures
and shapes using graphic application packages like AutoCAD, Photoshop, CorelDraw, and Paint
is almost impossible without a mouse.
The mouse’s 2D motion typically translates into the motion of a pointer on a display. In a
mechanical mouse a ball – roller assembly is used; one roller used for detecting X direction motion
and the other for detecting Y direction motion. An optical mouse uses LED and photodiodes (or
optoelectronic sensors) to detect the movement of the underlying surface, rather than moving some
of its parts as in a mechanical mouse. A modern laser mouse uses a small laser instead of an LED.
A mouse may have one, two or three buttons on the top. Usually clicking the primary or leftmost
button will select items or pick screen-points, and clicking the secondary or rightmost button will
bring up a menu of alternative actions applicable to the selected item or specific to the context.
Extra buttons or features are included in the mouse to add more control or dimensional inputs.
2.2.3. Trackball
A trackball is a pointing device consisting of a ball housed in a socket containing sensors to detect
rotation of the ball about two axes—like an upside-down mouse with an exposed protruding ball.
The user rolls the ball with a thumb, fingers, or the palm of the hand to move the cursor. A potentiometer captures the trackball's orientation, which is calibrated with the translation of the cursor on screen. Trackballs are common on CAD workstations for ease of use and, before the
advent of the touchpad, on portable computers, where there may be no desk space on which to use
a mouse.
2.2.4. Joystick
A joystick is used as a personal computer peripheral or general control device consisting of a hand-
held stick that pivots about the base and steers the screen cursor around. Most joysticks are two-
dimensional, having two axes of movement (similar to a mouse), but three-dimensional joysticks
do exist. A joystick is generally configured so that moving the stick left or right signals movement
along the X-axis, and moving it forward (up) or back (down) signals movement along the Y-axis.
In joysticks configured for three-dimensional movement, twisting the stick left (counter-clockwise) or right (clockwise) signals movement along the Z-axis. In a conventional joystick, potentiometers, or variable resistors, are used to dynamically detect the location of the stick, and springs return the stick to the centre position when it is released.
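As an illustration of how a raw potentiometer reading becomes an axis value, the sketch below (a hypothetical helper, not taken from any particular driver) maps an ADC reading to a signed axis value with a small dead zone around the spring-centred rest position:

```python
def axis_value(adc_reading, adc_max=1023, dead_zone=0.05):
    """Map a raw potentiometer ADC reading (0..adc_max) to a
    signed axis value in [-1.0, 1.0], centred at the spring's
    rest position, with a small dead zone to ignore jitter."""
    centre = adc_max / 2.0
    value = (adc_reading - centre) / centre   # -1.0 .. +1.0
    if abs(value) < dead_zone:                # treat near-centre as rest
        return 0.0
    return max(-1.0, min(1.0, value))
```

With a 10-bit ADC, full left reads 0 (axis -1.0), full right reads 1023 (axis +1.0), and a reading near 512 is reported as centred.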
In many joysticks, optical sensors are used instead of analog potentiometers to read stick movement digitally. One of the biggest additions to the world of joysticks is force feedback technology. With a force feedback (also called haptic feedback) joystick, if you are shooting a machine gun in an action game, the stick vibrates in your hands; if you crash your plane in a flight simulator, the stick pushes back suddenly. The stick, in other words, moves in conjunction with the onscreen action.
Joysticks are often used to control games, and usually have one or more push-buttons whose state
can also be read by the computer. Most I/O interface cards for PCs have a joystick (game control)
port. Joysticks were popular during the mid-1990s for playing games and flight simulators, although their use has declined with the rise of the mouse and keyboard.
2.2.5. Graphics Tablet (Digitizer)
A digitizer is a locator device used for drawing, painting, or interactively selecting coordinate
positions on an object. The graphics tablet is one such digitizer, consisting of a flat surface upon which the user may draw an image using an attached stylus, a pen-like drawing apparatus. The
image generally does not appear on the tablet itself but, rather, is displayed on the computer
monitor.
The first graphics tablet resembling contemporary tablets was the RAND Tablet, also known as
the Grafacon (for Graphic Converter). It employed an orthogonal grid of wires under the surface
of the pad. When pressure is applied to a point on the tablet using a stylus, the horizontal wire and
vertical wire associated with the corresponding grid point meet each other, causing an electric
current to flow into each of these wires. Since an electric current is only present in the two wires
that meet, a unique coordinate for the stylus can be retrieved. The coordinates returned are tablet coordinates, which are converted to user or screen coordinates by imaging software. Even if it
doesn’t touch the tablet, proximity of the stylus to the tablet surface can also be sensed by virtue
of a weak magnetic field projected approximately one inch from the tablet surface. It is important
to note that, unlike the RAND Tablet, modern tablets do not require electronics in the stylus and
any tool that provides an accurate ‘point’ may be used with the pad. In some tablets a multiple
button hand-cursor is used instead of stylus. Graphics tablets are available in various sizes and
price ranges—A6-sized tablets being relatively inexpensive and A3-sized tablets being far more
expensive.
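The tablet-to-screen conversion mentioned above is, in the simplest case, a linear scaling. A minimal sketch, assuming both coordinate systems share a top-left origin (the function name and arguments are illustrative):

```python
def tablet_to_screen(tx, ty, tablet_w, tablet_h, screen_w, screen_h):
    """Linearly scale a point from tablet coordinates to screen
    coordinates; assumes both origins are at the top-left corner."""
    sx = tx * screen_w / tablet_w
    sy = ty * screen_h / tablet_h
    return int(sx), int(sy)
```

For a 2000 × 1500 tablet grid driving a 640 × 480 display, the tablet's centre point (1000, 750) maps to the screen centre (320, 240).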
Modern tablets usually connect to the computer via a USB interface. Because of their stylus-based
interface and (in some cases) ability to detect pressure, tilt, and other attributes of the stylus and
its interaction with the tablet, are widely used to create two-dimensional computer graphics. Free-
hand sketches by an artist or drawing following an existing image on the tablet are useful while
digitizing old engineering drawing, electrical circuits and maps and toposheets for GIS. Indeed,
many graphics packages (e.g., Corel Painter, Inkscape, Photoshop, Pixel Image Editor, Studio
Artist, The GIMP) are able to make use of the pressure (and, in some cases, stylus tilt) information
generated by a tablet, by modifying attributes such as brush size, opacity, and color. Three
dimensional graphics can also be created by a 3D digitizer that uses sonic or electromagnetic
transmissions to record positions on a real object as the stylus moves over its surface.
2.2.6. Touch Panel
A touch panel is a display device that accepts user input by means of a touch-sensitive screen. The input is given by touching displayed buttons, menus or icons with a finger. In a typical optical
touch panel LEDs are mounted in adjacent edges (one vertical and one horizontal). The opposite
pair of adjacent edges contain light detectors. These detectors instantly identify which two
orthogonal light beams emitted by the LEDs are blocked by a finger or other pointing device and
thereby record the x, y coordinates of the screen position touched for selection. However, because
of its poor resolution the touch panel cannot be used for selecting very small graphic objects or
accurate screen positions.
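The blocked-beam scheme can be sketched as follows. The helper below is hypothetical; it assumes a fixed beam spacing and reports the touch point as the centre of the interrupted beams, which also shows why the resolution is limited to the beam pitch:

```python
def touched_position(blocked_x_beams, blocked_y_beams, beam_pitch_mm=5.0):
    """Given the indices of the interrupted beams on each axis,
    return the touch point (in millimetres) as the centre of the
    blocked beams. Returns None if either axis has no blocked beam."""
    if not blocked_x_beams or not blocked_y_beams:
        return None
    x = sum(blocked_x_beams) / len(blocked_x_beams) * beam_pitch_mm
    y = sum(blocked_y_beams) / len(blocked_y_beams) * beam_pitch_mm
    return (x, y)
```

A finger wide enough to block beams 3 and 4 horizontally and beam 10 vertically is reported at (17.5 mm, 50.0 mm).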
The other two types of touch panels are electrical (or capacitive) and acoustic. In an electrical
touch panel two glass plates coated with appropriate conductive and resistive materials are placed
face to face similar to capacitor plates. Touching a point on the display panel generates force which
changes the gap between the plates. This in turn causes change in capacitance across the plates
that is converted to the coordinate values of the selected screen position. In the acoustic type, similarly to the light rays, sonic beams are generated from the horizontal and vertical edges of the screen. A sonic beam is obstructed or reflected back by a finger placed at the desired location on the screen. From the travel time of the beams the location of the fingertip is determined.
Touch panels have gained wide acceptance in bank ATMs, video games and railway or tourist
information systems.
2.2.7. Light Pen
A light pen is a pointing device shaped like a pen and is connected to the computer. The tip of the
light pen contains a light-sensitive element (photoelectric cell) which, when placed against the
screen, detects the light from the screen enabling the computer to identify the location of the pen
on the screen. It allows the user to point to displayed objects, or draw on the screen, in a similar
way to a touch screen but with greater positional accuracy. A light pen can work with any CRT-based monitor, but not with LCD screens, projectors or other display devices.
The light pen actually works by sensing the sudden small change in brightness of a point on the
screen when the electron gun refreshes that spot. By noting exactly where the scanning has reached
at that moment, the x, y position of the pen can be resolved. The pen position is updated on every
refresh of the screen.
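The timing argument above can be sketched in code. The helper below is illustrative only: it ignores the horizontal and vertical blanking intervals, which a real implementation must account for:

```python
def pen_position(elapsed, line_time, pixels_per_line, active_lines):
    """Resolve the (x, y) pixel under a light pen from the time
    elapsed since the start of the frame's raster scan.
    elapsed   -- time from vertical sync to the sensed brightness pulse
    line_time -- time taken to scan one horizontal line
    (Same time units for both; blanking intervals are ignored.)"""
    line = int(elapsed // line_time)            # y: completed scan lines
    frac = (elapsed % line_time) / line_time    # progress along the line
    x = int(frac * pixels_per_line)
    y = min(line, active_lines - 1)
    return x, y
```

With a (deliberately simple) line time of 2 units and 100 pixels per line, a pulse sensed at t = 21 resolves to pixel (50, 10): ten full lines scanned, half-way along the eleventh.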
Light pens are popularly used to digitize maps, engineering drawings, signatures and handwriting.
2.2.8. Data Glove
The data glove is an interface device that uses position tracking sensors and fiber optic strands that
run down each finger and are connected to a compatible computer; the movements of the hand and fingers are displayed live on the computer monitor, which in turn allows the user to virtually touch
an object displayed on the same monitor. With the object animated it would appear that the user
(wearing the data glove) can pick up an object and do things with it just as he would do with a real
object. In modern data glove devices, tactile sensors are used to provide the user with an additional
feeling of touch or the amount of pressure or force the fingers or hands are exerting even though
the user is not actually touching anything. Thus the data glove is an agent that transports the user into virtual reality.
2.2.9. Voice-System
The voice-system or speech recognition system is a sophisticated input device that accepts voice
or speech input from the user and transforms it into digital data that can be used to trigger graphic
operations or enter data in specific fields. A dictionary is established for a particular operator
(voice) by recording the frequency-patterns of the voice commands (words spoken) and
corresponding functions to be performed. Later when a voice command is given by the same
operator, the system searches for a frequency-pattern match in the dictionary and if found the
corresponding action is triggered. If a different operator is to use the system then the dictionary
has to be re-established with the new operator’s voice patterns.
2.3.1. Printer
The printer is an important accessory of any computer system, especially a graphics system. This is because most graphics created with a computer find their ultimate use in printed form, for documentation, exhibition or publication in print media or books. It is the quality of the printed output that finally matters in many businesses.
Based on the available printing technology, the major factors controlling printer quality are the individual dot size on the paper and the number of dots per inch (dpi). Clearly, the smaller the dots, the finer the detail of the reproduced figure. Higher dpi values increase the sharpness and detail of a figure and enhance the intensity levels a printer can support. Other important factors in selecting a printer are printing speed and print area or printer memory.
There are several major printer technologies available. These technologies can be broken down
into two main categories with several types in each:
Impact: These printers have a mechanism whereby formed character faces are pressed
against an inked ribbon onto the paper in order to create an image. For example, dot matrix
printer and line printer.
Non-impact: These printers do not touch the paper; rather, they use laser techniques, ink sprays, xerographic processes and electrostatic methods to produce the image on paper. For example, the laser printer, inkjet printer, electrostatic printer, drum plotter and flatbed plotter.
A dot matrix printer refers to a type of computer printer with a print head (usually containing 9 to
24 pins) that runs back and forth on the page and prints by impact, striking an ink-soaked cloth
ribbon against the paper, much like a typewriter. Unlike a typewriter or daisywheel printer, letters
are drawn out of a dot matrix, and thus, varied fonts and arbitrary graphics can be produced.
Because the printing involves mechanical pressure, these printers can create carbon copies. The print head normally prints along every raster row of the printer paper, and the colour of the print is the colour of the ribbon's ink.
Figure 2.10
Each dot is produced by a tiny yet stiff metal rod, also called a ‘wire’ or ‘pin’, which is driven
forward by the power of a tiny electromagnet or solenoid, either directly or through small levers
(pawls). The pins are usually arranged vertically, with marginal offsets between columns to reduce inter-dot spacing. The arrangement of pins in the print head ultimately limits the quality of such a printer.
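Column-by-column printing from a glyph bitmap can be sketched like this; the 5 × 7 glyph and helper function below are hypothetical illustrations, not an actual printer font:

```python
# Hypothetical 5x7 glyph for 'T', one string per pin row;
# '#' means the pin fires (strikes a dot) in that column.
GLYPH_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]

def fire_pattern(glyph, column):
    """Return which of the vertically stacked pins must fire at one
    head position (column), top pin first -- mimicking how the head
    builds a character column by column as it sweeps across the page."""
    return [row[column] == "#" for row in glyph]
```

At column 0 only the top pin fires (the left end of the crossbar); at the centre column all seven pins fire to draw the stem of the 'T'.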
Hardware improvements to dot matrix printers boosted the carriage speed, added more (typeface) font options, increased the dot density (from 60 dpi up to 240 dpi), and added pseudo-colour printing through multi-colour ribbons. Still, such printers lack the ability to print computer-generated images of acceptable quality. They are good for text printing on continuous sheets.
Strictly speaking, ‘dot matrix’ in this context is a misnomer, as nearly all inkjet, thermal, and laser
printers produce dot matrices. However, in common parlance these are seldom called ‘dot matrix’
printers, to avoid confusion with dot matrix impact printers.
The line printer is a form of high speed impact printer in which a line of type is printed at a time.
In a typical design, a fixed font character set is engraved onto the periphery of a number of print wheels, the number matching the number of columns (letters in a line). The wheels spin at high speed while the paper and an inked ribbon are moved past the print position. As the desired character
for each column passes the print position, a hammer strikes the paper and ribbon causing the
desired character to be recorded on the continuous paper. Printed type is set at fixed positions and a line can consist of any number of character positions, with 132 columns the most common, though 80-, 128- and 160-column variants are also in use. Other variations of the line printer
have the type on moving bars or a horizontal spinning chain.
The line printer technology is usually both faster and less expensive (in total ownership) than laser
printers. It has its use in medium volume accounting and other large business applications, where
print volume and speed is a priority over quality. Because of the limited character set engraved on
the wheels and the fixed spacing of type, this technology was never useful for material of high
readability such as books or newspapers.
An inkjet printer is a non-impact printer that places extremely small droplets of ink onto the paper
to create an image. These printers are popular because they are less costly yet generate attractive graphic output.
The dots sprayed on paper are extremely small (usually between 50 and 60 microns in diameter),
and are positioned very precisely, with resolutions of up to 1440 × 720 dpi. The dots can have
different colours combined together to create photo-quality images.
The core of an inkjet printer is the print head that contains a series of nozzles that are used to spray
drops of ink. The ink is contained in ink cartridges that come in various combinations, such as
separate black and colour cartridges, or a cartridge for each ink colour. A stepper motor moves the
print head assembly (print head and ink cartridges) back and forth across the paper. The mechanical
operation of the printer is controlled by a small circuit board containing a microprocessor and
memory.
There are two main inkjet technologies currently used by printer manufacturers.
Thermal bubble (or bubble jet): This is used by manufacturers such as Canon and Hewlett
Packard. In a thermal inkjet printer, tiny resistors create heat, and this heat vaporizes ink
to create a bubble. As the bubble expands, some of the ink is pushed out of a nozzle onto
the paper. When the bubble ‘pops’ (collapses), a vacuum is created. This pulls more ink
into the print head from the cartridge. A typical bubble jet print head has 300 or 600 tiny
nozzles, and all of them can fire a droplet simultaneously.
Piezoelectric: Patented by Epson, this technology uses piezo crystals. A crystal is located
at the back of the ink reservoir of each nozzle. The crystal receives a tiny electric charge
that causes it to vibrate. When the crystal vibrates inward, it forces a tiny amount of ink out of the nozzle. When it vibrates outward, it pulls more ink into the reservoir to replace the ink sprayed out.
Figure 2.12
The laser printer employs technology similar to that of a photocopy machine. A laser beam scans a positively charged selenium-coated rotating drum. The laser removes the positive charge from the drum except for the area to be printed (the black portion of the page). In this way, the laser
draws the letters and images to be printed as a pattern of electrical charges — an electrostatic image. The negatively charged black toner powder adheres to this positively charged area (image) on the drum, from where it is transferred to the rolling white paper. Before the paper rolls
under the drum, it is given a positive charge stronger than the positive charge of the electrostatic
image, so the paper can pull the toner powder away. The paper is then subjected to mild heating
to melt and fix the loose toner on the paper. The laser printer is mainly a bi-level printer. In case
of colour lasers, this process is repeated three times.
For the printer controller and the host computer to communicate, they need to speak the same page
description language. The primary printer languages used nowadays are Hewlett-Packard's Printer Command Language (PCL) and Adobe's PostScript. Both these languages describe the page in
vector form — that is, as mathematical values of geometric shapes, rather than as a series of dots
(a bitmap image). Apart from image data the printer controller receives all of the commands that
tell the printer what to do — what paper to use, how to format the page, how to handle the font,
etc. Accordingly the controller sets the text margins, arranges the words and places the graphics.
When the page is arranged, the raster image processor (RIP) takes the page data, either as a whole
or piece by piece, and breaks it down into an array of tiny dots so the laser can write it out on the
photoreceptor drum. In most laser printers, the controller saves all print-job data in its own
memory. This lets the controller put different printing jobs into a queue so it can work through
them one at a time.
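A toy version of the RIP's job, turning one vector command into a grid of dots, can be sketched with Bresenham's line algorithm. This is only a stand-in for what PCL or PostScript interpreters do at far greater sophistication:

```python
def rasterize_line(x0, y0, x1, y1):
    """Toy raster image processor step: convert one vector 'line'
    command into the grid dots the laser must expose, using
    Bresenham's line algorithm (integer arithmetic only)."""
    dots = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        dots.append((x0, y0))          # expose this dot on the drum
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return dots
```

A diagonal from (0, 0) to (3, 3) becomes the four dots (0,0), (1,1), (2,2), (3,3); a real RIP repeats this kind of conversion for every shape on the page.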
Figure 2.13
In inkjet printers, the single printing head moves left-to-right and prints as it is traveling. In
contrast, the electrostatic printer has many print heads, covering the entire 36" media width. So instead of a single print head moving across the width of the media, the electrostatic printer prints an entire width of the page at one time. The media (paper, vellum, film) is
electrostatically charged (energized). The toner solution is circulated past the media and ‘sticks’
to the energized portion of the media, thus producing a very fast high quality image.
2.3.8. Plotter
In contrast to the printer which is primarily a raster scan device, the plotter is a vector device. In
colour plotters the carriage accommodates a number of pens with varying colours and widths. The
microprocessor in the plotter receives instructions from the host computer and executes commands
like ‘move’ (moving the carriage to a given position with pens up) and ‘draw’ (drawing geometric
entities like point, line, arc, circle etc. with pens down). Since the plotter is a vector device it can directly reach specific positions on the paper without following a raster row sequence. In a flatbed plotter the paper lies flat and stationary while the pen moves from one location to another on the
paper. But in drum plotters the paper itself slides on a cylindrical drum and the pen moves over
the drum.
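The 'move'/'draw' command pair corresponds to the pen-up (PU) and pen-down (PD) instructions of pen-plotter languages such as HP-GL. A minimal sketch of generating such commands for a polyline (the output format is simplified, not complete HP-GL):

```python
def plot_polyline(points):
    """Emit HP-GL-style commands for a polyline: PU (pen up) to
    travel to the first point, then PD (pen down) through the
    rest -- a sketch of the 'move' / 'draw' distinction."""
    if not points:
        return []
    cmds = ["PU%d,%d;" % points[0]]               # move with pen raised
    cmds += ["PD%d,%d;" % p for p in points[1:]]  # draw with pen lowered
    return cmds
```

For the right-angle path (0,0) → (100,0) → (100,100) this yields one pen-up move followed by two pen-down draws.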
2.4.1. Video
Just like text, audio and still images, digital video is a powerful element of multimedia systems. To understand how digital video is used as a medium we need to understand some fundamental aspects of analog video technology.
Basically, video or motion pictures are created by displaying images depicting progressive stages of motion at a rate fast enough that the projections of the individual images overlap in the eye. The persistence of vision of the human eye, which allows any projected image to persist for 40–50 ms, requires a frame rate of 25–30 frames per second to ensure the perception of smooth motion.
In a video display:
Horizontal resolution is the number of distinct vertical lines that can be produced in a
frame.
Vertical resolution is the number of horizontal scan lines in a frame.
Aspect ratio is the width-to-height ratio of a frame.
Interlace ratio is the ratio of the field rate to the frame rate.
In terms of signal constitution there are three types of video signals: component video, composite video and S-video. Most computer systems and high-end video systems use component video, whereby three
signals R, G and B are transmitted through three separate wires corresponding to red, green and
blue image planes respectively.
However, because of the complexities of transmitting the three signals of component video in exact
synchronism and relationship these signals are encoded using a frequency-interleaving scheme
into a composite format that can be transmitted through a single cable. Such a format, known as composite video and used by most video systems and broadcast TV, uses one luminance and two chrominance signals. Luminance (Y) is a monochrome video signal that controls only the brightness of an image. Chrominance is actually two signals (I and Q, or U and V), called colour differences (B–Y, R–Y), which carry the colour information of an image. Each chrominance component is allocated half as much bandwidth as the luminance, a form of analog data compression, justified by the fact that human eyes are less sensitive to variations in colour than to variations in brightness. Theoretically, there are infinite possible (additive) combinations of R, G, B signals that produce the Y, I, Q or Y, U, V signals. The common CCIR 601 standard defines the exact combinations to be used.
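For reference, the widely quoted luminance weighting of CCIR 601 (Rec. 601) is:

```latex
Y = 0.299\,R + 0.587\,G + 0.114\,B
```

with the chrominance components formed as scaled colour differences, conventionally

```latex
U = 0.492\,(B - Y), \qquad V = 0.877\,(R - Y)
```

(The U, V scale factors shown are the conventional PAL-style values; the exact scaling differs between systems.)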
Unlike composite video, S-video (separated video, or super video as in S-VHS) uses two wires, one
for luminance and another for a composite chrominance signal. Component video gives the best
output since there is no cross-talk or interference between the different channels unlike composite
video or S-video.
In a television transmission system, every part of every moving image is converted into analog
electronic signals and is transmitted. The VCR can store TV signal on magnetic tapes, which can
be played to reproduce stored images. There are three main standards for analog video signals used
in television transmission: NTSC, SECAM and PAL. A characteristic comparison of these
standards is listed in Table 2.1.
For video to be played and processed on computers it needs to be converted from analog to digital representation. Video digitization is achieved, just like audio digitization, by sampling the analog video signal at a preset frequency and subsequently quantizing the discrete samples into digital format. This is done with the help of an analog-to-digital converter, or ADC.
There are two kinds of possible digitizations or digital coding–Composite coding and Component
coding. In composite coding, all signal components taken together as a whole are converted into
digital format. In component coding, each signal component is digitized separately using different
sampling frequency (13.5 MHz for luminance, 6.75 MHz for chrominance).
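At 8 bits per sample, these sampling frequencies give the well-known uncompressed rate of CCIR 601 component (4:2:2) coding, which can be checked directly:

```python
def component_data_rate_mbps(y_mhz=13.5, c_mhz=6.75, bits=8):
    """Uncompressed data rate (Mbit/s) of 4:2:2 component coding:
    one luminance stream plus two chrominance streams, each
    chrominance stream sampled at half the luminance frequency."""
    return (y_mhz + 2 * c_mhz) * bits

print(component_data_rate_mbps())  # 216.0 Mbit/s
```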
Based on whether the ADC is inside a digital camera, or in an external unit or inside the computer
there can be three types of digital video systems – Digital Camera-based System, External ADC
System and Video Frame Grabber Card-based System.
The main function of the video frame grabber card is to take the composite (luminance-
chrominance) analog video signal, decode it to RGB signal, then convert it to the digital format
and store each frame first in the frame buffer on the card itself. At an adequate frame-rate
consecutive frames are streamed to the monitor, routed through the framebuffer of the main
memory to present live video on the computer screen. The Sun video digitizer from Sun
Microsystems captures NTSC video signal (RGB) with frame resolution of 320 × 240 pixels,
quantization of 8 bits/pixel and a frame rate of 30 frames per second. How closely the digital video approximates the original analog video depends on the sampling resolution, or the number of bits used to represent each pixel value.
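The raw data rate implied by such digitizer figures follows from a simple product (an illustrative helper, assuming a whole number of bytes per pixel):

```python
def raw_video_rate_bytes(width, height, bits_per_pixel, fps):
    """Bytes per second produced by an uncompressed digitized
    video stream."""
    return width * height * (bits_per_pixel // 8) * fps

# 320 x 240 frames, 8 bits/pixel, 30 frames/s:
print(raw_video_rate_bytes(320, 240, 8, 30))  # 2304000 (about 2.3 MB/s)
```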
1. Storing video in digital device memory, ready to be processed (noise removal, cut and
paste, size and motion control and so on) and integrated into various multimedia
applications, is possible.
2. It allows direct access, which makes non-linear video editing (audio mixing, adding text,
titles and digital effects etc.) simple.
3. It allows repeated recording without degradation of image quality.
4. Ease of encryption and better tolerance to channel noise is possible.
To improve picture quality and transmission efficiency, new-generation television systems are designed based on international standards that exploit the advantages of digital signal processing. These standards include High Definition Television (HDTV), Improved Definition Television (IDTV), Double Multiplexed Analog Components (D2-MAC) and Advanced Compatible Television, First System (ACTV-I). The HDTV standard supports progressive (non-interlaced) video scanning, and has a much wider aspect ratio (16:9 instead of 4:3), a greater field of view, higher horizontal and vertical resolution, and more bandwidth (9 MHz in the USA) than conventional colour systems.
Of the different multimedia elements, the need for compression is greatest for video, as the data volume for Full Screen Full Motion (FSFM) video is very high. The frame size for NTSC video is 640 pixels × 480 pixels, and if we use a 24-bit colour depth then each frame occupies 640 × 480 × 3 bytes, i.e., 900 KB. So each second of NTSC video, comprising 30 frames, occupies 900 × 30 KB, which is around 26 MB, and each minute occupies 26 × 60 MB ≈ 1.5 GB. Thus a 600 MB CD would contain at most about 22 seconds of FSFM video. Now imagine the storage space required for a 2-hour movie. So the only way to achieve digital motion video on a PC is to reduce or compress the redundant data in video files.
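The storage arithmetic above can be checked directly (a sketch; 600 MB is taken as 600 × 2^20 bytes):

```python
# Checking the NTSC FSFM storage arithmetic.
W, H, BYTES_PER_PIXEL, FPS = 640, 480, 3, 30

frame_bytes = W * H * BYTES_PER_PIXEL        # 921,600 B per frame
second_bytes = frame_bytes * FPS             # ~26.4 MB per second
minute_bytes = second_bytes * 60             # ~1.5 GB per minute
cd_seconds = 600 * 2**20 / second_bytes      # playtime on a 600 MB CD

print(frame_bytes // 1024)  # 900 (KB per frame)
print(round(cd_seconds))    # 23 (roughly 22-23 seconds of video)
```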
Redundancy in digital video occurs when the same information is transmitted more than once.
Primarily in any area of an image frame where same colour or intensity spans more than one pixel
location, there is spatial redundancy.
Secondly, when a scene is stationary or only slightly moving, there is redundancy between frames
of motion sequence – the contents of consecutive frames in time are similar, or they may be related
by a simple translation function. This kind of redundancy is called temporal redundancy.
Spatial redundancy is removed by compressing each individual image frame in isolation; the
techniques used are generally called spatial compression or intra-frame compression. Temporal
redundancy is removed by storing only the differences between subsequent frames instead of
compressing each frame independently; this technique is known as temporal compression or
inter-frame compression.
Spatial compression applies lossless and lossy methods like those applied to still images.
The most simplistic approach to temporal compression is to perform a pixel-by-pixel comparison
(subtraction) between two consecutive frames. The comparison produces zero for pixels which
have not changed, and non-zero for pixels which are involved in motion. Then only the pixels
with non-zero differences need be coded and stored, reducing the burden of storing all the pixel
values of a frame. But there are certain problems with this approach. Firstly, even if there is no
object motion in a frame, the slightest movement of the camera would produce non-zero differences
for all or most pixels. Secondly, quantization noise would yield non-zero differences for stationary
pixels.
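The differencing scheme just described can be sketched as follows; representing frames as flat lists of intensity values is an illustrative assumption, not the handout's representation:

```python
# Minimal sketch of temporal compression by pixel-by-pixel frame differencing.

def frame_difference(prev_frame, curr_frame, threshold=0):
    """Return (index, difference) pairs for pixels that changed.

    A small threshold can absorb quantization noise (one of the problems
    noted above); camera motion, however, still defeats this scheme.
    """
    changed = []
    for i, (p, c) in enumerate(zip(prev_frame, curr_frame)):
        if abs(c - p) > threshold:
            changed.append((i, c - p))
    return changed

prev = [10, 10, 10, 50]
curr = [10, 12, 10, 90]   # pixel 1 changed slightly, pixel 3 changed a lot
print(frame_difference(prev, curr))               # [(1, 2), (3, 40)]
print(frame_difference(prev, curr, threshold=5))  # [(3, 40)]
```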
In an alternative approach, the motion generated by the camera and/or objects can be 'compensated'
for by detecting the displacements (motion vectors) of corresponding pixel blocks or regions in the
frames and measuring the differences of their content (prediction error). Such an approach to
temporal compression is said to be based on motion compensation. For efficiency, each image is
divided into macroblocks of size N × N. The current image frame is referred to as the target frame.
Each macroblock of the target frame is examined with reference to the most similar macroblocks
in a previous and/or next frame, called the reference frame. This examination is known as forward
prediction or backward prediction depending on whether the reference frame is a previous frame
or a next frame. If the target macroblock is found to contain no motion, a code is sent to the
decompressor to leave the block the way it was in the reference frame. If the block does have
motion, the motion vector and difference block need to be coded so that the decompressor can
reproduce the target block from the code.
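The block-matching idea can be sketched as an exhaustive search that minimises the sum of absolute differences (SAD); the function names, the tiny 4 × 4 frames and the search range are illustrative assumptions, not MPEG-mandated:

```python
# Hedged sketch of exhaustive block-matching motion estimation.

def sad(ref, tgt, rx, ry, tx, ty, n):
    """Sum of absolute differences between the n x n block of ref at
    (rx, ry) and the n x n block of tgt at (tx, ty); frames are 2-D lists."""
    return sum(abs(ref[ry + j][rx + i] - tgt[ty + j][tx + i])
               for j in range(n) for i in range(n))

def best_motion_vector(ref, tgt, tx, ty, n, search=2):
    """Return ((dx, dy), error) minimising SAD over the search window."""
    h, w = len(ref), len(ref[0])
    best = ((0, 0), float("inf"))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = tx + dx, ty + dy
            if 0 <= rx <= w - n and 0 <= ry <= h - n:
                err = sad(ref, tgt, rx, ry, tx, ty, n)
                if err < best[1]:
                    best = ((dx, dy), err)
    return best

# A 4x4 'image' whose 2x2 bright block shifts one pixel to the right:
ref = [[0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
tgt = [[0, 0, 9, 9], [0, 0, 9, 9], [0, 0, 0, 0], [0, 0, 0, 0]]
print(best_motion_vector(ref, tgt, tx=2, ty=0, n=2))  # ((-1, 0), 0)
```

A zero prediction error, as here, means only the motion vector needs to be coded; real coders store the vector plus the (quantized) difference block.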
2.4.1.4. MPEG
MPEG is the international standard for audio/video digital compression, and MPEG-1 is most
relevant for video at low data rates (up to 1.5 Mbit/s) to be incorporated in multimedia. MPEG-1 is
a standard in 5 parts, namely Systems, Video, Audio, Conformance Testing and Software
Simulation (a full C-language implementation of the MPEG-1 encoder and decoder). Though
later standards like MPEG-2, MPEG-4, MPEG-7 and MPEG-21 have evolved in search of higher
compression ratios, better video quality, effective communication and technological upgradation,
we will discuss MPEG-1 only, for an understanding of the basic MPEG scheme.
2.4.1.5. MPEG-1
The MPEG-1 standard doesn't actually define a compression algorithm; it defines a data stream
syntax and a decompressor. The data stream architecture is based on a sequence of frames, each of
which contains the data needed to create a single displayed image. There are four different kinds of
frames, depending on how each image is to be decoded:
I-frames (Intra-coded images) are self-contained, i.e., coded without any reference to other images.
These frames are purely spatially compressed using a transform coding method similar to JPEG.
The compression ratio for I-frames is the lowest within MPEG. An I-frame must exist at the start
of any video stream and also at any random access entry point in the stream.
P-frames (Predictive-coded images) are compressed images resulting from the removal of temporal
redundancy between successive frames. These frames are coded by a forward predictive coding
method in which target macroblocks are predicted from the most similar reference macroblocks in
the preceding I or P-frame. Only the difference between the spatial locations of the macroblocks,
i.e., the motion vector, and the difference in content of the macroblocks are coded. When a good
match for a reference macroblock is not found, the target macroblock is coded as a non-motion-
compensated macroblock instead of as a difference macroblock. Usually a large compression ratio
(about three times that of I-frames) is achieved in P-frames.
B-frames (Bidirectionally predictive-coded images) are coded using both a preceding and a
following I or P-frame as reference frames; interpolation between the matching reference
macroblocks in the two frames is used for generating the difference macroblock. The maximum
compression ratio (about one and a half times that of P-frames) is achieved in B-frames.
D-frames (DC-coded frames) are intraframe coded and are used for fast forward or fast rewind
modes.
Macroblocks, which are the basic frame elements for predictive encoding, are partitioned into
16 × 16 pixels for the luminance component and 8 × 8 pixels for each of the chrominance
components. Hence, for the 4:2:0 chroma subsampling employed in MPEG coding, a macroblock
(including a difference macroblock) consists of four Y blocks, one Cb and one Cr block, each of
size 8 × 8 pixels. As in JPEG, while coding, a DCT transform is applied to each 8 × 8 block, which
then undergoes quantization, zig-zag scan and entropy coding.
As far as the sequence of I, P and B-frames in an MPEG-1 video datastream is concerned, there are
certain guiding factors like resolution, access speed and compression ratio. For fast random access,
the best resolution would be achieved by coding the whole datastream as I-frames. On the other
hand, the highest degree of compression is attained by using as many B-frames as possible.
However, to perform B-frame decoding, the future I or P-frame involved must be transmitted
before any of the dependent B-frames can be processed. This causes a delay in decoding
proportional to the number of B-frames in the series. Considering all these issues, an optimized and
proven sequence is 'IBBPBBPBBI'.
MPEG uses M to indicate the interval between a P-frame and its preceding I or P frame and N to
indicate the interval between two consecutive I-frames.
1. Sequence layer: A video sequence consists of one or more groups of pictures and always
starts with a sequence header. The header contains picture information such as horizontal
and vertical size, aspect ratio, frame rate, bit rate, and buffer size. These parameters
can be changed with optional sequence headers between GOPs.
2. Group of Pictures (GOP) layer: A GOP contains one or more pictures or frames, at least
one of which should be an I-frame. At this layer it is possible to distinguish the order of
frames in the datastream from that in the display. The decoder decodes the I-frame first in
the datastream, but in the order of display a B-frame can occur before an I-frame.
Datastream order:
Type of frame:  I   B   B   P   B   B   P   B   B   I
Frame number:   1   2   3   4   5   6   7   8   9   10

Display order:
Type of frame:  B   B   I   B   B   P   B   B   P
Frame number:   1   2   3   4   5   6   7   8   9
3. Picture layer: This layer contains a displayable picture. Pictures are of 4 types – I, B, P and
D, though MPEG-1 doesn't allow mixing D-pictures with other types.
4. Slice layer: Each slice consists of a number of macroblocks, and this number may vary from
slice to slice within a single image. The length and position of each slice are specified in the
layer header. Slices are useful for recovery and synchronization in case of lost or corrupted bits.
5. Macroblock layer: Each image is divided into macroblocks. Each macroblock contains four
Y blocks, one Cb block and one Cr block each of size 8 × 8. The coded blocks are preceded
by a macroblock header, which contains all the control information (spatial address, motion
vectors, prediction modes, quantizer step size etc.) belonging to the macroblock.
6. Block layer: Each 8 × 8 pixel block, the lowest-level entity in the stream, can be intra-
coded (using DCT), motion compensated or interpolated. Each block contains the DC
coefficient first, followed by variable length codes (VLC), and is terminated by an
end-of-block marker.
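The datastream-to-display reordering shown in the GOP layer example above can be sketched as follows; representing frames as (type, number) pairs is an illustrative assumption:

```python
# Sketch: convert MPEG datastream (decode) order to display order.

def datastream_to_display(stream):
    """Each reference frame (I or P) is transmitted before the B-frames
    that depend on it, so it is held back and emitted only when the next
    reference frame arrives."""
    display, pending = [], None
    for frame in stream:
        if frame[0] == "B":
            display.append(frame)        # B-frames display as they come
        else:
            if pending is not None:
                display.append(pending)  # previous reference now displays
            pending = frame              # hold the new reference frame
    return display   # the last reference still awaits its trailing B-frames

stream = [("I", 1), ("B", 2), ("B", 3), ("P", 4), ("B", 5),
          ("B", 6), ("P", 7), ("B", 8), ("B", 9), ("I", 10)]
print([t for t, _ in datastream_to_display(stream)])
# ['B', 'B', 'I', 'B', 'B', 'P', 'B', 'B', 'P']
```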
An MPEG video data stream is not of much use on its own, except for playing silent video clips
over the Internet or from a server. The MPEG-2 standard defines the MPEG Transport Stream,
which transmits combined (multiplexed and packetized) video, audio and ancillary data streams
together. The Programme Map Tables (PMTs) identify which audio and video signals go together
to make up a particular programme out of the several channels transmitted.
We have seen that MPEG only defines a video data stream and not a video file format. Several
approaches have been proposed for standardization of video file architectures, but QuickTime has
established itself as a de facto standard. Another widely used video file format is Microsoft's AVI
for Windows.
For building multimedia applications, digitized video often needs to be edited using
specialized software. Adobe Premiere is a popular mid-range non-linear video editing application
that provides some post-production facilities on desktop platform. Three windows are used by
Premiere namely Project, Timeline and Monitor. Project window is used for importing and
displaying raw video and audio clips and still images with all relevant information. The timeline
window provides a visual display of the linear extent of the completed movie, showing the order
of its component clips. It uses multiple audio and video tracks for transitions and overlays. Audio
and video clips can be dragged from the project window and dropped on the timeline for assembly.
The monitor window is used for editing and viewing the video frames. Editing includes trimming,
overlaying, applying effects like dissolve, wipes, spins and page turns for transition of one clip to
another. Serious post-production operations like colour and contrast corrections, blurring
or sharpening of images, element insertion and compositing, applying a filter to a clip and varying
it over time, and sophisticated interpolation between key frames can be done with more control
and perfection by using dedicated post-production software like Adobe After Effects.
The most prominent part of a personal computer is the display system that makes graphic display
possible. The display system may be attached to a PC to display character, picture and video
outputs. Several types of display systems are available in the market.
The display systems are often referred to as Video Monitor or Video Display Unit (VDU). The
most common video monitor that normally comes with a PC is the Raster Scan type. However,
every display system has three basic parts – the display adapter that creates and holds the image
information, the monitor which displays that information and the cable that carries the image data
between the display adapter and the monitor. Before we discuss the major display systems let us
first know about some basic terms.
2.4.2.1. Pixel
A pixel may be defined as the smallest size object or colour spot that can be displayed and
addressed on a monitor. Any image that is displayed on the monitor is made up of thousands of
such small pixels (also known as picture elements). The closely-spaced pixels divide the image
area into a compact and uniform two-dimensional grid of pixel lines and columns. Each pixel has
a particular colour and brightness value. Though the size of a pixel depends mostly on the size of
the electron beam within the CRT, they are too fine and close to each other to be perceptible by
the human eye. The finer the pixels, the greater the number of pixels displayable on a monitor screen.
However, it should be remembered that the number of pixels in an image is fixed by the program
that creates the image and not by the hardware that displays it.
2.4.2.2. Resolution
There are two distinctly different terms which are often confused: one is image resolution and
the other is screen resolution. Strictly speaking, image resolution refers to the pixel spacing, i.e.,
the distance from one pixel to the next. A typical PC monitor displays screen images with a
resolution somewhere between 25 pixels per inch and 80 pixels per inch (ppi). More commonly,
however, resolution of an image refers to the total number of pixels along the entire height and
width of the image. For example, a full-screen image with resolution 800 × 600 means that there
are 800 columns of pixels, each column comprising 600 pixels, i.e., a total of 800 × 600 = 480,000
pixels in the image area.
The internal surface of the monitor screen is coated with red, green and blue phosphor material
that glows when struck by a stream of electrons. This coated material is arranged into an array of
millions of tiny cells–red, green and blue, usually called dots. The dot pitch is the distance between
adjacent sets (triads) of red, green and blue dots. This is the same as the shortest distance between
any two dots of the same colour, i.e., from red to red, or green to green. Usually monitors
are available with dot pitch specifications from 0.25 mm to 0.40 mm. Each dot glows with a single
pure colour (red, green or blue) and each glowing triad appears to our eye as a small spot of colour
(a mixture of red, green and blue). Depending on the intensities of the red, green and blue colours,
different triads produce different colours. The dot pitch of the monitor thus indicates how fine the
coloured spots that make up the picture can be, though the electron beam diameter is also an
important factor in determining the spot size.
Pixel therefore, is the smallest element of a displayed image, and dots (red, green and blue) are the
smallest elements of a display surface (monitor screen). The dot pitch is the measure of screen
resolution. The smaller the dot pitch, the higher the resolution, sharpness and detail of the image
displayed.
In order to use different resolutions on a monitor, the monitor must support automatic changing of
resolution modes. Originally, monitors were fixed at a particular resolution, but for most monitors
today display resolution can be changed using software control. This lets you use higher or lower
resolution depending on the need of your application. A higher resolution display allows you to
see more information on your screen at a time and is particularly useful for operating systems such
as Windows. However, the resolution of an image you see is a function of what the video card
outputs and what the monitor is capable of displaying. To see a high resolution image such as 1280
× 1024 you require both a video card capable of producing an image this large and a monitor
capable of displaying it.
If the image resolution is more as compared to the inherent resolution of the display device, then
the displayed image quality gets reduced. As the image has to fit in the limited resolution of the
monitor, the screen pixels (comprising a red, a green and a blue dot) show the average colour and
brightness of several adjacent image pixels. Only when the two resolutions match, will the image
be displayed perfectly and only then is the monitor used to its maximum capacity.
Table 2.2. Common Resolutions, Respective Number of Pixels and Standard Aspect Ratios
The aspect ratio of the image is the ratio of the number of X pixels to the number of Y pixels. The
standard aspect ratio for PCs is 4:3, and some resolutions even use a ratio of 5:4. Monitors are
calibrated to the 4:3 standard so that when you draw a circle it appears to be a circle and not an
ellipse. Displaying an image that uses an aspect ratio of 5:4 on such a monitor will cause the image
to appear somewhat distorted. The only mainstream resolution that uses 5:4 is the high-resolution
1280 × 1024.
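The pixel-count and aspect-ratio arithmetic above can be verified with a short sketch (the function names are illustrative):

```python
from math import gcd

def total_pixels(width, height):
    """Total number of pixels in an image of the given resolution."""
    return width * height

def aspect_ratio(width, height):
    """Reduce width:height to lowest terms, e.g. 800x600 -> 4:3."""
    g = gcd(width, height)
    return (width // g, height // g)

print(total_pixels(800, 600))    # 480000
print(aspect_ratio(800, 600))    # (4, 3)
print(aspect_ratio(1280, 1024))  # (5, 4)
```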
This type of display basically employs a Cathode Ray Tube (CRT) or LCD Panel for display. The
CRT works just like the picture tube of a television set. Its viewing surface is coated with a layer
of arrayed phosphor dots. At the back of the CRT is a set of electron guns (cathodes) which produce
a controlled stream of electrons (electron beam). The phosphor material emits light when struck
by these high-energy electrons. The frequency and intensity of the light emitted depends on the
type of phosphor material used and energy of the electrons. To produce a picture on the screen,
these directed electron beams start at the top of the screen and scan rapidly from left to right along
the row of phosphor dots. They return to the left-most position one line down and scan again, and
repeat this to cover the entire screen. The return of the beam to the leftmost position one line down
is called horizontal retrace during which the electron flow is shut off. In performing this scanning
or sweeping type motion, the electron guns are controlled by the video data stream that comes into
the monitor from the video card. This varies the intensity of the electron beam at each position on
the screen. The instantaneous control of the intensity of the electron beam at each dot is what
controls the colour and brightness of each pixel on the screen. All this happens very quickly, and
the entire screen is drawn in a fraction (say, 1/60th) of a second.
An image in raster scan display is basically composed of a set of dots and lines; lines are displayed
by making those dots bright (with the desired colour) which lie as close as possible to the shortest
path between the endpoints of a line.
When a dot of phosphor material is struck by the electron beam, it glows for a fraction of a second
and then fades. As brightness of the dots begins to reduce, the screen-image becomes unstable and
gradually fades out.
In order to maintain a stable image, the electron beam must sweep the entire surface of the screen
and then return to redraw it a number of times per second. This process is called refreshing the
screen. After scanning all the pixel-rows of the display surface, the electron beam reaches the
rightmost position in the bottommost pixel line. The electron flow is then switched off and the
vertical deflection mechanism steers the beam to the top left position to start another cycle of
scanning. This diagonal movement of the beam direction across the display surface is known as
vertical retrace. If the electron beam takes too long to return and redraw a pixel, the pixel will
begin to fade; it will return to full brightness only when redrawn. Over the full surface of the
screen, this becomes visible as a flicker in the image, which can be distracting and hard on the
eyes.
In order to avoid flicker, the screen image must be redrawn fast enough so that the eye cannot tell
that refresh is going on. The refresh rate is the number of times per second that the screen is
refreshed. It is measured in Hertz (Hz), the unit of frequency. The refresh rates are somewhat
standardized; common values are 56, 60, 65, 70, 72, 75, 80, 85, 90, 95,100, 110 and 120 Hz.
Though higher refresh rates are preferred for better comfort in viewing the monitor, the maximum
refresh rate possible depends on the resolution of the image. The maximum refresh rate that a
higher resolution image can support is less than that supported by a lower resolution image,
because the monitor has more pixels to cover with each sweep. Actually, support for a given
refresh rate requires two things: a video card capable of producing the video image that many
times per second, and a monitor capable of handling and displaying that many signals per second.
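Why higher resolutions support lower maximum refresh rates can be seen from a rough pixel-rate estimate; this is a simplification that ignores retrace and blanking overhead:

```python
# Rough sketch: pixels the video card must generate per second.

def pixel_rate(width, height, refresh_hz):
    """Approximate pixels per second for a given resolution and
    refresh rate (ignores retrace/blanking overhead)."""
    return width * height * refresh_hz

print(round(pixel_rate(800, 600, 85) / 1e6, 1))    # 40.8 million pixels/s
print(round(pixel_rate(1280, 1024, 85) / 1e6, 1))  # 111.4 million pixels/s
```

At a fixed refresh rate, the required pixel rate grows with resolution, so a card and monitor that can drive 800 × 600 at 85 Hz may only manage a lower rate at 1280 × 1024.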
Every monitor should include, as part of its specification, a list of resolutions it supports and the
maximum refresh rate for each resolution. Many video cards now include setup utilities that are
pre-programmed with information about different monitors. When you select a monitor, the video
card automatically adjusts the resolutions and respective allowable refresh rates. Windows 95 and
later versions extend this facility by supporting Plug and Play for monitors; you plug the monitor
in and Windows will detect it, set the correct display type and choose the optimal refresh rate
automatically.
Some monitors use a technique called interlacing to cheat a bit and allow themselves to display at
a higher resolution than is otherwise possible. Instead of refreshing every line of the screen, when
in an interlaced mode, the electron guns sweep alternate lines on each pass. In the first pass, odd-
numbered lines are refreshed, and in the second pass, even-numbered lines are refreshed. This
allows the refresh rate to be doubled because only half the screen is redrawn at a time. The usual
refresh rate for interlaced operation is 87 Hz, which corresponds to 43.5 Hz of ‘real’ refresh in
half-screen interlacing.
In the Figure 2.20, the odd-numbered lines represent scanning one half of the screen and the even-
numbered lines represent scanning of the other half. There are two separate sets of horizontal and
vertical retrace.
2.4.2.5.2. CRT
A CRT is similar to a big vacuum glass bottle. It contains three electron guns that emit a focused
beam of electrons, deflection apparatus (magnetic or electrostatic), which deflects these beams
both up and down and sideways, and a phosphor-coated screen upon which these beams impinge.
The vacuum is necessary to let those electron beams travel across the tube without running into air
molecules that could absorb or scatter them. The primary component in an electron gun is a cathode
(negatively charged) encapsulated by a metal cylinder known as the control grid. A heating
element inside the cathode causes the cathode to be heated as current is passed. As a result electrons
‘boil-off’ the hot cathode surface. These electrons are accelerated towards the CRT screen by a
high positive voltage applied near the screen or by an accelerating anode. If allowed to continue
uninterrupted, the naturally diverging electrons would simply flood the entire screen. The cloud of
electrons is forced to converge to a small spot as it touches the CRT screen by a focusing system
using an electrostatic or magnetic field. Just as an optical lens focuses a beam of light at a particular
focal distance, a positively charged metal cylinder focuses the electron beam passing through it on
the center of the CRT screen. A pair of magnetic deflection coils mounted outside the CRT
envelope deflects the concentrated electron beam to converge at different points on the screen in
the process of scanning. Horizontal deflection is obtained by one pair of coils and vertical
deflection by the other pair, and the deflection amount is controlled by adjusting the current passing
through the coils. When the electron beam is deflected away from the center of the screen, the
point of convergence tends to fall behind the screen resulting in a blurred (defocused) display near
the screen edges. In high-end display devices this problem is eliminated by a mechanism which
dynamically adjusts the beam focus at different points on the screen.
When the electron beam converges on to a point on the phosphor-coated face of the CRT screen,
the phosphor dots absorb some of the kinetic energy from the electrons. This causes the electrons
in the phosphor atoms to jump to higher energy orbits. After a short time these excited electrons
drop back to their earlier stable state, releasing their extra energy as small quantum of light energy.
As long as excited electrons are returning to their stable state, the phosphor continues to glow
(phosphorescence) but gradually loses brightness. The time between the removal of excitation and
the moment when phosphorescence has decayed to 10 per cent of the initial brightness is termed
the persistence of the phosphor. The brightness of the light emitted by the phosphor depends on the
intensity with which the electron beam (number of electrons) strikes the phosphor. The intensity
of the beam can be regulated by applying measured negative voltage to the control grid.
Corresponding to a zero value in the frame buffer a high negative voltage is applied to the control
grid, which in turn will shut off the electron beam by repelling the electrons and stopping them
from coming out of the gun and hitting the screen. The corresponding points on the screen will
remain black. Similarly, a bright white spot can be created at a particular point by minimizing the
negative voltage at the control grid of the three electron guns when they are directed to that point
by the deflection mechanism.
Apart from brightness, the size of the illuminated spot created on the screen varies directly with the
intensity of the electron beam. As the intensity or number of electrons in the beam increases, the
beam diameter and spot size increase. Also, highly excited bright phosphor dots tend to spread
the excitation to neighbouring dots, thereby further increasing the spot size. Therefore the total
number of distinguishable spots (pixels) that can be created on the screen depends on the individual
spot size. The smaller the spot size, the higher the image resolution.
In a monochrome CRT there is only one electron gun, whereas in a colour CRT there are three
electron guns each controlling the display of red, green and blue light respectively. Unlike the
screen of a monochrome CRT, which has a uniform coating of phosphor, the colour CRT has three
colour-phosphor dots (dot triad) – red, green and blue – at each point on the screen surface. When
struck by an electron beam the red dot emits red light, the green dot emits green light and the blue
dot emits blue light. Each triad is arranged in a triangular pattern, as are the three electron guns.
The beam deflection arrangement allows all the three beams to be deflected at the same time to
form a raster scan pattern. There are separate video streams for each RGB (red, green and blue)
colour components which drive the electron guns to create different intensities of RGB colours at
each point on the screen. To ensure that the electron beam emitted from individual electron guns
strikes only the correct phosphor dots (e.g., the electron gun for red colour excites only the red
phosphor dot), a shadow mask is used just before the phosphor screen. The mask is a fine metal
sheet with a regular array of holes punched in it. The mask is so aligned that as the set of three
beams sweeps across the shadow mask they converge and intersect at the holes and then hit the
correct phosphor dot; the beams are prevented or masked from intersecting other two dots of the
triad. Thus, different intensities can be set for each dot in a triad and a small colour spot is produced
on the screen as a result.
Some CRTs adopt an alternative way of accomplishing the masking function. Instead of a shadow
mask, they use an aperture grille, in which the metal mesh is replaced by hundreds of fine metal
strips that run vertically from the top of the screen to the bottom. In these CRTs, the electron guns
are placed side by side (not in a triangular fashion). The gaps between the
metal wires allow the three electron beams to illuminate the adjacent columns of coloured
phosphor which are arranged in alternating stripes of red, green and blue. This configuration allows
the phosphor stripes to be placed closer together than conventional dot triads. The fine vertical
wires block less of the electron beam than ordinary shadow masks resulting in a brighter and
sharper image. This design is most common in Sony’s popular Trinitron. Trinitron monitors are
curved on the horizontal plane and are flat on the vertical plane.
For TV sets and monitors, the diagonal dimension is stated as the size. As a portion of the picture
tube is covered by the case, the actual viewable portion is smaller; a monitor sold as 21 inches, for
example, may measure only 19 inches diagonally across the viewable area. For standard monitors
the height is about three-fourths of the width. For a 19-inch viewable diagonal, the image width
will be about 15 inches and the height about 11 inches.
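The exact width and height can be recovered from the diagonal and the 4:3 aspect ratio; the sketch below gives 15.2 × 11.4 inches for a 19-inch viewable diagonal, which rounds to the 15 and 11 inches quoted above:

```python
from math import sqrt

def screen_dimensions(diagonal, ratio_w=4, ratio_h=3):
    """Width and height of a screen from its diagonal and aspect ratio."""
    unit = diagonal / sqrt(ratio_w**2 + ratio_h**2)
    return ratio_w * unit, ratio_h * unit

w, h = screen_dimensions(19)      # 19-inch viewable diagonal, 4:3 ratio
print(round(w, 1), round(h, 1))   # 15.2 11.4
```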
The appearance and colour of a pixel of an image is a result of intersection of three primary colours
(red, green and blue) at different intensities. When the intensities of all three electron beams are
set to the highest level (causing each dot of a triad to glow with maximum intensity), the result is
a white pixel; when all are set to zero, the pixel is black. And for many different combinations of
intermediate intensity levels, several million colour pixels can be generated. For a mono monitor
using a single electron gun, the phosphor material can glow with varied intensities depending on
the intensity of the electron beam. As a result a pixel can be black (zero intensity) or white
(maximum intensity) or have different shades of grey.
Figure 2.23. For bit depth = n, n bit planes are used; each bit plane contributes to the
grey shade of a pixel.
The number of discrete intensities that the video card is capable of generating for each primary
colour determines the number of different colours that can be displayed. The number of memory
bits required to store colour information (intensity values for all three primary colour components)
about a pixel is called colour depth or bit depth. A minimum of one memory bit (colour depth =
1) is required to store intensity value either 0 or 1 for every screen point or pixel. Corresponding
to the intensity value 0 or 1, a pixel can be black or white respectively. So if there are n pixels in
an image a total of n bits of memory used for storing intensity values will result in a pure black
and white image. The block of memory which stores (or is mapped with) bi-level intensity values
for each pixel of a full-screen pure black and white image is called a bit plane or bitmap.
Colour or grey levels can be achieved in the display using additional bit planes. First consider a
single bit plane – a planar array of bits, with one bit for each screen pixel. This plane is replicated
as many times as there are bits per pixel, placing each bit plane behind its predecessor. Hence, the
result for n bits per pixel (colour depth = n) is a collection of n bit planes that allows specifying
any one of 2^n colours or grey shades at every pixel.
The more the number of bits used per pixel, the finer the colour detail of the image. However,
increased colour depths also require significantly more memory for storage, and also more data
for the video card to process, which reduces the allowable refresh rate.
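The relationship between colour depth, number of colours and frame-buffer memory can be sketched as follows (function names are illustrative):

```python
# Sketch of the bit-plane arithmetic: n bit planes give 2**n colours,
# and frame-buffer memory scales with resolution times colour depth.

def num_colours(bit_depth):
    """Number of distinct colours or grey shades for a given bit depth."""
    return 2 ** bit_depth

def frame_buffer_bytes(width, height, bit_depth):
    """Memory needed to store one full-screen image."""
    return width * height * bit_depth // 8

print(num_colours(1))                    # 2 (pure black and white)
print(num_colours(8))                    # 256
print(num_colours(24))                   # 16777216 (true colour)
print(frame_buffer_bytes(800, 600, 24))  # 1440000 bytes (~1.4 MB)
```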
For true colour, three bytes of information are used, one for each of the red, green and blue signals
that make a pixel. A byte can hold 256 different values, so 256 voltage settings are possible for
each electron gun, which means that each primary colour can have 256 intensities, allowing over
16 million (256 × 256 × 256) colour possibilities. This allows for a very realistic representation of
images without necessitating any colour compromise. In fact, 16 million colours is more than
the human eye can discern. True colour is a necessity for those involved in high-quality photo
editing, graphical design, etc.
For high colour, two bytes of information are used to store the intensity values for all three colours.
This is done by dividing the 16 bits into 5 bits for blue, 5 bits for red and 6 bits for green. This means
32 (= 2^5) intensities for blue, 32 (= 2^5) for red, and 64 (= 2^6) for green. This reduced colour
precision results in a loss of visible image quality, though one cannot easily see the difference
between a true colour and a high colour image. High colour is often used instead of true colour
because it requires 33 per cent (or 50 per cent in some cases) less memory and image generation
is also faster.
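The 16-bit high-colour arrangement can be illustrated by packing and unpacking the three components; the particular bit layout chosen here (red in the high bits, then green, then blue) is an assumption for illustration:

```python
# Hedged sketch of 16-bit 'high colour': 5 bits red, 6 bits green, 5 bits blue.

def pack_565(r, g, b):
    """Pack 5-bit red, 6-bit green and 5-bit blue into one 16-bit word."""
    assert 0 <= r < 32 and 0 <= g < 64 and 0 <= b < 32
    return (r << 11) | (g << 5) | b

def unpack_565(word):
    """Recover the (r, g, b) components from a 16-bit word."""
    return (word >> 11) & 0x1F, (word >> 5) & 0x3F, word & 0x1F

word = pack_565(31, 63, 0)    # maximum red and green, no blue
print(hex(word))              # 0xffe0
print(unpack_565(word))       # (31, 63, 0)
```

Green gets the extra sixth bit because the eye is most sensitive to green, which is why 5-6-5 layouts are the common choice.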
Figure 2.24. For bit depth = 24 (true colour display), 8 bit planes are used for storing each
primary colour component of the colour value of a pixel.
Figure 2.25. The n-bit register holds the row number of the look-up table; the particular row
pointed to contains the actual pixel intensity value, which is an x-bit number (x > n).
In 256-colour mode the PC uses only 8 bits; this means something like 2 bits for blue and 3 each
for green and red. There are chances that most of the colours of a given picture are not available,
and choosing between only 4 (= 22) or 8 (= 23) different values for each primary colour would
result in rather blocky or grainy look of the displayed image. A palette or look-up table is used
here. A palette is a separate memory block (in addition to the 8 bit plane) created containing 256
different colours. The intensity values stored therein are not constrained to the range 0 to 3
for blue and 0 to 7 each for green and red. Rather, each colour is defined using the standard 3-byte
colour definition that is used in true colour. Thus the intensity values for each of the three primary
colour components can be anything between 0 and 255 in each of the table entries. Upon reading
the bit planes, the resulting number instead of directly specifying the pixel colour, is used as a
pointer to the 3-byte colour value entry in the look-up table. For example, if the colour number read
from the bit-planes is 10 for a given pixel, then the intensities of red, green and blue to be displayed
for that pixel will be found in the 10th entry of the table. So the full range of true colour can be
accessed, but only 256 of the available 16 million colours can be used at a time.
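The indirection through the look-up table can be illustrated with a minimal sketch. The sky-blue value placed in entry 10 is only an example; the real palette contents are whatever the image's creator loads.

```python
# 256-entry palette: each entry is a full (R, G, B) triple, 0-255 per channel,
# exactly as in true colour.  Only 256 of the 16 million values are loaded
# at any one time.
palette = [(0, 0, 0)] * 256
palette[10] = (135, 206, 235)   # e.g. entry 10 holds a sky-blue colour

def pixel_colour(frame_buffer_value):
    """The 8-bit frame-buffer value is an index into the palette, not a colour."""
    return palette[frame_buffer_value]
```

Reloading `palette` with a different set of 256 triples changes every displayed colour without touching a single frame-buffer value, which is exactly why palette animation is cheap.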
The palette is an excellent compromise at the cost of moderate increase in memory: it allows only
8 bits of the frame buffer to be used to specify each colour in an image and allows the creator of
the image to pick any of the 256 colours for the image. This is because the palette can be reloaded
any time with different combinations of 256 colours out of 16 million without changing the frame
buffer values. Since virtually no image contains an even distribution of colours, this allows for
more precision in an image by using more colours than would be possible by assigning each pixel
a 2-bit value for blue and 3-bit value each for green and red. For example, an image of the sky
with clouds (like the Windows 95 standard background) would have different shades of blue, white
and gray, and virtually no red, green, yellow and the like.
256-colour is the standard for much of computing, mainly because the higher-precision colour
modes require more resource (especially video memory) and are not supported by many PCs.
Despite the ability to ‘hand pick’ the 256 colours, this mode produces noticeably worse image
quality than high colour.
In the early days of PCs, the amount of information displayed was small. A screen of monochrome
text, for example, needs only about 2 KB of memory space. Special parts of the upper memory
area (UMA) were dedicated to hold this video data. As the need for video memory increased into
the megabyte range, it made more sense to put the memory on the video card itself. In fact, to
preserve existing PC design limitations, it was necessary (the UMA does not have the space to
hold bigger screen images). The frame buffer is the video memory (RAM) that is used to hold or
map the image displayed on the screen. The amount of memory required to hold the image depends
primarily on the resolution of the screen image and also the colour depth used per pixel. The
formula to calculate how much video memory is required at a given resolution and bit depth is:

Video memory required (in bytes) = horizontal resolution × vertical resolution × colour depth (in bits) / 8

However, one needs more memory than this formula computes. One major reason is that video
cards are available only in certain memory configurations (in terms of whole megabytes). For
example, you can’t order a card with 1.7 MB of memory; you have to use a standard 2 MB card
available in the market. Another reason is that many video cards, especially high end accelerators
and 3D cards, use memory for computation as well as for the frame buffer. Thus, they need much
more memory than is required strictly to hold the screen image.
Table 3.4 displays, in binary megabytes, the amount of memory required for the frame buffer for
each common combination of screen resolution and colour depth. The smallest industry standard
video memory configuration required to support the combination is shown in parentheses.
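The memory calculation described above can be sketched in Python. The list of standard card sizes used here is an assumption for illustration, reflecting typical configurations of the period:

```python
def frame_buffer_bytes(width, height, bit_depth):
    """Memory needed to hold one screen image, in bytes:
    horizontal resolution x vertical resolution x colour depth / 8."""
    return width * height * bit_depth // 8

def card_size_mb(width, height, bit_depth, standard_sizes=(1, 2, 4, 8, 16, 32)):
    """Smallest standard card size (in binary MB) that fits the frame buffer.

    Cards come only in whole-megabyte steps, so the requirement is
    rounded up to the next available configuration.
    """
    needed_mb = frame_buffer_bytes(width, height, bit_depth) / (1024 * 1024)
    for size in standard_sizes:
        if size >= needed_mb:
            return size
    raise ValueError("no standard size is large enough")
```

For example, 1024 × 768 in true colour needs 1024 × 768 × 24 / 8 = 2 359 296 bytes (2.25 MB), which forces the purchase of a 4 MB card.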
Some motherboards are designed to integrate the video chipset into the board itself and use a part of
the system RAM for the frame buffer. This is called unified memory architecture and is done to save
costs. The result is poorer video performance, because in order to use higher resolutions and refresh
rates, the video memory needs a higher performance than the RAM normally used for the system.
This is also the reason why video card memory is so expensive compared to regular system RAM.
In order to meet the increasing demand for faster and dedicated video memory at a comparable
price, a technology was introduced by Intel which is fast becoming a new standard. It is called the
Accelerated Graphics Port or AGP. The AGP allows the video processor to access the system
memory for graphics calculations, but keeps a dedicated video memory for the frame buffer. This
is more efficient because the system memory can be shared dynamically between the system
processor and the video processor, depending on the needs of the system. However it should be
remembered that AGP is considered a port – a dedicated interface between the video chipset and
the system processor.
The display adapter circuitry (on video card or motherboard) in a raster graphics system typically
employs a special purpose processor called Display Processor or Graphics Controller or Display
Coprocessor which is connected as an I/O peripheral to the CPU. Such processors assist the CPU
in scan-converting the output primitives (line, circle, arc etc.) into bitmaps in frame buffer and also
perform raster operations of moving, copying and modifying pixels or block of pixels. The output
circuitry also includes another specialized hardware called Video Controller which actually drives
the CRT and produces the display on the screen.
The monitor is connected to the display adapter circuitry through a cable with 15-pin connectors.
Inside the cable are three analog signals carrying brightness information in parallel for the three
colour components of each pixel. The cable also contains two digital signal lines for vertical and
horizontal drive signals and three digital signal lines which carry specific information about the
monitor to the display adapter.
The video controller in the output circuitry generates the horizontal and vertical drive signals so
that the monitor can sweep its beam across the screen during raster scan. Memory reference
addresses are generated in synchrony with the raster scan, and the contents of the memory are used
to control the CRT beam intensity or colour. Two registers (X register and Y register) are used to
store the coordinates of the screen pixels. Assume that the y values of adjacent scan lines
increase by 1 in the upward direction, starting from 0 at the bottom of the screen to ymax at the top, and
along each scan line the screen pixel positions or x values are incremented by 1 from 0 at the
leftmost position to xmax at the rightmost position. The origin is at the lower left corner of the
screen as in a standard Cartesian coordinate system. At the start of a refresh cycle, the X register
is set to 0 and the Y register is set to ymax. This (x, y) address is translated into a memory address
of the frame buffer where the colour value for this pixel position is stored. The controller retrieves this
colour value (a binary number) from the frame buffer, breaks it up into three parts and sends each
part to a separate digital-to-analog converter (DAC). After conversion the DAC puts the
proportional analog voltage signals on the three analog output wires going to the monitor. These
voltages in turn control the intensity of the three electron beams that are focused at the (x, y) screen
position by the horizontal and vertical drive signals.
This process is repeated for each pixel along the top scan line, each time incrementing the X register
by 1. As pixels on the first scan line are generated the X register is incremented through xmax. Then
the X register is reset to 0 and the Y register is decremented by 1 to access the next scan line.
Pixels along this scan line are then processed and the procedure is repeated for each successive
scan line until pixels on the last scan line (y = 0) are generated. For a display system employing a
colour look-up table, however, frame buffer value is not directly used to control the CRT beam
intensity. It is used as an index to find the true pixel-colour value from the look-up table. This
look-up operation is done for each pixel on each display cycle.
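The register-driven refresh cycle described above can be sketched as a loop. Here `frame_buffer` is modelled as a mapping from (x, y) to a stored value, and `set_beam` is a hypothetical callback standing in for the DACs that drive the electron guns:

```python
def refresh_cycle(frame_buffer, lut, x_max, y_max, set_beam):
    """One refresh pass over the screen, top scan line first.

    frame_buffer[(x, y)] holds the stored pixel value.  If a colour
    look-up table `lut` is supplied, that value is an index into it;
    otherwise it is used directly as the colour.
    """
    y = y_max                          # Y register starts at the top
    while y >= 0:
        for x in range(x_max + 1):     # X register sweeps left to right
            value = frame_buffer[(x, y)]
            colour = lut[value] if lut is not None else value
            set_beam(x, y, colour)     # DACs drive the beams here
        y -= 1                         # Y register decremented per line
```

Tracing a 2 × 2 screen confirms the access order: the top scan line is emitted left to right before the Y register is decremented to reach the bottom line.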
As the time available to display or refresh a single pixel on the screen is very short (on the order of a
few nanoseconds), accessing the frame buffer separately for each pixel intensity value
would consume more time than is allowed. Therefore, multiple adjacent pixel values are
fetched from the frame buffer in a single access and stored in a register. After every allowable time
gap (as dictated by the refresh rate and resolution) one pixel value is shifted out from the register
to control the beam intensity for that pixel. The procedure is repeated with the next block of pixels
and so on, until the whole line of pixels has been processed.
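The fetch-and-shift scheme can be sketched as follows. Both `fetch_block` (one wide frame-buffer access returning a block of adjacent pixel values) and `shift_out` (driving the beam for one pixel) are hypothetical callbacks; the block size of 16 is an example figure.

```python
def display_scan_line(fetch_block, shift_out, num_pixels, block_size=16):
    """Fetch `block_size` pixel values per memory access, then shift
    them out one per pixel clock from the holding register."""
    for start in range(0, num_pixels, block_size):
        register = fetch_block(start)    # one slow memory access
        for value in register:           # many fast per-pixel shift-outs
            shift_out(value)
```

The point of the design is the ratio: one 200 ns memory cycle amortized over sixteen pixels leaves each pixel well within its nanosecond-scale display slot.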
Basically there are two types of CRTs – Raster Scan type and Random Scan type. The main
difference between the two is the technique with which the image is generated on the phosphor
coated CRT screen. In raster scan method, the electron beam sweeps the entire screen in the same
way you would write a full page text in a notebook, word by word, character by character, from
left to right, and from top to bottom. In random scan technique, the electron beam is directed
straightaway to the particular point(s) of the screen where the image is to be produced. It generates
the image by drawing a set of random straight lines much in the same way one might move a pencil
over a piece of paper to draw an image – drawing strokes from one point to another, one line at a
time. This is why this technique is also referred to as vector drawing or stroke writing or
calligraphic display.
There are of course no bit planes containing mapped pixel values in a vector system. Instead the
display buffer memory stores a set of line drawing commands along with endpoint coordinates in
a display list or display program created by a graphics package. The display processing unit (DPU)
executes each command during every refresh cycle and feeds the vector generator with digital x,
y and Δx, Δy values. The vector generator converts the digital signals into equivalent analog
deflection voltages. This causes the electron beam to move to the start point or from the start point
to the end point of a line or vector. Thus the beam sweep does not follow any fixed pattern; the
direction is arbitrary as dictated by the display commands. When the beam focus must be moved
from the end of one stroke to the beginning of the next, the beam intensity is set to 0. Though
vector-drawn images lack depth and colour precision, random scan displays work at much
higher resolutions than raster displays. The images are sharp and have smooth edges, unlike the
jagged edges and lines on raster displays.
Direct View Storage Tube (DVST) is rarely used today as part of a display system. However,
DVST marks a significant technological change in the usual refresh type display. Both in the raster
scan and random scan system the screen image is maintained (flicker free) by redrawing or
refreshing the screen many times per second by cycling through the picture data stored in the
refresh buffer. In DVST there is no refresh buffer; the images are created by drawing vectors or
line segments with a relatively slow-moving electron beam. The beam is designed not to draw
directly on phosphor but on a fine wire mesh (called storage mesh) coated with dielectric and
mounted just behind the screen. A pattern of positive charge is deposited on the grid, and this
pattern is transferred to the phosphor-coated screen by a continuous flood of electrons emanating
from a separate flood gun.
Just behind the storage mesh is a second grid, the collector, whose main purpose is to smooth out
the flow of flood electrons. These electrons pass through the collector at low velocity and are
attracted to the positively charged portions of the storage mesh but repelled by the rest. Electrons
not repelled by the storage mesh pass right through it and strike the phosphor.
To increase the energy of these slow-moving electrons and thus create a bright picture, the screen
is maintained at a high positive potential. The storage tube retains the image generated until it is
erased. Thus no refreshing is necessary, and the image is absolutely flicker free. A major
disadvantage of DVST in interactive computer graphics is its inability to selectively erase parts of
an image from the screen. To erase a line segment from the displayed image, one has to first erase
the complete image and then redraw it by omitting that line segment. However, the DVST supports
a very high resolution which is good for displaying complex images.
To satisfy the need of a compact portable monitor, modern technology has gifted us with LCD
panels, Plasma display panels, LED panels and thin CRTs. These display devices are smaller, lighter
and notably thinner than the conventional CRT and thus are termed Flat Panel Displays
(FPD). FPD in general and LCD panels in particular are most suitable for laptop (notebook)
computers but are expensive to produce. Though hardware prices are coming down sharply, cost
of the LCD or Plasma monitors is still too high to compete with CRT monitors in desktop
applications. However, the thin CRT is comparatively economical. To produce a thin CRT the tube
length of a normal CRT is reduced by bending it in the middle. The deflection apparatus is
modified so that electron beams can be bent through 90 degrees to focus on the screen and at the
same time can be steered up and down and across the screen.
2.4.2.5.7.1. LCD
To understand the fundamental operation of a simple LCD, a model is shown in Figure 3.15. An LCD
basically consists of a layer of liquid crystal, sandwiched between two polarizing plates. The
polarizers are aligned perpendicular to each other (one vertical and the other horizontal), so that
the light incident on the first polarizer will be blocked by the second. This is because a polarizer
plate only passes photons (quanta of light) with their electric fields aligned parallel to the
polarizing direction of that plate.
The LCD displays are addressed in a matrix fashion. Rows of matrix are defined by a thin layer of
horizontal transparent conductors, while columns are defined by another thin layer of vertical
transparent conductors; the layers are placed between the LCD layer and the respective polarizer
plate. The intersection of the two conductors defines a pixel position. This means that an individual
LCD element is required for each display pixel, unlike a CRT which may have several dot triads
for each pixel.
The liquid crystal material is made up of long rod-shaped crystalline molecules containing
cyanobiphenyl units. The individual polar molecules in a twisted nematic LC layer are normally
arranged in a spiral fashion such that the direction of polarization of polarized light passing through
it is rotated by 90 degrees. Light from an internal source (backlight) enters the first polarizer (say
horizontal) and is polarized accordingly (horizontally). As the light passes through the LC layer it
is twisted 90 degrees (to align with the vertical) so that it is allowed to pass through the rear
polarizer (vertical) and then reflect from the reflector behind the rear polarizer. When the reflected
light reaches the viewer’s eye travelling in the reverse direction, the LCD appears bright.
When a voltage is applied across the LC layer, the crystalline molecules align
themselves parallel to the direction of light and thus have no polarizing effect. The light entering
through the front polarizer is not allowed to pass through the rear polarizer due to mismatch of
polarization direction. The result is zero reflection of light and the LCD appears black.
In a colour LCD there are layers of three liquid crystal panels one on top of another. Each one is
filled with a coloured (red, green or blue) liquid crystal. Each one has its own set of horizontal and
vertical conductors. Each layer absorbs an adjustable portion of just one colour of the light passing
through it. This is similar to how colour images are printed. The principal advantage of this design
is that it helps create as many screen pixels as intersections, thus making higher-resolution LCD
panels possible. In the true sense, each pixel comprises three colour cells or sub-pixel elements.
The image painting operation in LCD panels differs from that of the CRT, though both are
of raster scan type. In a simple LCD panel, an entire line of screen pixels is illuminated at one
time. The process continues to the next line and so on till the entire screen image is completed.
Picture definitions are stored in a refresh buffer and the screen is refreshed typically at the rate of
60 frames per second. Once set, the screen pixels stay at fixed brightness until they are reset. The
time required to set the brightness of a pixel is high compared to that of the CRT. This is why LCD
panel pixels cannot be turned on or off anywhere near the rate at which pixels are painted on a
CRT screen. Except for the high quality Active Matrix LCD panels, others have trouble displaying
movies, which require quick refreshing.
Here a layer of gas (usually neon) is sandwiched between two glass plates. Thin vertical (column)
strips of conductor run across one plate, while horizontal (row) conductors run up and down the
other plate. By applying high voltage to a pair of horizontal and vertical conductors, a small section
of the gas (tiny neon bulb) at the intersection of the conductors breaks down into glowing plasma
of electrons and ions. Thus, in the array of gas bulbs, each one can be set to an ‘on’ (glowing) state
or ‘off’ state by adjusting voltages in the appropriate pair of conductors. Once set ‘on’ the bulbs
remain in that state until explicitly turned ‘off’ by momentarily reducing the voltage applied to the
pair of conductors. Hence no refreshing is necessary.
Because of its excellent brightness, contrast and scalability to larger sizes, plasma panel is
attractive. Research is ongoing to eliminate the colour-display limitations of such devices at low
production cost.
So far we have discussed some fundamental concepts on how graphic images are generated and
stored in some of the most common and widely used display systems. Let us briefly study a graphic
device which directly copies images from a paper or photograph and converts it into the digital
format for display, storage and graphic manipulations. It is the Scanner. Traditionally, design and
publishing houses have been the prime users of scanners, but the phenomenal growth of the Internet
has made the scanner more popular among web designers. Today scanners are becoming
affordable tools to the graphic artists and photographers.
There are basically three types of scanners – Drum, Flatbed and Sheet fed scanners. Drum scanners
are the high-end ones, whereas sheet fed scanners are the ordinary type. Flatbed scanners strike a
balance between the two in quality as well as price. There are also handheld scanners or bar-code
readers, which are typically used to scan documents in strips about 4 inches wide by holding the
scanner in one hand and sliding it over the document.
A flatbed scanner uses a light source, a lens, a charge-coupled device (CCD) array and one or more
analog-to-digital converters (ADCs) to collect optical information about the object to be scanned,
and transform it to an image file. A CCD is a miniature photometer that measures incident light
and converts that into an analog voltage.
When you place an object on the copy board or glass surface (like a copier machine) and start
scanning, the light source illuminates a thin horizontal strip of the object called a raster line. Thus
when you scan an image, you scan one line at a time. During the exposure of each raster line, the
scanner carriage (a network of lenses and mirrors forming the optical imaging elements) is
mechanically moved over a short distance by a motor. The reflected light is captured by the
CCD array. Each CCD converts the light to an analog voltage and indicates the grey level for one
pixel. The analog voltage is then converted into a digital value by an ADC using 8, 10 or 12 bits
per colour.
Figure 2.33. Scanning Operation; The assembly of light, mirrors, lens and CCD moves over the
length of the glass bed in the direction shown while scanning a paper.
The CCD elements are all in a row, with one element for each pixel in a line. If you have 300 CCD
elements per inch across the scanner, you can have a maximum potential optical resolution of 300
pixels per inch, also referred to as dots per inch (dpi).
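From the resolution in dpi and the scan area, the raw size of a scanned image follows directly. The sketch below is a rough estimate (it ignores any compression applied when the file is saved); the letter-size example values are assumptions for illustration.

```python
def scan_file_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw (uncompressed) size of a scanned image in bytes.

    Each inch of the original contributes `dpi` pixels in each
    direction, and each pixel needs bits_per_pixel / 8 bytes.
    """
    pixels = (width_in * dpi) * (height_in * dpi)
    return int(pixels * bits_per_pixel // 8)
```

A full 8.5 × 11 inch page scanned at 300 dpi in 24-bit colour already occupies about 25 million bytes, which is why scanned images are almost always stored compressed.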
There are two methods by which the incident white light is sensed by the CCD. The first uses a
rapidly rotating light filter that individually filters the red, green and blue components of the
reflected light, which are sensed by a single CCD device (in some designs the colour filters are
fabricated into the chip directly). In the second method, a prismatic beam splitter first splits the
reflected white light and three CCDs are used to sense the red, green and blue light beams.
Another imaging array technology that has become popular in inexpensive flatbed scanners is
contact image sensor (CIS). CIS replaces the CCD array, mirrors, filters, lamp and lens with rows
of red, green and blue light emitting diodes (LEDs). The image sensor mechanism, consisting of
300 to 600 sensors spanning the width of the scan area, is placed very close to the glass plate that
the document rests upon. When the image is scanned, the LEDs combine to provide white light.
The illuminated image is then captured by the row of sensors. CIS scanners are cheaper, lighter
and thinner, but do not provide the same level of quality and resolution found in most CCD
scanners.
The output of a scanner is a bitmap image file, usually in PCX or JPG format. If you scan a page
of text, it may be saved as an image file, which cannot be edited in word processing software.
Optical Character Recognition (OCR) software is an intelligent program that can convert a
scanned page of text into editable text, as a plain text file, a Word document or even an
Excel spreadsheet, which can then be easily edited. OCR can also be used to scan and recognize printed,
typewritten or even handwritten text. The OCR software requires a raster image as an input, which
may be an existing image file or an image transferred from a scanner. OCR analyzes the image to
find blocks of image information that resemble possible text fields and creates an index of such
areas. The software examines these areas, compares the shape of each object with a database of words
categorized by different fonts or typefaces, and recognizes individual text characters from the
information.
In the later chapters we will explore how basic graphic entities are drawn, manipulated and viewed
on a computer.
c) plasma panel
d) active matrix TFT
3. The display on the CRT-based monitor screen is produced by
a) video controller
b) raster scan generator
c) display processor
d) DAC
1. What are frame grabbers? Are frame buffers different from frame grabbers?
2. Draw a neat block diagram, to explain the architecture of a raster display.
3. Give the logical organization of the video controller in a raster display system.
4. What is refresh buffer? Identify the contents and the organization of the refresh buffer for
the case of raster display and vector display.
5. Describe the function of an image scanner.
6. What role does CCD play in an image scanner?
7. How are different shades of colour generated on the RGB monitors?
8. What is the role of shadow masks in graphics monitors?
9. What do you understand by VGA and SVGA monitors?
10. What is computer graphics? Indicate five practical applications of computer graphics.
11. Discuss in brief different interactive picture construction techniques.
12. How is colour depth and resolution of an image related to the video memory requirement?
13. What is the fundamental difference in the method of operation of a monochrome CRT and
a colour CRT?
14. Compare storage type CRT against refresh type CRT display. List the important properties
of phosphor being used in CRTs.
15. Compare and contrast the operating characteristics of raster refresh systems, plasma panels
and LCDs.
16. Bring out the need for a colour look-up table. Give the organization of a colour look-up
table providing 12 bits per entry, per colour for each pixel position and with 8 bits per pixel
in the frame buffer.
17. A colour display device has 8 bit planes and a look-up table of 256 entries, each of which
can hold a 24 bit number. The manufacturer claims it can ‘display 256 colours out of a
palette of 16 million’. Explain this statement.
18. Briefly explain two main classes of hardware device for user interaction.
19. If a monitor screen has 525 scan lines and an aspect ratio of 3:4 and if each pixel contains
8 bits worth of intensity information, how many bits per second are required to show 30
frames each second?
20. For a medium resolution display of 640 pixels by 480 lines refreshing 60 times per second,
the video controller fetches 16 bits in one memory cycle. RAM memory chips have cycle
times around 200ns. How many memory cycles will be needed for displaying 16 one bit
pixels? DOEACC-‘A’ level — Jan 2001.
21. What is the fraction of the total refresh time per frame spent in retrace of the electron beam
for a non-interlaced raster system with a resolution of 1280 × 1024, a refresh rate of 60 Hz,
a horizontal retrace time of 5 μsec and a vertical retrace time of 500 μsec?
22. Assume a raster scan display system supports a frame buffer size of 256 × 256 × 2 bits.
Two bits/pixel are used to look up a 4 × 2 colour table. The entries in the colour table are
writable once per raster scan only during the vertical retrace period. The actual colour codes
are given as follows.
00 Black 01 Red 10 Yellow 11 White
i. Give a scheme for using the frame buffer if it consists of two separate image
planes of size 256 × 256 × 1 each. Plane 1 is to be displayed as yellow on
red image. Plane 2 is to be displayed as white on black.
ii. How will you turn on a pixel in either plane?
iii. How will you delete a pixel in either plane?
FURTHER READING