
UNIT 1 FUNDAMENTALS OF IMAGE PROCESSING

Definition - Image Representation - Steps in DIP - Components - Elements of Visual Perception - Image Formation - Image Sampling and Quantization - Image acquisition, storage and retrieval - Relationships between pixels - Color image fundamentals - RGB, HSI models - Data products - Satellite data formats - Digital Image Processing Systems - Hardware and software design considerations.

Image: A digital image is an image composed of picture elements, also known as pixels, each holding a finite, discrete numeric value that represents its intensity or grey level. The image can be viewed as a two-dimensional function whose inputs are the spatial coordinates, denoted x and y on the x-axis and y-axis respectively, and whose output is the intensity at that point. Depending on whether the image resolution is fixed, an image may be of vector or raster type; by itself, the term "digital image" usually refers to raster images (bitmaps).

IMAGE REPRESENTATION: After acquiring an image, it is important to devise ways to represent it. There are various ways in which an image can be represented; the most common are described below.

1. Image as a matrix

The simplest way to represent an image is as a matrix. It is common to use one byte per pixel, so values from 0 to 255 represent the intensity of each pixel, where 0 is black and 255 is white. One such matrix is generated for every color channel of the image. In practice, it is also common to normalize the values to the range 0 to 1, as in the sketch below.
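A minimal sketch of the matrix representation, assuming NumPy and the Pillow imaging library are available; the file name "photo.png" is only a placeholder:

import numpy as np
from PIL import Image  # Pillow; any image reader would do

# Load a grayscale image as a 2-D matrix of 8-bit values (0 = black, 255 = white).
img = np.array(Image.open("photo.png").convert("L"), dtype=np.uint8)
print(img.shape)      # (rows, columns)
print(img[0, 0])      # intensity of the top-left pixel, in 0..255

# Normalising to the range [0, 1] is also common in practice.
img_norm = img.astype(np.float32) / 255.0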

2. Image as a function

An image can also be represented as a function. A grayscale image can be thought of as a function that takes a pixel coordinate and returns the intensity at that pixel.

It can be written as a function f: ℝ² → ℝ that outputs the intensity at any input point (x, y). The intensity value can range from 0 to 255, or from 0 to 1 if the values are normalized.

2.1 Image Transformation

Images can be transformed when they are viewed as functions: a change in the function results in changes to the pixel values of the image. There are other ways, too, in which image transformations can be performed.
2.1.1 Image Processing Operations

Essentially, there are three main operations that can be performed on an image.

 Point Operations
 Local Operations
 Global Operations

Given below is an explanation of each of these operations.

Point Operation

In a point operation, the output value depends only on the input value at that particular coordinate. A very common point operation used while editing images is reversing the contrast: in the simplest terms, it flips dark pixels into light pixels and vice versa. The point operation that achieves this is

I_out(x, y) = Iₘₐₓ − I(x, y) + Iₘᵢₙ

Here, I(x, y) stands for the intensity value at coordinate (x, y) of an image I, and Iₘₐₓ and Iₘᵢₙ refer to the maximum and minimum intensity values of image I. For example, say that an image I has intensities between 0 and 255, so Iₘₐₓ and Iₘᵢₙ are 255 and 0, respectively. You wish to flip the intensity value at a coordinate (x, y) where the current intensity is 5. Applying the operation gives 255 − 5 + 0 = 250, which becomes the new intensity value at coordinate (x, y).
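A short sketch of this point operation, assuming NumPy is available:

import numpy as np

def reverse_contrast(img):
    """Point operation: flip dark pixels to light and vice versa."""
    i_max, i_min = int(img.max()), int(img.min())
    return (i_max - img.astype(np.int32) + i_min).astype(np.uint8)

# Example: a pixel with intensity 5 in an image spanning 0..255 becomes 255 - 5 + 0 = 250.
img = np.array([[5, 255], [0, 128]], dtype=np.uint8)
print(reverse_contrast(img))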

Suppose you capture a still scene with a camera. There can be noise in the image for many reasons, such as dust particles on the lens or damage to the sensor. Noise reduction using point operations alone can be very tedious. One approach is to capture multiple images of the still scene and average the value at every pixel, hoping that the noise averages out. But at times it is not possible to get multiple images of a scene, and the stillness of a scene cannot be guaranteed. To deal with this, we move from point operations to local operations.
Local Operation

In a local operation, the output value depends on the input value and its neighbours. A simple example of a local operation is the moving average: the output at each pixel depends on the input pixel and its neighbours, so noise pixels in the image are smoothed out in the output, as the sketch below illustrates.
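A minimal sketch of a 3×3 moving average, assuming NumPy and SciPy are available:

import numpy as np
from scipy.ndimage import uniform_filter

# Local operation: each output pixel is the mean of the 3x3 neighbourhood
# around the corresponding input pixel, which smooths out isolated noise.
noisy = np.random.default_rng(0).integers(0, 256, size=(5, 5)).astype(np.float32)
smoothed = uniform_filter(noisy, size=3)
print(smoothed)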

Global Operation

As the name suggests, in a global operation the value at each output pixel depends on the entire input image. An example of a global operation is the Fourier transform.
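A small sketch, assuming NumPy, showing the global character of the 2-D Fourier transform (the DC coefficient depends on every pixel of the input):

import numpy as np

img = np.random.default_rng(1).random((8, 8))
spectrum = np.fft.fft2(img)        # every coefficient depends on every input pixel
print(spectrum[0, 0])              # the DC term (shown as a complex number)
print(img.sum())                   # equals the DC term: the whole image contributes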

STEPS IN DIP:

The fundamental steps in any typical Digital Image Processing pipeline are as follows:

1. Image Acquisition

The image is captured by a camera and digitized (if the camera output is not digitized
automatically) using an analogue-to-digital converter for further processing in a computer.

2. Image Enhancement

In this step, the acquired image is manipulated to meet the requirements of the specific
task for which the image will be used. Such techniques are primarily aimed at highlighting
the hidden or important details in an image, like contrast and brightness adjustment, etc.
Image enhancement is highly subjective in nature.

3. Image Restoration

This step deals with improving the appearance of an image and is an objective operation since
the degradation of an image can be attributed to a mathematical or probabilistic model. For
example, removing noise or blur from images.

4. Color Image Processing

This step aims at handling the processing of colored images (16-bit RGB or RGBA images), for example, performing color correction or color modeling in images.

5. Wavelets and Multi-Resolution Processing


Wavelets are the building blocks for representing images at various degrees of resolution. Images are subdivided successively into smaller regions for data compression and for pyramidal representation.

6. Image Compression

For transferring images to other devices, or due to computational storage constraints, images need to be compressed and cannot be kept at their original size. This is also
important in displaying images over the internet; for example, on Google, a small thumbnail of
an image is a highly compressed version of the original. Only when you click on the image is it
shown in the original resolution. This process saves bandwidth on the servers.

7. Morphological Processing

Image components that are useful in the representation and description of shape need to be extracted for further processing or downstream tasks. Morphological processing provides the tools (which are essentially mathematical operations) to accomplish this. For example, erosion and dilation operations are used to shrink and grow the boundaries of objects in an image, respectively, as the sketch below shows.
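A minimal sketch of erosion and dilation on a small binary object, assuming SciPy's ndimage module is available:

import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation

# A small binary object: erosion peels one layer of boundary pixels off,
# while dilation adds one layer around it.
obj = np.zeros((7, 7), dtype=bool)
obj[2:5, 2:5] = True                      # a 3x3 square (9 pixels)

print(binary_erosion(obj).sum())          # 1 pixel remains after erosion
print(binary_dilation(obj).sum())         # 21 pixels after dilation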

8. Image Segmentation

This step involves partitioning an image into different key parts to simplify and/or change
the representation of an image into something that is more meaningful and easier to analyze.
Image segmentation allows computers to focus attention on the more important parts of the image, discarding the rest, which enables automated systems to achieve improved performance.
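A minimal segmentation sketch using simple intensity thresholding, assuming NumPy; the threshold value 128 is an arbitrary illustrative choice:

import numpy as np

# Partition the image into "object" and "background" pixels.
img = np.array([[ 10,  20, 200],
                [ 15, 210, 220],
                [ 12,  18,  25]], dtype=np.uint8)
mask = img > 128
print(mask.astype(np.uint8))    # 1 = object, 0 = background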

9. Representation and Description

Image segmentation procedures are generally followed by this step, where the task for
representation is to decide whether the segmented region should be depicted as a boundary
or a complete region. Description deals with extracting attributes that result in some
quantitative information of interest or are basic for differentiating one class of objects
from another.

10. Object Detection and Recognition

After the objects are segmented from an image and the representation and description phases are complete, the automated system needs to assign a label to the object, letting human users know what has been detected, for example, "vehicle" or "person".

11. Knowledge Base


Knowledge may be as simple as the bounding box coordinates for an object of interest that
has been found in the image, along with the object label assigned to it. Anything that will
help in solving the problem for the specific task at hand can be encoded into the knowledge
base.

COMPONENTS:

 Number and speed of the computer's CPUs
 Operating system
 Amount of random access memory (RAM)
 Number of image analysts that can use the system at one time, and the mode of operation
 Serial or parallel processing
 Arithmetic coprocessor or array processor
 Software compilers
 Type and amount of mass storage
 Monitor display spatial resolution
 Monitor colour resolution
 Input devices
 Output devices
 Networks
 Image processing system applications software
 Interoperability with major GIS software

ELEMENTS OF VISUAL PERCEPTION

The field of digital image processing is built on a foundation of mathematical and probabilistic formulation, but human intuition and analysis play the main role in choosing between the various techniques, and that choice is basically made on subjective, visual judgements.
In human visual perception, the eyes act as the sensor or camera, the neurons act as the connecting cable, and the brain acts as the processor.

The basic elements of visual perception are:

1. Structure of Eye
2. Image Formation in the Eye
3. Brightness Adaptation and Discrimination

The human eye is a slightly asymmetrical sphere with an average diameter of about 20 mm to 25 mm and a volume of about 6.5 cc. The eye works much like a camera: an external object is imaged in the same way a camera takes a picture of an object. Light enters the eye through a small opening called the pupil, a black-looking aperture that contracts when the eye is exposed to bright light, and is focused on the retina, which acts like camera film.

The lens, iris, and cornea are nourished by a clear fluid (the aqueous humour) that fills the anterior chamber. The fluid flows from the ciliary body to the pupil and is absorbed through channels in the angle of the anterior chamber. The delicate balance of aqueous production and absorption controls the pressure within the eye.

The cones in the eye number between 6 and 7 million and are highly sensitive to colour; humans perceive coloured images in daylight because of these cones. Cone vision is also called photopic or bright-light vision.

The rods are far more numerous, between 75 and 150 million, and are distributed over the retinal surface. Rods are not involved in colour vision and are sensitive to low levels of illumination.

Image Formation in the Eye:

The image is formed when the lens of the eye focuses an image of the outside world onto a light-sensitive membrane at the back of the eye, called the retina. The lens of the eye focuses light on the photoreceptive cells of the retina, which detect the photons of light and respond by producing neural impulses.

The distance between the lens and the retina is about 17 mm, and the focal length is approximately 14 mm to 17 mm.

Brightness Adaptation and Discrimination:

Digital images are displayed as a discrete set of intensities. The eye's ability to discriminate between black and white at different intensity levels is an important consideration in presenting image processing results.
The range of light intensity levels to which the human visual system can adapt is of the order of 10^10, from the scotopic threshold to the glare limit. In photopic vision alone, the range is about 10^6.
IMAGE FORMATION
We shall denote images by two-dimensional functions of the form f(x, y). The value or amplitude of f at spatial coordinates (x, y) is a positive scalar quantity whose physical meaning is determined by the source of the image. Most of the images represented here are monochromatic images, whose values are said to span the gray scale. When an image is generated from a physical process, its values are proportional to the energy radiated by a physical source (e.g., electromagnetic waves). As a consequence, f(x, y) must be nonzero and finite; that is,

0 < f(x, y) < ∞
The function f(x, y) may be characterized by two components: (1) the amount of source illumination incident on the scene being viewed, and (2) the amount of illumination reflected by the objects in the scene. Appropriately, these are called the illumination and reflectance components and are denoted by i(x, y) and r(x, y), respectively. The two functions combine as a product to form f(x, y):

f(x, y) = i(x, y) r(x, y)    …(1)

where

0 < i(x, y) < ∞    …(2)

and

0 < r(x, y) < 1    …(3)
Equation (3) indicates that reflectance is bounded by 0 (total absorption) and 1 (total reflectance). The nature of i(x, y) is determined by the illumination source, and r(x, y) is determined by the characteristics of the imaged objects. These expressions are also applicable to images formed via transmission of the illumination through a medium, such as a chest X-ray. In that case we would deal with a transmissivity instead of a reflectivity function, but the limits would be the same as in Eq. (3), and the image function formed would be modelled as the product in Eq. (1).

We call the intensity of a monochrome image at any coordinates (x0, y0) the gray level (l) of the image at that point:

l = f(x0, y0)    …(4)

From Eqs. (1) through (3), it is evident that l lies in the range

Lmin ≤ l ≤ Lmax    …(5)

In theory, the only requirement on Lmin is that it be positive, and on Lmax that it be finite. In practice, Lmin = i_min r_min and Lmax = i_max r_max. Using average office illumination and a typical range of reflectance values as guidelines, we may expect Lmin ≈ 10 and Lmax ≈ 1000 to be typical limits for indoor values in the absence of additional illumination. The interval [Lmin, Lmax] is called the gray scale. Common practice is to shift this interval numerically to the interval [0, L-1], where l = 0 is considered black and l = L-1 is considered white on the gray scale. All intermediate values are shades of gray varying from black to white.
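A small sketch, assuming NumPy, of shifting measured intensities from their original range onto the gray scale [0, L-1]:

import numpy as np

def rescale_to_gray_scale(f, L=256):
    """Shift and scale intensities from [f.min(), f.max()] to [0, L-1]."""
    f = f.astype(np.float64)
    return ((f - f.min()) / (f.max() - f.min()) * (L - 1)).astype(np.uint8)

# Intensities roughly in the Lmin ~ 10 to Lmax ~ 1000 range mentioned above.
f = np.array([[10.0, 250.0], [600.0, 1000.0]])
print(rescale_to_gray_scale(f))   # values now span 0 (black) to 255 (white)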
Image Sampling and Quantization
To create a digital image, we need to convert the continuous sensed data into digital form. This involves two processes: sampling and quantization.

The basic idea behind sampling and quantization is illustrated in Fig. 2.16. Figure 2.16(a) shows a continuous image, f(x, y), that we want to convert to digital form. An image may be continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to digital form, we have to sample the function in both coordinates and in amplitude. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.

The one-dimensional function shown in Fig. 2.16(b) is a plot of amplitude (gray level) values of the continuous image along the line segment AB in Fig. 2.16(a). The random variations are due to image noise. To sample this function, we take equally spaced samples along line AB, as shown in Fig. 2.16(c). The location of each sample is given by a vertical tick mark in the bottom part of the figure. The samples are shown as small white squares superimposed on the function. The set of these discrete locations gives the sampled function. However, the values of the samples still span (vertically) a continuous range of gray-level values. In order to form a digital function, the gray-level values also must be converted (quantized) into discrete quantities. The right side of Fig. 2.16(c) shows the gray-level scale divided into eight discrete levels, ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the eight gray levels. The continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to each sample. The assignment is made depending on the vertical proximity of a sample to a vertical tick mark. The digital samples resulting from both sampling and quantization are shown in Fig. 2.16(d). Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.

Sampling in the manner just described assumes that we have a continuous image in both coordinate directions as well as in amplitude. In practice, the method of sampling is determined by the sensor arrangement used to generate the image. When an image is generated by a single sensing element combined with mechanical motion, as in Fig. 2.13, the output of the sensor is quantized in the manner described above. However, sampling is accomplished by selecting the number of individual mechanical increments at which we activate the sensor to collect data. Mechanical motion can be made very exact so, in principle, there is almost no limit as to how fine we can sample an image. However, practical limits are established by imperfections in the optics used to focus on the sensor an illumination spot that is inconsistent with the fine resolution achievable with mechanical displacements.

When a sensing strip is used for image acquisition, the number of sensors in the strip establishes the sampling limitations in one image direction. Mechanical motion in the other direction can be controlled more accurately, but it makes little sense to try to achieve sampling density in one direction that exceeds the sampling limits established by the number of sensors in the other. Quantization of the sensor outputs completes the process of generating a digital image. When a sensing array is used for image acquisition, there is no motion and the number of sensors in the array establishes the limits of sampling in both directions. Quantization of the sensor outputs is as before.

Figure 2.17 illustrates this concept. Figure 2.17(a) shows a continuous image projected onto the plane of an array sensor. Figure 2.17(b) shows the image after sampling and quantization. Clearly, the quality of a digital image is determined to a large degree by the number of samples and discrete gray levels used in sampling and quantization. However, as shown in Section 2.4.3, image content is an important consideration in choosing these parameters.
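A minimal sketch of sampling and quantization, assuming NumPy; the sampling step and the eight gray levels mirror the description above:

import numpy as np

def sample_and_quantize(f, step=2, levels=8):
    """Sample an array every `step` pixels and quantize the amplitudes
    to `levels` equally spaced gray levels."""
    sampled = f[::step, ::step]                       # spatial sampling
    q = np.round(sampled / 255.0 * (levels - 1))      # amplitude quantization
    return (q * 255.0 / (levels - 1)).astype(np.uint8)

f = np.linspace(0, 255, 64).reshape(8, 8)             # stand-in for a continuous image
print(sample_and_quantize(f))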
Image Sensing and Acquisition
The types of images in which we are interested are generated by the combination of an "illumination" source and the reflection or absorption of energy from that source by the elements of the "scene" being imaged. We enclose illumination and scene in quotes to emphasize the fact that they are considerably more general than the familiar situation in which a visible light source illuminates a common everyday 3-D (three-dimensional) scene. For example, the illumination may originate from a source of electromagnetic energy such as radar, infrared, or X-ray energy. But, as noted earlier, it could originate from less traditional sources, such as ultrasound or even a computer-generated illumination pattern. Similarly, the scene elements could be familiar objects, but they can just as easily be molecules, buried rock formations, or a human brain. We could even image a source, such as acquiring images of the sun. Depending on the nature of the source, illumination energy is reflected from, or transmitted through, objects. An example in the first category is light reflected from a planar surface. An example in the second category is when X-rays pass through a patient's body for the purpose of generating a diagnostic X-ray film. In some applications, the reflected or transmitted energy is focused onto a photoconverter (e.g., a phosphor screen), which converts the energy into visible light. Electron microscopy and some applications of gamma imaging use this approach.

Figure 2.12 shows the three principal sensor arrangements used to transform illumination energy into digital images. The idea is simple: incoming energy is transformed into a voltage by the combination of input electrical power and sensor material that is responsive to the particular type of energy being detected. The output voltage waveform is the response of the sensor(s), and a digital quantity is obtained from each sensor by digitizing its response. In this section, we look at the principal modalities for image sensing and generation.
Image Acquisition Using a Single Sensor
Figure 2.12(a) shows the components of a single sensor. Perhaps the most familiar sensor of this type is the photodiode, which is constructed of silicon materials and whose output voltage waveform is proportional to light. The use of a filter in front of a sensor improves selectivity. For example, a green (pass) filter in front of a light sensor favors light in the green band of the color spectrum. As a consequence, the sensor output will be stronger for green light than for other components in the visible spectrum.

In order to generate a 2-D image using a single sensor, there has to be relative displacement in both the x- and y-directions between the sensor and the area to be imaged. Figure 2.13 shows an arrangement used in high-precision scanning, where a film negative is mounted onto a drum whose mechanical rotation provides displacement in one dimension. The single sensor is mounted on a lead screw that provides motion in the perpendicular direction. Since mechanical motion can be controlled with high precision, this method is an inexpensive (but slow) way to obtain high-resolution images. Other similar mechanical arrangements use a flat bed, with the sensor moving in two linear directions. These types of mechanical digitizers sometimes are referred to as microdensitometers. Another example of imaging with a single sensor places a laser source coincident with the sensor. Moving mirrors are used to control the outgoing beam in a scanning pattern and to direct the reflected laser signal onto the sensor. This arrangement also can be used to acquire images using strip and array sensors, which are discussed in the following two sections.

Image Acquisition Using Sensor Strips


A geometry that is used much more frequently than single sensors consists of an in-line arrangement of sensors in the form of a sensor strip, as Fig. 2.12(b) shows. The strip provides imaging elements in one direction. Motion perpendicular to the strip provides imaging in the other direction, as shown in Fig. 2.14(a). This is the type of arrangement used in most flat bed scanners. Sensing devices with 4000 or more in-line sensors are possible. In-line sensors are used routinely in airborne imaging applications, in which the imaging system is mounted on an aircraft that flies at a constant altitude and speed over the geographical area to be imaged. One-dimensional imaging sensor strips that respond to various bands of the electromagnetic spectrum are mounted perpendicular to the direction of flight. The imaging strip gives one line of an image at a time, and the motion of the strip completes the other dimension of a two-dimensional image. Lenses or other focusing schemes are used to project the area to be scanned onto the sensors.

Sensor strips mounted in a ring configuration are used in medical and industrial imaging to obtain cross-sectional ("slice") images of 3-D objects, as Fig. 2.14(b) shows. A rotating X-ray source provides illumination, and the portion of the sensors opposite the source collects the X-ray energy that passes through the object (the sensors obviously have to be sensitive to X-ray energy). This is the basis for medical and industrial computerized axial tomography (CAT) imaging, as indicated in Sections 1.2 and 1.3.2. It is important to note that the output of the sensors must be processed by reconstruction algorithms whose objective is to transform the sensed data into meaningful cross-sectional images. In other words, images are not obtained directly from the sensors by motion alone; they require extensive processing. A 3-D digital volume consisting of stacked images is generated as the object is moved in a direction perpendicular to the sensor ring. Other modalities of imaging based on the CAT principle include magnetic resonance imaging (MRI) and positron emission tomography (PET). The illumination sources, sensors, and types of images are different, but conceptually they are very similar to the basic imaging approach shown in Fig. 2.14(b).

Image Acquisition Using Sensor Arrays


Figure 2.12(c) shows individual sensors arranged in the form of a 2-D array. Numerous electromagnetic and some ultrasonic sensing devices frequently are arranged in an array format. This is also the predominant arrangement found in digital cameras. A typical sensor for these cameras is a CCD array, which can be manufactured with a broad range of sensing properties and can be packaged in rugged arrays. CCD sensors are used widely in digital cameras and other light-sensing instruments. The response of each sensor is proportional to the integral of the light energy projected onto the surface of the sensor, a property that is used in astronomical and other applications requiring low-noise images. Noise reduction is achieved by letting the sensor integrate the input light signal over minutes or even hours (we discuss noise reduction by integration in Chapter 3). Since the sensor array shown in Fig. 2.15(c) is two dimensional, its key advantage is that a complete image can be obtained by focusing the energy pattern onto the surface of the array. Motion obviously is not necessary, as is the case with the sensor arrangements discussed in the preceding two sections.

The principal manner in which array sensors are used is shown in Fig. 2.15. This figure shows the energy from an illumination source being reflected from a scene element, but, as mentioned at the beginning of this section, the energy also could be transmitted through the scene elements. The first function performed by the imaging system shown in Fig. 2.15(c) is to collect the incoming energy and focus it onto an image plane. If the illumination is light, the front end of the imaging system is a lens, which projects the viewed scene onto the lens focal plane, as Fig. 2.15(d) shows. The sensor array, which is coincident with the focal plane, produces outputs proportional to the integral of the light received at each sensor. Digital and analog circuitry sweep these outputs and convert them to a video signal, which is then digitized by another section of the imaging system. The output is a digital image, as shown diagrammatically in Fig. 2.15(e). Conversion of an image into digital form is the topic of Section 2.4.

RELATIONSHIPS BETWEEN PIXELS


An image is denoted by f(x, y), and p and q are used to represent individual pixels of the image.

Neighbours of a pixel

A pixel p at (x, y) has four horizontal and vertical neighbours at (x+1, y), (x-1, y), (x, y+1) and (x, y-1). These are called the 4-neighbours of p: N4(p).
A pixel p at (x, y) has four diagonal neighbours at (x+1, y+1), (x+1, y-1), (x-1, y+1) and (x-1, y-1). These are called the diagonal neighbours of p: ND(p).
The 4-neighbours and the diagonal neighbours of p together are called the 8-neighbours of p: N8(p).
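A small sketch (plain Python, coordinates only, ignoring image borders) of the three neighbourhoods:

def n4(p):
    """4-neighbours of pixel p = (x, y)."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """Diagonal neighbours of pixel p."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    """8-neighbours = 4-neighbours plus diagonal neighbours."""
    return n4(p) | nd(p)

print(sorted(n8((2, 2))))   # the eight pixels surrounding (2, 2)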

Adjacency between pixels

Let V be the set of intensity values used to define adjacency.

In a binary image, V = {1} if we are referring to adjacency of pixels with value 1. In a gray-scale image the idea is the same, but the set V typically contains more elements. For example, for the adjacency of pixels with a range of possible intensity values 0 to 255, set V could be any subset of these 256 values.

We consider three types of adjacency:

a) 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set
N4(p).

b) 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set
N8(p).

c) m-adjacency (mixed adjacency): Two pixels p and q with values from V are m-adjacent if

1. q is in N4(p), or
2. q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
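A minimal sketch of the m-adjacency test, assuming a binary image stored as a dictionary mapping coordinates to values; the helper names are illustrative only:

def n4(p):
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def m_adjacent(img, p, q, V=frozenset({1})):
    """True if pixels p and q (with values in V) are m-adjacent."""
    if img[p] not in V or img[q] not in V:
        return False
    if q in n4(p):
        return True
    if q in nd(p):
        # no common 4-neighbour may have a value in V
        common = n4(p) & n4(q)
        return all(img[r] not in V for r in common)
    return False

# p = (0, 0) and q = (1, 1) share a 4-neighbour with value 1,
# so they are 8-adjacent but NOT m-adjacent.
img = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 1}
print(m_adjacent(img, (0, 0), (1, 1)))   # False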

Connectivity between pixels

It is an important concept in digital image processing.

It is used for establishing boundaries of objects and components of regions in an image.


Two pixels are said to be connected:

 if they are adjacent in some sense (neighbouring pixels, 4/8/m-adjacency)
 if their gray levels satisfy a specified criterion of similarity (e.g., equal intensity level)

There are three types of connectivity on the basis of adjacency. They are:

a) 4-connectivity: Two or more pixels are said to be 4-connected if they are 4-adjacent to each other.

b) 8-connectivity: Two or more pixels are said to be 8-connected if they are 8-adjacent to each other.

c) m-connectivity: Two or more pixels are said to be m-connected if they are m-adjacent to each other.

COLOUR MODELS:

The purpose of a color model is to facilitate the specification of colors in some standard way. A color model is a specification of a coordinate system and a subspace within that system where each color is represented by a single point. The color models most commonly used in image processing are:

· RGB model for color monitors and video cameras
· CMY and CMYK (cyan, magenta, yellow, black) models for color printing
· HSI (hue, saturation, intensity) model

The RGB color model


In this model, each color appears in its primary color components red, green, and blue. This model is based on a Cartesian coordinate system. The color subspace is the cube shown in Figure 15.3 (the RGB color cube). The different colors in this model are points on or inside the cube, and are defined by vectors extending from the origin.

All color values R, G, and B have been normalized in the range [0, 1]. However, we can
represent each of R, G, and B from 0 to 255. Each RGB color image consists of three
component images, one for each primary color as shown in the figure below. These
three images are combined on the screen to produce a color image.

The total number of bits used to represent each pixel in an RGB image is called the pixel depth. For example, if each of the red, green, and blue component images in an RGB image is an 8-bit image, the pixel depth of the RGB image is 24 bits. The figure below shows the component images of an RGB image.
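A small sketch, assuming NumPy, of an RGB image as three 8-bit component matrices with a 24-bit pixel depth:

import numpy as np

# A tiny 2x2 RGB image: three 8-bit component images stacked along the last axis.
rgb = np.array([[[255,   0,   0], [  0, 255,   0]],
                [[  0,   0, 255], [255, 255, 255]]], dtype=np.uint8)

red, green, blue = rgb[..., 0], rgb[..., 1], rgb[..., 2]
print(rgb.shape)                       # (2, 2, 3)
print(rgb.dtype.itemsize * 8 * 3)      # pixel depth in bits: 24

# Normalizing the components to [0, 1], as described in the text:
rgb_norm = rgb.astype(np.float32) / 255.0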

The HSI color model


HSI stands for hue, saturation, intensity. This model is interesting because it can
initially seem less intuitive than the RGB model, despite the fact that it describes color
in a way that is much more consistent with human visual perception.

It’s true that the RGB model draws upon our familiarity with mixing primary colors to
create other colors, but in terms of actual perception, RGB is very unnatural. People
don’t look at a grapefruit and think about the proportions of red, green, and blue that
are hidden inside the somewhat dull, yellowish-orangish color of the rind or the shinier,
reddish flesh. Though you probably never realized it, you think about color more in
terms of hue, saturation, and intensity.

 Hue is the color itself. When you look at something and try to assign a word
to the color that you see, you are identifying the hue. The concept of hue is
consistent with the way in which a particular wavelength of light
corresponds to a particular perceived color.
 Saturation refers to the “density” of the hue within the light that is
reaching your eye. If you look at a wall that is more or less white but with a
vague hint of peach, the hue is still peach, but the saturation is very low. In
other words, the peach-colored light reaching your eye is thoroughly
diluted by white light. The color of an actual peach, on the other hand,
would have a high saturation value.
 Intensity is essentially brightness. In a grayscale photograph, brighter
areas appear less gray (i.e., closer to white) and darker areas appear more
gray. A grayscale imaging system faithfully records the intensity of the
light, despite the fact that it ignores the colors. The HSI color model does
something similar in that it separates intensity from color (both hue and
saturation contribute to what we call color).

HSI is closely related to two other color models: HSL (hue, saturation, lightness) and
HSV (hue, saturation, value). The differences between these models are rather subtle;
the important thing at this point is to be aware that all three models are used and that
they all adopt the same general approach to quantifying color.

Converting RGB to HSI color model

Converting between the color models requires computing values one pixel at a time, so it can be computationally intensive if the conversion is performed many times in a short period. This holds especially for images with large dimensions.

To avoid dividing by 0, it’s a good practice to add a very small number in denominators
of conversion formulas.

Hue component formula:

H = θ if B ≤ G, and H = 360° − θ if B > G

Hue angle formula:

θ = cos⁻¹ { ½[(R − G) + (R − B)] / √[(R − G)² + (R − B)(G − B)] }

The R, G and B variables in the formulas above are the pixel color channel components: red, green and blue.

Saturation component formula:

S = 1 − 3·min(R, G, B) / (R + G + B)

Intensity component formula:

I = (R + G + B) / 3
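A minimal sketch of the conversion for a single normalized pixel, assuming NumPy and following the standard formulas above; the small eps guards against division by zero, as suggested in the text:

import numpy as np

def rgb_to_hsi(r, g, b, eps=1e-8):
    """Convert normalized RGB components (floats in [0, 1]) to (H, S, I).
    H is returned in degrees."""
    i = (r + g + b) / 3.0
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + eps)
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red -> hue 0, saturation 1, intensity 1/3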

Data products

The data from various sensors are presented in a form and format with specified
radiometric and geometric accuracy which can be readily used by various application
scientists for specific themes of their interest. Remote sensing data can be procured
by a number of users for various applications and information extraction, in the form of
a ‘data product’. This may be in the form of photographic output for visual processing
or in a digital format amenable for further computer processing.

There are varieties of remote sensing data acquired by different sensors and satellites. Before reaching users, the data undergo some processing steps. Requirements of users vary depending upon their interests and project objectives; hence, various remote sensing data providers/suppliers prepare a variety of data products in different formats.

Remote sensing data products are generated in certain 'data formats', of which users must be aware for various practical reasons. Pre-processed remote sensing data are generated into a number of products, like hardcopy prints on various types of paper and digital data on various types of computer-compatible media, like tapes, compact discs (CDs), DVDs, and various other storage devices. If the data product is a hard copy print, it is impossible to carry out any further processing or conversion before use. But if the product is in digital form, it may be converted into a processed digital image, and certain processing may further be required before any image analysis operation is performed. Types of data products may vary from country to country and/or from one data provider to another.

Index numbers for data products:

All remote sensing data products carry a specific index number. This index number is generated using the satellite path, which runs from the North Pole to the South Pole of the Earth. This pole-to-pole coverage of the Earth for each pass of the satellite is given a specific number, called the path number or track number.

SATELLITE DATA FORMAT:

Image data format can be defined as the sequential arrangement of pixels representing a digital image on a computer-compatible storage medium, such as a compact disc (CD/DVD).
Superposition of any three bands of data, each displayed in blue, green and red shades, gives a color composite image of the area. Remote sensing image data stored in data files/image files on magnetic tapes, compact discs (CDs/DVDs) or other media consist only of digital numbers.

These numbers form the black-and-white or color images when they are displayed on a screen or output as a hard copy. Thus, the image has to be retained in its digital form in order to carry out computer processing/classification. The digital output is supplied on a suitable computer-compatible storage medium, such as DVDs, CD-ROMs, DAT, etc., depending on user requests. The data may be arranged in band sequential (BSQ), band interleaved by line (BIL) or band interleaved by pixel (BIP) formats. The concept of an image data format thus arises from the question of how to arrange these pixels to achieve the optimum level of desired processing and display.

Types of Data Formats

Basically, there are three types of data formats:

• Band Interleaved by Pixel (BIP),

• Band Interleaved by Line (BIL), and

• Band Sequential (BSQ)

Band Interleaved by Pixel (BIP)

Data storage sequence in BIP format is shown in Fig. 6.4, for an image of size 3×3 (i.e.
3 rows and 3 columns) having three bands. Band, row and column (pixel) are generally
represented as B, R and P, respectively; B1, R1 and P1 represent band 1, row 1 and column (pixel) 1. In this format, the first pixel of row 1 of band 1 is stored first, then the first pixel of row 1 of band 2, and then the first pixel of row 1 of band 3. These are followed by the second pixel of row 1 of band 1, then the second pixel of row 1 of band 2, and then the second pixel of row 1 of band 3, and so on.
Band Interleaved by Line (BIL)

Data storage sequence in BIL format is shown in Fig. 6.5 for a three-band image of size 3×3 (i.e. 3 rows and 3 columns). B and R represent band and row; B1 and R1 represent band 1 and row 1. In this format, all the pixels of row 1 of band 1 are stored in sequence first, then all the pixels of row 1 of band 2, and then the pixels of row 1 of band 3. These are followed by all the pixels of row 2 of band 1, then all the pixels of row 2 of band 2, and then all the pixels of row 2 of band 3, and so on. You should note that both the BIP and BIL formats store data/pixels one line (row) at a time.
Band Sequential (BSQ)

BSQ format stores each band of data as a separate file. Arrangement sequence of
data in each file is shown in Fig. 6.6 for a three band image of size 3×3 (i.e. 3 rows and
3 columns). B and R, respectively represent band and row. B1 and R1 represent band 1
and row 1, respectively. In this format, all the pixels of band 1 are stored in sequence
first, followed by all the pixels of band 2 and then the pixels of band 3.
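A small sketch, assuming NumPy, that makes the three orderings concrete for a 3-band 3×3 image (the pixel values are just running indices):

import numpy as np

bands, rows, cols = 3, 3, 3
img = np.arange(bands * rows * cols).reshape(bands, rows, cols)  # (band, row, column)

bsq = img.flatten()                        # band 1 fully, then band 2, then band 3
bil = img.transpose(1, 0, 2).flatten()     # row 1 of every band, then row 2, ...
bip = img.transpose(1, 2, 0).flatten()     # pixel 1 of every band, then pixel 2, ...

print(bsq[:6])   # [0 1 2 3 4 5]     -> first two rows of band 1
print(bil[:6])   # [0 1 2 9 10 11]   -> row 1 of band 1, then row 1 of band 2
print(bip[:6])   # [0 9 18 1 10 19]  -> pixel (1,1) of bands 1-3, then pixel (1,2)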

Building on these three basic data formats, a number of other formats, like TIFF, GeoTIFF, PNG, ADRG, super structured, JFIF, JPEG, etc., have been developed by different organisations. Most image processing software for remote sensing data processing supports these file formats. You can see the list of data file formats supported by an image processing package in its documentation manual. If a software package does not list a certain data format, then files in that format cannot be opened and used in that particular software.

DIGITAL IMAGE PROCESSING SYSTEMS:

The central processing unit (CPU) is the computing part of the computer. It consists of a control unit and an arithmetic logic unit. The CPU performs numerical integer and/or floating point calculations and directs input and output from and to mass storage devices, color monitors, digitizers, plotters, etc. The CPU's efficiency is often measured in terms of how many millions of instructions per second (MIPS) it can process, e.g., 500 MIPS. It is also customary to describe a CPU in terms of the number of cycles it can process in 1 second, measured in megahertz, e.g., 1000 MHz (1 GHz). Manufacturers market computers with CPUs faster than 4 GHz, and this speed will continue to increase. The system bus connects the CPU with the main memory, managing data transfer and instructions between the two. Therefore, another important consideration when purchasing a computer is bus speed.

Personal computers (with 16- to 64-bit CPUs) are the workhorses of digital image
processing and GIS analysis. Personal computers are based on microprocessor
technology where the entire CPU is placed on a single chip. The most common operating
systems for personal computers are various Microsoft Windows operating systems and
the Macintosh operating system. Personal computers useful for digital image processing
seem to always cost approximately $2,500 with 2 GB of random access memory (RAM), a high-resolution color monitor (e.g., capable of displaying 1024 × 768 pixels), a reasonably sized hard disk (e.g., >300 GB), and a rewriteable disc drive (e.g., CD-RW or DVD-RW).

Computer workstations usually consist of a 64-bit reduced-instruction-set-computer (RISC) CPU that can address more random access memory than personal computers. The RISC chip is typically faster than the traditional CISC chip. RISC workstation application software and hardware maintenance costs are usually higher than those of personal computer-based image processing systems. The most common workstation operating
systems are UNIX and various Microsoft Windows products. The computers can
function independently or be networked to a file-server. Both PCs and workstations can
have multiple CPUs that allow remotely sensed data to be processed in parallel and at
great speed.

Mainframe computers (with ≥ 64-bit CPUs) perform calculations more rapidly than PCs or workstations and are able to support hundreds of users simultaneously, especially parallel mainframe computers such as a CRAY. This makes mainframes ideal for intensive, CPU-dependent tasks such as image registration/rectification, mosaicking multiple scenes, spatial frequency filtering, terrain rendering, classification, hyperspectral image analysis, and complex spatial GIS modelling. If desired, the output from intensive mainframe processing can be passed to a workstation or personal computer for subsequent, less intensive or inexpensive processing.

Read-Only Memory, Random Access Memory, Serial and Parallel Processing, and
Arithmetic Coprocessor

Computers have banks of memory that contain instructions that are indispensable to the successful functioning of the computer. A computer may contain a single CPU or multiple CPUs and process data serially (sequentially) or in parallel. Most CPUs now have special-purpose math coprocessors.

Read-only memory (ROM) retains information even after the computer is shut down
because power is supplied from a battery that must be replaced occasionally. For
example, the date and time are stored in ROM after the computer is turned off. When
restarted, the computer looks in the date and time ROM registers and displays the
correct information. Most computers have sufficient ROM for digital image processing
applications; therefore, it is not a serious consideration.

Random access memory (RAM) is the computer's primary temporary workspace. It requires power to maintain its content. Therefore, all of the information that is temporarily placed in RAM while the CPU is performing digital image processing must be saved to a hard disk (or other media such as a CD) before turning off the computer.

Computers should have sufficient RAM for the operating system, image processing
applications software, and any remote sensor data that must be held in temporary
memory while calculations are performed. Computers with 64-bit CPUs can address
more RAM than 32-bit machines (see Table 3-1). RAM is broken down into two types:
dynamic RAM (DRAM) and static RAM (SRAM). The data stored in DRAM is updated
thousands of times per second; SRAM does not need to be refreshed. SRAM is faster
but is also more expensive. It seems that one can never have too much RAM for image
processing applications. RAM prices continue to decline while RAM speed continues to
increase.

Serial and parallel processing

Consider performing a per-pixel classification on a typical 1024-row by 1024-column remote sensing dataset. Each pixel is classified by passing its spectral data to the CPU and then progressing to the next pixel. This is serial processing. Conversely, suppose that instead of just one CPU we had 1024 CPUs. In this case the class of each of the 1024 pixels in a row could be determined using 1024 separate CPUs. This parallel image processing would classify the line of data approximately 1024 times faster than processing it serially. In an entirely different parallel configuration, each of the 1024 CPUs could be allocated an entire row of the dataset. Finally, each of the CPUs could be allocated a separate band if desired. For example, if 224 bands of AVIRIS hyperspectral data were available, 224 of the 1024 processors could be allocated to evaluate the 224 brightness values associated with each individual pixel, with 800 additional CPUs available for other tasks.
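A minimal sketch of the serial versus parallel idea, assuming NumPy and Python's multiprocessing module; the per-pixel "classifier" here is a toy threshold, not a real classification algorithm:

import numpy as np
from multiprocessing import Pool

def classify_row(row):
    """Toy per-pixel 'classifier': label a pixel 1 if its value exceeds 127."""
    return (row > 127).astype(np.uint8)

if __name__ == "__main__":
    scene = np.random.default_rng(0).integers(0, 256, size=(1024, 1024))
    # Serial processing: one row after another on a single CPU.
    serial = np.array([classify_row(r) for r in scene])
    # Parallel processing: rows are distributed across the available CPUs.
    with Pool() as pool:
        parallel = np.array(pool.map(classify_row, scene))
    print(np.array_equal(serial, parallel))   # True: same result, computed in parallel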

An arithmetic coprocessor is a special mathematical circuit that performs high-speed floating point operations while working in harmony with the CPU. The Intel 486 processor was the first CPU to offer a built-in math coprocessor (Intel, 2004). All of
the current CPUs contain arithmetic coprocessors. If substantial resources are
available, then an array processor is ideal. It consists of a bank of memory and special
circuitry dedicated to performing simultaneous computations on elements of an array
(matrix) of data in n dimensions. Most remotely sensed data are collected and stored as
arrays of numbers so array processors are especially well suited to image enhancement
and analysis operations. However, specialised software must often be written to take
advantage of the array processor.

Graphical User Interface


One of the best scientific visualization environments for the analysis of remote sensor data occurs when the analyst communicates with the digital image processing system interactively using a point-and-click graphical user interface (GUI) (Limp, 1999). Most sophisticated image processing systems are now configured with a friendly, point-and-click GUI that allows rapid display of images and the selection of important image processing functions (Chan, 2001). Several effective digital image processing graphical user interfaces include:

 ERDAS Imagine's intuitive point-and-click icons,
 Research System's Environment for Visualizing Images (ENVI) hyperspectral data analysis interface,
 ER Mapper,
 IDRISI,
 ESRI ArcGIS, and
 Adobe Photoshop.

Photoshop is very useful for processing photographs and images that have three or
fewer bands of data.

Noninteractive batch processing is of value for time-consuming processes such as image rectification, mosaicking, orthophoto generation, and filtering. Batch processing frees up laboratory PCs or workstations during peak demand because the jobs can be stored and executed when the computer is otherwise idle (e.g., during early morning hours). Batch processing can also be useful during peak hours because it allows the analyst to set up a series of operations that can be executed in sequence without operator intervention. Digital image processing also can now be performed interactively over the Internet at selected sites.

Computer Operating System and Compiler(s):

The computer operating system and compiler(s) must be easy to use yet powerful
enough so that analysts can program their own relatively sophisticated algorithms and
experiment with them on the system. It is not wise to configure an image processing
system around an unusual operating system or compiler because it becomes difficult to
communicate with the peripheral devices and share applications with other scientists.

Operating System:

The operating system is the first program loaded into memory (RAM) when the
computer is turned on. It controls all of the computer's higher-order functions. The operating system kernel resides in memory at all times. The operating system provides
the user interface and controls multitasking. It handles the input and output to the
hard disk and all peripheral devices such as compact disks, scanners, printers, plotters,
and color displays. All digital image processing application programs must communicate
with the operating system. The operating system sets the protocols for the application
programs that are executed by it. The difference between a single-user operating
system and a network operating system is the latter's multi-user capability. For
example, Microsoft Windows XP (home edition) and the Macintosh OS are single-user
operating systems designed for one person at a desktop computer working
independently. Various Microsoft Windows, UNIX, and Linux network operating
systems are designed to manage multiple user requests at the same time and complex
network security.

Compiler:

A computer software compiler translates instructions programmed in a high-level language such as C++ or Visual Basic into machine language that the CPU can understand. A compiler usually generates assembly language first and then translates the assembly language into machine language. The compilers most often used in the development of digital image processing software are C++, Assembler, and Visual Basic. Many digital image processing systems provide a toolkit that programmers can use to compile their own digital image processing algorithms (e.g., ERDAS, ER Mapper, ENVI). The toolkit consists of fundamental subroutines that perform very specific tasks such as reading a line of image data into RAM or modifying a color look-up table to change the color of a pixel (RGB) on the screen.

It is often useful for remote sensing analysts to program in one of the high-level languages just listed. Very seldom will a single digital image processing software system perform all of the functions needed for a given project. Therefore, the ability to modify existing software or integrate newly developed algorithms with the existing software is important.

Rapid Access Mass Storage:

Digital remote sensor data (and other ancillary raster GIS data) are often stored in a matrix band sequential (BSQ) format in which each spectral band of imagery (or GIS data) is stored as an individual file. Each picture element of each band is typically represented in the computer by a single 8-bit byte with values from 0 to 255. The best way to make brightness values rapidly available to the computer is to place the data on a hard disk, CD-ROM, DVD, or DVD-RAM where each pixel of the data matrix may be accessed at random (not serially) and at great speed (e.g., within microseconds). The cost of hard disk, CD-ROM, or DVD storage per gigabyte continues to decline.

It is common for digital image processing laboratories to have gigabytes of hard-disk mass storage associated with each workstation. For example, each personal computer in the laboratory shown in Figure 3-2 has 80 GB of mass storage. Many image processing laboratories now use RAID (redundant arrays of inexpensive hard disks) technology in which two or more drives working together provide increased performance and various levels of error recovery and fault tolerance. Other storage media, such as magnetic tapes, are usually too slow for real-time image retrieval, manipulation, and storage because they do not allow random access of data. However, given their large storage capacity, they remain a cost-effective way to store data.

Companies are now developing new mass storage technologies based on atomic resolution storage (ARS), which holds the promise of storage densities of close to 1 terabit per square inch, the equivalent of nearly 50 DVDs on something the size of a credit card. The technology uses microscopic probes less than one-thousandth the width of a human hair. When the probes are brought near a conducting material, electrons write data on the surface. The same probes can detect and retrieve data and can be used to write over old data.

Archiving Considerations: Longevity

Storing remote sensor data is no trivial matter. Significant sums of money are spent purchasing remote sensor data by commercial companies, natural resource agencies, and universities. Unfortunately, most of the time not enough attention is given to how the expensive data are stored or archived to protect the long-term investment. Figure 3-5 depicts several types of analog and digital remote sensor data mass storage devices and the average time to physical obsolescence, that is, when the media begin to deteriorate and information is lost. Interestingly, properly exposed, washed, and fixed analog black-and-white aerial photograph negatives have considerable longevity, often more than 100 years. Color negatives with their respective dye layers have longevity, but not as much as black-and-white negatives. Similarly, black-and-white paper prints have greater longevity than color prints (Kodak, 1995). Hard and floppy magnetic disks have relatively short longevity, often less than 20 years. Magnetic tape media (e.g., 3/4-in. tape, 8-mm tape, and 1/2-in. tape, shown in Figure 3-6) can become unreadable within 10 to 15 years if not rewound and properly stored in a cool, dry environment.

Optical disks can now be written to, read, and written over again at relatively high speeds and can store much more data than other portable media such as floppy disks. The technology used in rewriteable optical systems is magneto-optics, where data are recorded magnetically as on disks and tapes, but the bits are much smaller because a laser is used to etch each bit. The laser heats the bit to 150 °C, at which temperature the bit is realigned when subjected to a magnetic field. To record new data, existing bits must first be set to zero.

Only the optical disk provides relatively long-term storage potential (> 100 years). In addition, optical disks store large volumes of data on relatively small media. Advances in optical compact disc (CD) technology promise to increase the storage capacity to > 17 GB using new rewriteable digital video disc (DVD) technology. In most remote sensing laboratories, rewritable CD-RWs or DVD-RWs have supplanted tapes as the backup system of choice. DVD drives are backwards compatible and can read data from CDs.

It is important to remember when archiving remote sensor data that sometimes the problem is the loss of a) the read-write software and/or b) the read-write hardware (the drive mechanism and heads), and not the digital media itself (Rothenberg, 1995; Jensen et al., 1996). Therefore, as new computers are purchased it is a good idea to set aside a single computer system that is representative of a certain computer era so that one can always read any data stored on old mass storage media.

Computer Display Spatial and Color Resolution:

The display of remote sensor data on a computer screen is one of the most
fundamental elements of digital image analysis (Brown and Feringa, 2003). Careful
selection of the computer display characteristics will provide the optimum visual image
analysis environment for the human interpreter. The two most important
characteristics are computer display spatial and color resolution.

Computer Screen Display Resolution:

The image processing system should be able to display at least 1024 rows by 1024 columns on the computer screen at one time. This allows larger geographic areas to be examined and places the terrain of interest in its regional context. Most Earth scientists prefer this regional perspective when performing terrain analysis using remote sensor data. Furthermore, it is disconcerting to have to analyze four 512 × 512 images when a single 1024 × 1024 display provides the information at a glance. An ideal screen display resolution is 1600 × 1200 pixels.

Computer Screen Color Resolution:

The computer screen color resolution is the number of gray scale tones or colors (e.g., 256) that can be displayed on a CRT monitor at one time out of a palette of available colors (e.g., 16.7 million). For many applications, such as high-contrast black-and-white linework cartography, only 1 bit of color is required [i.e., the line is either black or white (0 or 1)]. For more sophisticated computer graphics for which many shades of gray or color combinations are required, up to 8 bits (or 256 colors) may be required. Most thematic mapping and GIS applications may be performed quite well by systems that display just 64 user-selectable colors out of a palette of 256 colors.

Conversely, the analysis and display of remote sensor image data may require much higher CRT screen color resolution than cartographic and GIS applications (Slocum, 1999). For example, most relatively sophisticated digital image processing systems can display a tremendous number of unique colors (e.g., 16.7 million) from a large color palette (e.g., 16.7 million). The primary reason for these color requirements is that image analysts must often display a composite of several images at one time on a CRT. This process is called color compositing.

For example, to display a typical color-infrared image of Landsat Thematic Mapper data, it is necessary to composite three separate 8-bit images [e.g., the green band (TM 2 = 0.52 to 0.60 µm), the red band (TM 3 = 0.63 to 0.69 µm), and the reflective infrared band (TM 4 = 0.76 to 0.90 µm)]. To obtain a true-color composite image that provides every possible color combination for the three 8-bit images requires that 2^24 colors (16,777,216) be available in the palette. Such true-color, direct-definition systems are relatively expensive because every pixel location must be bitmapped. This means that there must be a specific location in memory that keeps track of the exact blue, green, and red color value for every pixel. This requires substantial amounts of computer memory, which are usually collected in what is called an image processor. Given the availability of image processor memory, the question is: what is adequate color resolution?

Generally, 4096 carefully selected colors out of a very large palette (e.g., 16.7 million) appears to be the minimum acceptable for the creation of remote sensing color composites. This provides 12 bits of color, with 4 bits available for each of the blue, green, and red image planes (Table 3-3). For image processing applications other than compositing (e.g., black-and-white image display, color density slicing, pattern recognition classification), the 4,096 available colors and large color palette are more than adequate. However, the larger the palette and the greater the number of displayable colors at one time, the better the representation of the remote sensor data on the CRT screen for visual analysis. More information about how images are displayed using an image processor is given in Chapter 5. The network configured in Figure 3-2 has six 24-bit color workstations.

Several remote sensing systems now collect data with 10-, 11-, and even 12-bit
radiometric resolution with brightness values ranging from 0 to 1023, 0 to 2047, and 0
to 4095, respectively. Unfortunately, despite advances in video technology, at the
present time it is necessary to generalize (i.e., dumb down) the radiometric precision of
the remote sensor data to 8 bits per pixel simply because current video display
technology cannot handle the demands of the increased precision.
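A small sketch, assuming NumPy, of generalizing 10-, 11- or 12-bit brightness values to 8 bits for display:

import numpy as np

def to_8bit(data, input_bits=12):
    """Linearly rescale 10-, 11- or 12-bit brightness values to 0-255 for display."""
    max_in = 2 ** input_bits - 1
    return (data.astype(np.float64) / max_in * 255).round().astype(np.uint8)

dn = np.array([0, 1023, 2047, 4095])        # example 12-bit digital numbers
print(to_8bit(dn))                          # [  0  64 127 255]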
