Class Note - Image Processing and GIS (Repaired)
Raster image data are laid out in a grid similar to the squares on a checkerboard. Each cell of
the grid is represented by a pixel, also known as a grid cell. In remotely sensed image data,
each pixel represents an area of the Earth at a specific location. The data file value assigned
to that pixel is the record of reflected radiation or emitted heat from the Earth's surface at that
location. Data file values may also represent elevation, as in digital elevation models
(DEMs).
A photograph refers specifically to images that have been detected as well as recorded on
photographic film. The black and white photo to the left, of part of the city of Ottawa, Canada,
was taken in the visible part of the spectrum. Photos are normally recorded over the
wavelength range from 0.3 μm to 0.9 μm - the visible and reflected infrared. Based on
these definitions, we can say that all photographs are images, but not all images are
photographs. Therefore, unless we are talking specifically about an image recorded
photographically, we use the term image.
A photograph could also be represented and displayed in a digital format by subdividing the
image into small equal-sized and shaped areas, called picture elements or pixels, and
representing the brightness of each area with a numeric value or digital number.
Indeed, that is exactly what has been done to the photo to the left. In fact, using the
definitions we have just discussed, this is actually a digital image of the original photograph.
The photograph was scanned and subdivided into pixels with each pixel assigned a digital
number representing its relative brightness. The computer displays each digital value as a
different brightness level. Sensors that record electromagnetic energy electronically
record the energy as an array of numbers in digital format right from the start. These
two different ways of representing and displaying remote sensing data, either pictorially or
digitally, are interchangeable as they convey the same information (although some detail
may be lost when converting back and forth).
Bands
Digital Image data may include several bands of information. Each band is a set of data file
values for a specific portion of the electromagnetic spectrum of reflected light or emitted heat
(red, green, blue, near-infrared, infrared, thermal, etc.) or some other user-defined
information created by combining or enhancing the original bands, or creating new bands
from other sources.
Bands vs. Layers:
The bands of data are occasionally referred to as layers. Once a band is imported into a GIS,
it becomes a layer of information which can be processed in various ways. Additional layers
can be created and added to the image file.
We see colour because our eyes detect the entire visible range of wavelengths and our brains
process the information into separate colours. Can you imagine what the world would look
like if we could only see very narrow ranges of wavelengths or colours? That is how many
sensors work. The information from a narrow wavelength range is gathered and stored
in a channel, also sometimes referred to as a band. We can combine and display channels of
information digitally using the three primary colours (blue, green, and red). The data from
each channel is represented as one of the primary colours and, depending on the relative
brightness (i.e. the digital value) of each pixel in each channel, the primary colours combine
in different proportions to represent different colours.
When we use this method to display a single channel or range of wavelengths, we are
actually displaying that channel through all three primary colours. Because the brightness
level of each pixel is the same for each primary colour, they combine to form a black and
white image, showing various shades of gray from black to white. When we display more
than one channel each as a different primary colour, then the brightness levels may be
different for each channel/primary colour combination and they will combine to form a
colour image.
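To make the idea of combining channels through the primary colours concrete, here is a minimal Python sketch (using NumPy; the function name make_composite and the random stand-in band are invented for the example, not part of any standard package). It scales each co-registered band to 0-255 and stacks them as red, green and blue; feeding the same channel to all three primaries produces the grey-scale case described above.

```python
import numpy as np

def make_composite(red_band, green_band, blue_band):
    """Stack three co-registered bands into an RGB composite for display."""
    def scale(band):
        # Scale one channel to 0-255 so its relative brightness drives
        # the amount of that primary colour at every pixel.
        band = band.astype(np.float64)
        lo, hi = band.min(), band.max()
        if hi == lo:
            return np.zeros_like(band)
        return (band - lo) / (hi - lo) * 255.0

    return np.dstack([scale(red_band), scale(green_band),
                      scale(blue_band)]).astype(np.uint8)

# Displaying a single channel through all three primaries gives shades of grey:
band = np.random.randint(0, 256, (100, 100))   # stand-in for one channel
grey = make_composite(band, band, band)        # identical brightness in R, G and B
```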
NOTE: DEMs are not remotely sensed image data, but are currently being produced from
stereo points in radar imagery.
Coordinate Systems
The location of a pixel in a file or on a displayed or printed image is expressed using a
coordinate system. In two-dimensional coordinate systems, locations are organized in a grid
of columns and rows. Each location on the grid is expressed as a pair of coordinates known as
X and Y. The X coordinate specifies the column of the grid, and the Y coordinate
specifies the row. Image data organized into such a grid are known as raster data.
File Coordinates: File coordinates refer to the location of the pixels within the image (data)
file. File coordinates for the pixel in the upper left corner of the image always begin at 0, 0.
Resolution is a broad term commonly used to describe image detail. When describing remotely sensed
data, four distinct types of resolution must be considered:
• spectral—the specific wavelength intervals that a sensor can record
• spatial—the area on the ground represented by each pixel
• radiometric—the number of possible data file values in each band (indicated by the number
of bits into which the recorded energy is divided)
• temporal—how often a sensor obtains imagery of a particular area
These four domains contain separate information that can be extracted from the raw data.
Scale
The terms large-scale imagery and small-scale imagery often refer to spatial resolution. Scale
is the ratio of distance on a map as related to the true distance on the ground. Large-scale in
remote sensing refers to imagery in which each pixel represents a small area on the ground,
such as SPOT data, with a spatial resolution of 10 m or 20 m. Small scale
refers to imagery in which each pixel represents a large area on the ground, such as Advanced
Very High Resolution Radiometer (AVHRR) data, with a spatial resolution of 1.1 km.
This terminology is derived from the fraction used to represent the scale of the map, such as
1:50,000. Small-scale imagery is represented by a small fraction (one over a very large
number). Large-scale imagery is represented by a larger fraction (one over a smaller number).
Generally, anything smaller than 1:250,000 is considered small-scale imagery.
The ratio of distance on an image or map, to actual ground distance is referred to as scale. If
you had a map with a scale of 1:100,000, an object of 1cm length on the map would actually
be an object 100,000cm (1km) long on the ground. Maps or images with small "map-to-
ground ratios" are referred to as small scale (e.g. 1:100,000), and those with larger ratios (e.g.
1:5,000) are called large scale.
NOTE: Scale and spatial resolution are not always the same thing. An image always has the
same spatial resolution, but it can be presented at different scales (Simonett et al, 1983).
The detail discernible in an image is dependent on the spatial resolution of the sensor and
refers to the size of the smallest possible feature that can be detected. Spatial resolution of
passive sensors (we will look at the special case of active microwave sensors later) depends
primarily on their Instantaneous Field of View (IFOV).
The IFOV is the angular cone of visibility of the sensor (A) and determines the area on the
Earth's surface which is "seen" from a given altitude at one particular moment in time (B).
The size of the area viewed is determined by multiplying the IFOV by the distance from the
ground to the sensor (C). This area on the ground is called the resolution cell and determines
a sensor's maximum spatial resolution. For a homogeneous feature to be detected, its size
generally has to be equal to or larger than the resolution cell. If the feature is smaller than
this, it may not be detectable as the average brightness of all features in that resolution cell
will be recorded. However, smaller features may sometimes be detectable if their reflectance
dominates within a particular resolution cell, allowing sub-pixel or resolution cell detection.
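As a rough illustration of this relationship, the short Python sketch below multiplies the IFOV by the sensor-to-ground distance using the small-angle approximation; the function name and the numbers are illustrative assumptions, not values for any particular sensor.

```python
def resolution_cell_size(ifov_mrad, altitude_m):
    """Approximate ground resolution cell size (metres).

    Small-angle approximation: cell size ~ IFOV (in radians) x distance
    from the ground to the sensor.
    """
    return (ifov_mrad / 1000.0) * altitude_m

# Illustrative numbers only: a 0.0425 mrad IFOV viewed from about 705 km
# altitude gives a resolution cell of roughly 30 m on the ground.
print(resolution_cell_size(0.0425, 705_000))   # ~29.96 m
```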
Most remote sensing images are composed of a matrix of picture elements, or pixels,
which are the smallest units of an image. Image pixels are normally square and represent a
certain area on an image. It is important to distinguish between pixel size and spatial
resolution - they are not interchangeable. If a sensor has a spatial resolution of 20 metres and
an image from that sensor is displayed at full resolution, each pixel represents an area of 20m
x 20m on the ground. In this case the pixel size and resolution are the same. However, it is
possible to display an image with a pixel size different than the resolution. Many posters of
satellite images of the Earth have their pixels averaged to represent larger areas, although the
original spatial resolution of the sensor that collected the imagery remains the same.
Images where only large features are visible are said to have coarse or low resolution. In fine
or high resolution images, small objects can be detected. Military sensors for example, are
designed to view as much detail as possible, and therefore have very fine resolution.
Commercial satellites provide imagery with resolutions varying from a few metres to several
kilometres. Generally speaking, the finer the resolution, the less total ground area can be
seen.
Spectral Resolution
Spectral resolution refers to the specific wavelength intervals in the electromagnetic spectrum
that a sensor can record (Simonett et al, 1983). For example, band 1 of the Landsat TM
sensor records energy between 0.45 and 0.52 µm in the visible part of the spectrum. Wide
intervals in the electromagnetic spectrum are referred to as coarse spectral resolution,
and narrow intervals are referred to as fine spectral resolution. For example, the SPOT
panchromatic sensor is considered to have coarse spectral resolution because it records EMR
between 0.51 and 0.73 µm. On the other hand, band 3 of the Landsat TM sensor has fine
spectral resolution because it records EMR between 0.63 and 0.69 µm (Jensen, 1996).
NOTE: The spectral resolution does not indicate how many levels the signal is broken into.
The spectral response and spectral emissivity curves characterize the reflectance and/or
emittance of a feature or target over a variety of wavelengths. Different classes of features
and details in an image can often be distinguished by comparing their responses over distinct
wavelength ranges. Broad classes, such as water and vegetation, can usually be separated
using very broad wavelength ranges - the visible and near infrared. Other more specific
classes, such as different rock types, may not be easily distinguishable using either of these
broad wavelength ranges and would require comparison at much finer wavelength ranges to
separate them. Thus, we would require a sensor with higher spectral resolution. Spectral
resolution describes the ability of a sensor to define fine wavelength intervals. The finer the
spectral resolution, the narrower the wavelength ranges for a particular channel or band.
Black and white film records wavelengths extending over much, or all of the visible portion
of the electromagnetic spectrum. Its spectral resolution is fairly coarse, as the various
wavelengths of the visible spectrum are not individually distinguished and the overall
reflectance in the entire visible portion is recorded. Colour film is also sensitive to the
reflected energy over the visible portion of the spectrum, but has higher spectral resolution, as
it is individually sensitive to the reflected energy at the blue, green, and red wavelengths of
the spectrum. Thus, it can represent features of various colours based on their reflectance in
each of these distinct wavelength ranges.
Many remote sensing systems record energy over several separate wavelength ranges at
various spectral resolutions. These are referred to as multi-spectral sensors and will be
described in some detail in following sections. Advanced multi-spectral sensors called
hyperspectral sensors, detect hundreds of very narrow spectral bands throughout the
visible, near-infrared, and mid-infrared portions of the electromagnetic spectrum. Their
very high spectral resolution facilitates fine discrimination between different targets
based on their spectral response in each of the narrow bands.
Radiometric Resolution
Radiometric resolution refers to the dynamic range, or number of possible data file values in
each band. This is referred to by the number of bits into which the recorded energy is divided.
For instance, in 8-bit data, the data file values range from 0 to 255 for each pixel, but in 7-bit
data, the data file values for each pixel range from 0 to 127. In the following figure, 8-bit and 7-
bit data are illustrated. The sensor measures the EMR in its range. The total intensity of the
energy from 0 to the maximum amount the sensor measures is broken down into 256
brightness values for 8-bit data, and 128 brightness values for 7-bit data.
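The link between the number of bits and the number of brightness values can be sketched in a few lines of Python (NumPy assumed; the function name and sample radiances are invented for illustration): the measured energy range is simply divided into 2 to the power of the number of bits levels.

```python
import numpy as np

def quantize_radiance(radiance, max_radiance, bits):
    """Scale measured energy (0 .. max_radiance) into 2**bits brightness values."""
    levels = 2 ** bits                      # 8 bits -> 256 values, 7 bits -> 128 values
    dn = np.round(radiance / max_radiance * (levels - 1))
    return dn.astype(np.int32)

radiance = np.array([0.0, 50.0, 100.0])     # illustrative sensor measurements
print(quantize_radiance(radiance, 100.0, 8))   # [  0 128 255]
print(quantize_radiance(radiance, 100.0, 7))   # [  0  64 127]
```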
While the arrangement of pixels describes the spatial structure of an image, the radiometric
characteristics describe the actual information content in an image. Every time an image is
acquired on film or by a sensor, its sensitivity to the magnitude of the electromagnetic energy
determines the radiometric resolution. The radiometric resolution of an imaging system
describes its ability to discriminate very slight differences in energy. The finer the
radiometric resolution of a sensor, the more sensitive it is to detecting small differences in
reflected or emitted energy.
Temporal resolution: In addition to spatial, spectral, and radiometric resolution, the concept
of temporal resolution is also important to consider in a remote sensing system. The revisit
period of a satellite sensor, which refers to the length of time it takes for the satellite to
complete one entire orbit cycle, is usually several days. Therefore the absolute temporal resolution
of a remote sensing system to image the exact same area at the same viewing angle a second
time is equal to this period. However, because of some degree of overlap in the imaging
swaths of adjacent orbits for most satellites and the increase in this overlap with increasing
latitude, some areas of the Earth tend to be re-imaged more frequently. Also, some satellite
systems are able to point their sensors to image the same area between different satellite
passes separated by periods from one to five days. Thus, the actual temporal resolution of a
sensor depends on a variety of factors, including the satellite/sensor capabilities, the swath
overlap, and latitude.
The ability to collect imagery of the same area of the Earth's surface at different periods of
time is one of the most important elements for applying remote sensing data. Spectral
characteristics of features may change over time and these changes can be detected by
collecting and comparing multi-temporal imagery. For example, during the growing season,
most species of vegetation are in a continual state of change and our ability to monitor those
subtle changes using remote sensing is dependent on when and how frequently we collect
imagery. By imaging on a continuing basis at different times we are able to monitor the
changes that take place on the Earth's surface, whether they are naturally occurring (such as
changes in natural vegetation cover or flooding) or induced by humans (such as urban
development or deforestation). The time factor in imaging is important when:
• persistent clouds offer limited clear views of the Earth's surface (often in the tropics)
• short-lived phenomena (floods, oil slicks, etc.) need to be imaged
• multi-temporal comparisons are required (e.g. the spread of a forest disease from one
year to the next)
Data Storage Image data can be stored on a variety of media—tapes, CD-ROMs, or floppy
diskettes, for example—but how the data are stored (e.g., structure) is more important than on
what they are stored. All computer data are in binary format. The basic unit of binary data is
a bit. A bit can have two possible values—0 and 1, or ―off‖ and ―on‖ respectively. A set of
bits, however, can have many more values, depending on the number of bits used. The
number of values that can be expressed by a set of bits is 2 to the power of the number of bits
used.
A byte is 8 bits of data. Generally, file size and disk space are referred to by number of bytes.
For example, a PC may have 640 kilobytes (1,024 bytes = 1 kilobyte) of RAM (random
access memory), or a file may need 55,698 bytes of disk space. A megabyte (Mb) is about
one million bytes. A gigabyte (Gb) is about one billion bytes.
Storage Formats: Image data can be arranged in several ways on a tape or other media. The
most common storage formats are: • BIL (band interleaved by line) • BSQ (band sequential) •
BIP (band interleaved by pixel). For a single band of data, all formats (BIL, BIP, and
BSQ) are identical, as long as the data are not blocked.
BSQ: In BSQ (band sequential) format, each entire band is stored consecutively in a
separate file (Slater, 1980). That is, B1, B2, B3…Bn are written as n separate files on the
storage medium. This format is advantageous in that:
• One band can be read and viewed easily, and
• Multiple bands can be easily loaded in any order.
Band-1, Band-2, Band-3, … Band-n are all in separate files in BSQ (band sequential) format:
Band-1 file
1st line: 1, 2, 3, …, N pixels
2nd line: 1, 2, 3, …, N pixels
3rd line: 1, 2, 3, …, N pixels
…
Mth line: 1, 2, 3, …, N pixels
End of File
Band-2 file
1st line: 1, 2, 3, …, N pixels
2nd line: 1, 2, 3, …, N pixels
…
Mth line: 1, 2, 3, …, N pixels
End of File
…
Band-n file
1st line: 1, 2, 3, …, N pixels
2nd line: 1, 2, 3, …, N pixels
…
Mth line: 1, 2, 3, …, N pixels
End of File
BIL: In BIL (band interleaved by line) format, each record in the file contains a scan line
(row) of data for one band (Slater, 1980). In BIL, the first line of band-1 is followed by the same
line of band-2, and so on until all n bands are covered; then the second line of band-1 follows. This is
repeated for the whole scene, and the Mth lines of B1, B2, B3…Bn form the last records. All
bands of data for a given line are stored consecutively within the file as shown in Fig.
Band-1, Band-2, Band-3, … Band-n are all in a single file in BIL (band interleaved by line) format.
BIP
In BIP (band interleaved by pixel) format, a single image file in pixel-interleaved order is
generated. In BIP, the pixels are interleaved: the 1st pixel of Band-1, the 1st pixel of Band-2, the
1st pixel of Band-3, …, and the 1st pixel of Band-n together constitute a single pixel group, so the
values for each band are ordered within a given pixel. The pixel groups are arranged sequentially
on the tape (Slater, 1980). The sequence for BIP format is:
Band-1, Band-2, Band-3, … Band-n are all in a single file in BIP (band interleaved by pixel) format:
1st pixel of B-1, 1st pixel of B-2, 1st pixel of B-3, …, and 1st pixel of B-n
2nd pixel of B-1, 2nd pixel of B-2, 2nd pixel of B-3, …, and 2nd pixel of B-n
…
Nth pixel of B-1, Nth pixel of B-2, Nth pixel of B-3, …, and Nth pixel of B-n
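For readers who want to see the three interleaving schemes side by side, the following Python/NumPy sketch lays out a tiny hypothetical scene of n bands, M lines and N pixels per line in BSQ, BIL and BIP order; the array names and sizes are made up purely for the example.

```python
import numpy as np

# Hypothetical scene held as a (band, line, pixel) cube for illustration.
n, M, N = 3, 4, 5
cube = np.arange(n * M * N, dtype=np.uint8).reshape(n, M, N)

# BSQ: each band written out in full, one after another (or as separate files).
bsq = cube.reshape(-1)                      # order: band, then line, then pixel

# BIL: for each line, band-1's line, then band-2's line, ... band-n's line.
bil = cube.transpose(1, 0, 2).reshape(-1)   # order: line, then band, then pixel

# BIP: for each pixel, its value in band-1, band-2, ... band-n.
bip = cube.transpose(1, 2, 0).reshape(-1)   # order: line, then pixel, then band

# Reading the data back only requires the interleave order and the shape:
assert np.array_equal(bil.reshape(M, n, N).transpose(1, 0, 2), cube)
assert np.array_equal(bip.reshape(M, N, n).transpose(2, 0, 1), cube)
```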
Some common data formats used for storing and exchanging remote sensing and geospatial data are:
Common Data Format (CDF/netCDF)
Hierarchical Data Format (HDF)
CEOS Superstructure Format
MPH/SPH/DSR
Spatial Data Transfer Standard (SDTS)
Flexible Image Transport System (FITS)
Graphics Interchange Format (GIF)
ISO/IEC 12087 - Image Processing and Interchange
Standard Formatted Data Units (SFDU)
GeoTIFF
Figure: Physical layout of three-band image data in CEOS Superstructure BSQ format.
Fast format
Fast format is a very comprehensive digital data format that is suitable for Level-2 data
products. It consists of two files namely
1. Header file.
2. Image file(s).
The physical layout of fast format is shown in Fig.
Header file
The first file on each volume, a Read-Me-First file, contains header data. It is in American
Standard Code for Information Interchange (ASCII) format.
The header file contains three 1536-byte ASCII records. The first record is the Administrative
Record which contains information that identifies the product, the scene and the data specifi-
cally needed to read the imagery from the digital media (CDROM, DAT or DISK). In order
to retrieve the image data, it is necessary to read entries in the Administrative Record.
The second record is the Radiometric Record, which contains the coefficients needed to
convert the scene digital values into at-satellite spectral radiance.
The third record is the Geometric Record, which contains the scene geographic location (e.g.
latitude, longitude, etc.) information. In order to align the imagery to other data sources, it will
be necessary to read entries in the Geometric Record.
Image file
Image files are written to CDROM, DAT or Disk in Band Sequential order; that is, each
image file contains one band of image data. There are no header records within the image
file, nor are there prefix and/or suffix data in the individual image record or scan lines.
GeoTIFF is based on the original TIFF (Tagged Image File Format) format, with additional
geographic information. The GeoTIFF specification defines a set of TIFF tags provided to describe all
'Cartographic' information associated with TIFF imagery that originates from satellite imaging
systems, scanned aerial photography, scanned maps, digital elevation models, or as a result of
geographic analyses. Its aim is to allow means for tying a raster image to a known model space or map
projection, and for describing those projections. GeoTIFF is a platform independent format which is
used by a wide range of GIS (Geographical Information System) and Image Processing packages
currently available in the market.
GeoTIFF does not intend to become a replacement for existing geographic data interchange standards,
such as the USGS SDTS standard or the FGDC metadata standard. Rather, it aims to augment an
existing popular raster-data format to support georeferencing and geocoding information.
The Hierarchical Data Format (HDF) has been developed by the National Center for
Supercomputing Applications at the University of Illinois at Urbana- Champaign in the USA. It was
originally designed for the interchange of raster image data and multi-dimensional scientific data sets
across heterogeneous environments. It is a multi-object file format, with a number of predefined
object types, such as arrays, but with the ability to extend the object types in a relatively simple
manner. Recently, HDF has been extended to handle tabular scientific data, rather than just uniform
array oriented data, and also annotation attributes data.
HDF can store several types of data objects within one file, such as raster images, palettes, text and
table style data. Each 'object' in an HDF file has a predefined tag that indicates the data type and a
reference number that identifies the instance. There are a number of tags which are available for
defining user defined data types, however only those people who have access to the software of the
user that defined the new types can access them properly. Each set of HDF data types has an
associated software interface. This is where HDF is very powerful. The software tools supplied to
support HDF are quite sophisticated, and due to the format of the files, which extensively use pointers
in their arrangement, the user is provided with means to analyse and visualise the data in an efficient
and convenient manner.
A table of contents is maintained within the file and as the user adds data to the file, the pointers in the
table of contents are updated. An example organisational structure of an HDF file is shown in the
figure 'An Example Organisation of Data Objects in an HDF File'.
Hierarchical Data Format (HDF, HDF4, or HDF5) is the name of a set of Scientific data file
formats and libraries designed to store and organize large amounts of numerical data. Originally
developed at the National Center for Supercomputing Applications, it is currently supported by the
non-profit HDF Group, whose mission is to ensure continued development of HDF5 technologies, and
the continued accessibility of data currently stored in HDF.
In keeping with this goal, the HDF format, libraries and associated tools are available under a liberal,
BSD-like license for general use. HDF is supported by many commercial and non-commercial
software platforms, including Java, MATLAB/Scilab, IDL, Python, and R. The freely available HDF
distribution consists of the library, command-line utilities, test suite source, Java interface, and the
Java-based HDF Viewer (HDFView).
The Common Data Format (CDF) is developed and maintained by NASA. A variation of the format
that was designed for transfer across networks was developed by Unidata and called Network Common
Data Format (netCDF). The two formats are very similar except in the method that they used to physically
encode data. There is a move to merge the two developments, but at this stage they are still maintained
separately. They are discussed here under one heading as they are functionally and conceptually identical.
CDF is defined as a "self describing" data format that permits not only the storage of the actual data of
interest, but also stores user-supplied descriptions of the data. CDF is a software library accessible from
either FORTRAN or C, that allows the user to access and manage the data without regard to the physical
format on the media. In fact, the physical format is totally transparent to the user.
CDF is primarily suited for handling data that is inherently multidimensional; recent additions to the
format also permit the handling of scalar data, but not in such an efficient manner. Due to the nature of
Earth observation data, i.e. array oriented data, CDF is very efficient for the storage and processing of this
type of data. Data can be accessed either at the atomic level, for example, at the pixel level, or also at a
'higher' level, for example, as a single image plane. The different access methods are provided by
separate software routines. One reason that CDF is efficient in data handling is that it is limited in the
basic data types that it can store. Essentially data can only be stored in a multiple of 8-bit bytes, such as
16-bit integer, 32-bit real, character string, etc. This is efficient for access, but is limiting for many Earth
observation products, where sensor data may be in a 10-bit word size, with another 6-bits used for flags,
such as cloud cover indicators.
The MPH/SPH/DSR product format is specifically used by ESA/ESRIN for ERS-1 and ERS-2 products and
hence extensively throughout Europe. It is used for the Fast Delivery Products from the ground stations to the
Processing and Archiving Facilities (PAFs) and to ESRIN, where it is archived in this format. This format also
forms the current baseline for the Envisat-1 Ground Segment. The MPH/SPH/DSR format is generally not used
for product distribution to end users; for this, the CEOS Superstructure Format is used. Note, the format only
specifies the structure of the data packaging, it is not concerned with the syntax or semantics of the individual
data records.
Each product consists of three segments; the Main Product Header (MPH), the Specific Product
Header (SPH) and the Data Set Records (DSRs), as shown in the figure below.
Schematic of an MPH/SPH/DSR Formatted File
The MPH has a single fixed size record of 176 bytes that is mandatory for all products generated by
any satellite. The MPH for any one satellite is always the same. This header indicates, in fixed fields,
information which is applicable to all processing chain products, such as product identifier, type of
product, spacecraft identifier, UTC time of beginning of product, ground station identifier, many
quality control fields that are completed at various stages of the processing chain, etc. Following the
MPH is the SPH, which is present only if indicated by the MPH. The SPH can have a variable number
of records, each of variable size as dictated by the product type. These records contain information
specific to a particular product. For example, product confidence data that is specific to a product
type, parameters for instruments that are used to generate the product, etc. Finally there are a number
of DSRs (as specified in the MPH also), that contain the actual scientific data measurements. The
number and size of the DSR records is also dependent upon the product type.
There is only a limited number of data types supported in the headers; these are 1, 2 and 4-
byte integers, ASCII string parameters, single-byte flags and 'special' fields formatted for a particular
product. The MPH/SPH/DSR format does not contain any data description information. The MPH and each
of the SPH formats and fields are defined in conventional paper documents; there is no electronic
formal language description of records. The MPH indicates the type of product, and from this the user
would have to look up the relevant product specification and then know the type of SPH records and
the type of DSR records. Using this method new SPH and DSR records can be defined and then a new
identifier used in the MPH, but this is only a very basic method of data description.
In SDTS, objects are defined by attributes. For example, a ROAD may have attributes LENGTH and
DIRECTION. SDTS includes approximately 200 defined object names and 240 attributes.
For Earth observation, the vector representation is not of much interest, but the raster profile is
applicable. The raster profile is a standard method of formatting raster data, such as images or gridded
data that must be geolocated. Raster modules can accommodate image data, digital terrain models,
gridded GIS layers, and other regular point sample and grid cell data (all of which are termed raster
data). Two module types are required for the encoding of raster data: the Raster Definition module
and the Cell module. Additionally, a Registration module might be required to register the grid or
image geometry to latitude/longitude or a map-projection-based co-ordinate system.
SDTS supports many different organisation schemes for encoding raster data. Other data recorded in
the Raster Definition module complete the definition of the structure, orientation, and other
parameters required for interpreting the raster data. Actual pixel or grid cell data values are encoded in
Cell module records.
The approximate disk space required for an image file can be estimated as:
output file size (bytes) = y * x * b * n * 1.4
Where:
y = rows
x = columns
b = number of bytes per pixel
n = number of bands
1.4 adds 30% to the file size for pyramid layers and 10% for miscellaneous adjustments, such as
histograms, lookup tables, etc.
NOTE: This output file size is approximate.
For example, to load a 3 band, 16-bit file with 500 rows and 500 columns, about 2,100,000 bytes of
disk space is needed (500 * 500 * 2 * 3 * 1.4 = 2,100,000).
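The same rule of thumb can be written as a small Python function; the function name and the default 1.4 overhead factor simply mirror the estimate above and are not taken from any particular software package.

```python
def output_file_size(rows, columns, bytes_per_pixel, bands, overhead=1.4):
    """Approximate disk space (bytes) for an image file.

    The 1.4 factor adds roughly 30% for pyramid layers and 10% for
    miscellaneous adjustments such as histograms and lookup tables.
    """
    return rows * columns * bytes_per_pixel * bands * overhead

# 3-band, 16-bit (2 bytes per pixel) file with 500 rows and 500 columns:
print(output_file_size(500, 500, 2, 3))   # 2100000.0 bytes, i.e. about 2.1 MB
```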
DATA TRANSMISSION
Data acquired from satellite platforms need to be electronically transmitted to Earth, since the satellite
continues in its orbit during its operational lifetime. The technologies designed to accomplish this can
also be used by an aerial platform if the data are urgently needed on the surface.
There are three main options for transmitting data acquired by satellites to the surface. The data can
be directly transmitted to Earth if a Ground Receiving Station (GRS) is in the line of sight of the
satellite (A). If this is not the case, the data can be recorded on board the satellite (B) for transmission
to a GRS at a later time. Data can also be relayed to the GRS through the Tracking and Data Relay
Satellite System (TDRSS) (C), which consists of a series of communications satellites in
geosynchronous orbit. The data are transmitted from one satellite to another until they reach the
appropriate GRS.
In Canada, CCRS operates two ground receiving stations - one at Cantley, Québec (GSS), just outside
of Ottawa, and another one at Prince Albert, Saskatchewan (PASS). The combined coverage circles
for these Canadian ground stations enable the potential for reception of real-time or recorded data
from satellites passing over almost any part of Canada's land mass, and much of the continental
United States as well. Other ground stations have been set up around the world to capture data from a
variety of satellites.
Elements of Visual Interpretation
The analysis of remote sensing imagery involves the identification of various targets in an image, and
those targets may be environmental or artificial features which consist of points, lines, or areas.
Targets may be defined in terms of the way they reflect or emit radiation. This radiation is measured
and recorded by a sensor, and ultimately is depicted as an image product such as an air photo or a
satellite image. What makes interpretation of imagery more difficult than the everyday visual
interpretation of our surroundings? For one, we lose our sense of depth when viewing a two-
dimensional image, unless we can view it stereoscopically so as to simulate the third dimension of
height. Indeed, interpretation benefits greatly in many applications when images are viewed in stereo,
as visualization (and therefore, recognition) of targets is enhanced dramatically. Viewing objects from
directly above also provides a very different perspective than what we are familiar with. Combining
an unfamiliar perspective with a very different scale and lack of recognizable detail can make even
the most familiar object unrecognizable in an image. Finally, we are used to seeing only the visible
wavelengths, and the imaging of wavelengths outside of this window is more difficult for us to
comprehend.
Recognizing targets is the key to interpretation and information extraction. Observing the differences
between targets and their backgrounds involves comparing different targets based on any, or all, of the
visual elements of tone, shape, size, pattern, texture, shadow, and association. Visual
interpretation using these elements is often a part of our daily lives, whether we are conscious of it or
not. Examining satellite images on the weather report, or following high speed chases by views from a
helicopter are all familiar examples of visual image interpretation. Identifying targets in remotely
sensed images based on these visual elements allows us to further interpret and analyze. The nature of
each of these interpretation elements is described below, along with an image example of each.
TONE
Tone refers to the relative brightness or colour of objects in an image. Generally, tone is the
fundamental element for distinguishing between different targets or features. Variations in tone also
allow the elements of shape, texture, and pattern of objects to be distinguished.
SHAPE
Shape refers to the general form, structure, or outline of individual objects. Shape can be a very
distinctive clue for interpretation. Straight edge shapes typically represent urban or agricultural (field)
targets, while natural features, such as forest edges, are generally more irregular in shape, except
where man has created a road or clear cuts. Farm or crop land irrigated by rotating sprinkler systems
would appear as circular shapes.
SIZE
Size of objects in an image is a function of scale. It is important to assess the size of a target relative
to other objects in a scene, as well as the absolute size, to aid in the interpretation of that target. A
quick approximation of target size can direct interpretation to an appropriate result more quickly. For
example, if an interpreter had to distinguish zones of land use, and had identified an area with a
number of buildings in it, large buildings such as factories or warehouses would suggest commercial
property, whereas small buildings would indicate residential use.
PATTERN
Pattern refers to the spatial arrangement of visibly discernible objects. Typically an orderly repetition
of similar tones and textures will produce a distinctive and ultimately recognizable pattern. Orchards
with evenly spaced trees, and urban streets with regularly spaced houses are good examples of pattern.
TEXTURE
Texture refers to the arrangement and frequency of tonal variation in particular areas of an image.
Rough textures would consist of a mottled tone where the grey levels change abruptly in a small area,
whereas smooth textures would have very little tonal variation. Smooth textures are most often the
result of uniform, even surfaces, such as fields, asphalt, or grasslands. A target with a rough surface
and irregular structure, such as a forest canopy, results in a rough textured appearance. Texture is one
of the most important elements for distinguishing features in radar imagery.
SHADOW
Shadow is also helpful in interpretation as it may provide an idea of the profile and relative height of
a target or targets which may make identification easier. However, shadows can also reduce or
eliminate interpretation in their area of influence, since targets within shadows are much less (or not at
all) discernible from their surroundings. Shadow is also useful for enhancing or identifying
topography and landforms, particularly in radar imagery.
ASSOCIATION
Association takes into account the relationship between other recognizable objects or features in
proximity to the target of interest. The identification of features that one would expect to associate
with other features may provide information to facilitate identification. In the example given above,
commercial properties may be associated with proximity to major transportation routes, whereas
residential areas would be associated with schools, playgrounds, and sports fields. In our example, a
lake is associated with boats, a marina, and adjacent recreational land.
Image Processing techniques:
Radiometric calibration:
Satellite digital data are generally delivered as quantized digital count values (digital numbers, for
example 8-bit values from 0 to 255). These digital values should be rescaled, or calibrated, back to
physically meaningful units (energy per unit area per steradian per micrometre) before quantitative
analysis. This process is called "radiometric calibration" or, more properly, radiometric rescaling.
Conversion to Radiance:
The digital counts or digital numbers corresponding to the pixels of each band (band centre
wavelength λ) are converted to a spectral radiance (Lλ) image using the following formula:
Lλ = ((Lmaxλ - Lminλ) / (Qcalmax - Qcalmin)) * (Qcal - Qcalmin) + Lminλ
or, equivalently, using the rescaling gain and bias:
Lλ = Grescale * Qcal + Brescale
Where, Lλ = spectral radiance at the sensor's aperture in watts/(meter squared * ster * μm)
Qcal = the quantized calibrated pixel value in DN
Lminλ = the spectral radiance that is scaled to Qcalmin in watts/(meter squared * ster * μm), given in the
image metadata (example Tables 1 to 3)
Lmaxλ = the spectral radiance that is scaled to Qcalmax in watts/(meter squared * ster * μm), given in the
image metadata (example Tables 1 to 3)
Qcalmin = the minimum quantized calibrated pixel value (corresponding to Lminλ) in DN
Qcalmax = the maximum quantized calibrated pixel value (corresponding to Lmaxλ) in DN = 255
The values of Lmax and Lmin for band 1 to band 5 and band 7 to band 8 are given in the tables.
Grescale = rescaled gain (the data product "gain" contained in the Level 1 product header or
ancillary data record) in watts/(meter squared * ster * μm)/DN
Brescale = rescaled bias (the data product "offset" contained in the Level 1 product header or
ancillary data record) in watts/(meter squared * ster * μm)
The estimated radiance image is then converted to a reflectance image using the following relation:
ρλ = (π * Lλ * d²) / (E0λ * cos θ)
where, d is earth-sun distance correction (Depending upon date of data acquisition, say for 242 Julian
date 1.00969, Astronomical Units), θ is Solar zenith angle (say, 21.32o), L λ is radiance (calculated
radiance image) as a function of bandwidth, E0λ solar spectral irradiances. The E0λ values are taken
from the Landsat 7 Science Data Users Handbook as given in Table. The values of d and θ are
collected from the header file of corresponding image over the study area.
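A minimal Python/NumPy sketch of these two conversions is given below. The function names are invented, and the sample numbers (Lmin = 0, Lmax = 140.7, E0 = 1969, θ = 21.32°, d = 1.00969 AU) are taken from the examples and Table 1 purely for illustration.

```python
import numpy as np

def dn_to_radiance(dn, lmin, lmax, qcalmin=0, qcalmax=255):
    """Rescale quantized DN values to at-sensor spectral radiance for one band."""
    return (lmax - lmin) / (qcalmax - qcalmin) * (dn - qcalmin) + lmin

def radiance_to_reflectance(radiance, esun, sun_zenith_deg, earth_sun_dist_au):
    """Convert at-sensor radiance to top-of-atmosphere reflectance."""
    theta = np.deg2rad(sun_zenith_deg)
    return (np.pi * radiance * earth_sun_dist_au ** 2) / (esun * np.cos(theta))

# Illustrative values only (in the style of band B1 of Table 1):
dn = np.array([[10.0, 120.0], [200.0, 255.0]])
rad = dn_to_radiance(dn, lmin=0.0, lmax=140.7)
rho = radiance_to_reflectance(rad, esun=1969.0, sun_zenith_deg=21.32,
                              earth_sun_dist_au=1.00969)
```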
Atmospheric correction:
Dark subtraction using band minimum has been applied for atmospheric scattering corrections to the
calculated reflectance image data. The dark-object subtraction (DOS) method of atmospheric
correction is a scene-based method to approximate the path radiance added by scattering, based on the
assumption that within an area of a full scene, there will be a location which is in deep topographic
shadowing, and that any radiance recorded by the satellite for that area arises from the path radiance
component (assumed to be constant across the scene). The radiance of a dark object (assumed to
have a reflectance of 1% by Chavez (1996) and Moran et al. (1992)) is calculated by the following
relationship:
Lλ,1% = 0.01 * E0λ * cos² θ / (π * d²)
Haze correction can be computed using the following relationship:
Lλ,haze = Lλ - Lλ,1%
ρλ = [π * d² * (Lλ - Lλ,haze)] / (E0λ * cos θz)
Further, an additional refinement to the DOS method was proposed by Chavez (1996),
known as the COST model, as shown below:
ρλ = [π * d² * (Lλ,sat - Lλ,haze)] / (ESUNλ * cos² θz)
where, ρλ = reflectance
d = Earth-Sun distance in astronomical units (AU)
Lλ,sat = at-satellite radiance
ESUNλ = exo-atmospheric solar irradiance
θz = solar zenith angle = 90° - solar elevation angle
λ = subscript indicating that these values are spectral band-specific
(The unit of radiance is watts per square metre per steradian (W/m²·sr). A steradian can be
defined as the solid angle subtended at the centre of a unit sphere by a unit area on its surface. For a
general sphere of radius r, any portion of its surface with area A = r² subtends one steradian; the solid
angle is θ = A/r², so in this case θ = 1. Because the surface area of a sphere is 4πr², the entire sphere
subtends a solid angle of 4π sr ≈ 12.56637 sr, which is also the maximum solid angle that can be
subtended at any point. A steradian can also be called a squared radian.)
Table 1. Minimum and maximum radiances for the IRS – 1B LISS II sensor (March 21, 1995)
Band | Wavelength range (µm) | Lmin (W m⁻² sr⁻¹ µm⁻¹) | Lmax (W m⁻² sr⁻¹ µm⁻¹) | Solar Spectral Irradiance (W m⁻² µm⁻¹)
B1 | 0.450 – 0.520 | 0.0 | 140.7 | 1969
B2 | 0.520 – 0.590 | 0.0 | 226.5 | 1840
B3 | 0.620 – 0.680 | 0.0 | 180.2 | 1551
B4 | 0.770 – 0.860 | 0.0 | 164.5 | 1044
Table 2.Minimum and maximum radiances for the IRS – 1D LISS III sensor (March 18, 2000)
Band | Wavelength range (µm) | Lmin (W m⁻² sr⁻¹ µm⁻¹) | Lmax (W m⁻² sr⁻¹ µm⁻¹) | Solar Spectral Irradiance (W m⁻² µm⁻¹)
B2 | 0.520 – 0.590 | 0.0 | 148 | 1840
B3 | 0.620 – 0.680 | 0.0 | 156.6 | 1551
B4 | 0.770 – 0.860 | 0.0 | 164.5 | 1044
B5 | 1.55 – 1.70 | 0.0 | 24.38 | 240.62
Table 3. Minimum and maximum radiances for the Landsat 7 – ETM Plus sensor (November 21, 2000)
Data Correction There are several types of errors that can be manifested in remotely sensed data.
Among these are line dropout and striping. These errors can be corrected to an extent in GIS by
radiometric and geometric correction functions.
Line Dropout Line dropout occurs when a detector either completely fails to function or becomes
temporarily saturated during a scan (like the effect of a camera flash on a human retina). The result is
a line or partial line of data with higher data file values, creating a horizontal streak until the
detector(s) recovers, if it recovers. Line dropout is usually corrected by replacing the bad line with a
line of estimated data file values. The estimated line is based on the lines above and below it.
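A simple sketch of this kind of repair in Python/NumPy is shown below; the function name and the way the bad rows are flagged are assumptions made for the example, not a description of any particular package.

```python
import numpy as np

def repair_line_dropout(band, bad_rows):
    """Replace dropped scan lines with estimates from the lines above and below."""
    fixed = band.astype(np.float64).copy()
    last = band.shape[0] - 1
    for r in bad_rows:
        if 0 < r < last:
            # Interior dropout: average the neighbouring scan lines.
            fixed[r] = (fixed[r - 1] + fixed[r + 1]) / 2.0
        else:
            # Dropout on the first or last line: copy the single neighbour.
            fixed[r] = fixed[1] if r == 0 else fixed[last - 1]
    return fixed.astype(band.dtype)

band = np.random.randint(0, 256, (10, 12))
band[4] = 255                       # simulate a saturated (dropped) scan line
repaired = repair_line_dropout(band, bad_rows=[4])
```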
Striping: Striping or banding occurs if a detector goes out of adjustment—that is, it provides readings
consistently greater than or less than the other detectors for the same band over the same ground
cover.
Rectification
A raw satellite image pertains to the irregular surface of the Earth; even images of flat areas are
distorted due to the Earth's curvature. Rectification is the process of projecting the data onto a plane
and making it conform to a map projection system (say, Universal Transverse Mercator (UTM),
Geographic (Lat & Long), etc.) designed to represent the surface of the sphere on a plane. The
calculation of a map projection requires definition of spheroid in terms of axis lengths and radius of
the reference sphere. Several spheroids are used for map projection depending on the region of the
earth's surface; for example, Clarke1866 for North America, Krasovsky1940 for Russia, Bessel
1841 for Central Europe, Everest 1830/1856 for the Indian subcontinent (Snyder, 1993) or in general
WGS84.
Map projection: A map projection is the manner in which the spherical surface of the Earth is
represented on a flat (two-dimensional) surface. This can be accomplished by direct geometric
projection or by a mathematically derived transformation. There are many kinds of projections, but all
involve transfer of the distinctive global patterns of parallels of latitude and meridians of longitude
onto an easily flattened surface, or developable surface.
The three most common developable surfaces are the cylinder, cone, and plane. A plane is already
flat, while a cylinder or cone may be cut and laid out flat, without stretching. Thus, map projections
may be classified into three general families: cylindrical, conical, and azimuthal or planar.
Lines of latitude are called parallels, which run east/west. Parallels are designated as 0° at the equator
to 90° at the poles. The equator is the largest parallel. Latitude and longitude are defined with respect
to an origin located at the intersection of the equator and the prime meridian. Lat/Lon coordinates are
reported in degrees, minutes, and seconds. Map projections are various arrangements of the Earth‘s
latitude and longitude lines onto a plane. These map projections require a point of reference on the
Earth‘s surface. Most often this is the center, or origin, of the projection.
Generally, rectification, georeferencing, and geocoding are processes of transforming the data from
one grid system into another grid system using a geometric transformation. Since the pixels of the
new grid may not align with the pixels of the original grid, the pixels must be re-sampled. Re-
sampling is the process of extrapolating data values for the pixels on the new grid from the values of
the source pixels.
Rectification, by definition, involves georeferencing, since all map projection systems are associated
with map coordinates. Image-to-image registration, involves georeferencing only if the reference
image is already georeferenced. Georeferencing, by itself, involves changing only the map
coordinate information in the image file. The grid of the image does not change.
Geocoded data are images that have been rectified to a particular map projection and pixel size, and
usually have had radiometric corrections applied. Geocoded data should be rectified only if they must
conform to a different projection system or be registered to other rectified data.
Registration is the process of making an image conform to another image.
When to Rectify: Rectification is necessary in cases where the pixel grid of the image must be
changed to fit a map projection system or a reference image. There are several reasons for rectifying
image data:
• comparing pixels scene to scene in applications, such as change detection or thermal inertia mapping
(day and night comparison)
• developing GIS data bases for GIS modeling
• identifying training samples according to map coordinates prior to classification
• creating accurate scaled photomaps
• overlaying an image with vector data, such as ArcInfo
• comparing images that are originally at different scales
• extracting accurate distance and area measurements
• mosaicking images
Spheroids:
Airy, Australian National , Bessel, Clarke 1866, Clarke 1880, Everest, GRS 1980, Helmert,
Hough, International 1909, Krasovsky, Mercury 1960, Modified Airy, Modified Everest, Modified
Mercury 1968, New International 1967 , Southeast Asia , Sphere of Nominal Radius of Earth,
Sphere of Radius 6370977m, Walbeck, WGS 66, WGS 72, WGS 84.
Disadvantages of Rectification
During rectification, the data file values of rectified pixels must be resampled to fit into a new grid of
pixel rows and columns. Although some of the algorithms for calculating these values are highly
reliable, some spectral integrity of the data can be lost during rectification. If map coordinates or map
units are not needed in the application, then it may be wiser not to rectify the image. An unrectified
image is more spectrally correct than a rectified image.
Classification
Some analysts recommend classification before rectification, since the classification is then based on
the original data values. Another benefit is that a thematic file has only one band to rectify instead of
the multiple bands of a continuous file. On the other hand, it may be beneficial to rectify the data first,
especially when using GPS data for the GCPs. Since these data are very accurate, the classification
may be more accurate if the new coordinates help to locate better training samples.
Thematic Files
Nearest neighbor is the only appropriate resampling method for thematic files, which may be a
drawback in some applications.
Rectification Steps NOTE: Registration and rectification involve similar sets of procedures.
Throughout this documentation, many references to rectification also apply to image-to-image
registration.
Usually, rectification is the conversion of data file coordinates to some other grid and coordinate
system, called a reference system. Rectifying or registering image data on disk involves the following
general steps, regardless of the application:
1. Locating the ground control points that specify pixels in the image for which the output map
coordinates are known;
2. Computation of transformation matrix using a polynomial equation to convert the source
coordinates to rectified coordinates;
3. Creation of an output image file with the pixels resampled to conform to the new grid.
Ground Control Points
GCPs are specific pixels in an image for which the output map coordinates (or other output
coordinates) are known. GCPs consist of two X,Y pairs of coordinates:
• source coordinates—usually data file coordinates in the image being rectified
• reference coordinates—the coordinates of the map or reference image to which the source image is
being registered
Entering GCPs Accurate GCPs are essential for an accurate rectification. From the GCPs, the
rectified coordinates for all other points in the image are extrapolated. Select many GCPs throughout
the scene. The more dispersed the GCPs are, the more reliable the rectification is. GCPs for large-scale
imagery might include the intersection of two roads, airport runways, utility corridors, towers, or
buildings. For small-scale imagery, larger features such as urban areas or geologic features may be
used. Landmarks that can vary (e.g., the edges of lakes or other water bodies, vegetation, etc.) should
not be used.
The source and reference coordinates of the GCPs can be entered in the following ways:
• They may be known a priori, and entered at the keyboard.
• Use the mouse to select a pixel from an image in the Viewer. With both the source and destination
Viewers open, enter source coordinates and reference coordinates for image-to-image registration.
• Use a digitizing tablet to register an image to a hardcopy map.
Polynomial Transformation
Polynomial equations are used to convert source file coordinates to rectified map coordinates.
Depending upon the distortion in the imagery, the number of GCPs used, and their locations relative
to one another, complex polynomial equations may be required to express the needed transformation.
The degree of complexity of the polynomial is expressed as the order of the polynomial. The order is
simply the highest exponent used in the polynomial.
The equations of a first order polynomial transformation are:
X = A1 + A2 x + A3 y
Y = B1 + B2 x + B3 y
x and y are source coordinates (input) and X and Y are rectified coordinates (output)
So, any image file can be transformed or rectified to a new coordinate system using three control
points of the input image, say point-1 (x1, y1), point-2 (x2, y2), point-3 (x3, y3):
Rectified point-1 (X1,Y1)
X1 = A1 + A2 x1 + A3 y1
Y1 = B1 + B2 x1 + B3 y1
Rectified point-2 (X2,Y2)
X2 = A1 + A2 x2 + A3 y2
Y2 = B1 + B2 x2 + B3 y2
Rectified point-3 (X3,Y3)
X3 = A1 + A2 x3 + A3 y3
Y3 = B1 + B2 x3 + B3 y3
Further, any image file can be transformed or rectified to a new coordinate system using a second
order polynomial transformation with six control points of the input image, say point-1 (x1, y1),
point-2 (x2, y2), point-3 (x3, y3), point-4 (x4, y4), point-5 (x5, y5), point-6 (x6, y6):
Rectified point-1 (X1,Y1)
X1 = A1 + A2 x1 + A3 y1 + A4 x1² + A5 x1 y1 + A6 y1²
Y1 = B1 + B2 x1 + B3 y1 + B4 x1² + B5 x1 y1 + B6 y1²
..................................................................................
Rectified point-6 (X6,Y6)
X6 = A1 + A2 x6 + A3 y6 + A4 x6² + A5 x6 y6 + A6 y6²
Y6 = B1 + B2 x6 + B3 y6 + B4 x6² + B5 x6 y6 + B6 y6²
Further, any image file can be transformed or rectified to a new coordinate system using a third order
polynomial transformation; a third order polynomial has ten coefficients per equation, so at least ten
control points of the input image are required, say point-1 (x1, y1), point-2 (x2, y2), …, point-10 (x10, y10).
Transformation Matrix
A transformation matrix is computed from the GCPs. The matrix consists of coefficients that are used
in polynomial equations to convert the coordinates. The size of the matrix depends upon the order of
transformation. The goal in calculating the coefficients of the transformation matrix is to derive the
polynomial equations for which there is the least possible amount of error when they are used to
transform the reference coordinates of the GCPs into the source coordinates.
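As a generic illustration of how such coefficients can be obtained, the Python/NumPy sketch below fits the first order equations X = A1 + A2 x + A3 y and Y = B1 + B2 x + B3 y to a set of GCPs by least squares, which is what keeps the residual error as small as possible when more than the minimum number of GCPs is supplied. The function name and the GCP coordinates are invented for the example.

```python
import numpy as np

def first_order_coefficients(source_xy, reference_xy):
    """Fit X = A1 + A2*x + A3*y and Y = B1 + B2*x + B3*y to a set of GCPs."""
    x, y = source_xy[:, 0], source_xy[:, 1]
    design = np.column_stack([np.ones_like(x), x, y])        # one [1, x, y] row per GCP
    A, *_ = np.linalg.lstsq(design, reference_xy[:, 0], rcond=None)
    B, *_ = np.linalg.lstsq(design, reference_xy[:, 1], rcond=None)
    return A, B   # A = (A1, A2, A3), B = (B1, B2, B3)

# Source file coordinates of four GCPs and their known map coordinates:
src = np.array([[10.0, 10.0], [400.0, 15.0], [20.0, 380.0], [390.0, 395.0]])
ref = np.array([[500100.0, 4200090.0], [503900.0, 4200100.0],
                [500150.0, 4196400.0], [503850.0, 4196380.0]])
A, B = first_order_coefficients(src, ref)
```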
Resampling Methods:
The next step in the rectification/registration process is to create the output file. Since the grid of
pixels in the source image rarely matches the grid for the reference image, the pixels are resampled so
that new data file values for the output file can be calculated.
The following resampling methods are generally utilised:
Nearest Neighbor: uses the value of the closest pixel to assign to the output pixel value. To
determine an output pixel's nearest neighbor, the algorithm uses the inverse of the transformation
matrix to calculate the image file coordinates of the desired geographic coordinate. The pixel value
occupying the closest image file coordinate to the estimated coordinate is used for the output pixel
value in the georeferenced image.
Bilinear Interpolation: uses the data file values of the four closest pixels in a 2 × 2 window of the
input (source) image, weighted by the distances between the retransformed coordinate location (xr, yr)
and those four pixels, to calculate an output value with a bilinear function.
Advantages of bilinear interpolation:
• Results in output images that are smoother; the stair-stepped effect that is possible with the nearest
neighbour approach is reduced.
• This method is often used when changing the cell size of the data, such as in SPOT/TM merges within
the 2 × 2 resampling matrix limit.
Disadvantages of bilinear interpolation:
• Since pixels are averaged, bilinear interpolation has the effect of a low-frequency convolution: edges
are smoothed, and some extremes of the data file values are lost. It alters the original data and reduces
contrast by averaging neighbouring values together.
• It is computationally more complicated than nearest neighbour.
Cubic Convolution: uses the data file values of sixteen pixels in a 4 × 4 window to calculate an
output value with a cubic function.
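If SciPy is available, the three methods can be compared quickly with its ndimage module, as sketched below; note that the order-3 spline used here is similar in spirit to, but not identical with, true cubic convolution, and the random test band is only a stand-in.

```python
import numpy as np
from scipy import ndimage

# A small single-band image resampled onto a grid twice as fine with the
# three approaches discussed above.
band = np.random.randint(0, 256, size=(50, 50)).astype(np.float64)

nearest  = ndimage.zoom(band, 2.0, order=0)   # nearest neighbour: keeps original values
bilinear = ndimage.zoom(band, 2.0, order=1)   # bilinear: weights a 2 x 2 neighbourhood
cubic    = ndimage.zoom(band, 2.0, order=3)   # cubic spline: uses a wider neighbourhood
```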
Digital Image Subsetting refers to breaking out (clipping or cutting) a portion of a large image file into one or more smaller files covering the area of interest for study and analysis. Often, image files
contain areas much larger than a particular study area. In these cases, it is helpful to reduce the size of
the image file to include only the area of interest (AOI). This not only eliminates the extraneous data
in the file, but it speeds up processing due to the smaller amount of data to process. This can be
important when dealing with multiband data.
Digital Image Mosaicking is the process of joining or combining of adjacent images having partially
common parts to generate a larger image files according to the larger area of interest for study and
analysis. To combine two or more image files, each file must be georeferenced to the same coordinate
system, or to each other.
Radiometric Enhancement:
Radiometric enhancement deals with the individual values of the pixels in the image.
Depending on the points and the bands in which they appear, radiometric enhancements that are
applied to one band may not be appropriate for other bands. Therefore, the radiometric enhancement
of a multiband image can usually be considered as a series of independent, single band enhancements
(Faust, 1989). Radiometric enhancement usually does not bring out the contrast of every pixel in an
image. Contrast can be lost between some pixels, while gained on others.
However, the pixels outside the range between j and k are more grouped together than in the original
histogram to compensate for the stretch between j and k. Contrast among these pixels is lost.
The lookup table is a graph that increases the contrast of the input data file values by widening a selected range of the input data (the range within the brackets). When radiometric enhancements are performed
on the display device, the transformation of data file values into brightness values is illustrated by the
graph of a lookup table. Note that the input range within the bracket is narrow, but the output
brightness values for the same pixels are stretched over a wider range. This process is called contrast
stretching. Notice that the graph line with the steepest (highest) slope brings out the most contrast by
stretching output values farther apart.
The different radiometric enhancement or contrast stretching are 1) Linear stretching, 2) Nonlinear
stretching, 3) Piecewise linear stretch, 4) Histogram equalization, 5) Level slicing.
Linear Contrast Stretch:
In most raw image data, the data file values fall within a narrow range, usually much narrower than the range the display
device is capable of displaying. That range can be expanded to utilize the total range of the display
device (usually 0 to 255). Generally, linear contrast stretches are of four types: 1) Minimum-maximum stretch, 2) Saturation stretch, 3) Average and standard deviation stretch, 4) Piecewise stretch.
Minimum-Maximum stretch:
The histogram shows scene brightness values occurring only in the limited range of 60 to 158. If we were to use these image values directly in the display device, we would be using only a small portion of the full range of possible display levels. Display levels 0 to 59 and 159 to 255 would not be utilized, and the scene information would be compressed into a small range of display values, reducing the interpreter's ability to discriminate radiometric detail.
A more expressive display results if we expand the range of image levels present in the scene (60 to 158) to fill the range of display values (0 to 255). In figure-C, the range of image values has been uniformly expanded to fill the total range of the output device. Subtle variations in the input image data values would now be displayed in output tones that are more readily distinguished by the interpreter. Light tonal areas would appear lighter and dark areas would appear darker. The output digital number is given by:
Output DN = [(Input DN − Min DN of input image) / (Max DN of input image − Min DN of input image)] × 255
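A minimal sketch of this minimum-maximum stretch with NumPy (the input band is synthetic):

import numpy as np

band = np.random.randint(60, 159, size=(100, 100)).astype(float)   # toy band, DN 60-158
dn_min, dn_max = band.min(), band.max()
stretched = (band - dn_min) / (dn_max - dn_min) * 255.0
stretched = stretched.round().astype(np.uint8)                      # output DN 0-255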
Saturation Stretch: The saturation stretch (also referred to as percentage linear contrast stretch or tail trim) is similar to the minimum-maximum linear contrast stretch except that this method uses a specified minimum and maximum that lie within a certain percentage of the pixels (Fig. 10.20). Generally, very few pixels reside at the two ends (tails) of a histogram, yet they occupy a considerable range of brightness values. If these tails are trimmed, the remaining part of the histogram can be enhanced more prominently; this is the main advantage of the percentage linear contrast stretch. Pixels outside the defined range are mapped to either 0 (for DNs less than the defined minimum value) or 255 (for DNs higher than the defined maximum value). The information content of the pixels that saturate at 0 and 255 is lost, but certain aspects of the image are enhanced in greater detail for better interpretation. It is not necessary that the same percentage be applied to each tail of the histogram distribution.
(Figure: input and output histograms for the saturation stretch; the DN axis runs from 0 to 255 with the specified MIN and MAX marked.)
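A minimal sketch of a percentage (saturation) stretch, assuming a 2 % trim at each tail (the percentages and data are illustrative only):

import numpy as np

band = np.random.normal(120, 25, size=(100, 100))
lo, hi = np.percentile(band, [2, 98])          # trimmed minimum and maximum
out = (band - lo) / (hi - lo) * 255.0
out = np.clip(out, 0, 255).astype(np.uint8)    # tails saturate at 0 and 255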
Average and Standard Deviation Stretch: This is similar to the percentage stretch. A chosen number of standard deviations from the mean is often used to define the stretch limits, which pushes the tails of the histogram beyond the original minimum and maximum values.
Piecewise Linear Contrast Stretch:
The contrast value for each range represents the percent of the available output range that particular
range occupies. The brightness value for each range represents the middle of the total range of
brightness values occupied by that range. Since rules 1 and 2 above are enforced, as the contrast and
brightness values are changed, they may affect the contrast and brightness of other ranges. For
example, if the contrast of the low range increases, it forces the contrast of the middle to decrease.
Nonlinear Contrast Stretch
A nonlinear spectral enhancement can be used to gradually increase or decrease contrast over a range,
instead of applying the same amount of contrast (slope) across the entire image. Usually, nonlinear
enhancements bring out the contrast in one range while decreasing the contrast in other ranges. The
graph of the function in the figure shows one example. Major nonlinear contrast enhancement techniques are: 1) Histogram equalization, 2) Histogram normalization, 3) Reference or special stretch, 4) Density slicing / level slicing, 5) Thresholding.
Histogram Equalization
In this approach, image values are assigned to the display levels on the basis of their frequency of occurrence. As shown in figure-d, more display values (and hence more radiometric detail) are assigned to the frequently occurring portion of the histogram. The image value range of 109 to 158 is stretched over a large portion of the display levels (39 to 255). A smaller portion is reserved for the infrequently occurring image values of 60 to 108.
Histogram equalization is a nonlinear stretch that redistributes pixel values so that there is
approximately the same number of pixels with each value within a range. The result approximates a
flat histogram. Therefore, contrast is increased at the peaks of the histogram and lessened at the tails.
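A minimal sketch of histogram equalization via the cumulative histogram (the band is synthetic):

import numpy as np

band = np.random.randint(60, 159, size=(100, 100)).astype(np.uint8)
hist, _ = np.histogram(band, bins=256, range=(0, 256))
cdf = hist.cumsum().astype(float)
cdf_min = cdf[cdf > 0].min()                      # first non-zero cumulative count
lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255), 0, 255).astype(np.uint8)
equalized = lut[band]                             # apply the lookup table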
Histogram Normalization:
Normal distribution of a histogram is actually a bell-shaped distribution (also known as Gaussian
distribution). In a normal distribution, most values lie at or near the middle, near the peak of the bell curve, while more extreme values, in the tails at the ends of the curve, are rarer. Generally, a normal distribution of density in an image produces an image that looks natural to a human observer. In this sense, the histogram of the original image may sometimes be converted to a normalized histogram. This method of contrast enhancement is based upon the histogram of the pixel values and is called a Gaussian stretch because it involves fitting the observed histogram to a normal or Gaussian histogram.
Reference Stretch / Histogram matching: Reference stretch (also known as histogram matching or
histogram specification) is the process of determining a lookup table that converts the histogram of
one image to resemble the histogram of another. Histogram matching is useful for matching the data
of the same scene or adjacent scenes that were scanned on separate days, or are slightly different
because of the sun angle or atmospheric effects. This is especially useful for mosaicking or change
detection. To achieve good results with histogram matching, the two input images should have similar
characteristics:
• The general shape of the histogram curves should be similar.
• Relative dark and light features in the image should be the same.
• For some applications, the spatial resolution of the data should be the same.
• The relative distributions of land covers should be about the same, even when matching scenes that are not of the same area. If one image has clouds and the other does not, then the clouds should be removed before matching the histograms. This can be done using the AOI function (available from the Viewer menu bar).
To match the histograms, a lookup table is mathematically derived, which serves as a function for converting one histogram to the other, as illustrated in Figure 6-10.
Figure 6-10: Histogram Matching. (a) Source histogram; (b) mapped through the lookup table; (c) approximates the model histogram.
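A minimal sketch of histogram matching by quantile mapping (the helper name and the two synthetic bands are assumptions, not part of the notes):

import numpy as np

def match_histogram(source, reference):
    # For every source DN, find the reference DN with the same cumulative frequency.
    s_hist, _ = np.histogram(source, bins=256, range=(0, 256))
    r_hist, _ = np.histogram(reference, bins=256, range=(0, 256))
    s_cdf = np.cumsum(s_hist) / source.size
    r_cdf = np.cumsum(r_hist) / reference.size
    lut = np.searchsorted(r_cdf, s_cdf).clip(0, 255).astype(np.uint8)
    return lut[source]

src = np.random.randint(40, 200, size=(100, 100)).astype(np.uint8)
ref = np.random.randint(80, 240, size=(100, 100)).astype(np.uint8)
matched = match_histogram(src, ref)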
Level Slice or density slicing:
A level slice is similar to histogram equalization in that it divides the data into equal amounts. It
involves combining the DNs of different values within a specified range or interval into a single
value. A level slice on a true color display creates a stair-stepped lookup table. The effect on the data
is that input file values are grouped together at regular intervals into a discrete number of levels, each
with one output brightness value.
Density slicing represents a group of contiguous digital numbers by a single value. Although some detail of the image is lost, the effect of noise can also be reduced. As a result of density slicing, an image may be segmented, or sometimes contoured, into sections of similar grey level. This density slice (also called level slice) method works best on single-band images.
It is especially useful when a given surface feature has a unique and generally narrow set of DN
values. The new single value is assigned to some grey level (intensity) for display on the computer
monitor (or in a printout). All other DNs can be assigned another level, usually black. This yields a
simple map of the distribution of the combined DNs. If several features each have different (separable)
DN values, then several grey-level slices may be produced, each mapping the spatial distribution of its
corresponding feature. The new sets of slices are commonly assigned different colours in a photo or
display. This has been used in colouring classification maps in most image analysis software systems.
Thresholding:
Thresholding is a process of image enhancement that segments the image DNs into two distinct values (for example, black = 0 and white = 255) separated by a threshold DN, as shown in the figure. Thresholding produces a binary output with sharply defined spatial boundaries.
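A minimal sketch of thresholding (the threshold DN of 92 is illustrative):

import numpy as np

band = np.random.randint(0, 256, size=(100, 100))
threshold = 92                                               # hypothetical threshold DN
binary = np.where(band > threshold, 255, 0).astype(np.uint8)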
Special stretch is useful for specific analyses; a particular feature may be analysed in greater radiometric detail by assigning the display range exclusively to a particular range of image values. For example, if water features were represented by a narrow range of values in a scene, the characteristics of the water features could be enhanced by stretching this small range to the full display range. As shown in the figure, the output range is devoted entirely to the small range of image values between 60 and 92. On the stretched display, minute tonal variations in the water range would be greatly exaggerated. The brighter land features, on the other hand, would be washed out by being displayed at a single bright white level (255).
Decorrelation Stretch:
The purpose of a contrast stretch is to alter the distribution of the image DN values within the 0 - 255
range of the display device, and utilize the full range of values in a linear fashion.
The decorrelation stretch stretches the principal components of an image, not the original image.
A principal components transform converts a multiband image into a set of mutually orthogonal
images portraying inter-band variance. Depending on the DN ranges and the variance of the
individual input bands, these new images (PCs) occupy only a portion of the possible 0 – 255 data
range.
Each PC is separately stretched to fully utilize the data range. The new stretched PC composite image is then retransformed back to the original band space. Either the original PCs or the stretched PCs may be
saved as a permanent image file for viewing after the stretch.
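A rough sketch of a decorrelation stretch on a synthetic three-band image (the forward PC transform, per-component stretch factor and inverse transform are illustrative, not a specific package's implementation):

import numpy as np

bands = np.random.randint(0, 256, size=(3, 100, 100)).astype(float)   # toy 3-band image
flat = bands.reshape(3, -1)
mean = flat.mean(axis=1, keepdims=True)
eigvals, eigvecs = np.linalg.eigh(np.cov(flat))

pcs = eigvecs.T @ (flat - mean)                       # principal components
std = pcs.std(axis=1, keepdims=True) + 1e-12
pcs_stretched = pcs / std * 50.0                      # equalise the variance of each PC

out = eigvecs @ pcs_stretched + mean                  # back to the original band space
out = np.clip(out, 0, 255).reshape(bands.shape).astype(np.uint8)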
Spatial Enhancement
While radiometric enhancement operates on each pixel individually, spatial enhancement modifies pixel values based on the values of the surrounding pixels. Spatial enhancement deals largely with spatial frequency, which is the difference between the highest and lowest values of a contiguous set of pixels. Jensen (Jensen, 1986) defines spatial frequency as "the number of changes in brightness value per unit distance for any particular part of an image."
Consider the examples in Figure:
• zero spatial frequency—a flat image, in which every pixel has the same value
• low spatial frequency—an image consisting of a smoothly varying gray scale
• highest spatial frequency—an image consisting of a checkerboard of black and white pixels
Convolution Filtering
Convolution filtering is a process of spatial filtering performed by combining small sets of pixels across an image. It involves moving a window of a given dimension (3 × 3, 5 × 5, etc.) over each pixel in the image, applying a mathematical calculation to the pixel values under that window, and replacing the central pixel with the new value. This window is known as a convolution kernel: a matrix of numbers (coefficients) used to weight the value of each pixel together with the values of its surrounding pixels. The kernel is moved along both the row and column dimensions one pixel at a time, and the calculation is repeated until the entire image has been filtered and a new image is generated. The numbers in the matrix serve to weight the result toward particular pixels; they are called coefficients because they are used as such in the mathematical equations.
Filtering is a broad term, which refers to the altering of spatial or spectral features for image
enhancement (Jensen, 1996). Convolution filtering is used to change the spatial frequency
characteristics of an image (Jensen, 1996).
To understand how one pixel is convolved, imagine that the convolution kernel is overlaid on the data
file values of the image (in one band), so that the pixel to be convolved is in the center of the window.
Figure: Applying a Convolution Kernel.
The figure shows a 3 × 3 convolution kernel being applied to the pixel in the third column, third row of the sample data (the pixel that corresponds to the center of the kernel). To compute the output value for this pixel, each value in the convolution kernel is multiplied by the image pixel value that corresponds to it. These products are summed, and the total is divided by the sum of the values in the kernel, as shown here:
V = int{[(-1 × 8) + (-1 × 6) + (-1 × 6) + (-1 × 2) + (16 × 8) + (-1 × 6) + (-1 × 2) + (-1 × 2) + (-1 × 8)] ÷ [(-1) + (-1) + (-1) + (-1) + 16 + (-1) + (-1) + (-1) + (-1)]}
= int[(128 − 40) / (16 − 8)] = int(88 / 8) = int(11) = 11
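The same arithmetic can be reproduced with NumPy; the 3 × 3 window of data file values is taken from the example above, and integer truncation follows the text:

import numpy as np

kernel = np.array([[-1, -1, -1],
                   [-1, 16, -1],
                   [-1, -1, -1]])
window = np.array([[8, 6, 6],
                   [2, 8, 6],
                   [2, 2, 8]])                        # data file values under the kernel

value = int(np.sum(kernel * window) / kernel.sum())   # (128 - 40) / (16 - 8) = 11
print(value)                                          # 11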
In order to convolve the pixels at the edges of an image, pseudo data must be generated in order to
provide values on which the kernel can operate. In the example below, the pseudo data are derived by
reflection. This means the top row is duplicated above the first data row and the left column is
duplicated left of the first data column. If a second row or column is needed (for a 5 × 5 kernel for
example), the second data row or column is copied above or left of the first copy and so on. An
alternative to reflection is to create background value (usually zero) pseudo data; this is called Fill.
When the pixels in this example image are convolved, output values cannot be calculated for the last row and column; in the example, placeholders mark these unknown values. In practice, the last row and column of an image are either reflected or filled just like the first row and column.
The kernel used in this example is a high frequency kernel, as explained below. It is important to note
that the relatively lower values become lower, and the higher values become higher, thus increasing
the spatial frequency of the image.
The output value of the convolution is given by:
V = [ Σ (i = 1 to q) Σ (j = 1 to q) fij × dij ] / F
Where:
fij = the coefficient of a convolution kernel at position i,j (in the kernel)
dij = the data value of the pixel that corresponds to fij
q = the dimension of the kernel, assuming a square kernel (if q = 3, the kernel is 3 × 3)
F = either the sum of the coefficients of the kernel, or 1 if the sum of coefficients is 0
V = the output pixel value
In cases where V is less than 0, V is clipped to 0.
The sum of the coefficients (F) is used as the denominator of the equation above, so that the output
values are in relatively the same range as the input values. Since F cannot equal zero (division by zero
is not defined), F is set to 1 if the sum is zero.
Zero-Sum Kernels
Zero-sum kernels are kernels in which the sum of all coefficients in the kernel equals zero. When a
zero-sum kernel is used, then the sum of the coefficients is not used in the convolution equation, as
above. In this case, no division is performed (F = 1), since division by zero is not defined.
This generally causes the output values to be:
• zero in areas where all input values are equal (no edges)
• low in areas of low spatial frequency
• extreme in areas of high spatial frequency (high values become much higher, low values become
much lower)
Therefore, a zero-sum kernel is an edge detector, which usually smooths out or zeros out areas of low
spatial frequency and creates a sharp contrast where spatial frequency is high, which is at the edges
between homogeneous (homogeneity is low spatial frequency) groups of pixels. The resulting image
often consists of only edges and zeros. Zero-sum kernels can be biased to detect edges in a particular
direction. For example, this 3 × 3 kernel is biased to the south (Jensen, 1996).
Averaging filter:
A 2D moving average filter is defined in terms of its dimensions, which must be odd, positive and integral. The output is found by summing the products of the corresponding convolution kernel and image elements and dividing by the number of kernel elements. The averaging filter is also known as a smoothing filter.
Mean Filter
The Mean filter is a simple calculation. The pixel of interest (center of window) is replaced by the
arithmetic average of all values within the window. This filter does not remove the aberrant (speckle)
value; it averages it into the data. Below is an example of a low-frequency kernel, or low-pass kernel,
which decreases spatial frequency.
This kernel simply averages the values of the pixels, causing them to be more homogeneous. The
resulting image looks either smoother or more blurred. In theory, a bright and a dark pixel within the
same window would cancel each other out. This consideration would argue in favor of a large window
size (e.g., 7 × 7). However, averaging results in a loss of detail, which argues for a small window size.
In general, this is the least satisfactory method of speckle reduction. It is useful for applications where
loss of resolution is not a problem.
Median Filter
A better, though still simplistic, way to reduce speckle is the Median filter. This filter operates by arranging all DN values within the window that you define in sequential order. The pixel of interest is replaced by the value in the centre of this distribution. A Median filter is useful for removing pulse or spike noise: pulse functions of less than one-half of the moving window width are suppressed or eliminated, while step functions and ramp functions are retained.
The median filter finds the median pixel value. In the example referred to above, the nine window values arranged in order are 106, 197, 198, 200, 200, 201, 204, 209, 210. There are nine numbers in the list, so the middle one is the (9 + 1) ÷ 2 = 5th value. The median, i.e., the 5th value, is 200, so in the output filtered image the centre value of the window (106) is replaced by 200.
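A minimal sketch of 3 × 3 mean and median filtering, assuming SciPy is available (the sample values are illustrative):

import numpy as np
from scipy.ndimage import uniform_filter, median_filter

image = np.array([[200, 197, 210],
                  [201, 106, 209],
                  [198, 200, 204]], dtype=float)    # 106 is a speckle/spike value

mean_filtered   = uniform_filter(image, size=3)     # replaces each pixel by the window mean
median_filtered = median_filter(image, size=3)      # centre 106 becomes 200, the window median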
Mode Filter:
The mode filter is primarily used to clean up thematic maps for presentation purposes. This filter computes the mode of the grey-level values (the most frequently occurring grey-level value) within the filter window surrounding each pixel. For example, if the window values arranged in order are 57, 58, 60, 60, 61, 64, 69, 70, 125, the value 60 occurs twice, so in the output filtered image the centre value of the window (125) is replaced by 60.
It is possible that a decision has to be made between two values with the same frequency of occurrence. In this case, if the centre value is one of the tied values it is chosen; otherwise, the first tied value encountered is chosen. For example, if the values 5 and 3 each occur three times in a window and neither lies at the centre position, the 5 in the top row is encountered first as the values are read, and so it is chosen as the mode value.
High-Frequency Kernels/ High-pass Filter:
High-pass filters do the opposite of low-pass filters and serve to sharpen the appearance of fine detail in an image. A high-frequency kernel, or high-pass kernel, has the effect of increasing spatial frequency. High spatial frequencies can also be enhanced simply by subtracting the low-frequency image resulting from a low-pass filter from the original image. High-frequency information allows us either to isolate or to amplify the local detail. If the high-frequency detail is amplified by adding back to the image some multiple of the high-frequency component extracted by the filter, the result is a sharper, de-blurred image. High-frequency kernels serve as edge enhancers, since they bring out the edges between homogeneous groups of pixels. Unlike edge detectors (such as zero-sum kernels), they highlight edges and do not necessarily eliminate other features.
When this kernel is used on a set of pixels in which a relatively low value is surrounded by higher
values, like this, the low value gets lower:
Inversely, when the kernel is used on a set of pixels in which a relatively high value is surrounded by
lower values, the high value becomes higher. In either case, spatial frequency is increased by this
kernel:
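A rough sketch of sharpening by extracting the high-frequency component (original minus low-pass) and adding a multiple of it back to the image; SciPy is assumed and the gain of 1.5 is purely illustrative:

import numpy as np
from scipy.ndimage import uniform_filter

image = np.random.randint(0, 256, size=(100, 100)).astype(float)
low_pass  = uniform_filter(image, size=3)
high_freq = image - low_pass                            # local detail (edges)
sharpened = np.clip(image + 1.5 * high_freq, 0, 255)    # 1.5 is an illustrative gain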
Edge Detection Filter: Edge and line detection are important operations in digital image processing.
Directional, or edge detection filters are designed to highlight linear features, such as roads or field
boundaries. These filters can also be designed to enhance features which are oriented in specific
directions. These filters are useful in various fields such as geology, for the detection of linear
geologic structures. A common type of edge detection kernel is the zero-sum kernel, in which the sum of all the coefficients equals zero. In the case of a zero-sum kernel, the sum of the coefficients is not used in the convolution equation (no division is performed),
since division by zero is not defined. This generally causes the output values to be zero in areas
where all input values are equal, low in areas of low spatial frequency, extreme in areas of high
spatial frequency. Therefore, a zero-sum kernel is an edge detector, which usually smoothes out or
zeros out areas of low spatial frequency and creates a sharp contrast where spatial frequency is high.
The resulting image often contains only edges and zeros. Following are examples of zero-sum
kernels.
Sobel filtering: The Sobel operator is used in image processing, particularly within edge detection
algorithms. Technically, it is a discrete differentiation operator, computing an approximation of the
gradient of the image intensity function. At each point in the image, the result of the Sobel operator is
either the corresponding gradient vector or the norm of this vector. The Sobel operator is based on
convolving the image with a small, separable, and integer valued filter in horizontal and vertical
direction and is therefore relatively inexpensive in terms of computations. On the other hand, the
gradient approximation that it produces is relatively crude, in particular for high frequency variations
in the image. The operator uses two 3×3 kernels which are convolved with the original image to
calculate approximations of the derivatives - one for horizontal changes, and one for vertical.
Prewitt Filtering: The Prewitt operator is used in image processing, particularly within edge
detection algorithms. Technically, it is a discrete differentiation operator, computing an
approximation of the gradient of the image intensity function. At each point in the image, the result
of the Prewitt operator is either the corresponding gradient vector or the norm of this vector. The
Prewitt operator is based on convolving the image with a small, separable, and integer valued filter in
horizontal and vertical direction and is therefore relatively inexpensive in terms of computations. On
the other hand, the gradient approximation which it produces is relatively crude, in particular for high
frequency variations in the image. The Prewitt operator is named for Judith Prewitt. Mathematically,
the operator uses two 3×3 kernels which are convolved with the original image to calculate
approximations of the derivatives - one for horizontal changes, and one for vertical.
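A minimal sketch of both operators using their standard 3 × 3 kernels (SciPy assumed; the input image is synthetic):

import numpy as np
from scipy.ndimage import convolve

image = np.random.randint(0, 256, size=(100, 100)).astype(float)

sobel_x   = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # horizontal changes
sobel_y   = sobel_x.T                                        # vertical changes
prewitt_x = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])
prewitt_y = prewitt_x.T

gx, gy = convolve(image, sobel_x), convolve(image, sobel_y)
sobel_magnitude = np.hypot(gx, gy)                 # norm of the gradient vector at each pixel

px, py = convolve(image, prewitt_x), convolve(image, prewitt_y)
prewitt_magnitude = np.hypot(px, py)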
The Laplacian is a 2-D isotropic measure of the 2nd spatial derivative of an image. The Laplacian
of an image highlights regions of rapid intensity change and is therefore often used for edge
detection (see zero crossing edge detectors). The Laplacian is often applied to an image that has first
been smoothed with something approximating a Gaussian smoothing filter in order to reduce its
sensitivity to noise, and hence the two variants will be described together here. The operator
normally takes a single graylevel image as input and produces another graylevel image as output.
Two discrete approximations to the Laplacian filter are commonly used. (Note, we have defined the
Laplacian using a negative peak because this is more common; however, it is equally valid to use the
opposite sign convention.)
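A minimal sketch of a discrete Laplacian and a Laplacian-of-Gaussian (smooth first, then apply the Laplacian), assuming SciPy; the kernel shown is only one common approximation:

import numpy as np
from scipy.ndimage import convolve, gaussian_laplace

image = np.random.randint(0, 256, size=(100, 100)).astype(float)

laplacian_kernel = np.array([[0,  1, 0],
                             [1, -4, 1],
                             [0,  1, 0]])          # one common discrete approximation
edges = convolve(image, laplacian_kernel)

log_edges = gaussian_laplace(image, sigma=2)       # Gaussian smoothing + Laplacian in one step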
Adaptive filter: An adaptive filter is a filter that self-adjusts its transfer function according to an
optimization algorithm driven by an error signal. Because of the complexity of the optimization
algorithms, most adaptive filters are digital filters. By way of contrast, a non-adaptive filter has a
static transfer function. Adaptive filters are required for some applications because some parameters
of the desired processing operation (for instance, the locations of reflective surfaces in a reverberant
space) are not known in advance. The adaptive filter uses feedback in the form of an error signal to
refine its transfer function to match the changing parameters.
Generally speaking, the adaptive process involves the use of a cost function, which is a criterion for
optimum performance of the filter, to feed an algorithm, which determines how to modify filter
transfer function to minimize the cost on the next iteration.
As the power of digital signal processors has increased, adaptive filters have become much more
common and are now routinely used in devices such as mobile phones and other communication
devices, camcorders and digital cameras, and medical monitoring equipment.
Adaptive filters have kernel coefficients calculated for each window position, based on the mean and variance of the original DNs in the underlying image. A powerful technique for sharpening images in the presence of low noise levels is an adaptive filtering algorithm. Here we look at a method of re-defining a high-pass filter as the sum of a collection of edge-sharpening kernels. Following is one example of a high-pass filter.
This filter can be rewritten as the sum of eight 3 × 3 edge-sharpening kernels, each tuned to one direction.
(Figure: the eight directional edge-sharpening kernels.)
Adaptive filtering using these kernels can be performed by filtering the image with each kernel, in
turn, and then summing those outputs that exceed a threshold. As a final step, this result is added to
the original image. This use of a threshold makes the filter adaptive in the sense that it overcomes the
directionality of any single kernel by combining the results of filtering with a selection of kernels,
each of which is tuned to a particular edge-sharpening direction inherent in the image.
Filtering in the Frequency Domain:
Filtering can also be carried out in the frequency domain, in which an image is represented as a sum of sinusoidal components and a filter modifies the amplitudes in specified wavebands. The frequency domain can be represented as a 2D scatter plot known as a Fourier spectrum (or Fourier domain), in which lower frequencies fall at the centre and progressively higher frequencies are plotted outwards.
Filtering in the frequency domain consists of the following three steps:
1. Fourier transform the original image and compute the Fourier spectrum.
2. Select an appropriate filter function and multiply by the elements of the Fourier spectrum.
3. Perform an inverse Fourier transform to return to the spatial domain for display purposes.
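A minimal sketch of these three steps with NumPy's FFT, using an ideal low-pass filter function (the cut-off radius of 20 and the input image are purely illustrative):

import numpy as np

image = np.random.rand(128, 128)

spectrum = np.fft.fftshift(np.fft.fft2(image))           # 1. Fourier transform (low frequencies at centre)

rows, cols = image.shape
r, c = np.indices((rows, cols))
dist = np.hypot(r - rows // 2, c - cols // 2)
H = (dist <= 20).astype(float)                           # 2. ideal low-pass filter function

filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * H))  # 3. inverse transform back to the spatial domain
smoothed = np.real(filtered)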
Crisp Filter:
The crisp filter sharpens the overall scene luminance without distorting the inter-band variance
content of the image. This is a useful enhancement if the image is blurred due to atmospheric haze,
rapid sensor motion, and so on.
The algorithm consists of the following three steps:
1. Calculate principal components of multi-band input image.
2. Convolve PC-1 with summary filter.
3. Retransform to RGB space (Faust 1993).
Another approach to image addition is temporal averaging. It has the advantage, for instance, of reducing the speckle of a radar image without losing spatial resolution. In this case, pixel-by-pixel averaging is performed on multiple co-registered images of the same geographic area taken at different times. Another example of temporal averaging is creating a temperature map for a given area and year by averaging multiple datasets acquired at different times.
For example, pixel-by-pixel temporal averaging of two co-registered 4 × 4 images, i.e. (Image 1 + Image 2) / 2, rounded to the nearest integer:
Image 1           Image 2           Average
16 20 65 19       56 64 25 65       36 42 45 42
69 56 37 28       45 65 85 75       57 61 61 52
65 75 25 46       35 29 35 64       50 52 30 55
64 59 57 38       65 98 25 54       65 79 41 46
Image Subtraction:
The subtraction operation is often carried out on a pair of co-registered images of the same area taken
at different times. Image subtraction is often used to identify changes (change detection) that have
occurred between images collected on different dates.
Typically, two images which have been geometrically registered are used with the pixel (brightness)
values in one image being subtracted from the pixel values in the other. In such an image, areas where
there has been little or no change between the original images contain resultant brightness values
around 0, while those areas where significant change has occurred contain values higher or lower than
0, e.g., brighter or darker depending on the 'direction' of change in reflectance between the two
images. This type of image transform can be useful for mapping changes in urban development
around cities and for identifying areas where deforestation is occurring.
Image 1 (date 1)     Image 2 (date 2)     Difference
2 5 4 6              7 5 3 1              -5  0  1  5
3 5 8 9        −     1 9 3 0        =      2 -4  5  9
6 7 9 5              6 9 9 3               0 -2  0  2
8 9 6 8              8 6 2 7               0  3  4  1
It is also often possible to use just a single image as input and subtract a constant value from all the
pixels. Simple subtraction of a constant from an image can be used to remove the extra energy
recorded by the sensor due to atmospheric effects.
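A minimal sketch of image differencing for change detection; the offset-and-scale step used to make the result displayable as an 8-bit image is one common convention, not the only one:

import numpy as np

date1 = np.random.randint(0, 256, size=(100, 100)).astype(int)
date2 = np.random.randint(0, 256, size=(100, 100)).astype(int)

difference = date1 - date2                            # ~0 where little change has occurred
display = ((difference + 255) / 2).astype(np.uint8)   # map -255..255 onto 0..255 for viewing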
Image Multiplication:
Pixel-by-pixel multiplication of two images is rarely performed in practice. The multiplication operation is, however, a useful one if an image of interest is composed of two or more distinctive regions and the analyst is interested in only one of these regions. In some cases, multiple co-registered images of the same area taken at the same time and date are multiplied with one another, which increases the variation of DNs between pixels.
Indices: Indices are used to create output images by mathematically combining the DN values of
different bands. These may be simplistic, such as (Band X − Band Y), or more complex, such as (Band X − Band Y) / (Band X + Band Y).
These ratio images are derived from the absorption/reflection spectra of the material of interest.
The absorption is based on the molecular bonds in the (surface) material. Thus, the ratio often gives
information on the chemical composition of the target.
Vegetation index:
Ratio vegetation indices (Rouse et al., 1973) separate green vegetation from the soil background by dividing the reflectance values in the near-infrared band (NIR) by those in the red band (R):
Ratio = NIR / RED
This exploits the contrast between the red and infrared bands for vegetated pixels, with high index values produced by combinations of low red reflectance (because of absorption by chlorophyll) and high infrared reflectance (as a result of leaf structure). A ratio value less than 1.0 is taken as non-vegetation, while a ratio value greater than 1.0 is considered vegetation. The major drawback of this method is division by zero: a pixel value of zero in the red band gives an infinite ratio value. To avoid this situation the Normalized Difference Vegetation Index (NDVI) is computed.
NDVI = (NIR − RED) / (NIR + RED)
This is the most commonly used VI because of its ability to minimize topographic effects while producing a linear measurement scale ranging from -1 to +1. Negative values represent non-vegetated areas while positive values represent vegetated areas.
Land cover analysis is done using different slope-based and distance-based vegetation indices (VIs). A VI is computed from data acquired by space-borne sensors in the range 0.6-0.7 µm (red band) and 0.7-0.9 µm (near-IR band), which helps in delineating vegetated and non-vegetated areas.
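A minimal sketch of the ratio and NDVI computations (the bands are synthetic, and the small constant guarding against division by zero is an implementation choice):

import numpy as np

red = np.random.randint(0, 256, size=(100, 100)).astype(float)
nir = np.random.randint(0, 256, size=(100, 100)).astype(float)

ratio = nir / (red + 1e-6)                       # simple ratio; > 1 suggests vegetation
ndvi  = (nir - red) / (nir + red + 1e-6)         # ranges from -1 to +1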
Live green plants absorb solar radiation in the photosynthetically active radiation (0.4 µm to 0.7 µm is
PAR) spectral region, which they use as a source of energy in the process of photosynthesis. Leaf
cells have also evolved to scatter (i.e., reflect and transmit) solar radiation in the near-infrared spectral
region (which carries approximately half of the total incoming solar energy), because the energy level
per photon in that domain (wavelengths longer than about 700 nanometers) is not sufficient to be
useful to synthesize organic molecules. A strong absorption at these wavelengths would only result in
overheating the plant and possibly damaging the tissues. Hence, live green plants appear relatively
dark in the PAR and relatively bright in the near-infrared. By contrast, clouds and snow tend to be
rather bright in the red (as well as other visible wavelengths) and quite dark in the near-
infrared. The pigment in plant leaves, chlorophyll, strongly absorbs visible light (from 0.4 to
0.7 µm) for use in photosynthesis. The cell structure of the leaves, on the other hand, strongly
reflects near-infrared light (from 0.7 to 1.1 µm). The more leaves a plant has, the more these
wavelengths of light are affected, respectively.
In general, if there is much more reflected radiation in near-infrared wavelengths than in visible
wavelengths, then the vegetation in that pixel is likely to be dense and may contain some type of
forest. Subsequent work has shown that the NDVI is directly related to the photosynthetic capacity
and hence energy absorption of plant canopies
1. Negative values of NDVI (values approaching -1) correspond to water.
2. Values close to zero (-0.1 to 0.1) generally correspond to barren areas of rock, sand, or snow.
3. Low, positive values represent shrub and grassland (approximately 0.2 to 0.4).
4. High values indicate temperate and tropical rainforests (values approaching 1).
It can be seen from its mathematical definition that the NDVI of an area containing a dense vegetation
canopy will tend to positive values (say 0.3 to 0.8) while clouds and snow fields will be characterized
by negative values of this index. Other targets on Earth visible from space include:
• free-standing water (e.g., oceans, seas, lakes and rivers), which has a rather low reflectance in both spectral bands (at least away from shores) and thus results in very low positive or even slightly negative NDVI values;
• soils, which generally exhibit a near-infrared spectral reflectance somewhat larger than the red, and thus tend to generate rather small positive NDVI values (say 0.1 to 0.2).
In addition to the simplicity of the algorithm and its capacity to broadly distinguish vegetated areas
from other surface types, the NDVI also has the advantage of compressing the size of the data to be
manipulated by a factor of 2 (or more), since it replaces the two spectral bands by a single new field (possibly coded on 8 bits instead of the 10 or more bits of the original data).
Normalized Ratio Vegetation Index (NRVI)
The normalized ratio vegetation index is a modification of the RVI (Baret and Guyot, 1991) in which the result of RVI − 1 is normalized over RVI + 1:
NRVI = (RVI − 1) / (RVI + 1)
This normalization is similar in effect to that of the NDVI, i.e., it reduces topographic, illumination and atmospheric effects and it creates a statistically desirable normal distribution. A ratio value less than 0.0 indicates vegetation, while values greater than 0.0 represent non-vegetation.
Transformed Vegetation Index (TVI): The TVI modifies the NDVI by adding a constant of 0.5 and taking the square root, i.e., TVI = sqrt(NDVI + 0.5). However, negative values still exist for NDVI values less than -0.5. There is no technical difference between NDVI and TVI in terms of image output or active vegetation detection. Ratio values less than 0.71 are taken as non-vegetation and values greater than 0.71 give the vegetation area.
Corrected Transformed Vegetation Index (CTVI): Since the correction is applied in a uniform manner, the output image using CTVI should show no difference from the initial NDVI image or the TVI image whenever the TVI properly carries out the square-root operation. The correction is intended to eliminate negative values and to generate a VI image that is similar to, if not better than, the NDVI. Ratio values less than 0.71 are taken as non-vegetation and values greater than 0.71 give the vegetation area.
TTVI: A further variant is obtained by simply taking the square root of the absolute value of the (NDVI + 0.5) term in the original TVI expression, giving a new VI called the TTVI. It can be defined as:
TTVI = sqrt(|NDVI + 0.5|)
Ratio values less than 0.71 are taken as non-vegetation and values greater than 0.71 give the vegetation area.
Many researchers have used Landsat TM data for the detection of hydrothermal alteration zones in different countries, taking into account the specific characteristics of each region. The most characteristic combination is the 5/7, 3/1, 4/3 RGB false colour composite.
Principal components analysis (PCA) is often used as a method of data compression. It allows
redundant data to be compacted into fewer bands—that is, the dimensionality of the data is reduced.
The bands of PCA data are noncorrelated and independent, and are often more interpretable than the
source data (Jensen, 1996; Faust, 1989).
The process is easily explained graphically with an example of data in two bands. Consider a two-band scatterplot, which shows the relationships of the data file values in two bands: the values of one band are plotted against those of the other. If both bands have normal distributions, an ellipse shape results.
In the case of a multi-band image (an n-dimensional histogram), an ellipse (2D), ellipsoid (3D), or hyper-ellipsoid (more than 3D) is formed if the distributions of each input band are normal or near normal. (The concept of a hyper-ellipsoid is hypothetical.) To transform the original data onto the new principal component axes, transformation coefficients are obtained and applied in a linear fashion to the original pixel values. This linear transformation is derived from the covariance matrix of the original data set. The transformation coefficients describe the lengths (eigenvalues) and directions (eigenvectors) of the principal axes.
The length and direction of the widest transect of the ellipse are calculated using matrix algebra. The
transect, which corresponds to the major (longest) axis of the ellipse, is called the first principal
component of the data. The direction of the first principal component is the first eigenvector, and its
length is the first eigenvalue. A new axis of the ellipse is defined by this first principal component.
The points in the scatterplot are now given new coordinates, which correspond to this new axis. Since,
in spectral space (feature space), the coordinates of the points are the data file values, new data file
values are derived from this process. These values are stored in the first principal component band of
a new data file. The first principal component shows the direction and length of the widest transect of
the ellipse (Fig.b). Therefore, as an axis in spectral space, it measures the highest variation within the
data. Figure (c) shows that the first eigenvalue is always greater than the ranges of the input bands,
just as the hypotenuse of a right triangle must always be longer than the legs.
The second principal component is the widest transect of the ellipse that is orthogonal (perpendicular)
to the first principal component. As such, the second principal component describes the largest
amount of variance in the data that is not already accounted by the first principal component (Fig.). In
a 2D analysis, the second principal component corresponds to the minor axis of the ellipse.
In n dimensions, there are n principal components. Each successive principal component is the widest
transect of the ellipse that is orthogonal to the previous components in the n-dimensional space of the
scatterplot (Faust 1989), and accounts for a decreasing amount of the variation in the data, which is
not already accounted for by previous principal components (Taylor 1977).
To transform the spatial domain (original data file values) into the principal component values, the
following equation is used:
Pe = Σ (k = 1 to n) dk × Eke
Where:
e = the number of the principal component (first, second)
Pe = the output principal component value for principal component number e
k = a particular input band
n = the total number of bands
dk = an input data file value in band k
Eke = the eigenvector matrix element at row k, column e
Although there are n output bands in a PCA, the first few bands account for a high proportion of the
variance in the data—in some cases, almost 100%. Therefore, PCA is useful for compressing data into
fewer bands. In other applications, useful information can be gathered from
the principal component bands with the least variance. These bands can show subtle
details in the image that were obscured by higher contrast in the original image. These bands may also
show regular noise in the data (for example, the striping in old MSS data) (Faust, 1989).
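A minimal sketch of a principal components transform of an n-band image from its covariance matrix, following the equation above (the four-band input is synthetic):

import numpy as np

bands = np.random.randint(0, 256, size=(4, 100, 100)).astype(float)   # n = 4 toy bands
flat = bands.reshape(4, -1)

eigvals, eigvecs = np.linalg.eigh(np.cov(flat))   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pcs = eigvecs.T @ (flat - flat.mean(axis=1, keepdims=True))   # PC1 has the largest variance
pc_image = pcs.reshape(bands.shape)
print(eigvals / eigvals.sum())                    # proportion of variance per component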
Tasseled Cap
The different bands in a multispectral image can be visualized as defining an N-dimensional space
where N is the number of bands. Each pixel, positioned according to its DN value in each band, lies
within the N-dimensional space. This pixel distribution is determined by the absorption/reflection
spectra of the imaged material. This clustering of the pixels is termed the data structure (Crist and
Kauth, 1986).
The data structure can be considered a multidimensional hyper-ellipsoid. The principal axes of this
data structure are not necessarily aligned with the axes of the data space (defined as the bands of the
input image). They are more directly related to the absorption spectra. For viewing purposes, it is
advantageous to rotate the N-dimensional space such that one or two of the data structure axes are
aligned with the Viewer X and Y axes. In particular, you could view the axes that are largest for the
data structure produced by the absorption peaks of special interest for the application.
For example, a geologist and a botanist are interested in different absorption features. They would
want to view different data structures and therefore, different data structure axes. Both would benefit
from viewing the data in a way that would maximize visibility of the data structure of interest.
If four bands are transformed in a feature space, four axes are generated. Tasselled cap is described
using three axes. The fourth axis that could not be characterized satisfactorily was called non-such.
However, for Landsat TM, it has been characterized as indicating haze or noise.
Later, Crist, Cicone, and Kauth developed a new transformation for Landsat TM data (Crist and Kauth 1986; Crist and Cicone 1984). Their brightness, greenness, wetness and haze components are defined as
Brightness = 0.3037 TM1 + 0.2793 TM2 + 0.4743 TM3 + 0.5586 TM4 + 0.5082 TM5 + 0.1863 TM7
Greenness = -0.2848 TM1 - 0.2435 TM2 - 0.5436 TM3 + 0.7243 TM4 + 0.0840 TM5 - 0.1800 TM7
Wetness = 0.1509 TM1 + 0.1973 TM2 + 0.3279 TM3 + 0.3406 TM4 - 0.7112 TM5 - 0.4572 TM7
Haze = -0.8242 TM1 + 0.0849 TM2 + 0.4392 TM3 - 0.0580 TM4 + 0.2012 TM5 - 0.2768 TM7
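A minimal sketch applying the TM coefficients listed above to six-band pixel data (bands ordered TM1, TM2, TM3, TM4, TM5, TM7; the input values are synthetic):

import numpy as np

coeffs = np.array([
    [ 0.3037,  0.2793,  0.4743,  0.5586,  0.5082,  0.1863],   # brightness
    [-0.2848, -0.2435, -0.5436,  0.7243,  0.0840, -0.1800],   # greenness
    [ 0.1509,  0.1973,  0.3279,  0.3406, -0.7112, -0.4572],   # wetness
    [-0.8242,  0.0849,  0.4392, -0.0580,  0.2012, -0.2768],   # haze
])

tm = np.random.randint(0, 256, size=(6, 100, 100)).astype(float)   # toy TM bands
tc = np.tensordot(coeffs, tm, axes=([1], [0]))    # brightness, greenness, wetness, haze planes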
Fourier Transform:
A Fourier transform is a linear transformation that allows calculation of the coefficients
necessary for the sine and cosine terms to adequately represent the image.
Fourier transformations are typically used for the removal of noise such as striping, spots, or
vibration in imagery by identifying periodicities (areas of high spatial frequency). Fourier
editing can be used to remove regular errors in data such as those caused by sensor anomalies
(e.g., striping). This analysis technique can also be used across bands as another form of
pattern/feature recognition.
Fast Fourier Transform (FFT), a classical image filtering technique, is used to convert a raster image
from the spatial domain into a frequency domain image. The FFT calculation converts the image into
a series of two-dimensional sine waves of various frequencies. The Fourier image can be edited
accordingly for image enhancement such as sharpening, contrast manipulation and smoothing.
Sharpening is achieved by using a high-pass filter whose function is to attenuate low frequencies,
whereas image smoothing is done by low-pass filter. Sometimes combination of both of low-pass as
well as high-pass filters, known as band pass filter is used. In the frequency domain the high-pass
filter is implemented by attenuating the pixel frequencies with the help of different window functions
viz., Ideal, Bartlett (Triangular), Butterworth, Gaussian, Hanning and Hamming etc (ERDAS, 2001).
Let us consider a function f (x, y) of two variables x and y, where x = 0, 1, 2,…., N-1, and y =
0, 1, 2,…., M-1. The function f (x, y) represents digital value of an image in the xth row, yth column;
N and M are the numbers of rows and columns in the image, which are normally integer powers of two. The forward Fourier transform of f(x, y) is then defined as (Gonzalez and Woods, 1992; Jahne, 1993):
F(u, v) = Σ (x = 0 to N−1) Σ (y = 0 to M−1) f(x, y) e^(−j2π(ux/N + vy/M))     ... [1]
Where:
M = the number of pixels horizontally
N = the number of pixels vertically
u,v = spatial frequency variables
e = 2.71828, the natural logarithm base
j = √(−1), the imaginary unit of a complex number.
The number of pixels horizontally and vertically must each be a power of two. If the dimensions of
the input image are not a power of two, they are padded up to the next highest power of two.
The filtering sequence can be summarised as: f(x, y) (input image, spatial domain) → FFT → F(u, v) × H(u, v) (frequency-domain image multiplied by the Fourier filter function) → inverse FFT → g(x, y) (output filtered image).
The raster image generated by the FFT calculation is not an optimum image for viewing or editing.
Each pixel of a Fourier image is a complex number (i.e., it has two components: real and imaginary).
For display as a single image, these components are combined in a root-sum of squares operation.
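A minimal sketch of the root-sum-of-squares (magnitude) computation for displaying a Fourier image; the log scaling is a common viewing convention, not part of the definition:

import numpy as np

image = np.random.rand(128, 128)
F = np.fft.fftshift(np.fft.fft2(image))                 # complex Fourier image
magnitude = np.sqrt(F.real ** 2 + F.imag ** 2)          # same as np.abs(F)
display = np.log1p(magnitude)                           # log scaling is commonly used for viewing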
The inverse Fourier transform, which returns the filtered frequency-domain image to the spatial domain, is defined as:
f(x, y) = [1 / (N M)] Σ (u = 0 to N−1) Σ (v = 0 to M−1) F(u, v) e^(+j2π(ux/N + vy/M))     ... [2]
Where:
M = the number of pixels horizontally
N = the number of pixels vertically
u, v = spatial frequency variables
e = 2.71828, the natural logarithm base.
Equations [1] and [2] are known as a frequency transform pair.
Hue, Saturation, and Intensity (HSI) Transformation:
Hue refers to a specific tone of colour. Saturation refers to the purity, or intensity, of a colour. Intensity refers to how much brightness, from black to white, is contained within a colour.
Hue is generated by mixing red, green, and blue light, which are characterized by coordinates on the RGB axes of the colour cube. In the hue-saturation-intensity hexacone model (Fig.), hue is the dominant wavelength of the perceived colour, represented by angular position around the apex of a hexacone; saturation, or purity, is given by distance from the central vertical axis of the hexacone; and intensity, or brightness, is represented by distance above the apex of the hexacone. Hue is what we
perceive as colour. Saturation is the degree of purity of the colour and may be considered as the
amount of white mixed in with the colour. It is sometimes useful to convert from RGB colour cube
coordinates to HSI hexacone coordinates, and vice versa. The HSI to RGB or RGB to HSI can be
derived through the following transformation equations:
If we consider
R, G, and B are each in the range of 0 to 1.0
I and S are each in the range of 0 to 1.0
H is in the range of 0 to 360
Then, in the case of RGB to HSI:
I = (R + G + B) / 3
S = 1 − [3 / (R + G + B)] × a, where a is the minimum of R, G, and B
H = cos⁻¹ { [0.5 × ((R − G) + (R − B))] / [((R − G)² + (R − B)(G − B))^0.5] }, with H replaced by 360° minus this angle when B > G
If S = 0, then H is meaningless.
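A minimal sketch implementing the RGB-to-HSI equations above for a single pixel, with R, G and B assumed to be scaled to the 0-1 range (the function name and sample values are illustrative):

import numpy as np

def rgb_to_hsi(R, G, B):
    I = (R + G + B) / 3.0
    S = 1.0 - 3.0 * min(R, G, B) / (R + G + B) if (R + G + B) > 0 else 0.0
    num = 0.5 * ((R - G) + (R - B))
    den = np.sqrt((R - G) ** 2 + (R - B) * (G - B)) + 1e-12
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    H = theta if B <= G else 360.0 - theta           # hue as an angle in degrees
    return H, S, I

print(rgb_to_hsi(0.8, 0.4, 0.2))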
The hue, saturation, and intensity transform is useful in two ways: first, as a method of image
enhancement and second, as a means of combining co-registered images from different sources. The
advantage of the HSI system is that it is a more precise representation of human colour vision than
the RGB system. This transformation has been quite
useful for geological applications, where pixel value
in the hue band represents the colour code of the
object (such as soil) and the object type can be
identified based on their colour code.
Several closely related colour models are used for converting the RGB values. Some are also named HSV (hue, saturation, value) or HLS (hue, luminance/lightness, saturation). (Fig. 1) illustrates the geometric interpretation. While the complexity of the models varies, they produce similar values for hue and saturation.
The fusion of two data sets can be done in order to obtain one single data set with the qualities of both
(Saraf, 1999). For example, the low-resolution multispectral satellite imagery can be combined with
the higher resolution radar imagery by fusion technique to improve the interpretability of
fused/merged image. The resultant data product has the advantages of the high spatial resolution,
structural information (from radar image), and spectral resolution (from optical and infrared bands).
Thus, with the help of all these cumulative information, the analyst can explore most of the linear and
anomalous features as well as lithologies. Various image fusion techniques are available in the published literature. The intensity-hue-saturation (IHS) method (Carper et al., 1990; Chavez et al., 1991; Kathleen and Philip, 1994), principal component analysis (PCA) (Chavez et al., 1991; Chavez and Kwarteng, 1989), the Brovey Transform, BT (Chavez and Kwarteng, 1989; Li et al., 2002; Tu et al., 2001) and the Wavelet Transform, WT (Ranchin and Wald, 1993; Yocky, 1996) are the most commonly used fusion algorithms in remote sensing. The present study has been carried out using the Principal Component
Analysis (PCA) technique, which has been successfully used earlier for fusion of remote sensing data
for geological assessment and land cover mapping (Chavez and Kwarteng, 1989; Chavez et al., 1991,
Li et al., 2002, Pal et al., 2007)
The PCA is a statistical technique that transforms a multivariate inter-correlated data set into a new
un-correlated data set. The basic concept of PCA fusion is shown in Figure 2
Its most important steps are: (1) perform a principal component transformation to convert a set of
multispectral bands (three or more bands) into a set of principal components, (2) replace one principal
component, usually the first component, by a high resolution panchromatic image, (3) perform a
reverse principal component transformation to convert the replaced components back to the original
image space. A set of fused multispectral bands is produced after the reverse transform (Chavez et al.
1991, Shettigara 1992, Zhang and Albertz 1998, Zhang 1999).
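A rough sketch of these three steps on synthetic data, assuming the multispectral bands have already been resampled to the panchromatic grid; here the pan image is matched to PC-1 by mean and standard deviation rather than by a full histogram match:

import numpy as np

ms  = np.random.randint(0, 256, size=(3, 200, 200)).astype(float)  # multispectral, resampled to pan grid
pan = np.random.randint(0, 256, size=(200, 200)).astype(float)

flat = ms.reshape(3, -1)
mean = flat.mean(axis=1, keepdims=True)
_, eigvecs = np.linalg.eigh(np.cov(flat))
eigvecs = eigvecs[:, ::-1]                              # PC1 first
pcs = eigvecs.T @ (flat - mean)                         # 1. forward PC transform

pan_flat = pan.ravel()
pan_adj = (pan_flat - pan_flat.mean()) / pan_flat.std() * pcs[0].std() + pcs[0].mean()
pcs[0] = pan_adj                                        # 2. replace PC1 with the (matched) pan image

fused = (eigvecs @ pcs + mean).reshape(ms.shape)        # 3. reverse PC transform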
Arithmetic combination fusion:
Different arithmetic combinations have been employed for fusing multispectral and panchromatic
images. The arithmetic operations of multiplication, division, addition and subtraction have been
combined in different ways to achieve a better fusion effect. Brovey Transform, SVR (Synthetic
Variable Ratio), and RE (Ratio Enhancement) techniques are some successful examples for SPOT pan
fusion.
The SVR and RE techniques are similar, but involve more sophisticated calculations for the sum image (Cliche et al. 1985, Welch and Ehlers 1987, Chavez et al. 1991, Munechika et al. 1993, Zhang and Albertz 1998, Zhang 1999).
Multiplicative:
The algorithm is derived from the four component technique of Crippen (Crippen, 1989a). In this
paper, it is argued that of the four possible arithmetic methods to incorporate an intensity image into a
chromatic image (addition, subtraction, division, and multiplication), only multiplication is unlikely to
distort the color. However, in his study Crippen first removed the intensity component via band ratios,
spectral indices, or PC transform. The algorithm shown above operates on the original image. The
result is an increased presence of the intensity component. For many applications, this is desirable.
People involved in urban or suburban studies, city planning, and utilities routing often want roads and
cultural features (which tend toward high reflection) to be pronounced in the image.
Fused band Bi = (multispectral band Bi) × (Pan image),
where Bi denotes the input multispectral bands 1, 2, …, n.
The Brovey Transform was developed to visually increase contrast in the low and high ends of an
image‘s histogram (i.e., to provide contrast in shadows, water and high reflectance areas such as urban
features). Consequently, the Brovey Transform should not be used if preserving the original scene radiometry is important. However, it is good for producing RGB images with a higher degree of contrast in the low and high ends of the image histogram and for producing visually appealing images. Since the Brovey Transform is intended to produce RGB images, only three bands at a time should be merged from the input multispectral scene, such as bands 3, 2, 1 from a SPOT or Landsat TM image or bands 4, 3, 2 from a Landsat TM image. The resulting merged image should then be displayed with bands
1, 2, 3 to RGB.
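A minimal sketch of a Brovey-type transform for three multispectral bands and a panchromatic band, all assumed co-registered on the same grid (data and the division-by-zero guard are illustrative):

import numpy as np

b1, b2, b3 = np.random.rand(3, 200, 200) * 255           # e.g. TM bands 3, 2, 1
pan = np.random.rand(200, 200) * 255

total = b1 + b2 + b3 + 1e-6                               # avoid division by zero
fused = np.stack([b1 / total * pan, b2 / total * pan, b3 / total * pan])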
Classification:
Multispectral classification is the process of sorting pixels into a finite number of individual classes,
or categories of data, based on their data file values/ Digital Number (Pixel) values. If a pixel satisfies
a certain set of criteria, the pixel is assigned to the class that corresponds to that criteria. This process
is also referred to as image segmentation. Depending on the type of information you want to extract
from the original data, classes may be associated with known features on the ground or may simply
represent areas that look different to the computer. An example of a classified image is a land cover
map, showing vegetation, bare land, pasture, urban, etc.
Pattern recognition is the science—and art—of finding meaningful patterns in data, which can be
extracted through classification. By spatially and spectrally enhancing an image, pattern recognition
can be performed with the human eye; the human brain automatically sorts certain textures and colors
into categories.
In a computer system, spectral pattern recognition can be more scientific. Statistics are derived from
the spectral characteristics of all pixels in an image. Then, the pixels are sorted based on mathematical
criteria. The classification process breaks down into two parts: training and classifying (using a
decision rule).
Training:
The computer system must be trained to recognize patterns in the data. Training is the process of
defining the criteria by which these patterns are recognized (Hord, 1982). Training can be performed
with either a supervised or an unsupervised method, as explained below.
A key question to consider before training is: what classes are most likely to be present in the data? That is, which types of land cover, soil, or vegetation (or whatever) are represented by the data?
Supervised Training
In supervised training, you rely on your own pattern recognition skills and a priori knowledge of the
data to help the system determine the statistical criteria (signatures) for data classification.
To select reliable samples, you should know some information—either spatial or spectral—about the
pixels that you want to classify.
The location of a specific characteristic, such as a land cover type, may be known through ground
truthing. Ground truthing refers to the acquisition of knowledge about the study area from field work,
analysis of aerial photography, personal experience, etc. Ground truth data are considered to be the
most accurate (true) data available about the area of study. They should be collected at the same time
as the remotely sensed data, so that the data correspond as much as possible (Star and Estes, 1990).
However, some ground data may not be very accurate because of observation and recording errors.
Supervised training is closely controlled by the analyst. In this process, you select pixels that represent patterns or land cover features that you recognize, or that you can identify with help from other sources, such as aerial photos, ground truth data, or maps. Knowledge of the data, and of the classes desired, is required before classification.
By identifying patterns, you can instruct the computer system to identify pixels with similar characteristics. If the classification is accurate, the resulting classes represent the categories within the data that you originally identified.
In supervised training, it is important to have a set of desired classes in mind, and then create the
appropriate signatures from the data. You must also have some way of recognizing pixels that
represent the classes that you want to extract. Supervised classification is usually appropriate when
you want to identify relatively few classes, when you have selected training sites that can be verified
with ground truth data, or when you can identify distinct, homogeneous regions that represent each
class.
On the other hand, if you want the classes to be determined by spectral distinctions that are inherent in
the data so that you can define the classes later, then the application is better suited to unsupervised
training. Unsupervised training enables you to define many classes easily, and identify classes that
are not in contiguous, easily recognized regions.
Unsupervised Training
Unsupervised training is more computer-automated. It enables you to specify some parameters that
the computer uses to uncover statistical patterns that are inherent in the data. These patterns do not
necessarily correspond to directly meaningful characteristics of the scene, such as contiguous, easily
recognized areas of a particular soil type or land use. They are simply clusters of pixels with similar
spectral characteristics. In some cases, it may be more important to identify groups of pixels with
similar spectral characteristics than it is to sort pixels into recognizable categories.
Unsupervised training is dependent upon the data itself for the definition of classes. This method is
usually used when less is known about the data before classification. It is then the analyst‘s
responsibility, after classification, to attach meaning to the resulting classes (Jensen, 1996).
Unsupervised classification is useful only if the classes can be appropriately interpreted.
Unsupervised training requires only minimal initial input from the analyst. However, the analyst has the task of interpreting the classes that are created by the unsupervised training algorithm.
Unsupervised training is also called clustering, because it is based on the natural groupings of pixels
in image data when they are plotted in feature space. According to the specified parameters, these
groups can later be merged, disregarded, otherwise manipulated, or used as the basis of a signature.
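As an illustration of clustering in feature space, the sketch below groups pixels with k-means (scikit-learn), used here only as a stand-in for the clustering algorithms applied in practice (for example ISODATA, which can also split and merge clusters); the array name and the number of classes are assumptions.
    # Illustrative clustering sketch (k-means as a stand-in for ISODATA-style clustering).
    # `image` is assumed to be a (rows, cols, n_bands) NumPy array.
    import numpy as np
    from sklearn.cluster import KMeans

    def unsupervised_classify(image, n_classes=8, seed=0):
        rows, cols, bands = image.shape
        pixels = image.reshape(-1, bands).astype(np.float64)
        labels = KMeans(n_clusters=n_classes, n_init=10, random_state=seed).fit_predict(pixels)
        return labels.reshape(rows, cols)        # thematic layer of cluster IDs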
Generally, training samples are identified using one or more of the following methods:
• using a vector layer
• defining a polygon in the image
• identifying a training sample of contiguous pixels with similar spectral characteristics
• identifying a training sample of contiguous pixels within a certain area, with or without similar
spectral characteristics
• using a class from a thematic raster layer from an image file of the same area (i.e., the result of an
unsupervised classification)
Digitized Polygon:
Training samples can be identified by their geographical location (training sites, using maps, ground truth data).
The locations of the training sites can be digitized from maps with the ERDAS IMAGINE Vector or AOI tools.
Polygons representing these areas are then stored as vector layers. The vector layers can then be used as input to
the AOI tools and used as training samples to create signatures.
User-defined Polygon:
Using your pattern recognition skills (with or without supplemental ground truth information), you can identify
samples by examining a displayed image of the data and drawing a polygon around the training site(s) of
interest. For example, if it is known that oak trees reflect certain frequencies of green and infrared light
according to ground truth data, you may be able to base your sample selections on the data (taking atmospheric
conditions, sun angle, time, date, and other variations into account). The area within the polygon(s) would be
used to create a signature.
A training sample can also be grown outward from a single model (seed) pixel: contiguous pixels are compared with the sample and accepted if they fall within user-specified spectral parameters. When one or more of the contiguous pixels is accepted, the mean of the sample is calculated from the accepted pixels. Then, the pixels contiguous to the sample are compared in the same way. This process repeats until no pixels that are contiguous to the sample satisfy the spectral parameters. In effect, the sample grows outward from the model pixel with each iteration. These homogeneous pixels are converted from individual raster pixels to a polygon and used as an AOI layer.
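A simplified sketch of this region-growing idea follows; the 4-connected neighbourhood, the Euclidean spectral-distance threshold, and all names are assumptions for the example rather than the behaviour of any particular ERDAS tool.
    # Simplified sketch of growing a training sample from a model (seed) pixel.
    # `image` is (rows, cols, n_bands); `seed_rc` is a (row, col) tuple.
    import numpy as np
    from collections import deque

    def grow_sample(image, seed_rc, threshold):
        rows, cols, _ = image.shape
        accepted = {seed_rc}
        frontier = deque([seed_rc])
        while frontier:
            r, c = frontier.popleft()
            idx = np.array(list(accepted))
            mean = image[idx[:, 0], idx[:, 1]].mean(axis=0)   # mean of accepted pixels
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in accepted:
                    if np.linalg.norm(image[nr, nc] - mean) <= threshold:
                        accepted.add((nr, nc))
                        frontier.append((nr, nc))
        return accepted                          # (row, col) pixels forming the sample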
NOTE: The thematic raster layer must have the same coordinate system as the image file being classified.
Signatures: The result of training is a set of signatures that defines a training sample or cluster. Each
signature corresponds to a class, and is used with a decision rule (explained below) to assign the
pixels in the image file to a class. Signatures can be parametric or nonparametric.
A parametric signature is based on statistical parameters (e.g., mean and covariance matrix) of the
pixels that are in the training sample or cluster. Supervised and unsupervised training can generate
parametric signatures. A set of parametric signatures can be used to train a statistically based classifier
(e.g., maximum likelihood) to define the classes.
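For example, the parametric signature of a single training sample could be computed as in the minimal sketch below, assuming the sample pixels have been gathered into an (n_pixels, n_bands) array; the names and shapes are assumptions.
    # Sketch of a parametric signature (mean vector and covariance matrix)
    # computed from one training sample or cluster.
    import numpy as np

    def parametric_signature(sample):
        sample = sample.astype(np.float64)
        mean_vector = sample.mean(axis=0)            # per-band mean
        covariance = np.cov(sample, rowvar=False)    # (n_bands, n_bands) covariance matrix
        return mean_vector, covariance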
A nonparametric signature is not based on statistics, but on discrete objects (polygons or rectangles)
in a feature space image. These feature space objects are used to define the boundaries for the classes.
A nonparametric classifier uses a set of nonparametric signatures to assign pixels to a class based on
their location either inside or outside the area in the feature space image. Supervised training is used
to generate nonparametric signatures (Kloer, 1994).
Decision Rule: After the signatures are defined, the pixels of the image are sorted into classes based on the
signatures by use of a classification decision rule. The decision rule is a mathematical algorithm that, using data
contained in the signature, performs the actual sorting of pixels into distinct class values.
Parametric Decision Rule
A parametric decision rule is trained by the parametric signatures. These signatures are defined by the mean
vector and covariance matrix for the data file values of the pixels in the signatures. When a parametric decision
rule is used, every pixel is assigned to a class since the parametric decision space is continuous (Kloer, 1994).
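A minimal sketch of such a rule is given below, using a Gaussian maximum likelihood criterion built from parametric signatures like those sketched earlier; the function names and the use of log-densities are assumptions for the example, and every pixel receives the class with the highest score, so none is left unclassified.
    # Sketch of a parametric decision rule: Gaussian maximum likelihood.
    # `signatures` is a list of (mean_vector, covariance) pairs; `image` is (rows, cols, n_bands).
    import numpy as np

    def maximum_likelihood(image, signatures):
        rows, cols, bands = image.shape
        pixels = image.reshape(-1, bands).astype(np.float64)
        scores = []
        for mean, cov in signatures:
            inv = np.linalg.inv(cov)
            _, logdet = np.linalg.slogdet(cov)
            d = pixels - mean
            # log multivariate normal density, constant terms dropped
            scores.append(-0.5 * (logdet + np.einsum('ij,jk,ik->i', d, inv, d)))
        return np.argmax(np.stack(scores), axis=0).reshape(rows, cols)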
Output File: When classifying an image file, the output file is an image file with a thematic raster layer. This file automatically contains the following data:
• class values
• class names
• color table
• statistics
• histogram
There is no theoretical limit on the number of layers of data that can be used for one classification; however, it is usually wise to reduce the dimensionality of the data as much as possible. Often, certain layers of data are redundant or extraneous to the task at hand. Unnecessary data take up valuable disk space and cause the computer system to perform more arduous calculations, which slows down processing.
Ancillary data other than remotely sensed data can also be used to improve the classification. Using ancillary data enables you to incorporate variables into the classification from, for example, vector layers, previously classified data, or elevation data. The data file values of the ancillary data become an additional feature of each pixel, thus influencing the classification.
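For instance, an elevation layer could be appended as an extra feature of each pixel before classification, as in this minimal sketch (array names and shapes are assumptions):
    # Sketch of appending an ancillary layer (e.g., elevation) as an extra feature.
    # `image` is (rows, cols, n_bands) and `dem` is (rows, cols) on the same grid.
    import numpy as np

    def add_ancillary(image, dem):
        return np.concatenate([image, dem[..., np.newaxis]], axis=2)   # (rows, cols, n_bands + 1)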