
Machine Processing of Remotely Sensed Images
Module 3: Image Enhancement

Image Enhancement techniques

S.No. Topics
1. Image Histogram, Point operations and LUT
2. False Color Composite (FCC)
3. Density Slicing
4. Contrast Enhancements, Histogram Equalization
5. Spatial and frequency filtering, linear and non-linear filters, smoothing, sharpening, High/Low pass filters
6. Edge detection and enhancement

Introduction to Image Enhancement (IE)
• It refers to the accentuation, or sharpening, of image features such as edges, boundaries or contrast to make a graphic display more useful for display and analysis.
• IE does not increase the inherent information content in the data. But it does increase the dynamic range of chosen features so that they are detected easily.
• The purpose of IE is to improve picture quality, mainly to remove noise, to deblur object edges, and to highlight some specified features.

Various types of enhancement operations
• Point operations: contrast stretching, noise clipping, window slicing, histogram modelling
• Spatial operations: noise smoothing, median filtering, unsharp masking, low pass, bandpass, high pass filtering, zooming
• Transform operations: linear filtering, root filtering, and homomorphic filtering
• Pseudo colouring: false colouring, pseudo colouring

Cont.
• IE, in general, improves human viewing capability and increases the chances of success in DIP to achieve specified goals.
• IE can be divided into two main domains:
1. Spatial domain methods
2. Transform domain methods

Cont.
• Spatial domain methods:
  • Consist of procedures that operate directly on the pixels of the image in question.
  • For example, smoothing by averaging, sharpening by gradient type operators, global enhancement by means of histogram modification techniques or contrast stretching.
• Transform domain methods:
  • Consist of computing a 2-D frequency domain transform (Fourier/Hadamard) of the image to be enhanced, altering the transform, and computing the inverse to yield an image that has been enhanced in a certain manner.
  • For example, low and high pass filters for smoothing and sharpening, homomorphic filtering for manipulating the effects of illumination.

Classification of image enhancement in the spatial domain, on the basis of operator properties
(a) Operator's sensitivity to image context
  • Subdivided as context free and context sensitive.
  • Context free IE provides a position invariant operator in which the parameters are fixed a priori.
  • Context sensitive operators are position variant, in which parameters change in response to local image characteristics.
(b) Area on which the operator works / operation area
  • Subdivided as local and global.
  • Local IE takes a sub-image each time to operate upon; it can be fixed sized or variable sized.
  • Global IE takes the whole image for operation.
(c) Operation goals
  • Subdivided as noise cleaning or feature enhancement or a combination of both.
(d) Technical method involved
  • Subdivided as spatial smoothing, grey level scaling, edge enhancement, frequency domain filtering.
Point operations
• Point operations are pixel-to-pixel operations in which the amplitudes (or GLs) of individual pixels are modified according to some rule (Heijdan, 1994): g_(n,m) = T_(n,m)(f_(n,m)).
• The indices of the operator T_(n,m)(.) indicate that the rule depends upon the position (n, m) of the pixel; such an operator is also called space/shift variant.
• However, if T is independent of the position (n, m) of the pixel, it is called space-invariant: g_(n,m) = T(f_(n,m)).
• Monadic operations are those involving a single image.
• Dyadic operations involve two operand images.

Some monadic operations (Heijdan, 1994)
• Threshold operations are used in computer vision applications and can be applied to detect all pixels in the image that correspond to an object (or a part of an object) in the scene. However, this is only successful if the difference between the radiances of the object and its background is large enough.
• Floor, ceiling and clip operations are useful to limit the dynamic range of the grey levels in an image.
• Offset and scale operations (mainly used in contrast stretching) are like the brightness and contrast control of a CRT monitor.
• Together with the square, square root and absolute value operations, these are primitive operations with which, in combination with other operations, more complicated functions can be designed.
• Monadic operations can be implemented with a LUT.

Figure: Pixel-to-pixel operations (Easton, 2010)
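Since monadic point operations can be implemented with a LUT, the sketch below illustrates the idea for an 8-bit image in NumPy. It is a minimal illustration, not part of the original slides; the threshold and clip limits chosen are arbitrary.

```python
import numpy as np

def apply_lut(image_8bit, lut):
    """Apply a 256-entry lookup table to an 8-bit image (pixel-to-pixel point operation)."""
    return lut[image_8bit]

# Example monadic operations expressed as LUTs (threshold and clip values are illustrative).
levels = np.arange(256, dtype=np.uint8)
threshold_lut = np.where(levels >= 128, 255, 0).astype(np.uint8)   # threshold operation
clip_lut = np.clip(levels, 30, 200).astype(np.uint8)               # clip to limit dynamic range

image = np.random.randint(0, 256, (4, 4), dtype=np.uint8)          # stand-in for a real band
print(apply_lut(image, threshold_lut))
print(apply_lut(image, clip_lut))
```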
(i) Contrast Enhancement
• Contrast enhancement is applied primarily to improve visual analysis.
• Contrast stretching is a monadic operation and consists of two methods:
(a) Deterministic grey level transformation (DGLT)
(b) Histogram modification (HM).
• DGLT involves a pixel-by-pixel radiometric transformation that is designed to enhance visual discrimination of low contrast image features.
• Each pixel's GL is modified by the specified transformation, without any consideration of the neighboring pixel GLs.
• It is not good practice to contrast stretch the original imagery and then use the
enhanced imagery for classification, change detection and so on.
Easton, 2010

Cont.
• Contrast stretching almost always changes the original pixel values, often in a nonlinear fashion.

Figure: Transformation functions producing an image with desired characteristics:
1. Piecewise linear stretching
2. Brightness stretching on the mid region
3. More eccentric stretching
4. Binarization: limiting transformation, produces a binary image
5. Dark region stretching transformation, i.e., dark becomes less dark and bright becomes less bright
   Bright stretching transformation, i.e., lower GL outputs for lower GLs but higher GL outputs for higher GLs
6. Produce a wide-dynamic-range image on a small-dynamic-range display, achieved by removing the most significant bit of the pixel value
8. Reverse scaling/negative: negative of the original image is generated
9. Thresholding transformation: the height can be changed to adjust the output dynamic range
Contrast Enhancement Algorithms
• Contrast enhancement algorithms are widely used:
  • Linear approaches
    • Linear contrast stretch (CS) or Min-Max contrast stretch
    • Piecewise linear stretch
  • Non-linear approaches
    • Logarithmic transformation
    • Exponential transformation
    • Histogram equalization

(a) Linear Contrast Stretch (CS) or Min-Max contrast stretch
• It is the basic form of CS, which involves the mapping of the pixel values from the observed range (m, M) to the full dynamic range (n, N) of the display device.
• The output value is given by:
g(x, y) = [(f(x, y) − m) / (M − m)] (N − n) + n
where:
g(x, y): output grey level
f(x, y): input grey level
m: minimum of observed data range
M: maximum of observed data range
n: minimum of available grey level scale (n, N)
N: maximum of available grey level scale (n, N)
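A minimal NumPy sketch of the min-max stretch formula above, assuming an 8-bit display range (the small synthetic band is only for illustration):

```python
import numpy as np

def minmax_stretch(f, n=0, N=255):
    """Linear (min-max) contrast stretch from the observed range (m, M) to the display range (n, N)."""
    f = f.astype(np.float64)
    m, M = f.min(), f.max()
    g = (f - m) / (M - m) * (N - n) + n
    return np.round(g).astype(np.uint8)

band = np.array([[52, 60, 75], [88, 101, 110]], dtype=np.uint8)  # small synthetic band
print(minmax_stretch(band))   # pixel 52 -> 0, pixel 110 -> 255
```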

Figure (Jensen):
a) The result of applying a minimum-maximum contrast stretch to normally distributed remotely sensed data. The histograms before and after the transformation are shown. The minimum and maximum brightness values encountered in band k are min_k and max_k, respectively.
b) Theoretical result of applying a one standard deviation percentage linear contrast stretch. This moves the min_k and max_k values ±34% from the mean into the tails of the distribution.

Figure (Richards): Linear contrast modification of the raw (as recorded) image in (a) to produce the visually better product in (b).

Figure (Mather): (a) Original, (b) normal linear contrast stretch, (c) 5% and 95% saturation linear stretch.

Figure (Mather): Histogram equalized contrast stretched image.
Cont.
• Effect: to brighten up an image which is under-exposed (too dark) or to darken an
otherwise over-bright image.
• This does not consider the characteristics of the image; only the min/max values are considered, and these values may be outliers as well.
• To overcome this problem, it is sometimes advisable to find the 5th and 95th percentiles of the image histogram, and m and M are replaced by GL5% and GL95% respectively, where:
• 5th percentile: point (GL value) that is exceeded by 95% of the image pixel values
• 95th percentile: point (GL value) that exceeds 95% of the image pixel values.

Derivation of a linear function from two points of input image X and output image Y (Liu et al.)

(b) Piecewise Linear Stretch
• This is applied:
a) when the histogram of an image is not Gaussian in shape (multimodal);
b) if the image histogram is asymptotic, as it often is, it is impossible to simultaneously control the output image and the amount of saturation at the ends of the histogram with a simple linear transformation.
• With a two-segment piecewise linear transformation, more control is gained, histogram asymmetry is reduced, and better use is made of the available GL range.

Figure:
a, b) Thermal infrared imagery of the Savannah River enhanced using piecewise linear contrast stretching to highlight the thermal plume at the expense of the ambient river water and the surrounding landscape.
c, d) Piecewise linear contrast stretching to enhance the Savannah River at the expense of the thermal plume and the surrounding landscape.

Figure: Piecewise linear brightness value modification function defined by a set of user-specified break points that commence at (0, 0) and increase monotonically to finish at (L−1, L−1).
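A short sketch of a piecewise linear stretch using NumPy's interpolation routine; the break points shown are illustrative, not taken from the Savannah River example:

```python
import numpy as np

def piecewise_linear_stretch(f, in_breaks, out_breaks):
    """Piecewise linear stretch defined by user-specified break points.

    in_breaks/out_breaks should start at the lowest DN and increase monotonically
    (e.g. commence at 0 and finish at L-1), as described in the slides.
    """
    g = np.interp(f.astype(np.float64), in_breaks, out_breaks)
    return np.round(g).astype(np.uint8)

# Illustrative break points: stretch the 40-120 DN interval over most of the output range.
band = np.random.randint(0, 256, (5, 5), dtype=np.uint8)
stretched = piecewise_linear_stretch(band, in_breaks=[0, 40, 120, 255],
                                     out_breaks=[0, 10, 230, 255])
```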
(c) Logarithmic Transformation
• The dynamic range of an image can be compressed by replacing each pixel value in a given image with its natural logarithm:
y = b·ln(ax)
• Expands the contrast of the small values of the input image f and compresses the large values.
• Results in an increased dynamic range of the dark regions in an image and a decreased dynamic range in the light regions. Thus, it maps the lower intensity values or dark regions onto a larger number of greyscale values and compresses the higher intensity values or light regions into a smaller range of greyscale values.
• The shape of the curve can be controlled by changing the minimum f_min and maximum f_max values (in the input image). (Liu et al.)

y = b·ln(ax + 1)
• a (>0) controls the curvature of the logarithmic function (x and y are the input and output images, respectively).
• b is a scaling factor to make the output DNs fall within a given value range (say 0 to 255), and the shift of 1 is to avoid the zero value, at which the logarithmic function loses its meaning.
• The gradient of the function (y′) is greater than 1 in the low DN range and thus it spreads out low DN values, whilst in the high DN range the gradient of the function is less than 1 and so it compresses high DN values.
• As a result, logarithmic contrast enhancement shifts the peak of the image histogram to the right and highlights the details in dark areas of an input image. Many images have histograms similar in form to log-normal distributions. In such cases, a logarithmic function will effectively modify the histogram to the shape of a normal distribution.
• Modification: include a shift c, which shifts the histogram of the output image:
y = b·ln(ax + 1) + c

(Solomon and Breckon, 2011; g and f are the output and input images, respectively)
• By including a scaling parameter σ to select the portion of the logarithmic curve to use, and shifting the pixels by 1 to avoid the problem at zero, the equation becomes:
g = c·ln(1 + (e^σ − 1)·f)
• The addition of 1 is included to prevent problems where the logarithm is undefined, i.e. the case f = 0.
• The scaling factor σ controls the level of dynamic range compression; varying the parameter σ changes the gradient of the logarithmic function used for input to output.
• c scales the output over the image quantization range 0 to 255 and can be computed as c = 255 / ln(1 + f_max).
• As shown in the figure, as the logarithmic function is close to linear near the origin, the compression achieved is smaller for an image containing a low range of input values than for one containing a broad range of pixel values.

Figure: Logarithmic Transform: varying the parameter σ changes the gradient of the logarithmic function used for input to output.
• The logarithmic transformation has the following desirable effects:
(i) makes the grey level relate linearly to optical density rather than film transmittance or intensity;
(ii) makes low contrast details more visible by enhancing low contrast edges;
(iii) provides a constant S/N ratio for quantization noise;
(iv) somewhat matches the response of the human visual system (HVS);
(v) usually provides a more equal distribution of grey levels;
(vi) transforms multiplicative noise into additive noise.

Figure: Effect of being photographed in front of a bright background (left) where the dynamic range of the film or camera aperture is too small to capture the full range of the scene. By applying the logarithmic transform, we brighten the foreground of this image by spreading the pixel values over a wider range and revealing more of its detail whilst compressing the background pixel range.
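A small sketch of the Solomon and Breckon style logarithmic transform, assuming an input scaled to [0, 1]; here the scaling constant is computed from the already-scaled values so that the brightest output is exactly 255, a slight variation on the c = 255/ln(1 + f_max) formula quoted above:

```python
import numpy as np

def log_transform(f, sigma=1.0):
    """Logarithmic contrast enhancement, g = c * ln(1 + (e**sigma - 1) * f).

    f is assumed to be scaled to [0, 1]; c rescales the output to 0-255.
    """
    f = f.astype(np.float64)
    scaled = (np.exp(sigma) - 1.0) * f
    c = 255.0 / np.log(1.0 + scaled.max())
    return np.round(c * np.log(1.0 + scaled)).astype(np.uint8)

dark_image = np.random.rand(4, 4) * 0.3          # mostly dark input, values in [0, 0.3]
print(log_transform(dark_image, sigma=3.0))       # dark DNs are spread over a wider output range
```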

(d) Exponential Transformation
• General form of the exponential function used for image processing:
y = b·e^(ax) − 1
• a (>0) controls the curvature of the exponential function.
• b is a scaling factor to make the output DNs fall within a given value range, and the shift of 1 is to avoid the zero value, because e^0 = 1.
• As the inverse of the logarithmic function, exponential contrast enhancement shifts the image histogram peak to the left by spreading out high DN values and compressing low DN values, to enhance detail in light areas at the cost of suppressing the tone variation in the dark areas (it enhances detail in high-value (bright) regions of the image whilst decreasing the dynamic range in low-value (dark) regions, the opposite effect to the logarithmic transform).
• Modification: introduce a shift parameter c to modify the exponential contrast enhancement function:
y = b·e^(ax) − 1 + c

(Solomon and Breckon)
• The choice of base (1 + α is the base) depends on the level of dynamic range compression required. In general, base numbers just above 1 are suitable for photographic image enhancement. Thus, we expand our exponential transform notation to include a variable base and scale to the appropriate output range as before:
I_output = c·[(1 + α)^(I_input) − 1]
• This generally produces an image with less visible detail than the original and is thus not a desirable image enhancement transformation.
• However, an important feature of the exponential transformation is that the result is always non-negative.

Figure: Exponential Transform: varying the parameter α changes the gradient of the exponential function used for input to output.
• In the expanded notation I_output = c·[(1 + α)^(I_input) − 1]:
  • (1 + α): base
  • c: scaling factor required to ensure the output lies in an appropriate range (say 0 to 255).
  • I_input = 0 would imply I_output = c unless we add in the −1 to counter this potential offset appearing in the output image.
• The level of dynamic range compression and expansion is controlled by the portion of the exponential function curve used for the input-to-output mapping; this is determined by the parameter α.
• As the exponential function is close to linear near the origin, the compression is greater for an image containing a lower range of pixel values than for one containing a broader range. The background is a high-valued area of the image (bright), whilst the darker regions have low pixel values. This effect increases as the base number is increased.

(e) Power transformation
• An alternative to both the logarithmic and exponential transforms is the 'raise to a power' or power-law transform, in which each input pixel value is raised to a fixed power:
I_output = c·(I_input)^γ
• In general, a value of γ > 1 enhances the contrast of high-value portions of the image at the expense of low-value regions, whilst we see the reverse for γ < 1. This gives the power-law transform properties similar to both the logarithmic (γ < 1) and exponential (γ > 1) transforms. The constant c performs range scaling as before.

Figure (Gonzalez and Woods): Some basic intensity transformation functions. Each curve is scaled independently so that all curves fit in the same graph.
Figure: Plots of the gamma equation s = c·r^γ for various values of γ (c = 1 in all cases). Each curve was scaled independently so that all curves would fit in the same graph.
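A minimal power-law (gamma) transform sketch in NumPy, assuming 8-bit input normalized to [0, 1] before raising to the power γ:

```python
import numpy as np

def power_law(f, gamma, c=1.0):
    """Power-law (gamma) transform, I_out = c * I_in**gamma, on an image scaled to [0, 1]."""
    f = f.astype(np.float64) / 255.0
    g = c * np.power(f, gamma)
    return np.round(255.0 * np.clip(g, 0.0, 1.0)).astype(np.uint8)

band = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
bright_boosted = power_law(band, gamma=0.5)   # gamma < 1 lifts dark (low-value) regions
dark_boosted = power_law(band, gamma=2.0)     # gamma > 1 stretches bright (high-value) regions
```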

Other Methods
• The output value g(i, j) is the lookup table value for entry f(i, j), given by the following expression, where f_avg and σ are the mean and standard deviation of the picture.
• n can take many values to give increasingly severe stretches:
g(i, j) = 0, if f(i, j) ≤ f_avg − 2σ
g(i, j) = 255·[(f(i, j) − f_avg + 2σ) / (4σ)]^n, if f_avg − 2σ < f(i, j) < f_avg + 2σ
g(i, j) = 255, if f(i, j) ≥ f_avg + 2σ

Mean and standard deviation adjustment
• In this method, the mean and standard deviation of the data may be specified, which is equivalent to a gain and bias that are computed from the data:
gain = σ_new / σ_old
bias = mean_new − gain · mean_data
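A short sketch of the mean and standard deviation adjustment (gain/bias) method, assuming 8-bit data; the target mean and standard deviation in the example are arbitrary:

```python
import numpy as np

def mean_std_adjust(f, new_mean, new_std):
    """Adjust an image so its mean and standard deviation match specified values.

    gain = sigma_new / sigma_old, bias = mean_new - gain * mean_data (as in the slide).
    """
    f = f.astype(np.float64)
    gain = new_std / f.std()
    bias = new_mean - gain * f.mean()
    return np.clip(gain * f + bias, 0, 255).astype(np.uint8)

band = np.random.randint(40, 90, (6, 6), dtype=np.uint8)
adjusted = mean_std_adjust(band, new_mean=127.0, new_std=40.0)
print(adjusted.mean(), adjusted.std())   # close to 127 and 40 (clipping may shift them slightly)
```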

(d) Histogram specification/modification
• Sometimes, a particularly distributed histogram of the output image is desired in specific applications.
• Histogram specification is used to convert an image so that it has a particular output histogram, as specified.
Example:
• Histogram equalization intends to map any input image into an output image with a uniformly distributed histogram.
• Gaussian histogram: to convert the output image to have a Gaussian distribution.

Theory of histogram specification
• Assume a grey-scale input image, denoted I_input(x).
• If the variable x is continuous and normalized to lie within the range (0, 1), then this allows us to consider the normalized image histogram as a probability density function (PDF) p_X(x), which defines the likelihood of given grey-scale values occurring within the vicinity of x.
• Similarly, the resulting grey-scale output image after histogram equalisation is I_output(y) with corresponding PDF p_Y(y).
• In histogram equalization, we seek some transformation function y = f(x) that maps between the input and the output grey-scale image values and which will transform the input PDF p_X(x) to produce the desired output PDF p_Y(y).
• Let the input and output histograms be h_i(x) and h_o(y) as shown in the figure (considering these as continuous functions).

As a point operation does not change the image size, the number of pixels in the DN range δx in the input image X should be equal to the number of pixels in the DN range δy in the output image Y. Thus we have:
y = f(x), h_i(x)·δx = h_o(y)·δy
Let δx → 0 and δy → 0; then h_i(x) dx = h_o(y) dy.
Hence,
h_o(y) = h_i(x)·dx/dy = h_i(x) / (dy/dx) = h_i(x) / f′(x) = h_i(x) / y′
This implies that the desired output PDF depends only on the known input PDF and the transformation function y = f(x). The histogram of the output image can be derived from the histogram of the input image divided by the first derivative of the point operation function.

Example:
Given a linear function y = 2x − 6, then y′ = 2 and
h_o(y) = (1/y′)·h_i(x) = (1/2)·h_i(x)
Thus, this linear function will produce an output image with a flattened histogram, twice as wide and half as high as that of the input image, and with all the DNs shifted to the left by three DN levels. This linear function stretches the image DN range to increase its contrast.
• when the gradient of a point operation function is greater than 1, it is a stretching function that increases the image contrast;
• when the gradient of a point operation function is less than 1 but positive, it is a compression function that decreases the image contrast;
• if the gradient of a point operation function is negative, then the image becomes a negative, with black and white inverted.
Histogram equalization (HE)
• It transforms an input image to an output image with a uniform (equalised) histogram, such that each histogram class (0-255) in the displayed image contains an approximately equal number of pixel values, and the histogram of these values will then be approximately uniform. Hence, the entropy of the histogram, which is a measure of the information content of the image, increases. However, because of the discrete nature of the image, a truly uniform histogram is rarely achieved. This stretching has the following effects:
1. Classes with low frequency are amalgamated.
2. Classes with high frequency are spread out relative to the original.
3. This improves the contrast in the centre of the image.
• The effect of these is to improve the contrast in the densely populated areas and to reduce it in the other, more sparsely populated areas. HE tends to automatically reduce the contrast in very light and very dark areas and to expand the middle levels towards the low and high ends of the radiance scale, due to the fact that most histograms are Gaussian in nature. The contrast of the image is rather harsh, a characteristic that may be offset by the fact that no parameters are required from the analyst for the transformation.
• In reality, however, HE often produces images with contrast that is too high. This is because natural scenes are more likely to follow normal (Gaussian) distributions, and consequently the human eye is adapted to be more sensitive to discriminating subtle grey-level changes of intermediate brightness than of very high and very low brightness.

Derivation:
• HE involves finding the function that converts h_i(x) to h_o(y) = A, where A is a constant.
• Suppose image X has N pixels and the desired output DN range is L (the number of DN levels); then
h_o(y) = A = N/L
• Hence, using the earlier derivation h_o(y) = h_i(x)·dx/dy:
dy = h_i(x) dx / h_o(y) = (L/N)·h_i(x) dx
y = (L/N) ∫ h_i(x) dx = (L/N)·C_i(x)
• C_i(x): cumulative distribution function of X, C_i(x) = Σ_(i=0)^(x) h_i(x)
• Calculation of C_i(x) is simple for a discrete function in the case of digital images.
• For a given DN level x, C_i(x) is equal to the total number of those pixels with DN values not greater than x.
• Theoretically, HE can be achieved exactly if C_i(x) is a continuous function. However, as C_i(x) is a discrete function for an integer digital image, HE can only produce a relatively flat histogram that is mathematically equivalent to an equalized histogram, in which the distance between histogram bars is proportional to their heights.

Figure: Cumulative histograms, continuous case and discrete case (Easton, 2010).


Algorithm to implement discrete histogram equalization
• In the idealized case, the resulting equalized image will contain an equal number of pixels (N/L), each having the same grey level.
• For L possible grey levels within an image that has N pixels, this equates to the jth entry of the cumulative histogram C(j) having the value jN/L in this idealized case (i.e., j times the equalized value). We can thus find a mapping between input pixel intensity values i and output pixel intensity values j:
C(i) = j·(N/L)
j = (L/N)·C(i)
• The second equation represents a mapping from a pixel of intensity value i in the input image to an intensity value of j in the output image via the cumulative histogram C( ) of the input image.

Figure: Original image (poor contrast) and contrast stretched (histogram equalized) image.
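A compact NumPy implementation of the discrete mapping j = max(0, round((L/N)·C(i)) − 1) described above (and made explicit in the next slide), assuming integer grey levels in [0, L−1]:

```python
import numpy as np

def histogram_equalize(f, L=256):
    """Discrete histogram equalization via the cumulative histogram.

    Implements j = max(0, round((L/N) * C(i)) - 1) as a lookup table.
    """
    f = f.astype(np.int64)
    N = f.size
    hist = np.bincount(f.ravel(), minlength=L)       # h_i(x)
    C = np.cumsum(hist)                              # cumulative histogram C(i)
    lut = np.maximum(0, np.round(L / N * C) - 1).astype(np.uint8)
    return lut[f]

image = np.random.randint(80, 140, (64, 64))          # low-contrast synthetic image
equalized = histogram_equalize(image)
print(image.min(), image.max(), equalized.min(), equalized.max())
```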

• The maximum value of j in an image is L, but the range of grey-scale values is strictly j = {0, 1, 2, …, L−1} for a given image.
• In practice, this is rectified by adding a −1 (minus one) to the equation, thus also requiring a check to ensure a value of j = −1 is not returned:
I_output(c, r) = max{0, (L/N)·C[I_input(c, r)] − 1}
• where C( ) is the cumulative histogram for the input image I_input,
• N is the number of pixels in the input image I(c, r), and
• L is the number of possible grey levels for the images (i.e., the quantization limit).

Histogram specification or matching/modification (HS or HM)
• In HS/HM, the original image is rescaled so that the histogram of the enhanced image follows some desired form. It transforms the histogram of an image to create a new image whose histogram "matches" that of some reference image f_ref[x, y].
• Example: In the histogram equalization process, the histogram of the enhanced image is forced to be uniform. Researchers have used HM to produce exponential or hyperbolic-shaped histograms.
• This process, known as histogram specification or matching, is a generalization of histogram equalization and allows direct comparison of images perhaps taken under different conditions:
  • a point operation that transforms an input image to make its histogram match a given shape defined by either a mathematical function or the histogram of another image.
  • particularly useful for image comparison and differencing. If the two images in question are modified to have similar histograms, the comparison will be on a fair basis.
  • implemented by applying HE twice, i.e., the transformation of the histogram is derived by first equalizing the histograms of both images.
• For example, the equation h_o(y) = A = N/L implies that an equalised histogram is only decided by the image size N and the output DN range L.
• Images of the same size always have the same equalized histogram for a fixed output DN range, and thus HE can act as a bridge to link images of the same size but different histograms (see Fig).
• h_i(x): histogram of an input image; h_o(y): the reference histogram to be matched.
• z = f(x): HE function to transform h_i(x) to an equalised histogram h_e(z), and
• z = g(y): HE function to transform the reference histogram h_o(y) to the same equalised histogram h_e(z); then z = g(y) = f(x).
• Thus y = g⁻¹(z) = g⁻¹(f(x)).
• Recall that f(x) and g(y) are the cumulative distribution functions (C) of h_i(x) and h_o(y) individually, given as y = (L/N) ∫ h_i(x) dx = (L/N)·C_i(x).
• Thus, HM can be easily implemented by a three-column LUT containing corresponding DN levels of x, z and y. An input DN level x will be transformed to an output DN level y sharing the same z value. The output image Y will have a histogram that matches the reference histogram h_o(y).

Example
• Gaussian histogram match: If the reference histogram h_o(y) is defined by a Gaussian distribution function
h_o(y) = (1 / (σ√(2π))) · exp(−(x − x̄)² / (2σ²))
• where σ is the standard deviation and x̄ is the mean of image X, the HM transformation is then called a Gaussian stretch, since the resultant image has a histogram in the shape of a Gaussian distribution.

Figure: Histogram equalization acts as a bridge for histogram matching.
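A sketch of histogram matching through double equalization, i.e. the three-column LUT idea (x → z → y). It assumes 8-bit images; the inverse of the reference equalization function g is obtained with a searchsorted lookup rather than an explicit third column:

```python
import numpy as np

def _equalization_lut(hist, L=256):
    """HE mapping z = round((L-1) * C(x) / N) derived from a histogram."""
    C = np.cumsum(hist)
    return np.round((L - 1) * C / C[-1]).astype(np.int64)

def histogram_match(f, reference, L=256):
    """Match the histogram of f to that of the reference image.

    Equalize both histograms, then map each input level x to the reference level y
    whose equalized value g(y) is closest (from above) to z = f(x)."""
    h_i = np.bincount(f.ravel(), minlength=L)
    h_o = np.bincount(reference.ravel(), minlength=L)
    fx = _equalization_lut(h_i, L)        # z = f(x) for the input image
    gy = _equalization_lut(h_o, L)        # z = g(y) for the reference image
    # For each z produced by the input, pick the smallest y with g(y) >= z (inverse of g).
    lut = np.searchsorted(gy, fx).clip(0, L - 1).astype(np.uint8)
    return lut[f.astype(np.int64)]

inp = np.random.randint(0, 100, (64, 64))      # dark synthetic image
ref = np.random.randint(100, 256, (64, 64))    # bright synthetic reference
matched = histogram_match(inp, ref)
```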

Histogram specification
• The required transformation of the histogram H[f1] of image f1[x, y] to the histogram H[fRef] may be derived by first equalizing the histograms of both images (reference and input images):
O_Ref{f_Ref[x, y]} = e_Ref[x, y]
O_1{f_1[x, y]} = e_1[x, y]
• The inverse of the lookup table transformation for the reference image is O_Ref⁻¹{e_Ref} = f_Ref.
• The LUT for histogram specification of the input image is obtained by first deriving the lookup tables that would flatten the histograms of the input and reference images. It should be noted that some gray levels will not be specified by this transformation and so must be interpolated.
• Functional form of the operation:
g_1[x, y] (with specified histogram) = O_Ref⁻¹{O_1{f_1}} = O_Ref⁻¹·O_1{f_1} = C_Ref⁻¹{C_1{f_1}}
Where:
• f_1[x, y]: input image, f_Ref[x, y]: reference image
• H[f_1]: histogram of the input image; H[f_Ref]: histogram of the reference image; C[f_1]: cumulative histogram of f_1.
• e_n[x, y]: intermediate image with a flat/equalized histogram of f_n[x, y], obtained by applying operator O_n{ }; n = Ref or 1.
• The histograms of e_Ref and e_1 are "identical" (both are flat/equalized).
Example: Histogram equalization (Mather)
• HE spreads the range of GLs present in the input image over the full range of the display device.
• The relative brightness of the pixels in the original image is not maintained. Images are obtained with a harsher contrast.
• To achieve a uniform histogram, the mapping function is j = (L/N)·C(i).
• The number of levels used is almost always reduced (because those histogram classes with relatively few members are amalgamated to make up the target number, n_t = 262144/16 = 16384).
• In histogram areas that have the greatest class frequencies, the individual classes are stretched out over a wider range.
• The effect is to increase the contrast in the densely populated parts of the histogram and to reduce it in other, more sparsely populated areas.

Table caption: Col 1: raw image (16 levels); Col 2: number of pixels at each level; Col 3: cumulative number of pixels; Col 4: new mapping, obtained by determining the target number of pixels (= total number of pixels divided by the number of classes, that is 262144/16 = 16384) and then finding the integer part of C_j divided by n_t, the target number.
(g_min and g_max are the minimum and maximum GL values, respectively. Pratt)

Example: Histogram specification (Mather)
• Same input histogram as in the histogram equalization example. Image size 512 × 512 = 262144 with 16 quantization levels.
• For a normal distribution ranging from −∞ to +∞, some delimiting points are used; the range ±3σ from the mean is used here.
• Level 1: probability of observing a value of a normally distributed variable that is 3 or more standard deviations below the mean.
• Level 2: probability of observing a value that is between 2.6 and 3 standard deviations below the mean, and so on. These values can be derived by using a statistical algorithm.
• Gaussian histogram fitting uses h_o(y) = (1 / (σ√(2π))) · exp(−(x − x̄)² / (2σ²)).
• Target number of pixels (at each level): obtained by multiplying the probability for each level by the image size (262144).

The output GLs are determined by comparing cols (v) and (vii) as follows:
• The cumulative value in col (vii) at level 0 is 1311. The first value in col (v) to exceed 1311 is that associated with level 1, namely 1398; hence, the input level 0 becomes the output level 1.
• The cumulative value in col (vii) for input level 1 is 3933. The first element of col (v) to exceed 3933 is the value associated with level 3 (9595), so input level 1 becomes output level 3.
• The process is repeated for each input level. Once the elements of col (viii) have been determined, they can be written to a LUT and the input levels of col (i) will automatically map to the output levels of column (viii).
• Usually, the limits chosen are symmetric about the mean, by user-defined limits.

Table caption: Col (i): DNs in the original, un-enhanced image. Col (ii): points on the standard normal distribution to which these PVs will be mapped. Col (iii): probabilities associated with the class intervals. Col (iv): frequency for target pixels. Col (v): cumulative frequency. Col (vi) & Col (vii): observed and cumulative frequency, respectively, for the input image. Col (viii): GL to be used in the Gaussian-stretched image.
Figure: Histogram equalized image and Gaussian histogram stretched image (Mather).

(2) False Color Composite (FCC)
• An FCC is a combination of three such bands where each band is assigned to one of three different primary colors (red, green and blue).
• This assignment results in pseudo or false colors in the image and helps in the visual interpretation of the imagery.
• Generally, the longest wavelength band is assigned the red color while the shortest wavelength band is assigned the blue color.
• The other band in the triplet gets the green color.

Cont.
• This arrangement gives rise to what is often called the standard FCC (see figures for FCCs using IRS 1C images), although a large number of FCCs can be generated by varying the band and color combinations.

Cont.
• While choosing the band combination for the FCC, one can make use of the Optimum Index Factor (OIF) to decide on the three-band combination providing maximum information.
• The OIF is used for selecting a subset of features by taking the variance and correlation of the features into consideration.
• Mathematically, OIF is expressed as below (Jensen, 1996):
OIF = Σ_(i=1)^(m) σ_i / Σ_(i=1)^(m−1) Σ_(j=i+1)^(m) |r_ij|
Where:
σ_i: standard deviation for band i
m: number of bands to be selected
r_ij: coefficient of correlation between bands i and j.
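A small sketch of ranking three-band subsets by OIF, assuming Jensen's formulation with the absolute values of the correlation coefficients in the denominator (the synthetic six-band cube is only for illustration):

```python
import numpy as np
from itertools import combinations

def oif(bands):
    """Optimum Index Factor for a list of 2-D band arrays:
    sum of band standard deviations / sum of absolute inter-band correlations."""
    stds = [b.std() for b in bands]
    flat = [b.ravel().astype(np.float64) for b in bands]
    corrs = [abs(np.corrcoef(flat[i], flat[j])[0, 1])
             for i, j in combinations(range(len(bands)), 2)]
    return sum(stds) / sum(corrs)

# Rank all 3-band subsets of a synthetic 6-band image by OIF.
cube = [np.random.randint(0, 256, (50, 50)).astype(np.float64) for _ in range(6)]
ranked = sorted(combinations(range(6), 3),
                key=lambda idx: oif([cube[i] for i in idx]), reverse=True)
print("best band triplet:", ranked[0])
```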
Cont.
• In the OIF expression:
  • the numerator gives the overall variation in the data;
  • the denominator gives the overall duplication of information.
• So, the bands which have higher standard deviation and less correlation will have a higher OIF.
• The feature subsets are ranked according to their OIF values, and the subset which has the highest OIF is selected as the optimum band subset.

Spatial operations
Operator
The output pixel is a weighted combination of the gray values of pixels in the neighborhood of the input pixel, hence the term local neighborhood operations. The size of the neighborhood and the pixel weights determine the action of the operator O:
g(x, y) = O{f(x ± Δx, y ± Δy)}

Linear transformation
The output at a pixel point x is a linear combination of the original values at some of the points close to x, for example:
g₂(x) = (1/4)·g₁(x − 1) + (1/2)·g₁(x) + (1/4)·g₁(x + 1)
Operator O = [1/4  1/2  1/4]
Correlation operations

1-D local neighborhood operators:
• Consider the following process that acts on a 1-D input function f[x] that is defined over a continuous domain:
O{f[x]} = ∫_(−∞)^(∞) f(α)·γ(α − x) dα
• This process computes the area of the product of two functions for each output location x: the input f and a second function γ that has been translated (shifted) by a distance x. The result of the operation is determined by the function γ[x].
• The process may be recast in a different form by defining a new variable of integration u ≡ α − x:
∫_(−∞)^(∞) f(α)·γ(α − x) dα = ∫_(−∞)^(∞) f(u + x)·γ(u) du
• Difference from the first expression: the second function γ[u] remains fixed in position and the input function f is shifted by a distance −x. If the amplitude of the function γ is zero outside some interval, then the integral need be computed only over the region where γ[u] ≠ 0. The region where the function γ[x] is nonzero is called the support of γ, and functions that are nonzero over only a finite domain are said to exhibit finite or compact "support."

2-D local correlation operator:
O{f[x, y]} = ∫∫ f(α, β)·γ(α − x, β − y) dα dβ = ∫∫ f(x + u, y + v)·γ(u, v) du dv

• The analogous process for sampled functions requires that the integral be converted to a discrete summation:
g[n, m] = Σ_i Σ_j f[i, j]·γ[i − n, j − m] = Σ_i Σ_j f[i + n, j + m]·γ[i, j]
• Correlation scales the shifted function by the values of the matrix γ, and thus computes a weighted average of the input image f[n, m].
• The operation defined by this last equation is also called the cross-correlation of the image with the window function γ[n, m], represented by f[n, m] ⋆ γ[n, m].
• For a 3 × 3 window:
γ[n, m] =
γ(−1, 1)   γ(0, 1)   γ(1, 1)
γ(−1, 0)   γ(0, 0)   γ(1, 0)
γ(−1, −1)  γ(0, −1)  γ(1, −1)

Examples: Correlation or cross-correlation
[1] with [a b c] gives [a b c]
[a b c] with [1] gives [c b a]
(the order of the operands matters: cross-correlation is not commutative)
Convolution operations
• Consider the following process that acts on a 1-D input function f[x] that is defined over a continuous domain:
g(x) = f(x) ∗ h(x) = ∫_(−∞)^(∞) f(α)·h(x − α) dα
• where α is a dummy variable of integration. As for the cross-correlation, the function h[x] defines the action of the system on the input f[x]. By changing the integration variable to u ≡ x − α, an equivalent expression for the convolution is found:
g(x) = ∫_(−∞)^(∞) f(α)·h(x − α) dα = ∫_(u=−∞)^(u=+∞) h(u)·f(x − u) du = ∫_(−∞)^(∞) h(α)·f(x − α) dα
• where the dummy variable was renamed from u to α in the last step.
• Note: The roles of f[x] and h[x] have been exchanged between the first and last expressions, which means that the input function f[x] and the system function h[x] can be interchanged.

Convolution of a continuous 2-D function f[x, y] with a system function h[x, y] is defined as:
g(x, y) = f(x, y) ∗ h(x, y) = ∫∫ f(α, β)·h(x − α, y − β) dα dβ = ∫∫ f(x − α, y − β)·h(α, β) dα dβ

For discrete functions, the convolution integral becomes a summation:
g[n, m] = f[n, m] ∗ h[n, m] = Σ_i Σ_j f[i, j]·h[n − i, m − j]
Comparison between the two forms of cross-correlation and convolution

First pair:
g(x, y) = f[x, y] ⋆ γ[x, y] = ∫∫ f(α, β)·γ(α − x, β − y) dα dβ   (cross-correlation)
g(x, y) = f(x, y) ∗ h(x, y) = ∫∫ f(α, β)·h(x − α, y − β) dα dβ   (convolution)
The change of the order of the variables in the first pair says that the function γ is just shifted before multiplying by f in the cross-correlation, while the function h is flipped about its centre (or, equivalently, rotated about the centre by 180°) before shifting.

Second pair:
g(x, y) = f[x, y] ⋆ γ[x, y] = ∫∫ f(x + u, y + v)·γ(u, v) du dv   (cross-correlation)
g(x, y) = f(x, y) ∗ h(x, y) = ∫∫ f(x − α, y − β)·h(α, β) dα dβ   (convolution)
The difference in sign of the integration variables says that the input function f is shifted in different directions before multiplying by the system function: γ for cross-correlation and h for convolution. In convolution, it is common to speak of filtering the input f with the kernel h.

Note the difference in algebraic sign of the action of the kernel h[n, m] in convolution and the window γ_ij in correlation:
g[n, m] = f[n, m] ⋆ γ[n, m] = Σ_i Σ_j f[i, j]·γ[i − n, j − m]   (cross-correlation)
g[n, m] = f[n, m] ∗ h[n, m] = Σ_i Σ_j f[i, j]·h[n − i, m − j]   (convolution)

This form of the convolution results in the very useful property that convolution of an impulse function δ[n, m] with a system function h[n, m] yields the system function:
δ[n, m] ∗ h[n, m] = h[n, m]
where δ[n, m] ≡ 1 for n = m = 0, and 0 otherwise.
Convolution operation
• When h[n] is convolved with an input image that is 1 at one pixel and zero elsewhere, the output is a replica of h[n] centered at the location of the impulse.
• Discrete convolution is linear because it is defined by a weighted sum of pixel gray values.

Example: 1-D convolution
• In the signal/image processing area, h[n, m] is called the impulse response instead of the kernel, since it is the output of convolution with an impulse.
• In optics it is called the point spread function (psf). The psf is often nonzero only in a finite neighborhood, e.g., 3 × 3 or 5 × 5.
Example: 2-D convolution

Convolution operation: Consider the functions f(x) and g(x) as shown in Figure (a) and (b), respectively.
The function g(x − α) must be formed from g(α) before the integration is carried out. g(x − α) is formed in two steps, illustrated in Figure (c) and (d):
Step 1: Simply fold (flip) g(α) about the origin to get g(−α), and
Step 2: Shift: the function is shifted by a distance x.
Step 3: Multiply: for any value of x, the function f(α) is multiplied by g(x − α) and the product is integrated from −∞ to ∞. The product of f(α) and g(x − α) is the shaded portion of Figure (e).
This figure is valid for 0 ≤ x ≤ 1. The product is 0 for values of f(α) outside the interval (0, x). So f(x) ∗ g(x) = x/2, which is simply the area of the shaded region.

f(x) ∗ g(x) =
  x/2,      0 ≤ x ≤ 1
  1 − x/2,  1 ≤ x ≤ 2
  0,        otherwise

Figure: Convolution operation (Annadurai and Shanmugalakshmi)
Discrete convolution (Annadurai and Shanmugalakshmi)
• Convolution is a process in which the second function is folded (mirror image) and slid over the first function, and at every point the area is computed, or the sum of products is obtained.
• The three steps 1 to 3 illustrate how to fold the second function and slide it right by one position. The function h(1 − m, −n) can then be used to slide over the first function to get the convolved final function or image.
• In general, the convolution of two arrays of sizes (M1, N1) and (M2, N2) yields an array of size (M1 + M2 − 1) × (N1 + N2 − 1).
• In this example, the convolution yields an array of size (2 + 2 − 1) × (3 + 2 − 1), that is, 3 × 4. The various elements of the convolved array of size (3 × 4) are obtained in Figures (c)–(n). The elements in the final convolved array [Figure (o)] are denoted y(m, n), where m takes the values 0, 1, 2 and n takes the values 0, 1, 2 and 3. The first column elements are denoted y(0, 0), y(0, 1) and y(0, 2) and are given in Figures (c)–(e).

Figures: the functions x(m, n) and h(m, n) used for convolution; steps to obtain the function h(1 − m, −n); element y(0, 1) from h(1, −1) in panel (c).

Convolution at the edge of the images
1. consider any pixel in the neighborhood that would extend off the image to have gray value "0";
2. consider pixels off the edge to have the same gray value as the edge pixel;
3. consider the convolution in any such case to be undefined; and
4. define any pixels over the edge of the image to have the same gray value as pixels on the opposite edge (repeat version of the image): the image is assumed to be periodic, i.e., f[n, m] = f[n + kN, m + cM], where N and M are the numbers of pixels in a row or column, respectively, and k, c are integers.
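A direct (unoptimized) implementation of the discrete convolution sum with edge option 1 above (off-image pixels taken as zero); note the 180° flip of the kernel, which distinguishes convolution from correlation:

```python
import numpy as np

def convolve2d(f, h):
    """Discrete 2-D convolution g[n,m] = sum_i sum_j f[i,j] * h[n-i, m-j],
    with off-image pixels treated as gray value 0 (edge option 1)."""
    H, W = f.shape
    kh, kw = h.shape
    h_flipped = h[::-1, ::-1]                     # convolution flips the kernel by 180 degrees
    pad_y, pad_x = kh // 2, kw // 2
    fp = np.pad(f.astype(np.float64), ((pad_y, pad_y), (pad_x, pad_x)), mode="constant")
    g = np.zeros((H, W))
    for n in range(H):
        for m in range(W):
            g[n, m] = np.sum(fp[n:n + kh, m:m + kw] * h_flipped)
    return g

image = np.random.randint(0, 256, (6, 6))
mean_kernel = np.full((3, 3), 1 / 9)              # symmetric kernel, so flipping has no effect here
smoothed = convolve2d(image, mean_kernel)
```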
Convolution/correlation examples
• A moving operator must be reflected before application. For a one-dimensional example with the column [x; y] and the operator [a b]:
[x; y] ∗ [a b] = [x; y] ⋆ [b a] = [ax bx; ay by]
• If the operator is two dimensional, then it must be reflected both horizontally and vertically (equivalently, rotated about its centre by 180°) before application; for example, the reflected version of [ax bx; ay by] is [by ay; bx ax].
• For A, B, C as operators and G an image, convolution is associative:
A ∗ (B ∗ G) = (A ∗ B) ∗ G
A ∗ (B ∗ (C ∗ G)) = A ∗ ((B ∗ C) ∗ G) = (A ∗ (B ∗ C)) ∗ G = ((A ∗ B) ∗ C) ∗ G
     



Convolution Operation
• It consists of moving an imaginary kernel or window of specified size throughout the image and computing the resultant DN value at each window position, obtained by multiplying the corresponding kernel coefficients and pixel values and adding all the resultant products.
• The resultant value is written in a new image corresponding to the central location of the window.
• Depending upon the size of the window/kernel and its coefficients, a given convolution operation will yield a different result for image enhancement.
• The entire convolution operation is applied sequentially: after completing an entire row of operations, the window is moved downwards, and the entire operation is repeated until we complete the same for the whole image.
• Many operations in DIP are implemented as convolution operations, such as edge detection and spatial filtering involving high and low pass filters.

Figure: Convolution operation using a moving window concept.

Spatial Filtering
• Spatial filtering is a context dependent operation (which depends upon the properties of neighboring pixels) that alters the GL of a pixel according to its relationship with the GLs of other pixels in the immediate vicinity (Schowengerdt, 1980).
• Spatial frequency is defined as the number of changes in brightness value per unit distance for any particular part of an image.
• Low frequency area: there are few changes in GL per unit area.
• High frequency area: there are many changes in GL per unit area.

Cont.
• The mathematical technique for separating an image into its various spatial frequency components is called Fourier analysis.
• It is possible to emphasize certain groups of frequencies relative to others and recombine the spatial frequencies to produce an enhanced image.
• Algorithms that perform such enhancements are called filters because they suppress certain frequencies and pass others.
• Filters that pass high frequencies, and hence emphasize fine details and edges, are high pass filters (HPF).
• Similarly, low pass filters (LPF) suppress high frequency content and emphasize gradual changes.

Cont.
• Three general types of filters, which can be combined to form more complex filters, are:
a) Low pass filter (LPF)
b) High pass filter (HPF)
c) Band pass filter (BPF)
LPF: smooths details in an image and reduces the GL range (image contrast).
HPF: enhances details; produces a relatively narrow histogram centered at zero grey level.
Pure BPF: these filters do not have general application in image processing; they are primarily used to suppress periodic noise.

Low Pass Filters
• The choice of a particular type of low pass filter, also sometimes known as smoothing filters as they smooth high frequency information/noise, depends on the image type and the purpose. These arrays, called noise cleaning masks, are normalized to unit weighting so that the noise-cleaning process does not introduce an amplitude bias in the processed image.
• A few such linear filters are:

a) Mean Filter: The size and shape of the window over which the mean is computed can be selected. For a 3 × 3 window, typical filter weights or coefficients are:
(a) square shaped:
1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
(b) plus shaped:
0    1/5  0
1/5  1/5  1/5
0    1/5  0

b) Weighted Mean Filter: A weighted mean is often used in which the weight for a pixel is related to its distance from the center point. For 3 × 3 windows, the weights may be given as:
(a) square shaped:
1/16 1/8 1/16
1/8  1/4 1/8
1/16 1/8 1/16
(b) plus shaped:
0    1/6  0
1/6  1/3  1/6
0    1/6  0
(c) parametric low pass, with parameter b ≥ 2:
(1/(b + 2)²) ×
1  b   1
b  b²  b
1  b   1
Cont.
(c) Gaussian smoothing:
• The image is smoothed by assuming that the grey level values are distributed as a Gaussian function.
• This smoothing filter is characterized by the following equation, where x and y are the variables of the image function and σ is the standard deviation of the Gaussian function:
G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))

Properties of the Gaussian filter
(a) In two dimensions the Gaussian function is rotationally symmetric. This means that the amount of smoothing performed by the filter will be the same in all directions. In general, the edges in an image will not be oriented in some direction that is known in advance. Consequently, there is no reason to smooth more in one direction than in another. The property of rotational symmetry implies that a Gaussian smoothing filter will not bias subsequent edge detection in any particular direction.
(b) The Gaussian function has a single lobe. This means that the Gaussian filter smooths by replacing each image pixel with a weighted average of neighboring pixels, such that the weight given to a neighbor decreases monotonically with distance from the center pixel. This property is important since an edge is a local feature in an image, and a smoothing operation that gives more significance to pixels farther away will distort the features.
(d) The Fourier transform of a Gaussian has a single lobe in the frequency spectrum. This property is a straightforward corollary of the fact that the Fourier transform of a Gaussian is itself a Gaussian. Images are often corrupted by undesirable high frequency signals (noise and fine texture), whereas the desirable image features have components at both low and high frequencies. The single lobe in the Fourier transform of a Gaussian means that the smoothed image will not be corrupted by contributions from unwanted high frequency signals, while most of the desirable signals will be retained.

(e) The width, and hence the degree of smoothing, of a Gaussian filter is parameterized by σ, and the relationship between σ and the degree of smoothing is very simple. A larger σ implies a wider Gaussian filter and greater smoothing. A user can adjust the degree of smoothing to achieve a compromise between excessive blur of the desired image features (too much smoothing) and excessive undesired variations in the smoothed image due to noise and fine texture (too little smoothing).

(f) Large Gaussian filters can be implemented very efficiently because Gaussian functions are separable. Two-dimensional Gaussian convolution can be performed by convolving the image with a one-dimensional Gaussian and then convolving the result with the same one-dimensional filter oriented orthogonally to the Gaussian used in the first stage. The amount of computation required for a 2-dimensional Gaussian filter thus grows linearly with the width of the filter mask instead of quadratically.
Design of Gaussian filters
• The mask weights can be directly calculated from the discrete distribution, where c is a normalizing constant:
g(i, j) = c · exp(−(i² + j²) / (2σ²))
• Rewriting:
g(i, j)/c = exp(−(i² + j²) / (2σ²))
• and choosing a value of σ², we can evaluate it over an n × n window to obtain a kernel for which the value at [0, 0] is equal to 1.
• Finally, to obtain integer weights, the real kernel coefficients can be scaled so that the corner kernel coefficients become 1. Now the weights of the kernel will not sum to 1. Therefore, while performing the convolution, the output pixels must be normalized by the sum of the filter mask.

For n = 7 and σ² = 2, the above expression yields:
(i, j)  -3     -2     -1     0      1      2      3
-3      0.011  0.039  0.082  0.105  0.082  0.039  0.011
-2      0.039  0.135  0.287  0.368  0.287  0.135  0.039
-1      0.082  0.287  0.606  0.779  0.606  0.287  0.082
 0      0.105  0.368  0.779  1.000  0.779  0.368  0.105
 1      0.082  0.287  0.606  0.779  0.606  0.287  0.082
 2      0.039  0.135  0.287  0.368  0.287  0.135  0.039
 3      0.011  0.039  0.082  0.105  0.082  0.039  0.011

Normalizing and converting to integer form results in the following matrix, with the normalizing coefficient equal to the summation of all terms (= 1115):
(i, j)  -3  -2  -1  0   1   2   3
-3      1   4   7   10  7   4   1
-2      4   12  26  33  26  12  4
-1      7   26  55  71  55  26  7
 0      10  33  71  91  71  33  10
 1      7   26  55  71  55  26  7
 2      4   12  26  33  26  12  4
 3      1   4   7   10  7   4   1
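A sketch of the kernel design procedure above for n = 7 and σ² = 2; because of rounding, the integer weights may differ by ±1 from the table (e.g. 90 vs. 91 at the centre):

```python
import numpy as np

def gaussian_kernel(n=7, sigma2=2.0):
    """Build an n x n Gaussian mask g(i,j) = exp(-(i^2 + j^2) / (2*sigma^2)),
    then scale so the corner coefficient becomes ~1 and round to integer weights."""
    r = n // 2
    i, j = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing="ij")
    g = np.exp(-(i**2 + j**2) / (2.0 * sigma2))
    integer_kernel = np.round(g / g[0, 0]).astype(int)   # corner value g[-r,-r] scaled to 1
    return integer_kernel, integer_kernel.sum()

kernel, norm = gaussian_kernel()
print(kernel)   # integer weights close to the table above (rounding may differ slightly)
print(norm)     # normalizing coefficient used to divide the convolution output
```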

Non-linear filters
• The linear processing techniques described previously perform reasonably well on images with continuous noise, such as additive uniform or Gaussian distributed noise.
• However, linear filters tend to provide too much smoothing for impulse-like noise. Nonlinear techniques often provide a better trade-off between noise smoothing and the retention of fine image detail.

(d) Mode or Majority Filter: In this filter, a pixel is replaced by its most common neighbor. It is useful particularly in coded images such as classification maps. Averaging labels does not make sense, but mode filters may clean up isolated noise points.

(e) Median Filter:
• This is a non-linear filter that replaces a pixel value by the median of its neighboring pixels.
• Conceptually the median filter is simple, but it is computationally expensive to implement because of the required sorting.
• However, it is one of the edge preserving noise smoothing filters.
• This filter is quite good at removing salt-and-pepper and impulse noise while retaining image details, because the result does not depend on values which are significantly different from typical values in the neighborhood.
Salient features of median filters
1. It reduces the variance of the intensities in the image. Thus, it has the capability to significantly
alter the image texture.
2. Intensity oscillations with a period less than the window width are smoothened. This property
is significant when considering multipass implementation of a fixed size median filter. In
general, regions unchanged by the filter in a given pass are left unchanged in future passes.
3. Median filter will change the image intensity mean value if the spatial noise distribution in the
image is not symmetrical within the window.
4. Median filter preserve certain edge shapes.
5. Given a symmetrical window shape, the median filter preserves the location of edges.
6. No new GL values are generated.
7. The shape chosen for a median filter may affect the processing results.

Median filtering is one special case of rank filters. Another possibility is the maximum or minimum of cells in the neighborhood; these are called maxima-minima filters.
The minimum filter works when the noise is primarily of salt type (high values) and the maximum filter works for pepper type noise (low values). Large size filters give a painted effect.

Figure: Median filtering on one-dimensional test signals; median filtering on an image (Pratt).
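A minimal median filter sketch (edge pixels handled by replication, one of several possible border conventions):

```python
import numpy as np

def median_filter(f, size=3):
    """size x size median filter; edge pixels are handled by edge replication."""
    r = size // 2
    fp = np.pad(f, r, mode="edge")
    out = np.empty_like(f)
    H, W = f.shape
    for y in range(H):
        for x in range(W):
            out[y, x] = np.median(fp[y:y + size, x:x + size])
    return out

noisy = np.full((7, 7), 100, dtype=np.uint8)
noisy[3, 3] = 255                       # single salt (impulse) pixel
print(median_filter(noisy)[3, 3])       # impulse removed -> 100
```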

Examples of various filters
Figure: Original image; mean filtered (size 3 × 3); mean filtered (size 5 × 5); median filtered (size 3 × 3).
• The mean filtered image appears smoothed after application of the 3 × 3 filter, and smoother still after application of the 5 × 5 filter.
• The median filtered image is similar to the 3 × 3 mean filtered image but relatively less smooth.
Cont.
(e) Outlier noise cleaning algorithm:
• A simple outlier noise cleaning technique in which each pixel is compared to the average of its eight neighbors. If the magnitude of the difference is greater than some threshold level, the pixel is judged to be noisy and is replaced by its neighborhood average; otherwise it is left unchanged. With the neighborhood labelled
O1 O2 O3
O8 X  O4
O7 O6 O5
the rule is:
if |X − (1/8)·Σ_(i=1)^(8) O_i| > Threshold then X = (1/8)·Σ_(i=1)^(8) O_i, else no change (endif).
• The eight-neighbor average can be computed by convolution of the observed image with the impulse response array:
H = (1/8) ×
1 1 1
1 0 1
1 1 1

(f) K-nearest Neighbor Filter:
• This is another filter used in edge preserving smoothing, in which the central pixel in the image window is replaced by the average of the K pixels which are closest to the central pixel in this window.
• A typical value of K is 4 in a 3 × 3 square window.

(g) Sigma Filter:
• This filter sets the central window pixel equal to the average of all pixels in its neighborhood whose values are within K counts of the central pixel value, where K is an adjustable parameter.
• It is called a sigma filter because the parameter K may be derived from the sigma or standard deviation of the pixel values in the window.
• This filter is like the K-nearest neighbor filter.

Figure: Noise cleaning with the outlier algorithm on the noisy test images (Pratt).
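A sketch of the outlier noise cleaning rule above; the threshold value is illustrative and would normally be chosen from the noise statistics:

```python
import numpy as np

def outlier_clean(f, threshold=20.0):
    """Outlier noise cleaning: replace a pixel by its eight-neighbor average when it
    differs from that average by more than the threshold (threshold is illustrative)."""
    H = np.array([[1, 1, 1],
                  [1, 0, 1],
                  [1, 1, 1]], dtype=np.float64) / 8.0
    fp = np.pad(f.astype(np.float64), 1, mode="edge")
    out = f.astype(np.float64).copy()
    rows, cols = f.shape
    for y in range(rows):
        for x in range(cols):
            avg = np.sum(fp[y:y + 3, x:x + 3] * H)     # eight-neighbor average
            if abs(out[y, x] - avg) > threshold:
                out[y, x] = avg                        # judged noisy: replace by the average
    return out

img = np.full((5, 5), 120.0)
img[2, 2] = 250.0                                      # isolated noisy pixel
print(outlier_clean(img)[2, 2])                        # replaced by 120.0
```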

Contra-harmonic mean filter (CHM)
It works well for images containing salt or pepper type noise, depending upon the filter order R. Negative values of R eliminate salt-type noise; positive values of R remove pepper noise.

  CHM = \frac{\sum_{(r,c)\in W} d(r,c)^{R+1}}{\sum_{(r,c)\in W} d(r,c)^{R}}

Geometric mean filter
It is best for Gaussian noise and retains details better than the arithmetic mean filter (AMF).

  GM = \left[ \prod_{(r,c)\in W} I(r,c) \right]^{1/N^2}

Harmonic mean filter
Works well with pepper noise and fails with salt noise. It also works well with Gaussian noise.

  HM = \frac{N^2}{\sum_{(r,c)\in W} \frac{1}{d(r,c)}}

Adaptive filter (minimum mean-square error filter, MMSE)

  MMSE = d(r,c) - \frac{\sigma_n^2}{\sigma_l^2}\left[ d(r,c) - m_l(r,c) \right]

where
  \sigma_n^2 = noise variance,
  \sigma_l^2 = local variance (in the window under consideration),
  m_l = local mean.
One has to input the window size W and \sigma_n^2.

Here d is the degraded image, I is the original image function and n is the additive noise function, with d(r,c) = I(r,c) + n(r,c).
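A minimal sketch (not from the slides) of the contra-harmonic mean filter; the window size, the order R, and the small epsilon guard against division by zero are illustrative assumptions.

```python
import numpy as np

def contra_harmonic_mean(img, size=3, R=1.5):
    """Contra-harmonic mean filter: positive R removes pepper noise, negative R removes salt."""
    img = img.astype(float)
    pad = size // 2
    padded = np.pad(img, pad, mode='reflect')
    out = np.empty_like(img)
    eps = 1e-12                                    # avoid 0**negative and 0/0
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            w = padded[r:r + size, c:c + size] + eps
            out[r, c] = np.sum(w ** (R + 1)) / np.sum(w ** R)
    return out

noisy = np.random.randint(0, 256, size=(64, 64))
cleaned = contra_harmonic_mean(noisy, size=3, R=1.5)   # R > 0: attacks pepper-type noise
```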
Alpha-trimmed mean filter
Average of the pixel values in a window, but some of the endpoint-ranked values are excluded:

  \text{Mean}_{TM} = \frac{1}{N^2 - 2T} \sum_{i = T+1}^{N^2 - T} I_i

• where T is the number of pixel values excluded at each end of the ordered set. The ATM filter ranges from the mean filter to the median filter depending upon the value selected for T.
• If T = 0, it is the arithmetic mean filter.
• If T = (N^2 - 1)/2, it is the median filter.
• This filter is useful for images containing multiple types of noise, such as Gaussian and salt-and-pepper noise. A small sketch is given below.

Edge-preserving noise smoothing filters
• Median
• Nagao–Matsuyama filter
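An illustrative sketch of the alpha-trimmed mean filter using scipy.ndimage.generic_filter; the 3 × 3 window and T = 2 are assumed example values.

```python
import numpy as np
from scipy import ndimage

def alpha_trimmed_mean(window, T=2):
    """Mean of the ordered window values with T values dropped at each end.
    T = 0 gives the arithmetic mean; T = (N*N - 1)//2 gives the median."""
    ordered = np.sort(window)
    return ordered[T:window.size - T].mean()

img = np.random.randint(0, 256, size=(64, 64)).astype(float)
atm_filtered = ndimage.generic_filter(img, alpha_trimmed_mean, size=3)   # N = 3, so 0 <= T <= 4
```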

High pass filters
• The high pass filters tend to accentuate high frequency information.
• Edges (discussed after this) are important high pass image information, so all edge detection filters are essentially high pass filters.
• Sometimes one likes to enhance or sharpen the edge content while retaining the original image. This process is called image sharpening or edge enhancement.
• A simple approach to achieve this involves addition of the edge-detected image to the original image.
• A few kernels for high pass filtering are given as:

  H_1 = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 5 & -1 \\ 0 & -1 & 0 \end{bmatrix}, \quad
  H_2 = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 9 & -1 \\ -1 & -1 & -1 \end{bmatrix}, \quad
  H_3 = \begin{bmatrix} 1 & -2 & 1 \\ -2 & 5 & -2 \\ 1 & -2 & 1 \end{bmatrix}
Cont.
• Another filter, known as the high boost filter, is also used for increasing the high frequency information in an image.
• It is created by using a weight K on the original image in a high pass filter operation:

  High boost filtered image = K(original image) − low pass filtered image
                            = (K − 1)(original image) + original image − low pass filtered image
                            = (K − 1)(original image) + high pass filtered image

  The corresponding kernel is

  H = \frac{1}{9}\begin{bmatrix} -1 & -1 & -1 \\ -1 & w & -1 \\ -1 & -1 & -1 \end{bmatrix}, \quad w = 9K - 1 \; (\text{for } K = 1,\; w = 8)

• This operation therefore partially restores the low frequency components lost in the high pass operation.
• If K = 1, the result is the standard high pass filter; if K > 1, the processed image is more like the original image, with a degree of edge enhancement depending on K. (A small sketch of high boost filtering is given after the next subsection.)

Differencing Kernels — High pass Filters
The converse of the statement that local averaging reduces variability is also true:
• Local differencing increases the variance of pixel gray values.
• Local differencing “pushes” the gray values away from the mean (and the new mean may be zero).
A kernel with both positive and negative terms computes differences of neighboring pixels. Adjacent pixels with identical gray levels will tend to cancel, while differences between adjacent pixels will tend to be emphasized. Since high-frequency sinusoids vary over shorter distances, differencing operators will enhance them and attenuate slowly varying (i.e., lower-frequency) terms.
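A hedged sketch of high boost filtering as described above, using a simple box average as the low-pass stage; the test image, the value of K, and the window size are assumptions made for the example.

```python
import numpy as np
from scipy import ndimage

def high_boost(img, K=1.5, size=3):
    """High boost filtering: K*(original) - low-pass = original + (K-1)*original - low-pass."""
    img = img.astype(float)
    low_pass = ndimage.uniform_filter(img, size=size)   # simple box-average low pass
    return K * img - low_pass

img = np.random.randint(0, 256, size=(64, 64)).astype(float)
boosted = high_boost(img, K=1.5)
# K = 1 reduces to the standard high-pass result (original minus low-pass);
# larger K keeps more of the original image with milder edge emphasis.
```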

Discrete derivatives may be implemented via convolution with two specific discrete kernels that compute the differences of a translated replica of an image and the original.
This definition of the derivative effectively “locates” the edge of an object at the pixel immediately to the right of, or above, the “crack” between pixels that is the actual edge.
The output image is equivalent to the difference between an image shifted one pixel to the left and an unshifted image, which may be written in the form

  \frac{\partial f}{\partial x}[x, y] = \lim_{\Delta x \to 0} \left[ \frac{f(x + \Delta x,\, y) - f(x, y)}{\Delta x} \right]

  \frac{\partial f}{\partial x}[x, y] \;\rightarrow\; f\big((n+1)\Delta x,\; m\,\Delta y\big) - f\big(n\,\Delta x,\; m\,\Delta y\big) \;\equiv\; \Delta_x f[n, m] = f[n+1, m] - f[n, m]

because the minimum nonzero value of the translation is Δx = 1 sample.

The corresponding discrete partial derivative in the y-direction is

  \Delta_y f[n, m] = f[n, m+1] - f[n, m]
• A symmetric version of the derivative operators is sometimes used which takes the
difference across two pixels.
• These operators locate the edge of an object between two pixels symmetrically
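A small worked example (not from the original text) of the forward differences Δx and Δy on a toy step-edge image, showing where the nonzero response lands relative to the edge; which pixel is flagged depends on the sign/shift convention chosen.

```python
import numpy as np

# A toy image: a vertical step edge from 0 to 10 between columns 2 and 3.
f = np.zeros((5, 6))
f[:, 3:] = 10.0

# Forward differences implemented by shifting and subtracting,
# following Delta_x f = f[next column] - f[this column] and similarly for rows.
dfdx = np.zeros_like(f)
dfdx[:, :-1] = f[:, 1:] - f[:, :-1]   # difference toward the next column
dfdy = np.zeros_like(f)
dfdy[:-1, :] = f[1:, :] - f[:-1, :]   # difference toward the next row

# With this convention dfdx is nonzero (= 10) only in column 2, i.e. the pixel
# adjacent to the "crack" between pixels where the actual edge lies.
```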

Higher-Order Derivatives
The kernels for higher-order derivatives are easily computed since convolution is associative. The convolution kernel for the 1-D second derivative is obtained by auto-convolving the kernel for the 1-D first derivative, which generates the same image as cascaded first derivatives except for a shift to the right by one pixel. Usually the kernel is translated to the right by one pixel to “center” the weights in a 3 × 3 kernel. The corresponding 2-D second partial derivative kernels may be evaluated by using a 5 × 5 operator.
The derivation may be extended to derivatives of still higher order by convolving kernels to obtain the kernels for the 1-D third and fourth derivatives. Higher-order derivatives are usually translated to reduce the size of the 1-D kernel.

Unsharp masking
• It is the difference of an image and a blurry replica, where the difference operation was originally implemented as the sum of the blurry image and the original photographic negative.
• We can think of unsharp masking in terms of the convolution operators that form the individual component images.
• The sharply focused image is produced by convolution with the identity kernel.
• The blurry image is generated by convolution with the uniform averaging operator.
The difference (unsharp masked) image may be written as a single convolution operator h[x, y] that implements unsharp masking in one step. Note that the sum of the weights is zero, which means that the mean value of the output image will be zero.

[Figure: Derivative operators — their formal (continuous) definitions and corresponding discrete approximations]
• Differentiation is a linear operation, and a discrete approximation of a derivative filter can thus be implemented by the kernel method.
• An important condition to impose on such a filter kernel is that its response be zero in completely smooth regions. This condition can be enforced by ensuring that the weights in the kernel mask sum to zero.
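The following sketch (an assumption-laden illustration, not the author's implementation) performs unsharp masking by subtracting a box-blurred replica and adding the scaled difference back to the original; `size` and `amount` are illustrative parameters.

```python
import numpy as np
from scipy import ndimage

def unsharp_mask(img, size=3, amount=1.0):
    """Unsharp masking: subtract a blurry replica from the image and add the
    (scaled) zero-mean difference back to sharpen the original."""
    img = img.astype(float)
    blurred = ndimage.uniform_filter(img, size=size)   # blurry replica via box averaging
    mask = img - blurred                               # the unsharp 'mask' (detail image)
    return img + amount * mask

img = np.random.rand(64, 64)
sharpened = unsharp_mask(img, size=3, amount=1.0)
```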
Other image sharpening methods
• A different way to construct a sharpener is to add “positive” edge information to the original image.
• Edge information may be determined by applying some variety of edge detector. A very common edge detector is the 2-D second derivative, which is called the Laplacian:

  \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}

  This is the most common form of the Laplacian operator. Note that the weights of its discrete kernel sum to zero, which means that the mean gray value of the image produced by this operator will be zero.
• Consider the result of subtracting the image produced by the Laplacian operator just specified from the original image.
• The convolution kernel may be specified as the difference of the identity and Laplacian operators; the 1-D analogue of the Laplacian sharpener subtracts the 1-D second derivative from the identity. (A small sketch follows.)
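A minimal sketch of the identity-minus-Laplacian sharpener described above; the random test image is an assumption, and the 3 × 3 Laplacian kernel is the common zero-sum form.

```python
import numpy as np
from scipy import ndimage

# Identity-minus-Laplacian sharpening implemented as a single convolution.
laplacian = np.array([[0.,  1., 0.],
                      [1., -4., 1.],
                      [0.,  1., 0.]])
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
sharpen_kernel = identity - laplacian      # [[0,-1,0],[-1,5,-1],[0,-1,0]]

img = np.random.rand(64, 64)
sharpened = ndimage.convolve(img, sharpen_kernel, mode='reflect')
```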

Edge Enhancement
• For many remote sensing applications, the most valuable information derivable from an image is contained in the form of edges surrounding various objects or features of interest.
• An edge is a discontinuity or sharp change in grey scale value at a particular pixel location, which may have a certain interpretation in terms of geological structure or relief (Mather, 1987).
• The intuitive notion about an edge in a digital image is that it occurs at the boundary between two pixels when the respective grey level values of these adjacent pixels are significantly different.
• Edges are generally found because of changes in some physical and surface properties such as illumination, geometry (orientation, depth) and reflectance.
• Since image edges characterize object boundaries, they are useful for segmentation, registration and identification of objects in a scene.
• Thus, edge detection plays an important role in the interpretation of digital images.

Steps involved in Edge Extraction
1. Pre-processing: smooth the image to reduce intensity variations which cause false edges.
2. Edge detection: identify areas of rapid intensity change.
3. Thresholding: eliminate unwanted low-magnitude edges.
4. Thinning: reduce the width of detected edges.
5. Linking: join edge segments into continuous features.
Edge Detection
• An edge in an image is a significant local change in the image intensity, usually associated with a discontinuity in either the image intensity or the first derivative of the image intensity.
• Most edge detection schemes utilize the gradient magnitude of the image data in some manner to decide on the presence of an edge and its importance.
• A few approaches utilize the orientation of the detected edges as secondary information or for further processing.
• If a threshold is used for detection of edges, all points between a and b will be marked as edge pixels. However, by removing points that are not a local maximum in the first derivative, edges can be detected more accurately. This local maximum in the first derivative corresponds to a zero crossing in the second derivative.

Cont.
• The gradient magnitude-based operators can be further divided as follows:
  - Operators approximating derivatives of the image function using differences. Some of them are rotationally invariant. Others, which approximate the first derivative, use several masks.
  - Operators in which the orientation is estimated on the basis of the best matching of several simpler patterns.
  - Operators based on zero-crossings of the second derivative of the image function.

[Figure: Different types of ideal edges]

Directional Derivatives: Gradient
The gradient of a 2-D continuous function f[x, y] constructs a 2-D vector at each coordinate whose components are the x- and y-derivatives:

  \mathbf{g}[x, y] = \nabla f[x, y] = \left[ \frac{\partial f}{\partial x},\; \frac{\partial f}{\partial y} \right]

The gray value f is analogous to terrain “elevation” in a map. In physics, the gradient of a scalar “field” f[x, y] is the product of the vector operator ∇ (pronounced del) and the scalar “image” f, yielding ∇f[x, y]. This process calculates a vector for each coordinate [x, y] whose Cartesian components are ∂f/∂x and ∂f/∂y. Note that the 2-D vector ∇f may be represented in polar form as magnitude |∇f| and direction Φ{∇f}:

  \left| \nabla f[x, y] \right| = \sqrt{ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 }, \qquad
  \Phi\{\nabla f[x, y]\} = \tan^{-1}\!\left( \frac{\partial f / \partial y}{\partial f / \partial x} \right)

The vector points “uphill” in the direction of the maximum “slope” in gray level. In discrete form,

  \mathbf{g}[n, m] = \nabla f[n, m] = \begin{bmatrix} \Delta_x f[n, m] \\ \Delta_y f[n, m] \end{bmatrix}

The magnitude of the gradient is often approximated as the sum of the magnitudes of the components:

  \left| \nabla f[n, m] \right| = \sqrt{ \big( \Delta_x f[n, m] \big)^2 + \big( \Delta_y f[n, m] \big)^2 } \;\approx\; \left| \Delta_x f[n, m] \right| + \left| \Delta_y f[n, m] \right|

• The magnitude |∇f| is the “slope” of the 3-D surface f at pixel [n, m]. The azimuth Φ{∇f[n, m]} defines the compass direction in which this slope points “uphill.”
• The gradient is not a linear operator, and thus can neither be evaluated as a convolution nor described by a transfer function.
• The largest values of the magnitude of the gradient correspond to the pixels where the gray value “jumps” by the largest amount, and thus the thresholded magnitude of the gradient may be used to identify such pixels. In this way, the gradient may be used as an “edge detection operator.”
[Figure (Easton): Example of the discrete gradient operator ∇f[n, m]. (a) The original object is the nonnegative function f[n, m], with amplitude in the interval 0 ≤ f ≤ +1. (b) and (c) The gradient component images: at each pixel the gradient is the 2-D vector with bipolar components [∂f/∂x, ∂f/∂y]. (d) Image of the magnitude |∇f|. (e) Image of the angle Φ = tan⁻¹[(∂f/∂y)/(∂f/∂x)]. The extrema of the magnitude are located at corners and edges in f[n, m].]

(A) First order gradient-based edge detection
• A first-order derivative edge gradient can be generated by two fundamental methods.
• One method involves generation of gradients in two orthogonal directions in an image.
• The other utilizes a set of directional derivatives.
• In the orthogonal differential edge detection techniques, edge gradients are computed in two orthogonal directions, usually along rows and columns, and then the edge direction is inferred by computing the vector sum of the gradients.
(i) Roberts’ gradient operator
• Often the gradient magnitude is approximated by replacing the Pythagorean sum of the derivatives by the sum of their magnitudes:

  \left| \nabla f[x, y] \right| = \sqrt{ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 } \;\approx\; \left| \frac{\partial f}{\partial x} \right| + \left| \frac{\partial f}{\partial y} \right|

• The magnitude of the gradient is always positive, which removes any difficulty from displaying bipolar data.
• The gradient computed from the absolute values of the derivatives will preferentially emphasize outputs where both derivatives are “large,” which will happen for diagonal edges.
• The magnitude gradient will therefore produce larger outputs for diagonal edges than for horizontal or vertical edges. A change of ±1 gray value in the horizontal or vertical direction will produce a gradient magnitude |∇f| = 1 gray value, while changes of ±1 gray value along a 45° diagonal will generate a larger gradient magnitude.
• If the image structure is primarily horizontal or vertical, it may be desirable to replace the kernels for the x- and y-derivatives in the gradient operator by kernels for derivatives across the diagonals (by rotating by ±π/4 radians); the component operators for the Roberts’ gradient are often considered to be the 2 × 2 diagonal-difference kernels listed in the table below.
• A gradient magnitude which responds without preferential emphasis may be generated by summing the outputs of all four derivative kernels. Hence, the “untranslated” rotated operators are preferred, to ensure that edges computed from all four kernels will overlap.
• Because the gradient operators compute gray-level differences, they will generate extreme values (positive or negative) where there are “large” changes in gray level, e.g., at edges. However, differencing operators also generate nonzero outputs due to “noise” in the images from random signals, quantization error, etc. If the variations due to noise are of similar size to the changes in gray level at an edge, identification of edge pixels will suffer.
• Various other gradient-based operators are given in the following table.

Various gradient-based operators

  Operator     Row gradient G_R(i, j)                                                                         Column gradient G_C(i, j)
  Roberts      \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}                                                  \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
  Prewitt      \frac{1}{3}\begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix}                \frac{1}{3}\begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}
  Sobel        \frac{1}{4}\begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}                \frac{1}{4}\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}
  Frei–Chen    \frac{1}{2+\sqrt{2}}\begin{bmatrix} 1 & 0 & -1 \\ \sqrt{2} & 0 & -\sqrt{2} \\ 1 & 0 & -1 \end{bmatrix}   \frac{1}{2+\sqrt{2}}\begin{bmatrix} -1 & -\sqrt{2} & -1 \\ 0 & 0 & 0 \\ 1 & \sqrt{2} & 1 \end{bmatrix}
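As an illustrative sketch (not part of the original slides), the Sobel pair from the table can be applied with SciPy to obtain the gradient magnitude (both the Pythagorean and sum-of-magnitudes forms) and an orientation estimate; the random test image is an assumption, and sign/orientation conventions vary between texts.

```python
import numpy as np
from scipy import ndimage

# Sobel row/column kernels as tabulated above (the 1/4 normalisation only scales the output).
sobel_row = np.array([[ 1, 0, -1],
                      [ 2, 0, -2],
                      [ 1, 0, -1]], dtype=float) / 4.0
sobel_col = np.array([[-1, -2, -1],
                      [ 0,  0,  0],
                      [ 1,  2,  1]], dtype=float) / 4.0

img = np.random.rand(64, 64)
g_row = ndimage.convolve(img, sobel_row, mode='reflect')
g_col = ndimage.convolve(img, sobel_col, mode='reflect')

magnitude = np.hypot(g_row, g_col)             # Pythagorean sum of the two components
magnitude_l1 = np.abs(g_row) + np.abs(g_col)   # cheaper sum-of-magnitudes approximation
direction = np.arctan2(g_col, g_row)           # edge orientation estimate (radians)
```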

Cont.
• A limitation common to edge gradient generation operators is that they are unable to detect edges accurately in highly noisy environments.
• This problem can be alleviated by increasing the size of the window.
• As an example, a Prewitt-type 7 × 7 operator has a row-gradient impulse response of the following form:

  H_R = \frac{1}{21}\begin{bmatrix}
  1 & 1 & 1 & 0 & -1 & -1 & -1 \\
  1 & 1 & 1 & 0 & -1 & -1 & -1 \\
  1 & 1 & 1 & 0 & -1 & -1 & -1 \\
  1 & 1 & 1 & 0 & -1 & -1 & -1 \\
  1 & 1 & 1 & 0 & -1 & -1 & -1 \\
  1 & 1 & 1 & 0 & -1 & -1 & -1 \\
  1 & 1 & 1 & 0 & -1 & -1 & -1
  \end{bmatrix}

Use of masks (Kirsch operator)
• Eight gain-normalized compass gradient impulse response arrays (masks) for the Kirsch operator are defined as:

  East = \begin{bmatrix} 5 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & -3 & -3 \end{bmatrix} \quad
  West = \begin{bmatrix} -3 & -3 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & 5 \end{bmatrix} \quad
  Northeast = \begin{bmatrix} -3 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & 5 & -3 \end{bmatrix} \quad
  Southwest = \begin{bmatrix} -3 & 5 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & -3 \end{bmatrix}

  North = \begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & -3 \\ 5 & 5 & 5 \end{bmatrix} \quad
  South = \begin{bmatrix} 5 & 5 & 5 \\ -3 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix} \quad
  Northwest = \begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & 5 \\ -3 & 5 & 5 \end{bmatrix} \quad
  Southeast = \begin{bmatrix} 5 & 5 & -3 \\ 5 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix}

• The Kirsch gradient at a pixel is

  G(i, j) = \max_{k = 0, \ldots, 7} \left( 5 S_k - 3 T_k \right)

  where S_k = A_k + A_{k+1} + A_{k+2} and T_k = A_{k+3} + A_{k+4} + A_{k+5} + A_{k+6} + A_{k+7}, with A_0, …, A_7 denoting the eight neighbours of the pixel taken in order around the window (subscripts evaluated modulo 8).
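A hedged sketch of the Kirsch compass-gradient idea: the eight masks are generated by rotating the three 5-weights around the 3 × 3 border ring, and the per-pixel edge magnitude is the maximum response over orientations. How the mask indices map onto the compass names above depends on the starting orientation chosen here.

```python
import numpy as np
from scipy import ndimage

def kirsch_masks():
    """Generate the eight compass masks by rotating three 5's around the 3x3 border ring."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]  # clockwise border
    weights = np.array([5, 5, 5, -3, -3, -3, -3, -3], dtype=float)
    masks = []
    for shift in range(8):
        m = np.zeros((3, 3))                      # centre weight stays 0
        for (r, c), w in zip(ring, np.roll(weights, shift)):
            m[r, c] = w
        masks.append(m)
    return masks

img = np.random.rand(64, 64)
responses = np.stack([ndimage.convolve(img, m, mode='reflect') for m in kirsch_masks()])
edge_magnitude = responses.max(axis=0)      # strongest compass response per pixel
edge_direction = responses.argmax(axis=0)   # index (0-7) of the winning orientation
```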
Cont.
• The compass names indicate the slope direction of maximum response.
• The edge magnitude is given by the maximum response obtained after convolving all eight masks with the image and choosing the one with the highest value.
• The edge angle is determined by the direction of the largest gradient.

(B) Second order gradient-based edge detection
• Second order derivative edge detection techniques employ some form of spatial second order differentiation.
• An edge is marked if a significant spatial change occurs in the second derivative.
• Various well-known operators under this category are:
  - Laplacian
  - Laplacian of Gaussian (LoG), etc.
• The digital implementation of the Laplacian operator is given by the following mask:

  H = \frac{1}{4}\begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix}
Zero crossing
• Generally, an edge occurs at the zero-crossing pixel between the negative and positive value Laplacian responses.
• In the case of a step edge, the zero-crossing lies midway between the neighboring negative and positive response pixels; the edge is then correctly marked at the pixel to the right of the zero crossing.

Laplacian of the image
However, the simple second-difference approximation is centered about the pixel [i, j + 1]. Therefore, by replacing j with j − 1, we obtain the desired approximation to the second partial derivative centered about [i, j]; similarly for the other direction. By combining these two equations into a single operator, the following mask can be used to approximate the Laplacian:

  \nabla^2 \approx \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}

Sometimes it is desired to give more weight to the center pixels in the neighborhood. An approximation to the Laplacian which does this is

  \nabla^2 \approx \begin{bmatrix} 1 & 4 & 1 \\ 4 & -20 & 4 \\ 1 & 4 & 1 \end{bmatrix}

(the corner weights are chosen so that the mask weights again sum to zero).
Laplacian of Gaussian (LoG)
This operator was introduced by Marr and Hildreth (“Theory of Edge Detection,” Proc. Royal Soc. London B207, pp. 187–217, 1980). It blurs the image with a Gaussian average and then applies a Laplacian edge detector; such operators are sometimes called “Marr and Hildreth operators.” The output may be written as the Laplacian of the Gaussian-smoothed image,

  g[x, y] = \nabla^2 \big( h[x, y] * f[x, y] \big)

where h[x, y] is the impulse response of the Gaussian function with standard deviation σ;
• the value of σ may be selected as a free parameter; the larger the value of σ, the wider the averaging of the Gaussian.

The LoG edge detector has the following properties:
(a) The smoothing filter is Gaussian.
(b) The detection step is the second derivative (the Laplacian in two dimensions).
(c) The detection criterion is the presence of a zero crossing in the second derivative with a corresponding large peak in the first derivative.
(d) The edge location can be estimated with sub-pixel resolution using linear interpolation.

Observations for the Marr and Hildreth operator
(1) In natural images, features of interest occur at a variety of scales. No single operator can function at all of these scales, so the results of operators at each of many scales should be combined.
(2) A natural scene does not appear to consist of diffraction patterns or other wave-like effects, and so some form of local averaging (smoothing) must take place.
(3) The optimal smoothing filter that matches the observed requirements of biological vision (smooth and localized in the spatial domain, and smooth and band-limited in the frequency domain) is the Gaussian.
(4) When a change in the intensity (an edge) occurs, there is an extreme value in the first derivative of the intensity. This corresponds to a zero-crossing in the second derivative.
(5) The orientation-independent differential operator of lowest order is the Laplacian.

Since both operations (Laplacian and Gaussian blur) are implemented by convolution, we can get the same result by applying the Laplacian to the Gaussian kernel first, thus producing a single kernel for the entire operation. One can consider a 1-D example with continuous (non-sampled) coordinates and then extend it to 2-D; the 1-D and 2-D kernel expressions appear in the original figures. This operator is commonly called the Mexican hat operator. The sampled version of the impulse response is generally generated in a fairly large array, say 9 × 9 or larger (e.g., a 9 × 9 approximation of the LoG kernel with σ = 1.4).

Thus, the following two methods are mathematically equivalent:
(a) Convolve the image I with a Gaussian smoothing filter and compute the Laplacian of the convolved image; call this L. Edge pixels are those for which there is a zero-crossing in L.
(b) Convolve the image with the LoG filter.

If the first method is adopted, then a Gaussian smoothing mask must be used. The slope at the zero crossing depends on the contrast of the image across the edge. The problem with combining this information in edge detection is that the operator detects edges at a particular resolution; to obtain the real edges in the image, it may be necessary to combine the information from operators at several window sizes. The size of the filter depends on the width of the central excitatory region (w) of the operator. This width is given by

  w = 2\sqrt{2}\,\sigma

where σ is the space constant (standard deviation) of the Gaussian. As the value of σ increases, the width of the central excitatory region also increases, resulting in a larger mask. The overall support of the filter is given by 3w (about 6σ).

[Figure: A Gaussian function fitted over a ramp edge model and the corresponding zero crossing; 1-D and 2-D LoG for σ = 2]
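A minimal sketch of methods (a)/(b) above, using SciPy's gaussian_laplace followed by a simple sign-change test; a practical detector would additionally reject weak zero crossings (those with a small first-derivative slope). The test image and σ are assumptions.

```python
import numpy as np
from scipy import ndimage

def log_zero_crossings(img, sigma=2.0):
    """Marr-Hildreth style detection: LoG-filter the image, then mark sign changes."""
    log = ndimage.gaussian_laplace(img.astype(float), sigma=sigma)
    sign = log > 0
    zc = np.zeros_like(sign)
    # Flag a pixel when it and its right or lower neighbour have opposite LoG signs.
    zc[:, :-1] |= sign[:, :-1] != sign[:, 1:]
    zc[:-1, :] |= sign[:-1, :] != sign[1:, :]
    return zc

img = np.random.rand(128, 128)
edges = log_zero_crossings(img, sigma=2.0)
```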

Difference of Gaussians (DoG)
• LoG filters usually involve large convolutions and are therefore very slow.
• To increase the speed of convolution, the LoG operator is approximated by a difference of two Gaussian (DoG) functions with different space constants. Marr and Hildreth (Haralick, 1984) found that a suitable ratio σ₂/σ₁, where σ₁ < σ₂, provides a good approximation of the DoG to the LoG operator.
• For the same excitatory region, a DoG operator requires slightly larger support than the LoG operator. A support of 3w is sufficient for the LoG, while 4w is required for the DoG (Huertas et al., 1986).

[Figure: 5 × 5 and 17 × 17 LoG masks; a difference of Gaussians (DoG) operation]
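An illustrative DoG sketch; the two space constants σ₁ < σ₂ used here are example values only.

```python
import numpy as np
from scipy import ndimage

def difference_of_gaussians(img, sigma1=1.0, sigma2=1.6):
    """Approximate the LoG by subtracting a wider Gaussian blur from a narrower one (sigma1 < sigma2)."""
    img = img.astype(float)
    return ndimage.gaussian_filter(img, sigma1) - ndimage.gaussian_filter(img, sigma2)

img = np.random.rand(128, 128)
dog = difference_of_gaussians(img, sigma1=1.0, sigma2=1.6)
# Zero crossings of `dog` can then be located exactly as for the LoG output.
```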


Canny edge detector
• It is the first derivative of a Gaussian and closely approximates the operator that optimizes the product of the S/N ratio and localization.
• The Canny edge detection algorithm is summarized by the following notation. Let I[i, j] denote the image. The result of convolving the image with a Gaussian smoothing filter using separable filtering is an array of smoothed data:

  S[i, j] = G[i, j; \sigma] * I[i, j]

• where σ is the spread of the Gaussian and controls the amount of smoothing. The gradient of the smoothed array S[i, j] can be computed using 2 × 2 first-difference approximations to produce two arrays P[i, j] and Q[i, j] for the x and y partial derivatives:

  P[i, j] \approx \big( S[i, j+1] - S[i, j] + S[i+1, j+1] - S[i+1, j] \big) / 2
  Q[i, j] \approx \big( S[i, j] - S[i+1, j] + S[i, j+1] - S[i+1, j+1] \big) / 2

• The finite differences are averaged over the 2 × 2 square so that the x and y partial derivatives are computed at the same point in the image.
• The magnitude and orientation of the gradient can then be computed:

  M[i, j] = \sqrt{ P[i, j]^2 + Q[i, j]^2 }
  \theta[i, j] = \arctan\big( Q[i, j],\, P[i, j] \big)

Steps for Canny edge detection:
1. Smooth the image with a Gaussian filter.
2. Compute the gradient magnitude and orientation using finite-difference approximations for the partial derivatives.
3. Apply non-maxima suppression to the gradient magnitude.
4. Use a double thresholding algorithm to detect and link edges.

• Non-maxima suppression thins the ridges of gradient magnitude in M by suppressing all values along the line of the gradient that are not peak values of a ridge.
• The magnitude image array M[i, j] will have large values where the image gradient is large, but this is not sufficient to identify the edges, since the problem of finding locations in the image array where there is rapid change has merely been transformed into the problem of finding locations in the magnitude array M[i, j] that are local maxima.
• To identify edges, the broad ridges in the magnitude array must be thinned so that only the magnitudes at the points of greatest local change remain. This process is called non-maxima suppression, which in this case results in thinned edges.
• The algorithm begins by reducing the angle of the gradient θ[i, j] to one of the four sectors shown in the figure below:

  \xi[i, j] = \text{Sector}\big( \theta[i, j] \big)

• The algorithm passes a 3 × 3 neighborhood across the magnitude array M[i, j].
• At each point, the centre element M[i, j] of the neighbourhood is compared with its two neighbours along the line of the gradient given by the sector value ξ[i, j] at the centre of the neighborhood.
• If the magnitude array value M[i, j] at the centre is not greater than both of the neighbour magnitudes along the gradient line, then M[i, j] is set to zero.
• This process thins the broad ridges of gradient magnitude in M[i, j] into ridges that are only one pixel wide. The values for the height of the ridge are retained in the non-maxima suppressed magnitude:

  N[i, j] = \text{NMS}\big( M[i, j],\, \xi[i, j] \big)

The above equation denotes the process of non-maxima suppression. The nonzero values in N[i, j] correspond to the amount of contrast at a step change in the image intensity. In spite of the smoothing performed as the first step in edge detection, the non-maxima-suppressed magnitude image N[i, j] will contain many false edge fragments caused by noise and fine texture. The contrast of the false edge fragments is small.

[Figure: The partition of the possible gradient orientations into sectors for non-maxima suppression. There are four sectors, numbered 0 to 3, corresponding to the four possible combinations of elements in a 3 × 3 neighborhood that a line must pass through as it passes through the center of the neighborhood. The divisions of the circle of possible gradient line orientations are labeled in degrees.]
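A simplified non-maxima suppression sketch (illustrative only; it quantises the gradient angle to four directions rather than reproducing the exact sector convention of the figure). Each magnitude value is kept only if it is at least as large as its two neighbours along the quantised gradient direction.

```python
import numpy as np

def non_maxima_suppression(M, theta):
    """Keep M[i, j] only if it is >= both neighbours along the quantised gradient direction.
    M is the gradient magnitude, theta the gradient angle in radians (e.g. from arctan2)."""
    sector = (np.round(theta / (np.pi / 4)) % 4).astype(int)   # 0 = horizontal, 2 = vertical, 1/3 = diagonals
    offsets = {0: (0, 1), 1: (-1, 1), 2: (-1, 0), 3: (-1, -1)}
    N = np.zeros_like(M)
    rows, cols = M.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            di, dj = offsets[sector[i, j]]
            if M[i, j] >= M[i + di, j + dj] and M[i, j] >= M[i - di, j - dj]:
                N[i, j] = M[i, j]
    return N
```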
• Thresholding: After non-maxima suppression, a threshold is applied and all values below the threshold are changed to zero. The result may still have problems due to an improper choice of threshold.
• Therefore, a double threshold procedure may be applied. It involves taking the non-maxima suppressed image N and applying two thresholds, t1 and t2, with t2 ≈ 2·t1, to produce two thresholded images T1 and T2. Since T2 was formed with the higher threshold, it will contain fewer false edges, but T2 may have gaps in its contours.
• The double thresholding algorithm links the edges in T2 into contours. When it reaches the end of a contour, the algorithm looks in T1 at the locations of the 8-neighbours for edges that can be linked to the contour. The algorithm continues to gather edges from T1 until the gap has been bridged. Linking edges in this way, as a by-product of the thresholding, resolves some of the problems with choosing a single threshold.

[Figure: Application of edge detection operators on an image — first derivative in the X direction; Laplacian (second derivative) of the image; original image; edge-enhanced image indicating sharpened edges with a crisper appearance than the original image]
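For completeness, an off-the-shelf Canny call is sketched below, assuming scikit-image is available; its low/high thresholds play the roles of t1 and t2 in the double-thresholding step, and the parameter values are illustrative.

```python
import numpy as np
# Assuming scikit-image is installed; skimage.feature.canny wraps the full pipeline
# (Gaussian smoothing, gradient, non-maxima suppression, double-threshold hysteresis).
from skimage import feature

img = np.random.rand(128, 128)
edges = feature.canny(img, sigma=2.0, low_threshold=0.1, high_threshold=0.2)
# `edges` is a boolean edge map; increasing sigma smooths more and suppresses fine texture.
```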

References
• Jensen, J. R. 1986, Introductory Digital image processing: a remote sensing
perspective, Prentice-Hall Englewood Cliffs: NJ
• Mather, P. M. 1987, Computer processing of remotely sensed image – an introduction,
John Wiley.
• Pratt, W. K. 2002, Digital Image Processing, third edition, John Wiley: NY.
• Jain, R., Kasturi, R. and Schunck, B. G. 1995, Machine Vision, McGraw-Hill.
